Message ID | 57484FA6.1060600@foss.arm.com |
---|---|
State | New |
Headers | show |
On 27/05/16 14:46, Kyrill Tkachov wrote: > > On 20/05/16 11:04, Kyrill Tkachov wrote: >> Hi all, >> >> The recent -frename-registers change exposed a deficiency in the way we fuse AESE/AESMC instruction >> pairs in arm. >> >> Basically we want to enforce: >> AESE Vn, _ >> AESMC Vn, Vn >> >> to enable the fusion, but regrename comes along and renames the output Vn register in AESMC to something >> else, killing the fusion in the hardware. >> >> The solution in this patch is to add an alternative that ties the input and output registers in the AESMC pattern >> and enable that alternative when the fusion is enabled. >> >> With this patch I've confirmed that the above preferred register sequence is kept even with -frename-registers >> when tuning for a cpu that enables the fusion and that the chain is broken by regrename otherwise and have >> seen the appropriate improvement in a proprietary benchmark (that I cannot name) that exercises this sequence. >> >> Bootstrapped and tested on arm-none-linux-gnueabihf. >> >> Ok for trunk? >> > > Following James's feedback on the AArch64 version, this slightly modified version uses the enum type for the argument of the new function. > Is this ok instead? > > Thanks, > Kyrill > > 2016-05-27 Kyrylo Tkachov <kyrylo.tkachov@arm.com> > > * config/arm/arm.c (arm_fusion_enabled_p): New function. > * config/arm/arm-protos.h (arm_fusion_enabled_p): Declare prototype. > * config/arm/crypto.md (crypto_<crypto_pattern>, CRYPTO_UNARY): > Add "=w,0" alternative. Enable it when AES/AESMC fusion is enabled. > Ok. ramana
diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h index cf221d6793eaf0959f2713fe0903a5d8602ec2f4..12a781de13f2f7816cc2b16b04835d87c83f7abb 100644 --- a/gcc/config/arm/arm-protos.h +++ b/gcc/config/arm/arm-protos.h @@ -320,6 +320,7 @@ extern int vfp3_const_double_for_bits (rtx); extern void arm_emit_coreregs_64bit_shift (enum rtx_code, rtx, rtx, rtx, rtx, rtx); +extern bool arm_fusion_enabled_p (tune_params::fuse_ops); extern bool arm_valid_symbolic_address_p (rtx); extern bool arm_validize_comparison (rtx *, rtx *, rtx *); #endif /* RTX_CODE */ diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 5110d9e989d605a9e2c262e6007b89a1c7dc7080..39a24c06c123b86883134368ef39794abf11898b 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -29704,6 +29704,13 @@ aarch_macro_fusion_pair_p (rtx_insn* prev, rtx_insn* curr) return false; } +/* Return true iff the instruction fusion described by OP is enabled. */ +bool +arm_fusion_enabled_p (tune_params::fuse_ops op) +{ + return current_tune->fusible_ops & op; +} + /* Implement the TARGET_ASAN_SHADOW_OFFSET hook. */ static unsigned HOST_WIDE_INT diff --git a/gcc/config/arm/crypto.md b/gcc/config/arm/crypto.md index c6f17270b1dbaf6dc43eb1e9b8a182dbb0f5a1e1..0f510f069408471fcbf6751f161e984f39929813 100644 --- a/gcc/config/arm/crypto.md +++ b/gcc/config/arm/crypto.md @@ -18,14 +18,27 @@ ;; along with GCC; see the file COPYING3. If not see ;; <http://www.gnu.org/licenses/>. + +;; When AES/AESMC fusion is enabled we want the register allocation to +;; look like: +;; AESE Vn, _ +;; AESMC Vn, Vn +;; So prefer to tie operand 1 to operand 0 when fusing. + (define_insn "crypto_<crypto_pattern>" - [(set (match_operand:<crypto_mode> 0 "register_operand" "=w") + [(set (match_operand:<crypto_mode> 0 "register_operand" "=w,w") (unspec:<crypto_mode> [(match_operand:<crypto_mode> 1 - "register_operand" "w")] + "register_operand" "0,w")] CRYPTO_UNARY))] "TARGET_CRYPTO" "<crypto_pattern>.<crypto_size_sfx>\\t%q0, %q1" - [(set_attr "type" "<crypto_type>")] + [(set_attr "type" "<crypto_type>") + (set_attr_alternative "enabled" + [(if_then_else (match_test + "arm_fusion_enabled_p (tune_params::FUSE_AES_AESMC)") + (const_string "yes" ) + (const_string "no")) + (const_string "yes")])] ) (define_insn "crypto_<crypto_pattern>"