Message ID | 531959dd-1342-cbf1-054b-faf620907aea@linux.ibm.com |
---|---|
State | New |
Headers | show |
Series | [PATCH-1,combine] Don't widen shift mode when target has rotate/mask instruction on original mode [PR93738] | expand |
On 7/18/23 21:06, HAO CHEN GUI via Gcc-patches wrote: > Hi, > The shift mode will be widen in combine pass if the operand has a normal > subreg. But when the target already has rotate/mask/insert instructions on > the narrow mode, it's unnecessary to widen the mode for lshiftrt. As > the lshiftrt is commonly converted to rotate/mask insn, the widen mode > blocks it to be further combined to rotate/mask/insert insn. The PR93738 > shows the case. > > The lshiftrt:SI (subreg:SI (reg:DI)) is converted to > subreg:SI (lshiftrt:DI (reg:DI)) and fails to match rotate/mask pattern. > > Trying 13, 10 -> 14: > 13: r127:SI=r125:SI&0xfffffffffffff0ff > REG_DEAD r125:SI > 10: r124:SI=r129:DI#4 0>>0xc&0xf00 > REG_DEAD r129:DI > 14: r128:SI=r127:SI|r124:SI > > Failed to match this instruction: > (set (reg:SI 128) > (ior:SI (and:SI (reg:SI 125 [+-2 ]) > (const_int -3841 [0xfffffffffffff0ff])) > (and:SI (subreg:SI (zero_extract:DI (reg:DI 129) > (const_int 32 [0x20]) > (const_int 20 [0x14])) 4) > (const_int 3840 [0xf00])))) > Failed to match this instruction: > (set (reg:SI 128) > (ior:SI (and:SI (reg:SI 125 [+-2 ]) > (const_int -3841 [0xfffffffffffff0ff])) > (and:SI (subreg:SI (and:DI (lshiftrt:DI (reg:DI 129) > (const_int 12 [0xc])) > (const_int 4294967295 [0xffffffff])) 4) > (const_int 3840 [0xf00])))) > > If not widen the shift mode, it can be combined to rotate/mask/insert insn > as expected. > > Trying 13, 10 -> 14: > 13: r127:SI=r125:SI&0xfffffffffffff0ff > REG_DEAD r125:SI > 10: r124:SI=r129:DI#4 0>>0xc&0xf00 > REG_DEAD r129:DI > 14: r128:SI=r127:SI|r124:SI > REG_DEAD r127:SI > REG_DEAD r124:SI > Successfully matched this instruction: > (set (reg:SI 128) > (ior:SI (and:SI (reg:SI 125 [+-2 ]) > (const_int -3841 [0xfffffffffffff0ff])) > (and:SI (lshiftrt:SI (subreg:SI (reg:DI 129) 4) > (const_int 12 [0xc])) > (const_int 3840 [0xf00])))) > > > This patch adds a target hook to indicate if rotate/mask instructions are > supported on certain mode. If it's true, widen lshiftrt mode is skipped > and shift is done on original mode. > > The patch fixes the regression of other rs6000 test cases. They're listed > in the second patch. > > The patch passed regression test on Power Linux and x86 platforms. > > Thanks > Gui Haochen > > ChangeLog > combine: Not winden shift mode when target has rotate/mask instruction on > original mode > > To winden shift mode is unnecessary when target already has rotate/mask > instuctions on the original mode. It might blocks the further combine > optimization on the original mode. For instance, further combine the insns > to a rotate/mask/insert instruction on the original mode. > > This patch adds a hook to indicate if a target supports rotate/mask > instructions on the certain mode. If it returns true, the widen shift > mode will be skipped on lshiftrt. > > gcc/ > PR target/93738 > * combine.cc (try_widen_shift_mode): Skip to widen mode for lshiftrt > when the target has rotate/mask instructions on original mode. > * doc/tm.texi: Regenerate. > * doc/tm.texi.in (TARGET_HAVE_ROTATE_AND_MASK): Add. > * target.def (have_rotate_and_mask): New target hook. > * targhooks.cc (default_have_rotate_and_mask): New function. > * targhooks.h (default_have_rotate_and_mask): Declare. Wouldn't it make more sense to just try rotate/mask in the original mode before trying a shift in a widened mode? I'm not sure why we need a target hook here. jeff
Hi Jeff,
在 2023/7/21 5:27, Jeff Law 写道:
> Wouldn't it make more sense to just try rotate/mask in the original mode before trying a shift in a widened mode? I'm not sure why we need a target hook here.
There is no change to try rotate/mask with the original mode when
expensive_optimizations is set. The subst widens the shift mode.
if (flag_expensive_optimizations)
{
/* Pass pc_rtx so no substitutions are done, just
simplifications. */
if (i1)
{
subst_low_luid = DF_INSN_LUID (i1);
i1src = subst (i1src, pc_rtx, pc_rtx, 0, 0, 0);
}
subst_low_luid = DF_INSN_LUID (i2);
i2src = subst (i2src, pc_rtx, pc_rtx, 0, 0, 0);
}
I don't know if the wider mode is helpful to other targets, so
I added the target hook.
Thanks
Gui Haochen
Sorry for the typo s/change/chance 在 2023/7/21 8:59, HAO CHEN GUI 写道: > Hi Jeff, > > 在 2023/7/21 5:27, Jeff Law 写道: >> Wouldn't it make more sense to just try rotate/mask in the original mode before trying a shift in a widened mode? I'm not sure why we need a target hook here. > > There is no change to try rotate/mask with the original mode when > expensive_optimizations is set. The subst widens the shift mode. > > if (flag_expensive_optimizations) > { > /* Pass pc_rtx so no substitutions are done, just > simplifications. */ > if (i1) > { > subst_low_luid = DF_INSN_LUID (i1); > i1src = subst (i1src, pc_rtx, pc_rtx, 0, 0, 0); > } > > subst_low_luid = DF_INSN_LUID (i2); > i2src = subst (i2src, pc_rtx, pc_rtx, 0, 0, 0); > } > > I don't know if the wider mode is helpful to other targets, so > I added the target hook. > > Thanks > Gui Haochen
On 7/20/23 18:59, HAO CHEN GUI wrote: > Hi Jeff, > > 在 2023/7/21 5:27, Jeff Law 写道: >> Wouldn't it make more sense to just try rotate/mask in the original mode before trying a shift in a widened mode? I'm not sure why we need a target hook here. > > There is no change to try rotate/mask with the original mode when > expensive_optimizations is set. The subst widens the shift mode. But we can add it before the attempt in the wider mode. > > if (flag_expensive_optimizations) > { > /* Pass pc_rtx so no substitutions are done, just > simplifications. */ > if (i1) > { > subst_low_luid = DF_INSN_LUID (i1); > i1src = subst (i1src, pc_rtx, pc_rtx, 0, 0, 0); > } > > subst_low_luid = DF_INSN_LUID (i2); > i2src = subst (i2src, pc_rtx, pc_rtx, 0, 0, 0); > } > > I don't know if the wider mode is helpful to other targets, so > I added the target hook. In this scenario we're often better off relying on rtx_costs (even with all its warts) rather than adding yet another target hook. I'd love to hear from Segher here to see if he's got other ideas. jeff
Jeff, Thanks a lot for your comments. The widen shift mode is on i1/i2 before they're combined with i3 to newpat. The newpat matches rotate/mask pattern. The i1/i2 itself don't match rotate/mask pattern. I did an experiment to disable widen shift mode for lshiftrt. I tested it on powerpc/x86/aarch64. There is no regression occurred. I thought that the widen shift mode is helpful for newpat matching. But it seems not, at least no impact on powerpc/x86/aarch64. diff --git a/gcc/combine.cc b/gcc/combine.cc index 4bf867d74b0..0b9b115f9bb 100644 --- a/gcc/combine.cc +++ b/gcc/combine.cc @@ -10479,11 +10479,6 @@ try_widen_shift_mode (enum rtx_code code, rtx op, int count, return orig_mode; case LSHIFTRT: - /* Similarly here but with zero bits. */ - if (HWI_COMPUTABLE_MODE_P (mode) - && (nonzero_bits (op, mode) & ~GET_MODE_MASK (orig_mode)) == 0) - return mode; - /* We can also widen if the bits brought in will be masked off. This operation is performed in ORIG_MODE. */ if (outer_code == AND) Segher, Could you inform me what's the purpose of widen shift mode in simplify_shift_const? Does it definitely reduce the rtx cost or it helps match patterns? Thanks a lot. Thanks Gui Haochen 在 2023/8/5 7:32, Jeff Law 写道: > > > On 7/20/23 18:59, HAO CHEN GUI wrote: >> Hi Jeff, >> >> 在 2023/7/21 5:27, Jeff Law 写道: >>> Wouldn't it make more sense to just try rotate/mask in the original mode before trying a shift in a widened mode? I'm not sure why we need a target hook here. >> >> There is no change to try rotate/mask with the original mode when >> expensive_optimizations is set. The subst widens the shift mode. > But we can add it before the attempt in the wider mode. > >> >> if (flag_expensive_optimizations) >> { >> /* Pass pc_rtx so no substitutions are done, just >> simplifications. */ >> if (i1) >> { >> subst_low_luid = DF_INSN_LUID (i1); >> i1src = subst (i1src, pc_rtx, pc_rtx, 0, 0, 0); >> } >> >> subst_low_luid = DF_INSN_LUID (i2); >> i2src = subst (i2src, pc_rtx, pc_rtx, 0, 0, 0); >> } >> >> I don't know if the wider mode is helpful to other targets, so >> I added the target hook. > In this scenario we're often better off relying on rtx_costs (even with all its warts) rather than adding yet another target hook. > > I'd love to hear from Segher here to see if he's got other ideas. > > jeff
diff --git a/gcc/combine.cc b/gcc/combine.cc index 304c020ec79..f22fe42931b 100644 --- a/gcc/combine.cc +++ b/gcc/combine.cc @@ -10475,20 +10475,25 @@ try_widen_shift_mode (enum rtx_code code, rtx op, int count, return orig_mode; case LSHIFTRT: - /* Similarly here but with zero bits. */ - if (HWI_COMPUTABLE_MODE_P (mode) - && (nonzero_bits (op, mode) & ~GET_MODE_MASK (orig_mode)) == 0) - return mode; - - /* We can also widen if the bits brought in will be masked off. This - operation is performed in ORIG_MODE. */ - if (outer_code == AND) + /* Skip wider mode when the target has rotate and mask instructions on + orig_mode. */ + if (!targetm.have_rotate_and_mask (orig_mode)) { - int care_bits = low_bitmask_len (orig_mode, outer_const); - - if (care_bits >= 0 - && GET_MODE_PRECISION (orig_mode) - care_bits >= count) + /* Similarly here but with zero bits. */ + if (HWI_COMPUTABLE_MODE_P (mode) + && (nonzero_bits (op, mode) & ~GET_MODE_MASK (orig_mode)) == 0) return mode; + + /* We can also widen if the bits brought in will be masked off. + This operation is performed in ORIG_MODE. */ + if (outer_code == AND) + { + int care_bits = low_bitmask_len (orig_mode, outer_const); + + if (care_bits >= 0 + && GET_MODE_PRECISION (orig_mode) - care_bits >= count) + return mode; + } } /* fall through */ diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi index 95ba56e05ae..cc7342b5253 100644 --- a/gcc/doc/tm.texi +++ b/gcc/doc/tm.texi @@ -11102,6 +11102,11 @@ The default is four for machines with a @code{casesi} instruction and five otherwise. This is best for most machines. @end deftypefn +@deftypefn {Target Hook} bool TARGET_HAVE_ROTATE_AND_MASK (machine_mode @var{mode}) +Return true if the target has rotate and mask instructions on mode + @var{mode}. +@end deftypefn + @defmac WORD_REGISTER_OPERATIONS Define this macro to 1 if operations between registers with integral mode smaller than a word are always performed on the entire register. To be diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in index 4ac96dc357d..01257a7b3a2 100644 --- a/gcc/doc/tm.texi.in +++ b/gcc/doc/tm.texi.in @@ -7241,6 +7241,8 @@ is in effect. @hook TARGET_CASE_VALUES_THRESHOLD +@hook TARGET_HAVE_ROTATE_AND_MASK + @defmac WORD_REGISTER_OPERATIONS Define this macro to 1 if operations between registers with integral mode smaller than a word are always performed on the entire register. To be diff --git a/gcc/target.def b/gcc/target.def index 7d684296c17..ee2edfb4504 100644 --- a/gcc/target.def +++ b/gcc/target.def @@ -7169,6 +7169,15 @@ DEFHOOKPOD @option{-fsanitize=shadow-call-stack}. The default value is false.", bool, false) +/* Return true if the target has rotate and mask instructions for this\n\ + scalar integer mode. */ +DEFHOOK +(have_rotate_and_mask, + "Return true if the target has rotate and mask instructions on mode\n\ + @var{mode}.", + bool, (machine_mode mode), + default_have_rotate_and_mask) + /* Close the 'struct gcc_target' definition. */ HOOK_VECTOR_END (C90_EMPTY_HACK) diff --git a/gcc/targhooks.cc b/gcc/targhooks.cc index e190369f87a..4743aeb6d9a 100644 --- a/gcc/targhooks.cc +++ b/gcc/targhooks.cc @@ -2775,4 +2775,11 @@ default_gcov_type_size (void) return TYPE_PRECISION (long_long_integer_type_node) > 32 ? 64 : 32; } +bool +default_have_rotate_and_mask (machine_mode mode) +{ + gcc_assert (SCALAR_INT_MODE_P (mode)); + return false; +} + #include "gt-targhooks.h" diff --git a/gcc/targhooks.h b/gcc/targhooks.h index 1a0db8dddd5..209b7a3380b 100644 --- a/gcc/targhooks.h +++ b/gcc/targhooks.h @@ -303,4 +303,6 @@ extern rtx default_memtag_untagged_pointer (rtx, rtx); extern HOST_WIDE_INT default_gcov_type_size (void); +extern bool default_have_rotate_and_mask (machine_mode); + #endif /* GCC_TARGHOOKS_H */