diff mbox series

[PATCH-1,combine] Don't widen shift mode when target has rotate/mask instruction on original mode [PR93738]

Message ID 531959dd-1342-cbf1-054b-faf620907aea@linux.ibm.com
State New
Headers show
Series [PATCH-1,combine] Don't widen shift mode when target has rotate/mask instruction on original mode [PR93738] | expand

Commit Message

HAO CHEN GUI July 19, 2023, 3:06 a.m. UTC
Hi,
  The shift mode will be widen in combine pass if the operand has a normal
subreg. But when the target already has rotate/mask/insert instructions on
the narrow mode, it's unnecessary to widen the mode for lshiftrt. As
the lshiftrt is commonly converted to rotate/mask insn, the widen mode
blocks it to be further combined to rotate/mask/insert insn. The PR93738
shows the case.

The lshiftrt:SI (subreg:SI (reg:DI)) is converted to
subreg:SI (lshiftrt:DI (reg:DI)) and fails to match rotate/mask pattern.

Trying 13, 10 -> 14:
   13: r127:SI=r125:SI&0xfffffffffffff0ff
      REG_DEAD r125:SI
   10: r124:SI=r129:DI#4 0>>0xc&0xf00
      REG_DEAD r129:DI
   14: r128:SI=r127:SI|r124:SI

Failed to match this instruction:
(set (reg:SI 128)
    (ior:SI (and:SI (reg:SI 125 [+-2 ])
            (const_int -3841 [0xfffffffffffff0ff]))
        (and:SI (subreg:SI (zero_extract:DI (reg:DI 129)
                    (const_int 32 [0x20])
                    (const_int 20 [0x14])) 4)
            (const_int 3840 [0xf00]))))
Failed to match this instruction:
(set (reg:SI 128)
    (ior:SI (and:SI (reg:SI 125 [+-2 ])
            (const_int -3841 [0xfffffffffffff0ff]))
        (and:SI (subreg:SI (and:DI (lshiftrt:DI (reg:DI 129)
                        (const_int 12 [0xc]))
                    (const_int 4294967295 [0xffffffff])) 4)
            (const_int 3840 [0xf00]))))

If not widen the shift mode, it can be combined to rotate/mask/insert insn
as expected.

Trying 13, 10 -> 14:
   13: r127:SI=r125:SI&0xfffffffffffff0ff
      REG_DEAD r125:SI
   10: r124:SI=r129:DI#4 0>>0xc&0xf00
      REG_DEAD r129:DI
   14: r128:SI=r127:SI|r124:SI
      REG_DEAD r127:SI
      REG_DEAD r124:SI
Successfully matched this instruction:
(set (reg:SI 128)
    (ior:SI (and:SI (reg:SI 125 [+-2 ])
            (const_int -3841 [0xfffffffffffff0ff]))
        (and:SI (lshiftrt:SI (subreg:SI (reg:DI 129) 4)
                (const_int 12 [0xc]))
            (const_int 3840 [0xf00]))))


  This patch adds a target hook to indicate if rotate/mask instructions are
supported on certain mode. If it's true, widen lshiftrt mode is skipped
and shift is done on original mode.

  The patch fixes the regression of other rs6000 test cases. They're listed
in the second patch.

  The patch passed regression test on Power Linux and x86 platforms.

Thanks
Gui Haochen

ChangeLog
combine: Not winden shift mode when target has rotate/mask instruction on
original mode

To winden shift mode is unnecessary when target already has rotate/mask
instuctions on the original mode.  It might blocks the further combine
optimization on the original mode.  For instance, further combine the insns
to a rotate/mask/insert instruction on the original mode.

This patch adds a hook to indicate if a target supports rotate/mask
instructions on the certain mode.  If it returns true, the widen shift
mode will be skipped on lshiftrt.

gcc/
	PR target/93738
	* combine.cc (try_widen_shift_mode): Skip to widen mode for lshiftrt
	when the target has rotate/mask instructions on original mode.
	* doc/tm.texi: Regenerate.
	* doc/tm.texi.in (TARGET_HAVE_ROTATE_AND_MASK): Add.
	* target.def (have_rotate_and_mask): New target hook.
	* targhooks.cc (default_have_rotate_and_mask): New function.
	* targhooks.h (default_have_rotate_and_mask): Declare.

patch.diff

Comments

Jeff Law July 20, 2023, 9:27 p.m. UTC | #1
On 7/18/23 21:06, HAO CHEN GUI via Gcc-patches wrote:
> Hi,
>    The shift mode will be widen in combine pass if the operand has a normal
> subreg. But when the target already has rotate/mask/insert instructions on
> the narrow mode, it's unnecessary to widen the mode for lshiftrt. As
> the lshiftrt is commonly converted to rotate/mask insn, the widen mode
> blocks it to be further combined to rotate/mask/insert insn. The PR93738
> shows the case.
> 
> The lshiftrt:SI (subreg:SI (reg:DI)) is converted to
> subreg:SI (lshiftrt:DI (reg:DI)) and fails to match rotate/mask pattern.
> 
> Trying 13, 10 -> 14:
>     13: r127:SI=r125:SI&0xfffffffffffff0ff
>        REG_DEAD r125:SI
>     10: r124:SI=r129:DI#4 0>>0xc&0xf00
>        REG_DEAD r129:DI
>     14: r128:SI=r127:SI|r124:SI
> 
> Failed to match this instruction:
> (set (reg:SI 128)
>      (ior:SI (and:SI (reg:SI 125 [+-2 ])
>              (const_int -3841 [0xfffffffffffff0ff]))
>          (and:SI (subreg:SI (zero_extract:DI (reg:DI 129)
>                      (const_int 32 [0x20])
>                      (const_int 20 [0x14])) 4)
>              (const_int 3840 [0xf00]))))
> Failed to match this instruction:
> (set (reg:SI 128)
>      (ior:SI (and:SI (reg:SI 125 [+-2 ])
>              (const_int -3841 [0xfffffffffffff0ff]))
>          (and:SI (subreg:SI (and:DI (lshiftrt:DI (reg:DI 129)
>                          (const_int 12 [0xc]))
>                      (const_int 4294967295 [0xffffffff])) 4)
>              (const_int 3840 [0xf00]))))
> 
> If not widen the shift mode, it can be combined to rotate/mask/insert insn
> as expected.
> 
> Trying 13, 10 -> 14:
>     13: r127:SI=r125:SI&0xfffffffffffff0ff
>        REG_DEAD r125:SI
>     10: r124:SI=r129:DI#4 0>>0xc&0xf00
>        REG_DEAD r129:DI
>     14: r128:SI=r127:SI|r124:SI
>        REG_DEAD r127:SI
>        REG_DEAD r124:SI
> Successfully matched this instruction:
> (set (reg:SI 128)
>      (ior:SI (and:SI (reg:SI 125 [+-2 ])
>              (const_int -3841 [0xfffffffffffff0ff]))
>          (and:SI (lshiftrt:SI (subreg:SI (reg:DI 129) 4)
>                  (const_int 12 [0xc]))
>              (const_int 3840 [0xf00]))))
> 
> 
>    This patch adds a target hook to indicate if rotate/mask instructions are
> supported on certain mode. If it's true, widen lshiftrt mode is skipped
> and shift is done on original mode.
> 
>    The patch fixes the regression of other rs6000 test cases. They're listed
> in the second patch.
> 
>    The patch passed regression test on Power Linux and x86 platforms.
> 
> Thanks
> Gui Haochen
> 
> ChangeLog
> combine: Not winden shift mode when target has rotate/mask instruction on
> original mode
> 
> To winden shift mode is unnecessary when target already has rotate/mask
> instuctions on the original mode.  It might blocks the further combine
> optimization on the original mode.  For instance, further combine the insns
> to a rotate/mask/insert instruction on the original mode.
> 
> This patch adds a hook to indicate if a target supports rotate/mask
> instructions on the certain mode.  If it returns true, the widen shift
> mode will be skipped on lshiftrt.
> 
> gcc/
> 	PR target/93738
> 	* combine.cc (try_widen_shift_mode): Skip to widen mode for lshiftrt
> 	when the target has rotate/mask instructions on original mode.
> 	* doc/tm.texi: Regenerate.
> 	* doc/tm.texi.in (TARGET_HAVE_ROTATE_AND_MASK): Add.
> 	* target.def (have_rotate_and_mask): New target hook.
> 	* targhooks.cc (default_have_rotate_and_mask): New function.
> 	* targhooks.h (default_have_rotate_and_mask): Declare.
Wouldn't it make more sense to just try rotate/mask in the original mode 
before trying a shift in a widened mode?  I'm not sure why we need a 
target hook here.

jeff
HAO CHEN GUI July 21, 2023, 12:59 a.m. UTC | #2
Hi Jeff,

在 2023/7/21 5:27, Jeff Law 写道:
> Wouldn't it make more sense to just try rotate/mask in the original mode before trying a shift in a widened mode?  I'm not sure why we need a target hook here.

There is no change to try rotate/mask with the original mode when
expensive_optimizations is set. The subst widens the shift mode.

      if (flag_expensive_optimizations)
        {
          /* Pass pc_rtx so no substitutions are done, just
             simplifications.  */
          if (i1)
            {
              subst_low_luid = DF_INSN_LUID (i1);
              i1src = subst (i1src, pc_rtx, pc_rtx, 0, 0, 0);
            }

          subst_low_luid = DF_INSN_LUID (i2);
          i2src = subst (i2src, pc_rtx, pc_rtx, 0, 0, 0);
        }

I don't know if the wider mode is helpful to other targets, so
I added the target hook.

Thanks
Gui Haochen
HAO CHEN GUI July 21, 2023, 1:06 a.m. UTC | #3
Sorry for the typo
s/change/chance

在 2023/7/21 8:59, HAO CHEN GUI 写道:
> Hi Jeff,
> 
> 在 2023/7/21 5:27, Jeff Law 写道:
>> Wouldn't it make more sense to just try rotate/mask in the original mode before trying a shift in a widened mode?  I'm not sure why we need a target hook here.
> 
> There is no change to try rotate/mask with the original mode when
> expensive_optimizations is set. The subst widens the shift mode.
> 
>       if (flag_expensive_optimizations)
>         {
>           /* Pass pc_rtx so no substitutions are done, just
>              simplifications.  */
>           if (i1)
>             {
>               subst_low_luid = DF_INSN_LUID (i1);
>               i1src = subst (i1src, pc_rtx, pc_rtx, 0, 0, 0);
>             }
> 
>           subst_low_luid = DF_INSN_LUID (i2);
>           i2src = subst (i2src, pc_rtx, pc_rtx, 0, 0, 0);
>         }
> 
> I don't know if the wider mode is helpful to other targets, so
> I added the target hook.
> 
> Thanks
> Gui Haochen
Jeff Law Aug. 4, 2023, 11:32 p.m. UTC | #4
On 7/20/23 18:59, HAO CHEN GUI wrote:
> Hi Jeff,
> 
> 在 2023/7/21 5:27, Jeff Law 写道:
>> Wouldn't it make more sense to just try rotate/mask in the original mode before trying a shift in a widened mode?  I'm not sure why we need a target hook here.
> 
> There is no change to try rotate/mask with the original mode when
> expensive_optimizations is set. The subst widens the shift mode.
But we can add it before the attempt in the wider mode.

> 
>        if (flag_expensive_optimizations)
>          {
>            /* Pass pc_rtx so no substitutions are done, just
>               simplifications.  */
>            if (i1)
>              {
>                subst_low_luid = DF_INSN_LUID (i1);
>                i1src = subst (i1src, pc_rtx, pc_rtx, 0, 0, 0);
>              }
> 
>            subst_low_luid = DF_INSN_LUID (i2);
>            i2src = subst (i2src, pc_rtx, pc_rtx, 0, 0, 0);
>          }
> 
> I don't know if the wider mode is helpful to other targets, so
> I added the target hook.
In this scenario we're often better off relying on rtx_costs (even with 
all its warts) rather than adding yet another target hook.

I'd love to hear from Segher here to see if he's got other ideas.

jeff
HAO CHEN GUI Aug. 21, 2023, 1:37 a.m. UTC | #5
Jeff,
  Thanks a lot for your comments.

  The widen shift mode is on i1/i2 before they're combined with i3 to newpat.
The newpat matches rotate/mask pattern. The i1/i2 itself don't match
rotate/mask pattern.

  I did an experiment to disable widen shift mode for
lshiftrt. I tested it on powerpc/x86/aarch64. There is no regression occurred.
I thought that the widen shift mode is helpful for newpat matching. But it seems
not, at least no impact on powerpc/x86/aarch64.

diff --git a/gcc/combine.cc b/gcc/combine.cc
index 4bf867d74b0..0b9b115f9bb 100644
--- a/gcc/combine.cc
+++ b/gcc/combine.cc
@@ -10479,11 +10479,6 @@ try_widen_shift_mode (enum rtx_code code, rtx op, int count,
       return orig_mode;

     case LSHIFTRT:
-      /* Similarly here but with zero bits.  */
-      if (HWI_COMPUTABLE_MODE_P (mode)
-         && (nonzero_bits (op, mode) & ~GET_MODE_MASK (orig_mode)) == 0)
-       return mode;
-
       /* We can also widen if the bits brought in will be masked off.  This
         operation is performed in ORIG_MODE.  */
       if (outer_code == AND)

Segher,
  Could you inform me what's the purpose of widen shift mode in
simplify_shift_const? Does it definitely reduce the rtx cost or it helps match
patterns? Thanks a lot.

Thanks
Gui Haochen


在 2023/8/5 7:32, Jeff Law 写道:
> 
> 
> On 7/20/23 18:59, HAO CHEN GUI wrote:
>> Hi Jeff,
>>
>> 在 2023/7/21 5:27, Jeff Law 写道:
>>> Wouldn't it make more sense to just try rotate/mask in the original mode before trying a shift in a widened mode?  I'm not sure why we need a target hook here.
>>
>> There is no change to try rotate/mask with the original mode when
>> expensive_optimizations is set. The subst widens the shift mode.
> But we can add it before the attempt in the wider mode.
> 
>>
>>        if (flag_expensive_optimizations)
>>          {
>>            /* Pass pc_rtx so no substitutions are done, just
>>               simplifications.  */
>>            if (i1)
>>              {
>>                subst_low_luid = DF_INSN_LUID (i1);
>>                i1src = subst (i1src, pc_rtx, pc_rtx, 0, 0, 0);
>>              }
>>
>>            subst_low_luid = DF_INSN_LUID (i2);
>>            i2src = subst (i2src, pc_rtx, pc_rtx, 0, 0, 0);
>>          }
>>
>> I don't know if the wider mode is helpful to other targets, so
>> I added the target hook.
> In this scenario we're often better off relying on rtx_costs (even with all its warts) rather than adding yet another target hook.
> 
> I'd love to hear from Segher here to see if he's got other ideas.
> 
> jeff
diff mbox series

Patch

diff --git a/gcc/combine.cc b/gcc/combine.cc
index 304c020ec79..f22fe42931b 100644
--- a/gcc/combine.cc
+++ b/gcc/combine.cc
@@ -10475,20 +10475,25 @@  try_widen_shift_mode (enum rtx_code code, rtx op, int count,
       return orig_mode;

     case LSHIFTRT:
-      /* Similarly here but with zero bits.  */
-      if (HWI_COMPUTABLE_MODE_P (mode)
-	  && (nonzero_bits (op, mode) & ~GET_MODE_MASK (orig_mode)) == 0)
-	return mode;
-
-      /* We can also widen if the bits brought in will be masked off.  This
-	 operation is performed in ORIG_MODE.  */
-      if (outer_code == AND)
+      /* Skip wider mode when the target has rotate and mask instructions on
+	 orig_mode.  */
+      if (!targetm.have_rotate_and_mask (orig_mode))
 	{
-	  int care_bits = low_bitmask_len (orig_mode, outer_const);
-
-	  if (care_bits >= 0
-	      && GET_MODE_PRECISION (orig_mode) - care_bits >= count)
+	  /* Similarly here but with zero bits.  */
+	  if (HWI_COMPUTABLE_MODE_P (mode)
+	      && (nonzero_bits (op, mode) & ~GET_MODE_MASK (orig_mode)) == 0)
 	    return mode;
+
+	  /* We can also widen if the bits brought in will be masked off.
+	     This operation is performed in ORIG_MODE.  */
+	  if (outer_code == AND)
+	    {
+	      int care_bits = low_bitmask_len (orig_mode, outer_const);
+
+	      if (care_bits >= 0
+		  && GET_MODE_PRECISION (orig_mode) - care_bits >= count)
+		return mode;
+	    }
 	}
       /* fall through */

diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 95ba56e05ae..cc7342b5253 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -11102,6 +11102,11 @@  The default is four for machines with a @code{casesi} instruction and
 five otherwise.  This is best for most machines.
 @end deftypefn

+@deftypefn {Target Hook} bool TARGET_HAVE_ROTATE_AND_MASK (machine_mode @var{mode})
+Return true if the target has rotate and mask instructions on mode
+ @var{mode}.
+@end deftypefn
+
 @defmac WORD_REGISTER_OPERATIONS
 Define this macro to 1 if operations between registers with integral mode
 smaller than a word are always performed on the entire register.  To be
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 4ac96dc357d..01257a7b3a2 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -7241,6 +7241,8 @@  is in effect.

 @hook TARGET_CASE_VALUES_THRESHOLD

+@hook TARGET_HAVE_ROTATE_AND_MASK
+
 @defmac WORD_REGISTER_OPERATIONS
 Define this macro to 1 if operations between registers with integral mode
 smaller than a word are always performed on the entire register.  To be
diff --git a/gcc/target.def b/gcc/target.def
index 7d684296c17..ee2edfb4504 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -7169,6 +7169,15 @@  DEFHOOKPOD
 @option{-fsanitize=shadow-call-stack}.  The default value is false.",
  bool, false)

+/* Return true if the target has rotate and mask instructions for this\n\
+   scalar integer mode.  */
+DEFHOOK
+(have_rotate_and_mask,
+ "Return true if the target has rotate and mask instructions on mode\n\
+ @var{mode}.",
+ bool, (machine_mode mode),
+ default_have_rotate_and_mask)
+
 /* Close the 'struct gcc_target' definition.  */
 HOOK_VECTOR_END (C90_EMPTY_HACK)

diff --git a/gcc/targhooks.cc b/gcc/targhooks.cc
index e190369f87a..4743aeb6d9a 100644
--- a/gcc/targhooks.cc
+++ b/gcc/targhooks.cc
@@ -2775,4 +2775,11 @@  default_gcov_type_size (void)
   return TYPE_PRECISION (long_long_integer_type_node) > 32 ? 64 : 32;
 }

+bool
+default_have_rotate_and_mask (machine_mode mode)
+{
+  gcc_assert (SCALAR_INT_MODE_P (mode));
+  return false;
+}
+
 #include "gt-targhooks.h"
diff --git a/gcc/targhooks.h b/gcc/targhooks.h
index 1a0db8dddd5..209b7a3380b 100644
--- a/gcc/targhooks.h
+++ b/gcc/targhooks.h
@@ -303,4 +303,6 @@  extern rtx default_memtag_untagged_pointer (rtx, rtx);

 extern HOST_WIDE_INT default_gcov_type_size (void);

+extern bool default_have_rotate_and_mask (machine_mode);
+
 #endif /* GCC_TARGHOOKS_H */