diff mbox series

[2/5,ifcvt] optimize x=c ? (y shift_op z):y by RISC-V Zicond like insns

Message ID 20231205081248.2106-2-gaofei@eswincomputing.com
State New
Headers show
Series [1/5,V3,ifcvt] optimize x=c ? (y op z) : y by RISC-V Zicond like insns | expand

Commit Message

Fei Gao Dec. 5, 2023, 8:12 a.m. UTC
op=[ASHIFT, ASHIFTRT, LSHIFTRT, ROTATE, ROTATERT]

Conditional op, if zero
rd = (rc == 0) ? (rs1 op rs2) : rs1
-->
czero.nez rd, rs2, rc
op rd, rs1, rd

Conditional op, if non-zero
rd = (rc != 0) ? (rs1 op rs2) : rs1
-->
czero.eqz rd, rs2, rc
op rd, rs1, rd

Co-authored-by: Xiao Zeng<zengxiao@eswincomputing.com>

gcc/ChangeLog:

	* ifcvt.cc (noce_cond_zero_binary_op_supported): add support for shift like op.
        (get_base_reg): add support for subreg to handle shift amount operand.
        (noce_bbs_ok_for_cond_zero_arith): to replace shift amount operand.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/zicond_ifcvt_opt.c: add TCs for shift like op.
---
 gcc/ifcvt.cc                                  |  8 ++-
 .../gcc.target/riscv/zicond_ifcvt_opt.c       | 55 ++++++++++++++++++-
 2 files changed, 61 insertions(+), 2 deletions(-)

Comments

Jeff Law Dec. 10, 2023, 8:43 p.m. UTC | #1
On 12/5/23 01:12, Fei Gao wrote:
> op=[ASHIFT, ASHIFTRT, LSHIFTRT, ROTATE, ROTATERT]
> 
> Conditional op, if zero
> rd = (rc == 0) ? (rs1 op rs2) : rs1
> -->
> czero.nez rd, rs2, rc
> op rd, rs1, rd
> 
> Conditional op, if non-zero
> rd = (rc != 0) ? (rs1 op rs2) : rs1
> -->
> czero.eqz rd, rs2, rc
> op rd, rs1, rd
> 
> Co-authored-by: Xiao Zeng<zengxiao@eswincomputing.com>
> 
> gcc/ChangeLog:
> 
> 	* ifcvt.cc (noce_cond_zero_binary_op_supported): add support for shift like op.
>          (get_base_reg): add support for subreg to handle shift amount operand.
>          (noce_bbs_ok_for_cond_zero_arith): to replace shift amount operand.
> 
> gcc/testsuite/ChangeLog:
> 
> 	* gcc.target/riscv/zicond_ifcvt_opt.c: add TCs for shift like op.
So I removed the SUBREG handling code which makes this patch merely an 
addition of the shift/rotate ops which trivally work just like PLUS, 
MINUS, IOR, XOR (by conditionally zero-ing the shift count) tested on 
x86 and pushed it to the trunk.

As I noted before while I think handling SUBREGs is important, now is 
not the time to be adding that support.

Thanks!

jeff
Fei Gao Dec. 11, 2023, 4:01 a.m. UTC | #2
On 2023-12-11 04:43  Jeff Law <jeffreyalaw@gmail.com> wrote:
>
>
>
>On 12/5/23 01:12, Fei Gao wrote:
>> op=[ASHIFT, ASHIFTRT, LSHIFTRT, ROTATE, ROTATERT]
>>
>> Conditional op, if zero
>> rd = (rc == 0) ? (rs1 op rs2) : rs1
>> -->
>> czero.nez rd, rs2, rc
>> op rd, rs1, rd
>>
>> Conditional op, if non-zero
>> rd = (rc != 0) ? (rs1 op rs2) : rs1
>> -->
>> czero.eqz rd, rs2, rc
>> op rd, rs1, rd
>>
>> Co-authored-by: Xiao Zeng<zengxiao@eswincomputing.com>
>>
>> gcc/ChangeLog:
>>
>> * ifcvt.cc (noce_cond_zero_binary_op_supported): add support for shift like op.
>>          (get_base_reg): add support for subreg to handle shift amount operand.
>>          (noce_bbs_ok_for_cond_zero_arith): to replace shift amount operand.
>>
>> gcc/testsuite/ChangeLog:
>>
>> * gcc.target/riscv/zicond_ifcvt_opt.c: add TCs for shift like op.
>So I removed the SUBREG handling code which makes this patch merely an
>addition of the shift/rotate ops which trivally work just like PLUS,
>MINUS, IOR, XOR (by conditionally zero-ing the shift count) tested on
>x86 and pushed it to the trunk.
>
>As I noted before while I think handling SUBREGs is important, now is
>not the time to be adding that support. 

Thanks for your review.
Got your point to defer support for SUBREGs.

Shift-like pattern:
(set (reg/v:DI 137 [ y ])
        (ashift:DI (reg/v:DI 137 [ y ])
            (subreg:QI (reg/v:DI 138 [ z ]) 0)))

No Zicond instructions are generated with the SUBREG handling code removed.
So I noticed your changes in testcases regarding the number of czero instruction number scanned.
Then this looks like a NFC patch.

BR, 
Fei


>
>Thanks!
>
>jeff
Jeff Law Dec. 11, 2023, 6:15 a.m. UTC | #3
On 12/10/23 21:01, Fei Gao wrote:
> On 2023-12-11 04:43  Jeff Law <jeffreyalaw@gmail.com> wrote:
>>
>>
>>
>> On 12/5/23 01:12, Fei Gao wrote:
>>> op=[ASHIFT, ASHIFTRT, LSHIFTRT, ROTATE, ROTATERT]
>>>
>>> Conditional op, if zero
>>> rd = (rc == 0) ? (rs1 op rs2) : rs1
>>> -->
>>> czero.nez rd, rs2, rc
>>> op rd, rs1, rd
>>>
>>> Conditional op, if non-zero
>>> rd = (rc != 0) ? (rs1 op rs2) : rs1
>>> -->
>>> czero.eqz rd, rs2, rc
>>> op rd, rs1, rd
>>>
>>> Co-authored-by: Xiao Zeng<zengxiao@eswincomputing.com>
>>>
>>> gcc/ChangeLog:
>>>
>>> * ifcvt.cc (noce_cond_zero_binary_op_supported): add support for shift like op.
>>>            (get_base_reg): add support for subreg to handle shift amount operand.
>>>            (noce_bbs_ok_for_cond_zero_arith): to replace shift amount operand.
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>> * gcc.target/riscv/zicond_ifcvt_opt.c: add TCs for shift like op.
>> So I removed the SUBREG handling code which makes this patch merely an
>> addition of the shift/rotate ops which trivally work just like PLUS,
>> MINUS, IOR, XOR (by conditionally zero-ing the shift count) tested on
>> x86 and pushed it to the trunk.
>>
>> As I noted before while I think handling SUBREGs is important, now is
>> not the time to be adding that support.
> 
> Thanks for your review.
> Got your point to defer support for SUBREGs.
> 
> Shift-like pattern:
> (set (reg/v:DI 137 [ y ])
>          (ashift:DI (reg/v:DI 137 [ y ])
>              (subreg:QI (reg/v:DI 138 [ z ]) 0)))
> 
> No Zicond instructions are generated with the SUBREG handling code removed.
> So I noticed your changes in testcases regarding the number of czero instruction number scanned.
> Then this looks like a NFC patch.
Not on other targets -- not every target forces the shift count into a 
narrow mode.
jeff
diff mbox series

Patch

diff --git a/gcc/ifcvt.cc b/gcc/ifcvt.cc
index 1f0f5414ea1..2efae21ebfe 100644
--- a/gcc/ifcvt.cc
+++ b/gcc/ifcvt.cc
@@ -2920,7 +2920,9 @@  noce_cond_zero_binary_op_supported (rtx op)
 {
   enum rtx_code opcode = GET_CODE (op);
 
-  if (opcode == PLUS || opcode == MINUS || opcode == IOR || opcode == XOR)
+  if (opcode == PLUS || opcode == MINUS || opcode == IOR || opcode == XOR
+      || opcode == ASHIFT || opcode == ASHIFTRT || opcode == LSHIFTRT
+      || opcode == ROTATE || opcode == ROTATERT)
     return true;
 
   return false;
@@ -2934,6 +2936,8 @@  get_base_reg (rtx exp)
 {
   if (REG_P (exp))
     return exp;
+  else if (SUBREG_P (exp))
+    return SUBREG_REG (exp);
 
   return NULL_RTX;
 }
@@ -3006,6 +3010,8 @@  noce_bbs_ok_for_cond_zero_arith (struct noce_if_info *if_info, rtx *common_ptr,
 
   if (REG_P (bin_op1))
     *to_replace = &XEXP (bin_exp, 1);
+  else if (SUBREG_P (bin_op1))
+    *to_replace = &SUBREG_REG (XEXP (bin_exp, 1));
   else
     return false;
 
diff --git a/gcc/testsuite/gcc.target/riscv/zicond_ifcvt_opt.c b/gcc/testsuite/gcc.target/riscv/zicond_ifcvt_opt.c
index dcb21c15d1a..ab5a4909b61 100644
--- a/gcc/testsuite/gcc.target/riscv/zicond_ifcvt_opt.c
+++ b/gcc/testsuite/gcc.target/riscv/zicond_ifcvt_opt.c
@@ -562,5 +562,58 @@  test_XOR_eqz_x_2_reverse_bin_oprands (long x, long z, long c)
   return x;
 }
 
-/* { dg-final { scan-assembler-times {czero\.eqz} 28 } } */
+long
+test_ShiftLeft_eqz (long x, long y, long z, long c)
+{
+  if (c)
+    x = y << z;
+  else
+    x = y;
+  return x;
+}
+
+long
+test_ShiftR_eqz (long x, long y, long z, long c)
+{
+  if (c)
+    x = y >> z;
+  else
+    x = y;
+  return x;
+}
+
+unsigned long
+test_ShiftR_logical_eqz (unsigned long x, unsigned long y, unsigned long z,
+			 unsigned long c)
+{
+  if (c)
+    x = y >> z;
+  else
+    x = y;
+  return x;
+}
+
+unsigned long
+test_RotateL_eqz (unsigned long x, unsigned long y, unsigned long z,
+		  unsigned long c)
+{
+  if (c)
+    x = (y << z) | (y >> (64 - z));
+  else
+    x = y;
+  return x;
+}
+
+unsigned long
+test_RotateR_eqz (unsigned long x, unsigned long y, unsigned long z,
+		  unsigned long c)
+{
+  if (c)
+    x = (y >> z) | (y << (64 - z));
+  else
+    x = y;
+  return x;
+}
+
+/* { dg-final { scan-assembler-times {czero\.eqz} 33 } } */
 /* { dg-final { scan-assembler-times {czero\.nez} 28 } } */