diff mbox series

[v2] RISC-V: zero_extend(not) -> xor optimization [PR112398]

Message ID ZytXbcmjgiYq4GOh@amerzlyakov-PC
State New
Headers show
Series [v2] RISC-V: zero_extend(not) -> xor optimization [PR112398] | expand

Commit Message

Alexey Merzlyakov Nov. 6, 2024, 11:47 a.m. UTC
Hi, Jeff,

Thank you for the review!

All items were met, please find the comments and PATCH v2 in the message below:

On Mon, Nov 04, 2024 at 04:48:31PM -0700, Jeff Law wrote:
> > +      /* Trying to optimize:
> > +     (zero_extend:M (subreg:N (not:M (X:M)))) ->
> > +     (xor:M (zero_extend:M (subreg:N (X:M)), 0xffff))
> > +     where mask takes 0xffff bits of N mode bitsize.
> "where the mask is GET_MODE_MASK (N)" is probably clearer to folks that have
> been working on GCC for a while.  GET_MODE_MASK is the bits set in mode N.
> So for QI -> 0xff, HI would be 0xffff and so-on.

Fixed comment to GET_MODE_MASK (N). Also, used the same macro to obtain
the mask instead of ~0 << ... calculations.

> > +     For the cases when X:M doesn't have any non-zero bits
> > +     outside of mode N, (zero_extend:M (subreg:N (X:M))
> > +     will be simplified to just (X:M)
> > +     and whole optimization will be -> (xor:M (X:M), 0xffff). */
> > +      if (GET_CODE (op) == SUBREG
> Write this as "SUBREG_P (op)"

Done

> > +      && GET_CODE (XEXP (op, 0)) == NOT
> > +      && GET_MODE (XEXP (op, 0)) == mode
> > +      && GET_MODE (XEXP (XEXP (op, 0), 0)) == mode
> I suspect the last mode check is redundant.   We're not supposed to have the
> argument of a code like NOT have a different mode than the code.

Yep, all right. The operand of NOT should be in the same as operation itself
mode. Also checked the logic for other NOT cases in simplify-rtx:
there are no NOT operand's mode checks for all cases.
So, removed the excess check.
 
> > +      && (nonzero_bits (XEXP (XEXP (op, 0), 0), mode)
> > +          & ~GET_MODE_MASK (GET_MODE (XEXP (XEXP (op, 0), 0)))) == 0)
> Isn't GET_MODE (XEXP (XEXP (op, 0), 0)) the same as "mode"?  Use the value
> in the local variable, it's easier to read IMHO.

Fixed

> I suspect the formatting is off as well.  We format conditionals like this:
> 
> if (a
>     && b
>     && c)
> 
> Note how the "&&" lines up under the first argument to the IF.

This was the issue of my email client settings: it was intentionally replaced
all tabs with spaces. This also, seem to the cause CI issues as well.

Trying to use another client in plain text mode.

> > +      {
> > +    const uint64_t mask
> > +      = ~((uint64_t)~0 << GET_MODE_BITSIZE (GET_MODE (op)).coeffs[0]);
> Probably shouldn't be looking directly at .coeffs field.  Instead there are
> methods to convert that value to an integer.  Look for the "to_constant ()"
> method.  so
> GET_MODE_SIZE (GET_MODE (op)).to_constant ()

Replaced these calculations to mask = GET_MODE_MASK (N), so it's not needed
anymore

> 
> Note that this transformation may not work for modes with nonconstant sizes.
> So you might need to check the is_constant () method.

Missed it. Added in the last patch.

> > +    return simplify_gen_binary (XOR, mode,
> > +                    XEXP (XEXP (op, 0), 0),
> > +                    gen_int_mode (mask, mode));
> Formatting looks goofy here too.
> 
> Line up each argument under the argument in the prior line ie
> 
>   foo (XOR, mode
>        XEXP (XEXP (op, 0), 0)
>        get_int_mode (mask, mode))

The same issue of my email client. Fixed.

-- >8 --

This patch adds optimization of the following patterns:

  (zero_extend:M (subreg:N (not:O==M (X:Q==M)))) ->
  (xor:M (zero_extend:M (subreg:N (X:M)), mask))
    ... where the mask is GET_MODE_MASK (N).

For the cases when X:M doesn't have any non-zero bits outside of mode N,
(zero_extend:M (subreg:N (X:M)) could be simplified to just (X:M)
and whole optimization will be:

  (zero_extend:M (subreg:N (not:M (X:M)))) ->
  (xor:M (X:M, mask))

Patch targets to handle code patterns like:
  not   a0,a0
  andi  a0,a0,0xff
to be optimized to:
  xori  a0,a0,255

Change was locally tested for x86_64 and AArch64 (as most common)
and for RV-64 and MIPS-32 targets (as having an effect from this optimization):
no regressions for all cases.

gcc/ChangeLog:

	* simplify-rtx.cc (simplify_context::simplify_unary_operation_1):
	Simplify ZERO_EXTEND (SUBREG (NOT X)) to XOR (X, GET_MODE_MASK(SUBREG))
	when X doesn't have any non-zero bits outside of SUBREG mode.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/pr112398.c: New test.

Signed-off-by: Alexey Merzlyakov <alexey.merzlyakov@samsung.com>
---
 gcc/simplify-rtx.cc                       | 22 ++++++++++++++++++++++
 gcc/testsuite/gcc.target/riscv/pr112398.c | 14 ++++++++++++++
 2 files changed, 36 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/pr112398.c

Comments

Jeff Law Nov. 6, 2024, 9:41 p.m. UTC | #1
On 11/6/24 4:47 AM, Alexey Merzlyakov wrote:
> This patch adds optimization of the following patterns:
> 
>    (zero_extend:M (subreg:N (not:O==M (X:Q==M)))) ->
>    (xor:M (zero_extend:M (subreg:N (X:M)), mask))
>      ... where the mask is GET_MODE_MASK (N).
> 
> For the cases when X:M doesn't have any non-zero bits outside of mode N,
> (zero_extend:M (subreg:N (X:M)) could be simplified to just (X:M)
> and whole optimization will be:
> 
>    (zero_extend:M (subreg:N (not:M (X:M)))) ->
>    (xor:M (X:M, mask))
> 
> Patch targets to handle code patterns like:
>    not   a0,a0
>    andi  a0,a0,0xff
> to be optimized to:
>    xori  a0,a0,255
> 
> Change was locally tested for x86_64 and AArch64 (as most common)
> and for RV-64 and MIPS-32 targets (as having an effect from this optimization):
> no regressions for all cases.
> 
> gcc/ChangeLog:
> 
> 	* simplify-rtx.cc (simplify_context::simplify_unary_operation_1):
> 	Simplify ZERO_EXTEND (SUBREG (NOT X)) to XOR (X, GET_MODE_MASK(SUBREG))
> 	when X doesn't have any non-zero bits outside of SUBREG mode.
> 
> gcc/testsuite/ChangeLog:
> 
> 	* gcc.target/riscv/pr112398.c: New test.
> 
> Signed-off-by: Alexey Merzlyakov<alexey.merzlyakov@samsung.com>
Thanks.   I just pushed this to the trunk.

Jeff
diff mbox series

Patch

diff --git a/gcc/simplify-rtx.cc b/gcc/simplify-rtx.cc
index a20a61c5ddd..e622d9554f1 100644
--- a/gcc/simplify-rtx.cc
+++ b/gcc/simplify-rtx.cc
@@ -1842,6 +1842,28 @@  simplify_context::simplify_unary_operation_1 (rtx_code code, machine_mode mode,
 	      & ~GET_MODE_MASK (op_mode)) == 0)
 	return SUBREG_REG (op);
 
+      /* Trying to optimize:
+	 (zero_extend:M (subreg:N (not:M (X:M)))) ->
+	 (xor:M (zero_extend:M (subreg:N (X:M)), mask))
+	 where the mask is GET_MODE_MASK (N).
+	 For the cases when X:M doesn't have any non-zero bits
+	 outside of mode N, (zero_extend:M (subreg:N (X:M))
+	 will be simplified to just (X:M)
+	 and whole optimization will be -> (xor:M (X:M, mask)).  */
+      if (SUBREG_P (op)
+	  && GET_CODE (XEXP (op, 0)) == NOT
+	  && GET_MODE (XEXP (op, 0)) == mode
+	  && subreg_lowpart_p (op)
+	  && GET_MODE_SIZE (GET_MODE (op)).is_constant ()
+	  && (nonzero_bits (XEXP (XEXP (op, 0), 0), mode)
+	      & ~GET_MODE_MASK (mode)) == 0)
+      {
+	const uint64_t mask = GET_MODE_MASK (GET_MODE (op));
+	return simplify_gen_binary (XOR, mode,
+				    XEXP (XEXP (op, 0), 0),
+				    gen_int_mode (mask, mode));
+      }
+
 #if defined(POINTERS_EXTEND_UNSIGNED)
       /* As we do not know which address space the pointer is referring to,
 	 we can do this only if the target does not support different pointer
diff --git a/gcc/testsuite/gcc.target/riscv/pr112398.c b/gcc/testsuite/gcc.target/riscv/pr112398.c
new file mode 100644
index 00000000000..624a665b76c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/pr112398.c
@@ -0,0 +1,14 @@ 
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc -mabi=lp64d" } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" } } */
+
+#include <stdint.h>
+
+uint8_t neg_u8 (const uint8_t src)
+{
+  return ~src;
+}
+
+/* { dg-final { scan-assembler-times "xori\t" 1 } } */
+/* { dg-final { scan-assembler-not "not\t" } } */
+/* { dg-final { scan-assembler-not "andi\t" } } */