diff mbox series

RISC-V: Improve code generation for select of consecutive constants

Message ID DB9PR08MB6634E94834941085D2F96115B7612@DB9PR08MB6634.eurprd08.prod.outlook.com
State New
Headers show
Series RISC-V: Improve code generation for select of consecutive constants | expand

Commit Message

Jovan Vukic Sept. 17, 2024, 11:25 a.m. UTC
The patch optimizes code generated for comparisons of the form x > y ? 2 : 3
(x <= y ? 3 : 2) and x < y ? 2 : 3 (x >= y ? 3 : 2).

For the following C code:

long f1(long x, long y) {
    return (x > y) ? 2 : 3;
}

long f2(long x, long y) {
    return (x < y) ? 2 : 3;
}


Before the patch, the generated assembly is:

f1(long, long):
        sgt     a0, a0, a1
        xori    a0, a0, 1
        addi    a0, a0, 2
        ret

f2(long, long):
        slt     a0, a0, a1
        xori    a0, a0, 1
        addi    a0, a0, 2
        ret


After the patch, the generated assembly is:

f1(long, long):
        slt     a0,a1,a0
        xori    a0,a0,3
        ret

f2(long, long):
        slt     a0,a0,a1
        xori    a0,a0,3
        ret


If we only consider the first example, during the combine pass, it can match
the "le:" pattern (x <= y), but that can only be replaced by the inverse
operation x > y (y < x), which requires replacing the following addi
instruction with an xori instruction as well, as proposed by the patch.

Tested on both RV32 and RV64 with no regressions.

2024-09-17  Jovan Vukic  <Jovan.Vukic@rt-rk.com>

        PR target/108038

gcc/ChangeLog:

        * config/riscv/iterators.md (paired_lt): New code attributes.
        * config/riscv/riscv.md (*sle<u>_<X:mode><GPR:mode>_xor): New pattern.
        (*sge<u>_<X:mode><GPR:mode>_xor): New pattern.

gcc/testsuite/ChangeLog:

        * gcc.target/riscv/slt-1.c: New test.

CONFIDENTIALITY: The contents of this e-mail are confidential and intended only for the above addressee(s). If you are not the intended recipient, or the person responsible for delivering it to the intended recipient, copying or delivering it to anyone else or using it in any unauthorized manner is prohibited and may be unlawful. If you receive this e-mail by mistake, please notify the sender and the systems administrator at straymail@rt-rk.com immediately.
---
 gcc/config/riscv/iterators.md          |  4 +++
 gcc/config/riscv/riscv.md              | 32 +++++++++++++++++++
 gcc/testsuite/gcc.target/riscv/slt-1.c | 44 ++++++++++++++++++++++++++
 3 files changed, 80 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/slt-1.c

Comments

Jeff Law Sept. 18, 2024, 4:42 p.m. UTC | #1
On 9/17/24 5:25 AM, Jovan Vukic wrote:
> The patch optimizes code generated for comparisons of the form x > y ? 2 : 3
> (x <= y ? 3 : 2) and x < y ? 2 : 3 (x >= y ? 3 : 2).
> 
> For the following C code:
> 
> long f1(long x, long y) {
>      return (x > y) ? 2 : 3;
> }
> 
> long f2(long x, long y) {
>      return (x < y) ? 2 : 3;
> }
> 
> 
> Before the patch, the generated assembly is:
> 
> f1(long, long):
>          sgt     a0, a0, a1
>          xori    a0, a0, 1
>          addi    a0, a0, 2
>          ret
> 
> f2(long, long):
>          slt     a0, a0, a1
>          xori    a0, a0, 1
>          addi    a0, a0, 2
>          ret
> 
> 
> After the patch, the generated assembly is:
> 
> f1(long, long):
>          slt     a0,a1,a0
>          xori    a0,a0,3
>          ret
> 
> f2(long, long):
>          slt     a0,a0,a1
>          xori    a0,a0,3
>          ret
> 
> 
> If we only consider the first example, during the combine pass, it can match
> the "le:" pattern (x <= y), but that can only be replaced by the inverse
> operation x > y (y < x), which requires replacing the following addi
> instruction with an xori instruction as well, as proposed by the patch.
> 
> Tested on both RV32 and RV64 with no regressions.
> 
> 2024-09-17  Jovan Vukic  <Jovan.Vukic@rt-rk.com>
> 
>          PR target/108038
> 
> gcc/ChangeLog:
> 
>          * config/riscv/iterators.md (paired_lt): New code attributes.
>          * config/riscv/riscv.md (*sle<u>_<X:mode><GPR:mode>_xor): New pattern.
>          (*sge<u>_<X:mode><GPR:mode>_xor): New pattern.
> 
> gcc/testsuite/ChangeLog:
> 
>          * gcc.target/riscv/slt-1.c: New test.
So you're essentially taking advantage of the fact that the bits set by 
the sCC and the constant in the addi are disjoint. That's a pretty 
generic concept.  So there's a real possibility we could do this in 
generic code so that everyone could benefit.

The first point in the various pipelines would probably be phi-opt which 
tries to collapse certain if-then constructs into conditional 
expressions (tree-ssa-phiopt.cc).

It'll be presented with something like this:

> ;;   basic block 2, loop depth 0
> ;;    pred:       ENTRY
>   if (x_2(D) > y_3(D))
>     goto <bb 3>; [50.00%]
>   else
>     goto <bb 4>; [50.00%]
> ;;    succ:       3
> ;;                4
> 
> ;;   basic block 3, loop depth 0
> ;;    pred:       2
> ;;    succ:       4
> 
> ;;   basic block 4, loop depth 0
> ;;    pred:       2
> ;;                3
>   # _1 = PHI <3(2), 2(3)>

Which seems  well suited for value_replacement in tree-ssa-phiopt.cc 
except that value_replacement seems to want an equality comparison and 
SSA_NAMEs (rather than constants) as PHI args.  But still, it's pretty 
close.  Somewhere in tree-ssa-phiopt.cc seems like the best place to 
handle this.

An alternative would be in in simplify_binary_operation in the RTL 
pipeline.  At some point it should be passed a useful PLUS as the outer 
operator and (xor (reg) (const_int 1)) as the first argument and the 
constant as the second argument.

*If* we ultimately decided to do this in the risc-v backend, the first 
thing I would suggest would be to try and use a define_split rather than 
a define_insn_and_split.  The latter tends to throw off costing models 
for splitting, thus the former is slightly preferred.

And just to be 100% clear, I think this is a good patch, but that 
there's a better way to accomplish what you're trying to do.

jeff
diff mbox series

Patch

diff --git a/gcc/config/riscv/iterators.md b/gcc/config/riscv/iterators.md
index 2844cb02ff0..964d0f24f4c 100644
--- a/gcc/config/riscv/iterators.md
+++ b/gcc/config/riscv/iterators.md
@@ -234,6 +234,10 @@ 
 (define_code_iterator any_lt [lt ltu])
 (define_code_iterator any_le [le leu])
 
+(define_code_attr paired_lt [(gt "lt") (gtu "ltu")
+		     (ge "lt") (geu "ltu")
+		     (le "lt") (leu "ltu")])
+
 ; atomics code iterator
 (define_code_iterator any_atomic [plus ior xor and])
 
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 9f94b5aa023..b22a8ac05e0 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -3600,6 +3600,38 @@ 
   [(set_attr "type" "slt")
    (set_attr "mode" "<X:MODE>")])
 
+(define_insn_and_split "*sle<u>_<X:mode><GPR:mode>_xor"
+  [(set (match_operand:GPR 0 "register_operand" "=r")
+	(plus:GPR (any_le:GPR (match_operand:X 1 "register_operand" " r")
+			   (match_operand:X 2 "register_operand" " r"))
+		   (match_operand 3 "const_arith_operand" "i")))]
+  "INTVAL (operands[3]) % 2 == 0"
+  "#"
+  "&& 1"
+  [(set (match_dup 0) (<paired_lt>:GPR (match_dup 2) (match_dup 1)))
+   (set (match_dup 0) (xor:GPR (match_dup 0) (match_dup 3)))]
+{
+  operands[3] = GEN_INT (INTVAL (operands[3]) + 1);
+}
+  [(set_attr "type" "slt")
+   (set_attr "mode" "<X:MODE>")])
+
+(define_insn_and_split "*sge<u>_<X:mode><GPR:mode>_xor"
+  [(set (match_operand:GPR 0 "register_operand" "=r")
+	(plus:GPR (any_ge:GPR (match_operand:X 1 "register_operand" " r")
+			   (match_operand:X 2 "register_operand" " r"))
+		   (match_operand 3 "const_arith_operand" "i")))]
+  "INTVAL (operands[3]) % 2 == 0"
+  "#"
+  "&& 1"
+  [(set (match_dup 0) (<paired_lt>:GPR (match_dup 1) (match_dup 2)))
+   (set (match_dup 0) (xor:GPR (match_dup 0) (match_dup 3)))]
+{
+  operands[3] = GEN_INT (INTVAL (operands[3]) + 1);
+}
+  [(set_attr "type" "slt")
+   (set_attr "mode" "<X:MODE>")])
+
 ;;
 ;;  ....................
 ;;
diff --git a/gcc/testsuite/gcc.target/riscv/slt-1.c b/gcc/testsuite/gcc.target/riscv/slt-1.c
new file mode 100644
index 00000000000..c672da92acf
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/slt-1.c
@@ -0,0 +1,44 @@ 
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc -mabi=lp64d" } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" } } */
+
+#include <stdint.h>
+
+#define COMPARISON(TYPE, OP, OPN, RESULT_TRUE, RESULT_FALSE) \
+    TYPE test##OPN(TYPE x, TYPE y) { \
+        return (x OP y) ? RESULT_TRUE : RESULT_FALSE; \
+    }
+
+/* Signed comparisons */
+COMPARISON(int64_t, >, GT1, 2, 3)
+COMPARISON(int64_t, >, GT3, 5, 6)
+
+COMPARISON(int64_t, <, LT1, 2, 3)
+COMPARISON(int64_t, <, LT3, 5, 6)
+
+COMPARISON(int64_t, >=, GE2, 3, 2)
+COMPARISON(int64_t, >=, GE4, 6, 5)
+
+COMPARISON(int64_t, <=, LE2, 3, 2)
+COMPARISON(int64_t, <=, LE4, 6, 5)
+
+/* Unsigned comparisons */
+COMPARISON(uint64_t, >, GTU1, 2, 3)
+COMPARISON(uint64_t, >, GTU3, 5, 6)
+
+COMPARISON(uint64_t, <, LTU1, 2, 3)
+COMPARISON(uint64_t, <, LTU3, 5, 6)
+
+COMPARISON(uint64_t, >=, GEU2, 3, 2)
+COMPARISON(uint64_t, >=, GEU4, 6, 5)
+
+COMPARISON(uint64_t, <=, LEU2, 3, 2)
+COMPARISON(uint64_t, <=, LEU4, 6, 5)
+
+/* { dg-final { scan-assembler-times "sgt\\t" 2 } } */
+/* { dg-final { scan-assembler-times "sgtu\\t" 2 } } */
+/* { dg-final { scan-assembler-times "slt\\t" 6 } } */
+/* { dg-final { scan-assembler-times "sltu\\t" 6 } } */
+/* { dg-final { scan-assembler-times "xori\\ta0,a0,1" 8 } } */
+/* { dg-final { scan-assembler-times "xori\\ta0,a0,3" 8 } } */
+