diff mbox series

RISC-V: optimization on checking certain bits set ((x & mask) == val)

Message ID AS1PR08MB7449D252D609A655A60571FEC5312@AS1PR08MB7449.eurprd08.prod.outlook.com
State New
Headers show
Series RISC-V: optimization on checking certain bits set ((x & mask) == val) | expand

Commit Message

Oliver Kozul Dec. 6, 2024, 1:12 p.m. UTC
The patch optimizes code generation for comparisons of the form
X & C1 == C2 by converting them to (X | ~C1) == (C2 | ~C1).
C1 is a constant that requires li and addi to be loaded,
while ~C1 requires a single lui instruction.

2024-12-06  Oliver Kozul  <oliver.kozul@rt-rk.com>

      PR target/114087

gcc/ChangeLog:

      * config/riscv/riscv.md (*lui_constraint<ANYI:mode>_and_to_or): New pattern.

gcc/testsuite/ChangeLog:

      * gcc.target/riscv/pr114087-1.c: New test.



CONFIDENTIALITY: The contents of this e-mail are confidential and intended only for the above addressee(s). If you are not the intended recipient, or the person responsible for delivering it to the intended recipient, copying or delivering it to anyone else or using it in any unauthorized manner is prohibited and may be unlawful. If you receive this e-mail by mistake, please notify the sender and the systems administrator at straymail@rt-rk.com immediately.
---
 gcc/config/riscv/riscv.md                   | 21 +++++++++++++++++++++
 gcc/testsuite/gcc.target/riscv/pr114087-1.c | 10 ++++++++++
 2 files changed, 31 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/pr114087-1.c

Comments

Jeff Law Dec. 15, 2024, 12:19 a.m. UTC | #1
On 12/6/24 6:12 AM, Oliver Kozul wrote:
> The patch optimizes code generation for comparisons of the form
> X & C1 == C2 by converting them to (X | ~C1) == (C2 | ~C1).
> C1 is a constant that requires li and addi to be loaded,
> while ~C1 requires a single lui instruction.
> 
> 2024-12-06  Oliver Kozul  <oliver.kozul@rt-rk.com>
> 
>       PR target/114087
> 
> gcc/ChangeLog:
> 
>       * config/riscv/riscv.md (*lui_constraint<ANYI:mode>_and_to_or): 
> New pattern.
> 
> gcc/testsuite/ChangeLog:
> 
>       * gcc.target/riscv/pr114087-1.c: New test.
A few comments:



>   
> +(define_insn_and_split "*lui_constraint<ANYI:mode>_and_to_or"
> +  [(set (match_operand:ANYI 0 "register_operand" "=r")
> +    (plus:ANYI (and:ANYI (match_operand:ANYI 1 "register_operand" "r")
> +    (match_operand 2 "const_int_operand"))
> +    (match_operand 3 "const_int_operand")))
> +    (clobber (match_scratch:X 4 "=&r"))]
Similar formatting comment as I had on the other patch.  Easily fixed. 
Just for giggles I asked chatgpt to format it and it came up with:

> (define_insn_and_split "*lui_constraint<ANYI:mode>_and_to_or"
>   [(set (match_operand:ANYI 0 "register_operand" "=r")
>         (plus:ANYI (and:ANYI (match_operand:ANYI 1 "register_operand" "r")
>                              (match_operand 2 "const_int_operand"))
>                    (match_operand 3 "const_int_operand")))
>    (clobber (match_scratch:X 4 "=&r"))]
>   "LUI_OPERAND (INTVAL (operands[2]) + 1)
>    && (INTVAL (operands[2]) & (-INTVAL (operands[3]))) == (-INTVAL (operands[3]))"
>   "#"
>   "&& reload_completed"
>   [(set (match_dup 4) (match_dup 5))
>    (set (match_dup 0) (ior:X (match_dup 1) (match_dup 4)))
>    (set (match_dup 4) (match_dup 6))
>    (set (match_dup 0) (minus:X (match_dup 0) (match_dup 4)))]
>   {
>     operands[5] = GEN_INT (~INTVAL (operands[2]));
>     operands[6] = GEN_INT ((~INTVAL (operands[2])) | (-INTVAL (operands[3])));
>   }
>   [(set_attr "type" "arith")])

Which looks basically correct.




More importantly we probably want a comment on what this pattern is 
actually doing.  Perhaps something like:

;; Transform (X & C1) + C2 into (X | ~C1) - (-C2 | ~C1)
;; Where C1 is not a LUI operand, but ~C1 is a LUI operand

Did I get that correct?



> +  "LUI_OPERAND (INTVAL (operands[2]) + 1)
So shouldn't this have been LUI_OPERAND (~INTVAL (operands[2]))?  I can 
kind-of see why you used +1, but given the C fragment in the pattern 
uses ~operands[2], let's use the same here for the test if we can.




> +  && (INTVAL (operands[2]) & (-INTVAL (operands[3])))
> +  == (-INTVAL (operands[3]))"
Does it make sense to verify that operands[3] is a constant that 
requires synthesis and that ~INTVAL (operands[2]) | -INTVAL 
(operands[3]) is no more expensive to synthesize than operands[3]?

ie, saving the addi on the first constant isn't a win if the second 
constant gets more complex to synthesize.  The API isn't great, but you 
can use riscv_const_insns to find out how many instructions are 
necessary to synthesize a constant.


Overall it looks good.  Just needs a bit of polishing.

jeff
Oliver Kozul Dec. 15, 2024, 10:11 a.m. UTC | #2
Thank you again for your helpful comments.

I will implement the formatting changes you mentioned.

I agree that a description like:

;; Transform (X & C1) + C2 into (X | ~C1) - (-C2 | ~C1)
;; Where C1 is not a LUI operand, but ~C1 is a LUI operand

is more fitting since we are catching a plus expression.

Your proposed condition also makes more sense than the current one.
LUI_OPERAND (INTVAL (operands[2]) + 1) made sense when starting out,
but I never got around to changing to LUI_OPERAND (~INTVAL (operands[2]).

I did not know about the costs of constant synthesis.
I thought that the values of those constants are known at compile time, and therefore their creation would not have an impact on performance.
I will definitely take a look at riscv_const_insns and evaluate the synthesis complexity.

Best regards,
Oliver
Jeff Law Dec. 15, 2024, 2:15 p.m. UTC | #3
On 12/15/24 3:11 AM, Oliver Kozul wrote:

> 
> I did not know about the costs of constant synthesis.
> I thought that the values of those constants are known at compile time, 
> and therefore their creation would not have an impact on performance.
> I will definitely take a look at riscv_const_insns and evaluate the 
> synthesis complexity.
On an rv32 system constructing a word sized constant takes at most two 
instructions (lui+addi).  On rv64 it can take as many as 6 instructions 
(lui+addi+lui+addi+sll+ior).  If you dive into riscv_const_insns you'll 
see a ton of code that's designed to bring down the cost of constant 
synthesis, particularly on rv64 using various tricks.

Jeff
diff mbox series

Patch

diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 3a4cd1d93a0..add31bbf51c 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -858,6 +858,27 @@ 
   [(set_attr "type" "arith")
    (set_attr "mode" "SI")])
 
+(define_insn_and_split "*lui_constraint<ANYI:mode>_and_to_or"
+  [(set (match_operand:ANYI 0 "register_operand" "=r")
+    (plus:ANYI (and:ANYI (match_operand:ANYI 1 "register_operand" "r")
+    (match_operand 2 "const_int_operand"))
+    (match_operand 3 "const_int_operand")))
+    (clobber (match_scratch:X 4 "=&r"))]
+  "LUI_OPERAND (INTVAL (operands[2]) + 1)
+  && (INTVAL (operands[2]) & (-INTVAL (operands[3])))
+  == (-INTVAL (operands[3]))"
+  "#"
+  "&& reload_completed"
+  [(set (match_dup 4) (match_dup 5))
+   (set (match_dup 0) (ior:X (match_dup 1) (match_dup 4)))
+   (set (match_dup 4) (match_dup 6))
+   (set (match_dup 0) (minus:X (match_dup 0) (match_dup 4)))]
+  {
+    operands[5] = GEN_INT (~INTVAL (operands[2]));
+    operands[6] = GEN_INT ((~INTVAL (operands[2])) | (-INTVAL (operands[3])));
+  }
+  [(set_attr "type" "arith")])
+
 ;;
 ;;  ....................
 ;;
diff --git a/gcc/testsuite/gcc.target/riscv/pr114087-1.c b/gcc/testsuite/gcc.target/riscv/pr114087-1.c
new file mode 100644
index 00000000000..5e40b5f7b5b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/pr114087-1.c
@@ -0,0 +1,10 @@ 
+/* { dg-do compile } */
+/* { dg-require-effective-target rv64 } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-O1" "-Og" } } */
+/* { dg-options "-march=rv64gc -mabi=lp64d" } */
+
+int pred1a(int x) {
+  return ((x & 0x55555FFF) == 0x14501DEF);
+}
+
+/* { dg-final { scan-assembler  {or\s*[a-x0-9]+,\s*[a-x0-9]+,\s*[a-x0-9]+} } } */