diff mbox series

[v2] RISC-V: zicond: Fix opt2 pattern

Message ID 20230901195311.761131-1-vineetg@rivosinc.com
State New
Headers show
Series [v2] RISC-V: zicond: Fix opt2 pattern | expand

Commit Message

Vineet Gupta Sept. 1, 2023, 7:53 p.m. UTC
This was tripping up gcc.c-torture/execute/pr60003.c at -O1 since in
failing case, pattern's asm czero.nez gets both rs2 and rs1 as non zero.

We start with the following src code snippet:

      if (a == 0)
	return 0;
      else
	return x;
    }

which is equivalent to:  "x = (a != 0) ? x : a" where x is NOT 0.
                                                ^^^^^^^^^^^^^^^^

and matches define_insn "*czero.nez.<GPR:mode><X:mode>.opt2"

| (insn 41 20 38 3 (set (reg/v:DI 136 [ x ])
|        (if_then_else:DI (ne (reg/v:DI 134 [ a ])
|                (const_int 0 [0]))
|            (reg/v:DI 136 [ x ])
|            (reg/v:DI 134 [ a ]))) {*czero.nez.didi.opt2}

The corresponding asm pattern generates
    czero.nez x, x, a   ; %0, %2, %1

which implies
    "x = (a != 0) ? 0 : a"

clearly not what the pattern wants to do.

Essentially "(a != 0) ? x : a" cannot be expressed with CZERO.nez if X
is not guaranteed to be 0.

However this can be fixed with a small tweak

"x = (a != 0) ? x : a"

   is same as

"x = (a == 0) ? a : x" since middle operand is 0 when a == 0.

which can be expressed with CZERO.eqz

before fix			after fix
-----------------		-----------------
li        a5,1		        li        a5,1
ld        a4,8(sp)	        ld        a4,8(sp)   # a4 is runtime non zero
czero.nez a0,a4,a5 # a0=0 NOK 	czero.eqz a0,a4,a5   # a0=a4!=0 OK

The issue only happens at -O1 as at higher optimization levels, the
whole conditional move gets optimized away.

This fixes 4 testsuite failues in a zicond build:

FAIL: gcc.c-torture/execute/pr60003.c   -O1  execution test
FAIL: gcc.dg/setjmp-3.c execution test
FAIL: gcc.dg/torture/stackalign/setjmp-3.c   -O1  execution test
FAIL: gcc.dg/torture/stackalign/setjmp-3.c   -O1 -fpic execution test

gcc/ChangeLog:
	* config/riscv/zicond.md: Fix op2 pattern.

Fixes: 1d5bc3285e8a ("[committed][RISC-V] Fix 20010221-1.c with zicond")
Signed-off-by: Vineet Gupta <vineetg@rivosinc.com>
---
changes since v1
   - instead of discarding opt2 pattern, fix the asm
---
 gcc/config/riscv/zicond.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Jeff Law Sept. 4, 2023, 11:16 p.m. UTC | #1
On 9/1/23 13:53, Vineet Gupta wrote:
> This was tripping up gcc.c-torture/execute/pr60003.c at -O1 since in
> failing case, pattern's asm czero.nez gets both rs2 and rs1 as non zero.
> 
> We start with the following src code snippet:
> 
>        if (a == 0)
> 	return 0;
>        else
> 	return x;
>      }
> 
> which is equivalent to:  "x = (a != 0) ? x : a" where x is NOT 0.
>                                                  ^^^^^^^^^^^^^^^^
> 
> and matches define_insn "*czero.nez.<GPR:mode><X:mode>.opt2"
> 
> | (insn 41 20 38 3 (set (reg/v:DI 136 [ x ])
> |        (if_then_else:DI (ne (reg/v:DI 134 [ a ])
> |                (const_int 0 [0]))
> |            (reg/v:DI 136 [ x ])
> |            (reg/v:DI 134 [ a ]))) {*czero.nez.didi.opt2}
> 
> The corresponding asm pattern generates
>      czero.nez x, x, a   ; %0, %2, %1
> 
> which implies
>      "x = (a != 0) ? 0 : a"
> 
> clearly not what the pattern wants to do.
> 
> Essentially "(a != 0) ? x : a" cannot be expressed with CZERO.nez if X
> is not guaranteed to be 0.
> 
> However this can be fixed with a small tweak
> 
> "x = (a != 0) ? x : a"
> 
>     is same as
> 
> "x = (a == 0) ? a : x" since middle operand is 0 when a == 0.
> 
> which can be expressed with CZERO.eqz
> 
> before fix			after fix
> -----------------		-----------------
> li        a5,1		        li        a5,1
> ld        a4,8(sp)	        ld        a4,8(sp)   # a4 is runtime non zero
> czero.nez a0,a4,a5 # a0=0 NOK 	czero.eqz a0,a4,a5   # a0=a4!=0 OK
> 
> The issue only happens at -O1 as at higher optimization levels, the
> whole conditional move gets optimized away.
> 
> This fixes 4 testsuite failues in a zicond build:
> 
> FAIL: gcc.c-torture/execute/pr60003.c   -O1  execution test
> FAIL: gcc.dg/setjmp-3.c execution test
> FAIL: gcc.dg/torture/stackalign/setjmp-3.c   -O1  execution test
> FAIL: gcc.dg/torture/stackalign/setjmp-3.c   -O1 -fpic execution test
> 
> gcc/ChangeLog:
> 	* config/riscv/zicond.md: Fix op2 pattern.
> 
> Fixes: 1d5bc3285e8a ("[committed][RISC-V] Fix 20010221-1.c with zicond")
> Signed-off-by: Vineet Gupta <vineetg@rivosinc.com>
OK.

Still not sure why I didn't trip over it in my own testing (execute.exp 
runs pr60003.c), but regardless, glad to have it fixed.

jeff
Tsukasa OI Sept. 5, 2023, 2:19 a.m. UTC | #2
Sorry, I want to directly reply to Jeff but I couldn't because I haven't
subscribed to gcc-patches and Jeff's recent reply hasn't archived yet.

Bug confirmed for me.

I tried the full test with following configuration (I found another bug
[ICE] as I submitted a quick fix while testing this and requires
following patch set to be applied; will make a PATCH v2 though):
<https://gcc.gnu.org/pipermail/gcc-patches/2023-September/629175.html>

Possibly, ICE, simulator configuration and/or dirty build tree might be
the reason Jeff couldn't reproduce the bug.

# ZiCond enabled
# Remove "_zicond" to disable ZiCond.
# ${SYSROOT} points to the prebuilt sysroot with
# glibc + libgcc with -march=rv64imafdc -mabi=lp64d
${GCC_SRCDIR}/configure \
    --target=riscv64-unknown-linux-gnu \
    --prefix=${PREFIX}               \
    --with-sysroot=${SYSROOT}        \
    --with-system-zlib               \
    --disable-shared                 \
    --enable-tls                     \
    --enable-languages=c,c++         \
    --disable-libmudflap             \
    --disable-libssp                 \
    --disable-libquadmath            \
    --disable-libsanitizer           \
    --disable-nls                    \
    --disable-bootstrap              \
    --disable-multilib               \
    --with-tune=rocket               \
    --with-arch=rv64imafdc_zicond    \
    --with-abi=lp64d                 \
    --with-isa-spec=20191213

Then I ran "make; make check RUNTESTFLAGS='--target_board=riscv-sim'".
Note that I configured DejaGnu (riscv-sim.exp) to execute tests with:
"qemu-riscv64 -L ${SYSROOT} -cpu rv64,g=on,x-zicond=on" (QEMU 8.1.0
Linux user emulation).

Warning: abort() on QEMU with Linux user emulation causes QEMU to abort,
too (possibly making many coredumps).

The diff of test failures are as follows.
-: Occurs only when ZiCond is disabled
+: Occurs only when ZiCond is enabled

-FAIL: 30_threads/async/async.cc execution test
+FAIL: gcc.c-torture/execute/pr60003.c   -O1  execution test
+FAIL: gcc.dg/setjmp-3.c execution test
+FAIL: gcc.dg/torture/stackalign/setjmp-3.c   -O1  execution test
+FAIL: gcc.dg/torture/stackalign/setjmp-3.c   -O1 -fpic execution test

I'm not sure why 30_threads/async/async.cc succeeds after enabling the
'Zicond' extension but I am sure that setjmp-3.c failures are caused by
this very bug.

Smaller example (not involving setjmp / longjmp) to reproduce this bug
in my environment is as follows (you *don't* have to apply my patch
above, make all-gcc && make install-gcc overwriting existing RV64 GCC
prefix will work):

> #include <stdio.h>
> 
> __attribute__((noinline, noclone))
> void sample(long* a)
> {
>     *a = 1;
> }
> 
> __attribute__((noinline, noclone))
> long foo(long x)
> {
>     long a = 0;
>     sample(&a); // a is overwritten to 1.
>     if (a == 0)
>         return 0;
>     else
>         return x; // should be always taken
> }
> 
> int main(int argc, char** argv)
> {
>     printf("%ld\n", foo(5)); // should print 5
>     return 0;
> }

Note that we have to make sure that variables are not easily inferred by
another optimization pass (that's why I needed two functions).

> riscv64-unknown-linux-gnu-gcc -march=rv64gc_zicond -O1 -static a.c
> qemu-riscv64 -cpu rv64,g=on,x-zicond=on ./a.out

printed 0, not 5 as I expected.

I support Vineet's patch set (v2).

Thanks,
Tsukasa
Jeff Law Sept. 5, 2023, 5:33 a.m. UTC | #3
On 9/4/23 20:19, Tsukasa OI wrote:

> 
> -FAIL: 30_threads/async/async.cc execution test
> +FAIL: gcc.c-torture/execute/pr60003.c   -O1  execution test
> +FAIL: gcc.dg/setjmp-3.c execution test
> +FAIL: gcc.dg/torture/stackalign/setjmp-3.c   -O1  execution test
> +FAIL: gcc.dg/torture/stackalign/setjmp-3.c   -O1 -fpic execution test
FWIW, I've got async.cc marked here as flakey when run under QEMU. 
That's also consistent with what I found at a prior employer when 
working on a private GCC port.



Jeff
diff mbox series

Patch

diff --git a/gcc/config/riscv/zicond.md b/gcc/config/riscv/zicond.md
index 4619220ef8ac..1721e1011ea8 100644
--- a/gcc/config/riscv/zicond.md
+++ b/gcc/config/riscv/zicond.md
@@ -60,7 +60,7 @@ 
                           (match_operand:GPR 2 "register_operand" "r")
                           (match_operand:GPR 3 "register_operand" "1")))]
   "TARGET_ZICOND && rtx_equal_p (operands[1], operands[3])"
-  "czero.nez\t%0,%2,%1"
+  "czero.eqz\t%0,%2,%1"
 )
 
 ;; Combine creates this form in some cases (particularly the coremark