Message ID | 20230901195311.761131-1-vineetg@rivosinc.com |
---|---|
State | New |
Headers | show |
Series | [v2] RISC-V: zicond: Fix opt2 pattern | expand |
On 9/1/23 13:53, Vineet Gupta wrote: > This was tripping up gcc.c-torture/execute/pr60003.c at -O1 since in > failing case, pattern's asm czero.nez gets both rs2 and rs1 as non zero. > > We start with the following src code snippet: > > if (a == 0) > return 0; > else > return x; > } > > which is equivalent to: "x = (a != 0) ? x : a" where x is NOT 0. > ^^^^^^^^^^^^^^^^ > > and matches define_insn "*czero.nez.<GPR:mode><X:mode>.opt2" > > | (insn 41 20 38 3 (set (reg/v:DI 136 [ x ]) > | (if_then_else:DI (ne (reg/v:DI 134 [ a ]) > | (const_int 0 [0])) > | (reg/v:DI 136 [ x ]) > | (reg/v:DI 134 [ a ]))) {*czero.nez.didi.opt2} > > The corresponding asm pattern generates > czero.nez x, x, a ; %0, %2, %1 > > which implies > "x = (a != 0) ? 0 : a" > > clearly not what the pattern wants to do. > > Essentially "(a != 0) ? x : a" cannot be expressed with CZERO.nez if X > is not guaranteed to be 0. > > However this can be fixed with a small tweak > > "x = (a != 0) ? x : a" > > is same as > > "x = (a == 0) ? a : x" since middle operand is 0 when a == 0. > > which can be expressed with CZERO.eqz > > before fix after fix > ----------------- ----------------- > li a5,1 li a5,1 > ld a4,8(sp) ld a4,8(sp) # a4 is runtime non zero > czero.nez a0,a4,a5 # a0=0 NOK czero.eqz a0,a4,a5 # a0=a4!=0 OK > > The issue only happens at -O1 as at higher optimization levels, the > whole conditional move gets optimized away. > > This fixes 4 testsuite failues in a zicond build: > > FAIL: gcc.c-torture/execute/pr60003.c -O1 execution test > FAIL: gcc.dg/setjmp-3.c execution test > FAIL: gcc.dg/torture/stackalign/setjmp-3.c -O1 execution test > FAIL: gcc.dg/torture/stackalign/setjmp-3.c -O1 -fpic execution test > > gcc/ChangeLog: > * config/riscv/zicond.md: Fix op2 pattern. > > Fixes: 1d5bc3285e8a ("[committed][RISC-V] Fix 20010221-1.c with zicond") > Signed-off-by: Vineet Gupta <vineetg@rivosinc.com> OK. Still not sure why I didn't trip over it in my own testing (execute.exp runs pr60003.c), but regardless, glad to have it fixed. jeff
Sorry, I want to directly reply to Jeff but I couldn't because I haven't subscribed to gcc-patches and Jeff's recent reply hasn't archived yet. Bug confirmed for me. I tried the full test with following configuration (I found another bug [ICE] as I submitted a quick fix while testing this and requires following patch set to be applied; will make a PATCH v2 though): <https://gcc.gnu.org/pipermail/gcc-patches/2023-September/629175.html> Possibly, ICE, simulator configuration and/or dirty build tree might be the reason Jeff couldn't reproduce the bug. # ZiCond enabled # Remove "_zicond" to disable ZiCond. # ${SYSROOT} points to the prebuilt sysroot with # glibc + libgcc with -march=rv64imafdc -mabi=lp64d ${GCC_SRCDIR}/configure \ --target=riscv64-unknown-linux-gnu \ --prefix=${PREFIX} \ --with-sysroot=${SYSROOT} \ --with-system-zlib \ --disable-shared \ --enable-tls \ --enable-languages=c,c++ \ --disable-libmudflap \ --disable-libssp \ --disable-libquadmath \ --disable-libsanitizer \ --disable-nls \ --disable-bootstrap \ --disable-multilib \ --with-tune=rocket \ --with-arch=rv64imafdc_zicond \ --with-abi=lp64d \ --with-isa-spec=20191213 Then I ran "make; make check RUNTESTFLAGS='--target_board=riscv-sim'". Note that I configured DejaGnu (riscv-sim.exp) to execute tests with: "qemu-riscv64 -L ${SYSROOT} -cpu rv64,g=on,x-zicond=on" (QEMU 8.1.0 Linux user emulation). Warning: abort() on QEMU with Linux user emulation causes QEMU to abort, too (possibly making many coredumps). The diff of test failures are as follows. -: Occurs only when ZiCond is disabled +: Occurs only when ZiCond is enabled -FAIL: 30_threads/async/async.cc execution test +FAIL: gcc.c-torture/execute/pr60003.c -O1 execution test +FAIL: gcc.dg/setjmp-3.c execution test +FAIL: gcc.dg/torture/stackalign/setjmp-3.c -O1 execution test +FAIL: gcc.dg/torture/stackalign/setjmp-3.c -O1 -fpic execution test I'm not sure why 30_threads/async/async.cc succeeds after enabling the 'Zicond' extension but I am sure that setjmp-3.c failures are caused by this very bug. Smaller example (not involving setjmp / longjmp) to reproduce this bug in my environment is as follows (you *don't* have to apply my patch above, make all-gcc && make install-gcc overwriting existing RV64 GCC prefix will work): > #include <stdio.h> > > __attribute__((noinline, noclone)) > void sample(long* a) > { > *a = 1; > } > > __attribute__((noinline, noclone)) > long foo(long x) > { > long a = 0; > sample(&a); // a is overwritten to 1. > if (a == 0) > return 0; > else > return x; // should be always taken > } > > int main(int argc, char** argv) > { > printf("%ld\n", foo(5)); // should print 5 > return 0; > } Note that we have to make sure that variables are not easily inferred by another optimization pass (that's why I needed two functions). > riscv64-unknown-linux-gnu-gcc -march=rv64gc_zicond -O1 -static a.c > qemu-riscv64 -cpu rv64,g=on,x-zicond=on ./a.out printed 0, not 5 as I expected. I support Vineet's patch set (v2). Thanks, Tsukasa
On 9/4/23 20:19, Tsukasa OI wrote: > > -FAIL: 30_threads/async/async.cc execution test > +FAIL: gcc.c-torture/execute/pr60003.c -O1 execution test > +FAIL: gcc.dg/setjmp-3.c execution test > +FAIL: gcc.dg/torture/stackalign/setjmp-3.c -O1 execution test > +FAIL: gcc.dg/torture/stackalign/setjmp-3.c -O1 -fpic execution test FWIW, I've got async.cc marked here as flakey when run under QEMU. That's also consistent with what I found at a prior employer when working on a private GCC port. Jeff
diff --git a/gcc/config/riscv/zicond.md b/gcc/config/riscv/zicond.md index 4619220ef8ac..1721e1011ea8 100644 --- a/gcc/config/riscv/zicond.md +++ b/gcc/config/riscv/zicond.md @@ -60,7 +60,7 @@ (match_operand:GPR 2 "register_operand" "r") (match_operand:GPR 3 "register_operand" "1")))] "TARGET_ZICOND && rtx_equal_p (operands[1], operands[3])" - "czero.nez\t%0,%2,%1" + "czero.eqz\t%0,%2,%1" ) ;; Combine creates this form in some cases (particularly the coremark
This was tripping up gcc.c-torture/execute/pr60003.c at -O1 since in failing case, pattern's asm czero.nez gets both rs2 and rs1 as non zero. We start with the following src code snippet: if (a == 0) return 0; else return x; } which is equivalent to: "x = (a != 0) ? x : a" where x is NOT 0. ^^^^^^^^^^^^^^^^ and matches define_insn "*czero.nez.<GPR:mode><X:mode>.opt2" | (insn 41 20 38 3 (set (reg/v:DI 136 [ x ]) | (if_then_else:DI (ne (reg/v:DI 134 [ a ]) | (const_int 0 [0])) | (reg/v:DI 136 [ x ]) | (reg/v:DI 134 [ a ]))) {*czero.nez.didi.opt2} The corresponding asm pattern generates czero.nez x, x, a ; %0, %2, %1 which implies "x = (a != 0) ? 0 : a" clearly not what the pattern wants to do. Essentially "(a != 0) ? x : a" cannot be expressed with CZERO.nez if X is not guaranteed to be 0. However this can be fixed with a small tweak "x = (a != 0) ? x : a" is same as "x = (a == 0) ? a : x" since middle operand is 0 when a == 0. which can be expressed with CZERO.eqz before fix after fix ----------------- ----------------- li a5,1 li a5,1 ld a4,8(sp) ld a4,8(sp) # a4 is runtime non zero czero.nez a0,a4,a5 # a0=0 NOK czero.eqz a0,a4,a5 # a0=a4!=0 OK The issue only happens at -O1 as at higher optimization levels, the whole conditional move gets optimized away. This fixes 4 testsuite failues in a zicond build: FAIL: gcc.c-torture/execute/pr60003.c -O1 execution test FAIL: gcc.dg/setjmp-3.c execution test FAIL: gcc.dg/torture/stackalign/setjmp-3.c -O1 execution test FAIL: gcc.dg/torture/stackalign/setjmp-3.c -O1 -fpic execution test gcc/ChangeLog: * config/riscv/zicond.md: Fix op2 pattern. Fixes: 1d5bc3285e8a ("[committed][RISC-V] Fix 20010221-1.c with zicond") Signed-off-by: Vineet Gupta <vineetg@rivosinc.com> --- changes since v1 - instead of discarding opt2 pattern, fix the asm --- gcc/config/riscv/zicond.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)