Message ID | 20230721175552.2693295-1-vineetg@rivosinc.com |
---|---|
State | New |
Headers | show |
Series | RISC-V: optim const DF +0.0 store to mem [PR/110748] | expand |
On Fri, 21 Jul 2023 at 19:56, Vineet Gupta <vineetg@rivosinc.com> wrote: > > DF +0.0 is bitwise all zeros so int x0 store to mem can be used to optimize it. > > void zd(double *) { *d = 0.0; } > > currently: > > | fmv.d.x fa5,zero > | fsd fa5,0(a0) > | ret > > With patch > > | sd zero,0(a0) > | ret > This came to light when testing the in-flight f-m-o patch where an ICE > was gettinh triggered due to lack of this pattern but turns out this typo: "gettinh" -> "getting" > is an independent optimization of its own [1] > > [1] https://gcc.gnu.org/pipermail/gcc-patches/2023-July/624857.html > > Apparently this is a regression in gcc-13, introduced by commit > ef85d150b5963 ("RISC-V: Enable TARGET_SUPPORTS_WIDE_INT") and the fix > thus is a partial revert of that change. Should we add a "Fixes: "? > Ran thru full multilib testsuite, there was 1 false failure due to > random string "lw" appearing in lto build assembler output, > which is also fixed in the patch. > > gcc/Changelog: PR target/110748 > > * config/riscv/predicates.md (const_0_operand): Add back > const_double. > > gcc/testsuite/ChangeLog: > > * gcc.target/riscv/pr110748-1.c: New Test. > * gcc.target/riscv/xtheadfmv-fmv.c: Add '\t' around test > patterns to avoid random string matches. > > Signed-off-by: Vineet Gupta <vineetg@rivosinc.com> > --- > gcc/config/riscv/predicates.md | 2 +- > gcc/testsuite/gcc.target/riscv/pr110748-1.c | 10 ++++++++++ > gcc/testsuite/gcc.target/riscv/xtheadfmv-fmv.c | 8 ++++---- > 3 files changed, 15 insertions(+), 5 deletions(-) > create mode 100644 gcc/testsuite/gcc.target/riscv/pr110748-1.c > > diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md > index 5a22c77f0cd0..9db28c2def7e 100644 > --- a/gcc/config/riscv/predicates.md > +++ b/gcc/config/riscv/predicates.md > @@ -58,7 +58,7 @@ > (match_test "INTVAL (op) + 1 != 0"))) > > (define_predicate "const_0_operand" > - (and (match_code "const_int,const_wide_int,const_vector") > + (and (match_code "const_int,const_wide_int,const_double,const_vector") > (match_test "op == CONST0_RTX (GET_MODE (op))"))) > > (define_predicate "const_1_operand" > diff --git a/gcc/testsuite/gcc.target/riscv/pr110748-1.c b/gcc/testsuite/gcc.target/riscv/pr110748-1.c > new file mode 100644 > index 000000000000..2f5bc08aae72 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/riscv/pr110748-1.c > @@ -0,0 +1,10 @@ > +/* { dg-do compile } */ > +/* { dg-require-effective-target hard_float } */ > +/* { dg-options "-march=rv64g -mabi=lp64d -O2" } */ > + > + > +void zd(double *d) { *d = 0.0; } > +void zf(float *f) { *f = 0.0; } > + > +/* { dg-final { scan-assembler-not "\tfmv\\.d\\.x\t" } } */ > +/* { dg-final { scan-assembler-not "\tfmv\\.s\\.x\t" } } */ > diff --git a/gcc/testsuite/gcc.target/riscv/xtheadfmv-fmv.c b/gcc/testsuite/gcc.target/riscv/xtheadfmv-fmv.c > index 1036044291e7..89eb48bed1b9 100644 > --- a/gcc/testsuite/gcc.target/riscv/xtheadfmv-fmv.c > +++ b/gcc/testsuite/gcc.target/riscv/xtheadfmv-fmv.c > @@ -18,7 +18,7 @@ d2ll (double d) > /* { dg-final { scan-assembler "th.fmv.hw.x" } } */ > /* { dg-final { scan-assembler "fmv.x.w" } } */ > /* { dg-final { scan-assembler "th.fmv.x.hw" } } */ > -/* { dg-final { scan-assembler-not "sw" } } */ > -/* { dg-final { scan-assembler-not "fld" } } */ > -/* { dg-final { scan-assembler-not "fsd" } } */ > -/* { dg-final { scan-assembler-not "lw" } } */ > +/* { dg-final { scan-assembler-not "\tsw\t" } } */ > +/* { dg-final { scan-assembler-not "\tfld\t" } } */ > +/* { dg-final { scan-assembler-not "\tfsd\t" } } */ > +/* { dg-final { scan-assembler-not "\tlw\t" } } */ > -- > 2.34.1 >
On 7/21/23 11:15, Philipp Tomsich wrote: > On Fri, 21 Jul 2023 at 19:56, Vineet Gupta <vineetg@rivosinc.com> wrote: >> DF +0.0 is bitwise all zeros so int x0 store to mem can be used to optimize it. >> >> void zd(double *) { *d = 0.0; } >> >> currently: >> >> | fmv.d.x fa5,zero >> | fsd fa5,0(a0) >> | ret >> >> With patch >> >> | sd zero,0(a0) >> | ret >> This came to light when testing the in-flight f-m-o patch where an ICE >> was gettinh triggered due to lack of this pattern but turns out this > typo: "gettinh" -> "getting" Fixed. >> is an independent optimization of its own [1] >> >> [1] https://gcc.gnu.org/pipermail/gcc-patches/2023-July/624857.html >> >> Apparently this is a regression in gcc-13, introduced by commit >> ef85d150b5963 ("RISC-V: Enable TARGET_SUPPORTS_WIDE_INT") and the fix >> thus is a partial revert of that change. > Should we add a "Fixes: "? Sure. Although gcc usage of Fixes tag seems slightly different than say linux kernel's. > >> Ran thru full multilib testsuite, there was 1 false failure due to >> random string "lw" appearing in lto build assembler output, >> which is also fixed in the patch. >> >> gcc/Changelog: > PR target/110748 Added. Thx, -Vineet > >> * config/riscv/predicates.md (const_0_operand): Add back >> const_double. >> >> gcc/testsuite/ChangeLog: >> >> * gcc.target/riscv/pr110748-1.c: New Test. >> * gcc.target/riscv/xtheadfmv-fmv.c: Add '\t' around test >> patterns to avoid random string matches. >> >> Signed-off-by: Vineet Gupta <vineetg@rivosinc.com> >> --- >> gcc/config/riscv/predicates.md | 2 +- >> gcc/testsuite/gcc.target/riscv/pr110748-1.c | 10 ++++++++++ >> gcc/testsuite/gcc.target/riscv/xtheadfmv-fmv.c | 8 ++++---- >> 3 files changed, 15 insertions(+), 5 deletions(-) >> create mode 100644 gcc/testsuite/gcc.target/riscv/pr110748-1.c >> >> diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md >> index 5a22c77f0cd0..9db28c2def7e 100644 >> --- a/gcc/config/riscv/predicates.md >> +++ b/gcc/config/riscv/predicates.md >> @@ -58,7 +58,7 @@ >> (match_test "INTVAL (op) + 1 != 0"))) >> >> (define_predicate "const_0_operand" >> - (and (match_code "const_int,const_wide_int,const_vector") >> + (and (match_code "const_int,const_wide_int,const_double,const_vector") >> (match_test "op == CONST0_RTX (GET_MODE (op))"))) >> >> (define_predicate "const_1_operand" >> diff --git a/gcc/testsuite/gcc.target/riscv/pr110748-1.c b/gcc/testsuite/gcc.target/riscv/pr110748-1.c >> new file mode 100644 >> index 000000000000..2f5bc08aae72 >> --- /dev/null >> +++ b/gcc/testsuite/gcc.target/riscv/pr110748-1.c >> @@ -0,0 +1,10 @@ >> +/* { dg-do compile } */ >> +/* { dg-require-effective-target hard_float } */ >> +/* { dg-options "-march=rv64g -mabi=lp64d -O2" } */ >> + >> + >> +void zd(double *d) { *d = 0.0; } >> +void zf(float *f) { *f = 0.0; } >> + >> +/* { dg-final { scan-assembler-not "\tfmv\\.d\\.x\t" } } */ >> +/* { dg-final { scan-assembler-not "\tfmv\\.s\\.x\t" } } */ >> diff --git a/gcc/testsuite/gcc.target/riscv/xtheadfmv-fmv.c b/gcc/testsuite/gcc.target/riscv/xtheadfmv-fmv.c >> index 1036044291e7..89eb48bed1b9 100644 >> --- a/gcc/testsuite/gcc.target/riscv/xtheadfmv-fmv.c >> +++ b/gcc/testsuite/gcc.target/riscv/xtheadfmv-fmv.c >> @@ -18,7 +18,7 @@ d2ll (double d) >> /* { dg-final { scan-assembler "th.fmv.hw.x" } } */ >> /* { dg-final { scan-assembler "fmv.x.w" } } */ >> /* { dg-final { scan-assembler "th.fmv.x.hw" } } */ >> -/* { dg-final { scan-assembler-not "sw" } } */ >> -/* { dg-final { scan-assembler-not "fld" } } */ >> -/* { dg-final { scan-assembler-not "fsd" } } */ >> -/* { dg-final { scan-assembler-not "lw" } } */ >> +/* { dg-final { scan-assembler-not "\tsw\t" } } */ >> +/* { dg-final { scan-assembler-not "\tfld\t" } } */ >> +/* { dg-final { scan-assembler-not "\tfsd\t" } } */ >> +/* { dg-final { scan-assembler-not "\tlw\t" } } */ >> -- >> 2.34.1 >>
On Fri, 21 Jul 2023 10:55:52 PDT (-0700), Vineet Gupta wrote: > DF +0.0 is bitwise all zeros so int x0 store to mem can be used to optimize it. > > void zd(double *) { *d = 0.0; } > > currently: > > | fmv.d.x fa5,zero > | fsd fa5,0(a0) > | ret > > With patch > > | sd zero,0(a0) > | ret > > This came to light when testing the in-flight f-m-o patch where an ICE > was gettinh triggered due to lack of this pattern but turns out this > is an independent optimization of its own [1] > > [1] https://gcc.gnu.org/pipermail/gcc-patches/2023-July/624857.html > > Apparently this is a regression in gcc-13, introduced by commit > ef85d150b5963 ("RISC-V: Enable TARGET_SUPPORTS_WIDE_INT") and the fix > thus is a partial revert of that change. Given that it can ICE, we should probably backport it to 13. > Ran thru full multilib testsuite, there was 1 false failure due to Did you run the test with autovec? There's also a pmode_reg_or_0_operand, some of those don't appear protected from FP values. So we might need something like diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md index cd5b19457f8..d8ce9223343 100644 --- a/gcc/config/riscv/autovec.md +++ b/gcc/config/riscv/autovec.md @@ -63,7 +63,7 @@ (define_expand "movmisalign<mode>" (define_expand "len_mask_gather_load<VNX1_QHSD:mode><VNX1_QHSDI:mode>" [(match_operand:VNX1_QHSD 0 "register_operand") - (match_operand 1 "pmode_reg_or_0_operand") + (match_operand:P 1 "pmode_reg_or_0_operand") (match_operand:VNX1_QHSDI 2 "register_operand") (match_operand 3 "<VNX1_QHSD:gs_extension>") (match_operand 4 "<VNX1_QHSD:gs_scale>") a bunch of times, as there's a ton of them? I'm not entirely sure if that could manifest as an actual bug, though... > random string "lw" appearing in lto build assembler output, > which is also fixed in the patch. > > gcc/Changelog: > > * config/riscv/predicates.md (const_0_operand): Add back > const_double. > > gcc/testsuite/ChangeLog: > > * gcc.target/riscv/pr110748-1.c: New Test. > * gcc.target/riscv/xtheadfmv-fmv.c: Add '\t' around test > patterns to avoid random string matches. > > Signed-off-by: Vineet Gupta <vineetg@rivosinc.com> > --- > gcc/config/riscv/predicates.md | 2 +- > gcc/testsuite/gcc.target/riscv/pr110748-1.c | 10 ++++++++++ > gcc/testsuite/gcc.target/riscv/xtheadfmv-fmv.c | 8 ++++---- > 3 files changed, 15 insertions(+), 5 deletions(-) > create mode 100644 gcc/testsuite/gcc.target/riscv/pr110748-1.c > > diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md > index 5a22c77f0cd0..9db28c2def7e 100644 > --- a/gcc/config/riscv/predicates.md > +++ b/gcc/config/riscv/predicates.md > @@ -58,7 +58,7 @@ > (match_test "INTVAL (op) + 1 != 0"))) > > (define_predicate "const_0_operand" > - (and (match_code "const_int,const_wide_int,const_vector") > + (and (match_code "const_int,const_wide_int,const_double,const_vector") > (match_test "op == CONST0_RTX (GET_MODE (op))"))) > > (define_predicate "const_1_operand" > diff --git a/gcc/testsuite/gcc.target/riscv/pr110748-1.c b/gcc/testsuite/gcc.target/riscv/pr110748-1.c > new file mode 100644 > index 000000000000..2f5bc08aae72 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/riscv/pr110748-1.c > @@ -0,0 +1,10 @@ > +/* { dg-do compile } */ > +/* { dg-require-effective-target hard_float } */ > +/* { dg-options "-march=rv64g -mabi=lp64d -O2" } */ > + > + > +void zd(double *d) { *d = 0.0; } > +void zf(float *f) { *f = 0.0; } > + > +/* { dg-final { scan-assembler-not "\tfmv\\.d\\.x\t" } } */ > +/* { dg-final { scan-assembler-not "\tfmv\\.s\\.x\t" } } */ IIUC the pattern to emit fmv suffers from the same bug -- it's fixed in the same way, but I think we might be able to come up with a test for it: `fmv.d.x FREG, x0` would be the fastest way to generate 0.0, so maybe something like double sum(double *d) { double sum = 0; for (int i = 0; i < 8; ++i) sum += d[i]; return sum; } would do it? That's generating the fmv on 13 for me, though, so maybe I'm missing something?` > diff --git a/gcc/testsuite/gcc.target/riscv/xtheadfmv-fmv.c b/gcc/testsuite/gcc.target/riscv/xtheadfmv-fmv.c > index 1036044291e7..89eb48bed1b9 100644 > --- a/gcc/testsuite/gcc.target/riscv/xtheadfmv-fmv.c > +++ b/gcc/testsuite/gcc.target/riscv/xtheadfmv-fmv.c > @@ -18,7 +18,7 @@ d2ll (double d) > /* { dg-final { scan-assembler "th.fmv.hw.x" } } */ > /* { dg-final { scan-assembler "fmv.x.w" } } */ > /* { dg-final { scan-assembler "th.fmv.x.hw" } } */ > -/* { dg-final { scan-assembler-not "sw" } } */ > -/* { dg-final { scan-assembler-not "fld" } } */ > -/* { dg-final { scan-assembler-not "fsd" } } */ > -/* { dg-final { scan-assembler-not "lw" } } */ > +/* { dg-final { scan-assembler-not "\tsw\t" } } */ > +/* { dg-final { scan-assembler-not "\tfld\t" } } */ > +/* { dg-final { scan-assembler-not "\tfsd\t" } } */ > +/* { dg-final { scan-assembler-not "\tlw\t" } } */ I think that autovec one is the only possible dependency that might have snuck in, so we should be safe otherwise. Thanks! Reviewed-by: Palmer Dabbelt <palmer@rivosinc.com>
On 7/21/23 12:31, Palmer Dabbelt wrote: > > (define_expand "len_mask_gather_load<VNX1_QHSD:mode><VNX1_QHSDI:mode>" > [(match_operand:VNX1_QHSD 0 "register_operand") > - (match_operand 1 "pmode_reg_or_0_operand") > + (match_operand:P 1 "pmode_reg_or_0_operand") > (match_operand:VNX1_QHSDI 2 "register_operand") > (match_operand 3 "<VNX1_QHSD:gs_extension>") > (match_operand 4 "<VNX1_QHSD:gs_scale>") > > a bunch of times, as there's a ton of them? I'm not entirely sure if that > could manifest as an actual bug, though... But won't this cause (const_int 0) to no longer match because CONST_INT nodes are modeless (VOIDmode)? Jeff
On 7/21/23 11:31, Palmer Dabbelt wrote: > On Fri, 21 Jul 2023 10:55:52 PDT (-0700), Vineet Gupta wrote: >> DF +0.0 is bitwise all zeros so int x0 store to mem can be used to >> optimize it. >> >> void zd(double *) { *d = 0.0; } >> >> currently: >> >> | fmv.d.x fa5,zero >> | fsd fa5,0(a0) >> | ret >> >> With patch >> >> | sd zero,0(a0) >> | ret >> >> This came to light when testing the in-flight f-m-o patch where an ICE >> was gettinh triggered due to lack of this pattern but turns out this >> is an independent optimization of its own [1] >> >> [1] https://gcc.gnu.org/pipermail/gcc-patches/2023-July/624857.html >> >> Apparently this is a regression in gcc-13, introduced by commit >> ef85d150b5963 ("RISC-V: Enable TARGET_SUPPORTS_WIDE_INT") and the fix >> thus is a partial revert of that change. > > Given that it can ICE, we should probably backport it to 13. FWIW ICE is on an in-flight for-gcc-14 patch, not something in tree already. And this will merge ahead of that. I'm fine with backport though. > >> Ran thru full multilib testsuite, there was 1 false failure due to > > Did you run the test with autovec? I have standard 32/64 mutlilibs, but no 'v' in arch so autovec despite being enabled at -O2 and above will not kick in. I think we should add a 'v' multilib. > There's also a pmode_reg_or_0_operand, some of those don't appear > protected from FP values. So we might need something like > > diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md > index cd5b19457f8..d8ce9223343 100644 > --- a/gcc/config/riscv/autovec.md > +++ b/gcc/config/riscv/autovec.md > @@ -63,7 +63,7 @@ (define_expand "movmisalign<mode>" > > (define_expand "len_mask_gather_load<VNX1_QHSD:mode><VNX1_QHSDI:mode>" > [(match_operand:VNX1_QHSD 0 "register_operand") > - (match_operand 1 "pmode_reg_or_0_operand") > + (match_operand:P 1 "pmode_reg_or_0_operand") > (match_operand:VNX1_QHSDI 2 "register_operand") > (match_operand 3 "<VNX1_QHSD:gs_extension>") > (match_operand 4 "<VNX1_QHSD:gs_scale>") > > a bunch of times, as there's a ton of them? I'm not entirely sure if > that > could manifest as an actual bug, though... What does 'P' do here ? >> + >> +void zd(double *d) { *d = 0.0; } >> +void zf(float *f) { *f = 0.0; } >> + >> +/* { dg-final { scan-assembler-not "\tfmv\\.d\\.x\t" } } */ >> +/* { dg-final { scan-assembler-not "\tfmv\\.s\\.x\t" } } */ > > IIUC the pattern to emit fmv suffers from the same bug -- it's fixed > in the same > way, but I think we might be able to come up with a test for it: > `fmv.d.x FREG, > x0` would be the fastest way to generate 0.0, so maybe something like > > double sum(double *d) { > double sum = 0; > for (int i = 0; i < 8; ++i) > sum += d[i]; > return sum; > } > > would do it? That's generating the fmv on 13 for me, though, so maybe > I'm > missing something?` I need to unpack this first :-) > >> diff --git a/gcc/testsuite/gcc.target/riscv/xtheadfmv-fmv.c >> b/gcc/testsuite/gcc.target/riscv/xtheadfmv-fmv.c >> index 1036044291e7..89eb48bed1b9 100644 >> --- a/gcc/testsuite/gcc.target/riscv/xtheadfmv-fmv.c >> +++ b/gcc/testsuite/gcc.target/riscv/xtheadfmv-fmv.c >> @@ -18,7 +18,7 @@ d2ll (double d) >> /* { dg-final { scan-assembler "th.fmv.hw.x" } } */ >> /* { dg-final { scan-assembler "fmv.x.w" } } */ >> /* { dg-final { scan-assembler "th.fmv.x.hw" } } */ >> -/* { dg-final { scan-assembler-not "sw" } } */ >> -/* { dg-final { scan-assembler-not "fld" } } */ >> -/* { dg-final { scan-assembler-not "fsd" } } */ >> -/* { dg-final { scan-assembler-not "lw" } } */ >> +/* { dg-final { scan-assembler-not "\tsw\t" } } */ >> +/* { dg-final { scan-assembler-not "\tfld\t" } } */ >> +/* { dg-final { scan-assembler-not "\tfsd\t" } } */ >> +/* { dg-final { scan-assembler-not "\tlw\t" } } */ > > I think that autovec one is the only possible dependency that might > have snuck > in, so we should be safe otherwise. Thanks! I'm not sure if this specific comment is related to the xthead test or continuation of above. For xthead it is real issue since I saw a random "lw" in lto assembler output.
On 7/21/23 11:31, Palmer Dabbelt wrote: > > IIUC the pattern to emit fmv suffers from the same bug -- it's fixed > in the same > way, but I think we might be able to come up with a test for it: > `fmv.d.x FREG, > x0` would be the fastest way to generate 0.0, so maybe something like > > double sum(double *d) { > double sum = 0; > for (int i = 0; i < 8; ++i) > sum += d[i]; > return sum; > } > > would do it? That's generating the fmv on 13 for me, though, so maybe > I'm > missing something?` I don't think we can avoid FMV in this case fmv.d.x fa0,zero #1 addi a5,a0,64 .L2: fld fa5,0(a0) addi a0,a0,8 fadd.d fa0,fa0,fa5 #2 bne a0,a5,.L2 ret In #1, the zero needs to be setup in FP reg (possible using FMV), since in #2 it will be used for FP math. If we change ur test slightly, double zadd(double *d) { double sum = 0.0; for (int i = 0; i < 8; ++i) d[i] = sum; return sum; } We still get the optimal code for writing to FP 0. The last FMV is unavoidable as we need an FP return reg. addi a5,a0,64 .L2: sd zero,0(a0) addi a0,a0,8 bne a0,a5,.L2 fmv.d.x fa0,zero ret
On 7/21/23 12:55, Vineet Gupta wrote: >>> >>> Apparently this is a regression in gcc-13, introduced by commit >>> ef85d150b5963 ("RISC-V: Enable TARGET_SUPPORTS_WIDE_INT") and the fix >>> thus is a partial revert of that change. >> >> Given that it can ICE, we should probably backport it to 13. > > FWIW ICE is on an in-flight for-gcc-14 patch, not something in tree > already. And this will merge ahead of that. > I'm fine with backport though. It's latent on the gcc-13 branch from an ICE standpoint and would probably stay that way -- triggering would require something to re-recognize the pattern after register allocation. I won't object to it going into gcc-13. No strong opinions either way, happy to go with consensus opinion. >> There's also a pmode_reg_or_0_operand, some of those don't appear >> protected from FP values. So we might need something like >> >> diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md >> index cd5b19457f8..d8ce9223343 100644 >> --- a/gcc/config/riscv/autovec.md >> +++ b/gcc/config/riscv/autovec.md >> @@ -63,7 +63,7 @@ (define_expand "movmisalign<mode>" >> >> (define_expand "len_mask_gather_load<VNX1_QHSD:mode><VNX1_QHSDI:mode>" >> [(match_operand:VNX1_QHSD 0 "register_operand") >> - (match_operand 1 "pmode_reg_or_0_operand") >> + (match_operand:P 1 "pmode_reg_or_0_operand") >> (match_operand:VNX1_QHSDI 2 "register_operand") >> (match_operand 3 "<VNX1_QHSD:gs_extension>") >> (match_operand 4 "<VNX1_QHSD:gs_scale>") >> >> a bunch of times, as there's a ton of them? I'm not entirely sure if >> that >> could manifest as an actual bug, though... > > What does 'P' do here ? It will force the operand to match the P mode iterator, which should be DImode for rv64 and SImode for rv32. I'm not sure this is wise as I think it'll end up rejecting (const_int 0) because CONST_INT nodes do not have a mode. Jeff
On Fri, 21 Jul 2023 11:47:58 PDT (-0700), gcc-patches@gcc.gnu.org wrote: > On 7/21/23 12:31, Palmer Dabbelt wrote: >> (define_expand "len_mask_gather_load<VNX1_QHSD:mode><VNX1_QHSDI:mode>" >> [(match_operand:VNX1_QHSD 0 "register_operand") >> - (match_operand 1 "pmode_reg_or_0_operand") >> + (match_operand:P 1 "pmode_reg_or_0_operand") >> (match_operand:VNX1_QHSDI 2 "register_operand") >> (match_operand 3 "<VNX1_QHSD:gs_extension>") >> (match_operand 4 "<VNX1_QHSD:gs_scale>") >> >> a bunch of times, as there's a ton of them? I'm not entirely sure if that >> could manifest as an actual bug, though... > But won't this cause (const_int 0) to no longer match because CONST_INT > nodes are modeless (VOIDmode)? I poked around a bit and I'm not actually sure, I'm kind of lost on the docs here. IIUC we're eliding the VOIDmode in the predicate correctly (define_predicate "const_0_operand" (and (match_code "const_int,const_wide_int,const_vector") (match_test "op == CONST0_RTX (GET_MODE (op))"))) so we're OK there, otherwise we'd presumably have similar problems with expanders like (define_expand "subsi3" [(set (match_operand:SI 0 "register_operand" "= r") (minus:SI (match_operand:SI 1 "reg_or_0_operand" " rJ") (match_operand:SI 2 "register_operand" " r")))] "" which we have a few of -- though it'd be kind of a silent failure, as presumably we'd just end up with some more move-x0s emitted?
On 7/25/23 17:05, Palmer Dabbelt wrote: > On Fri, 21 Jul 2023 11:47:58 PDT (-0700), gcc-patches@gcc.gnu.org wrote: >> On 7/21/23 12:31, Palmer Dabbelt wrote: >>> (define_expand "len_mask_gather_load<VNX1_QHSD:mode><VNX1_QHSDI:mode>" >>> [(match_operand:VNX1_QHSD 0 "register_operand") >>> - (match_operand 1 "pmode_reg_or_0_operand") >>> + (match_operand:P 1 "pmode_reg_or_0_operand") >>> (match_operand:VNX1_QHSDI 2 "register_operand") >>> (match_operand 3 "<VNX1_QHSD:gs_extension>") >>> (match_operand 4 "<VNX1_QHSD:gs_scale>") >>> >>> a bunch of times, as there's a ton of them? I'm not entirely sure if >>> that >>> could manifest as an actual bug, though... >> But won't this cause (const_int 0) to no longer match because CONST_INT >> nodes are modeless (VOIDmode)? > > I poked around a bit and I'm not actually sure, I'm kind of lost on the > docs > here. IIUC we're eliding the VOIDmode in the predicate correctly > > (define_predicate "const_0_operand" > (and (match_code "const_int,const_wide_int,const_vector") > (match_test "op == CONST0_RTX (GET_MODE (op))"))) > > so we're OK there, otherwise we'd presumably have similar problems with > expanders like > > (define_expand "subsi3" > [(set (match_operand:SI 0 "register_operand" "= r") > (minus:SI (match_operand:SI 1 "reg_or_0_operand" " rJ") > (match_operand:SI 2 "register_operand" " r")))] > "" > > which we have a few of -- though it'd be kind of a silent failure, as > presumably we'd just end up with some more move-x0s emitted? It's a bit messy to say the least. However, we can look at other ports and after doing so I'm less sure my concern is valid. Take the typical movXX pattern or expander. Both operands have a mode, so things like CONST_INT must be passing through, even though they're VOIDmode. So it's probably a non-issue. jeff
diff --git a/gcc/config/riscv/predicates.md b/gcc/config/riscv/predicates.md index 5a22c77f0cd0..9db28c2def7e 100644 --- a/gcc/config/riscv/predicates.md +++ b/gcc/config/riscv/predicates.md @@ -58,7 +58,7 @@ (match_test "INTVAL (op) + 1 != 0"))) (define_predicate "const_0_operand" - (and (match_code "const_int,const_wide_int,const_vector") + (and (match_code "const_int,const_wide_int,const_double,const_vector") (match_test "op == CONST0_RTX (GET_MODE (op))"))) (define_predicate "const_1_operand" diff --git a/gcc/testsuite/gcc.target/riscv/pr110748-1.c b/gcc/testsuite/gcc.target/riscv/pr110748-1.c new file mode 100644 index 000000000000..2f5bc08aae72 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/pr110748-1.c @@ -0,0 +1,10 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target hard_float } */ +/* { dg-options "-march=rv64g -mabi=lp64d -O2" } */ + + +void zd(double *d) { *d = 0.0; } +void zf(float *f) { *f = 0.0; } + +/* { dg-final { scan-assembler-not "\tfmv\\.d\\.x\t" } } */ +/* { dg-final { scan-assembler-not "\tfmv\\.s\\.x\t" } } */ diff --git a/gcc/testsuite/gcc.target/riscv/xtheadfmv-fmv.c b/gcc/testsuite/gcc.target/riscv/xtheadfmv-fmv.c index 1036044291e7..89eb48bed1b9 100644 --- a/gcc/testsuite/gcc.target/riscv/xtheadfmv-fmv.c +++ b/gcc/testsuite/gcc.target/riscv/xtheadfmv-fmv.c @@ -18,7 +18,7 @@ d2ll (double d) /* { dg-final { scan-assembler "th.fmv.hw.x" } } */ /* { dg-final { scan-assembler "fmv.x.w" } } */ /* { dg-final { scan-assembler "th.fmv.x.hw" } } */ -/* { dg-final { scan-assembler-not "sw" } } */ -/* { dg-final { scan-assembler-not "fld" } } */ -/* { dg-final { scan-assembler-not "fsd" } } */ -/* { dg-final { scan-assembler-not "lw" } } */ +/* { dg-final { scan-assembler-not "\tsw\t" } } */ +/* { dg-final { scan-assembler-not "\tfld\t" } } */ +/* { dg-final { scan-assembler-not "\tfsd\t" } } */ +/* { dg-final { scan-assembler-not "\tlw\t" } } */
DF +0.0 is bitwise all zeros so int x0 store to mem can be used to optimize it. void zd(double *) { *d = 0.0; } currently: | fmv.d.x fa5,zero | fsd fa5,0(a0) | ret With patch | sd zero,0(a0) | ret This came to light when testing the in-flight f-m-o patch where an ICE was gettinh triggered due to lack of this pattern but turns out this is an independent optimization of its own [1] [1] https://gcc.gnu.org/pipermail/gcc-patches/2023-July/624857.html Apparently this is a regression in gcc-13, introduced by commit ef85d150b5963 ("RISC-V: Enable TARGET_SUPPORTS_WIDE_INT") and the fix thus is a partial revert of that change. Ran thru full multilib testsuite, there was 1 false failure due to random string "lw" appearing in lto build assembler output, which is also fixed in the patch. gcc/Changelog: * config/riscv/predicates.md (const_0_operand): Add back const_double. gcc/testsuite/ChangeLog: * gcc.target/riscv/pr110748-1.c: New Test. * gcc.target/riscv/xtheadfmv-fmv.c: Add '\t' around test patterns to avoid random string matches. Signed-off-by: Vineet Gupta <vineetg@rivosinc.com> --- gcc/config/riscv/predicates.md | 2 +- gcc/testsuite/gcc.target/riscv/pr110748-1.c | 10 ++++++++++ gcc/testsuite/gcc.target/riscv/xtheadfmv-fmv.c | 8 ++++---- 3 files changed, 15 insertions(+), 5 deletions(-) create mode 100644 gcc/testsuite/gcc.target/riscv/pr110748-1.c