Message ID | 20230630233315.212700-1-vineetg@rivosinc.com |
---|---|
State | New |
Headers | show |
Series | RISC-V: improve codegen for repeating large constants [3] | expand |
On 6/30/23 16:33, Vineet Gupta wrote: > Ran into a minor snafu in const splitting code when playing with test > case from an old PR/23813. > > long long f(void) { return 0xF0F0F0F0F0F0F0F0ull; } > > This currently generates > > li a5,-252645376 > addi a5,a5,241 > li a0,-252645376 > slli a5,a5,32 > addi a0,a0,240 > add a0,a5,a0 > ret > > The signed math in hival extraction introduces an additional bit, > causing loval == hival check to fail. > > | riscv_split_integer (val=-1085102592571150096, mode=E_DImode) at ../gcc/config/riscv/riscv.cc:702 > | 702 unsigned HOST_WIDE_INT loval = sext_hwi (val, 32); > | (gdb)n > | 703 unsigned HOST_WIDE_INT hival = sext_hwi ((val - loval) >> 32, 32); > | (gdb) FWIW (and I missed adding this observation to the changelog) I pondered about using unsigned loval/hival with zext_hwi() but that in certain cases can cause additional insns e.g. constant 0x8000_0000 is codegen to LI 1 +SLLI 31 vs, LI 0xFFFFFFFF_80000000 > | 704 rtx hi = gen_reg_rtx (mode), lo = gen_reg_rtx (mode); > | (gdb) p/x val > | $2 = 0xf0f0f0f0f0f0f0f0 > | (gdb) p/x loval > | $3 = 0xfffffffff0f0f0f0 > | (gdb) p/x hival > | $4 = 0xfffffffff0f0f0f1 > ^^^ > Fix that by eliding the subtraction in shift. > > With patch: > > li a5,-252645376 > addi a5,a5,240 > slli a0,a5,32 > add a0,a0,a5 > ret > > gcc/ChangeLog: > > * config/riscv/riscv.cc (riscv_split_integer): hival computation > do elide subtraction of loval. > * (riscv_split_integer_cost): Ditto. > * (riscv_build_integer): Ditto > > Signed-off-by: Vineet Gupta <vineetg@rivosinc.com> > --- > I wasn't planning to do any more work on large const stuff, but just ran into it this > on a random BZ entry when trying search for redundant constant stuff. > The test seemed too good to pass :-) > --- > gcc/config/riscv/riscv.cc | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) > > diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc > index 5ac187c1b1b4..377d3aac794b 100644 > --- a/gcc/config/riscv/riscv.cc > +++ b/gcc/config/riscv/riscv.cc > @@ -643,7 +643,7 @@ riscv_build_integer (struct riscv_integer_op *codes, HOST_WIDE_INT value, > && (value > INT32_MAX || value < INT32_MIN)) > { > unsigned HOST_WIDE_INT loval = sext_hwi (value, 32); > - unsigned HOST_WIDE_INT hival = sext_hwi ((value - loval) >> 32, 32); > + unsigned HOST_WIDE_INT hival = sext_hwi (value >> 32, 32); > struct riscv_integer_op alt_codes[RISCV_MAX_INTEGER_OPS]; > struct riscv_integer_op hicode[RISCV_MAX_INTEGER_OPS]; > int hi_cost, lo_cost; > @@ -674,7 +674,7 @@ riscv_split_integer_cost (HOST_WIDE_INT val) > { > int cost; > unsigned HOST_WIDE_INT loval = sext_hwi (val, 32); > - unsigned HOST_WIDE_INT hival = sext_hwi ((val - loval) >> 32, 32); > + unsigned HOST_WIDE_INT hival = sext_hwi (val >> 32, 32); > struct riscv_integer_op codes[RISCV_MAX_INTEGER_OPS]; > > cost = 2 + riscv_build_integer (codes, loval, VOIDmode); > @@ -700,7 +700,7 @@ static rtx > riscv_split_integer (HOST_WIDE_INT val, machine_mode mode) > { > unsigned HOST_WIDE_INT loval = sext_hwi (val, 32); > - unsigned HOST_WIDE_INT hival = sext_hwi ((val - loval) >> 32, 32); > + unsigned HOST_WIDE_INT hival = sext_hwi (val >> 32, 32); > rtx hi = gen_reg_rtx (mode), lo = gen_reg_rtx (mode); > > riscv_move_integer (lo, lo, loval, mode);
I don't believe this is correct; the subtraction is needed to account for the fact that the low part might be negative, resulting in a borrow from the high part. See the output for your test case below: $ cat test.c #include <stdio.h> int main() { unsigned long result, tmp; asm ( "li %1,-252645376\n" "addi %1,%1,240\n" "slli %0,%1,32\n" "add %0,%0,%1" : "=r" (result), "=r" (tmp)); printf("%lx\n", result); return 0; } $ riscv64-unknown-elf-gcc -O2 test.c $ spike pk a.out bbl loader f0f0f0eff0f0f0f0 $ On Fri, Jun 30, 2023 at 4:42 PM Vineet Gupta <vineetg@rivosinc.com> wrote: > > > > On 6/30/23 16:33, Vineet Gupta wrote: > > Ran into a minor snafu in const splitting code when playing with test > > case from an old PR/23813. > > > > long long f(void) { return 0xF0F0F0F0F0F0F0F0ull; } > > > > This currently generates > > > > li a5,-252645376 > > addi a5,a5,241 > > li a0,-252645376 > > slli a5,a5,32 > > addi a0,a0,240 > > add a0,a5,a0 > > ret > > > > The signed math in hival extraction introduces an additional bit, > > causing loval == hival check to fail. > > > > | riscv_split_integer (val=-1085102592571150096, mode=E_DImode) at ../gcc/config/riscv/riscv.cc:702 > > | 702 unsigned HOST_WIDE_INT loval = sext_hwi (val, 32); > > | (gdb)n > > | 703 unsigned HOST_WIDE_INT hival = sext_hwi ((val - loval) >> 32, 32); > > | (gdb) > > FWIW (and I missed adding this observation to the changelog) I pondered > about using unsigned loval/hival with zext_hwi() but that in certain > cases can cause additional insns > > e.g. constant 0x8000_0000 is codegen to LI 1 +SLLI 31 vs, LI > 0xFFFFFFFF_80000000 > > > > | 704 rtx hi = gen_reg_rtx (mode), lo = gen_reg_rtx (mode); > > | (gdb) p/x val > > | $2 = 0xf0f0f0f0f0f0f0f0 > > | (gdb) p/x loval > > | $3 = 0xfffffffff0f0f0f0 > > | (gdb) p/x hival > > | $4 = 0xfffffffff0f0f0f1 > > ^^^ > > Fix that by eliding the subtraction in shift. > > > > With patch: > > > > li a5,-252645376 > > addi a5,a5,240 > > slli a0,a5,32 > > add a0,a0,a5 > > ret > > > > gcc/ChangeLog: > > > > * config/riscv/riscv.cc (riscv_split_integer): hival computation > > do elide subtraction of loval. > > * (riscv_split_integer_cost): Ditto. > > * (riscv_build_integer): Ditto > > > > Signed-off-by: Vineet Gupta <vineetg@rivosinc.com> > > --- > > I wasn't planning to do any more work on large const stuff, but just ran into it this > > on a random BZ entry when trying search for redundant constant stuff. > > The test seemed too good to pass :-) > > --- > > gcc/config/riscv/riscv.cc | 6 +++--- > > 1 file changed, 3 insertions(+), 3 deletions(-) > > > > diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc > > index 5ac187c1b1b4..377d3aac794b 100644 > > --- a/gcc/config/riscv/riscv.cc > > +++ b/gcc/config/riscv/riscv.cc > > @@ -643,7 +643,7 @@ riscv_build_integer (struct riscv_integer_op *codes, HOST_WIDE_INT value, > > && (value > INT32_MAX || value < INT32_MIN)) > > { > > unsigned HOST_WIDE_INT loval = sext_hwi (value, 32); > > - unsigned HOST_WIDE_INT hival = sext_hwi ((value - loval) >> 32, 32); > > + unsigned HOST_WIDE_INT hival = sext_hwi (value >> 32, 32); > > struct riscv_integer_op alt_codes[RISCV_MAX_INTEGER_OPS]; > > struct riscv_integer_op hicode[RISCV_MAX_INTEGER_OPS]; > > int hi_cost, lo_cost; > > @@ -674,7 +674,7 @@ riscv_split_integer_cost (HOST_WIDE_INT val) > > { > > int cost; > > unsigned HOST_WIDE_INT loval = sext_hwi (val, 32); > > - unsigned HOST_WIDE_INT hival = sext_hwi ((val - loval) >> 32, 32); > > + unsigned HOST_WIDE_INT hival = sext_hwi (val >> 32, 32); > > struct riscv_integer_op codes[RISCV_MAX_INTEGER_OPS]; > > > > cost = 2 + riscv_build_integer (codes, loval, VOIDmode); > > @@ -700,7 +700,7 @@ static rtx > > riscv_split_integer (HOST_WIDE_INT val, machine_mode mode) > > { > > unsigned HOST_WIDE_INT loval = sext_hwi (val, 32); > > - unsigned HOST_WIDE_INT hival = sext_hwi ((val - loval) >> 32, 32); > > + unsigned HOST_WIDE_INT hival = sext_hwi (val >> 32, 32); > > rtx hi = gen_reg_rtx (mode), lo = gen_reg_rtx (mode); > > > > riscv_move_integer (lo, lo, loval, mode); >
On 6/30/23 16:50, Andrew Waterman wrote: > I don't believe this is correct; the subtraction is needed to account > for the fact that the low part might be negative, resulting in a > borrow from the high part. See the output for your test case below: > > $ cat test.c > #include <stdio.h> > > int main() > { > unsigned long result, tmp; > > asm ( > "li %1,-252645376\n" > "addi %1,%1,240\n" > "slli %0,%1,32\n" > "add %0,%0,%1" > : "=r" (result), "=r" (tmp)); > > printf("%lx\n", result); > > return 0; > } > $ riscv64-unknown-elf-gcc -O2 test.c > $ spike pk a.out > bbl loader > f0f0f0eff0f0f0f0 > $ Thx for the quick feedback Andew. I'm clearly lacking in signed math :-( So is it possible to have a better code seq for the testcase at all ? -Vineet > > > On Fri, Jun 30, 2023 at 4:42 PM Vineet Gupta <vineetg@rivosinc.com> wrote: >> >> >> On 6/30/23 16:33, Vineet Gupta wrote: >>> Ran into a minor snafu in const splitting code when playing with test >>> case from an old PR/23813. >>> >>> long long f(void) { return 0xF0F0F0F0F0F0F0F0ull; } >>> >>> This currently generates >>> >>> li a5,-252645376 >>> addi a5,a5,241 >>> li a0,-252645376 >>> slli a5,a5,32 >>> addi a0,a0,240 >>> add a0,a5,a0 >>> ret >>> >>> The signed math in hival extraction introduces an additional bit, >>> causing loval == hival check to fail. >>> >>> | riscv_split_integer (val=-1085102592571150096, mode=E_DImode) at ../gcc/config/riscv/riscv.cc:702 >>> | 702 unsigned HOST_WIDE_INT loval = sext_hwi (val, 32); >>> | (gdb)n >>> | 703 unsigned HOST_WIDE_INT hival = sext_hwi ((val - loval) >> 32, 32); >>> | (gdb) >> FWIW (and I missed adding this observation to the changelog) I pondered >> about using unsigned loval/hival with zext_hwi() but that in certain >> cases can cause additional insns >> >> e.g. constant 0x8000_0000 is codegen to LI 1 +SLLI 31 vs, LI >> 0xFFFFFFFF_80000000 >> >> >>> | 704 rtx hi = gen_reg_rtx (mode), lo = gen_reg_rtx (mode); >>> | (gdb) p/x val >>> | $2 = 0xf0f0f0f0f0f0f0f0 >>> | (gdb) p/x loval >>> | $3 = 0xfffffffff0f0f0f0 >>> | (gdb) p/x hival >>> | $4 = 0xfffffffff0f0f0f1 >>> ^^^ >>> Fix that by eliding the subtraction in shift. >>> >>> With patch: >>> >>> li a5,-252645376 >>> addi a5,a5,240 >>> slli a0,a5,32 >>> add a0,a0,a5 >>> ret >>> >>> gcc/ChangeLog: >>> >>> * config/riscv/riscv.cc (riscv_split_integer): hival computation >>> do elide subtraction of loval. >>> * (riscv_split_integer_cost): Ditto. >>> * (riscv_build_integer): Ditto >>> >>> Signed-off-by: Vineet Gupta <vineetg@rivosinc.com> >>> --- >>> I wasn't planning to do any more work on large const stuff, but just ran into it this >>> on a random BZ entry when trying search for redundant constant stuff. >>> The test seemed too good to pass :-) >>> --- >>> gcc/config/riscv/riscv.cc | 6 +++--- >>> 1 file changed, 3 insertions(+), 3 deletions(-) >>> >>> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc >>> index 5ac187c1b1b4..377d3aac794b 100644 >>> --- a/gcc/config/riscv/riscv.cc >>> +++ b/gcc/config/riscv/riscv.cc >>> @@ -643,7 +643,7 @@ riscv_build_integer (struct riscv_integer_op *codes, HOST_WIDE_INT value, >>> && (value > INT32_MAX || value < INT32_MIN)) >>> { >>> unsigned HOST_WIDE_INT loval = sext_hwi (value, 32); >>> - unsigned HOST_WIDE_INT hival = sext_hwi ((value - loval) >> 32, 32); >>> + unsigned HOST_WIDE_INT hival = sext_hwi (value >> 32, 32); >>> struct riscv_integer_op alt_codes[RISCV_MAX_INTEGER_OPS]; >>> struct riscv_integer_op hicode[RISCV_MAX_INTEGER_OPS]; >>> int hi_cost, lo_cost; >>> @@ -674,7 +674,7 @@ riscv_split_integer_cost (HOST_WIDE_INT val) >>> { >>> int cost; >>> unsigned HOST_WIDE_INT loval = sext_hwi (val, 32); >>> - unsigned HOST_WIDE_INT hival = sext_hwi ((val - loval) >> 32, 32); >>> + unsigned HOST_WIDE_INT hival = sext_hwi (val >> 32, 32); >>> struct riscv_integer_op codes[RISCV_MAX_INTEGER_OPS]; >>> >>> cost = 2 + riscv_build_integer (codes, loval, VOIDmode); >>> @@ -700,7 +700,7 @@ static rtx >>> riscv_split_integer (HOST_WIDE_INT val, machine_mode mode) >>> { >>> unsigned HOST_WIDE_INT loval = sext_hwi (val, 32); >>> - unsigned HOST_WIDE_INT hival = sext_hwi ((val - loval) >> 32, 32); >>> + unsigned HOST_WIDE_INT hival = sext_hwi (val >> 32, 32); >>> rtx hi = gen_reg_rtx (mode), lo = gen_reg_rtx (mode); >>> >>> riscv_move_integer (lo, lo, loval, mode);
On Fri, Jun 30, 2023 at 5:13 PM Vineet Gupta <vineetg@rivosinc.com> wrote: > > > > On 6/30/23 16:50, Andrew Waterman wrote: > > I don't believe this is correct; the subtraction is needed to account > > for the fact that the low part might be negative, resulting in a > > borrow from the high part. See the output for your test case below: > > > > $ cat test.c > > #include <stdio.h> > > > > int main() > > { > > unsigned long result, tmp; > > > > asm ( > > "li %1,-252645376\n" > > "addi %1,%1,240\n" > > "slli %0,%1,32\n" > > "add %0,%0,%1" > > : "=r" (result), "=r" (tmp)); > > > > printf("%lx\n", result); > > > > return 0; > > } > > $ riscv64-unknown-elf-gcc -O2 test.c > > $ spike pk a.out > > bbl loader > > f0f0f0eff0f0f0f0 > > $ > > Thx for the quick feedback Andew. I'm clearly lacking in signed math :-( > So is it possible to have a better code seq for the testcase at all ? You're welcome! When Zba is implemented, then inserting a zext.w would do the trick; see below. (The generalization is that the zext.w is needed if the 32-bit constant is negative.) When Zba is not implemented, I think the original sequence is optimal. li a5, -252645376 addi a5, a5, 240 slli a0, a5, 32 zext.w a5, a5 add a0, a0, a5 > > -Vineet > > > > > > > On Fri, Jun 30, 2023 at 4:42 PM Vineet Gupta <vineetg@rivosinc.com> wrote: > >> > >> > >> On 6/30/23 16:33, Vineet Gupta wrote: > >>> Ran into a minor snafu in const splitting code when playing with test > >>> case from an old PR/23813. > >>> > >>> long long f(void) { return 0xF0F0F0F0F0F0F0F0ull; } > >>> > >>> This currently generates > >>> > >>> li a5,-252645376 > >>> addi a5,a5,241 > >>> li a0,-252645376 > >>> slli a5,a5,32 > >>> addi a0,a0,240 > >>> add a0,a5,a0 > >>> ret > >>> > >>> The signed math in hival extraction introduces an additional bit, > >>> causing loval == hival check to fail. > >>> > >>> | riscv_split_integer (val=-1085102592571150096, mode=E_DImode) at ../gcc/config/riscv/riscv.cc:702 > >>> | 702 unsigned HOST_WIDE_INT loval = sext_hwi (val, 32); > >>> | (gdb)n > >>> | 703 unsigned HOST_WIDE_INT hival = sext_hwi ((val - loval) >> 32, 32); > >>> | (gdb) > >> FWIW (and I missed adding this observation to the changelog) I pondered > >> about using unsigned loval/hival with zext_hwi() but that in certain > >> cases can cause additional insns > >> > >> e.g. constant 0x8000_0000 is codegen to LI 1 +SLLI 31 vs, LI > >> 0xFFFFFFFF_80000000 > >> > >> > >>> | 704 rtx hi = gen_reg_rtx (mode), lo = gen_reg_rtx (mode); > >>> | (gdb) p/x val > >>> | $2 = 0xf0f0f0f0f0f0f0f0 > >>> | (gdb) p/x loval > >>> | $3 = 0xfffffffff0f0f0f0 > >>> | (gdb) p/x hival > >>> | $4 = 0xfffffffff0f0f0f1 > >>> ^^^ > >>> Fix that by eliding the subtraction in shift. > >>> > >>> With patch: > >>> > >>> li a5,-252645376 > >>> addi a5,a5,240 > >>> slli a0,a5,32 > >>> add a0,a0,a5 > >>> ret > >>> > >>> gcc/ChangeLog: > >>> > >>> * config/riscv/riscv.cc (riscv_split_integer): hival computation > >>> do elide subtraction of loval. > >>> * (riscv_split_integer_cost): Ditto. > >>> * (riscv_build_integer): Ditto > >>> > >>> Signed-off-by: Vineet Gupta <vineetg@rivosinc.com> > >>> --- > >>> I wasn't planning to do any more work on large const stuff, but just ran into it this > >>> on a random BZ entry when trying search for redundant constant stuff. > >>> The test seemed too good to pass :-) > >>> --- > >>> gcc/config/riscv/riscv.cc | 6 +++--- > >>> 1 file changed, 3 insertions(+), 3 deletions(-) > >>> > >>> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc > >>> index 5ac187c1b1b4..377d3aac794b 100644 > >>> --- a/gcc/config/riscv/riscv.cc > >>> +++ b/gcc/config/riscv/riscv.cc > >>> @@ -643,7 +643,7 @@ riscv_build_integer (struct riscv_integer_op *codes, HOST_WIDE_INT value, > >>> && (value > INT32_MAX || value < INT32_MIN)) > >>> { > >>> unsigned HOST_WIDE_INT loval = sext_hwi (value, 32); > >>> - unsigned HOST_WIDE_INT hival = sext_hwi ((value - loval) >> 32, 32); > >>> + unsigned HOST_WIDE_INT hival = sext_hwi (value >> 32, 32); > >>> struct riscv_integer_op alt_codes[RISCV_MAX_INTEGER_OPS]; > >>> struct riscv_integer_op hicode[RISCV_MAX_INTEGER_OPS]; > >>> int hi_cost, lo_cost; > >>> @@ -674,7 +674,7 @@ riscv_split_integer_cost (HOST_WIDE_INT val) > >>> { > >>> int cost; > >>> unsigned HOST_WIDE_INT loval = sext_hwi (val, 32); > >>> - unsigned HOST_WIDE_INT hival = sext_hwi ((val - loval) >> 32, 32); > >>> + unsigned HOST_WIDE_INT hival = sext_hwi (val >> 32, 32); > >>> struct riscv_integer_op codes[RISCV_MAX_INTEGER_OPS]; > >>> > >>> cost = 2 + riscv_build_integer (codes, loval, VOIDmode); > >>> @@ -700,7 +700,7 @@ static rtx > >>> riscv_split_integer (HOST_WIDE_INT val, machine_mode mode) > >>> { > >>> unsigned HOST_WIDE_INT loval = sext_hwi (val, 32); > >>> - unsigned HOST_WIDE_INT hival = sext_hwi ((val - loval) >> 32, 32); > >>> + unsigned HOST_WIDE_INT hival = sext_hwi (val >> 32, 32); > >>> rtx hi = gen_reg_rtx (mode), lo = gen_reg_rtx (mode); > >>> > >>> riscv_move_integer (lo, lo, loval, mode); >
On Fri, 30 Jun 2023 17:25:54 PDT (-0700), Andrew Waterman wrote: > On Fri, Jun 30, 2023 at 5:13 PM Vineet Gupta <vineetg@rivosinc.com> wrote: >> >> >> >> On 6/30/23 16:50, Andrew Waterman wrote: >> > I don't believe this is correct; the subtraction is needed to account >> > for the fact that the low part might be negative, resulting in a >> > borrow from the high part. See the output for your test case below: >> > >> > $ cat test.c >> > #include <stdio.h> >> > >> > int main() >> > { >> > unsigned long result, tmp; >> > >> > asm ( >> > "li %1,-252645376\n" >> > "addi %1,%1,240\n" >> > "slli %0,%1,32\n" >> > "add %0,%0,%1" >> > : "=r" (result), "=r" (tmp)); >> > >> > printf("%lx\n", result); >> > >> > return 0; >> > } >> > $ riscv64-unknown-elf-gcc -O2 test.c >> > $ spike pk a.out >> > bbl loader >> > f0f0f0eff0f0f0f0 >> > $ >> >> Thx for the quick feedback Andew. I'm clearly lacking in signed math :-( >> So is it possible to have a better code seq for the testcase at all ? > > You're welcome! > > When Zba is implemented, then inserting a zext.w would do the trick; > see below. (The generalization is that the zext.w is needed if the > 32-bit constant is negative.) When Zba is not implemented, I think > the original sequence is optimal. > > li a5, -252645376 > addi a5, a5, 240 > slli a0, a5, 32 > zext.w a5, a5 > add a0, a0, a5 For the non-Zba case, I think we can leverage the two high parts starting out the same to save an instruction generating the constant. So for the original code sequence of li a5,-252645376 addi a5,a5,241 li a0,-252645376 slli a5,a5,32 addi a0,a0,240 add a0,a5,a0 ret we could instead generate li a5,-252645376 addi a0,a5,240 addi a5,a5,241 slli a5,a5,32 add a0,a5,a0 ret which IIUC produces the same result. I think something along the lines of this (with the corresponding cost function updates) would do it diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc index de578b5b899..32b6033a966 100644 --- a/gcc/config/riscv/riscv.cc +++ b/gcc/config/riscv/riscv.cc @@ -704,7 +704,13 @@ riscv_split_integer (HOST_WIDE_INT val, machine_mode mode) rtx hi = gen_reg_rtx (mode), lo = gen_reg_rtx (mode); riscv_move_integer (hi, hi, hival, mode); - riscv_move_integer (lo, lo, loval, mode); + if (riscv_integer_cost (loval - hival) + 1 < riscv_integer_cost (loval)) { + rtx delta = gen_reg_rrtx (mode); + riscv_move_integer (delta, delta, loval - hival, mode); + lo = gen_rtx_fmt_ee (PLUS, mode, hi, delta); + } else { + riscv_move_integer (lo, lo, loval, mode); + } hi = gen_rtx_fmt_ee (ASHIFT, mode, hi, GEN_INT (32)); hi = force_reg (mode, hi); though I suppose that would produce a slightly different sequence that has the same number of instructions but a slightly longer dependency chain, something more like li a5,-252645376 addi a5,a5,241 addi a0,a5,-1 slli a5,a5,32 add a0,a5,a0 ret Take that all with a grain of salt, though, as I just ate some very spicy chicken and can barely see straight :) > > >> >> -Vineet >> >> > >> > >> > On Fri, Jun 30, 2023 at 4:42 PM Vineet Gupta <vineetg@rivosinc.com> wrote: >> >> >> >> >> >> On 6/30/23 16:33, Vineet Gupta wrote: >> >>> Ran into a minor snafu in const splitting code when playing with test >> >>> case from an old PR/23813. >> >>> >> >>> long long f(void) { return 0xF0F0F0F0F0F0F0F0ull; } >> >>> >> >>> This currently generates >> >>> >> >>> li a5,-252645376 >> >>> addi a5,a5,241 >> >>> li a0,-252645376 >> >>> slli a5,a5,32 >> >>> addi a0,a0,240 >> >>> add a0,a5,a0 >> >>> ret >> >>> >> >>> The signed math in hival extraction introduces an additional bit, >> >>> causing loval == hival check to fail. >> >>> >> >>> | riscv_split_integer (val=-1085102592571150096, mode=E_DImode) at ../gcc/config/riscv/riscv.cc:702 >> >>> | 702 unsigned HOST_WIDE_INT loval = sext_hwi (val, 32); >> >>> | (gdb)n >> >>> | 703 unsigned HOST_WIDE_INT hival = sext_hwi ((val - loval) >> 32, 32); >> >>> | (gdb) >> >> FWIW (and I missed adding this observation to the changelog) I pondered >> >> about using unsigned loval/hival with zext_hwi() but that in certain >> >> cases can cause additional insns >> >> >> >> e.g. constant 0x8000_0000 is codegen to LI 1 +SLLI 31 vs, LI >> >> 0xFFFFFFFF_80000000 >> >> >> >> >> >>> | 704 rtx hi = gen_reg_rtx (mode), lo = gen_reg_rtx (mode); >> >>> | (gdb) p/x val >> >>> | $2 = 0xf0f0f0f0f0f0f0f0 >> >>> | (gdb) p/x loval >> >>> | $3 = 0xfffffffff0f0f0f0 >> >>> | (gdb) p/x hival >> >>> | $4 = 0xfffffffff0f0f0f1 >> >>> ^^^ >> >>> Fix that by eliding the subtraction in shift. >> >>> >> >>> With patch: >> >>> >> >>> li a5,-252645376 >> >>> addi a5,a5,240 >> >>> slli a0,a5,32 >> >>> add a0,a0,a5 >> >>> ret >> >>> >> >>> gcc/ChangeLog: >> >>> >> >>> * config/riscv/riscv.cc (riscv_split_integer): hival computation >> >>> do elide subtraction of loval. >> >>> * (riscv_split_integer_cost): Ditto. >> >>> * (riscv_build_integer): Ditto >> >>> >> >>> Signed-off-by: Vineet Gupta <vineetg@rivosinc.com> >> >>> --- >> >>> I wasn't planning to do any more work on large const stuff, but just ran into it this >> >>> on a random BZ entry when trying search for redundant constant stuff. >> >>> The test seemed too good to pass :-) >> >>> --- >> >>> gcc/config/riscv/riscv.cc | 6 +++--- >> >>> 1 file changed, 3 insertions(+), 3 deletions(-) >> >>> >> >>> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc >> >>> index 5ac187c1b1b4..377d3aac794b 100644 >> >>> --- a/gcc/config/riscv/riscv.cc >> >>> +++ b/gcc/config/riscv/riscv.cc >> >>> @@ -643,7 +643,7 @@ riscv_build_integer (struct riscv_integer_op *codes, HOST_WIDE_INT value, >> >>> && (value > INT32_MAX || value < INT32_MIN)) >> >>> { >> >>> unsigned HOST_WIDE_INT loval = sext_hwi (value, 32); >> >>> - unsigned HOST_WIDE_INT hival = sext_hwi ((value - loval) >> 32, 32); >> >>> + unsigned HOST_WIDE_INT hival = sext_hwi (value >> 32, 32); >> >>> struct riscv_integer_op alt_codes[RISCV_MAX_INTEGER_OPS]; >> >>> struct riscv_integer_op hicode[RISCV_MAX_INTEGER_OPS]; >> >>> int hi_cost, lo_cost; >> >>> @@ -674,7 +674,7 @@ riscv_split_integer_cost (HOST_WIDE_INT val) >> >>> { >> >>> int cost; >> >>> unsigned HOST_WIDE_INT loval = sext_hwi (val, 32); >> >>> - unsigned HOST_WIDE_INT hival = sext_hwi ((val - loval) >> 32, 32); >> >>> + unsigned HOST_WIDE_INT hival = sext_hwi (val >> 32, 32); >> >>> struct riscv_integer_op codes[RISCV_MAX_INTEGER_OPS]; >> >>> >> >>> cost = 2 + riscv_build_integer (codes, loval, VOIDmode); >> >>> @@ -700,7 +700,7 @@ static rtx >> >>> riscv_split_integer (HOST_WIDE_INT val, machine_mode mode) >> >>> { >> >>> unsigned HOST_WIDE_INT loval = sext_hwi (val, 32); >> >>> - unsigned HOST_WIDE_INT hival = sext_hwi ((val - loval) >> 32, 32); >> >>> + unsigned HOST_WIDE_INT hival = sext_hwi (val >> 32, 32); >> >>> rtx hi = gen_reg_rtx (mode), lo = gen_reg_rtx (mode); >> >>> >> >>> riscv_move_integer (lo, lo, loval, mode); >>
On Fri, Jun 30, 2023 at 5:36 PM Palmer Dabbelt <palmer@rivosinc.com> wrote: > > On Fri, 30 Jun 2023 17:25:54 PDT (-0700), Andrew Waterman wrote: > > On Fri, Jun 30, 2023 at 5:13 PM Vineet Gupta <vineetg@rivosinc.com> wrote: > >> > >> > >> > >> On 6/30/23 16:50, Andrew Waterman wrote: > >> > I don't believe this is correct; the subtraction is needed to account > >> > for the fact that the low part might be negative, resulting in a > >> > borrow from the high part. See the output for your test case below: > >> > > >> > $ cat test.c > >> > #include <stdio.h> > >> > > >> > int main() > >> > { > >> > unsigned long result, tmp; > >> > > >> > asm ( > >> > "li %1,-252645376\n" > >> > "addi %1,%1,240\n" > >> > "slli %0,%1,32\n" > >> > "add %0,%0,%1" > >> > : "=r" (result), "=r" (tmp)); > >> > > >> > printf("%lx\n", result); > >> > > >> > return 0; > >> > } > >> > $ riscv64-unknown-elf-gcc -O2 test.c > >> > $ spike pk a.out > >> > bbl loader > >> > f0f0f0eff0f0f0f0 > >> > $ > >> > >> Thx for the quick feedback Andew. I'm clearly lacking in signed math :-( > >> So is it possible to have a better code seq for the testcase at all ? > > > > You're welcome! > > > > When Zba is implemented, then inserting a zext.w would do the trick; > > see below. (The generalization is that the zext.w is needed if the > > 32-bit constant is negative.) When Zba is not implemented, I think > > the original sequence is optimal. > > > > li a5, -252645376 > > addi a5, a5, 240 > > slli a0, a5, 32 > > zext.w a5, a5 > > add a0, a0, a5 > > For the non-Zba case, I think we can leverage the two high parts > starting out the same to save an instruction generating the constant. > So for the original code sequence of > > li a5,-252645376 > addi a5,a5,241 > li a0,-252645376 > slli a5,a5,32 > addi a0,a0,240 > add a0,a5,a0 > ret > > we could instead generate > > li a5,-252645376 > addi a0,a5,240 > addi a5,a5,241 > slli a5,a5,32 > add a0,a5,a0 > ret > > which IIUC produces the same result. I think something along the lines > of this (with the corresponding cost function updates) would do it > > diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc > index de578b5b899..32b6033a966 100644 > --- a/gcc/config/riscv/riscv.cc > +++ b/gcc/config/riscv/riscv.cc > @@ -704,7 +704,13 @@ riscv_split_integer (HOST_WIDE_INT val, machine_mode mode) > rtx hi = gen_reg_rtx (mode), lo = gen_reg_rtx (mode); > > riscv_move_integer (hi, hi, hival, mode); > - riscv_move_integer (lo, lo, loval, mode); > + if (riscv_integer_cost (loval - hival) + 1 < riscv_integer_cost (loval)) { > + rtx delta = gen_reg_rrtx (mode); > + riscv_move_integer (delta, delta, loval - hival, mode); > + lo = gen_rtx_fmt_ee (PLUS, mode, hi, delta); > + } else { > + riscv_move_integer (lo, lo, loval, mode); > + } > > hi = gen_rtx_fmt_ee (ASHIFT, mode, hi, GEN_INT (32)); > hi = force_reg (mode, hi); > > though I suppose that would produce a slightly different sequence that has the > same number of instructions but a slightly longer dependency chain, something > more like > > li a5,-252645376 > addi a5,a5,241 > addi a0,a5,-1 > slli a5,a5,32 > add a0,a5,a0 > ret > > Take that all with a grain of salt, though, as I just ate some very spicy > chicken and can barely see straight :) Yeah, that might end up being a false economy for superscalars. In general, I wouldn't recommend spending too many cleverness beans on non-Zba+Zbb implementations. Going forward, we should expect that even very simple cores provide those extensions. > > > > > > > >> > >> -Vineet > >> > >> > > >> > > >> > On Fri, Jun 30, 2023 at 4:42 PM Vineet Gupta <vineetg@rivosinc.com> wrote: > >> >> > >> >> > >> >> On 6/30/23 16:33, Vineet Gupta wrote: > >> >>> Ran into a minor snafu in const splitting code when playing with test > >> >>> case from an old PR/23813. > >> >>> > >> >>> long long f(void) { return 0xF0F0F0F0F0F0F0F0ull; } > >> >>> > >> >>> This currently generates > >> >>> > >> >>> li a5,-252645376 > >> >>> addi a5,a5,241 > >> >>> li a0,-252645376 > >> >>> slli a5,a5,32 > >> >>> addi a0,a0,240 > >> >>> add a0,a5,a0 > >> >>> ret > >> >>> > >> >>> The signed math in hival extraction introduces an additional bit, > >> >>> causing loval == hival check to fail. > >> >>> > >> >>> | riscv_split_integer (val=-1085102592571150096, mode=E_DImode) at ../gcc/config/riscv/riscv.cc:702 > >> >>> | 702 unsigned HOST_WIDE_INT loval = sext_hwi (val, 32); > >> >>> | (gdb)n > >> >>> | 703 unsigned HOST_WIDE_INT hival = sext_hwi ((val - loval) >> 32, 32); > >> >>> | (gdb) > >> >> FWIW (and I missed adding this observation to the changelog) I pondered > >> >> about using unsigned loval/hival with zext_hwi() but that in certain > >> >> cases can cause additional insns > >> >> > >> >> e.g. constant 0x8000_0000 is codegen to LI 1 +SLLI 31 vs, LI > >> >> 0xFFFFFFFF_80000000 > >> >> > >> >> > >> >>> | 704 rtx hi = gen_reg_rtx (mode), lo = gen_reg_rtx (mode); > >> >>> | (gdb) p/x val > >> >>> | $2 = 0xf0f0f0f0f0f0f0f0 > >> >>> | (gdb) p/x loval > >> >>> | $3 = 0xfffffffff0f0f0f0 > >> >>> | (gdb) p/x hival > >> >>> | $4 = 0xfffffffff0f0f0f1 > >> >>> ^^^ > >> >>> Fix that by eliding the subtraction in shift. > >> >>> > >> >>> With patch: > >> >>> > >> >>> li a5,-252645376 > >> >>> addi a5,a5,240 > >> >>> slli a0,a5,32 > >> >>> add a0,a0,a5 > >> >>> ret > >> >>> > >> >>> gcc/ChangeLog: > >> >>> > >> >>> * config/riscv/riscv.cc (riscv_split_integer): hival computation > >> >>> do elide subtraction of loval. > >> >>> * (riscv_split_integer_cost): Ditto. > >> >>> * (riscv_build_integer): Ditto > >> >>> > >> >>> Signed-off-by: Vineet Gupta <vineetg@rivosinc.com> > >> >>> --- > >> >>> I wasn't planning to do any more work on large const stuff, but just ran into it this > >> >>> on a random BZ entry when trying search for redundant constant stuff. > >> >>> The test seemed too good to pass :-) > >> >>> --- > >> >>> gcc/config/riscv/riscv.cc | 6 +++--- > >> >>> 1 file changed, 3 insertions(+), 3 deletions(-) > >> >>> > >> >>> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc > >> >>> index 5ac187c1b1b4..377d3aac794b 100644 > >> >>> --- a/gcc/config/riscv/riscv.cc > >> >>> +++ b/gcc/config/riscv/riscv.cc > >> >>> @@ -643,7 +643,7 @@ riscv_build_integer (struct riscv_integer_op *codes, HOST_WIDE_INT value, > >> >>> && (value > INT32_MAX || value < INT32_MIN)) > >> >>> { > >> >>> unsigned HOST_WIDE_INT loval = sext_hwi (value, 32); > >> >>> - unsigned HOST_WIDE_INT hival = sext_hwi ((value - loval) >> 32, 32); > >> >>> + unsigned HOST_WIDE_INT hival = sext_hwi (value >> 32, 32); > >> >>> struct riscv_integer_op alt_codes[RISCV_MAX_INTEGER_OPS]; > >> >>> struct riscv_integer_op hicode[RISCV_MAX_INTEGER_OPS]; > >> >>> int hi_cost, lo_cost; > >> >>> @@ -674,7 +674,7 @@ riscv_split_integer_cost (HOST_WIDE_INT val) > >> >>> { > >> >>> int cost; > >> >>> unsigned HOST_WIDE_INT loval = sext_hwi (val, 32); > >> >>> - unsigned HOST_WIDE_INT hival = sext_hwi ((val - loval) >> 32, 32); > >> >>> + unsigned HOST_WIDE_INT hival = sext_hwi (val >> 32, 32); > >> >>> struct riscv_integer_op codes[RISCV_MAX_INTEGER_OPS]; > >> >>> > >> >>> cost = 2 + riscv_build_integer (codes, loval, VOIDmode); > >> >>> @@ -700,7 +700,7 @@ static rtx > >> >>> riscv_split_integer (HOST_WIDE_INT val, machine_mode mode) > >> >>> { > >> >>> unsigned HOST_WIDE_INT loval = sext_hwi (val, 32); > >> >>> - unsigned HOST_WIDE_INT hival = sext_hwi ((val - loval) >> 32, 32); > >> >>> + unsigned HOST_WIDE_INT hival = sext_hwi (val >> 32, 32); > >> >>> rtx hi = gen_reg_rtx (mode), lo = gen_reg_rtx (mode); > >> >>> > >> >>> riscv_move_integer (lo, lo, loval, mode); > >>
On 7/1/23 02:00, Andrew Waterman wrote: > > Yeah, that might end up being a false economy for superscalars. > > In general, I wouldn't recommend spending too many cleverness beans on > non-Zba+Zbb implementations. Going forward, we should expect that > even very simple cores provide those extensions. I suspect you under-estimate how difficult it is to get the distros to move forward on baseline ISAs. jeff
On Sat, 01 Jul 2023 07:04:16 PDT (-0700), jeffreyalaw@gmail.com wrote: > > > On 7/1/23 02:00, Andrew Waterman wrote: > >> >> Yeah, that might end up being a false economy for superscalars. >> >> In general, I wouldn't recommend spending too many cleverness beans on >> non-Zba+Zbb implementations. Going forward, we should expect that >> even very simple cores provide those extensions. > I suspect you under-estimate how difficult it is to get the distros to > move forward on baseline ISAs. Ya, we haven't even gotten to the point where most implementations are shipping with the B extensions, much less to the point where we can start ignoring all the pre-B hardware.
On Sat, Jul 1, 2023 at 7:04 AM Jeff Law <jeffreyalaw@gmail.com> wrote: > > > > On 7/1/23 02:00, Andrew Waterman wrote: > > > > > Yeah, that might end up being a false economy for superscalars. > > > > In general, I wouldn't recommend spending too many cleverness beans on > > non-Zba+Zbb implementations. Going forward, we should expect that > > even very simple cores provide those extensions. > I suspect you under-estimate how difficult it is to get the distros to > move forward on baseline ISAs. Yeah, true. > > jeff
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc index 5ac187c1b1b4..377d3aac794b 100644 --- a/gcc/config/riscv/riscv.cc +++ b/gcc/config/riscv/riscv.cc @@ -643,7 +643,7 @@ riscv_build_integer (struct riscv_integer_op *codes, HOST_WIDE_INT value, && (value > INT32_MAX || value < INT32_MIN)) { unsigned HOST_WIDE_INT loval = sext_hwi (value, 32); - unsigned HOST_WIDE_INT hival = sext_hwi ((value - loval) >> 32, 32); + unsigned HOST_WIDE_INT hival = sext_hwi (value >> 32, 32); struct riscv_integer_op alt_codes[RISCV_MAX_INTEGER_OPS]; struct riscv_integer_op hicode[RISCV_MAX_INTEGER_OPS]; int hi_cost, lo_cost; @@ -674,7 +674,7 @@ riscv_split_integer_cost (HOST_WIDE_INT val) { int cost; unsigned HOST_WIDE_INT loval = sext_hwi (val, 32); - unsigned HOST_WIDE_INT hival = sext_hwi ((val - loval) >> 32, 32); + unsigned HOST_WIDE_INT hival = sext_hwi (val >> 32, 32); struct riscv_integer_op codes[RISCV_MAX_INTEGER_OPS]; cost = 2 + riscv_build_integer (codes, loval, VOIDmode); @@ -700,7 +700,7 @@ static rtx riscv_split_integer (HOST_WIDE_INT val, machine_mode mode) { unsigned HOST_WIDE_INT loval = sext_hwi (val, 32); - unsigned HOST_WIDE_INT hival = sext_hwi ((val - loval) >> 32, 32); + unsigned HOST_WIDE_INT hival = sext_hwi (val >> 32, 32); rtx hi = gen_reg_rtx (mode), lo = gen_reg_rtx (mode); riscv_move_integer (lo, lo, loval, mode);
Ran into a minor snafu in const splitting code when playing with test case from an old PR/23813. long long f(void) { return 0xF0F0F0F0F0F0F0F0ull; } This currently generates li a5,-252645376 addi a5,a5,241 li a0,-252645376 slli a5,a5,32 addi a0,a0,240 add a0,a5,a0 ret The signed math in hival extraction introduces an additional bit, causing loval == hival check to fail. | riscv_split_integer (val=-1085102592571150096, mode=E_DImode) at ../gcc/config/riscv/riscv.cc:702 | 702 unsigned HOST_WIDE_INT loval = sext_hwi (val, 32); | (gdb)n | 703 unsigned HOST_WIDE_INT hival = sext_hwi ((val - loval) >> 32, 32); | (gdb) | 704 rtx hi = gen_reg_rtx (mode), lo = gen_reg_rtx (mode); | (gdb) p/x val | $2 = 0xf0f0f0f0f0f0f0f0 | (gdb) p/x loval | $3 = 0xfffffffff0f0f0f0 | (gdb) p/x hival | $4 = 0xfffffffff0f0f0f1 ^^^ Fix that by eliding the subtraction in shift. With patch: li a5,-252645376 addi a5,a5,240 slli a0,a5,32 add a0,a0,a5 ret gcc/ChangeLog: * config/riscv/riscv.cc (riscv_split_integer): hival computation do elide subtraction of loval. * (riscv_split_integer_cost): Ditto. * (riscv_build_integer): Ditto Signed-off-by: Vineet Gupta <vineetg@rivosinc.com> --- I wasn't planning to do any more work on large const stuff, but just ran into it this on a random BZ entry when trying search for redundant constant stuff. The test seemed too good to pass :-) --- gcc/config/riscv/riscv.cc | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)