Message ID | 20221207205517.526182-1-rzinsly@ventanamicro.com |
---|---|
State | New |
Headers | show |
Series | RISC-V: Produce better code with complex constants [PR95632] [PR106602] | expand |
On 12/7/22 13:55, Raphael Moreira Zinsly wrote: > Due to RISC-V limitations on operations with big constants combine > is failing to match such operations and is not being able to > produce optimal code as it keeps splitting them. By pretending we > can do those operations we can get more opportunities for > simplification of surrounding instructions. > > 2022-12-06 Raphael Moreira Zinsly <rzinsly@ventanamicro.com> > Jeff Law <jlaw@ventanamicro.com> > > gcc/Changelog: > PR target/95632 > PR target/106602 > * config/riscv/riscv.md: New pattern to simulate complex > const_int loads. > > gcc/testsuite/ChangeLog: > * gcc.target/riscv/pr95632.c: New test. > * gcc.target/riscv/pr106602.c: Likewise. So to give a little background to others. The core issue is that when we break down constants early, it can make it difficult for combine to reconstruct the constant and simplify code using the reconstructed constant -- you end up trying to do 4->3 or worse combination sequences which aren't supported by the combiner. Usually this kind of scenario is handled with a "bridge" pattern. Those are generally defined as patterns that exist solely for combine and may not correspond to any real instruction on the target. "bridge" patterns are typically 2->1 or 3->1 combinations and are intermediate steps for 4->N or even larger combination opportunities. Obviously if the bridge doesn't allow subsequent simplifications, then the bridge pattern must generate correct code (either by generating suitable assembly or splitting later). Raphael's patch introduces a bridge pattern that pretends we can load up splittable constants in a single insn. We restrict the bridge pattern to be active from the point when CSE is no longer expected through the combiner up to the first splitter pass (where we'll break it down again if it's still in the IL). So we get most of the benefit of splitting constants early (CSE, LICM, etc) while also getting the benefits of splitting late (combine simplifications). Given I was working with Raphael on the patch, it's probably best for someone else to do the review rather than me approving it :-) Jeff
On Wed, Dec 07, 2022 at 05:55:17PM -0300, Raphael Moreira Zinsly wrote: > Due to RISC-V limitations on operations with big constants combine > is failing to match such operations and is not being able to > produce optimal code as it keeps splitting them. By pretending we > can do those operations we can get more opportunities for > simplification of surrounding instructions. > > 2022-12-06 Raphael Moreira Zinsly <rzinsly@ventanamicro.com> > Jeff Law <jlaw@ventanamicro.com> Just nits, not a proper review. 2 spaces after date and 2 spaces before <, rather than just 1. > > gcc/Changelog: > PR target/95632 > PR target/106602 > * config/riscv/riscv.md: New pattern to simulate complex > const_int loads. > > gcc/testsuite/ChangeLog: > * gcc.target/riscv/pr95632.c: New test. > * gcc.target/riscv/pr106602.c: Likewise. All lines in the ChangeLog should be tab indented, rather than just some of them and others with 8 spaces. > --- a/gcc/config/riscv/riscv.md > +++ b/gcc/config/riscv/riscv.md > @@ -1667,6 +1667,22 @@ > MAX_MACHINE_MODE, &operands[3], TRUE); > }) > > +;; Pretend to have the ability to load complex const_int in order to get > +;; better code generation around them. > +(define_insn_and_split "" define_insn_and_split patterns better should have some name, even if it starts with *. It makes dumps more readable, and you can refer to it in the ChangeLog when it is added or changed etc. > + [(set (match_operand:GPR 0 "register_operand" "=r") > + (match_operand:GPR 1 "splittable_const_int_operand" "i"))] > + "cse_not_expected" > + "#" > + "&& 1" > + [(const_int 0)] > + Why the empty line? > +{ > + riscv_move_integer (operands[0], operands[0], INTVAL (operands[1]), > + <GPR:MODE>mode, TRUE); You can just use <MODE> if there is only one iterator in the pattern. Jakub
On Wed, 07 Dec 2022 13:30:32 PST (-0800), gcc-patches@gcc.gnu.org wrote: > On Wed, Dec 07, 2022 at 05:55:17PM -0300, Raphael Moreira Zinsly wrote: >> Due to RISC-V limitations on operations with big constants combine >> is failing to match such operations and is not being able to >> produce optimal code as it keeps splitting them. By pretending we >> can do those operations we can get more opportunities for >> simplification of surrounding instructions. >> >> 2022-12-06 Raphael Moreira Zinsly <rzinsly@ventanamicro.com> >> Jeff Law <jlaw@ventanamicro.com> > > Just nits, not a proper review. > 2 spaces after date and 2 spaces before <, rather than just 1. > >> >> gcc/Changelog: >> PR target/95632 >> PR target/106602 >> * config/riscv/riscv.md: New pattern to simulate complex >> const_int loads. >> >> gcc/testsuite/ChangeLog: >> * gcc.target/riscv/pr95632.c: New test. >> * gcc.target/riscv/pr106602.c: Likewise. > > All lines in the ChangeLog should be tab indented, rather than just some of > them and others with 8 spaces. There's alsot contrib/git-commit-mklog.py, which provides a template for these (I also have trouble remembering the formatting rules). > >> --- a/gcc/config/riscv/riscv.md >> +++ b/gcc/config/riscv/riscv.md >> @@ -1667,6 +1667,22 @@ >> MAX_MACHINE_MODE, &operands[3], TRUE); >> }) >> >> +;; Pretend to have the ability to load complex const_int in order to get >> +;; better code generation around them. >> +(define_insn_and_split "" > > define_insn_and_split patterns better should have some name, even if it > starts with *. It makes dumps more readable, and you can refer to it > in the ChangeLog when it is added or changed etc. > >> + [(set (match_operand:GPR 0 "register_operand" "=r") >> + (match_operand:GPR 1 "splittable_const_int_operand" "i"))] >> + "cse_not_expected" >> + "#" >> + "&& 1" >> + [(const_int 0)] >> + > > Why the empty line? > >> +{ >> + riscv_move_integer (operands[0], operands[0], INTVAL (operands[1]), >> + <GPR:MODE>mode, TRUE); > > You can just use <MODE> if there is only one iterator in the pattern. > > Jakub
On Wed, 07 Dec 2022 12:55:17 PST (-0800), rzinsly@ventanamicro.com wrote: > Due to RISC-V limitations on operations with big constants combine > is failing to match such operations and is not being able to > produce optimal code as it keeps splitting them. By pretending we > can do those operations we can get more opportunities for > simplification of surrounding instructions. I saw Jeff's comments. This is always the kind of thing that worries me: we're essentially lying to the optimizer in order to trick it into generating better code, which might just make it generate worse code. It's always easy to see a small example that improves, but those could be wiped out by secondary effects in real code. So I'd usually want to have some benchmarking for a patch like this. That said, if this is just the standard way of doing things then maybe it's just fine? > 2022-12-06 Raphael Moreira Zinsly <rzinsly@ventanamicro.com> > Jeff Law <jlaw@ventanamicro.com> > > gcc/Changelog: > PR target/95632 > PR target/106602 > * config/riscv/riscv.md: New pattern to simulate complex > const_int loads. > > gcc/testsuite/ChangeLog: > * gcc.target/riscv/pr95632.c: New test. > * gcc.target/riscv/pr106602.c: Likewise. > --- > gcc/config/riscv/riscv.md | 16 ++++++++++++++++ > gcc/testsuite/gcc.target/riscv/pr106602.c | 14 ++++++++++++++ > gcc/testsuite/gcc.target/riscv/pr95632.c | 15 +++++++++++++++ > 3 files changed, 45 insertions(+) > create mode 100644 gcc/testsuite/gcc.target/riscv/pr106602.c > create mode 100644 gcc/testsuite/gcc.target/riscv/pr95632.c > > diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md > index df57e2b0b4a..0a9b5ec22b0 100644 > --- a/gcc/config/riscv/riscv.md > +++ b/gcc/config/riscv/riscv.md > @@ -1667,6 +1667,22 @@ > MAX_MACHINE_MODE, &operands[3], TRUE); > }) > > +;; Pretend to have the ability to load complex const_int in order to get > +;; better code generation around them. > +(define_insn_and_split "" > + [(set (match_operand:GPR 0 "register_operand" "=r") > + (match_operand:GPR 1 "splittable_const_int_operand" "i"))] > + "cse_not_expected" > + "#" > + "&& 1" > + [(const_int 0)] > + > +{ > + riscv_move_integer (operands[0], operands[0], INTVAL (operands[1]), > + <GPR:MODE>mode, TRUE); > + DONE; > +}) There's some comments from Jakub on this, I don't see any additional issues with the code (aside from the "does it help" stuff from above). > + > ;; 64-bit integer moves > > (define_expand "movdi" > diff --git a/gcc/testsuite/gcc.target/riscv/pr106602.c b/gcc/testsuite/gcc.target/riscv/pr106602.c > new file mode 100644 > index 00000000000..83b70877012 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/riscv/pr106602.c > @@ -0,0 +1,14 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -march=rv64gc" } */ There's a DG hook to limit this to 64-bit targets, that way it'll run with whatever target is being tested. > + > +unsigned long > +foo2 (unsigned long a) > +{ > + return (unsigned long)(unsigned int) a << 6; > +} > + > +/* { dg-final { scan-assembler-times "slli\t" 1 } } */ > +/* { dg-final { scan-assembler-times "srli\t" 1 } } */ > +/* { dg-final { scan-assembler-not "\tli\t" } } */ > +/* { dg-final { scan-assembler-not "addi\t" } } */ > +/* { dg-final { scan-assembler-not "and\t" } } */ > diff --git a/gcc/testsuite/gcc.target/riscv/pr95632.c b/gcc/testsuite/gcc.target/riscv/pr95632.c > new file mode 100644 > index 00000000000..bd316ab1d7b > --- /dev/null > +++ b/gcc/testsuite/gcc.target/riscv/pr95632.c > @@ -0,0 +1,15 @@ > +/* { dg-do compile } */ > +/* { dg-options "-O2 -march=rv32imafc -mabi=ilp32f" } */ Is there a reason to make this rv32-only? Unless I'm missing something this should generate pretty much the same code for rv64. > + > +unsigned short > +foo (unsigned short crc) > +{ > + crc ^= 0x4002; > + crc >>= 1; > + crc |= 0x8000; > + > + return crc; > +} > + > +/* { dg-final { scan-assembler-times "srli\t" 1 } } */ > +/* { dg-final { scan-assembler-not "slli\t" } } */
On 12/8/22 10:53, Palmer Dabbelt wrote: > On Wed, 07 Dec 2022 12:55:17 PST (-0800), rzinsly@ventanamicro.com wrote: >> Due to RISC-V limitations on operations with big constants combine >> is failing to match such operations and is not being able to >> produce optimal code as it keeps splitting them. By pretending we >> can do those operations we can get more opportunities for >> simplification of surrounding instructions. > > I saw Jeff's comments. This is always the kind of thing that worries > me: we're essentially lying to the optimizer in order to trick it into > generating better code, which might just make it generate worse code. > It's always easy to see a small example that improves, but those could > be wiped out by secondary effects in real code. So I'd usually want to > have some benchmarking for a patch like this. > > That said, if this is just the standard way of doing things then maybe > it's just fine? Bridge combiner patterns are pretty standard. The insn's condition of cse_not_expected is also in there to minimize the potential for surprises by not exposing this too early. jeff
On Thu, 08 Dec 2022 10:15:47 PST (-0800), gcc-patches@gcc.gnu.org wrote: > > > On 12/8/22 10:53, Palmer Dabbelt wrote: >> On Wed, 07 Dec 2022 12:55:17 PST (-0800), rzinsly@ventanamicro.com wrote: >>> Due to RISC-V limitations on operations with big constants combine >>> is failing to match such operations and is not being able to >>> produce optimal code as it keeps splitting them. By pretending we >>> can do those operations we can get more opportunities for >>> simplification of surrounding instructions. >> >> I saw Jeff's comments. This is always the kind of thing that worries >> me: we're essentially lying to the optimizer in order to trick it into >> generating better code, which might just make it generate worse code. >> It's always easy to see a small example that improves, but those could >> be wiped out by secondary effects in real code. So I'd usually want to >> have some benchmarking for a patch like this. >> >> That said, if this is just the standard way of doing things then maybe >> it's just fine? > Bridge combiner patterns are pretty standard. The insn's condition of > cse_not_expected is also in there to minimize the potential for > surprises by not exposing this too early. OK, I'm fine with this, then -- aside from the fairly minor issues pointed out.
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md index df57e2b0b4a..0a9b5ec22b0 100644 --- a/gcc/config/riscv/riscv.md +++ b/gcc/config/riscv/riscv.md @@ -1667,6 +1667,22 @@ MAX_MACHINE_MODE, &operands[3], TRUE); }) +;; Pretend to have the ability to load complex const_int in order to get +;; better code generation around them. +(define_insn_and_split "" + [(set (match_operand:GPR 0 "register_operand" "=r") + (match_operand:GPR 1 "splittable_const_int_operand" "i"))] + "cse_not_expected" + "#" + "&& 1" + [(const_int 0)] + +{ + riscv_move_integer (operands[0], operands[0], INTVAL (operands[1]), + <GPR:MODE>mode, TRUE); + DONE; +}) + ;; 64-bit integer moves (define_expand "movdi" diff --git a/gcc/testsuite/gcc.target/riscv/pr106602.c b/gcc/testsuite/gcc.target/riscv/pr106602.c new file mode 100644 index 00000000000..83b70877012 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/pr106602.c @@ -0,0 +1,14 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -march=rv64gc" } */ + +unsigned long +foo2 (unsigned long a) +{ + return (unsigned long)(unsigned int) a << 6; +} + +/* { dg-final { scan-assembler-times "slli\t" 1 } } */ +/* { dg-final { scan-assembler-times "srli\t" 1 } } */ +/* { dg-final { scan-assembler-not "\tli\t" } } */ +/* { dg-final { scan-assembler-not "addi\t" } } */ +/* { dg-final { scan-assembler-not "and\t" } } */ diff --git a/gcc/testsuite/gcc.target/riscv/pr95632.c b/gcc/testsuite/gcc.target/riscv/pr95632.c new file mode 100644 index 00000000000..bd316ab1d7b --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/pr95632.c @@ -0,0 +1,15 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -march=rv32imafc -mabi=ilp32f" } */ + +unsigned short +foo (unsigned short crc) +{ + crc ^= 0x4002; + crc >>= 1; + crc |= 0x8000; + + return crc; +} + +/* { dg-final { scan-assembler-times "srli\t" 1 } } */ +/* { dg-final { scan-assembler-not "slli\t" } } */