Message ID | 20221128021428.13824-1-wangfeng@eswincomputing.com |
---|---|
State | New |
Headers | show |
Series | RISC-V: Support the ins "rol" with immediate operand | expand |
On 11/27/22 19:14, Feng Wang wrote: > From: wangfeng <wangfeng@eswincomputing.com> > > There is no Immediate operand of ins "rol" accroding to the B-ext, > so the immediate operand should be loaded into register at first. > But we can convert it to the ins "rori" or "roriw", and then one > immediate load ins can be reduced. > > Please refer to the following use cases: > unsigned long foo2(unsigned long rs1) > { > return (rs1 << 10) | (rs1 >> 54); > } > > The complier result is: > li a1,10 > rol a0,a0,a1 > > This patch will generate one ins > rori a0,a0,54 > > gcc/ChangeLog: > > * config/riscv/bitmanip.md: Add immediate_operand support in rotl RTL pattern > > gcc/testsuite/ChangeLog: > > * gcc.target/riscv/zbb-rol-ror-04.c: New test. > * gcc.target/riscv/zbb-rol-ror-05.c: New test. So this arrived after stage1 close and I'm not aware of an existing BZ around this issue, so I'd tend to think this should wait for stage1 to re-open in the spring. From a technical standpoint, would it be better to hand this in a more generic way? ie, when converting from gimple into RTL, if we want to generate a rotate left by immediate and don't have a suitable insn, then change it to a rotate right by an adjusted immediate. This could probably be done in optabs.cc::expand_binop. We might need similar code in combine.cc or simplify-rtx.cc since some rotate cases (or exposure of the constant) may not show up until later in the RTL pipeline. Anyway, doing this in a more generic way seems like it's worth investigating. jeff
on 2022-11-28 23:39 Jeff Law<jeffreyalaw@gmail.com> wrote: > > >On 11/27/22 19:14, Feng Wang wrote: >> From: wangfeng <wangfeng@eswincomputing.com> >> >> There is no Immediate operand of ins "rol" accroding to the B-ext, >> so the immediate operand should be loaded into register at first. >> But we can convert it to the ins "rori" or "roriw", and then one >> immediate load ins can be reduced. >> >> Please refer to the following use cases: >> unsigned long foo2(unsigned long rs1) >> { >> return (rs1 << 10) | (rs1 >> 54); >> } >> >> The complier result is: >> li a1,10 >> rol a0,a0,a1 >> >> This patch will generate one ins >> rori a0,a0,54 >> >> gcc/ChangeLog: >> >> * config/riscv/bitmanip.md: Add immediate_operand support in rotl RTL pattern >> >> gcc/testsuite/ChangeLog: >> >> * gcc.target/riscv/zbb-rol-ror-04.c: New test. >> * gcc.target/riscv/zbb-rol-ror-05.c: New test. > >So this arrived after stage1 close and I'm not aware of an existing BZ >around this issue, so I'd tend to think this should wait for stage1 to >re-open in the spring. > > > From a technical standpoint, would it be better to hand this in a more >generic way? ie, when converting from gimple into RTL, if we want to >generate a rotate left by immediate and don't have a suitable insn, then >change it to a rotate right by an adjusted immediate. This could >probably be done in optabs.cc::expand_binop. > > >We might need similar code in combine.cc or simplify-rtx.cc since some >rotate cases (or exposure of the constant) may not show up until later >in the RTL pipeline. > > >Anyway, doing this in a more generic way seems like it's worth >investigating. > > >jeff > Hi jeff, Thanks for your reply. In the currently it will judge the rotate shift number when converting from gimple into RTL, if the shift number bigger than mode_size/2, then reverse the rotate direction. I think the purpose of this process is to handle rotate shift quickly. I will think about your advice and try to modify it in the expand pass. Wang Feng Best regards
On 11/28/22 18:53, Feng Wang wrote: > on 2022-11-28 23:39 Jeff Law<jeffreyalaw@gmail.com> wrote: >> >> >> On 11/27/22 19:14, Feng Wang wrote: >>> From: wangfeng <wangfeng@eswincomputing.com> >>> >>> There is no Immediate operand of ins "rol" accroding to the B-ext, >>> so the immediate operand should be loaded into register at first. >>> But we can convert it to the ins "rori" or "roriw", and then one >>> immediate load ins can be reduced. >>> >>> Please refer to the following use cases: >>> unsigned long foo2(unsigned long rs1) >>> { >>> return (rs1 << 10) | (rs1 >> 54); >>> } >>> >>> The complier result is: >>> li a1,10 >>> rol a0,a0,a1 >>> >>> This patch will generate one ins >>> rori a0,a0,54 >>> >>> gcc/ChangeLog: >>> >>> * config/riscv/bitmanip.md: Add immediate_operand support in rotl RTL pattern >>> >>> gcc/testsuite/ChangeLog: >>> >>> * gcc.target/riscv/zbb-rol-ror-04.c: New test. >>> * gcc.target/riscv/zbb-rol-ror-05.c: New test. >> >> So this arrived after stage1 close and I'm not aware of an existing BZ >> around this issue, so I'd tend to think this should wait for stage1 to >> re-open in the spring. >> >> >> From a technical standpoint, would it be better to hand this in a more >> generic way? ie, when converting from gimple into RTL, if we want to >> generate a rotate left by immediate and don't have a suitable insn, then >> change it to a rotate right by an adjusted immediate. This could >> probably be done in optabs.cc::expand_binop. >> >> >> We might need similar code in combine.cc or simplify-rtx.cc since some >> rotate cases (or exposure of the constant) may not show up until later >> in the RTL pipeline. >> >> >> Anyway, doing this in a more generic way seems like it's worth >> investigating. >> >> >> jeff >> > Hi jeff, > > Thanks for your reply. In the currently it will judge the rotate shift number when converting from > gimple into RTL, if the shift number bigger than mode_size/2, then reverse the rotate direction. > I think the purpose of this process is to handle rotate shift quickly. I will think about your advice > and try to modify it in the expand pass. Yea, in the past there were targets where the cost of a shift or rotate was proportional to the number of bits shifted. So for a rotate in particular it was advantageous to reverse the rotation if the count was more than mode_size/2. I suspect such processors are a lot less common now than in the past. But we can probably utilize some of that code to suit our needs. Jeff
diff --git a/gcc/config/riscv/bitmanip.md b/gcc/config/riscv/bitmanip.md index d17133d58c1..cddfa2a4b19 100644 --- a/gcc/config/riscv/bitmanip.md +++ b/gcc/config/riscv/bitmanip.md @@ -300,25 +300,49 @@ (define_insn "rotlsi3" [(set (match_operand:SI 0 "register_operand" "=r") (rotate:SI (match_operand:SI 1 "register_operand" "r") - (match_operand:QI 2 "register_operand" "r")))] + (match_operand:QI 2 "arith_operand" "rI")))] "TARGET_ZBB" - "rol%~\t%0,%1,%2" + { + if (immediate_operand(operands[2], QImode)) + { + operands[2] = GEN_INT(GET_MODE_BITSIZE (SImode) - INTVAL(operands[2])); + return "rori\t%0,%1,%2"; + } + else + return "rol\t%0,%1,%2"; + } [(set_attr "type" "bitmanip")]) (define_insn "rotldi3" [(set (match_operand:DI 0 "register_operand" "=r") (rotate:DI (match_operand:DI 1 "register_operand" "r") - (match_operand:QI 2 "register_operand" "r")))] + (match_operand:QI 2 "arith_operand" "rI")))] "TARGET_64BIT && TARGET_ZBB" - "rol\t%0,%1,%2" + { + if (immediate_operand(operands[2], QImode)) + { + operands[2] = GEN_INT(GET_MODE_BITSIZE (DImode) - INTVAL(operands[2])); + return "rori\t%0,%1,%2"; + } + else + return "rol\t%0,%1,%2"; + } [(set_attr "type" "bitmanip")]) (define_insn "rotlsi3_sext" [(set (match_operand:DI 0 "register_operand" "=r") (sign_extend:DI (rotate:SI (match_operand:SI 1 "register_operand" "r") - (match_operand:QI 2 "register_operand" "r"))))] + (match_operand:QI 2 "arith_operand" "rI"))))] "TARGET_64BIT && TARGET_ZBB" - "rolw\t%0,%1,%2" + { + if (immediate_operand(operands[2], QImode)) + { + operands[2] = GEN_INT(GET_MODE_BITSIZE (SImode) - INTVAL(operands[2])); + return "roriw\t%0,%1,%2"; + } + else + return "rolw\t%0,%1,%2"; + } [(set_attr "type" "bitmanip")]) ;; orc.b (or-combine) is added as an unspec for the benefit of the support diff --git a/gcc/testsuite/gcc.target/riscv/zbb-rol-ror-04.c b/gcc/testsuite/gcc.target/riscv/zbb-rol-ror-04.c new file mode 100644 index 00000000000..23883cc3a5e --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/zbb-rol-ror-04.c @@ -0,0 +1,24 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gc_zbb -mabi=lp64d -O2" } */ +/* { dg-skip-if "" { *-*-* } { "-g" } } */ + +unsigned long foo1 (unsigned long rs1) +{ return (rs1 >> (34)) | (rs1 << 30); } + +unsigned long foo2(unsigned long rs1) +{ + return (rs1 << 10) | (rs1 >> 54); +} + +unsigned int foo3(unsigned int rs1) +{ + return (rs1 >> 20) | (rs1 << 12); +} + +unsigned int foo4(unsigned int rs1) +{ + return (rs1 << 10) | (rs1 >> 22); +} + +/* { dg-final { scan-assembler-times "rori\t" 2 } } */ +/* { dg-final { scan-assembler-times "roriw" 2 } } */ \ No newline at end of file diff --git a/gcc/testsuite/gcc.target/riscv/zbb-rol-ror-05.c b/gcc/testsuite/gcc.target/riscv/zbb-rol-ror-05.c new file mode 100644 index 00000000000..3e300a30b9b --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/zbb-rol-ror-05.c @@ -0,0 +1,15 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv32gc_zbb -mabi=ilp32" } */ +/* { dg-skip-if "" { rv64-*-* } { "-g" } } */ + +unsigned int foo1(unsigned int rs1) +{ + return (rs1 >> 20) | (rs1 << 12); +} + +unsigned int foo2(unsigned int rs1) +{ + return (rs1 << 10) | (rs1 >> 22); +} + +/* { dg-final { scan-assembler-times "rori" 2 } } */ \ No newline at end of file
From: wangfeng <wangfeng@eswincomputing.com> There is no Immediate operand of ins "rol" accroding to the B-ext, so the immediate operand should be loaded into register at first. But we can convert it to the ins "rori" or "roriw", and then one immediate load ins can be reduced. Please refer to the following use cases: unsigned long foo2(unsigned long rs1) { return (rs1 << 10) | (rs1 >> 54); } The complier result is: li a1,10 rol a0,a0,a1 This patch will generate one ins rori a0,a0,54 gcc/ChangeLog: * config/riscv/bitmanip.md: Add immediate_operand support in rotl RTL pattern gcc/testsuite/ChangeLog: * gcc.target/riscv/zbb-rol-ror-04.c: New test. * gcc.target/riscv/zbb-rol-ror-05.c: New test. --- gcc/config/riscv/bitmanip.md | 36 +++++++++++++++---- .../gcc.target/riscv/zbb-rol-ror-04.c | 24 +++++++++++++ .../gcc.target/riscv/zbb-rol-ror-05.c | 15 ++++++++ 3 files changed, 69 insertions(+), 6 deletions(-) create mode 100644 gcc/testsuite/gcc.target/riscv/zbb-rol-ror-04.c create mode 100644 gcc/testsuite/gcc.target/riscv/zbb-rol-ror-05.c