Message ID | 20231025050155.627837-1-vineetg@rivosinc.com |
---|---|
State | New |
Headers | show |
Series | [RFC] RISC-V: elide sign extend when expanding cmp_and_jump | expand |
Hi Vineet, I was thinking of two things while skimming the code: - Couldn't we do this in the expanders directly? Or is the subreg-promoted info gone until we reach that? - Should some common-code part be more suited to handle that? We already elide redundant sign-zero extensions for other reasons. Maybe we could add subreg promoted handling there? Regards Robin
On 10/25/23 01:12, Robin Dapp wrote: > Hi Vineet, > > I was thinking of two things while skimming the code: > > - Couldn't we do this in the expanders directly? Or is the > subreg-promoted info gone until we reach that? Well, it doesn't seem like there's a lot of difference between doing it in the generic expander bits vs target expander bits -- the former just calls into the latter for the most part. Thus if the subreg-promoted state is available in the target expander, I'd expect it to be available in the generic expander. It may be the case that we have more places to fix because we have different expander paths (think conditional branches, conditional moves, sCC and probably others). By catching it in riscv_expand_comparands he caught a nice little choke point. I think what Vineet has done will also work for RTL if-conversion. > > - Should some common-code part be more suited to handle that? > We already elide redundant sign-zero extensions for other > reasons. Maybe we could add subreg promoted handling there? Unsure. I wouldn't be surprised if we were able to find similar code in simplify-rtx or something like that. It's probably worth a quick looksie. I also wonder if Vineet's work would subsume this local change. I've been meaning to find testcases for this and determine if we should drop it or clean it up and submit it upstream: > +(define_insn "*branch<mode>_equals_zero" > + [(set (pc) > + (if_then_else > + (match_operator 1 "equality_operator" > + [(match_operand:ANYI 2 "register_operand" "r") > + (const_int 0)]) > + (label_ref (match_operand 0 "" "")) > + (pc)))] > + "!partial_subreg_p (operands[2])" > + "b%C1\t%2,zero,%0" > + [(set_attr "type" "branch") > + (set_attr "mode" "none")]) My sense is it's just papering over a missed simplification elsewhere. Jeff
> Well, it doesn't seem like there's a lot of difference between doing > it in the generic expander bits vs target expander bits -- the former > just calls into the latter for the most part. Thus if the > subreg-promoted state is available in the target expander, I'd expect > it to be available in the generic expander. Ah, sorry I meant in the [sign|zero]extend expanders rather than the compare expanders in order to catch promoted subregs from other origins as well. Maybe that doesn't work, though? Regards Robin
On 10/25/23 07:47, Robin Dapp wrote: > >> Well, it doesn't seem like there's a lot of difference between doing >> it in the generic expander bits vs target expander bits -- the former >> just calls into the latter for the most part. Thus if the >> subreg-promoted state is available in the target expander, I'd expect >> it to be available in the generic expander. > > Ah, sorry I meant in the [sign|zero]extend expanders rather than the > compare expanders in order to catch promoted subregs from other origins > as well. Maybe that doesn't work, though? That's a *really* interesting idea. Interested in playing around a bit with that Vineet? jeff
Hey Robin, On 10/25/23 00:12, Robin Dapp wrote: > Hi Vineet, > > I was thinking of two things while skimming the code: > > - Couldn't we do this in the expanders directly? Or is the > subreg-promoted info gone until we reach that? Following is the call stack involved: expand_gimple_cond do_compare_and_jump emit_cmp_and_jump_insns gen_cbranchqi4 riscv_expand_conditional_branch riscv_emit_int_compare riscv_extend_comparands Last function is what introduces the extraneous sign extends, w/o taking subreg-promoted into consideration and what my patch attempts to address. > - Should some common-code part be more suited to handle that? > We already elide redundant sign-zero extensions for other > reasons. Maybe we could add subreg promoted handling there? Not in the context of this specific issue. -Vineet
On 10/25/23 10:25, Vineet Gupta wrote: > Hey Robin, > > On 10/25/23 00:12, Robin Dapp wrote: >> Hi Vineet, >> >> I was thinking of two things while skimming the code: >> >> - Couldn't we do this in the expanders directly? Or is the >> subreg-promoted info gone until we reach that? > > Following is the call stack involved: > > expand_gimple_cond > do_compare_and_jump > emit_cmp_and_jump_insns > gen_cbranchqi4 > riscv_expand_conditional_branch > riscv_emit_int_compare > riscv_extend_comparands > > > Last function is what introduces the extraneous sign extends, w/o taking > subreg-promoted into consideration and what my patch attempts to address. > >> - Should some common-code part be more suited to handle that? >> We already elide redundant sign-zero extensions for other >> reasons. Maybe we could add subreg promoted handling there? > > Not in the context of this specific issue. Robin's point (IIUC) is that if we put this logic into a zero/sign extend expander, then it'll get used for *any* attempt to zero/sign extend that goes through the target expander. It doesn't work for your case because we use gen_rtx_{ZERO,SIGN}_EXTEND directly. But if those were adjusted to use the expander, then Robin's idea would be applicable to this case too. Jeff
On 10/25/23 09:30, Jeff Law wrote: >>> - Should some common-code part be more suited to handle that? >>> We already elide redundant sign-zero extensions for other >>> reasons. Maybe we could add subreg promoted handling there? >> >> Not in the context of this specific issue. > Robin's point (IIUC) is that if we put this logic into a zero/sign > extend expander, then it'll get used for *any* attempt to zero/sign > extend that goes through the target expander. > > It doesn't work for your case because we use > gen_rtx_{ZERO,SIGN}_EXTEND directly. But if those were adjusted to > use the expander, then Robin's idea would be applicable to this case too Understood. Definitely solid idea. -Vineet
On 10/25/23 06:52, Jeff Law wrote: > > On 10/25/23 07:47, Robin Dapp wrote: >> >>> Well, it doesn't seem like there's a lot of difference between doing >>> it in the generic expander bits vs target expander bits -- the former >>> just calls into the latter for the most part. Thus if the >>> subreg-promoted state is available in the target expander, I'd expect >>> it to be available in the generic expander. >> >> Ah, sorry I meant in the [sign|zero]extend expanders rather than the >> compare expanders in order to catch promoted subregs from other origins >> as well. Maybe that doesn't work, though? > That's a *really* interesting idea. Interested in playing around a > bit with that Vineet? Sure I'll tinker with the {sign,zero}expanders. And there's a third playing field :-) There seem to be still cases where subreg-promoted note is not set when it probably should. So we end up in riscv_extend_comparands but with note not being there (for something corresponding to function arg) thus can't skip the extension. Thx, -Vineet
On 10/24/23 22:01, Vineet Gupta wrote: > RV64 comapre and branch instructions only support 64-bit operands. > The backend unconditionally generates zero/sign extend at Expand time > for compare and branch operands even if they are already as such > e.g. function args which ABI guarantees to be sign-extended (at callsite). > > And subsequently REE fails to eliminate them as > "missing defintion(s)" > specially for function args since they show up as missing explicit > definition. > > So elide the sign extensions at Expand time for a subreg promoted var > with inner word sized value whcih doesn't need explicit sign extending > (fairly good represntative of an incoming function arg). > > There are patches floating to enhance REE and/or new passes to elimiate > extensions, but it is always better to not generate them if possible, > given Expand is so early, the elimiated extend would help improve outcomes > of later passes such as combine if they have fewer operations/combinations > to deal with. > > The test used was existing gcc.c-torture/execute/20020506-1.c -O2 zbb > It elimiates the SEXT.W and an additional branch > > before after > ------- ------ > test2: test2: > sext.b a0,a0 > blt a0,zero,.L15 > bne a1,zero,.L17 bne a1,zero,.L17 > > This is marked RFC as I only ran a spot check with a directed test and > want to use Patrick's pre-commit CI to do the A/B testing for me. > > gcc/ChangeLog: > * config/riscv/riscv.cc (riscv_extend_comparands): Don't > sign-extend operand if subreg promoted with inner word size. > > Signed-off-by: Vineet Gupta <vineetg@rivosinc.com> > --- > gcc/config/riscv/riscv.cc | 11 ++++++++++- > 1 file changed, 10 insertions(+), 1 deletion(-) > > diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc > index f2dcb0db6fbd..a8d12717e43d 100644 > --- a/gcc/config/riscv/riscv.cc > +++ b/gcc/config/riscv/riscv.cc > @@ -3707,7 +3707,16 @@ riscv_extend_comparands (rtx_code code, rtx *op0, rtx *op1) > } > else > { > - *op0 = gen_rtx_SIGN_EXTEND (word_mode, *op0); > + /* A subreg promoted SI of DI could be peeled to expose DI, eliding > + an unnecessary sign extension. */ > + if (GET_CODE (*op0) == SUBREG > + && SUBREG_PROMOTED_VAR_P (*op0) > + && GET_MODE_SIZE (GET_MODE (XEXP (*op0, 0))).to_constant () > + == GET_MODE_SIZE (word_mode)) > + *op0 = XEXP (*op0, 0); > + else > + *op0 = gen_rtx_SIGN_EXTEND (word_mode, *op0); > + > if (*op1 != const0_rtx) > *op1 = gen_rtx_SIGN_EXTEND (word_mode, *op1); > } Just a quick update that testing revealed additional failures and unpacking which took me a while and in hindsight embarrassing. I was misunderstanding what ABI guarantees and what ISA actually does :-) The ABI guarantees sign extension this for 32-bit things in 64-bit reg (so in rv64, int), but *not* 8-bit things in a reg. i.e. not for char arguments. e.g. uint8_t abs8(uint8_t x) { if (x & 0x80) // SEXT.b a4, a0 ... } main() { if (abs8(128) != 127) // LI a0, 128 => ADDI a0, x0, 128 __builtin_abort(); } So my patch was optimizing away the SEXT.B (despite claiming that subreg prom of SI....) which is wrong. Sure ADDI in main () sign-extends, but it does that for 12-bit imm 0x080, which comes out to 0x80, but sign-extending for char scope 0x80 would yield 0xFFFF....0x80. Hence the issue. I'll spin a v2 after more testing. This is slightly disappointing as it reduces the scope of optimization, but correctness in this trade is non-negotiable :-) -Vineet
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc index f2dcb0db6fbd..a8d12717e43d 100644 --- a/gcc/config/riscv/riscv.cc +++ b/gcc/config/riscv/riscv.cc @@ -3707,7 +3707,16 @@ riscv_extend_comparands (rtx_code code, rtx *op0, rtx *op1) } else { - *op0 = gen_rtx_SIGN_EXTEND (word_mode, *op0); + /* A subreg promoted SI of DI could be peeled to expose DI, eliding + an unnecessary sign extension. */ + if (GET_CODE (*op0) == SUBREG + && SUBREG_PROMOTED_VAR_P (*op0) + && GET_MODE_SIZE (GET_MODE (XEXP (*op0, 0))).to_constant () + == GET_MODE_SIZE (word_mode)) + *op0 = XEXP (*op0, 0); + else + *op0 = gen_rtx_SIGN_EXTEND (word_mode, *op0); + if (*op1 != const0_rtx) *op1 = gen_rtx_SIGN_EXTEND (word_mode, *op1); }
RV64 comapre and branch instructions only support 64-bit operands. The backend unconditionally generates zero/sign extend at Expand time for compare and branch operands even if they are already as such e.g. function args which ABI guarantees to be sign-extended (at callsite). And subsequently REE fails to eliminate them as "missing defintion(s)" specially for function args since they show up as missing explicit definition. So elide the sign extensions at Expand time for a subreg promoted var with inner word sized value whcih doesn't need explicit sign extending (fairly good represntative of an incoming function arg). There are patches floating to enhance REE and/or new passes to elimiate extensions, but it is always better to not generate them if possible, given Expand is so early, the elimiated extend would help improve outcomes of later passes such as combine if they have fewer operations/combinations to deal with. The test used was existing gcc.c-torture/execute/20020506-1.c -O2 zbb It elimiates the SEXT.W and an additional branch before after ------- ------ test2: test2: sext.b a0,a0 blt a0,zero,.L15 bne a1,zero,.L17 bne a1,zero,.L17 This is marked RFC as I only ran a spot check with a directed test and want to use Patrick's pre-commit CI to do the A/B testing for me. gcc/ChangeLog: * config/riscv/riscv.cc (riscv_extend_comparands): Don't sign-extend operand if subreg promoted with inner word size. Signed-off-by: Vineet Gupta <vineetg@rivosinc.com> --- gcc/config/riscv/riscv.cc | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-)