Message ID | 20231030032122.757369-1-vineetg@rivosinc.com |
---|---|
State | New |
Headers | show |
Series | [v3] RISC-V: elide unnecessary sign extend when expanding cmp_and_jump | expand |
On 10/29/23 20:21, Vineet Gupta wrote: > RV64 compare and branch instructions only support 64-bit operands. > At Expand time, the backend conservatively zero/sign extends > its operands even if not needed, such as incoming 32-bit function args > which ABI/ISA guarantee to be sign-extended already. > > And subsequently REE fails to eliminate them as > "missing defintion(s)" or "multiple definition(s) > since function args don't have explicit definition. > > So during expand riscv_extend_comparands (), if an operand is a > subreg-promoted SI with inner DI, which is representative of a function > arg, just peel away the subreg to expose the DI, eliding the sign > extension. As Jeff noted this routine is also used in if-conversion so > also helps there. > > Note there's currently patches floating around to improve REE and also a > new pass to eliminate unneccesary extensions, but it is still beneficial > to not generate those extra extensions in first place. It is obviously > less work for post-reload passes such as REE, but even for earlier > passes, such as combine, having to deal with one less thing and ensuing > fewer combinations is a win too. > > Way too many existing tests used to observe this issue. > e.g. gcc.c-torture/compile/20190827-1.c -O2 -march=rv64gc > It elimiates the SEXT.W > > Tested with rv64gc with no regressions, I'm relying on PAtrick's > pre-commit CI to do the full testing. Testing on the pre-commit CI has completed. https://github.com/ewlu/gcc-precommit-ci/issues/499#issuecomment-1784446631 The patch was applied to this baseline: https://github.com/gcc-mirror/gcc/commit/c6929b085580cf00cbc52b0f5b0afe2b9caa2a22 and no new failures or resolved failures were found when running the testsuite. Tested-by: Patrick O'Neill <patrick@rivosinc.com> Thanks! Patrick > > gcc/ChangeLog: > * config/riscv/riscv.cc (riscv_sign_extend_if_not_subreg_prom): New. > * (riscv_extend_comparands): Call New function on operands. > > Signed-off-by: Vineet Gupta <vineetg@rivosinc.com> > --- > Changes since v2: > - Fix linting issues flagged by pre-commit CI > Changes since v1: > - Elide sign extension for 32-bit operarnds only > - Apply elison for both arguments > --- > gcc/config/riscv/riscv.cc | 23 +++++++++++++++++++++-- > 1 file changed, 21 insertions(+), 2 deletions(-) > > diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc > index ca9a2ca81d53..269beb3b159b 100644 > --- a/gcc/config/riscv/riscv.cc > +++ b/gcc/config/riscv/riscv.cc > @@ -3678,6 +3678,24 @@ riscv_zero_if_equal (rtx cmp0, rtx cmp1) > cmp0, cmp1, 0, 0, OPTAB_DIRECT); > } > > +/* Helper function for riscv_extend_comparands to Sign-extend the OP. > + However if the OP is SI subreg promoted with an inner DI, such as > + (subreg/s/v:SI (reg/v:DI) 0) > + just peel off the SUBREG to get DI, avoiding extraneous extension. */ > + > +static void > +riscv_sign_extend_if_not_subreg_prom (rtx *op) > +{ > + if (GET_MODE (*op) == SImode > + && GET_CODE (*op) == SUBREG > + && SUBREG_PROMOTED_VAR_P (*op) > + && GET_MODE_SIZE (GET_MODE (XEXP (*op, 0))).to_constant () > + == GET_MODE_SIZE (word_mode)) > + *op = XEXP (*op, 0); > + else > + *op = gen_rtx_SIGN_EXTEND (word_mode, *op); > +} > + > /* Sign- or zero-extend OP0 and OP1 for integer comparisons. */ > > static void > @@ -3707,9 +3725,10 @@ riscv_extend_comparands (rtx_code code, rtx *op0, rtx *op1) > } > else > { > - *op0 = gen_rtx_SIGN_EXTEND (word_mode, *op0); > + riscv_sign_extend_if_not_subreg_prom (op0); > + > if (*op1 != const0_rtx) > - *op1 = gen_rtx_SIGN_EXTEND (word_mode, *op1); > + riscv_sign_extend_if_not_subreg_prom (op1); > } > } > }
On 10/29/23 21:21, Vineet Gupta wrote: > RV64 compare and branch instructions only support 64-bit operands. > At Expand time, the backend conservatively zero/sign extends > its operands even if not needed, such as incoming 32-bit function args > which ABI/ISA guarantee to be sign-extended already. > > And subsequently REE fails to eliminate them as > "missing defintion(s)" or "multiple definition(s) > since function args don't have explicit definition. > > So during expand riscv_extend_comparands (), if an operand is a > subreg-promoted SI with inner DI, which is representative of a function > arg, just peel away the subreg to expose the DI, eliding the sign > extension. As Jeff noted this routine is also used in if-conversion so > also helps there. > > Note there's currently patches floating around to improve REE and also a > new pass to eliminate unneccesary extensions, but it is still beneficial > to not generate those extra extensions in first place. It is obviously > less work for post-reload passes such as REE, but even for earlier > passes, such as combine, having to deal with one less thing and ensuing > fewer combinations is a win too. > > Way too many existing tests used to observe this issue. > e.g. gcc.c-torture/compile/20190827-1.c -O2 -march=rv64gc > It elimiates the SEXT.W > > Tested with rv64gc with no regressions, I'm relying on PAtrick's > pre-commit CI to do the full testing. > > gcc/ChangeLog: > * config/riscv/riscv.cc (riscv_sign_extend_if_not_subreg_prom): New. > * (riscv_extend_comparands): Call New function on operands. > > Signed-off-by: Vineet Gupta <vineetg@rivosinc.com> > --- > Changes since v2: > - Fix linting issues flagged by pre-commit CI > Changes since v1: > - Elide sign extension for 32-bit operarnds only > - Apply elison for both arguments > --- > gcc/config/riscv/riscv.cc | 23 +++++++++++++++++++++-- > 1 file changed, 21 insertions(+), 2 deletions(-) > > diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc > index ca9a2ca81d53..269beb3b159b 100644 > --- a/gcc/config/riscv/riscv.cc > +++ b/gcc/config/riscv/riscv.cc > @@ -3678,6 +3678,24 @@ riscv_zero_if_equal (rtx cmp0, rtx cmp1) > cmp0, cmp1, 0, 0, OPTAB_DIRECT); > } > > +/* Helper function for riscv_extend_comparands to Sign-extend the OP. > + However if the OP is SI subreg promoted with an inner DI, such as > + (subreg/s/v:SI (reg/v:DI) 0 > + just peel off the SUBREG to get DI, avoiding extraneous extension. */ > + > +static void > +riscv_sign_extend_if_not_subreg_prom (rtx *op) > +{ > + if (GET_MODE (*op) == SImode > + && GET_CODE (*op) == SUBREG > + && SUBREG_PROMOTED_VAR_P (*op) > + && GET_MODE_SIZE (GET_MODE (XEXP (*op, 0))).to_constant () > + == GET_MODE_SIZE (word_mode)) > + *op = XEXP (*op, 0); > + else > + *op = gen_rtx_SIGN_EXTEND (word_mode, *op); So for the wrapped test GET_MODE_SIZE stuff), add parenthesis and indent the "==" clause. ie && (GET_MODE_SIZE (GET_MODE (XEXP (*op), 0))).to_constant () == GET_MODE_SIZE (word_mode)) Don't you also need to verify that the subreg was sign extended? The PROMOTED_VAR_P just notes that it was promoted, not *how* it was promoted. I think you just need to add a test like this: && SUBREG_PROMOTED_SIGNED_P (*op) I don't guess you have data on how this impacts dynamic instruction counts on anything significant do you? OK with the formatting nit fixed and adding the additional check to ensure the value was sign extended. jeff
On 10/30/23 13:33, Jeff Law wrote: > > > On 10/29/23 21:21, Vineet Gupta wrote: >> RV64 compare and branch instructions only support 64-bit operands. >> At Expand time, the backend conservatively zero/sign extends >> its operands even if not needed, such as incoming 32-bit function args >> which ABI/ISA guarantee to be sign-extended already. >> >> And subsequently REE fails to eliminate them as >> "missing defintion(s)" or "multiple definition(s) >> since function args don't have explicit definition. >> >> So during expand riscv_extend_comparands (), if an operand is a >> subreg-promoted SI with inner DI, which is representative of a function >> arg, just peel away the subreg to expose the DI, eliding the sign >> extension. As Jeff noted this routine is also used in if-conversion so >> also helps there. >> >> Note there's currently patches floating around to improve REE and also a >> new pass to eliminate unneccesary extensions, but it is still beneficial >> to not generate those extra extensions in first place. It is obviously >> less work for post-reload passes such as REE, but even for earlier >> passes, such as combine, having to deal with one less thing and ensuing >> fewer combinations is a win too. >> >> Way too many existing tests used to observe this issue. >> e.g. gcc.c-torture/compile/20190827-1.c -O2 -march=rv64gc >> It elimiates the SEXT.W >> >> Tested with rv64gc with no regressions, I'm relying on PAtrick's >> pre-commit CI to do the full testing. >> >> gcc/ChangeLog: >> * config/riscv/riscv.cc (riscv_sign_extend_if_not_subreg_prom): New. >> * (riscv_extend_comparands): Call New function on operands. >> >> Signed-off-by: Vineet Gupta <vineetg@rivosinc.com> >> --- >> Changes since v2: >> - Fix linting issues flagged by pre-commit CI >> Changes since v1: >> - Elide sign extension for 32-bit operarnds only >> - Apply elison for both arguments >> --- >> gcc/config/riscv/riscv.cc | 23 +++++++++++++++++++++-- >> 1 file changed, 21 insertions(+), 2 deletions(-) >> >> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc >> index ca9a2ca81d53..269beb3b159b 100644 >> --- a/gcc/config/riscv/riscv.cc >> +++ b/gcc/config/riscv/riscv.cc >> @@ -3678,6 +3678,24 @@ riscv_zero_if_equal (rtx cmp0, rtx cmp1) >> cmp0, cmp1, 0, 0, OPTAB_DIRECT); >> } >> +/* Helper function for riscv_extend_comparands to Sign-extend the OP. >> + However if the OP is SI subreg promoted with an inner DI, such as >> + (subreg/s/v:SI (reg/v:DI) 0 >> + just peel off the SUBREG to get DI, avoiding extraneous >> extension. */ >> + >> +static void >> +riscv_sign_extend_if_not_subreg_prom (rtx *op) >> +{ >> + if (GET_MODE (*op) == SImode >> + && GET_CODE (*op) == SUBREG >> + && SUBREG_PROMOTED_VAR_P (*op) >> + && GET_MODE_SIZE (GET_MODE (XEXP (*op, 0))).to_constant () >> + == GET_MODE_SIZE (word_mode)) >> + *op = XEXP (*op, 0); >> + else >> + *op = gen_rtx_SIGN_EXTEND (word_mode, *op); > So for the wrapped test GET_MODE_SIZE stuff), add parenthesis and > indent the "==" clause. ie > > && (GET_MODE_SIZE (GET_MODE (XEXP (*op), 0))).to_constant () > == GET_MODE_SIZE (word_mode)) Ok. FWIW I was using the wrong checker: git_check_commit.py vs. check_GNU_style.sh > > Don't you also need to verify that the subreg was sign extended? The > PROMOTED_VAR_P just notes that it was promoted, not *how* it was > promoted. I think you just need to add a test like this: > > && SUBREG_PROMOTED_SIGNED_P (*op) Thx for catching this. The orig test case I used to spot the issue had an unsigned promoted subreg but I was convinced it could still be removed (wrong on so many counts). > I don't guess you have data on how this impacts dynamic instruction > counts on anything significant do you? No, haven't run it yet. I can fire one though. I doubt if this is as significant as the prev one, even if this is the right thing to do. > > OK with the formatting nit fixed and adding the additional check to > ensure the value was sign extended. Thx. I just wait for SPEC run before pushing this. -Vineet
On 10/30/23 16:21, Vineet Gupta wrote: >> I don't guess you have data on how this impacts dynamic instruction >> counts on anything significant do you? > > No, haven't run it yet. I can fire one though. I doubt if this is as > significant as the prev one, even if this is the right thing to do. Very very small improvement overall 49,318,030,233,258 (w/o patch) 49,318,000,693,233 (W/ patch)
On 10/30/23 13:33, Jeff Law wrote: > >> +/* Helper function for riscv_extend_comparands to Sign-extend the OP. >> + However if the OP is SI subreg promoted with an inner DI, such as >> + (subreg/s/v:SI (reg/v:DI) 0 >> + just peel off the SUBREG to get DI, avoiding extraneous >> extension. */ >> + >> +static void >> +riscv_sign_extend_if_not_subreg_prom (rtx *op) >> +{ >> + if (GET_MODE (*op) == SImode So I may have been partially wrong about v2 patch being wrong and needing this fixup ;-) [1] It seems we don't have to limit this to SImode. I re-read the calling convention doc [2] and it says this "When passed in registers or on the stack, integer scalars narrower than XLEN bits are widened according to the sign of their type up to 32 bits, then sign-extended to XLEN bits." This essentially means signed short, and signed char will be already sign-extended at caller site and need not be done in callee: Palmer mention in internal slack that unadorned char is unsigned on RISC-V hence we don't see the compiler extra work for say gcc.dg/torture/pr75964.c. If the test is however tweaked to use signed chars (or short), it seems caller is doing the work (adjusting the constant being passed to be a sign-extended variant). This further validates Jeff's comment about checking for SUBREG_PROMOTED_SIGNED_P (it was anyhow the right thing to begin with anyways). At this point I feel like I'm into splitting hairs (in vain) territory, as fixing this might not matter much in practice .... I'd suppose we go ahead with the v3 with changes Jeff asked for and maybe do a later fixup to relax SI. >> + && GET_CODE (*op) == SUBREG >> + && SUBREG_PROMOTED_VAR_P (*op) >> + && GET_MODE_SIZE (GET_MODE (XEXP (*op, 0))).to_constant () >> + == GET_MODE_SIZE (word_mode)) >> + *op = XEXP (*op, 0); >> + else >> + *op = gen_rtx_SIGN_EXTEND (word_mode, *op); > So for the wrapped test GET_MODE_SIZE stuff), add parenthesis and > indent the "==" clause. ie > > && (GET_MODE_SIZE (GET_MODE (XEXP (*op), 0))).to_constant () > == GET_MODE_SIZE (word_mode)) > > Don't you also need to verify that the subreg was sign extended? The > PROMOTED_VAR_P just notes that it was promoted, not *how* it was > promoted. I think you just need to add a test like this: > > && SUBREG_PROMOTED_SIGNED_P (*op) [1] https://gcc.gnu.org/pipermail/gcc-patches/2023-October/634327.html [2] https://github.com/riscv-non-isa/riscv-elf-psabi-doc/blob/master/riscv-cc.adoc <https://github.com/riscv-non-isa/riscv-elf-psabi-doc/blob/master/riscv-cc.adoc>
On 10/31/23 18:05, Vineet Gupta wrote: > On 10/30/23 13:33, Jeff Law wrote: >> >>> +/* Helper function for riscv_extend_comparands to Sign-extend the OP. >>> + However if the OP is SI subreg promoted with an inner DI, such as >>> + (subreg/s/v:SI (reg/v:DI) 0 >>> + just peel off the SUBREG to get DI, avoiding extraneous >>> extension. */ >>> + >>> +static void >>> +riscv_sign_extend_if_not_subreg_prom (rtx *op) >>> +{ >>> + if (GET_MODE (*op) == SImode > > So I may have been partially wrong about v2 patch being wrong and > needing this fixup ;-) [1] > > It seems we don't have to limit this to SImode. I re-read the calling > convention doc [2] and it says this > > "When passed in registers or on the stack, integer scalars narrower than > XLEN > bits are widened according to the sign of their type up to 32 bits, then > sign-extended to XLEN bits." > > This essentially means signed short, and signed char will be already > sign-extended at caller site and need not be done in callee: Palmer > mention in internal slack that unadorned char is unsigned on RISC-V > hence we don't see the compiler extra work for say > gcc.dg/torture/pr75964.c. If the test is however tweaked to use signed > chars (or short), it seems caller is doing the work (adjusting the > constant being passed to be a sign-extended variant). > > This further validates Jeff's comment about checking for > SUBREG_PROMOTED_SIGNED_P (it was anyhow the right thing to begin with > anyways). > > At this point I feel like I'm into splitting hairs (in vain) territory, > as fixing this might not matter much in practice ... > > I'd suppose we go ahead with the v3 with changes Jeff asked for and > maybe do a later fixup to relax SI. Consider any such fixup pre-approved. I was thinking that the 8/16 bit sub-objects should probably be extended one way or another. It wouldn't make much sense not to. Not as much of a win as I'd hoped. But I'll take it jeff
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc index ca9a2ca81d53..269beb3b159b 100644 --- a/gcc/config/riscv/riscv.cc +++ b/gcc/config/riscv/riscv.cc @@ -3678,6 +3678,24 @@ riscv_zero_if_equal (rtx cmp0, rtx cmp1) cmp0, cmp1, 0, 0, OPTAB_DIRECT); } +/* Helper function for riscv_extend_comparands to Sign-extend the OP. + However if the OP is SI subreg promoted with an inner DI, such as + (subreg/s/v:SI (reg/v:DI) 0) + just peel off the SUBREG to get DI, avoiding extraneous extension. */ + +static void +riscv_sign_extend_if_not_subreg_prom (rtx *op) +{ + if (GET_MODE (*op) == SImode + && GET_CODE (*op) == SUBREG + && SUBREG_PROMOTED_VAR_P (*op) + && GET_MODE_SIZE (GET_MODE (XEXP (*op, 0))).to_constant () + == GET_MODE_SIZE (word_mode)) + *op = XEXP (*op, 0); + else + *op = gen_rtx_SIGN_EXTEND (word_mode, *op); +} + /* Sign- or zero-extend OP0 and OP1 for integer comparisons. */ static void @@ -3707,9 +3725,10 @@ riscv_extend_comparands (rtx_code code, rtx *op0, rtx *op1) } else { - *op0 = gen_rtx_SIGN_EXTEND (word_mode, *op0); + riscv_sign_extend_if_not_subreg_prom (op0); + if (*op1 != const0_rtx) - *op1 = gen_rtx_SIGN_EXTEND (word_mode, *op1); + riscv_sign_extend_if_not_subreg_prom (op1); } } }
RV64 compare and branch instructions only support 64-bit operands. At Expand time, the backend conservatively zero/sign extends its operands even if not needed, such as incoming 32-bit function args which ABI/ISA guarantee to be sign-extended already. And subsequently REE fails to eliminate them as "missing defintion(s)" or "multiple definition(s) since function args don't have explicit definition. So during expand riscv_extend_comparands (), if an operand is a subreg-promoted SI with inner DI, which is representative of a function arg, just peel away the subreg to expose the DI, eliding the sign extension. As Jeff noted this routine is also used in if-conversion so also helps there. Note there's currently patches floating around to improve REE and also a new pass to eliminate unneccesary extensions, but it is still beneficial to not generate those extra extensions in first place. It is obviously less work for post-reload passes such as REE, but even for earlier passes, such as combine, having to deal with one less thing and ensuing fewer combinations is a win too. Way too many existing tests used to observe this issue. e.g. gcc.c-torture/compile/20190827-1.c -O2 -march=rv64gc It elimiates the SEXT.W Tested with rv64gc with no regressions, I'm relying on PAtrick's pre-commit CI to do the full testing. gcc/ChangeLog: * config/riscv/riscv.cc (riscv_sign_extend_if_not_subreg_prom): New. * (riscv_extend_comparands): Call New function on operands. Signed-off-by: Vineet Gupta <vineetg@rivosinc.com> --- Changes since v2: - Fix linting issues flagged by pre-commit CI Changes since v1: - Elide sign extension for 32-bit operarnds only - Apply elison for both arguments --- gcc/config/riscv/riscv.cc | 23 +++++++++++++++++++++-- 1 file changed, 21 insertions(+), 2 deletions(-)