Message ID | 2c7cafc5-aaf7-40de-93a4-80fb3bfd07d5@gjlay.de |
---|---|
State | New |
Headers | show |
Series | [avr] Overhaul avr-ifelse RTL optimization pass | expand |
On Fri, Aug 23, 2024 at 2:16 PM Georg-Johann Lay <avr@gjlay.de> wrote: > > This patch overhauls the avr-ifelse mini-pass that optimizes > two cbranch insns to one comparison and two branches. > > More optimization opportunities are realized, and the code > has been refactored. > > No new regressions. Ok for trunk? > > There is currently no avr maintainer, so some global reviewer > might please have a look at this. I see Denis still listed? Possibly Jeff can have a look though. > And one question I have: avr_optimize_2ifelse() is rewiring > basic blocks using redirect_edge_and_branch(). Does this > require extra pass flags or actions? Currently the RTL_PASS > data reads: It probably depends where the pass is inserted, but if it didn't blow up spectacularly in your testing it should be fine as-is. > static const pass_data avr_pass_data_ifelse = > { > RTL_PASS, // type > "", // name (will be patched) > OPTGROUP_NONE, // optinfo_flags > TV_DF_SCAN, // tv_id > 0, // properties_required > 0, // properties_provided > 0, // properties_destroyed > 0, // todo_flags_start > TODO_df_finish | TODO_df_verify // todo_flags_finish > }; > > > Johann > > p.s. The additional notes on compare-elim / PR115830 can be found > here (pending review): > > https://gcc.gnu.org/pipermail/gcc-patches/2024-August/660743.html > > -- > > AVR: Overhaul the avr-ifelse RTL optimization pass. > > Mini-pass avr-ifelse realizes optimizations that replace two cbranch > insns with one comparison and two branches. This patch adds the > following improvements: > > - The right operand of the comparisons may also be REGs. > Formerly only CONST_INT was handled. > > - The code of the first comparison in no more restricted > to (effectively) EQ. > > - When the second cbranch is located in the fallthrough path > of the first cbranch, then difficult (expensive) comparisons > can always be avoided. This may require to swap the branch > targets. (When the second cbranch if located after the target > label of the first one, then getting rid of difficult branches > would require to reorder blocks.) > > - The code has been cleaned up: avr_rest_of_handle_ifelse() now > just scans the insn stream for optimization candidates. The code > that actually performs the transformation has been outsourced to > the new function avr_optimize_2ifelse(). > > - The code to find a better representation for reg-const_int comparisons > has been split into two parts: First try to find codes such that the > right-hand sides of the comparisons are the same (avr_2comparisons_rhs). > When this succeeds then one comparison can serve two branches, and > avr_redundant_compare() tries to get rid of difficult branches that > may have been introduced by avr_2comparisons_rhs(). This is always > possible when the second cbranch is located in the fallthrough path > of the first one, or when the first code is EQ. > > Some final notes on why we don't use compare-elim: 1) The two cbranch > insns may come with different scratch operands depending on the chosen > constraint alternatives. There are cases where the outgoing comparison > requires a scratch but only one incoming cbranch has one. 2) Avoiding > difficult branches can be achieved by rewiring basic blocks. > compare-elim doesn't do that; it doesn't even know the costs of the > branch codes. 3) avr_2comparisons_rhs() may de-canonicalize a > comparison to achieve its goal. compare-elim doesn't know how to do > that. 4) There are more reasons, see for example the commit > message and discussion for PR115830. > > gcc/ > * config/avr/avr.cc (cfganal.h): Include it. > (avr_2comparisons_rhs, avr_redundant_compare_regs) > (avr_strict_signed_p, avr_strict_unsigned_p): New static functions. > (avr_redundant_compare): Overhaul: Allow more cases. > (avr_optimize_2ifelse): New static function, outsourced from... > (avr_rest_of_handle_ifelse): ...this method. > gcc/testsuite/ > * gcc.target/avr/torture/ifelse-c.h: New file. > * gcc.target/avr/torture/ifelse-d.h: New file. > * gcc.target/avr/torture/ifelse-q.h: New file. > * gcc.target/avr/torture/ifelse-r.h: New file. > * gcc.target/avr/torture/ifelse-c-i8.c: New test. > * gcc.target/avr/torture/ifelse-d-i8.c: New test. > * gcc.target/avr/torture/ifelse-q-i8.c: New test. > * gcc.target/avr/torture/ifelse-r-i8.c: New test. > * gcc.target/avr/torture/ifelse-c-i16.c: New test. > * gcc.target/avr/torture/ifelse-d-i16.c: New test. > * gcc.target/avr/torture/ifelse-q-i16.c: New test. > * gcc.target/avr/torture/ifelse-r-i16.c: New test. > * gcc.target/avr/torture/ifelse-c-u16.c: New test. > * gcc.target/avr/torture/ifelse-d-u16.c: New test. > * gcc.target/avr/torture/ifelse-q-u16.c: New test. > * gcc.target/avr/torture/ifelse-r-u16.c: New test.
On 8/23/24 6:20 AM, Richard Biener wrote: > On Fri, Aug 23, 2024 at 2:16 PM Georg-Johann Lay <avr@gjlay.de> wrote: >> >> This patch overhauls the avr-ifelse mini-pass that optimizes >> two cbranch insns to one comparison and two branches. >> >> More optimization opportunities are realized, and the code >> has been refactored. >> >> No new regressions. Ok for trunk? >> >> There is currently no avr maintainer, so some global reviewer >> might please have a look at this. > > I see Denis still listed? Possibly Jeff can have a look though. I think Denis is inactive at this point. I don't really have any significant interest in avr, nor do I actually know the architecture. So I'm mostly just looking for high level issues rather than diving into really thinking about the codegen impact. IIRC I've asked Georg-Johann if he'd like to take maintainership of the avr port, but he declined. So we're a bit stuck. Jeff
On 8/23/24 6:16 AM, Georg-Johann Lay wrote: > This patch overhauls the avr-ifelse mini-pass that optimizes > two cbranch insns to one comparison and two branches. > > More optimization opportunities are realized, and the code > has been refactored. > > No new regressions. Ok for trunk? > > There is currently no avr maintainer, so some global reviewer > might please have a look at this. > > And one question I have: avr_optimize_2ifelse() is rewiring > basic blocks using redirect_edge_and_branch(). Does this > require extra pass flags or actions? Currently the RTL_PASS > data reads: > > static const pass_data avr_pass_data_ifelse = > { > RTL_PASS, // type > "", // name (will be patched) > OPTGROUP_NONE, // optinfo_flags > TV_DF_SCAN, // tv_id > 0, // properties_required > 0, // properties_provided > 0, // properties_destroyed > 0, // todo_flags_start > TODO_df_finish | TODO_df_verify // todo_flags_finish > }; > > > Johann > > p.s. The additional notes on compare-elim / PR115830 can be found > here (pending review): > > https://gcc.gnu.org/pipermail/gcc-patches/2024-August/660743.html > > -- > > AVR: Overhaul the avr-ifelse RTL optimization pass. > > Mini-pass avr-ifelse realizes optimizations that replace two cbranch > insns with one comparison and two branches. This patch adds the > following improvements: > > - The right operand of the comparisons may also be REGs. > Formerly only CONST_INT was handled. > > - The code of the first comparison in no more restricted > to (effectively) EQ. > > - When the second cbranch is located in the fallthrough path > of the first cbranch, then difficult (expensive) comparisons > can always be avoided. This may require to swap the branch > targets. (When the second cbranch if located after the target > label of the first one, then getting rid of difficult branches > would require to reorder blocks.) > > - The code has been cleaned up: avr_rest_of_handle_ifelse() now > just scans the insn stream for optimization candidates. The code > that actually performs the transformation has been outsourced to > the new function avr_optimize_2ifelse(). > > - The code to find a better representation for reg-const_int comparisons > has been split into two parts: First try to find codes such that the > right-hand sides of the comparisons are the same (avr_2comparisons_rhs). > When this succeeds then one comparison can serve two branches, and > avr_redundant_compare() tries to get rid of difficult branches that > may have been introduced by avr_2comparisons_rhs(). This is always > possible when the second cbranch is located in the fallthrough path > of the first one, or when the first code is EQ. > > Some final notes on why we don't use compare-elim: 1) The two cbranch > insns may come with different scratch operands depending on the chosen > constraint alternatives. There are cases where the outgoing comparison > requires a scratch but only one incoming cbranch has one. 2) Avoiding > difficult branches can be achieved by rewiring basic blocks. > compare-elim doesn't do that; it doesn't even know the costs of the > branch codes. 3) avr_2comparisons_rhs() may de-canonicalize a > comparison to achieve its goal. compare-elim doesn't know how to do > that. 4) There are more reasons, see for example the commit > message and discussion for PR115830. > > gcc/ > * config/avr/avr.cc (cfganal.h): Include it. > (avr_2comparisons_rhs, avr_redundant_compare_regs) > (avr_strict_signed_p, avr_strict_unsigned_p): New static functions. > (avr_redundant_compare): Overhaul: Allow more cases. > (avr_optimize_2ifelse): New static function, outsourced from... > (avr_rest_of_handle_ifelse): ...this method. > gcc/testsuite/ > * gcc.target/avr/torture/ifelse-c.h: New file. > * gcc.target/avr/torture/ifelse-d.h: New file. > * gcc.target/avr/torture/ifelse-q.h: New file. > * gcc.target/avr/torture/ifelse-r.h: New file. > * gcc.target/avr/torture/ifelse-c-i8.c: New test. > * gcc.target/avr/torture/ifelse-d-i8.c: New test. > * gcc.target/avr/torture/ifelse-q-i8.c: New test. > * gcc.target/avr/torture/ifelse-r-i8.c: New test. > * gcc.target/avr/torture/ifelse-c-i16.c: New test. > * gcc.target/avr/torture/ifelse-d-i16.c: New test. > * gcc.target/avr/torture/ifelse-q-i16.c: New test. > * gcc.target/avr/torture/ifelse-r-i16.c: New test. > * gcc.target/avr/torture/ifelse-c-u16.c: New test. > * gcc.target/avr/torture/ifelse-d-u16.c: New test. > * gcc.target/avr/torture/ifelse-q-u16.c: New test. > * gcc.target/avr/torture/ifelse-r-u16.c: New test. > > ifelse-tweak.diff > > diff --git a/gcc/config/avr/avr.cc b/gcc/config/avr/avr.cc > index c520b98a178..90606b73114 100644 > --- a/gcc/config/avr/avr.cc > +++ b/gcc/config/avr/avr.cc > + > +static rtx > +avr_2comparisons_rhs (rtx_code &cond1, rtx xval1, > + rtx_code &cond2, rtx xval2, machine_mode mode) > +{ > + HOST_WIDE_INT val1 = INTVAL (xval1); > + HOST_WIDE_INT val2 = INTVAL (xval2); > + > + if (val1 == val2) > + return xval1; > + > + if (! IN_RANGE (val1 - val2, -2, 2)) > + return NULL_RTX; > + > + // First, ten exceptional cases that occur near the unsigned boundaries. > + // All outgoing codes will have at least one EQ or NE. > + // Similar cases will occur near the signed boundaries, but they are > + // less common (and even more tedious). > + > + if (cond1 == EQ && cond2 == EQ) > + { > + if (val1 == 1 && val2 == 0) > + { > + cond2 = LTU; > + return xval1; > + } > + else if (val1 == 0 && val2 == 1) > + { > + cond1 = LTU; > + return xval2; > + } > + else if (val1 == -2 && val2 == -1) > + { > + cond2 = GEU; > + return xval1; > + } > + else if (val1 == -1 && val2 == -2) > + { > + cond1 = GTU; > + return xval2; > + } > + } // EQ, EQ [ ... ] If I'm understanding this code correctly it looks like you're combining those cases into a single branch. None of the code really looks AVR specific, so is there a good reason why we're not doing this in one of the target independent passes? In fact, I'm a little surprised we're not already handling this stuff. So while I'm not going to object to the AVR code, we may ultimately be better off taking those testscases and seeing where we should be improving the target independent optimizers. Jeff
вс, 25 авг. 2024 г. в 17:55, Jeff Law <jeffreyalaw@gmail.com>: > > > On 8/23/24 6:20 AM, Richard Biener wrote: > > On Fri, Aug 23, 2024 at 2:16 PM Georg-Johann Lay <avr@gjlay.de> wrote: > >> > >> This patch overhauls the avr-ifelse mini-pass that optimizes > >> two cbranch insns to one comparison and two branches. > >> > >> More optimization opportunities are realized, and the code > >> has been refactored. > >> > >> No new regressions. Ok for trunk? > >> > >> There is currently no avr maintainer, so some global reviewer > >> might please have a look at this. > > > > I see Denis still listed? Possibly Jeff can have a look though. > I think Denis is inactive at this point. I don't really have any > significant interest in avr, nor do I actually know the architecture. > So I'm mostly just looking for high level issues rather than diving into > really thinking about the codegen impact. > > IIRC I've asked Georg-Johann if he'd like to take maintainership of the > avr port, but he declined. So we're a bit stuck. > Yes, I was inactive but I'm here. I'm interested in converting the port to LRC. Starting to review the patch... Denis > > > Jeff >
вс, 25 авг. 2024 г. в 20:15, Denis Chertykov <chertykov@gmail.com>: > > > > вс, 25 авг. 2024 г. в 17:55, Jeff Law <jeffreyalaw@gmail.com>: >> >> >> >> On 8/23/24 6:20 AM, Richard Biener wrote: >> > On Fri, Aug 23, 2024 at 2:16 PM Georg-Johann Lay <avr@gjlay.de> wrote: >> >> >> >> This patch overhauls the avr-ifelse mini-pass that optimizes >> >> two cbranch insns to one comparison and two branches. >> >> >> >> More optimization opportunities are realized, and the code >> >> has been refactored. >> >> >> >> No new regressions. Ok for trunk? >> >> >> >> There is currently no avr maintainer, so some global reviewer >> >> might please have a look at this. >> > >> > I see Denis still listed? Possibly Jeff can have a look though. >> I think Denis is inactive at this point. I don't really have any >> significant interest in avr, nor do I actually know the architecture. >> So I'm mostly just looking for high level issues rather than diving into >> really thinking about the codegen impact. >> >> IIRC I've asked Georg-Johann if he'd like to take maintainership of the >> avr port, but he declined. So we're a bit stuck. > > > Yes, I was inactive but I'm here. > I'm interested in converting the port to LRC. > Starting to review the patch... I think that we can go forward. Johann, please apply the patch. > > Denis >> >> >> >> Jeff
Am 25.08.24 um 18:15 schrieb Denis Chertykov: > Starting to review the patch... > > Denis Great to see you back! Prior to commenting on the attached new versions of the overhaul, let me answer Jeff's questions from the other mail: > On 8/23/24 6:16 AM, Georg-Johann Lay wrote: >> This patch overhauls the avr-ifelse mini-pass that optimizes >> two cbranch insns to one comparison and two branches. >> >> More optimization opportunities are realized, and the code >> has been refactored. >> >> No new regressions. Ok for trunk? >> >> [...] >> >> AVR: Overhaul the avr-ifelse RTL optimization pass. >> >> Mini-pass avr-ifelse realizes optimizations that replace two cbranch >> insns with one comparison and two branches. This patch adds the >> following improvements: >> >> - The right operand of the comparisons may also be REGs. >> Formerly only CONST_INT was handled. >> >> - The code of the first comparison in no more restricted >> to (effectively) EQ. >> >> - When the second cbranch is located in the fallthrough path >> of the first cbranch, then difficult (expensive) comparisons >> can always be avoided. This may require to swap the branch >> targets. (When the second cbranch if located after the target >> label of the first one, then getting rid of difficult branches >> would require to reorder blocks.) >> >> - The code has been cleaned up: avr_rest_of_handle_ifelse() now >> just scans the insn stream for optimization candidates. The code >> that actually performs the transformation has been outsourced to >> the new function avr_optimize_2ifelse(). >> >> - The code to find a better representation for reg-const_int comparisons >> has been split into two parts: First try to find codes such that the >> right-hand sides of the comparisons are the same (avr_2comparisons_rhs). >> When this succeeds then one comparison can serve two branches, and >> avr_redundant_compare() tries to get rid of difficult branches that >> may have been introduced by avr_2comparisons_rhs(). This is always >> possible when the second cbranch is located in the fallthrough path >> of the first one, or when the first code is EQ. >> >> Some final notes on why we don't use compare-elim: 1) The two cbranch >> insns may come with different scratch operands depending on the chosen >> constraint alternatives. There are cases where the outgoing comparison >> requires a scratch but only one incoming cbranch has one. 2) Avoiding >> difficult branches can be achieved by rewiring basic blocks. >> compare-elim doesn't do that; it doesn't even know the costs of the >> branch codes. 3) avr_2comparisons_rhs() may de-canonicalize a >> comparison to achieve its goal. compare-elim doesn't know how to do >> that. 4) There are more reasons, see for example the commit >> message and discussion for PR115830. >> >> gcc/ >> * config/avr/avr.cc (cfganal.h): Include it. >> (avr_2comparisons_rhs, avr_redundant_compare_regs) >> (avr_strict_signed_p, avr_strict_unsigned_p): New static functions. >> (avr_redundant_compare): Overhaul: Allow more cases. >> (avr_optimize_2ifelse): New static function, outsourced from... >> (avr_rest_of_handle_ifelse): ...this method. >> gcc/testsuite/ >> * gcc.target/avr/torture/ifelse-c.h: New file. >> * gcc.target/avr/torture/ifelse-d.h: New file. >> * gcc.target/avr/torture/ifelse-q.h: New file. >> * gcc.target/avr/torture/ifelse-r.h: New file. >> * gcc.target/avr/torture/ifelse-c-i8.c: New test. >> * gcc.target/avr/torture/ifelse-d-i8.c: New test. >> * gcc.target/avr/torture/ifelse-q-i8.c: New test. >> * gcc.target/avr/torture/ifelse-r-i8.c: New test. >> * gcc.target/avr/torture/ifelse-c-i16.c: New test. >> * gcc.target/avr/torture/ifelse-d-i16.c: New test. >> * gcc.target/avr/torture/ifelse-q-i16.c: New test. >> * gcc.target/avr/torture/ifelse-r-i16.c: New test. >> * gcc.target/avr/torture/ifelse-c-u16.c: New test. >> * gcc.target/avr/torture/ifelse-d-u16.c: New test. >> * gcc.target/avr/torture/ifelse-q-u16.c: New test. >> * gcc.target/avr/torture/ifelse-r-u16.c: New test. >> >> ifelse-tweak.diff >> >> diff --git a/gcc/config/avr/avr.cc b/gcc/config/avr/avr.cc >> index c520b98a178..90606b73114 100644 >> --- a/gcc/config/avr/avr.cc >> +++ b/gcc/config/avr/avr.cc >> [...] >> +static rtx >> +avr_2comparisons_rhs (rtx_code &cond1, rtx xval1, >> + rtx_code &cond2, rtx xval2, machine_mode mode) >> +{ >> + HOST_WIDE_INT val1 = INTVAL (xval1); >> + HOST_WIDE_INT val2 = INTVAL (xval2); >> + >> + if (val1 == val2) >> + return xval1; >> + >> + if (! IN_RANGE (val1 - val2, -2, 2)) >> + return NULL_RTX; >> + >> + // First, ten exceptional cases that occur near the unsigned boundaries. >> + // All outgoing codes will have at least one EQ or NE. >> + // Similar cases will occur near the signed boundaries, but they are >> + // less common (and even more tedious). >> + >> + if (cond1 == EQ && cond2 == EQ) >> + { >> + if (val1 == 1 && val2 == 0) >> + { >> + cond2 = LTU; >> + return xval1; >> + } >> + else if (val1 == 0 && val2 == 1) >> + { >> + cond1 = LTU; >> + return xval2; >> + } >> + else if (val1 == -2 && val2 == -1) >> + { >> + cond2 = GEU; >> + return xval1; >> + } >> + else if (val1 == -1 && val2 == -2) >> + { >> + cond1 = GTU; >> + return xval2; >> + } >> + } // EQ, EQ > [ ... ] > If I'm understanding this code correctly it looks like you're combining those cases into a single branch. What the avr-ifelse pass does is try to replace 2 cbranch insns with one compare insn and two branches. It runs after reload and just prior to .split2 (split_after_reload). It must run after reload because REG_CC comes into existence in .split2. For example, the last case belongs to transforming if (x == (unsigned) -1) goto A; if (x == (unsigned) -2) goto B; to REG_CC = x compare (unsigned) -2; if (REG_CC > 0) goto A; if (REG_CC == 0) goto B; (As an aside, this will be transformed further down the line to REG_CC = x compare (unsigned) -2; if (REG_CC == 0) goto B; if (REG_CC >= 0) goto A; in order to avoid GTU.) > None of the code really looks AVR specific, so is there a good reason why we're not doing this in one of the target independent passes? That target-independent pass would be compare-elim, which avr does not use. Some of the reasons why not using compare-elim I tried to get across in the review for PR115830: https://gcc.gnu.org/pipermail/gcc-patches/2024-August/660743.html One of the reasons mentioned there is that insns that perform arith+cc may require a scratch reg, but the incoming arith insn that doesn't set cc does not have a scratch. peep2 can provide scratches if available. The scratch regs need handling in the current patch, too: When two cbranch insns should be rewritten as one compare, a scratch may be needed, but only one cbranch has a scratch. For example, CMP 1 and CMP 2 have a scratch while CMP 0 doesn't. This guarantees that in a transformation like if (x == 0) goto A; // cbranch1 no scratch if (x >= 2) goto B; // cbranch2 has a scratch to REG_CC = x compare (unsigned) 1; // needs a scratch if (REG_CC < 0) goto A; if (REG_CC != 0) goto B; we get the needed scratch from cbranch2 due to > rtx scratch = GET_CODE (xop1[4]) == SCRATCH ? xop2[4] : xop1[4]; > rtx (*gen_cmp)(rtx,rtx,rtx) = ... > xcompare = gen_cmp (xop1[1], xval, scratch); It is implicit avr knowledge that at least one of cbranch1 or cbranch2 will provide a scratch, or otherwise no scratch is needed. A second place of avr knowledge is that we want to avoid branch codes that are not supported by AVR. Such codes can be avoided in all situations where we may swap labels. In particular, all sequences like if (x <cond1> a) goto label[1]; if (x <cond2> b) goto label[2]; which allow for a transformation, can avoid non-existent codes when the labels nay be swapped, e.g. the result looks like: REG_CC = x compare y; if (x <cmp1> 0) goto label[n]; // n in { 1, 2 } if (x <cmp2> 0) goto label[3 - n]; When labels cannot be swapped, we know that cost (expensive-jump) <= cost (cheap-jump) + cost (compare) for all modes (all incoming cbranches will have cheap codes from CANONICALIZE_COMPARISON, which must run prior to reload). compare-elim has no idea of branch costs, resp. which branches to avoid, and it doesn't swap labels as needed, just because that's not required under compare-elim assumptions. While none of this is in any way complicated, they add to the things-that-compare-elim-cannot-do list. ======================================================================== So let's return to the current patch review. The patch has all these tedious special cases like in the quoted code above, and also in other places. Meanwhile, I have a second version of the patch that uses operations on sets: The cbranch conditions imply a decomposition / coloring of the domain. When the decomposition is three non-empty sets that look like S1 = { x | x < val } S2 = { x } S3 = { x | x > val } then we can do the optimization. There are 6 = 3! ways to assign the branch labels to these sets, where the 3rd label is for the fallthrough path of cbranch2. Though that solution requires operations on sets. Sadly, value-ranges.h turned out to be of no use. Working around all the value-range.h ICEs will be no less work than a complete, simplistic own implementation. https://gcc.gnu.org/pipermail/gcc/2024-August/244641.html Attached are two versions of the overhaul: 1) ifelse-tweak-v2.diff: Similar to the original patch with some minor fixes. This is the version with all these tedious corner cases and many if-else chains etc. 2) ifelse-ranges.diff: Same feature and identical behavior (except for dump_file). Instead of many special cases, it uses a custom struct Ranges that implements operations (union, intersect, minus, complement) of Ranges (union of finitely many intervals). struct Ranges needs 170 or so lines of code, including all comments. Even though 2) takes ~150 more lines of code than 1), I think my preference is clearly 2). A solution that would wrap-avoid the arcane ICEs from value-ranges.h I don't like at all. Both 1) and 2) contain the same code clean-ups / refactoring. Both 1) and 2) should behave exactly the same w.r.t code generation (and both perform better than the status quo). The test cases are exactly the same. Johann > In fact, I'm a little surprised we're not already handling this stuff. Maybe something like spaceship operator for int_mode? (Currently there's just spaceship for float). This would cover even more cases than the patch can do, because spaceship-like code is expanded in such a way that, in many cases, it cannot be transformed back to spaceship (for example because insn like (set (reg const_int)) will clobber CC. > So while I'm not going to object to the AVR code, we may ultimately be better off taking those testscases and seeing where we should be improving the target independent optimizers. > > Jeff
On 8/26/24 1:15 PM, Georg-Johann Lay wrote: > > What the avr-ifelse pass does is try to replace 2 cbranch insns with > one compare insn and two branches. It runs after reload and just prior > to .split2 (split_after_reload). It must run after reload because > REG_CC comes into existence in .split2. For example, the last case > belongs to transforming > > if (x == (unsigned) -1) goto A; > if (x == (unsigned) -2) goto B; > > to > > REG_CC = x compare (unsigned) -2; > if (REG_CC > 0) goto A; > if (REG_CC == 0) goto B; Hmm. I'd envisioned doing this in gimple, but as your example shows, it's not suitable for gimple (it's an extra expression evaluation). > > (As an aside, this will be transformed further down the line to > > REG_CC = x compare (unsigned) -2; > if (REG_CC == 0) goto B; > if (REG_CC >= 0) goto A; > > in order to avoid GTU.) > >> None of the code really looks AVR specific, so is there a good reason >> why we're not doing this in one of the target independent passes? > > That target-independent pass would be compare-elim, which avr does > not use. Some of the reasons why not using compare-elim I tried to > get across in the review for PR115830: It feels like it'd fit in RTL jump optimizations, well outside compare-elim. Though that may still be too high level. So yea, let's keep it AVR specific. Jeff
Am 27.08.24 um 17:28 schrieb Jeff Law: > > On 8/26/24 1:15 PM, Georg-Johann Lay wrote: > >> What the avr-ifelse pass does is try to replace 2 cbranch insns with >> one compare insn and two branches. It runs after reload and just prior >> to .split2 (split_after_reload). It must run after reload because >> REG_CC comes into existence in .split2. For example, the last case >> belongs to transforming >> >> if (x == (unsigned) -1) goto A; >> if (x == (unsigned) -2) goto B; >> >> to >> >> REG_CC = x compare (unsigned) -2; >> if (REG_CC > 0) goto A; >> if (REG_CC == 0) goto B; > Hmm. I'd envisioned doing this in gimple, but as your example shows, > it's not suitable for gimple (it's an extra expression evaluation). Max be something like starship operator for integer mode would help? All the situations are effectively sship situation, but a sship detection in gimple could do even more: char fun (char x) { if (x > '0') return 10; return x == '0'; } The .optimized tree dump reads something like: char fun (char x) { _Bool _1; char _2; char _4; <bb 2> [local count: 1073741824]: if (x_3(D) > 48) goto <bb 4>; [21.72%] else goto <bb 3>; [78.28%] <bb 3> [local count: 840525096]: _1 = x_3(D) == 48; _4 = (char) _1; <bb 4> [local count: 1073741824]: # _2 = PHI <10(2), _4(3)> return _2; } The problem with this is that the code gets expanded like: cbranch REG = const ;; clobbers CC cbranch With a spaceship insn, the backend could avoid that, though it is unclear to me to which rtl this should be expanded. JUMP_INSNs can only have one JUMP_LABEL (though I recently learned that JUMP_INSN can have more than one label when they are registered as insn notes, but I never tried that.) For example, a spaceship insn could provide operands like $0, $1: The ops for the <=> comparison $2: label_ref for < $3: label_ref for = $4: label_ref for > Or more general, provide set rtxes as $2, $4, $4 like pc=pc for the fallthrough case, (set reg:QI 1) etc. so that jumps can be avoided. Johann >> (As an aside, this will be transformed further down the line to >> >> REG_CC = x compare (unsigned) -2; >> if (REG_CC == 0) goto B; >> if (REG_CC >= 0) goto A; >> >> in order to avoid GTU.) >> >>> None of the code really looks AVR specific, so is there a good reason >>> why we're not doing this in one of the target independent passes? >> >> That target-independent pass would be compare-elim, which avr does >> not use. Some of the reasons why not using compare-elim I tried to >> get across in the review for PR115830: > It feels like it'd fit in RTL jump optimizations, well outside > compare-elim. Though that may still be too high level. So yea, let's > keep it AVR specific. > > > Jeff
On Tue, Aug 27, 2024 at 7:53 PM Georg-Johann Lay <avr@gjlay.de> wrote: > > Am 27.08.24 um 17:28 schrieb Jeff Law: > > > > On 8/26/24 1:15 PM, Georg-Johann Lay wrote: > > > >> What the avr-ifelse pass does is try to replace 2 cbranch insns with > >> one compare insn and two branches. It runs after reload and just prior > >> to .split2 (split_after_reload). It must run after reload because > >> REG_CC comes into existence in .split2. For example, the last case > >> belongs to transforming > >> > >> if (x == (unsigned) -1) goto A; > >> if (x == (unsigned) -2) goto B; > >> > >> to > >> > >> REG_CC = x compare (unsigned) -2; > >> if (REG_CC > 0) goto A; > >> if (REG_CC == 0) goto B; > > Hmm. I'd envisioned doing this in gimple, but as your example shows, > > it's not suitable for gimple (it's an extra expression evaluation). > > Max be something like starship operator for integer mode would help? > > All the situations are effectively sship situation, but a sship > detection in gimple could do even more: > > char fun (char x) > { > if (x > '0') return 10; > return x == '0'; > } > > The .optimized tree dump reads something like: > > char fun (char x) > { > _Bool _1; > char _2; > char _4; > > <bb 2> [local count: 1073741824]: > if (x_3(D) > 48) > goto <bb 4>; [21.72%] > else > goto <bb 3>; [78.28%] > > <bb 3> [local count: 840525096]: > _1 = x_3(D) == 48; > _4 = (char) _1; > > <bb 4> [local count: 1073741824]: > # _2 = PHI <10(2), _4(3)> > > return _2; > } > > The problem with this is that the code gets expanded like: > > cbranch > REG = const ;; clobbers CC > cbranch > > With a spaceship insn, the backend could avoid that, though it > is unclear to me to which rtl this should be expanded. > JUMP_INSNs can only have one JUMP_LABEL (though I recently learned > that JUMP_INSN can have more than one label when they are registered > as insn notes, but I never tried that.) > > For example, a spaceship insn could provide operands like > > $0, $1: The ops for the <=> comparison > $2: label_ref for < > $3: label_ref for = > $4: label_ref for > > > Or more general, provide set rtxes as $2, $4, $4 like pc=pc for the > fallthrough case, (set reg:QI 1) etc. so that jumps can be avoided. For GIMPLE I'm not sure a separate abstraction for condition codes is helpful - in the end we have to map to what the targets can do. Note GIMPLE is also a bit constrained on the number of outputs of a statement, so setting CC as side-effect of say an add isn't currently representable. Instead you'd have to do like risc-v and have result = a + b; result_cc = a +cc b; thus separate value and CC computation. On RTL we've settled to CC-modes but I think the semantics of those are purely target specific - the spaceship idea would produce sth like a "CCallofstarship mode" on GIMPLE but how this can be expanded to RTL would be implemented on each target? Richard. > Johann > > > >> (As an aside, this will be transformed further down the line to > >> > >> REG_CC = x compare (unsigned) -2; > >> if (REG_CC == 0) goto B; > >> if (REG_CC >= 0) goto A; > >> > >> in order to avoid GTU.) > >> > >>> None of the code really looks AVR specific, so is there a good reason > >>> why we're not doing this in one of the target independent passes? > >> > >> That target-independent pass would be compare-elim, which avr does > >> not use. Some of the reasons why not using compare-elim I tried to > >> get across in the review for PR115830: > > It feels like it'd fit in RTL jump optimizations, well outside > > compare-elim. Though that may still be too high level. So yea, let's > > keep it AVR specific. > > > > > > Jeff
diff --git a/gcc/config/avr/avr.cc b/gcc/config/avr/avr.cc index c520b98a178..90606b73114 100644 --- a/gcc/config/avr/avr.cc +++ b/gcc/config/avr/avr.cc @@ -33,6 +33,7 @@ #include "cgraph.h" #include "c-family/c-common.h" #include "cfghooks.h" +#include "cfganal.h" #include "df.h" #include "memmodel.h" #include "tm_p.h" @@ -337,6 +338,174 @@ ra_in_progress () } +/* Return TRUE iff comparison code CODE is explicitly signed. */ + +static bool +avr_strict_signed_p (rtx_code code) +{ + return code == GT || code == GE || code == LT || code == LE; +} + + +/* Return TRUE iff comparison code CODE is explicitly unsigned. */ + +static bool +avr_strict_unsigned_p (rtx_code code) +{ + return code == GTU || code == GEU || code == LTU || code == LEU; +} + + +/* Let + x <COND1> xval1 + x <COND2> xval2 + + be two integer mode comparisons where XVAL1 and XVAL2 are CONST_INT. + When they can be rewritten in the form + + x <cmp1> xval + x <cmp2> xval + + then set COND1 to cmp1, COND2 to cmp2, and return xval. + Otherwise, return NULL_RTX. */ + +static rtx +avr_2comparisons_rhs (rtx_code &cond1, rtx xval1, + rtx_code &cond2, rtx xval2, machine_mode mode) +{ + HOST_WIDE_INT val1 = INTVAL (xval1); + HOST_WIDE_INT val2 = INTVAL (xval2); + + if (val1 == val2) + return xval1; + + if (! IN_RANGE (val1 - val2, -2, 2)) + return NULL_RTX; + + // First, ten exceptional cases that occur near the unsigned boundaries. + // All outgoing codes will have at least one EQ or NE. + // Similar cases will occur near the signed boundaries, but they are + // less common (and even more tedious). + + if (cond1 == EQ && cond2 == EQ) + { + if (val1 == 1 && val2 == 0) + { + cond2 = LTU; + return xval1; + } + else if (val1 == 0 && val2 == 1) + { + cond1 = LTU; + return xval2; + } + else if (val1 == -2 && val2 == -1) + { + cond2 = GEU; + return xval1; + } + else if (val1 == -1 && val2 == -2) + { + cond1 = GTU; + return xval2; + } + } // EQ, EQ + + if (cond1 == EQ && cond2 == NE) + { + if (val1 == 1 && val2 == 0) + { + cond2 = GEU; + return xval1; + } + else if (val1 == 0 && val2 == 1) + { + cond1 = LTU; + return xval2; + } + else if (val1 == -1 && val2 == -2) + { + cond1 = GTU; + return xval2; + } + else if (val1 == -2 && val2 == -1) + { + cond2 = LTU; + return xval1; + } + } // EQ, NE + + if (cond1 == GEU && val1 == 2 && cond2 == EQ && val2 == 0) + { + cond1 = GTU; + cond2 = NE; + return const1_rtx; + } + else if (cond1 == LTU && val1 == -2 && cond2 == EQ && val2 == -1) + { + cond2 = NE; + return xval1; + } + + bool signed1_p = avr_strict_signed_p (cond1); + bool signed2_p = avr_strict_signed_p (cond2); + bool unsigned1_p = avr_strict_unsigned_p (cond1); + bool unsigned2_p = avr_strict_unsigned_p (cond2); + bool signed_p = signed1_p || signed2_p; + bool unsigned_p = unsigned1_p || unsigned2_p; + + // Don't try to optimize non-trivial, exotic cases with mixed signedness. + if (signed_p + unsigned_p > 1) + return NULL_RTX; + + if (unsigned_p) + { + val1 = UINTVAL (xval1) & GET_MODE_MASK (mode); + val2 = UINTVAL (xval2) & GET_MODE_MASK (mode); + } + + // Each array element represents two equivalent integer comparisons + // X <cond.first> Y + // X <cond.second> Y + 1 + // i.e. when <cond.first> is replaced by <cond.second> then we get + // an equivalent comparison provided Y is increased by 1. + + static const std::pair<rtx_code, rtx_code> cond_repl_inc1[] = + { + { GT, GE }, { GTU, GEU }, // X > Y <=> X >= Y + 1 + { LE, LT }, { LEU, LTU } // X <= Y <=> X < Y + 1 + }; + + // Try them back and forth. + + for (auto &c : cond_repl_inc1) + { + if (c.first == cond1 && val2 == 1 + val1) + { + cond1 = c.second; + return xval2; + } + else if (c.first == cond2 && val1 == 1 + val2) + { + cond2 = c.second; + return xval1; + } + else if (c.second == cond1 && val2 == val1 - 1) + { + cond1 = c.first; + return xval2; + } + else if (c.second == cond2 && val1 == val2 - 1) + { + cond2 = c.first; + return xval1; + } + } + + return NULL_RTX; +} + + namespace { static const pass_data avr_pass_data_recompute_notes = @@ -775,46 +944,80 @@ avr_pass_casesi::avr_rest_of_handle_casesi (function *func) /* A helper for the next method. Suppose we have two conditional branches + with REG and CONST_INT operands if (reg <cond1> xval1) goto label1; if (reg <cond2> xval2) goto label2; - If the second comparison is redundant and there is a code <cond> such - that the sequence can be performed as + If the second comparison is redundant and there are codes <cmp1> + and <cmp2> such that the sequence can be performed as - REG_CC = compare (reg, xval1); - if (REG_CC <cond1> 0) goto label1; - if (REG_CC <cond> 0) goto label2; + REG_CC = compare (reg, xval); + if (REG_CC <cmp1> 0) goto label1; + if (REG_CC <cmp2> 0) goto label2; - then return <cond>. Otherwise, return UNKNOWN. - xval1 and xval2 are CONST_INT, and mode is the scalar int mode in which - the comparison will be carried out. reverse_cond1 can be set to reverse - condition cond1. This is useful if the second comparison does not follow - the first one, but is located after label1 like in: + then set COND1 to cmp1, COND2 to cmp2, SWAPT to true when the branch + targets have to be swapped, and return XVAL. Otherwise, return NULL_RTX. + This function may clobber COND1 and COND2 even when it returns NULL_RTX. + + XVAL1 and XVAL2 are CONST_INT. REVERSE_COND1 can be set to reverse + condition COND1. This is useful when the second comparison does not + follow the first one, but is located after label1 like in: if (reg <cond1> xval1) goto label1; ... label1: - if (reg <cond2> xval2) goto label2; */ + if (reg <cond2> xval2) goto label2; -static enum rtx_code -avr_redundant_compare (enum rtx_code cond1, rtx xval1, - enum rtx_code cond2, rtx xval2, - machine_mode mode, bool reverse_cond1) -{ - HOST_WIDE_INT ival1 = INTVAL (xval1); - HOST_WIDE_INT ival2 = INTVAL (xval2); + In such a case we cannot swap the labels, and we may end up with a + difficult branch -- though one comparison can still be optimized out. + Getting rid of such difficult branches would require to reorder + basic blocks. */ - unsigned HOST_WIDE_INT mask = GET_MODE_MASK (mode); - unsigned HOST_WIDE_INT uval1 = mask & UINTVAL (xval1); - unsigned HOST_WIDE_INT uval2 = mask & UINTVAL (xval2); +static rtx +avr_redundant_compare (rtx xreg1, rtx_code &cond1, rtx xval1, + rtx xreg2, rtx_code &cond2, rtx xval2, + bool reverse_cond1, bool &swapt) +{ + // Make sure we have two REG <cond> CONST_INT comparisons with the same reg. + if (! rtx_equal_p (xreg1, xreg2) + || ! CONST_INT_P (xval1) + || ! CONST_INT_P (xval2)) + return NULL_RTX; if (reverse_cond1) cond1 = reverse_condition (cond1); + rtx_code c1 = cond1; + rtx_code c2 = cond2; + rtx xval = avr_2comparisons_rhs (c1, xval1, c2, xval2, GET_MODE (xreg1)); + + if (! xval) + return NULL_RTX; + + // We found an equivalent representation as two comparisons with + // the same RHS. This may have introduced difficult comparisons, + // and the rest of this functions will get rid of these -- which + // is always possible, but may require to swap the branch targets. + // Except in the case of reverse_cond1 where avoiding difficult + // branches would require to reorder blocks or to add new ones. + + if (dump_file) + { + rtx_code a1 = reverse_cond1 ? reverse_condition (cond1) : cond1; + rtx_code b1 = reverse_cond1 ? reverse_condition (c1) : c1; + const char *s_rev1 = reverse_cond1 ? " reverse_cond1" : ""; + avr_dump (";; cond1: %C %r%s\n", a1, xval1, s_rev1); + avr_dump (";; cond2: %C %r\n", cond2, xval2); + avr_dump (";; => %C %d\n", b1, (int) INTVAL (xval)); + avr_dump (";; => %C %d\n", c2, (int) INTVAL (xval)); + } + + cond1 = c1; + cond2 = c2; + if (cond1 == EQ) { - //////////////////////////////////////////////// // A sequence like // if (reg == val) goto label1; // if (reg > val) goto label2; @@ -822,87 +1025,329 @@ avr_redundant_compare (enum rtx_code cond1, rtx xval1, // REG_CC = compare (reg, val) // if (REG_CC == 0) goto label1; // if (REG_CC >= 0) goto label2; - if (ival1 == ival2 - && (cond2 == GT || cond2 == GTU)) - return avr_normalize_condition (cond2); + if (cond2 == GT || cond2 == GTU) + { + avr_dump (";; case #11\n"); + cond2 = avr_normalize_condition (cond2); + return xval; + } - // Similar, but the input sequence is like - // if (reg == val) goto label1; - // if (reg >= val) goto label2; - if (ival1 == ival2 - && (cond2 == GE || cond2 == GEU)) - return cond2; - - // Similar, but the input sequence is like - // if (reg == val) goto label1; - // if (reg >= val + 1) goto label2; - if ((cond2 == GE && ival2 == 1 + ival1) - || (cond2 == GEU && uval2 == 1 + uval1)) - return cond2; - - // Similar, but the input sequence is like - // if (reg == val) goto label1; - // if (reg > val - 1) goto label2; - if ((cond2 == GT && ival2 == ival1 - 1) - || (cond2 == GTU && uval2 == uval1 - 1)) - return avr_normalize_condition (cond2); - - ///////////////////////////////////////////////////////// // A sequence like - // if (reg == val) goto label1; - // if (reg < 1 + val) goto label2; + // if (reg == val) goto label1; + // if (reg <= val) goto label2; // can be re-written as // REG_CC = compare (reg, val) // if (REG_CC == 0) goto label1; // if (REG_CC < 0) goto label2; - if ((cond2 == LT && ival2 == 1 + ival1) - || (cond2 == LTU && uval2 == 1 + uval1)) - return cond2; + if (cond2 == LE || cond2 == LEU) + { + avr_dump (";; case #12\n"); + cond2 = avr_normalize_condition (cond2); + return xval; + } + } // cond1 == EQ - // Similar, but with an input sequence like - // if (reg == val) goto label1; - // if (reg <= val) goto label2; - if (ival1 == ival2 - && (cond2 == LE || cond2 == LEU)) - return avr_normalize_condition (cond2); + if (cond2 == EQ) + { + // A sequence like + // if (reg > val) goto label1; + // if (reg == val) goto label2; + // can be re-written as + // REG_CC = compare (reg, val) + // if (REG_CC == 0) goto label2; + // if (REG_CC >= 0) goto label1; + if ((cond1 == GTU || cond1 == GT) + && ! reverse_cond1) + { + avr_dump (";; case #13, swapt\n"); + cond1 = avr_normalize_condition (cond1); + std::swap (cond1, cond2); + swapt = true; + return xval; + } - // Similar, but with an input sequence like - // if (reg == val) goto label1; - // if (reg < val) goto label2; - if (ival1 == ival2 - && (cond2 == LT || cond2 == LTU)) - return cond2; - - // Similar, but with an input sequence like - // if (reg == val) goto label1; - // if (reg <= val - 1) goto label2; - if ((cond2 == LE && ival2 == ival1 - 1) - || (cond2 == LEU && uval2 == uval1 - 1)) - return avr_normalize_condition (cond2); + // A sequence like + // if (reg <= val) goto label1; + // if (reg == val) goto label2; + // can be re-written as + // REG_CC = compare (reg, val) + // if (REG_CC == 0) goto label2; + // if (REG_CC < 0) goto label1; + if ((cond1 == LEU || cond1 == LE) + && ! reverse_cond1) + { + avr_dump (";; case #14, swapt\n"); + cond1 = avr_normalize_condition (cond1); + std::swap (cond1, cond2); + swapt = true; + return xval; + } + } // cond2 == EQ - } // cond1 == EQ + if (cond2 == NE) + { + // A sequence like + // if (reg > val) goto label1; + // if (reg != val) goto label2; + // can be re-written as + // REG_CC = compare (reg, val) + // if (REG_CC < 0) goto label2; + // if (REG_CC != 0) goto label1; + if ((cond1 == GTU || cond1 == GT) + && ! reverse_cond1) + { + avr_dump (";; case #15, swapt\n"); + cond1 = swap_condition (cond1); + swapt = true; + return xval; + } + } // cond2 == NE - return UNKNOWN; + // Situations similar the the ones above, but where no + // extra care is needed, or swapping labels is not possible. + // As it appears, sequences like LT + GT and GT + LT + // don't appear since the optimizers trasform the 2nd code + // into a NE, and we don't generate such sequences either. + // Then there are rare, exotic cases with mixed signedness + // which we leave as they are. + avr_dump (";; case #10\n"); + + return xval; } -/* If-else decision trees generated for switch / case may produce sequences - like +/* Similar to the function above, but assume that + + if (xreg1 <comd1> xval1) goto label1; + if (xreg2 <comd2> xval2) goto label2; + + are two subsequent REG-REG comparisons. When this can be represented as + + REG_CC = compare (reg, xval); + if (REG_CC <cmp1> 0) goto labl1; + if (REG_CC <cmp2> 0) goto labl2; + + then set XREG1 to reg, COND1 and COND2 accordingly, and return xval. + Otherwise, return NULL_RTX. This optmization can be performed + when { xreg1, xval1 } and { xreg2, xval2 } are equal as sets, + and it can be performed in such a way that no difficult branches + are required. */ + +static rtx +avr_redundant_compare_regs (rtx &xreg1, rtx_code &cond1, rtx &xval1, + rtx &xreg2, rtx_code &cond2, rtx &xval2, + bool reverse_cond1) +{ + bool swapped; + + if (rtx_equal_p (xreg1, xreg2) + && rtx_equal_p (xval1, xval2)) + swapped = false; + else if (rtx_equal_p (xreg1, xval2) + && rtx_equal_p (xreg2, xval1)) + swapped = true; + else + return NULL_RTX; + + // Found a redundant REG-REG comparison. Assume that the incoming + // representation has been canonicalized by CANONICALIZE_COMPARISON. + // We can always represent this using only one comparison and in such + // a way that no difficult branches are required. + + if (dump_file) + { + const char *s_rev1 = reverse_cond1 ? " reverse_cond1" : ""; + avr_dump (";; %r %C %r%s\n", xreg1, cond1, xval1, s_rev1); + avr_dump (";; %r %C %r\n", xreg2, cond2, xval2); + } + + if (reverse_cond1) + cond1 = reverse_condition (cond1); + + if (swapped) + { + if (cond1 == EQ || cond1 == NE) + { + avr_dump (";; case #21\n"); + std::swap (xreg1, xval1); + } + else + { + std::swap (xreg2, xval2); + cond2 = swap_condition (cond2); + + // The swap may have introduced a difficult comparison. + // There are only a few cases that need extra care. + + if ((cond1 == LT && cond2 == GT) + || (cond1 == LTU && cond2 == GTU)) + { + avr_dump (";; case #22\n"); + cond2 = NE; + } + else + avr_dump (";; case #23\n"); + } + } + else + avr_dump (";; case #20\n"); + + return xval1; +} + + +/* INSN1 and INSN2 are two cbranch insns for the same integer mode. + When FOLLOW_LABEL1 is false, then INSN2 is located in the fallthrough + path of INSN1. When FOLLOW_LABEL1 is true, then INSN2 is located at + the true edge of INSN1, INSN2 is preceded by a barrier, and no other + edge leads to the basic block of INSN2. + + Try to replace INSN1 and INSN2 by a compare insn and two branch insns. + When such a replacement has been performed, then return the insn where the + caller should continue scanning the insn stream. Else, return nullptr. */ + +static rtx_insn* +avr_optimize_2ifelse (rtx_jump_insn *insn1, + rtx_jump_insn *insn2, bool follow_label1) +{ + // Extract the operands of the insns: + // $0 = comparison operator ($1, $2) + // $1 = reg + // $2 = reg or const_int + // $3 = code_label + // $4 = optional SCRATCH for HI, PSI, SI cases. + + const auto &op = recog_data.operand; + + extract_insn (insn1); + rtx xop1[5] = { op[0], op[1], op[2], op[3], op[4] }; + int n_operands = recog_data.n_operands; + + extract_insn (insn2); + rtx xop2[5] = { op[0], op[1], op[2], op[3], op[4] }; + + rtx_code code1 = GET_CODE (xop1[0]); + rtx_code code2 = GET_CODE (xop2[0]); + bool swap_targets = false; + + // Search redundant REG-REG comparison. + rtx xval = avr_redundant_compare_regs (xop1[1], code1, xop1[2], + xop2[1], code2, xop2[2], + follow_label1); + + // Search redundant REG-CONST_INT comparison. + if (! xval) + xval = avr_redundant_compare (xop1[1], code1, xop1[2], + xop2[1], code2, xop2[2], + follow_label1, swap_targets); + if (! xval) + return nullptr; + + if (follow_label1) + code1 = reverse_condition (code1); + + ////////////////////////////////////////////////////// + // Found a replacement. + + if (dump_file) + { + avr_dump (";; => %C %r\n", code1, xval); + avr_dump (";; => %C %r\n", code2, xval); + + fprintf (dump_file, "\n;; Found chain of jump_insn %d and" + " jump_insn %d, follow_label1=%d:\n", + INSN_UID (insn1), INSN_UID (insn2), follow_label1); + print_rtl_single (dump_file, PATTERN (insn1)); + print_rtl_single (dump_file, PATTERN (insn2)); + } + + rtx_insn *next_insn + = next_nonnote_nondebug_insn (follow_label1 ? insn1 : insn2); + + // Pop the new branch conditions and the new comparison. + // Prematurely split into compare + branch so that we can drop + // the 2nd comparison. The following pass, split2, splits all + // insns for REG_CC, and it should still work as usual even when + // there are already some REG_CC insns around. + + rtx xcond1 = gen_rtx_fmt_ee (code1, VOIDmode, cc_reg_rtx, const0_rtx); + rtx xcond2 = gen_rtx_fmt_ee (code2, VOIDmode, cc_reg_rtx, const0_rtx); + rtx xpat1 = gen_branch (xop1[3], xcond1); + rtx xpat2 = gen_branch (xop2[3], xcond2); + rtx xcompare = NULL_RTX; + machine_mode mode = GET_MODE (xop1[1]); + + if (mode == QImode) + { + gcc_assert (n_operands == 4); + xcompare = gen_cmpqi3 (xop1[1], xval); + } + else + { + gcc_assert (n_operands == 5); + rtx scratch = GET_CODE (xop1[4]) == SCRATCH ? xop2[4] : xop1[4]; + rtx (*gen_cmp)(rtx,rtx,rtx) + = mode == HImode ? gen_gen_comparehi + : mode == PSImode ? gen_gen_comparepsi + : gen_gen_comparesi; // SImode + xcompare = gen_cmp (xop1[1], xval, scratch); + } + + // Emit that stuff. + + rtx_insn *cmp = emit_insn_before (xcompare, insn1); + rtx_jump_insn *branch1 = emit_jump_insn_after (xpat1, insn1); + rtx_jump_insn *branch2 = emit_jump_insn_after (xpat2, insn2); + + JUMP_LABEL (branch1) = xop1[3]; + JUMP_LABEL (branch2) = xop2[3]; + // delete_insn() decrements LABEL_NUSES when deleting a JUMP_INSN, but + // when we pop a new JUMP_INSN, do it by hand. + ++LABEL_NUSES (xop1[3]); + ++LABEL_NUSES (xop2[3]); + + delete_insn (insn1); + delete_insn (insn2); + + if (swap_targets) + { + gcc_assert (! follow_label1); + + basic_block to1 = BLOCK_FOR_INSN (xop1[3]); + basic_block to2 = BLOCK_FOR_INSN (xop2[3]); + edge e1 = find_edge (BLOCK_FOR_INSN (branch1), to1); + edge e2 = find_edge (BLOCK_FOR_INSN (branch2), to2); + gcc_assert (e1); + gcc_assert (e2); + redirect_edge_and_branch (e1, to2); + redirect_edge_and_branch (e2, to1); + } + + // As a side effect, also recog the new insns. + gcc_assert (valid_insn_p (cmp)); + gcc_assert (valid_insn_p (branch1)); + gcc_assert (valid_insn_p (branch2)); + + return next_insn; +} + + +/* Sequences like - SREG = compare (reg, val); - if (SREG == 0) goto label1; SREG = compare (reg, 1 + val); - if (SREG >= 0) goto label2; + if (SREG >= 0) goto label1; + SREG = compare (reg, val); + if (SREG == 0) goto label2; - which can be optimized to + can be optimized to SREG = compare (reg, val); - if (SREG == 0) goto label1; - if (SREG >= 0) goto label2; + if (SREG == 0) goto label2; + if (SREG >= 0) goto label1; - The optimal place for such a pass would be directly after expand, but - it's not possible for a jump insn to target more than one code label. - Hence, run a mini pass right before split2 which introduces REG_CC. */ + Almost all cases where one of the comparisons is redundant can + be transformed in such a way that only one comparison is required + and no difficult branches are needed. */ void avr_pass_ifelse::avr_rest_of_handle_ifelse (function *) @@ -931,143 +1376,44 @@ avr_pass_ifelse::avr_rest_of_handle_ifelse (function *) continue; rtx_jump_insn *insn1 = as_a<rtx_jump_insn *> (insn); - rtx_jump_insn *insn2 = nullptr; - bool follow_label1 = false; - - // Extract the operands of the first insn: - // $0 = comparison operator ($1, $2) - // $1 = reg - // $2 = reg or const_int - // $3 = code_label - // $4 = optional SCRATCH for HI, PSI, SI cases. - - const auto &op = recog_data.operand; - - extract_insn (insn1); - rtx xop1[5] = { op[0], op[1], op[2], op[3], op[4] }; - int n_operands = recog_data.n_operands; - - // For now, we can optimize cbranches that follow an EQ cbranch, - // and cbranches that follow the label of a NE cbranch. - if (GET_CODE (xop1[0]) == EQ - && JUMP_P (next_insn) - && recog_memoized (next_insn) == icode1) - { - // The 2nd cbranch insn follows insn1, i.e. is located in the - // fallthrough path of insn1. + // [0]: We can optimize cbranches that follow cbranch insn1. + rtx_insn *jmp[2] = { next_insn, nullptr }; - insn2 = as_a<rtx_jump_insn *> (next_insn); - } - else if (GET_CODE (xop1[0]) == NE) + // [1]: A cbranch following the label of cbranch insn1. + if (LABEL_NUSES (JUMP_LABEL (insn1)) == 1) { - // insn1 might branch to a label followed by a cbranch. - - rtx target1 = JUMP_LABEL (insn1); rtx_insn *code_label1 = JUMP_LABEL_AS_INSN (insn1); - rtx_insn *next = next_nonnote_nondebug_insn (code_label1); rtx_insn *barrier = prev_nonnote_nondebug_insn (code_label1); - if (// Target label of insn1 is used exactly once and - // is not a fallthru, i.e. is preceded by a barrier. - LABEL_NUSES (target1) == 1 - && barrier - && BARRIER_P (barrier) - // Following the target label is a cbranch of the same kind. - && next - && JUMP_P (next) - && recog_memoized (next) == icode1) - { - follow_label1 = true; - insn2 = as_a<rtx_jump_insn *> (next); - } - } - - if (! insn2) - continue; - - // Also extract operands of insn2, and filter for REG + CONST_INT - // comparsons against the same register. - - extract_insn (insn2); - rtx xop2[5] = { op[0], op[1], op[2], op[3], op[4] }; - - if (! rtx_equal_p (xop1[1], xop2[1]) - || ! CONST_INT_P (xop1[2]) - || ! CONST_INT_P (xop2[2])) - continue; - - machine_mode mode = GET_MODE (xop1[1]); - enum rtx_code code1 = GET_CODE (xop1[0]); - enum rtx_code code2 = GET_CODE (xop2[0]); - - code2 = avr_redundant_compare (code1, xop1[2], code2, xop2[2], - mode, follow_label1); - if (code2 == UNKNOWN) - continue; - - ////////////////////////////////////////////////////// - // Found a replacement. - - if (dump_file) - { - fprintf (dump_file, "\n;; Found chain of jump_insn %d and" - " jump_insn %d, follow_label1=%d:\n", - INSN_UID (insn1), INSN_UID (insn2), follow_label1); - print_rtl_single (dump_file, PATTERN (insn1)); - print_rtl_single (dump_file, PATTERN (insn2)); - } - - if (! follow_label1) - next_insn = next_nonnote_nondebug_insn (insn2); - - // Pop the new branch conditions and the new comparison. - // Prematurely split into compare + branch so that we can drop - // the 2nd comparison. The following pass, split2, splits all - // insns for REG_CC, and it should still work as usual even when - // there are already some REG_CC insns around. + // When the target label of insn1 is used exactly once and is + // not a fallthrough, i.e. is preceded by a barrier, then + // consider the insn following that label. + if (barrier && BARRIER_P (barrier)) + jmp[1] = next_nonnote_nondebug_insn (code_label1); + } - rtx xcond1 = gen_rtx_fmt_ee (code1, VOIDmode, cc_reg_rtx, const0_rtx); - rtx xcond2 = gen_rtx_fmt_ee (code2, VOIDmode, cc_reg_rtx, const0_rtx); - rtx xpat1 = gen_branch (xop1[3], xcond1); - rtx xpat2 = gen_branch (xop2[3], xcond2); - rtx xcompare = NULL_RTX; + // With almost certainty, only one of the two possible jumps can + // be optimized with insn1, but it's hard to tell which one a priori. + // Just try both. In the unlikely case where both could be optimized, + // prefer jmp[0] because eliminating difficult branches is impeded + // by following label1. - if (mode == QImode) - { - gcc_assert (n_operands == 4); - xcompare = gen_cmpqi3 (xop1[1], xop1[2]); - } - else + for (int j = 0; j < 2; ++j) { - gcc_assert (n_operands == 5); - rtx (*gen_cmp)(rtx,rtx,rtx) - = mode == HImode ? gen_gen_comparehi - : mode == PSImode ? gen_gen_comparepsi - : gen_gen_comparesi; // SImode - xcompare = gen_cmp (xop1[1], xop1[2], xop1[4]); + if (jmp[j] && JUMP_P (jmp[j]) + && recog_memoized (jmp[j]) == icode1) + { + rtx_insn *next + = avr_optimize_2ifelse (insn1, as_a<rtx_jump_insn *> (jmp[j]), + j == 1 /* follow_label1 */); + if (next) + { + next_insn = next; + break; + } + } } - - // Emit that stuff. - - rtx_insn *cmp = emit_insn_before (xcompare, insn1); - rtx_jump_insn *branch1 = emit_jump_insn_before (xpat1, insn1); - rtx_jump_insn *branch2 = emit_jump_insn_before (xpat2, insn2); - - JUMP_LABEL (branch1) = xop1[3]; - JUMP_LABEL (branch2) = xop2[3]; - // delete_insn() decrements LABEL_NUSES when deleting a JUMP_INSN, but - // when we pop a new JUMP_INSN, do it by hand. - ++LABEL_NUSES (xop1[3]); - ++LABEL_NUSES (xop2[3]); - - delete_insn (insn1); - delete_insn (insn2); - - // As a side effect, also recog the new insns. - gcc_assert (valid_insn_p (cmp)); - gcc_assert (valid_insn_p (branch1)); - gcc_assert (valid_insn_p (branch2)); } // loop insns } diff --git a/gcc/testsuite/gcc.target/avr/torture/ifelse-c-i16.c b/gcc/testsuite/gcc.target/avr/torture/ifelse-c-i16.c new file mode 100644 index 00000000000..c6d25604ffd --- /dev/null +++ b/gcc/testsuite/gcc.target/avr/torture/ifelse-c-i16.c @@ -0,0 +1,12 @@ +/* { dg-do run } */ +/* { dg-additional-options { "-fwrapv" "-std=c99" } } */ + +#ifndef T +#define T i16 +#endif + +#ifndef W +#define W -5 +#endif + +#include "ifelse-c.h" diff --git a/gcc/testsuite/gcc.target/avr/torture/ifelse-c-i8.c b/gcc/testsuite/gcc.target/avr/torture/ifelse-c-i8.c new file mode 100644 index 00000000000..24c71a4a6f8 --- /dev/null +++ b/gcc/testsuite/gcc.target/avr/torture/ifelse-c-i8.c @@ -0,0 +1,12 @@ +/* { dg-do run } */ +/* { dg-additional-options { "-fwrapv" "-std=c99" } } */ + +#ifndef T +#define T i8 +#endif + +#ifndef W +#define W -5 +#endif + +#include "ifelse-c.h" diff --git a/gcc/testsuite/gcc.target/avr/torture/ifelse-c-u16.c b/gcc/testsuite/gcc.target/avr/torture/ifelse-c-u16.c new file mode 100644 index 00000000000..cb751128085 --- /dev/null +++ b/gcc/testsuite/gcc.target/avr/torture/ifelse-c-u16.c @@ -0,0 +1,12 @@ +/* { dg-do run } */ +/* { dg-additional-options { "-fwrapv" "-std=c99" } } */ + +#ifndef T +#define T u16 +#endif + +#ifndef W +#define W -5ul +#endif + +#include "ifelse-c.h" diff --git a/gcc/testsuite/gcc.target/avr/torture/ifelse-c.h b/gcc/testsuite/gcc.target/avr/torture/ifelse-c.h new file mode 100644 index 00000000000..e742415d7fd --- /dev/null +++ b/gcc/testsuite/gcc.target/avr/torture/ifelse-c.h @@ -0,0 +1,99 @@ +/* Testing pass avr-ifelse with REG-CONST_INT comparisons. */ + +#define V ((T) (W)) + +#ifdef __OPTIMIZE__ + +typedef __UINT32_TYPE__ u32; +typedef __uint24 u24; +typedef __UINT16_TYPE__ u16; +typedef __UINT8_TYPE__ u8; + +typedef __INT32_TYPE__ i32; +typedef __int24 i24; +typedef __INT16_TYPE__ i16; +typedef __INT8_TYPE__ i8; + +char volatile cc; + +#define NI __attribute__((noipa)) +#define AI static __inline__ __attribute__((always_inline)) + +NI +void f (char x) +{ + cc += x; +} + +#define MK_FUN(id, cmp1, cmp2) \ +NI void fun_##id (T x) \ +{ \ + if (x cmp1) \ + goto a; \ + if (x cmp2) \ + goto b; \ + f(1); \ + a: \ + f(2); \ + b: \ + f(4); \ +} \ + \ +NI char val_##id (T x) \ +{ \ + char c = 0; \ + if (x cmp1) \ + goto a; \ + __asm ("" : "+r" (x)); \ + if (x cmp2) \ + goto b; \ + c += 1; \ + a: \ + c += 2; \ + b: \ + c += 4; \ + \ + return c; \ +} + +MK_FUN (01, > V, < V) +MK_FUN (02, < V, > V) + +MK_FUN (03, == V, <= V) +MK_FUN (04, == V, >= V) + +MK_FUN (05, == V, < V) +MK_FUN (06, == V, > V) + +MK_FUN (07, > V, == V) +MK_FUN (08, < V, == V) + +MK_FUN (09, > V, != V) +MK_FUN (10, < V, != V) + +void testA (void) +{ + for (T x = (T) (V - 2); x != (T) (V + 2); ++x) + { + cc = 0; fun_01 (x); if (cc != val_01 (x)) __builtin_exit (__LINE__); + cc = 0; fun_02 (x); if (cc != val_02 (x)) __builtin_exit (__LINE__); + cc = 0; fun_03 (x); if (cc != val_03 (x)) __builtin_exit (__LINE__); + cc = 0; fun_04 (x); if (cc != val_04 (x)) __builtin_exit (__LINE__); + cc = 0; fun_05 (x); if (cc != val_05 (x)) __builtin_exit (__LINE__); + cc = 0; fun_06 (x); if (cc != val_06 (x)) __builtin_exit (__LINE__); + cc = 0; fun_07 (x); if (cc != val_07 (x)) __builtin_exit (__LINE__); + cc = 0; fun_08 (x); if (cc != val_08 (x)) __builtin_exit (__LINE__); + cc = 0; fun_09 (x); if (cc != val_09 (x)) __builtin_exit (__LINE__); + cc = 0; fun_10 (x); if (cc != val_10 (x)) __builtin_exit (__LINE__); + } +} +#endif /* OPTIMIZE */ + +int main (void) +{ +#ifdef __OPTIMIZE__ + testA(); +#endif /* OPTIMIZE */ + + return 0; +} diff --git a/gcc/testsuite/gcc.target/avr/torture/ifelse-d-i16.c b/gcc/testsuite/gcc.target/avr/torture/ifelse-d-i16.c new file mode 100644 index 00000000000..ad13c9e45e1 --- /dev/null +++ b/gcc/testsuite/gcc.target/avr/torture/ifelse-d-i16.c @@ -0,0 +1,12 @@ +/* { dg-do run } */ +/* { dg-additional-options { "-fwrapv" "-std=c99" } } */ + +#ifndef T +#define T i16 +#endif + +#ifndef W +#define W -5 +#endif + +#include "ifelse-d.h" diff --git a/gcc/testsuite/gcc.target/avr/torture/ifelse-d-i8.c b/gcc/testsuite/gcc.target/avr/torture/ifelse-d-i8.c new file mode 100644 index 00000000000..bf2cd699ddf --- /dev/null +++ b/gcc/testsuite/gcc.target/avr/torture/ifelse-d-i8.c @@ -0,0 +1,12 @@ +/* { dg-do run } */ +/* { dg-additional-options { "-fwrapv" "-std=c99" } } */ + +#ifndef T +#define T i8 +#endif + +#ifndef W +#define W -5 +#endif + +#include "ifelse-d.h" diff --git a/gcc/testsuite/gcc.target/avr/torture/ifelse-d-u16.c b/gcc/testsuite/gcc.target/avr/torture/ifelse-d-u16.c new file mode 100644 index 00000000000..6f673be7da4 --- /dev/null +++ b/gcc/testsuite/gcc.target/avr/torture/ifelse-d-u16.c @@ -0,0 +1,12 @@ +/* { dg-do run } */ +/* { dg-additional-options { "-fwrapv" "-std=c99" } } */ + +#ifndef T +#define T u16 +#endif + +#ifndef W +#define W -5ul +#endif + +#include "ifelse-d.h" diff --git a/gcc/testsuite/gcc.target/avr/torture/ifelse-d.h b/gcc/testsuite/gcc.target/avr/torture/ifelse-d.h new file mode 100644 index 00000000000..3ff2494ea6a --- /dev/null +++ b/gcc/testsuite/gcc.target/avr/torture/ifelse-d.h @@ -0,0 +1,82 @@ +/* Testing pass avr-ifelse with REG-CONST_INT comparisons + where the 2nd insn is behind the true edge of insn1 (reverse_cond1). */ + +#define V ((T) (W)) + +#ifdef __OPTIMIZE__ + +typedef __UINT32_TYPE__ u32; +typedef __uint24 u24; +typedef __UINT16_TYPE__ u16; +typedef __UINT8_TYPE__ u8; + +typedef __INT32_TYPE__ i32; +typedef __int24 i24; +typedef __INT16_TYPE__ i16; +typedef __INT8_TYPE__ i8; + +char volatile cc; + +#define NI __attribute__((noipa)) +#define AI static __inline__ __attribute__((always_inline)) + +NI +void f (char x) +{ + cc += x; +} + +#define MK_FUN(id, cmp1, cmp2) \ +NI void fun_##id (T x) \ +{ \ + if (x cmp1) \ + f (1); \ + else if (x cmp2) \ + f (2); \ +} \ + \ +NI char val_##id (T x) \ +{ \ + char c = 0; \ + T x2 = x; \ + __asm ("" : "+r" (x2)); \ + \ + if (x cmp1) \ + c += 1; \ + else if (x2 cmp2) \ + c += 2; \ + \ + return c; \ +} + +MK_FUN (01, > V, == V) +MK_FUN (02, < V, == V) + +MK_FUN (03, == V, < V) +MK_FUN (04, == V, > V) + +MK_FUN (05, > V, != V) +MK_FUN (06, < V, != V) + +void testA (void) +{ + for (T x = (T) (V - 2); x != (T) (V + 2); ++x) + { + cc = 0; fun_01 (x); if (cc != val_01 (x)) __builtin_exit (__LINE__); + cc = 0; fun_02 (x); if (cc != val_02 (x)) __builtin_exit (__LINE__); + cc = 0; fun_03 (x); if (cc != val_03 (x)) __builtin_exit (__LINE__); + cc = 0; fun_04 (x); if (cc != val_04 (x)) __builtin_exit (__LINE__); + cc = 0; fun_05 (x); if (cc != val_05 (x)) __builtin_exit (__LINE__); + cc = 0; fun_06 (x); if (cc != val_06 (x)) __builtin_exit (__LINE__); + } +} +#endif /* OPTIMIZE */ + +int main (void) +{ +#ifdef __OPTIMIZE__ + testA(); +#endif /* OPTIMIZE */ + + return 0; +} diff --git a/gcc/testsuite/gcc.target/avr/torture/ifelse-q-i16.c b/gcc/testsuite/gcc.target/avr/torture/ifelse-q-i16.c new file mode 100644 index 00000000000..7286de8dde2 --- /dev/null +++ b/gcc/testsuite/gcc.target/avr/torture/ifelse-q-i16.c @@ -0,0 +1,12 @@ +/* { dg-do run } */ +/* { dg-additional-options { "-fwrapv" "-std=c99" } } */ + +#ifndef T +#define T i16 +#endif + +#ifndef W +#define W -5 +#endif + +#include "ifelse-q.h" diff --git a/gcc/testsuite/gcc.target/avr/torture/ifelse-q-i8.c b/gcc/testsuite/gcc.target/avr/torture/ifelse-q-i8.c new file mode 100644 index 00000000000..937e228ab56 --- /dev/null +++ b/gcc/testsuite/gcc.target/avr/torture/ifelse-q-i8.c @@ -0,0 +1,12 @@ +/* { dg-do run } */ +/* { dg-additional-options { "-fwrapv" "-std=c99" } } */ + +#ifndef T +#define T i8 +#endif + +#ifndef W +#define W -5 +#endif + +#include "ifelse-q.h" diff --git a/gcc/testsuite/gcc.target/avr/torture/ifelse-q-u16.c b/gcc/testsuite/gcc.target/avr/torture/ifelse-q-u16.c new file mode 100644 index 00000000000..c61d6a6a6ee --- /dev/null +++ b/gcc/testsuite/gcc.target/avr/torture/ifelse-q-u16.c @@ -0,0 +1,12 @@ +/* { dg-do run } */ +/* { dg-additional-options { "-fwrapv" "-std=c99" } } */ + +#ifndef T +#define T u16 +#endif + +#ifndef W +#define W -5ul +#endif + +#include "ifelse-q.h" diff --git a/gcc/testsuite/gcc.target/avr/torture/ifelse-q.h b/gcc/testsuite/gcc.target/avr/torture/ifelse-q.h new file mode 100644 index 00000000000..6dbb05990e0 --- /dev/null +++ b/gcc/testsuite/gcc.target/avr/torture/ifelse-q.h @@ -0,0 +1,92 @@ +/* Testing pass avr-ifelse with REG-REG comparisons + where the 2nd insn is behind the true edge of insn1 (reverse_cond1). */ + +#define V ((T) (W)) + +#ifdef __OPTIMIZE__ + +typedef __UINT32_TYPE__ u32; +typedef __uint24 u24; +typedef __UINT16_TYPE__ u16; +typedef __UINT8_TYPE__ u8; + +typedef __INT32_TYPE__ i32; +typedef __int24 i24; +typedef __INT16_TYPE__ i16; +typedef __INT8_TYPE__ i8; + +char volatile cc; + +#define NI __attribute__((noipa)) +#define AI static __inline__ __attribute__((always_inline)) + +NI +void f (char x) +{ + cc += x; +} + +#define MK_FUN(id, cmp1, cmp2) \ +NI void fun_##id (T x, T y) \ +{ \ + if (x cmp1 y) \ + f (1); \ + else if (x cmp2 y) \ + f (2); \ +} \ + \ +NI char val_##id (T x, T y) \ +{ \ + char c = 0; \ + T x2 = x; \ + __asm ("" : "+r" (x2)); \ + \ + if (x cmp1 y) \ + c += 1; \ + else if (x2 cmp2 y) \ + c += 2; \ + \ + return c; \ +} + +MK_FUN (01, >, <) +MK_FUN (02, <, >) + +MK_FUN (03, ==, <= ) +MK_FUN (04, ==, >= ) + +MK_FUN (05, ==, <) +MK_FUN (06, ==, >) + +MK_FUN (07, >, ==) +MK_FUN (08, <, ==) + +MK_FUN (09, >, !=) +MK_FUN (10, <, !=) + +void testA (void) +{ + for (T x = (T) (V - 2); x != (T) (V + 2); ++x) + { + cc = 0; fun_01 (x,V); if (cc != val_01 (x,V)) __builtin_exit (__LINE__); + cc = 0; fun_02 (x,V); if (cc != val_02 (x,V)) __builtin_exit (__LINE__); + cc = 0; fun_03 (x,V); if (cc != val_03 (x,V)) __builtin_exit (__LINE__); + cc = 0; fun_04 (x,V); if (cc != val_04 (x,V)) __builtin_exit (__LINE__); + cc = 0; fun_05 (x,V); if (cc != val_05 (x,V)) __builtin_exit (__LINE__); + cc = 0; fun_06 (x,V); if (cc != val_06 (x,V)) __builtin_exit (__LINE__); + cc = 0; fun_07 (x,V); if (cc != val_07 (x,V)) __builtin_exit (__LINE__); + cc = 0; fun_08 (x,V); if (cc != val_08 (x,V)) __builtin_exit (__LINE__); + cc = 0; fun_09 (x,V); if (cc != val_09 (x,V)) __builtin_exit (__LINE__); + cc = 0; fun_10 (x,V); if (cc != val_10 (x,V)) __builtin_exit (__LINE__); + } +} +#endif /* OPTIMIZE */ + +int main (void) +{ +#ifdef __OPTIMIZE__ + testA(); +#endif /* OPTIMIZE */ + + return 0; +} diff --git a/gcc/testsuite/gcc.target/avr/torture/ifelse-r-i16.c b/gcc/testsuite/gcc.target/avr/torture/ifelse-r-i16.c new file mode 100644 index 00000000000..1c0d4a94305 --- /dev/null +++ b/gcc/testsuite/gcc.target/avr/torture/ifelse-r-i16.c @@ -0,0 +1,12 @@ +/* { dg-do run } */ +/* { dg-additional-options { "-fwrapv" "-std=c99" } } */ + +#ifndef T +#define T i16 +#endif + +#ifndef W +#define W -5 +#endif + +#include "ifelse-r.h" diff --git a/gcc/testsuite/gcc.target/avr/torture/ifelse-r-i8.c b/gcc/testsuite/gcc.target/avr/torture/ifelse-r-i8.c new file mode 100644 index 00000000000..3a5a902bb69 --- /dev/null +++ b/gcc/testsuite/gcc.target/avr/torture/ifelse-r-i8.c @@ -0,0 +1,12 @@ +/* { dg-do run } */ +/* { dg-additional-options { "-fwrapv" "-std=c99" } } */ + +#ifndef T +#define T i8 +#endif + +#ifndef W +#define W -5 +#endif + +#include "ifelse-r.h" diff --git a/gcc/testsuite/gcc.target/avr/torture/ifelse-r-u16.c b/gcc/testsuite/gcc.target/avr/torture/ifelse-r-u16.c new file mode 100644 index 00000000000..2d08ebb1ff5 --- /dev/null +++ b/gcc/testsuite/gcc.target/avr/torture/ifelse-r-u16.c @@ -0,0 +1,12 @@ +/* { dg-do run } */ +/* { dg-additional-options { "-fwrapv" "-std=c99" } } */ + +#ifndef T +#define T u16 +#endif + +#ifndef W +#define W -5ul +#endif + +#include "ifelse-r.h" diff --git a/gcc/testsuite/gcc.target/avr/torture/ifelse-r.h b/gcc/testsuite/gcc.target/avr/torture/ifelse-r.h new file mode 100644 index 00000000000..19cd1cc67c4 --- /dev/null +++ b/gcc/testsuite/gcc.target/avr/torture/ifelse-r.h @@ -0,0 +1,100 @@ +/* Testing pass avr-ifelse with REG-REG comparisons. */ + +#define V ((T) (W)) + +#ifdef __OPTIMIZE__ + +typedef __UINT32_TYPE__ u32; +typedef __uint24 u24; +typedef __UINT16_TYPE__ u16; +typedef __UINT8_TYPE__ u8; + +typedef __INT32_TYPE__ i32; +typedef __int24 i24; +typedef __INT16_TYPE__ i16; +typedef __INT8_TYPE__ i8; + +char volatile cc; + +#define NI __attribute__((noipa)) +#define AI static __inline__ __attribute__((always_inline)) + +NI +void f (char x) +{ + cc += x; +} + +#define MK_FUN(id, cmp1, cmp2) \ +NI void fun_##id (T x, T y) \ +{ \ + if (x cmp1 y) \ + goto a; \ + if (x cmp2 y) \ + goto b; \ + f(1); \ + a: \ + f(2); \ + b: \ + f(4); \ +} \ + \ +NI char val_##id (T x, T y) \ +{ \ + char c = 0; \ + if (x cmp1 y) \ + goto a; \ + __asm ("" : "+r" (x)); \ + __asm ("" : "+r" (y)); \ + if (x cmp2 y) \ + goto b; \ + c += 1; \ + a: \ + c += 2; \ + b: \ + c += 4; \ + \ + return c; \ +} + +MK_FUN (01, >, <) +MK_FUN (02, <, >) + +MK_FUN (03, ==, <= ) +MK_FUN (04, ==, >= ) + +MK_FUN (05, ==, <) +MK_FUN (06, ==, >) + +MK_FUN (07, >, ==) +MK_FUN (08, <, ==) + +MK_FUN (09, >, !=) +MK_FUN (10, <, !=) + +void testA (void) +{ + for (T x = (T) (V - 2); x != (T) (V + 2); ++x) + { + cc = 0; fun_01 (x,V); if (cc != val_01 (x,V)) __builtin_exit (__LINE__); + cc = 0; fun_02 (x,V); if (cc != val_02 (x,V)) __builtin_exit (__LINE__); + cc = 0; fun_03 (x,V); if (cc != val_03 (x,V)) __builtin_exit (__LINE__); + cc = 0; fun_04 (x,V); if (cc != val_04 (x,V)) __builtin_exit (__LINE__); + cc = 0; fun_05 (x,V); if (cc != val_05 (x,V)) __builtin_exit (__LINE__); + cc = 0; fun_06 (x,V); if (cc != val_06 (x,V)) __builtin_exit (__LINE__); + cc = 0; fun_07 (x,V); if (cc != val_07 (x,V)) __builtin_exit (__LINE__); + cc = 0; fun_08 (x,V); if (cc != val_08 (x,V)) __builtin_exit (__LINE__); + cc = 0; fun_09 (x,V); if (cc != val_09 (x,V)) __builtin_exit (__LINE__); + cc = 0; fun_10 (x,V); if (cc != val_10 (x,V)) __builtin_exit (__LINE__); + } +} +#endif /* OPTIMIZE */ + +int main (void) +{ +#ifdef __OPTIMIZE__ + testA(); +#endif /* OPTIMIZE */ + + return 0; +}