Message ID | 53A49957.9040403@redhat.com |
---|---|
State | New |
Headers | show |
On 20/06/14 21:28, Richard Henderson wrote: > There aren't too many users of the cmpelim pass, and previously they were all > small embedded targets without an FPU. > > I'm a bit surprised that Ramana decided to enable this pass for aarch64, as > that target is not so limited as the block comment for the pass describes. > Honestly, whatever is being deleted here ought to have been found earlier, > either via combine or cse. We ought to find out why any changes are made > during this pass for aarch64. Agreed - Going back and looking at my notes I remember seeing a difference in code generation with the elimination of a number of compares that prompted me to turn this on in a number of benchmarks. I don't remember double checking why CSE hadn't removed that at that time. This also probably explains the equivalent patch for ARM and Thumb2 hasn't shown demonstrable differences. Investigating this pass for Thumb1 may be interesting. > > That said, this PR does demonstrate a bug in the handling of fp comparisons in > the presence of -fnon-call-exceptions, so I go ahead and fix that regardless of > what we do with the aarch64 port longer term. Fixing it properly in the pass makes sense and sorry about the breakage. I can't look into this immediately but one of us will pick this up. regards Ramana > > Bootstrap still in progress, but the original testcase is resolved. > > > r~ >
On 06/23/2014 02:29 AM, Ramana Radhakrishnan wrote: > > > On 20/06/14 21:28, Richard Henderson wrote: >> There aren't too many users of the cmpelim pass, and previously they were all >> small embedded targets without an FPU. >> >> I'm a bit surprised that Ramana decided to enable this pass for aarch64, as >> that target is not so limited as the block comment for the pass describes. >> Honestly, whatever is being deleted here ought to have been found earlier, >> either via combine or cse. We ought to find out why any changes are made >> during this pass for aarch64. > > Agreed - Going back and looking at my notes I remember seeing a difference in > code generation with the elimination of a number of compares that prompted me > to turn this on in a number of benchmarks. I don't remember double checking why > CSE hadn't removed that at that time. This also probably explains the > equivalent patch for ARM and Thumb2 hasn't shown demonstrable differences. > > Investigating this pass for Thumb1 may be interesting. Isn't it true that thumb1 has only "adds r,r,#i" not "add r,r,#i"? Lack of an addition that doesn't clobber the flags is one of the two reasons why you'd want to enable the cmpelim pass. Although for thumb1, can't you achieve a no-clobber addition with an IT block with an "always" condition? Seems like a good use for the "addptr" pattern that Andreas Krebbel recently added for s390. That might let you make thumb1 look more like the rest of the arm port and emit separate compare and branch instructions from the start. r~
On 23/06/14 15:01, Richard Henderson wrote: > On 06/23/2014 02:29 AM, Ramana Radhakrishnan wrote: >> >> >> On 20/06/14 21:28, Richard Henderson wrote: >>> There aren't too many users of the cmpelim pass, and previously they were all >>> small embedded targets without an FPU. >>> >>> I'm a bit surprised that Ramana decided to enable this pass for aarch64, as >>> that target is not so limited as the block comment for the pass describes. >>> Honestly, whatever is being deleted here ought to have been found earlier, >>> either via combine or cse. We ought to find out why any changes are made >>> during this pass for aarch64. >> >> Agreed - Going back and looking at my notes I remember seeing a difference in >> code generation with the elimination of a number of compares that prompted me >> to turn this on in a number of benchmarks. I don't remember double checking why >> CSE hadn't removed that at that time. This also probably explains the >> equivalent patch for ARM and Thumb2 hasn't shown demonstrable differences. >> >> Investigating this pass for Thumb1 may be interesting. > > Isn't it true that thumb1 has only "adds r,r,#i" not "add r,r,#i"? > Lack of an addition that doesn't clobber the flags is one of the two > reasons why you'd want to enable the cmpelim pass. Agreed, this is why cmpelim looks interesting for Thumb1. (We may need another hook or something to disable it in configurations we don't need it in, but you know ... ) > > Although for thumb1, can't you achieve a no-clobber addition with an > IT block with an "always" condition? Except that IT instructions aren't in Thumb1 or indeed v6-m :(. > Seems like a good use for the "addptr" > pattern that Andreas Krebbel recently added for s390. That might let you > make thumb1 look more like the rest of the arm port and emit separate compare > and branch instructions from the start. > regards Ramana > > r~ >
On 06/23/2014 08:55 AM, Ramana Radhakrishnan wrote: > Agreed, this is why cmpelim looks interesting for Thumb1. (We may need another > hook or something to disable it in configurations we don't need it in, but you > know ... ) Yeah. Feel free to change targetm.flags_regnum from a variable to a function, when you get around to it. > Except that IT instructions aren't in Thumb1 or indeed v6-m :(. Ah, yes, of course. r~
diff --git a/gcc/compare-elim.c b/gcc/compare-elim.c index 2fbb75b..4ecdd48 100644 --- a/gcc/compare-elim.c +++ b/gcc/compare-elim.c @@ -100,6 +100,9 @@ struct comparison constants. */ rtx in_a, in_b; + /* The REG_EH_REGION of the comparison. */ + rtx eh_note; + /* Information about how this comparison is used. */ struct comparison_use uses[MAX_CMP_USE]; @@ -262,6 +265,7 @@ find_comparison_dom_walker::before_dom_children (basic_block bb) struct comparison *last_cmp; rtx insn, next, last_clobber; bool last_cmp_valid; + bool need_purge = false; bitmap killed; killed = BITMAP_ALLOC (NULL); @@ -303,44 +307,60 @@ find_comparison_dom_walker::before_dom_children (basic_block bb) if (src) { enum machine_mode src_mode = GET_MODE (src); + rtx eh_note = NULL; - /* Eliminate a compare that's redundant with the previous. */ - if (last_cmp_valid - && rtx_equal_p (last_cmp->in_a, XEXP (src, 0)) - && rtx_equal_p (last_cmp->in_b, XEXP (src, 1))) - { - rtx flags, x; - enum machine_mode new_mode - = targetm.cc_modes_compatible (last_cmp->orig_mode, src_mode); + if (flag_non_call_exceptions) + eh_note = find_reg_note (insn, REG_EH_REGION, NULL); - /* New mode is incompatible with the previous compare mode. */ - if (new_mode == VOIDmode) - continue; + if (!last_cmp_valid) + goto dont_delete; - if (new_mode != last_cmp->orig_mode) - { - flags = gen_rtx_REG (src_mode, targetm.flags_regnum); + /* Take care that it's in the same EH region. */ + if (flag_non_call_exceptions + && !rtx_equal_p (eh_note, last_cmp->eh_note)) + goto dont_delete; - /* Generate new comparison for substitution. */ - x = gen_rtx_COMPARE (new_mode, XEXP (src, 0), XEXP (src, 1)); - x = gen_rtx_SET (VOIDmode, flags, x); + /* Make sure the compare is redundant with the previous. */ + if (!rtx_equal_p (last_cmp->in_a, XEXP (src, 0)) + || !rtx_equal_p (last_cmp->in_b, XEXP (src, 1))) + goto dont_delete; - if (!validate_change (last_cmp->insn, - &PATTERN (last_cmp->insn), x, false)) - continue; + /* New mode must be compatible with the previous compare mode. */ + { + enum machine_mode new_mode + = targetm.cc_modes_compatible (last_cmp->orig_mode, src_mode); + if (new_mode == VOIDmode) + goto dont_delete; - last_cmp->orig_mode = new_mode; - } + if (new_mode != last_cmp->orig_mode) + { + rtx x, flags = gen_rtx_REG (src_mode, targetm.flags_regnum); - delete_insn (insn); - continue; - } + /* Generate new comparison for substitution. */ + x = gen_rtx_COMPARE (new_mode, XEXP (src, 0), XEXP (src, 1)); + x = gen_rtx_SET (VOIDmode, flags, x); + if (!validate_change (last_cmp->insn, + &PATTERN (last_cmp->insn), x, false)) + goto dont_delete; + + last_cmp->orig_mode = new_mode; + } + } + + /* All tests and substitutions succeeded! */ + if (eh_note) + need_purge = true; + delete_insn (insn); + continue; + + dont_delete: last_cmp = XCNEW (struct comparison); last_cmp->insn = insn; last_cmp->prev_clobber = last_clobber; last_cmp->in_a = XEXP (src, 0); last_cmp->in_b = XEXP (src, 1); + last_cmp->eh_note = eh_note; last_cmp->orig_mode = src_mode; all_compares.safe_push (last_cmp); @@ -404,6 +424,11 @@ find_comparison_dom_walker::before_dom_children (basic_block bb) } } } + + /* If we deleted a compare with a REG_EH_REGION note, we may need to + remove EH edges. */ + if (need_purge) + purge_dead_edges (bb); } /* Find all comparisons in the function. */