diff mbox

Fix 61565 -- cmpelim vs non-call exceptions

Message ID 53A49957.9040403@redhat.com
State New
Headers show

Commit Message

Richard Henderson June 20, 2014, 8:28 p.m. UTC
There aren't too many users of the cmpelim pass, and previously they were all
small embedded targets without an FPU.

I'm a bit surprised that Ramana decided to enable this pass for aarch64, as
that target is not so limited as the block comment for the pass describes.
Honestly, whatever is being deleted here ought to have been found earlier,
either via combine or cse.  We ought to find out why any changes are made
during this pass for aarch64.

That said, this PR does demonstrate a bug in the handling of fp comparisons in
the presence of -fnon-call-exceptions, so I go ahead and fix that regardless of
what we do with the aarch64 port longer term.

Bootstrap still in progress, but the original testcase is resolved.


r~
* compare-elim.c (struct comparison): Add eh_note.
	(find_comparison_dom_walker::before_dom_children): Don't eliminate
	a redundant comparison in a different EH region.  Purge EH edges if
	necessary.

Comments

Ramana Radhakrishnan June 23, 2014, 9:29 a.m. UTC | #1
On 20/06/14 21:28, Richard Henderson wrote:
> There aren't too many users of the cmpelim pass, and previously they were all
> small embedded targets without an FPU.
>
> I'm a bit surprised that Ramana decided to enable this pass for aarch64, as
> that target is not so limited as the block comment for the pass describes.
> Honestly, whatever is being deleted here ought to have been found earlier,
> either via combine or cse.  We ought to find out why any changes are made
> during this pass for aarch64.

Agreed - Going back and looking at my notes I remember seeing a 
difference in code generation with the elimination of a number of 
compares that prompted me to turn this on in a number of benchmarks. I 
don't remember double checking why CSE hadn't removed that at that time. 
This also probably explains the equivalent patch for ARM and Thumb2 
hasn't shown demonstrable differences.

Investigating this pass for Thumb1 may be interesting.


>
> That said, this PR does demonstrate a bug in the handling of fp comparisons in
> the presence of -fnon-call-exceptions, so I go ahead and fix that regardless of
> what we do with the aarch64 port longer term.

Fixing it properly in the pass makes sense and sorry about the breakage.

I can't look into this immediately but one of us will pick this up.


regards
Ramana

>
> Bootstrap still in progress, but the original testcase is resolved.
>
>
> r~
>
Richard Henderson June 23, 2014, 2:01 p.m. UTC | #2
On 06/23/2014 02:29 AM, Ramana Radhakrishnan wrote:
> 
> 
> On 20/06/14 21:28, Richard Henderson wrote:
>> There aren't too many users of the cmpelim pass, and previously they were all
>> small embedded targets without an FPU.
>>
>> I'm a bit surprised that Ramana decided to enable this pass for aarch64, as
>> that target is not so limited as the block comment for the pass describes.
>> Honestly, whatever is being deleted here ought to have been found earlier,
>> either via combine or cse.  We ought to find out why any changes are made
>> during this pass for aarch64.
> 
> Agreed - Going back and looking at my notes I remember seeing a difference in
> code generation with the elimination of a number of compares that prompted me
> to turn this on in a number of benchmarks. I don't remember double checking why
> CSE hadn't removed that at that time. This also probably explains the
> equivalent patch for ARM and Thumb2 hasn't shown demonstrable differences.
> 
> Investigating this pass for Thumb1 may be interesting.

Isn't it true that thumb1 has only "adds r,r,#i" not "add r,r,#i"?
Lack of an addition that doesn't clobber the flags is one of the two
reasons why you'd want to enable the cmpelim pass.

Although for thumb1, can't you achieve a no-clobber addition with an
IT block with an "always" condition?  Seems like a good use for the "addptr"
pattern that Andreas Krebbel recently added for s390.  That might let you
make thumb1 look more like the rest of the arm port and emit separate compare
and branch instructions from the start.


r~
Ramana Radhakrishnan June 23, 2014, 3:55 p.m. UTC | #3
On 23/06/14 15:01, Richard Henderson wrote:
> On 06/23/2014 02:29 AM, Ramana Radhakrishnan wrote:
>>
>>
>> On 20/06/14 21:28, Richard Henderson wrote:
>>> There aren't too many users of the cmpelim pass, and previously they were all
>>> small embedded targets without an FPU.
>>>
>>> I'm a bit surprised that Ramana decided to enable this pass for aarch64, as
>>> that target is not so limited as the block comment for the pass describes.
>>> Honestly, whatever is being deleted here ought to have been found earlier,
>>> either via combine or cse.  We ought to find out why any changes are made
>>> during this pass for aarch64.
>>
>> Agreed - Going back and looking at my notes I remember seeing a difference in
>> code generation with the elimination of a number of compares that prompted me
>> to turn this on in a number of benchmarks. I don't remember double checking why
>> CSE hadn't removed that at that time. This also probably explains the
>> equivalent patch for ARM and Thumb2 hasn't shown demonstrable differences.
>>
>> Investigating this pass for Thumb1 may be interesting.
>
> Isn't it true that thumb1 has only "adds r,r,#i" not "add r,r,#i"?
> Lack of an addition that doesn't clobber the flags is one of the two
> reasons why you'd want to enable the cmpelim pass.


Agreed, this is why cmpelim looks interesting for Thumb1. (We may need 
another hook or something to disable it in configurations we don't need 
it in, but you know ... )

>
> Although for thumb1, can't you achieve a no-clobber addition with an
> IT block with an "always" condition?

Except that IT instructions aren't in Thumb1 or indeed v6-m :(.


> Seems like a good use for the "addptr"
> pattern that Andreas Krebbel recently added for s390.  That might let you
> make thumb1 look more like the rest of the arm port and emit separate compare
> and branch instructions from the start.
>


regards
Ramana

>
> r~
>
Richard Henderson June 23, 2014, 6:03 p.m. UTC | #4
On 06/23/2014 08:55 AM, Ramana Radhakrishnan wrote:
> Agreed, this is why cmpelim looks interesting for Thumb1. (We may need another
> hook or something to disable it in configurations we don't need it in, but you
> know ... )

Yeah.  Feel free to change targetm.flags_regnum from a variable to a function,
when you get around to it.

> Except that IT instructions aren't in Thumb1 or indeed v6-m :(.

Ah, yes, of course.


r~
diff mbox

Patch

diff --git a/gcc/compare-elim.c b/gcc/compare-elim.c
index 2fbb75b..4ecdd48 100644
--- a/gcc/compare-elim.c
+++ b/gcc/compare-elim.c
@@ -100,6 +100,9 @@  struct comparison
      constants.  */
   rtx in_a, in_b;
 
+  /* The REG_EH_REGION of the comparison.  */
+  rtx eh_note;
+
   /* Information about how this comparison is used.  */
   struct comparison_use uses[MAX_CMP_USE];
 
@@ -262,6 +265,7 @@  find_comparison_dom_walker::before_dom_children (basic_block bb)
   struct comparison *last_cmp;
   rtx insn, next, last_clobber;
   bool last_cmp_valid;
+  bool need_purge = false;
   bitmap killed;
 
   killed = BITMAP_ALLOC (NULL);
@@ -303,44 +307,60 @@  find_comparison_dom_walker::before_dom_children (basic_block bb)
       if (src)
 	{
 	  enum machine_mode src_mode = GET_MODE (src);
+	  rtx eh_note = NULL;
 
-	  /* Eliminate a compare that's redundant with the previous.  */
-	  if (last_cmp_valid
-	      && rtx_equal_p (last_cmp->in_a, XEXP (src, 0))
-	      && rtx_equal_p (last_cmp->in_b, XEXP (src, 1)))
-	    {
-	      rtx flags, x;
-	      enum machine_mode new_mode
-		= targetm.cc_modes_compatible (last_cmp->orig_mode, src_mode);
+	  if (flag_non_call_exceptions)
+	    eh_note = find_reg_note (insn, REG_EH_REGION, NULL);
 
-	      /* New mode is incompatible with the previous compare mode.  */
-	      if (new_mode == VOIDmode)
-		continue;
+	  if (!last_cmp_valid)
+	    goto dont_delete;
 
-	      if (new_mode != last_cmp->orig_mode)
-		{
-		  flags = gen_rtx_REG (src_mode, targetm.flags_regnum);
+	  /* Take care that it's in the same EH region.  */
+	  if (flag_non_call_exceptions
+	      && !rtx_equal_p (eh_note, last_cmp->eh_note))
+	    goto dont_delete;
 
-		  /* Generate new comparison for substitution.  */
-		  x = gen_rtx_COMPARE (new_mode, XEXP (src, 0), XEXP (src, 1));
-		  x = gen_rtx_SET (VOIDmode, flags, x);
+	  /* Make sure the compare is redundant with the previous.  */
+	  if (!rtx_equal_p (last_cmp->in_a, XEXP (src, 0))
+	      || !rtx_equal_p (last_cmp->in_b, XEXP (src, 1)))
+	    goto dont_delete;
 
-		  if (!validate_change (last_cmp->insn,
-					&PATTERN (last_cmp->insn), x, false))
-		    continue;
+	  /* New mode must be compatible with the previous compare mode.  */
+	  {
+	    enum machine_mode new_mode
+	      = targetm.cc_modes_compatible (last_cmp->orig_mode, src_mode);
+	    if (new_mode == VOIDmode)
+	      goto dont_delete;
 
-		  last_cmp->orig_mode = new_mode;
-		}
+	    if (new_mode != last_cmp->orig_mode)
+	      {
+		rtx x, flags = gen_rtx_REG (src_mode, targetm.flags_regnum);
 
-	      delete_insn (insn);
-	      continue;
-	    }
+		/* Generate new comparison for substitution.  */
+		x = gen_rtx_COMPARE (new_mode, XEXP (src, 0), XEXP (src, 1));
+		x = gen_rtx_SET (VOIDmode, flags, x);
 
+		if (!validate_change (last_cmp->insn,
+				      &PATTERN (last_cmp->insn), x, false))
+		  goto dont_delete;
+
+		last_cmp->orig_mode = new_mode;
+	      }
+	  }
+
+	  /* All tests and substitutions succeeded!  */
+	  if (eh_note)
+	    need_purge = true;
+	  delete_insn (insn);
+	  continue;
+
+	dont_delete:
 	  last_cmp = XCNEW (struct comparison);
 	  last_cmp->insn = insn;
 	  last_cmp->prev_clobber = last_clobber;
 	  last_cmp->in_a = XEXP (src, 0);
 	  last_cmp->in_b = XEXP (src, 1);
+	  last_cmp->eh_note = eh_note;
 	  last_cmp->orig_mode = src_mode;
 	  all_compares.safe_push (last_cmp);
 
@@ -404,6 +424,11 @@  find_comparison_dom_walker::before_dom_children (basic_block bb)
 	    }
 	}
     }
+
+  /* If we deleted a compare with a REG_EH_REGION note, we may need to
+     remove EH edges.  */
+  if (need_purge)
+    purge_dead_edges (bb);
 }
 
 /* Find all comparisons in the function.  */