diff mbox

[RFA] Fix invalid redundant extension elimination for rl78 port

Message ID 56537379.4030101@redhat.com
State New
Headers show

Commit Message

Jeff Law Nov. 23, 2015, 8:13 p.m. UTC
The core analysis was from Nick.  Essentially:


     (insn  44 (set (reg:QI r11) (mem:QI (reg:HI r20)))
     (insn  45 (set (reg:QI r10) (mem:QI (reg:HI r18)))
     [...]
     (insn  71 (set (reg:HI r14) (zero_extend:HI (reg:QI r11)))
     [...]
     (insn  88 (set (reg:HI r10) (zero_extend:HI (reg:QI r10)))

   (This is on the RL78 target where HImode values occupy two hard
   registers and QImode values only one.  The bug however is generic, not
   RL78 specific).

   The REE pass transforms this into:

     (insn  44 (set (reg:QI r11) (mem:QI (reg:HI r20)))
     (insn  45 (set (reg:HI r10) (zero_extend:HI (mem:QI (reg:HI r18))))
     [...]
     (insn  71 (set (reg:HI r14) (zero_extend:HI (reg:QI r11)))
     [...]
     (insn  88 deleted)

   Note how the new set at insn 45 clobbers the value loaded by insn 44
   into r11.  Thus when we use the value in insn 71, we're using the
   wrong value.


Nick had a more complex patch which tried to determine if the additional 
hard registers were used/set.  But the implementation was flawed in that 
it assumed the use succeeded the def in the linear insn chain, which is 
an invalid assumption in general.  For this to work what we'd really 
have to do is note all the blocks through which there's a path from the 
def to the use, then check for uses/sets within all those blocks.

Given this scenario is quite rare, it doesn't seem worth the effort. 
Even with an abort in the codepath, I can't get it to trigger during 
normal x86_64 or rl78 builds.  It only triggers on the rl78 with -O1 -free.

As I mentioned in a prior message on the subject, this is only a problem 
when the source/dest of the extension are the same.  When the 
source/dest of the extension are different, we only optimize when the 
original set and extension are in the same block and we verify that all 
affected registers are not set/used between the original set and the 
extension.
Bootstrapped and regression tested on x86_64-linux-gnu.  Also tested 
execute.exp on rl78 with no regressions.

I didn't include a distinct testcase as these are covered by pr42833 and 
strct-stdarg-1.c -- but only when those are run with -O1 -free.  I can 
certainly add a -free test for those tests if folks want.

I took this opportunity to also remove a block of #if 0'd code that I 
had in place for this situation, but had been unable to trigger.  I 
prefer Nick's location for the test.

Ok for the trunk?



Jeff

Comments

Bernd Schmidt Nov. 23, 2015, 11:39 p.m. UTC | #1
> As I mentioned in a prior message on the subject, this is only a problem
> when the source/dest of the extension are the same.  When the
> source/dest of the extension are different, we only optimize when the
> original set and extension are in the same block and we verify that all
> affected registers are not set/used between the original set and the
> extension.
> Bootstrapped and regression tested on x86_64-linux-gnu.  Also tested
> execute.exp on rl78 with no regressions.

Ok.


Bernd
Richard Sandiford Dec. 1, 2015, 7:32 p.m. UTC | #2
Jeff Law <law@redhat.com> writes:
> @@ -1080,6 +1070,18 @@ add_removable_extension (const_rtx expr, rtx_insn *insn,
>  	      }
>  	  }
>  
> +      /* Fourth, if the extended version occupies more registers than the
> +	 original and the source of the extension is the same hard register
> +	 as the destination of the extension, then we can not eliminate
> +	 the extension without deep analysis, so just punt.
> +
> +	 We allow this when the registers are different because the
> +	 code in combine_reaching_defs will handle that case correctly.  */
> +      if ((HARD_REGNO_NREGS (REGNO (dest), mode)
> +	   != HARD_REGNO_NREGS (REGNO (reg), GET_MODE (reg)))
> +	  && REGNO (dest) == REGNO (reg))
> +	return;
> +
>        /* Then add the candidate to the list and insert the reaching definitions
>           into the definition map.  */
>        ext_cand e = {expr, code, mode, insn};

I might be wrong, but the check looks specific to little-endian.  Would
it make sense to use reg_overlap_mentioned_p instead of the REGNO check?

Thanks,
Richard
Jeff Law Dec. 16, 2015, 7:09 p.m. UTC | #3
On 12/01/2015 12:32 PM, Richard Sandiford wrote:
> Jeff Law <law@redhat.com> writes:
>> @@ -1080,6 +1070,18 @@ add_removable_extension (const_rtx expr, rtx_insn *insn,
>>   	      }
>>   	  }
>>
>> +      /* Fourth, if the extended version occupies more registers than the
>> +	 original and the source of the extension is the same hard register
>> +	 as the destination of the extension, then we can not eliminate
>> +	 the extension without deep analysis, so just punt.
>> +
>> +	 We allow this when the registers are different because the
>> +	 code in combine_reaching_defs will handle that case correctly.  */
>> +      if ((HARD_REGNO_NREGS (REGNO (dest), mode)
>> +	   != HARD_REGNO_NREGS (REGNO (reg), GET_MODE (reg)))
>> +	  && REGNO (dest) == REGNO (reg))
>> +	return;
>> +
>>         /* Then add the candidate to the list and insert the reaching definitions
>>            into the definition map.  */
>>         ext_cand e = {expr, code, mode, insn};
>
> I might be wrong, but the check looks specific to little-endian.  Would
> it make sense to use reg_overlap_mentioned_p instead of the REGNO check?
Agreed.  Testing in progress now...

jeff
diff mbox

Patch

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index e560746..29ed4e4 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,11 @@ 
+2015-11-18  Nick Clifton  <nickc@redhat.com>
+	    Jeff Law  <law@redhat.com>
+
+	* ree.c (add_removable_extension): Avoid mis-optimizing cases where
+	the source/dest of the target extension require a different number of
+	hard registers.
+	(combine_set_extension): Remove #if 0 code.
+
 2015-11-20  Jim Wilson  <jim.wilson@linaro.org>
 
 	* tree-vect-data-refs.c (compare_tree): Call STRIP_NOPS.
diff --git a/gcc/ree.c b/gcc/ree.c
index b8436f2..f3b79e0 100644
--- a/gcc/ree.c
+++ b/gcc/ree.c
@@ -332,16 +332,6 @@  combine_set_extension (ext_cand *cand, rtx_insn *curr_insn, rtx *orig_set)
   else
     new_reg = gen_rtx_REG (cand->mode, REGNO (SET_DEST (*orig_set)));
 
-#if 0
-  /* Rethinking test.  Temporarily disabled.  */
-  /* We're going to be widening the result of DEF_INSN, ensure that doing so
-     doesn't change the number of hard registers needed for the result.  */
-  if (HARD_REGNO_NREGS (REGNO (new_reg), cand->mode)
-      != HARD_REGNO_NREGS (REGNO (SET_DEST (*orig_set)),
-			   GET_MODE (SET_DEST (*orig_set))))
-	return false;
-#endif
-
   /* Merge constants by directly moving the constant into the register under
      some conditions.  Recall that RTL constants are sign-extended.  */
   if (GET_CODE (orig_src) == CONST_INT
@@ -1080,6 +1070,18 @@  add_removable_extension (const_rtx expr, rtx_insn *insn,
 	      }
 	  }
 
+      /* Fourth, if the extended version occupies more registers than the
+	 original and the source of the extension is the same hard register
+	 as the destination of the extension, then we can not eliminate
+	 the extension without deep analysis, so just punt.
+
+	 We allow this when the registers are different because the
+	 code in combine_reaching_defs will handle that case correctly.  */
+      if ((HARD_REGNO_NREGS (REGNO (dest), mode)
+	   != HARD_REGNO_NREGS (REGNO (reg), GET_MODE (reg)))
+	  && REGNO (dest) == REGNO (reg))
+	return;
+
       /* Then add the candidate to the list and insert the reaching definitions
          into the definition map.  */
       ext_cand e = {expr, code, mode, insn};