diff mbox

[ARM] correctly encode the CC reg data flow

Message ID AM4PR0701MB216294C83C28945778C84345E4780@AM4PR0701MB2162.eurprd07.prod.outlook.com
State New
Headers show

Commit Message

Bernd Edlinger Jan. 13, 2017, 6:28 p.m. UTC
On 01/13/17 17:10, Bernd Edlinger wrote:
> On 01/13/17 14:50, Richard Earnshaw (lists) wrote:
>> On 18/12/16 12:58, Bernd Edlinger wrote:
>>> Hi,
>>>
>>> this is related to PR77308, the follow-up patch will depend on this one.
>>>
>>> When trying the split the *arm_cmpdi_insn and *arm_cmpdi_unsigned
>>> before reload, a mis-compilation in libgcc function __gnu_satfractdasq
>>> was discovered, see [1] for more details.
>>>
>>> The reason seems to be that when the *arm_cmpdi_insn is directly
>>> followed by a *arm_cmpdi_unsigned instruction, both are split
>>> up into this:
>>>
>>>    [(set (reg:CC CC_REGNUM)
>>>          (compare:CC (match_dup 0) (match_dup 1)))
>>>     (parallel [(set (reg:CC CC_REGNUM)
>>>                     (compare:CC (match_dup 3) (match_dup 4)))
>>>                (set (match_dup 2)
>>>                     (minus:SI (match_dup 5)
>>>                              (ltu:SI (reg:CC_C CC_REGNUM) (const_int
>>> 0))))])]
>>>
>>>    [(set (reg:CC CC_REGNUM)
>>>          (compare:CC (match_dup 2) (match_dup 3)))
>>>     (cond_exec (eq:SI (reg:CC CC_REGNUM) (const_int 0))
>>>                (set (reg:CC CC_REGNUM)
>>>                     (compare:CC (match_dup 0) (match_dup 1))))]
>>>
>>> The problem is that the reg:CC from the *subsi3_carryin_compare
>>> is not mentioning that the reg:CC is also dependent on the reg:CC
>>> from before.  Therefore the *arm_cmpsi_insn appears to be
>>> redundant and thus got removed, because the data values are identical.
>>>
>>> I think that applies to a number of similar pattern where data
>>> flow is happening through the CC reg.
>>>
>>> So this is a kind of correctness issue, and should be fixed
>>> independently from the optimization issue PR77308.
>>>
>>> Therefore I think the patterns need to specify the true
>>> value that will be in the CC reg, in order for cse to
>>> know what the instructions are really doing.
>>>
>>>
>>> Bootstrapped and reg-tested on arm-linux-gnueabihf.
>>> Is it OK for trunk?
>>>
>>
>> I agree you've found a valid problem here, but I have some issues with
>> the patch itself.
>>
>>
>> (define_insn_and_split "subdi3_compare1"
>>   [(set (reg:CC_NCV CC_REGNUM)
>>     (compare:CC_NCV
>>       (match_operand:DI 1 "register_operand" "r")
>>       (match_operand:DI 2 "register_operand" "r")))
>>    (set (match_operand:DI 0 "register_operand" "=&r")
>>     (minus:DI (match_dup 1) (match_dup 2)))]
>>   "TARGET_32BIT"
>>   "#"
>>   "&& reload_completed"
>>   [(parallel [(set (reg:CC CC_REGNUM)
>>            (compare:CC (match_dup 1) (match_dup 2)))
>>           (set (match_dup 0) (minus:SI (match_dup 1) (match_dup 2)))])
>>    (parallel [(set (reg:CC_C CC_REGNUM)
>>            (compare:CC_C
>>              (zero_extend:DI (match_dup 4))
>>              (plus:DI (zero_extend:DI (match_dup 5))
>>                   (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0)))))
>>           (set (match_dup 3)
>>            (minus:SI (minus:SI (match_dup 4) (match_dup 5))
>>                  (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))])]
>>
>>
>> This pattern is now no-longer self consistent in that before the split
>> the overall result for the condition register is in mode CC_NCV, but
>> afterwards it is just CC_C.
>>
>> I think CC_NCV is correct mode (the N, C and V bits all correctly
>> reflect the result of the 64-bit comparison), but that then implies that
>> the cc mode of subsi3_carryin_compare is incorrect as well and should in
>> fact also be CC_NCV.  Thinking about this pattern, I'm inclined to agree
>> that CC_NCV is the correct mode for this operation
>>
>> I'm not sure if there are other consequences that will fall out from
>> fixing this (it's possible that we might need a change to select_cc_mode
>> as well).
>>
>
> Yes, this is still a bit awkward...
>
> The N and V bit will be the correct result for the subdi3_compare1
> a 64-bit comparison, but zero_extend:DI (match_dup 4) (plus:DI ...)
> only gets the C bit correct, the expression for N and V is a different
> one.
>
> It probably works, because the subsi3_carryin_compare instruction sets
> more CC bits than the pattern does explicitly specify the value.
> We know the subsi3_carryin_compare also computes the NV bits, but it is
> hard to write down the correct rtl expression for it.
>
> In theory the pattern should describe everything correctly,
> maybe, like:
>
> set (reg:CC_C CC_REGNUM)
>     (compare:CC_C
>       (zero_extend:DI (match_dup 4))
>       (plus:DI (zero_extend:DI (match_dup 5))
>                (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0)))))
> set (reg:CC_NV CC_REGNUM)
>     (compare:CC_NV
>      (match_dup 4))
>      (plus:SI (match_dup 5) (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0)))
> set (match_dup 3)
>     (minus:SI (minus:SI (match_dup 4) (match_dup 5))
>               (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0)))))
>
>
> But I doubt that will work to set CC_REGNUM with two different modes
> in parallel?
>
> Another idea would be to invent a CC_CNV_NOOV mode, that implicitly
> defines C from the DImode result, and NV from the SImode result,
> similar to the CC_NOOVmode, that also leaves something open what
> bits it really defines?
>
>
> What do you think?
>
>
> Thanks
> Bernd.

I think maybe the right solution is to invent a new CCmode
that defines C as if the comparison is done in DImode
but N and V as if the comparison is done in SImode.

I thought maybe I would call it CC_NCV_CIC (CIC = Carry-In-Compare),
furthermore I think the CC_NOOV should be renamed to CC_NZ (because
only N and Z are set correctly), but in a different patch of course.

Attached is a new version that implements the new CCmode.

How do you like this new version?

It seems to be able to build a cross-compiler at least.

I will start a new bootstrap with this new patch, but that can take some
time until I have definitive results.

Is there still a chance that it can go into gcc-7 or should it wait
for the next stage1?

Thanks
Bernd.
diff mbox

Patch

2016-01-13  Bernd Edlinger  <bernd.edlinger@hotmail.de>

	PR target/77308
	* config/arm/arm-modes.def (CC_NCV_CIC): New mode.
	* config/arm/arm.md (adddi3_compareV, *addsi3_compareV_upper,
	adddi3_compareC, *addsi3_compareC_upper, subdi3_compare1,
	subsi3_carryin_compare, subsi3_carryin_compare_const,
	negdi2_compare, *negsi2_carryin_compare,
	*arm_cmpdi_insn): Fix the CC reg dataflow.

Index: gcc/config/arm/arm-modes.def
===================================================================
--- gcc/config/arm/arm-modes.def	(revision 244439)
+++ gcc/config/arm/arm-modes.def	(working copy)
@@ -38,6 +38,8 @@ 
    (used for DImode unsigned comparisons).
    CC_NCVmode should be used if only the N, C, and V flags are correct
    (used for DImode signed comparisons).
+   CC_NCV_CICmode defines N and V in SImode and C in DImode
+   (used for carryin_compare patterns).
    CCmode should be used otherwise.  */
 
 CC_MODE (CC_NOOV);
@@ -44,6 +46,7 @@ 
 CC_MODE (CC_Z);
 CC_MODE (CC_CZ);
 CC_MODE (CC_NCV);
+CC_MODE (CC_NCV_CIC);
 CC_MODE (CC_SWP);
 CC_MODE (CCFP);
 CC_MODE (CCFPE);
Index: gcc/config/arm/arm.md
===================================================================
--- gcc/config/arm/arm.md	(revision 244439)
+++ gcc/config/arm/arm.md	(working copy)
@@ -669,17 +669,15 @@ 
 	      (set (match_dup 0) (plus:SI (match_dup 1) (match_dup 2)))])
    (parallel [(set (reg:CC_V CC_REGNUM)
 		   (ne:CC_V
-		    (plus:DI (plus:DI
-			      (sign_extend:DI (match_dup 4))
-			      (sign_extend:DI (match_dup 5)))
-			     (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0)))
-		    (plus:DI (sign_extend:DI
-			      (plus:SI (match_dup 4) (match_dup 5)))
-			     (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0)))))
-	     (set (match_dup 3) (plus:SI (plus:SI
-					  (match_dup 4) (match_dup 5))
-					 (ltu:SI (reg:CC_C CC_REGNUM)
-						 (const_int 0))))])]
+		     (plus:DI (plus:DI (sign_extend:DI (match_dup 4))
+				       (sign_extend:DI (match_dup 5)))
+			      (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0)))
+		    (sign_extend:DI
+		      (plus:SI (plus:SI (match_dup 4) (match_dup 5))
+			       (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))))
+	      (set (match_dup 3) (plus:SI (plus:SI (match_dup 4) (match_dup 5))
+					  (ltu:SI (reg:CC_C CC_REGNUM)
+						  (const_int 0))))])]
   "
   {
     operands[3] = gen_highpart (SImode, operands[0]);
@@ -713,13 +711,13 @@ 
   [(set (reg:CC_V CC_REGNUM)
 	(ne:CC_V
 	  (plus:DI
-	   (plus:DI
-	    (sign_extend:DI (match_operand:SI 1 "register_operand" "r"))
-	    (sign_extend:DI (match_operand:SI 2 "register_operand" "r")))
-	   (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0)))
-	  (plus:DI (sign_extend:DI
-		    (plus:SI (match_dup 1) (match_dup 2)))
-		   (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0)))))
+	    (plus:DI
+	      (sign_extend:DI (match_operand:SI 1 "register_operand" "r"))
+	      (sign_extend:DI (match_operand:SI 2 "register_operand" "r")))
+	    (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0)))
+	  (sign_extend:DI (plus:SI (plus:SI (match_dup 1) (match_dup 2))
+				   (ltu:SI (reg:CC_C CC_REGNUM)
+					   (const_int 0))))))
    (set (match_operand:SI 0 "register_operand" "=r")
 	(plus:SI
 	 (plus:SI (match_dup 1) (match_dup 2))
@@ -748,17 +746,15 @@ 
 	      (set (match_dup 0) (plus:SI (match_dup 1) (match_dup 2)))])
    (parallel [(set (reg:CC_C CC_REGNUM)
 		   (ne:CC_C
-		    (plus:DI (plus:DI
-			      (zero_extend:DI (match_dup 4))
-			      (zero_extend:DI (match_dup 5)))
-			     (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0)))
-		    (plus:DI (zero_extend:DI
-			      (plus:SI (match_dup 4) (match_dup 5)))
-			     (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0)))))
-	     (set (match_dup 3) (plus:SI
-				 (plus:SI (match_dup 4) (match_dup 5))
-				 (ltu:SI (reg:CC_C CC_REGNUM)
-					 (const_int 0))))])]
+		     (plus:DI (plus:DI (zero_extend:DI (match_dup 4))
+				       (zero_extend:DI (match_dup 5)))
+			      (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0)))
+		    (zero_extend:DI
+		      (plus:SI (plus:SI (match_dup 4) (match_dup 5))
+			       (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))))
+	      (set (match_dup 3) (plus:SI (plus:SI (match_dup 4) (match_dup 5))
+					  (ltu:SI (reg:CC_C CC_REGNUM)
+						  (const_int 0))))])]
   "
   {
     operands[3] = gen_highpart (SImode, operands[0]);
@@ -777,17 +773,16 @@ 
   [(set (reg:CC_C CC_REGNUM)
 	(ne:CC_C
 	  (plus:DI
-	   (plus:DI
-	    (zero_extend:DI (match_operand:SI 1 "register_operand" "r"))
-	    (zero_extend:DI (match_operand:SI 2 "register_operand" "r")))
-	   (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0)))
-	  (plus:DI (zero_extend:DI
-		    (plus:SI (match_dup 1) (match_dup 2)))
-		   (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0)))))
+	    (plus:DI
+	      (zero_extend:DI (match_operand:SI 1 "register_operand" "r"))
+	      (zero_extend:DI (match_operand:SI 2 "register_operand" "r")))
+	    (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0)))
+	  (zero_extend:DI
+	    (plus:SI (plus:SI (match_dup 1) (match_dup 2))
+		     (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))))
    (set (match_operand:SI 0 "register_operand" "=r")
-	(plus:SI
-	 (plus:SI (match_dup 1) (match_dup 2))
-	 (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))]
+	(plus:SI (plus:SI (match_dup 1) (match_dup 2))
+		 (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))]
   "TARGET_32BIT"
   "adcs%?\\t%0, %1, %2"
   [(set_attr "conds" "set")
@@ -1086,8 +1081,8 @@ 
 })
 
 (define_insn_and_split "subdi3_compare1"
-  [(set (reg:CC CC_REGNUM)
-	(compare:CC
+  [(set (reg:CC_NCV CC_REGNUM)
+	(compare:CC_NCV
 	  (match_operand:DI 1 "register_operand" "r")
 	  (match_operand:DI 2 "register_operand" "r")))
    (set (match_operand:DI 0 "register_operand" "=&r")
@@ -1098,10 +1093,14 @@ 
   [(parallel [(set (reg:CC CC_REGNUM)
 		   (compare:CC (match_dup 1) (match_dup 2)))
 	      (set (match_dup 0) (minus:SI (match_dup 1) (match_dup 2)))])
-   (parallel [(set (reg:CC CC_REGNUM)
-		   (compare:CC (match_dup 4) (match_dup 5)))
-	     (set (match_dup 3) (minus:SI (minus:SI (match_dup 4) (match_dup 5))
-			       (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))])]
+   (parallel [(set (reg:CC_NCV_CIC CC_REGNUM)
+		   (compare:CC_NCV_CIC
+		     (zero_extend:DI (match_dup 4))
+		     (plus:DI (zero_extend:DI (match_dup 5))
+			      (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0)))))
+	      (set (match_dup 3)
+		   (minus:SI (minus:SI (match_dup 4) (match_dup 5))
+			     (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))])]
   {
     operands[3] = gen_highpart (SImode, operands[0]);
     operands[0] = gen_lowpart (SImode, operands[0]);
@@ -1157,13 +1156,15 @@ 
 )
 
 (define_insn "*subsi3_carryin_compare"
-  [(set (reg:CC CC_REGNUM)
-        (compare:CC (match_operand:SI 1 "s_register_operand" "r")
-                    (match_operand:SI 2 "s_register_operand" "r")))
+  [(set (reg:CC_NCV_CIC CC_REGNUM)
+	(compare:CC_NCV_CIC
+	  (zero_extend:DI (match_operand:SI 1 "s_register_operand" "r"))
+	  (plus:DI
+	    (zero_extend:DI (match_operand:SI 2 "s_register_operand" "r"))
+	    (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0)))))
    (set (match_operand:SI 0 "s_register_operand" "=r")
-        (minus:SI (minus:SI (match_dup 1)
-                            (match_dup 2))
-                  (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))]
+	(minus:SI (minus:SI (match_dup 1) (match_dup 2))
+		  (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))]
   "TARGET_32BIT"
   "sbcs\\t%0, %1, %2"
   [(set_attr "conds" "set")
@@ -1171,13 +1172,15 @@ 
 )
 
 (define_insn "*subsi3_carryin_compare_const"
-  [(set (reg:CC CC_REGNUM)
-        (compare:CC (match_operand:SI 1 "reg_or_int_operand" "r")
-                    (match_operand:SI 2 "arm_not_operand" "K")))
+  [(set (reg:CC_NCV_CIC CC_REGNUM)
+	(compare:CC_NCV_CIC
+	  (zero_extend:DI (match_operand:SI 1 "reg_or_int_operand" "r"))
+	  (plus:DI
+	    (zero_extend:DI (match_operand:SI 2 "arm_not_operand" "K"))
+	    (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0)))))
    (set (match_operand:SI 0 "s_register_operand" "=r")
-        (minus:SI (plus:SI (match_dup 1)
-                           (match_dup 2))
-                  (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))]
+	(minus:SI (plus:SI (match_dup 1) (match_dup 2))
+		  (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))]
   "TARGET_32BIT"
   "sbcs\\t%0, %1, #%B2"
   [(set_attr "conds" "set")
@@ -4634,8 +4637,8 @@ 
 
 
 (define_insn_and_split "negdi2_compare"
-  [(set (reg:CC CC_REGNUM)
-	(compare:CC
+  [(set (reg:CC_NCV CC_REGNUM)
+	(compare:CC_NCV
 	  (const_int 0)
 	  (match_operand:DI 1 "register_operand" "0,r")))
    (set (match_operand:DI 0 "register_operand" "=r,&r")
@@ -4647,8 +4650,12 @@ 
 		   (compare:CC (const_int 0) (match_dup 1)))
 	      (set (match_dup 0) (minus:SI (const_int 0)
 					   (match_dup 1)))])
-   (parallel [(set (reg:CC CC_REGNUM)
-		   (compare:CC (const_int 0) (match_dup 3)))
+   (parallel [(set (reg:CC_NCV_CIC CC_REGNUM)
+		   (compare:CC_NCV_CIC
+		     (const_int 0)
+		     (plus:DI
+		       (zero_extend:DI (match_dup 3))
+		       (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0)))))
 	     (set (match_dup 2)
 		  (minus:SI
 		   (minus:SI (const_int 0) (match_dup 3))
@@ -4707,12 +4714,14 @@ 
 )
 
 (define_insn "*negsi2_carryin_compare"
-  [(set (reg:CC CC_REGNUM)
-	(compare:CC (const_int 0)
-		    (match_operand:SI 1 "s_register_operand" "r")))
+  [(set (reg:CC_NCV_CIC CC_REGNUM)
+	(compare:CC_NCV_CIC
+	  (const_int 0)
+	  (plus:DI
+	    (zero_extend:DI (match_operand:SI 1 "s_register_operand" "r"))
+	    (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0)))))
    (set (match_operand:SI 0 "s_register_operand" "=r")
-	(minus:SI (minus:SI (const_int 0)
-			    (match_dup 1))
+	(minus:SI (minus:SI (const_int 0) (match_dup 1))
 		  (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))]
   "TARGET_ARM"
   "rscs\\t%0, %1, #0"
@@ -7361,12 +7370,15 @@ 
   "#"   ; "cmp\\t%Q0, %Q1\;sbcs\\t%2, %R0, %R1"
   "&& reload_completed"
   [(set (reg:CC CC_REGNUM)
-        (compare:CC (match_dup 0) (match_dup 1)))
-   (parallel [(set (reg:CC CC_REGNUM)
-                   (compare:CC (match_dup 3) (match_dup 4)))
-              (set (match_dup 2)
-                   (minus:SI (match_dup 5)
-                            (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))])]
+	(compare:CC (match_dup 0) (match_dup 1)))
+   (parallel [(set (reg:CC_NCV_CIC CC_REGNUM)
+		   (compare:CC_NCV_CIC
+		     (zero_extend:DI (match_dup 3))
+		     (plus:DI (zero_extend:DI (match_dup 4))
+			      (ltu:DI (reg:CC_C CC_REGNUM) (const_int 0)))))
+	      (set (match_dup 2)
+		   (minus:SI (match_dup 5)
+			     (ltu:SI (reg:CC_C CC_REGNUM) (const_int 0))))])]
   {
     operands[3] = gen_highpart (SImode, operands[0]);
     operands[0] = gen_lowpart (SImode, operands[0]);