
[ARM] PR47855 Compute attr length for thumb2 insns, 3/3 (issue4475042)

Message ID 20110505065105.BBA6820A24@guozhiwei.sha.corp.google.com
State New

Commit Message

Carrot Wei May 5, 2011, 6:51 a.m. UTC
Hi

This is the third part of the fix for

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47855

This patch contains the length computation/refinement for insn patterns
"*thumb2_movsi_insn", "*thumb2_cbz" and "*thumb2_cbnz".

At the same time this patch revealed two bugs. The first is that the maximum
offset of cbz/cbnz should be 126, but it is coded as 128 in the patterns
"*thumb2_cbz" and "*thumb2_cbnz". The second is that only the 2-register form
of the shift instructions can be 16-bit, but the 3-register form is allowed in
"*thumb2_shiftsi3_short" and the related peephole2. The fixes for both are also
contained in this patch.
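
As an illustration of the first bug (not part of the patch itself): the
cbz/cbnz branch offset is encoded as i:imm5:'0', a 6-bit field scaled by 2,
which is where the 126-byte limit comes from.  A minimal C sketch:

    #include <stdio.h>

    int main (void)
    {
      int field_bits = 6;                            /* i + imm5 */
      int max_offset = ((1 << field_bits) - 1) * 2;  /* offsets scale by 2 */
      printf ("max cbz/cbnz offset = %d\n", max_offset);   /* prints 126 */
      return 0;
    }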

The patch has been tested on ARM QEMU.

thanks
Carrot


2011-05-05  Guozhi Wei  <carrot@google.com>

	PR target/47855
	* config/arm/thumb2.md (thumb2_movsi_insn): Add length attribute.
	(thumb2_shiftsi3_short and peephole2): Remove 3-register case.
	(thumb2_cbz): Refine length computation.
	(thumb2_cbnz): Likewise.


--
This patch is available for review at http://codereview.appspot.com/4475042

Comments

Richard Earnshaw May 5, 2011, 9:42 a.m. UTC | #1
On Thu, 2011-05-05 at 14:51 +0800, Guozhi Wei wrote:
> Hi
> 
> This is the third part of the fix for
> 
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47855
> 
> This patch contains the length computation/refinement for insn patterns
> "*thumb2_movsi_insn", "*thumb2_cbz" and "*thumb2_cbnz".
> 
> At the same time this patch revealed two bugs. The first is that the maximum
> offset of cbz/cbnz should be 126, but it is coded as 128 in the patterns
> "*thumb2_cbz" and "*thumb2_cbnz". The second is that only the 2-register form
> of the shift instructions can be 16-bit, but the 3-register form is allowed in
> "*thumb2_shiftsi3_short" and the related peephole2. The fixes for both are also
> contained in this patch.
> 
> The patch has been tested on ARM QEMU.
> 
> thanks
> Carrot
> 
> 
> 2011-05-05  Guozhi Wei  <carrot@google.com>
> 
> 	PR target/47855
> 	* config/arm/thumb2.md (thumb2_movsi_insn): Add length attribute.
> 	(thumb2_shiftsi3_short and peephole2): Remove 3-register case.
> 	(thumb2_cbz): Refine length computation.
> 	(thumb2_cbnz): Likewise.
> 

Hmm, although these changes are all related to length calculations, they
are really three patches that are unrelated to each other.  It would be
easier to review this if they were kept separate.

1) thumb2_shiftsi3_short
This appears to be a straight bug.  We are putting out a 32-bit
instruction when we are claiming it to be only 16 bits.  This is OK.
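
[A hedged aside, not part of the review: the difference is visible directly in
the assembly a Thumb-2 compiler generates.  Assuming an arm-none-eabi GCC
building for Thumb-2 (e.g. -mthumb -march=armv7-a); the function names below
are made up for illustration.]

    /* The destructive two-register shift has a 16-bit encoding;
       the three-register form only has the 32-bit ".w" encoding.  */
    unsigned shift_2reg (unsigned x, unsigned n)
    {
      __asm__ ("lsls %0, %1" : "+l" (x) : "l" (n) : "cc");        /* 16-bit */
      return x;
    }

    unsigned shift_3reg (unsigned x, unsigned n)
    {
      unsigned r;
      __asm__ ("lsl.w %0, %1, %2" : "=r" (r) : "r" (x), "r" (n)); /* 32-bit */
      return r;
    }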

2) thumb2_movsi_insn
There are two things here.
a) Thumb2 has a 16-bit move instruction for all core
register-to-register transfers, so the separation of alternatives 1 and
2 is unnecessary -- just code these as "rk".  
b) The ldm form does not support unaligned memory accesses.  I'm aware
that work is being done to add unaligned support to GCC for ARM, so I
need to find out whether this patch will interfere with those changes.
I'll try to find out what the situation is here and get back to you.

3) thumb2_cbz and thumb2_cbnz
The range calculations look wrong here.  Remember that the 'pc' as far
as GCC is concerned is the address of the start of the insn.  So for a
backwards branch you need to account for all the bytes in the insn
pattern that occur before the branch instruction itself, and secondly
you also have to remember that the 'pc' that the CPU uses is the address
of the branch instruction plus 4.  All these conspire to reduce the
backwards range of a short branch to several bytes less than the 256
that you currently have coded.
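
[To make the arithmetic concrete -- an illustration of the reasoning above,
assuming the 16-bit conditional branch reaches -256..+254 bytes from its own
address plus 4, and that a 2-byte cmp precedes it in the "cmp ; beq"
alternative; these are not necessarily the exact constants the revised length
attribute should use:]

    #include <stdio.h>

    int main (void)
    {
      int branch_min = -256, branch_max = 254;  /* range relative to CPU pc */
      int pipeline = 4;                         /* CPU pc = branch address + 4 */
      int before = 2;                           /* insn bytes before the beq */

      /* GCC's (pc) is the start of the insn, so shift the range by both.  */
      printf ("label - (pc) must lie in [%d, %d]\n",
              branch_min + pipeline + before,
              branch_max + pipeline + before);  /* prints [-250, 260] */
      return 0;
    }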


R.


Patch

Index: config/arm/thumb2.md
===================================================================
--- config/arm/thumb2.md	(revision 173350)
+++ config/arm/thumb2.md	(working copy)
@@ -165,25 +165,49 @@ 
 ;; regs.  The high register alternatives are not taken into account when
 ;; choosing register preferences in order to reflect their expense.
 (define_insn "*thumb2_movsi_insn"
-  [(set (match_operand:SI 0 "nonimmediate_operand" "=rk,r,r,r,l ,*hk,m,*m")
-	(match_operand:SI 1 "general_operand"	   "rk ,I,K,j,mi,*mi,l,*hk"))]
+  [(set (match_operand:SI 0 "nonimmediate_operand" "=l,rk,r,r,r,l ,*rk,Uu,*m")
+	(match_operand:SI 1 "general_operand"      "l ,rk,I,K,j,Uu,*mi,l ,*rk"))]
   "TARGET_THUMB2 && ! TARGET_IWMMXT
    && !(TARGET_HARD_FLOAT && TARGET_VFP)
    && (   register_operand (operands[0], SImode)
        || register_operand (operands[1], SImode))"
-  "@
-   mov%?\\t%0, %1
-   mov%?\\t%0, %1
-   mvn%?\\t%0, #%B1
-   movw%?\\t%0, %1
-   ldr%?\\t%0, %1
-   ldr%?\\t%0, %1
-   str%?\\t%1, %0
-   str%?\\t%1, %0"
-  [(set_attr "type" "*,*,*,*,load1,load1,store1,store1")
+  "*
+  switch (which_alternative)
+    {
+    case 0: return \"mov%?\\t%0, %1\";
+    case 1: return \"mov%?\\t%0, %1\";
+    case 2: return \"mov%?\\t%0, %1\";
+    case 3: return \"mvn%?\\t%0, #%B1\";
+    case 4: return \"movw%?\\t%0, %1\";
+
+    case 5:
+      if (GET_CODE (XEXP (operands[1], 0)) == POST_INC)
+	{
+	  operands[1] = XEXP (XEXP (operands[1], 0), 0);
+	  return \"ldm%(ia%)\t%1!, {%0}\";
+	}
+      else
+	return \"ldr%?\\t%0, %1\";
+
+    case 6: return \"ldr%?\\t%0, %1\";
+
+    case 7:
+      if (GET_CODE (XEXP (operands[0], 0)) == POST_INC)
+	{
+	  operands[0] = XEXP (XEXP (operands[0], 0), 0);
+	  return \"stm%(ia%)\t%0!, {%1}\";
+	}
+      else
+	return \"str%?\\t%1, %0\";
+
+    case 8: return \"str%?\\t%1, %0\";
+    default: gcc_unreachable ();
+    }"
+  [(set_attr "type" "*,*,*,*,*,load1,load1,store1,store1")
    (set_attr "predicable" "yes")
-   (set_attr "pool_range" "*,*,*,*,1020,4096,*,*")
-   (set_attr "neg_pool_range" "*,*,*,*,0,0,*,*")]
+   (set_attr "length" "2,4,4,4,4,2,4,2,4")
+   (set_attr "pool_range" "*,*,*,*,*,1020,4096,*,*")
+   (set_attr "neg_pool_range" "*,*,*,*,*,0,0,*,*")]
 )
 
 (define_insn "tls_load_dot_plus_four"
@@ -685,7 +709,8 @@ 
   "TARGET_THUMB2
    && peep2_regno_dead_p(0, CC_REGNUM)
    && ((GET_CODE(operands[3]) != ROTATE && GET_CODE(operands[3]) != ROTATERT)
-       || REG_P(operands[2]))"
+       || REG_P(operands[2]))
+   && (CONSTANT_P (operands[2]) || (operands[0] == operands[1]))"
   [(parallel
     [(set (match_dup 0)
 	  (match_op_dup 3
@@ -696,10 +721,10 @@ 
 )
 
 (define_insn "*thumb2_shiftsi3_short"
-  [(set (match_operand:SI   0 "low_register_operand" "=l")
+  [(set (match_operand:SI   0 "low_register_operand" "=l,l")
 	(match_operator:SI  3 "shift_operator"
-	 [(match_operand:SI 1 "low_register_operand"  "l")
-	  (match_operand:SI 2 "low_reg_or_int_operand" "lM")]))
+	 [(match_operand:SI 1 "low_register_operand"  "0,l")
+	  (match_operand:SI 2 "low_reg_or_int_operand" "l,M")]))
    (clobber (reg:CC CC_REGNUM))]
   "TARGET_THUMB2 && reload_completed
    && ((GET_CODE(operands[3]) != ROTATE && GET_CODE(operands[3]) != ROTATERT)
@@ -707,7 +732,7 @@ 
   "* return arm_output_shift(operands, 2);"
   [(set_attr "predicable" "yes")
    (set_attr "shift" "1")
-   (set_attr "length" "2")
+   (set_attr "length" "2,2")
    (set (attr "type") (if_then_else (match_operand 2 "const_int_operand" "")
 		      (const_string "alu_shift")
 		      (const_string "alu_shift_reg")))]
@@ -965,13 +990,23 @@ 
   else
     return \"cmp\\t%0, #0\;beq\\t%l1\";
   "
-  [(set (attr "length") 
-        (if_then_else
-	    (and (ge (minus (match_dup 1) (pc)) (const_int 2))
-	         (le (minus (match_dup 1) (pc)) (const_int 128))
-	         (eq (symbol_ref ("which_alternative")) (const_int 0)))
-	    (const_int 2)
-	    (const_int 8)))]
+  [(set (attr "length")
+	(if_then_else
+	    (eq (symbol_ref ("which_alternative")) (const_int 0))
+	    (if_then_else
+		(and (ge (minus (match_dup 1) (pc)) (const_int 2))
+		     (le (minus (match_dup 1) (pc)) (const_int 126)))
+		(const_int 2)
+		(if_then_else
+		    (and (ge (minus (match_dup 1) (pc)) (const_int -256))
+			 (le (minus (match_dup 1) (pc)) (const_int 254)))
+		    (const_int 4)
+		    (const_int 6)))
+	    (if_then_else
+		(and (ge (minus (match_dup 1) (pc)) (const_int -256))
+		     (le (minus (match_dup 1) (pc)) (const_int 254)))
+		(const_int 6)
+		(const_int 8))))]
 )
 
 (define_insn "*thumb2_cbnz"
@@ -988,13 +1023,23 @@ 
   else
     return \"cmp\\t%0, #0\;bne\\t%l1\";
   "
-  [(set (attr "length") 
-        (if_then_else
-	    (and (ge (minus (match_dup 1) (pc)) (const_int 2))
-	         (le (minus (match_dup 1) (pc)) (const_int 128))
-	         (eq (symbol_ref ("which_alternative")) (const_int 0)))
-	    (const_int 2)
-	    (const_int 8)))]
+  [(set (attr "length")
+	(if_then_else
+	    (eq (symbol_ref ("which_alternative")) (const_int 0))
+	    (if_then_else
+		(and (ge (minus (match_dup 1) (pc)) (const_int 2))
+		     (le (minus (match_dup 1) (pc)) (const_int 126)))
+		(const_int 2)
+		(if_then_else
+		    (and (ge (minus (match_dup 1) (pc)) (const_int -256))
+			 (le (minus (match_dup 1) (pc)) (const_int 254)))
+		    (const_int 4)
+		    (const_int 6)))
+	    (if_then_else
+		(and (ge (minus (match_dup 1) (pc)) (const_int -256))
+		     (le (minus (match_dup 1) (pc)) (const_int 254)))
+		(const_int 6)
+		(const_int 8))))]
 )
 
 ;; 16-bit complement