diff mbox

[ARM] Set max_insns_skipped to MAX_INSN_PER_IT_BLOCK when optimize_size for THUMB2

Message ID 000001cf320c$cfdc4770$6f94d650$@arm.com
State New
Headers show

Commit Message

Zhenqiang Chen Feb. 25, 2014, 9:34 a.m. UTC
Hi,

Current value for max_insns_skipped is 6. For THUMB2, it needs 2 (IF-THEN)
or 3 (IF-THEN-ELSE) IT blocks to hold all the instructions. The overhead of
IT is 4 or 6 BYTES.

If we do not generate IT blocks, for IF-THEN, the overhead of conditional
jump is 2 or 4; for IF-THEN-ELSE, the overhead is 4, 6, or 8.

Most THUMB2 jump instructions are 2 BYTES. Tests on CSiBE show no one file
has code size regression. So The patch sets max_insns_skipped to
MAX_INSN_PER_IT_BLOCK.

No make check regression on cortex-m3.
For CSiBE, no any file has code size regression. And overall there is >0.01%
code size improvement for cortex-a9 and cortex-m4.

Is it OK?

Thanks!
-Zhenqiang

2014-02-25  Zhenqiang Chen  <zhenqiang.chen@arm.com>

	* config/arm/arm.c (arm_option_override): Set max_insns_skipped
	to MAX_INSN_PER_IT_BLOCK when optimize_size for THUMB2.

testsuite/ChangeLog:
2014-02-25  Zhenqiang Chen  <zhenqiang.chen@arm.com>

	* gcc.target/arm/max-insns-skipped.c: New test.

Comments

Richard Earnshaw Aug. 13, 2014, 12:29 p.m. UTC | #1
On 25/02/14 09:34, Zhenqiang Chen wrote:
> Hi,
> 
> Current value for max_insns_skipped is 6. For THUMB2, it needs 2 (IF-THEN)
> or 3 (IF-THEN-ELSE) IT blocks to hold all the instructions. The overhead of
> IT is 4 or 6 BYTES.
> 
> If we do not generate IT blocks, for IF-THEN, the overhead of conditional
> jump is 2 or 4; for IF-THEN-ELSE, the overhead is 4, 6, or 8.
> 
> Most THUMB2 jump instructions are 2 BYTES. Tests on CSiBE show no one file
> has code size regression. So The patch sets max_insns_skipped to
> MAX_INSN_PER_IT_BLOCK.
> 
> No make check regression on cortex-m3.
> For CSiBE, no any file has code size regression. And overall there is >0.01%
> code size improvement for cortex-a9 and cortex-m4.
> 
> Is it OK?
> 
> Thanks!
> -Zhenqiang
> 
> 2014-02-25  Zhenqiang Chen  <zhenqiang.chen@arm.com>
> 
> 	* config/arm/arm.c (arm_option_override): Set max_insns_skipped
> 	to MAX_INSN_PER_IT_BLOCK when optimize_size for THUMB2.
> 
> testsuite/ChangeLog:
> 2014-02-25  Zhenqiang Chen  <zhenqiang.chen@arm.com>
> 
> 	* gcc.target/arm/max-insns-skipped.c: New test.
> 
> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> index b49f43e..99cdbc4 100644
> --- a/gcc/config/arm/arm.c
> +++ b/gcc/config/arm/arm.c
> @@ -2743,6 +2743,15 @@ arm_option_override (void)
>        /* If optimizing for size, bump the number of instructions that we
>           are prepared to conditionally execute (even on a StrongARM).  */
>        max_insns_skipped = 6;
> +
> +      /* For THUMB2, it needs 2 (IF-THEN) or 3 (IF-THEN-ELSE) IT blocks to
> +	 hold all the instructions. The overhead of IT is 4 or 6 BYTES.
> +	 If we do not generate IT blocks, for IF-THEN, the overhead of
> +	 conditional jump is 2 or 4; for IF-THEN-ELSE, the overhead is 4, 6
> +	 or 8.  Most THUMB2 jump instructions are 2 BYTES.
> +	 So set max_insns_skipped to MAX_INSN_PER_IT_BLOCK.  */
> +      if (TARGET_THUMB2)
> +	max_insns_skipped = MAX_INSN_PER_IT_BLOCK;

Replacing a single 2-byte branch with a 2-byte IT insn doesn't save any
space in itself.

Pedantically, we save space with IT blocks if either:
a) we can replace an else clause (saving a branch around that)
b) we can use non-flag setting versions of insns to replace what would
otherwise be 4-byte insns with 2-byte versions.

I agree that multiple IT instructions is probably not a win (the
cond-exec code doesn't know how to reason about the finer points of item
b) and doesn't even really consider a) either).

So this is OK, but I think the comment should be simplified.  Just say,
for thumb2 we limit the conditional sequence to one IT block.

OK with that change.

R.

>      }
>    else
>      max_insns_skipped = current_tune->max_insns_skipped;
> diff --git a/gcc/testsuite/gcc.target/arm/max-insns-skipped.c
> b/gcc/testsuite/gcc.target/arm/max-insns-skipped.c
> new file mode 100644
> index 0000000..0a11554
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/max-insns-skipped.c
> @@ -0,0 +1,21 @@
> +/* { dg-do assemble { target arm_thumb2 } } */
> +/* { dg-options " -Os " } */
> +
> +int t (int a, int b, int c, int d)
> +{
> +  int r;
> +  if (a > 0) {
> +    r = a + b;
> +    r += 0x456;
> +    r *= 0x1234567;
> +    }
> +  else {
> +    r = b - a;
> +    r -= 0x123;
> +    r *= 0x12387;
> +    r += d;
> +   }
> +  return r;
> +}
> +
> +/* { dg-final { object-size text <= 40 } } */
> 
> 
> 
>
diff mbox

Patch

diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index b49f43e..99cdbc4 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -2743,6 +2743,15 @@  arm_option_override (void)
       /* If optimizing for size, bump the number of instructions that we
          are prepared to conditionally execute (even on a StrongARM).  */
       max_insns_skipped = 6;
+
+      /* For THUMB2, it needs 2 (IF-THEN) or 3 (IF-THEN-ELSE) IT blocks to
+	 hold all the instructions. The overhead of IT is 4 or 6 BYTES.
+	 If we do not generate IT blocks, for IF-THEN, the overhead of
+	 conditional jump is 2 or 4; for IF-THEN-ELSE, the overhead is 4, 6
+	 or 8.  Most THUMB2 jump instructions are 2 BYTES.
+	 So set max_insns_skipped to MAX_INSN_PER_IT_BLOCK.  */
+      if (TARGET_THUMB2)
+	max_insns_skipped = MAX_INSN_PER_IT_BLOCK;
     }
   else
     max_insns_skipped = current_tune->max_insns_skipped;
diff --git a/gcc/testsuite/gcc.target/arm/max-insns-skipped.c
b/gcc/testsuite/gcc.target/arm/max-insns-skipped.c
new file mode 100644
index 0000000..0a11554
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/max-insns-skipped.c
@@ -0,0 +1,21 @@ 
+/* { dg-do assemble { target arm_thumb2 } } */
+/* { dg-options " -Os " } */
+
+int t (int a, int b, int c, int d)
+{
+  int r;
+  if (a > 0) {
+    r = a + b;
+    r += 0x456;
+    r *= 0x1234567;
+    }
+  else {
+    r = b - a;
+    r -= 0x123;
+    r *= 0x12387;
+    r += d;
+   }
+  return r;
+}
+
+/* { dg-final { object-size text <= 40 } } */