diff mbox

[AArch64,ARM] Modify idiv costs for Cortex-A53

Message ID VI1PR0801MB2031A1B7BBC6BEDF2D7BF853FF170@VI1PR0801MB2031.eurprd08.prod.outlook.com
State New
Headers show

Commit Message

Tamar Christina May 2, 2017, 3:37 p.m. UTC
Hi All, 

This patch adjusts the cost model for Cortex-A53 to increase the costs of
an integer division. The reason for this is that we want to always expand
the division to a multiply when doing a division by constant.

On the Cortex-A53 shifts are modeled to cost 1 instruction,
when doing the expansion we have to perform two shifts and an addition.
However because the cost model can't model things such as fusing of shifts,
we have to fully cost both shifts.

This leads to the cost model telling us that for the Cortex-A53 we can never
do the expansion. By increasing the costs of the division by two instructions
we recover the room required in the cost calculation to do the expansions.

The reason for all of this is that currently the code does not produce what you'd expect,
which is that division by constants are always expanded. Also it's inconsistent because
unsigned division does get expanded.

This all reduces the ability to do CSE when using signed modulo since that one is also expanded.

Given:

void f5(void)
{
  int x = 0;
  while (x > -1000)
  {
    g(x % 300);
    x--;
  }
}


we now generate

	smull	x0, w19, w21
	asr	x0, x0, 37
	sub	w0, w0, w19, asr 31
	msub	w0, w0, w20, w19
	sub	w19, w19, #1
	bl	g

as opposed to

	sdiv	w0, w19, w20
	msub	w0, w0, w20, w19
	sub	w19, w19, #1
	bl	g


Bootstrapped and reg tested on aarch64-none-linux-gnu with no regressions.

OK for trunk?

Thanks,
Tamar


gcc/
2017-05-02  Tamar Christina  <tamar.christina@arm.com>

	* config/arm/aarch-cost-tables.h (cortexa53_extra_cost): Increase idiv cost.

Comments

Tamar Christina May 15, 2017, 8:33 a.m. UTC | #1
Ping
Ramana Radhakrishnan May 15, 2017, 8:40 a.m. UTC | #2
On Tue, May 2, 2017 at 4:37 PM, Tamar Christina <Tamar.Christina@arm.com> wrote:
> Hi All,
>
> This patch adjusts the cost model for Cortex-A53 to increase the costs of
> an integer division. The reason for this is that we want to always expand
> the division to a multiply when doing a division by constant.
>
> On the Cortex-A53 shifts are modeled to cost 1 instruction,
> when doing the expansion we have to perform two shifts and an addition.
> However because the cost model can't model things such as fusing of shifts,
> we have to fully cost both shifts.
>
> This leads to the cost model telling us that for the Cortex-A53 we can never
> do the expansion. By increasing the costs of the division by two instructions
> we recover the room required in the cost calculation to do the expansions.
>
> The reason for all of this is that currently the code does not produce what you'd expect,
> which is that division by constants are always expanded. Also it's inconsistent because
> unsigned division does get expanded.
>
> This all reduces the ability to do CSE when using signed modulo since that one is also expanded.
>
> Given:
>
> void f5(void)
> {
>   int x = 0;
>   while (x > -1000)
>   {
>     g(x % 300);
>     x--;
>   }
> }
>
>
> we now generate
>
>         smull   x0, w19, w21
>         asr     x0, x0, 37
>         sub     w0, w0, w19, asr 31
>         msub    w0, w0, w20, w19
>         sub     w19, w19, #1
>         bl      g
>
> as opposed to
>
>         sdiv    w0, w19, w20
>         msub    w0, w0, w20, w19
>         sub     w19, w19, #1
>         bl      g
>
>
> Bootstrapped and reg tested on aarch64-none-linux-gnu with no regressions.

Since this affects the arm port as well, it needs to be regression
tested on arm as well.

Thanks,
Ramana

>
> OK for trunk?
>
> Thanks,
> Tamar
>
>
> gcc/
> 2017-05-02  Tamar Christina  <tamar.christina@arm.com>
>
>         * config/arm/aarch-cost-tables.h (cortexa53_extra_cost): Increase idiv cost.
Tamar Christina May 15, 2017, 9:40 a.m. UTC | #3
Hi,

Reg-tested now on arm-none-linux-gnueabihf as well and no regressions.

Ok for trunk?
Tamar
James Greenhalgh June 6, 2017, 12:56 p.m. UTC | #4
On Tue, May 02, 2017 at 04:37:21PM +0100, Tamar Christina wrote:
> Hi All, 
> 
> This patch adjusts the cost model for Cortex-A53 to increase the costs of
> an integer division. The reason for this is that we want to always expand
> the division to a multiply when doing a division by constant.
> 
> On the Cortex-A53 shifts are modeled to cost 1 instruction,
> when doing the expansion we have to perform two shifts and an addition.
> However because the cost model can't model things such as fusing of shifts,
> we have to fully cost both shifts.
> 
> This leads to the cost model telling us that for the Cortex-A53 we can never
> do the expansion. By increasing the costs of the division by two instructions
> we recover the room required in the cost calculation to do the expansions.
> 
> The reason for all of this is that currently the code does not produce what
> you'd expect, which is that division by constants are always expanded. Also
> it's inconsistent because unsigned division does get expanded.
> 
> This all reduces the ability to do CSE when using signed modulo since that
> one is also expanded.
> 
> Given:
> 
> void f5(void)
> {
>   int x = 0;
>   while (x > -1000)
>   {
>     g(x % 300);
>     x--;
>   }
> }
> 
> 
> we now generate
> 
> 	smull	x0, w19, w21
> 	asr	x0, x0, 37
> 	sub	w0, w0, w19, asr 31
> 	msub	w0, w0, w20, w19
> 	sub	w19, w19, #1
> 	bl	g
> 
> as opposed to
> 
> 	sdiv	w0, w19, w20
> 	msub	w0, w0, w20, w19
> 	sub	w19, w19, #1
> 	bl	g
> 
> 
> Bootstrapped and reg tested on aarch64-none-linux-gnu with no regressions.
> 
> OK for trunk?

OK for AArch64, but you'll need an ARM OK too.

Thanks,
James

> gcc/
> 2017-05-02  Tamar Christina  <tamar.christina@arm.com>
> 
> 	* config/arm/aarch-cost-tables.h (cortexa53_extra_cost): Increase idiv cost.
Ramana Radhakrishnan June 6, 2017, 1:26 p.m. UTC | #5
On Tue, Jun 6, 2017 at 1:56 PM, James Greenhalgh
<james.greenhalgh@arm.com> wrote:
> On Tue, May 02, 2017 at 04:37:21PM +0100, Tamar Christina wrote:
>> Hi All,
>>
>> This patch adjusts the cost model for Cortex-A53 to increase the costs of
>> an integer division. The reason for this is that we want to always expand
>> the division to a multiply when doing a division by constant.
>>
>> On the Cortex-A53 shifts are modeled to cost 1 instruction,
>> when doing the expansion we have to perform two shifts and an addition.
>> However because the cost model can't model things such as fusing of shifts,
>> we have to fully cost both shifts.
>>
>> This leads to the cost model telling us that for the Cortex-A53 we can never
>> do the expansion. By increasing the costs of the division by two instructions
>> we recover the room required in the cost calculation to do the expansions.
>>
>> The reason for all of this is that currently the code does not produce what
>> you'd expect, which is that division by constants are always expanded. Also
>> it's inconsistent because unsigned division does get expanded.
>>
>> This all reduces the ability to do CSE when using signed modulo since that
>> one is also expanded.
>>
>> Given:
>>
>> void f5(void)
>> {
>>   int x = 0;
>>   while (x > -1000)
>>   {
>>     g(x % 300);
>>     x--;
>>   }
>> }
>>
>>
>> we now generate
>>
>>       smull   x0, w19, w21
>>       asr     x0, x0, 37
>>       sub     w0, w0, w19, asr 31
>>       msub    w0, w0, w20, w19
>>       sub     w19, w19, #1
>>       bl      g
>>
>> as opposed to
>>
>>       sdiv    w0, w19, w20
>>       msub    w0, w0, w20, w19
>>       sub     w19, w19, #1
>>       bl      g
>>
>>
>> Bootstrapped and reg tested on aarch64-none-linux-gnu with no regressions.
>>
>> OK for trunk?
>
> OK for AArch64, but you'll need an ARM OK too.

OK.

Ramana
>
> Thanks,
> James
>
>> gcc/
>> 2017-05-02  Tamar Christina  <tamar.christina@arm.com>
>>
>>       * config/arm/aarch-cost-tables.h (cortexa53_extra_cost): Increase idiv cost.
>
diff mbox

Patch

diff --git a/gcc/config/arm/aarch-cost-tables.h b/gcc/config/arm/aarch-cost-tables.h
index 68f84b04fe2f2cbdb66c6dd0c7add097606a7878..8cff517861aea2c249a07a07f6775c60f75fb9a0 100644
--- a/gcc/config/arm/aarch-cost-tables.h
+++ b/gcc/config/arm/aarch-cost-tables.h
@@ -154,7 +154,7 @@  const struct cpu_cost_table cortexa53_extra_costs =
       COSTS_N_INSNS (1),	/* extend.  */
       COSTS_N_INSNS (1),	/* add.  */
       COSTS_N_INSNS (1),	/* extend_add.  */
-      COSTS_N_INSNS (7)		/* idiv.  */
+      COSTS_N_INSNS (9)		/* idiv.  */
     },
     /* MULT DImode */
     {