arm: Update fp16-aapcs-[24].c after insn_propagation patch

Message ID	mptcynjrdab.fsf@arm.com
State	New
Headers	show Return-Path: <gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org> DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 47045383E525 From: Richard Sandiford <richard.sandiford@arm.com> To: gcc-patches@gcc.gnu.org Mail-Followup-To: gcc-patches@gcc.gnu.org,richard.earnshaw@arm.com, ramanara@nvidia.com, richard.sandiford@arm.com Cc: richard.earnshaw@arm.com, ramanara@nvidia.com Subject: [PATCH] arm: Update fp16-aapcs-[24].c after insn_propagation patch Date: Thu, 11 Jul 2024 19:31:24 +0100 Message-ID: <mptcynjrdab.fsf@arm.com> User-Agent: Gnus/5.13 (Gnus v5.13) MIME-Version: 1.0 Content-Type: text/plain Precedence: list Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org
Series	arm: Update fp16-aapcs-[24].c after insn_propagation patch \| expand arm: Update fp16-aapcs-[24].c after insn_propagation patch

Message ID

mptcynjrdab.fsf@arm.com

State

New

Headers

DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 47045383E525
From: Richard Sandiford <richard.sandiford@arm.com>
To: gcc-patches@gcc.gnu.org
Mail-Followup-To: gcc-patches@gcc.gnu.org,richard.earnshaw@arm.com,
 ramanara@nvidia.com, richard.sandiford@arm.com
Cc: richard.earnshaw@arm.com, ramanara@nvidia.com
Subject: [PATCH] arm: Update fp16-aapcs-[24].c after insn_propagation patch
Date: Thu, 11 Jul 2024 19:31:24 +0100
Message-ID: <mptcynjrdab.fsf@arm.com>
User-Agent: Gnus/5.13 (Gnus v5.13)
MIME-Version: 1.0
Content-Type: text/plain
Precedence: list
Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org

Series

arm: Update fp16-aapcs-[24].c after insn_propagation patch | expand

Commit Message

Richard Sandiford July 11, 2024, 6:31 p.m. UTC

These tests used to generate:

        bl      swap
        ldr     r2, [sp, #4]
        mov     r0, r2  @ __fp16

but g:9d20529d94b23275885f380d155fe8671ab5353a means that we can
load directly into r0:

        bl      swap
        ldrh    r0, [sp, #4]    @ __fp16

This patch updates the tests to "defend" this change.

While there, the scans include:

mov\tr1, r[03]}

But if the spill of r2 occurs first, there's no real reason why
r2 couldn't be used as the temporary, instead r3.

The patch tries to update the scans while preserving the spirit
of the originals.

Spot-checked with armv8l-unknown-linux-gnueabihf.  OK to install?

Richard


gcc/testsuite/
	* gcc.target/arm/fp16-aapcs-2.c: Expect the return value to be
	loaded directly from the stack.  Test that the swap generates
	two moves out of r0/r1 and two moves in.
	* gcc.target/arm/fp16-aapcs-4.c: Likewise.
---
 gcc/testsuite/gcc.target/arm/fp16-aapcs-2.c | 8 +++++---
 gcc/testsuite/gcc.target/arm/fp16-aapcs-4.c | 8 +++++---
 2 files changed, 10 insertions(+), 6 deletions(-)

Comments

Richard Earnshaw (lists) July 19, 2024, 2:25 p.m. UTC | #1

On 11/07/2024 19:31, Richard Sandiford wrote:
> These tests used to generate:
> 
>         bl      swap
>         ldr     r2, [sp, #4]
>         mov     r0, r2  @ __fp16
> 
> but g:9d20529d94b23275885f380d155fe8671ab5353a means that we can
> load directly into r0:
> 
>         bl      swap
>         ldrh    r0, [sp, #4]    @ __fp16
> 
> This patch updates the tests to "defend" this change.
> 
> While there, the scans include:
> 
> mov\tr1, r[03]}
> 
> But if the spill of r2 occurs first, there's no real reason why
> r2 couldn't be used as the temporary, instead r3.
> 
> The patch tries to update the scans while preserving the spirit
> of the originals.
> 
> Spot-checked with armv8l-unknown-linux-gnueabihf.  OK to install?
> 
> Richard

OK.

I'm not sure that these tests are really doing very much; it would probably be better if they could be rewritten using the gcc.target/arm/aapcs framework.  But that's for another day.

R.

> 
> 
> gcc/testsuite/
> 	* gcc.target/arm/fp16-aapcs-2.c: Expect the return value to be
> 	loaded directly from the stack.  Test that the swap generates
> 	two moves out of r0/r1 and two moves in.
> 	* gcc.target/arm/fp16-aapcs-4.c: Likewise.
> ---
>  gcc/testsuite/gcc.target/arm/fp16-aapcs-2.c | 8 +++++---
>  gcc/testsuite/gcc.target/arm/fp16-aapcs-4.c | 8 +++++---
>  2 files changed, 10 insertions(+), 6 deletions(-)
> 
> diff --git a/gcc/testsuite/gcc.target/arm/fp16-aapcs-2.c b/gcc/testsuite/gcc.target/arm/fp16-aapcs-2.c
> index c34387f5782..12d20560f53 100644
> --- a/gcc/testsuite/gcc.target/arm/fp16-aapcs-2.c
> +++ b/gcc/testsuite/gcc.target/arm/fp16-aapcs-2.c
> @@ -16,6 +16,8 @@ F (__fp16 a, __fp16 b, __fp16 c)
>    return c;
>  }
>  
> -/* { dg-final { scan-assembler-times {mov\tr[0-9]+, r[0-2]} 3 } }  */
> -/* { dg-final { scan-assembler-times {mov\tr1, r[03]} 1 } }  */
> -/* { dg-final { scan-assembler-times {mov\tr0, r[0-9]+} 2 } }  */
> +/* The swap must include two moves out of r0/r1 and two moves in.  */
> +/* { dg-final { scan-assembler-times {mov\tr[0-9]+, r[01]} 2 } }  */
> +/* { dg-final { scan-assembler-times {mov\tr[01], r[0-9]+} 2 } }  */
> +/* c should be spilled around the call.  */
> +/* { dg-final { scan-assembler {str\tr2, ([^\n]*).*ldrh\tr0, \1} { target arm_little_endian } } } */
> diff --git a/gcc/testsuite/gcc.target/arm/fp16-aapcs-4.c b/gcc/testsuite/gcc.target/arm/fp16-aapcs-4.c
> index daac29137ae..09fa64aa494 100644
> --- a/gcc/testsuite/gcc.target/arm/fp16-aapcs-4.c
> +++ b/gcc/testsuite/gcc.target/arm/fp16-aapcs-4.c
> @@ -16,6 +16,8 @@ F (__fp16 a, __fp16 b, __fp16 c)
>    return c;
>  }
>  
> -/* { dg-final { scan-assembler-times {mov\tr[0-9]+, r[0-2]} 3 } }  */
> -/* { dg-final { scan-assembler-times {mov\tr1, r[03]} 1 } }  */
> -/* { dg-final { scan-assembler-times {mov\tr0, r[0-9]+} 2 } }  */
> +/* The swap must include two moves out of r0/r1 and two moves in.  */
> +/* { dg-final { scan-assembler-times {mov\tr[0-9]+, r[01]} 2 } }  */
> +/* { dg-final { scan-assembler-times {mov\tr[01], r[0-9]+} 2 } }  */
> +/* c should be spilled around the call.  */
> +/* { dg-final { scan-assembler {str\tr2, ([^\n]*).*ldrh\tr0, \1} { target arm_little_endian } } } */

Adhemerval Zanella Netto July 23, 2024, 4:25 p.m. UTC | #2

On 19/07/24 11:25, Richard Earnshaw (lists) wrote:
> On 11/07/2024 19:31, Richard Sandiford wrote:
>> These tests used to generate:
>>
>>         bl      swap
>>         ldr     r2, [sp, #4]
>>         mov     r0, r2  @ __fp16
>>
>> but g:9d20529d94b23275885f380d155fe8671ab5353a means that we can
>> load directly into r0:
>>
>>         bl      swap
>>         ldrh    r0, [sp, #4]    @ __fp16
>>
>> This patch updates the tests to "defend" this change.
>>
>> While there, the scans include:
>>
>> mov\tr1, r[03]}
>>
>> But if the spill of r2 occurs first, there's no real reason why
>> r2 couldn't be used as the temporary, instead r3.
>>
>> The patch tries to update the scans while preserving the spirit
>> of the originals.
>>
>> Spot-checked with armv8l-unknown-linux-gnueabihf.  OK to install?
>>
>> Richard
> 
> OK.
> 
> I'm not sure that these tests are really doing very much; it would probably be better if they could be rewritten using the gcc.target/arm/aapcs framework.  But that's for another day.
> 
> R.

Hi Richard,

It seems that this not fully fixed on all configurations, I am still seeing:

FAIL: gcc.target/arm/fp16-aapcs-1.c scan-assembler vmov\\.f32\\ts1, s0
FAIL: gcc.target/arm/fp16-aapcs-2.c scan-assembler-times mov\\tr[01], r[0-9]+ 2
FAIL: gcc.target/arm/fp16-aapcs-2.c scan-assembler str\\tr2, ([^\\n]*).*ldrh\\tr0, \\1
FAIL: gcc.target/arm/fp16-aapcs-3.c scan-assembler vmov\\.f32\\ts1, s0
FAIL: gcc.target/arm/fp16-aapcs-4.c scan-assembler-times mov\\tr[01], r[0-9]+ 2
FAIL: gcc.target/arm/fp16-aapcs-4.c scan-assembler str\\tr2, ([^\\n]*).*ldrh\\tr0, \\1

The gcc is configured with --with-float=hard --with-fpu=vfpv3-d16 --with-mode=thumb --with-tune=cortex-a9 --with-arch=armv7-a 

https://ci.linaro.org/job/tcwg_gnu_cross_check_gcc--master-arm-build/1561/artifact/artifacts/00-sumfiles/

Richard Earnshaw (lists) July 23, 2024, 5:19 p.m. UTC | #3

On 23/07/2024 17:25, Adhemerval Zanella Netto wrote:
> 
> 
> On 19/07/24 11:25, Richard Earnshaw (lists) wrote:
>> On 11/07/2024 19:31, Richard Sandiford wrote:
>>> These tests used to generate:
>>>
>>>          bl      swap
>>>          ldr     r2, [sp, #4]
>>>          mov     r0, r2  @ __fp16
>>>
>>> but g:9d20529d94b23275885f380d155fe8671ab5353a means that we can
>>> load directly into r0:
>>>
>>>          bl      swap
>>>          ldrh    r0, [sp, #4]    @ __fp16
>>>
>>> This patch updates the tests to "defend" this change.
>>>
>>> While there, the scans include:
>>>
>>> mov\tr1, r[03]}
>>>
>>> But if the spill of r2 occurs first, there's no real reason why
>>> r2 couldn't be used as the temporary, instead r3.
>>>
>>> The patch tries to update the scans while preserving the spirit
>>> of the originals.
>>>
>>> Spot-checked with armv8l-unknown-linux-gnueabihf.  OK to install?
>>>
>>> Richard
>>
>> OK.
>>
>> I'm not sure that these tests are really doing very much; it would probably be better if they could be rewritten using the gcc.target/arm/aapcs framework.  But that's for another day.
>>
>> R.
> 
> Hi Richard,
> 
> It seems that this not fully fixed on all configurations, I am still seeing:
> 
> FAIL: gcc.target/arm/fp16-aapcs-1.c scan-assembler vmov\\.f32\\ts1, s0
> FAIL: gcc.target/arm/fp16-aapcs-2.c scan-assembler-times mov\\tr[01], r[0-9]+ 2
> FAIL: gcc.target/arm/fp16-aapcs-2.c scan-assembler str\\tr2, ([^\\n]*).*ldrh\\tr0, \\1
> FAIL: gcc.target/arm/fp16-aapcs-3.c scan-assembler vmov\\.f32\\ts1, s0
> FAIL: gcc.target/arm/fp16-aapcs-4.c scan-assembler-times mov\\tr[01], r[0-9]+ 2
> FAIL: gcc.target/arm/fp16-aapcs-4.c scan-assembler str\\tr2, ([^\\n]*).*ldrh\\tr0, \\1
> 
> The gcc is configured with --with-float=hard --with-fpu=vfpv3-d16 --with-mode=thumb --with-tune=cortex-a9 --with-arch=armv7-a
> 
> https://ci.linaro.org/job/tcwg_gnu_cross_check_gcc--master-arm-build/1561/artifact/artifacts/00-sumfiles/

That looks like a wider problem.  Did the test ever work for that set of 
configure options?

R.

diff --git a/gcc/testsuite/gcc.target/arm/fp16-aapcs-2.c b/gcc/testsuite/gcc.target/arm/fp16-aapcs-2.c
index c34387f5782..12d20560f53 100644
--- a/gcc/testsuite/gcc.target/arm/fp16-aapcs-2.c
+++ b/gcc/testsuite/gcc.target/arm/fp16-aapcs-2.c
@@ -16,6 +16,8 @@  F (__fp16 a, __fp16 b, __fp16 c)
   return c;
 }
 
-/* { dg-final { scan-assembler-times {mov\tr[0-9]+, r[0-2]} 3 } }  */
-/* { dg-final { scan-assembler-times {mov\tr1, r[03]} 1 } }  */
-/* { dg-final { scan-assembler-times {mov\tr0, r[0-9]+} 2 } }  */
+/* The swap must include two moves out of r0/r1 and two moves in.  */
+/* { dg-final { scan-assembler-times {mov\tr[0-9]+, r[01]} 2 } }  */
+/* { dg-final { scan-assembler-times {mov\tr[01], r[0-9]+} 2 } }  */
+/* c should be spilled around the call.  */
+/* { dg-final { scan-assembler {str\tr2, ([^\n]*).*ldrh\tr0, \1} { target arm_little_endian } } } */
diff --git a/gcc/testsuite/gcc.target/arm/fp16-aapcs-4.c b/gcc/testsuite/gcc.target/arm/fp16-aapcs-4.c
index daac29137ae..09fa64aa494 100644
--- a/gcc/testsuite/gcc.target/arm/fp16-aapcs-4.c
+++ b/gcc/testsuite/gcc.target/arm/fp16-aapcs-4.c
@@ -16,6 +16,8 @@  F (__fp16 a, __fp16 b, __fp16 c)
   return c;
 }
 
-/* { dg-final { scan-assembler-times {mov\tr[0-9]+, r[0-2]} 3 } }  */
-/* { dg-final { scan-assembler-times {mov\tr1, r[03]} 1 } }  */
-/* { dg-final { scan-assembler-times {mov\tr0, r[0-9]+} 2 } }  */
+/* The swap must include two moves out of r0/r1 and two moves in.  */
+/* { dg-final { scan-assembler-times {mov\tr[0-9]+, r[01]} 2 } }  */
+/* { dg-final { scan-assembler-times {mov\tr[01], r[0-9]+} 2 } }  */
+/* c should be spilled around the call.  */
+/* { dg-final { scan-assembler {str\tr2, ([^\n]*).*ldrh\tr0, \1} { target arm_little_endian } } } */

arm: Update fp16-aapcs-[24].c after insn_propagation patch

Commit Message

Comments

Patch