diff mbox series

arm: Update fp16-aapcs-[24].c after insn_propagation patch

Message ID mptcynjrdab.fsf@arm.com
State New
Headers show
Series arm: Update fp16-aapcs-[24].c after insn_propagation patch | expand

Commit Message

Richard Sandiford July 11, 2024, 6:31 p.m. UTC
These tests used to generate:

        bl      swap
        ldr     r2, [sp, #4]
        mov     r0, r2  @ __fp16

but g:9d20529d94b23275885f380d155fe8671ab5353a means that we can
load directly into r0:

        bl      swap
        ldrh    r0, [sp, #4]    @ __fp16

This patch updates the tests to "defend" this change.

While there, the scans include:

mov\tr1, r[03]}

But if the spill of r2 occurs first, there's no real reason why
r2 couldn't be used as the temporary, instead r3.

The patch tries to update the scans while preserving the spirit
of the originals.

Spot-checked with armv8l-unknown-linux-gnueabihf.  OK to install?

Richard


gcc/testsuite/
	* gcc.target/arm/fp16-aapcs-2.c: Expect the return value to be
	loaded directly from the stack.  Test that the swap generates
	two moves out of r0/r1 and two moves in.
	* gcc.target/arm/fp16-aapcs-4.c: Likewise.
---
 gcc/testsuite/gcc.target/arm/fp16-aapcs-2.c | 8 +++++---
 gcc/testsuite/gcc.target/arm/fp16-aapcs-4.c | 8 +++++---
 2 files changed, 10 insertions(+), 6 deletions(-)

Comments

Richard Earnshaw (lists) July 19, 2024, 2:25 p.m. UTC | #1
On 11/07/2024 19:31, Richard Sandiford wrote:
> These tests used to generate:
> 
>         bl      swap
>         ldr     r2, [sp, #4]
>         mov     r0, r2  @ __fp16
> 
> but g:9d20529d94b23275885f380d155fe8671ab5353a means that we can
> load directly into r0:
> 
>         bl      swap
>         ldrh    r0, [sp, #4]    @ __fp16
> 
> This patch updates the tests to "defend" this change.
> 
> While there, the scans include:
> 
> mov\tr1, r[03]}
> 
> But if the spill of r2 occurs first, there's no real reason why
> r2 couldn't be used as the temporary, instead r3.
> 
> The patch tries to update the scans while preserving the spirit
> of the originals.
> 
> Spot-checked with armv8l-unknown-linux-gnueabihf.  OK to install?
> 
> Richard

OK.

I'm not sure that these tests are really doing very much; it would probably be better if they could be rewritten using the gcc.target/arm/aapcs framework.  But that's for another day.

R.

> 
> 
> gcc/testsuite/
> 	* gcc.target/arm/fp16-aapcs-2.c: Expect the return value to be
> 	loaded directly from the stack.  Test that the swap generates
> 	two moves out of r0/r1 and two moves in.
> 	* gcc.target/arm/fp16-aapcs-4.c: Likewise.
> ---
>  gcc/testsuite/gcc.target/arm/fp16-aapcs-2.c | 8 +++++---
>  gcc/testsuite/gcc.target/arm/fp16-aapcs-4.c | 8 +++++---
>  2 files changed, 10 insertions(+), 6 deletions(-)
> 
> diff --git a/gcc/testsuite/gcc.target/arm/fp16-aapcs-2.c b/gcc/testsuite/gcc.target/arm/fp16-aapcs-2.c
> index c34387f5782..12d20560f53 100644
> --- a/gcc/testsuite/gcc.target/arm/fp16-aapcs-2.c
> +++ b/gcc/testsuite/gcc.target/arm/fp16-aapcs-2.c
> @@ -16,6 +16,8 @@ F (__fp16 a, __fp16 b, __fp16 c)
>    return c;
>  }
>  
> -/* { dg-final { scan-assembler-times {mov\tr[0-9]+, r[0-2]} 3 } }  */
> -/* { dg-final { scan-assembler-times {mov\tr1, r[03]} 1 } }  */
> -/* { dg-final { scan-assembler-times {mov\tr0, r[0-9]+} 2 } }  */
> +/* The swap must include two moves out of r0/r1 and two moves in.  */
> +/* { dg-final { scan-assembler-times {mov\tr[0-9]+, r[01]} 2 } }  */
> +/* { dg-final { scan-assembler-times {mov\tr[01], r[0-9]+} 2 } }  */
> +/* c should be spilled around the call.  */
> +/* { dg-final { scan-assembler {str\tr2, ([^\n]*).*ldrh\tr0, \1} { target arm_little_endian } } } */
> diff --git a/gcc/testsuite/gcc.target/arm/fp16-aapcs-4.c b/gcc/testsuite/gcc.target/arm/fp16-aapcs-4.c
> index daac29137ae..09fa64aa494 100644
> --- a/gcc/testsuite/gcc.target/arm/fp16-aapcs-4.c
> +++ b/gcc/testsuite/gcc.target/arm/fp16-aapcs-4.c
> @@ -16,6 +16,8 @@ F (__fp16 a, __fp16 b, __fp16 c)
>    return c;
>  }
>  
> -/* { dg-final { scan-assembler-times {mov\tr[0-9]+, r[0-2]} 3 } }  */
> -/* { dg-final { scan-assembler-times {mov\tr1, r[03]} 1 } }  */
> -/* { dg-final { scan-assembler-times {mov\tr0, r[0-9]+} 2 } }  */
> +/* The swap must include two moves out of r0/r1 and two moves in.  */
> +/* { dg-final { scan-assembler-times {mov\tr[0-9]+, r[01]} 2 } }  */
> +/* { dg-final { scan-assembler-times {mov\tr[01], r[0-9]+} 2 } }  */
> +/* c should be spilled around the call.  */
> +/* { dg-final { scan-assembler {str\tr2, ([^\n]*).*ldrh\tr0, \1} { target arm_little_endian } } } */
Adhemerval Zanella July 23, 2024, 4:25 p.m. UTC | #2
On 19/07/24 11:25, Richard Earnshaw (lists) wrote:
> On 11/07/2024 19:31, Richard Sandiford wrote:
>> These tests used to generate:
>>
>>         bl      swap
>>         ldr     r2, [sp, #4]
>>         mov     r0, r2  @ __fp16
>>
>> but g:9d20529d94b23275885f380d155fe8671ab5353a means that we can
>> load directly into r0:
>>
>>         bl      swap
>>         ldrh    r0, [sp, #4]    @ __fp16
>>
>> This patch updates the tests to "defend" this change.
>>
>> While there, the scans include:
>>
>> mov\tr1, r[03]}
>>
>> But if the spill of r2 occurs first, there's no real reason why
>> r2 couldn't be used as the temporary, instead r3.
>>
>> The patch tries to update the scans while preserving the spirit
>> of the originals.
>>
>> Spot-checked with armv8l-unknown-linux-gnueabihf.  OK to install?
>>
>> Richard
> 
> OK.
> 
> I'm not sure that these tests are really doing very much; it would probably be better if they could be rewritten using the gcc.target/arm/aapcs framework.  But that's for another day.
> 
> R.

Hi Richard,

It seems that this not fully fixed on all configurations, I am still seeing:

FAIL: gcc.target/arm/fp16-aapcs-1.c scan-assembler vmov\\.f32\\ts1, s0
FAIL: gcc.target/arm/fp16-aapcs-2.c scan-assembler-times mov\\tr[01], r[0-9]+ 2
FAIL: gcc.target/arm/fp16-aapcs-2.c scan-assembler str\\tr2, ([^\\n]*).*ldrh\\tr0, \\1
FAIL: gcc.target/arm/fp16-aapcs-3.c scan-assembler vmov\\.f32\\ts1, s0
FAIL: gcc.target/arm/fp16-aapcs-4.c scan-assembler-times mov\\tr[01], r[0-9]+ 2
FAIL: gcc.target/arm/fp16-aapcs-4.c scan-assembler str\\tr2, ([^\\n]*).*ldrh\\tr0, \\1

The gcc is configured with --with-float=hard --with-fpu=vfpv3-d16 --with-mode=thumb --with-tune=cortex-a9 --with-arch=armv7-a 

https://ci.linaro.org/job/tcwg_gnu_cross_check_gcc--master-arm-build/1561/artifact/artifacts/00-sumfiles/
Richard Earnshaw (lists) July 23, 2024, 5:19 p.m. UTC | #3
On 23/07/2024 17:25, Adhemerval Zanella Netto wrote:
> 
> 
> On 19/07/24 11:25, Richard Earnshaw (lists) wrote:
>> On 11/07/2024 19:31, Richard Sandiford wrote:
>>> These tests used to generate:
>>>
>>>          bl      swap
>>>          ldr     r2, [sp, #4]
>>>          mov     r0, r2  @ __fp16
>>>
>>> but g:9d20529d94b23275885f380d155fe8671ab5353a means that we can
>>> load directly into r0:
>>>
>>>          bl      swap
>>>          ldrh    r0, [sp, #4]    @ __fp16
>>>
>>> This patch updates the tests to "defend" this change.
>>>
>>> While there, the scans include:
>>>
>>> mov\tr1, r[03]}
>>>
>>> But if the spill of r2 occurs first, there's no real reason why
>>> r2 couldn't be used as the temporary, instead r3.
>>>
>>> The patch tries to update the scans while preserving the spirit
>>> of the originals.
>>>
>>> Spot-checked with armv8l-unknown-linux-gnueabihf.  OK to install?
>>>
>>> Richard
>>
>> OK.
>>
>> I'm not sure that these tests are really doing very much; it would probably be better if they could be rewritten using the gcc.target/arm/aapcs framework.  But that's for another day.
>>
>> R.
> 
> Hi Richard,
> 
> It seems that this not fully fixed on all configurations, I am still seeing:
> 
> FAIL: gcc.target/arm/fp16-aapcs-1.c scan-assembler vmov\\.f32\\ts1, s0
> FAIL: gcc.target/arm/fp16-aapcs-2.c scan-assembler-times mov\\tr[01], r[0-9]+ 2
> FAIL: gcc.target/arm/fp16-aapcs-2.c scan-assembler str\\tr2, ([^\\n]*).*ldrh\\tr0, \\1
> FAIL: gcc.target/arm/fp16-aapcs-3.c scan-assembler vmov\\.f32\\ts1, s0
> FAIL: gcc.target/arm/fp16-aapcs-4.c scan-assembler-times mov\\tr[01], r[0-9]+ 2
> FAIL: gcc.target/arm/fp16-aapcs-4.c scan-assembler str\\tr2, ([^\\n]*).*ldrh\\tr0, \\1
> 
> The gcc is configured with --with-float=hard --with-fpu=vfpv3-d16 --with-mode=thumb --with-tune=cortex-a9 --with-arch=armv7-a
> 
> https://ci.linaro.org/job/tcwg_gnu_cross_check_gcc--master-arm-build/1561/artifact/artifacts/00-sumfiles/

That looks like a wider problem.  Did the test ever work for that set of 
configure options?

R.
diff mbox series

Patch

diff --git a/gcc/testsuite/gcc.target/arm/fp16-aapcs-2.c b/gcc/testsuite/gcc.target/arm/fp16-aapcs-2.c
index c34387f5782..12d20560f53 100644
--- a/gcc/testsuite/gcc.target/arm/fp16-aapcs-2.c
+++ b/gcc/testsuite/gcc.target/arm/fp16-aapcs-2.c
@@ -16,6 +16,8 @@  F (__fp16 a, __fp16 b, __fp16 c)
   return c;
 }
 
-/* { dg-final { scan-assembler-times {mov\tr[0-9]+, r[0-2]} 3 } }  */
-/* { dg-final { scan-assembler-times {mov\tr1, r[03]} 1 } }  */
-/* { dg-final { scan-assembler-times {mov\tr0, r[0-9]+} 2 } }  */
+/* The swap must include two moves out of r0/r1 and two moves in.  */
+/* { dg-final { scan-assembler-times {mov\tr[0-9]+, r[01]} 2 } }  */
+/* { dg-final { scan-assembler-times {mov\tr[01], r[0-9]+} 2 } }  */
+/* c should be spilled around the call.  */
+/* { dg-final { scan-assembler {str\tr2, ([^\n]*).*ldrh\tr0, \1} { target arm_little_endian } } } */
diff --git a/gcc/testsuite/gcc.target/arm/fp16-aapcs-4.c b/gcc/testsuite/gcc.target/arm/fp16-aapcs-4.c
index daac29137ae..09fa64aa494 100644
--- a/gcc/testsuite/gcc.target/arm/fp16-aapcs-4.c
+++ b/gcc/testsuite/gcc.target/arm/fp16-aapcs-4.c
@@ -16,6 +16,8 @@  F (__fp16 a, __fp16 b, __fp16 c)
   return c;
 }
 
-/* { dg-final { scan-assembler-times {mov\tr[0-9]+, r[0-2]} 3 } }  */
-/* { dg-final { scan-assembler-times {mov\tr1, r[03]} 1 } }  */
-/* { dg-final { scan-assembler-times {mov\tr0, r[0-9]+} 2 } }  */
+/* The swap must include two moves out of r0/r1 and two moves in.  */
+/* { dg-final { scan-assembler-times {mov\tr[0-9]+, r[01]} 2 } }  */
+/* { dg-final { scan-assembler-times {mov\tr[01], r[0-9]+} 2 } }  */
+/* c should be spilled around the call.  */
+/* { dg-final { scan-assembler {str\tr2, ([^\n]*).*ldrh\tr0, \1} { target arm_little_endian } } } */