diff mbox series

aarch64: Fix caller saves of VNx2QI [PR116238]

Message ID mpt34mzvzee.fsf@arm.com
State New
Headers show
Series aarch64: Fix caller saves of VNx2QI [PR116238] | expand

Commit Message

Richard Sandiford Aug. 20, 2024, 6:10 p.m. UTC
The testcase contains a VNx2QImode pseudo that is live across a call
and that cannot be allocated a call-preserved register.  LRA quite
reasonably tried to save it before the call and restore it afterwards.
Unfortunately, the target told it to do that in SImode, even though
punning between SImode and VNx2QImode is disallowed by both
TARGET_CAN_CHANGE_MODE_CLASS and TARGET_MODES_TIEABLE_P.

The natural class to use for SImode is GENERAL_REGS, so this led
to an unsalvageable situation in which we had:

  (set (subreg:VNx2QI (reg:SI A) 0) (reg:VNx2QI B))

where A needed GENERAL_REGS and B needed FP_REGS.  We therefore ended
up in a reload loop.

The hooks above should ensure that this situation can never occur
for incoming subregs.  It only happened here because the target
explicitly forced it.

The decision to use SImode for modes smaller than 4 bytes dates
back to the beginning of the port, before 16-bit floating-point
modes existed.  I'm not sure whether promoting to SImode really
makes sense for any FPR, but that's a separate performance/QoI
discussion.  For now, this patch just disallows using SImode
when it is wrong for correctness reasons, since that should be
safer to backport.

Bootstrapped & regression-tested on aarch64-linux-gnu.  I'll leave
a day or so for comments before pushing.

Richard


gcc/
	PR testsuite/116238
	* config/aarch64/aarch64.cc (aarch64_hard_regno_caller_save_mode):
	Only return SImode if we can convert to and from it.

gcc/testsuite/
	PR testsuite/116238
	* gcc.target/aarch64/sve/pr116238.c: New test.
---
 gcc/config/aarch64/aarch64.cc                   |  7 ++++---
 gcc/testsuite/gcc.target/aarch64/sve/pr116238.c | 13 +++++++++++++
 2 files changed, 17 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pr116238.c

Comments

Kyrylo Tkachov Aug. 21, 2024, 6:36 a.m. UTC | #1
> On 20 Aug 2024, at 20:10, Richard Sandiford <richard.sandiford@arm.com> wrote:
> 
> External email: Use caution opening links or attachments
> 
> 
> The testcase contains a VNx2QImode pseudo that is live across a call
> and that cannot be allocated a call-preserved register.  LRA quite
> reasonably tried to save it before the call and restore it afterwards.
> Unfortunately, the target told it to do that in SImode, even though
> punning between SImode and VNx2QImode is disallowed by both
> TARGET_CAN_CHANGE_MODE_CLASS and TARGET_MODES_TIEABLE_P.
> 
> The natural class to use for SImode is GENERAL_REGS, so this led
> to an unsalvageable situation in which we had:
> 
>  (set (subreg:VNx2QI (reg:SI A) 0) (reg:VNx2QI B))
> 
> where A needed GENERAL_REGS and B needed FP_REGS.  We therefore ended
> up in a reload loop.
> 
> The hooks above should ensure that this situation can never occur
> for incoming subregs.  It only happened here because the target
> explicitly forced it.
> 
> The decision to use SImode for modes smaller than 4 bytes dates
> back to the beginning of the port, before 16-bit floating-point
> modes existed.  I'm not sure whether promoting to SImode really
> makes sense for any FPR, but that's a separate performance/QoI
> discussion.  For now, this patch just disallows using SImode
> when it is wrong for correctness reasons, since that should be
> safer to backport.
> 
> Bootstrapped & regression-tested on aarch64-linux-gnu.  I'll leave
> a day or so for comments before pushing.

The explanation makes sense to me FWIW.
Thanks for fixing this.
Kyrill

> 
> Richard
> 
> 
> gcc/
>        PR testsuite/116238
>        * config/aarch64/aarch64.cc (aarch64_hard_regno_caller_save_mode):
>        Only return SImode if we can convert to and from it.
> 
> gcc/testsuite/
>        PR testsuite/116238
>        * gcc.target/aarch64/sve/pr116238.c: New test.
> ---
> gcc/config/aarch64/aarch64.cc                   |  7 ++++---
> gcc/testsuite/gcc.target/aarch64/sve/pr116238.c | 13 +++++++++++++
> 2 files changed, 17 insertions(+), 3 deletions(-)
> create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pr116238.c
> 
> diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
> index bfd7bcdef7c..4e312c43576 100644
> --- a/gcc/config/aarch64/aarch64.cc
> +++ b/gcc/config/aarch64/aarch64.cc
> @@ -2521,10 +2521,11 @@ aarch64_hard_regno_caller_save_mode (unsigned regno, unsigned,
>      unnecessarily significant.  */
>   if (PR_REGNUM_P (regno))
>     return mode;
> -  if (known_ge (GET_MODE_SIZE (mode), 4))
> -    return mode;
> -  else
> +  if (known_lt (GET_MODE_SIZE (mode), 4)
> +      && REG_CAN_CHANGE_MODE_P (regno, mode, SImode)
> +      && REG_CAN_CHANGE_MODE_P (regno, SImode, mode))
>     return SImode;
> +  return mode;
> }
> 
> /* Return true if I's bits are consecutive ones from the MSB.  */
> diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pr116238.c b/gcc/testsuite/gcc.target/aarch64/sve/pr116238.c
> new file mode 100644
> index 00000000000..fe66b198107
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/pr116238.c
> @@ -0,0 +1,13 @@
> +/* { dg-additional-options "-O2 -msve-vector-bits=128" } */
> +
> +void foo();
> +typedef unsigned char v2qi __attribute__((vector_size(2)));
> +void f(v2qi *ptr)
> +{
> +  v2qi x = *ptr;
> +  asm volatile ("" :: "w" (x));
> +  asm volatile ("" ::: "d8", "d9", "d10", "d11", "d12", "d13", "d14", "d15");
> +  foo();
> +  asm volatile ("" :: "w" (x));
> +  *ptr = x;
> +}
> --
> 2.25.1
>
diff mbox series

Patch

diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index bfd7bcdef7c..4e312c43576 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -2521,10 +2521,11 @@  aarch64_hard_regno_caller_save_mode (unsigned regno, unsigned,
      unnecessarily significant.  */
   if (PR_REGNUM_P (regno))
     return mode;
-  if (known_ge (GET_MODE_SIZE (mode), 4))
-    return mode;
-  else
+  if (known_lt (GET_MODE_SIZE (mode), 4)
+      && REG_CAN_CHANGE_MODE_P (regno, mode, SImode)
+      && REG_CAN_CHANGE_MODE_P (regno, SImode, mode))
     return SImode;
+  return mode;
 }
 
 /* Return true if I's bits are consecutive ones from the MSB.  */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pr116238.c b/gcc/testsuite/gcc.target/aarch64/sve/pr116238.c
new file mode 100644
index 00000000000..fe66b198107
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/pr116238.c
@@ -0,0 +1,13 @@ 
+/* { dg-additional-options "-O2 -msve-vector-bits=128" } */
+
+void foo();
+typedef unsigned char v2qi __attribute__((vector_size(2)));
+void f(v2qi *ptr)
+{
+  v2qi x = *ptr;
+  asm volatile ("" :: "w" (x));
+  asm volatile ("" ::: "d8", "d9", "d10", "d11", "d12", "d13", "d14", "d15");
+  foo();
+  asm volatile ("" :: "w" (x));
+  *ptr = x;
+}