Message ID | mpt34mzvzee.fsf@arm.com |
---|---|
State | New |
Headers | show |
Series | aarch64: Fix caller saves of VNx2QI [PR116238] | expand |
> On 20 Aug 2024, at 20:10, Richard Sandiford <richard.sandiford@arm.com> wrote: > > External email: Use caution opening links or attachments > > > The testcase contains a VNx2QImode pseudo that is live across a call > and that cannot be allocated a call-preserved register. LRA quite > reasonably tried to save it before the call and restore it afterwards. > Unfortunately, the target told it to do that in SImode, even though > punning between SImode and VNx2QImode is disallowed by both > TARGET_CAN_CHANGE_MODE_CLASS and TARGET_MODES_TIEABLE_P. > > The natural class to use for SImode is GENERAL_REGS, so this led > to an unsalvageable situation in which we had: > > (set (subreg:VNx2QI (reg:SI A) 0) (reg:VNx2QI B)) > > where A needed GENERAL_REGS and B needed FP_REGS. We therefore ended > up in a reload loop. > > The hooks above should ensure that this situation can never occur > for incoming subregs. It only happened here because the target > explicitly forced it. > > The decision to use SImode for modes smaller than 4 bytes dates > back to the beginning of the port, before 16-bit floating-point > modes existed. I'm not sure whether promoting to SImode really > makes sense for any FPR, but that's a separate performance/QoI > discussion. For now, this patch just disallows using SImode > when it is wrong for correctness reasons, since that should be > safer to backport. > > Bootstrapped & regression-tested on aarch64-linux-gnu. I'll leave > a day or so for comments before pushing. The explanation makes sense to me FWIW. Thanks for fixing this. Kyrill > > Richard > > > gcc/ > PR testsuite/116238 > * config/aarch64/aarch64.cc (aarch64_hard_regno_caller_save_mode): > Only return SImode if we can convert to and from it. > > gcc/testsuite/ > PR testsuite/116238 > * gcc.target/aarch64/sve/pr116238.c: New test. > --- > gcc/config/aarch64/aarch64.cc | 7 ++++--- > gcc/testsuite/gcc.target/aarch64/sve/pr116238.c | 13 +++++++++++++ > 2 files changed, 17 insertions(+), 3 deletions(-) > create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/pr116238.c > > diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc > index bfd7bcdef7c..4e312c43576 100644 > --- a/gcc/config/aarch64/aarch64.cc > +++ b/gcc/config/aarch64/aarch64.cc > @@ -2521,10 +2521,11 @@ aarch64_hard_regno_caller_save_mode (unsigned regno, unsigned, > unnecessarily significant. */ > if (PR_REGNUM_P (regno)) > return mode; > - if (known_ge (GET_MODE_SIZE (mode), 4)) > - return mode; > - else > + if (known_lt (GET_MODE_SIZE (mode), 4) > + && REG_CAN_CHANGE_MODE_P (regno, mode, SImode) > + && REG_CAN_CHANGE_MODE_P (regno, SImode, mode)) > return SImode; > + return mode; > } > > /* Return true if I's bits are consecutive ones from the MSB. */ > diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pr116238.c b/gcc/testsuite/gcc.target/aarch64/sve/pr116238.c > new file mode 100644 > index 00000000000..fe66b198107 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/aarch64/sve/pr116238.c > @@ -0,0 +1,13 @@ > +/* { dg-additional-options "-O2 -msve-vector-bits=128" } */ > + > +void foo(); > +typedef unsigned char v2qi __attribute__((vector_size(2))); > +void f(v2qi *ptr) > +{ > + v2qi x = *ptr; > + asm volatile ("" :: "w" (x)); > + asm volatile ("" ::: "d8", "d9", "d10", "d11", "d12", "d13", "d14", "d15"); > + foo(); > + asm volatile ("" :: "w" (x)); > + *ptr = x; > +} > -- > 2.25.1 >
diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc index bfd7bcdef7c..4e312c43576 100644 --- a/gcc/config/aarch64/aarch64.cc +++ b/gcc/config/aarch64/aarch64.cc @@ -2521,10 +2521,11 @@ aarch64_hard_regno_caller_save_mode (unsigned regno, unsigned, unnecessarily significant. */ if (PR_REGNUM_P (regno)) return mode; - if (known_ge (GET_MODE_SIZE (mode), 4)) - return mode; - else + if (known_lt (GET_MODE_SIZE (mode), 4) + && REG_CAN_CHANGE_MODE_P (regno, mode, SImode) + && REG_CAN_CHANGE_MODE_P (regno, SImode, mode)) return SImode; + return mode; } /* Return true if I's bits are consecutive ones from the MSB. */ diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pr116238.c b/gcc/testsuite/gcc.target/aarch64/sve/pr116238.c new file mode 100644 index 00000000000..fe66b198107 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/pr116238.c @@ -0,0 +1,13 @@ +/* { dg-additional-options "-O2 -msve-vector-bits=128" } */ + +void foo(); +typedef unsigned char v2qi __attribute__((vector_size(2))); +void f(v2qi *ptr) +{ + v2qi x = *ptr; + asm volatile ("" :: "w" (x)); + asm volatile ("" ::: "d8", "d9", "d10", "d11", "d12", "d13", "d14", "d15"); + foo(); + asm volatile ("" :: "w" (x)); + *ptr = x; +}