Message ID | AM6PR10MB2566767783942E80F11CE7A7E4BA0@AM6PR10MB2566.EURPRD10.PROD.OUTLOOK.COM |
---|---|
State | New |
Headers | show |
Series | Fixup the recently added arm/pr91603.c test case | expand |
On 06/09/2019 11:28, Bernd Edlinger wrote: > Hi! > > It was pointed out in the PR that the test case fails on big endian > targets. > > This is probably obvious, since the test case scans the assembler > output for vld1.32, vst1.32, vldr and vstr, but these are never > generated for -mbig-endian. > > So added dg-require-effective-target arm_little_endian. > > > Is it OK for trunk? > > > Thanks > Bernd. > I think this test needs further work - there's opportunity for the compiler to optimize it by placement of both a and x, since it can see both definitions. Note that vldr is safe on 32-bit (or higher) aligned objects, so if GCC can see the definition it can use that information (the actual rather than the declared placement) in its optimizations. Changing the global to extern, or making it a pointer and dereferencing that is probably enough. Finally, the test is really just a memory copy operation. There's no real reason why it has to use the vector registers for this at all. I think if you want to force them, you'll need to do some operations on the values as well so that the compiler has to place the values in the vector registers rather than anywhere it sees fit. R.
On 9/6/19 12:47 PM, Richard Earnshaw (lists) wrote: > On 06/09/2019 11:28, Bernd Edlinger wrote: >> Hi! >> >> It was pointed out in the PR that the test case fails on big endian >> targets. >> >> This is probably obvious, since the test case scans the assembler >> output for vld1.32, vst1.32, vldr and vstr, but these are never >> generated for -mbig-endian. >> >> So added dg-require-effective-target arm_little_endian. >> >> >> Is it OK for trunk? >> >> >> Thanks >> Bernd. >> > > I think this test needs further work - there's opportunity for the compiler to optimize it by placement of both a and x, since it can see both definitions. Note that vldr is safe on 32-bit (or higher) aligned objects, so if GCC can see the definition it can use that information (the actual rather than the declared placement) in its optimizations. > > Changing the global to extern, or making it a pointer and dereferencing that is probably enough. > done. > Finally, the test is really just a memory copy operation. There's no real reason why it has to use the vector registers for this at all. I think if you want to force them, you'll need to do some operations on the values as well so that the compiler has to place the values in the vector registers rather than anywhere it sees fit. > Okay, nailed the test case down, to use neon registers. Is it OK now? Thanks Bernd.
Hi, I forgot to ping this, is the updated patch OK? On 9/6/19 1:27 PM, Bernd Edlinger wrote: > On 9/6/19 12:47 PM, Richard Earnshaw (lists) wrote: >> On 06/09/2019 11:28, Bernd Edlinger wrote: >>> Hi! >>> >>> It was pointed out in the PR that the test case fails on big endian >>> targets. >>> >>> This is probably obvious, since the test case scans the assembler >>> output for vld1.32, vst1.32, vldr and vstr, but these are never >>> generated for -mbig-endian. >>> >>> So added dg-require-effective-target arm_little_endian. >>> >>> >>> Is it OK for trunk? >>> >>> >>> Thanks >>> Bernd. >>> >> >> I think this test needs further work - there's opportunity for the compiler to optimize it by placement of both a and x, since it can see both definitions. Note that vldr is safe on 32-bit (or higher) aligned objects, so if GCC can see the definition it can use that information (the actual rather than the declared placement) in its optimizations. >> >> Changing the global to extern, or making it a pointer and dereferencing that is probably enough. >> > > done. > >> Finally, the test is really just a memory copy operation. There's no real reason why it has to use the vector registers for this at all. I think if you want to force them, you'll need to do some operations on the values as well so that the compiler has to place the values in the vector registers rather than anywhere it sees fit. >> > > Okay, nailed the test case down, to use neon registers. > > Is it OK now? > > > Thanks > Bernd. > > 2019-09-06 Bernd Edlinger <bernd.edlinger@hotmail.de> > > * gcc.target/arm/pr91603.c: Add dg-require-effective-target > arm_little_endian. Force value in neon register. > > Index: gcc/testsuite/gcc.target/arm/pr91603.c > =================================================================== > --- gcc/testsuite/gcc.target/arm/pr91603.c (revision 275409) > +++ gcc/testsuite/gcc.target/arm/pr91603.c (working copy) > @@ -1,5 +1,6 @@ > /* { dg-do compile } */ > /* { dg-require-effective-target arm_neon_ok } */ > +/* { dg-require-effective-target arm_little_endian } */ > /* { dg-options "-O3" } */ > /* { dg-add-options arm_neon } */ > > @@ -6,7 +7,7 @@ > typedef __simd64_int32_t int32x2_t; > typedef __attribute__((aligned (1))) int32x2_t unalignedvec; > > -unalignedvec a = {11, 13}; > +extern unalignedvec a; > > void foo(unalignedvec *); > > @@ -13,7 +14,9 @@ void foo(unalignedvec *); > void test() > { > unalignedvec x = a; > + __asm__ ("@ value in %h0": "=w"(x) : "0"(x)); > foo (&x); > + __asm__ ("@ value in %h0": "=w"(x) : "0"(x)); > a = x; > } >
2019-09-06 Bernd Edlinger <bernd.edlinger@hotmail.de> * gcc.target/arm/pr91603.c: Add dg-require-effective-target arm_little_endian. Index: gcc/testsuite/gcc.target/arm/pr91603.c =================================================================== --- gcc/testsuite/gcc.target/arm/pr91603.c (revision 275409) +++ gcc/testsuite/gcc.target/arm/pr91603.c (working copy) @@ -1,5 +1,6 @@ /* { dg-do compile } */ /* { dg-require-effective-target arm_neon_ok } */ +/* { dg-require-effective-target arm_little_endian } */ /* { dg-options "-O3" } */ /* { dg-add-options arm_neon } */