
[AArch64] Fix aarch64_simd_valid_immediate for Bigendian

Message ID 532C52D4.40904@arm.com
State New

Commit Message

Alan Lawrence March 21, 2014, 2:55 p.m. UTC
This patch fixes a bug whereby a vector like V8QImode {1,0,1,0,1,0,1,0} can 
result in an instruction like

movi v1.4h, 0x1

whereas on big-endian this should be

movi v1.4h, 0x1, lsl 8
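
To make the difference between the two forms concrete, here is a minimal sketch of the arithmetic, not code from GCC. The helper names halfword_lane and movi_h_value are hypothetical; the only facts assumed are that a .4h lane is 16 bits wide and that "lsl 8" shifts the 8-bit immediate left by eight bits before it is replicated into each lane.

```c
#include <stdint.h>

/* Value of one 16-bit lane given its two byte lanes, least significant
   register byte first.  Hypothetical helper, for illustration only.  */
static uint16_t
halfword_lane (uint8_t low_byte, uint8_t high_byte)
{
  return (uint16_t) (low_byte | (high_byte << 8));
}

/* Value that each .4h lane receives from "movi vN.4h, imm8, lsl shift".
   Hypothetical helper, for illustration only.  */
static uint16_t
movi_h_value (uint8_t imm8, unsigned shift)
{
  return (uint16_t) ((unsigned) imm8 << shift);
}
```

If the byte pair reaches the register as (1, 0), the lane holds 0x0001, which is what the unshifted movi produces. If the pair arrives as (0, 1), the lane holds 0x0100, which needs the "lsl 8" form.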

Regression tested on aarch64_be-none-elf: no changes in libstdc++, newlib; no 
regressions in gcc or g++ and FAIL->PASS as listed below.

OK for trunk (stage 4)?

Cheers, Alan

gcc/ChangeLog:

2014-03-21  Alan Lawrence  <alan.lawrence@arm.com>

         * config/aarch64/aarch64.c (aarch64_simd_valid_immediate): reverse order
         of elements for bigendian.

=====
FAIL->PASS in gcc testsuite:

c-c++-common/cilk-plus/PS/reduction-1.c  -ftree-vectorize -fcilkplus -std=c99 execution test
gcc.c-torture/execute/20000112-1.c execution,  -O0
gcc.c-torture/execute/900409-1.c execution,  -O0
gcc.c-torture/execute/p18298.c execution,  -O0
gcc.c-torture/execute/pr53645-2.c execution,  -O1
gcc.c-torture/execute/pr53645-2.c execution,  -O2
gcc.c-torture/execute/pr53645-2.c execution,  -O2 -flto
gcc.c-torture/execute/pr53645-2.c execution,  -O2 -flto -flto-partition=none
gcc.c-torture/execute/pr53645-2.c execution,  -O2 -flto -fno-use-linker-plugin -flto-partition=none
gcc.c-torture/execute/pr53645-2.c execution,  -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects
gcc.c-torture/execute/pr53645-2.c execution,  -O3 -fomit-frame-pointer
gcc.c-torture/execute/pr53645-2.c execution,  -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions
gcc.c-torture/execute/pr53645-2.c execution,  -O3 -fomit-frame-pointer -funroll-loops
gcc.c-torture/execute/pr53645-2.c execution,  -O3 -g
gcc.c-torture/execute/pr53645-2.c execution,  -Og -g
gcc.c-torture/execute/pr53645-2.c execution,  -Os
gcc.c-torture/execute/pr53645.c execution,  -O1
gcc.c-torture/execute/pr53645.c execution,  -O2
gcc.c-torture/execute/pr53645.c execution,  -O2 -flto
gcc.c-torture/execute/pr53645.c execution,  -O2 -flto -flto-partition=none
gcc.c-torture/execute/pr53645.c execution,  -O2 -flto -fno-use-linker-plugin -flto-partition=none
gcc.c-torture/execute/pr53645.c execution,  -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects
gcc.c-torture/execute/pr53645.c execution,  -O3 -fomit-frame-pointer
gcc.c-torture/execute/pr53645.c execution,  -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions
gcc.c-torture/execute/pr53645.c execution,  -O3 -fomit-frame-pointer -funroll-loops
gcc.c-torture/execute/pr53645.c execution,  -O3 -g
gcc.c-torture/execute/pr53645.c execution,  -Og -g

FAIL->PASS in g++ testsuite:

g++.dg/torture/pr37922.C  -O3 -fomit-frame-pointer  execution test
g++.dg/torture/pr37922.C  -O3 -fomit-frame-pointer -funroll-loops  execution test
g++.dg/torture/pr37922.C  -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions  execution test
g++.dg/torture/pr37922.C  -O3 -g  execution test
g++.dg/torture/pr37922.C  -O3 -fomit-frame-pointer  execution test
g++.dg/torture/pr37922.C  -O3 -fomit-frame-pointer -funroll-loops  execution test
g++.dg/torture/pr37922.C  -O3 -fomit-frame-pointer -funroll-all-loops -finline-functions  execution test
g++.dg/torture/pr37922.C  -O3 -g  execution test

Comments

Marcus Shawcroft March 24, 2014, 11:32 a.m. UTC | #1
On 21 March 2014 14:55, Alan Lawrence <alan.lawrence@arm.com> wrote:
> This patch fixes a bug whereby a vector like V8QImode {1,0,1,0,1,0,1,0} can
> result in an instruction like
>
> movi v1.4h, 0x1
>
> whereas on bigendian this should be
>
> movi v1.4h, 0x1, lsl 8
>
> Regression tested on aarch64_be-none-elf: no changes in libstdc++, newlib;
> no regressions in gcc or g++ and FAIL->PASS as listed below.
>
> Ok for trunk (stage 4) ?


> Cheers, Alan
>
> gcc/ChangeLog:
>
> 2014-03-21  Alan Lawrence  alan.lawrence@arm.com
>
>         * config/aarch64/aarch64.c (aarch64_simd_valid_immediate): reverse order
>         of elements for bigendian.

s/reverse/Reverse/

This should be fixed now in stage 4; the fix looks straightforward.
If there are no objections from the RMs in the next 24 hours, go ahead and
commit it.

Cheers
/Marcus

James Greenhalgh March 25, 2014, 4:01 p.m. UTC | #2
On Mon, Mar 24, 2014 at 11:32:39AM +0000, Marcus Shawcroft wrote:
> On 21 March 2014 14:55, Alan Lawrence <alan.lawrence@arm.com> wrote:
> > This patch fixes a bug whereby a vector like V8QImode {1,0,1,0,1,0,1,0} can
> > result in an instruction like
> >
> > movi v1.4h, 0x1
> >
> > whereas on bigendian this should be
> >
> > movi v1.4h, 0x1, lsl 8
> >
> > Regression tested on aarch64_be-none-elf: no changes in libstdc++, newlib;
> > no regressions in gcc or g++ and FAIL->PASS as listed below.
> >
> > Ok for trunk (stage 4) ?
> 
> 
> > Cheers, Alan
> >
> > gcc/ChangeLog:
> >
> > 2014-03-21  Alan Lawrence  alan.lawrence@arm.com
> >
> >         * config/aarch64/aarch64.c (aarch64_simd_valid_immediate): reverse order
> >         of elements for bigendian.
> 
> s/reverse/Reverse/
> 
> This should be fixed now in stage-4, the fix looks straight forward.
> If there are no objections from RM's in the next 24 hours go ahead and
> commit it.
> 

I've committed this on Alan's behalf as revision 208814, with the
ChangeLog below.

Thanks,
James

gcc/

2014-03-25  Alan Lawrence  <alan.lawrence@arm.com>

	* config/aarch64/aarch64.c (aarch64_simd_valid_immediate): Reverse
	order of elements for big-endian.

Patch

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index f24b248..3166ebd 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -6563,7 +6563,9 @@  aarch64_simd_valid_immediate (rtx op, enum machine_mode mode, bool inverse,
   /* Splat vector constant out into a byte vector.  */
   for (i = 0; i < n_elts; i++)
     {
-      rtx el = CONST_VECTOR_ELT (op, i);
+      /* The vector is provided in gcc endian-neutral fashion.  For aarch64_be,
+         it must be laid out in the vector register in reverse order.  */
+      rtx el = CONST_VECTOR_ELT (op, BYTES_BIG_ENDIAN ? (n_elts - 1 - i) : i);
       unsigned HOST_WIDE_INT elpart;
       unsigned int part, parts;
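
For readers without the surrounding function to hand, the effect of the changed index can be sketched in standalone C. This is a simplified model under stated assumptions, not the code in aarch64.c: elements are single bytes, the big_endian_p parameter stands in for BYTES_BIG_ENDIAN, and the names splat_bytes and movi_h_shift are hypothetical. It only checks the two 16-bit-lane forms mentioned in the commit message.

```c
#include <stddef.h>
#include <stdint.h>

/* Splat a byte-element vector constant into BYTES in register order.
   BIG_ENDIAN_P stands in for BYTES_BIG_ENDIAN (assumption of this sketch);
   when set, elements are read in reverse, as in the patch.  */
static void
splat_bytes (const uint8_t *elts, size_t n_elts, int big_endian_p,
             uint8_t *bytes)
{
  for (size_t i = 0; i < n_elts; i++)
    bytes[i] = elts[big_endian_p ? (n_elts - 1 - i) : i];
}

/* Return the left shift (0 or 8) of a 16-bit-lane movi that reproduces
   BYTES, or -1 if neither form matches.  Hypothetical helper.  */
static int
movi_h_shift (const uint8_t *bytes, size_t n_bytes)
{
  int shift0 = 1, shift8 = 1;

  if (n_bytes < 2 || n_bytes % 2 != 0)
    return -1;

  for (size_t i = 0; i < n_bytes; i += 2)
    {
      /* Unshifted form: low byte repeats, high byte is zero.  */
      if (bytes[i] != bytes[0] || bytes[i + 1] != 0)
        shift0 = 0;
      /* "lsl 8" form: low byte is zero, high byte repeats.  */
      if (bytes[i] != 0 || bytes[i + 1] != bytes[1])
        shift8 = 0;
    }

  if (shift0)
    return 0;
  if (shift8)
    return 8;
  return -1;
}
```

With the V8QImode example from the commit message, {1,0,1,0,1,0,1,0}, this sketch yields a shift of 0 when big_endian_p is 0 and a shift of 8 when it is 1, which corresponds to the two movi forms quoted there.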