Message ID | CAFULd4aDOkG+zseZLwFnKDM-7zZrKFP=PTLDnRvivGh5ZOZ4AQ@mail.gmail.com |
---|---|
State | New |
Series | [i386]: Introduce signbit<mode>2 expander |

Hi Uros,

> Based on the recent work that enabled vectorization of
> __builtin_signbit on aarch64.
>
> 2019-05-21  Uroš Bizjak  <ubizjak@gmail.com>
>
>     * config/i386/sse.md (VF1_AVX2): New mode iterator.
>     (signbit<mode>2): New expander.
>
> testsuite/ChangeLog:
>
> 2019-05-21  Uroš Bizjak  <ubizjak@gmail.com>
>
>     * gcc.target/i386/vect-signbitf.c: New test.
>
> Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

the new testcase FAILs on i386-pc-solaris2.11 (both with the default
-m32 and -m64), but also on i586-unknown-freebsd11.2, i686-pc-linux-gnu:

+FAIL: gcc.target/i386/vect-signbitf.c scan-assembler-not -2147483648

Rainer

On Wed, May 22, 2019 at 11:04 AM Rainer Orth <ro@cebitec.uni-bielefeld.de> wrote:
>
> Hi Uros,
>
> > Based on the recent work that enabled vectorization of
> > __builtin_signbit on aarch64.
> >
> > 2019-05-21  Uroš Bizjak  <ubizjak@gmail.com>
> >
> >     * config/i386/sse.md (VF1_AVX2): New mode iterator.
> >     (signbit<mode>2): New expander.
> >
> > testsuite/ChangeLog:
> >
> > 2019-05-21  Uroš Bizjak  <ubizjak@gmail.com>
> >
> >     * gcc.target/i386/vect-signbitf.c: New test.
> >
> > Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.
>
> the new testcase FAILs on i386-pc-solaris2.11 (both with the default
> -m32 and -m64), but also on i586-unknown-freebsd11.2, i686-pc-linux-gnu:
>
> +FAIL: gcc.target/i386/vect-signbitf.c scan-assembler-not -2147483648

It works for me on x86_64-linux-gnu with -m32, so I'm at a loss as to what
might be wrong.

Uros.

Hi Uros,

>> the new testcase FAILs on i386-pc-solaris2.11 (both with the default
>> -m32 and -m64), but also on i586-unknown-freebsd11.2, i686-pc-linux-gnu:
>>
>> +FAIL: gcc.target/i386/vect-signbitf.c scan-assembler-not -2147483648
>
> It works for me on x86_64-linux-gnu with -m32, so I'm at a loss as to what
> might be wrong.

I just tried x86_64-pc-linux-gnu and i686-pc-linux-gnu bootstraps: no
failures (with or without -m32) in the former, fails for both -m32 and
-m64 in the latter, just as on i386-pc-solaris2.11.

Rainer

On Wed, May 22, 2019 at 6:36 AM Rainer Orth <ro@cebitec.uni-bielefeld.de> wrote:
>
> Hi Uros,
>
> >> the new testcase FAILs on i386-pc-solaris2.11 (both with the default
> >> -m32 and -m64), but also on i586-unknown-freebsd11.2, i686-pc-linux-gnu:
> >>
> >> +FAIL: gcc.target/i386/vect-signbitf.c scan-assembler-not -2147483648
> >
> > It works for me on x86_64-linux-gnu with -m32, so I'm at a loss as to what
> > might be wrong.
>
> I just tried x86_64-pc-linux-gnu and i686-pc-linux-gnu bootstraps: no
> failures (with or without -m32) in the former, fails for both -m32 and
> -m64 in the latter, just as on i386-pc-solaris2.11.
>

i686 GCC generates slightly different assembly code from x86-64 GCC
with -m32:

@@ -88,16 +88,16 @@ main.cold:
  .size a, 4096
 a:
  .long 0
- .long 2147483648
+ .long -2147483648 <<<<<<<< This matches /* { dg-final { scan-assembler-not "-2147483648" } } */
  .long 1065353216
- .long 3212836864
- .long 3221225472
+ .long -1082130432
+ .long -1073741824
  .long 1077936128
- .long 3231711232
- .long 3238002688
+ .long -1063256064
+ .long -1056964608
  .long 1095761920
  .long 1101529088
- .long 3251109888
+ .long -1043857408
  .long 1107558400
  .zero 4048

These numbers are equivalent. Should we check

andl $-2147483648, %edx

instead?

BTW, we also need -mfpmath=sse since it may not be the default for -m32.
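
The equivalence claimed above can be checked directly: a .long directive stores a 32-bit pattern, and the two spellings differ only in whether that pattern is printed as unsigned or as two's-complement signed. Below is a minimal C sketch, assuming float is IEEE-754 binary32 (as on x86). The helper names float_bits, as_signed32 and check_equivalence are made up for illustration and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Return the raw bit pattern of F.
   Assumption: float is a 32-bit IEEE-754 type, as on x86.  */
static uint32_t
float_bits (float f)
{
  uint32_t u;
  memcpy (&u, &f, sizeof u);
  return u;
}

/* Interpret a 32-bit pattern as a two's-complement signed value,
   computed in 64 bits to avoid implementation-defined narrowing.  */
static int64_t
as_signed32 (uint32_t u)
{
  if (u >= UINT32_C (0x80000000))
    return (int64_t) u - INT64_C (4294967296);
  return (int64_t) u;
}

/* Hypothetical self-check using values from the assembly diff.  */
static void
check_equivalence (void)
{
  /* -0.0f: only the sign bit is set.  */
  assert (float_bits (-0.0f) == UINT32_C (2147483648));
  assert (as_signed32 (float_bits (-0.0f)) == -INT64_C (2147483648));

  /* -1.0f: 3212836864 unsigned, -1082130432 signed.  */
  assert (float_bits (-1.0f) == UINT32_C (3212836864));
  assert (as_signed32 (float_bits (-1.0f)) == -INT64_C (1082130432));
}
```

Calling check_equivalence () from a test driver passes silently when the two printed forms describe the same bits. That is why the bare "-2147483648" pattern can hit initializer data for a[] rather than an instruction operand.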

On Wed, May 22, 2019 at 5:52 PM H.J. Lu <hjl.tools@gmail.com> wrote:
>
> On Wed, May 22, 2019 at 6:36 AM Rainer Orth <ro@cebitec.uni-bielefeld.de> wrote:
> >
> > Hi Uros,
> >
> > >> the new testcase FAILs on i386-pc-solaris2.11 (both with the default
> > >> -m32 and -m64), but also on i586-unknown-freebsd11.2, i686-pc-linux-gnu:
> > >>
> > >> +FAIL: gcc.target/i386/vect-signbitf.c scan-assembler-not -2147483648
> > >
> > > It works for me on x86_64-linux-gnu with -m32, so I'm at a loss as to what
> > > might be wrong.
> >
> > I just tried x86_64-pc-linux-gnu and i686-pc-linux-gnu bootstraps: no
> > failures (with or without -m32) in the former, fails for both -m32 and
> > -m64 in the latter, just as on i386-pc-solaris2.11.
> >
>
> i686 GCC generates slightly different assembly code from x86-64 GCC
> with -m32:
>
>
> @@ -88,16 +88,16 @@ main.cold:
>   .size a, 4096
>  a:
>   .long 0
> - .long 2147483648
> + .long -2147483648 <<<<<<<< This matches /* { dg-final {
> scan-assembler-not "-2147483648" } } */
>   .long 1065353216
> - .long 3212836864
> - .long 3221225472
> + .long -1082130432
> + .long -1073741824
>   .long 1077936128
> - .long 3231711232
> - .long 3238002688
> + .long -1063256064
> + .long -1056964608
>   .long 1095761920
>   .long 1101529088
> - .long 3251109888
> + .long -1043857408
>   .long 1107558400
>   .zero 4048
>
> These numbers are equivalent. Should we check
>
> andl $-2147483648, %edx

I think we can go with:

-/* { dg-final { scan-assembler-not "-2147483648" } } */
+/* { dg-final { scan-assembler-not "\\$-2147483648" } } */

> BTW, we also need -mfpmath=sse since it may not be the default for -m32.

No need for this; vectorization does not care about target math.

Uros.
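
To illustrate why anchoring the pattern on a leading "$" avoids the false match, here is a small C sketch. It uses POSIX regcomp/regexec as a stand-in for the Tcl regular expressions that the testsuite actually uses, and it assumes the directive reaches the regex engine as \$-2147483648. The helper name matches is hypothetical.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Return 1 if the POSIX extended regex PATTERN matches anywhere in
   TEXT, 0 if it does not, and -1 if PATTERN fails to compile.
   This only approximates the Tcl regex matching done by DejaGnu.  */
static int
matches (const char *pattern, const char *text)
{
  regex_t re;
  int rc;

  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return -1;
  rc = regexec (&re, text, 0, NULL, 0);
  regfree (&re);
  return rc == 0;
}
```

With this helper, matches ("-2147483648", "\t.long\t-2147483648") returns 1, which is the spurious hit on the data section. In contrast, matches ("\\$-2147483648", "\t.long\t-2147483648") returns 0, while the same pattern still returns 1 for "\tandl\t$-2147483648, %edx".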

Index: config/i386/sse.md
===================================================================
--- config/i386/sse.md (revision 271467)
+++ config/i386/sse.md (working copy)
@@ -279,6 +279,9 @@
 (define_mode_iterator VF1
   [(V16SF "TARGET_AVX512F") (V8SF "TARGET_AVX") V4SF])
 
+(define_mode_iterator VF1_AVX2
+  [(V16SF "TARGET_AVX512F") (V8SF "TARGET_AVX2") V4SF])
+
 ;; 128- and 256-bit SF vector modes
 (define_mode_iterator VF1_128_256
   [(V8SF "TARGET_AVX") V4SF])
@@ -3523,6 +3526,15 @@
   operands[4] = gen_reg_rtx (<MODE>mode);
 })
 
+(define_expand "signbit<mode>2"
+  [(set (match_operand:<sseintvecmode> 0 "register_operand")
+        (lshiftrt:<sseintvecmode>
+          (subreg:<sseintvecmode>
+            (match_operand:VF1_AVX2 1 "register_operand") 0)
+          (match_dup 2)))]
+  "TARGET_SSE2"
+  "operands[2] = GEN_INT (GET_MODE_UNIT_BITSIZE (<MODE>mode)-1);")
+
 ;; Also define scalar versions.  These are used for abs, neg, and
 ;; conditional move.  Using subregs into vector modes causes register
 ;; allocation lossage.  These patterns do not allow memory operands
Index: testsuite/gcc.target/i386/vect-signbitf.c
===================================================================
--- testsuite/gcc.target/i386/vect-signbitf.c (nonexistent)
+++ testsuite/gcc.target/i386/vect-signbitf.c (working copy)
@@ -0,0 +1,30 @@
+/* { dg-do run { target sse2_runtime } } */
+/* { dg-options "-O2 -msse2 -ftree-vectorize -fdump-tree-vect-details -save-temps" } */
+
+extern void abort ();
+
+#define N 1024
+float a[N] = {0.0f, -0.0f, 1.0f, -1.0f,
+              -2.0f, 3.0f, -5.0f, -8.0f,
+              13.0f, 21.0f, -25.0f, 33.0f};
+int r[N];
+
+int
+main (void)
+{
+  int i;
+
+  for (i = 0; i < N; i++)
+    r[i] = __builtin_signbitf (a[i]);
+
+  /* check results:  */
+  for (i = 0; i < N; i++)
+    if (__builtin_signbit (a[i]) && !r[i])
+      abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
+/* { dg-final { scan-assembler-not "-2147483648" } } */
+/* { dg-final { scan-assembler "psrld" } } */
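
As a reading aid for the expander in the patch: it views each float lane as an integer of the same width (the subreg) and applies a logical right shift (lshiftrt) by the element bit size minus one, which is 31 for single-precision lanes. This leaves 1 for a set sign bit and 0 otherwise, and the test expects it to show up as psrld. The following portable C sketch models that per-lane computation in scalar code, assuming float is IEEE-754 binary32. The names signbit_lane and signbit_array are hypothetical, and this is a model of the semantics rather than the code GCC generates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Scalar model of one vector lane: reinterpret the float's bits as an
   unsigned 32-bit integer and shift the sign bit down to bit 0.
   Assumption: float is a 32-bit IEEE-754 type.  */
static uint32_t
signbit_lane (float f)
{
  uint32_t bits;
  memcpy (&bits, &f, sizeof bits);
  return bits >> 31;  /* logical shift: result is 0 or 1 */
}

/* Apply the lane model to N elements; a vectorized loop would process
   several such lanes per instruction.  */
static void
signbit_array (const float *in, uint32_t *out, size_t n)
{
  for (size_t i = 0; i < n; i++)
    out[i] = signbit_lane (in[i]);
}
```

Because the shift needs no constant mask, nothing like an and with $-2147483648 has to appear in the instruction stream. The scan-assembler-not directive in the new test is meant to check exactly that.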