[v4,0/3] rs6000: Support more SSE4 intrinsics

Message ID 20211019011512.100358-1-pc@us.ibm.com

Paul A. Clarke Oct. 19, 2021, 1:15 a.m. UTC
v4:
- Of the original 6 patches in this series, I committed patches 2-5.
- Found an issue from v3: the new file "nmmintrin.h" also needs to be added
to "extra_headers" in gcc/config.gcc.  Unfortunately, I discovered this
after committing the patch which added "nmmintrin.h", so I've added a
new patch here.
- Added scheduling "barriers" to patch 2 after review from Segher.
- Noted an additional PR fixed by patch 3.

v3: Add "nmmintrin.h". _mm_cmpgt_epi64 is part of SSE4.2,
and users will expect to be able to include "nmmintrin.h",
even though "nmmintrin.h" just includes "smmintrin.h",
where all of the SSE4.2 implementations actually appear.
Only patch 5/6 changed from v2.
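
For reference, a minimal scalar sketch of the per-lane comparison that
_mm_cmpgt_epi64 performs. The names ref_v2di and ref_cmpgt_epi64 are
illustrative only and do not appear in any of the patches; real user code
would include "nmmintrin.h" and call the intrinsic directly.

```c
#include <stdint.h>

/* Hypothetical scalar model of a 128-bit vector holding two signed
   64-bit lanes; not a type from any header in this series.  */
typedef struct { int64_t lane[2]; } ref_v2di;

/* Reference semantics of _mm_cmpgt_epi64: each result lane is all ones
   when a > b as signed 64-bit integers, otherwise zero.  */
static ref_v2di
ref_cmpgt_epi64 (ref_v2di a, ref_v2di b)
{
  ref_v2di r;
  for (int i = 0; i < 2; i++)
    r.lane[i] = (a.lane[i] > b.lane[i]) ? -1 : 0;
  return r;
}
```

The result is a mask rather than a boolean, so callers typically combine it
with bitwise operations to select between lanes.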

Tested on ppc64le (POWER9) and ppc64/32 (POWER7).

OK for trunk?

Paul A. Clarke (3):
  rs6000: Add nmmintrin.h to extra_headers
  rs6000: Support SSE4.1 "round" intrinsics
  rs6000: Guard some x86 intrinsics implementations

 gcc/config.gcc                                |   1 +
 gcc/config/rs6000/emmintrin.h                 |  12 +-
 gcc/config/rs6000/pmmintrin.h                 |   4 +
 gcc/config/rs6000/smmintrin.h                 | 296 ++++++++++++++----
 gcc/config/rs6000/tmmintrin.h                 |  12 +
 .../gcc.target/powerpc/sse4_1-round3.h        |  81 +++++
 .../gcc.target/powerpc/sse4_1-roundpd.c       | 143 +++++++++
 .../gcc.target/powerpc/sse4_1-roundps.c       |  98 ++++++
 .../gcc.target/powerpc/sse4_1-roundsd.c       | 256 +++++++++++++++
 .../gcc.target/powerpc/sse4_1-roundss.c       | 208 ++++++++++++
 .../gcc.target/powerpc/sse4_2-pcmpgtq.c       |   4 +-
 11 files changed, 1039 insertions(+), 76 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-round3.h
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-roundpd.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-roundps.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-roundsd.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-roundss.c