From patchwork Mon Aug 23 19:03:05 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Paul A. Clarke"
X-Patchwork-Id: 1519938
To: gcc-patches@gcc.gnu.org
Subject: [PATCH v3 1/6] rs6000: Support SSE4.1 "round" intrinsics
Date: Mon, 23 Aug 2021 14:03:05 -0500
Message-Id: <20210823190310.1679905-2-pc@us.ibm.com>
X-Mailer: git-send-email 2.27.0
In-Reply-To: <20210823190310.1679905-1-pc@us.ibm.com>
References: <20210823190310.1679905-1-pc@us.ibm.com>
List-Id: Gcc-patches mailing list
From: "Paul A. Clarke"
Reply-To: "Paul A. Clarke"
Cc: wschmidt@linux.ibm.com, segher@kernel.crashing.org

Suppress exceptions (when specified), by saving, manipulating, and
restoring the FPSCR.  Similarly, save, set, and restore the floating-point
rounding mode when required.

No attempt is made to optimize writing the FPSCR (by checking if the new
value would be the same), other than using lighter weight instructions
when possible.

The scalar versions naively use the parallel versions to compute the
single scalar result and then construct the remainder of the result.

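The two mechanisms described above can be sketched in portable C.  This is
only an illustration under stated assumptions, not the code in this patch:
the FPSCR is modeled as a plain 64-bit integer whose low byte holds the
five exception-enable bits (mask 0xf8), the NI bit, and the two RN bits;
every sketch_* name is hypothetical; and truncation by integer cast stands
in for the real vector rounding so that no math library is needed.

```c
#include <assert.h>
#include <stdint.h>

/* Exception-enable bits (VE, OE, UE, ZE, XE) in the low byte of the
   modeled FPSCR.  The remaining low bits are NI (0x04) and RN (0x03).  */
#define SKETCH_ENABLES_MASK 0xf8ULL

/* Hypothetical model of "save and disable": return the enable bits that
   were set and clear them in *fpscr, leaving the rounding mode alone.  */
static uint64_t
sketch_disable_exceptions (uint64_t *fpscr)
{
  uint64_t enables_save = *fpscr & SKETCH_ENABLES_MASK;
  *fpscr &= ~SKETCH_ENABLES_MASK;
  return enables_save;
}

/* Hypothetical model of "restore": OR the saved enable bits back in.  */
static void
sketch_restore_exceptions (uint64_t *fpscr, uint64_t enables_save)
{
  /* Only enable bits may be restored.  */
  assert ((enables_save & ~SKETCH_ENABLES_MASK) == 0);
  *fpscr |= enables_save;
}

/* Stand-in for a "parallel" rounding of two lanes.  The cast truncates
   toward zero and is valid only for values that fit in a long long.  */
static void
sketch_trunc_pd (const double in[2], double out[2])
{
  for (int i = 0; i < 2; i++)
    out[i] = (double) (long long) in[i];
}

/* Scalar variant built on the parallel one: lane 0 is the rounded lane 0
   of b, lane 1 is passed through from a.  */
static void
sketch_trunc_sd (const double a[2], const double b[2], double out[2])
{
  double rounded[2];
  sketch_trunc_pd (b, rounded);
  out[0] = rounded[0];
  out[1] = a[1];
}
```

With a modeled FPSCR of 0xa3, sketch_disable_exceptions returns 0xa0 and
leaves 0x03, so the rounding-mode bits survive the masking, and
sketch_restore_exceptions brings the value back to 0xa3.
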
Of minor note, the values of _MM_FROUND_TO_NEG_INF and _MM_FROUND_TO_ZERO
are swapped from the corresponding values on x86 so as to match the
corresponding rounding mode values in the Power ISA.

Move implementations of _mm_ceil* and _mm_floor* into _mm_round*, and
convert _mm_ceil* and _mm_floor* into macros. This matches the current
analogous implementations in config/i386/smmintrin.h.

Function signatures match the analogous functions in
config/i386/smmintrin.h.

Add tests for _mm_round_pd, _mm_round_ps, _mm_round_sd, _mm_round_ss,
modeled after the very similar "floor" and "ceil" tests.

Include basic tests, plus tests at the boundaries for floating-point
representation, positive and negative, test all of the parameterized
rounding modes as well as the C99 rounding modes and interactions
between the two.

Exceptions are not explicitly tested.

2021-08-20  Paul A. Clarke

gcc
        * config/rs6000/smmintrin.h (_mm_round_pd, _mm_round_ps,
        _mm_round_sd, _mm_round_ss, _MM_FROUND_TO_NEAREST_INT,
        _MM_FROUND_TO_ZERO, _MM_FROUND_TO_POS_INF, _MM_FROUND_TO_NEG_INF,
        _MM_FROUND_CUR_DIRECTION, _MM_FROUND_RAISE_EXC, _MM_FROUND_NO_EXC,
        _MM_FROUND_NINT, _MM_FROUND_FLOOR, _MM_FROUND_CEIL, _MM_FROUND_TRUNC,
        _MM_FROUND_RINT, _MM_FROUND_NEARBYINT): New.
        * config/rs6000/smmintrin.h (_mm_ceil_pd, _mm_ceil_ps, _mm_ceil_sd,
        _mm_ceil_ss, _mm_floor_pd, _mm_floor_ps, _mm_floor_sd, _mm_floor_ss):
        Convert from function to macro.

gcc/testsuite
        * gcc.target/powerpc/sse4_1-round3.h: New.
        * gcc.target/powerpc/sse4_1-roundpd.c: New.
        * gcc.target/powerpc/sse4_1-roundps.c: New.
        * gcc.target/powerpc/sse4_1-roundsd.c: New.
        * gcc.target/powerpc/sse4_1-roundss.c: New.
---
v3: No change.
v2:
- Replaced clever (and broken) exception masking with more straightforward
  implementation, per v1 review and closer inspection. mtfsf was only
  writing the final nybble (1) instead of the final two nybbles (2), so
  not all of the exception-enable bits were cleared.
- Renamed some variables from cryptic "tmp" and "save" to
  "fpscr_save" and "enables_save".
- Retained use of __builtin_mffsl, since that is supported pre-POWER8
  (with an alternate instruction sequence).
- Added "extern" to functions to maintain compatible decorations with
  like implementations in gcc/config/i386.
- Added some additional text to the commit message about some of the
  (unpleasant?) implementations and decorations coming from
  like implementations in gcc/config/i386, per v1 review.
- Removed "-Wno-psabi" from tests as unnecessary, per v1 review.
- Fixed indentation and other minor formatting changes, per v1 review.
- Noted testing in patch series cover letter.

 gcc/config/rs6000/smmintrin.h                 | 240 +++++++++++-----
 .../gcc.target/powerpc/sse4_1-round3.h        |  81 ++++++
 .../gcc.target/powerpc/sse4_1-roundpd.c       | 143 ++++++++++
 .../gcc.target/powerpc/sse4_1-roundps.c       |  98 +++++++
 .../gcc.target/powerpc/sse4_1-roundsd.c       | 256 ++++++++++++++++++
 .../gcc.target/powerpc/sse4_1-roundss.c       | 208 ++++++++++++++
 6 files changed, 962 insertions(+), 64 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-round3.h
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-roundpd.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-roundps.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-roundsd.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-roundss.c

diff --git a/gcc/config/rs6000/smmintrin.h b/gcc/config/rs6000/smmintrin.h
index 3767a67eada7..a6b88d313ad0 100644
--- a/gcc/config/rs6000/smmintrin.h
+++ b/gcc/config/rs6000/smmintrin.h
@@ -42,6 +42,182 @@
 #include
 #include
 
+/* Rounding mode macros.  */
+#define _MM_FROUND_TO_NEAREST_INT 0x00
+#define _MM_FROUND_TO_ZERO 0x01
+#define _MM_FROUND_TO_POS_INF 0x02
+#define _MM_FROUND_TO_NEG_INF 0x03
+#define _MM_FROUND_CUR_DIRECTION 0x04
+
+#define _MM_FROUND_NINT \
+  (_MM_FROUND_TO_NEAREST_INT | _MM_FROUND_RAISE_EXC)
+#define _MM_FROUND_FLOOR \
+  (_MM_FROUND_TO_NEG_INF | _MM_FROUND_RAISE_EXC)
+#define _MM_FROUND_CEIL \
+  (_MM_FROUND_TO_POS_INF | _MM_FROUND_RAISE_EXC)
+#define _MM_FROUND_TRUNC \
+  (_MM_FROUND_TO_ZERO | _MM_FROUND_RAISE_EXC)
+#define _MM_FROUND_RINT \
+  (_MM_FROUND_CUR_DIRECTION | _MM_FROUND_RAISE_EXC)
+#define _MM_FROUND_NEARBYINT \
+  (_MM_FROUND_CUR_DIRECTION | _MM_FROUND_NO_EXC)
+
+#define _MM_FROUND_RAISE_EXC 0x00
+#define _MM_FROUND_NO_EXC 0x08
+
+extern __inline __m128d
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_round_pd (__m128d __A, int __rounding)
+{
+  __v2df __r;
+  union {
+    double __fr;
+    long long __fpscr;
+  } __enables_save, __fpscr_save;
+
+  if (__rounding & _MM_FROUND_NO_EXC)
+    {
+      /* Save enabled exceptions, disable all exceptions,
+         and preserve the rounding mode.  */
+#ifdef _ARCH_PWR9
+      __asm__ __volatile__ ("mffsce %0" : "=f" (__fpscr_save.__fr));
+      __enables_save.__fpscr = __fpscr_save.__fpscr & 0xf8;
+#else
+      __fpscr_save.__fr = __builtin_mffs ();
+      __enables_save.__fpscr = __fpscr_save.__fpscr & 0xf8;
+      __fpscr_save.__fpscr &= ~0xf8;
+      __builtin_mtfsf (0b00000011, __fpscr_save.__fr);
+#endif
+    }
+
+  switch (__rounding)
+    {
+      case _MM_FROUND_TO_NEAREST_INT:
+        __fpscr_save.__fr = __builtin_mffsl ();
+        __attribute__ ((fallthrough));
+      case _MM_FROUND_TO_NEAREST_INT | _MM_FROUND_NO_EXC:
+        __builtin_set_fpscr_rn (0b00);
+        __r = vec_rint ((__v2df) __A);
+        __builtin_set_fpscr_rn (__fpscr_save.__fpscr);
+        break;
+      case _MM_FROUND_TO_NEG_INF:
+      case _MM_FROUND_TO_NEG_INF | _MM_FROUND_NO_EXC:
+        __r = vec_floor ((__v2df) __A);
+        break;
+      case _MM_FROUND_TO_POS_INF:
+      case _MM_FROUND_TO_POS_INF | _MM_FROUND_NO_EXC:
+        __r = vec_ceil ((__v2df) __A);
+        break;
+      case _MM_FROUND_TO_ZERO:
+      case _MM_FROUND_TO_ZERO | _MM_FROUND_NO_EXC:
+        __r = vec_trunc ((__v2df) __A);
+        break;
+      case _MM_FROUND_CUR_DIRECTION:
+        __r = vec_rint ((__v2df) __A);
+        break;
+    }
+  if (__rounding & _MM_FROUND_NO_EXC)
+    {
+      /* Restore enabled exceptions.  */
+      __fpscr_save.__fr = __builtin_mffsl ();
+      __fpscr_save.__fpscr |= __enables_save.__fpscr;
+      __builtin_mtfsf (0b00000011, __fpscr_save.__fr);
+    }
+  return (__m128d) __r;
+}
+
+extern __inline __m128d
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_round_sd (__m128d __A, __m128d __B, int __rounding)
+{
+  __B = _mm_round_pd (__B, __rounding);
+  __v2df __r = { ((__v2df)__B)[0], ((__v2df) __A)[1] };
+  return (__m128d) __r;
+}
+
+extern __inline __m128
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_round_ps (__m128 __A, int __rounding)
+{
+  __v4sf __r;
+  union {
+    double __fr;
+    long long __fpscr;
+  } __enables_save, __fpscr_save;
+
+  if (__rounding & _MM_FROUND_NO_EXC)
+    {
+      /* Save enabled exceptions, disable all exceptions,
+         and preserve the rounding mode.  */
+#ifdef _ARCH_PWR9
+      __asm__ __volatile__ ("mffsce %0" : "=f" (__fpscr_save.__fr));
+      __enables_save.__fpscr = __fpscr_save.__fpscr & 0xf8;
+#else
+      __fpscr_save.__fr = __builtin_mffs ();
+      __enables_save.__fpscr = __fpscr_save.__fpscr & 0xf8;
+      __fpscr_save.__fpscr &= ~0xf8;
+      __builtin_mtfsf (0b00000011, __fpscr_save.__fr);
+#endif
+    }
+
+  switch (__rounding)
+    {
+      case _MM_FROUND_TO_NEAREST_INT:
+        __fpscr_save.__fr = __builtin_mffsl ();
+        __attribute__ ((fallthrough));
+      case _MM_FROUND_TO_NEAREST_INT | _MM_FROUND_NO_EXC:
+        __builtin_set_fpscr_rn (0b00);
+        __r = vec_rint ((__v4sf) __A);
+        __builtin_set_fpscr_rn (__fpscr_save.__fpscr);
+        break;
+      case _MM_FROUND_TO_NEG_INF:
+      case _MM_FROUND_TO_NEG_INF | _MM_FROUND_NO_EXC:
+        __r = vec_floor ((__v4sf) __A);
+        break;
+      case _MM_FROUND_TO_POS_INF:
+      case _MM_FROUND_TO_POS_INF | _MM_FROUND_NO_EXC:
+        __r = vec_ceil ((__v4sf) __A);
+        break;
+      case _MM_FROUND_TO_ZERO:
+      case _MM_FROUND_TO_ZERO | _MM_FROUND_NO_EXC:
+        __r = vec_trunc ((__v4sf) __A);
+        break;
+      case _MM_FROUND_CUR_DIRECTION:
+        __r = vec_rint ((__v4sf) __A);
+        break;
+    }
+  if (__rounding & _MM_FROUND_NO_EXC)
+    {
+      /* Restore enabled exceptions.  */
+      __fpscr_save.__fr = __builtin_mffsl ();
+      __fpscr_save.__fpscr |= __enables_save.__fpscr;
+      __builtin_mtfsf (0b00000011, __fpscr_save.__fr);
+    }
+  return (__m128) __r;
+}
+
+extern __inline __m128
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_round_ss (__m128 __A, __m128 __B, int __rounding)
+{
+  __B = _mm_round_ps (__B, __rounding);
+  __v4sf __r = (__v4sf) __A;
+  __r[0] = ((__v4sf)__B)[0];
+  return (__m128) __r;
+}
+
+#define _mm_ceil_pd(V) _mm_round_pd ((V), _MM_FROUND_CEIL)
+#define _mm_ceil_sd(D, V) _mm_round_sd ((D), (V), _MM_FROUND_CEIL)
+
+#define _mm_floor_pd(V) _mm_round_pd((V), _MM_FROUND_FLOOR)
+#define _mm_floor_sd(D, V) _mm_round_sd ((D), (V), _MM_FROUND_FLOOR)
+
+#define _mm_ceil_ps(V) _mm_round_ps ((V), _MM_FROUND_CEIL)
+#define _mm_ceil_ss(D, V) _mm_round_ss ((D), (V), _MM_FROUND_CEIL)
+
+#define _mm_floor_ps(V) _mm_round_ps ((V), _MM_FROUND_FLOOR)
+#define _mm_floor_ss(D, V) _mm_round_ss ((D), (V), _MM_FROUND_FLOOR)
+
 extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, __artificial__))
 _mm_insert_epi8 (__m128i const __A, int const __D, int const __N)
 {
@@ -232,70 +408,6 @@ _mm_test_mix_ones_zeros (__m128i __A, __m128i __mask)
   return any_ones * any_zeros;
 }
 
-__inline __m128d
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm_ceil_pd (__m128d __A)
-{
-  return (__m128d) vec_ceil ((__v2df) __A);
-}
-
-__inline __m128d
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm_ceil_sd (__m128d __A, __m128d __B)
-{
-  __v2df __r = vec_ceil ((__v2df) __B);
-  __r[1] = ((__v2df) __A)[1];
-  return (__m128d) __r;
-}
-
-__inline __m128d
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm_floor_pd (__m128d __A)
-{
-  return (__m128d) vec_floor ((__v2df) __A);
-}
-
-__inline __m128d
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm_floor_sd (__m128d __A, __m128d __B)
-{
-  __v2df __r = vec_floor ((__v2df) __B);
-  __r[1] = ((__v2df) __A)[1];
-  return (__m128d) __r;
-}
-
-__inline __m128
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm_ceil_ps (__m128 __A)
-{
-  return (__m128) vec_ceil ((__v4sf) __A);
-}
-
-__inline __m128
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm_ceil_ss (__m128 __A, __m128 __B)
-{
-  __v4sf __r = (__v4sf) __A;
-  __r[0] = __builtin_ceil (((__v4sf) __B)[0]);
-  return __r;
-}
-
-__inline __m128
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm_floor_ps (__m128 __A)
-{
-  return (__m128) vec_floor ((__v4sf) __A);
-}
-
-__inline __m128
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm_floor_ss (__m128 __A, __m128 __B)
-{
-  __v4sf __r = (__v4sf) __A;
-  __r[0] = __builtin_floor (((__v4sf) __B)[0]);
-  return __r;
-}
-
 /* Return horizontal packed word minimum and its index in bits [15:0]
    and bits [18:16] respectively.  */
 __inline __m128i
diff --git a/gcc/testsuite/gcc.target/powerpc/sse4_1-round3.h b/gcc/testsuite/gcc.target/powerpc/sse4_1-round3.h
new file mode 100644
index 000000000000..de6cbf7be438
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse4_1-round3.h
@@ -0,0 +1,81 @@
+#include
+#include
+#include "sse4_1-check.h"
+
+#define DIM(a) (sizeof (a) / sizeof (a)[0])
+
+static int roundings[] =
+  {
+    _MM_FROUND_TO_NEAREST_INT,
+    _MM_FROUND_TO_NEG_INF,
+    _MM_FROUND_TO_POS_INF,
+    _MM_FROUND_TO_ZERO,
+    _MM_FROUND_CUR_DIRECTION
+  };
+
+static int modes[] =
+  {
+    FE_TONEAREST,
+    FE_UPWARD,
+    FE_DOWNWARD,
+    FE_TOWARDZERO
+  };
+
+static void
+TEST (void)
+{
+  int i, j, ri, mi, round_save;
+
+  round_save = fegetround ();
+  for (mi = 0; mi < DIM (modes); mi++) {
+    fesetround (modes[mi]);
+    for (i = 0; i < DIM (data); i++) {
+      for (ri = 0; ri < DIM (roundings); ri++) {
+        union value guess;
+        union value *current_answers = answers[ri];
+        switch ( roundings[ri] ) {
+          case _MM_FROUND_TO_NEAREST_INT:
+            guess.x = ROUND_INTRIN (data[i].value1.x, data[i].value2.x,
+                                    _MM_FROUND_TO_NEAREST_INT);
+            break;
+          case _MM_FROUND_TO_NEG_INF:
+            guess.x = ROUND_INTRIN (data[i].value1.x, data[i].value2.x,
+                                    _MM_FROUND_TO_NEG_INF);
+            break;
+          case _MM_FROUND_TO_POS_INF:
+            guess.x = ROUND_INTRIN (data[i].value1.x, data[i].value2.x,
+                                    _MM_FROUND_TO_POS_INF);
+            break;
+          case _MM_FROUND_TO_ZERO:
+            guess.x = ROUND_INTRIN (data[i].value1.x, data[i].value2.x,
+                                    _MM_FROUND_TO_ZERO);
+            break;
+          case _MM_FROUND_CUR_DIRECTION:
+            guess.x = ROUND_INTRIN (data[i].value1.x, data[i].value2.x,
+                                    _MM_FROUND_CUR_DIRECTION);
+            switch ( modes[mi] ) {
+              case FE_TONEAREST:
+                current_answers = answers_NEAREST_INT;
+                break;
+              case FE_UPWARD:
+                current_answers = answers_POS_INF;
+                break;
+              case FE_DOWNWARD:
+                current_answers = answers_NEG_INF;
+                break;
+              case FE_TOWARDZERO:
+                current_answers = answers_ZERO;
+                break;
+            }
+            break;
+          default:
+            abort ();
+        }
+        for (j = 0; j < DIM (guess.f); j++)
+          if (guess.f[j] != current_answers[i].f[j])
+            abort ();
+      }
+    }
+  }
+  fesetround (round_save);
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse4_1-roundpd.c b/gcc/testsuite/gcc.target/powerpc/sse4_1-roundpd.c
new file mode 100644
index 000000000000..0528c395f233
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse4_1-roundpd.c
@@ -0,0 +1,143 @@
+/* { dg-do run } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-O2 -mvsx" } */
+
+#define NO_WARN_X86_INTRINSICS 1
+#include
+
+#define VEC_T __m128d
+#define FP_T double
+
+#define ROUND_INTRIN(x, ignored, mode) _mm_round_pd (x, mode)
+
+#include "sse4_1-round-data.h"
+
+struct data2 data[] = {
+  { .value1 = { .f = { 0.00, 0.25 } } },
+  { .value1 = { .f = { 0.50, 0.75 } } },
+
+  { .value1 = { .f = { 0x1.ffffffffffffcp+50, 0x1.ffffffffffffdp+50 } } },
+  { .value1 = { .f = { 0x1.ffffffffffffep+50, 0x1.fffffffffffffp+50 } } },
+  { .value1 = { .f = { 0x1.0000000000000p+51, 0x1.0000000000001p+51 } } },
+  { .value1 = { .f = { 0x1.0000000000002p+51, 0x1.0000000000003p+51 } } },
+
+  { .value1 = { .f = { 0x1.ffffffffffffep+51, 0x1.fffffffffffffp+51 } } },
+  { .value1 = { .f = { 0x1.0000000000000p+52, 0x1.0000000000001p+52 } } },
+
+  { .value1 = { .f = { -0x1.0000000000001p+52, -0x1.0000000000000p+52 } } },
+  { .value1 = { .f = { -0x1.fffffffffffffp+51, -0x1.ffffffffffffep+51 } } },
+
+  { .value1 = { .f = { -0x1.0000000000004p+51, -0x1.0000000000002p+51 } } },
+  { .value1 = { .f = { -0x1.0000000000001p+51, -0x1.0000000000000p+51 } } },
+  { .value1 = { .f = { -0x1.ffffffffffffcp+50, -0x1.ffffffffffffep+50 } } },
+  { .value1 = { .f = { -0x1.ffffffffffffdp+50, -0x1.ffffffffffffcp+50 } } },
+
+  { .value1 = { .f = { -1.00, -0.75 } } },
+  { .value1 = { .f = { -0.50, -0.25 } } }
+};
+
+union value answers_NEAREST_INT[] = {
+  { .f = { 0.00, 0.00 } },
+  { .f = { 0.00, 1.00 } },
+
+  { .f = { 0x1.ffffffffffffcp+50, 0x1.ffffffffffffcp+50 } },
+  { .f = { 0x1.0000000000000p+51, 0x1.0000000000000p+51 } },
+  { .f = { 0x1.0000000000000p+51, 0x1.0000000000000p+51 } },
+  { .f = { 0x1.0000000000002p+51, 0x1.0000000000004p+51 } },
+
+  { .f = { 0x1.ffffffffffffep+51, 0x1.0000000000000p+52 } },
+  { .f = { 0x1.0000000000000p+52, 0x1.0000000000001p+52 } },
+
+  { .f = { -0x1.0000000000001p+52, -0x1.0000000000000p+52 } },
+  { .f = { -0x1.0000000000000p+52, -0x1.ffffffffffffep+51 } },
+
+  { .f = { -0x1.0000000000004p+51, -0x1.0000000000002p+51 } },
+  { .f = { -0x1.0000000000000p+51, -0x1.0000000000000p+51 } },
+  { .f = { -0x1.ffffffffffffcp+50, -0x1.0000000000000p+51 } },
+  { .f = { -0x1.ffffffffffffcp+50, -0x1.ffffffffffffcp+50 } },
+
+  { .f = { -1.00, -1.00 } },
+  { .f = { 0.00, 0.00 } }
+};
+
+union value answers_NEG_INF[] = {
+  { .f = { 0.00, 0.00 } },
+  { .f = { 0.00, 0.00 } },
+
+  { .f = { 0x1.ffffffffffffcp+50, 0x1.ffffffffffffcp+50 } },
+  { .f = { 0x1.ffffffffffffcp+50, 0x1.ffffffffffffcp+50 } },
+  { .f = { 0x1.0000000000000p+51, 0x1.0000000000000p+51 } },
+  { .f = { 0x1.0000000000002p+51, 0x1.0000000000002p+51 } },
+
+  { .f = { 0x1.ffffffffffffep+51, 0x1.ffffffffffffep+51 } },
+  { .f = { 0x1.0000000000000p+52, 0x1.0000000000001p+52 } },
+
+  { .f = { -0x1.0000000000001p+52, -0x1.0000000000000p+52 } },
+  { .f = { -0x1.0000000000000p+52, -0x1.ffffffffffffep+51 } },
+
+  { .f = { -0x1.0000000000004p+51, -0x1.0000000000002p+51 } },
+  { .f = { -0x1.0000000000002p+51, -0x1.0000000000000p+51 } },
+  { .f = { -0x1.ffffffffffffcp+50, -0x1.0000000000000p+51 } },
+  { .f = { -0x1.0000000000000p+51, -0x1.ffffffffffffcp+50 } },
+
+  { .f = { -1.00, -1.00 } },
+  { .f = { -1.00, -1.00 } }
+};
+
+union value answers_POS_INF[] = {
+  { .f = { 0.00, 1.00 } },
+  { .f = { 1.00, 1.00 } },
+
+  { .f = { 0x1.ffffffffffffcp+50, 0x1.0000000000000p+51 } },
+  { .f = { 0x1.0000000000000p+51, 0x1.0000000000000p+51 } },
+  { .f = { 0x1.0000000000000p+51, 0x1.0000000000002p+51 } },
+  { .f = { 0x1.0000000000002p+51, 0x1.0000000000004p+51 } },
+
+  { .f = { 0x1.ffffffffffffep+51, 0x1.0000000000000p+52 } },
+  { .f = { 0x1.0000000000000p+52, 0x1.0000000000001p+52 } },
+
+  { .f = { -0x1.0000000000001p+52, -0x1.0000000000000p+52 } },
+  { .f = { -0x1.ffffffffffffep+51, -0x1.ffffffffffffep+51 } },
+
+  { .f = { -0x1.0000000000004p+51, -0x1.0000000000002p+51 } },
+  { .f = { -0x1.0000000000000p+51, -0x1.0000000000000p+51 } },
+  { .f = { -0x1.ffffffffffffcp+50, -0x1.ffffffffffffcp+50 } },
+  { .f = { -0x1.ffffffffffffcp+50, -0x1.ffffffffffffcp+50 } },
+
+  { .f = { -1.00, 0.00 } },
+  { .f = { 0.00, 0.00 } }
+};
+
+union value answers_ZERO[] = {
+  { .f = { 0.00, 0.00 } },
+  { .f = { 0.00, 0.00 } },
+
+  { .f = { 0x1.ffffffffffffcp+50, 0x1.ffffffffffffcp+50 } },
+  { .f = { 0x1.ffffffffffffcp+50, 0x1.ffffffffffffcp+50 } },
+  { .f = { 0x1.0000000000000p+51, 0x1.0000000000000p+51 } },
+  { .f = { 0x1.0000000000002p+51, 0x1.0000000000002p+51 } },
+
+  { .f = { 0x1.ffffffffffffep+51, 0x1.ffffffffffffep+51 } },
+  { .f = { 0x1.0000000000000p+52, 0x1.0000000000001p+52 } },
+
+  { .f = { -0x1.0000000000001p+52, -0x1.0000000000000p+52 } },
+  { .f = { -0x1.ffffffffffffep+51, -0x1.ffffffffffffep+51 } },
+
+  { .f = { -0x1.0000000000004p+51, -0x1.0000000000002p+51 } },
+  { .f = { -0x1.0000000000000p+51, -0x1.0000000000000p+51 } },
+  { .f = { -0x1.ffffffffffffcp+50, -0x1.ffffffffffffcp+50 } },
+  { .f = { -0x1.ffffffffffffcp+50, -0x1.ffffffffffffcp+50 } },
+
+  { .f = { -1.00, 0.00 } },
+  { .f = { 0.00, 0.00 } }
+};
+
+union value *answers[] = {
+  answers_NEAREST_INT,
+  answers_NEG_INF,
+  answers_POS_INF,
+  answers_ZERO,
+  0 /* CUR_DIRECTION answers depend on current rounding mode.  */
+};
+
+#include "sse4_1-round3.h"
diff --git a/gcc/testsuite/gcc.target/powerpc/sse4_1-roundps.c b/gcc/testsuite/gcc.target/powerpc/sse4_1-roundps.c
new file mode 100644
index 000000000000..6b5362e07590
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse4_1-roundps.c
@@ -0,0 +1,98 @@
+/* { dg-do run } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-O2 -mvsx" } */
+
+#define NO_WARN_X86_INTRINSICS 1
+#include
+
+#define VEC_T __m128
+#define FP_T float
+
+#define ROUND_INTRIN(x, ignored, mode) _mm_round_ps (x, mode)
+
+#include "sse4_1-round-data.h"
+
+struct data2 data[] = {
+  { .value1 = { .f = { 0.00, 0.25, 0.50, 0.75 } } },
+
+  { .value1 = { .f = { 0x1.fffff8p+21, 0x1.fffffap+21,
+                       0x1.fffffcp+21, 0x1.fffffep+21 } } },
+  { .value1 = { .f = { 0x1.fffffap+22, 0x1.fffffcp+22,
+                       0x1.fffffep+22, 0x1.fffffep+23 } } },
+  { .value1 = { .f = { -0x1.fffffep+23, -0x1.fffffep+22,
+                       -0x1.fffffcp+22, -0x1.fffffap+22 } } },
+  { .value1 = { .f = { -0x1.fffffep+21, -0x1.fffffcp+21,
+                       -0x1.fffffap+21, -0x1.fffff8p+21 } } },
+
+  { .value1 = { .f = { -1.00, -0.75, -0.50, -0.25 } } }
+};
+
+union value answers_NEAREST_INT[] = {
+  { .f = { 0.00, 0.00, 0.00, 1.00 } },
+
+  { .f = { 0x1.fffff8p+21, 0x1.fffff8p+21,
+           0x1.000000p+22, 0x1.000000p+22 } },
+  { .f = { 0x1.fffff8p+22, 0x1.fffffcp+22,
+           0x1.000000p+23, 0x1.fffffep+23 } },
+  { .f = { -0x1.fffffep+23, -0x1.000000p+23,
+           -0x1.fffffcp+22, -0x1.fffff8p+22 } },
+  { .f = { -0x1.000000p+22, -0x1.000000p+22,
+           -0x1.fffff8p+21, -0x1.fffff8p+21 } },
+
+  { .f = { -1.00, -1.00, 0.00, 0.00 } }
+};
+
+union value answers_NEG_INF[] = {
+  { .f = { 0.00, 0.00, 0.00, 0.00 } },
+
+  { .f = { 0x1.fffff8p+21, 0x1.fffff8p+21,
+           0x1.fffff8p+21, 0x1.fffff8p+21 } },
+  { .f = { 0x1.fffff8p+22, 0x1.fffffcp+22,
+           0x1.fffffcp+22, 0x1.fffffep+23 } },
+  { .f = { -0x1.fffffep+23, -0x1.000000p+23,
+           -0x1.fffffcp+22, -0x1.fffffcp+22 } },
+  { .f = { -0x1.000000p+22, -0x1.000000p+22,
+           -0x1.000000p+22, -0x1.fffff8p+21 } },
+
+  { .f = { -1.00, -1.00, -1.00, -1.00 } }
+};
+
+union value answers_POS_INF[] = {
+  { .f = { 0.00, 1.00, 1.00, 1.00 } },
+
+  { .f = { 0x1.fffff8p+21, 0x1.000000p+22,
+           0x1.000000p+22, 0x1.000000p+22 } },
+  { .f = { 0x1.fffffcp+22, 0x1.fffffcp+22,
+           0x1.000000p+23, 0x1.fffffep+23 } },
+  { .f = { -0x1.fffffep+23, -0x1.fffffcp+22,
+           -0x1.fffffcp+22, -0x1.fffff8p+22 } },
+  { .f = { -0x1.fffff8p+21, -0x1.fffff8p+21,
+           -0x1.fffff8p+21, -0x1.fffff8p+21 } },
+
+  { .f = { -1.00, 0.00, 0.00, 0.00 } }
+};
+
+union value answers_ZERO[] = {
+  { .f = { 0.00, 0.00, 0.00, 0.00 } },
+
+  { .f = { 0x1.fffff8p+21, 0x1.fffff8p+21,
+           0x1.fffff8p+21, 0x1.fffff8p+21 } },
+  { .f = { 0x1.fffff8p+22, 0x1.fffffcp+22,
+           0x1.fffffcp+22, 0x1.fffffep+23 } },
+  { .f = { -0x1.fffffep+23, -0x1.fffffcp+22,
+           -0x1.fffffcp+22, -0x1.fffff8p+22 } },
+  { .f = { -0x1.fffff8p+21, -0x1.fffff8p+21,
+           -0x1.fffff8p+21, -0x1.fffff8p+21 } },
+
+  { .f = { -1.00, 0.00, 0.00, 0.00 } }
+};
+
+union value *answers[] = {
+  answers_NEAREST_INT,
+  answers_NEG_INF,
+  answers_POS_INF,
+  answers_ZERO,
+  0 /* CUR_DIRECTION answers depend on current rounding mode.  */
+};
+
+#include "sse4_1-round3.h"
diff --git a/gcc/testsuite/gcc.target/powerpc/sse4_1-roundsd.c b/gcc/testsuite/gcc.target/powerpc/sse4_1-roundsd.c
new file mode 100644
index 000000000000..2b0bad6469df
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse4_1-roundsd.c
@@ -0,0 +1,256 @@
+/* { dg-do run } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-O2 -mvsx" } */
+
+#include
+#define NO_WARN_X86_INTRINSICS 1
+#include
+
+#define VEC_T __m128d
+#define FP_T double
+
+#define ROUND_INTRIN(x, y, mode) _mm_round_sd (x, y, mode)
+
+#include "sse4_1-round-data.h"
+
+static struct data2 data[] = {
+  { .value1 = { .f = { IGNORED, PASSTHROUGH } },
+    .value2 = { .f = { 0.00, IGNORED } } },
+  { .value1 = { .f = { IGNORED, PASSTHROUGH } },
+    .value2 = { .f = { 0.25, IGNORED } } },
+  { .value1 = { .f = { IGNORED, PASSTHROUGH } },
+    .value2 = { .f = { 0.50, IGNORED } } },
+  { .value1 = { .f = { IGNORED, PASSTHROUGH } },
+    .value2 = { .f = { 0.75, IGNORED } } },
+
+  { .value1 = { .f = { IGNORED, PASSTHROUGH } },
+    .value2 = { .f = { 0x1.ffffffffffffcp+50, IGNORED } } },
+  { .value1 = { .f = { IGNORED, PASSTHROUGH } },
+    .value2 = { .f = { 0x1.ffffffffffffdp+50, IGNORED } } },
+  { .value1 = { .f = { IGNORED, PASSTHROUGH } },
+    .value2 = { .f = { 0x1.ffffffffffffep+50, IGNORED } } },
+  { .value1 = { .f = { IGNORED, PASSTHROUGH } },
+    .value2 = { .f = { 0x1.fffffffffffffp+50, IGNORED } } },
+  { .value1 = { .f = { IGNORED, PASSTHROUGH } },
+    .value2 = { .f = { 0x1.0000000000000p+51, IGNORED } } },
+  { .value1 = { .f = { IGNORED, PASSTHROUGH } },
+    .value2 = { .f = { 0x1.0000000000001p+51, IGNORED } } },
+  { .value1 = { .f = { IGNORED, PASSTHROUGH } },
+    .value2 = { .f = { 0x1.0000000000002p+51, IGNORED } } },
+  { .value1 = { .f = { IGNORED, PASSTHROUGH } },
+    .value2 = { .f = { 0x1.0000000000003p+51, IGNORED } } },
+
+  { .value1 = { .f = { IGNORED, PASSTHROUGH } },
+    .value2 = { .f = { 0x1.ffffffffffffep+51, IGNORED } } },
+  { .value1 = { .f = { IGNORED, PASSTHROUGH } },
+    .value2 = { .f = { 0x1.fffffffffffffp+51, IGNORED } } },
+  { .value1 = { .f = { IGNORED, PASSTHROUGH } },
+    .value2 = { .f = { 0x1.0000000000000p+52, IGNORED } } },
+  { .value1 = { .f = { IGNORED, PASSTHROUGH } },
+    .value2 = { .f = { 0x1.0000000000001p+52, IGNORED } } },
+
+  { .value1 = { .f = { IGNORED, PASSTHROUGH } },
+    .value2 = { .f = { -0x1.0000000000001p+52, IGNORED } } },
+  { .value1 = { .f = { IGNORED, PASSTHROUGH } },
+    .value2 = { .f = { -0x1.0000000000000p+52, IGNORED } } },
+  { .value1 = { .f = { IGNORED, PASSTHROUGH } },
+    .value2 = { .f = { -0x1.fffffffffffffp+51, IGNORED } } },
+  { .value1 = { .f = { IGNORED, PASSTHROUGH } },
+    .value2 = { .f = { -0x1.ffffffffffffep+51, IGNORED } } },
+
+  { .value1 = { .f = { IGNORED, PASSTHROUGH } },
+    .value2 = { .f = { -0x1.0000000000004p+51, IGNORED } } },
+  { .value1 = { .f = { IGNORED, PASSTHROUGH } },
+    .value2 = { .f = { -0x1.0000000000002p+51, IGNORED } } },
+  { .value1 = { .f = { IGNORED, PASSTHROUGH } },
+    .value2 = { .f = { -0x1.0000000000001p+51, IGNORED } } },
+  { .value1 = { .f = { IGNORED, PASSTHROUGH } },
+    .value2 = { .f = { -0x1.0000000000000p+51, IGNORED } } },
+  { .value1 = { .f = { IGNORED, PASSTHROUGH } },
+    .value2 = { .f = { -0x1.ffffffffffffcp+50, IGNORED } } },
+  { .value1 = { .f = { IGNORED, PASSTHROUGH } },
+    .value2 = { .f = { -0x1.ffffffffffffep+50, IGNORED } } },
+  { .value1 = { .f = { IGNORED, PASSTHROUGH } },
+    .value2 = { .f = { -0x1.ffffffffffffdp+50, IGNORED } } },
+  { .value1 = { .f = { IGNORED, PASSTHROUGH } },
+    .value2 = { .f = { -0x1.ffffffffffffcp+50, IGNORED } } },
+
+  { .value1 = { .f = { IGNORED, PASSTHROUGH } },
+    .value2 = { .f = { -1.00, IGNORED } } },
+  { .value1 = { .f = { IGNORED, PASSTHROUGH } },
+    .value2 = { .f = { -0.75, IGNORED } } },
+  { .value1 = { .f = { IGNORED, PASSTHROUGH } },
+    .value2 = { .f = { -0.50, IGNORED } } },
+  { .value1 = { .f = { IGNORED, PASSTHROUGH } },
+    .value2 = { .f = { -0.25, IGNORED } } }
+};
+
+static union value answers_NEAREST_INT[] = {
+  { .f = { 0.00, PASSTHROUGH } },
+  { .f = { 0.00, PASSTHROUGH } },
+  { .f = { 0.00, PASSTHROUGH } },
+  { .f = { 1.00, PASSTHROUGH } },
+
+  { .f = { 0x1.ffffffffffffcp+50, PASSTHROUGH } },
+  { .f = { 0x1.ffffffffffffcp+50, PASSTHROUGH } },
+  { .f = { 0x1.0000000000000p+51, PASSTHROUGH } },
+  { .f = { 0x1.0000000000000p+51, PASSTHROUGH } },
+  { .f = { 0x1.0000000000000p+51, PASSTHROUGH } },
+  { .f = { 0x1.0000000000000p+51, PASSTHROUGH } },
+  { .f = { 0x1.0000000000002p+51, PASSTHROUGH } },
+  { .f = { 0x1.0000000000004p+51, PASSTHROUGH } },
+
+  { .f = { 0x1.ffffffffffffep+51, PASSTHROUGH } },
+  { .f = { 0x1.0000000000000p+52, PASSTHROUGH } },
+  { .f = { 0x1.0000000000000p+52, PASSTHROUGH } },
+  { .f = { 0x1.0000000000001p+52, PASSTHROUGH } },
+
+  { .f = { -0x1.0000000000001p+52, PASSTHROUGH } },
+  { .f = { -0x1.0000000000000p+52, PASSTHROUGH } },
+  { .f = { -0x1.0000000000000p+52, PASSTHROUGH } },
+  { .f = { -0x1.ffffffffffffep+51, PASSTHROUGH } },
+
+  { .f = { -0x1.0000000000004p+51, PASSTHROUGH } },
+  { .f = { -0x1.0000000000002p+51, PASSTHROUGH } },
+  { .f = { -0x1.0000000000000p+51, PASSTHROUGH } },
+  { .f = { -0x1.0000000000000p+51, PASSTHROUGH } },
+  { .f = { -0x1.ffffffffffffcp+50, PASSTHROUGH } },
+  { .f = { -0x1.0000000000000p+51, PASSTHROUGH } },
+  { .f = { -0x1.ffffffffffffcp+50, PASSTHROUGH } },
+  { .f = { -0x1.ffffffffffffcp+50, PASSTHROUGH } },
+
+  { .f = { -1.00, PASSTHROUGH } },
+  { .f = { -1.00, PASSTHROUGH } },
+  { .f = { -0.00, PASSTHROUGH } },
+  { .f = { -0.00, PASSTHROUGH } }
+};
+
+static union value answers_NEG_INF[] = {
+  { .f = { 0.00, PASSTHROUGH } },
+  { .f = { 0.00, PASSTHROUGH } },
+  { .f = { 0.00, PASSTHROUGH } },
+  { .f = { 0.00, PASSTHROUGH } },
+
+  { .f = { 0x1.ffffffffffffcp+50, PASSTHROUGH } },
+  { .f = { 0x1.ffffffffffffcp+50, PASSTHROUGH } },
+  { .f = { 0x1.ffffffffffffcp+50, PASSTHROUGH } },
+  { .f = { 0x1.ffffffffffffcp+50, PASSTHROUGH } },
+  { .f = { 0x1.0000000000000p+51, PASSTHROUGH } },
+ { .f = { 0x1.0000000000000p+51, PASSTHROUGH } }, + { .f = { 0x1.0000000000002p+51, PASSTHROUGH } }, + { .f = { 0x1.0000000000002p+51, PASSTHROUGH } }, + + { .f = { 0x1.ffffffffffffep+51, PASSTHROUGH } }, + { .f = { 0x1.ffffffffffffep+51, PASSTHROUGH } }, + { .f = { 0x1.0000000000000p+52, PASSTHROUGH } }, + { .f = { 0x1.0000000000001p+52, PASSTHROUGH } }, + + { .f = { -0x1.0000000000001p+52, PASSTHROUGH } }, + { .f = { -0x1.0000000000000p+52, PASSTHROUGH } }, + { .f = { -0x1.0000000000000p+52, PASSTHROUGH } }, + { .f = { -0x1.ffffffffffffep+51, PASSTHROUGH } }, + + { .f = { -0x1.0000000000004p+51, PASSTHROUGH } }, + { .f = { -0x1.0000000000002p+51, PASSTHROUGH } }, + { .f = { -0x1.0000000000002p+51, PASSTHROUGH } }, + { .f = { -0x1.0000000000000p+51, PASSTHROUGH } }, + { .f = { -0x1.ffffffffffffcp+50, PASSTHROUGH } }, + { .f = { -0x1.0000000000000p+51, PASSTHROUGH } }, + { .f = { -0x1.0000000000000p+51, PASSTHROUGH } }, + { .f = { -0x1.ffffffffffffcp+50, PASSTHROUGH } }, + + { .f = { -1.00, PASSTHROUGH } }, + { .f = { -1.00, PASSTHROUGH } }, + { .f = { -1.00, PASSTHROUGH } }, + { .f = { -1.00, PASSTHROUGH } } +}; + +static union value answers_POS_INF[] = { + { .f = { 0.00, PASSTHROUGH } }, + { .f = { 1.00, PASSTHROUGH } }, + { .f = { 1.00, PASSTHROUGH } }, + { .f = { 1.00, PASSTHROUGH } }, + + { .f = { 0x1.ffffffffffffcp+50, PASSTHROUGH } }, + { .f = { 0x1.0000000000000p+51, PASSTHROUGH } }, + { .f = { 0x1.0000000000000p+51, PASSTHROUGH } }, + { .f = { 0x1.0000000000000p+51, PASSTHROUGH } }, + { .f = { 0x1.0000000000000p+51, PASSTHROUGH } }, + { .f = { 0x1.0000000000002p+51, PASSTHROUGH } }, + { .f = { 0x1.0000000000002p+51, PASSTHROUGH } }, + { .f = { 0x1.0000000000004p+51, PASSTHROUGH } }, + + { .f = { 0x1.ffffffffffffep+51, PASSTHROUGH } }, + { .f = { 0x1.0000000000000p+52, PASSTHROUGH } }, + { .f = { 0x1.0000000000000p+52, PASSTHROUGH } }, + { .f = { 0x1.0000000000001p+52, PASSTHROUGH } }, + + { .f = { -0x1.0000000000001p+52, PASSTHROUGH } }, + { .f = { 
-0x1.0000000000000p+52, PASSTHROUGH } }, + { .f = { -0x1.ffffffffffffep+51, PASSTHROUGH } }, + { .f = { -0x1.ffffffffffffep+51, PASSTHROUGH } }, + + { .f = { -0x1.0000000000004p+51, PASSTHROUGH } }, + { .f = { -0x1.0000000000002p+51, PASSTHROUGH } }, + { .f = { -0x1.0000000000000p+51, PASSTHROUGH } }, + { .f = { -0x1.0000000000000p+51, PASSTHROUGH } }, + { .f = { -0x1.ffffffffffffcp+50, PASSTHROUGH } }, + { .f = { -0x1.ffffffffffffcp+50, PASSTHROUGH } }, + { .f = { -0x1.ffffffffffffcp+50, PASSTHROUGH } }, + { .f = { -0x1.ffffffffffffcp+50, PASSTHROUGH } }, + + { .f = { -1.00, PASSTHROUGH } }, + { .f = { 0.00, PASSTHROUGH } }, + { .f = { 0.00, PASSTHROUGH } }, + { .f = { 0.00, PASSTHROUGH } } +}; + +static union value answers_ZERO[] = { + { .f = { 0.00, PASSTHROUGH } }, + { .f = { 0.00, PASSTHROUGH } }, + { .f = { 0.00, PASSTHROUGH } }, + { .f = { 0.00, PASSTHROUGH } }, + + { .f = { 0x1.ffffffffffffcp+50, PASSTHROUGH } }, + { .f = { 0x1.ffffffffffffcp+50, PASSTHROUGH } }, + { .f = { 0x1.ffffffffffffcp+50, PASSTHROUGH } }, + { .f = { 0x1.ffffffffffffcp+50, PASSTHROUGH } }, + { .f = { 0x1.0000000000000p+51, PASSTHROUGH } }, + { .f = { 0x1.0000000000000p+51, PASSTHROUGH } }, + { .f = { 0x1.0000000000002p+51, PASSTHROUGH } }, + { .f = { 0x1.0000000000002p+51, PASSTHROUGH } }, + + { .f = { 0x1.ffffffffffffep+51, PASSTHROUGH } }, + { .f = { 0x1.ffffffffffffep+51, PASSTHROUGH } }, + { .f = { 0x1.0000000000000p+52, PASSTHROUGH } }, + { .f = { 0x1.0000000000001p+52, PASSTHROUGH } }, + + { .f = { -0x1.0000000000001p+52, PASSTHROUGH } }, + { .f = { -0x1.0000000000000p+52, PASSTHROUGH } }, + { .f = { -0x1.ffffffffffffep+51, PASSTHROUGH } }, + { .f = { -0x1.ffffffffffffep+51, PASSTHROUGH } }, + + { .f = { -0x1.0000000000004p+51, PASSTHROUGH } }, + { .f = { -0x1.0000000000002p+51, PASSTHROUGH } }, + { .f = { -0x1.0000000000000p+51, PASSTHROUGH } }, + { .f = { -0x1.0000000000000p+51, PASSTHROUGH } }, + { .f = { -0x1.ffffffffffffcp+50, PASSTHROUGH } }, + { .f = { 
-0x1.ffffffffffffcp+50, PASSTHROUGH } }, + { .f = { -0x1.ffffffffffffcp+50, PASSTHROUGH } }, + { .f = { -0x1.ffffffffffffcp+50, PASSTHROUGH } }, + + { .f = { -1.00, PASSTHROUGH } }, + { .f = { 0.00, PASSTHROUGH } }, + { .f = { 0.00, PASSTHROUGH } }, + { .f = { 0.00, PASSTHROUGH } } +}; + +union value *answers[] = { + answers_NEAREST_INT, + answers_NEG_INF, + answers_POS_INF, + answers_ZERO, + 0 /* CUR_DIRECTION answers depend on current rounding mode. */ +}; + +#include "sse4_1-round3.h" diff --git a/gcc/testsuite/gcc.target/powerpc/sse4_1-roundss.c b/gcc/testsuite/gcc.target/powerpc/sse4_1-roundss.c new file mode 100644 index 000000000000..3154310314a1 --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/sse4_1-roundss.c @@ -0,0 +1,208 @@ +/* { dg-do run } */ +/* { dg-require-effective-target powerpc_vsx_ok } */ +/* { dg-options "-O2 -mvsx" } */ + +#include +#define NO_WARN_X86_INTRINSICS 1 +#include + +#define VEC_T __m128 +#define FP_T float + +#define ROUND_INTRIN(x, y, mode) _mm_round_ss (x, y, mode) + +#include "sse4_1-round-data.h" + +static struct data2 data[] = { + { .value1 = { .f = { IGNORED, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + .value2 = { .f = { 0.00, IGNORED, IGNORED, IGNORED } } }, + { .value1 = { .f = { IGNORED, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + .value2 = { .f = { 0.25, IGNORED, IGNORED, IGNORED } } }, + { .value1 = { .f = { IGNORED, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + .value2 = { .f = { 0.50, IGNORED, IGNORED, IGNORED } } }, + { .value1 = { .f = { IGNORED, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + .value2 = { .f = { 0.75, IGNORED, IGNORED, IGNORED } } }, + + { .value1 = { .f = { IGNORED, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + .value2 = { .f = { 0x1.fffff8p+21, IGNORED, IGNORED, IGNORED } } }, + { .value1 = { .f = { IGNORED, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + .value2 = { .f = { 0x1.fffffap+21, IGNORED, IGNORED, IGNORED } } }, + { .value1 = { .f = { IGNORED, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + 
.value2 = { .f = { 0x1.fffffcp+21, IGNORED, IGNORED, IGNORED } } }, + { .value1 = { .f = { IGNORED, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + .value2 = { .f = { 0x1.fffffep+21, IGNORED, IGNORED, IGNORED } } }, + + { .value1 = { .f = { IGNORED, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + .value2 = { .f = { 0x1.fffffap+22, IGNORED, IGNORED, IGNORED } } }, + { .value1 = { .f = { IGNORED, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + .value2 = { .f = { 0x1.fffffcp+22, IGNORED, IGNORED, IGNORED } } }, + { .value1 = { .f = { IGNORED, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + .value2 = { .f = { 0x1.fffffep+22, IGNORED, IGNORED, IGNORED } } }, + { .value1 = { .f = { IGNORED, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + .value2 = { .f = { 0x1.fffffep+23, IGNORED, IGNORED, IGNORED } } }, + + { .value1 = { .f = { IGNORED, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + .value2 = { .f = { -0x1.fffffep+23, IGNORED, IGNORED, IGNORED } } }, + { .value1 = { .f = { IGNORED, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + .value2 = { .f = { -0x1.fffffep+22, IGNORED, IGNORED, IGNORED } } }, + { .value1 = { .f = { IGNORED, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + .value2 = { .f = { -0x1.fffffcp+22, IGNORED, IGNORED, IGNORED } } }, + { .value1 = { .f = { IGNORED, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + .value2 = { .f = { -0x1.fffffap+22, IGNORED, IGNORED, IGNORED } } }, + + { .value1 = { .f = { IGNORED, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + .value2 = { .f = { -0x1.fffffep+21, IGNORED, IGNORED, IGNORED } } }, + { .value1 = { .f = { IGNORED, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + .value2 = { .f = { -0x1.fffffcp+21, IGNORED, IGNORED, IGNORED } } }, + { .value1 = { .f = { IGNORED, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + .value2 = { .f = { -0x1.fffffap+21, IGNORED, IGNORED, IGNORED } } }, + { .value1 = { .f = { IGNORED, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + .value2 = { .f = { -0x1.fffff8p+21, IGNORED, IGNORED, IGNORED } } }, + + { .value1 = { .f = { 
IGNORED, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + .value2 = { .f = { -1.00, IGNORED, IGNORED, IGNORED } } }, + { .value1 = { .f = { IGNORED, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + .value2 = { .f = { -0.75, IGNORED, IGNORED, IGNORED } } }, + { .value1 = { .f = { IGNORED, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + .value2 = { .f = { -0.50, IGNORED, IGNORED, IGNORED } } }, + { .value1 = { .f = { IGNORED, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + .value2 = { .f = { -0.25, IGNORED, IGNORED, IGNORED } } } +}; + +static union value answers_NEAREST_INT[] = { + { .f = { 0.00, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + { .f = { 0.00, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + { .f = { 0.00, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + { .f = { 1.00, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + + { .f = { 0x1.fffff8p+21, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + { .f = { 0x1.fffff8p+21, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + { .f = { 0x1.000000p+22, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + { .f = { 0x1.000000p+22, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + + { .f = { 0x1.fffff8p+22, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + { .f = { 0x1.fffffcp+22, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + { .f = { 0x1.000000p+23, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + { .f = { 0x1.fffffep+23, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + + { .f = { -0x1.fffffep+23, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + { .f = { -0x1.000000p+23, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + { .f = { -0x1.fffffcp+22, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + { .f = { -0x1.fffff8p+22, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + + { .f = { -0x1.000000p+22, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + { .f = { -0x1.000000p+22, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + { .f = { -0x1.fffff8p+21, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + { .f = { -0x1.fffff8p+21, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + + { .f = { -1.00, PASSTHROUGH, PASSTHROUGH, 
PASSTHROUGH } }, + { .f = { -1.00, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + { .f = { -0.00, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + { .f = { -0.00, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } } +}; + +static union value answers_NEG_INF[] = { + { .f = { 0.00, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + { .f = { 0.00, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + { .f = { 0.00, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + { .f = { 0.00, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + + { .f = { 0x1.fffff8p+21, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + { .f = { 0x1.fffff8p+21, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + { .f = { 0x1.fffff8p+21, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + { .f = { 0x1.fffff8p+21, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + + { .f = { 0x1.fffff8p+22, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + { .f = { 0x1.fffffcp+22, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + { .f = { 0x1.fffffcp+22, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + { .f = { 0x1.fffffep+23, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + + { .f = { -0x1.fffffep+23, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + { .f = { -0x1.000000p+23, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + { .f = { -0x1.fffffcp+22, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + { .f = { -0x1.fffffcp+22, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + + { .f = { -0x1.000000p+22, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + { .f = { -0x1.000000p+22, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + { .f = { -0x1.000000p+22, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + { .f = { -0x1.fffff8p+21, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + + { .f = { -1.00, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + { .f = { -1.00, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + { .f = { -1.00, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + { .f = { -1.00, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } } +}; + +static union value answers_POS_INF[] = { + { .f = { 0.00, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + { .f = { 1.00, 
PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + { .f = { 1.00, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + { .f = { 1.00, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + + { .f = { 0x1.fffff8p+21, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + { .f = { 0x1.000000p+22, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + { .f = { 0x1.000000p+22, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + { .f = { 0x1.000000p+22, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + + { .f = { 0x1.fffffcp+22, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + { .f = { 0x1.fffffcp+22, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + { .f = { 0x1.000000p+23, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + { .f = { 0x1.fffffep+23, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + + { .f = { -0x1.fffffep+23, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + { .f = { -0x1.fffffcp+22, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + { .f = { -0x1.fffffcp+22, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + { .f = { -0x1.fffff8p+22, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + + { .f = { -0x1.fffff8p+21, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + { .f = { -0x1.fffff8p+21, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + { .f = { -0x1.fffff8p+21, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + { .f = { -0x1.fffff8p+21, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + + { .f = { -1.00, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + { .f = { 0.00, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + { .f = { 0.00, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + { .f = { 0.00, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } } +}; + +static union value answers_ZERO[] = { + { .f = { 0.00, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + { .f = { 0.00, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + { .f = { 0.00, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + { .f = { 0.00, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + + { .f = { 0x1.fffff8p+21, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + { .f = { 0x1.fffff8p+21, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + { .f = { 0x1.fffff8p+21, 
PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + { .f = { 0x1.fffff8p+21, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + + { .f = { 0x1.fffff8p+22, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + { .f = { 0x1.fffffcp+22, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + { .f = { 0x1.fffffcp+22, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + { .f = { 0x1.fffffep+23, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + + { .f = { -0x1.fffffep+23, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + { .f = { -0x1.fffffcp+22, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + { .f = { -0x1.fffffcp+22, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + { .f = { -0x1.fffff8p+22, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + + { .f = { -0x1.fffff8p+21, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + { .f = { -0x1.fffff8p+21, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + { .f = { -0x1.fffff8p+21, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + { .f = { -0x1.fffff8p+21, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + + { .f = { -1.00, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + { .f = { 0.00, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + { .f = { 0.00, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } }, + { .f = { 0.00, PASSTHROUGH, PASSTHROUGH, PASSTHROUGH } } +}; + +union value *answers[] = { + answers_NEAREST_INT, + answers_NEG_INF, + answers_POS_INF, + answers_ZERO, + 0 /* CUR_DIRECTION answers depend on current rounding mode. */ +}; + +#include "sse4_1-round3.h"