From patchwork Thu Jul 1 06:16:08 2021
X-Patchwork-Submitter: liuhongt
X-Patchwork-Id: 1499330
To: gcc-patches@gcc.gnu.org
Subject: [PATCH 22/62] AVX512FP16: Add fpclass/getexp/getmant instructions.
Date: Thu, 1 Jul 2021 14:16:08 +0800
Message-Id: <20210701061648.9447-23-hongtao.liu@intel.com>
In-Reply-To: <20210701061648.9447-1-hongtao.liu@intel.com>
References: <20210701061648.9447-1-hongtao.liu@intel.com>
From: liuhongt
Cc: jakub@redhat.com

Add vfpclassph/vfpclasssh/vgetexpph/vgetexpsh/vgetmantph/vgetmantsh.

gcc/ChangeLog:

	* config/i386/avx512fp16intrin.h (_mm_fpclass_sh_mask):
	New intrinsic.
	(_mm_mask_fpclass_sh_mask): Likewise.
	(_mm512_mask_fpclass_ph_mask): Likewise.
	(_mm512_fpclass_ph_mask): Likewise.
	(_mm_getexp_sh): Likewise.
	(_mm_mask_getexp_sh): Likewise.
	(_mm_maskz_getexp_sh): Likewise.
	(_mm512_getexp_ph): Likewise.
	(_mm512_mask_getexp_ph): Likewise.
	(_mm512_maskz_getexp_ph): Likewise.
	(_mm_getexp_round_sh): Likewise.
	(_mm_mask_getexp_round_sh): Likewise.
	(_mm_maskz_getexp_round_sh): Likewise.
	(_mm512_getexp_round_ph): Likewise.
	(_mm512_mask_getexp_round_ph): Likewise.
	(_mm512_maskz_getexp_round_ph): Likewise.
	(_mm_getmant_sh): Likewise.
	(_mm_mask_getmant_sh): Likewise.
	(_mm_maskz_getmant_sh): Likewise.
	(_mm512_getmant_ph): Likewise.
	(_mm512_mask_getmant_ph): Likewise.
	(_mm512_maskz_getmant_ph): Likewise.
	(_mm_getmant_round_sh): Likewise.
	(_mm_mask_getmant_round_sh): Likewise.
	(_mm_maskz_getmant_round_sh): Likewise.
	(_mm512_getmant_round_ph): Likewise.
	(_mm512_mask_getmant_round_ph): Likewise.
	(_mm512_maskz_getmant_round_ph): Likewise.
	* config/i386/avx512fp16vlintrin.h (_mm_mask_fpclass_ph_mask):
	New intrinsic.
	(_mm_fpclass_ph_mask): Likewise.
	(_mm256_mask_fpclass_ph_mask): Likewise.
	(_mm256_fpclass_ph_mask): Likewise.
	(_mm256_getexp_ph): Likewise.
	(_mm256_mask_getexp_ph): Likewise.
	(_mm256_maskz_getexp_ph): Likewise.
	(_mm_getexp_ph): Likewise.
	(_mm_mask_getexp_ph): Likewise.
	(_mm_maskz_getexp_ph): Likewise.
	(_mm256_getmant_ph): Likewise.
	(_mm256_mask_getmant_ph): Likewise.
	(_mm256_maskz_getmant_ph): Likewise.
	(_mm_getmant_ph): Likewise.
	(_mm_mask_getmant_ph): Likewise.
	(_mm_maskz_getmant_ph): Likewise.
	* config/i386/i386-builtin-types.def: Add corresponding builtin types.
	* config/i386/i386-builtin.def: Add corresponding new builtins.
	* config/i386/i386-expand.c (ix86_expand_args_builtin): Handle
	new builtin types.
	(ix86_expand_round_builtin): Ditto.
	* config/i386/sse.md (vecmemsuffix): Add HF vector modes.
	(<avx512>_getexp<mode><mask_name><round_saeonly_name>): Adjust
	to support HF vector modes.
	(avx512f_sgetexp<mode><mask_scalar_name><round_saeonly_scalar_name>):
	Ditto.
	(avx512dq_vmfpclass<mode><mask_scalar_merge_name>): Ditto.
	(<avx512>_getmant<mode><mask_name><round_saeonly_name>): Ditto.
	(avx512f_vgetmant<mode><mask_scalar_name><round_saeonly_scalar_name>):
	Ditto.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/avx-1.c: Add test for new builtins.
	* gcc.target/i386/sse-13.c: Ditto.
	* gcc.target/i386/sse-23.c: Ditto.
	* gcc.target/i386/sse-14.c: Add test for new intrinsics.
	* gcc.target/i386/sse-22.c: Ditto.
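As a quick reviewer aid (not part of the patch; the function names below
are made up for illustration), a minimal usage sketch of the new 512-bit
intrinsics, assuming a compiler with this series applied and -mavx512fp16:

#include <immintrin.h>

/* vfpclassph: immediate bit 0 selects QNaN lanes (bit 5 would select
   denormals, bit 7 SNaN, per the vfpclass immediate encoding).  */
__mmask32
qnan_lanes (__m512h x)
{
  return _mm512_fpclass_ph_mask (x, 0x01);
}

/* vgetexpph: each lane becomes floor (log2 (|x|)), as an FP16 value.  */
__m512h
lane_exponents (__m512h x)
{
  return _mm512_getexp_ph (x);
}

/* vgetmantph: normalize each mantissa into [1, 2) with the sign forced
   to zero.  The two enums are packed as (sign << 2) | interval, which is
   exactly the (__C << 2) | __B composition used in the intrinsics below.  */
__m512h
lane_mantissas (__m512h x)
{
  return _mm512_getmant_ph (x, _MM_MANT_NORM_1_2, _MM_MANT_SIGN_zero);
}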
---
 gcc/config/i386/avx512fp16intrin.h     | 471 +++++++++++++++++++++++++
 gcc/config/i386/avx512fp16vlintrin.h   | 229 ++++++++++++
 gcc/config/i386/i386-builtin-types.def |   3 +
 gcc/config/i386/i386-builtin.def       |  12 +
 gcc/config/i386/i386-expand.c          |   7 +
 gcc/config/i386/sse.md                 |  41 +--
 gcc/testsuite/gcc.target/i386/avx-1.c  |  10 +
 gcc/testsuite/gcc.target/i386/sse-13.c |  10 +
 gcc/testsuite/gcc.target/i386/sse-14.c |  18 +
 gcc/testsuite/gcc.target/i386/sse-22.c |  18 +
 gcc/testsuite/gcc.target/i386/sse-23.c |  10 +
 11 files changed, 809 insertions(+), 20 deletions(-)

diff --git a/gcc/config/i386/avx512fp16intrin.h b/gcc/config/i386/avx512fp16intrin.h
index 8c2c9b28987..2fbfc140c44 100644
--- a/gcc/config/i386/avx512fp16intrin.h
+++ b/gcc/config/i386/avx512fp16intrin.h
@@ -1982,6 +1982,477 @@ _mm_maskz_roundscale_round_sh (__mmask8 __A, __m128h __B, __m128h __C,
 
 #endif /* __OPTIMIZE__ */
 
+/* Intrinsics vfpclasssh.  */
+#ifdef __OPTIMIZE__
+extern __inline __mmask8
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_fpclass_sh_mask (__m128h __A, const int __imm)
+{
+  return (__mmask8) __builtin_ia32_fpclasssh_mask ((__v8hf) __A, __imm,
+						   (__mmask8) -1);
+}
+
+extern __inline __mmask8
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_mask_fpclass_sh_mask (__mmask8 __U, __m128h __A, const int __imm)
+{
+  return (__mmask8) __builtin_ia32_fpclasssh_mask ((__v8hf) __A, __imm, __U);
+}
+
+#else
+#define _mm_fpclass_sh_mask(X, C)					\
+  ((__mmask8) __builtin_ia32_fpclasssh_mask ((__v8hf) (__m128h) (X),	\
+					     (int) (C), (__mmask8) (-1)))
+
+#define _mm_mask_fpclass_sh_mask(U, X, C)				\
+  ((__mmask8) __builtin_ia32_fpclasssh_mask ((__v8hf) (__m128h) (X),	\
+					     (int) (C), (__mmask8) (U)))
+#endif /* __OPTIMIZE__ */
+
+/* Intrinsics vfpclassph.  */
+#ifdef __OPTIMIZE__
+extern __inline __mmask32
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_mask_fpclass_ph_mask (__mmask32 __U, __m512h __A,
+			     const int __imm)
+{
+  return (__mmask32) __builtin_ia32_fpclassph512_mask ((__v32hf) __A,
+						       __imm, __U);
+}
+
+extern __inline __mmask32
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_fpclass_ph_mask (__m512h __A, const int __imm)
+{
+  return (__mmask32) __builtin_ia32_fpclassph512_mask ((__v32hf) __A,
+						       __imm,
+						       (__mmask32) -1);
+}
+
+#else
+#define _mm512_mask_fpclass_ph_mask(u, x, c)				\
+  ((__mmask32) __builtin_ia32_fpclassph512_mask ((__v32hf) (__m512h) (x), \
+						 (int) (c), (__mmask32) (u)))
+
+#define _mm512_fpclass_ph_mask(x, c)					\
+  ((__mmask32) __builtin_ia32_fpclassph512_mask ((__v32hf) (__m512h) (x), \
+						 (int) (c), (__mmask32) -1))
+#endif /* __OPTIMIZE__ */
+
+/* Intrinsics vgetexpph, vgetexpsh.  */
+extern __inline __m128h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_getexp_sh (__m128h __A, __m128h __B)
+{
+  return (__m128h)
+    __builtin_ia32_getexpsh_mask_round ((__v8hf) __A, (__v8hf) __B,
+					(__v8hf) _mm_setzero_ph (),
+					(__mmask8) -1,
+					_MM_FROUND_CUR_DIRECTION);
+}
+
+extern __inline __m128h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_mask_getexp_sh (__m128h __W, __mmask8 __U, __m128h __A, __m128h __B)
+{
+  return (__m128h)
+    __builtin_ia32_getexpsh_mask_round ((__v8hf) __A, (__v8hf) __B,
+					(__v8hf) __W, (__mmask8) __U,
+					_MM_FROUND_CUR_DIRECTION);
+}
+
+extern __inline __m128h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_maskz_getexp_sh (__mmask8 __U, __m128h __A, __m128h __B)
+{
+  return (__m128h)
+    __builtin_ia32_getexpsh_mask_round ((__v8hf) __A, (__v8hf) __B,
+					(__v8hf) _mm_setzero_ph (),
+					(__mmask8) __U,
+					_MM_FROUND_CUR_DIRECTION);
+}
+
+extern __inline __m512h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_getexp_ph (__m512h __A)
+{
+  return (__m512h)
+    __builtin_ia32_getexpph512_mask ((__v32hf) __A,
+				     (__v32hf) _mm512_setzero_ph (),
+				     (__mmask32) -1, _MM_FROUND_CUR_DIRECTION);
+}
+
+extern __inline __m512h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_mask_getexp_ph (__m512h __W, __mmask32 __U, __m512h __A)
+{
+  return (__m512h)
+    __builtin_ia32_getexpph512_mask ((__v32hf) __A, (__v32hf) __W,
+				     (__mmask32) __U, _MM_FROUND_CUR_DIRECTION);
+}
+
+extern __inline __m512h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_maskz_getexp_ph (__mmask32 __U, __m512h __A)
+{
+  return (__m512h)
+    __builtin_ia32_getexpph512_mask ((__v32hf) __A,
+				     (__v32hf) _mm512_setzero_ph (),
+				     (__mmask32) __U, _MM_FROUND_CUR_DIRECTION);
+}
+
+#ifdef __OPTIMIZE__
+extern __inline __m128h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_getexp_round_sh (__m128h __A, __m128h __B, const int __R)
+{
+  return (__m128h) __builtin_ia32_getexpsh_mask_round ((__v8hf) __A,
+						       (__v8hf) __B,
+						       _mm_setzero_ph (),
+						       (__mmask8) -1,
+						       __R);
+}
+
+extern __inline __m128h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_mask_getexp_round_sh (__m128h __W, __mmask8 __U, __m128h __A,
+			  __m128h __B, const int __R)
+{
+  return (__m128h) __builtin_ia32_getexpsh_mask_round ((__v8hf) __A,
+						       (__v8hf) __B,
+						       (__v8hf) __W,
+						       (__mmask8) __U, __R);
+}
+
+extern __inline __m128h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_maskz_getexp_round_sh (__mmask8 __U, __m128h __A, __m128h __B,
+			   const int __R)
+{
+  return (__m128h) __builtin_ia32_getexpsh_mask_round ((__v8hf) __A,
+						       (__v8hf) __B,
+						       (__v8hf)
+						       _mm_setzero_ph (),
+						       (__mmask8) __U, __R);
+}
+
+extern __inline __m512h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_getexp_round_ph (__m512h __A, const int __R)
+{
+  return (__m512h) __builtin_ia32_getexpph512_mask ((__v32hf) __A,
+						    (__v32hf)
+						    _mm512_setzero_ph (),
+						    (__mmask32) -1, __R);
+}
+
+extern __inline __m512h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_mask_getexp_round_ph (__m512h __W, __mmask32 __U, __m512h __A,
+			     const int __R)
+{
+  return (__m512h) __builtin_ia32_getexpph512_mask ((__v32hf) __A,
+						    (__v32hf) __W,
+						    (__mmask32) __U, __R);
+}
+
+extern __inline __m512h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_maskz_getexp_round_ph (__mmask32 __U, __m512h __A, const int __R)
+{
+  return (__m512h) __builtin_ia32_getexpph512_mask ((__v32hf) __A,
+						    (__v32hf)
+						    _mm512_setzero_ph (),
+						    (__mmask32) __U, __R);
+}
+
+#else
+#define _mm_getexp_round_sh(A, B, R)					\
+  ((__m128h)__builtin_ia32_getexpsh_mask_round((__v8hf)(__m128h)(A),	\
+					       (__v8hf)(__m128h)(B),	\
+					       (__v8hf)_mm_setzero_ph(), \
+					       (__mmask8)-1, R))
+
+#define _mm_mask_getexp_round_sh(W, U, A, B, C)				\
+  (__m128h)__builtin_ia32_getexpsh_mask_round(A, B, W, U, C)
+
+#define _mm_maskz_getexp_round_sh(U, A, B, C)				\
+  (__m128h)__builtin_ia32_getexpsh_mask_round(A, B,			\
+					      (__v8hf)_mm_setzero_ph(), \
+					      U, C)
+
+#define _mm512_getexp_round_ph(A, R)					\
+  ((__m512h)__builtin_ia32_getexpph512_mask((__v32hf)(__m512h)(A),	\
+					    (__v32hf)_mm512_setzero_ph(), (__mmask32)-1, R))
+
+#define _mm512_mask_getexp_round_ph(W, U, A, R)				\
+  ((__m512h)__builtin_ia32_getexpph512_mask((__v32hf)(__m512h)(A),	\
+					    (__v32hf)(__m512h)(W), (__mmask32)(U), R))
+
+#define _mm512_maskz_getexp_round_ph(U, A, R)				\
+  ((__m512h)__builtin_ia32_getexpph512_mask((__v32hf)(__m512h)(A),	\
+					    (__v32hf)_mm512_setzero_ph(), (__mmask32)(U), R))
+
+#endif /* __OPTIMIZE__ */
+
+/* Intrinsics vgetmantph, vgetmantsh.  */
+#ifdef __OPTIMIZE__
+extern __inline __m128h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_getmant_sh (__m128h __A, __m128h __B,
+		_MM_MANTISSA_NORM_ENUM __C,
+		_MM_MANTISSA_SIGN_ENUM __D)
+{
+  return (__m128h)
+    __builtin_ia32_getmantsh_mask_round ((__v8hf) __A, (__v8hf) __B,
+					 (__D << 2) | __C, _mm_setzero_ph (),
+					 (__mmask8) -1,
+					 _MM_FROUND_CUR_DIRECTION);
+}
+
+extern __inline __m128h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_mask_getmant_sh (__m128h __W, __mmask8 __U, __m128h __A,
+		     __m128h __B, _MM_MANTISSA_NORM_ENUM __C,
+		     _MM_MANTISSA_SIGN_ENUM __D)
+{
+  return (__m128h)
+    __builtin_ia32_getmantsh_mask_round ((__v8hf) __A, (__v8hf) __B,
+					 (__D << 2) | __C, (__v8hf) __W,
+					 __U, _MM_FROUND_CUR_DIRECTION);
+}
+
+extern __inline __m128h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_maskz_getmant_sh (__mmask8 __U, __m128h __A, __m128h __B,
+		      _MM_MANTISSA_NORM_ENUM __C,
+		      _MM_MANTISSA_SIGN_ENUM __D)
+{
+  return (__m128h)
+    __builtin_ia32_getmantsh_mask_round ((__v8hf) __A, (__v8hf) __B,
+					 (__D << 2) | __C,
+					 (__v8hf) _mm_setzero_ph(),
+					 __U, _MM_FROUND_CUR_DIRECTION);
+}
+
+extern __inline __m512h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_getmant_ph (__m512h __A, _MM_MANTISSA_NORM_ENUM __B,
+		   _MM_MANTISSA_SIGN_ENUM __C)
+{
+  return (__m512h) __builtin_ia32_getmantph512_mask ((__v32hf) __A,
+						     (__C << 2) | __B,
+						     _mm512_setzero_ph (),
+						     (__mmask32) -1,
+						     _MM_FROUND_CUR_DIRECTION);
+}
+
+extern __inline __m512h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_mask_getmant_ph (__m512h __W, __mmask32 __U, __m512h __A,
+			_MM_MANTISSA_NORM_ENUM __B,
+			_MM_MANTISSA_SIGN_ENUM __C)
+{
+  return (__m512h) __builtin_ia32_getmantph512_mask ((__v32hf) __A,
+						     (__C << 2) | __B,
+						     (__v32hf) __W, __U,
+						     _MM_FROUND_CUR_DIRECTION);
+}
+
+extern __inline __m512h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_maskz_getmant_ph (__mmask32 __U, __m512h __A,
+			 _MM_MANTISSA_NORM_ENUM __B,
+			 _MM_MANTISSA_SIGN_ENUM __C)
+{
+  return (__m512h) __builtin_ia32_getmantph512_mask ((__v32hf) __A,
+						     (__C << 2) | __B,
+						     (__v32hf)
+						     _mm512_setzero_ph (),
+						     __U,
+						     _MM_FROUND_CUR_DIRECTION);
+}
+
+extern __inline __m128h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_getmant_round_sh (__m128h __A, __m128h __B,
+		      _MM_MANTISSA_NORM_ENUM __C,
+		      _MM_MANTISSA_SIGN_ENUM __D, const int __R)
+{
+  return (__m128h) __builtin_ia32_getmantsh_mask_round ((__v8hf) __A,
+							(__v8hf) __B,
+							(__D << 2) | __C,
+							_mm_setzero_ph (),
+							(__mmask8) -1,
+							__R);
+}
+
+extern __inline __m128h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_mask_getmant_round_sh (__m128h __W, __mmask8 __U, __m128h __A,
+			   __m128h __B, _MM_MANTISSA_NORM_ENUM __C,
+			   _MM_MANTISSA_SIGN_ENUM __D, const int __R)
+{
+  return (__m128h) __builtin_ia32_getmantsh_mask_round ((__v8hf) __A,
+							(__v8hf) __B,
+							(__D << 2) | __C,
+							(__v8hf) __W,
+							__U, __R);
+}
+
+extern __inline __m128h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_maskz_getmant_round_sh (__mmask8 __U, __m128h __A, __m128h __B,
+			    _MM_MANTISSA_NORM_ENUM __C,
+			    _MM_MANTISSA_SIGN_ENUM __D, const int __R)
+{
+  return (__m128h) __builtin_ia32_getmantsh_mask_round ((__v8hf) __A,
+							(__v8hf) __B,
+							(__D << 2) | __C,
+							(__v8hf)
+							_mm_setzero_ph(),
+							__U, __R);
+}
+
+extern __inline __m512h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_getmant_round_ph (__m512h __A, _MM_MANTISSA_NORM_ENUM __B,
+			 _MM_MANTISSA_SIGN_ENUM __C, const int __R)
+{
+  return (__m512h) __builtin_ia32_getmantph512_mask ((__v32hf) __A,
+						     (__C << 2) | __B,
+						     _mm512_setzero_ph (),
+						     (__mmask32) -1, __R);
+}
+
+extern __inline __m512h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_mask_getmant_round_ph (__m512h __W, __mmask32 __U, __m512h __A,
+			      _MM_MANTISSA_NORM_ENUM __B,
+			      _MM_MANTISSA_SIGN_ENUM __C, const int __R)
+{
+  return (__m512h) __builtin_ia32_getmantph512_mask ((__v32hf) __A,
+						     (__C << 2) | __B,
+						     (__v32hf) __W, __U,
+						     __R);
+}
+
+extern __inline __m512h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm512_maskz_getmant_round_ph (__mmask32 __U, __m512h __A,
+			       _MM_MANTISSA_NORM_ENUM __B,
+			       _MM_MANTISSA_SIGN_ENUM __C, const int __R)
+{
+  return (__m512h) __builtin_ia32_getmantph512_mask ((__v32hf) __A,
+						     (__C << 2) | __B,
+						     (__v32hf)
+						     _mm512_setzero_ph (),
+						     __U, __R);
+}
+
+#else
+#define _mm512_getmant_ph(X, B, C)					\
+  ((__m512h)__builtin_ia32_getmantph512_mask ((__v32hf)(__m512h)(X),	\
+					      (int)(((C)<<2) | (B)),	\
+					      (__v32hf)(__m512h)	\
+					      _mm512_setzero_ph(),	\
+					      (__mmask32)-1,		\
+					      _MM_FROUND_CUR_DIRECTION))
+
+#define _mm512_mask_getmant_ph(W, U, X, B, C)				\
+  ((__m512h)__builtin_ia32_getmantph512_mask ((__v32hf)(__m512h)(X),	\
+					      (int)(((C)<<2) | (B)),	\
+					      (__v32hf)(__m512h)(W),	\
+					      (__mmask32)(U),		\
+					      _MM_FROUND_CUR_DIRECTION))
+
+#define _mm512_maskz_getmant_ph(U, X, B, C)				\
+  ((__m512h)__builtin_ia32_getmantph512_mask ((__v32hf)(__m512h)(X),	\
+					      (int)(((C)<<2) | (B)),	\
+					      (__v32hf)(__m512h)	\
+					      _mm512_setzero_ph(),	\
+					      (__mmask32)(U),		\
+					      _MM_FROUND_CUR_DIRECTION))
+
+#define _mm_getmant_sh(X, Y, C, D)					\
+  ((__m128h)__builtin_ia32_getmantsh_mask_round ((__v8hf)(__m128h)(X),	\
+						 (__v8hf)(__m128h)(Y),	\
+						 (int)(((D)<<2) | (C)),	\
+						 (__v8hf)(__m128h)	\
+						 _mm_setzero_ph (),	\
+						 (__mmask8)-1,		\
+						 _MM_FROUND_CUR_DIRECTION))
+
+#define _mm_mask_getmant_sh(W, U, X, Y, C, D)				\
+  ((__m128h)__builtin_ia32_getmantsh_mask_round ((__v8hf)(__m128h)(X),	\
+						 (__v8hf)(__m128h)(Y),	\
+						 (int)(((D)<<2) | (C)),	\
+						 (__v8hf)(__m128h)(W),	\
+						 (__mmask8)(U),		\
+						 _MM_FROUND_CUR_DIRECTION))
+
+#define _mm_maskz_getmant_sh(U, X, Y, C, D)				\
+  ((__m128h)__builtin_ia32_getmantsh_mask_round ((__v8hf)(__m128h)(X),	\
+						 (__v8hf)(__m128h)(Y),	\
+						 (int)(((D)<<2) | (C)),	\
+						 (__v8hf)(__m128h)	\
+						 _mm_setzero_ph(),	\
+						 (__mmask8)(U),		\
+						 _MM_FROUND_CUR_DIRECTION))
+
+#define _mm512_getmant_round_ph(X, B, C, R)				\
+  ((__m512h)__builtin_ia32_getmantph512_mask ((__v32hf)(__m512h)(X),	\
+					      (int)(((C)<<2) | (B)),	\
+					      (__v32hf)(__m512h)	\
+					      _mm512_setzero_ph(),	\
+					      (__mmask32)-1,		\
+					      (R)))
+
+#define _mm512_mask_getmant_round_ph(W, U, X, B, C, R)			\
+  ((__m512h)__builtin_ia32_getmantph512_mask ((__v32hf)(__m512h)(X),	\
+					      (int)(((C)<<2) | (B)),	\
+					      (__v32hf)(__m512h)(W),	\
+					      (__mmask32)(U),		\
+					      (R)))
+
+#define _mm512_maskz_getmant_round_ph(U, X, B, C, R)			\
+  ((__m512h)__builtin_ia32_getmantph512_mask ((__v32hf)(__m512h)(X),	\
+					      (int)(((C)<<2) | (B)),	\
+					      (__v32hf)(__m512h)	\
+					      _mm512_setzero_ph(),	\
+					      (__mmask32)(U),		\
+					      (R)))
+
+#define _mm_getmant_round_sh(X, Y, C, D, R)				\
+  ((__m128h)__builtin_ia32_getmantsh_mask_round ((__v8hf)(__m128h)(X),	\
+						 (__v8hf)(__m128h)(Y),	\
+						 (int)(((D)<<2) | (C)),	\
+						 (__v8hf)(__m128h)	\
+						 _mm_setzero_ph (),	\
+						 (__mmask8)-1,		\
+						 (R)))
+
+#define _mm_mask_getmant_round_sh(W, U, X, Y, C, D, R)			\
+  ((__m128h)__builtin_ia32_getmantsh_mask_round ((__v8hf)(__m128h)(X),	\
+						 (__v8hf)(__m128h)(Y),	\
+						 (int)(((D)<<2) | (C)),	\
+						 (__v8hf)(__m128h)(W),	\
+						 (__mmask8)(U),		\
+						 (R)))
+
+#define _mm_maskz_getmant_round_sh(U, X, Y, C, D, R)			\
+  ((__m128h)__builtin_ia32_getmantsh_mask_round ((__v8hf)(__m128h)(X),	\
+						 (__v8hf)(__m128h)(Y),	\
+						 (int)(((D)<<2) | (C)),	\
+						 (__v8hf)(__m128h)	\
+						 _mm_setzero_ph(),	\
+						 (__mmask8)(U),		\
+						 (R)))
+
+#endif /* __OPTIMIZE__ */
+
 #ifdef __DISABLE_AVX512FP16__
 #undef __DISABLE_AVX512FP16__
 #pragma GCC pop_options
diff --git a/gcc/config/i386/avx512fp16vlintrin.h b/gcc/config/i386/avx512fp16vlintrin.h
index 20b6716aa00..206d60407fc 100644
--- a/gcc/config/i386/avx512fp16vlintrin.h
+++ b/gcc/config/i386/avx512fp16vlintrin.h
@@ -701,6 +701,235 @@ _mm256_maskz_roundscale_ph (__mmask16 __A, __m256h __B, int __C)
 
 #endif /* __OPTIMIZE__ */
 
+/* Intrinsics vfpclassph.  */
+#ifdef __OPTIMIZE__
+extern __inline __mmask8
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_mask_fpclass_ph_mask (__mmask8 __U, __m128h __A, const int __imm)
+{
+  return (__mmask8) __builtin_ia32_fpclassph128_mask ((__v8hf) __A,
+						      __imm, __U);
+}
+
+extern __inline __mmask8
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_fpclass_ph_mask (__m128h __A, const int __imm)
+{
+  return (__mmask8) __builtin_ia32_fpclassph128_mask ((__v8hf) __A,
+						      __imm,
+						      (__mmask8) -1);
+}
+
+extern __inline __mmask16
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm256_mask_fpclass_ph_mask (__mmask16 __U, __m256h __A, const int __imm)
+{
+  return (__mmask16) __builtin_ia32_fpclassph256_mask ((__v16hf) __A,
+						       __imm, __U);
+}
+
+extern __inline __mmask16
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm256_fpclass_ph_mask (__m256h __A, const int __imm)
+{
+  return (__mmask16) __builtin_ia32_fpclassph256_mask ((__v16hf) __A,
+						       __imm,
+						       (__mmask16) -1);
+}
+
+#else
+#define _mm_fpclass_ph_mask(X, C)					\
+  ((__mmask8) __builtin_ia32_fpclassph128_mask ((__v8hf) (__m128h) (X),	\
+						(int) (C),(__mmask8)-1))
+
+#define _mm_mask_fpclass_ph_mask(u, X, C)				\
+  ((__mmask8) __builtin_ia32_fpclassph128_mask ((__v8hf) (__m128h) (X),	\
+						(int) (C),(__mmask8)(u)))
+
+#define _mm256_fpclass_ph_mask(X, C)					\
+  ((__mmask16) __builtin_ia32_fpclassph256_mask ((__v16hf) (__m256h) (X), \
+						 (int) (C),(__mmask16)-1))
+
+#define _mm256_mask_fpclass_ph_mask(u, X, C)				\
+  ((__mmask16) __builtin_ia32_fpclassph256_mask ((__v16hf) (__m256h) (X), \
+						 (int) (C),(__mmask16)(u)))
+#endif /* __OPTIMIZE__ */
+
+/* Intrinsics vgetexpph, vgetexpsh.  */
+extern __inline __m256h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm256_getexp_ph (__m256h __A)
+{
+  return (__m256h) __builtin_ia32_getexpph256_mask ((__v16hf) __A,
+						    (__v16hf)
+						    _mm256_setzero_ph (),
+						    (__mmask16) -1);
+}
+
+extern __inline __m256h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm256_mask_getexp_ph (__m256h __W, __mmask16 __U, __m256h __A)
+{
+  return (__m256h) __builtin_ia32_getexpph256_mask ((__v16hf) __A,
+						    (__v16hf) __W,
+						    (__mmask16) __U);
+}
+
+extern __inline __m256h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm256_maskz_getexp_ph (__mmask16 __U, __m256h __A)
+{
+  return (__m256h) __builtin_ia32_getexpph256_mask ((__v16hf) __A,
+						    (__v16hf)
+						    _mm256_setzero_ph (),
+						    (__mmask16) __U);
+}
+
+extern __inline __m128h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_getexp_ph (__m128h __A)
+{
+  return (__m128h) __builtin_ia32_getexpph128_mask ((__v8hf) __A,
+						    (__v8hf)
+						    _mm_setzero_ph (),
+						    (__mmask8) -1);
+}
+
+extern __inline __m128h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_mask_getexp_ph (__m128h __W, __mmask8 __U, __m128h __A)
+{
+  return (__m128h) __builtin_ia32_getexpph128_mask ((__v8hf) __A,
+						    (__v8hf) __W,
+						    (__mmask8) __U);
+}
+
+extern __inline __m128h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_maskz_getexp_ph (__mmask8 __U, __m128h __A)
+{
+  return (__m128h) __builtin_ia32_getexpph128_mask ((__v8hf) __A,
+						    (__v8hf)
+						    _mm_setzero_ph (),
+						    (__mmask8) __U);
+}
+
+/* Intrinsics vgetmantph, vgetmantsh.  */
+#ifdef __OPTIMIZE__
+extern __inline __m256h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm256_getmant_ph (__m256h __A, _MM_MANTISSA_NORM_ENUM __B,
+		   _MM_MANTISSA_SIGN_ENUM __C)
+{
+  return (__m256h) __builtin_ia32_getmantph256_mask ((__v16hf) __A,
+						     (__C << 2) | __B,
+						     (__v16hf)
+						     _mm256_setzero_ph (),
+						     (__mmask16) -1);
+}
+
+extern __inline __m256h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm256_mask_getmant_ph (__m256h __W, __mmask16 __U, __m256h __A,
+			_MM_MANTISSA_NORM_ENUM __B,
+			_MM_MANTISSA_SIGN_ENUM __C)
+{
+  return (__m256h) __builtin_ia32_getmantph256_mask ((__v16hf) __A,
+						     (__C << 2) | __B,
+						     (__v16hf) __W,
+						     (__mmask16) __U);
+}
+
+extern __inline __m256h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm256_maskz_getmant_ph (__mmask16 __U, __m256h __A,
+			 _MM_MANTISSA_NORM_ENUM __B,
+			 _MM_MANTISSA_SIGN_ENUM __C)
+{
+  return (__m256h) __builtin_ia32_getmantph256_mask ((__v16hf) __A,
+						     (__C << 2) | __B,
+						     (__v16hf)
+						     _mm256_setzero_ph (),
+						     (__mmask16) __U);
+}
+
+extern __inline __m128h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_getmant_ph (__m128h __A, _MM_MANTISSA_NORM_ENUM __B,
+		_MM_MANTISSA_SIGN_ENUM __C)
+{
+  return (__m128h) __builtin_ia32_getmantph128_mask ((__v8hf) __A,
+						     (__C << 2) | __B,
+						     (__v8hf)
+						     _mm_setzero_ph (),
+						     (__mmask8) -1);
+}
+
+extern __inline __m128h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_mask_getmant_ph (__m128h __W, __mmask8 __U, __m128h __A,
+		     _MM_MANTISSA_NORM_ENUM __B,
+		     _MM_MANTISSA_SIGN_ENUM __C)
+{
+  return (__m128h) __builtin_ia32_getmantph128_mask ((__v8hf) __A,
+						     (__C << 2) | __B,
+						     (__v8hf) __W,
+						     (__mmask8) __U);
+}
+
+extern __inline __m128h
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_maskz_getmant_ph (__mmask8 __U, __m128h __A,
+		      _MM_MANTISSA_NORM_ENUM __B,
+		      _MM_MANTISSA_SIGN_ENUM __C)
+{
+  return (__m128h) __builtin_ia32_getmantph128_mask ((__v8hf) __A,
+						     (__C << 2) | __B,
+						     (__v8hf)
+						     _mm_setzero_ph (),
+						     (__mmask8) __U);
+}
+
+#else
+#define _mm256_getmant_ph(X, B, C)					\
+  ((__m256h) __builtin_ia32_getmantph256_mask ((__v16hf)(__m256h) (X),	\
+					       (int)(((C)<<2) | (B)),	\
+					       (__v16hf)(__m256h)_mm256_setzero_ph (), \
+					       (__mmask16)-1))
+
+#define _mm256_mask_getmant_ph(W, U, X, B, C)				\
+  ((__m256h) __builtin_ia32_getmantph256_mask ((__v16hf)(__m256h) (X),	\
+					       (int)(((C)<<2) | (B)),	\
+					       (__v16hf)(__m256h)(W),	\
+					       (__mmask16)(U)))
+
+#define _mm256_maskz_getmant_ph(U, X, B, C)				\
+  ((__m256h) __builtin_ia32_getmantph256_mask ((__v16hf)(__m256h) (X),	\
+					       (int)(((C)<<2) | (B)),	\
+					       (__v16hf)(__m256h)_mm256_setzero_ph (), \
+					       (__mmask16)(U)))
+
+#define _mm_getmant_ph(X, B, C)						\
+  ((__m128h) __builtin_ia32_getmantph128_mask ((__v8hf)(__m128h) (X),	\
+					       (int)(((C)<<2) | (B)),	\
+					       (__v8hf)(__m128h)_mm_setzero_ph (), \
+					       (__mmask8)-1))
+
+#define _mm_mask_getmant_ph(W, U, X, B, C)				\
+  ((__m128h) __builtin_ia32_getmantph128_mask ((__v8hf)(__m128h) (X),	\
+					       (int)(((C)<<2) | (B)),	\
+					       (__v8hf)(__m128h)(W),	\
+					       (__mmask8)(U)))
+
+#define _mm_maskz_getmant_ph(U, X, B, C)				\
+  ((__m128h) __builtin_ia32_getmantph128_mask ((__v8hf)(__m128h) (X),	\
+					       (int)(((C)<<2) | (B)),	\
+					       (__v8hf)(__m128h)_mm_setzero_ph (), \
+					       (__mmask8)(U)))
+
+#endif /* __OPTIMIZE__ */
+
 #ifdef __DISABLE_AVX512FP16VL__
 #undef __DISABLE_AVX512FP16VL__
 #pragma GCC pop_options
diff --git a/gcc/config/i386/i386-builtin-types.def b/gcc/config/i386/i386-builtin-types.def
index d2ba1a5edac..79e7edf13e5 100644
--- a/gcc/config/i386/i386-builtin-types.def
+++ b/gcc/config/i386/i386-builtin-types.def
@@ -1304,6 +1304,9 @@ DEF_FUNCTION_TYPE (UINT8, PV2DI, PCV2DI, PCVOID)
 
 # FP16 builtins
 DEF_FUNCTION_TYPE (V8HF, V8HI)
+DEF_FUNCTION_TYPE (QI, V8HF, INT, UQI)
+DEF_FUNCTION_TYPE (HI, V16HF, INT, UHI)
+DEF_FUNCTION_TYPE (SI, V32HF, INT, USI)
 DEF_FUNCTION_TYPE (V8HF, V8HF, V8HF)
 DEF_FUNCTION_TYPE (V8HF, V8HF, V8HF, UQI)
 DEF_FUNCTION_TYPE (V8HF, V8HF, V8HF, INT)
diff --git a/gcc/config/i386/i386-builtin.def b/gcc/config/i386/i386-builtin.def
index 6964062c874..ed1a4a38b1c 100644
--- a/gcc/config/i386/i386-builtin.def
+++ b/gcc/config/i386/i386-builtin.def
@@ -2818,6 +2818,14 @@ BDESC (OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX512FP16, CODE_FOR_reducepv8
 BDESC (OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX512FP16, CODE_FOR_reducepv16hf_mask, "__builtin_ia32_vreduceph_v16hf_mask", IX86_BUILTIN_VREDUCEPH_V16HF_MASK, UNKNOWN, (int) V16HF_FTYPE_V16HF_INT_V16HF_UHI)
 BDESC (OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX512FP16, CODE_FOR_avx512fp16_rndscalev8hf_mask, "__builtin_ia32_vrndscaleph_v8hf_mask", IX86_BUILTIN_VRNDSCALEPH_V8HF_MASK, UNKNOWN, (int) V8HF_FTYPE_V8HF_INT_V8HF_UQI)
 BDESC (OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX512FP16, CODE_FOR_avx512vl_rndscalev16hf_mask, "__builtin_ia32_vrndscaleph_v16hf_mask", IX86_BUILTIN_VRNDSCALEPH_V16HF_MASK, UNKNOWN, (int) V16HF_FTYPE_V16HF_INT_V16HF_UHI)
+BDESC (OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX512FP16, CODE_FOR_avx512dq_fpclassv16hf_mask, "__builtin_ia32_fpclassph256_mask", IX86_BUILTIN_FPCLASSPH256, UNKNOWN, (int) HI_FTYPE_V16HF_INT_UHI)
+BDESC (OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX512FP16, CODE_FOR_avx512dq_fpclassv8hf_mask, "__builtin_ia32_fpclassph128_mask", IX86_BUILTIN_FPCLASSPH128, UNKNOWN, (int) QI_FTYPE_V8HF_INT_UQI)
+BDESC (0, OPTION_MASK_ISA2_AVX512FP16, CODE_FOR_avx512dq_fpclassv32hf_mask, "__builtin_ia32_fpclassph512_mask", IX86_BUILTIN_FPCLASSPH512, UNKNOWN, (int) SI_FTYPE_V32HF_INT_USI)
+BDESC (0, OPTION_MASK_ISA2_AVX512FP16, CODE_FOR_avx512dq_vmfpclassv8hf_mask, "__builtin_ia32_fpclasssh_mask", IX86_BUILTIN_FPCLASSSH_MASK, UNKNOWN, (int) QI_FTYPE_V8HF_INT_UQI)
+BDESC (OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX512FP16, CODE_FOR_avx512vl_getexpv16hf_mask, "__builtin_ia32_getexpph256_mask", IX86_BUILTIN_GETEXPPH256, UNKNOWN, (int) V16HF_FTYPE_V16HF_V16HF_UHI)
+BDESC (OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX512FP16, CODE_FOR_avx512fp16_getexpv8hf_mask, "__builtin_ia32_getexpph128_mask", IX86_BUILTIN_GETEXPPH128, UNKNOWN, (int) V8HF_FTYPE_V8HF_V8HF_UQI)
+BDESC (OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX512FP16, CODE_FOR_avx512vl_getmantv16hf_mask, "__builtin_ia32_getmantph256_mask", IX86_BUILTIN_GETMANTPH256, UNKNOWN, (int) V16HF_FTYPE_V16HF_INT_V16HF_UHI)
+BDESC (OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX512FP16, CODE_FOR_avx512fp16_getmantv8hf_mask, "__builtin_ia32_getmantph128_mask", IX86_BUILTIN_GETMANTPH128, UNKNOWN, (int) V8HF_FTYPE_V8HF_INT_V8HF_UQI)
 
 /* Builtins with rounding support.  */
 BDESC_END (ARGS, ROUND_ARGS)
 
@@ -3041,6 +3049,10 @@ BDESC (0, OPTION_MASK_ISA2_AVX512FP16, CODE_FOR_reducepv32hf_mask_round, "__buil
 BDESC (0, OPTION_MASK_ISA2_AVX512FP16, CODE_FOR_reducesv8hf_mask_round, "__builtin_ia32_vreducesh_v8hf_mask_round", IX86_BUILTIN_VREDUCESH_V8HF_MASK_ROUND, UNKNOWN, (int) V8HF_FTYPE_V8HF_V8HF_INT_V8HF_UQI_INT)
 BDESC (0, OPTION_MASK_ISA2_AVX512FP16, CODE_FOR_avx512bw_rndscalev32hf_mask_round, "__builtin_ia32_vrndscaleph_v32hf_mask_round", IX86_BUILTIN_VRNDSCALEPH_V32HF_MASK_ROUND, UNKNOWN, (int) V32HF_FTYPE_V32HF_INT_V32HF_USI_INT)
 BDESC (0, OPTION_MASK_ISA2_AVX512FP16, CODE_FOR_avx512f_rndscalev8hf_mask_round, "__builtin_ia32_vrndscalesh_v8hf_mask_round", IX86_BUILTIN_VRNDSCALESH_V8HF_MASK_ROUND, UNKNOWN, (int) V8HF_FTYPE_V8HF_V8HF_INT_V8HF_UQI_INT)
+BDESC (0, OPTION_MASK_ISA2_AVX512FP16, CODE_FOR_avx512bw_getexpv32hf_mask_round, "__builtin_ia32_getexpph512_mask", IX86_BUILTIN_GETEXPPH512, UNKNOWN, (int) V32HF_FTYPE_V32HF_V32HF_USI_INT)
+BDESC (0, OPTION_MASK_ISA2_AVX512FP16, CODE_FOR_avx512f_sgetexpv8hf_mask_round, "__builtin_ia32_getexpsh_mask_round", IX86_BUILTIN_GETEXPSH_MASK_ROUND, UNKNOWN, (int) V8HF_FTYPE_V8HF_V8HF_V8HF_UQI_INT)
+BDESC (0, OPTION_MASK_ISA2_AVX512FP16, CODE_FOR_avx512bw_getmantv32hf_mask_round, "__builtin_ia32_getmantph512_mask", IX86_BUILTIN_GETMANTPH512, UNKNOWN, (int) V32HF_FTYPE_V32HF_INT_V32HF_USI_INT)
+BDESC (0, OPTION_MASK_ISA2_AVX512FP16, CODE_FOR_avx512f_vgetmantv8hf_mask_round, "__builtin_ia32_getmantsh_mask_round", IX86_BUILTIN_GETMANTSH_MASK_ROUND, UNKNOWN, (int) V8HF_FTYPE_V8HF_V8HF_INT_V8HF_UQI_INT)
 
 BDESC_END (ROUND_ARGS, MULTI_ARG)
 
diff --git a/gcc/config/i386/i386-expand.c b/gcc/config/i386/i386-expand.c
index 655234cbdd0..266aa411ddb 100644
--- a/gcc/config/i386/i386-expand.c
+++ b/gcc/config/i386/i386-expand.c
@@ -9735,6 +9735,9 @@ ix86_expand_args_builtin (const struct builtin_description *d,
     case HI_FTYPE_V16SF_INT_UHI:
     case QI_FTYPE_V8SF_INT_UQI:
     case QI_FTYPE_V4SF_INT_UQI:
+    case QI_FTYPE_V8HF_INT_UQI:
+    case HI_FTYPE_V16HF_INT_UHI:
+    case SI_FTYPE_V32HF_INT_USI:
     case V4SI_FTYPE_V4SI_V4SI_UHI:
     case V8SI_FTYPE_V8SI_V8SI_UHI:
       nargs = 3;
@@ -10056,8 +10059,10 @@ ix86_expand_args_builtin (const struct builtin_description *d,
 	case CODE_FOR_avx_vpermilv4df_mask:
 	case CODE_FOR_avx512f_getmantv8df_mask:
 	case CODE_FOR_avx512f_getmantv16sf_mask:
+	case CODE_FOR_avx512vl_getmantv16hf_mask:
 	case CODE_FOR_avx512vl_getmantv8sf_mask:
 	case CODE_FOR_avx512vl_getmantv4df_mask:
+	case CODE_FOR_avx512fp16_getmantv8hf_mask:
 	case CODE_FOR_avx512vl_getmantv4sf_mask:
 	case CODE_FOR_avx512vl_getmantv2df_mask:
 	case CODE_FOR_avx512dq_rangepv8df_mask_round:
@@ -10593,10 +10598,12 @@ ix86_expand_round_builtin (const struct builtin_description *d,
     {
     case CODE_FOR_avx512f_getmantv8df_mask_round:
    case CODE_FOR_avx512f_getmantv16sf_mask_round:
+    case CODE_FOR_avx512bw_getmantv32hf_mask_round:
     case CODE_FOR_avx512f_vgetmantv2df_round:
     case CODE_FOR_avx512f_vgetmantv2df_mask_round:
     case CODE_FOR_avx512f_vgetmantv4sf_round:
     case CODE_FOR_avx512f_vgetmantv4sf_mask_round:
+    case CODE_FOR_avx512f_vgetmantv8hf_mask_round:
       error ("the immediate argument must be a 4-bit immediate");
       return const0_rtx;
     case CODE_FOR_avx512f_cmpv8df3_mask_round:
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index f43651a95ce..c4db778e25d 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -701,7 +701,8 @@ (define_mode_attr ssequarterinsnmode
   [(V16SF "V4SF") (V8DF "V2DF") (V16SI "TI") (V8DI "TI")])
 
 (define_mode_attr vecmemsuffix
-  [(V16SF "{z}") (V8SF "{y}") (V4SF "{x}")
"{z}") (V8SF "{y}") (V4SF "{x}") + [(V32HF "{z}") (V16HF "{y}") (V8HF "{x}") + (V16SF "{z}") (V8SF "{y}") (V4SF "{x}") (V8DF "{z}") (V4DF "{y}") (V2DF "{x}")]) (define_mode_attr ssedoublemodelower @@ -10050,8 +10051,8 @@ (define_insn "_vternlog_mask" (set_attr "mode" "")]) (define_insn "_getexp" - [(set (match_operand:VF_AVX512VL 0 "register_operand" "=v") - (unspec:VF_AVX512VL [(match_operand:VF_AVX512VL 1 "" "")] + [(set (match_operand:VFH_AVX512VL 0 "register_operand" "=v") + (unspec:VFH_AVX512VL [(match_operand:VFH_AVX512VL 1 "" "")] UNSPEC_GETEXP))] "TARGET_AVX512F" "vgetexp\t{%1, %0|%0, %1}"; @@ -10059,11 +10060,11 @@ (define_insn "_getexp" (set_attr "mode" "")]) (define_insn "avx512f_sgetexp" - [(set (match_operand:VF_128 0 "register_operand" "=v") - (vec_merge:VF_128 - (unspec:VF_128 - [(match_operand:VF_128 1 "register_operand" "v") - (match_operand:VF_128 2 "" "")] + [(set (match_operand:VFH_128 0 "register_operand" "=v") + (vec_merge:VFH_128 + (unspec:VFH_128 + [(match_operand:VFH_128 1 "register_operand" "v") + (match_operand:VFH_128 2 "" "")] UNSPEC_GETEXP) (match_dup 1) (const_int 1)))] @@ -23571,10 +23572,10 @@ (define_insn "avx512dq_ranges (define_insn "avx512dq_fpclass" [(set (match_operand: 0 "register_operand" "=k") (unspec: - [(match_operand:VF_AVX512VL 1 "vector_operand" "vm") + [(match_operand:VFH_AVX512VL 1 "vector_operand" "vm") (match_operand 2 "const_0_to_255_operand" "n")] UNSPEC_FPCLASS))] - "TARGET_AVX512DQ" + "TARGET_AVX512DQ || VALID_AVX512FP16_REG_MODE(mode)" "vfpclass\t{%2, %1, %0|%0, %1, %2}"; [(set_attr "type" "sse") (set_attr "length_immediate" "1") @@ -23585,11 +23586,11 @@ (define_insn "avx512dq_vmfpclass" [(set (match_operand: 0 "register_operand" "=k") (and: (unspec: - [(match_operand:VF_128 1 "nonimmediate_operand" "vm") + [(match_operand:VFH_128 1 "nonimmediate_operand" "vm") (match_operand 2 "const_0_to_255_operand" "n")] UNSPEC_FPCLASS) (const_int 1)))] - "TARGET_AVX512DQ" + "TARGET_AVX512DQ || VALID_AVX512FP16_REG_MODE(mode)" "vfpclass\t{%2, %1, %0|%0, %1, %2}"; [(set_attr "type" "sse") (set_attr "length_immediate" "1") @@ -23597,9 +23598,9 @@ (define_insn "avx512dq_vmfpclass" (set_attr "mode" "")]) (define_insn "_getmant" - [(set (match_operand:VF_AVX512VL 0 "register_operand" "=v") - (unspec:VF_AVX512VL - [(match_operand:VF_AVX512VL 1 "nonimmediate_operand" "") + [(set (match_operand:VFH_AVX512VL 0 "register_operand" "=v") + (unspec:VFH_AVX512VL + [(match_operand:VFH_AVX512VL 1 "nonimmediate_operand" "") (match_operand:SI 2 "const_0_to_15_operand")] UNSPEC_GETMANT))] "TARGET_AVX512F" @@ -23608,11 +23609,11 @@ (define_insn "_getmant" (set_attr "mode" "")]) (define_insn "avx512f_vgetmant" - [(set (match_operand:VF_128 0 "register_operand" "=v") - (vec_merge:VF_128 - (unspec:VF_128 - [(match_operand:VF_128 1 "register_operand" "v") - (match_operand:VF_128 2 "" "") + [(set (match_operand:VFH_128 0 "register_operand" "=v") + (vec_merge:VFH_128 + (unspec:VFH_128 + [(match_operand:VFH_128 1 "register_operand" "v") + (match_operand:VFH_128 2 "" "") (match_operand:SI 3 "const_0_to_15_operand")] UNSPEC_GETMANT) (match_dup 1) diff --git a/gcc/testsuite/gcc.target/i386/avx-1.c b/gcc/testsuite/gcc.target/i386/avx-1.c index 4c8e54e4c2a..b3cffa0644f 100644 --- a/gcc/testsuite/gcc.target/i386/avx-1.c +++ b/gcc/testsuite/gcc.target/i386/avx-1.c @@ -713,10 +713,20 @@ #define __builtin_ia32_vrndscaleph_v8hf_mask(A, B, C, D) __builtin_ia32_vrndscaleph_v8hf_mask(A, 123, C, D) #define __builtin_ia32_vrndscaleph_v16hf_mask(A, B, C, D) 
 #define __builtin_ia32_vrndscalesh_v8hf_mask_round(A, B, C, D, E, F) __builtin_ia32_vrndscalesh_v8hf_mask_round(A, B, 123, D, E, 8)
+#define __builtin_ia32_fpclassph512_mask(A, D, C) __builtin_ia32_fpclassph512_mask(A, 1, C)
+#define __builtin_ia32_fpclasssh_mask(A, D, U) __builtin_ia32_fpclasssh_mask(A, 1, U)
+#define __builtin_ia32_getexpph512_mask(A, B, C, D) __builtin_ia32_getexpph512_mask(A, B, C, 8)
+#define __builtin_ia32_getexpsh_mask_round(A, B, C, D, E) __builtin_ia32_getexpsh_mask_round(A, B, C, D, 4)
+#define __builtin_ia32_getmantph512_mask(A, F, C, D, E) __builtin_ia32_getmantph512_mask(A, 1, C, D, 8)
+#define __builtin_ia32_getmantsh_mask_round(A, B, C, W, U, D) __builtin_ia32_getmantsh_mask_round(A, B, 1, W, U, 4)
 
 /* avx512fp16vlintrin.h */
 #define __builtin_ia32_vcmpph_v8hf_mask(A, B, C, D) __builtin_ia32_vcmpph_v8hf_mask(A, B, 1, D)
 #define __builtin_ia32_vcmpph_v16hf_mask(A, B, C, D) __builtin_ia32_vcmpph_v16hf_mask(A, B, 1, D)
+#define __builtin_ia32_fpclassph256_mask(A, D, C) __builtin_ia32_fpclassph256_mask(A, 1, C)
+#define __builtin_ia32_fpclassph128_mask(A, D, C) __builtin_ia32_fpclassph128_mask(A, 1, C)
+#define __builtin_ia32_getmantph256_mask(A, E, C, D) __builtin_ia32_getmantph256_mask(A, 1, C, D)
+#define __builtin_ia32_getmantph128_mask(A, E, C, D) __builtin_ia32_getmantph128_mask(A, 1, C, D)
 
 /* vpclmulqdqintrin.h */
 #define __builtin_ia32_vpclmulqdq_v4di(A, B, C)  __builtin_ia32_vpclmulqdq_v4di(A, B, 1)
diff --git a/gcc/testsuite/gcc.target/i386/sse-13.c b/gcc/testsuite/gcc.target/i386/sse-13.c
index 044d427c932..67ef567e437 100644
--- a/gcc/testsuite/gcc.target/i386/sse-13.c
+++ b/gcc/testsuite/gcc.target/i386/sse-13.c
@@ -730,10 +730,20 @@
 #define __builtin_ia32_vrndscaleph_v8hf_mask(A, B, C, D) __builtin_ia32_vrndscaleph_v8hf_mask(A, 123, C, D)
 #define __builtin_ia32_vrndscaleph_v16hf_mask(A, B, C, D) __builtin_ia32_vrndscaleph_v16hf_mask(A, 123, C, D)
 #define __builtin_ia32_vrndscalesh_v8hf_mask_round(A, B, C, D, E, F) __builtin_ia32_vrndscalesh_v8hf_mask_round(A, B, 123, D, E, 8)
+#define __builtin_ia32_fpclassph512_mask(A, D, C) __builtin_ia32_fpclassph512_mask(A, 1, C)
+#define __builtin_ia32_fpclasssh_mask(A, D, U) __builtin_ia32_fpclasssh_mask(A, 1, U)
+#define __builtin_ia32_getexpph512_mask(A, B, C, D) __builtin_ia32_getexpph512_mask(A, B, C, 8)
+#define __builtin_ia32_getexpsh_mask_round(A, B, C, D, E) __builtin_ia32_getexpsh_mask_round(A, B, C, D, 4)
+#define __builtin_ia32_getmantph512_mask(A, F, C, D, E) __builtin_ia32_getmantph512_mask(A, 1, C, D, 8)
+#define __builtin_ia32_getmantsh_mask_round(A, B, C, W, U, D) __builtin_ia32_getmantsh_mask_round(A, B, 1, W, U, 4)
 
 /* avx512fp16vlintrin.h */
 #define __builtin_ia32_vcmpph_v8hf_mask(A, B, C, D) __builtin_ia32_vcmpph_v8hf_mask(A, B, 1, D)
 #define __builtin_ia32_vcmpph_v16hf_mask(A, B, C, D) __builtin_ia32_vcmpph_v16hf_mask(A, B, 1, D)
+#define __builtin_ia32_fpclassph256_mask(A, D, C) __builtin_ia32_fpclassph256_mask(A, 1, C)
+#define __builtin_ia32_fpclassph128_mask(A, D, C) __builtin_ia32_fpclassph128_mask(A, 1, C)
+#define __builtin_ia32_getmantph256_mask(A, E, C, D) __builtin_ia32_getmantph256_mask(A, 1, C, D)
+#define __builtin_ia32_getmantph128_mask(A, E, C, D) __builtin_ia32_getmantph128_mask(A, 1, C, D)
 
 /* vpclmulqdqintrin.h */
 #define __builtin_ia32_vpclmulqdq_v4di(A, B, C)  __builtin_ia32_vpclmulqdq_v4di(A, B, 1)
diff --git a/gcc/testsuite/gcc.target/i386/sse-14.c b/gcc/testsuite/gcc.target/i386/sse-14.c
index b7ffdf7e1df..04163874f90 100644
--- a/gcc/testsuite/gcc.target/i386/sse-14.c
+++ b/gcc/testsuite/gcc.target/i386/sse-14.c
@@ -677,8 +677,11 @@ test_1 (_mm512_reduce_ph, __m512h, __m512h, 123)
 test_1 (_mm_roundscale_ph, __m128h, __m128h, 123)
 test_1 (_mm256_roundscale_ph, __m256h, __m256h, 123)
 test_1 (_mm512_roundscale_ph, __m512h, __m512h, 123)
+test_1 (_mm512_getexp_round_ph, __m512h, __m512h, 8)
 test_1x (_mm512_reduce_round_ph, __m512h, __m512h, 123, 8)
 test_1x (_mm512_roundscale_round_ph, __m512h, __m512h, 123, 8)
+test_1x (_mm512_getmant_ph, __m512h, __m512h, 1, 1)
+test_1y (_mm512_getmant_round_ph, __m512h, __m512h, 1, 1, 8)
 test_2 (_mm512_add_round_ph, __m512h, __m512h, __m512h, 8)
 test_2 (_mm512_sub_round_ph, __m512h, __m512h, __m512h, 8)
 test_2 (_mm512_mul_round_ph, __m512h, __m512h, __m512h, 8)
@@ -705,6 +708,8 @@ test_2 (_mm_maskz_roundscale_ph, __m128h, __mmask8, __m128h, 123)
 test_2 (_mm256_maskz_roundscale_ph, __m256h, __mmask16, __m256h, 123)
 test_2 (_mm512_maskz_roundscale_ph, __m512h, __mmask32, __m512h, 123)
 test_2 (_mm_roundscale_sh, __m128h, __m128h, __m128h, 123)
+test_2 (_mm512_maskz_getexp_round_ph, __m512h, __mmask32, __m512h, 8)
+test_2 (_mm_getexp_round_sh, __m128h, __m128h, __m128h, 8)
 test_2x (_mm512_cmp_round_ph_mask, __mmask32, __m512h, __m512h, 1, 8)
 test_2x (_mm_cmp_round_sh_mask, __mmask8, __m128h, __m128h, 1, 8)
 test_2x (_mm_comi_round_sh, int, __m128h, __m128h, 1, 8)
@@ -712,6 +717,10 @@ test_2x (_mm512_maskz_reduce_round_ph, __m512h, __mmask32, __m512h, 123, 8)
 test_2x (_mm512_maskz_roundscale_round_ph, __m512h, __mmask32, __m512h, 123, 8)
 test_2x (_mm_reduce_round_sh, __m128h, __m128h, __m128h, 123, 8)
 test_2x (_mm_roundscale_round_sh, __m128h, __m128h, __m128h, 123, 8)
+test_2x (_mm512_maskz_getmant_ph, __m512h, __mmask32, __m512h, 1, 1)
+test_2x (_mm_getmant_sh, __m128h, __m128h, __m128h, 1, 1)
+test_2y (_mm512_maskz_getmant_round_ph, __m512h, __mmask32, __m512h, 1, 1, 8)
+test_2y (_mm_getmant_round_sh, __m128h, __m128h, __m128h, 1, 1, 8)
 test_3 (_mm512_maskz_add_round_ph, __m512h, __mmask32, __m512h, __m512h, 8)
 test_3 (_mm512_maskz_sub_round_ph, __m512h, __mmask32, __m512h, __m512h, 8)
 test_3 (_mm512_maskz_mul_round_ph, __m512h, __mmask32, __m512h, __m512h, 8)
@@ -737,12 +746,18 @@ test_3 (_mm_mask_roundscale_ph, __m128h, __m128h, __mmask8, __m128h, 123)
 test_3 (_mm256_mask_roundscale_ph, __m256h, __m256h, __mmask16, __m256h, 123)
 test_3 (_mm512_mask_roundscale_ph, __m512h, __m512h, __mmask32, __m512h, 123)
 test_3 (_mm_maskz_roundscale_sh, __m128h, __mmask8, __m128h, __m128h, 123)
+test_3 (_mm_maskz_getexp_round_sh, __m128h, __mmask8, __m128h, __m128h, 8)
+test_3 (_mm512_mask_getexp_round_ph, __m512h, __m512h, __mmask32, __m512h, 8)
 test_3x (_mm512_mask_cmp_round_ph_mask, __mmask32, __mmask32, __m512h, __m512h, 1, 8)
 test_3x (_mm_mask_cmp_round_sh_mask, __mmask8, __mmask8, __m128h, __m128h, 1, 8)
 test_3x (_mm512_mask_reduce_round_ph, __m512h, __m512h, __mmask32, __m512h, 123, 8)
 test_3x (_mm512_mask_roundscale_round_ph, __m512h, __m512h, __mmask32, __m512h, 123, 8)
 test_3x (_mm_maskz_reduce_round_sh, __m128h, __mmask8, __m128h, __m128h, 123, 8)
 test_3x (_mm_maskz_roundscale_round_sh, __m128h, __mmask8, __m128h, __m128h, 123, 8)
+test_3x (_mm512_mask_getmant_ph, __m512h, __m512h, __mmask32, __m512h, 1, 1)
+test_3x (_mm_maskz_getmant_sh, __m128h, __mmask8, __m128h, __m128h, 1, 1)
+test_3y (_mm_maskz_getmant_round_sh, __m128h, __mmask8, __m128h, __m128h, 1, 1, 8)
+test_3y (_mm512_mask_getmant_round_ph, __m512h, __m512h, __mmask32, __m512h, 1, 1, 8)
 test_4 (_mm512_mask_add_round_ph, __m512h, __m512h, __mmask32, __m512h, __m512h, 8)
 test_4 (_mm512_mask_sub_round_ph, __m512h, __m512h, __mmask32, __m512h, __m512h, 8)
 test_4 (_mm512_mask_mul_round_ph, __m512h, __m512h, __mmask32, __m512h, __m512h, 8)
@@ -760,8 +775,11 @@ test_4 (_mm512_mask_scalef_round_ph, __m512h, __m512h, __mmask32, __m512h, __m51
 test_4 (_mm_mask_scalef_round_sh, __m128h, __m128h, __mmask8, __m128h, __m128h, 8)
 test_4 (_mm_mask_reduce_sh, __m128h, __m128h, __mmask8, __m128h, __m128h, 123)
 test_4 (_mm_mask_roundscale_sh, __m128h, __m128h, __mmask8, __m128h, __m128h, 123)
+test_4 (_mm_mask_getexp_round_sh, __m128h, __m128h, __mmask8, __m128h, __m128h, 8)
 test_4x (_mm_mask_reduce_round_sh, __m128h, __m128h, __mmask8, __m128h, __m128h, 123, 8)
 test_4x (_mm_mask_roundscale_round_sh, __m128h, __m128h, __mmask8, __m128h, __m128h, 123, 8)
+test_4x (_mm_mask_getmant_sh, __m128h, __m128h, __mmask8, __m128h, __m128h, 1, 1)
+test_4y (_mm_mask_getmant_round_sh, __m128h, __m128h, __mmask8, __m128h, __m128h, 1, 1, 8)
 
 /* avx512fp16vlintrin.h */
 test_2 (_mm_cmp_ph_mask, __mmask8, __m128h, __m128h, 1)
diff --git a/gcc/testsuite/gcc.target/i386/sse-22.c b/gcc/testsuite/gcc.target/i386/sse-22.c
index 5dbe8cba5ea..008600a393d 100644
--- a/gcc/testsuite/gcc.target/i386/sse-22.c
+++ b/gcc/testsuite/gcc.target/i386/sse-22.c
@@ -782,8 +782,11 @@ test_1 (_mm512_reduce_ph, __m512h, __m512h, 123)
 test_1 (_mm_roundscale_ph, __m128h, __m128h, 123)
 test_1 (_mm256_roundscale_ph, __m256h, __m256h, 123)
 test_1 (_mm512_roundscale_ph, __m512h, __m512h, 123)
+test_1 (_mm512_getexp_round_ph, __m512h, __m512h, 8)
 test_1x (_mm512_reduce_round_ph, __m512h, __m512h, 123, 8)
 test_1x (_mm512_roundscale_round_ph, __m512h, __m512h, 123, 8)
+test_1x (_mm512_getmant_ph, __m512h, __m512h, 1, 1)
+test_1y (_mm512_getmant_round_ph, __m512h, __m512h, 1, 1, 8)
 test_2 (_mm512_add_round_ph, __m512h, __m512h, __m512h, 8)
 test_2 (_mm512_sub_round_ph, __m512h, __m512h, __m512h, 8)
 test_2 (_mm512_mul_round_ph, __m512h, __m512h, __m512h, 8)
@@ -809,6 +812,8 @@ test_2 (_mm_maskz_roundscale_ph, __m128h, __mmask8, __m128h, 123)
 test_2 (_mm256_maskz_roundscale_ph, __m256h, __mmask16, __m256h, 123)
 test_2 (_mm512_maskz_roundscale_ph, __m512h, __mmask32, __m512h, 123)
 test_2 (_mm_roundscale_sh, __m128h, __m128h, __m128h, 123)
+test_2 (_mm512_maskz_getexp_round_ph, __m512h, __mmask32, __m512h, 8)
+test_2 (_mm_getexp_round_sh, __m128h, __m128h, __m128h, 8)
 test_2x (_mm512_cmp_round_ph_mask, __mmask32, __m512h, __m512h, 1, 8)
 test_2x (_mm_cmp_round_sh_mask, __mmask8, __m128h, __m128h, 1, 8)
 test_2x (_mm_comi_round_sh, int, __m128h, __m128h, 1, 8)
@@ -816,6 +821,10 @@ test_2x (_mm512_maskz_reduce_round_ph, __m512h, __mmask32, __m512h, 123, 8)
 test_2x (_mm512_maskz_roundscale_round_ph, __m512h, __mmask32, __m512h, 123, 8)
 test_2x (_mm_reduce_round_sh, __m128h, __m128h, __m128h, 123, 8)
 test_2x (_mm_roundscale_round_sh, __m128h, __m128h, __m128h, 123, 8)
+test_2x (_mm512_maskz_getmant_ph, __m512h, __mmask32, __m512h, 1, 1)
+test_2x (_mm_getmant_sh, __m128h, __m128h, __m128h, 1, 1)
+test_2y (_mm512_maskz_getmant_round_ph, __m512h, __mmask32, __m512h, 1, 1, 8)
+test_2y (_mm_getmant_round_sh, __m128h, __m128h, __m128h, 1, 1, 8)
 test_3 (_mm512_maskz_add_round_ph, __m512h, __mmask32, __m512h, __m512h, 8)
 test_3 (_mm512_maskz_sub_round_ph, __m512h, __mmask32, __m512h, __m512h, 8)
 test_3 (_mm512_maskz_mul_round_ph, __m512h, __mmask32, __m512h, __m512h, 8)
@@ -840,12 +849,18 @@ test_3 (_mm_mask_roundscale_ph, __m128h, __m128h, __mmask8, __m128h, 123)
 test_3 (_mm256_mask_roundscale_ph, __m256h, __m256h, __mmask16, __m256h, 123)
 test_3 (_mm512_mask_roundscale_ph, __m512h, __m512h, __mmask32, __m512h, 123)
 test_3 (_mm_maskz_roundscale_sh, __m128h, __mmask8, __m128h, __m128h, 123)
+test_3 (_mm_maskz_getexp_round_sh, __m128h, __mmask8, __m128h, __m128h, 8)
+test_3 (_mm512_mask_getexp_round_ph, __m512h, __m512h, __mmask32, __m512h, 8)
 test_3x (_mm512_mask_cmp_round_ph_mask, __mmask32, __mmask32, __m512h, __m512h, 1, 8)
 test_3x (_mm_mask_cmp_round_sh_mask, __mmask8, __mmask8, __m128h, __m128h, 1, 8)
 test_3x (_mm512_mask_reduce_round_ph, __m512h, __m512h, __mmask32, __m512h, 123, 8)
 test_3x (_mm512_mask_roundscale_round_ph, __m512h, __m512h, __mmask32, __m512h, 123, 8)
 test_3x (_mm_maskz_reduce_round_sh, __m128h, __mmask8, __m128h, __m128h, 123, 8)
 test_3x (_mm_maskz_roundscale_round_sh, __m128h, __mmask8, __m128h, __m128h, 123, 8)
+test_3x (_mm512_mask_getmant_ph, __m512h, __m512h, __mmask32, __m512h, 1, 1)
+test_3x (_mm_maskz_getmant_sh, __m128h, __mmask8, __m128h, __m128h, 1, 1)
+test_3y (_mm_maskz_getmant_round_sh, __m128h, __mmask8, __m128h, __m128h, 1, 1, 8)
+test_3y (_mm512_mask_getmant_round_ph, __m512h, __m512h, __mmask32, __m512h, 1, 1, 8)
 test_4 (_mm512_mask_add_round_ph, __m512h, __m512h, __mmask32, __m512h, __m512h, 8)
 test_4 (_mm512_mask_sub_round_ph, __m512h, __m512h, __mmask32, __m512h, __m512h, 8)
 test_4 (_mm512_mask_mul_round_ph, __m512h, __m512h, __mmask32, __m512h, __m512h, 8)
@@ -862,8 +877,11 @@ test_4 (_mm_mask_sqrt_round_sh, __m128h, __m128h, __mmask8, __m128h, __m128h, 8)
 test_4 (_mm512_mask_scalef_round_ph, __m512h, __m512h, __mmask32, __m512h, __m512h, 8)
 test_4 (_mm_mask_reduce_sh, __m128h, __m128h, __mmask8, __m128h, __m128h, 123)
 test_4 (_mm_mask_roundscale_sh, __m128h, __m128h, __mmask8, __m128h, __m128h, 123)
+test_4 (_mm_mask_getexp_round_sh, __m128h, __m128h, __mmask8, __m128h, __m128h, 8)
 test_4x (_mm_mask_reduce_round_sh, __m128h, __m128h, __mmask8, __m128h, __m128h, 123, 8)
 test_4x (_mm_mask_roundscale_round_sh, __m128h, __m128h, __mmask8, __m128h, __m128h, 123, 8)
+test_4x (_mm_mask_getmant_sh, __m128h, __m128h, __mmask8, __m128h, __m128h, 1, 1)
+test_4y (_mm_mask_getmant_round_sh, __m128h, __m128h, __mmask8, __m128h, __m128h, 1, 1, 8)
 
 /* avx512fp16vlintrin.h */
 test_2 (_mm_cmp_ph_mask, __mmask8, __m128h, __m128h, 1)
diff --git a/gcc/testsuite/gcc.target/i386/sse-23.c b/gcc/testsuite/gcc.target/i386/sse-23.c
index 2d968f07bc8..b3f07587acb 100644
--- a/gcc/testsuite/gcc.target/i386/sse-23.c
+++ b/gcc/testsuite/gcc.target/i386/sse-23.c
@@ -731,10 +731,20 @@
 #define __builtin_ia32_vrndscaleph_v8hf_mask(A, B, C, D) __builtin_ia32_vrndscaleph_v8hf_mask(A, 123, C, D)
 #define __builtin_ia32_vrndscaleph_v16hf_mask(A, B, C, D) __builtin_ia32_vrndscaleph_v16hf_mask(A, 123, C, D)
 #define __builtin_ia32_vrndscalesh_v8hf_mask_round(A, B, C, D, E, F) __builtin_ia32_vrndscalesh_v8hf_mask_round(A, B, 123, D, E, 8)
+#define __builtin_ia32_fpclassph512_mask(A, D, C) __builtin_ia32_fpclassph512_mask(A, 1, C)
+#define __builtin_ia32_fpclasssh_mask(A, D, U) __builtin_ia32_fpclasssh_mask(A, 1, U)
+#define __builtin_ia32_getexpph512_mask(A, B, C, D) __builtin_ia32_getexpph512_mask(A, B, C, 8)
+#define __builtin_ia32_getexpsh_mask_round(A, B, C, D, E) __builtin_ia32_getexpsh_mask_round(A, B, C, D, 4)
+#define __builtin_ia32_getmantph512_mask(A, F, C, D, E) __builtin_ia32_getmantph512_mask(A, 1, C, D, 8)
+#define __builtin_ia32_getmantsh_mask_round(A, B, C, W, U, D) __builtin_ia32_getmantsh_mask_round(A, B, 1, W, U, 4)
 
 /* avx512fp16vlintrin.h */
 #define __builtin_ia32_vcmpph_v8hf_mask(A, B, C, D) __builtin_ia32_vcmpph_v8hf_mask(A, B, 1, D)
 #define __builtin_ia32_vcmpph_v16hf_mask(A, B, C, D) __builtin_ia32_vcmpph_v16hf_mask(A, B, 1, D)
+#define __builtin_ia32_fpclassph256_mask(A, D, C) __builtin_ia32_fpclassph256_mask(A, 1, C)
+#define __builtin_ia32_fpclassph128_mask(A, D, C) __builtin_ia32_fpclassph128_mask(A, 1, C)
+#define __builtin_ia32_getmantph256_mask(A, E, C, D) __builtin_ia32_getmantph256_mask(A, 1, C, D)
+#define __builtin_ia32_getmantph128_mask(A, E, C, D) __builtin_ia32_getmantph128_mask(A, 1, C, D)
 
 /* vpclmulqdqintrin.h */
 #define __builtin_ia32_vpclmulqdq_v4di(A, B, C)  __builtin_ia32_vpclmulqdq_v4di(A, B, 1)
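The VL (128/256-bit) variants reuse the same immediate encodings; a short
sketch (again illustrative only, not part of the patch; the function names
are made up, and it assumes -mavx512fp16 -mavx512vl):

#include <immintrin.h>

/* 256-bit getmant: same (sign << 2) | interval immediate layout as the
   512-bit and scalar forms; here keep the source sign and normalize the
   mantissa into [0.5, 1).  */
__m256h
mant256 (__m256h x)
{
  return _mm256_getmant_ph (x, _MM_MANT_NORM_p5_1, _MM_MANT_SIGN_src);
}

/* Masked 128-bit fpclass: test only the lanes enabled in m for denormal
   values (immediate bit 5); disabled lanes come back as 0 in the mask.  */
__mmask8
denormal_lanes (__mmask8 m, __m128h x)
{
  return _mm_mask_fpclass_ph_mask (m, x, 0x20);
}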