From patchwork Tue Aug 8 07:19:47 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Jiang, Haochen" X-Patchwork-Id: 1818539 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.a=rsa-sha256 header.s=default header.b=YakC49ZA; dkim-atps=neutral Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-384) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4RKl2W6mqmz1yYl for ; Tue, 8 Aug 2023 17:20:19 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id B85C0385800A for ; Tue, 8 Aug 2023 07:20:16 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org B85C0385800A DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1691479216; bh=pVZzaMRbujcAhK8Cy55Ju8g+4MXqf80iVbbuDv9zfP4=; h=To:Cc:Subject:Date:In-Reply-To:References:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=YakC49ZALCxud8qhfNU181/8MQTtbtyCW7XSTCVSLXnQKniUr103O5w7pMClHuVhK l/z1urejgqsmHKrJiBoZ04NTpB3nVadIAYefTUjxSce/habfPCEga4kVxg0oCMxXaN iyWR13QZeVmQhfR2jQM0nrUOKuF70FNxEmuQwac4= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.120]) by sourceware.org (Postfix) with ESMTPS id 562F53858417 for ; Tue, 8 Aug 2023 07:19:54 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 562F53858417 X-IronPort-AV: E=McAfee;i="6600,9927,10795"; a="369642481" X-IronPort-AV: E=Sophos;i="6.01,263,1684825200"; d="scan'208";a="369642481" Received: from fmsmga001.fm.intel.com ([10.253.24.23]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 08 Aug 2023 00:19:53 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.01,202,1684825200"; d="scan'208";a="874620608" Received: from shvmail03.sh.intel.com ([10.239.245.20]) by fmsmga001.fm.intel.com with ESMTP; 08 Aug 2023 00:19:51 -0700 Received: from shliclel4217.sh.intel.com (shliclel4217.sh.intel.com [10.239.240.127]) by shvmail03.sh.intel.com (Postfix) with ESMTP id C1A4E1005608; Tue, 8 Aug 2023 15:19:47 +0800 (CST) To: gcc-patches@gcc.gnu.org Cc: ubizjak@gmail.com, hongtao.liu@intel.com Subject: [PATCH 1/6] Support AVX10.1 for AVX512DQ+AVX512VL intrins Date: Tue, 8 Aug 2023 15:19:47 +0800 Message-Id: <20230808071947.1570053-1-haochen.jiang@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20230808071312.1569559-1-haochen.jiang@intel.com> References: <20230808071312.1569559-1-haochen.jiang@intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-10.8 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Haochen Jiang via Gcc-patches From: "Jiang, Haochen" Reply-To: Haochen Jiang Errors-To: gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org Sender: "Gcc-patches" gcc/ChangeLog: * config/i386/avx512vldqintrin.h: Remove target attribute. * config/i386/i386-builtin.def (BDESC): Add OPTION_MASK_ISA2_AVX10_1. * config/i386/i386-builtins.cc (def_builtin): Handle AVX10_1. * config/i386/i386-expand.cc (ix86_check_builtin_isa_match): Ditto. (ix86_expand_sse2_mulvxdi3): Add TARGET_AVX10_1. * config/i386/i386.md: Add new isa attribute avx10_1_or_avx512dq and avx10_1_or_avx512vl. * config/i386/sse.md: (VF2_AVX512VLDQ_AVX10_1): New. (VF1_128_256VLDQ_AVX10_1): Ditto. (VI8_AVX512VLDQ_AVX10_1): Ditto. (_andnot3): Add TARGET_AVX10_1 and change isa attr from avx512dq to avx10_1_or_avx512dq. (*andnot3): Add TARGET_AVX10_1 and change isa attr from avx512vl to avx10_1_or_avx512vl. (fix_trunc2): Change iterator to VF2_AVX512VLDQ_AVX10_1. Remove target check. (fix_notrunc2): Ditto. (ufix_notrunc2): Ditto. (fix_trunc2): Change iterator to VF1_128_256VLDQ_AVX10_1. Remove target check. (avx512dq_fix_truncv2sfv2di2): Add TARGET_AVX10_1. (fix_truncv2sfv2di2): Ditto. (cond_mul): Change iterator to VI8_AVX10_1_AVX512DQVL. Remove target check. (avx512dq_mul3): Ditto. (*avx512dq_mul3): Ditto. (VI4F_BRCST32x2): Add TARGET_AVX512DQ and TARGET_AVX10_1. (avx512dq_broadcast): Remove target check. (VI8F_BRCST64x2): Add TARGET_AVX512DQ and TARGET_AVX10_1. (avx512dq_broadcast_1): Remove target check. * config/i386/subst.md (mask_mode512bit_condition): Add TARGET_AVX10_1. (mask_avx512vl_condition): Ditto. (mask): Ditto. gcc/testsuite/ChangeLog: * gcc.target/i386/avx-1.c: Add -mavx10.1. * gcc.target/i386/avx-2.c: Ditto. * gcc.target/i386/sse-26.c: Skip AVX512VLDQ intrin file. --- gcc/config/i386/avx512vldqintrin.h | 12 ++-- gcc/config/i386/i386-builtin.def | 46 ++++++------ gcc/config/i386/i386-builtins.cc | 9 +-- gcc/config/i386/i386-expand.cc | 8 ++- gcc/config/i386/i386.md | 7 +- gcc/config/i386/sse.md | 97 ++++++++++++++++---------- gcc/config/i386/subst.md | 7 +- gcc/testsuite/gcc.target/i386/avx-1.c | 2 +- gcc/testsuite/gcc.target/i386/avx-2.c | 2 +- gcc/testsuite/gcc.target/i386/sse-26.c | 6 ++ 10 files changed, 117 insertions(+), 79 deletions(-) diff --git a/gcc/config/i386/avx512vldqintrin.h b/gcc/config/i386/avx512vldqintrin.h index be4d59c34e4..4b8006f7b73 100644 --- a/gcc/config/i386/avx512vldqintrin.h +++ b/gcc/config/i386/avx512vldqintrin.h @@ -28,12 +28,6 @@ #ifndef _AVX512VLDQINTRIN_H_INCLUDED #define _AVX512VLDQINTRIN_H_INCLUDED -#if !defined(__AVX512VL__) || !defined(__AVX512DQ__) -#pragma GCC push_options -#pragma GCC target("avx512vl,avx512dq") -#define __DISABLE_AVX512VLDQ__ -#endif /* __AVX512VLDQ__ */ - extern __inline __m256i __attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) _mm256_cvttpd_epi64 (__m256d __A) @@ -679,6 +673,12 @@ _mm_maskz_andnot_ps (__mmask8 __U, __m128 __A, __m128 __B) (__mmask8) __U); } +#if !defined(__AVX512VL__) || !defined(__AVX512DQ__) +#pragma GCC push_options +#pragma GCC target("avx512vl,avx512dq") +#define __DISABLE_AVX512VLDQ__ +#endif /* __AVX512VLDQ__ */ + extern __inline __m256i __attribute__ ((__gnu_inline__, __always_inline__, __artificial__)) _mm256_cvtps_epi64 (__m128 __A) diff --git a/gcc/config/i386/i386-builtin.def b/gcc/config/i386/i386-builtin.def index 8738b3b6a8a..18d8966f0de 100644 --- a/gcc/config/i386/i386-builtin.def +++ b/gcc/config/i386/i386-builtin.def @@ -1718,31 +1718,31 @@ BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_iorv4df3 BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_iorv2df3_mask, "__builtin_ia32_orpd128_mask", IX86_BUILTIN_ORPD128_MASK, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_V2DF_UQI) BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_iorv8sf3_mask, "__builtin_ia32_orps256_mask", IX86_BUILTIN_ORPS256_MASK, UNKNOWN, (int) V8SF_FTYPE_V8SF_V8SF_V8SF_UQI) BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_iorv4sf3_mask, "__builtin_ia32_orps128_mask", IX86_BUILTIN_ORPS128_MASK, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_V4SF_UQI) -BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512dq_broadcastv8sf_mask, "__builtin_ia32_broadcastf32x2_256_mask", IX86_BUILTIN_BROADCASTF32x2_256, UNKNOWN, (int) V8SF_FTYPE_V4SF_V8SF_UQI) -BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512dq_broadcastv8si_mask, "__builtin_ia32_broadcasti32x2_256_mask", IX86_BUILTIN_BROADCASTI32x2_256, UNKNOWN, (int) V8SI_FTYPE_V4SI_V8SI_UQI) -BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512dq_broadcastv4si_mask, "__builtin_ia32_broadcasti32x2_128_mask", IX86_BUILTIN_BROADCASTI32x2_128, UNKNOWN, (int) V4SI_FTYPE_V4SI_V4SI_UQI) -BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512dq_broadcastv4df_mask_1, "__builtin_ia32_broadcastf64x2_256_mask", IX86_BUILTIN_BROADCASTF64X2_256, UNKNOWN, (int) V4DF_FTYPE_V2DF_V4DF_UQI) -BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512dq_broadcastv4di_mask_1, "__builtin_ia32_broadcasti64x2_256_mask", IX86_BUILTIN_BROADCASTI64X2_256, UNKNOWN, (int) V4DI_FTYPE_V2DI_V4DI_UQI) +BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_avx512dq_broadcastv8sf_mask, "__builtin_ia32_broadcastf32x2_256_mask", IX86_BUILTIN_BROADCASTF32x2_256, UNKNOWN, (int) V8SF_FTYPE_V4SF_V8SF_UQI) +BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_avx512dq_broadcastv8si_mask, "__builtin_ia32_broadcasti32x2_256_mask", IX86_BUILTIN_BROADCASTI32x2_256, UNKNOWN, (int) V8SI_FTYPE_V4SI_V8SI_UQI) +BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_avx512dq_broadcastv4si_mask, "__builtin_ia32_broadcasti32x2_128_mask", IX86_BUILTIN_BROADCASTI32x2_128, UNKNOWN, (int) V4SI_FTYPE_V4SI_V4SI_UQI) +BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_avx512dq_broadcastv4df_mask_1, "__builtin_ia32_broadcastf64x2_256_mask", IX86_BUILTIN_BROADCASTF64X2_256, UNKNOWN, (int) V4DF_FTYPE_V2DF_V4DF_UQI) +BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_avx512dq_broadcastv4di_mask_1, "__builtin_ia32_broadcasti64x2_256_mask", IX86_BUILTIN_BROADCASTI64X2_256, UNKNOWN, (int) V4DI_FTYPE_V2DI_V4DI_UQI) BDESC (OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512vl_broadcastv8sf_mask_1, "__builtin_ia32_broadcastf32x4_256_mask", IX86_BUILTIN_BROADCASTF32X4_256, UNKNOWN, (int) V8SF_FTYPE_V4SF_V8SF_UQI) BDESC (OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512vl_broadcastv8si_mask_1, "__builtin_ia32_broadcasti32x4_256_mask", IX86_BUILTIN_BROADCASTI32X4_256, UNKNOWN, (int) V8SI_FTYPE_V4SI_V8SI_UQI) BDESC (OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512vl_vextractf128v8sf, "__builtin_ia32_extractf32x4_256_mask", IX86_BUILTIN_EXTRACTF32X4_256, UNKNOWN, (int) V4SF_FTYPE_V8SF_INT_V4SF_UQI) BDESC (OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512vl_vextractf128v8si, "__builtin_ia32_extracti32x4_256_mask", IX86_BUILTIN_EXTRACTI32X4_256, UNKNOWN, (int) V4SI_FTYPE_V8SI_INT_V4SI_UQI) BDESC (OPTION_MASK_ISA_AVX512BW | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512bw_dbpsadbwv16hi_mask, "__builtin_ia32_dbpsadbw256_mask", IX86_BUILTIN_DBPSADBW256, UNKNOWN, (int) V16HI_FTYPE_V32QI_V32QI_INT_V16HI_UHI) BDESC (OPTION_MASK_ISA_AVX512BW | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512bw_dbpsadbwv8hi_mask, "__builtin_ia32_dbpsadbw128_mask", IX86_BUILTIN_DBPSADBW128, UNKNOWN, (int) V8HI_FTYPE_V16QI_V16QI_INT_V8HI_UQI) -BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_fix_truncv4dfv4di2_mask, "__builtin_ia32_cvttpd2qq256_mask", IX86_BUILTIN_CVTTPD2QQ256, UNKNOWN, (int) V4DI_FTYPE_V4DF_V4DI_UQI) -BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_fix_truncv2dfv2di2_mask, "__builtin_ia32_cvttpd2qq128_mask", IX86_BUILTIN_CVTTPD2QQ128, UNKNOWN, (int) V2DI_FTYPE_V2DF_V2DI_UQI) -BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_fixuns_truncv4dfv4di2_mask, "__builtin_ia32_cvttpd2uqq256_mask", IX86_BUILTIN_CVTTPD2UQQ256, UNKNOWN, (int) V4DI_FTYPE_V4DF_V4DI_UQI) -BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_fixuns_truncv2dfv2di2_mask, "__builtin_ia32_cvttpd2uqq128_mask", IX86_BUILTIN_CVTTPD2UQQ128, UNKNOWN, (int) V2DI_FTYPE_V2DF_V2DI_UQI) -BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_fix_notruncv4dfv4di2_mask, "__builtin_ia32_cvtpd2qq256_mask", IX86_BUILTIN_CVTPD2QQ256, UNKNOWN, (int) V4DI_FTYPE_V4DF_V4DI_UQI) -BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_fix_notruncv2dfv2di2_mask, "__builtin_ia32_cvtpd2qq128_mask", IX86_BUILTIN_CVTPD2QQ128, UNKNOWN, (int) V2DI_FTYPE_V2DF_V2DI_UQI) -BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_fixuns_notruncv4dfv4di2_mask, "__builtin_ia32_cvtpd2uqq256_mask", IX86_BUILTIN_CVTPD2UQQ256, UNKNOWN, (int) V4DI_FTYPE_V4DF_V4DI_UQI) -BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_fixuns_notruncv2dfv2di2_mask, "__builtin_ia32_cvtpd2uqq128_mask", IX86_BUILTIN_CVTPD2UQQ128, UNKNOWN, (int) V2DI_FTYPE_V2DF_V2DI_UQI) +BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_fix_truncv4dfv4di2_mask, "__builtin_ia32_cvttpd2qq256_mask", IX86_BUILTIN_CVTTPD2QQ256, UNKNOWN, (int) V4DI_FTYPE_V4DF_V4DI_UQI) +BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_fix_truncv2dfv2di2_mask, "__builtin_ia32_cvttpd2qq128_mask", IX86_BUILTIN_CVTTPD2QQ128, UNKNOWN, (int) V2DI_FTYPE_V2DF_V2DI_UQI) +BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_fixuns_truncv4dfv4di2_mask, "__builtin_ia32_cvttpd2uqq256_mask", IX86_BUILTIN_CVTTPD2UQQ256, UNKNOWN, (int) V4DI_FTYPE_V4DF_V4DI_UQI) +BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_fixuns_truncv2dfv2di2_mask, "__builtin_ia32_cvttpd2uqq128_mask", IX86_BUILTIN_CVTTPD2UQQ128, UNKNOWN, (int) V2DI_FTYPE_V2DF_V2DI_UQI) +BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_fix_notruncv4dfv4di2_mask, "__builtin_ia32_cvtpd2qq256_mask", IX86_BUILTIN_CVTPD2QQ256, UNKNOWN, (int) V4DI_FTYPE_V4DF_V4DI_UQI) +BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_fix_notruncv2dfv2di2_mask, "__builtin_ia32_cvtpd2qq128_mask", IX86_BUILTIN_CVTPD2QQ128, UNKNOWN, (int) V2DI_FTYPE_V2DF_V2DI_UQI) +BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_fixuns_notruncv4dfv4di2_mask, "__builtin_ia32_cvtpd2uqq256_mask", IX86_BUILTIN_CVTPD2UQQ256, UNKNOWN, (int) V4DI_FTYPE_V4DF_V4DI_UQI) +BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_fixuns_notruncv2dfv2di2_mask, "__builtin_ia32_cvtpd2uqq128_mask", IX86_BUILTIN_CVTPD2UQQ128, UNKNOWN, (int) V2DI_FTYPE_V2DF_V2DI_UQI) BDESC (OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_fixuns_notruncv4dfv4si2_mask, "__builtin_ia32_cvtpd2udq256_mask", IX86_BUILTIN_CVTPD2UDQ256_MASK, UNKNOWN, (int) V4SI_FTYPE_V4DF_V4SI_UQI) BDESC (OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_fixuns_notruncv2dfv2si2_mask, "__builtin_ia32_cvtpd2udq128_mask", IX86_BUILTIN_CVTPD2UDQ128_MASK, UNKNOWN, (int) V4SI_FTYPE_V2DF_V4SI_UQI) -BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_fix_truncv4sfv4di2_mask, "__builtin_ia32_cvttps2qq256_mask", IX86_BUILTIN_CVTTPS2QQ256, UNKNOWN, (int) V4DI_FTYPE_V4SF_V4DI_UQI) -BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512dq_fix_truncv2sfv2di2_mask, "__builtin_ia32_cvttps2qq128_mask", IX86_BUILTIN_CVTTPS2QQ128, UNKNOWN, (int) V2DI_FTYPE_V4SF_V2DI_UQI) -BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_fixuns_truncv4sfv4di2_mask, "__builtin_ia32_cvttps2uqq256_mask", IX86_BUILTIN_CVTTPS2UQQ256, UNKNOWN, (int) V4DI_FTYPE_V4SF_V4DI_UQI) -BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512dq_fixuns_truncv2sfv2di2_mask, "__builtin_ia32_cvttps2uqq128_mask", IX86_BUILTIN_CVTTPS2UQQ128, UNKNOWN, (int) V2DI_FTYPE_V4SF_V2DI_UQI) +BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_fix_truncv4sfv4di2_mask, "__builtin_ia32_cvttps2qq256_mask", IX86_BUILTIN_CVTTPS2QQ256, UNKNOWN, (int) V4DI_FTYPE_V4SF_V4DI_UQI) +BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_avx512dq_fix_truncv2sfv2di2_mask, "__builtin_ia32_cvttps2qq128_mask", IX86_BUILTIN_CVTTPS2QQ128, UNKNOWN, (int) V2DI_FTYPE_V4SF_V2DI_UQI) +BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_fixuns_truncv4sfv4di2_mask, "__builtin_ia32_cvttps2uqq256_mask", IX86_BUILTIN_CVTTPS2UQQ256, UNKNOWN, (int) V4DI_FTYPE_V4SF_V4DI_UQI) +BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_avx512dq_fixuns_truncv2sfv2di2_mask, "__builtin_ia32_cvttps2uqq128_mask", IX86_BUILTIN_CVTTPS2UQQ128, UNKNOWN, (int) V2DI_FTYPE_V4SF_V2DI_UQI) BDESC (OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_fix_truncv8sfv8si2_mask, "__builtin_ia32_cvttps2dq256_mask", IX86_BUILTIN_CVTTPS2DQ256_MASK, UNKNOWN, (int) V8SI_FTYPE_V8SF_V8SI_UQI) BDESC (OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_fix_truncv4sfv4si2_mask, "__builtin_ia32_cvttps2dq128_mask", IX86_BUILTIN_CVTTPS2DQ128_MASK, UNKNOWN, (int) V4SI_FTYPE_V4SF_V4SI_UQI) BDESC (OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_fixuns_truncv8sfv8si2_mask, "__builtin_ia32_cvttps2udq256_mask", IX86_BUILTIN_CVTTPS2UDQ256, UNKNOWN, (int) V8SI_FTYPE_V8SF_V8SI_UQI) @@ -1936,16 +1936,16 @@ BDESC (OPTION_MASK_ISA_AVX512BW | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_smulv16h BDESC (OPTION_MASK_ISA_AVX512BW | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_smulv8hi3_highpart_mask, "__builtin_ia32_pmulhw128_mask", IX86_BUILTIN_PMULHW128_MASK, UNKNOWN,(int) V8HI_FTYPE_V8HI_V8HI_V8HI_UQI) BDESC (OPTION_MASK_ISA_AVX512BW | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_mulv16hi3_mask, "__builtin_ia32_pmullw256_mask" , IX86_BUILTIN_PMULLW256_MASK, UNKNOWN, (int) V16HI_FTYPE_V16HI_V16HI_V16HI_UHI) BDESC (OPTION_MASK_ISA_AVX512BW | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_mulv8hi3_mask, "__builtin_ia32_pmullw128_mask", IX86_BUILTIN_PMULLW128_MASK, UNKNOWN, (int) V8HI_FTYPE_V8HI_V8HI_V8HI_UQI) -BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512dq_mulv4di3_mask, "__builtin_ia32_pmullq256_mask", IX86_BUILTIN_PMULLQ256, UNKNOWN, (int) V4DI_FTYPE_V4DI_V4DI_V4DI_UQI) -BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx512dq_mulv2di3_mask, "__builtin_ia32_pmullq128_mask", IX86_BUILTIN_PMULLQ128, UNKNOWN, (int) V2DI_FTYPE_V2DI_V2DI_V2DI_UQI) +BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_avx512dq_mulv4di3_mask, "__builtin_ia32_pmullq256_mask", IX86_BUILTIN_PMULLQ256, UNKNOWN, (int) V4DI_FTYPE_V4DI_V4DI_V4DI_UQI) +BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_avx512dq_mulv2di3_mask, "__builtin_ia32_pmullq128_mask", IX86_BUILTIN_PMULLQ128, UNKNOWN, (int) V2DI_FTYPE_V2DI_V2DI_V2DI_UQI) BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_andv4df3_mask, "__builtin_ia32_andpd256_mask", IX86_BUILTIN_ANDPD256_MASK, UNKNOWN, (int) V4DF_FTYPE_V4DF_V4DF_V4DF_UQI) BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_andv2df3_mask, "__builtin_ia32_andpd128_mask", IX86_BUILTIN_ANDPD128_MASK, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_V2DF_UQI) BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_andv8sf3_mask, "__builtin_ia32_andps256_mask", IX86_BUILTIN_ANDPS256_MASK, UNKNOWN, (int) V8SF_FTYPE_V8SF_V8SF_V8SF_UQI) BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_andv4sf3_mask, "__builtin_ia32_andps128_mask", IX86_BUILTIN_ANDPS128_MASK, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_V4SF_UQI) -BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx_andnotv4df3_mask, "__builtin_ia32_andnpd256_mask", IX86_BUILTIN_ANDNPD256_MASK, UNKNOWN, (int) V4DF_FTYPE_V4DF_V4DF_V4DF_UQI) -BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_sse2_andnotv2df3_mask, "__builtin_ia32_andnpd128_mask", IX86_BUILTIN_ANDNPD128_MASK, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_V2DF_UQI) -BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_avx_andnotv8sf3_mask, "__builtin_ia32_andnps256_mask", IX86_BUILTIN_ANDNPS256_MASK, UNKNOWN, (int) V8SF_FTYPE_V8SF_V8SF_V8SF_UQI) -BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_sse_andnotv4sf3_mask, "__builtin_ia32_andnps128_mask", IX86_BUILTIN_ANDNPS128_MASK, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_V4SF_UQI) +BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_avx_andnotv4df3_mask, "__builtin_ia32_andnpd256_mask", IX86_BUILTIN_ANDNPD256_MASK, UNKNOWN, (int) V4DF_FTYPE_V4DF_V4DF_V4DF_UQI) +BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_sse2_andnotv2df3_mask, "__builtin_ia32_andnpd128_mask", IX86_BUILTIN_ANDNPD128_MASK, UNKNOWN, (int) V2DF_FTYPE_V2DF_V2DF_V2DF_UQI) +BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_avx_andnotv8sf3_mask, "__builtin_ia32_andnps256_mask", IX86_BUILTIN_ANDNPS256_MASK, UNKNOWN, (int) V8SF_FTYPE_V8SF_V8SF_V8SF_UQI) +BDESC (OPTION_MASK_ISA_AVX512DQ | OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_AVX10_1, CODE_FOR_sse_andnotv4sf3_mask, "__builtin_ia32_andnps128_mask", IX86_BUILTIN_ANDNPS128_MASK, UNKNOWN, (int) V4SF_FTYPE_V4SF_V4SF_V4SF_UQI) BDESC (OPTION_MASK_ISA_AVX512BW | OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_ashlv8hi3_mask, "__builtin_ia32_psllwi128_mask", IX86_BUILTIN_PSLLWI128_MASK, UNKNOWN, (int) V8HI_FTYPE_V8HI_INT_V8HI_UQI_COUNT) BDESC (OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_ashlv4si3_mask, "__builtin_ia32_pslldi128_mask", IX86_BUILTIN_PSLLDI128_MASK, UNKNOWN, (int) V4SI_FTYPE_V4SI_INT_V4SI_UQI_COUNT) BDESC (OPTION_MASK_ISA_AVX512VL, 0, CODE_FOR_ashlv2di3_mask, "__builtin_ia32_psllqi128_mask", IX86_BUILTIN_PSLLQI128_MASK, UNKNOWN, (int) V2DI_FTYPE_V2DI_INT_V2DI_UQI_COUNT) diff --git a/gcc/config/i386/i386-builtins.cc b/gcc/config/i386/i386-builtins.cc index 356b6dfd5fb..06667629d51 100644 --- a/gcc/config/i386/i386-builtins.cc +++ b/gcc/config/i386/i386-builtins.cc @@ -278,15 +278,16 @@ def_builtin (HOST_WIDE_INT mask, HOST_WIDE_INT mask2, if (((mask2 == 0 || (mask2 & ix86_isa_flags2) != 0) && (mask == 0 || (mask & ix86_isa_flags) != 0)) || ((mask & OPTION_MASK_ISA_MMX) != 0 && TARGET_MMX_WITH_SSE) - /* "Unified" builtin used by either AVXVNNI/AVXIFMA/AES intrinsics - or AVX512VNNIVL/AVX512IFMAVL/VAESVL non-mask intrinsics should be - defined whenever avxvnni/avxifma/aes or avx512vnni/avx512ifma/vaes - && avx512vl exist. */ + /* "Unified" builtin used by either AVXVNNI/AVXIFMA/AES/AVX10.1 + intrinsics or AVX512VNNIVL/AVX512IFMAVL/VAESVL/- non-mask + intrinsics should be defined whenever avxvnni/avxifma/aes/avx10.1 or + avx512vnni/avx512ifma/vaes/- && avx512vl exist. */ || (mask2 == OPTION_MASK_ISA2_AVXVNNI) || (mask2 == OPTION_MASK_ISA2_AVXIFMA) || (mask2 == (OPTION_MASK_ISA2_AVXNECONVERT | OPTION_MASK_ISA2_AVX512BF16)) || ((mask2 & OPTION_MASK_ISA2_VAES) != 0) + || ((mask2 & OPTION_MASK_ISA2_AVX10_1) != 0) || (lang_hooks.builtin_function == lang_hooks.builtin_function_ext_scope)) { diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc index e9dc0bc2e9d..85e30552d6f 100644 --- a/gcc/config/i386/i386-expand.cc +++ b/gcc/config/i386/i386-expand.cc @@ -12718,6 +12718,8 @@ ix86_check_builtin_isa_match (unsigned int fcode, OPTION_MASK_ISA2_AVXNECONVERT); SHARE_BUILTIN (OPTION_MASK_ISA_AES, 0, OPTION_MASK_ISA_AVX512VL, OPTION_MASK_ISA2_VAES); + SHARE_BUILTIN (OPTION_MASK_ISA_AVX512VL, 0, 0, OPTION_MASK_ISA2_AVX10_1); + SHARE_BUILTIN (OPTION_MASK_ISA_AVX512DQ, 0, 0, OPTION_MASK_ISA2_AVX10_1); isa = tmp_isa; isa2 = tmp_isa2; @@ -23949,9 +23951,11 @@ ix86_expand_sse2_mulvxdi3 (rtx op0, rtx op1, rtx op2) if (TARGET_AVX512DQ && mode == V8DImode) emit_insn (gen_avx512dq_mulv8di3 (op0, op1, op2)); - else if (TARGET_AVX512DQ && TARGET_AVX512VL && mode == V4DImode) + else if (((TARGET_AVX512DQ && TARGET_AVX512VL) || TARGET_AVX10_1) + && mode == V4DImode) emit_insn (gen_avx512dq_mulv4di3 (op0, op1, op2)); - else if (TARGET_AVX512DQ && TARGET_AVX512VL && mode == V2DImode) + else if (((TARGET_AVX512DQ && TARGET_AVX512VL) || TARGET_AVX10_1) + && mode == V2DImode) emit_insn (gen_avx512dq_mulv2di3 (op0, op1, op2)); else if (TARGET_XOP && mode == V2DImode) { diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index c906d75b13e..765dd0d1115 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -538,7 +538,8 @@ avx,noavx,avx2,noavx2,bmi,bmi2,fma4,fma,avx512f,noavx512f, avx512bw,noavx512bw,avx512dq,noavx512dq,fma_or_avx512vl, avx512vl,noavx512vl,avxvnni,avx512vnnivl,avx512fp16,avxifma, - avx512ifmavl,avxneconvert,avx512bf16vl,vpclmulqdqvl" + avx512ifmavl,avxneconvert,avx512bf16vl,vpclmulqdqvl, + avx10_1_or_avx512dq,avx10_1_or_avx512vl" (const_string "base")) ;; The (bounding maximum) length of an instruction immediate. @@ -919,6 +920,10 @@ (symbol_ref "TARGET_AVX512BF16 && TARGET_AVX512VL") (eq_attr "isa" "vpclmulqdqvl") (symbol_ref "TARGET_VPCLMULQDQ && TARGET_AVX512VL") + (eq_attr "isa" "avx10_1_or_avx512dq") + (symbol_ref "TARGET_AVX512DQ || TARGET_AVX10_1") + (eq_attr "isa" "avx10_1_or_avx512vl") + (symbol_ref "TARGET_AVX512VL || TARGET_AVX10_1") (eq_attr "mmx_isa" "native") (symbol_ref "!TARGET_MMX_WITH_SSE") diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index 2c698af4664..5d19aaf380f 100644 --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -388,6 +388,10 @@ (define_mode_iterator VF1_128_256VL [V8SF (V4SF "TARGET_AVX512VL")]) +(define_mode_iterator VF1_128_256VLDQ_AVX10_1 + [(V8SF "TARGET_AVX512DQ") + (V4SF "(TARGET_AVX512DQ && TARGET_AVX512VL) || TARGET_AVX10_1")]) + ;; All DFmode vector float modes (define_mode_iterator VF2 [(V8DF "TARGET_AVX512F") (V4DF "TARGET_AVX") V2DF]) @@ -463,6 +467,11 @@ (define_mode_iterator VF2_AVX512VL [V8DF (V4DF "TARGET_AVX512VL") (V2DF "TARGET_AVX512VL")]) +(define_mode_iterator VF2_AVX512VLDQ_AVX10_1 + [(V8DF "TARGET_AVX512DQ") + (V4DF "(TARGET_AVX512DQ && TARGET_AVX512VL) || TARGET_AVX10_1") + (V2DF "(TARGET_AVX512DQ && TARGET_AVX512VL) || TARGET_AVX10_1")]) + (define_mode_iterator VF1_AVX512VL [V16SF (V8SF "TARGET_AVX512VL") (V4SF "TARGET_AVX512VL")]) @@ -528,6 +537,11 @@ (define_mode_iterator VI8_AVX512VL [V8DI (V4DI "TARGET_AVX512VL") (V2DI "TARGET_AVX512VL")]) +(define_mode_iterator VI8_AVX512VLDQ_AVX10_1 + [(V8DI "TARGET_AVX512DQ") + (V4DI "(TARGET_AVX512DQ && TARGET_AVX512VL) || TARGET_AVX10_1") + (V2DI "(TARGET_AVX512DQ && TARGET_AVX512VL) || TARGET_AVX10_1")]) + (define_mode_iterator VI8_256_512 [V8DI (V4DI "TARGET_AVX512VL")]) @@ -4774,13 +4788,13 @@ output_asm_insn (buf, operands); return ""; } - [(set_attr "isa" "noavx,avx,avx512dq,avx512f") + [(set_attr "isa" "noavx,avx,avx10_1_or_avx512dq,avx512f") (set_attr "type" "sselog") (set_attr "prefix" "orig,maybe_vex,evex,evex") (set (attr "mode") (cond [(and (match_test "") (and (eq_attr "alternative" "1") - (match_test "!TARGET_AVX512DQ"))) + (match_test "!(TARGET_AVX512DQ || TARGET_AVX10_1)"))) (const_string "") (eq_attr "alternative" "3") (const_string "") @@ -5031,7 +5045,7 @@ ops = "vandn%s\t{%%2, %%1, %%0|%%0, %%1, %%2}"; break; case 2: - if (TARGET_AVX512DQ) + if (TARGET_AVX512DQ || TARGET_AVX10_1) ops = "vandn%s\t{%%2, %%1, %%0|%%0, %%1, %%2}"; else { @@ -5056,12 +5070,12 @@ output_asm_insn (buf, operands); return ""; } - [(set_attr "isa" "noavx,avx,avx512vl,avx512f") + [(set_attr "isa" "noavx,avx,avx10_1_or_avx512vl,avx512f") (set_attr "type" "sselog") (set_attr "prefix" "orig,vex,evex,evex") (set (attr "mode") (cond [(eq_attr "alternative" "2") - (if_then_else (match_test "TARGET_AVX512DQ") + (if_then_else (match_test "TARGET_AVX512DQ || TARGET_AVX10_1") (const_string "") (const_string "TI")) (eq_attr "alternative" "3") @@ -8870,8 +8884,8 @@ (define_insn "fix_trunc2" [(set (match_operand: 0 "register_operand" "=v") (any_fix: - (match_operand:VF2_AVX512VL 1 "" "")))] - "TARGET_AVX512DQ && " + (match_operand:VF2_AVX512VLDQ_AVX10_1 1 "" "")))] + "" "vcvttpd2qq\t{%1, %0|%0, %1}" [(set_attr "type" "ssecvt") (set_attr "prefix" "evex") @@ -8880,9 +8894,9 @@ (define_insn "fix_notrunc2" [(set (match_operand: 0 "register_operand" "=v") (unspec: - [(match_operand:VF2_AVX512VL 1 "" "")] + [(match_operand:VF2_AVX512VLDQ_AVX10_1 1 "" "")] UNSPEC_FIX_NOTRUNC))] - "TARGET_AVX512DQ && " + "" "vcvtpd2qq\t{%1, %0|%0, %1}" [(set_attr "type" "ssecvt") (set_attr "prefix" "evex") @@ -8891,9 +8905,9 @@ (define_insn "fixuns_notrunc2" [(set (match_operand: 0 "register_operand" "=v") (unspec: - [(match_operand:VF2_AVX512VL 1 "nonimmediate_operand" "")] + [(match_operand:VF2_AVX512VLDQ_AVX10_1 1 "nonimmediate_operand" "")] UNSPEC_UNSIGNED_FIX_NOTRUNC))] - "TARGET_AVX512DQ && " + "" "vcvtpd2uqq\t{%1, %0|%0, %1}" [(set_attr "type" "ssecvt") (set_attr "prefix" "evex") @@ -8902,8 +8916,8 @@ (define_insn "fix_trunc2" [(set (match_operand: 0 "register_operand" "=v") (any_fix: - (match_operand:VF1_128_256VL 1 "" "")))] - "TARGET_AVX512DQ && " + (match_operand:VF1_128_256VLDQ_AVX10_1 1 "" "")))] + "" "vcvttps2qq\t{%1, %0|%0, %1}" [(set_attr "type" "ssecvt") (set_attr "prefix" "evex") @@ -8915,7 +8929,7 @@ (vec_select:V2SF (match_operand:V4SF 1 "nonimmediate_operand" "vm") (parallel [(const_int 0) (const_int 1)]))))] - "TARGET_AVX512DQ && TARGET_AVX512VL" + "(TARGET_AVX512DQ && TARGET_AVX512VL) || TARGET_AVX10_1" "vcvttps2qq\t{%1, %0|%0, %q1}" [(set_attr "type" "ssecvt") (set_attr "prefix" "evex") @@ -8925,7 +8939,7 @@ [(set (match_operand:V2DI 0 "register_operand") (any_fix:V2DI (match_operand:V2SF 1 "register_operand")))] - "TARGET_AVX512DQ && TARGET_AVX512VL" + "(TARGET_AVX512DQ && TARGET_AVX512VL) || TARGET_AVX10_1" { rtx op1 = force_reg (V2SFmode, operands[1]); op1 = lowpart_subreg (V4SFmode, op1, V2SFmode); @@ -15840,14 +15854,14 @@ (set_attr "mode" "TI")]) (define_expand "cond_mul" - [(set (match_operand:VI8_AVX512VL 0 "register_operand") - (vec_merge:VI8_AVX512VL - (mult:VI8_AVX512VL - (match_operand:VI8_AVX512VL 2 "vector_operand") - (match_operand:VI8_AVX512VL 3 "vector_operand")) - (match_operand:VI8_AVX512VL 4 "nonimm_or_0_operand") + [(set (match_operand:VI8_AVX512VLDQ_AVX10_1 0 "register_operand") + (vec_merge:VI8_AVX512VLDQ_AVX10_1 + (mult:VI8_AVX512VLDQ_AVX10_1 + (match_operand:VI8_AVX512VLDQ_AVX10_1 2 "vector_operand") + (match_operand:VI8_AVX512VLDQ_AVX10_1 3 "vector_operand")) + (match_operand:VI8_AVX512VLDQ_AVX10_1 4 "nonimm_or_0_operand") (match_operand: 1 "register_operand")))] - "TARGET_AVX512DQ" + "" { emit_insn (gen_avx512dq_mul3_mask (operands[0], operands[2], @@ -15858,19 +15872,19 @@ }) (define_expand "avx512dq_mul3" - [(set (match_operand:VI8_AVX512VL 0 "register_operand") - (mult:VI8_AVX512VL - (match_operand:VI8_AVX512VL 1 "bcst_vector_operand") - (match_operand:VI8_AVX512VL 2 "bcst_vector_operand")))] - "TARGET_AVX512DQ && " + [(set (match_operand:VI8_AVX512VLDQ_AVX10_1 0 "register_operand") + (mult:VI8_AVX512VLDQ_AVX10_1 + (match_operand:VI8_AVX512VLDQ_AVX10_1 1 "bcst_vector_operand") + (match_operand:VI8_AVX512VLDQ_AVX10_1 2 "bcst_vector_operand")))] + "" "ix86_fixup_binary_operands_no_copy (MULT, mode, operands);") (define_insn "*avx512dq_mul3" - [(set (match_operand:VI8_AVX512VL 0 "register_operand" "=v") - (mult:VI8_AVX512VL - (match_operand:VI8_AVX512VL 1 "bcst_vector_operand" "%v") - (match_operand:VI8_AVX512VL 2 "bcst_vector_operand" "vmBr")))] - "TARGET_AVX512DQ && + [(set (match_operand:VI8_AVX512VLDQ_AVX10_1 0 "register_operand" "=v") + (mult:VI8_AVX512VLDQ_AVX10_1 + (match_operand:VI8_AVX512VLDQ_AVX10_1 1 "bcst_vector_operand" "%v") + (match_operand:VI8_AVX512VLDQ_AVX10_1 2 "bcst_vector_operand" "vmBr")))] + " && ix86_binary_operator_ok (MULT, mode, operands)" { if (TARGET_DEST_FALSE_DEP_FOR_GLC @@ -17506,7 +17520,8 @@ ? "" : ""); break; default: - ssesuffix = TARGET_AVX512VL && which_alternative == 2 ? "q" : ""; + ssesuffix = (TARGET_AVX512VL || TARGET_AVX10_1) + && which_alternative == 2 ? "q" : ""; } break; @@ -26826,8 +26841,11 @@ ;; For broadcast[i|f]32x2. Yes there is no v4sf version, only v4si. (define_mode_iterator VI4F_BRCST32x2 - [V16SI (V8SI "TARGET_AVX512VL") (V4SI "TARGET_AVX512VL") - V16SF (V8SF "TARGET_AVX512VL")]) + [(V16SI "TARGET_AVX512DQ") + (V8SI "(TARGET_AVX512DQ && TARGET_AVX512VL) || TARGET_AVX10_1") + (V4SI "(TARGET_AVX512DQ && TARGET_AVX512VL) || TARGET_AVX10_1") + (V16SF "TARGET_AVX512DQ") + (V8SF "(TARGET_AVX512DQ && TARGET_AVX512VL) || TARGET_AVX10_1")]) (define_mode_attr 64x2mode [(V8DF "V2DF") (V8DI "V2DI") (V4DI "V2DI") (V4DF "V2DF")]) @@ -26842,7 +26860,7 @@ (vec_select:<32x2mode> (match_operand: 1 "nonimmediate_operand" "vm") (parallel [(const_int 0) (const_int 1)]))))] - "TARGET_AVX512DQ" + "" "vbroadcast32x2\t{%1, %0|%0, %q1}" [(set_attr "type" "ssemov") (set_attr "prefix_extra" "1") @@ -26877,13 +26895,16 @@ ;; For broadcast[i|f]64x2 (define_mode_iterator VI8F_BRCST64x2 - [V8DI V8DF (V4DI "TARGET_AVX512VL") (V4DF "TARGET_AVX512VL")]) + [(V8DI "TARGET_AVX512DQ") + (V8DF "TARGET_AVX512DQ") + (V4DI "(TARGET_AVX512DQ && TARGET_AVX512VL) || TARGET_AVX10_1") + (V4DF "(TARGET_AVX512DQ && TARGET_AVX512VL) || TARGET_AVX10_1")]) (define_insn "avx512dq_broadcast_1" [(set (match_operand:VI8F_BRCST64x2 0 "register_operand" "=v,v") (vec_duplicate:VI8F_BRCST64x2 (match_operand:<64x2mode> 1 "nonimmediate_operand" "v,m")))] - "TARGET_AVX512DQ" + "" "@ vshuf64x2\t{$0x0, %1, %1, %0|%0, %1, %1, 0x0} vbroadcast64x2\t{%1, %0|%0, %1}" diff --git a/gcc/config/i386/subst.md b/gcc/config/i386/subst.md index c5de75bb89d..59c4b395a9d 100644 --- a/gcc/config/i386/subst.md +++ b/gcc/config/i386/subst.md @@ -61,8 +61,9 @@ (define_subst_attr "mask_operand19" "mask" "" "%{%20%}%N19") (define_subst_attr "mask_codefor" "mask" "*" "") (define_subst_attr "mask_operand_arg34" "mask" "" ", operands[3], operands[4]") -(define_subst_attr "mask_mode512bit_condition" "mask" "1" "( == 64 || TARGET_AVX512VL)") -(define_subst_attr "mask_avx512vl_condition" "mask" "1" "TARGET_AVX512VL") +(define_subst_attr "mask_mode512bit_condition" "mask" "1" "( == 64 || TARGET_AVX512VL + || TARGET_AVX10_1)") +(define_subst_attr "mask_avx512vl_condition" "mask" "1" "(TARGET_AVX512VL || TARGET_AVX10_1)") (define_subst_attr "mask_avx512bw_condition" "mask" "1" "TARGET_AVX512BW") (define_subst_attr "mask_avx512dq_condition" "mask" "1" "TARGET_AVX512DQ") (define_subst_attr "mask_prefix" "mask" "vex" "evex") @@ -81,7 +82,7 @@ (define_subst "mask" [(set (match_operand:SUBST_V 0) (match_operand:SUBST_V 1))] - "TARGET_AVX512F" + "TARGET_AVX512F || TARGET_AVX10_1" [(set (match_dup 0) (vec_merge:SUBST_V (match_dup 1) diff --git a/gcc/testsuite/gcc.target/i386/avx-1.c b/gcc/testsuite/gcc.target/i386/avx-1.c index a6589deca84..bb72555939a 100644 --- a/gcc/testsuite/gcc.target/i386/avx-1.c +++ b/gcc/testsuite/gcc.target/i386/avx-1.c @@ -1,5 +1,5 @@ /* { dg-do compile } */ -/* { dg-options "-O2 -Werror-implicit-function-declaration -march=k8 -m3dnow -mavx -mavx2 -maes -mpclmul -mgfni -mavx512bw -mavx512fp16 -mavx512vl -mprefetchi" } */ +/* { dg-options "-O2 -Werror-implicit-function-declaration -march=k8 -m3dnow -mavx -mavx2 -maes -mpclmul -mgfni -mavx512bw -mavx512fp16 -mavx512vl -mprefetchi -mavx10.1" } */ /* { dg-add-options bind_pic_locally } */ #include diff --git a/gcc/testsuite/gcc.target/i386/avx-2.c b/gcc/testsuite/gcc.target/i386/avx-2.c index 642ae4d7bfb..85fd7213d1b 100644 --- a/gcc/testsuite/gcc.target/i386/avx-2.c +++ b/gcc/testsuite/gcc.target/i386/avx-2.c @@ -1,5 +1,5 @@ /* { dg-do compile } */ -/* { dg-options "-O0 -Werror-implicit-function-declaration -march=k8 -m3dnow -mavx -mavx2 -msse4a -maes -mpclmul -mavx512bw -mavx512fp16 -mavx512vl" } */ +/* { dg-options "-O0 -Werror-implicit-function-declaration -march=k8 -m3dnow -mavx -mavx2 -msse4a -maes -mpclmul -mavx512bw -mavx512fp16 -mavx512vl -mavx10.1" } */ /* { dg-add-options bind_pic_locally } */ #include diff --git a/gcc/testsuite/gcc.target/i386/sse-26.c b/gcc/testsuite/gcc.target/i386/sse-26.c index 04ffe10f42a..89db33b8b8c 100644 --- a/gcc/testsuite/gcc.target/i386/sse-26.c +++ b/gcc/testsuite/gcc.target/i386/sse-26.c @@ -2,4 +2,10 @@ /* { dg-options "-O2 -Werror-implicit-function-declaration -march=k8 -msse2 -mmmx -mno-sse3 -mno-3dnow -mno-fma -mno-fxsr -mno-xsave -mno-rtm -mno-prfchw -mno-rdseed -mno-adx -mno-prefetchwt1 -mno-clflushopt -mno-xsavec -mno-xsaves -mno-clwb -mno-mwaitx -mno-clzero -mno-pku -mno-rdpid -mno-gfni -mno-shstk -mno-vaes -mno-vpclmulqdq" } */ /* { dg-add-options bind_pic_locally } */ +/* We need to skip those intrin files which removed target attribute since after + removal GCC will issue a "target option mismatch" error for those + intrinsics. */ + +#define _AVX512VLDQINTRIN_H_INCLUDED + #include "sse-13.c"