From patchwork Thu Aug 31 08:20:21 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hongyu Wang X-Patchwork-Id: 1828161 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.a=rsa-sha256 header.s=default header.b=L3EaxOIk; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4RbvLv6fjsz1ygM for ; Thu, 31 Aug 2023 18:23:35 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id ECBAF38279BB for ; Thu, 31 Aug 2023 08:23:33 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org ECBAF38279BB DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1693470214; bh=Px38u6yquKLkjVnIVlT1YzWBCHYj4ZT2VH+d+hyh1nA=; h=To:Subject:Date:In-Reply-To:References:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To:Cc: From; b=L3EaxOIk9NehWmlymj6/TChj4qfJAR0BebNQqTTd/sk9xn064wmr2x6wP+Jt/RdaP dIa73ebLrXiPAS9GhxnqgeZ561skChY1O+THB7XOO2mQ2GI1RPkACvsC0BZ19kvxQL 8TtlJd55H3rUJLDfqmvB9l5HUAJtWCBr/z4kEjk8= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.43]) by sourceware.org (Postfix) with ESMTPS id 14AEA3857700 for ; Thu, 31 Aug 2023 08:20:43 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 14AEA3857700 X-IronPort-AV: E=McAfee;i="6600,9927,10818"; a="462235710" X-IronPort-AV: E=Sophos;i="6.02,216,1688454000"; d="scan'208";a="462235710" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 31 Aug 2023 01:20:40 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10818"; a="862938744" X-IronPort-AV: E=Sophos;i="6.02,216,1688454000"; d="scan'208";a="862938744" Received: from shvmail03.sh.intel.com ([10.239.245.20]) by orsmga004.jf.intel.com with ESMTP; 31 Aug 2023 01:20:31 -0700 Received: from shliclel4217.sh.intel.com (shliclel4217.sh.intel.com [10.239.240.127]) by shvmail03.sh.intel.com (Postfix) with ESMTP id 993DF1005132; Thu, 31 Aug 2023 16:20:24 +0800 (CST) To: gcc-patches@gcc.gnu.org Subject: [PATCH 10/13] [APX EGPR] Handle legacy insns that only support GPR16 (2/5) Date: Thu, 31 Aug 2023 16:20:21 +0800 Message-Id: <20230831082024.314097-11-hongyu.wang@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20230831082024.314097-1-hongyu.wang@intel.com> References: <20230831082024.314097-1-hongyu.wang@intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-10.7 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_ENVFROM_END_DIGIT, FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_SHORT, SPF_HELO_NONE, SPF_SOFTFAIL, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: Hongyu Wang via Gcc-patches From: Hongyu Wang Reply-To: Hongyu Wang Cc: jakub@redhat.com, hongtao.liu@intel.com, hubicka@ucw.cz Errors-To: gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org Sender: "Gcc-patches" From: Kong Lingling These legacy insns in opcode map2/3 have vex but no evex counterpart, disable EGPR for them by adjusting alternatives and attr_gpr32. insn list: 1. phaddw/vphaddw, phaddd/vphaddd, phaddsw/vphaddsw 2. phsubw/vphsubw, phsubd/vphsubd, phsubsw/vphsubsw 3. psignb/vpsginb, psignw/vpsignw, psignd/vpsignd 4. blendps/vblendps, blendpd/vblendpd 5. blendvps/vblendvps, blendvpd/vblendvpd 6. pblendvb/vpblendvb, pblendw/vpblendw 7. mpsadbw/vmpsadbw 8. dpps/vddps, dppd/vdppd 9. pcmpeqq/vpcmpeqq, pcmpgtq/vpcmpgtq gcc/ChangeLog: * config/i386/sse.md (avx2_phwv16hi3): Set attr gpr32 0 and constraint Bt/BM to all mem alternatives. (ssse3_phwv8hi3): Likewise. (ssse3_phwv4hi3): Likewise. (avx2_phdv8si3): Likewise. (ssse3_phdv4si3): Likewise. (ssse3_phdv2si3): Likewise. (_psign3): Likewise. (ssse3_psign3): Likewise. (_blend_blendv_blendv_lt): Likewise. (*_blendv_not_ltint: Likewise. (_dp): Likewise. (_mpsadbw): Likewise. (_pblendvb): Likewise. (*_pblendvb_lt): Likewise. (sse4_1_pblend): Likewise. (*avx2_pblend): Likewise. (avx2_permv2ti): Likewise. (*avx_vperm2f128_nozero): Likewise. (*avx2_eq3): Likewise. (*sse4_1_eqv2di3): Likewise. (sse4_2_gtv2di3): Likewise. (avx2_gt3): Likewise. gcc/testsuite/ChangeLog: * gcc.target/i386/apx-legacy-insn-check-norex2.c: Add sse/vex intrinsic tests. --- gcc/config/i386/sse.md | 80 ++++++++----- .../i386/apx-legacy-insn-check-norex2.c | 106 ++++++++++++++++++ 2 files changed, 159 insertions(+), 27 deletions(-) diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index bd6674d34f9..05963de9219 100644 --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -16837,7 +16837,7 @@ (define_insn "*avx2_eq3" [(set (match_operand:VI_256 0 "register_operand" "=x") (eq:VI_256 (match_operand:VI_256 1 "nonimmediate_operand" "%x") - (match_operand:VI_256 2 "nonimmediate_operand" "xm")))] + (match_operand:VI_256 2 "nonimmediate_operand" "xBt")))] "TARGET_AVX2 && !(MEM_P (operands[1]) && MEM_P (operands[2]))" "vpcmpeq\t{%2, %1, %0|%0, %1, %2}" [(set_attr "type" "ssecmp") @@ -16845,6 +16845,7 @@ (define_insn "*avx2_eq3" (if_then_else (eq (const_string "mode") (const_string "V4DImode")) (const_string "1") (const_string "*"))) + (set_attr "gpr32" "0") (set_attr "prefix" "vex") (set_attr "mode" "OI")]) @@ -17027,7 +17028,7 @@ (define_insn "*sse4_1_eqv2di3" [(set (match_operand:V2DI 0 "register_operand" "=Yr,*x,x") (eq:V2DI (match_operand:V2DI 1 "vector_operand" "%0,0,x") - (match_operand:V2DI 2 "vector_operand" "YrBm,*xBm,xm")))] + (match_operand:V2DI 2 "vector_operand" "YrBT,*xBT,xBt")))] "TARGET_SSE4_1 && !(MEM_P (operands[1]) && MEM_P (operands[2]))" "@ pcmpeqq\t{%2, %0|%0, %2} @@ -17035,6 +17036,7 @@ (define_insn "*sse4_1_eqv2di3" vpcmpeqq\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,noavx,avx") (set_attr "type" "ssecmp") + (set_attr "gpr32" "0") (set_attr "prefix_extra" "1") (set_attr "prefix" "orig,orig,vex") (set_attr "mode" "TI")]) @@ -17043,7 +17045,7 @@ (define_insn "*sse2_eq3" [(set (match_operand:VI124_128 0 "register_operand" "=x,x") (eq:VI124_128 (match_operand:VI124_128 1 "vector_operand" "%0,x") - (match_operand:VI124_128 2 "vector_operand" "xBm,xm")))] + (match_operand:VI124_128 2 "vector_operand" "xBm,xBt")))] "TARGET_SSE2 && !(MEM_P (operands[1]) && MEM_P (operands[2]))" "@ @@ -17058,7 +17060,7 @@ (define_insn "sse4_2_gtv2di3" [(set (match_operand:V2DI 0 "register_operand" "=Yr,*x,x") (gt:V2DI (match_operand:V2DI 1 "register_operand" "0,0,x") - (match_operand:V2DI 2 "vector_operand" "YrBm,*xBm,xm")))] + (match_operand:V2DI 2 "vector_operand" "YrBT,*xBT,xBt")))] "TARGET_SSE4_2" "@ pcmpgtq\t{%2, %0|%0, %2} @@ -17066,6 +17068,7 @@ (define_insn "sse4_2_gtv2di3" vpcmpgtq\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,noavx,avx") (set_attr "type" "ssecmp") + (set_attr "gpr32" "0") (set_attr "prefix_extra" "1") (set_attr "prefix" "orig,orig,vex") (set_attr "mode" "TI")]) @@ -17074,7 +17077,7 @@ (define_insn "avx2_gt3" [(set (match_operand:VI_256 0 "register_operand" "=x") (gt:VI_256 (match_operand:VI_256 1 "register_operand" "x") - (match_operand:VI_256 2 "nonimmediate_operand" "xm")))] + (match_operand:VI_256 2 "nonimmediate_operand" "xBt")))] "TARGET_AVX2" "vpcmpgt\t{%2, %1, %0|%0, %1, %2}" [(set_attr "type" "ssecmp") @@ -17082,6 +17085,7 @@ (define_insn "avx2_gt3" (if_then_else (eq (const_string "mode") (const_string "V4DImode")) (const_string "1") (const_string "*"))) + (set_attr "gpr32" "0") (set_attr "prefix" "vex") (set_attr "mode" "OI")]) @@ -17105,7 +17109,7 @@ (define_insn "*sse2_gt3" [(set (match_operand:VI124_128 0 "register_operand" "=x,x") (gt:VI124_128 (match_operand:VI124_128 1 "register_operand" "0,x") - (match_operand:VI124_128 2 "vector_operand" "xBm,xm")))] + (match_operand:VI124_128 2 "vector_operand" "xBm,xBt")))] "TARGET_SSE2" "@ pcmpgt\t{%2, %0|%0, %2} @@ -21228,7 +21232,7 @@ (define_insn "avx2_phwv16hi3" (vec_select:V16HI (vec_concat:V32HI (match_operand:V16HI 1 "register_operand" "x") - (match_operand:V16HI 2 "nonimmediate_operand" "xm")) + (match_operand:V16HI 2 "nonimmediate_operand" "xBt")) (parallel [(const_int 0) (const_int 2) (const_int 4) (const_int 6) (const_int 16) (const_int 18) (const_int 20) (const_int 22) @@ -21244,6 +21248,7 @@ (define_insn "avx2_phwv16hi3" "TARGET_AVX2" "vphw\t{%2, %1, %0|%0, %1, %2}" [(set_attr "type" "sseiadd") + (set_attr "gpr32" "0") (set_attr "prefix_extra" "1") (set_attr "prefix" "vex") (set_attr "mode" "OI")]) @@ -21254,7 +21259,7 @@ (define_insn "ssse3_phwv8hi3" (vec_select:V8HI (vec_concat:V16HI (match_operand:V8HI 1 "register_operand" "0,x") - (match_operand:V8HI 2 "vector_operand" "xBm,xm")) + (match_operand:V8HI 2 "vector_operand" "xBT,xBt")) (parallel [(const_int 0) (const_int 2) (const_int 4) (const_int 6) (const_int 8) (const_int 10) (const_int 12) (const_int 14)])) @@ -21269,6 +21274,7 @@ (define_insn "ssse3_phwv8hi3" vphw\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,avx") (set_attr "type" "sseiadd") + (set_attr "gpr32" "0") (set_attr "atom_unit" "complex") (set_attr "prefix_extra" "1") (set_attr "prefix" "orig,vex") @@ -21280,7 +21286,7 @@ (define_insn_and_split "ssse3_phwv4hi3" (vec_select:V4HI (vec_concat:V8HI (match_operand:V4HI 1 "register_operand" "0,0,x") - (match_operand:V4HI 2 "register_mmxmem_operand" "ym,x,x")) + (match_operand:V4HI 2 "register_mmxmem_operand" "yBt,x,x")) (parallel [(const_int 0) (const_int 2) (const_int 4) (const_int 6)])) (vec_select:V4HI @@ -21309,6 +21315,7 @@ (define_insn_and_split "ssse3_phwv4hi3" } [(set_attr "mmx_isa" "native,sse_noavx,avx") (set_attr "type" "sseiadd") + (set_attr "gpr32" "0") (set_attr "atom_unit" "complex") (set_attr "prefix_extra" "1") (set (attr "prefix_rex") (symbol_ref "x86_extended_reg_mentioned_p (insn)")) @@ -21320,7 +21327,7 @@ (define_insn "avx2_phdv8si3" (vec_select:V8SI (vec_concat:V16SI (match_operand:V8SI 1 "register_operand" "x") - (match_operand:V8SI 2 "nonimmediate_operand" "xm")) + (match_operand:V8SI 2 "nonimmediate_operand" "xBt")) (parallel [(const_int 0) (const_int 2) (const_int 8) (const_int 10) (const_int 4) (const_int 6) (const_int 12) (const_int 14)])) @@ -21332,6 +21339,7 @@ (define_insn "avx2_phdv8si3" "TARGET_AVX2" "vphd\t{%2, %1, %0|%0, %1, %2}" [(set_attr "type" "sseiadd") + (set_attr "gpr32" "0") (set_attr "prefix_extra" "1") (set_attr "prefix" "vex") (set_attr "mode" "OI")]) @@ -21342,7 +21350,7 @@ (define_insn "ssse3_phdv4si3" (vec_select:V4SI (vec_concat:V8SI (match_operand:V4SI 1 "register_operand" "0,x") - (match_operand:V4SI 2 "vector_operand" "xBm,xm")) + (match_operand:V4SI 2 "vector_operand" "xBT,xBt")) (parallel [(const_int 0) (const_int 2) (const_int 4) (const_int 6)])) (vec_select:V4SI @@ -21355,6 +21363,7 @@ (define_insn "ssse3_phdv4si3" vphd\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,avx") (set_attr "type" "sseiadd") + (set_attr "gpr32" "0") (set_attr "atom_unit" "complex") (set_attr "prefix_data16" "1,*") (set_attr "prefix_extra" "1") @@ -21367,7 +21376,7 @@ (define_insn_and_split "ssse3_phdv2si3" (vec_select:V2SI (vec_concat:V4SI (match_operand:V2SI 1 "register_operand" "0,0,x") - (match_operand:V2SI 2 "register_mmxmem_operand" "ym,x,x")) + (match_operand:V2SI 2 "register_mmxmem_operand" "yBt,x,x")) (parallel [(const_int 0) (const_int 2)])) (vec_select:V2SI (vec_concat:V4SI (match_dup 1) (match_dup 2)) @@ -21394,6 +21403,7 @@ (define_insn_and_split "ssse3_phdv2si3" } [(set_attr "mmx_isa" "native,sse_noavx,avx") (set_attr "type" "sseiadd") + (set_attr "gpr32" "0") (set_attr "atom_unit" "complex") (set_attr "prefix_extra" "1") (set (attr "prefix_rex") (symbol_ref "x86_extended_reg_mentioned_p (insn)")) @@ -21848,7 +21858,7 @@ (define_insn "_psign3" [(set (match_operand:VI124_AVX2 0 "register_operand" "=x,x") (unspec:VI124_AVX2 [(match_operand:VI124_AVX2 1 "register_operand" "0,x") - (match_operand:VI124_AVX2 2 "vector_operand" "xBm,xm")] + (match_operand:VI124_AVX2 2 "vector_operand" "xBT,xBt")] UNSPEC_PSIGN))] "TARGET_SSSE3" "@ @@ -21856,6 +21866,7 @@ (define_insn "_psign3" vpsign\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,avx") (set_attr "type" "sselog1") + (set_attr "gpr32" "0") (set_attr "prefix_extra" "1") (set_attr "prefix" "orig,vex") (set_attr "mode" "")]) @@ -21864,7 +21875,7 @@ (define_insn "ssse3_psign3" [(set (match_operand:MMXMODEI 0 "register_operand" "=y,x,x") (unspec:MMXMODEI [(match_operand:MMXMODEI 1 "register_operand" "0,0,x") - (match_operand:MMXMODEI 2 "register_mmxmem_operand" "ym,x,x")] + (match_operand:MMXMODEI 2 "register_mmxmem_operand" "yBt,x,x")] UNSPEC_PSIGN))] "(TARGET_MMX || TARGET_MMX_WITH_SSE) && TARGET_SSSE3" "@ @@ -21874,6 +21885,7 @@ (define_insn "ssse3_psign3" [(set_attr "isa" "*,noavx,avx") (set_attr "mmx_isa" "native,*,*") (set_attr "type" "sselog1") + (set_attr "gpr32" "0") (set_attr "prefix_extra" "1") (set (attr "prefix_rex") (symbol_ref "x86_extended_reg_mentioned_p (insn)")) (set_attr "mode" "DI,TI,TI")]) @@ -22153,7 +22165,7 @@ (define_mode_attr blendbits (define_insn "_blend" [(set (match_operand:VF_128_256 0 "register_operand" "=Yr,*x,x") (vec_merge:VF_128_256 - (match_operand:VF_128_256 2 "vector_operand" "YrBm,*xBm,xm") + (match_operand:VF_128_256 2 "vector_operand" "YrBT,*xBT,xBt") (match_operand:VF_128_256 1 "register_operand" "0,0,x") (match_operand:SI 3 "const_0_to__operand")))] "TARGET_SSE4_1" @@ -22163,6 +22175,7 @@ (define_insn "_blend" vblend\t{%3, %2, %1, %0|%0, %1, %2, %3}" [(set_attr "isa" "noavx,noavx,avx") (set_attr "type" "ssemov") + (set_attr "gpr32" "0") (set_attr "length_immediate" "1") (set_attr "prefix_data16" "1,1,*") (set_attr "prefix_extra" "1") @@ -22173,7 +22186,7 @@ (define_insn "_blendv" [(set (match_operand:VF_128_256 0 "register_operand" "=Yr,*x,x") (unspec:VF_128_256 [(match_operand:VF_128_256 1 "register_operand" "0,0,x") - (match_operand:VF_128_256 2 "vector_operand" "YrBm,*xBm,xm") + (match_operand:VF_128_256 2 "vector_operand" "YrBT,*xBT,xBt") (match_operand:VF_128_256 3 "register_operand" "Yz,Yz,x")] UNSPEC_BLENDV))] "TARGET_SSE4_1" @@ -22183,6 +22196,7 @@ (define_insn "_blendv" vblendv\t{%3, %2, %1, %0|%0, %1, %2, %3}" [(set_attr "isa" "noavx,noavx,avx") (set_attr "type" "ssemov") + (set_attr "gpr32" "0") (set_attr "length_immediate" "1") (set_attr "prefix_data16" "1,1,*") (set_attr "prefix_extra" "1") @@ -22234,7 +22248,7 @@ (define_insn_and_split "*_blendv_lt" [(set (match_operand:VF_128_256 0 "register_operand" "=Yr,*x,x") (unspec:VF_128_256 [(match_operand:VF_128_256 1 "register_operand" "0,0,x") - (match_operand:VF_128_256 2 "vector_operand" "YrBm,*xBm,xm") + (match_operand:VF_128_256 2 "vector_operand" "YrBT,*xBT,xBt") (lt:VF_128_256 (match_operand: 3 "register_operand" "Yz,Yz,x") (match_operand: 4 "const0_operand"))] @@ -22248,6 +22262,7 @@ (define_insn_and_split "*_blendv_lt" "operands[3] = gen_lowpart (mode, operands[3]);" [(set_attr "isa" "noavx,noavx,avx") (set_attr "type" "ssemov") + (set_attr "gpr32" "0") (set_attr "length_immediate" "1") (set_attr "prefix_data16" "1,1,*") (set_attr "prefix_extra" "1") @@ -22266,7 +22281,7 @@ (define_insn_and_split "*_blendv_ltint" [(set (match_operand: 0 "register_operand" "=Yr,*x,x") (unspec: [(match_operand: 1 "register_operand" "0,0,x") - (match_operand: 2 "vector_operand" "YrBm,*xBm,xm") + (match_operand: 2 "vector_operand" "YrBT,*xBT,xBt") (subreg: (lt:VI48_AVX (match_operand:VI48_AVX 3 "register_operand" "Yz,Yz,x") @@ -22286,6 +22301,7 @@ (define_insn_and_split "*_blendv_ltint" } [(set_attr "isa" "noavx,noavx,avx") (set_attr "type" "ssemov") + (set_attr "gpr32" "0") (set_attr "length_immediate" "1") (set_attr "prefix_data16" "1,1,*") (set_attr "prefix_extra" "1") @@ -22324,7 +22340,7 @@ (define_insn "_dp" [(set (match_operand:VF_128_256 0 "register_operand" "=Yr,*x,x") (unspec:VF_128_256 [(match_operand:VF_128_256 1 "vector_operand" "%0,0,x") - (match_operand:VF_128_256 2 "vector_operand" "YrBm,*xBm,xm") + (match_operand:VF_128_256 2 "vector_operand" "YrBT,*xBT,xBt") (match_operand:SI 3 "const_0_to_255_operand")] UNSPEC_DP))] "TARGET_SSE4_1" @@ -22334,6 +22350,7 @@ (define_insn "_dp" vdp\t{%3, %2, %1, %0|%0, %1, %2, %3}" [(set_attr "isa" "noavx,noavx,avx") (set_attr "type" "ssemul") + (set_attr "gpr32" "0") (set_attr "length_immediate" "1") (set_attr "prefix_data16" "1,1,*") (set_attr "prefix_extra" "1") @@ -22362,7 +22379,7 @@ (define_insn "_mpsadbw" [(set (match_operand:VI1_AVX2 0 "register_operand" "=Yr,*x,x") (unspec:VI1_AVX2 [(match_operand:VI1_AVX2 1 "register_operand" "0,0,x") - (match_operand:VI1_AVX2 2 "vector_operand" "YrBm,*xBm,xm") + (match_operand:VI1_AVX2 2 "vector_operand" "YrBT,*xBT,xBt") (match_operand:SI 3 "const_0_to_255_operand")] UNSPEC_MPSADBW))] "TARGET_SSE4_1" @@ -22372,6 +22389,7 @@ (define_insn "_mpsadbw" vmpsadbw\t{%3, %2, %1, %0|%0, %1, %2, %3}" [(set_attr "isa" "noavx,noavx,avx") (set_attr "type" "sselog1") + (set_attr "gpr32" "0") (set_attr "length_immediate" "1") (set_attr "prefix_extra" "1") (set_attr "prefix" "orig,orig,vex") @@ -22400,7 +22418,7 @@ (define_insn "_pblendvb" [(set (match_operand:VI1_AVX2 0 "register_operand" "=Yr,*x,x") (unspec:VI1_AVX2 [(match_operand:VI1_AVX2 1 "register_operand" "0,0,x") - (match_operand:VI1_AVX2 2 "vector_operand" "YrBm,*xBm,xm") + (match_operand:VI1_AVX2 2 "vector_operand" "YrBT,*xBT,xBt") (match_operand:VI1_AVX2 3 "register_operand" "Yz,Yz,x")] UNSPEC_BLENDV))] "TARGET_SSE4_1" @@ -22410,6 +22428,7 @@ (define_insn "_pblendvb" vpblendvb\t{%3, %2, %1, %0|%0, %1, %2, %3}" [(set_attr "isa" "noavx,noavx,avx") (set_attr "type" "ssemov") + (set_attr "gpr32" "0") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "*,*,1") (set_attr "prefix" "orig,orig,vex") @@ -22449,7 +22468,7 @@ (define_insn_and_split "*_pblendvb_lt" [(set (match_operand:VI1_AVX2 0 "register_operand" "=Yr,*x,x") (unspec:VI1_AVX2 [(match_operand:VI1_AVX2 1 "register_operand" "0,0,x") - (match_operand:VI1_AVX2 2 "vector_operand" "YrBm,*xBm,xm") + (match_operand:VI1_AVX2 2 "vector_operand" "YrBT,*xBT,xBt") (lt:VI1_AVX2 (match_operand:VI1_AVX2 3 "register_operand" "Yz,Yz,x") (match_operand:VI1_AVX2 4 "const0_operand"))] UNSPEC_BLENDV))] @@ -22462,6 +22481,7 @@ (define_insn_and_split "*_pblendvb_lt" "" [(set_attr "isa" "noavx,noavx,avx") (set_attr "type" "ssemov") + (set_attr "gpr32" "0") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "*,*,1") (set_attr "prefix" "orig,orig,vex") @@ -22493,7 +22513,7 @@ (define_insn_and_split "*_pblendvb_lt_subreg_not" (define_insn "sse4_1_pblend" [(set (match_operand:V8_128 0 "register_operand" "=Yr,*x,x") (vec_merge:V8_128 - (match_operand:V8_128 2 "vector_operand" "YrBm,*xBm,xm") + (match_operand:V8_128 2 "vector_operand" "YrBT,*xBT,xBt") (match_operand:V8_128 1 "register_operand" "0,0,x") (match_operand:SI 3 "const_0_to_255_operand")))] "TARGET_SSE4_1" @@ -22503,6 +22523,7 @@ (define_insn "sse4_1_pblend" vpblendw\t{%3, %2, %1, %0|%0, %1, %2, %3}" [(set_attr "isa" "noavx,noavx,avx") (set_attr "type" "ssemov") + (set_attr "gpr32" "0") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "1") (set_attr "prefix" "orig,orig,vex") @@ -22565,7 +22586,7 @@ (define_expand "avx2_pblend_1" (define_insn "*avx2_pblend" [(set (match_operand:V16_256 0 "register_operand" "=x") (vec_merge:V16_256 - (match_operand:V16_256 2 "nonimmediate_operand" "xm") + (match_operand:V16_256 2 "nonimmediate_operand" "xBt") (match_operand:V16_256 1 "register_operand" "x") (match_operand:SI 3 "avx2_pblendw_operand")))] "TARGET_AVX2" @@ -22574,6 +22595,7 @@ (define_insn "*avx2_pblend" return "vpblendw\t{%3, %2, %1, %0|%0, %1, %2, %3}"; } [(set_attr "type" "ssemov") + (set_attr "gpr32" "0") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "1") (set_attr "prefix" "vex") @@ -22582,7 +22604,7 @@ (define_insn "*avx2_pblend" (define_insn "avx2_pblendd" [(set (match_operand:VI4_AVX2 0 "register_operand" "=x") (vec_merge:VI4_AVX2 - (match_operand:VI4_AVX2 2 "nonimmediate_operand" "xm") + (match_operand:VI4_AVX2 2 "nonimmediate_operand" "xBt") (match_operand:VI4_AVX2 1 "register_operand" "x") (match_operand:SI 3 "const_0_to_255_operand")))] "TARGET_AVX2" @@ -26443,11 +26465,13 @@ (define_insn "avx512f_perm_1" (set_attr "prefix" "") (set_attr "mode" "")]) +;; TODO (APX): vmovaps supports EGPR but not others, could split +;; pattern to enable gpr32 for this one. (define_insn "avx2_permv2ti" [(set (match_operand:V4DI 0 "register_operand" "=x") (unspec:V4DI [(match_operand:V4DI 1 "register_operand" "x") - (match_operand:V4DI 2 "nonimmediate_operand" "xm") + (match_operand:V4DI 2 "nonimmediate_operand" "xBt") (match_operand:SI 3 "const_0_to_255_operand")] UNSPEC_VPERMTI))] "TARGET_AVX2" @@ -26474,6 +26498,7 @@ (define_insn "avx2_permv2ti" return "vperm2i128\t{%3, %2, %1, %0|%0, %1, %2, %3}"; } [(set_attr "type" "sselog") + (set_attr "gpr32" "0") (set_attr "prefix" "vex") (set_attr "mode" "OI")]) @@ -27089,7 +27114,7 @@ (define_insn "*avx_vperm2f128_nozero" (vec_select:AVX256MODE2P (vec_concat: (match_operand:AVX256MODE2P 1 "register_operand" "x") - (match_operand:AVX256MODE2P 2 "nonimmediate_operand" "xm")) + (match_operand:AVX256MODE2P 2 "nonimmediate_operand" "xBt")) (match_parallel 3 "" [(match_operand 4 "const_int_operand")])))] "TARGET_AVX @@ -27106,6 +27131,7 @@ (define_insn "*avx_vperm2f128_nozero" return "vperm2\t{%3, %2, %1, %0|%0, %1, %2, %3}"; } [(set_attr "type" "sselog") + (set_attr "gpr32" "0") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "1") (set_attr "prefix" "vex") diff --git a/gcc/testsuite/gcc.target/i386/apx-legacy-insn-check-norex2.c b/gcc/testsuite/gcc.target/i386/apx-legacy-insn-check-norex2.c index 1e5450dfb73..510213a6ca7 100644 --- a/gcc/testsuite/gcc.target/i386/apx-legacy-insn-check-norex2.c +++ b/gcc/testsuite/gcc.target/i386/apx-legacy-insn-check-norex2.c @@ -28,3 +28,109 @@ void legacy_test () /* { dg-final { scan-assembler-not "xrstor64\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ /* { dg-final { scan-assembler-not "fxsave64\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ /* { dg-final { scan-assembler-not "fxrstor64\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ + +#ifdef DTYPE +#undef DTYPE +#define DTYPE u64 +#endif + +typedef union +{ + __m128i xi[8]; + __m128 xf[8]; + __m128d xd[8]; + __m256i yi[4]; + __m256 yf[4]; + __m256d yd[4]; + DTYPE a[16]; +} tmp_u; + +__attribute__((target("sse4.2"))) +void sse_test () +{ + register tmp_u *tdst __asm__("%r16"); + register tmp_u *src1 __asm__("%r17"); + register tmp_u *src2 __asm__("%r18"); + + src1->xi[0] = _mm_hadd_epi16 (tdst->xi[2], src2->xi[3]); + src1->xi[1] = _mm_hadd_epi32 (tdst->xi[0], src2->xi[1]); + tdst->xi[2] = _mm_hadds_epi16 (src1->xi[4], src2->xi[5]); + tdst->xi[3] = _mm_hsub_epi16 (src1->xi[6], src2->xi[7]); + tdst->xi[4] = _mm_hsub_epi32 (src1->xi[0], src2->xi[1]); + tdst->xi[5] = _mm_hsubs_epi16 (src1->xi[2], src2->xi[3]); + + src1->xi[6] = _mm_cmpeq_epi64 (tdst->xi[4], src2->xi[5]); + src1->xi[7] = _mm_cmpgt_epi64 (tdst->xi[6], src2->xi[7]); + + tdst->xf[0] = _mm_dp_ps (src1->xf[0], src2->xf[1], 0xbf); + tdst->xd[1] = _mm_dp_pd (src1->xd[2], src2->xd[3], 0xae); + + tdst->xi[2] = _mm_mpsadbw_epu8 (src1->xi[4], src2->xi[5], 0xc1); + + tdst->xi[3] = _mm_blend_epi16 (src1->xi[6], src2->xi[7], 0xc); + tdst->xi[4] = _mm_blendv_epi8 (src1->xi[0], src2->xi[1], tdst->xi[2]); + tdst->xf[5] = _mm_blend_ps (src1->xf[3], src2->xf[4], 0x4); + tdst->xf[6] = _mm_blendv_ps (src1->xf[5], src2->xf[6], tdst->xf[7]); + tdst->xd[7] = _mm_blend_pd (tdst->xd[0], src1->xd[1], 0x1); + tdst->xd[0] = _mm_blendv_pd (src1->xd[2], src2->xd[3], tdst->xd[4]); + + tdst->xi[1] = _mm_sign_epi8 (src1->xi[5], src2->xi[6]); + tdst->xi[2] = _mm_sign_epi16 (src1->xi[7], src2->xi[0]); + tdst->xi[3] = _mm_sign_epi32 (src1->xi[1], src2->xi[2]); +} + +__attribute__((target("avx2"))) +void vex_test () +{ + + register tmp_u *tdst __asm__("%r16"); + register tmp_u *src1 __asm__("%r17"); + register tmp_u *src2 __asm__("%r18"); + + src1->yi[1] = _mm256_hadd_epi16 (tdst->yi[2], src2->yi[3]); + src1->yi[2] = _mm256_hadd_epi32 (tdst->yi[0], src2->yi[1]); + tdst->yi[3] = _mm256_hadds_epi16 (src1->yi[1], src2->yi[2]); + tdst->yi[0] = _mm256_hsub_epi16 (src1->yi[3], src2->yi[0]); + tdst->yi[1] = _mm256_hsub_epi32 (src1->yi[0], src2->yi[1]); + tdst->yi[2] = _mm256_hsubs_epi16 (src1->yi[2], src2->yi[3]); + + src1->yi[2] = _mm256_cmpeq_epi64 (tdst->yi[1], src2->yi[2]); + src1->yi[1] = _mm256_cmpgt_epi64 (tdst->yi[3], src2->yi[0]); + + tdst->yf[2] = _mm256_dp_ps (src1->yf[0], src2->yf[1], 0xbf); + tdst->xd[3] = _mm_dp_pd (src1->xd[0], src2->xd[1], 0xbf); + + tdst->yi[3] = _mm256_mpsadbw_epu8 (src1->yi[1], src2->yi[1], 0xc1); + + tdst->yi[0] = _mm256_blend_epi16 (src1->yi[1], src2->yi[2], 0xc); + tdst->yi[1] = _mm256_blendv_epi8 (src1->yi[1], src2->yi[2], tdst->yi[0]); + tdst->yf[2] = _mm256_blend_ps (src1->yf[0], src2->yf[1], 0x4); + tdst->yf[3] = _mm256_blendv_ps (src1->yf[2], src2->yf[3], tdst->yf[1]); + tdst->yd[3] = _mm256_blend_pd (tdst->yd[1], src1->yd[0], 0x1); + tdst->yd[1] = _mm256_blendv_pd (src1->yd[2], src2->yd[3], tdst->yd[2]); + + tdst->yi[2] = _mm256_sign_epi8 (src1->yi[0], src2->yi[1]); + tdst->yi[3] = _mm256_sign_epi16 (src1->yi[2], src2->yi[3]); + tdst->yi[0] = _mm256_sign_epi32 (src1->yi[0], src2->yi[1]); +} + +/* { dg-final { scan-assembler-not "v?pcmpeqq\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "v?pcmpgtq\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "v?phaddw\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "v?phaddd\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "v?phaddsw\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "v?phsubw\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "v?phsubd\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "v?phsubsw\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "v?dpps\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "v?dppd\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "v?psadbw\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "v?pblendw\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "v?pblendvb\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "v?blendps\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "v?blendvps\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "v?blendpd\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "v?blendvpd\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "v?psignb\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "v?psignw\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */ +/* { dg-final { scan-assembler-not "v?psignd\[ \\t]+\\\.\\\*r\(1\[6-9\]\|2\[0-9\]|30\|31\)" } } */