From patchwork Thu Aug 31 08:20:24 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Hongyu Wang
X-Patchwork-Id: 1828163
To: gcc-patches@gcc.gnu.org
Subject: [PATCH 13/13] [APX EGPR] Handle vex insns that only support GPR16 (5/5)
Date: Thu, 31 Aug 2023 16:20:24 +0800
Message-Id: <20230831082024.314097-14-hongyu.wang@intel.com>
In-Reply-To: <20230831082024.314097-1-hongyu.wang@intel.com>
References: <20230831082024.314097-1-hongyu.wang@intel.com>
List-Id: Gcc-patches mailing list
X-Patchwork-Original-From: Hongyu Wang via Gcc-patches
From: Hongyu Wang
Reply-To: Hongyu Wang
Cc: jakub@redhat.com, hongtao.liu@intel.com, hubicka@ucw.cz
Errors-To:
gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org
Sender: "Gcc-patches"

From: Kong Lingling

These VEX insns may have legacy counterparts that can support EGPR, but
they have no EVEX counterparts.  Split the VEX part out of each pattern
and mark it as not supporting EGPR by adjusting its constraints and
attr_gpr32.

insn list:
1. vmovmskpd/vmovmskps
2. vpmovmskb
3. vrsqrtss/vrsqrtps
4. vrcpss/vrcpps
5. vhaddpd/vhaddps, vhsubpd/vhsubps
6. vldmxcsr/vstmxcsr
7. vaddsubpd/vaddsubps
8. vlddqu
9. vtestps/vtestpd
10. vmaskmovps/vmaskmovpd, vpmaskmovd/vpmaskmovq
11. vperm2f128/vperm2i128
12. vinserti128/vinsertf128
13. vbroadcasti128/vbroadcastf128
14. vcmppd/vcmpps, vcmpss/vcmpsd
15. vgatherdps/vgatherqps, vgatherdpd/vgatherqpd

gcc/ChangeLog:

        * config/i386/constraints.md (TV): New constraint for vsib memory
        that does not allow gpr32.
        * config/i386/i386.md (setcc__sse): Replace m with Bt for avx
        alternative and set attr_gpr32 to 0.
        (movmsk_df): Split avx/noavx alternatives and replace "r" with "h"
        for avx alternative.
        (_rcp2): Split avx/noavx alternatives and replace "m/Bm" with
        "Bt/BT" for avx alternative, set its gpr32 attr to 0.
        (*rsqrtsf2_sse): Likewise.
        * config/i386/mmx.md (mmx_pmovmskb): Split alternative 1 to
        avx/noavx and assign h/r constraint to dest.
        * config/i386/sse.md (_movmsk): Split avx/noavx alternatives and
        replace "r" with "h" for avx alternative.
        (*_movmsk_ext): Likewise.
        (*_movmsk_lt): Likewise.
        (*_movmsk_ext_lt): Likewise.
        (*_movmsk_shift): Likewise.
        (*_movmsk_ext_shift): Likewise.
        (_pmovmskb): Likewise.
        (*_pmovmskb_zext): Likewise.
        (*sse2_pmovmskb_ext): Likewise.
        (*_pmovmskb_lt): Likewise.
        (*_pmovmskb_zext_lt): Likewise.
        (*sse2_pmovmskb_ext_lt): Likewise.
        (_rcp2): Split avx/noavx alternatives and replace "m/Bm" with
        "Bt/BT" for avx alternative, set its attr_gpr32 to 0.
        (sse_vmrcpv4sf2): Likewise.
        (*sse_vmrcpv4sf2): Likewise.
        (rsqrt2): Likewise.
        (sse_vmrsqrtv4sf2): Likewise.
        (*sse_vmrsqrtv4sf2): Likewise.
        (avx_hv4df3): Likewise.
        (sse3_hsubv2df3): Likewise.
        (avx_hv8sf3): Likewise.
        (sse3_hv4sf3): Likewise.
        (_lddqu): Likewise.
        (avx_cmp3): Likewise.
        (avx_vmcmp3): Likewise.
        (*sse2_gt3): Likewise.
        (sse_ldmxcsr): Likewise.
        (sse_stmxcsr): Likewise.
        (avx_vtest): Replace m with Bt for avx alternative and set
        attr_gpr32 to 0.
        (avx2_permv2ti): Likewise.
        (*avx_vperm2f128_full): Likewise.
        (*avx_vperm2f128_nozero): Likewise.
        (vec_set_lo_v32qi): Likewise.
        (_maskload): Likewise.
        (_maskstore): Likewise.
        (avx_cmp3): Likewise.
        (avx_vmcmp3): Likewise.
        (*_maskcmp3_comm): Likewise.
        (*avx2_gathersi): Replace Tv with TV and set attr_gpr32 to 0.
        (*avx2_gathersi_2): Likewise.
        (*avx2_gatherdi): Likewise.
        (*avx2_gatherdi_2): Likewise.
        (*avx2_gatherdi_3): Likewise.
        (*avx2_gatherdi_4): Likewise.
        (avx_vbroadcastf128_): Restrict non-egpr alternative to
        noavx512vl, set its constraint to Bt and set attr_gpr32 to 0.
        (vec_set_lo_): Likewise.
        (vec_set_lo_): Likewise for SF/SI modes.
        (vec_set_hi_): Likewise.
        (vec_set_hi_): Likewise for SF/SI modes.
        (vec_set_hi_): Likewise.
        (vec_set_lo_): Likewise.
        (avx2_set_hi_v32qi): Likewise.
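
The rule described above (a VEX-only alternative must not use any of the
extended GPRs once APX EGPR is enabled) can be illustrated with a small
standalone sketch.  This is not GCC code: the register numbering, the
struct mem_addr type and the functions is_egpr, addr_mentions_egpr,
gpr16_reg_ok and gpr16_mem_ok are all hypothetical names introduced for
illustration only.  The sketch assumes GPR16 means the 16 legacy GPRs and
that EGPR adds 16 more.

```c
#include <stdbool.h>

/* Hypothetical register numbering used only in this sketch:
   0..15 stand for the legacy GPRs (GPR16), 16..31 for the APX
   extended GPRs (EGPR), and NO_REG for an absent register.  */
enum { NO_REG = -1, FIRST_EGPR = 16, LAST_EGPR = 31 };

/* Hypothetical, simplified memory address: only the GPRs it uses.  */
struct mem_addr
{
  int base;   /* GPR number or NO_REG.  */
  int index;  /* GPR number or NO_REG.  */
};

/* True if REGNO names one of the extended GPRs.  */
static bool
is_egpr (int regno)
{
  return regno >= FIRST_EGPR && regno <= LAST_EGPR;
}

/* True if ADDR uses an extended GPR as base or index.  */
static bool
addr_mentions_egpr (const struct mem_addr *addr)
{
  return is_egpr (addr->base) || is_egpr (addr->index);
}

/* Models a register alternative restricted to GPR16: only the 16
   legacy GPRs are accepted.  */
static bool
gpr16_reg_ok (int regno)
{
  return regno >= 0 && regno < FIRST_EGPR;
}

/* Models a memory alternative that does not support EGPR: the address
   is rejected only when EGPR is enabled and the address mentions an
   extended GPR.  */
static bool
gpr16_mem_ok (bool apx_egpr_enabled, const struct mem_addr *addr)
{
  return !(apx_egpr_enabled && addr_mentions_egpr (addr));
}
```

With EGPR enabled in this model, an address whose base is register 3 is
accepted by gpr16_mem_ok, while one whose base is register 17 is
rejected, so only an alternative that supports EGPR could take it.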
---
 gcc/config/i386/constraints.md |   7 +
 gcc/config/i386/i386.md        |  52 +++--
 gcc/config/i386/mmx.md         |  11 +-
 gcc/config/i386/sse.md         | 337 +++++++++++++++++++++------------
 4 files changed, 261 insertions(+), 146 deletions(-)

diff --git a/gcc/config/i386/constraints.md b/gcc/config/i386/constraints.md
index f487bf2e5a3..052b6a95841 100644
--- a/gcc/config/i386/constraints.md
+++ b/gcc/config/i386/constraints.md
@@ -374,6 +374,7 @@ (define_constraint "Z"
 
 ;; T prefix is used for different address constraints
 ;;   v - VSIB address
+;;   V - VSIB address with no rex2 register
 ;;   s - address with no segment register
 ;;   i - address with no index and no rip
 ;;   b - address with no base and no rip
@@ -386,5 +387,11 @@ (define_address_constraint "Ts"
   "Address operand without segment register"
   (match_operand 0 "address_no_seg_operand"))
 
+(define_address_constraint "TV"
+  "VSIB address operand"
+  (and (match_operand 0 "vsib_address_operand")
+       (not (and (match_test "TARGET_APX_EGPR")
+                 (match_test "x86_extended_rex2reg_mentioned_p (op)")))))
+
 (define_register_constraint "h"
  "TARGET_APX_EGPR ? GENERAL_GPR16 : GENERAL_REGS")
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 8ec249b268d..d31c1910026 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -554,7 +554,8 @@ (define_attr "isa" "base,x64,nox64,x64_sse2,x64_sse4,x64_sse4_noavx,
                     avx,noavx,avx2,noavx2,bmi,bmi2,fma4,fma,avx512f,noavx512f,
                     avx512bw,noavx512bw,avx512dq,noavx512dq,fma_or_avx512vl,
                     avx512vl,noavx512vl,avxvnni,avx512vnnivl,avx512fp16,avxifma,
-                    avx512ifmavl,avxneconvert,avx512bf16vl,vpclmulqdqvl"
+                    avx512ifmavl,avxneconvert,avx512bf16vl,vpclmulqdqvl,
+                    avx_noavx512f,avx_noavx512vl"
   (const_string "base"))
 
 ;; The (bounding maximum) length of an instruction immediate.
@@ -908,6 +909,8 @@ (define_attr "enabled" ""
          (eq_attr "isa" "sse4_noavx")
            (symbol_ref "TARGET_SSE4_1 && !TARGET_AVX")
          (eq_attr "isa" "avx") (symbol_ref "TARGET_AVX")
+         (eq_attr "isa" "avx_noavx512f")
+           (symbol_ref "TARGET_AVX && !TARGET_AVX512F")
          (eq_attr "isa" "noavx") (symbol_ref "!TARGET_AVX")
          (eq_attr "isa" "avx2") (symbol_ref "TARGET_AVX2")
          (eq_attr "isa" "noavx2") (symbol_ref "!TARGET_AVX2")
@@ -16665,12 +16668,13 @@ (define_insn "setcc__sse"
   [(set (match_operand:MODEF 0 "register_operand" "=x,x")
         (match_operator:MODEF 3 "sse_comparison_operator"
           [(match_operand:MODEF 1 "register_operand" "0,x")
-           (match_operand:MODEF 2 "nonimmediate_operand" "xm,xm")]))]
+           (match_operand:MODEF 2 "nonimmediate_operand" "xm,xBt")]))]
   "SSE_FLOAT_MODE_P (mode)"
   "@
    cmp%D3\t{%2, %0|%0, %2}
    vcmp%D3\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "isa" "noavx,avx")
+   (set_attr "gpr32" "1,0")
    (set_attr "type" "ssecmp")
    (set_attr "length_immediate" "1")
    (set_attr "prefix" "orig,vex")
@@ -20126,24 +20130,28 @@ (define_insn "*hf"
    (set_attr "mode" "HF")])
 
 (define_insn "*rcpsf2_sse"
-  [(set (match_operand:SF 0 "register_operand" "=x,x,x")
-        (unspec:SF [(match_operand:SF 1 "nonimmediate_operand" "0,x,m")]
+  [(set (match_operand:SF 0 "register_operand" "=x,x,x,x")
+        (unspec:SF [(match_operand:SF 1 "nonimmediate_operand" "0,x,m,BT")]
                    UNSPEC_RCP))]
   "TARGET_SSE && TARGET_SSE_MATH"
   "@
    %vrcpss\t{%d1, %0|%0, %d1}
    %vrcpss\t{%d1, %0|%0, %d1}
-   %vrcpss\t{%1, %d0|%d0, %1}"
-  [(set_attr "type" "sse")
+   rcpss\t{%1, %d0|%d0, %1}
+   vrcpss\t{%1, %d0|%d0, %1}"
+  [(set_attr "isa" "*,*,noavx,avx")
+   (set_attr "gpr32" "1,1,1,0")
+   (set_attr "type" "sse")
+   (set_attr "gpr32" "0")
    (set_attr "atom_sse_attr" "rcp")
    (set_attr "btver2_sse_attr" "rcp")
    (set_attr "prefix" "maybe_vex")
    (set_attr "mode" "SF")
-   (set_attr "avx_partial_xmm_update" "false,false,true")
+   (set_attr "avx_partial_xmm_update" "false,false,true,true")
    (set (attr "preferred_for_speed")
       (cond [(match_test "TARGET_AVX")
                (symbol_ref "true")
-             (eq_attr "alternative" "1,2")
+             (eq_attr "alternative" "1,2,3")
                (symbol_ref "!TARGET_SSE_PARTIAL_REG_DEPENDENCY")
             ]
             (symbol_ref "true")))])
@@ -20386,24 +20394,27 @@ (define_insn "sqrtxf2"
    (set_attr "bdver1_decode" "direct")])
 
 (define_insn "*rsqrtsf2_sse"
-  [(set (match_operand:SF 0 "register_operand" "=x,x,x")
-        (unspec:SF [(match_operand:SF 1 "nonimmediate_operand" "0,x,m")]
+  [(set (match_operand:SF 0 "register_operand" "=x,x,x,x")
+        (unspec:SF [(match_operand:SF 1 "nonimmediate_operand" "0,x,m,BT")]
                    UNSPEC_RSQRT))]
   "TARGET_SSE && TARGET_SSE_MATH"
   "@
    %vrsqrtss\t{%d1, %0|%0, %d1}
    %vrsqrtss\t{%d1, %0|%0, %d1}
-   %vrsqrtss\t{%1, %d0|%d0, %1}"
-  [(set_attr "type" "sse")
+   rsqrtss\t{%1, %d0|%d0, %1}
+   vrsqrtss\t{%1, %d0|%d0, %1}"
+  [(set_attr "isa" "*,*,noavx,avx")
+   (set_attr "gpr32" "1,1,1,0")
+   (set_attr "type" "sse")
    (set_attr "atom_sse_attr" "rcp")
    (set_attr "btver2_sse_attr" "rcp")
    (set_attr "prefix" "maybe_vex")
    (set_attr "mode" "SF")
-   (set_attr "avx_partial_xmm_update" "false,false,true")
+   (set_attr "avx_partial_xmm_update" "false,false,true,true")
    (set (attr "preferred_for_speed")
       (cond [(match_test "TARGET_AVX")
                (symbol_ref "true")
-             (eq_attr "alternative" "1,2")
+             (eq_attr "alternative" "1,2,3")
                (symbol_ref "!TARGET_SSE_PARTIAL_REG_DEPENDENCY")
             ]
             (symbol_ref "true")))])
@@ -22107,14 +22118,17 @@ (define_expand "signbitxf2"
 })
 
 (define_insn "movmsk_df"
-  [(set (match_operand:SI 0 "register_operand" "=r")
+  [(set (match_operand:SI 0 "register_operand" "=r,h")
         (unspec:SI
-          [(match_operand:DF 1 "register_operand" "x")]
+          [(match_operand:DF 1 "register_operand" "x,x")]
           UNSPEC_MOVMSK))]
   "SSE_FLOAT_MODE_P (DFmode) && TARGET_SSE_MATH"
-  "%vmovmskpd\t{%1, %0|%0, %1}"
-  [(set_attr "type" "ssemov")
-   (set_attr "prefix" "maybe_vex")
+  "@
+   movmskpd\t{%1, %0|%0, %1}
+   vmovmskpd\t{%1, %0|%0, %1}"
+  [(set_attr "isa" "noavx,avx")
+   (set_attr "type" "ssemov")
+   (set_attr "prefix" "maybe_evex")
    (set_attr "mode" "DF")])
 
 ;; Use movmskpd in SSE mode to avoid store forwarding stall
diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index 63803c89f2b..9dcb165d270 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -5182,13 +5182,14 @@ (define_expand "usadv8qi"
 })
 
 (define_insn_and_split "mmx_pmovmskb"
-  [(set (match_operand:SI 0 "register_operand" "=r,r")
-        (unspec:SI [(match_operand:V8QI 1 "register_operand" "y,x")]
+  [(set (match_operand:SI 0 "register_operand" "=r,r,h")
+        (unspec:SI [(match_operand:V8QI 1 "register_operand" "y,x,x")]
                    UNSPEC_MOVMSK))]
   "(TARGET_MMX || TARGET_MMX_WITH_SSE)
    && (TARGET_SSE || TARGET_3DNOW_A)"
   "@
    pmovmskb\t{%1, %0|%0, %1}
+   #
    #"
   "TARGET_SSE2 && reload_completed
    && SSE_REGNO_P (REGNO (operands[1]))"
@@ -5203,9 +5204,9 @@ (define_insn_and_split "mmx_pmovmskb"
   operands[2] = lowpart_subreg (QImode, operands[0],
                                 GET_MODE (operands[0]));
 }
-  [(set_attr "mmx_isa" "native,sse")
-   (set_attr "type" "mmxcvt,ssemov")
-   (set_attr "mode" "DI,TI")])
+  [(set_attr "mmx_isa" "native,sse_noavx,avx")
+   (set_attr "type" "mmxcvt,ssemov,ssemov")
+   (set_attr "mode" "DI,TI,TI")])
 
 (define_expand "mmx_maskmovq"
   [(set (match_operand:V8QI 0 "memory_operand")
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 4913c34ed37..4b6bed36061 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -1845,12 +1845,16 @@ (define_peephole2
   "operands[4] = adjust_address (operands[0], V2DFmode, 0);")
 
 (define_insn "_lddqu"
-  [(set (match_operand:VI1 0 "register_operand" "=x")
-        (unspec:VI1 [(match_operand:VI1 1 "memory_operand" "m")]
+  [(set (match_operand:VI1 0 "register_operand" "=x,x")
+        (unspec:VI1 [(match_operand:VI1 1 "memory_operand" "m,Bt")]
                     UNSPEC_LDDQU))]
   "TARGET_SSE3"
-  "%vlddqu\t{%1, %0|%0, %1}"
-  [(set_attr "type" "ssemov")
+  "@
+   lddqu\t{%1, %0|%0, %1}
+   vlddqu\t{%1, %0|%0, %1}"
+  [(set_attr "isa" "noavx,avx")
+   (set_attr "type" "ssemov")
+   (set_attr "gpr32" "1,0")
    (set_attr "movu" "1")
    (set (attr "prefix_data16")
      (if_then_else
@@ -2519,12 +2523,16 @@ (define_insn "_div3"
    (set_attr "mode" "")])
 (define_insn "_rcp2"
-  [(set (match_operand:VF1_128_256 0 "register_operand" "=x")
+  [(set (match_operand:VF1_128_256 0 "register_operand" "=x,x")
         (unspec:VF1_128_256
-          [(match_operand:VF1_128_256 1 "vector_operand" "xBm")] UNSPEC_RCP))]
+          [(match_operand:VF1_128_256 1 "vector_operand" "xBm,xBT")] UNSPEC_RCP))]
   "TARGET_SSE"
-  "%vrcpps\t{%1, %0|%0, %1}"
-  [(set_attr "type" "sse")
+  "@
+   rcpps\t{%1, %0|%0, %1}
+   vrcpps\t{%1, %0|%0, %1}"
+  [(set_attr "isa" "noavx,avx")
+   (set_attr "type" "sse")
+   (set_attr "gpr32" "1,0")
    (set_attr "atom_sse_attr" "rcp")
    (set_attr "btver2_sse_attr" "rcp")
    (set_attr "prefix" "maybe_vex")
@@ -2543,6 +2551,7 @@ (define_insn "sse_vmrcpv4sf2"
    vrcpss\t{%1, %2, %0|%0, %2, %k1}"
   [(set_attr "isa" "noavx,avx")
    (set_attr "type" "sse")
+   (set_attr "gpr32" "1,0")
    (set_attr "atom_sse_attr" "rcp")
    (set_attr "btver2_sse_attr" "rcp")
    (set_attr "prefix" "orig,vex")
@@ -2562,6 +2571,7 @@ (define_insn "*sse_vmrcpv4sf2"
    vrcpss\t{%1, %2, %0|%0, %2, %1}"
   [(set_attr "isa" "noavx,avx")
    (set_attr "type" "sse")
+   (set_attr "gpr32" "1,0")
    (set_attr "atom_sse_attr" "rcp")
    (set_attr "btver2_sse_attr" "rcp")
    (set_attr "prefix" "orig,vex")
@@ -2738,12 +2748,16 @@ (define_expand "rsqrt2"
   "TARGET_AVX512FP16")
 
 (define_insn "_rsqrt2"
-  [(set (match_operand:VF1_128_256 0 "register_operand" "=x")
+  [(set (match_operand:VF1_128_256 0 "register_operand" "=x,x")
         (unspec:VF1_128_256
-          [(match_operand:VF1_128_256 1 "vector_operand" "xBm")] UNSPEC_RSQRT))]
+          [(match_operand:VF1_128_256 1 "vector_operand" "xBm,xBT")] UNSPEC_RSQRT))]
   "TARGET_SSE"
-  "%vrsqrtps\t{%1, %0|%0, %1}"
-  [(set_attr "type" "sse")
+  "@
+   rsqrtps\t{%1, %0|%0, %1}
+   vrsqrtps\t{%1, %0|%0, %1}"
+  [(set_attr "isa" "noavx,avx")
+   (set_attr "type" "sse")
+   (set_attr "gpr32" "1,0")
    (set_attr "prefix" "maybe_vex")
    (set_attr "mode" "")])
 
@@ -2802,7 +2816,7 @@ (define_insn "rsqrt14__mask"
 (define_insn "sse_vmrsqrtv4sf2"
   [(set (match_operand:V4SF 0 "register_operand" "=x,x")
         (vec_merge:V4SF
-          (unspec:V4SF [(match_operand:V4SF 1 "nonimmediate_operand" "xm,xm")]
+          (unspec:V4SF [(match_operand:V4SF 1 "nonimmediate_operand" "xm,xBt")]
                        UNSPEC_RSQRT)
           (match_operand:V4SF 2 "register_operand" "0,x")
           (const_int 1)))]
@@ -2812,6 +2826,7 @@ (define_insn "sse_vmrsqrtv4sf2"
    vrsqrtss\t{%1, %2, %0|%0, %2, %k1}"
   [(set_attr "isa" "noavx,avx")
    (set_attr "type" "sse")
+   (set_attr "gpr32" "1,0")
    (set_attr "prefix" "orig,vex")
    (set_attr "mode" "SF")])
 
@@ -2819,7 +2834,7 @@ (define_insn "*sse_vmrsqrtv4sf2"
   [(set (match_operand:V4SF 0 "register_operand" "=x,x")
         (vec_merge:V4SF
           (vec_duplicate:V4SF
-            (unspec:SF [(match_operand:SF 1 "nonimmediate_operand" "xm,xm")]
+            (unspec:SF [(match_operand:SF 1 "nonimmediate_operand" "xm,xBt")]
                          UNSPEC_RSQRT))
           (match_operand:V4SF 2 "register_operand" "0,x")
           (const_int 1)))]
@@ -2829,6 +2844,7 @@ (define_insn "*sse_vmrsqrtv4sf2"
    vrsqrtss\t{%1, %2, %0|%0, %2, %1}"
   [(set_attr "isa" "noavx,avx")
    (set_attr "type" "sse")
+   (set_attr "gpr32" "1,0")
    (set_attr "prefix" "orig,vex")
    (set_attr "mode" "SF")])
 
@@ -3004,7 +3020,7 @@ (define_insn "vec_addsub3"
         (vec_merge:VF_128_256
           (minus:VF_128_256
             (match_operand:VF_128_256 1 "register_operand" "0,x")
-            (match_operand:VF_128_256 2 "vector_operand" "xBm, xm"))
+            (match_operand:VF_128_256 2 "vector_operand" "xBm, xBt"))
           (plus:VF_128_256 (match_dup 1) (match_dup 2))
           (const_int )))]
   "TARGET_SSE3"
@@ -3013,6 +3029,7 @@ (define_insn "vec_addsub3"
    vaddsub\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "isa" "noavx,avx")
    (set_attr "type" "sseadd")
+   (set_attr "gpr32" "1,0")
    (set (attr "atom_unit")
      (if_then_else
        (match_test "mode == V2DFmode")
@@ -3156,7 +3173,7 @@ (define_insn "avx_hv4df3"
               (vec_select:DF (match_dup 1) (parallel [(const_int 1)])))
             (plusminus:DF
               (vec_select:DF
-                (match_operand:V4DF 2 "nonimmediate_operand" "xm")
+                (match_operand:V4DF 2 "nonimmediate_operand" "xBt")
                 (parallel [(const_int 0)]))
               (vec_select:DF (match_dup 2) (parallel [(const_int 1)]))))
           (vec_concat:V2DF
@@ -3169,6 +3186,7 @@ (define_insn "avx_hv4df3"
   "TARGET_AVX"
   "vhpd\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "type" "sseadd")
+   (set_attr "gpr32" "0")
    (set_attr "prefix" "vex")
    (set_attr "mode" "V4DF")])
 
@@ -3199,7 +3217,7 @@ (define_insn "*sse3_haddv2df3"
               (parallel [(match_operand:SI 4 "const_0_to_1_operand")])))
           (plus:DF
             (vec_select:DF
-              (match_operand:V2DF 2 "vector_operand" "xBm,xm")
+              (match_operand:V2DF 2 "vector_operand" "xBm,xBt")
               (parallel [(match_operand:SI 5 "const_0_to_1_operand")]))
             (vec_select:DF
               (match_dup 2)
@@ -3211,6 +3229,7 @@ (define_insn "*sse3_haddv2df3"
    haddpd\t{%2, %0|%0, %2}
    vhaddpd\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "isa" "noavx,avx")
+   (set_attr "gpr32" "1,0")
    (set_attr "type" "sseadd")
    (set_attr "prefix" "orig,vex")
    (set_attr "mode" "V2DF")])
@@ -3225,7 +3244,7 @@ (define_insn "sse3_hsubv2df3"
             (vec_select:DF (match_dup 1) (parallel [(const_int 1)])))
           (minus:DF
             (vec_select:DF
-              (match_operand:V2DF 2 "vector_operand" "xBm,xm")
+              (match_operand:V2DF 2 "vector_operand" "xBm,xBt")
               (parallel [(const_int 0)]))
             (vec_select:DF (match_dup 2) (parallel [(const_int 1)])))))]
   "TARGET_SSE3"
@@ -3234,6 +3253,7 @@ (define_insn "sse3_hsubv2df3"
    vhsubpd\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "isa" "noavx,avx")
    (set_attr "type" "sseadd")
+   (set_attr "gpr32" "1,0")
    (set_attr "prefix" "orig,vex")
    (set_attr "mode" "V2DF")])
 
@@ -3290,7 +3310,7 @@ (define_insn "avx_hv8sf3"
           (vec_concat:V2SF
             (plusminus:SF
               (vec_select:SF
-                (match_operand:V8SF 2 "nonimmediate_operand" "xm")
+                (match_operand:V8SF 2 "nonimmediate_operand" "xBt")
                 (parallel [(const_int 0)]))
               (vec_select:SF (match_dup 2) (parallel [(const_int 1)])))
             (plusminus:SF
@@ -3314,6 +3334,7 @@ (define_insn "avx_hv8sf3"
   "TARGET_AVX"
   "vhps\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "type" "sseadd")
+   (set_attr "gpr32" "0")
    (set_attr "prefix" "vex")
    (set_attr "mode" "V8SF")])
 
@@ -3332,7 +3353,7 @@ (define_insn "sse3_hv4sf3"
           (vec_concat:V2SF
             (plusminus:SF
               (vec_select:SF
-                (match_operand:V4SF 2 "vector_operand" "xBm,xm")
+                (match_operand:V4SF 2 "vector_operand" "xBm,xBt")
                 (parallel [(const_int 0)]))
               (vec_select:SF (match_dup 2) (parallel [(const_int 1)])))
             (plusminus:SF
@@ -3344,6 +3365,7 @@ (define_insn "sse3_hv4sf3"
    vhps\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "isa" "noavx,avx")
    (set_attr "type" "sseadd")
+   (set_attr "gpr32" "1,0")
    (set_attr "atom_unit" "complex")
    (set_attr "prefix" "orig,vex")
    (set_attr "prefix_rep" "1,*")
@@ -3537,12 +3559,13 @@ (define_insn "avx_cmp3"
   [(set (match_operand:VF_128_256 0 "register_operand" "=x")
         (unspec:VF_128_256
           [(match_operand:VF_128_256 1 "register_operand" "x")
-           (match_operand:VF_128_256 2 "nonimmediate_operand" "xm")
+           (match_operand:VF_128_256 2 "nonimmediate_operand" "xBt")
            (match_operand:SI 3 "const_0_to_31_operand")]
           UNSPEC_PCMP))]
   "TARGET_AVX"
   "vcmp\t{%3, %2, %1, %0|%0, %1, %2, %3}"
   [(set_attr "type" "ssecmp")
+   (set_attr "gpr32" "0")
    (set_attr "length_immediate" "1")
    (set_attr "prefix" "vex")
    (set_attr "mode" "")])
@@ -3748,7 +3771,7 @@ (define_insn "avx_vmcmp3"
         (vec_merge:VF_128
           (unspec:VF_128
             [(match_operand:VF_128 1 "register_operand" "x")
-             (match_operand:VF_128 2 "nonimmediate_operand" "xm")
+             (match_operand:VF_128 2 "nonimmediate_operand" "xBt")
              (match_operand:SI 3 "const_0_to_31_operand")]
             UNSPEC_PCMP)
          (match_dup 1)
@@ -3756,6 +3779,7 @@ (define_insn "avx_vmcmp3"
   "TARGET_AVX"
   "vcmp\t{%3, %2, %1, %0|%0, %1, %2, %3}"
   [(set_attr "type" "ssecmp")
+   (set_attr "gpr32" "0")
    (set_attr "length_immediate" "1")
    (set_attr "prefix" "vex")
    (set_attr "mode" "")])
@@ -3764,13 +3788,14 @@ (define_insn "*_maskcmp3_comm"
   [(set (match_operand:VF_128_256 0 "register_operand" "=x,x")
         (match_operator:VF_128_256 3 "sse_comparison_operator"
           [(match_operand:VF_128_256 1 "register_operand" "%0,x")
-           (match_operand:VF_128_256 2 "vector_operand" "xBm,xm")]))]
+           (match_operand:VF_128_256 2 "vector_operand" "xBm,xBt")]))]
   "TARGET_SSE
    && GET_RTX_CLASS (GET_CODE (operands[3])) == RTX_COMM_COMPARE"
   "@
    cmp%D3\t{%2, %0|%0, %2}
    vcmp%D3\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "isa" "noavx,avx")
+   (set_attr "gpr32" "1,0")
    (set_attr "type" "ssecmp")
    (set_attr "length_immediate" "1")
    (set_attr "prefix" "orig,vex")
@@ -3780,12 +3805,13 @@ (define_insn "_maskcmp3"
   [(set (match_operand:VF_128_256 0 "register_operand" "=x,x")
         (match_operator:VF_128_256 3 "sse_comparison_operator"
           [(match_operand:VF_128_256 1 "register_operand" "0,x")
-           (match_operand:VF_128_256 2 "vector_operand" "xBm,xm")]))]
+           (match_operand:VF_128_256 2 "vector_operand" "xBm,xBt")]))]
   "TARGET_SSE"
   "@
    cmp%D3\t{%2, %0|%0, %2}
    vcmp%D3\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "isa" "noavx,avx")
+   (set_attr "gpr32" "1,0")
    (set_attr "type" "ssecmp")
    (set_attr "length_immediate" "1")
    (set_attr "prefix" "orig,vex")
@@ -3796,7 +3822,7 @@ (define_insn "_vmmaskcmp3"
         (vec_merge:VF_128
          (match_operator:VF_128 3 "sse_comparison_operator"
            [(match_operand:VF_128 1 "register_operand" "0,x")
-            (match_operand:VF_128 2 "nonimmediate_operand" "xm,xm")])
+            (match_operand:VF_128 2 "nonimmediate_operand" "xm,xBt")])
          (match_dup 1)
          (const_int 1)))]
   "TARGET_SSE"
@@ -3804,6 +3830,7 @@ (define_insn "_vmmaskcmp3"
    cmp%D3\t{%2, %0|%0, %2}
    vcmp%D3\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "isa" "noavx,avx")
+   (set_attr "gpr32" "1,0")
    (set_attr "type" "ssecmp")
    (set_attr "length_immediate" "1,*")
    (set_attr "prefix" "orig,vex")
@@ -4721,7 +4748,7 @@ (define_insn "_andnot3"
         (and:VFB_128_256
           (not:VFB_128_256
             (match_operand:VFB_128_256 1 "register_operand" "0,x,v,v"))
-          (match_operand:VFB_128_256 2 "vector_operand" "xBm,xm,vm,vm")))]
+          (match_operand:VFB_128_256 2 "vector_operand" "xBm,xBt,vm,vm")))]
   "TARGET_SSE &&
    && (! || mode != HFmode)"
 {
@@ -4765,7 +4792,8 @@ (define_insn "_andnot3"
   output_asm_insn (buf, operands);
   return "";
 }
-  [(set_attr "isa" "noavx,avx,avx512dq,avx512f")
+  [(set_attr "isa" "noavx,avx_noavx512f,avx512dq,avx512f")
+   (set_attr "gpr32" "1,0,1,1")
    (set_attr "type" "sselog")
    (set_attr "prefix" "orig,maybe_vex,evex,evex")
    (set (attr "mode")
@@ -5075,7 +5103,7 @@ (define_insn "*andnot3"
   [(set (match_operand:ANDNOT_MODE 0 "register_operand" "=x,x,v,v")
         (and:ANDNOT_MODE
           (not:ANDNOT_MODE (match_operand:ANDNOT_MODE 1 "register_operand" "0,x,v,v"))
-          (match_operand:ANDNOT_MODE 2 "vector_operand" "xBm,xm,vm,v")))]
+          (match_operand:ANDNOT_MODE 2 "vector_operand" "xBm,xBt,vm,v")))]
   "TARGET_SSE"
 {
   char buf[128];
@@ -5104,7 +5132,8 @@ (define_insn "*andnot3"
   output_asm_insn (buf, operands);
   return "";
 }
-  [(set_attr "isa" "noavx,avx,avx512vl,avx512f")
+  [(set_attr "isa" "noavx,avx_noavx512f,avx512vl,avx512f")
+   (set_attr "gpr32" "1,0,1,1")
    (set_attr "type" "sselog")
    (set (attr "prefix_data16")
      (if_then_else
@@ -12246,7 +12275,7 @@ (define_insn_and_split "vec_extract_lo_v32qi"
   "operands[1] = gen_lowpart (V16QImode, operands[1]);")
 
 (define_insn "vec_extract_hi_v32qi"
-  [(set (match_operand:V16QI 0 "nonimmediate_operand" "=xm,vm")
+  [(set (match_operand:V16QI 0 "nonimmediate_operand" "=xBt,vm")
         (vec_select:V16QI
           (match_operand:V32QI 1 "register_operand" "x,v")
           (parallel [(const_int 16) (const_int 17)
@@ -12264,7 +12293,8 @@ (define_insn "vec_extract_hi_v32qi"
   [(set_attr "type" "sselog1")
    (set_attr "prefix_extra" "1")
    (set_attr "length_immediate" "1")
-   (set_attr "isa" "*,avx512vl")
+   (set_attr "isa" "noavx512vl,avx512vl")
+   (set_attr "gpr32" "0,1")
    (set_attr "prefix" "vex,evex")
    (set_attr "mode" "OI")])
 
@@ -17135,6 +17165,7 @@ (define_insn "*sse2_gt3"
    pcmpgt\t{%2, %0|%0, %2}
    vpcmpgt\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "isa" "noavx,avx")
+   (set_attr "gpr32" "1,0")
    (set_attr "type" "ssecmp")
    (set_attr "prefix" "orig,vex")
    (set_attr "mode" "TI")])
@@ -17451,7 +17482,7 @@ (define_insn "*andnot3"
   [(set (match_operand:VI 0 "register_operand" "=x,x,v,v,v")
         (and:VI
           (not:VI (match_operand:VI 1 "bcst_vector_operand" "0,x,v,m,Br"))
-          (match_operand:VI 2 "bcst_vector_operand" "xBm,xm,vmBr,0,0")))]
+          (match_operand:VI 2 "bcst_vector_operand" "xBm,xBt,vmBr,0,0")))]
   "TARGET_SSE
    && (register_operand (operands[1], mode)
        || register_operand (operands[2], mode))"
@@ -17538,7 +17569,8 @@ (define_insn "*andnot3"
   output_asm_insn (buf, operands);
   return "";
 }
-  [(set_attr "isa" "noavx,avx,avx,*,*")
+  [(set_attr "isa" "noavx,avx_noavx512f,avx512f,*,*")
+   (set_attr "gpr32" "1,0,1,1,1")
    (set_attr "type" "sselog")
    (set (attr "prefix_data16")
      (if_then_else
@@ -17693,7 +17725,7 @@ (define_insn "*3"
   [(set (match_operand:VI48_AVX_AVX512F 0 "register_operand" "=x,x,v")
         (any_logic:VI48_AVX_AVX512F
           (match_operand:VI48_AVX_AVX512F 1 "bcst_vector_operand" "%0,x,v")
-          (match_operand:VI48_AVX_AVX512F 2 "bcst_vector_operand" "xBm,xm,vmBr")))]
+          (match_operand:VI48_AVX_AVX512F 2 "bcst_vector_operand" "xBm,xBt,vmBr")))]
   "TARGET_SSE &&
    && ix86_binary_operator_ok (, mode, operands)"
 {
@@ -17723,9 +17755,11 @@ (define_insn "*3"
         case E_V4DImode:
         case E_V4SImode:
         case E_V2DImode:
-          ssesuffix = (TARGET_AVX512VL
-                       && ( || which_alternative == 2)
-                       ? "" : "");
+          ssesuffix = ((TARGET_AVX512VL
+                        && ( || which_alternative == 2))
+                       || (MEM_P (operands[2]) && which_alternative == 2
+                           && x86_extended_rex2reg_mentioned_p (operands[2])))
+                       ? "" : "";
           break;
         default:
           gcc_unreachable ();
@@ -17765,7 +17799,8 @@ (define_insn "*3"
   output_asm_insn (buf, operands);
   return "";
 }
-  [(set_attr "isa" "noavx,avx,avx")
+  [(set_attr "isa" "noavx,avx_noavx512f,avx512f")
+   (set_attr "gpr32" "1,0,1")
    (set_attr "type" "sselog")
    (set (attr "prefix_data16")
      (if_then_else
@@ -17792,7 +17827,7 @@ (define_insn "*3"
   [(set (match_operand:VI12_AVX_AVX512F 0 "register_operand" "=x,x,v")
         (any_logic:VI12_AVX_AVX512F
           (match_operand:VI12_AVX_AVX512F 1 "vector_operand" "%0,x,v")
-          (match_operand:VI12_AVX_AVX512F 2 "vector_operand" "xBm,xm,vm")))]
+          (match_operand:VI12_AVX_AVX512F 2 "vector_operand" "xBm,xBt,vm")))]
   "TARGET_SSE && !(MEM_P (operands[1]) && MEM_P (operands[2]))"
 {
   char buf[64];
@@ -17821,7 +17856,10 @@ (define_insn "*3"
         case E_V16HImode:
         case E_V16QImode:
         case E_V8HImode:
-          ssesuffix = TARGET_AVX512VL && which_alternative == 2 ? "q" : "";
+          ssesuffix = (((TARGET_AVX512VL && which_alternative == 2)
+                       || (MEM_P (operands[2]) && which_alternative == 2
+                           && x86_extended_rex2reg_mentioned_p (operands[2]))))
+                       ? "q" : "";
           break;
         default:
           gcc_unreachable ();
@@ -17858,7 +17896,8 @@ (define_insn "*3"
   output_asm_insn (buf, operands);
   return "";
 }
-  [(set_attr "isa" "noavx,avx,avx")
+  [(set_attr "isa" "noavx,avx_noavx512f,avx512f")
+   (set_attr "gpr32" "1,0,1")
    (set_attr "type" "sselog")
    (set (attr "prefix_data16")
      (if_then_else
@@ -17885,13 +17924,14 @@ (define_insn "v1ti3"
   [(set (match_operand:V1TI 0 "register_operand" "=x,x,v")
         (any_logic:V1TI
           (match_operand:V1TI 1 "register_operand" "%0,x,v")
-          (match_operand:V1TI 2 "vector_operand" "xBm,xm,vm")))]
+          (match_operand:V1TI 2 "vector_operand" "xBm,xBt,vm")))]
   "TARGET_SSE2"
   "@
    p\t{%2, %0|%0, %2}
    vp\t{%2, %1, %0|%0, %1, %2}
    vpd\t{%2, %1, %0|%0, %1, %2}"
-  [(set_attr "isa" "noavx,avx,avx512vl")
+  [(set_attr "isa" "noavx,avx_noavx512vl,avx512vl")
+   (set_attr "gpr32" "1,0,1")
    (set_attr "prefix" "orig,vex,evex")
    (set_attr "prefix_data16" "1,*,*")
    (set_attr "type" "sselog")
@@ -20878,33 +20918,39 @@ (define_insn "*_psadbw"
    (set_attr "mode" "")])
 
 (define_insn "_movmsk"
-  [(set (match_operand:SI 0 "register_operand" "=r")
+  [(set (match_operand:SI 0 "register_operand" "=r,h")
         (unspec:SI
-          [(match_operand:VF_128_256 1 "register_operand" "x")]
+          [(match_operand:VF_128_256 1 "register_operand" "x,x")]
           UNSPEC_MOVMSK))]
   "TARGET_SSE"
-  "%vmovmsk\t{%1, %0|%0, %1}"
-  [(set_attr "type" "ssemov")
-   (set_attr "prefix" "maybe_vex")
+  "@
+   movmsk\t{%1, %0|%0, %1}
+   vmovmsk\t{%1, %0|%0, %1}"
+  [(set_attr "isa" "noavx,avx")
+   (set_attr "type" "ssemov")
+   (set_attr "prefix" "maybe_evex")
    (set_attr "mode" "")])
 
 (define_insn "*_movmsk_ext"
-  [(set (match_operand:DI 0 "register_operand" "=r")
+  [(set (match_operand:DI 0 "register_operand" "=r,h")
         (any_extend:DI
           (unspec:SI
-            [(match_operand:VF_128_256 1 "register_operand" "x")]
+            [(match_operand:VF_128_256 1 "register_operand" "x,x")]
             UNSPEC_MOVMSK)))]
   "TARGET_64BIT && TARGET_SSE"
-  "%vmovmsk\t{%1, %k0|%k0, %1}"
-  [(set_attr "type" "ssemov")
-   (set_attr "prefix" "maybe_vex")
+  "@
+   movmsk\t{%1, %0|%0, %1}
+ vmovmsk\t{%1, %0|%0, %1}" + [(set_attr "isa" "noavx,avx") + (set_attr "type" "ssemov") + (set_attr "prefix" "maybe_evex") (set_attr "mode" "")]) (define_insn_and_split "*_movmsk_lt" - [(set (match_operand:SI 0 "register_operand" "=r") + [(set (match_operand:SI 0 "register_operand" "=r,h") (unspec:SI [(lt:VF_128_256 - (match_operand: 1 "register_operand" "x") + (match_operand: 1 "register_operand" "x,x") (match_operand: 2 "const0_operand"))] UNSPEC_MOVMSK))] "TARGET_SSE" @@ -20913,16 +20959,17 @@ (define_insn_and_split "*_movmsk_lt" [(set (match_dup 0) (unspec:SI [(match_dup 1)] UNSPEC_MOVMSK))] "operands[1] = gen_lowpart (mode, operands[1]);" - [(set_attr "type" "ssemov") + [(set_attr "isa" "noavx,avx") + (set_attr "type" "ssemov") (set_attr "prefix" "maybe_vex") (set_attr "mode" "")]) (define_insn_and_split "*_movmsk_ext_lt" - [(set (match_operand:DI 0 "register_operand" "=r") + [(set (match_operand:DI 0 "register_operand" "=r,h") (any_extend:DI (unspec:SI [(lt:VF_128_256 - (match_operand: 1 "register_operand" "x") + (match_operand: 1 "register_operand" "x,x") (match_operand: 2 "const0_operand"))] UNSPEC_MOVMSK)))] "TARGET_64BIT && TARGET_SSE" @@ -20931,16 +20978,17 @@ (define_insn_and_split "*_movmsk_ext_lt" [(set (match_dup 0) (any_extend:DI (unspec:SI [(match_dup 1)] UNSPEC_MOVMSK)))] "operands[1] = gen_lowpart (mode, operands[1]);" - [(set_attr "type" "ssemov") + [(set_attr "isa" "noavx,avx") + (set_attr "type" "ssemov") (set_attr "prefix" "maybe_vex") (set_attr "mode" "")]) (define_insn_and_split "*_movmsk_shift" - [(set (match_operand:SI 0 "register_operand" "=r") + [(set (match_operand:SI 0 "register_operand" "=r,h") (unspec:SI [(subreg:VF_128_256 (ashiftrt: - (match_operand: 1 "register_operand" "x") + (match_operand: 1 "register_operand" "x,x") (match_operand:QI 2 "const_int_operand")) 0)] UNSPEC_MOVMSK))] "TARGET_SSE" @@ -20949,17 +20997,18 @@ (define_insn_and_split "*_movmsk_shift" [(set (match_dup 0) (unspec:SI [(match_dup 1)] UNSPEC_MOVMSK))] 
"operands[1] = gen_lowpart (mode, operands[1]);" - [(set_attr "type" "ssemov") + [(set_attr "isa" "noavx,avx") + (set_attr "type" "ssemov") (set_attr "prefix" "maybe_vex") (set_attr "mode" "")]) (define_insn_and_split "*_movmsk_ext_shift" - [(set (match_operand:DI 0 "register_operand" "=r") + [(set (match_operand:DI 0 "register_operand" "=r,h") (any_extend:DI (unspec:SI [(subreg:VF_128_256 (ashiftrt: - (match_operand: 1 "register_operand" "x") + (match_operand: 1 "register_operand" "x,x") (match_operand:QI 2 "const_int_operand")) 0)] UNSPEC_MOVMSK)))] "TARGET_64BIT && TARGET_SSE" @@ -20968,18 +21017,22 @@ (define_insn_and_split "*_movmsk_ext_shift [(set (match_dup 0) (any_extend:DI (unspec:SI [(match_dup 1)] UNSPEC_MOVMSK)))] "operands[1] = gen_lowpart (mode, operands[1]);" - [(set_attr "type" "ssemov") + [(set_attr "isa" "noavx,avx") + (set_attr "type" "ssemov") (set_attr "prefix" "maybe_vex") (set_attr "mode" "")]) (define_insn "_pmovmskb" - [(set (match_operand:SI 0 "register_operand" "=r") + [(set (match_operand:SI 0 "register_operand" "=r,h") (unspec:SI - [(match_operand:VI1_AVX2 1 "register_operand" "x")] + [(match_operand:VI1_AVX2 1 "register_operand" "x,x")] UNSPEC_MOVMSK))] "TARGET_SSE2" - "%vpmovmskb\t{%1, %0|%0, %1}" - [(set_attr "type" "ssemov") + "@ + pmovmskb\t{%1, %0|%0, %1} + vpmovmskb\t{%1, %0|%0, %1}" + [(set_attr "isa" "noavx,avx") + (set_attr "type" "ssemov") (set (attr "prefix_data16") (if_then_else (match_test "TARGET_AVX") @@ -20989,14 +21042,17 @@ (define_insn "_pmovmskb" (set_attr "mode" "SI")]) (define_insn "*_pmovmskb_zext" - [(set (match_operand:DI 0 "register_operand" "=r") + [(set (match_operand:DI 0 "register_operand" "=r,h") (zero_extend:DI (unspec:SI - [(match_operand:VI1_AVX2 1 "register_operand" "x")] + [(match_operand:VI1_AVX2 1 "register_operand" "x,x")] UNSPEC_MOVMSK)))] "TARGET_64BIT && TARGET_SSE2" - "%vpmovmskb\t{%1, %k0|%k0, %1}" - [(set_attr "type" "ssemov") + "@ + pmovmskb\t{%1, %k0|%k0, %1} + vpmovmskb\t{%1, %k0|%k0, 
%1}"
+  [(set_attr "isa" "noavx,avx")
+   (set_attr "type" "ssemov")
    (set (attr "prefix_data16")
      (if_then_else
        (match_test "TARGET_AVX")
@@ -21006,14 +21062,17 @@ (define_insn "*_pmovmskb_zext"
    (set_attr "mode" "SI")])

 (define_insn "*sse2_pmovmskb_ext"
-  [(set (match_operand:DI 0 "register_operand" "=r")
+  [(set (match_operand:DI 0 "register_operand" "=r,h")
         (sign_extend:DI
           (unspec:SI
-            [(match_operand:V16QI 1 "register_operand" "x")]
+            [(match_operand:V16QI 1 "register_operand" "x,x")]
             UNSPEC_MOVMSK)))]
   "TARGET_64BIT && TARGET_SSE2"
-  "%vpmovmskb\t{%1, %k0|%k0, %1}"
-  [(set_attr "type" "ssemov")
+  "@
+   pmovmskb\t{%1, %k0|%k0, %1}
+   vpmovmskb\t{%1, %k0|%k0, %1}"
+  [(set_attr "isa" "noavx,avx")
+   (set_attr "type" "ssemov")
    (set (attr "prefix_data16")
      (if_then_else
        (match_test "TARGET_AVX")
@@ -21098,9 +21157,9 @@ (define_split
 })

 (define_insn_and_split "*_pmovmskb_lt"
-  [(set (match_operand:SI 0 "register_operand" "=r")
+  [(set (match_operand:SI 0 "register_operand" "=r,h")
         (unspec:SI
-          [(lt:VI1_AVX2 (match_operand:VI1_AVX2 1 "register_operand" "x")
+          [(lt:VI1_AVX2 (match_operand:VI1_AVX2 1 "register_operand" "x,x")
                         (match_operand:VI1_AVX2 2 "const0_operand"))]
           UNSPEC_MOVMSK))]
   "TARGET_SSE2"
@@ -21109,7 +21168,8 @@ (define_insn_and_split "*_pmovmskb_lt"
   [(set (match_dup 0)
         (unspec:SI [(match_dup 1)] UNSPEC_MOVMSK))]
   ""
-  [(set_attr "type" "ssemov")
+  [(set_attr "isa" "noavx,avx")
+   (set_attr "type" "ssemov")
    (set (attr "prefix_data16")
      (if_then_else
        (match_test "TARGET_AVX")
@@ -21119,10 +21179,10 @@ (define_insn_and_split "*_pmovmskb_lt"
    (set_attr "mode" "SI")])

 (define_insn_and_split "*_pmovmskb_zext_lt"
-  [(set (match_operand:DI 0 "register_operand" "=r")
+  [(set (match_operand:DI 0 "register_operand" "=r,h")
         (zero_extend:DI
           (unspec:SI
-            [(lt:VI1_AVX2 (match_operand:VI1_AVX2 1 "register_operand" "x")
+            [(lt:VI1_AVX2 (match_operand:VI1_AVX2 1 "register_operand" "x,x")
                           (match_operand:VI1_AVX2 2 "const0_operand"))]
             UNSPEC_MOVMSK)))]
   "TARGET_64BIT && TARGET_SSE2"
@@
-21131,7 +21191,8 @@ (define_insn_and_split "*_pmovmskb_zext_lt"
   [(set (match_dup 0)
         (zero_extend:DI (unspec:SI [(match_dup 1)] UNSPEC_MOVMSK)))]
   ""
-  [(set_attr "type" "ssemov")
+  [(set_attr "isa" "noavx,avx")
+   (set_attr "type" "ssemov")
    (set (attr "prefix_data16")
      (if_then_else
        (match_test "TARGET_AVX")
@@ -21141,10 +21202,10 @@ (define_insn_and_split "*_pmovmskb_zext_lt"
    (set_attr "mode" "SI")])

 (define_insn_and_split "*sse2_pmovmskb_ext_lt"
-  [(set (match_operand:DI 0 "register_operand" "=r")
+  [(set (match_operand:DI 0 "register_operand" "=r,h")
         (sign_extend:DI
           (unspec:SI
-            [(lt:V16QI (match_operand:V16QI 1 "register_operand" "x")
+            [(lt:V16QI (match_operand:V16QI 1 "register_operand" "x,x")
                        (match_operand:V16QI 2 "const0_operand"))]
             UNSPEC_MOVMSK)))]
   "TARGET_64BIT && TARGET_SSE2"
@@ -21153,7 +21214,8 @@ (define_insn_and_split "*sse2_pmovmskb_ext_lt"
   [(set (match_dup 0)
         (sign_extend:DI (unspec:SI [(match_dup 1)] UNSPEC_MOVMSK)))]
   ""
-  [(set_attr "type" "ssemov")
+  [(set_attr "isa" "noavx,avx")
+   (set_attr "type" "ssemov")
    (set (attr "prefix_data16")
      (if_then_else
        (match_test "TARGET_AVX")
@@ -21214,21 +21276,28 @@ (define_insn "*sse2_maskmovdqu"
    (set_attr "mode" "TI")])

 (define_insn "sse_ldmxcsr"
-  [(unspec_volatile [(match_operand:SI 0 "memory_operand" "m")]
+  [(unspec_volatile [(match_operand:SI 0 "memory_operand" "m,Bt")]
                     UNSPECV_LDMXCSR)]
   "TARGET_SSE"
-  "%vldmxcsr\t%0"
-  [(set_attr "type" "sse")
+  "@
+   ldmxcsr\t%0
+   vldmxcsr\t%0"
+  [(set_attr "isa" "noavx,avx")
+   (set_attr "type" "sse")
+   (set_attr "gpr32" "1,0")
    (set_attr "atom_sse_attr" "mxcsr")
    (set_attr "prefix" "maybe_vex")
    (set_attr "memory" "load")])

 (define_insn "sse_stmxcsr"
-  [(set (match_operand:SI 0 "memory_operand" "=m")
+  [(set (match_operand:SI 0 "memory_operand" "=m,Bt")
         (unspec_volatile:SI [(const_int 0)] UNSPECV_STMXCSR))]
   "TARGET_SSE"
-  "%vstmxcsr\t%0"
+  "@
+   stmxcsr\t%0
+   vstmxcsr\t%0"
   [(set_attr "type" "sse")
+   (set_attr "gpr32" "0")
    (set_attr "atom_sse_attr" "mxcsr")
    (set_attr
"prefix" "maybe_vex") (set_attr "memory" "store")]) @@ -23890,11 +23959,12 @@ (define_expand "v2siv2di2" (define_insn "avx_vtest" [(set (reg:CC FLAGS_REG) (unspec:CC [(match_operand:VF_128_256 0 "register_operand" "x") - (match_operand:VF_128_256 1 "nonimmediate_operand" "xm")] + (match_operand:VF_128_256 1 "nonimmediate_operand" "xBt")] UNSPEC_VTESTP))] "TARGET_AVX" "vtest\t{%1, %0|%0, %1}" [(set_attr "type" "ssecomi") + (set_attr "gpr32" "0") (set_attr "prefix_extra" "1") (set_attr "prefix" "vex") (set_attr "mode" "")]) @@ -26955,7 +27025,7 @@ (define_split (define_insn "avx_vbroadcastf128_" [(set (match_operand:V_256 0 "register_operand" "=x,x,x,v,v,v,v") (vec_concat:V_256 - (match_operand: 1 "nonimmediate_operand" "m,0,?x,m,0,m,0") + (match_operand: 1 "nonimmediate_operand" "Bt,0,?x,m,0,m,0") (match_dup 1)))] "TARGET_AVX" "@ @@ -26966,8 +27036,9 @@ (define_insn "avx_vbroadcastf128_" vinsert\t{$1, %1, %0, %0|%0, %0, %1, 1} vbroadcast32x4\t{%1, %0|%0, %1} vinsert32x4\t{$1, %1, %0, %0|%0, %0, %1, 1}" - [(set_attr "isa" "*,*,*,avx512dq,avx512dq,avx512vl,avx512vl") + [(set_attr "isa" "noavx512vl,*,*,avx512dq,avx512dq,avx512vl,avx512vl") (set_attr "type" "ssemov,sselog1,sselog1,ssemov,sselog1,ssemov,sselog1") + (set_attr "gpr32" "0,1,1,1,1,1,1") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "0,1,1,0,1,0,1") (set_attr "prefix" "vex,vex,vex,evex,evex,evex,evex") @@ -27235,12 +27306,13 @@ (define_insn "*avx_vperm2f128_full" [(set (match_operand:AVX256MODE2P 0 "register_operand" "=x") (unspec:AVX256MODE2P [(match_operand:AVX256MODE2P 1 "register_operand" "x") - (match_operand:AVX256MODE2P 2 "nonimmediate_operand" "xm") + (match_operand:AVX256MODE2P 2 "nonimmediate_operand" "xBt") (match_operand:SI 3 "const_0_to_255_operand")] UNSPEC_VPERMIL2F128))] "TARGET_AVX" "vperm2\t{%3, %2, %1, %0|%0, %1, %2, %3}" [(set_attr "type" "sselog") + (set_attr "gpr32" "0") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "1") (set_attr "prefix" "vex") @@ -27357,11 
+27429,11 @@ (define_expand "avx_vinsertf128"
 })

 (define_insn "vec_set_lo_"
-  [(set (match_operand:VI8F_256 0 "register_operand" "=v")
+  [(set (match_operand:VI8F_256 0 "register_operand" "=x,v")
         (vec_concat:VI8F_256
-          (match_operand: 2 "nonimmediate_operand" "vm")
+          (match_operand: 2 "nonimmediate_operand" "xBt,vm")
           (vec_select:
-            (match_operand:VI8F_256 1 "register_operand" "v")
+            (match_operand:VI8F_256 1 "register_operand" "x,v")
             (parallel [(const_int 2) (const_int 3)]))))]
   "TARGET_AVX && "
 {
@@ -27372,7 +27444,9 @@ (define_insn "vec_set_lo_"
   else
     return "vinsert\t{$0x0, %2, %1, %0|%0, %1, %2, 0x0}";
 }
-  [(set_attr "type" "sselog")
+  [(set_attr "isa" "noavx512vl,avx512vl")
+   (set_attr "gpr32" "0,1")
+   (set_attr "type" "sselog")
    (set_attr "prefix_extra" "1")
    (set_attr "length_immediate" "1")
    (set_attr "prefix" "vex")
@@ -27401,11 +27475,11 @@ (define_insn "vec_set_hi_"
    (set_attr "mode" "")])

 (define_insn "vec_set_lo_"
-  [(set (match_operand:VI4F_256 0 "register_operand" "=v")
+  [(set (match_operand:VI4F_256 0 "register_operand" "=x,v")
         (vec_concat:VI4F_256
-          (match_operand: 2 "nonimmediate_operand" "vm")
+          (match_operand: 2 "nonimmediate_operand" "xBt,vm")
           (vec_select:
-            (match_operand:VI4F_256 1 "register_operand" "v")
+            (match_operand:VI4F_256 1 "register_operand" "x,v")
             (parallel [(const_int 4) (const_int 5)
                        (const_int 6) (const_int 7)]))))]
   "TARGET_AVX"
@@ -27415,20 +27489,22 @@ (define_insn "vec_set_lo_"
   else
     return "vinsert\t{$0x0, %2, %1, %0|%0, %1, %2, 0x0}";
 }
-  [(set_attr "type" "sselog")
+  [(set_attr "isa" "noavx512vl,avx512vl")
+   (set_attr "gpr32" "0,1")
+   (set_attr "type" "sselog")
    (set_attr "prefix_extra" "1")
    (set_attr "length_immediate" "1")
    (set_attr "prefix" "vex")
    (set_attr "mode" "")])

 (define_insn "vec_set_hi_"
-  [(set (match_operand:VI4F_256 0 "register_operand" "=v")
+  [(set (match_operand:VI4F_256 0 "register_operand" "=x,v")
         (vec_concat:VI4F_256
           (vec_select:
-            (match_operand:VI4F_256 1 "register_operand" "v")
+            (match_operand:VI4F_256 1
"register_operand" "x,v") (parallel [(const_int 0) (const_int 1) (const_int 2) (const_int 3)])) - (match_operand: 2 "nonimmediate_operand" "vm")))] + (match_operand: 2 "nonimmediate_operand" "xBt,vm")))] "TARGET_AVX" { if (TARGET_AVX512VL) @@ -27436,7 +27512,9 @@ (define_insn "vec_set_hi_" else return "vinsert\t{$0x1, %2, %1, %0|%0, %1, %2, 0x1}"; } - [(set_attr "type" "sselog") + [(set_attr "isa" "noavx512vl,avx512vl") + (set_attr "gpr32" "0,1") + (set_attr "type" "sselog") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "1") (set_attr "prefix" "vex") @@ -27445,7 +27523,7 @@ (define_insn "vec_set_hi_" (define_insn "vec_set_lo_" [(set (match_operand:V16_256 0 "register_operand" "=x,v") (vec_concat:V16_256 - (match_operand: 2 "nonimmediate_operand" "xm,vm") + (match_operand: 2 "nonimmediate_operand" "xBt,vm") (vec_select: (match_operand:V16_256 1 "register_operand" "x,v") (parallel [(const_int 8) (const_int 9) @@ -27456,7 +27534,9 @@ (define_insn "vec_set_lo_" "@ vinsert%~128\t{$0x0, %2, %1, %0|%0, %1, %2, 0x0} vinserti32x4\t{$0x0, %2, %1, %0|%0, %1, %2, 0x0}" - [(set_attr "type" "sselog") + [(set_attr "isa" "noavx512vl,avx512vl") + (set_attr "gpr32" "0,1") + (set_attr "type" "sselog") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "1") (set_attr "prefix" "vex,evex") @@ -27471,12 +27551,14 @@ (define_insn "vec_set_hi_" (const_int 2) (const_int 3) (const_int 4) (const_int 5) (const_int 6) (const_int 7)])) - (match_operand: 2 "nonimmediate_operand" "xm,vm")))] + (match_operand: 2 "nonimmediate_operand" "xBt,vm")))] "TARGET_AVX" "@ vinsert%~128\t{$0x1, %2, %1, %0|%0, %1, %2, 0x1} vinserti32x4\t{$0x1, %2, %1, %0|%0, %1, %2, 0x1}" - [(set_attr "type" "sselog") + [(set_attr "isa" "noavx512vl,avx512vl") + (set_attr "gpr32" "0,1") + (set_attr "type" "sselog") (set_attr "prefix_extra" "1") (set_attr "length_immediate" "1") (set_attr "prefix" "vex,evex") @@ -27485,7 +27567,7 @@ (define_insn "vec_set_hi_" (define_insn "vec_set_lo_v32qi" [(set 
(match_operand:V32QI 0 "register_operand" "=x,v")
         (vec_concat:V32QI
-          (match_operand:V16QI 2 "nonimmediate_operand" "xm,v")
+          (match_operand:V16QI 2 "nonimmediate_operand" "xBt,v")
           (vec_select:V16QI
             (match_operand:V32QI 1 "register_operand" "x,v")
             (parallel [(const_int 16) (const_int 17)
@@ -27501,6 +27583,7 @@ (define_insn "vec_set_lo_v32qi"
    vinsert%~128\t{$0x0, %2, %1, %0|%0, %1, %2, 0x0}
    vinserti32x4\t{$0x0, %2, %1, %0|%0, %1, %2, 0x0}"
   [(set_attr "type" "sselog")
+   (set_attr "gpr32" "0")
    (set_attr "prefix_extra" "1")
    (set_attr "length_immediate" "1")
    (set_attr "prefix" "vex,evex")
@@ -27519,12 +27602,14 @@ (define_insn "vec_set_hi_v32qi"
                        (const_int 10) (const_int 11)
                        (const_int 12) (const_int 13)
                        (const_int 14) (const_int 15)]))
-          (match_operand:V16QI 2 "nonimmediate_operand" "xm,vm")))]
+          (match_operand:V16QI 2 "nonimmediate_operand" "xBt,vm")))]
   "TARGET_AVX"
   "@
    vinsert%~128\t{$0x1, %2, %1, %0|%0, %1, %2, 0x1}
    vinserti32x4\t{$0x1, %2, %1, %0|%0, %1, %2, 0x1}"
-  [(set_attr "type" "sselog")
+  [(set_attr "isa" "noavx512vl,avx512vl")
+   (set_attr "gpr32" "0")
+   (set_attr "type" "sselog")
    (set_attr "prefix_extra" "1")
    (set_attr "length_immediate" "1")
    (set_attr "prefix" "vex,evex")
@@ -27534,7 +27619,7 @@ (define_insn "_maskload"
   [(set (match_operand:V48_128_256 0 "register_operand" "=x")
         (unspec:V48_128_256
           [(match_operand: 2 "register_operand" "x")
-           (match_operand:V48_128_256 1 "memory_operand" "m")]
+           (match_operand:V48_128_256 1 "memory_operand" "Bt")]
           UNSPEC_MASKMOV))]
   "TARGET_AVX"
 {
@@ -27544,13 +27629,14 @@ (define_insn "_maskload"
     return "vmaskmov\t{%1, %2, %0|%0, %2, %1}";
 }
   [(set_attr "type" "sselog1")
+   (set_attr "gpr32" "0")
    (set_attr "prefix_extra" "1")
    (set_attr "prefix" "vex")
    (set_attr "btver2_decode" "vector")
    (set_attr "mode" "")])

 (define_insn "_maskstore"
-  [(set (match_operand:V48_128_256 0 "memory_operand" "+m")
+  [(set (match_operand:V48_128_256 0 "memory_operand" "+Bt")
         (unspec:V48_128_256
           [(match_operand: 1 "register_operand" "x")
(match_operand:V48_128_256 2 "register_operand" "x")
@@ -27564,6 +27650,7 @@ (define_insn "_maskstore"
     return "vmaskmov\t{%2, %1, %0|%0, %1, %2}";
 }
   [(set_attr "type" "sselog1")
+   (set_attr "gpr32" "0")
    (set_attr "prefix_extra" "1")
    (set_attr "prefix" "vex")
    (set_attr "btver2_decode" "vector")
@@ -28160,7 +28247,7 @@ (define_insn "*avx2_gathersi"
           [(match_operand:VEC_GATHER_MODE 2 "register_operand" "0")
            (match_operator: 7 "vsib_mem_operator"
              [(unspec:P
-                [(match_operand:P 3 "vsib_address_operand" "Tv")
+                [(match_operand:P 3 "vsib_address_operand" "TV")
                  (match_operand: 4 "register_operand" "x")
                  (match_operand:SI 6 "const1248_operand")]
                 UNSPEC_VSIBADDR)])
@@ -28171,6 +28258,7 @@ (define_insn "*avx2_gathersi"
   "TARGET_AVX2"
   "%M3vgatherd\t{%1, %7, %0|%0, %7, %1}"
   [(set_attr "type" "ssemov")
+   (set_attr "gpr32" "0")
    (set_attr "prefix" "vex")
    (set_attr "mode" "")])

@@ -28180,7 +28268,7 @@ (define_insn "*avx2_gathersi_2"
           [(pc)
            (match_operator: 6 "vsib_mem_operator"
              [(unspec:P
-                [(match_operand:P 2 "vsib_address_operand" "Tv")
+                [(match_operand:P 2 "vsib_address_operand" "TV")
                  (match_operand: 3 "register_operand" "x")
                  (match_operand:SI 5 "const1248_operand")]
                 UNSPEC_VSIBADDR)])
@@ -28191,6 +28279,7 @@ (define_insn "*avx2_gathersi_2"
   "TARGET_AVX2"
   "%M2vgatherd\t{%1, %6, %0|%0, %6, %1}"
   [(set_attr "type" "ssemov")
+   (set_attr "gpr32" "0")
    (set_attr "prefix" "vex")
    (set_attr "mode" "")])

@@ -28221,7 +28310,7 @@ (define_insn "*avx2_gatherdi"
           [(match_operand: 2 "register_operand" "0")
            (match_operator: 7 "vsib_mem_operator"
              [(unspec:P
-                [(match_operand:P 3 "vsib_address_operand" "Tv")
+                [(match_operand:P 3 "vsib_address_operand" "TV")
                  (match_operand: 4 "register_operand" "x")
                  (match_operand:SI 6 "const1248_operand")]
                 UNSPEC_VSIBADDR)])
@@ -28232,6 +28321,7 @@ (define_insn "*avx2_gatherdi"
   "TARGET_AVX2"
   "%M3vgatherq\t{%5, %7, %2|%2, %7, %5}"
   [(set_attr "type" "ssemov")
+   (set_attr "gpr32" "0")
    (set_attr "prefix" "vex")
    (set_attr "mode" "")])

@@ -28241,7 +28331,7 @@ (define_insn
"*avx2_gatherdi_2" [(pc) (match_operator: 6 "vsib_mem_operator" [(unspec:P - [(match_operand:P 2 "vsib_address_operand" "Tv") + [(match_operand:P 2 "vsib_address_operand" "TV") (match_operand: 3 "register_operand" "x") (match_operand:SI 5 "const1248_operand")] UNSPEC_VSIBADDR)]) @@ -28256,6 +28346,7 @@ (define_insn "*avx2_gatherdi_2" return "%M2vgatherq\t{%4, %6, %0|%0, %6, %4}"; } [(set_attr "type" "ssemov") + (set_attr "gpr32" "0") (set_attr "prefix" "vex") (set_attr "mode" "")]) @@ -28266,7 +28357,7 @@ (define_insn "*avx2_gatherdi_3" [(match_operand: 2 "register_operand" "0") (match_operator: 7 "vsib_mem_operator" [(unspec:P - [(match_operand:P 3 "vsib_address_operand" "Tv") + [(match_operand:P 3 "vsib_address_operand" "TV") (match_operand: 4 "register_operand" "x") (match_operand:SI 6 "const1248_operand")] UNSPEC_VSIBADDR)]) @@ -28279,6 +28370,7 @@ (define_insn "*avx2_gatherdi_3" "TARGET_AVX2" "%M3vgatherq\t{%5, %7, %0|%0, %7, %5}" [(set_attr "type" "ssemov") + (set_attr "gpr32" "0") (set_attr "prefix" "vex") (set_attr "mode" "")]) @@ -28289,7 +28381,7 @@ (define_insn "*avx2_gatherdi_4" [(pc) (match_operator: 6 "vsib_mem_operator" [(unspec:P - [(match_operand:P 2 "vsib_address_operand" "Tv") + [(match_operand:P 2 "vsib_address_operand" "TV") (match_operand: 3 "register_operand" "x") (match_operand:SI 5 "const1248_operand")] UNSPEC_VSIBADDR)]) @@ -28302,6 +28394,7 @@ (define_insn "*avx2_gatherdi_4" "TARGET_AVX2" "%M2vgatherq\t{%4, %6, %0|%0, %6, %4}" [(set_attr "type" "ssemov") + (set_attr "gpr32" "0") (set_attr "prefix" "vex") (set_attr "mode" "")])