From patchwork Thu Jun 27 08:23:01 2024
X-Patchwork-Submitter: liuhongt
X-Patchwork-Id: 1953031
From: liuhongt
To: gcc-patches@gcc.gnu.org
Cc: crazylht@gmail.com, hjl.tools@gmail.com
Subject: [PATCH 1/7] [x86] Add more splitters to match (unspec [op1 op2 (gt op3 constm1_operand)] UNSPEC_BLENDV)
Date: Thu, 27 Jun 2024 16:23:01 +0800
Message-Id: <20240627082307.1166985-2-hongtao.liu@intel.com>
In-Reply-To: <20240627082307.1166985-1-hongtao.liu@intel.com>
References: <20240627082307.1166985-1-hongtao.liu@intel.com>

These define_insn_and_split patterns are needed now that vcond{,u,eq}
are obsolete.

gcc/ChangeLog:

	PR target/115517
	* config/i386/sse.md (*_blendv_gt): New define_insn_and_split.
	(*_blendv_gtint): Ditto.
	(*_blendv_not_gtint): Ditto.
	(*_pblendvb_gt): Ditto.
	(*_pblendvb_gt_subreg_not): Ditto.
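For readers following the Subject line, here is a minimal scalar sketch of why a blend whose mask is (gt op3 constm1_operand) can be rewritten as a blend on op3 itself with the two data operands swapped. It assumes a simplified one-lane model in which the blend picks its second data operand when the mask's sign bit is set. The helper names (blendv_lane, cmpgt_lane, gt_m1_blend_matches_swapped) are hypothetical and are not part of GCC or of this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of one 32-bit lane of a BLENDV-style select: the result
   comes from b when the sign bit of mask is set, otherwise from a.
   The operand order is an assumption made for this sketch only.  */
static int32_t blendv_lane(int32_t a, int32_t b, int32_t mask)
{
    return (mask < 0) ? b : a;
}

/* One lane of a signed vector compare: all ones when x > y, else zero.  */
static int32_t cmpgt_lane(int32_t x, int32_t y)
{
    return (x > y) ? -1 : 0;
}

/* Returns 1 when blending on the mask (x > -1) gives the same lane as
   blending on x directly with the data operands swapped.  x > -1 holds
   exactly when the sign bit of x is clear, so the compare only inverts
   the sign-bit test, and swapping a and b undoes that inversion.  */
static int gt_m1_blend_matches_swapped(int32_t a, int32_t b, int32_t x)
{
    int32_t with_compare = blendv_lane(a, b, cmpgt_lane(x, -1));
    int32_t swapped = blendv_lane(b, a, x);
    return with_compare == swapped;
}
```

Under this model the compare becomes redundant, which matches the shape of the split templates in the diff that use (match_dup 2) (match_dup 1) (match_dup 3).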
---
 gcc/config/i386/sse.md | 130 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 130 insertions(+)

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 0be2dcd8891..1148ac84f3d 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -23016,6 +23016,32 @@ (define_insn_and_split "*_blendv_lt"
    (set_attr "btver2_decode" "vector,vector,vector")
    (set_attr "mode" "")])
 
+(define_insn_and_split "*_blendv_gt"
+  [(set (match_operand:VF_128_256 0 "register_operand" "=Yr,*x,x")
+        (unspec:VF_128_256
+          [(match_operand:VF_128_256 1 "vector_operand" "Yrja,*xja,xjm")
+           (match_operand:VF_128_256 2 "register_operand" "0,0,x")
+           (gt:VF_128_256
+             (match_operand: 3 "register_operand" "Yz,Yz,x")
+             (match_operand: 4 "vector_all_ones_operand"))]
+          UNSPEC_BLENDV))]
+  "TARGET_SSE4_1"
+  "#"
+  "&& reload_completed"
+  [(set (match_dup 0)
+        (unspec:VF_128_256
+         [(match_dup 2) (match_dup 1) (match_dup 3)] UNSPEC_BLENDV))]
+  "operands[3] = gen_lowpart (mode, operands[3]);"
+  [(set_attr "isa" "noavx,noavx,avx")
+   (set_attr "type" "ssemov")
+   (set_attr "addr" "gpr16")
+   (set_attr "length_immediate" "1")
+   (set_attr "prefix_data16" "1,1,*")
+   (set_attr "prefix_extra" "1")
+   (set_attr "prefix" "orig,orig,vex")
+   (set_attr "btver2_decode" "vector,vector,vector")
+   (set_attr "mode" "")])
+
 (define_mode_attr ssefltmodesuffix
   [(V2DI "pd") (V4DI "pd") (V4SI "ps") (V8SI "ps")
    (V2DF "pd") (V4DF "pd") (V4SF "ps") (V8SF "ps")])
@@ -23055,6 +23081,38 @@ (define_insn_and_split "*_blendv_ltint"
    (set_attr "btver2_decode" "vector,vector,vector")
    (set_attr "mode" "")])
 
+(define_insn_and_split "*_blendv_gtint"
+  [(set (match_operand: 0 "register_operand" "=Yr,*x,x")
+        (unspec:
+          [(match_operand: 1 "vector_operand" "Yrja,*xja,xjm")
+           (match_operand: 2 "register_operand" "0,0,x")
+           (subreg:
+             (gt:VI48_AVX
+               (match_operand:VI48_AVX 3 "register_operand" "Yz,Yz,x")
+               (match_operand:VI48_AVX 4 "vector_all_ones_operand")) 0)]
+          UNSPEC_BLENDV))]
+  "TARGET_SSE4_1"
+  "#"
+  "&& reload_completed"
+  [(set (match_dup 0)
+        (unspec:
+         [(match_dup 2) (match_dup 1) (match_dup 3)] UNSPEC_BLENDV))]
+{
+  operands[0] = gen_lowpart (mode, operands[0]);
+  operands[1] = gen_lowpart (mode, operands[1]);
+  operands[2] = gen_lowpart (mode, operands[2]);
+  operands[3] = gen_lowpart (mode, operands[3]);
+}
+  [(set_attr "isa" "noavx,noavx,avx")
+   (set_attr "type" "ssemov")
+   (set_attr "addr" "gpr16")
+   (set_attr "length_immediate" "1")
+   (set_attr "prefix_data16" "1,1,*")
+   (set_attr "prefix_extra" "1")
+   (set_attr "prefix" "orig,orig,vex")
+   (set_attr "btver2_decode" "vector,vector,vector")
+   (set_attr "mode" "")])
+
 ;; PR target/100738: Transform vpcmpeqd + vpxor + vblendvps to vblendvps for inverted mask;
 (define_insn_and_split "*_blendv_not_ltint"
   [(set (match_operand: 0 "register_operand")
@@ -23082,6 +23140,32 @@ (define_insn_and_split "*_blendv_not_lt
   operands[3] = gen_lowpart (mode, operands[3]);
 })
 
+(define_insn_and_split "*_blendv_not_gtint"
+  [(set (match_operand: 0 "register_operand")
+        (unspec:
+          [(match_operand: 1 "vector_operand")
+           (match_operand: 2 "register_operand")
+           (subreg:
+             (gt:VI48_AVX
+               (subreg:VI48_AVX
+               (not:
+                 (match_operand: 3 "register_operand")) 0)
+               (match_operand:VI48_AVX 4 "vector_all_ones_operand")) 0)]
+          UNSPEC_BLENDV))]
+  "TARGET_SSE4_1 && ix86_pre_reload_split ()"
+  "#"
+  "&& 1"
+  [(set (match_dup 0)
+        (unspec:
+         [(match_dup 1) (match_dup 2) (match_dup 3)] UNSPEC_BLENDV))]
+{
+  operands[0] = gen_lowpart (mode, operands[0]);
+  operands[2] = gen_lowpart (mode, operands[2]);
+  operands[1] = force_reg (mode,
+                           gen_lowpart (mode, operands[1]));
+  operands[3] = gen_lowpart (mode, operands[3]);
+})
+
 (define_insn "_dp"
   [(set (match_operand:VF_128_256 0 "register_operand" "=Yr,*x,x")
         (unspec:VF_128_256
@@ -23236,6 +23320,30 @@ (define_insn_and_split "*_pblendvb_lt"
    (set_attr "btver2_decode" "vector,vector,vector")
    (set_attr "mode" "")])
 
+(define_insn_and_split "*_pblendvb_gt"
+  [(set (match_operand:VI1_AVX2 0 "register_operand" "=Yr,*x,x")
+        (unspec:VI1_AVX2
+          [(match_operand:VI1_AVX2 1 "vector_operand" "Yrja,*xja,xjm")
+           (match_operand:VI1_AVX2 2 "register_operand" "0,0,x")
+           (gt:VI1_AVX2 (match_operand:VI1_AVX2 3 "register_operand" "Yz,Yz,x")
+                        (match_operand:VI1_AVX2 4 "vector_all_ones_operand"))]
+          UNSPEC_BLENDV))]
+  "TARGET_SSE4_1"
+  "#"
+  "&& 1"
+  [(set (match_dup 0)
+        (unspec:VI1_AVX2
+         [(match_dup 2) (match_dup 1) (match_dup 3)] UNSPEC_BLENDV))]
+  ""
+  [(set_attr "isa" "noavx,noavx,avx")
+   (set_attr "type" "ssemov")
+   (set_attr "addr" "gpr16")
+   (set_attr "prefix_extra" "1")
+   (set_attr "length_immediate" "*,*,1")
+   (set_attr "prefix" "orig,orig,vex")
+   (set_attr "btver2_decode" "vector,vector,vector")
+   (set_attr "mode" "")])
+
 (define_insn_and_split "*_pblendvb_lt_subreg_not"
   [(set (match_operand:VI1_AVX2 0 "register_operand")
         (unspec:VI1_AVX2
@@ -23258,6 +23366,28 @@ (define_insn_and_split "*_pblendvb_lt_subreg_not"
            (lt:VI1_AVX2 (match_dup 3) (match_dup 4))] UNSPEC_BLENDV))]
   "operands[3] = gen_lowpart (mode, operands[3]);")
 
+(define_insn_and_split "*_pblendvb_gt_subreg_not"
+  [(set (match_operand:VI1_AVX2 0 "register_operand")
+        (unspec:VI1_AVX2
+          [(match_operand:VI1_AVX2 2 "register_operand")
+           (match_operand:VI1_AVX2 1 "vector_operand")
+           (gt:VI1_AVX2
+             (subreg:VI1_AVX2
+              (not (match_operand 3 "register_operand")) 0)
+             (match_operand:VI1_AVX2 4 "vector_all_ones_operand"))]
+          UNSPEC_BLENDV))]
+  "TARGET_SSE4_1
+   && GET_MODE_CLASS (GET_MODE (operands[3])) == MODE_VECTOR_INT
+   && GET_MODE_SIZE (GET_MODE (operands[3])) ==
+   && ix86_pre_reload_split ()"
+  "#"
+  "&& 1"
+  [(set (match_dup 0)
+        (unspec:VI1_AVX2
+          [(match_dup 1) (match_dup 2)
+           (gt:VI1_AVX2 (match_dup 3) (match_dup 4))] UNSPEC_BLENDV))]
+  "operands[3] = gen_lowpart (mode, operands[3]);")
+
 (define_insn "sse4_1_pblend"
   [(set (match_operand:V8_128 0 "register_operand" "=Yr,*x,x")
         (vec_merge:V8_128