From patchwork Thu Jun 27 08:23:01 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Liu, Hongtao" X-Patchwork-Id: 1953031 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=intel.com header.i=@intel.com header.a=rsa-sha256 header.s=Intel header.b=Y8yX/V8E; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4W8s8j4HWKz20XB for ; Thu, 27 Jun 2024 18:25:57 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 7F8F73838A1C for ; Thu, 27 Jun 2024 08:25:51 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.15]) by sourceware.org (Postfix) with ESMTPS id 29CC9383A60D for ; Thu, 27 Jun 2024 08:23:20 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 29CC9383A60D Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=intel.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 29CC9383A60D Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=192.198.163.15 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; 
s=key; t=1719476607; cv=none; b=NsloI1/mBLpmgsI4f2Ii7SEXcUWn0DLKJoaTeALGK9grlp4+KqdQD6GKtIdTkxTd1GEY3z2rzzAop+RsKwtaow0wP3KAJrCRyBIpRv2Q197XqgtCkrfpDSmt6bM8ejcmUx9IgEDA6Okt1itfyqpapTriIY54e5+9JOj0BWY5was= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1719476607; c=relaxed/simple; bh=1tMgglnnrpDhCLPYFoSUaAjzw0KeShFMA3xsTjI9SWM=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=ByrEwRtGkAd/aBiOmYSch6iHKribNCyDBTo3nW5Zi1IB9ePSnVTptpCbpf5E5rOy7WqJepFKocHfzDrS5H/d7IbMisYTxwVxnxt6zZlZFPIO5lM/760uf2jZ+JydEQmkwLiXpxLZUMfyrJxzTuzygOsZwzg21X2/sOXQPZxUKMU= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1719476600; x=1751012600; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=1tMgglnnrpDhCLPYFoSUaAjzw0KeShFMA3xsTjI9SWM=; b=Y8yX/V8EVRQJkd86LTj7xQGQdPXHy6SKHFbMLvcFmnnpdghsU2X6VSNQ 2n8yOJIPj+s4dSMicnkM2O+Ux3q6i8umf/S5dM3ijcAuMx92nvVlPEs0B aOV+5pPg/fNZLPr2tz5GKajKwyjmciyIiDbs/6VCFuwTTw7Dg61HUjmaP iRvGr9HLdR9zSwkMCb/pWF6w6J8K5dwrxi+XY6zP8cTUae+1W0ulWxy2w 5801rZ3vKR4br0XXq0J2H24kTR4LwJdA5h0K28+QaMZc1Bu/A/QFQHNuI EZxqbRBViivkN48WP3f/JJLhWAo9HCQRuzUieUkoG0NWwiOZDVKgd3DIP g==; X-CSE-ConnectionGUID: bhFyK8vBR9ScxusImbZI7g== X-CSE-MsgGUID: V/T2UHaER0+gg2oQoUK8BQ== X-IronPort-AV: E=McAfee;i="6700,10204,11115"; a="16732306" X-IronPort-AV: E=Sophos;i="6.08,269,1712646000"; d="scan'208";a="16732306" Received: from orviesa007.jf.intel.com ([10.64.159.147]) by fmvoesa109.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Jun 2024 01:23:10 -0700 X-CSE-ConnectionGUID: MDTeJsLhTuiw7romoSY8xA== X-CSE-MsgGUID: 6wdu66vESfO/vCIGAcuyyw== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.08,269,1712646000"; d="scan'208";a="44944356" Received: from shvmail03.sh.intel.com ([10.239.245.20]) by orviesa007.jf.intel.com with ESMTP; 27 Jun 2024 01:23:08 -0700 Received: from shliclel4217.sh.intel.com 
(shliclel4217.sh.intel.com [10.239.240.127]) by shvmail03.sh.intel.com (Postfix) with ESMTP id 918D71005689; Thu, 27 Jun 2024 16:23:07 +0800 (CST) From: liuhongt To: gcc-patches@gcc.gnu.org Cc: crazylht@gmail.com, hjl.tools@gmail.com Subject: [PATCH 1/7] [x86] Add more splitters to match (unspec [op1 op2 (gt op3 constm1_operand)] UNSPEC_BLENDV) Date: Thu, 27 Jun 2024 16:23:01 +0800 Message-Id: <20240627082307.1166985-2-hongtao.liu@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20240627082307.1166985-1-hongtao.liu@intel.com> References: <20240627082307.1166985-1-hongtao.liu@intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-12.3 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org

These define_insn_and_split patterns are needed now that vcond{,u,eq} is obsolete.

gcc/ChangeLog:

	PR target/115517
	* config/i386/sse.md (*_blendv_gt): New define_insn_and_split.
	(*_blendv_gtint): Ditto.
	(*_blendv_not_gtint): Ditto.
	(*_pblendvb_gt): Ditto.
	(*_pblendvb_gt_subreg_not): Ditto.
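Reviewer note, not part of the patch: a minimal scalar sketch of why a splitter for (unspec [op1 op2 (gt op3 constm1_operand)] UNSPEC_BLENDV) can drop the comparison and swap the two data operands. It assumes the blend picks its second data operand in lanes where the sign bit of the mask is set, and it models a single 32-bit integer lane. The names blendv_lane, cmpgt_m1_lane, before_split and after_split are illustrative only and do not exist in GCC.

```c
#include <assert.h>
#include <stdint.h>

/* One lane of a variable blend (assumed semantics): the sign bit of
   MASK selects B, otherwise A is kept.  */
static inline int32_t blendv_lane(int32_t a, int32_t b, int32_t mask)
{
    return mask < 0 ? b : a;
}

/* One lane of the vector compare "x > -1": all-ones when true, zero
   otherwise.  */
static inline int32_t cmpgt_m1_lane(int32_t x)
{
    return x > -1 ? -1 : 0;
}

/* Shape matched by the new patterns: blend under (gt x -1).  */
static inline int32_t before_split(int32_t op1, int32_t op2, int32_t x)
{
    return blendv_lane(op1, op2, cmpgt_m1_lane(x));
}

/* Shape after the split: data operands swapped, x used directly as the
   mask.  "x > -1" is true exactly when the sign bit of x is clear.  */
static inline int32_t after_split(int32_t op1, int32_t op2, int32_t x)
{
    return blendv_lane(op2, op1, x);
}

/* Hypothetical self-check over a few representative lane values.  */
static inline void check_blendv_equivalence(void)
{
    const int32_t samples[] = { INT32_MIN, -3, -1, 0, 5, INT32_MAX };
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
        assert(before_split(10, 20, samples[i])
               == after_split(10, 20, samples[i]));
}
```

Calling check_blendv_equivalence () from a test driver exercises both shapes on negative, zero and positive lanes; the sketch says nothing about register constraints or the float-mode variants, which reinterpret the mask with gen_lowpart.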
--- gcc/config/i386/sse.md | 130 +++++++++++++++++++++++++++++++++++++++++ 1 file changed, 130 insertions(+) diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index 0be2dcd8891..1148ac84f3d 100644 --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -23016,6 +23016,32 @@ (define_insn_and_split "*_blendv_lt" (set_attr "btver2_decode" "vector,vector,vector") (set_attr "mode" "")]) +(define_insn_and_split "*_blendv_gt" + [(set (match_operand:VF_128_256 0 "register_operand" "=Yr,*x,x") + (unspec:VF_128_256 + [(match_operand:VF_128_256 1 "vector_operand" "Yrja,*xja,xjm") + (match_operand:VF_128_256 2 "register_operand" "0,0,x") + (gt:VF_128_256 + (match_operand: 3 "register_operand" "Yz,Yz,x") + (match_operand: 4 "vector_all_ones_operand"))] + UNSPEC_BLENDV))] + "TARGET_SSE4_1" + "#" + "&& reload_completed" + [(set (match_dup 0) + (unspec:VF_128_256 + [(match_dup 2) (match_dup 1) (match_dup 3)] UNSPEC_BLENDV))] + "operands[3] = gen_lowpart (mode, operands[3]);" + [(set_attr "isa" "noavx,noavx,avx") + (set_attr "type" "ssemov") + (set_attr "addr" "gpr16") + (set_attr "length_immediate" "1") + (set_attr "prefix_data16" "1,1,*") + (set_attr "prefix_extra" "1") + (set_attr "prefix" "orig,orig,vex") + (set_attr "btver2_decode" "vector,vector,vector") + (set_attr "mode" "")]) + (define_mode_attr ssefltmodesuffix [(V2DI "pd") (V4DI "pd") (V4SI "ps") (V8SI "ps") (V2DF "pd") (V4DF "pd") (V4SF "ps") (V8SF "ps")]) @@ -23055,6 +23081,38 @@ (define_insn_and_split "*_blendv_ltint" (set_attr "btver2_decode" "vector,vector,vector") (set_attr "mode" "")]) +(define_insn_and_split "*_blendv_gtint" + [(set (match_operand: 0 "register_operand" "=Yr,*x,x") + (unspec: + [(match_operand: 1 "vector_operand" "Yrja,*xja,xjm") + (match_operand: 2 "register_operand" "0,0,x") + (subreg: + (gt:VI48_AVX + (match_operand:VI48_AVX 3 "register_operand" "Yz,Yz,x") + (match_operand:VI48_AVX 4 "vector_all_ones_operand")) 0)] + UNSPEC_BLENDV))] + "TARGET_SSE4_1" + "#" + "&& 
reload_completed" + [(set (match_dup 0) + (unspec: + [(match_dup 2) (match_dup 1) (match_dup 3)] UNSPEC_BLENDV))] +{ + operands[0] = gen_lowpart (mode, operands[0]); + operands[1] = gen_lowpart (mode, operands[1]); + operands[2] = gen_lowpart (mode, operands[2]); + operands[3] = gen_lowpart (mode, operands[3]); +} + [(set_attr "isa" "noavx,noavx,avx") + (set_attr "type" "ssemov") + (set_attr "addr" "gpr16") + (set_attr "length_immediate" "1") + (set_attr "prefix_data16" "1,1,*") + (set_attr "prefix_extra" "1") + (set_attr "prefix" "orig,orig,vex") + (set_attr "btver2_decode" "vector,vector,vector") + (set_attr "mode" "")]) + ;; PR target/100738: Transform vpcmpeqd + vpxor + vblendvps to vblendvps for inverted mask; (define_insn_and_split "*_blendv_not_ltint" [(set (match_operand: 0 "register_operand") @@ -23082,6 +23140,32 @@ (define_insn_and_split "*_blendv_not_lt operands[3] = gen_lowpart (mode, operands[3]); }) +(define_insn_and_split "*_blendv_not_gtint" + [(set (match_operand: 0 "register_operand") + (unspec: + [(match_operand: 1 "vector_operand") + (match_operand: 2 "register_operand") + (subreg: + (gt:VI48_AVX + (subreg:VI48_AVX + (not: + (match_operand: 3 "register_operand")) 0) + (match_operand:VI48_AVX 4 "vector_all_ones_operand")) 0)] + UNSPEC_BLENDV))] + "TARGET_SSE4_1 && ix86_pre_reload_split ()" + "#" + "&& 1" + [(set (match_dup 0) + (unspec: + [(match_dup 1) (match_dup 2) (match_dup 3)] UNSPEC_BLENDV))] +{ + operands[0] = gen_lowpart (mode, operands[0]); + operands[2] = gen_lowpart (mode, operands[2]); + operands[1] = force_reg (mode, + gen_lowpart (mode, operands[1])); + operands[3] = gen_lowpart (mode, operands[3]); +}) + (define_insn "_dp" [(set (match_operand:VF_128_256 0 "register_operand" "=Yr,*x,x") (unspec:VF_128_256 @@ -23236,6 +23320,30 @@ (define_insn_and_split "*_pblendvb_lt" (set_attr "btver2_decode" "vector,vector,vector") (set_attr "mode" "")]) +(define_insn_and_split "*_pblendvb_gt" + [(set (match_operand:VI1_AVX2 0 "register_operand" 
"=Yr,*x,x") + (unspec:VI1_AVX2 + [(match_operand:VI1_AVX2 1 "vector_operand" "Yrja,*xja,xjm") + (match_operand:VI1_AVX2 2 "register_operand" "0,0,x") + (gt:VI1_AVX2 (match_operand:VI1_AVX2 3 "register_operand" "Yz,Yz,x") + (match_operand:VI1_AVX2 4 "vector_all_ones_operand"))] + UNSPEC_BLENDV))] + "TARGET_SSE4_1" + "#" + "&& 1" + [(set (match_dup 0) + (unspec:VI1_AVX2 + [(match_dup 2) (match_dup 1) (match_dup 3)] UNSPEC_BLENDV))] + "" + [(set_attr "isa" "noavx,noavx,avx") + (set_attr "type" "ssemov") + (set_attr "addr" "gpr16") + (set_attr "prefix_extra" "1") + (set_attr "length_immediate" "*,*,1") + (set_attr "prefix" "orig,orig,vex") + (set_attr "btver2_decode" "vector,vector,vector") + (set_attr "mode" "")]) + (define_insn_and_split "*_pblendvb_lt_subreg_not" [(set (match_operand:VI1_AVX2 0 "register_operand") (unspec:VI1_AVX2 @@ -23258,6 +23366,28 @@ (define_insn_and_split "*_pblendvb_lt_subreg_not" (lt:VI1_AVX2 (match_dup 3) (match_dup 4))] UNSPEC_BLENDV))] "operands[3] = gen_lowpart (mode, operands[3]);") +(define_insn_and_split "*_pblendvb_gt_subreg_not" + [(set (match_operand:VI1_AVX2 0 "register_operand") + (unspec:VI1_AVX2 + [(match_operand:VI1_AVX2 2 "register_operand") + (match_operand:VI1_AVX2 1 "vector_operand") + (gt:VI1_AVX2 + (subreg:VI1_AVX2 + (not (match_operand 3 "register_operand")) 0) + (match_operand:VI1_AVX2 4 "vector_all_ones_operand"))] + UNSPEC_BLENDV))] + "TARGET_SSE4_1 + && GET_MODE_CLASS (GET_MODE (operands[3])) == MODE_VECTOR_INT + && GET_MODE_SIZE (GET_MODE (operands[3])) == + && ix86_pre_reload_split ()" + "#" + "&& 1" + [(set (match_dup 0) + (unspec:VI1_AVX2 + [(match_dup 1) (match_dup 2) + (gt:VI1_AVX2 (match_dup 3) (match_dup 4))] UNSPEC_BLENDV))] + "operands[3] = gen_lowpart (mode, operands[3]);") + (define_insn "sse4_1_pblend" [(set (match_operand:V8_128 0 "register_operand" "=Yr,*x,x") (vec_merge:V8_128 From patchwork Thu Jun 27 08:23:02 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Liu, Hongtao"
X-Patchwork-Id: 1953036
From: liuhongt
To: gcc-patches@gcc.gnu.org
Cc: crazylht@gmail.com, hjl.tools@gmail.com
Subject: [PATCH 2/7] Lower AVX512 kmask comparison back to AVX2 comparison when op_{true, false} is vector -1/0.
Date: Thu, 27 Jun 2024 16:23:02 +0800
Message-Id: <20240627082307.1166985-3-hongtao.liu@intel.com>
In-Reply-To: <20240627082307.1166985-1-hongtao.liu@intel.com>
References: <20240627082307.1166985-1-hongtao.liu@intel.com>

gcc/ChangeLog:

	PR target/115517
	* config/i386/sse.md (*_cvtmask2_not): New pre_reload splitter.
	(*_cvtmask2_not): Ditto.
	(*avx2_pcmp3_6): Ditto.
	(*avx2_pcmp3_7): Ditto.
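Reviewer note, not part of the patch: a scalar sketch of the identities the two *avx2_pcmp3 splitters appear to rely on when the mask compare feeds a -1/0 vec_merge. It assumes predicate 2 is LE, 5 is NLT (GE) and 4 is NE, as the comments in the patch state, and models one 32-bit lane. All function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Lane-wise helpers: a true comparison yields all-ones, false yields 0.  */
static inline int32_t lane_mask(int cond) { return cond ? -1 : 0; }
static inline int32_t smin32(int32_t a, int32_t b) { return a < b ? a : b; }
static inline uint32_t umin32(uint32_t a, uint32_t b) { return a < b ? a : b; }

/* -1/0 result for LE and GE, computed as "min, then compare equal".  */
static inline int32_t le_via_min(int32_t a, int32_t b)
{
    return lane_mask(smin32(a, b) == a);   /* a <= b  <=>  min(a,b) == a */
}
static inline int32_t ge_via_min(int32_t a, int32_t b)
{
    return lane_mask(smin32(a, b) == b);   /* a >= b  <=>  min(a,b) == b */
}
static inline int32_t ule_via_min(uint32_t a, uint32_t b)
{
    return lane_mask(umin32(a, b) == a);   /* unsigned variant uses umin */
}

/* 0/-1 result (inverted merge): the negated predicate is a plain GT or
   EQ, with the operands swapped for GE.  */
static inline int32_t not_le(int32_t a, int32_t b) { return lane_mask(a > b); }
static inline int32_t not_ge(int32_t a, int32_t b) { return lane_mask(b > a); }
static inline int32_t not_ne(int32_t a, int32_t b) { return lane_mask(a == b); }
```

The point of the sketch is only that each case needs at most a min plus an equality compare, or a single GT/EQ, none of which requires a mask register.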
--- gcc/config/i386/sse.md | 97 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 97 insertions(+) diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index 1148ac84f3d..822159a869b 100644 --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -9986,6 +9986,24 @@ (define_insn "*_cvtmask2" [(set_attr "prefix" "evex") (set_attr "mode" "")]) +(define_insn_and_split "*_cvtmask2_not" + [(set (match_operand:VI12_AVX512VL 0 "register_operand") + (vec_merge:VI12_AVX512VL + (match_operand:VI12_AVX512VL 2 "const0_operand") + (match_operand:VI12_AVX512VL 3 "vector_all_ones_operand") + (match_operand: 1 "register_operand")))] + "TARGET_AVX512BW && ix86_pre_reload_split ()" + "#" + "&& 1" + [(set (match_dup 4) + (not: (match_dup 1))) + (set (match_dup 0) + (vec_merge:VI12_AVX512VL + (match_dup 3) + (match_dup 2) + (match_dup 4)))] + "operands[4] = gen_reg_rtx (mode);") + (define_expand "_cvtmask2" [(set (match_operand:VI48_AVX512VL 0 "register_operand") (vec_merge:VI48_AVX512VL @@ -10024,6 +10042,24 @@ (define_insn_and_split "*_cvtmask2" (set_attr "prefix" "evex") (set_attr "mode" "")]) +(define_insn_and_split "*_cvtmask2_not" + [(set (match_operand:VI48_AVX512VL 0 "register_operand") + (vec_merge:VI48_AVX512VL + (match_operand:VI48_AVX512VL 2 "const0_operand") + (match_operand:VI48_AVX512VL 3 "vector_all_ones_operand") + (match_operand: 1 "register_operand")))] + "TARGET_AVX512F && ix86_pre_reload_split ()" + "#" + "&& 1" + [(set (match_dup 4) + (not: (match_dup 1))) + (set (match_dup 0) + (vec_merge:VI48_AVX512VL + (match_dup 3) + (match_dup 2) + (match_dup 4)))] + "operands[4] = gen_reg_rtx (mode);") + (define_insn "*_cvtmask2_pternlog_false_dep" [(set (match_operand:VI48_AVX512VL 0 "register_operand" "=v") (vec_merge:VI48_AVX512VL @@ -17675,6 +17711,67 @@ (define_insn_and_split "*avx2_pcmp3_5" std::swap (operands[1], operands[2]); }) +(define_int_attr pcmp_usmin + [(UNSPEC_PCMP "smin") (UNSPEC_UNSIGNED_PCMP "umin")]) + +(define_insn_and_split 
"*avx2_pcmp3_6" + [(set (match_operand:VI_128_256 0 "register_operand") + (vec_merge:VI_128_256 + (match_operand:VI_128_256 1 "vector_all_ones_operand") + (match_operand:VI_128_256 2 "const0_operand") + (unspec: + [(match_operand:VI_128_256 3 "nonimmediate_operand") + (match_operand:VI_128_256 4 "nonimmediate_operand") + (match_operand:SI 5 "const_0_to_7_operand")] + UNSPEC_PCMP_ITER)))] + "TARGET_AVX512VL && ix86_pre_reload_split () + && (INTVAL (operands[5]) == 2 || INTVAL (operands[5]) == 5)" + "#" + "&& 1" + [(const_int 0)] +{ + rtx dst_min = gen_reg_rtx (mode); + + if (MEM_P (operands[3]) && MEM_P (operands[4])) + operands[3] = force_reg (mode, operands[3]); + emit_insn (gen_3 (dst_min, operands[3], operands[4])); + rtx eq_op = INTVAL (operands[5]) == 2 ? operands[3] : operands[4]; + emit_move_insn (operands[0], gen_rtx_EQ (mode, eq_op, dst_min)); + DONE; +}) + +(define_insn_and_split "*avx2_pcmp3_7" + [(set (match_operand:VI_128_256 0 "register_operand") + (vec_merge:VI_128_256 + (match_operand:VI_128_256 1 "const0_operand") + (match_operand:VI_128_256 2 "vector_all_ones_operand") + (unspec: + [(match_operand:VI_128_256 3 "nonimmediate_operand") + (match_operand:VI_128_256 4 "nonimmediate_operand") + (match_operand:SI 5 "const_0_to_7_operand")] + UNSPEC_PCMP_ITER)))] + "TARGET_AVX512VL && ix86_pre_reload_split () + /* NE is commutative. */ + && (INTVAL (operands[5]) == 4 + /* LE, 3 must be register. */ + || INTVAL (operands[5]) == 2 + /* NLT aka GE, 4 must be register and we swap operands. */ + || INTVAL (operands[5]) == 5)" + "#" + "&& 1" + [(const_int 0)] +{ + if (INTVAL (operands[5]) == 5) + std::swap (operands[3], operands[4]); + + if (MEM_P (operands[3])) + operands[3] = force_reg (mode, operands[3]); + enum rtx_code code = INTVAL (operands[5]) != 4 ? 
GT : EQ; + emit_move_insn (operands[0], gen_rtx_fmt_ee (code, mode, + operands[3], operands[4])); + DONE; +}) + (define_expand "_eq3" [(set (match_operand: 0 "register_operand") (unspec:
From patchwork Thu Jun 27 08:23:03 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Liu, Hongtao"
X-Patchwork-Id: 1953029
From: liuhongt
To: gcc-patches@gcc.gnu.org
Cc: crazylht@gmail.com, hjl.tools@gmail.com
Subject: [PATCH 3/7] [x86] Match IEEE min/max with UNSPEC_IEEE_{MIN,MAX}.
Date: Thu, 27 Jun 2024 16:23:03 +0800
Message-Id: <20240627082307.1166985-4-hongtao.liu@intel.com>
In-Reply-To: <20240627082307.1166985-1-hongtao.liu@intel.com>
References: <20240627082307.1166985-1-hongtao.liu@intel.com>

These versions of the min/max patterns implement exactly the operations
  min = (op1 < op2 ? op1 : op2)
  max = (!(op1 < op2) ? op1 : op2)

gcc/ChangeLog:

	PR target/115517
	* config/i386/sse.md (*minmax3_1): New pre_reload define_insn_and_split.
	(*minmax3_2): Ditto.
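Reviewer note, not part of the patch: a small sketch of the two formulas quoted in the commit message, written out for scalar doubles. It illustrates why operand order matters for these operations (NaN and signed zero are not handled symmetrically), which is presumably why the splitters track which operand equals which. The names ieee_min and ieee_max are illustrative and are not GCC functions.

```c
#include <assert.h>
#include <math.h>

/* min = (op1 < op2 ? op1 : op2), exactly as written in the message.  */
static inline double ieee_min(double op1, double op2)
{
    return op1 < op2 ? op1 : op2;
}

/* max = (!(op1 < op2) ? op1 : op2), exactly as written in the message.  */
static inline double ieee_max(double op1, double op2)
{
    return !(op1 < op2) ? op1 : op2;
}

/* Hypothetical self-check: any comparison involving NaN is false, so
   the result depends on which side the NaN is on.  */
static inline void check_operand_order(void)
{
    assert(ieee_min(NAN, 1.0) == 1.0);     /* NaN < 1 is false: op2 */
    assert(isnan(ieee_min(1.0, NAN)));     /* 1 < NaN is false: op2 */
    assert(isnan(ieee_max(NAN, 1.0)));     /* !(false): op1 */
    assert(ieee_max(1.0, NAN) == 1.0);     /* !(false): op1 */
}
```

Because swapping op1 and op2 changes the result in these cases, a select on a comparison can only be turned into a min or max when the selected operands line up with the compared operands in the right order. The sketch assumes the compiler is not run with options such as -ffast-math that relax NaN handling.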
--- gcc/config/i386/sse.md | 63 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 63 insertions(+) diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index 822159a869b..92f8b74999f 100644 --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -3064,6 +3064,69 @@ (define_insn "*3" (set_attr "prefix" "") (set_attr "mode" "")]) +(define_insn_and_split "*minmax3_1" + [(set (match_operand:VFH 0 "register_operand") + (vec_merge:VFH + (match_operand:VFH 1 "nonimmediate_operand") + (match_operand:VFH 2 "nonimmediate_operand") + (unspec: + [(match_operand:VFH 3 "nonimmediate_operand") + (match_operand:VFH 4 "nonimmediate_operand") + (match_operand:SI 5 "const_0_to_31_operand")] + UNSPEC_PCMP)))] + "TARGET_SSE && ix86_pre_reload_split () + && ((rtx_equal_p (operands[1], operands[3]) + && rtx_equal_p (operands[2], operands[4])) + || (rtx_equal_p (operands[1], operands[4]) + && rtx_equal_p (operands[2], operands[3]))) + && (INTVAL (operands[5]) == 1 || INTVAL (operands[5]) == 14)" + "#" + "&& 1" + [(const_int 0)] + { + int u = UNSPEC_IEEE_MIN; + if ((INTVAL (operands[5]) == 1 && rtx_equal_p (operands[1], operands[4])) + || (INTVAL (operands[5]) == 14 && rtx_equal_p (operands[1], operands[3]))) + u = UNSPEC_IEEE_MAX; + + if (MEM_P (operands[1])) + operands[1] = force_reg (mode, operands[1]); + rtvec v = gen_rtvec (2, operands[1], operands[2]); + rtx tmp = gen_rtx_UNSPEC (mode, v, u); + emit_move_insn (operands[0], tmp); + DONE; + }) + +(define_insn_and_split "*minmax3_2" + [(set (match_operand:VF_128_256 0 "register_operand") + (unspec:VF_128_256 + [(match_operand:VF_128_256 1 "nonimmediate_operand") + (match_operand:VF_128_256 2 "nonimmediate_operand") + (lt:VF_128_256 + (match_operand:VF_128_256 3 "nonimmediate_operand") + (match_operand:VF_128_256 4 "nonimmediate_operand"))] + UNSPEC_BLENDV))] + "TARGET_SSE && ix86_pre_reload_split () + && ((rtx_equal_p (operands[1], operands[3]) + && rtx_equal_p (operands[2], operands[4])) + || (rtx_equal_p 
(operands[1], operands[4]) + && rtx_equal_p (operands[2], operands[3])))" + "#" + "&& 1" + [(const_int 0)] + { + int u = UNSPEC_IEEE_MIN; + if (rtx_equal_p (operands[1], operands[3])) + u = UNSPEC_IEEE_MAX; + + if (MEM_P (operands[2])) + operands[2] = force_reg (mode, operands[2]); + rtvec v = gen_rtvec (2, operands[2], operands[1]); + rtx tmp = gen_rtx_UNSPEC (mode, v, u); + emit_move_insn (operands[0], tmp); + DONE; + }) + ;; These versions of the min/max patterns implement exactly the operations ;; min = (op1 < op2 ? op1 : op2) ;; max = (!(op1 < op2) ? op1 : op2)
From patchwork Thu Jun 27 08:23:04 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Liu, Hongtao"
X-Patchwork-Id: 1953035
From: liuhongt
To: gcc-patches@gcc.gnu.org
Cc: crazylht@gmail.com, hjl.tools@gmail.com
Subject: [PATCH 4/7] Add more splitter for mskmov with avx512 comparison.
Date: Thu, 27 Jun 2024 16:23:04 +0800
Message-Id: <20240627082307.1166985-5-hongtao.liu@intel.com>
In-Reply-To: <20240627082307.1166985-1-hongtao.liu@intel.com>
References: <20240627082307.1166985-1-hongtao.liu@intel.com>

gcc/ChangeLog:

	PR target/115517
	* config/i386/sse.md (*_movmsk_lt_avx512): New define_insn_and_split.
	(*_movmsk_ext_lt_avx512): Ditto.
	(*_pmovmskb_lt_avx512): Ditto.
	(*_pmovmskb_zext_lt_avx512): Ditto.
	(*sse2_pmovmskb_ext_lt_avx512): Ditto.
(*pmovsk_kmask_v16qi_avx512): Ditto. (*pmovsk_mask_v32qi_avx512): Ditto. (*pmovsk_mask_cmp__avx512): Ditto. (*pmovsk_ptest__avx512): Ditto. --- gcc/config/i386/sse.md | 232 +++++++++++++++++++++++++++++++++++++---- 1 file changed, 209 insertions(+), 23 deletions(-) diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index 92f8b74999f..5996ad99606 100644 --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -10049,24 +10049,6 @@ (define_insn "*_cvtmask2" [(set_attr "prefix" "evex") (set_attr "mode" "")]) -(define_insn_and_split "*_cvtmask2_not" - [(set (match_operand:VI12_AVX512VL 0 "register_operand") - (vec_merge:VI12_AVX512VL - (match_operand:VI12_AVX512VL 2 "const0_operand") - (match_operand:VI12_AVX512VL 3 "vector_all_ones_operand") - (match_operand: 1 "register_operand")))] - "TARGET_AVX512BW && ix86_pre_reload_split ()" - "#" - "&& 1" - [(set (match_dup 4) - (not: (match_dup 1))) - (set (match_dup 0) - (vec_merge:VI12_AVX512VL - (match_dup 3) - (match_dup 2) - (match_dup 4)))] - "operands[4] = gen_reg_rtx (mode);") - (define_expand "_cvtmask2" [(set (match_operand:VI48_AVX512VL 0 "register_operand") (vec_merge:VI48_AVX512VL @@ -10106,10 +10088,10 @@ (define_insn_and_split "*_cvtmask2" (set_attr "mode" "")]) (define_insn_and_split "*_cvtmask2_not" - [(set (match_operand:VI48_AVX512VL 0 "register_operand") - (vec_merge:VI48_AVX512VL - (match_operand:VI48_AVX512VL 2 "const0_operand") - (match_operand:VI48_AVX512VL 3 "vector_all_ones_operand") + [(set (match_operand:VI1248_AVX512VLBW 0 "register_operand") + (vec_merge:VI1248_AVX512VLBW + (match_operand:VI1248_AVX512VLBW 2 "const0_operand") + (match_operand:VI1248_AVX512VLBW 3 "vector_all_ones_operand") (match_operand: 1 "register_operand")))] "TARGET_AVX512F && ix86_pre_reload_split ()" "#" @@ -10117,7 +10099,7 @@ (define_insn_and_split "*_cvtmask2_not" [(set (match_dup 4) (not: (match_dup 1))) (set (match_dup 0) - (vec_merge:VI48_AVX512VL + (vec_merge:VI1248_AVX512VLBW (match_dup 3) (match_dup 2) 
(match_dup 4)))] @@ -21753,6 +21735,30 @@ (define_insn_and_split "*_movmsk_lt" (set_attr "prefix" "maybe_vex") (set_attr "mode" "")]) +(define_insn_and_split "*_movmsk_lt_avx512" + [(set (match_operand:SI 0 "register_operand" "=r,jr") + (unspec:SI + [(subreg:VF_128_256 + (vec_merge: + (match_operand: 3 "vector_all_ones_operand") + (match_operand: 4 "const0_operand") + (unspec: + [(match_operand: 1 "register_operand" "x,x") + (match_operand: 2 "const0_operand") + (const_int 1)] + UNSPEC_PCMP)) 0)] + UNSPEC_MOVMSK))] + "TARGET_SSE" + "#" + "&& reload_completed" + [(set (match_dup 0) + (unspec:SI [(match_dup 1)] UNSPEC_MOVMSK))] + "operands[1] = gen_lowpart (mode, operands[1]);" + [(set_attr "isa" "noavx,avx") + (set_attr "type" "ssemov") + (set_attr "prefix" "maybe_vex") + (set_attr "mode" "")]) + (define_insn_and_split "*_movmsk_ext_lt" [(set (match_operand:DI 0 "register_operand" "=r,jr") (any_extend:DI @@ -21772,6 +21778,31 @@ (define_insn_and_split "*_movmsk_ext_lt" (set_attr "prefix" "maybe_vex") (set_attr "mode" "")]) +(define_insn_and_split "*_movmsk_ext_lt_avx512" + [(set (match_operand:DI 0 "register_operand" "=r,jr") + (any_extend:DI + (unspec:SI + [(subreg:VF_128_256 + (vec_merge: + (match_operand: 3 "vector_all_ones_operand") + (match_operand: 4 "const0_operand") + (unspec: + [(match_operand: 1 "register_operand" "x,x") + (match_operand: 2 "const0_operand") + (const_int 1)] + UNSPEC_PCMP)) 0)] + UNSPEC_MOVMSK)))] + "TARGET_64BIT && TARGET_SSE" + "#" + "&& reload_completed" + [(set (match_dup 0) + (any_extend:DI (unspec:SI [(match_dup 1)] UNSPEC_MOVMSK)))] + "operands[1] = gen_lowpart (mode, operands[1]);" + [(set_attr "isa" "noavx,avx") + (set_attr "type" "ssemov") + (set_attr "prefix" "maybe_vex") + (set_attr "mode" "")]) + (define_insn_and_split "*_movmsk_shift" [(set (match_operand:SI 0 "register_operand" "=r,jr") (unspec:SI @@ -21961,6 +21992,34 @@ (define_insn_and_split "*_pmovmskb_lt" (set_attr "prefix" "maybe_vex") (set_attr "mode" "SI")]) 
+(define_insn_and_split "*_pmovmskb_lt_avx512" + [(set (match_operand:SI 0 "register_operand" "=r,jr") + (unspec:SI + [(vec_merge:VI1_AVX2 + (match_operand:VI1_AVX2 3 "vector_all_ones_operand") + (match_operand:VI1_AVX2 4 "const0_operand") + (unspec: + [(match_operand:VI1_AVX2 1 "register_operand" "x,x") + (match_operand:VI1_AVX2 2 "const0_operand") + (const_int 1)] + UNSPEC_PCMP))] + UNSPEC_MOVMSK))] + "TARGET_SSE2" + "#" + "&& 1" + [(set (match_dup 0) + (unspec:SI [(match_dup 1)] UNSPEC_MOVMSK))] + "" + [(set_attr "isa" "noavx,avx") + (set_attr "type" "ssemov") + (set (attr "prefix_data16") + (if_then_else + (match_test "TARGET_AVX") + (const_string "*") + (const_string "1"))) + (set_attr "prefix" "maybe_vex") + (set_attr "mode" "SI")]) + (define_insn_and_split "*_pmovmskb_zext_lt" [(set (match_operand:DI 0 "register_operand" "=r,jr") (zero_extend:DI @@ -21984,6 +22043,35 @@ (define_insn_and_split "*_pmovmskb_zext_lt" (set_attr "prefix" "maybe_vex") (set_attr "mode" "SI")]) +(define_insn_and_split "*_pmovmskb_zext_lt_avx512" + [(set (match_operand:DI 0 "register_operand" "=r,jr") + (zero_extend:DI + (unspec:SI + [(vec_merge:VI1_AVX2 + (match_operand:VI1_AVX2 3 "vector_all_ones_operand") + (match_operand:VI1_AVX2 4 "const0_operand") + (unspec: + [(match_operand:VI1_AVX2 1 "register_operand" "x,x") + (match_operand:VI1_AVX2 2 "const0_operand") + (const_int 1)] + UNSPEC_PCMP))] + UNSPEC_MOVMSK)))] + "TARGET_64BIT && TARGET_SSE2" + "#" + "&& 1" + [(set (match_dup 0) + (zero_extend:DI (unspec:SI [(match_dup 1)] UNSPEC_MOVMSK)))] + "" + [(set_attr "isa" "noavx,avx") + (set_attr "type" "ssemov") + (set (attr "prefix_data16") + (if_then_else + (match_test "TARGET_AVX") + (const_string "*") + (const_string "1"))) + (set_attr "prefix" "maybe_vex") + (set_attr "mode" "SI")]) + (define_insn_and_split "*sse2_pmovmskb_ext_lt" [(set (match_operand:DI 0 "register_operand" "=r,jr") (sign_extend:DI @@ -22007,6 +22095,63 @@ (define_insn_and_split "*sse2_pmovmskb_ext_lt" (set_attr 
"prefix" "maybe_vex") (set_attr "mode" "SI")]) +(define_insn_and_split "*sse2_pmovmskb_ext_lt_avx512" + [(set (match_operand:DI 0 "register_operand" "=r,jr") + (sign_extend:DI + (unspec:SI + [(vec_merge:VI1_AVX2 + (match_operand:VI1_AVX2 3 "vector_all_ones_operand") + (match_operand:VI1_AVX2 4 "const0_operand") + (unspec: + [(match_operand:VI1_AVX2 1 "register_operand" "x,x") + (match_operand:VI1_AVX2 2 "const0_operand") + (const_int 1)] + UNSPEC_PCMP))] + UNSPEC_MOVMSK)))] + "TARGET_64BIT && TARGET_SSE2" + "#" + "&& 1" + [(set (match_dup 0) + (sign_extend:DI (unspec:SI [(match_dup 1)] UNSPEC_MOVMSK)))] + "" + [(set_attr "isa" "noavx,avx") + (set_attr "type" "ssemov") + (set (attr "prefix_data16") + (if_then_else + (match_test "TARGET_AVX") + (const_string "*") + (const_string "1"))) + (set_attr "prefix" "maybe_vex") + (set_attr "mode" "SI")]) + +(define_insn_and_split "*pmovsk_kmask_v16qi_avx512" + [(set (match_operand:SI 0 "register_operand") + (unspec:SI + [(vec_merge:V16QI + (match_operand:V16QI 2 "vector_all_ones_operand") + (match_operand:V16QI 3 "const0_operand") + (match_operand:HI 1 "register_operand"))] + UNSPEC_MOVMSK))] + "TARGET_SSE2 && ix86_pre_reload_split ()" + "#" + "&& 1" + [(set (match_dup 0) + (zero_extend:SI (match_dup 1)))]) + +(define_insn_and_split "*pmovsk_mask_v32qi_avx512" + [(set (match_operand:SI 0 "register_operand") + (unspec:SI + [(vec_merge:V32QI + (match_operand:V32QI 2 "vector_all_ones_operand") + (match_operand:V32QI 3 "const0_operand") + (match_operand:SI 1 "register_operand"))] + UNSPEC_MOVMSK))] + "TARGET_SSE2 && ix86_pre_reload_split ()" + "#" + "&& 1" + [(set (match_dup 0) + (match_dup 1))]) + ;; Optimize pxor/pcmpeqb/pmovmskb/cmp 0xffff to ptest. 
(define_mode_attr vi1avx2const [(V32QI "0xffffffff") (V16QI "0xffff")]) @@ -22025,6 +22170,47 @@ (define_split (match_dup 0)] UNSPEC_PTEST))]) +(define_insn_and_split "*pmovsk_mask_cmp__avx512" + [(set (reg:CCZ FLAGS_REG) + (compare:CCZ + (unspec:SI + [(vec_merge:VI1_AVX2 + (match_operand:VI1_AVX2 0 "vector_all_ones_operand") + (match_operand:VI1_AVX2 3 "const0_operand") + (match_operand: 1 "register_operand"))] + UNSPEC_MOVMSK) + (match_operand 2 "const_int_operand")))] + "TARGET_AVX512VL && UINTVAL (operands[2]) <= " + "#" + "&& 1" + [(set (reg:CCZ FLAGS_REG) + (compare:CCZ + (match_dup 1) + (match_dup 2)))] + "operands[2] = gen_int_mode (UINTVAL (operands[2]), mode);") + +(define_insn_and_split "*pmovsk_ptest__avx512" + [(set (reg:CCZ FLAGS_REG) + (compare:CCZ + (unspec:SI + [(vec_merge:VI1_AVX2 + (match_operand:VI1_AVX2 3 "vector_all_ones_operand") + (match_operand:VI1_AVX2 4 "const0_operand") + (unspec: + [(match_operand:VI1_AVX2 0 "vector_operand") + (match_operand:VI1_AVX2 1 "const0_operand") + (const_int 0)] + UNSPEC_PCMP))] + UNSPEC_MOVMSK) + (match_operand 2 "const_int_operand")))] + "TARGET_AVX512VL && (INTVAL (operands[2]) == (int) ())" + "#" + "&& 1" + [(set (reg:CCZ FLAGS_REG) + (unspec:CCZ [(match_dup 0) + (match_dup 0)] + UNSPEC_PTEST))]) + (define_expand "sse2_maskmovdqu" [(set (match_operand:V16QI 0 "memory_operand") (unspec:V16QI [(match_operand:V16QI 1 "register_operand") From patchwork Thu Jun 27 08:23:05 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Liu, Hongtao" X-Patchwork-Id: 1953038 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=intel.com header.i=@intel.com header.a=rsa-sha256 header.s=Intel header.b=i64wWc1r; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; 
spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=8.43.85.97; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4W8sFF4Jhzz20XB for ; Thu, 27 Jun 2024 18:29:53 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id A22243839397 for ; Thu, 27 Jun 2024 08:29:51 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.15]) by sourceware.org (Postfix) with ESMTPS id 2C7223838A21 for ; Thu, 27 Jun 2024 08:23:26 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 2C7223838A21 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=intel.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 2C7223838A21 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=192.198.163.15 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1719476641; cv=none; b=sm74T9tLHxyjJmI/meChKPNs3JshTxSCB3JsvSrgS2pcuQrfnJUAE9ugl0Uaia4Y9ETsFe7V+wfmOpiVWHZsD5FcwWID1u8p1TTdm9PWNMH+mPEcHjn9DyOk3msDGsQAx6XLzHCpTpUFE/WEgTKUz6JCN/E3voSV3K3jPFYkc+4= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1719476641; c=relaxed/simple; bh=4Ie0Z2CCDWYGQcKsWcb8rpFA1x5dNMFUroL/vCEYErA=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; 
b=sMBAyZKYRCyMbnspBP5Pn4iyPNzn16ezIiG88eFE+jz7Ja8zbgpxr3g9YlHMwCzflr+pnb3g26+V3bp2U7Gu+uZ7a6WxGGv/EdYQCFaqNbCNy+DV5EhzbeYpXpVIzzkYVKd9Z2Q/vstFko6FtgXHg96DcPGrZxMh9xvAZ0iuwLg= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1719476606; x=1751012606; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=4Ie0Z2CCDWYGQcKsWcb8rpFA1x5dNMFUroL/vCEYErA=; b=i64wWc1rQ2uVh3ehol2LS6v70gWKydrqaOCgPYHB2zrgUnpSDhLs5z2l OhTSXr8M5Nu9P0Wl39EoH7bl5bIJBrRffDLj84AvZH1VaNchwlK/zrhWG ltI7i5mL+pCHBT0bhf16e7pnpBfRqQK5Sbe2yzuKAbmMaNomo2BvTRnst s9BZmxgtbyGBb9WnxNQvkq0R9nEJCuhjsSBasFEYqIcUNClZ2p7bwU5Lr 39CABlgHYwCLjMNb9Qap2iMOT9lEqpPRfCWJKjj4k75uyMnw2WxiVY9Sj d4bNdX/wTkMqq/5Ut+WjZ0Dgwh7tdWcrWd0WpyOeJOIBruHEb2KoNygZk g==; X-CSE-ConnectionGUID: m3HcyORETGeFswysZDlqjA== X-CSE-MsgGUID: M3uzeaaHTViNVUA+cERbmg== X-IronPort-AV: E=McAfee;i="6700,10204,11115"; a="16732319" X-IronPort-AV: E=Sophos;i="6.08,269,1712646000"; d="scan'208";a="16732319" Received: from orviesa007.jf.intel.com ([10.64.159.147]) by fmvoesa109.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Jun 2024 01:23:15 -0700 X-CSE-ConnectionGUID: W7/yctZpT1yEpMSDoRQ/kA== X-CSE-MsgGUID: 7vXC+V0VSR6hVKvzLvW/fw== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.08,269,1712646000"; d="scan'208";a="44944367" Received: from shvmail03.sh.intel.com ([10.239.245.20]) by orviesa007.jf.intel.com with ESMTP; 27 Jun 2024 01:23:10 -0700 Received: from shliclel4217.sh.intel.com (shliclel4217.sh.intel.com [10.239.240.127]) by shvmail03.sh.intel.com (Postfix) with ESMTP id 9B3AE1006FE8; Thu, 27 Jun 2024 16:23:07 +0800 (CST) From: liuhongt To: gcc-patches@gcc.gnu.org Cc: crazylht@gmail.com, hjl.tools@gmail.com Subject: [PATCH 5/7] Adjust testcase for the regressed testcases after obsolete of vcond{, u, eq}. 
Date: Thu, 27 Jun 2024 16:23:05 +0800
Message-Id: <20240627082307.1166985-6-hongtao.liu@intel.com>
X-Mailer: git-send-email 2.31.1
In-Reply-To: <20240627082307.1166985-1-hongtao.liu@intel.com>
References: <20240627082307.1166985-1-hongtao.liu@intel.com>
MIME-Version: 1.0
X-Spam-Status: No, score=-12.3 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.30
Precedence: list
List-Id: Gcc-patches mailing list
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: ,
Errors-To: gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org

> Richard suggests that we implement the "obvious" transforms like
> inversion in the middle-end but if for example unsigned compares
> are not supported the us_minus + eq + negative trick isn't on
> that list.
>
> The main reason to restrict vec_cmp would be to avoid
> a <= b ? c : d going with an unsupported vec_cmp but instead
> do a > b ? d : c - the alternative is trying to fix this
> on the RTL side via combine. I understand the non-native

Yes, I have a patch which can fix most regressions via pattern matching in combine. One situation remains difficult to deal with, mainly the optimization w/o sse4.1. Because pblendvb/blendvps/blendvpd only exist under sse4.1, w/o sse4.1 it takes 3 instructions (pand, pandn, por) to simulate the vcond_mask, and combine matches at most 4 instructions, which makes it currently impossible to use combine to recover the optimizations previously done in vcond{,u,eq}, i.e. min/max. With sse4.1 and above, there is basically no regression anymore.
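For illustration only, here is a minimal scalar sketch of the select that pand/pandn/por emulate when no blend instruction is available. It is not part of the patch; the names select_bits, min_via_select and check_select are hypothetical, and a single uint32_t stands in for a vector register.

```c
#include <assert.h>
#include <stdint.h>

/* Bitwise select: bits where MASK is set come from A, the rest from B.
   Each statement corresponds to one instruction of the SSE2 sequence.  */
static uint32_t
select_bits (uint32_t mask, uint32_t a, uint32_t b)
{
  uint32_t from_a = mask & a;   /* pand  */
  uint32_t from_b = ~mask & b;  /* pandn */
  return from_a | from_b;       /* por   */
}

/* min written as compare + select: the compare produces an all-ones or
   all-zeros mask, as a vector compare does per element.  */
static uint32_t
min_via_select (uint32_t a, uint32_t b)
{
  uint32_t mask = (a < b) ? UINT32_MAX : 0u;
  return select_bits (mask, a, b);
}

/* Small self-check of the two helpers above.  */
static void
check_select (void)
{
  assert (select_bits (UINT32_MAX, 1u, 2u) == 1u);
  assert (select_bits (0u, 1u, 2u) == 2u);
  assert (min_via_select (3u, 7u) == 3u);
}
```

The compare plus the three logic operations already account for four instructions, which is presumably why a pattern that also has to recognize the surrounding min/max shape falls outside what combine can match.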
The regression testcases w/o sse4.1:

FAIL: g++.target/i386/pr100637-1b.C -std=gnu++14 scan-assembler-times pcmpeqb 2
FAIL: g++.target/i386/pr100637-1b.C -std=gnu++17 scan-assembler-times pcmpeqb 2
FAIL: g++.target/i386/pr100637-1b.C -std=gnu++20 scan-assembler-times pcmpeqb 2
FAIL: g++.target/i386/pr100637-1b.C -std=gnu++98 scan-assembler-times pcmpeqb 2
FAIL: g++.target/i386/pr100637-1w.C -std=gnu++14 scan-assembler-times pcmpeqw 2
FAIL: g++.target/i386/pr100637-1w.C -std=gnu++17 scan-assembler-times pcmpeqw 2
FAIL: g++.target/i386/pr100637-1w.C -std=gnu++20 scan-assembler-times pcmpeqw 2
FAIL: g++.target/i386/pr100637-1w.C -std=gnu++98 scan-assembler-times pcmpeqw 2
FAIL: g++.target/i386/pr103861-1.C -std=gnu++14 scan-assembler-times pcmpeqb 2
FAIL: g++.target/i386/pr103861-1.C -std=gnu++17 scan-assembler-times pcmpeqb 2
FAIL: g++.target/i386/pr103861-1.C -std=gnu++20 scan-assembler-times pcmpeqb 2
FAIL: g++.target/i386/pr103861-1.C -std=gnu++98 scan-assembler-times pcmpeqb 2
FAIL: gcc.target/i386/pr88540.c scan-assembler minpd

gcc/testsuite/ChangeLog:

	PR target/115517
	* g++.target/i386/pr100637-1b.C: Add xfail and -mno-sse4.1.
	* g++.target/i386/pr100637-1w.C: Ditto.
	* g++.target/i386/pr103861-1.C: Ditto.
	* gcc.target/i386/pr88540.c: Ditto.
	* gcc.target/i386/pr103941-2.c: Add -mno-avx512f.
	* g++.target/i386/sse4_1-pr100637-1b.C: New test.
	* g++.target/i386/sse4_1-pr100637-1w.C: New test.
	* g++.target/i386/sse4_1-pr103861-1.C: New test.
	* gcc.target/i386/sse4_1-pr88540.c: New test.
--- gcc/testsuite/g++.target/i386/pr100637-1b.C | 4 ++-- gcc/testsuite/g++.target/i386/pr100637-1w.C | 4 ++-- gcc/testsuite/g++.target/i386/pr103861-1.C | 4 ++-- .../g++.target/i386/sse4_1-pr100637-1b.C | 17 +++++++++++++++++ .../g++.target/i386/sse4_1-pr100637-1w.C | 17 +++++++++++++++++ .../g++.target/i386/sse4_1-pr103861-1.C | 17 +++++++++++++++++ gcc/testsuite/gcc.target/i386/pr103941-2.c | 2 +- gcc/testsuite/gcc.target/i386/pr88540.c | 4 ++-- gcc/testsuite/gcc.target/i386/sse4_1-pr88540.c | 10 ++++++++++ 9 files changed, 70 insertions(+), 9 deletions(-) create mode 100644 gcc/testsuite/g++.target/i386/sse4_1-pr100637-1b.C create mode 100644 gcc/testsuite/g++.target/i386/sse4_1-pr100637-1w.C create mode 100644 gcc/testsuite/g++.target/i386/sse4_1-pr103861-1.C create mode 100644 gcc/testsuite/gcc.target/i386/sse4_1-pr88540.c diff --git a/gcc/testsuite/g++.target/i386/pr100637-1b.C b/gcc/testsuite/g++.target/i386/pr100637-1b.C index 35b5df7c9dd..dccb8f5e712 100644 --- a/gcc/testsuite/g++.target/i386/pr100637-1b.C +++ b/gcc/testsuite/g++.target/i386/pr100637-1b.C @@ -1,6 +1,6 @@ /* PR target/100637 */ /* { dg-do compile } */ -/* { dg-options "-O2 -msse2" } */ +/* { dg-options "-O2 -msse2 -mno-sse4.1" } */ typedef unsigned char __attribute__((__vector_size__ (4))) __v4qu; typedef char __attribute__((__vector_size__ (4))) __v4qi; @@ -13,5 +13,5 @@ __v4qu us (__v4qi a, __v4qi b) { return (a > b) ? au : bu; } __v4qi su (__v4qu a, __v4qu b) { return (a > b) ? as : bs; } __v4qi ss (__v4qi a, __v4qi b) { return (a > b) ? 
as : bs; } -/* { dg-final { scan-assembler-times "pcmpeqb" 2 } } */ +/* { dg-final { scan-assembler-times "pcmpeqb" 2 { xfail *-*-* } } } */ /* { dg-final { scan-assembler-times "pcmpgtb" 2 } } */ diff --git a/gcc/testsuite/g++.target/i386/pr100637-1w.C b/gcc/testsuite/g++.target/i386/pr100637-1w.C index a3ed06fddee..a0aab62db33 100644 --- a/gcc/testsuite/g++.target/i386/pr100637-1w.C +++ b/gcc/testsuite/g++.target/i386/pr100637-1w.C @@ -1,6 +1,6 @@ /* PR target/100637 */ /* { dg-do compile } */ -/* { dg-options "-O2 -msse2" } */ +/* { dg-options "-O2 -msse2 -mno-sse4.1" } */ typedef unsigned short __attribute__((__vector_size__ (4))) __v2hu; typedef short __attribute__((__vector_size__ (4))) __v2hi; @@ -13,5 +13,5 @@ __v2hu us (__v2hi a, __v2hi b) { return (a > b) ? au : bu; } __v2hi su (__v2hu a, __v2hu b) { return (a > b) ? as : bs; } __v2hi ss (__v2hi a, __v2hi b) { return (a > b) ? as : bs; } -/* { dg-final { scan-assembler-times "pcmpeqw" 2 } } */ +/* { dg-final { scan-assembler-times "pcmpeqw" 2 { xfail *-*-* } } } */ /* { dg-final { scan-assembler-times "pcmpgtw" 2 } } */ diff --git a/gcc/testsuite/g++.target/i386/pr103861-1.C b/gcc/testsuite/g++.target/i386/pr103861-1.C index 6475728991e..3b282a7dca2 100644 --- a/gcc/testsuite/g++.target/i386/pr103861-1.C +++ b/gcc/testsuite/g++.target/i386/pr103861-1.C @@ -1,6 +1,6 @@ /* PR target/103861 */ /* { dg-do compile } */ -/* { dg-options "-O2 -msse2" } */ +/* { dg-options "-O2 -msse2 -mno-sse4.1" } */ typedef unsigned char __attribute__((__vector_size__ (2))) __v2qu; typedef char __attribute__((__vector_size__ (2))) __v2qi; @@ -13,5 +13,5 @@ __v2qu us (__v2qi a, __v2qi b) { return (a > b) ? au : bu; } __v2qi su (__v2qu a, __v2qu b) { return (a > b) ? as : bs; } __v2qi ss (__v2qi a, __v2qi b) { return (a > b) ? 
as : bs; } -/* { dg-final { scan-assembler-times "pcmpeqb" 2 } } */ +/* { dg-final { scan-assembler-times "pcmpeqb" 2 { xfail *-*-* } } } */ /* { dg-final { scan-assembler-times "pcmpgtb" 2 } } */ diff --git a/gcc/testsuite/g++.target/i386/sse4_1-pr100637-1b.C b/gcc/testsuite/g++.target/i386/sse4_1-pr100637-1b.C new file mode 100644 index 00000000000..3230a4ee563 --- /dev/null +++ b/gcc/testsuite/g++.target/i386/sse4_1-pr100637-1b.C @@ -0,0 +1,17 @@ +/* PR target/100637 */ +/* { dg-do compile } */ +/* { dg-options "-O2 -msse4.1" } */ + +typedef unsigned char __attribute__((__vector_size__ (4))) __v4qu; +typedef char __attribute__((__vector_size__ (4))) __v4qi; + +__v4qu au, bu; +__v4qi as, bs; + +__v4qu uu (__v4qu a, __v4qu b) { return (a > b) ? au : bu; } +__v4qu us (__v4qi a, __v4qi b) { return (a > b) ? au : bu; } +__v4qi su (__v4qu a, __v4qu b) { return (a > b) ? as : bs; } +__v4qi ss (__v4qi a, __v4qi b) { return (a > b) ? as : bs; } + +/* { dg-final { scan-assembler-times "pcmpeqb" 2 } } */ +/* { dg-final { scan-assembler-times "pcmpgtb" 2 } } */ diff --git a/gcc/testsuite/g++.target/i386/sse4_1-pr100637-1w.C b/gcc/testsuite/g++.target/i386/sse4_1-pr100637-1w.C new file mode 100644 index 00000000000..9036ea5429d --- /dev/null +++ b/gcc/testsuite/g++.target/i386/sse4_1-pr100637-1w.C @@ -0,0 +1,17 @@ +/* PR target/100637 */ +/* { dg-do compile } */ +/* { dg-options "-O2 -msse4.1" } */ + +typedef unsigned short __attribute__((__vector_size__ (4))) __v2hu; +typedef short __attribute__((__vector_size__ (4))) __v2hi; + +__v2hu au, bu; +__v2hi as, bs; + +__v2hu uu (__v2hu a, __v2hu b) { return (a > b) ? au : bu; } +__v2hu us (__v2hi a, __v2hi b) { return (a > b) ? au : bu; } +__v2hi su (__v2hu a, __v2hu b) { return (a > b) ? as : bs; } +__v2hi ss (__v2hi a, __v2hi b) { return (a > b) ? 
as : bs; } + +/* { dg-final { scan-assembler-times "pcmpeqw" 2 } } */ +/* { dg-final { scan-assembler-times "pcmpgtw" 2 } } */ diff --git a/gcc/testsuite/g++.target/i386/sse4_1-pr103861-1.C b/gcc/testsuite/g++.target/i386/sse4_1-pr103861-1.C new file mode 100644 index 00000000000..a2b74898db9 --- /dev/null +++ b/gcc/testsuite/g++.target/i386/sse4_1-pr103861-1.C @@ -0,0 +1,17 @@ +/* PR target/103861 */ +/* { dg-do compile } */ +/* { dg-options "-O2 -msse4.1" } */ + +typedef unsigned char __attribute__((__vector_size__ (2))) __v2qu; +typedef char __attribute__((__vector_size__ (2))) __v2qi; + +__v2qu au, bu; +__v2qi as, bs; + +__v2qu uu (__v2qu a, __v2qu b) { return (a > b) ? au : bu; } +__v2qu us (__v2qi a, __v2qi b) { return (a > b) ? au : bu; } +__v2qi su (__v2qu a, __v2qu b) { return (a > b) ? as : bs; } +__v2qi ss (__v2qi a, __v2qi b) { return (a > b) ? as : bs; } + +/* { dg-final { scan-assembler-times "pcmpeqb" 2 } } */ +/* { dg-final { scan-assembler-times "pcmpgtb" 2 } } */ diff --git a/gcc/testsuite/gcc.target/i386/pr103941-2.c b/gcc/testsuite/gcc.target/i386/pr103941-2.c index 972a32be997..9b24a10c63d 100644 --- a/gcc/testsuite/gcc.target/i386/pr103941-2.c +++ b/gcc/testsuite/gcc.target/i386/pr103941-2.c @@ -1,5 +1,5 @@ /* { dg-do compile } */ -/* { dg-options "-O2 -msse2" } */ +/* { dg-options "-O2 -msse2 -mno-avx512f" } */ void foo (int *c, float *x, float *y) { diff --git a/gcc/testsuite/gcc.target/i386/pr88540.c b/gcc/testsuite/gcc.target/i386/pr88540.c index b927d0c57d5..a22744763b5 100644 --- a/gcc/testsuite/gcc.target/i386/pr88540.c +++ b/gcc/testsuite/gcc.target/i386/pr88540.c @@ -1,5 +1,5 @@ /* { dg-do compile } */ -/* { dg-options "-O2 -msse2" } */ +/* { dg-options "-O2 -msse2 -mno-sse4.1" } */ void test(double* __restrict d1, double* __restrict d2, double* __restrict d3) { @@ -7,4 +7,4 @@ void test(double* __restrict d1, double* __restrict d2, double* __restrict d3) d3[n] = d1[n] < d2[n] ? 
d1[n] : d2[n]; } -/* { dg-final { scan-assembler "minpd" } } */ +/* { dg-final { scan-assembler "minpd" { xfail *-*-* } } } */ diff --git a/gcc/testsuite/gcc.target/i386/sse4_1-pr88540.c b/gcc/testsuite/gcc.target/i386/sse4_1-pr88540.c new file mode 100644 index 00000000000..31565a69db5 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/sse4_1-pr88540.c @@ -0,0 +1,10 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -msse4.1" } */ + +void test(double* __restrict d1, double* __restrict d2, double* __restrict d3) +{ + for (int n = 0; n < 2; ++n) + d3[n] = d1[n] < d2[n] ? d1[n] : d2[n]; +} + +/* { dg-final { scan-assembler "minpd" } } */ From patchwork Thu Jun 27 08:23:06 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Liu, Hongtao" X-Patchwork-Id: 1953034 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=intel.com header.i=@intel.com header.a=rsa-sha256 header.s=Intel header.b=DRurEEyN; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4W8sBY1JqKz20XB for ; Thu, 27 Jun 2024 18:27:33 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 365373838A21 for ; Thu, 27 Jun 2024 08:27:31 +0000 (GMT) X-Original-To: 
gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.15]) by sourceware.org (Postfix) with ESMTPS id 0D1B83839390 for ; Thu, 27 Jun 2024 08:23:27 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 0D1B83839390 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=intel.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 0D1B83839390 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=192.198.163.15 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1719476643; cv=none; b=uLKQMqEAnl5XCw3cZYjekCm3/8MbFDF9pYewR6jEi+o20wIEn7/VYALWuSh+aVQaiucyoEc3tlvZLGt+labPrO3xD7/90kSxEvZN8AivdGytdKPoaJGxQEoF4aP2z9YPszfl0FYsh4vWfNPckrYd6oL1XJupOd2Wx2XBakIapC8= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1719476643; c=relaxed/simple; bh=04ajH+RARg8ByQtA34Ts8IVmYJXpTadWK549gK5YLos=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=TztLToikyHAMh2SLXyabwFQ3k+BJlwsusCCCHMpJq0n5Snqj+HNuEf7wmSxlg1MZkXExyEXjZWo1rxBj9y/I3QdqtlxoSI9Pm/Yok+B9ONIbfQjPhiIb0uXjEZTBD6yDJzLWNr3wGwSrYK032DHVRKvpKljGfAN254YmmAPTFtk= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1719476608; x=1751012608; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=04ajH+RARg8ByQtA34Ts8IVmYJXpTadWK549gK5YLos=; b=DRurEEyNKGJOALp0ozZAY2M3bdjyHun5vxJSdum+az3TP5TJ3zqJn8B5 jZffKMmZIAD2W5aG3/jU5SEMvIhlHrJ11aLoNrQLL/rRRgz5LHM5JBFt4 nQAqd1a13/NB3m3TPbdbjSOhUa2SQyR5S2rK+MWlQT+OpAR83/Q6ig+g2 fbnny5W85qNzA4oiYpihRtt+VoRvhHtyRY3jGDzRJDorZqiQeqptsEOB2 7zIrwFdlUXo1ETneSDQx91S3AGqdHMH6TD1RHtz0fnTPV7fQFWN/xyqwd 98pyoHWRCpfbatEIQAgvFYhyJg94WitsnsPvrziC0ExxMCECZN0ZmG9J2 g==; X-CSE-ConnectionGUID: 7qNYeOSgQTe7h10KAaMrJw== X-CSE-MsgGUID: 
2O2lc3ZpTFee8KByyl+lrg==
X-IronPort-AV: E=McAfee;i="6700,10204,11115"; a="16732322"
X-IronPort-AV: E=Sophos;i="6.08,269,1712646000"; d="scan'208";a="16732322"
Received: from orviesa007.jf.intel.com ([10.64.159.147]) by fmvoesa109.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Jun 2024 01:23:15 -0700
X-CSE-ConnectionGUID: VNczvHY+Sra0zlETQmw6HA==
X-CSE-MsgGUID: beUU7OvaRqSlWm9ObmwDvg==
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="6.08,269,1712646000"; d="scan'208";a="44944369"
Received: from shvmail03.sh.intel.com ([10.239.245.20]) by orviesa007.jf.intel.com with ESMTP; 27 Jun 2024 01:23:10 -0700
Received: from shliclel4217.sh.intel.com (shliclel4217.sh.intel.com [10.239.240.127]) by shvmail03.sh.intel.com (Postfix) with ESMTP id 9ED791006FEC; Thu, 27 Jun 2024 16:23:07 +0800 (CST)
From: liuhongt
To: gcc-patches@gcc.gnu.org
Cc: crazylht@gmail.com, hjl.tools@gmail.com
Subject: [PATCH 6/7] [x86] Optimize a < 0 ? -1 : 0 to (signed)a >> 31.
Date: Thu, 27 Jun 2024 16:23:06 +0800
Message-Id: <20240627082307.1166985-7-hongtao.liu@intel.com>
X-Mailer: git-send-email 2.31.1
In-Reply-To: <20240627082307.1166985-1-hongtao.liu@intel.com>
References: <20240627082307.1166985-1-hongtao.liu@intel.com>
MIME-Version: 1.0
X-Spam-Status: No, score=-12.4 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.30
Precedence: list
List-Id: Gcc-patches mailing list
List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: ,
Errors-To: gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org

Try to optimize x < 0 ? -1 : 0 into (signed) x >> 31 and x < 0 ? 1 : 0 into (unsigned) x >> 31, where 31 is the element width minus one for 32-bit elements. Add define_insn_and_split patterns for the optimization previously done in ix86_expand_int_vcond.
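As a scalar illustration of the two identities (not part of the patch), the following sketch checks them for 32-bit values. The helper names sign_mask, sign_bit and check_one are hypothetical. It assumes that right-shifting a negative signed value is an arithmetic shift, which ISO C leaves implementation-defined but GCC guarantees.

```c
#include <assert.h>
#include <stdint.h>

/* x < 0 ? -1 : 0 as an arithmetic right shift by (width - 1).
   Assumes >> on a negative int32_t propagates the sign bit (true for GCC).  */
static int32_t
sign_mask (int32_t x)
{
  return x >> 31;
}

/* x < 0 ? 1 : 0 as a logical right shift of the unsigned bit pattern.  */
static uint32_t
sign_bit (int32_t x)
{
  return (uint32_t) x >> 31;
}

/* Compare both shift forms against the conditional forms for one value.  */
static void
check_one (int32_t x)
{
  assert (sign_mask (x) == (x < 0 ? -1 : 0));
  assert (sign_bit (x) == (x < 0 ? 1u : 0u));
}
```

The vector patterns apply the same idea per element, with the shift count derived from the element size rather than the fixed 31 used here.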
gcc/ChangeLog:

	PR target/115517
	* config/i386/sse.md (*ashr3_1): New define_insn_and_split.
	(*avx512_ashr3_1): Ditto.
	(*avx2_lshr3_1): Ditto.
	(*avx2_lshr3_2): Ditto and add 2 combine splitters after it.
	* config/i386/mmx.md (mmxscalarsize): New mode attribute.
	(*mmx_ashr3_1): New define_insn_and_split.
	(mmx_3): Add a combine splitter after it.
	(*mmx_ashrv2hi3_1): New define_insn_and_split, also add a combine splitter after it.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/avx2-pr115517.c: New test.
	* gcc.target/i386/avx512-pr115517.c: New test.
	* g++.target/i386/avx2-pr115517.C: New test.
	* g++.target/i386/avx512-pr115517.C: New test.
	* gcc.target/i386/pr111023-2.c: Adjust testcase.
	* gcc.target/i386/vect-div-1.c: Ditto.
---
 gcc/config/i386/mmx.md | 52 ++++++++++++
 gcc/config/i386/sse.md | 83 +++++++++++++++++++
 gcc/testsuite/g++.target/i386/avx2-pr115517.C | 60 ++++++++++++++
 .../g++.target/i386/avx512-pr115517.C | 70 ++++++++++++++++
 gcc/testsuite/gcc.target/i386/avx2-pr115517.c | 33 ++++++++
 .../gcc.target/i386/avx512-pr115517.c | 70 ++++++++++++++++
 gcc/testsuite/gcc.target/i386/pr111023-2.c | 4 +-
 gcc/testsuite/gcc.target/i386/vect-div-1.c | 3 +-
 8 files changed, 372 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/g++.target/i386/avx2-pr115517.C
 create mode 100644 gcc/testsuite/g++.target/i386/avx512-pr115517.C
 create mode 100644 gcc/testsuite/gcc.target/i386/avx2-pr115517.c
 create mode 100644 gcc/testsuite/gcc.target/i386/avx512-pr115517.c

diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index ea53f516cbb..7262bf146c2 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -135,6 +135,14 @@ (define_mode_attr mmxscalarmodelower (V4HI "hi") (V2HI "hi") (V8QI "qi")]) +(define_mode_attr mmxscalarsize + [(V1DI "64") + (V2SI "32") (V2SF "32") + (V4HF "16") (V4BF "16") + (V2HF "16") (V2BF "16") + (V4HI "16") (V2HI "16") + (V8QI "8")]) + (define_mode_attr Yv_Yw [(V8QI "Yw") (V4HI "Yw") (V2SI "Yv") (V1DI "Yv") (V2SF "Yv")]) @@
-3608,6 +3616,17 @@ (define_insn "mmx_ashr3" (const_string "0"))) (set_attr "mode" "DI,TI,TI")]) +(define_insn_and_split "*mmx_ashr3_1" + [(set (match_operand:MMXMODE24 0 "register_operand") + (lt:MMXMODE24 + (match_operand:MMXMODE24 1 "register_operand") + (match_operand:MMXMODE24 2 "const0_operand")))] + "TARGET_MMX_WITH_SSE && ix86_pre_reload_split ()" + "#" + "&& 1" + [(set (match_dup 0) (ashiftrt:MMXMODE24 (match_dup 1) (match_dup 3)))] + "operands[3] = gen_int_mode ( - 1, DImode);") + (define_expand "ashr3" [(set (match_operand:MMXMODE24 0 "register_operand") (ashiftrt:MMXMODE24 @@ -3634,6 +3653,17 @@ (define_insn "mmx_3" (const_string "0"))) (set_attr "mode" "DI,TI,TI")]) +(define_split + [(set (match_operand:MMXMODE248 0 "register_operand") + (and:MMXMODE248 + (lt:MMXMODE248 + (match_operand:MMXMODE248 1 "register_operand") + (match_operand:MMXMODE248 2 "const0_operand")) + (match_operand:MMXMODE248 3 "const1_operand")))] + "TARGET_MMX_WITH_SSE && ix86_pre_reload_split ()" + [(set (match_dup 0) (lshiftrt:MMXMODE248 (match_dup 1) (match_dup 4)))] + "operands[4] = gen_int_mode ( - 1, DImode);") + (define_expand "3" [(set (match_operand:MMXMODE24 0 "register_operand") (any_lshift:MMXMODE24 @@ -3675,6 +3705,28 @@ (define_insn "v2hi3" (const_string "0"))) (set_attr "mode" "TI")]) +(define_insn_and_split "*mmx_ashrv2hi3_1" + [(set (match_operand:V2HI 0 "register_operand") + (lt:V2HI + (match_operand:V2HI 1 "register_operand") + (match_operand:V2HI 2 "const0_operand")))] + "TARGET_SSE2 && ix86_pre_reload_split ()" + "#" + "&& 1" + [(set (match_dup 0) (ashiftrt:V2HI (match_dup 1) (match_dup 3)))] + "operands[3] = gen_int_mode (15, DImode);") + +(define_split + [(set (match_operand:V2HI 0 "register_operand") + (and:V2HI + (lt:V2HI + (match_operand:V2HI 1 "register_operand") + (match_operand:V2HI 2 "const0_operand")) + (match_operand:V2HI 3 "const1_operand")))] + "TARGET_SSE2 && ix86_pre_reload_split ()" + [(set (match_dup 0) (lshiftrt:V2HI (match_dup 1) (match_dup 
4)))] + "operands[4] = gen_int_mode (15, DImode);") + (define_expand "v8qi3" [(set (match_operand:V8QI 0 "register_operand") (any_shift:V8QI (match_operand:V8QI 1 "register_operand") diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index 5996ad99606..d86b6fa81c0 100644 --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -16860,6 +16860,17 @@ (define_insn "ashr3" (set_attr "prefix" "orig,vex") (set_attr "mode" "")]) +(define_insn_and_split "*ashr3_1" + [(set (match_operand:VI24_AVX2 0 "register_operand") + (lt:VI24_AVX2 + (match_operand:VI24_AVX2 1 "register_operand") + (match_operand:VI24_AVX2 2 "const0_operand")))] + "TARGET_SSE2 && ix86_pre_reload_split ()" + "#" + "&& 1" + [(set (match_dup 0) (ashiftrt:VI24_AVX2 (match_dup 1) (match_dup 3)))] + "operands[3] = gen_int_mode ( - 1, DImode);") + (define_insn "ashr3" [(set (match_operand:VI248_AVX512BW_AVX512VL 0 "register_operand" "=v,v") (ashiftrt:VI248_AVX512BW_AVX512VL @@ -16874,6 +16885,23 @@ (define_insn "ashr3" (const_string "0"))) (set_attr "mode" "")]) +(define_insn_and_split "*avx512_ashr3_1" + [(set (match_operand:VI248_AVX512VLBW 0 "register_operand") + (vec_merge:VI248_AVX512VLBW + (match_operand:VI248_AVX512VLBW 1 "vector_all_ones_operand") + (match_operand:VI248_AVX512VLBW 2 "const0_operand") + (unspec: + [(match_operand:VI248_AVX512VLBW 3 "nonimmediate_operand") + (match_operand:VI248_AVX512VLBW 4 "const0_operand") + (const_int 1)] + UNSPEC_PCMP)))] + "TARGET_AVX512F && ix86_pre_reload_split ()" + "#" + "&& 1" + [(set (match_dup 0) + (ashiftrt:VI248_AVX512VLBW (match_dup 3) (match_dup 5)))] + "operands[5] = gen_int_mode ( - 1, DImode);") + (define_expand "ashr3" [(set (match_operand:VI248_AVX512BW 0 "register_operand") (ashiftrt:VI248_AVX512BW @@ -17028,6 +17056,61 @@ (define_insn "3" (set_attr "prefix" "orig,vex") (set_attr "mode" "")]) +(define_insn_and_split "*avx2_lshr3_1" + [(set (match_operand:VI8_AVX2 0 "register_operand") + (and:VI8_AVX2 + (gt:VI8_AVX2 + 
(match_operand:VI8_AVX2 1 "register_operand") + (match_operand:VI8_AVX2 2 "register_operand")) + (match_operand:VI8_AVX2 3 "const1_operand")))] + "TARGET_SSE4_2 && ix86_pre_reload_split ()" + "#" + "&& 1" + [(set (match_dup 5) (gt:VI8_AVX2 (match_dup 1) (match_dup 2))) + (set (match_dup 0) (lshiftrt:VI8_AVX2 (match_dup 5) (match_dup 4)))] +{ + operands[4] = gen_int_mode ( - 1, DImode); + operands[5] = gen_reg_rtx (mode); +}) + +(define_insn_and_split "*avx2_lshr3_2" + [(set (match_operand:VI8_AVX2 0 "register_operand") + (and:VI8_AVX2 + (lt:VI8_AVX2 + (match_operand:VI8_AVX2 1 "register_operand") + (match_operand:VI8_AVX2 2 "const0_operand")) + (match_operand:VI8_AVX2 3 "const1_operand")))] + "TARGET_SSE2 && ix86_pre_reload_split ()" + "#" + "&& 1" + [(set (match_dup 0) (lshiftrt:VI8_AVX2 (match_dup 1) (const_int 63)))]) + +(define_split + [(set (match_operand:VI248_AVX2 0 "register_operand") + (and:VI248_AVX2 + (lt:VI248_AVX2 + (match_operand:VI248_AVX2 1 "register_operand") + (match_operand:VI248_AVX2 2 "const0_operand")) + (match_operand:VI248_AVX2 3 "const1_operand")))] + "TARGET_SSE2 && ix86_pre_reload_split ()" + [(set (match_dup 0) (lshiftrt:VI248_AVX2 (match_dup 1) (match_dup 4)))] + "operands[4] = gen_int_mode ( - 1, DImode);") + +(define_split + [(set (match_operand:VI248_AVX512VLBW 0 "register_operand") + (vec_merge:VI248_AVX512VLBW + (match_operand:VI248_AVX512VLBW 1 "const1_operand") + (match_operand:VI248_AVX512VLBW 2 "const0_operand") + (unspec: + [(match_operand:VI248_AVX512VLBW 3 "nonimmediate_operand") + (match_operand:VI248_AVX512VLBW 4 "const0_operand") + (const_int 1)] + UNSPEC_PCMP)))] + "TARGET_AVX512F && ix86_pre_reload_split ()" + [(set (match_dup 0) + (lshiftrt:VI248_AVX512VLBW (match_dup 3) (match_dup 5)))] + "operands[5] = gen_int_mode ( - 1, DImode);") + (define_insn "3" [(set (match_operand:VI248_AVX512BW 0 "register_operand" "=v,v") (any_lshift:VI248_AVX512BW diff --git a/gcc/testsuite/g++.target/i386/avx2-pr115517.C 
b/gcc/testsuite/g++.target/i386/avx2-pr115517.C new file mode 100644 index 00000000000..ec000c57542 --- /dev/null +++ b/gcc/testsuite/g++.target/i386/avx2-pr115517.C @@ -0,0 +1,60 @@ +/* { dg-do compile } */ +/* { dg-options "-mavx2 -O2" } */ +/* { dg-final { scan-assembler-times "vpsrlq" 2 } } */ +/* { dg-final { scan-assembler-times "vpsrld" 2 } } */ +/* { dg-final { scan-assembler-times "vpsrlw" 2 } } */ + +typedef short v8hi __attribute__((vector_size(16))); +typedef short v16hi __attribute__((vector_size(32))); +typedef int v4si __attribute__((vector_size(16))); +typedef int v8si __attribute__((vector_size(32))); +typedef long long v2di __attribute__((vector_size(16))); +typedef long long v4di __attribute__((vector_size(32))); + +v8hi +foo (v8hi a) +{ + v8hi const1_op = __extension__(v8hi){1,1,1,1,1,1,1,1}; + v8hi const0_op = __extension__(v8hi){0,0,0,0,0,0,0,0}; + return a < const0_op ? const1_op : const0_op; +} + +v16hi +foo2 (v16hi a) +{ + v16hi const1_op = __extension__(v16hi){1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1}; + v16hi const0_op = __extension__(v16hi){0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0}; + return a < const0_op ? const1_op : const0_op; +} + +v4si +foo3 (v4si a) +{ + v4si const1_op = __extension__(v4si){1,1,1,1}; + v4si const0_op = __extension__(v4si){0,0,0,0}; + return a < const0_op ? const1_op : const0_op; +} + +v8si +foo4 (v8si a) +{ + v8si const1_op = __extension__(v8si){1,1,1,1,1,1,1,1}; + v8si const0_op = __extension__(v8si){0,0,0,0,0,0,0,0}; + return a < const0_op ? const1_op : const0_op; +} + +v2di +foo3 (v2di a) +{ + v2di const1_op = __extension__(v2di){1,1}; + v2di const0_op = __extension__(v2di){0,0}; + return a < const0_op ? const1_op : const0_op; +} + +v4di +foo4 (v4di a) +{ + v4di const1_op = __extension__(v4di){1,1,1,1}; + v4di const0_op = __extension__(v4di){0,0,0,0}; + return a < const0_op ? 
const1_op : const0_op; +} diff --git a/gcc/testsuite/g++.target/i386/avx512-pr115517.C b/gcc/testsuite/g++.target/i386/avx512-pr115517.C new file mode 100644 index 00000000000..22df41bbdc9 --- /dev/null +++ b/gcc/testsuite/g++.target/i386/avx512-pr115517.C @@ -0,0 +1,70 @@ +/* { dg-do compile } */ +/* { dg-options "-mavx512bw -mavx512vl -O2" } */ +/* { dg-final { scan-assembler-times "vpsrad" 3 } } */ +/* { dg-final { scan-assembler-times "vpsraw" 3 } } */ +/* { dg-final { scan-assembler-times "vpsraq" 3 } } */ + +typedef short v8hi __attribute__((vector_size(16))); +typedef short v16hi __attribute__((vector_size(32))); +typedef short v32hi __attribute__((vector_size(64))); +typedef int v4si __attribute__((vector_size(16))); +typedef int v8si __attribute__((vector_size(32))); +typedef int v16si __attribute__((vector_size(64))); +typedef long long v2di __attribute__((vector_size(16))); +typedef long long v4di __attribute__((vector_size(32))); +typedef long long v8di __attribute__((vector_size(64))); + +v8hi +foo (v8hi a) +{ + return a < __extension__(v8hi) { 0, 0, 0, 0, 0, 0, 0, 0}; +} + +v16hi +foo2 (v16hi a) +{ + return a < __extension__(v16hi) { 0, 0, 0, 0, 0, 0, 0, 0,0, 0, 0, 0, 0, 0, 0, 0}; +} + +v32hi +foo3 (v32hi a) +{ + return a < __extension__(v32hi) { 0, 0, 0, 0, 0, 0, 0, 0,0, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0,0, 0, 0, 0, 0, 0, 0, 0}; +} + +v4si +foo4 (v4si a) +{ + return a < __extension__(v4si) { 0, 0, 0, 0}; +} + +v8si +foo5 (v8si a) +{ + return a < __extension__(v8si) { 0, 0, 0, 0, 0, 0, 0, 0}; +} + +v16si +foo6 (v16si a) +{ + return a < __extension__(v16si) { 0, 0, 0, 0, 0, 0, 0, 0,0, 0, 0, 0, 0, 0, 0, 0}; +} + +v2di +foo7 (v2di a) +{ + return a < __extension__(v2di) { 0, 0}; +} + +v4di +foo8 (v4di a) +{ + return a < __extension__(v4di) { 0, 0, 0, 0}; +} + +v8di +foo9 (v8di a) +{ + return a < __extension__(v8di) { 0, 0, 0, 0, 0, 0, 0, 0}; +} diff --git a/gcc/testsuite/gcc.target/i386/avx2-pr115517.c 
b/gcc/testsuite/gcc.target/i386/avx2-pr115517.c new file mode 100644 index 00000000000..5b2620b0dc1 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx2-pr115517.c @@ -0,0 +1,33 @@ +/* { dg-do compile } */ +/* { dg-options "-mavx2 -O2" } */ +/* { dg-final { scan-assembler-times "vpsrad" 2 } } */ +/* { dg-final { scan-assembler-times "vpsraw" 2 } } */ + +typedef short v8hi __attribute__((vector_size(16))); +typedef short v16hi __attribute__((vector_size(32))); +typedef int v4si __attribute__((vector_size(16))); +typedef int v8si __attribute__((vector_size(32))); + +v8hi +foo (v8hi a) +{ + return a < __extension__(v8hi) { 0, 0, 0, 0, 0, 0, 0, 0}; +} + +v16hi +foo2 (v16hi a) +{ + return a < __extension__(v16hi) { 0, 0, 0, 0, 0, 0, 0, 0,0, 0, 0, 0, 0, 0, 0, 0}; +} + +v4si +foo3 (v4si a) +{ + return a < __extension__(v4si) { 0, 0, 0, 0}; +} + +v8si +foo4 (v8si a) +{ + return a < __extension__(v8si) { 0, 0, 0, 0, 0, 0, 0, 0}; +} diff --git a/gcc/testsuite/gcc.target/i386/avx512-pr115517.c b/gcc/testsuite/gcc.target/i386/avx512-pr115517.c new file mode 100644 index 00000000000..22df41bbdc9 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx512-pr115517.c @@ -0,0 +1,70 @@ +/* { dg-do compile } */ +/* { dg-options "-mavx512bw -mavx512vl -O2" } */ +/* { dg-final { scan-assembler-times "vpsrad" 3 } } */ +/* { dg-final { scan-assembler-times "vpsraw" 3 } } */ +/* { dg-final { scan-assembler-times "vpsraq" 3 } } */ + +typedef short v8hi __attribute__((vector_size(16))); +typedef short v16hi __attribute__((vector_size(32))); +typedef short v32hi __attribute__((vector_size(64))); +typedef int v4si __attribute__((vector_size(16))); +typedef int v8si __attribute__((vector_size(32))); +typedef int v16si __attribute__((vector_size(64))); +typedef long long v2di __attribute__((vector_size(16))); +typedef long long v4di __attribute__((vector_size(32))); +typedef long long v8di __attribute__((vector_size(64))); + +v8hi +foo (v8hi a) +{ + return a < __extension__(v8hi) { 0, 0, 0, 0, 
0, 0, 0, 0}; +} + +v16hi +foo2 (v16hi a) +{ + return a < __extension__(v16hi) { 0, 0, 0, 0, 0, 0, 0, 0,0, 0, 0, 0, 0, 0, 0, 0}; +} + +v32hi +foo3 (v32hi a) +{ + return a < __extension__(v32hi) { 0, 0, 0, 0, 0, 0, 0, 0,0, 0, 0, 0, 0, 0, 0, 0, + 0, 0, 0, 0, 0, 0, 0, 0,0, 0, 0, 0, 0, 0, 0, 0}; +} + +v4si +foo4 (v4si a) +{ + return a < __extension__(v4si) { 0, 0, 0, 0}; +} + +v8si +foo5 (v8si a) +{ + return a < __extension__(v8si) { 0, 0, 0, 0, 0, 0, 0, 0}; +} + +v16si +foo6 (v16si a) +{ + return a < __extension__(v16si) { 0, 0, 0, 0, 0, 0, 0, 0,0, 0, 0, 0, 0, 0, 0, 0}; +} + +v2di +foo7 (v2di a) +{ + return a < __extension__(v2di) { 0, 0}; +} + +v4di +foo8 (v4di a) +{ + return a < __extension__(v4di) { 0, 0, 0, 0}; +} + +v8di +foo9 (v8di a) +{ + return a < __extension__(v8di) { 0, 0, 0, 0, 0, 0, 0, 0}; +} diff --git a/gcc/testsuite/gcc.target/i386/pr111023-2.c b/gcc/testsuite/gcc.target/i386/pr111023-2.c index 6c69f947544..ba52959b357 100644 --- a/gcc/testsuite/gcc.target/i386/pr111023-2.c +++ b/gcc/testsuite/gcc.target/i386/pr111023-2.c @@ -36,7 +36,7 @@ v4si_v4hi (v4si *dst, v8hi src) dst[0] = *(v4si *) tem; } -/* { dg-final { scan-assembler "pcmpgtw" } } */ +/* { dg-final { scan-assembler "(?:pcmpgtw|psraw)" } } */ /* { dg-final { scan-assembler "punpcklwd" } } */ void @@ -48,5 +48,5 @@ v2di_v2si (v2di *dst, v4si src) dst[0] = *(v2di *) tem; } -/* { dg-final { scan-assembler "pcmpgtd" } } */ +/* { dg-final { scan-assembler "(?:pcmpgtd|psrad)" } } */ /* { dg-final { scan-assembler "punpckldq" } } */ diff --git a/gcc/testsuite/gcc.target/i386/vect-div-1.c b/gcc/testsuite/gcc.target/i386/vect-div-1.c index f611088d8df..6d911290e06 100644 --- a/gcc/testsuite/gcc.target/i386/vect-div-1.c +++ b/gcc/testsuite/gcc.target/i386/vect-div-1.c @@ -40,4 +40,5 @@ f4 (int x) is always non-negative, so there is no need to do >> 31 shift etc. to check if it is. And in f3 and f4, VRP can prove it is always negative. 
*/ -/* { dg-final { scan-assembler-not "psrad\[^\n\r\]*\\\$31" } } */ +/* Now that (lt:v4si op1 const0_operand) is optimized to psrad, there are 20 of them. */ +/* { dg-final { scan-assembler-times "psrad\[^\n\r\]*\\\$31" 20 } } */ From patchwork Thu Jun 27 08:23:07 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Liu, Hongtao" X-Patchwork-Id: 1953049 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=intel.com header.i=@intel.com header.a=rsa-sha256 header.s=Intel header.b=A0itEMp4; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4W8sJC3xmHz20X6 for ; Thu, 27 Jun 2024 18:32:27 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id C6DD43838A21 for ; Thu, 27 Jun 2024 08:32:25 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.15]) by sourceware.org (Postfix) with ESMTPS id B0694383939A for ; Thu, 27 Jun 2024 08:23:43 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org B0694383939A Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=intel.com
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=intel.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org B0694383939A Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=192.198.163.15 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1719476643; cv=none; b=cur30Cag60qG/mpKuMOJ2kXJEOQNwVQ6xMawjltzrVg7Dn/jyuhn0TpJc8KG7I8puWjkpl+yYAhKFYTP4A86otPniXEuz4744Y+1OnhoO4mp+MfhM2/696D+vwjQgpK5jXANXk6K39leW/iVKZWbbm21TMJiIthYGL69d7k7BzE= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1719476643; c=relaxed/simple; bh=hFqWEBa3h+IqN5is3MOZFyylnt2XRjTNXU6HcuNqCvw=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=sbFi1A2PgSKWHdx7j49wT25W9IoMw43JpzgqLBoYOYJRRi9Ki5/iYykE0RZv5wv1S/jldfMj+oYRupglePc2suo5Bmaoj8TfdmSQ/j4bVaCB7u6LU6wK1i8JmM2qYB6lSfSMzt/xWO+EABdAgJTPlVYkG0ps1wDED+SXB4pDM5I= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1719476624; x=1751012624; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=hFqWEBa3h+IqN5is3MOZFyylnt2XRjTNXU6HcuNqCvw=; b=A0itEMp4OB9rcNaTg77SKbz/9PPCGl3/v0+cl5VaGu8642Zu38pnuGs5 OpZd192uee0YH4U8w1dk2pOp2g1UcoyvBqx736e/1vnB1UZ/SONWaQ1Yt l8CBpBeeoaAR3colBNeT3WtawHxCPnp/qwM56Vj17bfaw4OD0v9sEXZCJ mVqzlGjjR2El1Fp+VJ0cFyuUhiEefGNDy0hDpeVTVNuHgLEMNUG1NqDtm aJLbT+GXNRo0+LVIc5Sr3KpOEdsqVDmYdn03XnFFfxfqQHqURuZh3NziC sDRRRR6Mp5LFCQsCzWRje7CveTAr7wlTQ4JBU5oOjxxZNDxsm5F76Z47t Q==; X-CSE-ConnectionGUID: XeJNbumNTRik80ii+r8/zQ== X-CSE-MsgGUID: 3lfv0G+WRROflsys6D6JWA== X-IronPort-AV: E=McAfee;i="6700,10204,11115"; a="16732327" X-IronPort-AV: E=Sophos;i="6.08,269,1712646000"; d="scan'208";a="16732327" Received: from orviesa007.jf.intel.com ([10.64.159.147]) by fmvoesa109.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 27 Jun 2024 01:23:15 -0700 X-CSE-ConnectionGUID: /Ll6m0eJTkybwVGam1QJGQ== X-CSE-MsgGUID: 
miDDdHplSPGtHYpAInTJ3Q== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.08,269,1712646000"; d="scan'208";a="44944368" Received: from shvmail03.sh.intel.com ([10.239.245.20]) by orviesa007.jf.intel.com with ESMTP; 27 Jun 2024 01:23:10 -0700 Received: from shliclel4217.sh.intel.com (shliclel4217.sh.intel.com [10.239.240.127]) by shvmail03.sh.intel.com (Postfix) with ESMTP id A23791007359; Thu, 27 Jun 2024 16:23:07 +0800 (CST) From: liuhongt To: gcc-patches@gcc.gnu.org Cc: crazylht@gmail.com, hjl.tools@gmail.com Subject: [PATCH 7/7] Remove vcond{, u, eq} expanders since they will be obsolete. Date: Thu, 27 Jun 2024 16:23:07 +0800 Message-Id: <20240627082307.1166985-8-hongtao.liu@intel.com> X-Mailer: git-send-email 2.31.1 In-Reply-To: <20240627082307.1166985-1-hongtao.liu@intel.com> References: <20240627082307.1166985-1-hongtao.liu@intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-12.4 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_MSPIKE_H2, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org gcc/ChangeLog: PR target/115517 * config/i386/mmx.md (vcondv2sf): Removed. (vcond): Ditto. (vcond): Ditto. (vcondu): Ditto. (vcondu): Ditto. * config/i386/sse.md (vcond): Ditto. (vcond): Ditto. (vcond): Ditto. (vcond): Ditto. (vcond): Ditto. (vcond): Ditto. (vcond): Ditto. (vcondv2di): Ditto. (vcondu): Ditto. (vcondu): Ditto. (vcondu): Ditto. (vconduv2di): Ditto. (vcondeqv2di): Ditto. 
--- gcc/config/i386/mmx.md | 97 ------------------- gcc/config/i386/sse.md | 213 ----------------------------------------- 2 files changed, 310 deletions(-) diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md index 7262bf146c2..17c5205cae2 100644 --- a/gcc/config/i386/mmx.md +++ b/gcc/config/i386/mmx.md @@ -1168,39 +1168,6 @@ (define_expand "vec_cmpv2sfv2si" DONE; }) -(define_expand "vcondv2sf" - [(set (match_operand:V2FI 0 "register_operand") - (if_then_else:V2FI - (match_operator 3 "" - [(match_operand:V2SF 4 "nonimmediate_operand") - (match_operand:V2SF 5 "nonimmediate_operand")]) - (match_operand:V2FI 1 "general_operand") - (match_operand:V2FI 2 "general_operand")))] - "TARGET_MMX_WITH_SSE && ix86_partial_vec_fp_math" -{ - rtx ops[6]; - ops[5] = gen_reg_rtx (V4SFmode); - ops[4] = gen_reg_rtx (V4SFmode); - ops[3] = gen_rtx_fmt_ee (GET_CODE (operands[3]), VOIDmode, ops[4], ops[5]); - ops[2] = lowpart_subreg (mode, - force_reg (mode, operands[2]), - mode); - ops[1] = lowpart_subreg (mode, - force_reg (mode, operands[1]), - mode); - ops[0] = gen_reg_rtx (mode); - - emit_insn (gen_movq_v2sf_to_sse (ops[5], operands[5])); - emit_insn (gen_movq_v2sf_to_sse (ops[4], operands[4])); - - bool ok = ix86_expand_fp_vcond (ops); - gcc_assert (ok); - - emit_move_insn (operands[0], lowpart_subreg (mode, ops[0], - mode)); - DONE; -}) - (define_insn "@sse4_1_insertps_" [(set (match_operand:V2FI 0 "register_operand" "=Yr,*x,v") (unspec:V2FI @@ -4029,70 +3996,6 @@ (define_expand "vec_cmpu" DONE; }) -(define_expand "vcond" - [(set (match_operand:MMXMODE124 0 "register_operand") - (if_then_else:MMXMODE124 - (match_operator 3 "" - [(match_operand:MMXMODEI 4 "register_operand") - (match_operand:MMXMODEI 5 "register_operand")]) - (match_operand:MMXMODE124 1) - (match_operand:MMXMODE124 2)))] - "TARGET_MMX_WITH_SSE - && (GET_MODE_NUNITS (mode) - == GET_MODE_NUNITS (mode))" -{ - bool ok = ix86_expand_int_vcond (operands); - gcc_assert (ok); - DONE; -}) - -(define_expand "vcond" - 
[(set (match_operand:VI_16_32 0 "register_operand") - (if_then_else:VI_16_32 - (match_operator 3 "" - [(match_operand:VI_16_32 4 "register_operand") - (match_operand:VI_16_32 5 "register_operand")]) - (match_operand:VI_16_32 1) - (match_operand:VI_16_32 2)))] - "TARGET_SSE2" -{ - bool ok = ix86_expand_int_vcond (operands); - gcc_assert (ok); - DONE; -}) - -(define_expand "vcondu" - [(set (match_operand:MMXMODE124 0 "register_operand") - (if_then_else:MMXMODE124 - (match_operator 3 "" - [(match_operand:MMXMODEI 4 "register_operand") - (match_operand:MMXMODEI 5 "register_operand")]) - (match_operand:MMXMODE124 1) - (match_operand:MMXMODE124 2)))] - "TARGET_MMX_WITH_SSE - && (GET_MODE_NUNITS (mode) - == GET_MODE_NUNITS (mode))" -{ - bool ok = ix86_expand_int_vcond (operands); - gcc_assert (ok); - DONE; -}) - -(define_expand "vcondu" - [(set (match_operand:VI_16_32 0 "register_operand") - (if_then_else:VI_16_32 - (match_operator 3 "" - [(match_operand:VI_16_32 4 "register_operand") - (match_operand:VI_16_32 5 "register_operand")]) - (match_operand:VI_16_32 1) - (match_operand:VI_16_32 2)))] - "TARGET_SSE2" -{ - bool ok = ix86_expand_int_vcond (operands); - gcc_assert (ok); - DONE; -}) - (define_expand "vcond_mask_" [(set (match_operand:MMXMODE124 0 "register_operand") (vec_merge:MMXMODE124 diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md index d86b6fa81c0..2d6b39c920f 100644 --- a/gcc/config/i386/sse.md +++ b/gcc/config/i386/sse.md @@ -4816,72 +4816,6 @@ (define_expand "vec_cmpeqv1tiv1ti" DONE; }) -(define_expand "vcond" - [(set (match_operand:V_512 0 "register_operand") - (if_then_else:V_512 - (match_operator 3 "" - [(match_operand:VF_512 4 "nonimmediate_operand") - (match_operand:VF_512 5 "nonimmediate_operand")]) - (match_operand:V_512 1 "general_operand") - (match_operand:V_512 2 "general_operand")))] - "TARGET_AVX512F - && (GET_MODE_NUNITS (mode) - == GET_MODE_NUNITS (mode))" -{ - bool ok = ix86_expand_fp_vcond (operands); - gcc_assert (ok); - DONE; 
-}) - -(define_expand "vcond" - [(set (match_operand:V_256 0 "register_operand") - (if_then_else:V_256 - (match_operator 3 "" - [(match_operand:VF_256 4 "nonimmediate_operand") - (match_operand:VF_256 5 "nonimmediate_operand")]) - (match_operand:V_256 1 "general_operand") - (match_operand:V_256 2 "general_operand")))] - "TARGET_AVX - && (GET_MODE_NUNITS (mode) - == GET_MODE_NUNITS (mode))" -{ - bool ok = ix86_expand_fp_vcond (operands); - gcc_assert (ok); - DONE; -}) - -(define_expand "vcond" - [(set (match_operand:V_128 0 "register_operand") - (if_then_else:V_128 - (match_operator 3 "" - [(match_operand:VF_128 4 "vector_operand") - (match_operand:VF_128 5 "vector_operand")]) - (match_operand:V_128 1 "general_operand") - (match_operand:V_128 2 "general_operand")))] - "TARGET_SSE - && (GET_MODE_NUNITS (mode) - == GET_MODE_NUNITS (mode))" -{ - bool ok = ix86_expand_fp_vcond (operands); - gcc_assert (ok); - DONE; -}) - -(define_expand "vcond" - [(set (match_operand:VI2HFBF_AVX512VL 0 "register_operand") - (if_then_else:VI2HFBF_AVX512VL - (match_operator 3 "" - [(match_operand:VHF_AVX512VL 4 "vector_operand") - (match_operand:VHF_AVX512VL 5 "vector_operand")]) - (match_operand:VI2HFBF_AVX512VL 1 "general_operand") - (match_operand:VI2HFBF_AVX512VL 2 "general_operand")))] - "TARGET_AVX512FP16" -{ - bool ok = ix86_expand_fp_vcond (operands); - gcc_assert (ok); - DONE; -}) - (define_expand "vcond_mask_" [(set (match_operand:V48_AVX512VL 0 "register_operand") (vec_merge:V48_AVX512VL @@ -18017,153 +17951,6 @@ (define_insn "*sse2_gt3" (set_attr "prefix" "orig,vex") (set_attr "mode" "TI")]) -(define_expand "vcond" - [(set (match_operand:V_512 0 "register_operand") - (if_then_else:V_512 - (match_operator 3 "" - [(match_operand:VI_AVX512BW 4 "nonimmediate_operand") - (match_operand:VI_AVX512BW 5 "general_operand")]) - (match_operand:V_512 1) - (match_operand:V_512 2)))] - "TARGET_AVX512F - && (GET_MODE_NUNITS (mode) - == GET_MODE_NUNITS (mode))" -{ - bool ok = 
ix86_expand_int_vcond (operands); - gcc_assert (ok); - DONE; -}) - -(define_expand "vcond" - [(set (match_operand:V_256 0 "register_operand") - (if_then_else:V_256 - (match_operator 3 "" - [(match_operand:VI_256 4 "nonimmediate_operand") - (match_operand:VI_256 5 "general_operand")]) - (match_operand:V_256 1) - (match_operand:V_256 2)))] - "TARGET_AVX2 - && (GET_MODE_NUNITS (mode) - == GET_MODE_NUNITS (mode))" -{ - bool ok = ix86_expand_int_vcond (operands); - gcc_assert (ok); - DONE; -}) - -(define_expand "vcond" - [(set (match_operand:V_128 0 "register_operand") - (if_then_else:V_128 - (match_operator 3 "" - [(match_operand:VI124_128 4 "vector_operand") - (match_operand:VI124_128 5 "general_operand")]) - (match_operand:V_128 1) - (match_operand:V_128 2)))] - "TARGET_SSE2 - && (GET_MODE_NUNITS (mode) - == GET_MODE_NUNITS (mode))" -{ - bool ok = ix86_expand_int_vcond (operands); - gcc_assert (ok); - DONE; -}) - -(define_expand "vcondv2di" - [(set (match_operand:VI8F_128 0 "register_operand") - (if_then_else:VI8F_128 - (match_operator 3 "" - [(match_operand:V2DI 4 "vector_operand") - (match_operand:V2DI 5 "general_operand")]) - (match_operand:VI8F_128 1) - (match_operand:VI8F_128 2)))] - "TARGET_SSE4_2" -{ - bool ok = ix86_expand_int_vcond (operands); - gcc_assert (ok); - DONE; -}) - -(define_expand "vcondu" - [(set (match_operand:V_512 0 "register_operand") - (if_then_else:V_512 - (match_operator 3 "" - [(match_operand:VI_AVX512BW 4 "nonimmediate_operand") - (match_operand:VI_AVX512BW 5 "nonimmediate_operand")]) - (match_operand:V_512 1 "general_operand") - (match_operand:V_512 2 "general_operand")))] - "TARGET_AVX512F - && (GET_MODE_NUNITS (mode) - == GET_MODE_NUNITS (mode))" -{ - bool ok = ix86_expand_int_vcond (operands); - gcc_assert (ok); - DONE; -}) - -(define_expand "vcondu" - [(set (match_operand:V_256 0 "register_operand") - (if_then_else:V_256 - (match_operator 3 "" - [(match_operand:VI_256 4 "nonimmediate_operand") - (match_operand:VI_256 5 
"nonimmediate_operand")]) - (match_operand:V_256 1 "general_operand") - (match_operand:V_256 2 "general_operand")))] - "TARGET_AVX2 - && (GET_MODE_NUNITS (mode) - == GET_MODE_NUNITS (mode))" -{ - bool ok = ix86_expand_int_vcond (operands); - gcc_assert (ok); - DONE; -}) - -(define_expand "vcondu" - [(set (match_operand:V_128 0 "register_operand") - (if_then_else:V_128 - (match_operator 3 "" - [(match_operand:VI124_128 4 "vector_operand") - (match_operand:VI124_128 5 "vector_operand")]) - (match_operand:V_128 1 "general_operand") - (match_operand:V_128 2 "general_operand")))] - "TARGET_SSE2 - && (GET_MODE_NUNITS (mode) - == GET_MODE_NUNITS (mode))" -{ - bool ok = ix86_expand_int_vcond (operands); - gcc_assert (ok); - DONE; -}) - -(define_expand "vconduv2di" - [(set (match_operand:VI8F_128 0 "register_operand") - (if_then_else:VI8F_128 - (match_operator 3 "" - [(match_operand:V2DI 4 "vector_operand") - (match_operand:V2DI 5 "vector_operand")]) - (match_operand:VI8F_128 1 "general_operand") - (match_operand:VI8F_128 2 "general_operand")))] - "TARGET_SSE4_2" -{ - bool ok = ix86_expand_int_vcond (operands); - gcc_assert (ok); - DONE; -}) - -(define_expand "vcondeqv2di" - [(set (match_operand:VI8F_128 0 "register_operand") - (if_then_else:VI8F_128 - (match_operator 3 "" - [(match_operand:V2DI 4 "vector_operand") - (match_operand:V2DI 5 "general_operand")]) - (match_operand:VI8F_128 1) - (match_operand:VI8F_128 2)))] - "TARGET_SSE4_1" -{ - bool ok = ix86_expand_int_vcond (operands); - gcc_assert (ok); - DONE; -}) - (define_mode_iterator VEC_PERM_AVX2 [V16QI V8HI V4SI V2DI V4SF V2DF (V8HF "TARGET_AVX512FP16")