From patchwork Thu Aug 10 00:47:28 2023
X-Patchwork-Submitter: "Liu, Hongtao"
X-Patchwork-Id: 1819660
From: "Liu, Hongtao"
Reply-To: liuhongt
To: gcc-patches@gcc.gnu.org
Cc: ubizjak@gmail.com
Subject: [PATCH] i386: Do not sanitize upper part of V2HFmode and V4HFmode reg with -fno-trapping-math [PR110832]
Date: Thu, 10 Aug 2023 08:47:28 +0800
Message-Id: <20230810004728.15915-1-hongtao.liu@intel.com>

Also add ix86_partial_vec_fp_math to the condition of V2HF/V4HF named
patterns in order to avoid generation of partial vector V8HFmode trapping
instructions.

Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
Ok for trunk?

gcc/ChangeLog:

	PR target/110832
	* config/i386/mmx.md: (movq_<mode>_to_sse): Also do not
	sanitize upper part of V4HFmode register with
	-fno-trapping-math.
	(<insn>v4hf3): Enable for ix86_partial_vec_fp_math.
	(<insn>v2hf3): Ditto.
	(divv2hf3): Ditto.
	(movd_v2hf_to_sse): Do not sanitize upper part of V2HFmode
	register with -fno-trapping-math.
---
 gcc/config/i386/mmx.md | 20 ++++++++++++++------
 1 file changed, 14 insertions(+), 6 deletions(-)

diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index d51b3b9dc71..170432a7128 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -596,7 +596,7 @@ (define_expand "movq_<mode>_to_sse"
 	  (match_dup 2)))]
   "TARGET_SSE2"
 {
-  if (<MODE>mode == V2SFmode
+  if (<MODE>mode != V2SImode
       && !flag_trapping_math)
     {
       rtx op1 = force_reg (<MODE>mode, operands[1]);
@@ -1941,7 +1941,7 @@ (define_expand "<insn>v4hf3"
 	(plusminusmult:V4HF
 	  (match_operand:V4HF 1 "nonimmediate_operand")
 	  (match_operand:V4HF 2 "nonimmediate_operand")))]
-  "TARGET_AVX512FP16 && TARGET_AVX512VL"
+  "TARGET_AVX512FP16 && TARGET_AVX512VL && ix86_partial_vec_fp_math"
 {
   rtx op2 = gen_reg_rtx (V8HFmode);
   rtx op1 = gen_reg_rtx (V8HFmode);
@@ -1961,7 +1961,7 @@ (define_expand "divv4hf3"
 	(div:V4HF
 	  (match_operand:V4HF 1 "nonimmediate_operand")
 	  (match_operand:V4HF 2 "nonimmediate_operand")))]
-  "TARGET_AVX512FP16 && TARGET_AVX512VL"
+  "TARGET_AVX512FP16 && TARGET_AVX512VL && ix86_partial_vec_fp_math"
 {
   rtx op2 = gen_reg_rtx (V8HFmode);
   rtx op1 = gen_reg_rtx (V8HFmode);
@@ -1983,14 +1983,22 @@ (define_expand "movd_v2hf_to_sse"
 	    (match_operand:V2HF 1 "nonimmediate_operand"))
 	  (match_operand:V8HF 2 "reg_or_0_operand")
 	  (const_int 3)))]
-  "TARGET_SSE")
+  "TARGET_SSE"
+{
+  if (!flag_trapping_math && operands[2] == CONST0_RTX (V8HFmode))
+  {
+    rtx op1 = force_reg (V2HFmode, operands[1]);
+    emit_move_insn (operands[0], lowpart_subreg (V8HFmode, op1, V2HFmode));
+    DONE;
+  }
+})
 
 (define_expand "<insn>v2hf3"
   [(set (match_operand:V2HF 0 "register_operand")
 	(plusminusmult:V2HF
 	  (match_operand:V2HF 1 "nonimmediate_operand")
 	  (match_operand:V2HF 2 "nonimmediate_operand")))]
-  "TARGET_AVX512FP16 && TARGET_AVX512VL"
+  "TARGET_AVX512FP16 && TARGET_AVX512VL && ix86_partial_vec_fp_math"
 {
   rtx op2 = gen_reg_rtx (V8HFmode);
   rtx op1 = gen_reg_rtx (V8HFmode);
@@ -2009,7 +2017,7 @@ (define_expand "divv2hf3"
 	(div:V2HF
 	  (match_operand:V2HF 1 "nonimmediate_operand")
 	  (match_operand:V2HF 2 "nonimmediate_operand")))]
-  "TARGET_AVX512FP16 && TARGET_AVX512VL"
+  "TARGET_AVX512FP16 && TARGET_AVX512VL && ix86_partial_vec_fp_math"
 {
   rtx op2 = gen_reg_rtx (V8HFmode);
   rtx op1 = gen_reg_rtx (V8HFmode);