From patchwork Tue Jan 25 01:31:47 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Liu, Hongtao"
X-Patchwork-Id: 1583814
To: gcc-patches@gcc.gnu.org
Subject: [PATCH] [rtl/cprop_hardreg] Don't propagate for a more expensive reg-reg move.
Date: Tue, 25 Jan 2022 09:31:47 +0800
Message-Id: <20220125013147.54351-1-hongtao.liu@intel.com>
X-Mailer: git-send-email 2.18.1
X-Patchwork-Original-From: liuhongt via Gcc-patches
From: "Liu, Hongtao"
Reply-To: liuhongt

For i386, it enables optimization like:

	vmovd	%xmm0, %edx
-	vmovd	%xmm0, %eax
+	movl	%edx, %eax

Bootstrapped and regtested on CLX for both
x86_64-pc-linux-gnu{-m32,} and
x86_64-pc-linux-gnu{-m32\ -march=native,\ -march=native}

Ok for trunk?

gcc/ChangeLog:

	PR rtl-optimization/104059
	* regcprop.cc (copyprop_hardreg_forward_1): Don't propagate
	for a more expensive reg-reg move.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/pr104059.c: New test.
---
 gcc/regcprop.cc                          | 17 ++++++++++++++++-
 gcc/testsuite/gcc.target/i386/pr104059.c | 22 ++++++++++++++++++++++
 2 files changed, 38 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr104059.c

diff --git a/gcc/regcprop.cc b/gcc/regcprop.cc
index 1a9bcf0a1ad..858896b82c6 100644
--- a/gcc/regcprop.cc
+++ b/gcc/regcprop.cc
@@ -891,6 +891,7 @@ copyprop_hardreg_forward_1 (basic_block bb, struct value_data *vd)
       if (set && REG_P (SET_SRC (set)))
 	{
 	  rtx src = SET_SRC (set);
+	  rtx dest = SET_DEST (set);
 	  unsigned int regno = REGNO (src);
 	  machine_mode mode = GET_MODE (src);
 	  unsigned int i;
@@ -914,7 +915,7 @@ copyprop_hardreg_forward_1 (basic_block bb, struct value_data *vd)
 
 	  /* If the destination is also a register, try to find a source
 	     register in the same class.  */
-	  if (REG_P (SET_DEST (set)))
+	  if (REG_P (dest))
 	    {
 	      new_rtx = find_oldest_value_reg (REGNO_REG_CLASS (regno),
 					       src, vd);
@@ -942,6 +943,20 @@ copyprop_hardreg_forward_1 (basic_block bb, struct value_data *vd)
 				       mode, i, regno);
 	      if (new_rtx != NULL_RTX)
 		{
+		  /* Don't propagate for a more expensive reg-reg move.  */
+		  if (REG_P (dest))
+		    {
+		      enum reg_class from = REGNO_REG_CLASS (regno);
+		      enum reg_class to = REGNO_REG_CLASS (REGNO (dest));
+		      enum reg_class new_from = REGNO_REG_CLASS (i);
+		      unsigned int original_cost
+			= targetm.register_move_cost (mode, from, to);
+		      unsigned int after_cost
+			= targetm.register_move_cost (mode, new_from, to);
+		      if (after_cost > original_cost)
+			goto no_move_special_case;
+		    }
+
 		  if (validate_change (insn, &SET_SRC (set), new_rtx, 0))
 		    {
 		      ORIGINAL_REGNO (new_rtx) = ORIGINAL_REGNO (src);
diff --git a/gcc/testsuite/gcc.target/i386/pr104059.c b/gcc/testsuite/gcc.target/i386/pr104059.c
new file mode 100644
index 00000000000..4815fa38d21
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr104059.c
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx2 -O2 -fdump-rtl-cprop_hardreg-details" } */
+/* { dg-final { scan-rtl-dump-not {replaced reg [0-9]* with [0-9]*} "cprop_hardreg" } } */
+
+#include <stdint.h>
+int test (uint8_t *p, uint32_t t[1][1], int n) {
+
+  int sum = 0;
+  uint32_t a0;
+  for (int i = 0; i < 4; i++, p++)
+    t[i][0] = p[0];
+
+  for (int i = 0; i < 4; i++) {
+    {
+      int t0 = t[0][i] + t[0][i];
+      a0 = t0;
+    };
+    sum += a0;
+  }
+  return (((uint16_t)sum) + ((uint32_t)sum >> 16)) >> 1;
+}
+