From patchwork Wed Jun 26 06:08:40 2024
X-Patchwork-Submitter: liuhongt
X-Patchwork-Id: 1952380
From: liuhongt
To: gcc-patches@gcc.gnu.org
Cc: ubizjak@gmail.com
Subject: [PATCH] Fix wrong cost of MEM when addr is a lea.
Date: Wed, 26 Jun 2024 14:08:40 +0800
Message-Id: <20240626060840.2836616-1-hongtao.liu@intel.com>
X-Mailer: git-send-email 2.31.1
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit

416.gamess regressed 4-6% on x86_64 since my r15-882-g1d6199e5f8c1c0.
That commit adjusted the rtx_cost of MEM to reduce the cost of
(add op0 disp).  But the cost of ADDR can be cheaper than the cost of
XEXP (addr, 0) when ADDR is a lea.  That is the case in the PR, so the
patch uses the lower cost to enable more simplification and fix the
regression.

Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
Ok for trunk?

gcc/ChangeLog:

	PR target/115462
	* config/i386/i386.cc (ix86_rtx_costs): Use cost of addr when
	it's lower than rtx_cost (XEXP (addr, 0)) + 1.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/pr115462.c: New test.
---
 gcc/config/i386/i386.cc                  |  9 +++++++--
 gcc/testsuite/gcc.target/i386/pr115462.c | 22 ++++++++++++++++++++++
 2 files changed, 29 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr115462.c

diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index d4ccc24be6e..83dab8220dd 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -22341,8 +22341,13 @@ ix86_rtx_costs (rtx x, machine_mode mode, int outer_code_i, int opno,
           if (GET_CODE (addr) == PLUS
               && x86_64_immediate_operand (XEXP (addr, 1), Pmode))
             {
-              *total += 1;
-              *total += rtx_cost (XEXP (addr, 0), Pmode, PLUS, 0, speed);
+              /* PR115462: Cost of ADDR could be cheaper than XEXP (addr, 0)
+                 when it's a lea, use lower cost to enable more
+                 simplification.  */
+              unsigned cost1 = rtx_cost (addr, Pmode, MEM, 0, speed);
+              unsigned cost2 = rtx_cost (XEXP (addr, 0), Pmode,
+                                         PLUS, 0, speed) + 1;
+              *total += MIN (cost1, cost2);
               return true;
             }
         }
diff --git a/gcc/testsuite/gcc.target/i386/pr115462.c b/gcc/testsuite/gcc.target/i386/pr115462.c
new file mode 100644
index 00000000000..ad50a6382bc
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr115462.c
@@ -0,0 +1,22 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mavx2 -fno-tree-vectorize -fno-pic" } */
+/* { dg-final { scan-assembler-times {(?n)movl[ \t]+.*, p1\.0\+[0-9]*\(,} 3 } } */
+
+int
+foo (long indx, long indx2, long indx3, long indx4, long indx5, long indx6, long n, int* q)
+{
+  static int p1[10000];
+  int* p2 = p1 + 1000;
+  int* p3 = p1 + 4000;
+  int* p4 = p1 + 8000;
+
+  for (long i = 0; i != n; i++)
+    {
+      /* scan for movl %edi, p1.0+3996(,%rax,4),
+         p1.0+3996 should be propagated into the loop.  */
+      p2[indx++] = q[indx++];
+      p3[indx2++] = q[indx2++];
+      p4[indx3++] = q[indx3++];
+    }
+  return p1[indx6] + p1[indx5];
+}
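
For readers who want to see the cost selection from the ix86_rtx_costs
hunk in isolation, here is a minimal standalone sketch.  It is not GCC
code: the function names are hypothetical, and the plain unsigned
arguments stand in for the values that rtx_cost () would return for ADDR
and for XEXP (addr, 0).

```c
#include <assert.h>

/* Hypothetical model of the pre-patch computation: the cost of the
   base operand plus 1 for the displacement.  BASE_COST stands in for
   rtx_cost (XEXP (addr, 0), Pmode, PLUS, 0, speed).  */
unsigned
mem_addr_cost_old (unsigned base_cost)
{
  return base_cost + 1;
}

/* Hypothetical model of the patched computation: take the lower of the
   cost of the whole address and the old estimate.  WHOLE_ADDR_COST
   stands in for rtx_cost (addr, Pmode, MEM, 0, speed).  */
unsigned
mem_addr_cost_new (unsigned whole_addr_cost, unsigned base_cost)
{
  unsigned cost1 = whole_addr_cost;
  unsigned cost2 = mem_addr_cost_old (base_cost);
  unsigned result = cost1 < cost2 ? cost1 : cost2;

  /* By construction the patched cost never exceeds the old one.  */
  assert (result <= cost2);
  return result;
}
```

With made-up inputs, mem_addr_cost_new (2, 4) picks the whole-address
cost, while mem_addr_cost_new (9, 4) falls back to the old base-plus-one
estimate.  The numbers are illustrative only and are not taken from any
real cost table.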