From patchwork Thu Nov 7 02:22:46 2024
X-Patchwork-Submitter: Mayshao-oc
X-Patchwork-Id: 2007812
From: MayShao-oc
Subject: [PATCH] [x86_64] Add microarchitecture tunable for pass_align_tight_loops
Date: Thu, 7 Nov 2024 10:22:46 +0800
Message-ID: <20241107022246.418240-1-MayShao-oc@zhaoxin.com>
Hi all:

For Zhaoxin processors, I find no improvement when pass_align_tight_loops
is enabled, and a performance drop in some cases. This patch adds a new
tunable to bypass pass_align_tight_loops on Zhaoxin.

Bootstrapped on x86_64. Ok for trunk?

BR
Mayshao

gcc/ChangeLog:

	* config/i386/i386-features.cc (TARGET_ALIGN_TIGHT_LOOPS): Default
	true for all processors except Zhaoxin.
	* config/i386/i386.h (TARGET_ALIGN_TIGHT_LOOPS): New macro.
	* config/i386/x86-tune.def (X86_TUNE_ALIGN_TIGHT_LOOPS): New tune.
---
 gcc/config/i386/i386-features.cc | 4 +++-
 gcc/config/i386/i386.h           | 3 +++
 gcc/config/i386/x86-tune.def     | 4 ++++
 3 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/gcc/config/i386/i386-features.cc b/gcc/config/i386/i386-features.cc
index e2e85212a4f..d9fd92964fe 100644
--- a/gcc/config/i386/i386-features.cc
+++ b/gcc/config/i386/i386-features.cc
@@ -3620,7 +3620,9 @@ public:
   /* opt_pass methods: */
   bool gate (function *) final override
     {
-      return optimize && optimize_function_for_speed_p (cfun);
+      return TARGET_ALIGN_TIGHT_LOOPS
+	     && optimize
+	     && optimize_function_for_speed_p (cfun);
     }
 
   unsigned int execute (function *) final override
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index 2dcd8803a08..7f9010246c2 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -466,6 +466,9 @@ extern unsigned char ix86_tune_features[X86_TUNE_LAST];
 #define TARGET_USE_RCR ix86_tune_features[X86_TUNE_USE_RCR]
 #define TARGET_SSE_MOVCC_USE_BLENDV \
 	ix86_tune_features[X86_TUNE_SSE_MOVCC_USE_BLENDV]
+#define TARGET_ALIGN_TIGHT_LOOPS \
+	ix86_tune_features[X86_TUNE_ALIGN_TIGHT_LOOPS]
+
 /* Feature tests against the various architecture variations.  */
 enum ix86_arch_indices
 {
diff --git a/gcc/config/i386/x86-tune.def b/gcc/config/i386/x86-tune.def
index 6ebb2fd3414..bd4fa8b3eee 100644
--- a/gcc/config/i386/x86-tune.def
+++ b/gcc/config/i386/x86-tune.def
@@ -542,6 +542,10 @@ DEF_TUNE (X86_TUNE_V2DF_REDUCTION_PREFER_HADDPD,
 DEF_TUNE (X86_TUNE_SSE_MOVCC_USE_BLENDV, "sse_movcc_use_blendv",
 	  ~m_CORE_ATOM)
 
+/* X86_TUNE_ALIGN_TIGHT_LOOPS: if false, tight loops are not aligned.  */
+DEF_TUNE (X86_TUNE_ALIGN_TIGHT_LOOPS, "align_tight_loops",
+	  ~(m_ZHAOXIN))
+
 /*****************************************************************************/
 /* AVX instruction selection tuning (some of SSE flags affects AVX, too)     */
 /*****************************************************************************/