From patchwork Thu Nov 7 02:22:46 2024
X-Patchwork-Submitter: Mayshao-oc
X-Patchwork-Id: 2007812
From: MayShao-oc
Subject: [PATCH] [x86_64] Add microarchitecture tunable for pass_align_tight_loops
Date: Thu, 7 Nov 2024 10:22:46 +0800
Message-ID: <20241107022246.418240-1-MayShao-oc@zhaoxin.com>
Hi all:

For Zhaoxin processors, I find no improvement when pass_align_tight_loops
is enabled, and a performance drop in some cases. This patch adds a new
tunable to bypass pass_align_tight_loops on Zhaoxin.

Bootstrapped on x86_64. Ok for trunk?

BR
Mayshao

gcc/ChangeLog:

	* config/i386/i386-features.cc (TARGET_ALIGN_TIGHT_LOOPS): Default
	true for all processors except Zhaoxin.
	* config/i386/i386.h (TARGET_ALIGN_TIGHT_LOOPS): New macro.
	* config/i386/x86-tune.def (X86_TUNE_ALIGN_TIGHT_LOOPS): New tune.
---
 gcc/config/i386/i386-features.cc | 4 +++-
 gcc/config/i386/i386.h           | 3 +++
 gcc/config/i386/x86-tune.def     | 4 ++++
 3 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/gcc/config/i386/i386-features.cc b/gcc/config/i386/i386-features.cc
index e2e85212a4f..d9fd92964fe 100644
--- a/gcc/config/i386/i386-features.cc
+++ b/gcc/config/i386/i386-features.cc
@@ -3620,7 +3620,9 @@ public:
   /* opt_pass methods: */
   bool gate (function *) final override
     {
-      return optimize && optimize_function_for_speed_p (cfun);
+      return TARGET_ALIGN_TIGHT_LOOPS
+	     && optimize
+	     && optimize_function_for_speed_p (cfun);
     }
 
   unsigned int execute (function *) final override
diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index 2dcd8803a08..7f9010246c2 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -466,6 +466,9 @@ extern unsigned char ix86_tune_features[X86_TUNE_LAST];
 #define TARGET_USE_RCR ix86_tune_features[X86_TUNE_USE_RCR]
 #define TARGET_SSE_MOVCC_USE_BLENDV \
 	ix86_tune_features[X86_TUNE_SSE_MOVCC_USE_BLENDV]
+#define TARGET_ALIGN_TIGHT_LOOPS \
+	ix86_tune_features[X86_TUNE_ALIGN_TIGHT_LOOPS]
+
 /* Feature tests against the various architecture variations.  */
 enum ix86_arch_indices
 {
diff --git a/gcc/config/i386/x86-tune.def b/gcc/config/i386/x86-tune.def
index 6ebb2fd3414..bd4fa8b3eee 100644
--- a/gcc/config/i386/x86-tune.def
+++ b/gcc/config/i386/x86-tune.def
@@ -542,6 +542,10 @@ DEF_TUNE (X86_TUNE_V2DF_REDUCTION_PREFER_HADDPD,
 DEF_TUNE (X86_TUNE_SSE_MOVCC_USE_BLENDV, "sse_movcc_use_blendv",
 	  ~m_CORE_ATOM)
 
+/* X86_TUNE_ALIGN_TIGHT_LOOPS: if false, tight loops are not aligned.  */
+DEF_TUNE (X86_TUNE_ALIGN_TIGHT_LOOPS, "align_tight_loops",
+	  ~(m_ZHAOXIN))
+
 /*****************************************************************************/
 /* AVX instruction selection tuning (some of SSE flags affects AVX, too)     */
 /*****************************************************************************/