From patchwork Wed Jun 26 02:46:47 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mayshao-oc
X-Patchwork-Id: 1952323
From: MayShao
Subject: [PATCH 1/3] x86:Set preferred CPU features on the KH-40000 and KX-7000 Zhaoxin processors
Date: Wed, 26 Jun 2024 10:46:47 +0800
Message-ID: <20240626024649.3689-1-MayShao-oc@zhaoxin.com>
X-Mailer: git-send-email 2.34.1
List-Id: Libc-alpha mailing list

Fix code indentation issues under the Zhaoxin branch.

Unaligned AVX loads are slower on KH-40000 and KX-7000, so disable
AVX_Fast_Unaligned_Load on these processors.

Enable the Prefer_No_VZEROUPPER and Fast_Unaligned_Load features so that
the sse2_unaligned versions of memset, strcpy and strcat are used.
---
 sysdeps/x86/cpu-features.c | 66 ++++++++++++++++++++++----------------
 1 file changed, 39 insertions(+), 27 deletions(-)

diff --git a/sysdeps/x86/cpu-features.c b/sysdeps/x86/cpu-features.c
index 3d7c2819d7..24fbf699b9 100644
--- a/sysdeps/x86/cpu-features.c
+++ b/sysdeps/x86/cpu-features.c
@@ -1015,7 +1015,7 @@ https://www.intel.com/content/www/us/en/support/articles/000059422/processors.ht
       kind = arch_kind_zhaoxin;
 
       get_common_indices (cpu_features, &family, &model, &extended_model,
-                          &stepping);
+                          &stepping);
 
       get_extended_indices (cpu_features);
 
@@ -1026,38 +1026,50 @@ https://www.intel.com/content/www/us/en/support/articles/000059422/processors.ht
         {
           if (model == 0xf || model == 0x19)
             {
-              CPU_FEATURE_UNSET (cpu_features, AVX);
-              CPU_FEATURE_UNSET (cpu_features, AVX2);
+              CPU_FEATURE_UNSET (cpu_features, AVX);
+              CPU_FEATURE_UNSET (cpu_features, AVX2);
 
-              cpu_features->preferred[index_arch_Slow_SSE4_2]
-                |= bit_arch_Slow_SSE4_2;
+              cpu_features->preferred[index_arch_Slow_SSE4_2]
+                |= bit_arch_Slow_SSE4_2;
 
-              cpu_features->preferred[index_arch_AVX_Fast_Unaligned_Load]
-                &= ~bit_arch_AVX_Fast_Unaligned_Load;
+              cpu_features->preferred[index_arch_AVX_Fast_Unaligned_Load]
+                &= ~bit_arch_AVX_Fast_Unaligned_Load;
             }
         }
       else if (family == 0x7)
         {
-          if (model == 0x1b)
-            {
-              CPU_FEATURE_UNSET (cpu_features, AVX);
-              CPU_FEATURE_UNSET (cpu_features, AVX2);
-
-              cpu_features->preferred[index_arch_Slow_SSE4_2]
-                |= bit_arch_Slow_SSE4_2;
-
-              cpu_features->preferred[index_arch_AVX_Fast_Unaligned_Load]
-                &= ~bit_arch_AVX_Fast_Unaligned_Load;
-            }
-          else if (model == 0x3b)
-            {
-              CPU_FEATURE_UNSET (cpu_features, AVX);
-              CPU_FEATURE_UNSET (cpu_features, AVX2);
-
-              cpu_features->preferred[index_arch_AVX_Fast_Unaligned_Load]
-                &= ~bit_arch_AVX_Fast_Unaligned_Load;
-            }
-        }
+          switch (model)
+            {
+            case 0x1b:
+              CPU_FEATURE_UNSET (cpu_features, AVX);
+              CPU_FEATURE_UNSET (cpu_features, AVX2);
+
+              cpu_features->preferred[index_arch_Slow_SSE4_2]
+                |= bit_arch_Slow_SSE4_2;
+
+              cpu_features->preferred[index_arch_AVX_Fast_Unaligned_Load]
+                &= ~bit_arch_AVX_Fast_Unaligned_Load;
+              break;
+
+            case 0x3b:
+              CPU_FEATURE_UNSET (cpu_features, AVX);
+              CPU_FEATURE_UNSET (cpu_features, AVX2);
+
+              cpu_features->preferred[index_arch_AVX_Fast_Unaligned_Load]
+                &= ~bit_arch_AVX_Fast_Unaligned_Load;
+              break;
+
+            case 0x5b:
+            case 0x6b:
+              cpu_features->preferred[index_arch_AVX_Fast_Unaligned_Load]
+                &= ~bit_arch_AVX_Fast_Unaligned_Load;
+
+              cpu_features->preferred[index_arch_Prefer_No_VZEROUPPER]
+                |= (bit_arch_Prefer_No_VZEROUPPER
+                    | bit_arch_Fast_Unaligned_Load);
+              break;
+            }
+        }
     }
   else
     {
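
To illustrate how the preferred bits set for family 0x7 steer the choice of
string function variant, here is a minimal stand-alone sketch. It is not the
glibc code: the bit values, struct tuning, tune_zhaoxin_family7 and
select_variant are all hypothetical names invented for this illustration,
and the selector is a heavily simplified stand-in for the real ifunc
selectors, which check more conditions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bit values for illustration only; glibc's real
   bit_arch_* constants are defined elsewhere and differ.  */
enum
{
  BIT_AVX_FAST_UNALIGNED_LOAD = 1 << 0,
  BIT_PREFER_NO_VZEROUPPER = 1 << 1,
  BIT_FAST_UNALIGNED_LOAD = 1 << 2,
  BIT_SLOW_SSE4_2 = 1 << 3
};

/* Hypothetical, simplified stand-in for struct cpu_features.  */
struct tuning
{
  unsigned int preferred;   /* Bitmask of preferred features.  */
  int avx2_usable;          /* Nonzero if AVX/AVX2 are left enabled.  */
};

enum variant
{
  VARIANT_AVX2_UNALIGNED,
  VARIANT_SSE2_UNALIGNED,
  VARIANT_SSE2
};

/* Mirrors the shape of the family 0x7 switch in the patch.  */
static void
tune_zhaoxin_family7 (struct tuning *t, unsigned int model)
{
  assert (t != NULL);
  switch (model)
    {
    case 0x1b:
      t->avx2_usable = 0;
      t->preferred |= BIT_SLOW_SSE4_2;
      t->preferred &= ~BIT_AVX_FAST_UNALIGNED_LOAD;
      break;

    case 0x3b:
      t->avx2_usable = 0;
      t->preferred &= ~BIT_AVX_FAST_UNALIGNED_LOAD;
      break;

    case 0x5b:
    case 0x6b:
      /* AVX stays usable, but unaligned AVX loads are not preferred.  */
      t->preferred &= ~BIT_AVX_FAST_UNALIGNED_LOAD;
      t->preferred |= (BIT_PREFER_NO_VZEROUPPER
                       | BIT_FAST_UNALIGNED_LOAD);
      break;
    }
}

/* Simplified selector (an assumption, not glibc's actual logic):
   take the AVX2 variant only when AVX2 is usable, unaligned AVX loads
   are fast and VZEROUPPER is acceptable; otherwise fall back to SSE2.  */
static enum variant
select_variant (const struct tuning *t)
{
  if (t->avx2_usable
      && (t->preferred & BIT_AVX_FAST_UNALIGNED_LOAD)
      && !(t->preferred & BIT_PREFER_NO_VZEROUPPER))
    return VARIANT_AVX2_UNALIGNED;
  if (t->preferred & BIT_FAST_UNALIGNED_LOAD)
    return VARIANT_SSE2_UNALIGNED;
  return VARIANT_SSE2;
}
```

In this model, a struct tuning that starts with BIT_AVX_FAST_UNALIGNED_LOAD
set and avx2_usable nonzero selects VARIANT_AVX2_UNALIGNED; after
tune_zhaoxin_family7 (&t, 0x5b) it selects VARIANT_SSE2_UNALIGNED while
avx2_usable is left untouched, which is the effect the commit message
describes for the 0x5b and 0x6b models.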