From patchwork Tue Sep 20 12:22:58 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nicholas Piggin X-Patchwork-Id: 1680028 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ozlabs.org (client-ip=112.213.38.117; helo=lists.ozlabs.org; envelope-from=linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=WPp+AVat; dkim-atps=neutral Received: from lists.ozlabs.org (lists.ozlabs.org [112.213.38.117]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-384) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4MX10n3Wxxz1yp7 for ; Tue, 20 Sep 2022 22:23:20 +1000 (AEST) Received: from boromir.ozlabs.org (localhost [IPv6:::1]) by lists.ozlabs.org (Postfix) with ESMTP id 4MX10m3sMsz3bc1 for ; Tue, 20 Sep 2022 22:23:20 +1000 (AEST) Authentication-Results: lists.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=WPp+AVat; dkim-atps=neutral X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gmail.com (client-ip=2607:f8b0:4864:20::634; helo=mail-pl1-x634.google.com; envelope-from=npiggin@gmail.com; receiver=) Authentication-Results: lists.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 
header.b=WPp+AVat; dkim-atps=neutral Received: from mail-pl1-x634.google.com (mail-pl1-x634.google.com [IPv6:2607:f8b0:4864:20::634]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4MX10Z2j5bz2xH9 for ; Tue, 20 Sep 2022 22:23:09 +1000 (AEST) Received: by mail-pl1-x634.google.com with SMTP id x1so2184659plv.5 for ; Tue, 20 Sep 2022 05:23:09 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=content-transfer-encoding:mime-version:message-id:date:subject:cc :to:from:from:to:cc:subject:date; bh=0uo5dcFQbM9W2H5bm3V9IIMNYQYIlI/Bef8iVMezzP0=; b=WPp+AVatmd5j6+HtitX7zTI4WFH9b57OEp51yufZBZx1ph+3nPHL8ZibnPHdO4SdqD Byy8klJM0adSy2P5aPtn3RYc4WhaZha8aijqT5VaPHCituLeQCXeYIF6TqmPZk6Kw8HQ 2EdP2zSzRSV4Bram82g6K9HzSUOP88CRYXTvAgmVUO4FUrtucSc+B/Rgx8iCDj6vKR4p 6BuA2D20y8HY1lQ7HWsJrynhI0pLz36H7Ns14ihhl4YWsPmQMXkWzlSxjcthvBy25Gy8 DstJwVg9Zptn+Er/7QJUqpdUe5P3+e9WtYuVEDoO8oKDnzPXKJDmRGxtUqBDoxuKSVUs H+vw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:message-id:date:subject:cc :to:from:x-gm-message-state:from:to:cc:subject:date; bh=0uo5dcFQbM9W2H5bm3V9IIMNYQYIlI/Bef8iVMezzP0=; b=SUtGxjFd8MtG9q34IkCid4VLsC2n7ZfgYdvBHCsx7EaxvYHVMUcLD1Y2q4K3yq4GPk igEay0bLpzu0i4g2m8IZyLGdBODUV1ow5RMBW15S5+AgeerJlFMbbFoRymAFMqLA3YwG uu5Lcg955+V7knsNm6WRYTSparFpoeeAtjT9UdCcLuxAvrX7Q94NLXRUf8ulpkBR8Hp/ QvqsU3lnrHZfZwuaVQWSc3ariaDYrhSxLFMt5forfjcOqT6AW9uPZw0QHyM/AUoaTfhI 3RIFhHrXQMi2qwfYrouFNILtGbBTYNdhal3njQXdA7XbPnIsAoDw1gB2o0R7cyyMPxvg Kd5A== X-Gm-Message-State: ACrzQf036Di9rGzTGSmiIw5Wzl+PQOfvVVqyoqB4sjuszhVQ5EyCIixe XoUWVDrPsw0iAmDP7GsQOXyalnoFpqI= X-Google-Smtp-Source: AMsMyM721abONoG0+n17TeEM2i7CCsmujRCmcTuUb3pKmETtL5uY7x+sm+NVw9HLHdbNP1wqav8tXw== X-Received: by 2002:a17:902:b20a:b0:178:6f5b:f903 with SMTP id 
t10-20020a170902b20a00b001786f5bf903mr4694010plr.39.1663676586044; Tue, 20 Sep 2022 05:23:06 -0700 (PDT) Received: from bobo.ozlabs.ibm.com ([203.219.227.147]) by smtp.gmail.com with ESMTPSA id t17-20020a170902e1d100b001708c4ebbaesm1246569pla.309.2022.09.20.05.23.03 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 20 Sep 2022 05:23:05 -0700 (PDT) From: Nicholas Piggin To: linuxppc-dev@lists.ozlabs.org Subject: [PATCH v4 1/2] powerpc: add ISA v3.0 / v3.1 wait opcode macro Date: Tue, 20 Sep 2022 22:22:58 +1000 Message-Id: <20220920122259.363092-1-npiggin@gmail.com> X-Mailer: git-send-email 2.37.2 MIME-Version: 1.0 X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Nicholas Piggin Errors-To: linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" The wait instruction encoding changed between ISA v2.07 and ISA v3.0. In v3.1 the instruction gained a new field. Update the PPC_WAIT macro to the current encoding. Rename the older incompatible one with a _v203 suffix as it was introduced in v2.03 (the WC field was introduced in v2.07 but the kernel only uses WC=0). Reviewed-by: Segher Boessenkool Signed-off-by: Nicholas Piggin --- v2: Update naming, patch changelog and title. v3: v2 sent incorrect patches, sorry. v4: Rebase. 
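Note on the encodings: as a quick sanity check of the constants in this patch, the sketch below rebuilds the two raw instruction words in plain userspace C. The field macros and opcode constants are copied from the diff. The helper names (wait_v203_word, wait_v30_word, wait_xo, check_wait_encodings) are hypothetical and exist only for this illustration. It assumes nothing about the kernel build environment.

```c
#include <assert.h>
#include <stdint.h>

/* Field helpers and opcodes as they appear in the patch. */
#define __PPC_PL(p)		(((p) & 0x3) << 16)
#define __PPC_WC(w)		(((w) & 0x3) << 21)
#define PPC_RAW_WAIT_v203	(0x7c00007c)
#define PPC_RAW_WAIT(w, p)	(0x7c00003c | __PPC_WC(w) | __PPC_PL(p))

/* Hypothetical helper: the old (v2.03 style) wait word, no fields set. */
static inline uint32_t wait_v203_word(void)
{
	return PPC_RAW_WAIT_v203;
}

/* Hypothetical helper: the v3.0 / v3.1 wait word with WC and PL fields. */
static inline uint32_t wait_v30_word(unsigned int wc, unsigned int pl)
{
	return PPC_RAW_WAIT(wc, pl);
}

/* Hypothetical helper: 10-bit extended opcode sitting above the low bit. */
static inline unsigned int wait_xo(uint32_t word)
{
	return (word >> 1) & 0x3ff;
}

/* Self-check of the values worked out by hand from the macros. */
static inline void check_wait_encodings(void)
{
	/* The two encodings differ in the extended opcode: 62 vs 30. */
	assert(wait_xo(wait_v203_word()) == 62);
	assert(wait_xo(wait_v30_word(0, 0)) == 30);
	/* PPC_WAIT(2, 0), the pause_short form used in patch 2/2. */
	assert(wait_v30_word(2, 0) == 0x7c40003cu);
}
```

Calling check_wait_encodings() from a small test program should pass silently. Both fields are masked to two bits, so an out-of-range WC or PL value wraps rather than spilling into neighbouring fields.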
arch/powerpc/include/asm/ppc-opcode.h | 7 +++++-- arch/powerpc/kernel/idle_book3e.S | 2 +- 2 files changed, 6 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/include/asm/ppc-opcode.h b/arch/powerpc/include/asm/ppc-opcode.h index c6d724104ed1..21e33e46f4b8 100644 --- a/arch/powerpc/include/asm/ppc-opcode.h +++ b/arch/powerpc/include/asm/ppc-opcode.h @@ -330,6 +330,7 @@ #define __PPC_XSP(s) ((((s) & 0x1e) | (((s) >> 5) & 0x1)) << 21) #define __PPC_XTP(s) __PPC_XSP(s) #define __PPC_T_TLB(t) (((t) & 0x3) << 21) +#define __PPC_PL(p) (((p) & 0x3) << 16) #define __PPC_WC(w) (((w) & 0x3) << 21) #define __PPC_WS(w) (((w) & 0x1f) << 11) #define __PPC_SH(s) __PPC_WS(s) @@ -388,7 +389,8 @@ #define PPC_RAW_RFDI (0x4c00004e) #define PPC_RAW_RFMCI (0x4c00004c) #define PPC_RAW_TLBILX(t, a, b) (0x7c000024 | __PPC_T_TLB(t) | __PPC_RA0(a) | __PPC_RB(b)) -#define PPC_RAW_WAIT(w) (0x7c00007c | __PPC_WC(w)) +#define PPC_RAW_WAIT_v203 (0x7c00007c) +#define PPC_RAW_WAIT(w, p) (0x7c00003c | __PPC_WC(w) | __PPC_PL(p)) #define PPC_RAW_TLBIE(lp, a) (0x7c000264 | ___PPC_RB(a) | ___PPC_RS(lp)) #define PPC_RAW_TLBIE_5(rb, rs, ric, prs, r) \ (0x7c000264 | ___PPC_RB(rb) | ___PPC_RS(rs) | ___PPC_RIC(ric) | ___PPC_PRS(prs) | ___PPC_R(r)) @@ -606,7 +608,8 @@ #define PPC_TLBILX_ALL(a, b) PPC_TLBILX(0, a, b) #define PPC_TLBILX_PID(a, b) PPC_TLBILX(1, a, b) #define PPC_TLBILX_VA(a, b) PPC_TLBILX(3, a, b) -#define PPC_WAIT(w) stringify_in_c(.long PPC_RAW_WAIT(w)) +#define PPC_WAIT_v203 stringify_in_c(.long PPC_RAW_WAIT_v203) +#define PPC_WAIT(w, p) stringify_in_c(.long PPC_RAW_WAIT(w, p)) #define PPC_TLBIE(lp, a) stringify_in_c(.long PPC_RAW_TLBIE(lp, a)) #define PPC_TLBIE_5(rb, rs, ric, prs, r) \ stringify_in_c(.long PPC_RAW_TLBIE_5(rb, rs, ric, prs, r)) diff --git a/arch/powerpc/kernel/idle_book3e.S b/arch/powerpc/kernel/idle_book3e.S index cc008de58b05..6447de51ea71 100644 --- a/arch/powerpc/kernel/idle_book3e.S +++ b/arch/powerpc/kernel/idle_book3e.S @@ -77,7 +77,7 @@ _GLOBAL(\name) .macro 
BOOK3E_IDLE_LOOP 1: - PPC_WAIT(0) + PPC_WAIT_v203 b 1b .endm From patchwork Tue Sep 20 12:22:59 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nicholas Piggin X-Patchwork-Id: 1680030 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ozlabs.org (client-ip=2404:9400:2:0:216:3eff:fee1:b9f1; helo=lists.ozlabs.org; envelope-from=linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=QBLuuQrH; dkim-atps=neutral Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2404:9400:2:0:216:3eff:fee1:b9f1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-384) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4MX11P2C2tz1yp7 for ; Tue, 20 Sep 2022 22:23:53 +1000 (AEST) Received: from boromir.ozlabs.org (localhost [IPv6:::1]) by lists.ozlabs.org (Postfix) with ESMTP id 4MX11P1Lwhz3c8h for ; Tue, 20 Sep 2022 22:23:53 +1000 (AEST) Authentication-Results: lists.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=QBLuuQrH; dkim-atps=neutral X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gmail.com (client-ip=2607:f8b0:4864:20::636; helo=mail-pl1-x636.google.com; envelope-from=npiggin@gmail.com; receiver=) Authentication-Results: lists.ozlabs.org; dkim=pass (2048-bit 
key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=QBLuuQrH; dkim-atps=neutral Received: from mail-pl1-x636.google.com (mail-pl1-x636.google.com [IPv6:2607:f8b0:4864:20::636]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4MX10b4HWTz2xH9 for ; Tue, 20 Sep 2022 22:23:11 +1000 (AEST) Received: by mail-pl1-x636.google.com with SMTP id c24so2195396plo.3 for ; Tue, 20 Sep 2022 05:23:11 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date; bh=tbCe8Z5/AG2fggvlriFr/YWvwLf8BQOqvnc5d9rfjmw=; b=QBLuuQrHplR61fSZb7cZ2sCN0xhQVjdt+CBYCWOz0NKPz/JfuMOqhzfAwC0W+Di33u wTVQ1wxSGu1la3hQHfC6PETQpyxSmAy2Os5vbfNghJM4UliFi0fWEisEREDnH7eSlqEw yf03TZ4x9ZqRTgIH/xRwZqT7CDBz1DvRUNTI4bRuGdTAet+od2YxYKxbJOtUhkguygWL cSLzw3v12/sbZDeBucjz4vHlOdaJFVq5kBcIJwYUIEWS7Wy2aNSI/z4ExWm/EiNBjx73 PojhX9/ccCgO2j3PFNEqA2dPrhJ6hZ+J/tcfop5IKoqNx2pUqFYppdeYDvndHLGyWjGZ FFqw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date; bh=tbCe8Z5/AG2fggvlriFr/YWvwLf8BQOqvnc5d9rfjmw=; b=DwudW9nre2aFrqcYDvh06q/TM7cvtyB+33srXBdjbiqeV0Ah7RLtuImwS61lGRxMYQ xIb0zseQcHm7UigrLXVuexAUKmdIOV5UfFxqi7jFj4JW8KiCIiTSgxei8XptkHEtpV2t Z1QdZyQO4OhdkNypoZS14LBASZeJWD7IWzM51bMALusqLoEKB2K6DjD4NMZScU8Wjto2 PTM3y2nBcC7UcwTrkLxSR3WdgyyxaXdl/uBivGkmsdQR0X2qP6XlF9JGkslNdATqD/Qk zoTevuUAxulMlRuZ8UpiTlbj+WXphaT1NKlwc8cn8jFFlNGutnK1wKn+Wq8ADQfcMgPt HDZQ== X-Gm-Message-State: ACrzQf3gL+cf/RAmCs5dkpvPctZz6n4qcylydDq6HJc7Z0BeHMRf+Uix 7jEcE6akYGR91eHpl1MO9HzY/H7sCjY= X-Google-Smtp-Source: 
AMsMyM65UroFYUKDBt4CDw5ZuZw8KzUo400c5C5lQFsJVWgXQMUbloTPPVCyIXrwsgV0CseuvUAH/A== X-Received: by 2002:a17:903:110f:b0:178:a07e:e643 with SMTP id n15-20020a170903110f00b00178a07ee643mr4482477plh.41.1663676588872; Tue, 20 Sep 2022 05:23:08 -0700 (PDT) Received: from bobo.ozlabs.ibm.com ([203.219.227.147]) by smtp.gmail.com with ESMTPSA id t17-20020a170902e1d100b001708c4ebbaesm1246569pla.309.2022.09.20.05.23.06 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 20 Sep 2022 05:23:08 -0700 (PDT) From: Nicholas Piggin To: linuxppc-dev@lists.ozlabs.org Subject: [PATCH v4 2/2] powerpc/64s: Make POWER10 and later use pause_short in cpu_relax loops Date: Tue, 20 Sep 2022 22:22:59 +1000 Message-Id: <20220920122259.363092-2-npiggin@gmail.com> X-Mailer: git-send-email 2.37.2 In-Reply-To: <20220920122259.363092-1-npiggin@gmail.com> References: <20220920122259.363092-1-npiggin@gmail.com> MIME-Version: 1.0 X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Nicholas Piggin Errors-To: linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" We want to move away from using SMT priority updates for cpu_relax, and use a 'wait' instruction which is similar to x86. As well as the wait instruction being a much better fit for what everybody else uses and tests with, priority nops are stateful, which is nasty (interrupts have to consider they might be taken at a different priority), and they're expensive to execute, similar to a mtSPR which can affect other threads in the pipe. This has been shown to give results that are less affected by code alignment on benchmarks that cause a lot of spin waiting (e.g., rwsem contention on unixbench filesystem benchmarks) on POWER10. QEMU TCG only supports this instruction correctly since v7.1; versions without the fix may cause hangs when running POWER10 CPUs. 
Reviewed-by: Segher Boessenkool Signed-off-by: Nicholas Piggin --- v4: - Rebase, test with upstream qemu with fix - Clarify asm comments and fix typo (thanks Segher) arch/powerpc/include/asm/processor.h | 28 +++++++++++++++++++---- arch/powerpc/include/asm/vdso/processor.h | 10 +++++++- 2 files changed, 32 insertions(+), 6 deletions(-) diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h index fdfaae194ddd..6b9b0d710468 100644 --- a/arch/powerpc/include/asm/processor.h +++ b/arch/powerpc/include/asm/processor.h @@ -355,11 +355,29 @@ static inline unsigned long __pack_fe01(unsigned int fpmode) #ifdef CONFIG_PPC64 -#define spin_begin() HMT_low() - -#define spin_cpu_relax() barrier() - -#define spin_end() HMT_medium() +#define spin_begin() \ +do { \ + asm volatile(ASM_FTR_IFCLR( \ + "or 1,1,1", /* HMT_LOW */ \ + "nop", /* v3.1 uses pause_short in cpu_relax instead */ \ + %0) :: "i" (CPU_FTR_ARCH_31) : "memory"); \ +} while (0) + +#define spin_cpu_relax() \ +do { \ + asm volatile(ASM_FTR_IFCLR( \ + "nop", /* Before v3.1 use priority nops in spin_begin/end */ \ + PPC_WAIT(2, 0), /* aka pause_short */ \ + %0) :: "i" (CPU_FTR_ARCH_31) : "memory"); \ +} while (0) + +#define spin_end() \ +do { \ + asm volatile(ASM_FTR_IFCLR( \ + "or 2,2,2", /* HMT_MEDIUM */ \ + "nop", \ + %0) :: "i" (CPU_FTR_ARCH_31) : "memory"); \ +} while (0) #endif diff --git a/arch/powerpc/include/asm/vdso/processor.h b/arch/powerpc/include/asm/vdso/processor.h index 8d79f994b4aa..778d2b53041b 100644 --- a/arch/powerpc/include/asm/vdso/processor.h +++ b/arch/powerpc/include/asm/vdso/processor.h @@ -22,7 +22,15 @@ #endif #ifdef CONFIG_PPC64 -#define cpu_relax() do { HMT_low(); HMT_medium(); barrier(); } while (0) +#define cpu_relax() \ +do { \ + asm volatile(ASM_FTR_IFCLR( \ + /* Pre-POWER10 uses low ; medium priority nops */ \ + "or 1,1,1 ; or 2,2,2", \ + /* POWER10 onward uses pause_short (wait 2,0) */ \ + PPC_WAIT(2, 0), \ + %0) :: "i" (CPU_FTR_ARCH_31) : "memory"); 
\ +} while (0) #else #define cpu_relax() barrier() #endif
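
Note on usage: the three spin macros are meant to bracket a busy-wait loop, with spin_begin() before the first relax, spin_cpu_relax() on every iteration, and spin_end() once the condition holds. The sketch below shows that shape in plain userspace C. The spin_* functions here are hypothetical no-op stand-ins that only count calls, not the kernel macros from this patch. The countdown type and spin_until_done() are likewise invented for the illustration.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel macros; they only count calls. */
static unsigned long relax_calls;

static inline void spin_begin(void) { }
static inline void spin_cpu_relax(void) { relax_calls++; }
static inline void spin_end(void) { }

/* Hypothetical condition: reports done once 'remaining' polls have failed. */
struct countdown {
	int remaining;
};

static inline bool countdown_done(struct countdown *c)
{
	return c->remaining-- <= 0;
}

/*
 * Busy-wait until the condition holds, returning how many times the
 * loop relaxed. The fast path skips spin_begin()/spin_end() entirely
 * when the condition is already true on the first check.
 */
static inline unsigned long spin_until_done(struct countdown *c)
{
	relax_calls = 0;
	if (!countdown_done(c)) {
		spin_begin();
		do {
			spin_cpu_relax();
		} while (!countdown_done(c));
		spin_end();
	}
	return relax_calls;
}
```

With this patch applied, the pre-v3.1 cost of such a loop sits in spin_begin()/spin_end() (the priority nops), while on v3.1 it sits in each spin_cpu_relax() (the wait 2,0), so the loop shape itself does not need to change.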