From patchwork Fri Jul 24 13:14:18 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nicholas Piggin X-Patchwork-Id: 1335697 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org (client-ip=23.128.96.18; helo=vger.kernel.org; envelope-from=kvm-ppc-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20161025 header.b=sqxzgb7o; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by ozlabs.org (Postfix) with ESMTP id 4BCqRp22cMz9sTT for ; Fri, 24 Jul 2020 23:14:46 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726895AbgGXNOp (ORCPT ); Fri, 24 Jul 2020 09:14:45 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35264 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727050AbgGXNOm (ORCPT ); Fri, 24 Jul 2020 09:14:42 -0400 Received: from mail-pl1-x644.google.com (mail-pl1-x644.google.com [IPv6:2607:f8b0:4864:20::644]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 68A56C0619D3; Fri, 24 Jul 2020 06:14:42 -0700 (PDT) Received: by mail-pl1-x644.google.com with SMTP id o1so4416185plk.1; Fri, 24 Jul 2020 06:14:42 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=2kVR8dpc50ysMIxQqSwuiTqzeI4YP5E2Ddp3+XJfYGg=; b=sqxzgb7opVJpqkTP//ZlqsIRD4fKl8hWlOQ1iA3Df02waiSmGM1mmDdm+e8ka1/1mP kiWZvwxr98CH6Xr2XdEm7k7js/gVG3Vh4GK3sNC8KQbYQw04wHKaMo4OLwyLZiWOOYaE OBrUj4w6MrhBKScRRjr2ldae6JdXochR+jrX+sZVjSyrWj/55KpbagaXgig3t20v7JfP jCfA1yGhSsRgL4OMmSn9ze8Lbs6mD6D3431BOObaRSjfHEJWwUww7AmKlvaQBxqIcdCl t5C8eKYwa6ZP9jZMNXUzNAuJjWg0TF/fWmiX7cJYXiRGMp3FgXDkFQH4wR7ASXSBOEi/ t5tQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=2kVR8dpc50ysMIxQqSwuiTqzeI4YP5E2Ddp3+XJfYGg=; b=aWjosuRbRY64KSpR17DT5qkjDKefr4N6fIphJ/Sau8jxMtamn70F7GWaX67xIFDmwn fSC6BUaj7Q5V6+Yymm+8iTAenQc0lfPeiVVpzh7rdhEamjyJdDx0NlbQ4CfcVFIpqXSn yT6LUN5phw/i2pwMIHVXFeimPN3+t5kPbYCWkRm5flbYWsgRov12qJr+vxEGoD40DqXN iC15yt2xBJ5I3wVWapP8XUtukJ/qxq21DueqXxUIcWFrYE5DaDmg13+POSgnSwWrV+xn 1G43uCjH6HONgHXhAhP0agHKqMyD2mnBAJNQqh0RGibuQ9jBnn3RP8twcISfNmonBwm7 VLtw== X-Gm-Message-State: AOAM530gTaGj1+aAN41xbZ5ShUwM3u1I7ROxeivzRkFHEUjh59F5QxmV V31RVRsddcAtDK61H6wt/70= X-Google-Smtp-Source: ABdhPJwZT1wR4LaafRQIYtvkhAEiiC1v3dDk5trVpmTzdqzVW4G7zNyw5wB5pFFpSG/t6k8+IXpsPw== X-Received: by 2002:a17:902:a9c1:: with SMTP id b1mr8349193plr.234.1595596482001; Fri, 24 Jul 2020 06:14:42 -0700 (PDT) Received: from bobo.ibm.com (110-174-173-27.tpgi.com.au. [110.174.173.27]) by smtp.gmail.com with ESMTPSA id az16sm5871998pjb.7.2020.07.24.06.14.36 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 24 Jul 2020 06:14:41 -0700 (PDT) From: Nicholas Piggin To: linuxppc-dev@lists.ozlabs.org Cc: Nicholas Piggin , Will Deacon , Peter Zijlstra , Boqun Feng , Ingo Molnar , Waiman Long , Anton Blanchard , =?utf-8?q?Michal_Such=C3=A1nek?= , linux-kernel@vger.kernel.org, virtualization@lists.linux-foundation.org, kvm-ppc@vger.kernel.org, linux-arch@vger.kernel.org Subject: [PATCH v4 1/6] powerpc/pseries: move some PAPR paravirt functions to their own file Date: Fri, 24 Jul 2020 23:14:18 +1000 Message-Id: <20200724131423.1362108-2-npiggin@gmail.com> X-Mailer: git-send-email 2.23.0 In-Reply-To: <20200724131423.1362108-1-npiggin@gmail.com> References: <20200724131423.1362108-1-npiggin@gmail.com> MIME-Version: 1.0 Sender: kvm-ppc-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm-ppc@vger.kernel.org These functions will be used by queued spinlock implementation, and may be useful elsewhere too, so move them out of spinlock.h. Signed-off-by: Nicholas Piggin --- arch/powerpc/include/asm/paravirt.h | 59 +++++++++++++++++++++++++++++ arch/powerpc/include/asm/spinlock.h | 24 +----------- arch/powerpc/lib/locks.c | 12 +++--- 3 files changed, 66 insertions(+), 29 deletions(-) create mode 100644 arch/powerpc/include/asm/paravirt.h diff --git a/arch/powerpc/include/asm/paravirt.h b/arch/powerpc/include/asm/paravirt.h new file mode 100644 index 000000000000..339e8533464b --- /dev/null +++ b/arch/powerpc/include/asm/paravirt.h @@ -0,0 +1,59 @@ +/* SPDX-License-Identifier: GPL-2.0-or-later */ +#ifndef _ASM_POWERPC_PARAVIRT_H +#define _ASM_POWERPC_PARAVIRT_H + +#include +#include +#ifdef CONFIG_PPC64 +#include +#include +#endif + +#ifdef CONFIG_PPC_SPLPAR +DECLARE_STATIC_KEY_FALSE(shared_processor); + +static inline bool is_shared_processor(void) +{ + return static_branch_unlikely(&shared_processor); +} + +/* If bit 0 is set, the cpu has been preempted */ +static inline u32 yield_count_of(int cpu) +{ + __be32 yield_count = READ_ONCE(lppaca_of(cpu).yield_count); + return be32_to_cpu(yield_count); +} + +static inline void yield_to_preempted(int cpu, u32 yield_count) +{ + plpar_hcall_norets(H_CONFER, get_hard_smp_processor_id(cpu), yield_count); +} +#else +static inline bool is_shared_processor(void) +{ + return false; +} + +static inline u32 yield_count_of(int cpu) +{ + return 0; +} + +extern void ___bad_yield_to_preempted(void); +static inline void yield_to_preempted(int cpu, u32 yield_count) +{ + ___bad_yield_to_preempted(); /* This would be a bug */ +} +#endif + +#define vcpu_is_preempted vcpu_is_preempted +static inline bool vcpu_is_preempted(int cpu) +{ + if (!is_shared_processor()) + return false; + if (yield_count_of(cpu) & 1) + return true; + return false; +} + +#endif /* _ASM_POWERPC_PARAVIRT_H */ diff --git a/arch/powerpc/include/asm/spinlock.h b/arch/powerpc/include/asm/spinlock.h index 2d620896cdae..79be9bb10bbb 100644 --- a/arch/powerpc/include/asm/spinlock.h +++ b/arch/powerpc/include/asm/spinlock.h @@ -15,11 +15,10 @@ * * (the type definitions are in asm/spinlock_types.h) */ -#include #include +#include #ifdef CONFIG_PPC64 #include -#include #endif #include #include @@ -35,18 +34,6 @@ #define LOCK_TOKEN 1 #endif -#ifdef CONFIG_PPC_PSERIES -DECLARE_STATIC_KEY_FALSE(shared_processor); - -#define vcpu_is_preempted vcpu_is_preempted -static inline bool vcpu_is_preempted(int cpu) -{ - if (!static_branch_unlikely(&shared_processor)) - return false; - return !!(be32_to_cpu(lppaca_of(cpu).yield_count) & 1); -} -#endif - static __always_inline int arch_spin_value_unlocked(arch_spinlock_t lock) { return lock.slock == 0; @@ -110,15 +97,6 @@ static inline void splpar_spin_yield(arch_spinlock_t *lock) {}; static inline void splpar_rw_yield(arch_rwlock_t *lock) {}; #endif -static inline bool is_shared_processor(void) -{ -#ifdef CONFIG_PPC_SPLPAR - return static_branch_unlikely(&shared_processor); -#else - return false; -#endif -} - static inline void spin_yield(arch_spinlock_t *lock) { if (is_shared_processor()) diff --git a/arch/powerpc/lib/locks.c b/arch/powerpc/lib/locks.c index 6440d5943c00..04165b7a163f 100644 --- a/arch/powerpc/lib/locks.c +++ b/arch/powerpc/lib/locks.c @@ -27,14 +27,14 @@ void splpar_spin_yield(arch_spinlock_t *lock) return; holder_cpu = lock_value & 0xffff; BUG_ON(holder_cpu >= NR_CPUS); - yield_count = be32_to_cpu(lppaca_of(holder_cpu).yield_count); + + yield_count = yield_count_of(holder_cpu); if ((yield_count & 1) == 0) return; /* virtual cpu is currently running */ rmb(); if (lock->slock != lock_value) return; /* something has changed */ - plpar_hcall_norets(H_CONFER, - get_hard_smp_processor_id(holder_cpu), yield_count); + yield_to_preempted(holder_cpu, yield_count); } EXPORT_SYMBOL_GPL(splpar_spin_yield); @@ -53,13 +53,13 @@ void splpar_rw_yield(arch_rwlock_t *rw) return; /* no write lock at present */ holder_cpu = lock_value & 0xffff; BUG_ON(holder_cpu >= NR_CPUS); - yield_count = be32_to_cpu(lppaca_of(holder_cpu).yield_count); + + yield_count = yield_count_of(holder_cpu); if ((yield_count & 1) == 0) return; /* virtual cpu is currently running */ rmb(); if (rw->lock != lock_value) return; /* something has changed */ - plpar_hcall_norets(H_CONFER, - get_hard_smp_processor_id(holder_cpu), yield_count); + yield_to_preempted(holder_cpu, yield_count); } #endif From patchwork Fri Jul 24 13:14:19 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nicholas Piggin X-Patchwork-Id: 1335698 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org (client-ip=23.128.96.18; helo=vger.kernel.org; envelope-from=kvm-ppc-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20161025 header.b=f8Ah6shk; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by ozlabs.org (Postfix) with ESMTP id 4BCqRw3w75z9sTT for ; Fri, 24 Jul 2020 23:14:52 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726970AbgGXNOv (ORCPT ); Fri, 24 Jul 2020 09:14:51 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35280 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727085AbgGXNOt (ORCPT ); Fri, 24 Jul 2020 09:14:49 -0400 Received: from mail-pg1-x542.google.com (mail-pg1-x542.google.com [IPv6:2607:f8b0:4864:20::542]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D7CFFC0619D3; Fri, 24 Jul 2020 06:14:48 -0700 (PDT) Received: by mail-pg1-x542.google.com with SMTP id p3so5190995pgh.3; Fri, 24 Jul 2020 06:14:48 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=oPs14SgggfUJ0bmLL053gPEaQ1BKSm2P947QZzWKdbk=; b=f8Ah6shkdc+cZT2hhaNGd2Cmnmzddi1v7nwueDcoESDED+mezGO5oQjI1yweJyNHz4 Aao9BMbQlgVbHJmWPFjxJETwAV+9Od/vkAKlsjGKBuXUstMSNVs52X+rcz/rVoJerZbl EkqKuR/43pH1LPQr18IKJ0KJaURmMuIXxBJKuj/XW0HnJ4PWFACHd7NI/CfdhNKCTJHL bMZDK/BlG1W5+5DuDtlpzFNzvB2uYB8DKl+SEBYlhIgBwR3h/p60vpOqw3p64gITmqxd 3g36LpVnxitSJYHPElij/RqAoFjhF/kYBGF4ocWe2eCkhAfltfpDdqa0Q5d4LFPdHlEu oUkA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=oPs14SgggfUJ0bmLL053gPEaQ1BKSm2P947QZzWKdbk=; b=gIY6co/8uvbdMgPidycPVKbO9012lyN5JTRrTCjxMpMdkFfAn2ZclrURZo/SyhriDH oj1LagCijuDsv9EcFVKFDl0RAv5ZnYooWEX2/dCJsh+b2NNMfl0ayeCC5Qk7ht91DiGT YC0xIbMMrzlIn12+uLv95k/GB+JTmS9DaT/aB+jofandgmY2NXPUdaIDLcl7qTKg5amA 3bJUt+vgi2k2zwH/5KpaLw0n4JFSjbDZTtvXxEWJHgwPYKnwiX4wkD2gzh/xf4a3ZwsR bF6eivsHoFfMALpRr8azZWYe5KkIi9C6+Mxu3I03XeR7uXGySv+k7lzt96NoKasXWwGz +F3g== X-Gm-Message-State: AOAM532ZprwtJENx1mR98arFr6ZVK6guC0KpZYOjJVD5D+EgL1Xp0LrY uZacEKJJNGVvzyXtF5BZ554= X-Google-Smtp-Source: ABdhPJzH03uW5Prh+YcV/O0ttabxh7FoK2yubBWqNAmfZEBT/o4wfAxJ/HgjBa/NPfCgOZ/uZgsoNA== X-Received: by 2002:a62:7991:: with SMTP id u139mr8488666pfc.87.1595596488249; Fri, 24 Jul 2020 06:14:48 -0700 (PDT) Received: from bobo.ibm.com (110-174-173-27.tpgi.com.au. [110.174.173.27]) by smtp.gmail.com with ESMTPSA id az16sm5871998pjb.7.2020.07.24.06.14.42 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 24 Jul 2020 06:14:47 -0700 (PDT) From: Nicholas Piggin To: linuxppc-dev@lists.ozlabs.org Cc: Nicholas Piggin , Will Deacon , Peter Zijlstra , Boqun Feng , Ingo Molnar , Waiman Long , Anton Blanchard , =?utf-8?q?Michal_Such=C3=A1nek?= , linux-kernel@vger.kernel.org, virtualization@lists.linux-foundation.org, kvm-ppc@vger.kernel.org, linux-arch@vger.kernel.org Subject: [PATCH v4 2/6] powerpc: move spinlock implementation to simple_spinlock Date: Fri, 24 Jul 2020 23:14:19 +1000 Message-Id: <20200724131423.1362108-3-npiggin@gmail.com> X-Mailer: git-send-email 2.23.0 In-Reply-To: <20200724131423.1362108-1-npiggin@gmail.com> References: <20200724131423.1362108-1-npiggin@gmail.com> MIME-Version: 1.0 Sender: kvm-ppc-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm-ppc@vger.kernel.org To prepare for queued spinlocks. This is a simple rename except to update preprocessor guard name and a file reference. Signed-off-by: Nicholas Piggin --- arch/powerpc/include/asm/simple_spinlock.h | 288 ++++++++++++++++++ .../include/asm/simple_spinlock_types.h | 21 ++ arch/powerpc/include/asm/spinlock.h | 285 +---------------- arch/powerpc/include/asm/spinlock_types.h | 12 +- 4 files changed, 311 insertions(+), 295 deletions(-) create mode 100644 arch/powerpc/include/asm/simple_spinlock.h create mode 100644 arch/powerpc/include/asm/simple_spinlock_types.h diff --git a/arch/powerpc/include/asm/simple_spinlock.h b/arch/powerpc/include/asm/simple_spinlock.h new file mode 100644 index 000000000000..fe6cff7df48e --- /dev/null +++ b/arch/powerpc/include/asm/simple_spinlock.h @@ -0,0 +1,288 @@ +/* SPDX-License-Identifier: GPL-2.0-or-later */ +#ifndef _ASM_POWERPC_SIMPLE_SPINLOCK_H +#define _ASM_POWERPC_SIMPLE_SPINLOCK_H + +/* + * Simple spin lock operations. + * + * Copyright (C) 2001-2004 Paul Mackerras , IBM + * Copyright (C) 2001 Anton Blanchard , IBM + * Copyright (C) 2002 Dave Engebretsen , IBM + * Rework to support virtual processors + * + * Type of int is used as a full 64b word is not necessary. + * + * (the type definitions are in asm/simple_spinlock_types.h) + */ +#include +#include +#include +#include +#include + +#ifdef CONFIG_PPC64 +/* use 0x800000yy when locked, where yy == CPU number */ +#ifdef __BIG_ENDIAN__ +#define LOCK_TOKEN (*(u32 *)(&get_paca()->lock_token)) +#else +#define LOCK_TOKEN (*(u32 *)(&get_paca()->paca_index)) +#endif +#else +#define LOCK_TOKEN 1 +#endif + +static __always_inline int arch_spin_value_unlocked(arch_spinlock_t lock) +{ + return lock.slock == 0; +} + +static inline int arch_spin_is_locked(arch_spinlock_t *lock) +{ + smp_mb(); + return !arch_spin_value_unlocked(*lock); +} + +/* + * This returns the old value in the lock, so we succeeded + * in getting the lock if the return value is 0. + */ +static inline unsigned long __arch_spin_trylock(arch_spinlock_t *lock) +{ + unsigned long tmp, token; + + token = LOCK_TOKEN; + __asm__ __volatile__( +"1: " PPC_LWARX(%0,0,%2,1) "\n\ + cmpwi 0,%0,0\n\ + bne- 2f\n\ + stwcx. %1,0,%2\n\ + bne- 1b\n" + PPC_ACQUIRE_BARRIER +"2:" + : "=&r" (tmp) + : "r" (token), "r" (&lock->slock) + : "cr0", "memory"); + + return tmp; +} + +static inline int arch_spin_trylock(arch_spinlock_t *lock) +{ + return __arch_spin_trylock(lock) == 0; +} + +/* + * On a system with shared processors (that is, where a physical + * processor is multiplexed between several virtual processors), + * there is no point spinning on a lock if the holder of the lock + * isn't currently scheduled on a physical processor. Instead + * we detect this situation and ask the hypervisor to give the + * rest of our timeslice to the lock holder. + * + * So that we can tell which virtual processor is holding a lock, + * we put 0x80000000 | smp_processor_id() in the lock when it is + * held. Conveniently, we have a word in the paca that holds this + * value. + */ + +#if defined(CONFIG_PPC_SPLPAR) +/* We only yield to the hypervisor if we are in shared processor mode */ +void splpar_spin_yield(arch_spinlock_t *lock); +void splpar_rw_yield(arch_rwlock_t *lock); +#else /* SPLPAR */ +static inline void splpar_spin_yield(arch_spinlock_t *lock) {}; +static inline void splpar_rw_yield(arch_rwlock_t *lock) {}; +#endif + +static inline void spin_yield(arch_spinlock_t *lock) +{ + if (is_shared_processor()) + splpar_spin_yield(lock); + else + barrier(); +} + +static inline void rw_yield(arch_rwlock_t *lock) +{ + if (is_shared_processor()) + splpar_rw_yield(lock); + else + barrier(); +} + +static inline void arch_spin_lock(arch_spinlock_t *lock) +{ + while (1) { + if (likely(__arch_spin_trylock(lock) == 0)) + break; + do { + HMT_low(); + if (is_shared_processor()) + splpar_spin_yield(lock); + } while (unlikely(lock->slock != 0)); + HMT_medium(); + } +} + +static inline +void arch_spin_lock_flags(arch_spinlock_t *lock, unsigned long flags) +{ + unsigned long flags_dis; + + while (1) { + if (likely(__arch_spin_trylock(lock) == 0)) + break; + local_save_flags(flags_dis); + local_irq_restore(flags); + do { + HMT_low(); + if (is_shared_processor()) + splpar_spin_yield(lock); + } while (unlikely(lock->slock != 0)); + HMT_medium(); + local_irq_restore(flags_dis); + } +} +#define arch_spin_lock_flags arch_spin_lock_flags + +static inline void arch_spin_unlock(arch_spinlock_t *lock) +{ + __asm__ __volatile__("# arch_spin_unlock\n\t" + PPC_RELEASE_BARRIER: : :"memory"); + lock->slock = 0; +} + +/* + * Read-write spinlocks, allowing multiple readers + * but only one writer. + * + * NOTE! it is quite common to have readers in interrupts + * but no interrupt writers. For those circumstances we + * can "mix" irq-safe locks - any writer needs to get a + * irq-safe write-lock, but readers can get non-irqsafe + * read-locks. + */ + +#ifdef CONFIG_PPC64 +#define __DO_SIGN_EXTEND "extsw %0,%0\n" +#define WRLOCK_TOKEN LOCK_TOKEN /* it's negative */ +#else +#define __DO_SIGN_EXTEND +#define WRLOCK_TOKEN (-1) +#endif + +/* + * This returns the old value in the lock + 1, + * so we got a read lock if the return value is > 0. + */ +static inline long __arch_read_trylock(arch_rwlock_t *rw) +{ + long tmp; + + __asm__ __volatile__( +"1: " PPC_LWARX(%0,0,%1,1) "\n" + __DO_SIGN_EXTEND +" addic. %0,%0,1\n\ + ble- 2f\n" +" stwcx. %0,0,%1\n\ + bne- 1b\n" + PPC_ACQUIRE_BARRIER +"2:" : "=&r" (tmp) + : "r" (&rw->lock) + : "cr0", "xer", "memory"); + + return tmp; +} + +/* + * This returns the old value in the lock, + * so we got the write lock if the return value is 0. + */ +static inline long __arch_write_trylock(arch_rwlock_t *rw) +{ + long tmp, token; + + token = WRLOCK_TOKEN; + __asm__ __volatile__( +"1: " PPC_LWARX(%0,0,%2,1) "\n\ + cmpwi 0,%0,0\n\ + bne- 2f\n" +" stwcx. %1,0,%2\n\ + bne- 1b\n" + PPC_ACQUIRE_BARRIER +"2:" : "=&r" (tmp) + : "r" (token), "r" (&rw->lock) + : "cr0", "memory"); + + return tmp; +} + +static inline void arch_read_lock(arch_rwlock_t *rw) +{ + while (1) { + if (likely(__arch_read_trylock(rw) > 0)) + break; + do { + HMT_low(); + if (is_shared_processor()) + splpar_rw_yield(rw); + } while (unlikely(rw->lock < 0)); + HMT_medium(); + } +} + +static inline void arch_write_lock(arch_rwlock_t *rw) +{ + while (1) { + if (likely(__arch_write_trylock(rw) == 0)) + break; + do { + HMT_low(); + if (is_shared_processor()) + splpar_rw_yield(rw); + } while (unlikely(rw->lock != 0)); + HMT_medium(); + } +} + +static inline int arch_read_trylock(arch_rwlock_t *rw) +{ + return __arch_read_trylock(rw) > 0; +} + +static inline int arch_write_trylock(arch_rwlock_t *rw) +{ + return __arch_write_trylock(rw) == 0; +} + +static inline void arch_read_unlock(arch_rwlock_t *rw) +{ + long tmp; + + __asm__ __volatile__( + "# read_unlock\n\t" + PPC_RELEASE_BARRIER +"1: lwarx %0,0,%1\n\ + addic %0,%0,-1\n" +" stwcx. %0,0,%1\n\ + bne- 1b" + : "=&r"(tmp) + : "r"(&rw->lock) + : "cr0", "xer", "memory"); +} + +static inline void arch_write_unlock(arch_rwlock_t *rw) +{ + __asm__ __volatile__("# write_unlock\n\t" + PPC_RELEASE_BARRIER: : :"memory"); + rw->lock = 0; +} + +#define arch_spin_relax(lock) spin_yield(lock) +#define arch_read_relax(lock) rw_yield(lock) +#define arch_write_relax(lock) rw_yield(lock) + +/* See include/linux/spinlock.h */ +#define smp_mb__after_spinlock() smp_mb() + +#endif /* _ASM_POWERPC_SIMPLE_SPINLOCK_H */ diff --git a/arch/powerpc/include/asm/simple_spinlock_types.h b/arch/powerpc/include/asm/simple_spinlock_types.h new file mode 100644 index 000000000000..0f3cdd8faa95 --- /dev/null +++ b/arch/powerpc/include/asm/simple_spinlock_types.h @@ -0,0 +1,21 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_POWERPC_SIMPLE_SPINLOCK_TYPES_H +#define _ASM_POWERPC_SIMPLE_SPINLOCK_TYPES_H + +#ifndef __LINUX_SPINLOCK_TYPES_H +# error "please don't include this file directly" +#endif + +typedef struct { + volatile unsigned int slock; +} arch_spinlock_t; + +#define __ARCH_SPIN_LOCK_UNLOCKED { 0 } + +typedef struct { + volatile signed int lock; +} arch_rwlock_t; + +#define __ARCH_RW_LOCK_UNLOCKED { 0 } + +#endif /* _ASM_POWERPC_SIMPLE_SPINLOCK_TYPES_H */ diff --git a/arch/powerpc/include/asm/spinlock.h b/arch/powerpc/include/asm/spinlock.h index 79be9bb10bbb..21357fe05fe0 100644 --- a/arch/powerpc/include/asm/spinlock.h +++ b/arch/powerpc/include/asm/spinlock.h @@ -3,290 +3,7 @@ #define __ASM_SPINLOCK_H #ifdef __KERNEL__ -/* - * Simple spin lock operations. - * - * Copyright (C) 2001-2004 Paul Mackerras , IBM - * Copyright (C) 2001 Anton Blanchard , IBM - * Copyright (C) 2002 Dave Engebretsen , IBM - * Rework to support virtual processors - * - * Type of int is used as a full 64b word is not necessary. - * - * (the type definitions are in asm/spinlock_types.h) - */ -#include -#include -#ifdef CONFIG_PPC64 -#include -#endif -#include -#include - -#ifdef CONFIG_PPC64 -/* use 0x800000yy when locked, where yy == CPU number */ -#ifdef __BIG_ENDIAN__ -#define LOCK_TOKEN (*(u32 *)(&get_paca()->lock_token)) -#else -#define LOCK_TOKEN (*(u32 *)(&get_paca()->paca_index)) -#endif -#else -#define LOCK_TOKEN 1 -#endif - -static __always_inline int arch_spin_value_unlocked(arch_spinlock_t lock) -{ - return lock.slock == 0; -} - -static inline int arch_spin_is_locked(arch_spinlock_t *lock) -{ - smp_mb(); - return !arch_spin_value_unlocked(*lock); -} - -/* - * This returns the old value in the lock, so we succeeded - * in getting the lock if the return value is 0. - */ -static inline unsigned long __arch_spin_trylock(arch_spinlock_t *lock) -{ - unsigned long tmp, token; - - token = LOCK_TOKEN; - __asm__ __volatile__( -"1: " PPC_LWARX(%0,0,%2,1) "\n\ - cmpwi 0,%0,0\n\ - bne- 2f\n\ - stwcx. %1,0,%2\n\ - bne- 1b\n" - PPC_ACQUIRE_BARRIER -"2:" - : "=&r" (tmp) - : "r" (token), "r" (&lock->slock) - : "cr0", "memory"); - - return tmp; -} - -static inline int arch_spin_trylock(arch_spinlock_t *lock) -{ - return __arch_spin_trylock(lock) == 0; -} - -/* - * On a system with shared processors (that is, where a physical - * processor is multiplexed between several virtual processors), - * there is no point spinning on a lock if the holder of the lock - * isn't currently scheduled on a physical processor. Instead - * we detect this situation and ask the hypervisor to give the - * rest of our timeslice to the lock holder. - * - * So that we can tell which virtual processor is holding a lock, - * we put 0x80000000 | smp_processor_id() in the lock when it is - * held. Conveniently, we have a word in the paca that holds this - * value. - */ - -#if defined(CONFIG_PPC_SPLPAR) -/* We only yield to the hypervisor if we are in shared processor mode */ -void splpar_spin_yield(arch_spinlock_t *lock); -void splpar_rw_yield(arch_rwlock_t *lock); -#else /* SPLPAR */ -static inline void splpar_spin_yield(arch_spinlock_t *lock) {}; -static inline void splpar_rw_yield(arch_rwlock_t *lock) {}; -#endif - -static inline void spin_yield(arch_spinlock_t *lock) -{ - if (is_shared_processor()) - splpar_spin_yield(lock); - else - barrier(); -} - -static inline void rw_yield(arch_rwlock_t *lock) -{ - if (is_shared_processor()) - splpar_rw_yield(lock); - else - barrier(); -} - -static inline void arch_spin_lock(arch_spinlock_t *lock) -{ - while (1) { - if (likely(__arch_spin_trylock(lock) == 0)) - break; - do { - HMT_low(); - if (is_shared_processor()) - splpar_spin_yield(lock); - } while (unlikely(lock->slock != 0)); - HMT_medium(); - } -} - -static inline -void arch_spin_lock_flags(arch_spinlock_t *lock, unsigned long flags) -{ - unsigned long flags_dis; - - while (1) { - if (likely(__arch_spin_trylock(lock) == 0)) - break; - local_save_flags(flags_dis); - local_irq_restore(flags); - do { - HMT_low(); - if (is_shared_processor()) - splpar_spin_yield(lock); - } while (unlikely(lock->slock != 0)); - HMT_medium(); - local_irq_restore(flags_dis); - } -} -#define arch_spin_lock_flags arch_spin_lock_flags - -static inline void arch_spin_unlock(arch_spinlock_t *lock) -{ - __asm__ __volatile__("# arch_spin_unlock\n\t" - PPC_RELEASE_BARRIER: : :"memory"); - lock->slock = 0; -} - -/* - * Read-write spinlocks, allowing multiple readers - * but only one writer. - * - * NOTE! it is quite common to have readers in interrupts - * but no interrupt writers. For those circumstances we - * can "mix" irq-safe locks - any writer needs to get a - * irq-safe write-lock, but readers can get non-irqsafe - * read-locks. - */ - -#ifdef CONFIG_PPC64 -#define __DO_SIGN_EXTEND "extsw %0,%0\n" -#define WRLOCK_TOKEN LOCK_TOKEN /* it's negative */ -#else -#define __DO_SIGN_EXTEND -#define WRLOCK_TOKEN (-1) -#endif - -/* - * This returns the old value in the lock + 1, - * so we got a read lock if the return value is > 0. - */ -static inline long __arch_read_trylock(arch_rwlock_t *rw) -{ - long tmp; - - __asm__ __volatile__( -"1: " PPC_LWARX(%0,0,%1,1) "\n" - __DO_SIGN_EXTEND -" addic. %0,%0,1\n\ - ble- 2f\n" -" stwcx. %0,0,%1\n\ - bne- 1b\n" - PPC_ACQUIRE_BARRIER -"2:" : "=&r" (tmp) - : "r" (&rw->lock) - : "cr0", "xer", "memory"); - - return tmp; -} - -/* - * This returns the old value in the lock, - * so we got the write lock if the return value is 0. - */ -static inline long __arch_write_trylock(arch_rwlock_t *rw) -{ - long tmp, token; - - token = WRLOCK_TOKEN; - __asm__ __volatile__( -"1: " PPC_LWARX(%0,0,%2,1) "\n\ - cmpwi 0,%0,0\n\ - bne- 2f\n" -" stwcx. %1,0,%2\n\ - bne- 1b\n" - PPC_ACQUIRE_BARRIER -"2:" : "=&r" (tmp) - : "r" (token), "r" (&rw->lock) - : "cr0", "memory"); - - return tmp; -} - -static inline void arch_read_lock(arch_rwlock_t *rw) -{ - while (1) { - if (likely(__arch_read_trylock(rw) > 0)) - break; - do { - HMT_low(); - if (is_shared_processor()) - splpar_rw_yield(rw); - } while (unlikely(rw->lock < 0)); - HMT_medium(); - } -} - -static inline void arch_write_lock(arch_rwlock_t *rw) -{ - while (1) { - if (likely(__arch_write_trylock(rw) == 0)) - break; - do { - HMT_low(); - if (is_shared_processor()) - splpar_rw_yield(rw); - } while (unlikely(rw->lock != 0)); - HMT_medium(); - } -} - -static inline int arch_read_trylock(arch_rwlock_t *rw) -{ - return __arch_read_trylock(rw) > 0; -} - -static inline int arch_write_trylock(arch_rwlock_t *rw) -{ - return __arch_write_trylock(rw) == 0; -} - -static inline void arch_read_unlock(arch_rwlock_t *rw) -{ - long tmp; - - __asm__ __volatile__( - "# read_unlock\n\t" - PPC_RELEASE_BARRIER -"1: lwarx %0,0,%1\n\ - addic %0,%0,-1\n" -" stwcx. %0,0,%1\n\ - bne- 1b" - : "=&r"(tmp) - : "r"(&rw->lock) - : "cr0", "xer", "memory"); -} - -static inline void arch_write_unlock(arch_rwlock_t *rw) -{ - __asm__ __volatile__("# write_unlock\n\t" - PPC_RELEASE_BARRIER: : :"memory"); - rw->lock = 0; -} - -#define arch_spin_relax(lock) spin_yield(lock) -#define arch_read_relax(lock) rw_yield(lock) -#define arch_write_relax(lock) rw_yield(lock) - -/* See include/linux/spinlock.h */ -#define smp_mb__after_spinlock() smp_mb() +#include #endif /* __KERNEL__ */ #endif /* __ASM_SPINLOCK_H */ diff --git a/arch/powerpc/include/asm/spinlock_types.h b/arch/powerpc/include/asm/spinlock_types.h index 87adaf13b7e8..3906f52dae65 100644 --- a/arch/powerpc/include/asm/spinlock_types.h +++ b/arch/powerpc/include/asm/spinlock_types.h @@ -6,16 +6,6 @@ # error "please don't include this file directly" #endif -typedef struct { - volatile unsigned int slock; -} arch_spinlock_t; - -#define __ARCH_SPIN_LOCK_UNLOCKED { 0 } - -typedef struct { - volatile signed int lock; -} arch_rwlock_t; - -#define __ARCH_RW_LOCK_UNLOCKED { 0 } +#include #endif From patchwork Fri Jul 24 13:14:20 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nicholas Piggin X-Patchwork-Id: 1335699 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org (client-ip=23.128.96.18; helo=vger.kernel.org; envelope-from=kvm-ppc-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20161025 header.b=uIVnByAH; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by ozlabs.org (Postfix) with ESMTP id 4BCqS90FJjz9sV1 for ; Fri, 24 Jul 2020 23:15:05 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726753AbgGXNOz (ORCPT ); Fri, 24 Jul 2020 09:14:55 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35298 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727045AbgGXNOy (ORCPT ); Fri, 24 Jul 2020 09:14:54 -0400 Received: from mail-pj1-x1043.google.com (mail-pj1-x1043.google.com [IPv6:2607:f8b0:4864:20::1043]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 93D71C0619D3; Fri, 24 Jul 2020 06:14:54 -0700 (PDT) Received: by mail-pj1-x1043.google.com with SMTP id t15so5204033pjq.5; Fri, 24 Jul 2020 06:14:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=5Z5tnkoy8D3Ixvbx4t7AGSPKBpu2EDlVVpHs+ZSeWN8=; b=uIVnByAHepd5/lrKx9pkme64Btx1ZsGlTW6EyjCkHtsD0f5zXMoq3fC8WQzH+tI5th RTfrs9MSvP7ZsZ9b7nr2LXJNGdOAFbqnxMGdXijjVbjLCvEQeXQlVrANqs2besHS7Ljj 9YlAbJw+g7Wkn1AMVYDYizoZna8H6lvKMYC3e0mJF7x0B7VNYSK0fiuBDh4KGxKLmKgl VN7AxammFLMpmDGCaLX7Lf09gl+rpt/aOpdcIYGmfJtLkqnaj3RFQoYrcTr2wRHSNE5m ZeuIeCFRElYmR8LaP9+1za/wH6b5DWxtHlwnlxfQIbaEMT59eH8BVGqonzPo2Xln8EI1 GCcw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=5Z5tnkoy8D3Ixvbx4t7AGSPKBpu2EDlVVpHs+ZSeWN8=; b=O2pCVHohkF9Hji/3VOMoEXuion060xGXUU2zkDbga85uM43GSsEM5aRO8hVuGE+faN XPHskhSdvnZM3vfPZgmylSwCE5tWw0LpI1nH3VfGIWO09oe2nmKvKS39S39A4oQlNO2h 8w+lyURgGoxq1+RTuqA0OVJ1vCttyy961O+Sy224DImDYmQ3NrY7+bUe2Ru01iDENjsU iJaBBffBLaJLNMkLo6HEgIDt36AkbdJ8GFQjKjlU2kdl0A0eusaGGdxmBL19FHTX4CDH U9qaqt9EQ1rTY4NJ4IOXDoPS+sLnc29ln7cSSWJBgEGIYKQ+jVgJ7iXVMjDQ308Etain 8XtQ== X-Gm-Message-State: AOAM531LdKl2wFDXuOgF93PXadDtQnsgvYDtDi6WsR77DxHxyxAG+oiL BgL+mqmfSllOFPERNy4yrv0= X-Google-Smtp-Source: ABdhPJy8s0WlvzUfXEyHoYKDcaEDgL8O/zcMQeyrm0HgG3lq3ZjZ+S7R6eJyljhXdI8vGaWMzxf5LQ== X-Received: by 2002:a17:90a:7184:: with SMTP id i4mr5365125pjk.75.1595596494141; Fri, 24 Jul 2020 06:14:54 -0700 (PDT) Received: from bobo.ibm.com (110-174-173-27.tpgi.com.au. [110.174.173.27]) by smtp.gmail.com with ESMTPSA id az16sm5871998pjb.7.2020.07.24.06.14.48 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 24 Jul 2020 06:14:53 -0700 (PDT) From: Nicholas Piggin To: linuxppc-dev@lists.ozlabs.org Cc: Nicholas Piggin , Will Deacon , Peter Zijlstra , Boqun Feng , Ingo Molnar , Waiman Long , Anton Blanchard , =?utf-8?q?Michal_Such=C3=A1nek?= , linux-kernel@vger.kernel.org, virtualization@lists.linux-foundation.org, kvm-ppc@vger.kernel.org, linux-arch@vger.kernel.org Subject: [PATCH v4 3/6] powerpc/64s: implement queued spinlocks and rwlocks Date: Fri, 24 Jul 2020 23:14:20 +1000 Message-Id: <20200724131423.1362108-4-npiggin@gmail.com> X-Mailer: git-send-email 2.23.0 In-Reply-To: <20200724131423.1362108-1-npiggin@gmail.com> References: <20200724131423.1362108-1-npiggin@gmail.com> MIME-Version: 1.0 Sender: kvm-ppc-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm-ppc@vger.kernel.org These have shown significantly improved performance and fairness when spinlock contention is moderate to high on very large systems. With this series including subsequent patches, on a 16 socket 1536 thread POWER9, a stress test such as same-file open/close from all CPUs gets big speedups, 11620op/s aggregate with simple spinlocks vs 384158op/s (33x faster), where the difference in throughput between the fastest and slowest thread goes from 7x to 1.4x. Thanks to the fast path being identical in terms of atomics and barriers (after a subsequent optimisation patch), single threaded performance is not changed (no measurable difference). On smaller systems, performance and fairness seems to be generally improved. Using dbench on tmpfs as a test (that starts to run into kernel spinlock contention), a 2-socket OpenPOWER POWER9 system was tested with bare metal and KVM guest configurations. Results can be found here: https://github.com/linuxppc/issues/issues/305#issuecomment-663487453 Observations are: - Queued spinlocks are equal when contention is insignificant, as expected and as measured with microbenchmarks. - When there is contention, on bare metal queued spinlocks have better throughput and max latency at all points. - When virtualised, queued spinlocks are slightly worse approaching peak throughput, but significantly better throughput and max latency at all points beyond peak, until queued spinlock maximum latency rises when clients are 2x vCPUs. The regressions haven't been analysed very well yet, there are a lot of things that can be tuned, particularly the paravirtualised locking, but the numbers already look like a good net win even on relatively small systems. Acked-by: Peter Zijlstra (Intel) Signed-off-by: Nicholas Piggin --- arch/powerpc/Kconfig | 15 ++++++++++++++ arch/powerpc/include/asm/Kbuild | 1 + arch/powerpc/include/asm/qspinlock.h | 25 +++++++++++++++++++++++ arch/powerpc/include/asm/spinlock.h | 5 +++++ arch/powerpc/include/asm/spinlock_types.h | 5 +++++ arch/powerpc/lib/Makefile | 3 +++ include/asm-generic/qspinlock.h | 2 ++ 7 files changed, 56 insertions(+) create mode 100644 arch/powerpc/include/asm/qspinlock.h diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index 9fa23eb320ff..641946052d67 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -145,6 +145,8 @@ config PPC select ARCH_SUPPORTS_ATOMIC_RMW select ARCH_USE_BUILTIN_BSWAP select ARCH_USE_CMPXCHG_LOCKREF if PPC64 + select ARCH_USE_QUEUED_RWLOCKS if PPC_QUEUED_SPINLOCKS + select ARCH_USE_QUEUED_SPINLOCKS if PPC_QUEUED_SPINLOCKS select ARCH_WANT_IPC_PARSE_VERSION select ARCH_WEAK_RELEASE_ACQUIRE select BINFMT_ELF @@ -490,6 +492,19 @@ config HOTPLUG_CPU Say N if you are unsure. +config PPC_QUEUED_SPINLOCKS + bool "Queued spinlocks" + depends on SMP + help + Say Y here to use to use queued spinlocks which give better + scalability and fairness on large SMP and NUMA systems without + harming single threaded performance. + + This option is currently experimental, the code is more complex + and less tested so it defaults to "N" for the moment. + + If unsure, say "N". + config ARCH_CPU_PROBE_RELEASE def_bool y depends on HOTPLUG_CPU diff --git a/arch/powerpc/include/asm/Kbuild b/arch/powerpc/include/asm/Kbuild index dadbcf3a0b1e..27c2268dfd6c 100644 --- a/arch/powerpc/include/asm/Kbuild +++ b/arch/powerpc/include/asm/Kbuild @@ -6,5 +6,6 @@ generated-y += syscall_table_spu.h generic-y += export.h generic-y += local64.h generic-y += mcs_spinlock.h +generic-y += qrwlock.h generic-y += vtime.h generic-y += early_ioremap.h diff --git a/arch/powerpc/include/asm/qspinlock.h b/arch/powerpc/include/asm/qspinlock.h new file mode 100644 index 000000000000..c49e33e24edd --- /dev/null +++ b/arch/powerpc/include/asm/qspinlock.h @@ -0,0 +1,25 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_POWERPC_QSPINLOCK_H +#define _ASM_POWERPC_QSPINLOCK_H + +#include + +#define _Q_PENDING_LOOPS (1 << 9) /* not tuned */ + +#define smp_mb__after_spinlock() smp_mb() + +static __always_inline int queued_spin_is_locked(struct qspinlock *lock) +{ + /* + * This barrier was added to simple spinlocks by commit 51d7d5205d338, + * but it should now be possible to remove it, asm arm64 has done with + * commit c6f5d02b6a0f. + */ + smp_mb(); + return atomic_read(&lock->val); +} +#define queued_spin_is_locked queued_spin_is_locked + +#include + +#endif /* _ASM_POWERPC_QSPINLOCK_H */ diff --git a/arch/powerpc/include/asm/spinlock.h b/arch/powerpc/include/asm/spinlock.h index 21357fe05fe0..434615f1d761 100644 --- a/arch/powerpc/include/asm/spinlock.h +++ b/arch/powerpc/include/asm/spinlock.h @@ -3,7 +3,12 @@ #define __ASM_SPINLOCK_H #ifdef __KERNEL__ +#ifdef CONFIG_PPC_QUEUED_SPINLOCKS +#include +#include +#else #include +#endif #endif /* __KERNEL__ */ #endif /* __ASM_SPINLOCK_H */ diff --git a/arch/powerpc/include/asm/spinlock_types.h b/arch/powerpc/include/asm/spinlock_types.h index 3906f52dae65..c5d742f18021 100644 --- a/arch/powerpc/include/asm/spinlock_types.h +++ b/arch/powerpc/include/asm/spinlock_types.h @@ -6,6 +6,11 @@ # error "please don't include this file directly" #endif +#ifdef CONFIG_PPC_QUEUED_SPINLOCKS +#include +#include +#else #include +#endif #endif diff --git a/arch/powerpc/lib/Makefile b/arch/powerpc/lib/Makefile index 5e994cda8e40..d66a645503eb 100644 --- a/arch/powerpc/lib/Makefile +++ b/arch/powerpc/lib/Makefile @@ -41,7 +41,10 @@ obj-$(CONFIG_PPC_BOOK3S_64) += copyuser_power7.o copypage_power7.o \ obj64-y += copypage_64.o copyuser_64.o mem_64.o hweight_64.o \ memcpy_64.o memcpy_mcsafe_64.o +ifndef CONFIG_PPC_QUEUED_SPINLOCKS obj64-$(CONFIG_SMP) += locks.o +endif + obj64-$(CONFIG_ALTIVEC) += vmx-helper.o obj64-$(CONFIG_KPROBES_SANITY_TEST) += test_emulate_step.o \ test_emulate_step_exec_instr.o diff --git a/include/asm-generic/qspinlock.h b/include/asm-generic/qspinlock.h index fde943d180e0..fb0a814d4395 100644 --- a/include/asm-generic/qspinlock.h +++ b/include/asm-generic/qspinlock.h @@ -12,6 +12,7 @@ #include +#ifndef queued_spin_is_locked /** * queued_spin_is_locked - is the spinlock locked? * @lock: Pointer to queued spinlock structure @@ -25,6 +26,7 @@ static __always_inline int queued_spin_is_locked(struct qspinlock *lock) */ return atomic_read(&lock->val); } +#endif /** * queued_spin_value_unlocked - is the spinlock structure unlocked? From patchwork Fri Jul 24 13:14:21 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nicholas Piggin X-Patchwork-Id: 1335700 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org (client-ip=23.128.96.18; helo=vger.kernel.org; envelope-from=kvm-ppc-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20161025 header.b=E5/UTDAU; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by ozlabs.org (Postfix) with ESMTP id 4BCqSF0bXyz9sV7 for ; Fri, 24 Jul 2020 23:15:09 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726701AbgGXNPE (ORCPT ); Fri, 24 Jul 2020 09:15:04 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35326 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726643AbgGXNPB (ORCPT ); Fri, 24 Jul 2020 09:15:01 -0400 Received: from mail-pl1-x641.google.com (mail-pl1-x641.google.com [IPv6:2607:f8b0:4864:20::641]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4B8D3C0619D3; Fri, 24 Jul 2020 06:15:01 -0700 (PDT) Received: by mail-pl1-x641.google.com with SMTP id x9so4414025plr.2; Fri, 24 Jul 2020 06:15:01 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=n61fbqlOypslFR94k6n+lGqidIfEPE+XConlkGlOLco=; b=E5/UTDAUnb3ESa5ShjlxYJ0FAndtToI8UfexqXwqO3SmkBFOUl5PVWdOyIOOYyNpVV L9SzENVxwDng6Ej6JhVibvdRFYsPmmomHFmFS2znFRwS3UCux5apIi2RDB2yTzbuaYw/ KJ8gwLGi14j0CIb30YzavIw/1HmaBrzCHKwBz51XB//K3MioDBbVPSidgDcr0lz35G9w tQzNzzcpd623VzmG6vXfIzMC2uFxD3Shs+c7dauvv1dDdc+FI3Py0O0BDchmlL5nL/6P fTMDYBOZ51qzGqHK2CZOYpJWQwwPVu+eEYrNA7ri7/HNfMpO1LZcYOcK+FX7kTiTZ9aq 6/ag== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=n61fbqlOypslFR94k6n+lGqidIfEPE+XConlkGlOLco=; b=qld5hr+RzV0QflARnjhHN0ZP0m1srhu3DQV+YDYn6246E1LWm3RXkA3w8RlS5j9Bqz fmd/erVOri89o0/CN2BtdWCppW71QgZVkVDLr18phXpJpZVkR68/Zpw6AD1Cs405/4mr Nq794dVOHEvDM7fUW53jV/OFKHJe2yRQKF1SgiCVG+wTFiF8wAO2j6lpVS6/266/iTvZ ZWO2r8ph9EGGPDV2z4JRTJyR7KmyIkN0arNYoSWG294SqxnwSt/9uXKr9F7tHGnHKVNa DRgJswFWj6Q4y32upEaLCp71S+JN3sfUwmpP4QHGiVwOAAx+BxCKyoHz0gav3DsdLohj hEYA== X-Gm-Message-State: AOAM532/DO9dpQqJynAeDEZkA0JRmioLh+xrg5yxtujjy9joFcd1q5Vl SOUJ5XyFhsirfK223T1qQBI= X-Google-Smtp-Source: ABdhPJyehzfvgGFCs+XkBbrheOMwkmB9hmhwGlUE6rYdtAGc9DxocvZ7G2mQyADyXgm6wrNTNraoOg== X-Received: by 2002:a17:902:a3cb:: with SMTP id q11mr8177484plb.239.1595596500823; Fri, 24 Jul 2020 06:15:00 -0700 (PDT) Received: from bobo.ibm.com (110-174-173-27.tpgi.com.au. [110.174.173.27]) by smtp.gmail.com with ESMTPSA id az16sm5871998pjb.7.2020.07.24.06.14.54 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 24 Jul 2020 06:14:59 -0700 (PDT) From: Nicholas Piggin To: linuxppc-dev@lists.ozlabs.org Cc: Nicholas Piggin , Will Deacon , Peter Zijlstra , Boqun Feng , Ingo Molnar , Waiman Long , Anton Blanchard , =?utf-8?q?Michal_Such=C3=A1nek?= , linux-kernel@vger.kernel.org, virtualization@lists.linux-foundation.org, kvm-ppc@vger.kernel.org, linux-arch@vger.kernel.org Subject: [PATCH v4 4/6] powerpc/pseries: implement paravirt qspinlocks for SPLPAR Date: Fri, 24 Jul 2020 23:14:21 +1000 Message-Id: <20200724131423.1362108-5-npiggin@gmail.com> X-Mailer: git-send-email 2.23.0 In-Reply-To: <20200724131423.1362108-1-npiggin@gmail.com> References: <20200724131423.1362108-1-npiggin@gmail.com> MIME-Version: 1.0 Sender: kvm-ppc-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm-ppc@vger.kernel.org This implements the generic paravirt qspinlocks using H_PROD and H_CONFER to kick and wait. This uses an un-directed yield to any CPU rather than the directed yield to a pre-empted lock holder that paravirtualised simple spinlocks use, that requires no kick hcall. This is something that could be investigated and improved in future. Performance results can be found in the commit which added queued spinlocks. Acked-by: Peter Zijlstra (Intel) Acked-by: Waiman Long Signed-off-by: Nicholas Piggin --- arch/powerpc/include/asm/paravirt.h | 28 ++++++++ arch/powerpc/include/asm/qspinlock.h | 66 +++++++++++++++++++ arch/powerpc/include/asm/qspinlock_paravirt.h | 7 ++ arch/powerpc/include/asm/spinlock.h | 4 ++ arch/powerpc/platforms/pseries/Kconfig | 9 ++- arch/powerpc/platforms/pseries/setup.c | 4 +- include/asm-generic/qspinlock.h | 2 + 7 files changed, 118 insertions(+), 2 deletions(-) create mode 100644 arch/powerpc/include/asm/qspinlock_paravirt.h diff --git a/arch/powerpc/include/asm/paravirt.h b/arch/powerpc/include/asm/paravirt.h index 339e8533464b..21e5f29ca251 100644 --- a/arch/powerpc/include/asm/paravirt.h +++ b/arch/powerpc/include/asm/paravirt.h @@ -28,6 +28,16 @@ static inline void yield_to_preempted(int cpu, u32 yield_count) { plpar_hcall_norets(H_CONFER, get_hard_smp_processor_id(cpu), yield_count); } + +static inline void prod_cpu(int cpu) +{ + plpar_hcall_norets(H_PROD, get_hard_smp_processor_id(cpu)); +} + +static inline void yield_to_any(void) +{ + plpar_hcall_norets(H_CONFER, -1, 0); +} #else static inline bool is_shared_processor(void) { @@ -44,6 +54,19 @@ static inline void yield_to_preempted(int cpu, u32 yield_count) { ___bad_yield_to_preempted(); /* This would be a bug */ } + +extern void ___bad_yield_to_any(void); +static inline void yield_to_any(void) +{ + ___bad_yield_to_any(); /* This would be a bug */ +} + +extern void ___bad_prod_cpu(void); +static inline void prod_cpu(int cpu) +{ + ___bad_prod_cpu(); /* This would be a bug */ +} + #endif #define vcpu_is_preempted vcpu_is_preempted @@ -56,4 +79,9 @@ static inline bool vcpu_is_preempted(int cpu) return false; } +static inline bool pv_is_native_spin_unlock(void) +{ + return !is_shared_processor(); +} + #endif /* _ASM_POWERPC_PARAVIRT_H */ diff --git a/arch/powerpc/include/asm/qspinlock.h b/arch/powerpc/include/asm/qspinlock.h index c49e33e24edd..f5066f00a08c 100644 --- a/arch/powerpc/include/asm/qspinlock.h +++ b/arch/powerpc/include/asm/qspinlock.h @@ -3,9 +3,47 @@ #define _ASM_POWERPC_QSPINLOCK_H #include +#include #define _Q_PENDING_LOOPS (1 << 9) /* not tuned */ +#ifdef CONFIG_PARAVIRT_SPINLOCKS +extern void native_queued_spin_lock_slowpath(struct qspinlock *lock, u32 val); +extern void __pv_queued_spin_lock_slowpath(struct qspinlock *lock, u32 val); +extern void __pv_queued_spin_unlock(struct qspinlock *lock); + +static __always_inline void queued_spin_lock_slowpath(struct qspinlock *lock, u32 val) +{ + if (!is_shared_processor()) + native_queued_spin_lock_slowpath(lock, val); + else + __pv_queued_spin_lock_slowpath(lock, val); +} + +#define queued_spin_unlock queued_spin_unlock +static inline void queued_spin_unlock(struct qspinlock *lock) +{ + if (!is_shared_processor()) + smp_store_release(&lock->locked, 0); + else + __pv_queued_spin_unlock(lock); +} + +#else +extern void queued_spin_lock_slowpath(struct qspinlock *lock, u32 val); +#endif + +static __always_inline void queued_spin_lock(struct qspinlock *lock) +{ + u32 val = 0; + + if (likely(atomic_try_cmpxchg_acquire(&lock->val, &val, _Q_LOCKED_VAL))) + return; + + queued_spin_lock_slowpath(lock, val); +} +#define queued_spin_lock queued_spin_lock + #define smp_mb__after_spinlock() smp_mb() static __always_inline int queued_spin_is_locked(struct qspinlock *lock) @@ -20,6 +58,34 @@ static __always_inline int queued_spin_is_locked(struct qspinlock *lock) } #define queued_spin_is_locked queued_spin_is_locked +#ifdef CONFIG_PARAVIRT_SPINLOCKS +#define SPIN_THRESHOLD (1<<15) /* not tuned */ + +static __always_inline void pv_wait(u8 *ptr, u8 val) +{ + if (*ptr != val) + return; + yield_to_any(); + /* + * We could pass in a CPU here if waiting in the queue and yield to + * the previous CPU in the queue. + */ +} + +static __always_inline void pv_kick(int cpu) +{ + prod_cpu(cpu); +} + +extern void __pv_init_lock_hash(void); + +static inline void pv_spinlocks_init(void) +{ + __pv_init_lock_hash(); +} + +#endif + #include #endif /* _ASM_POWERPC_QSPINLOCK_H */ diff --git a/arch/powerpc/include/asm/qspinlock_paravirt.h b/arch/powerpc/include/asm/qspinlock_paravirt.h new file mode 100644 index 000000000000..6b60e7736a47 --- /dev/null +++ b/arch/powerpc/include/asm/qspinlock_paravirt.h @@ -0,0 +1,7 @@ +/* SPDX-License-Identifier: GPL-2.0-or-later */ +#ifndef _ASM_POWERPC_QSPINLOCK_PARAVIRT_H +#define _ASM_POWERPC_QSPINLOCK_PARAVIRT_H + +EXPORT_SYMBOL(__pv_queued_spin_unlock); + +#endif /* _ASM_POWERPC_QSPINLOCK_PARAVIRT_H */ diff --git a/arch/powerpc/include/asm/spinlock.h b/arch/powerpc/include/asm/spinlock.h index 434615f1d761..6ec72282888d 100644 --- a/arch/powerpc/include/asm/spinlock.h +++ b/arch/powerpc/include/asm/spinlock.h @@ -10,5 +10,9 @@ #include #endif +#ifndef CONFIG_PARAVIRT_SPINLOCKS +static inline void pv_spinlocks_init(void) { } +#endif + #endif /* __KERNEL__ */ #endif /* __ASM_SPINLOCK_H */ diff --git a/arch/powerpc/platforms/pseries/Kconfig b/arch/powerpc/platforms/pseries/Kconfig index 24c18362e5ea..5e037df2a3a1 100644 --- a/arch/powerpc/platforms/pseries/Kconfig +++ b/arch/powerpc/platforms/pseries/Kconfig @@ -25,15 +25,22 @@ config PPC_PSERIES select SWIOTLB default y +config PARAVIRT_SPINLOCKS + bool + config PPC_SPLPAR - depends on PPC_PSERIES bool "Support for shared-processor logical partitions" + depends on PPC_PSERIES + select PARAVIRT_SPINLOCKS if PPC_QUEUED_SPINLOCKS + default y help Enabling this option will make the kernel run more efficiently on logically-partitioned pSeries systems which use shared processors, that is, which share physical processors between two or more partitions. + Say Y if you are unsure. + config DTL bool "Dispatch Trace Log" depends on PPC_SPLPAR && DEBUG_FS diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c index 2db8469e475f..fa847d1f9d54 100644 --- a/arch/powerpc/platforms/pseries/setup.c +++ b/arch/powerpc/platforms/pseries/setup.c @@ -771,8 +771,10 @@ static void __init pSeries_setup_arch(void) if (firmware_has_feature(FW_FEATURE_LPAR)) { vpa_init(boot_cpuid); - if (lppaca_shared_proc(get_lppaca())) + if (lppaca_shared_proc(get_lppaca())) { static_branch_enable(&shared_processor); + pv_spinlocks_init(); + } ppc_md.power_save = pseries_lpar_idle; ppc_md.enable_pmcs = pseries_lpar_enable_pmcs; diff --git a/include/asm-generic/qspinlock.h b/include/asm-generic/qspinlock.h index fb0a814d4395..38ca14e79a86 100644 --- a/include/asm-generic/qspinlock.h +++ b/include/asm-generic/qspinlock.h @@ -69,6 +69,7 @@ static __always_inline int queued_spin_trylock(struct qspinlock *lock) extern void queued_spin_lock_slowpath(struct qspinlock *lock, u32 val); +#ifndef queued_spin_lock /** * queued_spin_lock - acquire a queued spinlock * @lock: Pointer to queued spinlock structure @@ -82,6 +83,7 @@ static __always_inline void queued_spin_lock(struct qspinlock *lock) queued_spin_lock_slowpath(lock, val); } +#endif #ifndef queued_spin_unlock /** From patchwork Fri Jul 24 13:14:22 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nicholas Piggin X-Patchwork-Id: 1335701 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org (client-ip=23.128.96.18; helo=vger.kernel.org; envelope-from=kvm-ppc-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20161025 header.b=g81IDF80; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by ozlabs.org (Postfix) with ESMTP id 4BCqSG0Ktlz9sV2 for ; Fri, 24 Jul 2020 23:15:10 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726956AbgGXNPI (ORCPT ); Fri, 24 Jul 2020 09:15:08 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35344 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726643AbgGXNPH (ORCPT ); Fri, 24 Jul 2020 09:15:07 -0400 Received: from mail-pl1-x642.google.com (mail-pl1-x642.google.com [IPv6:2607:f8b0:4864:20::642]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 64A40C0619D3; Fri, 24 Jul 2020 06:15:07 -0700 (PDT) Received: by mail-pl1-x642.google.com with SMTP id x9so4414159plr.2; Fri, 24 Jul 2020 06:15:07 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=WiftbphxWXSIjSEt+dXbqpQmYX6qU7iPeGCFcxYtSQA=; b=g81IDF80xys7hUQAi+tyjSjNlCK2qfjl6DMDLGBqdXprJsL04R7sE+Ppf7O4NRtfC/ XOS//mUthsddkAAPYDi+LuEet2/oqqhYQgoLAQh+Ep3WLc5cGnB4l66lgpOXZcgQjH/2 THm3HuX15vW0lFTMuAkgNbj59CAbfANYE0Hc+Aj0wV1aAg/Ykj/AyBVMwiYQIazUkq9p lBF0c8mFla78lkA0QbHIOhKwQQ3bZagwfLEZMbPDeJMyNGvAdTbg1NyKQnDLjSfzOoZu bxN9uRPeTkkk/F+N4xTxoxctv3RbyMtZPX1+RB0q8g72P11W9020iBvl6qF2ZP1/2LM0 rkAg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=WiftbphxWXSIjSEt+dXbqpQmYX6qU7iPeGCFcxYtSQA=; b=S/7m/0xSumff4aEHytC4HjgylXhj+BxHPOiXmBh55LNscJBY0WC46VqF8RRd0uchQc dqnsrka9dS7yrjP1AU79fsNtMAsJNj9xYmVEjzrg24ECZlfJDW43Ah4hHJdwBRn4aO/k DKg5FMO0OKjlX6xDvtrIdyFkrEzXl8Gj9wK0XsEeD5hpes4U+UNIpPiFMkKJ5eUWto8O AysZV1mEjPUI8O6OWR4BRt3/l1QDsysc2pGYJ/QXVOTM/Oi4086ARoDOesBHdLAba/gU CTT3qQTOkq8aXfOtE4jslnZi7rHpEmA2DdC1CdxtOqtkgfCq/JXgAYF1k6kHJKLIrItm y0Gw== X-Gm-Message-State: AOAM533yQ+EUxkZRB8+3Ptk5uDltuZd0FWopQKmnsSFeNWxBZRlEJGcV wq112TgZ97P5E/zWL4D+3sk= X-Google-Smtp-Source: ABdhPJxDcJ3WwNc4e4B4k3cRPIw3LT71Pqkt7tkQKz15RWI12iCrEbS600oMqbiMl4hmSoD4ESVGAw== X-Received: by 2002:a17:90b:300a:: with SMTP id hg10mr5327031pjb.211.1595596507027; Fri, 24 Jul 2020 06:15:07 -0700 (PDT) Received: from bobo.ibm.com (110-174-173-27.tpgi.com.au. [110.174.173.27]) by smtp.gmail.com with ESMTPSA id az16sm5871998pjb.7.2020.07.24.06.15.01 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 24 Jul 2020 06:15:06 -0700 (PDT) From: Nicholas Piggin To: linuxppc-dev@lists.ozlabs.org Cc: Nicholas Piggin , Will Deacon , Peter Zijlstra , Boqun Feng , Ingo Molnar , Waiman Long , Anton Blanchard , =?utf-8?q?Michal_Such=C3=A1nek?= , linux-kernel@vger.kernel.org, virtualization@lists.linux-foundation.org, kvm-ppc@vger.kernel.org, linux-arch@vger.kernel.org Subject: [PATCH v4 5/6] powerpc/qspinlock: optimised atomic_try_cmpxchg_lock that adds the lock hint Date: Fri, 24 Jul 2020 23:14:22 +1000 Message-Id: <20200724131423.1362108-6-npiggin@gmail.com> X-Mailer: git-send-email 2.23.0 In-Reply-To: <20200724131423.1362108-1-npiggin@gmail.com> References: <20200724131423.1362108-1-npiggin@gmail.com> MIME-Version: 1.0 Sender: kvm-ppc-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm-ppc@vger.kernel.org This brings the behaviour of the uncontended fast path back to roughly equivalent to simple spinlocks -- a single atomic op with lock hint. Signed-off-by: Nicholas Piggin --- arch/powerpc/include/asm/atomic.h | 28 ++++++++++++++++++++++++++++ arch/powerpc/include/asm/qspinlock.h | 2 +- 2 files changed, 29 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/include/asm/atomic.h b/arch/powerpc/include/asm/atomic.h index 498785ffc25f..f6a3d145ffb7 100644 --- a/arch/powerpc/include/asm/atomic.h +++ b/arch/powerpc/include/asm/atomic.h @@ -193,6 +193,34 @@ static __inline__ int atomic_dec_return_relaxed(atomic_t *v) #define atomic_xchg(v, new) (xchg(&((v)->counter), new)) #define atomic_xchg_relaxed(v, new) xchg_relaxed(&((v)->counter), (new)) +/* + * Don't want to override the generic atomic_try_cmpxchg_acquire, because + * we add a lock hint to the lwarx, which may not be wanted for the + * _acquire case (and is not used by the other _acquire variants so it + * would be a surprise). + */ +static __always_inline bool +atomic_try_cmpxchg_lock(atomic_t *v, int *old, int new) +{ + int r, o = *old; + + __asm__ __volatile__ ( +"1:\t" PPC_LWARX(%0,0,%2,1) " # atomic_try_cmpxchg_acquire \n" +" cmpw 0,%0,%3 \n" +" bne- 2f \n" +" stwcx. %4,0,%2 \n" +" bne- 1b \n" +"\t" PPC_ACQUIRE_BARRIER " \n" +"2: \n" + : "=&r" (r), "+m" (v->counter) + : "r" (&v->counter), "r" (o), "r" (new) + : "cr0", "memory"); + + if (unlikely(r != o)) + *old = r; + return likely(r == o); +} + /** * atomic_fetch_add_unless - add unless the number is a given value * @v: pointer of type atomic_t diff --git a/arch/powerpc/include/asm/qspinlock.h b/arch/powerpc/include/asm/qspinlock.h index f5066f00a08c..b752d34517b3 100644 --- a/arch/powerpc/include/asm/qspinlock.h +++ b/arch/powerpc/include/asm/qspinlock.h @@ -37,7 +37,7 @@ static __always_inline void queued_spin_lock(struct qspinlock *lock) { u32 val = 0; - if (likely(atomic_try_cmpxchg_acquire(&lock->val, &val, _Q_LOCKED_VAL))) + if (likely(atomic_try_cmpxchg_lock(&lock->val, &val, _Q_LOCKED_VAL))) return; queued_spin_lock_slowpath(lock, val); From patchwork Fri Jul 24 13:14:23 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nicholas Piggin X-Patchwork-Id: 1335702 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org (client-ip=23.128.96.18; helo=vger.kernel.org; envelope-from=kvm-ppc-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20161025 header.b=myWsbPv3; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by ozlabs.org (Postfix) with ESMTP id 4BCqSN62X2z9sV2 for ; Fri, 24 Jul 2020 23:15:16 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726667AbgGXNPP (ORCPT ); Fri, 24 Jul 2020 09:15:15 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35360 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726643AbgGXNPN (ORCPT ); Fri, 24 Jul 2020 09:15:13 -0400 Received: from mail-pf1-x444.google.com (mail-pf1-x444.google.com [IPv6:2607:f8b0:4864:20::444]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 68507C0619D3; Fri, 24 Jul 2020 06:15:13 -0700 (PDT) Received: by mail-pf1-x444.google.com with SMTP id a23so5065530pfk.13; Fri, 24 Jul 2020 06:15:13 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=DHpBnVvFWixE9Yg0d0OK10tzA3221PSUyt4Cx2IH5+E=; b=myWsbPv3/UUglRzV+TMSS2ZZiFNk38LMgSBpDNrQsTIDc5WH7gUklbGO1enhGGySAz E7YzZBxilSJCpvzkCjCU/0CNuVBNZLByMjo8KxMsr1RuI7lIPVyR0vvJxZNF+vEP0Mzk Sn2h0sLrOt+aOi0ATgnWf89dVNz1bvDrQqo3cj6OoyWX8xCVLnNfJaAqKtDhCjUwNgA0 5Y/+beDc9ULzpOBEBVXgkQuIV6fUalwLPPpQY5i02r9TLUCpOAqwCxnak+DcBatC0Jyv PKBIC5be55ZRIMw4PLKztz9Klrbzm6QA2rbct7J2sE+V6G3FLRbUvMgFLeL3276GLo0m WAuQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=DHpBnVvFWixE9Yg0d0OK10tzA3221PSUyt4Cx2IH5+E=; b=jVXkEbwRDRfdesW1bwxsE3Jqg53CwuOd2/pdiVzsRQSwyIUXte2PjBxM32n8U2K4Fr mCTCk8QVcYK8L1iATGulUcxs5HLlkJwEp8RQRgXA7zWLsTqEJMDPDA+OLaLjMwGobT+O TrD5YbrCa3azWqeTinyw7xLpeR0MAKfwonaSe+23nPsUYCyqYSkLOpcYFK4dz+ZmQEUh sIMPzQ2BEFjYI6tyGr85sjOg+MKoJDxF69OII1Z7oNELE2XqxylzZArKDrHVcxW4zjD8 Pdq4+1SgsB6QV1/ji944H2lh7y3r+pA1mPJCb+ZQQIbp3+wl211FVlQxK88+xWLtDPtF aMQA== X-Gm-Message-State: AOAM533kDo+c+0R+U3yMyjEYN/e6Glgu9ObXL4GBf2P72Dsj40JCTkM8 c1cv1/bYrs4c+6+/cud0Uzk= X-Google-Smtp-Source: ABdhPJyisx6hCRDrbmPNpGytS1qqjnTFZwITCd6wO1vkmRzyL31lFJNP7UVIeTdb2pnJAPV4vPWvTg== X-Received: by 2002:aa7:942e:: with SMTP id y14mr8909386pfo.58.1595596512969; Fri, 24 Jul 2020 06:15:12 -0700 (PDT) Received: from bobo.ibm.com (110-174-173-27.tpgi.com.au. [110.174.173.27]) by smtp.gmail.com with ESMTPSA id az16sm5871998pjb.7.2020.07.24.06.15.07 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 24 Jul 2020 06:15:12 -0700 (PDT) From: Nicholas Piggin To: linuxppc-dev@lists.ozlabs.org Cc: Nicholas Piggin , Will Deacon , Peter Zijlstra , Boqun Feng , Ingo Molnar , Waiman Long , Anton Blanchard , =?utf-8?q?Michal_Such=C3=A1nek?= , linux-kernel@vger.kernel.org, virtualization@lists.linux-foundation.org, kvm-ppc@vger.kernel.org, linux-arch@vger.kernel.org Subject: [PATCH v4 6/6] powerpc: implement smp_cond_load_relaxed Date: Fri, 24 Jul 2020 23:14:23 +1000 Message-Id: <20200724131423.1362108-7-npiggin@gmail.com> X-Mailer: git-send-email 2.23.0 In-Reply-To: <20200724131423.1362108-1-npiggin@gmail.com> References: <20200724131423.1362108-1-npiggin@gmail.com> MIME-Version: 1.0 Sender: kvm-ppc-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: kvm-ppc@vger.kernel.org This implements smp_cond_load_relaed with the slowpath busy loop using the preferred SMT priority pattern. Signed-off-by: Nicholas Piggin --- arch/powerpc/include/asm/barrier.h | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/arch/powerpc/include/asm/barrier.h b/arch/powerpc/include/asm/barrier.h index 123adcefd40f..9b4671d38674 100644 --- a/arch/powerpc/include/asm/barrier.h +++ b/arch/powerpc/include/asm/barrier.h @@ -76,6 +76,20 @@ do { \ ___p1; \ }) +#define smp_cond_load_relaxed(ptr, cond_expr) ({ \ + typeof(ptr) __PTR = (ptr); \ + __unqual_scalar_typeof(*ptr) VAL; \ + VAL = READ_ONCE(*__PTR); \ + if (unlikely(!(cond_expr))) { \ + spin_begin(); \ + do { \ + VAL = READ_ONCE(*__PTR); \ + } while (!(cond_expr)); \ + spin_end(); \ + } \ + (typeof(*ptr))VAL; \ +}) + #ifdef CONFIG_PPC_BOOK3S_64 #define NOSPEC_BARRIER_SLOT nop #elif defined(CONFIG_PPC_FSL_BOOK3E)