From patchwork Sat Nov 26 09:59:16 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Nicholas Piggin
X-Patchwork-Id: 1709197
Return-Path:
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@legolas.ozlabs.org
Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ozlabs.org (client-ip=2404:9400:2:0:216:3eff:fee1:b9f1; helo=lists.ozlabs.org; envelope-from=linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org; receiver=)
Authentication-Results: legolas.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=UHVGmCRI; dkim-atps=neutral
Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2404:9400:2:0:216:3eff:fee1:b9f1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-384)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4NK6gZ2LxCz23nT for ; Sat, 26 Nov 2022 21:00:58 +1100 (AEDT)
Received: from boromir.ozlabs.org (localhost [IPv6:::1]) by lists.ozlabs.org (Postfix) with ESMTP id 4NK6gZ17dSz3f51 for ; Sat, 26 Nov 2022 21:00:58 +1100 (AEDT)
Authentication-Results: lists.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=UHVGmCRI; dkim-atps=neutral
X-Original-To: linuxppc-dev@lists.ozlabs.org
Delivered-To: linuxppc-dev@lists.ozlabs.org
Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gmail.com (client-ip=2607:f8b0:4864:20::62a; helo=mail-pl1-x62a.google.com; envelope-from=npiggin@gmail.com; receiver=)
Authentication-Results: lists.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=UHVGmCRI; dkim-atps=neutral
Received: from mail-pl1-x62a.google.com (mail-pl1-x62a.google.com [IPv6:2607:f8b0:4864:20::62a]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4NK6fG3Nbmz3bhv for ; Sat, 26 Nov 2022 20:59:50 +1100 (AEDT)
Received: by mail-pl1-x62a.google.com with SMTP id d3so960169plr.10 for ; Sat, 26 Nov 2022 01:59:50 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=FGP73fdJuAg75ShdZDLVnvxLeN8f2eEzloqDQZRSINA=; b=UHVGmCRIgVmZyoKEYg/6Q7AnxZu1vRfqK9+dBOPHaFrjWQUMBeXt516jE+2AOWfezp E6P8Wh6gxYOm9HaUUWyy2+UKJzN0R2Yr81snsguZw6/4SBE7synOC4WslbuzTmnyLVz7 YY4tVY/OoU9XSJ4/598iURSVM9TqmFREpGumniovLdNoZtdxbbUASkPwaSI1oaYWO601 UQzR2nRPuqgXZCDyuYFkfHRG5GoRr90jBfNyui+L6iXPaaUMhSpTQAgMPgqI845IBTqJ 0m54trTLJNgs+jJ1z0PRpKolz8wdy6tmXzO37nJyhjhdgtNLjc+AKmvT01UsaxMoC81D H/tg==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=FGP73fdJuAg75ShdZDLVnvxLeN8f2eEzloqDQZRSINA=; b=IUfYpfPP8nwRTn8v1mdnHPi5pdB1WzmdjSUrtH47aPPsIab7ngjfjrbaUTisX1q5y/ YZprT89KvZSXZB1e8fdGE7UbYtBUXtxzhHXGYfkiazQ5qLA2gSWhAIAdYOdiFCT6num+ pcx7a/5jf9ID0FaohYEPhpwrvkbV3yQSqsTaEHnumORsU2QR2vOze97Qi8ViqSzgugjk VquwSbuZL+jjOsyZL+ZIcpn4xFE2/3M4s5Bh3tLq/3AtSJaeUcivWCETr2sMdqnjxYkv i2aQr7EN7Cv5pKOHYVBt2Mtv84FNj8V9ujkTkgZHnQbpGaYV+mh37GGOyAOOL47k9sDy HScw==
X-Gm-Message-State: ANoB5pliQm2vkh47mQ//DjPfcgW8oWzW2BC4IF4owDj9AYOYnZhn6/Ix d2XRWYu/JLRZtYqEDionbG3Cp/qqWBJd5A==
X-Google-Smtp-Source: AA0mqf7/o+RPdLSZRJYN8NnVGoG+VTb16fLQVLLt8oZudGhGKNhiCRljRw9JqD0i90OxOZSkuZ4Ckw==
X-Received: by 2002:a17:90a:dc06:b0:218:9196:1cd1 with SMTP id i6-20020a17090adc0600b0021891961cd1mr37969572pjv.230.1669456787319; Sat, 26 Nov 2022 01:59:47 -0800 (PST)
Received: from bobo.ozlabs.ibm.com (110-174-181-90.tpgi.com.au. [110.174.181.90]) by smtp.gmail.com with ESMTPSA id j3-20020a17090a94c300b00213202d77d9sm4239243pjw.43.2022.11.26.01.59.44 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 26 Nov 2022 01:59:46 -0800 (PST)
From: Nicholas Piggin
To: linuxppc-dev@lists.ozlabs.org
Subject: [PATCH v3 01/17] powerpc/qspinlock: add mcs queueing for contended waiters
Date: Sat, 26 Nov 2022 19:59:16 +1000
Message-Id: <20221126095932.1234527-2-npiggin@gmail.com>
X-Mailer: git-send-email 2.37.2
In-Reply-To: <20221126095932.1234527-1-npiggin@gmail.com>
References: <20221126095932.1234527-1-npiggin@gmail.com>
MIME-Version: 1.0
X-BeenThere: linuxppc-dev@lists.ozlabs.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Linux on PowerPC Developers Mail List
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Cc: Jordan Niethe, Laurent Dufour, Nicholas Piggin
Errors-To: linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org
Sender: "Linuxppc-dev"

This forms the basis of the qspinlock slow path.

Like generic qspinlocks and unlike the vanilla MCS algorithm, the lock
owner does not participate in the queue, only waiters. The first waiter
spins on the lock word, then when the lock is released it takes
ownership and unqueues the next waiter. This is how qspinlocks can be
implemented with the spinlock API -- lock owners don't need a node, only
waiters do.
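
As a rough illustration of the lock word layout this patch introduces (bit 0 is the locked bit, bits 17-31 hold the tail CPU number plus one), here is a user-space sketch. The constants and the encode/decode arithmetic are taken from the patch; the `Q_` names without the leading underscore, `has_waiters()` and `check_lock_word_layout()` are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Constants copied from the qspinlock_types.h hunk in this patch. */
#define Q_LOCKED_VAL		(1U << 0)
#define Q_TAIL_CPU_OFFSET	17
#define Q_TAIL_CPU_BITS		15
#define Q_TAIL_CPU_MASK \
	(((1U << Q_TAIL_CPU_BITS) - 1) << Q_TAIL_CPU_OFFSET)

/* The CPU number is stored +1 so that an all-zero tail means "no waiters". */
static inline uint32_t encode_tail_cpu(int cpu)
{
	return (uint32_t)(cpu + 1) << Q_TAIL_CPU_OFFSET;
}

static inline int decode_tail_cpu(uint32_t val)
{
	return (int)(val >> Q_TAIL_CPU_OFFSET) - 1;
}

/* Hypothetical helper: nonzero when at least one waiter is queued. */
static inline int has_waiters(uint32_t val)
{
	return (val & Q_TAIL_CPU_MASK) != 0;
}

/* Hypothetical self-check, for illustration only. */
static void check_lock_word_layout(void)
{
	uint32_t val = Q_LOCKED_VAL | encode_tail_cpu(5);

	assert(Q_TAIL_CPU_MASK == 0xfffe0000U);
	assert(decode_tail_cpu(val) == 5);	/* lock bit does not disturb the tail */
	assert(has_waiters(val));
	assert(!has_waiters(Q_LOCKED_VAL));	/* held, but nobody queued */
}
```

Because the owner only ever needs the locked bit, the tail field can describe the queue of waiters on its own, which is what lets queued_spin_is_contended() test the tail mask directly.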

Signed-off-by: Nicholas Piggin
---
 arch/powerpc/include/asm/qspinlock.h       |  10 +-
 arch/powerpc/include/asm/qspinlock_types.h |  23 +++
 arch/powerpc/lib/qspinlock.c               | 187 ++++++++++++++++++++-
 3 files changed, 214 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/include/asm/qspinlock.h b/arch/powerpc/include/asm/qspinlock.h
index b1443aab2145..300c7d2ebe2e 100644
--- a/arch/powerpc/include/asm/qspinlock.h
+++ b/arch/powerpc/include/asm/qspinlock.h
@@ -18,12 +18,12 @@ static __always_inline int queued_spin_value_unlocked(struct qspinlock lock)

 static __always_inline int queued_spin_is_contended(struct qspinlock *lock)
 {
-	return 0;
+	return !!(atomic_read(&lock->val) & _Q_TAIL_CPU_MASK);
 }

 static __always_inline int queued_spin_trylock(struct qspinlock *lock)
 {
-	return atomic_cmpxchg_acquire(&lock->val, 0, 1) == 0;
+	return atomic_cmpxchg_acquire(&lock->val, 0, _Q_LOCKED_VAL) == 0;
 }

 void queued_spin_lock_slowpath(struct qspinlock *lock);
@@ -36,7 +36,11 @@ static __always_inline void queued_spin_lock(struct qspinlock *lock)

 static inline void queued_spin_unlock(struct qspinlock *lock)
 {
-	atomic_set_release(&lock->val, 0);
+	for (;;) {
+		int val = atomic_read(&lock->val);
+		if (atomic_cmpxchg_release(&lock->val, val, val & ~_Q_LOCKED_VAL) == val)
+			return;
+	}
 }

 #define arch_spin_is_locked(l) queued_spin_is_locked(l)
diff --git a/arch/powerpc/include/asm/qspinlock_types.h b/arch/powerpc/include/asm/qspinlock_types.h
index 59606bc0c774..20a36dfb14e2 100644
--- a/arch/powerpc/include/asm/qspinlock_types.h
+++ b/arch/powerpc/include/asm/qspinlock_types.h
@@ -10,4 +10,27 @@ typedef struct qspinlock {

 #define __ARCH_SPIN_LOCK_UNLOCKED { .val = ATOMIC_INIT(0) }

+/*
+ * Bitfields in the lock word:
+ *
+ * 0: locked bit
+ * 1-16: unused bits
+ * 17-31: tail cpu (+1)
+ */
+#define _Q_SET_MASK(type) (((1U << _Q_ ## type ## _BITS) - 1)\
+ << _Q_ ## type ## _OFFSET)
+/* 0x00000001 */
+#define _Q_LOCKED_OFFSET 0
+#define _Q_LOCKED_BITS 1
+#define _Q_LOCKED_VAL (1U << _Q_LOCKED_OFFSET)
+
+/* 0xfffe0000 */
+#define _Q_TAIL_CPU_OFFSET 17
+#define _Q_TAIL_CPU_BITS 15
+#define _Q_TAIL_CPU_MASK _Q_SET_MASK(TAIL_CPU)
+
+#if CONFIG_NR_CPUS >= (1U << _Q_TAIL_CPU_BITS)
+#error "qspinlock does not support such large CONFIG_NR_CPUS"
+#endif
+
 #endif /* _ASM_POWERPC_QSPINLOCK_TYPES_H */
diff --git a/arch/powerpc/lib/qspinlock.c b/arch/powerpc/lib/qspinlock.c
index 1c669b5b4607..86504628501e 100644
--- a/arch/powerpc/lib/qspinlock.c
+++ b/arch/powerpc/lib/qspinlock.c
@@ -1,12 +1,193 @@
 // SPDX-License-Identifier: GPL-2.0-or-later
+#include
+#include
+#include
 #include
-#include
+#include
+#include
 #include

-void queued_spin_lock_slowpath(struct qspinlock *lock)
+#define MAX_NODES 4
+
+struct qnode {
+	struct qnode *next;
+	struct qspinlock *lock;
+	u8 locked; /* 1 if lock acquired */
+};
+
+struct qnodes {
+	int count;
+	struct qnode nodes[MAX_NODES];
+};
+
+static DEFINE_PER_CPU_ALIGNED(struct qnodes, qnodes);
+
+static inline int encode_tail_cpu(int cpu)
+{
+	return (cpu + 1) << _Q_TAIL_CPU_OFFSET;
+}
+
+static inline int decode_tail_cpu(int val)
+{
+	return (val >> _Q_TAIL_CPU_OFFSET) - 1;
+}
+
+/*
+ * Try to acquire the lock if it was not already locked. If the tail matches
+ * mytail then clear it, otherwise leave it unchanged. Return previous value.
+ *
+ * This is used by the head of the queue to acquire the lock and clean up
+ * its tail if it was the last one queued.
+ */
+static __always_inline int set_locked_clean_tail(struct qspinlock *lock, int tail)
+{
+	int val = atomic_read(&lock->val);
+
+	BUG_ON(val & _Q_LOCKED_VAL);
+
+	/* If we're the last queued, must clean up the tail. */
+	if ((val & _Q_TAIL_CPU_MASK) == tail) {
+		if (atomic_cmpxchg_acquire(&lock->val, val, _Q_LOCKED_VAL) == val)
+			return val;
+		/* Another waiter must have enqueued */
+		val = atomic_read(&lock->val);
+		BUG_ON(val & _Q_LOCKED_VAL);
+	}
+
+	/* We must be the owner, just set the lock bit and acquire */
+	atomic_or(_Q_LOCKED_VAL, &lock->val);
+	__atomic_acquire_fence();
+
+	return val;
+}
+
+/*
+ * Publish our tail, replacing previous tail. Return previous value.
+ *
+ * This provides a release barrier for publishing node, this pairs with the
+ * acquire barrier in get_tail_qnode() when the next CPU finds this tail
+ * value.
+ */
+static __always_inline int publish_tail_cpu(struct qspinlock *lock, int tail)
+{
+	for (;;) {
+		int val = atomic_read(&lock->val);
+		int newval = (val & ~_Q_TAIL_CPU_MASK) | tail;
+		int old;
+
+		old = atomic_cmpxchg_release(&lock->val, val, newval);
+		if (old == val)
+			return old;
+	}
+}
+
+static struct qnode *get_tail_qnode(struct qspinlock *lock, int val)
+{
+	int cpu = decode_tail_cpu(val);
+	struct qnodes *qnodesp = per_cpu_ptr(&qnodes, cpu);
+	int idx;
+
+	/*
+	 * After publishing the new tail and finding a previous tail in the
+	 * previous val (which is the control dependency), this barrier
+	 * orders the release barrier in publish_tail_cpu performed by the
+	 * last CPU, with subsequently looking at its qnode structures
+	 * after the barrier.
+	 */
+	smp_acquire__after_ctrl_dep();
+
+	for (idx = 0; idx < MAX_NODES; idx++) {
+		struct qnode *qnode = &qnodesp->nodes[idx];
+		if (qnode->lock == lock)
+			return qnode;
+	}
+
+	BUG();
+}
+
+static inline void queued_spin_lock_mcs_queue(struct qspinlock *lock)
 {
-	while (!queued_spin_trylock(lock))
+	struct qnodes *qnodesp;
+	struct qnode *next, *node;
+	int val, old, tail;
+	int idx;
+
+	BUILD_BUG_ON(CONFIG_NR_CPUS >= (1U << _Q_TAIL_CPU_BITS));
+
+	qnodesp = this_cpu_ptr(&qnodes);
+	if (unlikely(qnodesp->count >= MAX_NODES)) {
+		while (!queued_spin_trylock(lock))
+			cpu_relax();
+		return;
+	}
+
+	idx = qnodesp->count++;
+	/*
+	 * Ensure that we increment the head node->count before initialising
+	 * the actual node. If the compiler is kind enough to reorder these
+	 * stores, then an IRQ could overwrite our assignments.
+	 */
+	barrier();
+	node = &qnodesp->nodes[idx];
+	node->next = NULL;
+	node->lock = lock;
+	node->locked = 0;
+
+	tail = encode_tail_cpu(smp_processor_id());
+
+	old = publish_tail_cpu(lock, tail);
+
+	/*
+	 * If there was a previous node; link it and wait until reaching the
+	 * head of the waitqueue.
+	 */
+	if (old & _Q_TAIL_CPU_MASK) {
+		struct qnode *prev = get_tail_qnode(lock, old);
+
+		/* Link @node into the waitqueue. */
+		WRITE_ONCE(prev->next, node);
+
+		/* Wait for mcs node lock to be released */
+		while (!node->locked)
+			cpu_relax();
+
+		smp_rmb(); /* acquire barrier for the mcs lock */
+	}
+
+	/* We're at the head of the waitqueue, wait for the lock. */
+	for (;;) {
+		val = atomic_read(&lock->val);
+		if (!(val & _Q_LOCKED_VAL))
+			break;
+
+		cpu_relax();
+	}
+
+	/* If we're the last queued, must clean up the tail. */
+	old = set_locked_clean_tail(lock, tail);
+	if ((old & _Q_TAIL_CPU_MASK) == tail)
+		goto release; /* We were the tail, no next waiter */
+
+	/* There is a next, must wait for node->next != NULL (MCS protocol) */
+	while (!(next = READ_ONCE(node->next)))
 		cpu_relax();
+
+	/*
+	 * Unlock the next mcs waiter node. Release barrier is not required
+	 * here because the acquirer is only accessing the lock word, and
+	 * the acquire barrier we took the lock with orders that update vs
+	 * this store to locked. The corresponding barrier is the smp_rmb()
+	 * acquire barrier for mcs lock, above.
+	 */
+	WRITE_ONCE(next->locked, 1);
+
+release:
+	qnodesp->count--; /* release the node */
+}
+
+void queued_spin_lock_slowpath(struct qspinlock *lock)
+{
+	queued_spin_lock_mcs_queue(lock);
 }
 EXPORT_SYMBOL(queued_spin_lock_slowpath);

From patchwork Sat Nov 26 09:59:17 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Nicholas Piggin
X-Patchwork-Id: 1709198
From: Nicholas Piggin
To: linuxppc-dev@lists.ozlabs.org
Subject: [PATCH v3 02/17] powerpc/qspinlock: use a half-word store to unlock to avoid larx/stcx.
Date: Sat, 26 Nov 2022 19:59:17 +1000
Message-Id: <20221126095932.1234527-3-npiggin@gmail.com>
X-Mailer: git-send-email 2.37.2
In-Reply-To: <20221126095932.1234527-1-npiggin@gmail.com>
References: <20221126095932.1234527-1-npiggin@gmail.com>
Cc: Jordan Niethe, Laurent Dufour, Nicholas Piggin

The first 16 bits of the lock are only modified by the owner, and other
modifications always use atomic operations on the entire 32 bits, so
unlocks can use plain stores on the 16 bits. This is the same kind of
optimisation done by core qspinlock code.

Signed-off-by: Nicholas Piggin
---
 arch/powerpc/include/asm/qspinlock.h       |  6 +-----
 arch/powerpc/include/asm/qspinlock_types.h | 19 +++++++++++++++++--
 2 files changed, 18 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/include/asm/qspinlock.h b/arch/powerpc/include/asm/qspinlock.h
index 300c7d2ebe2e..7bc254c55705 100644
--- a/arch/powerpc/include/asm/qspinlock.h
+++ b/arch/powerpc/include/asm/qspinlock.h
@@ -36,11 +36,7 @@ static __always_inline void queued_spin_lock(struct qspinlock *lock)

 static inline void queued_spin_unlock(struct qspinlock *lock)
 {
-	for (;;) {
-		int val = atomic_read(&lock->val);
-		if (atomic_cmpxchg_release(&lock->val, val, val & ~_Q_LOCKED_VAL) == val)
-			return;
-	}
+	smp_store_release(&lock->locked, 0);
 }

 #define arch_spin_is_locked(l) queued_spin_is_locked(l)
diff --git a/arch/powerpc/include/asm/qspinlock_types.h b/arch/powerpc/include/asm/qspinlock_types.h
index 20a36dfb14e2..fe87181c59e5 100644
--- a/arch/powerpc/include/asm/qspinlock_types.h
+++ b/arch/powerpc/include/asm/qspinlock_types.h
@@ -3,12 +3,27 @@
 #define _ASM_POWERPC_QSPINLOCK_TYPES_H

 #include
+#include

 typedef struct qspinlock {
-	atomic_t val;
+	union {
+		atomic_t val;
+
+#ifdef __LITTLE_ENDIAN
+		struct {
+			u16 locked;
+			u8 reserved[2];
+		};
+#else
+		struct {
+			u8 reserved[2];
+			u16 locked;
+		};
+#endif
+	};
 } arch_spinlock_t;

-#define __ARCH_SPIN_LOCK_UNLOCKED { .val = ATOMIC_INIT(0) }
+#define __ARCH_SPIN_LOCK_UNLOCKED { { .val = ATOMIC_INIT(0) } }

 /*
  * Bitfields in the lock word:

From patchwork Sat Nov 26 09:59:18 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Nicholas Piggin
X-Patchwork-Id: 1709199
From: Nicholas Piggin
To: linuxppc-dev@lists.ozlabs.org
Subject: [PATCH v3 03/17] powerpc/qspinlock: convert atomic operations to assembly
Date: Sat, 26 Nov 2022 19:59:18 +1000
Message-Id: <20221126095932.1234527-4-npiggin@gmail.com>
X-Mailer: git-send-email 2.37.2
In-Reply-To: <20221126095932.1234527-1-npiggin@gmail.com>
References: <20221126095932.1234527-1-npiggin@gmail.com>
Cc: Jordan Niethe, Laurent Dufour, Nicholas Piggin

This uses more optimal ll/sc style access patterns (rather than
cmpxchg), and also sets the EH=1 lock hint on those operations
which acquire ownership of the lock.
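
The single-pass "load, compute, conditionally store, retry" shape that the lwarx/stwcx. loops in this patch use cannot be written in portable C, but it can be approximated with C11 compare-exchange loops. The sketch below is only a user-space model under that assumption: the `model_` function names and the `Q_` constants are hypothetical, the EH hint has no C equivalent, and the memory orders stand in for PPC_ACQUIRE_BARRIER and PPC_RELEASE_BARRIER rather than reproducing them exactly.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define Q_LOCKED_VAL	0x00000001U
#define Q_TAIL_CPU_MASK	0xfffe0000U

/*
 * Model of set_locked_clean_tail(): set the locked bit and, if the tail
 * still names us, clear it in the same update. Returns the previous value.
 */
static uint32_t model_set_locked_clean_tail(_Atomic uint32_t *lock, uint32_t tail)
{
	uint32_t prev = atomic_load_explicit(lock, memory_order_relaxed);
	uint32_t newval;

	do {
		/* Mirrors the and/or sequence: keep the tail, add the lock bit. */
		newval = (prev & Q_TAIL_CPU_MASK) | Q_LOCKED_VAL;
		/* Mirrors the andc: drop the tail only when it matched ours. */
		if ((prev & Q_TAIL_CPU_MASK) == tail)
			newval &= ~Q_TAIL_CPU_MASK;
	} while (!atomic_compare_exchange_weak_explicit(lock, &prev, newval,
							memory_order_acquire,
							memory_order_relaxed));
	return prev;
}

/*
 * Model of publish_tail_cpu(): replace the tail field with ours, leaving
 * the other bits alone. Returns the previous value.
 */
static uint32_t model_publish_tail_cpu(_Atomic uint32_t *lock, uint32_t tail)
{
	uint32_t prev = atomic_load_explicit(lock, memory_order_relaxed);
	uint32_t newval;

	do {
		newval = (prev & ~Q_TAIL_CPU_MASK) | tail;
	} while (!atomic_compare_exchange_weak_explicit(lock, &prev, newval,
							memory_order_release,
							memory_order_relaxed));
	return prev;
}
```

On a failed compare-exchange `prev` is refreshed with the current lock word, so each retry recomputes the new value from fresh data, much as the assembly branches back to the lwarx.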
Signed-off-by: Nicholas Piggin --- arch/powerpc/include/asm/qspinlock.h | 24 +++++-- arch/powerpc/include/asm/qspinlock_types.h | 4 +- arch/powerpc/lib/qspinlock.c | 82 +++++++++++++--------- 3 files changed, 68 insertions(+), 42 deletions(-) diff --git a/arch/powerpc/include/asm/qspinlock.h b/arch/powerpc/include/asm/qspinlock.h index 7bc254c55705..7d300e6883a8 100644 --- a/arch/powerpc/include/asm/qspinlock.h +++ b/arch/powerpc/include/asm/qspinlock.h @@ -2,28 +2,42 @@ #ifndef _ASM_POWERPC_QSPINLOCK_H #define _ASM_POWERPC_QSPINLOCK_H -#include #include #include static __always_inline int queued_spin_is_locked(struct qspinlock *lock) { - return atomic_read(&lock->val); + return READ_ONCE(lock->val); } static __always_inline int queued_spin_value_unlocked(struct qspinlock lock) { - return !atomic_read(&lock.val); + return !lock.val; } static __always_inline int queued_spin_is_contended(struct qspinlock *lock) { - return !!(atomic_read(&lock->val) & _Q_TAIL_CPU_MASK); + return !!(READ_ONCE(lock->val) & _Q_TAIL_CPU_MASK); } static __always_inline int queued_spin_trylock(struct qspinlock *lock) { - return atomic_cmpxchg_acquire(&lock->val, 0, _Q_LOCKED_VAL) == 0; + u32 prev; + + asm volatile( +"1: lwarx %0,0,%1,%3 # queued_spin_trylock \n" +" cmpwi 0,%0,0 \n" +" bne- 2f \n" +" stwcx. 
%2,0,%1 \n" +" bne- 1b \n" +"\t" PPC_ACQUIRE_BARRIER " \n" +"2: \n" + : "=&r" (prev) + : "r" (&lock->val), "r" (_Q_LOCKED_VAL), + "i" (IS_ENABLED(CONFIG_PPC64)) + : "cr0", "memory"); + + return likely(prev == 0); } void queued_spin_lock_slowpath(struct qspinlock *lock); diff --git a/arch/powerpc/include/asm/qspinlock_types.h b/arch/powerpc/include/asm/qspinlock_types.h index fe87181c59e5..b9a5a52fa670 100644 --- a/arch/powerpc/include/asm/qspinlock_types.h +++ b/arch/powerpc/include/asm/qspinlock_types.h @@ -7,7 +7,7 @@ typedef struct qspinlock { union { - atomic_t val; + u32 val; #ifdef __LITTLE_ENDIAN struct { @@ -23,7 +23,7 @@ typedef struct qspinlock { }; } arch_spinlock_t; -#define __ARCH_SPIN_LOCK_UNLOCKED { { .val = ATOMIC_INIT(0) } } +#define __ARCH_SPIN_LOCK_UNLOCKED { { .val = 0 } } /* * Bitfields in the lock word: diff --git a/arch/powerpc/lib/qspinlock.c b/arch/powerpc/lib/qspinlock.c index 86504628501e..645d9affacfd 100644 --- a/arch/powerpc/lib/qspinlock.c +++ b/arch/powerpc/lib/qspinlock.c @@ -1,5 +1,4 @@ // SPDX-License-Identifier: GPL-2.0-or-later -#include #include #include #include @@ -22,12 +21,12 @@ struct qnodes { static DEFINE_PER_CPU_ALIGNED(struct qnodes, qnodes); -static inline int encode_tail_cpu(int cpu) +static inline u32 encode_tail_cpu(int cpu) { return (cpu + 1) << _Q_TAIL_CPU_OFFSET; } -static inline int decode_tail_cpu(int val) +static inline int decode_tail_cpu(u32 val) { return (val >> _Q_TAIL_CPU_OFFSET) - 1; } @@ -39,26 +38,34 @@ static inline int decode_tail_cpu(int val) * This is used by the head of the queue to acquire the lock and clean up * its tail if it was the last one queued. */ -static __always_inline int set_locked_clean_tail(struct qspinlock *lock, int tail) +static __always_inline u32 set_locked_clean_tail(struct qspinlock *lock, u32 tail) { - int val = atomic_read(&lock->val); - - BUG_ON(val & _Q_LOCKED_VAL); - - /* If we're the last queued, must clean up the tail. 
*/ - if ((val & _Q_TAIL_CPU_MASK) == tail) { - if (atomic_cmpxchg_acquire(&lock->val, val, _Q_LOCKED_VAL) == val) - return val; - /* Another waiter must have enqueued */ - val = atomic_read(&lock->val); - BUG_ON(val & _Q_LOCKED_VAL); - } - - /* We must be the owner, just set the lock bit and acquire */ - atomic_or(_Q_LOCKED_VAL, &lock->val); - __atomic_acquire_fence(); - - return val; + u32 newval = _Q_LOCKED_VAL; + u32 prev, tmp; + + asm volatile( +"1: lwarx %0,0,%2,%6 # set_locked_clean_tail \n" + /* Test whether the lock tail == tail */ +" and %1,%0,%5 \n" +" cmpw 0,%1,%3 \n" + /* Merge the new locked value */ +" or %1,%1,%4 \n" +" bne 2f \n" + /* If the lock tail matched, then clear it, otherwise leave it. */ +" andc %1,%1,%5 \n" +"2: stwcx. %1,0,%2 \n" +" bne- 1b \n" +"\t" PPC_ACQUIRE_BARRIER " \n" +"3: \n" + : "=&r" (prev), "=&r" (tmp) + : "r" (&lock->val), "r"(tail), "r" (newval), + "r" (_Q_TAIL_CPU_MASK), + "i" (IS_ENABLED(CONFIG_PPC64)) + : "cr0", "memory"); + + BUG_ON(prev & _Q_LOCKED_VAL); + + return prev; } /* @@ -68,20 +75,25 @@ static __always_inline int set_locked_clean_tail(struct qspinlock *lock, int tai * acquire barrier in get_tail_qnode() when the next CPU finds this tail * value. */ -static __always_inline int publish_tail_cpu(struct qspinlock *lock, int tail) +static __always_inline u32 publish_tail_cpu(struct qspinlock *lock, u32 tail) { - for (;;) { - int val = atomic_read(&lock->val); - int newval = (val & ~_Q_TAIL_CPU_MASK) | tail; - int old; - - old = atomic_cmpxchg_release(&lock->val, val, newval); - if (old == val) - return old; - } + u32 prev, tmp; + + asm volatile( +"\t" PPC_RELEASE_BARRIER " \n" +"1: lwarx %0,0,%2 # publish_tail_cpu \n" +" andc %1,%0,%4 \n" +" or %1,%1,%3 \n" +" stwcx. 
%1,0,%2 \n" +" bne- 1b \n" + : "=&r" (prev), "=&r"(tmp) + : "r" (&lock->val), "r" (tail), "r"(_Q_TAIL_CPU_MASK) + : "cr0", "memory"); + + return prev; } -static struct qnode *get_tail_qnode(struct qspinlock *lock, int val) +static struct qnode *get_tail_qnode(struct qspinlock *lock, u32 val) { int cpu = decode_tail_cpu(val); struct qnodes *qnodesp = per_cpu_ptr(&qnodes, cpu); @@ -109,7 +121,7 @@ static inline void queued_spin_lock_mcs_queue(struct qspinlock *lock) { struct qnodes *qnodesp; struct qnode *next, *node; - int val, old, tail; + u32 val, old, tail; int idx; BUILD_BUG_ON(CONFIG_NR_CPUS >= (1U << _Q_TAIL_CPU_BITS)); @@ -156,7 +168,7 @@ static inline void queued_spin_lock_mcs_queue(struct qspinlock *lock) /* We're at the head of the waitqueue, wait for the lock. */ for (;;) { - val = atomic_read(&lock->val); + val = READ_ONCE(lock->val); if (!(val & _Q_LOCKED_VAL)) break; From patchwork Sat Nov 26 09:59:19 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nicholas Piggin X-Patchwork-Id: 1709200 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ozlabs.org (client-ip=2404:9400:2:0:216:3eff:fee1:b9f1; helo=lists.ozlabs.org; envelope-from=linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=PQsZ18h/; dkim-atps=neutral Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2404:9400:2:0:216:3eff:fee1:b9f1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-384)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 
From: Nicholas Piggin
To: linuxppc-dev@lists.ozlabs.org
Subject: [PATCH v3 04/17] powerpc/qspinlock: allow new waiters to steal the lock before queueing
Date: Sat, 26 Nov 2022 19:59:19 +1000
Message-Id: <20221126095932.1234527-5-npiggin@gmail.com>
X-Mailer: git-send-email 2.37.2
In-Reply-To: <20221126095932.1234527-1-npiggin@gmail.com>
References: <20221126095932.1234527-1-npiggin@gmail.com>
List-Id: Linux on PowerPC Developers Mail List
Cc: Jordan Niethe, Laurent Dufour, Nicholas Piggin

Allow new waiters to "steal" the lock before queueing, that is, to
acquire it while other CPUs are queued. This particularly helps paravirt
performance when physical CPUs are oversubscribed, because it keeps the
lock from becoming a strict FIFO in which vCPU preemption causes queue
train wrecks.

The new __queued_spin_trylock_steal() function is placed in qspinlock.h
now, to avoid moving it later, because a later change will use it there.
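
For readers who do not read PowerPC assembly, the stealing rule described
above can be sketched in portable C11 atomics. This is only an
illustration under stated assumptions, not the kernel code: the sketch_*
names and SKETCH_* constants are hypothetical, a compare-and-swap loop
stands in for the lwarx/stwcx. sequence, and cpu_relax() is omitted. The
mask value assumes the tail CPU field occupies bits 17-31, as in this
series.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for _Q_LOCKED_VAL and _Q_TAIL_CPU_MASK. */
#define SKETCH_LOCKED_VAL    0x00000001u
#define SKETCH_TAIL_CPU_MASK 0xfffe0000u

/*
 * Take the lock only if nothing outside the tail field is set, and
 * preserve the tail so queued waiters are not lost.
 */
static bool sketch_trylock_steal(_Atomic uint32_t *word)
{
	uint32_t prev = atomic_load_explicit(word, memory_order_relaxed);

	for (;;) {
		uint32_t next;

		if (prev & ~SKETCH_TAIL_CPU_MASK)
			return false;	/* locked, or another non-tail bit is set */

		next = (prev & SKETCH_TAIL_CPU_MASK) | SKETCH_LOCKED_VAL;
		/* On failure, prev is refreshed with the current value. */
		if (atomic_compare_exchange_weak_explicit(word, &prev, next,
							  memory_order_acquire,
							  memory_order_relaxed))
			return true;
	}
}

/* Bounded number of steal attempts before the caller falls back to queueing. */
static bool sketch_try_to_steal(_Atomic uint32_t *word, int steal_spins)
{
	assert(steal_spins >= 0);

	for (int iters = 0; iters < steal_spins; iters++) {
		uint32_t val = atomic_load_explicit(word, memory_order_relaxed);

		if (!(val & SKETCH_LOCKED_VAL) && sketch_trylock_steal(word))
			return true;
	}

	return false;
}
```

A caller would try sketch_try_to_steal() first and join the MCS queue
only when it returns false, mirroring queued_spin_lock_slowpath() in the
diff below.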

Signed-off-by: Nicholas Piggin
---
 arch/powerpc/include/asm/qspinlock.h |  23 ++++++
 arch/powerpc/lib/qspinlock.c         | 110 ++++++++++++++++++++++++---
 2 files changed, 124 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/include/asm/qspinlock.h b/arch/powerpc/include/asm/qspinlock.h
index 7d300e6883a8..2a6f12a2c385 100644
--- a/arch/powerpc/include/asm/qspinlock.h
+++ b/arch/powerpc/include/asm/qspinlock.h
@@ -40,6 +40,29 @@ static __always_inline int queued_spin_trylock(struct qspinlock *lock)
 	return likely(prev == 0);
 }

+static __always_inline int __queued_spin_trylock_steal(struct qspinlock *lock)
+{
+	u32 prev, tmp;
+
+	/* Trylock may get ahead of queued nodes if it finds unlocked */
+	asm volatile(
+"1: lwarx %0,0,%2,%5 # __queued_spin_trylock_steal \n"
+" andc. %1,%0,%4 \n"
+" bne- 2f \n"
+" and %1,%0,%4 \n"
+" or %1,%1,%3 \n"
+" stwcx. %1,0,%2 \n"
+" bne- 1b \n"
+"\t" PPC_ACQUIRE_BARRIER " \n"
+"2: \n"
+	: "=&r" (prev), "=&r" (tmp)
+	: "r" (&lock->val), "r" (_Q_LOCKED_VAL), "r" (_Q_TAIL_CPU_MASK),
+	  "i" (IS_ENABLED(CONFIG_PPC64))
+	: "cr0", "memory");
+
+	return likely(!(prev & ~_Q_TAIL_CPU_MASK));
+}
+
 void queued_spin_lock_slowpath(struct qspinlock *lock);

 static __always_inline void queued_spin_lock(struct qspinlock *lock)
diff --git a/arch/powerpc/lib/qspinlock.c b/arch/powerpc/lib/qspinlock.c
index 645d9affacfd..6d67bc38b122 100644
--- a/arch/powerpc/lib/qspinlock.c
+++ b/arch/powerpc/lib/qspinlock.c
@@ -19,8 +19,17 @@ struct qnodes {
 	struct qnode nodes[MAX_NODES];
 };

+/* Tuning parameters */
+static int steal_spins __read_mostly = (1<<5);
+static bool maybe_stealers __read_mostly = true;
+
 static DEFINE_PER_CPU_ALIGNED(struct qnodes, qnodes);

+static __always_inline int get_steal_spins(void)
+{
+	return steal_spins;
+}
+
 static inline u32 encode_tail_cpu(int cpu)
 {
 	return (cpu + 1) << _Q_TAIL_CPU_OFFSET;
@@ -38,33 +47,35 @@ static inline int decode_tail_cpu(u32 val)
  * This is used by the head of the queue to acquire the lock and clean up
  * its tail if it was the last one queued.
  */
-static __always_inline u32 set_locked_clean_tail(struct qspinlock *lock, u32 tail)
+static __always_inline u32 trylock_clean_tail(struct qspinlock *lock, u32 tail)
 {
 	u32 newval = _Q_LOCKED_VAL;
 	u32 prev, tmp;

 	asm volatile(
-"1: lwarx %0,0,%2,%6 # set_locked_clean_tail \n"
-	/* Test whether the lock tail == tail */
-" and %1,%0,%5 \n"
+"1: lwarx %0,0,%2,%7 # trylock_clean_tail \n"
+	/* This test is necessary if there could be stealers */
+" andi. %1,%0,%5 \n"
+" bne 3f \n"
+	/* Test whether the lock tail == mytail */
+" and %1,%0,%6 \n"
 " cmpw 0,%1,%3 \n"
 	/* Merge the new locked value */
 " or %1,%1,%4 \n"
 " bne 2f \n"
 	/* If the lock tail matched, then clear it, otherwise leave it. */
-" andc %1,%1,%5 \n"
+" andc %1,%1,%6 \n"
 "2: stwcx. %1,0,%2 \n"
 " bne- 1b \n"
 "\t" PPC_ACQUIRE_BARRIER " \n"
 "3: \n"
 	: "=&r" (prev), "=&r" (tmp)
 	: "r" (&lock->val), "r"(tail), "r" (newval),
+	  "i" (_Q_LOCKED_VAL),
 	  "r" (_Q_TAIL_CPU_MASK),
 	  "i" (IS_ENABLED(CONFIG_PPC64))
 	: "cr0", "memory");

-	BUG_ON(prev & _Q_LOCKED_VAL);
-
 	return prev;
 }

@@ -117,6 +128,30 @@ static struct qnode *get_tail_qnode(struct qspinlock *lock, u32 val)
 	BUG();
 }

+static inline bool try_to_steal_lock(struct qspinlock *lock)
+{
+	int iters = 0;
+
+	if (!steal_spins)
+		return false;
+
+	/* Attempt to steal the lock */
+	do {
+		u32 val = READ_ONCE(lock->val);
+
+		if (unlikely(!(val & _Q_LOCKED_VAL))) {
+			if (__queued_spin_trylock_steal(lock))
+				return true;
+		} else {
+			cpu_relax();
+		}
+
+		iters++;
+	} while (iters < get_steal_spins());
+
+	return false;
+}
+
 static inline void queued_spin_lock_mcs_queue(struct qspinlock *lock)
 {
 	struct qnodes *qnodesp;
@@ -166,6 +201,7 @@ static inline void queued_spin_lock_mcs_queue(struct qspinlock *lock)
 		smp_rmb(); /* acquire barrier for the mcs lock */
 	}

+again:
 	/* We're at the head of the waitqueue, wait for the lock. */
 	for (;;) {
 		val = READ_ONCE(lock->val);
@@ -176,9 +212,14 @@ static inline void queued_spin_lock_mcs_queue(struct qspinlock *lock)
 	}

 	/* If we're the last queued, must clean up the tail. */
-	old = set_locked_clean_tail(lock, tail);
+	old = trylock_clean_tail(lock, tail);
+	if (unlikely(old & _Q_LOCKED_VAL)) {
+		BUG_ON(!maybe_stealers);
+		goto again; /* Can only be true if maybe_stealers. */
+	}
+
 	if ((old & _Q_TAIL_CPU_MASK) == tail)
-		goto release; /* Another waiter must have enqueued */
+		goto release; /* We were the tail, no next. */

 	/* There is a next, must wait for node->next != NULL (MCS protocol) */
 	while (!(next = READ_ONCE(node->next)))
@@ -199,6 +240,9 @@ static inline void queued_spin_lock_mcs_queue(struct qspinlock *lock)

 void queued_spin_lock_slowpath(struct qspinlock *lock)
 {
+	if (try_to_steal_lock(lock))
+		return;
+
 	queued_spin_lock_mcs_queue(lock);
 }
 EXPORT_SYMBOL(queued_spin_lock_slowpath);
@@ -208,3 +252,51 @@ void pv_spinlocks_init(void)
 {
 }
 #endif
+
+#include
+static int steal_spins_set(void *data, u64 val)
+{
+	static DEFINE_MUTEX(lock);
+
+	/*
+	 * The lock slow path has a !maybe_stealers case that can assume
+	 * the head of queue will not see concurrent waiters. That waiter
+	 * is unsafe in the presence of stealers, so must keep them away
+	 * from one another.
+	 */
+
+	mutex_lock(&lock);
+	if (val && !steal_spins) {
+		maybe_stealers = true;
+		/* wait for queue head waiter to go away */
+		synchronize_rcu();
+		steal_spins = val;
+	} else if (!val && steal_spins) {
+		steal_spins = val;
+		/* wait for all possible stealers to go away */
+		synchronize_rcu();
+		maybe_stealers = false;
+	} else {
+		steal_spins = val;
+	}
+	mutex_unlock(&lock);
+
+	return 0;
+}
+
+static int steal_spins_get(void *data, u64 *val)
+{
+	*val = steal_spins;
+
+	return 0;
+}
+
+DEFINE_SIMPLE_ATTRIBUTE(fops_steal_spins, steal_spins_get, steal_spins_set, "%llu\n");
+
+static __init int spinlock_debugfs_init(void)
+{
+	debugfs_create_file("qspl_steal_spins", 0600, arch_debugfs_dir, NULL, &fops_steal_spins);
+
+	return 0;
+}
+device_initcall(spinlock_debugfs_init);

From patchwork Sat Nov 26 09:59:20 2022
X-Patchwork-Submitter: Nicholas Piggin
X-Patchwork-Id: 1709201
From: Nicholas Piggin
To: linuxppc-dev@lists.ozlabs.org
Subject: [PATCH v3 05/17] powerpc/qspinlock: theft prevention to control latency
Date: Sat, 26 Nov 2022 19:59:20 +1000
Message-Id: <20221126095932.1234527-6-npiggin@gmail.com>
X-Mailer: git-send-email 2.37.2
In-Reply-To: <20221126095932.1234527-1-npiggin@gmail.com>
References: <20221126095932.1234527-1-npiggin@gmail.com>
List-Id: Linux on PowerPC Developers Mail List
Cc: Jordan Niethe, Laurent Dufour, Nicholas Piggin

Give the queue head the ability to stop stealers. After a number of
spins without successfully acquiring the lock, the queue head sets the
must-queue bit (_Q_MUST_Q_VAL), which halts stealing and ensures that it
is the next owner.
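
The hand-off between the queue head and would-be stealers can be
sketched in portable C11 as follows. This is a simplified model under
stated assumptions rather than the kernel implementation: the sketch_*
names are hypothetical, atomic_fetch_or() stands in for the
lwarx/stwcx. loop in set_mustq(), and only the bit positions (locked bit
0, must-queue bit 16) are taken from this patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for _Q_LOCKED_VAL and _Q_MUST_Q_VAL. */
#define SKETCH_LOCKED_VAL 0x00000001u
#define SKETCH_MUST_Q_VAL 0x00010000u

/* Per-waiter state the queue head keeps while it spins. */
struct sketch_head_state {
	int iters;
	bool mustq;
};

/*
 * One iteration of the queue head's wait loop while the lock is held:
 * count the spin and, once the threshold is reached, set the must-queue
 * bit exactly once.
 */
static void sketch_head_spin_once(_Atomic uint32_t *word,
				  struct sketch_head_state *st,
				  int head_spins)
{
	assert(head_spins > 0);

	st->iters++;
	if (!st->mustq && st->iters >= head_spins) {
		st->mustq = true;
		atomic_fetch_or_explicit(word, SKETCH_MUST_Q_VAL,
					 memory_order_relaxed);
	}
}

/* A stealer gives up and queues as soon as it sees the must-queue bit. */
static bool sketch_stealer_may_try(_Atomic uint32_t *word)
{
	uint32_t val = atomic_load_explicit(word, memory_order_relaxed);

	return !(val & SKETCH_MUST_Q_VAL);
}
```

The sketch does not clear the bit. In the series, the head's
trylock_clean_tail() builds the new lock word from only the tail field
and the new locked value, which appears to be what drops the must-queue
bit once the head finally acquires the lock.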

Signed-off-by: Nicholas Piggin
---
 arch/powerpc/include/asm/qspinlock_types.h |  8 +++-
 arch/powerpc/lib/qspinlock.c               | 53 ++++++++++++++++++++++
 2 files changed, 60 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/qspinlock_types.h b/arch/powerpc/include/asm/qspinlock_types.h
index b9a5a52fa670..1911a8a16237 100644
--- a/arch/powerpc/include/asm/qspinlock_types.h
+++ b/arch/powerpc/include/asm/qspinlock_types.h
@@ -29,7 +29,8 @@ typedef struct qspinlock {
  * Bitfields in the lock word:
  *
  *     0: locked bit
- *  1-16: unused bits
+ *  1-15: unused bits
+ *    16: must queue bit
  * 17-31: tail cpu (+1)
  */
 #define	_Q_SET_MASK(type)	(((1U << _Q_ ## type ## _BITS) - 1)\
@@ -39,6 +40,11 @@ typedef struct qspinlock {
 #define _Q_LOCKED_BITS		1
 #define _Q_LOCKED_VAL		(1U << _Q_LOCKED_OFFSET)

+/* 0x00010000 */
+#define _Q_MUST_Q_OFFSET	16
+#define _Q_MUST_Q_BITS		1
+#define _Q_MUST_Q_VAL		(1U << _Q_MUST_Q_OFFSET)
+
 /* 0xfffe0000 */
 #define _Q_TAIL_CPU_OFFSET	17
 #define _Q_TAIL_CPU_BITS	15
diff --git a/arch/powerpc/lib/qspinlock.c b/arch/powerpc/lib/qspinlock.c
index 6d67bc38b122..979b17ac7bd1 100644
--- a/arch/powerpc/lib/qspinlock.c
+++ b/arch/powerpc/lib/qspinlock.c
@@ -22,6 +22,7 @@ struct qnodes {
 /* Tuning parameters */
 static int steal_spins __read_mostly = (1<<5);
 static bool maybe_stealers __read_mostly = true;
+static int head_spins __read_mostly = (1<<8);

 static DEFINE_PER_CPU_ALIGNED(struct qnodes, qnodes);

@@ -30,6 +31,11 @@ static __always_inline int get_steal_spins(void)
 	return steal_spins;
 }

+static __always_inline int get_head_spins(void)
+{
+	return head_spins;
+}
+
 static inline u32 encode_tail_cpu(int cpu)
 {
 	return (cpu + 1) << _Q_TAIL_CPU_OFFSET;
@@ -104,6 +110,22 @@ static __always_inline u32 publish_tail_cpu(struct qspinlock *lock, u32 tail)
 	return prev;
 }

+static __always_inline u32 set_mustq(struct qspinlock *lock)
+{
+	u32 prev;
+
+	asm volatile(
+"1: lwarx %0,0,%1 # set_mustq \n"
+" or %0,%0,%2 \n"
+" stwcx. %0,0,%1 \n"
+" bne- 1b \n"
+	: "=&r" (prev)
+	: "r" (&lock->val), "r" (_Q_MUST_Q_VAL)
+	: "cr0", "memory");
+
+	return prev;
+}
+
 static struct qnode *get_tail_qnode(struct qspinlock *lock, u32 val)
 {
 	int cpu = decode_tail_cpu(val);
@@ -139,6 +161,9 @@ static inline bool try_to_steal_lock(struct qspinlock *lock)
 	do {
 		u32 val = READ_ONCE(lock->val);

+		if (val & _Q_MUST_Q_VAL)
+			break;
+
 		if (unlikely(!(val & _Q_LOCKED_VAL))) {
 			if (__queued_spin_trylock_steal(lock))
 				return true;
@@ -157,7 +182,9 @@ static inline void queued_spin_lock_mcs_queue(struct qspinlock *lock)
 	struct qnodes *qnodesp;
 	struct qnode *next, *node;
 	u32 val, old, tail;
+	bool mustq = false;
 	int idx;
+	int iters = 0;

 	BUILD_BUG_ON(CONFIG_NR_CPUS >= (1U << _Q_TAIL_CPU_BITS));

@@ -209,6 +236,15 @@ static inline void queued_spin_lock_mcs_queue(struct qspinlock *lock)
 			break;

 		cpu_relax();
+		if (!maybe_stealers)
+			continue;
+		iters++;
+
+		if (!mustq && iters >= get_head_spins()) {
+			mustq = true;
+			set_mustq(lock);
+			val |= _Q_MUST_Q_VAL;
+		}
 	}

 	/* If we're the last queued, must clean up the tail. */
@@ -293,9 +329,26 @@ static int steal_spins_get(void *data, u64 *val)

 DEFINE_SIMPLE_ATTRIBUTE(fops_steal_spins, steal_spins_get, steal_spins_set, "%llu\n");

+static int head_spins_set(void *data, u64 val)
+{
+	head_spins = val;
+
+	return 0;
+}
+
+static int head_spins_get(void *data, u64 *val)
+{
+	*val = head_spins;
+
+	return 0;
+}
+
+DEFINE_SIMPLE_ATTRIBUTE(fops_head_spins, head_spins_get, head_spins_set, "%llu\n");
+
 static __init int spinlock_debugfs_init(void)
 {
 	debugfs_create_file("qspl_steal_spins", 0600, arch_debugfs_dir, NULL, &fops_steal_spins);
+	debugfs_create_file("qspl_head_spins", 0600, arch_debugfs_dir, NULL, &fops_head_spins);

 	return 0;
 }

From patchwork Sat Nov 26 09:59:21 2022
X-Patchwork-Submitter: Nicholas Piggin
X-Patchwork-Id: 1709202
From: Nicholas Piggin
To: linuxppc-dev@lists.ozlabs.org
Subject: [PATCH v3 06/17] powerpc/qspinlock: store owner CPU in lock word
Date: Sat, 26 Nov 2022 19:59:21 +1000
Message-Id: <20221126095932.1234527-7-npiggin@gmail.com>
X-Mailer: git-send-email 2.37.2
In-Reply-To: <20221126095932.1234527-1-npiggin@gmail.com>
References: <20221126095932.1234527-1-npiggin@gmail.com>
List-Id: Linux on PowerPC Developers Mail List
Cc: Jordan Niethe, Laurent Dufour, Nicholas Piggin

Store the owner CPU number in the lock word so it may be yielded to, as
powerpc's paravirtualised simple spinlocks do.

Signed-off-by: Nicholas Piggin
---
 arch/powerpc/include/asm/qspinlock.h       | 12 ++++++++++--
 arch/powerpc/include/asm/qspinlock_types.h | 12 +++++++++++-
 arch/powerpc/lib/qspinlock.c               |  2 +-
 3 files changed, 22 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/include/asm/qspinlock.h b/arch/powerpc/include/asm/qspinlock.h
index 2a6f12a2c385..be53702e56fc 100644
--- a/arch/powerpc/include/asm/qspinlock.h
+++ b/arch/powerpc/include/asm/qspinlock.h
@@ -20,8 +20,15 @@ static __always_inline int queued_spin_is_contended(struct qspinlock *lock)
 	return !!(READ_ONCE(lock->val) & _Q_TAIL_CPU_MASK);
 }

+static __always_inline u32 queued_spin_encode_locked_val(void)
+{
+	/* XXX: make this use lock value in paca like simple spinlocks? */
+	return _Q_LOCKED_VAL | (smp_processor_id() << _Q_OWNER_CPU_OFFSET);
+}
+
 static __always_inline int queued_spin_trylock(struct qspinlock *lock)
 {
+	u32 new = queued_spin_encode_locked_val();
 	u32 prev;

 	asm volatile(
@@ -33,7 +40,7 @@ static __always_inline int queued_spin_trylock(struct qspinlock *lock)
 "\t" PPC_ACQUIRE_BARRIER " \n"
 "2: \n"
 	: "=&r" (prev)
-	: "r" (&lock->val), "r" (_Q_LOCKED_VAL),
+	: "r" (&lock->val), "r" (new),
 	  "i" (IS_ENABLED(CONFIG_PPC64))
 	: "cr0", "memory");

@@ -42,6 +49,7 @@ static __always_inline int queued_spin_trylock(struct qspinlock *lock)

 static __always_inline int __queued_spin_trylock_steal(struct qspinlock *lock)
 {
+	u32 new = queued_spin_encode_locked_val();
 	u32 prev, tmp;

 	/* Trylock may get ahead of queued nodes if it finds unlocked */
@@ -56,7 +64,7 @@ static __always_inline int __queued_spin_trylock_steal(struct qspinlock *lock)
 "\t" PPC_ACQUIRE_BARRIER " \n"
 "2: \n"
 	: "=&r" (prev), "=&r" (tmp)
-	: "r" (&lock->val), "r" (_Q_LOCKED_VAL), "r" (_Q_TAIL_CPU_MASK),
+	: "r" (&lock->val), "r" (new), "r" (_Q_TAIL_CPU_MASK),
 	  "i" (IS_ENABLED(CONFIG_PPC64))
 	: "cr0", "memory");

diff --git a/arch/powerpc/include/asm/qspinlock_types.h b/arch/powerpc/include/asm/qspinlock_types.h
index 1911a8a16237..adfeed4aa495 100644
--- a/arch/powerpc/include/asm/qspinlock_types.h
+++ b/arch/powerpc/include/asm/qspinlock_types.h
@@ -29,7 +29,8 @@ typedef struct qspinlock {
  * Bitfields in the lock word:
  *
  *     0: locked bit
- *  1-15: unused bits
+ *  1-14: lock holder cpu
+ *    15: unused bit
  *    16: must queue bit
  * 17-31: tail cpu (+1)
  */
@@ -40,6 +41,15 @@ typedef struct qspinlock {
 #define _Q_LOCKED_BITS		1
 #define _Q_LOCKED_VAL		(1U << _Q_LOCKED_OFFSET)

+/* 0x00007ffe */
+#define _Q_OWNER_CPU_OFFSET	1
+#define _Q_OWNER_CPU_BITS	14
+#define _Q_OWNER_CPU_MASK	_Q_SET_MASK(OWNER_CPU)
+
+#if CONFIG_NR_CPUS > (1U << _Q_OWNER_CPU_BITS)
+#error "qspinlock does not support such large CONFIG_NR_CPUS"
+#endif
+
 /* 0x00010000 */
 #define _Q_MUST_Q_OFFSET	16
 #define _Q_MUST_Q_BITS		1
diff --git a/arch/powerpc/lib/qspinlock.c b/arch/powerpc/lib/qspinlock.c
index 979b17ac7bd1..a5b2c0377cf9 100644
--- a/arch/powerpc/lib/qspinlock.c
+++ b/arch/powerpc/lib/qspinlock.c
@@ -55,7 +55,7 @@ static inline int decode_tail_cpu(u32 val)
  */
 static __always_inline u32 trylock_clean_tail(struct qspinlock *lock, u32 tail)
 {
-	u32 newval = _Q_LOCKED_VAL;
+	u32 newval = queued_spin_encode_locked_val();
 	u32 prev, tmp;

 	asm volatile(

From patchwork Sat Nov 26 09:59:22 2022
X-Patchwork-Submitter: Nicholas Piggin
X-Patchwork-Id: 1709203
[110.174.181.90]) by smtp.gmail.com with ESMTPSA id j3-20020a17090a94c300b00213202d77d9sm4239243pjw.43.2022.11.26.02.00.02 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 26 Nov 2022 02:00:04 -0800 (PST) From: Nicholas Piggin To: linuxppc-dev@lists.ozlabs.org Subject: [PATCH v3 07/17] powerpc/qspinlock: paravirt yield to lock owner Date: Sat, 26 Nov 2022 19:59:22 +1000 Message-Id: <20221126095932.1234527-8-npiggin@gmail.com> X-Mailer: git-send-email 2.37.2 In-Reply-To: <20221126095932.1234527-1-npiggin@gmail.com> References: <20221126095932.1234527-1-npiggin@gmail.com> MIME-Version: 1.0 X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Jordan Niethe , Laurent Dufour , Nicholas Piggin Errors-To: linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" Waiters spinning on the lock word should yield to the lock owner if the vCPU is preempted. This improves performance when the hypervisor has oversubscribed physical CPUs. 
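For illustration only (not part of the patch): the decision a spinning waiter makes can be modelled in user space roughly as below. This is a sketch under stated assumptions. The names wait_action, owner_wait_action and LOCKED_BIT are hypothetical stand-ins, and the parity convention (an even yield count means the owner vCPU is running) is taken from the "(yield_count & 1) == 0" check in the diff that follows.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space model; none of these names exist in the kernel. */
#define LOCKED_BIT 1u	/* assumed stand-in for _Q_LOCKED_VAL */

enum wait_action { WAIT_RELAX, WAIT_YIELD };

/*
 * sampled_val: lock word the waiter read before deciding
 * yield_count: owner vCPU's yield count, sampled afterwards
 * current_val: lock word re-read after sampling the yield count
 */
static enum wait_action owner_wait_action(bool paravirt, bool pv_yield_owner,
					  uint32_t sampled_val,
					  uint32_t yield_count,
					  uint32_t current_val)
{
	/* Mirrors the BUG_ON in the patch: only called while locked. */
	assert(sampled_val & LOCKED_BIT);

	if (!paravirt || !pv_yield_owner)
		return WAIT_RELAX;	/* feature off: plain cpu_relax() */
	if ((yield_count & 1) == 0)
		return WAIT_RELAX;	/* even count: owner vCPU is running */
	if (current_val != sampled_val)
		return WAIT_RELAX;	/* lock word changed since sampling */
	return WAIT_YIELD;		/* odd count, unchanged word: yield */
}
```

The re-read of the lock word is what keeps the waiter from yielding to a CPU that has already released the lock; in the patch that re-read is ordered after the yield count sample by smp_rmb().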
Signed-off-by: Nicholas Piggin --- arch/powerpc/lib/qspinlock.c | 99 +++++++++++++++++++++++++++++++----- 1 file changed, 87 insertions(+), 12 deletions(-) diff --git a/arch/powerpc/lib/qspinlock.c b/arch/powerpc/lib/qspinlock.c index a5b2c0377cf9..bada773292af 100644 --- a/arch/powerpc/lib/qspinlock.c +++ b/arch/powerpc/lib/qspinlock.c @@ -5,6 +5,7 @@ #include #include #include +#include #define MAX_NODES 4 @@ -24,14 +25,16 @@ static int steal_spins __read_mostly = (1<<5); static bool maybe_stealers __read_mostly = true; static int head_spins __read_mostly = (1<<8); +static bool pv_yield_owner __read_mostly = true; + static DEFINE_PER_CPU_ALIGNED(struct qnodes, qnodes); -static __always_inline int get_steal_spins(void) +static __always_inline int get_steal_spins(bool paravirt) { return steal_spins; } -static __always_inline int get_head_spins(void) +static __always_inline int get_head_spins(bool paravirt) { return head_spins; } @@ -46,6 +49,11 @@ static inline int decode_tail_cpu(u32 val) return (val >> _Q_TAIL_CPU_OFFSET) - 1; } +static inline int get_owner_cpu(u32 val) +{ + return (val & _Q_OWNER_CPU_MASK) >> _Q_OWNER_CPU_OFFSET; +} + /* * Try to acquire the lock if it was not already locked. If the tail matches * mytail then clear it, otherwise leave it unchnaged. Return previous value. @@ -150,7 +158,45 @@ static struct qnode *get_tail_qnode(struct qspinlock *lock, u32 val) BUG(); } -static inline bool try_to_steal_lock(struct qspinlock *lock) +static __always_inline void yield_to_locked_owner(struct qspinlock *lock, u32 val, bool paravirt) +{ + int owner; + u32 yield_count; + + BUG_ON(!(val & _Q_LOCKED_VAL)); + + if (!paravirt) + goto relax; + + if (!pv_yield_owner) + goto relax; + + owner = get_owner_cpu(val); + yield_count = yield_count_of(owner); + + if ((yield_count & 1) == 0) + goto relax; /* owner vcpu is running */ + + /* + * Read the lock word after sampling the yield count. 
On the other side + * there may a wmb because the yield count update is done by the + * hypervisor preemption and the value update by the OS, however this + * ordering might reduce the chance of out of order accesses and + * improve the heuristic. + */ + smp_rmb(); + + if (READ_ONCE(lock->val) == val) { + yield_to_preempted(owner, yield_count); + /* Don't relax if we yielded. Maybe we should? */ + return; + } +relax: + cpu_relax(); +} + + +static __always_inline bool try_to_steal_lock(struct qspinlock *lock, bool paravirt) { int iters = 0; @@ -168,16 +214,16 @@ static inline bool try_to_steal_lock(struct qspinlock *lock) if (__queued_spin_trylock_steal(lock)) return true; } else { - cpu_relax(); + yield_to_locked_owner(lock, val, paravirt); } iters++; - } while (iters < get_steal_spins()); + } while (iters < get_steal_spins(paravirt)); return false; } -static inline void queued_spin_lock_mcs_queue(struct qspinlock *lock) +static __always_inline void queued_spin_lock_mcs_queue(struct qspinlock *lock, bool paravirt) { struct qnodes *qnodesp; struct qnode *next, *node; @@ -235,12 +281,12 @@ static inline void queued_spin_lock_mcs_queue(struct qspinlock *lock) if (!(val & _Q_LOCKED_VAL)) break; - cpu_relax(); + yield_to_locked_owner(lock, val, paravirt); if (!maybe_stealers) continue; iters++; - if (!mustq && iters >= get_head_spins()) { + if (!mustq && iters >= get_head_spins(paravirt)) { mustq = true; set_mustq(lock); val |= _Q_MUST_Q_VAL; @@ -276,10 +322,20 @@ static inline void queued_spin_lock_mcs_queue(struct qspinlock *lock) void queued_spin_lock_slowpath(struct qspinlock *lock) { - if (try_to_steal_lock(lock)) - return; - - queued_spin_lock_mcs_queue(lock); + /* + * This looks funny, but it induces the compiler to inline both + * sides of the branch rather than share code as when the condition + * is passed as the paravirt argument to the functions. 
+ */ + if (IS_ENABLED(CONFIG_PARAVIRT_SPINLOCKS) && is_shared_processor()) { + if (try_to_steal_lock(lock, true)) + return; + queued_spin_lock_mcs_queue(lock, true); + } else { + if (try_to_steal_lock(lock, false)) + return; + queued_spin_lock_mcs_queue(lock, false); + } } EXPORT_SYMBOL(queued_spin_lock_slowpath); @@ -345,10 +401,29 @@ static int head_spins_get(void *data, u64 *val) DEFINE_SIMPLE_ATTRIBUTE(fops_head_spins, head_spins_get, head_spins_set, "%llu\n"); +static int pv_yield_owner_set(void *data, u64 val) +{ + pv_yield_owner = !!val; + + return 0; +} + +static int pv_yield_owner_get(void *data, u64 *val) +{ + *val = pv_yield_owner; + + return 0; +} + +DEFINE_SIMPLE_ATTRIBUTE(fops_pv_yield_owner, pv_yield_owner_get, pv_yield_owner_set, "%llu\n"); + static __init int spinlock_debugfs_init(void) { debugfs_create_file("qspl_steal_spins", 0600, arch_debugfs_dir, NULL, &fops_steal_spins); debugfs_create_file("qspl_head_spins", 0600, arch_debugfs_dir, NULL, &fops_head_spins); + if (is_shared_processor()) { + debugfs_create_file("qspl_pv_yield_owner", 0600, arch_debugfs_dir, NULL, &fops_pv_yield_owner); + } return 0; } From patchwork Sat Nov 26 09:59:23 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nicholas Piggin X-Patchwork-Id: 1709204 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ozlabs.org (client-ip=2404:9400:2:0:216:3eff:fee1:b9f1; helo=lists.ozlabs.org; envelope-from=linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=UpoU0RSR; dkim-atps=neutral Received: from lists.ozlabs.org (lists.ozlabs.org 
[IPv6:2404:9400:2:0:216:3eff:fee1:b9f1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-384) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4NK6py08GTz23mg for ; Sat, 26 Nov 2022 21:07:21 +1100 (AEDT) Received: from boromir.ozlabs.org (localhost [IPv6:::1]) by lists.ozlabs.org (Postfix) with ESMTP id 4NK6px2qhYz3fKR for ; Sat, 26 Nov 2022 21:07:21 +1100 (AEDT) Authentication-Results: lists.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=UpoU0RSR; dkim-atps=neutral X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gmail.com (client-ip=2607:f8b0:4864:20::52a; helo=mail-pg1-x52a.google.com; envelope-from=npiggin@gmail.com; receiver=) Authentication-Results: lists.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=UpoU0RSR; dkim-atps=neutral Received: from mail-pg1-x52a.google.com (mail-pg1-x52a.google.com [IPv6:2607:f8b0:4864:20::52a]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4NK6ff6JG5z3f2w for ; Sat, 26 Nov 2022 21:00:10 +1100 (AEDT) Received: by mail-pg1-x52a.google.com with SMTP id 62so5731525pgb.13 for ; Sat, 26 Nov 2022 02:00:10 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=4hEQzYl0altadHCnc6rsswHwjhGI50CiqQmFWhEPYno=; 
b=UpoU0RSRH+gjTguNVioKiusAXtYQFbuiM4csWccCWhJjlFde2fh1mvz95D6p8B6en0 q16uEeb5ezxwzC1ToTATEY8htJDuYvNRejYcDKJTWG6yTBB+KUNGDI/KqNwXehAJPLpj INj0XX+oj+ZN09LibhaAqrdpNl7Aqi64FMorB4yPsG8VNn7maF99fn0p1AIxwG0AfNhO Sfu4ZbW0B8ML2sNQsuHy0Tg2QL8tugSe1kbSU0lbv9UU5QsgGbvi4fHbdXoDK7Ph1p/A V5BHt3dvsJBVzmkkstUSSJ0ghsI746BIJ9rGIWj4syW14lktTd7MIQYMCmrKm5Z1zveg Km4w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=4hEQzYl0altadHCnc6rsswHwjhGI50CiqQmFWhEPYno=; b=QlsBJc58Xfvmli2pE/Bn0QaduLVFEVMPXUlEQe+paOO+cPBiijdZ67azBvt1z+fxrK CR0uPUhgB/3jNvZ98x1UpQHV4aOrDQC+06kfty12PulYwnCSVGfepGkMB+dan4RKnNfS ne4HF+NtOGvKJyzKx1IEMQ/RJLTciLuGQ5wn9J4eum0UG8BOTyMdaJ6/g4LjZTyvuU2L 1n0r2AG/CvkvUPPE2fYGE0WCitafQIp9foF+J9AoTev+A/v8ylFPf0V6LprP0eIRr+FB X4aobncOW2AfWiC0qB2lsHqKnlmppm0kzMqNqfbafqKTk60bHHpFN2Pa3sua3ugOAaDQ TZhQ== X-Gm-Message-State: ANoB5pnAHG+yK197ldQGyata28CLiqQIQm3yVo4A8YhBU4mt7O5MmU4G DppiU65yheF2SOXTUpZ5lLi4rFlhP6DXDg== X-Google-Smtp-Source: AA0mqf7TGVgZQhFxECPBF7Ir574vaAQ8OeXyWsQT5TimPkeYj9telgTlK8Od2TUUXwXgYRN2O3HiHw== X-Received: by 2002:aa7:8684:0:b0:56c:5afe:67f6 with SMTP id d4-20020aa78684000000b0056c5afe67f6mr22794400pfo.20.1669456808199; Sat, 26 Nov 2022 02:00:08 -0800 (PST) Received: from bobo.ozlabs.ibm.com (110-174-181-90.tpgi.com.au. 
[110.174.181.90]) by smtp.gmail.com with ESMTPSA id j3-20020a17090a94c300b00213202d77d9sm4239243pjw.43.2022.11.26.02.00.05 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 26 Nov 2022 02:00:07 -0800 (PST) From: Nicholas Piggin To: linuxppc-dev@lists.ozlabs.org Subject: [PATCH v3 08/17] powerpc/qspinlock: implement option to yield to previous node Date: Sat, 26 Nov 2022 19:59:23 +1000 Message-Id: <20221126095932.1234527-9-npiggin@gmail.com> X-Mailer: git-send-email 2.37.2 In-Reply-To: <20221126095932.1234527-1-npiggin@gmail.com> References: <20221126095932.1234527-1-npiggin@gmail.com> MIME-Version: 1.0 X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Jordan Niethe , Laurent Dufour , Nicholas Piggin Errors-To: linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" Queued waiters which are not at the head of the queue don't spin on the lock word but their qnode lock word, waiting for the previous queued CPU to release them. Add an option which allows these waiters to yield to the previous CPU if its vCPU is preempted. 
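For illustration only (not part of the patch): a rough user-space model of how a queued waiter could pick the previous CPU to yield to. It is a sketch, not the kernel code. TAIL_CPU_OFFSET is an assumed value standing in for _Q_TAIL_CPU_OFFSET, encode_tail_cpu() is assumed to be the inverse of the decode_tail_cpu() helper shown in the earlier diff context, and prev_cpu_to_yield_to() and the yield_counts array are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TAIL_CPU_OFFSET 16	/* assumed; stands in for _Q_TAIL_CPU_OFFSET */

/* Assumed inverse of decode_tail_cpu(): the tail stores cpu + 1. */
static uint32_t encode_tail_cpu(int cpu)
{
	assert(cpu >= 0);
	return (uint32_t)(cpu + 1) << TAIL_CPU_OFFSET;
}

/* Same arithmetic as decode_tail_cpu() in the patch context. */
static int decode_tail_cpu(uint32_t val)
{
	return (int)(val >> TAIL_CPU_OFFSET) - 1;
}

/*
 * Returns the CPU number to yield to, or -1 if the waiter should just relax.
 * yield_counts[cpu] models yield_count_of(cpu); odd means preempted.
 */
static int prev_cpu_to_yield_to(bool paravirt, bool pv_yield_prev,
				uint32_t old_tail,
				const uint32_t *yield_counts,
				bool node_locked)
{
	int prev_cpu = decode_tail_cpu(old_tail);

	if (!paravirt || !pv_yield_prev)
		return -1;
	if (prev_cpu < 0)
		return -1;	/* model-only guard: no previous waiter */
	if ((yield_counts[prev_cpu] & 1) == 0)
		return -1;	/* previous vCPU is running */
	if (node_locked)
		return -1;	/* we were released in the meantime */
	return prev_cpu;
}
```

The check of node_locked after sampling the yield count corresponds to the "if (!node->locked)" test in yield_to_prev(): a waiter that has already been handed the MCS node lock has no reason to give up its time slice.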
Signed-off-by: Nicholas Piggin --- arch/powerpc/lib/qspinlock.c | 46 +++++++++++++++++++++++++++++++++++- 1 file changed, 45 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/lib/qspinlock.c b/arch/powerpc/lib/qspinlock.c index bada773292af..838be0bc8705 100644 --- a/arch/powerpc/lib/qspinlock.c +++ b/arch/powerpc/lib/qspinlock.c @@ -26,6 +26,7 @@ static bool maybe_stealers __read_mostly = true; static int head_spins __read_mostly = (1<<8); static bool pv_yield_owner __read_mostly = true; +static bool pv_yield_prev __read_mostly = true; static DEFINE_PER_CPU_ALIGNED(struct qnodes, qnodes); @@ -195,6 +196,32 @@ static __always_inline void yield_to_locked_owner(struct qspinlock *lock, u32 va cpu_relax(); } +static __always_inline void yield_to_prev(struct qspinlock *lock, struct qnode *node, u32 val, bool paravirt) +{ + int prev_cpu = decode_tail_cpu(val); + u32 yield_count; + + if (!paravirt) + goto relax; + + if (!pv_yield_prev) + goto relax; + + yield_count = yield_count_of(prev_cpu); + if ((yield_count & 1) == 0) + goto relax; /* owner vcpu is running */ + + smp_rmb(); /* See yield_to_locked_owner comment */ + + if (!node->locked) { + yield_to_preempted(prev_cpu, yield_count); + return; + } + +relax: + cpu_relax(); +} + static __always_inline bool try_to_steal_lock(struct qspinlock *lock, bool paravirt) { @@ -269,7 +296,7 @@ static __always_inline void queued_spin_lock_mcs_queue(struct qspinlock *lock, b /* Wait for mcs node lock to be released */ while (!node->locked) - cpu_relax(); + yield_to_prev(lock, node, old, paravirt); smp_rmb(); /* acquire barrier for the mcs lock */ } @@ -417,12 +444,29 @@ static int pv_yield_owner_get(void *data, u64 *val) DEFINE_SIMPLE_ATTRIBUTE(fops_pv_yield_owner, pv_yield_owner_get, pv_yield_owner_set, "%llu\n"); +static int pv_yield_prev_set(void *data, u64 val) +{ + pv_yield_prev = !!val; + + return 0; +} + +static int pv_yield_prev_get(void *data, u64 *val) +{ + *val = pv_yield_prev; + + return 0; +} + 
+DEFINE_SIMPLE_ATTRIBUTE(fops_pv_yield_prev, pv_yield_prev_get, pv_yield_prev_set, "%llu\n"); + static __init int spinlock_debugfs_init(void) { debugfs_create_file("qspl_steal_spins", 0600, arch_debugfs_dir, NULL, &fops_steal_spins); debugfs_create_file("qspl_head_spins", 0600, arch_debugfs_dir, NULL, &fops_head_spins); if (is_shared_processor()) { debugfs_create_file("qspl_pv_yield_owner", 0600, arch_debugfs_dir, NULL, &fops_pv_yield_owner); + debugfs_create_file("qspl_pv_yield_prev", 0600, arch_debugfs_dir, NULL, &fops_pv_yield_prev); } return 0; From patchwork Sat Nov 26 09:59:24 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nicholas Piggin X-Patchwork-Id: 1709205 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ozlabs.org (client-ip=112.213.38.117; helo=lists.ozlabs.org; envelope-from=linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=js4Lx3v1; dkim-atps=neutral Received: from lists.ozlabs.org (lists.ozlabs.org [112.213.38.117]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-384)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4NK6r55tqSz23mg for ; Sat, 26 Nov 2022 21:08:21 +1100 (AEDT) Received: from boromir.ozlabs.org (localhost [IPv6:::1]) by lists.ozlabs.org (Postfix) with ESMTP id 4NK6r54gBYz3dvP for ; Sat, 26 Nov 2022 21:08:21 +1100 (AEDT) Authentication-Results: lists.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com 
header.a=rsa-sha256 header.s=20210112 header.b=js4Lx3v1; dkim-atps=neutral X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gmail.com (client-ip=2607:f8b0:4864:20::631; helo=mail-pl1-x631.google.com; envelope-from=npiggin@gmail.com; receiver=) Authentication-Results: lists.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=js4Lx3v1; dkim-atps=neutral Received: from mail-pl1-x631.google.com (mail-pl1-x631.google.com [IPv6:2607:f8b0:4864:20::631]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4NK6fj6g96z3f4b for ; Sat, 26 Nov 2022 21:00:13 +1100 (AEDT) Received: by mail-pl1-x631.google.com with SMTP id j12so5946202plj.5 for ; Sat, 26 Nov 2022 02:00:13 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=OjskR2gpAJ6jlYK1rCUuLh/MS2cLeA8vhaR1uI2ScaI=; b=js4Lx3v1h0VnRWL8+8pH5UmpfPzqEpJ7NwlA+vAinwIVYrYVge9SCHvcAiRiaWwLrQ XQ0KGZ6EDKG9LBO8ypAcGCm9bBbaHWhBmLeRSzR91RxK15b/B/vTjFUelNR5jyftu0XL F68laGbMis2Bo3fEbYRORLE3J3VEGeAMJR2WuSwfY5FuoxBRWdP1xTLda5ayn8085XJR NI+gHpVnwenXvZ4AYp38Z5hh9vWgJrxS7vZYh/KZcz/d+CadqpX7sTB/aBLxmG9C3btX URQ7LbnSWQiosBLVIwEGt4eJGV0J/q8UkUCYIRq5WPxwBbsbT7nWlmokUtGvoo3qZiej iBQA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=OjskR2gpAJ6jlYK1rCUuLh/MS2cLeA8vhaR1uI2ScaI=; 
b=Xn1jjGjBpJ2lq2HoWeCN1NmQB/hQ7yjAJWqXeJhwksrq1SlTzTqoKZUBbXOzhJu286 9WdVhuO5bzpZMzKBPV4q3YBtI6jKUJhcikrP2cB8Pgw86KNeQwzIeQxAO4Kb5msAZ2wl A34XTrXjKbq1j7Psk89QpCecJ5EcYXFFyzso9QZxR05aFl6PCB1H79/vgJ8ebcycr140 wfN/BmneW/v+l+joKGScibRmWKzDMECKoRQ39bJAsxBByYBAw9j0PYJqTUqtvqBi37wb cVavXAE8Ci7eTns+6N0IOnHYIR2rb+l6aRZM4EZyFPtU3+RrTHFH916SUshxsVZxXIIb Ewig== X-Gm-Message-State: ANoB5plweI90yE2X43iyUHsKI1rQf+7SxUeHo8jM9MzzWmK1P/J8G5Xw sFtEzRnjxdZljezDHq+uICM7uWuNtar6DQ== X-Google-Smtp-Source: AA0mqf4Q1WXDfeLzkqdbTEJpJLBy3Bi0TgT6wx6ZMYcQ82R8fCDa5EhTnvDFyAdoyQcP5+eePoOvNQ== X-Received: by 2002:a17:90a:7e0d:b0:213:d630:f4af with SMTP id i13-20020a17090a7e0d00b00213d630f4afmr45256354pjl.77.1669456811162; Sat, 26 Nov 2022 02:00:11 -0800 (PST) Received: from bobo.ozlabs.ibm.com (110-174-181-90.tpgi.com.au. [110.174.181.90]) by smtp.gmail.com with ESMTPSA id j3-20020a17090a94c300b00213202d77d9sm4239243pjw.43.2022.11.26.02.00.08 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 26 Nov 2022 02:00:10 -0800 (PST) From: Nicholas Piggin To: linuxppc-dev@lists.ozlabs.org Subject: [PATCH v3 09/17] powerpc/qspinlock: allow stealing when head of queue yields Date: Sat, 26 Nov 2022 19:59:24 +1000 Message-Id: <20221126095932.1234527-10-npiggin@gmail.com> X-Mailer: git-send-email 2.37.2 In-Reply-To: <20221126095932.1234527-1-npiggin@gmail.com> References: <20221126095932.1234527-1-npiggin@gmail.com> MIME-Version: 1.0 X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Jordan Niethe , Laurent Dufour , Nicholas Piggin Errors-To: linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" If the head of queue is preventing stealing but it finds the owner vCPU is preempted, it will yield its cycles to the owner which could cause it to become preempted. 
Add an option to re-allow stealers before yielding, and disallow them again after returning from the yield. Disable this option by default for now, i.e., no logical change. Signed-off-by: Nicholas Piggin --- arch/powerpc/lib/qspinlock.c | 59 ++++++++++++++++++++++++++++++++++-- 1 file changed, 56 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/lib/qspinlock.c b/arch/powerpc/lib/qspinlock.c index 838be0bc8705..1aafafc701da 100644 --- a/arch/powerpc/lib/qspinlock.c +++ b/arch/powerpc/lib/qspinlock.c @@ -26,6 +26,7 @@ static bool maybe_stealers __read_mostly = true; static int head_spins __read_mostly = (1<<8); static bool pv_yield_owner __read_mostly = true; +static bool pv_yield_allow_steal __read_mostly = false; static bool pv_yield_prev __read_mostly = true; static DEFINE_PER_CPU_ALIGNED(struct qnodes, qnodes); @@ -135,6 +136,22 @@ static __always_inline u32 set_mustq(struct qspinlock *lock) return prev; } +static __always_inline u32 clear_mustq(struct qspinlock *lock) +{ + u32 prev; + + asm volatile( +"1: lwarx %0,0,%1 # clear_mustq \n" +" andc %0,%0,%2 \n" +" stwcx. %0,0,%1 \n" +" bne- 1b \n" + : "=&r" (prev) + : "r" (&lock->val), "r" (_Q_MUST_Q_VAL) + : "cr0", "memory"); + + return prev; +} + static struct qnode *get_tail_qnode(struct qspinlock *lock, u32 val) { int cpu = decode_tail_cpu(val); @@ -159,7 +176,7 @@ static struct qnode *get_tail_qnode(struct qspinlock *lock, u32 val) BUG(); } -static __always_inline void yield_to_locked_owner(struct qspinlock *lock, u32 val, bool paravirt) +static __always_inline void __yield_to_locked_owner(struct qspinlock *lock, u32 val, bool paravirt, bool mustq) { int owner; u32 yield_count; @@ -188,7 +205,11 @@ static __always_inline void yield_to_locked_owner(struct qspinlock *lock, u32 va smp_rmb(); if (READ_ONCE(lock->val) == val) { + if (mustq) + clear_mustq(lock); yield_to_preempted(owner, yield_count); + if (mustq) + set_mustq(lock); /* Don't relax if we yielded. Maybe we should? 
*/ return; } @@ -196,6 +217,21 @@ static __always_inline void yield_to_locked_owner(struct qspinlock *lock, u32 va cpu_relax(); } +static __always_inline void yield_to_locked_owner(struct qspinlock *lock, u32 val, bool paravirt) +{ + __yield_to_locked_owner(lock, val, paravirt, false); +} + +static __always_inline void yield_head_to_locked_owner(struct qspinlock *lock, u32 val, bool paravirt) +{ + bool mustq = false; + + if ((val & _Q_MUST_Q_VAL) && pv_yield_allow_steal) + mustq = true; + + __yield_to_locked_owner(lock, val, paravirt, mustq); +} + static __always_inline void yield_to_prev(struct qspinlock *lock, struct qnode *node, u32 val, bool paravirt) { int prev_cpu = decode_tail_cpu(val); @@ -211,7 +247,7 @@ static __always_inline void yield_to_prev(struct qspinlock *lock, struct qnode * if ((yield_count & 1) == 0) goto relax; /* owner vcpu is running */ - smp_rmb(); /* See yield_to_locked_owner comment */ + smp_rmb(); /* See __yield_to_locked_owner comment */ if (!node->locked) { yield_to_preempted(prev_cpu, yield_count); @@ -308,7 +344,7 @@ static __always_inline void queued_spin_lock_mcs_queue(struct qspinlock *lock, b if (!(val & _Q_LOCKED_VAL)) break; - yield_to_locked_owner(lock, val, paravirt); + yield_head_to_locked_owner(lock, val, paravirt); if (!maybe_stealers) continue; iters++; @@ -444,6 +480,22 @@ static int pv_yield_owner_get(void *data, u64 *val) DEFINE_SIMPLE_ATTRIBUTE(fops_pv_yield_owner, pv_yield_owner_get, pv_yield_owner_set, "%llu\n"); +static int pv_yield_allow_steal_set(void *data, u64 val) +{ + pv_yield_allow_steal = !!val; + + return 0; +} + +static int pv_yield_allow_steal_get(void *data, u64 *val) +{ + *val = pv_yield_allow_steal; + + return 0; +} + +DEFINE_SIMPLE_ATTRIBUTE(fops_pv_yield_allow_steal, pv_yield_allow_steal_get, pv_yield_allow_steal_set, "%llu\n"); + static int pv_yield_prev_set(void *data, u64 val) { pv_yield_prev = !!val; @@ -466,6 +518,7 @@ static __init int spinlock_debugfs_init(void) 
debugfs_create_file("qspl_head_spins", 0600, arch_debugfs_dir, NULL, &fops_head_spins); if (is_shared_processor()) { debugfs_create_file("qspl_pv_yield_owner", 0600, arch_debugfs_dir, NULL, &fops_pv_yield_owner); + debugfs_create_file("qspl_pv_yield_allow_steal", 0600, arch_debugfs_dir, NULL, &fops_pv_yield_allow_steal); debugfs_create_file("qspl_pv_yield_prev", 0600, arch_debugfs_dir, NULL, &fops_pv_yield_prev); } From patchwork Sat Nov 26 09:59:25 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nicholas Piggin X-Patchwork-Id: 1709206 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ozlabs.org (client-ip=2404:9400:2:0:216:3eff:fee1:b9f1; helo=lists.ozlabs.org; envelope-from=linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=Tqf6MSjs; dkim-atps=neutral Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2404:9400:2:0:216:3eff:fee1:b9f1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-384)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4NK6s64dNfz23mg for ; Sat, 26 Nov 2022 21:09:14 +1100 (AEDT) Received: from boromir.ozlabs.org (localhost [IPv6:::1]) by lists.ozlabs.org (Postfix) with ESMTP id 4NK6s622mWz3fGt for ; Sat, 26 Nov 2022 21:09:14 +1100 (AEDT) Authentication-Results: lists.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=Tqf6MSjs; dkim-atps=neutral X-Original-To: 
linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gmail.com (client-ip=2607:f8b0:4864:20::1030; helo=mail-pj1-x1030.google.com; envelope-from=npiggin@gmail.com; receiver=) Authentication-Results: lists.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=Tqf6MSjs; dkim-atps=neutral Received: from mail-pj1-x1030.google.com (mail-pj1-x1030.google.com [IPv6:2607:f8b0:4864:20::1030]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4NK6fm38WRz3f3Y for ; Sat, 26 Nov 2022 21:00:16 +1100 (AEDT) Received: by mail-pj1-x1030.google.com with SMTP id l22-20020a17090a3f1600b00212fbbcfb78so9662084pjc.3 for ; Sat, 26 Nov 2022 02:00:16 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=2zzhIbvAzayb4yaRNKEO4qu4RYhZtzziiOxDt3X2n6s=; b=Tqf6MSjsI51hO6oK28KywrfoFpC6ISeMB5p6WkMYglA4QsQ3GarFeUpM7lk2lnZvcb YDLe37zHcvBw7JxekmBtD98xUivyTTNEhg6Unw+8AKk55iq8EfxRtPdDv0zmmgI8Lv7k ulCIulfwoqqyhaIKw5CeR87YCOJ0gmuiT21Bm6oUX6uDJuvPydaVMwe618X5AZEZG30p BDuQhTsF/mz2GqY0pluAEqxCjdRnFA0J2uBMlFRz3WqULVxMnXz68yLbWM6hnwT0S2DK mAY+Si7mA4CTfHpUERirwJApsDmANNmOgSkzStABTqF6Qh0euZWCPHk5mj4qYCUj2ApS rzpg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=2zzhIbvAzayb4yaRNKEO4qu4RYhZtzziiOxDt3X2n6s=; 
b=Dz0BHHBmNvZmhg9/GBMYK9ZawvoyXIYFhfPKdJGTd4BddFlLj5Y+wbTl0dZCWsWOE/ x3dhIZ6z6WVnGMgX4doVShFJYpI0CzDsGcSOyN0rBj1veH4ckYa7rDxmhR9P9J3ubh4u WDhjIYX2mEfnOcD/vkTj4tDvsc/6HGYHOzL39BD92ObvNI6Nv2xjb68B3tXoC4CuYeo8 kVVI+EXnHDhsNc8WWubjjFWN7IQS1rx+N8WHh/vugHbkm/3xk6hsVSs2axHGEXzmG0Dj GPrrx58LnpO41wDQ6DtlmB4DFmLCfU80u+r334xIgUAe/LqaH9Az70vJRK/M3NcWtkWj xdvA== X-Gm-Message-State: ANoB5pmTbp7NKoF9+x0XFVRZcUa7zhKbzQhFzDIWziKMO/d57qJN8dA/ hSE8e0G58V9dGT1wuN3hiyLrE06xUvKjcQ== X-Google-Smtp-Source: AA0mqf4tCC90iYXsW5Ey6icR47Frf3/nphkgh7rGMIJlMe9BStQOvf01333DgFvTYTEMrUESliiKGQ== X-Received: by 2002:a17:90b:358a:b0:218:c490:33f6 with SMTP id mm10-20020a17090b358a00b00218c49033f6mr25932131pjb.83.1669456814163; Sat, 26 Nov 2022 02:00:14 -0800 (PST) Received: from bobo.ozlabs.ibm.com (110-174-181-90.tpgi.com.au. [110.174.181.90]) by smtp.gmail.com with ESMTPSA id j3-20020a17090a94c300b00213202d77d9sm4239243pjw.43.2022.11.26.02.00.11 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 26 Nov 2022 02:00:13 -0800 (PST) From: Nicholas Piggin To: linuxppc-dev@lists.ozlabs.org Subject: [PATCH v3 10/17] powerpc/qspinlock: allow propagation of yield CPU down the queue Date: Sat, 26 Nov 2022 19:59:25 +1000 Message-Id: <20221126095932.1234527-11-npiggin@gmail.com> X-Mailer: git-send-email 2.37.2 In-Reply-To: <20221126095932.1234527-1-npiggin@gmail.com> References: <20221126095932.1234527-1-npiggin@gmail.com> MIME-Version: 1.0 X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Jordan Niethe , Laurent Dufour , Nicholas Piggin Errors-To: linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" Having all CPUs poll the lock word for the owner CPU that should be yielded to defeats most of the purpose of using MCS queueing for scalability. 
Yet it may be desirable for queued waiters to yield to a preempted owner. With this change, queue waiters never sample the owner CPU directly from the lock word. The queue head (which is spinning on the lock) propagates the owner CPU back to the next waiter if it finds the owner has been preempted. That waiter then propagates the owner CPU back to the next waiter, and so on. s390 addresses this problem differently, by having queued waiters sample the lock word to find the owner at a low frequency. That has the advantage of being simpler; the advantage of propagation is that the lock word never has to be accessed by queued waiters, and the transfer of cache lines to transmit the owner data is only required when lock holder vCPU preemption occurs. Signed-off-by: Nicholas Piggin --- arch/powerpc/lib/qspinlock.c | 79 ++++++++++++++++++++++++++++++++++++ 1 file changed, 79 insertions(+) diff --git a/arch/powerpc/lib/qspinlock.c b/arch/powerpc/lib/qspinlock.c index 1aafafc701da..b044760a05e9 100644 --- a/arch/powerpc/lib/qspinlock.c +++ b/arch/powerpc/lib/qspinlock.c @@ -12,6 +12,7 @@ struct qnode { struct qnode *next; struct qspinlock *lock; + int yield_cpu; u8 locked; /* 1 if lock acquired */ }; @@ -28,6 +29,7 @@ static int head_spins __read_mostly = (1<<8); static bool pv_yield_owner __read_mostly = true; static bool pv_yield_allow_steal __read_mostly = false; static bool pv_yield_prev __read_mostly = true; +static bool pv_yield_propagate_owner __read_mostly = true; static DEFINE_PER_CPU_ALIGNED(struct qnodes, qnodes); @@ -232,14 +234,67 @@ static __always_inline void yield_head_to_locked_owner(struct qspinlock *lock, u __yield_to_locked_owner(lock, val, paravirt, mustq); } +static __always_inline void propagate_yield_cpu(struct qnode *node, u32 val, int *set_yield_cpu, bool paravirt) +{ + struct qnode *next; + int owner; + + if (!paravirt) + return; + if (!pv_yield_propagate_owner) + return; + + owner = get_owner_cpu(val); + if (*set_yield_cpu == owner) + return; + +
next = READ_ONCE(node->next); + if (!next) + return; + + if (vcpu_is_preempted(owner)) { + next->yield_cpu = owner; + *set_yield_cpu = owner; + } else if (*set_yield_cpu != -1) { + next->yield_cpu = owner; + *set_yield_cpu = owner; + } +} + static __always_inline void yield_to_prev(struct qspinlock *lock, struct qnode *node, u32 val, bool paravirt) { int prev_cpu = decode_tail_cpu(val); u32 yield_count; + int yield_cpu; if (!paravirt) goto relax; + if (!pv_yield_propagate_owner) + goto yield_prev; + + yield_cpu = READ_ONCE(node->yield_cpu); + if (yield_cpu == -1) { + /* Propagate back the -1 CPU */ + if (node->next && node->next->yield_cpu != -1) + node->next->yield_cpu = yield_cpu; + goto yield_prev; + } + + yield_count = yield_count_of(yield_cpu); + if ((yield_count & 1) == 0) + goto yield_prev; /* owner vcpu is running */ + + smp_rmb(); + + if (yield_cpu == node->yield_cpu) { + if (node->next && node->next->yield_cpu != yield_cpu) + node->next->yield_cpu = yield_cpu; + yield_to_preempted(yield_cpu, yield_count); + return; + } + +yield_prev: if (!pv_yield_prev) goto relax; @@ -293,6 +348,7 @@ static __always_inline void queued_spin_lock_mcs_queue(struct qspinlock *lock, b u32 val, old, tail; bool mustq = false; int idx; + int set_yield_cpu = -1; int iters = 0; BUILD_BUG_ON(CONFIG_NR_CPUS >= (1U << _Q_TAIL_CPU_BITS)); @@ -314,6 +370,7 @@ static __always_inline void queued_spin_lock_mcs_queue(struct qspinlock *lock, b node = &qnodesp->nodes[idx]; node->next = NULL; node->lock = lock; + node->yield_cpu = -1; node->locked = 0; tail = encode_tail_cpu(smp_processor_id()); @@ -334,6 +391,10 @@ static __always_inline void queued_spin_lock_mcs_queue(struct qspinlock *lock, b while (!node->locked) yield_to_prev(lock, node, old, paravirt); + /* Clear out stale propagated yield_cpu */ + if (paravirt && pv_yield_propagate_owner && node->yield_cpu != -1) + node->yield_cpu = -1; + smp_rmb(); /* acquire barrier for the mcs lock */ } @@ -344,6 +405,7 @@ static __always_inline 
void queued_spin_lock_mcs_queue(struct qspinlock *lock, b if (!(val & _Q_LOCKED_VAL)) break; + propagate_yield_cpu(node, val, &set_yield_cpu, paravirt); yield_head_to_locked_owner(lock, val, paravirt); if (!maybe_stealers) continue; @@ -512,6 +574,22 @@ static int pv_yield_prev_get(void *data, u64 *val) DEFINE_SIMPLE_ATTRIBUTE(fops_pv_yield_prev, pv_yield_prev_get, pv_yield_prev_set, "%llu\n"); +static int pv_yield_propagate_owner_set(void *data, u64 val) +{ + pv_yield_propagate_owner = !!val; + + return 0; +} + +static int pv_yield_propagate_owner_get(void *data, u64 *val) +{ + *val = pv_yield_propagate_owner; + + return 0; +} + +DEFINE_SIMPLE_ATTRIBUTE(fops_pv_yield_propagate_owner, pv_yield_propagate_owner_get, pv_yield_propagate_owner_set, "%llu\n"); + static __init int spinlock_debugfs_init(void) { debugfs_create_file("qspl_steal_spins", 0600, arch_debugfs_dir, NULL, &fops_steal_spins); @@ -520,6 +598,7 @@ static __init int spinlock_debugfs_init(void) debugfs_create_file("qspl_pv_yield_owner", 0600, arch_debugfs_dir, NULL, &fops_pv_yield_owner); debugfs_create_file("qspl_pv_yield_allow_steal", 0600, arch_debugfs_dir, NULL, &fops_pv_yield_allow_steal); debugfs_create_file("qspl_pv_yield_prev", 0600, arch_debugfs_dir, NULL, &fops_pv_yield_prev); + debugfs_create_file("qspl_pv_yield_propagate_owner", 0600, arch_debugfs_dir, NULL, &fops_pv_yield_propagate_owner); } return 0; From patchwork Sat Nov 26 09:59:26 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nicholas Piggin X-Patchwork-Id: 1709207 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ozlabs.org (client-ip=2404:9400:2:0:216:3eff:fee1:b9f1; helo=lists.ozlabs.org; envelope-from=linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org; receiver=) Authentication-Results: 
legolas.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=R/Q6S60W; dkim-atps=neutral Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2404:9400:2:0:216:3eff:fee1:b9f1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-384)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4NK6t71cryz23nT for ; Sat, 26 Nov 2022 21:10:07 +1100 (AEDT) Received: from boromir.ozlabs.org (localhost [IPv6:::1]) by lists.ozlabs.org (Postfix) with ESMTP id 4NK6t70lQFz3fNL for ; Sat, 26 Nov 2022 21:10:07 +1100 (AEDT) Authentication-Results: lists.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=R/Q6S60W; dkim-atps=neutral X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gmail.com (client-ip=2607:f8b0:4864:20::631; helo=mail-pl1-x631.google.com; envelope-from=npiggin@gmail.com; receiver=) Authentication-Results: lists.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=R/Q6S60W; dkim-atps=neutral Received: from mail-pl1-x631.google.com (mail-pl1-x631.google.com [IPv6:2607:f8b0:4864:20::631]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4NK6fq2F4zz3f4L for ; Sat, 26 Nov 2022 21:00:19 +1100 (AEDT) Received: by mail-pl1-x631.google.com with SMTP id jn7so5934553plb.13 for ; Sat, 26 Nov 2022 02:00:19 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; 
d=gmail.com; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=D7ItALSQmBYmE/CIt7MKw1pwIYiizTeoZ6ZroZLhvCk=; b=R/Q6S60WrnYgoEvQlOmmiLyNAOiYIxon8ULrrjOVrn/3z4aRCCxCCMqo7aDwVRiBXq Q5aN85KB+hkh8Qii+Lc7xCiPCZjOq9yumyyz43qGQ036bud3rGbk1aWneDUWloLhe6tB +bZW9i41ZjopKY1hY5FlvTQTA7isq/vekjNgad1DF41QKJMJQVzhauRxHI+few7x1qfF Z9ZJlhlRRFX6joKJ4uDkqM6esRckRIS6cQr5POwTy9DMgw1Ow/WO6bOD38ByEnYrXhIj RkSVYC5GRmsEFA8OkRBvG9M0Img5s9FnMrnn7MXteDU15DXVm8/wgN0oUKLIl8zRK6Ue kwrw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=D7ItALSQmBYmE/CIt7MKw1pwIYiizTeoZ6ZroZLhvCk=; b=ST6THi/9r4GGBDIuTPbUkgOvRJKN1l++aH7p30lc+SGVsk8JtyCY1lHY/Huv5JlMTP 75BLxMHD1VMEhzLIl64FRJSKlw2HxtJgavNpRyiqzkiGF8hq5zIBPny3EQEH5qE5r0Mr EQ9zHmyLlH8/MxXFnF8ymMelOr6PVESMbPk8Zy8MHOFeYSl1/2SWhMLYpCZz/tZtzT5v Fb1wTFh8/jyhYRgXe7jsbtlTe4uPnAmLdytWVcLhrgVrovM6KjBo9c38em/NEuAfHa47 68F3ZUSl+wrxbsvNj5uzm/szwxkQispWJLMr5rYWx5C9bS59GHTQn/KpYpwqpwIaoqro 401A== X-Gm-Message-State: ANoB5pkynx/pfbGj2YdIN22+IvGF+W+ut6kD0OmEN0kuzeR3NYCDyXK0 6pKVukzpW6ljlKCVwQJ6X21weG9+8Xk9AA== X-Google-Smtp-Source: AA0mqf53zu/DHPLbjl2UgUysLtTlt9x60FRkCLpgtzWZhxohdB/MbYdchAfMwDShAhGXOO2JzWjYZg== X-Received: by 2002:a17:90b:4fcc:b0:219:1b9c:4682 with SMTP id qa12-20020a17090b4fcc00b002191b9c4682mr2434948pjb.1.1669456817190; Sat, 26 Nov 2022 02:00:17 -0800 (PST) Received: from bobo.ozlabs.ibm.com (110-174-181-90.tpgi.com.au. 
[110.174.181.90]) by smtp.gmail.com with ESMTPSA id j3-20020a17090a94c300b00213202d77d9sm4239243pjw.43.2022.11.26.02.00.14 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 26 Nov 2022 02:00:16 -0800 (PST) From: Nicholas Piggin To: linuxppc-dev@lists.ozlabs.org Subject: [PATCH v3 11/17] powerpc/qspinlock: add ability to prod new queue head CPU Date: Sat, 26 Nov 2022 19:59:26 +1000 Message-Id: <20221126095932.1234527-12-npiggin@gmail.com> X-Mailer: git-send-email 2.37.2 In-Reply-To: <20221126095932.1234527-1-npiggin@gmail.com> References: <20221126095932.1234527-1-npiggin@gmail.com> MIME-Version: 1.0 X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Jordan Niethe , Laurent Dufour , Nicholas Piggin Errors-To: linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" After the head of the queue acquires the lock, it releases the next waiter in the queue to become the new head. Add an option to prod the new head if its vCPU was preempted. This may only have an effect if queue waiters are yielding. Disable this option by default for now, i.e., no logical change. 
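
For illustration only (not part of the patch): a minimal userspace sketch of the handoff described above. The names `release_next_sketch`, `qnode_sketch`, `preempted[]` and `prods[]` are hypothetical stand-ins; in the kernel the checks are `vcpu_is_preempted()` and `prod_cpu()`. The one detail taken from the diff is the ordering: the next waiter's CPU number is read before `locked` is stored, presumably because the node may be reused once the waiter is released.

```c
#include <stdbool.h>

/* Hypothetical cut-down queue node: only the fields this sketch needs. */
struct qnode_sketch {
	int cpu;              /* CPU that owns this queue node */
	unsigned char locked; /* 1 once the MCS lock is handed over */
};

#define NR_CPUS_SKETCH 4

/* Stand-ins for hypervisor state; assumptions for illustration only. */
static bool preempted[NR_CPUS_SKETCH];
static int prods[NR_CPUS_SKETCH];

static bool vcpu_is_preempted_sketch(int cpu)
{
	return preempted[cpu];
}

static void prod_cpu_sketch(int cpu)
{
	prods[cpu]++;
}

/* Hand the queue head role to the next waiter, optionally prodding it. */
static void release_next_sketch(struct qnode_sketch *next, bool pv_prod_head)
{
	/* Sample the CPU first: after the store below the waiter may run on. */
	int next_cpu = next->cpu;

	next->locked = 1;
	if (pv_prod_head && vcpu_is_preempted_sketch(next_cpu))
		prod_cpu_sketch(next_cpu);
}
```

With `pv_prod_head` false the function reduces to the plain store, which corresponds to the "no logical change" default.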
Signed-off-by: Nicholas Piggin
---
 arch/powerpc/lib/qspinlock.c | 31 +++++++++++++++++++++++++++++--
 1 file changed, 29 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/lib/qspinlock.c b/arch/powerpc/lib/qspinlock.c
index b044760a05e9..03329f4ed238 100644
--- a/arch/powerpc/lib/qspinlock.c
+++ b/arch/powerpc/lib/qspinlock.c
@@ -12,6 +12,7 @@
 struct qnode {
 	struct qnode *next;
 	struct qspinlock *lock;
+	int cpu;
 	int yield_cpu;
 	u8 locked; /* 1 if lock acquired */
 };
@@ -30,6 +31,7 @@ static bool pv_yield_owner __read_mostly = true;
 static bool pv_yield_allow_steal __read_mostly = false;
 static bool pv_yield_prev __read_mostly = true;
 static bool pv_yield_propagate_owner __read_mostly = true;
+static bool pv_prod_head __read_mostly = false;
 
 static DEFINE_PER_CPU_ALIGNED(struct qnodes, qnodes);
 
@@ -370,10 +372,11 @@ static __always_inline void queued_spin_lock_mcs_queue(struct qspinlock *lock, b
 	node = &qnodesp->nodes[idx];
 	node->next = NULL;
 	node->lock = lock;
+	node->cpu = smp_processor_id();
 	node->yield_cpu = -1;
 	node->locked = 0;
 
-	tail = encode_tail_cpu(smp_processor_id());
+	tail = encode_tail_cpu(node->cpu);
 
 	old = publish_tail_cpu(lock, tail);
 
@@ -439,7 +442,14 @@ static __always_inline void queued_spin_lock_mcs_queue(struct qspinlock *lock, b
 	 * this store to locked. The corresponding barrier is the smp_rmb()
 	 * acquire barrier for mcs lock, above.
 	 */
-	WRITE_ONCE(next->locked, 1);
+	if (paravirt && pv_prod_head) {
+		int next_cpu = next->cpu;
+		WRITE_ONCE(next->locked, 1);
+		if (vcpu_is_preempted(next_cpu))
+			prod_cpu(next_cpu);
+	} else {
+		WRITE_ONCE(next->locked, 1);
+	}
 
 release:
 	qnodesp->count--; /* release the node */
@@ -590,6 +600,22 @@ static int pv_yield_propagate_owner_get(void *data, u64 *val)
 
 DEFINE_SIMPLE_ATTRIBUTE(fops_pv_yield_propagate_owner, pv_yield_propagate_owner_get, pv_yield_propagate_owner_set, "%llu\n");
 
+static int pv_prod_head_set(void *data, u64 val)
+{
+	pv_prod_head = !!val;
+
+	return 0;
+}
+
+static int pv_prod_head_get(void *data, u64 *val)
+{
+	*val = pv_prod_head;
+
+	return 0;
+}
+
+DEFINE_SIMPLE_ATTRIBUTE(fops_pv_prod_head, pv_prod_head_get, pv_prod_head_set, "%llu\n");
+
 static __init int spinlock_debugfs_init(void)
 {
 	debugfs_create_file("qspl_steal_spins", 0600, arch_debugfs_dir, NULL, &fops_steal_spins);
@@ -599,6 +625,7 @@ static __init int spinlock_debugfs_init(void)
 		debugfs_create_file("qspl_pv_yield_allow_steal", 0600, arch_debugfs_dir, NULL, &fops_pv_yield_allow_steal);
 		debugfs_create_file("qspl_pv_yield_prev", 0600, arch_debugfs_dir, NULL, &fops_pv_yield_prev);
 		debugfs_create_file("qspl_pv_yield_propagate_owner", 0600, arch_debugfs_dir, NULL, &fops_pv_yield_propagate_owner);
+		debugfs_create_file("qspl_pv_prod_head", 0600, arch_debugfs_dir, NULL, &fops_pv_prod_head);
 	}
 
 	return 0;

From patchwork Sat Nov 26 09:59:27 2022
X-Patchwork-Submitter: Nicholas Piggin
X-Patchwork-Id: 1709208
From: Nicholas Piggin
To: linuxppc-dev@lists.ozlabs.org
Subject: [PATCH v3 12/17] powerpc/qspinlock: allow lock stealing in trylock and lock fastpath
Date: Sat, 26 Nov 2022 19:59:27 +1000
Message-Id: <20221126095932.1234527-13-npiggin@gmail.com>
In-Reply-To: <20221126095932.1234527-1-npiggin@gmail.com>
References: <20221126095932.1234527-1-npiggin@gmail.com>
Cc: Jordan Niethe, Laurent Dufour, Nicholas Piggin

This change allows trylock to steal the lock. It also allows the initial
lock attempt to steal the lock rather than bailing out and going to the
slow path.

This gives trylock more strength: without this a continually-contended
lock will never permit a trylock to succeed. With this change, the
trylock has a small but non-zero chance.

It also gives the lock fastpath most of the benefit of passing the
reservation back through to the steal loop in the slow path without the
complexity.

Signed-off-by: Nicholas Piggin
---
 arch/powerpc/include/asm/qspinlock.h | 22 ++++++++++++++++++++--
 arch/powerpc/lib/qspinlock.c         |  9 +++++++++
 2 files changed, 29 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/qspinlock.h b/arch/powerpc/include/asm/qspinlock.h
index be53702e56fc..c9fa83bba1d5 100644
--- a/arch/powerpc/include/asm/qspinlock.h
+++ b/arch/powerpc/include/asm/qspinlock.h
@@ -5,6 +5,15 @@
 #include
 #include
 
+/*
+ * The trylock itself may steal. This makes trylocks slightly stronger, and
+ * might make spin locks slightly more efficient when stealing.
+ *
+ * This is compile-time, so if true then there may always be stealers, so the
+ * nosteal paths become unused.
+ */
+#define _Q_SPIN_TRY_LOCK_STEAL 1
+
 static __always_inline int queued_spin_is_locked(struct qspinlock *lock)
 {
 	return READ_ONCE(lock->val);
@@ -26,13 +35,14 @@ static __always_inline u32 queued_spin_encode_locked_val(void)
 	return _Q_LOCKED_VAL | (smp_processor_id() << _Q_OWNER_CPU_OFFSET);
 }
 
-static __always_inline int queued_spin_trylock(struct qspinlock *lock)
+static __always_inline int __queued_spin_trylock_nosteal(struct qspinlock *lock)
 {
 	u32 new = queued_spin_encode_locked_val();
 	u32 prev;
 
+	/* Trylock succeeds only when unlocked and no queued nodes */
 	asm volatile(
-"1:	lwarx	%0,0,%1,%3	# queued_spin_trylock	\n"
+"1:	lwarx	%0,0,%1,%3	# __queued_spin_trylock_nosteal	\n"
 "	cmpwi	0,%0,0	\n"
 "	bne-	2f	\n"
 "	stwcx.	%2,0,%1	\n"
@@ -71,6 +81,14 @@ static __always_inline int __queued_spin_trylock_steal(struct qspinlock *lock)
 	return likely(!(prev & ~_Q_TAIL_CPU_MASK));
 }
 
+static __always_inline int queued_spin_trylock(struct qspinlock *lock)
+{
+	if (!_Q_SPIN_TRY_LOCK_STEAL)
+		return __queued_spin_trylock_nosteal(lock);
+	else
+		return __queued_spin_trylock_steal(lock);
+}
+
 void queued_spin_lock_slowpath(struct qspinlock *lock);
 
 static __always_inline void queued_spin_lock(struct qspinlock *lock)
diff --git a/arch/powerpc/lib/qspinlock.c b/arch/powerpc/lib/qspinlock.c
index 03329f4ed238..1cb47a6478a0 100644
--- a/arch/powerpc/lib/qspinlock.c
+++ b/arch/powerpc/lib/qspinlock.c
@@ -24,7 +24,11 @@ struct qnodes {
 
 /* Tuning parameters */
 static int steal_spins __read_mostly = (1<<5);
+#if _Q_SPIN_TRY_LOCK_STEAL == 1
+static const bool maybe_stealers = true;
+#else
 static bool maybe_stealers __read_mostly = true;
+#endif
 static int head_spins __read_mostly = (1<<8);
 
 static bool pv_yield_owner __read_mostly = true;
@@ -483,6 +487,10 @@ void pv_spinlocks_init(void)
 #include
 static int steal_spins_set(void *data, u64 val)
 {
+#if _Q_SPIN_TRY_LOCK_STEAL == 1
+	/* MAYBE_STEAL remains true */
+	steal_spins = val;
+#else
 	static DEFINE_MUTEX(lock);
 
 	/*
@@ -507,6 +515,7 @@ static int steal_spins_set(void *data, u64 val)
 		steal_spins = val;
 	}
 	mutex_unlock(&lock);
+#endif
 
 	return 0;
 }

From patchwork Sat Nov 26 09:59:28 2022
X-Patchwork-Submitter: Nicholas Piggin
X-Patchwork-Id: 1709209
From: Nicholas Piggin
To: linuxppc-dev@lists.ozlabs.org
Subject: [PATCH v3 13/17] powerpc/qspinlock: use spin_begin/end API
Date: Sat, 26 Nov 2022 19:59:28 +1000
Message-Id: <20221126095932.1234527-14-npiggin@gmail.com>
In-Reply-To: <20221126095932.1234527-1-npiggin@gmail.com>
References: <20221126095932.1234527-1-npiggin@gmail.com>
Cc: Jordan Niethe, Laurent Dufour, Nicholas Piggin

Use the spin_begin/spin_cpu_relax/spin_end APIs in qspinlock, which helps
to prevent threads issuing a lot of expensive priority nops which may not
have much effect due to immediately executing low then medium priority.
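
For illustration only (not part of the patch): a small userspace model of why bracketing a spin loop helps. It assumes, as the message above implies, that each `cpu_relax()` drops to low priority and immediately returns to medium, while `spin_begin()` lowers priority once, `spin_cpu_relax()` leaves it alone, and `spin_end()` restores it once. All `*_sketch` names and the `prio_changes` counter are hypothetical; the counter merely tallies priority changes.

```c
/* Counts simulated thread-priority changes (hypothetical bookkeeping). */
static int prio_changes;

static void cpu_relax_sketch(void)      { prio_changes += 2; } /* low, then medium, every call */
static void spin_begin_sketch(void)     { prio_changes += 1; } /* drop to low once */
static void spin_cpu_relax_sketch(void) { }                    /* stay at low priority */
static void spin_end_sketch(void)       { prio_changes += 1; } /* back to medium once */

/* Old style: every iteration toggles priority down and up again. */
static int wait_with_cpu_relax(int iters)
{
	prio_changes = 0;
	for (int i = 0; i < iters; i++)
		cpu_relax_sketch();
	return prio_changes;
}

/* New style: priority is changed only on entry to and exit from the loop. */
static int wait_with_spin_api(int iters)
{
	prio_changes = 0;
	spin_begin_sketch();
	for (int i = 0; i < iters; i++)
		spin_cpu_relax_sketch();
	spin_end_sketch();
	return prio_changes;
}
```

In this model a 100-iteration wait costs 200 priority changes in the old style and 2 in the new one. The price is that any call made from inside the loop that should run at normal priority has to be wrapped in `spin_end()` / `spin_begin()`, which is what most of the hunks in this patch do around `yield_to_preempted()` and the trylock.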
Signed-off-by: Nicholas Piggin --- arch/powerpc/lib/qspinlock.c | 39 ++++++++++++++++++++++++++++++++---- 1 file changed, 35 insertions(+), 4 deletions(-) diff --git a/arch/powerpc/lib/qspinlock.c b/arch/powerpc/lib/qspinlock.c index 1cb47a6478a0..70f924296b36 100644 --- a/arch/powerpc/lib/qspinlock.c +++ b/arch/powerpc/lib/qspinlock.c @@ -184,6 +184,7 @@ static struct qnode *get_tail_qnode(struct qspinlock *lock, u32 val) BUG(); } +/* Called inside spin_begin() */ static __always_inline void __yield_to_locked_owner(struct qspinlock *lock, u32 val, bool paravirt, bool mustq) { int owner; @@ -203,6 +204,8 @@ static __always_inline void __yield_to_locked_owner(struct qspinlock *lock, u32 if ((yield_count & 1) == 0) goto relax; /* owner vcpu is running */ + spin_end(); + /* * Read the lock word after sampling the yield count. On the other side * there may a wmb because the yield count update is done by the @@ -218,18 +221,22 @@ static __always_inline void __yield_to_locked_owner(struct qspinlock *lock, u32 yield_to_preempted(owner, yield_count); if (mustq) set_mustq(lock); + spin_begin(); /* Don't relax if we yielded. Maybe we should? 
*/ return; } + spin_begin(); relax: - cpu_relax(); + spin_cpu_relax(); } +/* Called inside spin_begin() */ static __always_inline void yield_to_locked_owner(struct qspinlock *lock, u32 val, bool paravirt) { __yield_to_locked_owner(lock, val, paravirt, false); } +/* Called inside spin_begin() */ static __always_inline void yield_head_to_locked_owner(struct qspinlock *lock, u32 val, bool paravirt) { bool mustq = false; @@ -267,6 +274,7 @@ static __always_inline void propagate_yield_cpu(struct qnode *node, u32 val, int } } +/* Called inside spin_begin() */ static __always_inline void yield_to_prev(struct qspinlock *lock, struct qnode *node, u32 val, bool paravirt) { int prev_cpu = decode_tail_cpu(val); @@ -291,14 +299,18 @@ static __always_inline void yield_to_prev(struct qspinlock *lock, struct qnode * if ((yield_count & 1) == 0) goto yield_prev; /* owner vcpu is running */ + spin_end(); + smp_rmb(); if (yield_cpu == node->yield_cpu) { if (node->next && node->next->yield_cpu != yield_cpu) node->next->yield_cpu = yield_cpu; yield_to_preempted(yield_cpu, yield_count); + spin_begin(); return; } + spin_begin(); yield_prev: if (!pv_yield_prev) @@ -308,15 +320,19 @@ static __always_inline void yield_to_prev(struct qspinlock *lock, struct qnode * if ((yield_count & 1) == 0) goto relax; /* owner vcpu is running */ + spin_end(); + smp_rmb(); /* See __yield_to_locked_owner comment */ if (!node->locked) { yield_to_preempted(prev_cpu, yield_count); + spin_begin(); return; } + spin_begin(); relax: - cpu_relax(); + spin_cpu_relax(); } @@ -328,6 +344,8 @@ static __always_inline bool try_to_steal_lock(struct qspinlock *lock, bool parav return false; /* Attempt to steal the lock */ + spin_begin(); + do { u32 val = READ_ONCE(lock->val); @@ -335,8 +353,10 @@ static __always_inline bool try_to_steal_lock(struct qspinlock *lock, bool parav break; if (unlikely(!(val & _Q_LOCKED_VAL))) { + spin_end(); if (__queued_spin_trylock_steal(lock)) return true; + spin_begin(); } else { 
yield_to_locked_owner(lock, val, paravirt); } @@ -344,6 +364,8 @@ static __always_inline bool try_to_steal_lock(struct qspinlock *lock, bool parav iters++; } while (iters < get_steal_spins(paravirt)); + spin_end(); + return false; } @@ -395,8 +417,10 @@ static __always_inline void queued_spin_lock_mcs_queue(struct qspinlock *lock, b WRITE_ONCE(prev->next, node); /* Wait for mcs node lock to be released */ + spin_begin(); while (!node->locked) yield_to_prev(lock, node, old, paravirt); + spin_end(); /* Clear out stale propagated yield_cpu */ if (paravirt && pv_yield_propagate_owner && node->yield_cpu != -1) @@ -407,6 +431,7 @@ static __always_inline void queued_spin_lock_mcs_queue(struct qspinlock *lock, b again: /* We're at the head of the waitqueue, wait for the lock. */ + spin_begin(); for (;;) { val = READ_ONCE(lock->val); if (!(val & _Q_LOCKED_VAL)) @@ -424,6 +449,7 @@ static __always_inline void queued_spin_lock_mcs_queue(struct qspinlock *lock, b val |= _Q_MUST_Q_VAL; } } + spin_end(); /* If we're the last queued, must clean up the tail. */ old = trylock_clean_tail(lock, tail); @@ -436,8 +462,13 @@ static __always_inline void queued_spin_lock_mcs_queue(struct qspinlock *lock, b goto release; /* We were the tail, no next. */ /* There is a next, must wait for node->next != NULL (MCS protocol) */ - while (!(next = READ_ONCE(node->next))) - cpu_relax(); + next = READ_ONCE(node->next); + if (!next) { + spin_begin(); + while (!(next = READ_ONCE(node->next))) + cpu_relax(); + spin_end(); + } /* * Unlock the next mcs waiter node. 
Release barrier is not required From patchwork Sat Nov 26 09:59:29 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nicholas Piggin X-Patchwork-Id: 1709210 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ozlabs.org (client-ip=2404:9400:2:0:216:3eff:fee1:b9f1; helo=lists.ozlabs.org; envelope-from=linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=Uz2ctzj8; dkim-atps=neutral Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2404:9400:2:0:216:3eff:fee1:b9f1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-384)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4NK6xK140Rz23nd for ; Sat, 26 Nov 2022 21:12:52 +1100 (AEDT) Received: from boromir.ozlabs.org (localhost [IPv6:::1]) by lists.ozlabs.org (Postfix) with ESMTP id 4NK6xJ3rf5z3f6B for ; Sat, 26 Nov 2022 21:12:52 +1100 (AEDT) Authentication-Results: lists.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=Uz2ctzj8; dkim-atps=neutral X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gmail.com (client-ip=2607:f8b0:4864:20::635; helo=mail-pl1-x635.google.com; envelope-from=npiggin@gmail.com; receiver=) Authentication-Results: lists.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com 
header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=Uz2ctzj8; dkim-atps=neutral Received: from mail-pl1-x635.google.com (mail-pl1-x635.google.com [IPv6:2607:f8b0:4864:20::635]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4NK6g046Fkz3f5N for ; Sat, 26 Nov 2022 21:00:28 +1100 (AEDT) Received: by mail-pl1-x635.google.com with SMTP id p24so2206850plw.1 for ; Sat, 26 Nov 2022 02:00:28 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=BBy5d72hkipKEzW7/+Qx9hfjXm8nInTVQ43zZOGgfU4=; b=Uz2ctzj8rweykYICpms/OKJqj5L0r7Tur5w+zXFNPYc1QNKs1iIwNATumYISt51BhJ YO38WUA9yQqG7rkEjXZs5oYVxUkUY/zHgU0ATD6wVsb0vZtsFQYjKQ17xShnLwi8THQN +awT95h4aD8AJZfG02BEnGCx6mb9L1IePwt5kiE+Xz/KHsq3bNRmlgXhU7L9XbqhW8KN SZ1nNOfR4MtanXoertuXh7ATi4D7tbbDIjgs3sdTpK3RV9JUC6oQdoP++kJhjd+yQOQ2 Jsy/bIMzj2eJfIvMDhx2/dERpahaRepuezeJ9MA0fV1ZFFnfb9QIN7/y24Qi4NJQMExx QNtg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=BBy5d72hkipKEzW7/+Qx9hfjXm8nInTVQ43zZOGgfU4=; b=LbYwLBK4t57rYsmZxUkEH8K3Bmq9j2kBlw25i/69HdYhcyqYF2dDrDFY1PtNnisZ+g l+XW20vGclp5v/3tUrbhqT3q/+jPPXUPvy1sL/Y7QhxDPC0juBKkIED5W4hUgyQHWvIG WGqMLEZfAH1xBjyLocr9Mk+RamRACWsMiNTGwd6tVfwYWHUGL95FkMJae4CiKopJxwYP LdUj2Kh0g92BLsa6jus4JLH0pdmybc7j/+9nh5mD8RZViwz25368fkk6VbpaTyO15zGH CmSo3Zyyt0p9nvb3hEUqX+eRXaouy355U5lo52c1eRyBcBZlLJLrZmbhvNRC9SVc+3LK IqVQ== X-Gm-Message-State: ANoB5pnE/ekUL+j6IQpOV842D7HHkegAyfa26VABAUvdUBwbKh2Ihoa/ eTz7o7GhCxwm0bHxKCJnq6OcJ1G1xAQoKw== 
X-Google-Smtp-Source: AA0mqf6Fn0hjvb8uHVGVZITKzghyrrD5jvcCRFMJrep6wm4aUvR2A+Y+h0toyeeghQcRJVHLp0oPPA== X-Received: by 2002:a17:902:7fc6:b0:189:680d:f063 with SMTP id t6-20020a1709027fc600b00189680df063mr6884206plb.55.1669456826024; Sat, 26 Nov 2022 02:00:26 -0800 (PST) Received: from bobo.ozlabs.ibm.com (110-174-181-90.tpgi.com.au. [110.174.181.90]) by smtp.gmail.com with ESMTPSA id j3-20020a17090a94c300b00213202d77d9sm4239243pjw.43.2022.11.26.02.00.23 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 26 Nov 2022 02:00:25 -0800 (PST) From: Nicholas Piggin To: linuxppc-dev@lists.ozlabs.org Subject: [PATCH v3 14/17] powerpc/qspinlock: reduce remote node steal spins Date: Sat, 26 Nov 2022 19:59:29 +1000 Message-Id: <20221126095932.1234527-15-npiggin@gmail.com> X-Mailer: git-send-email 2.37.2 In-Reply-To: <20221126095932.1234527-1-npiggin@gmail.com> References: <20221126095932.1234527-1-npiggin@gmail.com> MIME-Version: 1.0 X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Jordan Niethe , Laurent Dufour , Nicholas Piggin Errors-To: linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" Allow for a reduction in the number of times a CPU from a different node than the owner can attempt to steal the lock before queueing. This could bias the transfer behaviour of the lock across the machine and reduce NUMA crossings. 
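A minimal user-space sketch of the loop-exit test described above, assuming CONFIG_NUMA is enabled and that the spinner's and owner's node IDs are passed in directly rather than derived from the lock word. The names steal_break_model, my_node and owner_node are illustrative only and are not kernel API; the two budgets use the defaults from this patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for the kernel tunables; values are the defaults in this patch. */
static int steal_spins = 1 << 5;	/* general steal budget: 32 */
static int remote_steal_spins = 1 << 2;	/* budget for remote-node spinners: 4 */

/*
 * Model of the exit test: stop trying to steal once the general budget is
 * spent, or once the smaller remote budget is spent and the lock owner is
 * on a different NUMA node than the spinner.
 */
bool steal_break_model(int iters, int my_node, int owner_node)
{
	if (iters >= steal_spins)
		return true;
	if (iters >= remote_steal_spins && my_node != owner_node)
		return true;
	return false;
}

/* Usage: a remote spinner gives up after 4 tries, a local one after 32. */
void steal_break_model_check(void)
{
	assert(steal_break_model(4, 0, 1));	/* remote, budget spent */
	assert(!steal_break_model(4, 0, 0));	/* local, keeps spinning */
	assert(steal_break_model(32, 0, 0));	/* local, budget spent */
}
```

With these defaults a CPU on the owner's node gets eight times as many steal attempts as a remote CPU before it has to queue, which is the bias the description refers to.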
Signed-off-by: Nicholas Piggin --- arch/powerpc/lib/qspinlock.c | 43 +++++++++++++++++++++++++++++++++--- 1 file changed, 40 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/lib/qspinlock.c b/arch/powerpc/lib/qspinlock.c index 70f924296b36..7f6b41627351 100644 --- a/arch/powerpc/lib/qspinlock.c +++ b/arch/powerpc/lib/qspinlock.c @@ -4,6 +4,7 @@ #include #include #include +#include #include #include @@ -24,6 +25,7 @@ struct qnodes { /* Tuning parameters */ static int steal_spins __read_mostly = (1<<5); +static int remote_steal_spins __read_mostly = (1<<2); #if _Q_SPIN_TRY_LOCK_STEAL == 1 static const bool maybe_stealers = true; #else @@ -44,6 +46,11 @@ static __always_inline int get_steal_spins(bool paravirt) return steal_spins; } +static __always_inline int get_remote_steal_spins(bool paravirt) +{ + return remote_steal_spins; +} + static __always_inline int get_head_spins(bool paravirt) { return head_spins; @@ -335,10 +342,24 @@ static __always_inline void yield_to_prev(struct qspinlock *lock, struct qnode * spin_cpu_relax(); } +static __always_inline bool steal_break(u32 val, int iters, bool paravirt) +{ + if (iters >= get_steal_spins(paravirt)) + return true; + + if (IS_ENABLED(CONFIG_NUMA) && + (iters >= get_remote_steal_spins(paravirt))) { + int cpu = get_owner_cpu(val); + if (numa_node_id() != cpu_to_node(cpu)) + return true; + } + return false; +} static __always_inline bool try_to_steal_lock(struct qspinlock *lock, bool paravirt) { int iters = 0; + u32 val; if (!steal_spins) return false; @@ -347,8 +368,7 @@ static __always_inline bool try_to_steal_lock(struct qspinlock *lock, bool parav spin_begin(); do { - u32 val = READ_ONCE(lock->val); - + val = READ_ONCE(lock->val); if (val & _Q_MUST_Q_VAL) break; @@ -362,7 +382,7 @@ static __always_inline bool try_to_steal_lock(struct qspinlock *lock, bool parav } iters++; - } while (iters < get_steal_spins(paravirt)); + } while (!steal_break(val, iters, paravirt)); spin_end(); @@ -560,6 +580,22 @@ static int 
steal_spins_get(void *data, u64 *val) DEFINE_SIMPLE_ATTRIBUTE(fops_steal_spins, steal_spins_get, steal_spins_set, "%llu\n"); +static int remote_steal_spins_set(void *data, u64 val) +{ + remote_steal_spins = val; + + return 0; +} + +static int remote_steal_spins_get(void *data, u64 *val) +{ + *val = remote_steal_spins; + + return 0; +} + +DEFINE_SIMPLE_ATTRIBUTE(fops_remote_steal_spins, remote_steal_spins_get, remote_steal_spins_set, "%llu\n"); + static int head_spins_set(void *data, u64 val) { head_spins = val; @@ -659,6 +695,7 @@ DEFINE_SIMPLE_ATTRIBUTE(fops_pv_prod_head, pv_prod_head_get, pv_prod_head_set, " static __init int spinlock_debugfs_init(void) { debugfs_create_file("qspl_steal_spins", 0600, arch_debugfs_dir, NULL, &fops_steal_spins); + debugfs_create_file("qspl_remote_steal_spins", 0600, arch_debugfs_dir, NULL, &fops_remote_steal_spins); debugfs_create_file("qspl_head_spins", 0600, arch_debugfs_dir, NULL, &fops_head_spins); if (is_shared_processor()) { debugfs_create_file("qspl_pv_yield_owner", 0600, arch_debugfs_dir, NULL, &fops_pv_yield_owner); From patchwork Sat Nov 26 09:59:30 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nicholas Piggin X-Patchwork-Id: 1709211 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ozlabs.org (client-ip=112.213.38.117; helo=lists.ozlabs.org; envelope-from=linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=OAlzuPOR; dkim-atps=neutral Received: from lists.ozlabs.org (lists.ozlabs.org [112.213.38.117]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) 
key-exchange X25519 server-signature ECDSA (P-384)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4NK6yK1s0cz23nd for ; Sat, 26 Nov 2022 21:13:45 +1100 (AEDT) Received: from boromir.ozlabs.org (localhost [IPv6:::1]) by lists.ozlabs.org (Postfix) with ESMTP id 4NK6yK0j3fz3fTb for ; Sat, 26 Nov 2022 21:13:45 +1100 (AEDT) Authentication-Results: lists.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=OAlzuPOR; dkim-atps=neutral X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gmail.com (client-ip=2607:f8b0:4864:20::631; helo=mail-pl1-x631.google.com; envelope-from=npiggin@gmail.com; receiver=) Authentication-Results: lists.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=OAlzuPOR; dkim-atps=neutral Received: from mail-pl1-x631.google.com (mail-pl1-x631.google.com [IPv6:2607:f8b0:4864:20::631]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4NK6g16zZcz3f4q for ; Sat, 26 Nov 2022 21:00:29 +1100 (AEDT) Received: by mail-pl1-x631.google.com with SMTP id j12so5946678plj.5 for ; Sat, 26 Nov 2022 02:00:29 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=aSDoMsxU/5buSsdmn1gktVtkkU+EzdOmdhTBduOiP/c=; b=OAlzuPORnjplAa/NcVzkhGxYuhR1PrEHj/GhupBlmk/Uc182BAGnM9NIwA65Wz16lZ C+ARO1pfUg/xY6bQtTCcY9NkN/fD0OSdLoja7IDN/Uno6oRRXTiD6xTGbtxBgqqsUSyE 
VDVkESYdv9VNzvmjeT4Dj6ohl0oBTjd/oGk91MZjU+QFkpZKadaPZTFZ2ZYzElvZU5fN 6PcHBsbvBlMf3XZGV0AOQZNMl54hAy4PRhl/H6BznzgwxXLK4LxkTPhz/TOYD9zrN+uF hS+uF95g9QP2VEdk+Lr+rmk6Avy5f9uLlUo4e8IEYfcR8VpGTdQZpAJID4UwA/4IXka0 FpOw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=aSDoMsxU/5buSsdmn1gktVtkkU+EzdOmdhTBduOiP/c=; b=D0UdkUyiIMnc+ayPAYtJ7sQB+9wAPzzOA6rG7hL4f1CVej9sSLZbkvqf6a6LzeDIAt N0O4JqD0WYN5i53fQrOe2GhR25szfOcJNmT0QtGi2xl5j9p/jmT3WnPyOVQEk+uHv9e+ OSULh9Wx0/mPBlMb2NghR6hoyqZppDhGXrmoWbUOOSE66Q9NiE3zXrZG0r1jW+ZuA00E 63L/fKTaBRlTOBtKR54W9VTRE6mvHAjBHeUVPwlu1PKiDImy2RIvUm1sP3WvxFQMAS9v 1doEQa8qUMFmrM9KXpbu+0MnhmNhItjjFaMLJ9oaZ2vVqo6BsI2+qmBafv6S4VU/5/96 FYYA== X-Gm-Message-State: ANoB5pkAibcd+p69PWJ9qkTB9jd+cQKcPtDx2/u1h/2eUg5boQX/UEJj pqDm1YKxEO7zyb3uLH5Cr256KmISu5y4KA== X-Google-Smtp-Source: AA0mqf6KHw9I5L3GoTriAXjEZkYK2SEs1Td5e4riYlEMK58jsuYvrQoJBb3vOPe459aQIrJIbQ15Gg== X-Received: by 2002:a17:90a:898e:b0:218:bcab:96c6 with SMTP id v14-20020a17090a898e00b00218bcab96c6mr27426396pjn.46.1669456829025; Sat, 26 Nov 2022 02:00:29 -0800 (PST) Received: from bobo.ozlabs.ibm.com (110-174-181-90.tpgi.com.au. 
[110.174.181.90]) by smtp.gmail.com with ESMTPSA id j3-20020a17090a94c300b00213202d77d9sm4239243pjw.43.2022.11.26.02.00.26 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 26 Nov 2022 02:00:28 -0800 (PST) From: Nicholas Piggin To: linuxppc-dev@lists.ozlabs.org Subject: [PATCH v3 15/17] powerpc/qspinlock: allow indefinite spinning on a preempted owner Date: Sat, 26 Nov 2022 19:59:30 +1000 Message-Id: <20221126095932.1234527-16-npiggin@gmail.com> X-Mailer: git-send-email 2.37.2 In-Reply-To: <20221126095932.1234527-1-npiggin@gmail.com> References: <20221126095932.1234527-1-npiggin@gmail.com> MIME-Version: 1.0 X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Jordan Niethe , Laurent Dufour , Nicholas Piggin Errors-To: linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" Provide an option that holds off queueing indefinitely while the lock owner is preempted. This could reduce queueing latencies for very overcommitted vcpu situations. This is disabled by default. 
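A minimal user-space sketch of the iteration accounting this option changes, assuming the preemption result and the option value are passed in as plain booleans. The function name account_spin_model and its parameters are illustrative, not part of the kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Model of one pass through the spin loop: return the updated iteration
 * count. When the owner was seen preempted and the option is enabled, the
 * count is left alone, so the spinner never reaches the limit that would
 * force it to queue while the owner stays preempted.
 */
int account_spin_model(int iters, bool owner_preempted, bool spin_on_preempted)
{
	if (owner_preempted && spin_on_preempted)
		return iters;	/* hold off queueing: budget not consumed */
	return iters + 1;
}

/* Usage: only the "preempted and option enabled" case leaves iters unchanged. */
void account_spin_model_check(void)
{
	assert(account_spin_model(5, true, true) == 5);
	assert(account_spin_model(5, true, false) == 6);
	assert(account_spin_model(5, false, true) == 6);
}
```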
Signed-off-by: Nicholas Piggin --- arch/powerpc/lib/qspinlock.c | 77 +++++++++++++++++++++++++++++------- 1 file changed, 62 insertions(+), 15 deletions(-) diff --git a/arch/powerpc/lib/qspinlock.c b/arch/powerpc/lib/qspinlock.c index 7f6b41627351..1c9079489b50 100644 --- a/arch/powerpc/lib/qspinlock.c +++ b/arch/powerpc/lib/qspinlock.c @@ -35,6 +35,7 @@ static int head_spins __read_mostly = (1<<8); static bool pv_yield_owner __read_mostly = true; static bool pv_yield_allow_steal __read_mostly = false; +static bool pv_spin_on_preempted_owner __read_mostly = false; static bool pv_yield_prev __read_mostly = true; static bool pv_yield_propagate_owner __read_mostly = true; static bool pv_prod_head __read_mostly = false; @@ -191,11 +192,12 @@ static struct qnode *get_tail_qnode(struct qspinlock *lock, u32 val) BUG(); } -/* Called inside spin_begin() */ -static __always_inline void __yield_to_locked_owner(struct qspinlock *lock, u32 val, bool paravirt, bool mustq) +/* Called inside spin_begin(). Returns whether or not the vCPU was preempted. */ +static __always_inline bool __yield_to_locked_owner(struct qspinlock *lock, u32 val, bool paravirt, bool mustq) { int owner; u32 yield_count; + bool preempted = false; BUG_ON(!(val & _Q_LOCKED_VAL)); @@ -213,6 +215,8 @@ static __always_inline void __yield_to_locked_owner(struct qspinlock *lock, u32 spin_end(); + preempted = true; + /* * Read the lock word after sampling the yield count. On the other side * there may a wmb because the yield count update is done by the @@ -229,29 +233,32 @@ static __always_inline void __yield_to_locked_owner(struct qspinlock *lock, u32 if (mustq) set_mustq(lock); spin_begin(); + /* Don't relax if we yielded. Maybe we should? */ - return; + return preempted; } spin_begin(); relax: spin_cpu_relax(); + + return preempted; } -/* Called inside spin_begin() */ -static __always_inline void yield_to_locked_owner(struct qspinlock *lock, u32 val, bool paravirt) +/* Called inside spin_begin(). 
Returns whether or not the vCPU was preempted. */ +static __always_inline bool yield_to_locked_owner(struct qspinlock *lock, u32 val, bool paravirt) { - __yield_to_locked_owner(lock, val, paravirt, false); + return __yield_to_locked_owner(lock, val, paravirt, false); } -/* Called inside spin_begin() */ -static __always_inline void yield_head_to_locked_owner(struct qspinlock *lock, u32 val, bool paravirt) +/* Called inside spin_begin(). Returns whether or not the vCPU was preempted. */ +static __always_inline bool yield_head_to_locked_owner(struct qspinlock *lock, u32 val, bool paravirt) { bool mustq = false; if ((val & _Q_MUST_Q_VAL) && pv_yield_allow_steal) mustq = true; - __yield_to_locked_owner(lock, val, paravirt, mustq); + return __yield_to_locked_owner(lock, val, paravirt, mustq); } static __always_inline void propagate_yield_cpu(struct qnode *node, u32 val, int *set_yield_cpu, bool paravirt) @@ -361,13 +368,16 @@ static __always_inline bool try_to_steal_lock(struct qspinlock *lock, bool parav int iters = 0; u32 val; - if (!steal_spins) + if (!steal_spins) { + /* XXX: should spin_on_preempted_owner do anything here? */ return false; + } /* Attempt to steal the lock */ spin_begin(); - do { + bool preempted = false; + val = READ_ONCE(lock->val); if (val & _Q_MUST_Q_VAL) break; @@ -378,10 +388,23 @@ static __always_inline bool try_to_steal_lock(struct qspinlock *lock, bool parav return true; spin_begin(); } else { - yield_to_locked_owner(lock, val, paravirt); + preempted = yield_to_locked_owner(lock, val, paravirt); } - iters++; + if (preempted) { + if (!pv_spin_on_preempted_owner) + iters++; + /* + * pv_spin_on_preempted_owner don't increase iters + * while the owner is preempted -- we won't interfere + * with it by definition. This could introduce some + * latency issue if we continually observe preempted + * owners, but hopefully that's a rare corner case of + * a badly oversubscribed system. 
+ */ + } else { + iters++; + } } while (!steal_break(val, iters, paravirt)); spin_end(); @@ -453,15 +476,22 @@ static __always_inline void queued_spin_lock_mcs_queue(struct qspinlock *lock, b /* We're at the head of the waitqueue, wait for the lock. */ spin_begin(); for (;;) { + bool preempted; + val = READ_ONCE(lock->val); if (!(val & _Q_LOCKED_VAL)) break; propagate_yield_cpu(node, val, &set_yield_cpu, paravirt); - yield_head_to_locked_owner(lock, val, paravirt); + preempted = yield_head_to_locked_owner(lock, val, paravirt); if (!maybe_stealers) continue; - iters++; + if (preempted) { + if (!pv_spin_on_preempted_owner) + iters++; + } else { + iters++; + } if (!mustq && iters >= get_head_spins(paravirt)) { mustq = true; @@ -644,6 +674,22 @@ static int pv_yield_allow_steal_get(void *data, u64 *val) DEFINE_SIMPLE_ATTRIBUTE(fops_pv_yield_allow_steal, pv_yield_allow_steal_get, pv_yield_allow_steal_set, "%llu\n"); +static int pv_spin_on_preempted_owner_set(void *data, u64 val) +{ + pv_spin_on_preempted_owner = !!val; + + return 0; +} + +static int pv_spin_on_preempted_owner_get(void *data, u64 *val) +{ + *val = pv_spin_on_preempted_owner; + + return 0; +} + +DEFINE_SIMPLE_ATTRIBUTE(fops_pv_spin_on_preempted_owner, pv_spin_on_preempted_owner_get, pv_spin_on_preempted_owner_set, "%llu\n"); + static int pv_yield_prev_set(void *data, u64 val) { pv_yield_prev = !!val; @@ -700,6 +746,7 @@ static __init int spinlock_debugfs_init(void) if (is_shared_processor()) { debugfs_create_file("qspl_pv_yield_owner", 0600, arch_debugfs_dir, NULL, &fops_pv_yield_owner); debugfs_create_file("qspl_pv_yield_allow_steal", 0600, arch_debugfs_dir, NULL, &fops_pv_yield_allow_steal); + debugfs_create_file("qspl_pv_spin_on_preempted_owner", 0600, arch_debugfs_dir, NULL, &fops_pv_spin_on_preempted_owner); debugfs_create_file("qspl_pv_yield_prev", 0600, arch_debugfs_dir, NULL, &fops_pv_yield_prev); debugfs_create_file("qspl_pv_yield_propagate_owner", 0600, arch_debugfs_dir, NULL, 
&fops_pv_yield_propagate_owner); debugfs_create_file("qspl_pv_prod_head", 0600, arch_debugfs_dir, NULL, &fops_pv_prod_head); From patchwork Sat Nov 26 09:59:31 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nicholas Piggin X-Patchwork-Id: 1709212 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ozlabs.org (client-ip=2404:9400:2:0:216:3eff:fee1:b9f1; helo=lists.ozlabs.org; envelope-from=linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=cDwKckh2; dkim-atps=neutral Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2404:9400:2:0:216:3eff:fee1:b9f1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-384)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4NK6zL1QSVz23nd for ; Sat, 26 Nov 2022 21:14:37 +1100 (AEDT) Received: from boromir.ozlabs.org (localhost [IPv6:::1]) by lists.ozlabs.org (Postfix) with ESMTP id 4NK6zK5pRlz3f7g for ; Sat, 26 Nov 2022 21:14:37 +1100 (AEDT) Authentication-Results: lists.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=cDwKckh2; dkim-atps=neutral X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gmail.com (client-ip=2607:f8b0:4864:20::52c; helo=mail-pg1-x52c.google.com; envelope-from=npiggin@gmail.com; receiver=) 
Authentication-Results: lists.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=cDwKckh2; dkim-atps=neutral Received: from mail-pg1-x52c.google.com (mail-pg1-x52c.google.com [IPv6:2607:f8b0:4864:20::52c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4NK6g65Nsnz3f43 for ; Sat, 26 Nov 2022 21:00:34 +1100 (AEDT) Received: by mail-pg1-x52c.google.com with SMTP id b62so5799128pgc.0 for ; Sat, 26 Nov 2022 02:00:34 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=+WDgIGpgVyXC3bPLvnOIrXVCuTy+UkqPNEB6pgc7bTU=; b=cDwKckh2Wj1kSmyrTNMe/zQ9ccj7/tNCEpkplMdDhWlUotOFoOCoq5OpQ59xu4GbKc VkObLz8F9HLpT1HqUay1VLDEb5My6mXmLTKrihn3GypwaZspswmDa+VREK2+BmTxjO2e /vpdfIyuS4kNhGlu5C72BR1DroelTDLvtEg1+DJt+h4vBBbun3DyH+C1w+riUMi7CgBh zYwFWneCcKS6BJp8wndahjaQIK/EvDr8BpQkm0U3caDaW5jVX5u6CjMgXDteR0QRXsCB IoeR8ZL1+6DRDsTWA8VxQSaNtR/eAIi9QEZjMZgY4RH5h8sYbo6gO3BdFpDxxf7eVWG0 u17w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=+WDgIGpgVyXC3bPLvnOIrXVCuTy+UkqPNEB6pgc7bTU=; b=cfAIrm17kmxl1ZDmtn6SQo9Y22JIjvgVj42fV+THy8zkCPU45S1h2OuUv5VoLaz48+ LKZt+hwPqts/5/j9ywU//zbpC62UwOGjWugtJbj33klo4guDGGq4xXlRnPjqGPsqQ6K+ NMq877py7+ra+RLC3e7sJXuFh4t7wV6dSQhuhWKwEQ4Vmlwo60lTGEhBqaQ3ztcOQvaa Lax0YT5171fw1DzlGGTdSHaOc6dli34tzMqRTMXfkyFJoHR8w2YdbJLM8llU8zImjfIB cwRqq7ria395vmxfu86HGk9HNXbXCcymHrCeSvRHcChzP4GQE3XWFyDGgXd1gZl/kToX s4UA== X-Gm-Message-State: 
ANoB5pl+9ZVsZ1+S5S7KealYsebtIk6qS+wo/JIB6xbPdJfIvzwOhXjw Y0e+mI8Tma7Py0xmDDBOyRrCZLc7iK3CzQ== X-Google-Smtp-Source: AA0mqf7PRon+lRW9gGjrcZmlkEZKI3BJ154blw76aR3YAcH+2rJPjr1SB7YcQMKna0p4wp4wZylW7g== X-Received: by 2002:a63:e510:0:b0:476:a862:53d2 with SMTP id r16-20020a63e510000000b00476a86253d2mr18749678pgh.163.1669456832111; Sat, 26 Nov 2022 02:00:32 -0800 (PST) Received: from bobo.ozlabs.ibm.com (110-174-181-90.tpgi.com.au. [110.174.181.90]) by smtp.gmail.com with ESMTPSA id j3-20020a17090a94c300b00213202d77d9sm4239243pjw.43.2022.11.26.02.00.29 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 26 Nov 2022 02:00:31 -0800 (PST) From: Nicholas Piggin To: linuxppc-dev@lists.ozlabs.org Subject: [PATCH v3 16/17] powerpc/qspinlock: provide accounting and options for sleepy locks Date: Sat, 26 Nov 2022 19:59:31 +1000 Message-Id: <20221126095932.1234527-17-npiggin@gmail.com> X-Mailer: git-send-email 2.37.2 In-Reply-To: <20221126095932.1234527-1-npiggin@gmail.com> References: <20221126095932.1234527-1-npiggin@gmail.com> MIME-Version: 1.0 X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Jordan Niethe , Laurent Dufour , Nicholas Piggin Errors-To: linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" Finding the owner or a queued waiter on a lock with a preempted vcpu is indicative of an oversubscribed guest causing the lock to get into trouble. Provide some options to detect this situation and have new CPUs avoid queueing for a longer time (more steal iterations) to minimise the problems caused by vcpu preemption on the queue. 
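A minimal user-space sketch of the two pieces this description relies on: the sleepy bit in the lock word (bit 15 in this patch) and the scaled steal budget. It assumes the tunables are passed in as arguments; steal_budget_model and lock_is_sleepy_model are illustrative names, and the sketch leaves out the per-CPU timestamp and the atomic update of the lock word.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define Q_SLEEPY_OFFSET	15
#define Q_SLEEPY_VAL	(1U << Q_SLEEPY_OFFSET)	/* 0x00008000 */

/* True if some CPU has marked the lock as having preempted owner/waiters. */
bool lock_is_sleepy_model(uint32_t val)
{
	return (val & Q_SLEEPY_VAL) != 0;
}

/*
 * Model of the budget lookup: in a paravirt guest, a sleepy lock multiplies
 * the number of steal iterations allowed before the CPU must queue.
 */
int steal_budget_model(int steal_spins, int sleepy_factor,
		       bool paravirt, bool sleepy)
{
	if (paravirt && sleepy)
		return steal_spins * sleepy_factor;
	return steal_spins;
}

/* Usage with the patch defaults: 32 spins, factor 256. */
void sleepy_model_check(void)
{
	assert(lock_is_sleepy_model(0x00008001u));
	assert(steal_budget_model(32, 256, true, true) == 8192);
	assert(steal_budget_model(32, 256, false, true) == 32);
}
```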
Signed-off-by: Nicholas Piggin --- arch/powerpc/include/asm/qspinlock_types.h | 7 +- arch/powerpc/lib/qspinlock.c | 242 +++++++++++++++++++-- 2 files changed, 230 insertions(+), 19 deletions(-) diff --git a/arch/powerpc/include/asm/qspinlock_types.h b/arch/powerpc/include/asm/qspinlock_types.h index adfeed4aa495..4766a7aa03cb 100644 --- a/arch/powerpc/include/asm/qspinlock_types.h +++ b/arch/powerpc/include/asm/qspinlock_types.h @@ -30,7 +30,7 @@ typedef struct qspinlock { * * 0: locked bit * 1-14: lock holder cpu - * 15: unused bit + * 15: lock owner or queuer vcpus observed to be preempted bit * 16: must queue bit * 17-31: tail cpu (+1) */ @@ -50,6 +50,11 @@ typedef struct qspinlock { #error "qspinlock does not support such large CONFIG_NR_CPUS" #endif +/* 0x00008000 */ +#define _Q_SLEEPY_OFFSET 15 +#define _Q_SLEEPY_BITS 1 +#define _Q_SLEEPY_VAL (1U << _Q_SLEEPY_OFFSET) + /* 0x00010000 */ #define _Q_MUST_Q_OFFSET 16 #define _Q_MUST_Q_BITS 1 diff --git a/arch/powerpc/lib/qspinlock.c b/arch/powerpc/lib/qspinlock.c index 1c9079489b50..9a31b6147a23 100644 --- a/arch/powerpc/lib/qspinlock.c +++ b/arch/powerpc/lib/qspinlock.c @@ -5,6 +5,7 @@ #include #include #include +#include #include #include @@ -36,25 +37,56 @@ static int head_spins __read_mostly = (1<<8); static bool pv_yield_owner __read_mostly = true; static bool pv_yield_allow_steal __read_mostly = false; static bool pv_spin_on_preempted_owner __read_mostly = false; +static bool pv_sleepy_lock __read_mostly = true; +static bool pv_sleepy_lock_sticky __read_mostly = false; +static u64 pv_sleepy_lock_interval_ns __read_mostly = 0; +static int pv_sleepy_lock_factor __read_mostly = 256; static bool pv_yield_prev __read_mostly = true; static bool pv_yield_propagate_owner __read_mostly = true; static bool pv_prod_head __read_mostly = false; static DEFINE_PER_CPU_ALIGNED(struct qnodes, qnodes); +static DEFINE_PER_CPU_ALIGNED(u64, sleepy_lock_seen_clock); -static __always_inline int get_steal_spins(bool paravirt) 
+static __always_inline bool recently_sleepy(void) { - return steal_spins; + /* pv_sleepy_lock is true when this is called */ + if (pv_sleepy_lock_interval_ns) { + u64 seen = this_cpu_read(sleepy_lock_seen_clock); + + if (seen) { + u64 delta = sched_clock() - seen; + if (delta < pv_sleepy_lock_interval_ns) + return true; + this_cpu_write(sleepy_lock_seen_clock, 0); + } + } + + return false; } -static __always_inline int get_remote_steal_spins(bool paravirt) +static __always_inline int get_steal_spins(bool paravirt, bool sleepy) { - return remote_steal_spins; + if (paravirt && sleepy) + return steal_spins * pv_sleepy_lock_factor; + else + return steal_spins; } -static __always_inline int get_head_spins(bool paravirt) +static __always_inline int get_remote_steal_spins(bool paravirt, bool sleepy) { - return head_spins; + if (paravirt && sleepy) + return remote_steal_spins * pv_sleepy_lock_factor; + else + return remote_steal_spins; +} + +static __always_inline int get_head_spins(bool paravirt, bool sleepy) +{ + if (paravirt && sleepy) + return head_spins * pv_sleepy_lock_factor; + else + return head_spins; } static inline u32 encode_tail_cpu(int cpu) @@ -168,6 +200,56 @@ static __always_inline u32 clear_mustq(struct qspinlock *lock) return prev; } +static __always_inline bool try_set_sleepy(struct qspinlock *lock, u32 old) +{ + u32 prev; + u32 new = old | _Q_SLEEPY_VAL; + + BUG_ON(!(old & _Q_LOCKED_VAL)); + BUG_ON(old & _Q_SLEEPY_VAL); + + asm volatile( +"1: lwarx %0,0,%1 # try_set_sleepy \n" +" cmpw 0,%0,%2 \n" +" bne- 2f \n" +" stwcx. 
%3,0,%1 \n" +" bne- 1b \n" +"2: \n" + : "=&r" (prev) + : "r" (&lock->val), "r"(old), "r" (new) + : "cr0", "memory"); + + return likely(prev == old); +} + +static __always_inline void seen_sleepy_owner(struct qspinlock *lock, u32 val) +{ + if (pv_sleepy_lock) { + if (pv_sleepy_lock_interval_ns) + this_cpu_write(sleepy_lock_seen_clock, sched_clock()); + if (!(val & _Q_SLEEPY_VAL)) + try_set_sleepy(lock, val); + } +} + +static __always_inline void seen_sleepy_lock(void) +{ + if (pv_sleepy_lock && pv_sleepy_lock_interval_ns) + this_cpu_write(sleepy_lock_seen_clock, sched_clock()); +} + +static __always_inline void seen_sleepy_node(struct qspinlock *lock, u32 val) +{ + if (pv_sleepy_lock) { + if (pv_sleepy_lock_interval_ns) + this_cpu_write(sleepy_lock_seen_clock, sched_clock()); + if (val & _Q_LOCKED_VAL) { + if (!(val & _Q_SLEEPY_VAL)) + try_set_sleepy(lock, val); + } + } +} + static struct qnode *get_tail_qnode(struct qspinlock *lock, u32 val) { int cpu = decode_tail_cpu(val); @@ -215,6 +297,7 @@ static __always_inline bool __yield_to_locked_owner(struct qspinlock *lock, u32 spin_end(); + seen_sleepy_owner(lock, val); preempted = true; /* @@ -289,11 +372,12 @@ static __always_inline void propagate_yield_cpu(struct qnode *node, u32 val, int } /* Called inside spin_begin() */ -static __always_inline void yield_to_prev(struct qspinlock *lock, struct qnode *node, u32 val, bool paravirt) +static __always_inline bool yield_to_prev(struct qspinlock *lock, struct qnode *node, u32 val, bool paravirt) { int prev_cpu = decode_tail_cpu(val); u32 yield_count; int yield_cpu; + bool preempted = false; if (!paravirt) goto relax; @@ -315,6 +399,9 @@ static __always_inline void yield_to_prev(struct qspinlock *lock, struct qnode * spin_end(); + preempted = true; + seen_sleepy_node(lock, val); + smp_rmb(); if (yield_cpu == node->yield_cpu) { @@ -322,7 +409,7 @@ static __always_inline void yield_to_prev(struct qspinlock *lock, struct qnode * node->next->yield_cpu = yield_cpu; 
yield_to_preempted(yield_cpu, yield_count); spin_begin(); - return; + return preempted; } spin_begin(); @@ -336,26 +423,31 @@ static __always_inline void yield_to_prev(struct qspinlock *lock, struct qnode * spin_end(); + preempted = true; + seen_sleepy_node(lock, val); + smp_rmb(); /* See __yield_to_locked_owner comment */ if (!node->locked) { yield_to_preempted(prev_cpu, yield_count); spin_begin(); - return; + return preempted; } spin_begin(); relax: spin_cpu_relax(); + + return preempted; } -static __always_inline bool steal_break(u32 val, int iters, bool paravirt) +static __always_inline bool steal_break(u32 val, int iters, bool paravirt, bool sleepy) { - if (iters >= get_steal_spins(paravirt)) + if (iters >= get_steal_spins(paravirt, sleepy)) return true; if (IS_ENABLED(CONFIG_NUMA) && - (iters >= get_remote_steal_spins(paravirt))) { + (iters >= get_remote_steal_spins(paravirt, sleepy))) { int cpu = get_owner_cpu(val); if (numa_node_id() != cpu_to_node(cpu)) return true; @@ -365,6 +457,8 @@ static __always_inline bool steal_break(u32 val, int iters, bool paravirt) static __always_inline bool try_to_steal_lock(struct qspinlock *lock, bool paravirt) { + bool seen_preempted = false; + bool sleepy = false; int iters = 0; u32 val; @@ -391,7 +485,25 @@ static __always_inline bool try_to_steal_lock(struct qspinlock *lock, bool parav preempted = yield_to_locked_owner(lock, val, paravirt); } + if (paravirt && pv_sleepy_lock) { + if (!sleepy) { + if (val & _Q_SLEEPY_VAL) { + seen_sleepy_lock(); + sleepy = true; + } else if (recently_sleepy()) { + sleepy = true; + } + } + if (pv_sleepy_lock_sticky && seen_preempted && + !(val & _Q_SLEEPY_VAL)) { + if (try_set_sleepy(lock, val)) + val |= _Q_SLEEPY_VAL; + } + } + if (preempted) { + seen_preempted = true; + sleepy = true; if (!pv_spin_on_preempted_owner) iters++; /* @@ -405,7 +517,7 @@ static __always_inline bool try_to_steal_lock(struct qspinlock *lock, bool parav } else { iters++; } - } while (!steal_break(val, iters, 
paravirt)); + } while (!steal_break(val, iters, paravirt, sleepy)); spin_end(); @@ -417,6 +529,8 @@ static __always_inline void queued_spin_lock_mcs_queue(struct qspinlock *lock, b struct qnodes *qnodesp; struct qnode *next, *node; u32 val, old, tail; + bool seen_preempted = false; + bool sleepy = false; bool mustq = false; int idx; int set_yield_cpu = -1; @@ -461,8 +575,10 @@ static __always_inline void queued_spin_lock_mcs_queue(struct qspinlock *lock, b /* Wait for mcs node lock to be released */ spin_begin(); - while (!node->locked) - yield_to_prev(lock, node, old, paravirt); + while (!node->locked) { + if (yield_to_prev(lock, node, old, paravirt)) + seen_preempted = true; + } spin_end(); /* Clear out stale propagated yield_cpu */ @@ -472,8 +588,8 @@ static __always_inline void queued_spin_lock_mcs_queue(struct qspinlock *lock, b smp_rmb(); /* acquire barrier for the mcs lock */ } -again: /* We're at the head of the waitqueue, wait for the lock. */ +again: spin_begin(); for (;;) { bool preempted; @@ -482,18 +598,40 @@ static __always_inline void queued_spin_lock_mcs_queue(struct qspinlock *lock, b if (!(val & _Q_LOCKED_VAL)) break; + if (paravirt && pv_sleepy_lock && maybe_stealers) { + if (!sleepy) { + if (val & _Q_SLEEPY_VAL) { + seen_sleepy_lock(); + sleepy = true; + } else if (recently_sleepy()) { + sleepy = true; + } + } + if (pv_sleepy_lock_sticky && seen_preempted && + !(val & _Q_SLEEPY_VAL)) { + if (try_set_sleepy(lock, val)) + val |= _Q_SLEEPY_VAL; + } + } + propagate_yield_cpu(node, val, &set_yield_cpu, paravirt); preempted = yield_head_to_locked_owner(lock, val, paravirt); if (!maybe_stealers) continue; - if (preempted) { + + if (preempted) + seen_preempted = true; + + if (paravirt && preempted) { + sleepy = true; + if (!pv_spin_on_preempted_owner) iters++; } else { iters++; } - if (!mustq && iters >= get_head_spins(paravirt)) { + if (!mustq && iters >= get_head_spins(paravirt, sleepy)) { mustq = true; set_mustq(lock); val |= _Q_MUST_Q_VAL; @@ -690,6 
+828,70 @@ static int pv_spin_on_preempted_owner_get(void *data, u64 *val) DEFINE_SIMPLE_ATTRIBUTE(fops_pv_spin_on_preempted_owner, pv_spin_on_preempted_owner_get, pv_spin_on_preempted_owner_set, "%llu\n"); +static int pv_sleepy_lock_set(void *data, u64 val) +{ + pv_sleepy_lock = !!val; + + return 0; +} + +static int pv_sleepy_lock_get(void *data, u64 *val) +{ + *val = pv_sleepy_lock; + + return 0; +} + +DEFINE_SIMPLE_ATTRIBUTE(fops_pv_sleepy_lock, pv_sleepy_lock_get, pv_sleepy_lock_set, "%llu\n"); + +static int pv_sleepy_lock_sticky_set(void *data, u64 val) +{ + pv_sleepy_lock_sticky = !!val; + + return 0; +} + +static int pv_sleepy_lock_sticky_get(void *data, u64 *val) +{ + *val = pv_sleepy_lock_sticky; + + return 0; +} + +DEFINE_SIMPLE_ATTRIBUTE(fops_pv_sleepy_lock_sticky, pv_sleepy_lock_sticky_get, pv_sleepy_lock_sticky_set, "%llu\n"); + +static int pv_sleepy_lock_interval_ns_set(void *data, u64 val) +{ + pv_sleepy_lock_interval_ns = val; + + return 0; +} + +static int pv_sleepy_lock_interval_ns_get(void *data, u64 *val) +{ + *val = pv_sleepy_lock_interval_ns; + + return 0; +} + +DEFINE_SIMPLE_ATTRIBUTE(fops_pv_sleepy_lock_interval_ns, pv_sleepy_lock_interval_ns_get, pv_sleepy_lock_interval_ns_set, "%llu\n"); + +static int pv_sleepy_lock_factor_set(void *data, u64 val) +{ + pv_sleepy_lock_factor = val; + + return 0; +} + +static int pv_sleepy_lock_factor_get(void *data, u64 *val) +{ + *val = pv_sleepy_lock_factor; + + return 0; +} + +DEFINE_SIMPLE_ATTRIBUTE(fops_pv_sleepy_lock_factor, pv_sleepy_lock_factor_get, pv_sleepy_lock_factor_set, "%llu\n"); + static int pv_yield_prev_set(void *data, u64 val) { pv_yield_prev = !!val; @@ -747,6 +949,10 @@ static __init int spinlock_debugfs_init(void) debugfs_create_file("qspl_pv_yield_owner", 0600, arch_debugfs_dir, NULL, &fops_pv_yield_owner); debugfs_create_file("qspl_pv_yield_allow_steal", 0600, arch_debugfs_dir, NULL, &fops_pv_yield_allow_steal); debugfs_create_file("qspl_pv_spin_on_preempted_owner", 0600, 
arch_debugfs_dir, NULL, &fops_pv_spin_on_preempted_owner); + debugfs_create_file("qspl_pv_sleepy_lock", 0600, arch_debugfs_dir, NULL, &fops_pv_sleepy_lock); + debugfs_create_file("qspl_pv_sleepy_lock_sticky", 0600, arch_debugfs_dir, NULL, &fops_pv_sleepy_lock_sticky); + debugfs_create_file("qspl_pv_sleepy_lock_interval_ns", 0600, arch_debugfs_dir, NULL, &fops_pv_sleepy_lock_interval_ns); + debugfs_create_file("qspl_pv_sleepy_lock_factor", 0600, arch_debugfs_dir, NULL, &fops_pv_sleepy_lock_factor); debugfs_create_file("qspl_pv_yield_prev", 0600, arch_debugfs_dir, NULL, &fops_pv_yield_prev); debugfs_create_file("qspl_pv_yield_propagate_owner", 0600, arch_debugfs_dir, NULL, &fops_pv_yield_propagate_owner); debugfs_create_file("qspl_pv_prod_head", 0600, arch_debugfs_dir, NULL, &fops_pv_prod_head); From patchwork Sat Nov 26 09:59:32 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nicholas Piggin X-Patchwork-Id: 1709213 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ozlabs.org (client-ip=2404:9400:2:0:216:3eff:fee1:b9f1; helo=lists.ozlabs.org; envelope-from=linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=mmZW66co; dkim-atps=neutral Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2404:9400:2:0:216:3eff:fee1:b9f1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-384)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4NK70W01sTz23nT for ; Sat, 26 Nov 2022 21:15:38 +1100 (AEDT) Received: from boromir.ozlabs.org 
(localhost [IPv6:::1]) by lists.ozlabs.org (Postfix) with ESMTP id 4NK70V4Zwqz3fXF for ; Sat, 26 Nov 2022 21:15:38 +1100 (AEDT) Authentication-Results: lists.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=mmZW66co; dkim-atps=neutral X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gmail.com (client-ip=2607:f8b0:4864:20::536; helo=mail-pg1-x536.google.com; envelope-from=npiggin@gmail.com; receiver=) Authentication-Results: lists.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20210112 header.b=mmZW66co; dkim-atps=neutral Received: from mail-pg1-x536.google.com (mail-pg1-x536.google.com [IPv6:2607:f8b0:4864:20::536]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4NK6g95pVyz3f56 for ; Sat, 26 Nov 2022 21:00:37 +1100 (AEDT) Received: by mail-pg1-x536.google.com with SMTP id 136so5786553pga.1 for ; Sat, 26 Nov 2022 02:00:37 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=rEGJDewpv09S0gFfTmgnPDtO7efHwYjo4HjTISWJlio=; b=mmZW66co9BPoouqOpOBHVHafX3DVMJSEAG72V8fCU83TZV69jq2ov5nLxOaXqemu3s RgdnAaTvsQrSBYCpwAErqk90ZmGRrdPbgg/poFUWmkdI/J0tlvHMBjKYZ0IB7ZekuSzr hkvFNVPlfaejdbpIAw8jHqJRY8fofhK85u2kmTFi/91PvDWcf81ERXJd9oDEwdmx61YJ kr6FCUlffa+VgH40+296gPiHpU9B1c2wGNBbQPrzy0BSJWGSCgHvxJv9Uws+sADIkpS1 3F/m8va56YxGhVbZWB7yfHkwiqD2pbiWeZljidyPbQJq/jjXEnWm3/mK0GW97OEixPH5 pBwA== X-Google-DKIM-Signature: 
v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=rEGJDewpv09S0gFfTmgnPDtO7efHwYjo4HjTISWJlio=; b=vnaRxez70aR5Ww2x2DjBHAPFatZE/F1Ydq+IWG7mO/ggpWeCadJ64Q0cNQRAhLLLFa GRsar718W0pMKGR7TQhd3TndmZtw0hQFStn+v7srM3z64Wd6EOS3Sn6sXqn3rbPuGeM2 z9TbM27Si0LHYQbNBnCgsWNoVONl5Mo95XMFNFcfsbAbzHyW3ONOP4SlZAr6h7jyrRjq xdpMoKD7yUNv23iIfJ9wcJ9WmtSiwS+FuQ7Rqwy0lkoay1iUxfwL4davSDioTS6zAbh0 r9NEC7lCrUfeyZMCWfE4uBDCZWnRl7s9Bs+B69kdC3YZD/9zFyJkH/uvAAheIt4dGLUg 1yjA== X-Gm-Message-State: ANoB5pnz2f05HZkkynOg8QOutvWyXglLGxs2m2oetX/D+SNUmRdVDdJ2 GvvAsOvVTBbOP0hJtmIbQN/Xz/CzVOf3Nw== X-Google-Smtp-Source: AA0mqf7icrlTe3nxBQES7D7loSmTGSNHeH28L2/L6d/gYQ7J8wgGdZAI35Q/TMbybWLBMSNiZnmf5g== X-Received: by 2002:a63:f95a:0:b0:46f:5be0:feb9 with SMTP id q26-20020a63f95a000000b0046f5be0feb9mr37035270pgk.485.1669456835095; Sat, 26 Nov 2022 02:00:35 -0800 (PST) Received: from bobo.ozlabs.ibm.com (110-174-181-90.tpgi.com.au. 
[110.174.181.90]) by smtp.gmail.com with ESMTPSA id j3-20020a17090a94c300b00213202d77d9sm4239243pjw.43.2022.11.26.02.00.32 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 26 Nov 2022 02:00:34 -0800 (PST) From: Nicholas Piggin To: linuxppc-dev@lists.ozlabs.org Subject: [PATCH v3 17/17] powerpc/qspinlock: add compile-time tuning adjustments Date: Sat, 26 Nov 2022 19:59:32 +1000 Message-Id: <20221126095932.1234527-18-npiggin@gmail.com> X-Mailer: git-send-email 2.37.2 In-Reply-To: <20221126095932.1234527-1-npiggin@gmail.com> References: <20221126095932.1234527-1-npiggin@gmail.com> MIME-Version: 1.0 X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Jordan Niethe , Laurent Dufour , Nicholas Piggin Errors-To: linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" This adds compile-time options that allow the EH lock hint bit to be enabled or disabled, and adds some new options that may or may not help matters. To help with experimentation and tuning. Signed-off-by: Nicholas Piggin --- arch/powerpc/include/asm/qspinlock.h | 61 ++++++++++++++++++++++++++-- arch/powerpc/lib/qspinlock.c | 39 ++++++++++++++++-- 2 files changed, 94 insertions(+), 6 deletions(-) diff --git a/arch/powerpc/include/asm/qspinlock.h b/arch/powerpc/include/asm/qspinlock.h index c9fa83bba1d5..9e71f8de7b12 100644 --- a/arch/powerpc/include/asm/qspinlock.h +++ b/arch/powerpc/include/asm/qspinlock.h @@ -5,15 +5,68 @@ #include #include +#ifdef CONFIG_PPC64 +/* + * Use the EH=1 hint for accesses that result in the lock being acquired. + * The hardware is supposed to optimise this pattern by holding the lock + * cacheline longer, and releasing when a store to the same memory (the + * unlock) is performed. 
+ */ +#define _Q_SPIN_EH_HINT 1 +#else +#define _Q_SPIN_EH_HINT 0 +#endif + /* * The trylock itself may steal. This makes trylocks slightly stronger, and - * might make spin locks slightly more efficient when stealing. + * makes locks slightly more efficient when stealing. * * This is compile-time, so if true then there may always be stealers, so the * nosteal paths become unused. */ #define _Q_SPIN_TRY_LOCK_STEAL 1 +/* + * Put a speculation barrier after testing the lock/node and finding it + * busy. Try to prevent pointless speculation in slow paths. + * + * Slows down the lockstorm microbenchmark with no stealing, where locking + * is purely FIFO through the queue. May have more benefit in real workloads + * where speculating into the wrong place could have a greater cost. + */ +#define _Q_SPIN_SPEC_BARRIER 0 + +#ifdef CONFIG_PPC64 +/* + * Execute a miso instruction after passing the MCS lock ownership to the + * queue head. Miso is intended to make stores visible to other CPUs sooner. + * + * This seems to make the lockstorm microbenchmark nospin test go slightly + * faster on POWER10, but disabled for now. + */ +#define _Q_SPIN_MISO 0 +#else +#define _Q_SPIN_MISO 0 +#endif + +#ifdef CONFIG_PPC64 +/* + * This executes miso after an unlock of the lock word, having ownership + * pass to the next CPU sooner. This will slow the uncontended path to some + * degree. No evidence it helps yet. + */ +#define _Q_SPIN_MISO_UNLOCK 0 +#else +#define _Q_SPIN_MISO_UNLOCK 0 +#endif + +/* + * Seems to slow down lockstorm microbenchmark, suspect queue node just + * has to become shared again right afterwards when its waiter spins on + * the lock field. 
+ */ +#define _Q_SPIN_PREFETCH_NEXT 0 + static __always_inline int queued_spin_is_locked(struct qspinlock *lock) { return READ_ONCE(lock->val); @@ -51,7 +104,7 @@ static __always_inline int __queued_spin_trylock_nosteal(struct qspinlock *lock) "2: \n" : "=&r" (prev) : "r" (&lock->val), "r" (new), - "i" (IS_ENABLED(CONFIG_PPC64)) + "i" (_Q_SPIN_EH_HINT) : "cr0", "memory"); return likely(prev == 0); @@ -75,7 +128,7 @@ static __always_inline int __queued_spin_trylock_steal(struct qspinlock *lock) "2: \n" : "=&r" (prev), "=&r" (tmp) : "r" (&lock->val), "r" (new), "r" (_Q_TAIL_CPU_MASK), - "i" (IS_ENABLED(CONFIG_PPC64)) + "i" (_Q_SPIN_EH_HINT) : "cr0", "memory"); return likely(!(prev & ~_Q_TAIL_CPU_MASK)); @@ -100,6 +153,8 @@ static __always_inline void queued_spin_lock(struct qspinlock *lock) static inline void queued_spin_unlock(struct qspinlock *lock) { smp_store_release(&lock->locked, 0); + if (_Q_SPIN_MISO_UNLOCK) + asm volatile("miso" ::: "memory"); } #define arch_spin_is_locked(l) queued_spin_is_locked(l) diff --git a/arch/powerpc/lib/qspinlock.c b/arch/powerpc/lib/qspinlock.c index 9a31b6147a23..2eab84774911 100644 --- a/arch/powerpc/lib/qspinlock.c +++ b/arch/powerpc/lib/qspinlock.c @@ -48,6 +48,12 @@ static bool pv_prod_head __read_mostly = false; static DEFINE_PER_CPU_ALIGNED(struct qnodes, qnodes); static DEFINE_PER_CPU_ALIGNED(u64, sleepy_lock_seen_clock); +#if _Q_SPIN_SPEC_BARRIER == 1 +#define spec_barrier() do { asm volatile("ori 31,31,0" ::: "memory"); } while (0) +#else +#define spec_barrier() do { } while (0) +#endif + static __always_inline bool recently_sleepy(void) { /* pv_sleepy_lock is true when this is called */ @@ -137,7 +143,7 @@ static __always_inline u32 trylock_clean_tail(struct qspinlock *lock, u32 tail) : "r" (&lock->val), "r"(tail), "r" (newval), "i" (_Q_LOCKED_VAL), "r" (_Q_TAIL_CPU_MASK), - "i" (IS_ENABLED(CONFIG_PPC64)) + "i" (_Q_SPIN_EH_HINT) : "cr0", "memory"); return prev; @@ -475,6 +481,7 @@ static __always_inline bool 
try_to_steal_lock(struct qspinlock *lock, bool parav val = READ_ONCE(lock->val); if (val & _Q_MUST_Q_VAL) break; + spec_barrier(); if (unlikely(!(val & _Q_LOCKED_VAL))) { spin_end(); @@ -540,6 +547,7 @@ static __always_inline void queued_spin_lock_mcs_queue(struct qspinlock *lock, b qnodesp = this_cpu_ptr(&qnodes); if (unlikely(qnodesp->count >= MAX_NODES)) { + spec_barrier(); while (!queued_spin_trylock(lock)) cpu_relax(); return; @@ -576,9 +584,12 @@ static __always_inline void queued_spin_lock_mcs_queue(struct qspinlock *lock, b /* Wait for mcs node lock to be released */ spin_begin(); while (!node->locked) { + spec_barrier(); + if (yield_to_prev(lock, node, old, paravirt)) seen_preempted = true; } + spec_barrier(); spin_end(); /* Clear out stale propagated yield_cpu */ @@ -586,6 +597,17 @@ static __always_inline void queued_spin_lock_mcs_queue(struct qspinlock *lock, b node->yield_cpu = -1; smp_rmb(); /* acquire barrier for the mcs lock */ + + /* + * Generic qspinlocks have this prefetch here, but it seems + * like it could cause additional line transitions because + * the waiter will keep loading from it. + */ + if (_Q_SPIN_PREFETCH_NEXT) { + next = READ_ONCE(node->next); + if (next) + prefetchw(next); + } } /* We're at the head of the waitqueue, wait for the lock. */ @@ -597,6 +619,7 @@ static __always_inline void queued_spin_lock_mcs_queue(struct qspinlock *lock, b val = READ_ONCE(lock->val); if (!(val & _Q_LOCKED_VAL)) break; + spec_barrier(); if (paravirt && pv_sleepy_lock && maybe_stealers) { if (!sleepy) { @@ -637,6 +660,7 @@ static __always_inline void queued_spin_lock_mcs_queue(struct qspinlock *lock, b val |= _Q_MUST_Q_VAL; } } + spec_barrier(); spin_end(); /* If we're the last queued, must clean up the tail. */ @@ -657,6 +681,7 @@ static __always_inline void queued_spin_lock_mcs_queue(struct qspinlock *lock, b cpu_relax(); spin_end(); } + spec_barrier(); /* * Unlock the next mcs waiter node. 
Release barrier is not required @@ -668,10 +693,14 @@ static __always_inline void queued_spin_lock_mcs_queue(struct qspinlock *lock, b if (paravirt && pv_prod_head) { int next_cpu = next->cpu; WRITE_ONCE(next->locked, 1); + if (_Q_SPIN_MISO) + asm volatile("miso" ::: "memory"); if (vcpu_is_preempted(next_cpu)) prod_cpu(next_cpu); } else { WRITE_ONCE(next->locked, 1); + if (_Q_SPIN_MISO) + asm volatile("miso" ::: "memory"); } release: @@ -686,12 +715,16 @@ void queued_spin_lock_slowpath(struct qspinlock *lock) * is passed as the paravirt argument to the functions. */ if (IS_ENABLED(CONFIG_PARAVIRT_SPINLOCKS) && is_shared_processor()) { - if (try_to_steal_lock(lock, true)) + if (try_to_steal_lock(lock, true)) { + spec_barrier(); return; + } queued_spin_lock_mcs_queue(lock, true); } else { - if (try_to_steal_lock(lock, false)) + if (try_to_steal_lock(lock, false)) { + spec_barrier(); return; + } queued_spin_lock_mcs_queue(lock, false); } }
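
The patch above gates each experiment on a 0/1 preprocessor constant, either with "#if X == 1" around a macro definition (spec_barrier) or with a plain "if (X)" in the code path (the miso calls). A minimal userspace sketch of that pattern follows. Everything prefixed demo_ or DEMO_ is hypothetical and not part of the patch, and the empty asm statement is only a compiler barrier standing in for the powerpc "ori 31,31,0" instruction, so this shows the structure of the tuning knobs, not their hardware effect.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical tunables modelled on the _Q_SPIN_* defines; not from the patch. */
#define DEMO_SPEC_BARRIER	0
#define DEMO_POST_STORE_HINT	1

#if DEMO_SPEC_BARRIER == 1
/* Stand-in only: a compiler barrier, not the powerpc "ori 31,31,0" instruction. */
#define demo_spec_barrier() do { __asm__ volatile("" ::: "memory"); } while (0)
#else
#define demo_spec_barrier() do { } while (0)
#endif

/* Counts how often the optional post-store hook ran, so its effect is observable. */
static int demo_hint_calls;

static void demo_post_store_hint(void)
{
	demo_hint_calls++;
}

/*
 * Clear the lock word. The condition is a compile-time constant, so an
 * optimising compiler can usually drop the branch when the tunable is 0.
 */
static void demo_unlock(volatile int *locked)
{
	*locked = 0;
	if (DEMO_POST_STORE_HINT)
		demo_post_store_hint();
}

/* Spin until the lock reads free or the budget runs out; true if it was free. */
static bool demo_wait(volatile int *locked, int budget)
{
	while (budget-- > 0) {
		if (!*locked)
			return true;
		demo_spec_barrier();	/* placed after finding the lock busy */
	}
	return false;
}
```

Flipping DEMO_POST_STORE_HINT or DEMO_SPEC_BARRIER and rebuilding changes the generated code without touching any call site, which is the property that makes this style convenient for benchmarking one option at a time.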