From patchwork Mon Apr 26 21:41:11 2021
X-Patchwork-Submitter: "Guilherme G. Piccoli"
X-Patchwork-Id: 1470486
From: "Guilherme G. Piccoli"
To: kernel-team@lists.ubuntu.com
Subject: [B:linux][B:linux-aws][PATCH 1/2] locking/barriers: Introduce smp_cond_load_relaxed() and atomic_cond_read_relaxed()
Date: Mon, 26 Apr 2021 18:41:11 -0300
Message-Id: <20210426214113.27372-2-gpiccoli@canonical.com>
In-Reply-To: <20210426214113.27372-1-gpiccoli@canonical.com>
References: <20210426214113.27372-1-gpiccoli@canonical.com>

From: Will Deacon

BugLink: https://bugs.launchpad.net/bugs/1926184

Whilst we currently provide smp_cond_load_acquire() and
atomic_cond_read_acquire(), there are cases where the ACQUIRE semantics
are not required because of a subsequent fence or release operation once
the conditional loop has exited.

This patch adds relaxed versions of the conditional spinning primitives
to avoid unnecessary barrier overhead on architectures such as arm64.

Signed-off-by: Will Deacon
Acked-by: Peter Zijlstra (Intel)
Acked-by: Waiman Long
Cc: Linus Torvalds
Cc: Thomas Gleixner
Cc: boqun.feng@gmail.com
Cc: linux-arm-kernel@lists.infradead.org
Cc: paulmck@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/1524738868-31318-2-git-send-email-will.deacon@arm.com
Signed-off-by: Ingo Molnar
(cherry picked from commit fcfdfe30e324725007e9dc5814b62a4c430ea909)
Signed-off-by: Guilherme G. Piccoli
---
 include/asm-generic/atomic-long.h |  2 ++
 include/asm-generic/barrier.h     | 27 +++++++++++++++++++++------
 include/linux/atomic.h            |  2 ++
 3 files changed, 25 insertions(+), 6 deletions(-)
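Note (illustration only, not part of the patch): the relaxed spin is
meant for call sites where ordering is supplied anyway by a later
ACQUIRE operation or fence. Patch 2/2 of this series adopts exactly
this shape in the qrwlock write-lock slowpath:

	/* Spin with no barriers; the ACQUIRE comes from the cmpxchg that
	 * actually takes the lock, not from the load loop. */
	do {
		cnts = atomic_cond_read_relaxed(&lock->cnts, VAL == _QW_WAITING);
	} while (!atomic_try_cmpxchg_acquire(&lock->cnts, &cnts, _QW_LOCKED));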
diff --git a/include/asm-generic/atomic-long.h b/include/asm-generic/atomic-long.h
index 34a028a7bcc5..5b2b0b5ea06d 100644
--- a/include/asm-generic/atomic-long.h
+++ b/include/asm-generic/atomic-long.h
@@ -244,6 +244,8 @@ static inline long atomic_long_add_unless(atomic_long_t *l, long a, long u)
 #define atomic_long_inc_not_zero(l) \
 	ATOMIC_LONG_PFX(_inc_not_zero)((ATOMIC_LONG_PFX(_t) *)(l))
 
+#define atomic_long_cond_read_relaxed(v, c) \
+	ATOMIC_LONG_PFX(_cond_read_relaxed)((ATOMIC_LONG_PFX(_t) *)(v), (c))
 #define atomic_long_cond_read_acquire(v, c) \
 	ATOMIC_LONG_PFX(_cond_read_acquire)((ATOMIC_LONG_PFX(_t) *)(v), (c))
 
diff --git a/include/asm-generic/barrier.h b/include/asm-generic/barrier.h
index fe297b599b0a..305e03b19a26 100644
--- a/include/asm-generic/barrier.h
+++ b/include/asm-generic/barrier.h
@@ -221,18 +221,17 @@ do {									\
 #endif
 
 /**
- * smp_cond_load_acquire() - (Spin) wait for cond with ACQUIRE ordering
+ * smp_cond_load_relaxed() - (Spin) wait for cond with no ordering guarantees
  * @ptr: pointer to the variable to wait on
  * @cond: boolean expression to wait for
  *
- * Equivalent to using smp_load_acquire() on the condition variable but employs
- * the control dependency of the wait to reduce the barrier on many platforms.
+ * Equivalent to using READ_ONCE() on the condition variable.
  *
  * Due to C lacking lambda expressions we load the value of *ptr into a
  * pre-named variable @VAL to be used in @cond.
  */
-#ifndef smp_cond_load_acquire
-#define smp_cond_load_acquire(ptr, cond_expr) ({		\
+#ifndef smp_cond_load_relaxed
+#define smp_cond_load_relaxed(ptr, cond_expr) ({		\
 	typeof(ptr) __PTR = (ptr);				\
 	typeof(*ptr) VAL;					\
 	for (;;) {						\
@@ -241,10 +240,26 @@ do {									\
 			break;					\
 		cpu_relax();					\
 	}							\
-	smp_acquire__after_ctrl_dep();				\
 	VAL;							\
 })
 #endif
 
+/**
+ * smp_cond_load_acquire() - (Spin) wait for cond with ACQUIRE ordering
+ * @ptr: pointer to the variable to wait on
+ * @cond: boolean expression to wait for
+ *
+ * Equivalent to using smp_load_acquire() on the condition variable but employs
+ * the control dependency of the wait to reduce the barrier on many platforms.
+ */
+#ifndef smp_cond_load_acquire
+#define smp_cond_load_acquire(ptr, cond_expr) ({		\
+	typeof(*ptr) _val;					\
+	_val = smp_cond_load_relaxed(ptr, cond_expr);		\
+	smp_acquire__after_ctrl_dep();				\
+	_val;							\
+})
+#endif
+
 #endif /* !__ASSEMBLY__ */
 #endif /* __ASM_GENERIC_BARRIER_H */
diff --git a/include/linux/atomic.h b/include/linux/atomic.h
index 8b276fd9a127..01ce3997cb42 100644
--- a/include/linux/atomic.h
+++ b/include/linux/atomic.h
@@ -654,6 +654,7 @@ static inline int atomic_dec_if_positive(atomic_t *v)
 }
 #endif
 
+#define atomic_cond_read_relaxed(v, c)	smp_cond_load_relaxed(&(v)->counter, (c))
 #define atomic_cond_read_acquire(v, c)	smp_cond_load_acquire(&(v)->counter, (c))
 
 #ifdef CONFIG_GENERIC_ATOMIC64
@@ -1075,6 +1076,7 @@ static inline long long atomic64_fetch_andnot_release(long long i, atomic64_t *v)
 }
 #endif
 
+#define atomic64_cond_read_relaxed(v, c)	smp_cond_load_relaxed(&(v)->counter, (c))
 #define atomic64_cond_read_acquire(v, c)	smp_cond_load_acquire(&(v)->counter, (c))
 
 #include <asm-generic/atomic-long.h>
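As a further illustration, the same layering can be written with C11
atomics in userspace (a sketch only; the function names are invented
here, and the kernel macros additionally accept any scalar type and any
condition expression via typeof() and the pre-named @VAL):

	#include <stdatomic.h>

	/* Like smp_cond_load_relaxed(ptr, VAL != 0): relaxed loads only. */
	static unsigned int cond_load_relaxed_nonzero(atomic_uint *ptr)
	{
		unsigned int val;

		while (!(val = atomic_load_explicit(ptr, memory_order_relaxed)))
			;	/* the kernel would call cpu_relax() here */
		return val;
	}

	/* Like smp_cond_load_acquire(): the relaxed spin plus one trailing
	 * acquire fence, standing in for smp_acquire__after_ctrl_dep(). */
	static unsigned int cond_load_acquire_nonzero(atomic_uint *ptr)
	{
		unsigned int val = cond_load_relaxed_nonzero(ptr);

		atomic_thread_fence(memory_order_acquire);
		return val;
	}

The point of the split is visible here: the barrier cost is paid once,
after the loop exits, rather than on every iteration of the spin.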
From patchwork Mon Apr 26 21:41:12 2021
X-Patchwork-Submitter: "Guilherme G. Piccoli"
X-Patchwork-Id: 1470487

From: "Guilherme G. Piccoli"
To: kernel-team@lists.ubuntu.com
Subject: [B:linux][B:linux-aws][PATCH 2/2] locking/qrwlock: Fix ordering in queued_write_lock_slowpath()
Date: Mon, 26 Apr 2021 18:41:12 -0300
Message-Id: <20210426214113.27372-3-gpiccoli@canonical.com>
In-Reply-To: <20210426214113.27372-1-gpiccoli@canonical.com>
References: <20210426214113.27372-1-gpiccoli@canonical.com>

From: Ali Saidi

BugLink: https://bugs.launchpad.net/bugs/1926184

While this code is executed with the wait_lock held, a reader can
acquire the lock without holding wait_lock. The writer side loops
checking the value with the atomic_cond_read_acquire(), but only truly
acquires the lock when the compare-and-exchange completes successfully,
which isn't ordered. This exposes the window between the acquire and
the cmpxchg to an A-B-A problem which allows reads following the lock
acquisition to observe values speculatively before the write lock is
truly acquired.

We've seen a problem in epoll where the reader does an xchg while
holding the read lock, but the writer can see a value change out from
under it:

Writer                                  | Reader
--------------------------------------------------------------------------------
ep_scan_ready_list()                    |
|- write_lock_irq()                     |
   |- queued_write_lock_slowpath()      |
      |- atomic_cond_read_acquire()     |
                                        | read_lock_irqsave(&ep->lock, flags);
 --> (observes value before unlock)     | chain_epi_lockless()
 |                                      |   epi->next = xchg(&ep->ovflist, epi);
 |                                      | read_unlock_irqrestore(&ep->lock, flags);
 |                                      |
 |    atomic_cmpxchg_relaxed()          |
 |-- READ_ONCE(ep->ovflist);            |

A core can order the read of the ovflist ahead of the
atomic_cmpxchg_relaxed(). Switching the cmpxchg to use acquire
semantics addresses this issue, at which point the atomic_cond_read can
be switched to use relaxed semantics.

Fixes: b519b56e378ee ("locking/qrwlock: Use atomic_cond_read_acquire() when spinning in qrwlock")
Signed-off-by: Ali Saidi
[peterz: use try_cmpxchg()]
Signed-off-by: Peter Zijlstra (Intel)
Reviewed-by: Steve Capper
Acked-by: Will Deacon
Acked-by: Waiman Long
Tested-by: Steve Capper
(cherry picked from commit 84a24bf8c52e66b7ac89ada5e3cfbe72d65c1896)
Signed-off-by: Guilherme G. Piccoli
---
 kernel/locking/qrwlock.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)
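Note (illustration only, not part of the patch): the "[peterz: use
try_cmpxchg()]" tag above refers to the helper used in the hunk below.
Unlike atomic_cmpxchg_relaxed(), which returns the old value,
atomic_try_cmpxchg_acquire() returns a bool and writes the value it
actually observed back through its second argument, which is why the
loop can feed the result of atomic_cond_read_relaxed() straight into
it. Roughly, as a sketch of the contract (not the kernel's real
definition):

	static inline bool try_cmpxchg_acquire_sketch(atomic_t *v, int *old, int new)
	{
		int seen = atomic_cmpxchg_acquire(v, *old, new);

		if (seen != *old) {
			*old = seen;	/* report what was actually observed */
			return false;
		}
		return true;	/* swap done; ACQUIRE orders later accesses */
	}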
diff --git a/kernel/locking/qrwlock.c b/kernel/locking/qrwlock.c
index c7471c3fb798..16c09cda3b02 100644
--- a/kernel/locking/qrwlock.c
+++ b/kernel/locking/qrwlock.c
@@ -70,6 +70,8 @@ EXPORT_SYMBOL(queued_read_lock_slowpath);
  */
 void queued_write_lock_slowpath(struct qrwlock *lock)
 {
+	int cnts;
+
 	/* Put the writer into the wait queue */
 	arch_spin_lock(&lock->wait_lock);
 
@@ -83,9 +85,8 @@ void queued_write_lock_slowpath(struct qrwlock *lock)
 
 	/* When no more readers or writers, set the locked flag */
 	do {
-		atomic_cond_read_acquire(&lock->cnts, VAL == _QW_WAITING);
-	} while (atomic_cmpxchg_relaxed(&lock->cnts, _QW_WAITING,
-					_QW_LOCKED) != _QW_WAITING);
+		cnts = atomic_cond_read_relaxed(&lock->cnts, VAL == _QW_WAITING);
+	} while (!atomic_try_cmpxchg_acquire(&lock->cnts, &cnts, _QW_LOCKED));
 unlock:
 	arch_spin_unlock(&lock->wait_lock);
 }
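For reference, a sketch of the resulting function once the patch is
applied. The do/while loop comes from the hunks above; the fast-path
lines in between are assumed from the surrounding upstream code and are
not part of this diff:

	void queued_write_lock_slowpath(struct qrwlock *lock)
	{
		int cnts;

		/* Put the writer into the wait queue */
		arch_spin_lock(&lock->wait_lock);

		/* Try to acquire the lock directly if no reader is present */
		if (!atomic_read(&lock->cnts) &&
		    (atomic_cmpxchg_acquire(&lock->cnts, 0, _QW_LOCKED) == 0))
			goto unlock;

		/* Set the waiting flag to notify readers that a writer is pending */
		atomic_add(_QW_WAITING, &lock->cnts);

		/* When no more readers or writers, set the locked flag: spin
		 * with no ordering, then take the lock with an ACQUIRE cmpxchg
		 * so later reads cannot be speculated ahead of the point the
		 * lock is really taken. */
		do {
			cnts = atomic_cond_read_relaxed(&lock->cnts, VAL == _QW_WAITING);
		} while (!atomic_try_cmpxchg_acquire(&lock->cnts, &cnts, _QW_LOCKED));
	unlock:
		arch_spin_unlock(&lock->wait_lock);
	}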