From patchwork Mon Apr 26 21:41:12 2021
X-Patchwork-Submitter: "Guilherme G. Piccoli"
X-Patchwork-Id: 1470487
From: "Guilherme G. Piccoli"
To: kernel-team@lists.ubuntu.com
Subject: [B:linux][B:linux-aws][PATCH 2/2] locking/qrwlock: Fix ordering in
 queued_write_lock_slowpath()
Date: Mon, 26 Apr 2021 18:41:12 -0300
Message-Id: <20210426214113.27372-3-gpiccoli@canonical.com>
In-Reply-To: <20210426214113.27372-1-gpiccoli@canonical.com>
References: <20210426214113.27372-1-gpiccoli@canonical.com>
List-Id: Kernel team discussions

From: Ali Saidi

BugLink: https://bugs.launchpad.net/bugs/1926184

While this code is executed with the wait_lock held, a reader can
acquire the lock without holding wait_lock.
The writer side loops checking the value with the
atomic_cond_read_acquire(), but only truly acquires the lock when the
compare-and-exchange is completed successfully, which isn't ordered. This
exposes the window between the acquire and the cmpxchg to an A-B-A
problem which allows reads following the lock acquisition to observe
values speculatively before the write lock is truly acquired.

We've seen a problem in epoll where the reader does a xchg while
holding the read lock, but the writer can see a value change out from
under it.

  Writer                               | Reader
  --------------------------------------------------------------------------------
  ep_scan_ready_list()                 |
  |- write_lock_irq()                  |
     |- queued_write_lock_slowpath()   |
        |- atomic_cond_read_acquire()  |
                                       | read_lock_irqsave(&ep->lock, flags);
    --> (observes value before unlock) |  chain_epi_lockless()
    |                                  |    epi->next = xchg(&ep->ovflist, epi);
    |                                  |  read_unlock_irqrestore(&ep->lock, flags);
    |                                  |
    |     atomic_cmpxchg_relaxed()     |
    |-- READ_ONCE(ep->ovflist);        |

A core can order the read of the ovflist ahead of the
atomic_cmpxchg_relaxed(). Switching the cmpxchg to use acquire semantics
addresses this issue, at which point the atomic_cond_read can be
switched to use relaxed semantics.

Fixes: b519b56e378ee ("locking/qrwlock: Use atomic_cond_read_acquire() when spinning in qrwlock")
Signed-off-by: Ali Saidi
[peterz: use try_cmpxchg()]
Signed-off-by: Peter Zijlstra (Intel)
Reviewed-by: Steve Capper
Acked-by: Will Deacon
Acked-by: Waiman Long
Tested-by: Steve Capper
(cherry picked from commit 84a24bf8c52e66b7ac89ada5e3cfbe72d65c1896)
Signed-off-by: Guilherme G. Piccoli
---
 kernel/locking/qrwlock.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/kernel/locking/qrwlock.c b/kernel/locking/qrwlock.c
index c7471c3fb798..16c09cda3b02 100644
--- a/kernel/locking/qrwlock.c
+++ b/kernel/locking/qrwlock.c
@@ -70,6 +70,8 @@ EXPORT_SYMBOL(queued_read_lock_slowpath);
  */
 void queued_write_lock_slowpath(struct qrwlock *lock)
 {
+	int cnts;
+
 	/* Put the writer into the wait queue */
 	arch_spin_lock(&lock->wait_lock);
 
@@ -83,9 +85,8 @@ void queued_write_lock_slowpath(struct qrwlock *lock)
 
 	/* When no more readers or writers, set the locked flag */
 	do {
-		atomic_cond_read_acquire(&lock->cnts, VAL == _QW_WAITING);
-	} while (atomic_cmpxchg_relaxed(&lock->cnts, _QW_WAITING,
-					_QW_LOCKED) != _QW_WAITING);
+		cnts = atomic_cond_read_relaxed(&lock->cnts, VAL == _QW_WAITING);
+	} while (!atomic_try_cmpxchg_acquire(&lock->cnts, &cnts, _QW_LOCKED));
 unlock:
 	arch_spin_unlock(&lock->wait_lock);
 }