From patchwork Tue Jun 6 10:28:25 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Gerald Yang
X-Patchwork-Id: 1791054
From: Gerald Yang
To: kernel-team@lists.ubuntu.com
Subject: [SRU][K][PATCH 3/6] sbitmap: Avoid leaving waitqueue in invalid state in __sbq_wake_up()
Date: Tue, 6 Jun 2023 18:28:25 +0800
Message-Id: <20230606102828.218014-4-gerald.yang@canonical.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20230606102828.218014-1-gerald.yang@canonical.com>
References: <20230606102828.218014-1-gerald.yang@canonical.com>
X-BeenThere: kernel-team@lists.ubuntu.com
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Kernel team discussions
Errors-To: kernel-team-bounces@lists.ubuntu.com
Sender: "kernel-team"

From: Jan Kara

BugLink: https://bugs.launchpad.net/bugs/2022318

When __sbq_wake_up() decrements wait_cnt to 0 but races with someone else
waking the waiter on the waitqueue (so the waitqueue becomes empty), it
exits without resetting wait_cnt to wake_batch number. Once wait_cnt is 0,
nobody will ever reset the wait_cnt or wake the new waiters resulting in
possible deadlocks or busyloops. Fix the problem by making sure we reset
wait_cnt even if we didn't wake up anybody in the end.

Fixes: 040b83fcecfb ("sbitmap: fix possible io hung due to lost wakeup")
Reported-by: Keith Busch
Signed-off-by: Jan Kara
Link: https://lore.kernel.org/r/20220908130937.2795-1-jack@suse.cz
Signed-off-by: Jens Axboe
(cherry picked from commit 48c033314f372478548203c583529f53080fd078)
Signed-off-by: Gerald Yang
---
 lib/sbitmap.c | 18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/lib/sbitmap.c b/lib/sbitmap.c
index a39b1a877366..47cd8fb894ba 100644
--- a/lib/sbitmap.c
+++ b/lib/sbitmap.c
@@ -604,6 +604,7 @@ static bool __sbq_wake_up(struct sbitmap_queue *sbq)
 	struct sbq_wait_state *ws;
 	unsigned int wake_batch;
 	int wait_cnt;
+	bool ret;
 
 	ws = sbq_wake_ptr(sbq);
 	if (!ws)
@@ -614,12 +615,23 @@ static bool __sbq_wake_up(struct sbitmap_queue *sbq)
 	 * For concurrent callers of this, callers should call this function
 	 * again to wakeup a new batch on a different 'ws'.
 	 */
-	if (wait_cnt < 0 || !waitqueue_active(&ws->wait))
+	if (wait_cnt < 0)
 		return true;
 
+	/*
+	 * If we decremented queue without waiters, retry to avoid lost
+	 * wakeups.
+	 */
 	if (wait_cnt > 0)
-		return false;
+		return !waitqueue_active(&ws->wait);
 
+	/*
+	 * When wait_cnt == 0, we have to be particularly careful as we are
+	 * responsible to reset wait_cnt regardless whether we've actually
+	 * woken up anybody. But in case we didn't wakeup anybody, we still
+	 * need to retry.
+	 */
+	ret = !waitqueue_active(&ws->wait);
 	wake_batch = READ_ONCE(sbq->wake_batch);
 
 	/*
@@ -648,7 +660,7 @@ static bool __sbq_wake_up(struct sbitmap_queue *sbq)
 	sbq_index_atomic_inc(&sbq->wake_index);
 	atomic_set(&ws->wait_cnt, wake_batch);
 
-	return false;
+	return ret;
 }
 
 void sbitmap_queue_wake_up(struct sbitmap_queue *sbq)
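
Illustrative note, not part of the patch: the return-value logic after this change can be modelled in user space roughly as below. This is only a sketch under assumptions. struct model_ws, its has_waiters flag (standing in for waitqueue_active(&ws->wait)) and model_wake_up() are hypothetical names. The decrement at the top is assumed to mirror how the kernel function obtains wait_cnt, which the hunks above do not show. The actual wakeup and the wake_index advance are left out.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct sbq_wait_state; not the kernel type. */
struct model_ws {
	atomic_int wait_cnt;
	bool has_waiters;	/* models waitqueue_active(&ws->wait) */
};

/*
 * User-space model of the patched decision logic.
 * Returns true when the caller should call again (retry).
 * wake_batch is passed in rather than read from a queue structure.
 */
static bool model_wake_up(struct model_ws *ws, int wake_batch)
{
	/* Assumption: the real function decrements wait_cnt atomically here. */
	int wait_cnt = atomic_fetch_sub(&ws->wait_cnt, 1) - 1;
	bool ret;

	/* Another caller already took this batch; retry on a new 'ws'. */
	if (wait_cnt < 0)
		return true;

	/* Batch not complete yet; retry only if nobody was waiting. */
	if (wait_cnt > 0)
		return !ws->has_waiters;

	/*
	 * wait_cnt == 0: this caller owns the reset. Record whether a
	 * retry is needed, then reset the counter unconditionally so it
	 * never stays at 0.
	 */
	ret = !ws->has_waiters;
	atomic_store(&ws->wait_cnt, wake_batch);
	return ret;
}
```

With wait_cnt starting at 1 and no waiters present, the model resets wait_cnt to wake_batch and still asks the caller to retry. Before the patch, the combined "wait_cnt < 0 || !waitqueue_active()" check returned early in this situation and left wait_cnt at 0.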