From patchwork Tue Jun 6 10:28:27 2023
X-Patchwork-Submitter: Gerald Yang
X-Patchwork-Id: 1791055
From: Gerald Yang
To: kernel-team@lists.ubuntu.com
Subject: [SRU][K][PATCH 5/6] sbitmap: fix batched wait_cnt accounting
Date: Tue, 6 Jun 2023 18:28:27 +0800
Message-Id: <20230606102828.218014-6-gerald.yang@canonical.com>
In-Reply-To: <20230606102828.218014-1-gerald.yang@canonical.com>
References: <20230606102828.218014-1-gerald.yang@canonical.com>

From: Keith Busch

BugLink: https://bugs.launchpad.net/bugs/2022318

Batched completions can clear multiple bits, but we're only decrementing
the wait_cnt by one each time. This can cause waiters to never be woken,
stalling IO. Use the batched count instead.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=215679
Signed-off-by: Keith Busch
Link: https://lore.kernel.org/r/20220909184022.1709476-1-kbusch@fb.com
Signed-off-by: Jens Axboe
(cherry picked from commit 4acb83417cadfdcbe64215f9d0ddcf3132af808e)
Signed-off-by: Gerald Yang
---
 block/blk-mq-tag.c      |  2 +-
 include/linux/sbitmap.h |  3 ++-
 lib/sbitmap.c           | 37 +++++++++++++++++++++++--------------
 3 files changed, 26 insertions(+), 16 deletions(-)

diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index 2dcd738c6952..7aea93047caf 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -200,7 +200,7 @@ unsigned int blk_mq_get_tag(struct blk_mq_alloc_data *data)
 		 * other allocations on previous queue won't be starved.
 		 */
 		if (bt != bt_prev)
-			sbitmap_queue_wake_up(bt_prev);
+			sbitmap_queue_wake_up(bt_prev, 1);
 
 		ws = bt_wait_ptr(bt, data->hctx);
 	} while (1);
diff --git a/include/linux/sbitmap.h b/include/linux/sbitmap.h
index 8f5a86e210b9..4d2d5205ab58 100644
--- a/include/linux/sbitmap.h
+++ b/include/linux/sbitmap.h
@@ -575,8 +575,9 @@ void sbitmap_queue_wake_all(struct sbitmap_queue *sbq);
  * sbitmap_queue_wake_up() - Wake up some of waiters in one waitqueue
  * on a &struct sbitmap_queue.
  * @sbq: Bitmap queue to wake up.
+ * @nr: Number of bits cleared.
  */
-void sbitmap_queue_wake_up(struct sbitmap_queue *sbq);
+void sbitmap_queue_wake_up(struct sbitmap_queue *sbq, int nr);
 
 /**
  * sbitmap_queue_show() - Dump &struct sbitmap_queue information to a &struct
diff --git a/lib/sbitmap.c b/lib/sbitmap.c
index cbfd2e677d87..624fa7f118d1 100644
--- a/lib/sbitmap.c
+++ b/lib/sbitmap.c
@@ -599,24 +599,31 @@ static struct sbq_wait_state *sbq_wake_ptr(struct sbitmap_queue *sbq)
 	return NULL;
 }
 
-static bool __sbq_wake_up(struct sbitmap_queue *sbq)
+static bool __sbq_wake_up(struct sbitmap_queue *sbq, int *nr)
 {
 	struct sbq_wait_state *ws;
 	unsigned int wake_batch;
-	int wait_cnt;
+	int wait_cnt, cur, sub;
 	bool ret;
 
+	if (*nr <= 0)
+		return false;
+
 	ws = sbq_wake_ptr(sbq);
 	if (!ws)
 		return false;
 
-	wait_cnt = atomic_dec_return(&ws->wait_cnt);
-	/*
-	 * For concurrent callers of this, callers should call this function
-	 * again to wakeup a new batch on a different 'ws'.
-	 */
-	if (wait_cnt < 0)
-		return true;
+	cur = atomic_read(&ws->wait_cnt);
+	do {
+		/*
+		 * For concurrent callers of this, callers should call this
+		 * function again to wakeup a new batch on a different 'ws'.
+		 */
+		if (cur == 0)
+			return true;
+		sub = min(*nr, cur);
+		wait_cnt = cur - sub;
+	} while (!atomic_try_cmpxchg(&ws->wait_cnt, &cur, wait_cnt));
 
 	/*
 	 * If we decremented queue without waiters, retry to avoid lost
@@ -625,6 +632,8 @@ static bool __sbq_wake_up(struct sbitmap_queue *sbq)
 	if (wait_cnt > 0)
 		return !waitqueue_active(&ws->wait);
 
+	*nr -= sub;
+
 	/*
 	 * When wait_cnt == 0, we have to be particularly careful as we are
 	 * responsible to reset wait_cnt regardless whether we've actually
@@ -660,12 +669,12 @@ static bool __sbq_wake_up(struct sbitmap_queue *sbq)
 	sbq_index_atomic_inc(&sbq->wake_index);
 	atomic_set(&ws->wait_cnt, wake_batch);
 
-	return ret;
+	return ret || *nr;
 }
 
-void sbitmap_queue_wake_up(struct sbitmap_queue *sbq)
+void sbitmap_queue_wake_up(struct sbitmap_queue *sbq, int nr)
 {
-	while (__sbq_wake_up(sbq))
+	while (__sbq_wake_up(sbq, &nr))
 		;
 }
 EXPORT_SYMBOL_GPL(sbitmap_queue_wake_up);
@@ -705,7 +714,7 @@ void sbitmap_queue_clear_batch(struct sbitmap_queue *sbq, int offset,
 	atomic_long_andnot(mask, (atomic_long_t *) addr);
 
 	smp_mb__after_atomic();
-	sbitmap_queue_wake_up(sbq);
+	sbitmap_queue_wake_up(sbq, nr_tags);
 	sbitmap_update_cpu_hint(&sbq->sb, raw_smp_processor_id(),
 					tags[nr_tags - 1] - offset);
 }
@@ -733,7 +742,7 @@ void sbitmap_queue_clear(struct sbitmap_queue *sbq, unsigned int nr,
 	 * waiter. See the comment on waitqueue_active().
 	 */
 	smp_mb__after_atomic();
-	sbitmap_queue_wake_up(sbq);
+	sbitmap_queue_wake_up(sbq, 1);
 	sbitmap_update_cpu_hint(&sbq->sb, cpu, nr);
 }
 EXPORT_SYMBOL_GPL(sbitmap_queue_clear);