From patchwork Mon Jun 5 05:45:54 2023
X-Patchwork-Submitter: Gerald Yang
X-Patchwork-Id: 1790217
From: Gerald Yang
To: kernel-team@lists.ubuntu.com
Subject: [SRU][K][PATCH 1/8] sbitmap: fix possible io hung due to lost wakeup
Date: Mon, 5 Jun 2023 13:45:54 +0800
Message-Id: <20230605054601.1410517-2-gerald.yang@canonical.com>
In-Reply-To: <20230605054601.1410517-1-gerald.yang@canonical.com>
References: <20230605054601.1410517-1-gerald.yang@canonical.com>

From: Yu Kuai

There are two problems that can lead to a lost wakeup:

1) Invalid wakeup on the wrong waitqueue.

For example, 2 * wake_batch tags are put, while only wake_batch threads
are woken:

__sbq_wake_up
 atomic_cmpxchg -> reset wait_cnt
                        __sbq_wake_up -> decrease wait_cnt
                        ...
                        __sbq_wake_up -> wait_cnt is decreased to 0 again
                         atomic_cmpxchg
                         sbq_index_atomic_inc -> increase wake_index
                         wake_up_nr -> wake up and waitqueue might be empty
 sbq_index_atomic_inc -> increase again, one waitqueue is skipped
 wake_up_nr -> invalid wake up because old waitqueue might be empty

To fix the problem, increase 'wake_index' before resetting 'wait_cnt'.

2) 'wait_cnt' can be decreased while the waitqueue is empty.

As pointed out by Jan Kara, the following race is possible:

CPU1                                    CPU2
__sbq_wake_up                           __sbq_wake_up
 sbq_wake_ptr()                          sbq_wake_ptr() -> the same
 wait_cnt = atomic_dec_return()
 /* decreased to 0 */
 sbq_index_atomic_inc()
 /* move to next waitqueue */
 atomic_set()
 /* reset wait_cnt */
 wake_up_nr()
 /* wake up on the old waitqueue */
                                         wait_cnt = atomic_dec_return()
                                         /*
                                          * decrease wait_cnt in the old
                                          * waitqueue, while it can be
                                          * empty.
                                          */

Fix the problem by waking up before updating 'wake_index' and 'wait_cnt'.

With this patch, note that 'wait_cnt' is still decreased in the old empty
waitqueue; however, the wakeup is redirected to an active waitqueue, and
the extra decrement on the old empty waitqueue is not handled.

Fixes: 88459642cba4 ("blk-mq: abstract tag allocation out into sbitmap library")
Signed-off-by: Yu Kuai
Reviewed-by: Jan Kara
Link: https://lore.kernel.org/r/20220803121504.212071-1-yukuai1@huaweicloud.com
Signed-off-by: Jens Axboe
Signed-off-by: Gerald Yang
---
 lib/sbitmap.c | 55 ++++++++++++++++++++++++++++++---------------------
 1 file changed, 33 insertions(+), 22 deletions(-)

diff --git a/lib/sbitmap.c b/lib/sbitmap.c
index 29eb0484215a..1f31147872e6 100644
--- a/lib/sbitmap.c
+++ b/lib/sbitmap.c
@@ -611,32 +611,43 @@ static bool __sbq_wake_up(struct sbitmap_queue *sbq)
         return false;

     wait_cnt = atomic_dec_return(&ws->wait_cnt);
-    if (wait_cnt <= 0) {
-        int ret;
+    /*
+     * For concurrent callers of this, callers should call this function
+     * again to wakeup a new batch on a different 'ws'.
+     */
+    if (wait_cnt < 0 || !waitqueue_active(&ws->wait))
+        return true;

-        wake_batch = READ_ONCE(sbq->wake_batch);
+    if (wait_cnt > 0)
+        return false;

-        /*
-         * Pairs with the memory barrier in sbitmap_queue_resize() to
-         * ensure that we see the batch size update before the wait
-         * count is reset.
-         */
-        smp_mb__before_atomic();
+    wake_batch = READ_ONCE(sbq->wake_batch);

-        /*
-         * For concurrent callers of this, the one that failed the
-         * atomic_cmpxhcg() race should call this function again
-         * to wakeup a new batch on a different 'ws'.
-         */
-        ret = atomic_cmpxchg(&ws->wait_cnt, wait_cnt, wake_batch);
-        if (ret == wait_cnt) {
-            sbq_index_atomic_inc(&sbq->wake_index);
-            wake_up_nr(&ws->wait, wake_batch);
-            return false;
-        }
+    /*
+     * Wake up first in case that concurrent callers decrease wait_cnt
+     * while waitqueue is empty.
+     */
+    wake_up_nr(&ws->wait, wake_batch);

-        return true;
-    }
+    /*
+     * Pairs with the memory barrier in sbitmap_queue_resize() to
+     * ensure that we see the batch size update before the wait
+     * count is reset.
+     *
+     * Also pairs with the implicit barrier between decrementing wait_cnt
+     * and checking for waitqueue_active() to make sure waitqueue_active()
+     * sees result of the wakeup if atomic_dec_return() has seen the result
+     * of atomic_set().
+     */
+    smp_mb__before_atomic();
+
+    /*
+     * Increase wake_index before updating wait_cnt, otherwise concurrent
+     * callers can see valid wait_cnt in old waitqueue, which can cause
+     * invalid wakeup on the old waitqueue.
+     */
+    sbq_index_atomic_inc(&sbq->wake_index);
+    atomic_set(&ws->wait_cnt, wake_batch);

     return false;
 }
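The ordering that this patch establishes (wake the batch first, then advance
wake_index, then refill wait_cnt) can be modelled outside the kernel with a
short single-threaded sketch. The toy_* names, NR_WAIT_QUEUES and WAKE_BATCH
below are invented for illustration only; this is not the sbitmap code, just
the bookkeeping order it describes.

#include <stdatomic.h>
#include <stdio.h>

#define NR_WAIT_QUEUES 8
#define WAKE_BATCH     4

struct toy_wait_state {
    atomic_int wait_cnt;    /* completions left before the next batch wakeup */
    int woken;              /* stand-in for wake_up_nr() on this queue */
};

static struct toy_wait_state ws[NR_WAIT_QUEUES];
static atomic_int wake_index;

/* Called once per freed tag, mirroring the bookkeeping described above. */
static void toy_put_tag(void)
{
    int idx = atomic_load(&wake_index) % NR_WAIT_QUEUES;
    struct toy_wait_state *w = &ws[idx];

    if (atomic_fetch_sub(&w->wait_cnt, 1) - 1 > 0)
        return;

    /* 1) wake the batch first ... */
    w->woken += WAKE_BATCH;
    /* 2) ... then move follow-up wakeups to the next queue ... */
    atomic_fetch_add(&wake_index, 1);
    /* 3) ... and only then refill the counter for this queue. */
    atomic_store(&w->wait_cnt, WAKE_BATCH);
}

int main(void)
{
    for (int i = 0; i < NR_WAIT_QUEUES; i++)
        atomic_store(&ws[i].wait_cnt, WAKE_BATCH);

    /* 2 * WAKE_BATCH completions now wake two different queues. */
    for (int i = 0; i < 2 * WAKE_BATCH; i++)
        toy_put_tag();

    for (int i = 0; i < NR_WAIT_QUEUES; i++)
        printf("queue %d: woken %d\n", i, ws[i].woken);
    return 0;
}

With the old order (reset the counter before advancing the index), a second
batch completing concurrently could land on the same, already-emptied queue;
the sketch only shows the intended sequence, not the concurrency itself.
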
From patchwork Mon Jun 5 05:45:55 2023
X-Patchwork-Submitter: Gerald Yang
X-Patchwork-Id: 1790216
From: Gerald Yang
To: kernel-team@lists.ubuntu.com
Subject: [SRU][K][PATCH 2/8] sbitmap: remove unnecessary code in __sbitmap_queue_get_batch
Date: Mon, 5 Jun 2023 13:45:55 +0800
Message-Id: <20230605054601.1410517-3-gerald.yang@canonical.com>
In-Reply-To: <20230605054601.1410517-1-gerald.yang@canonical.com>
References: <20230605054601.1410517-1-gerald.yang@canonical.com>

From: Liu Song

If "nr + nr_tags <= map_depth", then nr_tags cannot be greater than
map_depth, so the additional min_t() comparison is not required.

Signed-off-by: Liu Song
Link: https://lore.kernel.org/r/1661483653-27326-1-git-send-email-liusong@linux.alibaba.com
Signed-off-by: Jens Axboe
Signed-off-by: Gerald Yang
---
 lib/sbitmap.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/lib/sbitmap.c b/lib/sbitmap.c
index 1f31147872e6..a39b1a877366 100644
--- a/lib/sbitmap.c
+++ b/lib/sbitmap.c
@@ -533,10 +533,9 @@ unsigned long __sbitmap_queue_get_batch(struct sbitmap_queue *sbq, int nr_tags,
         nr = find_first_zero_bit(&map->word, map_depth);
         if (nr + nr_tags <= map_depth) {
             atomic_long_t *ptr = (atomic_long_t *) &map->word;
-            int map_tags = min_t(int, nr_tags, map_depth);
             unsigned long val, ret;

-            get_mask = ((1UL << map_tags) - 1) << nr;
+            get_mask = ((1UL << nr_tags) - 1) << nr;
             do {
                 val = READ_ONCE(map->word);
                 if ((val & ~get_mask) != val)
@@ -547,7 +546,7 @@ unsigned long __sbitmap_queue_get_batch(struct sbitmap_queue *sbq, int nr_tags,
             if (get_mask) {
                 *offset = nr + (index << sb->shift);
                 update_alloc_hint_after_get(sb, depth, hint,
-                            *offset + map_tags - 1);
+                            *offset + nr_tags - 1);
                 return get_mask;
             }
         }

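The mask arithmetic and the reasoning behind dropping min_t() can be checked
with a standalone snippet in plain C; the variable values below are made up
for the example and are not taken from the patch:

#include <assert.h>
#include <stdio.h>

int main(void)
{
    unsigned int map_depth = 64, nr = 3, nr_tags = 4;
    unsigned long get_mask;

    assert(nr + nr_tags <= map_depth);  /* the branch's precondition */
    assert(nr_tags <= map_depth);       /* hence min(nr_tags, map_depth) == nr_tags */

    /* nr_tags consecutive bits starting at bit 'nr' */
    get_mask = ((1UL << nr_tags) - 1) << nr;
    printf("get_mask = 0x%lx\n", get_mask);  /* prints 0x78 */
    return 0;
}
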
From patchwork Mon Jun 5 05:45:56 2023
X-Patchwork-Submitter: Gerald Yang
X-Patchwork-Id: 1790220
From: Gerald Yang
To: kernel-team@lists.ubuntu.com
Subject: [SRU][K][PATCH 3/8] sbitmap: fix batched wait_cnt accounting
Date: Mon, 5 Jun 2023 13:45:56 +0800
Message-Id: <20230605054601.1410517-4-gerald.yang@canonical.com>
In-Reply-To: <20230605054601.1410517-1-gerald.yang@canonical.com>
References: <20230605054601.1410517-1-gerald.yang@canonical.com>

From: Keith Busch

Batched completions can clear multiple bits, but we're only decrementing
the wait_cnt by one each time. This can cause waiters to never be woken,
stalling IO. Use the batched count instead.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=215679
Signed-off-by: Keith Busch
Link: https://lore.kernel.org/r/20220825145312.1217900-1-kbusch@fb.com
Signed-off-by: Jens Axboe
Signed-off-by: Gerald Yang
---
 block/blk-mq-tag.c      |  2 +-
 include/linux/sbitmap.h |  3 ++-
 lib/sbitmap.c           | 31 +++++++++++++++++--------------
 3 files changed, 20 insertions(+), 16 deletions(-)

diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index 2dcd738c6952..7aea93047caf 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -200,7 +200,7 @@ unsigned int blk_mq_get_tag(struct blk_mq_alloc_data *data)
          * other allocations on previous queue won't be starved.
          */
         if (bt != bt_prev)
-            sbitmap_queue_wake_up(bt_prev);
+            sbitmap_queue_wake_up(bt_prev, 1);

         ws = bt_wait_ptr(bt, data->hctx);
     } while (1);

diff --git a/include/linux/sbitmap.h b/include/linux/sbitmap.h
index 8f5a86e210b9..4d2d5205ab58 100644
--- a/include/linux/sbitmap.h
+++ b/include/linux/sbitmap.h
@@ -575,8 +575,9 @@ void sbitmap_queue_wake_all(struct sbitmap_queue *sbq);
  * sbitmap_queue_wake_up() - Wake up some of waiters in one waitqueue
  * on a &struct sbitmap_queue.
  * @sbq: Bitmap queue to wake up.
+ * @nr: Number of bits cleared.
  */
-void sbitmap_queue_wake_up(struct sbitmap_queue *sbq);
+void sbitmap_queue_wake_up(struct sbitmap_queue *sbq, int nr);

 /**
  * sbitmap_queue_show() - Dump &struct sbitmap_queue information to a &struct

diff --git a/lib/sbitmap.c b/lib/sbitmap.c
index a39b1a877366..2fedf07a9db5 100644
--- a/lib/sbitmap.c
+++ b/lib/sbitmap.c
@@ -599,34 +599,38 @@ static struct sbq_wait_state *sbq_wake_ptr(struct sbitmap_queue *sbq)
     return NULL;
 }

-static bool __sbq_wake_up(struct sbitmap_queue *sbq)
+static bool __sbq_wake_up(struct sbitmap_queue *sbq, int nr)
 {
     struct sbq_wait_state *ws;
-    unsigned int wake_batch;
-    int wait_cnt;
+    int wake_batch, wait_cnt, cur;

     ws = sbq_wake_ptr(sbq);
-    if (!ws)
+    if (!ws || !nr)
         return false;

-    wait_cnt = atomic_dec_return(&ws->wait_cnt);
+    wake_batch = READ_ONCE(sbq->wake_batch);
+    cur = atomic_read(&ws->wait_cnt);
+    do {
+        if (cur <= 0)
+            return true;
+        wait_cnt = cur - nr;
+    } while (!atomic_try_cmpxchg(&ws->wait_cnt, &cur, wait_cnt));
+
     /*
      * For concurrent callers of this, callers should call this function
      * again to wakeup a new batch on a different 'ws'.
      */
-    if (wait_cnt < 0 || !waitqueue_active(&ws->wait))
+    if (!waitqueue_active(&ws->wait))
         return true;

     if (wait_cnt > 0)
         return false;

-    wake_batch = READ_ONCE(sbq->wake_batch);
-
     /*
      * Wake up first in case that concurrent callers decrease wait_cnt
      * while waitqueue is empty.
      */
-    wake_up_nr(&ws->wait, wake_batch);
+    wake_up_nr(&ws->wait, max(wake_batch, nr));

     /*
      * Pairs with the memory barrier in sbitmap_queue_resize() to
@@ -651,12 +655,11 @@ static bool __sbq_wake_up(struct sbitmap_queue *sbq)
     return false;
 }

-void sbitmap_queue_wake_up(struct sbitmap_queue *sbq)
+void sbitmap_queue_wake_up(struct sbitmap_queue *sbq, int nr)
 {
-    while (__sbq_wake_up(sbq))
+    while (__sbq_wake_up(sbq, nr))
         ;
 }
-EXPORT_SYMBOL_GPL(sbitmap_queue_wake_up);

 static inline void sbitmap_update_cpu_hint(struct sbitmap *sb, int cpu,
                         int tag)
 {
@@ -693,7 +696,7 @@ void sbitmap_queue_clear_batch(struct sbitmap_queue *sbq, int offset,
     atomic_long_andnot(mask, (atomic_long_t *) addr);

     smp_mb__after_atomic();
-    sbitmap_queue_wake_up(sbq);
+    sbitmap_queue_wake_up(sbq, nr_tags);
     sbitmap_update_cpu_hint(&sbq->sb, raw_smp_processor_id(),
                     tags[nr_tags - 1] - offset);
 }
@@ -721,7 +724,7 @@ void sbitmap_queue_clear(struct sbitmap_queue *sbq, unsigned int nr,
      * waiter. See the comment on waitqueue_active().
      */
     smp_mb__after_atomic();
-    sbitmap_queue_wake_up(sbq);
+    sbitmap_queue_wake_up(sbq, 1);
     sbitmap_update_cpu_hint(&sbq->sb, cpu, nr);
 }
 EXPORT_SYMBOL_GPL(sbitmap_queue_clear);

From patchwork Mon Jun 5 05:45:57 2023
X-Patchwork-Submitter: Gerald Yang
X-Patchwork-Id: 1790221
From: Gerald Yang
To: kernel-team@lists.ubuntu.com
Subject: [SRU][K][PATCH 4/8] Revert "sbitmap: fix batched wait_cnt accounting"
Date: Mon, 5 Jun 2023 13:45:57 +0800
Message-Id: <20230605054601.1410517-5-gerald.yang@canonical.com>
In-Reply-To: <20230605054601.1410517-1-gerald.yang@canonical.com>
References: <20230605054601.1410517-1-gerald.yang@canonical.com>

From: Jens Axboe

This reverts commit 16ede66973c84f890c03584f79158dd5b2d725f5.

This is causing issues with CPU stalls on my test box; revert it for now
until we understand what is going on. It looks like infinite looping off
sbitmap_queue_wake_up(), but it is hard to tell with a lot of CPUs hitting
this issue and the console scrolling infinitely.

Link: https://lore.kernel.org/linux-block/e742813b-ce5c-0d58-205b-1626f639b1bd@kernel.dk/
Signed-off-by: Jens Axboe
Signed-off-by: Gerald Yang
---
 block/blk-mq-tag.c      |  2 +-
 include/linux/sbitmap.h |  3 +--
 lib/sbitmap.c           | 31 ++++++++++++++-----------------
 3 files changed, 16 insertions(+), 20 deletions(-)

diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index 7aea93047caf..2dcd738c6952 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -200,7 +200,7 @@ unsigned int blk_mq_get_tag(struct blk_mq_alloc_data *data)
          * other allocations on previous queue won't be starved.
          */
         if (bt != bt_prev)
-            sbitmap_queue_wake_up(bt_prev, 1);
+            sbitmap_queue_wake_up(bt_prev);

         ws = bt_wait_ptr(bt, data->hctx);
     } while (1);

diff --git a/include/linux/sbitmap.h b/include/linux/sbitmap.h
index 4d2d5205ab58..8f5a86e210b9 100644
--- a/include/linux/sbitmap.h
+++ b/include/linux/sbitmap.h
@@ -575,9 +575,8 @@ void sbitmap_queue_wake_all(struct sbitmap_queue *sbq);
  * sbitmap_queue_wake_up() - Wake up some of waiters in one waitqueue
  * on a &struct sbitmap_queue.
  * @sbq: Bitmap queue to wake up.
- * @nr: Number of bits cleared.
  */
-void sbitmap_queue_wake_up(struct sbitmap_queue *sbq, int nr);
+void sbitmap_queue_wake_up(struct sbitmap_queue *sbq);

 /**
  * sbitmap_queue_show() - Dump &struct sbitmap_queue information to a &struct

diff --git a/lib/sbitmap.c b/lib/sbitmap.c
index 2fedf07a9db5..a39b1a877366 100644
--- a/lib/sbitmap.c
+++ b/lib/sbitmap.c
@@ -599,38 +599,34 @@ static struct sbq_wait_state *sbq_wake_ptr(struct sbitmap_queue *sbq)
     return NULL;
 }

-static bool __sbq_wake_up(struct sbitmap_queue *sbq, int nr)
+static bool __sbq_wake_up(struct sbitmap_queue *sbq)
 {
     struct sbq_wait_state *ws;
-    int wake_batch, wait_cnt, cur;
+    unsigned int wake_batch;
+    int wait_cnt;

     ws = sbq_wake_ptr(sbq);
-    if (!ws || !nr)
+    if (!ws)
         return false;

-    wake_batch = READ_ONCE(sbq->wake_batch);
-    cur = atomic_read(&ws->wait_cnt);
-    do {
-        if (cur <= 0)
-            return true;
-        wait_cnt = cur - nr;
-    } while (!atomic_try_cmpxchg(&ws->wait_cnt, &cur, wait_cnt));
-
+    wait_cnt = atomic_dec_return(&ws->wait_cnt);
     /*
      * For concurrent callers of this, callers should call this function
      * again to wakeup a new batch on a different 'ws'.
      */
-    if (!waitqueue_active(&ws->wait))
+    if (wait_cnt < 0 || !waitqueue_active(&ws->wait))
         return true;

     if (wait_cnt > 0)
         return false;

+    wake_batch = READ_ONCE(sbq->wake_batch);
+
     /*
      * Wake up first in case that concurrent callers decrease wait_cnt
      * while waitqueue is empty.
      */
-    wake_up_nr(&ws->wait, max(wake_batch, nr));
+    wake_up_nr(&ws->wait, wake_batch);

     /*
      * Pairs with the memory barrier in sbitmap_queue_resize() to
@@ -655,11 +651,12 @@ static bool __sbq_wake_up(struct sbitmap_queue *sbq, int nr)
     return false;
 }

-void sbitmap_queue_wake_up(struct sbitmap_queue *sbq, int nr)
+void sbitmap_queue_wake_up(struct sbitmap_queue *sbq)
 {
-    while (__sbq_wake_up(sbq, nr))
+    while (__sbq_wake_up(sbq))
         ;
 }
+EXPORT_SYMBOL_GPL(sbitmap_queue_wake_up);

 static inline void sbitmap_update_cpu_hint(struct sbitmap *sb, int cpu,
                         int tag)
 {
@@ -696,7 +693,7 @@ void sbitmap_queue_clear_batch(struct sbitmap_queue *sbq, int offset,
     atomic_long_andnot(mask, (atomic_long_t *) addr);

     smp_mb__after_atomic();
-    sbitmap_queue_wake_up(sbq, nr_tags);
+    sbitmap_queue_wake_up(sbq);
     sbitmap_update_cpu_hint(&sbq->sb, raw_smp_processor_id(),
                     tags[nr_tags - 1] - offset);
 }
@@ -724,7 +721,7 @@ void sbitmap_queue_clear(struct sbitmap_queue *sbq, unsigned int nr,
      * waiter. See the comment on waitqueue_active().
      */
     smp_mb__after_atomic();
-    sbitmap_queue_wake_up(sbq, 1);
+    sbitmap_queue_wake_up(sbq);
     sbitmap_update_cpu_hint(&sbq->sb, cpu, nr);
 }
 EXPORT_SYMBOL_GPL(sbitmap_queue_clear);

From patchwork Mon Jun 5 05:45:58 2023
X-Patchwork-Submitter: Gerald Yang
X-Patchwork-Id: 1790218
From: Gerald Yang
To: kernel-team@lists.ubuntu.com
Subject: [SRU][K][PATCH 5/8] sbitmap: Avoid leaving waitqueue in invalid state in __sbq_wake_up()
Date: Mon, 5 Jun 2023 13:45:58 +0800
Message-Id: <20230605054601.1410517-6-gerald.yang@canonical.com>
In-Reply-To: <20230605054601.1410517-1-gerald.yang@canonical.com>
References: <20230605054601.1410517-1-gerald.yang@canonical.com>

From: Jan Kara

When __sbq_wake_up() decrements wait_cnt to 0 but races with someone else
waking the waiter on the waitqueue (so the waitqueue becomes empty), it
exits without resetting wait_cnt to the wake_batch number. Once wait_cnt
is 0, nobody will ever reset it or wake the new waiters, resulting in
possible deadlocks or busyloops. Fix the problem by making sure we reset
wait_cnt even if we didn't wake up anybody in the end.

Fixes: 040b83fcecfb ("sbitmap: fix possible io hung due to lost wakeup")
Reported-by: Keith Busch
Signed-off-by: Jan Kara
Link: https://lore.kernel.org/r/20220908130937.2795-1-jack@suse.cz
Signed-off-by: Jens Axboe
Signed-off-by: Gerald Yang
---
 lib/sbitmap.c | 18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/lib/sbitmap.c b/lib/sbitmap.c
index a39b1a877366..47cd8fb894ba 100644
--- a/lib/sbitmap.c
+++ b/lib/sbitmap.c
@@ -604,6 +604,7 @@ static bool __sbq_wake_up(struct sbitmap_queue *sbq)
     struct sbq_wait_state *ws;
     unsigned int wake_batch;
     int wait_cnt;
+    bool ret;

     ws = sbq_wake_ptr(sbq);
     if (!ws)
@@ -614,12 +615,23 @@ static bool __sbq_wake_up(struct sbitmap_queue *sbq)
      * For concurrent callers of this, callers should call this function
      * again to wakeup a new batch on a different 'ws'.
      */
-    if (wait_cnt < 0 || !waitqueue_active(&ws->wait))
+    if (wait_cnt < 0)
         return true;

+    /*
+     * If we decremented queue without waiters, retry to avoid lost
+     * wakeups.
+     */
     if (wait_cnt > 0)
-        return false;
+        return !waitqueue_active(&ws->wait);

+    /*
+     * When wait_cnt == 0, we have to be particularly careful as we are
+     * responsible to reset wait_cnt regardless whether we've actually
+     * woken up anybody. But in case we didn't wakeup anybody, we still
+     * need to retry.
+     */
+    ret = !waitqueue_active(&ws->wait);
     wake_batch = READ_ONCE(sbq->wake_batch);

     /*
@@ -648,7 +660,7 @@ static bool __sbq_wake_up(struct sbitmap_queue *sbq)
     sbq_index_atomic_inc(&sbq->wake_index);
     atomic_set(&ws->wait_cnt, wake_batch);

-    return false;
+    return ret;
 }

 void sbitmap_queue_wake_up(struct sbitmap_queue *sbq)

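The return-value contract used above (true means "scan another waitqueue
again", false means "done") is easy to lose track of. The userspace sketch
below models only that control flow: toy_wake and toy_ws are invented names,
a plain bool stands in for waitqueue_active(), and an int counter stands in
for wake_up_nr(); it is not the kernel function.

#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

#define WAKE_BATCH 8

struct toy_ws {
    atomic_int wait_cnt;
    bool has_waiters;   /* stand-in for waitqueue_active() */
    int woken;          /* stand-in for wake_up_nr() */
};

/* Returns true when the caller should retry on another waitqueue. */
static bool toy_wake(struct toy_ws *ws)
{
    int wait_cnt = atomic_fetch_sub(&ws->wait_cnt, 1) - 1;
    bool ret;

    if (wait_cnt < 0)
        return true;                    /* another caller owns the reset */
    if (wait_cnt > 0)
        return !ws->has_waiters;        /* decremented an empty queue: retry */

    /*
     * wait_cnt hit 0: we must refill the counter whether or not anyone was
     * there to wake, and still ask for a retry if the queue was empty.
     */
    ret = !ws->has_waiters;
    if (ws->has_waiters)
        ws->woken += WAKE_BATCH;
    atomic_store(&ws->wait_cnt, WAKE_BATCH);
    return ret;
}

int main(void)
{
    struct toy_ws ws = { .has_waiters = false };

    atomic_store(&ws.wait_cnt, 1);      /* one completion away from a batch */
    printf("retry=%d wait_cnt=%d\n", toy_wake(&ws),
           atomic_load(&ws.wait_cnt));  /* retry=1, wait_cnt refilled to 8 */
    return 0;
}
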
From patchwork Mon Jun 5 05:45:59 2023
X-Patchwork-Submitter: Gerald Yang
X-Patchwork-Id: 1790224
From: Gerald Yang
To: kernel-team@lists.ubuntu.com
Subject: [SRU][K][PATCH 6/8] sbitmap: Use atomic_long_try_cmpxchg in __sbitmap_queue_get_batch
Date: Mon, 5 Jun 2023 13:45:59 +0800
Message-Id: <20230605054601.1410517-7-gerald.yang@canonical.com>
In-Reply-To: <20230605054601.1410517-1-gerald.yang@canonical.com>
References: <20230605054601.1410517-1-gerald.yang@canonical.com>

From: Uros Bizjak

Use atomic_long_try_cmpxchg instead of
atomic_long_cmpxchg(*ptr, old, new) == old in __sbitmap_queue_get_batch.
The x86 CMPXCHG instruction returns success in the ZF flag, so this change
saves a compare after cmpxchg (and the related move instruction in front
of cmpxchg). Also, atomic_long_try_cmpxchg implicitly assigns the old *ptr
value to "old" when cmpxchg fails, enabling further code simplifications,
e.g. an extra memory read can be avoided in the loop.

No functional change intended.

Cc: Jens Axboe
Signed-off-by: Uros Bizjak
Link: https://lore.kernel.org/r/20220908151200.9993-1-ubizjak@gmail.com
Signed-off-by: Jens Axboe
Signed-off-by: Gerald Yang
---
 lib/sbitmap.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/lib/sbitmap.c b/lib/sbitmap.c
index 47cd8fb894ba..cbfd2e677d87 100644
--- a/lib/sbitmap.c
+++ b/lib/sbitmap.c
@@ -533,16 +533,16 @@ unsigned long __sbitmap_queue_get_batch(struct sbitmap_queue *sbq, int nr_tags,
         nr = find_first_zero_bit(&map->word, map_depth);
         if (nr + nr_tags <= map_depth) {
             atomic_long_t *ptr = (atomic_long_t *) &map->word;
-            unsigned long val, ret;
+            unsigned long val;

             get_mask = ((1UL << nr_tags) - 1) << nr;
+            val = READ_ONCE(map->word);
             do {
-                val = READ_ONCE(map->word);
                 if ((val & ~get_mask) != val)
                     goto next;
-                ret = atomic_long_cmpxchg(ptr, val, get_mask | val);
-            } while (ret != val);
-            get_mask = (get_mask & ~ret) >> nr;
+            } while (!atomic_long_try_cmpxchg(ptr, &val,
+                            get_mask | val));
+            get_mask = (get_mask & ~val) >> nr;
             if (get_mask) {
                 *offset = nr + (index << sb->shift);
                 update_alloc_hint_after_get(sb, depth, hint,

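The same idiom exists in userspace C11, where atomic_compare_exchange_weak()
plays the role of the kernel's try_cmpxchg: on failure it writes the current
value back into the expected-value variable, so the loop needs no extra read
per iteration and no manual "ret != val" comparison. The sketch below uses
invented names and values purely for illustration:

#include <stdatomic.h>
#include <stdio.h>

/* Try to claim the bits in 'mask' if none of them are already set. */
static int claim_bits(atomic_ulong *word, unsigned long mask)
{
    unsigned long val = atomic_load(word);  /* single read up front */

    do {
        if (val & mask)                     /* part of the range is taken */
            return 0;
    } while (!atomic_compare_exchange_weak(word, &val, val | mask));

    return 1;                               /* all bits in 'mask' claimed */
}

int main(void)
{
    atomic_ulong word = 0x1;                /* bit 0 already taken */

    printf("claim 0x06: %d\n", claim_bits(&word, 0x06));   /* 1 */
    printf("claim 0x0c: %d\n", claim_bits(&word, 0x0c));   /* 0, bit 2 busy */
    printf("word = 0x%lx\n", (unsigned long)atomic_load(&word));  /* 0x7 */
    return 0;
}
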
i/YlTKaAYLsW1g+ksonedv0QnknYwb/DSFSJ/K+lzP1FDfky6t+W38uY3WgzCDaI5Y HPS5Tkbs1orcc8PMRPRUx1xgoxWDt7KzYsNfN2fJcYP9y4YbOyj5pMGm0QAp2I86r0 dvs/hBecY1M+A== Received: by mail-oi1-f200.google.com with SMTP id 5614622812f47-39a5e922c24so4290301b6e.2 for ; Sun, 04 Jun 2023 22:46:28 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1685943986; x=1688535986; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=sL3WuaJtJ26wa5DJBEh84sh8ppsisG9UwgLuyy6Msu4=; b=cAbWcjtzr0N0D/rppPSzWckS7uzzCnDrOjXnQU2V/quFd0ZAJU+yzL+GHT8/xcdsaQ GQD6fiUnJSvrrFhMfQjX+pbMiOENHFgCC7+vZz0orirtvkDshfdsVR/RdhVRxKS+6yXw JdneO17gaei8AL/8zjZaYc6jB9d7uQe4CQh0AAQIpQ4KSwsvJPEoAw19qQyQk1D8/nZx 3W8WwLIgm/qAPDZVJPCU5zc0bQjUu7rkOIb1+qSANw+AoTFeHoAnOFIEPwV7JXOzmEuw mUL60Y96PeEpskvzRAqun1bEhGRLFbpbGiUMqvmLytvIp8hG565ElB6/7oh8lmizrUrS Me4Q== X-Gm-Message-State: AC+VfDxOwwytU7sGP5ZxSqvSkGP3yxW9WL0P76jxNi5zpU3OTGUrnvXa LlCXzD2kUTyWgINjC/a6S0DkIPkCAhegnDOdg5JzA4pfvxHHEbZ8AbSWV9OyZSllxu5LD/3pr4F H3cCkD9VmWN7JJFgpyDuGg2yrTT7kIz6/qUvAHMXVMDpeccXnzQ== X-Received: by 2002:a05:6808:997:b0:399:8529:6726 with SMTP id a23-20020a056808099700b0039985296726mr7298758oic.51.1685943985898; Sun, 04 Jun 2023 22:46:25 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ6jt5HG9qC+4pjunsIVyoOS7br/bAsiufaJIyW+Nl7RgWupj01THTl4qg87v5txym64rbfXTA== X-Received: by 2002:a05:6808:997:b0:399:8529:6726 with SMTP id a23-20020a056808099700b0039985296726mr7298745oic.51.1685943985530; Sun, 04 Jun 2023 22:46:25 -0700 (PDT) Received: from localhost.localdomain (220-135-31-21.hinet-ip.hinet.net. [220.135.31.21]) by smtp.gmail.com with ESMTPSA id p5-20020a170902eac500b001b03a1a3151sm5637798pld.70.2023.06.04.22.46.24 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 04 Jun 2023 22:46:25 -0700 (PDT) From: Gerald Yang To: kernel-team@lists.ubuntu.com Subject: [SRU][K][PATCH 7/8] sbitmap: fix batched wait_cnt accounting Date: Mon, 5 Jun 2023 13:46:00 +0800 Message-Id: <20230605054601.1410517-8-gerald.yang@canonical.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20230605054601.1410517-1-gerald.yang@canonical.com> References: <20230605054601.1410517-1-gerald.yang@canonical.com> MIME-Version: 1.0 X-BeenThere: kernel-team@lists.ubuntu.com X-Mailman-Version: 2.1.20 Precedence: list List-Id: Kernel team discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: kernel-team-bounces@lists.ubuntu.com Sender: "kernel-team" From: Keith Busch Batched completions can clear multiple bits, but we're only decrementing the wait_cnt by one each time. This can cause waiters to never be woken, stalling IO. Use the batched count instead. Link: https://bugzilla.kernel.org/show_bug.cgi?id=215679 Signed-off-by: Keith Busch Link: https://lore.kernel.org/r/20220909184022.1709476-1-kbusch@fb.com Signed-off-by: Jens Axboe Signed-off-by: Gerald Yang --- block/blk-mq-tag.c | 2 +- include/linux/sbitmap.h | 3 ++- lib/sbitmap.c | 37 +++++++++++++++++++++++-------------- 3 files changed, 26 insertions(+), 16 deletions(-) diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c index 2dcd738c6952..7aea93047caf 100644 --- a/block/blk-mq-tag.c +++ b/block/blk-mq-tag.c @@ -200,7 +200,7 @@ unsigned int blk_mq_get_tag(struct blk_mq_alloc_data *data) * other allocations on previous queue won't be starved. 
          */
         if (bt != bt_prev)
-            sbitmap_queue_wake_up(bt_prev);
+            sbitmap_queue_wake_up(bt_prev, 1);

         ws = bt_wait_ptr(bt, data->hctx);
     } while (1);

diff --git a/include/linux/sbitmap.h b/include/linux/sbitmap.h
index 8f5a86e210b9..4d2d5205ab58 100644
--- a/include/linux/sbitmap.h
+++ b/include/linux/sbitmap.h
@@ -575,8 +575,9 @@ void sbitmap_queue_wake_all(struct sbitmap_queue *sbq);
  * sbitmap_queue_wake_up() - Wake up some of waiters in one waitqueue
  * on a &struct sbitmap_queue.
  * @sbq: Bitmap queue to wake up.
+ * @nr: Number of bits cleared.
  */
-void sbitmap_queue_wake_up(struct sbitmap_queue *sbq);
+void sbitmap_queue_wake_up(struct sbitmap_queue *sbq, int nr);

 /**
  * sbitmap_queue_show() - Dump &struct sbitmap_queue information to a &struct

diff --git a/lib/sbitmap.c b/lib/sbitmap.c
index cbfd2e677d87..624fa7f118d1 100644
--- a/lib/sbitmap.c
+++ b/lib/sbitmap.c
@@ -599,24 +599,31 @@ static struct sbq_wait_state *sbq_wake_ptr(struct sbitmap_queue *sbq)
     return NULL;
 }

-static bool __sbq_wake_up(struct sbitmap_queue *sbq)
+static bool __sbq_wake_up(struct sbitmap_queue *sbq, int *nr)
 {
     struct sbq_wait_state *ws;
     unsigned int wake_batch;
-    int wait_cnt;
+    int wait_cnt, cur, sub;
     bool ret;

+    if (*nr <= 0)
+        return false;
+
     ws = sbq_wake_ptr(sbq);
     if (!ws)
         return false;

-    wait_cnt = atomic_dec_return(&ws->wait_cnt);
-    /*
-     * For concurrent callers of this, callers should call this function
-     * again to wakeup a new batch on a different 'ws'.
-     */
-    if (wait_cnt < 0)
-        return true;
+    cur = atomic_read(&ws->wait_cnt);
+    do {
+        /*
+         * For concurrent callers of this, callers should call this
+         * function again to wakeup a new batch on a different 'ws'.
+         */
+        if (cur == 0)
+            return true;
+        sub = min(*nr, cur);
+        wait_cnt = cur - sub;
+    } while (!atomic_try_cmpxchg(&ws->wait_cnt, &cur, wait_cnt));

     /*
      * If we decremented queue without waiters, retry to avoid lost
@@ -625,6 +632,8 @@ static bool __sbq_wake_up(struct sbitmap_queue *sbq)
     if (wait_cnt > 0)
         return !waitqueue_active(&ws->wait);

+    *nr -= sub;
+
     /*
      * When wait_cnt == 0, we have to be particularly careful as we are
      * responsible to reset wait_cnt regardless whether we've actually
@@ -660,12 +669,12 @@ static bool __sbq_wake_up(struct sbitmap_queue *sbq)
     sbq_index_atomic_inc(&sbq->wake_index);
     atomic_set(&ws->wait_cnt, wake_batch);

-    return ret;
+    return ret || *nr;
 }

-void sbitmap_queue_wake_up(struct sbitmap_queue *sbq)
+void sbitmap_queue_wake_up(struct sbitmap_queue *sbq, int nr)
 {
-    while (__sbq_wake_up(sbq))
+    while (__sbq_wake_up(sbq, &nr))
         ;
 }
 EXPORT_SYMBOL_GPL(sbitmap_queue_wake_up);

@@ -705,7 +714,7 @@ void sbitmap_queue_clear_batch(struct sbitmap_queue *sbq, int offset,
     atomic_long_andnot(mask, (atomic_long_t *) addr);

     smp_mb__after_atomic();
-    sbitmap_queue_wake_up(sbq);
+    sbitmap_queue_wake_up(sbq, nr_tags);
     sbitmap_update_cpu_hint(&sbq->sb, raw_smp_processor_id(),
                     tags[nr_tags - 1] - offset);
 }
@@ -733,7 +742,7 @@ void sbitmap_queue_clear(struct sbitmap_queue *sbq, unsigned int nr,
      * waiter. See the comment on waitqueue_active().
      */
     smp_mb__after_atomic();
-    sbitmap_queue_wake_up(sbq);
+    sbitmap_queue_wake_up(sbq, 1);
     sbitmap_update_cpu_hint(&sbq->sb, cpu, nr);
 }
 EXPORT_SYMBOL_GPL(sbitmap_queue_clear);
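The core of this version is the clamped batched decrement: consume up to nr
credits from the counter without letting it go negative, and carry any
leftover to the next waitqueue. The userspace sketch below models only that
arithmetic with C11 atomics (atomic_compare_exchange_weak standing in for
atomic_try_cmpxchg); the names and values are invented, and the boolean it
returns is a simplification of the kernel function's retry contract.

#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Subtract min(*nr, current) from 'wait_cnt' atomically and report whether
 * the counter was already 0 or has just reached 0. Leftover credits stay in
 * '*nr' so the caller can apply them to the next waitqueue.
 */
static bool consume_batch(atomic_int *wait_cnt, int *nr)
{
    int cur = atomic_load(wait_cnt);
    int sub, new;

    do {
        if (cur == 0)
            return true;                    /* another caller owns the reset */
        sub = *nr < cur ? *nr : cur;        /* min(*nr, cur): never underflow */
        new = cur - sub;
    } while (!atomic_compare_exchange_weak(wait_cnt, &cur, new));

    *nr -= sub;                             /* leftover wakes the next queue */
    return new == 0;
}

int main(void)
{
    atomic_int wait_cnt = 4;
    int nr = 6;                             /* a 6-bit batched completion */

    bool hit_zero = consume_batch(&wait_cnt, &nr);
    printf("hit_zero=%d wait_cnt=%d leftover=%d\n",
           hit_zero, atomic_load(&wait_cnt), nr);   /* 1, 0, 2 */
    return 0;
}
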
From patchwork Mon Jun 5 05:46:01 2023
X-Patchwork-Submitter: Gerald Yang
X-Patchwork-Id: 1790222
From: Gerald Yang
To: kernel-team@lists.ubuntu.com
Subject: [SRU][K][PATCH 8/8] sbitmap: fix lockup while swapping
Date: Mon, 5 Jun 2023 13:46:01 +0800
Message-Id: <20230605054601.1410517-9-gerald.yang@canonical.com>
In-Reply-To: <20230605054601.1410517-1-gerald.yang@canonical.com>
References: <20230605054601.1410517-1-gerald.yang@canonical.com>

From: Hugh Dickins

Commit 4acb83417cad ("sbitmap: fix batched wait_cnt accounting") is a big
improvement: without it, I had to revert to before commit 040b83fcecfb
("sbitmap: fix possible io hung due to lost wakeup") to avoid the high
system time and freezes which that had introduced.

Now okay on the NVME laptop, but 4acb83417cad is a disaster for heavy
swapping (kernel builds in low memory) on another: soon locking up in
sbitmap_queue_wake_up() (into which __sbq_wake_up() is inlined), cycling
around with waitqueue_active() but wait_cnt 0.

Here is a backtrace, showing the common pattern of outer
sbitmap_queue_wake_up() interrupted before setting wait_cnt 0 back to
wake_batch (in some cases other CPUs are idle, in other cases they're
spinning for a lock in dd_bio_merge()):

sbitmap_queue_wake_up < sbitmap_queue_clear < blk_mq_put_tag <
__blk_mq_free_request < blk_mq_free_request < __blk_mq_end_request <
scsi_end_request < scsi_io_completion < scsi_finish_command <
scsi_complete < blk_complete_reqs < blk_done_softirq < __do_softirq <
__irq_exit_rcu < irq_exit_rcu < common_interrupt < asm_common_interrupt <
_raw_spin_unlock_irqrestore < __wake_up_common_lock < __wake_up <
sbitmap_queue_wake_up < sbitmap_queue_clear < blk_mq_put_tag <
__blk_mq_free_request < blk_mq_free_request < dd_bio_merge <
blk_mq_sched_bio_merge < blk_mq_attempt_bio_merge < blk_mq_submit_bio <
__submit_bio < submit_bio_noacct_nocheck < submit_bio_noacct <
submit_bio < __swap_writepage < swap_writepage < pageout <
shrink_folio_list < evict_folios < lru_gen_shrink_lruvec <
shrink_lruvec < shrink_node < do_try_to_free_pages < try_to_free_pages <
__alloc_pages_slowpath < __alloc_pages < folio_alloc < vma_alloc_folio <
do_anonymous_page < __handle_mm_fault < handle_mm_fault <
do_user_addr_fault < exc_page_fault < asm_exc_page_fault

See how the process-context sbitmap_queue_wake_up() has been interrupted,
after bringing wait_cnt down to 0 (and in this example, after doing its
wakeups), before advancing wake_index and refilling wake_cnt: an
interrupt-context sbitmap_queue_wake_up() of the same sbq gets stuck.

I have almost no grasp of all the possible sbitmap races, and their
consequences: but __sbq_wake_up() can do nothing useful while wait_cnt 0,
so it is better if sbq_wake_ptr() skips on to the next ws in that case:
which fixes the lockup and shows no adverse consequence for me.

The check for wait_cnt being 0 is obviously racy, and ultimately can lead
to lost wakeups: for example, when there is only a single waitqueue with
waiters. However, lost wakeups are unlikely to matter in these cases, and
a proper fix requires redesign (and benchmarking) of the batched wakeup
code: so let's plug the hole with this bandaid for now.

Signed-off-by: Hugh Dickins
Reviewed-by: Jan Kara
Reviewed-by: Keith Busch
Link: https://lore.kernel.org/r/9c2038a7-cdc5-5ee-854c-fbc6168bf16@google.com
Signed-off-by: Jens Axboe
Signed-off-by: Gerald Yang
---
 lib/sbitmap.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/sbitmap.c b/lib/sbitmap.c
index 624fa7f118d1..a8108a962dfd 100644
--- a/lib/sbitmap.c
+++ b/lib/sbitmap.c
@@ -587,7 +587,7 @@ static struct sbq_wait_state *sbq_wake_ptr(struct sbitmap_queue *sbq)
     for (i = 0; i < SBQ_WAIT_QUEUES; i++) {
         struct sbq_wait_state *ws = &sbq->ws[wake_index];

-        if (waitqueue_active(&ws->wait)) {
+        if (waitqueue_active(&ws->wait) && atomic_read(&ws->wait_cnt)) {
             if (wake_index != atomic_read(&sbq->wake_index))
                 atomic_set(&sbq->wake_index, wake_index);
             return ws;