From patchwork Tue Jun 6 07:22:24 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Gerald Yang X-Patchwork-Id: 1790843 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ubuntu.com (client-ip=91.189.94.19; helo=huckleberry.canonical.com; envelope-from=kernel-team-bounces@lists.ubuntu.com; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=canonical.com header.i=@canonical.com header.a=rsa-sha256 header.s=20210705 header.b=MganKuoJ; dkim-atps=neutral Received: from huckleberry.canonical.com (huckleberry.canonical.com [91.189.94.19]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4Qb24f6Ctkz20WK for ; Tue, 6 Jun 2023 17:22:58 +1000 (AEST) Received: from localhost ([127.0.0.1] helo=huckleberry.canonical.com) by huckleberry.canonical.com with esmtp (Exim 4.86_2) (envelope-from ) id 1q6R1o-0000P5-2A; Tue, 06 Jun 2023 07:22:52 +0000 Received: from smtp-relay-internal-1.internal ([10.131.114.114] helo=smtp-relay-internal-1.canonical.com) by huckleberry.canonical.com with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.86_2) (envelope-from ) id 1q6R1g-0000L4-19 for kernel-team@lists.ubuntu.com; Tue, 06 Jun 2023 07:22:44 +0000 Received: from mail-oi1-f197.google.com (mail-oi1-f197.google.com [209.85.167.197]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by smtp-relay-internal-1.canonical.com (Postfix) with ESMTPS id 6F8793F0EF for ; Tue, 6 Jun 2023 07:22:43 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; 
c=relaxed/relaxed; d=canonical.com; s=20210705; t=1686036163; bh=GQakCYPPVd/VCcY0IKFtGEJyInnRoZ3ihL5b5J0BiNg=; h=From:To:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=MganKuoJLRI7mOumbo4RNRNcjB+8ATnN26Nme74dAb7zBv7QwlsElaCbMoW2mHKMc kotfufGt5dHZaZZ7bAG7x9DoN6fZbS4xXmZrcRPRB19yPz9v27IB+qSWpMe7yUMnrn lOTCYzMCsktvhTrX0S8u8kOIwi6ZrQBSHwzb/yr50D8q22Z6XyExNmjBo25i3ce/C9 lWTm+vZrHoS9RPsIXtP5AwFehP+U60qyelReHOQuhew1/FHhS1S9GvBCPx3GVmvuhc bY2ZiZ7NpLRwWbOXsZgZMMlVwvWjTx6+sHxXbfVCZ5u4F1XRk1Qi8ieOFxOhabNpbH MlBi7aGQqp/9A== Received: by mail-oi1-f197.google.com with SMTP id 5614622812f47-39a9d9981ffso2895245b6e.0 for ; Tue, 06 Jun 2023 00:22:43 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1686036162; x=1688628162; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=GQakCYPPVd/VCcY0IKFtGEJyInnRoZ3ihL5b5J0BiNg=; b=ghS4+peSWoKuXW90QIZAFCHVbmcFY0K2eXYEIaxVayq0SylGJaihXJp6kRrcPI0jE/ +T8E0RHJDwXbECs8y7V6y6JhRLcW5he3Rz+vsbzlCD7ac1q9DV4ZIS8ZtR4RLcWrTA4h r44J3iwHeUmhLgMQlMpiN0go4MHDp5ZWZD5HEkvbOGJ6KvC8rHmzx7nGMyVpJAmXIvab 3B5GOemQBv59IQOttfYIELRFo1qePzGZdpHZp8o9hV/apw/U03/5Ek0jJ4xMiiAh0f9N nY2G2cuY6QVrCn1bB01m893tnlFSKhVYXC/XrrhSTSzQbDAaiysDeL6sQZdfGGoDaKnZ Rz9Q== X-Gm-Message-State: AC+VfDzsnPCg/f1kPQX/CUoSdXOpwU4d0B+tvSwzZZCdPNTtU6NcmdZs 6NYupa7w3F4ebDEhEhYd/vPLVyZ8HuaD24USCuitTKwZIt1tI2d86XY8P2EbUy25dsXxRGYqVDH KoINBTpSgrCMXpzAlAzCLap1DXxwY0a8IZuzI31UOD0dJxYpopw== X-Received: by 2002:a05:6808:1ca:b0:39c:46db:1f83 with SMTP id x10-20020a05680801ca00b0039c46db1f83mr1424022oic.14.1686036161682; Tue, 06 Jun 2023 00:22:41 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ7gFTyY3kcBoCjc8P1Jvy2eKt1gQFGwc3khhV2OOAK54CZ4zy12JQ+OmnbBZHaVvloejGJMlg== X-Received: by 2002:a05:6808:1ca:b0:39c:46db:1f83 with SMTP id x10-20020a05680801ca00b0039c46db1f83mr1424003oic.14.1686036161266; Tue, 06 Jun 2023 00:22:41 -0700 (PDT) 
Received: from localhost.localdomain (220-135-31-21.hinet-ip.hinet.net. [220.135.31.21])
        by smtp.gmail.com with ESMTPSA id j6-20020a170902758600b001a260b5319bsm7776954pll.91.2023.06.06.00.22.40
        for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
        Tue, 06 Jun 2023 00:22:41 -0700 (PDT)
From: Gerald Yang
To: kernel-team@lists.ubuntu.com
Subject: [PATCH 1/6] sbitmap: fix possible io hung due to lost wakeup
Date: Tue, 6 Jun 2023 15:22:24 +0800
Message-Id: <20230606072229.3988976-2-gerald.yang@canonical.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20230606072229.3988976-1-gerald.yang@canonical.com>
References: <20230606072229.3988976-1-gerald.yang@canonical.com>
MIME-Version: 1.0
X-BeenThere: kernel-team@lists.ubuntu.com
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Kernel team discussions
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: kernel-team-bounces@lists.ubuntu.com
Sender: "kernel-team"

From: Yu Kuai

BugLink: https://bugs.launchpad.net/bugs/2022318

There are two problems that can lead to a lost wakeup:

1) invalid wakeup on the wrong waitqueue:

For example, 2 * wake_batch tags are put, while only wake_batch threads
are woken:

__sbq_wake_up
 atomic_cmpxchg -> reset wait_cnt
			__sbq_wake_up -> decrease wait_cnt
			...
			__sbq_wake_up -> wait_cnt is decreased to 0 again
			 atomic_cmpxchg
			 sbq_index_atomic_inc -> increase wake_index
			 wake_up_nr -> wake up and waitqueue might be empty
 sbq_index_atomic_inc -> increase again, one waitqueue is skipped
 wake_up_nr -> invalid wake up because old waitqueue might be empty

To fix the problem, increase 'wake_index' before resetting 'wait_cnt'.

2) 'wait_cnt' can be decreased while waitqueue is empty

As pointed out by Jan Kara, the following race is possible:

CPU1				CPU2
__sbq_wake_up			 __sbq_wake_up
 sbq_wake_ptr()			 sbq_wake_ptr() -> the same
 wait_cnt = atomic_dec_return()
 /* decreased to 0 */
 sbq_index_atomic_inc()
 /* move to next waitqueue */
 atomic_set()
 /* reset wait_cnt */
 wake_up_nr()
 /* wake up on the old waitqueue */
				 wait_cnt = atomic_dec_return()
				 /*
				  * decrease wait_cnt in the old
				  * waitqueue, while it can be
				  * empty.
				  */

Fix the problem by waking up before updating 'wake_index' and
'wait_cnt'.

Note that with this patch 'wait_cnt' is still decreased in the old
empty waitqueue; however, the wakeup is redirected to an active
waitqueue, and the extra decrement on the old empty waitqueue is not
handled.

Fixes: 88459642cba4 ("blk-mq: abstract tag allocation out into sbitmap library")
Signed-off-by: Yu Kuai
Reviewed-by: Jan Kara
Link: https://lore.kernel.org/r/20220803121504.212071-1-yukuai1@huaweicloud.com
Signed-off-by: Jens Axboe
(cherry picked from commit 040b83fcecfb86f3225d3a5de7fd9b3fbccf83b4)
Signed-off-by: Gerald Yang
---
 lib/sbitmap.c | 55 ++++++++++++++++++++++++++++++---------------------
 1 file changed, 33 insertions(+), 22 deletions(-)

diff --git a/lib/sbitmap.c b/lib/sbitmap.c
index 29eb0484215a..1f31147872e6 100644
--- a/lib/sbitmap.c
+++ b/lib/sbitmap.c
@@ -611,32 +611,43 @@ static bool __sbq_wake_up(struct sbitmap_queue *sbq)
 		return false;
 
 	wait_cnt = atomic_dec_return(&ws->wait_cnt);
-	if (wait_cnt <= 0) {
-		int ret;
+	/*
+	 * For concurrent callers of this, callers should call this function
+	 * again to wakeup a new batch on a different 'ws'.
+	 */
+	if (wait_cnt < 0 || !waitqueue_active(&ws->wait))
+		return true;
 
-		wake_batch = READ_ONCE(sbq->wake_batch);
+	if (wait_cnt > 0)
+		return false;
 
-		/*
-		 * Pairs with the memory barrier in sbitmap_queue_resize() to
-		 * ensure that we see the batch size update before the wait
-		 * count is reset.
-		 */
-		smp_mb__before_atomic();
+	wake_batch = READ_ONCE(sbq->wake_batch);
 
-		/*
-		 * For concurrent callers of this, the one that failed the
-		 * atomic_cmpxhcg() race should call this function again
-		 * to wakeup a new batch on a different 'ws'.
-		 */
-		ret = atomic_cmpxchg(&ws->wait_cnt, wait_cnt, wake_batch);
-		if (ret == wait_cnt) {
-			sbq_index_atomic_inc(&sbq->wake_index);
-			wake_up_nr(&ws->wait, wake_batch);
-			return false;
-		}
+	/*
+	 * Wake up first in case that concurrent callers decrease wait_cnt
+	 * while waitqueue is empty.
+	 */
+	wake_up_nr(&ws->wait, wake_batch);
 
-		return true;
-	}
+	/*
+	 * Pairs with the memory barrier in sbitmap_queue_resize() to
+	 * ensure that we see the batch size update before the wait
+	 * count is reset.
+	 *
+	 * Also pairs with the implicit barrier between decrementing wait_cnt
+	 * and checking for waitqueue_active() to make sure waitqueue_active()
+	 * sees result of the wakeup if atomic_dec_return() has seen the result
+	 * of atomic_set().
+	 */
+	smp_mb__before_atomic();
+
+	/*
+	 * Increase wake_index before updating wait_cnt, otherwise concurrent
+	 * callers can see valid wait_cnt in old waitqueue, which can cause
+	 * invalid wakeup on the old waitqueue.
+	 */
+	sbq_index_atomic_inc(&sbq->wake_index);
+	atomic_set(&ws->wait_cnt, wake_batch);
 
 	return false;
 }

From patchwork Tue Jun 6 07:22:25 2023
X-Patchwork-Submitter: Gerald Yang
X-Patchwork-Id: 1790841
From: Gerald Yang
To: kernel-team@lists.ubuntu.com
Subject: [PATCH 2/6] sbitmap: remove unnecessary code in __sbitmap_queue_get_batch
Date: Tue, 6 Jun 2023 15:22:25 +0800
Message-Id: <20230606072229.3988976-3-gerald.yang@canonical.com>
In-Reply-To: <20230606072229.3988976-1-gerald.yang@canonical.com>
References: <20230606072229.3988976-1-gerald.yang@canonical.com>

From: Liu Song

BugLink: https://bugs.launchpad.net/bugs/2022318

Since nr is never negative, "nr + nr_tags <= map_depth" implies that
nr_tags is not greater than map_depth, so the additional min_t()
comparison is not required.
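
As an illustration only (not part of the patch), the arithmetic can be
checked in a small userspace sketch. The helper names batch_mask() and
clamped_tags() are hypothetical, and the guard against shifting by the
full word width is an assumption of the sketch rather than something
the kernel code does:

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: mask covering nr_tags bits starting at bit nr. */
static unsigned long batch_mask(unsigned int nr, unsigned int nr_tags,
				unsigned int map_depth)
{
	/* Precondition that the patched code checks before building the mask. */
	assert(nr + nr_tags <= map_depth);
	/* Sketch-only guard: shifting by the full word width is undefined in C. */
	assert(nr_tags < sizeof(unsigned long) * CHAR_BIT);

	return ((1UL << nr_tags) - 1) << nr;
}

/* The clamp that the patch removes, written out as a plain minimum. */
static unsigned int clamped_tags(unsigned int nr_tags, unsigned int map_depth)
{
	return nr_tags < map_depth ? nr_tags : map_depth;
}
```

With nr = 2, nr_tags = 3 and map_depth = 8, the clamp returns 3
unchanged and the mask is 0x1c.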

Signed-off-by: Liu Song
Link: https://lore.kernel.org/r/1661483653-27326-1-git-send-email-liusong@linux.alibaba.com
Signed-off-by: Jens Axboe
(cherry picked from commit ddbfc34fcf5d0bc33b006b90c580c56edeb31068)
Signed-off-by: Gerald Yang
---
 lib/sbitmap.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/lib/sbitmap.c b/lib/sbitmap.c
index 1f31147872e6..a39b1a877366 100644
--- a/lib/sbitmap.c
+++ b/lib/sbitmap.c
@@ -533,10 +533,9 @@ unsigned long __sbitmap_queue_get_batch(struct sbitmap_queue *sbq, int nr_tags,
 		nr = find_first_zero_bit(&map->word, map_depth);
 		if (nr + nr_tags <= map_depth) {
 			atomic_long_t *ptr = (atomic_long_t *) &map->word;
-			int map_tags = min_t(int, nr_tags, map_depth);
 			unsigned long val, ret;
 
-			get_mask = ((1UL << map_tags) - 1) << nr;
+			get_mask = ((1UL << nr_tags) - 1) << nr;
 			do {
 				val = READ_ONCE(map->word);
 				if ((val & ~get_mask) != val)
@@ -547,7 +546,7 @@ unsigned long __sbitmap_queue_get_batch(struct sbitmap_queue *sbq, int nr_tags,
 			if (get_mask) {
 				*offset = nr + (index << sb->shift);
 				update_alloc_hint_after_get(sb, depth, hint,
-							*offset + map_tags - 1);
+							*offset + nr_tags - 1);
 				return get_mask;
 			}
 		}

From patchwork Tue Jun 6 07:22:26 2023
X-Patchwork-Submitter: Gerald Yang
X-Patchwork-Id: 1790845
From: Gerald Yang
To: kernel-team@lists.ubuntu.com
Subject: [PATCH 3/6] sbitmap: Avoid leaving waitqueue in invalid state in __sbq_wake_up()
Date: Tue, 6 Jun 2023 15:22:26 +0800
Message-Id: <20230606072229.3988976-4-gerald.yang@canonical.com>
In-Reply-To: <20230606072229.3988976-1-gerald.yang@canonical.com>
References: <20230606072229.3988976-1-gerald.yang@canonical.com>

From: Jan Kara

BugLink: https://bugs.launchpad.net/bugs/2022318

When __sbq_wake_up() decrements wait_cnt to 0 but races with someone
else waking the waiter on the waitqueue (so the waitqueue becomes
empty), it exits without resetting wait_cnt to wake_batch. Once
wait_cnt is 0, nobody will ever reset wait_cnt or wake the new
waiters, resulting in possible deadlocks or busyloops. Fix the problem
by making sure we reset wait_cnt even if we didn't wake up anybody in
the end.
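
A simplified, single-threaded userspace model of the return-value logic
described above, for illustration only. struct wait_state_model and
wake_up_model() are hypothetical names, nr_waiters stands in for
waitqueue_active() and wake_up_nr(), and the wake_index update and
memory barriers of the real function are left out:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct sbq_wait_state. */
struct wait_state_model {
	atomic_int wait_cnt;	/* completions left before the next batch wakeup */
	int nr_waiters;		/* models the waitqueue; 0 means "not active" */
};

/* Returns true when the caller should retry on another wait state. */
static bool wake_up_model(struct wait_state_model *ws, int wake_batch)
{
	int wait_cnt;
	bool retry;

	assert(wake_batch > 0);

	/* atomic_fetch_sub() returns the old value, so subtract one more. */
	wait_cnt = atomic_fetch_sub(&ws->wait_cnt, 1) - 1;
	if (wait_cnt < 0)
		return true;
	if (wait_cnt > 0)
		return ws->nr_waiters == 0;

	/* wait_cnt == 0: this caller must reset the counter no matter what. */
	retry = ws->nr_waiters == 0;
	/* Model of wake_up_nr(): wake at most wake_batch waiters. */
	ws->nr_waiters -= ws->nr_waiters < wake_batch ? ws->nr_waiters : wake_batch;
	atomic_store(&ws->wait_cnt, wake_batch);

	return retry;
}
```

Starting from wait_cnt == 1 with no waiters, the model resets wait_cnt
to wake_batch and still returns true, which is the case this patch
addresses.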

Fixes: 040b83fcecfb ("sbitmap: fix possible io hung due to lost wakeup")
Reported-by: Keith Busch
Signed-off-by: Jan Kara
Link: https://lore.kernel.org/r/20220908130937.2795-1-jack@suse.cz
Signed-off-by: Jens Axboe
(cherry picked from commit 48c033314f372478548203c583529f53080fd078)
Signed-off-by: Gerald Yang
---
 lib/sbitmap.c | 18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)

diff --git a/lib/sbitmap.c b/lib/sbitmap.c
index a39b1a877366..47cd8fb894ba 100644
--- a/lib/sbitmap.c
+++ b/lib/sbitmap.c
@@ -604,6 +604,7 @@ static bool __sbq_wake_up(struct sbitmap_queue *sbq)
 	struct sbq_wait_state *ws;
 	unsigned int wake_batch;
 	int wait_cnt;
+	bool ret;
 
 	ws = sbq_wake_ptr(sbq);
 	if (!ws)
@@ -614,12 +615,23 @@ static bool __sbq_wake_up(struct sbitmap_queue *sbq)
 	 * For concurrent callers of this, callers should call this function
 	 * again to wakeup a new batch on a different 'ws'.
 	 */
-	if (wait_cnt < 0 || !waitqueue_active(&ws->wait))
+	if (wait_cnt < 0)
 		return true;
 
+	/*
+	 * If we decremented queue without waiters, retry to avoid lost
+	 * wakeups.
+	 */
 	if (wait_cnt > 0)
-		return false;
+		return !waitqueue_active(&ws->wait);
 
+	/*
+	 * When wait_cnt == 0, we have to be particularly careful as we are
+	 * responsible to reset wait_cnt regardless whether we've actually
+	 * woken up anybody. But in case we didn't wakeup anybody, we still
+	 * need to retry.
+	 */
+	ret = !waitqueue_active(&ws->wait);
 	wake_batch = READ_ONCE(sbq->wake_batch);
 
 	/*
@@ -648,7 +660,7 @@ static bool __sbq_wake_up(struct sbitmap_queue *sbq)
 	sbq_index_atomic_inc(&sbq->wake_index);
 	atomic_set(&ws->wait_cnt, wake_batch);
 
-	return false;
+	return ret;
 }
 
 void sbitmap_queue_wake_up(struct sbitmap_queue *sbq)

From patchwork Tue Jun 6 07:22:27 2023
X-Patchwork-Submitter: Gerald Yang
X-Patchwork-Id: 1790844
From: Gerald Yang
To: kernel-team@lists.ubuntu.com
Subject: [PATCH 4/6] sbitmap: Use atomic_long_try_cmpxchg in __sbitmap_queue_get_batch
Date: Tue, 6 Jun 2023 15:22:27 +0800
Message-Id: <20230606072229.3988976-5-gerald.yang@canonical.com>
In-Reply-To: <20230606072229.3988976-1-gerald.yang@canonical.com>
References: <20230606072229.3988976-1-gerald.yang@canonical.com>

From: Uros Bizjak

BugLink: https://bugs.launchpad.net/bugs/2022318

Use atomic_long_try_cmpxchg instead of
atomic_long_cmpxchg (*ptr, old, new) == old in
__sbitmap_queue_get_batch. The x86 CMPXCHG instruction returns success
in the ZF flag, so this change saves a compare after cmpxchg (and the
related move instruction in front of cmpxchg).

Also, atomic_long_try_cmpxchg implicitly assigns the old *ptr value to
"old" when cmpxchg fails, enabling further code simplifications, e.g.
an extra memory read can be avoided in the loop.

No functional change intended.
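
The same pattern exists in C11, where atomic_compare_exchange_strong()
also writes the current value back into its "expected" argument on
failure. A userspace sketch of the resulting loop shape, assuming a
hypothetical claim_bits() helper (an analogy, not kernel code):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Hypothetical helper: atomically set all bits of 'mask' in '*word'.
 * Returns false without modifying '*word' if any of them is already set.
 */
static bool claim_bits(atomic_ulong *word, unsigned long mask)
{
	/* Single explicit load; later values come from the failed exchange. */
	unsigned long val = atomic_load(word);

	do {
		if (val & mask)
			return false;
		/* On failure 'val' is refreshed with the current '*word'. */
	} while (!atomic_compare_exchange_strong(word, &val, val | mask));

	return true;
}
```

Because a failed exchange refreshes val, the loop body does not need to
reload the word itself, which mirrors the READ_ONCE() moving out of the
loop in this patch.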
Cc: Jens Axboe Signed-off-by: Uros Bizjak Link: https://lore.kernel.org/r/20220908151200.9993-1-ubizjak@gmail.com Signed-off-by: Jens Axboe (cherry picked from commit c35227d4e8cbc70a6622cc7cc5f8c3bff513f1fa) Signed-off-by: Gerald Yang --- lib/sbitmap.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/lib/sbitmap.c b/lib/sbitmap.c index 47cd8fb894ba..cbfd2e677d87 100644 --- a/lib/sbitmap.c +++ b/lib/sbitmap.c @@ -533,16 +533,16 @@ unsigned long __sbitmap_queue_get_batch(struct sbitmap_queue *sbq, int nr_tags, nr = find_first_zero_bit(&map->word, map_depth); if (nr + nr_tags <= map_depth) { atomic_long_t *ptr = (atomic_long_t *) &map->word; - unsigned long val, ret; + unsigned long val; get_mask = ((1UL << nr_tags) - 1) << nr; + val = READ_ONCE(map->word); do { - val = READ_ONCE(map->word); if ((val & ~get_mask) != val) goto next; - ret = atomic_long_cmpxchg(ptr, val, get_mask | val); - } while (ret != val); - get_mask = (get_mask & ~ret) >> nr; + } while (!atomic_long_try_cmpxchg(ptr, &val, + get_mask | val)); + get_mask = (get_mask & ~val) >> nr; if (get_mask) { *offset = nr + (index << sb->shift); update_alloc_hint_after_get(sb, depth, hint, From patchwork Tue Jun 6 07:22:28 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Gerald Yang X-Patchwork-Id: 1790846 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ubuntu.com (client-ip=91.189.94.19; helo=huckleberry.canonical.com; envelope-from=kernel-team-bounces@lists.ubuntu.com; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=canonical.com header.i=@canonical.com header.a=rsa-sha256 header.s=20210705 header.b=jUUYLtTB; dkim-atps=neutral Received: from 
huckleberry.canonical.com (huckleberry.canonical.com [91.189.94.19]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4Qb24n0Fxrz20WK for ; Tue, 6 Jun 2023 17:23:05 +1000 (AEST) Received: from localhost ([127.0.0.1] helo=huckleberry.canonical.com) by huckleberry.canonical.com with esmtp (Exim 4.86_2) (envelope-from ) id 1q6R1s-0000Uz-MX; Tue, 06 Jun 2023 07:22:56 +0000 Received: from smtp-relay-internal-1.internal ([10.131.114.114] helo=smtp-relay-internal-1.canonical.com) by huckleberry.canonical.com with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.86_2) (envelope-from ) id 1q6R1l-0000NN-DC for kernel-team@lists.ubuntu.com; Tue, 06 Jun 2023 07:22:49 +0000 Received: from mail-pl1-f199.google.com (mail-pl1-f199.google.com [209.85.214.199]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by smtp-relay-internal-1.canonical.com (Postfix) with ESMTPS id EEDD03F0EF for ; Tue, 6 Jun 2023 07:22:48 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=canonical.com; s=20210705; t=1686036168; bh=dtF0Nn2Mg4hlepgMWSTgw0mDqiPZ+L4XQJBCc2rOIuM=; h=From:To:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=jUUYLtTB2g75hzadD6WHc8ChFBy+URtUNJiPYGpzvG9E4ONm0rOcg67mU9MJbvMUn wYydGUF4lSDRQOkTJORoCQ6z0dFITXOGQU/kacNg/o/yDyASTVk61WUNadnf8DWt9B Go6Mm9tBCt72gDQXtIaMGNQTnibkbRNg92w+lWPhIBH/ot1kzYqnaa5TOnD13zHOCm 1t85keS2kWMj9J9Ti41vaD0z7V5v8eMFjtvj5nfHXdfODRuIGzq3XZ69Wh+vhr+qP6 e+f3eDBBaGwExaEZsWz6MaG8CJcikWnRhuPZQHcY8ZdvuDQceOpEhZvUan20rIFoA8 Ftp7vcpianPJQ== Received: by mail-pl1-f199.google.com with SMTP id d9443c01a7336-1b04dbcf0dbso41952405ad.1 for ; Tue, 06 Jun 2023 00:22:48 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1686036166; x=1688628166; 
h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=dtF0Nn2Mg4hlepgMWSTgw0mDqiPZ+L4XQJBCc2rOIuM=; b=OW9EGXX4WqkYHdOCeFfQk72DU+4lkvytHzQk6rMXJol+XG+LLjEGaYryGnE/3i4CrJ aKjB3Bil95s9mbED2gdfzNiFi6lueQ0RHc7RRdatiYLq5Xq92+jWchEHly0D1eeTVeLs RDIqX2ayTbF5+BDVk4xMyFqIG22eh6hkajGGh/Iu+WhhT0qXz+tw7mjJKLIUUJqDXhru zgb34WvNjntuth4MSrEKH1a6ykwbt8r6fIXndxNtE6hCgaldxYo1eipuEE10kDO7Viqv Q6u3ir8U+tQ3A9epalOFL8S6+XOVEzcNB1WR5xnhs/jlbqVJAb8VCdAyMfkGjPiJI5S/ 3Qxg== X-Gm-Message-State: AC+VfDzw+zukSLD8Qz7I/eD6YxCR976m2MJllTgpNpFIZembdvuYmk1E CB8iUHanaxYh4GXqS+XJXNxn2s5v47QzGwtKMfsjPukyy9xPnGfBOCfKJcVM2CQ5cCtIKQcQQXz qR5bIc8UeCgaKdEPejjwe7ZtpGNeBUpULGYtYoz+mZoAT2yf4/A== X-Received: by 2002:a17:903:2289:b0:1b1:1168:656b with SMTP id b9-20020a170903228900b001b11168656bmr1883883plh.26.1686036166387; Tue, 06 Jun 2023 00:22:46 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ4aMtqBodxEhd1k7fERc2MtqA1xAPu3hvdTt2v3OYWXuyluriKGIQlU0TGq3w/qtdl7iYVKjg== X-Received: by 2002:a17:903:2289:b0:1b1:1168:656b with SMTP id b9-20020a170903228900b001b11168656bmr1883873plh.26.1686036166010; Tue, 06 Jun 2023 00:22:46 -0700 (PDT) Received: from localhost.localdomain (220-135-31-21.hinet-ip.hinet.net. 
[220.135.31.21]) by smtp.gmail.com with ESMTPSA id j6-20020a170902758600b001a260b5319bsm7776954pll.91.2023.06.06.00.22.45 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 06 Jun 2023 00:22:45 -0700 (PDT)
From: Gerald Yang
To: kernel-team@lists.ubuntu.com
Subject: [PATCH 5/6] sbitmap: fix batched wait_cnt accounting
Date: Tue, 6 Jun 2023 15:22:28 +0800
Message-Id: <20230606072229.3988976-6-gerald.yang@canonical.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20230606072229.3988976-1-gerald.yang@canonical.com>
References: <20230606072229.3988976-1-gerald.yang@canonical.com>
MIME-Version: 1.0
X-BeenThere: kernel-team@lists.ubuntu.com
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Kernel team discussions
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: kernel-team-bounces@lists.ubuntu.com
Sender: "kernel-team"

From: Keith Busch

BugLink: https://bugs.launchpad.net/bugs/2022318

Batched completions can clear multiple bits, but we're only decrementing
the wait_cnt by one each time. This can cause waiters to never be woken,
stalling IO. Use the batched count instead.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=215679
Signed-off-by: Keith Busch
Link: https://lore.kernel.org/r/20220909184022.1709476-1-kbusch@fb.com
Signed-off-by: Jens Axboe
(cherry picked from commit 4acb83417cadfdcbe64215f9d0ddcf3132af808e)
Signed-off-by: Gerald Yang
---
 block/blk-mq-tag.c      |  2 +-
 include/linux/sbitmap.h |  3 ++-
 lib/sbitmap.c           | 37 +++++++++++++++++++++++--------------
 3 files changed, 26 insertions(+), 16 deletions(-)

diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index 2dcd738c6952..7aea93047caf 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -200,7 +200,7 @@ unsigned int blk_mq_get_tag(struct blk_mq_alloc_data *data)
 		 * other allocations on previous queue won't be starved.
 		 */
 		if (bt != bt_prev)
-			sbitmap_queue_wake_up(bt_prev);
+			sbitmap_queue_wake_up(bt_prev, 1);
 
 		ws = bt_wait_ptr(bt, data->hctx);
 	} while (1);
diff --git a/include/linux/sbitmap.h b/include/linux/sbitmap.h
index 8f5a86e210b9..4d2d5205ab58 100644
--- a/include/linux/sbitmap.h
+++ b/include/linux/sbitmap.h
@@ -575,8 +575,9 @@ void sbitmap_queue_wake_all(struct sbitmap_queue *sbq);
  * sbitmap_queue_wake_up() - Wake up some of waiters in one waitqueue
  * on a &struct sbitmap_queue.
  * @sbq: Bitmap queue to wake up.
+ * @nr: Number of bits cleared.
  */
-void sbitmap_queue_wake_up(struct sbitmap_queue *sbq);
+void sbitmap_queue_wake_up(struct sbitmap_queue *sbq, int nr);
 
 /**
  * sbitmap_queue_show() - Dump &struct sbitmap_queue information to a &struct
diff --git a/lib/sbitmap.c b/lib/sbitmap.c
index cbfd2e677d87..624fa7f118d1 100644
--- a/lib/sbitmap.c
+++ b/lib/sbitmap.c
@@ -599,24 +599,31 @@ static struct sbq_wait_state *sbq_wake_ptr(struct sbitmap_queue *sbq)
 	return NULL;
 }
 
-static bool __sbq_wake_up(struct sbitmap_queue *sbq)
+static bool __sbq_wake_up(struct sbitmap_queue *sbq, int *nr)
 {
 	struct sbq_wait_state *ws;
 	unsigned int wake_batch;
-	int wait_cnt;
+	int wait_cnt, cur, sub;
 	bool ret;
 
+	if (*nr <= 0)
+		return false;
+
 	ws = sbq_wake_ptr(sbq);
 	if (!ws)
 		return false;
 
-	wait_cnt = atomic_dec_return(&ws->wait_cnt);
-	/*
-	 * For concurrent callers of this, callers should call this function
-	 * again to wakeup a new batch on a different 'ws'.
-	 */
-	if (wait_cnt < 0)
-		return true;
+	cur = atomic_read(&ws->wait_cnt);
+	do {
+		/*
+		 * For concurrent callers of this, callers should call this
+		 * function again to wakeup a new batch on a different 'ws'.
+		 */
+		if (cur == 0)
+			return true;
+		sub = min(*nr, cur);
+		wait_cnt = cur - sub;
+	} while (!atomic_try_cmpxchg(&ws->wait_cnt, &cur, wait_cnt));
 
 	/*
 	 * If we decremented queue without waiters, retry to avoid lost
@@ -625,6 +632,8 @@ static bool __sbq_wake_up(struct sbitmap_queue *sbq)
 	if (wait_cnt > 0)
 		return !waitqueue_active(&ws->wait);
 
+	*nr -= sub;
+
 	/*
 	 * When wait_cnt == 0, we have to be particularly careful as we are
 	 * responsible to reset wait_cnt regardless whether we've actually
@@ -660,12 +669,12 @@ static bool __sbq_wake_up(struct sbitmap_queue *sbq)
 	sbq_index_atomic_inc(&sbq->wake_index);
 	atomic_set(&ws->wait_cnt, wake_batch);
 
-	return ret;
+	return ret || *nr;
 }
 
-void sbitmap_queue_wake_up(struct sbitmap_queue *sbq)
+void sbitmap_queue_wake_up(struct sbitmap_queue *sbq, int nr)
 {
-	while (__sbq_wake_up(sbq))
+	while (__sbq_wake_up(sbq, &nr))
 		;
 }
 EXPORT_SYMBOL_GPL(sbitmap_queue_wake_up);
@@ -705,7 +714,7 @@ void sbitmap_queue_clear_batch(struct sbitmap_queue *sbq, int offset,
 		atomic_long_andnot(mask, (atomic_long_t *) addr);
 
 	smp_mb__after_atomic();
-	sbitmap_queue_wake_up(sbq);
+	sbitmap_queue_wake_up(sbq, nr_tags);
 	sbitmap_update_cpu_hint(&sbq->sb, raw_smp_processor_id(),
 					tags[nr_tags - 1] - offset);
 }
@@ -733,7 +742,7 @@ void sbitmap_queue_clear(struct sbitmap_queue *sbq, unsigned int nr,
 	 * waiter. See the comment on waitqueue_active().
 	 */
 	smp_mb__after_atomic();
-	sbitmap_queue_wake_up(sbq);
+	sbitmap_queue_wake_up(sbq, 1);
 	sbitmap_update_cpu_hint(&sbq->sb, cpu, nr);
 }
 EXPORT_SYMBOL_GPL(sbitmap_queue_clear);

From patchwork Tue Jun 6 07:22:29 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Gerald Yang
X-Patchwork-Id: 1790847
Return-Path:
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@legolas.ozlabs.org
Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ubuntu.com (client-ip=91.189.94.19; helo=huckleberry.canonical.com; envelope-from=kernel-team-bounces@lists.ubuntu.com; receiver=)
Authentication-Results: legolas.ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=canonical.com header.i=@canonical.com header.a=rsa-sha256 header.s=20210705 header.b=jliRCe9d; dkim-atps=neutral
Received: from huckleberry.canonical.com (huckleberry.canonical.com [91.189.94.19]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4Qb24p34xTz20WK for ; Tue, 6 Jun 2023 17:23:06 +1000 (AEST)
Received: from localhost ([127.0.0.1] helo=huckleberry.canonical.com) by huckleberry.canonical.com with esmtp (Exim 4.86_2) (envelope-from ) id 1q6R1u-0000Xc-JX; Tue, 06 Jun 2023 07:22:58 +0000
Received: from smtp-relay-internal-1.internal ([10.131.114.114] helo=smtp-relay-internal-1.canonical.com) by huckleberry.canonical.com with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.86_2) (envelope-from ) id 1q6R1m-0000Nt-UN for kernel-team@lists.ubuntu.com; Tue, 06 Jun 2023 07:22:50 +0000
Received: from mail-pl1-f199.google.com (mail-pl1-f199.google.com [209.85.214.199]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate
requested) by smtp-relay-internal-1.canonical.com (Postfix) with ESMTPS id 78BE43F0EF for ; Tue, 6 Jun 2023 07:22:49 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=canonical.com; s=20210705; t=1686036169; bh=RTzBHpf7pNxeVKUNSCI3X7DgPVjuRy939w+zZ4Zr6Fg=; h=From:To:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=jliRCe9dnDuGinKGAS5Nih3S1KTuiFf+5sRpCYztWKpKgaY1CceKelo2arbgAgqFY 1ZzCPqMpKi2wX2Q57SKsBIZzeaRGWrEIvvdakJvAy4VRvN2q73EffLOkntNiXVCIXv F/vteFXfDKRSDAmJHrWlceNXKx/8Vd32vJyM9VgWF6FYcuk9P3av5ptfbZB+TqZpvC Nva+qj3C3uy3QsgVsFOgR7foCZs2gzRrsbb7yu9g4awGiRbOXu67eAnUb6SXsjmoGq ZhQtDXVY8szyKrGH9xeCvmojupYX2mV2d92hKLOGz71uVE+N237Vby8iYm8WZZ52hv 4hZXilWMestXQ== Received: by mail-pl1-f199.google.com with SMTP id d9443c01a7336-1b03f9dfd52so23188995ad.3 for ; Tue, 06 Jun 2023 00:22:49 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1686036167; x=1688628167; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=RTzBHpf7pNxeVKUNSCI3X7DgPVjuRy939w+zZ4Zr6Fg=; b=UO4yGf+AvLDZ6lvUM6ogg1TgHXdAnVJoRKdQb9B/lHWTffwyW01wtRw5Mchfeawd1u dC2JcqBPOyfRyEfFK6+q1l7ETCxeVtKeW8B6ljQMUXGOm0KP9Z8sREislbw9eUoC6ixK c0x9ARsaeDQMuQwtO5ovERhWaSARo9KyjXZQo3UStRL5qtI+l2KJ+kNlUdEG+A/2/Tex rnX+2F8IHW/XNmy53OTrXyO48e0nfHgCPYuTw8q1B7Xc43DLpUIMfXhR55XnE2cJdrGh nfmdWhRLtr08LNt8OznnK1lgFl4Mvl2ZczMdDRZMKDxXDzYKPvV5rjYSpzhAM5a8lUA1 ousg== X-Gm-Message-State: AC+VfDxIzUoOKs+xFXQzb90tv85TCj3WZyrzetomB2mJiV6kIJjW2TDe aQ4Z+X6ZnnFa/dc88DEuAAW+8fJV8JaUSHSCWrl3Aiftd6waM9FFXN8UhQfPbvRP4aGVRdYwb2K vr1GYzM5WcMKuP0DwkggtU0lkkUIy5oEN6DcbJCB2kR+7FyguRg== X-Received: by 2002:a17:902:c3c4:b0:1ae:7421:82b5 with SMTP id j4-20020a170902c3c400b001ae742182b5mr798459plj.45.1686036167542; Tue, 06 Jun 2023 00:22:47 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ5b8IchWhJIY8k46zS0xptCcYG+Jn/RUtDAYicxMqpniSfTrv2yjy1aMxUy9YM5i3xwAXDcvQ== 
X-Received: by 2002:a17:902:c3c4:b0:1ae:7421:82b5 with SMTP id j4-20020a170902c3c400b001ae742182b5mr798454plj.45.1686036167171; Tue, 06 Jun 2023 00:22:47 -0700 (PDT)
Received: from localhost.localdomain (220-135-31-21.hinet-ip.hinet.net. [220.135.31.21]) by smtp.gmail.com with ESMTPSA id j6-20020a170902758600b001a260b5319bsm7776954pll.91.2023.06.06.00.22.46 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 06 Jun 2023 00:22:46 -0700 (PDT)
From: Gerald Yang
To: kernel-team@lists.ubuntu.com
Subject: [PATCH 6/6] sbitmap: fix lockup while swapping
Date: Tue, 6 Jun 2023 15:22:29 +0800
Message-Id: <20230606072229.3988976-7-gerald.yang@canonical.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20230606072229.3988976-1-gerald.yang@canonical.com>
References: <20230606072229.3988976-1-gerald.yang@canonical.com>
MIME-Version: 1.0
X-BeenThere: kernel-team@lists.ubuntu.com
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Kernel team discussions
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: kernel-team-bounces@lists.ubuntu.com
Sender: "kernel-team"

From: Hugh Dickins

BugLink: https://bugs.launchpad.net/bugs/2022318

Commit 4acb83417cad ("sbitmap: fix batched wait_cnt accounting")
is a big improvement: without it, I had to revert to before commit
040b83fcecfb ("sbitmap: fix possible io hung due to lost wakeup")
to avoid the high system time and freezes which that had introduced.

Now okay on the NVME laptop, but 4acb83417cad is a disaster for heavy
swapping (kernel builds in low memory) on another: soon locking up in
sbitmap_queue_wake_up() (into which __sbq_wake_up() is inlined), cycling
around with waitqueue_active() but wait_cnt 0.
Here is a backtrace, showing the common pattern of outer
sbitmap_queue_wake_up() interrupted before setting wait_cnt 0 back to
wake_batch (in some cases other CPUs are idle, in other cases they're
spinning for a lock in dd_bio_merge()):

sbitmap_queue_wake_up < sbitmap_queue_clear < blk_mq_put_tag <
__blk_mq_free_request < blk_mq_free_request < __blk_mq_end_request <
scsi_end_request < scsi_io_completion < scsi_finish_command <
scsi_complete < blk_complete_reqs < blk_done_softirq < __do_softirq <
__irq_exit_rcu < irq_exit_rcu < common_interrupt < asm_common_interrupt <
_raw_spin_unlock_irqrestore < __wake_up_common_lock < __wake_up <
sbitmap_queue_wake_up < sbitmap_queue_clear < blk_mq_put_tag <
__blk_mq_free_request < blk_mq_free_request < dd_bio_merge <
blk_mq_sched_bio_merge < blk_mq_attempt_bio_merge < blk_mq_submit_bio <
__submit_bio < submit_bio_noacct_nocheck < submit_bio_noacct <
submit_bio < __swap_writepage < swap_writepage < pageout <
shrink_folio_list < evict_folios < lru_gen_shrink_lruvec <
shrink_lruvec < shrink_node < do_try_to_free_pages < try_to_free_pages <
__alloc_pages_slowpath < __alloc_pages < folio_alloc < vma_alloc_folio <
do_anonymous_page < __handle_mm_fault < handle_mm_fault <
do_user_addr_fault < exc_page_fault < asm_exc_page_fault

See how the process-context sbitmap_queue_wake_up() has been interrupted,
after bringing wait_cnt down to 0 (and in this example, after doing its
wakeups), before advancing wake_index and refilling wait_cnt: an
interrupt-context sbitmap_queue_wake_up() of the same sbq gets stuck.

I have almost no grasp of all the possible sbitmap races, and their
consequences: but __sbq_wake_up() can do nothing useful while wait_cnt 0,
so it is better if sbq_wake_ptr() skips on to the next ws in that case:
which fixes the lockup and shows no adverse consequence for me.

The check for wait_cnt being 0 is obviously racy, and ultimately can lead
to lost wakeups: for example, when there is only a single waitqueue with
waiters.
However, lost wakeups are unlikely to matter in these cases, and a proper
fix requires redesign (and benchmarking) of the batched wakeup code: so
let's plug the hole with this bandaid for now.

Signed-off-by: Hugh Dickins
Reviewed-by: Jan Kara
Reviewed-by: Keith Busch
Link: https://lore.kernel.org/r/9c2038a7-cdc5-5ee-854c-fbc6168bf16@google.com
Signed-off-by: Jens Axboe
(cherry picked from commit 30514bd2dd4e86a3ecfd6a93a3eadf7b9ea164a0)
Signed-off-by: Gerald Yang
---
 lib/sbitmap.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/sbitmap.c b/lib/sbitmap.c
index 624fa7f118d1..a8108a962dfd 100644
--- a/lib/sbitmap.c
+++ b/lib/sbitmap.c
@@ -587,7 +587,7 @@ static struct sbq_wait_state *sbq_wake_ptr(struct sbitmap_queue *sbq)
 	for (i = 0; i < SBQ_WAIT_QUEUES; i++) {
 		struct sbq_wait_state *ws = &sbq->ws[wake_index];
 
-		if (waitqueue_active(&ws->wait)) {
+		if (waitqueue_active(&ws->wait) && atomic_read(&ws->wait_cnt)) {
 			if (wake_index != atomic_read(&sbq->wake_index))
 				atomic_set(&sbq->wake_index, wake_index);
 			return ws;
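
Illustration only, not part of either patch: a minimal userspace sketch of
the two ideas in this series, assuming C11 <stdatomic.h> in place of the
kernel atomic_t API. It shows the batched decrement (subtract min(nr, cur)
with a compare-and-swap loop, never going below 0) and the selection rule
that skips a wait state whose wait_cnt is already 0. All demo_* names and
the has_waiters flag are hypothetical stand-ins, not kernel symbols, and
the sketch always reports the consumed count back to the caller, which is
a simplification of __sbq_wake_up().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sbq_wait_state. */
struct demo_wait_state {
	atomic_int wait_cnt;	/* completions still needed before a wakeup */
	bool has_waiters;	/* stands in for waitqueue_active(&ws->wait) */
};

/*
 * Batched decrement: take min(*nr, cur) off wait_cnt in one CAS, so the
 * counter never drops below 0. Returns the value left in wait_cnt, or -1
 * if nothing was done (no budget, or another caller already reached 0
 * and owns the refill). On success, *nr is reduced by the amount taken.
 */
static int demo_consume(struct demo_wait_state *ws, int *nr)
{
	int cur, sub, left;

	assert(ws != NULL && nr != NULL);
	if (*nr <= 0)
		return -1;

	cur = atomic_load(&ws->wait_cnt);
	do {
		if (cur == 0)
			return -1;
		sub = *nr < cur ? *nr : cur;
		left = cur - sub;
		/* On failure, cur is reloaded with the current value. */
	} while (!atomic_compare_exchange_weak(&ws->wait_cnt, &cur, left));

	*nr -= sub;
	return left;
}

/*
 * Selection rule from patch 6/6: starting at index 'start', pick the
 * first wait state that has waiters AND a non-zero wait_cnt. A state
 * sitting at 0 is being refilled by someone else, so skip it.
 */
static struct demo_wait_state *demo_wake_ptr(struct demo_wait_state *ws,
					     size_t n, size_t start)
{
	for (size_t i = 0; i < n; i++) {
		struct demo_wait_state *cand = &ws[(start + i) % n];

		if (cand->has_waiters && atomic_load(&cand->wait_cnt))
			return cand;
	}
	return NULL;
}
```

A caller would loop much as sbitmap_queue_wake_up() does: pick a state with
demo_wake_ptr(), call demo_consume() with the number of bits just cleared,
and repeat while budget remains and a state is still available.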