From patchwork Thu May 6 04:04:41 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Matthew Ruffell
X-Patchwork-Id: 1474779
From: Matthew Ruffell
To: kernel-team@lists.ubuntu.com
Subject: [SRU][B][PATCH 5/5] md/raid10: improve discard request for far layout
Date: Thu, 6 May 2021 16:04:41 +1200
Message-Id: <20210506040442.10877-11-matthew.ruffell@canonical.com>
X-Mailer: git-send-email 2.30.2
In-Reply-To: <20210506040442.10877-1-matthew.ruffell@canonical.com>
References: <20210506040442.10877-1-matthew.ruffell@canonical.com>
MIME-Version: 1.0
List-Id: Kernel team discussions
Errors-To: kernel-team-bounces@lists.ubuntu.com
Sender: "kernel-team"

From: Xiao Ni

BugLink: https://bugs.launchpad.net/bugs/1896578

For far layout, the discard region is not continuous on disks.
So it needs far_copies r10bios to cover all regions. It needs a way to
know whether all r10bios have finished. Similar to raid10_sync_request,
only the first r10bio's master_bio records the discard bio. The other
r10bios' master_bio record the first r10bio. The first r10bio can finish
only after the other r10bios finish, and then it returns the discard bio.

Tested-by: Adrian Huang
Signed-off-by: Xiao Ni
Signed-off-by: Song Liu
(backported from commit 254c271da0712ea8914f187588e0f81f7678ee2f)
[mruffell: remove pointer reference for mempool_alloc()]
Signed-off-by: Matthew Ruffell
---
 drivers/md/raid10.c | 79 ++++++++++++++++++++++++++++++++++-----------
 drivers/md/raid10.h |  1 +
 2 files changed, 61 insertions(+), 19 deletions(-)

diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c
index 532fa80579f1..2b574a833c2b 100644
--- a/drivers/md/raid10.c
+++ b/drivers/md/raid10.c
@@ -1560,6 +1560,28 @@ static void __make_request(struct mddev *mddev, struct bio *bio, int sectors)
 	raid10_write_request(mddev, bio, r10_bio);
 }
 
+static void raid_end_discard_bio(struct r10bio *r10bio)
+{
+	struct r10conf *conf = r10bio->mddev->private;
+	struct r10bio *first_r10bio;
+
+	while (atomic_dec_and_test(&r10bio->remaining)) {
+
+		allow_barrier(conf);
+
+		if (!test_bit(R10BIO_Discard, &r10bio->state)) {
+			first_r10bio = (struct r10bio *)r10bio->master_bio;
+			free_r10bio(r10bio);
+			r10bio = first_r10bio;
+		} else {
+			md_write_end(r10bio->mddev);
+			bio_endio(r10bio->master_bio);
+			free_r10bio(r10bio);
+			break;
+		}
+	}
+}
+
 static void raid10_end_discard_request(struct bio *bio)
 {
 	struct r10bio *r10_bio = bio->bi_private;
@@ -1587,11 +1609,7 @@ static void raid10_end_discard_request(struct bio *bio)
 		rdev = conf->mirrors[dev].rdev;
 	}
 
-	if (atomic_dec_and_test(&r10_bio->remaining)) {
-		md_write_end(r10_bio->mddev);
-		raid_end_bio_io(r10_bio);
-	}
-
+	raid_end_discard_bio(r10_bio);
 	rdev_dec_pending(rdev, conf->mddev);
 }
 
@@ -1605,7 +1623,9 @@ static int raid10_handle_discard(struct mddev *mddev, struct bio *bio)
 {
 	struct r10conf *conf = mddev->private;
 	struct geom *geo = &conf->geo;
-	struct r10bio *r10_bio;
+	int far_copies = geo->far_copies;
+	bool first_copy = true;
+	struct r10bio *r10_bio, *first_r10bio;
 	struct bio *split;
 	int disk;
 	sector_t chunk;
@@ -1679,16 +1699,6 @@ static int raid10_handle_discard(struct mddev *mddev, struct bio *bio)
 		wait_barrier(conf);
 	}
 
-	r10_bio = mempool_alloc(conf->r10bio_pool, GFP_NOIO);
-	r10_bio->mddev = mddev;
-	r10_bio->state = 0;
-	r10_bio->sectors = 0;
-	memset(r10_bio->devs, 0, sizeof(r10_bio->devs[0]) * geo->raid_disks);
-
-	wait_blocked_dev(mddev, r10_bio);
-
-	r10_bio->master_bio = bio;
-
 	bio_start = bio->bi_iter.bi_sector;
 	bio_end = bio_end_sector(bio);
 
@@ -1715,6 +1725,29 @@ static int raid10_handle_discard(struct mddev *mddev, struct bio *bio)
 	end_disk_offset = (bio_end & geo->chunk_mask) +
 		(last_stripe_index << geo->chunk_shift);
 
+retry_discard:
+	r10_bio = mempool_alloc(conf->r10bio_pool, GFP_NOIO);
+	r10_bio->mddev = mddev;
+	r10_bio->state = 0;
+	r10_bio->sectors = 0;
+	memset(r10_bio->devs, 0, sizeof(r10_bio->devs[0]) * geo->raid_disks);
+	wait_blocked_dev(mddev, r10_bio);
+
+	/*
+	 * For far layout it needs more than one r10bio to cover all regions.
+	 * Inspired by raid10_sync_request, we can use the first r10bio->master_bio
+	 * to record the discard bio. Other r10bio->master_bio record the first
+	 * r10bio. The first r10bio only releases after all other r10bios finish.
+	 * The discard bio returns only when the first r10bio finishes.
+	 */
+	if (first_copy) {
+		r10_bio->master_bio = bio;
+		set_bit(R10BIO_Discard, &r10_bio->state);
+		first_copy = false;
+		first_r10bio = r10_bio;
+	} else
+		r10_bio->master_bio = (struct bio *)first_r10bio;
+
 	rcu_read_lock();
 	for (disk = 0; disk < geo->raid_disks; disk++) {
 		struct md_rdev *rdev = rcu_dereference(conf->mirrors[disk].rdev);
@@ -1806,11 +1839,19 @@ static int raid10_handle_discard(struct mddev *mddev, struct bio *bio)
 		}
 	}
 
-	if (atomic_dec_and_test(&r10_bio->remaining)) {
-		md_write_end(r10_bio->mddev);
-		raid_end_bio_io(r10_bio);
+	if (!geo->far_offset && --far_copies) {
+		first_stripe_index += geo->stride >> geo->chunk_shift;
+		start_disk_offset += geo->stride;
+		last_stripe_index += geo->stride >> geo->chunk_shift;
+		end_disk_offset += geo->stride;
+		atomic_inc(&first_r10bio->remaining);
+		raid_end_discard_bio(r10_bio);
+		wait_barrier(conf);
+		goto retry_discard;
 	}
 
+	raid_end_discard_bio(r10_bio);
+
 	return 0;
 out:
 	allow_barrier(conf);
diff --git a/drivers/md/raid10.h b/drivers/md/raid10.h
index e2e8840de9bf..f157ef5ce49c 100644
--- a/drivers/md/raid10.h
+++ b/drivers/md/raid10.h
@@ -179,5 +179,6 @@ enum r10bio_state {
 	R10BIO_Previous,
 /* failfast devices did receive failfast requests. */
 	R10BIO_FailFast,
+	R10BIO_Discard,
 };
 #endif