From patchwork Wed Dec 18 14:30:27 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: dann frazier
X-Patchwork-Id: 1212457
From: dann frazier
To: kernel-team@lists.ubuntu.com
Subject: [PATCH 1/5][SRU Disco] md/raid0: avoid RAID0 data corruption due to layout confusion.
Date: Wed, 18 Dec 2019 07:30:27 -0700
Message-Id: <20191218143031.207870-1-dann.frazier@canonical.com>
X-Mailer: git-send-email 2.24.1
In-Reply-To: <20191218142041.206383-1-dann.frazier@canonical.com>
References: <20191218142041.206383-1-dann.frazier@canonical.com>
X-BeenThere: kernel-team@lists.ubuntu.com
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: Kernel team discussions
Errors-To: kernel-team-bounces@lists.ubuntu.com
Sender: "kernel-team"

From: NeilBrown

BugLink: https://bugs.launchpad.net/bugs/1850540

[ Upstream commit c84a1372df929033cb1a0441fb57bd3932f39ac9 ]

If the drives in a RAID0 are not all the same size, the array is
divided into zones.
The first zone covers all drives, to the size of the smallest.
The second zone covers all drives larger than the smallest, up to
the size of the second smallest - etc.

A change in Linux 3.14 unintentionally changed the layout for the
second and subsequent zones.  All the correct data is still stored, but
each chunk may be assigned to a different device than in pre-3.14 kernels.
This can lead to data corruption.

It is not possible to determine what layout to use - it depends on which
kernel the data was written by.
So we add a module parameter to allow the old (1) or new (2) layout to be
specified, and refuse to assemble an affected array if that parameter is
not set.

Fixes: 20d0189b1012 ("block: Introduce new bio_split()")
cc: stable@vger.kernel.org (3.14+)
Acked-by: Guoqing Jiang
Signed-off-by: NeilBrown
Signed-off-by: Song Liu
Signed-off-by: Sasha Levin
Signed-off-by: dann frazier
---
 drivers/md/raid0.c | 33 ++++++++++++++++++++++++++++++++-
 drivers/md/raid0.h | 14 ++++++++++++++
 2 files changed, 46 insertions(+), 1 deletion(-)

diff --git a/drivers/md/raid0.c b/drivers/md/raid0.c
index 7c4d15207886f..ddecd07809261 100644
--- a/drivers/md/raid0.c
+++ b/drivers/md/raid0.c
@@ -26,6 +26,9 @@
 #include "raid0.h"
 #include "raid5.h"
 
+static int default_layout = 0;
+module_param(default_layout, int, 0644);
+
 #define UNSUPPORTED_MDDEV_FLAGS		\
 	((1L << MD_HAS_JOURNAL) |	\
 	 (1L << MD_JOURNAL_CLEAN) |	\
@@ -146,6 +149,19 @@ static int create_strip_zones(struct mddev *mddev, struct r0conf **private_conf)
 	}
 	pr_debug("md/raid0:%s: FINAL %d zones\n",
 		 mdname(mddev), conf->nr_strip_zones);
+
+	if (conf->nr_strip_zones == 1) {
+		conf->layout = RAID0_ORIG_LAYOUT;
+	} else if (default_layout == RAID0_ORIG_LAYOUT ||
+		   default_layout == RAID0_ALT_MULTIZONE_LAYOUT) {
+		conf->layout = default_layout;
+	} else {
+		pr_err("md/raid0:%s: cannot assemble multi-zone RAID0 with default_layout setting\n",
+		       mdname(mddev));
+		pr_err("md/raid0: please set raid.default_layout to 1 or 2\n");
+		err = -ENOTSUPP;
+		goto abort;
+	}
 	/*
 	 * now since we have the hard sector sizes, we can make sure
 	 * chunk size is a multiple of that sector size
@@ -555,10 +571,12 @@ static void raid0_handle_discard(struct mddev *mddev, struct bio *bio)
 
 static bool raid0_make_request(struct mddev *mddev, struct bio *bio)
 {
+	struct r0conf *conf = mddev->private;
 	struct strip_zone *zone;
 	struct md_rdev *tmp_dev;
 	sector_t bio_sector;
 	sector_t sector;
+	sector_t orig_sector;
 	unsigned chunk_sects;
 	unsigned sectors;
 
@@ -592,9 +610,22 @@ static bool raid0_make_request(struct mddev *mddev, struct bio *bio)
 		bio = split;
 	}
 
+	orig_sector = sector;
 	zone = find_zone(mddev->private, &sector);
-	tmp_dev = map_sector(mddev, zone, sector, &sector);
+	switch (conf->layout) {
+	case RAID0_ORIG_LAYOUT:
+		tmp_dev = map_sector(mddev, zone, orig_sector, &sector);
+		break;
+	case RAID0_ALT_MULTIZONE_LAYOUT:
+		tmp_dev = map_sector(mddev, zone, sector, &sector);
+		break;
+	default:
+		WARN("md/raid0:%s: Invalid layout\n", mdname(mddev));
+		bio_io_error(bio);
+		return true;
+	}
+
 
 	if (unlikely(is_mddev_broken(tmp_dev, "raid0"))) {
 		bio_io_error(bio);
 		return true;
diff --git a/drivers/md/raid0.h b/drivers/md/raid0.h
index 540e65d92642d..3816e5477db1e 100644
--- a/drivers/md/raid0.h
+++ b/drivers/md/raid0.h
@@ -8,11 +8,25 @@ struct strip_zone {
 	int	 nb_dev;	/* # of devices attached to the zone */
 };
 
+/* Linux 3.14 (20d0189b101) made an unintended change to
+ * the RAID0 layout for multi-zone arrays (where devices aren't all
+ * the same size.
+ * RAID0_ORIG_LAYOUT restores the original layout
+ * RAID0_ALT_MULTIZONE_LAYOUT uses the altered layout
+ * The layouts are identical when there is only one zone (all
+ * devices the same size).
+ */
+
+enum r0layout {
+	RAID0_ORIG_LAYOUT = 1,
+	RAID0_ALT_MULTIZONE_LAYOUT = 2,
+};
 struct r0conf {
 	struct strip_zone	*strip_zone;
 	struct md_rdev		**devlist; /* lists of rdevs, pointed to
 					    * by strip_zone->dev */
 	int			nr_strip_zones;
+	enum r0layout		layout;
 };
 
 #endif
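
The switch added to raid0_make_request() differs only in which sector is
handed to map_sector(): the array-absolute one (RAID0_ORIG_LAYOUT) or the
zone-relative one produced by find_zone() (RAID0_ALT_MULTIZONE_LAYOUT).
The user-space sketch below illustrates why that can change the device a
chunk lands on. It is not kernel code. It assumes that map_sector() picks
the device by taking the chunk number modulo the number of devices in the
zone, and that zones start on a chunk boundary; neither is shown in this
diff. The names zone_dev_index() and enum layout_sketch are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mirror of the enum values introduced by the patch. */
enum layout_sketch {
	ORIG_LAYOUT_SKETCH = 1,
	ALT_MULTIZONE_LAYOUT_SKETCH = 2,
};

/*
 * Return the index, within the zone's own device list, of the device
 * that holds array chunk `chunk`.
 *
 * zone_start: first array chunk belonging to the zone
 * nb_dev:     number of devices striped in the zone
 *
 * ASSUMPTION: device selection is "chunk number modulo nb_dev".
 */
unsigned int zone_dev_index(enum layout_sketch layout, uint64_t chunk,
			    uint64_t zone_start, unsigned int nb_dev)
{
	assert(nb_dev > 0);
	assert(chunk >= zone_start);

	if (layout == ORIG_LAYOUT_SKETCH)
		/* original layout: use the array-absolute chunk number */
		return (unsigned int)(chunk % nb_dev);

	/* altered layout: use the chunk number relative to the zone start */
	return (unsigned int)((chunk - zone_start) % nb_dev);
}
```

As an illustration, take three devices of 3, 5 and 5 chunks. The first zone
holds array chunks 0-8 across all three devices, so the second zone starts
at chunk 9 and stripes over two devices. For chunk 9 the sketch returns
index 1 under the original layout and index 0 under the altered one, so the
two layouts disagree. With a single zone (zone_start of 0) both branches
give the same answer, which is consistent with the patch defaulting
single-zone arrays to RAID0_ORIG_LAYOUT.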