| Message ID | 20240819032622.2241165-2-chengzhihao1@huawei.com |
|---|---|
| State | New |
| Delegated to | Richard Weinberger |
| Series | ubi: Two small bugfixs for wear-leveling |
Hi Zhihao,

> During wear-leveling work, the source PEB will be moved into the scrub
> list when the source LEB cannot be locked in ubi_eba_copy_leb(), which
> is wrong for a non-scrub type source PEB.

My understanding is that this is just the expected design of UBI wear
leveling. When ubi_eba_copy_leb() is called for a non-scrub type source
PEB, it also indicates that this non-scrub PEB's erase counter gap has
gone over UBI_WL_THRESHOLD, so WL should be performed on it. If
ubi_eba_copy_leb() fails with MOVE_RETRY, this PEB is moved to the scrub
tree for a further WL operation. This should also be sane, right?

Or do you have any statistical data or evidence showing that the current
design behavior has any real negative effect on the lifetime of the flash?
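To make the two triggers under discussion concrete, here is a small self-contained toy model in plain C. It is not the kernel's drivers/mtd/ubi/wl.c code; `struct peb` and `should_wear_level()` are names invented for this sketch. It only illustrates that scrubbing is unconditional, while plain wear-leveling happens only when the erase-counter gap exceeds UBI_WL_THRESHOLD.

```c
/*
 * Toy model of the wear-leveling trigger discussed above. This is NOT the
 * kernel's drivers/mtd/ubi/wl.c code; struct peb and should_wear_level()
 * are invented names for illustration only.
 */
#include <stdbool.h>
#include <stdio.h>

#define UBI_WL_THRESHOLD 50	/* matches CONFIG_MTD_UBI_WL_THRESHOLD=50 used in the test */

struct peb {
	int pnum;	/* physical eraseblock number */
	int ec;		/* erase counter */
	bool scrub;	/* data saw bitflips while reading -> must be scrubbed */
};

/* Decide whether data should be moved off 'src' into the free PEB 'dst'. */
static bool should_wear_level(const struct peb *src, const struct peb *dst)
{
	if (src->scrub)
		return true;	/* scrub-type PEB: data is unstable, always move it */
	/* non-scrub PEB: move only when the erase-counter gap is large enough */
	return dst->ec - src->ec > UBI_WL_THRESHOLD;
}

int main(void)
{
	struct peb worn  = { .pnum = 66,  .ec = 100, .scrub = false };
	struct peb fresh = { .pnum = 165, .ec = 160, .scrub = false };

	printf("move PEB %d -> PEB %d? %s\n", worn.pnum, fresh.pnum,
	       should_wear_level(&worn, &fresh) ? "yes" : "no");
	return 0;
}
```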
On 2024/8/27 11:51, Ryder Wang wrote:
> Hi Zhihao,
>
>> During wear-leveling work, the source PEB will be moved into the scrub
>> list when the source LEB cannot be locked in ubi_eba_copy_leb(), which
>> is wrong for a non-scrub type source PEB.
>
> My understanding is that this is just the expected design of UBI wear
> leveling. When ubi_eba_copy_leb() is called for a non-scrub type source
> PEB, it also indicates that this non-scrub PEB's erase counter gap has
> gone over UBI_WL_THRESHOLD, so WL should be performed on it. If
> ubi_eba_copy_leb() fails with MOVE_RETRY, this PEB is moved to the scrub
> tree for a further WL operation. This should also be sane, right?

Now, e1 is the source PEB with the smaller erase counter; it will be
picked as the source PEB again in the next WL if it has not become free
yet.

We don't put it into the scrub tree, because a scrub type PEB means that
bitflips happened during reading and the data in that PEB is not stable
anymore, so UBI moves data off a scrub type PEB and merely takes the
opportunity of the WL process to do so.

IOW, in this case, WL won't fail to keep the erase counters balanced
whether we keep e1 in the used tree or move it into the scrub tree. It's
the same as the MOVE_TARGET_WR_ERR/MOVE_TARGET_RD_ERR case.

> Or do you have any statistical data or evidence showing that the current
> design behavior has any real negative effect on the lifetime of the
> flash?

Yes, see patch 0:

Before applying patches (PEB erase counters Histogram):
=========================================================
from          to        count      min      avg      max
---------------------------------------------------------
0        ..       9:        0        0        0        0
10       ..      99:        0        0        0        0
100      ..     999:     1024      646      684      988
1000     ..    9999:        0        0        0        0
10000    ..   99999:        0        0        0        0
100000   ..     inf:        0        0        0        0
---------------------------------------------------------
Total              :     1024      646      684      988

After applying patches (PEB erase counters Histogram):
=========================================================
from          to        count      min      avg      max
---------------------------------------------------------
0        ..       9:        0        0        0        0
10       ..      99:        0        0        0        0
100      ..     999:     1024      629      666      960
1000     ..    9999:        0        0        0        0
10000    ..   99999:        0        0        0        0
100000   ..     inf:        0        0        0        0
---------------------------------------------------------
Total              :     1024      629      666      960
(0~64)

> ________________________________________
> From: linux-mtd <linux-mtd-bounces@lists.infradead.org> on behalf of Zhihao Cheng <chengzhihao1@huawei.com>
> Sent: Monday, August 19, 2024 11:26
> To: richard@nod.at; miquel.raynal@bootlin.com; daniel@makrotopia.org
> Cc: linux-mtd@lists.infradead.org
> Subject: [PATCH 1/2] ubi: wl: Put source PEB into correct list if trying locking LEB failed
>
> During wear-leveling work, the source PEB will be moved into the scrub
> list when the source LEB cannot be locked in ubi_eba_copy_leb(), which
> is wrong for a non-scrub type source PEB. The problem could bring extra
> and ineffective wear-leveling jobs, which has a more or less negative
> effect on the lifetime of the flash. Specifically, the process is
> divided into 2 steps:
> 1. wear_leveling_worker // generate false scrub type PEB
>      ubi_eba_copy_leb // MOVE_RETRY is returned
>        leb_write_trylock // trylock failed
>      scrubbing = 1;
>      e1 is put into ubi->scrub
> 2. wear_leveling_worker // schedule false scrub type PEB for wl
>      scrubbing = 1
>      e1 = rb_entry(rb_first(&ubi->scrub))
>
> The problem can be reproduced easily by running fsstress on a small
> UBIFS partition (<64M, simulated by nandsim) for 5~10 minutes
> (CONFIG_MTD_UBI_FASTMAP=y, CONFIG_MTD_UBI_WL_THRESHOLD=50).
> The following message is shown:
>   ubi0: scrubbed PEB 66 (LEB 0:10), data moved to PEB 165
>
> Since a scrub type source PEB has already set the variable 'scrubbing'
> to '1', and 'scrubbing' is checked before 'keep', the problem can be
> fixed by setting 'keep' to 1 directly if the source LEB cannot be
> locked.
>
> Fixes: e801e128b220 ("UBI: fix missing scrub when there is a bit-flip")
> CC: stable@vger.kernel.org
> Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
> ---
>  drivers/mtd/ubi/wl.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/mtd/ubi/wl.c b/drivers/mtd/ubi/wl.c
> index a357f3d27f2f..8a26968aba11 100644
> --- a/drivers/mtd/ubi/wl.c
> +++ b/drivers/mtd/ubi/wl.c
> @@ -846,7 +846,14 @@ static int wear_leveling_worker(struct ubi_device *ubi, struct ubi_work *wrk,
>                         goto out_not_moved;
>                 }
>                 if (err == MOVE_RETRY) {
> -                       scrubbing = 1;
> +                       /*
> +                        * For source PEB:
> +                        * 1. The scrubbing is set for scrub type PEB, it will
> +                        *    be put back into ubi->scrub list.
> +                        * 2. Non-scrub type PEB will be put back into ubi->used
> +                        *    list.
> +                        */
> +                       keep = 1;
>                         dst_leb_clean = 1;
>                         goto out_not_moved;
>                 }
> --
> 2.39.2
>
> ______________________________________________________
> Linux MTD discussion mailing list
> http://lists.infradead.org/mailman/listinfo/linux-mtd/
> Now, e1 is the source PEB with the smaller erase counter; it will be
> picked as the source PEB again in the next WL if it has not become free
> yet.
>
> We don't put it into the scrub tree, because a scrub type PEB means that
> bitflips happened during reading and the data in that PEB is not stable
> anymore, so UBI moves data off a scrub type PEB and merely takes the
> opportunity of the WL process to do so.
>
> IOW, in this case, WL won't fail to keep the erase counters balanced
> whether we keep e1 in the used tree or move it into the scrub tree. It's
> the same as the MOVE_TARGET_WR_ERR/MOVE_TARGET_RD_ERR case.

Then what's the positive return and necessity of this patch? It looks
like there is little difference before and after applying the patch.
On 2024/8/27 14:25, Ryder Wang wrote:
>> Now, e1 is the source PEB with the smaller erase counter; it will be
>> picked as the source PEB again in the next WL if it has not become free
>> yet.
>>
>> We don't put it into the scrub tree, because a scrub type PEB means that
>> bitflips happened during reading and the data in that PEB is not stable
>> anymore, so UBI moves data off a scrub type PEB and merely takes the
>> opportunity of the WL process to do so.
>>
>> IOW, in this case, WL won't fail to keep the erase counters balanced
>> whether we keep e1 in the used tree or move it into the scrub tree. It's
>> the same as the MOVE_TARGET_WR_ERR/MOVE_TARGET_RD_ERR case.
>
> Then what's the positive return and necessity of this patch? It looks
> like there is little difference before and after applying the patch.

The main necessity is to make the message 'ubi0: scrubbed PEB 66 (LEB
0:10), data moved to PEB 165' no longer a false positive; a user could be
confused on seeing this kernel message while running tests on nandsim (a
memory-based NAND simulator, which won't generate bitflips).

As for the testing results (a small decrease of the erase counters) based
on the patches, there is a small difference:

Before the patches:
 step 1. In the first WL, e1 is put into the scrub tree and e2 is put
         into the free tree.
 step 2. A second WL must be scheduled according to the implementation of
         ensure_wear_leveling(), because the scrub tree is not empty.
 step 3. In the second WL, e1 is still the e1 from step 1, while e2 is
         picked from ubi->fm_wl_pool and is not the same e2 as in step 1,
         which means that e2->ec could be smaller than e1->ec +
         UBI_WL_THRESHOLD, so there is actually no need to do the second
         WL.

After the patches:
 step 1. In the first WL, e1 is put into the used tree and e2 is put into
         the free tree.
 step 2. A second WL may not be scheduled according to the implementation
         of ensure_wear_leveling(), because the scrub tree is empty and
         the erase counter of the first entry in ubi->fm_wl_pool is not
         greater than e1->ec + UBI_WL_THRESHOLD (see the sketch below).
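A minimal sketch of the scheduling condition described in the steps above, as a standalone toy model rather than the real ensure_wear_leveling(); `schedule_wl_model()` and its parameter names are invented for illustration. Another WL pass is forced whenever the scrub tree is non-empty; otherwise it is scheduled only when the erase-counter gap still exceeds UBI_WL_THRESHOLD, which is why keeping e1 in the used tree avoids the pointless second pass.

```c
/*
 * Toy model of the scheduling decision described above. This is NOT the
 * kernel's ensure_wear_leveling(); the names below are invented for
 * illustration only.
 */
#include <stdbool.h>
#include <stdio.h>

#define UBI_WL_THRESHOLD 50

/* Should another wear-leveling pass be scheduled? */
static bool schedule_wl_model(int scrub_tree_entries,
			      int lowest_used_ec,
			      int candidate_free_ec)
{
	if (scrub_tree_entries > 0)
		return true;	/* scrub work always forces a WL pass */
	/* plain WL: only worth it if the ec gap is above the threshold */
	return candidate_free_ec - lowest_used_ec > UBI_WL_THRESHOLD;
}

int main(void)
{
	/*
	 * Before the patch: e1 went into the scrub tree, so a second pass
	 * is scheduled even though the ec gap (10) is far below the
	 * threshold.
	 */
	printf("before the patch: schedule second WL? %d\n",
	       schedule_wl_model(1, 100, 110));

	/*
	 * After the patch: e1 stays in the used tree, and the small ec gap
	 * does not justify another pass.
	 */
	printf("after the patch:  schedule second WL? %d\n",
	       schedule_wl_model(0, 100, 110));
	return 0;
}
```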
diff --git a/drivers/mtd/ubi/wl.c b/drivers/mtd/ubi/wl.c
index a357f3d27f2f..8a26968aba11 100644
--- a/drivers/mtd/ubi/wl.c
+++ b/drivers/mtd/ubi/wl.c
@@ -846,7 +846,14 @@ static int wear_leveling_worker(struct ubi_device *ubi, struct ubi_work *wrk,
 			goto out_not_moved;
 		}
 		if (err == MOVE_RETRY) {
-			scrubbing = 1;
+			/*
+			 * For source PEB:
+			 * 1. The scrubbing is set for scrub type PEB, it will
+			 *    be put back into ubi->scrub list.
+			 * 2. Non-scrub type PEB will be put back into ubi->used
+			 *    list.
+			 */
+			keep = 1;
 			dst_leb_clean = 1;
 			goto out_not_moved;
 		}
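The fix relies on the order in which the flags are inspected after the move fails: per the commit message, 'scrubbing' is checked before 'keep'. The toy model below (not the kernel's out_not_moved code; `requeue_source_peb()` and the enum are invented names, and the other cases handled in wl.c are omitted) illustrates why setting keep = 1 is enough: a genuine scrub-type source PEB already has scrubbing = 1 and still returns to ubi->scrub, while a plain wear-leveling source PEB goes back to ubi->used.

```c
/*
 * Toy model of the flag dispatch the fix relies on. This is NOT the
 * kernel's out_not_moved code in wl.c; the names below are invented and
 * the other cases handled there are omitted.
 */
#include <stdio.h>

enum dest_tree { DEST_SCRUB, DEST_USED };

/* 'scrubbing' is checked before 'keep', as the commit message notes. */
static enum dest_tree requeue_source_peb(int scrubbing, int keep)
{
	if (scrubbing)
		return DEST_SCRUB;
	if (keep)
		return DEST_USED;
	return DEST_USED;	/* simplified: remaining cases are not modeled */
}

int main(void)
{
	/* Scrub-type source PEB: scrubbing was already 1; keep = 1 on MOVE_RETRY. */
	printf("scrub-type PEB -> %s\n",
	       requeue_source_peb(1, 1) == DEST_SCRUB ? "ubi->scrub" : "ubi->used");

	/* Non-scrub source PEB: scrubbing stays 0; keep = 1 on MOVE_RETRY. */
	printf("non-scrub PEB  -> %s\n",
	       requeue_source_peb(0, 1) == DEST_SCRUB ? "ubi->scrub" : "ubi->used");
	return 0;
}
```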
During wear-leveling work, the source PEB will be moved into the scrub
list when the source LEB cannot be locked in ubi_eba_copy_leb(), which is
wrong for a non-scrub type source PEB. The problem could bring extra and
ineffective wear-leveling jobs, which has a more or less negative effect
on the lifetime of the flash. Specifically, the process is divided into
2 steps:
 1. wear_leveling_worker // generate false scrub type PEB
      ubi_eba_copy_leb // MOVE_RETRY is returned
        leb_write_trylock // trylock failed
      scrubbing = 1;
      e1 is put into ubi->scrub
 2. wear_leveling_worker // schedule false scrub type PEB for wl
      scrubbing = 1
      e1 = rb_entry(rb_first(&ubi->scrub))

The problem can be reproduced easily by running fsstress on a small UBIFS
partition (<64M, simulated by nandsim) for 5~10 minutes
(CONFIG_MTD_UBI_FASTMAP=y, CONFIG_MTD_UBI_WL_THRESHOLD=50). The following
message is shown:
  ubi0: scrubbed PEB 66 (LEB 0:10), data moved to PEB 165

Since a scrub type source PEB has already set the variable 'scrubbing' to
'1', and 'scrubbing' is checked before 'keep', the problem can be fixed
by setting 'keep' to 1 directly if the source LEB cannot be locked.

Fixes: e801e128b220 ("UBI: fix missing scrub when there is a bit-flip")
CC: stable@vger.kernel.org
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
---
 drivers/mtd/ubi/wl.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)