diff mbox series

[1/2] ubi: wl: Put source PEB into correct list if trying locking LEB failed

Message ID 20240819032622.2241165-2-chengzhihao1@huawei.com
State New
Delegated to: Richard Weinberger
Headers show
Series ubi: Two small bugfixes for wear-leveling | expand

Commit Message

Zhihao Cheng Aug. 19, 2024, 3:26 a.m. UTC
During wear-leveling work, the source PEB will be moved into the scrub list
when the source LEB cannot be locked in ubi_eba_copy_leb(), which is wrong
for a non-scrub type source PEB. The problem could bring extra and
ineffective wear-leveling jobs, which has a more or less negative effect
on the lifetime of the flash. Specifically, the process is divided into 2 steps:
1. wear_leveling_worker // generate false scrub type PEB
     ubi_eba_copy_leb // MOVE_RETRY is returned
       leb_write_trylock // trylock failed
     scrubbing = 1;
     e1 is put into ubi->scrub
2. wear_leveling_worker // schedule false scrub type PEB for wl
     scrubbing = 1
     e1 = rb_entry(rb_first(&ubi->scrub))

The problem can be reproduced easily by running fsstress on a small
UBIFS partition (<64M, simulated by nandsim) for 5~10 minutes
(CONFIG_MTD_UBI_FASTMAP=y, CONFIG_MTD_UBI_WL_THRESHOLD=50). The following
message is shown:
 ubi0: scrubbed PEB 66 (LEB 0:10), data moved to PEB 165

Since a scrub type source PEB has set the variable 'scrubbing' to '1', and
the variable 'scrubbing' is checked before the variable 'keep', the problem
can be fixed by setting the 'keep' variable to 1 directly if the source LEB
cannot be locked.
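The decision order described above can be illustrated with a hedged,
self-contained sketch. This is not the kernel's code: only the two
branches discussed in this thread are modeled, and the function name is
illustrative.

```c
#include <assert.h>
#include <string.h>

/*
 * Hedged sketch of the out_not_moved decision in wear_leveling_worker()
 * (drivers/mtd/ubi/wl.c), simplified to the two branches discussed here.
 * 'scrubbing' is checked before 'keep', so a genuine scrub type source
 * PEB still goes back to ubi->scrub, while keep = 1 routes a non-scrub
 * PEB back to ubi->used instead.
 */
const char *out_not_moved_dest(int scrubbing, int keep)
{
	if (scrubbing)
		return "ubi->scrub";	/* bitflips seen: data must be moved */
	if (keep)
		return "ubi->used";	/* patched MOVE_RETRY path */
	return "other";			/* remaining branches omitted */
}
```

Because 'scrubbing' wins over 'keep', setting keep = 1 in the MOVE_RETRY
path changes the destination only for non-scrub source PEBs.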

Fixes: e801e128b220 ("UBI: fix missing scrub when there is a bit-flip")
CC: stable@vger.kernel.org
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
---
 drivers/mtd/ubi/wl.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

Comments

Ryder Wang Aug. 27, 2024, 3:51 a.m. UTC | #1
Hi Zhihao,

> During wear-leveing work, the source PEB will be moved into scrub list
> when source LEB cannot be locked in ubi_eba_copy_leb(), which is wrong
> for non-scrub type source PEB.

My understanding is that this is just the expected design of UBI wear leveling. When ubi_eba_copy_leb() is called for a non-scrub type source PEB, it also indicates that this non-scrub PEB's erase counter has been over UBI_WL_THRESHOLD, so WL should be performed on it. If ubi_eba_copy_leb() fails with MOVE_RETRY, this PEB is moved to the scrub tree for a further WL operation. This should also be sane, right? Or do you have any statistical data or evidence where the current design behavior has any real bad effect on the lifetime of the flash?
Zhihao Cheng Aug. 27, 2024, 4:54 a.m. UTC | #2
On 2024/8/27 11:51, Ryder Wang wrote:
> Hi Zhihao,
> 
>> During wear-leveing work, the source PEB will be moved into scrub list
>> when source LEB cannot be locked in ubi_eba_copy_leb(), which is wrong
>> for non-scrub type source PEB.
> 
> My understanding is that this is just the expected design of UBI wear leveling. When ubi_eba_copy_leb() is called for a non-scrub type source PEB, it also indicates that this non-scrub PEB's erase counter has been over UBI_WL_THRESHOLD, so WL should be performed on it. If ubi_eba_copy_leb() fails with MOVE_RETRY, this PEB is moved to the scrub tree for a further WL operation. This should also be sane, right?

Now, e1 is the source PEB with the smaller erase counter; it will be
picked as the source PEB again in the next WL pass if it has not become
free yet.

We don't put it into the scrub tree, because a scrub type PEB means that
bitflips have happened during reading and the data is no longer stable in
that PEB, so UBI moving data off a scrub type PEB just takes the
opportunity of the WL process.

IOW, in this case, WL won't fail to keep the erase counters balanced
whether we keep e1 in the used tree or move it into the scrub tree. It's
the same as the MOVE_TARGET_WR_ERR/MOVE_TARGET_RD_ERR case.

> Or you have any statistical data or evidence where the current design behavior has any real bad effect to the life time of flash?

Yes, see patch 0:
Before applying patches (PEB erase counters Histogram):
=========================================================
from              to     count      min      avg      max
---------------------------------------------------------
0        ..        9:        0        0        0        0
10       ..       99:        0        0        0        0
100      ..      999:     1024      646      684      988
1000     ..     9999:        0        0        0        0
10000    ..    99999:        0        0        0        0
100000   ..      inf:        0        0        0        0
---------------------------------------------------------
Total               :     1024      646      684      988

After applying patches (PEB erase counters Histogram):
=========================================================
from              to     count      min      avg      max
---------------------------------------------------------
0        ..        9:        0        0        0        0
10       ..       99:        0        0        0        0
100      ..      999:     1024      629      666      960
1000     ..     9999:        0        0        0        0
10000    ..    99999:        0        0        0        0
100000   ..      inf:        0        0        0        0
---------------------------------------------------------
Total               :     1024      629      666      960

Ryder Wang Aug. 27, 2024, 6:25 a.m. UTC | #3
> Now, e1 is the source PEB with the smaller erase counter; it will be
> picked as the source PEB again in the next WL pass if it has not become
> free yet.
> 
> We don't put it into the scrub tree, because a scrub type PEB means that
> bitflips have happened during reading and the data is no longer stable
> in that PEB, so UBI moving data off a scrub type PEB just takes the
> opportunity of the WL process.
> 
> IOW, in this case, WL won't fail to keep the erase counters balanced
> whether we keep e1 in the used tree or move it into the scrub tree. It's
> the same as the MOVE_TARGET_WR_ERR/MOVE_TARGET_RD_ERR case.

Then what's the positive return and necessity of this patch? It looks like there is little difference before and after applying the patch.
Zhihao Cheng Aug. 27, 2024, 7:15 a.m. UTC | #4
On 2024/8/27 14:25, Ryder Wang wrote:
>> Now, e1 is the source PEB with the smaller erase counter; it will be
>> picked as the source PEB again in the next WL pass if it has not become
>> free yet.
>>
>> We don't put it into the scrub tree, because a scrub type PEB means that
>> bitflips have happened during reading and the data is no longer stable
>> in that PEB, so UBI moving data off a scrub type PEB just takes the
>> opportunity of the WL process.
>>
>> IOW, in this case, WL won't fail to keep the erase counters balanced
>> whether we keep e1 in the used tree or move it into the scrub tree. It's
>> the same as the MOVE_TARGET_WR_ERR/MOVE_TARGET_RD_ERR case.
> 
> Then what's the positive return and necessity of this patch? It looks like there is little difference before and after applying the patch.
> .
> 

The main necessity is to keep the message 'ubi0: scrubbed PEB 66 (LEB
0:10), data moved to PEB 165' from being a false positive; a user could
be confused when getting this kernel message while running tests on
nandsim (a NAND simulator backed by memory, which won't generate bitflips).
As for the testing results (a small decrement of the erase counters) based
on the patches, there is a small difference:

Before the patches:
step 1. In the first WL pass, e1 is put into the scrub tree, e2 is put
into the free tree.
step 2. A second WL pass must be scheduled according to the implementation
of ensure_wear_leveling(), because the scrub tree is not empty.
step 3. In the second WL pass, e1 is still the e1 from step 1, but e2 is
picked from ubi->fm_wl_pool and is not the same e2 as in step 1, which
means that e2->ec could be smaller than e1->ec + UBI_WL_THRESHOLD; there
is no need to do the second WL pass at all.

After the patches:
step 1. In the first WL pass, e1 is put into the used tree, e2 is put
into the free tree.
step 2. A second WL pass may not be scheduled according to the
implementation of ensure_wear_leveling(), because the scrub tree is empty
and the erase counter of the first entry in ubi->fm_wl_pool is not greater
than e1->ec + UBI_WL_THRESHOLD.
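The scheduling difference above can be sketched as a hedged model of the
check described in this thread. Function and parameter names are
illustrative, not the kernel's exact ensure_wear_leveling() code.

```c
#include <assert.h>

#define UBI_WL_THRESHOLD 50	/* matches the reproduction config above */

/*
 * Hedged model of the "schedule another WL pass?" decision described in
 * this thread: a non-empty scrub tree always schedules work; otherwise
 * work is scheduled only when the candidate free PEB's erase counter
 * exceeds the source PEB's by more than the threshold.
 */
int needs_wear_leveling(int scrub_tree_empty, int e1_ec, int e2_ec)
{
	if (!scrub_tree_empty)
		return 1;	/* before the patch: a false scrub entry forces WL */
	return e2_ec > e1_ec + UBI_WL_THRESHOLD;
}
```

Before the patch the false scrub entry hits the first branch and forces a
second pass; after it, the second pass happens only when the erase-counter
gap actually warrants one.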

Patch

diff --git a/drivers/mtd/ubi/wl.c b/drivers/mtd/ubi/wl.c
index a357f3d27f2f..8a26968aba11 100644
--- a/drivers/mtd/ubi/wl.c
+++ b/drivers/mtd/ubi/wl.c
@@ -846,7 +846,14 @@  static int wear_leveling_worker(struct ubi_device *ubi, struct ubi_work *wrk,
 			goto out_not_moved;
 		}
 		if (err == MOVE_RETRY) {
-			scrubbing = 1;
+			/*
+			 * For source PEB:
+			 * 1. The scrubbing is set for scrub type PEB, it will
+			 *    be put back into ubi->scrub list.
+			 * 2. Non-scrub type PEB will be put back into ubi->used
+			 *    list.
+			 */
+			keep = 1;
 			dst_leb_clean = 1;
 			goto out_not_moved;
 		}