Message ID | 20230304014141.2099204-1-wangzhaolong1@huawei.com
---|---
State | Accepted
Series | ubi: Fix deadlock caused by recursively holding work_sem
> During the processing of the bgt, if sync_erase() returns -EBUSY
> or some other error code in __erase_worker(), schedule_erase() is
> called again, so down_read(ubi->work_sem) is taken twice by the same
> task and may be blocked by down_write(ubi->work_sem) in
> ubi_update_fastmap(), which causes a deadlock.
>
>           ubi bgt                        other task
> do_work
>  down_read(&ubi->work_sem)          ubi_update_fastmap
>   erase_worker                        # Blocked by down_read
>    __erase_worker                    down_write(&ubi->work_sem)
>     schedule_erase
>      schedule_ubi_work
>       down_read(&ubi->work_sem)
>
> Fix this by changing the input parameter @nested of schedule_erase()
> to 'true' to avoid recursively acquiring down_read(&ubi->work_sem).
>
> Also, fix the incorrect comment about the @nested parameter of
> schedule_erase(): when down_write(ubi->work_sem) is held, @nested also
> needs to be true.
>
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=217093
> Fixes: 2e8f08deabbc ("ubi: Fix races around ubi_refill_pools()")
> Signed-off-by: ZhaoLong Wang <wangzhaolong1@huawei.com>
> ---
>  drivers/mtd/ubi/wl.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>

> diff --git a/drivers/mtd/ubi/wl.c b/drivers/mtd/ubi/wl.c
> index 40f39e5d6dfc..26a214f016c1 100644
> --- a/drivers/mtd/ubi/wl.c
> +++ b/drivers/mtd/ubi/wl.c
> @@ -575,7 +575,7 @@ static int erase_worker(struct ubi_device *ubi, struct ubi_work *wl_wrk,
>   * @vol_id: the volume ID that last used this PEB
>   * @lnum: the last used logical eraseblock number for the PEB
>   * @torture: if the physical eraseblock has to be tortured
> - * @nested: denotes whether the work_sem is already held in read mode
> + * @nested: denotes whether the work_sem is already held
>   *
>   * This function returns zero in case of success and a %-ENOMEM in case of
>   * failure.
> @@ -1131,7 +1131,7 @@ static int __erase_worker(struct ubi_device *ubi, struct ubi_work *wl_wrk)
>  			int err1;
>
>  			/* Re-schedule the LEB for erasure */
> -			err1 = schedule_erase(ubi, e, vol_id, lnum, 0, false);
> +			err1 = schedule_erase(ubi, e, vol_id, lnum, 0, true);
>  			if (err1) {
>  				spin_lock(&ubi->wl_lock);
>  				wl_entry_destroy(ubi, e);
----- Original Message -----
> From: "chengzhihao1" <chengzhihao1@huawei.com>
>> During the processing of the bgt, if sync_erase() returns -EBUSY
>> or some other error code in __erase_worker(), schedule_erase() is
>> called again, so down_read(ubi->work_sem) is taken twice by the same
>> task and may be blocked by down_write(ubi->work_sem) in
>> ubi_update_fastmap(), which causes a deadlock.
>>
>> [...]
>>
>> Signed-off-by: ZhaoLong Wang <wangzhaolong1@huawei.com>
>> ---
>>  drivers/mtd/ubi/wl.c | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> Reviewed-by: Zhihao Cheng <chengzhihao1@huawei.com>

Applied to -next.
Thanks everyone!

Thanks,
//richard
diff --git a/drivers/mtd/ubi/wl.c b/drivers/mtd/ubi/wl.c
index 40f39e5d6dfc..26a214f016c1 100644
--- a/drivers/mtd/ubi/wl.c
+++ b/drivers/mtd/ubi/wl.c
@@ -575,7 +575,7 @@ static int erase_worker(struct ubi_device *ubi, struct ubi_work *wl_wrk,
  * @vol_id: the volume ID that last used this PEB
  * @lnum: the last used logical eraseblock number for the PEB
  * @torture: if the physical eraseblock has to be tortured
- * @nested: denotes whether the work_sem is already held in read mode
+ * @nested: denotes whether the work_sem is already held
  *
  * This function returns zero in case of success and a %-ENOMEM in case of
  * failure.
@@ -1131,7 +1131,7 @@ static int __erase_worker(struct ubi_device *ubi, struct ubi_work *wl_wrk)
 			int err1;
 
 			/* Re-schedule the LEB for erasure */
-			err1 = schedule_erase(ubi, e, vol_id, lnum, 0, false);
+			err1 = schedule_erase(ubi, e, vol_id, lnum, 0, true);
 			if (err1) {
 				spin_lock(&ubi->wl_lock);
 				wl_entry_destroy(ubi, e);
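For context on what the changed flag controls: @nested selects between the
plain queueing path, which takes work_sem for read itself, and the
already-locked path. The sketch below is simplified from the mainline
drivers/mtd/ubi/wl.c code around this change (allocation, field setup and
error handling elided); it is illustration, not part of the patch:

```c
/* Simplified sketch of the two queueing paths in drivers/mtd/ubi/wl.c;
 * allocation, field setup and error handling are elided. */
static void schedule_ubi_work(struct ubi_device *ubi, struct ubi_work *wrk)
{
	down_read(&ubi->work_sem);	/* a second hold if the caller is a worker */
	__schedule_ubi_work(ubi, wrk);	/* queue under wl_lock, wake the bgt */
	up_read(&ubi->work_sem);
}

static int schedule_erase(struct ubi_device *ubi, struct ubi_wl_entry *e,
			  int vol_id, int lnum, int torture, bool nested)
{
	struct ubi_work *wl_wrk;

	/* ... allocate and fill in wl_wrk ... */

	if (nested)
		__schedule_ubi_work(ubi, wl_wrk); /* work_sem already held */
	else
		schedule_ubi_work(ubi, wl_wrk);	  /* takes work_sem for read */
	return 0;
}
```

Since __erase_worker() runs from do_work(), which already holds work_sem for
read, passing nested=true routes the re-scheduled erase through
__schedule_ubi_work() and avoids the second down_read().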
During the processing of the bgt, if sync_erase() returns -EBUSY or
some other error code in __erase_worker(), schedule_erase() is called
again, so down_read(ubi->work_sem) is taken twice by the same task and
may be blocked by down_write(ubi->work_sem) in ubi_update_fastmap(),
which causes a deadlock.

          ubi bgt                        other task
do_work
 down_read(&ubi->work_sem)          ubi_update_fastmap
  erase_worker                        # Blocked by down_read
   __erase_worker                    down_write(&ubi->work_sem)
    schedule_erase
     schedule_ubi_work
      down_read(&ubi->work_sem)

Fix this by changing the input parameter @nested of schedule_erase() to
'true' to avoid recursively acquiring down_read(&ubi->work_sem).

Also, fix the incorrect comment about the @nested parameter of
schedule_erase(): when down_write(ubi->work_sem) is held, @nested also
needs to be true.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=217093
Fixes: 2e8f08deabbc ("ubi: Fix races around ubi_refill_pools()")
Signed-off-by: ZhaoLong Wang <wangzhaolong1@huawei.com>
---
 drivers/mtd/ubi/wl.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
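The hang depends on rwsem fairness: once a writer is queued, new readers
block behind it, even in a task that already holds the lock for read. A
minimal, self-contained userspace model of the same ordering, using a glibc
writer-preferring rwlock in place of ubi->work_sem (all names here are
demo-only, not UBI code):

```c
/* Userspace model of the work_sem deadlock: a reader that already holds
 * the lock re-acquires it while a writer is queued. Build: gcc -pthread.
 * The rwlock stands in for ubi->work_sem; names are hypothetical. */
#define _GNU_SOURCE
#include <errno.h>
#include <pthread.h>
#include <stdio.h>
#include <time.h>
#include <unistd.h>

static pthread_rwlock_t work_sem;

static void *fastmap_update(void *arg)	/* models ubi_update_fastmap() */
{
	pthread_rwlock_wrlock(&work_sem);	/* queues behind the reader */
	pthread_rwlock_unlock(&work_sem);
	return NULL;
}

int main(void)
{
	pthread_rwlockattr_t attr;
	struct timespec to;
	pthread_t writer;
	int ret;

	pthread_rwlockattr_init(&attr);
	/* Like a kernel rwsem: a queued writer blocks new readers. */
	pthread_rwlockattr_setkind_np(&attr,
			PTHREAD_RWLOCK_PREFER_WRITER_NONRECURSIVE_NP);
	pthread_rwlock_init(&work_sem, &attr);

	pthread_rwlock_rdlock(&work_sem);	/* do_work(): first down_read() */
	pthread_create(&writer, NULL, fastmap_update, NULL);
	sleep(1);				/* let the writer start waiting */

	clock_gettime(CLOCK_REALTIME, &to);
	to.tv_sec += 2;
	/* schedule_ubi_work(): second down_read() from the same task. It
	 * queues behind the writer, which in turn waits for our first hold
	 * to be dropped -- the deadlock, surfaced here as a timeout. */
	ret = pthread_rwlock_timedrdlock(&work_sem, &to);
	if (ret == ETIMEDOUT)
		printf("nested read lock blocked behind writer: deadlock\n");
	else if (ret == 0)
		pthread_rwlock_unlock(&work_sem); /* nested hold succeeded */

	pthread_rwlock_unlock(&work_sem);	/* release the first hold */
	pthread_join(writer, NULL);
	return 0;
}
```

With the patch applied, the bgt never attempts the second acquisition at
all, which is the right fix: rwsems intentionally have no concept of
recursive read ownership.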