Message ID: alpine.DEB.2.00.1205150943470.17750@eristoteles.iwoars.net
State: New, archived
On Tue, 2012-05-15 at 09:44 +0200, Joel Reardon wrote:
> This is the second part of a patch to allow UBI to force the erasure of
> particular logical eraseblock numbers. In this patch, a new function,
> ubi_lnum_purge, is added that allows the caller to synchronously erase all
> unmapped erase blocks whose LEB number corresponds to the parameter. This
> requires a previous patch that stores the LEB number in struct ubi_work.
>
> This was tested by disabling the call to do_work in the UBI thread, which
> results in the work queue remaining unless explicitly called to remove.
> UBIFS was changed to call ubifs_leb_change 50 times for three different
> LEBs. Then the new function was called to clear the queue for the three
> different LEB numbers one at a time. The work queue was dumped each time
> and the selective removal of the particular LEB numbers was observed.
>
> Signed-off-by: Joel Reardon <reardonj@inf.ethz.ch>

No objections in general, and I can merge this as soon as you send the final
version. However, for this version I have several notes.

> +/**
> + * ubi_lnum_purge - synchronously erase unmapped PEBs by LEB number.
> + * @ubi_num: UBI device to erase PEBs
> + * @lnum: the LEB number to erase old unmapped PEBs.
> + *
> + * This function is designed to offer a means to ensure that the contents
> + * of old, unmapped LEBs are securely erased without having to flush the
> + * entire work queue of all erase blocks that need erasure. Simply erasing
> + * the block at the time it is unmapped is insufficient, as the
> + * wear-levelling subsystem may have already moved the contents. This
> + * function walks the list of erase blocks that need erasure, and performs
> + * an immediate and synchronous erasure of any erase block that has held
> + * data for this particular @lnum. This may include eraseblocks that held
> + * older versions of the same @lnum. Returns zero in case of success and a
> + * negative error code in case of failure.
> + */
> +int ubi_lnum_purge(int ubi_num, int lnum)
> +{
> +	int err;
> +	struct ubi_device *ubi;
> +
> +	ubi = ubi_get_device(ubi_num);
> +	if (!ubi)
> +		return -ENODEV;
> +
> +	err = ubi_wl_flush_lnum(ubi, lnum);
> +	ubi_put_device(ubi);
> +	return err;
> +}
> +EXPORT_SYMBOL_GPL(ubi_lnum_purge);

Please, do not introduce a separate exported function for this. Instead, add
an "lnum" argument to "ubi_wl_flush". Preserve the old behavior if lnum is
-1. Document this in the header comment. In your case you also need to call
mtd->sync(), I think.

> +	dbg_wl("flush lnum %d", lnum);
> +	list_for_each_entry_safe(wrk, tmp, &ubi->works, list) {
> +		if (wrk->lnum == lnum) {
> +			down_read(&ubi->work_sem);
> +			spin_lock(&ubi->wl_lock);

But you cannot walk the ubi->works list without holding the spinlock. Anyone
may add/remove elements to/from this list concurrently.

Take the work_sem at the beginning. Release at the very end.

Then you can do something like this:

	int found = 1;

	while (found) {
		found = 0;

		spin_lock(&ubi->wl_lock);
		list_for_each_entry(wrk, &ubi->works, list) {
			if (wrk->lnum == lnum) {
				list_del(&wrk->list);
				ubi->works_count -= 1;
				ubi_assert(ubi->works_count >= 0);
				spin_unlock(&ubi->wl_lock);

				err = wrk->func(ubi, wrk, 0);
				if (err)
					return err;

				spin_lock(&ubi->wl_lock);
				found = 1;
				break;
			}
		}
		spin_unlock(&ubi->wl_lock);
	}
> Take the work_sem at the beginning. Release at the very end.
>
> Then you can do something like this:
>
> 	int found = 1;
>
> 	while (found) {
> 		found = 0;
>
> 		spin_lock(&ubi->wl_lock);
> 		list_for_each_entry(wrk, &ubi->works, list) {
> 			if (wrk->lnum == lnum) {
> 				list_del(&wrk->list);
> 				ubi->works_count -= 1;
> 				ubi_assert(ubi->works_count >= 0);
> 				spin_unlock(&ubi->wl_lock);
>
> 				err = wrk->func(ubi, wrk, 0);
> 				if (err)
> 					return err;
>
> 				spin_lock(&ubi->wl_lock);
> 				found = 1;
> 				break;
> 			}
> 		}
> 		spin_unlock(&ubi->wl_lock);
> 	}

If I use list_for_each_entry_safe(), it protects against deleting as it
iterates. If I take the work_sem first, is it okay to do a simple traversal
instead of one traversal per removed item? Even if another thread adds new
work for the same vol_id/lnum, it's okay, because the caller of this function
only cares about vol_id/lnum erasures that it knows are currently on the
worklist.

Cheers,
Joel Reardon
On Sat, 2012-05-19 at 11:46 +0200, Joel Reardon wrote:
> > Take the work_sem at the beginning. Release at the very end.
> >
> > Then you can do something like this:
> >
> > 	int found = 1;
> >
> > 	while (found) {
> > 		found = 0;
> >
> > 		spin_lock(&ubi->wl_lock);
> > 		list_for_each_entry(wrk, &ubi->works, list) {
> > 			if (wrk->lnum == lnum) {
> > 				list_del(&wrk->list);
> > 				ubi->works_count -= 1;
> > 				ubi_assert(ubi->works_count >= 0);
> > 				spin_unlock(&ubi->wl_lock);
> >
> > 				err = wrk->func(ubi, wrk, 0);
> > 				if (err)
> > 					return err;
> >
> > 				spin_lock(&ubi->wl_lock);
> > 				found = 1;
> > 				break;
> > 			}
> > 		}
> > 		spin_unlock(&ubi->wl_lock);
> > 	}
>
> If I use list_for_each_entry_safe(), it protects against deleting as it
> iterates.

Yes, but do not be confused: it does not protect against concurrent deletion.
I would not use the word "protect" at all - it is just a version of
list_for_each_entry() which we use when we delete list elements inside the
loop. It requires another temporary variable.

> If I take the work_sem first, is it okay to do a simple
> traversal instead of one traversal per removed item?

No, work_sem does not protect the list. It is just a slightly unusual way of
syncing with all the processes which are performing UBI works. Take a closer
look - there is only one single place where we take it for writing:
ubi_wl_flush(). Everywhere else we take it for reading, which means that many
processes may enter the critical section and may concurrently add/remove
elements to/from the list. The list is protected by the spinlock.

> Even if another
> thread adds new work for the same vol_id/lnum, it's okay, because the
> caller of this function only cares about vol_id/lnum erasures that
> it knows are currently on the worklist.

If you walk the list and read pointers, and someone else modifies them, you
may be in trouble. You cannot traverse the list without the spinlock. And
once you have dropped the spinlock, you have to start over, because your
'wrk' pointer may point to a non-existent object: it might have been already
freed. This is why I added two loops.
On Sat, 2012-05-19 at 13:47 +0300, Artem Bityutskiy wrote:
> If you walk the list and read pointers, and someone else modifies them,
> you may be in trouble. You cannot traverse the list without the spinlock.
> And once you have dropped the spinlock, you have to start over, because
> your 'wrk' pointer may point to a non-existent object: it might have been
> already freed. This is why I added two loops.

Actually, we ourselves free it in ->func() :-)

I think it is saner to not even use list_for_each_entry(), but just take the
element off the head, unlock the spinlock, use container_of(), and call
->func(). So we need only one loop, not two.
diff --git a/drivers/mtd/ubi/kapi.c b/drivers/mtd/ubi/kapi.c
index 33ede23..a236522 100644
--- a/drivers/mtd/ubi/kapi.c
+++ b/drivers/mtd/ubi/kapi.c
@@ -704,6 +704,37 @@ int ubi_sync(int ubi_num)
 }
 EXPORT_SYMBOL_GPL(ubi_sync);
 
+/**
+ * ubi_lnum_purge - synchronously erase unmapped PEBs by LEB number.
+ * @ubi_num: UBI device to erase PEBs
+ * @lnum: the LEB number to erase old unmapped PEBs.
+ *
+ * This function is designed to offer a means to ensure that the contents
+ * of old, unmapped LEBs are securely erased without having to flush the
+ * entire work queue of all erase blocks that need erasure. Simply erasing
+ * the block at the time it is unmapped is insufficient, as the
+ * wear-levelling subsystem may have already moved the contents. This
+ * function walks the list of erase blocks that need erasure, and performs
+ * an immediate and synchronous erasure of any erase block that has held
+ * data for this particular @lnum. This may include eraseblocks that held
+ * older versions of the same @lnum. Returns zero in case of success and a
+ * negative error code in case of failure.
+ */
+int ubi_lnum_purge(int ubi_num, int lnum)
+{
+	int err;
+	struct ubi_device *ubi;
+
+	ubi = ubi_get_device(ubi_num);
+	if (!ubi)
+		return -ENODEV;
+
+	err = ubi_wl_flush_lnum(ubi, lnum);
+	ubi_put_device(ubi);
+	return err;
+}
+EXPORT_SYMBOL_GPL(ubi_lnum_purge);
+
 BLOCKING_NOTIFIER_HEAD(ubi_notifiers);
 
 /**
diff --git a/drivers/mtd/ubi/ubi.h b/drivers/mtd/ubi/ubi.h
index b3e5815..f91f965 100644
--- a/drivers/mtd/ubi/ubi.h
+++ b/drivers/mtd/ubi/ubi.h
@@ -535,6 +535,7 @@ int ubi_eba_init_scan(struct ubi_device *ubi, struct ubi_scan_info *si);
 int ubi_wl_get_peb(struct ubi_device *ubi);
 int ubi_wl_put_peb(struct ubi_device *ubi, int pnum, int torture, int lnum);
 int ubi_wl_flush(struct ubi_device *ubi);
+int ubi_wl_flush_lnum(struct ubi_device *ubi, int lnum);
 int ubi_wl_scrub_peb(struct ubi_device *ubi, int pnum);
 int ubi_wl_init_scan(struct ubi_device *ubi, struct ubi_scan_info *si);
 void ubi_wl_close(struct ubi_device *ubi);
diff --git a/drivers/mtd/ubi/wl.c b/drivers/mtd/ubi/wl.c
index 4e51735..56d9836 100644
--- a/drivers/mtd/ubi/wl.c
+++ b/drivers/mtd/ubi/wl.c
@@ -610,7 +610,6 @@ static int schedule_erase(struct ubi_device *ubi, struct ubi_wl_entry *e,
 	wl_wrk->e = e;
 	wl_wrk->torture = torture;
 	wl_wrk->lnum = lnum;
-
 	schedule_ubi_work(ubi, wl_wrk);
 	return 0;
 }
@@ -1237,6 +1236,47 @@ retry:
 }
 
 /**
+ * ubi_wl_flush_lnum - flush all pending works associated with a LEB number
+ * @ubi: UBI device description object
+ * @lnum: the LEB number for which work should be synchronously executed
+ *
+ * This function executes pending works for PEBs which are associated with a
+ * particular LEB number. If one of the pending works fails, for instance
+ * due to the erase block becoming a bad block during erasure, then it
+ * returns the error without continuing, but removes the work from the
+ * queue. This function returns zero if no work on the queue remains for
+ * @lnum and all work in the call executed successfully.
+ */
+int ubi_wl_flush_lnum(struct ubi_device *ubi, int lnum)
+{
+	int err = 0;
+	struct ubi_work *wrk, *tmp;
+
+	/* For each pending work, if it corresponds to the parameter @lnum,
+	 * then execute the work.
+	 */
+	dbg_wl("flush lnum %d", lnum);
+	list_for_each_entry_safe(wrk, tmp, &ubi->works, list) {
+		if (wrk->lnum == lnum) {
+			down_read(&ubi->work_sem);
+			spin_lock(&ubi->wl_lock);
+
+			list_del(&wrk->list);
+			ubi->works_count -= 1;
+			ubi_assert(ubi->works_count >= 0);
+			spin_unlock(&ubi->wl_lock);
+
+			err = wrk->func(ubi, wrk, 0);
+			up_read(&ubi->work_sem);
+			if (err)
+				return err;
+		}
+	}
+
+	return err;
+}
+
+/**
  * ubi_wl_flush - flush all pending works.
  * @ubi: UBI device description object
  *
diff --git a/include/linux/mtd/ubi.h b/include/linux/mtd/ubi.h
index 9838dce..8087e99 100644
--- a/include/linux/mtd/ubi.h
+++ b/include/linux/mtd/ubi.h
@@ -216,6 +216,7 @@ int ubi_leb_unmap(struct ubi_volume_desc *desc, int lnum);
 int ubi_leb_map(struct ubi_volume_desc *desc, int lnum);
 int ubi_is_mapped(struct ubi_volume_desc *desc, int lnum);
 int ubi_sync(int ubi_num);
+int ubi_lnum_purge(int ubi_num, int lnum);
 
 /*
  * This function is the same as the 'ubi_leb_read()' function, but it does not
This is the second part of a patch to allow UBI to force the erasure of
particular logical eraseblock numbers. In this patch, a new function,
ubi_lnum_purge, is added that allows the caller to synchronously erase all
unmapped erase blocks whose LEB number corresponds to the parameter. This
requires a previous patch that stores the LEB number in struct ubi_work.

This was tested by disabling the call to do_work in the UBI thread, which
results in the work queue remaining unless explicitly called to remove.
UBIFS was changed to call ubifs_leb_change 50 times for three different
LEBs. Then the new function was called to clear the queue for the three
different LEB numbers one at a time. The work queue was dumped each time and
the selective removal of the particular LEB numbers was observed.

Signed-off-by: Joel Reardon <reardonj@inf.ethz.ch>
---
 drivers/mtd/ubi/kapi.c  |   31 +++++++++++++++++++++++++++++++
 drivers/mtd/ubi/ubi.h   |    1 +
 drivers/mtd/ubi/wl.c    |   42 +++++++++++++++++++++++++++++++++++++++++-
 include/linux/mtd/ubi.h |    1 +
 4 files changed, 74 insertions(+), 1 deletions(-)