Message ID | 20230406132834.1669710-3-libaokun1@huawei.com |
---|---|
State | Superseded |
Headers | show |
Series | ext4: fix WARNING in ext4_da_update_reserve_space | expand |
On Thu 06-04-23 21:28:34, Baokun Li wrote: > If extent status tree update fails, we have inconsistency between what is > stored in the extent status tree and what is stored on disk. And that can > cause even data corruption issues in some cases. > > In the extent status tree, we have extents which we can just drop without > issues and extents we must not drop - this depends on the extent's status > - currently ext4_es_is_delayed() extents must stay, others may be dropped. > > For extents that cannot be dropped we use __GFP_NOFAIL to allocate memory. > A helper function is also added to help determine if the current extent can > be dropped, although only ext4_es_is_delayed() extents cannot be dropped > currently. In addition, with the above logic, the undo operation in > __es_remove_extent that may cause inconsistency if the split extent fails > is unnecessary, so we remove it as well. > > Suggested-by: Jan Kara <jack@suse.cz> > Signed-off-by: Baokun Li <libaokun1@huawei.com> > --- > V1->V2: > Add the patch 2 as suggested by Jan Kara. > > fs/ext4/extents_status.c | 36 +++++++++++++++++++++++++++++------- > 1 file changed, 29 insertions(+), 7 deletions(-) > > diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c > index 7bc221038c6c..8eed17f35b11 100644 > --- a/fs/ext4/extents_status.c > +++ b/fs/ext4/extents_status.c > @@ -448,12 +448,29 @@ static void ext4_es_list_del(struct inode *inode) > spin_unlock(&sbi->s_es_lock); > } > > +/* > + * Helper function to help determine if memory allocation for this > + * extent_status is allowed to fail. > + */ > +static inline bool ext4_es_alloc_should_nofail(struct extent_status *es) I'd call this function ext4_es_must_keep() and also use it in es_do_reclaim_extents() instead of ext4_es_is_delayed(). Do this as a preparatory patch please. > @@ -792,9 +809,16 @@ static int __es_insert_extent(struct inode *inode, struct extent_status *newes) > } > > es = ext4_es_alloc_extent(inode, newes->es_lblk, newes->es_len, > - newes->es_pblk); > - if (!es) > - return -ENOMEM; > + newes->es_pblk, 0); I would just call this like: es = ext4_es_alloc_extent(inode, newes->es_lblk, newes->es_len, newes->es_pblk, ext4_es_must_keep(newes)); to save the ifs below. > + if (!es) { > + /* Use GFP_NOFAIL if the allocation cannot fail. */ > + if (ext4_es_alloc_should_nofail(newes)) > + es = ext4_es_alloc_extent(inode, newes->es_lblk, > + newes->es_len, newes->es_pblk, 1); > + else > + return -ENOMEM; > + } > + > rb_link_node(&es->rb_node, parent, p); > rb_insert_color(&es->rb_node, &tree->root); > > @@ -1349,8 +1373,6 @@ static int __es_remove_extent(struct inode *inode, ext4_lblk_t lblk, > ext4_es_status(&orig_es)); > err = __es_insert_extent(inode, &newes); > if (err) { > - es->es_lblk = orig_es.es_lblk; > - es->es_len = orig_es.es_len; > if ((err == -ENOMEM) && > __es_shrink(EXT4_SB(inode->i_sb), > 128, EXT4_I(inode))) Also now __es_remove_extent() cannot fail (it will always remove what it should, maybe more) so please just make it void function (as a separate cleanup patch afterwards). Thanks! Honza
On 2023/4/11 17:19, Jan Kara wrote: > On Thu 06-04-23 21:28:34, Baokun Li wrote: >> If extent status tree update fails, we have inconsistency between what is >> stored in the extent status tree and what is stored on disk. And that can >> cause even data corruption issues in some cases. >> >> In the extent status tree, we have extents which we can just drop without >> issues and extents we must not drop - this depends on the extent's status >> - currently ext4_es_is_delayed() extents must stay, others may be dropped. >> >> For extents that cannot be dropped we use __GFP_NOFAIL to allocate memory. >> A helper function is also added to help determine if the current extent can >> be dropped, although only ext4_es_is_delayed() extents cannot be dropped >> currently. In addition, with the above logic, the undo operation in >> __es_remove_extent that may cause inconsistency if the split extent fails >> is unnecessary, so we remove it as well. >> >> Suggested-by: Jan Kara <jack@suse.cz> >> Signed-off-by: Baokun Li <libaokun1@huawei.com> >> --- >> V1->V2: >> Add the patch 2 as suggested by Jan Kara. >> >> fs/ext4/extents_status.c | 36 +++++++++++++++++++++++++++++------- >> 1 file changed, 29 insertions(+), 7 deletions(-) >> >> diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c >> index 7bc221038c6c..8eed17f35b11 100644 >> --- a/fs/ext4/extents_status.c >> +++ b/fs/ext4/extents_status.c >> @@ -448,12 +448,29 @@ static void ext4_es_list_del(struct inode *inode) >> spin_unlock(&sbi->s_es_lock); >> } >> >> +/* >> + * Helper function to help determine if memory allocation for this >> + * extent_status is allowed to fail. >> + */ >> +static inline bool ext4_es_alloc_should_nofail(struct extent_status *es) > I'd call this function ext4_es_must_keep() and also use it in > es_do_reclaim_extents() instead of ext4_es_is_delayed(). Do this as a > preparatory patch please. Totally agree! ext4_es_must_keep() is short and clear. It does make more sense to replace ext4_es_is_delayed() in es_do_reclaim_extents() with the new helper, I'll try to find out if there are any ext4_es_is_delayed() that need to be replaced as well. >> @@ -792,9 +809,16 @@ static int __es_insert_extent(struct inode *inode, struct extent_status *newes) >> } >> >> es = ext4_es_alloc_extent(inode, newes->es_lblk, newes->es_len, >> - newes->es_pblk); >> - if (!es) >> - return -ENOMEM; >> + newes->es_pblk, 0); > I would just call this like: > > es = ext4_es_alloc_extent(inode, newes->es_lblk, newes->es_len, > newes->es_pblk, ext4_es_must_keep(newes)); > > to save the ifs below. Yes! It does get a little long-winded here. > >> + if (!es) { >> + /* Use GFP_NOFAIL if the allocation cannot fail. */ >> + if (ext4_es_alloc_should_nofail(newes)) >> + es = ext4_es_alloc_extent(inode, newes->es_lblk, >> + newes->es_len, newes->es_pblk, 1); >> + else >> + return -ENOMEM; >> + } >> + >> rb_link_node(&es->rb_node, parent, p); >> rb_insert_color(&es->rb_node, &tree->root); >> >> @@ -1349,8 +1373,6 @@ static int __es_remove_extent(struct inode *inode, ext4_lblk_t lblk, >> ext4_es_status(&orig_es)); >> err = __es_insert_extent(inode, &newes); >> if (err) { >> - es->es_lblk = orig_es.es_lblk; >> - es->es_len = orig_es.es_len; >> if ((err == -ENOMEM) && >> __es_shrink(EXT4_SB(inode->i_sb), >> 128, EXT4_I(inode))) > Also now __es_remove_extent() cannot fail (it will always remove what it > should, maybe more) so please just make it void function (as a separate > cleanup patch afterwards). Thanks! > > Honza Yes! Thank you very much for the review! I will send a patch V3 with the changes suggested by you.
diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c index 7bc221038c6c..8eed17f35b11 100644 --- a/fs/ext4/extents_status.c +++ b/fs/ext4/extents_status.c @@ -448,12 +448,29 @@ static void ext4_es_list_del(struct inode *inode) spin_unlock(&sbi->s_es_lock); } +/* + * Helper function to help determine if memory allocation for this + * extent_status is allowed to fail. + */ +static inline bool ext4_es_alloc_should_nofail(struct extent_status *es) +{ + if (ext4_es_is_delayed(es)) + return true; + + return false; +} + static struct extent_status * ext4_es_alloc_extent(struct inode *inode, ext4_lblk_t lblk, ext4_lblk_t len, - ext4_fsblk_t pblk) + ext4_fsblk_t pblk, int nofail) { struct extent_status *es; - es = kmem_cache_alloc(ext4_es_cachep, GFP_ATOMIC); + gfp_t gfp_flags = GFP_ATOMIC; + + if (nofail) + gfp_flags |= __GFP_NOFAIL; + + es = kmem_cache_alloc(ext4_es_cachep, gfp_flags); if (es == NULL) return NULL; es->es_lblk = lblk; @@ -792,9 +809,16 @@ static int __es_insert_extent(struct inode *inode, struct extent_status *newes) } es = ext4_es_alloc_extent(inode, newes->es_lblk, newes->es_len, - newes->es_pblk); - if (!es) - return -ENOMEM; + newes->es_pblk, 0); + if (!es) { + /* Use GFP_NOFAIL if the allocation cannot fail. */ + if (ext4_es_alloc_should_nofail(newes)) + es = ext4_es_alloc_extent(inode, newes->es_lblk, + newes->es_len, newes->es_pblk, 1); + else + return -ENOMEM; + } + rb_link_node(&es->rb_node, parent, p); rb_insert_color(&es->rb_node, &tree->root); @@ -1349,8 +1373,6 @@ static int __es_remove_extent(struct inode *inode, ext4_lblk_t lblk, ext4_es_status(&orig_es)); err = __es_insert_extent(inode, &newes); if (err) { - es->es_lblk = orig_es.es_lblk; - es->es_len = orig_es.es_len; if ((err == -ENOMEM) && __es_shrink(EXT4_SB(inode->i_sb), 128, EXT4_I(inode)))
If extent status tree update fails, we have inconsistency between what is stored in the extent status tree and what is stored on disk. And that can cause even data corruption issues in some cases. In the extent status tree, we have extents which we can just drop without issues and extents we must not drop - this depends on the extent's status - currently ext4_es_is_delayed() extents must stay, others may be dropped. For extents that cannot be dropped we use __GFP_NOFAIL to allocate memory. A helper function is also added to help determine if the current extent can be dropped, although only ext4_es_is_delayed() extents cannot be dropped currently. In addition, with the above logic, the undo operation in __es_remove_extent that may cause inconsistency if the split extent fails is unnecessary, so we remove it as well. Suggested-by: Jan Kara <jack@suse.cz> Signed-off-by: Baokun Li <libaokun1@huawei.com> --- V1->V2: Add the patch 2 as suggested by Jan Kara. fs/ext4/extents_status.c | 36 +++++++++++++++++++++++++++++------- 1 file changed, 29 insertions(+), 7 deletions(-)