Message ID | 20240510115252.11850-1-luis.henriques@linux.dev |
---|---|
State | Superseded |
Headers | show |
Series | ext4: fix infinite loop when replaying fast_commit | expand |
On 2024/5/10 19:52, Luis Henriques (SUSE) wrote: > When doing fast_commit replay an infinite loop may occur due to an > uninitialized extent_status struct. ext4_ext_determine_insert_hole() does > not detect the replay and calls ext4_es_find_extent_range(), which will > return immediately without initializing the 'es' variable. > > Because 'es' contains garbage, an integer overflow may happen causing an > infinite loop in this function, easily reproducible using fstest generic/039. > > This commit fixes this issue by detecting the replay in function > ext4_ext_determine_insert_hole(). It also adds initialization code to the > error path in function ext4_es_find_extent_range(). > > Thanks to Zhang Yi, for figuring out the real problem! > > Fixes: 8016e29f4362 ("ext4: fast commit recovery path") > Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev> > --- > Hi! > > Two comments: > 1) The change in ext4_ext_map_blocks() could probably use the min_not_zero > macro instead. I decided not to do so simply because I wasn't sure if > that would be safe, but I'm fine changing that if you think it is. > > 2) I thought about returning 'EXT_MAX_BLOCKS' instead of '0' in > ext4_lblk_t ext4_ext_determine_insert_hole(), which would then avoid > the extra change to ext4_ext_map_blocks(). '0' sounds like the right > value to return, but I'm also OK using 'EXT_MAX_BLOCKS' instead. > > And again thanks to Zhang Yi for pointing me the *real* problem! > > fs/ext4/extents.c | 6 +++++- > fs/ext4/extents_status.c | 5 ++++- > 2 files changed, 9 insertions(+), 2 deletions(-) > > diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c > index e57054bdc5fd..b5bfcb6c18a0 100644 > --- a/fs/ext4/extents.c > +++ b/fs/ext4/extents.c > @@ -4052,6 +4052,9 @@ static ext4_lblk_t ext4_ext_determine_insert_hole(struct inode *inode, > ext4_lblk_t hole_start, len; > struct extent_status es; > > + if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY) > + return 0; > + Sorry, I think it's may not correct. When replaying the jouranl, although we don't use the extent statue tree, we still need to query the accurate hole length, e.g. please see skip_hole(). If you do this, the hole length becomes incorrect, right? Thanks, Yi. > hole_start = lblk; > len = ext4_ext_find_hole(inode, path, &hole_start); > again: > @@ -4226,7 +4229,8 @@ int ext4_ext_map_blocks(handle_t *handle, struct inode *inode, > len = ext4_ext_determine_insert_hole(inode, path, map->m_lblk); > > map->m_pblk = 0; > - map->m_len = min_t(unsigned int, map->m_len, len); > + if (len > 0) > + map->m_len = min_t(unsigned int, map->m_len, len); > goto out; > } > > diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c > index 4a00e2f019d9..acb9616ca119 100644 > --- a/fs/ext4/extents_status.c > +++ b/fs/ext4/extents_status.c > @@ -310,8 +310,11 @@ void ext4_es_find_extent_range(struct inode *inode, > ext4_lblk_t lblk, ext4_lblk_t end, > struct extent_status *es) > { > - if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY) > + if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY) { > + /* Initialize extent to zero */ > + es->es_lblk = es->es_len = es->es_pblk = 0; > return; > + } > > trace_ext4_es_find_extent_range_enter(inode, lblk); > >
… > This commit fixes this issue by detecting the replay … Would corresponding imperative wordings be more desirable for such a change description? https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/submitting-patches.rst?h=v6.9-rc7#n94 > Thanks to Zhang Yi, for figuring out the real problem! … Will another tag become relevant here? https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/process/submitting-patches.rst?h=v6.9-rc7#n527 Regards, Markus
On Sat 11 May 2024 02:24:17 PM +08, Zhang Yi wrote; > On 2024/5/10 19:52, Luis Henriques (SUSE) wrote: >> When doing fast_commit replay an infinite loop may occur due to an >> uninitialized extent_status struct. ext4_ext_determine_insert_hole() does >> not detect the replay and calls ext4_es_find_extent_range(), which will >> return immediately without initializing the 'es' variable. >> >> Because 'es' contains garbage, an integer overflow may happen causing an >> infinite loop in this function, easily reproducible using fstest generic/039. >> >> This commit fixes this issue by detecting the replay in function >> ext4_ext_determine_insert_hole(). It also adds initialization code to the >> error path in function ext4_es_find_extent_range(). >> >> Thanks to Zhang Yi, for figuring out the real problem! >> >> Fixes: 8016e29f4362 ("ext4: fast commit recovery path") >> Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev> >> --- >> Hi! >> >> Two comments: >> 1) The change in ext4_ext_map_blocks() could probably use the min_not_zero >> macro instead. I decided not to do so simply because I wasn't sure if >> that would be safe, but I'm fine changing that if you think it is. >> >> 2) I thought about returning 'EXT_MAX_BLOCKS' instead of '0' in >> ext4_lblk_t ext4_ext_determine_insert_hole(), which would then avoid >> the extra change to ext4_ext_map_blocks(). '0' sounds like the right >> value to return, but I'm also OK using 'EXT_MAX_BLOCKS' instead. >> >> And again thanks to Zhang Yi for pointing me the *real* problem! >> >> fs/ext4/extents.c | 6 +++++- >> fs/ext4/extents_status.c | 5 ++++- >> 2 files changed, 9 insertions(+), 2 deletions(-) >> >> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c >> index e57054bdc5fd..b5bfcb6c18a0 100644 >> --- a/fs/ext4/extents.c >> +++ b/fs/ext4/extents.c >> @@ -4052,6 +4052,9 @@ static ext4_lblk_t ext4_ext_determine_insert_hole(struct inode *inode, >> ext4_lblk_t hole_start, len; >> struct extent_status es; >> >> + if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY) >> + return 0; >> + > > Sorry, I think it's may not correct. When replaying the jouranl, although > we don't use the extent statue tree, we still need to query the accurate > hole length, e.g. please see skip_hole(). If you do this, the hole length > becomes incorrect, right? Thank you for your review (and sorry for my delay replying). So, I see three different options to follow your suggestion: 1) Initialize 'es' immediately when declaring it in function ext4_ext_determine_insert_hole(): es.es_lblk = es.es_len = es.es_pblk = 0; 2) Initialize 'es' only in ext4_es_find_extent_range() when checking if an fc replay is in progress (my patch was already doing something like that): if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY) { /* Initialize extent to zero */ es->es_lblk = es->es_len = es->es_pblk = 0; return; } 3) Remove the check for fc replay in function ext4_es_find_extent_range(), which will then unconditionally call __es_find_extent_range(). This will effectively also initialize the 'es' fields to '0' and, because __es_tree_search() will return NULL (at least in generic/039 test!), nothing else will be done. Since all these 3 options seem to have the same result, I believe option 1) is probably the best as it initializes the structure shortly after it's declaration. Would you agree? Or did I misunderstood you? Cheers,
On 2024/5/14 21:04, Luis Henriques wrote: > On Sat 11 May 2024 02:24:17 PM +08, Zhang Yi wrote; > >> On 2024/5/10 19:52, Luis Henriques (SUSE) wrote: >>> When doing fast_commit replay an infinite loop may occur due to an >>> uninitialized extent_status struct. ext4_ext_determine_insert_hole() does >>> not detect the replay and calls ext4_es_find_extent_range(), which will >>> return immediately without initializing the 'es' variable. >>> >>> Because 'es' contains garbage, an integer overflow may happen causing an >>> infinite loop in this function, easily reproducible using fstest generic/039. >>> >>> This commit fixes this issue by detecting the replay in function >>> ext4_ext_determine_insert_hole(). It also adds initialization code to the >>> error path in function ext4_es_find_extent_range(). >>> >>> Thanks to Zhang Yi, for figuring out the real problem! >>> >>> Fixes: 8016e29f4362 ("ext4: fast commit recovery path") >>> Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev> >>> --- >>> Hi! >>> >>> Two comments: >>> 1) The change in ext4_ext_map_blocks() could probably use the min_not_zero >>> macro instead. I decided not to do so simply because I wasn't sure if >>> that would be safe, but I'm fine changing that if you think it is. >>> >>> 2) I thought about returning 'EXT_MAX_BLOCKS' instead of '0' in >>> ext4_lblk_t ext4_ext_determine_insert_hole(), which would then avoid >>> the extra change to ext4_ext_map_blocks(). '0' sounds like the right >>> value to return, but I'm also OK using 'EXT_MAX_BLOCKS' instead. >>> >>> And again thanks to Zhang Yi for pointing me the *real* problem! >>> >>> fs/ext4/extents.c | 6 +++++- >>> fs/ext4/extents_status.c | 5 ++++- >>> 2 files changed, 9 insertions(+), 2 deletions(-) >>> >>> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c >>> index e57054bdc5fd..b5bfcb6c18a0 100644 >>> --- a/fs/ext4/extents.c >>> +++ b/fs/ext4/extents.c >>> @@ -4052,6 +4052,9 @@ static ext4_lblk_t ext4_ext_determine_insert_hole(struct inode *inode, >>> ext4_lblk_t hole_start, len; >>> struct extent_status es; >>> >>> + if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY) >>> + return 0; >>> + >> >> Sorry, I think it's may not correct. When replaying the jouranl, although >> we don't use the extent statue tree, we still need to query the accurate >> hole length, e.g. please see skip_hole(). If you do this, the hole length >> becomes incorrect, right? > > Thank you for your review (and sorry for my delay replying). > > So, I see three different options to follow your suggestion: > > 1) Initialize 'es' immediately when declaring it in function > ext4_ext_determine_insert_hole(): > > es.es_lblk = es.es_len = es.es_pblk = 0; > > 2) Initialize 'es' only in ext4_es_find_extent_range() when checking if an > fc replay is in progress (my patch was already doing something like > that): > > if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY) { > /* Initialize extent to zero */ > es->es_lblk = es->es_len = es->es_pblk = 0; > return; > } > > 3) Remove the check for fc replay in function ext4_es_find_extent_range(), > which will then unconditionally call __es_find_extent_range(). This > will effectively also initialize the 'es' fields to '0' and, because > __es_tree_search() will return NULL (at least in generic/039 test!), > nothing else will be done. > > Since all these 3 options seem to have the same result, I believe option > 1) is probably the best as it initializes the structure shortly after it's > declaration. Would you agree? Or did I misunderstood you? > Both 1 and 2 are looks fine to me, but I would prefer to initialize it unconditionally in ext4_es_find_extent_range(). @@ -310,6 +310,8 @@ void ext4_es_find_extent_range(struct inode *inode, ext4_lblk_t lblk, ext4_lblk_t end, struct extent_status *es) { + es->es_lblk = es->es_len = es->es_pblk = 0; + if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY) return; Thanks, Yi.
On Wed 15 May 2024 12:59:26 PM +08, Zhang Yi wrote; > On 2024/5/14 21:04, Luis Henriques wrote: >> On Sat 11 May 2024 02:24:17 PM +08, Zhang Yi wrote; >> >>> On 2024/5/10 19:52, Luis Henriques (SUSE) wrote: >>>> When doing fast_commit replay an infinite loop may occur due to an >>>> uninitialized extent_status struct. ext4_ext_determine_insert_hole() does >>>> not detect the replay and calls ext4_es_find_extent_range(), which will >>>> return immediately without initializing the 'es' variable. >>>> >>>> Because 'es' contains garbage, an integer overflow may happen causing an >>>> infinite loop in this function, easily reproducible using fstest generic/039. >>>> >>>> This commit fixes this issue by detecting the replay in function >>>> ext4_ext_determine_insert_hole(). It also adds initialization code to the >>>> error path in function ext4_es_find_extent_range(). >>>> >>>> Thanks to Zhang Yi, for figuring out the real problem! >>>> >>>> Fixes: 8016e29f4362 ("ext4: fast commit recovery path") >>>> Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev> >>>> --- >>>> Hi! >>>> >>>> Two comments: >>>> 1) The change in ext4_ext_map_blocks() could probably use the min_not_zero >>>> macro instead. I decided not to do so simply because I wasn't sure if >>>> that would be safe, but I'm fine changing that if you think it is. >>>> >>>> 2) I thought about returning 'EXT_MAX_BLOCKS' instead of '0' in >>>> ext4_lblk_t ext4_ext_determine_insert_hole(), which would then avoid >>>> the extra change to ext4_ext_map_blocks(). '0' sounds like the right >>>> value to return, but I'm also OK using 'EXT_MAX_BLOCKS' instead. >>>> >>>> And again thanks to Zhang Yi for pointing me the *real* problem! >>>> >>>> fs/ext4/extents.c | 6 +++++- >>>> fs/ext4/extents_status.c | 5 ++++- >>>> 2 files changed, 9 insertions(+), 2 deletions(-) >>>> >>>> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c >>>> index e57054bdc5fd..b5bfcb6c18a0 100644 >>>> --- a/fs/ext4/extents.c >>>> +++ b/fs/ext4/extents.c >>>> @@ -4052,6 +4052,9 @@ static ext4_lblk_t ext4_ext_determine_insert_hole(struct inode *inode, >>>> ext4_lblk_t hole_start, len; >>>> struct extent_status es; >>>> >>>> + if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY) >>>> + return 0; >>>> + >>> >>> Sorry, I think it's may not correct. When replaying the jouranl, although >>> we don't use the extent statue tree, we still need to query the accurate >>> hole length, e.g. please see skip_hole(). If you do this, the hole length >>> becomes incorrect, right? >> >> Thank you for your review (and sorry for my delay replying). >> >> So, I see three different options to follow your suggestion: >> >> 1) Initialize 'es' immediately when declaring it in function >> ext4_ext_determine_insert_hole(): >> >> es.es_lblk = es.es_len = es.es_pblk = 0; >> >> 2) Initialize 'es' only in ext4_es_find_extent_range() when checking if an >> fc replay is in progress (my patch was already doing something like >> that): >> >> if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY) { >> /* Initialize extent to zero */ >> es->es_lblk = es->es_len = es->es_pblk = 0; >> return; >> } >> >> 3) Remove the check for fc replay in function ext4_es_find_extent_range(), >> which will then unconditionally call __es_find_extent_range(). This >> will effectively also initialize the 'es' fields to '0' and, because >> __es_tree_search() will return NULL (at least in generic/039 test!), >> nothing else will be done. >> >> Since all these 3 options seem to have the same result, I believe option >> 1) is probably the best as it initializes the structure shortly after it's >> declaration. Would you agree? Or did I misunderstood you? >> > > Both 1 and 2 are looks fine to me, but I would prefer to initialize it > unconditionally in ext4_es_find_extent_range(). > > @@ -310,6 +310,8 @@ void ext4_es_find_extent_range(struct inode *inode, > ext4_lblk_t lblk, ext4_lblk_t end, > struct extent_status *es) > { > + es->es_lblk = es->es_len = es->es_pblk = 0; > + > if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY) > return; Thank you, Yi. I'll send out v2 shortly. Although, to be fair, the real patch author shouldn't be me. :-) Cheers,
On 2024/5/15 16:28, Luis Henriques wrote: > On Wed 15 May 2024 12:59:26 PM +08, Zhang Yi wrote; > >> On 2024/5/14 21:04, Luis Henriques wrote: >>> On Sat 11 May 2024 02:24:17 PM +08, Zhang Yi wrote; >>> >>>> On 2024/5/10 19:52, Luis Henriques (SUSE) wrote: >>>>> When doing fast_commit replay an infinite loop may occur due to an >>>>> uninitialized extent_status struct. ext4_ext_determine_insert_hole() does >>>>> not detect the replay and calls ext4_es_find_extent_range(), which will >>>>> return immediately without initializing the 'es' variable. >>>>> >>>>> Because 'es' contains garbage, an integer overflow may happen causing an >>>>> infinite loop in this function, easily reproducible using fstest generic/039. >>>>> >>>>> This commit fixes this issue by detecting the replay in function >>>>> ext4_ext_determine_insert_hole(). It also adds initialization code to the >>>>> error path in function ext4_es_find_extent_range(). >>>>> >>>>> Thanks to Zhang Yi, for figuring out the real problem! >>>>> >>>>> Fixes: 8016e29f4362 ("ext4: fast commit recovery path") >>>>> Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev> >>>>> --- >>>>> Hi! >>>>> >>>>> Two comments: >>>>> 1) The change in ext4_ext_map_blocks() could probably use the min_not_zero >>>>> macro instead. I decided not to do so simply because I wasn't sure if >>>>> that would be safe, but I'm fine changing that if you think it is. >>>>> >>>>> 2) I thought about returning 'EXT_MAX_BLOCKS' instead of '0' in >>>>> ext4_lblk_t ext4_ext_determine_insert_hole(), which would then avoid >>>>> the extra change to ext4_ext_map_blocks(). '0' sounds like the right >>>>> value to return, but I'm also OK using 'EXT_MAX_BLOCKS' instead. >>>>> >>>>> And again thanks to Zhang Yi for pointing me the *real* problem! >>>>> >>>>> fs/ext4/extents.c | 6 +++++- >>>>> fs/ext4/extents_status.c | 5 ++++- >>>>> 2 files changed, 9 insertions(+), 2 deletions(-) >>>>> >>>>> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c >>>>> index e57054bdc5fd..b5bfcb6c18a0 100644 >>>>> --- a/fs/ext4/extents.c >>>>> +++ b/fs/ext4/extents.c >>>>> @@ -4052,6 +4052,9 @@ static ext4_lblk_t ext4_ext_determine_insert_hole(struct inode *inode, >>>>> ext4_lblk_t hole_start, len; >>>>> struct extent_status es; >>>>> >>>>> + if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY) >>>>> + return 0; >>>>> + >>>> >>>> Sorry, I think it's may not correct. When replaying the jouranl, although >>>> we don't use the extent statue tree, we still need to query the accurate >>>> hole length, e.g. please see skip_hole(). If you do this, the hole length >>>> becomes incorrect, right? >>> >>> Thank you for your review (and sorry for my delay replying). >>> >>> So, I see three different options to follow your suggestion: >>> >>> 1) Initialize 'es' immediately when declaring it in function >>> ext4_ext_determine_insert_hole(): >>> >>> es.es_lblk = es.es_len = es.es_pblk = 0; >>> >>> 2) Initialize 'es' only in ext4_es_find_extent_range() when checking if an >>> fc replay is in progress (my patch was already doing something like >>> that): >>> >>> if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY) { >>> /* Initialize extent to zero */ >>> es->es_lblk = es->es_len = es->es_pblk = 0; >>> return; >>> } >>> >>> 3) Remove the check for fc replay in function ext4_es_find_extent_range(), >>> which will then unconditionally call __es_find_extent_range(). This >>> will effectively also initialize the 'es' fields to '0' and, because >>> __es_tree_search() will return NULL (at least in generic/039 test!), >>> nothing else will be done. >>> >>> Since all these 3 options seem to have the same result, I believe option >>> 1) is probably the best as it initializes the structure shortly after it's >>> declaration. Would you agree? Or did I misunderstood you? >>> >> >> Both 1 and 2 are looks fine to me, but I would prefer to initialize it >> unconditionally in ext4_es_find_extent_range(). >> >> @@ -310,6 +310,8 @@ void ext4_es_find_extent_range(struct inode *inode, >> ext4_lblk_t lblk, ext4_lblk_t end, >> struct extent_status *es) >> { >> + es->es_lblk = es->es_len = es->es_pblk = 0; >> + >> if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY) >> return; > > Thank you, Yi. I'll send out v2 shortly. Although, to be fair, the real > patch author shouldn't be me. :-) > Never mind, I just give a suggestion and also I didn't do a full test on this change. Thanks, Yi.
On Wed 15 May 2024 04:52:54 PM +08, Zhang Yi wrote; > On 2024/5/15 16:28, Luis Henriques wrote: >> On Wed 15 May 2024 12:59:26 PM +08, Zhang Yi wrote; >> >>> On 2024/5/14 21:04, Luis Henriques wrote: >>>> On Sat 11 May 2024 02:24:17 PM +08, Zhang Yi wrote; >>>> >>>>> On 2024/5/10 19:52, Luis Henriques (SUSE) wrote: >>>>>> When doing fast_commit replay an infinite loop may occur due to an >>>>>> uninitialized extent_status struct. ext4_ext_determine_insert_hole() does >>>>>> not detect the replay and calls ext4_es_find_extent_range(), which will >>>>>> return immediately without initializing the 'es' variable. >>>>>> >>>>>> Because 'es' contains garbage, an integer overflow may happen causing an >>>>>> infinite loop in this function, easily reproducible using fstest generic/039. >>>>>> >>>>>> This commit fixes this issue by detecting the replay in function >>>>>> ext4_ext_determine_insert_hole(). It also adds initialization code to the >>>>>> error path in function ext4_es_find_extent_range(). >>>>>> >>>>>> Thanks to Zhang Yi, for figuring out the real problem! >>>>>> >>>>>> Fixes: 8016e29f4362 ("ext4: fast commit recovery path") >>>>>> Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev> >>>>>> --- >>>>>> Hi! >>>>>> >>>>>> Two comments: >>>>>> 1) The change in ext4_ext_map_blocks() could probably use the min_not_zero >>>>>> macro instead. I decided not to do so simply because I wasn't sure if >>>>>> that would be safe, but I'm fine changing that if you think it is. >>>>>> >>>>>> 2) I thought about returning 'EXT_MAX_BLOCKS' instead of '0' in >>>>>> ext4_lblk_t ext4_ext_determine_insert_hole(), which would then avoid >>>>>> the extra change to ext4_ext_map_blocks(). '0' sounds like the right >>>>>> value to return, but I'm also OK using 'EXT_MAX_BLOCKS' instead. >>>>>> >>>>>> And again thanks to Zhang Yi for pointing me the *real* problem! >>>>>> >>>>>> fs/ext4/extents.c | 6 +++++- >>>>>> fs/ext4/extents_status.c | 5 ++++- >>>>>> 2 files changed, 9 insertions(+), 2 deletions(-) >>>>>> >>>>>> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c >>>>>> index e57054bdc5fd..b5bfcb6c18a0 100644 >>>>>> --- a/fs/ext4/extents.c >>>>>> +++ b/fs/ext4/extents.c >>>>>> @@ -4052,6 +4052,9 @@ static ext4_lblk_t ext4_ext_determine_insert_hole(struct inode *inode, >>>>>> ext4_lblk_t hole_start, len; >>>>>> struct extent_status es; >>>>>> >>>>>> + if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY) >>>>>> + return 0; >>>>>> + >>>>> >>>>> Sorry, I think it's may not correct. When replaying the jouranl, although >>>>> we don't use the extent statue tree, we still need to query the accurate >>>>> hole length, e.g. please see skip_hole(). If you do this, the hole length >>>>> becomes incorrect, right? >>>> >>>> Thank you for your review (and sorry for my delay replying). >>>> >>>> So, I see three different options to follow your suggestion: >>>> >>>> 1) Initialize 'es' immediately when declaring it in function >>>> ext4_ext_determine_insert_hole(): >>>> >>>> es.es_lblk = es.es_len = es.es_pblk = 0; >>>> >>>> 2) Initialize 'es' only in ext4_es_find_extent_range() when checking if an >>>> fc replay is in progress (my patch was already doing something like >>>> that): >>>> >>>> if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY) { >>>> /* Initialize extent to zero */ >>>> es->es_lblk = es->es_len = es->es_pblk = 0; >>>> return; >>>> } >>>> >>>> 3) Remove the check for fc replay in function ext4_es_find_extent_range(), >>>> which will then unconditionally call __es_find_extent_range(). This >>>> will effectively also initialize the 'es' fields to '0' and, because >>>> __es_tree_search() will return NULL (at least in generic/039 test!), >>>> nothing else will be done. >>>> >>>> Since all these 3 options seem to have the same result, I believe option >>>> 1) is probably the best as it initializes the structure shortly after it's >>>> declaration. Would you agree? Or did I misunderstood you? >>>> >>> >>> Both 1 and 2 are looks fine to me, but I would prefer to initialize it >>> unconditionally in ext4_es_find_extent_range(). >>> >>> @@ -310,6 +310,8 @@ void ext4_es_find_extent_range(struct inode *inode, >>> ext4_lblk_t lblk, ext4_lblk_t end, >>> struct extent_status *es) >>> { >>> + es->es_lblk = es->es_len = es->es_pblk = 0; >>> + >>> if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY) >>> return; >> >> Thank you, Yi. I'll send out v2 shortly. Although, to be fair, the real >> patch author shouldn't be me. :-) >> > > Never mind, I just give a suggestion and also I didn't do a full test on > this change. Oh, talking about testing, I forgot to mention that I see the same behaviour with generic/311. I.e. this test also enters an infinite loop, but fixed with this patch. Cheers,
On 2024/5/15 17:13, Luis Henriques wrote: > On Wed 15 May 2024 04:52:54 PM +08, Zhang Yi wrote; > >> On 2024/5/15 16:28, Luis Henriques wrote: >>> On Wed 15 May 2024 12:59:26 PM +08, Zhang Yi wrote; >>> >>>> On 2024/5/14 21:04, Luis Henriques wrote: >>>>> On Sat 11 May 2024 02:24:17 PM +08, Zhang Yi wrote; >>>>> >>>>>> On 2024/5/10 19:52, Luis Henriques (SUSE) wrote: >>>>>>> When doing fast_commit replay an infinite loop may occur due to an >>>>>>> uninitialized extent_status struct. ext4_ext_determine_insert_hole() does >>>>>>> not detect the replay and calls ext4_es_find_extent_range(), which will >>>>>>> return immediately without initializing the 'es' variable. >>>>>>> >>>>>>> Because 'es' contains garbage, an integer overflow may happen causing an >>>>>>> infinite loop in this function, easily reproducible using fstest generic/039. >>>>>>> >>>>>>> This commit fixes this issue by detecting the replay in function >>>>>>> ext4_ext_determine_insert_hole(). It also adds initialization code to the >>>>>>> error path in function ext4_es_find_extent_range(). >>>>>>> >>>>>>> Thanks to Zhang Yi, for figuring out the real problem! >>>>>>> >>>>>>> Fixes: 8016e29f4362 ("ext4: fast commit recovery path") >>>>>>> Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev> >>>>>>> --- >>>>>>> Hi! >>>>>>> >>>>>>> Two comments: >>>>>>> 1) The change in ext4_ext_map_blocks() could probably use the min_not_zero >>>>>>> macro instead. I decided not to do so simply because I wasn't sure if >>>>>>> that would be safe, but I'm fine changing that if you think it is. >>>>>>> >>>>>>> 2) I thought about returning 'EXT_MAX_BLOCKS' instead of '0' in >>>>>>> ext4_lblk_t ext4_ext_determine_insert_hole(), which would then avoid >>>>>>> the extra change to ext4_ext_map_blocks(). '0' sounds like the right >>>>>>> value to return, but I'm also OK using 'EXT_MAX_BLOCKS' instead. >>>>>>> >>>>>>> And again thanks to Zhang Yi for pointing me the *real* problem! >>>>>>> >>>>>>> fs/ext4/extents.c | 6 +++++- >>>>>>> fs/ext4/extents_status.c | 5 ++++- >>>>>>> 2 files changed, 9 insertions(+), 2 deletions(-) >>>>>>> >>>>>>> diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c >>>>>>> index e57054bdc5fd..b5bfcb6c18a0 100644 >>>>>>> --- a/fs/ext4/extents.c >>>>>>> +++ b/fs/ext4/extents.c >>>>>>> @@ -4052,6 +4052,9 @@ static ext4_lblk_t ext4_ext_determine_insert_hole(struct inode *inode, >>>>>>> ext4_lblk_t hole_start, len; >>>>>>> struct extent_status es; >>>>>>> >>>>>>> + if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY) >>>>>>> + return 0; >>>>>>> + >>>>>> >>>>>> Sorry, I think it's may not correct. When replaying the jouranl, although >>>>>> we don't use the extent statue tree, we still need to query the accurate >>>>>> hole length, e.g. please see skip_hole(). If you do this, the hole length >>>>>> becomes incorrect, right? >>>>> >>>>> Thank you for your review (and sorry for my delay replying). >>>>> >>>>> So, I see three different options to follow your suggestion: >>>>> >>>>> 1) Initialize 'es' immediately when declaring it in function >>>>> ext4_ext_determine_insert_hole(): >>>>> >>>>> es.es_lblk = es.es_len = es.es_pblk = 0; >>>>> >>>>> 2) Initialize 'es' only in ext4_es_find_extent_range() when checking if an >>>>> fc replay is in progress (my patch was already doing something like >>>>> that): >>>>> >>>>> if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY) { >>>>> /* Initialize extent to zero */ >>>>> es->es_lblk = es->es_len = es->es_pblk = 0; >>>>> return; >>>>> } >>>>> >>>>> 3) Remove the check for fc replay in function ext4_es_find_extent_range(), >>>>> which will then unconditionally call __es_find_extent_range(). This >>>>> will effectively also initialize the 'es' fields to '0' and, because >>>>> __es_tree_search() will return NULL (at least in generic/039 test!), >>>>> nothing else will be done. >>>>> >>>>> Since all these 3 options seem to have the same result, I believe option >>>>> 1) is probably the best as it initializes the structure shortly after it's >>>>> declaration. Would you agree? Or did I misunderstood you? >>>>> >>>> >>>> Both 1 and 2 are looks fine to me, but I would prefer to initialize it >>>> unconditionally in ext4_es_find_extent_range(). >>>> >>>> @@ -310,6 +310,8 @@ void ext4_es_find_extent_range(struct inode *inode, >>>> ext4_lblk_t lblk, ext4_lblk_t end, >>>> struct extent_status *es) >>>> { >>>> + es->es_lblk = es->es_len = es->es_pblk = 0; >>>> + >>>> if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY) >>>> return; >>> >>> Thank you, Yi. I'll send out v2 shortly. Although, to be fair, the real >>> patch author shouldn't be me. :-) >>> >> >> Never mind, I just give a suggestion and also I didn't do a full test on >> this change. > > Oh, talking about testing, I forgot to mention that I see the same > behaviour with generic/311. I.e. this test also enters an infinite loop, > but fixed with this patch. > Yeah, generic/311 also does a lot of mount && journal recovery operations, and there maybe some other fault injection tests could have the same results, it's all right now. :) Thanks, Yi.
diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c index e57054bdc5fd..b5bfcb6c18a0 100644 --- a/fs/ext4/extents.c +++ b/fs/ext4/extents.c @@ -4052,6 +4052,9 @@ static ext4_lblk_t ext4_ext_determine_insert_hole(struct inode *inode, ext4_lblk_t hole_start, len; struct extent_status es; + if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY) + return 0; + hole_start = lblk; len = ext4_ext_find_hole(inode, path, &hole_start); again: @@ -4226,7 +4229,8 @@ int ext4_ext_map_blocks(handle_t *handle, struct inode *inode, len = ext4_ext_determine_insert_hole(inode, path, map->m_lblk); map->m_pblk = 0; - map->m_len = min_t(unsigned int, map->m_len, len); + if (len > 0) + map->m_len = min_t(unsigned int, map->m_len, len); goto out; } diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c index 4a00e2f019d9..acb9616ca119 100644 --- a/fs/ext4/extents_status.c +++ b/fs/ext4/extents_status.c @@ -310,8 +310,11 @@ void ext4_es_find_extent_range(struct inode *inode, ext4_lblk_t lblk, ext4_lblk_t end, struct extent_status *es) { - if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY) + if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY) { + /* Initialize extent to zero */ + es->es_lblk = es->es_len = es->es_pblk = 0; return; + } trace_ext4_es_find_extent_range_enter(inode, lblk);
When doing fast_commit replay an infinite loop may occur due to an uninitialized extent_status struct. ext4_ext_determine_insert_hole() does not detect the replay and calls ext4_es_find_extent_range(), which will return immediately without initializing the 'es' variable. Because 'es' contains garbage, an integer overflow may happen causing an infinite loop in this function, easily reproducible using fstest generic/039. This commit fixes this issue by detecting the replay in function ext4_ext_determine_insert_hole(). It also adds initialization code to the error path in function ext4_es_find_extent_range(). Thanks to Zhang Yi, for figuring out the real problem! Fixes: 8016e29f4362 ("ext4: fast commit recovery path") Signed-off-by: Luis Henriques (SUSE) <luis.henriques@linux.dev> --- Hi! Two comments: 1) The change in ext4_ext_map_blocks() could probably use the min_not_zero macro instead. I decided not to do so simply because I wasn't sure if that would be safe, but I'm fine changing that if you think it is. 2) I thought about returning 'EXT_MAX_BLOCKS' instead of '0' in ext4_lblk_t ext4_ext_determine_insert_hole(), which would then avoid the extra change to ext4_ext_map_blocks(). '0' sounds like the right value to return, but I'm also OK using 'EXT_MAX_BLOCKS' instead. And again thanks to Zhang Yi for pointing me the *real* problem! fs/ext4/extents.c | 6 +++++- fs/ext4/extents_status.c | 5 ++++- 2 files changed, 9 insertions(+), 2 deletions(-)