Message ID | 20200430044110.DF25E42045@d06av24.portsmouth.uk.ibm.com |
---|---|
State | New |
Headers | show |
Series | Discussion about rcu_barrier() in ext4 | expand |
Hi Paul, Sorry about this email. While rebasing on my test environment, I guess a fix over the patch mentioned below was absent - which could possibly cause an indefinite loop. Looking into it currently. -ritesh On 4/30/20 10:11 AM, Ritesh Harjani wrote: > Hello Paul, > > I added a rcu_barrier() in one of the ext4 path, to ensure > that once rcu_barrier is completed, all the call_rcu() should have > been completed too. I think I am seeing a below deadlock while running > one of the ext4 fstest. > > Could you please help me understand what exactly is causing the deadlock > here? I was thinking is it because of khungtaskd inside rcu_read_lock() > which would have disabled the preemption and that's why the > rcu_barrier() path is stuck? But why can't khungtaskd proceed? > > Also about when is it safe or not safe to call rcu_barrier()? > > Appreciate your inputs on this one!! > > -ritesh > > <Call Stack> > ============= > > [ 1476.770791] Not tainted 5.6.0-rc4+ #47 > [ 1476.771723] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" > disables this message. > [ 1476.773814] kworker/u32:3 D12136 9612 2 0x80004000 > [ 1476.775299] Workqueue: ext4-rsv-conversion ext4_end_io_rsv_work > [ 1476.776904] Call Trace: > [ 1476.777640] ? __schedule+0x35b/0x8b0 > [ 1476.778687] ? rwsem_down_write_slowpath+0x405/0x650 > [ 1476.780010] schedule+0x55/0xd0 > [ 1476.780763] rwsem_down_write_slowpath+0x40a/0x650 > [ 1476.781867] ? down_write+0x10c/0x110 > [ 1476.782687] down_write+0x10c/0x110 > [ 1476.783471] ext4_map_blocks+0xd4/0x640 > [ 1476.784367] ext4_convert_unwritten_extents+0x10f/0x210 > [ 1476.785493] ext4_convert_unwritten_io_end_vec+0x5f/0xd0 > [ 1476.786710] ext4_end_io_rsv_work+0xe6/0x190 > [ 1476.787659] process_one_work+0x24e/0x5a0 > [ 1476.788568] worker_thread+0x3c/0x390 > [ 1476.789396] ? process_one_work+0x5a0/0x5a0 > [ 1476.790312] kthread+0x120/0x140 > [ 1476.790958] ? kthread_park+0x80/0x80 > [ 1476.791750] ret_from_fork+0x3a/0x50 > [ 1476.792578] > [ 1476.792578] Showing all locks held in the system: > [ 1476.793885] 1 lock held by khungtaskd/92: > [ 1476.794744] #0: ffffffff82a7af20 (rcu_read_lock){....}, at: > debug_show_all_locks+0x15/0x17e > [ 1476.796485] 6 locks held by kworker/u32:2/194: > [ 1476.797424] #0: ffff888a0521d548 ((wq_completion)writeback){+.+.}, > at: process_one_work+0x1ce/0x5a0 > [ 1476.799293] #1: ffffc9000185be68 > ((work_completion)(&(&wb->dwork)->work)){+.+.}, at: > process_one_work+0x1ce/0x5a0 > [ 1476.801386] #2: ffff8889c6f31af0 (&sbi->s_writepages_rwsem){++++}, > at: do_writepages+0x41/0xe0 > [ 1476.803178] #3: ffff8889c46cd838 (jbd2_handle){++++}, at: > start_this_handle+0x1ac/0x680 > [ 1476.804853] #4: ffff8889f2101198 (&ei->i_data_sem){++++}, at: > ext4_map_blocks+0xd4/0x640 > [ 1476.806535] #5: ffffffff82a7ea78 (rcu_state.barrier_mutex){+.+.}, > at: rcu_barrier+0x59/0x760 > [ 1476.808296] 1 lock held by in:imklog/650: > [ 1476.809162] 4 locks held by kworker/u32:3/9612: > [ 1476.810159] #0: ffff88899e7f6548 > ((wq_completion)ext4-rsv-conversion#2){+.+.}, at: > process_one_work+0x1ce/0x5a0 > [ 1476.812227] #1: ffffc9000dd4be68 > ((work_completion)(&ei->i_rsv_conversion_work)){+.+.}, at: > process_one_work+0x1ce/0x5a0 > [ 1476.814432] #2: ffff8889c46cd838 (jbd2_handle){++++}, at: > start_this_handle+0x1ac/0x680 > [ 1476.816085] #3: ffff8889f2101198 (&ei->i_data_sem){++++}, at: > ext4_map_blocks+0xd4/0x640 > [ 1476.817799] 3 locks held by dd/17439: > [ 1476.818590] #0: ffff8889be355490 (sb_writers#3){.+.+}, at: > vfs_write+0x177/0x1d0 > [ 1476.820132] #1: ffff8889f21013d0 > (&sb->s_type->i_mutex_key#10){+.+.}, at: > ext4_buffered_write_iter+0x33/0x120 > [ 1476.822183] #2: ffff8889be3550e8 (&type->s_umount_key#35){++++}, at: > try_to_writeback_inodes_sb+0x1b/0x50 > [ 1476.824142] 2 locks held by debugfs/17530: > [ 1476.824962] #0: ffff8889e3f108a0 (&tty->ldisc_sem){++++}, at: > tty_ldisc_ref_wait+0x24/0x50 > [ 1476.826676] #1: ffffc900004672f0 (&ldata->atomic_read_lock){+.+.}, > at: n_tty_read+0xda/0x9d0 > [ 1476.828455] > [ 1476.828773] ============================================= > [ 1476.828773] > > > Below is the patch which I added which started causing this. > =========================================================== > > --- > Subject: [PATCH 1/1] ext4: Introduce percpu seq counter for freeing > blocks(PA) to avoid ENOSPC err > > There could be a race in function ext4_mb_discard_group_preallocations() > where the 1st thread may iterate through group's bb_prealloc_list and > remove all the PAs and add to function's local list head. > Now if the 2nd thread comes in to discard the group preallocations, > it will see that the group->bb_prealloc_list is empty and will return 0. > > Consider for a case where we have less number of groups (for e.g. just > group 0), > this may even return an -ENOSPC error from ext4_mb_new_blocks() > (where we call for ext4_mb_discard_group_preallocations()). > But that is wrong, since 2nd thread should have waited for 1st thread to > release > all the PAs and should have retried for allocation. Since 1st thread > was anyway going to discard the PAs. > > The algorithm using this percpu seq counter goes below: > 1. We sample the percpu discard_pa_seq counter before trying for block > allocation in ext4_mb_new_blocks(). > 2. We increment this percpu discard_pa_seq counter when we succeed in > allocating blocks and hence while adding the remaining blocks in > group's > bb_prealloc_list (ext4_mb_new_inode_pa/ext4_mb_new_group_pa). > 3. We also increment this percpu seq counter when we successfully identify > that the bb_prealloc_list is not empty and hence proceed for discarding > of those PAs inside ext4_mb_discard_group_preallocations(). > > Now to make sure that the regular fast path of block allocation is not > affected, as a small optimization we only sample the percpu seq counter > on that cpu. Only when the block allocation fails and when freed blocks > found were 0, that is when we sample percpu seq counter for all cpus using > below function ext4_get_discard_pa_seq_sum(). This happens after making > sure that all the PAs on grp->bb_prealloc_list got freed. > > Note: The other method was also to check for grp->bb_free next time > if we couldn't discard anything. But one suspicion with that was that if > grp->bb_free is non-zero for some reason (not sure), then we may result > in that loop indefinitely but still won't be able to satisfy any > request. Second, to check for grp->bb_free all threads were to wait on > grp's spin_lock which might result in increased cpu usage. > So it seems this could be a better & faster approach compared to that. > > Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com> > --- > fs/ext4/mballoc.c | 56 ++++++++++++++++++++++++++++++++++++++++++++--- > 1 file changed, 53 insertions(+), 3 deletions(-) > > diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c > index fe6528716f14..47ff3704d2fc 100644 > --- a/fs/ext4/mballoc.c > +++ b/fs/ext4/mballoc.c > @@ -357,6 +357,34 @@ static void ext4_mb_generate_from_pa(struct > super_block *sb, void *bitmap, > static void ext4_mb_generate_from_freelist(struct super_block *sb, > void *bitmap, > ext4_group_t group); > > +/* > + * The algorithm using this percpu seq counter goes below: > + * 1. We sample the percpu discard_pa_seq counter before trying for block > + * allocation in ext4_mb_new_blocks(). > + * 2. We increment this percpu discard_pa_seq counter when we succeed in > + * allocating blocks and hence while adding the remaining blocks in > group's > + * bb_prealloc_list (ext4_mb_new_inode_pa/ext4_mb_new_group_pa). > + * 3. We also increment this percpu seq counter when we successfully > identify > + * that the bb_prealloc_list is not empty and hence proceed for > discarding > + * of those PAs inside ext4_mb_discard_group_preallocations(). > + * > + * Now to make sure that the regular fast path of block allocation is not > + * affected, as a small optimization we only sample the percpu seq counter > + * on that cpu. Only when the block allocation fails and when freed blocks > + * found were 0, that is when we sample percpu seq counter for all cpus > using > + * below function ext4_get_discard_pa_seq_sum(). This happens after making > + * sure that all the PAs on grp->bb_prealloc_list got freed. > + */ > +DEFINE_PER_CPU(u64, discard_pa_seq); > +static inline u64 ext4_get_discard_pa_seq_sum(void) > +{ > + int __cpu; > + u64 __seq = 0; > + for_each_possible_cpu(__cpu) > + __seq += per_cpu(discard_pa_seq, __cpu); > + return __seq; > +} > + > static inline void *mb_correct_addr_and_bit(int *bit, void *addr) > { > #if BITS_PER_LONG == 64 > @@ -3721,6 +3749,7 @@ ext4_mb_new_inode_pa(struct > ext4_allocation_context *ac) > pa->pa_inode = ac->ac_inode; > > ext4_lock_group(sb, ac->ac_b_ex.fe_group); > + this_cpu_inc(discard_pa_seq); > list_add(&pa->pa_group_list, &grp->bb_prealloc_list); > ext4_unlock_group(sb, ac->ac_b_ex.fe_group); > > @@ -3782,6 +3811,7 @@ ext4_mb_new_group_pa(struct > ext4_allocation_context *ac) > pa->pa_inode = NULL; > > ext4_lock_group(sb, ac->ac_b_ex.fe_group); > + this_cpu_inc(discard_pa_seq); > list_add(&pa->pa_group_list, &grp->bb_prealloc_list); > ext4_unlock_group(sb, ac->ac_b_ex.fe_group); > > @@ -3934,6 +3964,7 @@ ext4_mb_discard_group_preallocations(struct > super_block *sb, > INIT_LIST_HEAD(&list); > repeat: > ext4_lock_group(sb, group); > + this_cpu_inc(discard_pa_seq); > list_for_each_entry_safe(pa, tmp, > &grp->bb_prealloc_list, pa_group_list) { > spin_lock(&pa->pa_lock); > @@ -4481,13 +4512,30 @@ static int ext4_mb_discard_preallocations(struct > super_block *sb, int needed) > } > > static bool ext4_mb_discard_preallocations_should_retry(struct > super_block *sb, > - struct ext4_allocation_context *ac) > + struct ext4_allocation_context *ac, u64 seq) > { > int freed; > + u64 seq_retry; > + > freed = ext4_mb_discard_preallocations(sb, ac->ac_o_ex.fe_len); > if (freed) > - return true; > + goto out_retry; > + /* > + * Unless it is ensured that PAs are actually freed, we may hit > + * a ENOSPC error since the next time seq may match while the PA > blocks > + * are still getting freed in ext4_mb_release_inode/group_pa(). > + * So, rcu_barrier() here is to make sure that any call_rcu queued in > + * ext4_mb_discard_group_preallocations() is completed before we > + * proceed further to retry for block allocation. > + */ > + rcu_barrier(); > + seq_retry = ext4_get_discard_pa_seq_sum(); > + if (seq_retry != seq) > + goto out_retry; > + > return false; > +out_retry: > + return true; > } > > /* > @@ -4504,6 +4552,7 @@ ext4_fsblk_t ext4_mb_new_blocks(handle_t *handle, > ext4_fsblk_t block = 0; > unsigned int inquota = 0; > unsigned int reserv_clstrs = 0; > + u64 seq; > > might_sleep(); > sb = ar->inode->i_sb; > @@ -4565,6 +4614,7 @@ ext4_fsblk_t ext4_mb_new_blocks(handle_t *handle, > } > > ac->ac_op = EXT4_MB_HISTORY_PREALLOC; > + seq = *this_cpu_ptr(&discard_pa_seq); > if (!ext4_mb_use_preallocated(ac)) { > ac->ac_op = EXT4_MB_HISTORY_ALLOC; > ext4_mb_normalize_request(ac, ar); > @@ -4596,7 +4646,7 @@ ext4_fsblk_t ext4_mb_new_blocks(handle_t *handle, > ar->len = ac->ac_b_ex.fe_len; > } > } else { > - if (ext4_mb_discard_preallocations_should_retry(sb, ac)) > + if (ext4_mb_discard_preallocations_should_retry(sb, ac, seq)) > goto repeat; > *errp = -ENOSPC; > }
diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c index fe6528716f14..47ff3704d2fc 100644 --- a/fs/ext4/mballoc.c +++ b/fs/ext4/mballoc.c @@ -357,6 +357,34 @@ static void ext4_mb_generate_from_pa(struct super_block *sb, void *bitmap, static void ext4_mb_generate_from_freelist(struct super_block *sb, void *bitmap, ext4_group_t group); +/* + * The algorithm using this percpu seq counter goes below: + * 1. We sample the percpu discard_pa_seq counter before trying for block + * allocation in ext4_mb_new_blocks(). + * 2. We increment this percpu discard_pa_seq counter when we succeed in + * allocating blocks and hence while adding the remaining blocks in group's + * bb_prealloc_list (ext4_mb_new_inode_pa/ext4_mb_new_group_pa). + * 3. We also increment this percpu seq counter when we successfully identify + * that the bb_prealloc_list is not empty and hence proceed for discarding + * of those PAs inside ext4_mb_discard_group_preallocations(). + * + * Now to make sure that the regular fast path of block allocation is not + * affected, as a small optimization we only sample the percpu seq counter + * on that cpu. Only when the block allocation fails and when freed blocks + * found were 0, that is when we sample percpu seq counter for all cpus using + * below function ext4_get_discard_pa_seq_sum(). This happens after making + * sure that all the PAs on grp->bb_prealloc_list got freed. + */ +DEFINE_PER_CPU(u64, discard_pa_seq); +static inline u64 ext4_get_discard_pa_seq_sum(void) +{ + int __cpu; + u64 __seq = 0; + for_each_possible_cpu(__cpu) + __seq += per_cpu(discard_pa_seq, __cpu);
Hello Paul, I added a rcu_barrier() in one of the ext4 path, to ensure that once rcu_barrier is completed, all the call_rcu() should have been completed too. I think I am seeing a below deadlock while running one of the ext4 fstest. Could you please help me understand what exactly is causing the deadlock here? I was thinking is it because of khungtaskd inside rcu_read_lock() which would have disabled the preemption and that's why the rcu_barrier() path is stuck? But why can't khungtaskd proceed? Also about when is it safe or not safe to call rcu_barrier()? Appreciate your inputs on this one!! -ritesh <Call Stack> ============= [ 1476.770791] Not tainted 5.6.0-rc4+ #47 [ 1476.771723] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 1476.773814] kworker/u32:3 D12136 9612 2 0x80004000 [ 1476.775299] Workqueue: ext4-rsv-conversion ext4_end_io_rsv_work [ 1476.776904] Call Trace: [ 1476.777640] ? __schedule+0x35b/0x8b0 [ 1476.778687] ? rwsem_down_write_slowpath+0x405/0x650 [ 1476.780010] schedule+0x55/0xd0 [ 1476.780763] rwsem_down_write_slowpath+0x40a/0x650 [ 1476.781867] ? down_write+0x10c/0x110 [ 1476.782687] down_write+0x10c/0x110 [ 1476.783471] ext4_map_blocks+0xd4/0x640 [ 1476.784367] ext4_convert_unwritten_extents+0x10f/0x210 [ 1476.785493] ext4_convert_unwritten_io_end_vec+0x5f/0xd0 [ 1476.786710] ext4_end_io_rsv_work+0xe6/0x190 [ 1476.787659] process_one_work+0x24e/0x5a0 [ 1476.788568] worker_thread+0x3c/0x390 [ 1476.789396] ? process_one_work+0x5a0/0x5a0 [ 1476.790312] kthread+0x120/0x140 [ 1476.790958] ? kthread_park+0x80/0x80 [ 1476.791750] ret_from_fork+0x3a/0x50 [ 1476.792578] [ 1476.792578] Showing all locks held in the system: [ 1476.793885] 1 lock held by khungtaskd/92: [ 1476.794744] #0: ffffffff82a7af20 (rcu_read_lock){....}, at: debug_show_all_locks+0x15/0x17e [ 1476.796485] 6 locks held by kworker/u32:2/194: [ 1476.797424] #0: ffff888a0521d548 ((wq_completion)writeback){+.+.}, at: process_one_work+0x1ce/0x5a0 [ 1476.799293] #1: ffffc9000185be68 ((work_completion)(&(&wb->dwork)->work)){+.+.}, at: process_one_work+0x1ce/0x5a0 [ 1476.801386] #2: ffff8889c6f31af0 (&sbi->s_writepages_rwsem){++++}, at: do_writepages+0x41/0xe0 [ 1476.803178] #3: ffff8889c46cd838 (jbd2_handle){++++}, at: start_this_handle+0x1ac/0x680 [ 1476.804853] #4: ffff8889f2101198 (&ei->i_data_sem){++++}, at: ext4_map_blocks+0xd4/0x640 [ 1476.806535] #5: ffffffff82a7ea78 (rcu_state.barrier_mutex){+.+.}, at: rcu_barrier+0x59/0x760 [ 1476.808296] 1 lock held by in:imklog/650: [ 1476.809162] 4 locks held by kworker/u32:3/9612: [ 1476.810159] #0: ffff88899e7f6548 ((wq_completion)ext4-rsv-conversion#2){+.+.}, at: process_one_work+0x1ce/0x5a0 [ 1476.812227] #1: ffffc9000dd4be68 ((work_completion)(&ei->i_rsv_conversion_work)){+.+.}, at: process_one_work+0x1ce/0x5a0 [ 1476.814432] #2: ffff8889c46cd838 (jbd2_handle){++++}, at: start_this_handle+0x1ac/0x680 [ 1476.816085] #3: ffff8889f2101198 (&ei->i_data_sem){++++}, at: ext4_map_blocks+0xd4/0x640 [ 1476.817799] 3 locks held by dd/17439: [ 1476.818590] #0: ffff8889be355490 (sb_writers#3){.+.+}, at: vfs_write+0x177/0x1d0 [ 1476.820132] #1: ffff8889f21013d0 (&sb->s_type->i_mutex_key#10){+.+.}, at: ext4_buffered_write_iter+0x33/0x120 [ 1476.822183] #2: ffff8889be3550e8 (&type->s_umount_key#35){++++}, at: try_to_writeback_inodes_sb+0x1b/0x50 [ 1476.824142] 2 locks held by debugfs/17530: [ 1476.824962] #0: ffff8889e3f108a0 (&tty->ldisc_sem){++++}, at: tty_ldisc_ref_wait+0x24/0x50 [ 1476.826676] #1: ffffc900004672f0 (&ldata->atomic_read_lock){+.+.}, at: n_tty_read+0xda/0x9d0 [ 1476.828455] [ 1476.828773] ============================================= [ 1476.828773] Below is the patch which I added which started causing this. =========================================================== --- Subject: [PATCH 1/1] ext4: Introduce percpu seq counter for freeing blocks(PA) to avoid ENOSPC err There could be a race in function ext4_mb_discard_group_preallocations() where the 1st thread may iterate through group's bb_prealloc_list and remove all the PAs and add to function's local list head. Now if the 2nd thread comes in to discard the group preallocations, it will see that the group->bb_prealloc_list is empty and will return 0. Consider for a case where we have less number of groups (for e.g. just group 0), this may even return an -ENOSPC error from ext4_mb_new_blocks() (where we call for ext4_mb_discard_group_preallocations()). But that is wrong, since 2nd thread should have waited for 1st thread to release all the PAs and should have retried for allocation. Since 1st thread was anyway going to discard the PAs. The algorithm using this percpu seq counter goes below: 1. We sample the percpu discard_pa_seq counter before trying for block allocation in ext4_mb_new_blocks(). 2. We increment this percpu discard_pa_seq counter when we succeed in allocating blocks and hence while adding the remaining blocks in group's bb_prealloc_list (ext4_mb_new_inode_pa/ext4_mb_new_group_pa). 3. We also increment this percpu seq counter when we successfully identify that the bb_prealloc_list is not empty and hence proceed for discarding of those PAs inside ext4_mb_discard_group_preallocations(). Now to make sure that the regular fast path of block allocation is not affected, as a small optimization we only sample the percpu seq counter on that cpu. Only when the block allocation fails and when freed blocks found were 0, that is when we sample percpu seq counter for all cpus using below function ext4_get_discard_pa_seq_sum(). This happens after making sure that all the PAs on grp->bb_prealloc_list got freed. Note: The other method was also to check for grp->bb_free next time if we couldn't discard anything. But one suspicion with that was that if grp->bb_free is non-zero for some reason (not sure), then we may result in that loop indefinitely but still won't be able to satisfy any request. Second, to check for grp->bb_free all threads were to wait on grp's spin_lock which might result in increased cpu usage. So it seems this could be a better & faster approach compared to that. Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com> --- fs/ext4/mballoc.c | 56 ++++++++++++++++++++++++++++++++++++++++++++--- 1 file changed, 53 insertions(+), 3 deletions(-) + return __seq; +} + static inline void *mb_correct_addr_and_bit(int *bit, void *addr) { #if BITS_PER_LONG == 64 @@ -3721,6 +3749,7 @@ ext4_mb_new_inode_pa(struct ext4_allocation_context *ac) pa->pa_inode = ac->ac_inode; ext4_lock_group(sb, ac->ac_b_ex.fe_group); + this_cpu_inc(discard_pa_seq); list_add(&pa->pa_group_list, &grp->bb_prealloc_list); ext4_unlock_group(sb, ac->ac_b_ex.fe_group); @@ -3782,6 +3811,7 @@ ext4_mb_new_group_pa(struct ext4_allocation_context *ac) pa->pa_inode = NULL; ext4_lock_group(sb, ac->ac_b_ex.fe_group); + this_cpu_inc(discard_pa_seq); list_add(&pa->pa_group_list, &grp->bb_prealloc_list); ext4_unlock_group(sb, ac->ac_b_ex.fe_group); @@ -3934,6 +3964,7 @@ ext4_mb_discard_group_preallocations(struct super_block *sb, INIT_LIST_HEAD(&list); repeat: ext4_lock_group(sb, group); + this_cpu_inc(discard_pa_seq); list_for_each_entry_safe(pa, tmp, &grp->bb_prealloc_list, pa_group_list) { spin_lock(&pa->pa_lock); @@ -4481,13 +4512,30 @@ static int ext4_mb_discard_preallocations(struct super_block *sb, int needed) } static bool ext4_mb_discard_preallocations_should_retry(struct super_block *sb, - struct ext4_allocation_context *ac) + struct ext4_allocation_context *ac, u64 seq) { int freed; + u64 seq_retry; + freed = ext4_mb_discard_preallocations(sb, ac->ac_o_ex.fe_len); if (freed) - return true; + goto out_retry; + /* + * Unless it is ensured that PAs are actually freed, we may hit + * a ENOSPC error since the next time seq may match while the PA blocks + * are still getting freed in ext4_mb_release_inode/group_pa(). + * So, rcu_barrier() here is to make sure that any call_rcu queued in + * ext4_mb_discard_group_preallocations() is completed before we + * proceed further to retry for block allocation. + */ + rcu_barrier(); + seq_retry = ext4_get_discard_pa_seq_sum(); + if (seq_retry != seq) + goto out_retry; + return false; +out_retry: + return true; } /* @@ -4504,6 +4552,7 @@ ext4_fsblk_t ext4_mb_new_blocks(handle_t *handle, ext4_fsblk_t block = 0; unsigned int inquota = 0; unsigned int reserv_clstrs = 0; + u64 seq; might_sleep(); sb = ar->inode->i_sb; @@ -4565,6 +4614,7 @@ ext4_fsblk_t ext4_mb_new_blocks(handle_t *handle, } ac->ac_op = EXT4_MB_HISTORY_PREALLOC; + seq = *this_cpu_ptr(&discard_pa_seq); if (!ext4_mb_use_preallocated(ac)) { ac->ac_op = EXT4_MB_HISTORY_ALLOC; ext4_mb_normalize_request(ac, ar); @@ -4596,7 +4646,7 @@ ext4_fsblk_t ext4_mb_new_blocks(handle_t *handle, ar->len = ac->ac_b_ex.fe_len; } } else { - if (ext4_mb_discard_preallocations_should_retry(sb, ac)) + if (ext4_mb_discard_preallocations_should_retry(sb, ac, seq)) goto repeat; *errp = -ENOSPC; }