Message ID | 20210629143603.2166962-2-yebin10@huawei.com |
---|---|
State | Superseded |
Headers | show |
Series | Fix use-after-free about sbi->s_mmp_tsk | expand |
On Tue 29-06-21 22:36:02, Ye Bin wrote: > After merge 618f003199c6("ext4: fix memory leak in ext4_fill_super") commit, > we add delay in ext4_remount after "sb->s_flags |= SB_RDONLY", then > remount filesystem with read-only kasan report following warning: > [ 888.695326] ================================================================== > [ 888.696566] BUG: KASAN: use-after-free in kthread_stop+0x4c/0x2f0 > [ 888.697599] Write of size 4 at addr ffff8883849e0020 by task mount/2013 > [ 888.698707] > [ 888.698982] CPU: 4 PID: 2013 Comm: mount Not tainted 4.19.95-00013-ga369a6189bbf-dirty #413 > [ 888.700376] Hardware name: QEMU Standard PC > [ 888.702587] Call Trace: > [ 888.703017] dump_stack+0x108/0x15f > [ 888.703615] print_address_description+0xa5/0x372 > [ 888.704420] kasan_report.cold+0x236/0x2a8 > [ 888.705761] check_memory_region+0x240/0x270 > [ 888.706486] kasan_check_write+0x20/0x30 > [ 888.707156] kthread_stop+0x4c/0x2f0 > [ 888.707776] ext4_stop_mmpd+0x32/0x90 > [ 888.708262] ext4_remount.cold+0xf6/0x116 > [ 888.712671] do_remount_sb+0xff/0x460 > [ 888.714007] do_mount+0xce3/0x1be0 > [ 888.717749] ksys_mount+0xb2/0x150 > [ 888.718163] __x64_sys_mount+0x6a/0x80 > [ 888.718607] do_syscall_64+0xd9/0x1f0 > [ 888.719047] entry_SYSCALL_64_after_hwframe+0x44/0xa9 > > As kmmpd will exit if filesystem is read-only. Then sbi->s_mmp_tsk become wild > ptr, lead to use-after-free. As kmmpd will exit by others(call ktread_stop) > or by itself. After 618f003199c6 commit we can trigger this issue very easy. > Before this commit also exist this issue. > If kmmpd exit by itself, after merge 618f003199c6 commit there will trigger UAF > when umount filesystem. > To fix this issue, introduce sbi->s_mmp_lock to protect sbi->s_mmp_tsk. If kmmpd > exit by itself, we set sbi->s_mmp_tsk with NULL, and release mmp buffer_head. > > Fixes: 618f003199c6 ("ext4: fix memory leak in ext4_fill_super") > Signed-off-by: Ye Bin <yebin10@huawei.com> > --- > fs/ext4/ext4.h | 1 + > fs/ext4/mmp.c | 24 ++++++++++++++++++++++-- > fs/ext4/super.c | 1 + > 3 files changed, 24 insertions(+), 2 deletions(-) > > diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h > index 8d3446746718..a479da37fed4 100644 > --- a/fs/ext4/ext4.h > +++ b/fs/ext4/ext4.h > @@ -1489,6 +1489,7 @@ struct ext4_sb_info { > struct completion s_kobj_unregister; > struct super_block *s_sb; > struct buffer_head *s_mmp_bh; > + struct mutex s_mmp_lock; > > /* Journaling */ > struct journal_s *s_journal; > diff --git a/fs/ext4/mmp.c b/fs/ext4/mmp.c > index 6cb598b549ca..fc18a8c205c7 100644 > --- a/fs/ext4/mmp.c > +++ b/fs/ext4/mmp.c > @@ -128,8 +128,9 @@ void __dump_mmp_msg(struct super_block *sb, struct mmp_struct *mmp, > static int kmmpd(void *data) > { > struct super_block *sb = (struct super_block *) data; > - struct ext4_super_block *es = EXT4_SB(sb)->s_es; > - struct buffer_head *bh = EXT4_SB(sb)->s_mmp_bh; > + struct ext4_sb_info *sbi = EXT4_SB(sb); > + struct ext4_super_block *es = sbi->s_es; > + struct buffer_head *bh = sbi->s_mmp_bh; > struct mmp_struct *mmp; > ext4_fsblk_t mmp_block; > u32 seq = 0; > @@ -245,16 +246,35 @@ static int kmmpd(void *data) > retval = write_mmp_block(sb, bh); > > exit_thread: > + /* > + * Maybe s_mmp_tsk kthread is stoped by others or by itself. If exit > + * by itself then sbi->s_mmp_tsk will be wild ptr, so there is need > + * set sbi->s_mmp_tsk with NULL, and also release mmp buffer_head. > + */ > + while (!kthread_should_stop()) { > + if (!mutex_trylock(&sbi->s_mmp_lock)) > + continue; > + > + if (sbi->s_mmp_tsk) { > + sbi->s_mmp_tsk = NULL; > + brelse(bh); > + } > + mutex_unlock(&sbi->s_mmp_lock); > + break; > + } > + > return retval; > } Honestly, this is just ugly. I think it would be saner if ext4_stop_mmpd() did: void ext4_stop_mmpd(struct ext4_sb_info *sbi) { struct task_struct *tsk; mutex_lock(&sbi->s_mmp_lock); tsk = sbi->s_mmp_tsk; if (tsk) { sbi->s_mmp_tsk = NULL; get_task_struct(tsk); } mutex_unlock(&sbi->s_mmp_lock); if (tsk) { kthread_stop(tsk); put_task_struct(k); } } And you leave freeing of 'bh' to exit path from kmmpd() which can also use normal locking scheme for zeroing s_mmp_tsk now. That being said for this scheme spinlock is enough, you don't need a mutex for s_mmp_lock. Honza > void ext4_stop_mmpd(struct ext4_sb_info *sbi) > { > + mutex_lock(&sbi->s_mmp_lock); > if (sbi->s_mmp_tsk) { > kthread_stop(sbi->s_mmp_tsk); > brelse(sbi->s_mmp_bh); > sbi->s_mmp_tsk = NULL; > } > + mutex_unlock(&sbi->s_mmp_lock); > } > > /* > diff --git a/fs/ext4/super.c b/fs/ext4/super.c > index c3942804b57f..5bc3230553fb 100644 > --- a/fs/ext4/super.c > +++ b/fs/ext4/super.c > @@ -4786,6 +4786,7 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent) > needs_recovery = (es->s_last_orphan != 0 || > ext4_has_feature_journal_needs_recovery(sb)); > > + mutex_init(&sbi->s_mmp_lock); > if (ext4_has_feature_mmp(sb) && !sb_rdonly(sb)) > if (ext4_multi_mount_protect(sb, le64_to_cpu(es->s_mmp_block))) > goto failed_mount3a; > -- > 2.31.1 >
On Mon, Jul 05, 2021 at 01:15:48PM +0200, Jan Kara wrote: > > That being said for this scheme spinlock is enough, you don't need a mutex > for s_mmp_lock. I think we can solve this without using using either a spinlock or a mutex, and it's a smaller and simpler patch as a result. (This is the -v2 version of this patch, which removes an unused label compared to the earlier version.) From 22ebc97aac75e27a5fd11acdb2bc3030d1da58d1 Mon Sep 17 00:00:00 2001 From: Theodore Ts'o <tytso@mit.edu> Date: Fri, 2 Jul 2021 12:45:02 -0400 Subject: [PATCH] ext4: fix possible UAF when remounting r/o a mmp-protected file system After commit 618f003199c6 ("ext4: fix memory leak in ext4_fill_super"), after the file system is remounted read-only, there is a race where the kmmpd thread can exit, causing sbi->s_mmp_tsk to point at freed memory, which the call to ext4_stop_mmpd() can trip over. Fix this by only allowing kmmpd() to exit when it is stopped via ext4_stop_mmpd(). Link: https://lore.kernel.org/r/e525c0bf7b18da426bb3d3dd63830a3f85218a9e.1625244710.git.tytso@mit.edu Reported-by: Ye Bin <yebin10@huawei.com> Bug-Report-Link: <20210629143603.2166962-1-yebin10@huawei.com> Signed-off-by: Theodore Ts'o <tytso@mit.edu> --- fs/ext4/mmp.c | 33 +++++++++++++++++---------------- fs/ext4/super.c | 6 +++++- 2 files changed, 22 insertions(+), 17 deletions(-) diff --git a/fs/ext4/mmp.c b/fs/ext4/mmp.c index 6cb598b549ca..1e95cee3d8b7 100644 --- a/fs/ext4/mmp.c +++ b/fs/ext4/mmp.c @@ -157,6 +157,17 @@ static int kmmpd(void *data) sizeof(mmp->mmp_nodename)); while (!kthread_should_stop()) { + if (!(le32_to_cpu(es->s_feature_incompat) & + EXT4_FEATURE_INCOMPAT_MMP)) { + ext4_warning(sb, "kmmpd being stopped since MMP feature" + " has been disabled."); + goto wait_to_exit; + } + if (sb_rdonly(sb)) { + if (!kthread_should_stop()) + schedule_timeout_interruptible(HZ); + continue; + } if (++seq > EXT4_MMP_SEQ_MAX) seq = 1; @@ -177,16 +188,6 @@ static int kmmpd(void *data) failed_writes++; } - if (!(le32_to_cpu(es->s_feature_incompat) & - EXT4_FEATURE_INCOMPAT_MMP)) { - ext4_warning(sb, "kmmpd being stopped since MMP feature" - " has been disabled."); - goto exit_thread; - } - - if (sb_rdonly(sb)) - break; - diff = jiffies - last_update_time; if (diff < mmp_update_interval * HZ) schedule_timeout_interruptible(mmp_update_interval * @@ -207,7 +208,7 @@ static int kmmpd(void *data) ext4_error_err(sb, -retval, "error reading MMP data: %d", retval); - goto exit_thread; + goto wait_to_exit; } mmp_check = (struct mmp_struct *)(bh_check->b_data); @@ -221,7 +222,7 @@ static int kmmpd(void *data) ext4_error_err(sb, EBUSY, "abort"); put_bh(bh_check); retval = -EBUSY; - goto exit_thread; + goto wait_to_exit; } put_bh(bh_check); } @@ -242,9 +243,11 @@ static int kmmpd(void *data) mmp->mmp_seq = cpu_to_le32(EXT4_MMP_SEQ_CLEAN); mmp->mmp_time = cpu_to_le64(ktime_get_real_seconds()); - retval = write_mmp_block(sb, bh); + return write_mmp_block(sb, bh); -exit_thread: +wait_to_exit: + while (!kthread_should_stop()) + schedule(); return retval; } @@ -391,5 +394,3 @@ int ext4_multi_mount_protect(struct super_block *sb, brelse(bh); return 1; } - - diff --git a/fs/ext4/super.c b/fs/ext4/super.c index cdbe71d935e8..b8ff0399e171 100644 --- a/fs/ext4/super.c +++ b/fs/ext4/super.c @@ -5993,7 +5993,6 @@ static int ext4_remount(struct super_block *sb, int *flags, char *data) */ ext4_mark_recovery_complete(sb, es); } - ext4_stop_mmpd(sbi); } else { /* Make sure we can mount this feature set readwrite */ if (ext4_has_feature_readonly(sb) || @@ -6107,6 +6106,9 @@ static int ext4_remount(struct super_block *sb, int *flags, char *data) if (!test_opt(sb, BLOCK_VALIDITY) && sbi->s_system_blks) ext4_release_system_zone(sb); + if (!ext4_has_feature_mmp(sb) || sb_rdonly(sb)) + ext4_stop_mmpd(sbi); + /* * Some options can be enabled by ext4 and/or by VFS mount flag * either way we need to make sure it matches in both *flags and @@ -6140,6 +6142,8 @@ static int ext4_remount(struct super_block *sb, int *flags, char *data) for (i = 0; i < EXT4_MAXQUOTAS; i++) kfree(to_free[i]); #endif + if (!ext4_has_feature_mmp(sb) || sb_rdonly(sb)) + ext4_stop_mmpd(sbi); kfree(orig_data); return err; }
On Mon 05-07-21 16:35:28, Theodore Ts'o wrote: > On Mon, Jul 05, 2021 at 01:15:48PM +0200, Jan Kara wrote: > > > > That being said for this scheme spinlock is enough, you don't need a mutex > > for s_mmp_lock. > > I think we can solve this without using using either a spinlock or a > mutex, and it's a smaller and simpler patch as a result. (This is the > -v2 version of this patch, which removes an unused label compared to > the earlier version.) Yeah, what you suggest is probably simpler. Some comments below. > From 22ebc97aac75e27a5fd11acdb2bc3030d1da58d1 Mon Sep 17 00:00:00 2001 > From: Theodore Ts'o <tytso@mit.edu> > Date: Fri, 2 Jul 2021 12:45:02 -0400 > Subject: [PATCH] ext4: fix possible UAF when remounting r/o a mmp-protected file system > > After commit 618f003199c6 ("ext4: fix memory leak in > ext4_fill_super"), after the file system is remounted read-only, there > is a race where the kmmpd thread can exit, causing sbi->s_mmp_tsk to > point at freed memory, which the call to ext4_stop_mmpd() can trip > over. > > Fix this by only allowing kmmpd() to exit when it is stopped via > ext4_stop_mmpd(). > > Link: https://lore.kernel.org/r/e525c0bf7b18da426bb3d3dd63830a3f85218a9e.1625244710.git.tytso@mit.edu > Reported-by: Ye Bin <yebin10@huawei.com> > Bug-Report-Link: <20210629143603.2166962-1-yebin10@huawei.com> > Signed-off-by: Theodore Ts'o <tytso@mit.edu> > --- > fs/ext4/mmp.c | 33 +++++++++++++++++---------------- > fs/ext4/super.c | 6 +++++- > 2 files changed, 22 insertions(+), 17 deletions(-) > > diff --git a/fs/ext4/mmp.c b/fs/ext4/mmp.c > index 6cb598b549ca..1e95cee3d8b7 100644 > --- a/fs/ext4/mmp.c > +++ b/fs/ext4/mmp.c > @@ -157,6 +157,17 @@ static int kmmpd(void *data) > sizeof(mmp->mmp_nodename)); > > while (!kthread_should_stop()) { > + if (!(le32_to_cpu(es->s_feature_incompat) & > + EXT4_FEATURE_INCOMPAT_MMP)) { We can probably use ext4_has_feature_mmp() macro when changing this? > + ext4_warning(sb, "kmmpd being stopped since MMP feature" > + " has been disabled."); > + goto wait_to_exit; > + } > + if (sb_rdonly(sb)) { > + if (!kthread_should_stop()) > + schedule_timeout_interruptible(HZ); Cannot this effectively block remount RO for 1s when we wait for kmmpd to exit? I think doing 'break' when we detected RO super is fine. We'll write the mmp block and then wait for kthread_should_stop() condition as in any other abort case. Am I missing something? > + continue; > + } > if (++seq > EXT4_MMP_SEQ_MAX) > seq = 1; > > @@ -177,16 +188,6 @@ static int kmmpd(void *data) > failed_writes++; > } > > - if (!(le32_to_cpu(es->s_feature_incompat) & > - EXT4_FEATURE_INCOMPAT_MMP)) { > - ext4_warning(sb, "kmmpd being stopped since MMP feature" > - " has been disabled."); > - goto exit_thread; > - } > - > - if (sb_rdonly(sb)) > - break; > - > diff = jiffies - last_update_time; > if (diff < mmp_update_interval * HZ) > schedule_timeout_interruptible(mmp_update_interval * > @@ -207,7 +208,7 @@ static int kmmpd(void *data) > ext4_error_err(sb, -retval, > "error reading MMP data: %d", > retval); > - goto exit_thread; > + goto wait_to_exit; > } > > mmp_check = (struct mmp_struct *)(bh_check->b_data); > @@ -221,7 +222,7 @@ static int kmmpd(void *data) > ext4_error_err(sb, EBUSY, "abort"); > put_bh(bh_check); > retval = -EBUSY; > - goto exit_thread; > + goto wait_to_exit; > } > put_bh(bh_check); > } > @@ -242,9 +243,11 @@ static int kmmpd(void *data) > mmp->mmp_seq = cpu_to_le32(EXT4_MMP_SEQ_CLEAN); > mmp->mmp_time = cpu_to_le64(ktime_get_real_seconds()); > > - retval = write_mmp_block(sb, bh); > + return write_mmp_block(sb, bh); > > -exit_thread: > +wait_to_exit: > + while (!kthread_should_stop()) > + schedule(); This makes me a bit nervous that we could unnecessarily burn CPU for potentially a long time (e.g. if somebody uses tune2fs to disable MMP, we would be sitting in this loop until the fs in remounted / unmounted). So maybe we should have something like: while (!kthread_should_stop()) { set_task_state(TASK_INTERRUPTIBLE); if (!kthread_should_stop()) schedule(); } This should safely synchronize with (and not miss wakeup from) kthread_stop() since that first sets KTHREAD_SHOULD_STOP and after that calls wake_up_process(). Honza
On Tue, Jul 06, 2021 at 01:11:37PM +0200, Jan Kara wrote: > > --- a/fs/ext4/mmp.c > > +++ b/fs/ext4/mmp.c > > @@ -157,6 +157,17 @@ static int kmmpd(void *data) > > sizeof(mmp->mmp_nodename)); > > > > while (!kthread_should_stop()) { > > + if (!(le32_to_cpu(es->s_feature_incompat) & > > + EXT4_FEATURE_INCOMPAT_MMP)) { > > We can probably use ext4_has_feature_mmp() macro when changing this? Ack, I'll make that change. > > + if (sb_rdonly(sb)) { > > + if (!kthread_should_stop()) > > + schedule_timeout_interruptible(HZ); > > Cannot this effectively block remount RO for 1s when we wait for kmmpd to > exit? I think doing 'break' when we detected RO super is fine. We'll write > the mmp block and then wait for kthread_should_stop() condition as in any > other abort case. Am I missing something? Yeah, we do want to update the mmp block when remounting the file system read-only. So breaking out to exit is the right thing to do here. > > +wait_to_exit: > > + while (!kthread_should_stop()) > > + schedule(); > > This makes me a bit nervous that we could unnecessarily burn CPU for > potentially a long time (e.g. if somebody uses tune2fs to disable MMP, we > would be sitting in this loop until the fs in remounted / unmounted). So > maybe we should have something like: > > while (!kthread_should_stop()) { > set_task_state(TASK_INTERRUPTIBLE); > if (!kthread_should_stop()) > schedule(); > } > > This should safely synchronize with (and not miss wakeup from) > kthread_stop() since that first sets KTHREAD_SHOULD_STOP and after that > calls wake_up_process(). Yep, good catch. I'll fix this and send out revised patch. - Ted
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h index 8d3446746718..a479da37fed4 100644 --- a/fs/ext4/ext4.h +++ b/fs/ext4/ext4.h @@ -1489,6 +1489,7 @@ struct ext4_sb_info { struct completion s_kobj_unregister; struct super_block *s_sb; struct buffer_head *s_mmp_bh; + struct mutex s_mmp_lock; /* Journaling */ struct journal_s *s_journal; diff --git a/fs/ext4/mmp.c b/fs/ext4/mmp.c index 6cb598b549ca..fc18a8c205c7 100644 --- a/fs/ext4/mmp.c +++ b/fs/ext4/mmp.c @@ -128,8 +128,9 @@ void __dump_mmp_msg(struct super_block *sb, struct mmp_struct *mmp, static int kmmpd(void *data) { struct super_block *sb = (struct super_block *) data; - struct ext4_super_block *es = EXT4_SB(sb)->s_es; - struct buffer_head *bh = EXT4_SB(sb)->s_mmp_bh; + struct ext4_sb_info *sbi = EXT4_SB(sb); + struct ext4_super_block *es = sbi->s_es; + struct buffer_head *bh = sbi->s_mmp_bh; struct mmp_struct *mmp; ext4_fsblk_t mmp_block; u32 seq = 0; @@ -245,16 +246,35 @@ static int kmmpd(void *data) retval = write_mmp_block(sb, bh); exit_thread: + /* + * Maybe s_mmp_tsk kthread is stoped by others or by itself. If exit + * by itself then sbi->s_mmp_tsk will be wild ptr, so there is need + * set sbi->s_mmp_tsk with NULL, and also release mmp buffer_head. + */ + while (!kthread_should_stop()) { + if (!mutex_trylock(&sbi->s_mmp_lock)) + continue; + + if (sbi->s_mmp_tsk) { + sbi->s_mmp_tsk = NULL; + brelse(bh); + } + mutex_unlock(&sbi->s_mmp_lock); + break; + } + return retval; } void ext4_stop_mmpd(struct ext4_sb_info *sbi) { + mutex_lock(&sbi->s_mmp_lock); if (sbi->s_mmp_tsk) { kthread_stop(sbi->s_mmp_tsk); brelse(sbi->s_mmp_bh); sbi->s_mmp_tsk = NULL; } + mutex_unlock(&sbi->s_mmp_lock); } /* diff --git a/fs/ext4/super.c b/fs/ext4/super.c index c3942804b57f..5bc3230553fb 100644 --- a/fs/ext4/super.c +++ b/fs/ext4/super.c @@ -4786,6 +4786,7 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent) needs_recovery = (es->s_last_orphan != 0 || ext4_has_feature_journal_needs_recovery(sb)); + mutex_init(&sbi->s_mmp_lock); if (ext4_has_feature_mmp(sb) && !sb_rdonly(sb)) if (ext4_multi_mount_protect(sb, le64_to_cpu(es->s_mmp_block))) goto failed_mount3a;
After merge 618f003199c6("ext4: fix memory leak in ext4_fill_super") commit, we add delay in ext4_remount after "sb->s_flags |= SB_RDONLY", then remount filesystem with read-only kasan report following warning: [ 888.695326] ================================================================== [ 888.696566] BUG: KASAN: use-after-free in kthread_stop+0x4c/0x2f0 [ 888.697599] Write of size 4 at addr ffff8883849e0020 by task mount/2013 [ 888.698707] [ 888.698982] CPU: 4 PID: 2013 Comm: mount Not tainted 4.19.95-00013-ga369a6189bbf-dirty #413 [ 888.700376] Hardware name: QEMU Standard PC [ 888.702587] Call Trace: [ 888.703017] dump_stack+0x108/0x15f [ 888.703615] print_address_description+0xa5/0x372 [ 888.704420] kasan_report.cold+0x236/0x2a8 [ 888.705761] check_memory_region+0x240/0x270 [ 888.706486] kasan_check_write+0x20/0x30 [ 888.707156] kthread_stop+0x4c/0x2f0 [ 888.707776] ext4_stop_mmpd+0x32/0x90 [ 888.708262] ext4_remount.cold+0xf6/0x116 [ 888.712671] do_remount_sb+0xff/0x460 [ 888.714007] do_mount+0xce3/0x1be0 [ 888.717749] ksys_mount+0xb2/0x150 [ 888.718163] __x64_sys_mount+0x6a/0x80 [ 888.718607] do_syscall_64+0xd9/0x1f0 [ 888.719047] entry_SYSCALL_64_after_hwframe+0x44/0xa9 As kmmpd will exit if filesystem is read-only. Then sbi->s_mmp_tsk become wild ptr, lead to use-after-free. As kmmpd will exit by others(call ktread_stop) or by itself. After 618f003199c6 commit we can trigger this issue very easy. Before this commit also exist this issue. If kmmpd exit by itself, after merge 618f003199c6 commit there will trigger UAF when umount filesystem. To fix this issue, introduce sbi->s_mmp_lock to protect sbi->s_mmp_tsk. If kmmpd exit by itself, we set sbi->s_mmp_tsk with NULL, and release mmp buffer_head. Fixes: 618f003199c6 ("ext4: fix memory leak in ext4_fill_super") Signed-off-by: Ye Bin <yebin10@huawei.com> --- fs/ext4/ext4.h | 1 + fs/ext4/mmp.c | 24 ++++++++++++++++++++++-- fs/ext4/super.c | 1 + 3 files changed, 24 insertions(+), 2 deletions(-)