Message ID | 20210408113618.1033785-4-yi.zhang@huawei.com |
---|---|
State | New |
Headers | show |
Series | ext4: fix two issue about bdev_try_to_free_page() | expand |
Hi Zhang,
Thank you for the patch! Yet something to improve:
[auto build test ERROR on ext4/dev]
[also build test ERROR on linus/master v5.12-rc6 next-20210408]
[cannot apply to tytso-fscrypt/master]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]
url: https://github.com/0day-ci/linux/commits/Zhang-Yi/ext4-fix-two-issue-about-bdev_try_to_free_page/20210408-193105
base: https://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4.git dev
config: alpha-randconfig-m031-20210408 (attached as .config)
compiler: alpha-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/0day-ci/linux/commit/1cd15af09e1887501090c23700b696d43ebf39ca
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Zhang-Yi/ext4-fix-two-issue-about-bdev_try_to_free_page/20210408-193105
git checkout 1cd15af09e1887501090c23700b696d43ebf39ca
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=alpha
If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>
All errors (new ones prefixed by >>):
fs/ocfs2/journal.c: In function 'ocfs2_journal_shutdown':
>> fs/ocfs2/journal.c:1012:6: error: invalid use of void expression
1012 | if (!jbd2_journal_destroy(journal->j_journal) && !status) {
| ^
vim +1012 fs/ocfs2/journal.c
ccd979bdbce9fb Mark Fasheh 2005-12-15 955
ccd979bdbce9fb Mark Fasheh 2005-12-15 956 /*
ccd979bdbce9fb Mark Fasheh 2005-12-15 957 * If the journal has been kmalloc'd it needs to be freed after this
ccd979bdbce9fb Mark Fasheh 2005-12-15 958 * call.
ccd979bdbce9fb Mark Fasheh 2005-12-15 959 */
ccd979bdbce9fb Mark Fasheh 2005-12-15 960 void ocfs2_journal_shutdown(struct ocfs2_super *osb)
ccd979bdbce9fb Mark Fasheh 2005-12-15 961 {
ccd979bdbce9fb Mark Fasheh 2005-12-15 962 struct ocfs2_journal *journal = NULL;
ccd979bdbce9fb Mark Fasheh 2005-12-15 963 int status = 0;
ccd979bdbce9fb Mark Fasheh 2005-12-15 964 struct inode *inode = NULL;
ccd979bdbce9fb Mark Fasheh 2005-12-15 965 int num_running_trans = 0;
ccd979bdbce9fb Mark Fasheh 2005-12-15 966
ebdec83ba46c12 Eric Sesterhenn / snakebyte 2006-01-27 967 BUG_ON(!osb);
ccd979bdbce9fb Mark Fasheh 2005-12-15 968
ccd979bdbce9fb Mark Fasheh 2005-12-15 969 journal = osb->journal;
ccd979bdbce9fb Mark Fasheh 2005-12-15 970 if (!journal)
ccd979bdbce9fb Mark Fasheh 2005-12-15 971 goto done;
ccd979bdbce9fb Mark Fasheh 2005-12-15 972
ccd979bdbce9fb Mark Fasheh 2005-12-15 973 inode = journal->j_inode;
ccd979bdbce9fb Mark Fasheh 2005-12-15 974
ccd979bdbce9fb Mark Fasheh 2005-12-15 975 if (journal->j_state != OCFS2_JOURNAL_LOADED)
ccd979bdbce9fb Mark Fasheh 2005-12-15 976 goto done;
ccd979bdbce9fb Mark Fasheh 2005-12-15 977
2b4e30fbde4258 Joel Becker 2008-09-03 978 /* need to inc inode use count - jbd2_journal_destroy will iput. */
ccd979bdbce9fb Mark Fasheh 2005-12-15 979 if (!igrab(inode))
ccd979bdbce9fb Mark Fasheh 2005-12-15 980 BUG();
ccd979bdbce9fb Mark Fasheh 2005-12-15 981
ccd979bdbce9fb Mark Fasheh 2005-12-15 982 num_running_trans = atomic_read(&(osb->journal->j_num_trans));
b41079504c786e Tao Ma 2011-02-24 983 trace_ocfs2_journal_shutdown(num_running_trans);
ccd979bdbce9fb Mark Fasheh 2005-12-15 984
ccd979bdbce9fb Mark Fasheh 2005-12-15 985 /* Do a commit_cache here. It will flush our journal, *and*
ccd979bdbce9fb Mark Fasheh 2005-12-15 986 * release any locks that are still held.
ccd979bdbce9fb Mark Fasheh 2005-12-15 987 * set the SHUTDOWN flag and release the trans lock.
ccd979bdbce9fb Mark Fasheh 2005-12-15 988 * the commit thread will take the trans lock for us below. */
ccd979bdbce9fb Mark Fasheh 2005-12-15 989 journal->j_state = OCFS2_JOURNAL_IN_SHUTDOWN;
ccd979bdbce9fb Mark Fasheh 2005-12-15 990
ccd979bdbce9fb Mark Fasheh 2005-12-15 991 /* The OCFS2_JOURNAL_IN_SHUTDOWN will signal to commit_cache to not
ccd979bdbce9fb Mark Fasheh 2005-12-15 992 * drop the trans_lock (which we want to hold until we
ccd979bdbce9fb Mark Fasheh 2005-12-15 993 * completely destroy the journal. */
ccd979bdbce9fb Mark Fasheh 2005-12-15 994 if (osb->commit_task) {
ccd979bdbce9fb Mark Fasheh 2005-12-15 995 /* Wait for the commit thread */
b41079504c786e Tao Ma 2011-02-24 996 trace_ocfs2_journal_shutdown_wait(osb->commit_task);
ccd979bdbce9fb Mark Fasheh 2005-12-15 997 kthread_stop(osb->commit_task);
ccd979bdbce9fb Mark Fasheh 2005-12-15 998 osb->commit_task = NULL;
ccd979bdbce9fb Mark Fasheh 2005-12-15 999 }
ccd979bdbce9fb Mark Fasheh 2005-12-15 1000
ccd979bdbce9fb Mark Fasheh 2005-12-15 1001 BUG_ON(atomic_read(&(osb->journal->j_num_trans)) != 0);
ccd979bdbce9fb Mark Fasheh 2005-12-15 1002
c271c5c22b0a7c Sunil Mushran 2006-12-05 1003 if (ocfs2_mount_local(osb)) {
2b4e30fbde4258 Joel Becker 2008-09-03 1004 jbd2_journal_lock_updates(journal->j_journal);
2b4e30fbde4258 Joel Becker 2008-09-03 1005 status = jbd2_journal_flush(journal->j_journal);
2b4e30fbde4258 Joel Becker 2008-09-03 1006 jbd2_journal_unlock_updates(journal->j_journal);
c271c5c22b0a7c Sunil Mushran 2006-12-05 1007 if (status < 0)
c271c5c22b0a7c Sunil Mushran 2006-12-05 1008 mlog_errno(status);
c271c5c22b0a7c Sunil Mushran 2006-12-05 1009 }
c271c5c22b0a7c Sunil Mushran 2006-12-05 1010
d85400af790dba Junxiao Bi 2018-12-28 1011 /* Shutdown the kernel journal system */
d85400af790dba Junxiao Bi 2018-12-28 @1012 if (!jbd2_journal_destroy(journal->j_journal) && !status) {
c271c5c22b0a7c Sunil Mushran 2006-12-05 1013 /*
c271c5c22b0a7c Sunil Mushran 2006-12-05 1014 * Do not toggle if flush was unsuccessful otherwise
c271c5c22b0a7c Sunil Mushran 2006-12-05 1015 * will leave dirty metadata in a "clean" journal
c271c5c22b0a7c Sunil Mushran 2006-12-05 1016 */
539d8264093560 Sunil Mushran 2008-07-14 1017 status = ocfs2_journal_toggle_dirty(osb, 0, 0);
ccd979bdbce9fb Mark Fasheh 2005-12-15 1018 if (status < 0)
ccd979bdbce9fb Mark Fasheh 2005-12-15 1019 mlog_errno(status);
c271c5c22b0a7c Sunil Mushran 2006-12-05 1020 }
ae0dff683076b2 Sunil Mushran 2008-10-22 1021 journal->j_journal = NULL;
ccd979bdbce9fb Mark Fasheh 2005-12-15 1022
ccd979bdbce9fb Mark Fasheh 2005-12-15 1023 OCFS2_I(inode)->ip_open_count--;
ccd979bdbce9fb Mark Fasheh 2005-12-15 1024
ccd979bdbce9fb Mark Fasheh 2005-12-15 1025 /* unlock our journal */
e63aecb651ba73 Mark Fasheh 2007-10-18 1026 ocfs2_inode_unlock(inode, 1);
ccd979bdbce9fb Mark Fasheh 2005-12-15 1027
ccd979bdbce9fb Mark Fasheh 2005-12-15 1028 brelse(journal->j_bh);
ccd979bdbce9fb Mark Fasheh 2005-12-15 1029 journal->j_bh = NULL;
ccd979bdbce9fb Mark Fasheh 2005-12-15 1030
ccd979bdbce9fb Mark Fasheh 2005-12-15 1031 journal->j_state = OCFS2_JOURNAL_FREE;
ccd979bdbce9fb Mark Fasheh 2005-12-15 1032
---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org
On Thu 08-04-21 19:36:18, Zhang Yi wrote: > There is a race between bdev_try_to_free_page() and > jbd2_journal_destroy() that could end up triggering a use after free > issue about journal. > > drop cache umount filesystem > bdev_try_to_free_page() > get journal > jbd2_journal_try_to_free_buffers() ext4_put_super() > kfree(journal) > access journal <-- lead to UAF > > The above race also could happens between the bdev_try_to_free_page() > and the error path of ext4_fill_super(). This patch avoid this race by > add rcu protection around accessing sbi->s_journal in > bdev_try_to_free_page() and destroy the journal after an rcu grace > period. > > Signed-off-by: Zhang Yi <yi.zhang@huawei.com> OK, I see the problem. But cannot the use-after-free happen even for the superblock itself (e.g., EXT4_SB(sb)->s_journal dereference)? I don't see anything preventing that as blkdev_releasepage() just shamelessly does: super = BDEV_I(page->mapping->host)->bdev.bd_super without making sure the sb cannot go away the instant we load a pointer to it. Or am I missing something Ted? If I'm right, we'd need some careful sprinkling of RCU, READ_ONCE(), and careful superblock reference grabbing to make bdev_try_to_free_page() safe against concurrent kill_block_super()... Honza > --- > fs/ext4/super.c | 33 ++++++++++++++++++++++++--------- > fs/jbd2/journal.c | 30 +++++++++++++++++++++++++++--- > include/linux/jbd2.h | 11 ++++++++++- > 3 files changed, 61 insertions(+), 13 deletions(-) > > diff --git a/fs/ext4/super.c b/fs/ext4/super.c > index 02ba47a5bc70..6bbaadc5357b 100644 > --- a/fs/ext4/super.c > +++ b/fs/ext4/super.c > @@ -1150,6 +1150,21 @@ static inline void ext4_quota_off_umount(struct super_block *sb) > } > #endif > > +static int ext4_journal_release(struct ext4_sb_info *sbi) > +{ > + journal_t *journal = sbi->s_journal; > + int ret; > + > + ret = jbd2_journal_release(journal); > + sbi->s_journal = NULL; > + /* > + * Call rcu to prevent racing with bdev_try_to_free_page() > + * accessing the journal at the same time. > + */ > + call_rcu(&journal->j_rcu, jbd2_journal_release_rcu); > + return ret; > +} > + > static void ext4_put_super(struct super_block *sb) > { > struct ext4_sb_info *sbi = EXT4_SB(sb); > @@ -1174,11 +1189,9 @@ static void ext4_put_super(struct super_block *sb) > > if (sbi->s_journal) { > aborted = is_journal_aborted(sbi->s_journal); > - err = jbd2_journal_destroy(sbi->s_journal); > - sbi->s_journal = NULL; > - if ((err < 0) && !aborted) { > + err = ext4_journal_release(sbi); > + if ((err < 0) && !aborted) > ext4_abort(sb, -err, "Couldn't clean up the journal"); > - } > } > > ext4_es_unregister_shrinker(sbi); > @@ -1449,14 +1462,18 @@ static int ext4_nfs_commit_metadata(struct inode *inode) > static int bdev_try_to_free_page(struct super_block *sb, struct page *page, > gfp_t wait) > { > - journal_t *journal = EXT4_SB(sb)->s_journal; > + journal_t *journal; > int ret = 0; > > WARN_ON(PageChecked(page)); > if (!page_has_buffers(page)) > return 0; > + > + rcu_read_lock(); > + journal = READ_ONCE(EXT4_SB(sb)->s_journal); > if (journal) > ret = jbd2_journal_try_to_free_buffers(journal, page); > + rcu_read_unlock(); > if (!ret) > return try_to_free_buffers(page); > return 0; > @@ -5146,10 +5163,8 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent) > ext4_xattr_destroy_cache(sbi->s_ea_block_cache); > sbi->s_ea_block_cache = NULL; > > - if (sbi->s_journal) { > - jbd2_journal_destroy(sbi->s_journal); > - sbi->s_journal = NULL; > - } > + if (sbi->s_journal) > + ext4_journal_release(sbi); > failed_mount3a: > ext4_es_unregister_shrinker(sbi); > failed_mount3: > diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c > index 2dc944442802..071caaaa9de1 100644 > --- a/fs/jbd2/journal.c > +++ b/fs/jbd2/journal.c > @@ -76,6 +76,8 @@ EXPORT_SYMBOL(jbd2_journal_check_available_features); > EXPORT_SYMBOL(jbd2_journal_set_features); > EXPORT_SYMBOL(jbd2_journal_load); > EXPORT_SYMBOL(jbd2_journal_destroy); > +EXPORT_SYMBOL(jbd2_journal_release); > +EXPORT_SYMBOL(jbd2_journal_release_rcu); > EXPORT_SYMBOL(jbd2_journal_abort); > EXPORT_SYMBOL(jbd2_journal_errno); > EXPORT_SYMBOL(jbd2_journal_ack_err); > @@ -1951,14 +1953,14 @@ int jbd2_journal_load(journal_t *journal) > } > > /** > - * jbd2_journal_destroy() - Release a journal_t structure. > + * jbd2_journal_release() - Release a journal_t structure. > * @journal: Journal to act on. > * > * Release a journal_t structure once it is no longer in use by the > * journaled object. > * Return <0 if we couldn't clean up the journal. > */ > -int jbd2_journal_destroy(journal_t *journal) > +int jbd2_journal_release(journal_t *journal) > { > int err = 0; > > @@ -2021,11 +2023,33 @@ int jbd2_journal_destroy(journal_t *journal) > crypto_free_shash(journal->j_chksum_driver); > kfree(journal->j_fc_wbuf); > kfree(journal->j_wbuf); > - kfree(journal); > > return err; > } > > +/** > + * jbd2_journal_release_rcu() - Free a journal_t structure. > + * @rcu: rcu list node relate to the journal want to free. > + * > + * Freeing a journal_t structure after a rcu grace period. > + */ > +void jbd2_journal_release_rcu(struct rcu_head *rcu) > +{ > + kfree(container_of(rcu, journal_t, j_rcu)); > +} > + > +/** > + * jbd2_journal_destroy() - Release and free a journal_t structure. > + * @journal: Journal to act on. > + * > + * Release and free a journal_t structure once it is no longer in use > + * by the journaled object. > + */ > +void jbd2_journal_destroy(journal_t *journal) > +{ > + jbd2_journal_release(journal); > + kfree(journal); > +} > > /** > * jbd2_journal_check_used_features() - Check if features specified are used. > diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h > index 99d3cd051ac3..39a8d04596a2 100644 > --- a/include/linux/jbd2.h > +++ b/include/linux/jbd2.h > @@ -1238,6 +1238,13 @@ struct journal_s > */ > __u32 j_csum_seed; > > + /** > + * @j_rcu: > + * > + * Prevent racing between accessing and destroy at the same time. > + */ > + struct rcu_head j_rcu; > + > #ifdef CONFIG_DEBUG_LOCK_ALLOC > /** > * @j_trans_commit_map: > @@ -1509,7 +1516,9 @@ extern int jbd2_journal_set_features > extern void jbd2_journal_clear_features > (journal_t *, unsigned long, unsigned long, unsigned long); > extern int jbd2_journal_load (journal_t *journal); > -extern int jbd2_journal_destroy (journal_t *); > +extern void jbd2_journal_destroy (journal_t *); > +extern int jbd2_journal_release (journal_t *); > +extern void jbd2_journal_release_rcu (struct rcu_head *rcu); > extern int jbd2_journal_recover (journal_t *journal); > extern int jbd2_journal_wipe (journal_t *, int); > extern int jbd2_journal_skip_recovery (journal_t *); > -- > 2.25.4 >
On 2021/4/8 21:56, Jan Kara wrote: > On Thu 08-04-21 19:36:18, Zhang Yi wrote: >> There is a race between bdev_try_to_free_page() and >> jbd2_journal_destroy() that could end up triggering a use after free >> issue about journal. >> >> drop cache umount filesystem >> bdev_try_to_free_page() >> get journal >> jbd2_journal_try_to_free_buffers() ext4_put_super() >> kfree(journal) >> access journal <-- lead to UAF >> >> The above race also could happens between the bdev_try_to_free_page() >> and the error path of ext4_fill_super(). This patch avoid this race by >> add rcu protection around accessing sbi->s_journal in >> bdev_try_to_free_page() and destroy the journal after an rcu grace >> period. >> >> Signed-off-by: Zhang Yi <yi.zhang@huawei.com> > > OK, I see the problem. But cannot the use-after-free happen even for the > superblock itself (e.g., EXT4_SB(sb)->s_journal dereference)? I don't see > anything preventing that as blkdev_releasepage() just shamelessly does: > > super = BDEV_I(page->mapping->host)->bdev.bd_super > Hi, Jan. I checked the superblock. In theory, the bdev_try_to_free_page() is invoked with page locked, the umount process will wait the page unlock on kill_block_super()->..->kill_bdev()->truncate_inode_pages_range() before free superblock, so I guess the use-after-free problem couldn't happen in general. But I think it's fragile and may invalidate if the bdev has more than one operners(__blkdev_put() call kill_bdev only if bd_openers becomes zero)? I will check it. Thanks, Yi. > without making sure the sb cannot go away the instant we load a pointer to > it. Or am I missing something Ted? If I'm right, we'd need some careful > sprinkling of RCU, READ_ONCE(), and careful superblock reference grabbing > to make bdev_try_to_free_page() safe against concurrent > kill_block_super()... > > Honza > >> --- >> fs/ext4/super.c | 33 ++++++++++++++++++++++++--------- >> fs/jbd2/journal.c | 30 +++++++++++++++++++++++++++--- >> include/linux/jbd2.h | 11 ++++++++++- >> 3 files changed, 61 insertions(+), 13 deletions(-) >> >> diff --git a/fs/ext4/super.c b/fs/ext4/super.c >> index 02ba47a5bc70..6bbaadc5357b 100644 >> --- a/fs/ext4/super.c >> +++ b/fs/ext4/super.c >> @@ -1150,6 +1150,21 @@ static inline void ext4_quota_off_umount(struct super_block *sb) >> } >> #endif >> >> +static int ext4_journal_release(struct ext4_sb_info *sbi) >> +{ >> + journal_t *journal = sbi->s_journal; >> + int ret; >> + >> + ret = jbd2_journal_release(journal); >> + sbi->s_journal = NULL; >> + /* >> + * Call rcu to prevent racing with bdev_try_to_free_page() >> + * accessing the journal at the same time. >> + */ >> + call_rcu(&journal->j_rcu, jbd2_journal_release_rcu); >> + return ret; >> +} >> + >> static void ext4_put_super(struct super_block *sb) >> { >> struct ext4_sb_info *sbi = EXT4_SB(sb); >> @@ -1174,11 +1189,9 @@ static void ext4_put_super(struct super_block *sb) >> >> if (sbi->s_journal) { >> aborted = is_journal_aborted(sbi->s_journal); >> - err = jbd2_journal_destroy(sbi->s_journal); >> - sbi->s_journal = NULL; >> - if ((err < 0) && !aborted) { >> + err = ext4_journal_release(sbi); >> + if ((err < 0) && !aborted) >> ext4_abort(sb, -err, "Couldn't clean up the journal"); >> - } >> } >> >> ext4_es_unregister_shrinker(sbi); >> @@ -1449,14 +1462,18 @@ static int ext4_nfs_commit_metadata(struct inode *inode) >> static int bdev_try_to_free_page(struct super_block *sb, struct page *page, >> gfp_t wait) >> { >> - journal_t *journal = EXT4_SB(sb)->s_journal; >> + journal_t *journal; >> int ret = 0; >> >> WARN_ON(PageChecked(page)); >> if (!page_has_buffers(page)) >> return 0; >> + >> + rcu_read_lock(); >> + journal = READ_ONCE(EXT4_SB(sb)->s_journal); >> if (journal) >> ret = jbd2_journal_try_to_free_buffers(journal, page); >> + rcu_read_unlock(); >> if (!ret) >> return try_to_free_buffers(page); >> return 0; >> @@ -5146,10 +5163,8 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent) >> ext4_xattr_destroy_cache(sbi->s_ea_block_cache); >> sbi->s_ea_block_cache = NULL; >> >> - if (sbi->s_journal) { >> - jbd2_journal_destroy(sbi->s_journal); >> - sbi->s_journal = NULL; >> - } >> + if (sbi->s_journal) >> + ext4_journal_release(sbi); >> failed_mount3a: >> ext4_es_unregister_shrinker(sbi); >> failed_mount3: >> diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c >> index 2dc944442802..071caaaa9de1 100644 >> --- a/fs/jbd2/journal.c >> +++ b/fs/jbd2/journal.c >> @@ -76,6 +76,8 @@ EXPORT_SYMBOL(jbd2_journal_check_available_features); >> EXPORT_SYMBOL(jbd2_journal_set_features); >> EXPORT_SYMBOL(jbd2_journal_load); >> EXPORT_SYMBOL(jbd2_journal_destroy); >> +EXPORT_SYMBOL(jbd2_journal_release); >> +EXPORT_SYMBOL(jbd2_journal_release_rcu); >> EXPORT_SYMBOL(jbd2_journal_abort); >> EXPORT_SYMBOL(jbd2_journal_errno); >> EXPORT_SYMBOL(jbd2_journal_ack_err); >> @@ -1951,14 +1953,14 @@ int jbd2_journal_load(journal_t *journal) >> } >> >> /** >> - * jbd2_journal_destroy() - Release a journal_t structure. >> + * jbd2_journal_release() - Release a journal_t structure. >> * @journal: Journal to act on. >> * >> * Release a journal_t structure once it is no longer in use by the >> * journaled object. >> * Return <0 if we couldn't clean up the journal. >> */ >> -int jbd2_journal_destroy(journal_t *journal) >> +int jbd2_journal_release(journal_t *journal) >> { >> int err = 0; >> >> @@ -2021,11 +2023,33 @@ int jbd2_journal_destroy(journal_t *journal) >> crypto_free_shash(journal->j_chksum_driver); >> kfree(journal->j_fc_wbuf); >> kfree(journal->j_wbuf); >> - kfree(journal); >> >> return err; >> } >> >> +/** >> + * jbd2_journal_release_rcu() - Free a journal_t structure. >> + * @rcu: rcu list node relate to the journal want to free. >> + * >> + * Freeing a journal_t structure after a rcu grace period. >> + */ >> +void jbd2_journal_release_rcu(struct rcu_head *rcu) >> +{ >> + kfree(container_of(rcu, journal_t, j_rcu)); >> +} >> + >> +/** >> + * jbd2_journal_destroy() - Release and free a journal_t structure. >> + * @journal: Journal to act on. >> + * >> + * Release and free a journal_t structure once it is no longer in use >> + * by the journaled object. >> + */ >> +void jbd2_journal_destroy(journal_t *journal) >> +{ >> + jbd2_journal_release(journal); >> + kfree(journal); >> +} >> >> /** >> * jbd2_journal_check_used_features() - Check if features specified are used. >> diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h >> index 99d3cd051ac3..39a8d04596a2 100644 >> --- a/include/linux/jbd2.h >> +++ b/include/linux/jbd2.h >> @@ -1238,6 +1238,13 @@ struct journal_s >> */ >> __u32 j_csum_seed; >> >> + /** >> + * @j_rcu: >> + * >> + * Prevent racing between accessing and destroy at the same time. >> + */ >> + struct rcu_head j_rcu; >> + >> #ifdef CONFIG_DEBUG_LOCK_ALLOC >> /** >> * @j_trans_commit_map: >> @@ -1509,7 +1516,9 @@ extern int jbd2_journal_set_features >> extern void jbd2_journal_clear_features >> (journal_t *, unsigned long, unsigned long, unsigned long); >> extern int jbd2_journal_load (journal_t *journal); >> -extern int jbd2_journal_destroy (journal_t *); >> +extern void jbd2_journal_destroy (journal_t *); >> +extern int jbd2_journal_release (journal_t *); >> +extern void jbd2_journal_release_rcu (struct rcu_head *rcu); >> extern int jbd2_journal_recover (journal_t *journal); >> extern int jbd2_journal_wipe (journal_t *, int); >> extern int jbd2_journal_skip_recovery (journal_t *); >> -- >> 2.25.4 >>
On Thu 08-04-21 22:38:08, Zhang Yi wrote: > On 2021/4/8 21:56, Jan Kara wrote: > > On Thu 08-04-21 19:36:18, Zhang Yi wrote: > >> There is a race between bdev_try_to_free_page() and > >> jbd2_journal_destroy() that could end up triggering a use after free > >> issue about journal. > >> > >> drop cache umount filesystem > >> bdev_try_to_free_page() > >> get journal > >> jbd2_journal_try_to_free_buffers() ext4_put_super() > >> kfree(journal) > >> access journal <-- lead to UAF > >> > >> The above race also could happens between the bdev_try_to_free_page() > >> and the error path of ext4_fill_super(). This patch avoid this race by > >> add rcu protection around accessing sbi->s_journal in > >> bdev_try_to_free_page() and destroy the journal after an rcu grace > >> period. > >> > >> Signed-off-by: Zhang Yi <yi.zhang@huawei.com> > > > > OK, I see the problem. But cannot the use-after-free happen even for the > > superblock itself (e.g., EXT4_SB(sb)->s_journal dereference)? I don't see > > anything preventing that as blkdev_releasepage() just shamelessly does: > > > > super = BDEV_I(page->mapping->host)->bdev.bd_super > > > Hi, Jan. > > I checked the superblock. In theory, the bdev_try_to_free_page() is invoked > with page locked, the umount process will wait the page unlock on > kill_block_super()->..->kill_bdev()->truncate_inode_pages_range() before free > superblock, so I guess the use-after-free problem couldn't happen in general. > But I think it's fragile and may invalidate if the bdev has more than one > operners(__blkdev_put() call kill_bdev only if bd_openers becomes zero)? Yes, kill_bdev() is only called when bd_openers drops to 0 but there can be other processes having the bdev open (non-exclusively). Honza
diff --git a/fs/ext4/super.c b/fs/ext4/super.c index 02ba47a5bc70..6bbaadc5357b 100644 --- a/fs/ext4/super.c +++ b/fs/ext4/super.c @@ -1150,6 +1150,21 @@ static inline void ext4_quota_off_umount(struct super_block *sb) } #endif +static int ext4_journal_release(struct ext4_sb_info *sbi) +{ + journal_t *journal = sbi->s_journal; + int ret; + + ret = jbd2_journal_release(journal); + sbi->s_journal = NULL; + /* + * Call rcu to prevent racing with bdev_try_to_free_page() + * accessing the journal at the same time. + */ + call_rcu(&journal->j_rcu, jbd2_journal_release_rcu); + return ret; +} + static void ext4_put_super(struct super_block *sb) { struct ext4_sb_info *sbi = EXT4_SB(sb); @@ -1174,11 +1189,9 @@ static void ext4_put_super(struct super_block *sb) if (sbi->s_journal) { aborted = is_journal_aborted(sbi->s_journal); - err = jbd2_journal_destroy(sbi->s_journal); - sbi->s_journal = NULL; - if ((err < 0) && !aborted) { + err = ext4_journal_release(sbi); + if ((err < 0) && !aborted) ext4_abort(sb, -err, "Couldn't clean up the journal"); - } } ext4_es_unregister_shrinker(sbi); @@ -1449,14 +1462,18 @@ static int ext4_nfs_commit_metadata(struct inode *inode) static int bdev_try_to_free_page(struct super_block *sb, struct page *page, gfp_t wait) { - journal_t *journal = EXT4_SB(sb)->s_journal; + journal_t *journal; int ret = 0; WARN_ON(PageChecked(page)); if (!page_has_buffers(page)) return 0; + + rcu_read_lock(); + journal = READ_ONCE(EXT4_SB(sb)->s_journal); if (journal) ret = jbd2_journal_try_to_free_buffers(journal, page); + rcu_read_unlock(); if (!ret) return try_to_free_buffers(page); return 0; @@ -5146,10 +5163,8 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent) ext4_xattr_destroy_cache(sbi->s_ea_block_cache); sbi->s_ea_block_cache = NULL; - if (sbi->s_journal) { - jbd2_journal_destroy(sbi->s_journal); - sbi->s_journal = NULL; - } + if (sbi->s_journal) + ext4_journal_release(sbi); failed_mount3a: ext4_es_unregister_shrinker(sbi); failed_mount3: diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c index 2dc944442802..071caaaa9de1 100644 --- a/fs/jbd2/journal.c +++ b/fs/jbd2/journal.c @@ -76,6 +76,8 @@ EXPORT_SYMBOL(jbd2_journal_check_available_features); EXPORT_SYMBOL(jbd2_journal_set_features); EXPORT_SYMBOL(jbd2_journal_load); EXPORT_SYMBOL(jbd2_journal_destroy); +EXPORT_SYMBOL(jbd2_journal_release); +EXPORT_SYMBOL(jbd2_journal_release_rcu); EXPORT_SYMBOL(jbd2_journal_abort); EXPORT_SYMBOL(jbd2_journal_errno); EXPORT_SYMBOL(jbd2_journal_ack_err); @@ -1951,14 +1953,14 @@ int jbd2_journal_load(journal_t *journal) } /** - * jbd2_journal_destroy() - Release a journal_t structure. + * jbd2_journal_release() - Release a journal_t structure. * @journal: Journal to act on. * * Release a journal_t structure once it is no longer in use by the * journaled object. * Return <0 if we couldn't clean up the journal. */ -int jbd2_journal_destroy(journal_t *journal) +int jbd2_journal_release(journal_t *journal) { int err = 0; @@ -2021,11 +2023,33 @@ int jbd2_journal_destroy(journal_t *journal) crypto_free_shash(journal->j_chksum_driver); kfree(journal->j_fc_wbuf); kfree(journal->j_wbuf); - kfree(journal); return err; } +/** + * jbd2_journal_release_rcu() - Free a journal_t structure. + * @rcu: rcu list node relate to the journal want to free. + * + * Freeing a journal_t structure after a rcu grace period. + */ +void jbd2_journal_release_rcu(struct rcu_head *rcu) +{ + kfree(container_of(rcu, journal_t, j_rcu)); +} + +/** + * jbd2_journal_destroy() - Release and free a journal_t structure. + * @journal: Journal to act on. + * + * Release and free a journal_t structure once it is no longer in use + * by the journaled object. + */ +void jbd2_journal_destroy(journal_t *journal) +{ + jbd2_journal_release(journal); + kfree(journal); +} /** * jbd2_journal_check_used_features() - Check if features specified are used. diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h index 99d3cd051ac3..39a8d04596a2 100644 --- a/include/linux/jbd2.h +++ b/include/linux/jbd2.h @@ -1238,6 +1238,13 @@ struct journal_s */ __u32 j_csum_seed; + /** + * @j_rcu: + * + * Prevent racing between accessing and destroy at the same time. + */ + struct rcu_head j_rcu; + #ifdef CONFIG_DEBUG_LOCK_ALLOC /** * @j_trans_commit_map: @@ -1509,7 +1516,9 @@ extern int jbd2_journal_set_features extern void jbd2_journal_clear_features (journal_t *, unsigned long, unsigned long, unsigned long); extern int jbd2_journal_load (journal_t *journal); -extern int jbd2_journal_destroy (journal_t *); +extern void jbd2_journal_destroy (journal_t *); +extern int jbd2_journal_release (journal_t *); +extern void jbd2_journal_release_rcu (struct rcu_head *rcu); extern int jbd2_journal_recover (journal_t *journal); extern int jbd2_journal_wipe (journal_t *, int); extern int jbd2_journal_skip_recovery (journal_t *);
There is a race between bdev_try_to_free_page() and jbd2_journal_destroy() that could end up triggering a use after free issue about journal. drop cache umount filesystem bdev_try_to_free_page() get journal jbd2_journal_try_to_free_buffers() ext4_put_super() kfree(journal) access journal <-- lead to UAF The above race also could happens between the bdev_try_to_free_page() and the error path of ext4_fill_super(). This patch avoid this race by add rcu protection around accessing sbi->s_journal in bdev_try_to_free_page() and destroy the journal after an rcu grace period. Signed-off-by: Zhang Yi <yi.zhang@huawei.com> --- fs/ext4/super.c | 33 ++++++++++++++++++++++++--------- fs/jbd2/journal.c | 30 +++++++++++++++++++++++++++--- include/linux/jbd2.h | 11 ++++++++++- 3 files changed, 61 insertions(+), 13 deletions(-)