Message ID | 20220727143734.71612-1-jlayton@kernel.org |
---|---|
State | Superseded |
Series | [v2] ext4: unconditionally enable the i_version counter |
On 27 Jul 2022, at 10:37, Jeff Layton wrote:

> The original i_version implementation was pretty expensive, requiring a
> log flush on every change. Because of this, it was gated behind a mount
> option (implemented via the MS_I_VERSION mountoption flag).
>
> Commit ae5e165d855d (fs: new API for handling inode->i_version) made the
> i_version flag much less expensive, so there is no longer a performance
> penalty from enabling it. xfs and btrfs already enable it
> unconditionally when the on-disk format can support it.
>
> Have ext4 ignore the SB_I_VERSION flag, and just enable it
> unconditionally. While we're in here, remove the handling of
> Opt_i_version as well, since we're almost to 5.20 anyway.
>
> Ideally, we'd couple this change with a way to disable the i_version
> counter (just in case), but the way the iversion mount option was
> implemented makes that difficult to do. We'd need to add a new mount
> option altogether or do something with tune2fs. That's probably best
> left to later patches if it turns out to be needed.
>
> Cc: Dave Chinner <david@fromorbit.com>
> Cc: Lukas Czerner <lczerner@redhat.com>
> Cc: Benjamin Coddington <bcodding@redhat.com>
> Cc: Christoph Hellwig <hch@infradead.org>
> Cc: Darrick J. Wong <djwong@kernel.org>
> Signed-off-by: Jeff Layton <jlayton@kernel.org>
> ---
>  fs/ext4/inode.c |  5 ++---
>  fs/ext4/super.c | 13 ++++---------
>  2 files changed, 6 insertions(+), 12 deletions(-)
>
> diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> index 84c0eb55071d..c785c0b72116 100644
> --- a/fs/ext4/inode.c
> +++ b/fs/ext4/inode.c
> @@ -5411,7 +5411,7 @@ int ext4_setattr(struct user_namespace *mnt_userns, struct dentry *dentry,
>  			return -EINVAL;
>  		}
>
> -		if (IS_I_VERSION(inode) && attr->ia_size != inode->i_size)
> +		if (attr->ia_size != inode->i_size)
>  			inode_inc_iversion(inode);
>
>  		if (shrink) {
> @@ -5717,8 +5717,7 @@ int ext4_mark_iloc_dirty(handle_t *handle,
>  	}
>  	ext4_fc_track_inode(handle, inode);
>
> -	if (IS_I_VERSION(inode))
> -		inode_inc_iversion(inode);
> +	inode_inc_iversion(inode);
>
>  	/* the do_update_inode consumes one bh->b_count */
>  	get_bh(iloc->bh);
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index 845f2f8aee5f..4b06f394d7d1 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -1585,7 +1585,7 @@ enum {
>  	Opt_inlinecrypt,
>  	Opt_usrjquota, Opt_grpjquota, Opt_quota,
>  	Opt_noquota, Opt_barrier, Opt_nobarrier, Opt_err,
> -	Opt_usrquota, Opt_grpquota, Opt_prjquota, Opt_i_version,
> +	Opt_usrquota, Opt_grpquota, Opt_prjquota,
>  	Opt_dax, Opt_dax_always, Opt_dax_inode, Opt_dax_never,
>  	Opt_stripe, Opt_delalloc, Opt_nodelalloc, Opt_warn_on_error,
>  	Opt_nowarn_on_error, Opt_mblk_io_submit, Opt_debug_want_extra_isize,
> @@ -1694,7 +1694,6 @@ static const struct fs_parameter_spec ext4_param_specs[] = {
>  	fsparam_flag	("barrier",		Opt_barrier),
>  	fsparam_u32	("barrier",		Opt_barrier),
>  	fsparam_flag	("nobarrier",		Opt_nobarrier),
> -	fsparam_flag	("i_version",		Opt_i_version),

We've got to keep the parameter, I think, else we'll break existing setups
with the i_version mount option.

Ben
On Wed, 2022-07-27 at 11:48 -0400, Benjamin Coddington wrote:
> On 27 Jul 2022, at 10:37, Jeff Layton wrote:
>
> > The original i_version implementation was pretty expensive, requiring a
> > log flush on every change. Because of this, it was gated behind a mount
> > option (implemented via the MS_I_VERSION mountoption flag).
> >
> > Commit ae5e165d855d (fs: new API for handling inode->i_version) made the
> > i_version flag much less expensive, so there is no longer a performance
> > penalty from enabling it. xfs and btrfs already enable it
> > unconditionally when the on-disk format can support it.
> >
> > Have ext4 ignore the SB_I_VERSION flag, and just enable it
> > unconditionally. While we're in here, remove the handling of
> > Opt_i_version as well, since we're almost to 5.20 anyway.
> >
> > Ideally, we'd couple this change with a way to disable the i_version
> > counter (just in case), but the way the iversion mount option was
> > implemented makes that difficult to do. We'd need to add a new mount
> > option altogether or do something with tune2fs. That's probably best
> > left to later patches if it turns out to be needed.
> >
> > Cc: Dave Chinner <david@fromorbit.com>
> > Cc: Lukas Czerner <lczerner@redhat.com>
> > Cc: Benjamin Coddington <bcodding@redhat.com>
> > Cc: Christoph Hellwig <hch@infradead.org>
> > Cc: Darrick J. Wong <djwong@kernel.org>
> > Signed-off-by: Jeff Layton <jlayton@kernel.org>
> > ---
> >  fs/ext4/inode.c |  5 ++---
> >  fs/ext4/super.c | 13 ++++---------
> >  2 files changed, 6 insertions(+), 12 deletions(-)
> >
> > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
> > index 84c0eb55071d..c785c0b72116 100644
> > --- a/fs/ext4/inode.c
> > +++ b/fs/ext4/inode.c
> > @@ -5411,7 +5411,7 @@ int ext4_setattr(struct user_namespace *mnt_userns, struct dentry *dentry,
> >  			return -EINVAL;
> >  		}
> >
> > -		if (IS_I_VERSION(inode) && attr->ia_size != inode->i_size)
> > +		if (attr->ia_size != inode->i_size)
> >  			inode_inc_iversion(inode);
> >
> >  		if (shrink) {
> > @@ -5717,8 +5717,7 @@ int ext4_mark_iloc_dirty(handle_t *handle,
> >  	}
> >  	ext4_fc_track_inode(handle, inode);
> >
> > -	if (IS_I_VERSION(inode))
> > -		inode_inc_iversion(inode);
> > +	inode_inc_iversion(inode);
> >
> >  	/* the do_update_inode consumes one bh->b_count */
> >  	get_bh(iloc->bh);
> > diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> > index 845f2f8aee5f..4b06f394d7d1 100644
> > --- a/fs/ext4/super.c
> > +++ b/fs/ext4/super.c
> > @@ -1585,7 +1585,7 @@ enum {
> >  	Opt_inlinecrypt,
> >  	Opt_usrjquota, Opt_grpjquota, Opt_quota,
> >  	Opt_noquota, Opt_barrier, Opt_nobarrier, Opt_err,
> > -	Opt_usrquota, Opt_grpquota, Opt_prjquota, Opt_i_version,
> > +	Opt_usrquota, Opt_grpquota, Opt_prjquota,
> >  	Opt_dax, Opt_dax_always, Opt_dax_inode, Opt_dax_never,
> >  	Opt_stripe, Opt_delalloc, Opt_nodelalloc, Opt_warn_on_error,
> >  	Opt_nowarn_on_error, Opt_mblk_io_submit, Opt_debug_want_extra_isize,
> > @@ -1694,7 +1694,6 @@ static const struct fs_parameter_spec ext4_param_specs[] = {
> >  	fsparam_flag	("barrier",		Opt_barrier),
> >  	fsparam_u32	("barrier",		Opt_barrier),
> >  	fsparam_flag	("nobarrier",		Opt_nobarrier),
> > -	fsparam_flag	("i_version",		Opt_i_version),
>
> We've got to keep the parameter, I think, else we'll break existing setups
> with the i_version mount option.
>

It had already been announced that the above mount option would be removed
by v5.20 (which Darrick pointed out). We might as well drop it here since
this likely wouldn't be merged before then anyway.

The "iversion" mount option is parsed in the userland mount program, and
gets turned into the MS_I_VERSION flag for the mount syscall. That will
still be done, though with this change, the kernel should now just ignore
it.
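To make the userland side of that concrete, here is a minimal sketch of how a mount helper might fold "iversion" into the flags word it hands to mount(2). The function name flags_from_options is hypothetical and this is not the util-linux code. Only MS_I_VERSION itself comes from the system headers, and the fallback define assumes the usual bit 23 value.

```c
#include <string.h>
#include <sys/mount.h>

#ifndef MS_I_VERSION
#define MS_I_VERSION (1UL << 23)	/* assumed value, as in the kernel UAPI headers */
#endif

/*
 * Hypothetical helper: walk a comma-separated option string and build
 * mount(2) flag bits. Only "iversion" and "noiversion" are recognised;
 * every other token is skipped, as if it were left for the filesystem.
 */
static unsigned long flags_from_options(const char *opts)
{
	unsigned long flags = 0;
	const char *p = opts;

	while (*p) {
		size_t len = strcspn(p, ",");	/* length of the current token */

		if (len == 8 && strncmp(p, "iversion", 8) == 0)
			flags |= MS_I_VERSION;
		else if (len == 10 && strncmp(p, "noiversion", 10) == 0)
			flags &= ~(unsigned long)MS_I_VERSION;

		p += len;
		if (*p == ',')
			p++;			/* step over the separator */
	}
	return flags;
}
```

Note that the ext4-specific "i_version" spelling is not touched by a helper like this: that string would be passed down in the mount data, which is where Opt_i_version used to pick it up.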
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 84c0eb55071d..c785c0b72116 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -5411,7 +5411,7 @@ int ext4_setattr(struct user_namespace *mnt_userns, struct dentry *dentry,
 			return -EINVAL;
 		}

-		if (IS_I_VERSION(inode) && attr->ia_size != inode->i_size)
+		if (attr->ia_size != inode->i_size)
 			inode_inc_iversion(inode);

 		if (shrink) {
@@ -5717,8 +5717,7 @@ int ext4_mark_iloc_dirty(handle_t *handle,
 	}
 	ext4_fc_track_inode(handle, inode);

-	if (IS_I_VERSION(inode))
-		inode_inc_iversion(inode);
+	inode_inc_iversion(inode);

 	/* the do_update_inode consumes one bh->b_count */
 	get_bh(iloc->bh);
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 845f2f8aee5f..4b06f394d7d1 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -1585,7 +1585,7 @@ enum {
 	Opt_inlinecrypt,
 	Opt_usrjquota, Opt_grpjquota, Opt_quota,
 	Opt_noquota, Opt_barrier, Opt_nobarrier, Opt_err,
-	Opt_usrquota, Opt_grpquota, Opt_prjquota, Opt_i_version,
+	Opt_usrquota, Opt_grpquota, Opt_prjquota,
 	Opt_dax, Opt_dax_always, Opt_dax_inode, Opt_dax_never,
 	Opt_stripe, Opt_delalloc, Opt_nodelalloc, Opt_warn_on_error,
 	Opt_nowarn_on_error, Opt_mblk_io_submit, Opt_debug_want_extra_isize,
@@ -1694,7 +1694,6 @@ static const struct fs_parameter_spec ext4_param_specs[] = {
 	fsparam_flag	("barrier",		Opt_barrier),
 	fsparam_u32	("barrier",		Opt_barrier),
 	fsparam_flag	("nobarrier",		Opt_nobarrier),
-	fsparam_flag	("i_version",		Opt_i_version),
 	fsparam_flag	("dax",			Opt_dax),
 	fsparam_enum	("dax",			Opt_dax_type, ext4_param_dax),
 	fsparam_u32	("stripe",		Opt_stripe),
@@ -2140,11 +2139,6 @@ static int ext4_parse_param(struct fs_context *fc, struct fs_parameter *param)
 	case Opt_abort:
 		ctx_set_mount_flag(ctx, EXT4_MF_FS_ABORTED);
 		return 0;
-	case Opt_i_version:
-		ext4_msg(NULL, KERN_WARNING, deprecated_msg, param->key, "5.20");
-		ext4_msg(NULL, KERN_WARNING, "Use iversion instead\n");
-		ctx_set_flags(ctx, SB_I_VERSION);
-		return 0;
 	case Opt_inlinecrypt:
 #ifdef CONFIG_FS_ENCRYPTION_INLINE_CRYPT
 		ctx_set_flags(ctx, SB_INLINECRYPT);
@@ -2970,8 +2964,6 @@ static int _ext4_show_options(struct seq_file *seq, struct super_block *sb,
 		SEQ_OPTS_PRINT("min_batch_time=%u", sbi->s_min_batch_time);
 	if (nodefs || sbi->s_max_batch_time != EXT4_DEF_MAX_BATCH_TIME)
 		SEQ_OPTS_PRINT("max_batch_time=%u", sbi->s_max_batch_time);
-	if (sb->s_flags & SB_I_VERSION)
-		SEQ_OPTS_PUTS("i_version");
 	if (nodefs || sbi->s_stripe)
 		SEQ_OPTS_PRINT("stripe=%lu", sbi->s_stripe);
 	if (nodefs || EXT4_MOUNT_DATA_FLAGS &
@@ -4630,6 +4622,9 @@ static int __ext4_fill_super(struct fs_context *fc, struct super_block *sb)
 	sb->s_flags = (sb->s_flags & ~SB_POSIXACL) |
 		(test_opt(sb, POSIX_ACL) ? SB_POSIXACL : 0);

+	/* i_version is always enabled now */
+	sb->s_flags |= SB_I_VERSION;
+
 	if (le32_to_cpu(es->s_rev_level) == EXT4_GOOD_OLD_REV &&
 	    (ext4_has_compat_features(sb) ||
 	     ext4_has_ro_compat_features(sb) ||
The original i_version implementation was pretty expensive, requiring a
log flush on every change. Because of this, it was gated behind a mount
option (implemented via the MS_I_VERSION mount option flag).

Commit ae5e165d855d (fs: new API for handling inode->i_version) made the
i_version flag much less expensive, so there is no longer a performance
penalty from enabling it. xfs and btrfs already enable it
unconditionally when the on-disk format can support it.

Have ext4 ignore the SB_I_VERSION flag, and just enable it
unconditionally. While we're in here, remove the handling of
Opt_i_version as well, since we're almost to 5.20 anyway.

Ideally, we'd couple this change with a way to disable the i_version
counter (just in case), but the way the iversion mount option was
implemented makes that difficult to do. We'd need to add a new mount
option altogether or do something with tune2fs. That's probably best
left to later patches if it turns out to be needed.

Cc: Dave Chinner <david@fromorbit.com>
Cc: Lukas Czerner <lczerner@redhat.com>
Cc: Benjamin Coddington <bcodding@redhat.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Darrick J. Wong <djwong@kernel.org>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/ext4/inode.c |  5 ++---
 fs/ext4/super.c | 13 ++++---------
 2 files changed, 6 insertions(+), 12 deletions(-)
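For readers wondering why the counter got cheaper, the following is a simplified user-space model of the idea behind the reworked API, as I understand it from include/linux/iversion.h: the low bit of the counter records whether anyone has queried it, so a non-forced update can be skipped when nobody has looked since the last bump. The model_* names and struct model_inode are hypothetical stand-ins, C11 atomics replace the kernel's atomic64 helpers, and this is a sketch rather than the kernel code.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define I_VERSION_QUERIED	1ULL	/* low bit: counter was read since last bump */
#define I_VERSION_INCREMENT	2ULL	/* the counter proper lives in bits 1..63 */

/* Hypothetical stand-in for the one field of struct inode that matters here. */
struct model_inode {
	_Atomic uint64_t i_version;
};

/*
 * Bump the counter only when forced, or when it has been queried since
 * the last bump. Returns true if the stored value changed.
 */
static bool model_maybe_inc_iversion(struct model_inode *inode, bool force)
{
	uint64_t cur = atomic_load(&inode->i_version);
	uint64_t new;

	do {
		if (!force && !(cur & I_VERSION_QUERIED))
			return false;	/* nobody looked, nothing to do */
		new = (cur & ~I_VERSION_QUERIED) + I_VERSION_INCREMENT;
	} while (!atomic_compare_exchange_weak(&inode->i_version, &cur, new));

	return true;
}

/* Read the counter and mark it queried, so the next change is recorded. */
static uint64_t model_query_iversion(struct model_inode *inode)
{
	uint64_t cur = atomic_load(&inode->i_version);

	/* On CAS failure, cur is reloaded and the flag is re-checked. */
	while (!(cur & I_VERSION_QUERIED) &&
	       !atomic_compare_exchange_weak(&inode->i_version, &cur,
					      cur | I_VERSION_QUERIED))
		;

	return cur >> 1;	/* drop the flag bit */
}
```

In this model a forced bump (force == true) always changes the value, while the non-forced variant is a no-op until a reader such as an NFS server has queried the counter.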