Message ID | 20191120050024.11161-3-riteshh@linux.ibm.com |
---|---|
State | New |
Headers | show |
Series | ext4: Introducing ilock wrapper APIs & fixing i_rwsem scalablity prob. in DIO mixed-rw | expand |
On Wed, Nov 20, 2019 at 10:30:22AM +0530, Ritesh Harjani wrote: > This adds ext4_ilock/iunlock types of APIs. > This is the preparation APIs to make shared > locking/unlocking & restarting with exclusive > locking/unlocking easier in next patch. *scratches head* A nit, but what's with the changelog wrapping at like ~40 characters? > +#define EXT4_IOLOCK_EXCL (1 << 0) > +#define EXT4_IOLOCK_SHARED (1 << 1) > > +static inline void ext4_ilock(struct inode *inode, unsigned int iolock) > +{ > + if (iolock == EXT4_IOLOCK_EXCL) > + inode_lock(inode); > + else > + inode_lock_shared(inode); > +} > + > +static inline void ext4_iunlock(struct inode *inode, unsigned int iolock) > +{ > + if (iolock == EXT4_IOLOCK_EXCL) > + inode_unlock(inode); > + else > + inode_unlock_shared(inode); > +} > + > +static inline int ext4_ilock_nowait(struct inode *inode, unsigned int iolock) > +{ > + if (iolock == EXT4_IOLOCK_EXCL) > + return inode_trylock(inode); > + else > + return inode_trylock_shared(inode); > +} Is it really necessary for all these helpers to actually have the 'else' statement? Could we not just return/set whatever takes the 'else' branch directly from the end of these functions? I think it would be cleaner that way. /me doesn't really like the naming of these functions either. What's people's opinion on changing these for example: - ext4_inode_lock() - ext4_inode_unlock() Or, better yet, is there any reason why we've never actually considered naming such functions to have the verb precede the actual object that we're performing the operation on? In my opinion, it totally makes way more sense from a code readability standpoint and overall intent of the function. For example: - ext4_lock_inode() - ext4_unlock_inode() > +static inline void ext4_ilock_demote(struct inode *inode, unsigned int iolock) > +{ > + BUG_ON(iolock != EXT4_IOLOCK_EXCL); > + downgrade_write(&inode->i_rwsem); > +} > + Same principle would also apply here. On an ending note, I'm not really sure that I like the name of these macros. Like, for example, expand the macro 'EXT4_IOLOCK_EXCL' into plain english words as if you were reading it. This would translate to something like 'EXT4 INPUT/OUPUT LOCK EXCLUSIVE' or 'EXT4 IO LOCK EXCLUSIVE'. Just flipping the words around make a significant improvement for overall readability i.e. 'EXT4_EXCL_IOLOCK', which would expand out to 'EXT4 EXCLUSIVE IO LOCK'. Speaking of, is there any reason why we don't mention 'INODE' here seeing as though that's the object we're actually protecting by taking one of these locking mechanisms? /M
Hello Matthew, Thanks for the review. On 11/20/19 4:53 PM, Matthew Bobrowski wrote: > On Wed, Nov 20, 2019 at 10:30:22AM +0530, Ritesh Harjani wrote: >> This adds ext4_ilock/iunlock types of APIs. >> This is the preparation APIs to make shared >> locking/unlocking & restarting with exclusive >> locking/unlocking easier in next patch. > > *scratches head* > > A nit, but what's with the changelog wrapping at like ~40 characters? Yup will fix that next time. Thanks. > >> +#define EXT4_IOLOCK_EXCL (1 << 0) >> +#define EXT4_IOLOCK_SHARED (1 << 1) >> >> +static inline void ext4_ilock(struct inode *inode, unsigned int iolock) >> +{ >> + if (iolock == EXT4_IOLOCK_EXCL) >> + inode_lock(inode); >> + else >> + inode_lock_shared(inode); >> +} >> + >> +static inline void ext4_iunlock(struct inode *inode, unsigned int iolock) >> +{ >> + if (iolock == EXT4_IOLOCK_EXCL) >> + inode_unlock(inode); >> + else >> + inode_unlock_shared(inode); >> +} >> + >> +static inline int ext4_ilock_nowait(struct inode *inode, unsigned int iolock) >> +{ >> + if (iolock == EXT4_IOLOCK_EXCL) >> + return inode_trylock(inode); >> + else >> + return inode_trylock_shared(inode); >> +} > > Is it really necessary for all these helpers to actually have the > 'else' statement? Could we not just return/set whatever takes the > 'else' branch directly from the end of these functions? I think it > would be cleaner that way. Sure np. > > /me doesn't really like the naming of these functions either. :) difference of opinion. > > What's people's opinion on changing these for example: > - ext4_inode_lock() > - ext4_inode_unlock() > ext4_ilock/iunlock sounds better to me as it is short too. But if others have also have a strong opinion towards ext4_inode_lock/unlock() - I am ok with that. > Or, better yet, is there any reason why we've never actually > considered naming such functions to have the verb precede the actual > object that we're performing the operation on? In my opinion, it > totally makes way more sense from a code readability standpoint and > overall intent of the function. For example: > - ext4_lock_inode() > - ext4_unlock_inode() Not against your suggestion here. But in kernel I do see a preference towards object followed by a verb. At least in vfs I see functions like inode_lock()/unlock(). Plus I would not deny that this naming is also inspired from xfs_ilock()/iunlock API names. > >> +static inline void ext4_ilock_demote(struct inode *inode, unsigned int iolock) >> +{ >> + BUG_ON(iolock != EXT4_IOLOCK_EXCL); >> + downgrade_write(&inode->i_rwsem); >> +} >> + > > Same principle would also apply here. > > On an ending note, I'm not really sure that I like the name of these > macros. Like, for example, expand the macro 'EXT4_IOLOCK_EXCL' into > plain english words as if you were reading it. This would translate to > something like 'EXT4 INPUT/OUPUT LOCK EXCLUSIVE' or 'EXT4 IO LOCK > EXCLUSIVE'. Just flipping the words around make a significant > improvement for overall readability i.e. 'EXT4_EXCL_IOLOCK', which > would expand out to 'EXT4 EXCLUSIVE IO LOCK'. Speaking of, is there Ditto. Unless you and others have a strong objection, I would rather keep this as is :) > any reason why we don't mention 'INODE' here seeing as though that's > the object we're actually protecting by taking one of these locking > mechanisms? hmm, it was increasing the name of the macro if I do it that way. But that's ok. Is below macro name better? #define EXT4_INODE_IOLOCK_EXCL (1 << 0) #define EXT4_INODE_IOLOCK_SHARED (1 << 1) Thanks for the review!! -ritesh
On Wed 20-11-19 10:30:22, Ritesh Harjani wrote: > This adds ext4_ilock/iunlock types of APIs. > This is the preparation APIs to make shared > locking/unlocking & restarting with exclusive > locking/unlocking easier in next patch. > > Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com> I know XFS does it this way but I don't think we need the obsurity of additional locking helpers etc. just because of one place in ext4_dio_write_iter() that will use this. So I'd just drop this patch... Honza > --- > fs/ext4/ext4.h | 33 ++++++++++++++++++++++++++++++ > fs/ext4/extents.c | 16 +++++++-------- > fs/ext4/file.c | 52 +++++++++++++++++++++++------------------------ > fs/ext4/inode.c | 4 ++-- > fs/ext4/ioctl.c | 16 +++++++-------- > fs/ext4/super.c | 12 +++++------ > fs/ext4/xattr.c | 17 ++++++++-------- > 7 files changed, 92 insertions(+), 58 deletions(-) > > diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h > index 61987c106511..b4169a92e8d0 100644 > --- a/fs/ext4/ext4.h > +++ b/fs/ext4/ext4.h > @@ -2960,6 +2960,39 @@ do { \ > #define EXT4_FREECLUSTERS_WATERMARK 0 > #endif > > +#define EXT4_IOLOCK_EXCL (1 << 0) > +#define EXT4_IOLOCK_SHARED (1 << 1) > + > +static inline void ext4_ilock(struct inode *inode, unsigned int iolock) > +{ > + if (iolock == EXT4_IOLOCK_EXCL) > + inode_lock(inode); > + else > + inode_lock_shared(inode); > +} > + > +static inline void ext4_iunlock(struct inode *inode, unsigned int iolock) > +{ > + if (iolock == EXT4_IOLOCK_EXCL) > + inode_unlock(inode); > + else > + inode_unlock_shared(inode); > +} > + > +static inline int ext4_ilock_nowait(struct inode *inode, unsigned int iolock) > +{ > + if (iolock == EXT4_IOLOCK_EXCL) > + return inode_trylock(inode); > + else > + return inode_trylock_shared(inode); > +} > + > +static inline void ext4_ilock_demote(struct inode *inode, unsigned int iolock) > +{ > + BUG_ON(iolock != EXT4_IOLOCK_EXCL); > + downgrade_write(&inode->i_rwsem); > +} > + > /* Update i_disksize. Requires i_mutex to avoid races with truncate */ > static inline void ext4_update_i_disksize(struct inode *inode, loff_t newsize) > { > diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c > index 0e8708b77da6..08dd57558533 100644 > --- a/fs/ext4/extents.c > +++ b/fs/ext4/extents.c > @@ -4754,7 +4754,7 @@ static long ext4_zero_range(struct file *file, loff_t offset, > else > max_blocks -= lblk; > > - inode_lock(inode); > + ext4_ilock(inode, EXT4_IOLOCK_EXCL); > > /* > * Indirect files do not support unwritten extnets > @@ -4864,7 +4864,7 @@ static long ext4_zero_range(struct file *file, loff_t offset, > > ext4_journal_stop(handle); > out_mutex: > - inode_unlock(inode); > + ext4_iunlock(inode, EXT4_IOLOCK_EXCL); > return ret; > } > > @@ -4930,7 +4930,7 @@ long ext4_fallocate(struct file *file, int mode, loff_t offset, loff_t len) > if (mode & FALLOC_FL_KEEP_SIZE) > flags |= EXT4_GET_BLOCKS_KEEP_SIZE; > > - inode_lock(inode); > + ext4_ilock(inode, EXT4_IOLOCK_EXCL); > > /* > * We only support preallocation for extent-based files only > @@ -4961,7 +4961,7 @@ long ext4_fallocate(struct file *file, int mode, loff_t offset, loff_t len) > EXT4_I(inode)->i_sync_tid); > } > out: > - inode_unlock(inode); > + ext4_iunlock(inode, EXT4_IOLOCK_EXCL); > trace_ext4_fallocate_exit(inode, offset, max_blocks, ret); > return ret; > } > @@ -5509,7 +5509,7 @@ int ext4_collapse_range(struct inode *inode, loff_t offset, loff_t len) > return ret; > } > > - inode_lock(inode); > + ext4_ilock(inode, EXT4_IOLOCK_EXCL); > /* > * There is no need to overlap collapse range with EOF, in which case > * it is effectively a truncate operation > @@ -5608,7 +5608,7 @@ int ext4_collapse_range(struct inode *inode, loff_t offset, loff_t len) > out_mmap: > up_write(&EXT4_I(inode)->i_mmap_sem); > out_mutex: > - inode_unlock(inode); > + ext4_iunlock(inode, EXT4_IOLOCK_EXCL); > return ret; > } > > @@ -5659,7 +5659,7 @@ int ext4_insert_range(struct inode *inode, loff_t offset, loff_t len) > return ret; > } > > - inode_lock(inode); > + ext4_ilock(inode, EXT4_IOLOCK_EXCL); > /* Currently just for extent based files */ > if (!ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS)) { > ret = -EOPNOTSUPP; > @@ -5786,7 +5786,7 @@ int ext4_insert_range(struct inode *inode, loff_t offset, loff_t len) > out_mmap: > up_write(&EXT4_I(inode)->i_mmap_sem); > out_mutex: > - inode_unlock(inode); > + ext4_iunlock(inode, EXT4_IOLOCK_EXCL); > return ret; > } > > diff --git a/fs/ext4/file.c b/fs/ext4/file.c > index 977ac58dc718..ebe3f051598d 100644 > --- a/fs/ext4/file.c > +++ b/fs/ext4/file.c > @@ -55,14 +55,14 @@ static ssize_t ext4_dio_read_iter(struct kiocb *iocb, struct iov_iter *to) > struct inode *inode = file_inode(iocb->ki_filp); > > if (iocb->ki_flags & IOCB_NOWAIT) { > - if (!inode_trylock_shared(inode)) > + if (!ext4_ilock_nowait(inode, EXT4_IOLOCK_SHARED)) > return -EAGAIN; > } else { > - inode_lock_shared(inode); > + ext4_ilock(inode, EXT4_IOLOCK_SHARED); > } > > if (!ext4_dio_supported(inode)) { > - inode_unlock_shared(inode); > + ext4_iunlock(inode, EXT4_IOLOCK_SHARED); > /* > * Fallback to buffered I/O if the operation being performed on > * the inode is not supported by direct I/O. The IOCB_DIRECT > @@ -76,7 +76,7 @@ static ssize_t ext4_dio_read_iter(struct kiocb *iocb, struct iov_iter *to) > > ret = iomap_dio_rw(iocb, to, &ext4_iomap_ops, NULL, > is_sync_kiocb(iocb)); > - inode_unlock_shared(inode); > + ext4_iunlock(inode, EXT4_IOLOCK_SHARED); > > file_accessed(iocb->ki_filp); > return ret; > @@ -89,22 +89,23 @@ static ssize_t ext4_dax_read_iter(struct kiocb *iocb, struct iov_iter *to) > ssize_t ret; > > if (iocb->ki_flags & IOCB_NOWAIT) { > - if (!inode_trylock_shared(inode)) > + if (!ext4_ilock_nowait(inode, EXT4_IOLOCK_SHARED)) > return -EAGAIN; > } else { > - inode_lock_shared(inode); > + ext4_ilock(inode, EXT4_IOLOCK_SHARED); > } > + > /* > * Recheck under inode lock - at this point we are sure it cannot > * change anymore > */ > if (!IS_DAX(inode)) { > - inode_unlock_shared(inode); > + ext4_iunlock(inode, EXT4_IOLOCK_SHARED); > /* Fallback to buffered IO in case we cannot support DAX */ > return generic_file_read_iter(iocb, to); > } > ret = dax_iomap_rw(iocb, to, &ext4_iomap_ops); > - inode_unlock_shared(inode); > + ext4_iunlock(inode, EXT4_IOLOCK_SHARED); > > file_accessed(iocb->ki_filp); > return ret; > @@ -244,7 +245,7 @@ static ssize_t ext4_buffered_write_iter(struct kiocb *iocb, > if (iocb->ki_flags & IOCB_NOWAIT) > return -EOPNOTSUPP; > > - inode_lock(inode); > + ext4_ilock(inode, EXT4_IOLOCK_EXCL); > ret = ext4_write_checks(iocb, from); > if (ret <= 0) > goto out; > @@ -254,7 +255,7 @@ static ssize_t ext4_buffered_write_iter(struct kiocb *iocb, > current->backing_dev_info = NULL; > > out: > - inode_unlock(inode); > + ext4_iunlock(inode, EXT4_IOLOCK_EXCL); > if (likely(ret > 0)) { > iocb->ki_pos += ret; > ret = generic_write_sync(iocb, ret); > @@ -372,16 +373,17 @@ static ssize_t ext4_dio_write_iter(struct kiocb *iocb, struct iov_iter *from) > handle_t *handle; > struct inode *inode = file_inode(iocb->ki_filp); > bool extend = false, overwrite = false, unaligned_aio = false; > + unsigned int iolock = EXT4_IOLOCK_EXCL; > > if (iocb->ki_flags & IOCB_NOWAIT) { > - if (!inode_trylock(inode)) > + if (!ext4_ilock_nowait(inode, iolock)) > return -EAGAIN; > } else { > - inode_lock(inode); > + ext4_ilock(inode, iolock); > } > > if (!ext4_dio_supported(inode)) { > - inode_unlock(inode); > + ext4_iunlock(inode, iolock); > /* > * Fallback to buffered I/O if the inode does not support > * direct I/O. > @@ -391,7 +393,7 @@ static ssize_t ext4_dio_write_iter(struct kiocb *iocb, struct iov_iter *from) > > ret = ext4_write_checks(iocb, from); > if (ret <= 0) { > - inode_unlock(inode); > + ext4_iunlock(inode, iolock); > return ret; > } > > @@ -416,7 +418,8 @@ static ssize_t ext4_dio_write_iter(struct kiocb *iocb, struct iov_iter *from) > if (!unaligned_aio && ext4_overwrite_io(inode, offset, count) && > ext4_should_dioread_nolock(inode)) { > overwrite = true; > - downgrade_write(&inode->i_rwsem); > + ext4_ilock_demote(inode, iolock); > + iolock = EXT4_IOLOCK_SHARED; > } > > if (offset + count > EXT4_I(inode)->i_disksize) { > @@ -443,10 +446,7 @@ static ssize_t ext4_dio_write_iter(struct kiocb *iocb, struct iov_iter *from) > ret = ext4_handle_inode_extension(inode, offset, ret, count); > > out: > - if (overwrite) > - inode_unlock_shared(inode); > - else > - inode_unlock(inode); > + ext4_iunlock(inode, iolock); > > if (ret >= 0 && iov_iter_count(from)) { > ssize_t err; > @@ -489,10 +489,10 @@ ext4_dax_write_iter(struct kiocb *iocb, struct iov_iter *from) > struct inode *inode = file_inode(iocb->ki_filp); > > if (iocb->ki_flags & IOCB_NOWAIT) { > - if (!inode_trylock(inode)) > + if (!ext4_ilock_nowait(inode, EXT4_IOLOCK_EXCL)) > return -EAGAIN; > } else { > - inode_lock(inode); > + ext4_ilock(inode, EXT4_IOLOCK_EXCL); > } > > ret = ext4_write_checks(iocb, from); > @@ -524,7 +524,7 @@ ext4_dax_write_iter(struct kiocb *iocb, struct iov_iter *from) > if (extend) > ret = ext4_handle_inode_extension(inode, offset, ret, count); > out: > - inode_unlock(inode); > + ext4_iunlock(inode, EXT4_IOLOCK_EXCL); > if (ret > 0) > ret = generic_write_sync(iocb, ret); > return ret; > @@ -757,16 +757,16 @@ loff_t ext4_llseek(struct file *file, loff_t offset, int whence) > return generic_file_llseek_size(file, offset, whence, > maxbytes, i_size_read(inode)); > case SEEK_HOLE: > - inode_lock_shared(inode); > + ext4_ilock(inode, EXT4_IOLOCK_SHARED); > offset = iomap_seek_hole(inode, offset, > &ext4_iomap_report_ops); > - inode_unlock_shared(inode); > + ext4_iunlock(inode, EXT4_IOLOCK_SHARED); > break; > case SEEK_DATA: > - inode_lock_shared(inode); > + ext4_ilock(inode, EXT4_IOLOCK_SHARED); > offset = iomap_seek_data(inode, offset, > &ext4_iomap_report_ops); > - inode_unlock_shared(inode); > + ext4_iunlock(inode, EXT4_IOLOCK_SHARED); > break; > } > > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c > index 381813205f99..39dcc22667a1 100644 > --- a/fs/ext4/inode.c > +++ b/fs/ext4/inode.c > @@ -3930,7 +3930,7 @@ int ext4_punch_hole(struct inode *inode, loff_t offset, loff_t length) > return ret; > } > > - inode_lock(inode); > + ext4_ilock(inode, EXT4_IOLOCK_EXCL); > > /* No need to punch hole beyond i_size */ > if (offset >= inode->i_size) > @@ -4037,7 +4037,7 @@ int ext4_punch_hole(struct inode *inode, loff_t offset, loff_t length) > out_dio: > up_write(&EXT4_I(inode)->i_mmap_sem); > out_mutex: > - inode_unlock(inode); > + ext4_iunlock(inode, EXT4_IOLOCK_EXCL); > return ret; > } > > diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c > index 0b7f316fd30f..43b7a23dc57b 100644 > --- a/fs/ext4/ioctl.c > +++ b/fs/ext4/ioctl.c > @@ -855,13 +855,13 @@ long ext4_ioctl(struct file *filp, unsigned int cmd, unsigned long arg) > if (err) > return err; > > - inode_lock(inode); > + ext4_ilock(inode, EXT4_IOLOCK_EXCL); > err = ext4_ioctl_check_immutable(inode, > from_kprojid(&init_user_ns, ei->i_projid), > flags); > if (!err) > err = ext4_ioctl_setflags(inode, flags); > - inode_unlock(inode); > + ext4_iunlock(inode, EXT4_IOLOCK_EXCL); > mnt_drop_write_file(filp); > return err; > } > @@ -892,7 +892,7 @@ long ext4_ioctl(struct file *filp, unsigned int cmd, unsigned long arg) > goto setversion_out; > } > > - inode_lock(inode); > + ext4_ilock(inode, EXT4_IOLOCK_EXCL); > handle = ext4_journal_start(inode, EXT4_HT_INODE, 1); > if (IS_ERR(handle)) { > err = PTR_ERR(handle); > @@ -907,7 +907,7 @@ long ext4_ioctl(struct file *filp, unsigned int cmd, unsigned long arg) > ext4_journal_stop(handle); > > unlock_out: > - inode_unlock(inode); > + ext4_iunlock(inode, EXT4_IOLOCK_EXCL); > setversion_out: > mnt_drop_write_file(filp); > return err; > @@ -1026,9 +1026,9 @@ long ext4_ioctl(struct file *filp, unsigned int cmd, unsigned long arg) > * ext4_ext_swap_inode_data before we switch the > * inode format to prevent read. > */ > - inode_lock((inode)); > + ext4_ilock(inode, EXT4_IOLOCK_EXCL); > err = ext4_ext_migrate(inode); > - inode_unlock((inode)); > + ext4_iunlock(inode, EXT4_IOLOCK_EXCL); > mnt_drop_write_file(filp); > return err; > } > @@ -1272,7 +1272,7 @@ long ext4_ioctl(struct file *filp, unsigned int cmd, unsigned long arg) > if (err) > return err; > > - inode_lock(inode); > + ext4_ilock(inode, EXT4_IOLOCK_EXCL); > ext4_fill_fsxattr(inode, &old_fa); > err = vfs_ioc_fssetxattr_check(inode, &old_fa, &fa); > if (err) > @@ -1287,7 +1287,7 @@ long ext4_ioctl(struct file *filp, unsigned int cmd, unsigned long arg) > goto out; > err = ext4_ioctl_setproject(filp, fa.fsx_projid); > out: > - inode_unlock(inode); > + ext4_iunlock(inode, EXT4_IOLOCK_EXCL); > mnt_drop_write_file(filp); > return err; > } > diff --git a/fs/ext4/super.c b/fs/ext4/super.c > index 7796e2ffc294..48b83b2cf0ad 100644 > --- a/fs/ext4/super.c > +++ b/fs/ext4/super.c > @@ -2682,12 +2682,12 @@ static void ext4_orphan_cleanup(struct super_block *sb, > __func__, inode->i_ino, inode->i_size); > jbd_debug(2, "truncating inode %lu to %lld bytes\n", > inode->i_ino, inode->i_size); > - inode_lock(inode); > + ext4_ilock(inode, EXT4_IOLOCK_EXCL); > truncate_inode_pages(inode->i_mapping, inode->i_size); > ret = ext4_truncate(inode); > if (ret) > ext4_std_error(inode->i_sb, ret); > - inode_unlock(inode); > + ext4_iunlock(inode, EXT4_IOLOCK_EXCL); > nr_truncates++; > } else { > if (test_opt(sb, DEBUG)) > @@ -5785,7 +5785,7 @@ static int ext4_quota_on(struct super_block *sb, int type, int format_id, > * files. If this fails, we return success anyway since quotas > * are already enabled and this is not a hard failure. > */ > - inode_lock(inode); > + ext4_ilock(inode, EXT4_IOLOCK_EXCL); > handle = ext4_journal_start(inode, EXT4_HT_QUOTA, 1); > if (IS_ERR(handle)) > goto unlock_inode; > @@ -5795,7 +5795,7 @@ static int ext4_quota_on(struct super_block *sb, int type, int format_id, > ext4_mark_inode_dirty(handle, inode); > ext4_journal_stop(handle); > unlock_inode: > - inode_unlock(inode); > + ext4_iunlock(inode, EXT4_IOLOCK_EXCL); > } > return err; > } > @@ -5887,7 +5887,7 @@ static int ext4_quota_off(struct super_block *sb, int type) > if (err || ext4_has_feature_quota(sb)) > goto out_put; > > - inode_lock(inode); > + ext4_ilock(inode, EXT4_IOLOCK_EXCL); > /* > * Update modification times of quota files when userspace can > * start looking at them. If we fail, we return success anyway since > @@ -5902,7 +5902,7 @@ static int ext4_quota_off(struct super_block *sb, int type) > ext4_mark_inode_dirty(handle, inode); > ext4_journal_stop(handle); > out_unlock: > - inode_unlock(inode); > + ext4_iunlock(inode, EXT4_IOLOCK_EXCL); > out_put: > lockdep_set_quota_inode(inode, I_DATA_SEM_NORMAL); > iput(inode); > diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c > index 8966a5439a22..5c2dcc4c836a 100644 > --- a/fs/ext4/xattr.c > +++ b/fs/ext4/xattr.c > @@ -422,9 +422,9 @@ static int ext4_xattr_inode_iget(struct inode *parent, unsigned long ea_ino, > ext4_set_inode_state(inode, EXT4_STATE_LUSTRE_EA_INODE); > ext4_xattr_inode_set_ref(inode, 1); > } else { > - inode_lock(inode); > + ext4_ilock(inode, EXT4_IOLOCK_EXCL); > inode->i_flags |= S_NOQUOTA; > - inode_unlock(inode); > + ext4_iunlock(inode, EXT4_IOLOCK_EXCL); > } > > *ea_inode = inode; > @@ -976,7 +976,7 @@ static int ext4_xattr_inode_update_ref(handle_t *handle, struct inode *ea_inode, > u32 hash; > int ret; > > - inode_lock(ea_inode); > + ext4_ilock(ea_inode, EXT4_IOLOCK_EXCL); > > ret = ext4_reserve_inode_write(handle, ea_inode, &iloc); > if (ret) > @@ -1030,7 +1030,7 @@ static int ext4_xattr_inode_update_ref(handle_t *handle, struct inode *ea_inode, > ext4_warning_inode(ea_inode, > "ext4_mark_iloc_dirty() failed ret=%d", ret); > out: > - inode_unlock(ea_inode); > + ext4_iunlock(ea_inode, EXT4_IOLOCK_EXCL); > return ret; > } > > @@ -1380,10 +1380,11 @@ static int ext4_xattr_inode_write(handle_t *handle, struct inode *ea_inode, > block += 1; > } > > - inode_lock(ea_inode); > + ext4_ilock(ea_inode, EXT4_IOLOCK_EXCL); > i_size_write(ea_inode, wsize); > ext4_update_i_disksize(ea_inode, wsize); > - inode_unlock(ea_inode); > + ext4_iunlock(ea_inode, EXT4_IOLOCK_EXCL); > + > > ext4_mark_inode_dirty(handle, ea_inode); > > @@ -1432,9 +1433,9 @@ static struct inode *ext4_xattr_inode_create(handle_t *handle, > */ > dquot_free_inode(ea_inode); > dquot_drop(ea_inode); > - inode_lock(ea_inode); > + ext4_ilock(ea_inode, EXT4_IOLOCK_EXCL); > ea_inode->i_flags |= S_NOQUOTA; > - inode_unlock(ea_inode); > + ext4_iunlock(ea_inode, EXT4_IOLOCK_EXCL); > } > > return ea_inode; > -- > 2.21.0 >
On Wed, Nov 20, 2019 at 02:11:07PM +0100, Jan Kara wrote: > On Wed 20-11-19 10:30:22, Ritesh Harjani wrote: > > This adds ext4_ilock/iunlock types of APIs. > > This is the preparation APIs to make shared > > locking/unlocking & restarting with exclusive > > locking/unlocking easier in next patch. > > > > Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com> > > I know XFS does it this way but I don't think we need the obsurity of > additional locking helpers etc. just because of one place in > ext4_dio_write_iter() that will use this. So I'd just drop this patch... FWIW, if you were to add tracepoints to these inode lock and unlock helpers, then the helpers would have the additional value that you could use ftrace + script to diagnose inode deadlocking issues and the like. XFS has such scripts in the xfsprogs source code and it's really nice to have an automated system to tell you which thread forgot to let go of something, and when. --D > Honza > > > --- > > fs/ext4/ext4.h | 33 ++++++++++++++++++++++++++++++ > > fs/ext4/extents.c | 16 +++++++-------- > > fs/ext4/file.c | 52 +++++++++++++++++++++++------------------------ > > fs/ext4/inode.c | 4 ++-- > > fs/ext4/ioctl.c | 16 +++++++-------- > > fs/ext4/super.c | 12 +++++------ > > fs/ext4/xattr.c | 17 ++++++++-------- > > 7 files changed, 92 insertions(+), 58 deletions(-) > > > > diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h > > index 61987c106511..b4169a92e8d0 100644 > > --- a/fs/ext4/ext4.h > > +++ b/fs/ext4/ext4.h > > @@ -2960,6 +2960,39 @@ do { \ > > #define EXT4_FREECLUSTERS_WATERMARK 0 > > #endif > > > > +#define EXT4_IOLOCK_EXCL (1 << 0) > > +#define EXT4_IOLOCK_SHARED (1 << 1) > > + > > +static inline void ext4_ilock(struct inode *inode, unsigned int iolock) > > +{ > > + if (iolock == EXT4_IOLOCK_EXCL) > > + inode_lock(inode); > > + else > > + inode_lock_shared(inode); > > +} > > + > > +static inline void ext4_iunlock(struct inode *inode, unsigned int iolock) > > +{ > > + if (iolock == EXT4_IOLOCK_EXCL) > > + inode_unlock(inode); > > + else > > + inode_unlock_shared(inode); > > +} > > + > > +static inline int ext4_ilock_nowait(struct inode *inode, unsigned int iolock) > > +{ > > + if (iolock == EXT4_IOLOCK_EXCL) > > + return inode_trylock(inode); > > + else > > + return inode_trylock_shared(inode); > > +} > > + > > +static inline void ext4_ilock_demote(struct inode *inode, unsigned int iolock) > > +{ > > + BUG_ON(iolock != EXT4_IOLOCK_EXCL); > > + downgrade_write(&inode->i_rwsem); > > +} > > + > > /* Update i_disksize. Requires i_mutex to avoid races with truncate */ > > static inline void ext4_update_i_disksize(struct inode *inode, loff_t newsize) > > { > > diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c > > index 0e8708b77da6..08dd57558533 100644 > > --- a/fs/ext4/extents.c > > +++ b/fs/ext4/extents.c > > @@ -4754,7 +4754,7 @@ static long ext4_zero_range(struct file *file, loff_t offset, > > else > > max_blocks -= lblk; > > > > - inode_lock(inode); > > + ext4_ilock(inode, EXT4_IOLOCK_EXCL); > > > > /* > > * Indirect files do not support unwritten extnets > > @@ -4864,7 +4864,7 @@ static long ext4_zero_range(struct file *file, loff_t offset, > > > > ext4_journal_stop(handle); > > out_mutex: > > - inode_unlock(inode); > > + ext4_iunlock(inode, EXT4_IOLOCK_EXCL); > > return ret; > > } > > > > @@ -4930,7 +4930,7 @@ long ext4_fallocate(struct file *file, int mode, loff_t offset, loff_t len) > > if (mode & FALLOC_FL_KEEP_SIZE) > > flags |= EXT4_GET_BLOCKS_KEEP_SIZE; > > > > - inode_lock(inode); > > + ext4_ilock(inode, EXT4_IOLOCK_EXCL); > > > > /* > > * We only support preallocation for extent-based files only > > @@ -4961,7 +4961,7 @@ long ext4_fallocate(struct file *file, int mode, loff_t offset, loff_t len) > > EXT4_I(inode)->i_sync_tid); > > } > > out: > > - inode_unlock(inode); > > + ext4_iunlock(inode, EXT4_IOLOCK_EXCL); > > trace_ext4_fallocate_exit(inode, offset, max_blocks, ret); > > return ret; > > } > > @@ -5509,7 +5509,7 @@ int ext4_collapse_range(struct inode *inode, loff_t offset, loff_t len) > > return ret; > > } > > > > - inode_lock(inode); > > + ext4_ilock(inode, EXT4_IOLOCK_EXCL); > > /* > > * There is no need to overlap collapse range with EOF, in which case > > * it is effectively a truncate operation > > @@ -5608,7 +5608,7 @@ int ext4_collapse_range(struct inode *inode, loff_t offset, loff_t len) > > out_mmap: > > up_write(&EXT4_I(inode)->i_mmap_sem); > > out_mutex: > > - inode_unlock(inode); > > + ext4_iunlock(inode, EXT4_IOLOCK_EXCL); > > return ret; > > } > > > > @@ -5659,7 +5659,7 @@ int ext4_insert_range(struct inode *inode, loff_t offset, loff_t len) > > return ret; > > } > > > > - inode_lock(inode); > > + ext4_ilock(inode, EXT4_IOLOCK_EXCL); > > /* Currently just for extent based files */ > > if (!ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS)) { > > ret = -EOPNOTSUPP; > > @@ -5786,7 +5786,7 @@ int ext4_insert_range(struct inode *inode, loff_t offset, loff_t len) > > out_mmap: > > up_write(&EXT4_I(inode)->i_mmap_sem); > > out_mutex: > > - inode_unlock(inode); > > + ext4_iunlock(inode, EXT4_IOLOCK_EXCL); > > return ret; > > } > > > > diff --git a/fs/ext4/file.c b/fs/ext4/file.c > > index 977ac58dc718..ebe3f051598d 100644 > > --- a/fs/ext4/file.c > > +++ b/fs/ext4/file.c > > @@ -55,14 +55,14 @@ static ssize_t ext4_dio_read_iter(struct kiocb *iocb, struct iov_iter *to) > > struct inode *inode = file_inode(iocb->ki_filp); > > > > if (iocb->ki_flags & IOCB_NOWAIT) { > > - if (!inode_trylock_shared(inode)) > > + if (!ext4_ilock_nowait(inode, EXT4_IOLOCK_SHARED)) > > return -EAGAIN; > > } else { > > - inode_lock_shared(inode); > > + ext4_ilock(inode, EXT4_IOLOCK_SHARED); > > } > > > > if (!ext4_dio_supported(inode)) { > > - inode_unlock_shared(inode); > > + ext4_iunlock(inode, EXT4_IOLOCK_SHARED); > > /* > > * Fallback to buffered I/O if the operation being performed on > > * the inode is not supported by direct I/O. The IOCB_DIRECT > > @@ -76,7 +76,7 @@ static ssize_t ext4_dio_read_iter(struct kiocb *iocb, struct iov_iter *to) > > > > ret = iomap_dio_rw(iocb, to, &ext4_iomap_ops, NULL, > > is_sync_kiocb(iocb)); > > - inode_unlock_shared(inode); > > + ext4_iunlock(inode, EXT4_IOLOCK_SHARED); > > > > file_accessed(iocb->ki_filp); > > return ret; > > @@ -89,22 +89,23 @@ static ssize_t ext4_dax_read_iter(struct kiocb *iocb, struct iov_iter *to) > > ssize_t ret; > > > > if (iocb->ki_flags & IOCB_NOWAIT) { > > - if (!inode_trylock_shared(inode)) > > + if (!ext4_ilock_nowait(inode, EXT4_IOLOCK_SHARED)) > > return -EAGAIN; > > } else { > > - inode_lock_shared(inode); > > + ext4_ilock(inode, EXT4_IOLOCK_SHARED); > > } > > + > > /* > > * Recheck under inode lock - at this point we are sure it cannot > > * change anymore > > */ > > if (!IS_DAX(inode)) { > > - inode_unlock_shared(inode); > > + ext4_iunlock(inode, EXT4_IOLOCK_SHARED); > > /* Fallback to buffered IO in case we cannot support DAX */ > > return generic_file_read_iter(iocb, to); > > } > > ret = dax_iomap_rw(iocb, to, &ext4_iomap_ops); > > - inode_unlock_shared(inode); > > + ext4_iunlock(inode, EXT4_IOLOCK_SHARED); > > > > file_accessed(iocb->ki_filp); > > return ret; > > @@ -244,7 +245,7 @@ static ssize_t ext4_buffered_write_iter(struct kiocb *iocb, > > if (iocb->ki_flags & IOCB_NOWAIT) > > return -EOPNOTSUPP; > > > > - inode_lock(inode); > > + ext4_ilock(inode, EXT4_IOLOCK_EXCL); > > ret = ext4_write_checks(iocb, from); > > if (ret <= 0) > > goto out; > > @@ -254,7 +255,7 @@ static ssize_t ext4_buffered_write_iter(struct kiocb *iocb, > > current->backing_dev_info = NULL; > > > > out: > > - inode_unlock(inode); > > + ext4_iunlock(inode, EXT4_IOLOCK_EXCL); > > if (likely(ret > 0)) { > > iocb->ki_pos += ret; > > ret = generic_write_sync(iocb, ret); > > @@ -372,16 +373,17 @@ static ssize_t ext4_dio_write_iter(struct kiocb *iocb, struct iov_iter *from) > > handle_t *handle; > > struct inode *inode = file_inode(iocb->ki_filp); > > bool extend = false, overwrite = false, unaligned_aio = false; > > + unsigned int iolock = EXT4_IOLOCK_EXCL; > > > > if (iocb->ki_flags & IOCB_NOWAIT) { > > - if (!inode_trylock(inode)) > > + if (!ext4_ilock_nowait(inode, iolock)) > > return -EAGAIN; > > } else { > > - inode_lock(inode); > > + ext4_ilock(inode, iolock); > > } > > > > if (!ext4_dio_supported(inode)) { > > - inode_unlock(inode); > > + ext4_iunlock(inode, iolock); > > /* > > * Fallback to buffered I/O if the inode does not support > > * direct I/O. > > @@ -391,7 +393,7 @@ static ssize_t ext4_dio_write_iter(struct kiocb *iocb, struct iov_iter *from) > > > > ret = ext4_write_checks(iocb, from); > > if (ret <= 0) { > > - inode_unlock(inode); > > + ext4_iunlock(inode, iolock); > > return ret; > > } > > > > @@ -416,7 +418,8 @@ static ssize_t ext4_dio_write_iter(struct kiocb *iocb, struct iov_iter *from) > > if (!unaligned_aio && ext4_overwrite_io(inode, offset, count) && > > ext4_should_dioread_nolock(inode)) { > > overwrite = true; > > - downgrade_write(&inode->i_rwsem); > > + ext4_ilock_demote(inode, iolock); > > + iolock = EXT4_IOLOCK_SHARED; > > } > > > > if (offset + count > EXT4_I(inode)->i_disksize) { > > @@ -443,10 +446,7 @@ static ssize_t ext4_dio_write_iter(struct kiocb *iocb, struct iov_iter *from) > > ret = ext4_handle_inode_extension(inode, offset, ret, count); > > > > out: > > - if (overwrite) > > - inode_unlock_shared(inode); > > - else > > - inode_unlock(inode); > > + ext4_iunlock(inode, iolock); > > > > if (ret >= 0 && iov_iter_count(from)) { > > ssize_t err; > > @@ -489,10 +489,10 @@ ext4_dax_write_iter(struct kiocb *iocb, struct iov_iter *from) > > struct inode *inode = file_inode(iocb->ki_filp); > > > > if (iocb->ki_flags & IOCB_NOWAIT) { > > - if (!inode_trylock(inode)) > > + if (!ext4_ilock_nowait(inode, EXT4_IOLOCK_EXCL)) > > return -EAGAIN; > > } else { > > - inode_lock(inode); > > + ext4_ilock(inode, EXT4_IOLOCK_EXCL); > > } > > > > ret = ext4_write_checks(iocb, from); > > @@ -524,7 +524,7 @@ ext4_dax_write_iter(struct kiocb *iocb, struct iov_iter *from) > > if (extend) > > ret = ext4_handle_inode_extension(inode, offset, ret, count); > > out: > > - inode_unlock(inode); > > + ext4_iunlock(inode, EXT4_IOLOCK_EXCL); > > if (ret > 0) > > ret = generic_write_sync(iocb, ret); > > return ret; > > @@ -757,16 +757,16 @@ loff_t ext4_llseek(struct file *file, loff_t offset, int whence) > > return generic_file_llseek_size(file, offset, whence, > > maxbytes, i_size_read(inode)); > > case SEEK_HOLE: > > - inode_lock_shared(inode); > > + ext4_ilock(inode, EXT4_IOLOCK_SHARED); > > offset = iomap_seek_hole(inode, offset, > > &ext4_iomap_report_ops); > > - inode_unlock_shared(inode); > > + ext4_iunlock(inode, EXT4_IOLOCK_SHARED); > > break; > > case SEEK_DATA: > > - inode_lock_shared(inode); > > + ext4_ilock(inode, EXT4_IOLOCK_SHARED); > > offset = iomap_seek_data(inode, offset, > > &ext4_iomap_report_ops); > > - inode_unlock_shared(inode); > > + ext4_iunlock(inode, EXT4_IOLOCK_SHARED); > > break; > > } > > > > diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c > > index 381813205f99..39dcc22667a1 100644 > > --- a/fs/ext4/inode.c > > +++ b/fs/ext4/inode.c > > @@ -3930,7 +3930,7 @@ int ext4_punch_hole(struct inode *inode, loff_t offset, loff_t length) > > return ret; > > } > > > > - inode_lock(inode); > > + ext4_ilock(inode, EXT4_IOLOCK_EXCL); > > > > /* No need to punch hole beyond i_size */ > > if (offset >= inode->i_size) > > @@ -4037,7 +4037,7 @@ int ext4_punch_hole(struct inode *inode, loff_t offset, loff_t length) > > out_dio: > > up_write(&EXT4_I(inode)->i_mmap_sem); > > out_mutex: > > - inode_unlock(inode); > > + ext4_iunlock(inode, EXT4_IOLOCK_EXCL); > > return ret; > > } > > > > diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c > > index 0b7f316fd30f..43b7a23dc57b 100644 > > --- a/fs/ext4/ioctl.c > > +++ b/fs/ext4/ioctl.c > > @@ -855,13 +855,13 @@ long ext4_ioctl(struct file *filp, unsigned int cmd, unsigned long arg) > > if (err) > > return err; > > > > - inode_lock(inode); > > + ext4_ilock(inode, EXT4_IOLOCK_EXCL); > > err = ext4_ioctl_check_immutable(inode, > > from_kprojid(&init_user_ns, ei->i_projid), > > flags); > > if (!err) > > err = ext4_ioctl_setflags(inode, flags); > > - inode_unlock(inode); > > + ext4_iunlock(inode, EXT4_IOLOCK_EXCL); > > mnt_drop_write_file(filp); > > return err; > > } > > @@ -892,7 +892,7 @@ long ext4_ioctl(struct file *filp, unsigned int cmd, unsigned long arg) > > goto setversion_out; > > } > > > > - inode_lock(inode); > > + ext4_ilock(inode, EXT4_IOLOCK_EXCL); > > handle = ext4_journal_start(inode, EXT4_HT_INODE, 1); > > if (IS_ERR(handle)) { > > err = PTR_ERR(handle); > > @@ -907,7 +907,7 @@ long ext4_ioctl(struct file *filp, unsigned int cmd, unsigned long arg) > > ext4_journal_stop(handle); > > > > unlock_out: > > - inode_unlock(inode); > > + ext4_iunlock(inode, EXT4_IOLOCK_EXCL); > > setversion_out: > > mnt_drop_write_file(filp); > > return err; > > @@ -1026,9 +1026,9 @@ long ext4_ioctl(struct file *filp, unsigned int cmd, unsigned long arg) > > * ext4_ext_swap_inode_data before we switch the > > * inode format to prevent read. > > */ > > - inode_lock((inode)); > > + ext4_ilock(inode, EXT4_IOLOCK_EXCL); > > err = ext4_ext_migrate(inode); > > - inode_unlock((inode)); > > + ext4_iunlock(inode, EXT4_IOLOCK_EXCL); > > mnt_drop_write_file(filp); > > return err; > > } > > @@ -1272,7 +1272,7 @@ long ext4_ioctl(struct file *filp, unsigned int cmd, unsigned long arg) > > if (err) > > return err; > > > > - inode_lock(inode); > > + ext4_ilock(inode, EXT4_IOLOCK_EXCL); > > ext4_fill_fsxattr(inode, &old_fa); > > err = vfs_ioc_fssetxattr_check(inode, &old_fa, &fa); > > if (err) > > @@ -1287,7 +1287,7 @@ long ext4_ioctl(struct file *filp, unsigned int cmd, unsigned long arg) > > goto out; > > err = ext4_ioctl_setproject(filp, fa.fsx_projid); > > out: > > - inode_unlock(inode); > > + ext4_iunlock(inode, EXT4_IOLOCK_EXCL); > > mnt_drop_write_file(filp); > > return err; > > } > > diff --git a/fs/ext4/super.c b/fs/ext4/super.c > > index 7796e2ffc294..48b83b2cf0ad 100644 > > --- a/fs/ext4/super.c > > +++ b/fs/ext4/super.c > > @@ -2682,12 +2682,12 @@ static void ext4_orphan_cleanup(struct super_block *sb, > > __func__, inode->i_ino, inode->i_size); > > jbd_debug(2, "truncating inode %lu to %lld bytes\n", > > inode->i_ino, inode->i_size); > > - inode_lock(inode); > > + ext4_ilock(inode, EXT4_IOLOCK_EXCL); > > truncate_inode_pages(inode->i_mapping, inode->i_size); > > ret = ext4_truncate(inode); > > if (ret) > > ext4_std_error(inode->i_sb, ret); > > - inode_unlock(inode); > > + ext4_iunlock(inode, EXT4_IOLOCK_EXCL); > > nr_truncates++; > > } else { > > if (test_opt(sb, DEBUG)) > > @@ -5785,7 +5785,7 @@ static int ext4_quota_on(struct super_block *sb, int type, int format_id, > > * files. If this fails, we return success anyway since quotas > > * are already enabled and this is not a hard failure. > > */ > > - inode_lock(inode); > > + ext4_ilock(inode, EXT4_IOLOCK_EXCL); > > handle = ext4_journal_start(inode, EXT4_HT_QUOTA, 1); > > if (IS_ERR(handle)) > > goto unlock_inode; > > @@ -5795,7 +5795,7 @@ static int ext4_quota_on(struct super_block *sb, int type, int format_id, > > ext4_mark_inode_dirty(handle, inode); > > ext4_journal_stop(handle); > > unlock_inode: > > - inode_unlock(inode); > > + ext4_iunlock(inode, EXT4_IOLOCK_EXCL); > > } > > return err; > > } > > @@ -5887,7 +5887,7 @@ static int ext4_quota_off(struct super_block *sb, int type) > > if (err || ext4_has_feature_quota(sb)) > > goto out_put; > > > > - inode_lock(inode); > > + ext4_ilock(inode, EXT4_IOLOCK_EXCL); > > /* > > * Update modification times of quota files when userspace can > > * start looking at them. If we fail, we return success anyway since > > @@ -5902,7 +5902,7 @@ static int ext4_quota_off(struct super_block *sb, int type) > > ext4_mark_inode_dirty(handle, inode); > > ext4_journal_stop(handle); > > out_unlock: > > - inode_unlock(inode); > > + ext4_iunlock(inode, EXT4_IOLOCK_EXCL); > > out_put: > > lockdep_set_quota_inode(inode, I_DATA_SEM_NORMAL); > > iput(inode); > > diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c > > index 8966a5439a22..5c2dcc4c836a 100644 > > --- a/fs/ext4/xattr.c > > +++ b/fs/ext4/xattr.c > > @@ -422,9 +422,9 @@ static int ext4_xattr_inode_iget(struct inode *parent, unsigned long ea_ino, > > ext4_set_inode_state(inode, EXT4_STATE_LUSTRE_EA_INODE); > > ext4_xattr_inode_set_ref(inode, 1); > > } else { > > - inode_lock(inode); > > + ext4_ilock(inode, EXT4_IOLOCK_EXCL); > > inode->i_flags |= S_NOQUOTA; > > - inode_unlock(inode); > > + ext4_iunlock(inode, EXT4_IOLOCK_EXCL); > > } > > > > *ea_inode = inode; > > @@ -976,7 +976,7 @@ static int ext4_xattr_inode_update_ref(handle_t *handle, struct inode *ea_inode, > > u32 hash; > > int ret; > > > > - inode_lock(ea_inode); > > + ext4_ilock(ea_inode, EXT4_IOLOCK_EXCL); > > > > ret = ext4_reserve_inode_write(handle, ea_inode, &iloc); > > if (ret) > > @@ -1030,7 +1030,7 @@ static int ext4_xattr_inode_update_ref(handle_t *handle, struct inode *ea_inode, > > ext4_warning_inode(ea_inode, > > "ext4_mark_iloc_dirty() failed ret=%d", ret); > > out: > > - inode_unlock(ea_inode); > > + ext4_iunlock(ea_inode, EXT4_IOLOCK_EXCL); > > return ret; > > } > > > > @@ -1380,10 +1380,11 @@ static int ext4_xattr_inode_write(handle_t *handle, struct inode *ea_inode, > > block += 1; > > } > > > > - inode_lock(ea_inode); > > + ext4_ilock(ea_inode, EXT4_IOLOCK_EXCL); > > i_size_write(ea_inode, wsize); > > ext4_update_i_disksize(ea_inode, wsize); > > - inode_unlock(ea_inode); > > + ext4_iunlock(ea_inode, EXT4_IOLOCK_EXCL); > > + > > > > ext4_mark_inode_dirty(handle, ea_inode); > > > > @@ -1432,9 +1433,9 @@ static struct inode *ext4_xattr_inode_create(handle_t *handle, > > */ > > dquot_free_inode(ea_inode); > > dquot_drop(ea_inode); > > - inode_lock(ea_inode); > > + ext4_ilock(ea_inode, EXT4_IOLOCK_EXCL); > > ea_inode->i_flags |= S_NOQUOTA; > > - inode_unlock(ea_inode); > > + ext4_iunlock(ea_inode, EXT4_IOLOCK_EXCL); > > } > > > > return ea_inode; > > -- > > 2.21.0 > > > -- > Jan Kara <jack@suse.com> > SUSE Labs, CR
On Wed, Nov 20, 2019 at 05:48:30PM +0530, Ritesh Harjani wrote: > Not against your suggestion here. > But in kernel I do see a preference towards object followed by a verb. > At least in vfs I see functions like inode_lock()/unlock(). > > Plus I would not deny that this naming is also inspired from > xfs_ilock()/iunlock API names. I see those names as being "classical Unix" heritage (eh, maybe SysV). > hmm, it was increasing the name of the macro if I do it that way. > But that's ok. Is below macro name better? > > #define EXT4_INODE_IOLOCK_EXCL (1 << 0) > #define EXT4_INODE_IOLOCK_SHARED (1 << 1) In particular, Linux tends to prefer read/write instead of shared/exclusive terminology. rwlocks, rwsems, rcu_read_lock, seqlocks. shared/exclusive is used by file locks. And XFS ;-) I agree with Jan; just leave them opencoded. Probably worth adding inode_lock_downgrade() to fs.h instead of accessing i_rwsem directly.
On 11/20/19 10:05 PM, Matthew Wilcox wrote: > On Wed, Nov 20, 2019 at 05:48:30PM +0530, Ritesh Harjani wrote: >> Not against your suggestion here. >> But in kernel I do see a preference towards object followed by a verb. >> At least in vfs I see functions like inode_lock()/unlock(). >> >> Plus I would not deny that this naming is also inspired from >> xfs_ilock()/iunlock API names. > > I see those names as being "classical Unix" heritage (eh, maybe SysV). > >> hmm, it was increasing the name of the macro if I do it that way. >> But that's ok. Is below macro name better? >> >> #define EXT4_INODE_IOLOCK_EXCL (1 << 0) >> #define EXT4_INODE_IOLOCK_SHARED (1 << 1) > > In particular, Linux tends to prefer read/write instead of > shared/exclusive terminology. rwlocks, rwsems, rcu_read_lock, seqlocks. > shared/exclusive is used by file locks. And XFS ;-) > > I agree with Jan; just leave them opencoded. Sure. > > Probably worth adding inode_lock_downgrade() to fs.h instead of > accessing i_rwsem directly. > Yup, make sense. but since this series is independent of that change, let me add that as a separate patch after this series. Thanks for the review!! -ritesh
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h index 61987c106511..b4169a92e8d0 100644 --- a/fs/ext4/ext4.h +++ b/fs/ext4/ext4.h @@ -2960,6 +2960,39 @@ do { \ #define EXT4_FREECLUSTERS_WATERMARK 0 #endif +#define EXT4_IOLOCK_EXCL (1 << 0) +#define EXT4_IOLOCK_SHARED (1 << 1) + +static inline void ext4_ilock(struct inode *inode, unsigned int iolock) +{ + if (iolock == EXT4_IOLOCK_EXCL) + inode_lock(inode); + else + inode_lock_shared(inode); +} + +static inline void ext4_iunlock(struct inode *inode, unsigned int iolock) +{ + if (iolock == EXT4_IOLOCK_EXCL) + inode_unlock(inode); + else + inode_unlock_shared(inode); +} + +static inline int ext4_ilock_nowait(struct inode *inode, unsigned int iolock) +{ + if (iolock == EXT4_IOLOCK_EXCL) + return inode_trylock(inode); + else + return inode_trylock_shared(inode); +} + +static inline void ext4_ilock_demote(struct inode *inode, unsigned int iolock) +{ + BUG_ON(iolock != EXT4_IOLOCK_EXCL); + downgrade_write(&inode->i_rwsem); +} + /* Update i_disksize. Requires i_mutex to avoid races with truncate */ static inline void ext4_update_i_disksize(struct inode *inode, loff_t newsize) { diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c index 0e8708b77da6..08dd57558533 100644 --- a/fs/ext4/extents.c +++ b/fs/ext4/extents.c @@ -4754,7 +4754,7 @@ static long ext4_zero_range(struct file *file, loff_t offset, else max_blocks -= lblk; - inode_lock(inode); + ext4_ilock(inode, EXT4_IOLOCK_EXCL); /* * Indirect files do not support unwritten extnets @@ -4864,7 +4864,7 @@ static long ext4_zero_range(struct file *file, loff_t offset, ext4_journal_stop(handle); out_mutex: - inode_unlock(inode); + ext4_iunlock(inode, EXT4_IOLOCK_EXCL); return ret; } @@ -4930,7 +4930,7 @@ long ext4_fallocate(struct file *file, int mode, loff_t offset, loff_t len) if (mode & FALLOC_FL_KEEP_SIZE) flags |= EXT4_GET_BLOCKS_KEEP_SIZE; - inode_lock(inode); + ext4_ilock(inode, EXT4_IOLOCK_EXCL); /* * We only support preallocation for extent-based files only @@ -4961,7 +4961,7 @@ long ext4_fallocate(struct file *file, int mode, loff_t offset, loff_t len) EXT4_I(inode)->i_sync_tid); } out: - inode_unlock(inode); + ext4_iunlock(inode, EXT4_IOLOCK_EXCL); trace_ext4_fallocate_exit(inode, offset, max_blocks, ret); return ret; } @@ -5509,7 +5509,7 @@ int ext4_collapse_range(struct inode *inode, loff_t offset, loff_t len) return ret; } - inode_lock(inode); + ext4_ilock(inode, EXT4_IOLOCK_EXCL); /* * There is no need to overlap collapse range with EOF, in which case * it is effectively a truncate operation @@ -5608,7 +5608,7 @@ int ext4_collapse_range(struct inode *inode, loff_t offset, loff_t len) out_mmap: up_write(&EXT4_I(inode)->i_mmap_sem); out_mutex: - inode_unlock(inode); + ext4_iunlock(inode, EXT4_IOLOCK_EXCL); return ret; } @@ -5659,7 +5659,7 @@ int ext4_insert_range(struct inode *inode, loff_t offset, loff_t len) return ret; } - inode_lock(inode); + ext4_ilock(inode, EXT4_IOLOCK_EXCL); /* Currently just for extent based files */ if (!ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS)) { ret = -EOPNOTSUPP; @@ -5786,7 +5786,7 @@ int ext4_insert_range(struct inode *inode, loff_t offset, loff_t len) out_mmap: up_write(&EXT4_I(inode)->i_mmap_sem); out_mutex: - inode_unlock(inode); + ext4_iunlock(inode, EXT4_IOLOCK_EXCL); return ret; } diff --git a/fs/ext4/file.c b/fs/ext4/file.c index 977ac58dc718..ebe3f051598d 100644 --- a/fs/ext4/file.c +++ b/fs/ext4/file.c @@ -55,14 +55,14 @@ static ssize_t ext4_dio_read_iter(struct kiocb *iocb, struct iov_iter *to) struct inode *inode = file_inode(iocb->ki_filp); if (iocb->ki_flags & IOCB_NOWAIT) { - if (!inode_trylock_shared(inode)) + if (!ext4_ilock_nowait(inode, EXT4_IOLOCK_SHARED)) return -EAGAIN; } else { - inode_lock_shared(inode); + ext4_ilock(inode, EXT4_IOLOCK_SHARED); } if (!ext4_dio_supported(inode)) { - inode_unlock_shared(inode); + ext4_iunlock(inode, EXT4_IOLOCK_SHARED); /* * Fallback to buffered I/O if the operation being performed on * the inode is not supported by direct I/O. The IOCB_DIRECT @@ -76,7 +76,7 @@ static ssize_t ext4_dio_read_iter(struct kiocb *iocb, struct iov_iter *to) ret = iomap_dio_rw(iocb, to, &ext4_iomap_ops, NULL, is_sync_kiocb(iocb)); - inode_unlock_shared(inode); + ext4_iunlock(inode, EXT4_IOLOCK_SHARED); file_accessed(iocb->ki_filp); return ret; @@ -89,22 +89,23 @@ static ssize_t ext4_dax_read_iter(struct kiocb *iocb, struct iov_iter *to) ssize_t ret; if (iocb->ki_flags & IOCB_NOWAIT) { - if (!inode_trylock_shared(inode)) + if (!ext4_ilock_nowait(inode, EXT4_IOLOCK_SHARED)) return -EAGAIN; } else { - inode_lock_shared(inode); + ext4_ilock(inode, EXT4_IOLOCK_SHARED); } + /* * Recheck under inode lock - at this point we are sure it cannot * change anymore */ if (!IS_DAX(inode)) { - inode_unlock_shared(inode); + ext4_iunlock(inode, EXT4_IOLOCK_SHARED); /* Fallback to buffered IO in case we cannot support DAX */ return generic_file_read_iter(iocb, to); } ret = dax_iomap_rw(iocb, to, &ext4_iomap_ops); - inode_unlock_shared(inode); + ext4_iunlock(inode, EXT4_IOLOCK_SHARED); file_accessed(iocb->ki_filp); return ret; @@ -244,7 +245,7 @@ static ssize_t ext4_buffered_write_iter(struct kiocb *iocb, if (iocb->ki_flags & IOCB_NOWAIT) return -EOPNOTSUPP; - inode_lock(inode); + ext4_ilock(inode, EXT4_IOLOCK_EXCL); ret = ext4_write_checks(iocb, from); if (ret <= 0) goto out; @@ -254,7 +255,7 @@ static ssize_t ext4_buffered_write_iter(struct kiocb *iocb, current->backing_dev_info = NULL; out: - inode_unlock(inode); + ext4_iunlock(inode, EXT4_IOLOCK_EXCL); if (likely(ret > 0)) { iocb->ki_pos += ret; ret = generic_write_sync(iocb, ret); @@ -372,16 +373,17 @@ static ssize_t ext4_dio_write_iter(struct kiocb *iocb, struct iov_iter *from) handle_t *handle; struct inode *inode = file_inode(iocb->ki_filp); bool extend = false, overwrite = false, unaligned_aio = false; + unsigned int iolock = EXT4_IOLOCK_EXCL; if (iocb->ki_flags & IOCB_NOWAIT) { - if (!inode_trylock(inode)) + if (!ext4_ilock_nowait(inode, iolock)) return -EAGAIN; } else { - inode_lock(inode); + ext4_ilock(inode, iolock); } if (!ext4_dio_supported(inode)) { - inode_unlock(inode); + ext4_iunlock(inode, iolock); /* * Fallback to buffered I/O if the inode does not support * direct I/O. @@ -391,7 +393,7 @@ static ssize_t ext4_dio_write_iter(struct kiocb *iocb, struct iov_iter *from) ret = ext4_write_checks(iocb, from); if (ret <= 0) { - inode_unlock(inode); + ext4_iunlock(inode, iolock); return ret; } @@ -416,7 +418,8 @@ static ssize_t ext4_dio_write_iter(struct kiocb *iocb, struct iov_iter *from) if (!unaligned_aio && ext4_overwrite_io(inode, offset, count) && ext4_should_dioread_nolock(inode)) { overwrite = true; - downgrade_write(&inode->i_rwsem); + ext4_ilock_demote(inode, iolock); + iolock = EXT4_IOLOCK_SHARED; } if (offset + count > EXT4_I(inode)->i_disksize) { @@ -443,10 +446,7 @@ static ssize_t ext4_dio_write_iter(struct kiocb *iocb, struct iov_iter *from) ret = ext4_handle_inode_extension(inode, offset, ret, count); out: - if (overwrite) - inode_unlock_shared(inode); - else - inode_unlock(inode); + ext4_iunlock(inode, iolock); if (ret >= 0 && iov_iter_count(from)) { ssize_t err; @@ -489,10 +489,10 @@ ext4_dax_write_iter(struct kiocb *iocb, struct iov_iter *from) struct inode *inode = file_inode(iocb->ki_filp); if (iocb->ki_flags & IOCB_NOWAIT) { - if (!inode_trylock(inode)) + if (!ext4_ilock_nowait(inode, EXT4_IOLOCK_EXCL)) return -EAGAIN; } else { - inode_lock(inode); + ext4_ilock(inode, EXT4_IOLOCK_EXCL); } ret = ext4_write_checks(iocb, from); @@ -524,7 +524,7 @@ ext4_dax_write_iter(struct kiocb *iocb, struct iov_iter *from) if (extend) ret = ext4_handle_inode_extension(inode, offset, ret, count); out: - inode_unlock(inode); + ext4_iunlock(inode, EXT4_IOLOCK_EXCL); if (ret > 0) ret = generic_write_sync(iocb, ret); return ret; @@ -757,16 +757,16 @@ loff_t ext4_llseek(struct file *file, loff_t offset, int whence) return generic_file_llseek_size(file, offset, whence, maxbytes, i_size_read(inode)); case SEEK_HOLE: - inode_lock_shared(inode); + ext4_ilock(inode, EXT4_IOLOCK_SHARED); offset = iomap_seek_hole(inode, offset, &ext4_iomap_report_ops); - inode_unlock_shared(inode); + ext4_iunlock(inode, EXT4_IOLOCK_SHARED); break; case SEEK_DATA: - inode_lock_shared(inode); + ext4_ilock(inode, EXT4_IOLOCK_SHARED); offset = iomap_seek_data(inode, offset, &ext4_iomap_report_ops); - inode_unlock_shared(inode); + ext4_iunlock(inode, EXT4_IOLOCK_SHARED); break; } diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index 381813205f99..39dcc22667a1 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -3930,7 +3930,7 @@ int ext4_punch_hole(struct inode *inode, loff_t offset, loff_t length) return ret; } - inode_lock(inode); + ext4_ilock(inode, EXT4_IOLOCK_EXCL); /* No need to punch hole beyond i_size */ if (offset >= inode->i_size) @@ -4037,7 +4037,7 @@ int ext4_punch_hole(struct inode *inode, loff_t offset, loff_t length) out_dio: up_write(&EXT4_I(inode)->i_mmap_sem); out_mutex: - inode_unlock(inode); + ext4_iunlock(inode, EXT4_IOLOCK_EXCL); return ret; } diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c index 0b7f316fd30f..43b7a23dc57b 100644 --- a/fs/ext4/ioctl.c +++ b/fs/ext4/ioctl.c @@ -855,13 +855,13 @@ long ext4_ioctl(struct file *filp, unsigned int cmd, unsigned long arg) if (err) return err; - inode_lock(inode); + ext4_ilock(inode, EXT4_IOLOCK_EXCL); err = ext4_ioctl_check_immutable(inode, from_kprojid(&init_user_ns, ei->i_projid), flags); if (!err) err = ext4_ioctl_setflags(inode, flags); - inode_unlock(inode); + ext4_iunlock(inode, EXT4_IOLOCK_EXCL); mnt_drop_write_file(filp); return err; } @@ -892,7 +892,7 @@ long ext4_ioctl(struct file *filp, unsigned int cmd, unsigned long arg) goto setversion_out; } - inode_lock(inode); + ext4_ilock(inode, EXT4_IOLOCK_EXCL); handle = ext4_journal_start(inode, EXT4_HT_INODE, 1); if (IS_ERR(handle)) { err = PTR_ERR(handle); @@ -907,7 +907,7 @@ long ext4_ioctl(struct file *filp, unsigned int cmd, unsigned long arg) ext4_journal_stop(handle); unlock_out: - inode_unlock(inode); + ext4_iunlock(inode, EXT4_IOLOCK_EXCL); setversion_out: mnt_drop_write_file(filp); return err; @@ -1026,9 +1026,9 @@ long ext4_ioctl(struct file *filp, unsigned int cmd, unsigned long arg) * ext4_ext_swap_inode_data before we switch the * inode format to prevent read. */ - inode_lock((inode)); + ext4_ilock(inode, EXT4_IOLOCK_EXCL); err = ext4_ext_migrate(inode); - inode_unlock((inode)); + ext4_iunlock(inode, EXT4_IOLOCK_EXCL); mnt_drop_write_file(filp); return err; } @@ -1272,7 +1272,7 @@ long ext4_ioctl(struct file *filp, unsigned int cmd, unsigned long arg) if (err) return err; - inode_lock(inode); + ext4_ilock(inode, EXT4_IOLOCK_EXCL); ext4_fill_fsxattr(inode, &old_fa); err = vfs_ioc_fssetxattr_check(inode, &old_fa, &fa); if (err) @@ -1287,7 +1287,7 @@ long ext4_ioctl(struct file *filp, unsigned int cmd, unsigned long arg) goto out; err = ext4_ioctl_setproject(filp, fa.fsx_projid); out: - inode_unlock(inode); + ext4_iunlock(inode, EXT4_IOLOCK_EXCL); mnt_drop_write_file(filp); return err; } diff --git a/fs/ext4/super.c b/fs/ext4/super.c index 7796e2ffc294..48b83b2cf0ad 100644 --- a/fs/ext4/super.c +++ b/fs/ext4/super.c @@ -2682,12 +2682,12 @@ static void ext4_orphan_cleanup(struct super_block *sb, __func__, inode->i_ino, inode->i_size); jbd_debug(2, "truncating inode %lu to %lld bytes\n", inode->i_ino, inode->i_size); - inode_lock(inode); + ext4_ilock(inode, EXT4_IOLOCK_EXCL); truncate_inode_pages(inode->i_mapping, inode->i_size); ret = ext4_truncate(inode); if (ret) ext4_std_error(inode->i_sb, ret); - inode_unlock(inode); + ext4_iunlock(inode, EXT4_IOLOCK_EXCL); nr_truncates++; } else { if (test_opt(sb, DEBUG)) @@ -5785,7 +5785,7 @@ static int ext4_quota_on(struct super_block *sb, int type, int format_id, * files. If this fails, we return success anyway since quotas * are already enabled and this is not a hard failure. */ - inode_lock(inode); + ext4_ilock(inode, EXT4_IOLOCK_EXCL); handle = ext4_journal_start(inode, EXT4_HT_QUOTA, 1); if (IS_ERR(handle)) goto unlock_inode; @@ -5795,7 +5795,7 @@ static int ext4_quota_on(struct super_block *sb, int type, int format_id, ext4_mark_inode_dirty(handle, inode); ext4_journal_stop(handle); unlock_inode: - inode_unlock(inode); + ext4_iunlock(inode, EXT4_IOLOCK_EXCL); } return err; } @@ -5887,7 +5887,7 @@ static int ext4_quota_off(struct super_block *sb, int type) if (err || ext4_has_feature_quota(sb)) goto out_put; - inode_lock(inode); + ext4_ilock(inode, EXT4_IOLOCK_EXCL); /* * Update modification times of quota files when userspace can * start looking at them. If we fail, we return success anyway since @@ -5902,7 +5902,7 @@ static int ext4_quota_off(struct super_block *sb, int type) ext4_mark_inode_dirty(handle, inode); ext4_journal_stop(handle); out_unlock: - inode_unlock(inode); + ext4_iunlock(inode, EXT4_IOLOCK_EXCL); out_put: lockdep_set_quota_inode(inode, I_DATA_SEM_NORMAL); iput(inode); diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c index 8966a5439a22..5c2dcc4c836a 100644 --- a/fs/ext4/xattr.c +++ b/fs/ext4/xattr.c @@ -422,9 +422,9 @@ static int ext4_xattr_inode_iget(struct inode *parent, unsigned long ea_ino, ext4_set_inode_state(inode, EXT4_STATE_LUSTRE_EA_INODE); ext4_xattr_inode_set_ref(inode, 1); } else { - inode_lock(inode); + ext4_ilock(inode, EXT4_IOLOCK_EXCL); inode->i_flags |= S_NOQUOTA; - inode_unlock(inode); + ext4_iunlock(inode, EXT4_IOLOCK_EXCL); } *ea_inode = inode; @@ -976,7 +976,7 @@ static int ext4_xattr_inode_update_ref(handle_t *handle, struct inode *ea_inode, u32 hash; int ret; - inode_lock(ea_inode); + ext4_ilock(ea_inode, EXT4_IOLOCK_EXCL); ret = ext4_reserve_inode_write(handle, ea_inode, &iloc); if (ret) @@ -1030,7 +1030,7 @@ static int ext4_xattr_inode_update_ref(handle_t *handle, struct inode *ea_inode, ext4_warning_inode(ea_inode, "ext4_mark_iloc_dirty() failed ret=%d", ret); out: - inode_unlock(ea_inode); + ext4_iunlock(ea_inode, EXT4_IOLOCK_EXCL); return ret; } @@ -1380,10 +1380,11 @@ static int ext4_xattr_inode_write(handle_t *handle, struct inode *ea_inode, block += 1; } - inode_lock(ea_inode); + ext4_ilock(ea_inode, EXT4_IOLOCK_EXCL); i_size_write(ea_inode, wsize); ext4_update_i_disksize(ea_inode, wsize); - inode_unlock(ea_inode); + ext4_iunlock(ea_inode, EXT4_IOLOCK_EXCL); + ext4_mark_inode_dirty(handle, ea_inode); @@ -1432,9 +1433,9 @@ static struct inode *ext4_xattr_inode_create(handle_t *handle, */ dquot_free_inode(ea_inode); dquot_drop(ea_inode); - inode_lock(ea_inode); + ext4_ilock(ea_inode, EXT4_IOLOCK_EXCL); ea_inode->i_flags |= S_NOQUOTA; - inode_unlock(ea_inode); + ext4_iunlock(ea_inode, EXT4_IOLOCK_EXCL); } return ea_inode;
This adds ext4_ilock/iunlock types of APIs. This is the preparation APIs to make shared locking/unlocking & restarting with exclusive locking/unlocking easier in next patch. Signed-off-by: Ritesh Harjani <riteshh@linux.ibm.com> --- fs/ext4/ext4.h | 33 ++++++++++++++++++++++++++++++ fs/ext4/extents.c | 16 +++++++-------- fs/ext4/file.c | 52 +++++++++++++++++++++++------------------------ fs/ext4/inode.c | 4 ++-- fs/ext4/ioctl.c | 16 +++++++-------- fs/ext4/super.c | 12 +++++------ fs/ext4/xattr.c | 17 ++++++++-------- 7 files changed, 92 insertions(+), 58 deletions(-)