Message ID | 20240809121532.2105494-1-chengzhihao@huaweicloud.com |
---|---|
State | Awaiting Upstream |
Headers | show |
Series | ext4: dax: Fix overflowing extents beyond inode size when partially writing | expand |
On Fri 09-08-24 20:15:32, Zhihao Cheng wrote: > From: Zhihao Cheng <chengzhihao1@huawei.com> > > The dax_iomap_rw() does two things in each iteration: map written blocks > and copy user data to blocks. If the process is killed by user(See signal > handling in dax_iomap_iter()), the copied data will be returned and added > on inode size, which means that the length of written extents may exceed > the inode size, then fsck will fail. An example is given as: > > dd if=/dev/urandom of=file bs=4M count=1 > dax_iomap_rw > iomap_iter // round 1 > ext4_iomap_begin > ext4_iomap_alloc // allocate 0~2M extents(written flag) > dax_iomap_iter // copy 2M data > iomap_iter // round 2 > iomap_iter_advance > iter->pos += iter->processed // iter->pos = 2M > ext4_iomap_begin > ext4_iomap_alloc // allocate 2~4M extents(written flag) > dax_iomap_iter > fatal_signal_pending > done = iter->pos - iocb->ki_pos // done = 2M > ext4_handle_inode_extension > ext4_update_inode_size // inode size = 2M > > fsck reports: Inode 13, i_size is 2097152, should be 4194304. Fix? > > Fix the problem by truncating extents if the written length is smaller > than expected. > > Fixes: 776722e85d3b ("ext4: DAX iomap write support") > CC: stable@vger.kernel.org > Link: https://bugzilla.kernel.org/show_bug.cgi?id=219136 > Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com> Good catch! Feel free to add: Reviewed-by: Jan Kara <jack@suse.cz> Honza > --- > fs/ext4/file.c | 8 ++++---- > 1 file changed, 4 insertions(+), 4 deletions(-) > > diff --git a/fs/ext4/file.c b/fs/ext4/file.c > index c89e434db6b7..be061bb64067 100644 > --- a/fs/ext4/file.c > +++ b/fs/ext4/file.c > @@ -334,10 +334,10 @@ static ssize_t ext4_handle_inode_extension(struct inode *inode, loff_t offset, > * Clean up the inode after DIO or DAX extending write has completed and the > * inode size has been updated using ext4_handle_inode_extension(). > */ > -static void ext4_inode_extension_cleanup(struct inode *inode, ssize_t count) > +static void ext4_inode_extension_cleanup(struct inode *inode, bool need_trunc) > { > lockdep_assert_held_write(&inode->i_rwsem); > - if (count < 0) { > + if (need_trunc) { > ext4_truncate_failed_write(inode); > /* > * If the truncate operation failed early, then the inode may > @@ -586,7 +586,7 @@ static ssize_t ext4_dio_write_iter(struct kiocb *iocb, struct iov_iter *from) > * writeback of delalloc blocks. > */ > WARN_ON_ONCE(ret == -EIOCBQUEUED); > - ext4_inode_extension_cleanup(inode, ret); > + ext4_inode_extension_cleanup(inode, ret < 0); > } > > out: > @@ -670,7 +670,7 @@ ext4_dax_write_iter(struct kiocb *iocb, struct iov_iter *from) > > if (extend) { > ret = ext4_handle_inode_extension(inode, offset, ret); > - ext4_inode_extension_cleanup(inode, ret); > + ext4_inode_extension_cleanup(inode, ret < (ssize_t)count); > } > out: > inode_unlock(inode); > -- > 2.39.2 >
On Fri, 09 Aug 2024 20:15:32 +0800, Zhihao Cheng wrote: > The dax_iomap_rw() does two things in each iteration: map written blocks > and copy user data to blocks. If the process is killed by user(See signal > handling in dax_iomap_iter()), the copied data will be returned and added > on inode size, which means that the length of written extents may exceed > the inode size, then fsck will fail. An example is given as: > > dd if=/dev/urandom of=file bs=4M count=1 > dax_iomap_rw > iomap_iter // round 1 > ext4_iomap_begin > ext4_iomap_alloc // allocate 0~2M extents(written flag) > dax_iomap_iter // copy 2M data > iomap_iter // round 2 > iomap_iter_advance > iter->pos += iter->processed // iter->pos = 2M > ext4_iomap_begin > ext4_iomap_alloc // allocate 2~4M extents(written flag) > dax_iomap_iter > fatal_signal_pending > done = iter->pos - iocb->ki_pos // done = 2M > ext4_handle_inode_extension > ext4_update_inode_size // inode size = 2M > > [...] Applied, thanks! [1/1] ext4: dax: Fix overflowing extents beyond inode size when partially writing commit: dda898d7ffe85931f9cca6d702a51f33717c501e Best regards,
diff --git a/fs/ext4/file.c b/fs/ext4/file.c index c89e434db6b7..be061bb64067 100644 --- a/fs/ext4/file.c +++ b/fs/ext4/file.c @@ -334,10 +334,10 @@ static ssize_t ext4_handle_inode_extension(struct inode *inode, loff_t offset, * Clean up the inode after DIO or DAX extending write has completed and the * inode size has been updated using ext4_handle_inode_extension(). */ -static void ext4_inode_extension_cleanup(struct inode *inode, ssize_t count) +static void ext4_inode_extension_cleanup(struct inode *inode, bool need_trunc) { lockdep_assert_held_write(&inode->i_rwsem); - if (count < 0) { + if (need_trunc) { ext4_truncate_failed_write(inode); /* * If the truncate operation failed early, then the inode may @@ -586,7 +586,7 @@ static ssize_t ext4_dio_write_iter(struct kiocb *iocb, struct iov_iter *from) * writeback of delalloc blocks. */ WARN_ON_ONCE(ret == -EIOCBQUEUED); - ext4_inode_extension_cleanup(inode, ret); + ext4_inode_extension_cleanup(inode, ret < 0); } out: @@ -670,7 +670,7 @@ ext4_dax_write_iter(struct kiocb *iocb, struct iov_iter *from) if (extend) { ret = ext4_handle_inode_extension(inode, offset, ret); - ext4_inode_extension_cleanup(inode, ret); + ext4_inode_extension_cleanup(inode, ret < (ssize_t)count); } out: inode_unlock(inode);