diff mbox series

ext4: disambiguate the return value of ext4_dio_write_end_io()

Message ID 20240815112746.18570-1-alexjlzheng@tencent.com
State Rejected
Headers show
Series ext4: disambiguate the return value of ext4_dio_write_end_io() | expand

Commit Message

alexjlzheng@gmail.com Aug. 15, 2024, 11:27 a.m. UTC
From: Jinliang Zheng <alexjlzheng@tencent.com>

The commit 91562895f803 ("ext4: properly sync file size update after O_SYNC
direct IO") causes confusion about the meaning of the return value of
ext4_dio_write_end_io().

Specifically, when the ext4_handle_inode_extension() operation succeeds,
ext4_dio_write_end_io() directly returns count instead of 0.

This does not cause a bug in the current kernel, but the semantics of the
return value of the ext4_dio_write_end_io() function are wrong, which is
likely to introduce bugs in the future code evolution.

Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>
---
 fs/ext4/file.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

Comments

Yang Erkun Aug. 16, 2024, 12:21 p.m. UTC | #1
在 2024/8/15 19:27, alexjlzheng@gmail.com 写道:
> From: Jinliang Zheng <alexjlzheng@tencent.com>
> 
> The commit 91562895f803 ("ext4: properly sync file size update after O_SYNC
> direct IO") causes confusion about the meaning of the return value of
> ext4_dio_write_end_io().
> 
> Specifically, when the ext4_handle_inode_extension() operation succeeds,
> ext4_dio_write_end_io() directly returns count instead of 0.
> 
> This does not cause a bug in the current kernel, but the semantics of the
> return value of the ext4_dio_write_end_io() function are wrong, which is
> likely to introduce bugs in the future code evolution.
> 
> Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>
> ---
>   fs/ext4/file.c | 5 +++--
>   1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/ext4/file.c b/fs/ext4/file.c
> index c89e434db6b7..6df5a92cec2b 100644
> --- a/fs/ext4/file.c
> +++ b/fs/ext4/file.c
> @@ -392,8 +392,9 @@ static int ext4_dio_write_end_io(struct kiocb *iocb, ssize_t size,
>   	 */
>   	if (pos + size <= READ_ONCE(EXT4_I(inode)->i_disksize) &&
>   	    pos + size <= i_size_read(inode))
> -		return size;
> -	return ext4_handle_inode_extension(inode, pos, size);
> +		return 0;
> +	error = ext4_handle_inode_extension(inode, pos, size);
> +	return error < 0 ? error : 0;

Why?

iomap_dio_complete can use the return value directly without any bug. 
And I think the code now seems more clearly...

>   }
>   
>   static const struct iomap_dio_ops ext4_dio_write_ops = {
alexjlzheng@gmail.com Aug. 16, 2024, 4:57 p.m. UTC | #2
On Fri, 16 Aug 2024 20:21:22 +0800, yangerkun@huaweicloud.com wrote:
> 在 2024/8/15 19:27, alexjlzheng@gmail.com 写道:
> > From: Jinliang Zheng <alexjlzheng@tencent.com>
> > 
> > The commit 91562895f803 ("ext4: properly sync file size update after O_SYNC
> > direct IO") causes confusion about the meaning of the return value of
> > ext4_dio_write_end_io().
> > 
> > Specifically, when the ext4_handle_inode_extension() operation succeeds,
> > ext4_dio_write_end_io() directly returns count instead of 0.
> > 
> > This does not cause a bug in the current kernel, but the semantics of the
> > return value of the ext4_dio_write_end_io() function are wrong, which is
> > likely to introduce bugs in the future code evolution.
> > 
> > Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>
> > ---
> >   fs/ext4/file.c | 5 +++--
> >   1 file changed, 3 insertions(+), 2 deletions(-)
> > 
> > diff --git a/fs/ext4/file.c b/fs/ext4/file.c
> > index c89e434db6b7..6df5a92cec2b 100644
> > --- a/fs/ext4/file.c
> > +++ b/fs/ext4/file.c
> > @@ -392,8 +392,9 @@ static int ext4_dio_write_end_io(struct kiocb *iocb, ssize_t size,
> >   	 */
> >   	if (pos + size <= READ_ONCE(EXT4_I(inode)->i_disksize) &&
> >   	    pos + size <= i_size_read(inode))
> > -		return size;
> > -	return ext4_handle_inode_extension(inode, pos, size);
> > +		return 0;
> > +	error = ext4_handle_inode_extension(inode, pos, size);
> > +	return error < 0 ? error : 0;
> 
> Why?

Before commit 91562895f803 ("ext4: properly sync file size update after O_SYNC
direct IO"), all filesystems' iomap_dio_ops.end_io() return 0 on success and
negative value on failure.

Moreover, this confusion of return value semantics caused data corruption when
this above patch was merged to the stable branch. See
https://lwn.net/Articles/954285/ for details.

> 
> iomap_dio_complete can use the return value directly without any bug. 
> And I think the code now seems more clearly...
> 

In my opinion, clean code should be clearly defined code, especially the
interface functions connecting various modules. So, what is the return value
definition of iomap_dio_ops.end_io()? What is the return value definition of
ext4_dio_write_end_io()?

Thanks,
Jinliang Zheng

> >   }
> >   
> >   static const struct iomap_dio_ops ext4_dio_write_ops = {
Darrick J. Wong Aug. 16, 2024, 5:50 p.m. UTC | #3
On Sat, Aug 17, 2024 at 12:57:27AM +0800, alexjlzheng@gmail.com wrote:
> On Fri, 16 Aug 2024 20:21:22 +0800, yangerkun@huaweicloud.com wrote:
> > 在 2024/8/15 19:27, alexjlzheng@gmail.com 写道:
> > > From: Jinliang Zheng <alexjlzheng@tencent.com>
> > > 
> > > The commit 91562895f803 ("ext4: properly sync file size update after O_SYNC
> > > direct IO") causes confusion about the meaning of the return value of
> > > ext4_dio_write_end_io().
> > > 
> > > Specifically, when the ext4_handle_inode_extension() operation succeeds,
> > > ext4_dio_write_end_io() directly returns count instead of 0.
> > > 
> > > This does not cause a bug in the current kernel, but the semantics of the
> > > return value of the ext4_dio_write_end_io() function are wrong, which is
> > > likely to introduce bugs in the future code evolution.
> > > 
> > > Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>
> > > ---
> > >   fs/ext4/file.c | 5 +++--
> > >   1 file changed, 3 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/fs/ext4/file.c b/fs/ext4/file.c
> > > index c89e434db6b7..6df5a92cec2b 100644
> > > --- a/fs/ext4/file.c
> > > +++ b/fs/ext4/file.c
> > > @@ -392,8 +392,9 @@ static int ext4_dio_write_end_io(struct kiocb *iocb, ssize_t size,
> > >   	 */
> > >   	if (pos + size <= READ_ONCE(EXT4_I(inode)->i_disksize) &&
> > >   	    pos + size <= i_size_read(inode))
> > > -		return size;
> > > -	return ext4_handle_inode_extension(inode, pos, size);
> > > +		return 0;
> > > +	error = ext4_handle_inode_extension(inode, pos, size);
> > > +	return error < 0 ? error : 0;
> > 
> > Why?
> 
> Before commit 91562895f803 ("ext4: properly sync file size update after O_SYNC
> direct IO"), all filesystems' iomap_dio_ops.end_io() return 0 on success and
> negative value on failure.
> 
> Moreover, this confusion of return value semantics caused data corruption when
> this above patch was merged to the stable branch. See
> https://lwn.net/Articles/954285/ for details.
> 
> > 
> > iomap_dio_complete can use the return value directly without any bug. 
> > And I think the code now seems more clearly...
> > 
> 
> In my opinion, clean code should be clearly defined code, especially the
> interface functions connecting various modules. So, what is the return value
> definition of iomap_dio_ops.end_io()? What is the return value definition of
> ext4_dio_write_end_io()?

iomap_dio_ops.end_io() implemenations should return one of:

- negative errno
- 0 if @size bytes were written

I'm not sure why ext4_dio_write_end_io returns size explicitly or why
ext4_handle_inode_extension returns count.  Neither function need to do
that.

--D

> Thanks,
> Jinliang Zheng
> 
> > >   }
> > >   
> > >   static const struct iomap_dio_ops ext4_dio_write_ops = {
>
Yang Erkun Aug. 17, 2024, 1:29 a.m. UTC | #4
在 2024/8/17 0:57, alexjlzheng@gmail.com 写道:
> On Fri, 16 Aug 2024 20:21:22 +0800, yangerkun@huaweicloud.com wrote:
>> 在 2024/8/15 19:27, alexjlzheng@gmail.com 写道:
>>> From: Jinliang Zheng <alexjlzheng@tencent.com>
>>>
>>> The commit 91562895f803 ("ext4: properly sync file size update after O_SYNC
>>> direct IO") causes confusion about the meaning of the return value of
>>> ext4_dio_write_end_io().
>>>
>>> Specifically, when the ext4_handle_inode_extension() operation succeeds,
>>> ext4_dio_write_end_io() directly returns count instead of 0.
>>>
>>> This does not cause a bug in the current kernel, but the semantics of the
>>> return value of the ext4_dio_write_end_io() function are wrong, which is
>>> likely to introduce bugs in the future code evolution.
>>>
>>> Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>
>>> ---
>>>    fs/ext4/file.c | 5 +++--
>>>    1 file changed, 3 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/fs/ext4/file.c b/fs/ext4/file.c
>>> index c89e434db6b7..6df5a92cec2b 100644
>>> --- a/fs/ext4/file.c
>>> +++ b/fs/ext4/file.c
>>> @@ -392,8 +392,9 @@ static int ext4_dio_write_end_io(struct kiocb *iocb, ssize_t size,
>>>    	 */
>>>    	if (pos + size <= READ_ONCE(EXT4_I(inode)->i_disksize) &&
>>>    	    pos + size <= i_size_read(inode))
>>> -		return size;
>>> -	return ext4_handle_inode_extension(inode, pos, size);
>>> +		return 0;
>>> +	error = ext4_handle_inode_extension(inode, pos, size);
>>> +	return error < 0 ? error : 0;
>>
>> Why?
> 
> Before commit 91562895f803 ("ext4: properly sync file size update after O_SYNC
> direct IO"), all filesystems' iomap_dio_ops.end_io() return 0 on success and
> negative value on failure.
> 
> Moreover, this confusion of return value semantics caused data corruption when
> this above patch was merged to the stable branch. See
> https://lwn.net/Articles/954285/ for details.

Yeah, I know this problem, you should backport 936e114a245b("iomap:
update ki_pos a little later in iomap_dio_complete") too to help update
iocb->ki_pos since ext4_dio_write_end_io now return > 0.

> 
>>
>> iomap_dio_complete can use the return value directly without any bug.
>> And I think the code now seems more clearly...
>>
> 
> In my opinion, clean code should be clearly defined code, especially the

Agree.

> interface functions connecting various modules. So, what is the return value
> definition of iomap_dio_ops.end_io()? What is the return value definition of
> ext4_dio_write_end_io()?

I have not seen the definition of return value for
iomap_dio_ops.end_io(), so I think the code is ok now. If we give a
definition for the return value like Darrick describe, this patch looks
good to me.

> 
> Thanks,
> Jinliang Zheng
> 
>>>    }
>>>    
>>>    static const struct iomap_dio_ops ext4_dio_write_ops = {
>
diff mbox series

Patch

diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index c89e434db6b7..6df5a92cec2b 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -392,8 +392,9 @@  static int ext4_dio_write_end_io(struct kiocb *iocb, ssize_t size,
 	 */
 	if (pos + size <= READ_ONCE(EXT4_I(inode)->i_disksize) &&
 	    pos + size <= i_size_read(inode))
-		return size;
-	return ext4_handle_inode_extension(inode, pos, size);
+		return 0;
+	error = ext4_handle_inode_extension(inode, pos, size);
+	return error < 0 ? error : 0;
 }
 
 static const struct iomap_dio_ops ext4_dio_write_ops = {