Message ID | 20230621174114.1320834-2-bongiojp@gmail.com |
---|---|
State | New |
Series | iomap regression for aio dio 4k writes |
On Wed, Jun 21, 2023 at 10:29:20AM -0700, Jeremy Bongio wrote:
> +++ b/fs/iomap/direct-io.c
> @@ -168,7 +168,9 @@ void iomap_dio_bio_end_io(struct bio *bio)
>  			struct task_struct *waiter = dio->submit.waiter;
>  			WRITE_ONCE(dio->submit.waiter, NULL);
>  			blk_wake_io_task(waiter);
> -		} else if (dio->flags & IOMAP_DIO_WRITE) {
> +		} else if (dio->flags & IOMAP_DIO_WRITE &&
> +			(!dio->iocb->ki_filp->f_inode ||
> +			dio->iocb->ki_filp->f_inode->i_mapping->nrpages))) {

I don't think it's possible for file->f_inode to be NULL here, is it?
At any rate, that amount of indirection is just nasty. How about this?

+++ b/fs/iomap/direct-io.c
@@ -161,15 +161,19 @@ void iomap_dio_bio_end_io(struct bio *bio)
 			struct task_struct *waiter = dio->submit.waiter;
 			WRITE_ONCE(dio->submit.waiter, NULL);
 			blk_wake_io_task(waiter);
-		} else if (dio->flags & IOMAP_DIO_WRITE) {
+		} else {
 			struct inode *inode = file_inode(dio->iocb->ki_filp);
 
 			WRITE_ONCE(dio->iocb->private, NULL);
-			INIT_WORK(&dio->aio.work, iomap_dio_complete_work);
-			queue_work(inode->i_sb->s_dio_done_wq, &dio->aio.work);
-		} else {
-			WRITE_ONCE(dio->iocb->private, NULL);
-			iomap_dio_complete_work(&dio->aio.work);
+			if (dio->flags & IOMAP_DIO_WRITE &&
+			    inode->i_mapping->nrpages > 0) {
+				INIT_WORK(&dio->aio.work,
+						iomap_dio_complete_work);
+				queue_work(inode->i_sb->s_dio_done_wq,
+						&dio->aio.work);
+			} else {
+				iomap_dio_complete_work(&dio->aio.work);
+			}
 		}
 	}
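To make the behavioural change in the suggested hunk easier to see, here is a minimal userspace sketch of the async completion decision before the patch and with the restructured branch. Everything in it (the enum, the function names, the boolean and page-count parameters) is a hypothetical stand-in for illustration, not kernel code, and it assumes the inode pointer is never NULL, as the reply above expects.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins; none of these are kernel definitions. */
enum completion {
	COMPLETE_INLINE,	/* run the completion directly in the bio end_io path */
	COMPLETE_DEFERRED,	/* punt the completion to a workqueue */
};

/* Model of the code before the patch: every async write is deferred. */
static enum completion dispatch_before(bool is_write, unsigned long nrpages)
{
	(void)nrpages;		/* the page cache is not consulted here */
	return is_write ? COMPLETE_DEFERRED : COMPLETE_INLINE;
}

/* Model of the suggested hunk: defer only writes to a file with cached pages. */
static enum completion dispatch_suggested(bool is_write, unsigned long nrpages)
{
	if (is_write && nrpages > 0)
		return COMPLETE_DEFERRED;
	return COMPLETE_INLINE;
}

/* Walk all four input combinations and check where the two models disagree. */
static void check_dispatch_models(void)
{
	for (int w = 0; w <= 1; w++) {
		for (unsigned long n = 0; n <= 1; n++) {
			bool differ = dispatch_before(w, n) !=
				      dispatch_suggested(w, n);

			/* They differ only for a write with no cached pages. */
			assert(differ == (w && n == 0));
		}
	}
}
```

Calling `check_dispatch_models()` from a test harness confirms that, in this model, the only case whose handling changes is an async write against a file with no pages in the page cache, which now completes inline instead of being queued.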
On Wed, Jun 21, 2023 at 10:29:20AM -0700, Jeremy Bongio wrote:
> If there are no mapped pages for a DIO write then the page cache does not
> need to be updated. For very fast SSDs and direct async IO, deferring work
> completion can result in a significant performance loss.
> ---
>  fs/iomap/direct-io.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
> index 019cc87d0fb3..8f27d0dc4f6d 100644
> --- a/fs/iomap/direct-io.c
> +++ b/fs/iomap/direct-io.c
> @@ -168,7 +168,9 @@ void iomap_dio_bio_end_io(struct bio *bio)
>  			struct task_struct *waiter = dio->submit.waiter;
>  			WRITE_ONCE(dio->submit.waiter, NULL);
>  			blk_wake_io_task(waiter);
> -		} else if (dio->flags & IOMAP_DIO_WRITE) {
> +		} else if (dio->flags & IOMAP_DIO_WRITE &&
> +			(!dio->iocb->ki_filp->f_inode ||
> +			dio->iocb->ki_filp->f_inode->i_mapping->nrpages))) {
>  			struct inode *inode = file_inode(dio->iocb->ki_filp);

Writes that need O_DSYNC, unwritten extent conversion, file size
extension, etc. all need to be deferred. This will break all of them, as
well as any other type of write for which the filesystem itself needs to
run completion in task context.

Cheers,

Dave.
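The objection above can be illustrated with a small userspace model that compares the patch's condition against a more conservative predicate. This is only a sketch: the `SK_*` flag bits, `struct sk_dio`, and both function names are hypothetical and do not correspond to the kernel's actual `IOMAP_DIO_*` definitions. The conservative predicate covers only the three cases named in the review, not the filesystem-specific ones it also mentions, and is not presented as the fix that should be applied.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag bits for this sketch only; not the kernel's values. */
#define SK_WRITE	(1u << 0)	/* the I/O is a write */
#define SK_NEED_SYNC	(1u << 1)	/* O_DSYNC-style write, needs a sync at completion */
#define SK_UNWRITTEN	(1u << 2)	/* wrote into an unwritten extent needing conversion */
#define SK_EXTENDS_SIZE	(1u << 3)	/* the write extends the file size */

/* Hypothetical, heavily reduced stand-in for the per-I/O state. */
struct sk_dio {
	unsigned int flags;
	unsigned long nrpages;	/* pages cached for the file */
};

/* Condition from the patch under review: only the page cache matters. */
static bool patch_defers(const struct sk_dio *dio)
{
	return (dio->flags & SK_WRITE) && dio->nrpages > 0;
}

/* Conservative model: also defer the cases named in the review. */
static bool conservative_defers(const struct sk_dio *dio)
{
	if (!(dio->flags & SK_WRITE))
		return false;
	if (dio->flags & (SK_NEED_SYNC | SK_UNWRITTEN | SK_EXTENDS_SIZE))
		return true;
	return dio->nrpages > 0;
}

/* An O_DSYNC-style write with an empty page cache is where they diverge. */
static void check_dsync_case(void)
{
	struct sk_dio dio = { .flags = SK_WRITE | SK_NEED_SYNC, .nrpages = 0 };

	assert(!patch_defers(&dio));		/* patch completes it inline */
	assert(conservative_defers(&dio));	/* review says it must be deferred */
}
```

In this model, a write that sets any of the three extra flags while the file has no cached pages completes inline under the patch's condition, which is the situation the review warns about.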
diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
index 019cc87d0fb3..8f27d0dc4f6d 100644
--- a/fs/iomap/direct-io.c
+++ b/fs/iomap/direct-io.c
@@ -168,7 +168,9 @@ void iomap_dio_bio_end_io(struct bio *bio)
 			struct task_struct *waiter = dio->submit.waiter;
 			WRITE_ONCE(dio->submit.waiter, NULL);
 			blk_wake_io_task(waiter);
-		} else if (dio->flags & IOMAP_DIO_WRITE) {
+		} else if (dio->flags & IOMAP_DIO_WRITE &&
+			(!dio->iocb->ki_filp->f_inode ||
+			dio->iocb->ki_filp->f_inode->i_mapping->nrpages))) {
 			struct inode *inode = file_inode(dio->iocb->ki_filp);
 
 			WRITE_ONCE(dio->iocb->private, NULL);