Message ID | 20210827164926.1726765-1-agruenba@redhat.com |
---|---|
Headers | show |
Series | gfs2: Fix mmap + page fault deadlocks | expand |
On Fri, Aug 27, 2021 at 9:49 AM Andreas Gruenbacher <agruenba@redhat.com> wrote: > > here's another update on top of v5.14-rc7. Changes: > > * Some of the patch descriptions have been improved. > > * Patch "gfs2: Eliminate ip->i_gh" has been moved further to the front. > > At this point, I'm not aware of anything that still needs fixing, From a quick scan, I didn't see anything that raised my hackles. But I skipped all the gfs2-specific changes in the series, since that's all above my paygrade. Linus
On Fri, Aug 27, 2021 at 7:17 PM Linus Torvalds <torvalds@linux-foundation.org> wrote: > On Fri, Aug 27, 2021 at 9:49 AM Andreas Gruenbacher <agruenba@redhat.com> wrote: > > > > here's another update on top of v5.14-rc7. Changes: > > > > * Some of the patch descriptions have been improved. > > > > * Patch "gfs2: Eliminate ip->i_gh" has been moved further to the front. > > > > At this point, I'm not aware of anything that still needs fixing, > > From a quick scan, I didn't see anything that raised my hackles. So there's a minor merge conflict between Christoph's iomap_iter conversion and this patch queue now, and I should probably clarify the description of "iomap: Add done_before argument to iomap_dio_rw" that Darrick ran into. Then there are the user copy issues that Al has pointed out. Fixing those will create superficial conflicts with this patch queue, but probably nothing serious. So how should I proceed: do you expect a v8 of this patch queue on top of the current mainline? Thanks, Andreas
On Fri, Aug 27, 2021 at 5:51 PM Andreas Gruenbacher <agruenba@redhat.com> wrote: > > Hi all, > > here's another update on top of v5.14-rc7. Changes: > > * Some of the patch descriptions have been improved. > > * Patch "gfs2: Eliminate ip->i_gh" has been moved further to the front. > > At this point, I'm not aware of anything that still needs fixing, Hi, thanks for doing this. In btrfs we also have a deadlock (after the conversion to use iomap for direct IO) triggered by your recent test case for fstests, generic/647 [1]. Even though we can fix it in btrfs without touching iomap, iov_iter, etc, it would be too complex for such a rare and exotic case (a user passing a buffer for a direct IO read/write that is memory mapped to the same file range of the operation is very uncommon at least). But this patchset would make the fix much simpler and cleaner. One thing I noticed is that, for direct IO reads, despite setting the ->nofault attribute of the iov_iter to true, we can still get page faults while in the iomap code. This happens when reading from holes and unwritten/prealloc extents, because iomap calls iov_iter_zero() and this seems to ignore the value of ->nofault. Is that intentional? I can get around it by surrounding the iomap call with pagefault_disable() / pagefault_enable(), but it seems odd to do so, given that iov_iter->nofault was set to true. [1] https://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git/commit/?id=d3cbdabffc4cb28850e97bc7bd8a7a1460db94e5 Thanks. > > > The first two patches are independent of the core of this patch queue > and I've asked the respective maintainers to have a look, but I've not > heard back from them. The first patch should just go into Al's tree; > it's a relatively straight-forward fix. The second patch really needs > to be looked at; it might break things: > > iov_iter: Fix iov_iter_get_pages{,_alloc} page fault return value > powerpc/kvm: Fix kvm_use_magic_page > > > Al and Linus seem to have a disagreement about the error reporting > semantics that functions fault_in_{readable,writeable} and > fault_in_iov_iter_{readable,writeable} should have. I've implemented > Linus's suggestion of returning the number of bytes not faulted in and I > think that being able to tell if "nothing", "something" or "everything" > could be faulted in does help, but I'll live with anything that allows > us to make progress. > > > The iomap changes should ideally be reviewed by Christoph; I've not > heard from him about those. > > > Thanks, > Andreas > > Andreas Gruenbacher (16): > iov_iter: Fix iov_iter_get_pages{,_alloc} page fault return value > powerpc/kvm: Fix kvm_use_magic_page > gup: Turn fault_in_pages_{readable,writeable} into > fault_in_{readable,writeable} > iov_iter: Turn iov_iter_fault_in_readable into > fault_in_iov_iter_readable > iov_iter: Introduce fault_in_iov_iter_writeable > gfs2: Add wrapper for iomap_file_buffered_write > gfs2: Clean up function may_grant > gfs2: Move the inode glock locking to gfs2_file_buffered_write > gfs2: Eliminate ip->i_gh > gfs2: Fix mmap + page fault deadlocks for buffered I/O > iomap: Fix iomap_dio_rw return value for user copies > iomap: Support partial direct I/O on user copy failures > iomap: Add done_before argument to iomap_dio_rw > gup: Introduce FOLL_NOFAULT flag to disable page faults > iov_iter: Introduce nofault flag to disable page faults > gfs2: Fix mmap + page fault deadlocks for direct I/O > > Bob Peterson (3): > gfs2: Eliminate vestigial HIF_FIRST > gfs2: Remove redundant check from gfs2_glock_dq > gfs2: Introduce flag for glock holder auto-demotion > > arch/powerpc/kernel/kvm.c | 3 +- > arch/powerpc/kernel/signal_32.c | 4 +- > arch/powerpc/kernel/signal_64.c | 2 +- > arch/x86/kernel/fpu/signal.c | 7 +- > drivers/gpu/drm/armada/armada_gem.c | 7 +- > fs/btrfs/file.c | 7 +- > fs/btrfs/ioctl.c | 5 +- > fs/ext4/file.c | 5 +- > fs/f2fs/file.c | 2 +- > fs/fuse/file.c | 2 +- > fs/gfs2/bmap.c | 60 +---- > fs/gfs2/file.c | 245 ++++++++++++++++++-- > fs/gfs2/glock.c | 340 +++++++++++++++++++++------- > fs/gfs2/glock.h | 20 ++ > fs/gfs2/incore.h | 5 +- > fs/iomap/buffered-io.c | 2 +- > fs/iomap/direct-io.c | 21 +- > fs/ntfs/file.c | 2 +- > fs/xfs/xfs_file.c | 6 +- > fs/zonefs/super.c | 4 +- > include/linux/iomap.h | 11 +- > include/linux/mm.h | 3 +- > include/linux/pagemap.h | 58 +---- > include/linux/uio.h | 4 +- > lib/iov_iter.c | 103 +++++++-- > mm/filemap.c | 4 +- > mm/gup.c | 139 +++++++++++- > 27 files changed, 785 insertions(+), 286 deletions(-) > > -- > 2.26.3 >
On Wed, Sep 1, 2021 at 12:53 PM Andreas Gruenbacher <agruenba@redhat.com> wrote: > > So there's a minor merge conflict between Christoph's iomap_iter > conversion and this patch queue now, and I should probably clarify the > description of "iomap: Add done_before argument to iomap_dio_rw" that > Darrick ran into. Then there are the user copy issues that Al has > pointed out. Fixing those will create superficial conflicts with this > patch queue, but probably nothing serious. > > So how should I proceed: do you expect a v8 of this patch queue on top > of the current mainline? So if you rebase for fixes, it's going to be a "next merge window" thing again. Personally, I'm ok with the series as is, and the conflict isn't an issue. So I'd take it as is, and then people can fix up niggling issues later. But if somebody screams loudly.. Linus
On Fri, Sep 03, 2021 at 08:52:00AM -0700, Linus Torvalds wrote: > On Wed, Sep 1, 2021 at 12:53 PM Andreas Gruenbacher <agruenba@redhat.com> wrote: > > > > So there's a minor merge conflict between Christoph's iomap_iter > > conversion and this patch queue now, and I should probably clarify the > > description of "iomap: Add done_before argument to iomap_dio_rw" that > > Darrick ran into. Then there are the user copy issues that Al has > > pointed out. Fixing those will create superficial conflicts with this > > patch queue, but probably nothing serious. > > > > So how should I proceed: do you expect a v8 of this patch queue on top > > of the current mainline? > > So if you rebase for fixes, it's going to be a "next merge window" thing again. > > Personally, I'm ok with the series as is, and the conflict isn't an > issue. So I'd take it as is, and then people can fix up niggling > issues later. > > But if somebody screams loudly.. FWIW, my objections regarding the calling conventions are still there. Out of 3 callers that do want more than "success/failure", one is gone (series by tglx) and one more is broken (regardless of the semantics, btrfs on arm64). Which leaves 1 caller (fault-in for read in FPU handling on x86 sigreturn). That caller turns out to be correct, but IMO there are fairly strong arguments in favour of *not* using the normal fault-in in that case. "how many bytes can we fault in" is misleading, unless we really poke into every cacheline in the affected area. Which might be a primitive worth having, but it's a lot heavier than needed by the majority of callers.
On Fri, Sep 3, 2021 at 11:28 AM Al Viro <viro@zeniv.linux.org.uk> wrote: > > FWIW, my objections regarding the calling conventions are still there. So I'm happy to further change the calling conventions, but by now _that_ part is most definitely a "not this merge window". The need for that ternary state is still there. It might go away in the future, but I think that's literally that: a future cleanup. Not really related to the problem at hand. Linus