Message ID | 171815791109.14261.10223988071271993465@noble.neil.brown.name |
---|---|
State | Handled Elsewhere |
Series | VFS: generate FS_CREATE before FS_OPEN when ->atomic_open used. |
On Wed, Jun 12, 2024 at 12:05:11PM +1000, NeilBrown wrote:

> For finish_open() there are three cases:
>  - finish_open is used in ->atomic_open handlers.  For these we add a
>    call to fsnotify_open() in do_open() if FMODE_OPENED is set - which
>    means do_dentry_open() has been called. This happens after fsnotify_create().

Hummm....  There's a bit of behaviour change; in case we fail in
may_open(), we used to get fsnotify_open()+fsnotify_close() and with that
patch we'd get fsnotify_close() alone.

IF we don't care about that, we might as well take fsnotify_open()
out of vfs_open() and, for do_open()/do_tmpfile()/do_o_path(), into
path_openat() itself.  I mean, having
	if (likely(!error)) {
		if (likely(file->f_mode & FMODE_OPENED)) {
			fsnotify_open(file);
			return file;
		}
in there would be a lot easier to follow...  It would lose fsnotify_open()
in a few more failure exits, but if we don't give a damn about having it
paired with fsnotify_close()...

On Wed, 12 Jun 2024, Al Viro wrote:
> On Wed, Jun 12, 2024 at 12:05:11PM +1000, NeilBrown wrote:
>
> > For finish_open() there are three cases:
> >  - finish_open is used in ->atomic_open handlers.  For these we add a
> >    call to fsnotify_open() in do_open() if FMODE_OPENED is set - which
> >    means do_dentry_open() has been called. This happens after fsnotify_create().
>
> Hummm....  There's a bit of behaviour change; in case we fail in
> may_open(), we used to get fsnotify_open()+fsnotify_close() and with that
> patch we'd get fsnotify_close() alone.

True.  Presumably we could fix that by doing

diff --git a/fs/namei.c b/fs/namei.c
index 37fb0a8aa09a..6fd04c9046fa 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3645,6 +3645,8 @@ static int do_open(struct nameidata *nd,
 			return error;
 		do_truncate = true;
 	}
+	if (file->f_mode & FMODE_OPENED)
+		fsnotify_open(file);
 	error = may_open(idmap, &nd->path, acc_mode, open_flag);
 	if (!error && !(file->f_mode & FMODE_OPENED))
 		error = vfs_open(&nd->path, file);
@@ -3702,6 +3704,7 @@ int vfs_tmpfile(struct mnt_idmap *idmap,
 	dput(child);
 	if (error)
 		return error;
+	fsnotify_open(file);
 	/* Don't check for other permissions, the inode was just created */
 	error = may_open(idmap, &file->f_path, 0, file->f_flags);
 	if (error)

instead, but it seems a little weird sending an OPEN notification if
may_open() fails.

>
> IF we don't care about that, we might as well take fsnotify_open()
> out of vfs_open() and, for do_open()/do_tmpfile()/do_o_path(), into
> path_openat() itself.  I mean, having
> 	if (likely(!error)) {
> 		if (likely(file->f_mode & FMODE_OPENED)) {
> 			fsnotify_open(file);
> 			return file;
> 		}
> in there would be a lot easier to follow...  It would lose fsnotify_open()
> in a few more failure exits, but if we don't give a damn about having it
> paired with fsnotify_close()...
>

Should we have fsnotify_open() set a new ->f_mode flag, and
fsnotify_close() abort if it isn't set (and clear it if it is)?
Then we would be guaranteed a balance - which does seem like a good
idea.

Thanks,
NeilBrown

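The flag-based pairing suggested above can be modelled in userspace as a rough sketch. Everything here is hypothetical: the struct, the MODEL_FMODE_* bits and the model_* helpers are stand-ins invented for illustration, not kernel definitions, and the counters only exist so the balance can be observed.

```c
#include <assert.h>

/* Hypothetical flag bits for this model; not the kernel's real values. */
#define MODEL_FMODE_OPENED   0x1u
#define MODEL_FMODE_NOTIFIED 0x2u	/* the proposed "OPEN was sent" flag */

struct model_file {
	unsigned int f_mode;
	int opens_sent;
	int closes_sent;
};

/* Model of fsnotify_open(): send OPEN and remember that we did. */
static void model_fsnotify_open(struct model_file *f)
{
	f->opens_sent++;
	f->f_mode |= MODEL_FMODE_NOTIFIED;
}

/* Model of fsnotify_close(): abort unless OPEN was sent, then clear the flag. */
static void model_fsnotify_close(struct model_file *f)
{
	if (!(f->f_mode & MODEL_FMODE_NOTIFIED))
		return;
	f->f_mode &= ~MODEL_FMODE_NOTIFIED;
	f->closes_sent++;
	/* Invariant of the scheme: never more CLOSE events than OPEN events. */
	assert(f->closes_sent <= f->opens_sent);
}

/*
 * Model of one open attempt followed by the final fput().
 * may_open_fails != 0 simulates an error before OPEN is sent.
 */
static void model_open_then_fput(struct model_file *f, int may_open_fails)
{
	f->f_mode |= MODEL_FMODE_OPENED;	/* ->atomic_open already opened it */
	if (!may_open_fails)
		model_fsnotify_open(f);
	model_fsnotify_close(f);
}
```

In this model a failed open sends neither event and a successful one sends exactly one of each, which is the balance being asked for. The cost is one extra bit in the mode word and a test on every close.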
On Wed, Jun 12, 2024 at 12:55:40PM +1000, NeilBrown wrote:

> > IF we don't care about that, we might as well take fsnotify_open()
> > out of vfs_open() and, for do_open()/do_tmpfile()/do_o_path(), into
> > path_openat() itself.  I mean, having
> > 	if (likely(!error)) {
> > 		if (likely(file->f_mode & FMODE_OPENED)) {
> > 			fsnotify_open(file);
> > 			return file;
> > 		}
> > in there would be a lot easier to follow...  It would lose fsnotify_open()
> > in a few more failure exits, but if we don't give a damn about having it
> > paired with fsnotify_close()...
> >
>
> Should we have fsnotify_open() set a new ->f_mode flag, and
> fsnotify_close() abort if it isn't set (and clear it if it is)?
> Then we would be guaranteed a balance - which does seem like a good
> idea.

Umm...  In that case, I would rather have FMODE_NONOTIFY set just before
the fput() in path_openat() - no need to grab another flag from ->f_mode
(not a lot of unused ones there) and no need to add any overhead on
the fast path.

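As a rough userspace sketch of the alternative described above: the failure path marks the file as "do not notify" just before dropping it, so no CLOSE is sent for an open that never produced an OPEN. The struct, the MODEL_FMODE_* bits and the model_* functions are hypothetical names made up for this illustration; only the idea of setting FMODE_NONOTIFY before the fput() comes from the message.

```c
#include <assert.h>

/* Hypothetical flag bits for this model; not the kernel's real values. */
#define MODEL_FMODE_OPENED   0x1u
#define MODEL_FMODE_NONOTIFY 0x2u	/* stands in for FMODE_NONOTIFY */

struct model_file {
	unsigned int f_mode;
	int opens_sent;
	int closes_sent;
};

/* Model of the final fput(): CLOSE is sent for opened files unless muted. */
static void model_fput(struct model_file *f)
{
	if (!(f->f_mode & MODEL_FMODE_OPENED))
		return;
	if (f->f_mode & MODEL_FMODE_NONOTIFY)
		return;
	f->closes_sent++;
	/* Invariant: never more CLOSE events than OPEN events. */
	assert(f->closes_sent <= f->opens_sent);
}

/*
 * Model of path_openat(): error != 0 simulates a failure after the file
 * was already opened (FMODE_OPENED set) but before OPEN was sent.
 */
static int model_path_openat(struct model_file *f, int error)
{
	f->f_mode |= MODEL_FMODE_OPENED;
	if (!error) {
		f->opens_sent++;		/* OPEN only on success */
		return 0;
	}
	f->f_mode |= MODEL_FMODE_NONOTIFY;	/* mute CLOSE for the failed open */
	model_fput(f);
	return error;
}
```

The success path in this model does no extra work: the additional store happens only on failure, which matches the stated goal of keeping overhead off the fast path and reusing an existing flag.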
On Wed, 12 Jun 2024, Al Viro wrote:
> On Wed, Jun 12, 2024 at 12:55:40PM +1000, NeilBrown wrote:
>
> > > IF we don't care about that, we might as well take fsnotify_open()
> > > out of vfs_open() and, for do_open()/do_tmpfile()/do_o_path(), into
> > > path_openat() itself.  I mean, having
> > > 	if (likely(!error)) {
> > > 		if (likely(file->f_mode & FMODE_OPENED)) {
> > > 			fsnotify_open(file);
> > > 			return file;
> > > 		}
> > > in there would be a lot easier to follow...  It would lose fsnotify_open()
> > > in a few more failure exits, but if we don't give a damn about having it
> > > paired with fsnotify_close()...
> > >
> >
> > Should we have fsnotify_open() set a new ->f_mode flag, and
> > fsnotify_close() abort if it isn't set (and clear it if it is)?
> > Then we would be guaranteed a balance - which does seem like a good
> > idea.
>
> Umm...  In that case, I would rather have FMODE_NONOTIFY set just before
> the fput() in path_openat() - no need to grab another flag from ->f_mode
> (not a lot of unused ones there) and no need to add any overhead on
> the fast path.
>

Unfortunately that gets messy if handle_truncate() fails.  We would need
to delay the fsnotify_open() until after truncate, which means moving it
out of vfs_open() or maybe calling do_dentry_open() directly from
do_open() - neither of which I like.

I think it is best to stick with "if FMODE_OPENED is set, then we call
fsnotify_open() even if the open will fail", and only move the place
where fsnotify_open() is called.

BTW I was wrong about gfs2.  Closer inspection of the code shows that
finish_open() is only called in the ->atomic_open case.

Thanks,
NeilBrown

diff --git a/fs/namei.c b/fs/namei.c
index 37fb0a8aa09a..32031feaf6b6 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3646,8 +3646,12 @@ static int do_open(struct nameidata *nd,
 		do_truncate = true;
 	}
 	error = may_open(idmap, &nd->path, acc_mode, open_flag);
-	if (!error && !(file->f_mode & FMODE_OPENED))
-		error = vfs_open(&nd->path, file);
+	if (!error) {
+		if (file->f_mode & FMODE_OPENED)
+			fsnotify_open(file);
+		else
+			error = vfs_open(&nd->path, file);
+	}
 	if (!error)
 		error = security_file_post_open(file, op->acc_mode);
 	if (!error && do_truncate)
@@ -3706,6 +3710,7 @@ int vfs_tmpfile(struct mnt_idmap *idmap,
 	error = may_open(idmap, &file->f_path, 0, file->f_flags);
 	if (error)
 		return error;
+	fsnotify_open(file);
 	inode = file_inode(file);
 	if (!(open_flag & O_EXCL)) {
 		spin_lock(&inode->i_lock);
diff --git a/fs/open.c b/fs/open.c
index 89cafb572061..970f299c0e77 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -1004,11 +1004,6 @@ static int do_dentry_open(struct file *f,
 		}
 	}
 
-	/*
-	 * Once we return a file with FMODE_OPENED, __fput() will call
-	 * fsnotify_close(), so we need fsnotify_open() here for symmetry.
-	 */
-	fsnotify_open(f);
 	return 0;
 
 cleanup_all:
@@ -1085,8 +1080,17 @@ EXPORT_SYMBOL(file_path);
  */
 int vfs_open(const struct path *path, struct file *file)
 {
+	int ret;
+
 	file->f_path = *path;
-	return do_dentry_open(file, NULL);
+	ret = do_dentry_open(file, NULL);
+	if (!ret)
+		/*
+		 * Once we return a file with FMODE_OPENED, __fput() will call
+		 * fsnotify_close(), so we need fsnotify_open() here for symmetry.
+		 */
+		fsnotify_open(file);
+	return ret;
 }
 
 struct file *dentry_open(const struct path *path, int flags,
@@ -1178,7 +1182,8 @@ struct file *kernel_file_open(const struct path *path, int flags,
 	if (error) {
 		fput(f);
 		f = ERR_PTR(error);
-	}
+	} else
+		fsnotify_open(f);
 	return f;
 }
 EXPORT_SYMBOL_GPL(kernel_file_open);

When a file is opened and created with open(..., O_CREAT) we get
both the CREATE and OPEN fsnotify events and would expect them in that
order.  For most filesystems we get them in that order because
open_last_lookups() calls fsnotify_create() and then do_open() (from
path_openat()) calls vfs_open()->do_dentry_open() which calls
fsnotify_open().

However when ->atomic_open is used, the
   do_dentry_open() -> fsnotify_open()
call happens from finish_open() which is called from the ->atomic_open
handler in lookup_open() which is called *before* open_last_lookups()
calls fsnotify_create().  So we get the "open" notification before
"create" - which is backwards.  ltp testcase inotify02 tests this and
reports the inconsistency.

This patch lifts the fsnotify_open() call out of do_dentry_open() and
places it higher up the call stack.  There are three callers of
do_dentry_open().

For vfs_open() and kernel_file_open() the fsnotify_open() is placed
directly in that caller so there should be no behavioural change.

For finish_open() there are three cases:
 - finish_open is used in ->atomic_open handlers.  For these we add a
   call to fsnotify_open() in do_open() if FMODE_OPENED is set - which
   means do_dentry_open() has been called. This happens after
   fsnotify_create().
 - finish_open is used in ->tmpfile() handlers.  For these a call to
   fsnotify_open() is added to vfs_tmpfile()
 - finish_open is used in gfs2_create_inode() which is used for
   atomic_open, but also for gfs2_create() which is a ->create handler
   and is not expected to open the file.  So losing the fsnotify_open()
   call in this case seems correct.

With this patch NFSv3 is restored to its previous behaviour (before
->atomic_open support was added) of generating CREATE notifications
before OPEN, and NFSv4 now has that same correct ordering that it has
not had before.  I haven't tested other filesystems.

Fixes: 7c6c5249f061 ("NFS: add atomic_open for NFSv3 to handle O_TRUNC correctly.")
Reported-by: James Clark <james.clark@arm.com>
Closes: https://lore.kernel.org/all/01c3bf2e-eb1f-4b7f-a54f-d2a05dd3d8c8@arm.com
Signed-off-by: NeilBrown <neilb@suse.de>
---
 fs/namei.c |  9 +++++++--
 fs/open.c  | 19 ++++++++++++-------
 2 files changed, 19 insertions(+), 9 deletions(-)