Message ID | 20140213214402.GA20711@sergelap |
---|---|
State | New |
Headers | show |
On 13.02.2014 22:44, Serge Hallyn wrote: > To mark a file which exists in the lower layer as deleted, > it creates a symbolic link to a file called "(overlay-whiteout)" > in the writeable mount, and sets a "trusted.overlay" xattr > on that link. > > 1. When the create the symbolic link as container root, not > as the global root Have my problems parsing this. Guess it says: "When the symbolic link is created, it is done as container root, not as the global root." > > 2. Allow root in a container to edit "trusted.overlay*" > xattrs. Generally only global root is allowed to edit > "trusted.*" > > With this patch, I'm able to delete files and directories in a > user-namespace-based overlayfs-backed container. The overlay > writeable layer after "rm ab/ab; rmdir ab; mv xxx yyy;" ends up > looking like: > > ls -l .local/share/lxc/u11/delta0/home/ubuntu/ > total 0 > lrwxrwxrwx 1 150000 150000 18 Feb 13 22:30 ab -> (overlay-whiteout) > lrwxrwxrwx 1 150000 150000 18 Feb 13 22:30 xxx -> (overlay-whiteout) > -rw-rw-r-- 1 151000 151000 0 Feb 13 03:53 yyy > Hm, am I missing something here? I see access rights changed, but would the whiteout link creation not also be in overlayfs code ... somewhere? -Stefan > (with 150000 being the mapped container root) > > (note - the fs/xattr.c hunk could presumably be dropped if we > switched to using "user.overlay". This could however cause > problems with pre-existing overlay deltas.) > > Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com> > --- > fs/overlayfs/dir.c | 9 +++++++-- > fs/xattr.c | 5 ++++- > 2 files changed, 11 insertions(+), 3 deletions(-) > > diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c > index a209409..3c4657b 100644 > --- a/fs/overlayfs/dir.c > +++ b/fs/overlayfs/dir.c > @@ -12,6 +12,7 @@ > #include <linux/xattr.h> > #include <linux/security.h> > #include <linux/cred.h> > +#include <linux/sched.h> > #include "overlayfs.h" > > static const char *ovl_whiteout_symlink = "(overlay-whiteout)"; > @@ -38,8 +39,12 @@ static int ovl_whiteout(struct dentry *upperdir, struct dentry *dentry) > cap_raise(override_cred->cap_effective, CAP_SYS_ADMIN); > cap_raise(override_cred->cap_effective, CAP_DAC_OVERRIDE); > cap_raise(override_cred->cap_effective, CAP_FOWNER); > - override_cred->fsuid = GLOBAL_ROOT_UID; > - override_cred->fsgid = GLOBAL_ROOT_GID; > + override_cred->fsuid = make_kuid(current_user_ns(), 0); > + if (!uid_valid(override_cred->fsuid)) > + override_cred->fsuid = GLOBAL_ROOT_UID; > + override_cred->fsgid = make_kgid(current_user_ns(), 0); > + if (!gid_valid(override_cred->fsgid)) > + override_cred->fsgid = GLOBAL_ROOT_GID; > old_cred = override_creds(override_cred); > > newdentry = lookup_one_len(dentry->d_name.name, upperdir, > diff --git a/fs/xattr.c b/fs/xattr.c > index 3377dff..edd826c 100644 > --- a/fs/xattr.c > +++ b/fs/xattr.c > @@ -52,7 +52,10 @@ xattr_permission(struct inode *inode, const char *name, int mask) > * The trusted.* namespace can only be accessed by privileged users. > */ > if (!strncmp(name, XATTR_TRUSTED_PREFIX, XATTR_TRUSTED_PREFIX_LEN)) { > - if (!capable(CAP_SYS_ADMIN)) > + if (strncmp(name, "trusted.overlay", 15) == 0) { > + if (!inode_capable(inode, CAP_SYS_ADMIN)) > + return (mask & MAY_WRITE) ? -EPERM : -ENODATA; > + } else if (!capable(CAP_SYS_ADMIN)) > return (mask & MAY_WRITE) ? -EPERM : -ENODATA; > return 0; > } >
Quoting Stefan Bader (stefan.bader@canonical.com): > On 13.02.2014 22:44, Serge Hallyn wrote: > > To mark a file which exists in the lower layer as deleted, > > it creates a symbolic link to a file called "(overlay-whiteout)" > > in the writeable mount, and sets a "trusted.overlay" xattr > > on that link. > > > > > 1. When the create the symbolic link as container root, not > > as the global root > > Have my problems parsing this. Guess it says: "When the symbolic link is > created, it is done as container root, not as the global root." Yikes, yeah that's bad. Your interpretation is correct. > > 2. Allow root in a container to edit "trusted.overlay*" > > xattrs. Generally only global root is allowed to edit > > "trusted.*" > > > > With this patch, I'm able to delete files and directories in a > > user-namespace-based overlayfs-backed container. The overlay > > writeable layer after "rm ab/ab; rmdir ab; mv xxx yyy;" ends up > > looking like: > > > > ls -l .local/share/lxc/u11/delta0/home/ubuntu/ > > total 0 > > lrwxrwxrwx 1 150000 150000 18 Feb 13 22:30 ab -> (overlay-whiteout) > > lrwxrwxrwx 1 150000 150000 18 Feb 13 22:30 xxx -> (overlay-whiteout) > > -rw-rw-r-- 1 151000 151000 0 Feb 13 03:53 yyy > > > > Hm, am I missing something here? I see access rights changed, but would the > whiteout link creation not also be in overlayfs code ... somewhere? I'm not sure what you mean. I don't change access rights, but change the owning uid/gid. The whiteout link is indeed created in the overlayfs code, using vfs_symlink. Right before that is done, overlayfs sets an override credential. Currently the override cred is with the global root uid/gid. I'm changing it to be the container root uid/gid. Or maybe you mean the '(overlay-whiteout)' file itself? It doesn't exist, so the deleted files are symlinks to a nonexistent file... -serge
On 17.02.2014 16:51, Serge Hallyn wrote: > Quoting Stefan Bader (stefan.bader@canonical.com): >> On 13.02.2014 22:44, Serge Hallyn wrote: >>> To mark a file which exists in the lower layer as deleted, >>> it creates a symbolic link to a file called "(overlay-whiteout)" >>> in the writeable mount, and sets a "trusted.overlay" xattr >>> on that link. >>> >> >>> 1. When the create the symbolic link as container root, not >>> as the global root >> >> Have my problems parsing this. Guess it says: "When the symbolic link is >> created, it is done as container root, not as the global root." > > Yikes, yeah that's bad. Your interpretation is correct. > >>> 2. Allow root in a container to edit "trusted.overlay*" >>> xattrs. Generally only global root is allowed to edit >>> "trusted.*" >>> >>> With this patch, I'm able to delete files and directories in a >>> user-namespace-based overlayfs-backed container. The overlay >>> writeable layer after "rm ab/ab; rmdir ab; mv xxx yyy;" ends up >>> looking like: >>> >>> ls -l .local/share/lxc/u11/delta0/home/ubuntu/ >>> total 0 >>> lrwxrwxrwx 1 150000 150000 18 Feb 13 22:30 ab -> (overlay-whiteout) >>> lrwxrwxrwx 1 150000 150000 18 Feb 13 22:30 xxx -> (overlay-whiteout) >>> -rw-rw-r-- 1 151000 151000 0 Feb 13 03:53 yyy >>> >> >> Hm, am I missing something here? I see access rights changed, but would the >> whiteout link creation not also be in overlayfs code ... somewhere? > > I'm not sure what you mean. I don't change access rights, but change > the owning uid/gid. The whiteout link is indeed created in the > overlayfs code, using vfs_symlink. Right before that is done, overlayfs > sets an override credential. Currently the override cred is with the > global root uid/gid. I'm changing it to be the container root uid/gid. > > Or maybe you mean the '(overlay-whiteout)' file itself? It doesn't > exist, so the deleted files are symlinks to a nonexistent file... Doh! So for some reason I thought the whiteout process would be added. Probably mis-reading the first paragraph as part of the changes. But its a description of the current behaviour. -Stefan
diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c index a209409..3c4657b 100644 --- a/fs/overlayfs/dir.c +++ b/fs/overlayfs/dir.c @@ -12,6 +12,7 @@ #include <linux/xattr.h> #include <linux/security.h> #include <linux/cred.h> +#include <linux/sched.h> #include "overlayfs.h" static const char *ovl_whiteout_symlink = "(overlay-whiteout)"; @@ -38,8 +39,12 @@ static int ovl_whiteout(struct dentry *upperdir, struct dentry *dentry) cap_raise(override_cred->cap_effective, CAP_SYS_ADMIN); cap_raise(override_cred->cap_effective, CAP_DAC_OVERRIDE); cap_raise(override_cred->cap_effective, CAP_FOWNER); - override_cred->fsuid = GLOBAL_ROOT_UID; - override_cred->fsgid = GLOBAL_ROOT_GID; + override_cred->fsuid = make_kuid(current_user_ns(), 0); + if (!uid_valid(override_cred->fsuid)) + override_cred->fsuid = GLOBAL_ROOT_UID; + override_cred->fsgid = make_kgid(current_user_ns(), 0); + if (!gid_valid(override_cred->fsgid)) + override_cred->fsgid = GLOBAL_ROOT_GID; old_cred = override_creds(override_cred); newdentry = lookup_one_len(dentry->d_name.name, upperdir, diff --git a/fs/xattr.c b/fs/xattr.c index 3377dff..edd826c 100644 --- a/fs/xattr.c +++ b/fs/xattr.c @@ -52,7 +52,10 @@ xattr_permission(struct inode *inode, const char *name, int mask) * The trusted.* namespace can only be accessed by privileged users. */ if (!strncmp(name, XATTR_TRUSTED_PREFIX, XATTR_TRUSTED_PREFIX_LEN)) { - if (!capable(CAP_SYS_ADMIN)) + if (strncmp(name, "trusted.overlay", 15) == 0) { + if (!inode_capable(inode, CAP_SYS_ADMIN)) + return (mask & MAY_WRITE) ? -EPERM : -ENODATA; + } else if (!capable(CAP_SYS_ADMIN)) return (mask & MAY_WRITE) ? -EPERM : -ENODATA; return 0; }
To mark a file which exists in the lower layer as deleted, it creates a symbolic link to a file called "(overlay-whiteout)" in the writeable mount, and sets a "trusted.overlay" xattr on that link. 1. When the create the symbolic link as container root, not as the global root 2. Allow root in a container to edit "trusted.overlay*" xattrs. Generally only global root is allowed to edit "trusted.*" With this patch, I'm able to delete files and directories in a user-namespace-based overlayfs-backed container. The overlay writeable layer after "rm ab/ab; rmdir ab; mv xxx yyy;" ends up looking like: ls -l .local/share/lxc/u11/delta0/home/ubuntu/ total 0 lrwxrwxrwx 1 150000 150000 18 Feb 13 22:30 ab -> (overlay-whiteout) lrwxrwxrwx 1 150000 150000 18 Feb 13 22:30 xxx -> (overlay-whiteout) -rw-rw-r-- 1 151000 151000 0 Feb 13 03:53 yyy (with 150000 being the mapped container root) (note - the fs/xattr.c hunk could presumably be dropped if we switched to using "user.overlay". This could however cause problems with pre-existing overlay deltas.) Signed-off-by: Serge Hallyn <serge.hallyn@ubuntu.com> --- fs/overlayfs/dir.c | 9 +++++++-- fs/xattr.c | 5 ++++- 2 files changed, 11 insertions(+), 3 deletions(-)