Message ID | 20210630190001.6112-3-gpiccoli@canonical.com |
---|---|
State | New |
Headers | show |
Series | Kernel oops due to uninitialized list on kernfs | expand |
On Wed, Jun 30, 2021 at 04:00:01PM -0300, Guilherme G. Piccoli wrote: > From: Al Viro <viro@zeniv.linux.org.uk> > > BugLink: https://bugs.launchpad.net/bugs/1934175 > > new_sb is left uninitialized in case of early failures in kernfs_mount_ns(), > and while IS_ERR(root) is true in all such cases, using IS_ERR(root) || !new_sb > is not a solution - IS_ERR(root) is true in some cases when new_sb is true. > > Make sure new_sb is initialized (and matches the reality) in all cases and > fix the condition for dropping kobj reference - we want it done precisely > in those situations where the reference has not been transferred into a new > super_block instance. > > Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> > (cherry picked from commit 7b745a4e4051e1bbce40e0b1c2cf636c70583aa4) > Signed-off-by: Guilherme G. Piccoli <gpiccoli@canonical.com> > --- > fs/sysfs/mount.c | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) > > diff --git a/fs/sysfs/mount.c b/fs/sysfs/mount.c > index fb49510c5dcf..88b388415d0e 100644 > --- a/fs/sysfs/mount.c > +++ b/fs/sysfs/mount.c > @@ -28,7 +28,7 @@ static struct dentry *sysfs_mount(struct file_system_type *fs_type, > { > struct dentry *root; > void *ns; > - bool new_sb; > + bool new_sb = false; > > if (!(flags & SB_KERNMOUNT)) { > if (!kobj_ns_current_may_mount(KOBJ_NS_TYPE_NET)) > @@ -38,9 +38,9 @@ static struct dentry *sysfs_mount(struct file_system_type *fs_type, > ns = kobj_ns_grab_current(KOBJ_NS_TYPE_NET); > root = kernfs_mount_ns(fs_type, flags, sysfs_root, > SYSFS_MAGIC, &new_sb, ns); > - if (IS_ERR(root) || !new_sb) > + if (!new_sb) > kobj_ns_drop(KOBJ_NS_TYPE_NET, ns); > - else if (new_sb) > + else if (!IS_ERR(root)) > root->d_sb->s_iflags |= SB_I_USERNS_VISIBLE; > > return root; > -- > 2.31.1 I still need to wrap my head around sget_userns (called from kernfs_mount_ns), but it looks like we can get away with dropping the kobj_ns reference here when this is not a new superblock, because there will be a reference to this "old" superblock, which will prevent it from being killed, and we only drop the kobj_ns on sysfs_kill_sb (defined below sysfs_mount) when all superblock references are done with. So we only need to keep the kobj_ns reference in the case of new_sb. However, if this is a new_sb, but we return an error (fs/kernfs/mount.c:340), with the old code, we would drop the kobj_ns reference, as expected. With this patch, it would not drop it, leading to a kernel leak. This code has been totally revamped after that change, with commit 23bf1b6be9c2 ("kernfs, sysfs, cgroup, intel_rdt: Support fs_context"), which means that whatever bug this is fixing needs to be fixed specifically for our 4.15 kernel (and upstream 4.14.y, and v4.19.y, it seems). Cascardo.
Cascardo, thanks a lot for the deep analysis! The first commit in the series was the real fix, and you acked it! So, we're fine. Regarding this one, honestly I always had a terrible time with VFS commits due to the complete lack of good commit messages from Al plus the complexity of the code. This one, for me seemed a very straightforward fix - I've also asked feedback in linux-stable (from Al himself!) and the patch was accepted for 4.14.y . You seem to have a good understanding of this area, so I'd suggest you to reply [0] with your analysis to prevent it to get added on stable 4.14.y - and if it is added there, we can either remove it later or just not merge this stable fix in our kernels. Cheers, Guilherme [0] https://lore.kernel.org/stable/20210628143628.33342-64-sashal@kernel.org/
On Thu, Jul 01, 2021 at 09:32:48AM -0300, Thadeu Lima de Souza Cascardo wrote: > On Wed, Jun 30, 2021 at 04:00:01PM -0300, Guilherme G. Piccoli wrote: > > From: Al Viro <viro@zeniv.linux.org.uk> > > > > BugLink: https://bugs.launchpad.net/bugs/1934175 > > > > new_sb is left uninitialized in case of early failures in kernfs_mount_ns(), > > and while IS_ERR(root) is true in all such cases, using IS_ERR(root) || !new_sb > > is not a solution - IS_ERR(root) is true in some cases when new_sb is true. > > > > Make sure new_sb is initialized (and matches the reality) in all cases and > > fix the condition for dropping kobj reference - we want it done precisely > > in those situations where the reference has not been transferred into a new > > super_block instance. > > > > Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> > > (cherry picked from commit 7b745a4e4051e1bbce40e0b1c2cf636c70583aa4) > > Signed-off-by: Guilherme G. Piccoli <gpiccoli@canonical.com> > > --- > > fs/sysfs/mount.c | 6 +++--- > > 1 file changed, 3 insertions(+), 3 deletions(-) > > > > diff --git a/fs/sysfs/mount.c b/fs/sysfs/mount.c > > index fb49510c5dcf..88b388415d0e 100644 > > --- a/fs/sysfs/mount.c > > +++ b/fs/sysfs/mount.c > > @@ -28,7 +28,7 @@ static struct dentry *sysfs_mount(struct file_system_type *fs_type, > > { > > struct dentry *root; > > void *ns; > > - bool new_sb; > > + bool new_sb = false; > > > > if (!(flags & SB_KERNMOUNT)) { > > if (!kobj_ns_current_may_mount(KOBJ_NS_TYPE_NET)) > > @@ -38,9 +38,9 @@ static struct dentry *sysfs_mount(struct file_system_type *fs_type, > > ns = kobj_ns_grab_current(KOBJ_NS_TYPE_NET); > > root = kernfs_mount_ns(fs_type, flags, sysfs_root, > > SYSFS_MAGIC, &new_sb, ns); > > - if (IS_ERR(root) || !new_sb) > > + if (!new_sb) > > kobj_ns_drop(KOBJ_NS_TYPE_NET, ns); > > - else if (new_sb) > > + else if (!IS_ERR(root)) > > root->d_sb->s_iflags |= SB_I_USERNS_VISIBLE; > > > > return root; > > -- > > 2.31.1 > > I still need to wrap my head around sget_userns (called from > kernfs_mount_ns), but it looks like we can get away with dropping the > kobj_ns reference here when this is not a new superblock, because there > will be a reference to this "old" superblock, which will prevent it from > being killed, and we only drop the kobj_ns on sysfs_kill_sb (defined below > sysfs_mount) when all superblock references are done with. > > So we only need to keep the kobj_ns reference in the case of new_sb. > > However, if this is a new_sb, but we return an error > (fs/kernfs/mount.c:340), with the old code, we would drop the kobj_ns > reference, as expected. With this patch, it would not drop it, leading to a > kernel leak. > It was called to my attention that we drop the reference by calling kill_sb from deactive_locked_super, which is called right before returning the error. And this was the motivation for the fix in the first place, which somehow did not seem clear from the commit message. So, we have: 1) early failure on kernfs_mount_ns, new_sb is false, deactivate_locked_super has not been called, kill_sb will not be called: we need to drop the reference. 2) only other failure on kernfs_mount_ns is when new_sb is true and deactive_locked_super has been called: no need to drop the reference. 3) success on kernfs_mount_ns, either new_sb is set or not, we want to drop when new_sb is false to avoid a leak, because we only grabbed an extra reference on struct super_block s_active member. With that, Acked-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> > This code has been totally revamped after that change, with commit > 23bf1b6be9c2 ("kernfs, sysfs, cgroup, intel_rdt: Support fs_context"), > which means that whatever bug this is fixing needs to be fixed specifically > for our 4.15 kernel (and upstream 4.14.y, and v4.19.y, it seems). > > Cascardo. > > -- > kernel-team mailing list > kernel-team@lists.ubuntu.com > https://lists.ubuntu.com/mailman/listinfo/kernel-team
On Thu, Jul 1, 2021 at 12:09 PM Thadeu Lima de Souza Cascardo <cascardo@canonical.com> wrote: > > On Thu, Jul 01, 2021 at 09:32:48AM -0300, Thadeu Lima de Souza Cascardo wrote: > > On Wed, Jun 30, 2021 at 04:00:01PM -0300, Guilherme G. Piccoli wrote: > > > From: Al Viro <viro@zeniv.linux.org.uk> > > > > > > BugLink: https://bugs.launchpad.net/bugs/1934175 > > > > > > new_sb is left uninitialized in case of early failures in kernfs_mount_ns(), > > > and while IS_ERR(root) is true in all such cases, using IS_ERR(root) || !new_sb > > > is not a solution - IS_ERR(root) is true in some cases when new_sb is true. > > > > > > Make sure new_sb is initialized (and matches the reality) in all cases and > > > fix the condition for dropping kobj reference - we want it done precisely > > > in those situations where the reference has not been transferred into a new > > > super_block instance. > > > > > > Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> > > > (cherry picked from commit 7b745a4e4051e1bbce40e0b1c2cf636c70583aa4) > > > Signed-off-by: Guilherme G. Piccoli <gpiccoli@canonical.com> > > > --- > > > fs/sysfs/mount.c | 6 +++--- > > > 1 file changed, 3 insertions(+), 3 deletions(-) > > > > > > diff --git a/fs/sysfs/mount.c b/fs/sysfs/mount.c > > > index fb49510c5dcf..88b388415d0e 100644 > > > --- a/fs/sysfs/mount.c > > > +++ b/fs/sysfs/mount.c > > > @@ -28,7 +28,7 @@ static struct dentry *sysfs_mount(struct file_system_type *fs_type, > > > { > > > struct dentry *root; > > > void *ns; > > > - bool new_sb; > > > + bool new_sb = false; > > > > > > if (!(flags & SB_KERNMOUNT)) { > > > if (!kobj_ns_current_may_mount(KOBJ_NS_TYPE_NET)) > > > @@ -38,9 +38,9 @@ static struct dentry *sysfs_mount(struct file_system_type *fs_type, > > > ns = kobj_ns_grab_current(KOBJ_NS_TYPE_NET); > > > root = kernfs_mount_ns(fs_type, flags, sysfs_root, > > > SYSFS_MAGIC, &new_sb, ns); > > > - if (IS_ERR(root) || !new_sb) > > > + if (!new_sb) > > > kobj_ns_drop(KOBJ_NS_TYPE_NET, ns); > > > - else if (new_sb) > > > + else if (!IS_ERR(root)) > > > root->d_sb->s_iflags |= SB_I_USERNS_VISIBLE; > > > > > > return root; > > > -- > > > 2.31.1 > > > > I still need to wrap my head around sget_userns (called from > > kernfs_mount_ns), but it looks like we can get away with dropping the > > kobj_ns reference here when this is not a new superblock, because there > > will be a reference to this "old" superblock, which will prevent it from > > being killed, and we only drop the kobj_ns on sysfs_kill_sb (defined below > > sysfs_mount) when all superblock references are done with. > > > > So we only need to keep the kobj_ns reference in the case of new_sb. > > > > However, if this is a new_sb, but we return an error > > (fs/kernfs/mount.c:340), with the old code, we would drop the kobj_ns > > reference, as expected. With this patch, it would not drop it, leading to a > > kernel leak. > > > > It was called to my attention that we drop the reference by calling kill_sb > from deactive_locked_super, which is called right before returning the error. > And this was the motivation for the fix in the first place, which somehow did > not seem clear from the commit message. > > So, we have: > > 1) early failure on kernfs_mount_ns, new_sb is false, deactivate_locked_super > has not been called, kill_sb will not be called: we need to drop the reference. > > 2) only other failure on kernfs_mount_ns is when new_sb is true and > deactive_locked_super has been called: no need to drop the reference. > > 3) success on kernfs_mount_ns, either new_sb is set or not, we want to drop > when new_sb is false to avoid a leak, because we only grabbed an extra > reference on struct super_block s_active member. > > With that, > > Acked-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com> > Thanks again, very informative analysis!
diff --git a/fs/sysfs/mount.c b/fs/sysfs/mount.c index fb49510c5dcf..88b388415d0e 100644 --- a/fs/sysfs/mount.c +++ b/fs/sysfs/mount.c @@ -28,7 +28,7 @@ static struct dentry *sysfs_mount(struct file_system_type *fs_type, { struct dentry *root; void *ns; - bool new_sb; + bool new_sb = false; if (!(flags & SB_KERNMOUNT)) { if (!kobj_ns_current_may_mount(KOBJ_NS_TYPE_NET)) @@ -38,9 +38,9 @@ static struct dentry *sysfs_mount(struct file_system_type *fs_type, ns = kobj_ns_grab_current(KOBJ_NS_TYPE_NET); root = kernfs_mount_ns(fs_type, flags, sysfs_root, SYSFS_MAGIC, &new_sb, ns); - if (IS_ERR(root) || !new_sb) + if (!new_sb) kobj_ns_drop(KOBJ_NS_TYPE_NET, ns); - else if (new_sb) + else if (!IS_ERR(root)) root->d_sb->s_iflags |= SB_I_USERNS_VISIBLE; return root;