Message ID | 1316633283.2176.14.camel@adamo |
---|---|
State | New |
Headers | show |
On 09/21/2011 01:28 PM, Leann Ogasawara wrote: > Hi All, > > BugLink: http://bugs.launchpad.net/bugs/843892 > > == SRU Justification == > It's been reported that destroying a container causes a kernel Oops and > will hang the system. The issue is reproducible. The user has > successfully tested the patch against Oneiric and can confirm the Oops > no longer occurs when using a patched Oneiric kernel. The patch has > been submitted upstream (CC'd upstream stable) and is currently queued > in the -mm tree. It also appears it will hit the 3.2 merge window. > Please consider for SRU against Oneiric and Natty. > > == Impact == > The commit message of the patch notes that this will likely affect > 2.6.26 and newer kernels, ie affects Lucid, Maverick, Natty, Oneiric. > However, due to the nature of our SRU process, the bug reporter is > likely only able to readily test Natty and Oneiric. Thus I'm only > submitting this for SRU against Oneiric and Natty. > > == Test Case == > See reproducer in comment #6 of bug report > > Thanks, > Leann > > From ba1f54e39f8ec3a7db00dd2844dc29988638c5a5 Mon Sep 17 00:00:00 2001 > From: Alex Bligh<alex@alex.org.uk> > Date: Wed, 14 Sep 2011 13:43:36 -0700 > Subject: [PATCH] net/netfilter/nf_conntrack_netlink.c: fix Oops on container destroy > > BugLink: http://bugs.launchpad.net/bugs/843892 > > Problem: > > A repeatable Oops can be caused if a container with networking > unshared is destroyed when it has nf_conntrack entries yet to expire. > > A copy of the oops follows below. A perl program generating the oops > repeatably is attached inline below. > > Analysis: > > The oops is called from cleanup_net when the namespace is > destroyed. conntrack iterates through outstanding events and calls > death_by_timeout on each of them, which in turn produces a call to > ctnetlink_conntrack_event. This calls nf_netlink_has_listeners, which > oopses because net->nfnl is NULL. > > The perl program generates the container through fork() then > clone(NS_NEWNET). I does not explicitly set up netlink > explicitly set up netlink, but I presume it was set up else net->nfnl > would have been NULL earlier (i.e. when an earlier connection > timed out). This would thus suggest that net->nfnl is made NULL > during the destruction of the container, which I think is done by > nfnetlink_net_exit_batch. > > I can see that the various subsystems are deinitialised in the opposite > order to which the relevant register_pernet_subsys calls are called, > and both nf_conntrack and nfnetlink_net_ops register their relevant > subsystems. If nfnetlink_net_ops registered later than nfconntrack, > then its exit routine would have been called first, which would cause > the oops described. I am not sure there is anything to prevent this > happening in a container environment. > > Whilst there's perhaps a more complex problem revolving around ordering > of subsystem deinit, it seems to me that missing a netlink event on a > container that is dying is not a disaster. An early check for net->nfnl > being non-NULL in ctnetlink_conntrack_event appears to fix this. There > may remain a potential race condition if it becomes NULL immediately > after being checked (I am not sure any lock is held at this point or > how synchronisation for subsystem deinitialization works). > > Patch: > > The patch attached should apply on everything from 2.6.26 (if not before) > onwards; it appears to be a problem on all kernels. This was taken against > Ubuntu-3.0.0-11.17 which is very close to 3.0.4. I have torture-tested it > with the above perl script for 15 minutes or so; the perl script hung the > machine within 20 seconds without this patch. > > Applicability: > > If this is the right solution, it should be applied to all stable kernels > as well as head. Apart from the minor overhead of checking one variable > against NULL, it can never 'do the wrong thing', because if net->nfnl > is NULL, an oops will inevitably result. Therefore, checking is a reasonable > thing to do unless it can be proven than net->nfnl will never be NULL. > > Check net->nfnl for NULL in ctnetlink_conntrack_event to avoid Oops on > container destroy > > Signed-off-by: Alex Bligh<alex@alex.org.uk> > Cc: Patrick McHardy<kaber@trash.net> > Cc: David Miller<davem@davemloft.net> > Cc:<stable@kernel.org> > Signed-off-by: Andrew Morton<akpm@linux-foundation.org> > (applied from -mm http://marc.info/?l=linux-mm-commits&m=131603308900694&w=2) > > Signed-off-by: Leann Ogasawara<leann.ogasawara@canonical.com> > --- > net/netfilter/nf_conntrack_netlink.c | 5 +++++ > 1 files changed, 5 insertions(+), 0 deletions(-) > > diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c > index 482e90c..0790d0a 100644 > --- a/net/netfilter/nf_conntrack_netlink.c > +++ b/net/netfilter/nf_conntrack_netlink.c > @@ -570,6 +570,11 @@ ctnetlink_conntrack_event(unsigned int events, struct nf_ct_event *item) > return 0; > > net = nf_ct_net(ct); > + > + /* container deinit, netlink may have died before death_by_timeout */ > + if (!net->nfnl) > + return 0; > + > if (!item->report&& !nfnetlink_has_listeners(net, group)) > return 0; > Acked-by: Tim Gardner <tim.gardner@canonical.com>
Applied to Oneiric master-next and Natty master-next. Thanks, Leann On Wed, 2011-09-21 at 12:28 -0700, Leann Ogasawara wrote: > Hi All, > > BugLink: http://bugs.launchpad.net/bugs/843892 > > == SRU Justification == > It's been reported that destroying a container causes a kernel Oops and > will hang the system. The issue is reproducible. The user has > successfully tested the patch against Oneiric and can confirm the Oops > no longer occurs when using a patched Oneiric kernel. The patch has > been submitted upstream (CC'd upstream stable) and is currently queued > in the -mm tree. It also appears it will hit the 3.2 merge window. > Please consider for SRU against Oneiric and Natty. > > == Impact == > The commit message of the patch notes that this will likely affect > 2.6.26 and newer kernels, ie affects Lucid, Maverick, Natty, Oneiric. > However, due to the nature of our SRU process, the bug reporter is > likely only able to readily test Natty and Oneiric. Thus I'm only > submitting this for SRU against Oneiric and Natty. > > == Test Case == > See reproducer in comment #6 of bug report > > Thanks, > Leann > > From ba1f54e39f8ec3a7db00dd2844dc29988638c5a5 Mon Sep 17 00:00:00 2001 > From: Alex Bligh <alex@alex.org.uk> > Date: Wed, 14 Sep 2011 13:43:36 -0700 > Subject: [PATCH] net/netfilter/nf_conntrack_netlink.c: fix Oops on container destroy > > BugLink: http://bugs.launchpad.net/bugs/843892 > > Problem: > > A repeatable Oops can be caused if a container with networking > unshared is destroyed when it has nf_conntrack entries yet to expire. > > A copy of the oops follows below. A perl program generating the oops > repeatably is attached inline below. > > Analysis: > > The oops is called from cleanup_net when the namespace is > destroyed. conntrack iterates through outstanding events and calls > death_by_timeout on each of them, which in turn produces a call to > ctnetlink_conntrack_event. This calls nf_netlink_has_listeners, which > oopses because net->nfnl is NULL. > > The perl program generates the container through fork() then > clone(NS_NEWNET). I does not explicitly set up netlink > explicitly set up netlink, but I presume it was set up else net->nfnl > would have been NULL earlier (i.e. when an earlier connection > timed out). This would thus suggest that net->nfnl is made NULL > during the destruction of the container, which I think is done by > nfnetlink_net_exit_batch. > > I can see that the various subsystems are deinitialised in the opposite > order to which the relevant register_pernet_subsys calls are called, > and both nf_conntrack and nfnetlink_net_ops register their relevant > subsystems. If nfnetlink_net_ops registered later than nfconntrack, > then its exit routine would have been called first, which would cause > the oops described. I am not sure there is anything to prevent this > happening in a container environment. > > Whilst there's perhaps a more complex problem revolving around ordering > of subsystem deinit, it seems to me that missing a netlink event on a > container that is dying is not a disaster. An early check for net->nfnl > being non-NULL in ctnetlink_conntrack_event appears to fix this. There > may remain a potential race condition if it becomes NULL immediately > after being checked (I am not sure any lock is held at this point or > how synchronisation for subsystem deinitialization works). > > Patch: > > The patch attached should apply on everything from 2.6.26 (if not before) > onwards; it appears to be a problem on all kernels. This was taken against > Ubuntu-3.0.0-11.17 which is very close to 3.0.4. I have torture-tested it > with the above perl script for 15 minutes or so; the perl script hung the > machine within 20 seconds without this patch. > > Applicability: > > If this is the right solution, it should be applied to all stable kernels > as well as head. Apart from the minor overhead of checking one variable > against NULL, it can never 'do the wrong thing', because if net->nfnl > is NULL, an oops will inevitably result. Therefore, checking is a reasonable > thing to do unless it can be proven than net->nfnl will never be NULL. > > Check net->nfnl for NULL in ctnetlink_conntrack_event to avoid Oops on > container destroy > > Signed-off-by: Alex Bligh <alex@alex.org.uk> > Cc: Patrick McHardy <kaber@trash.net> > Cc: David Miller <davem@davemloft.net> > Cc: <stable@kernel.org> > Signed-off-by: Andrew Morton <akpm@linux-foundation.org> > (applied from -mm http://marc.info/?l=linux-mm-commits&m=131603308900694&w=2) > > Signed-off-by: Leann Ogasawara <leann.ogasawara@canonical.com> > --- > net/netfilter/nf_conntrack_netlink.c | 5 +++++ > 1 files changed, 5 insertions(+), 0 deletions(-) > > diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c > index 482e90c..0790d0a 100644 > --- a/net/netfilter/nf_conntrack_netlink.c > +++ b/net/netfilter/nf_conntrack_netlink.c > @@ -570,6 +570,11 @@ ctnetlink_conntrack_event(unsigned int events, struct nf_ct_event *item) > return 0; > > net = nf_ct_net(ct); > + > + /* container deinit, netlink may have died before death_by_timeout */ > + if (!net->nfnl) > + return 0; > + > if (!item->report && !nfnetlink_has_listeners(net, group)) > return 0; > > -- > 1.7.4.1 > > > >
diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c index 482e90c..0790d0a 100644 --- a/net/netfilter/nf_conntrack_netlink.c +++ b/net/netfilter/nf_conntrack_netlink.c @@ -570,6 +570,11 @@ ctnetlink_conntrack_event(unsigned int events, struct nf_ct_event *item) return 0; net = nf_ct_net(ct); + + /* container deinit, netlink may have died before death_by_timeout */ + if (!net->nfnl) + return 0; + if (!item->report && !nfnetlink_has_listeners(net, group)) return 0;