[10/10] nf_conntrack: Use rcu_barrier().

Message ID	20090623150444.22490.27931.stgit@localhost
State	Changes Requested, archived
Delegated to:	David Miller

Headers

From: Jesper Dangaard Brouer <hawk@comx.dk>
Subject: [PATCH 10/10] nf_conntrack: Use rcu_barrier().
To: "David S. Miller" <davem@davemloft.net>
Cc: Jesper Dangaard Brouer <hawk@comx.dk>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	dougthompson@xmission.com, bluesmoke-devel@lists.sourceforge.net,
	axboe@kernel.dk, "Patrick McHardy" <kaber@trash.net>,
	christine.caulfield@googlemail.com, Trond.Myklebust@netapp.com,
	linux-wireless@vger.kernel.org, johannes@sipsolutions.net,
	yoshfuji@linux-ipv6.org, shemminger@linux-foundation.org,
	linux-nfs@vger.kernel.org, bfields@fieldses.org, neilb@suse.de,
	linux-ext4@vger.kernel.org, tytso@mit.edu, adilger@sun.com,
	netfilter-devel@vger.kernel.org
Date: Tue, 23 Jun 2009 17:04:44 +0200
Message-ID: <20090623150444.22490.27931.stgit@localhost>
In-Reply-To: <20090623150330.22490.87327.stgit@localhost>
References: <20090623150330.22490.87327.stgit@localhost>
User-Agent: StGIT/0.14.2
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit
Sender: netdev-owner@vger.kernel.org
Precedence: bulk

Commit Message

Jesper Dangaard Brouer June 23, 2009, 3:04 p.m. UTC

I'm not sure which is the most optimal place to call rcu_barrier().
The patch probably calls rcu_barrier() more often than necessary, but
it's a better-safe-than-sorry approach.

I have embedded some comments that I would like Patrick McHardy
to look at.

Signed-off-by: Jesper Dangaard Brouer <hawk@comx.dk>
---

 net/netfilter/nf_conntrack_core.c       |    5 +++++
 net/netfilter/nf_conntrack_standalone.c |    2 ++
 2 files changed, 7 insertions(+), 0 deletions(-)


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Patrick McHardy June 23, 2009, 4:23 p.m. UTC | #1

Jesper Dangaard Brouer wrote:
> I'm not sure which is the most optimal place to call rcu_barrier().
> The patch probably calls rcu_barrier() more often than necessary, but
> it's a better-safe-than-sorry approach.
> 
> I have embedded some comments that I would like Patrick McHardy
> to look at.
> 
> diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
> index 5f72b94..cea4537 100644
> --- a/net/netfilter/nf_conntrack_core.c
> +++ b/net/netfilter/nf_conntrack_core.c
> @@ -1084,6 +1084,8 @@ static void nf_conntrack_cleanup_init_net(void)
>  {
>  	nf_conntrack_helper_fini();
>  	nf_conntrack_proto_fini();
> +	rcu_barrier();
> +	/* Need to wait for call_rcu() before dealloc the kmem_cache */
>  	kmem_cache_destroy(nf_conntrack_cachep);

Which call_rcu() is this referring to? If it's the conntrack destruction,
that one is gone in the current kernel, and I think destruction is
handled properly by the sl*b allocators (SLAB_DESTROY_BY_RCU).

> @@ -1118,6 +1120,9 @@ void nf_conntrack_cleanup(struct net *net)
>  	/* This makes sure all current packets have passed through
>  	   netfilter framework.  Roll on, two-stage module
>  	   delete... */
> +	/* hawk@comx.dk 2009-06-20: Think this should be replaced by a
> +          rcu_barrier() ???
> +	*/
>  	synchronize_net();

AFAICT this one is used to make sure the old value of the ip_ct_attach
hook is not visible anymore before beginning cleanup and is not needed
for anything else.

>  	nf_conntrack_cleanup_net(net);
> diff --git a/net/netfilter/nf_conntrack_standalone.c b/net/netfilter/nf_conntrack_standalone.c
> index 1935153..29c6cd0 100644
> --- a/net/netfilter/nf_conntrack_standalone.c
> +++ b/net/netfilter/nf_conntrack_standalone.c
> @@ -500,6 +500,8 @@ static void nf_conntrack_net_exit(struct net *net)
>  	nf_conntrack_standalone_fini_sysctl(net);
>  	nf_conntrack_standalone_fini_proc(net);
>  	nf_conntrack_cleanup(net);
> +	/* hawk@comx.dk: Think rcu_barrier() should to be called earlier? */
> +	rcu_barrier(); /* Wait for completion of call_rcu()'s */
>  }

Which call_rcu() is this referring to? We should place them in
the cleanup sub-functions to make this clearly visible.

Jesper Dangaard Brouer June 24, 2009, 9:02 a.m. UTC | #2

On Tue, 2009-06-23 at 18:23 +0200, Patrick McHardy wrote:
> Jesper Dangaard Brouer wrote:
> > I'm not sure which is the most optimal place to call rcu_barrier().
> > The patch probably calls rcu_barrier() more often than necessary, but
> > it's a better-safe-than-sorry approach.
> > 
> > I have embedded some comments that I would like Patrick McHardy
> > to look at.
> > 
> > diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
> > index 5f72b94..cea4537 100644
> > --- a/net/netfilter/nf_conntrack_core.c
> > +++ b/net/netfilter/nf_conntrack_core.c
> > @@ -1084,6 +1084,8 @@ static void nf_conntrack_cleanup_init_net(void)
> >  {
> >  	nf_conntrack_helper_fini();
> >  	nf_conntrack_proto_fini();
> > +	rcu_barrier();
> > +	/* Need to wait for call_rcu() before dealloc the kmem_cache */
> >  	kmem_cache_destroy(nf_conntrack_cachep);
> 
> Which call_rcu() is this referring to? 

It is the call_rcu() in nf_conntrack_expect.c (which is linked into
nf_conntrack.ko).  But that also means the slab cache we should have
waited for is "nf_ct_expect_cachep"... (I also notice that
"nf_ct_expect_cachep" is missing the SLAB_DESTROY_BY_RCU flag, and that
the SLAB_DESTROY_BY_RCU flag should be removed from
"nf_conntrack_cachep".)

> If it's the conntrack destruction,
> that one is gone in the current kernel and I think destruction is
> handled properly by the sl*b-allocators (SLAB_DESTROY_BY_RCU).

I just dived into the slab.c code and noticed that it is also flawed,
ARGH!  When the SLAB_DESTROY_BY_RCU flag is set, it only calls
synchronize_rcu() and not rcu_barrier() as it should!

I'll fix that up in another patch series... 
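
The distinction drawn above can be illustrated with a minimal userspace
sketch. This is a toy model and not kernel code: every toy_* name is
hypothetical, and the empty toy_synchronize_rcu() only models the worst
case in which a grace period ends while queued callbacks have not run
yet. The point it tries to show is that waiting for readers alone leaves
deferred frees outstanding, whereas a barrier drains them first.

```c
/* Userspace toy model, NOT kernel code. All toy_* names are hypothetical. */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct toy_rcu_head {
	struct toy_rcu_head *next;
	void (*func)(struct toy_rcu_head *head);
};

static struct toy_rcu_head *toy_pending;  /* callbacks queued but not yet run */
static int toy_live_objects;              /* objects the "cache" still owns */

/* Models call_rcu(): queue the callback, it runs at some later point. */
static void toy_call_rcu(struct toy_rcu_head *head,
			 void (*func)(struct toy_rcu_head *head))
{
	head->func = func;
	head->next = toy_pending;
	toy_pending = head;
}

/* Models synchronize_rcu(): waits for readers only. The toy has no
 * readers, so it returns at once and leaves queued callbacks untouched. */
static void toy_synchronize_rcu(void)
{
}

/* Models rcu_barrier(): returns only after every queued callback ran. */
static void toy_rcu_barrier(void)
{
	while (toy_pending) {
		struct toy_rcu_head *head = toy_pending;

		toy_pending = head->next;  /* unlink before func() frees head */
		head->func(head);
	}
}

struct toy_obj {
	struct toy_rcu_head rcu;  /* first member, so the cast below is valid */
	int payload;
};

static void toy_obj_free_rcu(struct toy_rcu_head *head)
{
	free((struct toy_obj *)head);
	toy_live_objects--;
}

/* Returns how many objects are still outstanding at the point where the
 * cache would be destroyed. */
int toy_teardown(int use_barrier)
{
	toy_live_objects = 0;
	for (int i = 0; i < 3; i++) {
		struct toy_obj *obj = malloc(sizeof(*obj));

		if (!obj)
			abort();
		toy_live_objects++;
		toy_call_rcu(&obj->rcu, toy_obj_free_rcu);
	}

	if (use_barrier)
		toy_rcu_barrier();
	else
		toy_synchronize_rcu();

	int outstanding = toy_live_objects;  /* what a cache destroy would see */

	toy_rcu_barrier();  /* drain so the demo itself does not leak */
	return outstanding;
}
```

In this model toy_teardown(0) reports three outstanding objects and
toy_teardown(1) reports none, which is the situation a cache destroy
would face in each case.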

I am looking into slub and slob at the moment, and it seems that they
schedule another call_rcu() callback for freeing when the
SLAB_DESTROY_BY_RCU flag is set.  That seems racy to me... Paul?

> > @@ -1118,6 +1120,9 @@ void nf_conntrack_cleanup(struct net *net)
> >  	/* This makes sure all current packets have passed through
> >  	   netfilter framework.  Roll on, two-stage module
> >  	   delete... */
> > +	/* hawk@comx.dk 2009-06-20: Think this should be replaced by a
> > +          rcu_barrier() ???
> > +	*/
> >  	synchronize_net();
> 
> AFAICT this one is used to make sure the old value of the ip_ct_attach
> hook is not visible anymore before beginning cleanup and is not needed
> for anything else.

Fine!

> >  	nf_conntrack_cleanup_net(net);
> > diff --git a/net/netfilter/nf_conntrack_standalone.c b/net/netfilter/nf_conntrack_standalone.c
> > index 1935153..29c6cd0 100644
> > --- a/net/netfilter/nf_conntrack_standalone.c
> > +++ b/net/netfilter/nf_conntrack_standalone.c
> > @@ -500,6 +500,8 @@ static void nf_conntrack_net_exit(struct net *net)
> >  	nf_conntrack_standalone_fini_sysctl(net);
> >  	nf_conntrack_standalone_fini_proc(net);
> >  	nf_conntrack_cleanup(net);
> > +	/* hawk@comx.dk: Think rcu_barrier() should to be called earlier? */
> > +	rcu_barrier(); /* Wait for completion of call_rcu()'s */
> >  }
> 
> Which call_rcu() is this referring to? We should place them in
> the cleanup sub-functions to make this clearly visible.

This call_rcu() is the one done in nf_conntrack_extend.c:114  (notice
"_extend" NOT "_expect"), which calls __nf_ct_ext_free_rcu().

I guess this rcu_barrier() should then be moved to
nf_ct_extend_unregister(), right? (It already invokes a
synchronize_rcu() that should be replaced by rcu_barrier().)
This does mean that rcu_barrier() will run three times in
nf_conntrack_cleanup_net(), since nf_ct_extend_unregister() is called
when unregistering each of ecache, acct and expect.
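
As a rough illustration of that cost, here is a minimal userspace sketch
of the proposed placement. It is a toy model and not the kernel
implementation: all toy_* names are hypothetical, and toy_rcu_barrier()
is a stand-in that only counts how often it is called. It assumes the
unregister path unpublishes the extension type and then waits, as
described above.

```c
/* Userspace toy model, NOT the kernel implementation.
 * All toy_* names are hypothetical. */
#include <assert.h>
#include <stddef.h>

enum toy_ext_id { TOY_EXT_ECACHE, TOY_EXT_ACCT, TOY_EXT_EXPECT, TOY_EXT_NUM };

struct toy_ext_type {
	enum toy_ext_id id;
};

static const struct toy_ext_type *toy_ext_types[TOY_EXT_NUM];
static int toy_barrier_calls;

/* Stand-in for rcu_barrier(): it only counts invocations. */
static void toy_rcu_barrier(void)
{
	toy_barrier_calls++;
}

static void toy_ext_register(const struct toy_ext_type *type)
{
	toy_ext_types[type->id] = type;
}

/* Proposed shape: unpublish the type, then wait for pending callbacks
 * inside the unregister function itself. */
static void toy_ext_unregister(const struct toy_ext_type *type)
{
	toy_ext_types[type->id] = NULL;
	toy_rcu_barrier();
}

/* Models a per-net cleanup that unregisters ecache, acct and expect.
 * Returns the number of barrier calls the cleanup paid for. */
int toy_cleanup_net(void)
{
	static const struct toy_ext_type types[] = {
		{ TOY_EXT_ECACHE }, { TOY_EXT_ACCT }, { TOY_EXT_EXPECT },
	};
	size_t n = sizeof(types) / sizeof(types[0]);

	for (size_t i = 0; i < n; i++)
		toy_ext_register(&types[i]);

	toy_barrier_calls = 0;
	for (size_t i = 0; i < n; i++)
		toy_ext_unregister(&types[i]);

	return toy_barrier_calls;
}
```

The trade-off this sketches is visibility against cost: each cleanup
sub-function shows its own wait, but one cleanup pass pays for one
barrier per extension type instead of a single barrier at the end.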

Thank you for your feedback :-) ... I'll post a new v2 patch...

Patch

diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index 5f72b94..cea4537 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -1084,6 +1084,8 @@  static void nf_conntrack_cleanup_init_net(void)
 {
 	nf_conntrack_helper_fini();
 	nf_conntrack_proto_fini();
+	rcu_barrier();
+	/* Need to wait for call_rcu() before dealloc the kmem_cache */
 	kmem_cache_destroy(nf_conntrack_cachep);
 }
 
@@ -1118,6 +1120,9 @@  void nf_conntrack_cleanup(struct net *net)
 	/* This makes sure all current packets have passed through
 	   netfilter framework.  Roll on, two-stage module
 	   delete... */
+	/* hawk@comx.dk 2009-06-20: Think this should be replaced by a
+          rcu_barrier() ???
+	*/
 	synchronize_net();
 
 	nf_conntrack_cleanup_net(net);
diff --git a/net/netfilter/nf_conntrack_standalone.c b/net/netfilter/nf_conntrack_standalone.c
index 1935153..29c6cd0 100644
--- a/net/netfilter/nf_conntrack_standalone.c
+++ b/net/netfilter/nf_conntrack_standalone.c
@@ -500,6 +500,8 @@  static void nf_conntrack_net_exit(struct net *net)
 	nf_conntrack_standalone_fini_sysctl(net);
 	nf_conntrack_standalone_fini_proc(net);
 	nf_conntrack_cleanup(net);
+	/* hawk@comx.dk: Think rcu_barrier() should to be called earlier? */
+	rcu_barrier(); /* Wait for completion of call_rcu()'s */
 }
 
 static struct pernet_operations nf_conntrack_net_ops = {
