
[2.6.30-git21] Network Namespace test failure

Message ID 20090625033847.GA27181@us.ibm.com
State Superseded, archived
Delegated to: David Miller

Commit Message

Serge E. Hallyn June 25, 2009, 3:38 a.m. UTC
Quoting Serge E. Hallyn (serue@us.ibm.com):
> Quoting Sachin Sant (sachinp@in.ibm.com):
> > Serge E. Hallyn wrote:
> >> Precise kernel version and .config?
> >>   
> > Kernel version is 2.6.30-git21 (626f380d0b264a1e40237f5a2a3dffc5d14f256e)
> 
> Thanks.  I bisected it to commit
> ae0e8e82205c903978a79ebf5e31c670b61fa5b4 : "veth: prevent oops caused by
> netdev destructor".  That moves the free_percpu(priv->stats) from
> veth_dev_free to veth_close().  Since it gets allocated at
> veth_dev_init, and veth_xmit uses it unconditionally, that seems like a
> likely cause of the oops?

Indeed the following patch fixes it on my end.  Sachin can you give
this one a shot?

thanks,
-serge

From 7193023ad09dbc4b57909c0204c19ed93472cd9e Mon Sep 17 00:00:00 2001
From: root <root@elm3b203.beaverton.ibm.com>
Date: Wed, 24 Jun 2009 20:26:17 -0700
Subject: [PATCH 1/1] veth: don't free priv->stats until dev->destructor

Since commit ae0e8e82205c903978a79ebf5e31c670b61fa5b4, priv->stats
has been freed at veth_close().  But that causes a NULL deref at
veth_xmit.  This patch moves the priv->stats free back to the device
destructor.

Signed-off-by: Serge Hallyn <serue@us.ibm.com>
---
 drivers/net/veth.c |   13 ++++++++++---
 1 files changed, 10 insertions(+), 3 deletions(-)
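
The lifetime rule in the commit message can be sketched in userspace C. This is only a toy model under stated assumptions: `toy_dev`, `toy_stats`, and every `toy_*` function are hypothetical stand-ins rather than kernel APIs, and `calloc`/`free` take the place of the per-CPU allocator. It illustrates that when close leaves the counters alone and only the destructor frees them, a transmit that arrives after a close still finds valid counters.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; none of these are kernel APIs. */
struct toy_stats {
	unsigned long tx_packets;
};

struct toy_dev {
	struct toy_stats *stats;
	int up;
};

/* Loosely mirrors veth_dev_init: counters are allocated once per device. */
static int toy_init(struct toy_dev *dev)
{
	dev->stats = calloc(1, sizeof(*dev->stats));
	dev->up = 0;
	return dev->stats ? 0 : -1;
}

static void toy_open(struct toy_dev *dev)
{
	dev->up = 1;
}

/* Loosely mirrors veth_close with this patch: counters are not freed. */
static void toy_close(struct toy_dev *dev)
{
	dev->up = 0;
}

/* Loosely mirrors veth_xmit: counters are used unconditionally. */
static void toy_xmit(struct toy_dev *dev)
{
	/* If close had freed and cleared the pointer, this would fire. */
	assert(dev->stats != NULL);
	dev->stats->tx_packets++;
}

/* Loosely mirrors veth_dev_free: the only place the counters go away. */
static void toy_destructor(struct toy_dev *dev)
{
	free(dev->stats);
	dev->stats = NULL;
}

/* Transmit once while up and once after close; return the final count. */
static unsigned long toy_demo(void)
{
	struct toy_dev dev;
	unsigned long count;

	if (toy_init(&dev))
		return 0;
	toy_open(&dev);
	toy_xmit(&dev);
	toy_close(&dev);
	toy_xmit(&dev);		/* still safe: counters outlive close */
	count = dev.stats->tx_packets;
	toy_destructor(&dev);
	return count;
}
```

Moving the `free` into `toy_close` and clearing the pointer there would reproduce the failure mode in miniature: the second `toy_xmit` would trip the assertion instead of counting the packet.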

Comments

Serge E. Hallyn June 25, 2009, 3:46 a.m. UTC | #1
Quoting Serge E. Hallyn (serue@us.ibm.com):
> Quoting Serge E. Hallyn (serue@us.ibm.com):
> > Quoting Sachin Sant (sachinp@in.ibm.com):
> > > Serge E. Hallyn wrote:
> > >> Precise kernel version and .config?
> > >>   
> > > Kernel version is 2.6.30-git21 (626f380d0b264a1e40237f5a2a3dffc5d14f256e)
> > 
> > Thanks.  I bisected it to commit
> > ae0e8e82205c903978a79ebf5e31c670b61fa5b4 : "veth: prevent oops caused by
> > netdev destructor".  That moves the free_percpu(priv->stats) from
> > veth_dev_free to veth_close().  Since it gets allocated at
> > veth_dev_init, and veth_xmit uses it unconditionally, that seems like a
> > likely cause of the oops?
> 
> Indeed the following patch fixes it on my end.  Sachin can you give
> this one a shot?

BTW - according to the original patch, my patch is not a proper fix,
because the destructor can't point to code in the module.

I'm not sure offhand what a proper fix would be, though, so this patch
seemed ok for having Sachin test but is not intended as a mergeable fix.

> thanks,
> -serge
> 
> From 7193023ad09dbc4b57909c0204c19ed93472cd9e Mon Sep 17 00:00:00 2001
> From: root <root@elm3b203.beaverton.ibm.com>
> Date: Wed, 24 Jun 2009 20:26:17 -0700
> Subject: [PATCH 1/1] veth: don't free priv->stats until dev->destructor
> 
> Since commit ae0e8e82205c903978a79ebf5e31c670b61fa5b4, priv->stats
> has been freed at veth_close().  But that causes a NULL deref at
> veth_xmit.  This patch moves the priv->stats free back to the device
> destructor.
> 
> Signed-off-by: Serge Hallyn <serue@us.ibm.com>
> ---
>  drivers/net/veth.c |   13 ++++++++++---
>  1 files changed, 10 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/veth.c b/drivers/net/veth.c
> index 8e56fcf..6000aae 100644
> --- a/drivers/net/veth.c
> +++ b/drivers/net/veth.c
> @@ -259,8 +259,6 @@ static int veth_close(struct net_device *dev)
>  	netif_carrier_off(dev);
>  	netif_carrier_off(priv->peer);
> 
> -	free_percpu(priv->stats);
> -	priv->stats = NULL;
>  	return 0;
>  }
> 
> @@ -301,6 +299,15 @@ static const struct net_device_ops veth_netdev_ops = {
>  	.ndo_set_mac_address = eth_mac_addr,
>  };
> 
> +static void veth_dev_free(struct net_device *dev)
> +{
> +	struct veth_priv *priv;
> +
> +	priv = netdev_priv(dev);
> +	free_percpu(priv->stats);
> +	free_netdev(dev);
> +}
> +
>  static void veth_setup(struct net_device *dev)
>  {
>  	ether_setup(dev);
> @@ -308,7 +315,7 @@ static void veth_setup(struct net_device *dev)
>  	dev->netdev_ops = &veth_netdev_ops;
>  	dev->ethtool_ops = &veth_ethtool_ops;
>  	dev->features |= NETIF_F_LLTX;
> -	dev->destructor = free_netdev;
> +	dev->destructor = veth_dev_free;
>  }
> 
>  /*
> -- 
> 1.6.2.3
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sachin P. Sant June 25, 2009, 5:38 a.m. UTC | #2
Serge E. Hallyn wrote:
> Indeed the following patch fixes it on my end.  Sachin can you give
> this one a shot?
>
> thanks,
> -serge
>
> From 7193023ad09dbc4b57909c0204c19ed93472cd9e Mon Sep 17 00:00:00 2001
> From: root <root@elm3b203.beaverton.ibm.com>
> Date: Wed, 24 Jun 2009 20:26:17 -0700
> Subject: [PATCH 1/1] veth: don't free priv->stats until dev->destructor
>
> Since commit ae0e8e82205c903978a79ebf5e31c670b61fa5b4, priv->stats
> has been freed at veth_close().  But that causes a NULL deref at
> veth_xmit.  This patch moves the priv->stats free back to the device
> destructor.
>
> Signed-off-by: Serge Hallyn <serue@us.ibm.com>
> ---
Yes this fixed the problem for me.

Thanks
-Sachin
David Miller June 25, 2009, 9:49 a.m. UTC | #3
From: "Serge E. Hallyn" <serue@us.ibm.com>
Date: Wed, 24 Jun 2009 22:38:47 -0500

> From 7193023ad09dbc4b57909c0204c19ed93472cd9e Mon Sep 17 00:00:00 2001
> From: root <root@elm3b203.beaverton.ibm.com>
> Date: Wed, 24 Jun 2009 20:26:17 -0700
> Subject: [PATCH 1/1] veth: don't free priv->stats until dev->destructor
> 
> Since commit ae0e8e82205c903978a79ebf5e31c670b61fa5b4, priv->stats
> has been freed at veth_close().  But that causes a NULL deref at
> veth_xmit.  This patch moves the priv->stats free back to the device
> destructor.
> 
> Signed-off-by: Serge Hallyn <serue@us.ibm.com>

I think ae0e8e82205c903978a79ebf5e31c670b61fa5b4 is worse than
the problem it's trying to cure.  It introduces two regressions:

1) This OOPS we are discussing.

2) statistics are not "remembered" across ifdown/ifup and that
   disagrees with how every other network device works

I'm going to revert that change, and the dev->destructor pointing
to module code problem will need to be fixed in another way.

Thanks.
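
The second regression can be illustrated with a small userspace C sketch. It is a toy model, not the driver: `toy_nic`, `toy_counters`, the `toy_policy` enum, and the `toy_nic_*` helpers are all hypothetical names, `calloc`/`free` stand in for the per-CPU allocator, and the model assumes the counters are allocated on open whenever they are missing. It compares freeing the counters on close against keeping them until the device is destroyed.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; none of these are kernel APIs. */
struct toy_counters {
	unsigned long tx_packets;
};

enum toy_policy {
	TOY_FREE_ON_CLOSE,	/* models the reverted behaviour */
	TOY_FREE_ON_DESTROY	/* models counters that live as long as the device */
};

struct toy_nic {
	enum toy_policy policy;
	struct toy_counters *counters;
};

/* Assumption: open allocates zeroed counters only when none exist. */
static int toy_nic_open(struct toy_nic *nic)
{
	if (!nic->counters)
		nic->counters = calloc(1, sizeof(*nic->counters));
	return nic->counters ? 0 : -1;
}

static void toy_nic_close(struct toy_nic *nic)
{
	if (nic->policy == TOY_FREE_ON_CLOSE) {
		free(nic->counters);
		nic->counters = NULL;
	}
}

static void toy_nic_destroy(struct toy_nic *nic)
{
	free(nic->counters);
	nic->counters = NULL;
}

/* Count three packets, bounce the interface, return the counter seen after. */
static long toy_count_after_bounce(enum toy_policy policy)
{
	struct toy_nic nic = { policy, NULL };
	long seen;

	if (toy_nic_open(&nic))
		return -1;
	nic.counters->tx_packets += 3;
	toy_nic_close(&nic);		/* ifdown */
	if (toy_nic_open(&nic))		/* ifup */
		return -1;
	assert(nic.counters != NULL);
	seen = (long)nic.counters->tx_packets;
	toy_nic_destroy(&nic);
	return seen;
}
```

In this model, `toy_count_after_bounce(TOY_FREE_ON_CLOSE)` returns 0 because the history is discarded on ifdown, while `toy_count_after_bounce(TOY_FREE_ON_DESTROY)` returns 3.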

Patch

diff --git a/drivers/net/veth.c b/drivers/net/veth.c
index 8e56fcf..6000aae 100644
--- a/drivers/net/veth.c
+++ b/drivers/net/veth.c
@@ -259,8 +259,6 @@ static int veth_close(struct net_device *dev)
 	netif_carrier_off(dev);
 	netif_carrier_off(priv->peer);
 
-	free_percpu(priv->stats);
-	priv->stats = NULL;
 	return 0;
 }
 
@@ -301,6 +299,15 @@ static const struct net_device_ops veth_netdev_ops = {
 	.ndo_set_mac_address = eth_mac_addr,
 };
 
+static void veth_dev_free(struct net_device *dev)
+{
+	struct veth_priv *priv;
+
+	priv = netdev_priv(dev);
+	free_percpu(priv->stats);
+	free_netdev(dev);
+}
+
 static void veth_setup(struct net_device *dev)
 {
 	ether_setup(dev);
@@ -308,7 +315,7 @@ static void veth_setup(struct net_device *dev)
 	dev->netdev_ops = &veth_netdev_ops;
 	dev->ethtool_ops = &veth_ethtool_ops;
 	dev->features |= NETIF_F_LLTX;
-	dev->destructor = free_netdev;
+	dev->destructor = veth_dev_free;
 }
 
 /*