Message ID | 20090626162418.GA24828@us.ibm.com |
---|---|
State | Changes Requested, archived |
Delegated to: | David Miller |
From: "Serge E. Hallyn" <serue@us.ibm.com>
Date: Fri, 26 Jun 2009 11:24:18 -0500

> I haven't been able to reproduce the original oops though (been
> trying to cat the stats sysfs files while rmmoding veth, to no
> avail, and haven't found an original bug report or testcase), so
> can't verify whether this patch prevents the original oops.

If you 'cat' it you're unlikely to trigger the oops.

You have to hold the sysfs files open, and then elsewhere do the
rmmod, wait, and then continue with some access to those open sysfs
file descriptors (f.e. do some reads).

I'd also need this patch to be against current sources as they'll
never apply since I did the revert quite some time ago.

Thanks.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, 15 Jul 2009 08:50:12 -0700 (PDT)
David Miller <davem@davemloft.net> wrote:

> From: "Serge E. Hallyn" <serue@us.ibm.com>
> Date: Fri, 26 Jun 2009 11:24:18 -0500
>
> > I haven't been able to reproduce the original oops though (been
> > trying to cat the stats sysfs files while rmmoding veth, to no
> > avail, and haven't found an original bug report or testcase), so
> > can't verify whether this patch prevents the original oops.
>
> If you 'cat' it you're unlikely to trigger the oops.
>
> You have to hold the sysfs files open, and then elsewhere do the
> rmmod, wait, and then continue with some access to those open sysfs
> file descriptors (f.e. do some reads).
>
> I'd also need this patch to be against current sources as they'll
> never apply since I did the revert quite some time ago.
>
> Thanks.

My usual way of doing this is:

    # (sleep 30; cat /sys/class/net/ethX/statistics/tx_bytes) &
    # rmmod the_buggy_driver

wait...
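The hold-open sequence described in this thread (open the statistics file, unload the module from elsewhere, then read through the still-open descriptor) could be scripted roughly as follows. This is a minimal userspace sketch in C, not something posted in the thread; `read_after_hold` and its parameters are hypothetical names chosen for illustration, and the sketch makes no claim about whether a given kernel will oops.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Open `path`, keep the descriptor open for `hold_seconds`, then read.
 * The hold window is where a tester would run rmmod from another shell.
 * Returns the number of bytes read, or -1 if open() or read() failed.
 * (Hypothetical helper, not part of the patch under discussion.)
 */
ssize_t read_after_hold(const char *path, unsigned int hold_seconds,
                        char *buf, size_t len)
{
    assert(buf != NULL && len > 0);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    /* The descriptor stays open across this window. */
    sleep(hold_seconds);

    /* First access to the file after the window has elapsed. */
    ssize_t n = read(fd, buf, len);
    close(fd);
    return n;
}
```

A tester might call `read_after_hold("/sys/class/net/veth0/statistics/tx_bytes", 30, buf, sizeof(buf))` and run `rmmod veth` from a second shell during the 30-second window; `veth0` is only an example interface name.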
Quoting Stephen Hemminger (shemminger@vyatta.com):
> On Wed, 15 Jul 2009 08:50:12 -0700 (PDT)
> David Miller <davem@davemloft.net> wrote:
>
> > From: "Serge E. Hallyn" <serue@us.ibm.com>
> > Date: Fri, 26 Jun 2009 11:24:18 -0500
> >
> > > I haven't been able to reproduce the original oops though (been
> > > trying to cat the stats sysfs files while rmmoding veth, to no
> > > avail, and haven't found an original bug report or testcase), so
> > > can't verify whether this patch prevents the original oops.
> >
> > If you 'cat' it you're unlikely to trigger the oops.
> >
> > You have to hold the sysfs files open, and then elsewhere do the
> > rmmod, wait, and then continue with some access to those open sysfs
> > file descriptors (f.e. do some reads).

Yup, I was doing that too, but couldn't reproduce as yet.

> > I'd also need this patch to be against current sources as they'll
> > never apply since I did the revert quite some time ago.
> >
> > Thanks.

Ok, thanks - I'll generate a new patch against a fresh pull when I
can confirm that it actually solves the problem.

> My usual way of doing this is:
>
>     # (sleep 30; cat /sys/class/net/ethX/statistics/tx_bytes) &
>     # rmmod the_buggy_driver
>
> wait...

Can you oops the kernel this way on latest netns?

thanks,
-serge
On Fri, 26 Jun 2009 11:24:18 -0500
"Serge E. Hallyn" <serue@us.ibm.com> wrote:

> Based on the commit msg on ae0e8e82205c903978a79ebf5e31c670b61fa5b4, it looks

>  	ether_setup(dev);
> @@ -306,7 +320,7 @@ static void veth_setup(struct net_device *dev)
>  	dev->netdev_ops = &veth_netdev_ops;
>  	dev->ethtool_ops = &veth_ethtool_ops;
>  	dev->features |= NETIF_F_LLTX;
> -	dev->destructor = free_netdev;
> +	dev->destructor = veth_dev_free;
>

This is still going to oops if sysfs statistics referenced
after module unload because module is unloaded (code is gone)
and veth_dev_free no longer exists.

I'll respin the original patch (using free_netdev) and fix the
statistics complaint.
Stephen Hemminger <shemminger@vyatta.com> writes:

> On Fri, 26 Jun 2009 11:24:18 -0500
> "Serge E. Hallyn" <serue@us.ibm.com> wrote:
>
>> Based on the commit msg on ae0e8e82205c903978a79ebf5e31c670b61fa5b4, it looks
>
>>  	ether_setup(dev);
>> @@ -306,7 +320,7 @@ static void veth_setup(struct net_device *dev)
>>  	dev->netdev_ops = &veth_netdev_ops;
>>  	dev->ethtool_ops = &veth_ethtool_ops;
>>  	dev->features |= NETIF_F_LLTX;
>> -	dev->destructor = free_netdev;
>> +	dev->destructor = veth_dev_free;
>>
>
> This is still going to oops if sysfs statistics referenced
> after module unload because module is unloaded (code is gone)
> and veth_dev_free no longer exists.

Has anyone actually seen that cause an oops?

The reason I am asking is that as I read the code we cannot have
this problem. At worst the destructor callback is delayed until:

veth_exit
  rtnl_link_unregister
    rtnl_unlock
      netdev_run_todo
        dev->destructor

Similarly even if the sysfs filehandle is open we have called:

netdev_unregister_kobject
  ...
    sysfs_addrm_finish
      sysfs_deactivate

Which guarantees that sysfs_get_active_two will fail and all
subsequent actions on that file will fail.

Eric
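Eric's argument relies on sysfs refusing new operations on a node once it has been deactivated, even through a file that was opened earlier. The toy model below illustrates that kind of guard in userspace C, assuming a counter that is biased negative on deactivation. The `toy_*` names are hypothetical and this is not the kernel's code; in particular, a real deactivation path would also wait for in-flight users to finish, which the sketch leaves out.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Toy stand-in for a sysfs node; NOT the kernel's data structure. */
struct toy_node {
    /* >= 0: node is live, value counts users inside; < 0: deactivated. */
    atomic_int active;
};

/* Take an active reference; fails once the node has been deactivated. */
bool toy_get_active(struct toy_node *n)
{
    int v = atomic_load(&n->active);

    while (v >= 0) {
        /* On failure, v is refreshed with the current value and rechecked. */
        if (atomic_compare_exchange_weak(&n->active, &v, v + 1))
            return true;
    }
    return false;
}

/* Drop a reference previously taken with toy_get_active(). */
void toy_put_active(struct toy_node *n)
{
    atomic_fetch_sub(&n->active, 1);
}

/* Mark the node dead by pushing the counter negative. */
void toy_deactivate(struct toy_node *n)
{
    atomic_fetch_add(&n->active, INT_MIN);
}
```

In this model an open file handle does not matter: each operation has to pass `toy_get_active()` first, so anything attempted after `toy_deactivate()` fails before it could reach driver code.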
On Tue, 04 Aug 2009 23:40:47 -0700
ebiederm@xmission.com (Eric W. Biederman) wrote:

> Stephen Hemminger <shemminger@vyatta.com> writes:
>
> > On Fri, 26 Jun 2009 11:24:18 -0500
> > "Serge E. Hallyn" <serue@us.ibm.com> wrote:
> >
> >> Based on the commit msg on ae0e8e82205c903978a79ebf5e31c670b61fa5b4, it looks
> >
> >>  	ether_setup(dev);
> >> @@ -306,7 +320,7 @@ static void veth_setup(struct net_device *dev)
> >>  	dev->netdev_ops = &veth_netdev_ops;
> >>  	dev->ethtool_ops = &veth_ethtool_ops;
> >>  	dev->features |= NETIF_F_LLTX;
> >> -	dev->destructor = free_netdev;
> >> +	dev->destructor = veth_dev_free;
> >>
> >
> > This is still going to oops if sysfs statistics referenced
> > after module unload because module is unloaded (code is gone)
> > and veth_dev_free no longer exists.
>
> Has anyone actually seen that cause an oops?
>
> The reason I am asking is that as I read the code we cannot have
> this problem. At worst the destructor callback is delayed until:
>
> veth_exit
>   rtnl_link_unregister
>     rtnl_unlock
>       netdev_run_todo
>         dev->destructor
>
> Similarly even if the sysfs filehandle is open we have called:
>
> netdev_unregister_kobject
>   ...
>     sysfs_addrm_finish
>       sysfs_deactivate
>
> Which guarantees that sysfs_get_active_two will fail and all
> subsequent actions on that file will fail.
>
> Eric

Sysfs must be safer than it used to be.
Stephen Hemminger <shemminger@vyatta.com> writes:

> Sysfs must be safer than it used to be.

Definitely. A lot of this dates to Tejun's cleanups, which merged
2-3 years ago now, just before I started working on sysfs.

Eric
diff --git a/drivers/net/veth.c b/drivers/net/veth.c
index 87197dd..112add0 100644
--- a/drivers/net/veth.c
+++ b/drivers/net/veth.c
@@ -208,7 +208,7 @@ rx_drop:
 
 static struct net_device_stats *veth_get_stats(struct net_device *dev)
 {
-	struct veth_priv *priv = netdev_priv(dev);
+	struct veth_priv *priv;
 	struct net_device_stats *dev_stats = &dev->stats;
 	unsigned int cpu;
 	struct veth_net_stats *stats;
@@ -220,6 +220,8 @@ static struct net_device_stats *veth_get_stats(struct net_device *dev)
 	dev_stats->tx_dropped = 0;
 	dev_stats->rx_dropped = 0;
 
+	rcu_read_lock();
+	priv = netdev_priv(dev);
 	if (priv->stats)
 		for_each_online_cpu(cpu) {
 			stats = per_cpu_ptr(priv->stats, cpu);
@@ -231,6 +233,7 @@ static struct net_device_stats *veth_get_stats(struct net_device *dev)
 			dev_stats->tx_dropped += stats->tx_dropped;
 			dev_stats->rx_dropped += stats->rx_dropped;
 		}
+	rcu_read_unlock();
 
 	return dev_stats;
 }
@@ -257,8 +260,6 @@ static int veth_close(struct net_device *dev)
 	netif_carrier_off(dev);
 	netif_carrier_off(priv->peer);
 
-	free_percpu(priv->stats);
-	priv->stats = NULL;
 	return 0;
 }
 
@@ -299,6 +300,19 @@ static const struct net_device_ops veth_netdev_ops = {
 	.ndo_set_mac_address = eth_mac_addr,
 };
 
+static void veth_dev_free(struct net_device *dev)
+{
+	struct veth_priv *priv;
+	struct veth_net_stats *stats;
+
+	priv = netdev_priv(dev);
+	stats = priv->stats;
+	priv->stats = NULL;
+	synchronize_rcu();
+	free_percpu(stats);
+	free_netdev(dev);
+}
+
 static void veth_setup(struct net_device *dev)
 {
 	ether_setup(dev);
@@ -306,7 +320,7 @@ static void veth_setup(struct net_device *dev)
 	dev->netdev_ops = &veth_netdev_ops;
 	dev->ethtool_ops = &veth_ethtool_ops;
 	dev->features |= NETIF_F_LLTX;
-	dev->destructor = free_netdev;
+	dev->destructor = veth_dev_free;
 }
 
 /*