From patchwork Fri Jun 26 16:24:18 2009
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Serge E. Hallyn"
X-Patchwork-Id: 29198
X-Patchwork-Delegate: davem@davemloft.net
Return-Path:
X-Original-To: patchwork-incoming@bilbo.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Received: from ozlabs.org (ozlabs.org [203.10.76.45]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client CN "mx.ozlabs.org", Issuer "CA Cert Signing Authority" (verified OK)) by bilbo.ozlabs.org (Postfix) with ESMTPS id AFDAEB708A for ; Sat, 27 Jun 2009 02:34:28 +1000 (EST)
Received: by ozlabs.org (Postfix) id 97689DDD0B; Sat, 27 Jun 2009 02:34:28 +1000 (EST)
Delivered-To: patchwork-incoming@ozlabs.org
Received: from vger.kernel.org (vger.kernel.org [209.132.176.167]) by ozlabs.org (Postfix) with ESMTP id 29BBEDDD01 for ; Sat, 27 Jun 2009 02:34:28 +1000 (EST)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755880AbZFZQds (ORCPT ); Fri, 26 Jun 2009 12:33:48 -0400
Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1751883AbZFZQds (ORCPT ); Fri, 26 Jun 2009 12:33:48 -0400
Received: from e35.co.us.ibm.com ([32.97.110.153]:48610 "EHLO e35.co.us.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750839AbZFZQdr (ORCPT ); Fri, 26 Jun 2009 12:33:47 -0400
Received: from d03relay02.boulder.ibm.com (d03relay02.boulder.ibm.com [9.17.195.227]) by e35.co.us.ibm.com (8.13.1/8.13.1) with ESMTP id n5QGQPWg011868; Fri, 26 Jun 2009 10:26:25 -0600
Received: from d03av02.boulder.ibm.com (d03av02.boulder.ibm.com [9.17.195.168]) by d03relay02.boulder.ibm.com (8.13.8/8.13.8/NCO v9.2) with ESMTP id n5QGXot3222838; Fri, 26 Jun 2009 10:33:50 -0600
Received: from d03av02.boulder.ibm.com (loopback [127.0.0.1]) by d03av02.boulder.ibm.com (8.12.11.20060308/8.13.3) with ESMTP id n5QGXniu004049; Fri, 26 Jun 2009 10:33:49 -0600
Received: from sergelap.hallyn.com (sig-9-65-70-248.mts.ibm.com [9.65.70.248]) by d03av02.boulder.ibm.com (8.12.11.20060308/8.12.11) with ESMTP id n5QGXmst004035; Fri, 26 Jun 2009 10:33:48 -0600
Received: by sergelap.hallyn.com (Postfix, from userid 1000) id F1F921A700E; Fri, 26 Jun 2009 11:24:18 -0500 (CDT)
Date: Fri, 26 Jun 2009 11:24:18 -0500
From: "Serge E. Hallyn"
To: Stephen Hemminger
Cc: Linux Containers, Sachin Sant, netdev, David Miller, matthltc@us.ibm.com, lkml
Subject: [PATCH 1/1] veth: don't free priv->status until dev->destructor (v2)
Message-ID: <20090626162418.GA24828@us.ibm.com>
MIME-Version: 1.0
Content-Disposition: inline
User-Agent: Mutt/1.5.18 (2008-05-17)
Sender: netdev-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: netdev@vger.kernel.org

Based on the commit msg on ae0e8e82205c903978a79ebf5e31c670b61fa5b4, it
looks like oopses were caused when people were reading the veth dev
stats while the module was being unloaded, causing a deref of freed
memory in veth_get_stats()?

If so, I believe the following patch (still against mainline, so not on
top of my previous patch or on top of a git-revert of
ae0e8e82205c903978a79ebf5e31c670b61fa5b4) should prevent that.  All the
stats are gathered within one rcu cycle, while the device free hook
first sets the device stats pointer to NULL, then waits an rcu cycle
before freeing it.

I haven't been able to reproduce the original oops though (been trying
to cat the stats sysfs files while rmmoding veth, to no avail, and
haven't found an original bug report or testcase), so I can't verify
whether this patch prevents the original oops.

Does this look sufficient?

thanks,
-serge

From a8eb0950b47ff6c5dfe2debafbd203dcced75bd3 Mon Sep 17 00:00:00 2001
From: root
Date: Wed, 24 Jun 2009 20:26:17 -0700
Subject: [PATCH 1/1] veth: don't free priv->status until dev->destructor (v2)

Since commit ae0e8e82205c903978a79ebf5e31c670b61fa5b4, priv->stats has
been freed at veth_close().  But that causes a NULL deref at veth_xmit().
This patch moves the priv->stats free back to the device destructor.

It also tries to prevent the original possible sysfs-induced oops.
All the stats are now gathered within one rcu cycle, while the device
free hook first sets the device stats pointer to NULL, then waits an
rcu cycle before freeing it.

Changelog:
	June 26: try to fix the original oops.

Signed-off-by: Serge Hallyn
---
 drivers/net/veth.c |   22 ++++++++++++++++++----
 1 files changed, 18 insertions(+), 4 deletions(-)

diff --git a/drivers/net/veth.c b/drivers/net/veth.c
index 87197dd..112add0 100644
--- a/drivers/net/veth.c
+++ b/drivers/net/veth.c
@@ -208,7 +208,7 @@ rx_drop:
 
 static struct net_device_stats *veth_get_stats(struct net_device *dev)
 {
-	struct veth_priv *priv = netdev_priv(dev);
+	struct veth_priv *priv;
 	struct net_device_stats *dev_stats = &dev->stats;
 	unsigned int cpu;
 	struct veth_net_stats *stats;
@@ -220,6 +220,8 @@ static struct net_device_stats *veth_get_stats(struct net_device *dev)
 	dev_stats->tx_dropped = 0;
 	dev_stats->rx_dropped = 0;
 
+	rcu_read_lock();
+	priv = netdev_priv(dev);
 	if (priv->stats)
 		for_each_online_cpu(cpu) {
 			stats = per_cpu_ptr(priv->stats, cpu);
@@ -231,6 +233,7 @@ static struct net_device_stats *veth_get_stats(struct net_device *dev)
 			dev_stats->tx_dropped += stats->tx_dropped;
 			dev_stats->rx_dropped += stats->rx_dropped;
 		}
+	rcu_read_unlock();
 
 	return dev_stats;
 }
@@ -257,8 +260,6 @@ static int veth_close(struct net_device *dev)
 	netif_carrier_off(dev);
 	netif_carrier_off(priv->peer);
 
-	free_percpu(priv->stats);
-	priv->stats = NULL;
 	return 0;
 }
 
@@ -299,6 +300,19 @@ static const struct net_device_ops veth_netdev_ops = {
 	.ndo_set_mac_address = eth_mac_addr,
 };
 
+static void veth_dev_free(struct net_device *dev)
+{
+	struct veth_priv *priv;
+	struct veth_net_stats *stats;
+
+	priv = netdev_priv(dev);
+	stats = priv->stats;
+	priv->stats = NULL;
+	synchronize_rcu();
+	free_percpu(stats);
+	free_netdev(dev);
+}
+
 static void veth_setup(struct net_device *dev)
 {
 	ether_setup(dev);
@@ -306,7 +320,7 @@ static void veth_setup(struct net_device *dev)
 	dev->netdev_ops = &veth_netdev_ops;
 	dev->ethtool_ops = &veth_ethtool_ops;
 	dev->features |= NETIF_F_LLTX;
-	dev->destructor = free_netdev;
+	dev->destructor = veth_dev_free;
 }
 
 /*