From patchwork Thu May 16 16:31:42 2013 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Russell King - ARM Linux X-Patchwork-Id: 244395 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 849A32C0079 for ; Fri, 17 May 2013 02:31:57 +1000 (EST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752927Ab3EPQby (ORCPT ); Thu, 16 May 2013 12:31:54 -0400 Received: from caramon.arm.linux.org.uk ([78.32.30.218]:51798 "EHLO caramon.arm.linux.org.uk" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750698Ab3EPQbw (ORCPT ); Thu, 16 May 2013 12:31:52 -0400 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=arm.linux.org.uk; s=caramon; h=Sender:In-Reply-To:Content-Type:MIME-Version:References:Message-ID:Subject:Cc:To:From:Date; bh=oQ5CUP7KmDpXnER5LY7wJFj/vwXb4xCIKhoSmKeCjxw=; b=UJ3Dbq0fe5XiZ4QXiiwS7oFtWD9JrmByfgrhskzLfDfifAvLysIdFbQ4yzNdj4VJc8HrmTMSDJGEhXHl5ip7AtDKn08RqtRzzlTNcYFE3+6ty1QE4Vj3wCGmMSRRJkStsh4j5WwOPahsTVhy+eHrFRHtRlwC5MkDg4HVtgmpXGY=; Received: from n2100.arm.linux.org.uk ([2002:4e20:1eda:1:214:fdff:fe10:4f86]:44138) by caramon.arm.linux.org.uk with esmtpsa (TLSv1:AES256-SHA:256) (Exim 4.76) (envelope-from ) id 1Ud15l-0007bV-Le; Thu, 16 May 2013 17:31:46 +0100 Received: from linux by n2100.arm.linux.org.uk with local (Exim 4.76) (envelope-from ) id 1Ud15j-0001fZ-6H; Thu, 16 May 2013 17:31:43 +0100 Date: Thu, 16 May 2013 17:31:42 +0100 From: Russell King - ARM Linux To: Eric Dumazet Cc: Lennert Buytenhek , netdev@vger.kernel.org Subject: Re: [PATCH] fix mv643xx_eth.c lockdep violation Message-ID: <20130516163142.GT18614@n2100.arm.linux.org.uk> References: <20130516161313.GS18614@n2100.arm.linux.org.uk> <1368721294.3301.55.camel@edumazet-glaptop> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <1368721294.3301.55.camel@edumazet-glaptop> User-Agent: Mutt/1.5.19 (2009-01-05) Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org On Thu, May 16, 2013 at 09:21:34AM -0700, Eric Dumazet wrote: > On Thu, 2013-05-16 at 17:13 +0100, Russell King - ARM Linux wrote: > > > It seems that txq_reclaim() takes the netif tx lock: > > > > __netif_tx_lock(nq, smp_processor_id()); > > > > in a context outside of softirq context, and thus is susceptible to > > deadlock should an interrupt occur. > > > > Disable IRQs around the call to txq_deinit() to avoid this issue. > > Hmm, I would use __netif_tx_lock_bh()/__netif_tx_unlock_bh() in > txq_reclaim() instead... Yes, thanks, that seems to work as well. Here's a replacement patch. 8<=== From: Russell King Subject: [PATCH] NET: mv643xx_eth: avoid lockdep dump on interface down When the interface is shutdown, the mv643xx_eth driver hits the following lockdep dump: ================================= [ INFO: inconsistent lock state ] 3.8.0+ #303 Not tainted --------------------------------- inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage. NetworkManager/3449 [HC0[0]:SC0[0]:HE1:SE1] takes: (_xmit_ETHER#2){+.?...}, at: [] txq_reclaim+0x60/0x230 {IN-SOFTIRQ-W} state was registered at: [] mark_irqflags+0xf8/0x1c4 [] __lock_acquire+0x458/0x9a4 [] lock_acquire+0x60/0x74 [] _raw_spin_lock+0x40/0x50 [] sch_direct_xmit+0xa4/0x2e4 [] dev_queue_xmit+0x174/0x508 [] ip6_finish_output2+0xd0/0x3c4 [] mld_sendpack+0x190/0x368 [] mld_ifc_timer_expire+0xc/0x58 [] call_timer_fn+0x6c/0xe0 [] run_timer_softirq+0x1d8/0x210 [] __do_softirq+0xe0/0x1b4 [] irq_exit+0x64/0x6c [] handle_IRQ+0x34/0x84 [] __irq_usr+0x30/0x80 irq event stamp: 160603 hardirqs last enabled at (160603): [] kfree+0xa8/0xe8 hardirqs last disabled at (160602): [] kfree+0x1c/0xe8 softirqs last enabled at (160304): [] mib_counters_update+0x5ec/0x60c softirqs last disabled at (160302): [] _raw_spin_lock_bh+0x14/0x54 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(_xmit_ETHER#2); lock(_xmit_ETHER#2); *** DEADLOCK *** 1 lock held by NetworkManager/3449: #0: (rtnl_mutex){+.+.+.}, at: [] rtnetlink_rcv+0xc/0x24 stack backtrace: [] (unwind_backtrace+0x0/0xf8) from [] (print_usage_bug+0x150/0x1d4) [] (print_usage_bug+0x150/0x1d4) from [] (mark_lock_irq+0x248/0x290) [] (mark_lock_irq+0x248/0x290) from [] (mark_lock+0x158/0x404) [] (mark_lock+0x158/0x404) from [] (mark_irqflags+0x138/0x1c4) [] (mark_irqflags+0x138/0x1c4) from [] (__lock_acquire+0x458/0x9a4) [] (__lock_acquire+0x458/0x9a4) from [] (lock_acquire+0x60/0x74) [] (lock_acquire+0x60/0x74) from [] (_raw_spin_lock+0x40/0x50) [] (_raw_spin_lock+0x40/0x50) from [] (txq_reclaim+0x60/0x230) [] (txq_reclaim+0x60/0x230) from [] (txq_deinit+0x24/0xcc) [] (txq_deinit+0x24/0xcc) from [] (mv643xx_eth_stop+0x1a8/0x1bc) [] (mv643xx_eth_stop+0x1a8/0x1bc) from [] (__dev_close_many+0x88/0xcc) [] (__dev_close_many+0x88/0xcc) from [] (__dev_close+0x28/0x3c) [] (__dev_close+0x28/0x3c) from [] (__dev_change_flags+0x7c/0x134) [] (__dev_change_flags+0x7c/0x134) from [] (dev_change_flags+0x10/0x48) [] (dev_change_flags+0x10/0x48) from [] (do_setlink+0x1a0/0x730) [] (do_setlink+0x1a0/0x730) from [] (rtnl_newlink+0x304/0x4b0) [] (rtnl_newlink+0x304/0x4b0) from [] (rtnetlink_rcv_msg+0x25c/0x2a0) [] (rtnetlink_rcv_msg+0x25c/0x2a0) from [] (netlink_rcv_skb+0xbc/0xd8) [] (netlink_rcv_skb+0xbc/0xd8) from [] (rtnetlink_rcv+0x1c/0x24) [] (rtnetlink_rcv+0x1c/0x24) from [] (netlink_unicast_kernel+0x88/0xd4) [] (netlink_unicast_kernel+0x88/0xd4) from [] (netlink_unicast+0x138/0x180) [] (netlink_unicast+0x138/0x180) from [] (netlink_sendmsg+0x208/0x32c) [] (netlink_sendmsg+0x208/0x32c) from [] (sock_sendmsg+0x84/0xa4) [] (sock_sendmsg+0x84/0xa4) from [] (__sys_sendmsg+0x2ac/0x2c4) [] (__sys_sendmsg+0x2ac/0x2c4) from [] (sys_sendmsg+0x3c/0x68) [] (sys_sendmsg+0x3c/0x68) from [] (ret_fast_syscall+0x0/0x3c) It seems that txq_reclaim() takes the netif tx lock: __netif_tx_lock(nq, smp_processor_id()); in a context outside of softirq context, and thus is susceptible to deadlock should an interrupt occur. Use __netif_tx_lock_bh()/__netif_tx_unlock_bh() instead. Signed-off-by: Russell King --- drivers/net/ethernet/marvell/mv643xx_eth.c | 4 ++-- 1 files changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/net/ethernet/marvell/mv643xx_eth.c b/drivers/net/ethernet/marvell/mv643xx_eth.c index 84c1326..67a3e78 100644 --- a/drivers/net/ethernet/marvell/mv643xx_eth.c +++ b/drivers/net/ethernet/marvell/mv643xx_eth.c @@ -943,7 +943,7 @@ static int txq_reclaim(struct tx_queue *txq, int budget, int force) struct netdev_queue *nq = netdev_get_tx_queue(mp->dev, txq->index); int reclaimed; - __netif_tx_lock(nq, smp_processor_id()); + __netif_tx_lock_bh(nq); reclaimed = 0; while (reclaimed < budget && txq->tx_desc_count > 0) { @@ -989,7 +989,7 @@ static int txq_reclaim(struct tx_queue *txq, int budget, int force) dev_kfree_skb(skb); } - __netif_tx_unlock(nq); + __netif_tx_unlock_bh(nq); if (reclaimed < budget) mp->work_tx &= ~(1 << txq->index);