From patchwork Wed Aug 7 08:43:22 2013 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Amerigo Wang X-Patchwork-Id: 265411 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 345D42C022C for ; Wed, 7 Aug 2013 18:43:38 +1000 (EST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1756738Ab3HGInd (ORCPT ); Wed, 7 Aug 2013 04:43:33 -0400 Received: from mx1.redhat.com ([209.132.183.28]:26401 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1756023Ab3HGInc (ORCPT ); Wed, 7 Aug 2013 04:43:32 -0400 Received: from int-mx01.intmail.prod.int.phx2.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11]) by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id r778hU25028024 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Wed, 7 Aug 2013 04:43:30 -0400 Received: from localhost.localdomain (vpn1-115-118.nay.redhat.com [10.66.115.118]) by int-mx01.intmail.prod.int.phx2.redhat.com (8.13.8/8.13.8) with ESMTP id r778hQoM012429; Wed, 7 Aug 2013 04:43:28 -0400 From: Cong Wang To: netdev@vger.kernel.org Cc: "David S. Miller" , Stephen Hemminger , Cong Wang Subject: [Patch net-next] vxlan: fix a soft lockup in vxlan module removal Date: Wed, 7 Aug 2013 16:43:22 +0800 Message-Id: <1375865002-8732-1-git-send-email-amwang@redhat.com> X-Scanned-By: MIMEDefang 2.67 on 10.5.11.11 Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Cong Wang This is a regression introduced by: commit fe5c3561e6f0ac7c9546209f01351113c1b77ec8 Author: stephen hemminger Date: Sat Jul 13 10:18:18 2013 -0700 vxlan: add necessary locking on device removal The problem is that vxlan_dellink(), which is called with RTNL lock held, tries to flush the workqueue synchronously, but apparently igmp_join and igmp_leave work need to hold RTNL lock too, therefore we have a soft lockup! As suggested by Stephen, probably the flush_workqueue can just be removed and let the normal refcounting work. The workqueue has a reference to device and socket, therefore the cleanups should work correctly. Suggested-by: Stephen Hemminger Cc: Stephen Hemminger Cc: David S. Miller Tested-by: Cong Wang Signed-off-by: Cong Wang Acked-by: Stephen Hemminger --- -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c index 8bf31d9..c51ef9b 100644 --- a/drivers/net/vxlan.c +++ b/drivers/net/vxlan.c @@ -1837,8 +1837,6 @@ static void vxlan_dellink(struct net_device *dev, struct list_head *head) struct vxlan_net *vn = net_generic(dev_net(dev), vxlan_net_id); struct vxlan_dev *vxlan = netdev_priv(dev); - flush_workqueue(vxlan_wq); - spin_lock(&vn->sock_lock); hlist_del_rcu(&vxlan->hlist); spin_unlock(&vn->sock_lock);