From patchwork Sat Sep 28 16:48:32 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Taehee Yoo
X-Patchwork-Id: 1168861
X-Patchwork-Delegate: davem@davemloft.net
From: Taehee Yoo
To: davem@davemloft.net, netdev@vger.kernel.org,
 linux-wireless@vger.kernel.org, jakub.kicinski@netronome.com,
 johannes@sipsolutions.net, j.vosburgh@gmail.com, vfalico@gmail.com,
 andy@greyhouse.net, jiri@resnulli.us, sd@queasysnail.net,
 roopa@cumulusnetworks.com, saeedm@mellanox.com, manishc@marvell.com,
 rahulv@marvell.com, kys@microsoft.com, haiyangz@microsoft.com,
 stephen@networkplumber.org, sashal@kernel.org, hare@suse.de,
 varun@chelsio.com, ubraun@linux.ibm.com, kgraul@linux.ibm.com,
 jay.vosburgh@canonical.com, schuffelen@google.com, bjorn@mork.no
Cc: ap420073@gmail.com
Subject: [PATCH net v4 01/12] net: core: limit nested device depth
Date: Sat, 28 Sep 2019 16:48:32 +0000
Message-Id: <20190928164843.31800-2-ap420073@gmail.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20190928164843.31800-1-ap420073@gmail.com>
References: <20190928164843.31800-1-ap420073@gmail.com>
Sender: netdev-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org

Current code doesn't limit the number of nested devices.
Nested devices are handled recursively, which needs a large amount of
stack memory, so an unlimited number of nested devices can overflow the
stack.

This patch adds upper_level and lower_level, common variables that
represent the maximum upper/lower depth.
When an upper/lower device is attached or detached,
{lower/upper}_level are updated, and if the maximum depth is bigger
than 8, the attach routine fails and returns -EMLINK.

In addition, this patch converts the recursive
netdev_walk_all_{lower/upper} routines to iterative ones.

Test commands:
    ip link add dummy0 type dummy
    ip link add link dummy0 name vlan1 type vlan id 1
    ip link set vlan1 up

    for i in {2..200}
    do
            let A=$i-1

            ip link add vlan$i link vlan$A type vlan id $i
    done
    ip link del vlan1

Splat looks like:
[  923.102992] Thread overran stack, or stack corrupted
[  923.103471] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN PTI
[  923.104086] CPU: 0 PID: 1597 Comm: ip Not tainted 5.3.0+ #3
[  923.104771] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[  923.108837] RIP: 0010:stack_depot_fetch+0x10/0x30
[  923.109470] Code: 00 75 10 48 8b 73 18 48 89 ef 5b 5d e9 79 b1 83 ff 0f 0b e8 92 96 97 ff eb e9 89 f8 c1 ef 11 25 ff 0
[  923.111775] RSP: 0018:ffff8880541ceb78 EFLAGS: 00010006
[  923.112452] RAX: 00000000001fffff RBX: ffff8880541cee88 RCX: 0000000000000000
[  923.113399] RDX: 000000000000001d RSI: ffff8880541ceb80 RDI: 0000000000003ff0
[  923.114284] RBP: ffffea0001507380 R08: ffffed100d8fdf23 R09: ffffed100d8fdf23
[  923.115183] R10: 0000000000000001 R11: ffffed100d8fdf22 R12: ffff88806c240880
[  923.115986] R13: ffff8880541cec98 R14: ffff8880541cee88 R15: ffff8880541ced20
[  923.120477] FS: 00007ff38ab4f0c0(0000) GS:ffff88806c600000(0000) knlGS:0000000000000000
[  923.121486] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  923.122451] CR2: ffffffffa5be5658 CR3: 0000000053532004 CR4: 00000000000606f0
[  923.123303] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  923.128422] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  923.129399] Call Trace:
[  923.129710] Modules linked in: 8021q dummy ip_tables x_tables
[  923.130518] CR2: ffffffffa5be5658
[  923.130909] ---[ end trace 9568b7d36ab26094 ]---
[  923.131457] RIP: 0010:stack_depot_fetch+0x10/0x30
[  923.132006] Code: 00 75 10 48 8b 73 18 48 89 ef 5b 5d e9 79 b1 83 ff 0f 0b e8 92 96 97 ff eb e9 89 f8 c1 ef 11 25 ff 0
[  923.134219] RSP: 0018:ffff8880541ceb78 EFLAGS: 00010006
[  923.134834] RAX: 00000000001fffff RBX: ffff8880541cee88 RCX: 0000000000000000
[  923.135664] RDX: 000000000000001d RSI: ffff8880541ceb80 RDI: 0000000000003ff0
[  923.136514] RBP: ffffea0001507380 R08: ffffed100d8fdf23 R09: ffffed100d8fdf23
[  923.137276] R10: 0000000000000001 R11: ffffed100d8fdf22 R12: ffff88806c240880
[  923.138025] R13: ffff8880541cec98 R14: ffff8880541cee88 R15: ffff8880541ced20
[  923.138773] FS: 00007ff38ab4f0c0(0000) GS:ffff88806c600000(0000) knlGS:0000000000000000
[  923.140099] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  923.140763] CR2: ffffffffa5be5658 CR3: 0000000053532004 CR4: 00000000000606f0
[  923.141539] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  923.144930] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  923.145942] Kernel panic - not syncing: Fatal exception

Signed-off-by: Taehee Yoo
---

v3 -> v4 :
 - This patch is not changed
v2 -> v3 :
 - Modify nesting infra code to use iterator instead of recursive
v1 -> v2 :
 - This patch is not changed

 include/linux/netdevice.h |   4 +
 net/core/dev.c            | 286 ++++++++++++++++++++++++++++++++------
 2 files changed, 245 insertions(+), 45 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 9eda1c31d1f7..613007aa5986 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1637,6 +1637,8 @@ enum netdev_priv_flags {
  *	@type:		Interface hardware type
  *	@hard_header_len: Maximum hardware header length.
  *	@min_header_len:  Minimum hardware header length
+ *	@upper_level:	Maximum depth level of upper devices.
+ *	@lower_level:	Maximum depth level of lower devices.
  *
  *	@needed_headroom: Extra headroom the hardware may need, but not in all
  *			  cases can this be guaranteed
@@ -1867,6 +1869,8 @@ struct net_device {
 	unsigned short		type;
 	unsigned short		hard_header_len;
 	unsigned char		min_header_len;
+	unsigned char		upper_level;
+	unsigned char		lower_level;
 
 	unsigned short		needed_headroom;
 	unsigned short		needed_tailroom;
diff --git a/net/core/dev.c b/net/core/dev.c
index bf3ed413abaf..13cb646fb98f 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -146,6 +146,7 @@
 #include "net-sysfs.h"
 
 #define MAX_GRO_SKBS 8
+#define MAX_NEST_DEV 8
 
 /* This should be increased if a protocol with a bigger head is added. */
 #define GRO_MAX_HEAD (MAX_HEADER + 128)
@@ -6644,6 +6645,21 @@ struct net_device *netdev_upper_get_next_dev_rcu(struct net_device *dev,
 }
 EXPORT_SYMBOL(netdev_upper_get_next_dev_rcu);
 
+static struct net_device *netdev_next_upper_dev(struct net_device *dev,
+						struct list_head **iter)
+{
+	struct netdev_adjacent *upper;
+
+	upper = list_entry((*iter)->next, struct netdev_adjacent, list);
+
+	if (&upper->list == &dev->adj_list.upper)
+		return NULL;
+
+	*iter = &upper->list;
+
+	return upper->dev;
+}
+
 static struct net_device *netdev_next_upper_dev_rcu(struct net_device *dev,
 						    struct list_head **iter)
 {
@@ -6661,31 +6677,103 @@ static struct net_device *netdev_next_upper_dev_rcu(struct net_device *dev,
 	return upper->dev;
 }
 
+int netdev_walk_all_upper_dev(struct net_device *dev,
+			      int (*fn)(struct net_device *dev,
+					void *data),
+			      void *data)
+{
+	struct net_device *udev, *next, *now, *dev_stack[MAX_NEST_DEV + 1];
+	struct list_head *niter, *iter, *iter_stack[MAX_NEST_DEV + 1];
+	int ret, cur = 0;
+
+	now = dev;
+	iter = &dev->adj_list.upper;
+
+	while (1) {
+		if (now != dev) {
+			ret = fn(now, data);
+			if (ret)
+				return ret;
+		}
+
+		next = NULL;
+		while (1) {
+			udev = netdev_next_upper_dev(now, &iter);
+			if (!udev)
+				break;
+
+			if (!next) {
+				next = udev;
+				niter = &udev->adj_list.upper;
+			} else {
+				dev_stack[cur] = udev;
+				iter_stack[cur++] = &udev->adj_list.upper;
+				break;
+			}
+		}
+
+		if (!next) {
+			if (!cur)
+				return 0;
+			next = dev_stack[--cur];
+			niter = iter_stack[cur];
+		}
+
+		now = next;
+		iter = niter;
+	}
+
+	return 0;
+}
+
 int netdev_walk_all_upper_dev_rcu(struct net_device *dev,
 				  int (*fn)(struct net_device *dev,
 					    void *data),
 				  void *data)
 {
-	struct net_device *udev;
-	struct list_head *iter;
-	int ret;
+	struct net_device *udev, *next, *now, *dev_stack[MAX_NEST_DEV + 1];
+	struct list_head *niter, *iter, *iter_stack[MAX_NEST_DEV + 1];
+	int ret, cur = 0;
 
-	for (iter = &dev->adj_list.upper,
-	     udev = netdev_next_upper_dev_rcu(dev, &iter);
-	     udev;
-	     udev = netdev_next_upper_dev_rcu(dev, &iter)) {
-		/* first is the upper device itself */
-		ret = fn(udev, data);
-		if (ret)
-			return ret;
+	now = dev;
+	iter = &dev->adj_list.upper;
 
-		/* then look at all of its upper devices */
-		ret = netdev_walk_all_upper_dev_rcu(udev, fn, data);
-		if (ret)
-			return ret;
+	while (1) {
+		if (now != dev) {
+			ret = fn(now, data);
+			if (ret)
+				return ret;
+		}
+
+		next = NULL;
+		while (1) {
+			udev = netdev_next_upper_dev_rcu(now, &iter);
+			if (!udev)
+				break;
+
+			if (!next) {
+				next = udev;
+				niter = &udev->adj_list.upper;
+			} else {
+				dev_stack[cur] = udev;
+				iter_stack[cur++] = &udev->adj_list.upper;
+				break;
+			}
+		}
+
+		if (!next) {
+			if (!cur)
+				return 0;
+			next = dev_stack[--cur];
+			niter = iter_stack[cur];
+		}
+
+		now = next;
+		iter = niter;
 	}
 
 	return 0;
+
 }
 EXPORT_SYMBOL_GPL(netdev_walk_all_upper_dev_rcu);
 
@@ -6790,23 +6878,45 @@ int netdev_walk_all_lower_dev(struct net_device *dev,
 					void *data),
 			      void *data)
 {
-	struct net_device *ldev;
-	struct list_head *iter;
-	int ret;
+	struct net_device *ldev, *next, *now, *dev_stack[MAX_NEST_DEV + 1];
+	struct list_head *niter, *iter, *iter_stack[MAX_NEST_DEV + 1];
+	int ret, cur = 0;
 
-	for (iter = &dev->adj_list.lower,
-	     ldev = netdev_next_lower_dev(dev, &iter);
-	     ldev;
-	     ldev = netdev_next_lower_dev(dev, &iter)) {
-		/* first is the lower device itself */
-		ret = fn(ldev, data);
-		if (ret)
-			return ret;
+	now = dev;
+	iter = &dev->adj_list.lower;
 
-		/* then look at all of its lower devices */
-		ret = netdev_walk_all_lower_dev(ldev, fn, data);
-		if (ret)
-			return ret;
+	while (1) {
+		if (now != dev) {
+			ret = fn(now, data);
+			if (ret)
+				return ret;
+		}
+
+		next = NULL;
+		while (1) {
+			ldev = netdev_next_lower_dev(now, &iter);
+			if (!ldev)
+				break;
+
+			if (!next) {
+				next = ldev;
+				niter = &ldev->adj_list.lower;
+			} else {
+				dev_stack[cur] = ldev;
+				iter_stack[cur++] = &ldev->adj_list.lower;
+				break;
+			}
+		}
+
+		if (!next) {
+			if (!cur)
+				return 0;
+			next = dev_stack[--cur];
+			niter = iter_stack[cur];
+		}
+
+		now = next;
+		iter = niter;
 	}
 
 	return 0;
@@ -6827,31 +6937,100 @@ static struct net_device *netdev_next_lower_dev_rcu(struct net_device *dev,
 	return lower->dev;
 }
 
-int netdev_walk_all_lower_dev_rcu(struct net_device *dev,
-				  int (*fn)(struct net_device *dev,
-					    void *data),
-				  void *data)
+static u8 __netdev_upper_depth(struct net_device *dev)
+{
+	struct net_device *udev;
+	struct list_head *iter;
+	u8 max_depth = 0;
+
+	for (iter = &dev->adj_list.upper,
+	     udev = netdev_next_upper_dev(dev, &iter);
+	     udev;
+	     udev = netdev_next_upper_dev(dev, &iter)) {
+		if (max_depth < udev->upper_level)
+			max_depth = udev->upper_level;
+	}
+
+	return max_depth;
+}
+
+static u8 __netdev_lower_depth(struct net_device *dev)
 {
 	struct net_device *ldev;
 	struct list_head *iter;
-	int ret;
+	u8 max_depth = 0;
 
 	for (iter = &dev->adj_list.lower,
-	     ldev = netdev_next_lower_dev_rcu(dev, &iter);
+	     ldev = netdev_next_lower_dev(dev, &iter);
 	     ldev;
-	     ldev = netdev_next_lower_dev_rcu(dev, &iter)) {
-		/* first is the lower device itself */
-		ret = fn(ldev, data);
-		if (ret)
-			return ret;
+	     ldev = netdev_next_lower_dev(dev, &iter)) {
+		if (max_depth < ldev->lower_level)
+			max_depth = ldev->lower_level;
+	}
 
-		/* then look at all of its lower devices */
-		ret = netdev_walk_all_lower_dev_rcu(ldev, fn, data);
-		if (ret)
-			return ret;
+	return max_depth;
+}
+
+static int __netdev_update_upper_level(struct net_device *dev, void *data)
+{
+	dev->upper_level = __netdev_upper_depth(dev) + 1;
+	return 0;
+}
+
+static int __netdev_update_lower_level(struct net_device *dev, void *data)
+{
+	dev->lower_level = __netdev_lower_depth(dev) + 1;
+	return 0;
+}
+
+int netdev_walk_all_lower_dev_rcu(struct net_device *dev,
+				  int (*fn)(struct net_device *dev,
+					    void *data),
+				  void *data)
+{
+	struct net_device *ldev, *next, *now, *dev_stack[MAX_NEST_DEV + 1];
+	struct list_head *niter, *iter, *iter_stack[MAX_NEST_DEV + 1];
+	int ret, cur = 0;
+
+	now = dev;
+	iter = &dev->adj_list.lower;
+
+	while (1) {
+		if (now != dev) {
+			ret = fn(now, data);
+			if (ret)
+				return ret;
+		}
+
+		next = NULL;
+		while (1) {
+			ldev = netdev_next_lower_dev_rcu(now, &iter);
+			if (!ldev)
+				break;
+
+			if (!next) {
+				next = ldev;
+				niter = &ldev->adj_list.lower;
+			} else {
+				dev_stack[cur] = ldev;
+				iter_stack[cur++] = &ldev->adj_list.lower;
+				break;
+			}
+		}
+
+		if (!next) {
+			if (!cur)
+				return 0;
+			next = dev_stack[--cur];
+			niter = iter_stack[cur];
+		}
+
+		now = next;
+		iter = niter;
 	}
 
 	return 0;
+
 }
 EXPORT_SYMBOL_GPL(netdev_walk_all_lower_dev_rcu);
 
@@ -7105,6 +7284,9 @@ static int __netdev_upper_dev_link(struct net_device *dev,
 	if (netdev_has_upper_dev(upper_dev, dev))
 		return -EBUSY;
 
+	if ((dev->lower_level + upper_dev->upper_level) > MAX_NEST_DEV)
+		return -EMLINK;
+
 	if (!master) {
 		if (netdev_has_upper_dev(dev, upper_dev))
 			return -EEXIST;
@@ -7131,6 +7313,12 @@ static int __netdev_upper_dev_link(struct net_device *dev,
 	if (ret)
 		goto rollback;
 
+	__netdev_update_upper_level(dev, NULL);
+	netdev_walk_all_lower_dev(dev, __netdev_update_upper_level, NULL);
+
+	__netdev_update_lower_level(upper_dev, NULL);
+	netdev_walk_all_upper_dev(upper_dev, __netdev_update_lower_level, NULL);
+
 	return 0;
 
 rollback:
@@ -7213,6 +7401,12 @@ void netdev_upper_dev_unlink(struct net_device *dev,
 
 	call_netdevice_notifiers_info(NETDEV_CHANGEUPPER,
 				      &changeupper_info.info);
+
+	__netdev_update_upper_level(dev, NULL);
+	netdev_walk_all_lower_dev(dev, __netdev_update_upper_level, NULL);
+
+	__netdev_update_lower_level(upper_dev, NULL);
+	netdev_walk_all_upper_dev(upper_dev, __netdev_update_lower_level, NULL);
 }
 EXPORT_SYMBOL(netdev_upper_dev_unlink);
 
@@ -9212,6 +9406,8 @@ struct net_device *alloc_netdev_mqs(int sizeof_priv, const char *name,
 
 	dev->gso_max_size = GSO_MAX_SIZE;
 	dev->gso_max_segs = GSO_MAX_SEGS;
+	dev->upper_level = 1;
+	dev->lower_level = 1;
 
 	INIT_LIST_HEAD(&dev->napi_list);
 	INIT_LIST_HEAD(&dev->unreg_list);

From patchwork Sat Sep 28 16:48:33 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Taehee Yoo
X-Patchwork-Id: 1168862
X-Patchwork-Delegate: davem@davemloft.net
From: Taehee Yoo
To: davem@davemloft.net, netdev@vger.kernel.org,
 linux-wireless@vger.kernel.org, jakub.kicinski@netronome.com,
 johannes@sipsolutions.net, j.vosburgh@gmail.com, vfalico@gmail.com,
 andy@greyhouse.net, jiri@resnulli.us, sd@queasysnail.net,
 roopa@cumulusnetworks.com, saeedm@mellanox.com, manishc@marvell.com,
 rahulv@marvell.com, kys@microsoft.com, haiyangz@microsoft.com,
 stephen@networkplumber.org, sashal@kernel.org, hare@suse.de,
 varun@chelsio.com, ubraun@linux.ibm.com, kgraul@linux.ibm.com,
 jay.vosburgh@canonical.com, schuffelen@google.com, bjorn@mork.no
Cc: ap420073@gmail.com
Subject: [PATCH net v4 02/12] vlan: use dynamic lockdep key instead of subclass
Date: Sat, 28 Sep 2019 16:48:33 +0000
Message-Id: <20190928164843.31800-3-ap420073@gmail.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20190928164843.31800-1-ap420073@gmail.com>
References: <20190928164843.31800-1-ap420073@gmail.com>
Sender: netdev-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org

All VLAN devices share the same lockdep key, and the subclass is
initialized with nest_level.
But the actual nest_level value can change when a lower device is
attached. At that moment the subclass should be updated, but doing so
seems to be unsafe.
So this patch makes VLAN use a dynamic lockdep key instead of the
subclass.

Test commands:
    ip link add dummy0 type dummy
    ip link set dummy0 up
    ip link add bond0 type bond

    ip link add vlan_dummy1 link dummy0 type vlan id 1
    ip link add vlan_bond1 link bond0 type vlan id 2
    ip link set vlan_dummy1 master bond0

    ip link set bond0 up
    ip link set vlan_dummy1 up
    ip link set vlan_bond1 up

Both vlan_dummy1 and vlan_bond1 have the same subclass, which triggers
an unnecessary deadlock warning message.
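
The false positive described above can be reproduced in a small
user-space sketch. This is not lockdep and not part of the patch: the
names toy_key, toy_lock, toy_same_class, toy_would_warn,
toy_shared_key_warns and toy_per_device_key_warns are hypothetical, and
the model only assumes that a lock class is identified by the address of
its key plus a subclass number, and that taking a lock of a class that
is already held gets reported.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct lock_class_key: only its address matters. */
struct toy_key {
	char unused;
};

/* Hypothetical stand-in for a lock annotated with a key and a subclass. */
struct toy_lock {
	const struct toy_key *key;
	unsigned int subclass;
};

/* Assumption of this model: a lock class is the pair (key address, subclass). */
static bool toy_same_class(const struct toy_lock *a, const struct toy_lock *b)
{
	return a->key == b->key && a->subclass == b->subclass;
}

/*
 * Model of the recursive-locking check: taking "next" while "held" is
 * already held is reported when both locks belong to the same class.
 */
static bool toy_would_warn(const struct toy_lock *held,
			   const struct toy_lock *next)
{
	return toy_same_class(held, next);
}

/* Old scheme: one static key for every VLAN device, subclass = nest_level. */
bool toy_shared_key_warns(unsigned int nest_a, unsigned int nest_b)
{
	static struct toy_key shared_vlan_key;
	struct toy_lock a = { &shared_vlan_key, nest_a };
	struct toy_lock b = { &shared_vlan_key, nest_b };

	return toy_would_warn(&a, &b);
}

/* New scheme: every device carries its own key and no subclass is used. */
bool toy_per_device_key_warns(void)
{
	struct toy_key key_a = { 0 }, key_b = { 0 };
	struct toy_lock a = { &key_a, 0 };
	struct toy_lock b = { &key_b, 0 };

	return toy_would_warn(&a, &b);
}
```

In this model, two devices that both end up with subclass 1 make
toy_shared_key_warns(1, 1) return true, while
toy_per_device_key_warns() returns false because the two key addresses
always differ.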

Splat looks like:
[   75.879233] WARNING: possible recursive locking detected
[   75.879881] 5.3.0+ #3 Not tainted
[   75.880285] --------------------------------------------
[   75.880933] ip/634 is trying to acquire lock:
[   75.881463] ffff8880673c2558 (&vlan_netdev_addr_lock_key/1){+...}, at: dev_uc_sync_multiple+0xfa/0x1a0
[   75.882714]
[   75.882714] but task is already holding lock:
[   75.883502] ffff8880645193f8 (&vlan_netdev_addr_lock_key/1){+...}, at: dev_set_rx_mode+0x19/0x30
[   75.884707]
[   75.884707] other info that might help us debug this:
[   75.885742]  Possible unsafe locking scenario:
[   75.885742]
[   75.887013]        CPU0
[   75.887415]        ----
[   75.887723]   lock(&vlan_netdev_addr_lock_key/1);
[   75.888280]   lock(&vlan_netdev_addr_lock_key/1);
[   75.888852]
[   75.888852]  *** DEADLOCK ***
[   75.888852]
[   75.889569]  May be due to missing lock nesting notation
[   75.889569]
[   75.890453] 4 locks held by ip/634:
[   75.890992]  #0: ffffffff96ec7a30 (rtnl_mutex){+.+.}, at: rtnetlink_rcv_msg+0x466/0x8a0
[   75.892021]  #1: ffff8880645193f8 (&vlan_netdev_addr_lock_key/1){+...}, at: dev_set_rx_mode+0x19/0x30
[   75.893387]  #2: ffff8880694c4558 (&dev_addr_list_lock_key/3){+...}, at: dev_mc_sync+0xfa/0x1a0
[   75.894545]  #3: ffffffff96b22780 (rcu_read_lock){....}, at: bond_set_rx_mode+0x5/0x3c0 [bonding]
[   75.895558]
[   75.895558] stack backtrace:
[   75.896003] CPU: 0 PID: 634 Comm: ip Not tainted 5.3.0+ #3
[   75.896566] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[   75.897549] Call Trace:
[   75.897916]  dump_stack+0x7c/0xbb
[   75.898287]  __lock_acquire+0x26a9/0x3df0
[   75.898664]  ? register_lock_class+0x14d0/0x14d0
[   75.899255]  lock_acquire+0x164/0x3b0
[   75.899718]  ? dev_uc_sync_multiple+0xfa/0x1a0
[   75.900245]  ? rcu_read_lock_held+0x90/0xa0
[   75.900707]  _raw_spin_lock_nested+0x2e/0x60
[   75.901149]  ? dev_uc_sync_multiple+0xfa/0x1a0
[   75.901629]  dev_uc_sync_multiple+0xfa/0x1a0
[   75.902116]  bond_set_rx_mode+0x269/0x3c0 [bonding]
[   75.903135]  ? bond_init+0x6f0/0x6f0 [bonding]
[   75.903696]  dev_mc_sync+0x15a/0x1a0
[ ... ]

Fixes: 0fe1e567d0b4 ("[VLAN]: nested VLAN: fix lockdep's recursive locking warning")
Signed-off-by: Taehee Yoo
---

v1 -> v4 :
 - This patch is not changed

 include/linux/if_vlan.h |  3 +++
 net/8021q/vlan_dev.c    | 28 +++++++++++++++-------------
 2 files changed, 18 insertions(+), 13 deletions(-)

diff --git a/include/linux/if_vlan.h b/include/linux/if_vlan.h
index 244278d5c222..1aed9f613e90 100644
--- a/include/linux/if_vlan.h
+++ b/include/linux/if_vlan.h
@@ -183,6 +183,9 @@ struct vlan_dev_priv {
 	struct netpoll				*netpoll;
 #endif
 	unsigned int				nest_level;
+
+	struct lock_class_key			xmit_lock_key;
+	struct lock_class_key			addr_lock_key;
 };
 
 static inline struct vlan_dev_priv *vlan_dev_priv(const struct net_device *dev)
diff --git a/net/8021q/vlan_dev.c b/net/8021q/vlan_dev.c
index 93eadf179123..12bc80650087 100644
--- a/net/8021q/vlan_dev.c
+++ b/net/8021q/vlan_dev.c
@@ -494,24 +494,24 @@ static void vlan_dev_set_rx_mode(struct net_device *vlan_dev)
  * "super class" of normal network devices; split their locks off into a
  * separate class since they always nest.
  */
-static struct lock_class_key vlan_netdev_xmit_lock_key;
-static struct lock_class_key vlan_netdev_addr_lock_key;
-
 static void vlan_dev_set_lockdep_one(struct net_device *dev,
 				     struct netdev_queue *txq,
-				     void *_subclass)
+				     void *_unused)
 {
-	lockdep_set_class_and_subclass(&txq->_xmit_lock,
-				       &vlan_netdev_xmit_lock_key,
-				       *(int *)_subclass);
+	struct vlan_dev_priv *vlan = vlan_dev_priv(dev);
+
+	lockdep_set_class(&txq->_xmit_lock, &vlan->xmit_lock_key);
 }
 
-static void vlan_dev_set_lockdep_class(struct net_device *dev, int subclass)
+static void vlan_dev_set_lockdep_class(struct net_device *dev)
 {
-	lockdep_set_class_and_subclass(&dev->addr_list_lock,
-				       &vlan_netdev_addr_lock_key,
-				       subclass);
-	netdev_for_each_tx_queue(dev, vlan_dev_set_lockdep_one, &subclass);
+	struct vlan_dev_priv *vlan = vlan_dev_priv(dev);
+
+	lockdep_register_key(&vlan->addr_lock_key);
+	lockdep_set_class(&dev->addr_list_lock, &vlan->addr_lock_key);
+
+	lockdep_register_key(&vlan->xmit_lock_key);
+	netdev_for_each_tx_queue(dev, vlan_dev_set_lockdep_one, NULL);
 }
 
 static int vlan_dev_get_lock_subclass(struct net_device *dev)
@@ -609,7 +609,7 @@ static int vlan_dev_init(struct net_device *dev)
 
 	SET_NETDEV_DEVTYPE(dev, &vlan_type);
 
-	vlan_dev_set_lockdep_class(dev, vlan_dev_get_lock_subclass(dev));
+	vlan_dev_set_lockdep_class(dev);
 
 	vlan->vlan_pcpu_stats = netdev_alloc_pcpu_stats(struct vlan_pcpu_stats);
 	if (!vlan->vlan_pcpu_stats)
@@ -630,6 +630,8 @@ static void vlan_dev_uninit(struct net_device *dev)
 			kfree(pm);
 		}
 	}
+	lockdep_unregister_key(&vlan->addr_lock_key);
+	lockdep_unregister_key(&vlan->xmit_lock_key);
 }
 
 static netdev_features_t vlan_dev_fix_features(struct net_device *dev,

From patchwork Sat Sep 28 16:48:34 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Taehee Yoo
X-Patchwork-Id: 1168863
X-Patchwork-Delegate: davem@davemloft.net
From: Taehee Yoo
To: davem@davemloft.net, netdev@vger.kernel.org,
 linux-wireless@vger.kernel.org, jakub.kicinski@netronome.com,
 johannes@sipsolutions.net, j.vosburgh@gmail.com, vfalico@gmail.com,
 andy@greyhouse.net, jiri@resnulli.us, sd@queasysnail.net,
 roopa@cumulusnetworks.com, saeedm@mellanox.com, manishc@marvell.com,
 rahulv@marvell.com, kys@microsoft.com, haiyangz@microsoft.com,
 stephen@networkplumber.org, sashal@kernel.org, hare@suse.de,
 varun@chelsio.com, ubraun@linux.ibm.com, kgraul@linux.ibm.com,
 jay.vosburgh@canonical.com, schuffelen@google.com, bjorn@mork.no
Cc: ap420073@gmail.com
Subject: [PATCH net v4 03/12] bonding: fix unexpected IFF_BONDING bit unset
Date: Sat, 28 Sep 2019 16:48:34 +0000
Message-Id: <20190928164843.31800-4-ap420073@gmail.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20190928164843.31800-1-ap420073@gmail.com>
References: <20190928164843.31800-1-ap420073@gmail.com>
Sender: netdev-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org

IFF_BONDING marks a bonding master or bonding slave device.
->ndo_add_slave() sets the IFF_BONDING flag and ->ndo_del_slave() unsets
it.

bond0<--bond1

Both bond0 and bond1 are bonding devices, and they should keep the
IFF_BONDING flag until they are removed.
But bond1 would lose IFF_BONDING at ->ndo_del_slave() because that
routine does not check whether the slave device is itself a bonding
device.
This patch adds an interface type check before removing the IFF_BONDING
flag.

Test commands:
    ip link add bond0 type bond
    ip link add bond1 type bond
    ip link set bond1 master bond0
    ip link set bond1 nomaster
    ip link del bond1 type bond
    ip link add bond1 type bond

Splat looks like:
[   38.843933] proc_dir_entry 'bonding/bond1' already registered
[   38.844741] WARNING: CPU: 1 PID: 631 at fs/proc/generic.c:361 proc_register+0x2a9/0x3e0
[   38.845741] Modules linked in: bonding ip_tables x_tables
[   38.846432] CPU: 1 PID: 631 Comm: ip Not tainted 5.3.0+ #3
[   38.847234] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[   38.848489] RIP: 0010:proc_register+0x2a9/0x3e0
[   38.849164] Code: 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 39 01 00 00 48 8b 04 24 48 89 ea 48 c7 c7 e0 2b 34 b3 48 8b b0 e0 00 00 00 e8 c7 b6 89 ff <0f> 0b 48 c7 c7 40 3d c5 b3 e8 99 7a 38 01 48 8b 4c 24 10 48 b8 00
[   38.851317] RSP: 0018:ffff888061527078 EFLAGS: 00010282
[   38.851902] RAX: dffffc0000000008 RBX: ffff888064dc8cb0 RCX: ffffffffb1d252a2
[   38.852684] RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffff88806cbf6b8c
[   38.853464] RBP: ffff888064dc8f33 R08: ffffed100d980019 R09: ffffed100d980019
[   38.854242] R10: 0000000000000001 R11: ffffed100d980018 R12: ffff888064dc8e48
[   38.855929] R13: ffff888064dc8f32 R14: dffffc0000000000 R15: ffffed100c9b91e6
[   38.856695] FS: 00007fc9fcc230c0(0000) GS:ffff88806ca00000(0000) knlGS:0000000000000000
[   38.857541] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   38.858150] CR2: 000055948b91c118 CR3: 0000000057110006 CR4: 00000000000606e0
[   38.858957] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   38.859785] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   38.860700] Call Trace:
[   38.861004]  proc_create_seq_private+0xb3/0xf0
[   38.861460]  bond_create_proc_entry+0x1b3/0x3f0 [bonding]
[   38.862113]  bond_netdev_event+0x433/0x970 [bonding]
[   38.862762]  ? __module_text_address+0x13/0x140
[   38.867678]  notifier_call_chain+0x90/0x160
[   38.868257]  register_netdevice+0x9b3/0xd80
[   38.868791]  ? alloc_netdev_mqs+0x854/0xc10
[   38.869335]  ? netdev_change_features+0xa0/0xa0
[   38.869852]  ? rtnl_create_link+0x2ed/0xad0
[   38.870423]  bond_newlink+0x2a/0x60 [bonding]
[   38.870935]  __rtnl_newlink+0xb9f/0x11b0
[ ... ]

Fixes: 0b680e753724 ("[PATCH] bonding: Add priv_flag to avoid event mishandling")
Signed-off-by: Taehee Yoo
Signed-off-by: Jay Vosburgh
---

v2 -> v4 :
 - This patch is not changed
v1 -> v2 :
 - Do not add a new priv_flag.

 drivers/net/bonding/bond_main.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 931d9d935686..0db12fcfc953 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -1816,7 +1816,8 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev,
 	slave_disable_netpoll(new_slave);
 
 err_close:
-	slave_dev->priv_flags &= ~IFF_BONDING;
+	if (!netif_is_bond_master(slave_dev))
+		slave_dev->priv_flags &= ~IFF_BONDING;
 	dev_close(slave_dev);
 
 err_restore_mac:
@@ -2017,7 +2018,8 @@ static int __bond_release_one(struct net_device *bond_dev,
 	else
 		dev_set_mtu(slave_dev, slave->original_mtu);
 
-	slave_dev->priv_flags &= ~IFF_BONDING;
+	if (!netif_is_bond_master(slave_dev))
+		slave_dev->priv_flags &= ~IFF_BONDING;
 
 	bond_free_slave(slave);
 

From patchwork Sat Sep 28 16:48:35 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Taehee Yoo
X-Patchwork-Id: 1168864
X-Patchwork-Delegate: davem@davemloft.net
2m969JeAzYVQEsvqmIOHd64qVqpXu2rBf6j46UuwEqImaG9gcHt6N5hZM0FYmiABY9Ca b3eN5ERrNT7U0d8A0OySpq4CBrtF235lGQhDjLvCWkKdb/5aWhZGBL+bOOJJ8tWd4Rkx P8Go5xOkU/G7+iL9k3PdVwEuWdx1aeQTHibC4Pr89/62h9jRFF3O5bURvdNPtZPRUaaC a3uA==
X-Gm-Message-State: APjAAAX8eJpi4plv79f9kNVvMl40ey+fqZZoy8p0AxPAPOK6rkY8DNAi YuE8AgjSaOKpOle/12DB69I=
X-Google-Smtp-Source: APXvYqyu/frR49EK+yWlygP8n4tUzwjm5eD42GQyGlzbQUQFX8+mOp9xvWaXZqPBnk5l4WxbCqg8TQ==
X-Received: by 2002:a17:902:a588:: with SMTP id az8mr8094854plb.184.1569689377951; Sat, 28 Sep 2019 09:49:37 -0700 (PDT)
Received: from localhost.localdomain ([110.35.161.54]) by smtp.gmail.com with ESMTPSA id 30sm8663092pjk.25.2019.09.28.09.49.30 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 28 Sep 2019 09:49:37 -0700 (PDT)
From: Taehee Yoo
To: davem@davemloft.net, netdev@vger.kernel.org, linux-wireless@vger.kernel.org, jakub.kicinski@netronome.com, johannes@sipsolutions.net, j.vosburgh@gmail.com, vfalico@gmail.com, andy@greyhouse.net, jiri@resnulli.us, sd@queasysnail.net, roopa@cumulusnetworks.com, saeedm@mellanox.com, manishc@marvell.com, rahulv@marvell.com, kys@microsoft.com, haiyangz@microsoft.com, stephen@networkplumber.org, sashal@kernel.org, hare@suse.de, varun@chelsio.com, ubraun@linux.ibm.com, kgraul@linux.ibm.com, jay.vosburgh@canonical.com, schuffelen@google.com, bjorn@mork.no
Cc: ap420073@gmail.com
Subject: [PATCH net v4 04/12] bonding: use dynamic lockdep key instead of subclass
Date: Sat, 28 Sep 2019 16:48:35 +0000
Message-Id: <20190928164843.31800-5-ap420073@gmail.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20190928164843.31800-1-ap420073@gmail.com>
References: <20190928164843.31800-1-ap420073@gmail.com>
Sender: netdev-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: netdev@vger.kernel.org

All bonding devices share the same lockdep key, and the subclass is initialized from nest_level. But the actual nest_level value can change when a lower device is attached.
At that point the subclass would need to be updated, but doing so appears to be unsafe. So this patch makes bonding use dynamic lockdep keys instead of the subclass.

Test commands:
ip link add bond0 type bond
for i in {1..5}
do
    let A=$i-1
    ip link add bond$i type bond
    ip link set bond$i master bond$A
done
ip link set bond5 master bond0

Splat looks like:
[ 29.858108] WARNING: possible recursive locking detected
[ 29.858630] 5.3.0+ #3 Not tainted
[ 29.858946] --------------------------------------------
[ 29.859501] ip/629 is trying to acquire lock:
[ 29.860591] ffff88806801cf00 (&(&bond->stats_lock)->rlock#2/2){+.+.}, at: bond_get_stats+0xb8/0x500 [bonding]
[ 29.861677]
[ 29.861677] but task is already holding lock:
[ 29.862307] ffff88806801ada0 (&(&bond->stats_lock)->rlock#2/2){+.+.}, at: bond_get_stats+0xb8/0x500 [bonding]
[ 29.863406]
[ 29.863406] other info that might help us debug this:
[ 29.864092] Possible unsafe locking scenario:
[ 29.864092]
[ 29.864715] CPU0
[ 29.864968] ----
[ 29.865225] lock(&(&bond->stats_lock)->rlock#2/2);
[ 29.865731] lock(&(&bond->stats_lock)->rlock#2/2);
[ 29.866235]
[ 29.866235] *** DEADLOCK ***
[ 29.866235]
[ 29.866829] May be due to missing lock nesting notation
[ 29.866829]
[ 29.867632] 3 locks held by ip/629:
[ 29.868077] #0: ffffffffb4ec7a30 (rtnl_mutex){+.+.}, at: rtnetlink_rcv_msg+0x466/0x8a0
[ 29.869141] #1: ffff88806801ada0 (&(&bond->stats_lock)->rlock#2/2){+.+.}, at: bond_get_stats+0xb8/0x500 [bonding]
[ 29.870504] #2: ffffffffb4b22780 (rcu_read_lock){....}, at: bond_get_stats+0x9f/0x500 [bonding]
[ 29.875917]
[ 29.875917] stack backtrace:
[ 29.876533] CPU: 0 PID: 629 Comm: ip Not tainted 5.3.0+ #3
[ 29.877254] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[ 29.878344] Call Trace:
[ 29.878697] dump_stack+0x7c/0xbb
[ 29.879167] __lock_acquire+0x26a9/0x3df0
[ 29.879660] ? register_lock_class+0x14d0/0x14d0
[ 29.880067] lock_acquire+0x164/0x3b0
[ 29.880402] ?
bond_get_stats+0xb8/0x500 [bonding] [ 29.880826] _raw_spin_lock_nested+0x2e/0x60 [ 29.881206] ? bond_get_stats+0xb8/0x500 [bonding] [ 29.881725] bond_get_stats+0xb8/0x500 [bonding] [ ... ] Fixes: d3fff6c443fe ("net: add netdev_lockdep_set_classes() helper") Signed-off-by: Taehee Yoo --- v1 -> v4 : - This patch is not changed drivers/net/bonding/bond_main.c | 61 ++++++++++++++++++++++++++++++--- include/net/bonding.h | 3 ++ 2 files changed, 59 insertions(+), 5 deletions(-) diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index 0db12fcfc953..7f574e74ed78 100644 --- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -1857,6 +1857,32 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev, return res; } +static void bond_dev_set_lockdep_one(struct net_device *dev, + struct netdev_queue *txq, + void *_unused) +{ + struct bonding *bond = netdev_priv(dev); + + lockdep_set_class(&txq->_xmit_lock, &bond->xmit_lock_key); +} + +static void bond_update_lock_key(struct net_device *dev) +{ + struct bonding *bond = netdev_priv(dev); + + lockdep_unregister_key(&bond->stats_lock_key); + lockdep_unregister_key(&bond->addr_lock_key); + lockdep_unregister_key(&bond->xmit_lock_key); + + lockdep_register_key(&bond->stats_lock_key); + lockdep_register_key(&bond->addr_lock_key); + lockdep_register_key(&bond->xmit_lock_key); + + lockdep_set_class(&bond->stats_lock, &bond->stats_lock_key); + lockdep_set_class(&dev->addr_list_lock, &bond->addr_lock_key); + netdev_for_each_tx_queue(dev, bond_dev_set_lockdep_one, NULL); +} + /* Try to release the slave device from the bond device * It is legal to access curr_active_slave without a lock because all the function * is RTNL-locked. 
If "all" is true it means that the function is being called @@ -2022,6 +2048,8 @@ static int __bond_release_one(struct net_device *bond_dev, slave_dev->priv_flags &= ~IFF_BONDING; bond_free_slave(slave); + if (netif_is_bond_master(slave_dev)) + bond_update_lock_key(slave_dev); return 0; } @@ -3459,7 +3487,7 @@ static void bond_get_stats(struct net_device *bond_dev, struct list_head *iter; struct slave *slave; - spin_lock_nested(&bond->stats_lock, bond_get_nest_level(bond_dev)); + spin_lock(&bond->stats_lock); memcpy(stats, &bond->bond_stats, sizeof(*stats)); rcu_read_lock(); @@ -4297,8 +4325,6 @@ void bond_setup(struct net_device *bond_dev) { struct bonding *bond = netdev_priv(bond_dev); - spin_lock_init(&bond->mode_lock); - spin_lock_init(&bond->stats_lock); bond->params = bonding_defaults; /* Initialize pointers */ @@ -4367,6 +4393,9 @@ static void bond_uninit(struct net_device *bond_dev) list_del(&bond->bond_list); + lockdep_unregister_key(&bond->stats_lock_key); + lockdep_unregister_key(&bond->addr_lock_key); + lockdep_unregister_key(&bond->xmit_lock_key); bond_debug_unregister(bond); } @@ -4758,6 +4787,29 @@ static int bond_check_params(struct bond_params *params) return 0; } +static struct lock_class_key qdisc_tx_busylock_key; +static struct lock_class_key qdisc_running_key; + +static void bond_dev_set_lockdep_class(struct net_device *dev) +{ + struct bonding *bond = netdev_priv(dev); + + dev->qdisc_tx_busylock = &qdisc_tx_busylock_key; + dev->qdisc_running_key = &qdisc_running_key; + + spin_lock_init(&bond->mode_lock); + + spin_lock_init(&bond->stats_lock); + lockdep_register_key(&bond->stats_lock_key); + lockdep_set_class(&bond->stats_lock, &bond->stats_lock_key); + + lockdep_register_key(&bond->addr_lock_key); + lockdep_set_class(&dev->addr_list_lock, &bond->addr_lock_key); + + lockdep_register_key(&bond->xmit_lock_key); + netdev_for_each_tx_queue(dev, bond_dev_set_lockdep_one, NULL); +} + /* Called from registration process */ static int bond_init(struct 
net_device *bond_dev) { @@ -4771,8 +4823,7 @@ static int bond_init(struct net_device *bond_dev) return -ENOMEM; bond->nest_level = SINGLE_DEPTH_NESTING; - netdev_lockdep_set_classes(bond_dev); - + bond_dev_set_lockdep_class(bond_dev); list_add_tail(&bond->bond_list, &bn->dev_list); bond_prepare_sysfs_group(bond); diff --git a/include/net/bonding.h b/include/net/bonding.h index f7fe45689142..c39ac7061e41 100644 --- a/include/net/bonding.h +++ b/include/net/bonding.h @@ -239,6 +239,9 @@ struct bonding { struct dentry *debug_dir; #endif /* CONFIG_DEBUG_FS */ struct rtnl_link_stats64 bond_stats; + struct lock_class_key stats_lock_key; + struct lock_class_key xmit_lock_key; + struct lock_class_key addr_lock_key; }; #define bond_slave_get_rcu(dev) \ From patchwork Sat Sep 28 16:48:36 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taehee Yoo X-Patchwork-Id: 1168865 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="o0lMz0VE"; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 46gZQR3fDWz9sNF for ; Sun, 29 Sep 2019 02:49:51 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728714AbfI1Qtr (ORCPT ); Sat, 28 Sep 2019 12:49:47 -0400 Received: from mail-pf1-f196.google.com ([209.85.210.196]:44781 "EHLO mail-pf1-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id 
S1725897AbfI1Qtr (ORCPT ); Sat, 28 Sep 2019 12:49:47 -0400 Received: by mail-pf1-f196.google.com with SMTP id q21so3211137pfn.11; Sat, 28 Sep 2019 09:49:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=lnnmWURm6k9EQ+DtQjsTtP1Aom1VoK7uWrgxpoaLKOE=; b=o0lMz0VE40vlza5M+WJNxNwKuMx1r+ycjXNKAgbbkneJqZ07HwimgQrnHDgZ8PJfRI tYqsKkmUpJzORbsH5hN149wnBOD4vTTksB8uxcvyGaW4vkd/5XKZkHfnRoG/+u41S53N CG8B0TKVc+GpekWrdxRmVvnl1F9MU2rDde+1wLGPa0NkVw+UwGldX87oNZpKraRlJDUi LwxlsKfx7EGl60JZ3CyTgC27bR0n3WIfcR2GL628b9+gdEu6wR0jyqxfcinjampJWpF4 uXcm7A0nCb1IT9/x66sx/aKcpjkgiBefV2qPbEv7w5hpp2JPbxf+TBvZ6DaPnap63nlR jjdQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=lnnmWURm6k9EQ+DtQjsTtP1Aom1VoK7uWrgxpoaLKOE=; b=HxtbuNjiz4jl0X2/MO+XvRcSA2jtXK45Bk8Eyyy79bEhqynbgxFZM7FJvFzAxanopL aSLPW24b6ik5bsfaqS4xpK/en4YcVs714FKFQNUYRhrDh1HKi8NCV6mbHJsF5h4HLnkV Q+kvwOxd4oNAPaiBqdCi9xUAe1GtzAolA+mAvogBDmIw0YOkQ30+JytQyIMdCM6CB2bA j9uvPjq/jqtZw6Iv4pWuY/MO/G98DNP3ni88iQI0PsRI9ItqcuQt9PtgjmAj9g7I8HSH gw446dsL441z8eRgeUyiyteOxFHo76qNq6Yme/qDKAT4LTDCfmSNuNHH9En8HoXb77m+ TrQg== X-Gm-Message-State: APjAAAWwJNDs85hpUv8FnrpNkH1MiBn2xcIHJxorzbHhc9D8ITefAavl naQkvQO6P16mmh142BdO4do= X-Google-Smtp-Source: APXvYqxiSRtDEU5B9WFF29tEuwnCB/nA04RUYniYjhWKZMFFY/jbI/xP9lErBLWwWDAQAeAvvj8IXw== X-Received: by 2002:a17:90a:8990:: with SMTP id v16mr17195585pjn.131.1569689386390; Sat, 28 Sep 2019 09:49:46 -0700 (PDT) Received: from localhost.localdomain ([110.35.161.54]) by smtp.gmail.com with ESMTPSA id 30sm8663092pjk.25.2019.09.28.09.49.39 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 28 Sep 2019 09:49:45 -0700 (PDT) From: Taehee Yoo To: davem@davemloft.net, netdev@vger.kernel.org, linux-wireless@vger.kernel.org, jakub.kicinski@netronome.com, johannes@sipsolutions.net, 
j.vosburgh@gmail.com, vfalico@gmail.com, andy@greyhouse.net, jiri@resnulli.us, sd@queasysnail.net, roopa@cumulusnetworks.com, saeedm@mellanox.com, manishc@marvell.com, rahulv@marvell.com, kys@microsoft.com, haiyangz@microsoft.com, stephen@networkplumber.org, sashal@kernel.org, hare@suse.de, varun@chelsio.com, ubraun@linux.ibm.com, kgraul@linux.ibm.com, jay.vosburgh@canonical.com, schuffelen@google.com, bjorn@mork.no
Cc: ap420073@gmail.com
Subject: [PATCH net v4 05/12] team: use dynamic lockdep key instead of static key
Date: Sat, 28 Sep 2019 16:48:36 +0000
Message-Id: <20190928164843.31800-6-ap420073@gmail.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20190928164843.31800-1-ap420073@gmail.com>
References: <20190928164843.31800-1-ap420073@gmail.com>
Sender: netdev-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: netdev@vger.kernel.org

In the current code, all team devices share the same static lockdep key. Because team devices can be nested, this triggers an unnecessary lockdep warning.
Test commands:
ip link add team0 type team
for i in {1..7}
do
    let A=$i-1
    ip link add team$i type team
    ip link set team$i master team$A
done
ip link del team0

Splat looks like:
[ 32.862645] WARNING: possible recursive locking detected
[ 32.863304] 5.3.0+ #3 Not tainted
[ 32.863700] --------------------------------------------
[ 32.864358] ip/647 is trying to acquire lock:
[ 32.864968] ffff8880666a6ad8 (&dev_addr_list_lock_key/1){+...}, at: dev_uc_sync_multiple+0xfa/0x1a0
[ 32.866047]
[ 32.866047] but task is already holding lock:
[ 32.866744] ffff888067402558 (&dev_addr_list_lock_key/1){+...}, at: dev_uc_unsync+0x10c/0x1b0
[ 32.867774]
[ 32.867774] other info that might help us debug this:
[ 32.868513] Possible unsafe locking scenario:
[ 32.868513]
[ 32.869180] CPU0
[ 32.872973] ----
[ 32.876717] lock(&dev_addr_list_lock_key/1);
[ 32.877130] lock(&dev_addr_list_lock_key/1);
[ 32.877621]
[ 32.877621] *** DEADLOCK ***
[ 32.877621]
[ 32.878284] May be due to missing lock nesting notation
[ 32.878284]
[ 32.878999] 5 locks held by ip/647:
[ 32.879382] #0: ffffffff8fec7a30 (rtnl_mutex){+.+.}, at: rtnetlink_rcv_msg+0x466/0x8a0
[ 32.880110] #1: ffff888068d5e300 (&team->lock){+.+.}, at: team_uninit+0x3a/0x1a0 [team]
[ 32.880889] #2: ffff888068d5d978 (&dev_addr_list_lock_key){+...}, at: dev_uc_unsync+0x98/0x1b0
[ 32.881660] #3: ffff888067402558 (&dev_addr_list_lock_key/1){+...}, at: dev_uc_unsync+0x10c/0x1b0
[ 32.882451] #4: ffffffff8fb22780 (rcu_read_lock){....}, at: team_set_rx_mode+0x5/0x1d0 [team]
[ 32.883209]
[ 32.883209] stack backtrace:
[ 32.883605] CPU: 0 PID: 647 Comm: ip Not tainted 5.3.0+ #3
[ 32.884144] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
[ 32.884926] Call Trace:
[ 32.885151] dump_stack+0x7c/0xbb
[ 32.885460] __lock_acquire+0x26a9/0x3df0
[ 32.885964] ? register_lock_class+0x14d0/0x14d0
[ 32.886522] ? register_lock_class+0x14d0/0x14d0
[ 32.887114] lock_acquire+0x164/0x3b0
[ 32.887578] ?
dev_uc_sync_multiple+0xfa/0x1a0 [ 32.888130] _raw_spin_lock_nested+0x2e/0x60 [ 32.888725] ? dev_uc_sync_multiple+0xfa/0x1a0 [ 32.889264] dev_uc_sync_multiple+0xfa/0x1a0 [ 32.889779] team_set_rx_mode+0xa9/0x1d0 [team] [ 32.892841] dev_uc_unsync+0x151/0x1b0 [ ... ] Fixes: 3d249d4ca7d0 ("net: introduce ethernet teaming device") Signed-off-by: Taehee Yoo --- v1 -> v4 : - This patch is not changed drivers/net/team/team.c | 61 ++++++++++++++++++++++++++++++++++++++--- include/linux/if_team.h | 5 ++++ 2 files changed, 62 insertions(+), 4 deletions(-) diff --git a/drivers/net/team/team.c b/drivers/net/team/team.c index e8089def5a46..bfcd6ed57493 100644 --- a/drivers/net/team/team.c +++ b/drivers/net/team/team.c @@ -1607,6 +1607,34 @@ static const struct team_option team_options[] = { }, }; +static void team_dev_set_lockdep_one(struct net_device *dev, + struct netdev_queue *txq, + void *_unused) +{ + struct team *team = netdev_priv(dev); + + lockdep_set_class(&txq->_xmit_lock, &team->xmit_lock_key); +} + +static struct lock_class_key qdisc_tx_busylock_key; +static struct lock_class_key qdisc_running_key; + +static void team_dev_set_lockdep_class(struct net_device *dev) +{ + struct team *team = netdev_priv(dev); + + dev->qdisc_tx_busylock = &qdisc_tx_busylock_key; + dev->qdisc_running_key = &qdisc_running_key; + + lockdep_register_key(&team->team_lock_key); + __mutex_init(&team->lock, "team->team_lock_key", &team->team_lock_key); + + lockdep_register_key(&team->addr_lock_key); + lockdep_set_class(&dev->addr_list_lock, &team->addr_lock_key); + + lockdep_register_key(&team->xmit_lock_key); + netdev_for_each_tx_queue(dev, team_dev_set_lockdep_one, NULL); +} static int team_init(struct net_device *dev) { @@ -1615,7 +1643,6 @@ static int team_init(struct net_device *dev) int err; team->dev = dev; - mutex_init(&team->lock); team_set_no_mode(team); team->pcpu_stats = netdev_alloc_pcpu_stats(struct team_pcpu_stats); @@ -1642,7 +1669,7 @@ static int team_init(struct net_device *dev) 
goto err_options_register; netif_carrier_off(dev); - netdev_lockdep_set_classes(dev); + team_dev_set_lockdep_class(dev); return 0; @@ -1673,6 +1700,11 @@ static void team_uninit(struct net_device *dev) team_queue_override_fini(team); mutex_unlock(&team->lock); netdev_change_features(dev); + + lockdep_unregister_key(&team->team_lock_key); + lockdep_unregister_key(&team->addr_lock_key); + lockdep_unregister_key(&team->xmit_lock_key); + } static void team_destructor(struct net_device *dev) @@ -1967,6 +1999,23 @@ static int team_add_slave(struct net_device *dev, struct net_device *port_dev, return err; } +static void team_update_lock_key(struct net_device *dev) +{ + struct team *team = netdev_priv(dev); + + lockdep_unregister_key(&team->team_lock_key); + lockdep_unregister_key(&team->addr_lock_key); + lockdep_unregister_key(&team->xmit_lock_key); + + lockdep_register_key(&team->team_lock_key); + lockdep_register_key(&team->addr_lock_key); + lockdep_register_key(&team->xmit_lock_key); + + lockdep_set_class(&team->lock, &team->team_lock_key); + lockdep_set_class(&dev->addr_list_lock, &team->addr_lock_key); + netdev_for_each_tx_queue(dev, team_dev_set_lockdep_one, NULL); +} + static int team_del_slave(struct net_device *dev, struct net_device *port_dev) { struct team *team = netdev_priv(dev); @@ -1976,8 +2025,12 @@ static int team_del_slave(struct net_device *dev, struct net_device *port_dev) err = team_port_del(team, port_dev); mutex_unlock(&team->lock); - if (!err) - netdev_change_features(dev); + if (err) + return err; + + if (netif_is_team_master(port_dev)) + team_update_lock_key(port_dev); + netdev_change_features(dev); return err; } diff --git a/include/linux/if_team.h b/include/linux/if_team.h index 06faa066496f..9c97bb19ed34 100644 --- a/include/linux/if_team.h +++ b/include/linux/if_team.h @@ -223,6 +223,11 @@ struct team { atomic_t count_pending; struct delayed_work dw; } mcast_rejoin; + + struct lock_class_key team_lock_key; + struct lock_class_key 
xmit_lock_key; + struct lock_class_key addr_lock_key; + long mode_priv[TEAM_MODE_PRIV_LONGS]; }; From patchwork Sat Sep 28 16:48:37 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taehee Yoo X-Patchwork-Id: 1168866 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="a0c0VTeV"; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 46gZQZ0wmrz9sNF for ; Sun, 29 Sep 2019 02:49:58 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728728AbfI1Qt4 (ORCPT ); Sat, 28 Sep 2019 12:49:56 -0400 Received: from mail-pl1-f195.google.com ([209.85.214.195]:35624 "EHLO mail-pl1-f195.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725897AbfI1Qt4 (ORCPT ); Sat, 28 Sep 2019 12:49:56 -0400 Received: by mail-pl1-f195.google.com with SMTP id y10so2261411plp.2; Sat, 28 Sep 2019 09:49:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=fG2ag5Kn+p57OLQ5yCzXYuHH56rcNXzarRBtRO20Www=; b=a0c0VTeVoBWPTuzhkKeNHy9HgVzD3pZ7De/RAUduwZo7/7Nx6FPY7cIlOCI9wYKb9G e98oRuPFR6XSD4ikKZ0mYaDNEfBAGlkbZUddGbLks/Pi42o78+aBKsTxR+jGSfq2ftiy S0L9rDucCRN18a4UpjqoiAxcZWa7xKmfLaAeZZRoewlQ7qez9x0XAggmud/54xVhsdIM 5MXQIk98m+Hpargw14EcxvIIPHs036uLGOhVRQCASIYIaUkL24h7Z4bsnCrMx7NPbY/F 
GG88HchJAZNVY5w3pU0LHPAmDAQWArIaVziQ/K2KVEnv9vFm4BdnP0yQvsviNe8iOFTp g0Sg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=fG2ag5Kn+p57OLQ5yCzXYuHH56rcNXzarRBtRO20Www=; b=qgVCAGYI/2n5iTOm6E0obu0RKBKFwfMmqzr7Mw/KxapUyNi152k0fCPJLVMPcaCnZ0 +2uyl2VnhEyI8GPIh7u+C+y4rhm8De5nUZs0yQzX1CPZJob7gA+ZN/40HQS7OaLdM8Cm isa1uX05+4UEGody6zftFjVvLgmtpN2drYep3lXoZ0YpI4CrGOusl2qvM/EUqOdY9ecU V5igoZ5KbC63rUD90pNvZxp9Vn5AJuEd2Q/z87TSYHS4yXtldJy9AeD2vDYLhvJmyar9 nEArM7PPcYM/D4OF1iOgWx6y/DBjnE+X41pWr2OWe2T+pCWmRUSjJRzQhqjyQKo3CwH9 LWvg== X-Gm-Message-State: APjAAAUCQbORfoSRMYDkadDqiORTD3YXE5x+72Qxk+gwxT2V0aoX2Vs3 fxV7R9kqAeRKpXcenFaHu70= X-Google-Smtp-Source: APXvYqzuVMZwzShhM3z9qLTnZM+OtA9QZCVHUBVL/huTKpVVHxji6xMAqbeLoaSFU1rvBbLiZd81rw== X-Received: by 2002:a17:902:7485:: with SMTP id h5mr11382348pll.240.1569689394958; Sat, 28 Sep 2019 09:49:54 -0700 (PDT) Received: from localhost.localdomain ([110.35.161.54]) by smtp.gmail.com with ESMTPSA id 30sm8663092pjk.25.2019.09.28.09.49.47 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 28 Sep 2019 09:49:53 -0700 (PDT) From: Taehee Yoo To: davem@davemloft.net, netdev@vger.kernel.org, linux-wireless@vger.kernel.org, jakub.kicinski@netronome.com, johannes@sipsolutions.net, j.vosburgh@gmail.com, vfalico@gmail.com, andy@greyhouse.net, jiri@resnulli.us, sd@queasysnail.net, roopa@cumulusnetworks.com, saeedm@mellanox.com, manishc@marvell.com, rahulv@marvell.com, kys@microsoft.com, haiyangz@microsoft.com, stephen@networkplumber.org, sashal@kernel.org, hare@suse.de, varun@chelsio.com, ubraun@linux.ibm.com, kgraul@linux.ibm.com, jay.vosburgh@canonical.com, schuffelen@google.com, bjorn@mork.no Cc: ap420073@gmail.com Subject: [PATCH net v4 06/12] macsec: use dynamic lockdep key instead of subclass Date: Sat, 28 Sep 2019 16:48:37 +0000 Message-Id: <20190928164843.31800-7-ap420073@gmail.com> X-Mailer: git-send-email 
2.17.1
In-Reply-To: <20190928164843.31800-1-ap420073@gmail.com>
References: <20190928164843.31800-1-ap420073@gmail.com>
Sender: netdev-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: netdev@vger.kernel.org

All macsec devices share the same lockdep key, and the subclass is initialized from nest_level. But the actual nest_level value can change when a lower device is attached. At that point the subclass would need to be updated, but doing so appears to be unsafe. So this patch makes macsec use dynamic lockdep keys instead of the subclass.

Test commands:
ip link add bond0 type bond
ip link add dummy0 type dummy
ip link add macsec0 link bond0 type macsec
ip link add macsec1 link dummy0 type macsec
ip link set bond0 mtu 1000
ip link set macsec1 master bond0
ip link set bond0 up
ip link set macsec0 up
ip link set dummy0 up
ip link set macsec1 up

Splat looks like:
[ 29.758606] WARNING: possible recursive locking detected
[ 29.759626] 5.3.0+ #3 Not tainted
[ 29.760670] --------------------------------------------
[ 29.761385] ip/639 is trying to acquire lock:
[ 29.761938] ffff888067680298 (&macsec_netdev_addr_lock_key/1){+...}, at: dev_uc_sync_multiple+0xfa/0x1a0
[ 29.763073]
[ 29.763073] but task is already holding lock:
[ 29.763840] ffff888060148298 (&macsec_netdev_addr_lock_key/1){+...}, at: dev_set_rx_mode+0x19/0x30
[ 29.764931]
[ 29.764931] other info that might help us debug this:
[ 29.765721] Possible unsafe locking scenario:
[ 29.765721]
[ 29.766615] CPU0
[ 29.766914] ----
[ 29.767256] lock(&macsec_netdev_addr_lock_key/1);
[ 29.767847] lock(&macsec_netdev_addr_lock_key/1);
[ 29.768441]
[ 29.768441] *** DEADLOCK ***
[ 29.768441]
[ 29.769158] May be due to missing lock nesting notation
[ 29.769158]
[ 29.770083] 4 locks held by ip/639:
[ 29.770908] #0: ffffffff93ec7a30 (rtnl_mutex){+.+.}, at: rtnetlink_rcv_msg+0x466/0x8a0
[ 29.771970] #1: ffff888060148298 (&macsec_netdev_addr_lock_key/1){+...}, at: dev_set_rx_mode+0x19/0x30
[ 29.773216] #2: ffff888063e58298
(&dev_addr_list_lock_key/3){+...}, at: dev_mc_sync+0xfa/0x1a0 [ 29.774324] #3: ffffffff93b22780 (rcu_read_lock){....}, at: bond_set_rx_mode+0x5/0x3c0 [bonding] [ 29.775459] [ 29.775459] stack backtrace: [ 29.775986] CPU: 0 PID: 639 Comm: ip Not tainted 5.3.0+ #3 [ 29.776719] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 29.777707] Call Trace: [ 29.778012] dump_stack+0x7c/0xbb [ 29.778434] __lock_acquire+0x26a9/0x3df0 [ 29.778920] ? register_lock_class+0x14d0/0x14d0 [ 29.779537] lock_acquire+0x164/0x3b0 [ 29.779981] ? dev_uc_sync_multiple+0xfa/0x1a0 [ 29.780523] ? rcu_read_lock_held+0x90/0xa0 [ 29.781028] _raw_spin_lock_nested+0x2e/0x60 [ 29.781550] ? dev_uc_sync_multiple+0xfa/0x1a0 [ 29.782311] dev_uc_sync_multiple+0xfa/0x1a0 [ 29.782832] bond_set_rx_mode+0x269/0x3c0 [bonding] [ ... ] Fixes: e20038724552 ("macsec: fix lockdep splats when nesting devices") Signed-off-by: Taehee Yoo --- v1 -> v4 : - This patch is not changed drivers/net/macsec.c | 37 ++++++++++++++++++++++++++++++++----- 1 file changed, 32 insertions(+), 5 deletions(-) diff --git a/drivers/net/macsec.c b/drivers/net/macsec.c index cb7637364b40..c4a41b90c846 100644 --- a/drivers/net/macsec.c +++ b/drivers/net/macsec.c @@ -267,6 +267,8 @@ struct macsec_dev { struct pcpu_secy_stats __percpu *stats; struct list_head secys; struct gro_cells gro_cells; + struct lock_class_key xmit_lock_key; + struct lock_class_key addr_lock_key; unsigned int nest_level; }; @@ -2750,7 +2752,32 @@ static netdev_tx_t macsec_start_xmit(struct sk_buff *skb, #define MACSEC_FEATURES \ (NETIF_F_SG | NETIF_F_HIGHDMA | NETIF_F_FRAGLIST) -static struct lock_class_key macsec_netdev_addr_lock_key; + +static void macsec_dev_set_lockdep_one(struct net_device *dev, + struct netdev_queue *txq, + void *_unused) +{ + struct macsec_dev *macsec = macsec_priv(dev); + + lockdep_set_class(&txq->_xmit_lock, &macsec->xmit_lock_key); +} + +static struct lock_class_key qdisc_tx_busylock_key; +static struct 
lock_class_key qdisc_running_key; + +static void macsec_dev_set_lockdep_class(struct net_device *dev) +{ + struct macsec_dev *macsec = macsec_priv(dev); + + dev->qdisc_tx_busylock = &qdisc_tx_busylock_key; + dev->qdisc_running_key = &qdisc_running_key; + + lockdep_register_key(&macsec->addr_lock_key); + lockdep_set_class(&dev->addr_list_lock, &macsec->addr_lock_key); + + lockdep_register_key(&macsec->xmit_lock_key); + netdev_for_each_tx_queue(dev, macsec_dev_set_lockdep_one, NULL); +} static int macsec_dev_init(struct net_device *dev) { @@ -2781,6 +2808,7 @@ static int macsec_dev_init(struct net_device *dev) if (is_zero_ether_addr(dev->broadcast)) memcpy(dev->broadcast, real_dev->broadcast, dev->addr_len); + macsec_dev_set_lockdep_class(dev); return 0; } @@ -2790,6 +2818,9 @@ static void macsec_dev_uninit(struct net_device *dev) gro_cells_destroy(&macsec->gro_cells); free_percpu(dev->tstats); + + lockdep_unregister_key(&macsec->addr_lock_key); + lockdep_unregister_key(&macsec->xmit_lock_key); } static netdev_features_t macsec_fix_features(struct net_device *dev, @@ -3264,10 +3295,6 @@ static int macsec_newlink(struct net *net, struct net_device *dev, dev_hold(real_dev); macsec->nest_level = dev_get_nest_level(real_dev) + 1; - netdev_lockdep_set_classes(dev); - lockdep_set_class_and_subclass(&dev->addr_list_lock, - &macsec_netdev_addr_lock_key, - macsec_get_nest_level(dev)); err = netdev_upper_dev_link(real_dev, dev, extack); if (err < 0) From patchwork Sat Sep 28 16:48:38 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taehee Yoo X-Patchwork-Id: 1168867 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) 
Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="ZvbIh7kq"; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 46gZQj3fmLz9sNF for ; Sun, 29 Sep 2019 02:50:05 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728743AbfI1QuE (ORCPT ); Sat, 28 Sep 2019 12:50:04 -0400 Received: from mail-pl1-f194.google.com ([209.85.214.194]:34578 "EHLO mail-pl1-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725897AbfI1QuE (ORCPT ); Sat, 28 Sep 2019 12:50:04 -0400 Received: by mail-pl1-f194.google.com with SMTP id k7so2261581pll.1; Sat, 28 Sep 2019 09:50:03 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=ODpmrXLn5S687oEyUC8qGycaIYQf/xOsGOKK8D7XnDE=; b=ZvbIh7kqho+ncXgu8KAOkEvheJshficVaFZSWBd2oJgDy7VgPvqP7hdJOGbkB+amZb qi/vABGcqF+OXJ6e6xJAf0INFWMlFpw0rEKiFCvahTXVW+Mg326nPdHdO5NRle6L5kn1 l6bLT+9P+jxs3wHo2UrVe7aDq0sDJdfJqAkf7Rc4UufPMgzDx1mHSvoR20v3xeM2ME6m jo7KWpwbtJahkAW+cbh3y06RB3XbSYIed1KkLN5gyyKnCg5reyoETaOz17GQDIaeQx+p THcOnUpq+LMUlWKea1PH+IbvELZJ9CBNNfiyEIOfPYncGh6yfLGxmWOJ1kkMwzfvFB2B WxIQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=ODpmrXLn5S687oEyUC8qGycaIYQf/xOsGOKK8D7XnDE=; b=au/IDJHLc6L9OXjc3F7MI2S1c8e+6F5Czo2zF90uJsLpKVXg0Ao/f/xtN3ggxTnxQN EED93hG7q8G1IXxt++wy9uUL1RY3gSIXKVKkblo7BqXOxk6v7qD/N+sjeeWDcWxqpNGM P19n1WlxqHcBjHDOpZe26NybIdm0mgZWdRlOfVBS5nIjnouEAmLe1TCfg3Fx+Kae2DyQ CRbulqWB1efkvgT86Fb+ZdmWXOfZwwOxebkYIh0NACNQcpQhg1ZNLj8s4htPNib2DIpv J1AeuhVeef54/sljx/VB3Idj+g0LcfRPw55YgX9HFrlCiuzDlv7Bscm6CJXjUAjUqcG3 k3dw== X-Gm-Message-State: 
APjAAAVixHg24dLNyNzcD5Zz80c9ALoQIB9PNhl8yZ3tO7HaOURDkZ5k 8MGJDdipJZZnlAFIqjfwhyI= X-Google-Smtp-Source: APXvYqxRxG18735Jgxmd8lUJinh3+WBGjMU6ehvmfQKC28kY5TvDpX0Kc5vXDBzg2hbpoE+n7EXhFw== X-Received: by 2002:a17:902:868a:: with SMTP id g10mr11458268plo.235.1569689403184; Sat, 28 Sep 2019 09:50:03 -0700 (PDT) Received: from localhost.localdomain ([110.35.161.54]) by smtp.gmail.com with ESMTPSA id 30sm8663092pjk.25.2019.09.28.09.49.55 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 28 Sep 2019 09:50:02 -0700 (PDT) From: Taehee Yoo To: davem@davemloft.net, netdev@vger.kernel.org, linux-wireless@vger.kernel.org, jakub.kicinski@netronome.com, johannes@sipsolutions.net, j.vosburgh@gmail.com, vfalico@gmail.com, andy@greyhouse.net, jiri@resnulli.us, sd@queasysnail.net, roopa@cumulusnetworks.com, saeedm@mellanox.com, manishc@marvell.com, rahulv@marvell.com, kys@microsoft.com, haiyangz@microsoft.com, stephen@networkplumber.org, sashal@kernel.org, hare@suse.de, varun@chelsio.com, ubraun@linux.ibm.com, kgraul@linux.ibm.com, jay.vosburgh@canonical.com, schuffelen@google.com, bjorn@mork.no Cc: ap420073@gmail.com Subject: [PATCH net v4 07/12] macvlan: use dynamic lockdep key instead of subclass Date: Sat, 28 Sep 2019 16:48:38 +0000 Message-Id: <20190928164843.31800-8-ap420073@gmail.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190928164843.31800-1-ap420073@gmail.com> References: <20190928164843.31800-1-ap420073@gmail.com> Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org All macvlan devices share the same lockdep key, and the subclass is initialized with nest_level. But the actual nest_level value can change when a lower device is attached. At that moment the subclass should be updated, but doing so seems to be unsafe. So this patch makes macvlan use a dynamic lockdep key instead of the subclass. 
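A rough user-space sketch of why a shared key plus subclass can trip lockdep while per-device keys do not. This is not kernel code: every toy_* name is hypothetical, and the only lockdep behaviour modelled is the assumption that two held locks with the same (key, subclass) pair are reported as possible recursive locking.

```c
#include <stdbool.h>

/* Toy stand-in for struct lock_class_key: only its address matters. */
struct toy_key {
	char unused;
};

/* A lock class as this sketch models it: (key address, subclass). */
struct toy_class {
	const struct toy_key *key;
	unsigned int subclass;
};

/* True if taking 'inner' while holding 'outer' would be reported as
 * possible recursive locking, i.e. both locks map to the same class.
 */
static bool toy_recursion_reported(struct toy_class outer,
				   struct toy_class inner)
{
	return outer.key == inner.key && outer.subclass == inner.subclass;
}

/* One static key shared by every device (hypothetical). */
static struct toy_key toy_shared_key;

/* Two devices whose stored subclass ended up equal (both "/1"). */
static bool toy_shared_key_case(void)
{
	struct toy_class a = { &toy_shared_key, 1 };
	struct toy_class b = { &toy_shared_key, 1 };

	return toy_recursion_reported(a, b);
}

/* Each device carries its own key, so the classes never collide. */
static bool toy_per_device_key_case(void)
{
	struct toy_key key_a, key_b;
	struct toy_class a = { &key_a, 0 };
	struct toy_class b = { &key_b, 0 };

	return toy_recursion_reported(a, b);
}
```

In this model toy_shared_key_case() reports recursion and toy_per_device_key_case() does not, which is the property the per-device addr_lock_key and xmit_lock_key are meant to provide.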
Test commands: ip link add bond0 type bond ip link add dummy0 type dummy ip link add macvlan0 link bond0 type macvlan mode bridge ip link add macvlan1 link dummy0 type macvlan mode bridge ip link set bond0 mtu 1000 ip link set macvlan1 master bond0 ip link set bond0 up ip link set macvlan0 up ip link set dummy0 up ip link set macvlan1 up Splat looks like: [ 30.281866] WARNING: possible recursive locking detected [ 30.282374] 5.3.0+ #3 Not tainted [ 30.282673] -------------------------------------------- [ 30.283138] ip/643 is trying to acquire lock: [ 30.283522] ffff88806750c818 (&macvlan_netdev_addr_lock_key/1){+...}, at: dev_uc_sync_multiple+0xfa/0x1a0 [ 30.284363] [ 30.284363] but task is already holding lock: [ 30.284878] ffff88806853ead8 (&macvlan_netdev_addr_lock_key/1){+...}, at: dev_set_rx_mode+0x19/0x30 [ 30.285680] [ 30.285680] other info that might help us debug this: [ 30.286274] Possible unsafe locking scenario: [ 30.286274] [ 30.286903] CPU0 [ 30.287192] ---- [ 30.287475] lock(&macvlan_netdev_addr_lock_key/1); [ 30.288121] lock(&macvlan_netdev_addr_lock_key/1); [ 30.288818] [ 30.288818] *** DEADLOCK *** [ 30.288818] [ 30.294651] May be due to missing lock nesting notation [ 30.294651] [ 30.295660] 4 locks held by ip/643: [ 30.296076] #0: ffffffff93ec7a30 (rtnl_mutex){+.+.}, at: rtnetlink_rcv_msg+0x466/0x8a0 [ 30.297030] #1: ffff88806853ead8 (&macvlan_netdev_addr_lock_key/1){+...}, at: dev_set_rx_mode+0x19/0x30 [ 30.298749] #2: ffff888063b8a3f8 (&dev_addr_list_lock_key/3){+...}, at: dev_uc_sync+0xfa/0x1a0 [ 30.299727] #3: ffffffff93b22780 (rcu_read_lock){....}, at: bond_set_rx_mode+0x5/0x3c0 [bonding] [ 30.302803] [ 30.302803] stack backtrace: [ 30.303254] CPU: 1 PID: 643 Comm: ip Not tainted 5.3.0+ #3 [ 30.303907] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 [ 30.310458] Call Trace: [ 30.310694] dump_stack+0x7c/0xbb [ 30.311016] __lock_acquire+0x26a9/0x3df0 [ 30.311390] ? 
register_lock_class+0x14d0/0x14d0 [ 30.311815] lock_acquire+0x164/0x3b0 [ 30.312237] ? dev_uc_sync_multiple+0xfa/0x1a0 [ 30.312776] ? rcu_read_lock_held+0x90/0xa0 [ 30.313293] _raw_spin_lock_nested+0x2e/0x60 [ 30.313819] ? dev_uc_sync_multiple+0xfa/0x1a0 [ 30.314429] dev_uc_sync_multiple+0xfa/0x1a0 [ 30.314950] bond_set_rx_mode+0x269/0x3c0 [bonding] [ 30.315541] ? bond_init+0x6f0/0x6f0 [bonding] [ 30.316075] dev_uc_sync+0x15a/0x1a0 [ ... ] Fixes: c674ac30c549 ("macvlan: Fix lockdep warnings with stacked macvlan devices") Signed-off-by: Taehee Yoo --- v1 -> v4 : - This patch is not changed drivers/net/macvlan.c | 35 +++++++++++++++++++++++++++-------- include/linux/if_macvlan.h | 2 ++ 2 files changed, 29 insertions(+), 8 deletions(-) diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c index 940192c057b6..dae368a2e8d1 100644 --- a/drivers/net/macvlan.c +++ b/drivers/net/macvlan.c @@ -852,8 +852,6 @@ static int macvlan_do_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd) * "super class" of normal network devices; split their locks off into a * separate class since they always nest. 
*/ -static struct lock_class_key macvlan_netdev_addr_lock_key; - #define ALWAYS_ON_OFFLOADS \ (NETIF_F_SG | NETIF_F_HW_CSUM | NETIF_F_GSO_SOFTWARE | \ NETIF_F_GSO_ROBUST | NETIF_F_GSO_ENCAP_ALL) @@ -874,12 +872,30 @@ static int macvlan_get_nest_level(struct net_device *dev) return ((struct macvlan_dev *)netdev_priv(dev))->nest_level; } -static void macvlan_set_lockdep_class(struct net_device *dev) +static void macvlan_dev_set_lockdep_one(struct net_device *dev, + struct netdev_queue *txq, + void *_unused) +{ + struct macvlan_dev *macvlan = netdev_priv(dev); + + lockdep_set_class(&txq->_xmit_lock, &macvlan->xmit_lock_key); +} + +static struct lock_class_key qdisc_tx_busylock_key; +static struct lock_class_key qdisc_running_key; + +static void macvlan_dev_set_lockdep_class(struct net_device *dev) { - netdev_lockdep_set_classes(dev); - lockdep_set_class_and_subclass(&dev->addr_list_lock, - &macvlan_netdev_addr_lock_key, - macvlan_get_nest_level(dev)); + struct macvlan_dev *macvlan = netdev_priv(dev); + + dev->qdisc_tx_busylock = &qdisc_tx_busylock_key; + dev->qdisc_running_key = &qdisc_running_key; + + lockdep_register_key(&macvlan->addr_lock_key); + lockdep_set_class(&dev->addr_list_lock, &macvlan->addr_lock_key); + + lockdep_register_key(&macvlan->xmit_lock_key); + netdev_for_each_tx_queue(dev, macvlan_dev_set_lockdep_one, NULL); } static int macvlan_init(struct net_device *dev) @@ -900,7 +916,7 @@ static int macvlan_init(struct net_device *dev) dev->gso_max_segs = lowerdev->gso_max_segs; dev->hard_header_len = lowerdev->hard_header_len; - macvlan_set_lockdep_class(dev); + macvlan_dev_set_lockdep_class(dev); vlan->pcpu_stats = netdev_alloc_pcpu_stats(struct vlan_pcpu_stats); if (!vlan->pcpu_stats) @@ -922,6 +938,9 @@ static void macvlan_uninit(struct net_device *dev) port->count -= 1; if (!port->count) macvlan_port_destroy(port->dev); + + lockdep_unregister_key(&vlan->addr_lock_key); + lockdep_unregister_key(&vlan->xmit_lock_key); } static void 
macvlan_dev_get_stats64(struct net_device *dev, diff --git a/include/linux/if_macvlan.h b/include/linux/if_macvlan.h index 2e55e4cdbd8a..ea5b41823287 100644 --- a/include/linux/if_macvlan.h +++ b/include/linux/if_macvlan.h @@ -31,6 +31,8 @@ struct macvlan_dev { u16 flags; int nest_level; unsigned int macaddr_count; + struct lock_class_key xmit_lock_key; + struct lock_class_key addr_lock_key; #ifdef CONFIG_NET_POLL_CONTROLLER struct netpoll *netpoll; #endif From patchwork Sat Sep 28 16:48:39 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taehee Yoo X-Patchwork-Id: 1168868 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="t8kx5/Pv"; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 46gZQv3ZHkz9sNF for ; Sun, 29 Sep 2019 02:50:15 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728763AbfI1QuO (ORCPT ); Sat, 28 Sep 2019 12:50:14 -0400 Received: from mail-pl1-f195.google.com ([209.85.214.195]:39987 "EHLO mail-pl1-f195.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725897AbfI1QuN (ORCPT ); Sat, 28 Sep 2019 12:50:13 -0400 Received: by mail-pl1-f195.google.com with SMTP id d22so2252240pll.7; Sat, 28 Sep 2019 09:50:12 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references; 
bh=RELvOnZgMNXNkz/88nqNg0MOKI1kSF+INp6xZR5cu74=; b=t8kx5/PvLh3jymMGBL4U3hbsEhJPsqQa21iM4ekVjHxC3T63z1DIbCtpUMqxseF+zH CGm8xI8tDYfPWrcnasd1uqxIwB4A6QoYjyfwaeLWr/fWbLWHFiKQkxopJCtSrYMQRrtT HAkDjlR3VwYMFdddelX/V+OssUPOcNPgDgpkW3tM8khNr1kkUvxXHRDnwGFMavYEHyYK OATWy72zZzN0Dd6NFg0gHIS2X0/L9nMiO1xE2vWHC9q517PN+M89A2++NhKO0vCfD0xe 26+RhRoAPhLyPsqCf7yG74t8oHNAZ4oju8yiCq10KM23jwdbmm4/dZD5/p2sSLPhs5G7 bQgQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=RELvOnZgMNXNkz/88nqNg0MOKI1kSF+INp6xZR5cu74=; b=jQYCIolKjkL81gyWitnIa9OhSsrfW1TdrTdIQ/87DHFXTisaFdo76ELv6UXkJYRyKD hqXFgRRI/AzTWBtIZHqLPkGtqZFT6G8WhhyhdnnV/E/G1rN92NExwXDE86c6oAnpOd24 RKrGuUTNpesx2U+YNc+KzBTdJRcOc+wjszQHe/w1tZhG1jkwLtlTevH2bWxPX9JUxEK7 SCdh7by5ebSroXpzFcpTSiqPhGgA4a7nPKpBL5yQPibkmZP/w9LsYgWHdPRnW4ptifYY xEfdkxWcNW+F6YthalVDzrMlASla+DL/C40SyZr+q9jaMkvlm2q039roSr8TViyVrRKv fZLQ== X-Gm-Message-State: APjAAAVsmMAQLcWGjNFDb3SfWrAkUa30cFUdvOxGtcFnMdOQJfqsqg9j px81o5LOoGmPjtXddPlI/jWyvEdpfmr3ykb1 X-Google-Smtp-Source: APXvYqw75/3NjyxAlZpZdmmgMQtAIFzoDrIgXVVWV8JIXQl0aRnboJPFNwNHBM1p42I/7ij94Xyh4g== X-Received: by 2002:a17:902:788f:: with SMTP id q15mr3200979pll.321.1569689412383; Sat, 28 Sep 2019 09:50:12 -0700 (PDT) Received: from localhost.localdomain ([110.35.161.54]) by smtp.gmail.com with ESMTPSA id 30sm8663092pjk.25.2019.09.28.09.50.05 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 28 Sep 2019 09:50:11 -0700 (PDT) From: Taehee Yoo To: davem@davemloft.net, netdev@vger.kernel.org, linux-wireless@vger.kernel.org, jakub.kicinski@netronome.com, johannes@sipsolutions.net, j.vosburgh@gmail.com, vfalico@gmail.com, andy@greyhouse.net, jiri@resnulli.us, sd@queasysnail.net, roopa@cumulusnetworks.com, saeedm@mellanox.com, manishc@marvell.com, rahulv@marvell.com, kys@microsoft.com, haiyangz@microsoft.com, stephen@networkplumber.org, sashal@kernel.org, hare@suse.de, 
varun@chelsio.com, ubraun@linux.ibm.com, kgraul@linux.ibm.com, jay.vosburgh@canonical.com, schuffelen@google.com, bjorn@mork.no Cc: ap420073@gmail.com Subject: [PATCH net v4 08/12] macsec: fix refcnt leak in module exit routine Date: Sat, 28 Sep 2019 16:48:39 +0000 Message-Id: <20190928164843.31800-9-ap420073@gmail.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190928164843.31800-1-ap420073@gmail.com> References: <20190928164843.31800-1-ap420073@gmail.com> Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org When a macsec interface is created, it takes a refcnt on the lower device (real device). When the macsec interface is deleted, the refcnt is dropped in macsec_free_netdev(), which is the ->priv_destructor() of the macsec interface. The problem scenario is this: when nested macsec interfaces are exiting, the exit routine of the macsec module leaks a refcnt. Test commands: ip link add dummy0 type dummy ip link add macsec0 link dummy0 type macsec ip link add macsec1 link macsec0 type macsec modprobe -rv macsec [ 208.629433] unregister_netdevice: waiting for macsec0 to become free. Usage count = 1 The steps of the exit routine of the macsec module are below. 1. Calls ->dellink() in __rtnl_link_unregister(). 2. Checks refcnt and, if it is not 0, waits for it to become 0 in netdev_run_todo(). 3. Calls ->priv_destructor() in netdev_run_todo(). Step 2 checks refcnt, but step 3 decreases refcnt. So step 2 waits forever. This patch makes the macsec module not hold its own refcnt on the lower device, because it already holds a refcnt on the lower device through netdev_upper_dev_link(). 
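The ordering problem between step 2 and step 3 can be sketched with a small user-space model. It is only an illustration under simplifying assumptions: the toy_* names are hypothetical, the refcount tracks nothing but the adjacency link and the extra hold, and "waits forever" is modelled as returning false.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a net_device stacked on a lower device. */
struct toy_dev {
	int refcnt;		/* references held on this device */
	struct toy_dev *lower;	/* device this one is stacked on */
	bool extra_hold;	/* took an extra hold on ->lower at newlink */
};

/* Models newlink: the adjacency link always pins the lower device,
 * and the old driver behaviour pins it a second time.
 */
static void toy_newlink(struct toy_dev *upper, struct toy_dev *lower,
			bool extra_hold)
{
	upper->refcnt = 0;
	upper->lower = lower;
	upper->extra_hold = extra_hold;
	lower->refcnt++;		/* held by the adjacency link */
	if (extra_hold)
		lower->refcnt++;	/* the additional hold */
}

/* Models module exit for the devices in 'todo', in list order.
 * Returns false where step 2 would wait forever.
 */
static bool toy_module_exit(struct toy_dev **todo, size_t n)
{
	size_t i;

	/* Step 1: dellink for every device drops the adjacency reference. */
	for (i = 0; i < n; i++)
		todo[i]->lower->refcnt--;

	for (i = 0; i < n; i++) {
		/* Step 2: wait until nobody references the device. */
		if (todo[i]->refcnt != 0)
			return false;
		/* Step 3: the destructor drops the extra hold, too late
		 * for a lower device that was already waited on.
		 */
		if (todo[i]->extra_hold)
			todo[i]->lower->refcnt--;
	}
	return true;
}

/* dummy0 <- macsec0 <- macsec1, then remove the module. */
static bool toy_nested_exit_completes(bool extra_hold)
{
	struct toy_dev dummy0 = { 0, NULL, false };
	struct toy_dev macsec0, macsec1;
	struct toy_dev *todo[2] = { &macsec0, &macsec1 };

	toy_newlink(&macsec0, &dummy0, extra_hold);
	toy_newlink(&macsec1, &macsec0, extra_hold);
	return toy_module_exit(todo, 2);
}
```

With the extra hold, macsec0 still has one reference when step 2 runs for it, because the only thing that would drop it is step 3 of macsec1. Without the extra hold, step 1 alone releases everything and the exit completes.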
Fixes: c09440f7dcb3 ("macsec: introduce IEEE 802.1AE driver") Signed-off-by: Taehee Yoo --- v1 -> v4 : - This patch is not changed drivers/net/macsec.c | 4 ---- 1 file changed, 4 deletions(-) diff --git a/drivers/net/macsec.c b/drivers/net/macsec.c index c4a41b90c846..28972da4a0b3 100644 --- a/drivers/net/macsec.c +++ b/drivers/net/macsec.c @@ -3032,12 +3032,10 @@ static const struct nla_policy macsec_rtnl_policy[IFLA_MACSEC_MAX + 1] = { static void macsec_free_netdev(struct net_device *dev) { struct macsec_dev *macsec = macsec_priv(dev); - struct net_device *real_dev = macsec->real_dev; free_percpu(macsec->stats); free_percpu(macsec->secy.tx_sc.stats); - dev_put(real_dev); } static void macsec_setup(struct net_device *dev) @@ -3292,8 +3290,6 @@ static int macsec_newlink(struct net *net, struct net_device *dev, if (err < 0) return err; - dev_hold(real_dev); - macsec->nest_level = dev_get_nest_level(real_dev) + 1; err = netdev_upper_dev_link(real_dev, dev, extack); From patchwork Sat Sep 28 16:48:40 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taehee Yoo X-Patchwork-Id: 1168869 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="YAAiaj6H"; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 46gZR307BFz9sNF for ; Sun, 29 Sep 2019 02:50:23 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id 
S1728783AbfI1QuW (ORCPT ); Sat, 28 Sep 2019 12:50:22 -0400 Received: from mail-pg1-f196.google.com ([209.85.215.196]:46724 "EHLO mail-pg1-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725897AbfI1QuV (ORCPT ); Sat, 28 Sep 2019 12:50:21 -0400 Received: by mail-pg1-f196.google.com with SMTP id a3so5027873pgm.13; Sat, 28 Sep 2019 09:50:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=YIGe6kGHiMoc1Pv8mu+sUF30b2C6XCtTdnexzPuW3fM=; b=YAAiaj6HkFDusexvZ4pVY+knik2439+J4vyaW/HNquGrVBMMm+ZRQUhrQpaRyZQWzJ zsHsYuxhx/tv8feScAxEdU6C7e6hi1DUtPgIsNBQ4EfDuyYhzEQvDXysxSf3BXQeRAhM 4IMW+C2nTtAb99hEnymCEnkQ70oHmzaLavOZkAMTRW71U787DRno7uqB9Jbeg5v3Ciiw 8u5tNjcGXxBPcPQo7qBXkAxPPHSUuCnb15ak1mxZUY4uhh7rLomvvm4sA7Po3sGYTg29 BxNhbah+kVXWcpzohLYwTJsTzQlYScpQKEYMM+qON3EuoqjgeHilKExFF7C5st1SHKiK vFDg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=YIGe6kGHiMoc1Pv8mu+sUF30b2C6XCtTdnexzPuW3fM=; b=QMRvjM9/SkwwiSAyOVlRRb/0tHP5yBNvNFqqk9/XnBzu4gHl+ka1e/FctkN2SugM/j atgfx7HmTSJlbnSSSbsappauIRDbNesPsdmHpNacjuzSJGTVDneZ8cnYVtzR96mI2Asm +/sVjHNMhIoNt+xgliw7xlD6Wq+LNp4+VK16YcO5vgmFOdKiVsySNRsnzeS6jYhGEVhN TmUu76sFXmoV8GqY+qTBhOjEdTe/g/mpVXqARYlbtSVjFXM4BnHAFpScA37VLbJIIx0p JOuVwEXLTZYs1nQu4R1LhfCRBJQr/eg3rfaolD44GWrMJ/1K4FQttRSVUHEMIjWbg+6p bmMg== X-Gm-Message-State: APjAAAU9W9LMjCUGkKFtNSkxcAR1MkaZtapNswDSQz1b5Yxz2ahh5y9K 3C0Kq9rds6NoRfUVgyj1c56zqhf7Kb1WCPCM X-Google-Smtp-Source: APXvYqws01n8gMC81BC+XnnyQmtlvoO7/3VvQW0/grxt3f8+rFfJ7csJYQOQpquu5yAHGdtOXCkBcQ== X-Received: by 2002:a17:90a:2464:: with SMTP id h91mr17725096pje.9.1569689420728; Sat, 28 Sep 2019 09:50:20 -0700 (PDT) Received: from localhost.localdomain ([110.35.161.54]) by smtp.gmail.com with ESMTPSA id 30sm8663092pjk.25.2019.09.28.09.50.13 (version=TLS1_3 
cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 28 Sep 2019 09:50:19 -0700 (PDT) From: Taehee Yoo To: davem@davemloft.net, netdev@vger.kernel.org, linux-wireless@vger.kernel.org, jakub.kicinski@netronome.com, johannes@sipsolutions.net, j.vosburgh@gmail.com, vfalico@gmail.com, andy@greyhouse.net, jiri@resnulli.us, sd@queasysnail.net, roopa@cumulusnetworks.com, saeedm@mellanox.com, manishc@marvell.com, rahulv@marvell.com, kys@microsoft.com, haiyangz@microsoft.com, stephen@networkplumber.org, sashal@kernel.org, hare@suse.de, varun@chelsio.com, ubraun@linux.ibm.com, kgraul@linux.ibm.com, jay.vosburgh@canonical.com, schuffelen@google.com, bjorn@mork.no Cc: ap420073@gmail.com Subject: [PATCH net v4 09/12] net: core: add ignore flag to netdev_adjacent structure Date: Sat, 28 Sep 2019 16:48:40 +0000 Message-Id: <20190928164843.31800-10-ap420073@gmail.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190928164843.31800-1-ap420073@gmail.com> References: <20190928164843.31800-1-ap420073@gmail.com> Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org In order to link an adjacent node, netdev_upper_dev_link() is used, and in order to unlink an adjacent node, netdev_upper_dev_unlink() is used. The unlink operation does not fail, but the link operation can fail. In order to exchange adjacent nodes, we should unlink the old adjacent node first, then link the new adjacent node. If the link operation fails, we should link the old adjacent node again. But this link operation can fail too, which eventually breaks the adjacent link relationship. This patch adds an ignore flag to the netdev_adjacent structure. If this flag is set, netdev_upper_dev_link() ignores the old adjacent node for a moment. This patch also adds new functions for other modules. 
netdev_adjacent_change_prepare() netdev_adjacent_change_commit() netdev_adjacent_change_abort() netdev_adjacent_change_prepare() inserts the new device into the adjacent list, but the new device is not allowed to be used immediately. If netdev_adjacent_change_prepare() fails, it internally rolls back the adjacent list, so no other action is needed. netdev_adjacent_change_commit() deletes the old device from the adjacent list and allows the new device to be used. netdev_adjacent_change_abort() rolls back the adjacent list. Signed-off-by: Taehee Yoo --- v3 -> v4 : - Add missing static keyword in the dev.c - Expose netdev_adjacent_change_{prepare/commit/abort} instead of netdev_adjacent_dev_{enable/disable} v2 -> v3 : - Modify nesting infra code to use iterator instead of recursive v1 -> v2 : - This patch is not changed include/linux/netdevice.h | 10 ++ net/core/dev.c | 234 ++++++++++++++++++++++++++++++++++---- 2 files changed, 222 insertions(+), 22 deletions(-) diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index 613007aa5986..d1f99d4f41bb 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -4333,6 +4333,16 @@ int netdev_master_upper_dev_link(struct net_device *dev, struct netlink_ext_ack *extack); void netdev_upper_dev_unlink(struct net_device *dev, struct net_device *upper_dev); +int netdev_adjacent_change_prepare(struct net_device *old_dev, + struct net_device *new_dev, + struct net_device *dev, + struct netlink_ext_ack *extack); +void netdev_adjacent_change_commit(struct net_device *old_dev, + struct net_device *new_dev, + struct net_device *dev); +void netdev_adjacent_change_abort(struct net_device *old_dev, + struct net_device *new_dev, + struct net_device *dev); void netdev_adjacent_rename_links(struct net_device *dev, char *oldname); void *netdev_lower_dev_get_private(struct net_device *dev, struct net_device *lower_dev); diff --git a/net/core/dev.c b/net/core/dev.c index 13cb646fb98f..0b60bcd5033e 100644 --- a/net/core/dev.c +++ b/net/core/dev.c 
@@ -6490,6 +6490,9 @@ struct netdev_adjacent { /* upper master flag, there can only be one master device per list */ bool master; + /* lookup ignore flag */ + bool ignore; + /* counter for the number of times this device was added to us */ u16 ref_nr; @@ -6512,7 +6515,7 @@ static struct netdev_adjacent *__netdev_find_adj(struct net_device *adj_dev, return NULL; } -static int __netdev_has_upper_dev(struct net_device *upper_dev, void *data) +static int ____netdev_has_upper_dev(struct net_device *upper_dev, void *data) { struct net_device *dev = data; @@ -6533,7 +6536,7 @@ bool netdev_has_upper_dev(struct net_device *dev, { ASSERT_RTNL(); - return netdev_walk_all_upper_dev_rcu(dev, __netdev_has_upper_dev, + return netdev_walk_all_upper_dev_rcu(dev, ____netdev_has_upper_dev, upper_dev); } EXPORT_SYMBOL(netdev_has_upper_dev); @@ -6551,7 +6554,7 @@ EXPORT_SYMBOL(netdev_has_upper_dev); bool netdev_has_upper_dev_all_rcu(struct net_device *dev, struct net_device *upper_dev) { - return !!netdev_walk_all_upper_dev_rcu(dev, __netdev_has_upper_dev, + return !!netdev_walk_all_upper_dev_rcu(dev, ____netdev_has_upper_dev, upper_dev); } EXPORT_SYMBOL(netdev_has_upper_dev_all_rcu); @@ -6595,6 +6598,22 @@ struct net_device *netdev_master_upper_dev_get(struct net_device *dev) } EXPORT_SYMBOL(netdev_master_upper_dev_get); +static struct net_device *__netdev_master_upper_dev_get(struct net_device *dev) +{ + struct netdev_adjacent *upper; + + ASSERT_RTNL(); + + if (list_empty(&dev->adj_list.upper)) + return NULL; + + upper = list_first_entry(&dev->adj_list.upper, + struct netdev_adjacent, list); + if (likely(upper->master) && !upper->ignore) + return upper->dev; + return NULL; +} + /** * netdev_has_any_lower_dev - Check if device is linked to some device * @dev: device @@ -6645,8 +6664,9 @@ struct net_device *netdev_upper_get_next_dev_rcu(struct net_device *dev, } EXPORT_SYMBOL(netdev_upper_get_next_dev_rcu); -static struct net_device *netdev_next_upper_dev(struct net_device *dev, - 
struct list_head **iter) +static struct net_device *__netdev_next_upper_dev(struct net_device *dev, + struct list_head **iter, + bool *ignore) { struct netdev_adjacent *upper; @@ -6656,6 +6676,7 @@ static struct net_device *netdev_next_upper_dev(struct net_device *dev, return NULL; *iter = &upper->list; + *ignore = upper->ignore; return upper->dev; } @@ -6677,14 +6698,15 @@ static struct net_device *netdev_next_upper_dev_rcu(struct net_device *dev, return upper->dev; } -int netdev_walk_all_upper_dev(struct net_device *dev, - int (*fn)(struct net_device *dev, - void *data), - void *data) +static int __netdev_walk_all_upper_dev(struct net_device *dev, + int (*fn)(struct net_device *dev, + void *data), + void *data) { struct net_device *udev, *next, *now, *dev_stack[MAX_NEST_DEV + 1]; struct list_head *niter, *iter, *iter_stack[MAX_NEST_DEV + 1]; int ret, cur = 0; + bool ignore; now = dev; iter = &dev->adj_list.upper; @@ -6698,9 +6720,11 @@ int netdev_walk_all_upper_dev(struct net_device *dev, next = NULL; while (1) { - udev = netdev_next_upper_dev(now, &iter); + udev = __netdev_next_upper_dev(now, &iter, &ignore); if (!udev) break; + if (ignore) + continue; if (!next) { next = udev; @@ -6777,6 +6801,15 @@ int netdev_walk_all_upper_dev_rcu(struct net_device *dev, } EXPORT_SYMBOL_GPL(netdev_walk_all_upper_dev_rcu); +static bool __netdev_has_upper_dev(struct net_device *dev, + struct net_device *upper_dev) +{ + ASSERT_RTNL(); + + return __netdev_walk_all_upper_dev(dev, ____netdev_has_upper_dev, + upper_dev); +} + /** * netdev_lower_get_next_private - Get the next ->private from the * lower neighbour list @@ -6873,6 +6906,23 @@ static struct net_device *netdev_next_lower_dev(struct net_device *dev, return lower->dev; } +static struct net_device *__netdev_next_lower_dev(struct net_device *dev, + struct list_head **iter, + bool *ignore) +{ + struct netdev_adjacent *lower; + + lower = list_entry((*iter)->next, struct netdev_adjacent, list); + + if (&lower->list == 
&dev->adj_list.lower) + return NULL; + + *iter = &lower->list; + *ignore = lower->ignore; + + return lower->dev; +} + int netdev_walk_all_lower_dev(struct net_device *dev, int (*fn)(struct net_device *dev, void *data), @@ -6923,6 +6973,58 @@ int netdev_walk_all_lower_dev(struct net_device *dev, } EXPORT_SYMBOL_GPL(netdev_walk_all_lower_dev); +static int __netdev_walk_all_lower_dev(struct net_device *dev, + int (*fn)(struct net_device *dev, + void *data), + void *data) +{ + struct net_device *ldev, *next, *now, *dev_stack[MAX_NEST_DEV + 1]; + struct list_head *niter, *iter, *iter_stack[MAX_NEST_DEV + 1]; + int ret, cur = 0; + bool ignore; + + now = dev; + iter = &dev->adj_list.lower; + + while (1) { + if (now != dev) { + ret = fn(now, data); + if (ret) + return ret; + } + + next = NULL; + while (1) { + ldev = __netdev_next_lower_dev(now, &iter, &ignore); + if (!ldev) + break; + if (ignore) + continue; + + if (!next) { + next = ldev; + niter = &ldev->adj_list.lower; + } else { + dev_stack[cur] = ldev; + iter_stack[cur++] = &ldev->adj_list.lower; + break; + } + } + + if (!next) { + if (!cur) + return 0; + next = dev_stack[--cur]; + niter = iter_stack[cur]; + } + + now = next; + iter = niter; + } + + return 0; +} + static struct net_device *netdev_next_lower_dev_rcu(struct net_device *dev, struct list_head **iter) { @@ -6942,11 +7044,14 @@ static u8 __netdev_upper_depth(struct net_device *dev) struct net_device *udev; struct list_head *iter; u8 max_depth = 0; + bool ignore; for (iter = &dev->adj_list.upper, - udev = netdev_next_upper_dev(dev, &iter); + udev = __netdev_next_upper_dev(dev, &iter, &ignore); udev; - udev = netdev_next_upper_dev(dev, &iter)) { + udev = __netdev_next_upper_dev(dev, &iter, &ignore)) { + if (ignore) + continue; if (max_depth < udev->upper_level) max_depth = udev->upper_level; } @@ -6959,11 +7064,14 @@ static u8 __netdev_lower_depth(struct net_device *dev) struct net_device *ldev; struct list_head *iter; u8 max_depth = 0; + bool ignore; for 
(iter = &dev->adj_list.lower, - ldev = netdev_next_lower_dev(dev, &iter); + ldev = __netdev_next_lower_dev(dev, &iter, &ignore); ldev; - ldev = netdev_next_lower_dev(dev, &iter)) { + ldev = __netdev_next_lower_dev(dev, &iter, &ignore)) { + if (ignore) + continue; if (max_depth < ldev->lower_level) max_depth = ldev->lower_level; } @@ -7131,6 +7239,7 @@ static int __netdev_adjacent_dev_insert(struct net_device *dev, adj->master = master; adj->ref_nr = 1; adj->private = private; + adj->ignore = false; dev_hold(adj_dev); pr_debug("Insert adjacency: dev %s adj_dev %s adj->ref_nr %d; dev_hold on %s\n", @@ -7281,17 +7390,17 @@ static int __netdev_upper_dev_link(struct net_device *dev, return -EBUSY; /* To prevent loops, check if dev is not upper device to upper_dev. */ - if (netdev_has_upper_dev(upper_dev, dev)) + if (__netdev_has_upper_dev(upper_dev, dev)) return -EBUSY; if ((dev->lower_level + upper_dev->upper_level) > MAX_NEST_DEV) return -EMLINK; if (!master) { - if (netdev_has_upper_dev(dev, upper_dev)) + if (__netdev_has_upper_dev(dev, upper_dev)) return -EEXIST; } else { - master_dev = netdev_master_upper_dev_get(dev); + master_dev = __netdev_master_upper_dev_get(dev); if (master_dev) return master_dev == upper_dev ? 
-EEXIST : -EBUSY; } @@ -7314,11 +7423,11 @@ static int __netdev_upper_dev_link(struct net_device *dev, goto rollback; __netdev_update_upper_level(dev, NULL); - netdev_walk_all_lower_dev(dev, __netdev_update_upper_level, NULL); + __netdev_walk_all_lower_dev(dev, __netdev_update_upper_level, NULL); __netdev_update_lower_level(upper_dev, NULL); - netdev_walk_all_upper_dev(upper_dev, __netdev_update_lower_level, NULL); - + __netdev_walk_all_upper_dev(upper_dev, __netdev_update_lower_level, + NULL); return 0; rollback: @@ -7403,13 +7512,94 @@ void netdev_upper_dev_unlink(struct net_device *dev, &changeupper_info.info); __netdev_update_upper_level(dev, NULL); - netdev_walk_all_lower_dev(dev, __netdev_update_upper_level, NULL); + __netdev_walk_all_lower_dev(dev, __netdev_update_upper_level, NULL); __netdev_update_lower_level(upper_dev, NULL); - netdev_walk_all_upper_dev(upper_dev, __netdev_update_lower_level, NULL); + __netdev_walk_all_upper_dev(upper_dev, __netdev_update_lower_level, + NULL); } EXPORT_SYMBOL(netdev_upper_dev_unlink); +static void __netdev_adjacent_dev_set(struct net_device *upper_dev, + struct net_device *lower_dev, + bool val) +{ + struct netdev_adjacent *adj; + + adj = __netdev_find_adj(lower_dev, &upper_dev->adj_list.lower); + if (adj) + adj->ignore = val; + + adj = __netdev_find_adj(upper_dev, &lower_dev->adj_list.upper); + if (adj) + adj->ignore = val; +} + +static void netdev_adjacent_dev_disable(struct net_device *upper_dev, + struct net_device *lower_dev) +{ + __netdev_adjacent_dev_set(upper_dev, lower_dev, true); +} + +static void netdev_adjacent_dev_enable(struct net_device *upper_dev, + struct net_device *lower_dev) +{ + __netdev_adjacent_dev_set(upper_dev, lower_dev, false); +} + +int netdev_adjacent_change_prepare(struct net_device *old_dev, + struct net_device *new_dev, + struct net_device *dev, + struct netlink_ext_ack *extack) +{ + int err; + + if (!new_dev) + return 0; + + if (old_dev && new_dev != old_dev) + 
netdev_adjacent_dev_disable(dev, old_dev); + + err = netdev_upper_dev_link(new_dev, dev, extack); + if (err) { + if (old_dev && new_dev != old_dev) + netdev_adjacent_dev_enable(dev, old_dev); + return err; + } + + return 0; +} +EXPORT_SYMBOL(netdev_adjacent_change_prepare); + +void netdev_adjacent_change_commit(struct net_device *old_dev, + struct net_device *new_dev, + struct net_device *dev) +{ + if (!new_dev || !old_dev) + return; + + if (new_dev == old_dev) + return; + + netdev_adjacent_dev_enable(dev, old_dev); + netdev_upper_dev_unlink(old_dev, dev); +} +EXPORT_SYMBOL(netdev_adjacent_change_commit); + +void netdev_adjacent_change_abort(struct net_device *old_dev, + struct net_device *new_dev, + struct net_device *dev) +{ + if (!new_dev) + return; + + if (old_dev && new_dev != old_dev) + netdev_adjacent_dev_enable(dev, old_dev); + + netdev_upper_dev_unlink(new_dev, dev); +} +EXPORT_SYMBOL(netdev_adjacent_change_abort); + /** * netdev_bonding_info_change - Dispatch event about slave change * @dev: device From patchwork Sat Sep 28 16:48:41 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taehee Yoo X-Patchwork-Id: 1168870 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="f22UrAi7"; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 46gZRG3h7cz9sNF for ; Sun, 29 Sep 2019 02:50:34 +1000 (AEST) Received: 
(majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728794AbfI1Qud (ORCPT ); Sat, 28 Sep 2019 12:50:33 -0400 Received: from mail-pf1-f193.google.com ([209.85.210.193]:43416 "EHLO mail-pf1-f193.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725897AbfI1Qud (ORCPT ); Sat, 28 Sep 2019 12:50:33 -0400 Received: by mail-pf1-f193.google.com with SMTP id a2so3218908pfo.10; Sat, 28 Sep 2019 09:50:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=MN6+dK8M4Q+jB3mNh8/eTLkmi5cM1iythP1/K5SkMhU=; b=f22UrAi7nK84VrQkGirbDgyejVB8sbDj1+jafqhaVJF36edVx+SV0+hxqdgzlYRYSE UEyc+kaG7FVACKES3EfiVzji1IYYQwfUd5Atk8tjmoJrMsRf7q2Vzel29oG1vi9pTWiP 48iRXw3ffNaRhXpaLoZQDBspgdfUnjeX8Ue/o1jTwPzTLqPw4zRy7I42QiB/dxPbtPhw nbOqnp8uSMGWf6Ry2Z76/PatuBoQGQGWxP9jlrbm7O+aX1C4JEdb3glpVm9L3YA3zWZe NPFg0AOltnVdL3nVIHby1/xiLr5GwgRfvcmFy/shZddbUMnlz91LelI13QKTa3IDXvNv tYbw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=MN6+dK8M4Q+jB3mNh8/eTLkmi5cM1iythP1/K5SkMhU=; b=YjOH9WMO/nPZXyftQWdAYdWHjkIwN5O+xjmZDbvtXAYtRXj+AStcmeU59AWL86smBN 6Z2WB6Opt+mC1GwHwZqCnFnkotYVkslcOUOa7tB7woqxNGBDmtIHk4VSh7Wgw03neoKp EI6gpiSY90Wt7d50bnHcORiZa5oXGd4qf7lw0pjhy20aIHlQgJfHlIEkDxxRiWHDPlaQ HRxR9bqTl3OWhkh/emqjge7A+t3+85DHTqQy3Su7YYEFh1isB/ft1xo6sjNaMN8JWSiI 4VVVFVIOcbK1xIoVRx4vmkemLFcMUdvnZSYxjv1BZeVRpdrViJVy17iD6NTJ/dXxDdz6 cz1g== X-Gm-Message-State: APjAAAWAVY9s7nWV+AxxvoJQBjJyoPUT8kePPapFkK7QfdRIq4PPlhbB 117InH/HoWrTwrEilFz5iBw= X-Google-Smtp-Source: APXvYqykDCLpaZRApGUZrXeJbijKNWCt+VLmFuUfJvYjXwab5xq7w41moohw+uwRU9V4HkH9P0/hrw== X-Received: by 2002:a17:90a:e382:: with SMTP id b2mr16936513pjz.94.1569689432448; Sat, 28 Sep 2019 09:50:32 -0700 (PDT) Received: from localhost.localdomain ([110.35.161.54]) by smtp.gmail.com with ESMTPSA id 
30sm8663092pjk.25.2019.09.28.09.50.25 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 28 Sep 2019 09:50:31 -0700 (PDT)
From: Taehee Yoo
To: davem@davemloft.net, netdev@vger.kernel.org, linux-wireless@vger.kernel.org, jakub.kicinski@netronome.com, johannes@sipsolutions.net, j.vosburgh@gmail.com, vfalico@gmail.com, andy@greyhouse.net, jiri@resnulli.us, sd@queasysnail.net, roopa@cumulusnetworks.com, saeedm@mellanox.com, manishc@marvell.com, rahulv@marvell.com, kys@microsoft.com, haiyangz@microsoft.com, stephen@networkplumber.org, sashal@kernel.org, hare@suse.de, varun@chelsio.com, ubraun@linux.ibm.com, kgraul@linux.ibm.com, jay.vosburgh@canonical.com, schuffelen@google.com, bjorn@mork.no
Cc: ap420073@gmail.com
Subject: [PATCH net v4 10/12] vxlan: add adjacent link to limit depth level
Date: Sat, 28 Sep 2019 16:48:41 +0000
Message-Id: <20190928164843.31800-11-ap420073@gmail.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20190928164843.31800-1-ap420073@gmail.com>
References: <20190928164843.31800-1-ap420073@gmail.com>
Sender: netdev-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: netdev@vger.kernel.org

The current vxlan code doesn't limit the number of nested devices.
Nested devices are handled recursively, and this routine needs a large
amount of stack memory, so an unlimited nesting depth can overflow the
stack. To fix this issue, this patch adds adjacent links. The adjacent
link APIs check the depth level internally.

Test commands:
    ip link add dummy0 type dummy
    ip link add vxlan0 type vxlan id 0 group 239.1.1.1 dev dummy0 \
        dstport 4789
    for i in {1..100}
    do
        let A=$i-1
        ip link add vxlan$i type vxlan id $i group 239.1.1.1 \
            dev vxlan$A dstport 4789
    done
    ip link del dummy0

The top upper link is vxlan100 and the lowest vxlan link is vxlan0.
When vxlan0 is deleted (here, because its lower device dummy0 is
removed), the upper devices are deleted recursively. This needs a large
amount of stack memory and overflows the stack.
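The depth check described above can be modelled in userspace roughly as
follows. This is only a sketch under stated assumptions: struct toy_dev,
toy_dev_init(), toy_link() and the limit TOY_MAX_NEST are hypothetical
names invented for illustration, each device has at most one lower and
one upper link, and the code is not the kernel implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_MAX_NEST 8	/* hypothetical limit, chosen for illustration */

/* Toy stand-in for a net device in a linear stack (assumption: at most
 * one lower and one upper neighbour per device).
 */
struct toy_dev {
	struct toy_dev *lower;	/* device this one is stacked on, or NULL */
	struct toy_dev *upper;	/* device stacked on this one, or NULL */
	int lower_level;	/* devices from here down, this one included */
	int upper_level;	/* devices from here up, this one included */
};

static void toy_dev_init(struct toy_dev *dev)
{
	dev->lower = NULL;
	dev->upper = NULL;
	dev->lower_level = 1;
	dev->upper_level = 1;
}

/* Stack @upper on top of @lower. The depth is checked before anything is
 * linked, so a chain that is too deep is never built.
 */
static int toy_link(struct toy_dev *lower, struct toy_dev *upper)
{
	struct toy_dev *d;

	assert(lower != NULL && upper != NULL);

	if (lower->upper || upper->lower)
		return -EBUSY;

	/* The combined chain would contain this many devices. */
	if (lower->lower_level + upper->upper_level > TOY_MAX_NEST)
		return -EMLINK;

	lower->upper = upper;
	upper->lower = lower;

	/* Refresh the cached depths with loops, not recursion. */
	for (d = upper; d; d = d->upper)
		d->lower_level = d->lower->lower_level + 1;
	for (d = lower; d; d = d->lower)
		d->upper_level = d->upper->upper_level + 1;

	return 0;
}
```

With this model, stacking a ninth device on a chain of eight is refused
at link time, so a later teardown never has to walk a chain deeper than
the limit.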
Splat looks like: [ 229.628477] ============================================================================= [ 229.629785] BUG page->ptl (Not tainted): Padding overwritten. 0x0000000026abf214-0x0000000091f6abb2 [ 229.629785] ----------------------------------------------------------------------------- [ 229.629785] [ 229.655439] ================================================================== [ 229.629785] INFO: Slab 0x00000000ff7cfda8 objects=19 used=19 fp=0x00000000fe33776c flags=0x200000000010200 [ 229.655688] BUG: KASAN: stack-out-of-bounds in unmap_single_vma+0x25a/0x2e0 [ 229.655688] Read of size 8 at addr ffff888113076928 by task vlan-network-in/2334 [ 229.655688] [ 229.629785] Padding 0000000026abf214: 00 80 14 0d 81 88 ff ff 68 91 81 14 81 88 ff ff ........h....... [ 229.629785] Padding 0000000001e24790: 38 91 81 14 81 88 ff ff 68 91 81 14 81 88 ff ff 8.......h....... [ 229.629785] Padding 00000000b39397c8: 33 30 62 a7 ff ff ff ff ff eb 60 22 10 f1 ff 1f 30b.......`".... [ 229.629785] Padding 00000000bc98f53a: 80 60 07 13 81 88 ff ff 00 80 14 0d 81 88 ff ff .`.............. [ 229.629785] Padding 000000002aa8123d: 68 91 81 14 81 88 ff ff f7 21 17 a7 ff ff ff ff h........!...... [ 229.629785] Padding 000000001c8c2369: 08 81 14 0d 81 88 ff ff 03 02 00 00 00 00 00 00 ................ [ 229.629785] Padding 000000004e290c5d: 21 90 a2 21 10 ed ff ff 00 00 00 00 00 fc ff df !..!............ [ 229.629785] Padding 000000000e25d731: 18 60 07 13 81 88 ff ff c0 8b 13 05 81 88 ff ff .`.............. [ 229.629785] Padding 000000007adc7ab3: b3 8a b5 41 00 00 00 00 ...A.... [ 229.629785] FIX page->ptl: Restoring 0x0000000026abf214-0x0000000091f6abb2=0x5a [ ... 
] Fixes: acaf4e70997f ("net: vxlan: when lower dev unregisters remove vxlan dev as well") Signed-off-by: Taehee Yoo --- v3 -> v4 : - Fix wrong usage netdev_upper_dev_link() in the vxlan.c - Preserve reverse christmas tree variable ordering in the vxlan.c v1 -> v3 : - This patch is not changed drivers/net/vxlan.c | 52 ++++++++++++++++++++++++++++++++++++--------- include/net/vxlan.h | 1 + 2 files changed, 43 insertions(+), 10 deletions(-) diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c index 3d9bcc957f7d..5537998d6137 100644 --- a/drivers/net/vxlan.c +++ b/drivers/net/vxlan.c @@ -3566,10 +3566,13 @@ static int __vxlan_dev_create(struct net *net, struct net_device *dev, { struct vxlan_net *vn = net_generic(net, vxlan_net_id); struct vxlan_dev *vxlan = netdev_priv(dev); + struct net_device *remote_dev = NULL; struct vxlan_fdb *f = NULL; bool unregister = false; + struct vxlan_rdst *dst; int err; + dst = &vxlan->default_dst; err = vxlan_dev_configure(net, dev, conf, false, extack); if (err) return err; @@ -3577,14 +3580,14 @@ static int __vxlan_dev_create(struct net *net, struct net_device *dev, dev->ethtool_ops = &vxlan_ethtool_ops; /* create an fdb entry for a valid default destination */ - if (!vxlan_addr_any(&vxlan->default_dst.remote_ip)) { + if (!vxlan_addr_any(&dst->remote_ip)) { err = vxlan_fdb_create(vxlan, all_zeros_mac, - &vxlan->default_dst.remote_ip, + &dst->remote_ip, NUD_REACHABLE | NUD_PERMANENT, vxlan->cfg.dst_port, - vxlan->default_dst.remote_vni, - vxlan->default_dst.remote_vni, - vxlan->default_dst.remote_ifindex, + dst->remote_vni, + dst->remote_vni, + dst->remote_ifindex, NTF_SELF, &f); if (err) return err; @@ -3595,26 +3598,41 @@ static int __vxlan_dev_create(struct net *net, struct net_device *dev, goto errout; unregister = true; + if (dst->remote_ifindex) { + remote_dev = __dev_get_by_index(net, dst->remote_ifindex); + if (!remote_dev) + goto errout; + + err = netdev_upper_dev_link(remote_dev, dev, extack); + if (err) + goto errout; + } 
+ err = rtnl_configure_link(dev, NULL); if (err) - goto errout; + goto unlink; if (f) { - vxlan_fdb_insert(vxlan, all_zeros_mac, - vxlan->default_dst.remote_vni, f); + vxlan_fdb_insert(vxlan, all_zeros_mac, dst->remote_vni, f); /* notify default fdb entry */ err = vxlan_fdb_notify(vxlan, f, first_remote_rtnl(f), RTM_NEWNEIGH, true, extack); if (err) { vxlan_fdb_destroy(vxlan, f, false, false); + if (remote_dev) + netdev_upper_dev_unlink(remote_dev, dev); goto unregister; } } list_add(&vxlan->next, &vn->vxlan_list); + if (remote_dev) + dst->remote_dev = remote_dev; return 0; - +unlink: + if (remote_dev) + netdev_upper_dev_unlink(remote_dev, dev); errout: /* unregister_netdevice() destroys the default FDB entry with deletion * notification. But the addition notification was not sent yet, so @@ -3932,11 +3950,12 @@ static int vxlan_changelink(struct net_device *dev, struct nlattr *tb[], struct netlink_ext_ack *extack) { struct vxlan_dev *vxlan = netdev_priv(dev); - struct vxlan_rdst *dst = &vxlan->default_dst; struct net_device *lowerdev; struct vxlan_config conf; + struct vxlan_rdst *dst; int err; + dst = &vxlan->default_dst; err = vxlan_nl2conf(tb, data, dev, &conf, true, extack); if (err) return err; @@ -3946,6 +3965,11 @@ static int vxlan_changelink(struct net_device *dev, struct nlattr *tb[], if (err) return err; + err = netdev_adjacent_change_prepare(dst->remote_dev, lowerdev, dev, + extack); + if (err) + return err; + /* handle default dst entry */ if (!vxlan_addr_equal(&conf.remote_ip, &dst->remote_ip)) { u32 hash_index = fdb_head_index(vxlan, all_zeros_mac, conf.vni); @@ -3962,6 +3986,8 @@ static int vxlan_changelink(struct net_device *dev, struct nlattr *tb[], NTF_SELF, true, extack); if (err) { spin_unlock_bh(&vxlan->hash_lock[hash_index]); + netdev_adjacent_change_abort(dst->remote_dev, + lowerdev, dev); return err; } } @@ -3979,6 +4005,10 @@ static int vxlan_changelink(struct net_device *dev, struct nlattr *tb[], if (conf.age_interval != 
vxlan->cfg.age_interval) mod_timer(&vxlan->age_timer, jiffies); + netdev_adjacent_change_commit(dst->remote_dev, lowerdev, dev); + if (lowerdev && lowerdev != dst->remote_dev) + dst->remote_dev = lowerdev; + vxlan_config_apply(dev, &conf, lowerdev, vxlan->net, true); return 0; } @@ -3991,6 +4021,8 @@ static void vxlan_dellink(struct net_device *dev, struct list_head *head) list_del(&vxlan->next); unregister_netdevice_queue(dev, head); + if (vxlan->default_dst.remote_dev) + netdev_upper_dev_unlink(vxlan->default_dst.remote_dev, dev); } static size_t vxlan_get_size(const struct net_device *dev) diff --git a/include/net/vxlan.h b/include/net/vxlan.h index 335283dbe9b3..373aadcfea21 100644 --- a/include/net/vxlan.h +++ b/include/net/vxlan.h @@ -197,6 +197,7 @@ struct vxlan_rdst { u8 offloaded:1; __be32 remote_vni; u32 remote_ifindex; + struct net_device *remote_dev; struct list_head list; struct rcu_head rcu; struct dst_cache dst_cache; From patchwork Sat Sep 28 16:48:42 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taehee Yoo X-Patchwork-Id: 1168871 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="l/xoKsVN"; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 46gZRS05JFz9sNF for ; Sun, 29 Sep 2019 02:50:44 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728812AbfI1Qun (ORCPT ); Sat, 28 
Sep 2019 12:50:43 -0400 Received: from mail-pg1-f194.google.com ([209.85.215.194]:43719 "EHLO mail-pg1-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725897AbfI1Qum (ORCPT ); Sat, 28 Sep 2019 12:50:42 -0400 Received: by mail-pg1-f194.google.com with SMTP id v27so5044519pgk.10; Sat, 28 Sep 2019 09:50:41 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=7qGISXjSTLzKjSd2o0snCR5OwQ/OBfqFLQKNtUD8Qr0=; b=l/xoKsVNRBnpohs/sGEgYe5qCJbPe1eiFVSvR6G2ftHjSK1h5p98MN9cEwVZFnh1G8 8hr61f5mx73klu9lsIPcH/t+1IaErtZxKkaOWEoB7mxSCQVLhiS1CkMpB7ZAa5a+h0Hs +S5Y0JPByq0F2f5cxm87JUPOJkdo4D8JeoekbfWQtX+SwIqlwFHgwixfUXJe7j50JDWy icPPSOIFvLSdcs1Dl5nCtDOCPmOzw3N1xVMGB6Wpj91yReewbO/OL1kyqPI4aMUqxzKn IoZi0wzh5qqM8t0LkMSLxqOomEuCxX7zG5R5yDL5ez93mOBNw5uaJuIx/iLvBoaOFDCk PHmQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=7qGISXjSTLzKjSd2o0snCR5OwQ/OBfqFLQKNtUD8Qr0=; b=DznIzQCUMAs1JodPaLnOV23rW9lCzC/uEobWtBAiInjbEWvGhqy0uf7Ng5yEwTRu9V Ky6dDI3U/6RntswZno3iZQq+5ibJqV6rRpXNPSClJrDYaZNIOqyXOhXSz6VZDU9qfODQ TRaTx5P3BPEIY9DJKBjXQ4c/9+1kWBKCw6ZQH7DzCNOkOMQ8iu7QJijJvzMO9AOO8R9r 3HPKHah6Uue+tFy5qpXpCXoz2yNmtPym2YaamYZRDhruMLAl1gHvMafOAFYV3dfzCEmO l5X6eT1Bo9czDt9A/bAyzHsxL/bShoUP7B70yaFkvayL89x5o7ThxlPqmKZYAoj1TfZv PK1Q== X-Gm-Message-State: APjAAAVVJRQ1CSHfGWtjyY6IlN6PLcKvTTfTC3zYiN1aENOei691WHwr mAyBJ3ZlUoBKuFUGv3rvxcg= X-Google-Smtp-Source: APXvYqxhMlYarKyaIyY1S3HLf2iIPW1905H5flkUgwLPmr92LouLurgXcAiu3NMrs4vuZ8taLNPEpw== X-Received: by 2002:aa7:8d8a:: with SMTP id i10mr11638193pfr.45.1569689441059; Sat, 28 Sep 2019 09:50:41 -0700 (PDT) Received: from localhost.localdomain ([110.35.161.54]) by smtp.gmail.com with ESMTPSA id 30sm8663092pjk.25.2019.09.28.09.50.33 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 28 Sep 2019 09:50:40 -0700 
(PDT)
From: Taehee Yoo
To: davem@davemloft.net, netdev@vger.kernel.org, linux-wireless@vger.kernel.org, jakub.kicinski@netronome.com, johannes@sipsolutions.net, j.vosburgh@gmail.com, vfalico@gmail.com, andy@greyhouse.net, jiri@resnulli.us, sd@queasysnail.net, roopa@cumulusnetworks.com, saeedm@mellanox.com, manishc@marvell.com, rahulv@marvell.com, kys@microsoft.com, haiyangz@microsoft.com, stephen@networkplumber.org, sashal@kernel.org, hare@suse.de, varun@chelsio.com, ubraun@linux.ibm.com, kgraul@linux.ibm.com, jay.vosburgh@canonical.com, schuffelen@google.com, bjorn@mork.no
Cc: ap420073@gmail.com
Subject: [PATCH net v4 11/12] net: remove unnecessary variables and callback
Date: Sat, 28 Sep 2019 16:48:42 +0000
Message-Id: <20190928164843.31800-12-ap420073@gmail.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20190928164843.31800-1-ap420073@gmail.com>
References: <20190928164843.31800-1-ap420073@gmail.com>
Sender: netdev-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: netdev@vger.kernel.org

This patch removes the variables and the callback that are related to
the nested device structure. Devices that can be nested have their own
nest_level variable, which represents the depth of nested devices.
An earlier patch in this series added the new {lower/upper}_level
variables, and they replace the old private nest_level variables.
So, this patch removes all 'nest_level' variables.

To avoid lockdep warnings, ->ndo_get_lock_subclass() was added to get
the lockdep subclass value, which is actually the lower nested depth
value. These devices now use dynamic lockdep keys instead of the
subclass to avoid the lockdep warnings.
So, this patch removes the ->ndo_get_lock_subclass() callback.
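The difference between recomputing the nesting depth on demand and
reading a cached per-device level can be sketched in userspace as below.
This is an illustrative model only: struct toy_netdev, toy_nest_level(),
toy_attach() and TOY_MAX_LOWER are hypothetical names that do not exist
in the kernel, and the numbering of the cached value here is a choice
made for this model, not a statement about the kernel's lower_level.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_LOWER 4	/* hypothetical fan-in limit for this model */

/* Toy stand-in for a stackable device with several lower devices. */
struct toy_netdev {
	struct toy_netdev *lower[TOY_MAX_LOWER];
	int nr_lower;
	int cached_level;	/* depth stored on the device itself */
};

/* Recursive walk in the style of the removed dev_get_nest_level():
 * every call visits the whole tree below @dev.
 */
static int toy_nest_level(const struct toy_netdev *dev)
{
	int max_nest = -1;
	int i;

	for (i = 0; i < dev->nr_lower; i++) {
		int nest = toy_nest_level(dev->lower[i]);

		if (max_nest < nest)
			max_nest = nest;
	}

	return max_nest + 1;
}

/* Attach @lower beneath @upper and refresh the cached level of @upper.
 * Returns 0 on success, -1 if @upper has no free slot.
 *
 * Simplification: only @upper is refreshed, so this stays correct only
 * when stacks are built bottom-up. The patches in this series walk all
 * upper devices after a link change instead.
 */
static int toy_attach(struct toy_netdev *upper, struct toy_netdev *lower)
{
	assert(upper != NULL && lower != NULL);

	if (upper->nr_lower >= TOY_MAX_LOWER)
		return -1;

	upper->lower[upper->nr_lower++] = lower;
	if (upper->cached_level < lower->cached_level + 1)
		upper->cached_level = lower->cached_level + 1;

	return 0;
}
```

A caller that only needs the depth can read cached_level in constant
time, which is the reason a per-driver copy of the same number becomes
redundant once the core keeps one.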
Signed-off-by: Taehee Yoo --- v1 -> v4 : - This patch is not changed drivers/net/bonding/bond_alb.c | 2 +- drivers/net/bonding/bond_main.c | 14 ------------- .../net/ethernet/mellanox/mlx5/core/en_tc.c | 2 +- drivers/net/macsec.c | 9 --------- drivers/net/macvlan.c | 7 ------- include/linux/if_macvlan.h | 1 - include/linux/if_vlan.h | 12 ----------- include/linux/netdevice.h | 12 ----------- include/net/bonding.h | 1 - net/8021q/vlan.c | 1 - net/8021q/vlan_dev.c | 6 ------ net/core/dev.c | 20 ------------------- net/core/dev_addr_lists.c | 12 +++++------ net/smc/smc_core.c | 2 +- net/smc/smc_pnet.c | 2 +- 15 files changed, 10 insertions(+), 93 deletions(-) diff --git a/drivers/net/bonding/bond_alb.c b/drivers/net/bonding/bond_alb.c index 8c79bad2a9a5..4f2e6910c623 100644 --- a/drivers/net/bonding/bond_alb.c +++ b/drivers/net/bonding/bond_alb.c @@ -952,7 +952,7 @@ static int alb_upper_dev_walk(struct net_device *upper, void *_data) struct bond_vlan_tag *tags; if (is_vlan_dev(upper) && - bond->nest_level == vlan_get_encap_level(upper) - 1) { + bond->dev->lower_level == upper->lower_level - 1) { if (upper->addr_assign_type == NET_ADDR_STOLEN) { alb_send_lp_vid(slave, mac_addr, vlan_dev_vlan_proto(upper), diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index 7f574e74ed78..69eb61466fbe 100644 --- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -1733,8 +1733,6 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev, goto err_upper_unlink; } - bond->nest_level = dev_get_nest_level(bond_dev) + 1; - /* If the mode uses primary, then the following is handled by * bond_change_active_slave(). 
*/ @@ -1983,9 +1981,6 @@ static int __bond_release_one(struct net_device *bond_dev, if (!bond_has_slaves(bond)) { bond_set_carrier(bond); eth_hw_addr_random(bond_dev); - bond->nest_level = SINGLE_DEPTH_NESTING; - } else { - bond->nest_level = dev_get_nest_level(bond_dev) + 1; } unblock_netpoll_tx(); @@ -3472,13 +3467,6 @@ static void bond_fold_stats(struct rtnl_link_stats64 *_res, } } -static int bond_get_nest_level(struct net_device *bond_dev) -{ - struct bonding *bond = netdev_priv(bond_dev); - - return bond->nest_level; -} - static void bond_get_stats(struct net_device *bond_dev, struct rtnl_link_stats64 *stats) { @@ -4298,7 +4286,6 @@ static const struct net_device_ops bond_netdev_ops = { .ndo_neigh_setup = bond_neigh_setup, .ndo_vlan_rx_add_vid = bond_vlan_rx_add_vid, .ndo_vlan_rx_kill_vid = bond_vlan_rx_kill_vid, - .ndo_get_lock_subclass = bond_get_nest_level, #ifdef CONFIG_NET_POLL_CONTROLLER .ndo_netpoll_setup = bond_netpoll_setup, .ndo_netpoll_cleanup = bond_netpoll_cleanup, @@ -4822,7 +4809,6 @@ static int bond_init(struct net_device *bond_dev) if (!bond->wq) return -ENOMEM; - bond->nest_level = SINGLE_DEPTH_NESTING; bond_dev_set_lockdep_class(bond_dev); list_add_tail(&bond->bond_list, &bn->dev_list); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c index 3e78a727f3e6..c4c59d2e676e 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c @@ -3160,7 +3160,7 @@ static int add_vlan_pop_action(struct mlx5e_priv *priv, struct mlx5_esw_flow_attr *attr, u32 *action) { - int nest_level = vlan_get_encap_level(attr->parse_attr->filter_dev); + int nest_level = attr->parse_attr->filter_dev->lower_level; struct flow_action_entry vlan_act = { .id = FLOW_ACTION_VLAN_POP, }; diff --git a/drivers/net/macsec.c b/drivers/net/macsec.c index 28972da4a0b3..647aeead644d 100644 --- a/drivers/net/macsec.c +++ b/drivers/net/macsec.c @@ -269,7 +269,6 @@ struct 
macsec_dev { struct gro_cells gro_cells; struct lock_class_key xmit_lock_key; struct lock_class_key addr_lock_key; - unsigned int nest_level; }; /** @@ -2989,11 +2988,6 @@ static int macsec_get_iflink(const struct net_device *dev) return macsec_priv(dev)->real_dev->ifindex; } -static int macsec_get_nest_level(struct net_device *dev) -{ - return macsec_priv(dev)->nest_level; -} - static const struct net_device_ops macsec_netdev_ops = { .ndo_init = macsec_dev_init, .ndo_uninit = macsec_dev_uninit, @@ -3007,7 +3001,6 @@ static const struct net_device_ops macsec_netdev_ops = { .ndo_start_xmit = macsec_start_xmit, .ndo_get_stats64 = macsec_get_stats64, .ndo_get_iflink = macsec_get_iflink, - .ndo_get_lock_subclass = macsec_get_nest_level, }; static const struct device_type macsec_type = { @@ -3290,8 +3283,6 @@ static int macsec_newlink(struct net *net, struct net_device *dev, if (err < 0) return err; - macsec->nest_level = dev_get_nest_level(real_dev) + 1; - err = netdev_upper_dev_link(real_dev, dev, extack); if (err < 0) goto unregister; diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c index dae368a2e8d1..2c14bc606514 100644 --- a/drivers/net/macvlan.c +++ b/drivers/net/macvlan.c @@ -867,11 +867,6 @@ static int macvlan_do_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd) #define MACVLAN_STATE_MASK \ ((1<<__LINK_STATE_NOCARRIER) | (1<<__LINK_STATE_DORMANT)) -static int macvlan_get_nest_level(struct net_device *dev) -{ - return ((struct macvlan_dev *)netdev_priv(dev))->nest_level; -} - static void macvlan_dev_set_lockdep_one(struct net_device *dev, struct netdev_queue *txq, void *_unused) @@ -1180,7 +1175,6 @@ static const struct net_device_ops macvlan_netdev_ops = { .ndo_fdb_add = macvlan_fdb_add, .ndo_fdb_del = macvlan_fdb_del, .ndo_fdb_dump = ndo_dflt_fdb_dump, - .ndo_get_lock_subclass = macvlan_get_nest_level, #ifdef CONFIG_NET_POLL_CONTROLLER .ndo_poll_controller = macvlan_dev_poll_controller, .ndo_netpoll_setup = macvlan_dev_netpoll_setup, @@ 
-1464,7 +1458,6 @@ int macvlan_common_newlink(struct net *src_net, struct net_device *dev, vlan->dev = dev; vlan->port = port; vlan->set_features = MACVLAN_FEATURES; - vlan->nest_level = dev_get_nest_level(lowerdev) + 1; vlan->mode = MACVLAN_MODE_VEPA; if (data && data[IFLA_MACVLAN_MODE]) diff --git a/include/linux/if_macvlan.h b/include/linux/if_macvlan.h index ea5b41823287..e9202edcf101 100644 --- a/include/linux/if_macvlan.h +++ b/include/linux/if_macvlan.h @@ -29,7 +29,6 @@ struct macvlan_dev { netdev_features_t set_features; enum macvlan_mode mode; u16 flags; - int nest_level; unsigned int macaddr_count; struct lock_class_key xmit_lock_key; struct lock_class_key addr_lock_key; diff --git a/include/linux/if_vlan.h b/include/linux/if_vlan.h index 1aed9f613e90..6f30284a58e5 100644 --- a/include/linux/if_vlan.h +++ b/include/linux/if_vlan.h @@ -182,8 +182,6 @@ struct vlan_dev_priv { #ifdef CONFIG_NET_POLL_CONTROLLER struct netpoll *netpoll; #endif - unsigned int nest_level; - struct lock_class_key xmit_lock_key; struct lock_class_key addr_lock_key; }; @@ -224,11 +222,6 @@ extern void vlan_vids_del_by_dev(struct net_device *dev, extern bool vlan_uses_dev(const struct net_device *dev); -static inline int vlan_get_encap_level(struct net_device *dev) -{ - BUG_ON(!is_vlan_dev(dev)); - return vlan_dev_priv(dev)->nest_level; -} #else static inline struct net_device * __vlan_find_dev_deep_rcu(struct net_device *real_dev, @@ -298,11 +291,6 @@ static inline bool vlan_uses_dev(const struct net_device *dev) { return false; } -static inline int vlan_get_encap_level(struct net_device *dev) -{ - BUG(); - return 0; -} #endif /** diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index d1f99d4f41bb..4133db060593 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -1421,7 +1421,6 @@ struct net_device_ops { void (*ndo_dfwd_del_station)(struct net_device *pdev, void *priv); - int (*ndo_get_lock_subclass)(struct net_device *dev); int 
(*ndo_set_tx_maxrate)(struct net_device *dev, int queue_index, u32 maxrate); @@ -4060,16 +4059,6 @@ static inline void netif_addr_lock(struct net_device *dev) spin_lock(&dev->addr_list_lock); } -static inline void netif_addr_lock_nested(struct net_device *dev) -{ - int subclass = SINGLE_DEPTH_NESTING; - - if (dev->netdev_ops->ndo_get_lock_subclass) - subclass = dev->netdev_ops->ndo_get_lock_subclass(dev); - - spin_lock_nested(&dev->addr_list_lock, subclass); -} - static inline void netif_addr_lock_bh(struct net_device *dev) { spin_lock_bh(&dev->addr_list_lock); @@ -4354,7 +4343,6 @@ void netdev_lower_state_changed(struct net_device *lower_dev, extern u8 netdev_rss_key[NETDEV_RSS_KEY_LEN] __read_mostly; void netdev_rss_key_fill(void *buffer, size_t len); -int dev_get_nest_level(struct net_device *dev); int skb_checksum_help(struct sk_buff *skb); int skb_crc32c_csum_help(struct sk_buff *skb); int skb_csum_hwoffload_help(struct sk_buff *skb, diff --git a/include/net/bonding.h b/include/net/bonding.h index c39ac7061e41..74f41dd73866 100644 --- a/include/net/bonding.h +++ b/include/net/bonding.h @@ -203,7 +203,6 @@ struct bonding { struct slave __rcu *primary_slave; struct bond_up_slave __rcu *slave_arr; /* Array of usable slaves */ bool force_primary; - u32 nest_level; s32 slave_cnt; /* never change this value outside the attach/detach wrappers */ int (*recv_probe)(const struct sk_buff *, struct bonding *, struct slave *); diff --git a/net/8021q/vlan.c b/net/8021q/vlan.c index 54728d2eda18..d4bcfd8f95bf 100644 --- a/net/8021q/vlan.c +++ b/net/8021q/vlan.c @@ -172,7 +172,6 @@ int register_vlan_dev(struct net_device *dev, struct netlink_ext_ack *extack) if (err < 0) goto out_uninit_mvrp; - vlan->nest_level = dev_get_nest_level(real_dev) + 1; err = register_netdevice(dev); if (err < 0) goto out_uninit_mvrp; diff --git a/net/8021q/vlan_dev.c b/net/8021q/vlan_dev.c index 12bc80650087..e8707827540c 100644 --- a/net/8021q/vlan_dev.c +++ b/net/8021q/vlan_dev.c @@ -514,11 
+514,6 @@ static void vlan_dev_set_lockdep_class(struct net_device *dev) netdev_for_each_tx_queue(dev, vlan_dev_set_lockdep_one, NULL); } -static int vlan_dev_get_lock_subclass(struct net_device *dev) -{ - return vlan_dev_priv(dev)->nest_level; -} - static const struct header_ops vlan_header_ops = { .create = vlan_dev_hard_header, .parse = eth_header_parse, @@ -814,7 +809,6 @@ static const struct net_device_ops vlan_netdev_ops = { .ndo_netpoll_cleanup = vlan_dev_netpoll_cleanup, #endif .ndo_fix_features = vlan_dev_fix_features, - .ndo_get_lock_subclass = vlan_dev_get_lock_subclass, .ndo_get_iflink = vlan_dev_get_iflink, }; diff --git a/net/core/dev.c b/net/core/dev.c index 0b60bcd5033e..3fbd42eb75d1 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -7712,26 +7712,6 @@ void *netdev_lower_dev_get_private(struct net_device *dev, } EXPORT_SYMBOL(netdev_lower_dev_get_private); - -int dev_get_nest_level(struct net_device *dev) -{ - struct net_device *lower = NULL; - struct list_head *iter; - int max_nest = -1; - int nest; - - ASSERT_RTNL(); - - netdev_for_each_lower_dev(dev, lower, iter) { - nest = dev_get_nest_level(lower); - if (max_nest < nest) - max_nest = nest; - } - - return max_nest + 1; -} -EXPORT_SYMBOL(dev_get_nest_level); - /** * netdev_lower_change - Dispatch event about lower device state change * @lower_dev: device diff --git a/net/core/dev_addr_lists.c b/net/core/dev_addr_lists.c index 6393ba930097..2f949b5a1eb9 100644 --- a/net/core/dev_addr_lists.c +++ b/net/core/dev_addr_lists.c @@ -637,7 +637,7 @@ int dev_uc_sync(struct net_device *to, struct net_device *from) if (to->addr_len != from->addr_len) return -EINVAL; - netif_addr_lock_nested(to); + netif_addr_lock(to); err = __hw_addr_sync(&to->uc, &from->uc, to->addr_len); if (!err) __dev_set_rx_mode(to); @@ -667,7 +667,7 @@ int dev_uc_sync_multiple(struct net_device *to, struct net_device *from) if (to->addr_len != from->addr_len) return -EINVAL; - netif_addr_lock_nested(to); + netif_addr_lock(to); err = 
__hw_addr_sync_multiple(&to->uc, &from->uc, to->addr_len); if (!err) __dev_set_rx_mode(to); @@ -691,7 +691,7 @@ void dev_uc_unsync(struct net_device *to, struct net_device *from) return; netif_addr_lock_bh(from); - netif_addr_lock_nested(to); + netif_addr_lock(to); __hw_addr_unsync(&to->uc, &from->uc, to->addr_len); __dev_set_rx_mode(to); netif_addr_unlock(to); @@ -858,7 +858,7 @@ int dev_mc_sync(struct net_device *to, struct net_device *from) if (to->addr_len != from->addr_len) return -EINVAL; - netif_addr_lock_nested(to); + netif_addr_lock(to); err = __hw_addr_sync(&to->mc, &from->mc, to->addr_len); if (!err) __dev_set_rx_mode(to); @@ -888,7 +888,7 @@ int dev_mc_sync_multiple(struct net_device *to, struct net_device *from) if (to->addr_len != from->addr_len) return -EINVAL; - netif_addr_lock_nested(to); + netif_addr_lock(to); err = __hw_addr_sync_multiple(&to->mc, &from->mc, to->addr_len); if (!err) __dev_set_rx_mode(to); @@ -912,7 +912,7 @@ void dev_mc_unsync(struct net_device *to, struct net_device *from) return; netif_addr_lock_bh(from); - netif_addr_lock_nested(to); + netif_addr_lock(to); __hw_addr_unsync(&to->mc, &from->mc, to->addr_len); __dev_set_rx_mode(to); netif_addr_unlock(to); diff --git a/net/smc/smc_core.c b/net/smc/smc_core.c index 4ca50ddf8d16..a2e91b8d04b3 100644 --- a/net/smc/smc_core.c +++ b/net/smc/smc_core.c @@ -558,7 +558,7 @@ int smc_vlan_by_tcpsk(struct socket *clcsock, struct smc_init_info *ini) } rtnl_lock(); - nest_lvl = dev_get_nest_level(ndev); + nest_lvl = ndev->lower_level; for (i = 0; i < nest_lvl; i++) { struct list_head *lower = &ndev->adj_list.lower; diff --git a/net/smc/smc_pnet.c b/net/smc/smc_pnet.c index bab2da8cf17a..2920b006f65c 100644 --- a/net/smc/smc_pnet.c +++ b/net/smc/smc_pnet.c @@ -718,7 +718,7 @@ static struct net_device *pnet_find_base_ndev(struct net_device *ndev) int i, nest_lvl; rtnl_lock(); - nest_lvl = dev_get_nest_level(ndev); + nest_lvl = ndev->lower_level; for (i = 0; i < nest_lvl; i++) { struct list_head 
*lower = &ndev->adj_list.lower; From patchwork Sat Sep 28 16:48:43 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taehee Yoo X-Patchwork-Id: 1168872 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="Zd/viF3M"; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 46gZRd1FBcz9sNF for ; Sun, 29 Sep 2019 02:50:53 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728824AbfI1Quv (ORCPT ); Sat, 28 Sep 2019 12:50:51 -0400 Received: from mail-pl1-f194.google.com ([209.85.214.194]:45283 "EHLO mail-pl1-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1725897AbfI1Quv (ORCPT ); Sat, 28 Sep 2019 12:50:51 -0400 Received: by mail-pl1-f194.google.com with SMTP id u12so2248687pls.12; Sat, 28 Sep 2019 09:50:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=WGmCj+EzADgagaHq4m2frUsV7PtkZi9w2xNwBqjwIs4=; b=Zd/viF3M74FYZdxWNCEMdS1eJiNqHacTIY7qye+XdyRfI63Ah6BONEOS3RbJaRd7uX k/MwEckjfIIFdY9uScGpoTqmYHsL2/xpZBDGEz4TjTo7W6UiNaesrPgFhzaq7fPezxCA Bk5MlDeB1Jcd6zTIi2ZRUqmdwAUV2OrxW9c+ZA0QvlOvX7kOnaSRAwrxl0AeFFpOTnZ5 1F1quqkmSoDodH6rrxTeVcAjKZu+Xo3sBaHEffEKLghSHWIk7Oa02bL4xNW3hPDJbvuk mvnSFOAh9A+wPvmMt0HDYp0x/k8l8R/1xMXm1dZn9I4pnd0t3CjUEAyr5K3CDrQ2KyAn IxRg== X-Google-DKIM-Signature: 
v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=WGmCj+EzADgagaHq4m2frUsV7PtkZi9w2xNwBqjwIs4=; b=YgNYdyJ222XE1Z/OL9aKvsFfRrl9Y+yle1CzQD3VMlpEOvScXx3rM+ixsDPv5uGz6t sHi6ZERIFglPci+4ndS6J2XH4uWAAQSkMxje+/uidUYVtayQE3aB+i7JDDCqycbxVSY0 37vcxhgTBKk1jF4/RfUdluQAE5le1pgJ+u3ZslOD1FtHPu8kqfXI2nFfRBbjlQfZHeMT AVwNBvSn0AdMdDym0XiKJe3TPs2SwDTI835cpjhleQI/Bm2xMNAUf7BR3BFpr1BzmSv7 yJesZghbzb9QZ8z9wUezvfg2Ng/8Z3fJf56wS47uvRXU8M8KIu+A/tx0siPswEWrkuna HtPA== X-Gm-Message-State: APjAAAXfnXa7YUcPTtIG3XjRibCgSS28w/ccJ0IPIj7rl4Qn6m87wLtR iPOh+CRC5TSR7BSp8Pt95B0= X-Google-Smtp-Source: APXvYqy0uIJBZ2GOlmjZwL3ywOkJNx5QSNg9GPATY8XC9xI9xordaY7vGafPPhNLy3j4NOHDuqFYnA== X-Received: by 2002:a17:902:222:: with SMTP id 31mr11582519plc.167.1569689450181; Sat, 28 Sep 2019 09:50:50 -0700 (PDT) Received: from localhost.localdomain ([110.35.161.54]) by smtp.gmail.com with ESMTPSA id 30sm8663092pjk.25.2019.09.28.09.50.42 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 28 Sep 2019 09:50:49 -0700 (PDT) From: Taehee Yoo To: davem@davemloft.net, netdev@vger.kernel.org, linux-wireless@vger.kernel.org, jakub.kicinski@netronome.com, johannes@sipsolutions.net, j.vosburgh@gmail.com, vfalico@gmail.com, andy@greyhouse.net, jiri@resnulli.us, sd@queasysnail.net, roopa@cumulusnetworks.com, saeedm@mellanox.com, manishc@marvell.com, rahulv@marvell.com, kys@microsoft.com, haiyangz@microsoft.com, stephen@networkplumber.org, sashal@kernel.org, hare@suse.de, varun@chelsio.com, ubraun@linux.ibm.com, kgraul@linux.ibm.com, jay.vosburgh@canonical.com, schuffelen@google.com, bjorn@mork.no Cc: ap420073@gmail.com Subject: [PATCH net v4 12/12] virt_wifi: fix refcnt leak in module exit routine Date: Sat, 28 Sep 2019 16:48:43 +0000 Message-Id: <20190928164843.31800-13-ap420073@gmail.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20190928164843.31800-1-ap420073@gmail.com> References: 
 <20190928164843.31800-1-ap420073@gmail.com>
Sender: netdev-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org

virt_wifi_newlink() calls netdev_upper_dev_link(), which internally takes
a reference on the lower interface. The current code does not release
that reference when the lower interface is deleted, so the reference
count of the lower interface leaks.

Test commands:
    ip link add dummy0 type dummy
    ip link add vw1 link dummy0 type virt_wifi

Splat looks like:
[ 182.001918][ T1333] WARNING: CPU: 0 PID: 1333 at net/core/dev.c:8638 rollback_registered_many+0x75d/0xda0
[ 182.002724][ T1333] Modules linked in: virt_wifi cfg80211 dummy veth openvswitch nsh nf_conncount nf_nat nf_conntrack6
[ 182.002724][ T1333] CPU: 0 PID: 1333 Comm: ip Not tainted 5.3.0+ #370
[ 182.002724][ T1333] RIP: 0010:rollback_registered_many+0x75d/0xda0
[ 182.002724][ T1333] Code: 0c 00 00 48 89 de 4c 89 ff e8 6f 5a 04 00 48 89 df e8 c7 26 fd ff 84 c0 0f 84 a5 fd ff ff 40
[ 182.002724][ T1333] RSP: 0018:ffff88810900f348 EFLAGS: 00010286
[ 182.002724][ T1333] RAX: 0000000000000024 RBX: ffff88811361d700 RCX: 0000000000000000
[ 182.002724][ T1333] RDX: 0000000000000024 RSI: 0000000000000008 RDI: ffffed1021201e5f
[ 182.002724][ T1333] RBP: ffff88810900f4e0 R08: ffffed1022c3ff71 R09: ffffed1022c3ff71
[ 182.002724][ T1333] R10: 0000000000000001 R11: ffffed1022c3ff70 R12: dffffc0000000000
[ 182.002724][ T1333] R13: ffff88810900f460 R14: ffff88810900f420 R15: ffff8881075f1940
[ 182.002724][ T1333] FS: 00007f77c42240c0(0000) GS:ffff888116000000(0000) knlGS:0000000000000000
[ 182.002724][ T1333] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 182.002724][ T1333] CR2: 00007f77c3706240 CR3: 000000011139e000 CR4: 00000000001006f0
[ 182.002724][ T1333] Call Trace:
[ 182.002724][ T1333]  ? generic_xdp_install+0x310/0x310
[ 182.002724][ T1333]  ? check_chain_key+0x236/0x5d0
[ 182.002724][ T1333]  ? __nla_validate_parse+0x98/0x1ad0
[ 182.002724][ T1333]  unregister_netdevice_many.part.123+0x13/0x1b0
[ 182.002724][ T1333]  rtnl_delete_link+0xbc/0x100
[ 182.002724][ T1333]  ? rtnl_af_register+0xc0/0xc0
[ 182.002724][ T1333]  rtnl_dellink+0x2e7/0x870
[ ... ]
[ 192.874736][ T1333] unregister_netdevice: waiting for dummy0 to become free. Usage count = 1

This patch adds a notifier routine that deletes the upper interface
before the lower interface is deleted.

Fixes: c7cdba31ed8b ("mac80211-next: rtnetlink wifi simulation device")
Signed-off-by: Taehee Yoo
---

v4:
 - Add this new patch to fix refcnt leaks in the virt_wifi module

 drivers/net/wireless/virt_wifi.c | 52 ++++++++++++++++++++++++++++++--
 1 file changed, 50 insertions(+), 2 deletions(-)

diff --git a/drivers/net/wireless/virt_wifi.c b/drivers/net/wireless/virt_wifi.c
index be92e1220284..aadbacb01c8d 100644
--- a/drivers/net/wireless/virt_wifi.c
+++ b/drivers/net/wireless/virt_wifi.c
@@ -590,6 +590,42 @@ static struct rtnl_link_ops virt_wifi_link_ops = {
 	.priv_size	= sizeof(struct virt_wifi_netdev_priv),
 };
 
+static inline bool netif_is_virt_wifi_dev(const struct net_device *dev)
+{
+	return rcu_access_pointer(dev->rx_handler) == virt_wifi_rx_handler;
+}
+
+static int virt_wifi_event(struct notifier_block *this, unsigned long event,
+			   void *ptr)
+{
+	struct net_device *lower_dev = netdev_notifier_info_to_dev(ptr);
+	struct virt_wifi_netdev_priv *priv;
+	struct net_device *upper_dev;
+	LIST_HEAD(list_kill);
+
+	if (!netif_is_virt_wifi_dev(lower_dev))
+		return NOTIFY_DONE;
+
+	switch (event) {
+	case NETDEV_UNREGISTER:
+		priv = rtnl_dereference(lower_dev->rx_handler_data);
+		if (!priv)
+			return NOTIFY_DONE;
+
+		upper_dev = priv->upperdev;
+
+		upper_dev->rtnl_link_ops->dellink(upper_dev, &list_kill);
+		unregister_netdevice_many(&list_kill);
+		break;
+	}
+
+	return NOTIFY_DONE;
+}
+
+static struct notifier_block virt_wifi_notifier = {
+	.notifier_call = virt_wifi_event,
+};
+
 /* Acquires and releases the rtnl lock. */
 static int __init virt_wifi_init_module(void)
 {
@@ -598,14 +634,25 @@ static int __init virt_wifi_init_module(void)
 	/* Guaranteed to be locallly-administered and not multicast. */
 	eth_random_addr(fake_router_bssid);
 
+	err = register_netdevice_notifier(&virt_wifi_notifier);
+	if (err)
+		return err;
+
+	err = -ENOMEM;
 	common_wiphy = virt_wifi_make_wiphy();
 	if (!common_wiphy)
-		return -ENOMEM;
+		goto notifier;
 
 	err = rtnl_link_register(&virt_wifi_link_ops);
 	if (err)
-		virt_wifi_destroy_wiphy(common_wiphy);
+		goto destroy_wiphy;
 
+	return 0;
+
+destroy_wiphy:
+	virt_wifi_destroy_wiphy(common_wiphy);
+notifier:
+	unregister_netdevice_notifier(&virt_wifi_notifier);
 	return err;
 }
 
@@ -615,6 +662,7 @@ static void __exit virt_wifi_cleanup_module(void)
 	/* Will delete any devices that depend on the wiphy. */
 	rtnl_link_unregister(&virt_wifi_link_ops);
 	virt_wifi_destroy_wiphy(common_wiphy);
+	unregister_netdevice_notifier(&virt_wifi_notifier);
 }
 
 module_init(virt_wifi_init_module);
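
For readers who want to see the leak without booting a test kernel, here
is a rough userspace sketch of the idea in the commit message. It is only
a toy model under stated assumptions: toy_dev, toy_link(),
toy_delete_upper(), toy_unregister_event() and toy_leftover_refs() are
hypothetical names that do not exist in the kernel. They loosely stand in
for netdev_upper_dev_link(), the dellink path and the NETDEV_UNREGISTER
handler.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Userspace toy model, NOT kernel code. Every name below is hypothetical
 * and only mirrors the idea described in the commit message.
 */
struct toy_dev {
	int refcnt;            /* references other devices hold on this one */
	struct toy_dev *upper; /* device stacked on top, or NULL */
	struct toy_dev *lower; /* device underneath, or NULL */
};

/* Stands in for netdev_upper_dev_link(): linking pins the lower device. */
static void toy_link(struct toy_dev *upper, struct toy_dev *lower)
{
	upper->lower = lower;
	lower->upper = upper;
	lower->refcnt++;
}

/* Stands in for the dellink path: deleting the upper drops its reference. */
static void toy_delete_upper(struct toy_dev *upper)
{
	struct toy_dev *lower = upper->lower;

	if (!lower)
		return;
	assert(lower->refcnt > 0);
	lower->refcnt--;
	lower->upper = NULL;
	upper->lower = NULL;
}

/* Stands in for the NETDEV_UNREGISTER handler: delete the upper first. */
static void toy_unregister_event(struct toy_dev *lower)
{
	if (lower->upper)
		toy_delete_upper(lower->upper);
}

/*
 * Models the scenario: stack vw1 on dummy0, then unregister dummy0.
 * Returns the references still held on dummy0; a nonzero value
 * corresponds to "waiting for dummy0 to become free".
 */
int toy_leftover_refs(int with_notifier)
{
	struct toy_dev dummy0 = { 0, NULL, NULL };
	struct toy_dev vw1 = { 0, NULL, NULL };

	toy_link(&vw1, &dummy0);
	if (with_notifier)
		toy_unregister_event(&dummy0);
	return dummy0.refcnt;
}
```

In this model toy_leftover_refs(0) leaves one reference on dummy0, which
matches the "Usage count = 1" line in the splat, while
toy_leftover_refs(1) leaves none because the upper device is deleted
before the lower one goes away.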