From patchwork Sat Sep 7 13:45:32 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taehee Yoo X-Patchwork-Id: 1159322 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="nP5y95ME"; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 46QbKq0nlPz9sDB for ; Sat, 7 Sep 2019 23:45:51 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2393244AbfIGNpr (ORCPT ); Sat, 7 Sep 2019 09:45:47 -0400 Received: from mail-pf1-f195.google.com ([209.85.210.195]:35734 "EHLO mail-pf1-f195.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728809AbfIGNpr (ORCPT ); Sat, 7 Sep 2019 09:45:47 -0400 Received: by mail-pf1-f195.google.com with SMTP id 205so6408757pfw.2 for ; Sat, 07 Sep 2019 06:45:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id; bh=W/lU75C0i08MT1RDVmXy96us6MwDbmbo0x2PU45ni0A=; b=nP5y95MEKUhbfXrt4Y/v5CM1sVnoW9ohqTJooYu0Cu7qIlpGjo9kjHkUEkZ1ComJe8 GFr9Fqn+O7NAHtM7cucHVfhFlAQbyMvnhqdVY5qpJgU5Cl95zQIBkbVSuH+wal3p3BGJ lCrwScicwJmMCYnienij+867RLouRGHtL3LIcnEsqMzGkHC9Nnz7fk44+CJYmsqKXIbJ qHvaFFw6pUZpKGvTL/AbtqBmH3FUndvkmFUA/PqOfM7jDCZLXio2mUR9gxwoJX/0434n j5dXeyZ26+lpPVfkRcircXNxCqEGyl424uhyXLZR7Yt8mdwp1ge+X8484Wkqrb3EL1p5 9M6w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; 
s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id; bh=W/lU75C0i08MT1RDVmXy96us6MwDbmbo0x2PU45ni0A=; b=LOxqqJL4JNSk+JckKeo3ybrxEO6WipNRTJOrugzSAoxZSVEoa659MYXm1ja3IpLNPN qm04Ju6Kxa0qB25UbWK7KJ5iTFgbzqHmpEBFqUat3l+R6OI108KdCOqnpwBKLKg2hbpy P7utwCVoOtWRZ5urgzln76UMGPIBS5vrvuSbjlbp2bqcC4uA7khPP01tBskeGMnImYyQ W4J5BqOea3PcoEFyF6Un+wBRso2vl+2vB0ejGPKBBPty4xE2Et78GvtKyuNWpAoeQ0DY VKVPx8MI3LqBP3ojVtTUpf4NDE6U8SM/YDmEoHWxih5huMXtbRUSygXZ5kh+M99Cd/ll YMTw== X-Gm-Message-State: APjAAAVqkxIYJPXerhSB7lSDXIUqACZcWQ5BcEWi4T1//bdf5NkQs0t3 bctPvkV0bptDe19ZzdmvG0A= X-Google-Smtp-Source: APXvYqyW/CK5dDrXpQ37sYS9nXJPA3QC2+jqUidnseAF253J494fvzK9vf/mqFo3sOlJA1qswKKnEw== X-Received: by 2002:a63:3387:: with SMTP id z129mr12487675pgz.177.1567863946167; Sat, 07 Sep 2019 06:45:46 -0700 (PDT) Received: from ap-To-be-filled-by-O-E-M.8.8.8.8 ([14.33.120.60]) by smtp.gmail.com with ESMTPSA id s13sm9644772pfm.12.2019.09.07.06.45.40 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 07 Sep 2019 06:45:45 -0700 (PDT) From: Taehee Yoo To: davem@davemloft.net, netdev@vger.kernel.org, j.vosburgh@gmail.com, vfalico@gmail.com, andy@greyhouse.net, jiri@resnulli.us, sd@queasysnail.net, roopa@cumulusnetworks.com, saeedm@mellanox.com, manishc@marvell.com, rahulv@marvell.com, kys@microsoft.com, haiyangz@microsoft.com, sthemmin@microsoft.com, sashal@kernel.org, hare@suse.de, varun@chelsio.com, ubraun@linux.ibm.com, kgraul@linux.ibm.com, jay.vosburgh@canonical.com Cc: ap420073@gmail.com Subject: [PATCH net v2 01/11] net: core: limit nested device depth Date: Sat, 7 Sep 2019 22:45:32 +0900 Message-Id: <20190907134532.31975-1-ap420073@gmail.com> X-Mailer: git-send-email 2.17.1 Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org Current code doesn't limit the number of nested devices. Nested devices would be handled recursively and this needs huge stack memory. So, unlimited nested devices could make stack overflow. 
This patch adds upper_level and lower_level to struct net_device. They
record the maximum depth of the upper and lower device chains of a
device. When an upper or lower device is attached or detached,
{lower/upper}_level are updated, and if the resulting depth would be
bigger than 8, the attach routine fails and returns -EMLINK.

Test commands:
    ip link add dummy0 type dummy
    ip link add link dummy0 name vlan1 type vlan id 1
    ip link set vlan1 up

    for i in {2..100}
    do
        let A=$i-1

        ip link add name vlan$i link vlan$A type vlan id $i
    done

Splat looks like:
[ 140.483124] BUG: looking up invalid subclass: 8
[ 140.483505] turning off the locking correctness validator.
[ 140.483505] CPU: 0 PID: 1324 Comm: ip Not tainted 5.3.0-rc7+ #322
[ 140.483505] Hardware name: To be filled by O.E.M. To be filled by O.E.M./Aptio CRB, BIOS 5.6.5 07/08/2015
[ 140.483505] Call Trace:
[ 140.483505] dump_stack+0x7c/0xbb
[ 140.483505] register_lock_class+0x64d/0x14d0
[ 140.483505] ? is_dynamic_key+0x230/0x230
[ 140.483505] ? module_assert_mutex_or_preempt+0x41/0x70
[ 140.483505] ? __module_address+0x3f/0x3c0
[ 140.483505] lockdep_init_map+0x24e/0x630
[ 140.483505] vlan_dev_init+0x828/0xce0 [8021q]
[ 140.483505] register_netdevice+0x24f/0xd70
[ 140.483505] ? netdev_change_features+0xa0/0xa0
[ 140.483505] ? dev_get_nest_level+0xe1/0x170
[ 140.483505] register_vlan_dev+0x29b/0x710 [8021q]
[ 140.483505] __rtnl_newlink+0xb75/0x1180
[ ... ]
[ 168.446539] WARNING: can't dereference registers at 00000000bef3d701 for ip apic_timer_interrupt+0xf/0x20
[ 168.466843] ==================================================================
[ 168.469452] BUG: KASAN: slab-out-of-bounds in __unwind_start+0x71/0x850
[ 168.480707] Write of size 88 at addr ffff8880b8856d38 by task ip/1758
[ 168.480707]
[ 168.480707] CPU: 1 PID: 1758 Comm: ip Not tainted 5.3.0-rc7+ #322
[ ... ]
[ 168.794493] Rebooting in 5 seconds..
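
The depth bookkeeping described in this commit message can be sketched in
userspace C. This is only an illustration under simplifying assumptions,
not the kernel code: struct toy_dev, toy_dev_init() and toy_dev_link() are
hypothetical names, and each device is assumed to have at most one upper
and one lower device, whereas the kernel walks adjacency lists.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAX_NEST_DEV 8

/* Hypothetical userspace stand-in for struct net_device. */
struct toy_dev {
	struct toy_dev *upper;		/* device stacked on top, if any */
	struct toy_dev *lower;		/* device this one sits on, if any */
	unsigned char upper_level;	/* devices from here to the top, inclusive */
	unsigned char lower_level;	/* devices from here to the bottom, inclusive */
};

static void toy_dev_init(struct toy_dev *dev)
{
	assert(dev != NULL);
	dev->upper = NULL;
	dev->lower = NULL;
	dev->upper_level = 1;
	dev->lower_level = 1;
}

/* Stack upper_dev on top of dev; refuse chains deeper than MAX_NEST_DEV. */
static int toy_dev_link(struct toy_dev *dev, struct toy_dev *upper_dev)
{
	struct toy_dev *it;

	assert(dev != NULL && upper_dev != NULL);
	if (dev->upper || upper_dev->lower)
		return -EBUSY;

	/* Reject a link that would close a loop. */
	for (it = dev; it; it = it->lower)
		if (it == upper_dev)
			return -EBUSY;

	if (dev->lower_level + upper_dev->upper_level > MAX_NEST_DEV)
		return -EMLINK;

	dev->upper = upper_dev;
	upper_dev->lower = dev;

	/* Walk down from dev: each device is one level below its upper. */
	for (it = dev; it; it = it->lower)
		it->upper_level = it->upper->upper_level + 1;

	/* Walk up from upper_dev: each device is one level above its lower. */
	for (it = upper_dev; it; it = it->upper)
		it->lower_level = it->lower->lower_level + 1;

	return 0;
}
```

In this model, chaining eight devices one on top of another succeeds, and
linking a ninth on top of the chain returns -EMLINK because
lower_level (8) + upper_level (1) exceeds MAX_NEST_DEV.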
Signed-off-by: Taehee Yoo
---

v1 -> v2 : this patch isn't changed

 include/linux/netdevice.h |   4 ++
 net/core/dev.c            | 106 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 110 insertions(+)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 88292953aa6f..5bb5756129af 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1624,6 +1624,8 @@ enum netdev_priv_flags {
  *	@type:		Interface hardware type
  *	@hard_header_len: Maximum hardware header length.
  *	@min_header_len:  Minimum hardware header length
+ *	@upper_level:	Maximum depth level of upper devices.
+ *	@lower_level:	Maximum depth level of lower devices.
  *
  *	@needed_headroom: Extra headroom the hardware may need, but not in all
  *			  cases can this be guaranteed
@@ -1854,6 +1856,8 @@ struct net_device {
 	unsigned short		type;
 	unsigned short		hard_header_len;
 	unsigned char		min_header_len;
+	unsigned char		upper_level;
+	unsigned char		lower_level;
 
 	unsigned short		needed_headroom;
 	unsigned short		needed_tailroom;
diff --git a/net/core/dev.c b/net/core/dev.c
index 0891f499c1bb..6a4b4ce62204 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -146,6 +146,7 @@
 #include "net-sysfs.h"
 
 #define MAX_GRO_SKBS 8
+#define MAX_NEST_DEV 8
 
 /* This should be increased if a protocol with a bigger head is added. */
 #define GRO_MAX_HEAD (MAX_HEADER + 128)
@@ -6602,6 +6603,21 @@ struct net_device *netdev_upper_get_next_dev_rcu(struct net_device *dev,
 }
 EXPORT_SYMBOL(netdev_upper_get_next_dev_rcu);
 
+static struct net_device *netdev_next_upper_dev(struct net_device *dev,
+						 struct list_head **iter)
+{
+	struct netdev_adjacent *upper;
+
+	upper = list_entry((*iter)->next, struct netdev_adjacent, list);
+
+	if (&upper->list == &dev->adj_list.upper)
+		return NULL;
+
+	*iter = &upper->list;
+
+	return upper->dev;
+}
+
 static struct net_device *netdev_next_upper_dev_rcu(struct net_device *dev,
 						    struct list_head **iter)
 {
@@ -6619,6 +6635,33 @@ static struct net_device *netdev_next_upper_dev_rcu(struct net_device *dev,
 	return upper->dev;
 }
 
+int netdev_walk_all_upper_dev(struct net_device *dev,
+			      int (*fn)(struct net_device *dev,
+					void *data),
+			      void *data)
+{
+	struct net_device *udev;
+	struct list_head *iter;
+	int ret;
+
+	for (iter = &dev->adj_list.upper,
+	     udev = netdev_next_upper_dev(dev, &iter);
+	     udev;
+	     udev = netdev_next_upper_dev(dev, &iter)) {
+		/* first is the upper device itself */
+		ret = fn(udev, data);
+		if (ret)
+			return ret;
+
+		/* then look at all of its upper devices */
+		ret = netdev_walk_all_upper_dev(udev, fn, data);
+		if (ret)
+			return ret;
+	}
+
+	return 0;
+}
+
 int netdev_walk_all_upper_dev_rcu(struct net_device *dev,
 				  int (*fn)(struct net_device *dev,
 					    void *data),
@@ -6785,6 +6828,52 @@ static struct net_device *netdev_next_lower_dev_rcu(struct net_device *dev,
 	return lower->dev;
 }
 
+static u8 __netdev_upper_depth(struct net_device *dev)
+{
+	struct net_device *udev;
+	struct list_head *iter;
+	u8 max_depth = 0;
+
+	for (iter = &dev->adj_list.upper,
+	     udev = netdev_next_upper_dev(dev, &iter);
+	     udev;
+	     udev = netdev_next_upper_dev(dev, &iter)) {
+		if (max_depth < udev->upper_level)
+			max_depth = udev->upper_level;
+	}
+
+	return max_depth;
+}
+
+static u8 __netdev_lower_depth(struct net_device *dev)
+{
+	struct net_device *ldev;
+	struct list_head *iter;
+	u8 max_depth = 0;
+
+	for (iter = &dev->adj_list.lower,
+	     ldev = netdev_next_lower_dev(dev, &iter);
+	     ldev;
+	     ldev = netdev_next_lower_dev(dev, &iter)) {
+		if (max_depth < ldev->lower_level)
+			max_depth = ldev->lower_level;
+	}
+
+	return max_depth;
+}
+
+static int __netdev_update_upper_level(struct net_device *dev, void *data)
+{
+	dev->upper_level = __netdev_upper_depth(dev) + 1;
+	return 0;
+}
+
+static int __netdev_update_lower_level(struct net_device *dev, void *data)
+{
+	dev->lower_level = __netdev_lower_depth(dev) + 1;
+	return 0;
+}
+
 int netdev_walk_all_lower_dev_rcu(struct net_device *dev,
 				  int (*fn)(struct net_device *dev,
 					    void *data),
@@ -7063,6 +7152,9 @@ static int __netdev_upper_dev_link(struct net_device *dev,
 	if (netdev_has_upper_dev(upper_dev, dev))
 		return -EBUSY;
 
+	if ((dev->lower_level + upper_dev->upper_level) > MAX_NEST_DEV)
+		return -EMLINK;
+
 	if (!master) {
 		if (netdev_has_upper_dev(dev, upper_dev))
 			return -EEXIST;
@@ -7089,6 +7181,12 @@ static int __netdev_upper_dev_link(struct net_device *dev,
 	if (ret)
 		goto rollback;
 
+	__netdev_update_upper_level(dev, NULL);
+	netdev_walk_all_lower_dev(dev, __netdev_update_upper_level, NULL);
+
+	__netdev_update_lower_level(upper_dev, NULL);
+	netdev_walk_all_upper_dev(upper_dev, __netdev_update_lower_level, NULL);
+
 	return 0;
 
 rollback:
@@ -7171,6 +7269,12 @@ void netdev_upper_dev_unlink(struct net_device *dev,
 
 	call_netdevice_notifiers_info(NETDEV_CHANGEUPPER,
 				      &changeupper_info.info);
+
+	__netdev_update_upper_level(dev, NULL);
+	netdev_walk_all_lower_dev(dev, __netdev_update_upper_level, NULL);
+
+	__netdev_update_lower_level(upper_dev, NULL);
+	netdev_walk_all_upper_dev(upper_dev, __netdev_update_lower_level, NULL);
 }
 EXPORT_SYMBOL(netdev_upper_dev_unlink);
 
@@ -9157,6 +9261,8 @@ struct net_device *alloc_netdev_mqs(int sizeof_priv, const char *name,
 
 	dev->gso_max_size = GSO_MAX_SIZE;
 	dev->gso_max_segs = GSO_MAX_SEGS;
+	dev->upper_level = 1;
+	dev->lower_level = 1;
 
 	INIT_LIST_HEAD(&dev->napi_list);
 	INIT_LIST_HEAD(&dev->unreg_list);

From patchwork Sat Sep 7 13:45:48 2019
X-Patchwork-Submitter: Taehee Yoo
X-Patchwork-Id: 1159323
X-Patchwork-Delegate: davem@davemloft.net
From: Taehee Yoo
To: davem@davemloft.net, netdev@vger.kernel.org, j.vosburgh@gmail.com,
 vfalico@gmail.com, andy@greyhouse.net, jiri@resnulli.us,
 sd@queasysnail.net, roopa@cumulusnetworks.com, saeedm@mellanox.com,
 manishc@marvell.com, rahulv@marvell.com, kys@microsoft.com,
 haiyangz@microsoft.com, sthemmin@microsoft.com, sashal@kernel.org,
 hare@suse.de, varun@chelsio.com, ubraun@linux.ibm.com,
 kgraul@linux.ibm.com, jay.vosburgh@canonical.com
Cc: ap420073@gmail.com
Subject: [PATCH net v2 02/11] vlan: use dynamic lockdep key instead of subclass
Date: Sat, 7 Sep 2019 22:45:48 +0900
Message-Id: <20190907134548.32071-1-ap420073@gmail.com>

All VLAN devices share the same lockdep key, and the subclass is
initialized with nest_level. But the actual nest_level value can change
when a lower device is attached.
At that point the subclass should be updated, but doing so seems to be
unsafe. So this patch makes VLAN use a dynamic lockdep key instead of
the subclass.

Test commands:
    ip link add dummy0 type dummy
    ip link set dummy0 up
    ip link add bond0 type bond

    ip link add vlan_dummy1 link dummy0 type vlan id 1
    ip link add vlan_bond1 link bond0 type vlan id 2
    ip link set vlan_dummy1 master bond0

    ip link set bond0 up
    ip link set vlan_dummy1 up
    ip link set vlan_bond1 up

Both vlan_dummy1 and vlan_bond1 have the same subclass, which produces
an unnecessary deadlock warning.

Splat looks like:
[ 149.244978] ============================================
[ 149.244978] WARNING: possible recursive locking detected
[ 149.244978] 5.3.0-rc7+ #322 Not tainted
[ 149.244978] --------------------------------------------
[ 149.244978] ip/1340 is trying to acquire lock:
[ 149.244978] 000000001399b1a7 (&vlan_netdev_addr_lock_key/1){+...}, at: dev_uc_sync_multiple+0xfa/0x1a0
[ 149.279600]
[ 149.279600] but task is already holding lock:
[ 149.279600] 00000000b963d9b4 (&vlan_netdev_addr_lock_key/1){+...}, at: dev_set_rx_mode+0x19/0x30
[ 149.279600]
[ 149.279600] other info that might help us debug this:
[ 149.305981] Possible unsafe locking scenario:
[ 149.305981]
[ 149.305981] CPU0
[ 149.305981] ----
[ 149.305981] lock(&vlan_netdev_addr_lock_key/1);
[ 149.305981] lock(&vlan_netdev_addr_lock_key/1);
[ 149.326258]
[ 149.326258] *** DEADLOCK ***
[ 149.326258]
[ 149.326258] May be due to missing lock nesting notation
[ 149.326258]
[ 149.326258] 4 locks held by ip/1340:
[ 149.326258] #0: 00000000927f0698 (rtnl_mutex){+.+.}, at: rtnetlink_rcv_msg+0x466/0x8a0
[ 149.326258] #1: 00000000b963d9b4 (&vlan_netdev_addr_lock_key/1){+...}, at: dev_set_rx_mode+0x19/0x30
[ 149.326258] #2: 0000000027395445 (&dev_addr_list_lock_key/3){+...}, at: dev_mc_sync+0xfa/0x1a0
[ 149.369961] #3: 00000000ce334932 (rcu_read_lock){....}, at: bond_set_rx_mode+0x5/0x3c0 [bonding]
[ 149.369961]
[ 149.369961] stack backtrace:
[ 149.369961] CPU: 1 PID: 1340 Comm: ip Not tainted 5.3.0-rc7+ #322
[ 149.369961] Call Trace:
[ 149.369961] dump_stack+0x7c/0xbb
[ 149.369961] __lock_acquire+0x26a9/0x3de0
[ 149.369961] ? register_lock_class+0x14d0/0x14d0
[ 149.369961] ? register_lock_class+0x14d0/0x14d0
[ 149.369961] lock_acquire+0x164/0x3b0
[ 149.433970] ? dev_uc_sync_multiple+0xfa/0x1a0
[ 149.433970] _raw_spin_lock_nested+0x2e/0x60
[ 149.433970] ? dev_uc_sync_multiple+0xfa/0x1a0
[ 149.433970] dev_uc_sync_multiple+0xfa/0x1a0
[ 149.433970] bond_set_rx_mode+0x269/0x3c0 [bonding]
[ 149.433970] ? bond_init+0x6f0/0x6f0 [bonding]
[ 149.433970] dev_mc_sync+0x15a/0x1a0
[ 149.433970] vlan_dev_set_rx_mode+0x37/0x80 [8021q]
[ 149.433970] dev_set_rx_mode+0x21/0x30
[ 149.433970] __dev_open+0x202/0x310
[ 149.433970] ? dev_set_rx_mode+0x30/0x30
[ 149.433970] ? mark_held_locks+0xa5/0xe0
[ 149.433970] ? __local_bh_enable_ip+0xe9/0x1b0
[ 149.433970] __dev_change_flags+0x3c3/0x500
[ ... ]

Fixes: 0fe1e567d0b4 ("[VLAN]: nested VLAN: fix lockdep's recursive locking warning")
Signed-off-by: Taehee Yoo
---

v1 -> v2 : this patch isn't changed

 include/linux/if_vlan.h |  3 +++
 net/8021q/vlan_dev.c    | 28 +++++++++++++++-------------
 2 files changed, 18 insertions(+), 13 deletions(-)

diff --git a/include/linux/if_vlan.h b/include/linux/if_vlan.h
index 244278d5c222..1aed9f613e90 100644
--- a/include/linux/if_vlan.h
+++ b/include/linux/if_vlan.h
@@ -183,6 +183,9 @@ struct vlan_dev_priv {
 	struct netpoll				*netpoll;
 #endif
 	unsigned int				nest_level;
+
+	struct lock_class_key			xmit_lock_key;
+	struct lock_class_key			addr_lock_key;
 };
 
 static inline struct vlan_dev_priv *vlan_dev_priv(const struct net_device *dev)
diff --git a/net/8021q/vlan_dev.c b/net/8021q/vlan_dev.c
index 93eadf179123..12bc80650087 100644
--- a/net/8021q/vlan_dev.c
+++ b/net/8021q/vlan_dev.c
@@ -494,24 +494,24 @@ static void vlan_dev_set_rx_mode(struct net_device *vlan_dev)
  * "super class" of normal network devices; split their locks off into a
  * separate class since they always nest.
  */
-static struct lock_class_key vlan_netdev_xmit_lock_key;
-static struct lock_class_key vlan_netdev_addr_lock_key;
-
 static void vlan_dev_set_lockdep_one(struct net_device *dev,
 				     struct netdev_queue *txq,
-				     void *_subclass)
+				     void *_unused)
 {
-	lockdep_set_class_and_subclass(&txq->_xmit_lock,
-				       &vlan_netdev_xmit_lock_key,
-				       *(int *)_subclass);
+	struct vlan_dev_priv *vlan = vlan_dev_priv(dev);
+
+	lockdep_set_class(&txq->_xmit_lock, &vlan->xmit_lock_key);
 }
 
-static void vlan_dev_set_lockdep_class(struct net_device *dev, int subclass)
+static void vlan_dev_set_lockdep_class(struct net_device *dev)
 {
-	lockdep_set_class_and_subclass(&dev->addr_list_lock,
-				       &vlan_netdev_addr_lock_key,
-				       subclass);
-	netdev_for_each_tx_queue(dev, vlan_dev_set_lockdep_one, &subclass);
+	struct vlan_dev_priv *vlan = vlan_dev_priv(dev);
+
+	lockdep_register_key(&vlan->addr_lock_key);
+	lockdep_set_class(&dev->addr_list_lock, &vlan->addr_lock_key);
+
+	lockdep_register_key(&vlan->xmit_lock_key);
+	netdev_for_each_tx_queue(dev, vlan_dev_set_lockdep_one, NULL);
 }
 
 static int vlan_dev_get_lock_subclass(struct net_device *dev)
@@ -609,7 +609,7 @@ static int vlan_dev_init(struct net_device *dev)
 
 	SET_NETDEV_DEVTYPE(dev, &vlan_type);
 
-	vlan_dev_set_lockdep_class(dev, vlan_dev_get_lock_subclass(dev));
+	vlan_dev_set_lockdep_class(dev);
 
 	vlan->vlan_pcpu_stats = netdev_alloc_pcpu_stats(struct vlan_pcpu_stats);
 	if (!vlan->vlan_pcpu_stats)
@@ -630,6 +630,8 @@ static void vlan_dev_uninit(struct net_device *dev)
 			kfree(pm);
 		}
 	}
+	lockdep_unregister_key(&vlan->addr_lock_key);
+	lockdep_unregister_key(&vlan->xmit_lock_key);
 }
 
 static netdev_features_t vlan_dev_fix_features(struct net_device *dev,

From patchwork Sat Sep 7 13:46:00 2019
X-Patchwork-Submitter: Taehee Yoo
X-Patchwork-Id: 1159324
X-Patchwork-Delegate: davem@davemloft.net
From: Taehee Yoo
To: davem@davemloft.net, netdev@vger.kernel.org, j.vosburgh@gmail.com,
 vfalico@gmail.com, andy@greyhouse.net, jiri@resnulli.us,
 sd@queasysnail.net, roopa@cumulusnetworks.com, saeedm@mellanox.com,
 manishc@marvell.com, rahulv@marvell.com, kys@microsoft.com,
 haiyangz@microsoft.com, sthemmin@microsoft.com, sashal@kernel.org,
 hare@suse.de, varun@chelsio.com, ubraun@linux.ibm.com,
 kgraul@linux.ibm.com, jay.vosburgh@canonical.com
Cc: ap420073@gmail.com
Subject: [PATCH net v2 03/11] bonding: fix unexpected IFF_BONDING bit unset
Date: Sat, 7 Sep 2019 22:46:00 +0900
Message-Id: <20190907134600.32152-1-ap420073@gmail.com>

The IFF_BONDING flag marks a device as a bonding master or a bonding
slave. ->ndo_add_slave() sets the IFF_BONDING flag and ->ndo_del_slave()
clears it.

    bond0<--bond1

Both bond0 and bond1 are bonding devices, and they should keep the
IFF_BONDING flag until they are removed. But bond1 loses IFF_BONDING at
->ndo_del_slave() because that routine does not check whether the slave
device is itself a bonding device.
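
The flag handling described above can be illustrated with a small
userspace sketch. It is a simplified model and not the bonding driver:
the toy_* names and the flag values are hypothetical, and a single flags
word stands in for the kernel's separate flags and priv_flags fields.
toy_release_unchecked() models a release that always clears the flag,
and toy_release_checked() models a release that first asks whether the
device is itself a bond master.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bits; the real values live in the kernel headers. */
#define TOY_IFF_BONDING	(1u << 0)	/* bond master or bond slave */
#define TOY_IFF_MASTER	(1u << 1)	/* device is itself a master */

struct toy_dev {
	unsigned int flags;
};

static bool toy_is_bond_master(const struct toy_dev *dev)
{
	assert(dev != NULL);
	return (dev->flags & TOY_IFF_MASTER) && (dev->flags & TOY_IFF_BONDING);
}

/* Enslaving marks the device as part of a bond. */
static void toy_enslave(struct toy_dev *slave)
{
	assert(slave != NULL);
	slave->flags |= TOY_IFF_BONDING;
}

/* Release without a type check: the flag is always cleared. */
static void toy_release_unchecked(struct toy_dev *slave)
{
	assert(slave != NULL);
	slave->flags &= ~TOY_IFF_BONDING;
}

/* Release with a type check: a bond master keeps its flag. */
static void toy_release_checked(struct toy_dev *slave)
{
	if (!toy_is_bond_master(slave))
		slave->flags &= ~TOY_IFF_BONDING;
}
```

With a device that starts out as a bond master (both bits set), an
enslave followed by toy_release_unchecked() leaves it without
TOY_IFF_BONDING, while toy_release_checked() leaves the bit in place. A
plain device loses the bit in both cases.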
This patch adds an interface type check before clearing the IFF_BONDING
flag.

Test commands:
    ip link add bond0 type bond
    ip link add bond1 type bond
    ip link set bond1 master bond0
    ip link set bond1 nomaster
    ip link del bond1 type bond
    ip link add bond1 type bond

Splat looks like:
[ 149.201107] proc_dir_entry 'bonding/bond1' already registered
[ 149.208013] WARNING: CPU: 1 PID: 1308 at fs/proc/generic.c:361 proc_register+0x2a9/0x3e0
[ 149.208866] Modules linked in: bonding veth openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv4 ip_tables6
[ 149.208866] CPU: 1 PID: 1308 Comm: ip Not tainted 5.3.0-rc7+ #322
[ 149.208866] RIP: 0010:proc_register+0x2a9/0x3e0
[ 149.208866] Code: 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 39 01 00 00 48 8b 04 24 48 89 ea 48 c7 c7 a0 a0 13 89 48 8b b0 0
[ 149.208866] RSP: 0018:ffff88810df9f098 EFLAGS: 00010286
[ 149.208866] RAX: dffffc0000000008 RBX: ffff8880b5d3aa50 RCX: ffffffff87cdec92
[ 149.208866] RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffff888116bf6a8c
[ 149.208866] RBP: ffff8880b5d3acd3 R08: ffffed1022d7ff71 R09: ffffed1022d7ff71
[ 149.208866] R10: 0000000000000001 R11: ffffed1022d7ff70 R12: ffff8880b5d3abe8
[ 149.208866] R13: ffff8880b5d3acd2 R14: dffffc0000000000 R15: ffffed1016ba759a
[ 149.208866] FS: 00007f4bd1f650c0(0000) GS:ffff888116a00000(0000) knlGS:0000000000000000
[ 149.208866] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 149.208866] CR2: 000055e7ca686118 CR3: 0000000106fd4000 CR4: 00000000001006e0
[ 149.208866] Call Trace:
[ 149.208866] proc_create_seq_private+0xb3/0xf0
[ 149.208866] bond_create_proc_entry+0x1b3/0x3f0 [bonding]
[ 149.208866] bond_netdev_event+0x433/0x970 [bonding]
[ 149.208866] ? __module_text_address+0x13/0x140
[ 149.208866] notifier_call_chain+0x90/0x160
[ 149.208866] register_netdevice+0x9b3/0xd70
[ 149.208866] ? alloc_netdev_mqs+0x854/0xc10
[ 149.208866] ? netdev_change_features+0xa0/0xa0
[ 149.208866] ? rtnl_create_link+0x2ed/0xad0
[ 149.208866] bond_newlink+0x2a/0x60 [bonding]
[ 149.208866] __rtnl_newlink+0xb75/0x1180
[ ... ]

Fixes: 0b680e753724 ("[PATCH] bonding: Add priv_flag to avoid event mishandling")
Signed-off-by: Taehee Yoo
---

v1 -> v2: do not add a new priv_flag.

 drivers/net/bonding/bond_main.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 931d9d935686..0db12fcfc953 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -1816,7 +1816,8 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev,
 	slave_disable_netpoll(new_slave);
 
 err_close:
-	slave_dev->priv_flags &= ~IFF_BONDING;
+	if (!netif_is_bond_master(slave_dev))
+		slave_dev->priv_flags &= ~IFF_BONDING;
 	dev_close(slave_dev);
 
 err_restore_mac:
@@ -2017,7 +2018,8 @@ static int __bond_release_one(struct net_device *bond_dev,
 	else
 		dev_set_mtu(slave_dev, slave->original_mtu);
 
-	slave_dev->priv_flags &= ~IFF_BONDING;
+	if (!netif_is_bond_master(slave_dev))
+		slave_dev->priv_flags &= ~IFF_BONDING;
 
 	bond_free_slave(slave);
 

From patchwork Sat Sep 7 13:46:13 2019
X-Patchwork-Submitter: Taehee Yoo
X-Patchwork-Id: 1159325
X-Patchwork-Delegate: davem@davemloft.net
From: Taehee Yoo
To: davem@davemloft.net, netdev@vger.kernel.org, j.vosburgh@gmail.com,
 vfalico@gmail.com, andy@greyhouse.net, jiri@resnulli.us,
 sd@queasysnail.net, roopa@cumulusnetworks.com, saeedm@mellanox.com,
 manishc@marvell.com, rahulv@marvell.com, kys@microsoft.com,
 haiyangz@microsoft.com, sthemmin@microsoft.com, sashal@kernel.org,
 hare@suse.de, varun@chelsio.com, ubraun@linux.ibm.com,
 kgraul@linux.ibm.com, jay.vosburgh@canonical.com
Cc: ap420073@gmail.com
Subject: [PATCH net v2 04/11] bonding: use dynamic lockdep key instead of subclass
Date: Sat, 7 Sep 2019 22:46:13 +0900
Message-Id: <20190907134613.32230-1-ap420073@gmail.com>

All bonding devices share the same lockdep key, and the subclass is
initialized with nest_level. But the actual nest_level value can change
when a lower device is attached. At that point the subclass should be
updated, but doing so seems to be unsafe. So this patch makes bonding
use a dynamic lockdep key instead of the subclass.
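
The reasoning above can be illustrated with a toy userspace model. It is
a deliberately crude sketch and not how lockdep is implemented: the
toy_* names are hypothetical, a lock class is reduced to a (key address,
subclass) pair, and "looks recursive" only means that two locks fall
into the same class.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a lock class key; only its address matters. */
struct toy_key {
	char unused;
};

struct toy_lock {
	const struct toy_key *key;
	unsigned int subclass;
};

/* Two locks share a class when key and subclass both match. */
static bool toy_same_class(const struct toy_lock *a, const struct toy_lock *b)
{
	assert(a != NULL && b != NULL);
	return a->key == b->key && a->subclass == b->subclass;
}

/* Toy validator: taking inner while holding outer looks recursive
 * if both locks belong to the same class.
 */
static bool toy_looks_recursive(const struct toy_lock *outer,
				const struct toy_lock *inner)
{
	return toy_same_class(outer, inner);
}

struct toy_bond {
	struct toy_key own_key;		/* per-device key (dynamic scheme) */
	struct toy_lock stats_lock;
};

static struct toy_key toy_shared_key;	/* one key shared by all devices */

/* Static scheme: shared key, subclass frozen at init time. */
static void toy_init_static(struct toy_bond *b, unsigned int nest_level_at_init)
{
	assert(b != NULL);
	b->stats_lock.key = &toy_shared_key;
	b->stats_lock.subclass = nest_level_at_init;
}

/* Dynamic scheme: every device gets a class of its own. */
static void toy_init_dynamic(struct toy_bond *b)
{
	assert(b != NULL);
	b->stats_lock.key = &b->own_key;
	b->stats_lock.subclass = 0;
}
```

In this model, two devices initialized with the same nest level under
the static scheme end up in one class, so nesting their locks after the
devices are stacked looks recursive. Under the dynamic scheme their
classes always differ, whatever the stacking order.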
Test commands:
    ip link add bond0 type bond
    for i in {1..5}
    do
        let A=$i-1
        ip link add bond$i type bond
        ip link set bond$i master bond$A
    done
    ip link set bond5 master bond0

Splat looks like: [ 327.477830] ============================================ [ 327.477830] WARNING: possible recursive locking detected [ 327.477830] 5.3.0-rc7+ #322 Not tainted [ 327.477830] -------------------------------------------- [ 327.477830] ip/1399 is trying to acquire lock: [ 327.477830] 00000000f604be63 (&(&bond->stats_lock)->rlock#2/2){+.+.}, at: bond_get_stats+0xb8/0x500 [bonding] [ 327.477830] [ 327.477830] but task is already holding lock: [ 327.477830] 00000000e9d31238 (&(&bond->stats_lock)->rlock#2/2){+.+.}, at: bond_get_stats+0xb8/0x500 [bonding] [ 327.477830] [ 327.477830] other info that might help us debug this: [ 327.477830] Possible unsafe locking scenario: [ 327.477830] [ 327.477830] CPU0 [ 327.477830] ---- [ 327.477830] lock(&(&bond->stats_lock)->rlock#2/2); [ 327.477830] lock(&(&bond->stats_lock)->rlock#2/2); [ 327.477830] [ 327.477830] *** DEADLOCK *** [ 327.477830] [ 327.477830] May be due to missing lock nesting notation [ 327.477830] [ 327.477830] 3 locks held by ip/1399: [ 327.477830] #0: 00000000a762c4e3 (rtnl_mutex){+.+.}, at: rtnetlink_rcv_msg+0x466/0x8a0 [ 327.477830] #1: 00000000e9d31238 (&(&bond->stats_lock)->rlock#2/2){+.+.}, at: bond_get_stats+0xb8/0x500 [bonding] [ 327.477830] #2: 000000008f7ebff4 (rcu_read_lock){....}, at: bond_get_stats+0x9f/0x500 [bonding] [ 327.477830] [ 327.477830] stack backtrace: [ 327.477830] CPU: 0 PID: 1399 Comm: ip Not tainted 5.3.0-rc7+ #322 [ 327.477830] Call Trace: [ 327.477830] dump_stack+0x7c/0xbb [ 327.477830] __lock_acquire+0x26a9/0x3de0 [ 327.477830] ? __change_page_attr_set_clr+0x133b/0x1d20 [ 327.477830] ? register_lock_class+0x14d0/0x14d0 [ 327.477830] lock_acquire+0x164/0x3b0 [ 327.477830] ? bond_get_stats+0xb8/0x500 [bonding] [ 327.666914] _raw_spin_lock_nested+0x2e/0x60 [ 327.666914] ?
bond_get_stats+0xb8/0x500 [bonding] [ 327.678302] bond_get_stats+0xb8/0x500 [bonding] [ 327.678302] ? bond_arp_rcv+0xf10/0xf10 [bonding] [ 327.678302] ? register_lock_class+0x14d0/0x14d0 [ 327.678302] ? bond_get_stats+0xb8/0x500 [bonding] [ 327.678302] dev_get_stats+0x1ec/0x270 [ 327.678302] bond_get_stats+0x1d1/0x500 [bonding] [ 327.678302] ? lock_acquire+0x164/0x3b0 [ 327.678302] ? bond_arp_rcv+0xf10/0xf10 [bonding] [ 327.678302] ? rtnl_is_locked+0x16/0x30 [ 327.678302] ? devlink_compat_switch_id_get+0x18/0x140 [ 327.678302] ? dev_get_alias+0xe2/0x190 [ 327.731145] ? dev_get_port_parent_id+0x12a/0x340 [ 327.731145] ? rtnl_phys_switch_id_fill+0x88/0xe0 [ 327.731145] dev_get_stats+0x1ec/0x270 [ 327.731145] rtnl_fill_stats+0x44/0xbe0 [ 327.731145] ? nla_put+0xc2/0x140 [ ... ] Fixes: d3fff6c443fe ("net: add netdev_lockdep_set_classes() helper") Signed-off-by: Taehee Yoo --- v1 -> v2 : this patch isn't changed drivers/net/bonding/bond_main.c | 61 ++++++++++++++++++++++++++++++--- include/net/bonding.h | 3 ++ 2 files changed, 59 insertions(+), 5 deletions(-) diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index 0db12fcfc953..7f574e74ed78 100644 --- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -1857,6 +1857,32 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev, return res; } +static void bond_dev_set_lockdep_one(struct net_device *dev, + struct netdev_queue *txq, + void *_unused) +{ + struct bonding *bond = netdev_priv(dev); + + lockdep_set_class(&txq->_xmit_lock, &bond->xmit_lock_key); +} + +static void bond_update_lock_key(struct net_device *dev) +{ + struct bonding *bond = netdev_priv(dev); + + lockdep_unregister_key(&bond->stats_lock_key); + lockdep_unregister_key(&bond->addr_lock_key); + lockdep_unregister_key(&bond->xmit_lock_key); + + lockdep_register_key(&bond->stats_lock_key); + lockdep_register_key(&bond->addr_lock_key); + lockdep_register_key(&bond->xmit_lock_key); + + 
lockdep_set_class(&bond->stats_lock, &bond->stats_lock_key); + lockdep_set_class(&dev->addr_list_lock, &bond->addr_lock_key); + netdev_for_each_tx_queue(dev, bond_dev_set_lockdep_one, NULL); +} + /* Try to release the slave device from the bond device * It is legal to access curr_active_slave without a lock because all the function * is RTNL-locked. If "all" is true it means that the function is being called @@ -2022,6 +2048,8 @@ static int __bond_release_one(struct net_device *bond_dev, slave_dev->priv_flags &= ~IFF_BONDING; bond_free_slave(slave); + if (netif_is_bond_master(slave_dev)) + bond_update_lock_key(slave_dev); return 0; } @@ -3459,7 +3487,7 @@ static void bond_get_stats(struct net_device *bond_dev, struct list_head *iter; struct slave *slave; - spin_lock_nested(&bond->stats_lock, bond_get_nest_level(bond_dev)); + spin_lock(&bond->stats_lock); memcpy(stats, &bond->bond_stats, sizeof(*stats)); rcu_read_lock(); @@ -4297,8 +4325,6 @@ void bond_setup(struct net_device *bond_dev) { struct bonding *bond = netdev_priv(bond_dev); - spin_lock_init(&bond->mode_lock); - spin_lock_init(&bond->stats_lock); bond->params = bonding_defaults; /* Initialize pointers */ @@ -4367,6 +4393,9 @@ static void bond_uninit(struct net_device *bond_dev) list_del(&bond->bond_list); + lockdep_unregister_key(&bond->stats_lock_key); + lockdep_unregister_key(&bond->addr_lock_key); + lockdep_unregister_key(&bond->xmit_lock_key); bond_debug_unregister(bond); } @@ -4758,6 +4787,29 @@ static int bond_check_params(struct bond_params *params) return 0; } +static struct lock_class_key qdisc_tx_busylock_key; +static struct lock_class_key qdisc_running_key; + +static void bond_dev_set_lockdep_class(struct net_device *dev) +{ + struct bonding *bond = netdev_priv(dev); + + dev->qdisc_tx_busylock = &qdisc_tx_busylock_key; + dev->qdisc_running_key = &qdisc_running_key; + + spin_lock_init(&bond->mode_lock); + + spin_lock_init(&bond->stats_lock); + lockdep_register_key(&bond->stats_lock_key); + 
lockdep_set_class(&bond->stats_lock, &bond->stats_lock_key); + + lockdep_register_key(&bond->addr_lock_key); + lockdep_set_class(&dev->addr_list_lock, &bond->addr_lock_key); + + lockdep_register_key(&bond->xmit_lock_key); + netdev_for_each_tx_queue(dev, bond_dev_set_lockdep_one, NULL); +} + /* Called from registration process */ static int bond_init(struct net_device *bond_dev) { @@ -4771,8 +4823,7 @@ static int bond_init(struct net_device *bond_dev) return -ENOMEM; bond->nest_level = SINGLE_DEPTH_NESTING; - netdev_lockdep_set_classes(bond_dev); - + bond_dev_set_lockdep_class(bond_dev); list_add_tail(&bond->bond_list, &bn->dev_list); bond_prepare_sysfs_group(bond); diff --git a/include/net/bonding.h b/include/net/bonding.h index f7fe45689142..c39ac7061e41 100644 --- a/include/net/bonding.h +++ b/include/net/bonding.h @@ -239,6 +239,9 @@ struct bonding { struct dentry *debug_dir; #endif /* CONFIG_DEBUG_FS */ struct rtnl_link_stats64 bond_stats; + struct lock_class_key stats_lock_key; + struct lock_class_key xmit_lock_key; + struct lock_class_key addr_lock_key; }; #define bond_slave_get_rcu(dev) \ From patchwork Sat Sep 7 13:46:31 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taehee Yoo X-Patchwork-Id: 1159326 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="kd1h5PF+"; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 
46QbLq6RFrz9sNf for ; Sat, 7 Sep 2019 23:46:43 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2394638AbfIGNqm (ORCPT ); Sat, 7 Sep 2019 09:46:42 -0400 Received: from mail-pl1-f194.google.com ([209.85.214.194]:33647 "EHLO mail-pl1-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2394633AbfIGNqm (ORCPT ); Sat, 7 Sep 2019 09:46:42 -0400 Received: by mail-pl1-f194.google.com with SMTP id t11so4533460plo.0 for ; Sat, 07 Sep 2019 06:46:41 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id; bh=Ez+JQwAKzEE02AdYPD2+QO7ui6wsJKK0ozfhm0sDP5o=; b=kd1h5PF+AlblPc47/v37Cs4/a4EQACdJGnkG7TMmbpX2LsM0UFNW3x9N1FwfAkNieZ zHZ10EHHJjJqweScHDsHSdp4HbvxidLLfHUZwa+f5UXJN24UgkHWOlgBj0F0jJ1np1eL vW66PPXo9D4n3yTEgvHSt0YpR/GQILUFMVIsb+uZF5gOXkgsgrdzXSeu3U3QeRbqJFxB mQ67O5tzdkVsBqHUTzKcXOKHqSFYwq9i/TXkRWP01vy4Uff+yyF1pFgHaGxc2ElYxSJT Q9H9avbNjFVxNoAKHZhDN1780cIWkEYYzKVvBP4+HxqtLxGE2/PCnyJkYnlduzP7aO43 Im4g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id; bh=Ez+JQwAKzEE02AdYPD2+QO7ui6wsJKK0ozfhm0sDP5o=; b=WvPp1LGrBe8+WwiAaYE3hzeYBVz3xljh7qlJepuFzoLTo6PXLpLrSXT2mrKVUNWI6v UQSoXlUjbwZcqguyxguhzH2REshkp5m9/bY27r/Wnym4hbQpw7LI1BGbT1HwJzRjRQlZ RsUIyS3IsQFuYHfcUjYEsIJZuDgWc6GRpFv+8pLMTqLeTnOSQbc9tdneo2KRQoZidbMp pVIedsl/qpH8dvLsAAet3nDOA33q99rRZgjHtqHHOlghc4J4CqxvJmsZ38afUR4VN0hT bQHdy1+ijpKd6R8HKopM9LHFGMYTRdjwlLEK4YXs6+y5rSo883JBsked2nuGL5SB3y3O c5UQ== X-Gm-Message-State: APjAAAVy0yOgiULAYkXn0rKKg+oLf8wKaeaFrDfpt/BtHnM3VyqT+fdL 0s2l+E4Wk0N5LpKhT6TnsUNC5XQZUqY= X-Google-Smtp-Source: APXvYqwWMLISQL1GlbJwSq3HhjxubRMxpAh3T0hB6XUS0prVbMEwPrETNumcvzQZc4Bl2IUSVIjSjQ== X-Received: by 2002:a17:902:26f:: with SMTP id 102mr14996787plc.189.1567864001077; Sat, 07 Sep 2019 06:46:41 -0700 (PDT) Received: from ap-To-be-filled-by-O-E-M.8.8.8.8 ([14.33.120.60]) by smtp.gmail.com 
with ESMTPSA id g2sm10187147pfm.32.2019.09.07.06.46.35 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 07 Sep 2019 06:46:40 -0700 (PDT) From: Taehee Yoo To: davem@davemloft.net, netdev@vger.kernel.org, j.vosburgh@gmail.com, vfalico@gmail.com, andy@greyhouse.net, jiri@resnulli.us, sd@queasysnail.net, roopa@cumulusnetworks.com, saeedm@mellanox.com, manishc@marvell.com, rahulv@marvell.com, kys@microsoft.com, haiyangz@microsoft.com, sthemmin@microsoft.com, sashal@kernel.org, hare@suse.de, varun@chelsio.com, ubraun@linux.ibm.com, kgraul@linux.ibm.com, jay.vosburgh@canonical.com Cc: ap420073@gmail.com Subject: [PATCH net v2 05/11] team: use dynamic lockdep key instead of static key Date: Sat, 7 Sep 2019 22:46:31 +0900 Message-Id: <20190907134631.32325-1-ap420073@gmail.com> X-Mailer: git-send-email 2.17.1 Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org

In the current code, all team devices share the same static lockdep key. Team devices can be nested, so the shared key triggers an unnecessary lockdep warning. This patch makes each team device use its own dynamic lockdep key instead.
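A rough way to see why nesting depth matters is to walk a chain of stacked devices while keeping every lock on the path held. The sketch below is a user-space toy with hypothetical chain_* names, not kernel code. It assumes that the top-level acquisition uses subclass 0 and every nested one uses subclass 1, which mirrors the "&dev_addr_list_lock_key" and "&dev_addr_list_lock_key/1" entries in the splat for this patch, and it counts every collision instead of stopping at the first report.

```c
#include <assert.h>
#include <stddef.h>

#define CHAIN_MAX_DEPTH 16

/* Toy stand-in for struct lock_class_key: only its address matters. */
struct chain_key {
	char pad;
};

struct chain_dev {
	struct chain_key own_key;	/* per-device key (new scheme) */
	const struct chain_key *key;	/* key attached to the lock */
};

/* One entry of the toy held-lock stack. */
struct chain_held {
	const struct chain_key *key;
	unsigned int subclass;
};

/* Walk n stacked devices from top to bottom, holding each device's
 * lock while taking the next one. Returns how many acquisitions match
 * a (key, subclass) pair that is already held.
 */
static int chain_walk(const struct chain_dev *devs, int n)
{
	struct chain_held held[CHAIN_MAX_DEPTH];
	int warnings = 0;

	for (int d = 0; d < n && d < CHAIN_MAX_DEPTH; d++) {
		/* assumption: subclass 0 at the top, 1 for any nested lock */
		unsigned int subclass = d ? 1 : 0;

		for (int i = 0; i < d; i++) {
			if (held[i].key == devs[d].key &&
			    held[i].subclass == subclass) {
				warnings++;
				break;
			}
		}
		held[d].key = devs[d].key;
		held[d].subclass = subclass;
	}
	return warnings;
}

/* Build a chain of n devices using either one shared static key or a
 * key per device, then count the collisions seen while walking it.
 */
static int chain_warnings(int n, int per_device)
{
	static struct chain_key shared_key;
	struct chain_dev devs[CHAIN_MAX_DEPTH];

	if (n > CHAIN_MAX_DEPTH)
		n = CHAIN_MAX_DEPTH;
	for (int i = 0; i < n; i++)
		devs[i].key = per_device ? &devs[i].own_key : &shared_key;
	return chain_walk(devs, n);
}
```

With one shared key the toy stays quiet for two levels and starts colliding from the third level on, while per-device keys never collide at any depth. Real lockdep behaves differently in detail, so this is only meant to illustrate why a single static key cannot describe an arbitrarily deep stack.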
Test commands:
    ip link add team0 type team
    for i in {1..7}
    do
        let A=$i-1
        ip link add team$i type team
        ip link set team$i master team$A
    done
    ip link del team0

Splat looks like: [ 137.406730] ============================================ [ 137.412685] WARNING: possible recursive locking detected [ 137.418642] 5.3.0-rc7+ #322 Not tainted [ 137.422941] -------------------------------------------- [ 137.428886] ip/1383 is trying to acquire lock: [ 137.433869] 0000000089571080 (&dev_addr_list_lock_key/1){+...}, at: dev_uc_sync_multiple+0xfa/0x1a0 [ 137.444034] [ 137.444034] but task is already holding lock: [ 137.450572] 00000000d9597252 (&dev_addr_list_lock_key/1){+...}, at: dev_uc_unsync+0x10c/0x1b0 [ 137.460142] [ 137.460142] other info that might help us debug this: [ 137.467458] Possible unsafe locking scenario: [ 137.467458] [ 137.474096] CPU0 [ 137.476828] ---- [ 137.479569] lock(&dev_addr_list_lock_key/1); [ 137.484554] lock(&dev_addr_list_lock_key/1); [ 137.489539] [ 137.489539] *** DEADLOCK *** [ 137.489539] [ 137.496178] May be due to missing lock nesting notation [ 137.496178] [ 137.503789] 5 locks held by ip/1383: [ 137.507797] #0: 00000000d497f415 (rtnl_mutex){+.+.}, at: rtnetlink_rcv_msg+0x466/0x8a0 [ 137.516786] #1: 000000008e4b4656 (&team->lock){+.+.}, at: team_uninit+0x3a/0x1a0 [team] [ 137.525882] #2: 000000005cf248d1 (&dev_addr_list_lock_key){+...}, at: dev_uc_unsync+0x98/0x1b0 [ 137.535649] #3: 00000000d9597252 (&dev_addr_list_lock_key/1){+...}, at: dev_uc_unsync+0x10c/0x1b0 [ 137.545709] #4: 00000000bec134c3 (rcu_read_lock){....}, at: team_set_rx_mode+0x5/0x1d0 [team] [ 137.555384] [ 137.555384] stack backtrace: [ 137.560277] CPU: 0 PID: 1383 Comm: ip Not tainted 5.3.0-rc7+ #322 [ 137.577826] Call Trace: [ 137.580586] dump_stack+0x7c/0xbb [ 137.584307] __lock_acquire+0x26a9/0x3de0 [ 137.588820] ? register_lock_class+0x14d0/0x14d0 [ 137.594008] ? register_lock_class+0x14d0/0x14d0 [ 137.599194] lock_acquire+0x164/0x3b0 [ 137.603310] ?
dev_uc_sync_multiple+0xfa/0x1a0 [ 137.608307] _raw_spin_lock_nested+0x2e/0x60 [ 137.613105] ? dev_uc_sync_multiple+0xfa/0x1a0 [ 137.618095] dev_uc_sync_multiple+0xfa/0x1a0 [ 137.622900] team_set_rx_mode+0xa9/0x1d0 [team] [ 137.627993] dev_uc_unsync+0x151/0x1b0 [ 137.632205] team_port_del+0x304/0x790 [team] [ 137.637110] team_uninit+0xb0/0x1a0 [team] [ 137.641717] rollback_registered_many+0x728/0xda0 [ 137.647005] ? generic_xdp_install+0x310/0x310 [ 137.651994] ? __set_pages_p+0xf4/0x150 [ 137.656306] ? check_chain_key+0x236/0x5d0 [ 137.660914] ? __nla_validate_parse+0x98/0x1ad0 [ 137.666006] unregister_netdevice_many.part.120+0x13/0x1b0 [ 137.672167] rtnl_delete_link+0xbc/0x100 [ 137.676575] ? rtnl_af_register+0xc0/0xc0 [ 137.681084] rtnl_dellink+0x2e7/0x870 [ 137.685204] ? find_held_lock+0x39/0x1d0 [ ... ] Fixes: 3d249d4ca7d0 ("net: introduce ethernet teaming device") Signed-off-by: Taehee Yoo --- v1 -> v2 : this patch isn't changed drivers/net/team/team.c | 61 ++++++++++++++++++++++++++++++++++++++--- include/linux/if_team.h | 5 ++++ 2 files changed, 62 insertions(+), 4 deletions(-) diff --git a/drivers/net/team/team.c b/drivers/net/team/team.c index e8089def5a46..bfcd6ed57493 100644 --- a/drivers/net/team/team.c +++ b/drivers/net/team/team.c @@ -1607,6 +1607,34 @@ static const struct team_option team_options[] = { }, }; +static void team_dev_set_lockdep_one(struct net_device *dev, + struct netdev_queue *txq, + void *_unused) +{ + struct team *team = netdev_priv(dev); + + lockdep_set_class(&txq->_xmit_lock, &team->xmit_lock_key); +} + +static struct lock_class_key qdisc_tx_busylock_key; +static struct lock_class_key qdisc_running_key; + +static void team_dev_set_lockdep_class(struct net_device *dev) +{ + struct team *team = netdev_priv(dev); + + dev->qdisc_tx_busylock = &qdisc_tx_busylock_key; + dev->qdisc_running_key = &qdisc_running_key; + + lockdep_register_key(&team->team_lock_key); + __mutex_init(&team->lock, "team->team_lock_key", &team->team_lock_key); + + 
lockdep_register_key(&team->addr_lock_key); + lockdep_set_class(&dev->addr_list_lock, &team->addr_lock_key); + + lockdep_register_key(&team->xmit_lock_key); + netdev_for_each_tx_queue(dev, team_dev_set_lockdep_one, NULL); +} static int team_init(struct net_device *dev) { @@ -1615,7 +1643,6 @@ static int team_init(struct net_device *dev) int err; team->dev = dev; - mutex_init(&team->lock); team_set_no_mode(team); team->pcpu_stats = netdev_alloc_pcpu_stats(struct team_pcpu_stats); @@ -1642,7 +1669,7 @@ static int team_init(struct net_device *dev) goto err_options_register; netif_carrier_off(dev); - netdev_lockdep_set_classes(dev); + team_dev_set_lockdep_class(dev); return 0; @@ -1673,6 +1700,11 @@ static void team_uninit(struct net_device *dev) team_queue_override_fini(team); mutex_unlock(&team->lock); netdev_change_features(dev); + + lockdep_unregister_key(&team->team_lock_key); + lockdep_unregister_key(&team->addr_lock_key); + lockdep_unregister_key(&team->xmit_lock_key); + } static void team_destructor(struct net_device *dev) @@ -1967,6 +1999,23 @@ static int team_add_slave(struct net_device *dev, struct net_device *port_dev, return err; } +static void team_update_lock_key(struct net_device *dev) +{ + struct team *team = netdev_priv(dev); + + lockdep_unregister_key(&team->team_lock_key); + lockdep_unregister_key(&team->addr_lock_key); + lockdep_unregister_key(&team->xmit_lock_key); + + lockdep_register_key(&team->team_lock_key); + lockdep_register_key(&team->addr_lock_key); + lockdep_register_key(&team->xmit_lock_key); + + lockdep_set_class(&team->lock, &team->team_lock_key); + lockdep_set_class(&dev->addr_list_lock, &team->addr_lock_key); + netdev_for_each_tx_queue(dev, team_dev_set_lockdep_one, NULL); +} + static int team_del_slave(struct net_device *dev, struct net_device *port_dev) { struct team *team = netdev_priv(dev); @@ -1976,8 +2025,12 @@ static int team_del_slave(struct net_device *dev, struct net_device *port_dev) err = team_port_del(team, port_dev); 
mutex_unlock(&team->lock); - if (!err) - netdev_change_features(dev); + if (err) + return err; + + if (netif_is_team_master(port_dev)) + team_update_lock_key(port_dev); + netdev_change_features(dev); return err; } diff --git a/include/linux/if_team.h b/include/linux/if_team.h index 06faa066496f..9c97bb19ed34 100644 --- a/include/linux/if_team.h +++ b/include/linux/if_team.h @@ -223,6 +223,11 @@ struct team { atomic_t count_pending; struct delayed_work dw; } mcast_rejoin; + + struct lock_class_key team_lock_key; + struct lock_class_key xmit_lock_key; + struct lock_class_key addr_lock_key; + long mode_priv[TEAM_MODE_PRIV_LONGS]; }; From patchwork Sat Sep 7 13:46:43 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taehee Yoo X-Patchwork-Id: 1159327 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="CwMOtSxc"; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 46QbM40dpGz9sDB for ; Sat, 7 Sep 2019 23:46:56 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2394643AbfIGNqy (ORCPT ); Sat, 7 Sep 2019 09:46:54 -0400 Received: from mail-pf1-f194.google.com ([209.85.210.194]:36491 "EHLO mail-pf1-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1733278AbfIGNqy (ORCPT ); Sat, 7 Sep 2019 09:46:54 -0400 Received: by mail-pf1-f194.google.com with SMTP id y22so6417252pfr.3 for ; 
Sat, 07 Sep 2019 06:46:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id; bh=96pFkduzmJrdb3ENbrleY2Qc+hL1joupTUsDyzf+3Jg=; b=CwMOtSxczx2eMZCriVDekc1d3NEnKQdSLqodeqhfgy1eTZh0s64gUYMHLP7sR+h2pz PpMuUWucbND0muzS5EsRAr5KSOiU5CJYfrtrrHdI8j9c5fE3jH6Cs2AXyui5UhrmBYgo 5E/CCVknONawjItUGskD948PpGpdFbXUI35APW6Mx61Xvrygsv73vSEhD9IYveJZJ1ZF 4TZ6cSwNRNaVKt3PUQAMqeVgge04WGFOcyayxhM7+gzuTToFppM49PdrBkyP2Kj9PUkC ND4vMFMpV7GNZQcaARJjpAXm+2k6gJpWyLBUYLDk6R2JI3h3Jn8r8OqLYzRgZ9hmnRw2 JBKw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id; bh=96pFkduzmJrdb3ENbrleY2Qc+hL1joupTUsDyzf+3Jg=; b=V2MZu3kXND0+JbWT9EjmJB7v4XsE67wvajYE/q9CLPWeiTIkh1IZBxtdY/2oNU1buw euK3eyovyzFKBL6QvLFI06nBm0Z/vjS6vz2aDoswjjaWwsusgSse5mP5MknzFx0rH/fv 0J3PQSJhNYR4uGzQGzWZRiZ4WZ+P0TTjdws9FcjrP30g1MwwlswrvXFn84kPGX+MkZcY SUzfGXZUY0gOMXElDNJjoKUBOvy0Ii2oVy5vBYeNl6DGpEL92bjac0fIVvmTSuaDskxo a4aNsQ/ihIxCkJJOaE2i8icv2PI2yogGUdVrFtMMtLrq3vzp5snaFtv1hCvOwgm5HDQd ob0g== X-Gm-Message-State: APjAAAUp/lT+BSRjBSfmPQwprqGCEAXOGC4sgLaQ0fVGIvMgPUQUteHS fiQfqpwQ+Tp+7ek10PdFAiA= X-Google-Smtp-Source: APXvYqwe3Lf+391JBdNfU+D4t9wcUGURPXbmUM/tY1v9z8vsepty6hZXJVRP4M2+q03ZJC1z+44e4g== X-Received: by 2002:a63:9e54:: with SMTP id r20mr13005565pgo.64.1567864013423; Sat, 07 Sep 2019 06:46:53 -0700 (PDT) Received: from ap-To-be-filled-by-O-E-M.8.8.8.8 ([14.33.120.60]) by smtp.gmail.com with ESMTPSA id d14sm14069210pfh.36.2019.09.07.06.46.48 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 07 Sep 2019 06:46:52 -0700 (PDT) From: Taehee Yoo To: davem@davemloft.net, netdev@vger.kernel.org, j.vosburgh@gmail.com, vfalico@gmail.com, andy@greyhouse.net, jiri@resnulli.us, sd@queasysnail.net, roopa@cumulusnetworks.com, saeedm@mellanox.com, manishc@marvell.com, rahulv@marvell.com, kys@microsoft.com, haiyangz@microsoft.com, sthemmin@microsoft.com, 
sashal@kernel.org, hare@suse.de, varun@chelsio.com, ubraun@linux.ibm.com, kgraul@linux.ibm.com, jay.vosburgh@canonical.com Cc: ap420073@gmail.com Subject: [PATCH net v2 06/11] macsec: use dynamic lockdep key instead of subclass Date: Sat, 7 Sep 2019 22:46:43 +0900 Message-Id: <20190907134643.32500-1-ap420073@gmail.com> X-Mailer: git-send-email 2.17.1 Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org

All macsec devices share the same lockdep key, and the subclass is initialized with nest_level. But the actual nest_level value can change when a lower device is attached. At that point the subclass should be updated, but updating it seems to be unsafe. So this patch makes macsec use a dynamic lockdep key instead of the subclass.

Test commands:
    ip link add bond0 type bond
    ip link add dummy0 type dummy
    ip link add macsec0 link bond0 type macsec
    ip link add macsec1 link dummy0 type macsec
    ip link set bond0 mtu 1000
    ip link set macsec1 master bond0
    ip link set bond0 up
    ip link set macsec0 up
    ip link set dummy0 up
    ip link set macsec1 up

Splat looks like: [ 146.540123] ============================================ [ 146.540123] WARNING: possible recursive locking detected [ 146.540123] 5.3.0-rc7+ #322 Not tainted [ 146.540123] -------------------------------------------- [ 146.540123] ip/1340 is trying to acquire lock: [ 146.540123] 00000000446fd8bd (&macsec_netdev_addr_lock_key/1){+...}, at: dev_uc_sync_multiple+0xfa/0x1a0 [ 146.540123] [ 146.540123] but task is already holding lock: [ 146.540123] 00000000a9ab6378 (&macsec_netdev_addr_lock_key/1){+...}, at: dev_set_rx_mode+0x19/0x30 [ 146.540123] [ 146.540123] other info that might help us debug this: [ 146.540123] Possible unsafe locking scenario: [ 146.540123] [ 146.540123] CPU0 [ 146.540123] ---- [ 146.540123] lock(&macsec_netdev_addr_lock_key/1); [ 146.540123] lock(&macsec_netdev_addr_lock_key/1); [ 146.623155] [ 146.623155] *** DEADLOCK *** [ 146.623155] [ 146.623155] May be due to
missing lock nesting notation [ 146.623155] [ 146.623155] 4 locks held by ip/1340: [ 146.623155] #0: 0000000026436ef0 (rtnl_mutex){+.+.}, at: rtnetlink_rcv_msg+0x466/0x8a0 [ 146.623155] #1: 00000000a9ab6378 (&macsec_netdev_addr_lock_key/1){+...}, at: dev_set_rx_mode+0x19/0x30 [ 146.623155] #2: 00000000a8947dd0 (&dev_addr_list_lock_key/3){+...}, at: dev_mc_sync+0xfa/0x1a0 [ 146.623155] #3: 00000000b62011e9 (rcu_read_lock){....}, at: bond_set_rx_mode+0x5/0x3c0 [bonding] [ 146.674970] [ 146.674970] stack backtrace: [ 146.687145] CPU: 0 PID: 1340 Comm: ip Not tainted 5.3.0-rc7+ #322 [ 146.693024] Call Trace: [ 146.693024] dump_stack+0x7c/0xbb [ 146.693024] __lock_acquire+0x26a9/0x3de0 [ 146.693024] ? register_lock_class+0x14d0/0x14d0 [ 146.693024] ? register_lock_class+0x14d0/0x14d0 [ 146.693024] lock_acquire+0x164/0x3b0 [ 146.693024] ? dev_uc_sync_multiple+0xfa/0x1a0 [ 146.693024] _raw_spin_lock_nested+0x2e/0x60 [ 146.693024] ? dev_uc_sync_multiple+0xfa/0x1a0 [ 146.693024] dev_uc_sync_multiple+0xfa/0x1a0 [ 146.693024] bond_set_rx_mode+0x269/0x3c0 [bonding] [ 146.751163] ? bond_init+0x6f0/0x6f0 [bonding] [ 146.757006] ? do_raw_spin_trylock+0xa9/0x170 [ 146.757006] dev_mc_sync+0x15a/0x1a0 [ 146.757006] macsec_dev_set_rx_mode+0x3a/0x50 [macsec] [ 146.757006] dev_set_rx_mode+0x21/0x30 [ 146.757006] __dev_open+0x202/0x310 [ 146.757006] ? dev_set_rx_mode+0x30/0x30 [ 146.757006] ? mark_held_locks+0xa5/0xe0 [ 146.757006] ? __local_bh_enable_ip+0xe9/0x1b0 [ 146.757006] __dev_change_flags+0x3c3/0x500 [ 146.757006] ? dev_set_allmulti+0x10/0x10 [ 146.757006] ? sched_clock_local+0xd4/0x140 [ 146.757006] ? check_chain_key+0x236/0x5d0 [ 146.757006] dev_change_flags+0x7a/0x160 [ 146.757006] do_setlink+0xa26/0x2f20 [ 146.757006] ? sched_clock_local+0xd4/0x140 [ ... 
] Fixes: e20038724552 ("macsec: fix lockdep splats when nesting devices") Signed-off-by: Taehee Yoo --- v1 -> v2 : this patch isn't changed drivers/net/macsec.c | 37 ++++++++++++++++++++++++++++++++----- 1 file changed, 32 insertions(+), 5 deletions(-) diff --git a/drivers/net/macsec.c b/drivers/net/macsec.c index 8f46aa1ddec0..25a4fc88145d 100644 --- a/drivers/net/macsec.c +++ b/drivers/net/macsec.c @@ -267,6 +267,8 @@ struct macsec_dev { struct pcpu_secy_stats __percpu *stats; struct list_head secys; struct gro_cells gro_cells; + struct lock_class_key xmit_lock_key; + struct lock_class_key addr_lock_key; unsigned int nest_level; }; @@ -2749,7 +2751,32 @@ static netdev_tx_t macsec_start_xmit(struct sk_buff *skb, #define MACSEC_FEATURES \ (NETIF_F_SG | NETIF_F_HIGHDMA | NETIF_F_FRAGLIST) -static struct lock_class_key macsec_netdev_addr_lock_key; + +static void macsec_dev_set_lockdep_one(struct net_device *dev, + struct netdev_queue *txq, + void *_unused) +{ + struct macsec_dev *macsec = macsec_priv(dev); + + lockdep_set_class(&txq->_xmit_lock, &macsec->xmit_lock_key); +} + +static struct lock_class_key qdisc_tx_busylock_key; +static struct lock_class_key qdisc_running_key; + +static void macsec_dev_set_lockdep_class(struct net_device *dev) +{ + struct macsec_dev *macsec = macsec_priv(dev); + + dev->qdisc_tx_busylock = &qdisc_tx_busylock_key; + dev->qdisc_running_key = &qdisc_running_key; + + lockdep_register_key(&macsec->addr_lock_key); + lockdep_set_class(&dev->addr_list_lock, &macsec->addr_lock_key); + + lockdep_register_key(&macsec->xmit_lock_key); + netdev_for_each_tx_queue(dev, macsec_dev_set_lockdep_one, NULL); +} static int macsec_dev_init(struct net_device *dev) { @@ -2780,6 +2807,7 @@ static int macsec_dev_init(struct net_device *dev) if (is_zero_ether_addr(dev->broadcast)) memcpy(dev->broadcast, real_dev->broadcast, dev->addr_len); + macsec_dev_set_lockdep_class(dev); return 0; } @@ -2789,6 +2817,9 @@ static void macsec_dev_uninit(struct net_device *dev) 
gro_cells_destroy(&macsec->gro_cells); free_percpu(dev->tstats); + + lockdep_unregister_key(&macsec->addr_lock_key); + lockdep_unregister_key(&macsec->xmit_lock_key); } static netdev_features_t macsec_fix_features(struct net_device *dev, @@ -3263,10 +3294,6 @@ static int macsec_newlink(struct net *net, struct net_device *dev, dev_hold(real_dev); macsec->nest_level = dev_get_nest_level(real_dev) + 1; - netdev_lockdep_set_classes(dev); - lockdep_set_class_and_subclass(&dev->addr_list_lock, - &macsec_netdev_addr_lock_key, - macsec_get_nest_level(dev)); err = netdev_upper_dev_link(real_dev, dev, extack); if (err < 0) From patchwork Sat Sep 7 13:46:55 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taehee Yoo X-Patchwork-Id: 1159328 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="LDS5+OMr"; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 46QbML2QYHz9sDB for ; Sat, 7 Sep 2019 23:47:10 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2394648AbfIGNrI (ORCPT ); Sat, 7 Sep 2019 09:47:08 -0400 Received: from mail-pg1-f196.google.com ([209.85.215.196]:45080 "EHLO mail-pg1-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1733278AbfIGNrI (ORCPT ); Sat, 7 Sep 2019 09:47:08 -0400 Received: by mail-pg1-f196.google.com with SMTP id 4so5118035pgm.12 for ; Sat, 07 Sep 2019 
06:47:08 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id; bh=B7E9bBHZyVVv5TtX3JVfbSqlSS3ZRozPxNl9tmfA3/U=; b=LDS5+OMrJ4D5zDWS8wmzPM7QM0Y7amqoaQxygQem2UhcICqLPwxeR4oKUfRbMGpAnn pbtgW9xOOGQXrSjK9rrYopqJ65mwuDcalT04LZMmkJXoborJ9mKAI7OcJCnPGQGLxmNE XZqUszdlGKOaOYBZZ8ZP8rOXpovNFnpqzD9B1NrUr97EhQWzaVHoWLYlWyzbycCfolZg TnI+yE+xx0+/+UEBYBSyTgTLkoUsyEoJT9nQPlGG3X9d6h6503hxFQ2WANAtkogBAuIw wz0gepjAV7FEbjY+y5VE5Zl6FImO8hcGuy6Pka0jnlFanXE1HlQjLH5YHlw5FkrL4D3R Uj4w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id; bh=B7E9bBHZyVVv5TtX3JVfbSqlSS3ZRozPxNl9tmfA3/U=; b=FoPvGsQm3hH7zbySxjAFlK7eSZFjNjhpfi/1GDVwnRmU3eTconYDSkOGjMOUh9gABO htERW3uCHd4Q/xzP1OuxI1uKGZ5YvUf5LHuaidXlUUfIGKAeR7B1ujDXrFGVMi7TcTld sKrn+im+iJTAApjrwcnHDm18dfAFZE47gGMICfkVIszYiQF/vFshNdeIehNXEbpXQU4T dGF/LsiH71C8i5sfmfQ97b0s9+Fzj/tRVEKNijUn0t2pUkpnW2ZGHOv7GztRGr/attpu ZV4Tir2SboiW2UP8kR0g/ELXgi/INLhQJek8nE4QnsUQN3vxFILyicWUJbpyCwDtE0HB FXVQ== X-Gm-Message-State: APjAAAWMjHyZ3sf2xMdJpHMU78CVIAFS3uNaNRg3vTv7balbse3/Gwp4 PuhyFDxveDO4ixfoGf9bBoE= X-Google-Smtp-Source: APXvYqzqVSRIzmW/Pk15fq1CIGHJ3th4kP+pXzwNa4k45PVEHURt9PcwvWlQDpozpPHOCnNIngqaPw== X-Received: by 2002:a62:87c8:: with SMTP id i191mr16929000pfe.133.1567864027573; Sat, 07 Sep 2019 06:47:07 -0700 (PDT) Received: from ap-To-be-filled-by-O-E-M.8.8.8.8 ([14.33.120.60]) by smtp.gmail.com with ESMTPSA id s97sm5025446pjc.4.2019.09.07.06.47.02 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 07 Sep 2019 06:47:06 -0700 (PDT) From: Taehee Yoo To: davem@davemloft.net, netdev@vger.kernel.org, j.vosburgh@gmail.com, vfalico@gmail.com, andy@greyhouse.net, jiri@resnulli.us, sd@queasysnail.net, roopa@cumulusnetworks.com, saeedm@mellanox.com, manishc@marvell.com, rahulv@marvell.com, kys@microsoft.com, haiyangz@microsoft.com, sthemmin@microsoft.com, sashal@kernel.org, hare@suse.de, 
varun@chelsio.com, ubraun@linux.ibm.com, kgraul@linux.ibm.com, jay.vosburgh@canonical.com Cc: ap420073@gmail.com Subject: [PATCH net v2 07/11] macvlan: use dynamic lockdep key instead of subclass Date: Sat, 7 Sep 2019 22:46:55 +0900 Message-Id: <20190907134655.32639-1-ap420073@gmail.com> X-Mailer: git-send-email 2.17.1 Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org

All macvlan devices share the same lockdep key, and the subclass is initialized with nest_level. But the actual nest_level value can change when a lower device is attached. At that point the subclass should be updated, but updating it seems to be unsafe. So this patch makes macvlan use a dynamic lockdep key instead of the subclass.

Test commands:
    ip link add bond0 type bond
    ip link add dummy0 type dummy
    ip link add macvlan0 link bond0 type macvlan mode bridge
    ip link add macvlan1 link dummy0 type macvlan mode bridge
    ip link set bond0 mtu 1000
    ip link set macvlan1 master bond0
    ip link set bond0 up
    ip link set macvlan0 up
    ip link set dummy0 up
    ip link set macvlan1 up

Splat looks like: [ 165.677603] ============================================ [ 165.679642] WARNING: possible recursive locking detected [ 165.679642] 5.3.0-rc7+ #322 Not tainted [ 165.679642] -------------------------------------------- [ 165.679642] ip/1812 is trying to acquire lock: [ 165.679642] 00000000ae6a8a03 (&macvlan_netdev_addr_lock_key/1){+...}, at: dev_uc_sync_multiple+0xfa/0x1a0 [ 165.679642] [ 165.679642] but task is already holding lock: [ 165.679642] 00000000cec5da0b (&macvlan_netdev_addr_lock_key/1){+...}, at: dev_set_rx_mode+0x19/0x30 [ 165.679642] [ 165.679642] other info that might help us debug this: [ 165.679642] Possible unsafe locking scenario: [ 165.679642] [ 165.679642] CPU0 [ 165.679642] ---- [ 165.679642] lock(&macvlan_netdev_addr_lock_key/1); [ 165.679642] lock(&macvlan_netdev_addr_lock_key/1); [ 165.679642] [ 165.679642] *** DEADLOCK *** [ 165.679642] [ 165.679642] May be due to
missing lock nesting notation [ 165.679642] [ 165.679642] 4 locks held by ip/1812: [ 165.679642] #0: 0000000088d10bd8 (rtnl_mutex){+.+.}, at: rtnetlink_rcv_msg+0x466/0x8a0 [ 165.679642] #1: 00000000cec5da0b (&macvlan_netdev_addr_lock_key/1){+...}, at: dev_set_rx_mode+0x19/0x30 [ 165.679642] #2: 000000000ca6fdb5 (&dev_addr_list_lock_key/3){+...}, at: dev_uc_sync+0xfa/0x1a0 [ 165.679642] #3: 00000000dc1495a2 (rcu_read_lock){....}, at: bond_set_rx_mode+0x5/0x3c0 [bonding] [ 165.679642] [ 165.679642] stack backtrace: [ 165.679642] CPU: 1 PID: 1812 Comm: ip Not tainted 5.3.0-rc7+ #322 [ 165.679642] Call Trace: [ 165.679642] dump_stack+0x7c/0xbb [ 165.679642] __lock_acquire+0x26a9/0x3de0 [ 165.679642] ? register_lock_class+0x14d0/0x14d0 [ 165.679642] ? mark_held_locks+0xa5/0xe0 [ 165.679642] ? trace_hardirqs_on_thunk+0x1a/0x20 [ 165.679642] ? register_lock_class+0x14d0/0x14d0 [ 165.679642] lock_acquire+0x164/0x3b0 [ 165.679642] ? dev_uc_sync_multiple+0xfa/0x1a0 [ 165.679642] _raw_spin_lock_nested+0x2e/0x60 [ 165.679642] ? dev_uc_sync_multiple+0xfa/0x1a0 [ 165.679642] dev_uc_sync_multiple+0xfa/0x1a0 [ 165.679642] bond_set_rx_mode+0x269/0x3c0 [bonding] [ 165.679642] ? bond_init+0x6f0/0x6f0 [bonding] [ 165.679642] dev_uc_sync+0x15a/0x1a0 [ 165.679642] macvlan_set_mac_lists+0x55/0x110 [macvlan] [ 165.679642] dev_set_rx_mode+0x21/0x30 [ 165.679642] __dev_open+0x202/0x310 [ 165.679642] ? dev_set_rx_mode+0x30/0x30 [ 165.679642] ? mark_held_locks+0xa5/0xe0 [ 165.679642] ? __local_bh_enable_ip+0xe9/0x1b0 [ 165.679642] __dev_change_flags+0x3c3/0x500 [ 165.679642] ? dev_set_allmulti+0x10/0x10 [ 165.679642] dev_change_flags+0x7a/0x160 [ ...] 
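
The splat above can be approximated with a small user-space model. This is only a sketch under stated assumptions: the toy_* names are hypothetical and are not kernel APIs, and the "same key and same subclass already held" rule is a simplification of what lockdep actually checks. It shows why two macvlan devices that both recorded nest_level 1 collide once they end up in the same stack, and why one key per device avoids the collision.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a lock class key: only its address matters. */
struct toy_key { char unused; };

/* Toy lock: its class is the pair (key address, subclass). */
struct toy_lock {
	const struct toy_key *key;
	unsigned int subclass;
};

#define TOY_MAX_HELD 8

struct toy_task {
	const struct toy_lock *held[TOY_MAX_HELD];
	size_t nheld;
};

/*
 * Returns false when a lock with the same key and subclass is already
 * held, which this model treats as "possible recursive locking".
 */
static bool toy_acquire(struct toy_task *task, const struct toy_lock *lock)
{
	size_t i;

	for (i = 0; i < task->nheld; i++) {
		if (task->held[i]->key == lock->key &&
		    task->held[i]->subclass == lock->subclass)
			return false;
	}
	if (task->nheld == TOY_MAX_HELD)
		return false;
	task->held[task->nheld++] = lock;
	return true;
}

/* Old scheme: one shared macvlan key, subclass frozen at creation time. */
bool toy_shared_key_chain_ok(void)
{
	struct toy_key macvlan_key, bond_key;
	struct toy_lock macvlan0 = { &macvlan_key, 1 }; /* created on bond0 */
	struct toy_lock bond0 = { &bond_key, 0 };
	struct toy_lock macvlan1 = { &macvlan_key, 1 }; /* created on dummy0 */
	struct toy_task task = { .nheld = 0 };

	/* Lock order of the stack macvlan0 -> bond0 -> macvlan1. */
	return toy_acquire(&task, &macvlan0) &&
	       toy_acquire(&task, &bond0) &&
	       toy_acquire(&task, &macvlan1);
}

/* New scheme: every device carries its own key, no subclass needed. */
bool toy_per_device_key_chain_ok(void)
{
	struct toy_key macvlan0_key, bond_key, macvlan1_key;
	struct toy_lock macvlan0 = { &macvlan0_key, 0 };
	struct toy_lock bond0 = { &bond_key, 0 };
	struct toy_lock macvlan1 = { &macvlan1_key, 0 };
	struct toy_task task = { .nheld = 0 };

	return toy_acquire(&task, &macvlan0) &&
	       toy_acquire(&task, &bond0) &&
	       toy_acquire(&task, &macvlan1);
}
```

In this model the shared-key chain is rejected at the third acquisition, while the per-device-key chain is accepted regardless of how the devices are later restacked.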
Fixes: c674ac30c549 ("macvlan: Fix lockdep warnings with stacked macvlan devices") Signed-off-by: Taehee Yoo --- v1 -> v2 : this patch isn't changed drivers/net/macvlan.c | 35 +++++++++++++++++++++++++++-------- include/linux/if_macvlan.h | 2 ++ 2 files changed, 29 insertions(+), 8 deletions(-) diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c index 940192c057b6..dae368a2e8d1 100644 --- a/drivers/net/macvlan.c +++ b/drivers/net/macvlan.c @@ -852,8 +852,6 @@ static int macvlan_do_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd) * "super class" of normal network devices; split their locks off into a * separate class since they always nest. */ -static struct lock_class_key macvlan_netdev_addr_lock_key; - #define ALWAYS_ON_OFFLOADS \ (NETIF_F_SG | NETIF_F_HW_CSUM | NETIF_F_GSO_SOFTWARE | \ NETIF_F_GSO_ROBUST | NETIF_F_GSO_ENCAP_ALL) @@ -874,12 +872,30 @@ static int macvlan_get_nest_level(struct net_device *dev) return ((struct macvlan_dev *)netdev_priv(dev))->nest_level; } -static void macvlan_set_lockdep_class(struct net_device *dev) +static void macvlan_dev_set_lockdep_one(struct net_device *dev, + struct netdev_queue *txq, + void *_unused) +{ + struct macvlan_dev *macvlan = netdev_priv(dev); + + lockdep_set_class(&txq->_xmit_lock, &macvlan->xmit_lock_key); +} + +static struct lock_class_key qdisc_tx_busylock_key; +static struct lock_class_key qdisc_running_key; + +static void macvlan_dev_set_lockdep_class(struct net_device *dev) { - netdev_lockdep_set_classes(dev); - lockdep_set_class_and_subclass(&dev->addr_list_lock, - &macvlan_netdev_addr_lock_key, - macvlan_get_nest_level(dev)); + struct macvlan_dev *macvlan = netdev_priv(dev); + + dev->qdisc_tx_busylock = &qdisc_tx_busylock_key; + dev->qdisc_running_key = &qdisc_running_key; + + lockdep_register_key(&macvlan->addr_lock_key); + lockdep_set_class(&dev->addr_list_lock, &macvlan->addr_lock_key); + + lockdep_register_key(&macvlan->xmit_lock_key); + netdev_for_each_tx_queue(dev, 
macvlan_dev_set_lockdep_one, NULL); } static int macvlan_init(struct net_device *dev) @@ -900,7 +916,7 @@ static int macvlan_init(struct net_device *dev) dev->gso_max_segs = lowerdev->gso_max_segs; dev->hard_header_len = lowerdev->hard_header_len; - macvlan_set_lockdep_class(dev); + macvlan_dev_set_lockdep_class(dev); vlan->pcpu_stats = netdev_alloc_pcpu_stats(struct vlan_pcpu_stats); if (!vlan->pcpu_stats) @@ -922,6 +938,9 @@ static void macvlan_uninit(struct net_device *dev) port->count -= 1; if (!port->count) macvlan_port_destroy(port->dev); + + lockdep_unregister_key(&vlan->addr_lock_key); + lockdep_unregister_key(&vlan->xmit_lock_key); } static void macvlan_dev_get_stats64(struct net_device *dev, diff --git a/include/linux/if_macvlan.h b/include/linux/if_macvlan.h index 2e55e4cdbd8a..ea5b41823287 100644 --- a/include/linux/if_macvlan.h +++ b/include/linux/if_macvlan.h @@ -31,6 +31,8 @@ struct macvlan_dev { u16 flags; int nest_level; unsigned int macaddr_count; + struct lock_class_key xmit_lock_key; + struct lock_class_key addr_lock_key; #ifdef CONFIG_NET_POLL_CONTROLLER struct netpoll *netpoll; #endif From patchwork Sat Sep 7 13:47:22 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taehee Yoo X-Patchwork-Id: 1159329 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="TzRzCrOh"; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with 
ESMTP id 46QbMp4cQLz9sNf for ; Sat, 7 Sep 2019 23:47:34 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2403890AbfIGNrd (ORCPT ); Sat, 7 Sep 2019 09:47:33 -0400 Received: from mail-pl1-f196.google.com ([209.85.214.196]:45136 "EHLO mail-pl1-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1733278AbfIGNrd (ORCPT ); Sat, 7 Sep 2019 09:47:33 -0400 Received: by mail-pl1-f196.google.com with SMTP id x3so4491865plr.12 for ; Sat, 07 Sep 2019 06:47:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id; bh=k34hpd5H8YR3N6QwzXuXUSdBWScoRUdvuwOxEUNEpPc=; b=TzRzCrOh7PYkLlH9gIVpMzLyPcG9Kg0+vyncfwzSG4YwSMRS1FTOnU+EH1ppRgBnMY QhkUl7UCc+0rg1DlCW9c3E8vXngT/XSpujqWndC4gQRtz4QmTZBMl+IPtVaF0MhvrtGg 4PsITYq5v7D7Qpxv7OlogARzldCHSSlplNpETPtK5NGvQ9G68yTvYFErUO1RGFryk50h rmspB6xv0ntHw1EfVYDgucB+dy2JnZ22OsrEm3F/zk69xf+1R8IJvI+3XR7d4New1FkB K0Z1opIFxHaLmrQ793zF7cBVVmulY5WUwfORmdRYVIi8ejuuxecJvA1PrxU+hOFpOSiE hP2Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id; bh=k34hpd5H8YR3N6QwzXuXUSdBWScoRUdvuwOxEUNEpPc=; b=nBEGwhUQzBTmjKLd0yPoWPZIbdtI3TvpGEuXiIpiLzWKzOymY8MpQtcqtWdhNMqvk0 RaZTgncbJEAYPFPwo4ay2a3ajbIZIQ5HrzHS8h//UUpcVr9hXRvkBbd/ZDK195T9Nij9 zwmR7RqIpRezthgB2vZOiBnxtVr5hPIZM7Vyys6nsJRo9w8in6DZaSM133tQduwF96pd UrqVXuio25Adhq1s6ibjV1qLlf2mPtcN4cAwK6Br560SU+UVYszIV7TSp7kL+NdIe4Y6 bX/DM3C7xmBab0k40V+tiTAfcVhq3Px2ULY/A7O+1vDQ4SuDX8l/Hs4M3UNhhKYpv6Gg AZWw== X-Gm-Message-State: APjAAAXvtjQu8pj5uYsMzrkJXyvLJajVIT70WKsjxxTD7+oKmz+31wnL /M6TPWbhj0xniGQXro69BqQ= X-Google-Smtp-Source: APXvYqz6xFLM51efz7IN0OXeLOyHIdZkBQmfJz/U+U2weaqIbx4hPALRXOuN4GaLMEwrs/TA91JmiA== X-Received: by 2002:a17:902:96a:: with SMTP id 97mr14561450plm.264.1567864052714; Sat, 07 Sep 2019 06:47:32 -0700 (PDT) Received: from ap-To-be-filled-by-O-E-M.8.8.8.8 ([14.33.120.60]) by smtp.gmail.com 
with ESMTPSA id z23sm11274220pgi.78.2019.09.07.06.47.27 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 07 Sep 2019 06:47:31 -0700 (PDT) From: Taehee Yoo To: davem@davemloft.net, netdev@vger.kernel.org, j.vosburgh@gmail.com, vfalico@gmail.com, andy@greyhouse.net, jiri@resnulli.us, sd@queasysnail.net, roopa@cumulusnetworks.com, saeedm@mellanox.com, manishc@marvell.com, rahulv@marvell.com, kys@microsoft.com, haiyangz@microsoft.com, sthemmin@microsoft.com, sashal@kernel.org, hare@suse.de, varun@chelsio.com, ubraun@linux.ibm.com, kgraul@linux.ibm.com, jay.vosburgh@canonical.com Cc: ap420073@gmail.com Subject: [PATCH net v2 08/11] macsec: fix refcnt leak in module exit routine Date: Sat, 7 Sep 2019 22:47:22 +0900 Message-Id: <20190907134722.345-1-ap420073@gmail.com> X-Mailer: git-send-email 2.17.1 Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org

When a macsec interface is created, it takes a refcnt on its lower
(real) device. When the macsec interface is deleted, that refcnt is
dropped in macsec_free_netdev(), which is the ->priv_destructor() of
the macsec interface.

The problem scenario is as follows. When nested macsec interfaces are
removed by the exit routine of the macsec module, a refcnt is leaked.

Test commands:
    ip link add dummy0 type dummy
    ip link add macsec0 link dummy0 type macsec
    ip link add macsec1 link macsec0 type macsec
    modprobe -rv macsec

[ 208.629433] unregister_netdevice: waiting for macsec0 to become free. Usage count = 1

The steps of the macsec module exit routine are:
1. Calls ->dellink() in __rtnl_link_unregister().
2. Checks the refcnt in netdev_run_todo() and, if it is not 0, waits
   for it to become 0.
3. Calls ->priv_destructor() in netdev_run_todo().

Step 2 checks the refcnt, but only step 3 decreases it, so step 2 waits
forever.

This patch makes the macsec module stop holding its own refcnt on the
lower device, because it already holds a refcnt on the lower device
through netdev_upper_dev_link().
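
The ordering problem in the three steps above can be sketched in user space. This is a rough model under stated assumptions, not the kernel implementation: the toy_* names are hypothetical, each device's refcnt only counts references taken by its upper device, and the todo list is assumed to process macsec0 before macsec1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_dev {
	int refcnt;             /* references held on this device */
	struct toy_dev *lower;  /* device this one is stacked on */
	bool holds_extra_ref;   /* models the dev_hold(real_dev) in newlink */
};

static void toy_newlink(struct toy_dev *dev, struct toy_dev *lower,
			bool extra_ref)
{
	dev->refcnt = 0;
	dev->lower = lower;
	dev->holds_extra_ref = extra_ref;
	lower->refcnt++;        /* taken by the upper/lower linkage */
	if (extra_ref)
		lower->refcnt++; /* the additional hold removed by the patch */
}

/* Step 1: ->dellink() tears down the linkage and drops its reference. */
static void toy_dellink(struct toy_dev *dev)
{
	dev->lower->refcnt--;
}

/* Step 3: ->priv_destructor() drops the additional hold, if any. */
static void toy_priv_destructor(struct toy_dev *dev)
{
	if (dev->holds_extra_ref)
		dev->lower->refcnt--;
}

/*
 * Steps 2 and 3 for a batch of devices. Returns false at the point
 * where the real code would wait forever for the refcnt to reach 0.
 */
static bool toy_run_todo(struct toy_dev **todo, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (todo[i]->refcnt != 0)
			return false;
		toy_priv_destructor(todo[i]);
	}
	return true;
}

/* dummy0 <- macsec0 <- macsec1, then the module exit routine runs. */
bool toy_module_exit_completes(bool extra_ref)
{
	struct toy_dev dummy0 = { 0 }, macsec0, macsec1;
	struct toy_dev *todo[] = { &macsec0, &macsec1 };

	toy_newlink(&macsec0, &dummy0, extra_ref);
	toy_newlink(&macsec1, &macsec0, extra_ref);

	toy_dellink(&macsec0);
	toy_dellink(&macsec1);

	return toy_run_todo(todo, 2);
}
```

With the additional hold, macsec0 still has a count of 1 when step 2 inspects it, because the matching release lives in macsec1's destructor, which has not run yet. Without the additional hold, both devices reach step 2 with a count of 0.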
Fixes: c09440f7dcb3 ("macsec: introduce IEEE 802.1AE driver") Signed-off-by: Taehee Yoo --- v1 -> v2 : this patch isn't changed drivers/net/macsec.c | 4 ---- 1 file changed, 4 deletions(-) diff --git a/drivers/net/macsec.c b/drivers/net/macsec.c index 25a4fc88145d..41ec1ed0d545 100644 --- a/drivers/net/macsec.c +++ b/drivers/net/macsec.c @@ -3031,12 +3031,10 @@ static const struct nla_policy macsec_rtnl_policy[IFLA_MACSEC_MAX + 1] = { static void macsec_free_netdev(struct net_device *dev) { struct macsec_dev *macsec = macsec_priv(dev); - struct net_device *real_dev = macsec->real_dev; free_percpu(macsec->stats); free_percpu(macsec->secy.tx_sc.stats); - dev_put(real_dev); } static void macsec_setup(struct net_device *dev) @@ -3291,8 +3289,6 @@ static int macsec_newlink(struct net *net, struct net_device *dev, if (err < 0) return err; - dev_hold(real_dev); - macsec->nest_level = dev_get_nest_level(real_dev) + 1; err = netdev_upper_dev_link(real_dev, dev, extack); From patchwork Sat Sep 7 13:47:37 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taehee Yoo X-Patchwork-Id: 1159330 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="CVkEyats"; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 46QbN446pmz9sDB for ; Sat, 7 Sep 2019 23:47:48 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id 
S2404313AbfIGNrr (ORCPT ); Sat, 7 Sep 2019 09:47:47 -0400 Received: from mail-pf1-f194.google.com ([209.85.210.194]:33137 "EHLO mail-pf1-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1733278AbfIGNrr (ORCPT ); Sat, 7 Sep 2019 09:47:47 -0400 Received: by mail-pf1-f194.google.com with SMTP id q10so6417321pfl.0 for ; Sat, 07 Sep 2019 06:47:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id; bh=n/E8kVLcgaLVDmvB6aVHwxVJQ6pq71Ob2wO+rFoaOWI=; b=CVkEyats1HA4Bf65u2sinwrRPYuzSkG4jhjmoIBUYatzkul4qGMB/Em4CM64OAMknz gTwiNEp/TbSzzqTO6xduKdmAJr5mR/1nGONuugK5Ixv32P4H5EYegnnq/MsJpAQGVh1y v55oMjUMOJcaMornPpv+G7STSwj1I4Jcr5UDV9jp5qHE/0KDtJnditCWWNWy/WSiMSdQ +jAm3NPfs5kk5SavBrB+SAQRPC8jyDAGdwNWA0KZ8HeDKCXzqjCD/+wGnA5lrPMgu8xZ R8f2aJ8BDRFcBdz5OPj/LTQQ0MjV6t3BYKsAVxyubrw2dEzrQf0igDwlXY/ukzpGEOZU kZjA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id; bh=n/E8kVLcgaLVDmvB6aVHwxVJQ6pq71Ob2wO+rFoaOWI=; b=HJ3G5nZFfgofSBuZesSxBNe6DHn6KWAhlfNogn9QClUeZRG/UndUK6ZK4tn28PKv9u uyG2sIgIskdQ8OpmFSGgbpjgsbFvkQ/iXv3YJtp+YarmsvOrq5LC6dYSAClTPOZXhCpA /wCoxL5kMo0zn/9wALd7xBEsFQKKWwOsoLhbushyRz4FqSjPGZeEfcz6BdMy7dU730yk 8Bo3hH04dQXuHGphfx01FopdJBRuipdjlMjzQDEPsLdTMxQRPxxD1I0gysilXdl427mo yxipr5VmlCjHReONvoqGuah0il79GXQ2z3a4YlBwKFG4JkqTE+ZyzixHc86ALwtrD53z y5ew== X-Gm-Message-State: APjAAAVNbR2jiGqj3YsAAvUmhoekxcWYYN/5NVH2nX4pcnXmbzOa6TXs xYGVSjZyrAlA2Hn2Yt3Ac1Y= X-Google-Smtp-Source: APXvYqx1mRCSO3rwBnHK5YeZHiDf4WMFHaI0f9b4EmLQieJmfDOcsOdBChwn0My+mAGncV2FU4CBPQ== X-Received: by 2002:a63:58c:: with SMTP id 134mr13244988pgf.106.1567864066582; Sat, 07 Sep 2019 06:47:46 -0700 (PDT) Received: from ap-To-be-filled-by-O-E-M.8.8.8.8 ([14.33.120.60]) by smtp.gmail.com with ESMTPSA id r23sm9184294pjo.22.2019.09.07.06.47.41 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 07 Sep 2019 06:47:45 -0700 
(PDT) From: Taehee Yoo To: davem@davemloft.net, netdev@vger.kernel.org, j.vosburgh@gmail.com, vfalico@gmail.com, andy@greyhouse.net, jiri@resnulli.us, sd@queasysnail.net, roopa@cumulusnetworks.com, saeedm@mellanox.com, manishc@marvell.com, rahulv@marvell.com, kys@microsoft.com, haiyangz@microsoft.com, sthemmin@microsoft.com, sashal@kernel.org, hare@suse.de, varun@chelsio.com, ubraun@linux.ibm.com, kgraul@linux.ibm.com, jay.vosburgh@canonical.com Cc: ap420073@gmail.com Subject: [PATCH net v2 09/11] net: core: add ignore flag to netdev_adjacent structure Date: Sat, 7 Sep 2019 22:47:37 +0900 Message-Id: <20190907134737.444-1-ap420073@gmail.com> X-Mailer: git-send-email 2.17.1 Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org

netdev_upper_dev_link() is used to link an adjacent node and
netdev_upper_dev_unlink() is used to unlink one. The unlink operation
does not fail, but the link operation can.

To exchange adjacent nodes, we have to unlink the old adjacent node
first and then link the new one. If linking the new node fails, we have
to link the old node again, but that link operation can fail too, which
leaves the adjacent link relationship broken.

This patch adds an ignore flag to the netdev_adjacent structure. While
this flag is set, netdev_upper_dev_link() temporarily ignores the old
adjacent node, so the unlink no longer has to happen before the link.
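
The exchange pattern described above can be sketched with a toy adjacency table. This is a user-space illustration only: the toy_* names are hypothetical, and the rule that an upper node accepts at most one visible lower node is an assumption standing in for the checks that can make a real link fail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_ADJ 4

struct toy_adj {
	int lower_id;
	bool used;
	bool ignore;    /* lookups skip this entry while set */
};

struct toy_upper {
	struct toy_adj adj[TOY_MAX_ADJ];
};

void toy_upper_init(struct toy_upper *u)
{
	size_t i;

	for (i = 0; i < TOY_MAX_ADJ; i++) {
		u->adj[i].lower_id = 0;
		u->adj[i].used = false;
		u->adj[i].ignore = false;
	}
}

static struct toy_adj *toy_find(struct toy_upper *u, int lower_id)
{
	size_t i;

	for (i = 0; i < TOY_MAX_ADJ; i++)
		if (u->adj[i].used && u->adj[i].lower_id == lower_id)
			return &u->adj[i];
	return NULL;
}

/* True if lower_id is linked and currently visible to lookups. */
bool toy_has_lower(struct toy_upper *u, int lower_id)
{
	struct toy_adj *adj = toy_find(u, lower_id);

	return adj && !adj->ignore;
}

/* Link can fail: invalid id, a visible lower already present, or no room. */
int toy_link(struct toy_upper *u, int lower_id)
{
	size_t i;

	if (lower_id < 0)
		return -1;
	for (i = 0; i < TOY_MAX_ADJ; i++)
		if (u->adj[i].used && !u->adj[i].ignore)
			return -1;
	for (i = 0; i < TOY_MAX_ADJ; i++) {
		if (!u->adj[i].used) {
			u->adj[i].lower_id = lower_id;
			u->adj[i].used = true;
			u->adj[i].ignore = false;
			return 0;
		}
	}
	return -1;
}

/* Unlink and flag updates cannot fail. */
static void toy_unlink(struct toy_upper *u, int lower_id)
{
	struct toy_adj *adj = toy_find(u, lower_id);

	if (adj)
		adj->used = false;
}

static void toy_set_ignore(struct toy_upper *u, int lower_id, bool val)
{
	struct toy_adj *adj = toy_find(u, lower_id);

	if (adj)
		adj->ignore = val;
}

/* Hide the old node, try the new link, and only then drop the old one. */
int toy_exchange(struct toy_upper *u, int old_id, int new_id)
{
	toy_set_ignore(u, old_id, true);
	if (toy_link(u, new_id) < 0) {
		toy_set_ignore(u, old_id, false); /* rollback never fails */
		return -1;
	}
	toy_set_ignore(u, old_id, false);
	toy_unlink(u, old_id);
	return 0;
}
```

The point of the design is that the rollback path consists only of clearing a flag, so a failed exchange always leaves the old link in place.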
Signed-off-by: Taehee Yoo --- v1 -> v2 : this patch isn't changed include/linux/netdevice.h | 4 + net/core/dev.c | 160 +++++++++++++++++++++++++++++++++----- 2 files changed, 144 insertions(+), 20 deletions(-) diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index 5bb5756129af..309ae000bae7 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -4319,6 +4319,10 @@ int netdev_master_upper_dev_link(struct net_device *dev, struct netlink_ext_ack *extack); void netdev_upper_dev_unlink(struct net_device *dev, struct net_device *upper_dev); +void netdev_adjacent_dev_disable(struct net_device *upper_dev, + struct net_device *lower_dev); +void netdev_adjacent_dev_enable(struct net_device *upper_dev, + struct net_device *lower_dev); void netdev_adjacent_rename_links(struct net_device *dev, char *oldname); void *netdev_lower_dev_get_private(struct net_device *dev, struct net_device *lower_dev); diff --git a/net/core/dev.c b/net/core/dev.c index 6a4b4ce62204..ac055b531c96 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -6448,6 +6448,9 @@ struct netdev_adjacent { /* upper master flag, there can only be one master device per list */ bool master; + /* lookup ignore flag */ + bool ignore; + /* counter for the number of times this device was added to us */ u16 ref_nr; @@ -6553,6 +6556,22 @@ struct net_device *netdev_master_upper_dev_get(struct net_device *dev) } EXPORT_SYMBOL(netdev_master_upper_dev_get); +struct net_device *netdev_master_upper_dev_get_ignore(struct net_device *dev) +{ + struct netdev_adjacent *upper; + + ASSERT_RTNL(); + + if (list_empty(&dev->adj_list.upper)) + return NULL; + + upper = list_first_entry(&dev->adj_list.upper, + struct netdev_adjacent, list); + if (likely(upper->master) && !upper->ignore) + return upper->dev; + return NULL; +} + /** * netdev_has_any_lower_dev - Check if device is linked to some device * @dev: device @@ -6603,8 +6622,9 @@ struct net_device *netdev_upper_get_next_dev_rcu(struct net_device 
*dev, } EXPORT_SYMBOL(netdev_upper_get_next_dev_rcu); -static struct net_device *netdev_next_upper_dev(struct net_device *dev, - struct list_head **iter) +static struct net_device *netdev_next_upper_dev_ignore(struct net_device *dev, + struct list_head **iter, + bool *ignore) { struct netdev_adjacent *upper; @@ -6614,6 +6634,7 @@ static struct net_device *netdev_next_upper_dev(struct net_device *dev, return NULL; *iter = &upper->list; + *ignore = upper->ignore; return upper->dev; } @@ -6635,26 +6656,29 @@ static struct net_device *netdev_next_upper_dev_rcu(struct net_device *dev, return upper->dev; } -int netdev_walk_all_upper_dev(struct net_device *dev, - int (*fn)(struct net_device *dev, - void *data), - void *data) +int netdev_walk_all_upper_dev_ignore(struct net_device *dev, + int (*fn)(struct net_device *dev, + void *data), + void *data) { struct net_device *udev; struct list_head *iter; int ret; + bool ignore; for (iter = &dev->adj_list.upper, - udev = netdev_next_upper_dev(dev, &iter); + udev = netdev_next_upper_dev_ignore(dev, &iter, &ignore); udev; - udev = netdev_next_upper_dev(dev, &iter)) { + udev = netdev_next_upper_dev_ignore(dev, &iter, &ignore)) { + if (ignore) + continue; /* first is the upper device itself */ ret = fn(udev, data); if (ret) return ret; /* then look at all of its upper devices */ - ret = netdev_walk_all_upper_dev(udev, fn, data); + ret = netdev_walk_all_upper_dev_ignore(udev, fn, data); if (ret) return ret; } @@ -6690,6 +6714,15 @@ int netdev_walk_all_upper_dev_rcu(struct net_device *dev, } EXPORT_SYMBOL_GPL(netdev_walk_all_upper_dev_rcu); +bool netdev_has_upper_dev_ignore(struct net_device *dev, + struct net_device *upper_dev) +{ + ASSERT_RTNL(); + + return netdev_walk_all_upper_dev_ignore(dev, __netdev_has_upper_dev, + upper_dev); +} + /** * netdev_lower_get_next_private - Get the next ->private from the * lower neighbour list @@ -6786,6 +6819,23 @@ static struct net_device *netdev_next_lower_dev(struct net_device *dev, return 
lower->dev; } +static struct net_device *netdev_next_lower_dev_ignore(struct net_device *dev, + struct list_head **iter, + bool *ignore) +{ + struct netdev_adjacent *lower; + + lower = list_entry((*iter)->next, struct netdev_adjacent, list); + + if (&lower->list == &dev->adj_list.lower) + return NULL; + + *iter = &lower->list; + *ignore = lower->ignore; + + return lower->dev; +} + int netdev_walk_all_lower_dev(struct net_device *dev, int (*fn)(struct net_device *dev, void *data), @@ -6814,6 +6864,36 @@ int netdev_walk_all_lower_dev(struct net_device *dev, } EXPORT_SYMBOL_GPL(netdev_walk_all_lower_dev); +int netdev_walk_all_lower_dev_ignore(struct net_device *dev, + int (*fn)(struct net_device *dev, + void *data), + void *data) +{ + struct net_device *ldev; + struct list_head *iter; + int ret; + bool ignore; + + for (iter = &dev->adj_list.lower, + ldev = netdev_next_lower_dev_ignore(dev, &iter, &ignore); + ldev; + ldev = netdev_next_lower_dev_ignore(dev, &iter, &ignore)) { + if (ignore) + continue; + /* first is the lower device itself */ + ret = fn(ldev, data); + if (ret) + return ret; + + /* then look at all of its lower devices */ + ret = netdev_walk_all_lower_dev_ignore(ldev, fn, data); + if (ret) + return ret; + } + + return 0; +} + static struct net_device *netdev_next_lower_dev_rcu(struct net_device *dev, struct list_head **iter) { @@ -6833,11 +6913,14 @@ static u8 __netdev_upper_depth(struct net_device *dev) struct net_device *udev; struct list_head *iter; u8 max_depth = 0; + bool ignore; for (iter = &dev->adj_list.upper, - udev = netdev_next_upper_dev(dev, &iter); + udev = netdev_next_upper_dev_ignore(dev, &iter, &ignore); udev; - udev = netdev_next_upper_dev(dev, &iter)) { + udev = netdev_next_upper_dev_ignore(dev, &iter, &ignore)) { + if (ignore) + continue; if (max_depth < udev->upper_level) max_depth = udev->upper_level; } @@ -6850,11 +6933,14 @@ static u8 __netdev_lower_depth(struct net_device *dev) struct net_device *ldev; struct list_head *iter; u8 
max_depth = 0; + bool ignore; for (iter = &dev->adj_list.lower, - ldev = netdev_next_lower_dev(dev, &iter); + ldev = netdev_next_lower_dev_ignore(dev, &iter, &ignore); ldev; - ldev = netdev_next_lower_dev(dev, &iter)) { + ldev = netdev_next_lower_dev_ignore(dev, &iter, &ignore)) { + if (ignore) + continue; if (max_depth < ldev->lower_level) max_depth = ldev->lower_level; } @@ -6999,6 +7085,7 @@ static int __netdev_adjacent_dev_insert(struct net_device *dev, adj->master = master; adj->ref_nr = 1; adj->private = private; + adj->ignore = false; dev_hold(adj_dev); pr_debug("Insert adjacency: dev %s adj_dev %s adj->ref_nr %d; dev_hold on %s\n", @@ -7149,17 +7236,17 @@ static int __netdev_upper_dev_link(struct net_device *dev, return -EBUSY; /* To prevent loops, check if dev is not upper device to upper_dev. */ - if (netdev_has_upper_dev(upper_dev, dev)) + if (netdev_has_upper_dev_ignore(upper_dev, dev)) return -EBUSY; if ((dev->lower_level + upper_dev->upper_level) > MAX_NEST_DEV) return -EMLINK; if (!master) { - if (netdev_has_upper_dev(dev, upper_dev)) + if (netdev_has_upper_dev_ignore(dev, upper_dev)) return -EEXIST; } else { - master_dev = netdev_master_upper_dev_get(dev); + master_dev = netdev_master_upper_dev_get_ignore(dev); if (master_dev) return master_dev == upper_dev ? 
-EEXIST : -EBUSY; } @@ -7182,10 +7269,12 @@ static int __netdev_upper_dev_link(struct net_device *dev, goto rollback; __netdev_update_upper_level(dev, NULL); - netdev_walk_all_lower_dev(dev, __netdev_update_upper_level, NULL); + netdev_walk_all_lower_dev_ignore(dev, __netdev_update_upper_level, + NULL); __netdev_update_lower_level(upper_dev, NULL); - netdev_walk_all_upper_dev(upper_dev, __netdev_update_lower_level, NULL); + netdev_walk_all_upper_dev_ignore(upper_dev, + __netdev_update_lower_level, NULL); return 0; @@ -7271,13 +7360,44 @@ void netdev_upper_dev_unlink(struct net_device *dev, &changeupper_info.info); __netdev_update_upper_level(dev, NULL); - netdev_walk_all_lower_dev(dev, __netdev_update_upper_level, NULL); + netdev_walk_all_lower_dev_ignore(dev, __netdev_update_upper_level, + NULL); __netdev_update_lower_level(upper_dev, NULL); - netdev_walk_all_upper_dev(upper_dev, __netdev_update_lower_level, NULL); + netdev_walk_all_upper_dev_ignore(upper_dev, + __netdev_update_lower_level, NULL); } EXPORT_SYMBOL(netdev_upper_dev_unlink); +void __netdev_adjacent_dev_set(struct net_device *upper_dev, + struct net_device *lower_dev, + bool val) +{ + struct netdev_adjacent *adj; + + adj = __netdev_find_adj(lower_dev, &upper_dev->adj_list.lower); + if (adj) + adj->ignore = val; + + adj = __netdev_find_adj(upper_dev, &lower_dev->adj_list.upper); + if (adj) + adj->ignore = val; +} + +void netdev_adjacent_dev_disable(struct net_device *upper_dev, + struct net_device *lower_dev) +{ + __netdev_adjacent_dev_set(upper_dev, lower_dev, true); +} +EXPORT_SYMBOL(netdev_adjacent_dev_disable); + +void netdev_adjacent_dev_enable(struct net_device *upper_dev, + struct net_device *lower_dev) +{ + __netdev_adjacent_dev_set(upper_dev, lower_dev, false); +} +EXPORT_SYMBOL(netdev_adjacent_dev_enable); + /** * netdev_bonding_info_change - Dispatch event about slave change * @dev: device From patchwork Sat Sep 7 13:47:49 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 
Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taehee Yoo X-Patchwork-Id: 1159331 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="Ib54I3f0"; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 46QbNK6jCRz9sDB for ; Sat, 7 Sep 2019 23:48:01 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2405937AbfIGNsB (ORCPT ); Sat, 7 Sep 2019 09:48:01 -0400 Received: from mail-pg1-f194.google.com ([209.85.215.194]:36058 "EHLO mail-pg1-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1733278AbfIGNsA (ORCPT ); Sat, 7 Sep 2019 09:48:00 -0400 Received: by mail-pg1-f194.google.com with SMTP id l21so5146689pgm.3 for ; Sat, 07 Sep 2019 06:48:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id; bh=mD1NbS4MMRS1ULiG3xa8+vRcPs1R+IjGKx+iMzbM8rU=; b=Ib54I3f0qqugTqxPaFlHrycwlljwExW2pGtY7QIXGgV/CaFcbEFr4aqwc0Wh27HU+E hbNjfJl5r5IfjcJQnkqwksKP47V0RcP/S1yBia6rvBHf2vNw5Kqh4dL9Sin+0cQO5PbE Af9+oVjKyIqXOc9Frr1jM+juY+UKB8Wd2crwfh2bQ+olWGFXShCBqXT4c0f7RSmCgVG9 IDWPldk/3hwE2qSgLr8J+bKoYzHFa/kXHxpJaVMGc2wJoh4G3DE88X1Xd22516csNm6p 6+qUA3REc91ZkXvC4v//atXOy+9MLsrJXJQIVHyo1BENtwsGq904nxcWBjO+cCv2sevS glUA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id; 
bh=mD1NbS4MMRS1ULiG3xa8+vRcPs1R+IjGKx+iMzbM8rU=; b=ulowS8AWApEZXhWYwMRGcLTdd2fCZ3IY5BdAxiBbMcReeuaVAGP9Ifvt4BAwDtipne CQukauQPjRvENyyqF61v8R5iEvRaS4qEtlkBhMinU4zSpwxJb7p9DOB5im4nTYqn5nag Tjx7874UXdR4z7ZGqtz1/vuaCaXo1EEgADNg3/raS02T+pq9M5QdDELfE6ECfnT8i2Vc 1GXjug+E5onPA1qAFnOjmKK9e51splO8vPN38vhl5+DgkpMYm0doekvhOCLqOqwJa75Z hjF7UuMP7jWS1o9SiTdbqduyowq/e2SD5LmDzdh15/ow9LlBlQSNsKBwl+9RXpmzF6rk Xqmg== X-Gm-Message-State: APjAAAUxy/RwNm9kejrn3a5Za1WpXSZ/Y8zRJyqFl6WONwKsYN+kA4oO KCb+mevlc3YIVsg0C4Dj6tA= X-Google-Smtp-Source: APXvYqwUsSqsLvN00P7BlyPp0WlykfXE2eX++xBbdr0xd/w/rJKG+KDNgUDfpW3CLgAcd4WNMIOmCw== X-Received: by 2002:a63:1341:: with SMTP id 1mr12919684pgt.48.1567864079861; Sat, 07 Sep 2019 06:47:59 -0700 (PDT) Received: from ap-To-be-filled-by-O-E-M.8.8.8.8 ([14.33.120.60]) by smtp.gmail.com with ESMTPSA id dw7sm7440139pjb.21.2019.09.07.06.47.54 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 07 Sep 2019 06:47:58 -0700 (PDT) From: Taehee Yoo To: davem@davemloft.net, netdev@vger.kernel.org, j.vosburgh@gmail.com, vfalico@gmail.com, andy@greyhouse.net, jiri@resnulli.us, sd@queasysnail.net, roopa@cumulusnetworks.com, saeedm@mellanox.com, manishc@marvell.com, rahulv@marvell.com, kys@microsoft.com, haiyangz@microsoft.com, sthemmin@microsoft.com, sashal@kernel.org, hare@suse.de, varun@chelsio.com, ubraun@linux.ibm.com, kgraul@linux.ibm.com, jay.vosburgh@canonical.com Cc: ap420073@gmail.com Subject: [PATCH net v2 10/11] vxlan: add adjacent link to limit depth level Date: Sat, 7 Sep 2019 22:47:49 +0900 Message-Id: <20190907134749.557-1-ap420073@gmail.com> X-Mailer: git-send-email 2.17.1 Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org

Current vxlan code doesn't limit the number of nested devices. Nested
devices are handled recursively, and this recursion needs a large
amount of stack memory, so an unlimited nesting depth can overflow the
stack. To fix this issue, this patch adds adjacent links.
The adjacent link APIs internally check the depth level.

Test commands:
    ip link add dummy0 type dummy
    ip link add vxlan0 type vxlan id 0 group 239.1.1.1 dev dummy0 \
        dstport 4789
    for i in {1..100}
    do
        let A=$i-1
        ip link add vxlan$i type vxlan id $i group 239.1.1.1 \
            dev vxlan$A dstport 4789
    done
    ip link del dummy0

The topmost upper link is vxlan100 and the lowest link is vxlan0. When
vxlan0 is deleted, its upper devices are deleted recursively. That
recursion needs a huge amount of stack memory, so the stack overflows.

Splat looks like:
[ 229.628477] ============================================================================= [ 229.629785] BUG page->ptl (Not tainted): Padding overwritten. 0x0000000026abf214-0x0000000091f6abb2 [ 229.629785] ----------------------------------------------------------------------------- [ 229.629785] [ 229.655439] ================================================================== [ 229.629785] INFO: Slab 0x00000000ff7cfda8 objects=19 used=19 fp=0x00000000fe33776c flags=0x200000000010200 [ 229.655688] BUG: KASAN: stack-out-of-bounds in unmap_single_vma+0x25a/0x2e0 [ 229.655688] Read of size 8 at addr ffff888113076928 by task vlan-network-in/2334 [ 229.655688] [ 229.629785] Padding 0000000026abf214: 00 80 14 0d 81 88 ff ff 68 91 81 14 81 88 ff ff ........h....... [ 229.629785] Padding 0000000001e24790: 38 91 81 14 81 88 ff ff 68 91 81 14 81 88 ff ff 8.......h....... [ 229.629785] Padding 00000000b39397c8: 33 30 62 a7 ff ff ff ff ff eb 60 22 10 f1 ff 1f 30b.......`".... [ 229.629785] Padding 00000000bc98f53a: 80 60 07 13 81 88 ff ff 00 80 14 0d 81 88 ff ff .`.............. [ 229.629785] Padding 000000002aa8123d: 68 91 81 14 81 88 ff ff f7 21 17 a7 ff ff ff ff h........!...... [ 229.629785] Padding 000000001c8c2369: 08 81 14 0d 81 88 ff ff 03 02 00 00 00 00 00 00 ................ [ 229.629785] Padding 000000004e290c5d: 21 90 a2 21 10 ed ff ff 00 00 00 00 00 fc ff df !..!............
[ 229.629785] Padding 000000000e25d731: 18 60 07 13 81 88 ff ff c0 8b 13 05 81 88 ff ff .`.............. [ 229.629785] Padding 000000007adc7ab3: b3 8a b5 41 00 00 00 00 ...A.... [ 229.629785] FIX page->ptl: Restoring 0x0000000026abf214-0x0000000091f6abb2=0x5a [ ... ] Fixes: acaf4e70997f ("net: vxlan: when lower dev unregisters remove vxlan dev as well") Signed-off-by: Taehee Yoo --- v1 -> v2 : this patch isn't changed drivers/net/vxlan.c | 71 ++++++++++++++++++++++++++++++++++++++------- include/net/vxlan.h | 1 + 2 files changed, 62 insertions(+), 10 deletions(-) diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c index 3d9bcc957f7d..0d5c8d22d8a4 100644 --- a/drivers/net/vxlan.c +++ b/drivers/net/vxlan.c @@ -3567,6 +3567,8 @@ static int __vxlan_dev_create(struct net *net, struct net_device *dev, struct vxlan_net *vn = net_generic(net, vxlan_net_id); struct vxlan_dev *vxlan = netdev_priv(dev); struct vxlan_fdb *f = NULL; + struct net_device *remote_dev = NULL; + struct vxlan_rdst *dst = &vxlan->default_dst; bool unregister = false; int err; @@ -3577,14 +3579,14 @@ static int __vxlan_dev_create(struct net *net, struct net_device *dev, dev->ethtool_ops = &vxlan_ethtool_ops; /* create an fdb entry for a valid default destination */ - if (!vxlan_addr_any(&vxlan->default_dst.remote_ip)) { + if (!vxlan_addr_any(&dst->remote_ip)) { err = vxlan_fdb_create(vxlan, all_zeros_mac, - &vxlan->default_dst.remote_ip, + &dst->remote_ip, NUD_REACHABLE | NUD_PERMANENT, vxlan->cfg.dst_port, - vxlan->default_dst.remote_vni, - vxlan->default_dst.remote_vni, - vxlan->default_dst.remote_ifindex, + dst->remote_vni, + dst->remote_vni, + dst->remote_ifindex, NTF_SELF, &f); if (err) return err; @@ -3595,26 +3597,43 @@ static int __vxlan_dev_create(struct net *net, struct net_device *dev, goto errout; unregister = true; + if (dst->remote_ifindex) { + remote_dev = __dev_get_by_index(net, dst->remote_ifindex); + if (!remote_dev) + goto errout; + + err = netdev_upper_dev_link(remote_dev, dev, 
extack); + if (err) + goto errout; + } + err = rtnl_configure_link(dev, NULL); if (err) - goto errout; + goto unlink; if (f) { - vxlan_fdb_insert(vxlan, all_zeros_mac, - vxlan->default_dst.remote_vni, f); + vxlan_fdb_insert(vxlan, all_zeros_mac, dst->remote_vni, f); /* notify default fdb entry */ err = vxlan_fdb_notify(vxlan, f, first_remote_rtnl(f), RTM_NEWNEIGH, true, extack); if (err) { vxlan_fdb_destroy(vxlan, f, false, false); + if (remote_dev) + netdev_upper_dev_unlink(remote_dev, dev); goto unregister; } } list_add(&vxlan->next, &vn->vxlan_list); + if (remote_dev) { + dst->remote_dev = remote_dev; + dev_hold(remote_dev); + } return 0; - +unlink: + if (remote_dev) + netdev_upper_dev_unlink(remote_dev, dev); errout: /* unregister_netdevice() destroys the default FDB entry with deletion * notification. But the addition notification was not sent yet, so @@ -3936,6 +3955,8 @@ static int vxlan_changelink(struct net_device *dev, struct nlattr *tb[], struct net_device *lowerdev; struct vxlan_config conf; int err; + bool linked = false; + bool disabled = false; err = vxlan_nl2conf(tb, data, dev, &conf, true, extack); if (err) @@ -3946,6 +3967,16 @@ static int vxlan_changelink(struct net_device *dev, struct nlattr *tb[], if (err) return err; + if (lowerdev) { + if (dst->remote_dev && lowerdev != dst->remote_dev) { + netdev_adjacent_dev_disable(dst->remote_dev, dev); + disabled = true; + } + err = netdev_upper_dev_link(lowerdev, dev, extack); + if (err) + goto err; + linked = true; + } /* handle default dst entry */ if (!vxlan_addr_equal(&conf.remote_ip, &dst->remote_ip)) { u32 hash_index = fdb_head_index(vxlan, all_zeros_mac, conf.vni); @@ -3962,7 +3993,7 @@ static int vxlan_changelink(struct net_device *dev, struct nlattr *tb[], NTF_SELF, true, extack); if (err) { spin_unlock_bh(&vxlan->hash_lock[hash_index]); - return err; + goto err; } } if (!vxlan_addr_any(&dst->remote_ip)) @@ -3979,8 +4010,24 @@ static int vxlan_changelink(struct net_device *dev, struct nlattr 
*tb[], if (conf.age_interval != vxlan->cfg.age_interval) mod_timer(&vxlan->age_timer, jiffies); + if (disabled) { + netdev_adjacent_dev_enable(dst->remote_dev, dev); + netdev_upper_dev_unlink(dst->remote_dev, dev); + dev_put(dst->remote_dev); + } + if (linked) { + dst->remote_dev = lowerdev; + dev_hold(dst->remote_dev); + } + vxlan_config_apply(dev, &conf, lowerdev, vxlan->net, true); return 0; +err: + if (linked) + netdev_upper_dev_unlink(lowerdev, dev); + if (disabled) + netdev_adjacent_dev_enable(dst->remote_dev, dev); + return err; } static void vxlan_dellink(struct net_device *dev, struct list_head *head) @@ -3991,6 +4038,10 @@ static void vxlan_dellink(struct net_device *dev, struct list_head *head) list_del(&vxlan->next); unregister_netdevice_queue(dev, head); + if (vxlan->default_dst.remote_dev) { + netdev_upper_dev_unlink(vxlan->default_dst.remote_dev, dev); + dev_put(vxlan->default_dst.remote_dev); + } } static size_t vxlan_get_size(const struct net_device *dev) diff --git a/include/net/vxlan.h b/include/net/vxlan.h index dc1583a1fb8a..08e237d7aa73 100644 --- a/include/net/vxlan.h +++ b/include/net/vxlan.h @@ -197,6 +197,7 @@ struct vxlan_rdst { u8 offloaded:1; __be32 remote_vni; u32 remote_ifindex; + struct net_device *remote_dev; struct list_head list; struct rcu_head rcu; struct dst_cache dst_cache; From patchwork Sat Sep 7 13:48:09 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taehee Yoo X-Patchwork-Id: 1159332 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; 
dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="J5mUqjKB"; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 46QbNh69N9z9sNf for ; Sat, 7 Sep 2019 23:48:20 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2406722AbfIGNsU (ORCPT ); Sat, 7 Sep 2019 09:48:20 -0400 Received: from mail-pg1-f196.google.com ([209.85.215.196]:39599 "EHLO mail-pg1-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1733096AbfIGNsT (ORCPT ); Sat, 7 Sep 2019 09:48:19 -0400 Received: by mail-pg1-f196.google.com with SMTP id u17so5137568pgi.6 for ; Sat, 07 Sep 2019 06:48:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id; bh=Zvqvj0SAWpRVvbLIHyr6uhQGnZW7ODqSqCSSGUzXMas=; b=J5mUqjKBc4uTsUgkIIRLuMBFhiI99wwN99VnRJ/SPMLSiMMkqvrf82wykmkF/pGG5h s5zoCXYAZaPeDnCzxS9NnZvBfSy4NhEiLJWNTNgGE/IhG6FpiBxDFnJ+c4s47aA4d+UF o/XqmqeWBYeRg0adqNXsyrkWHDLfwTiKg0v8npdLtz+9eIWM31CC8dkl9COUaZ7+zofu 1wFb2zdwRARLskdyDFkm+mwtZGiz0ktdszhthj6eJpnTjUalF/mWjCDqXXQhSzFpoKIs T4zpQd3/w0f5/HOehHVx8r5+FuK9g+FUf61VSFmrnXMkXVnCd3c/v2Qdqb+zKdrsnJ/b 5+XA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id; bh=Zvqvj0SAWpRVvbLIHyr6uhQGnZW7ODqSqCSSGUzXMas=; b=eIVlQueT7I/PwSj5PdB2Ctl6ABSP5SJ302GlHmZKSR+veg61kC3qWTOyMC/ekpiaaS 322Ox+MbaKPCHaWkyPh2e0WDP+ePobDMtIhC5MreGtW5lvUEPCH5YAV2j3QUBRbmwFPe p2UChVFspFXCJfdeYnP1zpNK3DFDl5gKg90Vrjc3LpRxFXGuO7KA4YjQrVvmYf+fiyrt P/cQjD8Nlq57R4Y2YlfjdMJgr6OiuBBsJwaCmHlCmsLGO+BD+KA9Ub59r/L85g/a9zD1 QUdVpkbr4eW/oOgacDSJf9K9y7CE41/NZMvxJM86fv1w1Z2YFoIMMm8/xaCiooM0Pin7 xB1Q== X-Gm-Message-State: APjAAAXzZWU9R9Z6Usqbq3d/NtrFvvv/sEfPkLKk0Wo98yBXaZj3OsTK ZknExZDddkjfuL6wd739pOQ= X-Google-Smtp-Source: 
APXvYqyNO+k0ObQ8kdbnsftxvRl7OxJEmSp/tjQXxGFh+wandbzrhppgt6SnbuB7X91OeUjPuL7+IQ== X-Received: by 2002:a62:4e52:: with SMTP id c79mr17252204pfb.28.1567864098673; Sat, 07 Sep 2019 06:48:18 -0700 (PDT) Received: from ap-To-be-filled-by-O-E-M.8.8.8.8 ([14.33.120.60]) by smtp.gmail.com with ESMTPSA id r28sm20085547pfg.62.2019.09.07.06.48.13 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 07 Sep 2019 06:48:17 -0700 (PDT) From: Taehee Yoo To: davem@davemloft.net, netdev@vger.kernel.org, j.vosburgh@gmail.com, vfalico@gmail.com, andy@greyhouse.net, jiri@resnulli.us, sd@queasysnail.net, roopa@cumulusnetworks.com, saeedm@mellanox.com, manishc@marvell.com, rahulv@marvell.com, kys@microsoft.com, haiyangz@microsoft.com, sthemmin@microsoft.com, sashal@kernel.org, hare@suse.de, varun@chelsio.com, ubraun@linux.ibm.com, kgraul@linux.ibm.com, jay.vosburgh@canonical.com Cc: ap420073@gmail.com Subject: [PATCH net v2 11/11] net: remove unnecessary variables and callback Date: Sat, 7 Sep 2019 22:48:09 +0900 Message-Id: <20190907134809.720-1-ap420073@gmail.com> X-Mailer: git-send-email 2.17.1 Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org

This patch removes the variables and the callback that are related to
the nested device structure. Devices that can be nested have their own
nest_level variable that represents the depth of nested devices.
In the previous patch, new {lower/upper}_level variables were added, and
they replace the old private nest_level variable.
So, this patch removes all 'nest_level' variables.

In order to avoid a lockdep warning, ->ndo_get_lock_subclass() was added
to get the lockdep subclass value, which is actually the lower nested
depth value. But now, these devices use a dynamic lockdep key instead of
the subclass to avoid the lockdep warning.
So, this patch removes the ->ndo_get_lock_subclass() callback.

Signed-off-by: Taehee Yoo --- v1 -> v2 : this patch isn't changed drivers/net/bonding/bond_alb.c | 2 +- drivers/net/bonding/bond_main.c | 14 ------------- .../net/ethernet/mellanox/mlx5/core/en_tc.c | 2 +- drivers/net/macsec.c | 9 --------- drivers/net/macvlan.c | 7 ------- include/linux/if_macvlan.h | 1 - include/linux/if_vlan.h | 12 ----------- include/linux/netdevice.h | 12 ----------- include/net/bonding.h | 1 - net/8021q/vlan.c | 1 - net/8021q/vlan_dev.c | 6 ------ net/core/dev.c | 20 ------------------- net/core/dev_addr_lists.c | 12 +++++------ net/smc/smc_core.c | 2 +- net/smc/smc_pnet.c | 2 +- 15 files changed, 10 insertions(+), 93 deletions(-) diff --git a/drivers/net/bonding/bond_alb.c b/drivers/net/bonding/bond_alb.c index 8c79bad2a9a5..4f2e6910c623 100644 --- a/drivers/net/bonding/bond_alb.c +++ b/drivers/net/bonding/bond_alb.c @@ -952,7 +952,7 @@ static int alb_upper_dev_walk(struct net_device *upper, void *_data) struct bond_vlan_tag *tags; if (is_vlan_dev(upper) && - bond->nest_level == vlan_get_encap_level(upper) - 1) { + bond->dev->lower_level == upper->lower_level - 1) { if (upper->addr_assign_type == NET_ADDR_STOLEN) { alb_send_lp_vid(slave, mac_addr, vlan_dev_vlan_proto(upper), diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index 7f574e74ed78..69eb61466fbe 100644 --- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -1733,8 +1733,6 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev, goto err_upper_unlink; } - bond->nest_level = dev_get_nest_level(bond_dev) + 1; - /* If the mode uses primary, then the following is handled by * bond_change_active_slave(). 
*/ @@ -1983,9 +1981,6 @@ static int __bond_release_one(struct net_device *bond_dev, if (!bond_has_slaves(bond)) { bond_set_carrier(bond); eth_hw_addr_random(bond_dev); - bond->nest_level = SINGLE_DEPTH_NESTING; - } else { - bond->nest_level = dev_get_nest_level(bond_dev) + 1; } unblock_netpoll_tx(); @@ -3472,13 +3467,6 @@ static void bond_fold_stats(struct rtnl_link_stats64 *_res, } } -static int bond_get_nest_level(struct net_device *bond_dev) -{ - struct bonding *bond = netdev_priv(bond_dev); - - return bond->nest_level; -} - static void bond_get_stats(struct net_device *bond_dev, struct rtnl_link_stats64 *stats) { @@ -4298,7 +4286,6 @@ static const struct net_device_ops bond_netdev_ops = { .ndo_neigh_setup = bond_neigh_setup, .ndo_vlan_rx_add_vid = bond_vlan_rx_add_vid, .ndo_vlan_rx_kill_vid = bond_vlan_rx_kill_vid, - .ndo_get_lock_subclass = bond_get_nest_level, #ifdef CONFIG_NET_POLL_CONTROLLER .ndo_netpoll_setup = bond_netpoll_setup, .ndo_netpoll_cleanup = bond_netpoll_cleanup, @@ -4822,7 +4809,6 @@ static int bond_init(struct net_device *bond_dev) if (!bond->wq) return -ENOMEM; - bond->nest_level = SINGLE_DEPTH_NESTING; bond_dev_set_lockdep_class(bond_dev); list_add_tail(&bond->bond_list, &bn->dev_list); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c index 00b2d4a86159..e056f9aad8df 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c @@ -2797,7 +2797,7 @@ static int add_vlan_pop_action(struct mlx5e_priv *priv, struct mlx5_esw_flow_attr *attr, u32 *action) { - int nest_level = vlan_get_encap_level(attr->parse_attr->filter_dev); + int nest_level = attr->parse_attr->filter_dev->lower_level; struct flow_action_entry vlan_act = { .id = FLOW_ACTION_VLAN_POP, }; diff --git a/drivers/net/macsec.c b/drivers/net/macsec.c index 41ec1ed0d545..c0cb595f2bba 100644 --- a/drivers/net/macsec.c +++ b/drivers/net/macsec.c @@ -269,7 +269,6 @@ struct 
macsec_dev { struct gro_cells gro_cells; struct lock_class_key xmit_lock_key; struct lock_class_key addr_lock_key; - unsigned int nest_level; }; /** @@ -2988,11 +2987,6 @@ static int macsec_get_iflink(const struct net_device *dev) return macsec_priv(dev)->real_dev->ifindex; } -static int macsec_get_nest_level(struct net_device *dev) -{ - return macsec_priv(dev)->nest_level; -} - static const struct net_device_ops macsec_netdev_ops = { .ndo_init = macsec_dev_init, .ndo_uninit = macsec_dev_uninit, @@ -3006,7 +3000,6 @@ static const struct net_device_ops macsec_netdev_ops = { .ndo_start_xmit = macsec_start_xmit, .ndo_get_stats64 = macsec_get_stats64, .ndo_get_iflink = macsec_get_iflink, - .ndo_get_lock_subclass = macsec_get_nest_level, }; static const struct device_type macsec_type = { @@ -3289,8 +3282,6 @@ static int macsec_newlink(struct net *net, struct net_device *dev, if (err < 0) return err; - macsec->nest_level = dev_get_nest_level(real_dev) + 1; - err = netdev_upper_dev_link(real_dev, dev, extack); if (err < 0) goto unregister; diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c index dae368a2e8d1..2c14bc606514 100644 --- a/drivers/net/macvlan.c +++ b/drivers/net/macvlan.c @@ -867,11 +867,6 @@ static int macvlan_do_ioctl(struct net_device *dev, struct ifreq *ifr, int cmd) #define MACVLAN_STATE_MASK \ ((1<<__LINK_STATE_NOCARRIER) | (1<<__LINK_STATE_DORMANT)) -static int macvlan_get_nest_level(struct net_device *dev) -{ - return ((struct macvlan_dev *)netdev_priv(dev))->nest_level; -} - static void macvlan_dev_set_lockdep_one(struct net_device *dev, struct netdev_queue *txq, void *_unused) @@ -1180,7 +1175,6 @@ static const struct net_device_ops macvlan_netdev_ops = { .ndo_fdb_add = macvlan_fdb_add, .ndo_fdb_del = macvlan_fdb_del, .ndo_fdb_dump = ndo_dflt_fdb_dump, - .ndo_get_lock_subclass = macvlan_get_nest_level, #ifdef CONFIG_NET_POLL_CONTROLLER .ndo_poll_controller = macvlan_dev_poll_controller, .ndo_netpoll_setup = macvlan_dev_netpoll_setup, @@ 
-1464,7 +1458,6 @@ int macvlan_common_newlink(struct net *src_net, struct net_device *dev, vlan->dev = dev; vlan->port = port; vlan->set_features = MACVLAN_FEATURES; - vlan->nest_level = dev_get_nest_level(lowerdev) + 1; vlan->mode = MACVLAN_MODE_VEPA; if (data && data[IFLA_MACVLAN_MODE]) diff --git a/include/linux/if_macvlan.h b/include/linux/if_macvlan.h index ea5b41823287..e9202edcf101 100644 --- a/include/linux/if_macvlan.h +++ b/include/linux/if_macvlan.h @@ -29,7 +29,6 @@ struct macvlan_dev { netdev_features_t set_features; enum macvlan_mode mode; u16 flags; - int nest_level; unsigned int macaddr_count; struct lock_class_key xmit_lock_key; struct lock_class_key addr_lock_key; diff --git a/include/linux/if_vlan.h b/include/linux/if_vlan.h index 1aed9f613e90..6f30284a58e5 100644 --- a/include/linux/if_vlan.h +++ b/include/linux/if_vlan.h @@ -182,8 +182,6 @@ struct vlan_dev_priv { #ifdef CONFIG_NET_POLL_CONTROLLER struct netpoll *netpoll; #endif - unsigned int nest_level; - struct lock_class_key xmit_lock_key; struct lock_class_key addr_lock_key; }; @@ -224,11 +222,6 @@ extern void vlan_vids_del_by_dev(struct net_device *dev, extern bool vlan_uses_dev(const struct net_device *dev); -static inline int vlan_get_encap_level(struct net_device *dev) -{ - BUG_ON(!is_vlan_dev(dev)); - return vlan_dev_priv(dev)->nest_level; -} #else static inline struct net_device * __vlan_find_dev_deep_rcu(struct net_device *real_dev, @@ -298,11 +291,6 @@ static inline bool vlan_uses_dev(const struct net_device *dev) { return false; } -static inline int vlan_get_encap_level(struct net_device *dev) -{ - BUG(); - return 0; -} #endif /** diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index 309ae000bae7..e13db714ee85 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -1408,7 +1408,6 @@ struct net_device_ops { void (*ndo_dfwd_del_station)(struct net_device *pdev, void *priv); - int (*ndo_get_lock_subclass)(struct net_device *dev); int 
(*ndo_set_tx_maxrate)(struct net_device *dev, int queue_index, u32 maxrate); @@ -4047,16 +4046,6 @@ static inline void netif_addr_lock(struct net_device *dev) spin_lock(&dev->addr_list_lock); } -static inline void netif_addr_lock_nested(struct net_device *dev) -{ - int subclass = SINGLE_DEPTH_NESTING; - - if (dev->netdev_ops->ndo_get_lock_subclass) - subclass = dev->netdev_ops->ndo_get_lock_subclass(dev); - - spin_lock_nested(&dev->addr_list_lock, subclass); -} - static inline void netif_addr_lock_bh(struct net_device *dev) { spin_lock_bh(&dev->addr_list_lock); @@ -4334,7 +4323,6 @@ void netdev_lower_state_changed(struct net_device *lower_dev, extern u8 netdev_rss_key[NETDEV_RSS_KEY_LEN] __read_mostly; void netdev_rss_key_fill(void *buffer, size_t len); -int dev_get_nest_level(struct net_device *dev); int skb_checksum_help(struct sk_buff *skb); int skb_crc32c_csum_help(struct sk_buff *skb); int skb_csum_hwoffload_help(struct sk_buff *skb, diff --git a/include/net/bonding.h b/include/net/bonding.h index c39ac7061e41..74f41dd73866 100644 --- a/include/net/bonding.h +++ b/include/net/bonding.h @@ -203,7 +203,6 @@ struct bonding { struct slave __rcu *primary_slave; struct bond_up_slave __rcu *slave_arr; /* Array of usable slaves */ bool force_primary; - u32 nest_level; s32 slave_cnt; /* never change this value outside the attach/detach wrappers */ int (*recv_probe)(const struct sk_buff *, struct bonding *, struct slave *); diff --git a/net/8021q/vlan.c b/net/8021q/vlan.c index 54728d2eda18..d4bcfd8f95bf 100644 --- a/net/8021q/vlan.c +++ b/net/8021q/vlan.c @@ -172,7 +172,6 @@ int register_vlan_dev(struct net_device *dev, struct netlink_ext_ack *extack) if (err < 0) goto out_uninit_mvrp; - vlan->nest_level = dev_get_nest_level(real_dev) + 1; err = register_netdevice(dev); if (err < 0) goto out_uninit_mvrp; diff --git a/net/8021q/vlan_dev.c b/net/8021q/vlan_dev.c index 12bc80650087..e8707827540c 100644 --- a/net/8021q/vlan_dev.c +++ b/net/8021q/vlan_dev.c @@ -514,11 
+514,6 @@ static void vlan_dev_set_lockdep_class(struct net_device *dev) netdev_for_each_tx_queue(dev, vlan_dev_set_lockdep_one, NULL); } -static int vlan_dev_get_lock_subclass(struct net_device *dev) -{ - return vlan_dev_priv(dev)->nest_level; -} - static const struct header_ops vlan_header_ops = { .create = vlan_dev_hard_header, .parse = eth_header_parse, @@ -814,7 +809,6 @@ static const struct net_device_ops vlan_netdev_ops = { .ndo_netpoll_cleanup = vlan_dev_netpoll_cleanup, #endif .ndo_fix_features = vlan_dev_fix_features, - .ndo_get_lock_subclass = vlan_dev_get_lock_subclass, .ndo_get_iflink = vlan_dev_get_iflink, }; diff --git a/net/core/dev.c b/net/core/dev.c index ac055b531c96..73a69a7a3553 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -7510,26 +7510,6 @@ void *netdev_lower_dev_get_private(struct net_device *dev, } EXPORT_SYMBOL(netdev_lower_dev_get_private); - -int dev_get_nest_level(struct net_device *dev) -{ - struct net_device *lower = NULL; - struct list_head *iter; - int max_nest = -1; - int nest; - - ASSERT_RTNL(); - - netdev_for_each_lower_dev(dev, lower, iter) { - nest = dev_get_nest_level(lower); - if (max_nest < nest) - max_nest = nest; - } - - return max_nest + 1; -} -EXPORT_SYMBOL(dev_get_nest_level); - /** * netdev_lower_change - Dispatch event about lower device state change * @lower_dev: device diff --git a/net/core/dev_addr_lists.c b/net/core/dev_addr_lists.c index 6393ba930097..2f949b5a1eb9 100644 --- a/net/core/dev_addr_lists.c +++ b/net/core/dev_addr_lists.c @@ -637,7 +637,7 @@ int dev_uc_sync(struct net_device *to, struct net_device *from) if (to->addr_len != from->addr_len) return -EINVAL; - netif_addr_lock_nested(to); + netif_addr_lock(to); err = __hw_addr_sync(&to->uc, &from->uc, to->addr_len); if (!err) __dev_set_rx_mode(to); @@ -667,7 +667,7 @@ int dev_uc_sync_multiple(struct net_device *to, struct net_device *from) if (to->addr_len != from->addr_len) return -EINVAL; - netif_addr_lock_nested(to); + netif_addr_lock(to); err = 
__hw_addr_sync_multiple(&to->uc, &from->uc, to->addr_len); if (!err) __dev_set_rx_mode(to); @@ -691,7 +691,7 @@ void dev_uc_unsync(struct net_device *to, struct net_device *from) return; netif_addr_lock_bh(from); - netif_addr_lock_nested(to); + netif_addr_lock(to); __hw_addr_unsync(&to->uc, &from->uc, to->addr_len); __dev_set_rx_mode(to); netif_addr_unlock(to); @@ -858,7 +858,7 @@ int dev_mc_sync(struct net_device *to, struct net_device *from) if (to->addr_len != from->addr_len) return -EINVAL; - netif_addr_lock_nested(to); + netif_addr_lock(to); err = __hw_addr_sync(&to->mc, &from->mc, to->addr_len); if (!err) __dev_set_rx_mode(to); @@ -888,7 +888,7 @@ int dev_mc_sync_multiple(struct net_device *to, struct net_device *from) if (to->addr_len != from->addr_len) return -EINVAL; - netif_addr_lock_nested(to); + netif_addr_lock(to); err = __hw_addr_sync_multiple(&to->mc, &from->mc, to->addr_len); if (!err) __dev_set_rx_mode(to); @@ -912,7 +912,7 @@ void dev_mc_unsync(struct net_device *to, struct net_device *from) return; netif_addr_lock_bh(from); - netif_addr_lock_nested(to); + netif_addr_lock(to); __hw_addr_unsync(&to->mc, &from->mc, to->addr_len); __dev_set_rx_mode(to); netif_addr_unlock(to); diff --git a/net/smc/smc_core.c b/net/smc/smc_core.c index 4ca50ddf8d16..a2e91b8d04b3 100644 --- a/net/smc/smc_core.c +++ b/net/smc/smc_core.c @@ -558,7 +558,7 @@ int smc_vlan_by_tcpsk(struct socket *clcsock, struct smc_init_info *ini) } rtnl_lock(); - nest_lvl = dev_get_nest_level(ndev); + nest_lvl = ndev->lower_level; for (i = 0; i < nest_lvl; i++) { struct list_head *lower = &ndev->adj_list.lower; diff --git a/net/smc/smc_pnet.c b/net/smc/smc_pnet.c index bab2da8cf17a..2920b006f65c 100644 --- a/net/smc/smc_pnet.c +++ b/net/smc/smc_pnet.c @@ -718,7 +718,7 @@ static struct net_device *pnet_find_base_ndev(struct net_device *ndev) int i, nest_lvl; rtnl_lock(); - nest_lvl = dev_get_nest_level(ndev); + nest_lvl = ndev->lower_level; for (i = 0; i < nest_lvl; i++) { struct list_head 
*lower = &ndev->adj_list.lower;