Message ID | 20200713162443.2510682-1-olteanv@gmail.com |
---|---|
State | Changes Requested |
Delegated to: | David Miller |
Series | [net] net: dsa: link interfaces with the DSA master to get rid of lockdep warnings |
> diff --git a/net/dsa/slave.c b/net/dsa/slave.c
> index 743caabeaaa6..a951b2a7d79a 100644
> --- a/net/dsa/slave.c
> +++ b/net/dsa/slave.c
> @@ -1994,6 +1994,13 @@ int dsa_slave_create(struct dsa_port *port)
>  			   ret, slave_dev->name);
>  		goto out_phy;
>  	}
> +	rtnl_lock();
> +	ret = netdev_upper_dev_link(master, slave_dev, NULL);
> +	rtnl_unlock();
> +	if (ret) {
> +		unregister_netdevice(slave_dev);
> +		goto out_phy;
> +	}

Hi Vladimir

A common pattern we see in bugs is that the driver sets up something
critical after calling register_netdev(), not realising that that call
can go off and really start using the interface before it returns. So
in general, i like to have register_netdev() last, nothing after it.

Please could you move this before register_netdev().

Thanks
	Andrew
Hi Andrew,

On Mon, Jul 13, 2020 at 06:47:28PM +0200, Andrew Lunn wrote:
> > diff --git a/net/dsa/slave.c b/net/dsa/slave.c
> > index 743caabeaaa6..a951b2a7d79a 100644
> > --- a/net/dsa/slave.c
> > +++ b/net/dsa/slave.c
> > @@ -1994,6 +1994,13 @@ int dsa_slave_create(struct dsa_port *port)
> >  			   ret, slave_dev->name);
> >  		goto out_phy;
> >  	}
> > +	rtnl_lock();
> > +	ret = netdev_upper_dev_link(master, slave_dev, NULL);
> > +	rtnl_unlock();
> > +	if (ret) {
> > +		unregister_netdevice(slave_dev);
> > +		goto out_phy;
> > +	}
>
> Hi Vladimir
>
> A common pattern we see in bugs is that the driver sets up something
> critical after calling register_netdev(), not realising that that call
> can go off and really start using the interface before it returns. So
> in general, i like to have register_netdev() last, nothing after it.
>
> Please could you move this before register_netdev().
>
> Thanks
> 	Andrew

It doesn't work after register_netdev(). The call to
netdev_upper_dev_link() fails and no network interface gets probed. VLAN
performs registration and linkage in the same order:

https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/tree/net/8021q/vlan.c#n175

So I think this part is fine.

Thanks,
-Vladimir
On Mon, Jul 13, 2020 at 08:30:49PM +0300, Vladimir Oltean wrote:
> Hi Andrew,
>
> On Mon, Jul 13, 2020 at 06:47:28PM +0200, Andrew Lunn wrote:
> > > diff --git a/net/dsa/slave.c b/net/dsa/slave.c
> > > index 743caabeaaa6..a951b2a7d79a 100644
> > > --- a/net/dsa/slave.c
> > > +++ b/net/dsa/slave.c
> > > @@ -1994,6 +1994,13 @@ int dsa_slave_create(struct dsa_port *port)
> > >  			   ret, slave_dev->name);
> > >  		goto out_phy;
> > >  	}
> > > +	rtnl_lock();
> > > +	ret = netdev_upper_dev_link(master, slave_dev, NULL);
> > > +	rtnl_unlock();
> > > +	if (ret) {
> > > +		unregister_netdevice(slave_dev);
> > > +		goto out_phy;
> > > +	}
> >
> > Hi Vladimir
> >
> > A common pattern we see in bugs is that the driver sets up something
> > critical after calling register_netdev(), not realising that that call
> > can go off and really start using the interface before it returns. So
> > in general, i like to have register_netdev() last, nothing after it.
> >
> > Please could you move this before register_netdev().
> >
> > Thanks
> > 	Andrew
>
> It doesn't work after register_netdev(). The call to

I mean it doesn't work when netdev_upper_dev_link() is _before_
register_netdev().

> netdev_upper_dev_link() fails and no network interface gets probed. VLAN
> performs registration and linkage in the same order:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/tree/net/8021q/vlan.c#n175
>
> So I think this part is fine.
>
> Thanks,
> -Vladimir
On Mon, Jul 13, 2020 at 08:33:19PM +0300, Vladimir Oltean wrote:
> On Mon, Jul 13, 2020 at 08:30:49PM +0300, Vladimir Oltean wrote:
> > Hi Andrew,
> >
> > On Mon, Jul 13, 2020 at 06:47:28PM +0200, Andrew Lunn wrote:
> > > > diff --git a/net/dsa/slave.c b/net/dsa/slave.c
> > > > index 743caabeaaa6..a951b2a7d79a 100644
> > > > --- a/net/dsa/slave.c
> > > > +++ b/net/dsa/slave.c
> > > > @@ -1994,6 +1994,13 @@ int dsa_slave_create(struct dsa_port *port)
> > > >  			   ret, slave_dev->name);
> > > >  		goto out_phy;
> > > >  	}
> > > > +	rtnl_lock();
> > > > +	ret = netdev_upper_dev_link(master, slave_dev, NULL);
> > > > +	rtnl_unlock();
> > > > +	if (ret) {
> > > > +		unregister_netdevice(slave_dev);
> > > > +		goto out_phy;
> > > > +	}
> > >
> > > Hi Vladimir
> > >
> > > A common pattern we see in bugs is that the driver sets up something
> > > critical after calling register_netdev(), not realising that that call
> > > can go off and really start using the interface before it returns. So
> > > in general, i like to have register_netdev() last, nothing after it.
> > >
> > > Please could you move this before register_netdev().
> > >
> > > Thanks
> > > 	Andrew
> >
> > It doesn't work after register_netdev(). The call to
>
> I mean it doesn't work when netdev_upper_dev_link() is _before_
> register_netdev().
>
> > netdev_upper_dev_link() fails and no network interface gets probed. VLAN
> > performs registration and linkage in the same order:
> >
> > https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/tree/net/8021q/vlan.c#n175
> >
> > So I think this part is fine.
> >
> > Thanks,
> > -Vladimir

One difference from VLAN is that in that case, the entire
register_vlan_device() function runs under RTNL.
When those bugs that you talk about are found, who starts using the
network interface too early? User space or someone else? Would RTNL be
enough to avoid that?

Thanks,
-Vladimir
+ Jiri,

On 7/13/2020 9:24 AM, Vladimir Oltean wrote:
> Since commit 845e0ebb4408 ("net: change addr_list_lock back to static
> key"), cascaded DSA setups (DSA switch port as DSA master for another
> DSA switch port) are emitting this lockdep warning:
>
> ============================================
> WARNING: possible recursive locking detected
> 5.8.0-rc1-00133-g923e4b5032dd-dirty #208 Not tainted
> --------------------------------------------
> dhcpcd/323 is trying to acquire lock:
> ffff000066dd4268 (&dsa_master_addr_list_lock_key/1){+...}-{2:2}, at: dev_mc_sync+0x44/0x90
>
> but task is already holding lock:
> ffff00006608c268 (&dsa_master_addr_list_lock_key/1){+...}-{2:2}, at: dev_mc_sync+0x44/0x90
>
> other info that might help us debug this:
>  Possible unsafe locking scenario:
>
>        CPU0
>        ----
>   lock(&dsa_master_addr_list_lock_key/1);
>   lock(&dsa_master_addr_list_lock_key/1);
>
>  *** DEADLOCK ***
>
>  May be due to missing lock nesting notation
>
> 3 locks held by dhcpcd/323:
>  #0: ffffdbd1381dda18 (rtnl_mutex){+.+.}-{3:3}, at: rtnl_lock+0x24/0x30
>  #1: ffff00006614b268 (_xmit_ETHER){+...}-{2:2}, at: dev_set_rx_mode+0x28/0x48
>  #2: ffff00006608c268 (&dsa_master_addr_list_lock_key/1){+...}-{2:2}, at: dev_mc_sync+0x44/0x90
>
> stack backtrace:
> Call trace:
>  dump_backtrace+0x0/0x1e0
>  show_stack+0x20/0x30
>  dump_stack+0xec/0x158
>  __lock_acquire+0xca0/0x2398
>  lock_acquire+0xe8/0x440
>  _raw_spin_lock_nested+0x64/0x90
>  dev_mc_sync+0x44/0x90
>  dsa_slave_set_rx_mode+0x34/0x50
>  __dev_set_rx_mode+0x60/0xa0
>  dev_mc_sync+0x84/0x90
>  dsa_slave_set_rx_mode+0x34/0x50
>  __dev_set_rx_mode+0x60/0xa0
>  dev_set_rx_mode+0x30/0x48
>  __dev_open+0x10c/0x180
>  __dev_change_flags+0x170/0x1c8
>  dev_change_flags+0x2c/0x70
>  devinet_ioctl+0x774/0x878
>  inet_ioctl+0x348/0x3b0
>  sock_do_ioctl+0x50/0x310
>  sock_ioctl+0x1f8/0x580
>  ksys_ioctl+0xb0/0xf0
>  __arm64_sys_ioctl+0x28/0x38
>  el0_svc_common.constprop.0+0x7c/0x180
>  do_el0_svc+0x2c/0x98
>  el0_sync_handler+0x9c/0x1b8
>  el0_sync+0x158/0x180
>
> Since DSA never made use of the netdev API for describing links between
> upper devices and lower devices, the dev->lower_level value of a DSA
> switch interface would be 1, which would warn when it is a DSA master.
>
> We can use netdev_upper_dev_link() to describe the relationship between
> a DSA slave and a DSA master. To be precise, a DSA "slave" (switch port)
> is an "upper" to a DSA "master" (host port). The relationship is "many
> uppers to one lower", like in the case of VLAN. So, for that reason, we
> use the same function as VLAN uses.
>
> Since this warning was not there when lockdep was using dynamic keys for
> addr_list_lock, we are blaming the lockdep patch itself. The network
> stack _has_ been using static lockdep keys before, and it _is_ likely
> that stacked DSA setups have been triggering these lockdep warnings
> since forever, however I can't test very old kernels on this particular
> stacked DSA setup, to ensure I'm not in fact introducing regressions.
>
> Fixes: 845e0ebb4408 ("net: change addr_list_lock back to static key")
> Suggested-by: Cong Wang <xiyou.wangcong@gmail.com>
> Signed-off-by: Vladimir Oltean <olteanv@gmail.com>

Jiri suggested not doing this a few years ago, but I do not remember the
reasons why he advised against doing it. Jiri does your objection still
stand today?

> ---
>  net/dsa/slave.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
>
> diff --git a/net/dsa/slave.c b/net/dsa/slave.c
> index 743caabeaaa6..a951b2a7d79a 100644
> --- a/net/dsa/slave.c
> +++ b/net/dsa/slave.c
> @@ -1994,6 +1994,13 @@ int dsa_slave_create(struct dsa_port *port)
>  			   ret, slave_dev->name);
>  		goto out_phy;
>  	}
> +	rtnl_lock();
> +	ret = netdev_upper_dev_link(master, slave_dev, NULL);
> +	rtnl_unlock();
> +	if (ret) {
> +		unregister_netdevice(slave_dev);
> +		goto out_phy;
> +	}
>
>  	return 0;
>
> @@ -2013,11 +2020,13 @@ int dsa_slave_create(struct dsa_port *port)
>
>  void dsa_slave_destroy(struct net_device *slave_dev)
>  {
> +	struct net_device *master = dsa_slave_to_master(slave_dev);
>  	struct dsa_port *dp = dsa_slave_to_port(slave_dev);
>  	struct dsa_slave_priv *p = netdev_priv(slave_dev);
>
>  	netif_carrier_off(slave_dev);
>  	rtnl_lock();
> +	netdev_upper_dev_unlink(master, slave_dev);
>  	phylink_disconnect_phy(dp->pl);
>  	rtnl_unlock();
> One difference from VLAN is that in that case, the entire
> register_vlan_device() function runs under RTNL.
> When those bugs that you talk about are found, who starts using the
> network interface too early? User space or someone else? Would RTNL be
> enough to avoid that?

NFS root. Registering the interface causes autoconfig to start,
sending a DHCP request, or if the IP addresses are fixed, it could
send an ARP for the NFS server.

It is just nice to have if it is before register_netdev(). I don't
think there is an actual issue in this case; being able to
send/receive packets should not depend on the upper/lower linkage for
DSA.

	Andrew
From: Vladimir Oltean <olteanv@gmail.com>
Date: Mon, 13 Jul 2020 20:42:27 +0300

> One difference from VLAN is that in that case, the entire
> register_vlan_device() function runs under RTNL.
> When those bugs that you talk about are found, who starts using the
> network interface too early? User space or someone else? Would RTNL be
> enough to avoid that?

As soon as the notifier is emitted by register_netdev(), userspace
components such as NetworkManager can and do ifup the device immediately.
On Mon, 13 Jul 2020 19:24:43 +0300 Vladimir Oltean wrote:
> diff --git a/net/dsa/slave.c b/net/dsa/slave.c
> index 743caabeaaa6..a951b2a7d79a 100644
> --- a/net/dsa/slave.c
> +++ b/net/dsa/slave.c
> @@ -1994,6 +1994,13 @@ int dsa_slave_create(struct dsa_port *port)
>  			   ret, slave_dev->name);
>  		goto out_phy;
>  	}
> +	rtnl_lock();
> +	ret = netdev_upper_dev_link(master, slave_dev, NULL);
> +	rtnl_unlock();
> +	if (ret) {
> +		unregister_netdevice(slave_dev);

The error handling here looks sketchy.

First of all please move this unregister to the error path below, not
inside the body of the if.

Secondly as a rule of thumb the error path should resemble the destroy
function. Here we have:

	unregister_netdevice(slave_dev);
out_phy:
	rtnl_lock();
	phylink_disconnect_phy(p->dp->pl);
	rtnl_unlock();
	phylink_destroy(p->dp->pl);
out_gcells:
	gro_cells_destroy(&p->gcells);
out_free:
	free_percpu(p->stats64);
	free_netdev(slave_dev);
	port->slave = NULL;
	return ret;

vs.

	netif_carrier_off(slave_dev);
	rtnl_lock();
	phylink_disconnect_phy(dp->pl);
	rtnl_unlock();

	dsa_slave_notify(slave_dev, DSA_PORT_UNREGISTER);
	unregister_netdev(slave_dev);
	phylink_destroy(dp->pl);
	gro_cells_destroy(&p->gcells);
	free_percpu(p->stats64);
	free_netdev(slave_dev);

Ordering is different, plus you're missing the dsa_slave_notify() and
netif_carrier_off().

> +		goto out_phy;
> +	}
>
>  	return 0;
>
> @@ -2013,11 +2020,13 @@ int dsa_slave_create(struct dsa_port *port)
>
>  void dsa_slave_destroy(struct net_device *slave_dev)
>  {
> +	struct net_device *master = dsa_slave_to_master(slave_dev);
>  	struct dsa_port *dp = dsa_slave_to_port(slave_dev);
>  	struct dsa_slave_priv *p = netdev_priv(slave_dev);
>
>  	netif_carrier_off(slave_dev);
>  	rtnl_lock();
> +	netdev_upper_dev_unlink(master, slave_dev);
>  	phylink_disconnect_phy(dp->pl);
>  	rtnl_unlock();
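The rule of thumb in the review above (undo in the error path, via labels, in the same order the destroy function uses) can be sketched in plain userspace C. This is only an illustration and not the DSA code: toy_create(), toy_destroy(), step() and the step names are hypothetical stand-ins that record the order in which they run.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical trace buffer: every stub appends its name here. */
static char trace[256];

static void step(const char *name)
{
	assert(strlen(trace) + strlen(name) + 2 <= sizeof(trace));
	strcat(trace, name);
	strcat(trace, ";");
}

/*
 * Toy create path. fail_at selects which setup step "fails"
 * (0 means none). Every failure jumps to a label, and the labels
 * undo the completed steps in reverse order of setup.
 */
static int toy_create(int fail_at)
{
	trace[0] = '\0';

	step("alloc");

	if (fail_at == 1)		/* phy connect failed */
		goto out_free;
	step("phy_connect");

	if (fail_at == 2)		/* register failed */
		goto out_phy;
	step("register");

	if (fail_at == 3)		/* link failed: unregister lives below */
		goto out_unregister;
	step("link");

	return 0;

out_unregister:
	step("unregister");
out_phy:
	step("phy_disconnect");
out_free:
	step("free");
	return -1;
}

/* Toy destroy path: same teardown order as the labels above. */
static void toy_destroy(void)
{
	trace[0] = '\0';
	step("unlink");
	step("unregister");
	step("phy_disconnect");
	step("free");
}
```

With this shape, a failure of the last setup step runs exactly the tail of toy_destroy(), so the two paths are easy to compare side by side and harder to let drift apart.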
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 743caabeaaa6..a951b2a7d79a 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -1994,6 +1994,13 @@ int dsa_slave_create(struct dsa_port *port)
 			   ret, slave_dev->name);
 		goto out_phy;
 	}
+	rtnl_lock();
+	ret = netdev_upper_dev_link(master, slave_dev, NULL);
+	rtnl_unlock();
+	if (ret) {
+		unregister_netdevice(slave_dev);
+		goto out_phy;
+	}

 	return 0;

@@ -2013,11 +2020,13 @@ int dsa_slave_create(struct dsa_port *port)

 void dsa_slave_destroy(struct net_device *slave_dev)
 {
+	struct net_device *master = dsa_slave_to_master(slave_dev);
 	struct dsa_port *dp = dsa_slave_to_port(slave_dev);
 	struct dsa_slave_priv *p = netdev_priv(slave_dev);

 	netif_carrier_off(slave_dev);
 	rtnl_lock();
+	netdev_upper_dev_unlink(master, slave_dev);
 	phylink_disconnect_phy(dp->pl);
 	rtnl_unlock();
Since commit 845e0ebb4408 ("net: change addr_list_lock back to static
key"), cascaded DSA setups (DSA switch port as DSA master for another
DSA switch port) are emitting this lockdep warning:

============================================
WARNING: possible recursive locking detected
5.8.0-rc1-00133-g923e4b5032dd-dirty #208 Not tainted
--------------------------------------------
dhcpcd/323 is trying to acquire lock:
ffff000066dd4268 (&dsa_master_addr_list_lock_key/1){+...}-{2:2}, at: dev_mc_sync+0x44/0x90

but task is already holding lock:
ffff00006608c268 (&dsa_master_addr_list_lock_key/1){+...}-{2:2}, at: dev_mc_sync+0x44/0x90

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&dsa_master_addr_list_lock_key/1);
  lock(&dsa_master_addr_list_lock_key/1);

 *** DEADLOCK ***

 May be due to missing lock nesting notation

3 locks held by dhcpcd/323:
 #0: ffffdbd1381dda18 (rtnl_mutex){+.+.}-{3:3}, at: rtnl_lock+0x24/0x30
 #1: ffff00006614b268 (_xmit_ETHER){+...}-{2:2}, at: dev_set_rx_mode+0x28/0x48
 #2: ffff00006608c268 (&dsa_master_addr_list_lock_key/1){+...}-{2:2}, at: dev_mc_sync+0x44/0x90

stack backtrace:
Call trace:
 dump_backtrace+0x0/0x1e0
 show_stack+0x20/0x30
 dump_stack+0xec/0x158
 __lock_acquire+0xca0/0x2398
 lock_acquire+0xe8/0x440
 _raw_spin_lock_nested+0x64/0x90
 dev_mc_sync+0x44/0x90
 dsa_slave_set_rx_mode+0x34/0x50
 __dev_set_rx_mode+0x60/0xa0
 dev_mc_sync+0x84/0x90
 dsa_slave_set_rx_mode+0x34/0x50
 __dev_set_rx_mode+0x60/0xa0
 dev_set_rx_mode+0x30/0x48
 __dev_open+0x10c/0x180
 __dev_change_flags+0x170/0x1c8
 dev_change_flags+0x2c/0x70
 devinet_ioctl+0x774/0x878
 inet_ioctl+0x348/0x3b0
 sock_do_ioctl+0x50/0x310
 sock_ioctl+0x1f8/0x580
 ksys_ioctl+0xb0/0xf0
 __arm64_sys_ioctl+0x28/0x38
 el0_svc_common.constprop.0+0x7c/0x180
 do_el0_svc+0x2c/0x98
 el0_sync_handler+0x9c/0x1b8
 el0_sync+0x158/0x180

Since DSA never made use of the netdev API for describing links between
upper devices and lower devices, the dev->lower_level value of a DSA
switch interface would be 1, which would warn when it is a DSA master.

We can use netdev_upper_dev_link() to describe the relationship between
a DSA slave and a DSA master. To be precise, a DSA "slave" (switch port)
is an "upper" to a DSA "master" (host port). The relationship is "many
uppers to one lower", like in the case of VLAN. So, for that reason, we
use the same function as VLAN uses.

Since this warning was not there when lockdep was using dynamic keys for
addr_list_lock, we are blaming the lockdep patch itself. The network
stack _has_ been using static lockdep keys before, and it _is_ likely
that stacked DSA setups have been triggering these lockdep warnings
since forever, however I can't test very old kernels on this particular
stacked DSA setup, to ensure I'm not in fact introducing regressions.

Fixes: 845e0ebb4408 ("net: change addr_list_lock back to static key")
Suggested-by: Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by: Vladimir Oltean <olteanv@gmail.com>
---
 net/dsa/slave.c | 9 +++++++++
 1 file changed, 9 insertions(+)
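To make the lower_level argument in the commit message concrete, here is a small userspace model. It is a sketch under assumptions, not kernel code: struct toy_dev, toy_upper_dev_link() and toy_lower_level() are hypothetical, and the model assumes that the lockdep nesting subclass of addr_list_lock follows a per-device "lower level" that is 1 for a device with nothing linked below it and grows by one per linked layer, as the "/1" subclass in the warning and the description above suggest.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LOWERS 4

/* Toy stand-in for a network device; not the kernel's struct net_device. */
struct toy_dev {
	const char *name;
	struct toy_dev *lowers[MAX_LOWERS];
	size_t n_lowers;
};

/*
 * Record that 'upper' is stacked on top of 'lower'. The argument order
 * mirrors the patch: lower device (DSA master) first, upper (slave) second.
 */
static void toy_upper_dev_link(struct toy_dev *lower, struct toy_dev *upper)
{
	assert(upper->n_lowers < MAX_LOWERS);
	upper->lowers[upper->n_lowers++] = lower;
}

/* 1 for a device with no known lowers, else 1 + the deepest lower. */
static int toy_lower_level(const struct toy_dev *dev)
{
	int deepest = 0;

	for (size_t i = 0; i < dev->n_lowers; i++) {
		int level = toy_lower_level(dev->lowers[i]);

		if (level > deepest)
			deepest = level;
	}
	return deepest + 1;
}
```

Take three hypothetical devices in a cascaded setup: eth0 (host port), sw0p1 (a port of the first switch that is also the DSA master of the second) and sw1p1 (a port of the second switch). With no links recorded, all three report level 1, so the two nested addr_list_lock acquisitions in the trace share one subclass. After toy_upper_dev_link(&eth0, &sw0p1) and toy_upper_dev_link(&sw0p1, &sw1p1), the levels become 1, 2 and 3, so each nested acquisition gets a distinct subclass.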