Message ID | 20241023172745.181265-1-jdamato@fastly.com |
---|---|
State | Accepted |
Delegated to: | Anthony Nguyen |
Series | [iwl-net,v2] e1000: Hold RTNL when e1000_down can be called |
On 10/23/2024 10:27 AM, Joe Damato wrote:
> e1000_down calls netif_queue_set_napi, which assumes that RTNL is held.
>
> There are a few paths for e1000_down to be called in e1000 where RTNL is
> not currently being held:
>   - e1000_shutdown (pci shutdown)
>   - e1000_suspend (power management)
>   - e1000_reinit_locked (via e1000_reset_task delayed work)
>   - e1000_io_error_detected (via pci error handler)
>
> Hold RTNL in three places to fix this issue:
>   - e1000_reset_task: igc, igb, and e1000e all hold rtnl in this path.
>   - e1000_io_error_detected (pci error handler): e1000e and ixgbe hold
>     rtnl in this path. A patch has been posted for igc to do the same
>     [1].
>   - __e1000_shutdown (which is called from both e1000_shutdown and
>     e1000_suspend): igb, ixgbe, and e1000e all hold rtnl in the same
>     path.
>
> The other paths which call e1000_down seemingly hold RTNL and are OK:
>   - e1000_close (ndo_stop)
>   - e1000_change_mtu (ndo_change_mtu)
>
> Based on the above analysis and mailing list discussion [2], I believe
> adding rtnl in the three places mentioned above is correct.
>
> Fixes: 8f7ff18a5ec7 ("e1000: Link NAPI instances to queues and IRQs")

Hi Joe,

I've applied the patch to iwl-next as this commit hasn't landed in
net/iwl-net yet.

Thanks,
Tony

> Reported-by: Dmitry Antipov <dmantipov@yandex.ru>
> Closes: https://lore.kernel.org/netdev/8cf62307-1965-46a0-a411-ff0080090ff9@yandex.ru/
> Link: https://lore.kernel.org/netdev/20241022215246.307821-3-jdamato@fastly.com/ [1]
> Link: https://lore.kernel.org/netdev/ZxgVRX7Ne-lTjwiJ@LQ3V64L9R2/ [2]
> Signed-off-by: Joe Damato <jdamato@fastly.com>
> ---
>  v2:
>    - No longer an RFC
>    - Include an rtnl_lock/rtnl_unlock in e1000_io_error_detected
>      inspired by ixgbe's implementation of the same
>
>  drivers/net/ethernet/intel/e1000/e1000_main.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/intel/e1000/e1000_main.c b/drivers/net/ethernet/intel/e1000/e1000_main.c
> index 4de9b156b2be..3f089c3d47b2 100644
> --- a/drivers/net/ethernet/intel/e1000/e1000_main.c
> +++ b/drivers/net/ethernet/intel/e1000/e1000_main.c
> @@ -3509,7 +3509,9 @@ static void e1000_reset_task(struct work_struct *work)
>  		container_of(work, struct e1000_adapter, reset_task);
>
>  	e_err(drv, "Reset adapter\n");
> +	rtnl_lock();
>  	e1000_reinit_locked(adapter);
> +	rtnl_unlock();
>  }
>
>  /**
> @@ -5074,7 +5076,9 @@ static int __e1000_shutdown(struct pci_dev *pdev, bool *enable_wake)
>  			usleep_range(10000, 20000);
>
>  		WARN_ON(test_bit(__E1000_RESETTING, &adapter->flags));
> +		rtnl_lock();
>  		e1000_down(adapter);
> +		rtnl_unlock();
>  	}
>
>  	status = er32(STATUS);
> @@ -5235,16 +5239,20 @@ static pci_ers_result_t e1000_io_error_detected(struct pci_dev *pdev,
>  	struct net_device *netdev = pci_get_drvdata(pdev);
>  	struct e1000_adapter *adapter = netdev_priv(netdev);
>
> +	rtnl_lock();
>  	netif_device_detach(netdev);
>
> -	if (state == pci_channel_io_perm_failure)
> +	if (state == pci_channel_io_perm_failure) {
> +		rtnl_unlock();
>  		return PCI_ERS_RESULT_DISCONNECT;
> +	}
>
>  	if (netif_running(netdev))
>  		e1000_down(adapter);
>
>  	if (!test_and_set_bit(__E1000_DISABLED, &adapter->flags))
>  		pci_disable_device(pdev);
> +	rtnl_unlock();
>
>  	/* Request a slot reset. */
>  	return PCI_ERS_RESULT_NEED_RESET;
>
> base-commit: d05596f248578be943015c1237120574a8d845dd
diff --git a/drivers/net/ethernet/intel/e1000/e1000_main.c b/drivers/net/ethernet/intel/e1000/e1000_main.c
index 4de9b156b2be..3f089c3d47b2 100644
--- a/drivers/net/ethernet/intel/e1000/e1000_main.c
+++ b/drivers/net/ethernet/intel/e1000/e1000_main.c
@@ -3509,7 +3509,9 @@ static void e1000_reset_task(struct work_struct *work)
 		container_of(work, struct e1000_adapter, reset_task);
 
 	e_err(drv, "Reset adapter\n");
+	rtnl_lock();
 	e1000_reinit_locked(adapter);
+	rtnl_unlock();
 }
 
 /**
@@ -5074,7 +5076,9 @@ static int __e1000_shutdown(struct pci_dev *pdev, bool *enable_wake)
 			usleep_range(10000, 20000);
 
 		WARN_ON(test_bit(__E1000_RESETTING, &adapter->flags));
+		rtnl_lock();
 		e1000_down(adapter);
+		rtnl_unlock();
 	}
 
 	status = er32(STATUS);
@@ -5235,16 +5239,20 @@ static pci_ers_result_t e1000_io_error_detected(struct pci_dev *pdev,
 	struct net_device *netdev = pci_get_drvdata(pdev);
 	struct e1000_adapter *adapter = netdev_priv(netdev);
 
+	rtnl_lock();
 	netif_device_detach(netdev);
 
-	if (state == pci_channel_io_perm_failure)
+	if (state == pci_channel_io_perm_failure) {
+		rtnl_unlock();
 		return PCI_ERS_RESULT_DISCONNECT;
+	}
 
 	if (netif_running(netdev))
 		e1000_down(adapter);
 
 	if (!test_and_set_bit(__E1000_DISABLED, &adapter->flags))
 		pci_disable_device(pdev);
+	rtnl_unlock();
 
 	/* Request a slot reset. */
 	return PCI_ERS_RESULT_NEED_RESET;
e1000_down calls netif_queue_set_napi, which assumes that RTNL is held.

There are a few paths for e1000_down to be called in e1000 where RTNL is
not currently being held:
  - e1000_shutdown (pci shutdown)
  - e1000_suspend (power management)
  - e1000_reinit_locked (via e1000_reset_task delayed work)
  - e1000_io_error_detected (via pci error handler)

Hold RTNL in three places to fix this issue:
  - e1000_reset_task: igc, igb, and e1000e all hold rtnl in this path.
  - e1000_io_error_detected (pci error handler): e1000e and ixgbe hold
    rtnl in this path. A patch has been posted for igc to do the same
    [1].
  - __e1000_shutdown (which is called from both e1000_shutdown and
    e1000_suspend): igb, ixgbe, and e1000e all hold rtnl in the same
    path.

The other paths which call e1000_down seemingly hold RTNL and are OK:
  - e1000_close (ndo_stop)
  - e1000_change_mtu (ndo_change_mtu)

Based on the above analysis and mailing list discussion [2], I believe
adding rtnl in the three places mentioned above is correct.

Fixes: 8f7ff18a5ec7 ("e1000: Link NAPI instances to queues and IRQs")
Reported-by: Dmitry Antipov <dmantipov@yandex.ru>
Closes: https://lore.kernel.org/netdev/8cf62307-1965-46a0-a411-ff0080090ff9@yandex.ru/
Link: https://lore.kernel.org/netdev/20241022215246.307821-3-jdamato@fastly.com/ [1]
Link: https://lore.kernel.org/netdev/ZxgVRX7Ne-lTjwiJ@LQ3V64L9R2/ [2]
Signed-off-by: Joe Damato <jdamato@fastly.com>
---
 v2:
   - No longer an RFC
   - Include an rtnl_lock/rtnl_unlock in e1000_io_error_detected
     inspired by ixgbe's implementation of the same

 drivers/net/ethernet/intel/e1000/e1000_main.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

base-commit: d05596f248578be943015c1237120574a8d845dd
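
For readers who want to experiment with the locking shape outside the kernel, the following is a minimal userspace sketch, not driver code. It assumes a pthread mutex as a stand-in for RTNL, and every model_* name is hypothetical. model_down() plays the role of e1000_down and counts a violation whenever it runs without the lock, which loosely mirrors the assumption netif_queue_set_napi makes. The error handler model shows why the early return on permanent failure needs its own unlock.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the kernel's RTNL mutex. */
static pthread_mutex_t model_rtnl = PTHREAD_MUTEX_INITIALIZER;
static bool model_rtnl_held;  /* tracked only so the model can check it */
static int model_violations;  /* times model_down() ran without the lock */

static void model_rtnl_lock(void)
{
	pthread_mutex_lock(&model_rtnl);
	model_rtnl_held = true;
}

static void model_rtnl_unlock(void)
{
	model_rtnl_held = false;
	pthread_mutex_unlock(&model_rtnl);
}

/* Stand-in for e1000_down(): it expects the caller to hold the lock. */
static void model_down(void)
{
	if (!model_rtnl_held)
		model_violations++;
}

/* Shape of the reset work item before the patch: no lock taken. */
static void model_reset_task_unlocked(void)
{
	model_down();
}

/* Shape of the reset work item after the patch: lock around the call. */
static void model_reset_task_locked(void)
{
	model_rtnl_lock();
	model_down();
	model_rtnl_unlock();
}

enum model_result { MODEL_DISCONNECT, MODEL_NEED_RESET };

/*
 * Shape of the error handler after the patch. The lock is taken up
 * front, so every return path, including the early one, must drop it.
 */
static enum model_result model_io_error_detected(bool perm_failure,
						 bool running)
{
	model_rtnl_lock();

	if (perm_failure) {
		model_rtnl_unlock();
		return MODEL_DISCONNECT;
	}

	if (running)
		model_down();

	model_rtnl_unlock();
	return MODEL_NEED_RESET;
}
```

Calling model_reset_task_unlocked() from a small main() bumps model_violations, while the locked variants leave it unchanged. Because the mutex is a default, non-recursive one, dropping the unlock on the early return would make the next model_rtnl_lock() call hang, which makes that mistake easy to notice in the model.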