Message ID | 1449573462-28417-1-git-send-email-peter@lekensteyn.nl |
---|---|
State | Accepted, archived |
Delegated to: | David Miller |
Headers | show |
Peter Wu [mailto:peter@lekensteyn.nl] > Sent: Tuesday, December 08, 2015 7:18 PM > > When an interface is brought up which was previously suspended (via > runtime PM), it would hang. This happens because napi_disable is called > before napi_enable. > > Solve this by avoiding napi_enable in the resume during open function > (netif_running is true when open is called, IFF_UP is set after a > successful open; netif_running is false when close is called, but IFF_UP > is then still set). > > While at it, remove WORK_ENABLE check from rtl8152_open (introduced with > the original change) because it cannot happen: > > - After this patch, runtime resume will not set it during rtl8152_open. > - When link is up, rtl8152_open is not called. > - When link is down during system/auto suspend/resume, it is not set. > > Fixes: 41cec84cf285 ("r8152: don't enable napi before rx ready") > Link: https://lkml.kernel.org/r/20151205105912.GA1766@al > Signed-off-by: Peter Wu <peter@lekensteyn.nl> > --- > v2: moved rtl_runtime_suspend_enable from close to rtl8152_suspend Acked-by: Hayes Wang <hayeswang@realtek.com> Best Regards, Hayes -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Dec 08, 2015 at 12:39:12PM +0000, Hayes Wang wrote: > Peter Wu [mailto:peter@lekensteyn.nl] > > Sent: Tuesday, December 08, 2015 7:18 PM > > > > When an interface is brought up which was previously suspended (via > > runtime PM), it would hang. This happens because napi_disable is called > > before napi_enable. > > > > Solve this by avoiding napi_enable in the resume during open function > > (netif_running is true when open is called, IFF_UP is set after a > > successful open; netif_running is false when close is called, but IFF_UP > > is then still set). > > > > While at it, remove WORK_ENABLE check from rtl8152_open (introduced with > > the original change) because it cannot happen: > > > > - After this patch, runtime resume will not set it during rtl8152_open. > > - When link is up, rtl8152_open is not called. > > - When link is down during system/auto suspend/resume, it is not set. > > > > Fixes: 41cec84cf285 ("r8152: don't enable napi before rx ready") > > Link: https://lkml.kernel.org/r/20151205105912.GA1766@al > > Signed-off-by: Peter Wu <peter@lekensteyn.nl> > > --- > > v2: moved rtl_runtime_suspend_enable from close to rtl8152_suspend > > Acked-by: Hayes Wang <hayeswang@realtek.com> > > Best Regards, > Hayes I found another problem with runtime PM. When a device is suspended via autosuspend and a system suspend takes place, there is no network I/O after resume. Triggering a renegotiation (ethtool -r eth1) brings back network activity. Can you have a look? I guess that reset_resume needs different treatment, but don't know how to do it properly as suspend is not called before system reset (because the device is apparently already in suspended state). I think that this new issue can be addressed in a different patch, this patch solves a more severe lockup. Opinions?
From: Peter Wu <peter@lekensteyn.nl> Date: Tue, 8 Dec 2015 12:17:42 +0100 > When an interface is brought up which was previously suspended (via > runtime PM), it would hang. This happens because napi_disable is called > before napi_enable. > > Solve this by avoiding napi_enable in the resume during open function > (netif_running is true when open is called, IFF_UP is set after a > successful open; netif_running is false when close is called, but IFF_UP > is then still set). > > While at it, remove WORK_ENABLE check from rtl8152_open (introduced with > the original change) because it cannot happen: > > - After this patch, runtime resume will not set it during rtl8152_open. > - When link is up, rtl8152_open is not called. > - When link is down during system/auto suspend/resume, it is not set. > > Fixes: 41cec84cf285 ("r8152: don't enable napi before rx ready") > Link: https://lkml.kernel.org/r/20151205105912.GA1766@al > Signed-off-by: Peter Wu <peter@lekensteyn.nl> > --- > v2: moved rtl_runtime_suspend_enable from close to rtl8152_suspend Applied, and queued up for -stable, thanks. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, 2015-12-08 at 15:33 +0100, Peter Wu wrote: > Can you have a look? I guess that reset_resume needs different > treatment, but don't know how to do it properly as suspend is not > called > before system reset (because the device is apparently already in > suspended state). > > I think that this new issue can be addressed in a different patch, > this > patch solves a more severe lockup. Opinions? We do not know in advance whether resume() or reset_resume() will be called after S3. A driver must be ready for both. Regards Oliver -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Peter Wu [mailto:peter@lekensteyn.nl] > Sent: Tuesday, December 08, 2015 10:33 PM [...] > I found another problem with runtime PM. When a device is suspended via > autosuspend and a system suspend takes place, there is no network I/O > after resume. Triggering a renegotiation (ethtool -r eth1) brings back > network activity. I think it is relative to the firmware. Could you try the driver from Realtek website? Best Regards, Hayes -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, 2015-12-22 at 09:48 +0000, Hayes Wang wrote: > Peter Wu [mailto:peter@lekensteyn.nl] > > Sent: Tuesday, December 08, 2015 10:33 PM > [...] > > I found another problem with runtime PM. When a device is suspended via > > autosuspend and a system suspend takes place, there is no network I/O > > after resume. Triggering a renegotiation (ethtool -r eth1) brings back > > network activity. > > I think it is relative to the firmware. Could you try the driver from Realtek website? Hi, at the risk of repeating myself I must say that there is a logic flaw in the driver. If you look at this code: static int rtl8152_resume(struct usb_interface *intf) { struct r8152 *tp = usb_get_intfdata(intf); mutex_lock(&tp->control); if (!test_bit(SELECTIVE_SUSPEND, &tp->flags)) { tp->rtl_ops.init(tp); netif_device_attach(tp->netdev); } if (netif_running(tp->netdev) && tp->netdev->flags & IFF_UP) { if (test_bit(SELECTIVE_SUSPEND, &tp->flags)) { rtl_runtime_suspend_enable(tp, false); clear_bit(SELECTIVE_SUSPEND, &tp->flags); napi_disable(&tp->napi); set_bit(WORK_ENABLE, &tp->flags); if (netif_carrier_ok(tp->netdev)) rtl_start_rx(tp); napi_enable(&tp->napi); } else { tp->rtl_ops.up(tp); rtl8152_set_speed(tp, AUTONEG_ENABLE, tp->mii.supports_gmii ? SPEED_1000 : SPEED_100, DUPLEX_FULL); netif_carrier_off(tp->netdev); set_bit(WORK_ENABLE, &tp->flags); } You need to understand that its use of the flag SELECTIVE_SUSPEND is invalid. SELECTIVE_SUSPEND is used at two places in the driver. Once in rtl8152_start_xmit(), where it is working but misnamed. At that time you need to know whether the device is suspended. To the device it does not matter whether the suspension is selective or for the whole bus. It cannot tell. The driver just needs to know whether it should resume the device if a packet to be transmitted through it is given to the driver. So far all is well. But at the time rtl8152_resume() is called the flag has become meaningless. It tells you whether the device was selectively suspended (we call that autosuspended, but the concept is the same), but you take it to mean that it was _only_ selectively suspended. It does not tell you that. That matters a lot because the behavior of the host regarding powering the bus during S3 and S4 is not defined. As I mentioned the device can't tell whether it is selectively suspended. But it does notice if its power supply is cut. In that case the driver must reinitialize the device. The current code does that conditional on !test_bit(SELECTIVE_SUSPEND ... ) That is wrong because you need to do this if power was lost. The test only tells you whether the device was selectively suspend before power was lost (if power was lost). That is not the same thing at all. The way the USB subsystem is designed is that it tells you whether power had been cut by calling reset_resume() if power was cut or resume() if power was kept. If reset_resume() is called you must always execute this code: tp->rtl_ops.init(tp); and tp->rtl_ops.up(tp); rtl8152_set_speed(tp, AUTONEG_ENABLE, tp->mii.supports_gmii ? SPEED_1000 : SPEED_100, DUPLEX_FULL); The conditions used in rtl8152_resume() are wrong. That is the reason "ethtool -r eth1" is reported to restore the device. It triggers equivalent operations. It is clear to me that you cannot get away with using the same operation for resume() and reset_resume() in your driver. It is fundamentally impossible. Firmware cannot fix it. Sorry for the length of the explanation. HTH Oliver -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Oliver Neukum [mailto:oneukum@suse.com] [...] > It is clear to me that you cannot get away with using the same operation > for resume() and reset_resume() in your driver. It is fundamentally > impossible. Firmware cannot fix it. I would think how to fix it. > Sorry for the length of the explanation. Thanks for your response. I have some questions. What are the flows when the system resume follows a system suspend which follows a autosuspend? Are they as following? 1. suspend() with PMSG_IS_AUTO for autosuspned. 2. suspend() for system suspend. 3. resume() for system resume. And, should the device exist autosuspend before (2)? Best Regards, Hayes
On Wed, 2015-12-23 at 03:31 +0000, Hayes Wang wrote: > Oliver Neukum [mailto:oneukum@suse.com] > [...] > > It is clear to me that you cannot get away with using the same operation > > for resume() and reset_resume() in your driver. It is fundamentally > > impossible. Firmware cannot fix it. > > I would think how to fix it. > > > Sorry for the length of the explanation. > > Thanks for your response. I have some questions. What are the flows when > the system resume follows a system suspend which follows a autosuspend? > Are they as following? > > 1. suspend() with PMSG_IS_AUTO for autosuspned. > 2. suspend() for system suspend. > 3. resume() for system resume. > > And, should the device exist autosuspend before (2)? No, step (2) does not exist. Calls to suspend() and [reset_]resume() always balance. Usually a driver shouldn't care about system suspend. The way the driver is currently coded will also fail for Port-Power Off. Regards Oliver -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Oliver Neukum [mailto:oneukum@suse.de] > Sent: Wednesday, December 23, 2015 4:20 PM [...] > No, step (2) does not exist. Calls to suspend() and [reset_]resume() > always balance. Usually a driver shouldn't care about system suspend. > The way the driver is currently coded will also fail for Port-Power Off. It is different with Windows. The Windows would resume the device before system suspend, if the system suspend follows the autosuspend. Would this be a problem? After system suspend, the device may wake up the system when receiving any packet, not only magic packet. The wake events are different for system suspend and autosuspend. However, I couldn't change the wake event, because the autosuspend occurs first, and the suspend() is only called once. Best Regards, Hayes
On Wed, 2015-12-23 at 09:20 +0000, Hayes Wang wrote: > Oliver Neukum [mailto:oneukum@suse.de] > > Sent: Wednesday, December 23, 2015 4:20 PM > [...] > > No, step (2) does not exist. Calls to suspend() and [reset_]resume() > > always balance. Usually a driver shouldn't care about system suspend. > > The way the driver is currently coded will also fail for Port-Power Off. > > It is different with Windows. The Windows would resume the device before > system suspend, if the system suspend follows the autosuspend. > > Would this be a problem? After system suspend, the device may wake up > the system when receiving any packet, not only magic packet. The wake > events are different for system suspend and autosuspend. However, I > couldn't change the wake event, because the autosuspend occurs first, > and the suspend() is only called once. That is indeed a problem and I need to think a bit about finding a good solution. If you are happy with an inelegant solution, you can use a pm_notifier, which will tell you that the system is going to suspend. This is documented: https://www.kernel.org/doc/Documentation/power/notifiers.txt HTH Oliver -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, 23 Dec 2015, Hayes Wang wrote: > Oliver Neukum [mailto:oneukum@suse.de] > > Sent: Wednesday, December 23, 2015 4:20 PM > [...] > > No, step (2) does not exist. Calls to suspend() and [reset_]resume() > > always balance. Usually a driver shouldn't care about system suspend. > > The way the driver is currently coded will also fail for Port-Power Off. > > It is different with Windows. The Windows would resume the device before > system suspend, if the system suspend follows the autosuspend. > > Would this be a problem? After system suspend, the device may wake up > the system when receiving any packet, not only magic packet. The wake > events are different for system suspend and autosuspend. However, I > couldn't change the wake event, because the autosuspend occurs first, > and the suspend() is only called once. I don't understand why the wakeup conditions are different. It seems to me that the choice of which packets will generate a wakeup ought to depend on the user's selection, not on the kind of suspend. For instance, if the user says that only a magic packet should cause a wakeup then that should be true for both runtime suspend and system suspend. To put it another way, as far as the device is concerned a suspend is just a suspend -- there's no different between a runtime suspend and a system suspend. Alan Stern -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, 2015-12-23 at 20:32 -0500, Alan Stern wrote: > I don't understand why the wakeup conditions are different. It seems > to me that the choice of which packets will generate a wakeup ought to > depend on the user's selection, not on the kind of suspend. For > instance, if the user says that only a magic packet should cause a > wakeup then that should be true for both runtime suspend and system > suspend. > > To put it another way, as far as the device is concerned a suspend is > just a suspend -- there's no different between a runtime suspend and a > system suspend. This literally true, but the host and the driver care. If we autosuspend a running network device, any packet (maybe filtered for MAC) should cause a remote wake up, else we'd lose packets. But you cannot keep that setting if the system goes down or any broadcast packet would resume the whole system. Yet you cannot just disable remote wake up, as WoL packages still must trigger a remote wake up. So there are drivers which must change settings on devices as the system goes to sleep, even if their devices have already been autosuspended. We could use the notifier chains for that. But can this solution be called elegant? Merry Christmas Oliver -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, 24 Dec 2015, Oliver Neukum wrote: > On Wed, 2015-12-23 at 20:32 -0500, Alan Stern wrote: > > > I don't understand why the wakeup conditions are different. It seems > > to me that the choice of which packets will generate a wakeup ought to > > depend on the user's selection, not on the kind of suspend. For > > instance, if the user says that only a magic packet should cause a > > wakeup then that should be true for both runtime suspend and system > > suspend. > > > > To put it another way, as far as the device is concerned a suspend is > > just a suspend -- there's no different between a runtime suspend and a > > system suspend. > > This literally true, but the host and the driver care. > If we autosuspend a running network device, any packet > (maybe filtered for MAC) should cause a remote wake up, > else we'd lose packets. That's also true during system suspend. > But you cannot keep that setting if the system goes down > or any broadcast packet would resume the whole system. > Yet you cannot just disable remote wake up, as WoL packages > still must trigger a remote wake up. This means that sometimes you want to avoid losing packets and other times you do want to lose packets. That is a policy decision, and therefore it should be made by the user, not the kernel. > So there are drivers which must change settings on devices > as the system goes to sleep, even if their devices have > already been autosuspended. We could use the notifier chains > for that. But can this solution be called elegant? Instead of the driver trying to do this automatically, you could rely on userspace telling the driver which packets should cause a wakeup. The setting could be updated immediately before and after each system suspend. I admit this is more awkward than having the driver make a choice based on the type of suspend. This is a case where the resources provided by the PM core aren't adequate for what the driver needs. The PM core distinguishes between wakeup enabled or disabled; it doesn't distinguish among different levels of wakekup. Alan Stern -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, 2015-12-24 at 10:14 -0500, Alan Stern wrote: > On Thu, 24 Dec 2015, Oliver Neukum wrote: > > > On Wed, 2015-12-23 at 20:32 -0500, Alan Stern wrote: > > But you cannot keep that setting if the system goes down > > or any broadcast packet would resume the whole system. > > Yet you cannot just disable remote wake up, as WoL packages > > still must trigger a remote wake up. > > This means that sometimes you want to avoid losing packets and other > times you do want to lose packets. That is a policy decision, and > therefore it should be made by the user, not the kernel. Indeed it is and there is a tool for this with a defined interface called "ethtool" The problem here is not the policy decision, but implementing it in kernel space. > > So there are drivers which must change settings on devices > > as the system goes to sleep, even if their devices have > > already been autosuspended. We could use the notifier chains > > for that. But can this solution be called elegant? > > Instead of the driver trying to do this automatically, you could rely > on userspace telling the driver which packets should cause a wakeup. It does. > The setting could be updated immediately before and after each system > suspend. The API is so that user space sets the policy, which persists until user space changes the setting and the kernel implements it. The problem is that to do so the kernel needs to do IO to the device as the system is about to suspend. Thus the driver may need to resume the device and it needs to learn that the system is about to go to sleep, even if the device it manages is already autosuspended. > I admit this is more awkward than having the driver make a choice based > on the type of suspend. This is a case where the resources provided by It also is a race condition, unless you want user space to disable autosuspend as the system is about to go to sleep. And it makes relatively little sense, as enabling remote wakeup is the last thing we do before the device suspends. Setting the filters long before that doesn't make much sense. > the PM core aren't adequate for what the driver needs. The PM core > distinguishes between wakeup enabled or disabled; it doesn't > distinguish among different levels of wakekup. True and sanely it cannot. We could only distinguish between drivers which need their devices to be resumed before the system suspends and the rest. Or we tell driver coders to use the notifier chains. Regards Oliver -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, 24 Dec 2015, Oliver Neukum wrote: > On Thu, 2015-12-24 at 10:14 -0500, Alan Stern wrote: > > On Thu, 24 Dec 2015, Oliver Neukum wrote: > > > > > On Wed, 2015-12-23 at 20:32 -0500, Alan Stern wrote: > > > > But you cannot keep that setting if the system goes down > > > or any broadcast packet would resume the whole system. > > > Yet you cannot just disable remote wake up, as WoL packages > > > still must trigger a remote wake up. > > > > This means that sometimes you want to avoid losing packets and other > > times you do want to lose packets. That is a policy decision, and > > therefore it should be made by the user, not the kernel. > > Indeed it is and there is a tool for this with a defined > interface called "ethtool" No; ethtool affects the wakeup setting for system suspend, but not for runtime suspend. I was referring to something that would specify the setting for both cases. But perhaps that doesn't make sense, because you never want to drop relevant packets during runtime suspend. If you did, you would run "ifconfig down" instead. > > the PM core aren't adequate for what the driver needs. The PM core > > distinguishes between wakeup enabled or disabled; it doesn't > > distinguish among different levels of wakekup. > > True and sanely it cannot. We could only distinguish between drivers > which need their devices to be resumed before the system suspends and > the rest. > Or we tell driver coders to use the notifier chains. "Resume before system suspend" sounds like a reasonable thing to implement, for devices that have multiple levels of wakeup settings. Would you like to post a proposal on linux-pm for this? Alan Stern -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c index d9427ca..2e32c41 100644 --- a/drivers/net/usb/r8152.c +++ b/drivers/net/usb/r8152.c @@ -3067,17 +3067,6 @@ static int rtl8152_open(struct net_device *netdev) mutex_lock(&tp->control); - /* The WORK_ENABLE may be set when autoresume occurs */ - if (test_bit(WORK_ENABLE, &tp->flags)) { - clear_bit(WORK_ENABLE, &tp->flags); - usb_kill_urb(tp->intr_urb); - cancel_delayed_work_sync(&tp->schedule); - - /* disable the tx/rx, if the workqueue has enabled them. */ - if (netif_carrier_ok(netdev)) - tp->rtl_ops.disable(tp); - } - tp->rtl_ops.up(tp); rtl8152_set_speed(tp, AUTONEG_ENABLE, @@ -3124,12 +3113,6 @@ static int rtl8152_close(struct net_device *netdev) } else { mutex_lock(&tp->control); - /* The autosuspend may have been enabled and wouldn't - * be disable when autoresume occurs, because the - * netif_running() would be false. - */ - rtl_runtime_suspend_enable(tp, false); - tp->rtl_ops.down(tp); mutex_unlock(&tp->control); @@ -3512,7 +3495,7 @@ static int rtl8152_resume(struct usb_interface *intf) netif_device_attach(tp->netdev); } - if (netif_running(tp->netdev)) { + if (netif_running(tp->netdev) && tp->netdev->flags & IFF_UP) { if (test_bit(SELECTIVE_SUSPEND, &tp->flags)) { rtl_runtime_suspend_enable(tp, false); clear_bit(SELECTIVE_SUSPEND, &tp->flags); @@ -3532,6 +3515,8 @@ static int rtl8152_resume(struct usb_interface *intf) } usb_submit_urb(tp->intr_urb, GFP_KERNEL); } else if (test_bit(SELECTIVE_SUSPEND, &tp->flags)) { + if (tp->netdev->flags & IFF_UP) + rtl_runtime_suspend_enable(tp, false); clear_bit(SELECTIVE_SUSPEND, &tp->flags); }
When an interface is brought up which was previously suspended (via runtime PM), it would hang. This happens because napi_disable is called before napi_enable. Solve this by avoiding napi_enable in the resume during open function (netif_running is true when open is called, IFF_UP is set after a successful open; netif_running is false when close is called, but IFF_UP is then still set). While at it, remove WORK_ENABLE check from rtl8152_open (introduced with the original change) because it cannot happen: - After this patch, runtime resume will not set it during rtl8152_open. - When link is up, rtl8152_open is not called. - When link is down during system/auto suspend/resume, it is not set. Fixes: 41cec84cf285 ("r8152: don't enable napi before rx ready") Link: https://lkml.kernel.org/r/20151205105912.GA1766@al Signed-off-by: Peter Wu <peter@lekensteyn.nl> --- v2: moved rtl_runtime_suspend_enable from close to rtl8152_suspend Tested with the scenario from https://lkml.kernel.org/r/20151208111008.GA18728@al (added debug prints to verify the suspend/resume moments) --- drivers/net/usb/r8152.c | 21 +++------------------ 1 file changed, 3 insertions(+), 18 deletions(-)