[RFC,2/3] usbnet: Avoid potential races in usbnet_deferred_kevent()

Message ID 20170919161522.995-2-dianders@chromium.org
State RFC, archived
Delegated to: David Miller
Series [RFC,1/3] usbnet: Get rid of spammy usbnet "kevent X may have been dropped"

Commit Message

Doug Anderson Sept. 19, 2017, 4:15 p.m. UTC
In general when you've got a flag communicating that "something needs
to be done" you want to clear that flag _before_ doing the task.  If
you clear the flag _after_ doing the task you end up with the risk
that this will happen:

1. Requester sets flag saying task A needs to be done.
2. Worker comes and starts doing task A.
3. Worker finishes task A but hasn't yet cleared the flag.
4. Requester wants to set flag saying task A needs to be done again.
5. Worker clears the flag without doing anything.

Let's make the usbnet codebase consistently clear the flag _before_ it
does the requested work.  That way if there's another request to do
the work while the work is already in progress it won't be lost.

NOTES:
- No known bugs are fixed by this; it's just found by code inspection.
- This changes the semantics in some of the error conditions.
  -> If we fail to clear the "tx halt" or "rx halt" we still clear the
     flag and thus won't retry the clear next time we happen to be in
     the work function.  Had the old code really wanted to retry these
     events it should have re-scheduled the worker anyway.
  -> If we fail to allocate memory in usb_alloc_urb() we will still
     clear the EVENT_RX_MEMORY flag.  This makes it consistent with
     how we would deal with other failures, including failure to
     allocate a memory chunk in rx_submit().  It can also be noted
     that usb_alloc_urb() in this case is allocating much less than 4K
     worth of data and probably never fails.

Signed-off-by: Douglas Anderson <dianders@chromium.org>
---

 drivers/net/usb/usbnet.c | 50 +++++++++++++++++++++---------------------------
 1 file changed, 22 insertions(+), 28 deletions(-)

Comments

Oliver Neukum Sept. 19, 2017, 8:37 p.m. UTC | #1
On Tuesday, 19.09.2017, at 09:15 -0700, Douglas Anderson wrote:
> In general when you've got a flag communicating that "something needs
> to be done" you want to clear that flag _before_ doing the task.  If
> you clear the flag _after_ doing the task you end up with the risk
> that this will happen:
> 
> 1. Requester sets flag saying task A needs to be done.
> 2. Worker comes and starts doing task A.
> 3. Worker finishes task A but hasn't yet cleared the flag.
> 4. Requester wants to set flag saying task A needs to be done again.
> 5. Worker clears the flag without doing anything.
> 
> Let's make the usbnet codebase consistently clear the flag _before_ it
> does the requested work.  That way if there's another request to do
> the work while the work is already in progress it won't be lost.
> 
> NOTES:
> - No known bugs are fixed by this; it's just found by code inspection.

Hi,

unfortunately the patch is wrong. The flags must be cleared only
in case the handler is successful. That is not guaranteed.

	Regards
		Oliver

NACK
Guenter Roeck Sept. 19, 2017, 8:51 p.m. UTC | #2
On Tue, Sep 19, 2017 at 1:37 PM, Oliver Neukum <oneukum@suse.com> wrote:
> On Tuesday, 19.09.2017, at 09:15 -0700, Douglas Anderson wrote:
>> In general when you've got a flag communicating that "something needs
>> to be done" you want to clear that flag _before_ doing the task.  If
>> you clear the flag _after_ doing the task you end up with the risk
>> that this will happen:
>>
>> 1. Requester sets flag saying task A needs to be done.
>> 2. Worker comes and starts doing task A.
>> 3. Worker finishes task A but hasn't yet cleared the flag.
>> 4. Requester wants to set flag saying task A needs to be done again.
>> 5. Worker clears the flag without doing anything.
>>
>> Let's make the usbnet codebase consistently clear the flag _before_ it
>> does the requested work.  That way if there's another request to do
>> the work while the work is already in progress it won't be lost.
>>
>> NOTES:
>> - No known bugs are fixed by this; it's just found by code inspection.
>
> Hi,
>
> unfortunately the patch is wrong. The flags must be cleared only
> in case the handler is successful. That is not guaranteed.
>

Just out of curiosity, what is the retry mechanism? Whenever a new,
possibly unrelated, event is scheduled?

Thanks,
Guenter
Doug Anderson Sept. 19, 2017, 8:53 p.m. UTC | #3
Hi,

On Tue, Sep 19, 2017 at 1:37 PM, Oliver Neukum <oneukum@suse.com> wrote:
> On Tuesday, 19.09.2017, at 09:15 -0700, Douglas Anderson wrote:
>> In general when you've got a flag communicating that "something needs
>> to be done" you want to clear that flag _before_ doing the task.  If
>> you clear the flag _after_ doing the task you end up with the risk
>> that this will happen:
>>
>> 1. Requester sets flag saying task A needs to be done.
>> 2. Worker comes and starts doing task A.
>> 3. Worker finishes task A but hasn't yet cleared the flag.
>> 4. Requester wants to set flag saying task A needs to be done again.
>> 5. Worker clears the flag without doing anything.
>>
>> Let's make the usbnet codebase consistently clear the flag _before_ it
>> does the requested work.  That way if there's another request to do
>> the work while the work is already in progress it won't be lost.
>>
>> NOTES:
>> - No known bugs are fixed by this; it's just found by code inspection.
>
> Hi,
>
> unfortunately the patch is wrong. The flags must be cleared only
> in case the handler is successful. That is not guaranteed.
>
>         Regards
>                 Oliver
>
> NACK

OK, thanks for reviewing!  I definitely wasn't super confident about
the patch (hence the RFC).

Do you think that the races I identified are possible to hit?  In
other words: should I try to rework the patch somehow or just drop it?
Originally I had the patch setting the flags back to true in the
failure cases, but then I convinced myself that wasn't needed.  I can
certainly go back and try it that way...

-Doug
Oliver Neukum Sept. 20, 2017, 8:23 a.m. UTC | #4
On Tuesday, 19.09.2017, at 13:51 -0700, Guenter Roeck wrote:
> On Tue, Sep 19, 2017 at 1:37 PM, Oliver Neukum <oneukum@suse.com> wrote:
> > 
> > On Tuesday, 19.09.2017, at 09:15 -0700, Douglas Anderson wrote:
> > > 
[..]
> > > NOTES:
> > > - No known bugs are fixed by this; it's just found by code inspection.
> > 
> > Hi,
> > 
> > unfortunately the patch is wrong. The flags must be cleared only
> > in case the handler is successful. That is not guaranteed.
> > 
> 
> Just out of curiosity, what is the retry mechanism? Whenever a new,
> possibly unrelated, event is scheduled?

Hi,

that actually depends on the flag.
Look at the case of fail_lowmem. There we reschedule.

	HTH
		Oliver
Oliver Neukum Sept. 20, 2017, 8:25 a.m. UTC | #5
On Tuesday, 19.09.2017, at 13:53 -0700, Doug Anderson wrote:
> Hi,
> 
> On Tue, Sep 19, 2017 at 1:37 PM, Oliver Neukum <oneukum@suse.com> wrote:
> > 
> > On Tuesday, 19.09.2017, at 09:15 -0700, Douglas Anderson wrote:
> > > 
> > > In general when you've got a flag communicating that "something needs
> > > to be done" you want to clear that flag _before_ doing the task.  If
> > > you clear the flag _after_ doing the task you end up with the risk
> > > that this will happen:
> > > 
> > > 1. Requester sets flag saying task A needs to be done.
> > > 2. Worker comes and starts doing task A.
> > > 3. Worker finishes task A but hasn't yet cleared the flag.
> > > 4. Requester wants to set flag saying task A needs to be done again.
> > > 5. Worker clears the flag without doing anything.
> > > 
> > > Let's make the usbnet codebase consistently clear the flag _before_ it
> > > does the requested work.  That way if there's another request to do
> > > the work while the work is already in progress it won't be lost.
> > > 
> > > NOTES:
> > > - No known bugs are fixed by this; it's just found by code inspection.
> > 
> > Hi,
> > 
> > unfortunately the patch is wrong. The flags must be cleared only
> > in case the handler is successful. That is not guaranteed.
> > 
> >         Regards
> >                 Oliver
> > 
> > NACK
> 
> OK, thanks for reviewing!  I definitely wasn't super confident about
> the patch (hence the RFC).
> 
> Do you think that the races I identified are possible to hit?  In

As far as I can tell, we are safe, but you are right to say that the
driver is not quite clean at that point.

> other words: should I try to rework the patch somehow or just drop it?
>  Originally I had the patch setting the flags back to true in the
> failure cases, but then I convinced myself that wasn't needed.  I can
> certainly go back and try it that way...

Setting the flags again in the error case would certainly be an
improvement. I'd be happy with a patch doing that.

	Regards
		Oliver

Patch

diff --git a/drivers/net/usb/usbnet.c b/drivers/net/usb/usbnet.c
index a3e8dbaadcf9..e72547d8d0e6 100644
--- a/drivers/net/usb/usbnet.c
+++ b/drivers/net/usb/usbnet.c
@@ -1103,8 +1103,6 @@  static void __handle_link_change(struct usbnet *dev)
 
 	/* hard_mtu or rx_urb_size may change during link change */
 	usbnet_update_max_qlen(dev);
-
-	clear_bit(EVENT_LINK_CHANGE, &dev->flags);
 }
 
 static void usbnet_set_rx_mode(struct net_device *net)
@@ -1118,8 +1116,6 @@  static void __handle_set_rx_mode(struct usbnet *dev)
 {
 	if (dev->driver_info->set_rx_mode)
 		(dev->driver_info->set_rx_mode)(dev);
-
-	clear_bit(EVENT_SET_RX_MODE, &dev->flags);
 }
 
 /* work that cannot be done in interrupt context uses keventd.
@@ -1135,7 +1131,7 @@  usbnet_deferred_kevent (struct work_struct *work)
 	int			status;
 
 	/* usb_clear_halt() needs a thread context */
-	if (test_bit (EVENT_TX_HALT, &dev->flags)) {
+	if (test_and_clear_bit (EVENT_TX_HALT, &dev->flags)) {
 		unlink_urbs (dev, &dev->txq);
 		status = usb_autopm_get_interface(dev->intf);
 		if (status < 0)
@@ -1150,12 +1146,11 @@  usbnet_deferred_kevent (struct work_struct *work)
 				netdev_err(dev->net, "can't clear tx halt, status %d\n",
 					   status);
 		} else {
-			clear_bit (EVENT_TX_HALT, &dev->flags);
 			if (status != -ESHUTDOWN)
 				netif_wake_queue (dev->net);
 		}
 	}
-	if (test_bit (EVENT_RX_HALT, &dev->flags)) {
+	if (test_and_clear_bit (EVENT_RX_HALT, &dev->flags)) {
 		unlink_urbs (dev, &dev->rxq);
 		status = usb_autopm_get_interface(dev->intf);
 		if (status < 0)
@@ -1170,41 +1165,39 @@  usbnet_deferred_kevent (struct work_struct *work)
 				netdev_err(dev->net, "can't clear rx halt, status %d\n",
 					   status);
 		} else {
-			clear_bit (EVENT_RX_HALT, &dev->flags);
 			tasklet_schedule (&dev->bh);
 		}
 	}
 
 	/* tasklet could resubmit itself forever if memory is tight */
-	if (test_bit (EVENT_RX_MEMORY, &dev->flags)) {
+	if (test_and_clear_bit (EVENT_RX_MEMORY, &dev->flags)) {
 		struct urb	*urb = NULL;
 		int resched = 1;
 
-		if (netif_running (dev->net))
+		if (netif_running (dev->net)) {
 			urb = usb_alloc_urb (0, GFP_KERNEL);
-		else
-			clear_bit (EVENT_RX_MEMORY, &dev->flags);
-		if (urb != NULL) {
-			clear_bit (EVENT_RX_MEMORY, &dev->flags);
-			status = usb_autopm_get_interface(dev->intf);
-			if (status < 0) {
-				usb_free_urb(urb);
-				goto fail_lowmem;
-			}
-			if (rx_submit (dev, urb, GFP_KERNEL) == -ENOLINK)
-				resched = 0;
-			usb_autopm_put_interface(dev->intf);
+			if (urb != NULL) {
+				status = usb_autopm_get_interface(dev->intf);
+				if (status < 0) {
+					usb_free_urb(urb);
+					goto fail_lowmem;
+				}
+				if (rx_submit (dev, urb, GFP_KERNEL) ==
+				    -ENOLINK)
+					resched = 0;
+				usb_autopm_put_interface(dev->intf);
 fail_lowmem:
-			if (resched)
-				tasklet_schedule (&dev->bh);
+				if (resched)
+					tasklet_schedule (&dev->bh);
+			}
 		}
+
 	}
 
-	if (test_bit (EVENT_LINK_RESET, &dev->flags)) {
+	if (test_and_clear_bit (EVENT_LINK_RESET, &dev->flags)) {
 		struct driver_info	*info = dev->driver_info;
 		int			retval = 0;
 
-		clear_bit (EVENT_LINK_RESET, &dev->flags);
 		status = usb_autopm_get_interface(dev->intf);
 		if (status < 0)
 			goto skip_reset;
@@ -1221,13 +1214,14 @@  usbnet_deferred_kevent (struct work_struct *work)
 		}
 
 		/* handle link change from link resetting */
+		clear_bit(EVENT_LINK_CHANGE, &dev->flags);
 		__handle_link_change(dev);
 	}
 
-	if (test_bit (EVENT_LINK_CHANGE, &dev->flags))
+	if (test_and_clear_bit (EVENT_LINK_CHANGE, &dev->flags))
 		__handle_link_change(dev);
 
-	if (test_bit (EVENT_SET_RX_MODE, &dev->flags))
+	if (test_and_clear_bit (EVENT_SET_RX_MODE, &dev->flags))
 		__handle_set_rx_mode(dev);