Message ID | 1267793225-426-1-git-send-email-steve.glendinning@smsc.com |
---|---|
State | Rejected, archived |
Delegated to: | David Miller |
Headers | show |
From: Steve Glendinning <steve.glendinning@smsc.com> Date: Fri, 5 Mar 2010 12:47:05 +0000 > This patch fixes a reproducible null dereference in smsc95xx (and I > suspect others) when the device is removed during a control register > access. This can be reproduced by rapidly plugging and unplugging > the device during its initialisation. > > Signed-off-by: Steve Glendinning <steve.glendinning@smsc.com> The parent shouldn't become NULL until the device is totally quiesced and is no longer accesses. Maybe you can instead fix the smsc95xx driver to abide by this rule instead of adding a conditional check to thousands of other drivers in the tree that do not need this? I really have no intention of adding your change, please fix this properly, thanks. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, 2010-03-05 at 06:39 -0800, David Miller wrote: > From: Steve Glendinning <steve.glendinning@smsc.com> > Date: Fri, 5 Mar 2010 12:47:05 +0000 > > > This patch fixes a reproducible null dereference in smsc95xx (and I > > suspect others) when the device is removed during a control register > > access. This can be reproduced by rapidly plugging and unplugging > > the device during its initialisation. > > > > Signed-off-by: Steve Glendinning <steve.glendinning@smsc.com> > > The parent shouldn't become NULL until the device is totally quiesced > and is no longer accesses. > > Maybe you can instead fix the smsc95xx driver to abide by this rule > instead of adding a conditional check to thousands of other drivers in > the tree that do not need this? > > I really have no intention of adding your change, please fix this > properly, thanks. > -- > To unsubscribe from this list: send the line "unsubscribe netdev" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html Perhaps something like this is appropriate in the mean time: const char *get_netdev_parent_name(const struct net_device *dev) { if (!dev->dev.parent) return "Unparented net_device, please report this"; return netdev->dev.parent; } -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From: Joe Perches <joe@perches.com> Date: Fri, 05 Mar 2010 06:50:20 -0800 > On Fri, 2010-03-05 at 06:39 -0800, David Miller wrote: >> From: Steve Glendinning <steve.glendinning@smsc.com> >> Date: Fri, 5 Mar 2010 12:47:05 +0000 >> >> > This patch fixes a reproducible null dereference in smsc95xx (and I >> > suspect others) when the device is removed during a control register >> > access. This can be reproduced by rapidly plugging and unplugging >> > the device during its initialisation. >> > >> > Signed-off-by: Steve Glendinning <steve.glendinning@smsc.com> >> >> The parent shouldn't become NULL until the device is totally quiesced >> and is no longer accesses. >> >> Maybe you can instead fix the smsc95xx driver to abide by this rule >> instead of adding a conditional check to thousands of other drivers in >> the tree that do not need this? >> >> I really have no intention of adding your change, please fix this >> properly, thanks. >> -- >> To unsubscribe from this list: send the line "unsubscribe netdev" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html > > Perhaps something like this is appropriate in the mean time: > > const char *get_netdev_parent_name(const struct net_device *dev) > { > if (!dev->dev.parent) > return "Unparented net_device, please report this"; > return netdev->dev.parent; > } Yes, but in the smsc95xx driver. :-) -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
David Miller <davem@davemloft.net> wrote on 05/03/2010 14:39:09: > From: Steve Glendinning <steve.glendinning@smsc.com> > Date: Fri, 5 Mar 2010 12:47:05 +0000 > > > This patch fixes a reproducible null dereference in smsc95xx (and I > > suspect others) when the device is removed during a control register > > access. This can be reproduced by rapidly plugging and unplugging > > the device during its initialisation. > > > > Signed-off-by: Steve Glendinning <steve.glendinning@smsc.com> > > The parent shouldn't become NULL until the device is totally quiesced > and is no longer accesses. The failure I'm seeing is caused when the usb device is disconnected. smsc95xx detects that a pending USB control operation failed and tries to print a message via netdev_printk to report this. Unfortunately, something else (the USB subsystem?) has already set parent to null at this time so the netdev_printk causes a null dereference. So netdev_printk suddenly changes from safe to use to unsafe to use? I could change all instances in smsc95xx to defensively check this at each call, but what should I test to see if the device is still valid? Is testing parent != null the correct thing to do? I think other usbnet drivers may have the same problem, but I don't have any hardware to test them. > Maybe you can instead fix the smsc95xx driver to abide by this rule > instead of adding a conditional check to thousands of other drivers in > the tree that do not need this? I agree it'd be better to avoid this check where it's unnecessary, but if netdev_printk isn't necessarily safe to call for *removable* interfaces then shouldn't all such callers be checking that it's safe to do so? Steve -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From: Steve.Glendinning@smsc.com Date: Fri, 5 Mar 2010 15:29:41 +0000 > The failure I'm seeing is caused when the usb device is disconnected. > smsc95xx detects that a pending USB control operation failed > and tries to print a message via netdev_printk to report this. > > Unfortunately, something else (the USB subsystem?) has already set > parent to null at this time so the netdev_printk causes a null > dereference. > > So netdev_printk suddenly changes from safe to use to unsafe to use? It seems to me that really you only need this parent NULL check where you notice the USB control operation failed and want to print a message about that. That should cover all the necessary cases shouldn't it? Even more importantly, why does a USB disconnect NULL out the netdev parent device pointer? Until you actually release this USB device in the driver, the parent pointer should stay there. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
David Miller <davem@davemloft.net> wrote on 05/03/2010 15:43:35: > It seems to me that really you only need this parent NULL check where > you notice the USB control operation failed and want to print a > message about that. > > That should cover all the necessary cases shouldn't it? That's the easily reproducible case - during initialisation there are a lot of register reads and writes, so it's quite easy to trigger by manually unplugging ~500ms after connection. In this case usbnet is still in usbnet_probe. Theoretically though, this device could be disconnected at any time, and there are other places in the code where we print messages not triggered by a usb failure (for example set_multicast). Could it be possibly unsafe to call netdev_printk here? > Even more importantly, why does a USB disconnect NULL out the netdev > parent device pointer? Until you actually release this USB device in > the driver, the parent pointer should stay there. Most of the time it's not nulled out, and the code succesfully prints errors as expected, but maybe 1 time in 20 dev.parent is NULL. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
From: Steve.Glendinning@smsc.com Date: Fri, 5 Mar 2010 16:32:03 +0000 > David Miller <davem@davemloft.net> wrote on 05/03/2010 15:43:35: > >> Even more importantly, why does a USB disconnect NULL out the netdev >> parent device pointer? Until you actually release this USB device in >> the driver, the parent pointer should stay there. > > Most of the time it's not nulled out, and the code succesfully prints > errors as expected, but maybe 1 time in 20 dev.parent is NULL. I think until the device driver puts it's references and whatnot of the device it's driving, that parent pointer should be kept non-NULL. As long as the netdevice exists and is registered, for example, people can get at the parent device chain via SYSFS file accesses and similar. So it seems to me this is a huge problem waiting to happen anyways and this netdev_printk() issue is merely making the problem more obvious :-) -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index c79a88b..250dbc0 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -2132,9 +2132,10 @@ static inline const char *netdev_name(const struct net_device *dev) } #define netdev_printk(level, netdev, format, args...) \ - dev_printk(level, (netdev)->dev.parent, \ + ({ if ((netdev)->dev.parent) \ + dev_printk(level, (netdev)->dev.parent, \ "%s: " format, \ - netdev_name(netdev), ##args) + netdev_name(netdev), ##args); }) #define netdev_emerg(dev, format, args...) \ netdev_printk(KERN_EMERG, dev, format, ##args)
This patch fixes a reproducible null dereference in smsc95xx (and I suspect others) when the device is removed during a control register access. This can be reproduced by rapidly plugging and unplugging the device during its initialisation. Signed-off-by: Steve Glendinning <steve.glendinning@smsc.com> --- include/linux/netdevice.h | 5 +++-- 1 files changed, 3 insertions(+), 2 deletions(-)