Message ID | 1461188301-4466-1-git-send-email-fruggeri@arista.com |
---|---|
State | Changes Requested, archived |
Delegated to: | David Miller |
Headers | show |
Francesco Ruggeri <fruggeri@arista.com> writes: > If macvlan_common_newlink fails in register_netdevice after macvlan_init > then it decrements port->count twice, first in macvlan_uninit (from > register_netdevice or rollback_registered) and then again in > macvlan_common_newlink. > A similar problem may exist in the ipvlan driver. > This patch consolidates modifications to port->count into macvlan_init > and macvlan_uninit (thanks to Eric Biederman for suggesting this approach). > In macvtap_device_event it also avoids cleaning up in NETDEV_UNREGISTER > if NETDEV_REGISTER had previously failed. > > Signed-off-by: Francesco Ruggeri <fruggeri@arista.com> The macvlan_init bits obviously corect. The macvtap bits I don't really understand what is going on. > diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c > index 95394ed..e770221 100644 > --- a/drivers/net/macvtap.c > +++ b/drivers/net/macvtap.c > @@ -1303,6 +1303,8 @@ static int macvtap_device_event(struct notifier_block *unused, > } > break; > case NETDEV_UNREGISTER: > + if (vlan->minor == 0) > + break; I don't understand this bit. A minor of 0 is never assigned. That is clear from the code. On what code path can you get here without assigning a minor? My gut says this either needs a big fat comment explaining what is going on, or a slight refactoring of the code (like moving port count increments into macvlan_init) that makes it so this code path can't happen. > devt = MKDEV(MAJOR(macvtap_major), vlan->minor); > device_destroy(macvtap_class, devt); > macvtap_free_minor(vlan); Eric
On Thu, Apr 21, 2016 at 10:44 AM, Eric W. Biederman <ebiederm@xmission.com> wrote: < >> diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c >> index 95394ed..e770221 100644 >> --- a/drivers/net/macvtap.c >> +++ b/drivers/net/macvtap.c >> @@ -1303,6 +1303,8 @@ static int macvtap_device_event(struct notifier_block *unused, >> } >> break; >> case NETDEV_UNREGISTER: >> + if (vlan->minor == 0) >> + break; > > I don't understand this bit. A minor of 0 is never assigned. That is > clear from the code. On what code path can you get here without > assigning a minor? > You can have vlan->minor == 0 if macvtap_device_event(NETDEV_REGISTER) failed. macvtap_device_event is invoked in the context of macvtap_newlink and it can fail if for example a macvtap interface using the same ifindex already exists in a different namespace. That is how we originally ran into the port->count issue. In that case the sequence is macvtap_newlink macvlan_common_newlink register_netdevice call_netdevice_notifiers(NETDEV_REGISTER, dev) macvtap_device_event(NETDEV_REGISTER) <fail here, vlan->minor = 0> rollback_registered(dev); rollback_registered_many call_netdevice_notifiers(NETDEV_UNREGISTER, dev); macvtap_device_event(NETDEV_UNREGISTER) <nothing to do here> Should this bit go into a separate patch? Would a comment like this help: /* We can have vlan->minor == 0 if NETDEV_REGISTER above failed */ Marc Angel posted https://marc.info/?l=linux-netdev&m=146116146925511&w=2 about conflicts if one tries to create macvtap interfaces with the same ifindex in different namespaces. Thanks, Francesco > > Eric
Francesco Ruggeri <fruggeri@arista.com> writes: > On Thu, Apr 21, 2016 at 10:44 AM, Eric W. Biederman > <ebiederm@xmission.com> wrote: > < >>> diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c >>> index 95394ed..e770221 100644 >>> --- a/drivers/net/macvtap.c >>> +++ b/drivers/net/macvtap.c >>> @@ -1303,6 +1303,8 @@ static int macvtap_device_event(struct notifier_block *unused, >>> } >>> break; >>> case NETDEV_UNREGISTER: >>> + if (vlan->minor == 0) >>> + break; >> >> I don't understand this bit. A minor of 0 is never assigned. That is >> clear from the code. On what code path can you get here without >> assigning a minor? >> > > You can have vlan->minor == 0 if macvtap_device_event(NETDEV_REGISTER) > failed. > > macvtap_device_event is invoked in the context of macvtap_newlink and > it can fail if for example a macvtap interface using the same ifindex > already exists in a different namespace. That is how we originally ran > into the port->count issue. > In that case the sequence is > > macvtap_newlink > macvlan_common_newlink > register_netdevice > call_netdevice_notifiers(NETDEV_REGISTER, dev) > macvtap_device_event(NETDEV_REGISTER) > <fail here, vlan->minor = 0> > rollback_registered(dev); > rollback_registered_many > call_netdevice_notifiers(NETDEV_UNREGISTER, dev); > macvtap_device_event(NETDEV_UNREGISTER) > <nothing to do here> > > Should this bit go into a separate patch? > Would a comment like this help: > > /* We can have vlan->minor == 0 if NETDEV_REGISTER above failed */ > > Marc Angel posted https://marc.info/?l=linux-netdev&m=146116146925511&w=2 > about conflicts if one tries to create macvtap interfaces with the same > ifindex in different namespaces. Just for clarity I would recommend splitting the two changes. One change that fixes macvlan_init to do the right thing. Another change that includes your backtrace above, adds the comment and fixes macvlan to ignore that case. Arguably we should be fixing register_netdevice to call call_netdeivce_notifiers(NETDEV_UNREIGSTER) if call_netdevice_notifiers(NETDEV_REGISTER) failed. Having a separate patch will at least allow us to look at this second issue all by itself. Thank you for your patience in all of this. Eric
diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c index 2bcf1f3..cb01023 100644 --- a/drivers/net/macvlan.c +++ b/drivers/net/macvlan.c @@ -795,6 +795,7 @@ static int macvlan_init(struct net_device *dev) { struct macvlan_dev *vlan = netdev_priv(dev); const struct net_device *lowerdev = vlan->lowerdev; + struct macvlan_port *port = vlan->port; dev->state = (dev->state & ~MACVLAN_STATE_MASK) | (lowerdev->state & MACVLAN_STATE_MASK); @@ -812,6 +813,8 @@ static int macvlan_init(struct net_device *dev) if (!vlan->pcpu_stats) return -ENOMEM; + port->count += 1; + return 0; } @@ -1312,10 +1315,9 @@ int macvlan_common_newlink(struct net *src_net, struct net_device *dev, return err; } - port->count += 1; err = register_netdevice(dev); if (err < 0) - goto destroy_port; + return err; dev->priv_flags |= IFF_MACVLAN; err = netdev_upper_dev_link(lowerdev, dev); @@ -1330,10 +1332,6 @@ int macvlan_common_newlink(struct net *src_net, struct net_device *dev, unregister_netdev: unregister_netdevice(dev); -destroy_port: - port->count -= 1; - if (!port->count) - macvlan_port_destroy(lowerdev); return err; } diff --git a/drivers/net/macvtap.c b/drivers/net/macvtap.c index 95394ed..e770221 100644 --- a/drivers/net/macvtap.c +++ b/drivers/net/macvtap.c @@ -1303,6 +1303,8 @@ static int macvtap_device_event(struct notifier_block *unused, } break; case NETDEV_UNREGISTER: + if (vlan->minor == 0) + break; devt = MKDEV(MAJOR(macvtap_major), vlan->minor); device_destroy(macvtap_class, devt); macvtap_free_minor(vlan);
If macvlan_common_newlink fails in register_netdevice after macvlan_init then it decrements port->count twice, first in macvlan_uninit (from register_netdevice or rollback_registered) and then again in macvlan_common_newlink. A similar problem may exist in the ipvlan driver. This patch consolidates modifications to port->count into macvlan_init and macvlan_uninit (thanks to Eric Biederman for suggesting this approach). In macvtap_device_event it also avoids cleaning up in NETDEV_UNREGISTER if NETDEV_REGISTER had previously failed. Signed-off-by: Francesco Ruggeri <fruggeri@arista.com> --- drivers/net/macvlan.c | 10 ++++------ drivers/net/macvtap.c | 2 ++ 2 files changed, 6 insertions(+), 6 deletions(-)