Message ID | 1392433180-16052-2-git-send-email-mcgrof@do-not-panic.com |
---|---|
State | RFC, archived |
Delegated to: | David Miller |
Headers | show |
On Fri, 2014-02-14 at 18:59 -0800, Luis R. Rodriguez wrote: > From: "Luis R. Rodriguez" <mcgrof@suse.com> > > It doesn't make sense for some interfaces to become a root bridge I think you mean 'root port'. > at any point in time. One example is virtual backend interfaces > which rely on other entities on the bridge for actual physical > connectivity. They only provide virtual access. > > Device drivers that know they should never become part of the > root bridge have been using a trick of setting their MAC address > to a high broadcast MAC address such as FE:FF:FF:FF:FF:FF. Instead > of using these hacks lets the interfaces annotate its intent and > generalizes a solution for multiple drivers, while letting the > drivers use a random MAC address or one prefixed with a proper OUI. > This sort of hack is used by both qemu and xen for their backend > interfaces. > > Cc: Stephen Hemminger <stephen@networkplumber.org> > Cc: bridge@lists.linux-foundation.org > Cc: netdev@vger.kernel.org > Cc: linux-kernel@vger.kernel.org > Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com> > --- > include/uapi/linux/if.h | 1 + > net/bridge/br_if.c | 2 ++ > net/bridge/br_private.h | 1 + > net/bridge/br_stp_if.c | 2 ++ > 4 files changed, 6 insertions(+) > > diff --git a/include/uapi/linux/if.h b/include/uapi/linux/if.h > index d758163..8d10382 100644 > --- a/include/uapi/linux/if.h > +++ b/include/uapi/linux/if.h > @@ -84,6 +84,7 @@ > #define IFF_LIVE_ADDR_CHANGE 0x100000 /* device supports hardware address > * change when it's running */ > #define IFF_MACVLAN 0x200000 /* Macvlan device */ > +#define IFF_BRIDGE_NON_ROOT 0x400000 /* Don't consider for root bridge */ [...] Does it really make sense to add a flag that says exactly which special behaviour you want, or would it be better to define the flag as a passive property, which other drivers/protocols then use as a condition for special behaviour? The fact that you also define the IFF_BRIDGE_SKIP_IP flag, and set it on exactly the same devices, makes me think that they should actually be a single flag. I don't know how that flag should be named or described, though. Ben.
On Fri, 14 Feb 2014 18:59:37 -0800 "Luis R. Rodriguez" <mcgrof@do-not-panic.com> wrote: > From: "Luis R. Rodriguez" <mcgrof@suse.com> > > It doesn't make sense for some interfaces to become a root bridge > at any point in time. One example is virtual backend interfaces > which rely on other entities on the bridge for actual physical > connectivity. They only provide virtual access. > > Device drivers that know they should never become part of the > root bridge have been using a trick of setting their MAC address > to a high broadcast MAC address such as FE:FF:FF:FF:FF:FF. Instead > of using these hacks lets the interfaces annotate its intent and > generalizes a solution for multiple drivers, while letting the > drivers use a random MAC address or one prefixed with a proper OUI. > This sort of hack is used by both qemu and xen for their backend > interfaces. > > Cc: Stephen Hemminger <stephen@networkplumber.org> > Cc: bridge@lists.linux-foundation.org > Cc: netdev@vger.kernel.org > Cc: linux-kernel@vger.kernel.org > Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com> This is already supported in a more standard way via the root block flag. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 15/02/14 02:59, Luis R. Rodriguez wrote: > From: "Luis R. Rodriguez" <mcgrof@suse.com> > > It doesn't make sense for some interfaces to become a root bridge > at any point in time. One example is virtual backend interfaces > which rely on other entities on the bridge for actual physical > connectivity. They only provide virtual access. It is possible that a guest bridge together to VIF, either from the same Dom0 bridge or from different ones. In that case using STP on VIFs sound sensible to me. Zoli -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sun, Feb 16, 2014 at 10:57 AM, Stephen Hemminger <stephen@networkplumber.org> wrote: > On Fri, 14 Feb 2014 18:59:37 -0800 > "Luis R. Rodriguez" <mcgrof@do-not-panic.com> wrote: > >> From: "Luis R. Rodriguez" <mcgrof@suse.com> >> >> It doesn't make sense for some interfaces to become a root bridge >> at any point in time. One example is virtual backend interfaces >> which rely on other entities on the bridge for actual physical >> connectivity. They only provide virtual access. >> >> Device drivers that know they should never become part of the >> root bridge have been using a trick of setting their MAC address >> to a high broadcast MAC address such as FE:FF:FF:FF:FF:FF. Instead >> of using these hacks lets the interfaces annotate its intent and >> generalizes a solution for multiple drivers, while letting the >> drivers use a random MAC address or one prefixed with a proper OUI. >> This sort of hack is used by both qemu and xen for their backend >> interfaces. >> >> Cc: Stephen Hemminger <stephen@networkplumber.org> >> Cc: bridge@lists.linux-foundation.org >> Cc: netdev@vger.kernel.org >> Cc: linux-kernel@vger.kernel.org >> Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com> > > This is already supported in a more standard way via the root > block flag. Great! For documentation purposes the root_block flag is a sysfs attribute, added via 3.8 through commit 1007dd1a. The respective interface flag is IFLA_BRPORT_PROTECT and can be set via the iproute2 bridge utility or through sysfs: mcgrof@garbanzo ~/linux (git::master)$ find /sys/ -name root_block /sys/devices/pci0000:00/0000:00:04.0/0000:02:00.0/net/eth0/brport/root_block /sys/devices/vif-3-0/net/vif3.0/brport/root_block /sys/devices/virtual/net/vif3.0-emu/brport/root_block mcgrof@garbanzo ~/devel/iproute2 (git::master)$ cat /sys/devices/vif-3-0/net/vif3.0/brport/root_block 0 mcgrof@garbanzo ~/devel/iproute2 (git::master)$ sudo bridge link set dev vif3.0 root_block on mcgrof@garbanzo ~/devel/iproute2 (git::master)$ cat /sys/devices/vif-3-0/net/vif3.0/brport/root_block 1 So if we'd want to avoid using the MAC address hack alternative to skip a root port userspace would need to be updated to simply set this attribute after adding the device to the bridge. Based on Zoltan's feedback there seems to be use cases to not enable this always for all xen-netback interfaces though as such we can just punt this to userspace for the topologies that require this. The original motivation for this series was to avoid the IPv6 duplicate address incurred by the MAC address hack for avoiding the root bridge. Given that Zoltan also noted a use case whereby IPv4 and IPv6 addresses can be assigned to the backend interfaces we should be able to avoid the duplicate address situation for IPv6 by using a proper random MAC address *once* userspace has been updated also to use IFLA_BRPORT_PROTECT. New userspace can't and won't need to set this flag for older kernels (older than 3.8) as root_block is not implemented on those kernels and the MAC address hack would still be used there. This strategy however does put a requirement on new kernels to use new userspace as otherwise the MAC address workaround would not be in place and root_block would not take effect. Luis -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, 2014-02-18 at 13:02 -0800, Luis R. Rodriguez wrote: > On Sun, Feb 16, 2014 at 10:57 AM, Stephen Hemminger > <stephen@networkplumber.org> wrote: > > On Fri, 14 Feb 2014 18:59:37 -0800 > > "Luis R. Rodriguez" <mcgrof@do-not-panic.com> wrote: > > > >> From: "Luis R. Rodriguez" <mcgrof@suse.com> > >> > >> It doesn't make sense for some interfaces to become a root bridge > >> at any point in time. One example is virtual backend interfaces > >> which rely on other entities on the bridge for actual physical > >> connectivity. They only provide virtual access. > >> > >> Device drivers that know they should never become part of the > >> root bridge have been using a trick of setting their MAC address > >> to a high broadcast MAC address such as FE:FF:FF:FF:FF:FF. Instead > >> of using these hacks lets the interfaces annotate its intent and > >> generalizes a solution for multiple drivers, while letting the > >> drivers use a random MAC address or one prefixed with a proper OUI. > >> This sort of hack is used by both qemu and xen for their backend > >> interfaces. > >> > >> Cc: Stephen Hemminger <stephen@networkplumber.org> > >> Cc: bridge@lists.linux-foundation.org > >> Cc: netdev@vger.kernel.org > >> Cc: linux-kernel@vger.kernel.org > >> Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com> > > > > This is already supported in a more standard way via the root > > block flag. > > Great! For documentation purposes the root_block flag is a sysfs > attribute, added via 3.8 through commit 1007dd1a. The respective > interface flag is IFLA_BRPORT_PROTECT and can be set via the iproute2 > bridge utility or through sysfs: > > mcgrof@garbanzo ~/linux (git::master)$ find /sys/ -name root_block > /sys/devices/pci0000:00/0000:00:04.0/0000:02:00.0/net/eth0/brport/root_block > /sys/devices/vif-3-0/net/vif3.0/brport/root_block > /sys/devices/virtual/net/vif3.0-emu/brport/root_block > > mcgrof@garbanzo ~/devel/iproute2 (git::master)$ cat > /sys/devices/vif-3-0/net/vif3.0/brport/root_block > 0 > mcgrof@garbanzo ~/devel/iproute2 (git::master)$ sudo bridge link set > dev vif3.0 root_block on > mcgrof@garbanzo ~/devel/iproute2 (git::master)$ cat > /sys/devices/vif-3-0/net/vif3.0/brport/root_block > 1 > > So if we'd want to avoid using the MAC address hack alternative to > skip a root port userspace would need to be updated to simply set this > attribute after adding the device to the bridge. Based on Zoltan's > feedback there seems to be use cases to not enable this always for all > xen-netback interfaces though as such we can just punt this to > userspace for the topologies that require this. > > The original motivation for this series was to avoid the IPv6 > duplicate address incurred by the MAC address hack for avoiding the > root bridge. Given that Zoltan also noted a use case whereby IPv4 and > IPv6 addresses can be assigned to the backend interfaces we should be > able to avoid the duplicate address situation for IPv6 by using a > proper random MAC address *once* userspace has been updated also to > use IFLA_BRPORT_PROTECT. New userspace can't and won't need to set > this flag for older kernels (older than 3.8) as root_block is not > implemented on those kernels and the MAC address hack would still be > used there. This strategy however does put a requirement on new > kernels to use new userspace as otherwise the MAC address workaround > would not be in place and root_block would not take effect. Can't we arrange things in the Xen hotplug scripts such that if the root_block stuff isn't available/doesn't work we fallback to the existing fe:ff:ff:ff:ff usage? That would avoid concerns about forward/backwards compat I think. It wouldn't solve the issue you are targeting on old systems, but it also doesn't regress them any further. Ian. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 19/02/14 09:52, Ian Campbell wrote: > On Tue, 2014-02-18 at 13:02 -0800, Luis R. Rodriguez wrote: >> On Sun, Feb 16, 2014 at 10:57 AM, Stephen Hemminger >> <stephen@networkplumber.org> wrote: >>> On Fri, 14 Feb 2014 18:59:37 -0800 >>> "Luis R. Rodriguez" <mcgrof@do-not-panic.com> wrote: >>> >>>> From: "Luis R. Rodriguez" <mcgrof@suse.com> >>>> >>>> It doesn't make sense for some interfaces to become a root bridge >>>> at any point in time. One example is virtual backend interfaces >>>> which rely on other entities on the bridge for actual physical >>>> connectivity. They only provide virtual access. >>>> >>>> Device drivers that know they should never become part of the >>>> root bridge have been using a trick of setting their MAC address >>>> to a high broadcast MAC address such as FE:FF:FF:FF:FF:FF. Instead >>>> of using these hacks lets the interfaces annotate its intent and >>>> generalizes a solution for multiple drivers, while letting the >>>> drivers use a random MAC address or one prefixed with a proper OUI. >>>> This sort of hack is used by both qemu and xen for their backend >>>> interfaces. >>>> >>>> Cc: Stephen Hemminger <stephen@networkplumber.org> >>>> Cc: bridge@lists.linux-foundation.org >>>> Cc: netdev@vger.kernel.org >>>> Cc: linux-kernel@vger.kernel.org >>>> Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com> >>> >>> This is already supported in a more standard way via the root >>> block flag. >> >> Great! For documentation purposes the root_block flag is a sysfs >> attribute, added via 3.8 through commit 1007dd1a. The respective >> interface flag is IFLA_BRPORT_PROTECT and can be set via the iproute2 >> bridge utility or through sysfs: >> >> mcgrof@garbanzo ~/linux (git::master)$ find /sys/ -name root_block >> /sys/devices/pci0000:00/0000:00:04.0/0000:02:00.0/net/eth0/brport/root_block >> /sys/devices/vif-3-0/net/vif3.0/brport/root_block >> /sys/devices/virtual/net/vif3.0-emu/brport/root_block >> >> mcgrof@garbanzo ~/devel/iproute2 (git::master)$ cat >> /sys/devices/vif-3-0/net/vif3.0/brport/root_block >> 0 >> mcgrof@garbanzo ~/devel/iproute2 (git::master)$ sudo bridge link set >> dev vif3.0 root_block on >> mcgrof@garbanzo ~/devel/iproute2 (git::master)$ cat >> /sys/devices/vif-3-0/net/vif3.0/brport/root_block >> 1 >> >> So if we'd want to avoid using the MAC address hack alternative to >> skip a root port userspace would need to be updated to simply set this >> attribute after adding the device to the bridge. Based on Zoltan's >> feedback there seems to be use cases to not enable this always for all >> xen-netback interfaces though as such we can just punt this to >> userspace for the topologies that require this. >> >> The original motivation for this series was to avoid the IPv6 >> duplicate address incurred by the MAC address hack for avoiding the >> root bridge. Given that Zoltan also noted a use case whereby IPv4 and >> IPv6 addresses can be assigned to the backend interfaces we should be >> able to avoid the duplicate address situation for IPv6 by using a >> proper random MAC address *once* userspace has been updated also to >> use IFLA_BRPORT_PROTECT. New userspace can't and won't need to set >> this flag for older kernels (older than 3.8) as root_block is not >> implemented on those kernels and the MAC address hack would still be >> used there. This strategy however does put a requirement on new >> kernels to use new userspace as otherwise the MAC address workaround >> would not be in place and root_block would not take effect. > > Can't we arrange things in the Xen hotplug scripts such that if the > root_block stuff isn't available/doesn't work we fallback to the > existing fe:ff:ff:ff:ff usage? > > That would avoid concerns about forward/backwards compat I think. It > wouldn't solve the issue you are targeting on old systems, but it also > doesn't regress them any further. I agree, I think this problem could be better handled from userspace: if it can set root_block then change the default MAC to a random one, if it can't, then stay with the default one. Or if someone doesn't care about STP but DAD is still important, userspace can have a force_random_mac option somewhere to change to a random MAC regardless of root_block presence. Zoli -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Feb 17, 2014 at 9:52 AM, Zoltan Kiss <zoltan.kiss@citrix.com> wrote: > On 15/02/14 02:59, Luis R. Rodriguez wrote: >> >> From: "Luis R. Rodriguez" <mcgrof@suse.com> >> >> It doesn't make sense for some interfaces to become a root bridge >> at any point in time. One example is virtual backend interfaces >> which rely on other entities on the bridge for actual physical >> connectivity. They only provide virtual access. > > It is possible that a guest bridge together to VIF, either from the same > Dom0 bridge or from different ones. In that case using STP on VIFs sound > sensible to me. You seem to describe a case whereby it can make sense for xen-netback interfaces to end up becoming the root port of a bridge. Can you elaborate a little more on that as it was unclear the use case. Additionally if such cases exist then under the current upstream implementation one would simply need to change the MAC address in order to enable a vif to become the root port. Stephen noted there is a way to avoid nominating an interface for a root port through the root block flag. We should use that instead of the MAC address hacks. Let's keep in mind that part of the motivation for this series is to avoid a duplicate IPv6 address left in place by use cases whereby the MAC address of the backend vif was left static. The use case your are explaining likely describes the more prevalent use case where address conflicts can occur, perhaps when administrators for got to change the backend MAC address. If we embrace a random MAC address we'd avoid that issue, and but we'd need to update userspace to use the root block on topologies where desired. Luis -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Feb 19, 2014 at 6:35 AM, Zoltan Kiss <zoltan.kiss@citrix.com> wrote: > On 19/02/14 09:52, Ian Campbell wrote: >> Can't we arrange things in the Xen hotplug scripts such that if the >> root_block stuff isn't available/doesn't work we fallback to the >> existing fe:ff:ff:ff:ff usage? >> >> That would avoid concerns about forward/backwards compat I think. It >> wouldn't solve the issue you are targeting on old systems, but it also >> doesn't regress them any further. > > I agree, I think this problem could be better handled from userspace: if it > can set root_block then change the default MAC to a random one, if it can't, > then stay with the default one. Or if someone doesn't care about STP but DAD > is still important, userspace can have a force_random_mac option somewhere > to change to a random MAC regardless of root_block presence. Folks, what if I repurpose my patch to use the IFF_BRIDGE_NON_ROOT (or relabel to IFF_ROOT_BLOCK_DEF) flag for a default driver preference upon initialization so that root block will be used once the device gets added to a bridge. The purpose would be to avoid drivers from using the high MAC address hack, streamline to use a random MAC address thereby avoiding the possible duplicate address situation for IPv6. In the STP use case for these interfaces we'd just require userspace to unset the root block. I'd consider the STP use case the most odd of all. The caveat to this approach is 3.8 would be needed (or its the root block patches cherry picked) for base kernels older than 3.8. Stephen? Luis -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, 19 Feb 2014 09:02:06 -0800 "Luis R. Rodriguez" <mcgrof@do-not-panic.com> wrote: > Folks, what if I repurpose my patch to use the IFF_BRIDGE_NON_ROOT (or > relabel to IFF_ROOT_BLOCK_DEF) flag for a default driver preference > upon initialization so that root block will be used once the device > gets added to a bridge. The purpose would be to avoid drivers from > using the high MAC address hack, streamline to use a random MAC > address thereby avoiding the possible duplicate address situation for > IPv6. In the STP use case for these interfaces we'd just require > userspace to unset the root block. I'd consider the STP use case the > most odd of all. The caveat to this approach is 3.8 would be needed > (or its the root block patches cherry picked) for base kernels older > than 3.8. > > Stephen? > > Luis Don't add IFF_ flags that adds yet another API hook into bridge. Please only use the netlink/sysfs flags fields that already exist for new features. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Feb 19, 2014 at 9:08 AM, Stephen Hemminger <stephen@networkplumber.org> wrote: > On Wed, 19 Feb 2014 09:02:06 -0800 > "Luis R. Rodriguez" <mcgrof@do-not-panic.com> wrote: > >> Folks, what if I repurpose my patch to use the IFF_BRIDGE_NON_ROOT (or >> relabel to IFF_ROOT_BLOCK_DEF) flag for a default driver preference >> upon initialization so that root block will be used once the device >> gets added to a bridge. The purpose would be to avoid drivers from >> using the high MAC address hack, streamline to use a random MAC >> address thereby avoiding the possible duplicate address situation for >> IPv6. In the STP use case for these interfaces we'd just require >> userspace to unset the root block. I'd consider the STP use case the >> most odd of all. The caveat to this approach is 3.8 would be needed >> (or its the root block patches cherry picked) for base kernels older >> than 3.8. >> >> Stephen? >> >> Luis > > Don't add IFF_ flags that adds yet another API hook into bridge. The goal was not to add a userspace API, but rather consider a driver initialization preference. > Please only use the netlink/sysfs flags fields that already exist > for new features. Sure, but what if we know a driver in most cases wants the root block and we'd want to make it the default, thereby only requiring userspace for toggling it off. Luis -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 19/02/14 17:02, Luis R. Rodriguez wrote: > On Wed, Feb 19, 2014 at 6:35 AM, Zoltan Kiss <zoltan.kiss@citrix.com> wrote: >> On 19/02/14 09:52, Ian Campbell wrote: >>> Can't we arrange things in the Xen hotplug scripts such that if the >>> root_block stuff isn't available/doesn't work we fallback to the >>> existing fe:ff:ff:ff:ff usage? >>> >>> That would avoid concerns about forward/backwards compat I think. It >>> wouldn't solve the issue you are targeting on old systems, but it also >>> doesn't regress them any further. >> >> I agree, I think this problem could be better handled from userspace: if it >> can set root_block then change the default MAC to a random one, if it can't, >> then stay with the default one. Or if someone doesn't care about STP but DAD >> is still important, userspace can have a force_random_mac option somewhere >> to change to a random MAC regardless of root_block presence. > > Folks, what if I repurpose my patch to use the IFF_BRIDGE_NON_ROOT (or > relabel to IFF_ROOT_BLOCK_DEF) flag for a default driver preference > upon initialization so that root block will be used once the device > gets added to a bridge. The purpose would be to avoid drivers from > using the high MAC address hack, streamline to use a random MAC > address thereby avoiding the possible duplicate address situation for > IPv6. In the STP use case for these interfaces we'd just require > userspace to unset the root block. I'd consider the STP use case the > most odd of all. The caveat to this approach is 3.8 would be needed > (or its the root block patches cherry picked) for base kernels older > than 3.8. How about this: netback sets the root_block flag and a random MAC by default. So the default behaviour won't change, DAD will be happy, and userspace don't have to do anything unless it's using netback for STP root bridge (I don't think there are too many toolstacks doing that), in which case it has to remove the root_block flag instead of setting a random MAC. Zoli -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 19/02/14 16:45, Luis R. Rodriguez wrote: > On Mon, Feb 17, 2014 at 9:52 AM, Zoltan Kiss <zoltan.kiss@citrix.com> wrote: >> On 15/02/14 02:59, Luis R. Rodriguez wrote: >>> >>> From: "Luis R. Rodriguez" <mcgrof@suse.com> >>> >>> It doesn't make sense for some interfaces to become a root bridge >>> at any point in time. One example is virtual backend interfaces >>> which rely on other entities on the bridge for actual physical >>> connectivity. They only provide virtual access. >> >> It is possible that a guest bridge together to VIF, either from the same >> Dom0 bridge or from different ones. In that case using STP on VIFs sound >> sensible to me. > > You seem to describe a case whereby it can make sense for xen-netback > interfaces to end up becoming the root port of a bridge. Can you > elaborate a little more on that as it was unclear the use case. Well, I might be wrong on that, but the scenario I was thinking: a guest (let's say domain 1) can have multiple interfaces on different Dom0 (or driver domain) bridges, let's say vif1.0 is plugged into xenbr0 and vif1.1 is in xenbr1. If the guest wants to make a bridge of this two, then using STP makes sense. I wanted to bring up CloudStack's virtual router as an example, but then I realized it's probably doesn't do such thing. However I don't think we should hardcode that a netback interface can't be RP ever. > > Additionally if such cases exist then under the current upstream > implementation one would simply need to change the MAC address in > order to enable a vif to become the root port. Stephen noted there is > a way to avoid nominating an interface for a root port through the > root block flag. We should use that instead of the MAC address hacks. > Let's keep in mind that part of the motivation for this series is to > avoid a duplicate IPv6 address left in place by use cases whereby the > MAC address of the backend vif was left static. The use case your are > explaining likely describes the more prevalent use case where address > conflicts can occur, perhaps when administrators for got to change the > backend MAC address. If we embrace a random MAC address we'd avoid > that issue, and but we'd need to update userspace to use the root > block on topologies where desired. If I understand you correctly, this is the same I suggested in my another email sent 1.5 hour ago. Zoli -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, 19 Feb 2014 09:59:33 -0800 "Luis R. Rodriguez" <mcgrof@do-not-panic.com> wrote: > On Wed, Feb 19, 2014 at 9:08 AM, Stephen Hemminger > <stephen@networkplumber.org> wrote: > > On Wed, 19 Feb 2014 09:02:06 -0800 > > "Luis R. Rodriguez" <mcgrof@do-not-panic.com> wrote: > > > >> Folks, what if I repurpose my patch to use the IFF_BRIDGE_NON_ROOT (or > >> relabel to IFF_ROOT_BLOCK_DEF) flag for a default driver preference > >> upon initialization so that root block will be used once the device > >> gets added to a bridge. The purpose would be to avoid drivers from > >> using the high MAC address hack, streamline to use a random MAC > >> address thereby avoiding the possible duplicate address situation for > >> IPv6. In the STP use case for these interfaces we'd just require > >> userspace to unset the root block. I'd consider the STP use case the > >> most odd of all. The caveat to this approach is 3.8 would be needed > >> (or its the root block patches cherry picked) for base kernels older > >> than 3.8. > >> > >> Stephen? > >> > >> Luis > > > > Don't add IFF_ flags that adds yet another API hook into bridge. > > The goal was not to add a userspace API, but rather consider a driver > initialization preference. > > > Please only use the netlink/sysfs flags fields that already exist > > for new features. > > Sure, but what if we know a driver in most cases wants the root block > and we'd want to make it the default, thereby only requiring userspace > for toggling it off. > > Luis Something in userspace has to put the device into the bridge. Fix the port setup in that tool via the netlink or sysfs flags in the bridge. It should not have to be handled in the bridge looking at magic flags in the device. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Feb 20, 2014 at 5:19 AM, Zoltan Kiss <zoltan.kiss@citrix.com> wrote: > How about this: netback sets the root_block flag and a random MAC by > default. So the default behaviour won't change, DAD will be happy, and > userspace don't have to do anything unless it's using netback for STP root > bridge (I don't think there are too many toolstacks doing that), in which > case it has to remove the root_block flag instead of setting a random MAC. :D that's exactly what I ended up proposing too. I mentioned how xen-netback could do this as well, we'd keep or rename the flag I added, and then the bridge could would look at it and enable the root block if the flag is set. Stephen however does not like having the bridge code look at magic flags for this behavior and would prefer for us to get the tools to ask for the root block. Let's follow more up on that thread. Luis -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Feb 20, 2014 at 9:19 AM, Stephen Hemminger <stephen@networkplumber.org> wrote: > On Wed, 19 Feb 2014 09:59:33 -0800 "Luis R. Rodriguez" <mcgrof@do-not-panic.com> wrote: >> On Wed, Feb 19, 2014 at 9:08 AM, Stephen Hemminger <stephen@networkplumber.org> wrote: >> > >> > Please only use the netlink/sysfs flags fields that already exist >> > for new features. >> >> Sure, but what if we know a driver in most cases wants the root block >> and we'd want to make it the default, thereby only requiring userspace >> for toggling it off. > > Something in userspace has to put the device into the bridge. > Fix the port setup in that tool via the netlink or sysfs flags in > the bridge. It should not have to be handled in the bridge looking > at magic flags in the device. Agreed that's the best strategy and I'll work on sending patches to brctl to enable the root_block preference. This approach however also requires a userspace upgrade. I'm trying to see if we can get an old-nasty-cryptic-hack practice removed from the kernel and we'd try to prevent future drivers from using it -- without requiring userspace upgrade. In this case the bad practice is to using a high static MAC address for mimicking a root block default preference. In order to remove that *without* requiring a userspace upgrade the dev->priv_flag approach is the only thing I can think of. If this would go in we'd replace the high static MAC address with a random MAC address to prevent IPv6 SLAAC / DAD conflicts. I'd document this flag and indicate with preference for userspace to be the one tuning these knobs. Without this we'd have to keep the high static MAC address on upstream drivers and let userspace do the random'ization if it confirms the userspace knob to turn the root block flag is available. Is the priv_flag approach worth the compromise to remove the root block hack practice? Luis -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Feb 20, 2014 at 6:47 AM, Zoltan Kiss <zoltan.kiss@citrix.com> wrote: > On 19/02/14 16:45, Luis R. Rodriguez wrote: > >> You seem to describe a case whereby it can make sense for xen-netback >> interfaces to end up becoming the root port of a bridge. Can you >> elaborate a little more on that as it was unclear the use case. > > Well, I might be wrong on that, but the scenario I was thinking: a guest > (let's say domain 1) can have multiple interfaces on different Dom0 (or > driver domain) bridges, let's say vif1.0 is plugged into xenbr0 and vif1.1 > is in xenbr1. If the guest wants to make a bridge of this two, then using > STP makes sense. The bridging would happen on the front end in that case no? > I wanted to bring up CloudStack's virtual router as an > example, but then I realized it's probably doesn't do such thing. However I > don't think we should hardcode that a netback interface can't be RP ever. My patch did allow for this but the root block flag that Stephen mentioned can always be lifted. Luis -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 20/02/14 20:01, Luis R. Rodriguez wrote: > On Thu, Feb 20, 2014 at 5:19 AM, Zoltan Kiss <zoltan.kiss@citrix.com> wrote: >> How about this: netback sets the root_block flag and a random MAC by >> default. So the default behaviour won't change, DAD will be happy, and >> userspace don't have to do anything unless it's using netback for STP root >> bridge (I don't think there are too many toolstacks doing that), in which >> case it has to remove the root_block flag instead of setting a random MAC. > > :D that's exactly what I ended up proposing too. I mentioned how > xen-netback could do this as well, we'd keep or rename the flag I > added, and then the bridge could would look at it and enable the root > block if the flag is set. Stephen however does not like having the > bridge code look at magic flags for this behavior and would prefer for > us to get the tools to ask for the root block. Let's follow more up on > that thread We don't need that new flag, just forget about it. Set that root_block flag from netback device init, around the time you generate the random MAC, or at the earliest possible time. Nothing else has to be done from kernel side. If someone wants netback to be a root port, then remove root_block from their tools, instead of changing the the MAC address, as it happens now. Another problem with the random addresses, pointed out by Ian earlier, that when adding/removing interfaces, the bridge does recalculate it's MAC address, and choose the lowest one. In the general usecase I think that's normal, but in case of Xen networking, we would like to keep the bridge using the physical interface's MAC, because the local port of the bridge is used for Dom0 network traffic, therefore changing the bridge MAC when a netback device has lower MAC breaks that traffic. I think the best is to address this from userspace: if it set the MAC of the bridge explicitly, dev_set_mac_address() does dev->addr_assign_type = NET_ADDR_SET;, so br_stp_recalculate_bridge_id() will exit before changing anything. And when I say userspace, I mean Xen specific tools which does networking configuration, e.g. xapi in XenServer case. Not brctl, it doesn't have to know whether this is a xenbrX device or a bridge used for another purposes. Zoli -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 20/02/14 20:24, Luis R. Rodriguez wrote: > On Thu, Feb 20, 2014 at 9:19 AM, Stephen Hemminger > <stephen@networkplumber.org> wrote: >> On Wed, 19 Feb 2014 09:59:33 -0800 "Luis R. Rodriguez" <mcgrof@do-not-panic.com> wrote: >>> On Wed, Feb 19, 2014 at 9:08 AM, Stephen Hemminger <stephen@networkplumber.org> wrote: >>>> >>>> Please only use the netlink/sysfs flags fields that already exist >>>> for new features. >>> >>> Sure, but what if we know a driver in most cases wants the root block >>> and we'd want to make it the default, thereby only requiring userspace >>> for toggling it off. >> >> Something in userspace has to put the device into the bridge. >> Fix the port setup in that tool via the netlink or sysfs flags in >> the bridge. It should not have to be handled in the bridge looking >> at magic flags in the device. > > Agreed that's the best strategy and I'll work on sending patches to > brctl to enable the root_block preference. This approach however also I don't think brctl should deal with any Xen specific stuff. I assume there is a misunderstanding in this thread: when I (and possibly other Xen folks) talk about "userspace" or "toolstack" here, I mean Xen specific tools which use e.g. brctl to set up bridges. Not brctl itself. > requires a userspace upgrade. I'm trying to see if we can get an > old-nasty-cryptic-hack practice removed from the kernel and we'd try > to prevent future drivers from using it -- without requiring userspace > upgrade. In this case the bad practice is to using a high static MAC > address for mimicking a root block default preference. In order to > remove that *without* requiring a userspace upgrade the dev->priv_flag > approach is the only thing I can think of. If this would go in we'd > replace the high static MAC address with a random MAC address to > prevent IPv6 SLAAC / DAD conflicts. I'd document this flag and > indicate with preference for userspace to be the one tuning these > knobs. > > Without this we'd have to keep the high static MAC address on upstream > drivers and let userspace do the random'ization if it confirms the > userspace knob to turn the root block flag is available. Is the > priv_flag approach worth the compromise to remove the root block hack > practice? > > Luis > -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, Feb 21, 2014 at 5:02 AM, Zoltan Kiss <zoltan.kiss@citrix.com> wrote: > On 20/02/14 20:01, Luis R. Rodriguez wrote: >> >> On Thu, Feb 20, 2014 at 5:19 AM, Zoltan Kiss <zoltan.kiss@citrix.com> >> wrote: >>> >>> How about this: netback sets the root_block flag and a random MAC by >>> default. So the default behaviour won't change, DAD will be happy, and >>> userspace don't have to do anything unless it's using netback for STP >>> root >>> bridge (I don't think there are too many toolstacks doing that), in which >>> case it has to remove the root_block flag instead of setting a random >>> MAC. >> >> >> :D that's exactly what I ended up proposing too. I mentioned how >> xen-netback could do this as well, we'd keep or rename the flag I >> added, and then the bridge could would look at it and enable the root >> block if the flag is set. Stephen however does not like having the >> bridge code look at magic flags for this behavior and would prefer for >> us to get the tools to ask for the root block. Let's follow more up on >> that thread > > We don't need that new flag, just forget about it. Unless I'm missing something the root_block flag is a bridge port primitive. This means we can't set it *until* the interface gets added to a bridge, and even then, its a knob that would be available only to the bridge. > Another problem with the random addresses, pointed out by Ian earlier, that > when adding/removing interfaces, the bridge does recalculate it's MAC > address, and choose the lowest one. In the general usecase I think that's > normal, but in case of Xen networking, we would like to keep the bridge > using the physical interface's MAC, because the local port of the bridge is > used for Dom0 network traffic, therefore changing the bridge MAC when a > netback device has lower MAC breaks that traffic. This is a good reason then to actually have an interface general specific knob to annotate to the bridge that we'd prefer to root_block by default, the alternative as you point out below is to have the xen / kvm utils to set the bridge MAC address statically, but that'll requires a userspace upgrade. I'm looking for a kernel solution that is backwards compatible with old userspace. > I think the best is to > address this from userspace: if it set the MAC of the bridge explicitly, > dev_set_mac_address() does dev->addr_assign_type = NET_ADDR_SET;, so > br_stp_recalculate_bridge_id() will exit before changing anything. That will certainly work for new xen / kvm util userspace. Luis -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, Feb 21, 2014 at 5:02 AM, Zoltan Kiss <zoltan.kiss@citrix.com> wrote: >> Agreed that's the best strategy and I'll work on sending patches to >> brctl to enable the root_block preference. This approach however also > > I don't think brctl should deal with any Xen specific stuff. I assume there > is a misunderstanding in this thread: when I (and possibly other Xen folks) > talk about "userspace" or "toolstack" here, I mean Xen specific tools which > use e.g. brctl to set up bridges. Not brctl itself. I did mean brctl, but as I looked at the code it doesn't used rtnl_open() and not sure if Stephen would want that. Additionally even if it did handle root_block the other issue with this strategy is that as you noted upon initialization the bridge, without a static MAC address, could end up setting the backend as the root port, until you let userspace turn the root_block knob. Luis -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, Feb 21, 2014 at 8:01 AM, Luis R. Rodriguez <mcgrof@do-not-panic.com> wrote: > On Fri, Feb 21, 2014 at 5:02 AM, Zoltan Kiss <zoltan.kiss@citrix.com> wrote: >>> Agreed that's the best strategy and I'll work on sending patches to >>> brctl to enable the root_block preference. This approach however also >> >> I don't think brctl should deal with any Xen specific stuff. I assume there >> is a misunderstanding in this thread: when I (and possibly other Xen folks) >> talk about "userspace" or "toolstack" here, I mean Xen specific tools which >> use e.g. brctl to set up bridges. Not brctl itself. > > I did mean brctl, but as I looked at the code it doesn't used > rtnl_open() and not sure if Stephen would want that. Actually that'd be the incorrect tool to extend, iproute2 would be the new way with: ip link add dev xenbr0 type bridge ip link set dev eth0 master xenbr0 ip link set dev vif1.0 master xenbr0 <root_block> where root_block would be the new desired argument. This would use the rtnetlink RTM_SETLINK + IFLA_MASTER, which will in turn kick off the bridge ndo_add_slave(). Still though it seems this requires the eth0 device to actually exist and as such from what I can tell we can't set the root_block preference until *after* the addition onto the bridge, which should mean the bridge could still take the vif1.0 MAC address momentarily. This is of course only an issue if the link was up during the additions. This makes me think perhaps nothing is needed then and scripts could just use the: bridge link set dev vif1.0 root_block on I also just noticed that if an entry that was the bridge root port got a root_block toggle we don't kick off the newly blocked port, I just verified this. Note that removing the interface from the bridge does however reset the bridge with a proper new root port: ip link set dev vif1.0 nomaster For old userspace with brctl and no iproute2 we're shit out of luck, this means we can't use root block (xen-netblock was added on v2.6.39). Stephen all this can we add the priv_flags flag to help out as proposed, but I'd make it just toggle the new root_block flag, that'd enable drivers to use this from initialization. Let me know if you have other suggestions or things I may have missed. Luis -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/include/uapi/linux/if.h b/include/uapi/linux/if.h index d758163..8d10382 100644 --- a/include/uapi/linux/if.h +++ b/include/uapi/linux/if.h @@ -84,6 +84,7 @@ #define IFF_LIVE_ADDR_CHANGE 0x100000 /* device supports hardware address * change when it's running */ #define IFF_MACVLAN 0x200000 /* Macvlan device */ +#define IFF_BRIDGE_NON_ROOT 0x400000 /* Don't consider for root bridge */ #define IF_GET_IFACE 0x0001 /* for querying only */ diff --git a/net/bridge/br_if.c b/net/bridge/br_if.c index 4bf02ad..a745415 100644 --- a/net/bridge/br_if.c +++ b/net/bridge/br_if.c @@ -228,6 +228,8 @@ static struct net_bridge_port *new_nbp(struct net_bridge *br, br_init_port(p); p->state = BR_STATE_DISABLED; br_stp_port_timer_init(p); + if (dev->priv_flags & IFF_BRIDGE_NON_ROOT) + p->flags |= BR_DONT_ROOT; br_multicast_add_port(p); return p; diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h index 045d56e..a89e8ad 100644 --- a/net/bridge/br_private.h +++ b/net/bridge/br_private.h @@ -173,6 +173,7 @@ struct net_bridge_port #define BR_ADMIN_COST 0x00000010 #define BR_LEARNING 0x00000020 #define BR_FLOOD 0x00000040 +#define BR_DONT_ROOT 0x00000080 #ifdef CONFIG_BRIDGE_IGMP_SNOOPING struct bridge_mcast_query ip4_query; diff --git a/net/bridge/br_stp_if.c b/net/bridge/br_stp_if.c index 656a6f3..12fd848 100644 --- a/net/bridge/br_stp_if.c +++ b/net/bridge/br_stp_if.c @@ -228,6 +228,8 @@ bool br_stp_recalculate_bridge_id(struct net_bridge *br) return false; list_for_each_entry(p, &br->port_list, list) { + if (p->flags & BR_DONT_ROOT) + continue; if (addr == br_mac_zero || memcmp(p->dev->dev_addr, addr, ETH_ALEN) < 0) addr = p->dev->dev_addr;