Message ID | 1393886825-24323-5-git-send-email-mcgrof@do-not-panic.com |
---|---|
State | RFC, archived |
Delegated to: | David Miller |
Headers | show |
On Mon, 3 Mar 2014 14:47:03 -0800 "Luis R. Rodriguez" <mcgrof@do-not-panic.com> wrote: > From: "Luis R. Rodriguez" <mcgrof@suse.com> > > root block support was added via 1007dd1a on v3.8 but toggling > this flag is only allowed after a device has been registered and > added to a bridge as its a bridge *port* primitive, not a *net_device* > feature. There are work arounds possible to account for the lack > of netlink tools to toggle root_block, such as using the root_block > syfs attribute [0] or using udev / the driver to set the MAC address > to something high such as FE:FF:FF:FF:FF:FF, but neither of these > ensure root block is respected _from_the_start_ through device > initialization. > > In order to support the root_block feature from the start since device > initialization and in order to avoid having to require userspace > work arounds to existing deployments this exposes a private > net_device flag which enables drivers that know they want to > start with the root_block feature enabled form the start. The > only caveat with this is topologies that require STP or non-root > will either have to use sysfs [0] or netlink tools like the > iproute2 bridge util to toggle the flag off after initialization. > This is an accepted compromise. > > This flag is required given that ndo_add_slave() currently does not > allow specifying any other parameters other than the net_device. We > could extend this but in order to do that properly we'd need to > evaluate all other types of master device implementations. > > [0] echo 1 > /sys/devices/vif-2-0/net/vif2.0/brport/root_block > > Cc: Stephen Hemminger <stephen@networkplumber.org> > Cc: bridge@lists.linux-foundation.org > Cc: netdev@vger.kernel.org > Cc: linux-kernel@vger.kernel.org > Cc: xen-devel@lists.xenproject.org > Cc: kvm@vger.kernel.org > Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com> > --- > include/linux/netdevice.h | 7 +++++++ > net/bridge/br_if.c | 2 ++ > 2 files changed, 9 insertions(+) > > diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h > index 1a86948..b17643a 100644 > --- a/include/linux/netdevice.h > +++ b/include/linux/netdevice.h > @@ -1181,6 +1181,11 @@ struct net_device_ops { > * @IFF_LIVE_ADDR_CHANGE: device supports hardware address > * change when it's running > * @IFF_MACVLAN: Macvlan device > + * @IFF_BRIDGE_ROOT_BLOCK: don't consider this net_device for root port > + * when this interface is added to a bridge. This makes use of the > + * root_block mechanism but since its a bridge port primitive this > + * flag can be used to instantiate the preference to have root block > + * enabled from the start since initialization. > */ Doing this in priv flags bloats what is a limited resource (# of bits). Plus there are issues (what if this is changed after adding to bridge)? Maybe better to enhance existing netlink infrastructure to allow passing flags on adding port to bridge. Also, unless device is up, nothing will happen right away when added to bridge. Root port status can be changed since device is disabled. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Mar 3, 2014 at 3:43 PM, Stephen Hemminger <stephen@networkplumber.org> wrote: > Doing this in priv flags bloats what is a limited resource (# of bits). Agreed. I tried to avoid it but saw no other option for addressing this during initialization properly without requirng a userspace upgrade. > Plus there are issues (what if this is changed after adding to bridge)? Its only considered when the net_device gets added to the bridge. If you add it and then toggle root_block off that will be respected. If you remove the net_device from the bridge and then add it back on the priv flag will be respected but if desirable we could toggle it off if userspace was used to disable root_block once on. > Maybe better to enhance existing netlink infrastructure to allow passing > flags on adding port to bridge. Agreed, I looked into this and the limitation of using the existing slave add request is that ndo_add_slave() only passes the net_device, we'd have to either extend that (with collateral study on other slaves as noted on my cover letter) or we'd have to make this a UAPI netlink feature for the net_device. Some old userspace may not use netlink too, and they may be stuck with brctl, I tried to create a compromise only to not affect old userspace if that kernel gets upgraded. > Also, unless device is up, nothing will happen right away when added to bridge. I get different results, the bridge immediately does compute whether or not to nominate a device for the bridge interface and root port. The commands provided as an example on 3/6 can be used to replicate this. > Root port status can be changed since device is disabled. Agreed however the reason for the flag is to address removing high MAC address hack-preference set on drivers that were merged prior to the root_block feature. Without a proper way to address this upon initialization we're stuck with requiring userspace upgrade if these driver's hack is removed. Luis -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, 3 Mar 2014 15:58:50 -0800 "Luis R. Rodriguez" <mcgrof@do-not-panic.com> wrote: > On Mon, Mar 3, 2014 at 3:43 PM, Stephen Hemminger > <stephen@networkplumber.org> wrote: > > Doing this in priv flags bloats what is a limited resource (# of bits). > > Agreed. I tried to avoid it but saw no other option for addressing > this during initialization properly without requirng a userspace > upgrade. Replacing one Xen hack for another doesn't seem like great progress. I would rather see an API change as needed because there are other things like setting STP parameters which might also want to be part of initial device add. -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Mar 3, 2014 at 4:31 PM, Stephen Hemminger <stephen@networkplumber.org> wrote: > On Mon, 3 Mar 2014 15:58:50 -0800 > "Luis R. Rodriguez" <mcgrof@do-not-panic.com> wrote: > >> On Mon, Mar 3, 2014 at 3:43 PM, Stephen Hemminger >> <stephen@networkplumber.org> wrote: >> > Doing this in priv flags bloats what is a limited resource (# of bits). >> >> Agreed. I tried to avoid it but saw no other option for addressing >> this during initialization properly without requirng a userspace >> upgrade. > > Replacing one Xen hack for another doesn't seem like great progress. I'm the last person that wants to see any internal driver hack propagate, the goal was to kill one here. > I would rather see an API change as needed because there are other > things like setting STP parameters which might also want to be part > of initial device add. As it stands right now ndo_add_slave() currently only allows us to pass the net_device, that's it, so we're confined there. One way I can think of to properly extend this interface is to allow extra parameters to be passed. We may even be able to move some old priv_flags for bonding onto this new interface but before I moved down this more invasive approach I wanted to review this other approach. Does extending ndo_add_slave() seems reasonable, or do you have any other preferred alternatives? Even though the high MAC address was a xen hack it is a hack put in place prior to the root block feature being added, so I would like at the very least a bit of consideration on userspace here. Its really the only reason why I've been stubborn on looking for a compromise. In the KVM case that doesn't use the root_block feature yet I would have considered just extending the ndo_add_slave() from the start but because xen-netback *does* have in place a hack removing it requires userspace considerations. Luis -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index 1a86948..b17643a 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -1181,6 +1181,11 @@ struct net_device_ops { * @IFF_LIVE_ADDR_CHANGE: device supports hardware address * change when it's running * @IFF_MACVLAN: Macvlan device + * @IFF_BRIDGE_ROOT_BLOCK: don't consider this net_device for root port + * when this interface is added to a bridge. This makes use of the + * root_block mechanism but since its a bridge port primitive this + * flag can be used to instantiate the preference to have root block + * enabled from the start since initialization. */ enum netdev_priv_flags { IFF_802_1Q_VLAN = 1<<0, @@ -1205,6 +1210,7 @@ enum netdev_priv_flags { IFF_SUPP_NOFCS = 1<<19, IFF_LIVE_ADDR_CHANGE = 1<<20, IFF_MACVLAN = 1<<21, + IFF_BRIDGE_ROOT_BLOCK = 1<<22, }; #define IFF_802_1Q_VLAN IFF_802_1Q_VLAN @@ -1229,6 +1235,7 @@ enum netdev_priv_flags { #define IFF_SUPP_NOFCS IFF_SUPP_NOFCS #define IFF_LIVE_ADDR_CHANGE IFF_LIVE_ADDR_CHANGE #define IFF_MACVLAN IFF_MACVLAN +#define IFF_BRIDGE_ROOT_BLOCK IFF_BRIDGE_ROOT_BLOCK /* * The DEVICE structure. diff --git a/net/bridge/br_if.c b/net/bridge/br_if.c index 54d207d..4bee748 100644 --- a/net/bridge/br_if.c +++ b/net/bridge/br_if.c @@ -228,6 +228,8 @@ static struct net_bridge_port *new_nbp(struct net_bridge *br, br_init_port(p); p->state = BR_STATE_DISABLED; br_stp_port_timer_init(p); + if (dev->priv_flags & IFF_BRIDGE_ROOT_BLOCK) + p->flags |= BR_ROOT_BLOCK; br_multicast_add_port(p); return p;