diff mbox

[RFC,v3,4/6] bridge: enable root block during device registration

Message ID 1393886825-24323-5-git-send-email-mcgrof@do-not-panic.com
State RFC, archived
Delegated to: David Miller
Headers show

Commit Message

Luis R. Rodriguez March 3, 2014, 10:47 p.m. UTC
From: "Luis R. Rodriguez" <mcgrof@suse.com>

root block support was added via 1007dd1a on v3.8 but toggling
this flag is only allowed after a device has been registered and
added to a bridge as its a bridge *port* primitive, not a *net_device*
feature. There are work arounds possible to account for the lack
of netlink tools to toggle root_block, such as using the root_block
syfs attribute [0] or using udev / the driver to set the MAC address
to something high such as FE:FF:FF:FF:FF:FF, but neither of these
ensure root block is respected _from_the_start_ through device
initialization.

In order to support the root_block feature from the start since device
initialization and in order to avoid having to require userspace
work arounds to existing deployments this exposes a private
net_device flag which enables drivers that know they want to
start with the root_block feature enabled form the start. The
only caveat with this is topologies that require STP or non-root
will either have to use sysfs [0] or netlink tools like the
iproute2 bridge util to toggle the flag off after initialization.
This is an accepted compromise.

This flag is required given that ndo_add_slave() currently does not
allow specifying any other parameters other than the net_device. We
could extend this but in order to do that properly we'd need to
evaluate all other types of master device implementations.

[0] echo 1 > /sys/devices/vif-2-0/net/vif2.0/brport/root_block

Cc: Stephen Hemminger <stephen@networkplumber.org>
Cc: bridge@lists.linux-foundation.org
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: xen-devel@lists.xenproject.org
Cc: kvm@vger.kernel.org
Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
---
 include/linux/netdevice.h | 7 +++++++
 net/bridge/br_if.c        | 2 ++
 2 files changed, 9 insertions(+)

Comments

Stephen Hemminger March 3, 2014, 11:43 p.m. UTC | #1
On Mon,  3 Mar 2014 14:47:03 -0800
"Luis R. Rodriguez" <mcgrof@do-not-panic.com> wrote:

> From: "Luis R. Rodriguez" <mcgrof@suse.com>
> 
> root block support was added via 1007dd1a on v3.8 but toggling
> this flag is only allowed after a device has been registered and
> added to a bridge as its a bridge *port* primitive, not a *net_device*
> feature. There are work arounds possible to account for the lack
> of netlink tools to toggle root_block, such as using the root_block
> syfs attribute [0] or using udev / the driver to set the MAC address
> to something high such as FE:FF:FF:FF:FF:FF, but neither of these
> ensure root block is respected _from_the_start_ through device
> initialization.
> 
> In order to support the root_block feature from the start since device
> initialization and in order to avoid having to require userspace
> work arounds to existing deployments this exposes a private
> net_device flag which enables drivers that know they want to
> start with the root_block feature enabled form the start. The
> only caveat with this is topologies that require STP or non-root
> will either have to use sysfs [0] or netlink tools like the
> iproute2 bridge util to toggle the flag off after initialization.
> This is an accepted compromise.
> 
> This flag is required given that ndo_add_slave() currently does not
> allow specifying any other parameters other than the net_device. We
> could extend this but in order to do that properly we'd need to
> evaluate all other types of master device implementations.
> 
> [0] echo 1 > /sys/devices/vif-2-0/net/vif2.0/brport/root_block
> 
> Cc: Stephen Hemminger <stephen@networkplumber.org>
> Cc: bridge@lists.linux-foundation.org
> Cc: netdev@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Cc: xen-devel@lists.xenproject.org
> Cc: kvm@vger.kernel.org
> Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
> ---
>  include/linux/netdevice.h | 7 +++++++
>  net/bridge/br_if.c        | 2 ++
>  2 files changed, 9 insertions(+)
> 
> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
> index 1a86948..b17643a 100644
> --- a/include/linux/netdevice.h
> +++ b/include/linux/netdevice.h
> @@ -1181,6 +1181,11 @@ struct net_device_ops {
>   * @IFF_LIVE_ADDR_CHANGE: device supports hardware address
>   *	change when it's running
>   * @IFF_MACVLAN: Macvlan device
> + * @IFF_BRIDGE_ROOT_BLOCK: don't consider this net_device for root port
> + *	when this interface is added to a bridge. This makes use of the
> + *	root_block mechanism but since its a bridge port primitive this
> + *	flag can be used to instantiate the preference to have root block
> + *	enabled from the start since initialization.
>   */

Doing this in priv flags bloats what is a limited resource (# of bits).
Plus there are issues (what if this is changed after adding to bridge)?

Maybe better to enhance existing netlink infrastructure to allow passing
flags on adding port to bridge.

Also, unless device is up, nothing will happen right away when added to bridge.
Root port status can be changed since device is disabled.


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Luis R. Rodriguez March 3, 2014, 11:58 p.m. UTC | #2
On Mon, Mar 3, 2014 at 3:43 PM, Stephen Hemminger
<stephen@networkplumber.org> wrote:
> Doing this in priv flags bloats what is a limited resource (# of bits).

Agreed. I tried to avoid it but saw no other option for addressing
this during initialization properly without requirng a userspace
upgrade.

> Plus there are issues (what if this is changed after adding to bridge)?

Its only considered when the net_device gets added to the bridge. If
you add it and then toggle root_block off that will be respected. If
you remove the net_device from the bridge and then add it back on the
priv flag will be respected but if desirable we could toggle it off if
userspace was used to disable root_block once on.

> Maybe better to enhance existing netlink infrastructure to allow passing
> flags on adding port to bridge.

Agreed, I looked into this and the limitation of using the existing
slave add request is that ndo_add_slave() only passes the net_device,
we'd have to either extend that (with collateral study on other slaves
as noted on my cover letter) or we'd have to make this a UAPI netlink
feature for the net_device. Some old userspace  may not use netlink
too, and they may be stuck with brctl, I tried to create a compromise
only to not affect old userspace if that kernel gets upgraded.

> Also, unless device is up, nothing will happen right away when added to bridge.

I get different results, the bridge immediately does compute whether
or not to nominate a device for the bridge interface and root port.
The commands provided as an example on 3/6 can be used to replicate
this.

> Root port status can be changed since device is disabled.

Agreed however the reason for the flag is to address removing high MAC
address hack-preference set on drivers that were merged prior to the
root_block feature. Without a proper way to address this upon
initialization we're stuck with requiring userspace upgrade if these
driver's hack is removed.

  Luis
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Stephen Hemminger March 4, 2014, 12:31 a.m. UTC | #3
On Mon, 3 Mar 2014 15:58:50 -0800
"Luis R. Rodriguez" <mcgrof@do-not-panic.com> wrote:

> On Mon, Mar 3, 2014 at 3:43 PM, Stephen Hemminger
> <stephen@networkplumber.org> wrote:
> > Doing this in priv flags bloats what is a limited resource (# of bits).  
> 
> Agreed. I tried to avoid it but saw no other option for addressing
> this during initialization properly without requirng a userspace
> upgrade.

Replacing one Xen hack for another doesn't seem like great progress.
I would rather see an API change as needed because there are other
things like setting STP parameters which might also want to be part
of initial device add.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Luis R. Rodriguez March 4, 2014, 12:53 a.m. UTC | #4
On Mon, Mar 3, 2014 at 4:31 PM, Stephen Hemminger
<stephen@networkplumber.org> wrote:
> On Mon, 3 Mar 2014 15:58:50 -0800
> "Luis R. Rodriguez" <mcgrof@do-not-panic.com> wrote:
>
>> On Mon, Mar 3, 2014 at 3:43 PM, Stephen Hemminger
>> <stephen@networkplumber.org> wrote:
>> > Doing this in priv flags bloats what is a limited resource (# of bits).
>>
>> Agreed. I tried to avoid it but saw no other option for addressing
>> this during initialization properly without requirng a userspace
>> upgrade.
>
> Replacing one Xen hack for another doesn't seem like great progress.

I'm the last person that wants to see any internal driver hack
propagate, the goal was to kill one here.

> I would rather see an API change as needed because there are other
> things like setting STP parameters which might also want to be part
> of initial device add.

As it stands right now ndo_add_slave() currently only allows us to
pass the net_device, that's it, so we're confined there. One way I can
think of to properly extend this interface is to allow extra
parameters to be passed. We may even be able to move some old
priv_flags for bonding onto this new interface but before I moved down
this more invasive approach I wanted to review this other approach.
Does extending ndo_add_slave() seems reasonable, or do you have any
other preferred alternatives?

Even though the high MAC address was a xen hack it is a hack put in
place prior to the root block feature being added, so I would like at
the very least a bit of consideration on userspace here. Its really
the only reason why I've been stubborn on looking for a compromise. In
the KVM case that doesn't use the root_block feature yet I would have
considered just extending the ndo_add_slave() from the start but
because xen-netback *does* have in place a hack removing it requires
userspace considerations.

  Luis
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 1a86948..b17643a 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1181,6 +1181,11 @@  struct net_device_ops {
  * @IFF_LIVE_ADDR_CHANGE: device supports hardware address
  *	change when it's running
  * @IFF_MACVLAN: Macvlan device
+ * @IFF_BRIDGE_ROOT_BLOCK: don't consider this net_device for root port
+ *	when this interface is added to a bridge. This makes use of the
+ *	root_block mechanism but since its a bridge port primitive this
+ *	flag can be used to instantiate the preference to have root block
+ *	enabled from the start since initialization.
  */
 enum netdev_priv_flags {
 	IFF_802_1Q_VLAN			= 1<<0,
@@ -1205,6 +1210,7 @@  enum netdev_priv_flags {
 	IFF_SUPP_NOFCS			= 1<<19,
 	IFF_LIVE_ADDR_CHANGE		= 1<<20,
 	IFF_MACVLAN			= 1<<21,
+	IFF_BRIDGE_ROOT_BLOCK		= 1<<22,
 };
 
 #define IFF_802_1Q_VLAN			IFF_802_1Q_VLAN
@@ -1229,6 +1235,7 @@  enum netdev_priv_flags {
 #define IFF_SUPP_NOFCS			IFF_SUPP_NOFCS
 #define IFF_LIVE_ADDR_CHANGE		IFF_LIVE_ADDR_CHANGE
 #define IFF_MACVLAN			IFF_MACVLAN
+#define IFF_BRIDGE_ROOT_BLOCK		IFF_BRIDGE_ROOT_BLOCK
 
 /*
  *	The DEVICE structure.
diff --git a/net/bridge/br_if.c b/net/bridge/br_if.c
index 54d207d..4bee748 100644
--- a/net/bridge/br_if.c
+++ b/net/bridge/br_if.c
@@ -228,6 +228,8 @@  static struct net_bridge_port *new_nbp(struct net_bridge *br,
 	br_init_port(p);
 	p->state = BR_STATE_DISABLED;
 	br_stp_port_timer_init(p);
+	if (dev->priv_flags & IFF_BRIDGE_ROOT_BLOCK)
+		p->flags |= BR_ROOT_BLOCK;
 	br_multicast_add_port(p);
 
 	return p;