Message ID | 20200212200555.2393-1-f.fainelli@gmail.com |
---|---|
State | Changes Requested |
Delegated to: | David Miller |
Headers | show |
Series | [net] net: dsa: Treat VLAN ID 0 as PVID untagged | expand |
Hi Florian, On Wed, 12 Feb 2020 at 22:06, Florian Fainelli <f.fainelli@gmail.com> wrote: > > VLAN ID 0 is special by all kinds and is really meant to be the default > ingress and egress untagged VLAN. We were not configuring it that way > and so we would be ingress untagged but egress tagged. > > When our devices are interfaced with other link partners such as switch > devices, the results would be entirely equipment dependent. Some > switches are completely fine with accepting an egress tagged frame with > VLAN ID 0 and would send their responses untagged, so everything works, > but other devices are not so tolerant and would typically reject a VLAN > ID 0 tagged frame. Are you sure that it's not in fact those devices that are not doing what they're supposed to? VID 0 should be sent as tagged and no port membership checks should be enforced on it. > > Fixes: 061f6a505ac3 ("net: dsa: Add ndo_vlan_rx_{add, kill}_vid implementation") > Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> > --- > Hi all, > > After looking at all DSA drivers and how they implement port_vlan_add() > I think this is the right change to do, but would appreciate if you > could test this on your respective platforms to ensure this is not > problematic. I'm pretty sure this is problematic, for the simple reason that with this change, DSA is insisting that the default PVID is 0, contrary to the bridge core which insists it is 1. And some switches, like the Microchip Ocelot/Felix, don't support more than 1 egress-untagged VLAN, so adding one of the VIDs 0 or 1 will fail (I don't know the exact order off-hand). See 1c44ce560b4d ("net: mscc: ocelot: fix vlan_filtering when enslaving to bridge before link is up") for more details of how that is going to work. > > Thank you > > > net/dsa/slave.c | 9 ++++++++- > 1 file changed, 8 insertions(+), 1 deletion(-) > > diff --git a/net/dsa/slave.c b/net/dsa/slave.c > index 088c886e609e..d3a2782eb94d 100644 > --- a/net/dsa/slave.c > +++ b/net/dsa/slave.c > @@ -1100,6 +1100,7 @@ static int dsa_slave_vlan_rx_add_vid(struct net_device *dev, __be16 proto, > { > struct dsa_port *dp = dsa_slave_to_port(dev); > struct bridge_vlan_info info; > + u16 flags = 0; > int ret; > > /* Check for a possible bridge VLAN entry now since there is no > @@ -1118,7 +1119,13 @@ static int dsa_slave_vlan_rx_add_vid(struct net_device *dev, __be16 proto, > return -EBUSY; > } > > - ret = dsa_port_vid_add(dp, vid, 0); > + /* VLAN ID 0 is special and should be the default egress and ingress > + * untagged VLAN, make sure it gets programmed as such. > + */ > + if (vid == 0) > + flags = BRIDGE_VLAN_INFO_PVID | BRIDGE_VLAN_INFO_UNTAGGED; IEEE 802.1Q-2018, page 247, Table 9-2—Reserved VID values: The null VID. Indicates that the tag header contains only priority information; no VID is present in the frame. This VID value shall not be configured as a PVID or a member of a VID Set, or configured in any FDB entry, or used in any Management operation. > + > + ret = dsa_port_vid_add(dp, vid, flags); > if (ret) > return ret; > > -- > 2.17.1 > Is this a test? Regards, -Vladimir
On 2/12/20 12:47 PM, Vladimir Oltean wrote: > Hi Florian, > > On Wed, 12 Feb 2020 at 22:06, Florian Fainelli <f.fainelli@gmail.com> wrote: >> >> VLAN ID 0 is special by all kinds and is really meant to be the default >> ingress and egress untagged VLAN. We were not configuring it that way >> and so we would be ingress untagged but egress tagged. >> >> When our devices are interfaced with other link partners such as switch >> devices, the results would be entirely equipment dependent. Some >> switches are completely fine with accepting an egress tagged frame with >> VLAN ID 0 and would send their responses untagged, so everything works, >> but other devices are not so tolerant and would typically reject a VLAN >> ID 0 tagged frame. > > Are you sure that it's not in fact those devices that are not doing > what they're supposed to? VID 0 should be sent as tagged and no port > membership checks should be enforced on it. Where everything works what I see is the following: - Linux on egress sends an untagged frame (as captured by tcpdump) but the VLAN entry for VID 0 makes it egress tagged and the machine on the other sees it as such as well - the response from that machine is also ingress tagged as captured from the DSA master network device what I do not have visibility into are systems where this does not work but will try to request that. Breaking users is obviously bad which prompted me for doing this specification violating frame. I am not sure whether DSA standalone ports qualify as managed ports or not, sounds like no given we have not added support for doing much UC/MC filtering unlike what NICs do. > >> >> Fixes: 061f6a505ac3 ("net: dsa: Add ndo_vlan_rx_{add, kill}_vid implementation") >> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> >> --- >> Hi all, >> >> After looking at all DSA drivers and how they implement port_vlan_add() >> I think this is the right change to do, but would appreciate if you >> could test this on your respective platforms to ensure this is not >> problematic. > > I'm pretty sure this is problematic, for the simple reason that with > this change, DSA is insisting that the default PVID is 0, contrary to > the bridge core which insists it is 1. And some switches, like the > Microchip Ocelot/Felix, don't support more than 1 egress-untagged > VLAN, so adding one of the VIDs 0 or 1 will fail (I don't know the > exact order off-hand). See 1c44ce560b4d ("net: mscc: ocelot: fix > vlan_filtering when enslaving to bridge before link is up") for more > details of how that is going to work. OK, I do wonder if we would be better off just skipping the VLAN programming for VID = 0 and/or just defining a different reserved/default VLAN ID for switches that have global VLAN filtering. > >> >> Thank you >> >> >> net/dsa/slave.c | 9 ++++++++- >> 1 file changed, 8 insertions(+), 1 deletion(-) >> >> diff --git a/net/dsa/slave.c b/net/dsa/slave.c >> index 088c886e609e..d3a2782eb94d 100644 >> --- a/net/dsa/slave.c >> +++ b/net/dsa/slave.c >> @@ -1100,6 +1100,7 @@ static int dsa_slave_vlan_rx_add_vid(struct net_device *dev, __be16 proto, >> { >> struct dsa_port *dp = dsa_slave_to_port(dev); >> struct bridge_vlan_info info; >> + u16 flags = 0; >> int ret; >> >> /* Check for a possible bridge VLAN entry now since there is no >> @@ -1118,7 +1119,13 @@ static int dsa_slave_vlan_rx_add_vid(struct net_device *dev, __be16 proto, >> return -EBUSY; >> } >> >> - ret = dsa_port_vid_add(dp, vid, 0); >> + /* VLAN ID 0 is special and should be the default egress and ingress >> + * untagged VLAN, make sure it gets programmed as such. >> + */ >> + if (vid == 0) >> + flags = BRIDGE_VLAN_INFO_PVID | BRIDGE_VLAN_INFO_UNTAGGED; > > IEEE 802.1Q-2018, page 247, Table 9-2—Reserved VID values: > > The null VID. Indicates that the tag header contains only priority > information; no VID is > present in the frame. This VID value shall not be configured as a PVID > or a member of a VID > Set, or configured in any FDB entry, or used in any Management operation. > >> + >> + ret = dsa_port_vid_add(dp, vid, flags); >> if (ret) >> return ret; >> >> -- >> 2.17.1 >> > > Is this a test? Of course, I am always testing you, that way you do not know if I am incredibly smart or stupid.
On Wed, 12 Feb 2020 at 23:28, Florian Fainelli <f.fainelli@gmail.com> wrote: > > On 2/12/20 12:47 PM, Vladimir Oltean wrote: > > Hi Florian, > > > > On Wed, 12 Feb 2020 at 22:06, Florian Fainelli <f.fainelli@gmail.com> wrote: > >> > >> VLAN ID 0 is special by all kinds and is really meant to be the default > >> ingress and egress untagged VLAN. We were not configuring it that way > >> and so we would be ingress untagged but egress tagged. > >> > >> When our devices are interfaced with other link partners such as switch > >> devices, the results would be entirely equipment dependent. Some > >> switches are completely fine with accepting an egress tagged frame with > >> VLAN ID 0 and would send their responses untagged, so everything works, > >> but other devices are not so tolerant and would typically reject a VLAN > >> ID 0 tagged frame. > > > > Are you sure that it's not in fact those devices that are not doing > > what they're supposed to? VID 0 should be sent as tagged and no port > > membership checks should be enforced on it. > > Where everything works what I see is the following: > > - Linux on egress sends an untagged frame (as captured by tcpdump) but > the VLAN entry for VID 0 makes it egress tagged and the machine on the > other sees it as such as well So the operating system is sending untagged traffic, it gets pvid-tagged by the hardware on the CPU port and is sent as egress-tagged on the front panel. Odd, but ok, not illegal, I suppose. The odd part is caused by having the vid 0 as pvid. Otherwise, having vid 0 as egress-tagged is not in itself a problem, since the assumption is that the only way a frame would get to the switch with VID 0 was if that VID was already in the tag. If anything, changing the pvid to something that is egress-untagged will give you some nice throughput boost, save you 4 bytes on the wire per frame. > - the response from that machine is also ingress tagged as captured from > the DSA master network device > > what I do not have visibility into are systems where this does not work > but will try to request that. Well, we can talk until the cows come home, but until the drop reason on those devices is clear, I would refrain from drawing any conclusion. > Breaking users is obviously bad which > prompted me for doing this specification violating frame. I am not sure > whether DSA standalone ports qualify as managed ports or not, sounds > like no given we have not added support for doing much UC/MC filtering > unlike what NICs do. > > > > >> > >> Fixes: 061f6a505ac3 ("net: dsa: Add ndo_vlan_rx_{add, kill}_vid implementation") > >> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> > >> --- > >> Hi all, > >> > >> After looking at all DSA drivers and how they implement port_vlan_add() > >> I think this is the right change to do, but would appreciate if you > >> could test this on your respective platforms to ensure this is not > >> problematic. > > > > I'm pretty sure this is problematic, for the simple reason that with > > this change, DSA is insisting that the default PVID is 0, contrary to > > the bridge core which insists it is 1. And some switches, like the > > Microchip Ocelot/Felix, don't support more than 1 egress-untagged > > VLAN, so adding one of the VIDs 0 or 1 will fail (I don't know the > > exact order off-hand). See 1c44ce560b4d ("net: mscc: ocelot: fix > > vlan_filtering when enslaving to bridge before link is up") for more > > details of how that is going to work. > > OK, I do wonder if we would be better off just skipping the VLAN > programming for VID = 0 and/or just defining a different > reserved/default VLAN ID for switches that have global VLAN filtering. > Oh, right, I remember. This is one of the switches with b53_default_pvid=0? So what were you saying... [0] > > > > Why should we bend the framework because sja1105 and dsa_8021q are > > > > special? [0]: https://lore.kernel.org/netdev/670c1d7f-4d2c-e9b4-3057-e87a66ad0d33@gmail.com/ So having 0 as pvid will inevitably cause problems trying to do something meaningful with it on egress. Send it tagged, it'll mess with your untagged traffic, send it untagged, it'll mess with your stack-originated 802.1p-tagged traffic, as well as 802.1p-tagged traffic forwarded from other endpoints. Hence the reason why IEEE said "don't do that". Can't you do whatever workarounds with vid 0 and/or NETIF_F_HW_VLAN_CTAG_FILTER that are restricted to b53? The "flags" value of 0 for 802.1p tagged frames is fine under the assumption that vid 0 is never going to be a pvid, which it'd better not be. The bridge doesn't even let you run "bridge vlan add dev swp0 vid 0 pvid untagged", that should say something. > > > >> > >> Thank you > >> > >> > >> net/dsa/slave.c | 9 ++++++++- > >> 1 file changed, 8 insertions(+), 1 deletion(-) > >> > >> diff --git a/net/dsa/slave.c b/net/dsa/slave.c > >> index 088c886e609e..d3a2782eb94d 100644 > >> --- a/net/dsa/slave.c > >> +++ b/net/dsa/slave.c > >> @@ -1100,6 +1100,7 @@ static int dsa_slave_vlan_rx_add_vid(struct net_device *dev, __be16 proto, > >> { > >> struct dsa_port *dp = dsa_slave_to_port(dev); > >> struct bridge_vlan_info info; > >> + u16 flags = 0; > >> int ret; > >> > >> /* Check for a possible bridge VLAN entry now since there is no > >> @@ -1118,7 +1119,13 @@ static int dsa_slave_vlan_rx_add_vid(struct net_device *dev, __be16 proto, > >> return -EBUSY; > >> } > >> > >> - ret = dsa_port_vid_add(dp, vid, 0); > >> + /* VLAN ID 0 is special and should be the default egress and ingress > >> + * untagged VLAN, make sure it gets programmed as such. > >> + */ > >> + if (vid == 0) > >> + flags = BRIDGE_VLAN_INFO_PVID | BRIDGE_VLAN_INFO_UNTAGGED; > > > > IEEE 802.1Q-2018, page 247, Table 9-2—Reserved VID values: > > > > The null VID. Indicates that the tag header contains only priority > > information; no VID is > > present in the frame. This VID value shall not be configured as a PVID > > or a member of a VID > > Set, or configured in any FDB entry, or used in any Management operation. > > > >> + > >> + ret = dsa_port_vid_add(dp, vid, flags); > >> if (ret) > >> return ret; > >> > >> -- > >> 2.17.1 > >> > > > > Is this a test? > > Of course, I am always testing you, that way you do not know if I am > incredibly smart or stupid. Actually I was more like happy about not being a complete beginner anymore. Looking through the sja1105 git history you'll see that I've oscillated a few times myself between vid 0 and 1 as pvid. Perhaps overly excited that I could point to the right page in the right book, for once. > -- > Florian Regards, -Vladimir
On 2/12/20 2:38 PM, Vladimir Oltean wrote: > On Wed, 12 Feb 2020 at 23:28, Florian Fainelli <f.fainelli@gmail.com> wrote: >> >> On 2/12/20 12:47 PM, Vladimir Oltean wrote: >>> Hi Florian, >>> >>> On Wed, 12 Feb 2020 at 22:06, Florian Fainelli <f.fainelli@gmail.com> wrote: >>>> >>>> VLAN ID 0 is special by all kinds and is really meant to be the default >>>> ingress and egress untagged VLAN. We were not configuring it that way >>>> and so we would be ingress untagged but egress tagged. >>>> >>>> When our devices are interfaced with other link partners such as switch >>>> devices, the results would be entirely equipment dependent. Some >>>> switches are completely fine with accepting an egress tagged frame with >>>> VLAN ID 0 and would send their responses untagged, so everything works, >>>> but other devices are not so tolerant and would typically reject a VLAN >>>> ID 0 tagged frame. >>> >>> Are you sure that it's not in fact those devices that are not doing >>> what they're supposed to? VID 0 should be sent as tagged and no port >>> membership checks should be enforced on it. >> >> Where everything works what I see is the following: >> >> - Linux on egress sends an untagged frame (as captured by tcpdump) but >> the VLAN entry for VID 0 makes it egress tagged and the machine on the >> other sees it as such as well > > So the operating system is sending untagged traffic, it gets > pvid-tagged by the hardware on the CPU port and is sent as > egress-tagged on the front panel. > Odd, but ok, not illegal, I suppose. The odd part is caused by having > the vid 0 as pvid. Otherwise, having vid 0 as egress-tagged is not in > itself a problem, since the assumption is that the only way a frame > would get to the switch with VID 0 was if that VID was already in the > tag. > If anything, changing the pvid to something that is egress-untagged > will give you some nice throughput boost, save you 4 bytes on the wire > per frame. > >> - the response from that machine is also ingress tagged as captured from >> the DSA master network device >> >> what I do not have visibility into are systems where this does not work >> but will try to request that. > > Well, we can talk until the cows come home, but until the drop reason > on those devices is clear, I would refrain from drawing any > conclusion. > >> Breaking users is obviously bad which >> prompted me for doing this specification violating frame. I am not sure >> whether DSA standalone ports qualify as managed ports or not, sounds >> like no given we have not added support for doing much UC/MC filtering >> unlike what NICs do. >> >>> >>>> >>>> Fixes: 061f6a505ac3 ("net: dsa: Add ndo_vlan_rx_{add, kill}_vid implementation") >>>> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> >>>> --- >>>> Hi all, >>>> >>>> After looking at all DSA drivers and how they implement port_vlan_add() >>>> I think this is the right change to do, but would appreciate if you >>>> could test this on your respective platforms to ensure this is not >>>> problematic. >>> >>> I'm pretty sure this is problematic, for the simple reason that with >>> this change, DSA is insisting that the default PVID is 0, contrary to >>> the bridge core which insists it is 1. And some switches, like the >>> Microchip Ocelot/Felix, don't support more than 1 egress-untagged >>> VLAN, so adding one of the VIDs 0 or 1 will fail (I don't know the >>> exact order off-hand). See 1c44ce560b4d ("net: mscc: ocelot: fix >>> vlan_filtering when enslaving to bridge before link is up") for more >>> details of how that is going to work. >> >> OK, I do wonder if we would be better off just skipping the VLAN >> programming for VID = 0 and/or just defining a different >> reserved/default VLAN ID for switches that have global VLAN filtering. >> > > Oh, right, I remember. This is one of the switches with b53_default_pvid=0? > So what were you saying... [0] Yes that is correct, and we also have VLAN filtering being a global property, which is why we cannot simply reject programming with VID = 0, since that is the default, and default VID still requires a VLAN entry. > >>>>> Why should we bend the framework because sja1105 and dsa_8021q are >>>>> special? > > [0]: https://lore.kernel.org/netdev/670c1d7f-4d2c-e9b4-3057-e87a66ad0d33@gmail.com/ > > So having 0 as pvid will inevitably cause problems trying to do > something meaningful with it on egress. Send it tagged, it'll mess > with your untagged traffic, send it untagged, it'll mess with your > stack-originated 802.1p-tagged traffic, as well as 802.1p-tagged > traffic forwarded from other endpoints. Hence the reason why IEEE said > "don't do that". > > Can't you do whatever workarounds with vid 0 and/or > NETIF_F_HW_VLAN_CTAG_FILTER that are restricted to b53? The "flags" > value of 0 for 802.1p tagged frames is fine under the assumption that > vid 0 is never going to be a pvid, which it'd better not be. The > bridge doesn't even let you run "bridge vlan add dev swp0 vid 0 pvid > untagged", that should say something. Yes, forcing the default VID to be unconditionally untagged (but not PVID), scoped specifically to b53 works nicely, with, or without VLAN filtering and I can confirm that no VID 0 tag is sent on the wire (as expected). Thanks!
Hi Florian, On Wed, 12 Feb 2020 12:05:55 -0800, Florian Fainelli <f.fainelli@gmail.com> wrote: > VLAN ID 0 is special by all kinds and is really meant to be the default > ingress and egress untagged VLAN. We were not configuring it that way > and so we would be ingress untagged but egress tagged. > > When our devices are interfaced with other link partners such as switch > devices, the results would be entirely equipment dependent. Some > switches are completely fine with accepting an egress tagged frame with > VLAN ID 0 and would send their responses untagged, so everything works, > but other devices are not so tolerant and would typically reject a VLAN > ID 0 tagged frame. > > Fixes: 061f6a505ac3 ("net: dsa: Add ndo_vlan_rx_{add, kill}_vid implementation") > Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> > --- > Hi all, > > After looking at all DSA drivers and how they implement port_vlan_add() > I think this is the right change to do, but would appreciate if you > could test this on your respective platforms to ensure this is not > problematic. > > Thank you > > > net/dsa/slave.c | 9 ++++++++- > 1 file changed, 8 insertions(+), 1 deletion(-) > > diff --git a/net/dsa/slave.c b/net/dsa/slave.c > index 088c886e609e..d3a2782eb94d 100644 > --- a/net/dsa/slave.c > +++ b/net/dsa/slave.c > @@ -1100,6 +1100,7 @@ static int dsa_slave_vlan_rx_add_vid(struct net_device *dev, __be16 proto, > { > struct dsa_port *dp = dsa_slave_to_port(dev); > struct bridge_vlan_info info; > + u16 flags = 0; > int ret; > > /* Check for a possible bridge VLAN entry now since there is no > @@ -1118,7 +1119,13 @@ static int dsa_slave_vlan_rx_add_vid(struct net_device *dev, __be16 proto, > return -EBUSY; > } > > - ret = dsa_port_vid_add(dp, vid, 0); > + /* VLAN ID 0 is special and should be the default egress and ingress > + * untagged VLAN, make sure it gets programmed as such. > + */ > + if (vid == 0) > + flags = BRIDGE_VLAN_INFO_PVID | BRIDGE_VLAN_INFO_UNTAGGED; > + > + ret = dsa_port_vid_add(dp, vid, flags); > if (ret) > return ret; If a frame tagged with VID 0 ingresses a Marvell port with 802.1Q enabled, the VID assigned will be the port's default VID. That being said, the hardware shouldn't prevent us from programming a port's default VID as 0 or adding an entry for VID 0 in the VLAN table, but AFAICT we are rejecting the latter for some reasons (it might have no effect, idk). With this change we will be overriding the port's default VID with 0 in addition to attempting to program a VLAN entry for the null VID (mv88e6xxx would still return -EOPNOTSUPP at the moment for both anyway). Am I correct? Thank you, Vivien
diff --git a/net/dsa/slave.c b/net/dsa/slave.c index 088c886e609e..d3a2782eb94d 100644 --- a/net/dsa/slave.c +++ b/net/dsa/slave.c @@ -1100,6 +1100,7 @@ static int dsa_slave_vlan_rx_add_vid(struct net_device *dev, __be16 proto, { struct dsa_port *dp = dsa_slave_to_port(dev); struct bridge_vlan_info info; + u16 flags = 0; int ret; /* Check for a possible bridge VLAN entry now since there is no @@ -1118,7 +1119,13 @@ static int dsa_slave_vlan_rx_add_vid(struct net_device *dev, __be16 proto, return -EBUSY; } - ret = dsa_port_vid_add(dp, vid, 0); + /* VLAN ID 0 is special and should be the default egress and ingress + * untagged VLAN, make sure it gets programmed as such. + */ + if (vid == 0) + flags = BRIDGE_VLAN_INFO_PVID | BRIDGE_VLAN_INFO_UNTAGGED; + + ret = dsa_port_vid_add(dp, vid, flags); if (ret) return ret;
VLAN ID 0 is special by all kinds and is really meant to be the default ingress and egress untagged VLAN. We were not configuring it that way and so we would be ingress untagged but egress tagged. When our devices are interfaced with other link partners such as switch devices, the results would be entirely equipment dependent. Some switches are completely fine with accepting an egress tagged frame with VLAN ID 0 and would send their responses untagged, so everything works, but other devices are not so tolerant and would typically reject a VLAN ID 0 tagged frame. Fixes: 061f6a505ac3 ("net: dsa: Add ndo_vlan_rx_{add, kill}_vid implementation") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> --- Hi all, After looking at all DSA drivers and how they implement port_vlan_add() I think this is the right change to do, but would appreciate if you could test this on your respective platforms to ensure this is not problematic. Thank you net/dsa/slave.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-)