Message ID | 20241013185509.4430-12-ericwouds@gmail.com |
---|---|
State | RFC |
Headers | show |
Series | bridge-fastpath and related improvements | expand |
On 13/10/2024 21:55, Eric Woudstra wrote: > In network setup as below: > > fastpath bypass > .----------------------------------------. > / \ > | IP - forwarding | > | / \ v > | / wan ... > | / > | | > | | > | brlan.1 > | | > | +-------------------------------+ > | | vlan 1 | > | | | > | | brlan (vlan-filtering) | > | | +---------------+ > | | | DSA-SWITCH | > | | vlan 1 | | > | | to | | > | | untagged 1 vlan 1 | > | +---------------+---------------+ > . / \ > ----->wlan1 lan0 > . . > . ^ > ^ vlan 1 tagged packets > untagged packets > > Now that DEV_PATH_MTK_WDMA is added to nft_dev_path_info() the forward > path is filled also when ending with the mediatek wlan1, info.indev not > NULL now in nft_dev_forward_path(). This results in a direct transmit > instead of a neighbor transmit. This is how it should be, But this fails. > > br_vlan_fill_forward_path_mode() sets DEV_PATH_BR_VLAN_UNTAG_HW when > filling in from brlan.1 towards wlan1. But it should be set to > DEV_PATH_BR_VLAN_UNTAG in this case. Using BR_VLFLAG_ADDED_BY_SWITCHDEV > is not correct. The dsa switchdev adds it as a foreign port. > > Use BR_VLFLAG_TAGGING_BY_SWITCHDEV to make sure DEV_PATH_BR_VLAN_UNTAG is > set when there is a dsa-switch inside the bridge. > > Signed-off-by: Eric Woudstra <ericwouds@gmail.com> > --- > net/bridge/br_private.h | 1 + > net/bridge/br_vlan.c | 18 +++++++++++++++++- > 2 files changed, 18 insertions(+), 1 deletion(-) > > diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h > index 8da7798f9368..7d427214cc7c 100644 > --- a/net/bridge/br_private.h > +++ b/net/bridge/br_private.h > @@ -180,6 +180,7 @@ enum { > BR_VLFLAG_MCAST_ENABLED = BIT(2), > BR_VLFLAG_GLOBAL_MCAST_ENABLED = BIT(3), > BR_VLFLAG_NEIGH_SUPPRESS_ENABLED = BIT(4), > + BR_VLFLAG_TAGGING_BY_SWITCHDEV = BIT(5), > }; > > /** > diff --git a/net/bridge/br_vlan.c b/net/bridge/br_vlan.c > index 1830d7d617cd..b7877724b969 100644 > --- a/net/bridge/br_vlan.c > +++ b/net/bridge/br_vlan.c > @@ -3,6 +3,7 @@ > #include <linux/netdevice.h> > #include <linux/rtnetlink.h> > #include <linux/slab.h> > +#include <net/dsa.h> > #include <net/switchdev.h> > > #include "br_private.h" > @@ -100,6 +101,19 @@ static void __vlan_flags_commit(struct net_bridge_vlan *v, u16 flags) > __vlan_flags_update(v, flags, true); > } > > +static inline bool br_vlan_tagging_by_switchdev(struct net_bridge *br) no inline in .c files and also constify br > +{ > +#if IS_ENABLED(CONFIG_NET_DSA) > + struct net_bridge_port *p; > + > + list_for_each_entry(p, &br->port_list, list) { > + if (dsa_user_dev_check(p->dev)) I don't think this can change at runtime, so please keep a counter in the bridge and don't walk the port list on every vlan add. > + return false; > + } > +#endif > + return true; > +} > + > static int __vlan_vid_add(struct net_device *dev, struct net_bridge *br, > struct net_bridge_vlan *v, u16 flags, > struct netlink_ext_ack *extack) > @@ -113,6 +127,8 @@ static int __vlan_vid_add(struct net_device *dev, struct net_bridge *br, > if (err == -EOPNOTSUPP) > return vlan_vid_add(dev, br->vlan_proto, v->vid); > v->priv_flags |= BR_VLFLAG_ADDED_BY_SWITCHDEV; > + if (br_vlan_tagging_by_switchdev(br)) > + v->priv_flags |= BR_VLFLAG_TAGGING_BY_SWITCHDEV; > return err; > } > > @@ -1491,7 +1507,7 @@ int br_vlan_fill_forward_path_mode(struct net_bridge *br, > > if (path->bridge.vlan_mode == DEV_PATH_BR_VLAN_TAG) > path->bridge.vlan_mode = DEV_PATH_BR_VLAN_KEEP; > - else if (v->priv_flags & BR_VLFLAG_ADDED_BY_SWITCHDEV) > + else if (v->priv_flags & BR_VLFLAG_TAGGING_BY_SWITCHDEV) > path->bridge.vlan_mode = DEV_PATH_BR_VLAN_UNTAG_HW; > else > path->bridge.vlan_mode = DEV_PATH_BR_VLAN_UNTAG;
On 14/10/2024 09:18, Nikolay Aleksandrov wrote: > On 13/10/2024 21:55, Eric Woudstra wrote: >> In network setup as below: >> >> fastpath bypass >> .----------------------------------------. >> / \ >> | IP - forwarding | >> | / \ v >> | / wan ... >> | / >> | | >> | | >> | brlan.1 >> | | >> | +-------------------------------+ >> | | vlan 1 | >> | | | >> | | brlan (vlan-filtering) | >> | | +---------------+ >> | | | DSA-SWITCH | >> | | vlan 1 | | >> | | to | | >> | | untagged 1 vlan 1 | >> | +---------------+---------------+ >> . / \ >> ----->wlan1 lan0 >> . . >> . ^ >> ^ vlan 1 tagged packets >> untagged packets >> >> Now that DEV_PATH_MTK_WDMA is added to nft_dev_path_info() the forward >> path is filled also when ending with the mediatek wlan1, info.indev not >> NULL now in nft_dev_forward_path(). This results in a direct transmit >> instead of a neighbor transmit. This is how it should be, But this fails. >> >> br_vlan_fill_forward_path_mode() sets DEV_PATH_BR_VLAN_UNTAG_HW when >> filling in from brlan.1 towards wlan1. But it should be set to >> DEV_PATH_BR_VLAN_UNTAG in this case. Using BR_VLFLAG_ADDED_BY_SWITCHDEV >> is not correct. The dsa switchdev adds it as a foreign port. >> >> Use BR_VLFLAG_TAGGING_BY_SWITCHDEV to make sure DEV_PATH_BR_VLAN_UNTAG is >> set when there is a dsa-switch inside the bridge. >> >> Signed-off-by: Eric Woudstra <ericwouds@gmail.com> >> --- >> net/bridge/br_private.h | 1 + >> net/bridge/br_vlan.c | 18 +++++++++++++++++- >> 2 files changed, 18 insertions(+), 1 deletion(-) >> >> diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h >> index 8da7798f9368..7d427214cc7c 100644 >> --- a/net/bridge/br_private.h >> +++ b/net/bridge/br_private.h >> @@ -180,6 +180,7 @@ enum { >> BR_VLFLAG_MCAST_ENABLED = BIT(2), >> BR_VLFLAG_GLOBAL_MCAST_ENABLED = BIT(3), >> BR_VLFLAG_NEIGH_SUPPRESS_ENABLED = BIT(4), >> + BR_VLFLAG_TAGGING_BY_SWITCHDEV = BIT(5), >> }; >> >> /** >> diff --git a/net/bridge/br_vlan.c b/net/bridge/br_vlan.c >> index 1830d7d617cd..b7877724b969 100644 >> --- a/net/bridge/br_vlan.c >> +++ b/net/bridge/br_vlan.c >> @@ -3,6 +3,7 @@ >> #include <linux/netdevice.h> >> #include <linux/rtnetlink.h> >> #include <linux/slab.h> >> +#include <net/dsa.h> >> #include <net/switchdev.h> >> >> #include "br_private.h" >> @@ -100,6 +101,19 @@ static void __vlan_flags_commit(struct net_bridge_vlan *v, u16 flags) >> __vlan_flags_update(v, flags, true); >> } >> >> +static inline bool br_vlan_tagging_by_switchdev(struct net_bridge *br) > > no inline in .c files and also constify br > >> +{ >> +#if IS_ENABLED(CONFIG_NET_DSA) >> + struct net_bridge_port *p; >> + >> + list_for_each_entry(p, &br->port_list, list) { >> + if (dsa_user_dev_check(p->dev)) > > I don't think this can change at runtime, so please keep a counter in > the bridge and don't walk the port list on every vlan add. > you can use an internal bridge opt (check br_private.h) with a private opt that's set when such device is added as a port, no need for a full counter obviously >> + return false; >> + } >> +#endif >> + return true; >> +} >> + >> static int __vlan_vid_add(struct net_device *dev, struct net_bridge *br, >> struct net_bridge_vlan *v, u16 flags, >> struct netlink_ext_ack *extack) >> @@ -113,6 +127,8 @@ static int __vlan_vid_add(struct net_device *dev, struct net_bridge *br, >> if (err == -EOPNOTSUPP) >> return vlan_vid_add(dev, br->vlan_proto, v->vid); >> v->priv_flags |= BR_VLFLAG_ADDED_BY_SWITCHDEV; >> + if (br_vlan_tagging_by_switchdev(br)) >> + v->priv_flags |= BR_VLFLAG_TAGGING_BY_SWITCHDEV; >> return err; >> } >> >> @@ -1491,7 +1507,7 @@ int br_vlan_fill_forward_path_mode(struct net_bridge *br, >> >> if (path->bridge.vlan_mode == DEV_PATH_BR_VLAN_TAG) >> path->bridge.vlan_mode = DEV_PATH_BR_VLAN_KEEP; >> - else if (v->priv_flags & BR_VLFLAG_ADDED_BY_SWITCHDEV) >> + else if (v->priv_flags & BR_VLFLAG_TAGGING_BY_SWITCHDEV) >> path->bridge.vlan_mode = DEV_PATH_BR_VLAN_UNTAG_HW; >> else >> path->bridge.vlan_mode = DEV_PATH_BR_VLAN_UNTAG; >
Keeping the full email body untrimmed for extra context for the newly added people. On Mon, Oct 14, 2024 at 09:22:07AM +0300, Nikolay Aleksandrov wrote: > On 14/10/2024 09:18, Nikolay Aleksandrov wrote: > > On 13/10/2024 21:55, Eric Woudstra wrote: > >> In network setup as below: > >> > >> fastpath bypass > >> .----------------------------------------. > >> / \ > >> | IP - forwarding | > >> | / \ v > >> | / wan ... > >> | / > >> | | > >> | | > >> | brlan.1 > >> | | > >> | +-------------------------------+ > >> | | vlan 1 | > >> | | | > >> | | brlan (vlan-filtering) | > >> | | +---------------+ > >> | | | DSA-SWITCH | > >> | | vlan 1 | | > >> | | to | | > >> | | untagged 1 vlan 1 | > >> | +---------------+---------------+ > >> . / \ > >> ----->wlan1 lan0 > >> . . > >> . ^ > >> ^ vlan 1 tagged packets > >> untagged packets > >> > >> Now that DEV_PATH_MTK_WDMA is added to nft_dev_path_info() the forward > >> path is filled also when ending with the mediatek wlan1, info.indev not > >> NULL now in nft_dev_forward_path(). This results in a direct transmit > >> instead of a neighbor transmit. This is how it should be, But this fails. > >> > >> br_vlan_fill_forward_path_mode() sets DEV_PATH_BR_VLAN_UNTAG_HW when > >> filling in from brlan.1 towards wlan1. But it should be set to > >> DEV_PATH_BR_VLAN_UNTAG in this case. Using BR_VLFLAG_ADDED_BY_SWITCHDEV > >> is not correct. The dsa switchdev adds it as a foreign port. > >> > >> Use BR_VLFLAG_TAGGING_BY_SWITCHDEV to make sure DEV_PATH_BR_VLAN_UNTAG is > >> set when there is a dsa-switch inside the bridge. > >> > >> Signed-off-by: Eric Woudstra <ericwouds@gmail.com> > >> --- > >> net/bridge/br_private.h | 1 + > >> net/bridge/br_vlan.c | 18 +++++++++++++++++- > >> 2 files changed, 18 insertions(+), 1 deletion(-) > >> > >> diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h > >> index 8da7798f9368..7d427214cc7c 100644 > >> --- a/net/bridge/br_private.h > >> +++ b/net/bridge/br_private.h > >> @@ -180,6 +180,7 @@ enum { > >> BR_VLFLAG_MCAST_ENABLED = BIT(2), > >> BR_VLFLAG_GLOBAL_MCAST_ENABLED = BIT(3), > >> BR_VLFLAG_NEIGH_SUPPRESS_ENABLED = BIT(4), > >> + BR_VLFLAG_TAGGING_BY_SWITCHDEV = BIT(5), > >> }; > >> > >> /** > >> diff --git a/net/bridge/br_vlan.c b/net/bridge/br_vlan.c > >> index 1830d7d617cd..b7877724b969 100644 > >> --- a/net/bridge/br_vlan.c > >> +++ b/net/bridge/br_vlan.c > >> @@ -3,6 +3,7 @@ > >> #include <linux/netdevice.h> > >> #include <linux/rtnetlink.h> > >> #include <linux/slab.h> > >> +#include <net/dsa.h> > >> #include <net/switchdev.h> > >> > >> #include "br_private.h" > >> @@ -100,6 +101,19 @@ static void __vlan_flags_commit(struct net_bridge_vlan *v, u16 flags) > >> __vlan_flags_update(v, flags, true); > >> } > >> > >> +static inline bool br_vlan_tagging_by_switchdev(struct net_bridge *br) > > > > no inline in .c files and also constify br > > > >> +{ > >> +#if IS_ENABLED(CONFIG_NET_DSA) > >> + struct net_bridge_port *p; > >> + > >> + list_for_each_entry(p, &br->port_list, list) { > >> + if (dsa_user_dev_check(p->dev)) > > > > I don't think this can change at runtime, so please keep a counter in > > the bridge and don't walk the port list on every vlan add. > > > > you can use an internal bridge opt (check br_private.h) with a private opt > that's set when such device is added as a port, no need for a full counter > obviously To continue on Nikolay's line of thought... Can you abstractly describe which functional behavior do you need the bridge port to perform, rather than "it needs to be a DSA user port"? switchdev_bridge_port_offload() has a mechanism to inform the bridge core of extra abilities (like tx_fwd_offload). Perhaps you could modify the DSA drivers you need to set a similar bit to inform the bridge of their presence and ability. That would also work when the bridge port is a LAG over a DSA user port. Also, please also CC DSA maintainers when you use DSA API outside net/dsa/ and drivers/net/dsa/. I am in the process of revamping the public DSA API and would like to be in touch with changes as they are made. > >> + return false; > >> + } > >> +#endif > >> + return true; > >> +} > >> + > >> static int __vlan_vid_add(struct net_device *dev, struct net_bridge *br, > >> struct net_bridge_vlan *v, u16 flags, > >> struct netlink_ext_ack *extack) > >> @@ -113,6 +127,8 @@ static int __vlan_vid_add(struct net_device *dev, struct net_bridge *br, > >> if (err == -EOPNOTSUPP) > >> return vlan_vid_add(dev, br->vlan_proto, v->vid); > >> v->priv_flags |= BR_VLFLAG_ADDED_BY_SWITCHDEV; > >> + if (br_vlan_tagging_by_switchdev(br)) > >> + v->priv_flags |= BR_VLFLAG_TAGGING_BY_SWITCHDEV; > >> return err; > >> } > >> > >> @@ -1491,7 +1507,7 @@ int br_vlan_fill_forward_path_mode(struct net_bridge *br, > >> > >> if (path->bridge.vlan_mode == DEV_PATH_BR_VLAN_TAG) > >> path->bridge.vlan_mode = DEV_PATH_BR_VLAN_KEEP; > >> - else if (v->priv_flags & BR_VLFLAG_ADDED_BY_SWITCHDEV) > >> + else if (v->priv_flags & BR_VLFLAG_TAGGING_BY_SWITCHDEV) > >> path->bridge.vlan_mode = DEV_PATH_BR_VLAN_UNTAG_HW; > >> else > >> path->bridge.vlan_mode = DEV_PATH_BR_VLAN_UNTAG; > >
On 10/14/24 4:46 PM, Vladimir Oltean wrote: > Keeping the full email body untrimmed for extra context for the newly > added people. > > On Mon, Oct 14, 2024 at 09:22:07AM +0300, Nikolay Aleksandrov wrote: >> On 14/10/2024 09:18, Nikolay Aleksandrov wrote: >>> On 13/10/2024 21:55, Eric Woudstra wrote: >>>> In network setup as below: >>>> >>>> fastpath bypass >>>> .----------------------------------------. >>>> / \ >>>> | IP - forwarding | >>>> | / \ v >>>> | / wan ... >>>> | / >>>> | | >>>> | | >>>> | brlan.1 >>>> | | >>>> | +-------------------------------+ >>>> | | vlan 1 | >>>> | | | >>>> | | brlan (vlan-filtering) | >>>> | | +---------------+ >>>> | | | DSA-SWITCH | >>>> | | vlan 1 | | >>>> | | to | | >>>> | | untagged 1 vlan 1 | >>>> | +---------------+---------------+ >>>> . / \ >>>> ----->wlan1 lan0 >>>> . . >>>> . ^ >>>> ^ vlan 1 tagged packets >>>> untagged packets >>>> >>>> Now that DEV_PATH_MTK_WDMA is added to nft_dev_path_info() the forward >>>> path is filled also when ending with the mediatek wlan1, info.indev not >>>> NULL now in nft_dev_forward_path(). This results in a direct transmit >>>> instead of a neighbor transmit. This is how it should be, But this fails. >>>> >>>> br_vlan_fill_forward_path_mode() sets DEV_PATH_BR_VLAN_UNTAG_HW when >>>> filling in from brlan.1 towards wlan1. But it should be set to >>>> DEV_PATH_BR_VLAN_UNTAG in this case. Using BR_VLFLAG_ADDED_BY_SWITCHDEV >>>> is not correct. The dsa switchdev adds it as a foreign port. >>>> >>>> Use BR_VLFLAG_TAGGING_BY_SWITCHDEV to make sure DEV_PATH_BR_VLAN_UNTAG is >>>> set when there is a dsa-switch inside the bridge. >>>> >>>> Signed-off-by: Eric Woudstra <ericwouds@gmail.com> >>>> --- >>>> net/bridge/br_private.h | 1 + >>>> net/bridge/br_vlan.c | 18 +++++++++++++++++- >>>> 2 files changed, 18 insertions(+), 1 deletion(-) >>>> >>>> diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h >>>> index 8da7798f9368..7d427214cc7c 100644 >>>> --- a/net/bridge/br_private.h >>>> +++ b/net/bridge/br_private.h >>>> @@ -180,6 +180,7 @@ enum { >>>> BR_VLFLAG_MCAST_ENABLED = BIT(2), >>>> BR_VLFLAG_GLOBAL_MCAST_ENABLED = BIT(3), >>>> BR_VLFLAG_NEIGH_SUPPRESS_ENABLED = BIT(4), >>>> + BR_VLFLAG_TAGGING_BY_SWITCHDEV = BIT(5), >>>> }; >>>> >>>> /** >>>> diff --git a/net/bridge/br_vlan.c b/net/bridge/br_vlan.c >>>> index 1830d7d617cd..b7877724b969 100644 >>>> --- a/net/bridge/br_vlan.c >>>> +++ b/net/bridge/br_vlan.c >>>> @@ -3,6 +3,7 @@ >>>> #include <linux/netdevice.h> >>>> #include <linux/rtnetlink.h> >>>> #include <linux/slab.h> >>>> +#include <net/dsa.h> >>>> #include <net/switchdev.h> >>>> >>>> #include "br_private.h" >>>> @@ -100,6 +101,19 @@ static void __vlan_flags_commit(struct net_bridge_vlan *v, u16 flags) >>>> __vlan_flags_update(v, flags, true); >>>> } >>>> >>>> +static inline bool br_vlan_tagging_by_switchdev(struct net_bridge *br) >>> >>> no inline in .c files and also constify br >>> >>>> +{ >>>> +#if IS_ENABLED(CONFIG_NET_DSA) >>>> + struct net_bridge_port *p; >>>> + >>>> + list_for_each_entry(p, &br->port_list, list) { >>>> + if (dsa_user_dev_check(p->dev)) >>> >>> I don't think this can change at runtime, so please keep a counter in >>> the bridge and don't walk the port list on every vlan add. >>> >> >> you can use an internal bridge opt (check br_private.h) with a private opt >> that's set when such device is added as a port, no need for a full counter >> obviously > > To continue on Nikolay's line of thought... > > Can you abstractly describe which functional behavior do you need the > bridge port to perform, rather than "it needs to be a DSA user port"? Hello Vladimir, This has to do with the sequence of events, when dealing with vlan tagging in a bridge. When looking at the ingress of untaggged packets, that will be tagged by a certain port of that bridge, it depends when this tagging happens, in respect to the netfilter hook. This describes the existing code: Because we are using dev_fill_forward_path(), we know in all cases that untagged packets arive and are tagged by the bridge. In the case (#1) off a switchdev, the tagging is done before the packet reaches the netfilter hook. Then the software fastpath code needs to handle the packet, as if it is incoming tagged, remembering that this tag was tagged at ingress (tuple->in_vlan_ingress). We need to remember this, because dealing with hardware offloaded fastpath we then need to skip this vlan tag in the hardware offloaded fastpath in nf_flow_rule_match() and nf_flow_rule_route_common(). In all other cases (#2), 'normal' ports (ports without switchdev and dsa-user-ports) the tagging is done after packet reaches the netfilter-hook. Then the software fastpath code needs to handle the packet, as incoming untagged. (Of course the dsa-user-port is more complicated, but it all handled by the dsa architecture so that it can be used as a 'normal' port.) In both cases #1 and #2 also the other direction is taken into account. To decide which case to apply, the code looks at BR_VLFLAG_ADDED_BY_SWITCHDEV, which was actually used for something else, but it is easily available in the code at that point and seemed to work, so far... But as it turns out, this flag is also set for foreign ports added to the dsa-switchdev. This means that case #1 is applied to the foreign dsa port. This breaks the software fastpath and, with packets not reaching their destination, also the hardware offloaded fastpath does not start. We need to apply case #2. So actually we need to know, inside br_vlan_fill_forward_path_mode(), whether or not net_bridge_port *dst is a foreign port added to the dsa switchdev. Or perhaps there is another way to find out we have to apply case #2. > switchdev_bridge_port_offload() has a mechanism to inform the bridge > core of extra abilities (like tx_fwd_offload). Perhaps you could modify > the DSA drivers you need to set a similar bit to inform the bridge of > their presence and ability. That would also work when the bridge port is > a LAG over a DSA user port. > > Also, please also CC DSA maintainers when you use DSA API outside > net/dsa/ and drivers/net/dsa/. I am in the process of revamping the > public DSA API and would like to be in touch with changes as they are > made. > >>>> + return false; >>>> + } >>>> +#endif >>>> + return true; >>>> +} >>>> + >>>> static int __vlan_vid_add(struct net_device *dev, struct net_bridge *br, >>>> struct net_bridge_vlan *v, u16 flags, >>>> struct netlink_ext_ack *extack) >>>> @@ -113,6 +127,8 @@ static int __vlan_vid_add(struct net_device *dev, struct net_bridge *br, >>>> if (err == -EOPNOTSUPP) >>>> return vlan_vid_add(dev, br->vlan_proto, v->vid); >>>> v->priv_flags |= BR_VLFLAG_ADDED_BY_SWITCHDEV; >>>> + if (br_vlan_tagging_by_switchdev(br)) >>>> + v->priv_flags |= BR_VLFLAG_TAGGING_BY_SWITCHDEV; >>>> return err; >>>> } >>>> >>>> @@ -1491,7 +1507,7 @@ int br_vlan_fill_forward_path_mode(struct net_bridge *br, >>>> >>>> if (path->bridge.vlan_mode == DEV_PATH_BR_VLAN_TAG) >>>> path->bridge.vlan_mode = DEV_PATH_BR_VLAN_KEEP; >>>> - else if (v->priv_flags & BR_VLFLAG_ADDED_BY_SWITCHDEV) >>>> + else if (v->priv_flags & BR_VLFLAG_TAGGING_BY_SWITCHDEV) >>>> path->bridge.vlan_mode = DEV_PATH_BR_VLAN_UNTAG_HW; >>>> else >>>> path->bridge.vlan_mode = DEV_PATH_BR_VLAN_UNTAG; >>>
On 10/15/24 12:26 PM, Eric Woudstra wrote: > > > On 10/14/24 4:46 PM, Vladimir Oltean wrote: >> Keeping the full email body untrimmed for extra context for the newly >> added people. >> >> On Mon, Oct 14, 2024 at 09:22:07AM +0300, Nikolay Aleksandrov wrote: >>> On 14/10/2024 09:18, Nikolay Aleksandrov wrote: >>>> On 13/10/2024 21:55, Eric Woudstra wrote: >>>>> In network setup as below: >>>>> >>>>> fastpath bypass >>>>> .----------------------------------------. >>>>> / \ >>>>> | IP - forwarding | >>>>> | / \ v >>>>> | / wan ... >>>>> | / >>>>> | | >>>>> | | >>>>> | brlan.1 >>>>> | | >>>>> | +-------------------------------+ >>>>> | | vlan 1 | >>>>> | | | >>>>> | | brlan (vlan-filtering) | >>>>> | | +---------------+ >>>>> | | | DSA-SWITCH | >>>>> | | vlan 1 | | >>>>> | | to | | >>>>> | | untagged 1 vlan 1 | >>>>> | +---------------+---------------+ >>>>> . / \ >>>>> ----->wlan1 lan0 >>>>> . . >>>>> . ^ >>>>> ^ vlan 1 tagged packets >>>>> untagged packets >>>>> >>>>> Now that DEV_PATH_MTK_WDMA is added to nft_dev_path_info() the forward >>>>> path is filled also when ending with the mediatek wlan1, info.indev not >>>>> NULL now in nft_dev_forward_path(). This results in a direct transmit >>>>> instead of a neighbor transmit. This is how it should be, But this fails. >>>>> >>>>> br_vlan_fill_forward_path_mode() sets DEV_PATH_BR_VLAN_UNTAG_HW when >>>>> filling in from brlan.1 towards wlan1. But it should be set to >>>>> DEV_PATH_BR_VLAN_UNTAG in this case. Using BR_VLFLAG_ADDED_BY_SWITCHDEV >>>>> is not correct. The dsa switchdev adds it as a foreign port. >>>>> >>>>> Use BR_VLFLAG_TAGGING_BY_SWITCHDEV to make sure DEV_PATH_BR_VLAN_UNTAG is >>>>> set when there is a dsa-switch inside the bridge. >>>>> >>>>> Signed-off-by: Eric Woudstra <ericwouds@gmail.com> >>>>> --- >>>>> net/bridge/br_private.h | 1 + >>>>> net/bridge/br_vlan.c | 18 +++++++++++++++++- >>>>> 2 files changed, 18 insertions(+), 1 deletion(-) >>>>> >>>>> diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h >>>>> index 8da7798f9368..7d427214cc7c 100644 >>>>> --- a/net/bridge/br_private.h >>>>> +++ b/net/bridge/br_private.h >>>>> @@ -180,6 +180,7 @@ enum { >>>>> BR_VLFLAG_MCAST_ENABLED = BIT(2), >>>>> BR_VLFLAG_GLOBAL_MCAST_ENABLED = BIT(3), >>>>> BR_VLFLAG_NEIGH_SUPPRESS_ENABLED = BIT(4), >>>>> + BR_VLFLAG_TAGGING_BY_SWITCHDEV = BIT(5), >>>>> }; >>>>> >>>>> /** >>>>> diff --git a/net/bridge/br_vlan.c b/net/bridge/br_vlan.c >>>>> index 1830d7d617cd..b7877724b969 100644 >>>>> --- a/net/bridge/br_vlan.c >>>>> +++ b/net/bridge/br_vlan.c >>>>> @@ -3,6 +3,7 @@ >>>>> #include <linux/netdevice.h> >>>>> #include <linux/rtnetlink.h> >>>>> #include <linux/slab.h> >>>>> +#include <net/dsa.h> >>>>> #include <net/switchdev.h> >>>>> >>>>> #include "br_private.h" >>>>> @@ -100,6 +101,19 @@ static void __vlan_flags_commit(struct net_bridge_vlan *v, u16 flags) >>>>> __vlan_flags_update(v, flags, true); >>>>> } >>>>> >>>>> +static inline bool br_vlan_tagging_by_switchdev(struct net_bridge *br) >>>> >>>> no inline in .c files and also constify br >>>> >>>>> +{ >>>>> +#if IS_ENABLED(CONFIG_NET_DSA) >>>>> + struct net_bridge_port *p; >>>>> + >>>>> + list_for_each_entry(p, &br->port_list, list) { >>>>> + if (dsa_user_dev_check(p->dev)) >>>> >>>> I don't think this can change at runtime, so please keep a counter in >>>> the bridge and don't walk the port list on every vlan add. >>>> >>> >>> you can use an internal bridge opt (check br_private.h) with a private opt >>> that's set when such device is added as a port, no need for a full counter >>> obviously >> >> To continue on Nikolay's line of thought... >> >> Can you abstractly describe which functional behavior do you need the >> bridge port to perform, rather than "it needs to be a DSA user port"? > > Hello Vladimir, > > This has to do with the sequence of events, when dealing with vlan > tagging in a bridge. When looking at the ingress of untaggged packets, > that will be tagged by a certain port of that bridge, it depends when > this tagging happens, in respect to the netfilter hook. This describes > the existing code: > > Because we are using dev_fill_forward_path(), we know in all cases that > untagged packets arive and are tagged by the bridge. > > In the case (#1) off a switchdev, the tagging is done before the packet > reaches the netfilter hook. Then the software fastpath code needs to > handle the packet, as if it is incoming tagged, remembering that this > tag was tagged at ingress (tuple->in_vlan_ingress). We need to remember > this, because dealing with hardware offloaded fastpath we then need to > skip this vlan tag in the hardware offloaded fastpath in > nf_flow_rule_match() and nf_flow_rule_route_common(). > > In all other cases (#2), 'normal' ports (ports without switchdev and > dsa-user-ports) the tagging is done after packet reaches the > netfilter-hook. Then the software fastpath code needs to handle the > packet, as incoming untagged. > > (Of course the dsa-user-port is more complicated, but it all handled by > the dsa architecture so that it can be used as a 'normal' port.) > > In both cases #1 and #2 also the other direction is taken into account. > > To decide which case to apply, the code looks at > BR_VLFLAG_ADDED_BY_SWITCHDEV, which was actually used for something > else, but it is easily available in the code at that point and seemed to > work, so far... > > But as it turns out, this flag is also set for foreign ports added to > the dsa-switchdev. This means that case #1 is applied to the foreign dsa > port. This breaks the software fastpath and, with packets not reaching > their destination, also the hardware offloaded fastpath does not start. > We need to apply case #2. > > So actually we need to know, inside br_vlan_fill_forward_path_mode(), > whether or not net_bridge_port *dst is a foreign port added to the dsa > switchdev. Or perhaps there is another way to find out we have to apply > case #2. > So after doing some more reading, at creation of the code using BR_VLFLAG_ADDED_BY_SWITCHDEV would have been without problems. After the switchdev was altered so that objects from foreign devices can be added, it is problematic in br_vlan_fill_forward_path_mode(). I have tested and indeed any foreign device does have this problem. So we need a way to distinguish in br_vlan_fill_forward_path_mode() whether or not we are dealing with a (dsa) foreign device on the switchdev. I have come up with something, but this is most likely to crude to be accepted, but for the sake of 'rfc' discussing it may lead to a proper solution. So what does work is the following patch, so that netif_has_dsa_foreign_vlan() can be used inside br_vlan_fill_forward_path_mode(). Any suggestions on how this could be implemented properly would be greatly appreciated. diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index 9d80f650345e75..3fb67312428a1f 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -1839,6 +1839,7 @@ enum netdev_reg_state { * * @vlan_info: VLAN info * @dsa_ptr: dsa specific data + * @dsa_foreign_vlan: Counter, number of dsa foreign vlans added * @tipc_ptr: TIPC specific data * @atalk_ptr: AppleTalk link * @ip_ptr: IPv4 specific data @@ -2214,6 +2215,7 @@ struct net_device { #endif #if IS_ENABLED(CONFIG_NET_DSA) struct dsa_port *dsa_ptr; + unsigned int dsa_foreign_vlan; #endif #if IS_ENABLED(CONFIG_TIPC) struct tipc_bearer __rcu *tipc_ptr; @@ -5135,6 +5137,15 @@ static inline bool netif_is_lag_port(const struct net_device *dev) return netif_is_bond_slave(dev) || netif_is_team_port(dev); } +static inline bool netif_has_dsa_foreign_vlan(const struct net_device *dev) +{ +#if IS_ENABLED(CONFIG_NET_DSA) + return !!dev->dsa_foreign_vlan; +#else + return false; +#endif +} + static inline bool netif_is_rxfh_configured(const struct net_device *dev) { return dev->priv_flags & IFF_RXFH_CONFIGURED; diff --git a/net/dsa/user.c b/net/dsa/user.c index 74eda9b30608e6..775f6346120ed6 100644 --- a/net/dsa/user.c +++ b/net/dsa/user.c @@ -737,6 +737,8 @@ static int dsa_user_host_vlan_add(struct net_device *dev, return 0; } + obj->orig_dev->dsa_foreign_vlan++; + vlan = *SWITCHDEV_OBJ_PORT_VLAN(obj); /* Even though drivers often handle CPU membership in special ways, @@ -824,6 +826,8 @@ static int dsa_user_host_vlan_del(struct net_device *dev, if (dsa_port_skip_vlan_configuration(dp)) return 0; + obj->orig_dev->dsa_foreign_vlan--; + vlan = SWITCHDEV_OBJ_PORT_VLAN(obj); return dsa_port_host_vlan_del(dp, vlan); >> switchdev_bridge_port_offload() has a mechanism to inform the bridge >> core of extra abilities (like tx_fwd_offload). Perhaps you could modify >> the DSA drivers you need to set a similar bit to inform the bridge of >> their presence and ability. That would also work when the bridge port is >> a LAG over a DSA user port. >> >> Also, please also CC DSA maintainers when you use DSA API outside >> net/dsa/ and drivers/net/dsa/. I am in the process of revamping the >> public DSA API and would like to be in touch with changes as they are >> made. >> >>>>> + return false; >>>>> + } >>>>> +#endif >>>>> + return true; >>>>> +} >>>>> + >>>>> static int __vlan_vid_add(struct net_device *dev, struct net_bridge *br, >>>>> struct net_bridge_vlan *v, u16 flags, >>>>> struct netlink_ext_ack *extack) >>>>> @@ -113,6 +127,8 @@ static int __vlan_vid_add(struct net_device *dev, struct net_bridge *br, >>>>> if (err == -EOPNOTSUPP) >>>>> return vlan_vid_add(dev, br->vlan_proto, v->vid); >>>>> v->priv_flags |= BR_VLFLAG_ADDED_BY_SWITCHDEV; >>>>> + if (br_vlan_tagging_by_switchdev(br)) >>>>> + v->priv_flags |= BR_VLFLAG_TAGGING_BY_SWITCHDEV; >>>>> return err; >>>>> } >>>>> >>>>> @@ -1491,7 +1507,7 @@ int br_vlan_fill_forward_path_mode(struct net_bridge *br, >>>>> >>>>> if (path->bridge.vlan_mode == DEV_PATH_BR_VLAN_TAG) >>>>> path->bridge.vlan_mode = DEV_PATH_BR_VLAN_KEEP; >>>>> - else if (v->priv_flags & BR_VLFLAG_ADDED_BY_SWITCHDEV) >>>>> + else if (v->priv_flags & BR_VLFLAG_TAGGING_BY_SWITCHDEV) >>>>> path->bridge.vlan_mode = DEV_PATH_BR_VLAN_UNTAG_HW; >>>>> else >>>>> path->bridge.vlan_mode = DEV_PATH_BR_VLAN_UNTAG; >>>>
On Sun, Oct 20, 2024 at 11:23:18AM +0200, Eric Woudstra wrote: > So after doing some more reading, at creation of the code using > BR_VLFLAG_ADDED_BY_SWITCHDEV would have been without problems. > > After the switchdev was altered so that objects from foreign devices can > be added, it is problematic in br_vlan_fill_forward_path_mode(). I have > tested and indeed any foreign device does have this problem. > > So we need a way to distinguish in br_vlan_fill_forward_path_mode() > whether or not we are dealing with a (dsa) foreign device on the switchdev. > > I have come up with something, but this is most likely to crude to be > accepted, but for the sake of 'rfc' discussing it may lead to a proper > solution. So what does work is the following patch, so that > netif_has_dsa_foreign_vlan() can be used inside > br_vlan_fill_forward_path_mode(). > > Any suggestions on how this could be implemented properly would be > greatly appreciated. I don't know nearly enough about the netfilter flowtable to even understand exactly the problem you're describing and are trying to solve. I've started to read up on things, but plenty of concepts are new and I'm mixing this with plenty of other activities. If you could share some commands to build a test setup so I could form my own independent opinion of what is going on, it would be great as it would speed up that process. With respect to the patch you've posted, it doesn't look exactly great. One would need to make a thorough analysis of the bridge's use of BR_VLFLAG_ADDED_BY_SWITCHDEV, of whether it still makes sense in today's world where br_switchdev_vlan_replay() is a thing (a VLAN that used to not be "added by switchdev" can become "added by switchdev" after a replay, but this flag will remain incorrectly unset), of whether VLANs on foreign DSA interfaces should even have this flag set, and on whether your flowtable forwarding path patches are conceptually using it correctly. There's a lot to think about, and if somebody doesn't have the big picture, I'm worried that a wrong decision will be taken.
On 10/21/24 3:47 PM, Vladimir Oltean wrote: > On Sun, Oct 20, 2024 at 11:23:18AM +0200, Eric Woudstra wrote: >> So after doing some more reading, at creation of the code using >> BR_VLFLAG_ADDED_BY_SWITCHDEV would have been without problems. >> >> After the switchdev was altered so that objects from foreign devices can >> be added, it is problematic in br_vlan_fill_forward_path_mode(). I have >> tested and indeed any foreign device does have this problem. >> >> So we need a way to distinguish in br_vlan_fill_forward_path_mode() >> whether or not we are dealing with a (dsa) foreign device on the switchdev. >> >> I have come up with something, but this is most likely to crude to be >> accepted, but for the sake of 'rfc' discussing it may lead to a proper >> solution. So what does work is the following patch, so that >> netif_has_dsa_foreign_vlan() can be used inside >> br_vlan_fill_forward_path_mode(). >> >> Any suggestions on how this could be implemented properly would be >> greatly appreciated. > I don't know nearly enough about the netfilter flowtable to even > understand exactly the problem you're describing and are trying to solve. > I've started to read up on things, but plenty of concepts are new and Another way of shortly describing it, considering the software fastpath, only there it is a problem: Same case #1: When looking at the ingress hook of the fastpath for a bridged switchdev-port (non-dsa) with PVID set, the existing code is written so that the flowtuple needs to match the packet INcluding the PVID. Same case #2: When considering the ingress hook of the fastpath of a bridged device that is not part of a switchdev at all and there is no dsa on the bridge (so it is not a foreign device), the existing code is written so that the flowtuple needs to match the packet EXcluding the PVID. When looking at the diagram of this patch, wlan1 would stand for any foreign device. Because of the use of BR_VLFLAG_ADDED_BY_SWITCHDEV, case #1 is applied, instead of case #2. > I'm mixing this with plenty of other activities. If you could share some > commands to build a test setup so I could form my own independent > opinion of what is going on, it would be great as it would speed up that > process. I've only setup the bridged part with systemd-networkd. When trying to re-create, it will only happen if the total action of the bridge, towards the foreign port, is: ingress + egress = untagging So in the forward-fastpath, tagging in the forward path is done by a vlan-device and untagging is done in the forward path of the bridge. > With respect to the patch you've posted, it doesn't look exactly great. Agreed. I was experimenting now with having br_switchdev_port_vlan_add() first to try it only for non-foreign ports. If not successful, then try it with foreign ports. This way the calling function will know if the port is a foreign port. Therefore, no need for the switchdev driver to communicate back to upper layers. > One would need to make a thorough analysis of the bridge's use of > BR_VLFLAG_ADDED_BY_SWITCHDEV, of whether it still makes sense in today's > world where br_switchdev_vlan_replay() is a thing (a VLAN that used to > not be "added by switchdev" can become "added by switchdev" after a > replay, but this flag will remain incorrectly unset), of whether VLANs on > foreign DSA interfaces should even have this flag set, and on whether > your flowtable forwarding path patches are conceptually using it correctly. > There's a lot to think about, and if somebody doesn't have the big picture, > I'm worried that a wrong decision will be taken. The entire usage BR_VLFLAG_ADDED_BY_SWITCHDEV does need a careful review. I also realize my patch-set needs to do more with the switchdev and vlan combination, then it does now. So for now I will leave it as RFC, as it will not work properly with switchdevs other the dsa.
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h index 8da7798f9368..7d427214cc7c 100644 --- a/net/bridge/br_private.h +++ b/net/bridge/br_private.h @@ -180,6 +180,7 @@ enum { BR_VLFLAG_MCAST_ENABLED = BIT(2), BR_VLFLAG_GLOBAL_MCAST_ENABLED = BIT(3), BR_VLFLAG_NEIGH_SUPPRESS_ENABLED = BIT(4), + BR_VLFLAG_TAGGING_BY_SWITCHDEV = BIT(5), }; /** diff --git a/net/bridge/br_vlan.c b/net/bridge/br_vlan.c index 1830d7d617cd..b7877724b969 100644 --- a/net/bridge/br_vlan.c +++ b/net/bridge/br_vlan.c @@ -3,6 +3,7 @@ #include <linux/netdevice.h> #include <linux/rtnetlink.h> #include <linux/slab.h> +#include <net/dsa.h> #include <net/switchdev.h> #include "br_private.h" @@ -100,6 +101,19 @@ static void __vlan_flags_commit(struct net_bridge_vlan *v, u16 flags) __vlan_flags_update(v, flags, true); } +static inline bool br_vlan_tagging_by_switchdev(struct net_bridge *br) +{ +#if IS_ENABLED(CONFIG_NET_DSA) + struct net_bridge_port *p; + + list_for_each_entry(p, &br->port_list, list) { + if (dsa_user_dev_check(p->dev)) + return false; + } +#endif + return true; +} + static int __vlan_vid_add(struct net_device *dev, struct net_bridge *br, struct net_bridge_vlan *v, u16 flags, struct netlink_ext_ack *extack) @@ -113,6 +127,8 @@ static int __vlan_vid_add(struct net_device *dev, struct net_bridge *br, if (err == -EOPNOTSUPP) return vlan_vid_add(dev, br->vlan_proto, v->vid); v->priv_flags |= BR_VLFLAG_ADDED_BY_SWITCHDEV; + if (br_vlan_tagging_by_switchdev(br)) + v->priv_flags |= BR_VLFLAG_TAGGING_BY_SWITCHDEV; return err; } @@ -1491,7 +1507,7 @@ int br_vlan_fill_forward_path_mode(struct net_bridge *br, if (path->bridge.vlan_mode == DEV_PATH_BR_VLAN_TAG) path->bridge.vlan_mode = DEV_PATH_BR_VLAN_KEEP; - else if (v->priv_flags & BR_VLFLAG_ADDED_BY_SWITCHDEV) + else if (v->priv_flags & BR_VLFLAG_TAGGING_BY_SWITCHDEV) path->bridge.vlan_mode = DEV_PATH_BR_VLAN_UNTAG_HW; else path->bridge.vlan_mode = DEV_PATH_BR_VLAN_UNTAG;
In network setup as below: fastpath bypass .----------------------------------------. / \ | IP - forwarding | | / \ v | / wan ... | / | | | | | brlan.1 | | | +-------------------------------+ | | vlan 1 | | | | | | brlan (vlan-filtering) | | | +---------------+ | | | DSA-SWITCH | | | vlan 1 | | | | to | | | | untagged 1 vlan 1 | | +---------------+---------------+ . / \ ----->wlan1 lan0 . . . ^ ^ vlan 1 tagged packets untagged packets Now that DEV_PATH_MTK_WDMA is added to nft_dev_path_info() the forward path is filled also when ending with the mediatek wlan1, info.indev not NULL now in nft_dev_forward_path(). This results in a direct transmit instead of a neighbor transmit. This is how it should be, But this fails. br_vlan_fill_forward_path_mode() sets DEV_PATH_BR_VLAN_UNTAG_HW when filling in from brlan.1 towards wlan1. But it should be set to DEV_PATH_BR_VLAN_UNTAG in this case. Using BR_VLFLAG_ADDED_BY_SWITCHDEV is not correct. The dsa switchdev adds it as a foreign port. Use BR_VLFLAG_TAGGING_BY_SWITCHDEV to make sure DEV_PATH_BR_VLAN_UNTAG is set when there is a dsa-switch inside the bridge. Signed-off-by: Eric Woudstra <ericwouds@gmail.com> --- net/bridge/br_private.h | 1 + net/bridge/br_vlan.c | 18 +++++++++++++++++- 2 files changed, 18 insertions(+), 1 deletion(-)