Message ID | 20190116200102.2749-11-f.fainelli@gmail.com |
---|---|
State | Changes Requested |
Delegated to: | David Miller |
Headers | show |
Series | net: dsa: management mode for bcm_sf2 | expand |
On Wed, Jan 16, 2019 at 12:00:58PM -0800, Florian Fainelli wrote: > The VLAN real device could be an Ethernet switch port and that switch > might have VLAN filtering globally enabled (because of a bridge > requesting VLAN filtering on the switch on another port) and so when > programming multicast addresses, we need the multicast filter > programming to be aware of the correct VLAN ID as well. This looks like a quirk of a specific device. How bad is it to patch the driver to add a multicast address for every configured VLAN? Also, I think it's weird that we have one API to program address and a completely different API (via switchdev) to program address+VID pairs. Extending current API might make more sense.
On 1/17/19 6:49 AM, Ido Schimmel wrote: > On Wed, Jan 16, 2019 at 12:00:58PM -0800, Florian Fainelli wrote: >> The VLAN real device could be an Ethernet switch port and that switch >> might have VLAN filtering globally enabled (because of a bridge >> requesting VLAN filtering on the switch on another port) and so when >> programming multicast addresses, we need the multicast filter >> programming to be aware of the correct VLAN ID as well. > > This looks like a quirk of a specific device. How bad is it to patch the > driver to add a multicast address for every configured VLAN? There is at least another driver that can be benefit from that which is cpsw, if I understand Ivan's use case correctly. If there is a ndo_set_rx_mode() function implemented by the virtual device, and that does call dev_{mc,uc}_sync(master, dev), then this means that you do want to be able to filter UC and MC addresses. If we added the entire class D range of multicast addresses to the switch's MDB, that would not be filtering, we would be passing up everything to the stack and let it filter in software because there is no multicast socket listening on that address. > > Also, I think it's weird that we have one API to program address and a > completely different API (via switchdev) to program address+VID pairs. > Extending current API might make more sense. > Do you mean ndo_set_rx_mode() and dev_mc_sync()? That is what Ivan proposed doing not so long ago here: https://www.spinics.net/lists/netdev/msg537424.html but that is IMHO wasting storage space, because the kernel is maintaining the address lists, and now also needs to gain knowledge about the VID. With up to 4K - 2 VLAN interfaces per switch port, this bloats the memory footprint, we arguably still need to maintain those address lists anyway... The reason why I chose switchdev here is because: - this is mostly relevant for switch devices, not so much for NICs (it seems), if it was, they would have solved the problem by now - this allows to have an unified path from the switch driver perspective to program MDB addresses targeting the CPU/management port, no need to have X different ways of doing the same operation
On Thu, Jan 17, 2019 at 11:12:24AM -0800, Florian Fainelli wrote: > On 1/17/19 6:49 AM, Ido Schimmel wrote: > > On Wed, Jan 16, 2019 at 12:00:58PM -0800, Florian Fainelli wrote: > >> The VLAN real device could be an Ethernet switch port and that switch > >> might have VLAN filtering globally enabled (because of a bridge > >> requesting VLAN filtering on the switch on another port) and so when > >> programming multicast addresses, we need the multicast filter > >> programming to be aware of the correct VLAN ID as well. > > > > This looks like a quirk of a specific device. How bad is it to patch the > > driver to add a multicast address for every configured VLAN? > > There is at least another driver that can be benefit from that which is > cpsw, if I understand Ivan's use case correctly. I understand and I have no argument against the need for this. I just think we should use a different mechanism than switchdev. > If there is a ndo_set_rx_mode() function implemented by the virtual > device, and that does call dev_{mc,uc}_sync(master, dev), then this > means that you do want to be able to filter UC and MC addresses. If we > added the entire class D range of multicast addresses to the switch's > MDB, that would not be filtering, we would be passing up everything to > the stack and let it filter in software because there is no multicast > socket listening on that address. OK. > > Also, I think it's weird that we have one API to program address and a > > completely different API (via switchdev) to program address+VID pairs. > > Extending current API might make more sense. > > > > Do you mean ndo_set_rx_mode() and dev_mc_sync()? That is what Ivan > proposed doing not so long ago here: > > https://www.spinics.net/lists/netdev/msg537424.html > > but that is IMHO wasting storage space, because the kernel is > maintaining the address lists, and now also needs to gain knowledge > about the VID. With up to 4K - 2 VLAN interfaces per switch port, this > bloats the memory footprint, we arguably still need to maintain those > address lists anyway... I didn't review Ivan's changes in details, but it makes much more sense to me to simply extend the current Rx filtering mechanism than to use a completely unrelated infrastructure. > > The reason why I chose switchdev here is because: > > - this is mostly relevant for switch devices, not so much for NICs (it > seems), if it was, they would have solved the problem by now I don't see any use of switchdev APIs in the driver Ivan is patching. The cover letter doesn't indicate anything about it either. > - this allows to have an unified path from the switch driver perspective > to program MDB addresses targeting the CPU/management port, no need to > have X different ways of doing the same operation But it's not the same thing. Allowing certain packets to ingress the device is not the same as having the device send them to the CPU. We have VLAN filters as well. Allowing VID X to ingress does not mean that we trap each packet with this VID to CPU.
Hi Ido, > > > > The reason why I chose switchdev here is because: > > > > - this is mostly relevant for switch devices, not so much for NICs (it > > seems), if it was, they would have solved the problem by now > > I don't see any use of switchdev APIs in the driver Ivan is patching. > The cover letter doesn't indicate anything about it either. There were RFCs for it a few months ago https://patchwork.ozlabs.org/cover/929367/ We decided that rewriting the driver instead of adding switchdev support on the current one is cleaner and preferred, so we'll be posting a new driver for this at some point (most of the work is already done). > > > - this allows to have an unified path from the switch driver perspective > > to program MDB addresses targeting the CPU/management port, no need to > > have X different ways of doing the same operation > > But it's not the same thing. Allowing certain packets to ingress the > device is not the same as having the device send them to the CPU. We > have VLAN filters as well. Allowing VID X to ingress does not mean that > we trap each packet with this VID to CPU. /Ilias
On Mon, Jan 21, 2019 at 09:13:01AM +0000, Ido Schimmel wrote: >On Thu, Jan 17, 2019 at 11:12:24AM -0800, Florian Fainelli wrote: >> On 1/17/19 6:49 AM, Ido Schimmel wrote: >> > On Wed, Jan 16, 2019 at 12:00:58PM -0800, Florian Fainelli wrote: >> >> The VLAN real device could be an Ethernet switch port and that switch >> >> might have VLAN filtering globally enabled (because of a bridge >> >> requesting VLAN filtering on the switch on another port) and so when >> >> programming multicast addresses, we need the multicast filter >> >> programming to be aware of the correct VLAN ID as well. >> > >> > This looks like a quirk of a specific device. How bad is it to patch the >> > driver to add a multicast address for every configured VLAN? >> >> There is at least another driver that can be benefit from that which is >> cpsw, if I understand Ivan's use case correctly. > >I understand and I have no argument against the need for this. I just >think we should use a different mechanism than switchdev. > >> If there is a ndo_set_rx_mode() function implemented by the virtual >> device, and that does call dev_{mc,uc}_sync(master, dev), then this >> means that you do want to be able to filter UC and MC addresses. If we >> added the entire class D range of multicast addresses to the switch's >> MDB, that would not be filtering, we would be passing up everything to >> the stack and let it filter in software because there is no multicast >> socket listening on that address. > >OK. > >> > Also, I think it's weird that we have one API to program address and a >> > completely different API (via switchdev) to program address+VID pairs. >> > Extending current API might make more sense. >> > >> >> Do you mean ndo_set_rx_mode() and dev_mc_sync()? That is what Ivan >> proposed doing not so long ago here: >> >> https://www.spinics.net/lists/netdev/msg537424.html >> >> but that is IMHO wasting storage space, because the kernel is >> maintaining the address lists, and now also needs to gain knowledge >> about the VID. With up to 4K - 2 VLAN interfaces per switch port, this >> bloats the memory footprint, we arguably still need to maintain those >> address lists anyway... Yes wasting storage it's a factor, even it brings a lot of other benefits. But compared to this it's more legal. As alternative I've also proposed completely working model when vlan identification is real dev responsibility, allowing to split address space between vlans (for mc and uc). And better to add more commments to the referencies I provided there. > >I didn't review Ivan's changes in details, but it makes much more sense >to me to simply extend the current Rx filtering mechanism than to use a >completely unrelated infrastructure. True. > >> >> The reason why I chose switchdev here is because: >> >> - this is mostly relevant for switch devices, not so much for NICs (it >> seems), if it was, they would have solved the problem by now > >I don't see any use of switchdev APIs in the driver Ivan is patching. >The cover letter doesn't indicate anything about it either. Yes, it's not related to switched. > >> - this allows to have an unified path from the switch driver perspective >> to program MDB addresses targeting the CPU/management port, no need to >> have X different ways of doing the same operation > >But it's not the same thing. Allowing certain packets to ingress the >device is not the same as having the device send them to the CPU. We >have VLAN filters as well. Allowing VID X to ingress does not mean that >we trap each packet with this VID to CPU.
On Thu, Jan 17, 2019 at 11:12:24AM -0800, Florian Fainelli wrote: >On 1/17/19 6:49 AM, Ido Schimmel wrote: >> On Wed, Jan 16, 2019 at 12:00:58PM -0800, Florian Fainelli wrote: >>> The VLAN real device could be an Ethernet switch port and that switch >>> might have VLAN filtering globally enabled (because of a bridge >>> requesting VLAN filtering on the switch on another port) and so when >>> programming multicast addresses, we need the multicast filter >>> programming to be aware of the correct VLAN ID as well. >> >> This looks like a quirk of a specific device. How bad is it to patch the >> driver to add a multicast address for every configured VLAN? > >There is at least another driver that can be benefit from that which is >cpsw, if I understand Ivan's use case correctly. > >If there is a ndo_set_rx_mode() function implemented by the virtual >device, and that does call dev_{mc,uc}_sync(master, dev), then this >means that you do want to be able to filter UC and MC addresses. If we >added the entire class D range of multicast addresses to the switch's >MDB, that would not be filtering, we would be passing up everything to >the stack and let it filter in software because there is no multicast >socket listening on that address. > >> >> Also, I think it's weird that we have one API to program address and a >> completely different API (via switchdev) to program address+VID pairs. >> Extending current API might make more sense. >> > >Do you mean ndo_set_rx_mode() and dev_mc_sync()? That is what Ivan >proposed doing not so long ago here: > >https://www.spinics.net/lists/netdev/msg537424.html > >but that is IMHO wasting storage space, because the kernel is >maintaining the address lists, and now also needs to gain knowledge >about the VID. With up to 4K - 2 VLAN interfaces per switch port, this >bloats the memory footprint, we arguably still need to maintain those >address lists anyway... > >The reason why I chose switchdev here is because: > >- this is mostly relevant for switch devices, not so much for NICs (it >seems), if it was, they would have solved the problem by now If provide normal interface for this it could spread. And personally my aim is to avoid redundant mcast traffic when i don't expect it saving CPU performance, latency .... and overall it should be done in this way, switchdev here seems like otth. > >- this allows to have an unified path from the switch driver perspective >to program MDB addresses targeting the CPU/management port, no need to >have X different ways of doing the same operation >-- >Florian
diff --git a/net/8021q/vlan_dev.c b/net/8021q/vlan_dev.c index b2d9c8f27cd7..ea2ef9d78dcb 100644 --- a/net/8021q/vlan_dev.c +++ b/net/8021q/vlan_dev.c @@ -312,6 +312,43 @@ static int vlan_dev_open(struct net_device *dev) return err; } +static int vlan_dev_sync_unsync_mc_addr(struct net_device *dev, + const unsigned char *addr, + bool add) +{ + struct net_device *real_dev = vlan_dev_priv(dev)->real_dev; + struct switchdev_obj_port_mdb mdb = { + .obj = { + .orig_dev = dev, + .id = SWITCHDEV_OBJ_ID_HOST_MDB, + .flags = SWITCHDEV_F_DEFER, + }, + .vid = vlan_dev_vlan_id(dev), + }; + int ret = -EOPNOTSUPP; + + ether_addr_copy(mdb.addr, addr); + if (add) + ret = switchdev_port_obj_add(real_dev, &mdb.obj, NULL); + else + ret = switchdev_port_obj_del(real_dev, &mdb.obj); + + return ret; +} + +static int vlan_dev_sync_mc_addr(struct net_device *dev, + const unsigned char *addr) +{ + return vlan_dev_sync_unsync_mc_addr(dev, addr, true); +} + +static int vlan_dev_unsync_mc_addr(struct net_device *dev, + const unsigned char *addr) +{ + return vlan_dev_sync_unsync_mc_addr(dev, addr, false); +} + + static int vlan_dev_stop(struct net_device *dev) { struct vlan_dev_priv *vlan = vlan_dev_priv(dev); @@ -319,6 +356,7 @@ static int vlan_dev_stop(struct net_device *dev) dev_mc_unsync(real_dev, dev); dev_uc_unsync(real_dev, dev); + __hw_addr_unsync_dev(&dev->mc, dev, vlan_dev_unsync_mc_addr); if (dev->flags & IFF_ALLMULTI) dev_set_allmulti(real_dev, -1); if (dev->flags & IFF_PROMISC) @@ -483,6 +521,8 @@ static void vlan_dev_change_rx_flags(struct net_device *dev, int change) static void vlan_dev_set_rx_mode(struct net_device *vlan_dev) { + __hw_addr_sync_dev(&vlan_dev->mc, vlan_dev, vlan_dev_sync_mc_addr, + vlan_dev_unsync_mc_addr); dev_mc_sync(vlan_dev_priv(vlan_dev)->real_dev, vlan_dev); dev_uc_sync(vlan_dev_priv(vlan_dev)->real_dev, vlan_dev); }
The VLAN real device could be an Ethernet switch port and that switch might have VLAN filtering globally enabled (because of a bridge requesting VLAN filtering on the switch on another port) and so when programming multicast addresses, we need the multicast filter programming to be aware of the correct VLAN ID as well. Ethernet drivers that do not implement switchdev_port_{add,del} operations and do not specifically check for SWITCHDEV_OBJ_ID_HOST_MDB are not affected by that change. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> --- net/8021q/vlan_dev.c | 40 ++++++++++++++++++++++++++++++++++++++++ 1 file changed, 40 insertions(+)