Message ID | 20130617181004.GA1364@fedora-17-guest.dell.com |
---|---|
State | Changes Requested, archived |
Delegated to: | David Miller |
On 06/17/2013 11:10 AM, Narendra_K@Dell.com wrote:
> It is useful to know if network interfaces from NIC partitions
> 'map to/use the' same physical port. For example, when creating
> bonding in fault tolerance mode, if two network interfaces map to/use
> the same physical port, it might not have the desired result. This
> information is not available today in a standard format or it is not
> present. If this information can be made available in a generic way
> to user space, tools such as NetworkManager or Libteam or Wicked can
> make smarter bonding decisions (such as warn users when setting up
> configurations which will not have desired effect).
>
> The requirement is to have a generic interface using which
> kernel/drivers can provide information/hints to user space about the
> physical port number used by a network interface.
>
> The following options were explored -
>
> 1. 'dev_id' sysfs attribute:
>
> In addition to being used to differentiate between devices that share
> the same link layer address, it is being used to indicate the physical
> port number used by a network interface.
>
> As dev_id exists to differentiate between devices sharing the same
> link layer address, dev_id option is not selected.
>
> 2. Re-using 'if_port' field in 'struct net_device':
>
> if_port field exists to indicate the media type(please refer to
> netdevice.h). It seemed like it was also used to indicate the physical
> port number.
>
> As re-using 'if_port' might possibly break user space, this option is
> not selected.
>
> 3. Add a new field 'phys_port' to 'struct net_device' and export it
> to sysfs:
>
> The 'phys_port' will be a universally unique identifier, which
> would be a MAC-48 or EUI-64 or a 128 bit UUID value, but not
> restricted to these spaces. It will uniquely identify the physical
> port used by a network interface. The 'length' of the identifier will
> be zero if the field is not set for a network interface.
>
> This patch implements option 3. It creates a new sysfs attribute
> 'phys_port' -
>
> /sys/class/net/<interface name>/phys_port
>
> References: http://marc.info/?l=linux-netdev&m=136920998009209&w=2
> References: http://marc.info/?l=linux-netdev&m=136992041432498&w=2
>
> Signed-off-by: Narendra K <narendra_k@dell.com>
> ---
> Changes from RFC version:
>
> Suggestions from Ben Hutchings -
> 1. 'struct port_identifier' is changed to be generic instead of
> restricting it to MAC-48 or EUI-64 or 128 bit UUID.
> 2. Commit message updated to indicate point 1.
> 3. 'show_phys_port' function modified to handle zero length
> instead of returning -EINVAL
> 4. 'show_phys_port' function made generic to handle all
> lengths instead 6, 8 or 16 bytes.
>
> Hi Ben, I have retained the commit message to indicate that 'dev_id'
> is being used to indicate the physical port number also.
>
> Thank you.
>
>  include/linux/netdevice.h | 13 +++++++++++++
>  net/core/net-sysfs.c      | 17 +++++++++++++++++
>  2 files changed, 30 insertions(+)

[...]

> --- a/net/core/net-sysfs.c
> +++ b/net/core/net-sysfs.c
> @@ -334,6 +334,22 @@ static ssize_t store_group(struct device *dev, struct device_attribute *attr,
>  	return netdev_store(dev, attr, buf, len, change_group);
>  }
>

Is there some missing locking here?

> +static ssize_t show_phys_port(struct device *dev,
> +			      struct device_attribute *attr, char *buf)
> +{
> +	struct net_device *net = to_net_dev(dev);
> +	unsigned char len;
> +

	read_lock(&dev_base_lock);

> +	if (!dev_isalive(net))
> +		return -EINVAL;
> +
> +	len = net->phys_port.port_id_len;
> +	if (!len)
> +		return 0;

	ret = sysfs_format_mac(buf, net->phys_port.port_id, len);
	read_unlock(&dev_base_lock);

	return ret;
}

Please take a look maybe I missed something.

Thanks,
John
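Editorial sketch: read literally, the inline suggestion above takes the lock
before the two early returns, so those paths would return with the lock still
held, and 'ret' is never declared. The fragment below is one possible way to
keep lock and unlock balanced on every path. It is a user-space mock, not
kernel code: fake_netdev, fake_read_lock(), fake_read_unlock() and
fake_format_id() are hypothetical stand-ins for struct net_device,
read_lock(&dev_base_lock), read_unlock(&dev_base_lock) and
sysfs_format_mac(), and the colon-separated output format is an assumption
about what sysfs_format_mac() produces.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

#define FAKE_MAX_ADDR_LEN 32

/* Hypothetical stand-in for the fields of struct net_device used here. */
struct fake_netdev {
	bool alive;
	unsigned char port_id[FAKE_MAX_ADDR_LEN];
	unsigned char port_id_len;
};

/* A reader counter stands in for dev_base_lock so balance can be checked. */
static int fake_lock_readers;

static void fake_read_lock(void)   { fake_lock_readers++; }
static void fake_read_unlock(void) { fake_lock_readers--; }

/* Assumed to mimic sysfs_format_mac(): "aa:bb:cc\n". */
static ssize_t fake_format_id(char *buf, const unsigned char *id, int len)
{
	char *cp = buf;
	int i;

	for (i = 0; i < len; i++)
		cp += sprintf(cp, "%02x%c", id[i], i == len - 1 ? '\n' : ':');
	return cp - buf;
}

static ssize_t fake_show_phys_port(struct fake_netdev *net, char *buf)
{
	ssize_t ret;

	fake_read_lock();
	if (!net->alive) {
		ret = -EINVAL;
		goto out;
	}
	if (!net->port_id_len) {
		ret = 0;	/* identifier not set: empty attribute */
		goto out;
	}
	ret = fake_format_id(buf, net->port_id, net->port_id_len);
out:
	/* Single exit point: the lock is dropped on every path. */
	fake_read_unlock();
	return ret;
}
```

The single 'out:' label is the usual kernel idiom for this kind of cleanup;
whether the final patch should take dev_base_lock at all is the question
discussed in the replies that follow.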
> -----Original Message-----
> From: John Fastabend [mailto:john.fastabend@gmail.com]
> Sent: Tuesday, June 18, 2013 12:18 AM
> To: K, Narendra
> Cc: netdev@vger.kernel.org; bhutchings@solarflare.com;
> john.r.fastabend@intel.com
> Subject: Re: [PATCH net-next] net: Add phys_port identifier to struct
> net_device and export it to sysfs
>
> On 06/17/2013 11:10 AM, Narendra_K@Dell.com wrote:

[...]

> > 3. Add a new field 'phys_port' to 'struct net_device' and export it to
> > sysfs:
> >
> > The 'phys_port' will be a universally unique identifier, which would
> > be a MAC-48 or EUI-64 or a 128 bit UUID value, but not restricted to
> > these spaces. It will uniquely identify the physical port used by a
> > network interface. The 'length' of the identifier will be zero if the
> > field is not set for a network interface.
> >
> > This patch implements option 3. It creates a new sysfs attribute
> > 'phys_port' -
> >
> > /sys/class/net/<interface name>/phys_port
> >
> > References: http://marc.info/?l=linux-netdev&m=136920998009209&w=2
> > References: http://marc.info/?l=linux-netdev&m=136992041432498&w=2
> >
> > Signed-off-by: Narendra K <narendra_k@dell.com>
> > ---
> > Changes from RFC version:
> >
> > Suggestions from Ben Hutchings -
> > 1. 'struct port_identifier' is changed to be generic instead of
> > restricting it to MAC-48 or EUI-64 or 128 bit UUID.
> > 2. Commit message updated to indicate point 1.
> > 3. 'show_phys_port' function modified to handle zero length instead of
> > returning -EINVAL
> > 4. 'show_phys_port' function made generic to handle
> > all lengths instead 6, 8 or 16 bytes.
> >
> > Hi Ben, I have retained the commit message to indicate that 'dev_id'
> > is being used to indicate the physical port number also.
> >
> > Thank you.
> >
> >  include/linux/netdevice.h | 13 +++++++++++++
> >  net/core/net-sysfs.c      | 17 +++++++++++++++++
> >  2 files changed, 30 insertions(+)
> >
> [...]
>
> > --- a/net/core/net-sysfs.c
> > +++ b/net/core/net-sysfs.c
> > @@ -334,6 +334,22 @@ static ssize_t store_group(struct device *dev, struct device_attribute *attr,
> >  	return netdev_store(dev, attr, buf, len, change_group);
> >  }
> >
>
> Is there some missing locking here?
>
> > +static ssize_t show_phys_port(struct device *dev,
> > +			      struct device_attribute *attr, char *buf) {
> > +	struct net_device *net = to_net_dev(dev);
> > +	unsigned char len;
> > +
>
> 	read_lock(&dev_base_lock);
>
> > +	if (!dev_isalive(net))
> > +		return -EINVAL;
> > +
> > +	len = net->phys_port.port_id_len;
> > +	if (!len)
> > +		return 0;
>
> 	ret = sysfs_format_mac(buf, net->phys_port.port_id, len);
> 	read_unlock(&dev_base_lock);
>
> 	return ret;
> }
>
> Please take a look maybe I missed something.
>

Hi John, thanks for the pointer. It seems like we need to hold
'dev_base_lock' here. I missed this initially as I was looking at the
'show_broadcast' function, but it looks like 'show_broadcast' is also
missing the lock. Attributes such as 'dev_id' are read with
read_lock(&dev_base_lock) generically in the netdev_show function.

While looking at the use of 'dev_base_lock', the 'write_lock' is being
held when the 'netdev' is being added to and removed from
'dev_base_head'. It is also being held when 'dev->operstate' and
'dev->link_mode' are being changed.

The 'read_lock(&dev_base_lock)' needs to be held before the
'dev_isalive(net)' call so that:

1. netdev is not removed from 'dev_base_head' while 'show_phys_port'
accesses 'netdev->phys_port.port_id' (and port_id_len).

2. The show_phys_port function sees a consistent value of
'netdev->phys_port.port_id' and 'netdev->phys_port.port_id_len' if
another execution path changes them with write_lock(&dev_base_lock)
held (similar to how dev->operstate is being changed).

Is the above understanding correct? Sorry if I missed some detail here.

With regards,
Narendra K
Linux Engineering
Dell Inc.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, 2013-06-19 at 07:29 -0700, Narendra_K@Dell.com wrote:
[...]
> 2. show_phys_port function sees a consistent value of
> 'netdev->phys_port.port_id and netdev->phys_port.port_id_len ' if
> another execution path changes the value of 'netdev->phys_port.port_id
> and netdev->phys_port.port_id_len ' with write_lock(&dev_base_lock)
> held (similar to how dev->operstate is being changed).
[...]

If the physical port ID can change dynamically (I hadn't thought of
that, but an embedded switch could support such reconfiguration) then
any such change also needs to be announced through rtnetlink. Actually,
I think the value needs to be included in rtnetlink information anyway.

Ben.
On Thu, 2013-06-20 at 00:23 +0530, Narendra_K@Dell.com wrote:
> > -----Original Message-----
> > From: Ben Hutchings [mailto:bhutchings@solarflare.com]
> > Sent: Wednesday, June 19, 2013 9:07 PM
> > To: K, Narendra
> > Cc: john.fastabend@gmail.com; netdev@vger.kernel.org;
> > john.r.fastabend@intel.com
> > Subject: Re: [PATCH net-next] net: Add phys_port identifier to struct
> > net_device and export it to sysfs
> >
> > On Wed, 2013-06-19 at 07:29 -0700, Narendra_K@Dell.com wrote:
> > [...]
> > > 2. show_phys_port function sees a consistent value of
> > > 'netdev->phys_port.port_id and netdev->phys_port.port_id_len ' if
> > > another execution path changes the value of 'netdev->phys_port.port_id
> > > and netdev->phys_port.port_id_len ' with write_lock(&dev_base_lock)
> > > held (similar to how dev->operstate is being changed).
> > [...]
> >
> > If the physical port ID can change dynamically (I hadn't thought of that, but an
> > embedded switch could support such reconfiguration) then any such change
> > also needs to be announced through rtnetlink. Actually, I think the value
> > needs to be included in rtnetlink information anyway.
> >
>
> Ok. Thank you Ben. I had not thought about this scenario. I was
> thinking about the reason to hold the dev_base_lock. Do you think
> points 1 and 2 are correct reason to hold the dev_base_lock ?

I think so.

> If correct, I think the 'show_broadcast' function also needs to be
> fixed as it is not holding the lock.

I think the broadcast address should never change during the lifetime of
a device, so it doesn't need the lock. That might not be true for all
layer 2 protocols though.

Ben.
With a unique identifier per port, userspace tools might be able to
check whether the VFs/partitions belong to the same physical port.

Just being paranoid, wouldn't it be better to bond VFs/partitions across
NICs instead of across ports of the same NIC? To be able to check
whether the VFs/partitions are on the same port and the same NIC, the
unique identifier should probably have two parts, like ABC_XYZ, with ABC
uniquely identifying the NIC and XYZ uniquely identifying the port.

Any thoughts?

Thank you
Praveen K Paladugu
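Editorial sketch: the two-part identifier suggested above (ABC_XYZ) could be
pictured as a pair of byte strings, one for the NIC and one for the port. The
structure and function names below are hypothetical and appear nowhere in the
patch; this only illustrates how a tool might tell "same port" from "same
NIC, different port" from "different NIC" if such an identifier existed.

```c
#include <assert.h>
#include <string.h>

#define ID_MAX 32

/* Hypothetical two-part identifier: NIC part (ABC) plus port part (XYZ). */
struct composite_port_id {
	unsigned char nic_id[ID_MAX];
	unsigned int nic_id_len;	/* zero if not set */
	unsigned char port_id[ID_MAX];
	unsigned int port_id_len;	/* zero if not set */
};

enum port_relation {
	RELATION_UNKNOWN,		/* at least one part is not set */
	DIFFERENT_NIC,
	SAME_NIC_DIFFERENT_PORT,
	SAME_NIC_SAME_PORT,
};

static int id_equal(const unsigned char *a, unsigned int alen,
		    const unsigned char *b, unsigned int blen)
{
	return alen == blen && memcmp(a, b, alen) == 0;
}

static enum port_relation compare_ports(const struct composite_port_id *a,
					const struct composite_port_id *b)
{
	if (!a->nic_id_len || !b->nic_id_len)
		return RELATION_UNKNOWN;
	if (!id_equal(a->nic_id, a->nic_id_len, b->nic_id, b->nic_id_len))
		return DIFFERENT_NIC;
	if (!a->port_id_len || !b->port_id_len)
		return RELATION_UNKNOWN;
	if (!id_equal(a->port_id, a->port_id_len, b->port_id, b->port_id_len))
		return SAME_NIC_DIFFERENT_PORT;
	return SAME_NIC_SAME_PORT;
}
```

A bonding tool following the suggestion above would presumably prefer pairs
that compare as DIFFERENT_NIC for fault tolerance.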
On 06/19/2013 12:34 PM, Ben Hutchings wrote:
> On Thu, 2013-06-20 at 00:23 +0530, Narendra_K@Dell.com wrote:
>>> -----Original Message-----
>>> From: Ben Hutchings [mailto:bhutchings@solarflare.com]
>>> Sent: Wednesday, June 19, 2013 9:07 PM
>>> To: K, Narendra
>>> Cc: john.fastabend@gmail.com; netdev@vger.kernel.org;
>>> john.r.fastabend@intel.com
>>> Subject: Re: [PATCH net-next] net: Add phys_port identifier to struct
>>> net_device and export it to sysfs
>>>
>>> On Wed, 2013-06-19 at 07:29 -0700, Narendra_K@Dell.com wrote:
>>> [...]
>>>> 2. show_phys_port function sees a consistent value of
>>>> 'netdev->phys_port.port_id and netdev->phys_port.port_id_len ' if
>>>> another execution path changes the value of 'netdev->phys_port.port_id
>>>> and netdev->phys_port.port_id_len ' with write_lock(&dev_base_lock)
>>>> held (similar to how dev->operstate is being changed).
>>> [...]
>>>
>>> If the physical port ID can change dynamically (I hadn't thought of that, but an
>>> embedded switch could support such reconfiguration) then any such change
>>> also needs to be announced through rtnetlink. Actually, I think the value
>>> needs to be included in rtnetlink information anyway.
>>>
>>
>> Ok. Thank you Ben. I had not thought about this scenario. I was
>> thinking about the reason to hold the dev_base_lock. Do you think
>> points 1 and 2 are correct reason to hold the dev_base_lock ?
>
> I think so.
>
>> If correct, I think the 'show_broadcast' function also needs to be
>> fixed as it is not holding the lock.
>
> I think the broadcast address should never change during the lifetime of
> a device, so it doesn't need the lock. That might not be true for all
> layer 2 protocols though.
>
> Ben.
>

Also, do you think this will be primarily useful for partitioning
devices that expose multiple physical functions? Or do you see a use
case for SR-IOV with virtual functions as well? The phys_port attribute
provides a common interface for both cases, which is good. I suppose in
the VF case, however, the host can already learn this.

I gather from your original post here that you are aware of all this.

quoting:

> I was thinking about the scenario of VF0 and VF1 coming from PF0 in the host
> Network Controller 1 being direct assigned to a KVM guest via VTD and netdevices
> from VF0 and VF1 being bonded in the guest. Assuming that physical port number used
> by VF0 and VF1 is 1, additional information is needed to identify if port number 1
> is on Network controller 1 or Network controller 2. (In the host we could use
> PCI b/d/f to differentiate two Network Controllers). I think it is similar to
> hybrid guest acceleration on the VF assignment aspect.

I'm curious though: why would the host/libvirt assign two VFs from the
same PF to a guest like this? Is this really a host mis-configuration
that you want a way to detect in the guest?

Thanks,
John
> -----Original Message-----
> From: John Fastabend [mailto:john.fastabend@gmail.com]
> Sent: Friday, June 21, 2013 10:41 PM
> To: K, Narendra
> Cc: Ben Hutchings; netdev@vger.kernel.org; john.r.fastabend@intel.com
> Subject: Re: [PATCH net-next] net: Add phys_port identifier to struct
> net_device and export it to sysfs
>
> On 06/19/2013 12:34 PM, Ben Hutchings wrote:

[...]

> > I think so.
> >
> >> If correct, I think the 'show_broadcast' function also needs to be
> >> fixed as it is not holding the lock.
> >
> > I think the broadcast address should never change during the lifetime
> > of a device, so it doesn't need the lock. That might not be true for
> > all layer 2 protocols though.
> >
> > Ben.
> >
>
> Also, do you think this will be primarily useful for partitioning devices that
> expose multiple physical functions? Or do you see a use case for SR-IOV with
> virtual functions as well. The phys_port attribute provides a common
> interface for both cases which is good I suppose in the VF case however the
> host can already learn this. I gather from your original post here that you are
> aware of all this.
>

John, I think it will be useful in the SRIOV scenario also, when more
than one VF from two NICs are assigned to the guest. phys_port would be
helpful in choosing the correct slave interfaces when host details are
not available.

With regards,
Narendra K
Linux Engineering
Dell Inc.
[...]

>>
>> Also, do you think this will be primarily useful for partitioning devices that
>> expose multiple physical functions? Or do you see a use case for SR-IOV with
>> virtual functions as well. The phys_port attribute provides a common
>> interface for both cases which is good I suppose in the VF case however the
>> host can already learn this. I gather from your original post here that you are
>> aware of all this.
>>
>
> John, I think it will be useful in the SRIOV scenario also when more
> than one VF from two NICs are assigned to the guest. phys_port would be
> helpful in choosing the correct slave interfaces when host details are
> not available.

OK. But I'm not sure why you would assign two VFs from the same NIC
to a guest? This doesn't seem like a good configuration for failover
because if one VF fails it seems likely both will fail. Maybe there
are some benefits for load balancing? Or my assumption that both VFs
will fail is wrong.

Anyways it does seem useful in the partitioning case with multiple
physical functions.

.John
On Fri, 2013-06-28 at 09:33 -0700, John Fastabend wrote:
> [...]
>
> >>
> >> Also, do you think this will be primarily useful for partitioning devices that
> >> expose multiple physical functions? Or do you see a use case for SR-IOV with
> >> virtual functions as well. The phys_port attribute provides a common
> >> interface for both cases which is good I suppose in the VF case however the
> >> host can already learn this. I gather from your original post here that you are
> >> aware of all this.
> >>
> >
> > John, I think it will be useful in the SRIOV scenario also when more
> than one VF from two NICs are assigned to the guest. phys_port would be
> helpful in choosing the correct slave interfaces when host details are
> not available.
>
> OK. But I'm not sure why you would assign two VFs from the same NIC
> to a guest? This doesn't seem like a good configuration for failover
> because if one VF fails it seems likely both will fail. Maybe there
> are some benefits for load balancing? Or my assumption both VFs will
> fail is wrong.

I believe Narendra is trying to provide hints to the guest that would
allow it to avoid such broken bonding configurations. But it is
certainly a good question why there would be two VFs assigned in the
first place.

I could imagine passing through two VFs for the same physical port that
have been assigned to different VLANs. But then you wouldn't want to
bond two devices that are on different VLANs, whether or not they're
using the same port!

> Anyways it does seem useful in the partitioning case with multiple
> physical functions.

I was thinking it could also help to support the hybrid guest networking
mode. In this mode, the guest gets a PV (e.g. virtio_net) device and a
VF bridged to the same physical port, and the VF can be removed before
the guest is migrated (and maybe reinserted if there's a VF available on
the new host) without a major disruption to the guest. In that case the
guest *should* bond together the two net devices that have the same
physical port ID but different drivers. This would require the physical
port ID to be propagated through macvtap/macvlan and virtio.

Ben.
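Editorial sketch: the two cases discussed so far pull in opposite directions.
For fault tolerance, two interfaces on the same physical port are a poor
pair; in the hybrid guest mode described above, two interfaces with the same
physical port ID but different drivers are exactly the pair to bond. A tool
might encode that roughly as follows. Everything here (iface_info, bond_hint,
classify_pair) is hypothetical, and treating "different driver name" as the
marker for a PV/VF pair is an assumption taken from the wording above, not an
established rule.

```c
#include <assert.h>
#include <string.h>

#define PORT_ID_MAX 32

/* Hypothetical per-interface facts a user-space tool might collect. */
struct iface_info {
	const char *driver;			/* e.g. "virtio_net" */
	unsigned char port_id[PORT_ID_MAX];
	unsigned int port_id_len;		/* zero if not set */
};

enum bond_hint {
	HINT_NO_INFO,		/* port ID missing on either side */
	HINT_INDEPENDENT_PORTS,	/* different ports: fine for failover */
	HINT_SAME_PORT_WARN,	/* same port, same driver: warn the user */
	HINT_HYBRID_PAIR,	/* same port, different drivers: PV + VF */
};

static enum bond_hint classify_pair(const struct iface_info *a,
				    const struct iface_info *b)
{
	if (!a->port_id_len || !b->port_id_len)
		return HINT_NO_INFO;
	if (a->port_id_len != b->port_id_len ||
	    memcmp(a->port_id, b->port_id, a->port_id_len) != 0)
		return HINT_INDEPENDENT_PORTS;
	if (strcmp(a->driver, b->driver) != 0)
		return HINT_HYBRID_PAIR;
	return HINT_SAME_PORT_WARN;
}
```

The result is only a hint: the tool would still warn rather than refuse,
in line with the "warn users" wording of the commit message.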
On Fri, Jun 28, 2013 at 10:39:13PM +0530, Ben Hutchings wrote:
>
> On Fri, 2013-06-28 at 09:33 -0700, John Fastabend wrote:

[...]

> > > John, I think it will be useful in the SRIOV scenario also when more
> > than one VF from two NICs are assigned to the guest. phys_port would be
> > helpful in choosing the correct slave interfaces when host details are
> > not available.
> >
> > OK. But I'm not sure why you would assign two VFs from the same NIC
> > to a guest? This doesn't seem like a good configuration for failover
> > because if one VF fails it seems likely both will fail. Maybe there
> > are some benefits for load balancing? Or my assumption both VFs will
> > fail is wrong.
>
> I believe Narendra is trying to provide hints to the guest that would
> allow it to avoid such broken bonding configurations. But it is
> certainly a good question why there would be two VFs assigned in the
> first place.
>
> I could imagine passing through two VFs for the same physical port that
> have been assigned to different VLANs. But then you wouldn't want to
> bond two devices that are on different VLANs, whether or not they're
> using the same port!

I was thinking of the following scenario in the guest.

bond0 = NIC1 VF0 + NIC2 VF0
bond1 = NIC1 VF1 + NIC2 VF1

bond0 and bond1 are on different VLANs. The phys_port identifier hint
would be helpful to the guest in selecting the correct slaves for the
above configuration. Sorry if I missed any detail here.

>
> > Anyways it does seem useful in the partitioning case with multiple
> > physical functions.

Yes, I agree.

>
> I was thinking it could also help to support the hybrid guest networking
> mode. In this mode, the guest gets a PV (e.g. virtio_net) device and a
> VF bridged to the same physical port, and the VF can be removed before
> the guest is migrated (and maybe reinserted if there's a VF available on
> the new host) without a major disruption to the guest. In that case the
> guest *should* bond together the two net devices that have the same
> physical port ID but different drivers. This would require the physical
> port ID to be propagated through macvtap/macvlan and virtio.
>
> Ben.
>
> --
> Ben Hutchings, Staff Engineer, Solarflare
> Not speaking for my employer; that's the marketing department's job.
> They asked us to note that Solarflare product names are trademarked.
>
Mon, Jun 17, 2013 at 08:10:32PM CEST, Narendra_K@Dell.com wrote:
>It is useful to know if network interfaces from NIC partitions
>'map to/use the' same physical port. For example, when creating
>bonding in fault tolerance mode, if two network interfaces map to/use
>the same physical port, it might not have the desired result. This
>information is not available today in a standard format or it is not
>present. If this information can be made available in a generic way
>to user space, tools such as NetworkManager or Libteam or Wicked can
>make smarter bonding decisions (such as warn users when setting up
>configurations which will not have desired effect).
>
>The requirement is to have a generic interface using which
>kernel/drivers can provide information/hints to user space about the
>physical port number used by a network interface.
>
>The following options were explored -
>
>1. 'dev_id' sysfs attribute:
>
>In addition to being used to differentiate between devices that share
>the same link layer address, it is being used to indicate the physical
>port number used by a network interface.
>
>As dev_id exists to differentiate between devices sharing the same
>link layer address, dev_id option is not selected.
>
>2. Re-using 'if_port' field in 'struct net_device':
>
>if_port field exists to indicate the media type(please refer to
>netdevice.h). It seemed like it was also used to indicate the physical
>port number.
>
>As re-using 'if_port' might possibly break user space, this option is
>not selected.
>
>3. Add a new field 'phys_port' to 'struct net_device' and export it
>to sysfs:
>
>The 'phys_port' will be a universally unique identifier, which
>would be a MAC-48 or EUI-64 or a 128 bit UUID value, but not
>restricted to these spaces. It will uniquely identify the physical
>port used by a network interface. The 'length' of the identifier will
>be zero if the field is not set for a network interface.
>
>This patch implements option 3. It creates a new sysfs attribute
>'phys_port' -

I think that the correct way is to (Ben already mentioned part of it):

1) introduce ndo_phys_port_id(), which would be used by the core to get
   the struct port_identifier filled in by the driver (struct
   port_identifier is not really a good name; a namespace prefix should
   be there)
2) add a netdev notifier event type which would allow the driver to
   propagate changes of the phys port ID to the rtnetlink code and to
   drivers which might be interested (like bond/bridge/whatever) as
   well.
3) export the phys port ID through the rtnetlink API to userspace.

I can cook up a patch like this after I return from my weekend trip if
you are interested :)

Jiri

>
>/sys/class/net/<interface name>/phys_port
>
>References: http://marc.info/?l=linux-netdev&m=136920998009209&w=2
>References: http://marc.info/?l=linux-netdev&m=136992041432498&w=2
>
>Signed-off-by: Narendra K <narendra_k@dell.com>
>---
>Changes from RFC version:
>
>Suggestions from Ben Hutchings -
>1. 'struct port_identifier' is changed to be generic instead of
>restricting it to MAC-48 or EUI-64 or 128 bit UUID.
>2. Commit message updated to indicate point 1.
>3. 'show_phys_port' function modified to handle zero length
>instead of returning -EINVAL
>4. 'show_phys_port' function made generic to handle all
>lengths instead 6, 8 or 16 bytes.
>
>Hi Ben, I have retained the commit message to indicate that 'dev_id'
>is being used to indicate the physical port number also.
>
>Thank you.
>
> include/linux/netdevice.h | 13 +++++++++++++
> net/core/net-sysfs.c      | 17 +++++++++++++++++
> 2 files changed, 30 insertions(+)
>
>diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
>index 09b4188..ddb14ef 100644
>--- a/include/linux/netdevice.h
>+++ b/include/linux/netdevice.h
>@@ -1062,6 +1062,14 @@ struct net_device_ops {
> 						    bool new_carrier);
> };
>
>+/* This structure holds a universally unique identifier to
>+ * identify the physical port used by a netdevice
>+ */
>+struct port_identifier {
>+	unsigned char port_id[MAX_ADDR_LEN];
>+	unsigned port_id_len;
>+};
>+
> /*
>  *	The DEVICE structure.
>  *	Actually, this whole structure is a big mistake.  It mixes I/O
>@@ -1181,6 +1189,11 @@ struct net_device {
> 						 * that share the same link
> 						 * layer address
> 						 */
>+	struct port_identifier	phys_port;	/* Universally unique physical
>+						 * port identifier, MAC-48 or
>+						 * EUI-64 or 128 bit UUID,
>+						 * length is zero if not set
>+						 */
> 	spinlock_t		addr_list_lock;
> 	struct netdev_hw_addr_list	uc;	/* Unicast mac addresses */
> 	struct netdev_hw_addr_list	mc;	/* Multicast mac addresses */
>diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
>index 981fed3..3245e90 100644
>--- a/net/core/net-sysfs.c
>+++ b/net/core/net-sysfs.c
>@@ -334,6 +334,22 @@ static ssize_t store_group(struct device *dev, struct device_attribute *attr,
> 	return netdev_store(dev, attr, buf, len, change_group);
> }
>
>+static ssize_t show_phys_port(struct device *dev,
>+			      struct device_attribute *attr, char *buf)
>+{
>+	struct net_device *net = to_net_dev(dev);
>+	unsigned char len;
>+
>+	if (!dev_isalive(net))
>+		return -EINVAL;
>+
>+	len = net->phys_port.port_id_len;
>+	if (!len)
>+		return 0;
>+
>+	return sysfs_format_mac(buf, net->phys_port.port_id, len);
>+}
>+
> static struct device_attribute net_class_attributes[] = {
> 	__ATTR(addr_assign_type, S_IRUGO, show_addr_assign_type, NULL),
> 	__ATTR(addr_len, S_IRUGO, show_addr_len, NULL),
>@@ -355,6 +371,7 @@ static struct device_attribute net_class_attributes[] = {
> 	__ATTR(tx_queue_len, S_IRUGO | S_IWUSR, show_tx_queue_len,
> 	       store_tx_queue_len),
> 	__ATTR(netdev_group, S_IRUGO | S_IWUSR, show_group, store_group),
>+	__ATTR(phys_port, S_IRUGO, show_phys_port, NULL),
> 	{}
> };
>
>--
>1.8.0.1
>
>--
>With regards,
>Narendra K
>Linux Engineering
>Dell Inc.
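Editorial sketch: point 1 of the proposal above (a driver callback that the
core uses to fetch the identifier, instead of a field the driver writes
directly) can be pictured with a small user-space mock. Only the name
ndo_phys_port_id comes from the mail; the callback signature, the mock_*
types and the -EOPNOTSUPP convention for drivers that do not implement it
are assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

#define MOCK_MAX_ID_LEN 32

/* Hypothetical, namespaced replacement for struct port_identifier. */
struct mock_netdev_port_id {
	unsigned char id[MOCK_MAX_ID_LEN];
	unsigned char id_len;
};

struct mock_netdev;

/* Assumed callback shape; the mail names the op but gives no signature. */
struct mock_netdev_ops {
	int (*ndo_phys_port_id)(struct mock_netdev *dev,
				struct mock_netdev_port_id *ppid);
};

struct mock_netdev {
	const struct mock_netdev_ops *ops;
};

/* Core-side helper: ask the driver, or report "not supported". */
static int mock_dev_get_phys_port_id(struct mock_netdev *dev,
				     struct mock_netdev_port_id *ppid)
{
	if (!dev->ops || !dev->ops->ndo_phys_port_id)
		return -EOPNOTSUPP;
	return dev->ops->ndo_phys_port_id(dev, ppid);
}

/* Example driver implementation returning a fixed 6-byte identifier. */
static int example_phys_port_id(struct mock_netdev *dev,
				struct mock_netdev_port_id *ppid)
{
	static const unsigned char id[] = { 0x00, 0x1b, 0x21, 0xaa, 0xbb, 0xcc };

	(void)dev;
	memcpy(ppid->id, id, sizeof(id));
	ppid->id_len = sizeof(id);
	return 0;
}
```

With a callback of this kind the sysfs and rtnetlink paths would both go
through one core helper, so a driver whose port mapping can change only has
to answer the query and raise the notifier from point 2.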
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 09b4188..ddb14ef 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1062,6 +1062,14 @@ struct net_device_ops {
 						    bool new_carrier);
 };
 
+/* This structure holds a universally unique identifier to
+ * identify the physical port used by a netdevice
+ */
+struct port_identifier {
+	unsigned char port_id[MAX_ADDR_LEN];
+	unsigned port_id_len;
+};
+
 /*
  *	The DEVICE structure.
  *	Actually, this whole structure is a big mistake.  It mixes I/O
@@ -1181,6 +1189,11 @@ struct net_device {
 						 * that share the same link
 						 * layer address
 						 */
+	struct port_identifier	phys_port;	/* Universally unique physical
+						 * port identifier, MAC-48 or
+						 * EUI-64 or 128 bit UUID,
+						 * length is zero if not set
+						 */
 	spinlock_t		addr_list_lock;
 	struct netdev_hw_addr_list	uc;	/* Unicast mac addresses */
 	struct netdev_hw_addr_list	mc;	/* Multicast mac addresses */
diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
index 981fed3..3245e90 100644
--- a/net/core/net-sysfs.c
+++ b/net/core/net-sysfs.c
@@ -334,6 +334,22 @@ static ssize_t store_group(struct device *dev, struct device_attribute *attr,
 	return netdev_store(dev, attr, buf, len, change_group);
 }
 
+static ssize_t show_phys_port(struct device *dev,
+			      struct device_attribute *attr, char *buf)
+{
+	struct net_device *net = to_net_dev(dev);
+	unsigned char len;
+
+	if (!dev_isalive(net))
+		return -EINVAL;
+
+	len = net->phys_port.port_id_len;
+	if (!len)
+		return 0;
+
+	return sysfs_format_mac(buf, net->phys_port.port_id, len);
+}
+
 static struct device_attribute net_class_attributes[] = {
 	__ATTR(addr_assign_type, S_IRUGO, show_addr_assign_type, NULL),
 	__ATTR(addr_len, S_IRUGO, show_addr_len, NULL),
@@ -355,6 +371,7 @@ static struct device_attribute net_class_attributes[] = {
 	__ATTR(tx_queue_len, S_IRUGO | S_IWUSR, show_tx_queue_len,
 	       store_tx_queue_len),
 	__ATTR(netdev_group, S_IRUGO | S_IWUSR, show_group, store_group),
+	__ATTR(phys_port, S_IRUGO, show_phys_port, NULL),
 	{}
 };
It is useful to know if network interfaces from NIC partitions
'map to/use the' same physical port. For example, when creating
bonding in fault tolerance mode, if two network interfaces map to/use
the same physical port, it might not have the desired result. This
information is not available today in a standard format or it is not
present. If this information can be made available in a generic way
to user space, tools such as NetworkManager or Libteam or Wicked can
make smarter bonding decisions (such as warn users when setting up
configurations which will not have desired effect).

The requirement is to have a generic interface using which
kernel/drivers can provide information/hints to user space about the
physical port number used by a network interface.

The following options were explored -

1. 'dev_id' sysfs attribute:

In addition to being used to differentiate between devices that share
the same link layer address, it is being used to indicate the physical
port number used by a network interface.

As dev_id exists to differentiate between devices sharing the same
link layer address, dev_id option is not selected.

2. Re-using 'if_port' field in 'struct net_device':

if_port field exists to indicate the media type(please refer to
netdevice.h). It seemed like it was also used to indicate the physical
port number.

As re-using 'if_port' might possibly break user space, this option is
not selected.

3. Add a new field 'phys_port' to 'struct net_device' and export it
to sysfs:

The 'phys_port' will be a universally unique identifier, which
would be a MAC-48 or EUI-64 or a 128 bit UUID value, but not
restricted to these spaces. It will uniquely identify the physical
port used by a network interface. The 'length' of the identifier will
be zero if the field is not set for a network interface.

This patch implements option 3. It creates a new sysfs attribute
'phys_port' -

/sys/class/net/<interface name>/phys_port

References: http://marc.info/?l=linux-netdev&m=136920998009209&w=2
References: http://marc.info/?l=linux-netdev&m=136992041432498&w=2

Signed-off-by: Narendra K <narendra_k@dell.com>
---
Changes from RFC version:

Suggestions from Ben Hutchings -
1. 'struct port_identifier' is changed to be generic instead of
restricting it to MAC-48 or EUI-64 or 128 bit UUID.
2. Commit message updated to indicate point 1.
3. 'show_phys_port' function modified to handle zero length
instead of returning -EINVAL
4. 'show_phys_port' function made generic to handle all
lengths instead 6, 8 or 16 bytes.

Hi Ben, I have retained the commit message to indicate that 'dev_id'
is being used to indicate the physical port number also.

Thank you.

 include/linux/netdevice.h | 13 +++++++++++++
 net/core/net-sysfs.c      | 17 +++++++++++++++++
 2 files changed, 30 insertions(+)
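Editorial sketch: the commit message expects user-space tools to read the new
attribute and warn about bonds whose slaves share a port. Assuming the
attribute exists exactly as proposed here (a text file named phys_port under
each interface directory, empty when the identifier is not set), such a check
might look roughly like the following. The helper names are hypothetical, and
the sysfs root is passed in as a parameter so the sketch does not depend on
any particular system.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Read <root>/<ifname>/phys_port into out, dropping the trailing newline.
 * Returns the length read (0 if the identifier is not set), or -1 if the
 * attribute cannot be read.
 */
static int read_phys_port(const char *root, const char *ifname,
			  char *out, size_t outlen)
{
	char path[512];
	size_t n;
	FILE *f;

	if (outlen == 0)
		return -1;
	snprintf(path, sizeof(path), "%s/%s/phys_port", root, ifname);
	f = fopen(path, "r");
	if (!f)
		return -1;
	n = fread(out, 1, outlen - 1, f);
	fclose(f);
	out[n] = '\0';
	if (n && out[n - 1] == '\n')
		out[--n] = '\0';
	return (int)n;
}

/* 1: same port, 0: different ports, -1: unknown (an identifier is unset). */
static int compare_ids(const char *a, const char *b)
{
	if (!*a || !*b)
		return -1;
	return strcmp(a, b) == 0;
}

/* Same return convention as compare_ids(); -1 also covers read errors. */
static int same_phys_port(const char *root, const char *if_a, const char *if_b)
{
	char id_a[256], id_b[256];

	if (read_phys_port(root, if_a, id_a, sizeof(id_a)) < 0 ||
	    read_phys_port(root, if_b, id_b, sizeof(id_b)) < 0)
		return -1;
	return compare_ids(id_a, id_b);
}
```

A tool would call something like same_phys_port("/sys/class/net", "eth0",
"eth1") before enslaving both interfaces and print a warning when the result
is 1; a result of -1 means no hint is available and nothing should be
assumed.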