Message ID | 1342400890-32183-1-git-send-email-anirban.chakraborty@qlogic.com |
---|---|
State | Changes Requested, archived |
Delegated to: | David Miller |
Headers | show |
Anirban Chakraborty <anirban.chakraborty@qlogic.com> wrote: >From: Anirban Chakraborty <anirban.chakraborty@qlogic.com> > >Add support to disable bonding of interfaces belonging to the same physical port. In >case of SRIOV or NIC partition mode, a single port of the adapter can have multiple >NIC functions. While bonding such interfaces, it is ensured that the NIC functions >belonging to the same physical port are not bonded together. > >Signed-off-by: Anirban Chakraborty <anirban.chakraborty@qlogic.com> >--- > Documentation/networking/ifenslave.c | 208 +++++++++++++++++++++++++++++++++- > 1 files changed, 207 insertions(+), 1 deletions(-) > >diff --git a/Documentation/networking/ifenslave.c b/Documentation/networking/ifenslave.c >index ac5debb..a0bdab9 100644 >--- a/Documentation/networking/ifenslave.c >+++ b/Documentation/networking/ifenslave.c >@@ -92,9 +92,14 @@ > * - 2003/12/01 - Shmulik Hen <shmulik.hen at intel dot com> > * - Code cleanup and style changes > * set version to 1.1.0 >+ * >+ * - 2012/07/15 - Anirban Chakraborty <anirban.chakraborty at qlogic dot com> >+ * - Added support to disable bonding interfaces belonging to the >+ * same physical port. >+ * set version to 1.1.1 This patch is all implemented within the ifenslave user space program, which, to my knowledge, is not currently used by any major distro to configure bonding. The configuration for bonding is typically performed by packages such as initscripts or sysconfig, and this functionality would likely need to go there. The only real use for ifenslave.c is on kernels without sysfs compiled in. -J --- -Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 7/15/2012 6:40 PM, Jay Vosburgh wrote: > Anirban Chakraborty <anirban.chakraborty@qlogic.com> wrote: > >> From: Anirban Chakraborty <anirban.chakraborty@qlogic.com> >> >> Add support to disable bonding of interfaces belonging to the same physical port. In >> case of SRIOV or NIC partition mode, a single port of the adapter can have multiple >> NIC functions. While bonding such interfaces, it is ensured that the NIC functions >> belonging to the same physical port are not bonded together. >> >> Signed-off-by: Anirban Chakraborty <anirban.chakraborty@qlogic.com> >> --- >> Documentation/networking/ifenslave.c | 208 +++++++++++++++++++++++++++++++++- >> 1 files changed, 207 insertions(+), 1 deletions(-) >> >> diff --git a/Documentation/networking/ifenslave.c b/Documentation/networking/ifenslave.c >> index ac5debb..a0bdab9 100644 >> --- a/Documentation/networking/ifenslave.c >> +++ b/Documentation/networking/ifenslave.c >> @@ -92,9 +92,14 @@ >> * - 2003/12/01 - Shmulik Hen <shmulik.hen at intel dot com> >> * - Code cleanup and style changes >> * set version to 1.1.0 >> + * >> + * - 2012/07/15 - Anirban Chakraborty <anirban.chakraborty at qlogic dot com> >> + * - Added support to disable bonding interfaces belonging to the >> + * same physical port. >> + * set version to 1.1.1 > > This patch is all implemented within the ifenslave user space > program, which, to my knowledge, is not currently used by any major > distro to configure bonding. > > The configuration for bonding is typically performed by packages > such as initscripts or sysconfig, and this functionality would likely > need to go there. > > The only real use for ifenslave.c is on kernels without sysfs > compiled in. > > -J > Also I'm not sure we need to explicitly block this. It is clear from looking at 'ip' output what the topology is. And in the SR-IOV case would this still work if the functions are direct assigned? How about if I try to bond two stacked devices that are on the same physical link. In both case iirc the bus info wont match up. Seems easier to just call this a configuration error or not if for some reason this is really what someone intended. .John -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 7/15/12 9:39 PM, "John Fastabend" <john.r.fastabend@intel.com> wrote: >On 7/15/2012 6:40 PM, Jay Vosburgh wrote: >> Anirban Chakraborty <anirban.chakraborty@qlogic.com> wrote: >> >>> From: Anirban Chakraborty <anirban.chakraborty@qlogic.com> >>> >>> Add support to disable bonding of interfaces belonging to the same >>>physical port. In >>> case of SRIOV or NIC partition mode, a single port of the adapter can >>>have multiple >>> NIC functions. While bonding such interfaces, it is ensured that the >>>NIC functions >>> belonging to the same physical port are not bonded together. >>> >>> Signed-off-by: Anirban Chakraborty <anirban.chakraborty@qlogic.com> >>> --- >>> Documentation/networking/ifenslave.c | 208 >>>+++++++++++++++++++++++++++++++++- >>> 1 files changed, 207 insertions(+), 1 deletions(-) >>> >>> diff --git a/Documentation/networking/ifenslave.c >>>b/Documentation/networking/ifenslave.c >>> index ac5debb..a0bdab9 100644 >>> --- a/Documentation/networking/ifenslave.c >>> +++ b/Documentation/networking/ifenslave.c >>> @@ -92,9 +92,14 @@ >>> * - 2003/12/01 - Shmulik Hen <shmulik.hen at intel dot com> >>> * - Code cleanup and style changes >>> * set version to 1.1.0 >>> + * >>> + * - 2012/07/15 - Anirban Chakraborty <anirban.chakraborty at >>>qlogic dot com> >>> + * - Added support to disable bonding interfaces belonging to the >>> + * same physical port. >>> + * set version to 1.1.1 >> >> This patch is all implemented within the ifenslave user space >> program, which, to my knowledge, is not currently used by any major >> distro to configure bonding. >> >> The configuration for bonding is typically performed by packages >> such as initscripts or sysconfig, and this functionality would likely >> need to go there. >> >> The only real use for ifenslave.c is on kernels without sysfs >> compiled in. >> >> -J >> > >Also I'm not sure we need to explicitly block this. It is clear from >looking at 'ip' output what the topology is. And in the SR-IOV >case would this still work if the functions are direct assigned? How >about if I try to bond two stacked devices that are on the same >physical link. In both case iirc the bus info wont match up. > >Seems easier to just call this a configuration error or not if for >some reason this is really what someone intended. > >.John I agree that for SR-IOV case with VFs assigned directly to the guest, bus info won't match up. However, I was thinking from the point of view of NIC partitioned mode (NPAR), and for the use case of SR-IOV VFs assigned to the hypervisor. It would be nice to prevent the user from getting into misconfiguration. -Anirban -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Anirban Chakraborty <anirban.chakraborty@qlogic.com> wrote: >On 7/15/12 9:39 PM, "John Fastabend" <john.r.fastabend@intel.com> wrote: [...] >>Also I'm not sure we need to explicitly block this. It is clear from >>looking at 'ip' output what the topology is. And in the SR-IOV >>case would this still work if the functions are direct assigned? How >>about if I try to bond two stacked devices that are on the same >>physical link. In both case iirc the bus info wont match up. >> >>Seems easier to just call this a configuration error or not if for >>some reason this is really what someone intended. >> >>.John > >I agree that for SR-IOV case with VFs assigned directly to the guest, bus >info won't >match up. However, I was thinking from the point of view of NIC >partitioned mode (NPAR), >and for the use case of SR-IOV VFs assigned to the hypervisor. It would be >nice to >prevent the user from getting into misconfiguration. If I'm understanding correctly, to hit the case you're worried about here would require assigning multiple VFs from one PF to the same linux instance as the PF itself, and then bonding those VFs together. Heck, there might be some arcane reason that somebody wants to do that on purpose, or the test may inadvertently prohibit legal configurations that happen to match the criteria. Has this been a real problem in practice? I'm not seeing a compelling argument for doing this. -J --- -Jay Vosburgh, IBM Linux Technology Center, fubar@us.ibm.com -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 7/15/12 10:50 PM, "Jay Vosburgh" <fubar@us.ibm.com> wrote: >Anirban Chakraborty <anirban.chakraborty@qlogic.com> wrote: >>On 7/15/12 9:39 PM, "John Fastabend" <john.r.fastabend@intel.com> wrote: >[...] >>>Also I'm not sure we need to explicitly block this. It is clear from >>>looking at 'ip' output what the topology is. And in the SR-IOV >>>case would this still work if the functions are direct assigned? How >>>about if I try to bond two stacked devices that are on the same >>>physical link. In both case iirc the bus info wont match up. >>> >>>Seems easier to just call this a configuration error or not if for >>>some reason this is really what someone intended. >>> >>>.John >> >>I agree that for SR-IOV case with VFs assigned directly to the guest, bus >>info won't >>match up. However, I was thinking from the point of view of NIC >>partitioned mode (NPAR), >>and for the use case of SR-IOV VFs assigned to the hypervisor. It would >>be >>nice to >>prevent the user from getting into misconfiguration. > > If I'm understanding correctly, to hit the case you're worried >about here would require assigning multiple VFs from one PF to the same >linux instance as the PF itself, and then bonding those VFs together. > > Heck, there might be some arcane reason that somebody wants to >do that on purpose, or the test may inadvertently prohibit legal >configurations that happen to match the criteria. > > Has this been a real problem in practice? I'm not seeing a >compelling argument for doing this. > > -J The real problem that we have faced so far is the case of NPAR, where multiple physical functions are assigned to the linux host and the customer tried to create a bond of interfaces from the same physical port. In this case, linux host was running without any guest OS and the NICs are used for carrying host traffic only. -Anirban -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/Documentation/networking/ifenslave.c b/Documentation/networking/ifenslave.c index ac5debb..a0bdab9 100644 --- a/Documentation/networking/ifenslave.c +++ b/Documentation/networking/ifenslave.c @@ -92,9 +92,14 @@ * - 2003/12/01 - Shmulik Hen <shmulik.hen at intel dot com> * - Code cleanup and style changes * set version to 1.1.0 + * + * - 2012/07/15 - Anirban Chakraborty <anirban.chakraborty at qlogic dot com> + * - Added support to disable bonding interfaces belonging to the + * same physical port. + * set version to 1.1.1 */ -#define APP_VERSION "1.1.0" +#define APP_VERSION "1.1.1" #define APP_RELDATE "December 1, 2003" #define APP_NAME "ifenslave" @@ -111,6 +116,10 @@ static const char *usage_msg = " ifenslave -c <master-if> <slave-if>\n" " ifenslave --help\n"; +static const char *misconfig_msg = +"Use interfaces from different physical port for an ethernet adapter\n" +"which has multiple NIC functions belonging to the same physical port\n"; + static const char *help_msg = "\n" " To create a bond device, simply follow these three steps :\n" @@ -208,6 +217,18 @@ struct dev_ifr { int req_type; }; +/* port: physical port + * bus: PCI bus no. + * ifname: interface name + * driver: driver for this device + */ +struct dev_prop { + u8 port; + u16 bus; + char ifname[IFNAMSIZ]; + char driver[IFNAMSIZ]; +}; + struct dev_ifr master_ifra[] = { {&master_mtu, "SIOCGIFMTU", SIOCGIFMTU}, {&master_flags, "SIOCGIFFLAGS", SIOCGIFFLAGS}, @@ -224,6 +245,10 @@ struct dev_ifr slave_ifra[] = { static void if_print(char *ifname); static int get_drv_info(char *master_ifname); +static int get_bus_info(struct dev_prop *interface); +static int get_device_info(int count, struct dev_prop *prop, char *slaves[]); +static int validate_slaves(char *master, char **spp); +static int valid_slaves(int count, char *interfaces[]); static int get_if_settings(char *ifname, struct dev_ifr ifra[]); static int get_slave_flags(char *slave_ifname); static int set_master_hwaddr(char *master_ifname, struct sockaddr *hwaddr); @@ -335,6 +360,15 @@ int main(int argc, char *argv[]) goto out; } + /* validate the slave configuration */ + if (!opt_d) { + res = validate_slaves(master_ifname, spp); + if (res) { + fprintf(stderr, "%s\n", misconfig_msg); + goto out; + } + } + slave_ifname = *spp++; if (slave_ifname == NULL) { @@ -643,6 +677,178 @@ out: return 0; } +/* + * Validate if specified interfaces do not belong to the same physical port + */ +static int validate_slaves(char *master, char **spp) +{ + int i, count = 0, res = 0; + struct ifreq ifr; + ifbond bond; + ifslave slv; + char *bslave; + char **slaves, **islaves, **tslaves; + + /* Find a count for existing slave interfaces */ + memset(&ifr, 0, sizeof(ifr)); + ifr.ifr_data = (ifbond *)&bond; + strncpy(ifr.ifr_name, master, IFNAMSIZ); + if (ioctl(skfd, SIOCBONDINFOQUERY, &ifr) < 0) { + if (errno == EOPNOTSUPP) { + saved_errno = errno; + return 1; + } + } + islaves = spp; + while (*islaves++ != NULL) + count++; + if (!count) + return 0; + count += bond.num_slaves; + slaves = malloc(sizeof(char) * count * IFNAMSIZ); + if (slaves == NULL) + return 1; + memset(slaves, 0, (sizeof(char) * count * IFNAMSIZ)); + tslaves = slaves; + /* find new interface names */ + islaves = spp; + + while (*islaves != NULL) + memcpy(slaves++, islaves++, IFNAMSIZ); + /* find existing slave interface names */ + for (i = 0; i < bond.num_slaves; i++) { + memset(&slv, 0, sizeof(slv)); + ifr.ifr_data = (ifslave *)&slv; + slv.slave_id = i; + if (ioctl(skfd, SIOCBONDSLAVEINFOQUERY, &ifr) < 0) { + if (errno == EOPNOTSUPP) { + saved_errno = errno; + res = 1; + goto out; + } + } + bslave = slv.slave_name; + memcpy(slaves++, &bslave, IFNAMSIZ); + } + res = valid_slaves(count, tslaves); +out: + if (tslaves) + free(tslaves); + return res; +} + +static int get_bus_info(struct dev_prop *interface) +{ + struct ifreq ifr; + struct ethtool_drvinfo info; + char *buf[4], *token, *tmp, *end; + int i = 0; + + memset(&ifr, 0, sizeof(ifr)); + strncpy(ifr.ifr_name, interface->ifname, IFNAMSIZ); + ifr.ifr_data = (caddr_t)&info; + + info.cmd = ETHTOOL_GDRVINFO; + if (ioctl(skfd, SIOCETHTOOL, &ifr) < 0) { + if (errno == EOPNOTSUPP) + goto out; + saved_errno = errno; + v_print("Slave '%s': Error: get bonding info failed %s\n", + interface->ifname, strerror(saved_errno)); + return 1; + } + memcpy(interface->driver, info.driver, strlen(info.driver) + 1); + token = strtok_r(info.bus_info, " :", &tmp); + buf[i] = token; + while (token) { + token = strtok_r(NULL, " :.", &tmp); + buf[++i] = token; + } + interface->bus = strtoul(buf[1], &end, 16); + return 0; +out: + return 1; +} + +static int get_device_info(int count, struct dev_prop *prop, char *slaves[]) +{ + int ret = -1, i; + struct ifreq ifr; + struct ethtool_cmd info; + + for (i = 0; i < count; i++) { + strncpy(prop[i].ifname, slaves[i], IFNAMSIZ); + memset(&ifr, 0, sizeof(ifr)); + strncpy(ifr.ifr_name, prop[i].ifname, IFNAMSIZ); + ifr.ifr_data = (caddr_t)&info; + + info.cmd = ETHTOOL_GSET; + + if (ioctl(skfd, SIOCETHTOOL, &ifr) < 0) { + if (errno == EOPNOTSUPP) + goto out; + saved_errno = errno; + v_print("Slave '%s': Error: get bonding info failed" + " %s\n", prop[i].ifname, + strerror(saved_errno)); + ret = 1; + goto out; + } + prop[i].port = info.phy_address; + ret = get_bus_info(&prop[i]); + if (ret) + goto out; + } +out: + return ret; +} + +/* For a given set of interfaces, find out if they belong to the + * same physical port. Return true if two interfaces are found to + * be from same physical port, otherwise return false. + */ + +static int valid_slaves(int count, char *slaves[]) +{ + int i, j, ret = 0; + struct dev_prop *ifprop, *searchif; + struct dev_prop *prop; + + prop = malloc(count * sizeof(struct dev_prop)); + if (prop == NULL) + return 1; + + memset(prop, 0, sizeof(struct dev_prop) * count); + ret = get_device_info(count, prop, slaves); + /* Iterate over the array of interfaces and find a match */ + for (j = 0; j < count; j++) { + ifprop = &prop[j]; + for (i = j + 1; i < count; i++) { + searchif = &prop[i]; + /* Compare driver names */ + if (!strncmp(ifprop->driver, searchif->driver, IFNAMSIZ) + && !strncmp(ifprop->ifname, searchif->ifname, + IFNAMSIZ)) + continue; + /* Compare physical port and bus of the interfaces */ + if ((searchif->bus == ifprop->bus) && + (searchif->port == ifprop->port)) { + ret = 1; + fprintf(stderr, + "slave interfaces %s and %s " + "belong to the same physical " + "port of the adapter\n", + searchif->ifname, ifprop->ifname); + goto out; + } + } + } +out: + free(prop); + prop = NULL; + return ret; +} + static int change_active(char *master_ifname, char *slave_ifname) { struct ifreq ifr;