Message ID | 20100428044235.8646.61943.stgit@savbu-pc100.cisco.com |
---|---|
State | Changes Requested, archived |
Delegated to: | David Miller |
On Wednesday 28 April 2010, Scott Feldman wrote:
> From: Scott Feldman <scofeldm@cisco.com>
>
> Add new netdev ops ndo_{set|get}_port_profile to allow setting of port-profile
> on a netdev interface. Extends RTM_SETLINK/RTM_GETLINK with new sub cmd called
> IFLA_PORT_PROFILE (added to end of IFLA_cmd list). The port-profile cmd
> arguments are (as seen from modified iproute2 cmdline):
>
>     ip link set DEVICE [ { up | down } ]
>                        [ arp { on | off } ]
>                        [ dynamic { on | off } ]
>                        [ multicast { on | off } ]
>                        ...
>                        [ vf NUM [ mac LLADDR ]
>                                 [ vlan VLANID [ qos VLAN-QOS ] ]
>                                 [ rate TXRATE ] ]
>                        [ port_profile [ PORT-PROFILE
>                                 [ mac LLADDR ]
>                                 [ host_uuid HOST_UUID ]
>                                 [ client_uuid CLIENT_UUID ]
>                                 [ client_name CLIENT_NAME ] ] ]
>     ip link show [ DEVICE ]

We will need a few more options to cover draft VDP in addition to the protocol
your NIC is using. I still think it's possible to use the same interface
for both, but the differences are obviously showing.

The missing bits that I can see so far are:

- You only have 'get' and 'set'. We will also need an 'unset' or 'delete'
  option in order to get rid of a port profile association.

- VDP has three different ways to 'set' a port profile: 'associate',
  'pre-associate with resource reservation' and 'pre-associate without
  resource reservation'. This could become an extra option flag.

- Instead of a port profile name, VDP specifies a tuple like
      struct vsi_associate {
              unsigned char VSI_Mgr_ID;       /* VSI manager ID */
              unsigned char VSI_Type_ID[3];   /* 24 bit VSI Type ID */
              unsigned char VSI_Type_Version; /* VSI Type version */
      };
  I'm not sure how to deal with that best, but there needs to be
  some parsing of these numbers.

- VDP requires a vlan ID to be part of the association, in addition to
  the MAC address. In theory, we could have multiple tuples of MAC+VLAN
  addresses, but we can probably just associate each tuple separately
  and ignore that part of the standard.

- we have a set of possible error conditions that can be returned by
  the switch (invalid format, insufficient resources, unknown VTID,
  VTID violation, VTID version violation, out of sync). It should be
  possible to return each of these to the user with 'get'.

> Since we're using netlink sockets, the receiver of the RTM_SETLINK msg can
> be in kernel- or user-space. For kernel-space recipient, rtnetlink.c, the
> new ndo_set_port_profile netdev op is called to set the port-profile.
> User-space recipients can decide how they propagate the msg to the switch.
> There is also a RTM_GETLINK cmd to return port-profile setting of an
> interface and to also return the status of the last port-profile.

More on a stylistic note, I'm not convinced that using RTM_SETLINK/GETLINK
is the right interface for this, unlike the 'ip link set DEV vf ...' stuff,
because it seems to suggest that this is an option of the adapter itself.
I actually liked the iovnl family better in this regard, because it kept
the protocols separate.

What I could imagine to unify this is something like

    ip port_profile set DEVICE [ { pre_associate | pre_associate_rr } ]
                               { name PORT-PROFILE | vsi MGR:VTID:VER }
                               mac LLADDR
                               [ vlan VID ]
                               [ host_uuid HOST_UUID ]
                               [ client_uuid CLIENT_UUID ]
                               [ client_name CLIENT_NAME ]
    ip port_profile del DEVICE [ mac LLADDR [ vlan VID ] ]
    ip port_profile show DEVICE

Arnd
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
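
The `vsi MGR:VTID:VER` argument proposed in the mail above needs "some parsing of these numbers" before it fits the quoted `struct vsi_associate`. A minimal sketch of how a front end might do that, in C. The helper names `parse_field` and `vsi_parse`, the decimal notation, and the most-significant-byte-first layout of the 24-bit type ID are all assumptions made for illustration; none of them come from the thread or from the VDP draft.

```c
#include <errno.h>
#include <stdlib.h>

/* Layout as quoted in the mail above. */
struct vsi_associate {
	unsigned char VSI_Mgr_ID;       /* VSI manager ID */
	unsigned char VSI_Type_ID[3];   /* 24 bit VSI Type ID */
	unsigned char VSI_Type_Version; /* VSI Type version */
};

/*
 * Hypothetical helper: parse one unsigned decimal field that must not
 * exceed `max` and must be followed by `term` (':' or end of string).
 * Advances *p past the field and its separator.
 */
static int parse_field(const char **p, unsigned long max, char term,
		       unsigned long *out)
{
	char *end;
	unsigned long v;

	/* Reject empty fields, signs and leading whitespace up front. */
	if (**p < '0' || **p > '9')
		return -1;

	errno = 0;
	v = strtoul(*p, &end, 10);
	if (errno || v > max || *end != term)
		return -1;

	*out = v;
	*p = (term != '\0') ? end + 1 : end;
	return 0;
}

/* Hypothetical: parse "MGR:VTID:VER"; returns 0 on success, -1 on error. */
int vsi_parse(const char *s, struct vsi_associate *vsi)
{
	unsigned long mgr, vtid, ver;

	if (parse_field(&s, 0xff, ':', &mgr) ||
	    parse_field(&s, 0xffffff, ':', &vtid) ||
	    parse_field(&s, 0xff, '\0', &ver))
		return -1;

	vsi->VSI_Mgr_ID = (unsigned char)mgr;
	/* Assumption: the 24-bit ID is stored most significant byte first. */
	vsi->VSI_Type_ID[0] = (unsigned char)((vtid >> 16) & 0xff);
	vsi->VSI_Type_ID[1] = (unsigned char)((vtid >> 8) & 0xff);
	vsi->VSI_Type_ID[2] = (unsigned char)(vtid & 0xff);
	vsi->VSI_Type_Version = (unsigned char)ver;
	return 0;
}
```

Under these assumptions, `vsi_parse("1:12345:1", &v)` would fill the manager ID with 1, the type ID with the bytes 0x00 0x30 0x39, and the version with 1, while out-of-range or malformed input is rejected before anything is written to the struct.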
On 4/28/10 6:13 AM, "Arnd Bergmann" <arnd@arndb.de> wrote:

> On Wednesday 28 April 2010, Scott Feldman wrote:
>>     ip link set DEVICE [ { up | down } ]
>>                        [ arp { on | off } ]
>>                        [ dynamic { on | off } ]
>>                        [ multicast { on | off } ]
>>                        ...
>>                        [ vf NUM [ mac LLADDR ]
>>                                 [ vlan VLANID [ qos VLAN-QOS ] ]
>>                                 [ rate TXRATE ] ]
>>                        [ port_profile [ PORT-PROFILE
>>                                 [ mac LLADDR ]
>>                                 [ host_uuid HOST_UUID ]
>>                                 [ client_uuid CLIENT_UUID ]
>>                                 [ client_name CLIENT_NAME ] ] ]
>>     ip link show [ DEVICE ]
>
> We will need a few more options to cover draft VDP in addition to the protocol
> your NIC is using. I still think it's possible to use the same interface
> for both, but the differences are obviously showing.
>
> The missing bits that I can see so far are:
>
> - You only have 'get' and 'set'. We will also need an 'unset' or 'delete'
>   option in order to get rid of a port profile association.

That's there in my patch. If you don't specify anything after port-profile
keyword, it's an unset. See extra "[" and "]" above. Will that work for
you?

> - VDP has three different ways to 'set' a port profile: 'associate',
>   'pre-associate with resource reservation' and 'pre-associate without
>   resource reservation'. This could become an extra option flag.

Ok, let's add an option flag bit field. I'm not sure how that looks from
the iproute2 cmd line. Would you take a stab at defining these?

> - Instead of a port profile name, VDP specifies a tuple like
>       struct vsi_associate {
>               unsigned char VSI_Mgr_ID;       /* VSI manager ID */
>               unsigned char VSI_Type_ID[3];   /* 24 bit VSI Type ID */
>               unsigned char VSI_Type_Version; /* VSI Type version */
>       };
>   I'm not sure how to deal with that best, but there needs to be
>   some parsing of these numbers.

PORT-PROFILE above is a u8* array. You could decide to encode the tuple in
a string, e.g. "1.12345.1", and let the receiver parse it? Or pack it in as
binary. PORT-PROFILE for us is just a string identifier, e.g. "corp-net-10"
or "joes-garage".

> - VDP requires a vlan ID to be part of the association, in addition to
>   the MAC address. In theory, we could have multiple tuples of MAC+VLAN
>   addresses, but we can probably just associate each tuple separately
>   and ignore that part of the standard.

I don't think I have enough information to spec this out for this item.
Would you take a stab at how this would look in the struct and how it would
look from the iproute2 cmd line? (Note I'm using iproute2 cmd line for
illustrative purposes, but the sender of the msg could be something like
libvirt).

> - we have a set of possible error conditions that can be returned by
>   the switch (invalid format, insufficient resources, unknown VTID,
>   VTID violation, VTID version violation, out of sync). It should be
>   possible to return each of these to the user with 'get'.

There is a status code in the get cmd as defined in my patch. I have it as
a u8 with some enum codes. Can we add to the enum code list? Or do you
want to return a full string? Our requirement is that we return one of:
{success, error, in-progress}.

>> Since we're using netlink sockets, the receiver of the RTM_SETLINK msg can
>> be in kernel- or user-space. For kernel-space recipient, rtnetlink.c, the
>> new ndo_set_port_profile netdev op is called to set the port-profile.
>> User-space recipients can decide how they propagate the msg to the switch.
>> There is also a RTM_GETLINK cmd to return port-profile setting of an
>> interface and to also return the status of the last port-profile.
>
> More on a stylistic note, I'm not convinced that using RTM_SETLINK/GETLINK
> is the right interface for this, unlike the 'ip link set DEV vf ...' stuff,
> because it seems to suggest that this is an option of the adapter itself.
> I actually liked the iovnl family better in this regard, because it kept
> the protocols separate.

Wait a second...I abandoned iovnl and moved to if_link based on suggestions
from you and others. On 4/21/10 2:13 PM, "Arnd Bergmann" <arnd@arndb.de>
wrote:

> Right. My preference would probably be make these a subcategory of
> the if_link, and use the existing RTM_NEWLINK/RTM_DELLINK commands.

My latest with if_link fits best with what we're trying to do with enic.

Note the current interface is per netdev interface (a link), where the
netdev interface could be the adapter itself or any other netdev such as
macvlan, bond, etc.

> What I could imagine to unify this is something like
>
>     ip port_profile set DEVICE [ { pre_associate | pre_associate_rr } ]
>                                { name PORT-PROFILE | vsi MGR:VTID:VER }
>                                mac LLADDR
>                                [ vlan VID ]
>                                [ host_uuid HOST_UUID ]
>                                [ client_uuid CLIENT_UUID ]
>                                [ client_name CLIENT_NAME ]
>     ip port_profile del DEVICE [ mac LLADDR [ vlan VID ] ]
>     ip port_profile show DEVICE

If we want to break port_profile out into its own ip cmd, I'm ok with that.
What you have above would work for enic. The netdev would have these ops:

    ndo_set_port_profile
    ndo_get_port_profile
    ndo_del_port_profile

Sounds OK?

-scott
On 4/28/10 6:13 AM, "Arnd Bergmann" <arnd@arndb.de> wrote:

> What I could imagine to unify this is something like
>
>     ip port_profile set DEVICE [ { pre_associate | pre_associate_rr } ]
>                                { name PORT-PROFILE | vsi MGR:VTID:VER }
>                                mac LLADDR
>                                [ vlan VID ]
>                                [ host_uuid HOST_UUID ]
>                                [ client_uuid CLIENT_UUID ]
>                                [ client_name CLIENT_NAME ]
>     ip port_profile del DEVICE [ mac LLADDR [ vlan VID ] ]
>     ip port_profile show DEVICE

Arnd, can someone test this with VDP today? I don't have access to that
equipment so it's difficult for me to fully test the unified patch. I can
test the previous patch with enic easily because I have access to production
systems. I'd like to make sure someone can test this with VDP before I
respin the patch one more time.

-scott
On Wednesday 28 April 2010, Scott Feldman wrote:
> On 4/28/10 6:13 AM, "Arnd Bergmann" <arnd@arndb.de> wrote:
> >
> > We will need a few more options to cover draft VDP in addition to the protocol
> > your NIC is using. I still think it's possible to use the same interface
> > for both, but the differences are obviously showing.
> >
> > The missing bits that I can see so far are:
> >
> > - You only have 'get' and 'set'. We will also need an 'unset' or 'delete'
> >   option in order to get rid of a port profile association.
>
> That's there in my patch. If you don't specify anything after port-profile
> keyword, it's an unset. See extra "[" and "]" above. Will that work for
> you?

Well, this won't work if you want to have multiple slave interfaces
connected to a single master and record the port profile in the master.

> > - VDP has three different ways to 'set' a port profile: 'associate',
> >   'pre-associate with resource reservation' and 'pre-associate without
> >   resource reservation'. This could become an extra option flag.
>
> Ok, let's add an option flag bit field. I'm not sure how that looks from
> the iproute2 cmd line. Would you take a stab at defining these?
>
> > - Instead of a port profile name, VDP specifies a tuple like
> >       struct vsi_associate {
> >               unsigned char VSI_Mgr_ID;       /* VSI manager ID */
> >               unsigned char VSI_Type_ID[3];   /* 24 bit VSI Type ID */
> >               unsigned char VSI_Type_Version; /* VSI Type version */
> >       };
> >   I'm not sure how to deal with that best, but there needs to be
> >   some parsing of these numbers.
>
> PORT-PROFILE above is a u8* array. You could decide to encode the tuple in
> a string, e.g. "1.12345.1", and let the receiver parse it? Or pack it in as
> binary. PORT-PROFILE for us is just a string identifier, e.g. "corp-net-10"
> or "joes-garage".

I guess either one would work, but I'd prefer to do the parsing at the
front-end and pass it as binary, to avoid libvirt encoding it as ascii
and lldpad (or the kernel) parsing the data again, which would be more
error-prone.

Another alternative would be to make this a distinct argument (or even
three of them) and only allow passing one or the other. That would
also implicitly choose the protocol (VDP or port extender).

> > - VDP requires a vlan ID to be part of the association, in addition to
> >   the MAC address. In theory, we could have multiple tuples of MAC+VLAN
> >   addresses, but we can probably just associate each tuple separately
> >   and ignore that part of the standard.
>
> I don't think I have enough information to spec this out for this item.
> Would you take a stab at how this would look in the struct and how it would
> look from the iproute2 cmd line? (Note I'm using iproute2 cmd line for
> illustrative purposes, but the sender of the msg could be something like
> libvirt).

Just adding the VID would be done as I wrote at the end of my last mail.
Adding multiple VLAN/MAC pairs is probably not necessary if we can do
multiple associations.

> > - we have a set of possible error conditions that can be returned by
> >   the switch (invalid format, insufficient resources, unknown VTID,
> >   VTID violation, VTID version violation, out of sync). It should be
> >   possible to return each of these to the user with 'get'.
>
> There is a status code in the get cmd as defined in my patch. I have it as
> a u8 with some enum codes. Can we add to the enum code list? Or do you
> want to return a full string? Our requirement is that we return one of:
> {success, error, in-progress}.

enum is fine, but I think it would be good to use the same numbers
as the VDP standard where possible. Maybe we could use two bytes,
the first one for the overall status (success, error, in progress)
and the second one for the specific error (as above)?

> >> Since we're using netlink sockets, the receiver of the RTM_SETLINK msg can
> >> be in kernel- or user-space. For kernel-space recipient, rtnetlink.c, the
> >> new ndo_set_port_profile netdev op is called to set the port-profile.
> >> User-space recipients can decide how they propagate the msg to the switch.
> >> There is also a RTM_GETLINK cmd to return port-profile setting of an
> >> interface and to also return the status of the last port-profile.
> >
> > More on a stylistic note, I'm not convinced that using RTM_SETLINK/GETLINK
> > is the right interface for this, unlike the 'ip link set DEV vf ...' stuff,
> > because it seems to suggest that this is an option of the adapter itself.
> > I actually liked the iovnl family better in this regard, because it kept
> > the protocols separate.
>
> Wait a second...I abandoned iovnl and moved to if_link based on suggestions
> from you and others. On 4/21/10 2:13 PM, "Arnd Bergmann" <arnd@arndb.de>
> wrote:
>
> > Right. My preference would probably be make these a subcategory of
> > the if_link, and use the existing RTM_NEWLINK/RTM_DELLINK commands.
>
> My latest with if_link fits best with what we're trying to do with enic.

Sorry for the misunderstanding, I was probably not clear enough. This was
a side-discussion about how we should create and destroy virtual interfaces
on IOV capable adapters. My point was that this should be _separate_ from
the port profile association. For creating a macvlan/macvtap device, we
already use RTM_NEWLINK, which does not associate the device with the
switch, so I suggested that for creating a virtual interface with hardware
support we should do the same thing, but leave the association somewhere
else. iovnl sounds like a good place for that.

> Note the current interface is per netdev interface (a link), where the
> netdev interface could be the adapter itself or any other netdev such as
> macvlan, bond, etc.

We certainly missed each other's arguments here, see my other mail.
In order to do the port profile association from software, we
definitely need the master device (nic, PF, bond, bridge, ...) so
we have a way to communicate to the switch, not the slave device
(VF, tap, macvtap, ...) that we are trying to associate.

> > What I could imagine to unify this is something like
> >
> >     ip port_profile set DEVICE [ { pre_associate | pre_associate_rr } ]
> >                                { name PORT-PROFILE | vsi MGR:VTID:VER }
> >                                mac LLADDR
> >                                [ vlan VID ]
> >                                [ host_uuid HOST_UUID ]
> >                                [ client_uuid CLIENT_UUID ]
> >                                [ client_name CLIENT_NAME ]
> >     ip port_profile del DEVICE [ mac LLADDR [ vlan VID ] ]
> >     ip port_profile show DEVICE
>
> If we want to break port_profile out into its own ip cmd, I'm ok with that.
> What you have above would work for enic. The netdev would have these ops:
>
>     ndo_set_port_profile
>     ndo_get_port_profile
>     ndo_del_port_profile
>
> Sounds OK?

That sounds good.

Arnd
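
The two-byte status idea from the mail above (one byte for the overall state, one for the specific switch error) could be laid out roughly as follows. This is only a sketch: the `pp_` names are hypothetical, and the detail codes are numbered as placeholders. The thread asks for the VDP draft's own numbers where possible, and those are not reproduced here.

```c
#include <stdint.h>

/* Overall status; same order as enum ifla_port_profile_status in the patch. */
enum pp_overall {
	PP_STATUS_UNKNOWN,
	PP_STATUS_SUCCESS,
	PP_STATUS_ERROR,
	PP_STATUS_INPROGRESS,
};

/*
 * Hypothetical detail codes for the switch errors listed in the thread.
 * Placeholder numbering only; these are NOT the values from the VDP draft.
 */
enum pp_detail {
	PP_DETAIL_NONE,
	PP_DETAIL_INVALID_FORMAT,
	PP_DETAIL_INSUFFICIENT_RESOURCES,
	PP_DETAIL_UNKNOWN_VTID,
	PP_DETAIL_VTID_VIOLATION,
	PP_DETAIL_VTID_VERSION_VIOLATION,
	PP_DETAIL_OUT_OF_SYNC,
};

/* Two bytes, as suggested: overall state first, specific error second. */
struct pp_status {
	uint8_t overall;	/* enum pp_overall */
	uint8_t detail;		/* enum pp_detail; meaningful for PP_STATUS_ERROR */
};

/* A request counts as failed only when the overall byte says so. */
int pp_failed(const struct pp_status *st)
{
	return st->overall == PP_STATUS_ERROR;
}

/* Map a detail code to text for 'show'-style output. */
const char *pp_detail_str(uint8_t detail)
{
	static const char *const names[] = {
		[PP_DETAIL_NONE]                   = "none",
		[PP_DETAIL_INVALID_FORMAT]         = "invalid format",
		[PP_DETAIL_INSUFFICIENT_RESOURCES] = "insufficient resources",
		[PP_DETAIL_UNKNOWN_VTID]           = "unknown VTID",
		[PP_DETAIL_VTID_VIOLATION]         = "VTID violation",
		[PP_DETAIL_VTID_VERSION_VIOLATION] = "VTID version violation",
		[PP_DETAIL_OUT_OF_SYNC]            = "out of sync",
	};

	if (detail >= sizeof(names) / sizeof(names[0]))
		return "unrecognized";
	return names[detail];
}
```

Keeping the overall byte separate would let a device that only knows {success, error, in-progress} leave the detail byte at zero, while a VDP implementation fills in the specific reason.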
On Wednesday 28 April 2010, Scott Feldman wrote:
> On 4/28/10 6:13 AM, "Arnd Bergmann" <arnd@arndb.de> wrote:
>
> > What I could imagine to unify this is something like
> >
> >     ip port_profile set DEVICE [ { pre_associate | pre_associate_rr } ]
> >                                { name PORT-PROFILE | vsi MGR:VTID:VER }
> >                                mac LLADDR
> >                                [ vlan VID ]
> >                                [ host_uuid HOST_UUID ]
> >                                [ client_uuid CLIENT_UUID ]
> >                                [ client_name CLIENT_NAME ]
> >     ip port_profile del DEVICE [ mac LLADDR [ vlan VID ] ]
> >     ip port_profile show DEVICE
>
> Arnd, can someone test this with VDP today? I don't have access to that
> equipment so it's difficult for me to fully test the unified patch. I can
> test the previous patch with enic easily because I have access to production
> systems. I'd like to make sure someone can test this with VDP before I
> respin the patch one more time.

Sorry, but I don't have access to production hardware at this time.
Jens wants to implement both sides so we can test this in simulation
mode, but it's not done yet.

Arnd
diff --git a/include/linux/if_link.h b/include/linux/if_link.h
index cfd420b..6f02398 100644
--- a/include/linux/if_link.h
+++ b/include/linux/if_link.h
@@ -116,6 +116,7 @@ enum {
 	IFLA_VF_TX_RATE,	/* TX Bandwidth Allocation */
 	IFLA_VFINFO,
 	IFLA_STATS64,
+	IFLA_PORT_PROFILE,
 	__IFLA_MAX
 };
 
@@ -259,4 +260,29 @@ struct ifla_vf_info {
 	__u32 qos;
 	__u32 tx_rate;
 };
+
+/* Port-profile management section */
+
+#define IFLA_PORT_PROFILE_MAX 40
+#define IFLA_PP_HOST_UUID_MAX 40
+#define IFLA_PP_CLIENT_UUID_MAX 40
+#define IFLA_PP_CLIENT_NAME_MAX 40
+
+enum ifla_port_profile_status {
+	IFLA_PORT_PROFILE_STATUS_UNKNOWN,
+	IFLA_PORT_PROFILE_STATUS_SUCCESS,
+	IFLA_PORT_PROFILE_STATUS_ERROR,
+	IFLA_PORT_PROFILE_STATUS_INPROGRESS,
+};
+
+struct ifla_port_profile {
+	__u8 status;
+	__u8 port_profile[IFLA_PORT_PROFILE_MAX];
+	__u8 mac[32];	/* MAX_ADDR_LEN */
+	__u8 host_uuid[IFLA_PP_HOST_UUID_MAX];
+		/* e.g. "CEEFD3B1-9E11-11DE-BDFD-000BAB01C0FB" */
+	__u8 client_uuid[IFLA_PP_CLIENT_UUID_MAX];
+	__u8 client_name[IFLA_PP_CLIENT_NAME_MAX];	/* e.g. "vm0-eth1" */
+};
+
 #endif /* _LINUX_IF_LINK_H */
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 3c5ed5f..2962288 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -696,6 +696,12 @@ struct netdev_rx_queue {
  * int (*ndo_set_vf_tx_rate)(struct net_device *dev, int vf, int rate);
  * int (*ndo_get_vf_config)(struct net_device *dev,
  *			    int vf, struct ifla_vf_info *ivf);
+ *
+ * Port-profile management functions.
+ * int (*ndo_set_port_profile)(struct net_device *dev,
+ *			       struct ifla_port_profile *ipp);
+ * int (*ndo_get_port_profile)(struct net_device *dev,
+ *			       struct ifla_port_profile *ipp);
  */
 #define HAVE_NET_DEVICE_OPS
 struct net_device_ops {
@@ -744,6 +750,10 @@ struct net_device_ops {
 	int			(*ndo_get_vf_config)(struct net_device *dev,
 						     int vf,
 						     struct ifla_vf_info *ivf);
+	int			(*ndo_set_port_profile)(struct net_device *dev,
+					struct ifla_port_profile *ipp);
+	int			(*ndo_get_port_profile)(struct net_device *dev,
+					struct ifla_port_profile *ipp);
 #if defined(CONFIG_FCOE) || defined(CONFIG_FCOE_MODULE)
 	int			(*ndo_fcoe_enable)(struct net_device *dev);
 	int			(*ndo_fcoe_disable)(struct net_device *dev);
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 78c8598..1d7e9a7 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -758,6 +758,14 @@ static int rtnl_fill_ifinfo(struct sk_buff *skb, struct net_device *dev,
 			NLA_PUT(skb, IFLA_VFINFO, sizeof(ivi), &ivi);
 		}
 	}
+
+	if (dev->netdev_ops->ndo_get_port_profile) {
+		struct ifla_port_profile ipp;
+
+		if (!dev->netdev_ops->ndo_get_port_profile(dev, &ipp))
+			NLA_PUT(skb, IFLA_PORT_PROFILE, sizeof(ipp), &ipp);
+	}
+
 	if (dev->rtnl_link_ops) {
 		if (rtnl_link_fill(skb, dev) < 0)
 			goto nla_put_failure;
@@ -824,6 +832,8 @@ const struct nla_policy ifla_policy[IFLA_MAX+1] = {
 				    .len = sizeof(struct ifla_vf_vlan) },
 	[IFLA_VF_TX_RATE]	= { .type = NLA_BINARY,
 				    .len = sizeof(struct ifla_vf_tx_rate) },
+	[IFLA_PORT_PROFILE]	= { .type = NLA_BINARY,
+				    .len = sizeof(struct ifla_port_profile)},
 };
 EXPORT_SYMBOL(ifla_policy);
 
@@ -1028,6 +1038,18 @@ static int do_setlink(struct net_device *dev, struct ifinfomsg *ifm,
 	}
 	err = 0;
 
+	if (tb[IFLA_PORT_PROFILE]) {
+		struct ifla_port_profile *ipp;
+		ipp = nla_data(tb[IFLA_PORT_PROFILE]);
+		err = -EOPNOTSUPP;
+		if (ops->ndo_set_port_profile)
+			err = ops->ndo_set_port_profile(dev, ipp);
+		if (err < 0)
+			goto errout;
+		modified = 1;
+	}
+	err = 0;
+
 errout:
 	if (err < 0 && modified && net_ratelimit())
 		printk(KERN_WARNING "A link change request failed with "
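
For a sender other than iproute2 (the thread mentions libvirt), the fixed-size fields of `struct ifla_port_profile` from the patch above have to be filled by hand. The following is a hedged userspace sketch of that step. It mirrors the struct with `uint8_t` in place of `__u8`; the helpers `pp_copy` and `pp_fill`, the 6-byte Ethernet MAC, and the choice to keep every string NUL-terminated are assumptions for illustration, since the patch does not state whether the fields must be terminated.

```c
#include <stdint.h>
#include <string.h>

/* Sizes and layout mirrored from the patch; uint8_t stands in for __u8. */
#define IFLA_PORT_PROFILE_MAX	40
#define IFLA_PP_HOST_UUID_MAX	40
#define IFLA_PP_CLIENT_UUID_MAX	40
#define IFLA_PP_CLIENT_NAME_MAX	40
#define PP_ETH_ALEN		6	/* assumption: Ethernet MAC address */

struct ifla_port_profile {
	uint8_t status;
	uint8_t port_profile[IFLA_PORT_PROFILE_MAX];
	uint8_t mac[32];	/* MAX_ADDR_LEN */
	uint8_t host_uuid[IFLA_PP_HOST_UUID_MAX];
	uint8_t client_uuid[IFLA_PP_CLIENT_UUID_MAX];
	uint8_t client_name[IFLA_PP_CLIENT_NAME_MAX];
};

/*
 * Hypothetical helper: copy an optional string into a fixed-size field,
 * refusing input that would not leave room for a terminating NUL.
 */
static int pp_copy(uint8_t *dst, size_t dst_len, const char *src)
{
	size_t n = src ? strlen(src) : 0;

	if (n >= dst_len)
		return -1;
	memset(dst, 0, dst_len);
	if (n)
		memcpy(dst, src, n);
	return 0;
}

/* Hypothetical: build the payload for an IFLA_PORT_PROFILE attribute. */
int pp_fill(struct ifla_port_profile *ipp, const char *profile,
	    const uint8_t *mac, const char *host_uuid,
	    const char *client_uuid, const char *client_name)
{
	/* Zero everything first so no stale bytes travel in the message. */
	memset(ipp, 0, sizeof(*ipp));

	if (pp_copy(ipp->port_profile, sizeof(ipp->port_profile), profile) ||
	    pp_copy(ipp->host_uuid, sizeof(ipp->host_uuid), host_uuid) ||
	    pp_copy(ipp->client_uuid, sizeof(ipp->client_uuid), client_uuid) ||
	    pp_copy(ipp->client_name, sizeof(ipp->client_name), client_name))
		return -1;

	if (mac)
		memcpy(ipp->mac, mac, PP_ETH_ALEN);
	return 0;
}
```

The resulting struct would then be sent as the binary payload of the `IFLA_PORT_PROFILE` attribute in an `RTM_SETLINK` message. A 36-character UUID string such as the one in the patch comment fits the 40-byte fields with room for the terminator.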