Message ID | 20100421181021.GC25928@x200.localdomain |
---|---|
State | RFC, archived |
Delegated to: | David Miller |
On Wednesday 21 April 2010, Chris Wright wrote:
> * Arnd Bergmann (arnd@arndb.de) wrote:
> > On Wednesday 21 April 2010, Chris Wright wrote:
> > > * Arnd Bergmann (arnd@arndb.de) wrote:
> > > > Since it seems what you really want to do is to do the exchange with the
> > > > switch from here, maybe the hardware configuration part should be moved
> > > > the DCB interface?
> > >
> > > I suppose this would work (although it's a bit odd being out of scope
> > > of DCB spec).
> >
> > It could be anywhere, it doesn't have to be the DCB interface, but could
> > be anything ranging from ethtool to iplink I guess. And we should define
> > it in a way that works for any SR-IOV card, whether it's using Cisco's
> > protocol in firmware, 802.1Qbg VDP in firmware, lldpad to do VDP or
> > none of the above and just provides an internal switch like all the
> > existing NICs.
>
> Heh, that's exactly what iovnl does ;-)

No, according to what you write below, it's exactly what iovnl does *not* do,
i.e. part 1 in my list.

> > 1. Setting up the slave device
> >    a) create an SR-IOV VF to assign to a guest
> >    b) create a macvtap device to pass to qemu or vhost
> >    c) attach a tap device to a bridge
> >    d) create a macvlan device and put it into a container
> >    e) create a virtual interface for a VMDq adapter
>
> OK, but iovnl isn't doing this.

The set_mac_vlan that Scott's patch adds seems to implement 1a), as far
as I can tell. Interestingly, this is not actually implemented in
the enic driver in patch 2/2. So if we all agree that this is out of the
scope of iovnl, let's just remove it from the interface and find another
way for it (ethtool, iplink, ..., as listed above).

Note that we still need to pass the MAC address and VLAN ID (or a list
of these) to the external switch, my point is just that this should be
separate from enforcing it in the hypervisor.

> > 2) Registering the slave with the switch
> >    a) use Cisco protocol in enic firmware (see patch 2/2)
> >    b) use standard VDP in lldpad
> >    c) use reverse-engineered cisco protocol in some user tool for
> >       non-enic adapters.
> >    d) use standard VDP in firmware (hopefully this never happens)
> >    e) do nothing at all (as we do today)
>
> And this is the step that is the main purpose of iovnl.
>
> Here's the simplest snippet of libvirt to show this. It sends
> set_port_profile netlink messages and then creates macvtap. As simple
> as it gets...
>
> --- a/src/qemu/qemu_conf.c
> +++ b/src/qemu/qemu_conf.c
> @@ -1470,6 +1470,11 @@ qemudPhysIfaceConnect(virConnectPtr conn,
>          net->model && STREQ(net->model, "virtio"))
>          vnet_hdr = 1;
>
> +    setPortProfileId(net->data.direct.linkdev,
> +                     net->data.direct.mode,
> +                     net->data.direct.profileid,
> +                     net->mac);
> +
>      rc = openMacvtapTap(net->ifname, net->mac, linkdev, brmode,
>                          &res_ifname, vnet_hdr);

Ok. In case of VDP, I guess this needs to be extended with the vlan ID
that has been configured, and possibly with a UUID, because we need to
pass the same one on the target machine if we migrate it.

Alternatively, the setPortProfileId could figure out the MAC address and
VLAN ID from the device itself, but then we don't need to pass either of
them.

Arnd
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

On 4/21/10 12:39 PM, "Arnd Bergmann" <arnd@arndb.de> wrote:

>>> 1. Setting up the slave device
>>>    a) create an SR-IOV VF to assign to a guest
>>>    b) create a macvtap device to pass to qemu or vhost
>>>    c) attach a tap device to a bridge
>>>    d) create a macvlan device and put it into a container
>>>    e) create a virtual interface for a VMDq adapter
>>
>> OK, but iovnl isn't doing this.
>
> The set_mac_vlan that Scott's patch adds seems to implement 1a), as far
> as I can tell. Interestingly, this is not actually implemented in
> the enic driver in patch 2/2. So if we all agree that this is out of the
> scope of iovnl, let's just remove it from the interface and find another
> way for it (ethtool, iplink, ..., as listed above).

You're right, not needed for enic since mac addr is included with
port-profile push and vlan membership is implied by port-profile. So I put
set_mac_vlan in there basically to elicit feedback.

There really wouldn't be much difference between iplink and iovnl since
they're both rtnetlink...seems we should keep IOV-related APIs in one place.
Maybe there are other IOV APIs to add to iovnl in the future like:

    vf <- add_vf(pf)
    del_vf(pf, vf)

Ethtool doesn't seem the right place for this.

-scott

On Wednesday 21 April 2010, Scott Feldman wrote:
> On 4/21/10 12:39 PM, "Arnd Bergmann" <arnd@arndb.de> wrote:
>
> >>> 1. Setting up the slave device
> >>>    a) create an SR-IOV VF to assign to a guest
> >>>    b) create a macvtap device to pass to qemu or vhost
> >>>    c) attach a tap device to a bridge
> >>>    d) create a macvlan device and put it into a container
> >>>    e) create a virtual interface for a VMDq adapter
> >>
> >> OK, but iovnl isn't doing this.
> >
> > The set_mac_vlan that Scott's patch adds seems to implement 1a), as far
> > as I can tell. Interestingly, this is not actually implemented in
> > the enic driver in patch 2/2. So if we all agree that this is out of the
> > scope of iovnl, let's just remove it from the interface and find another
> > way for it (ethtool, iplink, ..., as listed above).
>
> You're right, not needed for enic since mac addr is included with
> port-profile push and vlan membership is implied by port-profile. So I put
> set_mac_vlan in there basically to elicit feedback.

Ok. Two points though:

- when you say that the mac address is included in the port-profile push,
  does that imply that the VF does not have a mac address prior to this?
  This would again mix the NIC configuration phase with the switch
  association, which I think we really need to avoid if we want to be
  able to implement the association in user space!

- The VLAN ID being implied in the port profile seems to be another
  difference between what enic is doing and the current draft VDP
  that will eventually become 802.1Qbg, and I fear that this difference
  will be visible in the iovnl protocol.

> There really wouldn't be much different between iplink and iovnl since
> they're both rtnetlink...seems we should keep IOV-related APIs in one place.
> Maybe there are other IOV APIs to add to iovnl in the future like:
>
>     vf <- add_vf(pf)
>     del_vf(pf, vf)
>
> Ethtool doesn't seem the right place for this.

Right. My preference would probably be to make these a subcategory of
the if_link, and use the existing RTM_NEWLINK/RTM_DELLINK commands.
This would make it resemble the existing interfaces and mean you can
use

  ip link add link eth0 type macvlan  # for a container
  ip link add link eth0 type macvtap  # for qemu/vhost
  ip link add link eth0 type vf       # for device assignment

There are obviously significant differences between these three, but
they also share enough of their properties to let us treat them
in similar ways.

If we integrate the iovnl client into iproute2, the sequence for setting
up an enic VF and associating it to the port profile could be

  # create vf0, pass mac and vlan id to HW, no association yet
  ip link add link eth0 name vf0 type vf mac fe:dc:ba:12:34:56 vlan 78

  # associate vf with port profile, mac address must match the one assigned
  # to the interface before.
  ip iov assoc eth0 port-profile "general" host-uuid "dcf2a873-f5ee-41dd-a7ad-802a544e48c2" \
      mac fe:dc:ba:12:34:56

Arnd

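The two-step sequence proposed in this message (configure the VF first, associate it with a port profile second, with the MAC address required to match) could be modelled roughly as below. This is only a sketch for illustration: `struct vf_config`, `parse_mac()`, `vf_create()` and `vf_associate()` are hypothetical names, not part of iovnl, iproute2 or any kernel interface, and the VLAN range check is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define MAC_LEN 6

/* Hypothetical state of a VF; nothing here is a real kernel structure. */
struct vf_config {
    uint8_t  mac[MAC_LEN];
    uint16_t vlan;
    bool     associated;   /* only phase 2 sets this */
};

/* Parse "aa:bb:cc:dd:ee:ff" into six bytes; returns 0 on success. */
int parse_mac(const char *s, uint8_t mac[MAC_LEN])
{
    unsigned int b[MAC_LEN];
    char tail;
    int i;

    /* A seventh conversion (tail) would mean trailing garbage. */
    if (sscanf(s, "%2x:%2x:%2x:%2x:%2x:%2x%c",
               &b[0], &b[1], &b[2], &b[3], &b[4], &b[5], &tail) != MAC_LEN)
        return -1;
    for (i = 0; i < MAC_LEN; i++)
        mac[i] = (uint8_t)b[i];
    return 0;
}

/* Phase 1: NIC configuration only, no exchange with the switch. */
int vf_create(struct vf_config *vf, const char *mac, uint16_t vlan)
{
    memset(vf, 0, sizeof(*vf));
    if (vlan > 4094 || parse_mac(mac, vf->mac) != 0)
        return -1;
    vf->vlan = vlan;
    return 0;
}

/* Phase 2: association; the MAC must match the one set in phase 1. */
int vf_associate(struct vf_config *vf, const char *profile,
                 const uint8_t mac[MAC_LEN])
{
    if (profile == NULL || memcmp(vf->mac, mac, MAC_LEN) != 0)
        return -1;
    vf->associated = true;
    return 0;
}
```

Keeping the two functions separate mirrors the point made in the message: the association step only needs the MAC address (and profile), so it could equally be carried out by a user space agent.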
* Arnd Bergmann (arnd@arndb.de) wrote:
> On Wednesday 21 April 2010, Chris Wright wrote:
> > * Arnd Bergmann (arnd@arndb.de) wrote:
> > > On Wednesday 21 April 2010, Chris Wright wrote:
> > > > * Arnd Bergmann (arnd@arndb.de) wrote:
> > > > > Since it seems what you really want to do is to do the exchange with the
> > > > > switch from here, maybe the hardware configuration part should be moved
> > > > > the DCB interface?
> > > >
> > > > I suppose this would work (although it's a bit odd being out of scope
> > > > of DCB spec).
> > >
> > > It could be anywhere, it doesn't have to be the DCB interface, but could
> > > be anything ranging from ethtool to iplink I guess. And we should define
> > > it in a way that works for any SR-IOV card, whether it's using Cisco's
> > > protocol in firmware, 802.1Qbg VDP in firmware, lldpad to do VDP or
> > > none of the above and just provides an internal switch like all the
> > > existing NICs.
> >
> > Heh, that's exactly what iovnl does ;-)
>
> No, according to what you write below, it's exactly what iovnl does *not* do,
> i.e. part 1 in my list.

OK, I see...in this case to me hw setup was the port profile for the
enic to initiate host<->switch negotiation, sorry for confusion.

> > > 1. Setting up the slave device
> > >    a) create an SR-IOV VF to assign to a guest
> > >    b) create a macvtap device to pass to qemu or vhost
> > >    c) attach a tap device to a bridge
> > >    d) create a macvlan device and put it into a container
> > >    e) create a virtual interface for a VMDq adapter
> >
> > OK, but iovnl isn't doing this.
>
> The set_mac_vlan that Scott's patch adds seems to implement 1a), as far
> as I can tell. Interestingly, this is not actually implemented in
> the enic driver in patch 2/2. So if we all agree that this is out of the
> scope of iovnl, let's just remove it from the interface and find another
> way for it (ethtool, iplink, ..., as listed above).

Scott, any objection? At least a way to keep moving forward on the port
profile bit.

> Note that we still need to pass the MAC address and VLAN ID (or a list
> of these) to the external switch, my point is just that this should be
> separate from enforcing it in the hypervisor.

Yup, we should focus on reconciling the diff of enic vs VDP port profile
needs.

> > > 2) Registering the slave with the switch
> > >    a) use Cisco protocol in enic firmware (see patch 2/2)
> > >    b) use standard VDP in lldpad
> > >    c) use reverse-engineered cisco protocol in some user tool for
> > >       non-enic adapters.
> > >    d) use standard VDP in firmware (hopefully this never happens)
> > >    e) do nothing at all (as we do today)
> >
> > And this is the step that is the main purpose of iovnl.
> >
> > Here's the simplest snippet of libvirt to show this. It sends
> > set_port_profile netlink messages and then creates macvtap. As simple
> > as it gets...
> >
> > --- a/src/qemu/qemu_conf.c
> > +++ b/src/qemu/qemu_conf.c
> > @@ -1470,6 +1470,11 @@ qemudPhysIfaceConnect(virConnectPtr conn,
> >          net->model && STREQ(net->model, "virtio"))
> >          vnet_hdr = 1;
> >
> > +    setPortProfileId(net->data.direct.linkdev,
> > +                     net->data.direct.mode,
> > +                     net->data.direct.profileid,
> > +                     net->mac);
> > +
> >      rc = openMacvtapTap(net->ifname, net->mac, linkdev, brmode,
> >                          &res_ifname, vnet_hdr);
>
> Ok. In case of VDP, I guess this needs to be extended with the vlan ID
> that has been configured, and possibly with a UUID, because we need to
> pass the same one on the target machine if we migrate it.
>
> Alternatively, the setPortProfileId could figure out the MAC address and
> VLAN ID from the device itself, but then we don't need to pass either of
> them.

* Arnd Bergmann (arnd@arndb.de) wrote:
> On Wednesday 21 April 2010, Scott Feldman wrote:
> > On 4/21/10 12:39 PM, "Arnd Bergmann" <arnd@arndb.de> wrote:
> >
> > >>> 1. Setting up the slave device
> > >>>    a) create an SR-IOV VF to assign to a guest
> > >>>    b) create a macvtap device to pass to qemu or vhost
> > >>>    c) attach a tap device to a bridge
> > >>>    d) create a macvlan device and put it into a container
> > >>>    e) create a virtual interface for a VMDq adapter
> > >>
> > >> OK, but iovnl isn't doing this.
> > >
> > > The set_mac_vlan that Scott's patch adds seems to implement 1a), as far
> > > as I can tell. Interestingly, this is not actually implemented in
> > > the enic driver in patch 2/2. So if we all agree that this is out of the
> > > scope of iovnl, let's just remove it from the interface and find another
> > > way for it (ethtool, iplink, ..., as listed above).
> >
> > You're right, not needed for enic since mac addr is included with
> > port-profile push and vlan membership is implied by port-profile. So I put
> > set_mac_vlan in there basically to elicit feedback.
>
> Ok. Two points though:
>
> - when you say that the mac address is included in the port-profile push,
>   does that imply that the VF does not have a mac address prior to this?
>   This would again mix the NIC configuration phase with the switch
>   association, which I think we really need to avoid if we want to be
>   able to implement the association in user space!
>
> - The VLAN ID being implied in the port profile seems to be another
>   difference between what enic is doing and the current draft VDP
>   that will eventually become 802.1Qbg, and I fear that this difference
>   will be visible in the iovnl protocol.
>
> > There really wouldn't be much different between iplink and iovnl since
> > they're both rtnetlink...seems we should keep IOV-related APIs in one place.
> > Maybe there are other IOV APIs to add to iovnl in the future like:
> >
> >     vf <- add_vf(pf)
> >     del_vf(pf, vf)
> >
> > Ethtool doesn't seem the right place for this.
>
> Right. My preference would probably be make these a subcategory of
> the if_link, and use the existing RTM_NEWLINK/RTM_DELLINK commands.
> This would make it resemble the existing interfaces and mean you can
> use
>
>   ip link add link eth0 type macvlan  # for a container
>   ip link add link eth0 type macvtap  # for qemu/vhost
>   ip link add link eth0 type vf       # for device assignment

BTW, what do you mean by device assignment?

> There are obviously significant differences between these three, but
> they also share enough of their properties to let us treat them
> in similar ways.
>
> If we integrate the iovnl client into iproute2, the sequence for setting
> up an enic VF and associating it to the port profile could be
>
>   # create vf0, pass mac and vlan id to HW, no association yet
>   ip link add link eth0 name vf0 type vf mac fe:dc:ba:12:34:56 vlan 78

Just to clarify...right now, the normal SR-IOV VF is already there.
And, of course, can have its mac addr/vlan set already.

>   # associate vf with port profile, mac address must match the one assigned
>   # to the interface before.
>   ip iov assoc eth0 port-profile "general" host-uuid "dcf2a873-f5ee-41dd-a7ad-802a544e48c2" \
>       mac fe:dc:ba:12:34:56

At that point you could just do s/mac fe:.*/link vf0/

thanks,
-chris

On 4/21/10 2:13 PM, "Arnd Bergmann" <arnd@arndb.de> wrote:

> On Wednesday 21 April 2010, Scott Feldman wrote:
>> On 4/21/10 12:39 PM, "Arnd Bergmann" <arnd@arndb.de> wrote:
>>
>>>>> 1. Setting up the slave device
>>>>>    a) create an SR-IOV VF to assign to a guest
>>>>>    b) create a macvtap device to pass to qemu or vhost
>>>>>    c) attach a tap device to a bridge
>>>>>    d) create a macvlan device and put it into a container
>>>>>    e) create a virtual interface for a VMDq adapter
>>>>
>>>> OK, but iovnl isn't doing this.
>>>
>>> The set_mac_vlan that Scott's patch adds seems to implement 1a), as far
>>> as I can tell. Interestingly, this is not actually implemented in
>>> the enic driver in patch 2/2. So if we all agree that this is out of the
>>> scope of iovnl, let's just remove it from the interface and find another
>>> way for it (ethtool, iplink, ..., as listed above).
>>
>> You're right, not needed for enic since mac addr is included with
>> port-profile push and vlan membership is implied by port-profile. So I put
>> set_mac_vlan in there basically to elicit feedback.
>
> Ok. Two points though:
>
> - when you say that the mac address is included in the port-profile push,
>   does that imply that the VF does not have a mac address prior to this?

Correct, VF has no mac addr prior to port-profile being applied. The
mac_addr is the mac_addr of the VM guest interface that's to use the VF. If
the port-profile defines L2 mac spoofing, for example, the switch wants to
know the mac address before i/o starts. I/o doesn't start until
port-profile is applied and the switch virtual port is setup.

>   This would again mix the NIC configuration phase with the switch
>   association, which I think we really need to avoid if we want to be
>   able to implement the association in user space!
>
> - The VLAN ID being implied in the port profile seems to be another
>   difference between what enic is doing and the current draft VDP
>   that will eventually become 802.1Qbg, and I fear that this difference
>   will be visible in the iovnl protocol.

It's not just a VLAN ID, but the entire VLAN membership for the switch
virtual port. The port-profile may define a single native VLAN for access
mode on the switch port, or a trunk mode with a list of allowed vlans, with
one native vlan.

The key is the port-profile. The port-profile resolves the configuration of
the switch virtual port. The configuration of the switch virtual port
includes many settings like I mentioned earlier: VLAN membership, QoS (rate
limits, priority class), L2 security, etc.

>> There really wouldn't be much different between iplink and iovnl since
>> they're both rtnetlink...seems we should keep IOV-related APIs in one place.
>> Maybe there are other IOV APIs to add to iovnl in the future like:
>>
>>     vf <- add_vf(pf)
>>     del_vf(pf, vf)
>>
>> Ethtool doesn't seem the right place for this.
>
> Right. My preference would probably be make these a subcategory of
> the if_link, and use the existing RTM_NEWLINK/RTM_DELLINK commands.
> This would make it resemble the existing interfaces and mean you can
> use
>
>   ip link add link eth0 type macvlan  # for a container
>   ip link add link eth0 type macvtap  # for qemu/vhost
>   ip link add link eth0 type vf       # for device assignment
>
> There are obviously significant differences between these three, but
> they also share enough of their properties to let us treat them
> in similar ways.
>

I don't have a strong preference for iovnl vs. extending if_link. I thought
I had a reason against if_link, but I can't recall that now...it'll probably
come to me when I look at it again. Let me look again...

> If we integrate the iovnl client into iproute2, the sequence for setting
> up an enic VF and associating it to the port profile could be
>
>   # create vf0, pass mac and vlan id to HW, no association yet
>   ip link add link eth0 name vf0 type vf mac fe:dc:ba:12:34:56 vlan 78
>
>   # associate vf with port profile, mac address must match the one assigned
>   # to the interface before.
>   ip iov assoc eth0 port-profile "general" host-uuid
> "dcf2a873-f5ee-41dd-a7ad-802a544e48c2" \
>       mac fe:dc:ba:12:34:56

Ya, that sounds pretty close. I still want the flexibility to direct ops to
a PF link for a VF link.

-scott

On 4/21/10 3:18 PM, "Chris Wright" <chrisw@redhat.com> wrote:

>> The set_mac_vlan that Scott's patch adds seems to implement 1a), as far
>> as I can tell. Interestingly, this is not actually implemented in
>> the enic driver in patch 2/2. So if we all agree that this is out of the
>> scope of iovnl, let's just remove it from the interface and find another
>> way for it (ethtool, iplink, ..., as listed above).
>
> Scott, any objection? At least a way to keep moving forward on the port
> profile bit.

Yes, that's fine with me, port-profile bit is the most important part.

>> Note that we still need to pass the MAC address and VLAN ID (or a list
>> of these) to the external switch, my point is just that this should be
>> separate from enforcing it in the hypervisor.
>
> Yup, we should focus on reconciling the diff of enic vs vpd port profile
> needs.

-scott

On Thursday 22 April 2010, Chris Wright wrote:
> >
> >   ip link add link eth0 type macvlan  # for a container
> >   ip link add link eth0 type macvtap  # for qemu/vhost
> >   ip link add link eth0 type vf       # for device assignment
>
> BTW, what do you mean by device assignment?

I mean giving an SR-IOV VF to the guest as a native PCI device
rather than having qemu or vhost present a virtio-net to the
guest.

> > There are obviously significant differences between these three, but
> > they also share enough of their properties to let us treat them
> > in similar ways.
> >
> > If we integrate the iovnl client into iproute2, the sequence for setting
> > up an enic VF and associating it to the port profile could be
> >
> >   # create vf0, pass mac and vlan id to HW, no association yet
> >   ip link add link eth0 name vf0 type vf mac fe:dc:ba:12:34:56 vlan 78
>
> Just to clarify...right now, the normal SR-IOV VF is already there.
> And, or course, can have its mac addr/vlan set already.

I don't have an SR-IOV card available for testing yet. How is this
configured now?

> >   # associate vf with port profile, mac address must match the one assigned
> >   # to the interface before.
> >   ip iov assoc eth0 port-profile "general" host-uuid "dcf2a873-f5ee-41dd-a7ad-802a544e48c2" \
> >       mac fe:dc:ba:12:34:56
>
> At that point you could just do s/mac fe:.*/link vf0/

My point was that this information should be irrelevant to the code doing the
association with the switch. It sort of makes sense when the receiver is enic,
but when we send the same data to lldpad, it doesn't care about the slave device
name but only about the mac address. Especially since the slave device might not
be in the root name space any more, meaning we have no way to find it.

Arnd

From: Arnd Bergmann <arnd@arndb.de>
Date: Wed, 21 Apr 2010 23:13:04 +0200

> My preference would probably be make these a subcategory of the
> if_link, and use the existing RTM_NEWLINK/RTM_DELLINK commands.

I was going to suggest this as well.

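For readers unfamiliar with the RTM_NEWLINK path being endorsed here, the following is a rough sketch of the request that a command like `ip link add link eth0 type macvlan` boils down to: a netlink header, an `ifinfomsg`, an `IFLA_LINK` attribute naming the parent, and a nested `IFLA_LINKINFO` carrying `IFLA_INFO_KIND`. The constants and macros come from the Linux UAPI headers; `struct newlink_req`, `add_attr()` and `build_newlink()` are illustrative names only, and the message is built but not sent. A `vf` kind exists only as a proposal in this thread, so the example sticks to `macvlan`.

```c
#define _DEFAULT_SOURCE
#include <linux/if_link.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>
#include <string.h>
#include <sys/socket.h>

/* Request buffer: header, ifinfomsg, then room for attributes. */
struct newlink_req {
    struct nlmsghdr  nh;
    struct ifinfomsg ifi;
    char             attrs[256];
};

/* Append one attribute at the current end of the message. */
static struct rtattr *add_attr(struct nlmsghdr *nh, size_t max,
                               unsigned short type,
                               const void *data, size_t len)
{
    size_t off = NLMSG_ALIGN(nh->nlmsg_len);
    struct rtattr *rta;

    if (off + RTA_ALIGN(RTA_LENGTH(len)) > max)
        return NULL;
    rta = (struct rtattr *)((char *)nh + off);
    rta->rta_type = type;
    rta->rta_len = (unsigned short)RTA_LENGTH(len);
    if (len)
        memcpy(RTA_DATA(rta), data, len);
    nh->nlmsg_len = (unsigned int)(off + RTA_ALIGN(rta->rta_len));
    return rta;
}

/* Build (but do not send) an RTM_NEWLINK request for the given kind. */
int build_newlink(struct newlink_req *req, unsigned int parent_ifindex,
                  const char *kind)
{
    struct rtattr *linkinfo;

    memset(req, 0, sizeof(*req));
    req->nh.nlmsg_len = NLMSG_LENGTH(sizeof(req->ifi));
    req->nh.nlmsg_type = RTM_NEWLINK;
    req->nh.nlmsg_flags = NLM_F_REQUEST | NLM_F_CREATE | NLM_F_EXCL | NLM_F_ACK;
    req->ifi.ifi_family = AF_UNSPEC;

    /* "link eth0": ifindex of the lower device */
    if (!add_attr(&req->nh, sizeof(*req), IFLA_LINK,
                  &parent_ifindex, sizeof(parent_ifindex)))
        return -1;

    /* "type macvlan": IFLA_INFO_KIND nested inside IFLA_LINKINFO */
    linkinfo = add_attr(&req->nh, sizeof(*req), IFLA_LINKINFO, NULL, 0);
    if (!linkinfo)
        return -1;
    if (!add_attr(&req->nh, sizeof(*req), IFLA_INFO_KIND, kind, strlen(kind)))
        return -1;

    /* Close the nest: its length covers everything added inside it. */
    linkinfo->rta_len = (unsigned short)
        ((char *)&req->nh + req->nh.nlmsg_len - (char *)linkinfo);
    return 0;
}
```

Sending the buffer over an `AF_NETLINK`/`NETLINK_ROUTE` socket and parsing the ACK are left out to keep the sketch short.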
On Thursday 22 April 2010, Scott Feldman wrote:
> On 4/21/10 2:13 PM, "Arnd Bergmann" <arnd@arndb.de> wrote:
> > On Wednesday 21 April 2010, Scott Feldman wrote:
> >> On 4/21/10 12:39 PM, "Arnd Bergmann" <arnd@arndb.de> wrote:
> >> You're right, not needed for enic since mac addr is included with
> >> port-profile push and vlan membership is implied by port-profile. So I put
> >> set_mac_vlan in there basically to elicit feedback.
> >
> > Ok. Two points though:
> >
> > - when you say that the mac address is included in the port-profile push,
> >   does that imply that the VF does not have a mac address prior to this?
>
> Correct, VF has no mac addr prior to port-profile being applied. The
> mac_addr is the mac_addr of the VM guest interface that's to use the VF. If
> the port-profile defines L2 mac spoofing, for example, the switch wants to
> know the mac address before i/o starts. I/o doesn't start until
> port-profile is applied and the switch virtual port is setup.

Is it possible to split this process, in order to make it more closely
resemble what we have when the registration is in user space?

This would mean that you assign a MAC address to the interface when the
interface gets created, and register the same MAC address at the switch
independent from the creation.

Obviously, if the port-profile (for enic) or the VSI list in the switch
enforces the mac address and you pass one that's different from the one
that's set in the VF, it won't be able to send any data, but it remains
the job of the switch to enforce that case.

> It's not just a VLAN ID, but the entire VLAN membership for the switch
> virtual port. The port-profile may define a single native VLAN for access
> mode on the switch port, or a trunk mode with a list of allowed vlans, with
> on native vlan.
>
> The key is the port-profile. The port-profile resolves the configuration of
> the switch virtual port. The configuration of the switch virtual port
> includes many setting like I mentioned earlier: VLAN membership, QoS (rate
> limits, priority class, L2 security, etc).

Ok, I see.

> > If we integrate the iovnl client into iproute2, the sequence for setting
> > up an enic VF and associating it to the port profile could be
> >
> >   # create vf0, pass mac and vlan id to HW, no association yet
> >   ip link add link eth0 name vf0 type vf mac fe:dc:ba:12:34:56 vlan 78
> >
> >   # associate vf with port profile, mac address must match the one assigned
> >   # to the interface before.
> >   ip iov assoc eth0 port-profile "general" host-uuid
> > "dcf2a873-f5ee-41dd-a7ad-802a544e48c2" \
> >       mac fe:dc:ba:12:34:56
>
> Ya, that sounds pretty close. I still want the flexibility to direct ops to
> a PF link for a VF link.

Does that mean you require passing both the PF and the VF name?

Arnd

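To make the VLAN membership described in the quoted text concrete (access mode with a single native VLAN, or trunk mode with a list of allowed VLANs plus one native VLAN), here is a small model of what a resolved port-profile might look like. It is purely illustrative: in the enic case the profile is resolved on the switch, and `struct vlan_membership` and `vlan_permitted()` are hypothetical names, not anything defined by enic, VDP or iovnl.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum port_mode { PORT_ACCESS, PORT_TRUNK };

/* Hypothetical view of the VLAN part of a resolved port-profile. */
struct vlan_membership {
    enum port_mode  mode;
    uint16_t        native_vlan;  /* the one native VLAN in either mode */
    const uint16_t *allowed;      /* trunk mode only: allowed VLAN list */
    size_t          n_allowed;
};

/* Would traffic on VLAN `vid` be admitted by this virtual port? */
bool vlan_permitted(const struct vlan_membership *m, uint16_t vid)
{
    size_t i;

    if (vid == m->native_vlan)
        return true;
    if (m->mode != PORT_TRUNK)
        return false;             /* access mode: native VLAN only */
    for (i = 0; i < m->n_allowed; i++)
        if (m->allowed[i] == vid)
            return true;
    return false;
}
```

A structure like this is more than the single VLAN ID that `set_mac_vlan` carried, which is the gap between the enic model and the per-VSI VLAN ID discussed for VDP.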
* Arnd Bergmann (arnd@arndb.de) wrote:
> On Thursday 22 April 2010, Chris Wright wrote:
> > >
> > >   ip link add link eth0 type macvlan  # for a container
> > >   ip link add link eth0 type macvtap  # for qemu/vhost
> > >   ip link add link eth0 type vf       # for device assignment
> >
> > BTW, what do you mean by device assignment?
>
> I mean giving an SR-IOV VF to the guest as a native PCI device
> rather than having qemu or vhost present a virtio-net to the
> guest.

OK, wasn't clear if you meant that or simply 100% dedicating the interface
via something like virtio. The add_vf() idea, while neat, doesn't really
match how VF's are allocated.

> > > There are obviously significant differences between these three, but
> > > they also share enough of their properties to let us treat them
> > > in similar ways.
> > >
> > > If we integrate the iovnl client into iproute2, the sequence for setting
> > > up an enic VF and associating it to the port profile could be
> > >
> > >   # create vf0, pass mac and vlan id to HW, no association yet
> > >   ip link add link eth0 name vf0 type vf mac fe:dc:ba:12:34:56 vlan 78
> >
> > Just to clarify...right now, the normal SR-IOV VF is already there.
> > And, or course, can have its mac addr/vlan set already.
>
> I don't have an SR-IOV card available for testing yet. How is this
> configured now?

The device shows up in the host as a normal network device, so mgmt tools
currently treat it as if it's no different from a PF. So that's just
plain old:

    SIOCSIFHWADDR or RTM_SETLINK (i.e. normal ->ndo_set_mac_addr)

There's also the possibility of configuring through the PF (although this
isn't really widely used ATM, and has the disadvantage of exposing the VF
number to userspace in a way that's difficult to use). This is also done
via RTM_SETLINK (on the PF this time), and will result in ->ndo_set_vf_mac().

> > >   # associate vf with port profile, mac address must match the one assigned
> > >   # to the interface before.
> > >   ip iov assoc eth0 port-profile "general" host-uuid "dcf2a873-f5ee-41dd-a7ad-802a544e48c2" \
> > >       mac fe:dc:ba:12:34:56
> >
> > At that point you could just do s/mac fe:.*/link vf0/
>
> My point was that this information should be irrelevant to the code doing the
> association with the switch. It sort of makes sense when the receiver is enic,
> but when we send the same data to lldpad, it doesn't care about the slave device
> name but only about the mac address. Especially since the slave device might not
> be in the root name space any more, meaning we have no way to find it.

Yeah, w/ namespace I think you'd normally do all setup before handing
into a new namespace.

thanks,
-chris

On Thursday 22 April 2010 19:47:29 Chris Wright wrote:
> OK, wasn't clear if you meant that or simply 100% dedicating the interface
> via something like virtio. The add_vf() idea, while neat, doesn't really
> match how VF's are allocated.

But we still need something like that for allocating queues in VMDq
and similar cases where we do not have pass-through, right?

As far as I can tell we don't have an interface for that yet, but
we have drivers for a number of cards that could do this.

> > I don't have an SR-IOV card available for testing yet. How is this
> > configured now?
>
> The device shows up in the host as a normal network device, so mgmt tools
> currently treat it as if it's no different from a PF. So that's just
> plain old:
>
>     SIOCSIFHWADDR or RTM_SETLINK (i.e. normal ->ndo_set_mac_addr)

Ok, but that only works for a fixed number of VFs and you can only
configure the VF before it's assigned to the guest, right?

Both are not serious limitations, but it would be nice to
have an easy way around them. In particular, for assigning
the mac address and vlan id (VF in access mode), there needs
to be some interface that allows the host but not the guest
to change the settings after assigning the card to the guest.

This is a fundamental requirement for VEPA, because the switch
applies its forwarding rules based on the mac address and trusts
the hypervisor to make sure it cannot be faked by the guest.

Arnd

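The two existing configuration paths mentioned in the quoted text can be contrasted in a short sketch: `SIOCSIFHWADDR` addresses the VF through its own netdev name, while the PF path addresses it by VF index using `struct ifla_vf_mac` from `<linux/if_link.h>`. The ioctl, `struct ifreq` and `struct ifla_vf_mac` are real Linux interfaces; the helper names `fill_hwaddr_req()`, `set_hwaddr()` and `fill_vf_mac()` are made up for this example, and how the VF attribute is nested inside the RTM_SETLINK message (which has differed between kernel versions) is deliberately not shown.

```c
#define _DEFAULT_SOURCE
#include <linux/if_link.h>   /* struct ifla_vf_mac */
#include <net/if.h>          /* struct ifreq, IFNAMSIZ */
#include <net/if_arp.h>      /* ARPHRD_ETHER */
#include <string.h>
#include <sys/ioctl.h>       /* ioctl(), SIOCSIFHWADDR */
#include <sys/socket.h>
#include <unistd.h>

/* Path 1: the VF is a netdev of its own, addressed by interface name. */
void fill_hwaddr_req(struct ifreq *ifr, const char *ifname,
                     const unsigned char mac[6])
{
    memset(ifr, 0, sizeof(*ifr));
    strncpy(ifr->ifr_name, ifname, IFNAMSIZ - 1);
    ifr->ifr_hwaddr.sa_family = ARPHRD_ETHER;
    memcpy(ifr->ifr_hwaddr.sa_data, mac, 6);
}

/* Issue the ioctl; this needs CAP_NET_ADMIN and an existing device. */
int set_hwaddr(const char *ifname, const unsigned char mac[6])
{
    struct ifreq ifr;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    int rc;

    if (fd < 0)
        return -1;
    fill_hwaddr_req(&ifr, ifname, mac);
    rc = ioctl(fd, SIOCSIFHWADDR, &ifr);
    close(fd);
    return rc;
}

/* Path 2: payload for a request sent to the PF, addressed by VF index. */
void fill_vf_mac(struct ifla_vf_mac *ivm, unsigned int vf,
                 const unsigned char mac[6])
{
    memset(ivm, 0, sizeof(*ivm));
    ivm->vf = vf;
    memcpy(ivm->mac, mac, 6);
}
```

The second path is the one that stays usable from the host after the VF has been handed to a guest, since it never touches the VF's own netdev; the VF index it exposes is the usability drawback noted in the thread.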
* Arnd Bergmann (arnd@arndb.de) wrote:
> On Thursday 22 April 2010 19:47:29 Chris Wright wrote:
> > OK, wasn't clear if you meant that or simply 100% dedicating the interface
> > via something like virtio. The add_vf() idea, while neat, doesn't really
> > match how VF's are allocated.
>
> But we still need something like that for allocating queues in VMDq
> and similar cases where we do not have pass-through, right?

Iff we care about VMDq w/out SR-IOV (since SR-IOV hardware is VMDq
capable and already has a queue-pair + interrupt + net_dev), yes.

And it's not just VMDq, it's any multi-queue card that can do mac/vlan
filter in hw + header/data split (for direct data DMA to guest buffers).

> As far as I can tell we don't have an interface for that yet, but
> we have drivers for a number of cards that could do this.
>
> > > I don't have an SR-IOV card available for testing yet. How is this
> > > configured now?
> >
> > The device shows up in the host as a normal network device, so mgmt tools
> > currently treat it as if it's no different from a PF. So that's just
> > plain old:
> >
> > SIOCSIFHWADDR or RTM_SETLINK (i.e. normal ->ndo_set_mac_addr)
>
> Ok, but that only works for a fixed number of VFs and you can only
> configure the VF before it's assigned to the guest, right?

Depends on assign. Assign meaning it's still visible in host, but only
one guest is using it via virtio (e.g. vhost-net)....then no, can change
anytime (although it's not typically changed during VM lifecycle).
Assign meaning direct PCI device assignment of the VF to the guest, then
yes, only while the device has driver in host.

> Both are not serious limitations, but it would be nice to
> have an easy way around them. In particular, for assigning
> the mac address and vlan id (VF in access mode), there needs
> to be some interface that allows the host but not the guest
> to change the settings after assigning the card to the guest.
>
> This is a fundamental requirement for VEPA, because the switch
> applied its forwarding rules based on the mac address and trusts
> the hypervisor to make sure it cannot be faked by the guest.

Sure, but the VF (when directly assigned to the guest) is going to (at
least it should, for security reasons) always trap to a privileged code
if the guest tries to do something like set mac or vlan id. All the
SR-IOV cards I've seen do this. The "set VF mac addr" is really a
message to the PF.

thanks,
-chris
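The point that a guest's "set VF mac addr" is really a message to the PF can be pictured with a small model. The sketch below is a plain C simulation under stated assumptions, not any real driver's mailbox protocol: `struct vf_state`, `pf_host_set_vf` and `pf_handle_vf_set_mac` are hypothetical names, and the policy shown (the host setting wins once it has been made) is one plausible way a PF could enforce what VEPA needs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define VF_MAC_LEN 6

/* Hypothetical per-VF state kept by the PF driver in the host. */
struct vf_state {
    uint8_t  mac[VF_MAC_LEN];
    uint16_t vlan;
    bool     host_pinned;  /* true once the host has fixed mac/vlan */
};

/* Host-side (privileged) path: always allowed, and pins the settings
 * so that later guest requests cannot override them. */
static void pf_host_set_vf(struct vf_state *vf, const uint8_t *mac,
                           uint16_t vlan)
{
    assert(vf != NULL && mac != NULL);
    memcpy(vf->mac, mac, VF_MAC_LEN);
    vf->vlan = vlan;
    vf->host_pinned = true;
}

/* Guest-side path: the VF driver's "set mac" arrives at the PF as a
 * request. Returns 0 if accepted, -1 if refused by host policy. */
static int pf_handle_vf_set_mac(struct vf_state *vf, const uint8_t *mac)
{
    assert(vf != NULL && mac != NULL);
    if (vf->host_pinned && memcmp(vf->mac, mac, VF_MAC_LEN) != 0)
        return -1;  /* guest may not change a host-assigned address */
    memcpy(vf->mac, mac, VF_MAC_LEN);
    return 0;
}
```

In this model the guest can still "set" the address the host already assigned, but any attempt to use a different one is refused, so the external switch can keep trusting the MAC it sees.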
On Thursday 22 April 2010 21:02:30 Chris Wright wrote:
> * Arnd Bergmann (arnd@arndb.de) wrote:
> > On Thursday 22 April 2010 19:47:29 Chris Wright wrote:
> > > OK, wasn't clear if you meant that or simply 100% dedicating the interface
> > > via something like virtio. The add_vf() idea, while neat, doesn't really
> > > match how VF's are allocated.
> >
> > But we still need something like that for allocating queues in VMDq
> > and similar cases where we do not have pass-through, right?
>
> Iff we care about VMDq w/out SR-IOV (since SR-IOV hardware is VMDq
> capable and already has a queue-pair + interrupt + net_dev), yes.
>
> And it's not just VMDq, it's any multi-queue card that can do mac/vlan
> filter in hw + header/data split (for direct data DMA to guest buffers).

Right, that's what I meant by VMDq. Do we have a better term to describe
this class of devices, i.e. VMDq and other cards that also have the
features you listed?

	Arnd
* Arnd Bergmann (arnd@arndb.de) wrote:
> On Thursday 22 April 2010 21:02:30 Chris Wright wrote:
> > * Arnd Bergmann (arnd@arndb.de) wrote:
> > > On Thursday 22 April 2010 19:47:29 Chris Wright wrote:
> > > > OK, wasn't clear if you meant that or simply 100% dedicating the interface
> > > > via something like virtio. The add_vf() idea, while neat, doesn't really
> > > > match how VF's are allocated.
> > >
> > > But we still need something like that for allocating queues in VMDq
> > > and similar cases where we do not have pass-through, right?
> >
> > Iff we care about VMDq w/out SR-IOV (since SR-IOV hardware is VMDq
> > capable and already has a queue-pair + interrupt + net_dev), yes.
> >
> > And it's not just VMDq, it's any multi-queue card that can do mac/vlan
> > filter in hw + header/data split (for direct data DMA to guest buffers).
>
> Right, that's what I meant by VMDq. Do we have a better term to describe
> this class of devices, i.e. VMDq and other cards that also have the
> features you listed?

I don't have a good term. Some of these devices can already surface
multiple netdevs.

thanks,
-chris
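The device class discussed here (multi-queue cards that do mac/vlan filtering in hardware) can be illustrated with a small software model of the classification step. This is only a sketch under stated assumptions: the table layout, the names `filter_add` and `filter_classify`, and the fallback to queue 0 are hypothetical and do not describe any particular NIC.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN       6
#define MAX_FILTERS   8
#define DEFAULT_QUEUE 0   /* assumed: unmatched frames go to the host queue */

/* One (mac, vlan) -> rx queue rule, as a card's filter table might hold. */
struct rx_filter {
    uint8_t  mac[MAC_LEN];
    uint16_t vlan;
    int      queue;
};

struct filter_table {
    struct rx_filter entry[MAX_FILTERS];
    int count;
};

/* Install a rule; returns 0 on success, -1 if the table is full. */
static int filter_add(struct filter_table *t, const uint8_t *mac,
                      uint16_t vlan, int queue)
{
    assert(t != NULL && mac != NULL);
    if (t->count >= MAX_FILTERS)
        return -1;
    memcpy(t->entry[t->count].mac, mac, MAC_LEN);
    t->entry[t->count].vlan = vlan;
    t->entry[t->count].queue = queue;
    t->count++;
    return 0;
}

/* Pick the rx queue for a frame from its destination MAC and VLAN ID. */
static int filter_classify(const struct filter_table *t,
                           const uint8_t *dst_mac, uint16_t vlan)
{
    assert(t != NULL && dst_mac != NULL);
    for (int i = 0; i < t->count; i++) {
        if (t->entry[i].vlan == vlan &&
            memcmp(t->entry[i].mac, dst_mac, MAC_LEN) == 0)
            return t->entry[i].queue;
    }
    return DEFAULT_QUEUE;
}
```

Each guest would own one queue and one or more rules. What the thread is missing is a host interface for installing such rules, which in this model corresponds to a privileged caller of `filter_add`.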
--- a/src/qemu/qemu_conf.c
+++ b/src/qemu/qemu_conf.c
@@ -1470,6 +1470,11 @@ qemudPhysIfaceConnect(virConnectPtr conn,
         net->model && STREQ(net->model, "virtio"))
         vnet_hdr = 1;
 
+    setPortProfileId(net->data.direct.linkdev,
+                     net->data.direct.mode,
+                     net->data.direct.profileid,
+                     net->mac);
+
     rc = openMacvtapTap(net->ifname, net->mac, linkdev, brmode,
                         &res_ifname, vnet_hdr);
--
To unsubscribe from this list: send the line "unsubscribe netdev" in