Message ID | 20111022054311.21798.3340.stgit@dhcp-8-146.nay.redhat.com |
---|---|
State | New |
On Sat, 22 Oct 2011 13:43:11 +0800, Jason Wang <jasowang@redhat.com> wrote:
> This lets the virtio-net driver send gratuitous packets via a new
> config bit, VIRTIO_NET_S_ANNOUNCE, checked on each config update
> interrupt. When this bit is set by the backend, the driver schedules
> a work item to send gratuitous packets through NETDEV_NOTIFY_PEERS.
>
> This feature is negotiated through bit VIRTIO_NET_F_GUEST_ANNOUNCE.
>
> Signed-off-by: Jason Wang <jasowang@redhat.com>

This seems like a huge layering violation.  Imagine this in real
hardware, for example.

There may be a good reason why virtual devices might want this kind of
reconfiguration cheat, which is unnecessary for normal machines, but
it'd have to be spelled out clearly in the spec to justify it...

Cheers,
Rusty.

On Mon, Oct 24, 2011 at 02:54:59PM +1030, Rusty Russell wrote:
> On Sat, 22 Oct 2011 13:43:11 +0800, Jason Wang <jasowang@redhat.com> wrote:
> > This lets the virtio-net driver send gratuitous packets via a new
> > config bit, VIRTIO_NET_S_ANNOUNCE, checked on each config update
> > interrupt. When this bit is set by the backend, the driver schedules
> > a work item to send gratuitous packets through NETDEV_NOTIFY_PEERS.
> >
> > This feature is negotiated through bit VIRTIO_NET_F_GUEST_ANNOUNCE.
> >
> > Signed-off-by: Jason Wang <jasowang@redhat.com>
>
> This seems like a huge layering violation.  Imagine this in real
> hardware, for example.

Commits 06c4648d46d1b757d6b9591a86810be79818b60c
and 99606477a5888b0ead0284fecb13417b1da8e3af
document the need for this:

    NETDEV_NOTIFY_PEERS notifier indicates that a device moved to a
    different physical link.
and
    In real hardware such notifications are only
    generated when the device comes up or the address changes.

So the hypervisor could get the same behaviour by sending link up/down
events; this is just an optimization so the guest won't do
unnecessary stuff like trying to reconfigure an IP address.

Maybe LOCATION_CHANGE would be a better name?

> There may be a good reason why virtual devices might want this kind of
> reconfiguration cheat, which is unnecessary for normal machines,

I think yes, the difference with real hardware is that the guest can
change location without the link getting dropped.
FWIW, Xen seems to use this capability too.

> but
> it'd have to be spelled out clearly in the spec to justify it...
>
> Cheers,
> Rusty.

Agreed, and I'd like to see the spec too. The interface seems
to involve the guest clearing the status bit when it detects
an event?

Also - how does it interact with the link up event?
We probably don't want to schedule this when we detect
a link status change or during initialization, as
this patch seems to do? What if the link goes down
while the work is running? Is that OK?

On Mon, 2011-10-24 at 07:25 +0200, Michael S. Tsirkin wrote:
> On Mon, Oct 24, 2011 at 02:54:59PM +1030, Rusty Russell wrote:
> > On Sat, 22 Oct 2011 13:43:11 +0800, Jason Wang <jasowang@redhat.com> wrote:
> > > This lets the virtio-net driver send gratuitous packets via a new
> > > config bit, VIRTIO_NET_S_ANNOUNCE, checked on each config update
> > > interrupt. When this bit is set by the backend, the driver schedules
> > > a work item to send gratuitous packets through NETDEV_NOTIFY_PEERS.
> > >
> > > This feature is negotiated through bit VIRTIO_NET_F_GUEST_ANNOUNCE.
> > >
> > > Signed-off-by: Jason Wang <jasowang@redhat.com>
> >
> > This seems like a huge layering violation.  Imagine this in real
> > hardware, for example.
>
> Commits 06c4648d46d1b757d6b9591a86810be79818b60c
> and 99606477a5888b0ead0284fecb13417b1da8e3af
> document the need for this:
>
>     NETDEV_NOTIFY_PEERS notifier indicates that a device moved to a
>     different physical link.
> and
>     In real hardware such notifications are only
>     generated when the device comes up or the address changes.
>
> So the hypervisor could get the same behaviour by sending link up/down
> events; this is just an optimization so the guest won't do
> unnecessary stuff like trying to reconfigure an IP address.
>
>
> Maybe LOCATION_CHANGE would be a better name?
[...]

We also use this in bonding failover, where the system location doesn't
change but a different link is used.

However, I do recognise that the name ought to indicate what kind of
change happened and not what the expected action is.

Ben.

On 10/24/2011 01:25 PM, Michael S. Tsirkin wrote:
> On Mon, Oct 24, 2011 at 02:54:59PM +1030, Rusty Russell wrote:
>> On Sat, 22 Oct 2011 13:43:11 +0800, Jason Wang <jasowang@redhat.com> wrote:
>>> This lets the virtio-net driver send gratuitous packets via a new
>>> config bit, VIRTIO_NET_S_ANNOUNCE, checked on each config update
>>> interrupt. When this bit is set by the backend, the driver schedules
>>> a work item to send gratuitous packets through NETDEV_NOTIFY_PEERS.
>>>
>>> This feature is negotiated through bit VIRTIO_NET_F_GUEST_ANNOUNCE.
>>>
>>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>>
>> This seems like a huge layering violation.  Imagine this in real
>> hardware, for example.
>
> Commits 06c4648d46d1b757d6b9591a86810be79818b60c
> and 99606477a5888b0ead0284fecb13417b1da8e3af
> document the need for this:
>
>     NETDEV_NOTIFY_PEERS notifier indicates that a device moved to a
>     different physical link.
> and
>     In real hardware such notifications are only
>     generated when the device comes up or the address changes.
>
> So the hypervisor could get the same behaviour by sending link up/down
> events; this is just an optimization so the guest won't do
> unnecessary stuff like trying to reconfigure an IP address.
>
>
> Maybe LOCATION_CHANGE would be a better name?
>

ANNOUNCE_SELF?

>
>> There may be a good reason why virtual devices might want this kind of
>> reconfiguration cheat, which is unnecessary for normal machines,
>
> I think yes, the difference with real hardware is that the guest can
> change location without the link getting dropped.
> FWIW, Xen seems to use this capability too.

So does ms netvsc.

>
>> but
>> it'd have to be spelled out clearly in the spec to justify it...
>>
>> Cheers,
>> Rusty.
>
> Agreed, and I'd like to see the spec too. The interface seems
> to involve the guest clearing the status bit when it detects
> an event?

I will describe this in the spec. The interface needs the guest to clear
the status bit; this lets the back-end know it has finished the work, as
we may need to send the gratuitous packets many times.

>
> Also - how does it interact with the link up event?
> We probably don't want to schedule this when we detect
> a link status change or during initialization, as
> this patch seems to do? What if the link goes down
> while the work is running? Is that OK?
>

Looks like there are duplicates if the guest enables arp_notify when the
VM is started, but we need to handle the case of resuming a stopped
virtual machine.

For the link-down race, I don't see any real issue: the packets are
either dropped or queued.

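The clear-to-acknowledge handshake described above can be modelled in a few lines of userspace C. This is only a sketch under assumptions: `struct fake_backend`, `backend_tick()` and `guest_ack()` are hypothetical names, the "several rounds" policy is invented for illustration, and nothing here reflects how an actual backend is implemented. Only the `VIRTIO_NET_S_ANNOUNCE` value comes from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Value taken from the patch's include/linux/virtio_net.h hunk. */
#define VIRTIO_NET_S_ANNOUNCE 2

/* Hypothetical backend state: the status field visible to the guest and
 * the number of announcement rounds the backend still wants. */
struct fake_backend {
	uint16_t status;
	int rounds_left;
};

/* Hypothetical periodic hook. Raises the bit only when the guest has
 * cleared the previous request; returns true when a config-change
 * interrupt would be injected. */
static bool backend_tick(struct fake_backend *b)
{
	if (b->rounds_left == 0)
		return false;
	if (b->status & VIRTIO_NET_S_ANNOUNCE)
		return false;	/* guest has not acknowledged yet */
	b->status |= VIRTIO_NET_S_ANNOUNCE;
	b->rounds_left--;
	return true;
}

/* What the guest's status write amounts to: clearing the bit. */
static void guest_ack(struct fake_backend *b)
{
	b->status &= (uint16_t)~VIRTIO_NET_S_ANNOUNCE;
}
```

In this model the backend never raises a second request while the first is still pending, which is one way the cleared bit can serve as a "work finished" signal.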
On Tue, Oct 25, 2011 at 10:50:41AM +0800, Jason Wang wrote:
> On 10/24/2011 01:25 PM, Michael S. Tsirkin wrote:
> > On Mon, Oct 24, 2011 at 02:54:59PM +1030, Rusty Russell wrote:
> >> On Sat, 22 Oct 2011 13:43:11 +0800, Jason Wang <jasowang@redhat.com> wrote:
> >>> This lets the virtio-net driver send gratuitous packets via a new
> >>> config bit, VIRTIO_NET_S_ANNOUNCE, checked on each config update
> >>> interrupt. When this bit is set by the backend, the driver schedules
> >>> a work item to send gratuitous packets through NETDEV_NOTIFY_PEERS.
> >>>
> >>> This feature is negotiated through bit VIRTIO_NET_F_GUEST_ANNOUNCE.
> >>>
> >>> Signed-off-by: Jason Wang <jasowang@redhat.com>
> >>
> >> This seems like a huge layering violation.  Imagine this in real
> >> hardware, for example.
> >
> > Commits 06c4648d46d1b757d6b9591a86810be79818b60c
> > and 99606477a5888b0ead0284fecb13417b1da8e3af
> > document the need for this:
> >
> >     NETDEV_NOTIFY_PEERS notifier indicates that a device moved to a
> >     different physical link.
> > and
> >     In real hardware such notifications are only
> >     generated when the device comes up or the address changes.
> >
> > So the hypervisor could get the same behaviour by sending link up/down
> > events; this is just an optimization so the guest won't do
> > unnecessary stuff like trying to reconfigure an IP address.
> >
> >
> > Maybe LOCATION_CHANGE would be a better name?
> >
>
> ANNOUNCE_SELF?

It would be nice to formulate what kind of event
we are notifying the guest about.
The announce part of it is really up to the guest, isn't it?

> >
> >> There may be a good reason why virtual devices might want this kind of
> >> reconfiguration cheat, which is unnecessary for normal machines,
> >
> > I think yes, the difference with real hardware is that the guest can
> > change location without the link getting dropped.
> > FWIW, Xen seems to use this capability too.
>
> So does ms netvsc.
>
> >
> >> but
> >> it'd have to be spelled out clearly in the spec to justify it...
> >>
> >> Cheers,
> >> Rusty.
> >
> > Agreed, and I'd like to see the spec too. The interface seems
> > to involve the guest clearing the status bit when it detects
> > an event?
>
> I will describe this in the spec. The interface needs the guest to clear
> the status bit; this lets the back-end know it has finished the work, as
> we may need to send the gratuitous packets many times.
>
> >
> > Also - how does it interact with the link up event?
> > We probably don't want to schedule this when we detect
> > a link status change or during initialization, as
> > this patch seems to do? What if the link goes down
> > while the work is running? Is that OK?
> >
>
> Looks like there are duplicates if the guest enables arp_notify when the
> VM is started,

How hard would it be to avoid these duplicates?

> but we need to handle the case of resuming a stopped
> virtual machine.
>
> For the link-down race, I don't see any real issue: the packets are
> either dropped or queued.

For example, you do
	unregister_netdev(vi->dev);
	cancel_work_sync(&vi->announce);

which looks scary as announce seems to use the netdev.

On 10/25/2011 11:41 PM, Michael S. Tsirkin wrote:
> On Tue, Oct 25, 2011 at 10:50:41AM +0800, Jason Wang wrote:
>> On 10/24/2011 01:25 PM, Michael S. Tsirkin wrote:
>>> On Mon, Oct 24, 2011 at 02:54:59PM +1030, Rusty Russell wrote:
>>>> On Sat, 22 Oct 2011 13:43:11 +0800, Jason Wang <jasowang@redhat.com> wrote:
>>>>> This lets the virtio-net driver send gratuitous packets via a new
>>>>> config bit, VIRTIO_NET_S_ANNOUNCE, checked on each config update
>>>>> interrupt. When this bit is set by the backend, the driver schedules
>>>>> a work item to send gratuitous packets through NETDEV_NOTIFY_PEERS.
>>>>>
>>>>> This feature is negotiated through bit VIRTIO_NET_F_GUEST_ANNOUNCE.
>>>>>
>>>>> Signed-off-by: Jason Wang <jasowang@redhat.com>
>>>>
>>>> This seems like a huge layering violation.  Imagine this in real
>>>> hardware, for example.
>>>
>>> Commits 06c4648d46d1b757d6b9591a86810be79818b60c
>>> and 99606477a5888b0ead0284fecb13417b1da8e3af
>>> document the need for this:
>>>
>>>     NETDEV_NOTIFY_PEERS notifier indicates that a device moved to a
>>>     different physical link.
>>> and
>>>     In real hardware such notifications are only
>>>     generated when the device comes up or the address changes.
>>>
>>> So the hypervisor could get the same behaviour by sending link up/down
>>> events; this is just an optimization so the guest won't do
>>> unnecessary stuff like trying to reconfigure an IP address.
>>>
>>>
>>> Maybe LOCATION_CHANGE would be a better name?
>>>
>>
>> ANNOUNCE_SELF?
>
> It would be nice to formulate what kind of event
> we are notifying the guest about.
> The announce part of it is really up to the guest, isn't it?
>

Right.

>>>
>>>> There may be a good reason why virtual devices might want this kind of
>>>> reconfiguration cheat, which is unnecessary for normal machines,
>>>
>>> I think yes, the difference with real hardware is that the guest can
>>> change location without the link getting dropped.
>>> FWIW, Xen seems to use this capability too.
>>
>> So does ms netvsc.
>>
>>>
>>>> but
>>>> it'd have to be spelled out clearly in the spec to justify it...
>>>>
>>>> Cheers,
>>>> Rusty.
>>>
>>> Agreed, and I'd like to see the spec too. The interface seems
>>> to involve the guest clearing the status bit when it detects
>>> an event?
>>
>> I will describe this in the spec. The interface needs the guest to clear
>> the status bit; this lets the back-end know it has finished the work, as
>> we may need to send the gratuitous packets many times.
>>
>>>
>>> Also - how does it interact with the link up event?
>>> We probably don't want to schedule this when we detect
>>> a link status change or during initialization, as
>>> this patch seems to do? What if the link goes down
>>> while the work is running? Is that OK?
>>>
>>
>> Looks like there are duplicates if the guest enables arp_notify when the
>> VM is started,
>
> How hard would it be to avoid these duplicates?

Not hard, it could be done in the backend by distinguishing the reason:
a fresh start, or cont after migration or stop.

>
>> but we need to handle the case of resuming a stopped
>> virtual machine.
>>
>> For the link-down race, I don't see any real issue: the packets are
>> either dropped or queued.
>
> For example, you do
> 	unregister_netdev(vi->dev);
> 	cancel_work_sync(&vi->announce);
>
> which looks scary as announce seems to use the netdev.
>

Oops, it's a bug. I will fix it.

Thanks

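One plausible reading of the teardown concern raised above is that pending announce work can still run after the netdev has been unregistered. The toy userspace model below illustrates that ordering problem. It is a sketch under assumptions: every `fake_*` name is hypothetical, the "workqueue" is reduced to a single pending flag, and the real `cancel_work_sync()` additionally waits for work that is already running, which this model does not capture. It is not the fix that was eventually applied.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct net_device. */
struct fake_netdev {
	bool registered;
	int notifies;
};

/* Hypothetical stand-in for struct virtnet_info. */
struct fake_vi {
	struct fake_netdev *dev;
	bool announce_pending;		/* models a queued work_struct */
	int use_after_unregister;	/* counts unsafe netdev accesses */
};

/* Models announce_work(): it touches the netdev. */
static void fake_announce_work(struct fake_vi *vi)
{
	if (!vi->dev->registered) {
		vi->use_after_unregister++;
		return;
	}
	vi->dev->notifies++;
}

/* Models the workqueue getting a chance to run queued work. */
static void fake_run_pending(struct fake_vi *vi)
{
	if (vi->announce_pending) {
		vi->announce_pending = false;
		fake_announce_work(vi);
	}
}

/* Models cancel_work_sync() for work that has not started yet. */
static void fake_cancel_work(struct fake_vi *vi)
{
	vi->announce_pending = false;
}

/* Order used in the patch: unregister first, cancel afterwards. */
static void fake_remove_patch_order(struct fake_vi *vi)
{
	vi->dev->registered = false;	/* unregister_netdev() */
	fake_run_pending(vi);		/* work may run in this window */
	fake_cancel_work(vi);
}

/* Alternative order: cancel first, then unregister. */
static void fake_remove_cancel_first(struct fake_vi *vi)
{
	fake_cancel_work(vi);
	fake_run_pending(vi);		/* nothing left to run */
	vi->dev->registered = false;
}
```

With one work item pending, the first ordering records an access to the unregistered device while the second records none. A real fix would also have to keep the config-change handler from scheduling new work after the cancel, which this model leaves out.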
diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index b8225f3..1cdecf7 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -71,6 +71,9 @@ struct virtnet_info {
 	/* Work struct for refilling if we run low on memory. */
 	struct delayed_work refill;
 
+	/* Work struct for send gratituous packet. */
+	struct work_struct announce;
+
 	/* Chain pages by the private ptr. */
 	struct page *pages;
 
@@ -507,6 +510,13 @@ static void refill_work(struct work_struct *work)
 		schedule_delayed_work(&vi->refill, HZ/2);
 }
 
+static void announce_work(struct work_struct *work)
+{
+	struct virtnet_info *vi = container_of(work, struct virtnet_info,
+					       announce);
+	netif_notify_peers(vi->dev);
+}
+
 static int virtnet_poll(struct napi_struct *napi, int budget)
 {
 	struct virtnet_info *vi = container_of(napi, struct virtnet_info, napi);
@@ -923,11 +933,22 @@ static void virtnet_update_status(struct virtnet_info *vi)
 			      &v, sizeof(v));
 
 	/* Ignore unknown (future) status bits */
-	v &= VIRTIO_NET_S_LINK_UP;
+	v &= VIRTIO_NET_S_LINK_UP | VIRTIO_NET_S_ANNOUNCE;
 
 	if (vi->status == v)
 		return;
 
+	if (v & VIRTIO_NET_S_ANNOUNCE) {
+		if ((v & VIRTIO_NET_S_LINK_UP) &&
+		    virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_ANNOUNCE))
+			schedule_work(&vi->announce);
+		v &= ~VIRTIO_NET_S_ANNOUNCE;
+		vi->vdev->config->set(vi->vdev,
+				      offsetof(struct virtio_net_config,
+					       status),
+				      &v, sizeof(v));
+	}
+
 	vi->status = v;
 
 	if (vi->status & VIRTIO_NET_S_LINK_UP) {
@@ -937,6 +958,7 @@ static void virtnet_update_status(struct virtnet_info *vi)
 		netif_carrier_off(vi->dev);
 		netif_stop_queue(vi->dev);
 	}
+
 }
 
 static void virtnet_config_changed(struct virtio_device *vdev)
@@ -1016,6 +1038,8 @@ static int virtnet_probe(struct virtio_device *vdev)
 		goto free;
 
 	INIT_DELAYED_WORK(&vi->refill, refill_work);
+	if (virtio_has_feature(vdev, VIRTIO_NET_F_GUEST_ANNOUNCE))
+		INIT_WORK(&vi->announce, announce_work);
 	sg_init_table(vi->rx_sg, ARRAY_SIZE(vi->rx_sg));
 	sg_init_table(vi->tx_sg, ARRAY_SIZE(vi->tx_sg));
 
@@ -1077,6 +1101,8 @@ static int virtnet_probe(struct virtio_device *vdev)
 unregister:
 	unregister_netdev(dev);
 	cancel_delayed_work_sync(&vi->refill);
+	if (virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_ANNOUNCE))
+		cancel_work_sync(&vi->announce);
 free_vqs:
 	vdev->config->del_vqs(vdev);
 free_stats:
@@ -1118,6 +1144,8 @@ static void __devexit virtnet_remove(struct virtio_device *vdev)
 	unregister_netdev(vi->dev);
 
 	cancel_delayed_work_sync(&vi->refill);
+	if(virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_ANNOUNCE))
+		cancel_work_sync(&vi->announce);
 
 	/* Free unused buffers in both send and recv, if any. */
 	free_unused_bufs(vi);
@@ -1144,6 +1172,7 @@ static unsigned int features[] = {
 	VIRTIO_NET_F_GUEST_ECN, VIRTIO_NET_F_GUEST_UFO, VIRTIO_NET_F_MRG_RXBUF,
 	VIRTIO_NET_F_STATUS, VIRTIO_NET_F_CTRL_VQ, VIRTIO_NET_F_CTRL_RX,
 	VIRTIO_NET_F_CTRL_VLAN,
+	VIRTIO_NET_F_GUEST_ANNOUNCE,
 };
 
 static struct virtio_driver virtio_net_driver = {
diff --git a/include/linux/virtio_net.h b/include/linux/virtio_net.h
index 970d5a2..44a38d6 100644
--- a/include/linux/virtio_net.h
+++ b/include/linux/virtio_net.h
@@ -49,8 +49,10 @@
 #define VIRTIO_NET_F_CTRL_RX	18	/* Control channel RX mode support */
 #define VIRTIO_NET_F_CTRL_VLAN	19	/* Control channel VLAN filtering */
 #define VIRTIO_NET_F_CTRL_RX_EXTRA 20	/* Extra RX mode control support */
+#define VIRTIO_NET_F_GUEST_ANNOUNCE 21	/* Guest can send gratituous packet */
 
 #define VIRTIO_NET_S_LINK_UP	1	/* Link is up */
+#define VIRTIO_NET_S_ANNOUNCE	2	/* Announcement is needed */
 
 struct virtio_net_config {
 	/* The config defining mac address (if VIRTIO_NET_F_MAC) */

This lets the virtio-net driver send gratuitous packets via a new
config bit, VIRTIO_NET_S_ANNOUNCE, checked on each config update
interrupt. When this bit is set by the backend, the driver schedules
a work item to send gratuitous packets through NETDEV_NOTIFY_PEERS.

This feature is negotiated through bit VIRTIO_NET_F_GUEST_ANNOUNCE.

Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 drivers/net/virtio_net.c   |   31 ++++++++++++++++++++++++++++++-
 include/linux/virtio_net.h |    2 ++
 2 files changed, 32 insertions(+), 1 deletions(-)
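
The status handling this commit message describes can be followed in a small userspace model of the patched `virtnet_update_status()`. This is a sketch, not kernel code: `struct fake_dev` and `fake_update_status()` are hypothetical names, the config space is a plain struct field, and a counter stands in for `schedule_work()`. The two status bit values are the ones the patch defines.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values taken from the patch's include/linux/virtio_net.h hunk. */
#define VIRTIO_NET_S_LINK_UP  1
#define VIRTIO_NET_S_ANNOUNCE 2

/* Hypothetical stand-in for the device config space plus driver state. */
struct fake_dev {
	uint16_t config_status;	/* status field, written by the backend */
	uint16_t cached_status;	/* models vi->status */
	bool feature_announce;	/* VIRTIO_NET_F_GUEST_ANNOUNCE negotiated? */
	int announces_scheduled;	/* models calls to schedule_work() */
};

/* Follows the control flow of the patched virtnet_update_status(). */
static void fake_update_status(struct fake_dev *d)
{
	uint16_t v = d->config_status;

	/* Ignore unknown (future) status bits. */
	v &= VIRTIO_NET_S_LINK_UP | VIRTIO_NET_S_ANNOUNCE;

	if (d->cached_status == v)
		return;

	if (v & VIRTIO_NET_S_ANNOUNCE) {
		if ((v & VIRTIO_NET_S_LINK_UP) && d->feature_announce)
			d->announces_scheduled++;
		/* Clear the bit and write it back to acknowledge. */
		v &= (uint16_t)~VIRTIO_NET_S_ANNOUNCE;
		d->config_status = v;
	}

	d->cached_status = v;
}
```

In this model a request that arrives while the link is down is acknowledged but not acted on, and a second call with an unchanged status returns early, matching the early `return` in the patch.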