Message ID: CABUUfwNNrvdcr1cp88mnqnHKH9L7f=yj6oRWOYAsAoB_T9w=CQ@mail.gmail.com
State: New
On 06/25/2015 10:22 PM, Thibaut Collet wrote:
> On Thu, Jun 25, 2015 at 2:53 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
>> On Thu, Jun 25, 2015 at 01:01:29PM +0200, Thibaut Collet wrote:
>>> On Thu, Jun 25, 2015 at 11:59 AM, Jason Wang <jasowang@redhat.com> wrote:
>>>> On 06/24/2015 07:05 PM, Michael S. Tsirkin wrote:
>>>>> On Wed, Jun 24, 2015 at 04:31:15PM +0800, Jason Wang wrote:
>>>>>> On 06/23/2015 01:49 PM, Michael S. Tsirkin wrote:
>>>>>>> On Tue, Jun 23, 2015 at 10:12:17AM +0800, Jason Wang wrote:
>>>>>>>>> On 06/18/2015 11:16 PM, Thibaut Collet wrote:
>>>>>>>>>>> On Tue, Jun 16, 2015 at 10:05 AM, Jason Wang <jasowang@redhat.com> wrote:
>>>>>>>>>>>>> On 06/16/2015 03:24 PM, Thibaut Collet wrote:
>>>>>>>>>>>>>>> If my understanding is correct, on a resume operation we have the
>>>>>>>>>>>>>>> following callback trace:
>>>>>>>>>>>>>>> 1. virtio_pci_restore, which calls the restore callback of every
>>>>>>>>>>>>>>>    virtio device
>>>>>>>>>>>>>>> 2. virtnet_restore, which calls try_fill_recv for each virtual queue
>>>>>>>>>>>>>>> 3. try_fill_recv, which kicks the virtual queue (through
>>>>>>>>>>>>>>>    virtqueue_kick)
>>>>>>>>>>>>> Yes, but this happens only after pm resume, not migration. Migration is
>>>>>>>>>>>>> totally transparent to the guest.
>>>>>>>>>>>
>>>>>>>>>>> Hi Jason,
>>>>>>>>>>>
>>>>>>>>>>> After a deeper look at the migration code of QEMU, a resume event is
>>>>>>>>>>> always sent when the live migration has finished.
>>>>>>>>>>> On a live migration we have the following callback trace:
>>>>>>>>>>> 1. The VM on the new host is set to the RUN_STATE_INMIGRATE state,
>>>>>>>>>>>    the autostart boolean is set to 1, and qemu_start_incoming_migration
>>>>>>>>>>>    is called (see the main function of vl.c)
>>>>>>>>>>>    .....
>>>>>>>>>>> 2. process_incoming_migration in migration/migration.c is called,
>>>>>>>>>>>    whatever the transport used for the live migration (tcp:, fd:,
>>>>>>>>>>>    unix:, exec: ...)
>>>>>>>>>>> 3. process_incoming_migration_co in migration/migration.c is called
>>>>>>>>>>> 4. vm_start in vl.c is called (otherwise the migrated VM would stay
>>>>>>>>>>>    paused; the autostart boolean was set to 1 by the main function
>>>>>>>>>>>    of vl.c)
>>>>>>>>>>> 5. vm_start sets the VM to the RUN_STATE_RUNNING state
>>>>>>>>>>> 6. qapi_event_send_resume is called, which sends a resume event for the VM
>>>>>>>>> AFAIK, this function sends the resume event to the QEMU monitor, not to the VM.
>>>>>>>>>
>>>>>>>>>>> So when a live migration has ended:
>>>>>>>>>>> 1. a resume event is sent to the guest
>>>>>>>>>>> 2. on reception of this resume event the virtual queues are kicked by
>>>>>>>>>>>    the guest
>>>>>>>>>>> 3. the vhost-user backend catches this kick and can emit a RARP on
>>>>>>>>>>>    behalf of a guest that does not support GUEST_ANNOUNCE
>>>>>>>>>>>
>>>>>>>>>>> This solution, like the one based on detection of the DRIVER_OK status
>>>>>>>>>>> suggested by Michael, allows the backend to send the RARP for a legacy
>>>>>>>>>>> guest without involving QEMU or adding an ioctl to vhost-user.
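The RARP mentioned in point 3 above is a fixed-format announce frame that a backend can build from the guest MAC address alone (QEMU's own self-announce after migration broadcasts the same kind of frame). A minimal, self-contained sketch of building it; this is illustrative only and not code from QEMU or from any existing vhost-user backend:

#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6

/* Fill buf (at least 60 bytes) with a gratuitous RARP announce for the
 * given guest MAC address and return the frame length. */
static size_t build_rarp_announce(uint8_t *buf, const uint8_t mac[ETH_ALEN])
{
    memset(buf, 0, 60);
    memset(buf, 0xff, ETH_ALEN);          /* dst MAC: broadcast           */
    memcpy(buf + 6, mac, ETH_ALEN);       /* src MAC: guest MAC           */
    buf[12] = 0x80; buf[13] = 0x35;       /* ethertype: RARP (0x8035)     */
    buf[14] = 0x00; buf[15] = 0x01;       /* hardware type: Ethernet      */
    buf[16] = 0x08; buf[17] = 0x00;       /* protocol type: IPv4          */
    buf[18] = ETH_ALEN;                   /* hardware address length      */
    buf[19] = 4;                          /* protocol address length      */
    buf[20] = 0x00; buf[21] = 0x03;       /* opcode: reverse request      */
    memcpy(buf + 22, mac, ETH_ALEN);      /* sender hardware address      */
    memcpy(buf + 32, mac, ETH_ALEN);      /* target hardware address      */
    /* Sender/target protocol addresses stay 0.0.0.0; the remainder of
     * the 60 bytes is padding. */
    return 60;
}

The backend would broadcast this frame on the external network on behalf of the migrated guest, so that switches learn the new location of the guest's MAC address.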
>>>>>>>>> A question here is: does the vhost-user code pass the status to the
>>>>>>>>> backend? If not, how can a userspace backend detect DRIVER_OK?
>>>>>>> Sorry, I must have been unclear.
>>>>>>> The vhost core calls VHOST_NET_SET_BACKEND on DRIVER_OK.
>>>>>>> Unfortunately vhost-user currently translates it to VHOST_USER_NONE.
>>>>>> Looks like VHOST_NET_SET_BACKEND was only used for the tap backend.
>>>>>>
>>>>>>> As a workaround, I think kicking the ioeventfds once you get
>>>>>>> VHOST_NET_SET_BACKEND will work.
>>>>>> Maybe just an eventfd_set() in vhost_net_start(). But is this
>>>>>> "workaround" elegant enough to be documented? Is it better to do this
>>>>>> explicitly with a new feature?
>>>>> If you are going to do this anyway, there are a couple of other changes
>>>>> we should make; in particular, decide what we want to do with the
>>>>> control vq.
>>>>>
>>>> If I understand correctly, you mean VIRTIO_NET_CTRL_MQ and
>>>> VIRTIO_NET_CTRL_GUEST_OFFLOADS? Looks like both of these were broken.
>>>> Needs more thought, maybe new kinds of requests.
>>>>
>>> Are there any objections to adding VHOST_NET_SET_BACKEND support to
>>> vhost-user with a patch like this?
>>>
>>>  hw/net/vhost_net.c     |  8 ++++++++
>>>  hw/virtio/vhost-user.c | 10 +++++++++-
>>>  2 files changed, 17 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
>>> index 907e002..7a008c0 100644
>>> --- a/hw/net/vhost_net.c
>>> +++ b/hw/net/vhost_net.c
>>> @@ -234,6 +234,14 @@ static int vhost_net_start_one(struct vhost_net *net,
>>>                  goto fail;
>>>              }
>>>          }
>>> +    } else if (net->nc->info->type == NET_CLIENT_OPTIONS_KIND_VHOST_USER) {
>>> +        file.fd = 0;
>>> +        for (file.index = 0; file.index < net->dev.nvqs; ++file.index) {
>>> +            const VhostOps *vhost_ops = net->dev.vhost_ops;
>>> +            int r = vhost_ops->vhost_call(&net->dev, VHOST_NET_SET_BACKEND,
>>> +                                          &file);
>>> +            assert(r >= 0);
>>> +        }
>>>      }
>>>      return 0;
>>>  fail:
>>> diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
>>> index d6f2163..32c6bd9 100644
>>> --- a/hw/virtio/vhost-user.c
>>> +++ b/hw/virtio/vhost-user.c
>>> @@ -41,6 +41,7 @@ typedef enum VhostUserRequest {
>>>      VHOST_USER_SET_VRING_KICK = 12,
>>>      VHOST_USER_SET_VRING_CALL = 13,
>>>      VHOST_USER_SET_VRING_ERR = 14,
>>> +    VHOST_USER_NET_SET_BACKEND = 15,
>>>      VHOST_USER_MAX
>>>  } VhostUserRequest;
>>>
>>> @@ -104,7 +105,8 @@ static unsigned long int ioctl_to_vhost_user_request[VHOST_USER_MAX] = {
>>>      VHOST_GET_VRING_BASE,   /* VHOST_USER_GET_VRING_BASE */
>>>      VHOST_SET_VRING_KICK,   /* VHOST_USER_SET_VRING_KICK */
>>>      VHOST_SET_VRING_CALL,   /* VHOST_USER_SET_VRING_CALL */
>>> -    VHOST_SET_VRING_ERR     /* VHOST_USER_SET_VRING_ERR */
>>> +    VHOST_SET_VRING_ERR,    /* VHOST_USER_SET_VRING_ERR */
>>> +    VHOST_NET_SET_BACKEND   /* VHOST_USER_NET_SET_BACKEND */
>>>  };
>>>
>>>  static VhostUserRequest vhost_user_request_translate(unsigned long int request)
>>> @@ -287,6 +289,12 @@ static int vhost_user_call(struct vhost_dev *dev, unsigned long int request,
>>>              msg.u64 |= VHOST_USER_VRING_NOFD_MASK;
>>>          }
>>>          break;
>>> +
>>> +    case VHOST_NET_SET_BACKEND:
>>> +        memcpy(&msg.file, arg, sizeof(struct vhost_vring_state));
>>> +        msg.size = sizeof(m.state);
>>> +        break;
>>> +
>>>      default:
>>>          error_report("vhost-user trying to send unhandled ioctl");
>>>          return -1;
>>>
>>> This message will be sent when the guest is ready and can be used by the
>>> vhost-user backend to send a RARP for a legacy guest.
>>>
>>> This solution avoids adding a new message and has no impact on the
>>> control vq.
>>
>> I think that you can't add messages to the protocol unconditionally.
>> For example, snabbswitch simply crashes if it gets an unknown message.
>>
>> Either this needs a new feature bit, or implement
>> [PATCH RFC] vhost-user: protocol extensions
>> making it safe to add new messages.
>>
>> --
>> MST
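To make the feature-bit option concrete: with a negotiable protocol-feature mechanism like the one proposed in the protocol-extensions RFC, QEMU would emit the new request only when the backend has advertised support for it, so an older backend such as snabbswitch never sees an unknown message. A rough sketch; the feature-bit name, its value, and the negotiated_features parameter are hypothetical and not part of the protocol as it stands:

#include <stdbool.h>
#include <stdint.h>

/* Hypothetical protocol feature bit; a real value would have to be
 * reserved by the vhost-user protocol-extensions work. */
#define VHOST_USER_PROTOCOL_F_NET_SET_BACKEND 0

/* Decide whether VHOST_USER_NET_SET_BACKEND may be sent at all: only if
 * the slave advertised the (hypothetical) feature bit during the
 * protocol-feature negotiation. */
static bool can_send_net_set_backend(uint64_t negotiated_features)
{
    return (negotiated_features &
            (1ULL << VHOST_USER_PROTOCOL_F_NET_SET_BACKEND)) != 0;
}

Anything gated this way degrades gracefully: without the bit, QEMU simply keeps the current behaviour and the backend falls back to whatever heuristic it already uses.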
> I understand.
>
> Last idea before doing an RFC:
> Normally the guest notifies vhost of new buffers on a virtqueue by
> kicking the eventfd. This eventfd has been provided to vhost by QEMU.
> So when DRIVER_OK is received by QEMU, QEMU can kick the eventfd.
>
> A possible patch to do that is:
>
>  hw/net/vhost_net.c | 7 +++++++
>  1 file changed, 7 insertions(+)
>
> diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
> index 907e002..fbc55e0 100644
> --- a/hw/net/vhost_net.c
> +++ b/hw/net/vhost_net.c
> @@ -234,6 +234,13 @@ static int vhost_net_start_one(struct vhost_net *net,
>                  goto fail;
>              }
>          }
> +    } else if (net->nc->info->type == NET_CLIENT_OPTIONS_KIND_VHOST_USER) {
> +        int idx;
> +
> +        for (idx = 0; idx < net->dev.nvqs; ++idx) {
> +            struct VirtQueue *vq = virtio_get_queue(dev, idx);
> +            event_notifier_set(virtio_queue_get_host_notifier(vq));
> +        }
>      }
>      return 0;
>  fail:
>
> Kicking this eventfd has no impact on QEMU or the guest (they do not
> poll it); it simply wakes up vhost and allows it to send the RARP for a
> legacy guest.
>
> Regards.
>
> Thibaut.

This may work, but there are issues:

- I believe we should document this in the spec, but it looks more like a
  workaround: it uses an implicit method to deliver a control message,
  which often causes ugly code on both sides. This is not elegant enough
  to be documented in the spec.
- Consider that you may have 100 queues; then the kick will happen 100
  times, and the backend needs to handle such a case.
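On the second point, the extra kicks are cheap to absorb on the backend side: the announce only needs to go out once per migration, however many queues are kicked. A rough, self-contained sketch with purely illustrative names (send_rarp_announce() stands in for the backend's real transmit path):

#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <unistd.h>

/* Placeholder for the backend's real transmit path (illustrative only). */
static void send_rarp_announce(void)
{
    /* ... build the RARP frame and broadcast it on the external port ... */
}

/* Re-armed whenever a new vhost-user connection is set up (i.e. on the
 * migration target), so the announce is sent at most once per migration. */
static bool announce_pending = true;

/* Called whenever any queue's kick eventfd fires. */
static void handle_kick(int kick_fd)
{
    uint64_t cnt;

    /* Drain the eventfd so the same kick does not wake us up again. */
    if (read(kick_fd, &cnt, sizeof(cnt)) < 0 && errno != EAGAIN) {
        return;
    }

    if (announce_pending) {
        send_rarp_announce();
        announce_pending = false;   /* the other 99 kicks become no-ops here */
    }

    /* ... normal avail-ring processing continues from here ... */
}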
diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
index 907e002..fbc55e0 100644
--- a/hw/net/vhost_net.c
+++ b/hw/net/vhost_net.c
@@ -234,6 +234,13 @@ static int vhost_net_start_one(struct vhost_net *net,
                 goto fail;
             }
         }
+    } else if (net->nc->info->type == NET_CLIENT_OPTIONS_KIND_VHOST_USER) {
+        int idx;
+
+        for (idx = 0; idx < net->dev.nvqs; ++idx) {
+            struct VirtQueue *vq = virtio_get_queue(dev, idx);
+            event_notifier_set(virtio_queue_get_host_notifier(vq));
+        }
     }
     return 0;
 fail: