Message ID | 20230704151527.193586-1-stefanha@redhat.com |
---|---|
State | New |
Headers | show |
Series | virtio-blk: fix host notifier issues during dataplane start/stop | expand |
Thank you, Stefan, I tested it with the extended set of tests and it addresses the issue. Regards, Lukáš Tested-by: Lukas Doktor <ldoktor@redhat.com> Dne 04. 07. 23 v 17:15 Stefan Hajnoczi napsal(a): > The main loop thread can consume 100% CPU when using --device > virtio-blk-pci,iothread=<iothread>. ppoll() constantly returns but > reading virtqueue host notifiers fails with EAGAIN. The file descriptors > are stale and remain registered with the AioContext because of bugs in > the virtio-blk dataplane start/stop code. > > The problem is that the dataplane start/stop code involves drain > operations, which call virtio_blk_drained_begin() and > virtio_blk_drained_end() at points where the host notifier is not > operational: > - In virtio_blk_data_plane_start(), blk_set_aio_context() drains after > vblk->dataplane_started has been set to true but the host notifier has > not been attached yet. > - In virtio_blk_data_plane_stop(), blk_drain() and blk_set_aio_context() > drain after the host notifier has already been detached but with > vblk->dataplane_started still set to true. > > I would like to simplify ->ioeventfd_start/stop() to avoid interactions > with drain entirely, but couldn't find a way to do that. Instead, this > patch accepts the fragile nature of the code and reorders it so that > vblk->dataplane_started is false during drain operations. This way the > virtio_blk_drained_begin() and virtio_blk_drained_end() calls don't > touch the host notifier. The result is that > virtio_blk_data_plane_start() and virtio_blk_data_plane_stop() have > complete control over the host notifier and stale file descriptors are > no longer left in the AioContext. > > This patch fixes the 100% CPU consumption in the main loop thread and > correctly moves host notifier processing to the IOThread. > > Fixes: 1665d9326fd2 ("virtio-blk: implement BlockDevOps->drained_begin()") > Reported-by: Lukáš Doktor <ldoktor@redhat.com> > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> > --- > hw/block/dataplane/virtio-blk.c | 67 +++++++++++++++++++-------------- > 1 file changed, 38 insertions(+), 29 deletions(-) > > diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c > index c227b39408..da36fcfd0b 100644 > --- a/hw/block/dataplane/virtio-blk.c > +++ b/hw/block/dataplane/virtio-blk.c > @@ -219,13 +219,6 @@ int virtio_blk_data_plane_start(VirtIODevice *vdev) > > memory_region_transaction_commit(); > > - /* > - * These fields are visible to the IOThread so we rely on implicit barriers > - * in aio_context_acquire() on the write side and aio_notify_accept() on > - * the read side. > - */ > - s->starting = false; > - vblk->dataplane_started = true; > trace_virtio_blk_data_plane_start(s); > > old_context = blk_get_aio_context(s->conf->conf.blk); > @@ -244,6 +237,18 @@ int virtio_blk_data_plane_start(VirtIODevice *vdev) > event_notifier_set(virtio_queue_get_host_notifier(vq)); > } > > + /* > + * These fields must be visible to the IOThread when it processes the > + * virtqueue, otherwise it will think dataplane has not started yet. > + * > + * Make sure ->dataplane_started is false when blk_set_aio_context() is > + * called above so that draining does not cause the host notifier to be > + * detached/attached prematurely. > + */ > + s->starting = false; > + vblk->dataplane_started = true; > + smp_wmb(); /* paired with aio_notify_accept() on the read side */ > + > /* Get this show started by hooking up our callbacks */ > if (!blk_in_drain(s->conf->conf.blk)) { > aio_context_acquire(s->ctx); > @@ -273,7 +278,6 @@ int virtio_blk_data_plane_start(VirtIODevice *vdev) > fail_guest_notifiers: > vblk->dataplane_disabled = true; > s->starting = false; > - vblk->dataplane_started = true; > return -ENOSYS; > } > > @@ -327,6 +331,32 @@ void virtio_blk_data_plane_stop(VirtIODevice *vdev) > aio_wait_bh_oneshot(s->ctx, virtio_blk_data_plane_stop_bh, s); > } > > + /* > + * Batch all the host notifiers in a single transaction to avoid > + * quadratic time complexity in address_space_update_ioeventfds(). > + */ > + memory_region_transaction_begin(); > + > + for (i = 0; i < nvqs; i++) { > + virtio_bus_set_host_notifier(VIRTIO_BUS(qbus), i, false); > + } > + > + /* > + * The transaction expects the ioeventfds to be open when it > + * commits. Do it now, before the cleanup loop. > + */ > + memory_region_transaction_commit(); > + > + for (i = 0; i < nvqs; i++) { > + virtio_bus_cleanup_host_notifier(VIRTIO_BUS(qbus), i); > + } > + > + /* > + * Set ->dataplane_started to false before draining so that host notifiers > + * are not detached/attached anymore. > + */ > + vblk->dataplane_started = false; > + > aio_context_acquire(s->ctx); > > /* Wait for virtio_blk_dma_restart_bh() and in flight I/O to complete */ > @@ -340,32 +370,11 @@ void virtio_blk_data_plane_stop(VirtIODevice *vdev) > > aio_context_release(s->ctx); > > - /* > - * Batch all the host notifiers in a single transaction to avoid > - * quadratic time complexity in address_space_update_ioeventfds(). > - */ > - memory_region_transaction_begin(); > - > - for (i = 0; i < nvqs; i++) { > - virtio_bus_set_host_notifier(VIRTIO_BUS(qbus), i, false); > - } > - > - /* > - * The transaction expects the ioeventfds to be open when it > - * commits. Do it now, before the cleanup loop. > - */ > - memory_region_transaction_commit(); > - > - for (i = 0; i < nvqs; i++) { > - virtio_bus_cleanup_host_notifier(VIRTIO_BUS(qbus), i); > - } > - > qemu_bh_cancel(s->bh); > notify_guest_bh(s); /* final chance to notify guest */ > > /* Clean up guest notifier (irq) */ > k->set_guest_notifiers(qbus->parent, nvqs, false); > > - vblk->dataplane_started = false; > s->stopping = false; > }
On Tue, Jul 04, 2023 at 05:15:27PM +0200, Stefan Hajnoczi wrote: > The main loop thread can consume 100% CPU when using --device > virtio-blk-pci,iothread=<iothread>. ppoll() constantly returns but > reading virtqueue host notifiers fails with EAGAIN. The file descriptors > are stale and remain registered with the AioContext because of bugs in > the virtio-blk dataplane start/stop code. > > The problem is that the dataplane start/stop code involves drain > operations, which call virtio_blk_drained_begin() and > virtio_blk_drained_end() at points where the host notifier is not > operational: > - In virtio_blk_data_plane_start(), blk_set_aio_context() drains after > vblk->dataplane_started has been set to true but the host notifier has > not been attached yet. > - In virtio_blk_data_plane_stop(), blk_drain() and blk_set_aio_context() > drain after the host notifier has already been detached but with > vblk->dataplane_started still set to true. > > I would like to simplify ->ioeventfd_start/stop() to avoid interactions > with drain entirely, but couldn't find a way to do that. Instead, this > patch accepts the fragile nature of the code and reorders it so that > vblk->dataplane_started is false during drain operations. This way the > virtio_blk_drained_begin() and virtio_blk_drained_end() calls don't > touch the host notifier. The result is that > virtio_blk_data_plane_start() and virtio_blk_data_plane_stop() have > complete control over the host notifier and stale file descriptors are > no longer left in the AioContext. > > This patch fixes the 100% CPU consumption in the main loop thread and > correctly moves host notifier processing to the IOThread. > > Fixes: 1665d9326fd2 ("virtio-blk: implement BlockDevOps->drained_begin()") > Reported-by: Lukáš Doktor <ldoktor@redhat.com> > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> > --- > hw/block/dataplane/virtio-blk.c | 67 +++++++++++++++++++-------------- > 1 file changed, 38 insertions(+), 29 deletions(-) Thanks, applied to my block tree: https://gitlab.com/stefanha/qemu/commits/block Stefan
diff --git a/hw/block/dataplane/virtio-blk.c b/hw/block/dataplane/virtio-blk.c index c227b39408..da36fcfd0b 100644 --- a/hw/block/dataplane/virtio-blk.c +++ b/hw/block/dataplane/virtio-blk.c @@ -219,13 +219,6 @@ int virtio_blk_data_plane_start(VirtIODevice *vdev) memory_region_transaction_commit(); - /* - * These fields are visible to the IOThread so we rely on implicit barriers - * in aio_context_acquire() on the write side and aio_notify_accept() on - * the read side. - */ - s->starting = false; - vblk->dataplane_started = true; trace_virtio_blk_data_plane_start(s); old_context = blk_get_aio_context(s->conf->conf.blk); @@ -244,6 +237,18 @@ int virtio_blk_data_plane_start(VirtIODevice *vdev) event_notifier_set(virtio_queue_get_host_notifier(vq)); } + /* + * These fields must be visible to the IOThread when it processes the + * virtqueue, otherwise it will think dataplane has not started yet. + * + * Make sure ->dataplane_started is false when blk_set_aio_context() is + * called above so that draining does not cause the host notifier to be + * detached/attached prematurely. + */ + s->starting = false; + vblk->dataplane_started = true; + smp_wmb(); /* paired with aio_notify_accept() on the read side */ + /* Get this show started by hooking up our callbacks */ if (!blk_in_drain(s->conf->conf.blk)) { aio_context_acquire(s->ctx); @@ -273,7 +278,6 @@ int virtio_blk_data_plane_start(VirtIODevice *vdev) fail_guest_notifiers: vblk->dataplane_disabled = true; s->starting = false; - vblk->dataplane_started = true; return -ENOSYS; } @@ -327,6 +331,32 @@ void virtio_blk_data_plane_stop(VirtIODevice *vdev) aio_wait_bh_oneshot(s->ctx, virtio_blk_data_plane_stop_bh, s); } + /* + * Batch all the host notifiers in a single transaction to avoid + * quadratic time complexity in address_space_update_ioeventfds(). + */ + memory_region_transaction_begin(); + + for (i = 0; i < nvqs; i++) { + virtio_bus_set_host_notifier(VIRTIO_BUS(qbus), i, false); + } + + /* + * The transaction expects the ioeventfds to be open when it + * commits. Do it now, before the cleanup loop. + */ + memory_region_transaction_commit(); + + for (i = 0; i < nvqs; i++) { + virtio_bus_cleanup_host_notifier(VIRTIO_BUS(qbus), i); + } + + /* + * Set ->dataplane_started to false before draining so that host notifiers + * are not detached/attached anymore. + */ + vblk->dataplane_started = false; + aio_context_acquire(s->ctx); /* Wait for virtio_blk_dma_restart_bh() and in flight I/O to complete */ @@ -340,32 +370,11 @@ void virtio_blk_data_plane_stop(VirtIODevice *vdev) aio_context_release(s->ctx); - /* - * Batch all the host notifiers in a single transaction to avoid - * quadratic time complexity in address_space_update_ioeventfds(). - */ - memory_region_transaction_begin(); - - for (i = 0; i < nvqs; i++) { - virtio_bus_set_host_notifier(VIRTIO_BUS(qbus), i, false); - } - - /* - * The transaction expects the ioeventfds to be open when it - * commits. Do it now, before the cleanup loop. - */ - memory_region_transaction_commit(); - - for (i = 0; i < nvqs; i++) { - virtio_bus_cleanup_host_notifier(VIRTIO_BUS(qbus), i); - } - qemu_bh_cancel(s->bh); notify_guest_bh(s); /* final chance to notify guest */ /* Clean up guest notifier (irq) */ k->set_guest_notifiers(qbus->parent, nvqs, false); - vblk->dataplane_started = false; s->stopping = false; }
The main loop thread can consume 100% CPU when using --device virtio-blk-pci,iothread=<iothread>. ppoll() constantly returns but reading virtqueue host notifiers fails with EAGAIN. The file descriptors are stale and remain registered with the AioContext because of bugs in the virtio-blk dataplane start/stop code. The problem is that the dataplane start/stop code involves drain operations, which call virtio_blk_drained_begin() and virtio_blk_drained_end() at points where the host notifier is not operational: - In virtio_blk_data_plane_start(), blk_set_aio_context() drains after vblk->dataplane_started has been set to true but the host notifier has not been attached yet. - In virtio_blk_data_plane_stop(), blk_drain() and blk_set_aio_context() drain after the host notifier has already been detached but with vblk->dataplane_started still set to true. I would like to simplify ->ioeventfd_start/stop() to avoid interactions with drain entirely, but couldn't find a way to do that. Instead, this patch accepts the fragile nature of the code and reorders it so that vblk->dataplane_started is false during drain operations. This way the virtio_blk_drained_begin() and virtio_blk_drained_end() calls don't touch the host notifier. The result is that virtio_blk_data_plane_start() and virtio_blk_data_plane_stop() have complete control over the host notifier and stale file descriptors are no longer left in the AioContext. This patch fixes the 100% CPU consumption in the main loop thread and correctly moves host notifier processing to the IOThread. Fixes: 1665d9326fd2 ("virtio-blk: implement BlockDevOps->drained_begin()") Reported-by: Lukáš Doktor <ldoktor@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> --- hw/block/dataplane/virtio-blk.c | 67 +++++++++++++++++++-------------- 1 file changed, 38 insertions(+), 29 deletions(-)