Message ID | 20190611172032.19143-1-lvivier@redhat.com |
---|---|
State | New |
Headers | show |
Series | [RFC] virtio-rng: add a watchdog | expand |
On Tue, 2019-06-11 at 19:20 +0200, Laurent Vivier wrote: > The virtio-rng linux driver can be stuck in virtio_read() on a > wait_for_completion_killable() call if the virtio-rng device in QEMU > doesn't provide data. > > It's a problem, because virtio_read() is called from rng_get_data() > with > reading_mutex() held. The same mutex is taken by > add_early_randomness() > and hwrng_fillfn() and this brings to a hang during the boot sequence > if > the virtio-rng driver is builtin. > Moreover, another lock is taken (rng_mutex) when the hwrng driver > wants to switch the RNG device or the user tries to unplug the > virtio-rng > PCI card, and this can hang too because the virtio-rng driver is only > able > to release the card if the virtio-rng device sends back the virtqueue > element. > > # echo -n virtio_rng.1 > /sys/class/misc/hw_random/rng_current > [ 240.165234] INFO: task kworker/u2:1:34 blocked for more than 120 > seconds. > [ 240.165961] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" > disables this message. > [ 240.166708] kworker/u2:1 D > ffffffffb86b85a8 0 34 2 0x00000000 > [ 240.166714] Workqueue: kacpi_hotplug acpi_hotplug_work_fn > [ 240.166716] ffffa0e8f3c0b890 0000000000000046 ffffa0e8f3c00000 > ffffa0e8f3c0bfd8 > [ 240.166717] ffffa0e8f3c0bfd8 ffffa0e8f3c0bfd8 ffffa0e8f3c00000 > ffffffffb86b85a0 > [ 240.166719] ffffffffb86b85a4 ffffa0e8f3c00000 00000000ffffffff > ffffffffb86b85a8 > [ 240.166720] Call Trace: > [ 240.166725] [<ffffffffb82a61c9>] > schedule_preempt_disabled+0x29/0x70 > [ 240.166727] [<ffffffffb82a40f7>] > __mutex_lock_slowpath+0xc7/0x1d0 > [ 240.166728] [<ffffffffb82a350f>] mutex_lock+0x1f/0x2f > [ 240.166730] [<ffffffffb8022b52>] hwrng_register+0x32/0x1d0 > [ 240.166733] [<ffffffffc07fa149>] virtrng_scan+0x19/0x30 > [virtio_rng] > [ 240.166744] [<ffffffffc03108db>] virtio_dev_probe+0x1eb/0x290 > [virtio] > [ 240.166746] [<ffffffffb803d6e5>] > driver_probe_device+0x145/0x3c0 > ... > > In some case, the QEMU RNG backend is not able to provide data, and > the virtio-rng device is not aware of that: > - with rng-random using /dev/random and no entropy is available, > - with rng-egd started with a socket in "server,nowait" mode and > no daemon connected, > - with rng-egd and an egd daemon that is not providing enough data, > - ... > > To release the locks regularly, this patch adds a watchdog in QEMU > virtio-rng device that sends back to the guest the virtqueue buffer > with a 0 byte payload. This case is expected and correctly managed by > the hwrng core. I'm wondering if it makes more sense to rework the way the kernel driver requests for seeding entropy during probe. The virtio_read call is killable, so it can take signals when initiated by userspace. For the initial probe, specifying a timeout / watchdog in the driver is better. > > Signed-off-by: Laurent Vivier <lvivier@redhat.com> > --- > hw/virtio/virtio-rng.c | 23 +++++++++++++++++++++++ > include/hw/virtio/virtio-rng.h | 1 + > 2 files changed, 24 insertions(+) > > diff --git a/hw/virtio/virtio-rng.c b/hw/virtio/virtio-rng.c > index 30493a258622..173ecd370c0e 100644 > --- a/hw/virtio/virtio-rng.c > +++ b/hw/virtio/virtio-rng.c > @@ -19,6 +19,8 @@ > #include "qom/object_interfaces.h" > #include "trace.h" > > +#define VIRTIO_RNG_WATCHDOG_MS 500 > + > static bool is_guest_ready(VirtIORNG *vrng) > { > VirtIODevice *vdev = VIRTIO_DEVICE(vrng); > @@ -38,6 +40,21 @@ static size_t get_request_size(VirtQueue *vq, > unsigned quota) > return in; > } > > +static void watchdog(void *opaque) > +{ > + VirtIORNG *vrng = opaque; > + VirtIODevice *vdev = VIRTIO_DEVICE(vrng); > + VirtQueueElement *elem; > + > + /* wake up driver */ > + elem = virtqueue_pop(vrng->vq, sizeof(VirtQueueElement)); > + if (!elem) { > + return; > + } > + virtqueue_push(vrng->vq, elem, 0); > + virtio_notify(vdev, vrng->vq); > +} > + > static void virtio_rng_process(VirtIORNG *vrng); > > /* Send data from a char device over to the guest */ > @@ -98,6 +115,9 @@ static void virtio_rng_process(VirtIORNG *vrng) > return; > } > > + timer_mod(vrng->watchdog_timer, > + qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + > VIRTIO_RNG_WATCHDOG_MS); > + > if (vrng->activate_timer) { > timer_mod(vrng->rate_limit_timer, > qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + vrng- > >conf.period_ms); > @@ -222,6 +242,7 @@ static void virtio_rng_device_realize(DeviceState > *dev, Error **errp) > > vrng->vq = virtio_add_queue(vdev, 8, handle_input); > vrng->quota_remaining = vrng->conf.max_bytes; > + vrng->watchdog_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL, > watchdog, vrng); > vrng->rate_limit_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL, > check_rate_limit, > vrng); > vrng->activate_timer = true; > @@ -236,6 +257,8 @@ static void > virtio_rng_device_unrealize(DeviceState *dev, Error **errp) > VirtIORNG *vrng = VIRTIO_RNG(dev); > > qemu_del_vm_change_state_handler(vrng->vmstate); > + timer_del(vrng->watchdog_timer); > + timer_free(vrng->watchdog_timer); > timer_del(vrng->rate_limit_timer); > timer_free(vrng->rate_limit_timer); > virtio_cleanup(vdev); > diff --git a/include/hw/virtio/virtio-rng.h > b/include/hw/virtio/virtio-rng.h > index 922dce7caccf..05d6b0e7d881 100644 > --- a/include/hw/virtio/virtio-rng.h > +++ b/include/hw/virtio/virtio-rng.h > @@ -42,6 +42,7 @@ typedef struct VirtIORNG { > /* We purposefully don't migrate this state. The quota will > reset on the > * destination as a result. Rate limiting is host state, not > guest state. > */ > + QEMUTimer *watchdog_timer; > QEMUTimer *rate_limit_timer; > int64_t quota_remaining; > bool activate_timer;
On 12/06/2019 09:03, Amit Shah wrote: > On Tue, 2019-06-11 at 19:20 +0200, Laurent Vivier wrote: >> The virtio-rng linux driver can be stuck in virtio_read() on a >> wait_for_completion_killable() call if the virtio-rng device in QEMU >> doesn't provide data. >> >> It's a problem, because virtio_read() is called from rng_get_data() >> with >> reading_mutex() held. The same mutex is taken by >> add_early_randomness() >> and hwrng_fillfn() and this brings to a hang during the boot sequence >> if >> the virtio-rng driver is builtin. >> Moreover, another lock is taken (rng_mutex) when the hwrng driver >> wants to switch the RNG device or the user tries to unplug the >> virtio-rng >> PCI card, and this can hang too because the virtio-rng driver is only >> able >> to release the card if the virtio-rng device sends back the virtqueue >> element. >> >> # echo -n virtio_rng.1 > /sys/class/misc/hw_random/rng_current >> [ 240.165234] INFO: task kworker/u2:1:34 blocked for more than 120 >> seconds. >> [ 240.165961] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" >> disables this message. >> [ 240.166708] kworker/u2:1 D >> ffffffffb86b85a8 0 34 2 0x00000000 >> [ 240.166714] Workqueue: kacpi_hotplug acpi_hotplug_work_fn >> [ 240.166716] ffffa0e8f3c0b890 0000000000000046 ffffa0e8f3c00000 >> ffffa0e8f3c0bfd8 >> [ 240.166717] ffffa0e8f3c0bfd8 ffffa0e8f3c0bfd8 ffffa0e8f3c00000 >> ffffffffb86b85a0 >> [ 240.166719] ffffffffb86b85a4 ffffa0e8f3c00000 00000000ffffffff >> ffffffffb86b85a8 >> [ 240.166720] Call Trace: >> [ 240.166725] [<ffffffffb82a61c9>] >> schedule_preempt_disabled+0x29/0x70 >> [ 240.166727] [<ffffffffb82a40f7>] >> __mutex_lock_slowpath+0xc7/0x1d0 >> [ 240.166728] [<ffffffffb82a350f>] mutex_lock+0x1f/0x2f >> [ 240.166730] [<ffffffffb8022b52>] hwrng_register+0x32/0x1d0 >> [ 240.166733] [<ffffffffc07fa149>] virtrng_scan+0x19/0x30 >> [virtio_rng] >> [ 240.166744] [<ffffffffc03108db>] virtio_dev_probe+0x1eb/0x290 >> [virtio] >> [ 240.166746] [<ffffffffb803d6e5>] >> driver_probe_device+0x145/0x3c0 >> ... >> >> In some case, the QEMU RNG backend is not able to provide data, and >> the virtio-rng device is not aware of that: >> - with rng-random using /dev/random and no entropy is available, >> - with rng-egd started with a socket in "server,nowait" mode and >> no daemon connected, >> - with rng-egd and an egd daemon that is not providing enough data, >> - ... >> >> To release the locks regularly, this patch adds a watchdog in QEMU >> virtio-rng device that sends back to the guest the virtqueue buffer >> with a 0 byte payload. This case is expected and correctly managed by >> the hwrng core. > > I'm wondering if it makes more sense to rework the way the kernel > driver requests for seeding entropy during probe. The kernel side was my first angle of attack. I tried first to not block in add_early_randomness(): "hwrng: core - don't block in add_early_randomness()" https://patchwork.kernel.org/patch/10877571/ But I agree with the maintainer, the problem must be fixed at virtio-rng level. > The virtio_read call is killable, so it can take signals when initiated > by userspace. For the initial probe, specifying a timeout / watchdog > in the driver is better. Yes, I think also it's better, I tried to do something like that: --- a/drivers/char/hw_random/virtio-rng.c +++ b/drivers/char/hw_random/virtio-rng.c @@ -77,10 +77,7 @@ static int virtio_read(struct hwrng *rng, void *buf, size_t size, bool wait) register_buffer(vi, buf, size); } - if (!wait) - return 0; - - ret = wait_for_completion_killable(&vi->have_data); + ret = wait_for_completion_timeout(&vi->have_data, wait ? MAX_SCHEDULE_TIMEOUT : HZ); if (ret < 0) return ret; But I have a problem doing the timeout / watchdog at driver level: once the buffer is submitted to the virtqueue, how to cancel it? Is there a way to ask the QEMU device to not process the element in the virtqueue we have stopped to wait for because of the timeout (or for the signal: I don't understand how it works in this case. How it is canceled?)? Thanks, Laurent
On Thu, 2019-06-13 at 10:53 +0200, Laurent Vivier wrote: > On 12/06/2019 09:03, Amit Shah wrote: > > On Tue, 2019-06-11 at 19:20 +0200, Laurent Vivier wrote: > > > The virtio-rng linux driver can be stuck in virtio_read() on a > > > wait_for_completion_killable() call if the virtio-rng device in > > > QEMU > > > doesn't provide data. > > > > > > It's a problem, because virtio_read() is called from > > > rng_get_data() > > > with > > > reading_mutex() held. The same mutex is taken by > > > add_early_randomness() > > > and hwrng_fillfn() and this brings to a hang during the boot > > > sequence > > > if > > > the virtio-rng driver is builtin. > > > Moreover, another lock is taken (rng_mutex) when the hwrng driver > > > wants to switch the RNG device or the user tries to unplug the > > > virtio-rng > > > PCI card, and this can hang too because the virtio-rng driver is > > > only > > > able > > > to release the card if the virtio-rng device sends back the > > > virtqueue > > > element. > > > > > > # echo -n virtio_rng.1 > /sys/class/misc/hw_random/rng_current > > > [ 240.165234] INFO: task kworker/u2:1:34 blocked for more than > > > 120 > > > seconds. > > > [ 240.165961] "echo 0 > > > > /proc/sys/kernel/hung_task_timeout_secs" > > > disables this message. > > > [ 240.166708] kworker/u2:1 D > > > ffffffffb86b85a8 0 34 2 0x00000000 > > > [ 240.166714] Workqueue: kacpi_hotplug acpi_hotplug_work_fn > > > [ 240.166716] ffffa0e8f3c0b890 0000000000000046 > > > ffffa0e8f3c00000 > > > ffffa0e8f3c0bfd8 > > > [ 240.166717] ffffa0e8f3c0bfd8 ffffa0e8f3c0bfd8 > > > ffffa0e8f3c00000 > > > ffffffffb86b85a0 > > > [ 240.166719] ffffffffb86b85a4 ffffa0e8f3c00000 > > > 00000000ffffffff > > > ffffffffb86b85a8 > > > [ 240.166720] Call Trace: > > > [ 240.166725] [<ffffffffb82a61c9>] > > > schedule_preempt_disabled+0x29/0x70 > > > [ 240.166727] [<ffffffffb82a40f7>] > > > __mutex_lock_slowpath+0xc7/0x1d0 > > > [ 240.166728] [<ffffffffb82a350f>] mutex_lock+0x1f/0x2f > > > [ 240.166730] [<ffffffffb8022b52>] hwrng_register+0x32/0x1d0 > > > [ 240.166733] [<ffffffffc07fa149>] virtrng_scan+0x19/0x30 > > > [virtio_rng] > > > [ 240.166744] [<ffffffffc03108db>] > > > virtio_dev_probe+0x1eb/0x290 > > > [virtio] > > > [ 240.166746] [<ffffffffb803d6e5>] > > > driver_probe_device+0x145/0x3c0 > > > ... > > > > > > In some case, the QEMU RNG backend is not able to provide data, > > > and > > > the virtio-rng device is not aware of that: > > > - with rng-random using /dev/random and no entropy is available, > > > - with rng-egd started with a socket in "server,nowait" mode and > > > no daemon connected, > > > - with rng-egd and an egd daemon that is not providing enough > > > data, > > > - ... > > > > > > To release the locks regularly, this patch adds a watchdog in > > > QEMU > > > virtio-rng device that sends back to the guest the virtqueue > > > buffer > > > with a 0 byte payload. This case is expected and correctly > > > managed by > > > the hwrng core. > > > > I'm wondering if it makes more sense to rework the way the kernel > > driver requests for seeding entropy during probe. > > The kernel side was my first angle of attack. > I tried first to not block in add_early_randomness(): > > "hwrng: core - don't block in add_early_randomness()" > https://patchwork.kernel.org/patch/10877571/ > > But I agree with the maintainer, the problem must be fixed at virtio- > rng > level. Yea. How much of this is due to 'rng-egd was not set up properly; the backend did not appear in time' -- ie a setup problem; vs a problem we can expect to recur? The current patch affects *all* requests from the guest. The problem being seen though is just during the driver probe. Is the current patch doing much more than is required? Say the device runs out of randomness to provide after setup - the guest's read calls will just block and remain killable. It's the probe currently that is not killable. Other options to explore are: 1. Ensure setup is ready (ie the device's backend is set up before plugging in the device), so that the device is ready to serve requests as soon as it's plugged in 2. Make the add_early_randomness optional for virtio-rng. (This is a big hammer, and working around bad setup problems.) > > > The virtio_read call is killable, so it can take signals when > > initiated > > by userspace. For the initial probe, specifying a timeout / > > watchdog > > in the driver is better. > > Yes, I think also it's better, I tried to do something like that: > > --- a/drivers/char/hw_random/virtio-rng.c > +++ b/drivers/char/hw_random/virtio-rng.c > @@ -77,10 +77,7 @@ static int virtio_read(struct hwrng *rng, void > *buf, size_t size, bool wait) > register_buffer(vi, buf, size); > } > > - if (!wait) > - return 0; > - > - ret = wait_for_completion_killable(&vi->have_data); > + ret = wait_for_completion_timeout(&vi->have_data, wait ? > MAX_SCHEDULE_TIMEOUT : HZ); > if (ret < 0) > return ret; > > But I have a problem doing the timeout / watchdog at driver level: > once > the buffer is submitted to the virtqueue, how to cancel it? Is there > a > way to ask the QEMU device to not process the element in the > virtqueue > we have stopped to wait for because of the timeout (or for the > signal: I > don't understand how it works in this case. How it is canceled?)? Could be achieved via a control command? But all that sounds quite ugly (and also requires host+guest modification).
On 14/06/2019 09:49, Amit Shah wrote: > On Thu, 2019-06-13 at 10:53 +0200, Laurent Vivier wrote: >> On 12/06/2019 09:03, Amit Shah wrote: >>> On Tue, 2019-06-11 at 19:20 +0200, Laurent Vivier wrote: >>>> The virtio-rng linux driver can be stuck in virtio_read() on a >>>> wait_for_completion_killable() call if the virtio-rng device in >>>> QEMU >>>> doesn't provide data. >>>> >>>> It's a problem, because virtio_read() is called from >>>> rng_get_data() >>>> with >>>> reading_mutex() held. The same mutex is taken by >>>> add_early_randomness() >>>> and hwrng_fillfn() and this brings to a hang during the boot >>>> sequence >>>> if >>>> the virtio-rng driver is builtin. >>>> Moreover, another lock is taken (rng_mutex) when the hwrng driver >>>> wants to switch the RNG device or the user tries to unplug the >>>> virtio-rng >>>> PCI card, and this can hang too because the virtio-rng driver is >>>> only >>>> able >>>> to release the card if the virtio-rng device sends back the >>>> virtqueue >>>> element. >>>> >>>> # echo -n virtio_rng.1 > /sys/class/misc/hw_random/rng_current >>>> [ 240.165234] INFO: task kworker/u2:1:34 blocked for more than >>>> 120 >>>> seconds. >>>> [ 240.165961] "echo 0 > >>>> /proc/sys/kernel/hung_task_timeout_secs" >>>> disables this message. >>>> [ 240.166708] kworker/u2:1 D >>>> ffffffffb86b85a8 0 34 2 0x00000000 >>>> [ 240.166714] Workqueue: kacpi_hotplug acpi_hotplug_work_fn >>>> [ 240.166716] ffffa0e8f3c0b890 0000000000000046 >>>> ffffa0e8f3c00000 >>>> ffffa0e8f3c0bfd8 >>>> [ 240.166717] ffffa0e8f3c0bfd8 ffffa0e8f3c0bfd8 >>>> ffffa0e8f3c00000 >>>> ffffffffb86b85a0 >>>> [ 240.166719] ffffffffb86b85a4 ffffa0e8f3c00000 >>>> 00000000ffffffff >>>> ffffffffb86b85a8 >>>> [ 240.166720] Call Trace: >>>> [ 240.166725] [<ffffffffb82a61c9>] >>>> schedule_preempt_disabled+0x29/0x70 >>>> [ 240.166727] [<ffffffffb82a40f7>] >>>> __mutex_lock_slowpath+0xc7/0x1d0 >>>> [ 240.166728] [<ffffffffb82a350f>] mutex_lock+0x1f/0x2f >>>> [ 240.166730] [<ffffffffb8022b52>] hwrng_register+0x32/0x1d0 >>>> [ 240.166733] [<ffffffffc07fa149>] virtrng_scan+0x19/0x30 >>>> [virtio_rng] >>>> [ 240.166744] [<ffffffffc03108db>] >>>> virtio_dev_probe+0x1eb/0x290 >>>> [virtio] >>>> [ 240.166746] [<ffffffffb803d6e5>] >>>> driver_probe_device+0x145/0x3c0 >>>> ... >>>> >>>> In some case, the QEMU RNG backend is not able to provide data, >>>> and >>>> the virtio-rng device is not aware of that: >>>> - with rng-random using /dev/random and no entropy is available, >>>> - with rng-egd started with a socket in "server,nowait" mode and >>>> no daemon connected, >>>> - with rng-egd and an egd daemon that is not providing enough >>>> data, >>>> - ... >>>> >>>> To release the locks regularly, this patch adds a watchdog in >>>> QEMU >>>> virtio-rng device that sends back to the guest the virtqueue >>>> buffer >>>> with a 0 byte payload. This case is expected and correctly >>>> managed by >>>> the hwrng core. >>> >>> I'm wondering if it makes more sense to rework the way the kernel >>> driver requests for seeding entropy during probe. >> >> The kernel side was my first angle of attack. >> I tried first to not block in add_early_randomness(): >> >> "hwrng: core - don't block in add_early_randomness()" >> https://patchwork.kernel.org/patch/10877571/ >> >> But I agree with the maintainer, the problem must be fixed at virtio- >> rng >> level. > > Yea. > > How much of this is due to 'rng-egd was not set up properly; the > backend did not appear in time' -- ie a setup problem; vs a problem we > can expect to recur? I agree there is a configuration issue here, but not only. Imagine the egd daemon on host hangs, is it normal the guest hangs (in the kernel) too? It can also happen with rng-random on a host with not enough entropy. Imagine several guests on the same host that require entropy and there is not enough entropy for all. > The current patch affects *all* requests from the guest. The problem > being seen though is just during the driver probe. Is the current Not only during the device probe. It happens also when we try to unplug the virtio-rng card or when we try to switch rng backend in the guest kernel. > patch doing much more than is required? Say the device runs out of > randomness to provide after setup - the guest's read calls will just > block and remain killable. It's the probe currently that is not > killable. I don't think to be killable is the solution of the problem: who will kill it? How to know we need to kill something? Is this normal to have to explicitly kill a process when we want to unplug a card? > Other options to explore are: > > 1. Ensure setup is ready (ie the device's backend is set up before > plugging in the device), so that the device is ready to serve requests > as soon as it's plugged in A way I have explored is to not set ready the virtio-rng device in QEMU if the backend is not ready (this should prevent the driver to be started in the guest), but this only fixes a problem at startup. It doesn't fix the problem if the EGD daemon on host hangs later or if there is no enough entropy on the host (with EGD or /dev/random). > 2. Make the add_early_randomness optional for virtio-rng. (This is a > big hammer, and working around bad setup problems.) I've been tempted to remove it, but I don't remove code I don't understand: why do we need the add_early_randomness() in first place? >>> The virtio_read call is killable, so it can take signals when >>> initiated >>> by userspace. For the initial probe, specifying a timeout / >>> watchdog >>> in the driver is better. >> ... >> But I have a problem doing the timeout / watchdog at driver level: >> once >> the buffer is submitted to the virtqueue, how to cancel it? Is there >> a >> way to ask the QEMU device to not process the element in the >> virtqueue >> we have stopped to wait for because of the timeout (or for the >> signal: I >> don't understand how it works in this case. How it is canceled?)? > > Could be achieved via a control command? But all that sounds quite> ugly (and also requires host+guest modification). > Adding a control command changes the specification of the device, can we? We need to add a queue, I guess? So I think there is also a problem with the "killable" approach. Using a watchdog in the QEMU device allows to remove the killable option and to avoid to have a lost request in the queue, without changing the specs, as the kernel rng core is able to manage 0 length data payload. Thanks, Laurent
diff --git a/hw/virtio/virtio-rng.c b/hw/virtio/virtio-rng.c index 30493a258622..173ecd370c0e 100644 --- a/hw/virtio/virtio-rng.c +++ b/hw/virtio/virtio-rng.c @@ -19,6 +19,8 @@ #include "qom/object_interfaces.h" #include "trace.h" +#define VIRTIO_RNG_WATCHDOG_MS 500 + static bool is_guest_ready(VirtIORNG *vrng) { VirtIODevice *vdev = VIRTIO_DEVICE(vrng); @@ -38,6 +40,21 @@ static size_t get_request_size(VirtQueue *vq, unsigned quota) return in; } +static void watchdog(void *opaque) +{ + VirtIORNG *vrng = opaque; + VirtIODevice *vdev = VIRTIO_DEVICE(vrng); + VirtQueueElement *elem; + + /* wake up driver */ + elem = virtqueue_pop(vrng->vq, sizeof(VirtQueueElement)); + if (!elem) { + return; + } + virtqueue_push(vrng->vq, elem, 0); + virtio_notify(vdev, vrng->vq); +} + static void virtio_rng_process(VirtIORNG *vrng); /* Send data from a char device over to the guest */ @@ -98,6 +115,9 @@ static void virtio_rng_process(VirtIORNG *vrng) return; } + timer_mod(vrng->watchdog_timer, + qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + VIRTIO_RNG_WATCHDOG_MS); + if (vrng->activate_timer) { timer_mod(vrng->rate_limit_timer, qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) + vrng->conf.period_ms); @@ -222,6 +242,7 @@ static void virtio_rng_device_realize(DeviceState *dev, Error **errp) vrng->vq = virtio_add_queue(vdev, 8, handle_input); vrng->quota_remaining = vrng->conf.max_bytes; + vrng->watchdog_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL, watchdog, vrng); vrng->rate_limit_timer = timer_new_ms(QEMU_CLOCK_VIRTUAL, check_rate_limit, vrng); vrng->activate_timer = true; @@ -236,6 +257,8 @@ static void virtio_rng_device_unrealize(DeviceState *dev, Error **errp) VirtIORNG *vrng = VIRTIO_RNG(dev); qemu_del_vm_change_state_handler(vrng->vmstate); + timer_del(vrng->watchdog_timer); + timer_free(vrng->watchdog_timer); timer_del(vrng->rate_limit_timer); timer_free(vrng->rate_limit_timer); virtio_cleanup(vdev); diff --git a/include/hw/virtio/virtio-rng.h b/include/hw/virtio/virtio-rng.h index 922dce7caccf..05d6b0e7d881 100644 --- a/include/hw/virtio/virtio-rng.h +++ b/include/hw/virtio/virtio-rng.h @@ -42,6 +42,7 @@ typedef struct VirtIORNG { /* We purposefully don't migrate this state. The quota will reset on the * destination as a result. Rate limiting is host state, not guest state. */ + QEMUTimer *watchdog_timer; QEMUTimer *rate_limit_timer; int64_t quota_remaining; bool activate_timer;
The virtio-rng linux driver can be stuck in virtio_read() on a wait_for_completion_killable() call if the virtio-rng device in QEMU doesn't provide data. It's a problem, because virtio_read() is called from rng_get_data() with reading_mutex() held. The same mutex is taken by add_early_randomness() and hwrng_fillfn() and this brings to a hang during the boot sequence if the virtio-rng driver is builtin. Moreover, another lock is taken (rng_mutex) when the hwrng driver wants to switch the RNG device or the user tries to unplug the virtio-rng PCI card, and this can hang too because the virtio-rng driver is only able to release the card if the virtio-rng device sends back the virtqueue element. # echo -n virtio_rng.1 > /sys/class/misc/hw_random/rng_current [ 240.165234] INFO: task kworker/u2:1:34 blocked for more than 120 seconds. [ 240.165961] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 240.166708] kworker/u2:1 D ffffffffb86b85a8 0 34 2 0x00000000 [ 240.166714] Workqueue: kacpi_hotplug acpi_hotplug_work_fn [ 240.166716] ffffa0e8f3c0b890 0000000000000046 ffffa0e8f3c00000 ffffa0e8f3c0bfd8 [ 240.166717] ffffa0e8f3c0bfd8 ffffa0e8f3c0bfd8 ffffa0e8f3c00000 ffffffffb86b85a0 [ 240.166719] ffffffffb86b85a4 ffffa0e8f3c00000 00000000ffffffff ffffffffb86b85a8 [ 240.166720] Call Trace: [ 240.166725] [<ffffffffb82a61c9>] schedule_preempt_disabled+0x29/0x70 [ 240.166727] [<ffffffffb82a40f7>] __mutex_lock_slowpath+0xc7/0x1d0 [ 240.166728] [<ffffffffb82a350f>] mutex_lock+0x1f/0x2f [ 240.166730] [<ffffffffb8022b52>] hwrng_register+0x32/0x1d0 [ 240.166733] [<ffffffffc07fa149>] virtrng_scan+0x19/0x30 [virtio_rng] [ 240.166744] [<ffffffffc03108db>] virtio_dev_probe+0x1eb/0x290 [virtio] [ 240.166746] [<ffffffffb803d6e5>] driver_probe_device+0x145/0x3c0 ... In some case, the QEMU RNG backend is not able to provide data, and the virtio-rng device is not aware of that: - with rng-random using /dev/random and no entropy is available, - with rng-egd started with a socket in "server,nowait" mode and no daemon connected, - with rng-egd and an egd daemon that is not providing enough data, - ... To release the locks regularly, this patch adds a watchdog in QEMU virtio-rng device that sends back to the guest the virtqueue buffer with a 0 byte payload. This case is expected and correctly managed by the hwrng core. Signed-off-by: Laurent Vivier <lvivier@redhat.com> --- hw/virtio/virtio-rng.c | 23 +++++++++++++++++++++++ include/hw/virtio/virtio-rng.h | 1 + 2 files changed, 24 insertions(+)