diff mbox series

virtio: Always reset vhost devices

Message ID 20240710112310.316551-1-hreitz@redhat.com
State New
Headers show
Series virtio: Always reset vhost devices | expand

Commit Message

Hanna Czenczek July 10, 2024, 11:23 a.m. UTC
Requiring `vhost_started` to be true for resetting vhost devices in
`virtio_reset()` seems like the wrong condition: Most importantly, the
preceding `virtio_set_status(vdev, 0)` call will (for vhost devices) end
up in `vhost_dev_stop()` (through vhost devices' `.set_status`
implementations), setting `vdev->vhost_started = false`.  Therefore, the
gated `vhost_reset_device()` call is unreachable.

`vhost_started` is not documented, so it is hard to say what exactly it
is supposed to mean, but judging from the fact that `vhost_dev_start()`
sets it and `vhost_dev_stop()` clears it, it seems like it indicates
whether there is a vhost back-end, and whether that back-end is
currently running and processing virtio requests.

Making a reset conditional on whether the vhost back-end is processing
virtio requests seems wrong; in fact, it is probably better to reset it
only when it is not currently processing requests, which is exactly the
current order of operations in `virtio_reset()`: First, the back-end is
stopped through `virtio_set_status(vdev, 0)`, then we want to send a
reset.

Therefore, we should drop the `vhost_started` condition, but in its
stead we then have to verify that we can indeed send a reset to this
vhost device, by not just checking `k->get_vhost != NULL` (introduced by
commit 95e1019a4a9), but also that the vhost back-end is connected
(`hdev = k->get_vhost(); hdev != NULL && hdev->vhost_ops != NULL`).

Signed-off-by: Hanna Czenczek <hreitz@redhat.com>
---
 hw/virtio/virtio.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

Comments

Matias Ezequiel Vara Larsen July 10, 2024, 1:39 p.m. UTC | #1
Hello Hanna,

On Wed, Jul 10, 2024 at 01:23:10PM +0200, Hanna Czenczek wrote:
> Requiring `vhost_started` to be true for resetting vhost devices in
> `virtio_reset()` seems like the wrong condition: Most importantly, the
> preceding `virtio_set_status(vdev, 0)` call will (for vhost devices) end
> up in `vhost_dev_stop()` (through vhost devices' `.set_status`
> implementations), setting `vdev->vhost_started = false`.  Therefore, the
> gated `vhost_reset_device()` call is unreachable.
> 
> `vhost_started` is not documented, so it is hard to say what exactly it
> is supposed to mean, but judging from the fact that `vhost_dev_start()`
> sets it and `vhost_dev_stop()` clears it, it seems like it indicates
> whether there is a vhost back-end, and whether that back-end is
> currently running and processing virtio requests.
> 
> Making a reset conditional on whether the vhost back-end is processing
> virtio requests seems wrong; in fact, it is probably better to reset it
> only when it is not currently processing requests, which is exactly the
> current order of operations in `virtio_reset()`: First, the back-end is
> stopped through `virtio_set_status(vdev, 0)`, then we want to send a
> reset.
> 
> Therefore, we should drop the `vhost_started` condition, but in its
> stead we then have to verify that we can indeed send a reset to this
> vhost device, by not just checking `k->get_vhost != NULL` (introduced by
> commit 95e1019a4a9), but also that the vhost back-end is connected
> (`hdev = k->get_vhost(); hdev != NULL && hdev->vhost_ops != NULL`).
> 

I am not a native speaker but I think the commit message could be
rephrased to make it clear. This is a minor comment so feel free to
ignore it:

Before `virtio_reset()` is invoked, `virtio_set_status(vdev, 0)` for
vhost devices ends up setting `vhost_stared` to false thus resulting in
`vhost_reset_device()` being never reached during a reset.

`vhost_started` is not documented, however, from `vhost_dev_start()` and
`vhost_dev_stop()`, it seems indicating that there is a vhost back-end
and that the back-end is running and processing virtio requests.
Resetting should not rely on whether the vhost back-end is processing
virtio requests, instead, reset must happen if there is no processing
request. This is what `virtio_reset()` does, it first stops the backend
and then sends the reset.

To trigger a reset, this commit drops the `vhost_started` condition and
checks that the device is running (introduced by 95e1019a4a9) but also
that the vhost back-end is connected so it is possible to send a reset
to the vhost device.

Matias
Stefan Hajnoczi July 10, 2024, 4:28 p.m. UTC | #2
On Wed, 10 Jul 2024 at 13:25, Hanna Czenczek <hreitz@redhat.com> wrote:
>
> Requiring `vhost_started` to be true for resetting vhost devices in
> `virtio_reset()` seems like the wrong condition: Most importantly, the
> preceding `virtio_set_status(vdev, 0)` call will (for vhost devices) end
> up in `vhost_dev_stop()` (through vhost devices' `.set_status`
> implementations), setting `vdev->vhost_started = false`.  Therefore, the
> gated `vhost_reset_device()` call is unreachable.
>
> `vhost_started` is not documented, so it is hard to say what exactly it
> is supposed to mean, but judging from the fact that `vhost_dev_start()`
> sets it and `vhost_dev_stop()` clears it, it seems like it indicates
> whether there is a vhost back-end, and whether that back-end is
> currently running and processing virtio requests.
>
> Making a reset conditional on whether the vhost back-end is processing
> virtio requests seems wrong; in fact, it is probably better to reset it
> only when it is not currently processing requests, which is exactly the
> current order of operations in `virtio_reset()`: First, the back-end is
> stopped through `virtio_set_status(vdev, 0)`, then we want to send a
> reset.
>
> Therefore, we should drop the `vhost_started` condition, but in its
> stead we then have to verify that we can indeed send a reset to this
> vhost device, by not just checking `k->get_vhost != NULL` (introduced by
> commit 95e1019a4a9), but also that the vhost back-end is connected
> (`hdev = k->get_vhost(); hdev != NULL && hdev->vhost_ops != NULL`).
>
> Signed-off-by: Hanna Czenczek <hreitz@redhat.com>

I think an additional SET_STATUS 0 call is made to the vDPA vhost
backend after this patch, but that seems fine.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>

> ---
>  hw/virtio/virtio.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
> index 893a072c9d..4410d62126 100644
> --- a/hw/virtio/virtio.c
> +++ b/hw/virtio/virtio.c
> @@ -2146,8 +2146,12 @@ void virtio_reset(void *opaque)
>          vdev->device_endian = virtio_default_endian();
>      }
>
> -    if (vdev->vhost_started && k->get_vhost) {
> -        vhost_reset_device(k->get_vhost(vdev));
> +    if (k->get_vhost) {
> +        struct vhost_dev *hdev = k->get_vhost(vdev);
> +        /* Only reset when vhost back-end is connected */
> +        if (hdev && hdev->vhost_ops) {
> +            vhost_reset_device(hdev);
> +        }
>      }
>
>      if (k->reset) {
> --
> 2.45.2
>
>
Hanna Czenczek July 11, 2024, 11:43 a.m. UTC | #3
On 10.07.24 15:39, Matias Ezequiel Vara Larsen wrote:
> Hello Hanna,
>
> On Wed, Jul 10, 2024 at 01:23:10PM +0200, Hanna Czenczek wrote:
>> Requiring `vhost_started` to be true for resetting vhost devices in
>> `virtio_reset()` seems like the wrong condition: Most importantly, the
>> preceding `virtio_set_status(vdev, 0)` call will (for vhost devices) end
>> up in `vhost_dev_stop()` (through vhost devices' `.set_status`
>> implementations), setting `vdev->vhost_started = false`.  Therefore, the
>> gated `vhost_reset_device()` call is unreachable.
>>
>> `vhost_started` is not documented, so it is hard to say what exactly it
>> is supposed to mean, but judging from the fact that `vhost_dev_start()`
>> sets it and `vhost_dev_stop()` clears it, it seems like it indicates
>> whether there is a vhost back-end, and whether that back-end is
>> currently running and processing virtio requests.
>>
>> Making a reset conditional on whether the vhost back-end is processing
>> virtio requests seems wrong; in fact, it is probably better to reset it
>> only when it is not currently processing requests, which is exactly the
>> current order of operations in `virtio_reset()`: First, the back-end is
>> stopped through `virtio_set_status(vdev, 0)`, then we want to send a
>> reset.
>>
>> Therefore, we should drop the `vhost_started` condition, but in its
>> stead we then have to verify that we can indeed send a reset to this
>> vhost device, by not just checking `k->get_vhost != NULL` (introduced by
>> commit 95e1019a4a9), but also that the vhost back-end is connected
>> (`hdev = k->get_vhost(); hdev != NULL && hdev->vhost_ops != NULL`).
>>
> I am not a native speaker but I think the commit message could be
> rephrased to make it clear. This is a minor comment so feel free to
> ignore it:

Thanks!  I don’t mind restructuring, I just wanted to be verbose on 
what’s happening.  Honestly, I’m also not 100 % sure about my reasoning, 
which is why I put a lot of indefinite fluff in there, i.e. purposefully 
was not clear.

I wonder what exactly you find to be the problem with it, though? Too 
much fluff, too much detail, weird sentence structure?

> Before `virtio_reset()` is invoked, `virtio_set_status(vdev, 0)` for
> vhost devices ends up setting `vhost_stared` to false thus resulting in
> `vhost_reset_device()` being never reached during a reset.

It’s before vhost_reset_device(), not virtio_reset() (all of this is in 
virtio_reset()).  Also, the “`virtio_set_status()` for vhost devices” 
sounds a bit strange and too detail-less to me; (A) it makes it sound a 
bit like it’d be only called for vhost devices (easily solvable by 
moving “for vhost devices” back after ”false”), and (B) I would like to 
have the detail in there how vhost_started is cleared by the set_status 
call, because it doesn’t seem obvious to me.  Also, I find the “never 
reached during a reset” strangely abstract; we can be very concrete 
here, so why not be?

Maybe

When a vhost device is reset, we want `virtio_reset()` to call 
`vhost_reset_device()`.  However, `virtio_reset()` begins with 
`virtio_set_status(vdev, 0)`, which, for vhost devices, will end up in 
`vhost_dev_stop()`, clearing `vhost_started`.  This makes the subsequent 
`vhost_reset_device()` call unreachable.

> `vhost_started` is not documented, however, from `vhost_dev_start()` and
> `vhost_dev_stop()`, it seems indicating that there is a vhost back-end
> and that the back-end is running and processing virtio requests.
> Resetting should not rely on whether the vhost back-end is processing
> virtio requests, instead, reset must happen if there is no processing
> request. This is what `virtio_reset()` does, it first stops the backend
> and then sends the reset.

I find this unclear on how vhost_dev_start()/stop() indicate what 
vhost_started means (it’s because they set/clear it).  I cannot stand 
behind “reset must happen if there is no processing request”, because I 
don’t know for sure.  I just feel like it makes more sense this way, 
hence me saying “probably”.  I miss context to “it stops the backend”, I 
would like to be clear that it’s through virtio_set_status(0) that the 
back-end is stopped.  I find it important to be concrete here because I 
asked myself whether e could pull vhost_reset_device() up before 
virtio_set_status(0), and this paragraph is intended to convey that that 
would be wrong; I want to be clear that the concrete call order in that 
function makes sense to me.

I also miss a connection to why this paragraph is focussing on 
vhost_started now.  Maybe I’m a bit slow at making connections, but in 
this rephrased version, it is not 100 % clear that vhost_started is a 
condition for vhost_reset_device() (unless one looks at the code 
simultaneously), so its importance and why we need to know what it does 
isn’t clear here.  The very start of the original message however did 
stress that the vhost_started condition seems wrong, so it’s clear why 
it’s important at this point.

So maybe

Requiring `vhost_started` as a precondition for `vhost_reset_device()` 
is also semantically questionable: While undocumented, it being 
set/cleared by `vhost_dev_{start,stop}()` indicates that it shows 
whether there is a vhost back-end, and whether that back-end is running 
and processing virtio requests. The latter should not be a condition for 
doing a reset, on the contrary: A reset should preferably happen when 
the back-end is stopped.  Thus, the order that we have in 
`virtio_reset()` (stop the back-end via `virtio_set_status(vdev, 0)`, 
then reset it) makes perfect sense, but checking `vhost_started` does not.

> To trigger a reset, this commit drops the `vhost_started` condition and
> checks that the device is running (introduced by 95e1019a4a9) but also
> that the vhost back-end is connected so it is possible to send a reset
> to the vhost device.

This seems incorrect: First, we decidedly no longer check that the 
device is running, that was the point of the previous paragraph. Second, 
no such check has been introduced by commit 95e1019a4a9. That commit 
just introduced a check of `k->get_vhost != NULL`, i.e. whether this 
could even theoretically be a vhost device.

So I’d just keep the final paragraph as-is from the original patch. It 
isn’t much longer, just quotes more code.

Hanna
Hanna Czenczek July 11, 2024, 11:46 a.m. UTC | #4
On 10.07.24 18:28, Stefan Hajnoczi wrote:
> On Wed, 10 Jul 2024 at 13:25, Hanna Czenczek<hreitz@redhat.com>  wrote:
>> Requiring `vhost_started` to be true for resetting vhost devices in
>> `virtio_reset()` seems like the wrong condition: Most importantly, the
>> preceding `virtio_set_status(vdev, 0)` call will (for vhost devices) end
>> up in `vhost_dev_stop()` (through vhost devices' `.set_status`
>> implementations), setting `vdev->vhost_started = false`.  Therefore, the
>> gated `vhost_reset_device()` call is unreachable.
>>
>> `vhost_started` is not documented, so it is hard to say what exactly it
>> is supposed to mean, but judging from the fact that `vhost_dev_start()`
>> sets it and `vhost_dev_stop()` clears it, it seems like it indicates
>> whether there is a vhost back-end, and whether that back-end is
>> currently running and processing virtio requests.
>>
>> Making a reset conditional on whether the vhost back-end is processing
>> virtio requests seems wrong; in fact, it is probably better to reset it
>> only when it is not currently processing requests, which is exactly the
>> current order of operations in `virtio_reset()`: First, the back-end is
>> stopped through `virtio_set_status(vdev, 0)`, then we want to send a
>> reset.
>>
>> Therefore, we should drop the `vhost_started` condition, but in its
>> stead we then have to verify that we can indeed send a reset to this
>> vhost device, by not just checking `k->get_vhost != NULL` (introduced by
>> commit 95e1019a4a9), but also that the vhost back-end is connected
>> (`hdev = k->get_vhost(); hdev != NULL && hdev->vhost_ops != NULL`).
>>
>> Signed-off-by: Hanna Czenczek<hreitz@redhat.com>
> I think an additional SET_STATUS 0 call is made to the vDPA vhost
> backend after this patch, but that seems fine.
>
> Reviewed-by: Stefan Hajnoczi<stefanha@redhat.com>

Thanks!  I agree that double-sending SET_STATUS with the same value 
should be fine.The virtio specification states: “The device status field 
starts out as 0, and is reinitialized to 0 by the device during reset.” 
– I interpret that to mean that (re-)setting the field to 0 is always 
OK. Hanna
>> ---
>>   hw/virtio/virtio.c | 8 ++++++--
>>   1 file changed, 6 insertions(+), 2 deletions(-)
>>
>> diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
>> index 893a072c9d..4410d62126 100644
>> --- a/hw/virtio/virtio.c
>> +++ b/hw/virtio/virtio.c
>> @@ -2146,8 +2146,12 @@ void virtio_reset(void *opaque)
>>           vdev->device_endian = virtio_default_endian();
>>       }
>>
>> -    if (vdev->vhost_started && k->get_vhost) {
>> -        vhost_reset_device(k->get_vhost(vdev));
>> +    if (k->get_vhost) {
>> +        struct vhost_dev *hdev = k->get_vhost(vdev);
>> +        /* Only reset when vhost back-end is connected */
>> +        if (hdev && hdev->vhost_ops) {
>> +            vhost_reset_device(hdev);
>> +        }
>>       }
>>
>>       if (k->reset) {
>> --
>> 2.45.2
>>
>>
diff mbox series

Patch

diff --git a/hw/virtio/virtio.c b/hw/virtio/virtio.c
index 893a072c9d..4410d62126 100644
--- a/hw/virtio/virtio.c
+++ b/hw/virtio/virtio.c
@@ -2146,8 +2146,12 @@  void virtio_reset(void *opaque)
         vdev->device_endian = virtio_default_endian();
     }
 
-    if (vdev->vhost_started && k->get_vhost) {
-        vhost_reset_device(k->get_vhost(vdev));
+    if (k->get_vhost) {
+        struct vhost_dev *hdev = k->get_vhost(vdev);
+        /* Only reset when vhost back-end is connected */
+        if (hdev && hdev->vhost_ops) {
+            vhost_reset_device(hdev);
+        }
     }
 
     if (k->reset) {