Message ID | 20230711155230.64277-7-hreitz@redhat.com |
---|---|
State | New |
Headers | show |
Series | vhost-user: Add suspend/resume | expand |
On Tue, Jul 11, 2023 at 05:52:28PM +0200, Hanna Czenczek wrote: > The only user of vhost_user_reset_status() is vhost_dev_stop(), which > only uses it as a fall-back to stop the back-end if it does not support > SUSPEND. However, vhost-user's implementation is a no-op unless the > back-end supports SET_STATUS. > > vhost-vdpa's implementation instead just calls > vhost_vdpa_reset_device(), implying that it's OK to fully reset the > device if SET_STATUS is not supported. > > To be fair, vhost_vdpa_reset_device() does nothing but to set the status > to zero. However, that may well be because vhost-vdpa has no method > besides this to reset a device. In contrast, vhost-user has > RESET_DEVICE and a RESET_OWNER, which can be used instead. > > While it is not entirely clear from documentation or git logs, from > discussions and the order of vhost-user protocol features, it appears to > me as if RESET_OWNER originally had no real meaning for vhost-user, and > was thus used to signal a device reset to the back-end. Then, > RESET_DEVICE was introduced, to have a well-defined dedicated reset > command. Finally, vhost-user received full STATUS support, including > SET_STATUS, so setting the device status to 0 is now the preferred way > of resetting a device. Still, RESET_DEVICE and RESET_OWNER should > remain valid as fall-backs. > > Therefore, have vhost_user_reset_status() fall back to > vhost_user_reset_device() if the back-end has no STATUS support. > > Signed-off-by: Hanna Czenczek <hreitz@redhat.com> > --- > hw/virtio/vhost-user.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c > index 4507de5a92..53a881ec2a 100644 > --- a/hw/virtio/vhost-user.c > +++ b/hw/virtio/vhost-user.c > @@ -2833,6 +2833,8 @@ static void vhost_user_reset_status(struct vhost_dev *dev) > if (virtio_has_feature(dev->protocol_features, > VHOST_USER_PROTOCOL_F_STATUS)) { > vhost_user_set_status(dev, 0); > + } else { > + vhost_user_reset_device(dev); > } > } Did you check whether DPDK treats setting the status to 0 as equivalent to RESET_DEVICE? My understanding is that SET_STATUS is mostly ignored by vhost-user back-ends today. Even those that implement it may not treat SET_STATUS 0 as equivalent to RESET_DEVICE. If you decide it's safe to make this change, please also update vhost-user.rst to document that front-ends should use SET_STATUS 0, RESET_DEVICE, and RESET_OWNER in order of preference. Stefan
On 18.07.23 17:10, Stefan Hajnoczi wrote: > On Tue, Jul 11, 2023 at 05:52:28PM +0200, Hanna Czenczek wrote: >> The only user of vhost_user_reset_status() is vhost_dev_stop(), which >> only uses it as a fall-back to stop the back-end if it does not support >> SUSPEND. However, vhost-user's implementation is a no-op unless the >> back-end supports SET_STATUS. >> >> vhost-vdpa's implementation instead just calls >> vhost_vdpa_reset_device(), implying that it's OK to fully reset the >> device if SET_STATUS is not supported. >> >> To be fair, vhost_vdpa_reset_device() does nothing but to set the status >> to zero. However, that may well be because vhost-vdpa has no method >> besides this to reset a device. In contrast, vhost-user has >> RESET_DEVICE and a RESET_OWNER, which can be used instead. >> >> While it is not entirely clear from documentation or git logs, from >> discussions and the order of vhost-user protocol features, it appears to >> me as if RESET_OWNER originally had no real meaning for vhost-user, and >> was thus used to signal a device reset to the back-end. Then, >> RESET_DEVICE was introduced, to have a well-defined dedicated reset >> command. Finally, vhost-user received full STATUS support, including >> SET_STATUS, so setting the device status to 0 is now the preferred way >> of resetting a device. Still, RESET_DEVICE and RESET_OWNER should >> remain valid as fall-backs. >> >> Therefore, have vhost_user_reset_status() fall back to >> vhost_user_reset_device() if the back-end has no STATUS support. >> >> Signed-off-by: Hanna Czenczek <hreitz@redhat.com> >> --- >> hw/virtio/vhost-user.c | 2 ++ >> 1 file changed, 2 insertions(+) >> >> diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c >> index 4507de5a92..53a881ec2a 100644 >> --- a/hw/virtio/vhost-user.c >> +++ b/hw/virtio/vhost-user.c >> @@ -2833,6 +2833,8 @@ static void vhost_user_reset_status(struct vhost_dev *dev) >> if (virtio_has_feature(dev->protocol_features, >> VHOST_USER_PROTOCOL_F_STATUS)) { >> vhost_user_set_status(dev, 0); >> + } else { >> + vhost_user_reset_device(dev); >> } >> } > Did you check whether DPDK treats setting the status to 0 as equivalent > to RESET_DEVICE? If it doesn’t, what’s even the point of using reset_status? I will investigate, but if there’s a difference, that makes the whole reset_* thing even more questionable to me than it has already been so far. Hanna > My understanding is that SET_STATUS is mostly ignored by vhost-user > back-ends today. Even those that implement it may not treat SET_STATUS 0 > as equivalent to RESET_DEVICE. > > If you decide it's safe to make this change, please also update > vhost-user.rst to document that front-ends should use SET_STATUS 0, > RESET_DEVICE, and RESET_OWNER in order of preference. > > Stefan
On 19.07.23 16:11, Hanna Czenczek wrote: > On 18.07.23 17:10, Stefan Hajnoczi wrote: >> On Tue, Jul 11, 2023 at 05:52:28PM +0200, Hanna Czenczek wrote: >>> The only user of vhost_user_reset_status() is vhost_dev_stop(), which >>> only uses it as a fall-back to stop the back-end if it does not support >>> SUSPEND. However, vhost-user's implementation is a no-op unless the >>> back-end supports SET_STATUS. >>> >>> vhost-vdpa's implementation instead just calls >>> vhost_vdpa_reset_device(), implying that it's OK to fully reset the >>> device if SET_STATUS is not supported. >>> >>> To be fair, vhost_vdpa_reset_device() does nothing but to set the >>> status >>> to zero. However, that may well be because vhost-vdpa has no method >>> besides this to reset a device. In contrast, vhost-user has >>> RESET_DEVICE and a RESET_OWNER, which can be used instead. >>> >>> While it is not entirely clear from documentation or git logs, from >>> discussions and the order of vhost-user protocol features, it >>> appears to >>> me as if RESET_OWNER originally had no real meaning for vhost-user, and >>> was thus used to signal a device reset to the back-end. Then, >>> RESET_DEVICE was introduced, to have a well-defined dedicated reset >>> command. Finally, vhost-user received full STATUS support, including >>> SET_STATUS, so setting the device status to 0 is now the preferred way >>> of resetting a device. Still, RESET_DEVICE and RESET_OWNER should >>> remain valid as fall-backs. >>> >>> Therefore, have vhost_user_reset_status() fall back to >>> vhost_user_reset_device() if the back-end has no STATUS support. >>> >>> Signed-off-by: Hanna Czenczek <hreitz@redhat.com> >>> --- >>> hw/virtio/vhost-user.c | 2 ++ >>> 1 file changed, 2 insertions(+) >>> >>> diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c >>> index 4507de5a92..53a881ec2a 100644 >>> --- a/hw/virtio/vhost-user.c >>> +++ b/hw/virtio/vhost-user.c >>> @@ -2833,6 +2833,8 @@ static void vhost_user_reset_status(struct >>> vhost_dev *dev) >>> if (virtio_has_feature(dev->protocol_features, >>> VHOST_USER_PROTOCOL_F_STATUS)) { >>> vhost_user_set_status(dev, 0); >>> + } else { >>> + vhost_user_reset_device(dev); >>> } >>> } >> Did you check whether DPDK treats setting the status to 0 as equivalent >> to RESET_DEVICE? > > If it doesn’t, what’s even the point of using reset_status? Sorry, I’m being unclear, and I think this may be important because it ties into the question from patch 1, what qemu is even trying to do by running SET_STATUS(0) vhost_dev_stop(), so here’s what gave me the impression that SET_STATUS(0) and RESET_DEVICE should be equivalent: vhost-vdpa.c runs SET_STATUS(0) in a function called vhost_vdpa_reset_device(). This is one thing that gave me the impression that this is about an actual full reset. Another is the whole discussion that we’ve had. vhost_dev_stop() does not call a `vhost_reset_device()` function, it calls `vhost_reset_status()`. Still, we were always talking about resetting the device. It doesn’t make sense to me that vDPA would provide no function to fully reset a device, while vhost-user does. Being able to reset a device sounds vital to me. This also gave me the impression that SET_STATUS(0) on vDPA at least is functionally equivalent to a full device reset. Maybe SET_STATUS(0) does mean a full device reset on vDPA, but not on vhost-user. That would be a real shame, so I assumed this would not be the case; that SET_STATUS(0) does the same thing on both protocols. The virtio specification says “Writing 0 into this field resets the device.” about the device_status field. This also makes sense, because the device_status field is basically used to tell the device that a driver has taken control. If reset, this indicates the driver has given up control, and to me this is a point where a device should fully reset itself. So all in all, I can’t see the rationale why any implementation that supports SET_STATUS would decide to treat SET_STATUS(0) not as equivalent or a superset of RESET_DEVICE. I may be wrong, and this might explain a whole deal about what kind of background operations we hope to stop with SET_STATUS(0). Hanna
On Wed, Jul 19, 2023 at 04:27:58PM +0200, Hanna Czenczek wrote: > On 19.07.23 16:11, Hanna Czenczek wrote: > > On 18.07.23 17:10, Stefan Hajnoczi wrote: > > > On Tue, Jul 11, 2023 at 05:52:28PM +0200, Hanna Czenczek wrote: > > > > The only user of vhost_user_reset_status() is vhost_dev_stop(), which > > > > only uses it as a fall-back to stop the back-end if it does not support > > > > SUSPEND. However, vhost-user's implementation is a no-op unless the > > > > back-end supports SET_STATUS. > > > > > > > > vhost-vdpa's implementation instead just calls > > > > vhost_vdpa_reset_device(), implying that it's OK to fully reset the > > > > device if SET_STATUS is not supported. > > > > > > > > To be fair, vhost_vdpa_reset_device() does nothing but to set > > > > the status > > > > to zero. However, that may well be because vhost-vdpa has no method > > > > besides this to reset a device. In contrast, vhost-user has > > > > RESET_DEVICE and a RESET_OWNER, which can be used instead. > > > > > > > > While it is not entirely clear from documentation or git logs, from > > > > discussions and the order of vhost-user protocol features, it > > > > appears to > > > > me as if RESET_OWNER originally had no real meaning for vhost-user, and > > > > was thus used to signal a device reset to the back-end. Then, > > > > RESET_DEVICE was introduced, to have a well-defined dedicated reset > > > > command. Finally, vhost-user received full STATUS support, including > > > > SET_STATUS, so setting the device status to 0 is now the preferred way > > > > of resetting a device. Still, RESET_DEVICE and RESET_OWNER should > > > > remain valid as fall-backs. > > > > > > > > Therefore, have vhost_user_reset_status() fall back to > > > > vhost_user_reset_device() if the back-end has no STATUS support. > > > > > > > > Signed-off-by: Hanna Czenczek <hreitz@redhat.com> > > > > --- > > > > hw/virtio/vhost-user.c | 2 ++ > > > > 1 file changed, 2 insertions(+) > > > > > > > > diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c > > > > index 4507de5a92..53a881ec2a 100644 > > > > --- a/hw/virtio/vhost-user.c > > > > +++ b/hw/virtio/vhost-user.c > > > > @@ -2833,6 +2833,8 @@ static void vhost_user_reset_status(struct > > > > vhost_dev *dev) > > > > if (virtio_has_feature(dev->protocol_features, > > > > VHOST_USER_PROTOCOL_F_STATUS)) { > > > > vhost_user_set_status(dev, 0); > > > > + } else { > > > > + vhost_user_reset_device(dev); > > > > } > > > > } > > > Did you check whether DPDK treats setting the status to 0 as equivalent > > > to RESET_DEVICE? > > > > If it doesn’t, what’s even the point of using reset_status? > > Sorry, I’m being unclear, and I think this may be important because it ties > into the question from patch 1, what qemu is even trying to do by running > SET_STATUS(0) vhost_dev_stop(), so here’s what gave me the impression that > SET_STATUS(0) and RESET_DEVICE should be equivalent: > > vhost-vdpa.c runs SET_STATUS(0) in a function called > vhost_vdpa_reset_device(). This is one thing that gave me the impression > that this is about an actual full reset. > > Another is the whole discussion that we’ve had. vhost_dev_stop() does not > call a `vhost_reset_device()` function, it calls `vhost_reset_status()`. > Still, we were always talking about resetting the device. There is some hacky stuff with struct vhost_dev's vq_index_end and multi-queue devices. I think it's because multi-queue vhost-net device consist of many vhost_devs and NetClientStates, so certain vhost operations are skipped unless this is the "first" or "last" vhost_dev from a large aggregate vhost-net device. That might be responsible for part of the weirdness. > > It doesn’t make sense to me that vDPA would provide no function to fully > reset a device, while vhost-user does. Being able to reset a device sounds > vital to me. This also gave me the impression that SET_STATUS(0) on vDPA at > least is functionally equivalent to a full device reset. > > > Maybe SET_STATUS(0) does mean a full device reset on vDPA, but not on > vhost-user. That would be a real shame, so I assumed this would not be the > case; that SET_STATUS(0) does the same thing on both protocols. Yes, exactly. It has the real VIRTIO spec meaning in vDPA. In vhost-user it's currently only used by DPDK as a hint for when device initialization is complete: https://github.com/DPDK/dpdk/commit/41d201804c4c44738168e2d247d3b1780845faa1 > The virtio specification says “Writing 0 into this field resets the device.” > about the device_status field. > > This also makes sense, because the device_status field is basically used to > tell the device that a driver has taken control. If reset, this indicates > the driver has given up control, and to me this is a point where a device > should fully reset itself. > > So all in all, I can’t see the rationale why any implementation that > supports SET_STATUS would decide to treat SET_STATUS(0) not as equivalent or > a superset of RESET_DEVICE. I may be wrong, and this might explain a whole > deal about what kind of background operations we hope to stop with > SET_STATUS(0). I would like vhost-user devices to implement SET_STATUS according to the VIRTIO specification in the future and they can do that. But I think front-ends should continue sending RESET_DEVICE in order to support old devices. Stefan
On 20.07.23 18:03, Stefan Hajnoczi wrote: > On Wed, Jul 19, 2023 at 04:27:58PM +0200, Hanna Czenczek wrote: >> On 19.07.23 16:11, Hanna Czenczek wrote: >>> On 18.07.23 17:10, Stefan Hajnoczi wrote: >>>> On Tue, Jul 11, 2023 at 05:52:28PM +0200, Hanna Czenczek wrote: >>>>> The only user of vhost_user_reset_status() is vhost_dev_stop(), which >>>>> only uses it as a fall-back to stop the back-end if it does not support >>>>> SUSPEND. However, vhost-user's implementation is a no-op unless the >>>>> back-end supports SET_STATUS. >>>>> >>>>> vhost-vdpa's implementation instead just calls >>>>> vhost_vdpa_reset_device(), implying that it's OK to fully reset the >>>>> device if SET_STATUS is not supported. >>>>> >>>>> To be fair, vhost_vdpa_reset_device() does nothing but to set >>>>> the status >>>>> to zero. However, that may well be because vhost-vdpa has no method >>>>> besides this to reset a device. In contrast, vhost-user has >>>>> RESET_DEVICE and a RESET_OWNER, which can be used instead. >>>>> >>>>> While it is not entirely clear from documentation or git logs, from >>>>> discussions and the order of vhost-user protocol features, it >>>>> appears to >>>>> me as if RESET_OWNER originally had no real meaning for vhost-user, and >>>>> was thus used to signal a device reset to the back-end. Then, >>>>> RESET_DEVICE was introduced, to have a well-defined dedicated reset >>>>> command. Finally, vhost-user received full STATUS support, including >>>>> SET_STATUS, so setting the device status to 0 is now the preferred way >>>>> of resetting a device. Still, RESET_DEVICE and RESET_OWNER should >>>>> remain valid as fall-backs. >>>>> >>>>> Therefore, have vhost_user_reset_status() fall back to >>>>> vhost_user_reset_device() if the back-end has no STATUS support. >>>>> >>>>> Signed-off-by: Hanna Czenczek <hreitz@redhat.com> >>>>> --- >>>>> hw/virtio/vhost-user.c | 2 ++ >>>>> 1 file changed, 2 insertions(+) >>>>> >>>>> diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c >>>>> index 4507de5a92..53a881ec2a 100644 >>>>> --- a/hw/virtio/vhost-user.c >>>>> +++ b/hw/virtio/vhost-user.c >>>>> @@ -2833,6 +2833,8 @@ static void vhost_user_reset_status(struct >>>>> vhost_dev *dev) >>>>> if (virtio_has_feature(dev->protocol_features, >>>>> VHOST_USER_PROTOCOL_F_STATUS)) { >>>>> vhost_user_set_status(dev, 0); >>>>> + } else { >>>>> + vhost_user_reset_device(dev); >>>>> } >>>>> } >>>> Did you check whether DPDK treats setting the status to 0 as equivalent >>>> to RESET_DEVICE? >>> If it doesn’t, what’s even the point of using reset_status? >> Sorry, I’m being unclear, and I think this may be important because it ties >> into the question from patch 1, what qemu is even trying to do by running >> SET_STATUS(0) vhost_dev_stop(), so here’s what gave me the impression that >> SET_STATUS(0) and RESET_DEVICE should be equivalent: >> >> vhost-vdpa.c runs SET_STATUS(0) in a function called >> vhost_vdpa_reset_device(). This is one thing that gave me the impression >> that this is about an actual full reset. >> >> Another is the whole discussion that we’ve had. vhost_dev_stop() does not >> call a `vhost_reset_device()` function, it calls `vhost_reset_status()`. >> Still, we were always talking about resetting the device. > There is some hacky stuff with struct vhost_dev's vq_index_end and > multi-queue devices. I think it's because multi-queue vhost-net device > consist of many vhost_devs and NetClientStates, so certain vhost > operations are skipped unless this is the "first" or "last" vhost_dev > from a large aggregate vhost-net device. That might be responsible for > part of the weirdness. > >> It doesn’t make sense to me that vDPA would provide no function to fully >> reset a device, while vhost-user does. Being able to reset a device sounds >> vital to me. This also gave me the impression that SET_STATUS(0) on vDPA at >> least is functionally equivalent to a full device reset. >> >> >> Maybe SET_STATUS(0) does mean a full device reset on vDPA, but not on >> vhost-user. That would be a real shame, so I assumed this would not be the >> case; that SET_STATUS(0) does the same thing on both protocols. > Yes, exactly. It has the real VIRTIO spec meaning in vDPA. In vhost-user > it's currently only used by DPDK as a hint for when device > initialization is complete: > https://github.com/DPDK/dpdk/commit/41d201804c4c44738168e2d247d3b1780845faa1 FWIW, now the code is a bit different. https://github.com/DPDK/dpdk/commit/671cc679a5fcd26705bb20ddc13b93e665719054 has added a RESET interpretation for the status field, i.e. when it is 0. It doesn’t do anything, but at least DPDK seems to agree that SET_STATUS(0) is a reset. >> The virtio specification says “Writing 0 into this field resets the device.” >> about the device_status field. >> >> This also makes sense, because the device_status field is basically used to >> tell the device that a driver has taken control. If reset, this indicates >> the driver has given up control, and to me this is a point where a device >> should fully reset itself. >> >> So all in all, I can’t see the rationale why any implementation that >> supports SET_STATUS would decide to treat SET_STATUS(0) not as equivalent or >> a superset of RESET_DEVICE. I may be wrong, and this might explain a whole >> deal about what kind of background operations we hope to stop with >> SET_STATUS(0). > I would like vhost-user devices to implement SET_STATUS according to the > VIRTIO specification in the future and they can do that. But I think > front-ends should continue sending RESET_DEVICE in order to support old > devices. Well, yes, exactly. That is what I meant to address with this patch, vhost-user right now does not send RESET_DEVICE in its vhost_reset_status implementation, so the front-end will not fall back to RESET_DEVICE when it apparently does intend to reset the device[1]. We do arguably have vhost_reset_device, too, but for vDPA that is just a SET_STATUS(0) (there is no RESET_DEVICE on vDPA), and it’s also only called by vhost-user-scsi. So this also begs the question why we even do have vhost_reset_status and vhost_reset_device as two separate things. The commit introducing vhost_reset_status (c3716f260bf) doesn’t say. Maybe the intention was that vhost_reset_device would leave the status at 0, while vhost_reset_status would return it to ACKNOWLEDGE | DRIVER, as done by the introducing commit, but that comes back to patch 5 in this series – we don’t need to have ACKNOWLEDGE | DRIVER set after vhost_dev_stop(), so we don’t need vhost_reset_status to set those flags. They should be set in vhost_dev_start(). [1] This is assuming that SET_STATUS(0) is intended to reset the device, but it sounds like you agree on that.
On Fri, Jul 21, 2023 at 04:16:07PM +0200, Hanna Czenczek wrote: > On 20.07.23 18:03, Stefan Hajnoczi wrote: > > On Wed, Jul 19, 2023 at 04:27:58PM +0200, Hanna Czenczek wrote: > > > On 19.07.23 16:11, Hanna Czenczek wrote: > > > > On 18.07.23 17:10, Stefan Hajnoczi wrote: > > > > > On Tue, Jul 11, 2023 at 05:52:28PM +0200, Hanna Czenczek wrote: > > > > > > The only user of vhost_user_reset_status() is vhost_dev_stop(), which > > > > > > only uses it as a fall-back to stop the back-end if it does not support > > > > > > SUSPEND. However, vhost-user's implementation is a no-op unless the > > > > > > back-end supports SET_STATUS. > > > > > > > > > > > > vhost-vdpa's implementation instead just calls > > > > > > vhost_vdpa_reset_device(), implying that it's OK to fully reset the > > > > > > device if SET_STATUS is not supported. > > > > > > > > > > > > To be fair, vhost_vdpa_reset_device() does nothing but to set > > > > > > the status > > > > > > to zero. However, that may well be because vhost-vdpa has no method > > > > > > besides this to reset a device. In contrast, vhost-user has > > > > > > RESET_DEVICE and a RESET_OWNER, which can be used instead. > > > > > > > > > > > > While it is not entirely clear from documentation or git logs, from > > > > > > discussions and the order of vhost-user protocol features, it > > > > > > appears to > > > > > > me as if RESET_OWNER originally had no real meaning for vhost-user, and > > > > > > was thus used to signal a device reset to the back-end. Then, > > > > > > RESET_DEVICE was introduced, to have a well-defined dedicated reset > > > > > > command. Finally, vhost-user received full STATUS support, including > > > > > > SET_STATUS, so setting the device status to 0 is now the preferred way > > > > > > of resetting a device. Still, RESET_DEVICE and RESET_OWNER should > > > > > > remain valid as fall-backs. > > > > > > > > > > > > Therefore, have vhost_user_reset_status() fall back to > > > > > > vhost_user_reset_device() if the back-end has no STATUS support. > > > > > > > > > > > > Signed-off-by: Hanna Czenczek <hreitz@redhat.com> > > > > > > --- > > > > > > hw/virtio/vhost-user.c | 2 ++ > > > > > > 1 file changed, 2 insertions(+) > > > > > > > > > > > > diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c > > > > > > index 4507de5a92..53a881ec2a 100644 > > > > > > --- a/hw/virtio/vhost-user.c > > > > > > +++ b/hw/virtio/vhost-user.c > > > > > > @@ -2833,6 +2833,8 @@ static void vhost_user_reset_status(struct > > > > > > vhost_dev *dev) > > > > > > if (virtio_has_feature(dev->protocol_features, > > > > > > VHOST_USER_PROTOCOL_F_STATUS)) { > > > > > > vhost_user_set_status(dev, 0); > > > > > > + } else { > > > > > > + vhost_user_reset_device(dev); > > > > > > } > > > > > > } > > > > > Did you check whether DPDK treats setting the status to 0 as equivalent > > > > > to RESET_DEVICE? > > > > If it doesn’t, what’s even the point of using reset_status? > > > Sorry, I’m being unclear, and I think this may be important because it ties > > > into the question from patch 1, what qemu is even trying to do by running > > > SET_STATUS(0) vhost_dev_stop(), so here’s what gave me the impression that > > > SET_STATUS(0) and RESET_DEVICE should be equivalent: > > > > > > vhost-vdpa.c runs SET_STATUS(0) in a function called > > > vhost_vdpa_reset_device(). This is one thing that gave me the impression > > > that this is about an actual full reset. > > > > > > Another is the whole discussion that we’ve had. vhost_dev_stop() does not > > > call a `vhost_reset_device()` function, it calls `vhost_reset_status()`. > > > Still, we were always talking about resetting the device. > > There is some hacky stuff with struct vhost_dev's vq_index_end and > > multi-queue devices. I think it's because multi-queue vhost-net device > > consist of many vhost_devs and NetClientStates, so certain vhost > > operations are skipped unless this is the "first" or "last" vhost_dev > > from a large aggregate vhost-net device. That might be responsible for > > part of the weirdness. > > > > > It doesn’t make sense to me that vDPA would provide no function to fully > > > reset a device, while vhost-user does. Being able to reset a device sounds > > > vital to me. This also gave me the impression that SET_STATUS(0) on vDPA at > > > least is functionally equivalent to a full device reset. > > > > > > > > > Maybe SET_STATUS(0) does mean a full device reset on vDPA, but not on > > > vhost-user. That would be a real shame, so I assumed this would not be the > > > case; that SET_STATUS(0) does the same thing on both protocols. > > Yes, exactly. It has the real VIRTIO spec meaning in vDPA. In vhost-user > > it's currently only used by DPDK as a hint for when device > > initialization is complete: > > https://github.com/DPDK/dpdk/commit/41d201804c4c44738168e2d247d3b1780845faa1 > > FWIW, now the code is a bit different. > https://github.com/DPDK/dpdk/commit/671cc679a5fcd26705bb20ddc13b93e665719054 > has added a RESET interpretation for the status field, i.e. when it is 0. > It doesn’t do anything, but at least DPDK seems to agree that SET_STATUS(0) > is a reset. That patch adds diagnostics but does not perform any action for SET_STATUS 0. DPDK's vhost_user_reset_owner() is still the only place where the device is actually reset. QEMU cannot switch to just SET_STATUS 0, it still needs to send RESET_DEVICE/RESET_OWNER. > > > > The virtio specification says “Writing 0 into this field resets the device.” > > > about the device_status field. > > > > > > This also makes sense, because the device_status field is basically used to > > > tell the device that a driver has taken control. If reset, this indicates > > > the driver has given up control, and to me this is a point where a device > > > should fully reset itself. > > > > > > So all in all, I can’t see the rationale why any implementation that > > > supports SET_STATUS would decide to treat SET_STATUS(0) not as equivalent or > > > a superset of RESET_DEVICE. I may be wrong, and this might explain a whole > > > deal about what kind of background operations we hope to stop with > > > SET_STATUS(0). > > I would like vhost-user devices to implement SET_STATUS according to the > > VIRTIO specification in the future and they can do that. But I think > > front-ends should continue sending RESET_DEVICE in order to support old > > devices. > > Well, yes, exactly. That is what I meant to address with this patch, > vhost-user right now does not send RESET_DEVICE in its vhost_reset_status > implementation, so the front-end will not fall back to RESET_DEVICE when it > apparently does intend to reset the device[1]. We do arguably have > vhost_reset_device, too, but for vDPA that is just a SET_STATUS(0) (there is > no RESET_DEVICE on vDPA), and it’s also only called by vhost-user-scsi. > > So this also begs the question why we even do have vhost_reset_status and > vhost_reset_device as two separate things. The commit introducing > vhost_reset_status (c3716f260bf) doesn’t say. Maybe the intention was that > vhost_reset_device would leave the status at 0, while vhost_reset_status > would return it to ACKNOWLEDGE | DRIVER, as done by the introducing commit, > but that comes back to patch 5 in this series – we don’t need to have > ACKNOWLEDGE | DRIVER set after vhost_dev_stop(), so we don’t need > vhost_reset_status to set those flags. They should be set in > vhost_dev_start(). > > [1] This is assuming that SET_STATUS(0) is intended to reset the device, but > it sounds like you agree on that. I don't know the answers, but I think it's safe to go ahead with a SET_STATUS sequence that follows the VIRTIO spec, plus a VHOST_USER_RESET_DEVICE/VHOST_USER_RESET_OWNER. Stefan
On 24.07.23 20:04, Stefan Hajnoczi wrote: > On Fri, Jul 21, 2023 at 04:16:07PM +0200, Hanna Czenczek wrote: >> On 20.07.23 18:03, Stefan Hajnoczi wrote: >>> On Wed, Jul 19, 2023 at 04:27:58PM +0200, Hanna Czenczek wrote: >>>> On 19.07.23 16:11, Hanna Czenczek wrote: >>>>> On 18.07.23 17:10, Stefan Hajnoczi wrote: >>>>>> On Tue, Jul 11, 2023 at 05:52:28PM +0200, Hanna Czenczek wrote: >>>>>>> The only user of vhost_user_reset_status() is vhost_dev_stop(), which >>>>>>> only uses it as a fall-back to stop the back-end if it does not support >>>>>>> SUSPEND. However, vhost-user's implementation is a no-op unless the >>>>>>> back-end supports SET_STATUS. >>>>>>> >>>>>>> vhost-vdpa's implementation instead just calls >>>>>>> vhost_vdpa_reset_device(), implying that it's OK to fully reset the >>>>>>> device if SET_STATUS is not supported. >>>>>>> >>>>>>> To be fair, vhost_vdpa_reset_device() does nothing but to set >>>>>>> the status >>>>>>> to zero. However, that may well be because vhost-vdpa has no method >>>>>>> besides this to reset a device. In contrast, vhost-user has >>>>>>> RESET_DEVICE and a RESET_OWNER, which can be used instead. >>>>>>> >>>>>>> While it is not entirely clear from documentation or git logs, from >>>>>>> discussions and the order of vhost-user protocol features, it >>>>>>> appears to >>>>>>> me as if RESET_OWNER originally had no real meaning for vhost-user, and >>>>>>> was thus used to signal a device reset to the back-end. Then, >>>>>>> RESET_DEVICE was introduced, to have a well-defined dedicated reset >>>>>>> command. Finally, vhost-user received full STATUS support, including >>>>>>> SET_STATUS, so setting the device status to 0 is now the preferred way >>>>>>> of resetting a device. Still, RESET_DEVICE and RESET_OWNER should >>>>>>> remain valid as fall-backs. >>>>>>> >>>>>>> Therefore, have vhost_user_reset_status() fall back to >>>>>>> vhost_user_reset_device() if the back-end has no STATUS support. >>>>>>> >>>>>>> Signed-off-by: Hanna Czenczek <hreitz@redhat.com> >>>>>>> --- >>>>>>> hw/virtio/vhost-user.c | 2 ++ >>>>>>> 1 file changed, 2 insertions(+) >>>>>>> >>>>>>> diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c >>>>>>> index 4507de5a92..53a881ec2a 100644 >>>>>>> --- a/hw/virtio/vhost-user.c >>>>>>> +++ b/hw/virtio/vhost-user.c >>>>>>> @@ -2833,6 +2833,8 @@ static void vhost_user_reset_status(struct >>>>>>> vhost_dev *dev) >>>>>>> if (virtio_has_feature(dev->protocol_features, >>>>>>> VHOST_USER_PROTOCOL_F_STATUS)) { >>>>>>> vhost_user_set_status(dev, 0); >>>>>>> + } else { >>>>>>> + vhost_user_reset_device(dev); >>>>>>> } >>>>>>> } >>>>>> Did you check whether DPDK treats setting the status to 0 as equivalent >>>>>> to RESET_DEVICE? >>>>> If it doesn’t, what’s even the point of using reset_status? >>>> Sorry, I’m being unclear, and I think this may be important because it ties >>>> into the question from patch 1, what qemu is even trying to do by running >>>> SET_STATUS(0) vhost_dev_stop(), so here’s what gave me the impression that >>>> SET_STATUS(0) and RESET_DEVICE should be equivalent: >>>> >>>> vhost-vdpa.c runs SET_STATUS(0) in a function called >>>> vhost_vdpa_reset_device(). This is one thing that gave me the impression >>>> that this is about an actual full reset. >>>> >>>> Another is the whole discussion that we’ve had. vhost_dev_stop() does not >>>> call a `vhost_reset_device()` function, it calls `vhost_reset_status()`. >>>> Still, we were always talking about resetting the device. >>> There is some hacky stuff with struct vhost_dev's vq_index_end and >>> multi-queue devices. I think it's because multi-queue vhost-net device >>> consist of many vhost_devs and NetClientStates, so certain vhost >>> operations are skipped unless this is the "first" or "last" vhost_dev >>> from a large aggregate vhost-net device. That might be responsible for >>> part of the weirdness. >>> >>>> It doesn’t make sense to me that vDPA would provide no function to fully >>>> reset a device, while vhost-user does. Being able to reset a device sounds >>>> vital to me. This also gave me the impression that SET_STATUS(0) on vDPA at >>>> least is functionally equivalent to a full device reset. >>>> >>>> >>>> Maybe SET_STATUS(0) does mean a full device reset on vDPA, but not on >>>> vhost-user. That would be a real shame, so I assumed this would not be the >>>> case; that SET_STATUS(0) does the same thing on both protocols. >>> Yes, exactly. It has the real VIRTIO spec meaning in vDPA. In vhost-user >>> it's currently only used by DPDK as a hint for when device >>> initialization is complete: >>> https://github.com/DPDK/dpdk/commit/41d201804c4c44738168e2d247d3b1780845faa1 >> FWIW, now the code is a bit different. >> https://github.com/DPDK/dpdk/commit/671cc679a5fcd26705bb20ddc13b93e665719054 >> has added a RESET interpretation for the status field, i.e. when it is 0. >> It doesn’t do anything, but at least DPDK seems to agree that SET_STATUS(0) >> is a reset. > That patch adds diagnostics but does not perform any action for > SET_STATUS 0. DPDK's vhost_user_reset_owner() is still the only place > where the device is actually reset. That’s what I said, it doesn’t do anything, but the diagnostics agree that it is a RESET. > QEMU cannot switch to just > SET_STATUS 0, it still needs to send RESET_DEVICE/RESET_OWNER. That is what I questioned below: We currently *do not* call RESET_DEVICE/RESET_OWNER. This patch is not about switching to SET_STATUS(0), it is about having RESET_DEVICE/RESET_OWNER be fallbacks for it. >>>> The virtio specification says “Writing 0 into this field resets the device.” >>>> about the device_status field. >>>> >>>> This also makes sense, because the device_status field is basically used to >>>> tell the device that a driver has taken control. If reset, this indicates >>>> the driver has given up control, and to me this is a point where a device >>>> should fully reset itself. >>>> >>>> So all in all, I can’t see the rationale why any implementation that >>>> supports SET_STATUS would decide to treat SET_STATUS(0) not as equivalent or >>>> a superset of RESET_DEVICE. I may be wrong, and this might explain a whole >>>> deal about what kind of background operations we hope to stop with >>>> SET_STATUS(0). >>> I would like vhost-user devices to implement SET_STATUS according to the >>> VIRTIO specification in the future and they can do that. But I think >>> front-ends should continue sending RESET_DEVICE in order to support old >>> devices. >> Well, yes, exactly. That is what I meant to address with this patch, >> vhost-user right now does not send RESET_DEVICE in its vhost_reset_status >> implementation, so the front-end will not fall back to RESET_DEVICE when it >> apparently does intend to reset the device[1]. We do arguably have >> vhost_reset_device, too, but for vDPA that is just a SET_STATUS(0) (there is >> no RESET_DEVICE on vDPA), and it’s also only called by vhost-user-scsi. >> >> So this also begs the question why we even do have vhost_reset_status and >> vhost_reset_device as two separate things. The commit introducing >> vhost_reset_status (c3716f260bf) doesn’t say. Maybe the intention was that >> vhost_reset_device would leave the status at 0, while vhost_reset_status >> would return it to ACKNOWLEDGE | DRIVER, as done by the introducing commit, >> but that comes back to patch 5 in this series – we don’t need to have >> ACKNOWLEDGE | DRIVER set after vhost_dev_stop(), so we don’t need >> vhost_reset_status to set those flags. They should be set in >> vhost_dev_start(). >> >> [1] This is assuming that SET_STATUS(0) is intended to reset the device, but >> it sounds like you agree on that. > I don't know the answers, but I think it's safe to go ahead with a > SET_STATUS sequence that follows the VIRTIO spec, plus a > VHOST_USER_RESET_DEVICE/VHOST_USER_RESET_OWNER. So what you’re saying is that RESET_DEVICE/RESET_OWNER should not be fallbacks, but be invoked in addition to SET_STATUS(0)? If so, that would be silly. I see your point that DPDK resets only in response to RESET_DEVICE/RESET_OWNER, but the diagnostics agree that SET_STATUS(0) is a reset, which is why I find this so silly. It sounds to me as if any properly behaving implementation would fully reset the back-end on SET_STATUS(0), so unconditionally invoking RESET_DEVICE/RESET_OWNER afterwards is just doing a double-reset. Notably, invoking RESET_DEVICE/RESET_OWNER in addition to SET_STATUS(0) (instead of as a fallback) would be a change in behavior, because we do not call RESET_DEVICE/RESET_OWNER outside of vhost-user-scsi today. Hanna
diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c index 4507de5a92..53a881ec2a 100644 --- a/hw/virtio/vhost-user.c +++ b/hw/virtio/vhost-user.c @@ -2833,6 +2833,8 @@ static void vhost_user_reset_status(struct vhost_dev *dev) if (virtio_has_feature(dev->protocol_features, VHOST_USER_PROTOCOL_F_STATUS)) { vhost_user_set_status(dev, 0); + } else { + vhost_user_reset_device(dev); } }
The only user of vhost_user_reset_status() is vhost_dev_stop(), which only uses it as a fall-back to stop the back-end if it does not support SUSPEND. However, vhost-user's implementation is a no-op unless the back-end supports SET_STATUS. vhost-vdpa's implementation instead just calls vhost_vdpa_reset_device(), implying that it's OK to fully reset the device if SET_STATUS is not supported. To be fair, vhost_vdpa_reset_device() does nothing but to set the status to zero. However, that may well be because vhost-vdpa has no method besides this to reset a device. In contrast, vhost-user has RESET_DEVICE and a RESET_OWNER, which can be used instead. While it is not entirely clear from documentation or git logs, from discussions and the order of vhost-user protocol features, it appears to me as if RESET_OWNER originally had no real meaning for vhost-user, and was thus used to signal a device reset to the back-end. Then, RESET_DEVICE was introduced, to have a well-defined dedicated reset command. Finally, vhost-user received full STATUS support, including SET_STATUS, so setting the device status to 0 is now the preferred way of resetting a device. Still, RESET_DEVICE and RESET_OWNER should remain valid as fall-backs. Therefore, have vhost_user_reset_status() fall back to vhost_user_reset_device() if the back-end has no STATUS support. Signed-off-by: Hanna Czenczek <hreitz@redhat.com> --- hw/virtio/vhost-user.c | 2 ++ 1 file changed, 2 insertions(+)