| Message ID | 20200226094357.25061-1-jasowang@redhat.com |
|---|---|
| State | New |
| Series | [V2] vhost: correctly turn on VIRTIO_F_IOMMU_PLATFORM |
On Wed, Feb 26, 2020 at 05:43:57PM +0800, Jason Wang wrote: > We turn on device IOTLB via VIRTIO_F_IOMMU_PLATFORM unconditionally on > platform without IOMMU support. This can lead unnecessary IOTLB > transactions which will damage the performance. > > Fixing this by check whether the device is backed by IOMMU and disable > device IOTLB. > > Reported-by: Halil Pasic <pasic@linux.ibm.com> > Fixes: c471ad0e9bd46 ("vhost_net: device IOTLB support") Well it's just an optimization, isn't it? I don't think it's justified to push this to everyone using vhost with IOTLB, is it? If you disagree, could you comment a bit on which configurations where tested? > Cc: qemu-stable@nongnu.org > Signed-off-by: Jason Wang <jasowang@redhat.com> Halil could you test this pls? Does this help your performance issue? > --- > Changes from V1: > - do not check acked_features > - reuse vhost_dev_has_iommu() > --- > hw/virtio/vhost.c | 12 +++++++++++- > 1 file changed, 11 insertions(+), 1 deletion(-) > > diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c > index 9edfadc81d..9182a00495 100644 > --- a/hw/virtio/vhost.c > +++ b/hw/virtio/vhost.c > @@ -290,7 +290,14 @@ static int vhost_dev_has_iommu(struct vhost_dev *dev) > { > VirtIODevice *vdev = dev->vdev; > > - return virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM); > + /* > + * For vhost, VIRTIO_F_IOMMU_PLATFORM means the backend support > + * incremental memory mapping API via IOTLB API. For platform that > + * does not have IOMMU, there's no need to enable this feature > + * which may cause unnecessary IOTLB miss/update trnasactions. 
> + */ > + return vdev->dma_as != &address_space_memory && > + virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM); > } > > static void *vhost_memory_map(struct vhost_dev *dev, hwaddr addr, > @@ -765,6 +772,9 @@ static int vhost_dev_set_features(struct vhost_dev *dev, > if (enable_log) { > features |= 0x1ULL << VHOST_F_LOG_ALL; > } > + if (!vhost_dev_has_iommu(dev)) { > + features &= ~(0x1ULL << VIRTIO_F_IOMMU_PLATFORM); > + } > r = dev->vhost_ops->vhost_set_features(dev, features); > if (r < 0) { > VHOST_OPS_DEBUG("vhost_set_features failed"); > -- > 2.19.1
On 2020/2/26 5:53 PM, Michael S. Tsirkin wrote: > On Wed, Feb 26, 2020 at 05:43:57PM +0800, Jason Wang wrote: >> We turn on device IOTLB via VIRTIO_F_IOMMU_PLATFORM unconditionally on >> platform without IOMMU support. This can lead unnecessary IOTLB >> transactions which will damage the performance. >> >> Fixing this by check whether the device is backed by IOMMU and disable >> device IOTLB. >> >> Reported-by: Halil Pasic <pasic@linux.ibm.com> >> Fixes: c471ad0e9bd46 ("vhost_net: device IOTLB support") > Well it's just an optimization, isn't it? Kind of, or a fix for the performance. > I don't think it's justified to push this to everyone using > vhost with IOTLB, is it? My understanding is that the function should be equivalent to IOTLB in this case. Since no IOMMU is used and device may only see GPA. Another possible direction is for qemu to update the memory table via the device IOTLB API, but this seems less straightforward. > If you disagree, could you comment a bit on which configurations where tested? I just did a dirty shortcut to emulate the platform without get_dma_as with IOMMU_PLATFORM set. Thanks > >> Cc: qemu-stable@nongnu.org >> Signed-off-by: Jason Wang <jasowang@redhat.com> > Halil could you test this pls? Does this help your performance issue? > >> --- >> Changes from V1: >> - do not check acked_features >> - reuse vhost_dev_has_iommu() >> --- >> hw/virtio/vhost.c | 12 +++++++++++- >> 1 file changed, 11 insertions(+), 1 deletion(-) >> >> diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c >> index 9edfadc81d..9182a00495 100644 >> --- a/hw/virtio/vhost.c >> +++ b/hw/virtio/vhost.c >> @@ -290,7 +290,14 @@ static int vhost_dev_has_iommu(struct vhost_dev *dev) >> { >> VirtIODevice *vdev = dev->vdev; >> >> - return virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM); >> + /* >> + * For vhost, VIRTIO_F_IOMMU_PLATFORM means the backend support >> + * incremental memory mapping API via IOTLB API. 
For platform that >> + * does not have IOMMU, there's no need to enable this feature >> + * which may cause unnecessary IOTLB miss/update trnasactions. >> + */ >> + return vdev->dma_as != &address_space_memory && >> + virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM); >> } >> >> static void *vhost_memory_map(struct vhost_dev *dev, hwaddr addr, >> @@ -765,6 +772,9 @@ static int vhost_dev_set_features(struct vhost_dev *dev, >> if (enable_log) { >> features |= 0x1ULL << VHOST_F_LOG_ALL; >> } >> + if (!vhost_dev_has_iommu(dev)) { >> + features &= ~(0x1ULL << VIRTIO_F_IOMMU_PLATFORM); >> + } >> r = dev->vhost_ops->vhost_set_features(dev, features); >> if (r < 0) { >> VHOST_OPS_DEBUG("vhost_set_features failed"); >> -- >> 2.19.1
On Wed, Feb 26, 2020 at 06:17:34PM +0800, Jason Wang wrote: > > On 2020/2/26 5:53 PM, Michael S. Tsirkin wrote: > > On Wed, Feb 26, 2020 at 05:43:57PM +0800, Jason Wang wrote: > > > We turn on device IOTLB via VIRTIO_F_IOMMU_PLATFORM unconditionally on > > > platform without IOMMU support. This can lead unnecessary IOTLB > > > transactions which will damage the performance. > > > > > > Fixing this by check whether the device is backed by IOMMU and disable > > > device IOTLB. > > > > > > Reported-by: Halil Pasic <pasic@linux.ibm.com> > > > Fixes: c471ad0e9bd46 ("vhost_net: device IOTLB support") > > Well it's just an optimization, isn't it? > > > Kind of, or a fix for the performance. > Given this wasn't a regression, it's a valuable enhancement but Fixes: seems too aggressive. > > I don't think it's justified to push this to everyone using > > vhost with IOTLB, is it? > > > My understanding is that the function should be equivalent to IOTLB in this > case. > > Since no IOMMU is used and device may only see GPA. > > Another possible direction is for qemu to update the memory table via the device > IOTLB API, but this seems less straightforward. > > > > If you disagree, could you comment a bit on which configurations where tested? > > > I just did a dirty shortcut to emulate the platform without get_dma_as with > IOMMU_PLATFORM set. > > Thanks If you want a Fixes tag with an expectation of everyone backporting, then this also needs to be tested on a platform with a vIOMMU, such as dpdk within the guest. > > > > > > Cc: qemu-stable@nongnu.org > > > Signed-off-by: Jason Wang <jasowang@redhat.com> > > Halil could you test this pls? Does this help your performance issue? 
> > > > > --- > > > Changes from V1: > > > - do not check acked_features > > > - reuse vhost_dev_has_iommu() > > > --- > > > hw/virtio/vhost.c | 12 +++++++++++- > > > 1 file changed, 11 insertions(+), 1 deletion(-) > > > > > > diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c > > > index 9edfadc81d..9182a00495 100644 > > > --- a/hw/virtio/vhost.c > > > +++ b/hw/virtio/vhost.c > > > @@ -290,7 +290,14 @@ static int vhost_dev_has_iommu(struct vhost_dev *dev) > > > { > > > VirtIODevice *vdev = dev->vdev; > > > - return virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM); > > > + /* > > > + * For vhost, VIRTIO_F_IOMMU_PLATFORM means the backend support > > > + * incremental memory mapping API via IOTLB API. For platform that > > > + * does not have IOMMU, there's no need to enable this feature > > > + * which may cause unnecessary IOTLB miss/update trnasactions. > > > + */ > > > + return vdev->dma_as != &address_space_memory && > > > + virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM); > > > } > > > static void *vhost_memory_map(struct vhost_dev *dev, hwaddr addr, > > > @@ -765,6 +772,9 @@ static int vhost_dev_set_features(struct vhost_dev *dev, > > > if (enable_log) { > > > features |= 0x1ULL << VHOST_F_LOG_ALL; > > > } > > > + if (!vhost_dev_has_iommu(dev)) { > > > + features &= ~(0x1ULL << VIRTIO_F_IOMMU_PLATFORM); > > > + } > > > r = dev->vhost_ops->vhost_set_features(dev, features); > > > if (r < 0) { > > > VHOST_OPS_DEBUG("vhost_set_features failed"); > > > -- > > > 2.19.1
On Wed, 26 Feb 2020 06:44:25 -0500 "Michael S. Tsirkin" <mst@redhat.com> wrote: > On Wed, Feb 26, 2020 at 06:17:34PM +0800, Jason Wang wrote: > > > > On 2020/2/26 5:53 PM, Michael S. Tsirkin wrote: > > > On Wed, Feb 26, 2020 at 05:43:57PM +0800, Jason Wang wrote: > > > > We turn on device IOTLB via VIRTIO_F_IOMMU_PLATFORM unconditionally on > > > > platform without IOMMU support. This can lead unnecessary IOTLB > > > > transactions which will damage the performance. > > > > > > > > Fixing this by check whether the device is backed by IOMMU and disable > > > > device IOTLB. > > > > > > > > Reported-by: Halil Pasic <pasic@linux.ibm.com> > > > > Fixes: c471ad0e9bd46 ("vhost_net: device IOTLB support") > > > Well it's just an optimization, isn't it? > > > > > > Kind of, or a fix for the performance. > > > > Given this wasn't a regression, it's a valuable enhancement > but Fixes: seems too aggressive. IMHO Fixes is appropriate. Telling vhost-net F_ACCESS_PLATFORM when vdev->dma_as != &address_space_memory results in a severe performance degradation. Regards, Halil [..]
On Wed, 26 Feb 2020 04:53:33 -0500 "Michael S. Tsirkin" <mst@redhat.com> wrote: > On Wed, Feb 26, 2020 at 05:43:57PM +0800, Jason Wang wrote: > > We turn on device IOTLB via VIRTIO_F_IOMMU_PLATFORM unconditionally on > > platform without IOMMU support. This can lead unnecessary IOTLB > > transactions which will damage the performance. > > > > Fixing this by check whether the device is backed by IOMMU and disable > > device IOTLB. > > > > Reported-by: Halil Pasic <pasic@linux.ibm.com> > > Fixes: c471ad0e9bd46 ("vhost_net: device IOTLB support") > > Well it's just an optimization, isn't it? > I don't think it's justified to push this to everyone using > vhost with IOTLB, is it? IMHO we need this for everyone using vhost! For instance vhost-vsock currently does not work with iommu_platform=on, because unlike vhost-net vhost does not offer F_ACCESS_PLATFORM, so set features IOCTL fails. > If you disagree, could you comment a bit on which configurations where tested? > > > Cc: qemu-stable@nongnu.org > > Signed-off-by: Jason Wang <jasowang@redhat.com> > > Halil could you test this pls? Does this help your performance issue? > I'm pretty sure it does, but I will re-test. The previous version where it was done virtio-net certainly did. Regards, Halil
On Wed, 26 Feb 2020 17:43:57 +0800 Jason Wang <jasowang@redhat.com> wrote: > We turn on device IOTLB via VIRTIO_F_IOMMU_PLATFORM unconditionally on > platform without IOMMU support. This can lead unnecessary IOTLB > transactions which will damage the performance. > > Fixing this by check whether the device is backed by IOMMU and disable > device IOTLB. > > Reported-by: Halil Pasic <pasic@linux.ibm.com> > Fixes: c471ad0e9bd46 ("vhost_net: device IOTLB support") > Cc: qemu-stable@nongnu.org > Signed-off-by: Jason Wang <jasowang@redhat.com> Tested-by: Halil Pasic <pasic@linux.ibm.com> Reviewed-by: Halil Pasic <pasic@linux.ibm.com> Thank you very much for fixing this! BTW as I mentioned before it fixes vhost-vsock with iommu_platform=on as well. Regards, Halil > --- > Changes from V1: > - do not check acked_features > - reuse vhost_dev_has_iommu() > --- > hw/virtio/vhost.c | 12 +++++++++++- > 1 file changed, 11 insertions(+), 1 deletion(-) > > diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c > index 9edfadc81d..9182a00495 100644 > --- a/hw/virtio/vhost.c > +++ b/hw/virtio/vhost.c > @@ -290,7 +290,14 @@ static int vhost_dev_has_iommu(struct vhost_dev *dev) > { > VirtIODevice *vdev = dev->vdev; > > - return virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM); > + /* > + * For vhost, VIRTIO_F_IOMMU_PLATFORM means the backend support > + * incremental memory mapping API via IOTLB API. For platform that > + * does not have IOMMU, there's no need to enable this feature > + * which may cause unnecessary IOTLB miss/update trnasactions. 
> + */ > + return vdev->dma_as != &address_space_memory && > + virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM); > } > > static void *vhost_memory_map(struct vhost_dev *dev, hwaddr addr, > @@ -765,6 +772,9 @@ static int vhost_dev_set_features(struct vhost_dev *dev, > if (enable_log) { > features |= 0x1ULL << VHOST_F_LOG_ALL; > } > + if (!vhost_dev_has_iommu(dev)) { > + features &= ~(0x1ULL << VIRTIO_F_IOMMU_PLATFORM); > + } > r = dev->vhost_ops->vhost_set_features(dev, features); > if (r < 0) { > VHOST_OPS_DEBUG("vhost_set_features failed");
On Wed, Feb 26, 2020 at 02:28:39PM +0100, Halil Pasic wrote: > On Wed, 26 Feb 2020 17:43:57 +0800 > Jason Wang <jasowang@redhat.com> wrote: > > > We turn on device IOTLB via VIRTIO_F_IOMMU_PLATFORM unconditionally on > > platform without IOMMU support. This can lead unnecessary IOTLB > > transactions which will damage the performance. > > > > Fixing this by check whether the device is backed by IOMMU and disable > > device IOTLB. > > > > Reported-by: Halil Pasic <pasic@linux.ibm.com> > > Fixes: c471ad0e9bd46 ("vhost_net: device IOTLB support") > > Cc: qemu-stable@nongnu.org > > Signed-off-by: Jason Wang <jasowang@redhat.com> > > Tested-by: Halil Pasic <pasic@linux.ibm.com> > Reviewed-by: Halil Pasic <pasic@linux.ibm.com> > > Thank you very much for fixing this! BTW as I mentioned before it > fixes vhost-vsock with iommu_platform=on as well. Fixes as in improves performance? > Regards, > Halil > > > --- > > Changes from V1: > > - do not check acked_features > > - reuse vhost_dev_has_iommu() > > --- > > hw/virtio/vhost.c | 12 +++++++++++- > > 1 file changed, 11 insertions(+), 1 deletion(-) > > > > diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c > > index 9edfadc81d..9182a00495 100644 > > --- a/hw/virtio/vhost.c > > +++ b/hw/virtio/vhost.c > > @@ -290,7 +290,14 @@ static int vhost_dev_has_iommu(struct vhost_dev *dev) > > { > > VirtIODevice *vdev = dev->vdev; > > > > - return virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM); > > + /* > > + * For vhost, VIRTIO_F_IOMMU_PLATFORM means the backend support > > + * incremental memory mapping API via IOTLB API. For platform that > > + * does not have IOMMU, there's no need to enable this feature > > + * which may cause unnecessary IOTLB miss/update trnasactions. 
> > + */ > > + return vdev->dma_as != &address_space_memory && > > + virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM); > > } > > > > static void *vhost_memory_map(struct vhost_dev *dev, hwaddr addr, > > @@ -765,6 +772,9 @@ static int vhost_dev_set_features(struct vhost_dev *dev, > > if (enable_log) { > > features |= 0x1ULL << VHOST_F_LOG_ALL; > > } > > + if (!vhost_dev_has_iommu(dev)) { > > + features &= ~(0x1ULL << VIRTIO_F_IOMMU_PLATFORM); > > + } > > r = dev->vhost_ops->vhost_set_features(dev, features); > > if (r < 0) { > > VHOST_OPS_DEBUG("vhost_set_features failed");
On Wed, Feb 26, 2020 at 01:55:39PM +0100, Halil Pasic wrote: > On Wed, 26 Feb 2020 04:53:33 -0500 > "Michael S. Tsirkin" <mst@redhat.com> wrote: > > > On Wed, Feb 26, 2020 at 05:43:57PM +0800, Jason Wang wrote: > > > We turn on device IOTLB via VIRTIO_F_IOMMU_PLATFORM unconditionally on > > > platform without IOMMU support. This can lead unnecessary IOTLB > > > transactions which will damage the performance. > > > > > > Fixing this by check whether the device is backed by IOMMU and disable > > > device IOTLB. > > > > > > Reported-by: Halil Pasic <pasic@linux.ibm.com> > > > Fixes: c471ad0e9bd46 ("vhost_net: device IOTLB support") > > > > Well it's just an optimization, isn't it? > > I don't think it's justified to push this to everyone using > > vhost with IOTLB, is it? > > IMHO we need this for everyone using vhost! For instance vhost-vsock > currently does not work with iommu_platform=on, because unlike vhost-net > vhost does not offer F_ACCESS_PLATFORM, so set features IOCTL fails. You mean vsock does not offer it? OK but that's still not a bugfix. Making new configs work is great, but that's a feature almost by definition. > > If you disagree, could you comment a bit on which configurations where tested? > > > > > Cc: qemu-stable@nongnu.org > > > Signed-off-by: Jason Wang <jasowang@redhat.com> > > > > Halil could you test this pls? Does this help your performance issue? > > > > I'm pretty sure it does, but I will re-test. The previous version where > it was done virtio-net certainly did. > > Regards, > Halil > >
On Wed, 26 Feb 2020 08:37:13 -0500 "Michael S. Tsirkin" <mst@redhat.com> wrote: > On Wed, Feb 26, 2020 at 02:28:39PM +0100, Halil Pasic wrote: > > On Wed, 26 Feb 2020 17:43:57 +0800 > > Jason Wang <jasowang@redhat.com> wrote: > > > > > We turn on device IOTLB via VIRTIO_F_IOMMU_PLATFORM unconditionally on > > > platform without IOMMU support. This can lead unnecessary IOTLB > > > transactions which will damage the performance. > > > > > > Fixing this by check whether the device is backed by IOMMU and disable > > > device IOTLB. > > > > > > Reported-by: Halil Pasic <pasic@linux.ibm.com> > > > Fixes: c471ad0e9bd46 ("vhost_net: device IOTLB support") > > > Cc: qemu-stable@nongnu.org > > > Signed-off-by: Jason Wang <jasowang@redhat.com> > > > > Tested-by: Halil Pasic <pasic@linux.ibm.com> > > Reviewed-by: Halil Pasic <pasic@linux.ibm.com> > > > > Thank you very much for fixing this! BTW as I mentioned before it > > fixes vhost-vsock with iommu_platform=on as well. > > Fixes as in improves performance? No, fixes like one does not get something like: qemu-system-s390x: vhost_set_features failed: Operation not supported (95) qemu-system-s390x: Error starting vhost: 95 any more. Regards, Halil [..]
On Wed, Feb 26, 2020 at 04:36:18PM +0100, Halil Pasic wrote: > On Wed, 26 Feb 2020 08:37:13 -0500 > "Michael S. Tsirkin" <mst@redhat.com> wrote: > > > On Wed, Feb 26, 2020 at 02:28:39PM +0100, Halil Pasic wrote: > > > On Wed, 26 Feb 2020 17:43:57 +0800 > > > Jason Wang <jasowang@redhat.com> wrote: > > > > > > > We turn on device IOTLB via VIRTIO_F_IOMMU_PLATFORM unconditionally on > > > > platform without IOMMU support. This can lead unnecessary IOTLB > > > > transactions which will damage the performance. > > > > > > > > Fixing this by check whether the device is backed by IOMMU and disable > > > > device IOTLB. > > > > > > > > Reported-by: Halil Pasic <pasic@linux.ibm.com> > > > > Fixes: c471ad0e9bd46 ("vhost_net: device IOTLB support") > > > > Cc: qemu-stable@nongnu.org > > > > Signed-off-by: Jason Wang <jasowang@redhat.com> > > > > > > Tested-by: Halil Pasic <pasic@linux.ibm.com> > > > Reviewed-by: Halil Pasic <pasic@linux.ibm.com> > > > > > > Thank you very much for fixing this! BTW as I mentioned before it > > > fixes vhost-vsock with iommu_platform=on as well. > > > > Fixes as in improves performance? > > No, fixes like one does not get something like: > qemu-system-s390x: vhost_set_features failed: Operation not supported (95) > qemu-system-s390x: Error starting vhost: 95 > any more. > > Regards, > Halil > > [..] But can commit c471ad0e9bd46 actually boot a secure guest where iommu_platform=on is required?
On Wed, 26 Feb 2020 11:52:26 -0500 "Michael S. Tsirkin" <mst@redhat.com> wrote: > On Wed, Feb 26, 2020 at 04:36:18PM +0100, Halil Pasic wrote: > > On Wed, 26 Feb 2020 08:37:13 -0500 > > "Michael S. Tsirkin" <mst@redhat.com> wrote: > > > > > On Wed, Feb 26, 2020 at 02:28:39PM +0100, Halil Pasic wrote: > > > > On Wed, 26 Feb 2020 17:43:57 +0800 > > > > Jason Wang <jasowang@redhat.com> wrote: > > > > > > > > > We turn on device IOTLB via VIRTIO_F_IOMMU_PLATFORM unconditionally on > > > > > platform without IOMMU support. This can lead unnecessary IOTLB > > > > > transactions which will damage the performance. > > > > > > > > > > Fixing this by check whether the device is backed by IOMMU and disable > > > > > device IOTLB. > > > > > > > > > > Reported-by: Halil Pasic <pasic@linux.ibm.com> > > > > > Fixes: c471ad0e9bd46 ("vhost_net: device IOTLB support") > > > > > Cc: qemu-stable@nongnu.org > > > > > Signed-off-by: Jason Wang <jasowang@redhat.com> > > > > > > > > Tested-by: Halil Pasic <pasic@linux.ibm.com> > > > > Reviewed-by: Halil Pasic <pasic@linux.ibm.com> > > > > > > > > Thank you very much for fixing this! BTW as I mentioned before it > > > > fixes vhost-vsock with iommu_platform=on as well. > > > > > > Fixes as in improves performance? > > > > No, fixes like one does not get something like: > > qemu-system-s390x: vhost_set_features failed: Operation not supported (95) > > qemu-system-s390x: Error starting vhost: 95 > > any more. > > > > Regards, > > Halil > > > > [..] > > But can commit c471ad0e9bd46 actually boot a secure guest > where iommu_platform=on is required? > No, of course it can not. But I'm not sure about AMD SEV. AFAIU without Jason's patch it does not work for AMD SEV. Tom already stated that with SEV they don't need the IOVA translation aspect of ACCESS_PLATFORM, but I have no idea if the condition vdev->dma_as == &address_space_memory catches them as well or not. They probably have !=. CCing Tom. 
@Tom does vhost-vsock work for you with SEV and current qemu? Also, one can specify iommu_platform=on on a device that ain't a part of a secure-capable VM, just for the fun of it. And that breaks vhost-vsock. Or is setting iommu_platform=on only valid if qemu-system-s390x is protected virtualization capable? BTW, I don't have a strong opinion on the fixes tag. We currently do not recommend setting iommu_platform, and thus I don't think we care too much about past qemus having problems with it. Regards, Halil
On Thu, Feb 27, 2020 at 02:02:15PM +0100, Halil Pasic wrote: > On Wed, 26 Feb 2020 11:52:26 -0500 > "Michael S. Tsirkin" <mst@redhat.com> wrote: > > > On Wed, Feb 26, 2020 at 04:36:18PM +0100, Halil Pasic wrote: > > > On Wed, 26 Feb 2020 08:37:13 -0500 > > > "Michael S. Tsirkin" <mst@redhat.com> wrote: > > > > > > > On Wed, Feb 26, 2020 at 02:28:39PM +0100, Halil Pasic wrote: > > > > > On Wed, 26 Feb 2020 17:43:57 +0800 > > > > > Jason Wang <jasowang@redhat.com> wrote: > > > > > > > > > > > We turn on device IOTLB via VIRTIO_F_IOMMU_PLATFORM unconditionally on > > > > > > platform without IOMMU support. This can lead unnecessary IOTLB > > > > > > transactions which will damage the performance. > > > > > > > > > > > > Fixing this by check whether the device is backed by IOMMU and disable > > > > > > device IOTLB. > > > > > > > > > > > > Reported-by: Halil Pasic <pasic@linux.ibm.com> > > > > > > Fixes: c471ad0e9bd46 ("vhost_net: device IOTLB support") > > > > > > Cc: qemu-stable@nongnu.org > > > > > > Signed-off-by: Jason Wang <jasowang@redhat.com> > > > > > > > > > > Tested-by: Halil Pasic <pasic@linux.ibm.com> > > > > > Reviewed-by: Halil Pasic <pasic@linux.ibm.com> > > > > > > > > > > Thank you very much for fixing this! BTW as I mentioned before it > > > > > fixes vhost-vsock with iommu_platform=on as well. > > > > > > > > Fixes as in improves performance? > > > > > > No, fixes like one does not get something like: > > > qemu-system-s390x: vhost_set_features failed: Operation not supported (95) > > > qemu-system-s390x: Error starting vhost: 95 > > > any more. > > > > > > Regards, > > > Halil > > > > > > [..] > > > > But can commit c471ad0e9bd46 actually boot a secure guest > > where iommu_platform=on is required? > > > > No, of course it can not. But I'm not sure about AMD SEV. AFAIU without > Jason's patch it does not work for AMD SEV. 
Tom already stated that with > SEV they don't need the IOVA translation aspect of ACCESS_PLATFORM, but > I have no idea if the condition vdev->dma_as == &address_space_memory > catches them as well or not. They probably have !=. > > CCing Tom. @Tom does vhost-vsock work for you with SEV and current qemu? > > Also, one can specify iommu_platform=on on a device that ain't a part of > a secure-capable VM, just for the fun of it. And that breaks > vhost-vsock. Or is setting iommu_platform=on only valid if > qemu-system-s390x is protected virtualization capable? > > BTW, I don't have a strong opinion on the fixes tag. We currently do not > recommend setting iommu_platform, and thus I don't think we care too > much about past qemus having problems with it. > > Regards, > Halil Let's just say if we do have a Fixes: tag we want to set it correctly to the commit that needs this fix.
On 2/27/20 7:02 AM, Halil Pasic wrote: > On Wed, 26 Feb 2020 11:52:26 -0500 > "Michael S. Tsirkin" <mst@redhat.com> wrote: > >> On Wed, Feb 26, 2020 at 04:36:18PM +0100, Halil Pasic wrote: >>> On Wed, 26 Feb 2020 08:37:13 -0500 >>> "Michael S. Tsirkin" <mst@redhat.com> wrote: >>> >>>> On Wed, Feb 26, 2020 at 02:28:39PM +0100, Halil Pasic wrote: >>>>> On Wed, 26 Feb 2020 17:43:57 +0800 >>>>> Jason Wang <jasowang@redhat.com> wrote: >>>>> >>>>>> We turn on device IOTLB via VIRTIO_F_IOMMU_PLATFORM unconditionally on >>>>>> platform without IOMMU support. This can lead unnecessary IOTLB >>>>>> transactions which will damage the performance. >>>>>> >>>>>> Fixing this by check whether the device is backed by IOMMU and disable >>>>>> device IOTLB. >>>>>> >>>>>> Reported-by: Halil Pasic <pasic@linux.ibm.com> >>>>>> Fixes: c471ad0e9bd46 ("vhost_net: device IOTLB support") >>>>>> Cc: qemu-stable@nongnu.org >>>>>> Signed-off-by: Jason Wang <jasowang@redhat.com> >>>>> >>>>> Tested-by: Halil Pasic <pasic@linux.ibm.com> >>>>> Reviewed-by: Halil Pasic <pasic@linux.ibm.com> >>>>> >>>>> Thank you very much for fixing this! BTW as I mentioned before it >>>>> fixes vhost-vsock with iommu_platform=on as well. >>>> >>>> Fixes as in improves performance? >>> >>> No, fixes like one does not get something like: >>> qemu-system-s390x: vhost_set_features failed: Operation not supported (95) >>> qemu-system-s390x: Error starting vhost: 95 >>> any more. >>> >>> Regards, >>> Halil >>> >>> [..] >> >> But can commit c471ad0e9bd46 actually boot a secure guest >> where iommu_platform=on is required? >> > > No, of course it can not. But I'm not sure about AMD SEV. AFAIU without > Jason's patch it does not work for AMD SEV. Tom already stated that with > SEV they don't need the IOVA translation aspect of ACCESS_PLATFORM, but > I have no idea if the condition vdev->dma_as == &address_space_memory > catches them as well or not. They probably have !=. > > CCing Tom. 
@Tom does vhost-vsock work for you with SEV and current qemu? Adding Brijesh for this, too. > > Also, one can specify iommu_platform=on on a device that ain't a part of > a secure-capable VM, just for the fun of it. And that breaks > vhost-vsock. Or is setting iommu_platform=on only valid if > qemu-system-s390x is protected virtualization capable? > > BTW, I don't have a strong opinion on the fixes tag. We currently do not > recommend setting iommu_platform, and thus I don't think we care too > much about past qemus having problems with it. > > Regards, > Halil >
On Thu, 27 Feb 2020 10:47:22 -0500 "Michael S. Tsirkin" <mst@redhat.com> wrote: > On Thu, Feb 27, 2020 at 02:02:15PM +0100, Halil Pasic wrote: > > On Wed, 26 Feb 2020 11:52:26 -0500 > > "Michael S. Tsirkin" <mst@redhat.com> wrote: > > > > > On Wed, Feb 26, 2020 at 04:36:18PM +0100, Halil Pasic wrote: > > > > On Wed, 26 Feb 2020 08:37:13 -0500 > > > > "Michael S. Tsirkin" <mst@redhat.com> wrote: > > > > > > > > > On Wed, Feb 26, 2020 at 02:28:39PM +0100, Halil Pasic wrote: > > > > > > On Wed, 26 Feb 2020 17:43:57 +0800 > > > > > > Jason Wang <jasowang@redhat.com> wrote: > > > > > > > > > > > > > We turn on device IOTLB via VIRTIO_F_IOMMU_PLATFORM unconditionally on > > > > > > > platform without IOMMU support. This can lead unnecessary IOTLB > > > > > > > transactions which will damage the performance. > > > > > > > > > > > > > > Fixing this by check whether the device is backed by IOMMU and disable > > > > > > > device IOTLB. > > > > > > > > > > > > > > Reported-by: Halil Pasic <pasic@linux.ibm.com> > > > > > > > Fixes: c471ad0e9bd46 ("vhost_net: device IOTLB support") > > > > > > > Cc: qemu-stable@nongnu.org > > > > > > > Signed-off-by: Jason Wang <jasowang@redhat.com> > > > > > > > > > > > > Tested-by: Halil Pasic <pasic@linux.ibm.com> > > > > > > Reviewed-by: Halil Pasic <pasic@linux.ibm.com> > > > > > > > > > > > > Thank you very much for fixing this! BTW as I mentioned before it > > > > > > fixes vhost-vsock with iommu_platform=on as well. > > > > > > > > > > Fixes as in improves performance? > > > > > > > > No, fixes like one does not get something like: > > > > qemu-system-s390x: vhost_set_features failed: Operation not supported (95) > > > > qemu-system-s390x: Error starting vhost: 95 > > > > any more. > > > > > > > > Regards, > > > > Halil > > > > > > > > [..] > > > > > > But can commit c471ad0e9bd46 actually boot a secure guest > > > where iommu_platform=on is required? > > > > > > > No, of course it can not. But I'm not sure about AMD SEV. 
AFAIU without > > Jason's patch it does not work for AMD SEV. Tom already stated that with > > SEV they don't need the IOVA translation aspect of ACCESS_PLATFORM, but > > I have no idea if the condition vdev->dma_as == &address_space_memory > > catches them as well or not. They probably have !=. > > > > CCing Tom. @Tom does vhost-vsock work for you with SEV and current qemu? > > > > Also, one can specify iommu_platform=on on a device that ain't a part of > > a secure-capable VM, just for the fun of it. And that breaks > > vhost-vsock. Or is setting iommu_platform=on only valid if > > qemu-system-s390x is protected virtualization capable? > > > > BTW, I don't have a strong opinion on the fixes tag. We currently do not > > recommend setting iommu_platform, and thus I don't think we care too > > much about past qemus having problems with it. > > > > Regards, > > Halil > > > Let's just say if we do have a Fixes: tag we want to set it correctly to > the commit that needs this fix. > You are absolutely correct. And c471ad0e9bd46 has nothing to do with vsock. On s390x the situation with virtio-net + vhost + iommu_platform=on seems rather complex. I did some digging, but I have no conclusive results yet. And I don't think we care all that much for iommu_platform=on for old qemus. Regards, Halil
[..] > > > > CCing Tom. @Tom does vhost-vsock work for you with SEV and current qemu? > > > > Also, one can specify iommu_platform=on on a device that ain't a part of > > a secure-capable VM, just for the fun of it. And that breaks > > vhost-vsock. Or is setting iommu_platform=on only valid if > > qemu-system-s390x is protected virtualization capable? > > > > BTW, I don't have a strong opinion on the fixes tag. We currently do not > > recommend setting iommu_platform, and thus I don't think we care too > > much about past qemus having problems with it. > > > > Regards, > > Halil > > > Let's just say if we do have a Fixes: tag we want to set it correctly to > the commit that needs this fix. > I finally did some digging regarding the performance degradation. For s390x the performance degradation on vhost-net was introduced by commit 076a93d797 ("exec: simplify address_space_get_iotlb_entry"). Before IOMMUTLBEntry.addr_mask used to be based on plen, which in turn was calculated as the rest of the memory regions size (from address), and covered most of the guest address space. That is we didn't have a whole lot of IOTLB API overhead. With commit 076a93d797 I see IOMMUTLBEntry.addr_mask == 0xfff which comes as ~TARGET_PAGE_MASK from flatview_do_translate(). To have things working properly I applied 75e5b70e6, b021d1c044, and d542800d1e on the level of 076a93d797 and 076a93d797~1. Regarding vhost-vsock. It does not work with iommu_platform=on since the very beginning (i.e. 8607f5c307 ("virtio: convert to use DMA api")). Not sure if that is a good or a bad thing. (If the vhost driver in the kernel would actually have to do the IOTLB translation, then failing in case where it does not support it seems sane. The problem is that ACCESS_PLATFORM is used for more than one thing (needs translation, and restricted memory access).) I don't think I've heard back from AMD whether vsock works with SEV or not... I don't have access to HW to test it myself. 
We (s390) don't require this being backported to the stable qemus, because for us iommu_platform=on becomes relevant with protected virtualization, and those qemu versions don't support it. Cheers, Halil
On Fri, Mar 13, 2020 at 01:44:46PM +0100, Halil Pasic wrote: > [..] > > > > > > CCing Tom. @Tom does vhost-vsock work for you with SEV and current qemu? > > > > > > Also, one can specify iommu_platform=on on a device that ain't a part of > > > a secure-capable VM, just for the fun of it. And that breaks > > > vhost-vsock. Or is setting iommu_platform=on only valid if > > > qemu-system-s390x is protected virtualization capable? > > > > > > BTW, I don't have a strong opinion on the fixes tag. We currently do not > > > recommend setting iommu_platform, and thus I don't think we care too > > > much about past qemus having problems with it. > > > > > > Regards, > > > Halil > > > > > > Let's just say if we do have a Fixes: tag we want to set it correctly to > > the commit that needs this fix. > > > > I finally did some digging regarding the performance degradation. For > s390x the performance degradation on vhost-net was introduced by commit > 076a93d797 ("exec: simplify address_space_get_iotlb_entry"). Before > IOMMUTLBEntry.addr_mask used to be based on plen, which in turn was > calculated as the rest of the memory regions size (from address), and > covered most of the guest address space. That is we didn't have a whole > lot of IOTLB API overhead. > > With commit 076a93d797 I see IOMMUTLBEntry.addr_mask == 0xfff which comes > as ~TARGET_PAGE_MASK from flatview_do_translate(). To have things working > properly I applied 75e5b70e6, b021d1c044, and d542800d1e on the level of > 076a93d797 and 076a93d797~1. Peter, what's your take on this one? > Regarding vhost-vsock. It does not work with iommu_platform=on since the > very beginning (i.e. 8607f5c307 ("virtio: convert to use DMA api")). Not > sure if that is a good or a bad thing. (If the vhost driver in the kernel > would actually have to do the IOTLB translation, then failing in case > where it does not support it seems sane. 
The problem is that > ACCESS_PLATFORM is used for more than one thing (needs translation, and > restricted memory access).) > > I don't think I've heard back from AMD whether vsock works with SEV or > not... I don't have access to HW to test it myself. > > We (s390) don't require this being backported to the stable qemus, > because for us iommu_platform=on becomes relevant with protected > virtualization, and those qemu versions don't support it. > > Cheers, > Halil
On Fri, Mar 13, 2020 at 11:29:59AM -0400, Michael S. Tsirkin wrote: > On Fri, Mar 13, 2020 at 01:44:46PM +0100, Halil Pasic wrote: > > [..] > > > > > > > > CCing Tom. @Tom does vhost-vsock work for you with SEV and current qemu? > > > > > > > > Also, one can specify iommu_platform=on on a device that ain't a part of > > > > a secure-capable VM, just for the fun of it. And that breaks > > > > vhost-vsock. Or is setting iommu_platform=on only valid if > > > > qemu-system-s390x is protected virtualization capable? > > > > > > > > BTW, I don't have a strong opinion on the fixes tag. We currently do not > > > > recommend setting iommu_platform, and thus I don't think we care too > > > > much about past qemus having problems with it. > > > > > > > > Regards, > > > > Halil > > > > > > > > > Let's just say if we do have a Fixes: tag we want to set it correctly to > > > the commit that needs this fix. > > > > > > > I finally did some digging regarding the performance degradation. For > > s390x the performance degradation on vhost-net was introduced by commit > > 076a93d797 ("exec: simplify address_space_get_iotlb_entry"). Before > > IOMMUTLBEntry.addr_mask used to be based on plen, which in turn was > > calculated as the rest of the memory regions size (from address), and > > covered most of the guest address space. That is we didn't have a whole > > lot of IOTLB API overhead. > > > > With commit 076a93d797 I see IOMMUTLBEntry.addr_mask == 0xfff which comes > > as ~TARGET_PAGE_MASK from flatview_do_translate(). To have things working > > properly I applied 75e5b70e6, b021d1c044, and d542800d1e on the level of > > 076a93d797 and 076a93d797~1. > > Peter, what's your take on this one? Commit 076a93d797 was one of the patchset where we want to provide sensible IOTLB entries and also that should start to work with huge pages. 
Frankly speaking, after a few years I forgot the original motivation of that whole thing, but IIRC there's a patch that was trying to speed things up especially for vhost, which I noticed was never merged:

https://lists.gnu.org/archive/html/qemu-devel/2017-06/msg00574.html

Regarding the current patch, I'm not sure I understand it correctly, but does that performance issue only happen when (1) there's no intel-iommu device, and (2) there is iommu_platform=on specified for the vhost backend?

If so, I'd confess I am not too surprised that this fails the boot with vhost-vsock, because after all we specified iommu_platform=on explicitly on the cmdline, so if we want it to work we can simply remove that iommu_platform=on when vhost-vsock doesn't support it yet... I thought iommu_platform=on was added for that case - when we want to force the IOMMU to be enabled from the host side, and it should always be used with a vIOMMU device.

However, I also agree that from a performance POV this patch helps for this quite special case.

Thanks,
On 3/13/20 7:44 AM, Halil Pasic wrote: > [..] >>> CCing Tom. @Tom does vhost-vsock work for you with SEV and current qemu? >>> >>> Also, one can specify iommu_platform=on on a device that ain't a part of >>> a secure-capable VM, just for the fun of it. And that breaks >>> vhost-vsock. Or is setting iommu_platform=on only valid if >>> qemu-system-s390x is protected virtualization capable? >>> >>> BTW, I don't have a strong opinion on the fixes tag. We currently do not >>> recommend setting iommu_platform, and thus I don't think we care too >>> much about past qemus having problems with it. >>> >>> Regards, >>> Halil >> >> Let's just say if we do have a Fixes: tag we want to set it correctly to >> the commit that needs this fix. >> > I finally did some digging regarding the performance degradation. For > s390x the performance degradation on vhost-net was introduced by commit > 076a93d797 ("exec: simplify address_space_get_iotlb_entry"). Before > IOMMUTLBEntry.addr_mask used to be based on plen, which in turn was > calculated as the rest of the memory regions size (from address), and > covered most of the guest address space. That is we didn't have a whole > lot of IOTLB API overhead. > > With commit 076a93d797 I see IOMMUTLBEntry.addr_mask == 0xfff which comes > as ~TARGET_PAGE_MASK from flatview_do_translate(). To have things working > properly I applied 75e5b70e6, b021d1c044, and d542800d1e on the level of > 076a93d797 and 076a93d797~1. > > Regarding vhost-vsock. It does not work with iommu_platform=on since the > very beginning (i.e. 8607f5c307 ("virtio: convert to use DMA api")). Not > sure if that is a good or a bad thing. (If the vhost driver in the kernel > would actually have to do the IOTLB translation, then failing in case > where it does not support it seems sane. The problem is that > ACCESS_PLATFORM is used for more than one thing (needs translation, and > restricted memory access).) 
>
> I don't think I've heard back from AMD whether vsock works with SEV or
> not... I don't have access to HW to test it myself.

I just tried vhost-vsock on an AMD SEV machine and it does not work. I am using FC31 (qemu 4.1.1.1.fc31).

> We (s390) don't require this being backported to the stable qemus,
> because for us iommu_platform=on becomes relevant with protected
> virtualization, and those qemu versions don't support it.
>
> Cheers,
> Halil
>
On Fri, Mar 13, 2020 at 03:27:59PM -0500, Brijesh Singh wrote: > > On 3/13/20 7:44 AM, Halil Pasic wrote: > > [..] > >>> CCing Tom. @Tom does vhost-vsock work for you with SEV and current qemu? > >>> > >>> Also, one can specify iommu_platform=on on a device that ain't a part of > >>> a secure-capable VM, just for the fun of it. And that breaks > >>> vhost-vsock. Or is setting iommu_platform=on only valid if > >>> qemu-system-s390x is protected virtualization capable? > >>> > >>> BTW, I don't have a strong opinion on the fixes tag. We currently do not > >>> recommend setting iommu_platform, and thus I don't think we care too > >>> much about past qemus having problems with it. > >>> > >>> Regards, > >>> Halil > >> > >> Let's just say if we do have a Fixes: tag we want to set it correctly to > >> the commit that needs this fix. > >> > > I finally did some digging regarding the performance degradation. For > > s390x the performance degradation on vhost-net was introduced by commit > > 076a93d797 ("exec: simplify address_space_get_iotlb_entry"). Before > > IOMMUTLBEntry.addr_mask used to be based on plen, which in turn was > > calculated as the rest of the memory regions size (from address), and > > covered most of the guest address space. That is we didn't have a whole > > lot of IOTLB API overhead. > > > > With commit 076a93d797 I see IOMMUTLBEntry.addr_mask == 0xfff which comes > > as ~TARGET_PAGE_MASK from flatview_do_translate(). To have things working > > properly I applied 75e5b70e6, b021d1c044, and d542800d1e on the level of > > 076a93d797 and 076a93d797~1. > > > > Regarding vhost-vsock. It does not work with iommu_platform=on since the > > very beginning (i.e. 8607f5c307 ("virtio: convert to use DMA api")). Not > > sure if that is a good or a bad thing. (If the vhost driver in the kernel > > would actually have to do the IOTLB translation, then failing in case > > where it does not support it seems sane. 
The problem is that > > ACCESS_PLATFORM is used for more than one thing (needs translation, and > > restricted memory access).) > > > > I don't think I've heard back from AMD whether vsock works with SEV or > > not... I don't have access to HW to test it myself. > > > I just tried vhost-vsock on AMD SEV machine and it does not work. I am > using FC31 (qemu 4.1.1.1.fc31). Neither does vhost scsi - no ACCESS_PLATFORM support. But with Jason's patch I think both should work. Pls give it a try. > > > We (s390) don't require this being backported to the stable qemus, > > because for us iommu_platform=on becomes relevant with protected > > virtualization, and those qemu versions don't support it. > > > > Cheers, > > Halil > >
On Fri, 13 Mar 2020 12:31:22 -0400 Peter Xu <peterx@redhat.com> wrote: > On Fri, Mar 13, 2020 at 11:29:59AM -0400, Michael S. Tsirkin wrote: > > On Fri, Mar 13, 2020 at 01:44:46PM +0100, Halil Pasic wrote: > > > [..] > > > > > > > > > > CCing Tom. @Tom does vhost-vsock work for you with SEV and current qemu? > > > > > > > > > > Also, one can specify iommu_platform=on on a device that ain't a part of > > > > > a secure-capable VM, just for the fun of it. And that breaks > > > > > vhost-vsock. Or is setting iommu_platform=on only valid if > > > > > qemu-system-s390x is protected virtualization capable? > > > > > > > > > > BTW, I don't have a strong opinion on the fixes tag. We currently do not > > > > > recommend setting iommu_platform, and thus I don't think we care too > > > > > much about past qemus having problems with it. > > > > > > > > > > Regards, > > > > > Halil > > > > > > > > > > > > Let's just say if we do have a Fixes: tag we want to set it correctly to > > > > the commit that needs this fix. > > > > > > > > > > I finally did some digging regarding the performance degradation. For > > > s390x the performance degradation on vhost-net was introduced by commit > > > 076a93d797 ("exec: simplify address_space_get_iotlb_entry"). Before > > > IOMMUTLBEntry.addr_mask used to be based on plen, which in turn was > > > calculated as the rest of the memory regions size (from address), and > > > covered most of the guest address space. That is we didn't have a whole > > > lot of IOTLB API overhead. > > > > > > With commit 076a93d797 I see IOMMUTLBEntry.addr_mask == 0xfff which comes > > > as ~TARGET_PAGE_MASK from flatview_do_translate(). To have things working > > > properly I applied 75e5b70e6, b021d1c044, and d542800d1e on the level of > > > 076a93d797 and 076a93d797~1. > > > > Peter, what's your take on this one? > > Commit 076a93d797 was one of the patchset where we want to provide > sensible IOTLB entries and also that should start to work with huge > pages. 
> Frankly speaking after a few years I forgot the original
> motivation of that whole thing, but IIRC there's a patch that was
> trying to speedup especially for vhost but I noticed it's not merged:
>
> https://lists.gnu.org/archive/html/qemu-devel/2017-06/msg00574.html

From the looks of it, I don't think we would have seen that big a performance degradation had this patch been included. I can give it a spin if you like. Shall I?

> Regarding the current patch, I'm not sure I understand it
> correctly, but does that performance issue only happen when (1) there's
> no intel-iommu device, and (2) there is iommu_platform=on specified
> for the vhost backend?

I can confirm that your description covers my scenario. I didn't investigate what happens when we have an intel-iommu, because s390 does not do intel-iommu. I can also confirm that no performance degradation is observed when the virtio-net device has iommu_platform=off. The iommu_platform property is a virtio device (and not a backend) level property.

> If so, I'd confess I am not too surprised if this fails the boot with
> vhost-vsock because after all we specified iommu_platform=on
> explicitly in the cmdline, so if we want it to work we can simply
> remove that iommu_platform=on when vhost-vsock doesn't support it
> yet... I thought iommu_platform=on was added for that case - when we
> want to force the IOMMU to be enabled from the host side, and it should
> always be used with a vIOMMU device.

The problem is that the virtio feature bit F_ACCESS_PLATFORM, which is directly controlled by the iommu_platform property, stands for two things:
1) the device needs to do IOVA translation, and
2) the access of the device to the guest's RAM is restricted.

There are cases where 2) does apply and 1) does not. We need to specify iommu_platform=on to make the virtio implementation in the guest use the DMA API, because we need to grant access to memory as required. But we don't need translation and we don't have a vIOMMU.
Regards, Halil > However I also agree that from performance POV this patch helps for > this quite special case. > > Thanks, >
On Fri, Mar 13, 2020 at 12:31:22PM -0400, Peter Xu wrote: > On Fri, Mar 13, 2020 at 11:29:59AM -0400, Michael S. Tsirkin wrote: > > On Fri, Mar 13, 2020 at 01:44:46PM +0100, Halil Pasic wrote: > > > [..] > > > > > > > > > > CCing Tom. @Tom does vhost-vsock work for you with SEV and current qemu? > > > > > > > > > > Also, one can specify iommu_platform=on on a device that ain't a part of > > > > > a secure-capable VM, just for the fun of it. And that breaks > > > > > vhost-vsock. Or is setting iommu_platform=on only valid if > > > > > qemu-system-s390x is protected virtualization capable? > > > > > > > > > > BTW, I don't have a strong opinion on the fixes tag. We currently do not > > > > > recommend setting iommu_platform, and thus I don't think we care too > > > > > much about past qemus having problems with it. > > > > > > > > > > Regards, > > > > > Halil > > > > > > > > > > > > Let's just say if we do have a Fixes: tag we want to set it correctly to > > > > the commit that needs this fix. > > > > > > > > > > I finally did some digging regarding the performance degradation. For > > > s390x the performance degradation on vhost-net was introduced by commit > > > 076a93d797 ("exec: simplify address_space_get_iotlb_entry"). Before > > > IOMMUTLBEntry.addr_mask used to be based on plen, which in turn was > > > calculated as the rest of the memory regions size (from address), and > > > covered most of the guest address space. That is we didn't have a whole > > > lot of IOTLB API overhead. > > > > > > With commit 076a93d797 I see IOMMUTLBEntry.addr_mask == 0xfff which comes > > > as ~TARGET_PAGE_MASK from flatview_do_translate(). To have things working > > > properly I applied 75e5b70e6, b021d1c044, and d542800d1e on the level of > > > 076a93d797 and 076a93d797~1. > > > > Peter, what's your take on this one? > > Commit 076a93d797 was one of the patchset where we want to provide > sensible IOTLB entries and also that should start to work with huge > pages. 
So the issue fundamentally is that it never produces entries larger than page size. Wasteful even just with huge pages, all the more so with passthrough, which could have gigabyte entries.

Want to try fixing that?

> Frankly speaking after a few years I forgot the original
> motivation of that whole thing, but IIRC there's a patch that was
> trying to speedup especially for vhost but I noticed it's not merged:
>
> https://lists.gnu.org/archive/html/qemu-devel/2017-06/msg00574.html
>
> Regarding the current patch, I'm not sure I understand it
> correctly, but does that performance issue only happen when (1) there's
> no intel-iommu device, and (2) there is iommu_platform=on specified
> for the vhost backend?
>
> If so, I'd confess I am not too surprised if this fails the boot with
> vhost-vsock because after all we specified iommu_platform=on
> explicitly in the cmdline, so if we want it to work we can simply
> remove that iommu_platform=on when vhost-vsock doesn't support it
> yet... I thought iommu_platform=on was added for that case - when we
> want to force the IOMMU to be enabled from the host side, and it should
> always be used with a vIOMMU device.
>
> However I also agree that from performance POV this patch helps for
> this quite special case.
>
> Thanks,
>
> --
> Peter Xu
On Mon, Mar 16, 2020 at 05:57:37PM +0100, Halil Pasic wrote: > On Fri, 13 Mar 2020 12:31:22 -0400 > Peter Xu <peterx@redhat.com> wrote: > > > On Fri, Mar 13, 2020 at 11:29:59AM -0400, Michael S. Tsirkin wrote: > > > On Fri, Mar 13, 2020 at 01:44:46PM +0100, Halil Pasic wrote: > > > > [..] > > > > > > > > > > > > CCing Tom. @Tom does vhost-vsock work for you with SEV and current qemu? > > > > > > > > > > > > Also, one can specify iommu_platform=on on a device that ain't a part of > > > > > > a secure-capable VM, just for the fun of it. And that breaks > > > > > > vhost-vsock. Or is setting iommu_platform=on only valid if > > > > > > qemu-system-s390x is protected virtualization capable? > > > > > > > > > > > > BTW, I don't have a strong opinion on the fixes tag. We currently do not > > > > > > recommend setting iommu_platform, and thus I don't think we care too > > > > > > much about past qemus having problems with it. > > > > > > > > > > > > Regards, > > > > > > Halil > > > > > > > > > > > > > > > Let's just say if we do have a Fixes: tag we want to set it correctly to > > > > > the commit that needs this fix. > > > > > > > > > > > > > I finally did some digging regarding the performance degradation. For > > > > s390x the performance degradation on vhost-net was introduced by commit > > > > 076a93d797 ("exec: simplify address_space_get_iotlb_entry"). Before > > > > IOMMUTLBEntry.addr_mask used to be based on plen, which in turn was > > > > calculated as the rest of the memory regions size (from address), and > > > > covered most of the guest address space. That is we didn't have a whole > > > > lot of IOTLB API overhead. > > > > > > > > With commit 076a93d797 I see IOMMUTLBEntry.addr_mask == 0xfff which comes > > > > as ~TARGET_PAGE_MASK from flatview_do_translate(). To have things working > > > > properly I applied 75e5b70e6, b021d1c044, and d542800d1e on the level of > > > > 076a93d797 and 076a93d797~1. > > > > > > Peter, what's your take on this one? 
> > > > Commit 076a93d797 was one of the patchset where we want to provide > > sensible IOTLB entries and also that should start to work with huge > > pages. Frankly speaking after a few years I forgot the original > > motivation of that whole thing, but IIRC there's a patch that was > > trying to speedup especially for vhost but I noticed it's not merged: > > > > https://lists.gnu.org/archive/html/qemu-devel/2017-06/msg00574.html [1] > > > > From the looks of it, I don't think we would have seen that big > performance degradation had this patch been included. I can give > it a spin if you like. Shall I? > > > Regarding to the current patch, I'm not sure I understand it > > correctly, but is that performance issue only happens when (1) there's > > no intel-iommu device, and (2) there is iommu_platform=on specified > > for the vhost backend? > > > > I can confirm, that your description covers my scenario. I didn't > investigate what happens when we have an intel-iommu, because s390 does > not do intel-iommu. I can also confirm that no performance degradation > is observed when the virtio-net has iommu_platform=off. The property > iommu_platform is a virtio device (and not a backend) level property. > > > > If so, I'd confess I am not too surprised if this fails the boot with > > vhost-vsock because after all we speicified iommu_platform=on > > explicitly in the cmdline, so if we want it to work we can simply > > remove that iommu_platform=on when vhost-vsock doesn't support it > > yet... I thougth iommu_platform=on was added for that case - when we > > want to force IOMMU to be enabled from host side, and it should always > > be used with a vIOMMU device. > > > > The problem is that the virtio feature bit F_ACCESS_PLATFORM, which is > directly controlled by the iommu_platform proprerty stands for two things > 1) need to do IOVA translation > 2) the access of the device to the guests RAM is restricted. > > There are cases where 2) does apply and 1) does not. 
> We need to specify
> iommu_platform=on to make the virtio implementation in the guest use
> the dma api, because we need to grant access to memory as required. But
> we don't need translation and we don't have a vIOMMU.

I see the point of this patch now. I'm still unclear on how s390 handles DMA protection, but it seems totally different from the IOMMU model on x86/arm. Considering this, please ignore the patch above [1], because playing with the IOTLB caches like that is hackish in any case, and the current patch should be much better (and easier) IMHO.

Thanks,
On Mon, Mar 16, 2020 at 01:19:54PM -0400, Michael S. Tsirkin wrote: > On Fri, Mar 13, 2020 at 12:31:22PM -0400, Peter Xu wrote: > > On Fri, Mar 13, 2020 at 11:29:59AM -0400, Michael S. Tsirkin wrote: > > > On Fri, Mar 13, 2020 at 01:44:46PM +0100, Halil Pasic wrote: > > > > [..] > > > > > > > > > > > > CCing Tom. @Tom does vhost-vsock work for you with SEV and current qemu? > > > > > > > > > > > > Also, one can specify iommu_platform=on on a device that ain't a part of > > > > > > a secure-capable VM, just for the fun of it. And that breaks > > > > > > vhost-vsock. Or is setting iommu_platform=on only valid if > > > > > > qemu-system-s390x is protected virtualization capable? > > > > > > > > > > > > BTW, I don't have a strong opinion on the fixes tag. We currently do not > > > > > > recommend setting iommu_platform, and thus I don't think we care too > > > > > > much about past qemus having problems with it. > > > > > > > > > > > > Regards, > > > > > > Halil > > > > > > > > > > > > > > > Let's just say if we do have a Fixes: tag we want to set it correctly to > > > > > the commit that needs this fix. > > > > > > > > > > > > > I finally did some digging regarding the performance degradation. For > > > > s390x the performance degradation on vhost-net was introduced by commit > > > > 076a93d797 ("exec: simplify address_space_get_iotlb_entry"). Before > > > > IOMMUTLBEntry.addr_mask used to be based on plen, which in turn was > > > > calculated as the rest of the memory regions size (from address), and > > > > covered most of the guest address space. That is we didn't have a whole > > > > lot of IOTLB API overhead. > > > > > > > > With commit 076a93d797 I see IOMMUTLBEntry.addr_mask == 0xfff which comes > > > > as ~TARGET_PAGE_MASK from flatview_do_translate(). To have things working > > > > properly I applied 75e5b70e6, b021d1c044, and d542800d1e on the level of > > > > 076a93d797 and 076a93d797~1. > > > > > > Peter, what's your take on this one? 
> >
> > Commit 076a93d797 was one of the patchset where we want to provide
> > sensible IOTLB entries and also that should start to work with huge
> > pages.
>
> So the issue fundamentally is that it
> never produces entries larger than page size.
>
> Wasteful even just with huge pages, all the more
> so with passthrough, which could have gigabyte
> entries.
>
> Want to try fixing that?

Yes, we can fix that, but I'm still not sure whether changing the interface of address_space_get_iotlb_entry() to cover ad hoc regions is a good idea, because I think it's still a memory core API and IMHO it would still be good for the returned IOTLBs to be what the hardware will be using (always page-aligned IOTLBs). It would also still not be ideal, because the vhost backend will still need to send the MISSING messages and block for each of the contiguous guest memory ranges registered, so there will still be mysterious delays. Not to mention that logically all the caches can be invalidated too, so in that sense I think it's as hacky as the vhost speedup patch mentioned below.

Ideally, vhost should be able to know when PT is enabled or disabled for the device, so the vhost backend (kernel or userspace) could directly use GPA for DMA. That might need some new vhost interface.

For s390's specific issue, I would think Jason's patch is a simple and ideal solution already.

Thanks,

>
> > Frankly speaking after a few years I forgot the original
> > motivation of that whole thing, but IIRC there's a patch that was
> > trying to speedup especially for vhost but I noticed it's not merged:
> >
> > https://lists.gnu.org/archive/html/qemu-devel/2017-06/msg00574.html
> >
> > Regarding the current patch, I'm not sure I understand it
> > correctly, but does that performance issue only happen when (1) there's
> > no intel-iommu device, and (2) there is iommu_platform=on specified
> > for the vhost backend?
> > > > If so, I'd confess I am not too surprised if this fails the boot with > > vhost-vsock because after all we speicified iommu_platform=on > > explicitly in the cmdline, so if we want it to work we can simply > > remove that iommu_platform=on when vhost-vsock doesn't support it > > yet... I thougth iommu_platform=on was added for that case - when we > > want to force IOMMU to be enabled from host side, and it should always > > be used with a vIOMMU device. > > > > However I also agree that from performance POV this patch helps for > > this quite special case. > > > > Thanks, > > > > -- > > Peter Xu >
On 2020/3/17 上午2:14, Peter Xu wrote: > On Mon, Mar 16, 2020 at 01:19:54PM -0400, Michael S. Tsirkin wrote: >> On Fri, Mar 13, 2020 at 12:31:22PM -0400, Peter Xu wrote: >>> On Fri, Mar 13, 2020 at 11:29:59AM -0400, Michael S. Tsirkin wrote: >>>> On Fri, Mar 13, 2020 at 01:44:46PM +0100, Halil Pasic wrote: >>>>> [..] >>>>>>> CCing Tom. @Tom does vhost-vsock work for you with SEV and current qemu? >>>>>>> >>>>>>> Also, one can specify iommu_platform=on on a device that ain't a part of >>>>>>> a secure-capable VM, just for the fun of it. And that breaks >>>>>>> vhost-vsock. Or is setting iommu_platform=on only valid if >>>>>>> qemu-system-s390x is protected virtualization capable? >>>>>>> >>>>>>> BTW, I don't have a strong opinion on the fixes tag. We currently do not >>>>>>> recommend setting iommu_platform, and thus I don't think we care too >>>>>>> much about past qemus having problems with it. >>>>>>> >>>>>>> Regards, >>>>>>> Halil >>>>>> Let's just say if we do have a Fixes: tag we want to set it correctly to >>>>>> the commit that needs this fix. >>>>>> >>>>> I finally did some digging regarding the performance degradation. For >>>>> s390x the performance degradation on vhost-net was introduced by commit >>>>> 076a93d797 ("exec: simplify address_space_get_iotlb_entry"). Before >>>>> IOMMUTLBEntry.addr_mask used to be based on plen, which in turn was >>>>> calculated as the rest of the memory regions size (from address), and >>>>> covered most of the guest address space. That is we didn't have a whole >>>>> lot of IOTLB API overhead. >>>>> >>>>> With commit 076a93d797 I see IOMMUTLBEntry.addr_mask == 0xfff which comes >>>>> as ~TARGET_PAGE_MASK from flatview_do_translate(). To have things working >>>>> properly I applied 75e5b70e6, b021d1c044, and d542800d1e on the level of >>>>> 076a93d797 and 076a93d797~1. >>>> Peter, what's your take on this one? 
>>> Commit 076a93d797 was one of the patchset where we want to provide >>> sensible IOTLB entries and also that should start to work with huge >>> pages. >> So the issue bundamentally is that it >> never produces entries larger than page size. >> >> Wasteful even just with huge pages, all the more >> so which passthrough which could have giga-byte >> entries. >> >> Want to try fixing that? > Yes we can fix that, but I'm still not sure whether changing the > interface of address_space_get_iotlb_entry() to cover adhoc regions is > a good idea, because I think it's still a memory core API and imho it > would still be good to have IOTLBs returned to be what the hardware > will be using (always page aligned IOTLBs). Also it would still be > not ideal because vhost backend will still need to send the MISSING > messages and block for each of the continuous guest memory ranges > registered, so there will still be misterious delay. Not to say > logically all the caches can be invalidated too so in that sense I > think it's as hacky as the vhost speedup patch mentioned below.. > > Ideally I think vhost should be able to know when PT is enabled or > disabled for the device, so the vhost backend (kernel or userspace) > should be able to directly use GPA for DMA. That might need some new > vhost interface. Yes but I think we don't need another API since we can send GPA->HVA mapping via device IOTLB API when we find there's no DMA translation at all (either PT or no vIOMMU). Vhost doesn't need to know whether an address is an IOVA (vIOMMU) , GPA (no vIOMMU), or even HVA (dpdk virtio-user). Thanks > > For the s390's specific issue, I would think Jason's patch an simple > and ideal solution already. > > Thanks, >
On Mon, Mar 16, 2020 at 02:14:05PM -0400, Peter Xu wrote:
> On Mon, Mar 16, 2020 at 01:19:54PM -0400, Michael S. Tsirkin wrote:
> > On Fri, Mar 13, 2020 at 12:31:22PM -0400, Peter Xu wrote:
> > > On Fri, Mar 13, 2020 at 11:29:59AM -0400, Michael S. Tsirkin wrote:
> > > > On Fri, Mar 13, 2020 at 01:44:46PM +0100, Halil Pasic wrote:
> > > > > [..]
> > > > > > > CCing Tom. @Tom does vhost-vsock work for you with SEV and current qemu?
> > > > > > >
> > > > > > > Also, one can specify iommu_platform=on on a device that ain't a part of
> > > > > > > a secure-capable VM, just for the fun of it. And that breaks
> > > > > > > vhost-vsock. Or is setting iommu_platform=on only valid if
> > > > > > > qemu-system-s390x is protected virtualization capable?
> > > > > > >
> > > > > > > BTW, I don't have a strong opinion on the fixes tag. We currently do not
> > > > > > > recommend setting iommu_platform, and thus I don't think we care too
> > > > > > > much about past qemus having problems with it.
> > > > > > >
> > > > > > > Regards,
> > > > > > > Halil
> > > > > >
> > > > > > Let's just say if we do have a Fixes: tag we want to set it correctly to
> > > > > > the commit that needs this fix.
> > > > >
> > > > > I finally did some digging regarding the performance degradation. For
> > > > > s390x the performance degradation on vhost-net was introduced by commit
> > > > > 076a93d797 ("exec: simplify address_space_get_iotlb_entry"). Before,
> > > > > IOMMUTLBEntry.addr_mask used to be based on plen, which in turn was
> > > > > calculated as the rest of the memory region's size (from address), and
> > > > > covered most of the guest address space. That is, we didn't have a whole
> > > > > lot of IOTLB API overhead.
> > > > >
> > > > > With commit 076a93d797 I see IOMMUTLBEntry.addr_mask == 0xfff, which comes
> > > > > as ~TARGET_PAGE_MASK from flatview_do_translate(). To have things working
> > > > > properly I applied 75e5b70e6, b021d1c044, and d542800d1e on the level of
> > > > > 076a93d797 and 076a93d797~1.
> > > >
> > > > Peter, what's your take on this one?
> > >
> > > Commit 076a93d797 was one of the patchset where we wanted to provide
> > > sensible IOTLB entries and also that should start to work with huge
> > > pages.
> >
> > So the issue fundamentally is that it never produces entries larger
> > than page size.
> >
> > Wasteful even just with huge pages, all the more so with passthrough,
> > which could have gigabyte entries.
> >
> > Want to try fixing that?
>
> Yes, we can fix that, but I'm still not sure whether changing the
> interface of address_space_get_iotlb_entry() to cover ad hoc regions is
> a good idea, because I think it's still a memory core API and IMHO it
> would still be good to have the returned IOTLBs be what the hardware
> will be using (always page-aligned IOTLBs).

E.g. with virtio-iommu, there's no hardware in sight. Even with e.g. VT-d,
page-aligned does not mean TARGET_PAGE; it can be much bigger.

> Also it would still not be ideal because the vhost backend will still
> need to send the MISSING messages and block for each of the contiguous
> guest memory ranges registered, so there will still be a mysterious
> delay. Not to say logically all the caches can be invalidated too, so
> in that sense I think it's as hacky as the vhost speedup patch
> mentioned below..
>
> Ideally I think vhost should be able to know when PT is enabled or
> disabled for the device, so the vhost backend (kernel or userspace)
> should be able to directly use GPA for DMA. That might need some new
> vhost interface.
>
> For the s390's specific issue, I would think Jason's patch a simple
> and ideal solution already.
>
> Thanks,
>
> > > Frankly speaking, after a few years I forgot the original
> > > motivation of that whole thing, but IIRC there's a patch that was
> > > trying to speed things up especially for vhost, but I noticed it's not merged:
> > >
> > > https://lists.gnu.org/archive/html/qemu-devel/2017-06/msg00574.html
> > >
> > > Regarding the current patch, I'm not sure I understand it
> > > correctly, but does that performance issue only happen when (1) there's
> > > no intel-iommu device, and (2) there is iommu_platform=on specified
> > > for the vhost backend?
> > >
> > > If so, I'd confess I am not too surprised if this fails the boot with
> > > vhost-vsock, because after all we specified iommu_platform=on
> > > explicitly on the cmdline, so if we want it to work we can simply
> > > remove that iommu_platform=on when vhost-vsock doesn't support it
> > > yet... I thought iommu_platform=on was added for that case - when we
> > > want to force IOMMU to be enabled from the host side, and it should
> > > always be used with a vIOMMU device.
> > >
> > > However I also agree that from a performance POV this patch helps for
> > > this quite special case.
> > >
> > > Thanks,
> > >
> > > --
> > > Peter Xu
>
> --
> Peter Xu
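The overhead Halil measured follows directly from the entry granularity: with IOMMUTLBEntry.addr_mask == 0xfff, each entry covers only 4 KiB, so a vhost backend touching a large guest region pays one IOTLB miss/update round trip per page. A back-of-envelope sketch (a hypothetical helper for illustration, not QEMU code):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not QEMU code): number of IOTLB miss/update
 * round trips needed to map region_bytes when each IOTLB entry
 * covers addr_mask + 1 bytes. */
static uint64_t iotlb_misses(uint64_t region_bytes, uint64_t addr_mask)
{
    uint64_t entry_bytes = addr_mask + 1;

    /* Round up: a partial trailing entry still costs one miss. */
    return (region_bytes + entry_bytes - 1) / entry_bytes;
}
```

For 1 GiB of guest memory, 4 KiB entries (addr_mask == 0xfff) cost 262144 round trips, while a single 1 GiB entry costs one — which is why the pre-076a93d797 behavior of covering most of the address space per entry kept the IOTLB API overhead low.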
On Tue, Mar 17, 2020 at 11:04:26AM +0800, Jason Wang wrote:
> On 2020/3/17 2:14 AM, Peter Xu wrote:
> > On Mon, Mar 16, 2020 at 01:19:54PM -0400, Michael S. Tsirkin wrote:
[...]
> > Yes, we can fix that, but I'm still not sure whether changing the
> > interface of address_space_get_iotlb_entry() to cover ad hoc regions is
> > a good idea, because I think it's still a memory core API and IMHO it
> > would still be good to have the returned IOTLBs be what the hardware
> > will be using (always page-aligned IOTLBs). Also it would still not be
> > ideal because the vhost backend will still need to send the MISSING
> > messages and block for each of the contiguous guest memory ranges
> > registered, so there will still be a mysterious delay. Not to say
> > logically all the caches can be invalidated too, so in that sense I
> > think it's as hacky as the vhost speedup patch mentioned below..
> >
> > Ideally I think vhost should be able to know when PT is enabled or
> > disabled for the device, so the vhost backend (kernel or userspace)
> > should be able to directly use GPA for DMA. That might need some new
> > vhost interface.
>
> Yes, but I think we don't need another API, since we can send the GPA->HVA
> mapping via the device IOTLB API when we find there's no DMA translation at
> all (either PT or no vIOMMU).

Jason,

Do you mean what we've worked on before?

https://lists.gnu.org/archive/html/qemu-devel/2017-06/msg00574.html

(I just read the previous discussion on that patch; it seems to be
exactly what we've discussed again...)

Thanks,

> Vhost doesn't need to know whether an address is an IOVA (vIOMMU), a GPA
> (no vIOMMU), or even an HVA (DPDK virtio-user).
>
> Thanks
>
> > For the s390's specific issue, I would think Jason's patch a simple
> > and ideal solution already.
> >
> > Thanks,
On Tue, Mar 17, 2020 at 02:28:42AM -0400, Michael S. Tsirkin wrote:
> On Mon, Mar 16, 2020 at 02:14:05PM -0400, Peter Xu wrote:
[...]
> > Yes, we can fix that, but I'm still not sure whether changing the
> > interface of address_space_get_iotlb_entry() to cover ad hoc regions is
> > a good idea, because I think it's still a memory core API and IMHO it
> > would still be good to have the returned IOTLBs be what the hardware
> > will be using (always page-aligned IOTLBs).
>
> E.g. with virtio-iommu, there's no hardware in sight.
> Even with e.g. VT-d, page-aligned does not mean TARGET_PAGE;
> it can be much bigger.

Right. Sorry to be unclear, but I meant the emulated device (in this
case, for x86, it's VT-d) should follow the hardware. Here the page
mask is decided by VT-d in vtd_iommu_translate() for PT mode, which is
4K only. For another example, ARM SMMU does a similar thing (returning
PAGE_SIZE when PT is enabled, in smmuv3_translate()). That actually
makes sense to me. On the other hand, I'm not sure whether there's a
side effect if we change this to cover the whole address space for PT.

Thanks,

[...]

--
Peter Xu
On Tue, Mar 17, 2020 at 10:39:04AM -0400, Peter Xu wrote:
> On Tue, Mar 17, 2020 at 02:28:42AM -0400, Michael S. Tsirkin wrote:
[...]
> Right. Sorry to be unclear, but I meant the emulated device (in this
> case, for x86, it's VT-d) should follow the hardware. Here the page
> mask is decided by VT-d in vtd_iommu_translate() for PT mode, which is
> 4K only. For another example, ARM SMMU does a similar thing (returning
> PAGE_SIZE when PT is enabled, in smmuv3_translate()). That actually
> makes sense to me. On the other hand, I'm not sure whether there's a
> side effect if we change this to cover the whole address space for PT.
>
> Thanks,

Well, we can translate a batch of entries in a loop, and as long as the
VA/PA mappings are consistent, treat the batch as one. This is a classical
batching approach, and not doing this is a classical reason for bad
performance.

[...]
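The batching idea Michael describes can be sketched as a coalescing pass (a hypothetical illustration, not QEMU code): walk consecutive page-sized IOTLB entries, and as long as the IOVA->HVA mapping stays contiguous, fold them into one larger entry before handing it to the consumer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch (not QEMU code): an IOTLB entry covering
 * addr_mask + 1 bytes, mapping guest IOVA to host VA. */
typedef struct {
    uint64_t iova;
    uint64_t hva;
    uint64_t addr_mask;
} Entry;

/* Coalesce the contiguous run starting at *idx into one entry;
 * advances *idx past the merged run. */
static Entry coalesce(const Entry *e, size_t n, size_t *idx)
{
    Entry out = e[*idx];
    uint64_t span = out.addr_mask + 1;
    size_t i = *idx + 1;

    /* Merge while both the IOVA and the HVA continue contiguously,
     * i.e. the VA/PA mappings of the batch are consistent. */
    while (i < n && e[i].iova == out.iova + span
                 && e[i].hva == out.hva + span) {
        span += e[i].addr_mask + 1;
        i++;
    }
    out.addr_mask = span - 1;
    *idx = i;
    return out;
}
```

Three contiguous 4 KiB translations then collapse into one entry with addr_mask == 0x2fff, so the vhost backend sees one mapping update instead of three.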
On 2020/3/17 10:13 PM, Peter Xu wrote:
> On Tue, Mar 17, 2020 at 11:04:26AM +0800, Jason Wang wrote:
[...]
>> Yes, but I think we don't need another API, since we can send the GPA->HVA
>> mapping via the device IOTLB API when we find there's no DMA translation
>> at all (either PT or no vIOMMU).
>
> Jason,
>
> Do you mean what we've worked on before?
>
> https://lists.gnu.org/archive/html/qemu-devel/2017-06/msg00574.html
>
> (I just read the previous discussion on that patch; it seems to be
> exactly what we've discussed again...)
>
> Thanks,

Right, something like that. But it's not urgent now, considering this patch
has been merged.

Thanks
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 9edfadc81d..9182a00495 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -290,7 +290,14 @@ static int vhost_dev_has_iommu(struct vhost_dev *dev)
 {
     VirtIODevice *vdev = dev->vdev;
 
-    return virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM);
+    /*
+     * For vhost, VIRTIO_F_IOMMU_PLATFORM means the backend supports
+     * the incremental memory mapping API via the IOTLB API. For a
+     * platform that does not have an IOMMU, there's no need to enable
+     * this feature, which may cause unnecessary IOTLB miss/update
+     * transactions.
+     */
+    return vdev->dma_as != &address_space_memory &&
+           virtio_host_has_feature(vdev, VIRTIO_F_IOMMU_PLATFORM);
 }
 
 static void *vhost_memory_map(struct vhost_dev *dev, hwaddr addr,
@@ -765,6 +772,9 @@ static int vhost_dev_set_features(struct vhost_dev *dev,
     if (enable_log) {
         features |= 0x1ULL << VHOST_F_LOG_ALL;
     }
+    if (!vhost_dev_has_iommu(dev)) {
+        features &= ~(0x1ULL << VIRTIO_F_IOMMU_PLATFORM);
+    }
     r = dev->vhost_ops->vhost_set_features(dev, features);
     if (r < 0) {
         VHOST_OPS_DEBUG("vhost_set_features failed");
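The second hunk's effect on feature negotiation can be shown in isolation (a hypothetical standalone sketch, not the QEMU function itself; per the virtio specification, VIRTIO_F_IOMMU_PLATFORM is feature bit 33): when the device is not behind a vIOMMU, the bit is masked out before the features reach the vhost backend, so the backend never takes the device-IOTLB path.

```c
#include <assert.h>
#include <stdint.h>

/* Feature bit number per the virtio spec. */
#define VIRTIO_F_IOMMU_PLATFORM 33

/* Hypothetical sketch: clear VIRTIO_F_IOMMU_PLATFORM from the feature
 * set when the device has no vIOMMU backing, mirroring the patch's
 * change to vhost_dev_set_features(). */
static uint64_t adjust_vhost_features(uint64_t features, int has_iommu)
{
    if (!has_iommu) {
        features &= ~(UINT64_C(1) << VIRTIO_F_IOMMU_PLATFORM);
    }
    return features;
}
```

With this, a guest that negotiated bit 33 on a platform without an IOMMU no longer forces the vhost backend into IOTLB miss/update round trips; other feature bits pass through untouched.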
We turn on device IOTLB via VIRTIO_F_IOMMU_PLATFORM unconditionally, even
on platforms without IOMMU support. This can lead to unnecessary IOTLB
transactions which damage performance.

Fix this by checking whether the device is backed by an IOMMU and
disabling device IOTLB otherwise.

Reported-by: Halil Pasic <pasic@linux.ibm.com>
Fixes: c471ad0e9bd46 ("vhost_net: device IOTLB support")
Cc: qemu-stable@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
Changes from V1:
- do not check acked_features
- reuse vhost_dev_has_iommu()
---
 hw/virtio/vhost.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)