Message ID | 94df5b2180d61fb2ee2b04cc007981e58b6479a9.1689030052.git.mst@redhat.com |
---|---|
State | New |
Headers | show |
Series | [PULL,01/66] vdpa: Remove status in reset tracing | expand |
On Tue, 11 Jul 2023 at 00:04, Michael S. Tsirkin <mst@redhat.com> wrote: > > From: Eric Auger <eric.auger@redhat.com> > > When running on a 64kB page size host and protecting a VFIO device > with the virtio-iommu, qemu crashes with this kind of message: > > qemu-kvm: virtio-iommu page mask 0xfffffffffffff000 is incompatible > with mask 0x20010000 > qemu: hardware error: vfio: DMA mapping failed, unable to continue > > This is due to the fact the IOMMU MR corresponding to the VFIO device > is enabled very late on domain attach, after the machine init. > The device reports a minimal 64kB page size but it is too late to be > applied. virtio_iommu_set_page_size_mask() fails and this causes > vfio_listener_region_add() to end up with hw_error(); > > To work around this issue, we transiently enable the IOMMU MR on > machine init to collect the page size requirements and then restore > the bypass state. > > Fixes: 90519b9053 ("virtio-iommu: Add bypass mode support to assigned device") > Signed-off-by: Eric Auger <eric.auger@redhat.com> Hi; Coverity complains about this change (CID 1517772): > +static void virtio_iommu_freeze_granule(Notifier *notifier, void *data) > +{ > + VirtIOIOMMU *s = container_of(notifier, VirtIOIOMMU, machine_done); > + int granule; > + > + if (likely(s->config.bypass)) { > + /* > + * Transient IOMMU MR enable to collect page_size_mask requirements > + * through memory_region_iommu_set_page_size_mask() called by > + * VFIO region_add() callback > + */ > + s->config.bypass = false; > + virtio_iommu_switch_address_space_all(s); > + /* restore default */ > + s->config.bypass = true; > + virtio_iommu_switch_address_space_all(s); > + } > + s->granule_frozen = true; > + granule = ctz64(s->config.page_size_mask); > + trace_virtio_iommu_freeze_granule(BIT(granule)); Specifically, in this code, it thinks that ctz64() can return 64, in which case BIT(granule) is shifting off the end of the value, which is undefined behaviour. This can happen if s->config.page_size_mask is 0 -- are there assertions/checks that that can't happen elsewhere? Secondly, BIT() only works for values up to 32, since it works on type unsigned long, which might be a 32-bit type on some hosts. Since you used ctz64() you probably want BIT_ULL() which uses the ULL type which definitely has 64 bits. thanks -- PMM
On Mon, Jul 17, 2023 at 11:50:54AM +0100, Peter Maydell wrote: > On Tue, 11 Jul 2023 at 00:04, Michael S. Tsirkin <mst@redhat.com> wrote: > > > > From: Eric Auger <eric.auger@redhat.com> > > > > When running on a 64kB page size host and protecting a VFIO device > > with the virtio-iommu, qemu crashes with this kind of message: > > > > qemu-kvm: virtio-iommu page mask 0xfffffffffffff000 is incompatible > > with mask 0x20010000 > > qemu: hardware error: vfio: DMA mapping failed, unable to continue > > > > This is due to the fact the IOMMU MR corresponding to the VFIO device > > is enabled very late on domain attach, after the machine init. > > The device reports a minimal 64kB page size but it is too late to be > > applied. virtio_iommu_set_page_size_mask() fails and this causes > > vfio_listener_region_add() to end up with hw_error(); > > > > To work around this issue, we transiently enable the IOMMU MR on > > machine init to collect the page size requirements and then restore > > the bypass state. > > > > Fixes: 90519b9053 ("virtio-iommu: Add bypass mode support to assigned device") > > Signed-off-by: Eric Auger <eric.auger@redhat.com> > > Hi; Coverity complains about this change (CID 1517772): > > > +static void virtio_iommu_freeze_granule(Notifier *notifier, void *data) > > +{ > > + VirtIOIOMMU *s = container_of(notifier, VirtIOIOMMU, machine_done); > > + int granule; > > + > > + if (likely(s->config.bypass)) { > > + /* > > + * Transient IOMMU MR enable to collect page_size_mask requirements > > + * through memory_region_iommu_set_page_size_mask() called by > > + * VFIO region_add() callback > > + */ > > + s->config.bypass = false; > > + virtio_iommu_switch_address_space_all(s); > > + /* restore default */ > > + s->config.bypass = true; > > + virtio_iommu_switch_address_space_all(s); > > + } > > + s->granule_frozen = true; > > + granule = ctz64(s->config.page_size_mask); > > + trace_virtio_iommu_freeze_granule(BIT(granule)); > > Specifically, in this code, it thinks that ctz64() can > return 64, in which case BIT(granule) is shifting off > the end of the value, which is undefined behaviour. > This can happen if s->config.page_size_mask is 0 -- are > there assertions/checks that that can't happen elsewhere? > > Secondly, BIT() only works for values up to 32, since > it works on type unsigned long, which might be a 32-bit > type on some hosts. Since you used ctz64() > you probably want BIT_ULL() which uses the ULL type > which definitely has 64 bits. > > thanks > -- PMM Thanks! Eric can you fix pls?
Hi Peter, On 7/17/23 12:50, Peter Maydell wrote: > On Tue, 11 Jul 2023 at 00:04, Michael S. Tsirkin <mst@redhat.com> wrote: >> From: Eric Auger <eric.auger@redhat.com> >> >> When running on a 64kB page size host and protecting a VFIO device >> with the virtio-iommu, qemu crashes with this kind of message: >> >> qemu-kvm: virtio-iommu page mask 0xfffffffffffff000 is incompatible >> with mask 0x20010000 >> qemu: hardware error: vfio: DMA mapping failed, unable to continue >> >> This is due to the fact the IOMMU MR corresponding to the VFIO device >> is enabled very late on domain attach, after the machine init. >> The device reports a minimal 64kB page size but it is too late to be >> applied. virtio_iommu_set_page_size_mask() fails and this causes >> vfio_listener_region_add() to end up with hw_error(); >> >> To work around this issue, we transiently enable the IOMMU MR on >> machine init to collect the page size requirements and then restore >> the bypass state. >> >> Fixes: 90519b9053 ("virtio-iommu: Add bypass mode support to assigned device") >> Signed-off-by: Eric Auger <eric.auger@redhat.com> > Hi; Coverity complains about this change (CID 1517772): thank you for reporting the issue > >> +static void virtio_iommu_freeze_granule(Notifier *notifier, void *data) >> +{ >> + VirtIOIOMMU *s = container_of(notifier, VirtIOIOMMU, machine_done); >> + int granule; >> + >> + if (likely(s->config.bypass)) { >> + /* >> + * Transient IOMMU MR enable to collect page_size_mask requirements >> + * through memory_region_iommu_set_page_size_mask() called by >> + * VFIO region_add() callback >> + */ >> + s->config.bypass = false; >> + virtio_iommu_switch_address_space_all(s); >> + /* restore default */ >> + s->config.bypass = true; >> + virtio_iommu_switch_address_space_all(s); >> + } >> + s->granule_frozen = true; >> + granule = ctz64(s->config.page_size_mask); >> + trace_virtio_iommu_freeze_granule(BIT(granule)); > Specifically, in this code, it thinks that ctz64() can > return 64, in which case BIT(granule) is shifting off > the end of the value, which is undefined behaviour. > This can happen if s->config.page_size_mask is 0 -- are > there assertions/checks that that can't happen elsewhere? To me this cannot happen. The page_size_mask is initialized with qemu_target_page_mask(), then further constrained with virtio_iommu_set_page_size_mask() which would call error_setg if the new mask is 0 or (cur_mask & new_mask) == 0 What can I do to give coverity a hint that page_size_mask cannot be NULL? > > Secondly, BIT() only works for values up to 32, since > it works on type unsigned long, which might be a 32-bit > type on some hosts. Since you used ctz64() > you probably want BIT_ULL() which uses the ULL type > which definitely has 64 bits. agreed. Thanks Eric > > thanks > -- PMM >
On 7/17/23 13:51, Michael S. Tsirkin wrote: > On Mon, Jul 17, 2023 at 11:50:54AM +0100, Peter Maydell wrote: >> On Tue, 11 Jul 2023 at 00:04, Michael S. Tsirkin <mst@redhat.com> wrote: >>> From: Eric Auger <eric.auger@redhat.com> >>> >>> When running on a 64kB page size host and protecting a VFIO device >>> with the virtio-iommu, qemu crashes with this kind of message: >>> >>> qemu-kvm: virtio-iommu page mask 0xfffffffffffff000 is incompatible >>> with mask 0x20010000 >>> qemu: hardware error: vfio: DMA mapping failed, unable to continue >>> >>> This is due to the fact the IOMMU MR corresponding to the VFIO device >>> is enabled very late on domain attach, after the machine init. >>> The device reports a minimal 64kB page size but it is too late to be >>> applied. virtio_iommu_set_page_size_mask() fails and this causes >>> vfio_listener_region_add() to end up with hw_error(); >>> >>> To work around this issue, we transiently enable the IOMMU MR on >>> machine init to collect the page size requirements and then restore >>> the bypass state. >>> >>> Fixes: 90519b9053 ("virtio-iommu: Add bypass mode support to assigned device") >>> Signed-off-by: Eric Auger <eric.auger@redhat.com> >> Hi; Coverity complains about this change (CID 1517772): >> >>> +static void virtio_iommu_freeze_granule(Notifier *notifier, void *data) >>> +{ >>> + VirtIOIOMMU *s = container_of(notifier, VirtIOIOMMU, machine_done); >>> + int granule; >>> + >>> + if (likely(s->config.bypass)) { >>> + /* >>> + * Transient IOMMU MR enable to collect page_size_mask requirements >>> + * through memory_region_iommu_set_page_size_mask() called by >>> + * VFIO region_add() callback >>> + */ >>> + s->config.bypass = false; >>> + virtio_iommu_switch_address_space_all(s); >>> + /* restore default */ >>> + s->config.bypass = true; >>> + virtio_iommu_switch_address_space_all(s); >>> + } >>> + s->granule_frozen = true; >>> + granule = ctz64(s->config.page_size_mask); >>> + trace_virtio_iommu_freeze_granule(BIT(granule)); >> Specifically, in this code, it thinks that ctz64() can >> return 64, in which case BIT(granule) is shifting off >> the end of the value, which is undefined behaviour. >> This can happen if s->config.page_size_mask is 0 -- are >> there assertions/checks that that can't happen elsewhere? >> >> Secondly, BIT() only works for values up to 32, since >> it works on type unsigned long, which might be a 32-bit >> type on some hosts. Since you used ctz64() >> you probably want BIT_ULL() which uses the ULL type >> which definitely has 64 bits. >> >> thanks >> -- PMM > Thanks! Eric can you fix pls? Sure Eric >
On Mon, 17 Jul 2023 at 17:56, Eric Auger <eric.auger@redhat.com> wrote: > > > Hi Peter, > On 7/17/23 12:50, Peter Maydell wrote: > > On Tue, 11 Jul 2023 at 00:04, Michael S. Tsirkin <mst@redhat.com> wrote: > >> From: Eric Auger <eric.auger@redhat.com> > >> > >> When running on a 64kB page size host and protecting a VFIO device > >> with the virtio-iommu, qemu crashes with this kind of message: > >> > >> qemu-kvm: virtio-iommu page mask 0xfffffffffffff000 is incompatible > >> with mask 0x20010000 > >> qemu: hardware error: vfio: DMA mapping failed, unable to continue > >> > >> This is due to the fact the IOMMU MR corresponding to the VFIO device > >> is enabled very late on domain attach, after the machine init. > >> The device reports a minimal 64kB page size but it is too late to be > >> applied. virtio_iommu_set_page_size_mask() fails and this causes > >> vfio_listener_region_add() to end up with hw_error(); > >> > >> To work around this issue, we transiently enable the IOMMU MR on > >> machine init to collect the page size requirements and then restore > >> the bypass state. > >> > >> Fixes: 90519b9053 ("virtio-iommu: Add bypass mode support to assigned device") > >> Signed-off-by: Eric Auger <eric.auger@redhat.com> > > Hi; Coverity complains about this change (CID 1517772): > > thank you for reporting the issue > > > >> +static void virtio_iommu_freeze_granule(Notifier *notifier, void *data) > >> +{ > >> + VirtIOIOMMU *s = container_of(notifier, VirtIOIOMMU, machine_done); > >> + int granule; > >> + > >> + if (likely(s->config.bypass)) { > >> + /* > >> + * Transient IOMMU MR enable to collect page_size_mask requirements > >> + * through memory_region_iommu_set_page_size_mask() called by > >> + * VFIO region_add() callback > >> + */ > >> + s->config.bypass = false; > >> + virtio_iommu_switch_address_space_all(s); > >> + /* restore default */ > >> + s->config.bypass = true; > >> + virtio_iommu_switch_address_space_all(s); > >> + } > >> + s->granule_frozen = true; > >> + granule = ctz64(s->config.page_size_mask); > >> + trace_virtio_iommu_freeze_granule(BIT(granule)); > > Specifically, in this code, it thinks that ctz64() can > > return 64, in which case BIT(granule) is shifting off > > the end of the value, which is undefined behaviour. > > This can happen if s->config.page_size_mask is 0 -- are > > there assertions/checks that that can't happen elsewhere? > To me this cannot happen. The page_size_mask is initialized with > qemu_target_page_mask(), then further constrained with > virtio_iommu_set_page_size_mask() which would call error_setg if the new > mask is 0 or (cur_mask & new_mask) == 0 > > What can I do to give coverity a hint that page_size_mask cannot be NULL? You can assert() it if you believe it to be true and you think an assert() would help a human reader, or we could just say "this is a false positive" and mark it that way in the Coverity UI. We don't need to change things purely to make Coverity happy. > > Secondly, BIT() only works for values up to 32, since > > it works on type unsigned long, which might be a 32-bit > > type on some hosts. Since you used ctz64() > > you probably want BIT_ULL() which uses the ULL type > > which definitely has 64 bits. > agreed. Looking more closely at this, the file is not entirely consistent about how it handles the page_size_mask: * in virtio_iommu_translate() we call ctz32() and then use an open-coded 1 << n on that * in virtio_iommu_set_page_size_mask() we call ctz64() and use BIT() * in virtio_iommu_freeze_granule() we call ctz64() and then use BIT() So I suspect it's actually true (or at least assumed) that the granule value can never be 32 or higher, as well as it being non-zero. thanks -- PMM
On 7/17/23 19:07, Peter Maydell wrote: > On Mon, 17 Jul 2023 at 17:56, Eric Auger <eric.auger@redhat.com> wrote: >> >> Hi Peter, >> On 7/17/23 12:50, Peter Maydell wrote: >>> On Tue, 11 Jul 2023 at 00:04, Michael S. Tsirkin <mst@redhat.com> wrote: >>>> From: Eric Auger <eric.auger@redhat.com> >>>> >>>> When running on a 64kB page size host and protecting a VFIO device >>>> with the virtio-iommu, qemu crashes with this kind of message: >>>> >>>> qemu-kvm: virtio-iommu page mask 0xfffffffffffff000 is incompatible >>>> with mask 0x20010000 >>>> qemu: hardware error: vfio: DMA mapping failed, unable to continue >>>> >>>> This is due to the fact the IOMMU MR corresponding to the VFIO device >>>> is enabled very late on domain attach, after the machine init. >>>> The device reports a minimal 64kB page size but it is too late to be >>>> applied. virtio_iommu_set_page_size_mask() fails and this causes >>>> vfio_listener_region_add() to end up with hw_error(); >>>> >>>> To work around this issue, we transiently enable the IOMMU MR on >>>> machine init to collect the page size requirements and then restore >>>> the bypass state. >>>> >>>> Fixes: 90519b9053 ("virtio-iommu: Add bypass mode support to assigned device") >>>> Signed-off-by: Eric Auger <eric.auger@redhat.com> >>> Hi; Coverity complains about this change (CID 1517772): >> thank you for reporting the issue >>>> +static void virtio_iommu_freeze_granule(Notifier *notifier, void *data) >>>> +{ >>>> + VirtIOIOMMU *s = container_of(notifier, VirtIOIOMMU, machine_done); >>>> + int granule; >>>> + >>>> + if (likely(s->config.bypass)) { >>>> + /* >>>> + * Transient IOMMU MR enable to collect page_size_mask requirements >>>> + * through memory_region_iommu_set_page_size_mask() called by >>>> + * VFIO region_add() callback >>>> + */ >>>> + s->config.bypass = false; >>>> + virtio_iommu_switch_address_space_all(s); >>>> + /* restore default */ >>>> + s->config.bypass = true; >>>> + virtio_iommu_switch_address_space_all(s); >>>> + } >>>> + s->granule_frozen = true; >>>> + granule = ctz64(s->config.page_size_mask); >>>> + trace_virtio_iommu_freeze_granule(BIT(granule)); >>> Specifically, in this code, it thinks that ctz64() can >>> return 64, in which case BIT(granule) is shifting off >>> the end of the value, which is undefined behaviour. >>> This can happen if s->config.page_size_mask is 0 -- are >>> there assertions/checks that that can't happen elsewhere? >> To me this cannot happen. The page_size_mask is initialized with >> qemu_target_page_mask(), then further constrained with >> virtio_iommu_set_page_size_mask() which would call error_setg if the new >> mask is 0 or (cur_mask & new_mask) == 0 >> >> What can I do to give coverity a hint that page_size_mask cannot be NULL? > You can assert() it if you believe it to be true and you > think an assert() would help a human reader, > or we could just say "this is a false positive" and > mark it that way in the Coverity UI. We don't need to > change things purely to make Coverity happy. OK > >>> Secondly, BIT() only works for values up to 32, since >>> it works on type unsigned long, which might be a 32-bit >>> type on some hosts. Since you used ctz64() >>> you probably want BIT_ULL() which uses the ULL type >>> which definitely has 64 bits. >> agreed. > Looking more closely at this, the file is not entirely > consistent about how it handles the page_size_mask: > * in virtio_iommu_translate() we call ctz32() > and then use an open-coded 1 << n on that > * in virtio_iommu_set_page_size_mask() we call > ctz64() and use BIT() > * in virtio_iommu_freeze_granule() we call ctz64() > and then use BIT() > > So I suspect it's actually true (or at least assumed) > that the granule value can never be 32 or higher, as > well as it being non-zero. yes I don't expect the granule to be >= 32 but I agree this is a mess in the current implementation and this would deserve some alignment I will work on a patch ... Thanks Eric > > thanks > -- PMM >
diff --git a/include/hw/virtio/virtio-iommu.h b/include/hw/virtio/virtio-iommu.h index 2ad5ee320b..a93fc5383e 100644 --- a/include/hw/virtio/virtio-iommu.h +++ b/include/hw/virtio/virtio-iommu.h @@ -61,6 +61,8 @@ struct VirtIOIOMMU { QemuRecMutex mutex; GTree *endpoints; bool boot_bypass; + Notifier machine_done; + bool granule_frozen; }; #endif diff --git a/hw/virtio/virtio-iommu.c b/hw/virtio/virtio-iommu.c index 1bbad23f4a..8d0c5e3f32 100644 --- a/hw/virtio/virtio-iommu.c +++ b/hw/virtio/virtio-iommu.c @@ -25,6 +25,7 @@ #include "hw/virtio/virtio.h" #include "sysemu/kvm.h" #include "sysemu/reset.h" +#include "sysemu/sysemu.h" #include "qapi/error.h" #include "qemu/error-report.h" #include "trace.h" @@ -1106,12 +1107,12 @@ static int virtio_iommu_set_page_size_mask(IOMMUMemoryRegion *mr, } /* - * After the machine is finalized, we can't change the mask anymore. If by + * Once the granule is frozen we can't change the mask anymore. If by * chance the hotplugged device supports the same granule, we can still * accept it. Having a different masks is possible but the guest will use * sub-optimal block sizes, so warn about it. */ - if (phase_check(PHASE_MACHINE_READY)) { + if (s->granule_frozen) { int new_granule = ctz64(new_mask); int cur_granule = ctz64(cur_mask); @@ -1146,6 +1147,28 @@ static void virtio_iommu_system_reset(void *opaque) } +static void virtio_iommu_freeze_granule(Notifier *notifier, void *data) +{ + VirtIOIOMMU *s = container_of(notifier, VirtIOIOMMU, machine_done); + int granule; + + if (likely(s->config.bypass)) { + /* + * Transient IOMMU MR enable to collect page_size_mask requirements + * through memory_region_iommu_set_page_size_mask() called by + * VFIO region_add() callback + */ + s->config.bypass = false; + virtio_iommu_switch_address_space_all(s); + /* restore default */ + s->config.bypass = true; + virtio_iommu_switch_address_space_all(s); + } + s->granule_frozen = true; + granule = ctz64(s->config.page_size_mask); + trace_virtio_iommu_freeze_granule(BIT(granule)); +} + static void virtio_iommu_device_realize(DeviceState *dev, Error **errp) { VirtIODevice *vdev = VIRTIO_DEVICE(dev); @@ -1189,6 +1212,9 @@ static void virtio_iommu_device_realize(DeviceState *dev, Error **errp) error_setg(errp, "VIRTIO-IOMMU is not attached to any PCI bus!"); } + s->machine_done.notify = virtio_iommu_freeze_granule; + qemu_add_machine_init_done_notifier(&s->machine_done); + qemu_register_reset(virtio_iommu_system_reset, s); } @@ -1198,6 +1224,7 @@ static void virtio_iommu_device_unrealize(DeviceState *dev) VirtIOIOMMU *s = VIRTIO_IOMMU(dev); qemu_unregister_reset(virtio_iommu_system_reset, s); + qemu_remove_machine_init_done_notifier(&s->machine_done); g_hash_table_destroy(s->as_by_busptr); if (s->domains) { diff --git a/hw/virtio/trace-events b/hw/virtio/trace-events index 4e39ed8a95..7109cf1a3b 100644 --- a/hw/virtio/trace-events +++ b/hw/virtio/trace-events @@ -133,6 +133,7 @@ virtio_iommu_set_page_size_mask(const char *name, uint64_t old, uint64_t new) "m virtio_iommu_notify_flag_add(const char *name) "add notifier to mr %s" virtio_iommu_notify_flag_del(const char *name) "del notifier from mr %s" virtio_iommu_switch_address_space(uint8_t bus, uint8_t slot, uint8_t fn, bool on) "Device %02x:%02x.%x switching address space (iommu enabled=%d)" +virtio_iommu_freeze_granule(uint64_t page_size_mask) "granule set to 0x%"PRIx64 # virtio-mem.c virtio_mem_send_response(uint16_t type) "type=%" PRIu16