Message ID | 20190705032501.106966-1-aik@ozlabs.ru |
---|---|
State | New |
Series | [RFC,qemu] vfio-quirks: Pass the actual parent when deleting a memory region |
On Fri, Jul 05, 2019 at 01:25:01PM +1000, Alexey Kardashevskiy wrote:
> The usual way of using a quirk's MR is to add it as a subregion of a BAR
> as this is what quirks are for. However there is less than standard user
> of this - NVLink2-enabled NVIDIA GPU which exposes a GPU RAM and a ATSD
> 64K region outside of PCI MMIO window so these MRs get the system address
> space root as a parent. So when the user unplugs such device, assert
> occurs:
>
> qemu-system-ppc64: /home/aik/p/qemu/memory.c:2391: memory_region_del_subregion: Assertion `subregion->container == mr' failed.
>
> This passes the actual parent MR to memory_region_del_subregion() in
> vfio_bar_quirk_exit.
>
> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>

> ---
>
> This removes an extra sanity check that a quirk has a correct parent;
> I am not sure if it is very useful.
> I could use the "system" MR if quirk->mem[i].container==get_system_memory()
> and quirk->mem[i].container otherwise to keep that assert working.
>
> Also this does not help with the actual device removal much because of
> the closed source driver nature - the associated service
> (nvidia-persistenced, responsible for onlining GPU memory) crashes
> the guest system but at least the user can reboot the guest after
> the crash which is not as bad as assert.
>
> ---
>  hw/vfio/pci-quirks.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/hw/vfio/pci-quirks.c b/hw/vfio/pci-quirks.c
> index 27dddbc8fa3e..ef2e182c1d36 100644
> --- a/hw/vfio/pci-quirks.c
> +++ b/hw/vfio/pci-quirks.c
> @@ -1896,7 +1896,8 @@ void vfio_bar_quirk_exit(VFIOPCIDevice *vdev, int nr)
>          }
>
>          for (i = 0; i < quirk->nr_mem; i++) {
> -            memory_region_del_subregion(bar->region.mem, &quirk->mem[i]);
> +            memory_region_del_subregion(quirk->mem[i].container,
> +                                        &quirk->mem[i]);
>          }
>      }
>  }
On Fri, 5 Jul 2019 13:25:01 +1000 Alexey Kardashevskiy <aik@ozlabs.ru> wrote:
> The usual way of using a quirk's MR is to add it as a subregion of a BAR
> as this is what quirks are for. However there is less than standard user
> of this - NVLink2-enabled NVIDIA GPU which exposes a GPU RAM and a ATSD
> 64K region outside of PCI MMIO window so these MRs get the system address
> space root as a parent. So when the user unplugs such device, assert
> occurs:
>
> qemu-system-ppc64: /home/aik/p/qemu/memory.c:2391: memory_region_del_subregion: Assertion `subregion->container == mr' failed.
>
> This passes the actual parent MR to memory_region_del_subregion() in
> vfio_bar_quirk_exit.
>
> Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
> ---
>
> This removes an extra sanity check that a quirk has a correct parent;
> I am not sure if it is very useful.
> I could use the "system" MR if quirk->mem[i].container==get_system_memory()
> and quirk->mem[i].container otherwise to keep that assert working.
>
> Also this does not help with the actual device removal much because of
> the closed source driver nature - the associated service
> (nvidia-persistenced, responsible for onlining GPU memory) crashes
> the guest system but at least the user can reboot the guest after
> the crash which is not as bad as assert.
>
> ---
>  hw/vfio/pci-quirks.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/hw/vfio/pci-quirks.c b/hw/vfio/pci-quirks.c
> index 27dddbc8fa3e..ef2e182c1d36 100644
> --- a/hw/vfio/pci-quirks.c
> +++ b/hw/vfio/pci-quirks.c
> @@ -1896,7 +1896,8 @@ void vfio_bar_quirk_exit(VFIOPCIDevice *vdev, int nr)
>          }
>
>          for (i = 0; i < quirk->nr_mem; i++) {
> -            memory_region_del_subregion(bar->region.mem, &quirk->mem[i]);
> +            memory_region_del_subregion(quirk->mem[i].container,
> +                                        &quirk->mem[i]);
>          }
>      }
>  }

NAK

struct MemoryRegion {
    Object parent_obj;

    /* All fields are private - violators will be prosecuted */
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ...
    MemoryRegion *container;
diff --git a/hw/vfio/pci-quirks.c b/hw/vfio/pci-quirks.c
index 27dddbc8fa3e..ef2e182c1d36 100644
--- a/hw/vfio/pci-quirks.c
+++ b/hw/vfio/pci-quirks.c
@@ -1896,7 +1896,8 @@ void vfio_bar_quirk_exit(VFIOPCIDevice *vdev, int nr)
         }
 
         for (i = 0; i < quirk->nr_mem; i++) {
-            memory_region_del_subregion(bar->region.mem, &quirk->mem[i]);
+            memory_region_del_subregion(quirk->mem[i].container,
+                                        &quirk->mem[i]);
         }
     }
 }
The usual way of using a quirk's MR is to add it as a subregion of a BAR
as this is what quirks are for. However there is less than standard user
of this - NVLink2-enabled NVIDIA GPU which exposes a GPU RAM and a ATSD
64K region outside of PCI MMIO window so these MRs get the system address
space root as a parent. So when the user unplugs such device, assert
occurs:

qemu-system-ppc64: /home/aik/p/qemu/memory.c:2391: memory_region_del_subregion: Assertion `subregion->container == mr' failed.

This passes the actual parent MR to memory_region_del_subregion() in
vfio_bar_quirk_exit.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
---

This removes an extra sanity check that a quirk has a correct parent;
I am not sure if it is very useful.
I could use the "system" MR if quirk->mem[i].container==get_system_memory()
and quirk->mem[i].container otherwise to keep that assert working.

Also this does not help with the actual device removal much because of
the closed source driver nature - the associated service
(nvidia-persistenced, responsible for onlining GPU memory) crashes
the guest system but at least the user can reboot the guest after
the crash which is not as bad as assert.

---
 hw/vfio/pci-quirks.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
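
The parent mismatch described in the commit message can be illustrated with a toy model. This is only a sketch under stated assumptions: ToyRegion, toy_add_subregion() and toy_del_subregion() are hypothetical stand-ins invented for illustration, not QEMU's MemoryRegion API, and the toy returns false where QEMU would hit the `subregion->container == mr' assertion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a memory region: only the parent link is modelled. */
typedef struct ToyRegion {
    const char *name;
    struct ToyRegion *container;   /* parent this region is currently mapped into */
} ToyRegion;

/* Map a subregion into a parent by recording the parent link. */
static void toy_add_subregion(ToyRegion *parent, ToyRegion *sub)
{
    sub->container = parent;
}

/*
 * Unmap a subregion from the given parent. Returns false on a parent
 * mismatch (the case where QEMU asserts) and leaves the region mapped;
 * returns true and clears the link when the parent matches.
 */
static bool toy_del_subregion(ToyRegion *parent, ToyRegion *sub)
{
    if (sub->container != parent) {
        return false;
    }
    sub->container = NULL;
    return true;
}
```

In this model, a quirk region mapped under a "system" root cannot be removed by naming the BAR as its parent, while removal through the recorded parent succeeds. The sketch does not address the objection raised in the NAK, namely that the container field is private to the memory API.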