Message ID | 1428934370-29695-1-git-send-email-liang.z.li@intel.com |
---|---|
State | New |
On 13/04/2015 16:12, Liang Li wrote:
> 2. Do the attach and detach operation with a time interval. eg. 10s.
>
> The error message will not disappear if retry, in this case, it's
> a bug.
>
> In the 'xen_pt_region_add' and 'xen_pt_region_del', we should only care
> about the 'xen-pci-pt-*' memory region, this can avoid the region's
> reference count is not equal with the dereference count when the
> device is detached and prevent the device's related QemuOpts object from
> being released properly, and then trigger the bug when the device is
> re-attached.

This doesn't explain _which_ region is causing the bug and how.

Assuming this is the right fix, should you instead move the
memory_region_ref/unref pair from xen_pt_region_add/del after this
conditional:

    if (bar == -1 && (!s->msix || &s->msix->mmio != mr)) {
        return;
    }

in xen_pt_region_update?

Paolo
> On 13/04/2015 16:12, Liang Li wrote:
> > 2. Do the attach and detach operation with a time interval. eg. 10s.
> >
> > The error message will not disappear if retry, in this case, it's
> > a bug.
> >
> > In the 'xen_pt_region_add' and 'xen_pt_region_del', we should only
> > care about the 'xen-pci-pt-*' memory region, this can avoid the
> > region's reference count is not equal with the dereference count when
> > the device is detached and prevent the device's related QemuOpts
> > object from being released properly, and then trigger the bug when the
> > device is re-attached.
>
> This doesn't explain _which_ region is causing the bug and how.

Please ignore this patch, because I have some new findings and this patch
is not correct, I will send a new patch later.

> Assuming this is the right fix, should you instead move the
> memory_region_ref/unref pair from xen_pt_region_add/del after this
> conditional:
>
>     if (bar == -1 && (!s->msix || &s->msix->mmio != mr)) {
>         return;
>     }
>
> in xen_pt_region_update?
>
> Paolo

Yes, it's the right place. Put aside the bug fix, I think the
memory_region_ref/unref pair should be move to xen_pt_region_update after
the conditional as you point out. Do you think so?

Liang
On 15/04/2015 16:14, Li, Liang Z wrote:
> Yes, it's the right place. Put aside the bug fix, I think the memory_region_ref/unref pair
> should be move to xen_pt_region_update after the conditional as you point out.
> Do you think so?

It would make sense, but I was just guessing... I'm still not sure
whether that causes any difference, also because I'm not familiar with
the Xen PCI passthrough code.

Paolo
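The suggestion discussed above (taking and dropping the region reference only after the filter in xen_pt_region_update) can be pictured with a toy model. This is a minimal sketch and not QEMU code: `ToyRegion`, `owned_by_device`, and `toy_region_update` are hypothetical stand-ins, with `owned_by_device` playing the role of the `bar == -1 && ...` conditional. It only shows the shape of the idea; whether the move makes any difference in xen_pt.c was left open in the thread.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a memory region; not a QEMU type. */
typedef struct ToyRegion {
    int refcount;          /* references currently held on the region */
    bool owned_by_device;  /* stands in for "is a BAR or the MSI-X mmio" */
} ToyRegion;

/*
 * Toy version of an update hook that does its own reference counting.
 * The ref/unref happens only after the ownership filter, so regions the
 * device does not own are never touched.
 */
static void toy_region_update(ToyRegion *mr, bool adding)
{
    if (!mr->owned_by_device) {
        return;            /* not ours: leave the refcount alone */
    }

    if (adding) {
        mr->refcount++;    /* corresponds to memory_region_ref() */
    } else {
        mr->refcount--;    /* corresponds to memory_region_unref() */
    }

    /* The real code would update the guest mapping here. */
}
```

With this shape, an add followed by a del on an owned region returns its count to where it started, and calls for unrelated regions are no-ops.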
diff --git a/hw/xen/xen_pt.c b/hw/xen/xen_pt.c
index f2893b2..8aab769 100644
--- a/hw/xen/xen_pt.c
+++ b/hw/xen/xen_pt.c
@@ -61,6 +61,7 @@
 #include "qemu/range.h"
 #include "exec/address-spaces.h"
 
+#define XEN_PT_NAME_PREFIX "xen-pci-pt"
 #define XEN_PT_NR_IRQS (256)
 static uint8_t xen_pt_mapped_machine_irq[XEN_PT_NR_IRQS] = {0};
 
@@ -585,6 +586,10 @@ static void xen_pt_region_update(XenPCIPassthroughState *s,
 
 static void xen_pt_region_add(MemoryListener *l, MemoryRegionSection *sec)
 {
+    if (strncmp(sec->mr->name, XEN_PT_NAME_PREFIX,
+                sizeof(XEN_PT_NAME_PREFIX))) {
+        return;
+    }
     XenPCIPassthroughState *s = container_of(l, XenPCIPassthroughState,
                                              memory_listener);
 
@@ -594,6 +599,11 @@ static void xen_pt_region_add(MemoryListener *l, MemoryRegionSection *sec)
 
 static void xen_pt_region_del(MemoryListener *l, MemoryRegionSection *sec)
 {
+    if (strncmp(sec->mr->name, XEN_PT_NAME_PREFIX,
+                sizeof(XEN_PT_NAME_PREFIX))) {
+        return;
+    }
+
     XenPCIPassthroughState *s = container_of(l, XenPCIPassthroughState,
                                              memory_listener);
 
@@ -603,6 +613,10 @@ static void xen_pt_region_del(MemoryListener *l, MemoryRegionSection *sec)
 
 static void xen_pt_io_region_add(MemoryListener *l, MemoryRegionSection *sec)
 {
+    if (strncmp(sec->mr->name, XEN_PT_NAME_PREFIX,
+                sizeof(XEN_PT_NAME_PREFIX))) {
+        return;
+    }
     XenPCIPassthroughState *s = container_of(l, XenPCIPassthroughState,
                                              io_listener);
 
@@ -612,6 +626,10 @@ static void xen_pt_io_region_add(MemoryListener *l, MemoryRegionSection *sec)
 
 static void xen_pt_io_region_del(MemoryListener *l, MemoryRegionSection *sec)
 {
+    if (strncmp(sec->mr->name, XEN_PT_NAME_PREFIX,
+                sizeof(XEN_PT_NAME_PREFIX))) {
+        return;
+    }
     XenPCIPassthroughState *s = container_of(l, XenPCIPassthroughState,
                                              io_listener);
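One detail of the prefix test in the diff is worth illustrating. `sizeof` applied to a string literal counts the terminating NUL, so `strncmp(name, XEN_PT_NAME_PREFIX, sizeof(XEN_PT_NAME_PREFIX))` only returns 0 when `name` is exactly "xen-pci-pt". If region names carry a suffix, as the 'xen-pci-pt-*' pattern in the commit message suggests, a prefix test would need `sizeof(...) - 1`. The helpers below are a hypothetical standalone sketch (the function names are not from QEMU) that contrasts the two lengths.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define XEN_PT_NAME_PREFIX "xen-pci-pt"

/*
 * Mirrors the comparison length used in the diff: sizeof() includes the
 * terminating NUL, so this is effectively an exact-match test.
 */
static bool has_prefix_with_nul(const char *name)
{
    return strncmp(name, XEN_PT_NAME_PREFIX,
                   sizeof(XEN_PT_NAME_PREFIX)) == 0;
}

/*
 * Prefix test: compare only the characters of the prefix itself.
 * A NULL name is treated as "no match" rather than dereferenced.
 */
static bool has_prefix(const char *name)
{
    return name != NULL &&
           strncmp(name, XEN_PT_NAME_PREFIX,
                   sizeof(XEN_PT_NAME_PREFIX) - 1) == 0;
}
```

For a name such as "xen-pci-pt-bar0" (an illustrative value), `has_prefix` matches while `has_prefix_with_nul` does not, because the eleventh character is '-' in the name and NUL in the prefix.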
Use an option like 'pci=[ '07:10.1', '0b:10.1', '81:10.1']' in the HVM
guest configuration file to assign more than one VF PCI device to a
guest. After the guest boots up, detach the VFs in sequence with
'xl pci-detach $DOM_ID $VF_BDF', and then attach the VFs in sequence with
'xl pci-attach $DOM_ID $VF_BDF'. An error message like the following
will be reported:

    libxl: error: libxl_qmp.c:287:qmp_handle_error_response: receive an error message from QMP server: Duplicate ID 'pci-pt-81_10.1' for device.

The error message is printed in two cases:

1. Attach and detach very quickly.

   The message disappears on retry. This is expected because of the
   asynchronous unplug mechanism.

2. Do the attach and detach operations with a time interval, e.g. 10s.

   The error message does not disappear on retry. In this case, it's
   a bug.

In 'xen_pt_region_add' and 'xen_pt_region_del', we should only care
about the 'xen-pci-pt-*' memory regions. This avoids a mismatch between
a region's reference count and its dereference count when the device is
detached. Such a mismatch prevents the device's related QemuOpts object
from being released properly and then triggers the bug when the device
is re-attached.

I sent a patch to fix a similar bug before, but that patch could not fix
the issue completely.

Signed-off-by: Liang Li <liang.z.li@intel.com>
---
 hw/xen/xen_pt.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)