Message ID | 1470922183-22873-1-git-send-email-peterx@redhat.com |
---|---|
State | New |
Headers | show |
2016-08-11 21:29+0800, Peter Xu: > Adding one extra property for intel-iommu device to decide whether we > should support EIM bit for IR. > > Now we are throwing high 24 bits of dest_id away directly. This will > cause interrupt issues with guests that: > > - enabled x2apic with cluster mode > - have more than 16 vcpus (so will have more than 1 cluster) Good catch, it is a problem starting at 9 VCPUs, because that one is already "0x100" and we truncate to 8 bits ... > Let's make xapic the default one, and for the brave people who would > like to try EIM and know the side effects, we can do it by explicitly > enabling EIM using: This might bite us: users cannot easily tell when EIM is sane, so eim=on is going to be a hazard even after some KVM/QEMU has it fixed. I'd make make eim=on fail when logical x2APIC is broken. Right now, we can allow eim=on if maximal APIC ID < 8, because physical 0xffffffff gets truncated to 0xff and is still interpreted as a broadcast thanks to the quirk. Logical 0xff gets misinterpreted as a broadcast, but it would address all APICs anyway and KVM's lowest-priority works for broadcast too, so it is fine to do so. Not very wise, though. :) > -device intel-iommu,intremap=on,eim=on > > Even after we have x2apic support, it'll still be good if we can provide > a way to switch xapic/x2apic from QEMU side for e.g. debugging purpose, > which is an alternative for tuning guest kernel boot parameters. > > We can switch the default to "on" after x2apic fully supported. > > Signed-off-by: Peter Xu <peterx@redhat.com> > --- > This will have small conflict with Radim's patches to selectively > enable EIM. I can do a rebase when needed. This patch made me realize that broadcast quirk is not enough, so I'll do v2 on top of this.
On 11/08/2016 15:29, Peter Xu wrote: > + > +static void vtd_eim_prop_set(Object *o, bool value, Error **errp) > +{ > + IntelIOMMUState *s = INTEL_IOMMU_DEVICE(o); > + s->eim_supported = value; > +} > + > +static void vtd_instance_init(Object *o) > +{ > + IntelIOMMUState *s = INTEL_IOMMU_DEVICE(o); > + > + /* > + * TODO: we can enable this by default after we have full x2apic > + * support. > + */ > + s->eim_supported = false; > + object_property_add_bool(o, "eim", vtd_eim_prop_get, > + vtd_eim_prop_set, NULL); > +} > + > static const TypeInfo vtd_info = { > .name = TYPE_INTEL_IOMMU_DEVICE, > .parent = TYPE_X86_IOMMU_DEVICE, > + .instance_init = vtd_instance_init, This can use DEFINE_PROP_BOOL. Paolo
On Fri, Aug 12, 2016 at 10:34:44AM +0200, Paolo Bonzini wrote: > > > On 11/08/2016 15:29, Peter Xu wrote: > > + > > +static void vtd_eim_prop_set(Object *o, bool value, Error **errp) > > +{ > > + IntelIOMMUState *s = INTEL_IOMMU_DEVICE(o); > > + s->eim_supported = value; > > +} > > + > > +static void vtd_instance_init(Object *o) > > +{ > > + IntelIOMMUState *s = INTEL_IOMMU_DEVICE(o); > > + > > + /* > > + * TODO: we can enable this by default after we have full x2apic > > + * support. > > + */ > > + s->eim_supported = false; > > + object_property_add_bool(o, "eim", vtd_eim_prop_get, > > + vtd_eim_prop_set, NULL); > > +} > > + > > static const TypeInfo vtd_info = { > > .name = TYPE_INTEL_IOMMU_DEVICE, > > .parent = TYPE_X86_IOMMU_DEVICE, > > + .instance_init = vtd_instance_init, > > This can use DEFINE_PROP_BOOL. Right. Will post v2. Thanks! -- peterx
On Thu, Aug 11, 2016 at 05:44:16PM +0200, Radim Krčmář wrote: > 2016-08-11 21:29+0800, Peter Xu: > > Adding one extra property for intel-iommu device to decide whether we > > should support EIM bit for IR. > > > > Now we are throwing high 24 bits of dest_id away directly. This will > > cause interrupt issues with guests that: > > > > - enabled x2apic with cluster mode > > - have more than 16 vcpus (so will have more than 1 cluster) > > Good catch, it is a problem starting at 9 VCPUs, because that one is > already "0x100" and we truncate to 8 bits ... Ah yes. I muddled on the bits. :) > > > Let's make xapic the default one, and for the brave people who would > > like to try EIM and know the side effects, we can do it by explicitly > > enabling EIM using: > > This might bite us: users cannot easily tell when EIM is sane, so eim=on > is going to be a hazard even after some KVM/QEMU has it fixed. > I'd make make eim=on fail when logical x2APIC is broken. > > Right now, we can allow eim=on if maximal APIC ID < 8, because physical > 0xffffffff gets truncated to 0xff and is still interpreted as a > broadcast thanks to the quirk. > Logical 0xff gets misinterpreted as a broadcast, but it would address > all APICs anyway and KVM's lowest-priority works for broadcast too, so > it is fine to do so. Not very wise, though. :) > > > -device intel-iommu,intremap=on,eim=on > > > > Even after we have x2apic support, it'll still be good if we can provide > > a way to switch xapic/x2apic from QEMU side for e.g. debugging purpose, > > which is an alternative for tuning guest kernel boot parameters. > > > > We can switch the default to "on" after x2apic fully supported. > > > > Signed-off-by: Peter Xu <peterx@redhat.com> > > --- > > This will have small conflict with Radim's patches to selectively > > enable EIM. I can do a rebase when needed. > > This patch made me realize that broadcast quirk is not enough, so I'll > do v2 on top of this. I'll post v2 soon with DEFINE_PROP_BOOL. Thanks, -- peterx
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c index 28c31a2..a2f5051 100644 --- a/hw/i386/intel_iommu.c +++ b/hw/i386/intel_iommu.c @@ -2364,7 +2364,10 @@ static void vtd_init(IntelIOMMUState *s) s->ecap = VTD_ECAP_QI | VTD_ECAP_IRO; if (x86_iommu->intr_supported) { - s->ecap |= VTD_ECAP_IR | VTD_ECAP_EIM | VTD_ECAP_MHMV; + s->ecap |= VTD_ECAP_IR | VTD_ECAP_MHMV; + if (s->eim_supported) { + s->ecap |= VTD_ECAP_EIM; + } } vtd_reset_context_cache(s); @@ -2468,6 +2471,12 @@ static void vtd_realize(DeviceState *dev, Error **errp) /* Pseudo address space under root PCI bus. */ pcms->ioapic_as = vtd_host_dma_iommu(bus, s, Q35_PSEUDO_DEVFN_IOAPIC); + /* EIM bit requires IR */ + if (s->eim_supported && !x86_iommu->intr_supported) { + error_report("EIM (Extended Interrupt Mode) bit requires intremap=on"); + exit(1); + } + /* Currently Intel IOMMU IR only support "kernel-irqchip={off|split}" */ if (x86_iommu->intr_supported && kvm_irqchip_in_kernel() && !kvm_irqchip_is_split()) { @@ -2490,9 +2499,35 @@ static void vtd_class_init(ObjectClass *klass, void *data) x86_class->int_remap = vtd_int_remap; } +static bool vtd_eim_prop_get(Object *o, Error **errp) +{ + IntelIOMMUState *s = INTEL_IOMMU_DEVICE(o); + return s->eim_supported; +} + +static void vtd_eim_prop_set(Object *o, bool value, Error **errp) +{ + IntelIOMMUState *s = INTEL_IOMMU_DEVICE(o); + s->eim_supported = value; +} + +static void vtd_instance_init(Object *o) +{ + IntelIOMMUState *s = INTEL_IOMMU_DEVICE(o); + + /* + * TODO: we can enable this by default after we have full x2apic + * support. + */ + s->eim_supported = false; + object_property_add_bool(o, "eim", vtd_eim_prop_get, + vtd_eim_prop_set, NULL); +} + static const TypeInfo vtd_info = { .name = TYPE_INTEL_IOMMU_DEVICE, .parent = TYPE_X86_IOMMU_DEVICE, + .instance_init = vtd_instance_init, .instance_size = sizeof(IntelIOMMUState), .class_init = vtd_class_init, }; diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h index a42dbd7..b1bc768 100644 --- a/include/hw/i386/intel_iommu.h +++ b/include/hw/i386/intel_iommu.h @@ -289,6 +289,7 @@ struct IntelIOMMUState { dma_addr_t intr_root; /* Interrupt remapping table pointer */ uint32_t intr_size; /* Number of IR table entries */ bool intr_eime; /* Extended interrupt mode enabled */ + bool eim_supported; /* Whether to allow EIM bit */ }; /* Find the VTD Address space associated with the given bus pointer,
Adding one extra property for intel-iommu device to decide whether we should support EIM bit for IR. Now we are throwing high 24 bits of dest_id away directly. This will cause interrupt issues with guests that: - enabled x2apic with cluster mode - have more than 16 vcpus (so will have more than 1 cluster) Let's make xapic the default one, and for the brave people who would like to try EIM and know the side effects, we can do it by explicitly enabling EIM using: -device intel-iommu,intremap=on,eim=on Even after we have x2apic support, it'll still be good if we can provide a way to switch xapic/x2apic from QEMU side for e.g. debugging purpose, which is an alternative for tuning guest kernel boot parameters. We can switch the default to "on" after x2apic fully supported. Signed-off-by: Peter Xu <peterx@redhat.com> --- This will have small conflict with Radim's patches to selectively enable EIM. I can do a rebase when needed. hw/i386/intel_iommu.c | 37 ++++++++++++++++++++++++++++++++++++- include/hw/i386/intel_iommu.h | 1 + 2 files changed, 37 insertions(+), 1 deletion(-)