diff mbox

[2/2] intel-iommu: restrict EIM to quirkless KVM

Message ID 20160809150333.9991-3-rkrcmar@redhat.com
State New
Headers show

Commit Message

Radim Krčmář Aug. 9, 2016, 3:03 p.m. UTC
APIC in QEMU doesn't support x2APIC so exposing EIM is pointless and KVM
has a quirk that needs to be disabled unless we want x2APIC message with
destination 0xff to be misinterpreted as a broadcast.

Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
---
 hw/i386/intel_iommu.c  | 10 +++++++++-
 target-i386/kvm-stub.c |  5 +++++
 target-i386/kvm.c      | 12 ++++++++++++
 target-i386/kvm_i386.h |  1 +
 4 files changed, 27 insertions(+), 1 deletion(-)

Comments

Peter Xu Aug. 10, 2016, 3:29 a.m. UTC | #1
On Tue, Aug 09, 2016 at 05:03:33PM +0200, Radim Krčmář wrote:
> APIC in QEMU doesn't support x2APIC so exposing EIM is pointless and KVM
> has a quirk that needs to be disabled unless we want x2APIC message with
> destination 0xff to be misinterpreted as a broadcast.
> 
> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
> ---
>  hw/i386/intel_iommu.c  | 10 +++++++++-
>  target-i386/kvm-stub.c |  5 +++++
>  target-i386/kvm.c      | 12 ++++++++++++
>  target-i386/kvm_i386.h |  1 +
>  4 files changed, 27 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> index 28c31a2cdfa3..733751923233 100644
> --- a/hw/i386/intel_iommu.c
> +++ b/hw/i386/intel_iommu.c
> @@ -31,6 +31,7 @@
>  #include "hw/i386/x86-iommu.h"
>  #include "hw/pci-host/q35.h"
>  #include "sysemu/kvm.h"
> +#include "kvm_i386.h"
>  
>  /*#define DEBUG_INTEL_IOMMU*/
>  #ifdef DEBUG_INTEL_IOMMU
> @@ -2364,7 +2365,14 @@ static void vtd_init(IntelIOMMUState *s)
>      s->ecap = VTD_ECAP_QI | VTD_ECAP_IRO;
>  
>      if (x86_iommu->intr_supported) {
> -        s->ecap |= VTD_ECAP_IR | VTD_ECAP_EIM | VTD_ECAP_MHMV;
> +        s->ecap |= VTD_ECAP_IR | VTD_ECAP_MHMV;
> +        /* QEMU APIC does not support x2APIC and KVM does not work well without
> +         * disabling a quirk.  IOMMU is unmigratable so we unconditionally use
> +         * optional KVM features.
> +         */
> +        if (kvm_irqchip_in_kernel() && kvm_disable_x2apic_broadcast_quirk()) {
> +            s->ecap |= VTD_ECAP_EIM;
> +        }

Good to me if this patch is only going to disable x2apic when we
failed to disable the x2apic broadcast quirk in KVM.

Question: still not too clear about how KVM treats the case when
x2apic and xapic are used in a single VM. E.g., if dest_id of an
interrupt is 0xff from a peripheral device, how should I know this is
a x2apic broadcast to 0-7 cpu in cluster 0, or an apic broadcast to
all?

Thanks,

-- peterx
Radim Krčmář Aug. 10, 2016, 4:59 p.m. UTC | #2
2016-08-10 11:29+0800, Peter Xu:
> On Tue, Aug 09, 2016 at 05:03:33PM +0200, Radim Krčmář wrote:
>> APIC in QEMU doesn't support x2APIC so exposing EIM is pointless and KVM
>> has a quirk that needs to be disabled unless we want x2APIC message with
>> destination 0xff to be misinterpreted as a broadcast.
>> 
>> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
>> ---
>> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
>> @@ -31,6 +31,7 @@
>>  #include "hw/i386/x86-iommu.h"
>>  #include "hw/pci-host/q35.h"
>>  #include "sysemu/kvm.h"
>> +#include "kvm_i386.h"
>>  
>>  /*#define DEBUG_INTEL_IOMMU*/
>>  #ifdef DEBUG_INTEL_IOMMU
>> @@ -2364,7 +2365,14 @@ static void vtd_init(IntelIOMMUState *s)
>>      s->ecap = VTD_ECAP_QI | VTD_ECAP_IRO;
>>  
>>      if (x86_iommu->intr_supported) {
>> -        s->ecap |= VTD_ECAP_IR | VTD_ECAP_EIM | VTD_ECAP_MHMV;
>> +        s->ecap |= VTD_ECAP_IR | VTD_ECAP_MHMV;
>> +        /* QEMU APIC does not support x2APIC and KVM does not work well without
>> +         * disabling a quirk.  IOMMU is unmigratable so we unconditionally use
>> +         * optional KVM features.
>> +         */
>> +        if (kvm_irqchip_in_kernel() && kvm_disable_x2apic_broadcast_quirk()) {
>> +            s->ecap |= VTD_ECAP_EIM;
>> +        }
> 
> Good to me if this patch is only going to disable x2apic when we
> failed to disable the x2apic broadcast quirk in KVM.

Do you mean to also allow QEMU's APIC?

  if (!kvm_irqchip_in_kernel() || kvm_disable_x2apic_broadcast_quirk())

Thanks.

> Question: still not too clear about how KVM treats the case when
> x2apic and xapic are used in a single VM. E.g., if dest_id of an
> interrupt is 0xff from a peripheral device, how should I know this is
> a x2apic broadcast to 0-7 cpu in cluster 0, or an apic broadcast to
> all?

If a KVM guest has LAPICs in both x and x2 modes, then every interrupt
arrives to all LAPICs and is accepted according to ID/LDR/DFR where
every LAPIC assumes that the sender matches LAPIC's mode => all xLAPICs
would accept 0xff and x2LAPICs with ID 0-7 would as well.
kvm_apic_match_dest() is the function that decides and kvm_apic_mda()
does most of the magic.  The quirk disables a case that translated 0xff
to 0xffffffff for x2LAPICs.

I don't know how real hardware does it and the behavior might even
differ between FSB and QPI.  I think KVM differs from both of them, but
it's not that any behavior makes a difference in practice, so running a
test kernel to figure it out has never been a priority ...
Peter Xu Aug. 11, 2016, 3:33 a.m. UTC | #3
On Wed, Aug 10, 2016 at 06:59:14PM +0200, Radim Krčmář wrote:
> 2016-08-10 11:29+0800, Peter Xu:
> > On Tue, Aug 09, 2016 at 05:03:33PM +0200, Radim Krčmář wrote:
> >> APIC in QEMU doesn't support x2APIC so exposing EIM is pointless and KVM
> >> has a quirk that needs to be disabled unless we want x2APIC message with
> >> destination 0xff to be misinterpreted as a broadcast.
> >> 
> >> Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
> >> ---
> >> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> >> @@ -31,6 +31,7 @@
> >>  #include "hw/i386/x86-iommu.h"
> >>  #include "hw/pci-host/q35.h"
> >>  #include "sysemu/kvm.h"
> >> +#include "kvm_i386.h"
> >>  
> >>  /*#define DEBUG_INTEL_IOMMU*/
> >>  #ifdef DEBUG_INTEL_IOMMU
> >> @@ -2364,7 +2365,14 @@ static void vtd_init(IntelIOMMUState *s)
> >>      s->ecap = VTD_ECAP_QI | VTD_ECAP_IRO;
> >>  
> >>      if (x86_iommu->intr_supported) {
> >> -        s->ecap |= VTD_ECAP_IR | VTD_ECAP_EIM | VTD_ECAP_MHMV;
> >> +        s->ecap |= VTD_ECAP_IR | VTD_ECAP_MHMV;
> >> +        /* QEMU APIC does not support x2APIC and KVM does not work well without
> >> +         * disabling a quirk.  IOMMU is unmigratable so we unconditionally use
> >> +         * optional KVM features.
> >> +         */
> >> +        if (kvm_irqchip_in_kernel() && kvm_disable_x2apic_broadcast_quirk()) {
> >> +            s->ecap |= VTD_ECAP_EIM;
> >> +        }
> > 
> > Good to me if this patch is only going to disable x2apic when we
> > failed to disable the x2apic broadcast quirk in KVM.
> 
> Do you mean to also allow QEMU's APIC?
> 
>   if (!kvm_irqchip_in_kernel() || kvm_disable_x2apic_broadcast_quirk())
> 
> Thanks.

Nop. But, now I understand your mean, and please take this:

Reviewed-by: Peter Xu <peterx@redhat.com>

> 
> > Question: still not too clear about how KVM treats the case when
> > x2apic and xapic are used in a single VM. E.g., if dest_id of an
> > interrupt is 0xff from a peripheral device, how should I know this is
> > a x2apic broadcast to 0-7 cpu in cluster 0, or an apic broadcast to
> > all?
> 
> If a KVM guest has LAPICs in both x and x2 modes, then every interrupt
> arrives to all LAPICs and is accepted according to ID/LDR/DFR where
> every LAPIC assumes that the sender matches LAPIC's mode => all xLAPICs
> would accept 0xff and x2LAPICs with ID 0-7 would as well.
> kvm_apic_match_dest() is the function that decides and kvm_apic_mda()
> does most of the magic.  The quirk disables a case that translated 0xff
> to 0xffffffff for x2LAPICs.
> 
> I don't know how real hardware does it and the behavior might even
> differ between FSB and QPI.  I think KVM differs from both of them, but
> it's not that any behavior makes a difference in practice, so running a
> test kernel to figure it out has never been a priority ...

Thank you for the explaination!

I feel I am getting close to the nightmares. :)

-- peterx
diff mbox

Patch

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 28c31a2cdfa3..733751923233 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -31,6 +31,7 @@ 
 #include "hw/i386/x86-iommu.h"
 #include "hw/pci-host/q35.h"
 #include "sysemu/kvm.h"
+#include "kvm_i386.h"
 
 /*#define DEBUG_INTEL_IOMMU*/
 #ifdef DEBUG_INTEL_IOMMU
@@ -2364,7 +2365,14 @@  static void vtd_init(IntelIOMMUState *s)
     s->ecap = VTD_ECAP_QI | VTD_ECAP_IRO;
 
     if (x86_iommu->intr_supported) {
-        s->ecap |= VTD_ECAP_IR | VTD_ECAP_EIM | VTD_ECAP_MHMV;
+        s->ecap |= VTD_ECAP_IR | VTD_ECAP_MHMV;
+        /* QEMU APIC does not support x2APIC and KVM does not work well without
+         * disabling a quirk.  IOMMU is unmigratable so we unconditionally use
+         * optional KVM features.
+         */
+        if (kvm_irqchip_in_kernel() && kvm_disable_x2apic_broadcast_quirk()) {
+            s->ecap |= VTD_ECAP_EIM;
+        }
     }
 
     vtd_reset_context_cache(s);
diff --git a/target-i386/kvm-stub.c b/target-i386/kvm-stub.c
index cdf15061091d..71be0bd94ddc 100644
--- a/target-i386/kvm-stub.c
+++ b/target-i386/kvm-stub.c
@@ -25,6 +25,11 @@  bool kvm_has_smm(void)
     return 1;
 }
 
+bool kvm_disable_x2apic_broadcast_quirk(void)
+{
+    return false;
+}
+
 /* This function is only called inside conditionals which we
  * rely on the compiler to optimize out when CONFIG_KVM is not
  * defined.
diff --git a/target-i386/kvm.c b/target-i386/kvm.c
index 0b2016a77a3c..050c850d77d3 100644
--- a/target-i386/kvm.c
+++ b/target-i386/kvm.c
@@ -128,6 +128,18 @@  bool kvm_allows_irq0_override(void)
     return !kvm_irqchip_in_kernel() || kvm_has_gsi_routing();
 }
 
+static bool kvm_x2apic_api_set_flags(uint64_t flags)
+{
+    KVMState *s = KVM_STATE(current_machine->accelerator);
+
+    return !kvm_vm_enable_cap(s, KVM_CAP_X2APIC_API, 0, flags);
+}
+
+bool kvm_disable_x2apic_broadcast_quirk(void)
+{
+    return kvm_x2apic_api_set_flags(KVM_X2APIC_API_DISABLE_BROADCAST_QUIRK);
+}
+
 static int kvm_get_tsc(CPUState *cs)
 {
     X86CPU *cpu = X86_CPU(cs);
diff --git a/target-i386/kvm_i386.h b/target-i386/kvm_i386.h
index 42b00af1b1c3..1bc445d59c83 100644
--- a/target-i386/kvm_i386.h
+++ b/target-i386/kvm_i386.h
@@ -41,4 +41,5 @@  int kvm_device_msix_set_vector(KVMState *s, uint32_t dev_id, uint32_t vector,
 int kvm_device_msix_assign(KVMState *s, uint32_t dev_id);
 int kvm_device_msix_deassign(KVMState *s, uint32_t dev_id);
 
+bool kvm_disable_x2apic_broadcast_quirk(void);
 #endif