diff mbox series

[Jammy,6/6] x86/entry: Do not allow external 0x80 interrupts

Message ID 20240611201145.183510-10-yuxuan.luo@canonical.com
State New
Headers show
Series CVE-2024-25744 | expand

Commit Message

Yuxuan Luo June 11, 2024, 8:11 p.m. UTC
From: Thomas Gleixner <tglx@linutronix.de>

The INT 0x80 instruction is used for 32-bit x86 Linux syscalls. The
kernel expects to receive a software interrupt as a result of the INT
0x80 instruction. However, an external interrupt on the same vector
also triggers the same codepath.

An external interrupt on vector 0x80 will currently be interpreted as a
32-bit system call, and assuming that it was a user context.

Panic on external interrupts on the vector.

To distinguish software interrupts from external ones, the kernel checks
the APIC ISR bit relevant to the 0x80 vector. For software interrupts,
this bit will be 0.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Reviewed-by: Borislav Petkov (AMD) <bp@alien8.de>
Cc: <stable@vger.kernel.org> # v6.0+
(cherry picked from commit 55617fb991df535f953589586468612351575704)
CVE-2024-25744
Signed-off-by: Yuxuan Luo <yuxuan.luo@canonical.com>
---
 arch/x86/entry/common.c | 37 ++++++++++++++++++++++++++++++++++++-
 1 file changed, 36 insertions(+), 1 deletion(-)
diff mbox series

Patch

diff --git a/arch/x86/entry/common.c b/arch/x86/entry/common.c
index 5adc7a17f37c9..d1594b4acf485 100644
--- a/arch/x86/entry/common.c
+++ b/arch/x86/entry/common.c
@@ -25,6 +25,7 @@ 
 #include <xen/events.h>
 #endif
 
+#include <asm/apic.h>
 #include <asm/desc.h>
 #include <asm/traps.h>
 #include <asm/vdso.h>
@@ -120,6 +121,25 @@  static __always_inline void do_syscall_32_irqs_on(struct pt_regs *regs, int nr)
 }
 
 #ifdef CONFIG_IA32_EMULATION
+static __always_inline bool int80_is_external(void)
+{
+	const unsigned int offs = (0x80 / 32) * 0x10;
+	const u32 bit = BIT(0x80 % 32);
+
+	/* The local APIC on XENPV guests is fake */
+	if (cpu_feature_enabled(X86_FEATURE_XENPV))
+		return false;
+
+	/*
+	 * If vector 0x80 is set in the APIC ISR then this is an external
+	 * interrupt. Either from broken hardware or injected by a VMM.
+	 *
+	 * Note: In guest mode this is only valid for secure guests where
+	 * the secure module fully controls the vAPIC exposed to the guest.
+	 */
+	return apic_read(APIC_ISR + offs) & bit;
+}
+
 /**
  * int80_emulation - 32-bit legacy syscall entry
  *
@@ -143,12 +163,27 @@  DEFINE_IDTENTRY_RAW(int80_emulation)
 {
 	int nr;
 
-	/* Establish kernel context. */
+	/* Kernel does not use INT $0x80! */
+	if (unlikely(!user_mode(regs))) {
+		irqentry_enter(regs);
+		instrumentation_begin();
+		panic("Unexpected external interrupt 0x80\n");
+	}
+
+	/*
+	 * Establish kernel context for instrumentation, including for
+	 * int80_is_external() below which calls into the APIC driver.
+	 * Identical for soft and external interrupts.
+	 */
 	enter_from_user_mode(regs);
 
 	instrumentation_begin();
 	add_random_kstack_offset();
 
+	/* Validate that this is a soft interrupt to the extent possible */
+	if (unlikely(int80_is_external()))
+		panic("Unexpected external interrupt 0x80\n");
+
 	/*
 	 * The low level idtentry code pushed -1 into regs::orig_ax
 	 * and regs::ax contains the syscall number.