diff mbox series

powerpc/books: Never call nmi_enter for real-mode NMIs

Message ID 20221027074314.2084016-1-npiggin@gmail.com (mailing list archive)
State Changes Requested
Headers show
Series powerpc/books: Never call nmi_enter for real-mode NMIs | expand

Checks

Context Check Description
snowpatch_ozlabs/github-powerpc_kernel_qemu fail kernel (corenet32_smp_defconfig, fedora-34) failed at step build.
snowpatch_ozlabs/github-powerpc_ppctests success Successfully ran 10 jobs.

Commit Message

Nicholas Piggin Oct. 27, 2022, 7:43 a.m. UTC
NMIs that are taken in real mode (the early MCE and HMI handlers)
skipped calling nmi_enter() in some configurations, in the hope that
more modern configurations like radix suffer fewer restrictions.  This
just turns into whack-a-mole and fragile when core kernel code changes
anything.

A recent such example that breaks with radix, an HMI real mode interrupt
tries to access vmalloc memory, causing it to take a machine check:

 --- interrupt: 200 at perf_trace_rcu_dyntick+0x140/0x190
 NIP:  c0000000001d4720 LR: c0000000001d2bb4 CTR: c0000000001d45e0
 REGS: c000000fffdbfd60 TRAP: 0200   Tainted: G   M                (6.0.0-dirty)
 MSR:  9000000000201003 <SF,HV,ME,RI,LE>  CR: 24024228  XER: 20040000
 CFAR: c0000000001d4648 DAR: c009e000016e29a8 DSISR: 00000008 IRQMASK: 3
 GPR00: c0000000001d2bb4 c000000fffdc7b30 c00000000255c100 c0000000023089f8
 GPR04: c000000001bd0438 4000000000000000 4000000000000002 0000000000964794
 GPR08: 0000000000000000 c009dff0055b29a8 0000000ffc130000 7265677368657265
 GPR12: c0000000001d45e0 c000000ffffd7000 c00000000014e7c8 c00000000ab74280
 GPR16: 0000000000000000 0000000000000000 0000000000000000 c0000000031a64d8
 GPR20: c00000000d9f7b00 0000000000000006 c000000002446a28 c009e000016e29a8
 GPR24: c000000001bd0438 4000000000000000 4000000000000002 0000000000964794
 GPR28: c0000000001d2bb4 4000000000000002 c0000000023089f8 c0002000063f0668
 perf_trace_rcu_dyntick+0x140/0x190
 __traceiter_rcu_dyntick+0x84/0xc0
 --- interrupt: 200
 rcu_read_lock_sched_held+0x10/0xe0 (unreliable)
 __traceiter_rcu_dyntick+0x84/0xc0
 ct_nmi_enter+0x118/0x280
 interrupt_nmi_enter_prepare+0x118/0x1f0
 hmi_exception_realmode+0x38/0xe4
 hmi_exception_early_common+0x114/0x2a0
 --- interrupt: e60 at arch_local_irq_restore+0x11c/0x1b0

Just disable this entirely. It turns out the features that might be
enabled by nmi_enter(), like RCU or printk are unlikely to be usable
in real mode anyway.

Reported-by: Michael Ellerman <mpe@ellerman.id.au>
Cc: Mahesh Salgaonkar <mahesh@linux.ibm.com>
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/include/asm/interrupt.h | 49 +++++++++-------------------
 1 file changed, 16 insertions(+), 33 deletions(-)

Comments

kernel test robot Oct. 27, 2022, 10:31 a.m. UTC | #1
Hi Nicholas,

I love your patch! Yet something to improve:

[auto build test ERROR on powerpc/next]
[also build test ERROR on linus/master v6.1-rc2 next-20221027]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Nicholas-Piggin/powerpc-books-Never-call-nmi_enter-for-real-mode-NMIs/20221027-154503
base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
patch link:    https://lore.kernel.org/r/20221027074314.2084016-1-npiggin%40gmail.com
patch subject: [PATCH] powerpc/books: Never call nmi_enter for real-mode NMIs
config: powerpc-allnoconfig
compiler: powerpc-linux-gcc (GCC) 12.1.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/57c237cb1fe034e85db9cd44fb09986fa27b2197
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Nicholas-Piggin/powerpc-books-Never-call-nmi_enter-for-real-mode-NMIs/20221027-154503
        git checkout 57c237cb1fe034e85db9cd44fb09986fa27b2197
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=powerpc SHELL=/bin/bash arch/powerpc/lib/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   In file included from arch/powerpc/lib/feature-fixups.c:20:
   arch/powerpc/include/asm/interrupt.h: In function 'interrupt_nmi_enter_prepare':
>> arch/powerpc/include/asm/interrupt.h:349:18: error: 'struct interrupt_nmi_state' has no member named 'mmu_enabled'
     349 |         if (state->mmu_enabled)
         |                  ^~
   In file included from arch/powerpc/include/asm/bug.h:158,
                    from include/linux/bug.h:5,
                    from arch/powerpc/include/asm/cmpxchg.h:8,
                    from arch/powerpc/include/asm/atomic.h:11,
                    from include/linux/atomic.h:7,
                    from include/linux/jump_label.h:254,
                    from arch/powerpc/lib/feature-fixups.c:12:
   arch/powerpc/include/asm/interrupt.h: In function 'interrupt_nmi_exit_prepare':
   arch/powerpc/include/asm/interrupt.h:355:27: error: 'struct interrupt_nmi_state' has no member named 'mmu_enabled'
     355 |         WARN_ON_ONCE(state->mmu_enabled != !!(mfmsr() & MSR_DR));
         |                           ^~
   include/asm-generic/bug.h:110:32: note: in definition of macro 'WARN_ON_ONCE'
     110 |         int __ret_warn_on = !!(condition);                      \
         |                                ^~~~~~~~~
   arch/powerpc/include/asm/interrupt.h:357:18: error: 'struct interrupt_nmi_state' has no member named 'mmu_enabled'
     357 |         if (state->mmu_enabled)
         |                  ^~


vim +349 arch/powerpc/include/asm/interrupt.h

   337	
   338		/*
   339		 * If data relocations are enabled, it's safe to use nmi_enter().
   340		 * Otherwise avoid using it because the core kernel may touch
   341		 * vmalloc (e.g., in per-CPU variables), which is not accessible
   342		 * with the MMU off. Linear memory beyond the VRMA limit is also
   343		 * a problem for hash guests.
   344		 *
   345		 * The real-mode machine checks should not use RCU, tracing, lockdep
   346		 * locks, and should not printk, access per-CPU variables, among
   347		 * many other restrictions.
   348		 */
 > 349		if (state->mmu_enabled)
   350			nmi_enter();
   351	}
   352
diff mbox series

Patch

diff --git a/arch/powerpc/include/asm/interrupt.h b/arch/powerpc/include/asm/interrupt.h
index 4745bb9998bd..3e87e9ec5117 100644
--- a/arch/powerpc/include/asm/interrupt.h
+++ b/arch/powerpc/include/asm/interrupt.h
@@ -276,6 +276,7 @@  struct interrupt_nmi_state {
 	u8 irq_soft_mask;
 	u8 irq_happened;
 	u8 ftrace_enabled;
+	u8 mmu_enabled;
 	u64 softe;
 #endif
 };
@@ -303,6 +304,7 @@  static inline void interrupt_nmi_enter_prepare(struct pt_regs *regs, struct inte
 	state->irq_soft_mask = local_paca->irq_soft_mask;
 	state->irq_happened = local_paca->irq_happened;
 	state->softe = regs->softe;
+	state->mmu_enabled = !!(mfmsr() & MSR_DR);
 
 	/*
 	 * Set IRQS_ALL_DISABLED unconditionally so irqs_disabled() does
@@ -333,46 +335,27 @@  static inline void interrupt_nmi_enter_prepare(struct pt_regs *regs, struct inte
 	}
 #endif
 
-	/* If data relocations are enabled, it's safe to use nmi_enter() */
-	if (mfmsr() & MSR_DR) {
-		nmi_enter();
-		return;
-	}
-
-	/*
-	 * But do not use nmi_enter() for pseries hash guest taking a real-mode
-	 * NMI because not everything it touches is within the RMA limit.
-	 */
-	if (IS_ENABLED(CONFIG_PPC_BOOK3S_64) &&
-	    firmware_has_feature(FW_FEATURE_LPAR) &&
-	    !radix_enabled())
-		return;
-
 	/*
-	 * Likewise, don't use it if we have some form of instrumentation (like
-	 * KASAN shadow) that is not safe to access in real mode (even on radix)
+	 * If data relocations are enabled, it's safe to use nmi_enter().
+	 * Otherwise avoid using it because the core kernel may touch
+	 * vmalloc (e.g., in per-CPU variables), which is not accessible
+	 * with the MMU off. Linear memory beyond the VRMA limit is also
+	 * a problem for hash guests.
+	 *
+	 * The real-mode machine checks should not use RCU, tracing, lockdep
+	 * locks, and should not printk, access per-CPU variables, among
+	 * many other restrictions.
 	 */
-	if (IS_ENABLED(CONFIG_KASAN))
-		return;
-
-	/* Otherwise, it should be safe to call it */
-	nmi_enter();
+	if (state->mmu_enabled)
+		nmi_enter();
 }
 
 static inline void interrupt_nmi_exit_prepare(struct pt_regs *regs, struct interrupt_nmi_state *state)
 {
-	if (mfmsr() & MSR_DR) {
-		// nmi_exit if relocations are on
-		nmi_exit();
-	} else if (IS_ENABLED(CONFIG_PPC_BOOK3S_64) &&
-		   firmware_has_feature(FW_FEATURE_LPAR) &&
-		   !radix_enabled()) {
-		// no nmi_exit for a pseries hash guest taking a real mode exception
-	} else if (IS_ENABLED(CONFIG_KASAN)) {
-		// no nmi_exit for KASAN in real mode
-	} else {
+	WARN_ON_ONCE(state->mmu_enabled != !!(mfmsr() & MSR_DR));
+
+	if (state->mmu_enabled)
 		nmi_exit();
-	}
 
 	/*
 	 * nmi does not call nap_adjust_return because nmi should not create