Message ID | 20221027074314.2084016-1-npiggin@gmail.com (mailing list archive) |
---|---|
State | Changes Requested |
Headers | show |
Series | powerpc/books: Never call nmi_enter for real-mode NMIs | expand |
Context | Check | Description |
---|---|---|
snowpatch_ozlabs/github-powerpc_kernel_qemu | fail | kernel (corenet32_smp_defconfig, fedora-34) failed at step build. |
snowpatch_ozlabs/github-powerpc_ppctests | success | Successfully ran 10 jobs. |
Hi Nicholas, I love your patch! Yet something to improve: [auto build test ERROR on powerpc/next] [also build test ERROR on linus/master v6.1-rc2 next-20221027] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/Nicholas-Piggin/powerpc-books-Never-call-nmi_enter-for-real-mode-NMIs/20221027-154503 base: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next patch link: https://lore.kernel.org/r/20221027074314.2084016-1-npiggin%40gmail.com patch subject: [PATCH] powerpc/books: Never call nmi_enter for real-mode NMIs config: powerpc-allnoconfig compiler: powerpc-linux-gcc (GCC) 12.1.0 reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # https://github.com/intel-lab-lkp/linux/commit/57c237cb1fe034e85db9cd44fb09986fa27b2197 git remote add linux-review https://github.com/intel-lab-lkp/linux git fetch --no-tags linux-review Nicholas-Piggin/powerpc-books-Never-call-nmi_enter-for-real-mode-NMIs/20221027-154503 git checkout 57c237cb1fe034e85db9cd44fb09986fa27b2197 # save the config file mkdir build_dir && cp config build_dir/.config COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=powerpc SHELL=/bin/bash arch/powerpc/lib/ If you fix the issue, kindly add following tag where applicable | Reported-by: kernel test robot <lkp@intel.com> All errors (new ones prefixed by >>): In file included from arch/powerpc/lib/feature-fixups.c:20: arch/powerpc/include/asm/interrupt.h: In function 'interrupt_nmi_enter_prepare': >> arch/powerpc/include/asm/interrupt.h:349:18: error: 'struct interrupt_nmi_state' has no member named 'mmu_enabled' 349 | if (state->mmu_enabled) | ^~ In file included from arch/powerpc/include/asm/bug.h:158, from include/linux/bug.h:5, from arch/powerpc/include/asm/cmpxchg.h:8, from arch/powerpc/include/asm/atomic.h:11, from include/linux/atomic.h:7, from include/linux/jump_label.h:254, from arch/powerpc/lib/feature-fixups.c:12: arch/powerpc/include/asm/interrupt.h: In function 'interrupt_nmi_exit_prepare': arch/powerpc/include/asm/interrupt.h:355:27: error: 'struct interrupt_nmi_state' has no member named 'mmu_enabled' 355 | WARN_ON_ONCE(state->mmu_enabled != !!(mfmsr() & MSR_DR)); | ^~ include/asm-generic/bug.h:110:32: note: in definition of macro 'WARN_ON_ONCE' 110 | int __ret_warn_on = !!(condition); \ | ^~~~~~~~~ arch/powerpc/include/asm/interrupt.h:357:18: error: 'struct interrupt_nmi_state' has no member named 'mmu_enabled' 357 | if (state->mmu_enabled) | ^~ vim +349 arch/powerpc/include/asm/interrupt.h 337 338 /* 339 * If data relocations are enabled, it's safe to use nmi_enter(). 340 * Otherwise avoid using it because the core kernel may touch 341 * vmalloc (e.g., in per-CPU variables), which is not accessible 342 * with the MMU off. Linear memory beyond the VRMA limit is also 343 * a problem for hash guests. 344 * 345 * The real-mode machine checks should not use RCU, tracing, lockdep 346 * locks, and should not printk, access per-CPU variables, among 347 * many other restrictions. 348 */ > 349 if (state->mmu_enabled) 350 nmi_enter(); 351 } 352
diff --git a/arch/powerpc/include/asm/interrupt.h b/arch/powerpc/include/asm/interrupt.h index 4745bb9998bd..3e87e9ec5117 100644 --- a/arch/powerpc/include/asm/interrupt.h +++ b/arch/powerpc/include/asm/interrupt.h @@ -276,6 +276,7 @@ struct interrupt_nmi_state { u8 irq_soft_mask; u8 irq_happened; u8 ftrace_enabled; + u8 mmu_enabled; u64 softe; #endif }; @@ -303,6 +304,7 @@ static inline void interrupt_nmi_enter_prepare(struct pt_regs *regs, struct inte state->irq_soft_mask = local_paca->irq_soft_mask; state->irq_happened = local_paca->irq_happened; state->softe = regs->softe; + state->mmu_enabled = !!(mfmsr() & MSR_DR); /* * Set IRQS_ALL_DISABLED unconditionally so irqs_disabled() does @@ -333,46 +335,27 @@ static inline void interrupt_nmi_enter_prepare(struct pt_regs *regs, struct inte } #endif - /* If data relocations are enabled, it's safe to use nmi_enter() */ - if (mfmsr() & MSR_DR) { - nmi_enter(); - return; - } - - /* - * But do not use nmi_enter() for pseries hash guest taking a real-mode - * NMI because not everything it touches is within the RMA limit. - */ - if (IS_ENABLED(CONFIG_PPC_BOOK3S_64) && - firmware_has_feature(FW_FEATURE_LPAR) && - !radix_enabled()) - return; - /* - * Likewise, don't use it if we have some form of instrumentation (like - * KASAN shadow) that is not safe to access in real mode (even on radix) + * If data relocations are enabled, it's safe to use nmi_enter(). + * Otherwise avoid using it because the core kernel may touch + * vmalloc (e.g., in per-CPU variables), which is not accessible + * with the MMU off. Linear memory beyond the VRMA limit is also + * a problem for hash guests. + * + * The real-mode machine checks should not use RCU, tracing, lockdep + * locks, and should not printk, access per-CPU variables, among + * many other restrictions. */ - if (IS_ENABLED(CONFIG_KASAN)) - return; - - /* Otherwise, it should be safe to call it */ - nmi_enter(); + if (state->mmu_enabled) + nmi_enter(); } static inline void interrupt_nmi_exit_prepare(struct pt_regs *regs, struct interrupt_nmi_state *state) { - if (mfmsr() & MSR_DR) { - // nmi_exit if relocations are on - nmi_exit(); - } else if (IS_ENABLED(CONFIG_PPC_BOOK3S_64) && - firmware_has_feature(FW_FEATURE_LPAR) && - !radix_enabled()) { - // no nmi_exit for a pseries hash guest taking a real mode exception - } else if (IS_ENABLED(CONFIG_KASAN)) { - // no nmi_exit for KASAN in real mode - } else { + WARN_ON_ONCE(state->mmu_enabled != !!(mfmsr() & MSR_DR)); + + if (state->mmu_enabled) nmi_exit(); - } /* * nmi does not call nap_adjust_return because nmi should not create
NMIs that are taken in real mode (the early MCE and HMI handlers) skipped calling nmi_enter() in some configurations, in the hope that more modern configurations like radix suffer fewer restrictions. This just turns into whack-a-mole and fragile when core kernel code changes anything. A recent such example that breaks with radix, an HMI real mode interrupt tries to access vmalloc memory, causing it to take a machine check: --- interrupt: 200 at perf_trace_rcu_dyntick+0x140/0x190 NIP: c0000000001d4720 LR: c0000000001d2bb4 CTR: c0000000001d45e0 REGS: c000000fffdbfd60 TRAP: 0200 Tainted: G M (6.0.0-dirty) MSR: 9000000000201003 <SF,HV,ME,RI,LE> CR: 24024228 XER: 20040000 CFAR: c0000000001d4648 DAR: c009e000016e29a8 DSISR: 00000008 IRQMASK: 3 GPR00: c0000000001d2bb4 c000000fffdc7b30 c00000000255c100 c0000000023089f8 GPR04: c000000001bd0438 4000000000000000 4000000000000002 0000000000964794 GPR08: 0000000000000000 c009dff0055b29a8 0000000ffc130000 7265677368657265 GPR12: c0000000001d45e0 c000000ffffd7000 c00000000014e7c8 c00000000ab74280 GPR16: 0000000000000000 0000000000000000 0000000000000000 c0000000031a64d8 GPR20: c00000000d9f7b00 0000000000000006 c000000002446a28 c009e000016e29a8 GPR24: c000000001bd0438 4000000000000000 4000000000000002 0000000000964794 GPR28: c0000000001d2bb4 4000000000000002 c0000000023089f8 c0002000063f0668 perf_trace_rcu_dyntick+0x140/0x190 __traceiter_rcu_dyntick+0x84/0xc0 --- interrupt: 200 rcu_read_lock_sched_held+0x10/0xe0 (unreliable) __traceiter_rcu_dyntick+0x84/0xc0 ct_nmi_enter+0x118/0x280 interrupt_nmi_enter_prepare+0x118/0x1f0 hmi_exception_realmode+0x38/0xe4 hmi_exception_early_common+0x114/0x2a0 --- interrupt: e60 at arch_local_irq_restore+0x11c/0x1b0 Just disable this entirely. It turns out the features that might be enabled by nmi_enter(), like RCU or printk are unlikely to be usable in real mode anyway. Reported-by: Michael Ellerman <mpe@ellerman.id.au> Cc: Mahesh Salgaonkar <mahesh@linux.ibm.com> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> --- arch/powerpc/include/asm/interrupt.h | 49 +++++++++------------------- 1 file changed, 16 insertions(+), 33 deletions(-)