powerpc: unrecoverable system reset crash fix

Message ID 20211130105603.2042129-1-npiggin@gmail.com (mailing list archive)
State Changes Requested

Checks

Context                                      Check    Description
snowpatch_ozlabs/github-powerpc_selftests    success  Successfully ran 8 jobs.
snowpatch_ozlabs/github-powerpc_ppctests     success  Successfully ran 8 jobs.
snowpatch_ozlabs/github-powerpc_clang        success  Successfully ran 7 jobs.
snowpatch_ozlabs/github-powerpc_sparse       success  Successfully ran 4 jobs.
snowpatch_ozlabs/github-powerpc_kernel_qemu  success  Successfully ran 24 jobs.

Commit Message

Nicholas Piggin Nov. 30, 2021, 10:56 a.m. UTC
When the system reset interrupt (0x100 trap) calls die(), die() returns
instead of exiting. The unrecoverable system reset cases don't expect
die() to return, so execution continues and hits a BUG later.

Change the 0x100 logic in die() so that it only skips kexec/fadump, and
only when signr is 0. A caller can then pass a signr of 0 to avoid the
exit/panic, which the main 0x100 die() call does. The unrecoverable
die() calls then exit and don't return, as expected.
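
The change in oops_end() can be illustrated with a small userspace
model. This is only a sketch, not kernel code: every model_* name and
the MODEL_TRAP_SYSTEM_RESET constant are hypothetical stand-ins
introduced here, and the model ignores that crash_kexec() may not
return when a crash kernel is loaded.

```c
#include <stdbool.h>

/* Hypothetical stand-in for INTERRUPT_SYSTEM_RESET (the 0x100 trap). */
#define MODEL_TRAP_SYSTEM_RESET 0x100UL

enum model_outcome {
	MODEL_RETURNS_TO_CALLER,	/* oops_end() returns to die()'s caller */
	MODEL_EXITS,			/* oops_end() goes on to panic or exit */
};

struct model_result {
	bool ran_crash_dump;		/* fadump/kexec path was taken */
	enum model_outcome outcome;
};

/* Before the patch: every 0x100 trap returns early, whatever signr is. */
struct model_result model_oops_end_old(unsigned long trap, int signr)
{
	struct model_result r = { false, MODEL_RETURNS_TO_CALLER };

	if (trap == MODEL_TRAP_SYSTEM_RESET)
		return r;

	r.ran_crash_dump = true;
	if (!signr)
		return r;

	r.outcome = MODEL_EXITS;
	return r;
}

/* After the patch: only 0x100 with signr == 0 skips the crash dump code. */
struct model_result model_oops_end_new(unsigned long trap, int signr)
{
	struct model_result r = { false, MODEL_RETURNS_TO_CALLER };

	if (!(trap == MODEL_TRAP_SYSTEM_RESET && signr == 0))
		r.ran_crash_dump = true;

	if (!signr)
		return r;

	r.outcome = MODEL_EXITS;
	return r;
}
```

In this model, a 0x100 trap with a non-zero signr returns to the caller
under the old logic and exits under the new logic, while a 0x100 trap
with signr == 0 still returns without running the crash dump path.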

Fixes: d40b6768e45b ("powerpc/64s: sreset panic if there is no debugger or crash dump handlers")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/kernel/traps.c | 37 +++++++++++++++++++++++++------------
 1 file changed, 25 insertions(+), 12 deletions(-)

Patch

diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index 11741703d26e..94b842d659ab 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -219,15 +219,15 @@  static void oops_end(unsigned long flags, struct pt_regs *regs,
 	raw_local_irq_restore(flags);
 
 	/*
-	 * system_reset_excption handles debugger, crash dump, panic, for 0x100
+	 * When system_reset_exception sets signr==0, it does not want crash
+	 * dump code to be called (it has already handled it).
 	 */
-	if (TRAP(regs) == INTERRUPT_SYSTEM_RESET)
-		return;
-
-	crash_fadump(regs, "die oops");
+	if (!(TRAP(regs) == INTERRUPT_SYSTEM_RESET && signr == 0)) {
+		crash_fadump(regs, "die oops");
 
-	if (kexec_should_crash(current))
-		crash_kexec(regs);
+		if (kexec_should_crash(current))
+			crash_kexec(regs);
+	}
 
 	if (!signr)
 		return;
@@ -287,7 +287,8 @@  void die(const char *str, struct pt_regs *regs, long err)
 	unsigned long flags;
 
 	/*
-	 * system_reset_excption handles debugger, crash dump, panic, for 0x100
+	 * When system_reset_exception calls die, it does not want the
+	 * debugger to be invoked (it has already handled it).
 	 */
 	if (TRAP(regs) != INTERRUPT_SYSTEM_RESET) {
 		if (debugger(regs))
@@ -462,8 +463,19 @@  DEFINE_INTERRUPT_HANDLER_NMI(system_reset_exception)
 
 	/* See if any machine dependent calls */
 	if (ppc_md.system_reset_exception) {
-		if (ppc_md.system_reset_exception(regs))
-			goto out;
+		if (ppc_md.system_reset_exception(regs)) {
+			/*
+			 * If this is unrecoverable, it will miss calling
+			 * the debugger due to the TRAP=0x100 logic in die(),
+			 * do it here.
+			 */
+			if (regs_is_unrecoverable(regs)) {
+				if (debugger(regs))
+					goto out;
+			} else {
+				goto out;
+			}
+		}
 	}
 
 	if (debugger(regs))
@@ -488,9 +500,10 @@  DEFINE_INTERRUPT_HANDLER_NMI(system_reset_exception)
 
 	/*
 	 * No debugger or crash dump registered, print logs then
-	 * panic.
+	 * panic. Pass 0 in the err argument to prevent the debugger
+	 * being invoked again, and to prevent die() from crashing.
 	 */
-	die("System Reset", regs, SIGABRT);
+	die("System Reset", regs, 0);
 
 	mdelay(2*MSEC_PER_SEC); /* Wait a little while for others to print */
 	add_taint(TAINT_DIE, LOCKDEP_NOW_UNRELIABLE);