diff mbox

[1/2] powerpc: fix graceful debugger recovery

Message ID 20161108121445.15303-2-npiggin@gmail.com (mailing list archive)
State Accepted
Headers show

Commit Message

Nicholas Piggin Nov. 8, 2016, 12:14 p.m. UTC
When exiting xmon with 'x' (exit and recover), oops_begin bails
out immediately, but die then calls __die() and oops_end(), which
cause a lot of bad things to happen.

If the debugger was attached then went to graceful recovery, exit
from die() immediately.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/kernel/traps.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

Comments

Michael Ellerman Nov. 10, 2016, 1:35 a.m. UTC | #1
Nicholas Piggin <npiggin@gmail.com> writes:

> When exiting xmon with 'x' (exit and recover), oops_begin bails
> out immediately, but die then calls __die() and oops_end(), which
> cause a lot of bad things to happen.

In fact oops_begin() returns 1, which oops_end() then passes directly to
raw_local_irq_restore() as flags. On 64-bit that actually works because
arch_local_irq_restore() takes just "en" (enable), not real flags. But
on 32-bit it's supposed to be the MSR value. So that's impressively
broken.

> If the debugger was attached then went to graceful recovery, exit
> from die() immediately.

Right. Crucially it doesn't change anything in terms of the actual logic
of oops_begin(), ie. previously oops_begin() did nothing prior to
calling debugger(), and after this patch that remains the same (which
you did mention above but just spelling it out for myself).

cheers
Nicholas Piggin Nov. 10, 2016, 4:24 a.m. UTC | #2
On Thu, 10 Nov 2016 12:35:59 +1100
Michael Ellerman <mpe@ellerman.id.au> wrote:

> Nicholas Piggin <npiggin@gmail.com> writes:
> 
> > When exiting xmon with 'x' (exit and recover), oops_begin bails
> > out immediately, but die then calls __die() and oops_end(), which
> > cause a lot of bad things to happen.  
> 
> In fact oops_begin() returns 1, which oops_end() then passes directly to
> raw_local_irq_restore() as flags. On 64-bit that actually works because
> arch_local_irq_restore() takes just "en" (enable), not real flags. But
> on 32-bit it's supposed to be the MSR value. So that's impressively
> broken.

Yeah, I guess most of the time you either go to debugger with
sysrq, or in case of a crash don't try to graceful recover.

When sending debug NMIs down via system reset it becomes a problem!

> 
> > If the debugger was attached then went to graceful recovery, exit
> > from die() immediately.  
> 
> Right. Crucially it doesn't change anything in terms of the actual logic
> of oops_begin(), ie. previously oops_begin() did nothing prior to
> calling debugger(), and after this patch that remains the same (which
> you did mention above but just spelling it out for myself).

Right.

Thanks,
Nick
Michael Ellerman Nov. 10, 2016, 10:25 a.m. UTC | #3
Nicholas Piggin <npiggin@gmail.com> writes:

> On Thu, 10 Nov 2016 12:35:59 +1100
> Michael Ellerman <mpe@ellerman.id.au> wrote:
>
>> Nicholas Piggin <npiggin@gmail.com> writes:
>> 
>> > When exiting xmon with 'x' (exit and recover), oops_begin bails
>> > out immediately, but die then calls __die() and oops_end(), which
>> > cause a lot of bad things to happen.  
>> 
>> In fact oops_begin() returns 1, which oops_end() then passes directly to
>> raw_local_irq_restore() as flags. On 64-bit that actually works because
>> arch_local_irq_restore() takes just "en" (enable), not real flags. But
>> on 32-bit it's supposed to be the MSR value. So that's impressively
>> broken.
>
> Yeah, I guess most of the time you either go to debugger with
> sysrq, or in case of a crash don't try to graceful recover.

Yeah. It's debatable whether we should even allow graceful recovery, but
it's useful sometimes and regular users probably shouldn't have a
debugger enabled anyway.

cheers
Michael Ellerman Nov. 22, 2016, 12:34 a.m. UTC | #4
On Tue, 2016-11-08 at 12:14:44 UTC, Nicholas Piggin wrote:
> When exiting xmon with 'x' (exit and recover), oops_begin bails
> out immediately, but die then calls __die() and oops_end(), which
> cause a lot of bad things to happen.
> 
> If the debugger was attached then went to graceful recovery, exit
> from die() immediately.
> 
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/6f44b20ee9b4130345c189c0c90ef6

cheers
diff mbox

Patch

diff --git a/arch/powerpc/kernel/traps.c b/arch/powerpc/kernel/traps.c
index 7480902..26c3ba4 100644
--- a/arch/powerpc/kernel/traps.c
+++ b/arch/powerpc/kernel/traps.c
@@ -122,9 +122,6 @@  static unsigned long oops_begin(struct pt_regs *regs)
 	int cpu;
 	unsigned long flags;
 
-	if (debugger(regs))
-		return 1;
-
 	oops_enter();
 
 	/* racy, but better than risking deadlock. */
@@ -227,8 +224,12 @@  NOKPROBE_SYMBOL(__die);
 
 void die(const char *str, struct pt_regs *regs, long err)
 {
-	unsigned long flags = oops_begin(regs);
+	unsigned long flags;
+
+	if (debugger(regs))
+		return;
 
+	flags = oops_begin(regs);
 	if (__die(str, regs, err))
 		err = 0;
 	oops_end(flags, regs, err);