Message ID | 20221114075607.30631-1-ganeshgr@linux.ibm.com (mailing list archive) |
---|---|
State | Superseded |
Series | powerpc/mce: log the error for all unrecoverable errors |
On 2022-11-14 13:26:07 Mon, Ganesh Goudar wrote:
> machine_check_log_err() is not getting called for all
> unrecoverable errors, And we are missing to log the error.
>
> Raise irq work in save_mce_event() for unrecoverable errors,
> So that we log the error from MCE event handling block in
> timer handler.

Thanks for fixing this.

Reviewed-by: Mahesh Salgaonkar <mahesh@linux.ibm.com>

>
> Signed-off-by: Ganesh Goudar <ganeshgr@linux.ibm.com>
> ---
>  arch/powerpc/kernel/mce.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c
> index 6c5d30fba766..a1cb2172eb7b 100644
> --- a/arch/powerpc/kernel/mce.c
> +++ b/arch/powerpc/kernel/mce.c
> @@ -131,6 +131,13 @@ void save_mce_event(struct pt_regs *regs, long handled,
>  	if (mce->error_type == MCE_ERROR_TYPE_UE)
>  		mce->u.ue_error.ignore_event = mce_err->ignore_event;
>
> +	/*
> +	 * Raise irq work, So that we don't miss to log the error for
> +	 * unrecoverable errors.
> +	 */
> +	if (mce->disposition == MCE_DISPOSITION_NOT_RECOVERED)
> +		mce_irq_work_queue();
> +
>  	if (!addr)
>  		return;
>
> @@ -235,7 +242,6 @@ static void machine_check_ue_event(struct machine_check_event *evt)
>  		evt, sizeof(*evt));
>
>  	/* Queue work to process this event later. */
> -	mce_irq_work_queue();
>  }

With your patch now we can see RTAS event logged for other unrecoverable
errors as well.

[ 573.006337] Disabling lock debugging due to kernel taint
[ 573.006357] MCE: CPU27: machine check (Severe) Real address Load/Store (foreign/control memory) [Not recovered]
[ 573.006362] MCE: CPU27: PID: 10580 Comm: inject-ra-err NIP: [0000000010000df4]
[ 573.006366] MCE: CPU27: Initiator CPU
[ 573.006369] MCE: CPU27: Unknown
[ 573.006426] RTAS: event: 1, Type: Platform Error (224), Severity: 3

Tested-by: Mahesh Salgaonkar <mahesh@linux.ibm.com>

Thanks,
-Mahesh.
diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c
index 6c5d30fba766..a1cb2172eb7b 100644
--- a/arch/powerpc/kernel/mce.c
+++ b/arch/powerpc/kernel/mce.c
@@ -131,6 +131,13 @@ void save_mce_event(struct pt_regs *regs, long handled,
 	if (mce->error_type == MCE_ERROR_TYPE_UE)
 		mce->u.ue_error.ignore_event = mce_err->ignore_event;
 
+	/*
+	 * Raise irq work, So that we don't miss to log the error for
+	 * unrecoverable errors.
+	 */
+	if (mce->disposition == MCE_DISPOSITION_NOT_RECOVERED)
+		mce_irq_work_queue();
+
 	if (!addr)
 		return;
 
@@ -235,7 +242,6 @@ static void machine_check_ue_event(struct machine_check_event *evt)
 		evt, sizeof(*evt));
 
 	/* Queue work to process this event later. */
-	mce_irq_work_queue();
 }
 
 /*
machine_check_log_err() is not getting called for all
unrecoverable errors, And we are missing to log the error.

Raise irq work in save_mce_event() for unrecoverable errors,
So that we log the error from MCE event handling block in
timer handler.

Signed-off-by: Ganesh Goudar <ganeshgr@linux.ibm.com>
---
 arch/powerpc/kernel/mce.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)
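
For readers who want to see the control-flow change in isolation, here is a minimal user-space sketch. It is not kernel code. Every name in it (`save_event`, `run_pending_work`, `raise_irq_work`, the enums, the `patched` flag) is a hypothetical stand-in, and the "before" branch assumes the deferred work was raised only on the UE path, which is all the diff shows. It models one idea: an event is only logged if something raises the deferred work, so moving the trigger to the disposition check covers every unrecovered error.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's MCE types; not the real definitions. */
enum error_type { ERROR_TYPE_UE, ERROR_TYPE_OTHER };
enum disposition { RECOVERED, NOT_RECOVERED };

struct event {
	enum error_type type;
	enum disposition disposition;
};

#define MAX_EVENTS 8

static struct event queue[MAX_EVENTS];
static size_t nr_queued;
static bool irq_work_pending;

/* Stand-in for mce_irq_work_queue(): mark the deferred work as pending. */
static void raise_irq_work(void)
{
	irq_work_pending = true;
}

/* Clear all model state so that scenarios can be compared independently. */
void reset_model(void)
{
	nr_queued = 0;
	irq_work_pending = false;
}

/*
 * Stand-in for save_mce_event(). With patched == false the work is raised
 * only for UE events (assumed pre-patch behaviour); with patched == true
 * it is raised for any event whose disposition is NOT_RECOVERED.
 */
bool save_event(enum error_type type, enum disposition disp, bool patched)
{
	if (nr_queued == MAX_EVENTS)
		return false;	/* queue full, event dropped */

	queue[nr_queued++] = (struct event){ .type = type, .disposition = disp };

	if (patched) {
		if (disp == NOT_RECOVERED)
			raise_irq_work();
	} else if (type == ERROR_TYPE_UE) {
		raise_irq_work();
	}
	return true;
}

/*
 * Stand-in for the deferred handler that would call the logging routine.
 * Returns how many queued events it "logged"; 0 if no work was raised.
 */
size_t run_pending_work(void)
{
	size_t logged;

	if (!irq_work_pending)
		return 0;

	irq_work_pending = false;
	logged = nr_queued;
	nr_queued = 0;
	return logged;
}
```

In this model, calling `save_event(ERROR_TYPE_OTHER, NOT_RECOVERED, false)` followed by `run_pending_work()` logs nothing, while the same call with `patched` set to `true` logs the event. That mirrors the test report in this thread, where a non-UE unrecoverable error now produces an RTAS event.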