Message ID | 20211125103346.1188958-1-npiggin@gmail.com (mailing list archive) |
---|---|
State | Accepted |
Series | powerpc/watchdog: Fix wd_smp_last_reset_tb reporting |
On Thu, 25 Nov 2021 20:33:46 +1000, Nicholas Piggin wrote:
> wd_smp_last_reset_tb now gets reset by watchdog_smp_panic() as part of
> marking CPUs stuck and removing them from the pending mask before it
> begins any printing. This causes last reset times reported to be off.
>
> Fix this by reading it into a local variable before it gets reset.
>
>
> [...]

Applied to powerpc/next.

[1/1] powerpc/watchdog: Fix wd_smp_last_reset_tb reporting
      https://git.kernel.org/powerpc/c/3d030e301856da366380b3865fce6c03037b08a6

cheers
On 25/11/2021, 11:33:46, Nicholas Piggin wrote:
> wd_smp_last_reset_tb now gets reset by watchdog_smp_panic() as part of
> marking CPUs stuck and removing them from the pending mask before it
> begins any printing. This causes last reset times reported to be off.
>
> Fix this by reading it into a local variable before it gets reset.
>
> Fixes: 76521c4b0291 ("powerpc/watchdog: Avoid holding wd_smp_lock over printk and smp_send_nmi_ipi")
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
>
> This is the delta for patches 1-4 between v3 and v4 of the series which
> is the result of fixing the bug in patch 3. Sending because v3 got
> merged into powerpc next

What about the 5th patch in the v4 series titled "[PATCH v4 5/5] powerpc/watchdog: help remote CPUs to flush NMI printk output"?

> Thanks,
> Nick
>
>  arch/powerpc/kernel/watchdog.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/arch/powerpc/kernel/watchdog.c b/arch/powerpc/kernel/watchdog.c
> index b6533539386b..23745af38d62 100644
> --- a/arch/powerpc/kernel/watchdog.c
> +++ b/arch/powerpc/kernel/watchdog.c
> @@ -179,13 +179,14 @@ static void watchdog_smp_panic(int cpu)
>  {
>  	static cpumask_t wd_smp_cpus_ipi; // protected by reporting
>  	unsigned long flags;
> -	u64 tb;
> +	u64 tb, last_reset;
>  	int c;
>
>  	wd_smp_lock(&flags);
>  	/* Double check some things under lock */
>  	tb = get_tb();
> -	if ((s64)(tb - wd_smp_last_reset_tb) < (s64)wd_smp_panic_timeout_tb)
> +	last_reset = wd_smp_last_reset_tb;
> +	if ((s64)(tb - last_reset) < (s64)wd_smp_panic_timeout_tb)
>  		goto out;
>  	if (cpumask_test_cpu(cpu, &wd_smp_cpus_pending))
>  		goto out;
> @@ -210,8 +211,7 @@ static void watchdog_smp_panic(int cpu)
>  	pr_emerg("CPU %d detected hard LOCKUP on other CPUs %*pbl\n",
>  		 cpu, cpumask_pr_args(&wd_smp_cpus_ipi));
>  	pr_emerg("CPU %d TB:%lld, last SMP heartbeat TB:%lld (%lldms ago)\n",
> -		 cpu, tb, wd_smp_last_reset_tb,
> -		 tb_to_ns(tb - wd_smp_last_reset_tb) / 1000000);
> +		 cpu, tb, last_reset, tb_to_ns(tb - last_reset) / 1000000);
>
>  	if (!sysctl_hardlockup_all_cpu_backtrace) {
>  		/*
>
Excerpts from Laurent Dufour's message of November 26, 2021 3:21 am:
> On 25/11/2021, 11:33:46, Nicholas Piggin wrote:
>> wd_smp_last_reset_tb now gets reset by watchdog_smp_panic() as part of
>> marking CPUs stuck and removing them from the pending mask before it
>> begins any printing. This causes last reset times reported to be off.
>>
>> Fix this by reading it into a local variable before it gets reset.
>>
>> Fixes: 76521c4b0291 ("powerpc/watchdog: Avoid holding wd_smp_lock over printk and smp_send_nmi_ipi")
>> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
>> ---
>>
>> This is the delta for patches 1-4 between v3 and v4 of the series which
>> is the result of fixing the bug in patch 3. Sending because v3 got
>> merged into powerpc next
>
> What about the 5th patch in the v4 series titled "[PATCH v4 5/5]
> powerpc/watchdog: help remote CPUs to flush NMI printk output"?

Yes it would be good to get that one in too.

Thanks,
Nick
diff --git a/arch/powerpc/kernel/watchdog.c b/arch/powerpc/kernel/watchdog.c
index b6533539386b..23745af38d62 100644
--- a/arch/powerpc/kernel/watchdog.c
+++ b/arch/powerpc/kernel/watchdog.c
@@ -179,13 +179,14 @@ static void watchdog_smp_panic(int cpu)
 {
 	static cpumask_t wd_smp_cpus_ipi; // protected by reporting
 	unsigned long flags;
-	u64 tb;
+	u64 tb, last_reset;
 	int c;

 	wd_smp_lock(&flags);
 	/* Double check some things under lock */
 	tb = get_tb();
-	if ((s64)(tb - wd_smp_last_reset_tb) < (s64)wd_smp_panic_timeout_tb)
+	last_reset = wd_smp_last_reset_tb;
+	if ((s64)(tb - last_reset) < (s64)wd_smp_panic_timeout_tb)
 		goto out;
 	if (cpumask_test_cpu(cpu, &wd_smp_cpus_pending))
 		goto out;
@@ -210,8 +211,7 @@ static void watchdog_smp_panic(int cpu)
 	pr_emerg("CPU %d detected hard LOCKUP on other CPUs %*pbl\n",
 		 cpu, cpumask_pr_args(&wd_smp_cpus_ipi));
 	pr_emerg("CPU %d TB:%lld, last SMP heartbeat TB:%lld (%lldms ago)\n",
-		 cpu, tb, wd_smp_last_reset_tb,
-		 tb_to_ns(tb - wd_smp_last_reset_tb) / 1000000);
+		 cpu, tb, last_reset, tb_to_ns(tb - last_reset) / 1000000);

 	if (!sysctl_hardlockup_all_cpu_backtrace) {
 		/*
wd_smp_last_reset_tb now gets reset by watchdog_smp_panic() as part of
marking CPUs stuck and removing them from the pending mask before it
begins any printing. This causes last reset times reported to be off.

Fix this by reading it into a local variable before it gets reset.

Fixes: 76521c4b0291 ("powerpc/watchdog: Avoid holding wd_smp_lock over printk and smp_send_nmi_ipi")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---

This is the delta for patches 1-4 between v3 and v4 of the series which
is the result of fixing the bug in patch 3. Sending because v3 got
merged into powerpc next

Thanks,
Nick

 arch/powerpc/kernel/watchdog.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)
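
For readers outside the kernel tree, the ordering problem described in the commit message can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the kernel code: last_reset_tb, mark_stuck_and_reset(), elapsed_buggy() and elapsed_fixed() are hypothetical stand-ins for wd_smp_last_reset_tb, the step that marks CPUs stuck, and the reporting path before and after the fix. Locking, timebase conversion and printing are left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for wd_smp_last_reset_tb (shared watchdog state). */
static uint64_t last_reset_tb = 1000;

/*
 * Hypothetical stand-in for the step that marks CPUs stuck and removes
 * them from the pending mask. As a side effect it resets the shared
 * timestamp to the current time.
 */
static void mark_stuck_and_reset(uint64_t now)
{
	last_reset_tb = now;
}

/*
 * Pre-fix ordering: the shared timestamp is read only after it has been
 * reset, so the reported "time since last heartbeat" collapses to 0.
 */
static uint64_t elapsed_buggy(uint64_t now)
{
	mark_stuck_and_reset(now);
	return now - last_reset_tb;
}

/*
 * Post-fix ordering: snapshot the shared timestamp into a local variable
 * first, then use only the local copy for reporting.
 */
static uint64_t elapsed_fixed(uint64_t now)
{
	uint64_t last_reset = last_reset_tb;	/* snapshot before the reset */

	assert(now >= last_reset);		/* assumption of this sketch */
	mark_stuck_and_reset(now);
	return now - last_reset;
}
```

With last_reset_tb starting at 1000 and now set to 5000, elapsed_buggy() reports 0 while elapsed_fixed() reports 4000. That is the kind of discrepancy the patch addresses in the "last SMP heartbeat" message.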