powerpc/watchdog: Fix wd_smp_last_reset_tb reporting

Message ID 20211125103346.1188958-1-npiggin@gmail.com (mailing list archive)
State Accepted
Series powerpc/watchdog: Fix wd_smp_last_reset_tb reporting

Commit Message

Nicholas Piggin Nov. 25, 2021, 10:33 a.m. UTC
wd_smp_last_reset_tb now gets reset by watchdog_smp_panic() as part of
marking CPUs stuck and removing them from the pending mask before it
begins any printing. This causes last reset times reported to be off.

Fix this by reading it into a local variable before it gets reset.

Fixes: 76521c4b0291 ("powerpc/watchdog: Avoid holding wd_smp_lock over printk and smp_send_nmi_ipi")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---

This is the delta for patches 1-4 between v3 and v4 of the series, which
is the result of fixing the bug in patch 3. Sending because v3 got
merged into powerpc next.

Thanks,
Nick

 arch/powerpc/kernel/watchdog.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)
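
As a rough illustration of the ordering problem described in the commit
message, here is a minimal user-space sketch. It is not the kernel code:
`shared_last_reset_tb`, `mark_stuck_and_reset()`, `elapsed_buggy()`,
`elapsed_fixed()` and `check_orderings()` are hypothetical stand-ins, and
locking is left out entirely.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the shared wd_smp_last_reset_tb timestamp. */
static uint64_t shared_last_reset_tb;

/* Stand-in for the step that marks CPUs stuck and resets the timestamp. */
void mark_stuck_and_reset(uint64_t now)
{
	shared_last_reset_tb = now;
}

/* Buggy ordering: the shared value is read only after it has been reset. */
uint64_t elapsed_buggy(uint64_t now)
{
	mark_stuck_and_reset(now);
	return now - shared_last_reset_tb;
}

/* Fixed ordering: snapshot into a local before the reset can happen. */
uint64_t elapsed_fixed(uint64_t now)
{
	uint64_t last_reset = shared_last_reset_tb;

	mark_stuck_and_reset(now);
	return now - last_reset;
}

/* Self-check: last heartbeat at tick 1000, lockup noticed at tick 6000. */
void check_orderings(void)
{
	shared_last_reset_tb = 1000;
	assert(elapsed_buggy(6000) == 0);	/* the real gap is lost */

	shared_last_reset_tb = 1000;
	assert(elapsed_fixed(6000) == 5000);	/* the real gap is reported */
}
```

With the buggy ordering the reported gap collapses to zero ticks because
the timestamp has already been overwritten. Taking the snapshot first
preserves the 5000-tick gap.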

Comments

Michael Ellerman Nov. 25, 2021, 11 a.m. UTC | #1
On Thu, 25 Nov 2021 20:33:46 +1000, Nicholas Piggin wrote:
> wd_smp_last_reset_tb now gets reset by watchdog_smp_panic() as part of
> marking CPUs stuck and removing them from the pending mask before it
> begins any printing. This causes last reset times reported to be off.
> 
> Fix this by reading it into a local variable before it gets reset.
> 
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc/watchdog: Fix wd_smp_last_reset_tb reporting
      https://git.kernel.org/powerpc/c/3d030e301856da366380b3865fce6c03037b08a6

cheers
Laurent Dufour Nov. 25, 2021, 5:21 p.m. UTC | #2
On 25/11/2021, 11:33:46, Nicholas Piggin wrote:
> wd_smp_last_reset_tb now gets reset by watchdog_smp_panic() as part of
> marking CPUs stuck and removing them from the pending mask before it
> begins any printing. This causes last reset times reported to be off.
> 
> Fix this by reading it into a local variable before it gets reset.
> 
> Fixes: 76521c4b0291 ("powerpc/watchdog: Avoid holding wd_smp_lock over printk and smp_send_nmi_ipi")
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
> 
> This is the delta for patches 1-4 between v3 and v4 of the series which
> is the result of fixing the bug in patch 3. Sending because v3 got
> merged into powerpc next

What about the 5th patch in the v4 series titled "[PATCH v4 5/5]
powerpc/watchdog: help remote CPUs to flush NMI printk output"?


> Thanks,
> Nick
> 
>  arch/powerpc/kernel/watchdog.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/watchdog.c b/arch/powerpc/kernel/watchdog.c
> index b6533539386b..23745af38d62 100644
> --- a/arch/powerpc/kernel/watchdog.c
> +++ b/arch/powerpc/kernel/watchdog.c
> @@ -179,13 +179,14 @@ static void watchdog_smp_panic(int cpu)
>  {
>  	static cpumask_t wd_smp_cpus_ipi; // protected by reporting
>  	unsigned long flags;
> -	u64 tb;
> +	u64 tb, last_reset;
>  	int c;
>  
>  	wd_smp_lock(&flags);
>  	/* Double check some things under lock */
>  	tb = get_tb();
> -	if ((s64)(tb - wd_smp_last_reset_tb) < (s64)wd_smp_panic_timeout_tb)
> +	last_reset = wd_smp_last_reset_tb;
> +	if ((s64)(tb - last_reset) < (s64)wd_smp_panic_timeout_tb)
>  		goto out;
>  	if (cpumask_test_cpu(cpu, &wd_smp_cpus_pending))
>  		goto out;
> @@ -210,8 +211,7 @@ static void watchdog_smp_panic(int cpu)
>  	pr_emerg("CPU %d detected hard LOCKUP on other CPUs %*pbl\n",
>  		 cpu, cpumask_pr_args(&wd_smp_cpus_ipi));
>  	pr_emerg("CPU %d TB:%lld, last SMP heartbeat TB:%lld (%lldms ago)\n",
> -		 cpu, tb, wd_smp_last_reset_tb,
> -		 tb_to_ns(tb - wd_smp_last_reset_tb) / 1000000);
> +		 cpu, tb, last_reset, tb_to_ns(tb - last_reset) / 1000000);
>  
>  	if (!sysctl_hardlockup_all_cpu_backtrace) {
>  		/*
>
Nicholas Piggin Nov. 26, 2021, 12:50 a.m. UTC | #3
Excerpts from Laurent Dufour's message of November 26, 2021 3:21 am:
> On 25/11/2021, 11:33:46, Nicholas Piggin wrote:
>> wd_smp_last_reset_tb now gets reset by watchdog_smp_panic() as part of
>> marking CPUs stuck and removing them from the pending mask before it
>> begins any printing. This causes last reset times reported to be off.
>> 
>> Fix this by reading it into a local variable before it gets reset.
>> 
>> Fixes: 76521c4b0291 ("powerpc/watchdog: Avoid holding wd_smp_lock over printk and smp_send_nmi_ipi")
>> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
>> ---
>> 
>> This is the delta for patches 1-4 between v3 and v4 of the series which
>> is the result of fixing the bug in patch 3. Sending because v3 got
>> merged into powerpc next
> 
> What about the 5th patch in the v4 series titled "[PATCH v4 5/5]
> powerpc/watchdog: help remote CPUs to flush NMI printk output"?

Yes it would be good to get that one in too.

Thanks,
Nick
Patch

diff --git a/arch/powerpc/kernel/watchdog.c b/arch/powerpc/kernel/watchdog.c
index b6533539386b..23745af38d62 100644
--- a/arch/powerpc/kernel/watchdog.c
+++ b/arch/powerpc/kernel/watchdog.c
@@ -179,13 +179,14 @@  static void watchdog_smp_panic(int cpu)
 {
 	static cpumask_t wd_smp_cpus_ipi; // protected by reporting
 	unsigned long flags;
-	u64 tb;
+	u64 tb, last_reset;
 	int c;
 
 	wd_smp_lock(&flags);
 	/* Double check some things under lock */
 	tb = get_tb();
-	if ((s64)(tb - wd_smp_last_reset_tb) < (s64)wd_smp_panic_timeout_tb)
+	last_reset = wd_smp_last_reset_tb;
+	if ((s64)(tb - last_reset) < (s64)wd_smp_panic_timeout_tb)
 		goto out;
 	if (cpumask_test_cpu(cpu, &wd_smp_cpus_pending))
 		goto out;
@@ -210,8 +211,7 @@  static void watchdog_smp_panic(int cpu)
 	pr_emerg("CPU %d detected hard LOCKUP on other CPUs %*pbl\n",
 		 cpu, cpumask_pr_args(&wd_smp_cpus_ipi));
 	pr_emerg("CPU %d TB:%lld, last SMP heartbeat TB:%lld (%lldms ago)\n",
-		 cpu, tb, wd_smp_last_reset_tb,
-		 tb_to_ns(tb - wd_smp_last_reset_tb) / 1000000);
+		 cpu, tb, last_reset, tb_to_ns(tb - last_reset) / 1000000);
 
 	if (!sysctl_hardlockup_all_cpu_backtrace) {
 		/*
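
The timeout check and the "ms ago" value in the hunks above both operate
on timebase tick differences. A hedged user-space sketch of that
arithmetic follows. `TB_FREQ_HZ` (512 MHz here), `sketch_tb_to_ns()` and
`timeout_expired()` are assumptions made for illustration; the kernel's
own `tb_to_ns()` is not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed timebase frequency, for illustration only. */
#define TB_FREQ_HZ 512000000ULL
#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical tick-to-nanosecond conversion for a fixed frequency. */
uint64_t sketch_tb_to_ns(uint64_t ticks)
{
	/* Split into whole seconds and remainder to avoid 64-bit overflow. */
	return (ticks / TB_FREQ_HZ) * NSEC_PER_SEC +
	       (ticks % TB_FREQ_HZ) * NSEC_PER_SEC / TB_FREQ_HZ;
}

/*
 * Same shape as the check in the patch: the unsigned difference is
 * reinterpreted as signed, so a last_reset slightly ahead of tb gives a
 * negative value instead of a huge positive one. This relies on two's
 * complement conversion, as the kernel code does.
 */
int timeout_expired(uint64_t tb, uint64_t last_reset, uint64_t timeout_tb)
{
	return (int64_t)(tb - last_reset) >= (int64_t)timeout_tb;
}

/* Milliseconds since the snapshot, as printed in the "ms ago" field. */
uint64_t ms_since(uint64_t tb, uint64_t last_reset)
{
	return sketch_tb_to_ns(tb - last_reset) / 1000000;
}
```

Under the assumed 512 MHz frequency, a difference of 5,120,000,000 ticks
is reported as 10000 ms.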