diff mbox series

powerpc/time: avoid programming DEC at the start of the timer interrupt

Message ID 20220909142457.278032-1-npiggin@gmail.com (mailing list archive)
State Accepted
Headers show
Series powerpc/time: avoid programming DEC at the start of the timer interrupt | expand

Checks

Context Check Description
snowpatch_ozlabs/github-powerpc_selftests success Successfully ran 10 jobs.
snowpatch_ozlabs/github-powerpc_ppctests success Successfully ran 10 jobs.
snowpatch_ozlabs/github-powerpc_clang success Successfully ran 6 jobs.
snowpatch_ozlabs/github-powerpc_sparse success Successfully ran 4 jobs.
snowpatch_ozlabs/github-powerpc_kernel_qemu success Successfully ran 23 jobs.

Commit Message

Nicholas Piggin Sept. 9, 2022, 2:24 p.m. UTC
Setting DEC to maximum at the start of the timer interrupt is not
necessary and can be avoided for performance when MSR[EE] is not
enabled during the handler as explained in commit 0faf20a1ad16
("powerpc/64s/interrupt: Don't enable MSR[EE] in irq handlers unless
perf is in use"), where this change was first attempted.

The idea is that the timer interrupt runs with MSR[EE]=0, and at the end
of the interrupt DEC is programmed to the next timer interval, so there
is no need to clear the decrementer exception before then.

When the above commit was merged, that was not quite true. The low res
timer subsystem had some cases in the oneshot timer code where if the
tick was to be stopped and no timers active, the clock device would not
get the ->set_state_oneshot_stopped() call, so DEC would not get
reprogrammed, and this would hang taking continual timer interrupts.

So this was reverted in commit d2b9be1f4af5 ("powerpc/time: Always set
decrementer in timer_interrupt()"), which was a partial revert of the
above commit.

Commit 62c1256d5447 ("timers/nohz: Switch to ONESHOT_STOPPED in the
low-res handler when the tick is stopped") was later merged to fix this
missing case in the timer subsystem, so now the behaviour can be
restored.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
Try this again. This boots fine with CONFIG_HIGH_RES_TIMERS=n.
Reverting 62c1256d5447 with this patch applied causes the
infinite timer hang to reappear, which confirms test is doing
something useful.

Thanks,
Nick

 arch/powerpc/kernel/time.c | 29 +++++++++++++++--------------
 1 file changed, 15 insertions(+), 14 deletions(-)

Comments

Michael Ellerman Oct. 4, 2022, 1:24 p.m. UTC | #1
On Sat, 10 Sep 2022 00:24:57 +1000, Nicholas Piggin wrote:
> Setting DEC to maximum at the start of the timer interrupt is not
> necessary and can be avoided for performance when MSR[EE] is not
> enabled during the handler as explained in commit 0faf20a1ad16
> ("powerpc/64s/interrupt: Don't enable MSR[EE] in irq handlers unless
> perf is in use"), where this change was first attempted.
> 
> The idea is that the timer interrupt runs with MSR[EE]=0, and at the end
> of the interrupt DEC is programmed to the next timer interval, so there
> is no need to clear the decrementer exception before then.
> 
> [...]

Applied to powerpc/next.

[1/1] powerpc/time: avoid programming DEC at the start of the timer interrupt
      https://git.kernel.org/powerpc/c/c84550203b3173511e8cdbe94bc2e33175ba1d72

cheers
diff mbox series

Patch

diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index 587adcc12860..328129f19a2e 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -614,22 +614,23 @@  DEFINE_INTERRUPT_HANDLER_ASYNC(timer_interrupt)
 		return;
 	}
 
-	/*
-	 * Ensure a positive value is written to the decrementer, or
-	 * else some CPUs will continue to take decrementer exceptions.
-	 * When the PPC_WATCHDOG (decrementer based) is configured,
-	 * keep this at most 31 bits, which is about 4 seconds on most
-	 * systems, which gives the watchdog a chance of catching timer
-	 * interrupt hard lockups.
-	 */
-	if (IS_ENABLED(CONFIG_PPC_WATCHDOG))
-		set_dec(0x7fffffff);
-	else
-		set_dec(decrementer_max);
-
 	/* Conditionally hard-enable interrupts. */
-	if (should_hard_irq_enable())
+	if (should_hard_irq_enable()) {
+		/*
+		 * Ensure a positive value is written to the decrementer, or
+		 * else some CPUs will continue to take decrementer exceptions.
+		 * When the PPC_WATCHDOG (decrementer based) is configured,
+		 * keep this at most 31 bits, which is about 4 seconds on most
+		 * systems, which gives the watchdog a chance of catching timer
+		 * interrupt hard lockups.
+		 */
+		if (IS_ENABLED(CONFIG_PPC_WATCHDOG))
+			set_dec(0x7fffffff);
+		else
+			set_dec(decrementer_max);
+
 		do_hard_irq_enable();
+	}
 
 #if defined(CONFIG_PPC32) && defined(CONFIG_PPC_PMAC)
 	if (atomic_read(&ppc_n_lost_interrupts) != 0)