
powerpc: Add preempt lazy support

Message ID 20241108101853.277808-1-sshegde@linux.ibm.com (mailing list archive)
State New
Series powerpc: Add preempt lazy support

Commit Message

Shrikanth Hegde Nov. 8, 2024, 10:18 a.m. UTC
Define the preempt lazy bit for powerpc. Use bit 9, which is free and
within the same 16-bit range as NEED_RESCHED, so the compiler can issue
a single andi.

Since powerpc doesn't use the generic entry/exit code, add the lazy
check on exit to user. CONFIG_PREEMPTION is defined for lazy/full/RT,
so use it for the return to kernel.

Ran a few benchmarks and a DB workload on Power10. Performance is close
to preempt=none/voluntary. It is possible that some workload patterns
would differ under lazy [2]. More details on preempt lazy are in [1].

Since powerpc systems can have large core counts and large memory,
preempt lazy should help in avoiding soft lockup issues.

[1]: https://lore.kernel.org/lkml/20241007074609.447006177@infradead.org/
[2]: https://lore.kernel.org/all/1a973dda-c79e-4d95-935b-e4b93eb077b8@linux.ibm.com/

Signed-off-by: Shrikanth Hegde <sshegde@linux.ibm.com>
---
 arch/powerpc/Kconfig                   | 1 +
 arch/powerpc/include/asm/thread_info.h | 9 ++++++---
 arch/powerpc/kernel/interrupt.c        | 4 ++--
 3 files changed, 9 insertions(+), 5 deletions(-)
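
To illustrate the single-andi point above (a sketch, not part of the
patch): both resched bits sit below bit 16 of the thread flags, so the
combined mask fits the 16-bit unsigned immediate of the PowerPC andi.
instruction, and a test such as

	/* One andi. covers both bits, since both are below bit 16. */
	if (read_thread_flags() & (_TIF_NEED_RESCHED | _TIF_NEED_RESCHED_LAZY))
		schedule();

compiles to a single mask-and-test rather than two.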

Comments

Sebastian Andrzej Siewior Nov. 8, 2024, 10:50 a.m. UTC | #1
On 2024-11-08 15:48:53 [+0530], Shrikanth Hegde wrote:
> Define the preempt lazy bit for powerpc. Use bit 9, which is free and
> within the same 16-bit range as NEED_RESCHED, so the compiler can issue
> a single andi.
>
> Since powerpc doesn't use the generic entry/exit code, add the lazy
> check on exit to user. CONFIG_PREEMPTION is defined for lazy/full/RT,
> so use it for the return to kernel.
>
> Ran a few benchmarks and a DB workload on Power10. Performance is close
> to preempt=none/voluntary. It is possible that some workload patterns
> would differ under lazy [2]. More details on preempt lazy are in [1].
>
> Since powerpc systems can have large core counts and large memory,
> preempt lazy should help in avoiding soft lockup issues.
> 
> [1]: https://lore.kernel.org/lkml/20241007074609.447006177@infradead.org/
> [2]: https://lore.kernel.org/all/1a973dda-c79e-4d95-935b-e4b93eb077b8@linux.ibm.com/

The lazy bits are only in tip.

Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

> Signed-off-by: Shrikanth Hegde <sshegde@linux.ibm.com>
> ---
> diff --git a/arch/powerpc/kernel/interrupt.c b/arch/powerpc/kernel/interrupt.c
> index af62ec974b97..8f4acc55407b 100644
> --- a/arch/powerpc/kernel/interrupt.c
> +++ b/arch/powerpc/kernel/interrupt.c
> @@ -396,7 +396,7 @@ notrace unsigned long interrupt_exit_kernel_prepare(struct pt_regs *regs)
>  		/* Returning to a kernel context with local irqs enabled. */
>  		WARN_ON_ONCE(!(regs->msr & MSR_EE));
>  again:
> -		if (IS_ENABLED(CONFIG_PREEMPT)) {
> +		if (IS_ENABLED(CONFIG_PREEMPTION)) {
>  			/* Return to preemptible kernel context */
>  			if (unlikely(read_thread_flags() & _TIF_NEED_RESCHED)) {
>  				if (preempt_count() == 0)

Shouldn't exit_vmx_usercopy() also get this
s@CONFIG_PREEMPT@CONFIG_PREEMPTION@ change?
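
For reference, a sketch of what that substitution would look like,
assuming the guard in arch/powerpc/lib/vmx-helper.c follows the same
IS_ENABLED() pattern as the interrupt.c hunk quoted above (illustrative
only, not a posted follow-up patch):

-	if (IS_ENABLED(CONFIG_PREEMPT) && need_resched())
+	if (IS_ENABLED(CONFIG_PREEMPTION) && need_resched())
 		set_preempt_need_resched();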

Sebastian
Ankur Arora Nov. 8, 2024, 7:06 p.m. UTC | #2
Shrikanth Hegde <sshegde@linux.ibm.com> writes:

> Define the preempt lazy bit for powerpc. Use bit 9, which is free and
> within the same 16-bit range as NEED_RESCHED, so the compiler can issue
> a single andi.
>
> Since powerpc doesn't use the generic entry/exit code, add the lazy
> check on exit to user. CONFIG_PREEMPTION is defined for lazy/full/RT,
> so use it for the return to kernel.
>
> Ran a few benchmarks and a DB workload on Power10. Performance is close
> to preempt=none/voluntary. It is possible that some workload patterns
> would differ under lazy [2]. More details on preempt lazy are in [1].
>
> Since powerpc systems can have large core counts and large memory,
> preempt lazy should help in avoiding soft lockup issues.
>
> [1]: https://lore.kernel.org/lkml/20241007074609.447006177@infradead.org/
> [2]: https://lore.kernel.org/all/1a973dda-c79e-4d95-935b-e4b93eb077b8@linux.ibm.com/
>
> Signed-off-by: Shrikanth Hegde <sshegde@linux.ibm.com>

Looks good. Reviewed-by: Ankur Arora <ankur.a.arora@oracle.com>

However, I just checked and powerpc does not have
CONFIG_KVM_XFER_TO_GUEST_WORK. Do you need this additional patch
for handling the lazy bit at KVM guest entry?

diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index f14329989e9a..7bdf7015bb65 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -84,7 +84,8 @@ int kvmppc_prepare_to_enter(struct kvm_vcpu *vcpu)
        hard_irq_disable();

        while (true) {
-               if (need_resched()) {
+               unsigned long tf = read_thread_flags();
+               if (tf & (_TIF_NEED_RESCHED | _TIF_NEED_RESCHED_LAZY)) {
                        local_irq_enable();
                        cond_resched();
                        hard_irq_disable();
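
A note on the shape of this check: need_resched() tests only
TIF_NEED_RESCHED, so with only the lazy bit set the vCPU would
presumably enter the guest without rescheduling until the scheduler
tick escalates the lazy bit. Reading the flags once and testing both
bits mirrors the interrupt_exit_user_prepare_main() hunk below.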


Ankur

Patch

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 8094a01974cc..2f625aecf94b 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -145,6 +145,7 @@  config PPC
 	select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
 	select ARCH_HAS_PHYS_TO_DMA
 	select ARCH_HAS_PMEM_API
+	select ARCH_HAS_PREEMPT_LAZY
 	select ARCH_HAS_PTE_DEVMAP		if PPC_BOOK3S_64
 	select ARCH_HAS_PTE_SPECIAL
 	select ARCH_HAS_SCALED_CPUTIME		if VIRT_CPU_ACCOUNTING_NATIVE && PPC_BOOK3S_64
diff --git a/arch/powerpc/include/asm/thread_info.h b/arch/powerpc/include/asm/thread_info.h
index 6ebca2996f18..2785c7462ebf 100644
--- a/arch/powerpc/include/asm/thread_info.h
+++ b/arch/powerpc/include/asm/thread_info.h
@@ -103,6 +103,7 @@  void arch_setup_new_exec(void);
 #define TIF_PATCH_PENDING	6	/* pending live patching update */
 #define TIF_SYSCALL_AUDIT	7	/* syscall auditing active */
 #define TIF_SINGLESTEP		8	/* singlestepping active */
+#define TIF_NEED_RESCHED_LAZY	9       /* Scheduler driven lazy preemption */
 #define TIF_SECCOMP		10	/* secure computing */
 #define TIF_RESTOREALL		11	/* Restore all regs (implies NOERROR) */
 #define TIF_NOERROR		12	/* Force successful syscall return */
@@ -122,6 +123,7 @@  void arch_setup_new_exec(void);
 #define _TIF_SYSCALL_TRACE	(1<<TIF_SYSCALL_TRACE)
 #define _TIF_SIGPENDING		(1<<TIF_SIGPENDING)
 #define _TIF_NEED_RESCHED	(1<<TIF_NEED_RESCHED)
+#define _TIF_NEED_RESCHED_LAZY	(1<<TIF_NEED_RESCHED_LAZY)
 #define _TIF_NOTIFY_SIGNAL	(1<<TIF_NOTIFY_SIGNAL)
 #define _TIF_POLLING_NRFLAG	(1<<TIF_POLLING_NRFLAG)
 #define _TIF_32BIT		(1<<TIF_32BIT)
@@ -142,9 +144,10 @@  void arch_setup_new_exec(void);
 				 _TIF_SYSCALL_EMU)
 
 #define _TIF_USER_WORK_MASK	(_TIF_SIGPENDING | _TIF_NEED_RESCHED | \
-				 _TIF_NOTIFY_RESUME | _TIF_UPROBE | \
-				 _TIF_RESTORE_TM | _TIF_PATCH_PENDING | \
-				 _TIF_NOTIFY_SIGNAL)
+				 _TIF_NEED_RESCHED_LAZY | _TIF_NOTIFY_RESUME | \
+				 _TIF_UPROBE | _TIF_RESTORE_TM | \
+				 _TIF_PATCH_PENDING | _TIF_NOTIFY_SIGNAL)
+
 #define _TIF_PERSYSCALL_MASK	(_TIF_RESTOREALL|_TIF_NOERROR)
 
 /* Bits in local_flags */
diff --git a/arch/powerpc/kernel/interrupt.c b/arch/powerpc/kernel/interrupt.c
index af62ec974b97..8f4acc55407b 100644
--- a/arch/powerpc/kernel/interrupt.c
+++ b/arch/powerpc/kernel/interrupt.c
@@ -185,7 +185,7 @@  interrupt_exit_user_prepare_main(unsigned long ret, struct pt_regs *regs)
 	ti_flags = read_thread_flags();
 	while (unlikely(ti_flags & (_TIF_USER_WORK_MASK & ~_TIF_RESTORE_TM))) {
 		local_irq_enable();
-		if (ti_flags & _TIF_NEED_RESCHED) {
+		if (ti_flags & (_TIF_NEED_RESCHED | _TIF_NEED_RESCHED_LAZY)) {
 			schedule();
 		} else {
 			/*
@@ -396,7 +396,7 @@  notrace unsigned long interrupt_exit_kernel_prepare(struct pt_regs *regs)
 		/* Returning to a kernel context with local irqs enabled. */
 		WARN_ON_ONCE(!(regs->msr & MSR_EE));
 again:
-		if (IS_ENABLED(CONFIG_PREEMPT)) {
+		if (IS_ENABLED(CONFIG_PREEMPTION)) {
 			/* Return to preemptible kernel context */
 			if (unlikely(read_thread_flags() & _TIF_NEED_RESCHED)) {
 				if (preempt_count() == 0)
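
Note that this return-to-kernel hunk still tests only _TIF_NEED_RESCHED.
Under the lazy model described in [1], the lazy bit is folded in on the
return to user (the interrupt_exit_user_prepare_main() hunk above),
while in-kernel preemption fires only on the eager bit, which the
scheduler tick sets if a lazy request is left pending for too long.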