Message ID | 20221013151647.1857994-3-npiggin@gmail.com (mailing list archive) |
---|---|
State | Accepted |
Headers | show |
Series | [1/3] powerpc/64s: Disable preemption in hash lazy mmu mode | expand |
Context | Check | Description |
---|---|---|
snowpatch_ozlabs/github-powerpc_ppctests | success | Successfully ran 10 jobs. |
snowpatch_ozlabs/github-powerpc_selftests | success | Successfully ran 10 jobs. |
snowpatch_ozlabs/github-powerpc_sparse | success | Successfully ran 4 jobs. |
snowpatch_ozlabs/github-powerpc_clang | success | Successfully ran 6 jobs. |
On Fri, Oct 14, 2022 at 01:16:47AM +1000, Nicholas Piggin wrote:
> schedule must not be explicitly called while KUAP is unlocked, because
> the AMR register will not be saved across the context switch on 64s
> (preemption is allowed because that is driven by interrupts which do
> save the AMR).
>
> exit_vmx_usercopy() runs inside an unlocked user access region, and it
> calls preempt_enable() which will call schedule() if need_resched() was
> set while non-preemptible. This can cause tasks to run unprotected when
> they should not, and can cause the user copy to be improperly blocked
> when scheduling back to it.
>
> Fix this by avoiding the explicit resched for preempt kernels by
> generating an interrupt to reschedule the context if need_resched() got
> set.
>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>

With:

0cbbc95b12ac (HEAD -> ppc) powerpc: fix reschedule bug in KUAP-unlocked user copy
28f6b1f174f4 powerpc/64s: Fix hash__change_memory_range preemption warning
20cbdcbb2b09 powerpc/64s: Disable preemption in hash lazy mmu mode
a8b305a95113 powerpc/pseries: Fix CONFIG_DTL=n build
9c55696dd278 powerpc/64s/interrupt: Fix lost interrupts when returning to soft-masked context
f2b220ef93ea (local/master, master) Merge tag 'docs-6.1-2' of git://git.lwn.net/linux

I observed the traceback below. This is with KFENCE and various module
tests (re-)enabled, so the problem may well be unrelated to any of the
above patches.

Guenter

---

WARNING: inconsistent lock state
6.0.0-11845-g0cbbc95b12ac #1 Tainted: G                 N
--------------------------------
inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
swapper/0/1 [HC0[0]:SC0[0]:HE1:SE1] takes:
c000000002734de8 (native_tlbie_lock){+.?.}-{2:2}, at: .native_hpte_updateboltedpp+0x1a4/0x600
{IN-SOFTIRQ-W} state was registered at:
  .lock_acquire+0x20c/0x520
  ._raw_spin_lock+0x4c/0x70
  .native_hpte_invalidate+0x62c/0x840
  .hash__kernel_map_pages+0x450/0x640
  .kfence_protect+0x58/0xc0
  .kfence_guarded_free+0x374/0x5a0
  .__slab_free+0x3d0/0x630
  .put_cred_rcu+0xcc/0x120
  .rcu_core+0x3c4/0x14e0
  .__do_softirq+0x1dc/0x7dc
  .do_softirq_own_stack+0x40/0x60
  0xc00000000869f6d0
  .irq_exit+0x1e8/0x220
  .timer_interrupt+0x284/0x700
  decrementer_common_virt+0x214/0x220
  .rwsem_wake+0x88/0xe0
  .crypto_alg_tested+0x26c/0x370
  .cryptomgr_test+0x34/0x70
  .kthread+0x154/0x180
  .ret_from_kernel_thread+0x58/0x60
irq event stamp: 162973
hardirqs last enabled at (162973): [<c0000000010cb724>] ._raw_spin_unlock_irqrestore+0xa4/0x120
hardirqs last disabled at (162972): [<c0000000010cb2e0>] ._raw_spin_lock_irqsave+0x40/0xa0
softirqs last enabled at (162280): [<c0000000010cea2c>] .__do_softirq+0x7ac/0x7dc
softirqs last disabled at (162271): [<c000000000014a60>] .do_softirq_own_stack+0x40/0x60

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(native_tlbie_lock);
  <Interrupt>
    lock(native_tlbie_lock);

 *** DEADLOCK ***

no locks held by swapper/0/1.
stack backtrace:
CPU: 0 PID: 1 Comm: swapper/0 Tainted: G                 N 6.0.0-11845-g0cbbc95b12ac #1
Hardware name: PowerMac3,1 PPC970FX 0x3c0301 PowerMac
Call Trace:
[c000000005607520] [c0000000010853a8] .dump_stack_lvl+0xa4/0x100 (unreliable)
[c0000000056075b0] [c00000000017e34c] .print_usage_bug+0x2dc/0x310
[c000000005607690] [c00000000017ec8c] .mark_lock.part.42+0x90c/0x9d0
[c0000000056077c0] [c00000000018071c] .__lock_acquire+0x58c/0x2070
[c000000005607910] [c00000000017f3bc] .lock_acquire+0x20c/0x520
[c000000005607a20] [c0000000010cb05c] ._raw_spin_lock+0x4c/0x70
[c000000005607ab0] [c00000000007af94] .native_hpte_updateboltedpp+0x1a4/0x600
[c000000005607b70] [c0000000000721cc] .hash__change_memory_range+0xec/0x200
[c000000005607c20] [c000000000072904] .hash__mark_initmem_nx+0x54/0x90
[c000000005607ca0] [c00000000006e490] .mark_initmem_nx+0x30/0x70
[c000000005607d10] [c00000000006d440] .free_initmem+0x30/0xf0
[c000000005607d90] [c0000000000116ec] .kernel_init+0x5c/0x1c0
[c000000005607e10] [c00000000000ca3c] .ret_from_kernel_thread+0x58/0x60
On Fri, Oct 14, 2022 at 01:16:47AM +1000, Nicholas Piggin wrote:
> schedule must not be explicitly called while KUAP is unlocked, because
> the AMR register will not be saved across the context switch on 64s
> (preemption is allowed because that is driven by interrupts which do
> save the AMR).
>
> exit_vmx_usercopy() runs inside an unlocked user access region, and it
> calls preempt_enable() which will call schedule() if need_resched() was
> set while non-preemptible. This can cause tasks to run unprotected when
> they should not, and can cause the user copy to be improperly blocked
> when scheduling back to it.
>
> Fix this by avoiding the explicit resched for preempt kernels by
> generating an interrupt to reschedule the context if need_resched() got
> set.
>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>

Tested-by: Guenter Roeck <linux@roeck-us.net>

> ---
>  arch/powerpc/lib/vmx-helper.c | 12 +++++++++++-
>  1 file changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/lib/vmx-helper.c b/arch/powerpc/lib/vmx-helper.c
> index f76a50291fd7..d491da8d1838 100644
> --- a/arch/powerpc/lib/vmx-helper.c
> +++ b/arch/powerpc/lib/vmx-helper.c
> @@ -36,7 +36,17 @@ int exit_vmx_usercopy(void)
>  {
>  	disable_kernel_altivec();
>  	pagefault_enable();
> -	preempt_enable();
> +	preempt_enable_no_resched();
> +	/*
> +	 * Must never explicitly call schedule (including preempt_enable())
> +	 * while in a kuap-unlocked user copy, because the AMR register will
> +	 * not be saved and restored across context switch. However preempt
> +	 * kernels need to be preempted as soon as possible if need_resched is
> +	 * set and we are preemptible. The hack here is to schedule a
> +	 * decrementer to fire here and reschedule for us if necessary.
> +	 */
> +	if (IS_ENABLED(CONFIG_PREEMPT) && need_resched())
> +		set_dec(1);
>  	return 0;
>  }
>
> --
> 2.37.2
>
diff --git a/arch/powerpc/lib/vmx-helper.c b/arch/powerpc/lib/vmx-helper.c
index f76a50291fd7..d491da8d1838 100644
--- a/arch/powerpc/lib/vmx-helper.c
+++ b/arch/powerpc/lib/vmx-helper.c
@@ -36,7 +36,17 @@ int exit_vmx_usercopy(void)
 {
 	disable_kernel_altivec();
 	pagefault_enable();
-	preempt_enable();
+	preempt_enable_no_resched();
+	/*
+	 * Must never explicitly call schedule (including preempt_enable())
+	 * while in a kuap-unlocked user copy, because the AMR register will
+	 * not be saved and restored across context switch. However preempt
+	 * kernels need to be preempted as soon as possible if need_resched is
+	 * set and we are preemptible. The hack here is to schedule a
+	 * decrementer to fire here and reschedule for us if necessary.
+	 */
+	if (IS_ENABLED(CONFIG_PREEMPT) && need_resched())
+		set_dec(1);
 	return 0;
 }
schedule must not be explicitly called while KUAP is unlocked, because
the AMR register will not be saved across the context switch on 64s
(preemption is allowed because that is driven by interrupts which do
save the AMR).

exit_vmx_usercopy() runs inside an unlocked user access region, and it
calls preempt_enable() which will call schedule() if need_resched() was
set while non-preemptible. This can cause tasks to run unprotected when
they should not, and can cause the user copy to be improperly blocked
when scheduling back to it.

Fix this by avoiding the explicit resched for preempt kernels by
generating an interrupt to reschedule the context if need_resched() got
set.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/lib/vmx-helper.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)