Message ID | 1475476829-6296-1-git-send-email-anton@ozlabs.org (mailing list archive) |
---|---|
State | Accepted |
On 03/10/16 17:40, Anton Blanchard wrote:
> From: Anton Blanchard <anton@samba.org>
>
> During context switch, switch_mm() sets our current CPU in mm_cpumask.
> We can avoid this atomic sequence in most cases by checking before
> setting the bit.
>
> Testing on a POWER8 using our context switch microbenchmark:
>
> tools/testing/selftests/powerpc/benchmarks/context_switch \
>     --process --no-fp --no-altivec --no-vector
>
> Performance improves 2%.
>
> Signed-off-by: Anton Blanchard <anton@samba.org>
> ---
>  arch/powerpc/include/asm/mmu_context.h | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h
> index 475d1be..5c45114 100644
> --- a/arch/powerpc/include/asm/mmu_context.h
> +++ b/arch/powerpc/include/asm/mmu_context.h
> @@ -72,7 +72,8 @@ static inline void switch_mm(struct mm_struct *prev, struct mm_struct *next,
>  			       struct task_struct *tsk)
>  {
>  	/* Mark this context has been used on the new CPU */
> -	cpumask_set_cpu(smp_processor_id(), mm_cpumask(next));
> +	if (!cpumask_test_cpu(smp_processor_id(), mm_cpumask(next)))
> +		cpumask_set_cpu(smp_processor_id(), mm_cpumask(next));

I think this makes sense. In fact, in the longer term I think we can even
use the reorderable __set_bit() version, since we have a sync coming out
of schedule(). The read side for TLB flush can use an RMB.

Acked-by: Balbir Singh <bsingharora@gmail.com>

Balbir Singh.
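The test-before-set idea in the patch can be sketched outside the kernel. This is a minimal userspace illustration using C11 atomics: `cpu_mask`, `mask_test_cpu()`, `mask_set_cpu()`, and `mark_cpu_used()` are hypothetical stand-ins for `mm_cpumask()`, `cpumask_test_cpu()`, and `cpumask_set_cpu()`, not the kernel API.

```c
#include <stdatomic.h>

/* A single atomic word plays the role of mm_cpumask(next). */
static _Atomic unsigned long cpu_mask;

static int mask_test_cpu(int cpu)
{
	/* Plain load: no atomic read-modify-write, no bus locking. */
	return (int)((atomic_load_explicit(&cpu_mask,
					   memory_order_relaxed) >> cpu) & 1UL);
}

static void mask_set_cpu(int cpu)
{
	/* The expensive path: an atomic read-modify-write. */
	atomic_fetch_or(&cpu_mask, 1UL << cpu);
}

/* The patch's idea: skip the atomic RMW when the bit is already set,
 * which is the common case once a task has run on this CPU before. */
static void mark_cpu_used(int cpu)
{
	if (!mask_test_cpu(cpu))
		mask_set_cpu(cpu);
}
```

The fast path degrades to a cheap load-and-test; only the first context switch of an mm onto a given CPU still pays for the atomic operation.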
On Tue, 2016-10-04 at 10:25 +1100, Balbir Singh wrote:
> I think this makes sense, in fact I think in the longer term we can
> even use __set_bit() reorder-able version since we have a sync
> coming out of schedule(). The read side for TLB flush can use a RMB

No, that wouldn't be atomic vs. other threads accessing the same
bitmap.

Ben.
On 04/10/16 10:58, Benjamin Herrenschmidt wrote:
> On Tue, 2016-10-04 at 10:25 +1100, Balbir Singh wrote:
>> I think this makes sense, in fact I think in the longer term we can
>> even use __set_bit() reorder-able version since we have a sync
>> coming out of schedule(). The read side for TLB flush can use a RMB
>
> No, that wouldn't be atomic vs. other threads accessing the same
> bitmap.

Something distorted my thought process. Thanks!

Balbir Singh.
On Mon, 2016-03-10 at 06:40:29 UTC, Anton Blanchard wrote:
> From: Anton Blanchard <anton@samba.org>
>
> During context switch, switch_mm() sets our current CPU in mm_cpumask.
> We can avoid this atomic sequence in most cases by checking before
> setting the bit.
>
> Testing on a POWER8 using our context switch microbenchmark:
>
> tools/testing/selftests/powerpc/benchmarks/context_switch \
>     --process --no-fp --no-altivec --no-vector
>
> Performance improves 2%.
>
> Signed-off-by: Anton Blanchard <anton@samba.org>
> Acked-by: Balbir Singh <bsingharora@gmail.com>

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/bb85fb5803270c52863b983596c2a0

cheers
```diff
diff --git a/arch/powerpc/include/asm/mmu_context.h b/arch/powerpc/include/asm/mmu_context.h
index 475d1be..5c45114 100644
--- a/arch/powerpc/include/asm/mmu_context.h
+++ b/arch/powerpc/include/asm/mmu_context.h
@@ -72,7 +72,8 @@ static inline void switch_mm(struct mm_struct *prev, struct mm_struct *next,
 			       struct task_struct *tsk)
 {
 	/* Mark this context has been used on the new CPU */
-	cpumask_set_cpu(smp_processor_id(), mm_cpumask(next));
+	if (!cpumask_test_cpu(smp_processor_id(), mm_cpumask(next)))
+		cpumask_set_cpu(smp_processor_id(), mm_cpumask(next));
 
 	/* 32-bit keeps track of the current PGDIR in the thread struct */
 #ifdef CONFIG_PPC32
```