Message ID | 20160915090321.787-1-npiggin@gmail.com (mailing list archive) |
---|---|
State | Accepted |
On 15/09/16 19:03, Nicholas Piggin wrote:
> The mflr r10 instruction was left over saving of lr when the code used
> lr to branch to system_call_entry from the exception handler. That was
> changed by 6a404806d to use the count register. The value is never used
> now, so mflr can be removed, and r10 can be used for storage rather than
> spilling to the SPR scratch register.
>
> The scratch register spill causes a long pipeline stall due to the SPR
> read after write. This change brings getppid syscall cost from 406 to
> 376 cycles on POWER8. getppid for non-relocatable case is 371 cycles.
>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
>
>  arch/powerpc/kernel/exceptions-64s.S | 7 ++-----
>  1 file changed, 2 insertions(+), 5 deletions(-)
>
> diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
> index df6d45e..2cdd64f 100644
> --- a/arch/powerpc/kernel/exceptions-64s.S
> +++ b/arch/powerpc/kernel/exceptions-64s.S
> @@ -63,15 +63,12 @@ END_FTR_SECTION_IFSET(CPU_FTR_REAL_LE)	\
>   * is volatile across system calls.
>   */
>  #define SYSCALL_PSERIES_2_DIRECT	\
> -	mflr	r10 ;	\
>  	ld	r12,PACAKBASE(r13) ;	\
>  	LOAD_HANDLER(r12, system_call_entry) ;	\
>  	mtctr	r12 ;	\
>  	mfspr	r12,SPRN_SRR1 ;	\
> -	/* Re-use of r13... No spare regs to do this */	\
> -	li	r13,MSR_RI ;	\
> -	mtmsrd	r13,1 ;	\
> -	GET_PACA(r13) ;	/* get r13 back */	\
> +	li	r10,MSR_RI ;	\
> +	mtmsrd	r10,1 ;	\
>  	bctr ;
>  #else
>  	/* We can branch directly */
>

The patch makes sense

Acked-by: Balbir Singh <bsingharora@gmail.com>
On Thu, 2016-09-15 at 09:03:21 UTC, Nicholas Piggin wrote:
> The mflr r10 instruction was left over saving of lr when the code used
> lr to branch to system_call_entry from the exception handler. That was
> changed by 6a404806d to use the count register. The value is never used
> now, so mflr can be removed, and r10 can be used for storage rather than
> spilling to the SPR scratch register.
>
> The scratch register spill causes a long pipeline stall due to the SPR
> read after write. This change brings getppid syscall cost from 406 to
> 376 cycles on POWER8. getppid for non-relocatable case is 371 cycles.
>
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> Acked-by: Balbir Singh <bsingharora@gmail.com>

Applied to powerpc next, thanks.

https://git.kernel.org/powerpc/c/18e3f56b1cacb96017e2a66844

cheers
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index df6d45e..2cdd64f 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -63,15 +63,12 @@ END_FTR_SECTION_IFSET(CPU_FTR_REAL_LE)	\
  * is volatile across system calls.
  */
 #define SYSCALL_PSERIES_2_DIRECT	\
-	mflr	r10 ;	\
 	ld	r12,PACAKBASE(r13) ;	\
 	LOAD_HANDLER(r12, system_call_entry) ;	\
 	mtctr	r12 ;	\
 	mfspr	r12,SPRN_SRR1 ;	\
-	/* Re-use of r13... No spare regs to do this */	\
-	li	r13,MSR_RI ;	\
-	mtmsrd	r13,1 ;	\
-	GET_PACA(r13) ;	/* get r13 back */	\
+	li	r10,MSR_RI ;	\
+	mtmsrd	r10,1 ;	\
 	bctr ;
 #else
 	/* We can branch directly */
The mflr r10 instruction was left over saving of lr when the code used
lr to branch to system_call_entry from the exception handler. That was
changed by 6a404806d to use the count register. The value is never used
now, so mflr can be removed, and r10 can be used for storage rather than
spilling to the SPR scratch register.

The scratch register spill causes a long pipeline stall due to the SPR
read after write. This change brings getppid syscall cost from 406 to
376 cycles on POWER8. getppid for non-relocatable case is 371 cycles.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 arch/powerpc/kernel/exceptions-64s.S | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)
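
The commit message quotes getppid syscall cost in cycles but does not say how it was measured. As a rough illustration only, a userspace microbenchmark along the following lines could be used to compare kernels before and after such a change. This is a sketch under stated assumptions, not the author's method: the helper names (now_ns, getppid_avg_ns, ns_to_cycles) are hypothetical, it assumes Linux with SYS_getppid available, it reports wall-clock nanoseconds including loop overhead, and the conversion to cycles relies on a clock frequency that the caller has to supply.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Monotonic clock reading in nanoseconds. */
static uint64_t now_ns(void)
{
	struct timespec ts;
	int rc = clock_gettime(CLOCK_MONOTONIC, &ts);

	assert(rc == 0);
	(void)rc;	/* keep NDEBUG builds warning-free */
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/*
 * Average wall-clock cost of one getppid system call, in nanoseconds.
 * The raw syscall() wrapper is used so that every iteration enters the
 * kernel. Loop and wrapper overhead are included in the result.
 */
static double getppid_avg_ns(unsigned long iterations)
{
	uint64_t start, end;
	unsigned long i;

	if (iterations == 0)
		return 0.0;

	start = now_ns();
	for (i = 0; i < iterations; i++)
		syscall(SYS_getppid);
	end = now_ns();

	return (double)(end - start) / (double)iterations;
}

/*
 * Convert nanoseconds to cycles. The frequency in GHz is an assumption
 * supplied by the caller; it is not read from the hardware.
 */
static double ns_to_cycles(double ns, double ghz)
{
	return ns * ghz;
}
```

Calling getppid_avg_ns(1000000) from a small main() on both kernels, pinned to one CPU with frequency scaling disabled, would give comparable averages. The absolute figures will not match the 406 and 376 cycle numbers above unless the measurement conditions happen to be the same.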