From patchwork Thu Jun 8 15:35:05 2017
X-Patchwork-Submitter: Nicholas Piggin
X-Patchwork-Id: 773347
From: Nicholas Piggin
To: linuxppc-dev@lists.ozlabs.org
Cc: Nicholas Piggin
Subject: [PATCH 2/2] powerpc/64: syscall avoid restore_math call if possible
Date: Fri, 9 Jun 2017 01:35:05 +1000
Message-Id: <20170608153505.18857-2-npiggin@gmail.com>
In-Reply-To: <20170608153505.18857-1-npiggin@gmail.com>
References: <20170608153505.18857-1-npiggin@gmail.com>
X-Mailer: git-send-email 2.11.0
List-Id: Linux on PowerPC Developers Mail List

The syscall exit code that branches to restore_math is quite heavy on
Book3S, consisting of 2 mtmsr instructions. Threads that don't use both
FP and vector can get caught here if the kernel ever uses FP or vector.
Lazy-FP/vec context switching also trips this case.

So check for lazy FP and vector before switching RI for restore_math.
Move most of this case out of line.

For threads that do want to restore math registers, the MSR switches are
still suboptimal. Future direction may be to use a soft-RI bit to avoid
MSR switches in kernel (similar to soft-EE), but for now at least the
no-restore POWER9 context switch rate increases by about 5% due to
sched_yield(2) return performance. I haven't constructed a test to
measure the syscall cost.

Signed-off-by: Nicholas Piggin
---
 arch/powerpc/kernel/entry_64.S | 62 +++++++++++++++++++++++++++++-------------
 arch/powerpc/kernel/process.c  |  4 +++
 2 files changed, 47 insertions(+), 19 deletions(-)

diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index bfbad08a1207..6f70ea821a07 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -210,27 +210,17 @@ system_call:	/* label this so stack traces look sane */
 	andi.	r0,r9,(_TIF_SYSCALL_DOTRACE|_TIF_SINGLESTEP|_TIF_USER_WORK_MASK|_TIF_PERSYSCALL_MASK)
 	bne-	syscall_exit_work
 
-	andi.	r0,r8,MSR_FP
-	beq 2f
+	/* If MSR_FP and MSR_VEC are set in user msr, then no need to restore */
+	li	r7,MSR_FP
 #ifdef CONFIG_ALTIVEC
-	andis.	r0,r8,MSR_VEC@h
-	bne	3f
+	oris	r7,r7,MSR_VEC@h
 #endif
-2:	addi	r3,r1,STACK_FRAME_OVERHEAD
-#ifdef CONFIG_PPC_BOOK3S
-	li	r10,MSR_RI
-	mtmsrd	r10,1		/* Restore RI */
-#endif
-	bl	restore_math
-#ifdef CONFIG_PPC_BOOK3S
-	li	r11,0
-	mtmsrd	r11,1
-#endif
-	ld	r8,_MSR(r1)
-	ld	r3,RESULT(r1)
-	li	r11,-MAX_ERRNO
+	and	r0,r8,r7
+	cmpd	r0,r7
+	bne	syscall_restore_math
+.Lsyscall_restore_math_cont:
 
-3:	cmpld	r3,r11
+	cmpld	r3,r11
 	ld	r5,_CCR(r1)
 	bge-	syscall_error
 .Lsyscall_error_cont:
@@ -263,7 +253,41 @@ syscall_error:
 	neg	r3,r3
 	std	r5,_CCR(r1)
 	b	.Lsyscall_error_cont
-
+
+syscall_restore_math:
+	/*
+	 * Some initial tests from restore_math to avoid the heavyweight
+	 * C code entry and MSR manipulations.
+	 */
+	LOAD_REG_IMMEDIATE(r0, MSR_TS_MASK)
+	and.	r0,r0,r8
+	bne	1f
+
+	ld	r7,PACACURRENT(r13)
+	lbz	r0,THREAD+THREAD_LOAD_FP(r7)
+#ifdef CONFIG_ALTIVEC
+	lbz	r6,THREAD+THREAD_LOAD_VEC(r7)
+	add	r0,r0,r6
+#endif
+	cmpdi	r0,0
+	beq	.Lsyscall_restore_math_cont
+
+1:	addi	r3,r1,STACK_FRAME_OVERHEAD
+#ifdef CONFIG_PPC_BOOK3S
+	li	r10,MSR_RI
+	mtmsrd	r10,1		/* Restore RI */
+#endif
+	bl	restore_math
+#ifdef CONFIG_PPC_BOOK3S
+	li	r11,0
+	mtmsrd	r11,1
+#endif
+	/* Restore volatiles, reload MSR from updated one */
+	ld	r8,_MSR(r1)
+	ld	r3,RESULT(r1)
+	li	r11,-MAX_ERRNO
+	b	.Lsyscall_restore_math_cont
+
 /* Traced system call support */
 syscall_dotrace:
 	bl	save_nvgprs
diff --git a/arch/powerpc/kernel/process.c b/arch/powerpc/kernel/process.c
index baae104b16c7..5cbb8b1faf7e 100644
--- a/arch/powerpc/kernel/process.c
+++ b/arch/powerpc/kernel/process.c
@@ -511,6 +511,10 @@ void restore_math(struct pt_regs *regs)
 {
 	unsigned long msr;
 
+	/*
+	 * Syscall exit makes a similar initial check before branching
+	 * to restore_math. Keep them in synch.
+	 */
 	if (!msr_tm_active(regs->msr) &&
 		!current->thread.load_fp && !loadvec(current->thread))
 		return;
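
For review purposes, the decision the new syscall exit path makes can be
modelled in plain C. This is only a sketch of the logic in the assembly
above, not kernel code: struct thread_model and needs_restore_math() are
hypothetical names, the MSR bit positions are assumed values chosen for
illustration, and CONFIG_ALTIVEC is assumed to be enabled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions, for illustration only. */
#define MSR_FP      (1ULL << 13)
#define MSR_VEC     (1ULL << 25)
#define MSR_TS_MASK (3ULL << 33)

/* Hypothetical stand-in for the load_fp/load_vec bytes in thread_struct. */
struct thread_model {
	uint8_t load_fp;
	uint8_t load_vec;
};

/* Returns true when syscall exit would pay for the MSR_RI switches and bl restore_math. */
static bool needs_restore_math(uint64_t user_msr, const struct thread_model *t)
{
	const uint64_t math = MSR_FP | MSR_VEC;

	/* Inline check: both FP and VEC already enabled in the user MSR. */
	if ((user_msr & math) == math)
		return false;

	/* Out of line: a transactional state is set, so call restore_math. */
	if (user_msr & MSR_TS_MASK)
		return true;

	/* Otherwise call restore_math only if a lazy load is pending. */
	return (t->load_fp + t->load_vec) != 0;
}
```

With both load counters at zero and no transaction active, the model
returns false even when neither MSR_FP nor MSR_VEC is set, which is the
case that previously always reached restore_math.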