From patchwork Fri May 30 22:45:46 2014
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Scott Wood
X-Patchwork-Id: 354362
X-Patchwork-Delegate: scottwood@freescale.com
Return-Path:
X-Original-To: patchwork-incoming@ozlabs.org
Delivered-To: patchwork-incoming@ozlabs.org
Received: from lists.ozlabs.org (lists.ozlabs.org [103.22.144.68])
 (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
 (No client certificate requested)
 by ozlabs.org (Postfix) with ESMTPS id CE896140087
 for ; Sat, 31 May 2014 08:46:38 +1000 (EST)
Received: from ozlabs.org (ozlabs.org [103.22.144.67])
 by lists.ozlabs.org (Postfix) with ESMTP id B4C7C1A07B6
 for ; Sat, 31 May 2014 08:46:38 +1000 (EST)
X-Original-To: linuxppc-dev@lists.ozlabs.org
Delivered-To: linuxppc-dev@lists.ozlabs.org
Received: from na01-bn1-obe.outbound.protection.outlook.com
 (mail-bn1lp0140.outbound.protection.outlook.com [207.46.163.140])
 (using TLSv1 with cipher AES128-SHA (128/128 bits))
 (No client certificate requested)
 by lists.ozlabs.org (Postfix) with ESMTPS id DC1D31A06FF
 for ; Sat, 31 May 2014 08:46:03 +1000 (EST)
Received: from snotra.am.freescale.net (192.88.168.50) by
 DM2PR03MB399.namprd03.prod.outlook.com (10.141.84.148) with Microsoft SMTP
 Server (TLS) id 15.0.954.9; Fri, 30 May 2014 22:45:55 +0000
From: Scott Wood
To:
Subject: [PATCH v2] powerpc/booke64: wrap tlb lock and search in htw miss with FTR_SMT
Date: Fri, 30 May 2014 17:45:46 -0500
Message-ID: <1401489946-12935-1-git-send-email-scottwood@freescale.com>
X-Mailer: git-send-email 1.9.1
MIME-Version: 1.0
X-Originating-IP: [192.88.168.50]
X-ClientProxiedBy: BN1PR02CA0030.namprd02.prod.outlook.com (10.141.56.30) To
 DM2PR03MB399.namprd03.prod.outlook.com (10.141.84.148)
X-Microsoft-Antispam: BL:0; ACTION:Default; RISK:Low; SCL:0; SPMLVL:NotSpam;
 PCL:0; RULEID:
X-Forefront-PRVS: 02272225C5
X-Forefront-Antispam-Report: SFV:NSPM;
 SFS:(6009001)(428001)(199002)(189002)(77982001)(50226001)(93916002)(86362001)(4396001)(21056001)(99396002)(62966002)(50986999)(104166001)(92726001)(79102001)(81342001)(74502001)(74662001)(81542001)(33646001)(102836001)(85852003)(88136002)(83072002)(64706001)(76482001)(20776003)(42186004)(89996001)(87976001)(87286001)(101416001)(19580405001)(83322001)(19580395003)(50466002)(47776003)(48376002)(46102001)(36756003)(80022001)(77156001)(66066001);
 DIR:OUT; SFP:; SCL:1; SRVR:DM2PR03MB399; H:snotra.am.freescale.net; FPR:;
 MLV:sfv; PTR:InfoNoRecords; A:1; MX:1; LANG:en;
Received-SPF: None (: freescale.com does not designate permitted sender hosts)
Authentication-Results: spf=none (sender IP is ) smtp.mailfrom=scottwood@freescale.com;
X-OriginatorOrg: freescale.com
Cc: Scott Wood, Laurentiu Tudor
X-BeenThere: linuxppc-dev@lists.ozlabs.org
X-Mailman-Version: 2.1.16
Precedence: list
List-Id: Linux on PowerPC Developers Mail List
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org
Sender: "Linuxppc-dev"

From: Laurentiu Tudor

Virtualized environments may expose an e6500 dual-threaded core
as two single-threaded e6500 cores.
Take advantage of this and get rid of the tlb lock and the
trap-causing tlbsx in the htw miss handler by guarding them with
CPU_FTR_SMT, as is already done in the bolted tlb1 miss
handler.

As the results below show, the lmbench random memory access
latency test, run under Freescale's Embedded Hypervisor,
improves by ~34%.

Memory latencies in nanoseconds - smaller is better
    (WARNING - may not be correct, check graphs)
----------------------------------------------------
Host         Mhz     L1 $   L2 $   Main mem   Rand mem
---------    ---     ----   ----   --------   --------
smt         1665   1.8020   13.2       83.0     1149.7
nosmt       1665   1.8020   13.2       83.0      758.1

Signed-off-by: Laurentiu Tudor
Cc: Scott Wood
[scottwood@freescale.com: commit message tweak]
Signed-off-by: Scott Wood
---
v2:
 - s/expose/may expose/ in commit message
 - rebased onto my patch queue to resolve conflict
 - resent since the original didn't make it to the list archives
   or patchwork.

 arch/powerpc/mm/tlb_low_64e.S | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/powerpc/mm/tlb_low_64e.S b/arch/powerpc/mm/tlb_low_64e.S
index 131f1f4..57c4d66 100644
--- a/arch/powerpc/mm/tlb_low_64e.S
+++ b/arch/powerpc/mm/tlb_low_64e.S
@@ -299,6 +299,7 @@ itlb_miss_fault_bolted:
  * r10 = crap (free to use)
  */
 tlb_miss_common_e6500:
+BEGIN_FTR_SECTION
 	/*
 	 * Search if we already have an indirect entry for that virtual
 	 * address, and if we do, bail out.
@@ -333,6 +334,7 @@ tlb_miss_common_e6500:
 
 	andis.	r10,r10,MAS1_VALID@h
 	bne	tlb_miss_done_e6500
+END_FTR_SECTION_IFSET(CPU_FTR_SMT)
 
 	/* Now, we need to walk the page tables. First check if we are in
 	 * range.
@@ -393,11 +395,13 @@ tlb_miss_common_e6500:
 
 tlb_miss_done_e6500:
 	.macro	tlb_unlock_e6500
+BEGIN_FTR_SECTION
 	beq	cr1,1f		/* no unlock if lock was recursively grabbed */
 	li	r15,0
 	isync
 	stb	r15,0(r11)
 1:
+END_FTR_SECTION_IFSET(CPU_FTR_SMT)
 	.endm
 
 	tlb_unlock_e6500
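
The ~34% figure in the commit message can be checked against the Rand mem column of the lmbench table (1149.7 ns with smt, 758.1 ns with nosmt). The snippet below is only an illustrative sketch of that arithmetic; the helper name improvement_percent is hypothetical and is not part of the patch or of lmbench.

```python
def improvement_percent(before_ns: float, after_ns: float) -> float:
    """Relative latency reduction, in percent, going from before_ns to after_ns."""
    if before_ns <= 0:
        raise ValueError("before_ns must be positive")
    return (before_ns - after_ns) / before_ns * 100.0


if __name__ == "__main__":
    # Rand mem latencies taken from the lmbench table in this message.
    smt_ns = 1149.7    # tlb lock and tlbsx executed (CPU_FTR_SMT set)
    nosmt_ns = 758.1   # tlb lock and tlbsx skipped (CPU_FTR_SMT clear)
    print(f"{improvement_percent(smt_ns, nosmt_ns):.1f}%")  # prints 34.1%
```

The L1, L2 and main memory columns are identical in both rows, so the whole difference comes from the random access test, which is the one that exercises the TLB miss path.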