From patchwork Wed Aug 23 15:31:40 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Vaidyanathan Srinivasan
X-Patchwork-Id: 805063
From: Vaidyanathan Srinivasan
To: Michael Neuling, Anton Blanchard
Cc: skiboot@lists.ozlabs.org, "Gautham R. Shenoy"
Date: Wed, 23 Aug 2017 21:01:40 +0530
Message-Id: <20170823153140.21183-1-svaidy@linux.vnet.ibm.com>
X-Mailer: git-send-email 2.9.5
Subject: [Skiboot] [PATCH v2] slw: Modify the power9 stop0_lite latency & residency
List-Id: Mailing list for skiboot development

From: "Gautham R. Shenoy"

Currently skiboot exposes the exit-latency for stop0_lite as 200ns and
the target-residency to be 2us.
However, the kernel cpu-idle infrastructure rounds the latency down to
whole microseconds and lists the stop0_lite latency as 0us, putting it
on par with the snooze state. As a result, when the predicted latency is
small (< 1us), cpuidle will select stop0_lite instead of snooze.

The difference between these states is that snooze doesn't require an
interrupt to exit from the state, but stop0_lite does. The 200ns value
also doesn't include the interrupt latency.

This shows up in the context_switch2 benchmark
(http://ozlabs.org/~anton/junkcode/context_switch2.c) where the number
of context switches per second with stop0_lite disabled is found to be
roughly 30% more than with stop0_lite enabled.

===============================================================================
x latency_200ns_residency_2us
+ latency_200ns_residency_2us_stop0_lite_disabled
    N           Min           Max        Median           Avg        Stddev
x 100        222784        473466        294510     302295.26       45380.6
+ 100        205316        609420        385198     396338.72     78135.648
Difference at 99.0% confidence
	94043.5 +/- 23276.2
	31.1098% +/- 7.69983%
	(Student's t, pooled s = 63892.8)
===============================================================================

This can be correlated with the number of times cpuidle enters
stop0_lite compared to snooze.

===================================================================
latency=200ns, residency=2us
stop0_lite enabled.
	* snooze usage = 7
	* stop0 lite usage = 3200324
	* stop1 lite usage = 6

stop0_lite disabled
	* snooze usage: 287846
	* stop0_lite usage: 0
	* stop1_lite usage: 0
==================================================================

Hence, bump up the exit latency of stop0_lite to 1us. Since the target
residency is chosen to be 10 times the exit latency, set the target
residency to 10us.
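The arithmetic behind the two values can be illustrated with a minimal
sketch. It assumes the kernel derives microseconds from the exported
nanosecond value by plain integer division; the helper names below are
hypothetical and are not part of skiboot or the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (assumption): convert nanoseconds to whole
 * microseconds by integer division, which discards the
 * sub-microsecond remainder. */
uint32_t ns_to_us(uint32_t ns)
{
	return ns / 1000;
}

/* Hypothetical helper: target residency chosen as 10 times the exit
 * latency, the rule stated in the commit message. */
uint32_t residency_from_latency_ns(uint32_t latency_ns)
{
	return latency_ns * 10;
}

/* Checks the values quoted in the commit message. */
void check_stop0_lite_values(void)
{
	/* Old value: 200ns is reported as 0us. */
	assert(ns_to_us(200) == 0);
	/* New value: 1000ns is reported as 1us. */
	assert(ns_to_us(1000) == 1);
	/* 10x rule: 1000ns latency gives 10000ns (10us) residency. */
	assert(residency_from_latency_ns(1000) == 10000);
}
```

Under that assumption, any exit latency below 1000ns is listed as 0us,
which is why 1000ns is the smallest value that keeps stop0_lite
distinguishable from snooze.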
With these values, we see an improvement of roughly 69% in the average
number of context switches per second:

=====================================================================
x latency_200ns_residency_2us
+ latency_1us_residency_10us
    N           Min           Max        Median           Avg        Stddev
x 100        222784        473466        294510     302295.26       45380.6
+ 100        281790        710784        514878     510224.62     85163.252
Difference at 99.0% confidence
	207929 +/- 24858.3
	68.7835% +/- 8.22319%
	(Student's t, pooled s = 68235.5)
=====================================================================

The cpuidle usage statistics show that we choose stop0_lite less often
in such cases.

latency = 1us, residency = 10us
stop0_lite enabled
	* snooze usage = 536808
	* stop0 lite usage = 3
	* stop1 lite usage = 7

Reported-by: Anton Blanchard
Signed-off-by: Gautham R. Shenoy
Signed-off-by: Vaidyanathan Srinivasan
---
Changes from v1:
[PATCH] power9-dd1:slw: Modify the stop0_lite latency & residency
https://lists.ozlabs.org/pipermail/skiboot/2017-April/007011.html

Update latency to at least 1us for both DD1 and DD2 idle states.
--Vaidy

 hw/slw.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/hw/slw.c b/hw/slw.c
index c0ab9de..98040e6 100644
--- a/hw/slw.c
+++ b/hw/slw.c
@@ -508,8 +508,8 @@ static struct cpu_idle_states power8_cpu_idle_states[] = {
 static struct cpu_idle_states power9_cpu_idle_states[] = {
 	{
 		.name = "stop0_lite", /* Enter stop0 with no state loss */
-		.latency_ns = 200,
-		.residency_ns = 2000,
+		.latency_ns = 1000,
+		.residency_ns = 10000,
 		.flags = 0*OPAL_PM_DEC_STOP \
 		       | 0*OPAL_PM_TIMEBASE_STOP \
 		       | 0*OPAL_PM_LOSE_USER_CONTEXT \
@@ -522,8 +522,8 @@ static struct cpu_idle_states power9_cpu_idle_states[] = {
 		.pm_ctrl_reg_mask = OPAL_PM_PSSCR_MASK },
 	{
 		.name = "stop0",
-		.latency_ns = 300,
-		.residency_ns = 3000,
+		.latency_ns = 2000,
+		.residency_ns = 20000,
 		.flags = 0*OPAL_PM_DEC_STOP \
 		       | 0*OPAL_PM_TIMEBASE_STOP \
 		       | 0*OPAL_PM_LOSE_USER_CONTEXT \
@@ -653,8 +653,8 @@ static struct cpu_idle_states power9_cpu_idle_states[] = {
 static struct cpu_idle_states power9_ndd1_cpu_idle_states[] = {
 	{
 		.name = "stop0_lite",
-		.latency_ns = 200,
-		.residency_ns = 2000,
+		.latency_ns = 1000,
+		.residency_ns = 10000,
 		.flags = 0*OPAL_PM_DEC_STOP \
 		       | 0*OPAL_PM_TIMEBASE_STOP \
 		       | 0*OPAL_PM_LOSE_USER_CONTEXT \