From patchwork Tue Apr 4 04:11:21 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Nicholas Piggin
X-Patchwork-Id: 746692
Date: Tue, 4 Apr 2017 14:11:21 +1000
From: Nicholas Piggin
To: Linus Torvalds
Cc: "linux-arch@vger.kernel.org", linuxppc-dev,
 Linux Kernel Mailing List, Anton Blanchard
Subject: Re: [RFC][PATCH] spin loop arch primitives for busy waiting
Message-ID: <20170404141121.5510e925@roar.ozlabs.ibm.com>
In-Reply-To: <20170404130233.1f45115b@roar.ozlabs.ibm.com>
References: <20170403081328.30266-1-npiggin@gmail.com>
 <20170404095001.664718b8@roar.ozlabs.ibm.com>
 <20170404130233.1f45115b@roar.ozlabs.ibm.com>
Organization: IBM
X-Mailer: Claws Mail 3.14.1 (GTK+ 2.24.31; x86_64-pc-linux-gnu)
List-Id: Linux on PowerPC Developers Mail List

On Tue, 4 Apr 2017 13:02:33 +1000
Nicholas Piggin wrote:

> On Mon, 3 Apr 2017 17:43:05 -0700
> Linus Torvalds wrote:
>
> > But that depends on architectures having some pattern that we *can*
> > abstract. Would some "begin/in-loop/end" pattern like the above be
> > sufficient?
>
> Yes. begin/in/end would be sufficient for powerpc SMT priority, and
> for x86, and it looks like sparc64 too. So we could do that if you
> prefer.

How's this? I changed your name a bit just so we have a common spin_
prefix. With example powerpc implementation and one caller converted
to see the effect.

---
 arch/powerpc/include/asm/processor.h | 17 +++++++++++++
 include/linux/processor.h            | 48 ++++++++++++++++++++++++++++++++++++
 kernel/sched/idle.c                  |  7 +++++-
 3 files changed, 71 insertions(+), 1 deletion(-)
 create mode 100644 include/linux/processor.h

diff --git a/arch/powerpc/include/asm/processor.h b/arch/powerpc/include/asm/processor.h
index e9bbd450d966..1274dc818e74 100644
--- a/arch/powerpc/include/asm/processor.h
+++ b/arch/powerpc/include/asm/processor.h
@@ -402,6 +402,23 @@ static inline unsigned long __pack_fe01(unsigned int fpmode)
 
 #ifdef CONFIG_PPC64
 #define cpu_relax()	do { HMT_low(); HMT_medium(); barrier(); } while (0)
+
+#ifndef spin_begin
+#define spin_begin()	HMT_low()
+#endif
+
+#ifndef spin_cpu_relax
+#define spin_cpu_relax()	barrier()
+#endif
+
+#ifndef spin_cpu_yield
+#define spin_cpu_yield()
+#endif
+
+#ifndef spin_end
+#define spin_end()	HMT_medium()
+#endif
+
 #else
 #define cpu_relax()	barrier()
 #endif
diff --git a/include/linux/processor.h b/include/linux/processor.h
new file mode 100644
index 000000000000..65e5635d0069
--- /dev/null
+++ b/include/linux/processor.h
@@ -0,0 +1,48 @@
+/* Misc low level processor primitives */
+#ifndef _LINUX_PROCESSOR_H
+#define _LINUX_PROCESSOR_H
+
+#include
+
+/*
+ * spin_begin is used before beginning a busy-wait loop, and must be paired
+ * with spin_end when the loop is exited. spin_cpu_relax must be called
+ * within the loop.
+ *
+ * The loop body should be as small and fast as possible, on the order of
+ * tens of instructions/cycles as a guide. It should avoid calling
+ * cpu_relax, or any "spin" or sleep type of primitive including nested uses
+ * of these primitives. It should not lock or take any other resource.
+ * Violations of this will not cause a bug, but may cause suboptimal
+ * performance.
+ *
+ * These loops are optimized to be used where wait times are expected to be
+ * less than the cost of a context switch (and associated overhead).
+ *
+ * Detection of resource owner and decision to spin or sleep or guest-yield
+ * (e.g., spin lock holder vcpu preempted, or mutex owner not on CPU) can be
+ * tested within the busy loop body if necessary.
+ */
+#ifndef spin_begin
+#define spin_begin()
+#endif
+
+#ifndef spin_cpu_relax
+#define spin_cpu_relax() cpu_relax()
+#endif
+
+/*
+ * spin_cpu_yield may be called to yield (undirected) to the hypervisor if
+ * necessary. This should be used if the wait is expected to take longer
+ * than context switch overhead, but we can't sleep or do a directed yield.
+ */
+#ifndef spin_cpu_yield
+#define spin_cpu_yield() cpu_relax_yield()
+#endif
+
+#ifndef spin_end
+#define spin_end()
+#endif
+
+#endif /* _LINUX_PROCESSOR_H */
+
diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
index ac6d5176463d..99a032d9f4a9 100644
--- a/kernel/sched/idle.c
+++ b/kernel/sched/idle.c
@@ -10,6 +10,7 @@
 #include
 #include
 #include
+#include
 
 #include
 
@@ -63,9 +64,13 @@ static noinline int __cpuidle cpu_idle_poll(void)
 	trace_cpu_idle_rcuidle(0, smp_processor_id());
 	local_irq_enable();
 	stop_critical_timings();
+
+	spin_begin();
 	while (!tif_need_resched() &&
 		(cpu_idle_force_poll || tick_check_broadcast_expired()))
-		cpu_relax();
+		spin_cpu_relax();
+	spin_end();
+
 	start_critical_timings();
 	trace_cpu_idle_rcuidle(PWR_EVENT_EXIT, smp_processor_id());
 	rcu_idle_exit();
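
For anyone who wants to poke at the begin/relax/end pairing and the
#ifndef override scheme outside the kernel, here is a rough user-space
sketch. It is not part of the patch. The counters spin_depth and
spin_iters, the stand-in cpu_relax(), and the helper spin_wait_flag()
are all made up for illustration; the "arch" half only counts calls
instead of changing SMT priority, and spin_cpu_yield is left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical instrumentation, not in the patch. */
static int spin_depth;			/* non-zero between begin and end */
static unsigned long spin_iters;	/* number of relax calls made */

/* Stand-in for the kernel's cpu_relax(): compiler barrier plus a count. */
static inline void cpu_relax(void)
{
	atomic_signal_fence(memory_order_seq_cst);
	spin_iters++;
}

/* "Arch" half: overrides begin/end, as the powerpc hunk does. */
#define spin_begin()	do { spin_depth++; } while (0)
#define spin_end()	do { spin_depth--; } while (0)

/* "Generic" half: same #ifndef fallbacks as include/linux/processor.h. */
#ifndef spin_begin
#define spin_begin()
#endif

#ifndef spin_cpu_relax
#define spin_cpu_relax() cpu_relax()	/* taken: no arch override above */
#endif

#ifndef spin_end
#define spin_end()
#endif

/*
 * Hypothetical helper: check *flag up to max_iters times, relaxing
 * between checks. Returns true if the flag was seen set.
 */
static bool spin_wait_flag(atomic_bool *flag, unsigned long max_iters)
{
	bool seen = false;
	unsigned long i;

	spin_begin();
	for (i = 0; i < max_iters; i++) {
		seen = atomic_load_explicit(flag, memory_order_acquire);
		if (seen)
			break;
		spin_cpu_relax();
	}
	spin_end();

	/* Every exit path above goes through spin_end(). */
	assert(spin_depth == 0);
	return seen;
}
```

The loop is bounded only so the sketch terminates when run single
threaded; a real caller would instead decide inside the loop body
whether to keep spinning, sleep, or yield, as the header comment
describes.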