From patchwork Wed Dec 10 14:08:57 2014 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Thomas Gleixner X-Patchwork-Id: 419665 Return-Path: X-Original-To: patchwork-incoming@ozlabs.org Delivered-To: patchwork-incoming@ozlabs.org Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2401:3900:2:1::3]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id D351B1400B7 for ; Thu, 11 Dec 2014 01:10:28 +1100 (AEDT) Received: from ozlabs.org (ozlabs.org [103.22.144.67]) by lists.ozlabs.org (Postfix) with ESMTP id C240B1A0CEE for ; Thu, 11 Dec 2014 01:10:28 +1100 (AEDT) X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Received: from Galois.linutronix.de (Galois.linutronix.de [IPv6:2001:470:1f0b:db:abcd:42:0:1]) (using TLSv1.2 with cipher DHE-RSA-AES128-SHA (128/128 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id F1ED61A0B8F for ; Thu, 11 Dec 2014 01:09:44 +1100 (AEDT) Received: from localhost ([127.0.0.1]) by Galois.linutronix.de with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:256) (Exim 4.80) (envelope-from ) id 1Xyhwr-0007vf-Lu; Wed, 10 Dec 2014 15:09:01 +0100 Date: Wed, 10 Dec 2014 15:08:57 +0100 (CET) From: Thomas Gleixner To: Linus Torvalds Subject: Re: [PATCH] powerpc: secondary CPUs signal to master before setting active and online (fixes kernel BUG at kernel/smpboot.c:134!) In-Reply-To: Message-ID: References: <1418009221-12719-1-git-send-email-anton@samba.org> <20141208083408.GA8023@gmail.com> <20141208211859.6e81ec81@kryten> <20141209105819.0e847b4b@kryten> User-Agent: Alpine 2.11 (DEB 23 2013-08-11) MIME-Version: 1.0 X-Linutronix-Spam-Score: -1.0 X-Linutronix-Spam-Level: - X-Linutronix-Spam-Status: No , -1.0 points, 5.0 required, ALL_TRUSTED=-1, SHORTCIRCUIT=-0.0001 Cc: Peter Zijlstra , Yuanhan Liu , Benjamin Segall , Paul Mackerras , Ingo Molnar , "Rafael J. Wysocki" , Ingo Molnar , Paul Turner , yuyang.du@intel.com, Daniel J Blueman , Steven Rostedt , Anton Blanchard , subbaram@codeaurora.org, Wu Fengguang , lkp@01.org, Linux Kernel Mailing List , Slava Pestov , Tejun Heo , Andrew Morton , Brian Norris , ppc-dev X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.18 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" On Tue, 9 Dec 2014, Linus Torvalds wrote: > On Mon, Dec 8, 2014 at 3:58 PM, Anton Blanchard wrote: > > Hi Ingo, > > > >> At that point I thought the previous task_cpu() was somewhat ingrained > >> in the scheduler and came up with the patch. If not, we could go on a > >> hunt to see what else needs fixing. > > > > I had another look. The scheduled does indeed make assumptions about the > > previous task_cpu, but we have a hammer to fix it up called > > select_fallback_rq. > > > > I annotated select_fallback_rq, and did hit a case where the CPU was > > not active. ppc64 patch below. > > Anton, I'll assume I will get this through the usual powerpc pull requests? > > > I think x86 have a similar (although harder to hit) issue. Indeed way harder to hit: CPU 0 CPU 1 set_cpu_online(1, true) { while (!cpu_online(cpu1)) cpumask_set_cpu(1, to_cpumask(cpu_online_bits)); relax(); wakeup_thread_on_cpu1(); cpumask_set_cpu(1, to_cpumask(cpu_active_bits)); On bare metal probably impossible, but on virt it should be observable. Fix is simple. Thanks, tglx diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c index 668d8f2a8781..534f3384f03f 100644 --- a/arch/x86/kernel/smpboot.c +++ b/arch/x86/kernel/smpboot.c @@ -222,7 +222,6 @@ static void notrace start_secondary(void *unused) lock_vector_lock(); set_cpu_online(smp_processor_id(), true); unlock_vector_lock(); - per_cpu(cpu_state, smp_processor_id()) = CPU_ONLINE; x86_platform.nmi_init(); /* enable local interrupts */ @@ -234,6 +233,7 @@ static void notrace start_secondary(void *unused) x86_cpuinit.setup_percpu_clockev(); wmb(); + per_cpu(cpu_state, smp_processor_id()) = CPU_ONLINE; cpu_startup_entry(CPUHP_ONLINE); } @@ -932,7 +932,7 @@ int native_cpu_up(unsigned int cpu, struct task_struct *tidle) check_tsc_sync_source(cpu); local_irq_restore(flags); - while (!cpu_online(cpu)) { + while (per_cpu(cpu_state,cpu) != CPU_ONLINE) { cpu_relax(); touch_nmi_watchdog(); }