From patchwork Tue Jun 11 20:01:25 2013 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kamal Mostafa X-Patchwork-Id: 250630 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from huckleberry.canonical.com (huckleberry.canonical.com [91.189.94.19]) by ozlabs.org (Postfix) with ESMTP id 4917A2C0098 for ; Wed, 12 Jun 2013 06:01:48 +1000 (EST) Received: from localhost ([127.0.0.1] helo=huckleberry.canonical.com) by huckleberry.canonical.com with esmtp (Exim 4.76) (envelope-from ) id 1UmUl4-0006pn-2s; Tue, 11 Jun 2013 20:01:34 +0000 Received: from youngberry.canonical.com ([91.189.89.112]) by huckleberry.canonical.com with esmtp (Exim 4.76) (envelope-from ) id 1UmUky-0006kZ-If for kernel-team@lists.ubuntu.com; Tue, 11 Jun 2013 20:01:28 +0000 Received: from c-67-160-231-42.hsd1.ca.comcast.net ([67.160.231.42] helo=fourier) by youngberry.canonical.com with esmtpsa (TLS1.0:DHE_RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1UmUky-0007Cb-2H; Tue, 11 Jun 2013 20:01:28 +0000 Received: from kamal by fourier with local (Exim 4.80) (envelope-from ) id 1UmUkv-00082n-4S; Tue, 11 Jun 2013 13:01:25 -0700 From: Kamal Mostafa To: Konrad Rzeszutek Wilk Subject: [ 3.8.y.z extended stable ] Patch "xen/smp: Fixup NOHZ per cpu data when onlining an offline CPU." has been added to staging queue Date: Tue, 11 Jun 2013 13:01:25 -0700 Message-Id: <1370980885-30893-1-git-send-email-kamal@canonical.com> X-Mailer: git-send-email 1.8.1.2 X-Extended-Stable: 3.8 Cc: Kamal Mostafa , kernel-team@lists.ubuntu.com X-BeenThere: kernel-team@lists.ubuntu.com X-Mailman-Version: 2.1.14 Precedence: list List-Id: Kernel team discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Errors-To: kernel-team-bounces@lists.ubuntu.com Sender: kernel-team-bounces@lists.ubuntu.com This is a note to let you know that I have just added a patch titled xen/smp: Fixup NOHZ per cpu data when onlining an offline CPU. to the linux-3.8.y-queue branch of the 3.8.y.z extended stable tree which can be found at: http://kernel.ubuntu.com/git?p=ubuntu/linux.git;a=shortlog;h=refs/heads/linux-3.8.y-queue This patch is scheduled to be released in version 3.8.13.3. If you, or anyone else, feels it should not be added to this tree, please reply to this email. For more information about the 3.8.y.z tree, see https://wiki.ubuntu.com/Kernel/Dev/ExtendedStable Thanks. -Kamal ------ From b436531f2149fc81da460e391f0990d788f5f0ec Mon Sep 17 00:00:00 2001 From: Konrad Rzeszutek Wilk Date: Mon, 3 Jun 2013 10:33:55 -0400 Subject: xen/smp: Fixup NOHZ per cpu data when onlining an offline CPU. commit 466318a87f28cb3ba0d08a3b7ef1a37ae73d5aa7 upstream. The xen_play_dead is an undead function. When the vCPU is told to offline it ends up calling xen_play_dead wherin it calls the VCPUOP_down hypercall which offlines the vCPU. However, when the vCPU is onlined back, it resumes execution right after VCPUOP_down hypercall. That was OK (albeit the API for play_dead assumes that the CPU stays dead and never returns) but with commit 4b0c0f294 (tick: Cleanup NOHZ per cpu data on cpu down) that is no longer safe as said commit resets the ts->inidle which at the start of the cpu_idle loop was set. The net effect is that we get this warn: Broke affinity for irq 16 installing Xen timer for CPU 1 cpu 1 spinlock event irq 48 ------------[ cut here ]------------ WARNING: at /home/konrad/linux-linus/kernel/time/tick-sched.c:935 tick_nohz_idle_exit+0x195/0x1b0() Modules linked in: dm_multipath dm_mod xen_evtchn iscsi_boot_sysfs CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.10.0-rc3upstream-00068-gdcdbe33 #1 Hardware name: BIOSTAR Group N61PB-M2S/N61PB-M2S, BIOS 6.00 PG 09/03/2009 ffffffff8193b448 ffff880039da5e60 ffffffff816707c8 ffff880039da5ea0 ffffffff8108ce8b ffff880039da4010 ffff88003fa8e500 ffff880039da4010 0000000000000001 ffff880039da4000 ffff880039da4010 ffff880039da5eb0 Call Trace: [] dump_stack+0x19/0x1b [] warn_slowpath_common+0x6b/0xa0 [] warn_slowpath_null+0x15/0x20 [] tick_nohz_idle_exit+0x195/0x1b0 [] cpu_startup_entry+0x205/0x250 [] cpu_bringup_and_idle+0x13/0x15 ---[ end trace 915c8c486004dda1 ]--- b/c ts_inidle is set to zero. Thomas suggested that we just add a workaround to call tick_nohz_idle_enter before returning from xen_play_dead() - and that is what this patch does and fixes the issue. We also add the stable part b/c git commit 4b0c0f294 is on the stable tree. Suggested-by: Thomas Gleixner Signed-off-by: Konrad Rzeszutek Wilk Signed-off-by: Kamal Mostafa --- arch/x86/xen/smp.c | 8 ++++++++ 1 file changed, 8 insertions(+) -- 1.8.1.2 diff --git a/arch/x86/xen/smp.c b/arch/x86/xen/smp.c index 48d7b2c..5f8e4fb 100644 --- a/arch/x86/xen/smp.c +++ b/arch/x86/xen/smp.c @@ -17,6 +17,7 @@ #include #include #include +#include #include #include @@ -432,6 +433,13 @@ static void __cpuinit xen_play_dead(void) /* used only with HOTPLUG_CPU */ play_dead_common(); HYPERVISOR_vcpu_op(VCPUOP_down, smp_processor_id(), NULL); cpu_bringup(); + /* + * commit 4b0c0f294 (tick: Cleanup NOHZ per cpu data on cpu down) + * clears certain data that the cpu_idle loop (which called us + * and that we return from) expects. The only way to get that + * data back is to call: + */ + tick_nohz_idle_enter(); } #else /* !CONFIG_HOTPLUG_CPU */