From patchwork Thu Nov 19 13:40:40 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Kevin O'Connor
X-Patchwork-Id: 546465
Date: Thu, 19 Nov 2015 08:40:40 -0500
From: Kevin O'Connor
To: "Xulei (Stone)"
Cc: "Huangweidong \(C\)", "Gonglei \(Arei\)", "seabios@seabios.org",
 qemu-devel
Subject: Re: [Qemu-devel] [PATCH] SeaBios: Fix reset procedure reentrancy
 problem on qemu-kvm platform
Message-ID: <20151119134039.GA27717@morn.lan>
In-Reply-To: <8E78D212B8C25246BE4CE7EA0E645FE52B72B7@SZXEMI504-MBS.china.huawei.com>
References: <8E78D212B8C25246BE4CE7EA0E645FE5291A08@SZXEMI504-MBS.china.huawei.com>
 <563955D4.7080000@huawei.com>
 <20151104174201.GA17784@morn.lan>
 <8E78D212B8C25246BE4CE7EA0E645FE52977E8@SZXEMI504-MBS.china.huawei.com>
 <20151109133253.GA1790@morn.lan>
 <20151109200618.GA29129@morn.lan>
 <20151109202726.GA31490@morn.lan>
 <8E78D212B8C25246BE4CE7EA0E645FE52B5BE3@SZXEMI504-MBS.china.huawei.com>
 <8E78D212B8C25246BE4CE7EA0E645FE52B72B7@SZXEMI504-MBS.china.huawei.com>

On Thu, Nov 19, 2015 at 12:42:50PM +0000, Xulei (Stone) wrote:
> Kevin,
>
> After deeply analyzing, i think there may be 3 possible reasons:
> 1)wrong CountCPUs value. It seems CountCPUs++ in handle_smp() has no
> lock to protect. So, sometimes, 2 or more vcpu may get the same
> current value of CountCPUs.
> Then we'll get a single incrementation
> instead of 2 or more and "while (cmos_smp_count != CountCPUs)" will
> loop forever;

The handle_smp() code is called from romlayout.S:entry_smp() which
does take a lock.  So, all of handle_smp() should run synchronously.

> 2)wrong cmos_smp_count value. SeaBIOS rtc reads an incorrect number?

Not sure - the last time there were problems in this area of the code
others used kvmtrace to try and track this down.  Since you are
getting dprintf statements, you could also try outputting
cmos_smp_count prior to the loop (see patch below).

> 3)yield() stuck. Is it possible that SeaBIOS is stuck during yield?
> I've tested, when yield() is running, SeaBIOS seems has not created
> some other threads except the main thread. So I don't know what's
> the function of yield() here.?

The yield() allows hardware interrupts to occur.  But note that
yield() isn't called in the loop - it is only called after the loop
completes.

If you are only getting this on massive repetitive reboot requests,
there are some other possible explanations:

- perhaps the SIPI is getting lost because one of the CPUs is still
  resetting or still processing a SIPI from the last reboot?

- the seabios code itself may have been corrupted if the memcpy() in
  qemu_prep_reset() got far enough along to clear HaveRunPost, but did
  not get far enough along to fully complete the memcpy().

If the failure is reproducible, the patch below could help narrow the
possibilities.

-Kevin


--- a/src/fw/smp.c
+++ b/src/fw/smp.c
@@ -125,6 +125,7 @@ smp_setup(void)
 
     // Wait for other CPUs to process the SIPI.
     u8 cmos_smp_count = rtc_read(CMOS_BIOS_SMP_COUNT) + 1;
+    dprintf(1, "cmos_smp_count=%d\n", cmos_smp_count);
     while (cmos_smp_count != CountCPUs)
         asm volatile(
             // Release lock and allow other processors to use the stack.
@@ -136,6 +137,7 @@ smp_setup(void)
             "  jc 1b\n"
             : "+m" (SMPLock), "+m" (SMPStack)
             : : "cc", "memory");
+    dprintf(1, "finish smp\n");
     yield();
 
     // Restore memory.
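
Illustration only, not SeaBIOS code: the lost-update scenario raised in
point 1 can be modelled deterministically in plain C, without threads.
The sketch assumes an unlocked CountCPUs++ splits into a separate load
and store, and replays the worst-case ordering by hand. The names
count_unlocked(), count_locked() and MAX_APS are hypothetical and exist
only for this sketch.

```c
#include <assert.h>

#define MAX_APS 8   /* arbitrary bound for this sketch */

/* Worst case for an unlocked "CountCPUs++": every AP loads the counter
 * before any AP stores its result back.  Returns the final count. */
unsigned count_unlocked(unsigned n_aps)
{
    unsigned loaded[MAX_APS];
    unsigned count_cpus = 1;            /* the boot CPU is already counted */
    unsigned i;

    assert(n_aps <= MAX_APS);
    for (i = 0; i < n_aps; i++)
        loaded[i] = count_cpus;         /* all loads happen first */
    for (i = 0; i < n_aps; i++)
        count_cpus = loaded[i] + 1;     /* then all stores */
    return count_cpus;
}

/* With a lock around the increment, each load/store pair completes
 * before the next AP starts its own. */
unsigned count_locked(unsigned n_aps)
{
    unsigned count_cpus = 1;
    unsigned i;

    assert(n_aps <= MAX_APS);
    for (i = 0; i < n_aps; i++) {
        unsigned loaded = count_cpus;   /* load ... */
        count_cpus = loaded + 1;        /* ... and store stay together */
    }
    return count_cpus;
}
```

In this model, three APs give count_unlocked(3) == 2 while the expected
total is 4, so a loop of the form "while (cmos_smp_count != CountCPUs)"
would never exit. count_locked(3) == 4 corresponds to the serialized
case, which is what the lock taken in entry_smp() is described as
providing above.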