From patchwork Tue Aug  9 08:04:37 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Xulei (Stone, Euler)" <stone.xulei@huawei.com>
X-Patchwork-Id: 657146
Return-Path: <qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org>
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11])
	(using TLSv1 with cipher AES256-SHA (256/256 bits))
	(No client certificate requested)
	by ozlabs.org (Postfix) with ESMTPS id 3s7n1S5qCfz9sBR
	for <incoming@patchwork.ozlabs.org>;
	Tue,  9 Aug 2016 18:06:52 +1000 (AEST)
Received: from localhost ([::1]:34108 helo=lists.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from
	<qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org>)
	id 1bX23m-00018J-AB
	for incoming@patchwork.ozlabs.org; Tue, 09 Aug 2016 04:06:50 -0400
Received: from eggs.gnu.org ([2001:4830:134:3::10]:53986)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <stone.xulei@huawei.com>) id 1bX22V-0000DE-A1
	for qemu-devel@nongnu.org; Tue, 09 Aug 2016 04:05:32 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <stone.xulei@huawei.com>) id 1bX22Q-0001HF-Ar
	for qemu-devel@nongnu.org; Tue, 09 Aug 2016 04:05:31 -0400
Received: from szxga02-in.huawei.com ([119.145.14.65]:22889)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <stone.xulei@huawei.com>) id 1bX22O-0001DW-Nq
	for qemu-devel@nongnu.org; Tue, 09 Aug 2016 04:05:26 -0400
Received: from 172.24.1.60 (EHLO SZXEMI403-HUB.china.huawei.com)
	([172.24.1.60])
	by szxrg02-dlp.huawei.com (MOS 4.3.7-GA FastPath queued)
	with ESMTP id DLL73597; Tue, 09 Aug 2016 16:04:40 +0800 (CST)
Received: from SZXEMI504-MBS.china.huawei.com ([169.254.1.86]) by
	SZXEMI403-HUB.china.huawei.com ([10.83.65.55]) with mapi id
	14.03.0235.001; Tue, 9 Aug 2016 16:04:37 +0800
From: "Xulei (Stone)" <stone.xulei@huawei.com>
To: "Kevin O'Connor" <kevin@koconnor.net>, Paolo Bonzini
	<paolo.bonzini@gmail.com>
Thread-Topic: [QUESTION]stuck in SeaBIOS and vm cannot be reset any more
Thread-Index: AdHyFIcV036cWEnqTD6hEFazQP1Okw==
Date: Tue, 9 Aug 2016 08:04:37 +0000
Message-ID: 
 <8E78D212B8C25246BE4CE7EA0E645FE53FB818@SZXEMI504-MBS.china.huawei.com>
Accept-Language: zh-CN, en-US
Content-Language: zh-CN
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
x-originating-ip: [10.177.254.96]
MIME-Version: 1.0
X-CFilter-Loop: Reflected
X-Mirapoint-Virus-RAPID-Raw: score=unknown(0),
	refid=str=0001.0A020201.57A98E9D.0132, ss=1, re=0.000, recu=0.000,
	reip=0.000, cl=1, cld=1, fgs=0, ip=169.254.1.86,
	so=2013-06-18 04:22:30, dmn=2013-03-21 17:37:32
X-Mirapoint-Loop-Id: 33a4777865139a8809b1e27783b296d0
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.4.x-2.6.x [generic]
X-Received-From: 119.145.14.65
Subject: Re: [Qemu-devel] [QUESTION]stuck in SeaBIOS and vm cannot be reset
	any more
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.21
Precedence: list
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel/>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
Cc: "seabios@seabios.org" <seabios@seabios.org>,
	qemu-devel <qemu-devel@nongnu.org>
Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org
Sender: "Qemu-devel"
	<qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org>

>On Tue, Aug 02, 2016 at 04:18:30AM +0000, Xulei (Stone) wrote:
>> >On Fri, Jul 29, 2016 at 04:04:59AM +0000, Xulei (Stone) wrote:
>> >> After one day, the vm is stuck. Looking from the following seabios
>> >> log, it seems seabios stops at "PCI: Using 00:02.0 for primary
>> >> VGA", and can not execute handle_smp() any more.
>> >> What may be the reason?
>> >
>> >More debugging info would be necessary to find this problem.  You
>> >could try reproducing and attaching gdb (
>> >http://www.seabios.org/Debugging#Debugging_with_gdb_on_QEMU ).
>> >Alternatively, a kvm trace log may help.
>> >
>> kvm trace (seems useful) indicates that cpu 0 keeps accessing ioport
>> 0x00b3. 0x00b3 is PORT_SMI_STATUS, so I guess my bios is stuck in the
>> function smm_relocate_and_restore {
>>       ...
>>       /* wait until SMM code executed */
>>     while (inb(PORT_SMI_STATUS) != 0x00)
>>       ...
>> }
>
>I'd try adding dprintf() statements around all the code at the top of
>smm_relocate_and_restore() and enable the dprintf() at the top of
>handle_smi().
>
>It would also be useful if you can extract the log from the last two
>working reboots to compare it to the failed case.

Following your suggestion, I'm now sure it is caused by a missing SMI.
I added dprintf() calls (see the diff below), and in the failed case only
these two messages are printed:

2016-08-03 16:23:15before SMI====
2016-08-03 16:23:15after SMI=====

So it is obvious that after outb(0x01, PORT_SMI_STATUS) the BIOS never
runs handle_smi(), so PORT_SMI_STATUS stays at 0x01. What's more, once
this problem happens, rebooting the vm does not recover it. My vm stays
stuck at the same place until I destroy it.
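
To make a lost SMI show up as an error instead of an endless spin, the
wait loop could be bounded. Below is a minimal sketch of that idea, not
SeaBIOS code: inb() is a mock that stands in for the real port read, and
wait_smi_done(), mock_reset() and the poll budget are hypothetical names
introduced only for this illustration. The port number 0x00b3 is the one
from the kvm trace above.

```c
#include <assert.h>
#include <stdint.h>

#define PORT_SMI_STATUS 0x00b3  /* port number as reported in this thread */

/* Mock state standing in for the real I/O port (assumption: no hardware). */
static uint8_t  mock_status;       /* value the mocked port currently returns */
static uint32_t mock_clear_after;  /* poll count at which status clears; 0 = never */
static uint32_t mock_reads;        /* number of mocked port reads so far */

/* Reset the mock: status starts at 0x01, as after outb(0x01, PORT_SMI_STATUS). */
static void mock_reset(uint32_t clear_after)
{
    mock_status = 0x01;
    mock_clear_after = clear_after;
    mock_reads = 0;
}

/* Mocked inb(); real firmware would read the hardware port here. */
static uint8_t inb(uint16_t port)
{
    (void)port;
    mock_reads++;
    if (mock_clear_after && mock_reads >= mock_clear_after)
        mock_status = 0x00;  /* pretend the SMI handler ran and cleared it */
    return mock_status;
}

/*
 * Poll the status port until it reads 0x00 or the poll budget is used up.
 * Returns 0 if the handler appears to have run, -1 if the budget ran out.
 */
static int wait_smi_done(uint32_t max_polls)
{
    assert(max_polls > 0);
    for (uint32_t i = 0; i < max_polls; i++) {
        if (inb(PORT_SMI_STATUS) == 0x00)
            return 0;
    }
    return -1;
}
```

With mock_reset(5) the wait returns 0 on the fifth poll; with
mock_reset(0) the status never clears and the wait returns -1, which is
the situation I am seeing. In real firmware the budget would have to be
tied to a timer rather than a poll count.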

I have already tried kernel commit c43203cab1e, which still does not
solve this problem.
Any idea, Kevin and Paolo?
> >
> >-Kevin

--- a/roms/seabios/src/fw/smm.c
+++ b/roms/seabios/src/fw/smm.c
@@ -65,7 +65,8 @@ handle_smi(u16 cs)
     u8 cmd = inb(PORT_SMI_CMD);
     struct smm_layout *smm = MAKE_FLATPTR(cs, 0);
     u32 rev = smm->cpu.i32.smm_rev & SMM_REV_MASK;
-    dprintf(DEBUG_HDL_smi, "handle_smi cmd=%x smbase=%p\n", cmd, smm);
+    if (cmd == 0x00) {
+        dprintf(1, "handle_smi cmd=%x smbase=%p\n", cmd, smm);
+    }

     if (smm == (void*)BUILD_SMM_INIT_ADDR) {
         // relocate SMBASE to 0xa0000
@@ -147,14 +148,14 @@ smm_relocate_and_restore(void)  {
     /* init APM status port */
     outb(0x01, PORT_SMI_STATUS);
+    dprintf(1, "before SMI====\n");

     /* raise an SMI interrupt */
     outb(0x00, PORT_SMI_CMD);
+    dprintf(1,"after SMI=====\n");

     /* wait until SMM code executed */
     while (inb(PORT_SMI_STATUS) != 0x00)
         ;
+    dprintf(1, "smm code executes complete====\n");

The log output in the failed case looks like this:
2016-08-03 16:23:15PCI: Using 00:02.0 for primary VGA
2016-08-03 16:23:15smm_device_setup start
2016-08-03 16:23:15init smm