[QUESTION] stuck in SeaBIOS and vm cannot be reset any more

Message ID 8E78D212B8C25246BE4CE7EA0E645FE53FB818@SZXEMI504-MBS.china.huawei.com
State New

Commit Message

Xulei (Stone, Euler) Aug. 9, 2016, 8:04 a.m. UTC
>On Tue, Aug 02, 2016 at 04:18:30AM +0000, Xulei (Stone) wrote:
>> >On Fri, Jul 29, 2016 at 04:04:59AM +0000, Xulei (Stone) wrote:
>> >> After one day, the vm is stuck. Looking from the following seabios
>> >> log, it seems seabios stops at "PCI: Using 00:02.0 for primary
>> >> VGA", and can not execute handle_smp() any more.
>> >> What may be the reason?
>> >
>> >More debugging info would be necessary to find this problem.  You
>> >could try reproducing and attaching gdb (
>> >http://www.seabios.org/Debugging#Debugging_with_gdb_on_QEMU ).
>> >Alternatively, a kvm trace log may help.
>> >
>> The kvm trace (which seems useful) indicates that cpu 0 keeps accessing
>> ioport 0x00b3.
>> 0x00b3 is PORT_SMI_STATUS, so I guess my bios is stuck in the
>> function smm_relocate_and_restore {
>>       ...
>>       /* wait until SMM code executed */
>>     while (inb(PORT_SMI_STATUS) != 0x00)
>>       ...
>> }
>
>I'd try adding dprintf() statements around all the code at the top of
>smm_relocate_and_restore() and enable the dprintf() at the top of
>handle_smi().
>
>It would also be useful if you can extract the log from the last two
>working reboots to compare it to the failed case.

Following your suggestion, I'm now sure it is caused by a missing SMI.
I added dprintf() calls (see the patch below); the failed case log ends with:

2016-08-03 16:23:15before SMI====
2016-08-03 16:23:15after SMI=====

So it's obvious that after outb(0x01, PORT_SMI_STATUS), the bios never
runs handle_smi(), so PORT_SMI_STATUS stays at 0x01. What's more, once this
problem happens, rebooting the vm does not recover it. My vm stays
stuck at the same place until I destroy it.

I have already tried kernel commit c43203cab1e, but it still does not
solve this problem.
Any idea, Kevin and Paolo?
> >
> >-Kevin
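
The hang described above can be reproduced in a small model. The sketch below is a hedged illustration, not SeaBIOS or QEMU code: struct apm_model, model_outb(), model_inb() and relocate_handshake() are hypothetical names. It assumes that the SMI handler clears PORT_SMI_STATUS when it runs, and it bounds the wait so the lost-SMI case returns instead of spinning forever.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PORT_SMI_CMD    0x00b2
#define PORT_SMI_STATUS 0x00b3

/* Hypothetical model of the two APM ports; not QEMU's implementation. */
struct apm_model {
    uint8_t status;      /* value read back from PORT_SMI_STATUS */
    bool smi_delivered;  /* false models the "missing SMI" case */
};

static void model_outb(struct apm_model *m, uint8_t val, uint16_t port)
{
    if (port == PORT_SMI_STATUS) {
        m->status = val;
    } else if (port == PORT_SMI_CMD && m->smi_delivered) {
        /* Assumption: the SMI handler runs and clears the status port. */
        m->status = 0x00;
    }
}

static uint8_t model_inb(const struct apm_model *m, uint16_t port)
{
    return port == PORT_SMI_STATUS ? m->status : 0xff;
}

/* Same port sequence as quoted from smm_relocate_and_restore(), but with
 * a bounded wait so the model can report the hang. */
static bool relocate_handshake(struct apm_model *m, unsigned max_polls)
{
    model_outb(m, 0x01, PORT_SMI_STATUS);  /* init APM status port */
    model_outb(m, 0x00, PORT_SMI_CMD);     /* raise an SMI */
    while (max_polls--) {
        if (model_inb(m, PORT_SMI_STATUS) == 0x00)
            return true;                   /* handler ran */
    }
    return false;                          /* status still 0x01 */
}
```

With smi_delivered set to false the status port never leaves 0x01, which matches a log that stops right after "after SMI=====".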

Comments

Paolo Bonzini Aug. 9, 2016, 8:08 a.m. UTC | #1
On 09/08/2016 10:04, Xulei (Stone) wrote:
> Following your suggestion, I'm now sure it is caused by a missing SMI.
> I have tried adding dprintf() like this:
> 
> --- a/roms/seabios/src/fw/smm.c
> +++ b/roms/seabios/src/fw/smm.c
> @@ -65,7 +65,8 @@ handle_smi(u16 cs)
>      u8 cmd = inb(PORT_SMI_CMD);
>      struct smm_layout *smm = MAKE_FLATPTR(cs, 0);
>      u32 rev = smm->cpu.i32.smm_rev & SMM_REV_MASK;
> -    dprintf(DEBUG_HDL_smi, "handle_smi cmd=%x smbase=%p\n", cmd, smm);
> +    if (cmd == 0x00) {
> +        dprintf(1, "handle_smi cmd=%x smbase=%p\n", cmd, smm);
> +    }
> 
>      if (smm == (void*)BUILD_SMM_INIT_ADDR) {
>          // relocate SMBASE to 0xa0000
> @@ -147,14 +148,14 @@ smm_relocate_and_restore(void)  {
>      /* init APM status port */
>      outb(0x01, PORT_SMI_STATUS);
> +    dprintf(1,"before SMI====\n");
> 
>      /* raise an SMI interrupt */
>      outb(0x00, PORT_SMI_CMD);
> +    dprintf(1,"after SMI=====\n");
> 
>      /* wait until SMM code executed */
>      while (inb(PORT_SMI_STATUS) != 0x00)
>          ;
> +    dprintf(1,"smm code executes complete====\n");
> 
> And the failed case produces log output like this:
> 2016-08-03 16:23:15PCI: Using 00:02.0 for primary VGA
> 2016-08-03 16:23:15smm_device_setup start
> 2016-08-03 16:23:15init smm
> 2016-08-03 16:23:15before SMI====
> 2016-08-03 16:23:15after SMI=====
> 
> So it's obvious that after outb(0x01, PORT_SMI_STATUS), the bios never
> runs handle_smi(), so PORT_SMI_STATUS stays at 0x01. What's more, once this
> problem happens, rebooting the vm does not recover it. My vm stays
> stuck at the same place until I destroy it.
> 
> I have already tried kernel commit c43203cab1e, but it still does not
> solve this problem.
> Any idea, Kevin and Paolo?

0xb2 is handled within QEMU, so it may be useful to make sure that QEMU
is sending the KVM_SMI ioctl.  From there the best tool is still KVM
tracing and printk.  I suggest replacing the dprintf with a simple outb
like outb(0x21, 0x402) (before) and outb(0x23, 0x402) (after).  They
show as "!" and "#" in the trace, and they are easy to spot in the KVM
trace.

Paolo
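
A minimal sketch of the marker idea follows. It is not a patch against SeaBIOS: mock_outb() and raise_smi_with_markers() are hypothetical names, PORT_DEBUG is an assumed label for port 0x402, and the mock only records bytes so the example can run outside firmware. In the real code the two writes would be plain outb() calls placed around outb(0x00, PORT_SMI_CMD).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define PORT_DEBUG      0x402  /* port used in the suggestion above */
#define MARK_BEFORE_SMI 0x21   /* ASCII '!' */
#define MARK_AFTER_SMI  0x23   /* ASCII '#' */

/* Hypothetical stand-in for the firmware's outb(): it records bytes
 * written to the debug port instead of touching hardware. */
static char trace[16];         /* zero-initialized, so always terminated */
static size_t trace_len;

static void mock_outb(uint8_t value, uint16_t port)
{
    if (port == PORT_DEBUG && trace_len < sizeof(trace) - 1)
        trace[trace_len++] = (char)value;
}

/* Emit one marker byte on each side of the point where the SMI is raised. */
static void raise_smi_with_markers(void)
{
    mock_outb(MARK_BEFORE_SMI, PORT_DEBUG);
    /* in SeaBIOS, outb(0x00, PORT_SMI_CMD) would sit here */
    mock_outb(MARK_AFTER_SMI, PORT_DEBUG);
}
```

A single byte per marker keeps the port traffic small, so each one shows up as one distinct write when reading the KVM trace.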
Patch

--- a/roms/seabios/src/fw/smm.c
+++ b/roms/seabios/src/fw/smm.c
@@ -65,7 +65,8 @@  handle_smi(u16 cs)
     u8 cmd = inb(PORT_SMI_CMD);
     struct smm_layout *smm = MAKE_FLATPTR(cs, 0);
     u32 rev = smm->cpu.i32.smm_rev & SMM_REV_MASK;
-    dprintf(DEBUG_HDL_smi, "handle_smi cmd=%x smbase=%p\n", cmd, smm);
+    if (cmd == 0x00) {
+        dprintf(1, "handle_smi cmd=%x smbase=%p\n", cmd, smm);
+    }

     if (smm == (void*)BUILD_SMM_INIT_ADDR) {
         // relocate SMBASE to 0xa0000
@@ -147,14 +148,14 @@  smm_relocate_and_restore(void)  {
     /* init APM status port */
     outb(0x01, PORT_SMI_STATUS);
+    dprintf(1,"before SMI====\n");

     /* raise an SMI interrupt */
     outb(0x00, PORT_SMI_CMD);
+    dprintf(1,"after SMI=====\n");

     /* wait until SMM code executed */
     while (inb(PORT_SMI_STATUS) != 0x00)
         ;
+    dprintf(1,"smm code executes complete====\n");

And the failed case produces log output like this:
2016-08-03 16:23:15PCI: Using 00:02.0 for primary VGA
2016-08-03 16:23:15smm_device_setup start
2016-08-03 16:23:15init smm