From patchwork Mon Feb 28 02:52:40 2011 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Wen Congyang X-Patchwork-Id: 84720 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from lists.gnu.org (lists.gnu.org [199.232.76.165]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate) by ozlabs.org (Postfix) with ESMTPS id 66FA5B7113 for ; Mon, 28 Feb 2011 13:55:33 +1100 (EST) Received: from localhost ([127.0.0.1]:46087 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1PttGf-0003sh-PS for incoming@patchwork.ozlabs.org; Sun, 27 Feb 2011 21:55:25 -0500 Received: from [140.186.70.92] (port=50181 helo=eggs.gnu.org) by lists.gnu.org with esmtp (Exim 4.43) id 1PttFI-0003Ro-J3 for qemu-devel@nongnu.org; Sun, 27 Feb 2011 21:54:02 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1PttFG-0003Vv-Ll for qemu-devel@nongnu.org; Sun, 27 Feb 2011 21:54:00 -0500 Received: from [222.73.24.84] (port=51082 helo=song.cn.fujitsu.com) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1PttFF-0003Vi-RH for qemu-devel@nongnu.org; Sun, 27 Feb 2011 21:53:58 -0500 Received: from tang.cn.fujitsu.com (tang.cn.fujitsu.com [10.167.250.3]) by song.cn.fujitsu.com (Postfix) with ESMTP id A9B2B17012B; Mon, 28 Feb 2011 10:53:53 +0800 (CST) Received: from mailserver.fnst.cn.fujitus.com (tang.cn.fujitsu.com [127.0.0.1]) by tang.cn.fujitsu.com (8.14.3/8.13.1) with ESMTP id p1S2m4lw001184; Mon, 28 Feb 2011 10:48:05 +0800 Received: from [10.167.225.226] ([10.167.225.226]) by mailserver.fnst.cn.fujitus.com (Lotus Domino Release 8.5.1FP4) with ESMTP id 2011022810524656-235414 ; Mon, 28 Feb 2011 10:52:46 +0800 Message-ID: <4D6B0DF8.5000407@cn.fujitsu.com> Date: Mon, 28 Feb 2011 10:52:40 +0800 From: Wen Congyang User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1.9) Gecko/20100413 Fedora/3.0.4-2.fc13 Thunderbird/3.0.4 MIME-Version: 1.0 To: Markus Armbruster Subject: Re: [Qemu-devel] [PATCH] `qdev_free` when unplug a pci device References: <1298396180-23734-1-git-send-email-wdauchy@gmail.com> <20110223025001.GC19727@valinux.co.jp> In-Reply-To: X-MIMETrack: Itemize by SMTP Server on mailserver/fnst(Release 8.5.1FP4|July 25, 2010) at 2011-02-28 10:52:46, Serialize by Router on mailserver/fnst(Release 8.5.1FP4|July 25, 2010) at 2011-02-28 10:52:47, Serialize complete at 2011-02-28 10:52:47 X-detected-operating-system: by eggs.gnu.org: FreeBSD 6.x (1) X-Received-From: 222.73.24.84 Cc: qemu-devel@nongnu.org, Isaku Yamahata , Gerd Hoffmann , William Dauchy X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.5 Precedence: list List-Id: qemu-devel.nongnu.org List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Hi Markus Armbruster At 02/23/2011 04:30 PM, Markus Armbruster Write: > Isaku Yamahata writes: > > > I don't think this patch is correct. Let me explain. > > Device hot unplug is *not* guaranteed to succeed. > > For some buses, such as USB, it always succeeds immediately, i.e. when > the device_del monitor command finishes, the device is gone. Live is > good. > > But for PCI, device_del merely initiates the ACPI unplug rain dance. It > doesn't wait for the dance to complete. Why? The dance can take an > unpredictable amount of time, including forever. > > Problem: Subsequent device_add can fail if it reuses the qdev ID or PCI > slot, and the unplug has not yet completed (race condition), or it > failed. Yes, Virginia, PCI hotplug *can* fail. > > When unplug succeeds, the qdev is automatically destroyed. > pciej_write() does that for PIIX4. Looks like pcie_cap_slot_event() > does it for PCIE. I got a similar problem. When I unplug a pci device by hand, it works as expected, and I can hotplug it again. But when I use a srcipt to do the same thing, sometimes it failed. I think I may find another bug. Steps to reproduce this bug: 1. cat ./test-e1000.sh # RHEL6RC is domain name #! /bin/bash while true; do virsh attach-interface RHEL6RC network default --mac 52:54:00:1f:db:c7 --model e1000 if [[ $? -ne 0 ]]; then break fi virsh detach-interface RHEL6RC network --mac 52:54:00:1f:db:c7 if [[ $? -ne 0 ]]; then break fi sleep 5 done 2. ./test-e1000.sh Interface attached successfully Interface detached successfully Interface attached successfully Interface detached successfully error: Failed to attach interface error: operation failed: adding e1000,netdev=hostnet1,id=net1,mac=52:54:00:1f:db:c7,bus=pci.0,addr=0x8 device failed: Duplicate ID 'net1' for device I use ftrace to trace whether sci interrupt is passed to guest OS, here is log: # cat trace | grep 'irq=9' -0 [000] 118342.634772: irq_handler_entry: irq=9 name=acpi -0 [000] 118342.634831: irq_handler_exit: irq=9 ret=handled -0 [000] 118342.693216: irq_handler_entry: irq=9 name=acpi -0 [000] 118342.693254: irq_handler_exit: irq=9 ret=handled -0 [000] 118347.737689: irq_handler_entry: irq=9 name=acpi -0 [000] 118347.737738: irq_handler_exit: irq=9 ret=handled According to the log, we only receive 3 sci interrupt, and one is lost. I enable piix4's debug and 1 line printf into piix4_device_hotplug: printf("slot: %d, up: %d, down: %d, state: %d\n", slot, s->pci0_status.up, s->pci0_status.down, state); here is the log: ======== PM: mapping to 0xb000 PM readw port=0x0004 val=0x0000 ... gpe write afe2 <== 0 gpe write afe0 <== 255 gpe write afe3 <== 0 gpe write afe1 <== 255 PM readw port=0x0000 val=0x0001 PM readw port=0x0002 val=0x0000 gpe read afe0 == 0 gpe read afe2 == 0 gpe read afe1 == 0 gpe read afe3 == 0 PM writew port=0x0000 val=0x0020 PM readw port=0x0002 val=0x0000 PM writew port=0x0002 val=0x0020 PM readw port=0x0002 val=0x0020 gpe write afe2 <== 255 gpe write afe3 <== 255 ... slot: 6, up: 0, down: 0, state: 1 <==== we attach interface the first time PM readw port=0x0000 val=0x0001 PM readw port=0x0002 val=0x0120 gpe read afe0 == 2 gpe read afe2 == ff gpe read afe2 == ff gpe write afe2 <== 253 gpe read afe1 == 0 gpe read afe3 == ff pcihotplug read ae00 == 40 pcihotplug read ae04 == 0 ... gpe write afe0 <== 2 gpe write afe2 <== 255 slot: 6, up: 64, down: 0, state: 0 <===== we detach interface the first time PM readw port=0x0000 val=0x0001 PM readw port=0x0002 val=0x0120 gpe read afe0 == 2 gpe read afe2 == ff gpe read afe2 == ff gpe write afe2 <== 253 gpe read afe1 == 0 gpe read afe3 == ff pcihotplug read ae00 == 0 pcihotplug read ae04 == 40 ... pciej write ae08 <== 64 <=== we will call qdev_free() pciej write ae08 <== 64 gpe write afe0 <== 2 gpe write afe2 <== 255 slot: 7, up: 0, down: 64, state: 1 <=== we attach interface the second time. PM readw port=0x0000 val=0x0001 PM readw port=0x0002 val=0x0120 gpe read afe0 == 2 gpe read afe2 == ff gpe read afe2 == ff gpe write afe2 <== 253 gpe read afe1 == 0 gpe read afe3 == ff pcihotplug read ae00 == 80 pcihotplug read ae04 == 0 pcihotplug read ae00 == 80 pcihotplug read ae04 == 0 pcihotplug read ae00 == 80 pcihotplug read ae04 == 0 pcihotplug read ae00 == 80 pcihotplug read ae04 == 0 pcihotplug read ae00 == 80 pcihotplug read ae04 == 0 pcihotplug read ae00 == 80 pcihotplug read ae04 == 0 pcihotplug read ae00 == 80 pcihotplug read ae04 == 0 pcihotplug read ae00 == 80 pcihotplug read ae04 == 0 pcihotplug read ae00 == 80 pcihotplug read ae04 == 0 pcihotplug read ae00 == 80 pcihotplug read ae04 == 0 pcihotplug read ae00 == 80 pcihotplug read ae04 == 0 pcihotplug read ae00 == 80 pcihotplug read ae04 == 0 pcihotplug read ae00 == 80 pcihotplug read ae04 == 0 pcihotplug read ae00 == 80 pcihotplug read ae04 == 0 pcihotplug read ae00 == 80 pcihotplug read ae04 == 0 pcihotplug read ae00 == 80 pcihotplug read ae04 == 0 pcihotplug read ae00 == 80 pcihotplug read ae04 == 0 pcihotplug read ae00 == 80 pcihotplug read ae04 == 0 pcihotplug read ae00 == 80 pcihotplug read ae04 == 0 pcihotplug read ae00 == 80 pcihotplug read ae04 == 0 pcihotplug read ae00 == 80 pcihotplug read ae04 == 0 pcihotplug read ae00 == 80 pcihotplug read ae04 == 0 pcihotplug read ae00 == 80 pcihotplug read ae04 == 0 pcihotplug read ae00 == 80 pcihotplug read ae04 == 0 pcihotplug read ae00 == 80 pcihotplug read ae04 == 0 pcihotplug read ae00 == 80 pcihotplug read ae04 == 0 pcihotplug read ae00 == 80 pcihotplug read ae04 == 0 pcihotplug read ae00 == 80 pcihotplug read ae04 == 0 pcihotplug read ae00 == 80 pcihotplug read ae04 == 0 pcihotplug read ae00 == 80 pcihotplug read ae04 == 0 pcihotplug read ae00 == 80 pcihotplug read ae04 == 0 slot: 7, up: 128, down: 0, state: 0 <=== we detach interface the second time gpe write afe0 <== 2 <=== write 2 to afe0 will cause sci interupt to be lost gpe write afe2 <== 255 =========== Here is short log, and show the different between the first time and second time: =========== gpe write afe0 <== 2 gpe write afe2 <== 255 slot: 6, up: 64, down: 0, state: 0 <===== we detach interface the first time .... slot: 7, up: 128, down: 0, state: 0 <=== we detach interface the second time gpe write afe0 <== 2 <=== write 2 to afe0 will cause sci interupt to be lost gpe write afe2 <== 255 =========== Does this behavior is a normal behavior? The following patch can fix this problem. From 3c149575b52261b8da9a73d5c4ebdfedce9866c1 Mon Sep 17 00:00:00 2001 From: Wen Congyang Date: Mon, 28 Feb 2011 10:33:45 +0800 Subject: [PATCH] pend hotplug event until last event is handled by guest OS --- hw/acpi_piix4.c | 51 ++++++++++++++++++++++++++++++++++++++++++++------- 1 files changed, 44 insertions(+), 7 deletions(-) diff --git a/hw/acpi_piix4.c b/hw/acpi_piix4.c index b5a2762..4ff3475 100644 --- a/hw/acpi_piix4.c +++ b/hw/acpi_piix4.c @@ -74,6 +74,7 @@ typedef struct PIIX4PMState { /* for pci hotplug */ struct gpe_regs gpe; struct pci_status pci0_status; + struct pci_status pci0_status_pending; uint32_t pci0_hotplug_enable; } PIIX4PMState; @@ -518,6 +519,19 @@ static void gpe_reset_val(uint16_t *cur, int addr, uint32_t val) *cur = (*cur & (0xff << (8 - shift))) | (x1 << shift); } +static void raise_pending_hotplug(PIIX4PMState *s) +{ + if (s->pci0_status_pending.up || s->pci0_status_pending.down) { + s->gpe.sts |= PIIX4_PCI_HOTPLUG_STATUS; + s->pci0_status.up = s->pci0_status_pending.up; + s->pci0_status.down = s->pci0_status_pending.down; + s->pci0_status_pending.up = 0; + s->pci0_status_pending.down = 0; + + pm_update_sci(s); + } +} + static void gpe_writeb(void *opaque, uint32_t addr, uint32_t val) { PIIX4PMState *s = opaque; @@ -539,6 +553,11 @@ static void gpe_writeb(void *opaque, uint32_t addr, uint32_t val) pm_update_sci(s); PIIX4_DPRINTF("gpe write %x <== %d\n", addr, val); + + if (!(g->sts & PIIX4_PCI_HOTPLUG_STATUS) && g->en & PIIX4_PCI_HOTPLUG_STATUS) { + /* check and reraise the pending hotplug event */ + raise_pending_hotplug(s); + } } static uint32_t pcihotplug_read(void *opaque, uint32_t addr) @@ -639,12 +658,22 @@ static void enable_device(PIIX4PMState *s, int slot) s->pci0_status.up |= (1 << slot); } +static void pend_enable_device(PIIX4PMState *s, int slot) +{ + s->pci0_status_pending.up |= (1 << slot); +} + static void disable_device(PIIX4PMState *s, int slot) { s->gpe.sts |= PIIX4_PCI_HOTPLUG_STATUS; s->pci0_status.down |= (1 << slot); } +static void pend_disable_device(PIIX4PMState *s, int slot) +{ + s->pci0_status_pending.down |= (1 << slot); +} + static int piix4_device_hotplug(DeviceState *qdev, PCIDevice *dev, PCIHotplugState state) { @@ -659,15 +688,23 @@ static int piix4_device_hotplug(DeviceState *qdev, PCIDevice *dev, return 0; } - s->pci0_status.up = 0; - s->pci0_status.down = 0; - if (state == PCI_HOTPLUG_ENABLED) { - enable_device(s, slot); + if (s->gpe.sts & PIIX4_PCI_HOTPLUG_STATUS) { + if (state == PCI_HOTPLUG_ENABLED) { + pend_enable_device(s, slot); + } else { + pend_disable_device(s, slot); + } } else { - disable_device(s, slot); - } + s->pci0_status.up = 0; + s->pci0_status.down = 0; + if (state == PCI_HOTPLUG_ENABLED) { + enable_device(s, slot); + } else { + disable_device(s, slot); + } - pm_update_sci(s); + pm_update_sci(s); + } return 0; }