From patchwork Mon Mar 27 13:22:42 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Laurent Vivier X-Patchwork-Id: 743806 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 3vsF8x6zlMz9s5g for ; Tue, 28 Mar 2017 00:23:45 +1100 (AEDT) Received: from localhost ([::1]:46791 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1csUcZ-0000Xq-A1 for incoming@patchwork.ozlabs.org; Mon, 27 Mar 2017 09:23:43 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:59032) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1csUbi-0000BC-Uc for qemu-devel@nongnu.org; Mon, 27 Mar 2017 09:22:52 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1csUbf-0005CQ-Ow for qemu-devel@nongnu.org; Mon, 27 Mar 2017 09:22:50 -0400 Received: from mx1.redhat.com ([209.132.183.28]:56534) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1csUbf-0005Bs-Je; Mon, 27 Mar 2017 09:22:47 -0400 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id C01C48F89E; Mon, 27 Mar 2017 13:22:45 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mx1.redhat.com C01C48F89E Authentication-Results: ext-mx03.extmail.prod.ext.phx2.redhat.com; dmarc=none (p=none dis=none) header.from=redhat.com Authentication-Results: ext-mx03.extmail.prod.ext.phx2.redhat.com; spf=pass smtp.mailfrom=lvivier@redhat.com DKIM-Filter: OpenDKIM Filter v2.11.0 mx1.redhat.com C01C48F89E Received: from thinkpad.redhat.com (ovpn-116-101.ams2.redhat.com [10.36.116.101]) by smtp.corp.redhat.com (Postfix) with ESMTP id C14047D518; Mon, 27 Mar 2017 13:22:43 +0000 (UTC) From: Laurent Vivier To: Michael Roth Date: Mon, 27 Mar 2017 15:22:42 +0200 Message-Id: <20170327132242.17104-1-lvivier@redhat.com> X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.27]); Mon, 27 Mar 2017 13:22:45 +0000 (UTC) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 209.132.183.28 Subject: [Qemu-devel] [PATCH] spapr: fix memory hot-unplugging X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Laurent Vivier , Thomas Huth , qemu-ppc@nongnu.org, qemu-devel@nongnu.org, David Gibson Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" If, once the kernel has booted, we try to remove a memory hotplugged while the kernel was not started, QEMU crashes on an assert: qemu-system-ppc64: hw/virtio/vhost.c:651: vhost_commit: Assertion `r >= 0' failed. ... #4 in vhost_commit #5 in memory_region_transaction_commit #6 in pc_dimm_memory_unplug #7 in spapr_memory_unplug #8 spapr_machine_device_unplug #9 in hotplug_handler_unplug #10 in spapr_lmb_release #11 in detach #12 in set_allocation_state #13 in rtas_set_indicator ... If we take a closer look to the guest kernel log, we can see when we try to unplug the memory: pseries-hotplug-mem: Attempting to hot-add 4 LMB(s) What happens: 1- The kernel has ignored the memory hotplug event because it was not started when it was generated. 2- When we hot-unplug the memory, QEMU starts to remove the memory, generates an hot-unplug event, and signals the kernel of the incoming new event 3- as the kernel is started, on the QEMU signal, it reads the event list, decodes the hotplug event and tries to finish the hotplugging. 4- QEMU receives the hotplug notification while it is trying to hot-unplug the memory. This moves the memory DRC to an invalid state This patch prevents this by not allowing to set the allocation state to USABLE while the DRC is awaiting release and allocation. RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1432382 Signed-off-by: Laurent Vivier --- hw/ppc/spapr_drc.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c index 150f6bf..377ea65 100644 --- a/hw/ppc/spapr_drc.c +++ b/hw/ppc/spapr_drc.c @@ -135,6 +135,12 @@ static uint32_t set_allocation_state(sPAPRDRConnector *drc, if (!drc->dev) { return RTAS_OUT_NO_SUCH_INDICATOR; } + if (drc->awaiting_release && drc->awaiting_allocation) { + /* kernel is acknowledging a previous hotplug event + * while we are already removing it. + */ + return RTAS_OUT_NO_SUCH_INDICATOR; + } } if (drc->type != SPAPR_DR_CONNECTOR_TYPE_PCI) {