From patchwork Mon Oct 9 21:11:36 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Henrique Barboza X-Patchwork-Id: 823526 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 3y9tHj0K52z9t5Q for ; Tue, 10 Oct 2017 08:12:49 +1100 (AEDT) Received: from localhost ([::1]:59823 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1e1fLz-00080w-2v for incoming@patchwork.ozlabs.org; Mon, 09 Oct 2017 17:12:47 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:58215) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1e1fLG-0007ye-Eb for qemu-devel@nongnu.org; Mon, 09 Oct 2017 17:12:03 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1e1fLD-0001CJ-3X for qemu-devel@nongnu.org; Mon, 09 Oct 2017 17:12:02 -0400 Received: from mx0b-001b2d01.pphosted.com ([148.163.158.5]:55378 helo=mx0a-001b2d01.pphosted.com) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1e1fLC-0001Bx-Tk for qemu-devel@nongnu.org; Mon, 09 Oct 2017 17:11:59 -0400 Received: from pps.filterd (m0098421.ppops.net [127.0.0.1]) by mx0a-001b2d01.pphosted.com (8.16.0.21/8.16.0.21) with SMTP id v99L8rxl042740 for ; Mon, 9 Oct 2017 17:11:54 -0400 Received: from e32.co.us.ibm.com (e32.co.us.ibm.com [32.97.110.150]) by mx0a-001b2d01.pphosted.com with ESMTP id 2dgayvj6th-1 (version=TLSv1.2 cipher=AES256-SHA bits=256 verify=NOT) for ; Mon, 09 Oct 2017 17:11:54 -0400 Received: from localhost by e32.co.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Mon, 9 Oct 2017 15:11:53 -0600 Received: from b03cxnp08028.gho.boulder.ibm.com (9.17.130.20) by e32.co.us.ibm.com (192.168.1.132) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted; Mon, 9 Oct 2017 15:11:50 -0600 Received: from b03ledav005.gho.boulder.ibm.com (b03ledav005.gho.boulder.ibm.com [9.17.130.236]) by b03cxnp08028.gho.boulder.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id v99LBnCa29360230; Mon, 9 Oct 2017 14:11:49 -0700 Received: from b03ledav005.gho.boulder.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id B3F06BE039; Mon, 9 Oct 2017 15:11:49 -0600 (MDT) Received: from localhost.localdomain (unknown [9.80.211.183]) by b03ledav005.gho.boulder.ibm.com (Postfix) with ESMTP id 3F8B3BE040; Mon, 9 Oct 2017 15:11:48 -0600 (MDT) From: Daniel Henrique Barboza To: qemu-devel@nongnu.org Date: Mon, 9 Oct 2017 18:11:36 -0300 X-Mailer: git-send-email 2.13.6 In-Reply-To: <20171009211136.31640-1-danielhb@linux.vnet.ibm.com> References: <20171009211136.31640-1-danielhb@linux.vnet.ibm.com> X-TM-AS-GCONF: 00 x-cbid: 17100921-0004-0000-0000-0000130A36F2 X-IBM-SpamModules-Scores: X-IBM-SpamModules-Versions: BY=3.00007870; HX=3.00000241; KW=3.00000007; PH=3.00000004; SC=3.00000235; SDB=6.00928794; UDB=6.00467453; IPR=6.00709065; BA=6.00005631; NDR=6.00000001; ZLA=6.00000005; ZF=6.00000009; ZB=6.00000000; ZP=6.00000000; ZH=6.00000000; ZU=6.00000002; MB=3.00017464; XFM=3.00000015; UTC=2017-10-09 21:11:52 X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused x-cbparentid: 17100921-0005-0000-0000-000084688F0D Message-Id: <20171009211136.31640-2-danielhb@linux.vnet.ibm.com> X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:, , definitions=2017-10-09_06:, , signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 spamscore=0 suspectscore=1 malwarescore=0 phishscore=0 adultscore=0 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1707230000 definitions=main-1710090307 X-detected-operating-system: by eggs.gnu.org: GNU/Linux 3.x [generic] [fuzzy] X-Received-From: 148.163.158.5 Subject: [Qemu-devel] [PATCH v2 1/1] hw/ppc/spapr.c: abort unplug_request if previous unplug isn't done X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: qemu-ppc@nongnu.org, mdroth@linux.vnet.ibm.com, david@gibson.dropbear.id.au Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" LMB removal is completed only when the spapr_lmb_release callback is called after all DRCs of the dimm are detached. During this time, it is possible that a unplug request for the same dimm arrives, trying to detach DRCs that were detached by the guest in the first unplug_request. BQL doesn't help in this case - the lock will prevent any concurrent removal from happening until the end of spapr_memory_unplug_request only. What happens is that the second unplug_request ends up calling spapr_drc_detach in a DRC that were detached already, causing an assert error in spapr_drc_detach (e.g https://bugs.launchpad.net/qemu/+bug/1718118). spapr_lmb_release uses a structure called sPAPRDIMMState, stored in the spapr->pending_dimm_unplugs QTAIL, to track how many LMB DRCs are left to be detached by the guest. When there are no more DRCs left, this structure is deleted and the pc-dimm unplug handler is called to finish the process. This patch reuses the sPAPRDIMMState to allow unplug_request to know if there is an ongoing unplug process for a given dimm, aborting the unplug request in this case, by doing the following changes: - in spapr_lmb_release callback, move the dimm state removal to the end, after pc-dimm unplug handler. With this change we can check for the existence of the dimm state to see if the unplug process is done. - use spapr_pending_dimm_unplugs_find in spapr_memory_unplug_request to check if the dimm state exists. If positive, there is an unplug operation already in progress for this dimm, meaning that we should abort it and warn the user about it. Fixes: https://bugs.launchpad.net/qemu/+bug/1718118 Signed-off-by: Daniel Henrique Barboza --- hw/ppc/spapr.c | 16 ++++++++++++++-- 1 file changed, 14 insertions(+), 2 deletions(-) diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c index ff87f155d5..7f9274d57d 100644 --- a/hw/ppc/spapr.c +++ b/hw/ppc/spapr.c @@ -3054,14 +3054,13 @@ void spapr_lmb_release(DeviceState *dev) return; } - spapr_pending_dimm_unplugs_remove(spapr, ds); - /* * Now that all the LMBs have been removed by the guest, call the * pc-dimm unplug handler to cleanup up the pc-dimm device. */ pc_dimm_memory_unplug(dev, &spapr->hotplug_memory, mr); object_unparent(OBJECT(dev)); + spapr_pending_dimm_unplugs_remove(spapr, ds); } static void spapr_memory_unplug_request(HotplugHandler *hotplug_dev, @@ -3090,6 +3089,19 @@ static void spapr_memory_unplug_request(HotplugHandler *hotplug_dev, goto out; } + /* + * An existing pending dimm state for this DIMM means that there is an + * unplug operation in progress, waiting for the spapr_lmb_release + * callback to complete the job (BQL can't cover that far). In this case, + * bail out to avoid detaching DRCs that were already released. + */ + if (spapr_pending_dimm_unplugs_find(spapr, dimm)) { + error_setg(&local_err, + "Memory unplug already in progress for device %s", + dev->id); + goto out; + } + spapr_pending_dimm_unplugs_add(spapr, nr_lmbs, dimm); addr = addr_start;