From patchwork Tue Apr 23 06:10:58 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Harsh Prateek Bora
X-Patchwork-Id: 1926366
From: Harsh Prateek Bora
To: qemu-ppc@nongnu.org, npiggin@gmail.com
Cc: danielhb413@gmail.com, vaibhav@linux.ibm.com, sbhat@linux.ibm.com
Subject: [PATCH] target/ppc: handle vcpu hotplug failure gracefully
Date: Tue, 23 Apr 2024 11:40:58 +0530
Message-Id: <20240423061058.595674-1-harshpb@linux.ibm.com>
X-Mailer: git-send-email 2.39.3
Sender:
qemu-ppc-bounces+incoming=patchwork.ozlabs.org@nongnu.org

On ppc64, the PowerVM hypervisor runs with limited memory, so creating a
vcpu during hotplug may fail in the kvm_ioctl for KVM_CREATE_VCPU. Because
kvm_init_vcpu is called with errp set to &error_fatal, such a failure
terminates the guest.

Avoid this by pre-creating the vcpu in the realize hook and parking it on
success, or returning an error otherwise. Any vcpu hotplug failure is then
reported gracefully and the guest keeps running.

Based on the api refactoring to create/park vcpus introduced in 1/8 of the
patch series:
https://lore.kernel.org/qemu-devel/20240312020000.12992-2-salil.mehta@huawei.com/

Tested OK by repeatedly doing a hotplug/unplug of vcpus as below:

 #virsh setvcpus hotplug 40
 #virsh setvcpus hotplug 70
 error: internal error: unable to execute QEMU command 'device_add':
 kvmppc_cpu_realize: vcpu hotplug failed with -12

Reported-by: Anushree Mathur
Suggested-by: Shivaprasad G Bhat
Suggested-by: Vaibhav Jain
Signed-off-by: Harsh Prateek Bora
---
 target/ppc/kvm.c | 42 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 42 insertions(+)

diff --git a/target/ppc/kvm.c b/target/ppc/kvm.c
index 8231feb2d4..c887f6dfa0 100644
--- a/target/ppc/kvm.c
+++ b/target/ppc/kvm.c
@@ -48,6 +48,8 @@
 #include "qemu/mmap-alloc.h"
 #include "elf.h"
 #include "sysemu/kvm_int.h"
+#include "sysemu/kvm.h"
+#include "hw/core/accel-cpu.h"
 
 #define PROC_DEVTREE_CPU      "/proc/device-tree/cpus/"
 
@@ -2339,6 +2341,43 @@ static void alter_insns(uint64_t *word, uint64_t flags, bool on)
     }
 }
 
+static int max_cpu_index = 0;
+
+static bool kvmppc_cpu_realize(CPUState *cs, Error **errp)
+{
+    int ret;
+
+    cs->cpu_index = max_cpu_index++;
+
+    POWERPC_CPU(cs)->vcpu_id = cs->cpu_index;
+
+    if (cs->parent_obj.hotplugged) {
+        /* create and park to fail gracefully in case vcpu hotplug fails */
+        ret = kvm_create_vcpu(cs);
+        if (!ret) {
+            kvm_park_vcpu(cs);
+        } else {
+            max_cpu_index--;
+            error_setg(errp, "%s: vcpu hotplug failed with %d",
+                       __func__, ret);
+            return false;
+        }
+    }
+    return true;
+}
+
+static void kvmppc_cpu_unrealize(CPUState *cpu)
+{
+    if (POWERPC_CPU(cpu)->vcpu_id == (max_cpu_index - 1)) {
+        /* only reclaim vcpuid if it's the last one assigned
+         * as reclaiming random vcpuid for parked vcpus may lead
+         * to unexpected behaviour due to an existing kernel bug
+         * when drc_index doesn't get reclaimed as expected.
+         */
+        max_cpu_index--;
+    }
+}
+
 static void kvmppc_host_cpu_class_init(ObjectClass *oc, void *data)
 {
     PowerPCCPUClass *pcc = POWERPC_CPU_CLASS(oc);
@@ -2963,4 +3002,7 @@ bool kvm_arch_cpu_check_are_resettable(void)
 
 void kvm_arch_accel_class_init(ObjectClass *oc)
 {
+    AccelClass *ac = ACCEL_CLASS(oc);
+    ac->cpu_common_realize = kvmppc_cpu_realize;
+    ac->cpu_common_unrealize = kvmppc_cpu_unrealize;
 }
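
For reviewers who want to reason about the index bookkeeping without a KVM host, the realize/unrealize logic in the patch above can be approximated by a small standalone model. This is only a sketch under stated assumptions: every toy_* name is hypothetical and not part of QEMU, toy_create_vcpu() merely imitates a KVM_CREATE_VCPU failure by returning -ENOMEM once an invented capacity is reached, and parking is reduced to a flag.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for per-vcpu state; this is not QEMU's CPUState. */
struct toy_vcpu {
    int vcpu_id;
    bool hotplugged;
    bool parked;
};

/* Next index to hand out, mirroring max_cpu_index in the patch. */
static int toy_max_cpu_index;

/* Invented limit used only to provoke a creation failure in this model. */
static int toy_vcpu_capacity = 2;
static int toy_vcpus_created;

/* Imitates vcpu creation: 0 on success, negative errno on failure. */
static int toy_create_vcpu(struct toy_vcpu *cpu)
{
    (void)cpu;
    if (toy_vcpus_created >= toy_vcpu_capacity) {
        return -ENOMEM;
    }
    toy_vcpus_created++;
    return 0;
}

/*
 * Assign an index, then pre-create and park a hotplugged vcpu.
 * On failure the index is rolled back and an error string is filled in,
 * so the caller can report the problem instead of aborting.
 */
static bool toy_cpu_realize(struct toy_vcpu *cpu, char *err, size_t errlen)
{
    int ret;

    cpu->vcpu_id = toy_max_cpu_index++;

    if (cpu->hotplugged) {
        ret = toy_create_vcpu(cpu);
        if (ret) {
            toy_max_cpu_index--;
            snprintf(err, errlen, "%s: vcpu hotplug failed with %d",
                     __func__, ret);
            return false;
        }
        cpu->parked = true;
    }
    return true;
}

/*
 * Reclaim the index only when it is the most recently assigned one.
 * The created vcpu itself stays allocated (parked) in this model.
 */
static void toy_cpu_unrealize(struct toy_vcpu *cpu)
{
    if (cpu->vcpu_id == toy_max_cpu_index - 1) {
        toy_max_cpu_index--;
    }
}
```

In this model, a third hotplugged vcpu fails to realize with the capacity of 2, and the counter returns to the value it had before the attempt. Unrealizing a vcpu that does not hold the highest index leaves the counter untouched, which is the conservative reclaim behaviour described in the comment in kvmppc_cpu_unrealize.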