From patchwork Mon Sep 18 09:45:04 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Jing Liu X-Patchwork-Id: 1836001 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=intel.com header.i=@intel.com header.a=rsa-sha256 header.s=Intel header.b=K+YeTPpe; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=patchwork.ozlabs.org) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4Rq0L95RX3z1ypw for ; Mon, 18 Sep 2023 19:46:24 +1000 (AEST) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qiAor-0005oD-20; Mon, 18 Sep 2023 05:45:29 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qiAoo-0005nd-25 for qemu-devel@nongnu.org; Mon, 18 Sep 2023 05:45:26 -0400 Received: from mgamail.intel.com ([134.134.136.100]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qiAoe-0000xj-7u for qemu-devel@nongnu.org; Mon, 18 Sep 2023 05:45:25 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1695030316; x=1726566316; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=ZnuW1WYEcEDGILKnDibSvGbY0oSLRt/k2KWdG1CgWKo=; b=K+YeTPpeQtgG489rE67U3iPMmLTXMBoGeg7+ZfvbWVzdxSa8d4yMaymb m+eSFM1IjYiZSpwwxGRcrozz72U9TaAzVoMQpOrVYQ7ciNks1Y4fYwcWY IRB8KOjcKpo1+L8yZWIvPqrx1AHbjXdwYNiVGBRDSsCSPaYGDuySgHwNM jS/flOMu+vEg4hibv48th3KirVBSrN4gNLcjwzJkFn1Fm6xNz1tJEFkqb 3+Msfcpq5/496yytxUTNxB871+nSkQAw0YtdU/gJahyvBQl57yC5uGLnY FW506LUdqok5XEHAtb3TxQirtei+LsTsIw4FSl22OMemUNg6umBdY9MPR g==; X-IronPort-AV: E=McAfee;i="6600,9927,10836"; a="446072269" X-IronPort-AV: E=Sophos;i="6.02,156,1688454000"; d="scan'208";a="446072269" Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Sep 2023 02:45:13 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10836"; a="835955945" X-IronPort-AV: E=Sophos;i="6.02,156,1688454000"; d="scan'208";a="835955945" Received: from vmmteam.bj.intel.com ([10.240.193.84]) by FMSMGA003.fm.intel.com with ESMTP; 18 Sep 2023 02:45:11 -0700 From: Jing Liu To: qemu-devel@nongnu.org Cc: alex.williamson@redhat.com, clg@redhat.com, pbonzini@redhat.com, kevin.tian@intel.com, reinette.chatre@intel.com, jing2.liu@intel.com, jing2.liu@linux.intel.com Subject: [PATCH v2 1/4] vfio/pci: detect the support of dynamic MSI-X allocation Date: Mon, 18 Sep 2023 05:45:04 -0400 Message-Id: <20230918094507.409050-2-jing2.liu@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20230918094507.409050-1-jing2.liu@intel.com> References: <20230918094507.409050-1-jing2.liu@intel.com> MIME-Version: 1.0 Received-SPF: pass client-ip=134.134.136.100; envelope-from=jing2.liu@intel.com; helo=mgamail.intel.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Kernel provides the guidance of dynamic MSI-X allocation support of passthrough device, by clearing the VFIO_IRQ_INFO_NORESIZE flag to guide user space. Fetch the flags from host to determine if dynamic MSI-X allocation is supported. Originally-by: Reinette Chatre Signed-off-by: Jing Liu Reviewed-by: Cédric Le Goater --- Changes since v1: - Free msix when failed to get MSI-X irq info. (Cédric) - Apply Cédric's Reviewed-by. Changes since RFC v1: - Filter the dynamic MSI-X allocation flag and store as a bool type. (Alex) - Move the detection to vfio_msix_early_setup(). (Alex) - Report error of getting irq info and remove the trace of failure case. (Alex, Cédric) --- hw/vfio/pci.c | 16 ++++++++++++++-- hw/vfio/pci.h | 1 + hw/vfio/trace-events | 2 +- 3 files changed, 16 insertions(+), 3 deletions(-) diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c index a205c6b1130f..60654ca28ab8 100644 --- a/hw/vfio/pci.c +++ b/hw/vfio/pci.c @@ -1493,7 +1493,9 @@ static void vfio_msix_early_setup(VFIOPCIDevice *vdev, Error **errp) uint8_t pos; uint16_t ctrl; uint32_t table, pba; - int fd = vdev->vbasedev.fd; + int ret, fd = vdev->vbasedev.fd; + struct vfio_irq_info irq_info = { .argsz = sizeof(irq_info), + .index = VFIO_PCI_MSIX_IRQ_INDEX }; VFIOMSIXInfo *msix; pos = pci_find_capability(&vdev->pdev, PCI_CAP_ID_MSIX); @@ -1530,6 +1532,15 @@ static void vfio_msix_early_setup(VFIOPCIDevice *vdev, Error **errp) msix->pba_offset = pba & ~PCI_MSIX_FLAGS_BIRMASK; msix->entries = (ctrl & PCI_MSIX_FLAGS_QSIZE) + 1; + ret = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_GET_IRQ_INFO, &irq_info); + if (ret < 0) { + error_setg_errno(errp, -ret, "failed to get MSI-X irq info"); + g_free(msix); + return; + } + + msix->noresize = !!(irq_info.flags & VFIO_IRQ_INFO_NORESIZE); + /* * Test the size of the pba_offset variable and catch if it extends outside * of the specified BAR. If it is the case, we need to apply a hardware @@ -1562,7 +1573,8 @@ static void vfio_msix_early_setup(VFIOPCIDevice *vdev, Error **errp) } trace_vfio_msix_early_setup(vdev->vbasedev.name, pos, msix->table_bar, - msix->table_offset, msix->entries); + msix->table_offset, msix->entries, + msix->noresize); vdev->msix = msix; vfio_pci_fixup_msix_region(vdev); diff --git a/hw/vfio/pci.h b/hw/vfio/pci.h index a2771b9ff3cc..0717574d79e9 100644 --- a/hw/vfio/pci.h +++ b/hw/vfio/pci.h @@ -113,6 +113,7 @@ typedef struct VFIOMSIXInfo { uint32_t table_offset; uint32_t pba_offset; unsigned long *pending; + bool noresize; } VFIOMSIXInfo; #define TYPE_VFIO_PCI "vfio-pci" diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events index 81ec7c7a958b..cc7c21365c92 100644 --- a/hw/vfio/trace-events +++ b/hw/vfio/trace-events @@ -27,7 +27,7 @@ vfio_vga_read(uint64_t addr, int size, uint64_t data) " (0x%"PRIx64", %d) = 0x%" vfio_pci_read_config(const char *name, int addr, int len, int val) " (%s, @0x%x, len=0x%x) 0x%x" vfio_pci_write_config(const char *name, int addr, int val, int len) " (%s, @0x%x, 0x%x, len=0x%x)" vfio_msi_setup(const char *name, int pos) "%s PCI MSI CAP @0x%x" -vfio_msix_early_setup(const char *name, int pos, int table_bar, int offset, int entries) "%s PCI MSI-X CAP @0x%x, BAR %d, offset 0x%x, entries %d" +vfio_msix_early_setup(const char *name, int pos, int table_bar, int offset, int entries, bool noresize) "%s PCI MSI-X CAP @0x%x, BAR %d, offset 0x%x, entries %d, noresize %d" vfio_check_pcie_flr(const char *name) "%s Supports FLR via PCIe cap" vfio_check_pm_reset(const char *name) "%s Supports PM reset" vfio_check_af_flr(const char *name) "%s Supports FLR via AF cap" From patchwork Mon Sep 18 09:45:05 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jing Liu X-Patchwork-Id: 1836004 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=intel.com header.i=@intel.com header.a=rsa-sha256 header.s=Intel header.b=HzfkbFFJ; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=patchwork.ozlabs.org) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4Rq0Lf1pmNz1ypc for ; Mon, 18 Sep 2023 19:46:50 +1000 (AEST) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qiAos-0005oi-Ey; Mon, 18 Sep 2023 05:45:30 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qiAop-0005nt-HM for qemu-devel@nongnu.org; Mon, 18 Sep 2023 05:45:28 -0400 Received: from mgamail.intel.com ([134.134.136.100]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qiAof-0000zF-H5 for qemu-devel@nongnu.org; Mon, 18 Sep 2023 05:45:27 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1695030317; x=1726566317; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=sYenLO8qBFQaQjUPMfoV6cxi4vZBlBpvvEUXyjxOwYk=; b=HzfkbFFJtNRozrDD0jEU5hOjs+yJ41FTKBIJDUq/c/A47kLIzhNKPfQq bm+lMW70Qh4Rr9QPTJotVaE01jketAHFCA5zFY9xm4KFAdWssdFGo+cP2 P5+Ci4fM2uezY+bJ9t904DLx9m8IyvAPIU153pl+UPWdJ1Jd/CfeEYb3a VcDaCSAfh3Lpdo78rMuFECQPS8/5M8xItsH57J9JlYaWTAxLbHlc+ZqBk 3v/fleea2ue7Xpq+eGI2Bq3LUSLUMZGMv92HhICFELjpE5EMdVTLAJSN4 QABFRQPf2WeS8Un3rFW2TW9EHa5p+yUTuH8lf20tCF6jBm3qjNVsubBBQ A==; X-IronPort-AV: E=McAfee;i="6600,9927,10836"; a="446072275" X-IronPort-AV: E=Sophos;i="6.02,156,1688454000"; d="scan'208";a="446072275" Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Sep 2023 02:45:16 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10836"; a="835955952" X-IronPort-AV: E=Sophos;i="6.02,156,1688454000"; d="scan'208";a="835955952" Received: from vmmteam.bj.intel.com ([10.240.193.84]) by FMSMGA003.fm.intel.com with ESMTP; 18 Sep 2023 02:45:13 -0700 From: Jing Liu To: qemu-devel@nongnu.org Cc: alex.williamson@redhat.com, clg@redhat.com, pbonzini@redhat.com, kevin.tian@intel.com, reinette.chatre@intel.com, jing2.liu@intel.com, jing2.liu@linux.intel.com Subject: [PATCH v2 2/4] vfio/pci: enable vector on dynamic MSI-X allocation Date: Mon, 18 Sep 2023 05:45:05 -0400 Message-Id: <20230918094507.409050-3-jing2.liu@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20230918094507.409050-1-jing2.liu@intel.com> References: <20230918094507.409050-1-jing2.liu@intel.com> MIME-Version: 1.0 Received-SPF: pass client-ip=134.134.136.100; envelope-from=jing2.liu@intel.com; helo=mgamail.intel.com X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org The vector_use callback is used to enable vector that is unmasked in guest. The kernel used to only support static MSI-X allocation. When allocating a new interrupt using "static MSI-X allocation" kernels, QEMU first disables all previously allocated vectors and then re-allocates all including the new one. The nr_vectors of VFIOPCIDevice indicates that all vectors from 0 to nr_vectors are allocated (and may be enabled), which is used to to loop all the possibly used vectors When, e.g., disabling MSI-X interrupts. Extend the vector_use function to support dynamic MSI-X allocation when host supports the capability. QEMU therefore can individually allocate and enable a new interrupt without affecting others or causing interrupts lost during runtime. Utilize nr_vectors to calculate the upper bound of enabled vectors in dynamic MSI-X allocation mode since looping all msix_entries_nr is not efficient and unnecessary. Signed-off-by: Jing Liu Tested-by: Reinette Chatre --- Changes since v1: - Revise Qemu to QEMU. Changes since RFC v1: - Test vdev->msix->noresize to identify the allocation mode. (Alex) - Move defer_kvm_irq_routing test out and update nr_vectors in a common place before vfio_enable_vectors(). (Alex) - Revise the comments. (Alex) --- hw/vfio/pci.c | 44 +++++++++++++++++++++++++++----------------- 1 file changed, 27 insertions(+), 17 deletions(-) diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c index 60654ca28ab8..84987e46fd7a 100644 --- a/hw/vfio/pci.c +++ b/hw/vfio/pci.c @@ -470,6 +470,7 @@ static int vfio_msix_vector_do_use(PCIDevice *pdev, unsigned int nr, VFIOPCIDevice *vdev = VFIO_PCI(pdev); VFIOMSIVector *vector; int ret; + int old_nr_vecs = vdev->nr_vectors; trace_vfio_msix_vector_do_use(vdev->vbasedev.name, nr); @@ -512,33 +513,42 @@ static int vfio_msix_vector_do_use(PCIDevice *pdev, unsigned int nr, } /* - * We don't want to have the host allocate all possible MSI vectors - * for a device if they're not in use, so we shutdown and incrementally - * increase them as needed. + * When dynamic allocation is not supported, we don't want to have the + * host allocate all possible MSI vectors for a device if they're not + * in use, so we shutdown and incrementally increase them as needed. + * nr_vectors represents the total number of vectors allocated. + * + * When dynamic allocation is supported, let the host only allocate + * and enable a vector when it is in use in guest. nr_vectors represents + * the upper bound of vectors being enabled (but not all of the ranges + * is allocated or enabled). */ if (vdev->nr_vectors < nr + 1) { vdev->nr_vectors = nr + 1; - if (!vdev->defer_kvm_irq_routing) { + } + + if (!vdev->defer_kvm_irq_routing) { + if (vdev->msix->noresize && (old_nr_vecs < nr + 1)) { vfio_disable_irqindex(&vdev->vbasedev, VFIO_PCI_MSIX_IRQ_INDEX); ret = vfio_enable_vectors(vdev, true); if (ret) { error_report("vfio: failed to enable vectors, %d", ret); } - } - } else { - Error *err = NULL; - int32_t fd; - - if (vector->virq >= 0) { - fd = event_notifier_get_fd(&vector->kvm_interrupt); } else { - fd = event_notifier_get_fd(&vector->interrupt); - } + Error *err = NULL; + int32_t fd; - if (vfio_set_irq_signaling(&vdev->vbasedev, - VFIO_PCI_MSIX_IRQ_INDEX, nr, - VFIO_IRQ_SET_ACTION_TRIGGER, fd, &err)) { - error_reportf_err(err, VFIO_MSG_PREFIX, vdev->vbasedev.name); + if (vector->virq >= 0) { + fd = event_notifier_get_fd(&vector->kvm_interrupt); + } else { + fd = event_notifier_get_fd(&vector->interrupt); + } + + if (vfio_set_irq_signaling(&vdev->vbasedev, + VFIO_PCI_MSIX_IRQ_INDEX, nr, + VFIO_IRQ_SET_ACTION_TRIGGER, fd, &err)) { + error_reportf_err(err, VFIO_MSG_PREFIX, vdev->vbasedev.name); + } } } From patchwork Mon Sep 18 09:45:06 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Jing Liu X-Patchwork-Id: 1836003 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=intel.com header.i=@intel.com header.a=rsa-sha256 header.s=Intel header.b=ITf4tbG5; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=patchwork.ozlabs.org) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4Rq0Lc2GP4z1ync for ; Mon, 18 Sep 2023 19:46:48 +1000 (AEST) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qiAot-0005os-JR; Mon, 18 Sep 2023 05:45:31 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qiAor-0005oH-AM for qemu-devel@nongnu.org; Mon, 18 Sep 2023 05:45:29 -0400 Received: from mgamail.intel.com ([134.134.136.100]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qiAop-0000xj-Gj for qemu-devel@nongnu.org; Mon, 18 Sep 2023 05:45:29 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1695030327; x=1726566327; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=K9qjTjP69wy9/oxYf5DYvTpAECxOt0osGbsO6aI872Y=; b=ITf4tbG5X115TDrDbWrvBCfX5X8OPSfpms3DQ1NaPd0QADOBYkUay8bJ 5zYVAuNT4yAIM8Tv/pH39nTD9CGPAB6a+wjd12oAx6IVqMtiHkRv1ngH6 90uvnwe0z6ZBobqCmTh5NnrEPvL1DLOAEkHKGc+HjpkEwxjd+QrAoUJO5 IWySfSLdfpJAC3SIW8Wv5AESNkaxvg97K1QfAlbZe8EedBkYe2E3zUJMP nd1CmRqtOPzaI6Z+kyys92eOJya8+eETrytDXwxO1EgxRLxDFlnuwueNG B8VKgtP7u9HvJPSpSJW9ar7BmqjMal7Wn2fs9fgZ0gFysZ1x1NFNelZCz Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10836"; a="446072279" X-IronPort-AV: E=Sophos;i="6.02,156,1688454000"; d="scan'208";a="446072279" Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Sep 2023 02:45:18 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10836"; a="835955956" X-IronPort-AV: E=Sophos;i="6.02,156,1688454000"; d="scan'208";a="835955956" Received: from vmmteam.bj.intel.com ([10.240.193.84]) by FMSMGA003.fm.intel.com with ESMTP; 18 Sep 2023 02:45:16 -0700 From: Jing Liu To: qemu-devel@nongnu.org Cc: alex.williamson@redhat.com, clg@redhat.com, pbonzini@redhat.com, kevin.tian@intel.com, reinette.chatre@intel.com, jing2.liu@intel.com, jing2.liu@linux.intel.com Subject: [PATCH v2 3/4] vfio/pci: use an invalid fd to enable MSI-X Date: Mon, 18 Sep 2023 05:45:06 -0400 Message-Id: <20230918094507.409050-4-jing2.liu@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20230918094507.409050-1-jing2.liu@intel.com> References: <20230918094507.409050-1-jing2.liu@intel.com> MIME-Version: 1.0 Received-SPF: pass client-ip=134.134.136.100; envelope-from=jing2.liu@intel.com; helo=mgamail.intel.com X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Guests typically enable MSI-X with all of the vectors masked in the MSI-X vector table. To match the guest state of device, QEMU enables MSI-X by enabling vector 0 with userspace triggering and immediately release. However the release function actually does not release it due to already using userspace mode. It is no need to enable triggering on host and rely on the mask bit to avoid spurious interrupts. Use an invalid fd (i.e. fd = -1) is enough to get MSI-X enabled. After dynamic MSI-X allocation is supported, the interrupt restoring also need use such way to enable MSI-X, therefore, create a function for that. Suggested-by: Alex Williamson Signed-off-by: Jing Liu Reviewed-by: Cédric Le Goater --- Changes since v1: - Revise Qemu to QEMU. (Cédric) - Use g_autofree to automatically release. (Cédric) - Just return 'ret' and let the caller of vfio_enable_msix_no_vec() report the error. (Cédric) Changes since RFC v1: - A new patch. Use an invalid fd to get MSI-X enabled instead of using userspace triggering. (Alex) --- hw/vfio/pci.c | 44 ++++++++++++++++++++++++++++++++++++-------- 1 file changed, 36 insertions(+), 8 deletions(-) diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c index 84987e46fd7a..0117f230e934 100644 --- a/hw/vfio/pci.c +++ b/hw/vfio/pci.c @@ -369,6 +369,33 @@ static void vfio_msi_interrupt(void *opaque) notify(&vdev->pdev, nr); } +/* + * Get MSI-X enabled, but no vector enabled, by setting vector 0 with an invalid + * fd to kernel. + */ +static int vfio_enable_msix_no_vec(VFIOPCIDevice *vdev) +{ + g_autofree struct vfio_irq_set *irq_set = NULL; + int ret = 0, argsz; + int32_t *fd; + + argsz = sizeof(*irq_set) + sizeof(*fd); + + irq_set = g_malloc0(argsz); + irq_set->argsz = argsz; + irq_set->flags = VFIO_IRQ_SET_DATA_EVENTFD | + VFIO_IRQ_SET_ACTION_TRIGGER; + irq_set->index = VFIO_PCI_MSIX_IRQ_INDEX; + irq_set->start = 0; + irq_set->count = 1; + fd = (int32_t *)&irq_set->data; + *fd = -1; + + ret = ioctl(vdev->vbasedev.fd, VFIO_DEVICE_SET_IRQS, irq_set); + + return ret; +} + static int vfio_enable_vectors(VFIOPCIDevice *vdev, bool msix) { struct vfio_irq_set *irq_set; @@ -618,6 +645,8 @@ static void vfio_commit_kvm_msi_virq_batch(VFIOPCIDevice *vdev) static void vfio_msix_enable(VFIOPCIDevice *vdev) { + int ret; + vfio_disable_interrupts(vdev); vdev->msi_vectors = g_new0(VFIOMSIVector, vdev->msix->entries); @@ -640,8 +669,6 @@ static void vfio_msix_enable(VFIOPCIDevice *vdev) vfio_commit_kvm_msi_virq_batch(vdev); if (vdev->nr_vectors) { - int ret; - ret = vfio_enable_vectors(vdev, true); if (ret) { error_report("vfio: failed to enable vectors, %d", ret); @@ -655,13 +682,14 @@ static void vfio_msix_enable(VFIOPCIDevice *vdev) * MSI-X capability, but leaves the vector table masked. We therefore * can't rely on a vector_use callback (from request_irq() in the guest) * to switch the physical device into MSI-X mode because that may come a - * long time after pci_enable_msix(). This code enables vector 0 with - * triggering to userspace, then immediately release the vector, leaving - * the physical device with no vectors enabled, but MSI-X enabled, just - * like the guest view. + * long time after pci_enable_msix(). This code sets vector 0 with an + * invalid fd to make the physical device MSI-X enabled, but with no + * vectors enabled, just like the guest view. */ - vfio_msix_vector_do_use(&vdev->pdev, 0, NULL, NULL); - vfio_msix_vector_release(&vdev->pdev, 0); + ret = vfio_enable_msix_no_vec(vdev); + if (ret) { + error_report("vfio: failed to enable MSI-X, %d", ret); + } } trace_vfio_msix_enable(vdev->vbasedev.name); From patchwork Mon Sep 18 09:45:07 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Jing Liu X-Patchwork-Id: 1836002 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=intel.com header.i=@intel.com header.a=rsa-sha256 header.s=Intel header.b=fIsb/RJp; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=patchwork.ozlabs.org) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4Rq0L94m9hz1ypc for ; Mon, 18 Sep 2023 19:46:24 +1000 (AEST) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qiAou-0005p0-28; Mon, 18 Sep 2023 05:45:32 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qiAos-0005oU-2P for qemu-devel@nongnu.org; Mon, 18 Sep 2023 05:45:30 -0400 Received: from mgamail.intel.com ([134.134.136.100]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qiAop-0000zF-US for qemu-devel@nongnu.org; Mon, 18 Sep 2023 05:45:29 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1695030327; x=1726566327; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=KqrlI41bT36v2JNI8frGPTvfWv6nYWbfQq51fwpgcGI=; b=fIsb/RJplL8cSG9ePCTqpWGbjE4v/xvM15wwvA5U4uPjnJn0FS+YgmaH RYHrG6DwlNKgUtWo5VeHOQigONWK81vmG+exD03BPq44rMBbwTPASDi3L rZUXxdtzMan2WVJksYH+h1gvpAxx9IoO0kiwBKAFtiHod/ehEPoWTAt0P L3f6mNGCo4yD0mhr72ByhMJMd6nOrw4IpzFGTaqd+0vbgEvLhmU576kEv IKx+kli3RV8f8iirVsHz3+9S10iiomkOyjDIZqjcd9ckp0NgcpDDU/cPf H5XJItYHdATbr/Id1QJRZgFEblvRaINyiBSYa/2UZymj8OG3HOR5oKBF0 Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10836"; a="446072281" X-IronPort-AV: E=Sophos;i="6.02,156,1688454000"; d="scan'208";a="446072281" Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Sep 2023 02:45:20 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10836"; a="835955962" X-IronPort-AV: E=Sophos;i="6.02,156,1688454000"; d="scan'208";a="835955962" Received: from vmmteam.bj.intel.com ([10.240.193.84]) by FMSMGA003.fm.intel.com with ESMTP; 18 Sep 2023 02:45:18 -0700 From: Jing Liu To: qemu-devel@nongnu.org Cc: alex.williamson@redhat.com, clg@redhat.com, pbonzini@redhat.com, kevin.tian@intel.com, reinette.chatre@intel.com, jing2.liu@intel.com, jing2.liu@linux.intel.com Subject: [PATCH v2 4/4] vfio/pci: enable MSI-X in interrupt restoring on dynamic allocation Date: Mon, 18 Sep 2023 05:45:07 -0400 Message-Id: <20230918094507.409050-5-jing2.liu@intel.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20230918094507.409050-1-jing2.liu@intel.com> References: <20230918094507.409050-1-jing2.liu@intel.com> MIME-Version: 1.0 Received-SPF: pass client-ip=134.134.136.100; envelope-from=jing2.liu@intel.com; helo=mgamail.intel.com X-Spam_score_int: -43 X-Spam_score: -4.4 X-Spam_bar: ---- X-Spam_report: (-4.4 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_MED=-2.3, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org During migration restoring, vfio_enable_vectors() is called to restore enabling MSI-X interrupts for assigned devices. It sets the range from 0 to nr_vectors to kernel to enable MSI-X and the vectors unmasked in guest. During the MSI-X enabling, all the vectors within the range are allocated according to the VFIO_DEVICE_SET_IRQS ioctl. When dynamic MSI-X allocation is supported, we only want the guest unmasked vectors being allocated and enabled. Use vector 0 with an invalid fd to get MSI-X enabled, after that, all the vectors can be allocated in need. Signed-off-by: Jing Liu Reviewed-by: Cédric Le Goater --- Changes since v1: - No change. Changes since RFC v1: - Revise the comments. (Alex) - Call the new helper function in previous patch to enable MSI-X. (Alex) --- hw/vfio/pci.c | 17 +++++++++++++++++ 1 file changed, 17 insertions(+) diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c index 0117f230e934..f5f891dc0792 100644 --- a/hw/vfio/pci.c +++ b/hw/vfio/pci.c @@ -402,6 +402,23 @@ static int vfio_enable_vectors(VFIOPCIDevice *vdev, bool msix) int ret = 0, i, argsz; int32_t *fds; + /* + * If dynamic MSI-X allocation is supported, the vectors to be allocated + * and enabled can be scattered. Before kernel enabling MSI-X, setting + * nr_vectors causes all these vectors to be allocated on host. + * + * To keep allocation as needed, use vector 0 with an invalid fd to get + * MSI-X enabled first, then set vectors with a potentially sparse set of + * eventfds to enable interrupts only when enabled in guest. + */ + if (msix && !vdev->msix->noresize) { + ret = vfio_enable_msix_no_vec(vdev); + + if (ret) { + return ret; + } + } + argsz = sizeof(*irq_set) + (vdev->nr_vectors * sizeof(*fds)); irq_set = g_malloc0(argsz);