From patchwork Mon Oct 5 20:37:25 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alex Williamson X-Patchwork-Id: 526545 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 1541C1400CB for ; Tue, 6 Oct 2015 07:38:13 +1100 (AEDT) Received: from localhost ([::1]:47621 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1ZjCWQ-00068O-NF for incoming@patchwork.ozlabs.org; Mon, 05 Oct 2015 16:38:10 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:39647) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1ZjCVl-0004rt-7A for qemu-devel@nongnu.org; Mon, 05 Oct 2015 16:37:32 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1ZjCVj-00017m-N1 for qemu-devel@nongnu.org; Mon, 05 Oct 2015 16:37:29 -0400 Received: from mx1.redhat.com ([209.132.183.28]:49153) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1ZjCVj-00016o-Fx for qemu-devel@nongnu.org; Mon, 05 Oct 2015 16:37:27 -0400 Received: from int-mx11.intmail.prod.int.phx2.redhat.com (int-mx11.intmail.prod.int.phx2.redhat.com [10.5.11.24]) by mx1.redhat.com (Postfix) with ESMTPS id 2C307461D6; Mon, 5 Oct 2015 20:37:27 +0000 (UTC) Received: from gimli.home (ovpn-113-42.phx2.redhat.com [10.3.113.42]) by int-mx11.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id t95KbQBN015632; Mon, 5 Oct 2015 16:37:26 -0400 From: Alex Williamson To: qemu-devel@nongnu.org Date: Mon, 05 Oct 2015 14:37:25 -0600 Message-ID: <20151005203722.310.84234.stgit@gimli.home> In-Reply-To: <20151005203357.310.44414.stgit@gimli.home> References: <20151005203357.310.44414.stgit@gimli.home> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.68 on 10.5.11.24 X-detected-operating-system: by eggs.gnu.org: GNU/Linux 3.x X-Received-From: 209.132.183.28 Cc: Laurent Vivier , David Gibson Subject: [Qemu-devel] [PULL 06/10] vfio: Check guest IOVA ranges against host IOMMU capabilities X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org From: David Gibson The current vfio core code assumes that the host IOMMU is capable of mapping any IOVA the guest wants to use to where we need. However, real IOMMUs generally only support translating a certain range of IOVAs (the "DMA window") not a full 64-bit address space. The common x86 IOMMUs support a wide enough range that guests are very unlikely to go beyond it in practice, however the IOMMU used on IBM Power machines - in the default configuration - supports only a much more limited IOVA range, usually 0..2GiB. If the guest attempts to set up an IOVA range that the host IOMMU can't map, qemu won't report an error until it actually attempts to map a bad IOVA. If guest RAM is being mapped directly into the IOMMU (i.e. no guest visible IOMMU) then this will show up very quickly. If there is a guest visible IOMMU, however, the problem might not show up until much later when the guest actually attempt to DMA with an IOVA the host can't handle. This patch adds a test so that we will detect earlier if the guest is attempting to use IOVA ranges that the host IOMMU won't be able to deal with. For now, we assume that "Type1" (x86) IOMMUs can support any IOVA, this is incorrect, but no worse than what we have already. We can't do better for now because the Type1 kernel interface doesn't tell us what IOVA range the IOMMU actually supports. For the Power "sPAPR TCE" IOMMU, however, we can retrieve the supported IOVA range and validate guest IOVA ranges against it, and this patch does so. Signed-off-by: David Gibson Reviewed-by: Laurent Vivier Signed-off-by: Alex Williamson --- hw/vfio/common.c | 40 +++++++++++++++++++++++++++++++++++++--- include/hw/vfio/vfio-common.h | 6 ++++++ 2 files changed, 43 insertions(+), 3 deletions(-) diff --git a/hw/vfio/common.c b/hw/vfio/common.c index 95a4850..2faf492 100644 --- a/hw/vfio/common.c +++ b/hw/vfio/common.c @@ -343,14 +343,22 @@ static void vfio_listener_region_add(MemoryListener *listener, if (int128_ge(int128_make64(iova), llend)) { return; } + end = int128_get64(llend); + + if ((iova < container->min_iova) || ((end - 1) > container->max_iova)) { + error_report("vfio: IOMMU container %p can't map guest IOVA region" + " 0x%"HWADDR_PRIx"..0x%"HWADDR_PRIx, + container, iova, end - 1); + ret = -EFAULT; + goto fail; + } memory_region_ref(section->mr); if (memory_region_is_iommu(section->mr)) { VFIOGuestIOMMU *giommu; - trace_vfio_listener_region_add_iommu(iova, - int128_get64(int128_sub(llend, int128_one()))); + trace_vfio_listener_region_add_iommu(iova, end - 1); /* * FIXME: We should do some checking to see if the * capabilities of the host VFIO IOMMU are adequate to model @@ -387,7 +395,6 @@ static void vfio_listener_region_add(MemoryListener *listener, /* Here we assume that memory_region_is_ram(section->mr)==true */ - end = int128_get64(llend); vaddr = memory_region_get_ram_ptr(section->mr) + section->offset_within_region + (iova - section->offset_within_address_space); @@ -685,7 +692,19 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as) ret = -errno; goto free_container_exit; } + + /* + * FIXME: This assumes that a Type1 IOMMU can map any 64-bit + * IOVA whatsoever. That's not actually true, but the current + * kernel interface doesn't tell us what it can map, and the + * existing Type1 IOMMUs generally support any IOVA we're + * going to actually try in practice. + */ + container->min_iova = 0; + container->max_iova = (hwaddr)-1; } else if (ioctl(fd, VFIO_CHECK_EXTENSION, VFIO_SPAPR_TCE_IOMMU)) { + struct vfio_iommu_spapr_tce_info info; + ret = ioctl(group->fd, VFIO_GROUP_SET_CONTAINER, &fd); if (ret) { error_report("vfio: failed to set group container: %m"); @@ -710,6 +729,21 @@ static int vfio_connect_container(VFIOGroup *group, AddressSpace *as) ret = -errno; goto free_container_exit; } + + /* + * This only considers the host IOMMU's 32-bit window. At + * some point we need to add support for the optional 64-bit + * window and dynamic windows + */ + info.argsz = sizeof(info); + ret = ioctl(fd, VFIO_IOMMU_SPAPR_TCE_GET_INFO, &info); + if (ret) { + error_report("vfio: VFIO_IOMMU_SPAPR_TCE_GET_INFO failed: %m"); + ret = -errno; + goto free_container_exit; + } + container->min_iova = info.dma32_window_start; + container->max_iova = container->min_iova + info.dma32_window_size - 1; } else { error_report("vfio: No available IOMMU models"); ret = -EINVAL; diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h index fbbe6de..27a14c0 100644 --- a/include/hw/vfio/vfio-common.h +++ b/include/hw/vfio/vfio-common.h @@ -65,6 +65,12 @@ typedef struct VFIOContainer { MemoryListener listener; int error; bool initialized; + /* + * This assumes the host IOMMU can support only a single + * contiguous IOVA window. We may need to generalize that in + * future + */ + hwaddr min_iova, max_iova; QLIST_HEAD(, VFIOGuestIOMMU) giommu_list; QLIST_HEAD(, VFIOGroup) group_list; QLIST_ENTRY(VFIOContainer) next;