From patchwork Fri Jun 8 05:46:27 2018
From: Alexey Kardashevskiy
To: linuxppc-dev@lists.ozlabs.org
Cc: Alexey Kardashevskiy, David Gibson, kvm-ppc@vger.kernel.org, kvm@vger.kernel.org, Alex Williamson, Benjamin Herrenschmidt
Subject: [PATCH kernel 0/6] powerpc/powernv/iommu: Optimize memory use
Date: Fri, 8 Jun 2018 15:46:27 +1000
Message-Id: <20180608054633.18659-1-aik@ozlabs.ru>
X-Mailing-List: kvm-ppc@vger.kernel.org

This patchset aims to reduce actual memory use for guests with sparse memory. The pseries guest uses dynamic DMA windows to map the entire guest RAM, but it only actually maps onlined memory, which may not be contiguous.
I hit this when I tried passing through the NVLink2-connected GPU RAM of an NVIDIA V100: having to map this RAM at the same offset as on the real hardware forced me to rework how I handle these windows.

This moves the userspace-to-host-physical translation table (iommu_table::it_userspace) from the VFIO TCE IOMMU subdriver to the platform code and reuses the multilevel TCE table code which we already have for the hardware tables. Finally, 6/6 switches to on-demand allocation so we do not allocate huge chunks of the table if we do not have to; there is some math in 6/6.

Please comment. Thanks.

Alexey Kardashevskiy (6):
  powerpc/powernv: Remove useless wrapper
  powerpc/powernv: Move TCE manipulation code to its own file
  KVM: PPC: Make iommu_table::it_userspace big endian
  powerpc/powernv: Add indirect levels to it_userspace
  powerpc/powernv: Rework TCE level allocation
  powerpc/powernv/ioda: Allocate indirect TCE levels on demand

 arch/powerpc/platforms/powernv/Makefile       |   2 +-
 arch/powerpc/include/asm/iommu.h              |  11 +-
 arch/powerpc/platforms/powernv/pci.h          |  44 ++-
 arch/powerpc/kvm/book3s_64_vio.c              |  11 +-
 arch/powerpc/kvm/book3s_64_vio_hv.c           |  18 +-
 arch/powerpc/platforms/powernv/pci-ioda-tce.c | 395 ++++++++++++++++++++++++++
 arch/powerpc/platforms/powernv/pci-ioda.c     | 192 ++-----------
 arch/powerpc/platforms/powernv/pci.c          | 158 -----------
 drivers/vfio/vfio_iommu_spapr_tce.c           |  65 +----
 9 files changed, 482 insertions(+), 414 deletions(-)
 create mode 100644 arch/powerpc/platforms/powernv/pci-ioda-tce.c