From patchwork Thu Sep 28 09:16:12 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alexey Kardashevskiy
X-Patchwork-Id: 819447
Return-Path:
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Authentication-Results: ozlabs.org; spf=none (mailfrom)
	smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67;
	helo=vger.kernel.org;
	envelope-from=kvm-ppc-owner@vger.kernel.org; receiver=)
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by ozlabs.org (Postfix) with ESMTP id 3y2pw41brvz9t30
	for ; Thu, 28 Sep 2017 19:16:20 +1000 (AEST)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1750819AbdI1JQT (ORCPT ); Thu, 28 Sep 2017 05:16:19 -0400
Received: from ozlabs.ru ([107.173.13.209]:42332 "EHLO ozlabs.ru"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1750759AbdI1JQS (ORCPT ); Thu, 28 Sep 2017 05:16:18 -0400
Received: from vpl1.ozlabs.ibm.com (localhost [IPv6:::1])
	by ozlabs.ru (Postfix) with ESMTP id A356F3A60015;
	Thu, 28 Sep 2017 05:17:35 -0400 (EDT)
From: Alexey Kardashevskiy
To: linuxppc-dev@lists.ozlabs.org
Cc: Alexey Kardashevskiy, David Gibson, kvm-ppc@vger.kernel.org,
	kvm@vger.kernel.org, Alex Williamson, Nicholas Piggin
Subject: [PATCH kernel v3] vfio/spapr: Add cond_resched() for huge updates
Date: Thu, 28 Sep 2017 19:16:12 +1000
Message-Id: <20170928091612.20717-1-aik@ozlabs.ru>
X-Mailer: git-send-email 2.11.0
Sender: kvm-ppc-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: kvm-ppc@vger.kernel.org

Clearing very big IOMMU tables can trigger soft lockups. This adds
cond_resched() to allow the scheduler to do context switching when
it decides to.

Signed-off-by: Alexey Kardashevskiy
Reviewed-by: David Gibson
---

The testcase is a POWER9 box with a 264GB guest, 4 VFIO devices from
independent IOMMU groups, 64K IOMMU pages.
This configuration produces 4325376 TCE entries, and each entry update
incurs 4 OPAL calls to update an individual PE TCE cache; this produced
lockups of more than 20s. Reducing the table size to 4194304 entries
(i.e. a 256GB guest) or removing one of the 4 VFIO devices makes the
problem go away.

---
Changes:
v3:
* cond_resched() checks for should_resched() so we just call
cond_resched() and let the cpu scheduler decide whether to switch or not

v2:
* replaced with time based solution
---
 drivers/vfio/vfio_iommu_spapr_tce.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/vfio/vfio_iommu_spapr_tce.c b/drivers/vfio/vfio_iommu_spapr_tce.c
index 63112c36ab2d..759a5bdd40e1 100644
--- a/drivers/vfio/vfio_iommu_spapr_tce.c
+++ b/drivers/vfio/vfio_iommu_spapr_tce.c
@@ -507,6 +507,8 @@ static int tce_iommu_clear(struct tce_container *container,
 	enum dma_data_direction direction;
 
 	for ( ; pages; --pages, ++entry) {
+		cond_resched();
+
 		direction = DMA_NONE;
 		oldhpa = 0;
 		ret = iommu_tce_xchg(tbl, entry, &oldhpa, &direction);
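
The table sizes quoted in the test notes can be reproduced with a little arithmetic. The following is a minimal userspace sketch, not kernel code: the helper names tce_entries() and opal_calls_per_clear() are hypothetical, and it assumes "GB" and "64K" mean binary units (2^30 and 2^10 bytes) and that every entry costs the same fixed number of firmware calls.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of TCE entries needed to map guest_bytes
 * of guest RAM with the given IOMMU page size (one entry per page).
 */
static uint64_t tce_entries(uint64_t guest_bytes, uint64_t iommu_page_bytes)
{
	assert(iommu_page_bytes != 0);
	/* Round up so a partial trailing page still gets an entry. */
	return (guest_bytes + iommu_page_bytes - 1) / iommu_page_bytes;
}

/*
 * Hypothetical helper: firmware calls issued by one full table clear,
 * assuming a fixed number of calls per entry update.
 */
static uint64_t opal_calls_per_clear(uint64_t entries, uint64_t calls_per_entry)
{
	return entries * calls_per_entry;
}
```

Under these assumptions, tce_entries(264ULL << 30, 64ULL << 10) gives 4325376 and tce_entries(256ULL << 30, 64ULL << 10) gives 4194304, matching the numbers above. With 4 OPAL calls per entry, one clear of the larger table issues 17301504 calls in a single loop, which is why the loop needs a rescheduling point.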