From patchwork Fri Jul 22 12:57:28 2016
From: Nicholas Piggin <npiggin@gmail.com>
To: linuxppc-dev@lists.ozlabs.org
Cc: Anton Blanchard, Nicholas Piggin
Subject: [PATCH] powerpc/64: implement a slice mask cache
Date: Fri, 22 Jul 2016 22:57:28 +1000
Message-Id: <1469192248-25141-1-git-send-email-npiggin@gmail.com>
X-Patchwork-Id: 651653
Calculating the slice mask can become a significant overhead for
get_unmapped_area. The mask is relatively small and does not change
frequently, so we can cache it in the mm context. This saves about 30%
of kernel time on a 4K user address allocation in a microbenchmark.

Comments on the approach taken? I think there is the option for fixed
allocations to avoid some of the slice calculation entirely, but first
I think it will be good to have a general speedup that covers all
mmaps.

Cc: Benjamin Herrenschmidt
Cc: Anton Blanchard
---
 arch/powerpc/include/asm/book3s/64/mmu.h |  8 +++++++
 arch/powerpc/mm/slice.c                  | 39 ++++++++++++++++++++++++++++++--
 2 files changed, 45 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/mmu.h b/arch/powerpc/include/asm/book3s/64/mmu.h
index 5854263..0d15af4 100644
--- a/arch/powerpc/include/asm/book3s/64/mmu.h
+++ b/arch/powerpc/include/asm/book3s/64/mmu.h
@@ -71,6 +71,14 @@ typedef struct {
 #ifdef CONFIG_PPC_MM_SLICES
 	u64 low_slices_psize;	/* SLB page size encodings */
 	unsigned char high_slices_psize[SLICE_ARRAY_SIZE];
+	struct slice_mask mask_4k;
+# ifdef CONFIG_PPC_64K_PAGES
+	struct slice_mask mask_64k;
+# endif
+# ifdef CONFIG_HUGETLB_PAGE
+	struct slice_mask mask_16m;
+	struct slice_mask mask_16g;
+# endif
 #else
 	u16 sllp;		/* SLB page size encoding */
 #endif
diff --git a/arch/powerpc/mm/slice.c b/arch/powerpc/mm/slice.c
index 2b27458..559ea5f 100644
--- a/arch/powerpc/mm/slice.c
+++ b/arch/powerpc/mm/slice.c
@@ -147,7 +147,7 @@ static struct slice_mask slice_mask_for_free(struct mm_struct *mm)
 	return ret;
 }
 
-static struct slice_mask slice_mask_for_size(struct mm_struct *mm, int psize)
+static struct slice_mask calc_slice_mask_for_size(struct mm_struct *mm, int psize)
 {
 	unsigned char *hpsizes;
 	int index, mask_index;
@@ -171,6 +171,36 @@ static struct slice_mask slice_mask_for_size(struct mm_struct *mm, int psize)
 	return ret;
 }
 
+static void recalc_slice_mask_cache(struct mm_struct *mm)
+{
+	mm->context.mask_4k = calc_slice_mask_for_size(mm, MMU_PAGE_4K);
+#ifdef CONFIG_PPC_64K_PAGES
+	mm->context.mask_64k = calc_slice_mask_for_size(mm, MMU_PAGE_64K);
+#endif
+# ifdef CONFIG_HUGETLB_PAGE
+	/* Radix does not come here */
+	mm->context.mask_16m = calc_slice_mask_for_size(mm, MMU_PAGE_16M);
+	mm->context.mask_16g = calc_slice_mask_for_size(mm, MMU_PAGE_16G);
+# endif
+}
+
+static struct slice_mask slice_mask_for_size(struct mm_struct *mm, int psize)
+{
+	if (psize == MMU_PAGE_4K)
+		return mm->context.mask_4k;
+#ifdef CONFIG_PPC_64K_PAGES
+	if (psize == MMU_PAGE_64K)
+		return mm->context.mask_64k;
+#endif
+# ifdef CONFIG_HUGETLB_PAGE
+	if (psize == MMU_PAGE_16M)
+		return mm->context.mask_16m;
+	if (psize == MMU_PAGE_16G)
+		return mm->context.mask_16g;
+# endif
+	BUG();
+}
+
 static int slice_check_fit(struct slice_mask mask, struct slice_mask available)
 {
 	return (mask.low_slices & available.low_slices) == mask.low_slices &&
@@ -233,6 +263,8 @@ static void slice_convert(struct mm_struct *mm, struct slice_mask mask, int psiz
 
 	spin_unlock_irqrestore(&slice_convert_lock, flags);
 
+	recalc_slice_mask_cache(mm);
+
 	copro_flush_all_slbs(mm);
 }
 
@@ -625,7 +657,7 @@ void slice_set_user_psize(struct mm_struct *mm, unsigned int psize)
 		goto bail;
 
 	mm->context.user_psize = psize;
-	wmb();
+	wmb(); /* Why? */
 
 	lpsizes = mm->context.low_slices_psize;
 	for (i = 0; i < SLICE_NUM_LOW; i++)
@@ -652,6 +684,9 @@ void slice_set_user_psize(struct mm_struct *mm, unsigned int psize)
 		  mm->context.low_slices_psize,
 		  mm->context.high_slices_psize);
 
+	spin_unlock_irqrestore(&slice_convert_lock, flags);
+	recalc_slice_mask_cache(mm);
+	return;
  bail:
 	spin_unlock_irqrestore(&slice_convert_lock, flags);
 }
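
For anyone following along outside arch/powerpc, the pattern is small
enough to model in isolation. Below is a minimal, self-contained
userspace sketch of the recalculate-on-convert / read-on-lookup flow
the patch introduces: the struct layout, slice counts, and psize
constants here are simplified stand-ins, not the kernel's definitions.
The point it illustrates is that the per-slice walk is paid once per
(rare) conversion instead of on every (frequent) get_unmapped_area call.

/*
 * Minimal userspace sketch of the caching pattern used by the patch.
 * Types, slice counts, and psize constants are simplified stand-ins;
 * only the recalculate-on-convert / read-on-lookup flow mirrors it.
 */
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

enum { MMU_PAGE_4K, MMU_PAGE_64K, MMU_PAGE_COUNT };

struct slice_mask {
	uint16_t low_slices;
	uint64_t high_slices;
};

struct mm_context {
	/* one page-size encoding per slice, cf. low/high_slices_psize */
	int slice_psize[16 + 64];
	/* the cache: one precomputed mask per supported page size */
	struct slice_mask mask[MMU_PAGE_COUNT];
};

/* The slow path: walk every slice (the work the cache avoids). */
static struct slice_mask calc_slice_mask_for_size(struct mm_context *ctx,
						  int psize)
{
	struct slice_mask m = { 0, 0 };
	int i;

	for (i = 0; i < 16; i++)
		if (ctx->slice_psize[i] == psize)
			m.low_slices |= 1u << i;
	for (i = 0; i < 64; i++)
		if (ctx->slice_psize[16 + i] == psize)
			m.high_slices |= 1ull << i;
	return m;
}

/* Called whenever slice_psize[] changes, cf. slice_convert(). */
static void recalc_slice_mask_cache(struct mm_context *ctx)
{
	int psize;

	for (psize = 0; psize < MMU_PAGE_COUNT; psize++)
		ctx->mask[psize] = calc_slice_mask_for_size(ctx, psize);
}

/* The fast path hit by every get_unmapped_area call. */
static struct slice_mask slice_mask_for_size(struct mm_context *ctx, int psize)
{
	assert(psize >= 0 && psize < MMU_PAGE_COUNT);
	return ctx->mask[psize];
}

int main(void)
{
	struct mm_context ctx = { { 0 } };

	ctx.slice_psize[0] = MMU_PAGE_64K;
	recalc_slice_mask_cache(&ctx);	/* pay the per-slice walk once... */
	/* ...then every lookup is a plain struct copy, no loop */
	printf("64K low_slices: %#x\n",
	       slice_mask_for_size(&ctx, MMU_PAGE_64K).low_slices);
	return 0;
}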