From patchwork Mon May 27 13:29:59 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christophe Leroy X-Patchwork-Id: 1939943 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ozlabs.org (client-ip=112.213.38.117; helo=lists.ozlabs.org; envelope-from=linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org; receiver=patchwork.ozlabs.org) Received: from lists.ozlabs.org (lists.ozlabs.org [112.213.38.117]) by legolas.ozlabs.org (Postfix) with ESMTP id 4VnxWX3XcMz20PT for ; Mon, 27 May 2024 23:36:40 +1000 (AEST) Received: from boromir.ozlabs.org (localhost [IPv6:::1]) by lists.ozlabs.org (Postfix) with ESMTP id 4VnxNp033Vz3wW2 for ; Mon, 27 May 2024 23:30:50 +1000 (AEST) X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: lists.ozlabs.org; dmarc=pass (p=quarantine dis=none) header.from=csgroup.eu Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=csgroup.eu (client-ip=93.17.236.30; helo=pegase1.c-s.fr; envelope-from=christophe.leroy@csgroup.eu; receiver=lists.ozlabs.org) Received: from pegase1.c-s.fr (pegase1.c-s.fr [93.17.236.30]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4VnxNH5wZJz3g8S for ; Mon, 27 May 2024 23:30:23 +1000 (AEST) Received: from localhost (mailhub3.si.c-s.fr [192.168.12.233]) by localhost (Postfix) with ESMTP id 4VnxNB0wZXz9tCB; Mon, 27 May 2024 15:30:18 +0200 (CEST) X-Virus-Scanned: amavisd-new at c-s.fr Received: from pegase1.c-s.fr ([192.168.12.234]) by localhost (pegase1.c-s.fr [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id RrXYXteiq-87; Mon, 27 May 2024 15:30:18 +0200 (CEST) Received: from messagerie.si.c-s.fr (messagerie.si.c-s.fr [192.168.25.192]) by pegase1.c-s.fr (Postfix) with ESMTP id 4VnxNB01Pqz9sqS; Mon, 27 May 2024 15:30:18 +0200 (CEST) Received: from localhost (localhost [127.0.0.1]) by messagerie.si.c-s.fr (Postfix) with ESMTP id EFCB18B773; Mon, 27 May 2024 15:30:17 +0200 (CEST) X-Virus-Scanned: amavisd-new at c-s.fr Received: from messagerie.si.c-s.fr ([127.0.0.1]) by localhost (messagerie.si.c-s.fr [127.0.0.1]) (amavisd-new, port 10023) with ESMTP id Hb91EQK3TP9D; Mon, 27 May 2024 15:30:17 +0200 (CEST) Received: from PO20335.idsi0.si.c-s.fr (unknown [192.168.232.49]) by messagerie.si.c-s.fr (Postfix) with ESMTP id A6A2A8B764; Mon, 27 May 2024 15:30:09 +0200 (CEST) From: Christophe Leroy To: Andrew Morton , Jason Gunthorpe , Peter Xu , Oscar Salvador , Michael Ellerman , Nicholas Piggin Subject: [RFC PATCH v4 01/16] powerpc/64e: Remove unused IBM HTW code [SQUASHED] Date: Mon, 27 May 2024 15:29:59 +0200 Message-ID: X-Mailer: git-send-email 2.44.0 In-Reply-To: References: MIME-Version: 1.0 X-Developer-Signature: v=1; a=ed25519-sha256; t=1716816600; l=40864; i=christophe.leroy@csgroup.eu; s=20211009; h=from:subject:message-id; bh=7P5m5qnsOOzOIX3zvYc63487z05AhuMULzMhzjcBwYs=; b=THNQjr9ich6xVeaqcG/OtUfpE4sqdra+d0HuzJnlhSeclq+63fEFprmyqIU21osawf5C6t1UK yoRLKE8uXi/Dul7/aZVaQW9b6YK3OMENNPAG3JteKkjEEpFZ3BKVhAo X-Developer-Key: i=christophe.leroy@csgroup.eu; a=ed25519; pk=HIzTzUj91asvincQGOFx6+ZF5AoUuP9GdOtQChs7Mm0= X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org Errors-To: linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" From: Michael Ellerman This is a squash of series from Michael https://patchwork.ozlabs.org/project/linuxppc-dev/patch/20240524073141.1637736-1-mpe@ellerman.id.au/ The nohash HTW_IBM (Hardware Table Walk) code is unused since support for A2 was removed in commit fb5a515704d7 ("powerpc: Remove platforms/ wsp and associated pieces") (2014). The remaining supported CPUs use either no HTW (data_tlb_miss_bolted), or the e6500 HTW (data_tlb_miss_e6500). Signed-off-by: Michael Ellerman powerpc/64e: Split out nohash Book3E 64-bit code A reasonable chunk of nohash/tlb.c is 64-bit only code, split it out into a separate file. Signed-off-by: Michael Ellerman powerpc/64e: Drop E500 ifdefs in 64-bit code All 64-bit Book3E have E500=y, so drop the unneeded ifdefs. Signed-off-by: Michael Ellerman powerpc/64e: Drop MMU_FTR_TYPE_FSL_E checks in 64-bit code All 64-bit Book3E have MMU_FTR_TYPE_FSL_E, since A2 was removed, so remove checks for it in 64-bit only code. Signed-off-by: Michael Ellerman powerpc/64e: Consolidate TLB miss handler patching The 64e TLB miss handler patching is done in setup_mmu_htw(), and then again immediately afterward in early_init_mmu_global(). Consolidate it into a single location. Signed-off-by: Michael Ellerman powerpc/64e: Drop unused TLB miss handlers There are two possibilities for book3e_htw_mode, PPC_HTW_E6500 or PPC_HTW_NONE. The TLB miss handlers are patched to use, respectively: - exc_[data|indstruction]_tlb_miss_e6500_book3e - exc_[data|indstruction]_tlb_miss_bolted_book3e Which means the default handlers are never used. Remove those, and use the bolted handlers (PPC_HTW_NONE) by default. Signed-off-by: Michael Ellerman Signed-off-by: Christophe Leroy --- arch/powerpc/include/asm/nohash/mmu-e500.h | 3 +- arch/powerpc/kernel/exceptions-64e.S | 4 +- arch/powerpc/kernel/setup_64.c | 6 +- arch/powerpc/mm/nohash/Makefile | 2 +- arch/powerpc/mm/nohash/tlb.c | 398 +------------------ arch/powerpc/mm/nohash/tlb_64e.c | 314 +++++++++++++++ arch/powerpc/mm/nohash/tlb_low_64e.S | 421 --------------------- 7 files changed, 320 insertions(+), 828 deletions(-) create mode 100644 arch/powerpc/mm/nohash/tlb_64e.c diff --git a/arch/powerpc/include/asm/nohash/mmu-e500.h b/arch/powerpc/include/asm/nohash/mmu-e500.h index 6ddced0415cb..7dc24b8632d7 100644 --- a/arch/powerpc/include/asm/nohash/mmu-e500.h +++ b/arch/powerpc/include/asm/nohash/mmu-e500.h @@ -303,8 +303,7 @@ extern unsigned long linear_map_top; extern int book3e_htw_mode; #define PPC_HTW_NONE 0 -#define PPC_HTW_IBM 1 -#define PPC_HTW_E6500 2 +#define PPC_HTW_E6500 1 /* * 64-bit booke platforms don't load the tlb in the tlb miss handler code. diff --git a/arch/powerpc/kernel/exceptions-64e.S b/arch/powerpc/kernel/exceptions-64e.S index dcf0591ad3c2..63f6b9f513a4 100644 --- a/arch/powerpc/kernel/exceptions-64e.S +++ b/arch/powerpc/kernel/exceptions-64e.S @@ -485,8 +485,8 @@ interrupt_base_book3e: /* fake trap */ EXCEPTION_STUB(0x160, decrementer) /* 0x0900 */ EXCEPTION_STUB(0x180, fixed_interval) /* 0x0980 */ EXCEPTION_STUB(0x1a0, watchdog) /* 0x09f0 */ - EXCEPTION_STUB(0x1c0, data_tlb_miss) - EXCEPTION_STUB(0x1e0, instruction_tlb_miss) + EXCEPTION_STUB(0x1c0, data_tlb_miss_bolted) + EXCEPTION_STUB(0x1e0, instruction_tlb_miss_bolted) EXCEPTION_STUB(0x200, altivec_unavailable) EXCEPTION_STUB(0x220, altivec_assist) EXCEPTION_STUB(0x260, perfmon) diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c index ae36a129789f..22f83fbbc762 100644 --- a/arch/powerpc/kernel/setup_64.c +++ b/arch/powerpc/kernel/setup_64.c @@ -696,11 +696,7 @@ __init u64 ppc64_bolted_size(void) { #ifdef CONFIG_PPC_BOOK3E_64 /* Freescale BookE bolts the entire linear mapping */ - /* XXX: BookE ppc64_rma_limit setup seems to disagree? */ - if (early_mmu_has_feature(MMU_FTR_TYPE_FSL_E)) - return linear_map_top; - /* Other BookE, we assume the first GB is bolted */ - return 1ul << 30; + return linear_map_top; #else /* BookS radix, does not take faults on linear mapping */ if (early_radix_enabled()) diff --git a/arch/powerpc/mm/nohash/Makefile b/arch/powerpc/mm/nohash/Makefile index b3f0498dd42f..90e846f0c46c 100644 --- a/arch/powerpc/mm/nohash/Makefile +++ b/arch/powerpc/mm/nohash/Makefile @@ -1,7 +1,7 @@ # SPDX-License-Identifier: GPL-2.0 obj-y += mmu_context.o tlb.o tlb_low.o kup.o -obj-$(CONFIG_PPC_BOOK3E_64) += tlb_low_64e.o book3e_pgtable.o +obj-$(CONFIG_PPC_BOOK3E_64) += tlb_64e.o tlb_low_64e.o book3e_pgtable.o obj-$(CONFIG_40x) += 40x.o obj-$(CONFIG_44x) += 44x.o obj-$(CONFIG_PPC_8xx) += 8xx.o diff --git a/arch/powerpc/mm/nohash/tlb.c b/arch/powerpc/mm/nohash/tlb.c index 5ffa0af4328a..f57dc721d063 100644 --- a/arch/powerpc/mm/nohash/tlb.c +++ b/arch/powerpc/mm/nohash/tlb.c @@ -110,28 +110,6 @@ struct mmu_psize_def mmu_psize_defs[MMU_PAGE_COUNT] = { }; #endif -/* The variables below are currently only used on 64-bit Book3E - * though this will probably be made common with other nohash - * implementations at some point - */ -#ifdef CONFIG_PPC64 - -int mmu_pte_psize; /* Page size used for PTE pages */ -int mmu_vmemmap_psize; /* Page size used for the virtual mem map */ -int book3e_htw_mode; /* HW tablewalk? Value is PPC_HTW_* */ -unsigned long linear_map_top; /* Top of linear mapping */ - - -/* - * Number of bytes to add to SPRN_SPRG_TLB_EXFRAME on crit/mcheck/debug - * exceptions. This is used for bolted and e6500 TLB miss handlers which - * do not modify this SPRG in the TLB miss code; for other TLB miss handlers, - * this is set to zero. - */ -int extlb_level_exc; - -#endif /* CONFIG_PPC64 */ - #ifdef CONFIG_PPC_E500 /* next_tlbcam_idx is used to round-robin tlbcam entry assignment */ DEFINE_PER_CPU(int, next_tlbcam_idx); @@ -358,381 +336,7 @@ void tlb_flush(struct mmu_gather *tlb) flush_tlb_mm(tlb->mm); } -/* - * Below are functions specific to the 64-bit variant of Book3E though that - * may change in the future - */ - -#ifdef CONFIG_PPC64 - -/* - * Handling of virtual linear page tables or indirect TLB entries - * flushing when PTE pages are freed - */ -void tlb_flush_pgtable(struct mmu_gather *tlb, unsigned long address) -{ - int tsize = mmu_psize_defs[mmu_pte_psize].enc; - - if (book3e_htw_mode != PPC_HTW_NONE) { - unsigned long start = address & PMD_MASK; - unsigned long end = address + PMD_SIZE; - unsigned long size = 1UL << mmu_psize_defs[mmu_pte_psize].shift; - - /* This isn't the most optimal, ideally we would factor out the - * while preempt & CPU mask mucking around, or even the IPI but - * it will do for now - */ - while (start < end) { - __flush_tlb_page(tlb->mm, start, tsize, 1); - start += size; - } - } else { - unsigned long rmask = 0xf000000000000000ul; - unsigned long rid = (address & rmask) | 0x1000000000000000ul; - unsigned long vpte = address & ~rmask; - - vpte = (vpte >> (PAGE_SHIFT - 3)) & ~0xffful; - vpte |= rid; - __flush_tlb_page(tlb->mm, vpte, tsize, 0); - } -} - -static void __init setup_page_sizes(void) -{ - unsigned int tlb0cfg; - unsigned int tlb0ps; - unsigned int eptcfg; - int i, psize; - -#ifdef CONFIG_PPC_E500 - unsigned int mmucfg = mfspr(SPRN_MMUCFG); - int fsl_mmu = mmu_has_feature(MMU_FTR_TYPE_FSL_E); - - if (fsl_mmu && (mmucfg & MMUCFG_MAVN) == MMUCFG_MAVN_V1) { - unsigned int tlb1cfg = mfspr(SPRN_TLB1CFG); - unsigned int min_pg, max_pg; - - min_pg = (tlb1cfg & TLBnCFG_MINSIZE) >> TLBnCFG_MINSIZE_SHIFT; - max_pg = (tlb1cfg & TLBnCFG_MAXSIZE) >> TLBnCFG_MAXSIZE_SHIFT; - - for (psize = 0; psize < MMU_PAGE_COUNT; ++psize) { - struct mmu_psize_def *def; - unsigned int shift; - - def = &mmu_psize_defs[psize]; - shift = def->shift; - - if (shift == 0 || shift & 1) - continue; - - /* adjust to be in terms of 4^shift Kb */ - shift = (shift - 10) >> 1; - - if ((shift >= min_pg) && (shift <= max_pg)) - def->flags |= MMU_PAGE_SIZE_DIRECT; - } - - goto out; - } - - if (fsl_mmu && (mmucfg & MMUCFG_MAVN) == MMUCFG_MAVN_V2) { - u32 tlb1cfg, tlb1ps; - - tlb0cfg = mfspr(SPRN_TLB0CFG); - tlb1cfg = mfspr(SPRN_TLB1CFG); - tlb1ps = mfspr(SPRN_TLB1PS); - eptcfg = mfspr(SPRN_EPTCFG); - - if ((tlb1cfg & TLBnCFG_IND) && (tlb0cfg & TLBnCFG_PT)) - book3e_htw_mode = PPC_HTW_E6500; - - /* - * We expect 4K subpage size and unrestricted indirect size. - * The lack of a restriction on indirect size is a Freescale - * extension, indicated by PSn = 0 but SPSn != 0. - */ - if (eptcfg != 2) - book3e_htw_mode = PPC_HTW_NONE; - - for (psize = 0; psize < MMU_PAGE_COUNT; ++psize) { - struct mmu_psize_def *def = &mmu_psize_defs[psize]; - - if (!def->shift) - continue; - - if (tlb1ps & (1U << (def->shift - 10))) { - def->flags |= MMU_PAGE_SIZE_DIRECT; - - if (book3e_htw_mode && psize == MMU_PAGE_2M) - def->flags |= MMU_PAGE_SIZE_INDIRECT; - } - } - - goto out; - } -#endif - - tlb0cfg = mfspr(SPRN_TLB0CFG); - tlb0ps = mfspr(SPRN_TLB0PS); - eptcfg = mfspr(SPRN_EPTCFG); - - /* Look for supported direct sizes */ - for (psize = 0; psize < MMU_PAGE_COUNT; ++psize) { - struct mmu_psize_def *def = &mmu_psize_defs[psize]; - - if (tlb0ps & (1U << (def->shift - 10))) - def->flags |= MMU_PAGE_SIZE_DIRECT; - } - - /* Indirect page sizes supported ? */ - if ((tlb0cfg & TLBnCFG_IND) == 0 || - (tlb0cfg & TLBnCFG_PT) == 0) - goto out; - - book3e_htw_mode = PPC_HTW_IBM; - - /* Now, we only deal with one IND page size for each - * direct size. Hopefully all implementations today are - * unambiguous, but we might want to be careful in the - * future. - */ - for (i = 0; i < 3; i++) { - unsigned int ps, sps; - - sps = eptcfg & 0x1f; - eptcfg >>= 5; - ps = eptcfg & 0x1f; - eptcfg >>= 5; - if (!ps || !sps) - continue; - for (psize = 0; psize < MMU_PAGE_COUNT; psize++) { - struct mmu_psize_def *def = &mmu_psize_defs[psize]; - - if (ps == (def->shift - 10)) - def->flags |= MMU_PAGE_SIZE_INDIRECT; - if (sps == (def->shift - 10)) - def->ind = ps + 10; - } - } - -out: - /* Cleanup array and print summary */ - pr_info("MMU: Supported page sizes\n"); - for (psize = 0; psize < MMU_PAGE_COUNT; ++psize) { - struct mmu_psize_def *def = &mmu_psize_defs[psize]; - const char *__page_type_names[] = { - "unsupported", - "direct", - "indirect", - "direct & indirect" - }; - if (def->flags == 0) { - def->shift = 0; - continue; - } - pr_info(" %8ld KB as %s\n", 1ul << (def->shift - 10), - __page_type_names[def->flags & 0x3]); - } -} - -static void __init setup_mmu_htw(void) -{ - /* - * If we want to use HW tablewalk, enable it by patching the TLB miss - * handlers to branch to the one dedicated to it. - */ - - switch (book3e_htw_mode) { - case PPC_HTW_IBM: - patch_exception(0x1c0, exc_data_tlb_miss_htw_book3e); - patch_exception(0x1e0, exc_instruction_tlb_miss_htw_book3e); - break; -#ifdef CONFIG_PPC_E500 - case PPC_HTW_E6500: - extlb_level_exc = EX_TLB_SIZE; - patch_exception(0x1c0, exc_data_tlb_miss_e6500_book3e); - patch_exception(0x1e0, exc_instruction_tlb_miss_e6500_book3e); - break; -#endif - } - pr_info("MMU: Book3E HW tablewalk %s\n", - book3e_htw_mode != PPC_HTW_NONE ? "enabled" : "not supported"); -} - -/* - * Early initialization of the MMU TLB code - */ -static void early_init_this_mmu(void) -{ - unsigned int mas4; - - /* Set MAS4 based on page table setting */ - - mas4 = 0x4 << MAS4_WIMGED_SHIFT; - switch (book3e_htw_mode) { - case PPC_HTW_E6500: - mas4 |= MAS4_INDD; - mas4 |= BOOK3E_PAGESZ_2M << MAS4_TSIZED_SHIFT; - mas4 |= MAS4_TLBSELD(1); - mmu_pte_psize = MMU_PAGE_2M; - break; - - case PPC_HTW_IBM: - mas4 |= MAS4_INDD; - mas4 |= BOOK3E_PAGESZ_1M << MAS4_TSIZED_SHIFT; - mmu_pte_psize = MMU_PAGE_1M; - break; - - case PPC_HTW_NONE: - mas4 |= BOOK3E_PAGESZ_4K << MAS4_TSIZED_SHIFT; - mmu_pte_psize = mmu_virtual_psize; - break; - } - mtspr(SPRN_MAS4, mas4); - -#ifdef CONFIG_PPC_E500 - if (mmu_has_feature(MMU_FTR_TYPE_FSL_E)) { - unsigned int num_cams; - bool map = true; - - /* use a quarter of the TLBCAM for bolted linear map */ - num_cams = (mfspr(SPRN_TLB1CFG) & TLBnCFG_N_ENTRY) / 4; - - /* - * Only do the mapping once per core, or else the - * transient mapping would cause problems. - */ -#ifdef CONFIG_SMP - if (hweight32(get_tensr()) > 1) - map = false; -#endif - - if (map) - linear_map_top = map_mem_in_cams(linear_map_top, - num_cams, false, true); - } -#endif - - /* A sync won't hurt us after mucking around with - * the MMU configuration - */ - mb(); -} - -static void __init early_init_mmu_global(void) -{ - /* XXX This should be decided at runtime based on supported - * page sizes in the TLB, but for now let's assume 16M is - * always there and a good fit (which it probably is) - * - * Freescale booke only supports 4K pages in TLB0, so use that. - */ - if (mmu_has_feature(MMU_FTR_TYPE_FSL_E)) - mmu_vmemmap_psize = MMU_PAGE_4K; - else - mmu_vmemmap_psize = MMU_PAGE_16M; - - /* XXX This code only checks for TLB 0 capabilities and doesn't - * check what page size combos are supported by the HW. It - * also doesn't handle the case where a separate array holds - * the IND entries from the array loaded by the PT. - */ - /* Look for supported page sizes */ - setup_page_sizes(); - - /* Look for HW tablewalk support */ - setup_mmu_htw(); - -#ifdef CONFIG_PPC_E500 - if (mmu_has_feature(MMU_FTR_TYPE_FSL_E)) { - if (book3e_htw_mode == PPC_HTW_NONE) { - extlb_level_exc = EX_TLB_SIZE; - patch_exception(0x1c0, exc_data_tlb_miss_bolted_book3e); - patch_exception(0x1e0, - exc_instruction_tlb_miss_bolted_book3e); - } - } -#endif - - /* Set the global containing the top of the linear mapping - * for use by the TLB miss code - */ - linear_map_top = memblock_end_of_DRAM(); - - ioremap_bot = IOREMAP_BASE; -} - -static void __init early_mmu_set_memory_limit(void) -{ -#ifdef CONFIG_PPC_E500 - if (mmu_has_feature(MMU_FTR_TYPE_FSL_E)) { - /* - * Limit memory so we dont have linear faults. - * Unlike memblock_set_current_limit, which limits - * memory available during early boot, this permanently - * reduces the memory available to Linux. We need to - * do this because highmem is not supported on 64-bit. - */ - memblock_enforce_memory_limit(linear_map_top); - } -#endif - - memblock_set_current_limit(linear_map_top); -} - -/* boot cpu only */ -void __init early_init_mmu(void) -{ - early_init_mmu_global(); - early_init_this_mmu(); - early_mmu_set_memory_limit(); -} - -void early_init_mmu_secondary(void) -{ - early_init_this_mmu(); -} - -void setup_initial_memory_limit(phys_addr_t first_memblock_base, - phys_addr_t first_memblock_size) -{ - /* On non-FSL Embedded 64-bit, we adjust the RMA size to match - * the bolted TLB entry. We know for now that only 1G - * entries are supported though that may eventually - * change. - * - * on FSL Embedded 64-bit, usually all RAM is bolted, but with - * unusual memory sizes it's possible for some RAM to not be mapped - * (such RAM is not used at all by Linux, since we don't support - * highmem on 64-bit). We limit ppc64_rma_size to what would be - * mappable if this memblock is the only one. Additional memblocks - * can only increase, not decrease, the amount that ends up getting - * mapped. We still limit max to 1G even if we'll eventually map - * more. This is due to what the early init code is set up to do. - * - * We crop it to the size of the first MEMBLOCK to - * avoid going over total available memory just in case... - */ -#ifdef CONFIG_PPC_E500 - if (early_mmu_has_feature(MMU_FTR_TYPE_FSL_E)) { - unsigned long linear_sz; - unsigned int num_cams; - - /* use a quarter of the TLBCAM for bolted linear map */ - num_cams = (mfspr(SPRN_TLB1CFG) & TLBnCFG_N_ENTRY) / 4; - - linear_sz = map_mem_in_cams(first_memblock_size, num_cams, - true, true); - - ppc64_rma_size = min_t(u64, linear_sz, 0x40000000); - } else -#endif - ppc64_rma_size = min_t(u64, first_memblock_size, 0x40000000); - - /* Finally limit subsequent allocations */ - memblock_set_current_limit(first_memblock_base + ppc64_rma_size); -} -#else /* ! CONFIG_PPC64 */ +#ifndef CONFIG_PPC64 void __init early_init_mmu(void) { unsigned long root = of_get_flat_dt_root(); diff --git a/arch/powerpc/mm/nohash/tlb_64e.c b/arch/powerpc/mm/nohash/tlb_64e.c new file mode 100644 index 000000000000..053128a5636c --- /dev/null +++ b/arch/powerpc/mm/nohash/tlb_64e.c @@ -0,0 +1,314 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * Copyright 2008,2009 Ben Herrenschmidt + * IBM Corp. + * + * Derived from arch/ppc/mm/init.c: + * Copyright (C) 1995-1996 Gary Thomas (gdt@linuxppc.org) + * + * Modifications by Paul Mackerras (PowerMac) (paulus@cs.anu.edu.au) + * and Cort Dougan (PReP) (cort@cs.nmt.edu) + * Copyright (C) 1996 Paul Mackerras + * + * Derived from "arch/i386/mm/init.c" + * Copyright (C) 1991, 1992, 1993, 1994 Linus Torvalds + */ + +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include + +#include + +/* The variables below are currently only used on 64-bit Book3E + * though this will probably be made common with other nohash + * implementations at some point + */ +int mmu_pte_psize; /* Page size used for PTE pages */ +int mmu_vmemmap_psize; /* Page size used for the virtual mem map */ +int book3e_htw_mode; /* HW tablewalk? Value is PPC_HTW_* */ +unsigned long linear_map_top; /* Top of linear mapping */ + + +/* + * Number of bytes to add to SPRN_SPRG_TLB_EXFRAME on crit/mcheck/debug + * exceptions. This is used for bolted and e6500 TLB miss handlers which + * do not modify this SPRG in the TLB miss code; for other TLB miss handlers, + * this is set to zero. + */ +int extlb_level_exc; + +/* + * Handling of virtual linear page tables or indirect TLB entries + * flushing when PTE pages are freed + */ +void tlb_flush_pgtable(struct mmu_gather *tlb, unsigned long address) +{ + int tsize = mmu_psize_defs[mmu_pte_psize].enc; + + if (book3e_htw_mode != PPC_HTW_NONE) { + unsigned long start = address & PMD_MASK; + unsigned long end = address + PMD_SIZE; + unsigned long size = 1UL << mmu_psize_defs[mmu_pte_psize].shift; + + /* This isn't the most optimal, ideally we would factor out the + * while preempt & CPU mask mucking around, or even the IPI but + * it will do for now + */ + while (start < end) { + __flush_tlb_page(tlb->mm, start, tsize, 1); + start += size; + } + } else { + unsigned long rmask = 0xf000000000000000ul; + unsigned long rid = (address & rmask) | 0x1000000000000000ul; + unsigned long vpte = address & ~rmask; + + vpte = (vpte >> (PAGE_SHIFT - 3)) & ~0xffful; + vpte |= rid; + __flush_tlb_page(tlb->mm, vpte, tsize, 0); + } +} + +static void __init setup_page_sizes(void) +{ + unsigned int tlb0cfg; + unsigned int eptcfg; + int psize; + + unsigned int mmucfg = mfspr(SPRN_MMUCFG); + + if ((mmucfg & MMUCFG_MAVN) == MMUCFG_MAVN_V1) { + unsigned int tlb1cfg = mfspr(SPRN_TLB1CFG); + unsigned int min_pg, max_pg; + + min_pg = (tlb1cfg & TLBnCFG_MINSIZE) >> TLBnCFG_MINSIZE_SHIFT; + max_pg = (tlb1cfg & TLBnCFG_MAXSIZE) >> TLBnCFG_MAXSIZE_SHIFT; + + for (psize = 0; psize < MMU_PAGE_COUNT; ++psize) { + struct mmu_psize_def *def; + unsigned int shift; + + def = &mmu_psize_defs[psize]; + shift = def->shift; + + if (shift == 0 || shift & 1) + continue; + + /* adjust to be in terms of 4^shift Kb */ + shift = (shift - 10) >> 1; + + if ((shift >= min_pg) && (shift <= max_pg)) + def->flags |= MMU_PAGE_SIZE_DIRECT; + } + + goto out; + } + + if ((mmucfg & MMUCFG_MAVN) == MMUCFG_MAVN_V2) { + u32 tlb1cfg, tlb1ps; + + tlb0cfg = mfspr(SPRN_TLB0CFG); + tlb1cfg = mfspr(SPRN_TLB1CFG); + tlb1ps = mfspr(SPRN_TLB1PS); + eptcfg = mfspr(SPRN_EPTCFG); + + if ((tlb1cfg & TLBnCFG_IND) && (tlb0cfg & TLBnCFG_PT)) + book3e_htw_mode = PPC_HTW_E6500; + + /* + * We expect 4K subpage size and unrestricted indirect size. + * The lack of a restriction on indirect size is a Freescale + * extension, indicated by PSn = 0 but SPSn != 0. + */ + if (eptcfg != 2) + book3e_htw_mode = PPC_HTW_NONE; + + for (psize = 0; psize < MMU_PAGE_COUNT; ++psize) { + struct mmu_psize_def *def = &mmu_psize_defs[psize]; + + if (!def->shift) + continue; + + if (tlb1ps & (1U << (def->shift - 10))) { + def->flags |= MMU_PAGE_SIZE_DIRECT; + + if (book3e_htw_mode && psize == MMU_PAGE_2M) + def->flags |= MMU_PAGE_SIZE_INDIRECT; + } + } + + goto out; + } +out: + /* Cleanup array and print summary */ + pr_info("MMU: Supported page sizes\n"); + for (psize = 0; psize < MMU_PAGE_COUNT; ++psize) { + struct mmu_psize_def *def = &mmu_psize_defs[psize]; + const char *__page_type_names[] = { + "unsupported", + "direct", + "indirect", + "direct & indirect" + }; + if (def->flags == 0) { + def->shift = 0; + continue; + } + pr_info(" %8ld KB as %s\n", 1ul << (def->shift - 10), + __page_type_names[def->flags & 0x3]); + } +} + +/* + * Early initialization of the MMU TLB code + */ +static void early_init_this_mmu(void) +{ + unsigned int mas4; + + /* Set MAS4 based on page table setting */ + + mas4 = 0x4 << MAS4_WIMGED_SHIFT; + switch (book3e_htw_mode) { + case PPC_HTW_E6500: + mas4 |= MAS4_INDD; + mas4 |= BOOK3E_PAGESZ_2M << MAS4_TSIZED_SHIFT; + mas4 |= MAS4_TLBSELD(1); + mmu_pte_psize = MMU_PAGE_2M; + break; + + case PPC_HTW_NONE: + mas4 |= BOOK3E_PAGESZ_4K << MAS4_TSIZED_SHIFT; + mmu_pte_psize = mmu_virtual_psize; + break; + } + mtspr(SPRN_MAS4, mas4); + + unsigned int num_cams; + bool map = true; + + /* use a quarter of the TLBCAM for bolted linear map */ + num_cams = (mfspr(SPRN_TLB1CFG) & TLBnCFG_N_ENTRY) / 4; + + /* + * Only do the mapping once per core, or else the + * transient mapping would cause problems. + */ +#ifdef CONFIG_SMP + if (hweight32(get_tensr()) > 1) + map = false; +#endif + + if (map) + linear_map_top = map_mem_in_cams(linear_map_top, + num_cams, false, true); + + /* A sync won't hurt us after mucking around with + * the MMU configuration + */ + mb(); +} + +static void __init early_init_mmu_global(void) +{ + /* + * Freescale booke only supports 4K pages in TLB0, so use that. + */ + mmu_vmemmap_psize = MMU_PAGE_4K; + + /* XXX This code only checks for TLB 0 capabilities and doesn't + * check what page size combos are supported by the HW. It + * also doesn't handle the case where a separate array holds + * the IND entries from the array loaded by the PT. + */ + /* Look for supported page sizes */ + setup_page_sizes(); + + /* + * If we want to use HW tablewalk, enable it by patching the TLB miss + * handlers to branch to the one dedicated to it. + */ + extlb_level_exc = EX_TLB_SIZE; + switch (book3e_htw_mode) { + case PPC_HTW_E6500: + patch_exception(0x1c0, exc_data_tlb_miss_e6500_book3e); + patch_exception(0x1e0, exc_instruction_tlb_miss_e6500_book3e); + break; + } + + pr_info("MMU: Book3E HW tablewalk %s\n", + book3e_htw_mode != PPC_HTW_NONE ? "enabled" : "not supported"); + + /* Set the global containing the top of the linear mapping + * for use by the TLB miss code + */ + linear_map_top = memblock_end_of_DRAM(); + + ioremap_bot = IOREMAP_BASE; +} + +static void __init early_mmu_set_memory_limit(void) +{ + /* + * Limit memory so we dont have linear faults. + * Unlike memblock_set_current_limit, which limits + * memory available during early boot, this permanently + * reduces the memory available to Linux. We need to + * do this because highmem is not supported on 64-bit. + */ + memblock_enforce_memory_limit(linear_map_top); + + memblock_set_current_limit(linear_map_top); +} + +/* boot cpu only */ +void __init early_init_mmu(void) +{ + early_init_mmu_global(); + early_init_this_mmu(); + early_mmu_set_memory_limit(); +} + +void early_init_mmu_secondary(void) +{ + early_init_this_mmu(); +} + +void setup_initial_memory_limit(phys_addr_t first_memblock_base, + phys_addr_t first_memblock_size) +{ + /* + * On FSL Embedded 64-bit, usually all RAM is bolted, but with + * unusual memory sizes it's possible for some RAM to not be mapped + * (such RAM is not used at all by Linux, since we don't support + * highmem on 64-bit). We limit ppc64_rma_size to what would be + * mappable if this memblock is the only one. Additional memblocks + * can only increase, not decrease, the amount that ends up getting + * mapped. We still limit max to 1G even if we'll eventually map + * more. This is due to what the early init code is set up to do. + * + * We crop it to the size of the first MEMBLOCK to + * avoid going over total available memory just in case... + */ + unsigned long linear_sz; + unsigned int num_cams; + + /* use a quarter of the TLBCAM for bolted linear map */ + num_cams = (mfspr(SPRN_TLB1CFG) & TLBnCFG_N_ENTRY) / 4; + + linear_sz = map_mem_in_cams(first_memblock_size, num_cams, true, true); + ppc64_rma_size = min_t(u64, linear_sz, 0x40000000); + + /* Finally limit subsequent allocations */ + memblock_set_current_limit(first_memblock_base + ppc64_rma_size); +} diff --git a/arch/powerpc/mm/nohash/tlb_low_64e.S b/arch/powerpc/mm/nohash/tlb_low_64e.S index 7e0b8fe1c279..a54e7d6c3d0b 100644 --- a/arch/powerpc/mm/nohash/tlb_low_64e.S +++ b/arch/powerpc/mm/nohash/tlb_low_64e.S @@ -511,232 +511,6 @@ itlb_miss_fault_e6500: tlb_epilog_bolted b exc_instruction_storage_book3e -/********************************************************************** - * * - * TLB miss handling for Book3E with TLB reservation and HES support * - * * - **********************************************************************/ - - -/* Data TLB miss */ - START_EXCEPTION(data_tlb_miss) - TLB_MISS_PROLOG - - /* Now we handle the fault proper. We only save DEAR in normal - * fault case since that's the only interesting values here. - * We could probably also optimize by not saving SRR0/1 in the - * linear mapping case but I'll leave that for later - */ - mfspr r14,SPRN_ESR - mfspr r16,SPRN_DEAR /* get faulting address */ - srdi r15,r16,44 /* get region */ - xoris r15,r15,0xc - cmpldi cr0,r15,0 /* linear mapping ? */ - beq tlb_load_linear /* yes -> go to linear map load */ - cmpldi cr1,r15,1 /* vmalloc mapping ? */ - - /* The page tables are mapped virtually linear. At this point, though, - * we don't know whether we are trying to fault in a first level - * virtual address or a virtual page table address. We can get that - * from bit 0x1 of the region ID which we have set for a page table - */ - andis. r10,r15,0x1 - bne- virt_page_table_tlb_miss - - std r14,EX_TLB_ESR(r12); /* save ESR */ - std r16,EX_TLB_DEAR(r12); /* save DEAR */ - - /* We need _PAGE_PRESENT and _PAGE_ACCESSED set */ - li r11,_PAGE_PRESENT - oris r11,r11,_PAGE_ACCESSED@h - - /* We do the user/kernel test for the PID here along with the RW test - */ - srdi. r15,r16,60 /* Check for user region */ - - /* We pre-test some combination of permissions to avoid double - * faults: - * - * We move the ESR:ST bit into the position of _PAGE_BAP_SW in the PTE - * ESR_ST is 0x00800000 - * _PAGE_BAP_SW is 0x00000010 - * So the shift is >> 19. This tests for supervisor writeability. - * If the page happens to be supervisor writeable and not user - * writeable, we will take a new fault later, but that should be - * a rare enough case. - * - * We also move ESR_ST in _PAGE_DIRTY position - * _PAGE_DIRTY is 0x00001000 so the shift is >> 11 - * - * MAS1 is preset for all we need except for TID that needs to - * be cleared for kernel translations - */ - rlwimi r11,r14,32-19,27,27 - rlwimi r11,r14,32-16,19,19 - beq normal_tlb_miss_user - /* XXX replace the RMW cycles with immediate loads + writes */ -1: mfspr r10,SPRN_MAS1 - rlwinm r10,r10,0,16,1 /* Clear TID */ - mtspr SPRN_MAS1,r10 - beq+ cr1,normal_tlb_miss - - /* We got a crappy address, just fault with whatever DEAR and ESR - * are here - */ - TLB_MISS_EPILOG_ERROR - b exc_data_storage_book3e - -/* Instruction TLB miss */ - START_EXCEPTION(instruction_tlb_miss) - TLB_MISS_PROLOG - - /* If we take a recursive fault, the second level handler may need - * to know whether we are handling a data or instruction fault in - * order to get to the right store fault handler. We provide that - * info by writing a crazy value in ESR in our exception frame - */ - li r14,-1 /* store to exception frame is done later */ - - /* Now we handle the fault proper. We only save DEAR in the non - * linear mapping case since we know the linear mapping case will - * not re-enter. We could indeed optimize and also not save SRR0/1 - * in the linear mapping case but I'll leave that for later - * - * Faulting address is SRR0 which is already in r16 - */ - srdi r15,r16,44 /* get region */ - xoris r15,r15,0xc - cmpldi cr0,r15,0 /* linear mapping ? */ - beq tlb_load_linear /* yes -> go to linear map load */ - cmpldi cr1,r15,1 /* vmalloc mapping ? */ - - /* We do the user/kernel test for the PID here along with the RW test - */ - li r11,_PAGE_PRESENT|_PAGE_BAP_UX /* Base perm */ - oris r11,r11,_PAGE_ACCESSED@h - - srdi. r15,r16,60 /* Check for user region */ - std r14,EX_TLB_ESR(r12) /* write crazy -1 to frame */ - beq normal_tlb_miss_user - - li r11,_PAGE_PRESENT|_PAGE_BAP_SX /* Base perm */ - oris r11,r11,_PAGE_ACCESSED@h - /* XXX replace the RMW cycles with immediate loads + writes */ - mfspr r10,SPRN_MAS1 - rlwinm r10,r10,0,16,1 /* Clear TID */ - mtspr SPRN_MAS1,r10 - beq+ cr1,normal_tlb_miss - - /* We got a crappy address, just fault */ - TLB_MISS_EPILOG_ERROR - b exc_instruction_storage_book3e - -/* - * This is the guts of the first-level TLB miss handler for direct - * misses. We are entered with: - * - * r16 = faulting address - * r15 = region ID - * r14 = crap (free to use) - * r13 = PACA - * r12 = TLB exception frame in PACA - * r11 = PTE permission mask - * r10 = crap (free to use) - */ -normal_tlb_miss_user: -#ifdef CONFIG_PPC_KUAP - mfspr r14,SPRN_MAS1 - rlwinm. r14,r14,0,0x3fff0000 - beq- normal_tlb_miss_access_fault /* KUAP fault */ -#endif -normal_tlb_miss: - /* So we first construct the page table address. We do that by - * shifting the bottom of the address (not the region ID) by - * PAGE_SHIFT-3, clearing the bottom 3 bits (get a PTE ptr) and - * or'ing the fourth high bit. - * - * NOTE: For 64K pages, we do things slightly differently in - * order to handle the weird page table format used by linux - */ - srdi r15,r16,44 - oris r10,r15,0x1 - rldicl r14,r16,64-(PAGE_SHIFT-3),PAGE_SHIFT-3+4 - sldi r15,r10,44 - clrrdi r14,r14,19 - or r10,r15,r14 - - ld r14,0(r10) - -finish_normal_tlb_miss: - /* Check if required permissions are met */ - andc. r15,r11,r14 - bne- normal_tlb_miss_access_fault - - /* Now we build the MAS: - * - * MAS 0 : Fully setup with defaults in MAS4 and TLBnCFG - * MAS 1 : Almost fully setup - * - PID already updated by caller if necessary - * - TSIZE need change if !base page size, not - * yet implemented for now - * MAS 2 : Defaults not useful, need to be redone - * MAS 3+7 : Needs to be done - * - * TODO: mix up code below for better scheduling - */ - clrrdi r10,r16,12 /* Clear low crap in EA */ - rlwimi r10,r14,32-19,27,31 /* Insert WIMGE */ - mtspr SPRN_MAS2,r10 - - /* Check page size, if not standard, update MAS1 */ - rldicl r10,r14,64-8,64-8 - cmpldi cr0,r10,BOOK3E_PAGESZ_4K - beq- 1f - mfspr r11,SPRN_MAS1 - rlwimi r11,r14,31,21,24 - rlwinm r11,r11,0,21,19 - mtspr SPRN_MAS1,r11 -1: - /* Move RPN in position */ - rldicr r11,r14,64-(PTE_RPN_SHIFT-PAGE_SHIFT),63-PAGE_SHIFT - clrldi r15,r11,12 /* Clear crap at the top */ - rlwimi r15,r14,32-8,22,25 /* Move in U bits */ - rlwimi r15,r14,32-2,26,31 /* Move in BAP bits */ - - /* Mask out SW and UW if !DIRTY (XXX optimize this !) */ - andi. r11,r14,_PAGE_DIRTY - bne 1f - li r11,MAS3_SW|MAS3_UW - andc r15,r15,r11 -1: - srdi r16,r15,32 - mtspr SPRN_MAS3,r15 - mtspr SPRN_MAS7,r16 - - tlbwe - -normal_tlb_miss_done: - /* We don't bother with restoring DEAR or ESR since we know we are - * level 0 and just going back to userland. They are only needed - * if you are going to take an access fault - */ - TLB_MISS_EPILOG_SUCCESS - rfi - -normal_tlb_miss_access_fault: - /* We need to check if it was an instruction miss */ - andi. r10,r11,_PAGE_BAP_UX - bne 1f - ld r14,EX_TLB_DEAR(r12) - ld r15,EX_TLB_ESR(r12) - mtspr SPRN_DEAR,r14 - mtspr SPRN_ESR,r15 - TLB_MISS_EPILOG_ERROR - b exc_data_storage_book3e -1: TLB_MISS_EPILOG_ERROR - b exc_instruction_storage_book3e - - /* * This is the guts of the second-level TLB miss handler for direct * misses. We are entered with: @@ -893,201 +667,6 @@ virt_page_table_tlb_miss_whacko_fault: TLB_MISS_EPILOG_ERROR b exc_data_storage_book3e - -/************************************************************** - * * - * TLB miss handling for Book3E with hw page table support * - * * - **************************************************************/ - - -/* Data TLB miss */ - START_EXCEPTION(data_tlb_miss_htw) - TLB_MISS_PROLOG - - /* Now we handle the fault proper. We only save DEAR in normal - * fault case since that's the only interesting values here. - * We could probably also optimize by not saving SRR0/1 in the - * linear mapping case but I'll leave that for later - */ - mfspr r14,SPRN_ESR - mfspr r16,SPRN_DEAR /* get faulting address */ - srdi r11,r16,44 /* get region */ - xoris r11,r11,0xc - cmpldi cr0,r11,0 /* linear mapping ? */ - beq tlb_load_linear /* yes -> go to linear map load */ - cmpldi cr1,r11,1 /* vmalloc mapping ? */ - - /* We do the user/kernel test for the PID here along with the RW test - */ - srdi. r11,r16,60 /* Check for user region */ - ld r15,PACAPGD(r13) /* Load user pgdir */ - beq htw_tlb_miss - - /* XXX replace the RMW cycles with immediate loads + writes */ -1: mfspr r10,SPRN_MAS1 - rlwinm r10,r10,0,16,1 /* Clear TID */ - mtspr SPRN_MAS1,r10 - ld r15,PACA_KERNELPGD(r13) /* Load kernel pgdir */ - beq+ cr1,htw_tlb_miss - - /* We got a crappy address, just fault with whatever DEAR and ESR - * are here - */ - TLB_MISS_EPILOG_ERROR - b exc_data_storage_book3e - -/* Instruction TLB miss */ - START_EXCEPTION(instruction_tlb_miss_htw) - TLB_MISS_PROLOG - - /* If we take a recursive fault, the second level handler may need - * to know whether we are handling a data or instruction fault in - * order to get to the right store fault handler. We provide that - * info by keeping a crazy value for ESR in r14 - */ - li r14,-1 /* store to exception frame is done later */ - - /* Now we handle the fault proper. We only save DEAR in the non - * linear mapping case since we know the linear mapping case will - * not re-enter. We could indeed optimize and also not save SRR0/1 - * in the linear mapping case but I'll leave that for later - * - * Faulting address is SRR0 which is already in r16 - */ - srdi r11,r16,44 /* get region */ - xoris r11,r11,0xc - cmpldi cr0,r11,0 /* linear mapping ? */ - beq tlb_load_linear /* yes -> go to linear map load */ - cmpldi cr1,r11,1 /* vmalloc mapping ? */ - - /* We do the user/kernel test for the PID here along with the RW test - */ - srdi. r11,r16,60 /* Check for user region */ - ld r15,PACAPGD(r13) /* Load user pgdir */ - beq htw_tlb_miss - - /* XXX replace the RMW cycles with immediate loads + writes */ -1: mfspr r10,SPRN_MAS1 - rlwinm r10,r10,0,16,1 /* Clear TID */ - mtspr SPRN_MAS1,r10 - ld r15,PACA_KERNELPGD(r13) /* Load kernel pgdir */ - beq+ htw_tlb_miss - - /* We got a crappy address, just fault */ - TLB_MISS_EPILOG_ERROR - b exc_instruction_storage_book3e - - -/* - * This is the guts of the second-level TLB miss handler for direct - * misses. We are entered with: - * - * r16 = virtual page table faulting address - * r15 = PGD pointer - * r14 = ESR - * r13 = PACA - * r12 = TLB exception frame in PACA - * r11 = crap (free to use) - * r10 = crap (free to use) - * - * It can be re-entered by the linear mapping miss handler. However, to - * avoid too much complication, it will save/restore things for us - */ -htw_tlb_miss: -#ifdef CONFIG_PPC_KUAP - mfspr r10,SPRN_MAS1 - rlwinm. r10,r10,0,0x3fff0000 - beq- htw_tlb_miss_fault /* KUAP fault */ -#endif - /* Search if we already have a TLB entry for that virtual address, and - * if we do, bail out. - * - * MAS1:IND should be already set based on MAS4 - */ - PPC_TLBSRX_DOT(0,R16) - beq htw_tlb_miss_done - - /* Now, we need to walk the page tables. First check if we are in - * range. - */ - rldicl. r10,r16,64-PGTABLE_EADDR_SIZE,PGTABLE_EADDR_SIZE+4 - bne- htw_tlb_miss_fault - - /* Get the PGD pointer */ - cmpldi cr0,r15,0 - beq- htw_tlb_miss_fault - - /* Get to PGD entry */ - rldicl r11,r16,64-(PGDIR_SHIFT-3),64-PGD_INDEX_SIZE-3 - clrrdi r10,r11,3 - ldx r15,r10,r15 - cmpdi cr0,r15,0 - bge htw_tlb_miss_fault - - /* Get to PUD entry */ - rldicl r11,r16,64-(PUD_SHIFT-3),64-PUD_INDEX_SIZE-3 - clrrdi r10,r11,3 - ldx r15,r10,r15 - cmpdi cr0,r15,0 - bge htw_tlb_miss_fault - - /* Get to PMD entry */ - rldicl r11,r16,64-(PMD_SHIFT-3),64-PMD_INDEX_SIZE-3 - clrrdi r10,r11,3 - ldx r15,r10,r15 - cmpdi cr0,r15,0 - bge htw_tlb_miss_fault - - /* Ok, we're all right, we can now create an indirect entry for - * a 1M or 256M page. - * - * The last trick is now that because we use "half" pages for - * the HTW (1M IND is 2K and 256M IND is 32K) we need to account - * for an added LSB bit to the RPN. For 64K pages, there is no - * problem as we already use 32K arrays (half PTE pages), but for - * 4K page we need to extract a bit from the virtual address and - * insert it into the "PA52" bit of the RPN. - */ - rlwimi r15,r16,32-9,20,20 - /* Now we build the MAS: - * - * MAS 0 : Fully setup with defaults in MAS4 and TLBnCFG - * MAS 1 : Almost fully setup - * - PID already updated by caller if necessary - * - TSIZE for now is base ind page size always - * MAS 2 : Use defaults - * MAS 3+7 : Needs to be done - */ - ori r10,r15,(BOOK3E_PAGESZ_4K << MAS3_SPSIZE_SHIFT) - - srdi r16,r10,32 - mtspr SPRN_MAS3,r10 - mtspr SPRN_MAS7,r16 - - tlbwe - -htw_tlb_miss_done: - /* We don't bother with restoring DEAR or ESR since we know we are - * level 0 and just going back to userland. They are only needed - * if you are going to take an access fault - */ - TLB_MISS_EPILOG_SUCCESS - rfi - -htw_tlb_miss_fault: - /* We need to check if it was an instruction miss. We know this - * though because r14 would contain -1 - */ - cmpdi cr0,r14,-1 - beq 1f - mtspr SPRN_DEAR,r16 - mtspr SPRN_ESR,r14 - TLB_MISS_EPILOG_ERROR - b exc_data_storage_book3e -1: TLB_MISS_EPILOG_ERROR - b exc_instruction_storage_book3e - /* * This is the guts of "any" level TLB miss handler for kernel linear * mapping misses. We are entered with: From patchwork Mon May 27 13:30:00 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christophe Leroy X-Patchwork-Id: 1939945 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ozlabs.org (client-ip=2404:9400:2:0:216:3eff:fee1:b9f1; helo=lists.ozlabs.org; envelope-from=linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org; receiver=patchwork.ozlabs.org) Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2404:9400:2:0:216:3eff:fee1:b9f1]) by legolas.ozlabs.org (Postfix) with ESMTP id 4VnxWj6ysLz20PT for ; Mon, 27 May 2024 23:36:49 +1000 (AEST) Received: from boromir.ozlabs.org (localhost [IPv6:::1]) by lists.ozlabs.org (Postfix) with ESMTP id 4VnxPL535lz87JK for ; Mon, 27 May 2024 23:31:18 +1000 (AEST) X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: lists.ozlabs.org; dmarc=pass (p=quarantine dis=none) header.from=csgroup.eu Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=csgroup.eu (client-ip=93.17.236.30; helo=pegase1.c-s.fr; envelope-from=christophe.leroy@csgroup.eu; receiver=lists.ozlabs.org) Received: from pegase1.c-s.fr (pegase1.c-s.fr [93.17.236.30]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4VnxNV0ybkz3gKN for ; Mon, 27 May 2024 23:30:33 +1000 (AEST) Received: from localhost (mailhub3.si.c-s.fr [192.168.12.233]) by localhost (Postfix) with ESMTP id 4VnxNH3X8Nz9syd; Mon, 27 May 2024 15:30:23 +0200 (CEST) X-Virus-Scanned: amavisd-new at c-s.fr Received: from pegase1.c-s.fr ([192.168.12.234]) by localhost (pegase1.c-s.fr [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id lK-Qu1YbbiXB; Mon, 27 May 2024 15:30:23 +0200 (CEST) Received: from messagerie.si.c-s.fr (messagerie.si.c-s.fr [192.168.25.192]) by pegase1.c-s.fr (Postfix) with ESMTP id 4VnxNH2tB8z9sqS; Mon, 27 May 2024 15:30:23 +0200 (CEST) Received: from localhost (localhost [127.0.0.1]) by messagerie.si.c-s.fr (Postfix) with ESMTP id 5D3D88B773; Mon, 27 May 2024 15:30:23 +0200 (CEST) X-Virus-Scanned: amavisd-new at c-s.fr Received: from messagerie.si.c-s.fr ([127.0.0.1]) by localhost (messagerie.si.c-s.fr [127.0.0.1]) (amavisd-new, port 10023) with ESMTP id hmmxoWXQmTJi; Mon, 27 May 2024 15:30:23 +0200 (CEST) Received: from PO20335.idsi0.si.c-s.fr (unknown [192.168.232.49]) by messagerie.si.c-s.fr (Postfix) with ESMTP id A92238B764; Mon, 27 May 2024 15:30:19 +0200 (CEST) From: Christophe Leroy To: Andrew Morton , Jason Gunthorpe , Peter Xu , Oscar Salvador , Michael Ellerman , Nicholas Piggin Subject: [RFC PATCH v4 02/16] mm: Define __pte_leaf_size() to also take a PMD entry Date: Mon, 27 May 2024 15:30:00 +0200 Message-ID: X-Mailer: git-send-email 2.44.0 In-Reply-To: References: MIME-Version: 1.0 X-Developer-Signature: v=1; a=ed25519-sha256; t=1716816600; l=1803; i=christophe.leroy@csgroup.eu; s=20211009; h=from:subject:message-id; bh=UzfhNq5PmSTEAt037maHDDYdpiu9u0KQG11+Ji9a3Ko=; b=QTMHKiCYoXWonuyH5nH6U6kLDwMLXHwSww1cSyPU33riwsvrJQGt9NlFB1qDR9YbUHKo9aZhD LWoqX2Vka/mAIu3z8TYAq2rQ45aYu14RwGhgdDeAjsMt9EKpC5bwNit X-Developer-Key: i=christophe.leroy@csgroup.eu; a=ed25519; pk=HIzTzUj91asvincQGOFx6+ZF5AoUuP9GdOtQChs7Mm0= X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org Errors-To: linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" On powerpc 8xx, when a page is 8M size, the information is in the PMD entry. So allow architectures to provide __pte_leaf_size() instead of pte_leaf_size() and provide the PMD entry to that function. When __pte_leaf_size() is not defined, define it as a pte_leaf_size() so that architectures not interested in the PMD arguments are not impacted. Only define a default pte_leaf_size() when __pte_leaf_size() is not defined to make sure nobody adds new calls to pte_leaf_size() in the core. Signed-off-by: Christophe Leroy Reviewed-by: Oscar Salvador --- v3: Don't change pte_leaf_size() to not impact other architectures --- include/linux/pgtable.h | 3 +++ kernel/events/core.c | 2 +- 2 files changed, 4 insertions(+), 1 deletion(-) diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h index 18019f037bae..3080e7cde3de 100644 --- a/include/linux/pgtable.h +++ b/include/linux/pgtable.h @@ -1888,9 +1888,12 @@ typedef unsigned int pgtbl_mod_mask; #ifndef pmd_leaf_size #define pmd_leaf_size(x) PMD_SIZE #endif +#ifndef __pte_leaf_size #ifndef pte_leaf_size #define pte_leaf_size(x) PAGE_SIZE #endif +#define __pte_leaf_size(x,y) pte_leaf_size(y) +#endif /* * We always define pmd_pfn for all archs as it's used in lots of generic diff --git a/kernel/events/core.c b/kernel/events/core.c index f0128c5ff278..880df84ce07c 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -7596,7 +7596,7 @@ static u64 perf_get_pgtable_size(struct mm_struct *mm, unsigned long addr) pte = ptep_get_lockless(ptep); if (pte_present(pte)) - size = pte_leaf_size(pte); + size = __pte_leaf_size(pmd, pte); pte_unmap(ptep); #endif /* CONFIG_HAVE_GUP_FAST */ From patchwork Mon May 27 13:30:01 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christophe Leroy X-Patchwork-Id: 1939946 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ozlabs.org (client-ip=112.213.38.117; helo=lists.ozlabs.org; envelope-from=linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org; receiver=patchwork.ozlabs.org) Received: from lists.ozlabs.org (lists.ozlabs.org [112.213.38.117]) by legolas.ozlabs.org (Postfix) with ESMTP id 4VnxWt6rSDz20PT for ; Mon, 27 May 2024 23:36:58 +1000 (AEST) Received: from boromir.ozlabs.org (localhost [IPv6:::1]) by lists.ozlabs.org (Postfix) with ESMTP id 4VnxPm2JYWz3cSK for ; Mon, 27 May 2024 23:31:40 +1000 (AEST) X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: lists.ozlabs.org; dmarc=pass (p=quarantine dis=none) header.from=csgroup.eu Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=csgroup.eu (client-ip=93.17.236.30; helo=pegase1.c-s.fr; envelope-from=christophe.leroy@csgroup.eu; receiver=lists.ozlabs.org) Received: from pegase1.c-s.fr (pegase1.c-s.fr [93.17.236.30]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4VnxNf3NC1z3vf4 for ; Mon, 27 May 2024 23:30:42 +1000 (AEST) Received: from localhost (mailhub3.si.c-s.fr [192.168.12.233]) by localhost (Postfix) with ESMTP id 4VnxNP3zxRz9tC6; Mon, 27 May 2024 15:30:29 +0200 (CEST) X-Virus-Scanned: amavisd-new at c-s.fr Received: from pegase1.c-s.fr ([192.168.12.234]) by localhost (pegase1.c-s.fr [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id nIdK8OizgSyF; Mon, 27 May 2024 15:30:29 +0200 (CEST) Received: from messagerie.si.c-s.fr (messagerie.si.c-s.fr [192.168.25.192]) by pegase1.c-s.fr (Postfix) with ESMTP id 4VnxNP3Fnxz9sqS; Mon, 27 May 2024 15:30:29 +0200 (CEST) Received: from localhost (localhost [127.0.0.1]) by messagerie.si.c-s.fr (Postfix) with ESMTP id 6A89E8B773; Mon, 27 May 2024 15:30:29 +0200 (CEST) X-Virus-Scanned: amavisd-new at c-s.fr Received: from messagerie.si.c-s.fr ([127.0.0.1]) by localhost (messagerie.si.c-s.fr [127.0.0.1]) (amavisd-new, port 10023) with ESMTP id ox5E4KZjDZS4; Mon, 27 May 2024 15:30:29 +0200 (CEST) Received: from PO20335.idsi0.si.c-s.fr (unknown [192.168.232.49]) by messagerie.si.c-s.fr (Postfix) with ESMTP id 872BA8B764; Mon, 27 May 2024 15:30:23 +0200 (CEST) From: Christophe Leroy To: Andrew Morton , Jason Gunthorpe , Peter Xu , Oscar Salvador , Michael Ellerman , Nicholas Piggin Subject: [RFC PATCH v4 03/16] mm: Provide mm_struct and address to huge_ptep_get() Date: Mon, 27 May 2024 15:30:01 +0200 Message-ID: <941ae68c7813cafc7aee3f34131f895fb7599636.1716815901.git.christophe.leroy@csgroup.eu> X-Mailer: git-send-email 2.44.0 In-Reply-To: References: MIME-Version: 1.0 X-Developer-Signature: v=1; a=ed25519-sha256; t=1716816600; l=23916; i=christophe.leroy@csgroup.eu; s=20211009; h=from:subject:message-id; bh=Sd69KgHpZZBh4AekXreEq4ibrjP7rDYxnYN6D0zK1Is=; b=yemf2RPMY+vmtUyGtDfRjTxs/EAkDqEGTXhNRKc5TGVb805K556op9qhTPNLfqpJp7WroVlw9 /lzMljbQcrEDlTA3+IRpz4vNZPX9tCnwaZGH4+dNlidFqYYzQfalbEY X-Developer-Key: i=christophe.leroy@csgroup.eu; a=ed25519; pk=HIzTzUj91asvincQGOFx6+ZF5AoUuP9GdOtQChs7Mm0= X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org Errors-To: linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" On powerpc 8xx huge_ptep_get() will need to know whether the given ptep is a PTE entry or a PMD entry. This cannot be known with the PMD entry itself because there is no easy way to know it from the content of the entry. So huge_ptep_get() will need to know either the size of the page or get the pmd. In order to be consistent with huge_ptep_get_and_clear(), give mm and address to huge_ptep_get(). Signed-off-by: Christophe Leroy Reviewed-by: Oscar Salvador --- v2: Add missing changes in arch implementations v3: Fixed a comment in ARM and missing changes in S390 v4: Fix missing or bad changes in mm/hugetlb.c and rebased on v6.10-rc1 --- arch/arm/include/asm/hugetlb-3level.h | 4 +-- arch/arm64/include/asm/hugetlb.h | 2 +- arch/arm64/mm/hugetlbpage.c | 2 +- arch/riscv/include/asm/hugetlb.h | 2 +- arch/riscv/mm/hugetlbpage.c | 2 +- arch/s390/include/asm/hugetlb.h | 4 +-- arch/s390/mm/hugetlbpage.c | 4 +-- fs/hugetlbfs/inode.c | 2 +- fs/proc/task_mmu.c | 10 +++--- fs/userfaultfd.c | 2 +- include/asm-generic/hugetlb.h | 2 +- include/linux/swapops.h | 4 +-- mm/damon/vaddr.c | 6 ++-- mm/gup.c | 2 +- mm/hmm.c | 2 +- mm/hugetlb.c | 44 +++++++++++++-------------- mm/memory-failure.c | 2 +- mm/mempolicy.c | 2 +- mm/migrate.c | 4 +-- mm/mincore.c | 2 +- mm/userfaultfd.c | 2 +- 21 files changed, 53 insertions(+), 53 deletions(-) diff --git a/arch/arm/include/asm/hugetlb-3level.h b/arch/arm/include/asm/hugetlb-3level.h index a30be5505793..87d48e2d90ad 100644 --- a/arch/arm/include/asm/hugetlb-3level.h +++ b/arch/arm/include/asm/hugetlb-3level.h @@ -13,12 +13,12 @@ /* * If our huge pte is non-zero then mark the valid bit. - * This allows pte_present(huge_ptep_get(ptep)) to return true for non-zero + * This allows pte_present(huge_ptep_get(mm,addr,ptep)) to return true for non-zero * ptes. * (The valid bit is automatically cleared by set_pte_at for PROT_NONE ptes). */ #define __HAVE_ARCH_HUGE_PTEP_GET -static inline pte_t huge_ptep_get(pte_t *ptep) +static inline pte_t huge_ptep_get(struct mm_struct *mm, unsigned long addr, pte_t *ptep) { pte_t retval = *ptep; if (pte_val(retval)) diff --git a/arch/arm64/include/asm/hugetlb.h b/arch/arm64/include/asm/hugetlb.h index 3954cbd2ff56..293f880865e8 100644 --- a/arch/arm64/include/asm/hugetlb.h +++ b/arch/arm64/include/asm/hugetlb.h @@ -46,7 +46,7 @@ extern pte_t huge_ptep_clear_flush(struct vm_area_struct *vma, extern void huge_pte_clear(struct mm_struct *mm, unsigned long addr, pte_t *ptep, unsigned long sz); #define __HAVE_ARCH_HUGE_PTEP_GET -extern pte_t huge_ptep_get(pte_t *ptep); +extern pte_t huge_ptep_get(struct mm_struct *mm, unsigned long addr, pte_t *ptep); void __init arm64_hugetlb_cma_reserve(void); diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c index 3f09ac73cce3..5f1e2103888b 100644 --- a/arch/arm64/mm/hugetlbpage.c +++ b/arch/arm64/mm/hugetlbpage.c @@ -127,7 +127,7 @@ static inline int num_contig_ptes(unsigned long size, size_t *pgsize) return contig_ptes; } -pte_t huge_ptep_get(pte_t *ptep) +pte_t huge_ptep_get(struct mm_struct *mm, unsigned long addr, pte_t *ptep) { int ncontig, i; size_t pgsize; diff --git a/arch/riscv/include/asm/hugetlb.h b/arch/riscv/include/asm/hugetlb.h index b1ce97a9dbfc..faf3624d8057 100644 --- a/arch/riscv/include/asm/hugetlb.h +++ b/arch/riscv/include/asm/hugetlb.h @@ -44,7 +44,7 @@ int huge_ptep_set_access_flags(struct vm_area_struct *vma, pte_t pte, int dirty); #define __HAVE_ARCH_HUGE_PTEP_GET -pte_t huge_ptep_get(pte_t *ptep); +pte_t huge_ptep_get(struct mm_struct *mm, unsigned long addr, pte_t *ptep); pte_t arch_make_huge_pte(pte_t entry, unsigned int shift, vm_flags_t flags); #define arch_make_huge_pte arch_make_huge_pte diff --git a/arch/riscv/mm/hugetlbpage.c b/arch/riscv/mm/hugetlbpage.c index 0ebd968b33c9..42314f093922 100644 --- a/arch/riscv/mm/hugetlbpage.c +++ b/arch/riscv/mm/hugetlbpage.c @@ -3,7 +3,7 @@ #include #ifdef CONFIG_RISCV_ISA_SVNAPOT -pte_t huge_ptep_get(pte_t *ptep) +pte_t huge_ptep_get(struct mm_struct *mm, unsigned long addr, pte_t *ptep) { unsigned long pte_num; int i; diff --git a/arch/s390/include/asm/hugetlb.h b/arch/s390/include/asm/hugetlb.h index ce5f4fe8be4d..cf1b5d6fb1a6 100644 --- a/arch/s390/include/asm/hugetlb.h +++ b/arch/s390/include/asm/hugetlb.h @@ -19,7 +19,7 @@ void set_huge_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep, pte_t pte, unsigned long sz); void __set_huge_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep, pte_t pte); -pte_t huge_ptep_get(pte_t *ptep); +pte_t huge_ptep_get(struct mm_struct *mm, unsigned long addr, pte_t *ptep); pte_t huge_ptep_get_and_clear(struct mm_struct *mm, unsigned long addr, pte_t *ptep); @@ -64,7 +64,7 @@ static inline int huge_ptep_set_access_flags(struct vm_area_struct *vma, unsigned long addr, pte_t *ptep, pte_t pte, int dirty) { - int changed = !pte_same(huge_ptep_get(ptep), pte); + int changed = !pte_same(huge_ptep_get(vma->vm_mm, addr, ptep), pte); if (changed) { huge_ptep_get_and_clear(vma->vm_mm, addr, ptep); __set_huge_pte_at(vma->vm_mm, addr, ptep, pte); diff --git a/arch/s390/mm/hugetlbpage.c b/arch/s390/mm/hugetlbpage.c index 2675aab4acc7..1be481672f4a 100644 --- a/arch/s390/mm/hugetlbpage.c +++ b/arch/s390/mm/hugetlbpage.c @@ -169,7 +169,7 @@ void set_huge_pte_at(struct mm_struct *mm, unsigned long addr, __set_huge_pte_at(mm, addr, ptep, pte); } -pte_t huge_ptep_get(pte_t *ptep) +pte_t huge_ptep_get(struct mm_struct *mm, unsigned long addr, pte_t *ptep) { return __rste_to_pte(pte_val(*ptep)); } @@ -177,7 +177,7 @@ pte_t huge_ptep_get(pte_t *ptep) pte_t huge_ptep_get_and_clear(struct mm_struct *mm, unsigned long addr, pte_t *ptep) { - pte_t pte = huge_ptep_get(ptep); + pte_t pte = huge_ptep_get(mm, addr, ptep); pmd_t *pmdp = (pmd_t *) ptep; pud_t *pudp = (pud_t *) ptep; diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c index 412f295acebe..970c7e6fb144 100644 --- a/fs/hugetlbfs/inode.c +++ b/fs/hugetlbfs/inode.c @@ -422,7 +422,7 @@ static bool hugetlb_vma_maps_page(struct vm_area_struct *vma, if (!ptep) return false; - pte = huge_ptep_get(ptep); + pte = huge_ptep_get(vma->vm_mm, addr, ptep); if (huge_pte_none(pte) || !pte_present(pte)) return false; diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c index f8d35f993fe5..a74c1e42da0c 100644 --- a/fs/proc/task_mmu.c +++ b/fs/proc/task_mmu.c @@ -730,7 +730,7 @@ static int smaps_hugetlb_range(pte_t *pte, unsigned long hmask, { struct mem_size_stats *mss = walk->private; struct vm_area_struct *vma = walk->vma; - pte_t ptent = huge_ptep_get(pte); + pte_t ptent = huge_ptep_get(walk->mm, addr, pte); struct folio *folio = NULL; if (pte_present(ptent)) { @@ -1582,7 +1582,7 @@ static int pagemap_hugetlb_range(pte_t *ptep, unsigned long hmask, if (vma->vm_flags & VM_SOFTDIRTY) flags |= PM_SOFT_DIRTY; - pte = huge_ptep_get(ptep); + pte = huge_ptep_get(walk->mm, addr, ptep); if (pte_present(pte)) { struct folio *folio = page_folio(pte_page(pte)); @@ -2271,7 +2271,7 @@ static int pagemap_scan_hugetlb_entry(pte_t *ptep, unsigned long hmask, if (~p->arg.flags & PM_SCAN_WP_MATCHING) { /* Go the short route when not write-protecting pages. */ - pte = huge_ptep_get(ptep); + pte = huge_ptep_get(walk->mm, start, ptep); categories = p->cur_vma_category | pagemap_hugetlb_category(pte); if (!pagemap_scan_is_interesting_page(categories, p)) @@ -2283,7 +2283,7 @@ static int pagemap_scan_hugetlb_entry(pte_t *ptep, unsigned long hmask, i_mmap_lock_write(vma->vm_file->f_mapping); ptl = huge_pte_lock(hstate_vma(vma), vma->vm_mm, ptep); - pte = huge_ptep_get(ptep); + pte = huge_ptep_get(walk->mm, start, ptep); categories = p->cur_vma_category | pagemap_hugetlb_category(pte); if (!pagemap_scan_is_interesting_page(categories, p)) @@ -2679,7 +2679,7 @@ static int gather_pte_stats(pmd_t *pmd, unsigned long addr, static int gather_hugetlb_stats(pte_t *pte, unsigned long hmask, unsigned long addr, unsigned long end, struct mm_walk *walk) { - pte_t huge_pte = huge_ptep_get(pte); + pte_t huge_pte = huge_ptep_get(walk->mm, addr, pte); struct numa_maps *md; struct page *page; diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c index eee7320ab0b0..bafce25fd676 100644 --- a/fs/userfaultfd.c +++ b/fs/userfaultfd.c @@ -257,7 +257,7 @@ static inline bool userfaultfd_huge_must_wait(struct userfaultfd_ctx *ctx, goto out; ret = false; - pte = huge_ptep_get(ptep); + pte = huge_ptep_get(vma->vm_mm, vmf->address, ptep); /* * Lockless access: we're in a wait_event so it's ok if it diff --git a/include/asm-generic/hugetlb.h b/include/asm-generic/hugetlb.h index 6dcf4d576970..594d5905f615 100644 --- a/include/asm-generic/hugetlb.h +++ b/include/asm-generic/hugetlb.h @@ -144,7 +144,7 @@ static inline int huge_ptep_set_access_flags(struct vm_area_struct *vma, #endif #ifndef __HAVE_ARCH_HUGE_PTEP_GET -static inline pte_t huge_ptep_get(pte_t *ptep) +static inline pte_t huge_ptep_get(struct mm_struct *mm, unsigned long addr, pte_t *ptep) { return ptep_get(ptep); } diff --git a/include/linux/swapops.h b/include/linux/swapops.h index a5c560a2f8c2..cb468e418ea1 100644 --- a/include/linux/swapops.h +++ b/include/linux/swapops.h @@ -334,7 +334,7 @@ static inline bool is_migration_entry_dirty(swp_entry_t entry) extern void migration_entry_wait(struct mm_struct *mm, pmd_t *pmd, unsigned long address); -extern void migration_entry_wait_huge(struct vm_area_struct *vma, pte_t *pte); +extern void migration_entry_wait_huge(struct vm_area_struct *vma, unsigned long addr, pte_t *pte); #else /* CONFIG_MIGRATION */ static inline swp_entry_t make_readable_migration_entry(pgoff_t offset) { @@ -359,7 +359,7 @@ static inline int is_migration_entry(swp_entry_t swp) static inline void migration_entry_wait(struct mm_struct *mm, pmd_t *pmd, unsigned long address) { } static inline void migration_entry_wait_huge(struct vm_area_struct *vma, - pte_t *pte) { } + unsigned long addr, pte_t *pte) { } static inline int is_writable_migration_entry(swp_entry_t entry) { return 0; diff --git a/mm/damon/vaddr.c b/mm/damon/vaddr.c index 381559e4a1fa..58829baf8b5d 100644 --- a/mm/damon/vaddr.c +++ b/mm/damon/vaddr.c @@ -339,7 +339,7 @@ static void damon_hugetlb_mkold(pte_t *pte, struct mm_struct *mm, struct vm_area_struct *vma, unsigned long addr) { bool referenced = false; - pte_t entry = huge_ptep_get(pte); + pte_t entry = huge_ptep_get(mm, addr, pte); struct folio *folio = pfn_folio(pte_pfn(entry)); unsigned long psize = huge_page_size(hstate_vma(vma)); @@ -373,7 +373,7 @@ static int damon_mkold_hugetlb_entry(pte_t *pte, unsigned long hmask, pte_t entry; ptl = huge_pte_lock(h, walk->mm, pte); - entry = huge_ptep_get(pte); + entry = huge_ptep_get(walk->mm, addr, pte); if (!pte_present(entry)) goto out; @@ -509,7 +509,7 @@ static int damon_young_hugetlb_entry(pte_t *pte, unsigned long hmask, pte_t entry; ptl = huge_pte_lock(h, walk->mm, pte); - entry = huge_ptep_get(pte); + entry = huge_ptep_get(walk->mm, addr, pte); if (!pte_present(entry)) goto out; diff --git a/mm/gup.c b/mm/gup.c index ca0f5cedce9b..53ebb0ae53a0 100644 --- a/mm/gup.c +++ b/mm/gup.c @@ -547,7 +547,7 @@ static int gup_hugepte(struct vm_area_struct *vma, pte_t *ptep, unsigned long sz if (pte_end < end) end = pte_end; - pte = huge_ptep_get(ptep); + pte = huge_ptep_get(vma->mm, addr, ptep); if (!pte_access_permitted(pte, flags & FOLL_WRITE)) return 0; diff --git a/mm/hmm.c b/mm/hmm.c index 93aebd9cc130..7e0229ae4a5a 100644 --- a/mm/hmm.c +++ b/mm/hmm.c @@ -480,7 +480,7 @@ static int hmm_vma_walk_hugetlb_entry(pte_t *pte, unsigned long hmask, pte_t entry; ptl = huge_pte_lock(hstate_vma(vma), walk->mm, pte); - entry = huge_ptep_get(pte); + entry = huge_ptep_get(walk->mm, addr, pte); i = (start - range->start) >> PAGE_SHIFT; pfn_req_flags = range->hmm_pfns[i]; diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 6be78e7d4f6e..861e304df39c 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -5315,7 +5315,7 @@ static void set_huge_ptep_writable(struct vm_area_struct *vma, { pte_t entry; - entry = huge_pte_mkwrite(huge_pte_mkdirty(huge_ptep_get(ptep))); + entry = huge_pte_mkwrite(huge_pte_mkdirty(huge_ptep_get(vma->vm_mm, address, ptep))); if (huge_ptep_set_access_flags(vma, address, ptep, entry, 1)) update_mmu_cache(vma, address, ptep); } @@ -5423,7 +5423,7 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src, dst_ptl = huge_pte_lock(h, dst, dst_pte); src_ptl = huge_pte_lockptr(h, src, src_pte); spin_lock_nested(src_ptl, SINGLE_DEPTH_NESTING); - entry = huge_ptep_get(src_pte); + entry = huge_ptep_get(src_vma->vm_mm, addr, src_pte); again: if (huge_pte_none(entry)) { /* @@ -5461,7 +5461,7 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src, set_huge_pte_at(dst, addr, dst_pte, make_pte_marker(marker), sz); } else { - entry = huge_ptep_get(src_pte); + entry = huge_ptep_get(src_vma->vm_mm, addr, src_pte); pte_folio = page_folio(pte_page(entry)); folio_get(pte_folio); @@ -5503,7 +5503,7 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src, dst_ptl = huge_pte_lock(h, dst, dst_pte); src_ptl = huge_pte_lockptr(h, src, src_pte); spin_lock_nested(src_ptl, SINGLE_DEPTH_NESTING); - entry = huge_ptep_get(src_pte); + entry = huge_ptep_get(src_vma->vm_mm, addr, src_pte); if (!pte_same(src_pte_old, entry)) { restore_reserve_on_error(h, dst_vma, addr, new_folio); @@ -5613,7 +5613,7 @@ int move_hugetlb_page_tables(struct vm_area_struct *vma, new_addr |= last_addr_mask; continue; } - if (huge_pte_none(huge_ptep_get(src_pte))) + if (huge_pte_none(huge_ptep_get(mm, old_addr, src_pte))) continue; if (huge_pmd_unshare(mm, vma, old_addr, src_pte)) { @@ -5686,7 +5686,7 @@ void __unmap_hugepage_range(struct mmu_gather *tlb, struct vm_area_struct *vma, continue; } - pte = huge_ptep_get(ptep); + pte = huge_ptep_get(mm, address, ptep); if (huge_pte_none(pte)) { spin_unlock(ptl); continue; @@ -5923,7 +5923,7 @@ static vm_fault_t hugetlb_wp(struct folio *pagecache_folio, struct vm_area_struct *vma = vmf->vma; struct mm_struct *mm = vma->vm_mm; const bool unshare = vmf->flags & FAULT_FLAG_UNSHARE; - pte_t pte = huge_ptep_get(vmf->pte); + pte_t pte = huge_ptep_get(mm, vmf->address, vmf->pte); struct hstate *h = hstate_vma(vma); struct folio *old_folio; struct folio *new_folio; @@ -6044,7 +6044,7 @@ static vm_fault_t hugetlb_wp(struct folio *pagecache_folio, vmf->pte = hugetlb_walk(vma, vmf->address, huge_page_size(h)); if (likely(vmf->pte && - pte_same(huge_ptep_get(vmf->pte), pte))) + pte_same(huge_ptep_get(mm, vmf->address, vmf->pte), pte))) goto retry_avoidcopy; /* * race occurs while re-acquiring page table @@ -6082,7 +6082,7 @@ static vm_fault_t hugetlb_wp(struct folio *pagecache_folio, */ spin_lock(vmf->ptl); vmf->pte = hugetlb_walk(vma, vmf->address, huge_page_size(h)); - if (likely(vmf->pte && pte_same(huge_ptep_get(vmf->pte), pte))) { + if (likely(vmf->pte && pte_same(huge_ptep_get(mm, vmf->address, vmf->pte), pte))) { pte_t newpte = make_huge_pte(vma, &new_folio->page, !unshare); /* Break COW or unshare */ @@ -6183,14 +6183,14 @@ static inline vm_fault_t hugetlb_handle_userfault(struct vm_fault *vmf, * Recheck pte with pgtable lock. Returns true if pte didn't change, or * false if pte changed or is changing. */ -static bool hugetlb_pte_stable(struct hstate *h, struct mm_struct *mm, +static bool hugetlb_pte_stable(struct hstate *h, struct mm_struct *mm, unsigned long addr, pte_t *ptep, pte_t old_pte) { spinlock_t *ptl; bool same; ptl = huge_pte_lock(h, mm, ptep); - same = pte_same(huge_ptep_get(ptep), old_pte); + same = pte_same(huge_ptep_get(mm, addr, ptep), old_pte); spin_unlock(ptl); return same; @@ -6251,7 +6251,7 @@ static vm_fault_t hugetlb_no_page(struct address_space *mapping, * never happen on the page after UFFDIO_COPY has * correctly installed the page and returned. */ - if (!hugetlb_pte_stable(h, mm, vmf->pte, vmf->orig_pte)) { + if (!hugetlb_pte_stable(h, mm, vmf->address, vmf->pte, vmf->orig_pte)) { ret = 0; goto out; } @@ -6280,7 +6280,7 @@ static vm_fault_t hugetlb_no_page(struct address_space *mapping, * here. Before returning error, get ptl and make * sure there really is no pte entry. */ - if (hugetlb_pte_stable(h, mm, vmf->pte, vmf->orig_pte)) + if (hugetlb_pte_stable(h, mm, vmf->address, vmf->pte, vmf->orig_pte)) ret = vmf_error(PTR_ERR(folio)); else ret = 0; @@ -6330,7 +6330,7 @@ static vm_fault_t hugetlb_no_page(struct address_space *mapping, folio_unlock(folio); folio_put(folio); /* See comment in userfaultfd_missing() block above */ - if (!hugetlb_pte_stable(h, mm, vmf->pte, vmf->orig_pte)) { + if (!hugetlb_pte_stable(h, mm, vmf->address, vmf->pte, vmf->orig_pte)) { ret = 0; goto out; } @@ -6357,7 +6357,7 @@ static vm_fault_t hugetlb_no_page(struct address_space *mapping, vmf->ptl = huge_pte_lock(h, mm, vmf->pte); ret = 0; /* If pte changed from under us, retry */ - if (!pte_same(huge_ptep_get(vmf->pte), vmf->orig_pte)) + if (!pte_same(huge_ptep_get(mm, vmf->address, vmf->pte), vmf->orig_pte)) goto backout; if (anon_rmap) @@ -6478,7 +6478,7 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma, return VM_FAULT_OOM; } - vmf.orig_pte = huge_ptep_get(vmf.pte); + vmf.orig_pte = huge_ptep_get(mm, vmf.address, vmf.pte); if (huge_pte_none_mostly(vmf.orig_pte)) { if (is_pte_marker(vmf.orig_pte)) { pte_marker marker = @@ -6519,7 +6519,7 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma, * be released there. */ mutex_unlock(&hugetlb_fault_mutex_table[hash]); - migration_entry_wait_huge(vma, vmf.pte); + migration_entry_wait_huge(vma, vmf.address, vmf.pte); return 0; } else if (unlikely(is_hugetlb_entry_hwpoisoned(vmf.orig_pte))) ret = VM_FAULT_HWPOISON_LARGE | @@ -6552,11 +6552,11 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma, vmf.ptl = huge_pte_lock(h, mm, vmf.pte); /* Check for a racing update before calling hugetlb_wp() */ - if (unlikely(!pte_same(vmf.orig_pte, huge_ptep_get(vmf.pte)))) + if (unlikely(!pte_same(vmf.orig_pte, huge_ptep_get(mm, vmf.address, vmf.pte)))) goto out_ptl; /* Handle userfault-wp first, before trying to lock more pages */ - if (userfaultfd_wp(vma) && huge_pte_uffd_wp(huge_ptep_get(vmf.pte)) && + if (userfaultfd_wp(vma) && huge_pte_uffd_wp(huge_ptep_get(mm, vmf.address, vmf.pte)) && (flags & FAULT_FLAG_WRITE) && !huge_pte_write(vmf.orig_pte)) { if (!userfaultfd_wp_async(vma)) { spin_unlock(vmf.ptl); @@ -6684,7 +6684,7 @@ int hugetlb_mfill_atomic_pte(pte_t *dst_pte, ptl = huge_pte_lock(h, dst_mm, dst_pte); /* Don't overwrite any existing PTEs (even markers) */ - if (!huge_pte_none(huge_ptep_get(dst_pte))) { + if (!huge_pte_none(huge_ptep_get(dst_mm, dst_addr, dst_pte))) { spin_unlock(ptl); return -EEXIST; } @@ -6821,7 +6821,7 @@ int hugetlb_mfill_atomic_pte(pte_t *dst_pte, * page backing it, then access the page. */ ret = -EEXIST; - if (!huge_pte_none_mostly(huge_ptep_get(dst_pte))) + if (!huge_pte_none_mostly(huge_ptep_get(dst_mm, dst_addr, dst_pte))) goto out_release_unlock; if (folio_in_pagecache) @@ -6942,7 +6942,7 @@ long hugetlb_change_protection(struct vm_area_struct *vma, address |= last_addr_mask; continue; } - pte = huge_ptep_get(ptep); + pte = huge_ptep_get(mm, address, ptep); if (unlikely(is_hugetlb_entry_hwpoisoned(pte))) { /* Nothing to do. */ } else if (unlikely(is_hugetlb_entry_migration(pte))) { diff --git a/mm/memory-failure.c b/mm/memory-failure.c index d3c830e817e3..b8216b769784 100644 --- a/mm/memory-failure.c +++ b/mm/memory-failure.c @@ -834,7 +834,7 @@ static int hwpoison_hugetlb_range(pte_t *ptep, unsigned long hmask, struct mm_walk *walk) { struct hwpoison_walk *hwp = walk->private; - pte_t pte = huge_ptep_get(ptep); + pte_t pte = huge_ptep_get(walk->mm, addr, ptep); struct hstate *h = hstate_vma(walk->vma); return check_hwpoisoned_entry(pte, addr, huge_page_shift(h), diff --git a/mm/mempolicy.c b/mm/mempolicy.c index aec756ae5637..b01bb1ebfee7 100644 --- a/mm/mempolicy.c +++ b/mm/mempolicy.c @@ -624,7 +624,7 @@ static int queue_folios_hugetlb(pte_t *pte, unsigned long hmask, pte_t entry; ptl = huge_pte_lock(hstate_vma(walk->vma), walk->mm, pte); - entry = huge_ptep_get(pte); + entry = huge_ptep_get(walk->mm, addr, pte); if (!pte_present(entry)) { if (unlikely(is_hugetlb_entry_migration(entry))) qp->nr_failed++; diff --git a/mm/migrate.c b/mm/migrate.c index dd04f578c19c..4862c5924a9d 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -338,14 +338,14 @@ void migration_entry_wait(struct mm_struct *mm, pmd_t *pmd, * * This function will release the vma lock before returning. */ -void migration_entry_wait_huge(struct vm_area_struct *vma, pte_t *ptep) +void migration_entry_wait_huge(struct vm_area_struct *vma, unsigned long addr, pte_t *ptep) { spinlock_t *ptl = huge_pte_lockptr(hstate_vma(vma), vma->vm_mm, ptep); pte_t pte; hugetlb_vma_assert_locked(vma); spin_lock(ptl); - pte = huge_ptep_get(ptep); + pte = huge_ptep_get(vma->vm_mm, addr, ptep); if (unlikely(!is_hugetlb_entry_migration(pte))) { spin_unlock(ptl); diff --git a/mm/mincore.c b/mm/mincore.c index dad3622cc963..b5735a4aaa7d 100644 --- a/mm/mincore.c +++ b/mm/mincore.c @@ -33,7 +33,7 @@ static int mincore_hugetlb(pte_t *pte, unsigned long hmask, unsigned long addr, * Hugepages under user process are always in RAM and never * swapped out, but theoretically it needs to be checked. */ - present = pte && !huge_pte_none_mostly(huge_ptep_get(pte)); + present = pte && !huge_pte_none_mostly(huge_ptep_get(walk->mm, addr, pte)); for (; addr != end; vec++, addr += PAGE_SIZE) *vec = present; walk->private = vec; diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c index defa5109cc62..8f1fd5af5909 100644 --- a/mm/userfaultfd.c +++ b/mm/userfaultfd.c @@ -587,7 +587,7 @@ static __always_inline ssize_t mfill_atomic_hugetlb( } if (!uffd_flags_mode_is(flags, MFILL_ATOMIC_CONTINUE) && - !huge_pte_none_mostly(huge_ptep_get(dst_pte))) { + !huge_pte_none_mostly(huge_ptep_get(dst_mm, dst_addr, dst_pte))) { err = -EEXIST; hugetlb_vma_unlock_read(dst_vma); mutex_unlock(&hugetlb_fault_mutex_table[hash]); From patchwork Mon May 27 13:30:02 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christophe Leroy X-Patchwork-Id: 1939961 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ozlabs.org (client-ip=2404:9400:2:0:216:3eff:fee1:b9f1; helo=lists.ozlabs.org; envelope-from=linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org; receiver=patchwork.ozlabs.org) Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2404:9400:2:0:216:3eff:fee1:b9f1]) by legolas.ozlabs.org (Postfix) with ESMTP id 4VnxfT0PWnz20KL for ; Mon, 27 May 2024 23:42:41 +1000 (AEST) Received: from boromir.ozlabs.org (localhost [IPv6:::1]) by lists.ozlabs.org (Postfix) with ESMTP id 4VnxQJ5Kzsz87vw for ; Mon, 27 May 2024 23:32:08 +1000 (AEST) X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: lists.ozlabs.org; dmarc=pass (p=quarantine dis=none) header.from=csgroup.eu Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=csgroup.eu (client-ip=93.17.236.30; helo=pegase1.c-s.fr; envelope-from=christophe.leroy@csgroup.eu; receiver=lists.ozlabs.org) Received: from pegase1.c-s.fr (pegase1.c-s.fr [93.17.236.30]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4VnxNr1tPLz792X for ; Mon, 27 May 2024 23:30:51 +1000 (AEST) Received: from localhost (mailhub3.si.c-s.fr [192.168.12.233]) by localhost (Postfix) with ESMTP id 4VnxNS2y9Tz9tCB; Mon, 27 May 2024 15:30:32 +0200 (CEST) X-Virus-Scanned: amavisd-new at c-s.fr Received: from pegase1.c-s.fr ([192.168.12.234]) by localhost (pegase1.c-s.fr [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id dwzHL0j4Um4X; Mon, 27 May 2024 15:30:32 +0200 (CEST) Received: from messagerie.si.c-s.fr (messagerie.si.c-s.fr [192.168.25.192]) by pegase1.c-s.fr (Postfix) with ESMTP id 4VnxNS27XHz9sqS; Mon, 27 May 2024 15:30:32 +0200 (CEST) Received: from localhost (localhost [127.0.0.1]) by messagerie.si.c-s.fr (Postfix) with ESMTP id 445D48B773; Mon, 27 May 2024 15:30:32 +0200 (CEST) X-Virus-Scanned: amavisd-new at c-s.fr Received: from messagerie.si.c-s.fr ([127.0.0.1]) by localhost (messagerie.si.c-s.fr [127.0.0.1]) (amavisd-new, port 10023) with ESMTP id z_9OGc64ZTyp; Mon, 27 May 2024 15:30:32 +0200 (CEST) Received: from PO20335.idsi0.si.c-s.fr (unknown [192.168.232.49]) by messagerie.si.c-s.fr (Postfix) with ESMTP id B2E508B764; Mon, 27 May 2024 15:30:29 +0200 (CEST) From: Christophe Leroy To: Andrew Morton , Jason Gunthorpe , Peter Xu , Oscar Salvador , Michael Ellerman , Nicholas Piggin Subject: [RFC PATCH v4 04/16] powerpc/mm: Remove _PAGE_PSIZE Date: Mon, 27 May 2024 15:30:02 +0200 Message-ID: X-Mailer: git-send-email 2.44.0 In-Reply-To: References: MIME-Version: 1.0 X-Developer-Signature: v=1; a=ed25519-sha256; t=1716816600; l=3781; i=christophe.leroy@csgroup.eu; s=20211009; h=from:subject:message-id; bh=ymAToQL34uMf8YFNIV4scdsUyseWbpqXywhQPbZn4dQ=; b=mgroCXZl2/TZQwRAqnuJswewIKfMl4lJ+gcZuO2nU2xi2qIowPjsJ/9jFFjBsd+KifgCqhjoI +l+xohWFTyKAbv9S8ISGhJeETUR9pdCiIfLYaWAaL+8GS2KnB1LQFyw X-Developer-Key: i=christophe.leroy@csgroup.eu; a=ed25519; pk=HIzTzUj91asvincQGOFx6+ZF5AoUuP9GdOtQChs7Mm0= X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org Errors-To: linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" _PAGE_PSIZE macro is never used outside the place it is defined and is used only on 8xx and e500. Remove indirection, remove it and use its content directly. Signed-off-by: Christophe Leroy Reviewed-by: Oscar Salvador --- arch/powerpc/include/asm/nohash/32/pte-40x.h | 3 --- arch/powerpc/include/asm/nohash/32/pte-44x.h | 3 --- arch/powerpc/include/asm/nohash/32/pte-85xx.h | 3 --- arch/powerpc/include/asm/nohash/32/pte-8xx.h | 5 ++--- arch/powerpc/include/asm/nohash/pte-e500.h | 4 +--- 5 files changed, 3 insertions(+), 15 deletions(-) diff --git a/arch/powerpc/include/asm/nohash/32/pte-40x.h b/arch/powerpc/include/asm/nohash/32/pte-40x.h index d759cfd74754..52ed58516fa4 100644 --- a/arch/powerpc/include/asm/nohash/32/pte-40x.h +++ b/arch/powerpc/include/asm/nohash/32/pte-40x.h @@ -49,9 +49,6 @@ #define _PAGE_EXEC 0x200 /* hardware: EX permission */ #define _PAGE_ACCESSED 0x400 /* software: R: page referenced */ -/* No page size encoding in the linux PTE */ -#define _PAGE_PSIZE 0 - /* cache related flags non existing on 40x */ #define _PAGE_COHERENT 0 diff --git a/arch/powerpc/include/asm/nohash/32/pte-44x.h b/arch/powerpc/include/asm/nohash/32/pte-44x.h index 851813725237..da0469928273 100644 --- a/arch/powerpc/include/asm/nohash/32/pte-44x.h +++ b/arch/powerpc/include/asm/nohash/32/pte-44x.h @@ -75,9 +75,6 @@ #define _PAGE_NO_CACHE 0x00000400 /* H: I bit */ #define _PAGE_WRITETHRU 0x00000800 /* H: W bit */ -/* No page size encoding in the linux PTE */ -#define _PAGE_PSIZE 0 - /* TODO: Add large page lowmem mapping support */ #define _PMD_PRESENT 0 #define _PMD_PRESENT_MASK (PAGE_MASK) diff --git a/arch/powerpc/include/asm/nohash/32/pte-85xx.h b/arch/powerpc/include/asm/nohash/32/pte-85xx.h index 653a342d3b25..14d64b4f3f14 100644 --- a/arch/powerpc/include/asm/nohash/32/pte-85xx.h +++ b/arch/powerpc/include/asm/nohash/32/pte-85xx.h @@ -31,9 +31,6 @@ #define _PAGE_WRITETHRU 0x00400 /* H: W bit */ #define _PAGE_SPECIAL 0x00800 /* S: Special page */ -/* No page size encoding in the linux PTE */ -#define _PAGE_PSIZE 0 - #define _PMD_PRESENT 0 #define _PMD_PRESENT_MASK (PAGE_MASK) #define _PMD_BAD (~PAGE_MASK) diff --git a/arch/powerpc/include/asm/nohash/32/pte-8xx.h b/arch/powerpc/include/asm/nohash/32/pte-8xx.h index 137dc3c84e45..625c31d6ce5c 100644 --- a/arch/powerpc/include/asm/nohash/32/pte-8xx.h +++ b/arch/powerpc/include/asm/nohash/32/pte-8xx.h @@ -74,12 +74,11 @@ #define _PTE_NONE_MASK 0 #ifdef CONFIG_PPC_16K_PAGES -#define _PAGE_PSIZE _PAGE_SPS +#define _PAGE_BASE_NC (_PAGE_PRESENT | _PAGE_ACCESSED | _PAGE_SPS) #else -#define _PAGE_PSIZE 0 +#define _PAGE_BASE_NC (_PAGE_PRESENT | _PAGE_ACCESSED) #endif -#define _PAGE_BASE_NC (_PAGE_PRESENT | _PAGE_ACCESSED | _PAGE_PSIZE) #define _PAGE_BASE (_PAGE_BASE_NC) #include diff --git a/arch/powerpc/include/asm/nohash/pte-e500.h b/arch/powerpc/include/asm/nohash/pte-e500.h index f516f0b5b7a8..975facc7e38e 100644 --- a/arch/powerpc/include/asm/nohash/pte-e500.h +++ b/arch/powerpc/include/asm/nohash/pte-e500.h @@ -65,8 +65,6 @@ #define _PAGE_SPECIAL _PAGE_SW0 -/* Base page size */ -#define _PAGE_PSIZE _PAGE_PSIZE_4K #define PTE_RPN_SHIFT (24) #define PTE_WIMGE_SHIFT (19) @@ -89,7 +87,7 @@ * pages. We always set _PAGE_COHERENT when SMP is enabled or * the processor might need it for DMA coherency. */ -#define _PAGE_BASE_NC (_PAGE_PRESENT | _PAGE_ACCESSED | _PAGE_PSIZE) +#define _PAGE_BASE_NC (_PAGE_PRESENT | _PAGE_ACCESSED | _PAGE_PSIZE_4K) #if defined(CONFIG_SMP) #define _PAGE_BASE (_PAGE_BASE_NC | _PAGE_COHERENT) #else From patchwork Mon May 27 13:30:03 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christophe Leroy X-Patchwork-Id: 1939952 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ozlabs.org (client-ip=112.213.38.117; helo=lists.ozlabs.org; envelope-from=linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org; receiver=patchwork.ozlabs.org) Received: from lists.ozlabs.org (lists.ozlabs.org [112.213.38.117]) by legolas.ozlabs.org (Postfix) with ESMTP id 4Vnxf02Cscz20KL for ; Mon, 27 May 2024 23:42:16 +1000 (AEST) Received: from boromir.ozlabs.org (localhost [IPv6:::1]) by lists.ozlabs.org (Postfix) with ESMTP id 4VnxQk155Tz88C4 for ; Mon, 27 May 2024 23:32:30 +1000 (AEST) X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: lists.ozlabs.org; dmarc=pass (p=quarantine dis=none) header.from=csgroup.eu Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=csgroup.eu (client-ip=93.17.236.30; helo=pegase1.c-s.fr; envelope-from=christophe.leroy@csgroup.eu; receiver=lists.ozlabs.org) Received: from pegase1.c-s.fr (pegase1.c-s.fr [93.17.236.30]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4VnxP141MHz79Bq for ; Mon, 27 May 2024 23:31:01 +1000 (AEST) Received: from localhost (mailhub3.si.c-s.fr [192.168.12.233]) by localhost (Postfix) with ESMTP id 4VnxNT6zt4z9tCl; Mon, 27 May 2024 15:30:33 +0200 (CEST) X-Virus-Scanned: amavisd-new at c-s.fr Received: from pegase1.c-s.fr ([192.168.12.234]) by localhost (pegase1.c-s.fr [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id mnRJFRISjg-U; Mon, 27 May 2024 15:30:33 +0200 (CEST) Received: from messagerie.si.c-s.fr (messagerie.si.c-s.fr [192.168.25.192]) by pegase1.c-s.fr (Postfix) with ESMTP id 4VnxNT69BJz9sqS; Mon, 27 May 2024 15:30:33 +0200 (CEST) Received: from localhost (localhost [127.0.0.1]) by messagerie.si.c-s.fr (Postfix) with ESMTP id CED8E8B773; Mon, 27 May 2024 15:30:33 +0200 (CEST) X-Virus-Scanned: amavisd-new at c-s.fr Received: from messagerie.si.c-s.fr ([127.0.0.1]) by localhost (messagerie.si.c-s.fr [127.0.0.1]) (amavisd-new, port 10023) with ESMTP id nNWVwUBAoKxI; Mon, 27 May 2024 15:30:33 +0200 (CEST) Received: from PO20335.idsi0.si.c-s.fr (unknown [192.168.232.49]) by messagerie.si.c-s.fr (Postfix) with ESMTP id 4EAC88B764; Mon, 27 May 2024 15:30:32 +0200 (CEST) From: Christophe Leroy To: Andrew Morton , Jason Gunthorpe , Peter Xu , Oscar Salvador , Michael Ellerman , Nicholas Piggin Subject: [RFC PATCH v4 05/16] powerpc/mm: Fix __find_linux_pte() on 32 bits with PMD leaf entries Date: Mon, 27 May 2024 15:30:03 +0200 Message-ID: X-Mailer: git-send-email 2.44.0 In-Reply-To: References: MIME-Version: 1.0 X-Developer-Signature: v=1; a=ed25519-sha256; t=1716816600; l=2268; i=christophe.leroy@csgroup.eu; s=20211009; h=from:subject:message-id; bh=CgsurgT/HBdZ/v9IT1Gnjjxrc//xCkwC6Uo2ZbsMc0U=; b=IfJ51RBuoaijjVJUpXSYAenvJ2xeDUJnOsadh4SsDBYVzR3VhIS7lTxdFZA6qrF8yyk96WhQi Uy+dEMH3SUzA4HGrpgq60Nk62vjH7QKtycWTD5UA/ytzt+kydidbAMU X-Developer-Key: i=christophe.leroy@csgroup.eu; a=ed25519; pk=HIzTzUj91asvincQGOFx6+ZF5AoUuP9GdOtQChs7Mm0= X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org Errors-To: linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" Building on 32 bits with pmd_leaf() not returning always false leads to the following error: CC arch/powerpc/mm/pgtable.o arch/powerpc/mm/pgtable.c: In function '__find_linux_pte': arch/powerpc/mm/pgtable.c:506:1: error: function may return address of local variable [-Werror=return-local-addr] 506 | } | ^ arch/powerpc/mm/pgtable.c:394:15: note: declared here 394 | pud_t pud, *pudp; | ^~~ arch/powerpc/mm/pgtable.c:394:15: note: declared here This is due to pmd_offset() being a no-op in that case. So rework it for powerpc/32 so that pXd_offset() are used on real pointers and not on on-stack copies. Signed-off-by: Christophe Leroy Reviewed-by: Oscar Salvador --- v3: Removed p4dp and pudp locals for PPC32 and add a comment. v4: Properly set pdshift on PPC32 case --- arch/powerpc/mm/pgtable.c | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/mm/pgtable.c b/arch/powerpc/mm/pgtable.c index 9e7ba9c3851f..bce8a8619589 100644 --- a/arch/powerpc/mm/pgtable.c +++ b/arch/powerpc/mm/pgtable.c @@ -382,8 +382,10 @@ pte_t *__find_linux_pte(pgd_t *pgdir, unsigned long ea, bool *is_thp, unsigned *hpage_shift) { pgd_t *pgdp; +#ifdef CONFIG_PPC64 p4d_t p4d, *p4dp; pud_t pud, *pudp; +#endif pmd_t pmd, *pmdp; pte_t *ret_pte; hugepd_t *hpdp = NULL; @@ -401,8 +403,12 @@ pte_t *__find_linux_pte(pgd_t *pgdir, unsigned long ea, * page fault or a page unmap. The return pte_t * is still not * stable. So should be checked there for above conditions. * Top level is an exception because it is folded into p4d. + * + * On PPC32, P4D/PUD/PMD are folded into PGD so go straight to + * PMD level. */ pgdp = pgdir + pgd_index(ea); +#ifdef CONFIG_PPC64 p4dp = p4d_offset(pgdp, ea); p4d = READ_ONCE(*p4dp); pdshift = P4D_SHIFT; @@ -442,8 +448,11 @@ pte_t *__find_linux_pte(pgd_t *pgdir, unsigned long ea, goto out_huge; } - pdshift = PMD_SHIFT; pmdp = pmd_offset(&pud, ea); +#else + pmdp = pmd_offset(pud_offset(p4d_offset(pgdp, ea), ea), ea); +#endif + pdshift = PMD_SHIFT; pmd = READ_ONCE(*pmdp); /* From patchwork Mon May 27 13:30:04 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christophe Leroy X-Patchwork-Id: 1939955 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ozlabs.org (client-ip=2404:9400:2:0:216:3eff:fee1:b9f1; helo=lists.ozlabs.org; envelope-from=linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org; receiver=patchwork.ozlabs.org) Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2404:9400:2:0:216:3eff:fee1:b9f1]) by legolas.ozlabs.org (Postfix) with ESMTP id 4VnxfG07PPz20f2 for ; Mon, 27 May 2024 23:42:30 +1000 (AEST) Received: from boromir.ozlabs.org (localhost [IPv6:::1]) by lists.ozlabs.org (Postfix) with ESMTP id 4VnxR756Crz88RD for ; Mon, 27 May 2024 23:32:51 +1000 (AEST) X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: lists.ozlabs.org; dmarc=pass (p=quarantine dis=none) header.from=csgroup.eu Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=csgroup.eu (client-ip=93.17.236.30; helo=pegase1.c-s.fr; envelope-from=christophe.leroy@csgroup.eu; receiver=lists.ozlabs.org) Received: from pegase1.c-s.fr (pegase1.c-s.fr [93.17.236.30]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4VnxPB5Grmz79Jx for ; Mon, 27 May 2024 23:31:10 +1000 (AEST) Received: from localhost (mailhub3.si.c-s.fr [192.168.12.233]) by localhost (Postfix) with ESMTP id 4VnxNX3KFXz9tFS; Mon, 27 May 2024 15:30:36 +0200 (CEST) X-Virus-Scanned: amavisd-new at c-s.fr Received: from pegase1.c-s.fr ([192.168.12.234]) by localhost (pegase1.c-s.fr [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 2X3i08xfsoP6; Mon, 27 May 2024 15:30:36 +0200 (CEST) Received: from messagerie.si.c-s.fr (messagerie.si.c-s.fr [192.168.25.192]) by pegase1.c-s.fr (Postfix) with ESMTP id 4VnxNW4jMZz9sqS; Mon, 27 May 2024 15:30:35 +0200 (CEST) Received: from localhost (localhost [127.0.0.1]) by messagerie.si.c-s.fr (Postfix) with ESMTP id 9CB748B773; Mon, 27 May 2024 15:30:35 +0200 (CEST) X-Virus-Scanned: amavisd-new at c-s.fr Received: from messagerie.si.c-s.fr ([127.0.0.1]) by localhost (messagerie.si.c-s.fr [127.0.0.1]) (amavisd-new, port 10023) with ESMTP id k5MqdDyFkIZz; Mon, 27 May 2024 15:30:35 +0200 (CEST) Received: from PO20335.idsi0.si.c-s.fr (unknown [192.168.232.49]) by messagerie.si.c-s.fr (Postfix) with ESMTP id EB15E8B764; Mon, 27 May 2024 15:30:33 +0200 (CEST) From: Christophe Leroy To: Andrew Morton , Jason Gunthorpe , Peter Xu , Oscar Salvador , Michael Ellerman , Nicholas Piggin Subject: [RFC PATCH v4 06/16] powerpc/mm: Allow hugepages without hugepd Date: Mon, 27 May 2024 15:30:04 +0200 Message-ID: X-Mailer: git-send-email 2.44.0 In-Reply-To: References: MIME-Version: 1.0 X-Developer-Signature: v=1; a=ed25519-sha256; t=1716816600; l=5224; i=christophe.leroy@csgroup.eu; s=20211009; h=from:subject:message-id; bh=sE6juotDv7BsAHC8sy2Y+2Se42RoaSJz1OS5qq2ql/0=; b=36Xw7zn/i6QTEaj0XySNCaF0JKcqk1XM13g/ArikPR+BvvyGupcewcydvCdAjEEdhNn2XgG7F 4Zru2XyYpa1Cc3BVAc7UtBSPPaHXy9EATBeQoAGIVmo5aLWgLd/IxfI X-Developer-Key: i=christophe.leroy@csgroup.eu; a=ed25519; pk=HIzTzUj91asvincQGOFx6+ZF5AoUuP9GdOtQChs7Mm0= X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org Errors-To: linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" In preparation of implementing huge pages on powerpc 8xx without hugepd, enclose hugepd related code inside an ifdef CONFIG_ARCH_HAS_HUGEPD This also allows removing some stubs. Signed-off-by: Christophe Leroy Reviewed-by: Oscar Salvador --- v3: - Prepare huge_pte_alloc() for full standard topology, not only for 2-level - Reordered last part of huge_pte_alloc() v4: - Rebased of v6.10-rc1 --- arch/powerpc/include/asm/book3s/32/pgalloc.h | 2 -- arch/powerpc/include/asm/hugetlb.h | 10 ++---- arch/powerpc/include/asm/nohash/pgtable.h | 2 +- arch/powerpc/mm/hugetlbpage.c | 33 ++++++++++++++++++++ arch/powerpc/mm/pgtable.c | 2 ++ 5 files changed, 38 insertions(+), 11 deletions(-) diff --git a/arch/powerpc/include/asm/book3s/32/pgalloc.h b/arch/powerpc/include/asm/book3s/32/pgalloc.h index dc5c039eb28e..dd4eb3063175 100644 --- a/arch/powerpc/include/asm/book3s/32/pgalloc.h +++ b/arch/powerpc/include/asm/book3s/32/pgalloc.h @@ -47,8 +47,6 @@ static inline void pgtable_free(void *table, unsigned index_size) } } -#define get_hugepd_cache_index(x) (x) - static inline void pgtable_free_tlb(struct mmu_gather *tlb, void *table, int shift) { diff --git a/arch/powerpc/include/asm/hugetlb.h b/arch/powerpc/include/asm/hugetlb.h index ea71f7245a63..79176a499763 100644 --- a/arch/powerpc/include/asm/hugetlb.h +++ b/arch/powerpc/include/asm/hugetlb.h @@ -30,10 +30,12 @@ static inline int is_hugepage_only_range(struct mm_struct *mm, } #define is_hugepage_only_range is_hugepage_only_range +#ifdef CONFIG_ARCH_HAS_HUGEPD #define __HAVE_ARCH_HUGETLB_FREE_PGD_RANGE void hugetlb_free_pgd_range(struct mmu_gather *tlb, unsigned long addr, unsigned long end, unsigned long floor, unsigned long ceiling); +#endif #define __HAVE_ARCH_HUGE_PTEP_GET_AND_CLEAR static inline pte_t huge_ptep_get_and_clear(struct mm_struct *mm, @@ -67,14 +69,6 @@ static inline void flush_hugetlb_page(struct vm_area_struct *vma, { } -#define hugepd_shift(x) 0 -static inline pte_t *hugepte_offset(hugepd_t hpd, unsigned long addr, - unsigned pdshift) -{ - return NULL; -} - - static inline void __init gigantic_hugetlb_cma_reserve(void) { } diff --git a/arch/powerpc/include/asm/nohash/pgtable.h b/arch/powerpc/include/asm/nohash/pgtable.h index f5f39d4f03c8..e7fc1314c23e 100644 --- a/arch/powerpc/include/asm/nohash/pgtable.h +++ b/arch/powerpc/include/asm/nohash/pgtable.h @@ -340,7 +340,7 @@ static inline void __set_pte_at(struct mm_struct *mm, unsigned long addr, #define pgprot_writecombine pgprot_noncached_wc -#ifdef CONFIG_HUGETLB_PAGE +#ifdef CONFIG_ARCH_HAS_HUGEPD static inline int hugepd_ok(hugepd_t hpd) { #ifdef CONFIG_PPC_8xx diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c index 594a4b7b2ca2..20fad59ff9f5 100644 --- a/arch/powerpc/mm/hugetlbpage.c +++ b/arch/powerpc/mm/hugetlbpage.c @@ -42,6 +42,7 @@ pte_t *huge_pte_offset(struct mm_struct *mm, unsigned long addr, unsigned long s return __find_linux_pte(mm->pgd, addr, NULL, NULL); } +#ifdef CONFIG_ARCH_HAS_HUGEPD static int __hugepte_alloc(struct mm_struct *mm, hugepd_t *hpdp, unsigned long address, unsigned int pdshift, unsigned int pshift, spinlock_t *ptl) @@ -193,6 +194,36 @@ pte_t *huge_pte_alloc(struct mm_struct *mm, struct vm_area_struct *vma, return hugepte_offset(*hpdp, addr, pdshift); } +#else +pte_t *huge_pte_alloc(struct mm_struct *mm, struct vm_area_struct *vma, + unsigned long addr, unsigned long sz) +{ + p4d_t *p4d; + pud_t *pud; + pmd_t *pmd; + + addr &= ~(sz - 1); + + p4d = p4d_offset(pgd_offset(mm, addr), addr); + if (!mm_pud_folded(mm) && sz >= P4D_SIZE) + return (pte_t *)p4d; + + pud = pud_alloc(mm, p4d, addr); + if (!pud) + return NULL; + if (!mm_pmd_folded(mm) && sz >= PUD_SIZE) + return (pte_t *)pud; + + pmd = pmd_alloc(mm, pud, addr); + if (!pmd) + return NULL; + + if (sz >= PMD_SIZE) + return (pte_t *)pmd; + + return pte_alloc_huge(mm, pmd, addr); +} +#endif #ifdef CONFIG_PPC_BOOK3S_64 /* @@ -248,6 +279,7 @@ int __init alloc_bootmem_huge_page(struct hstate *h, int nid) return __alloc_bootmem_huge_page(h, nid); } +#ifdef CONFIG_ARCH_HAS_HUGEPD #ifndef CONFIG_PPC_BOOK3S_64 #define HUGEPD_FREELIST_SIZE \ ((PAGE_SIZE - sizeof(struct hugepd_freelist)) / sizeof(pte_t)) @@ -505,6 +537,7 @@ void hugetlb_free_pgd_range(struct mmu_gather *tlb, } } while (addr = next, addr != end); } +#endif bool __init arch_hugetlb_valid_size(unsigned long size) { diff --git a/arch/powerpc/mm/pgtable.c b/arch/powerpc/mm/pgtable.c index bce8a8619589..9010973f036c 100644 --- a/arch/powerpc/mm/pgtable.c +++ b/arch/powerpc/mm/pgtable.c @@ -496,8 +496,10 @@ pte_t *__find_linux_pte(pgd_t *pgdir, unsigned long ea, if (!hpdp) return NULL; +#ifdef CONFIG_ARCH_HAS_HUGEPD ret_pte = hugepte_offset(*hpdp, ea, pdshift); pdshift = hugepd_shift(*hpdp); +#endif out: if (hpage_shift) *hpage_shift = pdshift; From patchwork Mon May 27 13:30:05 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christophe Leroy X-Patchwork-Id: 1939958 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ozlabs.org (client-ip=112.213.38.117; helo=lists.ozlabs.org; envelope-from=linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org; receiver=patchwork.ozlabs.org) Received: from lists.ozlabs.org (lists.ozlabs.org [112.213.38.117]) by legolas.ozlabs.org (Postfix) with ESMTP id 4VnxfL4249z20KL for ; Mon, 27 May 2024 23:42:34 +1000 (AEST) Received: from boromir.ozlabs.org (localhost [IPv6:::1]) by lists.ozlabs.org (Postfix) with ESMTP id 4VnxRY2Wltz88gL for ; Mon, 27 May 2024 23:33:13 +1000 (AEST) X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: lists.ozlabs.org; dmarc=pass (p=quarantine dis=none) header.from=csgroup.eu Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=csgroup.eu (client-ip=93.17.236.30; helo=pegase1.c-s.fr; envelope-from=christophe.leroy@csgroup.eu; receiver=lists.ozlabs.org) Received: from pegase1.c-s.fr (pegase1.c-s.fr [93.17.236.30]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4VnxPH1R5mz87GZ for ; Mon, 27 May 2024 23:31:15 +1000 (AEST) Received: from localhost (mailhub3.si.c-s.fr [192.168.12.233]) by localhost (Postfix) with ESMTP id 4VnxNY39myz9syd; Mon, 27 May 2024 15:30:37 +0200 (CEST) X-Virus-Scanned: amavisd-new at c-s.fr Received: from pegase1.c-s.fr ([192.168.12.234]) by localhost (pegase1.c-s.fr [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id QPpKG3o7xn_a; Mon, 27 May 2024 15:30:37 +0200 (CEST) Received: from messagerie.si.c-s.fr (messagerie.si.c-s.fr [192.168.25.192]) by pegase1.c-s.fr (Postfix) with ESMTP id 4VnxNY2Zzjz9sqS; Mon, 27 May 2024 15:30:37 +0200 (CEST) Received: from localhost (localhost [127.0.0.1]) by messagerie.si.c-s.fr (Postfix) with ESMTP id 53AB38B773; Mon, 27 May 2024 15:30:37 +0200 (CEST) X-Virus-Scanned: amavisd-new at c-s.fr Received: from messagerie.si.c-s.fr ([127.0.0.1]) by localhost (messagerie.si.c-s.fr [127.0.0.1]) (amavisd-new, port 10023) with ESMTP id dnmhP2yQDIxO; Mon, 27 May 2024 15:30:37 +0200 (CEST) Received: from PO20335.idsi0.si.c-s.fr (unknown [192.168.232.49]) by messagerie.si.c-s.fr (Postfix) with ESMTP id B08078B764; Mon, 27 May 2024 15:30:35 +0200 (CEST) From: Christophe Leroy To: Andrew Morton , Jason Gunthorpe , Peter Xu , Oscar Salvador , Michael Ellerman , Nicholas Piggin Subject: [RFC PATCH v4 07/16] powerpc/8xx: Fix size given to set_huge_pte_at() Date: Mon, 27 May 2024 15:30:05 +0200 Message-ID: <75f695241f19fd08ce4f476110b933b7850249d3.1716815901.git.christophe.leroy@csgroup.eu> X-Mailer: git-send-email 2.44.0 In-Reply-To: References: MIME-Version: 1.0 X-Developer-Signature: v=1; a=ed25519-sha256; t=1716816600; l=967; i=christophe.leroy@csgroup.eu; s=20211009; h=from:subject:message-id; bh=D5c7rc2WApQ3tCbzTNEVQ0Gk3Td9OpyCeDWt3oeDJWA=; b=wqFHcjPXA+RmKeGLWUXQ7d01EPFwdwnDrXyqFosPd+EELJOAwUJ3KBbQ7UPPIOF5o8fX2QyDs d5FVrFEhD65AbwWAkEOgTVWQtleIBSW0PmXTmX8yM/GA1KsMvHg8MVO X-Developer-Key: i=christophe.leroy@csgroup.eu; a=ed25519; pk=HIzTzUj91asvincQGOFx6+ZF5AoUuP9GdOtQChs7Mm0= X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org Errors-To: linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" set_huge_pte_at() expects the size of the hugepage as an int, not the psize which is the index of the page definition in table mmu_psize_defs[] Fixes: 935d4f0c6dc8 ("mm: hugetlb: add huge page size param to set_huge_pte_at()") Signed-off-by: Christophe Leroy Reviewed-by: Oscar Salvador --- arch/powerpc/mm/nohash/8xx.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/mm/nohash/8xx.c b/arch/powerpc/mm/nohash/8xx.c index 43d4842bb1c7..d93433e26ded 100644 --- a/arch/powerpc/mm/nohash/8xx.c +++ b/arch/powerpc/mm/nohash/8xx.c @@ -94,7 +94,8 @@ static int __ref __early_map_kernel_hugepage(unsigned long va, phys_addr_t pa, return -EINVAL; set_huge_pte_at(&init_mm, va, ptep, - pte_mkhuge(pfn_pte(pa >> PAGE_SHIFT, prot)), psize); + pte_mkhuge(pfn_pte(pa >> PAGE_SHIFT, prot)), + 1UL << mmu_psize_to_shift(psize)); return 0; } From patchwork Mon May 27 13:30:06 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christophe Leroy X-Patchwork-Id: 1939962 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ozlabs.org (client-ip=2404:9400:2:0:216:3eff:fee1:b9f1; helo=lists.ozlabs.org; envelope-from=linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org; receiver=patchwork.ozlabs.org) Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2404:9400:2:0:216:3eff:fee1:b9f1]) by legolas.ozlabs.org (Postfix) with ESMTP id 4Vnxfm0RDqz20KL for ; Mon, 27 May 2024 23:42:56 +1000 (AEST) Received: from boromir.ozlabs.org (localhost [IPv6:::1]) by lists.ozlabs.org (Postfix) with ESMTP id 4VnxRy64pdz88vK for ; Mon, 27 May 2024 23:33:34 +1000 (AEST) X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: lists.ozlabs.org; dmarc=pass (p=quarantine dis=none) header.from=csgroup.eu Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=csgroup.eu (client-ip=93.17.236.30; helo=pegase1.c-s.fr; envelope-from=christophe.leroy@csgroup.eu; receiver=lists.ozlabs.org) Received: from pegase1.c-s.fr (pegase1.c-s.fr [93.17.236.30]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4VnxPM6BkWz87Jx for ; Mon, 27 May 2024 23:31:19 +1000 (AEST) Received: from localhost (mailhub3.si.c-s.fr [192.168.12.233]) by localhost (Postfix) with ESMTP id 4VnxNg1tFRz9tHZ; Mon, 27 May 2024 15:30:43 +0200 (CEST) X-Virus-Scanned: amavisd-new at c-s.fr Received: from pegase1.c-s.fr ([192.168.12.234]) by localhost (pegase1.c-s.fr [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id ACLJuaZzRMkU; Mon, 27 May 2024 15:30:43 +0200 (CEST) Received: from messagerie.si.c-s.fr (messagerie.si.c-s.fr [192.168.25.192]) by pegase1.c-s.fr (Postfix) with ESMTP id 4VnxNc1QD2z9sqS; Mon, 27 May 2024 15:30:40 +0200 (CEST) Received: from localhost (localhost [127.0.0.1]) by messagerie.si.c-s.fr (Postfix) with ESMTP id 2B2BB8B773; Mon, 27 May 2024 15:30:40 +0200 (CEST) X-Virus-Scanned: amavisd-new at c-s.fr Received: from messagerie.si.c-s.fr ([127.0.0.1]) by localhost (messagerie.si.c-s.fr [127.0.0.1]) (amavisd-new, port 10023) with ESMTP id JbzWHv5gnEyT; Mon, 27 May 2024 15:30:40 +0200 (CEST) Received: from PO20335.idsi0.si.c-s.fr (unknown [192.168.232.49]) by messagerie.si.c-s.fr (Postfix) with ESMTP id 5D8978B764; Mon, 27 May 2024 15:30:37 +0200 (CEST) From: Christophe Leroy To: Andrew Morton , Jason Gunthorpe , Peter Xu , Oscar Salvador , Michael Ellerman , Nicholas Piggin Subject: [RFC PATCH v4 08/16] powerpc/8xx: Rework support for 8M pages using contiguous PTE entries Date: Mon, 27 May 2024 15:30:06 +0200 Message-ID: X-Mailer: git-send-email 2.44.0 In-Reply-To: References: MIME-Version: 1.0 X-Developer-Signature: v=1; a=ed25519-sha256; t=1716816600; l=19545; i=christophe.leroy@csgroup.eu; s=20211009; h=from:subject:message-id; bh=KJmkvE2q76d1gjbHjmvLCQMnU3vAWTvdfIYQx1JoU1M=; b=t8piJqU+FQ61Azfku/mqmzmtdohAat5EWKZOSP0CAiH17VHhu9or0ajP5s6J1nmTNDuv0yubu YcAxjxmvp2rCCgUjeDyv0vTuLcWAlEJCJk6tpqNpJaBqk+n0Ljq+vop X-Developer-Key: i=christophe.leroy@csgroup.eu; a=ed25519; pk=HIzTzUj91asvincQGOFx6+ZF5AoUuP9GdOtQChs7Mm0= X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org Errors-To: linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" In order to fit better with standard Linux page tables layout, add support for 8M pages using contiguous PTE entries in a standard page table. Page tables will then be populated with 1024 similar entries and two PMD entries will point to that page table. The PMD entries also get a flag to tell it is addressing an 8M page, this is required for the HW tablewalk assistance. Signed-off-by: Christophe Leroy Reviewed-by: Oscar Salvador --- v3: - Move huge_ptep_get() for a more readable commit diff - Flag PMD as 8Mbytes in set_huge_pte_at() - Define __pte_leaf_size() - Change pte_update() instead of all huge callers of pte_update() - Added ptep_is_8m_pmdp() helper - Fixed kasan early memory 8M allocation --- arch/powerpc/Kconfig | 1 - .../include/asm/nohash/32/hugetlb-8xx.h | 38 +++---------- arch/powerpc/include/asm/nohash/32/pte-8xx.h | 53 ++++++++++++------- arch/powerpc/include/asm/nohash/pgtable.h | 4 -- arch/powerpc/include/asm/page.h | 5 -- arch/powerpc/include/asm/pgtable.h | 3 ++ arch/powerpc/kernel/head_8xx.S | 10 +--- arch/powerpc/mm/hugetlbpage.c | 18 ++++--- arch/powerpc/mm/kasan/8xx.c | 21 +++++--- arch/powerpc/mm/nohash/8xx.c | 40 +++++++------- arch/powerpc/mm/pgtable.c | 27 +++++++--- arch/powerpc/mm/pgtable_32.c | 2 +- arch/powerpc/platforms/Kconfig.cputype | 2 + 13 files changed, 112 insertions(+), 112 deletions(-) diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index 3c968f2f4ac4..cddccd4ca477 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -135,7 +135,6 @@ config PPC select ARCH_HAS_DMA_MAP_DIRECT if PPC_PSERIES select ARCH_HAS_FORTIFY_SOURCE select ARCH_HAS_GCOV_PROFILE_ALL - select ARCH_HAS_HUGEPD if HUGETLB_PAGE select ARCH_HAS_KCOV select ARCH_HAS_KERNEL_FPU_SUPPORT if PPC_FPU select ARCH_HAS_MEMBARRIER_CALLBACKS diff --git a/arch/powerpc/include/asm/nohash/32/hugetlb-8xx.h b/arch/powerpc/include/asm/nohash/32/hugetlb-8xx.h index 92df40c6cc6b..c60219269323 100644 --- a/arch/powerpc/include/asm/nohash/32/hugetlb-8xx.h +++ b/arch/powerpc/include/asm/nohash/32/hugetlb-8xx.h @@ -4,42 +4,12 @@ #define PAGE_SHIFT_8M 23 -static inline pte_t *hugepd_page(hugepd_t hpd) -{ - BUG_ON(!hugepd_ok(hpd)); - - return (pte_t *)__va(hpd_val(hpd) & ~HUGEPD_SHIFT_MASK); -} - -static inline unsigned int hugepd_shift(hugepd_t hpd) -{ - return PAGE_SHIFT_8M; -} - -static inline pte_t *hugepte_offset(hugepd_t hpd, unsigned long addr, - unsigned int pdshift) -{ - unsigned long idx = (addr & (SZ_4M - 1)) >> PAGE_SHIFT; - - return hugepd_page(hpd) + idx; -} - static inline void flush_hugetlb_page(struct vm_area_struct *vma, unsigned long vmaddr) { flush_tlb_page(vma, vmaddr); } -static inline void hugepd_populate(hugepd_t *hpdp, pte_t *new, unsigned int pshift) -{ - *hpdp = __hugepd(__pa(new) | _PMD_USER | _PMD_PRESENT | _PMD_PAGE_8M); -} - -static inline void hugepd_populate_kernel(hugepd_t *hpdp, pte_t *new, unsigned int pshift) -{ - *hpdp = __hugepd(__pa(new) | _PMD_PRESENT | _PMD_PAGE_8M); -} - static inline int check_and_get_huge_psize(int shift) { return shift_to_mmu_psize(shift); @@ -49,6 +19,14 @@ static inline int check_and_get_huge_psize(int shift) void set_huge_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep, pte_t pte, unsigned long sz); +#define __HAVE_ARCH_HUGE_PTEP_GET +static inline pte_t huge_ptep_get(struct mm_struct *mm, unsigned long addr, pte_t *ptep) +{ + if (ptep_is_8m_pmdp(mm, addr, ptep)) + ptep = pte_offset_kernel((pmd_t *)ptep, 0); + return ptep_get(ptep); +} + #define __HAVE_ARCH_HUGE_PTE_CLEAR static inline void huge_pte_clear(struct mm_struct *mm, unsigned long addr, pte_t *ptep, unsigned long sz) diff --git a/arch/powerpc/include/asm/nohash/32/pte-8xx.h b/arch/powerpc/include/asm/nohash/32/pte-8xx.h index 625c31d6ce5c..54ebb91dbdcf 100644 --- a/arch/powerpc/include/asm/nohash/32/pte-8xx.h +++ b/arch/powerpc/include/asm/nohash/32/pte-8xx.h @@ -119,7 +119,7 @@ static inline pte_t pte_mkhuge(pte_t pte) #define pte_mkhuge pte_mkhuge -static inline pte_basic_t pte_update(struct mm_struct *mm, unsigned long addr, pte_t *p, +static inline pte_basic_t pte_update(struct mm_struct *mm, unsigned long addr, pte_t *ptep, unsigned long clr, unsigned long set, int huge); static inline void ptep_set_wrprotect(struct mm_struct *mm, unsigned long addr, pte_t *ptep) @@ -141,19 +141,12 @@ static inline void __ptep_set_access_flags(struct vm_area_struct *vma, pte_t *pt } #define __ptep_set_access_flags __ptep_set_access_flags -static inline unsigned long pgd_leaf_size(pgd_t pgd) -{ - if (pgd_val(pgd) & _PMD_PAGE_8M) - return SZ_8M; - return SZ_4M; -} - -#define pgd_leaf_size pgd_leaf_size - -static inline unsigned long pte_leaf_size(pte_t pte) +static inline unsigned long __pte_leaf_size(pmd_t pmd, pte_t pte) { pte_basic_t val = pte_val(pte); + if (pmd_val(pmd) & _PMD_PAGE_8M) + return SZ_8M; if (val & _PAGE_HUGE) return SZ_512K; if (val & _PAGE_SPS) @@ -161,31 +154,38 @@ static inline unsigned long pte_leaf_size(pte_t pte) return SZ_4K; } -#define pte_leaf_size pte_leaf_size +#define __pte_leaf_size __pte_leaf_size /* * On the 8xx, the page tables are a bit special. For 16k pages, we have * 4 identical entries. For 512k pages, we have 128 entries as if it was * 4k pages, but they are flagged as 512k pages for the hardware. - * For other page sizes, we have a single entry in the table. + * For 8M pages, we have 1024 entries as if it was 4M pages (PMD_SIZE) + * but they are flagged as 8M pages for the hardware. + * For 4k pages, we have a single entry in the table. */ static pmd_t *pmd_off(struct mm_struct *mm, unsigned long addr); -static int hugepd_ok(hugepd_t hpd); +static inline pte_t *pte_offset_kernel(pmd_t *pmd, unsigned long address); + +static inline bool ptep_is_8m_pmdp(struct mm_struct *mm, unsigned long addr, pte_t *ptep) +{ + return (pmd_t *)ptep == pmd_off(mm, ALIGN_DOWN(addr, SZ_8M)); +} static inline int number_of_cells_per_pte(pmd_t *pmd, pte_basic_t val, int huge) { if (!huge) return PAGE_SIZE / SZ_4K; - else if (hugepd_ok(*((hugepd_t *)pmd))) - return 1; + else if ((pmd_val(*pmd) & _PMD_PAGE_MASK) == _PMD_PAGE_8M) + return SZ_4M / SZ_4K; else if (IS_ENABLED(CONFIG_PPC_4K_PAGES) && !(val & _PAGE_HUGE)) return SZ_16K / SZ_4K; else return SZ_512K / SZ_4K; } -static inline pte_basic_t pte_update(struct mm_struct *mm, unsigned long addr, pte_t *p, - unsigned long clr, unsigned long set, int huge) +static inline pte_basic_t __pte_update(struct mm_struct *mm, unsigned long addr, pte_t *p, + unsigned long clr, unsigned long set, int huge) { pte_basic_t *entry = (pte_basic_t *)p; pte_basic_t old = pte_val(*p); @@ -197,7 +197,7 @@ static inline pte_basic_t pte_update(struct mm_struct *mm, unsigned long addr, p for (i = 0; i < num; i += PAGE_SIZE / SZ_4K, new += PAGE_SIZE) { *entry++ = new; - if (IS_ENABLED(CONFIG_PPC_16K_PAGES) && num != 1) { + if (IS_ENABLED(CONFIG_PPC_16K_PAGES)) { *entry++ = new; *entry++ = new; *entry++ = new; @@ -207,6 +207,21 @@ static inline pte_basic_t pte_update(struct mm_struct *mm, unsigned long addr, p return old; } +static inline pte_basic_t pte_update(struct mm_struct *mm, unsigned long addr, pte_t *ptep, + unsigned long clr, unsigned long set, int huge) +{ + pte_basic_t old; + + if (huge && ptep_is_8m_pmdp(mm, addr, ptep)) { + pmd_t *pmdp = (pmd_t *)ptep; + + old = __pte_update(mm, addr, pte_offset_kernel(pmdp, 0), clr, set, huge); + __pte_update(mm, addr, pte_offset_kernel(pmdp + 1, 0), clr, set, huge); + } else { + old = __pte_update(mm, addr, ptep, clr, set, huge); + } + return old; +} #define pte_update pte_update #ifdef CONFIG_PPC_16K_PAGES diff --git a/arch/powerpc/include/asm/nohash/pgtable.h b/arch/powerpc/include/asm/nohash/pgtable.h index e7fc1314c23e..90d6a0943b35 100644 --- a/arch/powerpc/include/asm/nohash/pgtable.h +++ b/arch/powerpc/include/asm/nohash/pgtable.h @@ -343,12 +343,8 @@ static inline void __set_pte_at(struct mm_struct *mm, unsigned long addr, #ifdef CONFIG_ARCH_HAS_HUGEPD static inline int hugepd_ok(hugepd_t hpd) { -#ifdef CONFIG_PPC_8xx - return ((hpd_val(hpd) & _PMD_PAGE_MASK) == _PMD_PAGE_8M); -#else /* We clear the top bit to indicate hugepd */ return (hpd_val(hpd) && (hpd_val(hpd) & PD_HUGE) == 0); -#endif } #define is_hugepd(hpd) (hugepd_ok(hpd)) diff --git a/arch/powerpc/include/asm/page.h b/arch/powerpc/include/asm/page.h index e411e5a70ea3..018c3d55232c 100644 --- a/arch/powerpc/include/asm/page.h +++ b/arch/powerpc/include/asm/page.h @@ -293,13 +293,8 @@ static inline const void *pfn_to_kaddr(unsigned long pfn) /* * Some number of bits at the level of the page table that points to * a hugepte are used to encode the size. This masks those bits. - * On 8xx, HW assistance requires 4k alignment for the hugepte. */ -#ifdef CONFIG_PPC_8xx -#define HUGEPD_SHIFT_MASK 0xfff -#else #define HUGEPD_SHIFT_MASK 0x3f -#endif #ifndef __ASSEMBLY__ diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable.h index 239709a2f68e..264a6c09517a 100644 --- a/arch/powerpc/include/asm/pgtable.h +++ b/arch/powerpc/include/asm/pgtable.h @@ -106,6 +106,9 @@ unsigned long vmalloc_to_phys(void *vmalloc_addr); void pgtable_cache_add(unsigned int shift); +#ifdef CONFIG_PPC32 +void __init *early_alloc_pgtable(unsigned long size); +#endif pte_t *early_pte_alloc_kernel(pmd_t *pmdp, unsigned long va); #if defined(CONFIG_STRICT_KERNEL_RWX) || defined(CONFIG_PPC32) diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S index edc479a7c2bc..ac74321b1192 100644 --- a/arch/powerpc/kernel/head_8xx.S +++ b/arch/powerpc/kernel/head_8xx.S @@ -415,14 +415,13 @@ FixupDAR:/* Entry point for dcbx workaround. */ oris r11, r11, (swapper_pg_dir - PAGE_OFFSET)@ha 3: lwz r11, (swapper_pg_dir-PAGE_OFFSET)@l(r11) /* Get the level 1 entry */ + rlwinm r11, r11, 0, ~_PMD_PAGE_8M mtspr SPRN_MD_TWC, r11 - mtcrf 0x01, r11 mfspr r11, SPRN_MD_TWC lwz r11, 0(r11) /* Get the pte */ - bt 28,200f /* bit 28 = Large page (8M) */ /* concat physical page address(r11) and page offset(r10) */ rlwimi r11, r10, 0, 32 - PAGE_SHIFT, 31 -201: lwz r11,0(r11) + lwz r11,0(r11) /* Check if it really is a dcbx instruction. */ /* dcbt and dcbtst does not generate DTLB Misses/Errors, * no need to include them here */ @@ -441,11 +440,6 @@ FixupDAR:/* Entry point for dcbx workaround. */ 141: mfspr r10,SPRN_M_TW b DARFixed /* Nope, go back to normal TLB processing */ -200: - /* concat physical page address(r11) and page offset(r10) */ - rlwimi r11, r10, 0, 32 - PAGE_SHIFT_8M, 31 - b 201b - 144: mfspr r10, SPRN_DSISR rlwinm r10, r10,0,7,5 /* Clear store bit for buggy dcbst insn */ mtspr SPRN_DSISR, r10 diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c index 20fad59ff9f5..5193f6845725 100644 --- a/arch/powerpc/mm/hugetlbpage.c +++ b/arch/powerpc/mm/hugetlbpage.c @@ -183,9 +183,6 @@ pte_t *huge_pte_alloc(struct mm_struct *mm, struct vm_area_struct *vma, if (!hpdp) return NULL; - if (IS_ENABLED(CONFIG_PPC_8xx) && pshift < PMD_SHIFT) - return pte_alloc_huge(mm, (pmd_t *)hpdp, addr); - BUG_ON(!hugepd_none(*hpdp) && !hugepd_ok(*hpdp)); if (hugepd_none(*hpdp) && __hugepte_alloc(mm, hpdp, addr, @@ -218,8 +215,18 @@ pte_t *huge_pte_alloc(struct mm_struct *mm, struct vm_area_struct *vma, if (!pmd) return NULL; - if (sz >= PMD_SIZE) + if (sz >= PMD_SIZE) { + /* On 8xx, all hugepages are handled as contiguous PTEs */ + if (IS_ENABLED(CONFIG_PPC_8xx)) { + int i; + + for (i = 0; i < sz / PMD_SIZE; i++) { + if (!pte_alloc_huge(mm, pmd + i, addr)) + return NULL; + } + } return (pte_t *)pmd; + } return pte_alloc_huge(mm, pmd, addr); } @@ -619,8 +626,7 @@ static int __init hugetlbpage_init(void) if (pdshift > shift) { if (!IS_ENABLED(CONFIG_PPC_8xx)) pgtable_cache_add(pdshift - shift); - } else if (IS_ENABLED(CONFIG_PPC_E500) || - IS_ENABLED(CONFIG_PPC_8xx)) { + } else if (IS_ENABLED(CONFIG_PPC_E500)) { pgtable_cache_add(PTE_T_ORDER); } diff --git a/arch/powerpc/mm/kasan/8xx.c b/arch/powerpc/mm/kasan/8xx.c index 2784224054f8..989d6cdf4141 100644 --- a/arch/powerpc/mm/kasan/8xx.c +++ b/arch/powerpc/mm/kasan/8xx.c @@ -6,28 +6,33 @@ #include #include +#include + static int __init kasan_init_shadow_8M(unsigned long k_start, unsigned long k_end, void *block) { pmd_t *pmd = pmd_off_k(k_start); unsigned long k_cur, k_next; - for (k_cur = k_start; k_cur != k_end; k_cur = k_next, pmd += 2, block += SZ_8M) { - pte_basic_t *new; + for (k_cur = k_start; k_cur != k_end; k_cur = k_next, pmd++, block += SZ_4M) { + pte_t *ptep; + int i; k_next = pgd_addr_end(k_cur, k_end); - k_next = pgd_addr_end(k_next, k_end); if ((void *)pmd_page_vaddr(*pmd) != kasan_early_shadow_pte) continue; - new = memblock_alloc(sizeof(pte_basic_t), SZ_4K); - if (!new) + ptep = memblock_alloc(PTE_FRAG_SIZE, PTE_FRAG_SIZE); + if (!ptep) return -ENOMEM; - *new = pte_val(pte_mkhuge(pfn_pte(PHYS_PFN(__pa(block)), PAGE_KERNEL))); + for (i = 0; i < PTRS_PER_PTE; i++) { + pte_t pte = pte_mkhuge(pfn_pte(PHYS_PFN(__pa(block + i * PAGE_SIZE)), PAGE_KERNEL)); - hugepd_populate_kernel((hugepd_t *)pmd, (pte_t *)new, PAGE_SHIFT_8M); - hugepd_populate_kernel((hugepd_t *)pmd + 1, (pte_t *)new, PAGE_SHIFT_8M); + __set_pte_at(&init_mm, k_cur, ptep + i, pte, 1); + } + pmd_populate_kernel(&init_mm, pmd, ptep); + *pmd = __pmd(pmd_val(*pmd) | _PMD_PAGE_8M); } return 0; } diff --git a/arch/powerpc/mm/nohash/8xx.c b/arch/powerpc/mm/nohash/8xx.c index d93433e26ded..388bba0ab3e7 100644 --- a/arch/powerpc/mm/nohash/8xx.c +++ b/arch/powerpc/mm/nohash/8xx.c @@ -11,6 +11,7 @@ #include #include +#include #include @@ -48,20 +49,6 @@ unsigned long p_block_mapped(phys_addr_t pa) return 0; } -static pte_t __init *early_hugepd_alloc_kernel(hugepd_t *pmdp, unsigned long va) -{ - if (hpd_val(*pmdp) == 0) { - pte_t *ptep = memblock_alloc(sizeof(pte_basic_t), SZ_4K); - - if (!ptep) - return NULL; - - hugepd_populate_kernel((hugepd_t *)pmdp, ptep, PAGE_SHIFT_8M); - hugepd_populate_kernel((hugepd_t *)pmdp + 1, ptep, PAGE_SHIFT_8M); - } - return hugepte_offset(*(hugepd_t *)pmdp, va, PGDIR_SHIFT); -} - static int __ref __early_map_kernel_hugepage(unsigned long va, phys_addr_t pa, pgprot_t prot, int psize, bool new) { @@ -75,24 +62,33 @@ static int __ref __early_map_kernel_hugepage(unsigned long va, phys_addr_t pa, if (WARN_ON(slab_is_available())) return -EINVAL; - if (psize == MMU_PAGE_512K) + if (psize == MMU_PAGE_512K) { ptep = early_pte_alloc_kernel(pmdp, va); - else - ptep = early_hugepd_alloc_kernel((hugepd_t *)pmdp, va); + /* The PTE should never be already present */ + if (WARN_ON(pte_present(*ptep) && pgprot_val(prot))) + return -EINVAL; + } else { + if (WARN_ON(!pmd_none(*pmdp) || !pmd_none(*(pmdp + 1)))) + return -EINVAL; + + ptep = early_alloc_pgtable(PTE_FRAG_SIZE); + pmd_populate_kernel(&init_mm, pmdp, ptep); + + ptep = early_alloc_pgtable(PTE_FRAG_SIZE); + pmd_populate_kernel(&init_mm, pmdp + 1, ptep); + + ptep = (pte_t *)pmdp; + } } else { if (psize == MMU_PAGE_512K) ptep = pte_offset_kernel(pmdp, va); else - ptep = hugepte_offset(*(hugepd_t *)pmdp, va, PGDIR_SHIFT); + ptep = (pte_t *)pmdp; } if (WARN_ON(!ptep)) return -ENOMEM; - /* The PTE should never be already present */ - if (new && WARN_ON(pte_present(*ptep) && pgprot_val(prot))) - return -EINVAL; - set_huge_pte_at(&init_mm, va, ptep, pte_mkhuge(pfn_pte(pa >> PAGE_SHIFT, prot)), 1UL << mmu_psize_to_shift(psize)); diff --git a/arch/powerpc/mm/pgtable.c b/arch/powerpc/mm/pgtable.c index 9010973f036c..294775c793ab 100644 --- a/arch/powerpc/mm/pgtable.c +++ b/arch/powerpc/mm/pgtable.c @@ -297,11 +297,8 @@ int huge_ptep_set_access_flags(struct vm_area_struct *vma, } #if defined(CONFIG_PPC_8xx) -void set_huge_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep, - pte_t pte, unsigned long sz) +static void __set_huge_pte_at(pmd_t *pmd, pte_t *ptep, pte_basic_t val) { - pmd_t *pmd = pmd_off(mm, addr); - pte_basic_t val; pte_basic_t *entry = (pte_basic_t *)ptep; int num, i; @@ -311,15 +308,29 @@ void set_huge_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep, */ VM_WARN_ON(pte_hw_valid(*ptep) && !pte_protnone(*ptep)); - pte = set_pte_filter(pte, addr); - - val = pte_val(pte); - num = number_of_cells_per_pte(pmd, val, 1); for (i = 0; i < num; i++, entry++, val += SZ_4K) *entry = val; } + +void set_huge_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep, + pte_t pte, unsigned long sz) +{ + pmd_t *pmdp = pmd_off(mm, addr); + + pte = set_pte_filter(pte, addr); + + if (sz == SZ_8M) { /* Flag both PMD entries as 8M and fill both page tables */ + *pmdp = __pmd(pmd_val(*pmdp) | _PMD_PAGE_8M); + *(pmdp + 1) = __pmd(pmd_val(*(pmdp + 1)) | _PMD_PAGE_8M); + + __set_huge_pte_at(pmdp, pte_offset_kernel(pmdp, 0), pte_val(pte)); + __set_huge_pte_at(pmdp, pte_offset_kernel(pmdp + 1, 0), pte_val(pte) + SZ_4M); + } else { + __set_huge_pte_at(pmdp, ptep, pte_val(pte)); + } +} #endif #endif /* CONFIG_HUGETLB_PAGE */ diff --git a/arch/powerpc/mm/pgtable_32.c b/arch/powerpc/mm/pgtable_32.c index cfd622ebf774..787b22206386 100644 --- a/arch/powerpc/mm/pgtable_32.c +++ b/arch/powerpc/mm/pgtable_32.c @@ -48,7 +48,7 @@ notrace void __init early_ioremap_init(void) early_ioremap_setup(); } -static void __init *early_alloc_pgtable(unsigned long size) +void __init *early_alloc_pgtable(unsigned long size) { void *ptr = memblock_alloc(size, size); diff --git a/arch/powerpc/platforms/Kconfig.cputype b/arch/powerpc/platforms/Kconfig.cputype index b2d8c0da2ad9..fa4bb096b3ae 100644 --- a/arch/powerpc/platforms/Kconfig.cputype +++ b/arch/powerpc/platforms/Kconfig.cputype @@ -98,6 +98,7 @@ config PPC_BOOK3S_64 select ARCH_ENABLE_HUGEPAGE_MIGRATION if HUGETLB_PAGE && MIGRATION select ARCH_ENABLE_SPLIT_PMD_PTLOCK select ARCH_ENABLE_THP_MIGRATION if TRANSPARENT_HUGEPAGE + select ARCH_HAS_HUGEPD if HUGETLB_PAGE select ARCH_SUPPORTS_HUGETLBFS select ARCH_SUPPORTS_NUMA_BALANCING select HAVE_MOVE_PMD @@ -290,6 +291,7 @@ config PPC_BOOK3S config PPC_E500 select FSL_EMB_PERFMON bool + select ARCH_HAS_HUGEPD if HUGETLB_PAGE select ARCH_SUPPORTS_HUGETLBFS if PHYS_64BIT || PPC64 select PPC_SMP_MUXED_IPI select PPC_DOORBELL From patchwork Mon May 27 13:30:07 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christophe Leroy X-Patchwork-Id: 1939954 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ozlabs.org (client-ip=2404:9400:2:0:216:3eff:fee1:b9f1; helo=lists.ozlabs.org; envelope-from=linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org; receiver=patchwork.ozlabs.org) Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2404:9400:2:0:216:3eff:fee1:b9f1]) by legolas.ozlabs.org (Postfix) with ESMTP id 4VnxfF2Crpz20ds for ; Mon, 27 May 2024 23:42:29 +1000 (AEST) Received: from boromir.ozlabs.org (localhost [IPv6:::1]) by lists.ozlabs.org (Postfix) with ESMTP id 4VnxSW51q4z89Bf for ; Mon, 27 May 2024 23:34:03 +1000 (AEST) X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: lists.ozlabs.org; dmarc=pass (p=quarantine dis=none) header.from=csgroup.eu Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=csgroup.eu (client-ip=93.17.236.30; helo=pegase1.c-s.fr; envelope-from=christophe.leroy@csgroup.eu; receiver=lists.ozlabs.org) Received: from pegase1.c-s.fr (pegase1.c-s.fr [93.17.236.30]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4VnxPT1fhLz87Mq for ; Mon, 27 May 2024 23:31:24 +1000 (AEST) Received: from localhost (mailhub3.si.c-s.fr [192.168.12.233]) by localhost (Postfix) with ESMTP id 4VnxNh4CNPz9sqS; Mon, 27 May 2024 15:30:44 +0200 (CEST) X-Virus-Scanned: amavisd-new at c-s.fr Received: from pegase1.c-s.fr ([192.168.12.234]) by localhost (pegase1.c-s.fr [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id xjPC1R4_n9Zd; Mon, 27 May 2024 15:30:44 +0200 (CEST) Received: from messagerie.si.c-s.fr (messagerie.si.c-s.fr [192.168.25.192]) by pegase1.c-s.fr (Postfix) with ESMTP id 4VnxNg1JK8z9tHF; Mon, 27 May 2024 15:30:43 +0200 (CEST) Received: from localhost (localhost [127.0.0.1]) by messagerie.si.c-s.fr (Postfix) with ESMTP id 276B48B773; Mon, 27 May 2024 15:30:43 +0200 (CEST) X-Virus-Scanned: amavisd-new at c-s.fr Received: from messagerie.si.c-s.fr ([127.0.0.1]) by localhost (messagerie.si.c-s.fr [127.0.0.1]) (amavisd-new, port 10023) with ESMTP id oFrG5_RqJUcI; Mon, 27 May 2024 15:30:43 +0200 (CEST) Received: from PO20335.idsi0.si.c-s.fr (unknown [192.168.232.49]) by messagerie.si.c-s.fr (Postfix) with ESMTP id 36A748B764; Mon, 27 May 2024 15:30:40 +0200 (CEST) From: Christophe Leroy To: Andrew Morton , Jason Gunthorpe , Peter Xu , Oscar Salvador , Michael Ellerman , Nicholas Piggin Subject: [RFC PATCH v4 09/16] powerpc/8xx: Simplify struct mmu_psize_def Date: Mon, 27 May 2024 15:30:07 +0200 Message-ID: X-Mailer: git-send-email 2.44.0 In-Reply-To: References: MIME-Version: 1.0 X-Developer-Signature: v=1; a=ed25519-sha256; t=1716816601; l=1334; i=christophe.leroy@csgroup.eu; s=20211009; h=from:subject:message-id; bh=foRpSg705p76Fv1R91+1LawwFjSx303hjdxd+UuPeuU=; b=atjAkuPvr2Ff19HnM3/JJDA8hanr8ssPzuScrr4EAW+e8KY9FkfkPIo6ifYbiRWCmKLcT2yim WqCOkcm1cUIC5pqBbEtGzVFjE8KrVm19vJe/pprh/0CxgsyxAvsnX++ X-Developer-Key: i=christophe.leroy@csgroup.eu; a=ed25519; pk=HIzTzUj91asvincQGOFx6+ZF5AoUuP9GdOtQChs7Mm0= X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org Errors-To: linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" On 8xx, only the shift field is used in struct mmu_psize_def Remove other fields and related macros. Signed-off-by: Christophe Leroy Reviewed-by: Oscar Salvador --- arch/powerpc/include/asm/nohash/32/mmu-8xx.h | 9 ++------- 1 file changed, 2 insertions(+), 7 deletions(-) diff --git a/arch/powerpc/include/asm/nohash/32/mmu-8xx.h b/arch/powerpc/include/asm/nohash/32/mmu-8xx.h index 141d82e249a8..a756a1e59c54 100644 --- a/arch/powerpc/include/asm/nohash/32/mmu-8xx.h +++ b/arch/powerpc/include/asm/nohash/32/mmu-8xx.h @@ -189,19 +189,14 @@ typedef struct { #define PHYS_IMMR_BASE (mfspr(SPRN_IMMR) & 0xfff80000) -/* Page size definitions, common between 32 and 64-bit +/* + * Page size definitions for 8xx * * shift : is the "PAGE_SHIFT" value for that page size - * penc : is the pte encoding mask * */ struct mmu_psize_def { unsigned int shift; /* number of bits */ - unsigned int enc; /* PTE encoding */ - unsigned int ind; /* Corresponding indirect page size shift */ - unsigned int flags; -#define MMU_PAGE_SIZE_DIRECT 0x1 /* Supported as a direct size */ -#define MMU_PAGE_SIZE_INDIRECT 0x2 /* Supported as an indirect size */ }; extern struct mmu_psize_def mmu_psize_defs[MMU_PAGE_COUNT]; From patchwork Mon May 27 13:30:08 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christophe Leroy X-Patchwork-Id: 1939953 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ozlabs.org (client-ip=112.213.38.117; helo=lists.ozlabs.org; envelope-from=linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org; receiver=patchwork.ozlabs.org) Received: from lists.ozlabs.org (lists.ozlabs.org [112.213.38.117]) by legolas.ozlabs.org (Postfix) with ESMTP id 4Vnxf85gzCz20KL for ; Mon, 27 May 2024 23:42:24 +1000 (AEST) Received: from boromir.ozlabs.org (localhost [IPv6:::1]) by lists.ozlabs.org (Postfix) with ESMTP id 4VnxSx1qNNz89QP for ; Mon, 27 May 2024 23:34:25 +1000 (AEST) X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: lists.ozlabs.org; dmarc=pass (p=quarantine dis=none) header.from=csgroup.eu Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=csgroup.eu (client-ip=93.17.236.30; helo=pegase1.c-s.fr; envelope-from=christophe.leroy@csgroup.eu; receiver=lists.ozlabs.org) Received: from pegase1.c-s.fr (pegase1.c-s.fr [93.17.236.30]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4VnxPY4R7nz87QL for ; Mon, 27 May 2024 23:31:29 +1000 (AEST) Received: from localhost (mailhub3.si.c-s.fr [192.168.12.233]) by localhost (Postfix) with ESMTP id 4VnxNj27dVz9tHF; Mon, 27 May 2024 15:30:45 +0200 (CEST) X-Virus-Scanned: amavisd-new at c-s.fr Received: from pegase1.c-s.fr ([192.168.12.234]) by localhost (pegase1.c-s.fr [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id OMtrQV6-bPEi; Mon, 27 May 2024 15:30:45 +0200 (CEST) Received: from messagerie.si.c-s.fr (messagerie.si.c-s.fr [192.168.25.192]) by pegase1.c-s.fr (Postfix) with ESMTP id 4VnxNj1W8nz9tC6; Mon, 27 May 2024 15:30:45 +0200 (CEST) Received: from localhost (localhost [127.0.0.1]) by messagerie.si.c-s.fr (Postfix) with ESMTP id 2F2978B773; Mon, 27 May 2024 15:30:45 +0200 (CEST) X-Virus-Scanned: amavisd-new at c-s.fr Received: from messagerie.si.c-s.fr ([127.0.0.1]) by localhost (messagerie.si.c-s.fr [127.0.0.1]) (amavisd-new, port 10023) with ESMTP id j18ynURuvhtH; Mon, 27 May 2024 15:30:45 +0200 (CEST) Received: from PO20335.idsi0.si.c-s.fr (unknown [192.168.232.49]) by messagerie.si.c-s.fr (Postfix) with ESMTP id 318758B764; Mon, 27 May 2024 15:30:43 +0200 (CEST) From: Christophe Leroy To: Andrew Morton , Jason Gunthorpe , Peter Xu , Oscar Salvador , Michael Ellerman , Nicholas Piggin Subject: [RFC PATCH v4 10/16] powerpc/e500: Remove enc and ind fields from struct mmu_psize_def Date: Mon, 27 May 2024 15:30:08 +0200 Message-ID: X-Mailer: git-send-email 2.44.0 In-Reply-To: References: MIME-Version: 1.0 X-Developer-Signature: v=1; a=ed25519-sha256; t=1716816601; l=3777; i=christophe.leroy@csgroup.eu; s=20211009; h=from:subject:message-id; bh=GvUuKmINqllZ54PyQ+lDpy16CYLzBdm/Cr1rg49spF8=; b=RxMHKhJJDAP5VpYsyWZSt1EFiYcmwhH/8hOQ85y473bMx94B2aAMvyWdPD8Ua2+9iVSiBAblO I+3pg9fCTWWAX53sbwzZ4ZdyOu8X7nryQzRKv38fQaDfVr2VNipKl4h X-Developer-Key: i=christophe.leroy@csgroup.eu; a=ed25519; pk=HIzTzUj91asvincQGOFx6+ZF5AoUuP9GdOtQChs7Mm0= X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org Errors-To: linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" enc field is hidden behind BOOK3E_PAGESZ_XX macros, and when you look closer you realise that this field is nothing else than the value of shift minus ten. So remove enc field and calculate tsize from shift field. Also remove inc field which is unused. Signed-off-by: Christophe Leroy Reviewed-by: Oscar Salvador --- arch/powerpc/include/asm/nohash/mmu-e500.h | 3 --- arch/powerpc/mm/nohash/book3e_pgtable.c | 4 ++-- arch/powerpc/mm/nohash/tlb.c | 9 +-------- arch/powerpc/mm/nohash/tlb_64e.c | 2 +- 4 files changed, 4 insertions(+), 14 deletions(-) diff --git a/arch/powerpc/include/asm/nohash/mmu-e500.h b/arch/powerpc/include/asm/nohash/mmu-e500.h index 7dc24b8632d7..b281d9eeaf1e 100644 --- a/arch/powerpc/include/asm/nohash/mmu-e500.h +++ b/arch/powerpc/include/asm/nohash/mmu-e500.h @@ -244,14 +244,11 @@ typedef struct { /* Page size definitions, common between 32 and 64-bit * * shift : is the "PAGE_SHIFT" value for that page size - * penc : is the pte encoding mask * */ struct mmu_psize_def { unsigned int shift; /* number of bits */ - unsigned int enc; /* PTE encoding */ - unsigned int ind; /* Corresponding indirect page size shift */ unsigned int flags; #define MMU_PAGE_SIZE_DIRECT 0x1 /* Supported as a direct size */ #define MMU_PAGE_SIZE_INDIRECT 0x2 /* Supported as an indirect size */ diff --git a/arch/powerpc/mm/nohash/book3e_pgtable.c b/arch/powerpc/mm/nohash/book3e_pgtable.c index 1c5e4ecbebeb..ad2a7c26f2a0 100644 --- a/arch/powerpc/mm/nohash/book3e_pgtable.c +++ b/arch/powerpc/mm/nohash/book3e_pgtable.c @@ -29,10 +29,10 @@ int __meminit vmemmap_create_mapping(unsigned long start, _PAGE_KERNEL_RW; /* PTEs only contain page size encodings up to 32M */ - BUG_ON(mmu_psize_defs[mmu_vmemmap_psize].enc > 0xf); + BUG_ON(mmu_psize_defs[mmu_vmemmap_psize].shift - 10 > 0xf); /* Encode the size in the PTE */ - flags |= mmu_psize_defs[mmu_vmemmap_psize].enc << 8; + flags |= (mmu_psize_defs[mmu_vmemmap_psize].shift - 10) << 8; /* For each PTE for that area, map things. Note that we don't * increment phys because all PTEs are of the large size and diff --git a/arch/powerpc/mm/nohash/tlb.c b/arch/powerpc/mm/nohash/tlb.c index f57dc721d063..b653a7be4cb1 100644 --- a/arch/powerpc/mm/nohash/tlb.c +++ b/arch/powerpc/mm/nohash/tlb.c @@ -53,37 +53,30 @@ struct mmu_psize_def mmu_psize_defs[MMU_PAGE_COUNT] = { [MMU_PAGE_4K] = { .shift = 12, - .enc = BOOK3E_PAGESZ_4K, }, [MMU_PAGE_2M] = { .shift = 21, - .enc = BOOK3E_PAGESZ_2M, }, [MMU_PAGE_4M] = { .shift = 22, - .enc = BOOK3E_PAGESZ_4M, }, [MMU_PAGE_16M] = { .shift = 24, - .enc = BOOK3E_PAGESZ_16M, }, [MMU_PAGE_64M] = { .shift = 26, - .enc = BOOK3E_PAGESZ_64M, }, [MMU_PAGE_256M] = { .shift = 28, - .enc = BOOK3E_PAGESZ_256M, }, [MMU_PAGE_1G] = { .shift = 30, - .enc = BOOK3E_PAGESZ_1GB, }, }; static inline int mmu_get_tsize(int psize) { - return mmu_psize_defs[psize].enc; + return mmu_psize_defs[psize].shift - 10; } #else static inline int mmu_get_tsize(int psize) diff --git a/arch/powerpc/mm/nohash/tlb_64e.c b/arch/powerpc/mm/nohash/tlb_64e.c index 053128a5636c..7988238496d7 100644 --- a/arch/powerpc/mm/nohash/tlb_64e.c +++ b/arch/powerpc/mm/nohash/tlb_64e.c @@ -53,7 +53,7 @@ int extlb_level_exc; */ void tlb_flush_pgtable(struct mmu_gather *tlb, unsigned long address) { - int tsize = mmu_psize_defs[mmu_pte_psize].enc; + int tsize = mmu_psize_defs[mmu_pte_psize].shift - 10; if (book3e_htw_mode != PPC_HTW_NONE) { unsigned long start = address & PMD_MASK; From patchwork Mon May 27 13:30:09 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christophe Leroy X-Patchwork-Id: 1939960 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ozlabs.org (client-ip=2404:9400:2:0:216:3eff:fee1:b9f1; helo=lists.ozlabs.org; envelope-from=linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org; receiver=patchwork.ozlabs.org) Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2404:9400:2:0:216:3eff:fee1:b9f1]) by legolas.ozlabs.org (Postfix) with ESMTP id 4VnxfQ2jJSz20ds for ; Mon, 27 May 2024 23:42:38 +1000 (AEST) Received: from boromir.ozlabs.org (localhost [IPv6:::1]) by lists.ozlabs.org (Postfix) with ESMTP id 4VnxTL5Q5rz89f1 for ; Mon, 27 May 2024 23:34:46 +1000 (AEST) X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: lists.ozlabs.org; dmarc=pass (p=quarantine dis=none) header.from=csgroup.eu Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=csgroup.eu (client-ip=93.17.236.30; helo=pegase1.c-s.fr; envelope-from=christophe.leroy@csgroup.eu; receiver=lists.ozlabs.org) Received: from pegase1.c-s.fr (pegase1.c-s.fr [93.17.236.30]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4VnxPd4VQ0z87SQ for ; Mon, 27 May 2024 23:31:33 +1000 (AEST) Received: from localhost (mailhub3.si.c-s.fr [192.168.12.233]) by localhost (Postfix) with ESMTP id 4VnxNk4s1Bz9tKB; Mon, 27 May 2024 15:30:46 +0200 (CEST) X-Virus-Scanned: amavisd-new at c-s.fr Received: from pegase1.c-s.fr ([192.168.12.234]) by localhost (pegase1.c-s.fr [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id bSM3peuOfHHq; Mon, 27 May 2024 15:30:46 +0200 (CEST) Received: from messagerie.si.c-s.fr (messagerie.si.c-s.fr [192.168.25.192]) by pegase1.c-s.fr (Postfix) with ESMTP id 4VnxNk3yDZz9tC6; Mon, 27 May 2024 15:30:46 +0200 (CEST) Received: from localhost (localhost [127.0.0.1]) by messagerie.si.c-s.fr (Postfix) with ESMTP id 831688B773; Mon, 27 May 2024 15:30:46 +0200 (CEST) X-Virus-Scanned: amavisd-new at c-s.fr Received: from messagerie.si.c-s.fr ([127.0.0.1]) by localhost (messagerie.si.c-s.fr [127.0.0.1]) (amavisd-new, port 10023) with ESMTP id 9bEcQ6VFSJHq; Mon, 27 May 2024 15:30:46 +0200 (CEST) Received: from PO20335.idsi0.si.c-s.fr (unknown [192.168.232.49]) by messagerie.si.c-s.fr (Postfix) with ESMTP id 5FECE8B764; Mon, 27 May 2024 15:30:45 +0200 (CEST) From: Christophe Leroy To: Andrew Morton , Jason Gunthorpe , Peter Xu , Oscar Salvador , Michael Ellerman , Nicholas Piggin Subject: [RFC PATCH v4 11/16] powerpc/e500: Switch to 64 bits PGD on 85xx (32 bits) Date: Mon, 27 May 2024 15:30:09 +0200 Message-ID: X-Mailer: git-send-email 2.44.0 In-Reply-To: References: MIME-Version: 1.0 X-Developer-Signature: v=1; a=ed25519-sha256; t=1716816601; l=2287; i=christophe.leroy@csgroup.eu; s=20211009; h=from:subject:message-id; bh=s4/AO36tt/CSDTJwqMkp0Ne29mDvmColJNi/FwYddUk=; b=GW/J/ozCeLmomfRdNF2QT5Vp07G4D1/PKFAJswCcFhvfLH1zh8aIVIsEC+pF6j8gRlvl1neqN HCa8Ewi63PoBkmQjaHQFXBKtCtXMuU3uUon/bxcwXwl2N2bXYik3NFj X-Developer-Key: i=christophe.leroy@csgroup.eu; a=ed25519; pk=HIzTzUj91asvincQGOFx6+ZF5AoUuP9GdOtQChs7Mm0= X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org Errors-To: linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" At the time being when CONFIG_PTE_64BIT is selected, PTE entries are 64 bits but PGD entries are still 32 bits. In order to allow leaf PMD entries, switch the PGD to 64 bits entries. Signed-off-by: Christophe Leroy --- arch/powerpc/include/asm/pgtable-types.h | 4 ++++ arch/powerpc/kernel/head_85xx.S | 10 ++++++---- 2 files changed, 10 insertions(+), 4 deletions(-) diff --git a/arch/powerpc/include/asm/pgtable-types.h b/arch/powerpc/include/asm/pgtable-types.h index 082c85cc09b1..db965d98e0ae 100644 --- a/arch/powerpc/include/asm/pgtable-types.h +++ b/arch/powerpc/include/asm/pgtable-types.h @@ -49,7 +49,11 @@ static inline unsigned long pud_val(pud_t x) #endif /* CONFIG_PPC64 */ /* PGD level */ +#if defined(CONFIG_PPC_E500) && defined(CONFIG_PTE_64BIT) +typedef struct { unsigned long long pgd; } pgd_t; +#else typedef struct { unsigned long pgd; } pgd_t; +#endif #define __pgd(x) ((pgd_t) { (x) }) static inline unsigned long pgd_val(pgd_t x) { diff --git a/arch/powerpc/kernel/head_85xx.S b/arch/powerpc/kernel/head_85xx.S index 39724ff5ae1f..a305244afc9f 100644 --- a/arch/powerpc/kernel/head_85xx.S +++ b/arch/powerpc/kernel/head_85xx.S @@ -307,8 +307,9 @@ set_ivor: #ifdef CONFIG_PTE_64BIT #ifdef CONFIG_HUGETLB_PAGE #define FIND_PTE \ - rlwinm r12, r10, 13, 19, 29; /* Compute pgdir/pmd offset */ \ - lwzx r11, r12, r11; /* Get pgd/pmd entry */ \ + rlwinm r12, r10, 14, 18, 28; /* Compute pgdir/pmd offset */ \ + add r12, r11, r12; \ + lwz r11, 4(r12); /* Get pgd/pmd entry */ \ rlwinm. r12, r11, 0, 0, 20; /* Extract pt base address */ \ blt 1000f; /* Normal non-huge page */ \ beq 2f; /* Bail if no table */ \ @@ -321,8 +322,9 @@ set_ivor: 1001: lwz r11, 4(r12); /* Get pte entry */ #else #define FIND_PTE \ - rlwinm r12, r10, 13, 19, 29; /* Compute pgdir/pmd offset */ \ - lwzx r11, r12, r11; /* Get pgd/pmd entry */ \ + rlwinm r12, r10, 14, 18, 28; /* Compute pgdir/pmd offset */ \ + add r12, r11, r12; \ + lwz r11, 4(r12); /* Get pgd/pmd entry */ \ rlwinm. r12, r11, 0, 0, 20; /* Extract pt base address */ \ beq 2f; /* Bail if no table */ \ rlwimi r12, r10, 23, 20, 28; /* Compute pte address */ \ From patchwork Mon May 27 13:30:10 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christophe Leroy X-Patchwork-Id: 1939957 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ozlabs.org (client-ip=112.213.38.117; helo=lists.ozlabs.org; envelope-from=linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org; receiver=patchwork.ozlabs.org) Received: from lists.ozlabs.org (lists.ozlabs.org [112.213.38.117]) by legolas.ozlabs.org (Postfix) with ESMTP id 4VnxfH0Br4z20f2 for ; Mon, 27 May 2024 23:42:31 +1000 (AEST) Received: from boromir.ozlabs.org (localhost [IPv6:::1]) by lists.ozlabs.org (Postfix) with ESMTP id 4VnxTm2PF7z89sL for ; Mon, 27 May 2024 23:35:08 +1000 (AEST) X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: lists.ozlabs.org; dmarc=pass (p=quarantine dis=none) header.from=csgroup.eu Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=csgroup.eu (client-ip=93.17.236.30; helo=pegase1.c-s.fr; envelope-from=christophe.leroy@csgroup.eu; receiver=lists.ozlabs.org) Received: from pegase1.c-s.fr (pegase1.c-s.fr [93.17.236.30]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4VnxPj6fnlz87Vf for ; Mon, 27 May 2024 23:31:37 +1000 (AEST) Received: from localhost (mailhub3.si.c-s.fr [192.168.12.233]) by localhost (Postfix) with ESMTP id 4VnxNl50Ssz9tKP; Mon, 27 May 2024 15:30:47 +0200 (CEST) X-Virus-Scanned: amavisd-new at c-s.fr Received: from pegase1.c-s.fr ([192.168.12.234]) by localhost (pegase1.c-s.fr [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id PTTff4CE0Yd7; Mon, 27 May 2024 15:30:47 +0200 (CEST) Received: from messagerie.si.c-s.fr (messagerie.si.c-s.fr [192.168.25.192]) by pegase1.c-s.fr (Postfix) with ESMTP id 4VnxNl4Nvvz9tC6; Mon, 27 May 2024 15:30:47 +0200 (CEST) Received: from localhost (localhost [127.0.0.1]) by messagerie.si.c-s.fr (Postfix) with ESMTP id 91E898B773; Mon, 27 May 2024 15:30:47 +0200 (CEST) X-Virus-Scanned: amavisd-new at c-s.fr Received: from messagerie.si.c-s.fr ([127.0.0.1]) by localhost (messagerie.si.c-s.fr [127.0.0.1]) (amavisd-new, port 10023) with ESMTP id Zt8Ki81erwHc; Mon, 27 May 2024 15:30:47 +0200 (CEST) Received: from PO20335.idsi0.si.c-s.fr (unknown [192.168.232.49]) by messagerie.si.c-s.fr (Postfix) with ESMTP id 8E4C38B764; Mon, 27 May 2024 15:30:46 +0200 (CEST) From: Christophe Leroy To: Andrew Morton , Jason Gunthorpe , Peter Xu , Oscar Salvador , Michael Ellerman , Nicholas Piggin Subject: [RFC PATCH v4 12/16] powerpc/e500: Encode hugepage size in PTE bits Date: Mon, 27 May 2024 15:30:10 +0200 Message-ID: <10eae3c6815e3aba5f624af92321948e4684c95a.1716815901.git.christophe.leroy@csgroup.eu> X-Mailer: git-send-email 2.44.0 In-Reply-To: References: MIME-Version: 1.0 X-Developer-Signature: v=1; a=ed25519-sha256; t=1716816601; l=1805; i=christophe.leroy@csgroup.eu; s=20211009; h=from:subject:message-id; bh=uUFe3OogHVPo7wrS42BAn509SjAdI6hEiXUNXjCP0Kk=; b=YL5kCMuwuiOUJzgHNZiLsRNG4JGxpb9X2MJhw2pkjypC8qNj9tqf0pexIsNSaqtiXEuIkUx7j tzlHiAZj6RlB4nEJ2M7CYqgs3yfKNJWdU6O3BAWWYEbp+QuluiZGoYd X-Developer-Key: i=christophe.leroy@csgroup.eu; a=ed25519; pk=HIzTzUj91asvincQGOFx6+ZF5AoUuP9GdOtQChs7Mm0= X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org Errors-To: linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" Use U0-U3 bits to encode hugepage size, more exactly page shift. As we start using hugepages at shift 21 (2Mbytes), substract 20 so that it fits into 4 bits. That may change in the future if we want to use smaller hugepages. Signed-off-by: Christophe Leroy --- arch/powerpc/include/asm/nohash/hugetlb-e500.h | 6 ++++++ arch/powerpc/include/asm/nohash/pte-e500.h | 3 +++ 2 files changed, 9 insertions(+) diff --git a/arch/powerpc/include/asm/nohash/hugetlb-e500.h b/arch/powerpc/include/asm/nohash/hugetlb-e500.h index 8f04ad20e040..d8e51a3f8557 100644 --- a/arch/powerpc/include/asm/nohash/hugetlb-e500.h +++ b/arch/powerpc/include/asm/nohash/hugetlb-e500.h @@ -42,4 +42,10 @@ static inline int check_and_get_huge_psize(int shift) return shift_to_mmu_psize(shift); } +static inline pte_t arch_make_huge_pte(pte_t entry, unsigned int shift, vm_flags_t flags) +{ + return __pte(pte_val(entry) | (_PAGE_U3 * (shift - 20))); +} +#define arch_make_huge_pte arch_make_huge_pte + #endif /* _ASM_POWERPC_NOHASH_HUGETLB_E500_H */ diff --git a/arch/powerpc/include/asm/nohash/pte-e500.h b/arch/powerpc/include/asm/nohash/pte-e500.h index 975facc7e38e..091e4bff1fba 100644 --- a/arch/powerpc/include/asm/nohash/pte-e500.h +++ b/arch/powerpc/include/asm/nohash/pte-e500.h @@ -46,6 +46,9 @@ #define _PAGE_NO_CACHE 0x400000 /* I: cache inhibit */ #define _PAGE_WRITETHRU 0x800000 /* W: cache write-through */ +#define _PAGE_HSIZE_MSK (_PAGE_U0 | _PAGE_U1 | _PAGE_U2 | _PAGE_U3) +#define _PAGE_HSIZE_SHIFT 14 + /* "Higher level" linux bit combinations */ #define _PAGE_EXEC (_PAGE_BAP_SX | _PAGE_BAP_UX) /* .. and was cache cleaned */ #define _PAGE_READ (_PAGE_BAP_SR | _PAGE_BAP_UR) /* User read permission */ From patchwork Mon May 27 13:30:11 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christophe Leroy X-Patchwork-Id: 1939951 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ozlabs.org (client-ip=2404:9400:2:0:216:3eff:fee1:b9f1; helo=lists.ozlabs.org; envelope-from=linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org; receiver=patchwork.ozlabs.org) Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2404:9400:2:0:216:3eff:fee1:b9f1]) by legolas.ozlabs.org (Postfix) with ESMTP id 4Vnxdv6TF7z20KL for ; Mon, 27 May 2024 23:42:11 +1000 (AEST) Received: from boromir.ozlabs.org (localhost [IPv6:::1]) by lists.ozlabs.org (Postfix) with ESMTP id 4VnxV96hClz8B61 for ; Mon, 27 May 2024 23:35:29 +1000 (AEST) X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: lists.ozlabs.org; dmarc=pass (p=quarantine dis=none) header.from=csgroup.eu Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=csgroup.eu (client-ip=93.17.236.30; helo=pegase1.c-s.fr; envelope-from=christophe.leroy@csgroup.eu; receiver=lists.ozlabs.org) Received: from pegase1.c-s.fr (pegase1.c-s.fr [93.17.236.30]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4VnxPn6dlmz3fmW for ; Mon, 27 May 2024 23:31:41 +1000 (AEST) Received: from localhost (mailhub3.si.c-s.fr [192.168.12.233]) by localhost (Postfix) with ESMTP id 4VnxNp655hz9tLG; Mon, 27 May 2024 15:30:50 +0200 (CEST) X-Virus-Scanned: amavisd-new at c-s.fr Received: from pegase1.c-s.fr ([192.168.12.234]) by localhost (pegase1.c-s.fr [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id DAF14CDk1IYW; Mon, 27 May 2024 15:30:50 +0200 (CEST) Received: from messagerie.si.c-s.fr (messagerie.si.c-s.fr [192.168.25.192]) by pegase1.c-s.fr (Postfix) with ESMTP id 4VnxNp5Qs5z9tC6; Mon, 27 May 2024 15:30:50 +0200 (CEST) Received: from localhost (localhost [127.0.0.1]) by messagerie.si.c-s.fr (Postfix) with ESMTP id B56818B773; Mon, 27 May 2024 15:30:50 +0200 (CEST) X-Virus-Scanned: amavisd-new at c-s.fr Received: from messagerie.si.c-s.fr ([127.0.0.1]) by localhost (messagerie.si.c-s.fr [127.0.0.1]) (amavisd-new, port 10023) with ESMTP id 9vS0LqUFHk6r; Mon, 27 May 2024 15:30:50 +0200 (CEST) Received: from PO20335.idsi0.si.c-s.fr (unknown [192.168.232.49]) by messagerie.si.c-s.fr (Postfix) with ESMTP id 9F5448B764; Mon, 27 May 2024 15:30:47 +0200 (CEST) From: Christophe Leroy To: Andrew Morton , Jason Gunthorpe , Peter Xu , Oscar Salvador , Michael Ellerman , Nicholas Piggin Subject: [RFC PATCH v4 13/16] powerpc/e500: Use contiguous PMD instead of hugepd Date: Mon, 27 May 2024 15:30:11 +0200 Message-ID: <56cf925576285e2b97550f4f7317183d98d596c5.1716815901.git.christophe.leroy@csgroup.eu> X-Mailer: git-send-email 2.44.0 In-Reply-To: References: MIME-Version: 1.0 X-Developer-Signature: v=1; a=ed25519-sha256; t=1716816601; l=15623; i=christophe.leroy@csgroup.eu; s=20211009; h=from:subject:message-id; bh=8UxSK6UFThA+fS5v4To3cEqypz8JD7mcZuTM7v/aoW8=; b=3H262+jGid3iIXsxM5xPqxkdEMudmbX8YJy5U6LlJFlxe6xQOZoMR+Z/TRcIOzBUJl4RRbX9q ms72N/J4jCIAW+xLKBMMBwywKIzopg8OvfnMpE63eU9MLrJi4JBxbVn X-Developer-Key: i=christophe.leroy@csgroup.eu; a=ed25519; pk=HIzTzUj91asvincQGOFx6+ZF5AoUuP9GdOtQChs7Mm0= X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org Errors-To: linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" e500 supports many page sizes among which the following size are implemented in the kernel at the time being: 4M, 16M, 64M, 256M, 1G. On e500, TLB miss for hugepages is exclusively handled by SW even on e6500 which has HW assistance for 4k pages, so there are no constraints like on the 8xx. On e500/32, all are at PGD/PMD level and can be handled as cont-PMD. On e500/64, smaller ones are on PMD while bigger ones are on PUD. Again, they can easily be handled as cont-PMD and cont-PUD instead of hugepd. Signed-off-by: Christophe Leroy --- v3: Add missing pmd_leaf_size() and pud_leaf_size() v4: Rebased of v6.10-rc1 : pmd_huge() and pud_huge() are gone --- .../powerpc/include/asm/nohash/hugetlb-e500.h | 32 +--------- arch/powerpc/include/asm/nohash/pgalloc.h | 2 - arch/powerpc/include/asm/nohash/pgtable.h | 37 +++++++---- arch/powerpc/include/asm/nohash/pte-e500.h | 28 +++++++++ arch/powerpc/include/asm/page.h | 15 +---- arch/powerpc/kernel/head_85xx.S | 23 +++---- arch/powerpc/mm/hugetlbpage.c | 2 - arch/powerpc/mm/nohash/tlb_low_64e.S | 63 +++++++++++-------- arch/powerpc/mm/pgtable.c | 31 +++++++++ arch/powerpc/platforms/Kconfig.cputype | 1 - 10 files changed, 139 insertions(+), 95 deletions(-) diff --git a/arch/powerpc/include/asm/nohash/hugetlb-e500.h b/arch/powerpc/include/asm/nohash/hugetlb-e500.h index d8e51a3f8557..d30e2a3f129d 100644 --- a/arch/powerpc/include/asm/nohash/hugetlb-e500.h +++ b/arch/powerpc/include/asm/nohash/hugetlb-e500.h @@ -2,38 +2,12 @@ #ifndef _ASM_POWERPC_NOHASH_HUGETLB_E500_H #define _ASM_POWERPC_NOHASH_HUGETLB_E500_H -static inline pte_t *hugepd_page(hugepd_t hpd) -{ - if (WARN_ON(!hugepd_ok(hpd))) - return NULL; - - return (pte_t *)((hpd_val(hpd) & ~HUGEPD_SHIFT_MASK) | PD_HUGE); -} - -static inline unsigned int hugepd_shift(hugepd_t hpd) -{ - return hpd_val(hpd) & HUGEPD_SHIFT_MASK; -} - -static inline pte_t *hugepte_offset(hugepd_t hpd, unsigned long addr, - unsigned int pdshift) -{ - /* - * On FSL BookE, we have multiple higher-level table entries that - * point to the same hugepte. Just use the first one since they're all - * identical. So for that case, idx=0. - */ - return hugepd_page(hpd); -} +#define __HAVE_ARCH_HUGE_SET_HUGE_PTE_AT +void set_huge_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep, + pte_t pte, unsigned long sz); void flush_hugetlb_page(struct vm_area_struct *vma, unsigned long vmaddr); -static inline void hugepd_populate(hugepd_t *hpdp, pte_t *new, unsigned int pshift) -{ - /* We use the old format for PPC_E500 */ - *hpdp = __hugepd(((unsigned long)new & ~PD_HUGE) | pshift); -} - static inline int check_and_get_huge_psize(int shift) { if (shift & 1) /* Not a power of 4 */ diff --git a/arch/powerpc/include/asm/nohash/pgalloc.h b/arch/powerpc/include/asm/nohash/pgalloc.h index 4b62376318e1..d06efac6d7aa 100644 --- a/arch/powerpc/include/asm/nohash/pgalloc.h +++ b/arch/powerpc/include/asm/nohash/pgalloc.h @@ -44,8 +44,6 @@ static inline void pgtable_free(void *table, int shift) } } -#define get_hugepd_cache_index(x) (x) - static inline void pgtable_free_tlb(struct mmu_gather *tlb, void *table, int shift) { unsigned long pgf = (unsigned long)table; diff --git a/arch/powerpc/include/asm/nohash/pgtable.h b/arch/powerpc/include/asm/nohash/pgtable.h index 90d6a0943b35..f7421d1a1693 100644 --- a/arch/powerpc/include/asm/nohash/pgtable.h +++ b/arch/powerpc/include/asm/nohash/pgtable.h @@ -52,11 +52,36 @@ static inline pte_basic_t pte_update(struct mm_struct *mm, unsigned long addr, p { pte_basic_t old = pte_val(*p); pte_basic_t new = (old & ~(pte_basic_t)clr) | set; + unsigned long sz; + unsigned long pdsize; + int i; if (new == old) return old; - *p = __pte(new); +#ifdef CONFIG_PPC_E500 + if (huge) + sz = 1UL << (((old & _PAGE_HSIZE_MSK) >> _PAGE_HSIZE_SHIFT) + 20); + else +#endif + sz = PAGE_SIZE; + + if (!huge || sz < PMD_SIZE) + pdsize = PAGE_SIZE; + else if (sz < PUD_SIZE) + pdsize = PMD_SIZE; + else if (sz < P4D_SIZE) + pdsize = PUD_SIZE; + else if (sz < PGDIR_SIZE) + pdsize = P4D_SIZE; + else + pdsize = PGDIR_SIZE; + + for (i = 0; i < sz / pdsize; i++, p++) { + *p = __pte(new); + if (new) + new += (unsigned long long)(pdsize / PAGE_SIZE) << PTE_RPN_SHIFT; + } if (IS_ENABLED(CONFIG_44x) && !is_kernel_addr(addr) && (old & _PAGE_EXEC)) icache_44x_need_flush = 1; @@ -340,16 +365,6 @@ static inline void __set_pte_at(struct mm_struct *mm, unsigned long addr, #define pgprot_writecombine pgprot_noncached_wc -#ifdef CONFIG_ARCH_HAS_HUGEPD -static inline int hugepd_ok(hugepd_t hpd) -{ - /* We clear the top bit to indicate hugepd */ - return (hpd_val(hpd) && (hpd_val(hpd) & PD_HUGE) == 0); -} - -#define is_hugepd(hpd) (hugepd_ok(hpd)) -#endif - int map_kernel_page(unsigned long va, phys_addr_t pa, pgprot_t prot); void unmap_kernel_page(unsigned long va); diff --git a/arch/powerpc/include/asm/nohash/pte-e500.h b/arch/powerpc/include/asm/nohash/pte-e500.h index 091e4bff1fba..86e0cd5fcbb4 100644 --- a/arch/powerpc/include/asm/nohash/pte-e500.h +++ b/arch/powerpc/include/asm/nohash/pte-e500.h @@ -67,6 +67,7 @@ #define _PAGE_RWX (_PAGE_READ | _PAGE_WRITE | _PAGE_BAP_UX) #define _PAGE_SPECIAL _PAGE_SW0 +#define _PAGE_PTE _PAGE_PSIZE_4K #define PTE_RPN_SHIFT (24) @@ -106,6 +107,33 @@ static inline pte_t pte_mkexec(pte_t pte) } #define pte_mkexec pte_mkexec +static inline int pmd_leaf(pmd_t pmd) +{ + return pmd_val(pmd) & _PAGE_PTE; +} +#define pmd_leaf pmd_leaf + +static inline unsigned long pmd_leaf_size(pmd_t pmd) +{ + return 1UL << (((pmd_val(pmd) & _PAGE_HSIZE_MSK) >> _PAGE_HSIZE_SHIFT) + 20); +} +#define pmd_leaf_size pmd_leaf_size + +#ifdef CONFIG_PPC64 +static inline int pud_leaf(pud_t pud) +{ + return pud_val(pud) & _PAGE_PTE; +} +#define pud_leaf pud_leaf + +static inline unsigned long pud_leaf_size(pud_t pud) +{ + return 1UL << (((pud_val(pud) & _PAGE_HSIZE_MSK) >> _PAGE_HSIZE_SHIFT) + 20); +} +#define pud_leaf_size pud_leaf_size + +#endif + #endif /* __ASSEMBLY__ */ #endif /* __KERNEL__ */ diff --git a/arch/powerpc/include/asm/page.h b/arch/powerpc/include/asm/page.h index 018c3d55232c..7d3c3bc40e6a 100644 --- a/arch/powerpc/include/asm/page.h +++ b/arch/powerpc/include/asm/page.h @@ -269,20 +269,7 @@ static inline const void *pfn_to_kaddr(unsigned long pfn) #define is_kernel_addr(x) ((x) >= TASK_SIZE) #endif -#ifndef CONFIG_PPC_BOOK3S_64 -/* - * Use the top bit of the higher-level page table entries to indicate whether - * the entries we point to contain hugepages. This works because we know that - * the page tables live in kernel space. If we ever decide to support having - * page tables at arbitrary addresses, this breaks and will have to change. - */ -#ifdef CONFIG_PPC64 -#define PD_HUGE 0x8000000000000000UL -#else -#define PD_HUGE 0x80000000 -#endif - -#else /* CONFIG_PPC_BOOK3S_64 */ +#ifdef CONFIG_PPC_BOOK3S_64 /* * Book3S 64 stores real addresses in the hugepd entries to * avoid overlaps with _PAGE_PRESENT and _PAGE_PTE. diff --git a/arch/powerpc/kernel/head_85xx.S b/arch/powerpc/kernel/head_85xx.S index a305244afc9f..96479a2230ac 100644 --- a/arch/powerpc/kernel/head_85xx.S +++ b/arch/powerpc/kernel/head_85xx.S @@ -310,16 +310,17 @@ set_ivor: rlwinm r12, r10, 14, 18, 28; /* Compute pgdir/pmd offset */ \ add r12, r11, r12; \ lwz r11, 4(r12); /* Get pgd/pmd entry */ \ - rlwinm. r12, r11, 0, 0, 20; /* Extract pt base address */ \ - blt 1000f; /* Normal non-huge page */ \ - beq 2f; /* Bail if no table */ \ - oris r11, r11, PD_HUGE@h; /* Put back address bit */ \ - andi. r10, r11, HUGEPD_SHIFT_MASK@l; /* extract size field */ \ - xor r12, r10, r11; /* drop size bits from pointer */ \ + rotlwi. r11, r11, 22; /* Leaf entry (_PAGE_PTE set) */\ + bge 1000f; /* Normal non-huge page */ \ + rlwinm r10, r11, 64 - _PAGE_HSIZE_SHIFT - 22, 0xf; \ + rotrwi r11, r11, 22; /* Restore entry */ \ b 1001f; \ -1000: rlwimi r12, r10, 23, 20, 28; /* Compute pte address */ \ +1000: rlwinm. r12, r11, 32 - 22, 0, 20; /* Extract pt base address */ \ + beq 2f; /* Bail if no table */ \ + rlwimi r12, r10, 23, 20, 28; /* Compute pte address */ \ li r10, 0; /* clear r10 */ \ -1001: lwz r11, 4(r12); /* Get pte entry */ + lwz r11, 4(r12); /* Get pte entry */ \ +1001: #else #define FIND_PTE \ rlwinm r12, r10, 14, 18, 28; /* Compute pgdir/pmd offset */ \ @@ -749,16 +750,16 @@ finish_tlb_load: 100: stw r15, 0(r17) /* - * Calc MAS1_TSIZE from r10 (which has pshift encoded) + * Calc MAS1_TSIZE from r10 (which has pshift - 20 encoded) * tlb_enc = (pshift - 10). */ - subi r15, r10, 10 + addi r15, r10, 10 mfspr r16, SPRN_MAS1 rlwimi r16, r15, 7, 20, 24 mtspr SPRN_MAS1, r16 /* copy the pshift for use later */ - mr r14, r10 + addi r14, r10, 20 /* fall through */ diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c index 5193f6845725..ca00dbfe0e50 100644 --- a/arch/powerpc/mm/hugetlbpage.c +++ b/arch/powerpc/mm/hugetlbpage.c @@ -626,8 +626,6 @@ static int __init hugetlbpage_init(void) if (pdshift > shift) { if (!IS_ENABLED(CONFIG_PPC_8xx)) pgtable_cache_add(pdshift - shift); - } else if (IS_ENABLED(CONFIG_PPC_E500)) { - pgtable_cache_add(PTE_T_ORDER); } configured = true; diff --git a/arch/powerpc/mm/nohash/tlb_low_64e.S b/arch/powerpc/mm/nohash/tlb_low_64e.S index a54e7d6c3d0b..5f6154befde3 100644 --- a/arch/powerpc/mm/nohash/tlb_low_64e.S +++ b/arch/powerpc/mm/nohash/tlb_low_64e.S @@ -152,20 +152,26 @@ tlb_miss_common_bolted: rldicl r15,r16,64-PUD_SHIFT+3,64-PUD_INDEX_SIZE-3 clrrdi r15,r15,3 - cmpdi cr0,r14,0 - bge tlb_miss_fault_bolted /* Bad pgd entry or hugepage; bail */ + cmpdi cr3,r14,0 + andi. r10,r14,_PAGE_PTE + beq- cr3,tlb_miss_fault_bolted /* No entry, bail */ + bne tlb_miss_fault_bolted /* Hugepage; bail */ ldx r14,r14,r15 /* grab pud entry */ rldicl r15,r16,64-PMD_SHIFT+3,64-PMD_INDEX_SIZE-3 clrrdi r15,r15,3 - cmpdi cr0,r14,0 - bge tlb_miss_fault_bolted + cmpdi cr3,r14,0 + andi. r10,r14,_PAGE_PTE + beq- cr3,tlb_miss_fault_bolted /* No entry, bail */ + bne tlb_miss_fault_bolted /* Hugepage; bail */ ldx r14,r14,r15 /* Grab pmd entry */ rldicl r15,r16,64-PAGE_SHIFT+3,64-PTE_INDEX_SIZE-3 clrrdi r15,r15,3 - cmpdi cr0,r14,0 - bge tlb_miss_fault_bolted + cmpdi cr3,r14,0 + andi. r10,r14,_PAGE_PTE + beq- cr3,tlb_miss_fault_bolted /* No entry, bail */ + bne tlb_miss_fault_bolted /* Hugepage; bail */ ldx r14,r14,r15 /* Grab PTE, normal (!huge) page */ /* Check if required permissions are met */ @@ -390,19 +396,25 @@ ALT_FTR_SECTION_END_IFSET(CPU_FTR_SMT) rldicl r15,r16,64-PUD_SHIFT+3,64-PUD_INDEX_SIZE-3 clrrdi r15,r15,3 - cmpdi cr0,r14,0 - bge tlb_miss_huge_e6500 /* Bad pgd entry or hugepage; bail */ + cmpdi cr3,r14,0 + andi. r10,r14,_PAGE_PTE + beq- cr3,tlb_miss_fault_e6500 /* No entry, bail */ + bne tlb_miss_huge_e6500 /* Hugepage; bail */ ldx r14,r14,r15 /* grab pud entry */ rldicl r15,r16,64-PMD_SHIFT+3,64-PMD_INDEX_SIZE-3 clrrdi r15,r15,3 - cmpdi cr0,r14,0 - bge tlb_miss_huge_e6500 + cmpdi cr3,r14,0 + andi. r10,r14,_PAGE_PTE + beq- cr3,tlb_miss_fault_e6500 /* No entry, bail */ + bne tlb_miss_huge_e6500 /* Hugepage; bail */ ldx r14,r14,r15 /* Grab pmd entry */ mfspr r10,SPRN_MAS0 - cmpdi cr0,r14,0 - bge tlb_miss_huge_e6500 + cmpdi cr3,r14,0 + andi. r15,r14,_PAGE_PTE + beq- cr3,tlb_miss_fault_e6500 /* No entry, bail */ + bne tlb_miss_huge_e6500 /* Hugepage; bail */ /* Now we build the MAS for a 2M indirect page: * @@ -449,12 +461,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_SMT) rfi tlb_miss_huge_e6500: - beq tlb_miss_fault_e6500 - li r10,1 - andi. r15,r14,HUGEPD_SHIFT_MASK@l /* r15 = psize */ - rldimi r14,r10,63,0 /* Set PD_HUGE */ - xor r14,r14,r15 /* Clear size bits */ - ldx r14,0,r14 + rlwinm r15,r14,32-_PAGE_HSIZE_SHIFT,0xf /* * Now we build the MAS for a huge page. @@ -465,7 +472,7 @@ tlb_miss_huge_e6500: * MAS 2,3+7: Needs to be redone similar to non-tablewalk handler */ - subi r15,r15,10 /* Convert psize to tsize */ + addi r15,r15,10 /* Convert hsize to tsize */ mfspr r10,SPRN_MAS1 rlwinm r10,r10,0,~MAS1_IND rlwimi r10,r15,MAS1_TSIZE_SHIFT,MAS1_TSIZE_MASK @@ -579,22 +586,28 @@ virt_page_table_tlb_miss: rldicl r11,r16,64-VPTE_PGD_SHIFT,64-PGD_INDEX_SIZE-3 clrrdi r10,r11,3 ldx r15,r10,r15 - cmpdi cr0,r15,0 - bge virt_page_table_tlb_miss_fault + cmpdi cr3,r15,0 + andi. r10,r15,_PAGE_PTE + beq- cr3,virt_page_table_tlb_miss_fault /* No entry, bail */ + bne virt_page_table_tlb_miss_fault /* Hugepage; bail */ /* Get to PUD entry */ rldicl r11,r16,64-VPTE_PUD_SHIFT,64-PUD_INDEX_SIZE-3 clrrdi r10,r11,3 ldx r15,r10,r15 - cmpdi cr0,r15,0 - bge virt_page_table_tlb_miss_fault + cmpdi cr3,r15,0 + andi. r10,r15,_PAGE_PTE + beq- cr3,virt_page_table_tlb_miss_fault /* No entry, bail */ + bne virt_page_table_tlb_miss_fault /* Hugepage; bail */ /* Get to PMD entry */ rldicl r11,r16,64-VPTE_PMD_SHIFT,64-PMD_INDEX_SIZE-3 clrrdi r10,r11,3 ldx r15,r10,r15 - cmpdi cr0,r15,0 - bge virt_page_table_tlb_miss_fault + cmpdi cr3,r15,0 + andi. r10,r15,_PAGE_PTE + beq- cr3,virt_page_table_tlb_miss_fault /* No entry, bail */ + bne virt_page_table_tlb_miss_fault /* Hugepage; bail */ /* Ok, we're all right, we can now create a kernel translation for * a 4K or 64K page from r16 -> r15. diff --git a/arch/powerpc/mm/pgtable.c b/arch/powerpc/mm/pgtable.c index 294775c793ab..6498454959f3 100644 --- a/arch/powerpc/mm/pgtable.c +++ b/arch/powerpc/mm/pgtable.c @@ -331,6 +331,37 @@ void set_huge_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep, __set_huge_pte_at(pmdp, ptep, pte_val(pte)); } } +#elif defined(CONFIG_PPC_E500) +void set_huge_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep, + pte_t pte, unsigned long sz) +{ + unsigned long pdsize; + int i; + + pte = set_pte_filter(pte, addr); + + /* + * Make sure hardware valid bit is not set. We don't do + * tlb flush for this update. + */ + VM_WARN_ON(pte_hw_valid(*ptep) && !pte_protnone(*ptep)); + + if (sz < PMD_SIZE) + pdsize = PAGE_SIZE; + else if (sz < PUD_SIZE) + pdsize = PMD_SIZE; + else if (sz < P4D_SIZE) + pdsize = PUD_SIZE; + else if (sz < PGDIR_SIZE) + pdsize = P4D_SIZE; + else + pdsize = PGDIR_SIZE; + + for (i = 0; i < sz / pdsize; i++, ptep++, addr += pdsize) { + __set_pte_at(mm, addr, ptep, pte, 0); + pte = __pte(pte_val(pte) + ((unsigned long long)pdsize / PAGE_SIZE << PFN_PTE_SHIFT)); + } +} #endif #endif /* CONFIG_HUGETLB_PAGE */ diff --git a/arch/powerpc/platforms/Kconfig.cputype b/arch/powerpc/platforms/Kconfig.cputype index fa4bb096b3ae..30a78e99663e 100644 --- a/arch/powerpc/platforms/Kconfig.cputype +++ b/arch/powerpc/platforms/Kconfig.cputype @@ -291,7 +291,6 @@ config PPC_BOOK3S config PPC_E500 select FSL_EMB_PERFMON bool - select ARCH_HAS_HUGEPD if HUGETLB_PAGE select ARCH_SUPPORTS_HUGETLBFS if PHYS_64BIT || PPC64 select PPC_SMP_MUXED_IPI select PPC_DOORBELL From patchwork Mon May 27 13:30:12 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christophe Leroy X-Patchwork-Id: 1939959 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ozlabs.org (client-ip=2404:9400:2:0:216:3eff:fee1:b9f1; helo=lists.ozlabs.org; envelope-from=linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org; receiver=patchwork.ozlabs.org) Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2404:9400:2:0:216:3eff:fee1:b9f1]) by legolas.ozlabs.org (Postfix) with ESMTP id 4VnxfP6fJpz20KL for ; Mon, 27 May 2024 23:42:37 +1000 (AEST) Received: from boromir.ozlabs.org (localhost [IPv6:::1]) by lists.ozlabs.org (Postfix) with ESMTP id 4VnxVk5J7gz79gN for ; Mon, 27 May 2024 23:35:58 +1000 (AEST) X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: lists.ozlabs.org; dmarc=pass (p=quarantine dis=none) header.from=csgroup.eu Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=csgroup.eu (client-ip=93.17.236.30; helo=pegase1.c-s.fr; envelope-from=christophe.leroy@csgroup.eu; receiver=lists.ozlabs.org) Received: from pegase1.c-s.fr (pegase1.c-s.fr [93.17.236.30]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4VnxPt2mwhz87cg for ; Mon, 27 May 2024 23:31:46 +1000 (AEST) Received: from localhost (mailhub3.si.c-s.fr [192.168.12.233]) by localhost (Postfix) with ESMTP id 4VnxNw2jlvz9tCB; Mon, 27 May 2024 15:30:56 +0200 (CEST) X-Virus-Scanned: amavisd-new at c-s.fr Received: from pegase1.c-s.fr ([192.168.12.234]) by localhost (pegase1.c-s.fr [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id ckzpMhH7FIRG; Mon, 27 May 2024 15:30:56 +0200 (CEST) Received: from messagerie.si.c-s.fr (messagerie.si.c-s.fr [192.168.25.192]) by pegase1.c-s.fr (Postfix) with ESMTP id 4VnxNw22v3z9tC6; Mon, 27 May 2024 15:30:56 +0200 (CEST) Received: from localhost (localhost [127.0.0.1]) by messagerie.si.c-s.fr (Postfix) with ESMTP id 401A08B773; Mon, 27 May 2024 15:30:56 +0200 (CEST) X-Virus-Scanned: amavisd-new at c-s.fr Received: from messagerie.si.c-s.fr ([127.0.0.1]) by localhost (messagerie.si.c-s.fr [127.0.0.1]) (amavisd-new, port 10023) with ESMTP id 46esPQ90bM0a; Mon, 27 May 2024 15:30:56 +0200 (CEST) Received: from PO20335.idsi0.si.c-s.fr (unknown [192.168.232.49]) by messagerie.si.c-s.fr (Postfix) with ESMTP id 278098B764; Mon, 27 May 2024 15:30:51 +0200 (CEST) From: Christophe Leroy To: Andrew Morton , Jason Gunthorpe , Peter Xu , Oscar Salvador , Michael Ellerman , Nicholas Piggin Subject: [RFC PATCH v4 14/16] powerpc/64s: Use contiguous PMD/PUD instead of HUGEPD Date: Mon, 27 May 2024 15:30:12 +0200 Message-ID: <610be6003a6d215e9e9ca87d7f5402042da1e355.1716815901.git.christophe.leroy@csgroup.eu> X-Mailer: git-send-email 2.44.0 In-Reply-To: References: MIME-Version: 1.0 X-Developer-Signature: v=1; a=ed25519-sha256; t=1716816601; l=18115; i=christophe.leroy@csgroup.eu; s=20211009; h=from:subject:message-id; bh=Q+QFKmibwODglXuX5lXraQlevVVIdErU/OdOOnu3O+g=; b=eZZS5MQTLdNRLI0nqe7ijZPJ/L23VfGhz7W+7c+4M7jOUil4Sk583zWBzBsijK1kcCHVxYuzl 0NWle6hSX5ABHBZnBk+J+YmNo4FSSiKjErhFBTbXeEZ2Z0zHL/BTVwY X-Developer-Key: i=christophe.leroy@csgroup.eu; a=ed25519; pk=HIzTzUj91asvincQGOFx6+ZF5AoUuP9GdOtQChs7Mm0= X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org Errors-To: linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" On book3s/64, the only user of hugepd is hash in 4k mode. All other setups (hash-64, radix-4, radix-64) use leaf PMD/PUD. Rework hash-4k to use contiguous PMD and PUD instead. In that setup there are only two huge page sizes: 16M and 16G. 16M sits at PMD level and 16G at PUD level. pte_update doesn't know page size, lets use the same trick as hpte_need_flush() to get page size from segment properties. That's not the most efficient way but let's do that until callers of pte_update() provide page size instead of just a huge flag. Signed-off-by: Christophe Leroy --- v3: - Add missing pmd_leaf_size() and pud_leaf_size() - More cleanup in hugetlbpage_init() - Take a page fault when DIRTY or ACCESSED is missing on hash-4 hugepage v4: Rebased on v6.10-rc1 --- arch/powerpc/include/asm/book3s/64/hash-4k.h | 15 ------ arch/powerpc/include/asm/book3s/64/hash.h | 38 ++++++++++++--- arch/powerpc/include/asm/book3s/64/hugetlb.h | 38 --------------- .../include/asm/book3s/64/pgtable-4k.h | 47 ------------------- .../include/asm/book3s/64/pgtable-64k.h | 20 -------- arch/powerpc/include/asm/book3s/64/pgtable.h | 22 +++++++-- arch/powerpc/include/asm/hugetlb.h | 4 ++ .../powerpc/include/asm/nohash/hugetlb-e500.h | 4 -- arch/powerpc/include/asm/page.h | 8 ---- arch/powerpc/mm/book3s64/hash_utils.c | 11 +++-- arch/powerpc/mm/book3s64/hugetlbpage.c | 10 ++++ arch/powerpc/mm/book3s64/pgtable.c | 12 ----- arch/powerpc/mm/hugetlbpage.c | 27 ----------- arch/powerpc/mm/pgtable.c | 2 +- arch/powerpc/platforms/Kconfig.cputype | 1 - 15 files changed, 72 insertions(+), 187 deletions(-) delete mode 100644 arch/powerpc/include/asm/book3s/64/pgtable-4k.h diff --git a/arch/powerpc/include/asm/book3s/64/hash-4k.h b/arch/powerpc/include/asm/book3s/64/hash-4k.h index 6472b08fa1b0..c654c376ef8b 100644 --- a/arch/powerpc/include/asm/book3s/64/hash-4k.h +++ b/arch/powerpc/include/asm/book3s/64/hash-4k.h @@ -74,21 +74,6 @@ #define remap_4k_pfn(vma, addr, pfn, prot) \ remap_pfn_range((vma), (addr), (pfn), PAGE_SIZE, (prot)) -#ifdef CONFIG_HUGETLB_PAGE -static inline int hash__hugepd_ok(hugepd_t hpd) -{ - unsigned long hpdval = hpd_val(hpd); - /* - * if it is not a pte and have hugepd shift mask - * set, then it is a hugepd directory pointer - */ - if (!(hpdval & _PAGE_PTE) && (hpdval & _PAGE_PRESENT) && - ((hpdval & HUGEPD_SHIFT_MASK) != 0)) - return true; - return false; -} -#endif - /* * 4K PTE format is different from 64K PTE format. Saving the hash_slot is just * a matter of returning the PTE bits that need to be modified. On 64K PTE, diff --git a/arch/powerpc/include/asm/book3s/64/hash.h b/arch/powerpc/include/asm/book3s/64/hash.h index faf3e3b4e4b2..8202c27afe23 100644 --- a/arch/powerpc/include/asm/book3s/64/hash.h +++ b/arch/powerpc/include/asm/book3s/64/hash.h @@ -4,6 +4,7 @@ #ifdef __KERNEL__ #include +#include /* * Common bits between 4K and 64K pages in a linux-style PTE. @@ -161,14 +162,10 @@ extern void hpte_need_flush(struct mm_struct *mm, unsigned long addr, pte_t *ptep, unsigned long pte, int huge); unsigned long htab_convert_pte_flags(unsigned long pteflags, unsigned long flags); /* Atomic PTE updates */ -static inline unsigned long hash__pte_update(struct mm_struct *mm, - unsigned long addr, - pte_t *ptep, unsigned long clr, - unsigned long set, - int huge) +static inline unsigned long hash__pte_update_one(pte_t *ptep, unsigned long clr, + unsigned long set) { __be64 old_be, tmp_be; - unsigned long old; __asm__ __volatile__( "1: ldarx %0,0,%3 # pte_update\n\ @@ -182,11 +179,38 @@ static inline unsigned long hash__pte_update(struct mm_struct *mm, : "r" (ptep), "r" (cpu_to_be64(clr)), "m" (*ptep), "r" (cpu_to_be64(H_PAGE_BUSY)), "r" (cpu_to_be64(set)) : "cc" ); + + return be64_to_cpu(old_be); +} + +static inline unsigned long hash__pte_update(struct mm_struct *mm, + unsigned long addr, + pte_t *ptep, unsigned long clr, + unsigned long set, + int huge) +{ + unsigned long old; + + old = hash__pte_update_one(ptep, clr, set); + + if (IS_ENABLED(CONFIG_PPC_4K_PAGES) && huge) { + unsigned int psize = get_slice_psize(mm, addr); + int nb, i; + + if (psize == MMU_PAGE_16M) + nb = SZ_16M / PMD_SIZE; + else if (psize == MMU_PAGE_16G) + nb = SZ_16G / PUD_SIZE; + else + nb = 1; + + for (i = 1; i < nb; i++) + hash__pte_update_one(ptep + i, clr, set); + } /* huge pages use the old page table lock */ if (!huge) assert_pte_locked(mm, addr); - old = be64_to_cpu(old_be); if (old & H_PAGE_HASHPTE) hpte_need_flush(mm, addr, ptep, old, huge); diff --git a/arch/powerpc/include/asm/book3s/64/hugetlb.h b/arch/powerpc/include/asm/book3s/64/hugetlb.h index aa1c67c8bfc8..f0bba9c5f9c3 100644 --- a/arch/powerpc/include/asm/book3s/64/hugetlb.h +++ b/arch/powerpc/include/asm/book3s/64/hugetlb.h @@ -49,9 +49,6 @@ static inline bool gigantic_page_runtime_supported(void) return true; } -/* hugepd entry valid bit */ -#define HUGEPD_VAL_BITS (0x8000000000000000UL) - #define huge_ptep_modify_prot_start huge_ptep_modify_prot_start extern pte_t huge_ptep_modify_prot_start(struct vm_area_struct *vma, unsigned long addr, pte_t *ptep); @@ -60,29 +57,7 @@ extern pte_t huge_ptep_modify_prot_start(struct vm_area_struct *vma, extern void huge_ptep_modify_prot_commit(struct vm_area_struct *vma, unsigned long addr, pte_t *ptep, pte_t old_pte, pte_t new_pte); -/* - * This should work for other subarchs too. But right now we use the - * new format only for 64bit book3s - */ -static inline pte_t *hugepd_page(hugepd_t hpd) -{ - BUG_ON(!hugepd_ok(hpd)); - /* - * We have only four bits to encode, MMU page size - */ - BUILD_BUG_ON((MMU_PAGE_COUNT - 1) > 0xf); - return __va(hpd_val(hpd) & HUGEPD_ADDR_MASK); -} - -static inline unsigned int hugepd_mmu_psize(hugepd_t hpd) -{ - return (hpd_val(hpd) & HUGEPD_SHIFT_MASK) >> 2; -} -static inline unsigned int hugepd_shift(hugepd_t hpd) -{ - return mmu_psize_to_shift(hugepd_mmu_psize(hpd)); -} static inline void flush_hugetlb_page(struct vm_area_struct *vma, unsigned long vmaddr) { @@ -90,19 +65,6 @@ static inline void flush_hugetlb_page(struct vm_area_struct *vma, return radix__flush_hugetlb_page(vma, vmaddr); } -static inline pte_t *hugepte_offset(hugepd_t hpd, unsigned long addr, - unsigned int pdshift) -{ - unsigned long idx = (addr & ((1UL << pdshift) - 1)) >> hugepd_shift(hpd); - - return hugepd_page(hpd) + idx; -} - -static inline void hugepd_populate(hugepd_t *hpdp, pte_t *new, unsigned int pshift) -{ - *hpdp = __hugepd(__pa(new) | HUGEPD_VAL_BITS | (shift_to_mmu_psize(pshift) << 2)); -} - void flush_hugetlb_page(struct vm_area_struct *vma, unsigned long vmaddr); static inline int check_and_get_huge_psize(int shift) diff --git a/arch/powerpc/include/asm/book3s/64/pgtable-4k.h b/arch/powerpc/include/asm/book3s/64/pgtable-4k.h deleted file mode 100644 index baf934578c3a..000000000000 --- a/arch/powerpc/include/asm/book3s/64/pgtable-4k.h +++ /dev/null @@ -1,47 +0,0 @@ -/* SPDX-License-Identifier: GPL-2.0 */ -#ifndef _ASM_POWERPC_BOOK3S_64_PGTABLE_4K_H -#define _ASM_POWERPC_BOOK3S_64_PGTABLE_4K_H -/* - * hash 4k can't share hugetlb and also doesn't support THP - */ -#ifndef __ASSEMBLY__ -#ifdef CONFIG_HUGETLB_PAGE -/* - * With radix , we have hugepage ptes in the pud and pmd entries. We don't - * need to setup hugepage directory for them. Our pte and page directory format - * enable us to have this enabled. - */ -static inline int hugepd_ok(hugepd_t hpd) -{ - if (radix_enabled()) - return 0; - return hash__hugepd_ok(hpd); -} -#define is_hugepd(hpd) (hugepd_ok(hpd)) - -/* - * 16M and 16G huge page directory tables are allocated from slab cache - * - */ -#define H_16M_CACHE_INDEX (PAGE_SHIFT + H_PTE_INDEX_SIZE + H_PMD_INDEX_SIZE - 24) -#define H_16G_CACHE_INDEX \ - (PAGE_SHIFT + H_PTE_INDEX_SIZE + H_PMD_INDEX_SIZE + H_PUD_INDEX_SIZE - 34) - -static inline int get_hugepd_cache_index(int index) -{ - switch (index) { - case H_16M_CACHE_INDEX: - return HTLB_16M_INDEX; - case H_16G_CACHE_INDEX: - return HTLB_16G_INDEX; - default: - BUG(); - } - /* should not reach */ -} - -#endif /* CONFIG_HUGETLB_PAGE */ - -#endif /* __ASSEMBLY__ */ - -#endif /*_ASM_POWERPC_BOOK3S_64_PGTABLE_4K_H */ diff --git a/arch/powerpc/include/asm/book3s/64/pgtable-64k.h b/arch/powerpc/include/asm/book3s/64/pgtable-64k.h index 6ac73da7b80e..4d8d7b4ea16b 100644 --- a/arch/powerpc/include/asm/book3s/64/pgtable-64k.h +++ b/arch/powerpc/include/asm/book3s/64/pgtable-64k.h @@ -5,26 +5,6 @@ #ifndef __ASSEMBLY__ #ifdef CONFIG_HUGETLB_PAGE -/* - * With 64k page size, we have hugepage ptes in the pgd and pmd entries. We don't - * need to setup hugepage directory for them. Our pte and page directory format - * enable us to have this enabled. - */ -static inline int hugepd_ok(hugepd_t hpd) -{ - return 0; -} - -#define is_hugepd(pdep) 0 - -/* - * This should never get called - */ -static __always_inline int get_hugepd_cache_index(int index) -{ - BUILD_BUG(); -} - #endif /* CONFIG_HUGETLB_PAGE */ static inline int remap_4k_pfn(struct vm_area_struct *vma, unsigned long addr, diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h index 8f9432e3855a..519b1743a0f4 100644 --- a/arch/powerpc/include/asm/book3s/64/pgtable.h +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h @@ -274,6 +274,24 @@ static inline bool pud_leaf(pud_t pud) { return !!(pud_raw(pud) & cpu_to_be64(_PAGE_PTE)); } + +#define pmd_leaf_size pmd_leaf_size +static inline unsigned long pmd_leaf_size(pmd_t pmd) +{ + if (IS_ENABLED(CONFIG_PPC_4K_PAGES) && !radix_enabled()) + return SZ_16M; + else + return PMD_SIZE; +} + +#define pud_leaf_size pud_leaf_size +static inline unsigned long pud_leaf_size(pud_t pud) +{ + if (IS_ENABLED(CONFIG_PPC_4K_PAGES) && !radix_enabled()) + return SZ_16G; + else + return PUD_SIZE; +} #endif /* __ASSEMBLY__ */ #include @@ -285,11 +303,9 @@ static inline bool pud_leaf(pud_t pud) #define MAX_PHYSMEM_BITS R_MAX_PHYSMEM_BITS #endif - +/* hash 4k can't share hugetlb and also doesn't support THP */ #ifdef CONFIG_PPC_64K_PAGES #include -#else -#include #endif #include diff --git a/arch/powerpc/include/asm/hugetlb.h b/arch/powerpc/include/asm/hugetlb.h index 79176a499763..e959c26c0b52 100644 --- a/arch/powerpc/include/asm/hugetlb.h +++ b/arch/powerpc/include/asm/hugetlb.h @@ -37,6 +37,10 @@ void hugetlb_free_pgd_range(struct mmu_gather *tlb, unsigned long addr, unsigned long ceiling); #endif +#define __HAVE_ARCH_HUGE_SET_HUGE_PTE_AT +void set_huge_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep, + pte_t pte, unsigned long sz); + #define __HAVE_ARCH_HUGE_PTEP_GET_AND_CLEAR static inline pte_t huge_ptep_get_and_clear(struct mm_struct *mm, unsigned long addr, pte_t *ptep) diff --git a/arch/powerpc/include/asm/nohash/hugetlb-e500.h b/arch/powerpc/include/asm/nohash/hugetlb-e500.h index d30e2a3f129d..aea4c462e494 100644 --- a/arch/powerpc/include/asm/nohash/hugetlb-e500.h +++ b/arch/powerpc/include/asm/nohash/hugetlb-e500.h @@ -2,10 +2,6 @@ #ifndef _ASM_POWERPC_NOHASH_HUGETLB_E500_H #define _ASM_POWERPC_NOHASH_HUGETLB_E500_H -#define __HAVE_ARCH_HUGE_SET_HUGE_PTE_AT -void set_huge_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep, - pte_t pte, unsigned long sz); - void flush_hugetlb_page(struct vm_area_struct *vma, unsigned long vmaddr); static inline int check_and_get_huge_psize(int shift) diff --git a/arch/powerpc/include/asm/page.h b/arch/powerpc/include/asm/page.h index 7d3c3bc40e6a..c0af246a64ff 100644 --- a/arch/powerpc/include/asm/page.h +++ b/arch/powerpc/include/asm/page.h @@ -269,14 +269,6 @@ static inline const void *pfn_to_kaddr(unsigned long pfn) #define is_kernel_addr(x) ((x) >= TASK_SIZE) #endif -#ifdef CONFIG_PPC_BOOK3S_64 -/* - * Book3S 64 stores real addresses in the hugepd entries to - * avoid overlaps with _PAGE_PRESENT and _PAGE_PTE. - */ -#define HUGEPD_ADDR_MASK (0x0ffffffffffffffful & ~HUGEPD_SHIFT_MASK) -#endif /* CONFIG_PPC_BOOK3S_64 */ - /* * Some number of bits at the level of the page table that points to * a hugepte are used to encode the size. This masks those bits. diff --git a/arch/powerpc/mm/book3s64/hash_utils.c b/arch/powerpc/mm/book3s64/hash_utils.c index 01c3b4b65241..6727a15ab94f 100644 --- a/arch/powerpc/mm/book3s64/hash_utils.c +++ b/arch/powerpc/mm/book3s64/hash_utils.c @@ -1233,10 +1233,6 @@ void __init hash__early_init_mmu(void) __pmd_table_size = H_PMD_TABLE_SIZE; __pud_table_size = H_PUD_TABLE_SIZE; __pgd_table_size = H_PGD_TABLE_SIZE; - /* - * 4k use hugepd format, so for hash set then to - * zero - */ __pmd_val_bits = HASH_PMD_VAL_BITS; __pud_val_bits = HASH_PUD_VAL_BITS; __pgd_val_bits = HASH_PGD_VAL_BITS; @@ -1546,6 +1542,13 @@ int hash_page_mm(struct mm_struct *mm, unsigned long ea, goto bail; } + if (IS_ENABLED(CONFIG_PPC_4K_PAGES) && !radix_enabled()) { + if (hugeshift == PMD_SHIFT && psize == MMU_PAGE_16M) + hugeshift = mmu_psize_defs[MMU_PAGE_16M].shift; + if (hugeshift == PUD_SHIFT && psize == MMU_PAGE_16G) + hugeshift = mmu_psize_defs[MMU_PAGE_16G].shift; + } + /* * Add _PAGE_PRESENT to the required access perm. If there are parallel * updates to the pte that can possibly clear _PAGE_PTE, catch that too. diff --git a/arch/powerpc/mm/book3s64/hugetlbpage.c b/arch/powerpc/mm/book3s64/hugetlbpage.c index 5a2e512e96db..83c3361b358b 100644 --- a/arch/powerpc/mm/book3s64/hugetlbpage.c +++ b/arch/powerpc/mm/book3s64/hugetlbpage.c @@ -53,6 +53,16 @@ int __hash_page_huge(unsigned long ea, unsigned long access, unsigned long vsid, /* If PTE permissions don't match, take page fault */ if (unlikely(!check_pte_access(access, old_pte))) return 1; + /* + * If hash-4k, hugepages use seeral contiguous PxD entries + * so bail out and let mm make the page young or dirty + */ + if (IS_ENABLED(CONFIG_PPC_4K_PAGES)) { + if (!(old_pte & _PAGE_ACCESSED)) + return 1; + if ((access & _PAGE_WRITE) && !(old_pte & _PAGE_DIRTY)) + return 1; + } /* * Try to lock the PTE, add ACCESSED and DIRTY if it was diff --git a/arch/powerpc/mm/book3s64/pgtable.c b/arch/powerpc/mm/book3s64/pgtable.c index 2975ea0841ba..f4d8d3c40e5c 100644 --- a/arch/powerpc/mm/book3s64/pgtable.c +++ b/arch/powerpc/mm/book3s64/pgtable.c @@ -461,18 +461,6 @@ static inline void pgtable_free(void *table, int index) case PUD_INDEX: __pud_free(table); break; -#if defined(CONFIG_PPC_4K_PAGES) && defined(CONFIG_HUGETLB_PAGE) - /* 16M hugepd directory at pud level */ - case HTLB_16M_INDEX: - BUILD_BUG_ON(H_16M_CACHE_INDEX <= 0); - kmem_cache_free(PGT_CACHE(H_16M_CACHE_INDEX), table); - break; - /* 16G hugepd directory at the pgd level */ - case HTLB_16G_INDEX: - BUILD_BUG_ON(H_16G_CACHE_INDEX <= 0); - kmem_cache_free(PGT_CACHE(H_16G_CACHE_INDEX), table); - break; -#endif /* We don't free pgd table via RCU callback */ default: BUG(); diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c index ca00dbfe0e50..1fe2843f5b12 100644 --- a/arch/powerpc/mm/hugetlbpage.c +++ b/arch/powerpc/mm/hugetlbpage.c @@ -592,41 +592,14 @@ static int __init hugetlbpage_init(void) for (psize = 0; psize < MMU_PAGE_COUNT; ++psize) { unsigned shift; - unsigned pdshift; if (!mmu_psize_defs[psize].shift) continue; shift = mmu_psize_to_shift(psize); -#ifdef CONFIG_PPC_BOOK3S_64 - if (shift > PGDIR_SHIFT) - continue; - else if (shift > PUD_SHIFT) - pdshift = PGDIR_SHIFT; - else if (shift > PMD_SHIFT) - pdshift = PUD_SHIFT; - else - pdshift = PMD_SHIFT; -#else - if (shift < PUD_SHIFT) - pdshift = PMD_SHIFT; - else if (shift < PGDIR_SHIFT) - pdshift = PUD_SHIFT; - else - pdshift = PGDIR_SHIFT; -#endif - if (add_huge_page_size(1ULL << shift) < 0) continue; - /* - * if we have pdshift and shift value same, we don't - * use pgt cache for hugepd. - */ - if (pdshift > shift) { - if (!IS_ENABLED(CONFIG_PPC_8xx)) - pgtable_cache_add(pdshift - shift); - } configured = true; } diff --git a/arch/powerpc/mm/pgtable.c b/arch/powerpc/mm/pgtable.c index 6498454959f3..218792cb2c47 100644 --- a/arch/powerpc/mm/pgtable.c +++ b/arch/powerpc/mm/pgtable.c @@ -331,7 +331,7 @@ void set_huge_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep, __set_huge_pte_at(pmdp, ptep, pte_val(pte)); } } -#elif defined(CONFIG_PPC_E500) +#else void set_huge_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep, pte_t pte, unsigned long sz) { diff --git a/arch/powerpc/platforms/Kconfig.cputype b/arch/powerpc/platforms/Kconfig.cputype index 30a78e99663e..b2d8c0da2ad9 100644 --- a/arch/powerpc/platforms/Kconfig.cputype +++ b/arch/powerpc/platforms/Kconfig.cputype @@ -98,7 +98,6 @@ config PPC_BOOK3S_64 select ARCH_ENABLE_HUGEPAGE_MIGRATION if HUGETLB_PAGE && MIGRATION select ARCH_ENABLE_SPLIT_PMD_PTLOCK select ARCH_ENABLE_THP_MIGRATION if TRANSPARENT_HUGEPAGE - select ARCH_HAS_HUGEPD if HUGETLB_PAGE select ARCH_SUPPORTS_HUGETLBFS select ARCH_SUPPORTS_NUMA_BALANCING select HAVE_MOVE_PMD From patchwork Mon May 27 13:30:13 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christophe Leroy X-Patchwork-Id: 1939950 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ozlabs.org (client-ip=2404:9400:2:0:216:3eff:fee1:b9f1; helo=lists.ozlabs.org; envelope-from=linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org; receiver=patchwork.ozlabs.org) Received: from lists.ozlabs.org (lists.ozlabs.org [IPv6:2404:9400:2:0:216:3eff:fee1:b9f1]) by legolas.ozlabs.org (Postfix) with ESMTP id 4VnxdS0dm0z20KL for ; Mon, 27 May 2024 23:41:47 +1000 (AEST) Received: from boromir.ozlabs.org (localhost [IPv6:::1]) by lists.ozlabs.org (Postfix) with ESMTP id 4VnxW83z0hz79tq for ; Mon, 27 May 2024 23:36:20 +1000 (AEST) X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: lists.ozlabs.org; dmarc=pass (p=quarantine dis=none) header.from=csgroup.eu Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=csgroup.eu (client-ip=93.17.236.30; helo=pegase1.c-s.fr; envelope-from=christophe.leroy@csgroup.eu; receiver=lists.ozlabs.org) Received: from pegase1.c-s.fr (pegase1.c-s.fr [93.17.236.30]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4VnxPz3fC7z87hg for ; Mon, 27 May 2024 23:31:51 +1000 (AEST) Received: from localhost (mailhub3.si.c-s.fr [192.168.12.233]) by localhost (Postfix) with ESMTP id 4VnxP90zbQz9tCl; Mon, 27 May 2024 15:31:09 +0200 (CEST) X-Virus-Scanned: amavisd-new at c-s.fr Received: from pegase1.c-s.fr ([192.168.12.234]) by localhost (pegase1.c-s.fr [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id smFEYZcgFAps; Mon, 27 May 2024 15:31:09 +0200 (CEST) Received: from messagerie.si.c-s.fr (messagerie.si.c-s.fr [192.168.25.192]) by pegase1.c-s.fr (Postfix) with ESMTP id 4VnxP90HBqz9tC6; Mon, 27 May 2024 15:31:09 +0200 (CEST) Received: from localhost (localhost [127.0.0.1]) by messagerie.si.c-s.fr (Postfix) with ESMTP id 03D138B773; Mon, 27 May 2024 15:31:09 +0200 (CEST) X-Virus-Scanned: amavisd-new at c-s.fr Received: from messagerie.si.c-s.fr ([127.0.0.1]) by localhost (messagerie.si.c-s.fr [127.0.0.1]) (amavisd-new, port 10023) with ESMTP id x2KtvmF3p3L2; Mon, 27 May 2024 15:31:08 +0200 (CEST) Received: from PO20335.idsi0.si.c-s.fr (unknown [192.168.232.49]) by messagerie.si.c-s.fr (Postfix) with ESMTP id B079F8B764; Mon, 27 May 2024 15:30:57 +0200 (CEST) From: Christophe Leroy To: Andrew Morton , Jason Gunthorpe , Peter Xu , Oscar Salvador , Michael Ellerman , Nicholas Piggin Subject: [RFC PATCH v4 15/16] powerpc/mm: Remove hugepd leftovers Date: Mon, 27 May 2024 15:30:13 +0200 Message-ID: <90a9c00eeb25305aa555fa5ac4f934dc2ba5ba14.1716815901.git.christophe.leroy@csgroup.eu> X-Mailer: git-send-email 2.44.0 In-Reply-To: References: MIME-Version: 1.0 X-Developer-Signature: v=1; a=ed25519-sha256; t=1716816601; l=18416; i=christophe.leroy@csgroup.eu; s=20211009; h=from:subject:message-id; bh=5ILOHhg2kdKUvkDB27RTPRs/XgWOctXfbY3rjhexkGI=; b=x+orjP06gPARjTZkX74NHu2dmY7q6fN2vkF6U8Iw1gIwe+DREp7ggjtKntgfOxJnDY7qp/dVO 0Tn8WIpivxODv+r7CfTk7hFh77Eu4V4cn2/okAAKGhbkpsmPAvslmE4 X-Developer-Key: i=christophe.leroy@csgroup.eu; a=ed25519; pk=HIzTzUj91asvincQGOFx6+ZF5AoUuP9GdOtQChs7Mm0= X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org Errors-To: linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" All targets have now opted out of CONFIG_ARCH_HAS_HUGEPD so remove left over code. Signed-off-by: Christophe Leroy Acked-by: Oscar Salvador --- arch/powerpc/include/asm/hugetlb.h | 7 - arch/powerpc/include/asm/page.h | 6 - arch/powerpc/include/asm/pgtable-be-types.h | 10 - arch/powerpc/include/asm/pgtable-types.h | 9 - arch/powerpc/mm/hugetlbpage.c | 412 -------------------- arch/powerpc/mm/init-common.c | 8 +- arch/powerpc/mm/pgtable.c | 27 +- 7 files changed, 3 insertions(+), 476 deletions(-) diff --git a/arch/powerpc/include/asm/hugetlb.h b/arch/powerpc/include/asm/hugetlb.h index e959c26c0b52..18a3028ac3b6 100644 --- a/arch/powerpc/include/asm/hugetlb.h +++ b/arch/powerpc/include/asm/hugetlb.h @@ -30,13 +30,6 @@ static inline int is_hugepage_only_range(struct mm_struct *mm, } #define is_hugepage_only_range is_hugepage_only_range -#ifdef CONFIG_ARCH_HAS_HUGEPD -#define __HAVE_ARCH_HUGETLB_FREE_PGD_RANGE -void hugetlb_free_pgd_range(struct mmu_gather *tlb, unsigned long addr, - unsigned long end, unsigned long floor, - unsigned long ceiling); -#endif - #define __HAVE_ARCH_HUGE_SET_HUGE_PTE_AT void set_huge_pte_at(struct mm_struct *mm, unsigned long addr, pte_t *ptep, pte_t pte, unsigned long sz); diff --git a/arch/powerpc/include/asm/page.h b/arch/powerpc/include/asm/page.h index c0af246a64ff..83d0a4fc5f75 100644 --- a/arch/powerpc/include/asm/page.h +++ b/arch/powerpc/include/asm/page.h @@ -269,12 +269,6 @@ static inline const void *pfn_to_kaddr(unsigned long pfn) #define is_kernel_addr(x) ((x) >= TASK_SIZE) #endif -/* - * Some number of bits at the level of the page table that points to - * a hugepte are used to encode the size. This masks those bits. - */ -#define HUGEPD_SHIFT_MASK 0x3f - #ifndef __ASSEMBLY__ #ifdef CONFIG_PPC_BOOK3S_64 diff --git a/arch/powerpc/include/asm/pgtable-be-types.h b/arch/powerpc/include/asm/pgtable-be-types.h index 82633200b500..6bd8f89b25dc 100644 --- a/arch/powerpc/include/asm/pgtable-be-types.h +++ b/arch/powerpc/include/asm/pgtable-be-types.h @@ -101,14 +101,4 @@ static inline bool pmd_xchg(pmd_t *pmdp, pmd_t old, pmd_t new) return pmd_raw(old) == prev; } -#ifdef CONFIG_ARCH_HAS_HUGEPD -typedef struct { __be64 pdbe; } hugepd_t; -#define __hugepd(x) ((hugepd_t) { cpu_to_be64(x) }) - -static inline unsigned long hpd_val(hugepd_t x) -{ - return be64_to_cpu(x.pdbe); -} -#endif - #endif /* _ASM_POWERPC_PGTABLE_BE_TYPES_H */ diff --git a/arch/powerpc/include/asm/pgtable-types.h b/arch/powerpc/include/asm/pgtable-types.h index db965d98e0ae..7b3d4c592a10 100644 --- a/arch/powerpc/include/asm/pgtable-types.h +++ b/arch/powerpc/include/asm/pgtable-types.h @@ -87,13 +87,4 @@ static inline bool pte_xchg(pte_t *ptep, pte_t old, pte_t new) } #endif -#ifdef CONFIG_ARCH_HAS_HUGEPD -typedef struct { unsigned long pd; } hugepd_t; -#define __hugepd(x) ((hugepd_t) { (x) }) -static inline unsigned long hpd_val(hugepd_t x) -{ - return x.pd; -} -#endif - #endif /* _ASM_POWERPC_PGTABLE_TYPES_H */ diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c index 1fe2843f5b12..76846c6014e4 100644 --- a/arch/powerpc/mm/hugetlbpage.c +++ b/arch/powerpc/mm/hugetlbpage.c @@ -28,8 +28,6 @@ bool hugetlb_disabled = false; -#define hugepd_none(hpd) (hpd_val(hpd) == 0) - #define PTE_T_ORDER (__builtin_ffs(sizeof(pte_basic_t)) - \ __builtin_ffs(sizeof(void *))) @@ -42,156 +40,6 @@ pte_t *huge_pte_offset(struct mm_struct *mm, unsigned long addr, unsigned long s return __find_linux_pte(mm->pgd, addr, NULL, NULL); } -#ifdef CONFIG_ARCH_HAS_HUGEPD -static int __hugepte_alloc(struct mm_struct *mm, hugepd_t *hpdp, - unsigned long address, unsigned int pdshift, - unsigned int pshift, spinlock_t *ptl) -{ - struct kmem_cache *cachep; - pte_t *new; - int i; - int num_hugepd; - - if (pshift >= pdshift) { - cachep = PGT_CACHE(PTE_T_ORDER); - num_hugepd = 1 << (pshift - pdshift); - } else { - cachep = PGT_CACHE(pdshift - pshift); - num_hugepd = 1; - } - - if (!cachep) { - WARN_ONCE(1, "No page table cache created for hugetlb tables"); - return -ENOMEM; - } - - new = kmem_cache_alloc(cachep, pgtable_gfp_flags(mm, GFP_KERNEL)); - - BUG_ON(pshift > HUGEPD_SHIFT_MASK); - BUG_ON((unsigned long)new & HUGEPD_SHIFT_MASK); - - if (!new) - return -ENOMEM; - - /* - * Make sure other cpus find the hugepd set only after a - * properly initialized page table is visible to them. - * For more details look for comment in __pte_alloc(). - */ - smp_wmb(); - - spin_lock(ptl); - /* - * We have multiple higher-level entries that point to the same - * actual pte location. Fill in each as we go and backtrack on error. - * We need all of these so the DTLB pgtable walk code can find the - * right higher-level entry without knowing if it's a hugepage or not. - */ - for (i = 0; i < num_hugepd; i++, hpdp++) { - if (unlikely(!hugepd_none(*hpdp))) - break; - hugepd_populate(hpdp, new, pshift); - } - /* If we bailed from the for loop early, an error occurred, clean up */ - if (i < num_hugepd) { - for (i = i - 1 ; i >= 0; i--, hpdp--) - *hpdp = __hugepd(0); - kmem_cache_free(cachep, new); - } else { - kmemleak_ignore(new); - } - spin_unlock(ptl); - return 0; -} - -/* - * At this point we do the placement change only for BOOK3S 64. This would - * possibly work on other subarchs. - */ -pte_t *huge_pte_alloc(struct mm_struct *mm, struct vm_area_struct *vma, - unsigned long addr, unsigned long sz) -{ - pgd_t *pg; - p4d_t *p4; - pud_t *pu; - pmd_t *pm; - hugepd_t *hpdp = NULL; - unsigned pshift = __ffs(sz); - unsigned pdshift = PGDIR_SHIFT; - spinlock_t *ptl; - - addr &= ~(sz-1); - pg = pgd_offset(mm, addr); - p4 = p4d_offset(pg, addr); - -#ifdef CONFIG_PPC_BOOK3S_64 - if (pshift == PGDIR_SHIFT) - /* 16GB huge page */ - return (pte_t *) p4; - else if (pshift > PUD_SHIFT) { - /* - * We need to use hugepd table - */ - ptl = &mm->page_table_lock; - hpdp = (hugepd_t *)p4; - } else { - pdshift = PUD_SHIFT; - pu = pud_alloc(mm, p4, addr); - if (!pu) - return NULL; - if (pshift == PUD_SHIFT) - return (pte_t *)pu; - else if (pshift > PMD_SHIFT) { - ptl = pud_lockptr(mm, pu); - hpdp = (hugepd_t *)pu; - } else { - pdshift = PMD_SHIFT; - pm = pmd_alloc(mm, pu, addr); - if (!pm) - return NULL; - if (pshift == PMD_SHIFT) - /* 16MB hugepage */ - return (pte_t *)pm; - else { - ptl = pmd_lockptr(mm, pm); - hpdp = (hugepd_t *)pm; - } - } - } -#else - if (pshift >= PGDIR_SHIFT) { - ptl = &mm->page_table_lock; - hpdp = (hugepd_t *)p4; - } else { - pdshift = PUD_SHIFT; - pu = pud_alloc(mm, p4, addr); - if (!pu) - return NULL; - if (pshift >= PUD_SHIFT) { - ptl = pud_lockptr(mm, pu); - hpdp = (hugepd_t *)pu; - } else { - pdshift = PMD_SHIFT; - pm = pmd_alloc(mm, pu, addr); - if (!pm) - return NULL; - ptl = pmd_lockptr(mm, pm); - hpdp = (hugepd_t *)pm; - } - } -#endif - if (!hpdp) - return NULL; - - BUG_ON(!hugepd_none(*hpdp) && !hugepd_ok(*hpdp)); - - if (hugepd_none(*hpdp) && __hugepte_alloc(mm, hpdp, addr, - pdshift, pshift, ptl)) - return NULL; - - return hugepte_offset(*hpdp, addr, pdshift); -} -#else pte_t *huge_pte_alloc(struct mm_struct *mm, struct vm_area_struct *vma, unsigned long addr, unsigned long sz) { @@ -286,266 +134,6 @@ int __init alloc_bootmem_huge_page(struct hstate *h, int nid) return __alloc_bootmem_huge_page(h, nid); } -#ifdef CONFIG_ARCH_HAS_HUGEPD -#ifndef CONFIG_PPC_BOOK3S_64 -#define HUGEPD_FREELIST_SIZE \ - ((PAGE_SIZE - sizeof(struct hugepd_freelist)) / sizeof(pte_t)) - -struct hugepd_freelist { - struct rcu_head rcu; - unsigned int index; - void *ptes[]; -}; - -static DEFINE_PER_CPU(struct hugepd_freelist *, hugepd_freelist_cur); - -static void hugepd_free_rcu_callback(struct rcu_head *head) -{ - struct hugepd_freelist *batch = - container_of(head, struct hugepd_freelist, rcu); - unsigned int i; - - for (i = 0; i < batch->index; i++) - kmem_cache_free(PGT_CACHE(PTE_T_ORDER), batch->ptes[i]); - - free_page((unsigned long)batch); -} - -static void hugepd_free(struct mmu_gather *tlb, void *hugepte) -{ - struct hugepd_freelist **batchp; - - batchp = &get_cpu_var(hugepd_freelist_cur); - - if (atomic_read(&tlb->mm->mm_users) < 2 || - mm_is_thread_local(tlb->mm)) { - kmem_cache_free(PGT_CACHE(PTE_T_ORDER), hugepte); - put_cpu_var(hugepd_freelist_cur); - return; - } - - if (*batchp == NULL) { - *batchp = (struct hugepd_freelist *)__get_free_page(GFP_ATOMIC); - (*batchp)->index = 0; - } - - (*batchp)->ptes[(*batchp)->index++] = hugepte; - if ((*batchp)->index == HUGEPD_FREELIST_SIZE) { - call_rcu(&(*batchp)->rcu, hugepd_free_rcu_callback); - *batchp = NULL; - } - put_cpu_var(hugepd_freelist_cur); -} -#else -static inline void hugepd_free(struct mmu_gather *tlb, void *hugepte) {} -#endif - -/* Return true when the entry to be freed maps more than the area being freed */ -static bool range_is_outside_limits(unsigned long start, unsigned long end, - unsigned long floor, unsigned long ceiling, - unsigned long mask) -{ - if ((start & mask) < floor) - return true; - if (ceiling) { - ceiling &= mask; - if (!ceiling) - return true; - } - return end - 1 > ceiling - 1; -} - -static void free_hugepd_range(struct mmu_gather *tlb, hugepd_t *hpdp, int pdshift, - unsigned long start, unsigned long end, - unsigned long floor, unsigned long ceiling) -{ - pte_t *hugepte = hugepd_page(*hpdp); - int i; - - unsigned long pdmask = ~((1UL << pdshift) - 1); - unsigned int num_hugepd = 1; - unsigned int shift = hugepd_shift(*hpdp); - - /* Note: On fsl the hpdp may be the first of several */ - if (shift > pdshift) - num_hugepd = 1 << (shift - pdshift); - - if (range_is_outside_limits(start, end, floor, ceiling, pdmask)) - return; - - for (i = 0; i < num_hugepd; i++, hpdp++) - *hpdp = __hugepd(0); - - if (shift >= pdshift) - hugepd_free(tlb, hugepte); - else - pgtable_free_tlb(tlb, hugepte, - get_hugepd_cache_index(pdshift - shift)); -} - -static void hugetlb_free_pte_range(struct mmu_gather *tlb, pmd_t *pmd, - unsigned long addr, unsigned long end, - unsigned long floor, unsigned long ceiling) -{ - pgtable_t token = pmd_pgtable(*pmd); - - if (range_is_outside_limits(addr, end, floor, ceiling, PMD_MASK)) - return; - - pmd_clear(pmd); - pte_free_tlb(tlb, token, addr); - mm_dec_nr_ptes(tlb->mm); -} - -static void hugetlb_free_pmd_range(struct mmu_gather *tlb, pud_t *pud, - unsigned long addr, unsigned long end, - unsigned long floor, unsigned long ceiling) -{ - pmd_t *pmd; - unsigned long next; - unsigned long start; - - start = addr; - do { - unsigned long more; - - pmd = pmd_offset(pud, addr); - next = pmd_addr_end(addr, end); - if (!is_hugepd(__hugepd(pmd_val(*pmd)))) { - if (pmd_none_or_clear_bad(pmd)) - continue; - - /* - * if it is not hugepd pointer, we should already find - * it cleared. - */ - WARN_ON(!IS_ENABLED(CONFIG_PPC_8xx)); - - hugetlb_free_pte_range(tlb, pmd, addr, end, floor, ceiling); - - continue; - } - /* - * Increment next by the size of the huge mapping since - * there may be more than one entry at this level for a - * single hugepage, but all of them point to - * the same kmem cache that holds the hugepte. - */ - more = addr + (1UL << hugepd_shift(*(hugepd_t *)pmd)); - if (more > next) - next = more; - - free_hugepd_range(tlb, (hugepd_t *)pmd, PMD_SHIFT, - addr, next, floor, ceiling); - } while (addr = next, addr != end); - - if (range_is_outside_limits(start, end, floor, ceiling, PUD_MASK)) - return; - - pmd = pmd_offset(pud, start & PUD_MASK); - pud_clear(pud); - pmd_free_tlb(tlb, pmd, start & PUD_MASK); - mm_dec_nr_pmds(tlb->mm); -} - -static void hugetlb_free_pud_range(struct mmu_gather *tlb, p4d_t *p4d, - unsigned long addr, unsigned long end, - unsigned long floor, unsigned long ceiling) -{ - pud_t *pud; - unsigned long next; - unsigned long start; - - start = addr; - do { - pud = pud_offset(p4d, addr); - next = pud_addr_end(addr, end); - if (!is_hugepd(__hugepd(pud_val(*pud)))) { - if (pud_none_or_clear_bad(pud)) - continue; - hugetlb_free_pmd_range(tlb, pud, addr, next, floor, - ceiling); - } else { - unsigned long more; - /* - * Increment next by the size of the huge mapping since - * there may be more than one entry at this level for a - * single hugepage, but all of them point to - * the same kmem cache that holds the hugepte. - */ - more = addr + (1UL << hugepd_shift(*(hugepd_t *)pud)); - if (more > next) - next = more; - - free_hugepd_range(tlb, (hugepd_t *)pud, PUD_SHIFT, - addr, next, floor, ceiling); - } - } while (addr = next, addr != end); - - if (range_is_outside_limits(start, end, floor, ceiling, PGDIR_MASK)) - return; - - pud = pud_offset(p4d, start & PGDIR_MASK); - p4d_clear(p4d); - pud_free_tlb(tlb, pud, start & PGDIR_MASK); - mm_dec_nr_puds(tlb->mm); -} - -/* - * This function frees user-level page tables of a process. - */ -void hugetlb_free_pgd_range(struct mmu_gather *tlb, - unsigned long addr, unsigned long end, - unsigned long floor, unsigned long ceiling) -{ - pgd_t *pgd; - p4d_t *p4d; - unsigned long next; - - /* - * Because there are a number of different possible pagetable - * layouts for hugepage ranges, we limit knowledge of how - * things should be laid out to the allocation path - * (huge_pte_alloc(), above). Everything else works out the - * structure as it goes from information in the hugepd - * pointers. That means that we can't here use the - * optimization used in the normal page free_pgd_range(), of - * checking whether we're actually covering a large enough - * range to have to do anything at the top level of the walk - * instead of at the bottom. - * - * To make sense of this, you should probably go read the big - * block comment at the top of the normal free_pgd_range(), - * too. - */ - - do { - next = pgd_addr_end(addr, end); - pgd = pgd_offset(tlb->mm, addr); - p4d = p4d_offset(pgd, addr); - if (!is_hugepd(__hugepd(pgd_val(*pgd)))) { - if (p4d_none_or_clear_bad(p4d)) - continue; - hugetlb_free_pud_range(tlb, p4d, addr, next, floor, ceiling); - } else { - unsigned long more; - /* - * Increment next by the size of the huge mapping since - * there may be more than one entry at the pgd level - * for a single hugepage, but all of them point to the - * same kmem cache that holds the hugepte. - */ - more = addr + (1UL << hugepd_shift(*(hugepd_t *)pgd)); - if (more > next) - next = more; - - free_hugepd_range(tlb, (hugepd_t *)p4d, PGDIR_SHIFT, - addr, next, floor, ceiling); - } - } while (addr = next, addr != end); -} -#endif - bool __init arch_hugetlb_valid_size(unsigned long size) { int shift = __ffs(size); diff --git a/arch/powerpc/mm/init-common.c b/arch/powerpc/mm/init-common.c index d3a7726ecf51..024e95c62a2d 100644 --- a/arch/powerpc/mm/init-common.c +++ b/arch/powerpc/mm/init-common.c @@ -120,12 +120,8 @@ void pgtable_cache_add(unsigned int shift) /* When batching pgtable pointers for RCU freeing, we store * the index size in the low bits. Table alignment must be * big enough to fit it. - * - * Likewise, hugeapge pagetable pointers contain a (different) - * shift value in the low bits. All tables must be aligned so - * as to leave enough 0 bits in the address to contain it. */ - unsigned long minalign = max(MAX_PGTABLE_INDEX_SIZE + 1, - HUGEPD_SHIFT_MASK + 1); + */ + unsigned long minalign = MAX_PGTABLE_INDEX_SIZE + 1; struct kmem_cache *new = NULL; /* It would be nice if this was a BUILD_BUG_ON(), but at the diff --git a/arch/powerpc/mm/pgtable.c b/arch/powerpc/mm/pgtable.c index 218792cb2c47..ab0656115424 100644 --- a/arch/powerpc/mm/pgtable.c +++ b/arch/powerpc/mm/pgtable.c @@ -409,11 +409,10 @@ unsigned long vmalloc_to_phys(void *va) EXPORT_SYMBOL_GPL(vmalloc_to_phys); /* - * We have 4 cases for pgds and pmds: + * We have 3 cases for pgds and pmds: * (1) invalid (all zeroes) * (2) pointer to next table, as normal; bottom 6 bits == 0 * (3) leaf pte for huge page _PAGE_PTE set - * (4) hugepd pointer, _PAGE_PTE = 0 and bits [2..6] indicate size of table * * So long as we atomically load page table pointers we are safe against teardown, * we can follow the address down to the page and take a ref on it. @@ -430,7 +429,6 @@ pte_t *__find_linux_pte(pgd_t *pgdir, unsigned long ea, #endif pmd_t pmd, *pmdp; pte_t *ret_pte; - hugepd_t *hpdp = NULL; unsigned pdshift; if (hpage_shift) @@ -463,11 +461,6 @@ pte_t *__find_linux_pte(pgd_t *pgdir, unsigned long ea, goto out; } - if (is_hugepd(__hugepd(p4d_val(p4d)))) { - hpdp = (hugepd_t *)&p4d; - goto out_huge; - } - /* * Even if we end up with an unmap, the pgtable will not * be freed, because we do an rcu free and here we are @@ -485,11 +478,6 @@ pte_t *__find_linux_pte(pgd_t *pgdir, unsigned long ea, goto out; } - if (is_hugepd(__hugepd(pud_val(pud)))) { - hpdp = (hugepd_t *)&pud; - goto out_huge; - } - pmdp = pmd_offset(&pud, ea); #else pmdp = pmd_offset(pud_offset(p4d_offset(pgdp, ea), ea), ea); @@ -527,21 +515,8 @@ pte_t *__find_linux_pte(pgd_t *pgdir, unsigned long ea, goto out; } - if (is_hugepd(__hugepd(pmd_val(pmd)))) { - hpdp = (hugepd_t *)&pmd; - goto out_huge; - } - return pte_offset_kernel(&pmd, ea); -out_huge: - if (!hpdp) - return NULL; - -#ifdef CONFIG_ARCH_HAS_HUGEPD - ret_pte = hugepte_offset(*hpdp, ea, pdshift); - pdshift = hugepd_shift(*hpdp); -#endif out: if (hpage_shift) *hpage_shift = pdshift; From patchwork Mon May 27 13:30:14 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christophe Leroy X-Patchwork-Id: 1939963 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=lists.ozlabs.org (client-ip=112.213.38.117; helo=lists.ozlabs.org; envelope-from=linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org; receiver=patchwork.ozlabs.org) Received: from lists.ozlabs.org (lists.ozlabs.org [112.213.38.117]) by legolas.ozlabs.org (Postfix) with ESMTP id 4Vnxl66fgCz20PT for ; Mon, 27 May 2024 23:46:42 +1000 (AEST) Received: from boromir.ozlabs.org (localhost [IPv6:::1]) by lists.ozlabs.org (Postfix) with ESMTP id 4VnxWj1FQ6z3vY0 for ; Mon, 27 May 2024 23:36:49 +1000 (AEST) X-Original-To: linuxppc-dev@lists.ozlabs.org Delivered-To: linuxppc-dev@lists.ozlabs.org Authentication-Results: lists.ozlabs.org; dmarc=pass (p=quarantine dis=none) header.from=csgroup.eu Authentication-Results: lists.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=csgroup.eu (client-ip=93.17.236.30; helo=pegase1.c-s.fr; envelope-from=christophe.leroy@csgroup.eu; receiver=lists.ozlabs.org) Received: from pegase1.c-s.fr (pegase1.c-s.fr [93.17.236.30]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits)) (No client certificate requested) by lists.ozlabs.org (Postfix) with ESMTPS id 4VnxQ36d0Fz87mF for ; Mon, 27 May 2024 23:31:55 +1000 (AEST) Received: from localhost (mailhub3.si.c-s.fr [192.168.12.233]) by localhost (Postfix) with ESMTP id 4VnxPK5zXmz9tLb; Mon, 27 May 2024 15:31:17 +0200 (CEST) X-Virus-Scanned: amavisd-new at c-s.fr Received: from pegase1.c-s.fr ([192.168.12.234]) by localhost (pegase1.c-s.fr [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id OdZTzxKM_r1Z; Mon, 27 May 2024 15:31:17 +0200 (CEST) Received: from messagerie.si.c-s.fr (messagerie.si.c-s.fr [192.168.25.192]) by pegase1.c-s.fr (Postfix) with ESMTP id 4VnxPK5KFzz9tFS; Mon, 27 May 2024 15:31:17 +0200 (CEST) Received: from localhost (localhost [127.0.0.1]) by messagerie.si.c-s.fr (Postfix) with ESMTP id B12D18B773; Mon, 27 May 2024 15:31:17 +0200 (CEST) X-Virus-Scanned: amavisd-new at c-s.fr Received: from messagerie.si.c-s.fr ([127.0.0.1]) by localhost (messagerie.si.c-s.fr [127.0.0.1]) (amavisd-new, port 10023) with ESMTP id wNrmK42xOPRp; Mon, 27 May 2024 15:31:17 +0200 (CEST) Received: from PO20335.idsi0.si.c-s.fr (unknown [192.168.232.49]) by messagerie.si.c-s.fr (Postfix) with ESMTP id 26C5E8B764; Mon, 27 May 2024 15:31:09 +0200 (CEST) From: Christophe Leroy To: Andrew Morton , Jason Gunthorpe , Peter Xu , Oscar Salvador , Michael Ellerman , Nicholas Piggin Subject: [RFC PATCH v4 16/16] mm: Remove CONFIG_ARCH_HAS_HUGEPD Date: Mon, 27 May 2024 15:30:14 +0200 Message-ID: <98d6ff5bc9395785e268b9f8eff08fb742c045c6.1716815901.git.christophe.leroy@csgroup.eu> X-Mailer: git-send-email 2.44.0 In-Reply-To: References: MIME-Version: 1.0 X-Developer-Signature: v=1; a=ed25519-sha256; t=1716816601; l=13969; i=christophe.leroy@csgroup.eu; s=20211009; h=from:subject:message-id; bh=2mnucT7jK/WjK+hp16tx3H/CY4w2kcP/kZQm14EdO3U=; b=+y4afFc8IAbinC7WJjJ+NWwCU4NrtOneoLV6ulrU1x1Do1PwObN+oY70VUBj2BtI+7WPYfpWJ zCg+niYXt4bD9ECim5rrTwaqunq+8ARrng9xS33KK3M6mjRtiFpAYid X-Developer-Key: i=christophe.leroy@csgroup.eu; a=ed25519; pk=HIzTzUj91asvincQGOFx6+ZF5AoUuP9GdOtQChs7Mm0= X-BeenThere: linuxppc-dev@lists.ozlabs.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Linux on PowerPC Developers Mail List List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org Errors-To: linuxppc-dev-bounces+incoming=patchwork.ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" powerpc was the only user of CONFIG_ARCH_HAS_HUGEPD and doesn't use it anymore, so remove all related code. Signed-off-by: Christophe Leroy Acked-by: Oscar Salvador --- v4: Rebased on v6.10-rc1 --- arch/powerpc/mm/hugetlbpage.c | 1 - include/linux/hugetlb.h | 6 -- mm/Kconfig | 10 -- mm/gup.c | 183 +--------------------------------- mm/pagewalk.c | 57 +---------- 5 files changed, 9 insertions(+), 248 deletions(-) diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c index 76846c6014e4..6b043180220a 100644 --- a/arch/powerpc/mm/hugetlbpage.c +++ b/arch/powerpc/mm/hugetlbpage.c @@ -78,7 +78,6 @@ pte_t *huge_pte_alloc(struct mm_struct *mm, struct vm_area_struct *vma, return pte_alloc_huge(mm, pmd, addr); } -#endif #ifdef CONFIG_PPC_BOOK3S_64 /* diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h index 2b3c3a404769..58daf7d14bf4 100644 --- a/include/linux/hugetlb.h +++ b/include/linux/hugetlb.h @@ -20,12 +20,6 @@ struct user_struct; struct mmu_gather; struct node; -#ifndef CONFIG_ARCH_HAS_HUGEPD -typedef struct { unsigned long pd; } hugepd_t; -#define is_hugepd(hugepd) (0) -#define __hugepd(x) ((hugepd_t) { (x) }) -#endif - void free_huge_folio(struct folio *folio); #ifdef CONFIG_HUGETLB_PAGE diff --git a/mm/Kconfig b/mm/Kconfig index b4cb45255a54..049d29ec6e20 100644 --- a/mm/Kconfig +++ b/mm/Kconfig @@ -1119,16 +1119,6 @@ config DMAPOOL_TEST config ARCH_HAS_PTE_SPECIAL bool -# -# Some architectures require a special hugepage directory format that is -# required to support multiple hugepage sizes. For example a4fe3ce76 -# "powerpc/mm: Allow more flexible layouts for hugepage pagetables" -# introduced it on powerpc. This allows for a more flexible hugepage -# pagetable layouts. -# -config ARCH_HAS_HUGEPD - bool - config MAPPING_DIRTY_HELPERS bool diff --git a/mm/gup.c b/mm/gup.c index 53ebb0ae53a0..f8e982a42bba 100644 --- a/mm/gup.c +++ b/mm/gup.c @@ -501,7 +501,7 @@ static inline void mm_set_has_pinned_flag(unsigned long *mm_flags) #ifdef CONFIG_MMU -#if defined(CONFIG_ARCH_HAS_HUGEPD) || defined(CONFIG_HAVE_GUP_FAST) +#ifdef CONFIG_HAVE_GUP_FAST static int record_subpages(struct page *page, unsigned long sz, unsigned long addr, unsigned long end, struct page **pages) @@ -515,147 +515,7 @@ static int record_subpages(struct page *page, unsigned long sz, return nr; } -#endif /* CONFIG_ARCH_HAS_HUGEPD || CONFIG_HAVE_GUP_FAST */ - -#ifdef CONFIG_ARCH_HAS_HUGEPD -static unsigned long hugepte_addr_end(unsigned long addr, unsigned long end, - unsigned long sz) -{ - unsigned long __boundary = (addr + sz) & ~(sz-1); - return (__boundary - 1 < end - 1) ? __boundary : end; -} - -/* - * Returns 1 if succeeded, 0 if failed, -EMLINK if unshare needed. - * - * NOTE: for the same entry, gup-fast and gup-slow can return different - * results (0 v.s. -EMLINK) depending on whether vma is available. This is - * the expected behavior, where we simply want gup-fast to fallback to - * gup-slow to take the vma reference first. - */ -static int gup_hugepte(struct vm_area_struct *vma, pte_t *ptep, unsigned long sz, - unsigned long addr, unsigned long end, unsigned int flags, - struct page **pages, int *nr) -{ - unsigned long pte_end; - struct page *page; - struct folio *folio; - pte_t pte; - int refs; - - pte_end = (addr + sz) & ~(sz-1); - if (pte_end < end) - end = pte_end; - - pte = huge_ptep_get(vma->mm, addr, ptep); - - if (!pte_access_permitted(pte, flags & FOLL_WRITE)) - return 0; - - /* hugepages are never "special" */ - VM_BUG_ON(!pfn_valid(pte_pfn(pte))); - - page = pte_page(pte); - refs = record_subpages(page, sz, addr, end, pages + *nr); - - folio = try_grab_folio(page, refs, flags); - if (!folio) - return 0; - - if (unlikely(pte_val(pte) != pte_val(ptep_get(ptep)))) { - gup_put_folio(folio, refs, flags); - return 0; - } - - if (!pte_write(pte) && gup_must_unshare(vma, flags, &folio->page)) { - gup_put_folio(folio, refs, flags); - return -EMLINK; - } - - *nr += refs; - folio_set_referenced(folio); - return 1; -} - -/* - * NOTE: currently GUP for a hugepd is only possible on hugetlbfs file - * systems on Power, which does not have issue with folio writeback against - * GUP updates. When hugepd will be extended to support non-hugetlbfs or - * even anonymous memory, we need to do extra check as what we do with most - * of the other folios. See writable_file_mapping_allowed() and - * gup_fast_folio_allowed() for more information. - */ -static int gup_hugepd(struct vm_area_struct *vma, hugepd_t hugepd, - unsigned long addr, unsigned int pdshift, - unsigned long end, unsigned int flags, - struct page **pages, int *nr) -{ - pte_t *ptep; - unsigned long sz = 1UL << hugepd_shift(hugepd); - unsigned long next; - int ret; - - ptep = hugepte_offset(hugepd, addr, pdshift); - do { - next = hugepte_addr_end(addr, end, sz); - ret = gup_hugepte(vma, ptep, sz, addr, end, flags, pages, nr); - if (ret != 1) - return ret; - } while (ptep++, addr = next, addr != end); - - return 1; -} - -static struct page *follow_hugepd(struct vm_area_struct *vma, hugepd_t hugepd, - unsigned long addr, unsigned int pdshift, - unsigned int flags, - struct follow_page_context *ctx) -{ - struct page *page; - struct hstate *h; - spinlock_t *ptl; - int nr = 0, ret; - pte_t *ptep; - - /* Only hugetlb supports hugepd */ - if (WARN_ON_ONCE(!is_vm_hugetlb_page(vma))) - return ERR_PTR(-EFAULT); - - h = hstate_vma(vma); - ptep = hugepte_offset(hugepd, addr, pdshift); - ptl = huge_pte_lock(h, vma->vm_mm, ptep); - ret = gup_hugepd(vma, hugepd, addr, pdshift, addr + PAGE_SIZE, - flags, &page, &nr); - spin_unlock(ptl); - - if (ret == 1) { - /* GUP succeeded */ - WARN_ON_ONCE(nr != 1); - ctx->page_mask = (1U << huge_page_order(h)) - 1; - return page; - } - - /* ret can be either 0 (translates to NULL) or negative */ - return ERR_PTR(ret); -} -#else /* CONFIG_ARCH_HAS_HUGEPD */ -static inline int gup_hugepd(struct vm_area_struct *vma, hugepd_t hugepd, - unsigned long addr, unsigned int pdshift, - unsigned long end, unsigned int flags, - struct page **pages, int *nr) -{ - return 0; -} - -static struct page *follow_hugepd(struct vm_area_struct *vma, hugepd_t hugepd, - unsigned long addr, unsigned int pdshift, - unsigned int flags, - struct follow_page_context *ctx) -{ - return NULL; -} -#endif /* CONFIG_ARCH_HAS_HUGEPD */ - +#endif /* CONFIG_HAVE_GUP_FAST */ static struct page *no_page_table(struct vm_area_struct *vma, unsigned int flags, unsigned long address) @@ -1025,9 +885,6 @@ static struct page *follow_pmd_mask(struct vm_area_struct *vma, return no_page_table(vma, flags, address); if (!pmd_present(pmdval)) return no_page_table(vma, flags, address); - if (unlikely(is_hugepd(__hugepd(pmd_val(pmdval))))) - return follow_hugepd(vma, __hugepd(pmd_val(pmdval)), - address, PMD_SHIFT, flags, ctx); if (pmd_devmap(pmdval)) { ptl = pmd_lock(mm, pmd); page = follow_devmap_pmd(vma, address, pmd, flags, &ctx->pgmap); @@ -1078,9 +935,6 @@ static struct page *follow_pud_mask(struct vm_area_struct *vma, pud = READ_ONCE(*pudp); if (!pud_present(pud)) return no_page_table(vma, flags, address); - if (unlikely(is_hugepd(__hugepd(pud_val(pud))))) - return follow_hugepd(vma, __hugepd(pud_val(pud)), - address, PUD_SHIFT, flags, ctx); if (pud_leaf(pud)) { ptl = pud_lock(mm, pudp); page = follow_huge_pud(vma, address, pudp, flags, ctx); @@ -1106,10 +960,6 @@ static struct page *follow_p4d_mask(struct vm_area_struct *vma, p4d = READ_ONCE(*p4dp); BUILD_BUG_ON(p4d_leaf(p4d)); - if (unlikely(is_hugepd(__hugepd(p4d_val(p4d))))) - return follow_hugepd(vma, __hugepd(p4d_val(p4d)), - address, P4D_SHIFT, flags, ctx); - if (!p4d_present(p4d) || p4d_bad(p4d)) return no_page_table(vma, flags, address); @@ -1153,10 +1003,7 @@ static struct page *follow_page_mask(struct vm_area_struct *vma, ctx->page_mask = 0; pgd = pgd_offset(mm, address); - if (unlikely(is_hugepd(__hugepd(pgd_val(*pgd))))) - page = follow_hugepd(vma, __hugepd(pgd_val(*pgd)), - address, PGDIR_SHIFT, flags, ctx); - else if (pgd_none(*pgd) || unlikely(pgd_bad(*pgd))) + if (pgd_none(*pgd) || unlikely(pgd_bad(*pgd))) page = no_page_table(vma, flags, address); else page = follow_p4d_mask(vma, address, pgd, flags, ctx); @@ -3270,14 +3117,6 @@ static int gup_fast_pmd_range(pud_t *pudp, pud_t pud, unsigned long addr, pages, nr)) return 0; - } else if (unlikely(is_hugepd(__hugepd(pmd_val(pmd))))) { - /* - * architecture have different format for hugetlbfs - * pmd format and THP pmd format - */ - if (gup_hugepd(NULL, __hugepd(pmd_val(pmd)), addr, - PMD_SHIFT, next, flags, pages, nr) != 1) - return 0; } else if (!gup_fast_pte_range(pmd, pmdp, addr, next, flags, pages, nr)) return 0; @@ -3304,10 +3143,6 @@ static int gup_fast_pud_range(p4d_t *p4dp, p4d_t p4d, unsigned long addr, if (!gup_fast_pud_leaf(pud, pudp, addr, next, flags, pages, nr)) return 0; - } else if (unlikely(is_hugepd(__hugepd(pud_val(pud))))) { - if (gup_hugepd(NULL, __hugepd(pud_val(pud)), addr, - PUD_SHIFT, next, flags, pages, nr) != 1) - return 0; } else if (!gup_fast_pmd_range(pudp, pud, addr, next, flags, pages, nr)) return 0; @@ -3331,12 +3166,8 @@ static int gup_fast_p4d_range(pgd_t *pgdp, pgd_t pgd, unsigned long addr, if (!p4d_present(p4d)) return 0; BUILD_BUG_ON(p4d_leaf(p4d)); - if (unlikely(is_hugepd(__hugepd(p4d_val(p4d))))) { - if (gup_hugepd(NULL, __hugepd(p4d_val(p4d)), addr, - P4D_SHIFT, next, flags, pages, nr) != 1) - return 0; - } else if (!gup_fast_pud_range(p4dp, p4d, addr, next, flags, - pages, nr)) + if (!gup_fast_pud_range(p4dp, p4d, addr, next, flags, + pages, nr)) return 0; } while (p4dp++, addr = next, addr != end); @@ -3360,10 +3191,6 @@ static void gup_fast_pgd_range(unsigned long addr, unsigned long end, if (!gup_fast_pgd_leaf(pgd, pgdp, addr, next, flags, pages, nr)) return; - } else if (unlikely(is_hugepd(__hugepd(pgd_val(pgd))))) { - if (gup_hugepd(NULL, __hugepd(pgd_val(pgd)), addr, - PGDIR_SHIFT, next, flags, pages, nr) != 1) - return; } else if (!gup_fast_p4d_range(pgdp, pgd, addr, next, flags, pages, nr)) return; diff --git a/mm/pagewalk.c b/mm/pagewalk.c index f46c80b18ce4..ae2f08ce991b 100644 --- a/mm/pagewalk.c +++ b/mm/pagewalk.c @@ -73,45 +73,6 @@ static int walk_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end, return err; } -#ifdef CONFIG_ARCH_HAS_HUGEPD -static int walk_hugepd_range(hugepd_t *phpd, unsigned long addr, - unsigned long end, struct mm_walk *walk, int pdshift) -{ - int err = 0; - const struct mm_walk_ops *ops = walk->ops; - int shift = hugepd_shift(*phpd); - int page_size = 1 << shift; - - if (!ops->pte_entry) - return 0; - - if (addr & (page_size - 1)) - return 0; - - for (;;) { - pte_t *pte; - - spin_lock(&walk->mm->page_table_lock); - pte = hugepte_offset(*phpd, addr, pdshift); - err = ops->pte_entry(pte, addr, addr + page_size, walk); - spin_unlock(&walk->mm->page_table_lock); - - if (err) - break; - if (addr >= end - page_size) - break; - addr += page_size; - } - return err; -} -#else -static int walk_hugepd_range(hugepd_t *phpd, unsigned long addr, - unsigned long end, struct mm_walk *walk, int pdshift) -{ - return 0; -} -#endif - static int walk_pmd_range(pud_t *pud, unsigned long addr, unsigned long end, struct mm_walk *walk) { @@ -159,10 +120,7 @@ static int walk_pmd_range(pud_t *pud, unsigned long addr, unsigned long end, if (walk->vma) split_huge_pmd(walk->vma, pmd, addr); - if (is_hugepd(__hugepd(pmd_val(*pmd)))) - err = walk_hugepd_range((hugepd_t *)pmd, addr, next, walk, PMD_SHIFT); - else - err = walk_pte_range(pmd, addr, next, walk); + err = walk_pte_range(pmd, addr, next, walk); if (err) break; @@ -215,10 +173,7 @@ static int walk_pud_range(p4d_t *p4d, unsigned long addr, unsigned long end, if (pud_none(*pud)) goto again; - if (is_hugepd(__hugepd(pud_val(*pud)))) - err = walk_hugepd_range((hugepd_t *)pud, addr, next, walk, PUD_SHIFT); - else - err = walk_pmd_range(pud, addr, next, walk); + err = walk_pmd_range(pud, addr, next, walk); if (err) break; } while (pud++, addr = next, addr != end); @@ -250,9 +205,7 @@ static int walk_p4d_range(pgd_t *pgd, unsigned long addr, unsigned long end, if (err) break; } - if (is_hugepd(__hugepd(p4d_val(*p4d)))) - err = walk_hugepd_range((hugepd_t *)p4d, addr, next, walk, P4D_SHIFT); - else if (ops->pud_entry || ops->pmd_entry || ops->pte_entry) + if (ops->pud_entry || ops->pmd_entry || ops->pte_entry) err = walk_pud_range(p4d, addr, next, walk); if (err) break; @@ -287,9 +240,7 @@ static int walk_pgd_range(unsigned long addr, unsigned long end, if (err) break; } - if (is_hugepd(__hugepd(pgd_val(*pgd)))) - err = walk_hugepd_range((hugepd_t *)pgd, addr, next, walk, PGDIR_SHIFT); - else if (ops->p4d_entry || ops->pud_entry || ops->pmd_entry || ops->pte_entry) + if (ops->p4d_entry || ops->pud_entry || ops->pmd_entry || ops->pte_entry) err = walk_p4d_range(pgd, addr, next, walk); if (err) break;