Message ID | 20240416120635.361838-2-skseofh@gmail.com |
---|---|
State | Changes Requested |
Headers | show |
Series | [v2] memblock: add no-map alloc functions | expand |
Context | Check | Description |
---|---|---|
robh/checkpatch | warning | total: 60 errors, 42 warnings, 111 lines checked |
robh/patch-applied | fail | build log |
On Tue, Apr 16, 2024 at 09:06:35PM +0900, skseofh@gmail.com wrote: >From: Daero Lee <daero_le.lee@samsung.com> > >Like reserved-memory with the 'no-map' property and only 'size' property >(w/o 'reg' property), there are memory regions need to be allocated in >memblock.memory marked with the MEMBLOCK_NOMAP flag, but should not be >allocated in memblock.reserved. We don't "allocate" memory from memblock.memory directly. We present memory in memblock.memory and "allocate" a memory by marking it in memblock.reserved. > >example : arch/arm64/boot/dts/freescale/fsl-ls1043a.dtsi > reserved-memory { > #address-cells = <2>; > #size-cells = <2>; > ranges; > > bman_fbpr: bman-fbpr { > compatible = "shared-dma-pool"; > size = <0 0x1000000>; > alignment = <0 0x1000000>; > no-map; > }; > > qman_fqd: qman-fqd { > compatible = "shared-dma-pool"; > size = <0 0x400000>; > alignment = <0 0x400000>; > no-map; > }; > > qman_pfdr: qman-pfdr { > compatible = "shared-dma-pool"; > size = <0 0x2000000>; > alignment = <0 0x2000000>; > no-map; > }; > }; > >So, functions were added that find the required memory area in >memblock.memory, but do not allocate it to memblock.reserved. > >In previous patch(a7259df), early_init_dt_alloc_reserved_memory was You want to say this patch introduced a regression? >modified to use memblock_phys_alloc_range allocating memory in >memblock.reserved, instead of memblock_find_in_range that just find the >available region. But if there is a 'no-map' property, memory region >should not be allocated to memblock.reserved. If my understanding is correct, memblock_phys_free() is called if 'no-map' property is set. This would release the "allocation" in memblock.reserved. > >So, the early_init_dt_alloc_reserved_memory_arch function was modified >using the no-map alloc function. > >Signed-off-by: Daero Lee <daero_le.lee@samsung.com> >--- > drivers/of/of_reserved_mem.c | 9 +++-- > mm/memblock.c | 78 ++++++++++++++++++++++++++++++++++++ > 2 files changed, 84 insertions(+), 3 deletions(-) > >diff --git a/drivers/of/of_reserved_mem.c b/drivers/of/of_reserved_mem.c >index 8236ecae2953..504f2f60689c 100644 >--- a/drivers/of/of_reserved_mem.c >+++ b/drivers/of/of_reserved_mem.c >@@ -40,15 +40,18 @@ static int __init early_init_dt_alloc_reserved_memory_arch(phys_addr_t size, > > end = !end ? MEMBLOCK_ALLOC_ANYWHERE : end; > align = !align ? SMP_CACHE_BYTES : align; >- base = memblock_phys_alloc_range(size, align, start, end); >+ if (nomap) { >+ base = memblock_phys_alloc_range_nomap(size, align, start, end); >+ } else { >+ base = memblock_phys_alloc_range(size, align, start, end); >+ } >+ > if (!base) > return -ENOMEM; > > *res_base = base; > if (nomap) { > err = memblock_mark_nomap(base, size); >- if (err) >- memblock_phys_free(base, size); I see you removed it here, seems does the same as before. > } > > kmemleak_ignore_phys(base); >diff --git a/mm/memblock.c b/mm/memblock.c >index d09136e040d3..f103f1ecbfad 100644 >--- a/mm/memblock.c >+++ b/mm/memblock.c >@@ -1506,6 +1506,72 @@ phys_addr_t __init memblock_alloc_range_nid(phys_addr_t size, > return found; > } > >+phys_addr_t __init memblock_alloc_range_nid_nomap(phys_addr_t size, >+ phys_addr_t align, phys_addr_t start, >+ phys_addr_t end, int nid, >+ bool exact_nid) >+{ >+ enum memblock_flags flags = choose_memblock_flags(); >+ phys_addr_t found; >+ >+ if (WARN_ONCE(nid == MAX_NUMNODES, "Usage of MAX_NUMNODES is deprecated. Use NUMA_NO_NODE instead\n")) >+ nid = NUMA_NO_NODE; >+ >+ if (!align) { >+ /* Can't use WARNs this early in boot on powerpc */ >+ dump_stack(); >+ align = SMP_CACHE_BYTES; >+ } >+ >+again: >+ found = memblock_find_in_range_node(size, align, start, end, nid, >+ flags); >+ if (found) >+ goto done; >+ >+ if (nid != NUMA_NO_NODE && !exact_nid) { >+ found = memblock_find_in_range_node(size, align, start, >+ end, NUMA_NO_NODE, >+ flags); >+ if (found) >+ goto done; >+ } >+ >+ if (flags & MEMBLOCK_MIRROR) { >+ flags &= ~MEMBLOCK_MIRROR; >+ pr_warn_ratelimited("Could not allocate %pap bytes of mirrored memory\n", >+ &size); >+ goto again; >+ } >+ >+ return 0; >+ >+done: >+ /* >+ * Skip kmemleak for those places like kasan_init() and >+ * early_pgtable_alloc() due to high volume. >+ */ >+ if (end != MEMBLOCK_ALLOC_NOLEAKTRACE) >+ /* >+ * Memblock allocated blocks are never reported as >+ * leaks. This is because many of these blocks are >+ * only referred via the physical address which is >+ * not looked up by kmemleak. >+ */ >+ kmemleak_alloc_phys(found, size, 0); >+ >+ /* >+ * Some Virtual Machine platforms, such as Intel TDX or AMD SEV-SNP, >+ * require memory to be accepted before it can be used by the >+ * guest. >+ * >+ * Accept the memory of the allocated buffer. >+ */ >+ accept_memory(found, found + size); >+ >+ return found; >+} >+ > /** > * memblock_phys_alloc_range - allocate a memory block inside specified range > * @size: size of memory block to be allocated in bytes >@@ -1530,6 +1596,18 @@ phys_addr_t __init memblock_phys_alloc_range(phys_addr_t size, > false); > } > >+phys_addr_t __init memblock_phys_alloc_range_nomap(phys_addr_t size, >+ phys_addr_t align, >+ phys_addr_t start, >+ phys_addr_t end) >+{ >+ memblock_dbg("%s: %llu bytes align=0x%llx from=%pa max_addr=%pa %pS\n", >+ __func__, (u64)size, (u64)align, &start, &end, >+ (void *)_RET_IP_); >+ return memblock_alloc_range_nid_nomap(size, align, start, end, >+ NUMA_NO_NODE, false); >+} >+ > /** > * memblock_phys_alloc_try_nid - allocate a memory block from specified NUMA node > * @size: size of memory block to be allocated in bytes >-- >2.25.1 >
On Tue, Apr 16, 2024 at 09:06:35PM +0900, skseofh@gmail.com wrote: > From: Daero Lee <daero_le.lee@samsung.com> > > Like reserved-memory with the 'no-map' property and only 'size' property > (w/o 'reg' property), there are memory regions need to be allocated in > memblock.memory marked with the MEMBLOCK_NOMAP flag, but should not be > allocated in memblock.reserved. This still does not explain why you need such regions. As Wei Yang explained, memblock does not allocate memory from memblock.reserved. The memblock.reserved array represents memory that is in use by firmware or by early kernel allocations and cannot be freed to page allocator. If you have a region that's _NOMAP in memblock.memory and is absent in memblock.reserved it will not be mapped by the kernel page tables, but it will be considered as free memory by the core mm. Is this really what you want? > example : arch/arm64/boot/dts/freescale/fsl-ls1043a.dtsi > reserved-memory { > #address-cells = <2>; > #size-cells = <2>; > ranges; > > bman_fbpr: bman-fbpr { > compatible = "shared-dma-pool"; > size = <0 0x1000000>; > alignment = <0 0x1000000>; > no-map; > }; > > qman_fqd: qman-fqd { > compatible = "shared-dma-pool"; > size = <0 0x400000>; > alignment = <0 0x400000>; > no-map; > }; > > qman_pfdr: qman-pfdr { > compatible = "shared-dma-pool"; > size = <0 0x2000000>; > alignment = <0 0x2000000>; > no-map; > }; > }; >
2024년 4월 17일 (수) 오후 3:03, Mike Rapoport <rppt@kernel.org>님이 작성: > > On Tue, Apr 16, 2024 at 09:06:35PM +0900, skseofh@gmail.com wrote: > > From: Daero Lee <daero_le.lee@samsung.com> > > > > Like reserved-memory with the 'no-map' property and only 'size' property > > (w/o 'reg' property), there are memory regions need to be allocated in > > memblock.memory marked with the MEMBLOCK_NOMAP flag, but should not be > > allocated in memblock.reserved. > > This still does not explain why you need such regions. > > As Wei Yang explained, memblock does not allocate memory from > memblock.reserved. The memblock.reserved array represents memory that is in > use by firmware or by early kernel allocations and cannot be freed to page > allocator. Thank you for your comments. I used the wrong word. When I use 'allocate', I mean that the region 'adds' to the memblock.reserved. > > If you have a region that's _NOMAP in memblock.memory and is absent in > memblock.reserved it will not be mapped by the kernel page tables, but it > will be considered as free memory by the core mm. > > Is this really what you want? If my understanding is right, before freeing (memory && !reserved) area, we marked the memblock.reserved regions and memblock.memory regions with no-map flag. And when we free (memory && !reserved) area, we skip the memblock.memory regions with no-map(see should_skip_region). So, I think that the memory regions with no-map flag will not be considered as free memory. If there is anything I think is wrong, feel free to correct me. Regards, DaeRo Lee
On Thu, Apr 18, 2024 at 11:54:15PM +0900, DaeRo Lee wrote: > 2024년 4월 17일 (수) 오후 3:03, Mike Rapoport <rppt@kernel.org>님이 작성: > > > > On Tue, Apr 16, 2024 at 09:06:35PM +0900, skseofh@gmail.com wrote: > > > From: Daero Lee <daero_le.lee@samsung.com> > > > > > > Like reserved-memory with the 'no-map' property and only 'size' property > > > (w/o 'reg' property), there are memory regions need to be allocated in > > > memblock.memory marked with the MEMBLOCK_NOMAP flag, but should not be > > > allocated in memblock.reserved. > > > > This still does not explain why you need such regions. > > > > As Wei Yang explained, memblock does not allocate memory from > > memblock.reserved. The memblock.reserved array represents memory that is in > > use by firmware or by early kernel allocations and cannot be freed to page > > allocator. > Thank you for your comments. I used the wrong word. > When I use 'allocate', I mean that the region 'adds' to the memblock.reserved. > > > > > If you have a region that's _NOMAP in memblock.memory and is absent in > > memblock.reserved it will not be mapped by the kernel page tables, but it > > will be considered as free memory by the core mm. > > > > Is this really what you want? > If my understanding is right, before freeing (memory && !reserved) > area, we marked the memblock.reserved regions and memblock.memory > regions with no-map flag. And when we free (memory && !reserved) area, > we skip the memblock.memory regions with no-map(see > should_skip_region). So, I think that the memory regions with no-map > flag will not be considered as free memory. You are right here. But I still don't understand *why* do you want to change the way early_init_dt_alloc_reserved_memory_arch() works. > Regards, > DaeRo Lee
2024년 4월 19일 (금) 오전 3:04, Mike Rapoport <rppt@kernel.org>님이 작성: > > On Thu, Apr 18, 2024 at 11:54:15PM +0900, DaeRo Lee wrote: > > 2024년 4월 17일 (수) 오후 3:03, Mike Rapoport <rppt@kernel.org>님이 작성: > > > > > > On Tue, Apr 16, 2024 at 09:06:35PM +0900, skseofh@gmail.com wrote: > > > > From: Daero Lee <daero_le.lee@samsung.com> > > > > > > > > Like reserved-memory with the 'no-map' property and only 'size' property > > > > (w/o 'reg' property), there are memory regions need to be allocated in > > > > memblock.memory marked with the MEMBLOCK_NOMAP flag, but should not be > > > > allocated in memblock.reserved. > > > > > > This still does not explain why you need such regions. > > > > > > As Wei Yang explained, memblock does not allocate memory from > > > memblock.reserved. The memblock.reserved array represents memory that is in > > > use by firmware or by early kernel allocations and cannot be freed to page > > > allocator. > > Thank you for your comments. I used the wrong word. > > When I use 'allocate', I mean that the region 'adds' to the memblock.reserved. > > > > > > > > If you have a region that's _NOMAP in memblock.memory and is absent in > > > memblock.reserved it will not be mapped by the kernel page tables, but it > > > will be considered as free memory by the core mm. > > > > > > Is this really what you want? > > If my understanding is right, before freeing (memory && !reserved) > > area, we marked the memblock.reserved regions and memblock.memory > > regions with no-map flag. And when we free (memory && !reserved) area, > > we skip the memblock.memory regions with no-map(see > > should_skip_region). So, I think that the memory regions with no-map > > flag will not be considered as free memory. > > You are right here. > > But I still don't understand *why* do you want to change the way > early_init_dt_alloc_reserved_memory_arch() works. In memmap_init_reserved_pages, we mark memblock.reserved as PageReserved first and mark the memblock.reserved with nomap flag also. -> Isn't this duplicated work? (If we add no-map region to memblock.reserved 'and' mark in memblock.memory..) So, I think that for the no-map region, we don't need to add to the memblock.reserved. This is what we do now in early_init_dt_reserve_memory. the nomap region is not added to the memblock.reserved. In early_init_dt_alloc_reserved_memory_arch, if 'nomap' is true, we mark the memblock.memory region as _NOMAP. And if the return value 'err' is not zero(which is '-ENOMEM' from memblock_isolate_range), we free the region. - 'nomap' is true -> memblock_mark_nomap : success -> not free the region : fail -> free the region And it can be said that we add the region to the memblock.reserved using memblock_phys_alloc_range and if the region is nomap, then we can free the region from memblock.reserved. But is it necessary to add it to memblock.reserved? We just need the region in memblock.memory to mark nomap. So, here is what I think: - reserved-memory w/ nomap region -> mark only to memblock.memory - reserved-memory w/o nomap region -> add to the memblock.reserved Regards, DaeRo Lee
2024년 4월 19일 (금) 오전 10:46, DaeRo Lee <skseofh@gmail.com>님이 작성: > > 2024년 4월 19일 (금) 오전 3:04, Mike Rapoport <rppt@kernel.org>님이 작성: > > > > On Thu, Apr 18, 2024 at 11:54:15PM +0900, DaeRo Lee wrote: > > > 2024년 4월 17일 (수) 오후 3:03, Mike Rapoport <rppt@kernel.org>님이 작성: > > > > > > > > On Tue, Apr 16, 2024 at 09:06:35PM +0900, skseofh@gmail.com wrote: > > > > > From: Daero Lee <daero_le.lee@samsung.com> > > > > > > > > > > Like reserved-memory with the 'no-map' property and only 'size' property > > > > > (w/o 'reg' property), there are memory regions need to be allocated in > > > > > memblock.memory marked with the MEMBLOCK_NOMAP flag, but should not be > > > > > allocated in memblock.reserved. > > > > > > > > This still does not explain why you need such regions. > > > > > > > > As Wei Yang explained, memblock does not allocate memory from > > > > memblock.reserved. The memblock.reserved array represents memory that is in > > > > use by firmware or by early kernel allocations and cannot be freed to page > > > > allocator. > > > Thank you for your comments. I used the wrong word. > > > When I use 'allocate', I mean that the region 'adds' to the memblock.reserved. > > > > > > > > > > > If you have a region that's _NOMAP in memblock.memory and is absent in > > > > memblock.reserved it will not be mapped by the kernel page tables, but it > > > > will be considered as free memory by the core mm. > > > > > > > > Is this really what you want? > > > If my understanding is right, before freeing (memory && !reserved) > > > area, we marked the memblock.reserved regions and memblock.memory > > > regions with no-map flag. And when we free (memory && !reserved) area, > > > we skip the memblock.memory regions with no-map(see > > > should_skip_region). So, I think that the memory regions with no-map > > > flag will not be considered as free memory. > > > > You are right here. > > > > But I still don't understand *why* do you want to change the way > > early_init_dt_alloc_reserved_memory_arch() works. > > In memmap_init_reserved_pages, we mark memblock.reserved as > PageReserved first and mark the memblock.reserved with nomap flag > also. Sorry. This is my mistake. 'memblock.memory with nomap flag' is right. > -> Isn't this duplicated work? (If we add no-map region to > memblock.reserved 'and' mark in memblock.memory..) > So, I think that for the no-map region, we don't need to add to the > memblock.reserved. > This is what we do now in early_init_dt_reserve_memory. the nomap > region is not added to the memblock.reserved. > > In early_init_dt_alloc_reserved_memory_arch, if 'nomap' is true, we > mark the memblock.memory region as _NOMAP. And if the return value > 'err' is not zero(which is '-ENOMEM' from memblock_isolate_range), we > free the region. > - 'nomap' is true -> memblock_mark_nomap : success -> not free the region > > : fail -> free the region > And it can be said that we add the region to the memblock.reserved > using memblock_phys_alloc_range and if the region is nomap, then we > can free the region from memblock.reserved. But is it necessary to add > it to memblock.reserved? We just need the region in memblock.memory to > mark nomap. > > So, here is what I think: > - reserved-memory w/ nomap region -> mark only to memblock.memory > - reserved-memory w/o nomap region -> add to the memblock.reserved > > Regards, > DaeRo Lee
On Fri, Apr 19, 2024 at 10:59:52AM +0900, DaeRo Lee wrote: > 2024년 4월 19일 (금) 오전 10:46, DaeRo Lee <skseofh@gmail.com>님이 작성: > > > > In memmap_init_reserved_pages, we mark memblock.reserved as > > PageReserved first and mark the memblock.reserved with nomap flag > > also. > Sorry. This is my mistake. 'memblock.memory with nomap flag' is right. > > > -> Isn't this duplicated work? (If we add no-map region to > > memblock.reserved 'and' mark in memblock.memory..) > > So, I think that for the no-map region, we don't need to add to the > > memblock.reserved. > > This is what we do now in early_init_dt_reserve_memory. the nomap > > region is not added to the memblock.reserved. > > > > In early_init_dt_alloc_reserved_memory_arch, if 'nomap' is true, we > > mark the memblock.memory region as _NOMAP. And if the return value > > 'err' is not zero(which is '-ENOMEM' from memblock_isolate_range), we > > free the region. > > - 'nomap' is true -> memblock_mark_nomap : success -> not free the region > > > > : fail -> free the region > > And it can be said that we add the region to the memblock.reserved > > using memblock_phys_alloc_range and if the region is nomap, then we > > can free the region from memblock.reserved. But is it necessary to add > > it to memblock.reserved? We just need the region in memblock.memory to > > mark nomap. > > > > So, here is what I think: > > - reserved-memory w/ nomap region -> mark only to memblock.memory > > - reserved-memory w/o nomap region -> add to the memblock.reserved NOMAP and memblock.reserved are semantically different, and at makes sense to have a "reserved nomap" node in fdt recorded in both memblock.memory and memblock.reserved. memblock.reserved represents the memory that is used by firmware or early kernel allocation, so reserved memory in fdt should be reserved in memblock as well. I believe it's an oversight that early_init_dt_reserve_memory() does not call memblock_reserve() for nomap memory. NOMAP is a property of a memory region that says that that region should not be mapped in the linear map, it's not necessarily in use. > > Regards, > > DaeRo Lee
2024년 4월 27일 (토) 오후 5:50, Mike Rapoport <rppt@kernel.org>님이 작성: > > On Fri, Apr 19, 2024 at 10:59:52AM +0900, DaeRo Lee wrote: > > 2024년 4월 19일 (금) 오전 10:46, DaeRo Lee <skseofh@gmail.com>님이 작성: > > > > > > In memmap_init_reserved_pages, we mark memblock.reserved as > > > PageReserved first and mark the memblock.reserved with nomap flag > > > also. > > Sorry. This is my mistake. 'memblock.memory with nomap flag' is right. > > > > > -> Isn't this duplicated work? (If we add no-map region to > > > memblock.reserved 'and' mark in memblock.memory..) > > > So, I think that for the no-map region, we don't need to add to the > > > memblock.reserved. > > > This is what we do now in early_init_dt_reserve_memory. the nomap > > > region is not added to the memblock.reserved. > > > > > > In early_init_dt_alloc_reserved_memory_arch, if 'nomap' is true, we > > > mark the memblock.memory region as _NOMAP. And if the return value > > > 'err' is not zero(which is '-ENOMEM' from memblock_isolate_range), we > > > free the region. > > > - 'nomap' is true -> memblock_mark_nomap : success -> not free the region > > > > > > : fail -> free the region > > > And it can be said that we add the region to the memblock.reserved > > > using memblock_phys_alloc_range and if the region is nomap, then we > > > can free the region from memblock.reserved. But is it necessary to add > > > it to memblock.reserved? We just need the region in memblock.memory to > > > mark nomap. > > > > > > So, here is what I think: > > > - reserved-memory w/ nomap region -> mark only to memblock.memory > > > - reserved-memory w/o nomap region -> add to the memblock.reserved > > NOMAP and memblock.reserved are semantically different, and at makes sense > to have a "reserved nomap" node in fdt recorded in both memblock.memory and > memblock.reserved. > > memblock.reserved represents the memory that is used by firmware or early > kernel allocation, so reserved memory in fdt should be reserved in memblock > as well. I believe it's an oversight that early_init_dt_reserve_memory() > does not call memblock_reserve() for nomap memory. > > NOMAP is a property of a memory region that says that that region should > not be mapped in the linear map, it's not necessarily in use. I agree that the NOMAP region should be added to memblock.reserved. So, I think we need to clean-up memmap_init_reserved_pages, because in this function we call reserve_bootmem_region for memblock.reserved and memblock.memory with nomap. We don't need to call reserve_bootmem_region for nomap. Regards,. DaeRo Lee
On Sat, Apr 27, 2024 at 07:24:23PM +0900, DaeRo Lee wrote: > 2024년 4월 27일 (토) 오후 5:50, Mike Rapoport <rppt@kernel.org>님이 작성: > > > > On Fri, Apr 19, 2024 at 10:59:52AM +0900, DaeRo Lee wrote: > > > 2024년 4월 19일 (금) 오전 10:46, DaeRo Lee <skseofh@gmail.com>님이 작성: > > > > > > > > In memmap_init_reserved_pages, we mark memblock.reserved as > > > > PageReserved first and mark the memblock.reserved with nomap flag > > > > also. > > > Sorry. This is my mistake. 'memblock.memory with nomap flag' is right. > > > > > > > -> Isn't this duplicated work? (If we add no-map region to > > > > memblock.reserved 'and' mark in memblock.memory..) > > > > So, I think that for the no-map region, we don't need to add to the > > > > memblock.reserved. > > > > This is what we do now in early_init_dt_reserve_memory. the nomap > > > > region is not added to the memblock.reserved. > > > > > > > > In early_init_dt_alloc_reserved_memory_arch, if 'nomap' is true, we > > > > mark the memblock.memory region as _NOMAP. And if the return value > > > > 'err' is not zero(which is '-ENOMEM' from memblock_isolate_range), we > > > > free the region. > > > > - 'nomap' is true -> memblock_mark_nomap : success -> not free the region > > > > > > > > : fail -> free the region > > > > And it can be said that we add the region to the memblock.reserved > > > > using memblock_phys_alloc_range and if the region is nomap, then we > > > > can free the region from memblock.reserved. But is it necessary to add > > > > it to memblock.reserved? We just need the region in memblock.memory to > > > > mark nomap. > > > > > > > > So, here is what I think: > > > > - reserved-memory w/ nomap region -> mark only to memblock.memory > > > > - reserved-memory w/o nomap region -> add to the memblock.reserved > > > > NOMAP and memblock.reserved are semantically different, and at makes sense > > to have a "reserved nomap" node in fdt recorded in both memblock.memory and > > memblock.reserved. > > > > memblock.reserved represents the memory that is used by firmware or early > > kernel allocation, so reserved memory in fdt should be reserved in memblock > > as well. I believe it's an oversight that early_init_dt_reserve_memory() > > does not call memblock_reserve() for nomap memory. > > > > NOMAP is a property of a memory region that says that that region should > > not be mapped in the linear map, it's not necessarily in use. > > I agree that the NOMAP region should be added to memblock.reserved. > > So, I think we need to clean-up memmap_init_reserved_pages, because in > this function we call reserve_bootmem_region for memblock.reserved and > memblock.memory with nomap. We don't need to call > reserve_bootmem_region for nomap. Read the comment about memblock_mark_nomap() > Regards,. > DaeRo Lee
2024년 4월 28일 (일) 오후 3:35, Mike Rapoport <rppt@kernel.org>님이 작성: > > On Sat, Apr 27, 2024 at 07:24:23PM +0900, DaeRo Lee wrote: > > 2024년 4월 27일 (토) 오후 5:50, Mike Rapoport <rppt@kernel.org>님이 작성: > > > > > > On Fri, Apr 19, 2024 at 10:59:52AM +0900, DaeRo Lee wrote: > > > > 2024년 4월 19일 (금) 오전 10:46, DaeRo Lee <skseofh@gmail.com>님이 작성: > > > > > > > > > > In memmap_init_reserved_pages, we mark memblock.reserved as > > > > > PageReserved first and mark the memblock.reserved with nomap flag > > > > > also. > > > > Sorry. This is my mistake. 'memblock.memory with nomap flag' is right. > > > > > > > > > -> Isn't this duplicated work? (If we add no-map region to > > > > > memblock.reserved 'and' mark in memblock.memory..) > > > > > So, I think that for the no-map region, we don't need to add to the > > > > > memblock.reserved. > > > > > This is what we do now in early_init_dt_reserve_memory. the nomap > > > > > region is not added to the memblock.reserved. > > > > > > > > > > In early_init_dt_alloc_reserved_memory_arch, if 'nomap' is true, we > > > > > mark the memblock.memory region as _NOMAP. And if the return value > > > > > 'err' is not zero(which is '-ENOMEM' from memblock_isolate_range), we > > > > > free the region. > > > > > - 'nomap' is true -> memblock_mark_nomap : success -> not free the region > > > > > > > > > > : fail -> free the region > > > > > And it can be said that we add the region to the memblock.reserved > > > > > using memblock_phys_alloc_range and if the region is nomap, then we > > > > > can free the region from memblock.reserved. But is it necessary to add > > > > > it to memblock.reserved? We just need the region in memblock.memory to > > > > > mark nomap. > > > > > > > > > > So, here is what I think: > > > > > - reserved-memory w/ nomap region -> mark only to memblock.memory > > > > > - reserved-memory w/o nomap region -> add to the memblock.reserved > > > > > > NOMAP and memblock.reserved are semantically different, and at makes sense > > > to have a "reserved nomap" node in fdt recorded in both memblock.memory and > > > memblock.reserved. > > > > > > memblock.reserved represents the memory that is used by firmware or early > > > kernel allocation, so reserved memory in fdt should be reserved in memblock > > > as well. I believe it's an oversight that early_init_dt_reserve_memory() > > > does not call memblock_reserve() for nomap memory. > > > > > > NOMAP is a property of a memory region that says that that region should > > > not be mapped in the linear map, it's not necessarily in use. > > > > I agree that the NOMAP region should be added to memblock.reserved. > > > > So, I think we need to clean-up memmap_init_reserved_pages, because in > > this function we call reserve_bootmem_region for memblock.reserved and > > memblock.memory with nomap. We don't need to call > > reserve_bootmem_region for nomap. > > Read the comment about memblock_mark_nomap() I read the comment about memblock_mark_nomap() and understood that regions with nomap flags should be treated as PageReserved. But, if we add this nomap region to memblock.reserved, the region with nomap flag will be processed in the first for-loop in memmap_init_reserved_pages. Am I thinking wrong? Regards, DaeRo Lee
On Sun, Apr 28, 2024 at 07:36:40PM +0900, DaeRo Lee wrote: > 2024년 4월 28일 (일) 오후 3:35, Mike Rapoport <rppt@kernel.org>님이 작성: > > > > On Sat, Apr 27, 2024 at 07:24:23PM +0900, DaeRo Lee wrote: > > > 2024년 4월 27일 (토) 오후 5:50, Mike Rapoport <rppt@kernel.org>님이 작성: > > > > > > > > > > > > So, here is what I think: > > > > > > - reserved-memory w/ nomap region -> mark only to memblock.memory > > > > > > - reserved-memory w/o nomap region -> add to the memblock.reserved > > > > > > > > NOMAP and memblock.reserved are semantically different, and at makes sense > > > > to have a "reserved nomap" node in fdt recorded in both memblock.memory and > > > > memblock.reserved. > > > > > > > > memblock.reserved represents the memory that is used by firmware or early > > > > kernel allocation, so reserved memory in fdt should be reserved in memblock > > > > as well. I believe it's an oversight that early_init_dt_reserve_memory() > > > > does not call memblock_reserve() for nomap memory. > > > > > > > > NOMAP is a property of a memory region that says that that region should > > > > not be mapped in the linear map, it's not necessarily in use. > > > > > > I agree that the NOMAP region should be added to memblock.reserved. > > > > > > So, I think we need to clean-up memmap_init_reserved_pages, because in > > > this function we call reserve_bootmem_region for memblock.reserved and > > > memblock.memory with nomap. We don't need to call > > > reserve_bootmem_region for nomap. > > > > Read the comment about memblock_mark_nomap() > I read the comment about memblock_mark_nomap() and understood that > regions with nomap flags should be treated as PageReserved. > But, if we add this nomap region to memblock.reserved, the region with > nomap flag will be processed in the first for-loop in > memmap_init_reserved_pages. memblock still must make sure that pages in nomap regions get PG_Reserved to be robust against potential errors and bugs in firmware parsing. > Am I thinking wrong? > > Regards, > DaeRo Lee
2024년 4월 28일 (일) 오후 9:01, Mike Rapoport <rppt@kernel.org>님이 작성: > > On Sun, Apr 28, 2024 at 07:36:40PM +0900, DaeRo Lee wrote: > > 2024년 4월 28일 (일) 오후 3:35, Mike Rapoport <rppt@kernel.org>님이 작성: > > > > > > On Sat, Apr 27, 2024 at 07:24:23PM +0900, DaeRo Lee wrote: > > > > 2024년 4월 27일 (토) 오후 5:50, Mike Rapoport <rppt@kernel.org>님이 작성: > > > > > > > > > > > > > > So, here is what I think: > > > > > > > - reserved-memory w/ nomap region -> mark only to memblock.memory > > > > > > > - reserved-memory w/o nomap region -> add to the memblock.reserved > > > > > > > > > > NOMAP and memblock.reserved are semantically different, and at makes sense > > > > > to have a "reserved nomap" node in fdt recorded in both memblock.memory and > > > > > memblock.reserved. > > > > > > > > > > memblock.reserved represents the memory that is used by firmware or early > > > > > kernel allocation, so reserved memory in fdt should be reserved in memblock > > > > > as well. I believe it's an oversight that early_init_dt_reserve_memory() > > > > > does not call memblock_reserve() for nomap memory. > > > > > > > > > > NOMAP is a property of a memory region that says that that region should > > > > > not be mapped in the linear map, it's not necessarily in use. > > > > > > > > I agree that the NOMAP region should be added to memblock.reserved. > > > > > > > > So, I think we need to clean-up memmap_init_reserved_pages, because in > > > > this function we call reserve_bootmem_region for memblock.reserved and > > > > memblock.memory with nomap. We don't need to call > > > > reserve_bootmem_region for nomap. > > > > > > Read the comment about memblock_mark_nomap() > > I read the comment about memblock_mark_nomap() and understood that > > regions with nomap flags should be treated as PageReserved. > > But, if we add this nomap region to memblock.reserved, the region with > > nomap flag will be processed in the first for-loop in > > memmap_init_reserved_pages. > > memblock still must make sure that pages in nomap regions get PG_Reserved > to be robust against potential errors and bugs in firmware parsing. Got it. Thank you for your comments. I'll make another patch for just clean-up early_init_dt_reserve_memory() Thank you. Regards, DaeRo Lee
diff --git a/drivers/of/of_reserved_mem.c b/drivers/of/of_reserved_mem.c index 8236ecae2953..504f2f60689c 100644 --- a/drivers/of/of_reserved_mem.c +++ b/drivers/of/of_reserved_mem.c @@ -40,15 +40,18 @@ static int __init early_init_dt_alloc_reserved_memory_arch(phys_addr_t size, end = !end ? MEMBLOCK_ALLOC_ANYWHERE : end; align = !align ? SMP_CACHE_BYTES : align; - base = memblock_phys_alloc_range(size, align, start, end); + if (nomap) { + base = memblock_phys_alloc_range_nomap(size, align, start, end); + } else { + base = memblock_phys_alloc_range(size, align, start, end); + } + if (!base) return -ENOMEM; *res_base = base; if (nomap) { err = memblock_mark_nomap(base, size); - if (err) - memblock_phys_free(base, size); } kmemleak_ignore_phys(base); diff --git a/mm/memblock.c b/mm/memblock.c index d09136e040d3..f103f1ecbfad 100644 --- a/mm/memblock.c +++ b/mm/memblock.c @@ -1506,6 +1506,72 @@ phys_addr_t __init memblock_alloc_range_nid(phys_addr_t size, return found; } +phys_addr_t __init memblock_alloc_range_nid_nomap(phys_addr_t size, + phys_addr_t align, phys_addr_t start, + phys_addr_t end, int nid, + bool exact_nid) +{ + enum memblock_flags flags = choose_memblock_flags(); + phys_addr_t found; + + if (WARN_ONCE(nid == MAX_NUMNODES, "Usage of MAX_NUMNODES is deprecated. Use NUMA_NO_NODE instead\n")) + nid = NUMA_NO_NODE; + + if (!align) { + /* Can't use WARNs this early in boot on powerpc */ + dump_stack(); + align = SMP_CACHE_BYTES; + } + +again: + found = memblock_find_in_range_node(size, align, start, end, nid, + flags); + if (found) + goto done; + + if (nid != NUMA_NO_NODE && !exact_nid) { + found = memblock_find_in_range_node(size, align, start, + end, NUMA_NO_NODE, + flags); + if (found) + goto done; + } + + if (flags & MEMBLOCK_MIRROR) { + flags &= ~MEMBLOCK_MIRROR; + pr_warn_ratelimited("Could not allocate %pap bytes of mirrored memory\n", + &size); + goto again; + } + + return 0; + +done: + /* + * Skip kmemleak for those places like kasan_init() and + * early_pgtable_alloc() due to high volume. + */ + if (end != MEMBLOCK_ALLOC_NOLEAKTRACE) + /* + * Memblock allocated blocks are never reported as + * leaks. This is because many of these blocks are + * only referred via the physical address which is + * not looked up by kmemleak. + */ + kmemleak_alloc_phys(found, size, 0); + + /* + * Some Virtual Machine platforms, such as Intel TDX or AMD SEV-SNP, + * require memory to be accepted before it can be used by the + * guest. + * + * Accept the memory of the allocated buffer. + */ + accept_memory(found, found + size); + + return found; +} + /** * memblock_phys_alloc_range - allocate a memory block inside specified range * @size: size of memory block to be allocated in bytes @@ -1530,6 +1596,18 @@ phys_addr_t __init memblock_phys_alloc_range(phys_addr_t size, false); } +phys_addr_t __init memblock_phys_alloc_range_nomap(phys_addr_t size, + phys_addr_t align, + phys_addr_t start, + phys_addr_t end) +{ + memblock_dbg("%s: %llu bytes align=0x%llx from=%pa max_addr=%pa %pS\n", + __func__, (u64)size, (u64)align, &start, &end, + (void *)_RET_IP_); + return memblock_alloc_range_nid_nomap(size, align, start, end, + NUMA_NO_NODE, false); +} + /** * memblock_phys_alloc_try_nid - allocate a memory block from specified NUMA node * @size: size of memory block to be allocated in bytes