
[1/2] mm/cma: provide option to opt out from exposing pages on activation failure

Message ID 20211220193419.104242-2-hbathini@linux.ibm.com (mailing list archive)
State Superseded
Series powerpc/fadump: handle CMA activation failure appropriately

Commit Message

Hari Bathini Dec. 20, 2021, 7:34 p.m. UTC
Commit 072355c1cf2d ("mm/cma: expose all pages to the buddy if
activation of an area fails") started exposing all pages to the buddy
allocator on CMA activation failure. But there can be CMA users that
want to handle the reserved memory differently on CMA activation
failure. Provide an option to opt out from exposing pages to the buddy
allocator for such cases.

Signed-off-by: Hari Bathini <hbathini@linux.ibm.com>
---
 include/linux/cma.h |  2 ++
 mm/cma.c            | 15 +++++++++++++--
 mm/cma.h            |  1 +
 3 files changed, 16 insertions(+), 2 deletions(-)

Comments

David Hildenbrand Dec. 21, 2021, 6:48 p.m. UTC | #1
On 20.12.21 20:34, Hari Bathini wrote:
> Commit 072355c1cf2d ("mm/cma: expose all pages to the buddy if
> activation of an area fails") started exposing all pages to buddy
> allocator on CMA activation failure. But there can be CMA users that
> want to handle the reserved memory differently on CMA allocation
> failure. Provide an option to opt out from exposing pages to buddy
> for such cases.

Can you elaborate why that is important and what the target user can
actually do with it?

It certainly cannot do CMA allocations :)
Hari Bathini Jan. 6, 2022, 12:01 p.m. UTC | #2
On 22/12/21 12:18 am, David Hildenbrand wrote:
> On 20.12.21 20:34, Hari Bathini wrote:
>> Commit 072355c1cf2d ("mm/cma: expose all pages to the buddy if
>> activation of an area fails") started exposing all pages to buddy
>> allocator on CMA activation failure. But there can be CMA users that
>> want to handle the reserved memory differently on CMA allocation
>> failure. Provide an option to opt out from exposing pages to buddy
>> for such cases.

Hi David,

Sorry, I could not get back on this sooner. I went out on vacation
and missed this.

> 
> Can you elaborate why that is important and what the target user can
> actually do with it?
Previously, firmware-assisted dump [1] used to reserve memory that it
needs for booting a capture kernel and offloading /proc/vmcore.
This memory is reserved, basically blocked from being used by the
production kernel, to ensure the kernel crash context is not lost when
booting into a capture kernel from this memory chunk.

But [2] started using CMA instead, to let the memory be used at least
in some cases, as long as it is not going to hold kernel pages.
So, the intention in using CMA was to keep the memory unused if CMA
activation fails and only let it be used, if at all, when CMA
activation succeeds. But [3] breaks that assumption, leading to weird
errors in the vmcore captured with fadump when CMA activation fails.

To answer the question: fadump does not want the memory to be used for
kernel pages if CMA activation fails...


[1] https://github.com/torvalds/linux/blob/master/Documentation/powerpc/firmware-assisted-dump.rst
[2] https://github.com/torvalds/linux/commit/a4e92ce8e4c8
[3] https://github.com/torvalds/linux/commit/072355c1cf2d

Thanks
Hari
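For illustration, a minimal caller-side sketch of the opt-out this series is after. The fadump_cma handle and the fadump_setup_cma() function below are hypothetical stand-ins for the fadump code; only cma_init_reserved_mem() and the cma_dont_free_pages_on_error() helper added by the patch below come from the actual API.

#include <linux/cma.h>
#include <linux/printk.h>

/* Hypothetical handle for the fadump reservation (sketch only). */
static struct cma *fadump_cma;

static int __init fadump_setup_cma(phys_addr_t base, phys_addr_t size)
{
	int rc;

	/* Register the already-reserved fadump range as a CMA area. */
	rc = cma_init_reserved_mem(base, size, 0, "fadump_cma", &fadump_cma);
	if (rc) {
		pr_err("fadump: CMA init failed (%d)\n", rc);
		return rc;
	}

	/*
	 * Helper added by this patch: if activation fails later, leave the
	 * range reserved instead of exposing its pages to the buddy allocator.
	 */
	cma_dont_free_pages_on_error(fadump_cma);
	return 0;
}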
David Hildenbrand Jan. 11, 2022, 2:36 p.m. UTC | #3
On 06.01.22 13:01, Hari Bathini wrote:
> 
> 
> On 22/12/21 12:18 am, David Hildenbrand wrote:
>> On 20.12.21 20:34, Hari Bathini wrote:
>>> Commit 072355c1cf2d ("mm/cma: expose all pages to the buddy if
>>> activation of an area fails") started exposing all pages to buddy
>>> allocator on CMA activation failure. But there can be CMA users that
>>> want to handle the reserved memory differently on CMA allocation
>>> failure. Provide an option to opt out from exposing pages to buddy
>>> for such cases.
> 
> Hi David,
> 
> Sorry, I could not get back on this sooner. I went out on vacation
> and missed this.
> .
> 
>>
>> Can you elaborate why that is important and what the target user can
>> actually do with it?
> Previously, firmware-assisted dump [1] used to reserve memory that it 
> needs for booting a capture kernel & offloading /proc/vmcore.
> This memory is reserved, basically blocked from being used by
> production kernel, to ensure kernel crash context is not lost on
> booting into a capture kernel from this memory chunk.
> 
> But [2] started using CMA instead to let the memory be used at least
> in some cases as long as this memory is not going to have kernel pages. 
> So, the intention in using CMA was to keep the memory unused if CMA
> activation fails and only let it be used for some purpose, if at all,
> if CMA activation succeeds. But [3] breaks that assumption reporting
> weird errors on vmcore captured with fadump, when CMA activation fails.
> 
> To answer the question, fadump does not want the memory to be used for
> kernel pages, if CMA activation fails...

Okay, so what you want is a reserved region, and if possible, let CMA
use that memory for other (movable allocation) purposes until you
actually need that area and free it up by using CMA. If CMA cannot use
the region because of zone issues, you just want that region to stay
reserved.

I guess the biggest difference from other CMA users is that it can make use
of the memory even if not allocated via CMA -- because it's going to
make use of the physical memory range indirectly via a HW facility,
not via any "struct page" access.


I wonder if we can make the terminology a bit clearer. The freeing part
is a bit confusing, because init_cma_reserved_pageblock() essentially
also frees pages, just to the MIGRATE_CMA lists ... what you want is to
treat it like a simple memblock allocation/reservation on error.

What about:
* cma->reserve_pages_on_error that defaults to false
* void __init cma_reserve_pages_on_error(struct cma *cma)
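For reference, a rough sketch of what that naming could look like; cma_activate_area_error() is a hypothetical wrapper used only to show the error-path handling in one place, and the reserve_pages_on_error field is assumed to be added to struct cma in mm/cma.h, defaulting to false.

/* Opt-in setter, called by the CMA owner (e.g. fadump) before activation. */
void __init cma_reserve_pages_on_error(struct cma *cma)
{
	cma->reserve_pages_on_error = true;
}

/*
 * Hypothetical stand-in for the error path at the end of
 * cma_activate_area(): expose the pages to the buddy allocator as before,
 * unless the owner asked for the range to stay reserved, in which case it
 * is left untouched, like a plain memblock reservation.
 */
static void __init cma_activate_area_error(struct cma *cma, unsigned long base_pfn)
{
	unsigned long pfn;

	if (!cma->reserve_pages_on_error) {
		for (pfn = base_pfn; pfn < base_pfn + cma->count; pfn++)
			free_reserved_page(pfn_to_page(pfn));
	}
	totalcma_pages -= cma->count;
	cma->count = 0;
	pr_err("CMA area %s could not be activated\n", cma->name);
}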
Hari Bathini Jan. 12, 2022, 9:50 a.m. UTC | #4
On 11/01/22 8:06 pm, David Hildenbrand wrote:
> On 06.01.22 13:01, Hari Bathini wrote:
>>
>> To answer the question, fadump does not want the memory to be used for
>> kernel pages, if CMA activation fails...
> 
> Okay, so what you want is a reserved region, and if possible, let CMA
> use that memory for other (movable allocation) purposes until you
> actually need that area and free it up by using CMA. If CMA cannot use
> the region because of zone issues, you just want that region to stay
> reserved.
> 

Right.

> I guess the biggest different to other CMA users is that it can make use
> of the memory even if not allocated via CMA -- because it's going to
> make use of the the physical memory range indirectly via a HW facility,
> not via any "struct page" access.
> 
> 
> I wonder if we can make the terminology a bit clearer, the freeing part
> is a bit confusing, because init_cma_reserved_pageblock() essentially
> also frees pages, just to the MIGRATE_CMA lists ... what you want is to
> treat it like a simple memblock allocation/reservation on error.

> What about:
> * cma->reserve_pages_on_error that defaults to false
> * void __init cma_reserve_pages_on_error(struct cma *cma)

Yeah, this change does make things a bit clearer.
Will send out a v2 with the change.

Patch

diff --git a/include/linux/cma.h b/include/linux/cma.h
index bd801023504b..8c9e229e7080 100644
--- a/include/linux/cma.h
+++ b/include/linux/cma.h
@@ -50,4 +50,6 @@  extern bool cma_pages_valid(struct cma *cma, const struct page *pages, unsigned
 extern bool cma_release(struct cma *cma, const struct page *pages, unsigned long count);
 
 extern int cma_for_each_area(int (*it)(struct cma *cma, void *data), void *data);
+
+extern void cma_dont_free_pages_on_error(struct cma *cma);
 #endif
diff --git a/mm/cma.c b/mm/cma.c
index bc9ca8f3c487..6dffc9b2dafe 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -131,8 +131,10 @@  static void __init cma_activate_area(struct cma *cma)
 	bitmap_free(cma->bitmap);
 out_error:
 	/* Expose all pages to the buddy, they are useless for CMA. */
-	for (pfn = base_pfn; pfn < base_pfn + cma->count; pfn++)
-		free_reserved_page(pfn_to_page(pfn));
+	if (cma->free_pages_on_error) {
+		for (pfn = base_pfn; pfn < base_pfn + cma->count; pfn++)
+			free_reserved_page(pfn_to_page(pfn));
+	}
 	totalcma_pages -= cma->count;
 	cma->count = 0;
 	pr_err("CMA area %s could not be activated\n", cma->name);
@@ -150,6 +152,14 @@  static int __init cma_init_reserved_areas(void)
 }
 core_initcall(cma_init_reserved_areas);
 
+void __init cma_dont_free_pages_on_error(struct cma *cma)
+{
+	if (!cma)
+		return;
+
+	cma->free_pages_on_error = false;
+}
+
 /**
  * cma_init_reserved_mem() - create custom contiguous area from reserved memory
  * @base: Base address of the reserved area
@@ -204,6 +214,7 @@  int __init cma_init_reserved_mem(phys_addr_t base, phys_addr_t size,
 	cma->base_pfn = PFN_DOWN(base);
 	cma->count = size >> PAGE_SHIFT;
 	cma->order_per_bit = order_per_bit;
+	cma->free_pages_on_error = true;
 	*res_cma = cma;
 	cma_area_count++;
 	totalcma_pages += (size / PAGE_SIZE);
diff --git a/mm/cma.h b/mm/cma.h
index 2c775877eae2..9e2438f9233d 100644
--- a/mm/cma.h
+++ b/mm/cma.h
@@ -30,6 +30,7 @@  struct cma {
 	/* kobject requires dynamic object */
 	struct cma_kobject *cma_kobj;
 #endif
+	bool free_pages_on_error;
 };
 
 extern struct cma cma_areas[MAX_CMA_AREAS];