diff mbox

[v2,3/3] mm: enable CONFIG_MOVABLE_NODE on powerpc

Message ID 1473883618-14998-4-git-send-email-arbab@linux.vnet.ibm.com (mailing list archive)
State Superseded, archived
Headers show

Commit Message

Reza Arbab Sept. 14, 2016, 8:06 p.m. UTC
Onlining memory into ZONE_MOVABLE requires CONFIG_MOVABLE_NODE. Enable
the use of this config option on PPC64 platforms.

Signed-off-by: Reza Arbab <arbab@linux.vnet.ibm.com>
---
 Documentation/kernel-parameters.txt | 2 +-
 mm/Kconfig                          | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

Comments

Aneesh Kumar K.V Sept. 19, 2016, 6:29 a.m. UTC | #1
Reza Arbab <arbab@linux.vnet.ibm.com> writes:

> Onlining memory into ZONE_MOVABLE requires CONFIG_MOVABLE_NODE. Enable
> the use of this config option on PPC64 platforms.
>
> Signed-off-by: Reza Arbab <arbab@linux.vnet.ibm.com>
> ---
>  Documentation/kernel-parameters.txt | 2 +-
>  mm/Kconfig                          | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
> index a4f4d69..3d8460d 100644
> --- a/Documentation/kernel-parameters.txt
> +++ b/Documentation/kernel-parameters.txt
> @@ -2344,7 +2344,7 @@ bytes respectively. Such letter suffixes can also be entirely omitted.
>  			that the amount of memory usable for all allocations
>  			is not too small.
>  
> -	movable_node	[KNL,X86] Boot-time switch to enable the effects
> +	movable_node	[KNL,X86,PPC] Boot-time switch to enable the effects
>  			of CONFIG_MOVABLE_NODE=y. See mm/Kconfig for details.
>  

Movable node also does.
	memblock_set_bottom_up(true);
What is the impact of that. Do we need changes equivalent to that ? Also
where are we marking the nodes which can be hotplugged, ie where do we
do memblock_mark_hotplug() ?
       
>  	MTD_Partition=	[MTD]
> diff --git a/mm/Kconfig b/mm/Kconfig
> index be0ee11..4b19cd3 100644
> --- a/mm/Kconfig
> +++ b/mm/Kconfig
> @@ -153,7 +153,7 @@ config MOVABLE_NODE
>  	bool "Enable to assign a node which has only movable memory"
>  	depends on HAVE_MEMBLOCK
>  	depends on NO_BOOTMEM
> -	depends on X86_64
> +	depends on X86_64 || PPC64
>  	depends on NUMA
>  	default n
>  	help

-aneesh
Reza Arbab Sept. 21, 2016, 5:45 a.m. UTC | #2
On Mon, Sep 19, 2016 at 11:59:35AM +0530, Aneesh Kumar K.V wrote:
>Movable node also does.
>	memblock_set_bottom_up(true);
>What is the impact of that. Do we need changes equivalent to that ? Also
>where are we marking the nodes which can be hotplugged, ie where do we
>do memblock_mark_hotplug() ?

These are related to the mechanism x86 uses to create movable nodes at 
boot. The SRAT is parsed to mark any hotplug memory. That marking is 
used later when initializing ZONE_MOVABLE for each node. [1]

The bottom-up allocation is due to a timing issue [2]. There is a window 
where kernel memory may be allocated before the SRAT is parsed. Any 
bottom-up allocations done during that time will likely be in the same 
(nonmovable) node as the kernel image.

On power, I don't think we have a heuristic equivalent to that SRAT 
memory hotplug info. So, we'll be limited to dynamically adding movable 
nodes after boot.

1. http://events.linuxfoundation.org/sites/events/files/lcjp13_chen.pdf
2. commit 79442ed189ac ("mm/memblock.c: introduce bottom-up allocation 
mode")
Aneesh Kumar K.V Sept. 21, 2016, 7:09 a.m. UTC | #3
Reza Arbab <arbab@linux.vnet.ibm.com> writes:

> On Mon, Sep 19, 2016 at 11:59:35AM +0530, Aneesh Kumar K.V wrote:
>>Movable node also does.
>>	memblock_set_bottom_up(true);
>>What is the impact of that. Do we need changes equivalent to that ? Also
>>where are we marking the nodes which can be hotplugged, ie where do we
>>do memblock_mark_hotplug() ?
>
> These are related to the mechanism x86 uses to create movable nodes at 
> boot. The SRAT is parsed to mark any hotplug memory. That marking is 
> used later when initializing ZONE_MOVABLE for each node. [1]
>
> The bottom-up allocation is due to a timing issue [2]. There is a window 
> where kernel memory may be allocated before the SRAT is parsed. Any 
> bottom-up allocations done during that time will likely be in the same 
> (nonmovable) node as the kernel image.
>
> On power, I don't think we have a heuristic equivalent to that SRAT 
> memory hotplug info. So, we'll be limited to dynamically adding movable 
> nodes after boot.
>
> 1. http://events.linuxfoundation.org/sites/events/files/lcjp13_chen.pdf
> 2. commit 79442ed189ac ("mm/memblock.c: introduce bottom-up allocation 
> mode")
>

What I was checking was how will one mark a node movable in ppc64 ? I
don't see ppc64 code doing the equivalent of memblock_mark_hotplug().

So when you say "Onlining memory into ZONE_MOVABLE requires
CONFIG_MOVABLE_NODE" where is that restriction ?. IIUC,
should_add_memory_movable() will only return ZONE_MOVABLE only if it is
non empty and MOVABLE_NODE will create a ZONE_MOVABLE zone by default
only if it finds a memblock marked hotpluggable. So wondering if we
are not calling memblock_mark_hotplug() how is it working. Or am I
missing something ?

-aneesh
Reza Arbab Sept. 21, 2016, 2:08 p.m. UTC | #4
On Wed, Sep 21, 2016 at 12:39:51PM +0530, Aneesh Kumar K.V wrote:
>What I was checking was how will one mark a node movable in ppc64 ? I
>don't see ppc64 code doing the equivalent of memblock_mark_hotplug().

Post boot, the marking mechanism is not necessary. You can create a 
movable node by putting all of the node's memory into ZONE_MOVABLE 
during the hotplug.

>So when you say "Onlining memory into ZONE_MOVABLE requires
>CONFIG_MOVABLE_NODE" where is that restriction ?. IIUC,
>should_add_memory_movable() will only return ZONE_MOVABLE only if it is
>non empty and MOVABLE_NODE will create a ZONE_MOVABLE zone by default
>only if it finds a memblock marked hotpluggable. So wondering if we
>are not calling memblock_mark_hotplug() how is it working. Or am I
>missing something ?

You are looking at the addition step of hotplug. You're correct there, 
the memory is added to the default zone, not ZONE_MOVABLE. The 
transition to ZONE_MOVABLE takes place during the onlining step. In 
online_pages():

	zone = move_pfn_range(zone_shift, pfn, pfn + nr_pages);

The reason we need CONFIG_MOVABLE_NODE is right before that:

	if ((zone_idx(zone) > ZONE_NORMAL ||
	    online_type == MMOP_ONLINE_MOVABLE) &&
	    !can_online_high_movable(zone))
		return -EINVAL;

where can_online_high_movable() is defined like this:

	#ifdef CONFIG_MOVABLE_NODE
	/*
	 * When CONFIG_MOVABLE_NODE, we permit onlining of a node which doesn't have
	 * normal memory.
	 */
	static bool can_online_high_movable(struct zone *zone)
	{
		return true;
	}
	#else /* CONFIG_MOVABLE_NODE */
	/* ensure every online node has NORMAL memory */
	static bool can_online_high_movable(struct zone *zone)
	{
		return node_state(zone_to_nid(zone), N_NORMAL_MEMORY);
	}
	#endif /* CONFIG_MOVABLE_NODE */

To be more clear, I can change the commit log to say "Onlining all of a 
node's memory into ZONE_MOVABLE requires CONFIG_MOVABLE_NODE".
Aneesh Kumar K.V Sept. 21, 2016, 2:43 p.m. UTC | #5
Reza Arbab <arbab@linux.vnet.ibm.com> writes:

> On Wed, Sep 21, 2016 at 12:39:51PM +0530, Aneesh Kumar K.V wrote:
>>What I was checking was how will one mark a node movable in ppc64 ? I
>>don't see ppc64 code doing the equivalent of memblock_mark_hotplug().
>
> Post boot, the marking mechanism is not necessary. You can create a 
> movable node by putting all of the node's memory into ZONE_MOVABLE 
> during the hotplug.
>
>>So when you say "Onlining memory into ZONE_MOVABLE requires
>>CONFIG_MOVABLE_NODE" where is that restriction ?. IIUC,
>>should_add_memory_movable() will only return ZONE_MOVABLE only if it is
>>non empty and MOVABLE_NODE will create a ZONE_MOVABLE zone by default
>>only if it finds a memblock marked hotpluggable. So wondering if we
>>are not calling memblock_mark_hotplug() how is it working. Or am I
>>missing something ?
>
> You are looking at the addition step of hotplug. You're correct there, 
> the memory is added to the default zone, not ZONE_MOVABLE. The 
> transition to ZONE_MOVABLE takes place during the onlining step. In 
> online_pages():
>
> 	zone = move_pfn_range(zone_shift, pfn, pfn + nr_pages);
>
> The reason we need CONFIG_MOVABLE_NODE is right before that:
>
> 	if ((zone_idx(zone) > ZONE_NORMAL ||
> 	    online_type == MMOP_ONLINE_MOVABLE) &&
> 	    !can_online_high_movable(zone))
> 		return -EINVAL;
>

So we are looking at two step online process here. The above explained
the details nicely. Can you capture these details in the commit message. ie,
to say that when using 'echo online-movable > state' we allow the move from
normal to movable only if movable node is set. Also you may want to
mention that we still don't support the auto-online to movable.


> where can_online_high_movable() is defined like this:
>
> 	#ifdef CONFIG_MOVABLE_NODE
> 	/*
> 	 * When CONFIG_MOVABLE_NODE, we permit onlining of a node which doesn't have
> 	 * normal memory.
> 	 */
> 	static bool can_online_high_movable(struct zone *zone)
> 	{
> 		return true;
> 	}
> 	#else /* CONFIG_MOVABLE_NODE */
> 	/* ensure every online node has NORMAL memory */
> 	static bool can_online_high_movable(struct zone *zone)
> 	{
> 		return node_state(zone_to_nid(zone), N_NORMAL_MEMORY);
> 	}
> 	#endif /* CONFIG_MOVABLE_NODE */
>
> To be more clear, I can change the commit log to say "Onlining all of a 
> node's memory into ZONE_MOVABLE requires CONFIG_MOVABLE_NODE".
>
> -- 
> Reza Arbab

-aneesh
Reza Arbab Sept. 21, 2016, 10:29 p.m. UTC | #6
On Wed, Sep 21, 2016 at 08:13:37PM +0530, Aneesh Kumar K.V wrote:
>So we are looking at two step online process here. The above explained
>the details nicely. Can you capture these details in the commit message. ie,
>to say that when using 'echo online-movable > state' we allow the move from
>normal to movable only if movable node is set. Also you may want to
>mention that we still don't support the auto-online to movable.

Sure, no problem. I'll use a more verbose commit message in v3.
diff mbox

Patch

diff --git a/Documentation/kernel-parameters.txt b/Documentation/kernel-parameters.txt
index a4f4d69..3d8460d 100644
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -2344,7 +2344,7 @@  bytes respectively. Such letter suffixes can also be entirely omitted.
 			that the amount of memory usable for all allocations
 			is not too small.
 
-	movable_node	[KNL,X86] Boot-time switch to enable the effects
+	movable_node	[KNL,X86,PPC] Boot-time switch to enable the effects
 			of CONFIG_MOVABLE_NODE=y. See mm/Kconfig for details.
 
 	MTD_Partition=	[MTD]
diff --git a/mm/Kconfig b/mm/Kconfig
index be0ee11..4b19cd3 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -153,7 +153,7 @@  config MOVABLE_NODE
 	bool "Enable to assign a node which has only movable memory"
 	depends on HAVE_MEMBLOCK
 	depends on NO_BOOTMEM
-	depends on X86_64
+	depends on X86_64 || PPC64
 	depends on NUMA
 	default n
 	help