[V2,2/2] powerpc/hotplug/mm: Fix hot-add memory node assoc

Message ID 9b006866-5a8e-dc05-d603-bfe8eb02eef2@linux.vnet.ibm.com (mailing list archive)
State Superseded, archived

Commit Message

Michael Bringmann May 25, 2017, 5:37 p.m. UTC
Removing or adding memory via the PowerPC hotplug interface shows
anomalies in the association between memory and nodes.  The code
was updated to ensure that all nodes found at boot are still available
to subsequent DLPAR hotplug-memory operations, even if they are not
needed at boot time.

Signed-off-by: Michael Bringmann <mwb@linux.vnet.ibm.com>
---
Changes in V2:
  -- Simplify patches to ensure more nodes in possible map, removing
     code from PowerPC numa.c that constrained possible map to size
     of online map.
---
 arch/powerpc/mm/numa.c |    7 -------
 1 file changed, 7 deletions(-)
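
For readers following the thread, the effect of the removed nodes_and() call can be modelled in user-space C. This is an illustration only, not kernel code: the sketch_* names, the 256-node limit, and the single online node are assumptions chosen for the example. The kernel's own nodemask_t helpers are declared in include/linux/nodemask.h.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed upper bound on node IDs, for illustration only. */
#define SKETCH_MAX_NODES 256u
#define SKETCH_WORDS (SKETCH_MAX_NODES / 64u)

/* Simplified stand-in for the kernel's nodemask_t: one bit per node. */
struct sketch_nodemask {
	uint64_t bits[SKETCH_WORDS];
};

/* Mark node 'nid' as present in the mask. */
void sketch_node_set(struct sketch_nodemask *m, unsigned int nid)
{
	m->bits[nid / 64u] |= (uint64_t)1 << (nid % 64u);
}

/* dst = a & b, word by word (dst may alias a or b). */
void sketch_nodes_and(struct sketch_nodemask *dst,
		      const struct sketch_nodemask *a,
		      const struct sketch_nodemask *b)
{
	for (size_t i = 0; i < SKETCH_WORDS; i++)
		dst->bits[i] = a->bits[i] & b->bits[i];
}

/* Count the nodes set in the mask. */
unsigned int sketch_nodes_weight(const struct sketch_nodemask *m)
{
	unsigned int count = 0;

	for (size_t i = 0; i < SKETCH_WORDS; i++)
		for (uint64_t w = m->bits[i]; w != 0; w &= w - 1)
			count++;
	return count;
}

/*
 * Model the end of boot: every node ID starts out possible and only
 * node 0 is online. If apply_mask is non-zero, restrict the possible
 * map to the online map, as the removed line did.
 * Returns the number of possible nodes left.
 */
unsigned int sketch_possible_after_boot(int apply_mask)
{
	struct sketch_nodemask possible = { {0} };
	struct sketch_nodemask online = { {0} };

	for (unsigned int nid = 0; nid < SKETCH_MAX_NODES; nid++)
		sketch_node_set(&possible, nid);
	sketch_node_set(&online, 0);

	if (apply_mask)
		sketch_nodes_and(&possible, &possible, &online);

	return sketch_nodes_weight(&possible);
}
```

In this model, sketch_possible_after_boot(1) leaves a single possible node, while sketch_possible_after_boot(0) keeps all 256. The second case is the state this patch restores, so that a later hot-add can still target a node that was not online at boot.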

Comments

Balbir Singh May 26, 2017, 3:23 a.m. UTC | #1
On Thu, 25 May 2017 12:37:40 -0500
Michael Bringmann <mwb@linux.vnet.ibm.com> wrote:

> Removing or adding memory via the PowerPC hotplug interface shows
> anomalies in the association between memory and nodes.  The code
> was updated to ensure that all nodes found at boot are still available
> to subsequent DLPAR hotplug-memory operations, even if they are not
> needed at boot time.
> 
> Signed-off-by: Michael Bringmann <mwb@linux.vnet.ibm.com>
> ---
> Changes in V2:
>   -- Simplify patches to ensure more nodes in possible map, removing
>      code from PowerPC numa.c that constrained possible map to size
>      of online map.
> ---
>  arch/powerpc/mm/numa.c |    7 -------
>  1 file changed, 7 deletions(-)
> 
> diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
> index 15c2dd5..18f3038 100644
> --- a/arch/powerpc/mm/numa.c
> +++ b/arch/powerpc/mm/numa.c
> @@ -907,13 +907,6 @@ void __init initmem_init(void)
>  
>  	memblock_dump_all();
>  
> -	/*
> -	 * Reduce the possible NUMA nodes to the online NUMA nodes,
> -	 * since we do not support node hotplug. This ensures that  we
> -	 * lower the maximum NUMA node ID to what is actually present.
> -	 */
> -	nodes_and(node_possible_map, node_possible_map, node_online_map);
> -

There is an overhead with turning this off if you have too many cgroups
with the memory controller. I think this fix was added for a pathological
test case. On my system I see 84 cgroups with 1 node, so the probable
overhead is 84*255*sizeof(struct mem_cgroup_tree_per_node).

I tried some patches to reduce the overhead, but those need more overhauling
and rework.

Balbir Singh.
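
The estimate above is plain arithmetic: it assumes each memory cgroup carries one per-node structure for every possible node, so every node that is possible but not online adds one structure per cgroup. A rough sketch follows. The function names are hypothetical, and the structure size is passed in as a parameter because the real sizeof depends on kernel version and configuration.

```c
#include <stddef.h>

/*
 * Number of extra per-node structures allocated across all cgroups
 * when the possible map keeps nodes that are not online.
 */
size_t sketch_extra_structs(size_t cgroups, size_t possible_nodes,
			    size_t online_nodes)
{
	if (possible_nodes <= online_nodes)
		return 0;
	return cgroups * (possible_nodes - online_nodes);
}

/* Same estimate in bytes, for a caller-supplied structure size. */
size_t sketch_extra_bytes(size_t cgroups, size_t possible_nodes,
			  size_t online_nodes, size_t struct_size)
{
	return sketch_extra_structs(cgroups, possible_nodes, online_nodes) *
	       struct_size;
}
```

With the figures from the message (84 cgroups, 1 online node, 256 possible nodes), sketch_extra_structs(84, 256, 1) gives 84*255 = 21420 structures; multiplying by the actual structure size gives the byte count.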

Michael Ellerman May 26, 2017, 5:38 a.m. UTC | #2
Michael Bringmann <mwb@linux.vnet.ibm.com> writes:

> Removing or adding memory via the PowerPC hotplug interface shows
> anomalies in the association between memory and nodes.

What anomalies? Please describe the actual problem you're seeing, with
details, and why you think this is the correct fix.

This is a revert of 3af229f2071f ("powerpc/numa: Reset node_possible_map
to only node_online_map"), so please explain why all the things
mentioned in the change log for that commit are either wrong or no
longer true.

cheers

Michael Bringmann May 26, 2017, 12:28 p.m. UTC | #3
>>  arch/powerpc/mm/numa.c |    7 -------
>>  1 file changed, 7 deletions(-)
>>
>> diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
>> index 15c2dd5..18f3038 100644
>> --- a/arch/powerpc/mm/numa.c
>> +++ b/arch/powerpc/mm/numa.c
>> @@ -907,13 +907,6 @@ void __init initmem_init(void)
>>  
>>  	memblock_dump_all();
>>  
>> -	/*
>> -	 * Reduce the possible NUMA nodes to the online NUMA nodes,
>> -	 * since we do not support node hotplug. This ensures that  we
>> -	 * lower the maximum NUMA node ID to what is actually present.
>> -	 */
>> -	nodes_and(node_possible_map, node_possible_map, node_online_map);
>> -
> 
> There is an overhead with turning this off if you have too many cgroups
> with the memory controller. I think this fix was added for a pathological
> test case. On my system I see 84 cgroups with 1 node, so the probable
> overhead is 84*255*sizeof(struct mem_cgroup_tree_per_node).
> 
> I tried some patches to reduce the overhead, but those need more overhauling
> and rework.

Is there some other way to add a node to a dynamic, running system without
crashing?  I have not encountered one as yet.

> Balbir Singh.

Patch

diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
index 15c2dd5..18f3038 100644
--- a/arch/powerpc/mm/numa.c
+++ b/arch/powerpc/mm/numa.c
@@ -907,13 +907,6 @@  void __init initmem_init(void)
 
 	memblock_dump_all();
 
-	/*
-	 * Reduce the possible NUMA nodes to the online NUMA nodes,
-	 * since we do not support node hotplug. This ensures that  we
-	 * lower the maximum NUMA node ID to what is actually present.
-	 */
-	nodes_and(node_possible_map, node_possible_map, node_online_map);
-
 	for_each_online_node(nid) {
 		unsigned long start_pfn, end_pfn;