Message ID | 1474828616-16608-5-git-send-email-arbab@linux.vnet.ibm.com (mailing list archive) |
---|---|
State | Superseded, archived |
Headers | show |
Reza Arbab <arbab@linux.vnet.ibm.com> writes: > At boot, the movable_node option sets bottom-up memblock allocation. > > This reduces the chance that, in the window before movable memory has > been identified, an allocation for the kernel might come from a movable > node. By going bottom-up, early allocations will most likely come from > the same node as the kernel image, which is necessarily in a nonmovable > node. > > Then, once any known hotplug memory has been marked, allocation can be > reset back to top-down. On x86, this is done in numa_init(). This patch > does the same on power, in numa initmem_init(). > > Signed-off-by: Reza Arbab <arbab@linux.vnet.ibm.com> > --- > arch/powerpc/mm/numa.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c > index d7ac419..fdf1e69 100644 > --- a/arch/powerpc/mm/numa.c > +++ b/arch/powerpc/mm/numa.c > @@ -945,6 +945,9 @@ void __init initmem_init(void) > max_low_pfn = memblock_end_of_DRAM() >> PAGE_SHIFT; > max_pfn = max_low_pfn; > > + /* bottom-up allocation may have been set by movable_node */ > + memblock_set_bottom_up(false); > + By then we have done few memblock allocation right ? IMHO, we should do this early enough in prom.c after we do parse_early_param, with a comment there explaining that, we don't really support hotplug memblock and when we do that, this should be moved to a place where we can handle memblock allocation such that we avoid spreading memblock allocation to movable node. > if (parse_numa_properties()) > setup_nonnuma(); > else > -- > 1.8.3.1 -aneesh
On Mon, Sep 26, 2016 at 09:17:43PM +0530, Aneesh Kumar K.V wrote: >> + /* bottom-up allocation may have been set by movable_node */ >> + memblock_set_bottom_up(false); >> + > >By then we have done few memblock allocation right ? Yes, some allocations do occur while bottom-up is set. >IMHO, we should do this early enough in prom.c after we do >parse_early_param, with a comment there explaining that, we don't >really support hotplug memblock and when we do that, this should be >moved to a place where we can handle memblock allocation such that we >avoid spreading memblock allocation to movable node. Sure, we can do it earlier. The only consideration is that any potential calls to memblock_mark_hotplug() happen before we reset to top-down. Since we don't do that at all on power, the call can go anywhere.
On Sun, 2016-09-25 at 13:36 -0500, Reza Arbab wrote: > At boot, the movable_node option sets bottom-up memblock allocation. > > This reduces the chance that, in the window before movable memory has > been identified, an allocation for the kernel might come from a movable > node. By going bottom-up, early allocations will most likely come from > the same node as the kernel image, which is necessarily in a nonmovable > node. > > Then, once any known hotplug memory has been marked, allocation can be > reset back to top-down. On x86, this is done in numa_init(). This patch > does the same on power, in numa initmem_init(). That's fragile and a bit gross. But then I'm not *that* fan of making accelerator memory be "memory" nodes in the first place. Oh well... In any case, if the memory hasn't been hotplug, this shouldn't be necessary as we shouldn't be considering it for allocation. If we want to prevent it for other reason, we should add logic for that in memblock, or reserve it early or something like that. Just relying magically on the direction of the allocator is bad, really bad. Ben. > Signed-off-by: Reza Arbab <arbab@linux.vnet.ibm.com> > --- > arch/powerpc/mm/numa.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c > index d7ac419..fdf1e69 100644 > --- a/arch/powerpc/mm/numa.c > +++ b/arch/powerpc/mm/numa.c > @@ -945,6 +945,9 @@ void __init initmem_init(void) > > max_low_pfn = memblock_end_of_DRAM() >> PAGE_SHIFT; > > max_pfn = max_low_pfn; > > > + /* bottom-up allocation may have been set by movable_node */ > > + memblock_set_bottom_up(false); > + > > if (parse_numa_properties()) > > setup_nonnuma(); > > else
On Tue, Sep 27, 2016 at 07:12:31AM +1000, Benjamin Herrenschmidt wrote: >In any case, if the memory hasn't been hotplug, this shouldn't be >necessary as we shouldn't be considering it for allocation. Right. To be clear, the background info I put in the commit log refers to x86, where the SRAT can describe movable nodes which exist at boot. They're trying to avoid allocations from those nodes before they've been identified. On power, movable nodes can only exist via hotplug, so that scenario can't happen. We can immediately go back to top-down allocation. That is the missing call being added in the patch.
On 27/09/16 10:14, Reza Arbab wrote: > On Tue, Sep 27, 2016 at 07:12:31AM +1000, Benjamin Herrenschmidt wrote: >> In any case, if the memory hasn't been hotplug, this shouldn't be necessary as we shouldn't be considering it for allocation. > > Right. To be clear, the background info I put in the commit log refers to x86, where the SRAT can describe movable nodes which exist at boot. They're trying to avoid allocations from those nodes before they've been identified. > > On power, movable nodes can only exist via hotplug, so that scenario can't happen. We can immediately go back to top-down allocation. That is the missing call being added in the patch. > Can we fix cmdline_parse_movable_node() to do the right thing? I suspect that code is heavily x86 only in the sense that no other arch needs it. Balbir Singh.
On Tue, Oct 04, 2016 at 11:48:30AM +1100, Balbir Singh wrote: >On 27/09/16 10:14, Reza Arbab wrote: >> Right. To be clear, the background info I put in the commit log >> refers to x86, where the SRAT can describe movable nodes which exist >> at boot. They're trying to avoid allocations from those nodes before >> they've been identified. >> >> On power, movable nodes can only exist via hotplug, so that scenario >> can't happen. We can immediately go back to top-down allocation. That >> is the missing call being added in the patch. > >Can we fix cmdline_parse_movable_node() to do the right thing? I >suspect that code is heavily x86 only in the sense that no other arch >needs it. Good idea. We could change it so things only go bottom-up on x86 in the first place. A nice consequence is that CONFIG_MOVABLE_NODE would then basically be usable on any platform with memory hotplug, not just PPC64 and X86_64. I'll see if I can move the relevant code into an arch_*() call or otherwise factor it out.
diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c index d7ac419..fdf1e69 100644 --- a/arch/powerpc/mm/numa.c +++ b/arch/powerpc/mm/numa.c @@ -945,6 +945,9 @@ void __init initmem_init(void) max_low_pfn = memblock_end_of_DRAM() >> PAGE_SHIFT; max_pfn = max_low_pfn; + /* bottom-up allocation may have been set by movable_node */ + memblock_set_bottom_up(false); + if (parse_numa_properties()) setup_nonnuma(); else
At boot, the movable_node option sets bottom-up memblock allocation. This reduces the chance that, in the window before movable memory has been identified, an allocation for the kernel might come from a movable node. By going bottom-up, early allocations will most likely come from the same node as the kernel image, which is necessarily in a nonmovable node. Then, once any known hotplug memory has been marked, allocation can be reset back to top-down. On x86, this is done in numa_init(). This patch does the same on power, in numa initmem_init(). Signed-off-by: Reza Arbab <arbab@linux.vnet.ibm.com> --- arch/powerpc/mm/numa.c | 3 +++ 1 file changed, 3 insertions(+)