Message ID | b3202827b5f22e9a7e8f145f83140698911641f7.1518394650.git.sam.bobroff@au1.ibm.com |
---|---|
State | Accepted |
Commit | c9dccf1d074a67d36c510845f663980d69e3409b |
Series | [1/1] powerpc/pseries: Enable RAS hotplug events late |
On Mon, Feb 12, 2018 at 11:19 AM, Sam Bobroff <sam.bobroff@au1.ibm.com> wrote:
> Currently if the kernel receives a memory hot-unplug event early
> enough, it may get stuck in an infinite loop in
> dissolve_free_huge_pages(). This appears as a stall just after:
>
> pseries-hotplug-mem: Attempting to hot-remove XX LMB(s) at YYYYYYYY
>
> It appears to be caused by "minimum_order" being uninitialized, due to
> init_ras_IRQ() executing before hugetlb_init().
>
> To correct this, extract the part of init_ras_IRQ() that enables
> hotplug event processing and place it in the machine_late_initcall
> phase, which is guaranteed to be after hugetlb_init() is called.
>
> Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
> ---
>  arch/powerpc/platforms/pseries/ras.c | 29 +++++++++++++++++++++--------
>  1 file changed, 21 insertions(+), 8 deletions(-)
>
> diff --git a/arch/powerpc/platforms/pseries/ras.c b/arch/powerpc/platforms/pseries/ras.c
> index 81d8614e7379..ba284949af06 100644
> --- a/arch/powerpc/platforms/pseries/ras.c
> +++ b/arch/powerpc/platforms/pseries/ras.c
> @@ -66,6 +66,26 @@ static int __init init_ras_IRQ(void)
>  		of_node_put(np);
>  	}
>
> +	/* EPOW Events */
> +	np = of_find_node_by_path("/event-sources/epow-events");
> +	if (np != NULL) {
> +		request_event_sources_irqs(np, ras_epow_interrupt, "RAS_EPOW");
> +		of_node_put(np);
> +	}
> +
> +	return 0;
> +}
> +machine_subsys_initcall(pseries, init_ras_IRQ);
> +
> +/*
> + * Enable the hotplug interrupt late because processing them may touch other
> + * devices or systems (e.g. hugepages) that have not been initialized at the
> + * subsys stage.
> + */
> +int __init init_ras_hotplug_IRQ(void)
> +{
> +	struct device_node *np;
> +
>  	/* Hotplug Events */
>  	np = of_find_node_by_path("/event-sources/hot-plug-events");
>  	if (np != NULL) {
> @@ -75,16 +95,9 @@ static int __init init_ras_IRQ(void)
>  		of_node_put(np);
>  	}
>
> -	/* EPOW Events */
> -	np = of_find_node_by_path("/event-sources/epow-events");
> -	if (np != NULL) {
> -		request_event_sources_irqs(np, ras_epow_interrupt, "RAS_EPOW");
> -		of_node_put(np);
> -	}
> -
>  	return 0;
>  }
> -machine_subsys_initcall(pseries, init_ras_IRQ);
> +machine_late_initcall(pseries, init_ras_hotplug_IRQ);
>

Seems reasonable to me, the other RAS events internal error and epow
seem like they are in the right place.

Acked-by: Balbir Singh <bsingharora@gmail.com>
On Mon, 2018-02-12 at 00:19:29 UTC, Sam Bobroff wrote:
> Currently if the kernel receives a memory hot-unplug event early
> enough, it may get stuck in an infinite loop in
> dissolve_free_huge_pages(). This appears as a stall just after:
>
> pseries-hotplug-mem: Attempting to hot-remove XX LMB(s) at YYYYYYYY
>
> It appears to be caused by "minimum_order" being uninitialized, due to
> init_ras_IRQ() executing before hugetlb_init().
>
> To correct this, extract the part of init_ras_IRQ() that enables
> hotplug event processing and place it in the machine_late_initcall
> phase, which is guaranteed to be after hugetlb_init() is called.
>
> Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
> Acked-by: Balbir Singh <bsingharora@gmail.com>

Applied to powerpc fixes, thanks.

https://git.kernel.org/powerpc/c/c9dccf1d074a67d36c510845f66398

cheers
diff --git a/arch/powerpc/platforms/pseries/ras.c b/arch/powerpc/platforms/pseries/ras.c
index 81d8614e7379..ba284949af06 100644
--- a/arch/powerpc/platforms/pseries/ras.c
+++ b/arch/powerpc/platforms/pseries/ras.c
@@ -66,6 +66,26 @@ static int __init init_ras_IRQ(void)
 		of_node_put(np);
 	}
 
+	/* EPOW Events */
+	np = of_find_node_by_path("/event-sources/epow-events");
+	if (np != NULL) {
+		request_event_sources_irqs(np, ras_epow_interrupt, "RAS_EPOW");
+		of_node_put(np);
+	}
+
+	return 0;
+}
+machine_subsys_initcall(pseries, init_ras_IRQ);
+
+/*
+ * Enable the hotplug interrupt late because processing them may touch other
+ * devices or systems (e.g. hugepages) that have not been initialized at the
+ * subsys stage.
+ */
+int __init init_ras_hotplug_IRQ(void)
+{
+	struct device_node *np;
+
 	/* Hotplug Events */
 	np = of_find_node_by_path("/event-sources/hot-plug-events");
 	if (np != NULL) {
@@ -75,16 +95,9 @@ static int __init init_ras_IRQ(void)
 		of_node_put(np);
 	}
 
-	/* EPOW Events */
-	np = of_find_node_by_path("/event-sources/epow-events");
-	if (np != NULL) {
-		request_event_sources_irqs(np, ras_epow_interrupt, "RAS_EPOW");
-		of_node_put(np);
-	}
-
 	return 0;
 }
-machine_subsys_initcall(pseries, init_ras_IRQ);
+machine_late_initcall(pseries, init_ras_hotplug_IRQ);
 
 #define EPOW_SHUTDOWN_NORMAL				1
 #define EPOW_SHUTDOWN_ON_UPS				2
Currently if the kernel receives a memory hot-unplug event early
enough, it may get stuck in an infinite loop in
dissolve_free_huge_pages(). This appears as a stall just after:

pseries-hotplug-mem: Attempting to hot-remove XX LMB(s) at YYYYYYYY

It appears to be caused by "minimum_order" being uninitialized, due to
init_ras_IRQ() executing before hugetlb_init().

To correct this, extract the part of init_ras_IRQ() that enables
hotplug event processing and place it in the machine_late_initcall
phase, which is guaranteed to be after hugetlb_init() is called.

Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
---
 arch/powerpc/platforms/pseries/ras.c | 29 +++++++++++++++++++++--------
 1 file changed, 21 insertions(+), 8 deletions(-)
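
The ordering argument in the commit message can be modelled in user space. The following is only a rough sketch, not kernel code: the level numbers, the model_* functions, order_log and run_initcalls() are hypothetical stand-ins, and running the RAS setup ahead of the hugetlb setup within the subsys level is an assumption taken from the statement that init_ras_IRQ() executes before hugetlb_init(). It shows that a handler registered at the late level only becomes active after everything at the subsys level has run.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for initcall levels; the numbering is illustrative. */
enum level { LEVEL_SUBSYS = 4, LEVEL_LATE = 7 };

static bool hugetlb_ready;   /* models state that hugetlb_init() sets up */
static char order_log[64];   /* records the order in which the models run */

static void record(const char *tag)
{
	strcat(order_log, tag);
}

/* Models init_ras_IRQ(): no longer enables hotplug events. */
static int model_init_ras_IRQ(void)
{
	record("ras;");
	return 0;
}

/* Models hugetlb_init(): makes the hugepage state usable. */
static int model_hugetlb_init(void)
{
	hugetlb_ready = true;
	record("hugetlb;");
	return 0;
}

/* Models init_ras_hotplug_IRQ(): notes whether its dependency was ready. */
static int model_init_ras_hotplug_IRQ(void)
{
	record(hugetlb_ready ? "hotplug(ok);" : "hotplug(early);");
	return 0;
}

struct initcall {
	enum level level;
	int (*fn)(void);
};

/* Registration order is deliberately scrambled; only the level decides. */
static const struct initcall calls[] = {
	{ LEVEL_LATE,   model_init_ras_hotplug_IRQ },
	{ LEVEL_SUBSYS, model_init_ras_IRQ },
	{ LEVEL_SUBSYS, model_hugetlb_init },
};

/* Run every registered call, lowest level first. */
static void run_initcalls(void)
{
	for (int lvl = 0; lvl <= LEVEL_LATE; lvl++)
		for (size_t i = 0; i < sizeof(calls) / sizeof(calls[0]); i++)
			if ((int)calls[i].level == lvl)
				calls[i].fn();
}
```

After run_initcalls(), order_log holds "ras;hugetlb;hotplug(ok);". If the hotplug model were registered at LEVEL_SUBSYS ahead of model_hugetlb_init, it would record "hotplug(early);" instead, which corresponds to the window the patch closes.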