Message ID | 1489034268-24888-1-git-send-email-bharata@linux.vnet.ibm.com (mailing list archive) |
---|---|
State | New |
Headers | show |
> Fix this by ensuring that of_node_put() is called only from the > error path in dlpar_cpu_remove_by_index(). In the normal path, > of_node_put() happens as part of dlpar_detach_node(). > > Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> > Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com> > --- > Changes in v1: > - Fixed the refcount problem in the userspace driven unplug path > in addition to in-kernel unplug path. (Sachin Sant) > > v0: https://patchwork.ozlabs.org/patch/736547/ > Thanks Bharata. Tested cpu hotplug/unplug with this fix on a PowrVM LPAR. Tested-by : Sachin Sant <sachinp@linux.vnet.ibm.com> Thanks -Sachin
Bharata B Rao <bharata@linux.vnet.ibm.com> writes: > The following warning is seen when a CPU is hot unplugged on a PowerKVM > guest: > > refcount_t: underflow; use-after-free. ... > > Fix this by ensuring that of_node_put() is called only from the > error path in dlpar_cpu_remove_by_index(). In the normal path, > of_node_put() happens as part of dlpar_detach_node(). Don't we have the same bug in mobility.c ? static int delete_dt_node(__be32 phandle) { struct device_node *dn; dn = of_find_node_by_phandle(be32_to_cpu(phandle)); if (!dn) return -ENOENT; dlpar_detach_node(dn); of_node_put(dn); return 0; } > diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c b/arch/powerpc/platforms/pseries/hotplug-cpu.c > index 7bc0e91..c5ed510 100644 > --- a/arch/powerpc/platforms/pseries/hotplug-cpu.c > +++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c > @@ -619,7 +619,8 @@ static int dlpar_cpu_remove_by_index(u32 drc_index) > } > > rc = dlpar_cpu_remove(dn, drc_index); > - of_node_put(dn); > + if (rc) > + of_node_put(dn); > return rc; > } > > @@ -856,9 +857,12 @@ static ssize_t dlpar_cpu_release(const char *buf, size_t count) > } > > rc = dlpar_cpu_remove(dn, drc_index); > - of_node_put(dn); > - > - return rc ? rc : count; > + if (rc) { > + of_node_put(dn); > + return rc; > + } else { > + return count; > + } > } Which makes me think that the bug is actually that dlpar_detach_node() should not be dropping the last reference on behalf of its caller. It means that the caller has a pointer to a freed dn, which is wrong. eg. A caller could do: dlpar_detach_node(dn); printk("detached %s\n", dn->name); But dn will have been freed by the time the printk happens. The problem is we call dlpar_detach_node() recursively, and for the child nodes it *does* need to drop the last reference, because that's the whole point. So it needs some refactoring. I also notice there's no error checking on the removal of the child nodes, which is pretty dubious: child = of_get_next_child(dn, NULL); while (child) { dlpar_detach_node(child); child = of_get_next_child(dn, child); } cheers
On 03/08/2017 08:37 PM, Bharata B Rao wrote: > The following warning is seen when a CPU is hot unplugged on a PowerKVM > guest: Is this the case with cpus present at boot? What about cpus hotplugged after boot? My suspicion is that the refcount was wrong to begin with. See my comments below. The use of the of_node_put() calls is correct as in each case we incremented the ref count earlier in the same function. > > refcount_t: underflow; use-after-free. > ------------[ cut here ]------------ > WARNING: CPU: 0 PID: 53 at lib/refcount.c:128 refcount_sub_and_test+0xd8/0xf0 > Modules linked in: > CPU: 0 PID: 53 Comm: kworker/u510:1 Not tainted 4.11.0-rc1 #3 > Workqueue: pseries hotplug workque pseries_hp_work_fn > task: c0000000fb475000 task.stack: c0000000fb81c000 > NIP: c0000000006f0808 LR: c0000000006f0804 CTR: c0000000007b98c0 > REGS: c0000000fb81f710 TRAP: 0700 Not tainted (4.11.0-rc1) > MSR: 800000000282b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> > CR: 48002222 XER: 20000000 > CFAR: c000000000c438e0 SOFTE: 1 > GPR00: c0000000006f0804 c0000000fb81f990 c000000001573b00 0000000000000026 > GPR04: 0000000000000000 000000000000016c 667265652e0d0a73 652d61667465722d > GPR08: 0000000000000007 0000000000000007 0000000000000001 0000000000000006 > GPR12: 0000000000002200 c00000000ff40000 c00000000010c578 c0000001f11b9f40 > GPR16: c0000001fe0312a8 c0000001fe031078 c0000001fe031020 0000000000000001 > GPR20: 0000000000000000 0000000000000000 c000000001454808 fffffffffffffef7 > GPR24: 0000000000000000 c0000001f1677648 0000000000000000 0000000000000000 > GPR28: 0000000010000008 c000000000e4d3d8 0000000000000000 c0000001eaae07d8 > NIP [c0000000006f0808] refcount_sub_and_test+0xd8/0xf0 > LR [c0000000006f0804] refcount_sub_and_test+0xd4/0xf0 > Call Trace: > [c0000000fb81f990] [c0000000006f0804] refcount_sub_and_test+0xd4/0xf0 (unreliable) > [c0000000fb81f9f0] [c0000000006d04b4] kobject_put+0x44/0x2a0 > [c0000000fb81fa70] [c0000000009d5284] of_node_put+0x34/0x50 > [c0000000fb81faa0] [c0000000000aceb8] dlpar_cpu_remove_by_index+0x108/0x130 > [c0000000fb81fb30] [c0000000000ae128] dlpar_cpu+0x78/0x550 > [c0000000fb81fbe0] [c0000000000a7b40] handle_dlpar_errorlog+0xc0/0x160 > [c0000000fb81fc50] [c0000000000a7c74] pseries_hp_work_fn+0x94/0xa0 > [c0000000fb81fc80] [c000000000102cec] process_one_work+0x23c/0x540 > [c0000000fb81fd20] [c00000000010309c] worker_thread+0xac/0x620 > [c0000000fb81fdc0] [c00000000010c6c4] kthread+0x154/0x1a0 > [c0000000fb81fe30] [c00000000000bbe0] ret_from_kernel_thread+0x5c/0x7c > > Fix this by ensuring that of_node_put() is called only from the > error path in dlpar_cpu_remove_by_index(). In the normal path, > of_node_put() happens as part of dlpar_detach_node(). > > Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> > Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com> > --- > Changes in v1: > - Fixed the refcount problem in the userspace driven unplug path > in addition to in-kernel unplug path. (Sachin Sant) > > v0: https://patchwork.ozlabs.org/patch/736547/ > > arch/powerpc/platforms/pseries/hotplug-cpu.c | 12 ++++++++---- > 1 file changed, 8 insertions(+), 4 deletions(-) > > diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c b/arch/powerpc/platforms/pseries/hotplug-cpu.c > index 7bc0e91..c5ed510 100644 > --- a/arch/powerpc/platforms/pseries/hotplug-cpu.c > +++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c > @@ -619,7 +619,8 @@ static int dlpar_cpu_remove_by_index(u32 drc_index) > } > > rc = dlpar_cpu_remove(dn, drc_index); > - of_node_put(dn); > + if (rc) > + of_node_put(dn); I think there is another issue at play here because this is wrong. Regardless of whether the dlpar_cpu_remove() succeeds or fails we still need of_node_put() for both cases because we incremented the ref count earlier in this function with a call to cpu_drc_index_to_dn() call. That call doesn't, but shoul, document that it returns a device_node with incremented refcount. > return rc; > } > > @@ -856,9 +857,12 @@ static ssize_t dlpar_cpu_release(const char *buf, size_t count) > } > > rc = dlpar_cpu_remove(dn, drc_index); > - of_node_put(dn); > - > - return rc ? rc : count; > + if (rc) { > + of_node_put(dn); > + return rc; > + } else { > + return count; > + } Same comment as above. The call earlier in the function to of_find_node_by_path() returned a device_node struct with its ref count incremented. So, regardless of whether dlpar_cpu_remove() succeeds or fails we need decrement the ref count with of_node_put(). Looking closer at the call paths for attach and detach one will notice that __of_attach_node_sysfs() does not take a device_node reference with of_node_get(), but __of_detach_node_sysfs() does a of_node_put(). In the old days we use to keep the device tree in /proc. Now it lives in sysfs and is symlinked to /proc for userspace ABI reasons. Further, pseries was the only platform in those days that did any sort of dynamic OF operations. So, in those dark days we were responsible for calling of_node_put in dlpar_detach_node() to decrement the of_node_init() reference. Looking at the comments in __of_detach_node_sysfs() it seems that they expect to decrement that reference there now. void __of_detach_node_sysfs(struct device_node *np) { ...snip... /* finally remove the kobj_init ref */ of_node_put(np); } -Tyrel > } > > #endif /* CONFIG_ARCH_CPU_PROBE_RELEASE */ >
On Thu, Mar 09, 2017 at 01:34:00PM -0800, Tyrel Datwyler wrote: > On 03/08/2017 08:37 PM, Bharata B Rao wrote: > > The following warning is seen when a CPU is hot unplugged on a PowerKVM > > guest: > > Is this the case with cpus present at boot? What about cpus hotplugged > after boot? I have observed this for CPUs that are hotplugged. > > My suspicion is that the refcount was wrong to begin with. See my > comments below. The use of the of_node_put() calls is correct as in each > case we incremented the ref count earlier in the same function. > > > > > refcount_t: underflow; use-after-free. > > ------------[ cut here ]------------ > > WARNING: CPU: 0 PID: 53 at lib/refcount.c:128 refcount_sub_and_test+0xd8/0xf0 > > Modules linked in: > > CPU: 0 PID: 53 Comm: kworker/u510:1 Not tainted 4.11.0-rc1 #3 > > Workqueue: pseries hotplug workque pseries_hp_work_fn > > task: c0000000fb475000 task.stack: c0000000fb81c000 > > NIP: c0000000006f0808 LR: c0000000006f0804 CTR: c0000000007b98c0 > > REGS: c0000000fb81f710 TRAP: 0700 Not tainted (4.11.0-rc1) > > MSR: 800000000282b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> > > CR: 48002222 XER: 20000000 > > CFAR: c000000000c438e0 SOFTE: 1 > > GPR00: c0000000006f0804 c0000000fb81f990 c000000001573b00 0000000000000026 > > GPR04: 0000000000000000 000000000000016c 667265652e0d0a73 652d61667465722d > > GPR08: 0000000000000007 0000000000000007 0000000000000001 0000000000000006 > > GPR12: 0000000000002200 c00000000ff40000 c00000000010c578 c0000001f11b9f40 > > GPR16: c0000001fe0312a8 c0000001fe031078 c0000001fe031020 0000000000000001 > > GPR20: 0000000000000000 0000000000000000 c000000001454808 fffffffffffffef7 > > GPR24: 0000000000000000 c0000001f1677648 0000000000000000 0000000000000000 > > GPR28: 0000000010000008 c000000000e4d3d8 0000000000000000 c0000001eaae07d8 > > NIP [c0000000006f0808] refcount_sub_and_test+0xd8/0xf0 > > LR [c0000000006f0804] refcount_sub_and_test+0xd4/0xf0 > > Call Trace: > > [c0000000fb81f990] [c0000000006f0804] refcount_sub_and_test+0xd4/0xf0 (unreliable) > > [c0000000fb81f9f0] [c0000000006d04b4] kobject_put+0x44/0x2a0 > > [c0000000fb81fa70] [c0000000009d5284] of_node_put+0x34/0x50 > > [c0000000fb81faa0] [c0000000000aceb8] dlpar_cpu_remove_by_index+0x108/0x130 > > [c0000000fb81fb30] [c0000000000ae128] dlpar_cpu+0x78/0x550 > > [c0000000fb81fbe0] [c0000000000a7b40] handle_dlpar_errorlog+0xc0/0x160 > > [c0000000fb81fc50] [c0000000000a7c74] pseries_hp_work_fn+0x94/0xa0 > > [c0000000fb81fc80] [c000000000102cec] process_one_work+0x23c/0x540 > > [c0000000fb81fd20] [c00000000010309c] worker_thread+0xac/0x620 > > [c0000000fb81fdc0] [c00000000010c6c4] kthread+0x154/0x1a0 > > [c0000000fb81fe30] [c00000000000bbe0] ret_from_kernel_thread+0x5c/0x7c > > > > Fix this by ensuring that of_node_put() is called only from the > > error path in dlpar_cpu_remove_by_index(). In the normal path, > > of_node_put() happens as part of dlpar_detach_node(). > > > > Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> > > Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com> > > --- > > Changes in v1: > > - Fixed the refcount problem in the userspace driven unplug path > > in addition to in-kernel unplug path. (Sachin Sant) > > > > v0: https://patchwork.ozlabs.org/patch/736547/ > > > > arch/powerpc/platforms/pseries/hotplug-cpu.c | 12 ++++++++---- > > 1 file changed, 8 insertions(+), 4 deletions(-) > > > > diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c b/arch/powerpc/platforms/pseries/hotplug-cpu.c > > index 7bc0e91..c5ed510 100644 > > --- a/arch/powerpc/platforms/pseries/hotplug-cpu.c > > +++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c > > @@ -619,7 +619,8 @@ static int dlpar_cpu_remove_by_index(u32 drc_index) > > } > > > > rc = dlpar_cpu_remove(dn, drc_index); > > - of_node_put(dn); > > + if (rc) > > + of_node_put(dn); > > I think there is another issue at play here because this is wrong. > Regardless of whether the dlpar_cpu_remove() succeeds or fails we still > need of_node_put() for both cases because we incremented the ref count > earlier in this function with a call to cpu_drc_index_to_dn() call. That > call doesn't, but shoul, document that it returns a device_node with > incremented refcount. > > > return rc; > > } > > > > @@ -856,9 +857,12 @@ static ssize_t dlpar_cpu_release(const char *buf, size_t count) > > } > > > > rc = dlpar_cpu_remove(dn, drc_index); > > - of_node_put(dn); > > - > > - return rc ? rc : count; > > + if (rc) { > > + of_node_put(dn); > > + return rc; > > + } else { > > + return count; > > + } > > Same comment as above. The call earlier in the function to > of_find_node_by_path() returned a device_node struct with its ref count > incremented. So, regardless of whether dlpar_cpu_remove() succeeds or > fails we need decrement the ref count with of_node_put(). > > Looking closer at the call paths for attach and detach one will notice > that __of_attach_node_sysfs() does not take a device_node reference with > of_node_get(), but __of_detach_node_sysfs() does a of_node_put(). In the > old days we use to keep the device tree in /proc. Now it lives in sysfs > and is symlinked to /proc for userspace ABI reasons. Further, pseries > was the only platform in those days that did any sort of dynamic OF > operations. So, in those dark days we were responsible for calling > of_node_put in dlpar_detach_node() to decrement the of_node_init() > reference. Looking at the comments in __of_detach_node_sysfs() it seems > that they expect to decrement that reference there now. > > void __of_detach_node_sysfs(struct device_node *np) > { > > ...snip... > > /* finally remove the kobj_init ref */ > of_node_put(np); > } > So you suggest that adding of_node_get() to __of_attach_node_sysfs() is the right fix ? Regards, Bharata.
On 03/13/2017 03:29 AM, Bharata B Rao wrote: > On Thu, Mar 09, 2017 at 01:34:00PM -0800, Tyrel Datwyler wrote: >> On 03/08/2017 08:37 PM, Bharata B Rao wrote: >>> The following warning is seen when a CPU is hot unplugged on a PowerKVM >>> guest: >> >> Is this the case with cpus present at boot? What about cpus hotplugged >> after boot? > > I have observed this for CPUs that are hotplugged. If removing a cpu present at boot works, but removing one that has been hotplugged after boot reproduces the problem it is more likely the case that we failed to take a reference during hotplug or released a reference we shouldn't have. I'd have to go look at the hot add path. > >> >> My suspicion is that the refcount was wrong to begin with. See my >> comments below. The use of the of_node_put() calls is correct as in each >> case we incremented the ref count earlier in the same function. >> >>> >>> refcount_t: underflow; use-after-free. >>> ------------[ cut here ]------------ >>> WARNING: CPU: 0 PID: 53 at lib/refcount.c:128 refcount_sub_and_test+0xd8/0xf0 >>> Modules linked in: >>> CPU: 0 PID: 53 Comm: kworker/u510:1 Not tainted 4.11.0-rc1 #3 >>> Workqueue: pseries hotplug workque pseries_hp_work_fn >>> task: c0000000fb475000 task.stack: c0000000fb81c000 >>> NIP: c0000000006f0808 LR: c0000000006f0804 CTR: c0000000007b98c0 >>> REGS: c0000000fb81f710 TRAP: 0700 Not tainted (4.11.0-rc1) >>> MSR: 800000000282b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> >>> CR: 48002222 XER: 20000000 >>> CFAR: c000000000c438e0 SOFTE: 1 >>> GPR00: c0000000006f0804 c0000000fb81f990 c000000001573b00 0000000000000026 >>> GPR04: 0000000000000000 000000000000016c 667265652e0d0a73 652d61667465722d >>> GPR08: 0000000000000007 0000000000000007 0000000000000001 0000000000000006 >>> GPR12: 0000000000002200 c00000000ff40000 c00000000010c578 c0000001f11b9f40 >>> GPR16: c0000001fe0312a8 c0000001fe031078 c0000001fe031020 0000000000000001 >>> GPR20: 0000000000000000 0000000000000000 c000000001454808 fffffffffffffef7 >>> GPR24: 0000000000000000 c0000001f1677648 0000000000000000 0000000000000000 >>> GPR28: 0000000010000008 c000000000e4d3d8 0000000000000000 c0000001eaae07d8 >>> NIP [c0000000006f0808] refcount_sub_and_test+0xd8/0xf0 >>> LR [c0000000006f0804] refcount_sub_and_test+0xd4/0xf0 >>> Call Trace: >>> [c0000000fb81f990] [c0000000006f0804] refcount_sub_and_test+0xd4/0xf0 (unreliable) >>> [c0000000fb81f9f0] [c0000000006d04b4] kobject_put+0x44/0x2a0 >>> [c0000000fb81fa70] [c0000000009d5284] of_node_put+0x34/0x50 >>> [c0000000fb81faa0] [c0000000000aceb8] dlpar_cpu_remove_by_index+0x108/0x130 >>> [c0000000fb81fb30] [c0000000000ae128] dlpar_cpu+0x78/0x550 >>> [c0000000fb81fbe0] [c0000000000a7b40] handle_dlpar_errorlog+0xc0/0x160 >>> [c0000000fb81fc50] [c0000000000a7c74] pseries_hp_work_fn+0x94/0xa0 >>> [c0000000fb81fc80] [c000000000102cec] process_one_work+0x23c/0x540 >>> [c0000000fb81fd20] [c00000000010309c] worker_thread+0xac/0x620 >>> [c0000000fb81fdc0] [c00000000010c6c4] kthread+0x154/0x1a0 >>> [c0000000fb81fe30] [c00000000000bbe0] ret_from_kernel_thread+0x5c/0x7c >>> >>> Fix this by ensuring that of_node_put() is called only from the >>> error path in dlpar_cpu_remove_by_index(). In the normal path, >>> of_node_put() happens as part of dlpar_detach_node(). >>> >>> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> >>> Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com> >>> --- >>> Changes in v1: >>> - Fixed the refcount problem in the userspace driven unplug path >>> in addition to in-kernel unplug path. (Sachin Sant) >>> >>> v0: https://patchwork.ozlabs.org/patch/736547/ >>> >>> arch/powerpc/platforms/pseries/hotplug-cpu.c | 12 ++++++++---- >>> 1 file changed, 8 insertions(+), 4 deletions(-) >>> >>> diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c b/arch/powerpc/platforms/pseries/hotplug-cpu.c >>> index 7bc0e91..c5ed510 100644 >>> --- a/arch/powerpc/platforms/pseries/hotplug-cpu.c >>> +++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c >>> @@ -619,7 +619,8 @@ static int dlpar_cpu_remove_by_index(u32 drc_index) >>> } >>> >>> rc = dlpar_cpu_remove(dn, drc_index); >>> - of_node_put(dn); >>> + if (rc) >>> + of_node_put(dn); >> >> I think there is another issue at play here because this is wrong. >> Regardless of whether the dlpar_cpu_remove() succeeds or fails we still >> need of_node_put() for both cases because we incremented the ref count >> earlier in this function with a call to cpu_drc_index_to_dn() call. That >> call doesn't, but shoul, document that it returns a device_node with >> incremented refcount. >> >>> return rc; >>> } >>> >>> @@ -856,9 +857,12 @@ static ssize_t dlpar_cpu_release(const char *buf, size_t count) >>> } >>> >>> rc = dlpar_cpu_remove(dn, drc_index); >>> - of_node_put(dn); >>> - >>> - return rc ? rc : count; >>> + if (rc) { >>> + of_node_put(dn); >>> + return rc; >>> + } else { >>> + return count; >>> + } >> >> Same comment as above. The call earlier in the function to >> of_find_node_by_path() returned a device_node struct with its ref count >> incremented. So, regardless of whether dlpar_cpu_remove() succeeds or >> fails we need decrement the ref count with of_node_put(). >> >> Looking closer at the call paths for attach and detach one will notice >> that __of_attach_node_sysfs() does not take a device_node reference with >> of_node_get(), but __of_detach_node_sysfs() does a of_node_put(). In the >> old days we use to keep the device tree in /proc. Now it lives in sysfs >> and is symlinked to /proc for userspace ABI reasons. Further, pseries >> was the only platform in those days that did any sort of dynamic OF >> operations. So, in those dark days we were responsible for calling >> of_node_put in dlpar_detach_node() to decrement the of_node_init() >> reference. Looking at the comments in __of_detach_node_sysfs() it seems >> that they expect to decrement that reference there now. >> >> void __of_detach_node_sysfs(struct device_node *np) >> { >> >> ...snip... >> >> /* finally remove the kobj_init ref */ >> of_node_put(np); >> } >> > > So you suggest that adding of_node_get() to __of_attach_node_sysfs() > is the right fix ? If I understand that this only creates for hot-added cpus then no. Also for this to be the correct fix I would expect to see this recreate for all hot-remove operations such as memory and pci devices as well. -Tyrel > > Regards, > Bharata. >
>> So you suggest that adding of_node_get() to __of_attach_node_sysfs() >> is the right fix ? > > If I understand that this only creates for hot-added cpus then no. Also > for this to be the correct fix I would expect to see this recreate for > all hot-remove operations such as memory and pci devices as well. > So I can recreate this problem while removing a I/O adapter as well. Here is a trace against 4.11.0-rc1-next20170310 while performing a DLPAR remove operation on a I/O adapter. [ 549.815605] rpaphp: Slot [U78C7.001.RCH0042-P1-C1] registered [ 549.815608] rpadlpar_io: slot PHB 64 added [ 575.267302] iommu: Removing device 0040:01:00.0 from group 1 [ 575.267401] iommu: Removing device 0040:01:00.1 from group 1 [ 575.267842] refcount_t: underflow; use-after-free. [ 575.267855] ------------[ cut here ]------------ [ 575.267862] WARNING: CPU: 2 PID: 3837 at lib/refcount.c:128 refcount_sub_and_test+0xf4/0x110 [ 575.267865] Modules linked in: rpadlpar_io rpaphp dccp_diag dccp tcp_diag udp_diag inet_diag unix_diag af_packet_diag netlink_diag rpcrdma sunrpc ib_isert iscsi_target_mod ib_iser libiscsi scsi_transport_iscsi ib_srpt target_core_mod ib_srp ib_ipoib rdma_ucm ib_ucm ib_uverbs ib_umad rdma_cm ib_cm iw_cm iw_cxgb3 ib_core ghash_generic xts gf128mul tpm_ibmvtpm tpm vmx_crypto pseries_rng sg binfmt_misc ip_tables xfs libcrc32c sr_mod sd_mod cdrom cxgb3 ibmvscsi ibmveth scsi_transport_srp mdio dm_mirror dm_region_hash dm_log dm_mod [ 575.267904] CPU: 2 PID: 3837 Comm: drmgr Not tainted 4.11.0-rc1-next-20170310 #4 [ 575.267907] task: c00000076f041600 task.stack: c00000076f084000 [ 575.267910] NIP: c000000001aa69c4 LR: c000000001aa69c0 CTR: 00000000006338e4 [ 575.267913] REGS: c00000076f0878a0 TRAP: 0700 Not tainted (4.11.0-rc1-next-20170310) [ 575.267915] MSR: 8000000000029033 <SF,EE,ME,IR,DR,RI,LE> [ 575.267920] CR: 42002422 XER: 00000007 [ 575.267923] CFAR: c000000001edb5e0 SOFTE: 1 [ 575.267923] GPR00: c000000001aa69c0 c00000076f087b20 c000000002605f00 0000000000000026 [ 575.267923] GPR04: 0000000000000000 800000100fe93ec0 0000000000492b9a 0000000000000000 [ 575.267923] GPR08: 0000000000000001 0000000000000007 0000000000000006 0000000000003ff0 [ 575.267923] GPR12: 0000000000002200 c00000000e801200 0000000000000000 0000000000000000 [ 575.267923] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 [ 575.267923] GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 [ 575.267923] GPR24: 0000000000000000 c0000001db50a78c 0000000010016650 0000000000000000 [ 575.267923] GPR28: c0000001dd1a7500 c0000001dd1a7200 c0000001db50a780 c0000001dd1a7258 [ 575.267955] NIP [c000000001aa69c4] refcount_sub_and_test+0xf4/0x110 [ 575.267958] LR [c000000001aa69c0] refcount_sub_and_test+0xf0/0x110 [ 575.267960] Call Trace: [ 575.267962] [c00000076f087b20] [c000000001aa69c0] refcount_sub_and_test+0xf0/0x110 (unreliable) [ 575.267967] [c00000076f087b80] [c000000001a84f1c] kobject_put+0x3c/0xa0 [ 575.267972] [c00000076f087bf0] [c000000001d239b4] of_node_put+0x24/0x40 [ 575.267976] [c00000076f087c10] [c00000000165ce74] ofdt_write+0x204/0x6b0 [ 575.267980] [c00000076f087cd0] [c00000000197bde0] proc_reg_write+0x80/0xd0 [ 575.267984] [c00000076f087d00] [c0000000018df040] __vfs_write+0x40/0x1c0 [ 575.267987] [c00000076f087d90] [c0000000018e0998] vfs_write+0xc8/0x240 [ 575.267991] [c00000076f087de0] [c0000000018e2600] SyS_write+0x60/0x110 [ 575.267995] [c00000076f087e30] [c0000000015cb184] system_call+0x38/0xe0 [ 575.267997] Instruction dump: [ 575.267999] 7863d182 4e800020 7c0802a6 39200001 3d42fff8 3c62ffb1 386306a0 992afd41 [ 575.268004] f8010010 f821ffa1 48434be5 60000000 <0fe00000> 38210060 38600000 e8010010 [ 575.268010] ---[ end trace e6c0a4371a0aa4e2 ]— Thanks -Sachin
I get the error when removing a CPU that has been hotplugged after boot. On 03/14/2017 03:42 PM, Tyrel Datwyler wrote: > On 03/13/2017 03:29 AM, Bharata B Rao wrote: >> On Thu, Mar 09, 2017 at 01:34:00PM -0800, Tyrel Datwyler wrote: >>> On 03/08/2017 08:37 PM, Bharata B Rao wrote: >>>> The following warning is seen when a CPU is hot unplugged on a PowerKVM >>>> guest: >>> >>> Is this the case with cpus present at boot? What about cpus hotplugged >>> after boot? >> >> I have observed this for CPUs that are hotplugged. > > If removing a cpu present at boot works, but removing one that has been > hotplugged after boot reproduces the problem it is more likely the case > that we failed to take a reference during hotplug or released a > reference we shouldn't have. I'd have to go look at the hot add path. > >> >>> >>> My suspicion is that the refcount was wrong to begin with. See my >>> comments below. The use of the of_node_put() calls is correct as in each >>> case we incremented the ref count earlier in the same function. >>> >>>> >>>> refcount_t: underflow; use-after-free. >>>> ------------[ cut here ]------------ >>>> WARNING: CPU: 0 PID: 53 at lib/refcount.c:128 refcount_sub_and_test+0xd8/0xf0 >>>> Modules linked in: >>>> CPU: 0 PID: 53 Comm: kworker/u510:1 Not tainted 4.11.0-rc1 #3 >>>> Workqueue: pseries hotplug workque pseries_hp_work_fn >>>> task: c0000000fb475000 task.stack: c0000000fb81c000 >>>> NIP: c0000000006f0808 LR: c0000000006f0804 CTR: c0000000007b98c0 >>>> REGS: c0000000fb81f710 TRAP: 0700 Not tainted (4.11.0-rc1) >>>> MSR: 800000000282b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> >>>> CR: 48002222 XER: 20000000 >>>> CFAR: c000000000c438e0 SOFTE: 1 >>>> GPR00: c0000000006f0804 c0000000fb81f990 c000000001573b00 0000000000000026 >>>> GPR04: 0000000000000000 000000000000016c 667265652e0d0a73 652d61667465722d >>>> GPR08: 0000000000000007 0000000000000007 0000000000000001 0000000000000006 >>>> GPR12: 0000000000002200 c00000000ff40000 c00000000010c578 c0000001f11b9f40 >>>> GPR16: c0000001fe0312a8 c0000001fe031078 c0000001fe031020 0000000000000001 >>>> GPR20: 0000000000000000 0000000000000000 c000000001454808 fffffffffffffef7 >>>> GPR24: 0000000000000000 c0000001f1677648 0000000000000000 0000000000000000 >>>> GPR28: 0000000010000008 c000000000e4d3d8 0000000000000000 c0000001eaae07d8 >>>> NIP [c0000000006f0808] refcount_sub_and_test+0xd8/0xf0 >>>> LR [c0000000006f0804] refcount_sub_and_test+0xd4/0xf0 >>>> Call Trace: >>>> [c0000000fb81f990] [c0000000006f0804] refcount_sub_and_test+0xd4/0xf0 (unreliable) >>>> [c0000000fb81f9f0] [c0000000006d04b4] kobject_put+0x44/0x2a0 >>>> [c0000000fb81fa70] [c0000000009d5284] of_node_put+0x34/0x50 >>>> [c0000000fb81faa0] [c0000000000aceb8] dlpar_cpu_remove_by_index+0x108/0x130 >>>> [c0000000fb81fb30] [c0000000000ae128] dlpar_cpu+0x78/0x550 >>>> [c0000000fb81fbe0] [c0000000000a7b40] handle_dlpar_errorlog+0xc0/0x160 >>>> [c0000000fb81fc50] [c0000000000a7c74] pseries_hp_work_fn+0x94/0xa0 >>>> [c0000000fb81fc80] [c000000000102cec] process_one_work+0x23c/0x540 >>>> [c0000000fb81fd20] [c00000000010309c] worker_thread+0xac/0x620 >>>> [c0000000fb81fdc0] [c00000000010c6c4] kthread+0x154/0x1a0 >>>> [c0000000fb81fe30] [c00000000000bbe0] ret_from_kernel_thread+0x5c/0x7c >>>> >>>> Fix this by ensuring that of_node_put() is called only from the >>>> error path in dlpar_cpu_remove_by_index(). In the normal path, >>>> of_node_put() happens as part of dlpar_detach_node(). >>>> >>>> Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> >>>> Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com> >>>> --- >>>> Changes in v1: >>>> - Fixed the refcount problem in the userspace driven unplug path >>>> in addition to in-kernel unplug path. (Sachin Sant) >>>> >>>> v0: https://patchwork.ozlabs.org/patch/736547/ >>>> >>>> arch/powerpc/platforms/pseries/hotplug-cpu.c | 12 ++++++++---- >>>> 1 file changed, 8 insertions(+), 4 deletions(-) >>>> >>>> diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c b/arch/powerpc/platforms/pseries/hotplug-cpu.c >>>> index 7bc0e91..c5ed510 100644 >>>> --- a/arch/powerpc/platforms/pseries/hotplug-cpu.c >>>> +++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c >>>> @@ -619,7 +619,8 @@ static int dlpar_cpu_remove_by_index(u32 drc_index) >>>> } >>>> >>>> rc = dlpar_cpu_remove(dn, drc_index); >>>> - of_node_put(dn); >>>> + if (rc) >>>> + of_node_put(dn); >>> >>> I think there is another issue at play here because this is wrong. >>> Regardless of whether the dlpar_cpu_remove() succeeds or fails we still >>> need of_node_put() for both cases because we incremented the ref count >>> earlier in this function with a call to cpu_drc_index_to_dn() call. That >>> call doesn't, but shoul, document that it returns a device_node with >>> incremented refcount. >>> >>>> return rc; >>>> } >>>> >>>> @@ -856,9 +857,12 @@ static ssize_t dlpar_cpu_release(const char *buf, size_t count) >>>> } >>>> >>>> rc = dlpar_cpu_remove(dn, drc_index); >>>> - of_node_put(dn); >>>> - >>>> - return rc ? rc : count; >>>> + if (rc) { >>>> + of_node_put(dn); >>>> + return rc; >>>> + } else { >>>> + return count; >>>> + } >>> >>> Same comment as above. The call earlier in the function to >>> of_find_node_by_path() returned a device_node struct with its ref count >>> incremented. So, regardless of whether dlpar_cpu_remove() succeeds or >>> fails we need decrement the ref count with of_node_put(). >>> >>> Looking closer at the call paths for attach and detach one will notice >>> that __of_attach_node_sysfs() does not take a device_node reference with >>> of_node_get(), but __of_detach_node_sysfs() does a of_node_put(). In the >>> old days we use to keep the device tree in /proc. Now it lives in sysfs >>> and is symlinked to /proc for userspace ABI reasons. Further, pseries >>> was the only platform in those days that did any sort of dynamic OF >>> operations. So, in those dark days we were responsible for calling >>> of_node_put in dlpar_detach_node() to decrement the of_node_init() >>> reference. Looking at the comments in __of_detach_node_sysfs() it seems >>> that they expect to decrement that reference there now. >>> >>> void __of_detach_node_sysfs(struct device_node *np) >>> { >>> >>> ...snip... >>> >>> /* finally remove the kobj_init ref */ >>> of_node_put(np); >>> } >>> >> >> So you suggest that adding of_node_get() to __of_attach_node_sysfs() >> is the right fix ? > > If I understand that this only creates for hot-added cpus then no. Also > for this to be the correct fix I would expect to see this recreate for > all hot-remove operations such as memory and pci devices as well. > > -Tyrel > >> >> Regards, >> Bharata. >> > >
Le 09/03/2017 à 05:37, Bharata B Rao a écrit : > The following warning is seen when a CPU is hot unplugged on a PowerKVM > guest: > > refcount_t: underflow; use-after-free. > ------------[ cut here ]------------ > WARNING: CPU: 0 PID: 53 at lib/refcount.c:128 refcount_sub_and_test+0xd8/0xf0 > Modules linked in: > CPU: 0 PID: 53 Comm: kworker/u510:1 Not tainted 4.11.0-rc1 #3 > Workqueue: pseries hotplug workque pseries_hp_work_fn > task: c0000000fb475000 task.stack: c0000000fb81c000 > NIP: c0000000006f0808 LR: c0000000006f0804 CTR: c0000000007b98c0 > REGS: c0000000fb81f710 TRAP: 0700 Not tainted (4.11.0-rc1) > MSR: 800000000282b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> > CR: 48002222 XER: 20000000 > CFAR: c000000000c438e0 SOFTE: 1 > GPR00: c0000000006f0804 c0000000fb81f990 c000000001573b00 0000000000000026 > GPR04: 0000000000000000 000000000000016c 667265652e0d0a73 652d61667465722d > GPR08: 0000000000000007 0000000000000007 0000000000000001 0000000000000006 > GPR12: 0000000000002200 c00000000ff40000 c00000000010c578 c0000001f11b9f40 > GPR16: c0000001fe0312a8 c0000001fe031078 c0000001fe031020 0000000000000001 > GPR20: 0000000000000000 0000000000000000 c000000001454808 fffffffffffffef7 > GPR24: 0000000000000000 c0000001f1677648 0000000000000000 0000000000000000 > GPR28: 0000000010000008 c000000000e4d3d8 0000000000000000 c0000001eaae07d8 > NIP [c0000000006f0808] refcount_sub_and_test+0xd8/0xf0 > LR [c0000000006f0804] refcount_sub_and_test+0xd4/0xf0 > Call Trace: > [c0000000fb81f990] [c0000000006f0804] refcount_sub_and_test+0xd4/0xf0 (unreliable) > [c0000000fb81f9f0] [c0000000006d04b4] kobject_put+0x44/0x2a0 > [c0000000fb81fa70] [c0000000009d5284] of_node_put+0x34/0x50 > [c0000000fb81faa0] [c0000000000aceb8] dlpar_cpu_remove_by_index+0x108/0x130 > [c0000000fb81fb30] [c0000000000ae128] dlpar_cpu+0x78/0x550 > [c0000000fb81fbe0] [c0000000000a7b40] handle_dlpar_errorlog+0xc0/0x160 > [c0000000fb81fc50] [c0000000000a7c74] pseries_hp_work_fn+0x94/0xa0 > [c0000000fb81fc80] [c000000000102cec] process_one_work+0x23c/0x540 > [c0000000fb81fd20] [c00000000010309c] worker_thread+0xac/0x620 > [c0000000fb81fdc0] [c00000000010c6c4] kthread+0x154/0x1a0 > [c0000000fb81fe30] [c00000000000bbe0] ret_from_kernel_thread+0x5c/0x7c > > Fix this by ensuring that of_node_put() is called only from the > error path in dlpar_cpu_remove_by_index(). In the normal path, > of_node_put() happens as part of dlpar_detach_node(). > > Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> > Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com> Is this patch still needed ? Thanks Christophe > --- > Changes in v1: > - Fixed the refcount problem in the userspace driven unplug path > in addition to in-kernel unplug path. (Sachin Sant) > > v0: https://patchwork.ozlabs.org/patch/736547/ > > arch/powerpc/platforms/pseries/hotplug-cpu.c | 12 ++++++++---- > 1 file changed, 8 insertions(+), 4 deletions(-) > > diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c b/arch/powerpc/platforms/pseries/hotplug-cpu.c > index 7bc0e91..c5ed510 100644 > --- a/arch/powerpc/platforms/pseries/hotplug-cpu.c > +++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c > @@ -619,7 +619,8 @@ static int dlpar_cpu_remove_by_index(u32 drc_index) > } > > rc = dlpar_cpu_remove(dn, drc_index); > - of_node_put(dn); > + if (rc) > + of_node_put(dn); > return rc; > } > > @@ -856,9 +857,12 @@ static ssize_t dlpar_cpu_release(const char *buf, size_t count) > } > > rc = dlpar_cpu_remove(dn, drc_index); > - of_node_put(dn); > - > - return rc ? rc : count; > + if (rc) { > + of_node_put(dn); > + return rc; > + } else { > + return count; > + } > } > > #endif /* CONFIG_ARCH_CPU_PROBE_RELEASE */
diff --git a/arch/powerpc/platforms/pseries/hotplug-cpu.c b/arch/powerpc/platforms/pseries/hotplug-cpu.c index 7bc0e91..c5ed510 100644 --- a/arch/powerpc/platforms/pseries/hotplug-cpu.c +++ b/arch/powerpc/platforms/pseries/hotplug-cpu.c @@ -619,7 +619,8 @@ static int dlpar_cpu_remove_by_index(u32 drc_index) } rc = dlpar_cpu_remove(dn, drc_index); - of_node_put(dn); + if (rc) + of_node_put(dn); return rc; } @@ -856,9 +857,12 @@ static ssize_t dlpar_cpu_release(const char *buf, size_t count) } rc = dlpar_cpu_remove(dn, drc_index); - of_node_put(dn); - - return rc ? rc : count; + if (rc) { + of_node_put(dn); + return rc; + } else { + return count; + } } #endif /* CONFIG_ARCH_CPU_PROBE_RELEASE */
The following warning is seen when a CPU is hot unplugged on a PowerKVM guest: refcount_t: underflow; use-after-free. ------------[ cut here ]------------ WARNING: CPU: 0 PID: 53 at lib/refcount.c:128 refcount_sub_and_test+0xd8/0xf0 Modules linked in: CPU: 0 PID: 53 Comm: kworker/u510:1 Not tainted 4.11.0-rc1 #3 Workqueue: pseries hotplug workque pseries_hp_work_fn task: c0000000fb475000 task.stack: c0000000fb81c000 NIP: c0000000006f0808 LR: c0000000006f0804 CTR: c0000000007b98c0 REGS: c0000000fb81f710 TRAP: 0700 Not tainted (4.11.0-rc1) MSR: 800000000282b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 48002222 XER: 20000000 CFAR: c000000000c438e0 SOFTE: 1 GPR00: c0000000006f0804 c0000000fb81f990 c000000001573b00 0000000000000026 GPR04: 0000000000000000 000000000000016c 667265652e0d0a73 652d61667465722d GPR08: 0000000000000007 0000000000000007 0000000000000001 0000000000000006 GPR12: 0000000000002200 c00000000ff40000 c00000000010c578 c0000001f11b9f40 GPR16: c0000001fe0312a8 c0000001fe031078 c0000001fe031020 0000000000000001 GPR20: 0000000000000000 0000000000000000 c000000001454808 fffffffffffffef7 GPR24: 0000000000000000 c0000001f1677648 0000000000000000 0000000000000000 GPR28: 0000000010000008 c000000000e4d3d8 0000000000000000 c0000001eaae07d8 NIP [c0000000006f0808] refcount_sub_and_test+0xd8/0xf0 LR [c0000000006f0804] refcount_sub_and_test+0xd4/0xf0 Call Trace: [c0000000fb81f990] [c0000000006f0804] refcount_sub_and_test+0xd4/0xf0 (unreliable) [c0000000fb81f9f0] [c0000000006d04b4] kobject_put+0x44/0x2a0 [c0000000fb81fa70] [c0000000009d5284] of_node_put+0x34/0x50 [c0000000fb81faa0] [c0000000000aceb8] dlpar_cpu_remove_by_index+0x108/0x130 [c0000000fb81fb30] [c0000000000ae128] dlpar_cpu+0x78/0x550 [c0000000fb81fbe0] [c0000000000a7b40] handle_dlpar_errorlog+0xc0/0x160 [c0000000fb81fc50] [c0000000000a7c74] pseries_hp_work_fn+0x94/0xa0 [c0000000fb81fc80] [c000000000102cec] process_one_work+0x23c/0x540 [c0000000fb81fd20] [c00000000010309c] worker_thread+0xac/0x620 [c0000000fb81fdc0] [c00000000010c6c4] kthread+0x154/0x1a0 [c0000000fb81fe30] [c00000000000bbe0] ret_from_kernel_thread+0x5c/0x7c Fix this by ensuring that of_node_put() is called only from the error path in dlpar_cpu_remove_by_index(). In the normal path, of_node_put() happens as part of dlpar_detach_node(). Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Cc: Nathan Fontenot <nfont@linux.vnet.ibm.com> --- Changes in v1: - Fixed the refcount problem in the userspace driven unplug path in addition to in-kernel unplug path. (Sachin Sant) v0: https://patchwork.ozlabs.org/patch/736547/ arch/powerpc/platforms/pseries/hotplug-cpu.c | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-)