diff mbox series

powerpc/powernv/prd: Unregister OPAL_MSG_PRD2 notifier during module unload

Message ID 20210928120933.196571-1-hegdevasant@linux.vnet.ibm.com (mailing list archive)
State Superseded, archived
Headers show
Series powerpc/powernv/prd: Unregister OPAL_MSG_PRD2 notifier during module unload | expand
Related show

Checks

Context Check Description
snowpatch_ozlabs/github-powerpc_selftests success Successfully ran 8 jobs.
snowpatch_ozlabs/github-powerpc_ppctests success Successfully ran 8 jobs.
snowpatch_ozlabs/github-powerpc_clang success Successfully ran 7 jobs.
snowpatch_ozlabs/github-powerpc_sparse success Successfully ran 4 jobs.
snowpatch_ozlabs/github-powerpc_kernel_qemu success Successfully ran 24 jobs.

Commit Message

Vasant Hegde Sept. 28, 2021, 12:09 p.m. UTC
Commit 587164cd, introduced new opal message type (OPAL_MSG_PRD2) and added
opal notifier. But I missed to unregister the notifier during module unload
path. This results in below call trace if you try to unload and load
opal_prd module.

Sample calltrace (modprobe -r opal_prd; modprobe opal_prd)
  [  213.335261] BUG: Unable to handle kernel data access on read at 0xc0080000192200e0
  [  213.335287] Faulting instruction address: 0xc00000000018d1cc
  [  213.335301] Oops: Kernel access of bad area, sig: 11 [#1]
  [  213.335313] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA PowerNV
  [  213.335736] CPU: 66 PID: 7446 Comm: modprobe Kdump: loaded Tainted: G            E     5.14.0prd #759
  [  213.335772] NIP:  c00000000018d1cc LR: c00000000018d2a8 CTR: c0000000000cde10
  [  213.335805] REGS: c0000003c4c0f0a0 TRAP: 0300   Tainted: G            E      (5.14.0prd)
  [  213.335848] MSR:  9000000002009033 <SF,HV,VEC,EE,ME,IR,DR,RI,LE>  CR: 24224824  XER: 20040000
  [  213.335893] CFAR: c00000000018d2a4 DAR: c0080000192200e0 DSISR: 40000000 IRQMASK: 1
  [  213.335893] GPR00: c00000000018d2a8 c0000003c4c0f340 c000000001995300 c000000001a5ad08
  [  213.335893] GPR04: c00800000e3700d0 c0000003c4c0f434 c0000000010a8c08 6e616d6500000000
  [  213.335893] GPR08: 0000000000000000 c0080000192200d0 0000000000000001 c00800000e351020
  [  213.335893] GPR12: c0000000000cde10 c000000ffffecb80 c0000003c4c0fd00 0000000000000000
  [  213.335893] GPR16: 0000000000000990 c00800000d950000 c00800000d950990 c00000000103fd10
  [  213.335893] GPR20: c0000003c4c0fbc0 0000000000000001 c0000003c4c0fbc0 c00800000e370498
  [  213.335893] GPR24: 0000000000000000 c000000000dab5c8 c000000001a5ad18 c000000001a5ac80
  [  213.335893] GPR28: 0000000000000008 0000000000000001 c000000001a5ad00 c000000001a5ad00
  [  213.336170] NIP [c00000000018d1cc] notifier_chain_register+0x2c/0xc0
  [  213.336205] LR [c00000000018d2a8] atomic_notifier_chain_register+0x48/0x80
  [  213.336238] Call Trace:
  [  213.336255] [c0000003c4c0f340] [c000000002090610] 0xc000000002090610 (unreliable)
  [  213.336281] [c0000003c4c0f3a0] [c00000000018d2b8] atomic_notifier_chain_register+0x58/0x80
  [  213.336309] [c0000003c4c0f3f0] [c0000000000cde8c] opal_message_notifier_register+0x7c/0x1e0
  [  213.336345] [c0000003c4c0f4b0] [c00800000e3508ac] opal_prd_probe+0x84/0x150 [opal_prd]
  [  213.336382] [c0000003c4c0f530] [c00000000097acc8] platform_probe+0x78/0x130
  [  213.336416] [c0000003c4c0f5b0] [c000000000976520] really_probe+0x110/0x5d0
  [  213.336467] [c0000003c4c0f630] [c000000000976b5c] __driver_probe_device+0x17c/0x230
  [  213.336512] [c0000003c4c0f6b0] [c000000000976c70] driver_probe_device+0x60/0x130
  [  213.336556] [c0000003c4c0f700] [c00000000097746c] __driver_attach+0xfc/0x220
  [  213.336592] [c0000003c4c0f780] [c000000000972e68] bus_for_each_dev+0xa8/0x130
  [  213.336627] [c0000003c4c0f7e0] [c000000000975b04] driver_attach+0x34/0x50
  [  213.336661] [c0000003c4c0f800] [c000000000974e20] bus_add_driver+0x1b0/0x300
  [  213.336696] [c0000003c4c0f890] [c000000000978468] driver_register+0x98/0x1a0
  [  213.336732] [c0000003c4c0f900] [c00000000097a878] __platform_driver_register+0x38/0x50
  [  213.336768] [c0000003c4c0f920] [c00800000e350dc0] opal_prd_driver_init+0x34/0x50 [opal_prd]
  [  213.336804] [c0000003c4c0f940] [c000000000012410] do_one_initcall+0x60/0x2d0
  [  213.336821] [c0000003c4c0fa10] [c000000000262c8c] do_init_module+0x7c/0x320
  [  213.336845] [c0000003c4c0fa90] [c000000000266544] load_module+0x3394/0x3650
  [  213.336880] [c0000003c4c0fc90] [c000000000266b34] __do_sys_finit_module+0xd4/0x160
  [  213.336914] [c0000003c4c0fdb0] [c000000000031e00] system_call_exception+0x140/0x290
  [  213.336958] [c0000003c4c0fe10] [c00000000000c764] system_call_common+0xf4/0x258

Fixes: 587164cd ("powerpc/powernv: Add new opal message type")
Signed-off-by: Vasant Hegde <hegdevasant@linux.vnet.ibm.com>
---
 arch/powerpc/platforms/powernv/opal-prd.c | 4 ++++
 1 file changed, 4 insertions(+)

Comments

Daniel Axtens Oct. 1, 2021, 6:10 a.m. UTC | #1
Hi Vasant,

> Commit 587164cd, introduced new opal message type (OPAL_MSG_PRD2) and added
> opal notifier. But I missed to unregister the notifier during module unload
> path. This results in below call trace if you try to unload and load
> opal_prd module.
>
> Fixes: 587164cd ("powerpc/powernv: Add new opal message type")

In reviewing this patch, I've become a bit worried the underlying patch
has another issue that we should fix.

Consider opal_prd_probe and the opal_prd_event_nb:

static struct notifier_block opal_prd_event_nb = {
	.notifier_call	= opal_prd_msg_notifier,
	.next		= NULL,
	.priority	= 0,
};

static int opal_prd_probe(struct platform_device *pdev)
{
...
	rc = opal_message_notifier_register(OPAL_MSG_PRD, &opal_prd_event_nb);
	if (rc) { ... }

	rc = opal_message_notifier_register(OPAL_MSG_PRD2, &opal_prd_event_nb);
	if (rc) { ... }

I don't think you can reuse a single notifier block for multiple
notifier_register calls. opal_message_notify_register calls
atomic_notifier_chain_register which takes a spinlock and calls
notifer_chain_register which reads:

static int notifier_chain_register(struct notifier_block **nl,
		struct notifier_block *n)
{
	while ((*nl) != NULL) {
        // ... skip some error detection
        // ... find the right spot in the list to add this
		if (n->priority > (*nl)->priority)
			break;
		nl = &((*nl)->next);
	}
	n->next = *nl; // <-- mutate the notifier block!!
	rcu_assign_pointer(*nl, n);
	return 0;
}

So we have the same notifier block getting inserted into two different
linked lists. This is unlikely to break at the moment because nothing
else registers an MSG_PRD/PRD2 notifier, but nonetheless I think you
need to use a different notifer_block struct for each list you insert
your notifier into.

Likewise this could cause other problems during removal, if there were
other entries in either notifier list.

Kind regards,
Daniel
Vasant Hegde Oct. 28, 2021, 4:57 p.m. UTC | #2
On 10/1/21 11:40 AM, Daniel Axtens wrote:
> Hi Vasant,
> 
>> Commit 587164cd, introduced new opal message type (OPAL_MSG_PRD2) and added
>> opal notifier. But I missed to unregister the notifier during module unload
>> path. This results in below call trace if you try to unload and load
>> opal_prd module.
>>
>> Fixes: 587164cd ("powerpc/powernv: Add new opal message type")
> 
> In reviewing this patch, I've become a bit worried the underlying patch
> has another issue that we should fix.

Thanks for the review. I have addressed below comments in v2.

-Vasant
diff mbox series

Patch

diff --git a/arch/powerpc/platforms/powernv/opal-prd.c b/arch/powerpc/platforms/powernv/opal-prd.c
index a191f4c60ce7..83305615db9e 100644
--- a/arch/powerpc/platforms/powernv/opal-prd.c
+++ b/arch/powerpc/platforms/powernv/opal-prd.c
@@ -393,6 +393,7 @@  static int opal_prd_probe(struct platform_device *pdev)
 	rc = opal_message_notifier_register(OPAL_MSG_PRD2, &opal_prd_event_nb);
 	if (rc) {
 		pr_err("Couldn't register PRD2 event notifier\n");
+		opal_message_notifier_unregister(OPAL_MSG_PRD, &opal_prd_event_nb);
 		return rc;
 	}
 
@@ -401,6 +402,8 @@  static int opal_prd_probe(struct platform_device *pdev)
 		pr_err("failed to register miscdev\n");
 		opal_message_notifier_unregister(OPAL_MSG_PRD,
 				&opal_prd_event_nb);
+		opal_message_notifier_unregister(OPAL_MSG_PRD2,
+				&opal_prd_event_nb);
 		return rc;
 	}
 
@@ -411,6 +414,7 @@  static int opal_prd_remove(struct platform_device *pdev)
 {
 	misc_deregister(&opal_prd_dev);
 	opal_message_notifier_unregister(OPAL_MSG_PRD, &opal_prd_event_nb);
+	opal_message_notifier_unregister(OPAL_MSG_PRD2, &opal_prd_event_nb);
 	return 0;
 }