Message ID | 20220115072020.93524-1-atrajeev@linux.vnet.ibm.com (mailing list archive) |
---|---|
State | Changes Requested |
Headers | show |
Series | powerpc/perf: Fix power_pmu_disable to call clear_pmi_irq_pending only if PMI is pending | expand |
Context | Check | Description |
---|---|---|
snowpatch_ozlabs/github-powerpc_ppctests | success | Successfully ran 8 jobs. |
snowpatch_ozlabs/github-powerpc_selftests | success | Successfully ran 8 jobs. |
snowpatch_ozlabs/github-powerpc_sparse | success | Successfully ran 4 jobs. |
snowpatch_ozlabs/github-powerpc_kernel_qemu | success | Successfully ran 24 jobs. |
snowpatch_ozlabs/github-powerpc_clang | success | Successfully ran 7 jobs. |
Excerpts from Athira Rajeev's message of January 15, 2022 5:20 pm: > Running selftest with CONFIG_PPC_IRQ_SOFT_MASK_DEBUG enabled in kernel > triggered below warning: > > [ 172.851380] ------------[ cut here ]------------ > [ 172.851391] WARNING: CPU: 8 PID: 2901 at arch/powerpc/include/asm/hw_irq.h:246 power_pmu_disable+0x270/0x280 > [ 172.851402] Modules linked in: dm_mod bonding nft_ct nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables rfkill nfnetlink sunrpc xfs libcrc32c pseries_rng xts vmx_crypto uio_pdrv_genirq uio sch_fq_codel ip_tables ext4 mbcache jbd2 sd_mod t10_pi sg ibmvscsi ibmveth scsi_transport_srp fuse > [ 172.851442] CPU: 8 PID: 2901 Comm: lost_exception_ Not tainted 5.16.0-rc5-03218-g798527287598 #2 > [ 172.851451] NIP: c00000000013d600 LR: c00000000013d5a4 CTR: c00000000013b180 > [ 172.851458] REGS: c000000017687860 TRAP: 0700 Not tainted (5.16.0-rc5-03218-g798527287598) > [ 172.851465] MSR: 8000000000029033 <SF,EE,ME,IR,DR,RI,LE> CR: 48004884 XER: 20040000 > [ 172.851482] CFAR: c00000000013d5b4 IRQMASK: 1 > [ 172.851482] GPR00: c00000000013d5a4 c000000017687b00 c000000002a10600 0000000000000004 > [ 172.851482] GPR04: 0000000082004000 c0000008ba08f0a8 0000000000000000 00000008b7ed0000 > [ 172.851482] GPR08: 00000000446194f6 0000000000008000 c00000000013b118 c000000000d58e68 > [ 172.851482] GPR12: c00000000013d390 c00000001ec54a80 0000000000000000 0000000000000000 > [ 172.851482] GPR16: 0000000000000000 0000000000000000 c000000015d5c708 c0000000025396d0 > [ 172.851482] GPR20: 0000000000000000 0000000000000000 c00000000a3bbf40 0000000000000003 > [ 172.851482] GPR24: 0000000000000000 c0000008ba097400 c0000000161e0d00 c00000000a3bb600 > [ 172.851482] GPR28: c000000015d5c700 0000000000000001 0000000082384090 c0000008ba0020d8 > [ 172.851549] NIP [c00000000013d600] power_pmu_disable+0x270/0x280 > [ 172.851557] LR [c00000000013d5a4] power_pmu_disable+0x214/0x280 > [ 172.851565] Call Trace: > [ 172.851568] [c000000017687b00] [c00000000013d5a4] power_pmu_disable+0x214/0x280 (unreliable) > [ 172.851579] [c000000017687b40] [c0000000003403ac] perf_pmu_disable+0x4c/0x60 > [ 172.851588] [c000000017687b60] [c0000000003445e4] __perf_event_task_sched_out+0x1d4/0x660 > [ 172.851596] [c000000017687c50] [c000000000d1175c] __schedule+0xbcc/0x12a0 > [ 172.851602] [c000000017687d60] [c000000000d11ea8] schedule+0x78/0x140 > [ 172.851608] [c000000017687d90] [c0000000001a8080] sys_sched_yield+0x20/0x40 > [ 172.851615] [c000000017687db0] [c0000000000334dc] system_call_exception+0x18c/0x380 > [ 172.851622] [c000000017687e10] [c00000000000c74c] system_call_common+0xec/0x268 > > The warning indicates that MSR_EE being set(interrupt enabled) when > there was an overflown PMC detected. This could happen in > power_pmu_disable since it runs under interrupt soft disable > condition ( local_irq_save ) and not with interrupts hard disabled. > commit 2c9ac51b850d ("powerpc/perf: Fix PMU callbacks to clear > pending PMI before resetting an overflown PMC") intended to clear > PMI pending bit in Paca when disabling the PMU. It could happen > that PMC gets overflown while code is in power_pmu_disable > callback function. Hence add a check to see if PMI pending bit > is set in Paca before clearing it via clear_pmi_pending. > > Fixes: 2c9ac51b850d ("powerpc/perf: Fix PMU callbacks to clear pending PMI before resetting an overflown PMC") > Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> > Reported-by: Sachin Sant <sachinp@linux.ibm.com> > Tested-by: Sachin Sant <sachinp@linux.ibm.com> > --- > Note: Address the warning reported here: > https://lists.ozlabs.org/pipermail/linuxppc-dev/2021-December/238190.html > Patch is on top of powerpc/merge > > arch/powerpc/perf/core-book3s.c | 7 ++++--- > 1 file changed, 4 insertions(+), 3 deletions(-) > > diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c > index a684901b6965..dfb0ea0f0df3 100644 > --- a/arch/powerpc/perf/core-book3s.c > +++ b/arch/powerpc/perf/core-book3s.c > @@ -1327,9 +1327,10 @@ static void power_pmu_disable(struct pmu *pmu) > * Otherwise provide a warning if there is PMI pending, but > * no counter is found overflown. > */ > - if (any_pmc_overflown(cpuhw)) > - clear_pmi_irq_pending(); > - else > + if (any_pmc_overflown(cpuhw)) { > + if (pmi_irq_pending()) > + clear_pmi_irq_pending(); And this works because MSR[EE] gets disabled by the masked interrupt handler if we have a PMI irq pending, is that right? Can I be a pain and just ask for a little comment there otherwise I'll forget about it next time. Thanks, Nick > + } else > WARN_ON(pmi_irq_pending()); > > val = mmcra = cpuhw->mmcr.mmcra; > -- > 2.33.0 > >
> On 19-Jan-2022, at 9:54 AM, Nicholas Piggin <npiggin@gmail.com> wrote: > > Excerpts from Athira Rajeev's message of January 15, 2022 5:20 pm: >> Running selftest with CONFIG_PPC_IRQ_SOFT_MASK_DEBUG enabled in kernel >> triggered below warning: >> >> [ 172.851380] ------------[ cut here ]------------ >> [ 172.851391] WARNING: CPU: 8 PID: 2901 at arch/powerpc/include/asm/hw_irq.h:246 power_pmu_disable+0x270/0x280 >> [ 172.851402] Modules linked in: dm_mod bonding nft_ct nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables rfkill nfnetlink sunrpc xfs libcrc32c pseries_rng xts vmx_crypto uio_pdrv_genirq uio sch_fq_codel ip_tables ext4 mbcache jbd2 sd_mod t10_pi sg ibmvscsi ibmveth scsi_transport_srp fuse >> [ 172.851442] CPU: 8 PID: 2901 Comm: lost_exception_ Not tainted 5.16.0-rc5-03218-g798527287598 #2 >> [ 172.851451] NIP: c00000000013d600 LR: c00000000013d5a4 CTR: c00000000013b180 >> [ 172.851458] REGS: c000000017687860 TRAP: 0700 Not tainted (5.16.0-rc5-03218-g798527287598) >> [ 172.851465] MSR: 8000000000029033 <SF,EE,ME,IR,DR,RI,LE> CR: 48004884 XER: 20040000 >> [ 172.851482] CFAR: c00000000013d5b4 IRQMASK: 1 >> [ 172.851482] GPR00: c00000000013d5a4 c000000017687b00 c000000002a10600 0000000000000004 >> [ 172.851482] GPR04: 0000000082004000 c0000008ba08f0a8 0000000000000000 00000008b7ed0000 >> [ 172.851482] GPR08: 00000000446194f6 0000000000008000 c00000000013b118 c000000000d58e68 >> [ 172.851482] GPR12: c00000000013d390 c00000001ec54a80 0000000000000000 0000000000000000 >> [ 172.851482] GPR16: 0000000000000000 0000000000000000 c000000015d5c708 c0000000025396d0 >> [ 172.851482] GPR20: 0000000000000000 0000000000000000 c00000000a3bbf40 0000000000000003 >> [ 172.851482] GPR24: 0000000000000000 c0000008ba097400 c0000000161e0d00 c00000000a3bb600 >> [ 172.851482] GPR28: c000000015d5c700 0000000000000001 0000000082384090 c0000008ba0020d8 >> [ 172.851549] NIP [c00000000013d600] power_pmu_disable+0x270/0x280 >> [ 172.851557] LR [c00000000013d5a4] power_pmu_disable+0x214/0x280 >> [ 172.851565] Call Trace: >> [ 172.851568] [c000000017687b00] [c00000000013d5a4] power_pmu_disable+0x214/0x280 (unreliable) >> [ 172.851579] [c000000017687b40] [c0000000003403ac] perf_pmu_disable+0x4c/0x60 >> [ 172.851588] [c000000017687b60] [c0000000003445e4] __perf_event_task_sched_out+0x1d4/0x660 >> [ 172.851596] [c000000017687c50] [c000000000d1175c] __schedule+0xbcc/0x12a0 >> [ 172.851602] [c000000017687d60] [c000000000d11ea8] schedule+0x78/0x140 >> [ 172.851608] [c000000017687d90] [c0000000001a8080] sys_sched_yield+0x20/0x40 >> [ 172.851615] [c000000017687db0] [c0000000000334dc] system_call_exception+0x18c/0x380 >> [ 172.851622] [c000000017687e10] [c00000000000c74c] system_call_common+0xec/0x268 >> >> The warning indicates that MSR_EE being set(interrupt enabled) when >> there was an overflown PMC detected. This could happen in >> power_pmu_disable since it runs under interrupt soft disable >> condition ( local_irq_save ) and not with interrupts hard disabled. >> commit 2c9ac51b850d ("powerpc/perf: Fix PMU callbacks to clear >> pending PMI before resetting an overflown PMC") intended to clear >> PMI pending bit in Paca when disabling the PMU. It could happen >> that PMC gets overflown while code is in power_pmu_disable >> callback function. Hence add a check to see if PMI pending bit >> is set in Paca before clearing it via clear_pmi_pending. >> >> Fixes: 2c9ac51b850d ("powerpc/perf: Fix PMU callbacks to clear pending PMI before resetting an overflown PMC") >> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> >> Reported-by: Sachin Sant <sachinp@linux.ibm.com> >> Tested-by: Sachin Sant <sachinp@linux.ibm.com> >> --- >> Note: Address the warning reported here: >> https://lists.ozlabs.org/pipermail/linuxppc-dev/2021-December/238190.html >> Patch is on top of powerpc/merge >> >> arch/powerpc/perf/core-book3s.c | 7 ++++--- >> 1 file changed, 4 insertions(+), 3 deletions(-) >> >> diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c >> index a684901b6965..dfb0ea0f0df3 100644 >> --- a/arch/powerpc/perf/core-book3s.c >> +++ b/arch/powerpc/perf/core-book3s.c >> @@ -1327,9 +1327,10 @@ static void power_pmu_disable(struct pmu *pmu) >> * Otherwise provide a warning if there is PMI pending, but >> * no counter is found overflown. >> */ >> - if (any_pmc_overflown(cpuhw)) >> - clear_pmi_irq_pending(); >> - else >> + if (any_pmc_overflown(cpuhw)) { >> + if (pmi_irq_pending()) >> + clear_pmi_irq_pending(); > > And this works because MSR[EE] gets disabled by the masked interrupt > handler if we have a PMI irq pending, is that right? > > Can I be a pain and just ask for a little comment there otherwise I'll > forget about it next time. Hi Nick, Thanks for the review. Yes that’s right, if there is a PMI pending, MSR[EE] will be disabled by the handler. I will add that in comment. Thanks Athira > > Thanks, > Nick > >> + } else >> WARN_ON(pmi_irq_pending()); >> >> val = mmcra = cpuhw->mmcr.mmcra; >> -- >> 2.33.0
diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c index a684901b6965..dfb0ea0f0df3 100644 --- a/arch/powerpc/perf/core-book3s.c +++ b/arch/powerpc/perf/core-book3s.c @@ -1327,9 +1327,10 @@ static void power_pmu_disable(struct pmu *pmu) * Otherwise provide a warning if there is PMI pending, but * no counter is found overflown. */ - if (any_pmc_overflown(cpuhw)) - clear_pmi_irq_pending(); - else + if (any_pmc_overflown(cpuhw)) { + if (pmi_irq_pending()) + clear_pmi_irq_pending(); + } else WARN_ON(pmi_irq_pending()); val = mmcra = cpuhw->mmcr.mmcra;