Message ID | 1606124997-3358-1-git-send-email-atrajeev@linux.vnet.ibm.com (mailing list archive) |
---|---|
State | Changes Requested |
Headers | show |
Series | powerpc/perf: Fix crash with 'is_sier_available' when pmu is not set | expand |
Context | Check | Description |
---|---|---|
snowpatch_ozlabs/apply_patch | success | Successfully applied on branch powerpc/merge (9d1aa2f025c6cc516125c42c70f6a9ce087c49ea) |
snowpatch_ozlabs/build-ppc64le | success | Build succeeded |
snowpatch_ozlabs/build-ppc64be | success | Build succeeded |
snowpatch_ozlabs/build-ppc64e | success | Build succeeded |
snowpatch_ozlabs/build-pmac32 | success | Build succeeded |
snowpatch_ozlabs/checkpatch | warning | total: 0 errors, 1 warnings, 0 checks, 9 lines checked |
snowpatch_ozlabs/needsstable | warning | Please consider tagging this patch for stable! |
> When perf record session started with "-I" option, capture registers > via intr-regs, on each sample ‘is_sier_available()'i is called to check > for the SIER ( Sample Instruction Event Register) availability in the > platform. This function in core-book3s access 'ppmu->flags'. If platform > specific pmu driver is not registered, ppmu is set to null and accessing > its members results in crash. Patch fixes this by returning false in > 'is_sier_available()' if 'ppmu' is not set. > > Fixes: 333804dc3b7a ("powerpc/perf: Update perf_regs structure to include SIER") > Reported-by: Sachin Sant <sachinp@linux.vnet.ibm.com> > Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Tested-by: Sachin Sant <sachinp@linux.vnet.ibm.com> Thanks -Sachin
Hi Athira, Athira Rajeev <atrajeev@linux.vnet.ibm.com> writes: > On systems without any platform specific PMU driver support registered or > Generic Compat PMU support registered, The compat PMU is registered just like other PMUs, so I don't see how we can crash like this if the compat PMU is active? ie. if we're using the compat PMU then ppmu will be non-NULL and point to generic_compat_pmu. > running 'perf record' with > —intr-regs will crash ( perf record -I <workload> ). > > The relevant portion from crash logs and Call Trace: > > Unable to handle kernel paging request for data at address 0x00000068 > Faulting instruction address: 0xc00000000013eb18 > Oops: Kernel access of bad area, sig: 11 [#1] > CPU: 2 PID: 13435 Comm: kill Kdump: loaded Not tainted 4.18.0-193.el8.ppc64le #1 > NIP: c00000000013eb18 LR: c000000000139f2c CTR: c000000000393d80 > REGS: c0000004a07ab4f0 TRAP: 0300 Not tainted (4.18.0-193.el8.ppc64le) > NIP [c00000000013eb18] is_sier_available+0x18/0x30 > LR [c000000000139f2c] perf_reg_value+0x6c/0xb0 > Call Trace: > [c0000004a07ab770] [c0000004a07ab7c8] 0xc0000004a07ab7c8 (unreliable) > [c0000004a07ab7a0] [c0000000003aa77c] perf_output_sample+0x60c/0xac0 > [c0000004a07ab840] [c0000000003ab3f0] perf_event_output_forward+0x70/0xb0 > [c0000004a07ab8c0] [c00000000039e208] __perf_event_overflow+0x88/0x1a0 > [c0000004a07ab910] [c00000000039e42c] perf_swevent_hrtimer+0x10c/0x1d0 > [c0000004a07abc50] [c000000000228b9c] __hrtimer_run_queues+0x17c/0x480 > [c0000004a07abcf0] [c00000000022aaf4] hrtimer_interrupt+0x144/0x520 > [c0000004a07abdd0] [c00000000002a864] timer_interrupt+0x104/0x2f0 > [c0000004a07abe30] [c0000000000091c4] decrementer_common+0x114/0x120 > > When perf record session started with "-I" option, capture registers ^ is > via intr-regs, "intr-regs" is just the full name for the -I option, so that kind of repeats itself. > on each sample ‘is_sier_available()'i is called to check ^ extra i The single quotes around is_sier_available() aren't necessary IMO. > for the SIER ( Sample Instruction Event Register) availability in the ^ stray space > platform. This function in core-book3s access 'ppmu->flags'. If platform ^ ^ es a > specific pmu driver is not registered, ppmu is set to null and accessing ^ ^ PMU NULL > its members results in crash. Patch fixes this by returning false in ^ a > 'is_sier_available()' if 'ppmu' is not set. Use the imperative mood for the last sentence which says what the patch does: Fix the crash by returning false in is_sier_available() if ppmu is not set. > Fixes: 333804dc3b7a ("powerpc/perf: Update perf_regs structure to include SIER") > Reported-by: Sachin Sant <sachinp@linux.vnet.ibm.com> > Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> > --- > arch/powerpc/perf/core-book3s.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c > index 08643cb..1de4770 100644 > --- a/arch/powerpc/perf/core-book3s.c > +++ b/arch/powerpc/perf/core-book3s.c > @@ -137,6 +137,9 @@ static void pmao_restore_workaround(bool ebb) { } > > bool is_sier_available(void) > { > + if (!ppmu) > + return false; > + > if (ppmu->flags & PPMU_HAS_SIER) > return true; > > -- > 1.8.3.1 cheers
> On 23-Nov-2020, at 4:49 PM, Michael Ellerman <mpe@ellerman.id.au> wrote: > > Hi Athira, > > Athira Rajeev <atrajeev@linux.vnet.ibm.com> writes: >> On systems without any platform specific PMU driver support registered or >> Generic Compat PMU support registered, > > The compat PMU is registered just like other PMUs, so I don't see how we > can crash like this if the compat PMU is active? > > ie. if we're using the compat PMU then ppmu will be non-NULL and point > to generic_compat_pmu. Hi Michael, Thanks for checking the patch. Crash happens on systems which neither has compat PMU support registered nor has Platform specific PMU. This happens when the distro do not have either the PMU driver support for that platform or the generic "compat-mode" performance monitoring driver support. So in such cases since compat PMU is in-active, ppmu is not set and results in crash. Sorry for the confusion with my first line. I will correct it. > >> running 'perf record' with >> —intr-regs will crash ( perf record -I <workload> ). >> >> The relevant portion from crash logs and Call Trace: >> >> Unable to handle kernel paging request for data at address 0x00000068 >> Faulting instruction address: 0xc00000000013eb18 >> Oops: Kernel access of bad area, sig: 11 [#1] >> CPU: 2 PID: 13435 Comm: kill Kdump: loaded Not tainted 4.18.0-193.el8.ppc64le #1 >> NIP: c00000000013eb18 LR: c000000000139f2c CTR: c000000000393d80 >> REGS: c0000004a07ab4f0 TRAP: 0300 Not tainted (4.18.0-193.el8.ppc64le) >> NIP [c00000000013eb18] is_sier_available+0x18/0x30 >> LR [c000000000139f2c] perf_reg_value+0x6c/0xb0 >> Call Trace: >> [c0000004a07ab770] [c0000004a07ab7c8] 0xc0000004a07ab7c8 (unreliable) >> [c0000004a07ab7a0] [c0000000003aa77c] perf_output_sample+0x60c/0xac0 >> [c0000004a07ab840] [c0000000003ab3f0] perf_event_output_forward+0x70/0xb0 >> [c0000004a07ab8c0] [c00000000039e208] __perf_event_overflow+0x88/0x1a0 >> [c0000004a07ab910] [c00000000039e42c] perf_swevent_hrtimer+0x10c/0x1d0 >> [c0000004a07abc50] [c000000000228b9c] __hrtimer_run_queues+0x17c/0x480 >> [c0000004a07abcf0] [c00000000022aaf4] hrtimer_interrupt+0x144/0x520 >> [c0000004a07abdd0] [c00000000002a864] timer_interrupt+0x104/0x2f0 >> [c0000004a07abe30] [c0000000000091c4] decrementer_common+0x114/0x120 >> >> When perf record session started with "-I" option, capture registers > ^ > is > >> via intr-regs, > > "intr-regs" is just the full name for the -I option, so that kind of > repeats itself. > >> on each sample ‘is_sier_available()'i is called to check > ^ > extra i > > The single quotes around is_sier_available() aren't necessary IMO. > >> for the SIER ( Sample Instruction Event Register) availability in the > ^ > stray space >> platform. This function in core-book3s access 'ppmu->flags'. If platform > ^ ^ > es a >> specific pmu driver is not registered, ppmu is set to null and accessing > ^ ^ > PMU NULL >> its members results in crash. Patch fixes this by returning false in > ^ > a >> 'is_sier_available()' if 'ppmu' is not set. > > Use the imperative mood for the last sentence which says what the patch > does: > > Fix the crash by returning false in is_sier_available() if ppmu is not set. Sure, I will make all these changes as suggested. Thanks Athira > > >> Fixes: 333804dc3b7a ("powerpc/perf: Update perf_regs structure to include SIER") >> Reported-by: Sachin Sant <sachinp@linux.vnet.ibm.com> >> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> >> --- >> arch/powerpc/perf/core-book3s.c | 3 +++ >> 1 file changed, 3 insertions(+) >> >> diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c >> index 08643cb..1de4770 100644 >> --- a/arch/powerpc/perf/core-book3s.c >> +++ b/arch/powerpc/perf/core-book3s.c >> @@ -137,6 +137,9 @@ static void pmao_restore_workaround(bool ebb) { } >> >> bool is_sier_available(void) >> { >> + if (!ppmu) >> + return false; >> + >> if (ppmu->flags & PPMU_HAS_SIER) >> return true; >> >> -- >> 1.8.3.1 > > > cheers
diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c index 08643cb..1de4770 100644 --- a/arch/powerpc/perf/core-book3s.c +++ b/arch/powerpc/perf/core-book3s.c @@ -137,6 +137,9 @@ static void pmao_restore_workaround(bool ebb) { } bool is_sier_available(void) { + if (!ppmu) + return false; + if (ppmu->flags & PPMU_HAS_SIER) return true;
On systems without any platform specific PMU driver support registered or Generic Compat PMU support registered, running 'perf record' with —intr-regs will crash ( perf record -I <workload> ). The relevant portion from crash logs and Call Trace: Unable to handle kernel paging request for data at address 0x00000068 Faulting instruction address: 0xc00000000013eb18 Oops: Kernel access of bad area, sig: 11 [#1] CPU: 2 PID: 13435 Comm: kill Kdump: loaded Not tainted 4.18.0-193.el8.ppc64le #1 NIP: c00000000013eb18 LR: c000000000139f2c CTR: c000000000393d80 REGS: c0000004a07ab4f0 TRAP: 0300 Not tainted (4.18.0-193.el8.ppc64le) NIP [c00000000013eb18] is_sier_available+0x18/0x30 LR [c000000000139f2c] perf_reg_value+0x6c/0xb0 Call Trace: [c0000004a07ab770] [c0000004a07ab7c8] 0xc0000004a07ab7c8 (unreliable) [c0000004a07ab7a0] [c0000000003aa77c] perf_output_sample+0x60c/0xac0 [c0000004a07ab840] [c0000000003ab3f0] perf_event_output_forward+0x70/0xb0 [c0000004a07ab8c0] [c00000000039e208] __perf_event_overflow+0x88/0x1a0 [c0000004a07ab910] [c00000000039e42c] perf_swevent_hrtimer+0x10c/0x1d0 [c0000004a07abc50] [c000000000228b9c] __hrtimer_run_queues+0x17c/0x480 [c0000004a07abcf0] [c00000000022aaf4] hrtimer_interrupt+0x144/0x520 [c0000004a07abdd0] [c00000000002a864] timer_interrupt+0x104/0x2f0 [c0000004a07abe30] [c0000000000091c4] decrementer_common+0x114/0x120 When perf record session started with "-I" option, capture registers via intr-regs, on each sample ‘is_sier_available()'i is called to check for the SIER ( Sample Instruction Event Register) availability in the platform. This function in core-book3s access 'ppmu->flags'. If platform specific pmu driver is not registered, ppmu is set to null and accessing its members results in crash. Patch fixes this by returning false in 'is_sier_available()' if 'ppmu' is not set. Fixes: 333804dc3b7a ("powerpc/perf: Update perf_regs structure to include SIER") Reported-by: Sachin Sant <sachinp@linux.vnet.ibm.com> Signed-off-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com> --- arch/powerpc/perf/core-book3s.c | 3 +++ 1 file changed, 3 insertions(+)