Message ID | 20200711012639.3429622-2-songliubraving@fb.com |
---|---|
State | Changes Requested |
Delegated to: | BPF Maintainers |
Series | bpf: fix stackmap on perf_events with PEBS |
On Fri, Jul 10, 2020 at 6:30 PM Song Liu <songliubraving@fb.com> wrote:
>
> Calling get_perf_callchain() on perf_events from PEBS entries may cause
> unwinder errors. To fix this issue, the callchain is fetched early. Such
> perf_events are marked with __PERF_SAMPLE_CALLCHAIN_EARLY.
>
> Similarly, calling bpf_get_[stack|stackid] on perf_events from PEBS may
> also cause unwinder errors. To fix this, block bpf_get_[stack|stackid] on
> these perf_events. Unfortunately, bpf verifier cannot tell whether the
> program will be attached to perf_event with PEBS entries. Therefore,
> block such programs during ioctl(PERF_EVENT_IOC_SET_BPF).
>
> Signed-off-by: Song Liu <songliubraving@fb.com>
> ---

Perhaps it's a stupid question, but why bpf_get_stack/bpf_get_stackid
can't figure out automatically that they are called from
__PERF_SAMPLE_CALLCHAIN_EARLY perf event and use different callchain,
if necessary?

It is quite suboptimal from a user experience point of view to require
two different BPF helpers depending on PEBS or non-PEBS perf events.

[...]
> On Jul 10, 2020, at 8:53 PM, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
>
> On Fri, Jul 10, 2020 at 6:30 PM Song Liu <songliubraving@fb.com> wrote:
>>
>> Calling get_perf_callchain() on perf_events from PEBS entries may cause
>> unwinder errors. To fix this issue, the callchain is fetched early. Such
>> perf_events are marked with __PERF_SAMPLE_CALLCHAIN_EARLY.
>>
>> Similarly, calling bpf_get_[stack|stackid] on perf_events from PEBS may
>> also cause unwinder errors. To fix this, block bpf_get_[stack|stackid] on
>> these perf_events. Unfortunately, bpf verifier cannot tell whether the
>> program will be attached to perf_event with PEBS entries. Therefore,
>> block such programs during ioctl(PERF_EVENT_IOC_SET_BPF).
>>
>> Signed-off-by: Song Liu <songliubraving@fb.com>
>> ---
>
> Perhaps it's a stupid question, but why bpf_get_stack/bpf_get_stackid
> can't figure out automatically that they are called from
> __PERF_SAMPLE_CALLCHAIN_EARLY perf event and use different callchain,
> if necessary?
>
> It is quite suboptimal from a user experience point of view to require
> two different BPF helpers depending on PEBS or non-PEBS perf events.

I am not aware of an easy way to tell the difference in bpf_get_stack.
But I do agree that would be much better.

Thanks,
Song
On Fri, Jul 10, 2020 at 11:28 PM Song Liu <songliubraving@fb.com> wrote:
>
>
>
> > On Jul 10, 2020, at 8:53 PM, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> >
> > On Fri, Jul 10, 2020 at 6:30 PM Song Liu <songliubraving@fb.com> wrote:
> >>
> >> Calling get_perf_callchain() on perf_events from PEBS entries may cause
> >> unwinder errors. To fix this issue, the callchain is fetched early. Such
> >> perf_events are marked with __PERF_SAMPLE_CALLCHAIN_EARLY.
> >>
> >> Similarly, calling bpf_get_[stack|stackid] on perf_events from PEBS may
> >> also cause unwinder errors. To fix this, block bpf_get_[stack|stackid] on
> >> these perf_events. Unfortunately, bpf verifier cannot tell whether the
> >> program will be attached to perf_event with PEBS entries. Therefore,
> >> block such programs during ioctl(PERF_EVENT_IOC_SET_BPF).
> >>
> >> Signed-off-by: Song Liu <songliubraving@fb.com>
> >> ---
> >
> > Perhaps it's a stupid question, but why bpf_get_stack/bpf_get_stackid
> > can't figure out automatically that they are called from
> > __PERF_SAMPLE_CALLCHAIN_EARLY perf event and use different callchain,
> > if necessary?
> >
> > It is quite suboptimal from a user experience point of view to require
> > two different BPF helpers depending on PEBS or non-PEBS perf events.
>
> I am not aware of an easy way to tell the difference in bpf_get_stack.
> But I do agree that would be much better.
>

Hm... Looking a bit more how all this is tied together in the kernel,
I think it's actually quite easy. So, for perf_event BPF program type:

1. return a special prototype for bpf_get_stack/bpf_get_stackid, which
will have this extra bit of logic for callchain. All other program
types with access to bpf_get_stack/bpf_get_stackid should use the
current one, probably.
2. For that special program, just like for bpf_read_branch_records(),
we know that context is actually `struct bpf_perf_event_data_kern *`,
and it has pt_regs, perf_sample_data and perf_event itself.
3. With that, it seems like you'll have everything you need to
automatically choose a proper callchain.

All this absolutely transparently to the BPF program.

Am I missing something?

> Thanks,
> Song
> On Jul 11, 2020, at 10:06 PM, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
>
> On Fri, Jul 10, 2020 at 11:28 PM Song Liu <songliubraving@fb.com> wrote:
>>
>>
>>
>>> On Jul 10, 2020, at 8:53 PM, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
>>>
>>> On Fri, Jul 10, 2020 at 6:30 PM Song Liu <songliubraving@fb.com> wrote:
>>>>
>>>> Calling get_perf_callchain() on perf_events from PEBS entries may cause
>>>> unwinder errors. To fix this issue, the callchain is fetched early. Such
>>>> perf_events are marked with __PERF_SAMPLE_CALLCHAIN_EARLY.
>>>>
>>>> Similarly, calling bpf_get_[stack|stackid] on perf_events from PEBS may
>>>> also cause unwinder errors. To fix this, block bpf_get_[stack|stackid] on
>>>> these perf_events. Unfortunately, bpf verifier cannot tell whether the
>>>> program will be attached to perf_event with PEBS entries. Therefore,
>>>> block such programs during ioctl(PERF_EVENT_IOC_SET_BPF).
>>>>
>>>> Signed-off-by: Song Liu <songliubraving@fb.com>
>>>> ---
>>>
>>> Perhaps it's a stupid question, but why bpf_get_stack/bpf_get_stackid
>>> can't figure out automatically that they are called from
>>> __PERF_SAMPLE_CALLCHAIN_EARLY perf event and use different callchain,
>>> if necessary?
>>>
>>> It is quite suboptimal from a user experience point of view to require
>>> two different BPF helpers depending on PEBS or non-PEBS perf events.
>>
>> I am not aware of an easy way to tell the difference in bpf_get_stack.
>> But I do agree that would be much better.
>>
>
> Hm... Looking a bit more how all this is tied together in the kernel,
> I think it's actually quite easy. So, for perf_event BPF program type:
>
> 1. return a special prototype for bpf_get_stack/bpf_get_stackid, which
> will have this extra bit of logic for callchain. All other program
> types with access to bpf_get_stack/bpf_get_stackid should use the
> current one, probably.
> 2. For that special program, just like for bpf_read_branch_records(),
> we know that context is actually `struct bpf_perf_event_data_kern *`,
> and it has pt_regs, perf_sample_data and perf_event itself.
> 3. With that, it seems like you'll have everything you need to
> automatically choose a proper callchain.
>
> All this absolutely transparently to the BPF program.
>
> Am I missing something?

Good idea! A separate prototype should work here.

Thanks,
Song
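The three steps proposed in the thread can be illustrated with a small userspace model. This is only a sketch, not kernel code: every `model_*` / `MODEL_*` name below is hypothetical, the structs are simplified stand-ins for `struct bpf_perf_event_data_kern`, `perf_sample_data` and `perf_event`, and the bit value used for the early-callchain flag is an assumption made for the model. It shows a per-program-type choice of helper (step 1), a context that carries the event and its sample data (step 2), and a helper that reuses the early callchain when the event is flagged (step 3).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed stand-in for __PERF_SAMPLE_CALLCHAIN_EARLY (model value only). */
#define MODEL_SAMPLE_CALLCHAIN_EARLY (1ULL << 63)

struct model_callchain {
	uint64_t nr;
	uint64_t ip[8];
};

struct model_event {
	uint64_t sample_type;
};

struct model_sample_data {
	struct model_callchain *callchain; /* captured early for flagged events */
};

/* Simplified stand-in for struct bpf_perf_event_data_kern. */
struct model_perf_ctx {
	struct model_sample_data *data;
	struct model_event *event;
};

enum model_prog_type { MODEL_PROG_KPROBE, MODEL_PROG_PERF_EVENT };

typedef const struct model_callchain *(*model_get_stack_fn)(
	const void *ctx, struct model_callchain *scratch);

/* Stand-in for a fresh unwind: always fills and returns the scratch buffer. */
const struct model_callchain *model_get_stack_generic(
	const void *ctx, struct model_callchain *scratch)
{
	(void)ctx;
	scratch->nr = 1;
	scratch->ip[0] = 0x1000;
	return scratch;
}

/* perf_event variant: the context type is known, so the early callchain
 * can be reused instead of unwinding a second time. */
const struct model_callchain *model_get_stack_pe(
	const void *ctx, struct model_callchain *scratch)
{
	const struct model_perf_ctx *pe = ctx;

	assert(pe && pe->event && pe->data);
	if ((pe->event->sample_type & MODEL_SAMPLE_CALLCHAIN_EARLY) &&
	    pe->data->callchain)
		return pe->data->callchain;
	return model_get_stack_generic(ctx, scratch);
}

/* Step 1: only perf_event programs get the special variant. */
model_get_stack_fn model_get_stack_proto(enum model_prog_type type)
{
	return type == MODEL_PROG_PERF_EVENT ? model_get_stack_pe
					     : model_get_stack_generic;
}
```

In this model the program itself calls one helper in both cases; whether the early callchain or a fresh unwind is used is decided inside the perf_event variant.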
diff --git a/include/linux/filter.h b/include/linux/filter.h
index 2593777236037..fb34dc40f039b 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -534,7 +534,8 @@ struct bpf_prog {
 				is_func:1,	/* program is a bpf function */
 				kprobe_override:1, /* Do we override a kprobe? */
 				has_callchain_buf:1, /* callchain buffer allocated? */
-				enforce_expected_attach_type:1; /* Enforce expected_attach_type checking at attach time */
+				enforce_expected_attach_type:1, /* Enforce expected_attach_type checking at attach time */
+				call_get_perf_callchain:1; /* Do we call helpers that uses get_perf_callchain()? */
 	enum bpf_prog_type	type;		/* Type of BPF program */
 	enum bpf_attach_type	expected_attach_type; /* For some prog types */
 	u32			len;		/* Number of filter blocks */
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index b608185e1ffd5..1e11b0f6fba31 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -4884,6 +4884,9 @@ static int check_helper_call(struct bpf_verifier_env *env, int func_id, int insn
 		env->prog->has_callchain_buf = true;
 	}
 
+	if (func_id == BPF_FUNC_get_stackid || func_id == BPF_FUNC_get_stack)
+		env->prog->call_get_perf_callchain = true;
+
 	if (changes_data)
 		clear_all_pkt_pointers(env);
 	return 0;
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 856d98c36f562..f2f575a286bb4 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -9544,6 +9544,16 @@ static int perf_event_set_bpf_handler(struct perf_event *event, u32 prog_fd)
 	if (IS_ERR(prog))
 		return PTR_ERR(prog);
 
+	if ((event->attr.sample_type & __PERF_SAMPLE_CALLCHAIN_EARLY) &&
+	    prog->call_get_perf_callchain) {
+		/*
+		 * The perf_event get_perf_callchain() early, the attached
+		 * BPF program shouldn't call get_perf_callchain() again.
+		 */
+		bpf_prog_put(prog);
+		return -EINVAL;
+	}
+
 	event->prog = prog;
 	event->orig_overflow_handler = READ_ONCE(event->overflow_handler);
 	WRITE_ONCE(event->overflow_handler, bpf_overflow_handler);
Calling get_perf_callchain() on perf_events from PEBS entries may cause
unwinder errors. To fix this issue, the callchain is fetched early. Such
perf_events are marked with __PERF_SAMPLE_CALLCHAIN_EARLY.

Similarly, calling bpf_get_[stack|stackid] on perf_events from PEBS may
also cause unwinder errors. To fix this, block bpf_get_[stack|stackid] on
these perf_events. Unfortunately, bpf verifier cannot tell whether the
program will be attached to perf_event with PEBS entries. Therefore,
block such programs during ioctl(PERF_EVENT_IOC_SET_BPF).

Signed-off-by: Song Liu <songliubraving@fb.com>
---
 include/linux/filter.h |  3 ++-
 kernel/bpf/verifier.c  |  3 +++
 kernel/events/core.c   | 10 ++++++++++
 3 files changed, 15 insertions(+), 1 deletion(-)
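The two-stage check described in the commit message (mark the program at verification time, reject it at attach time) can be modelled in plain userspace C. This is a simplified sketch rather than the kernel code: the `model_*` / `MODEL_*` names are hypothetical, the structs keep only the fields the check needs, and the flag's bit value is an assumption made for the model.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed stand-in for __PERF_SAMPLE_CALLCHAIN_EARLY (model value only). */
#define MODEL_SAMPLE_CALLCHAIN_EARLY (1ULL << 63)

enum model_func_id {
	MODEL_FUNC_get_stackid,
	MODEL_FUNC_get_stack,
	MODEL_FUNC_other,
};

struct model_prog {
	bool call_get_perf_callchain; /* set while "verifying" helper calls */
};

struct model_event {
	uint64_t sample_type;
	const struct model_prog *prog; /* attached program, if any */
};

/* Verifier side: remember that the program uses a stack helper. */
void model_check_helper_call(struct model_prog *prog, enum model_func_id id)
{
	assert(prog);
	if (id == MODEL_FUNC_get_stackid || id == MODEL_FUNC_get_stack)
		prog->call_get_perf_callchain = true;
}

/* Attach side: refuse the combination that would unwind a second time. */
int model_set_bpf_handler(struct model_event *event,
			  const struct model_prog *prog)
{
	assert(event && prog);
	if ((event->sample_type & MODEL_SAMPLE_CALLCHAIN_EARLY) &&
	    prog->call_get_perf_callchain)
		return -EINVAL;

	event->prog = prog;
	return 0;
}
```

The check has to sit on the attach path because, as the commit message notes, the verifier does not know which event a program will later be attached to; only the attach step sees both the program flag and the event's sample type.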