Message ID | 20200722184210.4078256-3-songliubraving@fb.com |
---|---|
State | Changes Requested |
Delegated to: | BPF Maintainers |
Series | bpf: fix stackmap on perf_events with PEBS |
On Wed, Jul 22, 2020 at 11:42:08AM -0700, Song Liu wrote:
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 856d98c36f562..f77d009fcce95 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -9544,6 +9544,24 @@ static int perf_event_set_bpf_handler(struct perf_event *event, u32 prog_fd)
>  	if (IS_ERR(prog))
>  		return PTR_ERR(prog);
>
> +	if (event->attr.precise_ip &&
> +	    prog->call_get_stack &&
> +	    (!(event->attr.sample_type & __PERF_SAMPLE_CALLCHAIN_EARLY) ||
> +	     event->attr.exclude_callchain_kernel ||
> +	     event->attr.exclude_callchain_user)) {
> +		/*
> +		 * On perf_event with precise_ip, calling bpf_get_stack()
> +		 * may trigger unwinder warnings and occasional crashes.
> +		 * bpf_get_[stack|stackid] works around this issue by using
> +		 * callchain attached to perf_sample_data. If the
> +		 * perf_event does not full (kernel and user) callchain
> +		 * attached to perf_sample_data, do not allow attaching BPF
> +		 * program that calls bpf_get_[stack|stackid].
> +		 */
> +		bpf_prog_put(prog);
> +		return -EINVAL;

I suspect this will be a common error. bpftrace and others will be hitting
this issue and would need to fix how they do perf_event_open.
But EINVAL is too ambiguous and sys_perf_event_open has no ability to
return a string.
So how about we pick some different errno here to make future debugging
a bit less painful?
May be EBADFD or EPROTO or EPROTOTYPE ?
I think anything would be better than EINVAL.
> On Jul 22, 2020, at 10:55 PM, Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
>
> On Wed, Jul 22, 2020 at 11:42:08AM -0700, Song Liu wrote:
>> diff --git a/kernel/events/core.c b/kernel/events/core.c
>> index 856d98c36f562..f77d009fcce95 100644
>> --- a/kernel/events/core.c
>> +++ b/kernel/events/core.c
>> @@ -9544,6 +9544,24 @@ static int perf_event_set_bpf_handler(struct perf_event *event, u32 prog_fd)
>>  	if (IS_ERR(prog))
>>  		return PTR_ERR(prog);
>>
>> +	if (event->attr.precise_ip &&
>> +	    prog->call_get_stack &&
>> +	    (!(event->attr.sample_type & __PERF_SAMPLE_CALLCHAIN_EARLY) ||
>> +	     event->attr.exclude_callchain_kernel ||
>> +	     event->attr.exclude_callchain_user)) {
>> +		/*
>> +		 * On perf_event with precise_ip, calling bpf_get_stack()
>> +		 * may trigger unwinder warnings and occasional crashes.
>> +		 * bpf_get_[stack|stackid] works around this issue by using
>> +		 * callchain attached to perf_sample_data. If the
>> +		 * perf_event does not full (kernel and user) callchain
>> +		 * attached to perf_sample_data, do not allow attaching BPF
>> +		 * program that calls bpf_get_[stack|stackid].
>> +		 */
>> +		bpf_prog_put(prog);
>> +		return -EINVAL;
>
> I suspect this will be a common error. bpftrace and others will be hitting
> this issue and would need to fix how they do perf_event_open.
> But EINVAL is too ambiguous and sys_perf_event_open has no ability to
> return a string.
> So how about we pick some different errno here to make future debugging
> a bit less painful?
> May be EBADFD or EPROTO or EPROTOTYPE ?
> I think anything would be better than EINVAL.

I like EPROTO most. I will change it to EPROTO if we don't have better ideas.

Btw, this is not the error code on sys_perf_event_open(). It is the ioctl()
on the perf_event fd. So debugging this error will be less painful than
debugging sys_perf_event_open() errors.

Thanks,
Song
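For readers following the thread, here is a rough userspace sketch of the path Song describes: the event is opened first, and the check in this patch only fires later, on the ioctl() that attaches the program. The helper names fill_precise_callchain_attr() and attach_bpf_prog() are made up for illustration, and the sample period is arbitrary. It also assumes that requesting PERF_SAMPLE_CALLCHAIN on a precise_ip event is what leads the kernel to set the internal __PERF_SAMPLE_CALLCHAIN_EARLY flag; userspace cannot set that flag directly, and the exact behaviour is worth confirming against the PMU driver in use.

```c
#include <errno.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/perf_event.h>

/*
 * Hypothetical helper: describe a precise (PEBS) cycles event that asks
 * for a full kernel + user callchain, i.e. the configuration the new
 * check is expected to accept.
 */
void fill_precise_callchain_attr(struct perf_event_attr *attr)
{
	memset(attr, 0, sizeof(*attr));
	attr->size = sizeof(*attr);
	attr->type = PERF_TYPE_HARDWARE;
	attr->config = PERF_COUNT_HW_CPU_CYCLES;
	attr->sample_period = 100000;		/* arbitrary example period */
	attr->precise_ip = 2;			/* request precise sampling */
	attr->sample_type = PERF_SAMPLE_CALLCHAIN;
	attr->exclude_callchain_kernel = 0;	/* keep the kernel part */
	attr->exclude_callchain_user = 0;	/* keep the user part */
}

/*
 * Hypothetical helper: attach an already loaded BPF program to an already
 * opened perf_event fd. This ioctl() is where the error code discussed
 * above would surface. Returns 0 or a negative errno.
 */
int attach_bpf_prog(int perf_fd, int prog_fd)
{
	if (ioctl(perf_fd, PERF_EVENT_IOC_SET_BPF, prog_fd) < 0)
		return -errno;
	return 0;
}
```

A caller would pass the filled attr to perf_event_open(2), then call attach_bpf_prog() with the returned fd and the program fd, and inspect the negative errno on failure.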
On Wed, Jul 22, 2020 at 11:20 PM Song Liu <songliubraving@fb.com> wrote:
>
>
>
> > On Jul 22, 2020, at 10:55 PM, Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
> >
> > On Wed, Jul 22, 2020 at 11:42:08AM -0700, Song Liu wrote:
> >> diff --git a/kernel/events/core.c b/kernel/events/core.c
> >> index 856d98c36f562..f77d009fcce95 100644
> >> --- a/kernel/events/core.c
> >> +++ b/kernel/events/core.c
> >> @@ -9544,6 +9544,24 @@ static int perf_event_set_bpf_handler(struct perf_event *event, u32 prog_fd)
> >>  	if (IS_ERR(prog))
> >>  		return PTR_ERR(prog);
> >>
> >> +	if (event->attr.precise_ip &&
> >> +	    prog->call_get_stack &&
> >> +	    (!(event->attr.sample_type & __PERF_SAMPLE_CALLCHAIN_EARLY) ||
> >> +	     event->attr.exclude_callchain_kernel ||
> >> +	     event->attr.exclude_callchain_user)) {
> >> +		/*
> >> +		 * On perf_event with precise_ip, calling bpf_get_stack()
> >> +		 * may trigger unwinder warnings and occasional crashes.
> >> +		 * bpf_get_[stack|stackid] works around this issue by using
> >> +		 * callchain attached to perf_sample_data. If the
> >> +		 * perf_event does not full (kernel and user) callchain
> >> +		 * attached to perf_sample_data, do not allow attaching BPF
> >> +		 * program that calls bpf_get_[stack|stackid].
> >> +		 */
> >> +		bpf_prog_put(prog);
> >> +		return -EINVAL;
> >
> > I suspect this will be a common error. bpftrace and others will be hitting
> > this issue and would need to fix how they do perf_event_open.
> > But EINVAL is too ambiguous and sys_perf_event_open has no ability to
> > return a string.
> > So how about we pick some different errno here to make future debugging
> > a bit less painful?
> > May be EBADFD or EPROTO or EPROTOTYPE ?
> > I think anything would be better than EINVAL.
>
> I like EPROTO most. I will change it to EPROTO if we don't have better ideas.
>
> Btw, this is not the error code on sys_perf_event_open(). It is the ioctl()
> on the perf_event fd. So debugging this error will be less painful than
> debugging sys_perf_event_open() errors.

ahh. right.
Could you also add a string hint to libbpf when it sees this errno?
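To make the request concrete, here is a minimal sketch of what such an errno-to-hint mapping could look like. It is not libbpf code: attach_errno_hint() is a hypothetical name, the wording of the hint is invented, and it assumes the thread settles on EPROTO, which was only tentatively agreed above.

```c
#include <errno.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of libbpf: map the errno from a failed
 * attach ioctl() to a human-readable hint. Accepts either a positive
 * errno or the negative form that libbpf-style APIs tend to return.
 * Returns NULL when there is no specific hint for the error.
 */
const char *attach_errno_hint(int err)
{
	if (err < 0)
		err = -err;

	switch (err) {
	case EPROTO:	/* assumes EPROTO is the errno finally chosen */
		return "program calls bpf_get_stack()/bpf_get_stackid() but "
		       "the precise_ip perf_event has no full (kernel and "
		       "user) callchain attached";
	default:
		return NULL;
	}
}
```

A loader could print the hint next to the usual strerror() text when it is non-NULL and fall back to the plain error message otherwise.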
diff --git a/include/linux/filter.h b/include/linux/filter.h
index 8252572db918f..582262017a7dd 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -534,7 +534,8 @@ struct bpf_prog {
 				is_func:1,	/* program is a bpf function */
 				kprobe_override:1, /* Do we override a kprobe? */
 				has_callchain_buf:1, /* callchain buffer allocated? */
-				enforce_expected_attach_type:1; /* Enforce expected_attach_type checking at attach time */
+				enforce_expected_attach_type:1, /* Enforce expected_attach_type checking at attach time */
+				call_get_stack:1; /* Do we call bpf_get_stack() or bpf_get_stackid() */
 	enum bpf_prog_type	type;		/* Type of BPF program */
 	enum bpf_attach_type	expected_attach_type; /* For some prog types */
 	u32			len;		/* Number of filter blocks */
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 9a6703bc3f36f..41c9517a505ff 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -4887,6 +4887,9 @@ static int check_helper_call(struct bpf_verifier_env *env, int func_id, int insn
 		env->prog->has_callchain_buf = true;
 	}
 
+	if (func_id == BPF_FUNC_get_stackid || func_id == BPF_FUNC_get_stack)
+		env->prog->call_get_stack = true;
+
 	if (changes_data)
 		clear_all_pkt_pointers(env);
 	return 0;
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 856d98c36f562..f77d009fcce95 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -9544,6 +9544,24 @@ static int perf_event_set_bpf_handler(struct perf_event *event, u32 prog_fd)
 	if (IS_ERR(prog))
 		return PTR_ERR(prog);
 
+	if (event->attr.precise_ip &&
+	    prog->call_get_stack &&
+	    (!(event->attr.sample_type & __PERF_SAMPLE_CALLCHAIN_EARLY) ||
+	     event->attr.exclude_callchain_kernel ||
+	     event->attr.exclude_callchain_user)) {
+		/*
+		 * On perf_event with precise_ip, calling bpf_get_stack()
+		 * may trigger unwinder warnings and occasional crashes.
+		 * bpf_get_[stack|stackid] works around this issue by using
+		 * callchain attached to perf_sample_data. If the
+		 * perf_event does not full (kernel and user) callchain
+		 * attached to perf_sample_data, do not allow attaching BPF
+		 * program that calls bpf_get_[stack|stackid].
+		 */
+		bpf_prog_put(prog);
+		return -EINVAL;
+	}
+
 	event->prog = prog;
 	event->orig_overflow_handler = READ_ONCE(event->overflow_handler);
 	WRITE_ONCE(event->overflow_handler, bpf_overflow_handler);
bpf_get_[stack|stackid] on perf_events with precise_ip uses callchain
attached to perf_sample_data. If this callchain is not presented, do not
allow attaching BPF program that calls bpf_get_[stack|stackid] to this
event.

Signed-off-by: Song Liu <songliubraving@fb.com>
---
 include/linux/filter.h |  3 ++-
 kernel/bpf/verifier.c  |  3 +++
 kernel/events/core.c   | 18 ++++++++++++++++++
 3 files changed, 23 insertions(+), 1 deletion(-)
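
The rule in this commit message can be restated as a standalone predicate, which makes it easier to see which combinations are refused. This is only a sketch for reasoning about the condition in userspace: struct attach_params and attach_rejected() are hypothetical names, and SAMPLE_CALLCHAIN_EARLY is a local stand-in for the kernel-internal __PERF_SAMPLE_CALLCHAIN_EARLY whose value here is illustrative.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in for the kernel-internal flag; value is illustrative. */
#define SAMPLE_CALLCHAIN_EARLY	(1ULL << 63)

/* Hypothetical flattened view of the fields the check looks at. */
struct attach_params {
	unsigned int precise_ip;	/* event->attr.precise_ip */
	bool call_get_stack;		/* prog->call_get_stack */
	uint64_t sample_type;		/* event->attr.sample_type */
	bool exclude_callchain_kernel;	/* event->attr.exclude_callchain_kernel */
	bool exclude_callchain_user;	/* event->attr.exclude_callchain_user */
};

/*
 * Mirrors the condition added to perf_event_set_bpf_handler(): refuse the
 * attach only when the event is precise, the program uses a stack helper,
 * and the sample does not carry a full kernel + user callchain.
 */
bool attach_rejected(const struct attach_params *p)
{
	return p->precise_ip &&
	       p->call_get_stack &&
	       (!(p->sample_type & SAMPLE_CALLCHAIN_EARLY) ||
		p->exclude_callchain_kernel ||
		p->exclude_callchain_user);
}
```

Note that non-precise events and programs that never call bpf_get_stack() or bpf_get_stackid() pass through unchanged, so the new restriction only affects the precise_ip case.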