Message ID | 20201202175039.3625166-1-sdf@google.com |
---|---|
State | Not Applicable |
Series | [bpf-next] libbpf: add retries in sys_bpf_prog_load |
On Wed, Dec 2, 2020 at 9:52 AM Stanislav Fomichev <sdf@google.com> wrote:
>
> I've seen a situation, where a process that's under pprof constantly
> generates SIGPROF which prevents program loading indefinitely.
> The right thing to do probably is to disable signals in the upper
> layers while loading, but it still would be nice to get some error from
> libbpf instead of an endless loop.
>
> Let's add some small retry limit to the program loading:
> try loading the program 10 (arbitrary) times and give up.
>
> Signed-off-by: Stanislav Fomichev <sdf@google.com>
> ---

The subject is misleading as hell. You are not adding retries, you are
limiting the number of retries.

Otherwise, LGTM. I'd probably go with an even smaller number, can't
imagine any normal use case having more than once EAGAIN. So I'd say
feel free to reduce it to 5 even.

Acked-by: Andrii Nakryiko <andrii@kernel.org>

>  tools/lib/bpf/bpf.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
> index d27e34133973..31ebd6b3ec7c 100644
> --- a/tools/lib/bpf/bpf.c
> +++ b/tools/lib/bpf/bpf.c
> @@ -67,11 +67,12 @@ static inline int sys_bpf(enum bpf_cmd cmd, union bpf_attr *attr,
>
>  static inline int sys_bpf_prog_load(union bpf_attr *attr, unsigned int size)
>  {
> +	int retries = 10;
>  	int fd;
>
>  	do {
>  		fd = sys_bpf(BPF_PROG_LOAD, attr, size);
> -	} while (fd < 0 && errno == EAGAIN);
> +	} while (fd < 0 && errno == EAGAIN && retries-- > 0);
>
>  	return fd;
>  }
> --
> 2.29.2.454.gaff20da3a2-goog
>
On Wed, Dec 2, 2020 at 2:46 PM Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
>
> On Wed, Dec 2, 2020 at 9:52 AM Stanislav Fomichev <sdf@google.com> wrote:
> >
> > I've seen a situation, where a process that's under pprof constantly
> > generates SIGPROF which prevents program loading indefinitely.
> > The right thing to do probably is to disable signals in the upper
> > layers while loading, but it still would be nice to get some error from
> > libbpf instead of an endless loop.
> >
> > Let's add some small retry limit to the program loading:
> > try loading the program 10 (arbitrary) times and give up.
> >
> > Signed-off-by: Stanislav Fomichev <sdf@google.com>
> > ---
>
> The subject is misleading as hell. You are not adding retries, you are
> limiting the number of retries.

Ah, sorry, should've been s/add/cap/ :-(

> Otherwise, LGTM. I'd probably go with an even smaller number, can't
> imagine any normal use case having more than once EAGAIN. So I'd say
> feel free to reduce it to 5 even.
>
> Acked-by: Andrii Nakryiko <andrii@kernel.org>

Let me respin with a proper subject and 5 retries.
diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
index d27e34133973..31ebd6b3ec7c 100644
--- a/tools/lib/bpf/bpf.c
+++ b/tools/lib/bpf/bpf.c
@@ -67,11 +67,12 @@ static inline int sys_bpf(enum bpf_cmd cmd, union bpf_attr *attr,
 
 static inline int sys_bpf_prog_load(union bpf_attr *attr, unsigned int size)
 {
+	int retries = 10;
 	int fd;
 
 	do {
 		fd = sys_bpf(BPF_PROG_LOAD, attr, size);
-	} while (fd < 0 && errno == EAGAIN);
+	} while (fd < 0 && errno == EAGAIN && retries-- > 0);
 
 	return fd;
 }
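
For readers who want to try the capped loop outside of libbpf, here is a minimal standalone sketch of the same do/while pattern. It is not the libbpf code: `load_fn`, `load_with_retry_cap` and `flaky_load` are hypothetical names made up for this illustration, and the loader is a test double rather than a real `bpf()` call. One detail it makes visible: because the counter is checked after each attempt, a cap of N allows up to N + 1 calls in total (one initial attempt plus N retries).

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical loader callback: returns an fd >= 0 on success,
 * or -1 with errno set on failure. */
typedef int (*load_fn)(void *ctx);

/* Call load() once, then retry at most max_retries more times,
 * and only while the failure is EAGAIN. */
static int load_with_retry_cap(load_fn load, void *ctx, int max_retries)
{
	int fd;

	assert(max_retries >= 0);
	do {
		fd = load(ctx);
	} while (fd < 0 && errno == EAGAIN && max_retries-- > 0);

	return fd;
}

/* Test double: fails with EAGAIN until the counter behind ctx reaches
 * zero, then pretends to return fd 3. */
static int flaky_load(void *ctx)
{
	int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		errno = EAGAIN;
		return -1;
	}
	return 3;
}
```

With a cap of 5, a loader that fails twice still succeeds, while one that keeps returning EAGAIN is abandoned after six calls and the caller sees -1 with errno still set to EAGAIN.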
I've seen a situation, where a process that's under pprof constantly
generates SIGPROF which prevents program loading indefinitely.
The right thing to do probably is to disable signals in the upper
layers while loading, but it still would be nice to get some error from
libbpf instead of an endless loop.

Let's add some small retry limit to the program loading:
try loading the program 10 (arbitrary) times and give up.

Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 tools/lib/bpf/bpf.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
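
The commit message suggests that the upper layers should probably disable signals while loading. A rough sketch of what that could look like in a caller is below. It is only an illustration under assumptions: `load_fn`, `load_with_sigprof_blocked` and `sigprof_is_blocked` are hypothetical names, the loader is passed in as a callback instead of calling libbpf, and `sigprocmask()` is used for brevity (a multithreaded program would use `pthread_sigmask()` instead).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stddef.h>

/* Hypothetical loader callback: returns an fd >= 0 on success,
 * or -1 with errno set on failure. */
typedef int (*load_fn)(void *ctx);

/* Block SIGPROF for the duration of load(), then restore the previous
 * signal mask. errno from load() is preserved for the caller. */
static int load_with_sigprof_blocked(load_fn load, void *ctx)
{
	sigset_t block, saved;
	int fd, saved_errno;

	assert(load != NULL);

	sigemptyset(&block);
	sigaddset(&block, SIGPROF);
	if (sigprocmask(SIG_BLOCK, &block, &saved) != 0)
		return -1;

	fd = load(ctx);
	saved_errno = errno;

	/* Restore exactly the mask that was in effect before. */
	sigprocmask(SIG_SETMASK, &saved, NULL);
	errno = saved_errno;

	return fd;
}

/* Helper for experimenting: returns 1 if SIGPROF is currently blocked,
 * 0 if it is not, -1 on error. ctx is unused. */
static int sigprof_is_blocked(void *ctx)
{
	sigset_t cur;

	(void)ctx;
	if (sigprocmask(SIG_BLOCK, NULL, &cur) != 0)
		return -1;
	return sigismember(&cur, SIGPROF);
}
```

Passing `sigprof_is_blocked` as the loader is a quick way to check that the signal really is masked inside the call and that the original mask is back afterwards. A retry cap in the library is still useful on top of this, since not every caller will mask signals.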