Message ID | 20200513212057.147133-1-andriin@fb.com |
---|---|
State | Changes Requested |
Delegated to: | BPF Maintainers |
Series | [bpf-next] bpf: fix bpf_iter's task iterator logic |
On 5/13/20 2:20 PM, Andrii Nakryiko wrote:
> task_seq_get_next might stop prematurely if get_pid_task() fails to get
> task_struct. Failure to do so doesn't mean that there are no more tasks with
> higher pids. Procfs's iteration algorithm (see next_tgid in fs/proc/base.c)
> does a retry in such case. After this fix, instead of stopping prematurely
> after about 300 tasks on my server, bpf_iter program now returns >4000, which
> sounds much closer to reality.
>
> Cc: Yonghong Song <yhs@fb.com>
> Fixes: eaaacd23910f ("bpf: Add task and task/file iterator targets")
> Signed-off-by: Andrii Nakryiko <andriin@fb.com>

Thanks for the fix. We did this retry logic for bpf_map which is idr based
logic too. But forgot to check for task which has the same issue.

Acked-by: Yonghong Song <yhs@fb.com>

> ---
>  kernel/bpf/task_iter.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/bpf/task_iter.c b/kernel/bpf/task_iter.c
> index a9b7264dda08..e1836def6738 100644
> --- a/kernel/bpf/task_iter.c
> +++ b/kernel/bpf/task_iter.c
> @@ -27,9 +27,15 @@ static struct task_struct *task_seq_get_next(struct pid_namespace *ns,
>  	struct pid *pid;
>
>  	rcu_read_lock();
> +retry:
>  	pid = idr_get_next(&ns->idr, tid);
> -	if (pid)
> +	if (pid) {
>  		task = get_pid_task(pid, PIDTYPE_PID);
> +		if (!task) {
> +			*tid++;
> +			goto retry;
> +		}
> +	}
>  	rcu_read_unlock();
>
>  	return task;
>
On Wed, May 13, 2020 at 2:23 PM Andrii Nakryiko <andriin@fb.com> wrote:
>
> task_seq_get_next might stop prematurely if get_pid_task() fails to get
> task_struct. Failure to do so doesn't mean that there are no more tasks with
> higher pids. Procfs's iteration algorithm (see next_tgid in fs/proc/base.c)
> does a retry in such case. After this fix, instead of stopping prematurely
> after about 300 tasks on my server, bpf_iter program now returns >4000, which
> sounds much closer to reality.
>
> Cc: Yonghong Song <yhs@fb.com>
> Fixes: eaaacd23910f ("bpf: Add task and task/file iterator targets")
> Signed-off-by: Andrii Nakryiko <andriin@fb.com>
> ---
>  kernel/bpf/task_iter.c | 8 +++++++-
>  1 file changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/bpf/task_iter.c b/kernel/bpf/task_iter.c
> index a9b7264dda08..e1836def6738 100644
> --- a/kernel/bpf/task_iter.c
> +++ b/kernel/bpf/task_iter.c
> @@ -27,9 +27,15 @@ static struct task_struct *task_seq_get_next(struct pid_namespace *ns,
>  	struct pid *pid;
>
>  	rcu_read_lock();
> +retry:
>  	pid = idr_get_next(&ns->idr, tid);
> -	if (pid)
> +	if (pid) {
>  		task = get_pid_task(pid, PIDTYPE_PID);
> +		if (!task) {
> +			*tid++;

../kernel/bpf/task_iter.c: In function ‘task_seq_get_next’:
../kernel/bpf/task_iter.c:35:4: warning: value computed is not used [-Wunused-value]
   35 |    *tid++;
      |    ^~~~~~
On Wed, May 13, 2020 at 3:42 PM Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote:
>
> On Wed, May 13, 2020 at 2:23 PM Andrii Nakryiko <andriin@fb.com> wrote:
> >
> > task_seq_get_next might stop prematurely if get_pid_task() fails to get
> > task_struct. Failure to do so doesn't mean that there are no more tasks with
> > higher pids. Procfs's iteration algorithm (see next_tgid in fs/proc/base.c)
> > does a retry in such case. After this fix, instead of stopping prematurely
> > after about 300 tasks on my server, bpf_iter program now returns >4000, which
> > sounds much closer to reality.
> >
> > Cc: Yonghong Song <yhs@fb.com>
> > Fixes: eaaacd23910f ("bpf: Add task and task/file iterator targets")
> > Signed-off-by: Andrii Nakryiko <andriin@fb.com>
> > ---
> >  kernel/bpf/task_iter.c | 8 +++++++-
> >  1 file changed, 7 insertions(+), 1 deletion(-)
> >
> > diff --git a/kernel/bpf/task_iter.c b/kernel/bpf/task_iter.c
> > index a9b7264dda08..e1836def6738 100644
> > --- a/kernel/bpf/task_iter.c
> > +++ b/kernel/bpf/task_iter.c
> > @@ -27,9 +27,15 @@ static struct task_struct *task_seq_get_next(struct pid_namespace *ns,
> >  	struct pid *pid;
> >
> >  	rcu_read_lock();
> > +retry:
> >  	pid = idr_get_next(&ns->idr, tid);
> > -	if (pid)
> > +	if (pid) {
> >  		task = get_pid_task(pid, PIDTYPE_PID);
> > +		if (!task) {
> > +			*tid++;
>
> ../kernel/bpf/task_iter.c: In function ‘task_seq_get_next’:
> ../kernel/bpf/task_iter.c:35:4: warning: value computed is not used
> [-Wunused-value]
>    35 |    *tid++;
>       |    ^~~~~~

welp... thanks, fixing to prefix form
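For readers following the -Wunused-value exchange: the warning comes from C operator precedence. Postfix ++ binds tighter than unary *, so `*tid++` advances the local pointer and discards the dereferenced value, while the prefix form `++*tid` increments the value the pointer refers to. A minimal userspace sketch of the difference follows; the helper names are hypothetical and this is not kernel code.

```c
#include <assert.h>

/* Hypothetical helpers for illustration only; not part of the patch. */

/* Postfix form: parsed as *(tid++). Only the local copy of the pointer
 * moves; the caller's value is untouched. The (void) cast merely silences
 * -Wunused-value so the sketch compiles cleanly. */
void bump_postfix(unsigned int *tid)
{
	(void)*tid++;
}

/* Prefix form: parsed as ++(*tid). The caller's value is incremented. */
void bump_prefix(unsigned int *tid)
{
	++*tid;
}

/* Small self-check showing both behaviours side by side. */
void precedence_selfcheck(void)
{
	unsigned int tid = 41;

	bump_postfix(&tid);
	assert(tid == 41);	/* unchanged: only the pointer was advanced */

	bump_prefix(&tid);
	assert(tid == 42);	/* incremented through the pointer */
}
```

With the postfix form in the patch, `*tid` would never change, so the `goto retry` would presumably keep finding the same pid. The prefix form (or an explicit `(*tid)++`) avoids that.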
diff --git a/kernel/bpf/task_iter.c b/kernel/bpf/task_iter.c
index a9b7264dda08..e1836def6738 100644
--- a/kernel/bpf/task_iter.c
+++ b/kernel/bpf/task_iter.c
@@ -27,9 +27,15 @@ static struct task_struct *task_seq_get_next(struct pid_namespace *ns,
 	struct pid *pid;

 	rcu_read_lock();
+retry:
 	pid = idr_get_next(&ns->idr, tid);
-	if (pid)
+	if (pid) {
 		task = get_pid_task(pid, PIDTYPE_PID);
+		if (!task) {
+			*tid++;
+			goto retry;
+		}
+	}
 	rcu_read_unlock();

 	return task;
task_seq_get_next might stop prematurely if get_pid_task() fails to get
task_struct. Failure to do so doesn't mean that there are no more tasks with
higher pids. Procfs's iteration algorithm (see next_tgid in fs/proc/base.c)
does a retry in such case. After this fix, instead of stopping prematurely
after about 300 tasks on my server, bpf_iter program now returns >4000, which
sounds much closer to reality.

Cc: Yonghong Song <yhs@fb.com>
Fixes: eaaacd23910f ("bpf: Add task and task/file iterator targets")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
---
 kernel/bpf/task_iter.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)
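
The retry idea in the commit message can be sketched in userspace with a toy table. This is only an illustration under stated assumptions: `struct toy_ns`, `toy_get_next`, `toy_next_task` and `toy_count` are hypothetical names, the arrays stand in for the pid idr, and a slot that is "used" but not "alive" models a pid for which get_pid_task() returns NULL. It is not the kernel implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for the pid idr (hypothetical, not kernel code):
 * slot i is allocated if used[i] is set, and looking up its task
 * succeeds only if alive[i] is also set. */
#define NSLOTS 8

struct toy_ns {
	bool used[NSLOTS];
	bool alive[NSLOTS];
};

/* In the spirit of idr_get_next(): advance *id to the first allocated
 * slot >= *id and return true, or return false if none is left. */
static bool toy_get_next(const struct toy_ns *ns, unsigned int *id)
{
	for (unsigned int i = *id; i < NSLOTS; i++) {
		if (ns->used[i]) {
			*id = i;
			return true;
		}
	}
	return false;
}

/* Find the next live slot at or after *id. With retry == false this
 * mimics the pre-fix behaviour and gives up at the first dead slot. */
static bool toy_next_task(const struct toy_ns *ns, unsigned int *id, bool retry)
{
	while (toy_get_next(ns, id)) {
		if (ns->alive[*id])
			return true;
		if (!retry)
			return false;
		++*id;	/* prefix form: bump the value, then look again */
	}
	return false;
}

/* Count how many live slots a full walk visits. */
unsigned int toy_count(const struct toy_ns *ns, bool retry)
{
	unsigned int id = 0, n = 0;

	while (toy_next_task(ns, &id, retry)) {
		n++;
		id++;	/* move past the entry just consumed */
	}
	return n;
}
```

For a table with one dead slot in the middle, the walk without retry stops at that slot and undercounts, while the walk with retry skips it and reaches the live slots with higher ids, which mirrors the difference between roughly 300 and more than 4000 tasks reported above.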