Message ID | 1432866362-8154-1-git-send-email-ast@plumgrid.com |
---|---|
State | Accepted, archived |
Delegated to: | David Miller |
On 05/29/2015 04:26 AM, Alexei Starovoitov wrote:
> Normally the program attachment place (like sockets, qdiscs) takes
> care of rcu protection and calls bpf_prog_put() after a grace period.
> The programs stored inside prog_array may not be attached anywhere,
> so prog_array needs to take care of preserving rcu protection.
> Otherwise bpf_tail_call() will race with bpf_prog_put().
> To solve that introduce bpf_prog_put_rcu() helper function and use
> it in 3 places where unattached program can decrement refcnt:
> closing program fd, deleting/replacing program in prog_array.
>
> Fixes: 04fd61ab36ec ("bpf: allow bpf programs to tail-call other bpf programs")
> Reported-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
> Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>

Fix looks correct, so:

Acked-by: Daniel Borkmann <daniel@iogearbox.net>

[...]
> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> index 98a69bd83069..a1b14d197a4f 100644
> --- a/kernel/bpf/syscall.c
> +++ b/kernel/bpf/syscall.c
> @@ -432,6 +432,23 @@ static void free_used_maps(struct bpf_prog_aux *aux)
>  	kfree(aux->used_maps);
>  }
>
> +static void __prog_put_rcu(struct rcu_head *rcu)
> +{
> +	struct bpf_prog_aux *aux = container_of(rcu, struct bpf_prog_aux, rcu);
> +
> +	free_used_maps(aux);
> +	bpf_prog_free(aux->prog);

Not sure if it's worth it to move these two into a common helper shared
with bpf_prog_put()? Probably only in case that code should get further
extended.

> +}
> +
> +/* version of bpf_prog_put() that is called after a grace period */

Note that this callback to complete could potentially also last longer
than a grace period. Probably depends on the reader how to interpret
the comment, but the code itself would have been already self-documenting.
;)

> +void bpf_prog_put_rcu(struct bpf_prog *prog)
> +{
> +	if (atomic_dec_and_test(&prog->aux->refcnt)) {
> +		prog->aux->prog = prog;
> +		call_rcu(&prog->aux->rcu, __prog_put_rcu);
> +	}
> +}
> +
>  void bpf_prog_put(struct bpf_prog *prog)
>  {
>  	if (atomic_dec_and_test(&prog->aux->refcnt)) {
> @@ -445,7 +462,7 @@ static int bpf_prog_release(struct inode *inode, struct file *filp)
>  {
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On 5/29/15 2:10 AM, Daniel Borkmann wrote:
>>
>> +static void __prog_put_rcu(struct rcu_head *rcu)
>> +{
>> +	struct bpf_prog_aux *aux = container_of(rcu, struct bpf_prog_aux, rcu);
>> +
>> +	free_used_maps(aux);
>> +	bpf_prog_free(aux->prog);
>
> Not sure if it's worth it to move these two into a common helper shared
> with bpf_prog_put()? Probably only in case that code should get further
> extended.

I thought about it too, but my recent re-reading of net/core/filter.c
taught me otherwise. We have too many tiny helper functions that
are hiding meaning instead of helping.
Like instead of having two pieces of the code:
do1();
do2();
do3();
and
do1();
do2();
if we introduce a helper
foo() { do1(); do2(); }
and the code will do: foo(), do3() and foo()
when the helper is close enough to invocation it's still easy
to read, but over time the whole thing, imo, will become a mess.
For example, we have prog_release, prog_free, filter_release and
all combinations with and without __ prefix and _rcu suffix.
I think some of this stuff should be 'unhelpered'.
Like __sk_filter_release() and __bpf_prog_release() should be removed.
Of course, it's a grey line when to introduce a helper and when not to,
but just because two lines are close enough between two functions it
doesn't mean that helper is warranted. In this bpf_prog_put() case
I think helper is not needed _today_. If it grows, we'll reconsider.
On 05/30/2015 01:22 AM, Alexei Starovoitov wrote:
...
> Like __sk_filter_release() and __bpf_prog_release() should be removed.

The whole filter cleanup procedure needs to be simplified a bit, got a
bit too complicated over time, agreed.

> Of course, it's a grey line when to introduce a helper and when not to,
> but just because two lines are close enough between two functions it
> doesn't mean that helper is warranted. In this bpf_prog_put() case
> I think helper is not needed _today_. If it grows, we'll reconsider.

Yes, that's what I meant.
From: Alexei Starovoitov <ast@plumgrid.com>
Date: Thu, 28 May 2015 19:26:02 -0700

> Normally the program attachment place (like sockets, qdiscs) takes
> care of rcu protection and calls bpf_prog_put() after a grace period.
> The programs stored inside prog_array may not be attached anywhere,
> so prog_array needs to take care of preserving rcu protection.
> Otherwise bpf_tail_call() will race with bpf_prog_put().
> To solve that introduce bpf_prog_put_rcu() helper function and use
> it in 3 places where unattached program can decrement refcnt:
> closing program fd, deleting/replacing program in prog_array.
>
> Fixes: 04fd61ab36ec ("bpf: allow bpf programs to tail-call other bpf programs")
> Reported-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
> Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>

Applied, thank you.
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 8821b9a8689e..5f520f5f087e 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -123,7 +123,10 @@ struct bpf_prog_aux {
 	const struct bpf_verifier_ops *ops;
 	struct bpf_map **used_maps;
 	struct bpf_prog *prog;
-	struct work_struct work;
+	union {
+		struct work_struct work;
+		struct rcu_head rcu;
+	};
 };
 
 struct bpf_array {
@@ -153,6 +156,7 @@ void bpf_register_map_type(struct bpf_map_type_list *tl);
 
 struct bpf_prog *bpf_prog_get(u32 ufd);
 void bpf_prog_put(struct bpf_prog *prog);
+void bpf_prog_put_rcu(struct bpf_prog *prog);
 
 struct bpf_map *bpf_map_get(struct fd f);
 void bpf_map_put(struct bpf_map *map);
diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c
index 614bcd4c1d74..cb31229a6fa4 100644
--- a/kernel/bpf/arraymap.c
+++ b/kernel/bpf/arraymap.c
@@ -202,7 +202,7 @@ static int prog_array_map_update_elem(struct bpf_map *map, void *key,
 
 	old_prog = xchg(array->prog + index, prog);
 	if (old_prog)
-		bpf_prog_put(old_prog);
+		bpf_prog_put_rcu(old_prog);
 
 	return 0;
 }
@@ -218,7 +218,7 @@ static int prog_array_map_delete_elem(struct bpf_map *map, void *key)
 
 	old_prog = xchg(array->prog + index, NULL);
 	if (old_prog) {
-		bpf_prog_put(old_prog);
+		bpf_prog_put_rcu(old_prog);
 		return 0;
 	} else {
 		return -ENOENT;
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 98a69bd83069..a1b14d197a4f 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -432,6 +432,23 @@ static void free_used_maps(struct bpf_prog_aux *aux)
 	kfree(aux->used_maps);
 }
 
+static void __prog_put_rcu(struct rcu_head *rcu)
+{
+	struct bpf_prog_aux *aux = container_of(rcu, struct bpf_prog_aux, rcu);
+
+	free_used_maps(aux);
+	bpf_prog_free(aux->prog);
+}
+
+/* version of bpf_prog_put() that is called after a grace period */
+void bpf_prog_put_rcu(struct bpf_prog *prog)
+{
+	if (atomic_dec_and_test(&prog->aux->refcnt)) {
+		prog->aux->prog = prog;
+		call_rcu(&prog->aux->rcu, __prog_put_rcu);
+	}
+}
+
 void bpf_prog_put(struct bpf_prog *prog)
 {
 	if (atomic_dec_and_test(&prog->aux->refcnt)) {
@@ -445,7 +462,7 @@ static int bpf_prog_release(struct inode *inode, struct file *filp)
 {
 	struct bpf_prog *prog = filp->private_data;
 
-	bpf_prog_put(prog);
+	bpf_prog_put_rcu(prog);
 	return 0;
 }
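The patch places struct rcu_head in an anonymous union with struct work_struct and recovers the enclosing struct bpf_prog_aux in the callback via container_of(). The user-space sketch below illustrates only that pointer arithmetic. Every name ending in _sketch is a hypothetical stand-in, not a kernel type, and the assumption that the two union members are never needed at the same time is an inference from the patch, not something the thread states.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's rcu_head and work_struct. */
struct rcu_head_sketch {
	void (*func)(struct rcu_head_sketch *head);
};

struct work_sketch {
	int pending;
};

/* Mirrors the shape of the patched struct: both members share storage. */
struct prog_aux_sketch {
	int refcnt;
	void *prog;
	union {
		struct work_sketch work;
		struct rcu_head_sketch rcu;
	};
};

/* Simplified container_of: step back from a member to its enclosing struct. */
#define container_of_sketch(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* What a callback does first: map the embedded head back to its owner. */
static struct prog_aux_sketch *aux_from_rcu(struct rcu_head_sketch *rcu)
{
	return container_of_sketch(rcu, struct prog_aux_sketch, rcu);
}
```

Because the callback receives only a pointer to the embedded head, embedding it in the object avoids a separate allocation on the release path, and the union keeps the struct from growing by the size of both members.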
Normally the program attachment place (like sockets, qdiscs) takes
care of rcu protection and calls bpf_prog_put() after a grace period.
The programs stored inside prog_array may not be attached anywhere,
so prog_array needs to take care of preserving rcu protection.
Otherwise bpf_tail_call() will race with bpf_prog_put().
To solve that introduce bpf_prog_put_rcu() helper function and use
it in 3 places where unattached program can decrement refcnt:
closing program fd, deleting/replacing program in prog_array.

Fixes: 04fd61ab36ec ("bpf: allow bpf programs to tail-call other bpf programs")
Reported-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
---
 include/linux/bpf.h   |    6 +++++-
 kernel/bpf/arraymap.c |    4 ++--
 kernel/bpf/syscall.c  |   19 ++++++++++++++++++-
 3 files changed, 25 insertions(+), 4 deletions(-)
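The idea in the commit message, dropping the last reference immediately but postponing the actual free until readers are done, can be sketched in user space as follows. This is a single-threaded toy under stated assumptions: defer_call() and run_deferred() are hypothetical stand-ins for call_rcu() and the end of a grace period, the refcount is a plain int rather than an atomic, and "freeing" only sets a flag so the ordering can be observed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for rcu_head: a node in a list of pending callbacks. */
struct cb_head {
	struct cb_head *next;
	void (*func)(struct cb_head *head);
};

static struct cb_head *pending;

/* Stand-in for call_rcu(): only queues the callback, never runs it here. */
static void defer_call(struct cb_head *head, void (*func)(struct cb_head *))
{
	head->func = func;
	head->next = pending;
	pending = head;
}

/* Stand-in for the grace period ending: run everything that was queued. */
static void run_deferred(void)
{
	while (pending) {
		struct cb_head *head = pending;

		pending = head->next;
		head->func(head);
	}
}

struct prog_sketch {
	int refcnt;        /* plain int; the kernel code uses an atomic */
	int freed;         /* set instead of releasing memory */
	struct cb_head cb;
};

/* Deferred destructor: recover the object from its embedded head. */
static void prog_free_cb(struct cb_head *head)
{
	struct prog_sketch *prog =
		(struct prog_sketch *)((char *)head - offsetof(struct prog_sketch, cb));

	prog->freed = 1;
}

/* Analogue of bpf_prog_put_rcu(): the last put queues the free. */
static void prog_put_deferred(struct prog_sketch *prog)
{
	if (--prog->refcnt == 0)
		defer_call(&prog->cb, prog_free_cb);
}
```

In this sketch, after the final prog_put_deferred() the object is still intact until run_deferred() is called. That window is what a concurrent reader such as bpf_tail_call() relies on in the real code, where the grace period, not an explicit call, decides when the callback runs.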