[net-next] bpf: add missing rcu protection when releasing programs from prog_array

Message ID 1432866362-8154-1-git-send-email-ast@plumgrid.com
State Accepted, archived
Delegated to: David Miller

Commit Message

Alexei Starovoitov May 29, 2015, 2:26 a.m. UTC
Normally the program attachment place (like sockets, qdiscs) takes
care of rcu protection and calls bpf_prog_put() after a grace period.
The programs stored inside prog_array may not be attached anywhere,
so prog_array needs to take care of preserving rcu protection.
Otherwise bpf_tail_call() will race with bpf_prog_put().
To solve that introduce bpf_prog_put_rcu() helper function and use
it in 3 places where unattached program can decrement refcnt:
closing program fd, deleting/replacing program in prog_array.

Fixes: 04fd61ab36ec ("bpf: allow bpf programs to tail-call other bpf programs")
Reported-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>
---
 include/linux/bpf.h   |    6 +++++-
 kernel/bpf/arraymap.c |    4 ++--
 kernel/bpf/syscall.c  |   19 ++++++++++++++++++-
 3 files changed, 25 insertions(+), 4 deletions(-)
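
The race described in the commit message can be illustrated with a small userspace sketch. This is not kernel code and not the patch itself: every toy_* name below is hypothetical, the "grace period" is modeled as an explicit function call, and the "reader" is just a saved pointer standing in for a bpf_tail_call() that has already loaded the program from the array slot. Under those assumptions it only shows the difference between freeing on the last put and deferring the free until readers are done.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; none of these are kernel APIs. */
struct toy_prog {
	int refcnt;
	struct toy_prog *next_pending;	/* plays the role of struct rcu_head */
};

static struct toy_prog *pending;	/* frees queued until the grace period ends */
static int freed_count;			/* how many programs were actually freed */

static struct toy_prog *toy_prog_alloc(void)
{
	struct toy_prog *p = calloc(1, sizeof(*p));

	if (!p)
		abort();
	p->refcnt = 1;
	return p;
}

static void toy_prog_free(struct toy_prog *p)
{
	freed_count++;
	free(p);
}

/* Immediate release: frees as soon as the refcount drops to zero. */
static void toy_prog_put(struct toy_prog *p)
{
	if (--p->refcnt == 0)
		toy_prog_free(p);
}

/* Deferred release: queue the free instead of doing it right away. */
static void toy_prog_put_rcu(struct toy_prog *p)
{
	if (--p->refcnt == 0) {
		p->next_pending = pending;
		pending = p;
	}
}

/* Models the end of a grace period: no reader is left, run queued frees. */
static void toy_grace_period_end(void)
{
	while (pending) {
		struct toy_prog *p = pending;

		pending = p->next_pending;
		toy_prog_free(p);
	}
}

/*
 * Returns how many programs were freed while the "reader" still held
 * a pointer taken from the array slot.
 */
static int toy_freed_during_read(int use_rcu)
{
	struct toy_prog *slot = toy_prog_alloc();	/* the prog_array slot */
	struct toy_prog *reader = slot;			/* reader loaded the slot */
	struct toy_prog *old_prog;
	int before = freed_count;
	int freed_while_reading;

	/* Updater deletes the element, as xchg(slot, NULL) would. */
	old_prog = slot;
	slot = NULL;
	if (use_rcu)
		toy_prog_put_rcu(old_prog);
	else
		toy_prog_put(old_prog);

	freed_while_reading = freed_count - before;
	(void)reader;	/* never dereferenced: it may already be freed */
	(void)slot;

	toy_grace_period_end();
	return freed_while_reading;
}
```

In this model toy_freed_during_read(0) reports that the program was freed while the reader still held it, and toy_freed_during_read(1) reports that the free was held back until toy_grace_period_end(). The real patch gets the same ordering from call_rcu() rather than from an explicit list.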

Comments

Daniel Borkmann May 29, 2015, 9:10 a.m. UTC | #1
On 05/29/2015 04:26 AM, Alexei Starovoitov wrote:
> Normally the program attachment place (like sockets, qdiscs) takes
> care of rcu protection and calls bpf_prog_put() after a grace period.
> The programs stored inside prog_array may not be attached anywhere,
> so prog_array needs to take care of preserving rcu protection.
> Otherwise bpf_tail_call() will race with bpf_prog_put().
> To solve that introduce bpf_prog_put_rcu() helper function and use
> it in 3 places where unattached program can decrement refcnt:
> closing program fd, deleting/replacing program in prog_array.
>
> Fixes: 04fd61ab36ec ("bpf: allow bpf programs to tail-call other bpf programs")
> Reported-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
> Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>

Fix looks correct, so:

Acked-by: Daniel Borkmann <daniel@iogearbox.net>

[...]
> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> index 98a69bd83069..a1b14d197a4f 100644
> --- a/kernel/bpf/syscall.c
> +++ b/kernel/bpf/syscall.c
> @@ -432,6 +432,23 @@ static void free_used_maps(struct bpf_prog_aux *aux)
>   	kfree(aux->used_maps);
>   }
>
> +static void __prog_put_rcu(struct rcu_head *rcu)
> +{
> +	struct bpf_prog_aux *aux = container_of(rcu, struct bpf_prog_aux, rcu);
> +
> +	free_used_maps(aux);
> +	bpf_prog_free(aux->prog);

Not sure if it's worth it to move these two into a common helper shared
with bpf_prog_put()? Probably only in case that code should get further
extended.

> +}
> +
> +/* version of bpf_prog_put() that is called after a grace period */

Note that this callback could potentially also complete later than a
single grace period. It probably depends on the reader how to interpret
the comment, but the code itself would already have been self-documenting. ;)

> +void bpf_prog_put_rcu(struct bpf_prog *prog)
> +{
> +	if (atomic_dec_and_test(&prog->aux->refcnt)) {
> +		prog->aux->prog = prog;
> +		call_rcu(&prog->aux->rcu, __prog_put_rcu);
> +	}
> +}
> +
>   void bpf_prog_put(struct bpf_prog *prog)
>   {
>   	if (atomic_dec_and_test(&prog->aux->refcnt)) {
> @@ -445,7 +462,7 @@ static int bpf_prog_release(struct inode *inode, struct file *filp)
>   {
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Alexei Starovoitov May 29, 2015, 11:22 p.m. UTC | #2
On 5/29/15 2:10 AM, Daniel Borkmann wrote:
>>
>> +static void __prog_put_rcu(struct rcu_head *rcu)
>> +{
>> +    struct bpf_prog_aux *aux = container_of(rcu, struct bpf_prog_aux, rcu);
>> +
>> +    free_used_maps(aux);
>> +    bpf_prog_free(aux->prog);
>
> Not sure if it's worth it to move these two into a common helper shared
> with bpf_prog_put()? Probably only in case that code should get further
> extended.

I thought about it too, but my recent re-reading of net/core/filter.c
taught me otherwise. We have too many tiny helper functions that
hide meaning instead of helping.
Say we have two pieces of code:
do1(); do2(); do3(); and do1(); do2();
If we introduce a helper foo() { do1(); do2(); }, the code becomes:
foo(); do3(); and foo();
When the helper is close enough to its invocation it's still easy to read,
but over time the whole thing, imo, will become a mess. For example,
we have prog_release, prog_free, filter_release and all combinations
with and without __ prefix and _rcu suffix.
I think some of this stuff should be 'unhelpered'.
Like __sk_filter_release() and __bpf_prog_release() should be removed.
Of course, it's a grey line when to introduce a helper and when not to,
but just because two lines are close enough between two functions it
doesn't mean that helper is warranted. In this bpf_prog_put() case
I think helper is not needed _today_. If it grows, we'll reconsider.

Daniel Borkmann May 30, 2015, 9:02 a.m. UTC | #3
On 05/30/2015 01:22 AM, Alexei Starovoitov wrote:
...
> Like __sk_filter_release() and __bpf_prog_release() should be removed.

The whole filter cleanup procedure needs to be simplified a bit; it got a
bit too complicated over time, agreed.

> Of course, it's a grey line when to introduce a helper and when not to,
> but just because two lines are close enough between two functions it
> doesn't mean that helper is warranted. In this bpf_prog_put() case
> I think helper is not needed _today_. If it grows, we'll reconsider.

Yes, that's what I meant.
David Miller May 31, 2015, 7:28 a.m. UTC | #4
From: Alexei Starovoitov <ast@plumgrid.com>
Date: Thu, 28 May 2015 19:26:02 -0700

> Normally the program attachment place (like sockets, qdiscs) takes
> care of rcu protection and calls bpf_prog_put() after a grace period.
> The programs stored inside prog_array may not be attached anywhere,
> so prog_array needs to take care of preserving rcu protection.
> Otherwise bpf_tail_call() will race with bpf_prog_put().
> To solve that introduce bpf_prog_put_rcu() helper function and use
> it in 3 places where unattached program can decrement refcnt:
> closing program fd, deleting/replacing program in prog_array.
> 
> Fixes: 04fd61ab36ec ("bpf: allow bpf programs to tail-call other bpf programs")
> Reported-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
> Signed-off-by: Alexei Starovoitov <ast@plumgrid.com>

Applied, thank you.
Patch

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 8821b9a8689e..5f520f5f087e 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -123,7 +123,10 @@  struct bpf_prog_aux {
 	const struct bpf_verifier_ops *ops;
 	struct bpf_map **used_maps;
 	struct bpf_prog *prog;
-	struct work_struct work;
+	union {
+		struct work_struct work;
+		struct rcu_head	rcu;
+	};
 };
 
 struct bpf_array {
@@ -153,6 +156,7 @@  void bpf_register_map_type(struct bpf_map_type_list *tl);
 
 struct bpf_prog *bpf_prog_get(u32 ufd);
 void bpf_prog_put(struct bpf_prog *prog);
+void bpf_prog_put_rcu(struct bpf_prog *prog);
 
 struct bpf_map *bpf_map_get(struct fd f);
 void bpf_map_put(struct bpf_map *map);
diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c
index 614bcd4c1d74..cb31229a6fa4 100644
--- a/kernel/bpf/arraymap.c
+++ b/kernel/bpf/arraymap.c
@@ -202,7 +202,7 @@  static int prog_array_map_update_elem(struct bpf_map *map, void *key,
 
 	old_prog = xchg(array->prog + index, prog);
 	if (old_prog)
-		bpf_prog_put(old_prog);
+		bpf_prog_put_rcu(old_prog);
 
 	return 0;
 }
@@ -218,7 +218,7 @@  static int prog_array_map_delete_elem(struct bpf_map *map, void *key)
 
 	old_prog = xchg(array->prog + index, NULL);
 	if (old_prog) {
-		bpf_prog_put(old_prog);
+		bpf_prog_put_rcu(old_prog);
 		return 0;
 	} else {
 		return -ENOENT;
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 98a69bd83069..a1b14d197a4f 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -432,6 +432,23 @@  static void free_used_maps(struct bpf_prog_aux *aux)
 	kfree(aux->used_maps);
 }
 
+static void __prog_put_rcu(struct rcu_head *rcu)
+{
+	struct bpf_prog_aux *aux = container_of(rcu, struct bpf_prog_aux, rcu);
+
+	free_used_maps(aux);
+	bpf_prog_free(aux->prog);
+}
+
+/* version of bpf_prog_put() that is called after a grace period */
+void bpf_prog_put_rcu(struct bpf_prog *prog)
+{
+	if (atomic_dec_and_test(&prog->aux->refcnt)) {
+		prog->aux->prog = prog;
+		call_rcu(&prog->aux->rcu, __prog_put_rcu);
+	}
+}
+
 void bpf_prog_put(struct bpf_prog *prog)
 {
 	if (atomic_dec_and_test(&prog->aux->refcnt)) {
@@ -445,7 +462,7 @@  static int bpf_prog_release(struct inode *inode, struct file *filp)
 {
 	struct bpf_prog *prog = filp->private_data;
 
-	bpf_prog_put(prog);
+	bpf_prog_put_rcu(prog);
 	return 0;
 }