| Message ID | 20201023123754.30304-1-david.verbeiren@tessares.net |
|---|---|
| State | Not Applicable |
| Delegated to | BPF Maintainers |
| Series | [bpf] bpf: zero-fill re-used per-cpu map element |
| Context | Check | Description |
|---|---|---|
| jkicinski/cover_letter | success | Link |
| jkicinski/fixes_present | success | Link |
| jkicinski/patch_count | success | Link |
| jkicinski/tree_selection | success | Clearly marked for bpf |
| jkicinski/subject_prefix | success | Link |
| jkicinski/source_inline | success | Was 0 now: 0 |
| jkicinski/verify_signedoff | success | Link |
| jkicinski/module_param | success | Was 0 now: 0 |
| jkicinski/build_32bit | success | Errors and warnings before: 14 this patch: 14 |
| jkicinski/kdoc | success | Errors and warnings before: 0 this patch: 0 |
| jkicinski/verify_fixes | success | Link |
| jkicinski/checkpatch | success | total: 0 errors, 0 warnings, 0 checks, 24 lines checked |
| jkicinski/build_allmodconfig_warn | success | Errors and warnings before: 14 this patch: 14 |
| jkicinski/header_inline | success | Link |
| jkicinski/stable | success | Stable not CCed |
On Fri, Oct 23, 2020 at 8:48 AM David Verbeiren <david.verbeiren@tessares.net> wrote:
>
> Zero-fill element values for all cpus, just as when not using
> prealloc. This is the only way the bpf program can ensure known
> initial values for cpus other than the current one ('onallcpus'
> cannot be set when coming from the bpf program).
>
> The scenario is: bpf program inserts some elements in a per-cpu
> map, then deletes some (or userspace does). When later adding
> new elements using bpf_map_update_elem(), the bpf program can
> only set the value of the new elements for the current cpu.
> When prealloc is enabled, previously deleted elements are re-used.
> Without the fix, values for other cpus remain whatever they were
> when the re-used entry was previously freed.
>
> Fixes: 6c9059817432 ("bpf: pre-allocate hash map elements")
> Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net>
> Signed-off-by: David Verbeiren <david.verbeiren@tessares.net>
> ---
>  kernel/bpf/hashtab.c | 12 ++++++++++++
>  1 file changed, 12 insertions(+)
>
> diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
> index 1815e97d4c9c..667553cce65a 100644
> --- a/kernel/bpf/hashtab.c
> +++ b/kernel/bpf/hashtab.c
> @@ -836,6 +836,7 @@ static struct htab_elem *alloc_htab_elem(struct bpf_htab *htab, void *key,
>  	bool prealloc = htab_is_prealloc(htab);
>  	struct htab_elem *l_new, **pl_new;
>  	void __percpu *pptr;
> +	int cpu;
>
>  	if (prealloc) {
>  		if (old_elem) {
> @@ -880,6 +881,17 @@ static struct htab_elem *alloc_htab_elem(struct bpf_htab *htab, void *key,
>  		size = round_up(size, 8);
>  		if (prealloc) {
>  			pptr = htab_elem_get_ptr(l_new, key_size);
> +
> +			/* zero-fill element values for all cpus, just as when
> +			 * not using prealloc. Only way for bpf program to
> +			 * ensure known initial values for cpus other than
> +			 * current one (onallcpus=false when coming from bpf
> +			 * prog).
> +			 */
> +			if (!onallcpus)
> +				for_each_possible_cpu(cpu)
> +					memset((void *)per_cpu_ptr(pptr, cpu),
> +					       0, size);

Technically, you don't have to memset() for the current CPU, right?
Don't know if extra check is cheaper than avoiding one memset() call,
though.

But regardless, this 6 level nesting looks pretty bad, maybe move the
for_each_possible_cpu() loop into a helper function?

Also, does the per-CPU LRU hashmap need the same treatment?

>  		} else {
>  			/* alloc_percpu zero-fills */
>  			pptr = __alloc_percpu_gfp(size, 8,
> --
> 2.29.0
>
On Mon, Oct 26, 2020 at 11:48 PM Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
>
> On Fri, Oct 23, 2020 at 8:48 AM David Verbeiren
> <david.verbeiren@tessares.net> wrote:
> >
> > [...]
> > +			if (!onallcpus)
> > +				for_each_possible_cpu(cpu)
> > +					memset((void *)per_cpu_ptr(pptr, cpu),
> > +					       0, size);
>
> Technically, you don't have to memset() for the current CPU, right?
> Don't know if extra check is cheaper than avoiding one memset() call,
> though.

I thought about that as well but, because it depends on the 'size',
I decided to keep it simple. However, taking into account your other
comments, I think there is a possibility to combine it all nicely in
a separate function.

> But regardless, this 6 level nesting looks pretty bad, maybe move the
> for_each_possible_cpu() loop into a helper function?
>
> Also, does the per-CPU LRU hashmap need the same treatment?

I think it does. Good catch!

Thanks for your feedback. v2 is coming.
```diff
diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
index 1815e97d4c9c..667553cce65a 100644
--- a/kernel/bpf/hashtab.c
+++ b/kernel/bpf/hashtab.c
@@ -836,6 +836,7 @@ static struct htab_elem *alloc_htab_elem(struct bpf_htab *htab, void *key,
 	bool prealloc = htab_is_prealloc(htab);
 	struct htab_elem *l_new, **pl_new;
 	void __percpu *pptr;
+	int cpu;
 
 	if (prealloc) {
 		if (old_elem) {
@@ -880,6 +881,17 @@ static struct htab_elem *alloc_htab_elem(struct bpf_htab *htab, void *key,
 		size = round_up(size, 8);
 		if (prealloc) {
 			pptr = htab_elem_get_ptr(l_new, key_size);
+
+			/* zero-fill element values for all cpus, just as when
+			 * not using prealloc. Only way for bpf program to
+			 * ensure known initial values for cpus other than
+			 * current one (onallcpus=false when coming from bpf
+			 * prog).
+			 */
+			if (!onallcpus)
+				for_each_possible_cpu(cpu)
+					memset((void *)per_cpu_ptr(pptr, cpu),
+					       0, size);
 		} else {
 			/* alloc_percpu zero-fills */
 			pptr = __alloc_percpu_gfp(size, 8,
```