
[bpf-next,4/6] bpf, libbpf: add bpf_tail_call_static helper for bpf programs

Message ID ae48d5b3c4b6b7ee1285c3167c3aa38ae3fdc093.1600967205.git.daniel@iogearbox.net
State: Changes Requested
Delegated to: BPF Maintainers
Series: Various BPF helper improvements

Commit Message

Daniel Borkmann Sept. 24, 2020, 6:21 p.m. UTC
Port of the tail_call_static() helper function from Cilium's BPF code base [0]
to libbpf, so others can easily consume it as well. We've been using this
in production code for some time now. The main idea is that we guarantee
that the kernel's BPF infrastructure and JIT (here: x86_64) can patch the
JITed BPF insns with direct jumps instead of having to fall back to using
expensive retpolines. By using inline asm, we guarantee that the compiler
won't merge the call from different paths with potentially different
content of r2/r3.

We're also using the __throw_build_bug() macro in different places as a neat
trick to trigger compilation errors when the compiler does not remove code at
compilation time. This works for the BPF backend as it does not implement
__builtin_trap().

  [0] https://github.com/cilium/cilium/commit/f5537c26020d5297b70936c6b7d03a1e412a1035

Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
---
 tools/lib/bpf/bpf_helpers.h | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)
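
For illustration, a minimal usage sketch of the new helper (a hedged example:
the prog array layout and names such as jmp_table and SLOT_A are assumptions
for this sketch, not part of the patch):

   #include <linux/bpf.h>
   #include <bpf/bpf_helpers.h>

   struct {
   	__uint(type, BPF_MAP_TYPE_PROG_ARRAY);
   	__uint(max_entries, 2);
   	__uint(key_size, sizeof(__u32));
   	__uint(value_size, sizeof(__u32));
   } jmp_table SEC(".maps");

   enum { SLOT_A = 0, SLOT_B = 1 };

   SEC("tc")
   int entry(struct __sk_buff *skb)
   {
   	/* The slot is a compile-time constant, so the kernel can patch the
   	 * resulting call insn into a direct jump. A runtime-variable slot
   	 * would fail the helper's __builtin_constant_p() check and trigger
   	 * __throw_build_bug() at compile time instead.
   	 */
   	bpf_tail_call_static(skb, &jmp_table, SLOT_A);
   	return 0;
   }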

Comments

Andrii Nakryiko Sept. 24, 2020, 8:53 p.m. UTC | #1
On Thu, Sep 24, 2020 at 11:22 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> Port of tail_call_static() helper function from Cilium's BPF code base [0]
> to libbpf, so others can easily consume it as well. We've been using this
> in production code for some time now. The main idea is that we guarantee
> that the kernel's BPF infrastructure and JIT (here: x86_64) can patch the
> JITed BPF insns with direct jumps instead of having to fall back to using
> expensive retpolines. By using inline asm, we guarantee that the compiler
> won't merge the call from different paths with potentially different
> content of r2/r3.
>
> We're also using __throw_build_bug() macro in different places as a neat
> trick to trigger compilation errors when compiler does not remove code at
> compilation time. This works for the BPF backend as it does not implement
> the __builtin_trap().
>
>   [0] https://github.com/cilium/cilium/commit/f5537c26020d5297b70936c6b7d03a1e412a1035
>
> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
> ---
>  tools/lib/bpf/bpf_helpers.h | 32 ++++++++++++++++++++++++++++++++
>  1 file changed, 32 insertions(+)
>
> diff --git a/tools/lib/bpf/bpf_helpers.h b/tools/lib/bpf/bpf_helpers.h
> index 1106777df00b..18b75a4c82e6 100644
> --- a/tools/lib/bpf/bpf_helpers.h
> +++ b/tools/lib/bpf/bpf_helpers.h
> @@ -53,6 +53,38 @@
>         })
>  #endif
>
> +/*
> + * Misc useful helper macros
> + */
> +#ifndef __throw_build_bug
> +# define __throw_build_bug()   __builtin_trap()
> +#endif

this will become part of the libbpf stable API; do we want/need to expose
it? If we want to expose it, then we should probably provide a better
description.

But also curious: how is it better than _Static_assert() (see
test_cls_redirect.c), which also allows providing a better error
message?

> +
> +static __always_inline void
> +bpf_tail_call_static(void *ctx, const void *map, const __u32 slot)
> +{
> +       if (!__builtin_constant_p(slot))
> +               __throw_build_bug();
> +
> +       /*
> +        * Don't gamble, but _guarantee_ that LLVM won't optimize setting
> +        * r2 and r3 from different paths ending up at the same call insn as
> +        * otherwise we won't be able to use the jmpq/nopl retpoline-free
> +        * patching by the x86-64 JIT in the kernel.
> +        *

So the clobbering comment below is completely clear. But this one is
less clear without some sort of example situation in which bad things
happen. Do you mind providing some pseudo-C example in which the
compiler will optimize things in such a way that the tail call
patching won't happen?

> +        * Note on clobber list: we need to stay in-line with BPF calling
> +        * convention, so even if we don't end up using r0, r4, r5, we need
> +        * to mark them as clobber so that LLVM doesn't end up using them
> +        * before / after the call.
> +        */
> +       asm volatile("r1 = %[ctx]\n\t"
> +                    "r2 = %[map]\n\t"
> +                    "r3 = %[slot]\n\t"
> +                    "call 12\n\t"
> +                    :: [ctx]"r"(ctx), [map]"r"(map), [slot]"i"(slot)
> +                    : "r0", "r1", "r2", "r3", "r4", "r5");
> +}
> +
>  /*
>   * Helper structure used by eBPF C program
>   * to describe BPF map attributes to libbpf loader
> --
> 2.21.0
>
Daniel Borkmann Sept. 24, 2020, 10:17 p.m. UTC | #2
On 9/24/20 10:53 PM, Andrii Nakryiko wrote:
> On Thu, Sep 24, 2020 at 11:22 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
>>
>> Port of tail_call_static() helper function from Cilium's BPF code base [0]
>> to libbpf, so others can easily consume it as well. We've been using this
>> in production code for some time now. The main idea is that we guarantee
>> that the kernel's BPF infrastructure and JIT (here: x86_64) can patch the
>> JITed BPF insns with direct jumps instead of having to fall back to using
>> expensive retpolines. By using inline asm, we guarantee that the compiler
>> won't merge the call from different paths with potentially different
>> content of r2/r3.
>>
>> We're also using __throw_build_bug() macro in different places as a neat
>> trick to trigger compilation errors when compiler does not remove code at
>> compilation time. This works for the BPF backend as it does not implement
>> the __builtin_trap().
>>
>>    [0] https://github.com/cilium/cilium/commit/f5537c26020d5297b70936c6b7d03a1e412a1035
>>
>> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
>> ---
>>   tools/lib/bpf/bpf_helpers.h | 32 ++++++++++++++++++++++++++++++++
>>   1 file changed, 32 insertions(+)
>>
>> diff --git a/tools/lib/bpf/bpf_helpers.h b/tools/lib/bpf/bpf_helpers.h
>> index 1106777df00b..18b75a4c82e6 100644
>> --- a/tools/lib/bpf/bpf_helpers.h
>> +++ b/tools/lib/bpf/bpf_helpers.h
>> @@ -53,6 +53,38 @@
>>          })
>>   #endif
>>
>> +/*
>> + * Misc useful helper macros
>> + */
>> +#ifndef __throw_build_bug
>> +# define __throw_build_bug()   __builtin_trap()
>> +#endif
> 
> this will become part of libbpf stable API, do we want/need to expose
> it? If we want to expose it, then we should probably provide a better
> description.
> 
> But also curious, how is it better than _Static_assert() (see
> test_cls_redirect.c), which also allows to provide a better error
> message?

I need to get back to you on whether that has the same semantics. We use
__throw_build_bug() also in __bpf_memzero() and friends [0] as a way to trigger
a hard build bug if we hit a default switch case, so we detect unsupported
sizes which are not covered by the implementation yet. If _Static_assert(0, "foo")
does the trick, we could also use that; I'll check with our code base.

   [0] https://github.com/cilium/cilium/blob/master/bpf/include/bpf/builtins.h
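
For illustration, a minimal sketch of that pattern, modeled loosely on the
linked builtins.h (function body and supported sizes here are assumptions,
not the actual Cilium code):

   static __always_inline void __bpf_memzero(void *d, __u64 len)
   {
   	switch (len) {
   	case 8:
   		*(__u64 *)d = 0;
   		break;
   	case 4:
   		*(__u32 *)d = 0;
   		break;
   	default:
   		/* If the compiler cannot prove this branch dead and remove
   		 * it, the emitted __builtin_trap() has no BPF lowering and
   		 * the build fails: an unsupported size becomes a hard
   		 * compile-time error instead of a silent runtime gap.
   		 */
   		__throw_build_bug();
   	}
   }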

>> +static __always_inline void
>> +bpf_tail_call_static(void *ctx, const void *map, const __u32 slot)
>> +{
>> +       if (!__builtin_constant_p(slot))
>> +               __throw_build_bug();
>> +
>> +       /*
>> +        * Don't gamble, but _guarantee_ that LLVM won't optimize setting
>> +        * r2 and r3 from different paths ending up at the same call insn as
>> +        * otherwise we won't be able to use the jmpq/nopl retpoline-free
>> +        * patching by the x86-64 JIT in the kernel.
>> +        *
> 
> So the clobbering comment below is completely clear. But this one is
> less clear without some sort of example situation in which bad things
> happen. Do you mind providing some pseudo-C example in which the
> compiler will optimize things in such a way that the tail call
> patching won't happen?

The details are pretty much here [1] and were mentioned at Plumbers: if we end up
from different paths with a different map or const key at the same tail-call call
insn, then record_func_key() will determine that we need to emit a retpoline
instead of the patching which could have occurred with two tail-call call insns,
for example. This helper just guarantees that the latter will always happen.

   [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=d2e4c1e6c2947269346054ac8937ccfe9e0bcc6b
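
As a hedged pseudo-C sketch of that failure mode (jmp_table stands in for some
prog array map; whether LLVM actually merges the calls depends on compiler
version and optimization level):

   SEC("tc")
   int handle(struct __sk_buff *skb)
   {
   	if (skb->len > 1280)
   		bpf_tail_call(skb, &jmp_table, 0);
   	else
   		bpf_tail_call(skb, &jmp_table, 1);
   	/* LLVM is free to tail-merge both paths into a single 'call 12'
   	 * whose r3 (the key) comes from a PHI of 0 and 1. record_func_key()
   	 * then sees a non-constant key at that one call insn and the
   	 * x86-64 JIT has to emit a retpoline rather than patching in a
   	 * direct jump. Pinning r2/r3 in one asm block per call site, as
   	 * bpf_tail_call_static() does, keeps the two call insns distinct.
   	 */
   	return 0;
   }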

>> +        * Note on clobber list: we need to stay in-line with BPF calling
>> +        * convention, so even if we don't end up using r0, r4, r5, we need
>> +        * to mark them as clobber so that LLVM doesn't end up using them
>> +        * before / after the call.
>> +        */
>> +       asm volatile("r1 = %[ctx]\n\t"
>> +                    "r2 = %[map]\n\t"
>> +                    "r3 = %[slot]\n\t"
>> +                    "call 12\n\t"
>> +                    :: [ctx]"r"(ctx), [map]"r"(map), [slot]"i"(slot)
>> +                    : "r0", "r1", "r2", "r3", "r4", "r5");
>> +}
>> +
>>   /*
>>    * Helper structure used by eBPF C program
>>    * to describe BPF map attributes to libbpf loader
>> --
>> 2.21.0
>>
Yonghong Song Sept. 25, 2020, 6:13 a.m. UTC | #3
On 9/24/20 11:21 AM, Daniel Borkmann wrote:
> Port of tail_call_static() helper function from Cilium's BPF code base [0]
> to libbpf, so others can easily consume it as well. We've been using this
> in production code for some time now. The main idea is that we guarantee
> that the kernel's BPF infrastructure and JIT (here: x86_64) can patch the
> JITed BPF insns with direct jumps instead of having to fall back to using
> expensive retpolines. By using inline asm, we guarantee that the compiler
> won't merge the call from different paths with potentially different
> content of r2/r3.
> 
> We're also using __throw_build_bug() macro in different places as a neat
> trick to trigger compilation errors when compiler does not remove code at
> compilation time. This works for the BPF backend as it does not implement
> the __builtin_trap().
> 
>    [0] https://github.com/cilium/cilium/commit/f5537c26020d5297b70936c6b7d03a1e412a1035
> 
> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
> ---
>   tools/lib/bpf/bpf_helpers.h | 32 ++++++++++++++++++++++++++++++++
>   1 file changed, 32 insertions(+)
> 
> diff --git a/tools/lib/bpf/bpf_helpers.h b/tools/lib/bpf/bpf_helpers.h
> index 1106777df00b..18b75a4c82e6 100644
> --- a/tools/lib/bpf/bpf_helpers.h
> +++ b/tools/lib/bpf/bpf_helpers.h
> @@ -53,6 +53,38 @@
>   	})
>   #endif
>   
> +/*
> + * Misc useful helper macros
> + */
> +#ifndef __throw_build_bug
> +# define __throw_build_bug()	__builtin_trap()

Just some general comments below. The patch itself is fine by me.

I guess we will never implement a 'trap' insn? The only possible
use I know of is for a failed CO-RE relocation, which is currently
encoded as an illegal call insn.

> +#endif
> +
> +static __always_inline void
> +bpf_tail_call_static(void *ctx, const void *map, const __u32 slot)
> +{
> +	if (!__builtin_constant_p(slot))
> +		__throw_build_bug();
> +
> +	/*
> +	 * Don't gamble, but _guarantee_ that LLVM won't optimize setting
> +	 * r2 and r3 from different paths ending up at the same call insn as
> +	 * otherwise we won't be able to use the jmpq/nopl retpoline-free
> +	 * patching by the x86-64 JIT in the kernel.
> +	 *
> +	 * Note on clobber list: we need to stay in-line with BPF calling
> +	 * convention, so even if we don't end up using r0, r4, r5, we need
> +	 * to mark them as clobber so that LLVM doesn't end up using them
> +	 * before / after the call.
> +	 */
> +	asm volatile("r1 = %[ctx]\n\t"
> +		     "r2 = %[map]\n\t"
> +		     "r3 = %[slot]\n\t"
> +		     "call 12\n\t"
> +		     :: [ctx]"r"(ctx), [map]"r"(map), [slot]"i"(slot)
> +		     : "r0", "r1", "r2", "r3", "r4", "r5");

Clever scheme to enforce a constant!

This particular case is about avoiding the sinking of common code.
We have a similar use case in https://reviews.llvm.org/D87153,
which is to avoid PHI merging of two relocation globals like
    phi = [reloc_global1, reloc_global2]  // reachable from two paths
    ... phi ...

The solution is to insert a BPF-internal __builtin,
    val = __builtin_bpf_passthrough(seq_num, val)
in proper places to prevent sinking common code.

I guess in this example, we could prevent this with the
compiler inserting passthrough builtins like
    ...
    ret = bpf_tail_call(ctx, map, slot)
    ret = __builtin_bpf_passthrough(seq_num, ret)
    ...

If, in the future, such a builtin proves useful for BPF program
writers as well, we could just define
    val = __builtin_bpf_passthrough(val)
and internally it will be transformed to
    val = llvm.bpf.passthrough(seq_num, val)

> +}
> +
>   /*
>    * Helper structure used by eBPF C program
>    * to describe BPF map attributes to libbpf loader
>
Daniel Borkmann Sept. 25, 2020, 3:42 p.m. UTC | #4
On 9/25/20 12:17 AM, Daniel Borkmann wrote:
> On 9/24/20 10:53 PM, Andrii Nakryiko wrote:
>> On Thu, Sep 24, 2020 at 11:22 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
>>>
>>> Port of tail_call_static() helper function from Cilium's BPF code base [0]
>>> to libbpf, so others can easily consume it as well. We've been using this
>>> in production code for some time now. The main idea is that we guarantee
>>> that the kernel's BPF infrastructure and JIT (here: x86_64) can patch the
>>> JITed BPF insns with direct jumps instead of having to fall back to using
>>> expensive retpolines. By using inline asm, we guarantee that the compiler
>>> won't merge the call from different paths with potentially different
>>> content of r2/r3.
>>>
>>> We're also using __throw_build_bug() macro in different places as a neat
>>> trick to trigger compilation errors when compiler does not remove code at
>>> compilation time. This works for the BPF backend as it does not implement
>>> the __builtin_trap().
>>>
>>>    [0] https://github.com/cilium/cilium/commit/f5537c26020d5297b70936c6b7d03a1e412a1035
>>>
>>> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
>>> ---
>>>   tools/lib/bpf/bpf_helpers.h | 32 ++++++++++++++++++++++++++++++++
>>>   1 file changed, 32 insertions(+)
>>>
>>> diff --git a/tools/lib/bpf/bpf_helpers.h b/tools/lib/bpf/bpf_helpers.h
>>> index 1106777df00b..18b75a4c82e6 100644
>>> --- a/tools/lib/bpf/bpf_helpers.h
>>> +++ b/tools/lib/bpf/bpf_helpers.h
>>> @@ -53,6 +53,38 @@
>>>          })
>>>   #endif
>>>
>>> +/*
>>> + * Misc useful helper macros
>>> + */
>>> +#ifndef __throw_build_bug
>>> +# define __throw_build_bug()   __builtin_trap()
>>> +#endif
>>
>> this will become part of libbpf stable API, do we want/need to expose
>> it? If we want to expose it, then we should probably provide a better
>> description.
>>
>> But also curious, how is it better than _Static_assert() (see
>> test_cls_redirect.c), which also allows to provide a better error
>> message?
> 
> Need to get back to you whether that has same semantics. We use the __throw_build_bug()
> also in __bpf_memzero() and friends [0] as a way to trigger a hard build bug if we hit
> a default switch-case [0], so we detect unsupported sizes which are not covered by the
> implementation yet. If _Static_assert (0, "foo") does the trick, we could also use that;
> will check with our code base.

So _Static_assert() won't work here; for example, consider:

   # cat f1.c
   int main(void)
   {
	if (0)
		_Static_assert(0, "foo");
	return 0;
   }
   # clang -target bpf -Wall -O2 -c f1.c -o f1.o
   f1.c:4:3: error: expected expression
                 _Static_assert(0, "foo");
                 ^
   1 error generated.

In order for it to work as required for the use case, the _Static_assert() must not trigger
here given the path is unreachable and will be optimized away. I'll add a comment to the
__throw_build_bug() helper. Given this is libbpf, we should probably also prefix it with bpf_.
If you see a better name that would fit, please let me know.

>    [0] https://github.com/cilium/cilium/blob/master/bpf/include/bpf/builtins.h
Thanks,
Daniel
Daniel Borkmann Sept. 25, 2020, 3:52 p.m. UTC | #5
On 9/25/20 5:42 PM, Daniel Borkmann wrote:
> On 9/25/20 12:17 AM, Daniel Borkmann wrote:
>> On 9/24/20 10:53 PM, Andrii Nakryiko wrote:
>>> On Thu, Sep 24, 2020 at 11:22 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
>>>>
>>>> Port of tail_call_static() helper function from Cilium's BPF code base [0]
>>>> to libbpf, so others can easily consume it as well. We've been using this
>>>> in production code for some time now. The main idea is that we guarantee
>>>> that the kernel's BPF infrastructure and JIT (here: x86_64) can patch the
>>>> JITed BPF insns with direct jumps instead of having to fall back to using
>>>> expensive retpolines. By using inline asm, we guarantee that the compiler
>>>> won't merge the call from different paths with potentially different
>>>> content of r2/r3.
>>>>
>>>> We're also using __throw_build_bug() macro in different places as a neat
>>>> trick to trigger compilation errors when compiler does not remove code at
>>>> compilation time. This works for the BPF backend as it does not implement
>>>> the __builtin_trap().
>>>>
>>>>    [0] https://github.com/cilium/cilium/commit/f5537c26020d5297b70936c6b7d03a1e412a1035
>>>>
>>>> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
>>>> ---
>>>>   tools/lib/bpf/bpf_helpers.h | 32 ++++++++++++++++++++++++++++++++
>>>>   1 file changed, 32 insertions(+)
>>>>
>>>> diff --git a/tools/lib/bpf/bpf_helpers.h b/tools/lib/bpf/bpf_helpers.h
>>>> index 1106777df00b..18b75a4c82e6 100644
>>>> --- a/tools/lib/bpf/bpf_helpers.h
>>>> +++ b/tools/lib/bpf/bpf_helpers.h
>>>> @@ -53,6 +53,38 @@
>>>>          })
>>>>   #endif
>>>>
>>>> +/*
>>>> + * Misc useful helper macros
>>>> + */
>>>> +#ifndef __throw_build_bug
>>>> +# define __throw_build_bug()   __builtin_trap()
>>>> +#endif
>>>
>>> this will become part of libbpf stable API, do we want/need to expose
>>> it? If we want to expose it, then we should probably provide a better
>>> description.
>>>
>>> But also curious, how is it better than _Static_assert() (see
>>> test_cls_redirect.c), which also allows to provide a better error
>>> message?
>>
>> Need to get back to you whether that has same semantics. We use the __throw_build_bug()
>> also in __bpf_memzero() and friends [0] as a way to trigger a hard build bug if we hit
>> a default switch-case [0], so we detect unsupported sizes which are not covered by the
>> implementation yet. If _Static_assert (0, "foo") does the trick, we could also use that;
>> will check with our code base.
> 
> So _Static_assert() won't work here, for example consider:
> 
>    # cat f1.c
>    int main(void)
>    {
>      if (0)
>          _Static_assert(0, "foo");
>      return 0;
>    }
>    # clang -target bpf -Wall -O2 -c f1.c -o f1.o
>    f1.c:4:3: error: expected expression
>                  _Static_assert(0, "foo");
>                  ^
>    1 error generated.

... aaand it looks like I need some more coffee. ;-) But the result is the same after all:

   # clang -target bpf -Wall -O2 -c f1.c -o f1.o
   f1.c:4:3: error: static_assert failed "foo"
                 _Static_assert(0, "foo");
                 ^              ~
   1 error generated.

   # cat f1.c
   int main(void)
   {
	if (0) {
		_Static_assert(0, "foo");
	}
	return 0;
   }

> In order for it to work as required form the use-case, the _Static_assert() must not trigger
> here given the path is unreachable and will be optimized away. I'll add a comment to the
> __throw_build_bug() helper. Given libbpf we should probably also prefix with bpf_. If you see
> a better name that would fit, pls let me know.
> 
>>    [0] https://github.com/cilium/cilium/blob/master/bpf/include/bpf/builtins.h
> Thanks,
> Daniel
Andrii Nakryiko Sept. 25, 2020, 4:50 p.m. UTC | #6
On Fri, Sep 25, 2020 at 8:52 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> On 9/25/20 5:42 PM, Daniel Borkmann wrote:
> > On 9/25/20 12:17 AM, Daniel Borkmann wrote:
> >> On 9/24/20 10:53 PM, Andrii Nakryiko wrote:
> >>> On Thu, Sep 24, 2020 at 11:22 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
> >>>>
> >>>> Port of tail_call_static() helper function from Cilium's BPF code base [0]
> >>>> to libbpf, so others can easily consume it as well. We've been using this
> >>>> in production code for some time now. The main idea is that we guarantee
> >>>> that the kernel's BPF infrastructure and JIT (here: x86_64) can patch the
> >>>> JITed BPF insns with direct jumps instead of having to fall back to using
> >>>> expensive retpolines. By using inline asm, we guarantee that the compiler
> >>>> won't merge the call from different paths with potentially different
> >>>> content of r2/r3.
> >>>>
> >>>> We're also using __throw_build_bug() macro in different places as a neat
> >>>> trick to trigger compilation errors when compiler does not remove code at
> >>>> compilation time. This works for the BPF backend as it does not implement
> >>>> the __builtin_trap().
> >>>>
> >>>>    [0] https://github.com/cilium/cilium/commit/f5537c26020d5297b70936c6b7d03a1e412a1035
> >>>>
> >>>> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
> >>>> ---
> >>>>   tools/lib/bpf/bpf_helpers.h | 32 ++++++++++++++++++++++++++++++++
> >>>>   1 file changed, 32 insertions(+)
> >>>>
> >>>> diff --git a/tools/lib/bpf/bpf_helpers.h b/tools/lib/bpf/bpf_helpers.h
> >>>> index 1106777df00b..18b75a4c82e6 100644
> >>>> --- a/tools/lib/bpf/bpf_helpers.h
> >>>> +++ b/tools/lib/bpf/bpf_helpers.h
> >>>> @@ -53,6 +53,38 @@
> >>>>          })
> >>>>   #endif
> >>>>
> >>>> +/*
> >>>> + * Misc useful helper macros
> >>>> + */
> >>>> +#ifndef __throw_build_bug
> >>>> +# define __throw_build_bug()   __builtin_trap()
> >>>> +#endif
> >>>
> >>> this will become part of libbpf stable API, do we want/need to expose
> >>> it? If we want to expose it, then we should probably provide a better
> >>> description.
> >>>
> >>> But also curious, how is it better than _Static_assert() (see
> >>> test_cls_redirect.c), which also allows to provide a better error
> >>> message?
> >>
> >> Need to get back to you whether that has same semantics. We use the __throw_build_bug()
> >> also in __bpf_memzero() and friends [0] as a way to trigger a hard build bug if we hit
> >> a default switch-case [0], so we detect unsupported sizes which are not covered by the
> >> implementation yet. If _Static_assert (0, "foo") does the trick, we could also use that;
> >> will check with our code base.
> >
> > So _Static_assert() won't work here, for example consider:
> >
> >    # cat f1.c
> >    int main(void)
> >    {
> >      if (0)
> >          _Static_assert(0, "foo");
> >      return 0;
> >    }
> >    # clang -target bpf -Wall -O2 -c f1.c -o f1.o
> >    f1.c:4:3: error: expected expression
> >                  _Static_assert(0, "foo");
> >                  ^
> >    1 error generated.
>
> .. aaand it looks like I need some more coffee. ;-) But result is the same after all:
>
>    # clang -target bpf -Wall -O2 -c f1.c -o f1.o
>    f1.c:4:3: error: static_assert failed "foo"
>                  _Static_assert(0, "foo");
>                  ^              ~
>    1 error generated.
>
>    # cat f1.c
>    int main(void)
>    {
>         if (0) {
>                 _Static_assert(0, "foo");
>         }
>         return 0;
>    }

You still need more :-P. For your use case it will look like this:

$ cat test-bla.c
int bar(int x) {
       _Static_assert(!__builtin_constant_p(x), "not a constant!");
       return x;
}

int foo() {
        bar(123);
        return 0;
}
$ clang -target bpf -O2 -c test-bla.c -o test-bla.o
$ echo $?
0

But in general, having a way to assert unreachable code is probably
useful anyway. How about calling it __bpf_build_error() or maybe even
__bpf_unreachable()?

>
> > In order for it to work as required form the use-case, the _Static_assert() must not trigger
> > here given the path is unreachable and will be optimized away. I'll add a comment to the
> > __throw_build_bug() helper. Given libbpf we should probably also prefix with bpf_. If you see
> > a better name that would fit, pls let me know.
> >
> >>    [0] https://github.com/cilium/cilium/blob/master/bpf/include/bpf/builtins.h
> > Thanks,
> > Daniel
>
Daniel Borkmann Sept. 25, 2020, 7:52 p.m. UTC | #7
On 9/25/20 6:50 PM, Andrii Nakryiko wrote:
> On Fri, Sep 25, 2020 at 8:52 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
>> On 9/25/20 5:42 PM, Daniel Borkmann wrote:
>>> On 9/25/20 12:17 AM, Daniel Borkmann wrote:
>>>> On 9/24/20 10:53 PM, Andrii Nakryiko wrote:
>>>>> On Thu, Sep 24, 2020 at 11:22 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
>>>>>>
>>>>>> Port of tail_call_static() helper function from Cilium's BPF code base [0]
>>>>>> to libbpf, so others can easily consume it as well. We've been using this
>>>>>> in production code for some time now. The main idea is that we guarantee
>>>>>> that the kernel's BPF infrastructure and JIT (here: x86_64) can patch the
>>>>>> JITed BPF insns with direct jumps instead of having to fall back to using
>>>>>> expensive retpolines. By using inline asm, we guarantee that the compiler
>>>>>> won't merge the call from different paths with potentially different
>>>>>> content of r2/r3.
>>>>>>
>>>>>> We're also using __throw_build_bug() macro in different places as a neat
>>>>>> trick to trigger compilation errors when compiler does not remove code at
>>>>>> compilation time. This works for the BPF backend as it does not implement
>>>>>> the __builtin_trap().
>>>>>>
>>>>>>     [0] https://github.com/cilium/cilium/commit/f5537c26020d5297b70936c6b7d03a1e412a1035
>>>>>>
>>>>>> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
>>>>>> ---
>>>>>>    tools/lib/bpf/bpf_helpers.h | 32 ++++++++++++++++++++++++++++++++
>>>>>>    1 file changed, 32 insertions(+)
>>>>>>
>>>>>> diff --git a/tools/lib/bpf/bpf_helpers.h b/tools/lib/bpf/bpf_helpers.h
>>>>>> index 1106777df00b..18b75a4c82e6 100644
>>>>>> --- a/tools/lib/bpf/bpf_helpers.h
>>>>>> +++ b/tools/lib/bpf/bpf_helpers.h
>>>>>> @@ -53,6 +53,38 @@
>>>>>>           })
>>>>>>    #endif
>>>>>>
>>>>>> +/*
>>>>>> + * Misc useful helper macros
>>>>>> + */
>>>>>> +#ifndef __throw_build_bug
>>>>>> +# define __throw_build_bug()   __builtin_trap()
>>>>>> +#endif
>>>>>
>>>>> this will become part of libbpf stable API, do we want/need to expose
>>>>> it? If we want to expose it, then we should probably provide a better
>>>>> description.
>>>>>
>>>>> But also curious, how is it better than _Static_assert() (see
>>>>> test_cls_redirect.c), which also allows to provide a better error
>>>>> message?
>>>>
>>>> Need to get back to you whether that has same semantics. We use the __throw_build_bug()
>>>> also in __bpf_memzero() and friends [0] as a way to trigger a hard build bug if we hit
>>>> a default switch-case [0], so we detect unsupported sizes which are not covered by the
>>>> implementation yet. If _Static_assert (0, "foo") does the trick, we could also use that;
>>>> will check with our code base.
>>>
>>> So _Static_assert() won't work here, for example consider:
>>>
>>>     # cat f1.c
>>>     int main(void)
>>>     {
>>>       if (0)
>>>           _Static_assert(0, "foo");
>>>       return 0;
>>>     }
>>>     # clang -target bpf -Wall -O2 -c f1.c -o f1.o
>>>     f1.c:4:3: error: expected expression
>>>                   _Static_assert(0, "foo");
>>>                   ^
>>>     1 error generated.
>>
>> .. aaand it looks like I need some more coffee. ;-) But result is the same after all:
>>
>>     # clang -target bpf -Wall -O2 -c f1.c -o f1.o
>>     f1.c:4:3: error: static_assert failed "foo"
>>                   _Static_assert(0, "foo");
>>                   ^              ~
>>     1 error generated.
>>
>>     # cat f1.c
>>     int main(void)
>>     {
>>          if (0) {
>>                  _Static_assert(0, "foo");
>>          }
>>          return 0;
>>     }
> 
> You need still more :-P. For you use case it will look like this:
> 
> $ cat test-bla.c
> int bar(int x) {
>         _Static_assert(!__builtin_constant_p(x), "not a constant!");
>         return x;
> }
> 
> int foo() {
>          bar(123);
>          return 0;
> }
> $ clang -target bpf -O2 -c test-bla.c -o test-bla.o
> $ echo $?
> 0

Right, but that won't work, for example, for the use case of detecting switch cases
which fall into the default case, as mentioned with the mem* optimizations earlier
in this thread.

> But in general to ensure unreachable code it's probably useful anyway
> to have this. How about calling it __bpf_build_error() or maybe even
> __bpf_unreachable()?

I think __bpf_unreachable() sounds best to me; I'll use that.
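
As a sketch, the renamed guard would then look like this (final wording
subject to the respin):

   #ifndef __bpf_unreachable
   # define __bpf_unreachable()	__builtin_trap()
   #endif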

>>> In order for it to work as required form the use-case, the _Static_assert() must not trigger
>>> here given the path is unreachable and will be optimized away. I'll add a comment to the
>>> __throw_build_bug() helper. Given libbpf we should probably also prefix with bpf_. If you see
>>> a better name that would fit, pls let me know.
>>>
>>>>     [0] https://github.com/cilium/cilium/blob/master/bpf/include/bpf/builtins.h
>>> Thanks,
>>> Daniel
>>

Patch

diff --git a/tools/lib/bpf/bpf_helpers.h b/tools/lib/bpf/bpf_helpers.h
index 1106777df00b..18b75a4c82e6 100644
--- a/tools/lib/bpf/bpf_helpers.h
+++ b/tools/lib/bpf/bpf_helpers.h
@@ -53,6 +53,38 @@ 
 	})
 #endif
 
+/*
+ * Misc useful helper macros
+ */
+#ifndef __throw_build_bug
+# define __throw_build_bug()	__builtin_trap()
+#endif
+
+static __always_inline void
+bpf_tail_call_static(void *ctx, const void *map, const __u32 slot)
+{
+	if (!__builtin_constant_p(slot))
+		__throw_build_bug();
+
+	/*
+	 * Don't gamble, but _guarantee_ that LLVM won't optimize setting
+	 * r2 and r3 from different paths ending up at the same call insn as
+	 * otherwise we won't be able to use the jmpq/nopl retpoline-free
+	 * patching by the x86-64 JIT in the kernel.
+	 *
+	 * Note on clobber list: we need to stay in-line with BPF calling
+	 * convention, so even if we don't end up using r0, r4, r5, we need
+	 * to mark them as clobber so that LLVM doesn't end up using them
+	 * before / after the call.
+	 */
+	asm volatile("r1 = %[ctx]\n\t"
+		     "r2 = %[map]\n\t"
+		     "r3 = %[slot]\n\t"
+		     "call 12\n\t"
+		     :: [ctx]"r"(ctx), [map]"r"(map), [slot]"i"(slot)
+		     : "r0", "r1", "r2", "r3", "r4", "r5");
+}
+
 /*
  * Helper structure used by eBPF C program
  * to describe BPF map attributes to libbpf loader