diff mbox series

[v3,bpf-next,2/3] selftests/bpf: add variable-length data concatenation pattern test

Message ID 20200623032224.4020118-2-andriin@fb.com
State Accepted
Delegated to: BPF Maintainers
Headers show
Series [v3,bpf-next,1/3] bpf: switch most helper return values from 32-bit int to 64-bit long | expand

Commit Message

Andrii Nakryiko June 23, 2020, 3:22 a.m. UTC
Add selftest that validates variable-length data reading and concatentation
with one big shared data array. This is a common pattern in production use for
monitoring and tracing applications, that potentially can read a lot of data,
but overall read much less. Such pattern allows to determine precisely what
amount of data needs to be sent over perfbuf/ringbuf and maximize efficiency.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
---
 .../testing/selftests/bpf/prog_tests/varlen.c | 56 +++++++++++
 .../testing/selftests/bpf/progs/test_varlen.c | 96 +++++++++++++++++++
 2 files changed, 152 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/varlen.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_varlen.c

Comments

Daniel Borkmann June 23, 2020, 8:39 p.m. UTC | #1
Hey Andrii,

On 6/23/20 5:22 AM, Andrii Nakryiko wrote:
> Add selftest that validates variable-length data reading and concatentation
> with one big shared data array. This is a common pattern in production use for
> monitoring and tracing applications, that potentially can read a lot of data,
> but overall read much less. Such pattern allows to determine precisely what
> amount of data needs to be sent over perfbuf/ringbuf and maximize efficiency.
> 
> Signed-off-by: Andrii Nakryiko <andriin@fb.com>

Currently getting the below errors on these tests. My last clang/llvm git build
is on 4676cf444ea2 ("[Clang] Skip adding begin source location for PragmaLoopHint'd
loop when[...]"):

# ./test_progs -t varlen
test_varlen:PASS:skel_open 0 nsec
test_varlen:PASS:skel_attach 0 nsec
test_varlen:FAIL:check got 0 != exp 8
test_varlen:FAIL:check got 0 != exp 7
test_varlen:FAIL:check got 0 != exp 15
test_varlen:FAIL:content_check doesn't match!test_varlen:FAIL:check got 0 != exp 8
test_varlen:FAIL:check got 0 != exp 7
test_varlen:FAIL:check got 0 != exp 15
test_varlen:FAIL:content_check doesn't match!test_varlen:FAIL:check got 0 != exp 8
test_varlen:FAIL:check got 0 != exp 7
test_varlen:FAIL:check got 0 != exp 15
test_varlen:FAIL:content_check doesn't match!test_varlen:FAIL:check got 0 != exp 8
test_varlen:FAIL:check got 0 != exp 7
test_varlen:FAIL:check got 0 != exp 15
test_varlen:FAIL:content_check doesn't match!
#87 varlen:FAIL
Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED

# ./test_progs-no_alu32 -t varlen
Switching to flavor 'no_alu32' subdirectory...
test_varlen:PASS:skel_open 0 nsec
test_varlen:PASS:skel_attach 0 nsec
test_varlen:FAIL:check got 0 != exp 8
test_varlen:FAIL:check got 0 != exp 7
test_varlen:FAIL:check got 0 != exp 15
test_varlen:FAIL:content_check doesn't match!test_varlen:FAIL:check got 0 != exp 8
test_varlen:FAIL:check got 0 != exp 7
test_varlen:FAIL:check got 0 != exp 15
test_varlen:FAIL:content_check doesn't match!test_varlen:FAIL:check got 0 != exp 8
test_varlen:FAIL:check got 0 != exp 7
test_varlen:FAIL:check got 0 != exp 15
test_varlen:FAIL:content_check doesn't match!test_varlen:FAIL:check got 0 != exp 8
test_varlen:FAIL:check got 0 != exp 7
test_varlen:FAIL:check got 0 != exp 15
test_varlen:FAIL:content_check doesn't match!
#87 varlen:FAIL
Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED

Thanks,
Daniel
Andrii Nakryiko June 23, 2020, 8:52 p.m. UTC | #2
On Tue, Jun 23, 2020 at 1:39 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> Hey Andrii,
>
> On 6/23/20 5:22 AM, Andrii Nakryiko wrote:
> > Add selftest that validates variable-length data reading and concatentation
> > with one big shared data array. This is a common pattern in production use for
> > monitoring and tracing applications, that potentially can read a lot of data,
> > but overall read much less. Such pattern allows to determine precisely what
> > amount of data needs to be sent over perfbuf/ringbuf and maximize efficiency.
> >
> > Signed-off-by: Andrii Nakryiko <andriin@fb.com>
>
> Currently getting the below errors on these tests. My last clang/llvm git build
> is on 4676cf444ea2 ("[Clang] Skip adding begin source location for PragmaLoopHint'd
> loop when[...]"):
>

Yeah, you need 02553b91da5d ("bpf: bpf_probe_read_kernel_str() has to
return amount of data read on success") from bpf tree.

I'm eagerly awaiting bpf being merged into bpf-next :)

> # ./test_progs -t varlen
> test_varlen:PASS:skel_open 0 nsec
> test_varlen:PASS:skel_attach 0 nsec
> test_varlen:FAIL:check got 0 != exp 8
> test_varlen:FAIL:check got 0 != exp 7
> test_varlen:FAIL:check got 0 != exp 15
> test_varlen:FAIL:content_check doesn't match!test_varlen:FAIL:check got 0 != exp 8
> test_varlen:FAIL:check got 0 != exp 7
> test_varlen:FAIL:check got 0 != exp 15
> test_varlen:FAIL:content_check doesn't match!test_varlen:FAIL:check got 0 != exp 8
> test_varlen:FAIL:check got 0 != exp 7
> test_varlen:FAIL:check got 0 != exp 15
> test_varlen:FAIL:content_check doesn't match!test_varlen:FAIL:check got 0 != exp 8
> test_varlen:FAIL:check got 0 != exp 7
> test_varlen:FAIL:check got 0 != exp 15
> test_varlen:FAIL:content_check doesn't match!
> #87 varlen:FAIL
> Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED
>
> # ./test_progs-no_alu32 -t varlen
> Switching to flavor 'no_alu32' subdirectory...
> test_varlen:PASS:skel_open 0 nsec
> test_varlen:PASS:skel_attach 0 nsec
> test_varlen:FAIL:check got 0 != exp 8
> test_varlen:FAIL:check got 0 != exp 7
> test_varlen:FAIL:check got 0 != exp 15
> test_varlen:FAIL:content_check doesn't match!test_varlen:FAIL:check got 0 != exp 8
> test_varlen:FAIL:check got 0 != exp 7
> test_varlen:FAIL:check got 0 != exp 15
> test_varlen:FAIL:content_check doesn't match!test_varlen:FAIL:check got 0 != exp 8
> test_varlen:FAIL:check got 0 != exp 7
> test_varlen:FAIL:check got 0 != exp 15
> test_varlen:FAIL:content_check doesn't match!test_varlen:FAIL:check got 0 != exp 8
> test_varlen:FAIL:check got 0 != exp 7
> test_varlen:FAIL:check got 0 != exp 15
> test_varlen:FAIL:content_check doesn't match!
> #87 varlen:FAIL
> Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED
>
> Thanks,
> Daniel
Daniel Borkmann June 23, 2020, 9:15 p.m. UTC | #3
On 6/23/20 10:52 PM, Andrii Nakryiko wrote:
> On Tue, Jun 23, 2020 at 1:39 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
>> On 6/23/20 5:22 AM, Andrii Nakryiko wrote:
>>> Add selftest that validates variable-length data reading and concatentation
>>> with one big shared data array. This is a common pattern in production use for
>>> monitoring and tracing applications, that potentially can read a lot of data,
>>> but overall read much less. Such pattern allows to determine precisely what
>>> amount of data needs to be sent over perfbuf/ringbuf and maximize efficiency.
>>>
>>> Signed-off-by: Andrii Nakryiko <andriin@fb.com>
>>
>> Currently getting the below errors on these tests. My last clang/llvm git build
>> is on 4676cf444ea2 ("[Clang] Skip adding begin source location for PragmaLoopHint'd
>> loop when[...]"):
> 
> Yeah, you need 02553b91da5d ("bpf: bpf_probe_read_kernel_str() has to
> return amount of data read on success") from bpf tree.

Fair point, it's in net- but not yet in net-next tree, so bpf-next sync needs
to wait.

> I'm eagerly awaiting bpf being merged into bpf-next :)

I'll cherry-pick 02553b91da5d locally for testing and if it passes I'll push
these out.

Thanks,
Daniel
Alexei Starovoitov June 23, 2020, 11:25 p.m. UTC | #4
On Tue, Jun 23, 2020 at 11:15:58PM +0200, Daniel Borkmann wrote:
> On 6/23/20 10:52 PM, Andrii Nakryiko wrote:
> > On Tue, Jun 23, 2020 at 1:39 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
> > > On 6/23/20 5:22 AM, Andrii Nakryiko wrote:
> > > > Add selftest that validates variable-length data reading and concatentation
> > > > with one big shared data array. This is a common pattern in production use for
> > > > monitoring and tracing applications, that potentially can read a lot of data,
> > > > but overall read much less. Such pattern allows to determine precisely what
> > > > amount of data needs to be sent over perfbuf/ringbuf and maximize efficiency.
> > > > 
> > > > Signed-off-by: Andrii Nakryiko <andriin@fb.com>
> > > 
> > > Currently getting the below errors on these tests. My last clang/llvm git build
> > > is on 4676cf444ea2 ("[Clang] Skip adding begin source location for PragmaLoopHint'd
> > > loop when[...]"):
> > 
> > Yeah, you need 02553b91da5d ("bpf: bpf_probe_read_kernel_str() has to
> > return amount of data read on success") from bpf tree.
> 
> Fair point, it's in net- but not yet in net-next tree, so bpf-next sync needs
> to wait.
> 
> > I'm eagerly awaiting bpf being merged into bpf-next :)
> 
> I'll cherry-pick 02553b91da5d locally for testing and if it passes I'll push
> these out.

I've merged the bpf_probe_read_kernel_str() fix into bpf-next and 3 extra commits
prior to that one so that sha of the bpf_probe_read_kernel_str() fix (02553b91da5de)
is exactly the same in bpf/net/linus/bpf-next. I think that shouldn't cause
issue during bpf-next pull into net-next and later merge with Linus's tree.
Crossing fingers, since we're doing this experiment for the first time.

Daniel pushed these 3 commits as well.
Now varlen and kernel_reloc tests are good, but we have a different issue :(
./test_progs-no_alu32 -t get_stack_raw_tp
is now failing, but for a different reason.

52: (85) call bpf_get_stack#67
53: (bf) r8 = r0
54: (bf) r1 = r8
55: (67) r1 <<= 32
56: (c7) r1 s>>= 32
; if (usize < 0)
57: (c5) if r1 s< 0x0 goto pc+26
 R0=inv(id=0,smax_value=800) R1_w=inv(id=0,umax_value=800,var_off=(0x0; 0x3ff)) R6=ctx(id=0,off=0,imm=0) R7=map_value(id=0,off=0,ks=4,vs=1600,imm=0) R8_w=inv(id=0,smax_value=800) R9=inv800 R10=fp0 fp-8=mmmm????
; ksize = bpf_get_stack(ctx, raw_data + usize, max_len - usize, 0);
58: (1f) r9 -= r8
; ksize = bpf_get_stack(ctx, raw_data + usize, max_len - usize, 0);
59: (bf) r2 = r7
60: (0f) r2 += r1
regs=1 stack=0 before 52: (85) call bpf_get_stack#67
; ksize = bpf_get_stack(ctx, raw_data + usize, max_len - usize, 0);
61: (bf) r1 = r6
62: (bf) r3 = r9
63: (b7) r4 = 0
64: (85) call bpf_get_stack#67
 R0=inv(id=0,smax_value=800) R1_w=ctx(id=0,off=0,imm=0) R2_w=map_value(id=0,off=0,ks=4,vs=1600,umax_value=800,var_off=(0x0; 0x3ff),s32_max_value=1023,u32_max_value=1023) R3_w=inv(id=0,umax_value=9223372036854776608) R4_w=inv0 R6=ctx(id=0?
R3 unbounded memory access, use 'var &= const' or 'if (var < const)'

In the C code it was this:
        usize = bpf_get_stack(ctx, raw_data, max_len, BPF_F_USER_STACK);
        if (usize < 0)
                return 0;

        ksize = bpf_get_stack(ctx, raw_data + usize, max_len - usize, 0);
        if (ksize < 0)
                return 0;

We used to have problem with pointer arith in R2.
Now it's a problem with two integers in R3.
'if (usize < 0)' is comparing R1 and makes it [0,800], but R8 stays [-inf,800].
Both registers represent the same 'usize' variable.
Then R9 -= R8 is doing 800 - [-inf, 800]
so the result of "max_len - usize" looks unbounded to the verifier while
it's obvious in C code that "max_len - usize" should be [0, 800].

The following diff 'fixes' the issue for no_alu32:
diff --git a/tools/testing/selftests/bpf/progs/test_get_stack_rawtp.c b/tools/testing/selftests/bpf/progs/test_get_stack_rawtp.c
index 29817a703984..93058136d608 100644
--- a/tools/testing/selftests/bpf/progs/test_get_stack_rawtp.c
+++ b/tools/testing/selftests/bpf/progs/test_get_stack_rawtp.c
@@ -2,6 +2,7 @@

 #include <linux/bpf.h>
 #include <bpf/bpf_helpers.h>
+#define var_barrier(a) asm volatile ("" : "=r"(a) : "0"(a))

 /* Permit pretty deep stack traces */
 #define MAX_STACK_RAWTP 100
@@ -84,10 +85,12 @@ int bpf_prog1(void *ctx)
                return 0;

        usize = bpf_get_stack(ctx, raw_data, max_len, BPF_F_USER_STACK);
+       var_barrier(usize);
        if (usize < 0)
                return 0;

        ksize = bpf_get_stack(ctx, raw_data + usize, max_len - usize, 0);
+       var_barrier(ksize);
        if (ksize < 0)
                return 0;

But it breaks alu32 case.

I'm using llvm 11 fwiw.

Long term Yonghong is working on llvm support to emit this kind
of workarounds automatically.
I'm still thinking what to do next. Ideas?
Yonghong Song June 24, 2020, 12:25 a.m. UTC | #5
On 6/23/20 4:25 PM, Alexei Starovoitov wrote:
> On Tue, Jun 23, 2020 at 11:15:58PM +0200, Daniel Borkmann wrote:
>> On 6/23/20 10:52 PM, Andrii Nakryiko wrote:
>>> On Tue, Jun 23, 2020 at 1:39 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
>>>> On 6/23/20 5:22 AM, Andrii Nakryiko wrote:
>>>>> Add selftest that validates variable-length data reading and concatentation
>>>>> with one big shared data array. This is a common pattern in production use for
>>>>> monitoring and tracing applications, that potentially can read a lot of data,
>>>>> but overall read much less. Such pattern allows to determine precisely what
>>>>> amount of data needs to be sent over perfbuf/ringbuf and maximize efficiency.
>>>>>
>>>>> Signed-off-by: Andrii Nakryiko <andriin@fb.com>
>>>>
>>>> Currently getting the below errors on these tests. My last clang/llvm git build
>>>> is on 4676cf444ea2 ("[Clang] Skip adding begin source location for PragmaLoopHint'd
>>>> loop when[...]"):
>>>
>>> Yeah, you need 02553b91da5d ("bpf: bpf_probe_read_kernel_str() has to
>>> return amount of data read on success") from bpf tree.
>>
>> Fair point, it's in net- but not yet in net-next tree, so bpf-next sync needs
>> to wait.
>>
>>> I'm eagerly awaiting bpf being merged into bpf-next :)
>>
>> I'll cherry-pick 02553b91da5d locally for testing and if it passes I'll push
>> these out.
> 
> I've merged the bpf_probe_read_kernel_str() fix into bpf-next and 3 extra commits
> prior to that one so that sha of the bpf_probe_read_kernel_str() fix (02553b91da5de)
> is exactly the same in bpf/net/linus/bpf-next. I think that shouldn't cause
> issue during bpf-next pull into net-next and later merge with Linus's tree.
> Crossing fingers, since we're doing this experiment for the first time.
> 
> Daniel pushed these 3 commits as well.
> Now varlen and kernel_reloc tests are good, but we have a different issue :(
> ./test_progs-no_alu32 -t get_stack_raw_tp
> is now failing, but for a different reason.
> 
> 52: (85) call bpf_get_stack#67
> 53: (bf) r8 = r0
> 54: (bf) r1 = r8
> 55: (67) r1 <<= 32
> 56: (c7) r1 s>>= 32
> ; if (usize < 0)
> 57: (c5) if r1 s< 0x0 goto pc+26
>   R0=inv(id=0,smax_value=800) R1_w=inv(id=0,umax_value=800,var_off=(0x0; 0x3ff)) R6=ctx(id=0,off=0,imm=0) R7=map_value(id=0,off=0,ks=4,vs=1600,imm=0) R8_w=inv(id=0,smax_value=800) R9=inv800 R10=fp0 fp-8=mmmm????
> ; ksize = bpf_get_stack(ctx, raw_data + usize, max_len - usize, 0);
> 58: (1f) r9 -= r8
> ; ksize = bpf_get_stack(ctx, raw_data + usize, max_len - usize, 0);
> 59: (bf) r2 = r7
> 60: (0f) r2 += r1
> regs=1 stack=0 before 52: (85) call bpf_get_stack#67
> ; ksize = bpf_get_stack(ctx, raw_data + usize, max_len - usize, 0);
> 61: (bf) r1 = r6
> 62: (bf) r3 = r9
> 63: (b7) r4 = 0
> 64: (85) call bpf_get_stack#67
>   R0=inv(id=0,smax_value=800) R1_w=ctx(id=0,off=0,imm=0) R2_w=map_value(id=0,off=0,ks=4,vs=1600,umax_value=800,var_off=(0x0; 0x3ff),s32_max_value=1023,u32_max_value=1023) R3_w=inv(id=0,umax_value=9223372036854776608) R4_w=inv0 R6=ctx(id=0?
> R3 unbounded memory access, use 'var &= const' or 'if (var < const)'
> 
> In the C code it was this:
>          usize = bpf_get_stack(ctx, raw_data, max_len, BPF_F_USER_STACK);
>          if (usize < 0)
>                  return 0;
> 
>          ksize = bpf_get_stack(ctx, raw_data + usize, max_len - usize, 0);
>          if (ksize < 0)
>                  return 0;
> 
> We used to have problem with pointer arith in R2.
> Now it's a problem with two integers in R3.
> 'if (usize < 0)' is comparing R1 and makes it [0,800], but R8 stays [-inf,800].
> Both registers represent the same 'usize' variable.
> Then R9 -= R8 is doing 800 - [-inf, 800]
> so the result of "max_len - usize" looks unbounded to the verifier while
> it's obvious in C code that "max_len - usize" should be [0, 800].
> 
> The following diff 'fixes' the issue for no_alu32:
> diff --git a/tools/testing/selftests/bpf/progs/test_get_stack_rawtp.c b/tools/testing/selftests/bpf/progs/test_get_stack_rawtp.c
> index 29817a703984..93058136d608 100644
> --- a/tools/testing/selftests/bpf/progs/test_get_stack_rawtp.c
> +++ b/tools/testing/selftests/bpf/progs/test_get_stack_rawtp.c
> @@ -2,6 +2,7 @@
> 
>   #include <linux/bpf.h>
>   #include <bpf/bpf_helpers.h>
> +#define var_barrier(a) asm volatile ("" : "=r"(a) : "0"(a))
> 
>   /* Permit pretty deep stack traces */
>   #define MAX_STACK_RAWTP 100
> @@ -84,10 +85,12 @@ int bpf_prog1(void *ctx)
>                  return 0;
> 
>          usize = bpf_get_stack(ctx, raw_data, max_len, BPF_F_USER_STACK);
> +       var_barrier(usize);
>          if (usize < 0)
>                  return 0;
> 
>          ksize = bpf_get_stack(ctx, raw_data + usize, max_len - usize, 0);
> +       var_barrier(ksize);
>          if (ksize < 0)
>                  return 0;
> 
> But it breaks alu32 case.
> 
> I'm using llvm 11 fwiw.
> 
> Long term Yonghong is working on llvm support to emit this kind
> of workarounds automatically.
> I'm still thinking what to do next. Ideas?

The following source change will make both alu32 and non-alu32 happy:

  SEC("raw_tracepoint/sys_enter")
  int bpf_prog1(void *ctx)
  {
-       int max_len, max_buildid_len, usize, ksize, total_size;
+       int max_len, max_buildid_len, total_size;
+       long usize, ksize;
         struct stack_trace_t *data;
         void *raw_data;
         __u32 key = 0;

I have not checked the reason why it works. Mostly this confirms to
the function signature so compiler generates more friendly code.
Andrii Nakryiko June 24, 2020, 12:53 a.m. UTC | #6
On Tue, Jun 23, 2020 at 5:25 PM Yonghong Song <yhs@fb.com> wrote:
>
>
>
> On 6/23/20 4:25 PM, Alexei Starovoitov wrote:
> > On Tue, Jun 23, 2020 at 11:15:58PM +0200, Daniel Borkmann wrote:
> >> On 6/23/20 10:52 PM, Andrii Nakryiko wrote:
> >>> On Tue, Jun 23, 2020 at 1:39 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
> >>>> On 6/23/20 5:22 AM, Andrii Nakryiko wrote:
> >>>>> Add selftest that validates variable-length data reading and concatentation
> >>>>> with one big shared data array. This is a common pattern in production use for
> >>>>> monitoring and tracing applications, that potentially can read a lot of data,
> >>>>> but overall read much less. Such pattern allows to determine precisely what
> >>>>> amount of data needs to be sent over perfbuf/ringbuf and maximize efficiency.
> >>>>>
> >>>>> Signed-off-by: Andrii Nakryiko <andriin@fb.com>
> >>>>
> >>>> Currently getting the below errors on these tests. My last clang/llvm git build
> >>>> is on 4676cf444ea2 ("[Clang] Skip adding begin source location for PragmaLoopHint'd
> >>>> loop when[...]"):
> >>>
> >>> Yeah, you need 02553b91da5d ("bpf: bpf_probe_read_kernel_str() has to
> >>> return amount of data read on success") from bpf tree.
> >>
> >> Fair point, it's in net- but not yet in net-next tree, so bpf-next sync needs
> >> to wait.
> >>
> >>> I'm eagerly awaiting bpf being merged into bpf-next :)
> >>
> >> I'll cherry-pick 02553b91da5d locally for testing and if it passes I'll push
> >> these out.
> >
> > I've merged the bpf_probe_read_kernel_str() fix into bpf-next and 3 extra commits
> > prior to that one so that sha of the bpf_probe_read_kernel_str() fix (02553b91da5de)
> > is exactly the same in bpf/net/linus/bpf-next. I think that shouldn't cause
> > issue during bpf-next pull into net-next and later merge with Linus's tree.
> > Crossing fingers, since we're doing this experiment for the first time.
> >
> > Daniel pushed these 3 commits as well.
> > Now varlen and kernel_reloc tests are good, but we have a different issue :(
> > ./test_progs-no_alu32 -t get_stack_raw_tp
> > is now failing, but for a different reason.
> >
> > 52: (85) call bpf_get_stack#67
> > 53: (bf) r8 = r0
> > 54: (bf) r1 = r8
> > 55: (67) r1 <<= 32
> > 56: (c7) r1 s>>= 32
> > ; if (usize < 0)
> > 57: (c5) if r1 s< 0x0 goto pc+26
> >   R0=inv(id=0,smax_value=800) R1_w=inv(id=0,umax_value=800,var_off=(0x0; 0x3ff)) R6=ctx(id=0,off=0,imm=0) R7=map_value(id=0,off=0,ks=4,vs=1600,imm=0) R8_w=inv(id=0,smax_value=800) R9=inv800 R10=fp0 fp-8=mmmm????
> > ; ksize = bpf_get_stack(ctx, raw_data + usize, max_len - usize, 0);
> > 58: (1f) r9 -= r8
> > ; ksize = bpf_get_stack(ctx, raw_data + usize, max_len - usize, 0);
> > 59: (bf) r2 = r7
> > 60: (0f) r2 += r1
> > regs=1 stack=0 before 52: (85) call bpf_get_stack#67
> > ; ksize = bpf_get_stack(ctx, raw_data + usize, max_len - usize, 0);
> > 61: (bf) r1 = r6
> > 62: (bf) r3 = r9
> > 63: (b7) r4 = 0
> > 64: (85) call bpf_get_stack#67
> >   R0=inv(id=0,smax_value=800) R1_w=ctx(id=0,off=0,imm=0) R2_w=map_value(id=0,off=0,ks=4,vs=1600,umax_value=800,var_off=(0x0; 0x3ff),s32_max_value=1023,u32_max_value=1023) R3_w=inv(id=0,umax_value=9223372036854776608) R4_w=inv0 R6=ctx(id=0?
> > R3 unbounded memory access, use 'var &= const' or 'if (var < const)'
> >
> > In the C code it was this:
> >          usize = bpf_get_stack(ctx, raw_data, max_len, BPF_F_USER_STACK);
> >          if (usize < 0)
> >                  return 0;
> >
> >          ksize = bpf_get_stack(ctx, raw_data + usize, max_len - usize, 0);
> >          if (ksize < 0)
> >                  return 0;
> >
> > We used to have problem with pointer arith in R2.
> > Now it's a problem with two integers in R3.
> > 'if (usize < 0)' is comparing R1 and makes it [0,800], but R8 stays [-inf,800].
> > Both registers represent the same 'usize' variable.
> > Then R9 -= R8 is doing 800 - [-inf, 800]
> > so the result of "max_len - usize" looks unbounded to the verifier while
> > it's obvious in C code that "max_len - usize" should be [0, 800].
> >
> > The following diff 'fixes' the issue for no_alu32:
> > diff --git a/tools/testing/selftests/bpf/progs/test_get_stack_rawtp.c b/tools/testing/selftests/bpf/progs/test_get_stack_rawtp.c
> > index 29817a703984..93058136d608 100644
> > --- a/tools/testing/selftests/bpf/progs/test_get_stack_rawtp.c
> > +++ b/tools/testing/selftests/bpf/progs/test_get_stack_rawtp.c
> > @@ -2,6 +2,7 @@
> >
> >   #include <linux/bpf.h>
> >   #include <bpf/bpf_helpers.h>
> > +#define var_barrier(a) asm volatile ("" : "=r"(a) : "0"(a))
> >
> >   /* Permit pretty deep stack traces */
> >   #define MAX_STACK_RAWTP 100
> > @@ -84,10 +85,12 @@ int bpf_prog1(void *ctx)
> >                  return 0;
> >
> >          usize = bpf_get_stack(ctx, raw_data, max_len, BPF_F_USER_STACK);
> > +       var_barrier(usize);
> >          if (usize < 0)
> >                  return 0;
> >
> >          ksize = bpf_get_stack(ctx, raw_data + usize, max_len - usize, 0);
> > +       var_barrier(ksize);
> >          if (ksize < 0)
> >                  return 0;
> >
> > But it breaks alu32 case.
> >
> > I'm using llvm 11 fwiw.
> >
> > Long term Yonghong is working on llvm support to emit this kind
> > of workarounds automatically.
> > I'm still thinking what to do next. Ideas?
>

Funny enough, Alexei's fix didn't fix even no_alu32 case for me. Also
have one of the latest clang 11...

> The following source change will make both alu32 and non-alu32 happy:
>
>   SEC("raw_tracepoint/sys_enter")
>   int bpf_prog1(void *ctx)
>   {
> -       int max_len, max_buildid_len, usize, ksize, total_size;
> +       int max_len, max_buildid_len, total_size;
> +       long usize, ksize;

This does fix it, both alu32 and no-alu32 pass.

>          struct stack_trace_t *data;
>          void *raw_data;
>          __u32 key = 0;
>
> I have not checked the reason why it works. Mostly this confirms to
> the function signature so compiler generates more friendly code.

Yes, it's due to the compiler not doing all the casting/bit shifting.
Just straightforward use of a single register consistently across
conditional jump and offset calculations.
John Fastabend June 24, 2020, 6:04 a.m. UTC | #7
Andrii Nakryiko wrote:
> On Tue, Jun 23, 2020 at 5:25 PM Yonghong Song <yhs@fb.com> wrote:
> >
> >
> >
> > On 6/23/20 4:25 PM, Alexei Starovoitov wrote:
> > > On Tue, Jun 23, 2020 at 11:15:58PM +0200, Daniel Borkmann wrote:
> > >> On 6/23/20 10:52 PM, Andrii Nakryiko wrote:
> > >>> On Tue, Jun 23, 2020 at 1:39 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
> > >>>> On 6/23/20 5:22 AM, Andrii Nakryiko wrote:
> > >>>>> Add selftest that validates variable-length data reading and concatentation
> > >>>>> with one big shared data array. This is a common pattern in production use for
> > >>>>> monitoring and tracing applications, that potentially can read a lot of data,
> > >>>>> but overall read much less. Such pattern allows to determine precisely what
> > >>>>> amount of data needs to be sent over perfbuf/ringbuf and maximize efficiency.
> > >>>>>
> > >>>>> Signed-off-by: Andrii Nakryiko <andriin@fb.com>
> > >>>>
> > >>>> Currently getting the below errors on these tests. My last clang/llvm git build
> > >>>> is on 4676cf444ea2 ("[Clang] Skip adding begin source location for PragmaLoopHint'd
> > >>>> loop when[...]"):
> > >>>
> > >>> Yeah, you need 02553b91da5d ("bpf: bpf_probe_read_kernel_str() has to
> > >>> return amount of data read on success") from bpf tree.
> > >>
> > >> Fair point, it's in net- but not yet in net-next tree, so bpf-next sync needs
> > >> to wait.
> > >>
> > >>> I'm eagerly awaiting bpf being merged into bpf-next :)
> > >>
> > >> I'll cherry-pick 02553b91da5d locally for testing and if it passes I'll push
> > >> these out.
> > >
> > > I've merged the bpf_probe_read_kernel_str() fix into bpf-next and 3 extra commits
> > > prior to that one so that sha of the bpf_probe_read_kernel_str() fix (02553b91da5de)
> > > is exactly the same in bpf/net/linus/bpf-next. I think that shouldn't cause
> > > issue during bpf-next pull into net-next and later merge with Linus's tree.
> > > Crossing fingers, since we're doing this experiment for the first time.
> > >
> > > Daniel pushed these 3 commits as well.
> > > Now varlen and kernel_reloc tests are good, but we have a different issue :(
> > > ./test_progs-no_alu32 -t get_stack_raw_tp
> > > is now failing, but for a different reason.
> > >
> > > 52: (85) call bpf_get_stack#67
> > > 53: (bf) r8 = r0
> > > 54: (bf) r1 = r8
> > > 55: (67) r1 <<= 32
> > > 56: (c7) r1 s>>= 32
> > > ; if (usize < 0)
> > > 57: (c5) if r1 s< 0x0 goto pc+26
> > >   R0=inv(id=0,smax_value=800) R1_w=inv(id=0,umax_value=800,var_off=(0x0; 0x3ff)) R6=ctx(id=0,off=0,imm=0) R7=map_value(id=0,off=0,ks=4,vs=1600,imm=0) R8_w=inv(id=0,smax_value=800) R9=inv800 R10=fp0 fp-8=mmmm????
> > > ; ksize = bpf_get_stack(ctx, raw_data + usize, max_len - usize, 0);
> > > 58: (1f) r9 -= r8
> > > ; ksize = bpf_get_stack(ctx, raw_data + usize, max_len - usize, 0);
> > > 59: (bf) r2 = r7
> > > 60: (0f) r2 += r1
> > > regs=1 stack=0 before 52: (85) call bpf_get_stack#67
> > > ; ksize = bpf_get_stack(ctx, raw_data + usize, max_len - usize, 0);
> > > 61: (bf) r1 = r6
> > > 62: (bf) r3 = r9
> > > 63: (b7) r4 = 0
> > > 64: (85) call bpf_get_stack#67
> > >   R0=inv(id=0,smax_value=800) R1_w=ctx(id=0,off=0,imm=0) R2_w=map_value(id=0,off=0,ks=4,vs=1600,umax_value=800,var_off=(0x0; 0x3ff),s32_max_value=1023,u32_max_value=1023) R3_w=inv(id=0,umax_value=9223372036854776608) R4_w=inv0 R6=ctx(id=0?
> > > R3 unbounded memory access, use 'var &= const' or 'if (var < const)'
> > >
> > > In the C code it was this:
> > >          usize = bpf_get_stack(ctx, raw_data, max_len, BPF_F_USER_STACK);
> > >          if (usize < 0)
> > >                  return 0;
> > >
> > >          ksize = bpf_get_stack(ctx, raw_data + usize, max_len - usize, 0);
> > >          if (ksize < 0)
> > >                  return 0;
> > >
> > > We used to have problem with pointer arith in R2.
> > > Now it's a problem with two integers in R3.
> > > 'if (usize < 0)' is comparing R1 and makes it [0,800], but R8 stays [-inf,800].
> > > Both registers represent the same 'usize' variable.
> > > Then R9 -= R8 is doing 800 - [-inf, 800]
> > > so the result of "max_len - usize" looks unbounded to the verifier while
> > > it's obvious in C code that "max_len - usize" should be [0, 800].
> > >
> > > The following diff 'fixes' the issue for no_alu32:
> > > diff --git a/tools/testing/selftests/bpf/progs/test_get_stack_rawtp.c b/tools/testing/selftests/bpf/progs/test_get_stack_rawtp.c
> > > index 29817a703984..93058136d608 100644
> > > --- a/tools/testing/selftests/bpf/progs/test_get_stack_rawtp.c
> > > +++ b/tools/testing/selftests/bpf/progs/test_get_stack_rawtp.c
> > > @@ -2,6 +2,7 @@
> > >
> > >   #include <linux/bpf.h>
> > >   #include <bpf/bpf_helpers.h>
> > > +#define var_barrier(a) asm volatile ("" : "=r"(a) : "0"(a))
> > >
> > >   /* Permit pretty deep stack traces */
> > >   #define MAX_STACK_RAWTP 100
> > > @@ -84,10 +85,12 @@ int bpf_prog1(void *ctx)
> > >                  return 0;
> > >
> > >          usize = bpf_get_stack(ctx, raw_data, max_len, BPF_F_USER_STACK);
> > > +       var_barrier(usize);
> > >          if (usize < 0)
> > >                  return 0;
> > >
> > >          ksize = bpf_get_stack(ctx, raw_data + usize, max_len - usize, 0);
> > > +       var_barrier(ksize);
> > >          if (ksize < 0)
> > >                  return 0;
> > >
> > > But it breaks alu32 case.
> > >
> > > I'm using llvm 11 fwiw.
> > >
> > > Long term Yonghong is working on llvm support to emit this kind
> > > of workarounds automatically.
> > > I'm still thinking what to do next. Ideas?
> >
> 
> Funny enough, Alexei's fix didn't fix even no_alu32 case for me. Also
> have one of the latest clang 11...
> 
> > The following source change will make both alu32 and non-alu32 happy:
> >
> >   SEC("raw_tracepoint/sys_enter")
> >   int bpf_prog1(void *ctx)
> >   {
> > -       int max_len, max_buildid_len, usize, ksize, total_size;
> > +       int max_len, max_buildid_len, total_size;
> > +       long usize, ksize;
> 
> This does fix it, both alu32 and no-alu32 pass.
> 
> >          struct stack_trace_t *data;
> >          void *raw_data;
> >          __u32 key = 0;
> >
> > I have not checked the reason why it works. Mostly this confirms to
> > the function signature so compiler generates more friendly code.
> 
> Yes, it's due to the compiler not doing all the casting/bit shifting.
> Just straightforward use of a single register consistently across
> conditional jump and offset calculations.

Another option is to drop the int->long uapi conversion and write the
varlen test using >=0 tests. The below diff works for me also using
recent clang-11, but maybe doesn't resolve Andrii's original issue.
My concern is if we break existing code in selftests is there a risk
users will get breaking code as well? Seems like without a few
additional clang improvements its going to be hard to get all
combinations working by just fiddling with the types.

diff --git a/tools/testing/selftests/bpf/progs/test_varlen.c b/tools/testing/selftests/bpf/progs/test_varlen.c
index 0969185..01c992c 100644
--- a/tools/testing/selftests/bpf/progs/test_varlen.c
+++ b/tools/testing/selftests/bpf/progs/test_varlen.c
@@ -31,20 +31,20 @@ int handler64(void *regs)
 {
 	int pid = bpf_get_current_pid_tgid() >> 32;
 	void *payload = payload1;
-	u64 len;
+	int len;
 
 	/* ignore irrelevant invocations */
 	if (test_pid != pid || !capture)
 		return 0;
 
 	len = bpf_probe_read_kernel_str(payload, MAX_LEN, &buf_in1[0]);
-	if (len <= MAX_LEN) {
+	if (len >= 0) {
 		payload += len;
 		payload1_len1 = len;
 	}
 
 	len = bpf_probe_read_kernel_str(payload, MAX_LEN, &buf_in2[0]);
-	if (len <= MAX_LEN) {
+	if (len >= 0) {
 		payload += len;
 		payload1_len2 = len;
 	}
@@ -59,20 +59,20 @@ int handler32(void *regs)
 {
 	int pid = bpf_get_current_pid_tgid() >> 32;
 	void *payload = payload2;
-	u32 len;
+	int len;
 
 	/* ignore irrelevant invocations */
 	if (test_pid != pid || !capture)
 		return 0;
 
 	len = bpf_probe_read_kernel_str(payload, MAX_LEN, &buf_in1[0]);
-	if (len <= MAX_LEN) {
+	if (len >= 0) {
 		payload += len;
 		payload2_len1 = len;
 	}
 
 	len = bpf_probe_read_kernel_str(payload, MAX_LEN, &buf_in2[0]);
-	if (len <= MAX_LEN) {
+	if (len >= 0) {
 		payload += len;
 		payload2_len2 = len;
 	}
Andrii Nakryiko June 24, 2020, 6:51 a.m. UTC | #8
On Tue, Jun 23, 2020 at 11:04 PM John Fastabend
<john.fastabend@gmail.com> wrote:
>
> Andrii Nakryiko wrote:
> > On Tue, Jun 23, 2020 at 5:25 PM Yonghong Song <yhs@fb.com> wrote:
> > >
> > >
> > >
> > > On 6/23/20 4:25 PM, Alexei Starovoitov wrote:
> > > > On Tue, Jun 23, 2020 at 11:15:58PM +0200, Daniel Borkmann wrote:
> > > >> On 6/23/20 10:52 PM, Andrii Nakryiko wrote:
> > > >>> On Tue, Jun 23, 2020 at 1:39 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
> > > >>>> On 6/23/20 5:22 AM, Andrii Nakryiko wrote:
> > > >>>>> Add selftest that validates variable-length data reading and concatentation
> > > >>>>> with one big shared data array. This is a common pattern in production use for
> > > >>>>> monitoring and tracing applications, that potentially can read a lot of data,
> > > >>>>> but overall read much less. Such pattern allows to determine precisely what
> > > >>>>> amount of data needs to be sent over perfbuf/ringbuf and maximize efficiency.
> > > >>>>>
> > > >>>>> Signed-off-by: Andrii Nakryiko <andriin@fb.com>
> > > >>>>
> > > >>>> Currently getting the below errors on these tests. My last clang/llvm git build
> > > >>>> is on 4676cf444ea2 ("[Clang] Skip adding begin source location for PragmaLoopHint'd
> > > >>>> loop when[...]"):
> > > >>>
> > > >>> Yeah, you need 02553b91da5d ("bpf: bpf_probe_read_kernel_str() has to
> > > >>> return amount of data read on success") from bpf tree.
> > > >>
> > > >> Fair point, it's in net- but not yet in net-next tree, so bpf-next sync needs
> > > >> to wait.
> > > >>
> > > >>> I'm eagerly awaiting bpf being merged into bpf-next :)
> > > >>
> > > >> I'll cherry-pick 02553b91da5d locally for testing and if it passes I'll push
> > > >> these out.
> > > >
> > > > I've merged the bpf_probe_read_kernel_str() fix into bpf-next and 3 extra commits
> > > > prior to that one so that sha of the bpf_probe_read_kernel_str() fix (02553b91da5de)
> > > > is exactly the same in bpf/net/linus/bpf-next. I think that shouldn't cause
> > > > issue during bpf-next pull into net-next and later merge with Linus's tree.
> > > > Crossing fingers, since we're doing this experiment for the first time.
> > > >
> > > > Daniel pushed these 3 commits as well.
> > > > Now varlen and kernel_reloc tests are good, but we have a different issue :(
> > > > ./test_progs-no_alu32 -t get_stack_raw_tp
> > > > is now failing, but for a different reason.
> > > >
> > > > 52: (85) call bpf_get_stack#67
> > > > 53: (bf) r8 = r0
> > > > 54: (bf) r1 = r8
> > > > 55: (67) r1 <<= 32
> > > > 56: (c7) r1 s>>= 32
> > > > ; if (usize < 0)
> > > > 57: (c5) if r1 s< 0x0 goto pc+26
> > > >   R0=inv(id=0,smax_value=800) R1_w=inv(id=0,umax_value=800,var_off=(0x0; 0x3ff)) R6=ctx(id=0,off=0,imm=0) R7=map_value(id=0,off=0,ks=4,vs=1600,imm=0) R8_w=inv(id=0,smax_value=800) R9=inv800 R10=fp0 fp-8=mmmm????
> > > > ; ksize = bpf_get_stack(ctx, raw_data + usize, max_len - usize, 0);
> > > > 58: (1f) r9 -= r8
> > > > ; ksize = bpf_get_stack(ctx, raw_data + usize, max_len - usize, 0);
> > > > 59: (bf) r2 = r7
> > > > 60: (0f) r2 += r1
> > > > regs=1 stack=0 before 52: (85) call bpf_get_stack#67
> > > > ; ksize = bpf_get_stack(ctx, raw_data + usize, max_len - usize, 0);
> > > > 61: (bf) r1 = r6
> > > > 62: (bf) r3 = r9
> > > > 63: (b7) r4 = 0
> > > > 64: (85) call bpf_get_stack#67
> > > >   R0=inv(id=0,smax_value=800) R1_w=ctx(id=0,off=0,imm=0) R2_w=map_value(id=0,off=0,ks=4,vs=1600,umax_value=800,var_off=(0x0; 0x3ff),s32_max_value=1023,u32_max_value=1023) R3_w=inv(id=0,umax_value=9223372036854776608) R4_w=inv0 R6=ctx(id=0?
> > > > R3 unbounded memory access, use 'var &= const' or 'if (var < const)'
> > > >
> > > > In the C code it was this:
> > > >          usize = bpf_get_stack(ctx, raw_data, max_len, BPF_F_USER_STACK);
> > > >          if (usize < 0)
> > > >                  return 0;
> > > >
> > > >          ksize = bpf_get_stack(ctx, raw_data + usize, max_len - usize, 0);
> > > >          if (ksize < 0)
> > > >                  return 0;
> > > >
> > > > We used to have problem with pointer arith in R2.
> > > > Now it's a problem with two integers in R3.
> > > > 'if (usize < 0)' is comparing R1 and makes it [0,800], but R8 stays [-inf,800].
> > > > Both registers represent the same 'usize' variable.
> > > > Then R9 -= R8 is doing 800 - [-inf, 800]
> > > > so the result of "max_len - usize" looks unbounded to the verifier while
> > > > it's obvious in C code that "max_len - usize" should be [0, 800].
> > > >
> > > > The following diff 'fixes' the issue for no_alu32:
> > > > diff --git a/tools/testing/selftests/bpf/progs/test_get_stack_rawtp.c b/tools/testing/selftests/bpf/progs/test_get_stack_rawtp.c
> > > > index 29817a703984..93058136d608 100644
> > > > --- a/tools/testing/selftests/bpf/progs/test_get_stack_rawtp.c
> > > > +++ b/tools/testing/selftests/bpf/progs/test_get_stack_rawtp.c
> > > > @@ -2,6 +2,7 @@
> > > >
> > > >   #include <linux/bpf.h>
> > > >   #include <bpf/bpf_helpers.h>
> > > > +#define var_barrier(a) asm volatile ("" : "=r"(a) : "0"(a))
> > > >
> > > >   /* Permit pretty deep stack traces */
> > > >   #define MAX_STACK_RAWTP 100
> > > > @@ -84,10 +85,12 @@ int bpf_prog1(void *ctx)
> > > >                  return 0;
> > > >
> > > >          usize = bpf_get_stack(ctx, raw_data, max_len, BPF_F_USER_STACK);
> > > > +       var_barrier(usize);
> > > >          if (usize < 0)
> > > >                  return 0;
> > > >
> > > >          ksize = bpf_get_stack(ctx, raw_data + usize, max_len - usize, 0);
> > > > +       var_barrier(ksize);
> > > >          if (ksize < 0)
> > > >                  return 0;
> > > >
> > > > But it breaks alu32 case.
> > > >
> > > > I'm using llvm 11 fwiw.
> > > >
> > > > Long term Yonghong is working on llvm support to emit this kind
> > > > of workarounds automatically.
> > > > I'm still thinking what to do next. Ideas?
> > >
> >
> > Funny enough, Alexei's fix didn't fix even no_alu32 case for me. Also
> > have one of the latest clang 11...
> >
> > > The following source change will make both alu32 and non-alu32 happy:
> > >
> > >   SEC("raw_tracepoint/sys_enter")
> > >   int bpf_prog1(void *ctx)
> > >   {
> > > -       int max_len, max_buildid_len, usize, ksize, total_size;
> > > +       int max_len, max_buildid_len, total_size;
> > > +       long usize, ksize;
> >
> > This does fix it, both alu32 and no-alu32 pass.
> >
> > >          struct stack_trace_t *data;
> > >          void *raw_data;
> > >          __u32 key = 0;
> > >
> > > I have not checked the reason why it works. Mostly this confirms to
> > > the function signature so compiler generates more friendly code.
> >
> > Yes, it's due to the compiler not doing all the casting/bit shifting.
> > Just straightforward use of a single register consistently across
> > conditional jump and offset calculations.
>
> Another option is to drop the int->long uapi conversion and write the
> varlen test using >=0 tests. The below diff works for me also using
> recent clang-11, but maybe doesn't resolve Andrii's original issue.
> My concern is if we break existing code in selftests is there a risk
> users will get breaking code as well? Seems like without a few
> additional clang improvements its going to be hard to get all
> combinations working by just fiddling with the types.

I have bad news for you, John. Try running your variant of test_varlen
on, say, 5.6 kernel (upstream version). Both alu32 and no-alu32
version fail (that's with int helpers, of course):

libbpf: load bpf program failed: Permission denied
libbpf: -- BEGIN DUMP LOG ---
libbpf:
arg#0 type is not a struct
Unrecognized arg#0 type PTR
; int pid = bpf_get_current_pid_tgid() >> 32;
0: (85) call bpf_get_current_pid_tgid#14
; if (test_pid != pid || !capture)
1: (18) r1 = 0xffffc90000132200
3: (61) r1 = *(u32 *)(r1 +0)
 R0_w=inv(id=0) R1_w=map_value(id=0,off=512,ks=4,vs=1056,imm=0) R10=fp0
; int pid = bpf_get_current_pid_tgid() >> 32;
4: (77) r0 >>= 32
; if (test_pid != pid || !capture)
5: (5e) if w1 != w0 goto pc+37
 R0_w=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff))
R1_w=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R10=fp0
6: (18) r1 = 0xffffc90000132204
8: (71) r1 = *(u8 *)(r1 +0)
 R0_w=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff))
R1_w=map_value(id=0,off=516,ks=4,vs=1056,imm=0) R10=fp0
9: (16) if w1 == 0x0 goto pc+33
 R0=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff))
R1=inv(id=0,umax_value=255,var_off=(0x0; 0xff)) R10=fp0
; len = bpf_probe_read_kernel_str(payload, MAX_LEN, &buf_in1[0]);
10: (18) r6 = 0xffffc90000129218
12: (18) r1 = 0xffffc90000129218
14: (b4) w2 = 256
15: (18) r3 = 0xffffc90000132000
17: (85) call bpf_probe_read_kernel_str#115
 R0=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff))
R1_w=map_value(id=0,off=536,ks=4,vs=1572,imm=0) R2_w=inv256
R3_w=map_value(id=0,off=0,ks=4,vs=1056,imm=0)
R6_w=map_value(id=0,off=536,ks=4,vs=1572,imm=0) R10=fp0
last_idx 17 first_idx 9
regs=4 stack=0 before 15: (18) r3 = 0xffffc90000132000
regs=4 stack=0 before 14: (b4) w2 = 256
; if (len >= 0) {
18: (c6) if w0 s< 0x0 goto pc+7
 R0_w=inv(id=0) R6_w=map_value(id=0,off=536,ks=4,vs=1572,imm=0) R10=fp0
; payload3_len1 = len;
19: (18) r1 = 0xffffc9000012920c
21: (63) *(u32 *)(r1 +0) = r0
 R0_w=inv(id=0) R1_w=map_value(id=0,off=524,ks=4,vs=1572,imm=0)
R6_w=map_value(id=0,off=536,ks=4,vs=1572,imm=0) R10=fp0
; len = bpf_probe_read_kernel_str(payload, MAX_LEN, &buf_in1[0]);
22: (bc) w1 = w0
; payload += len;
23: (18) r6 = 0xffffc90000129218
25: (0f) r6 += r1
last_idx 25 first_idx 9
regs=2 stack=0 before 23: (18) r6 = 0xffffc90000129218
regs=2 stack=0 before 22: (bc) w1 = w0
regs=1 stack=0 before 21: (63) *(u32 *)(r1 +0) = r0
regs=1 stack=0 before 19: (18) r1 = 0xffffc9000012920c
regs=1 stack=0 before 18: (c6) if w0 s< 0x0 goto pc+7
regs=1 stack=0 before 17: (85) call bpf_probe_read_kernel_str#115
; len = bpf_probe_read_kernel_str(payload, MAX_LEN, &buf_in2[0]);
26: (bf) r1 = r6
27: (b4) w2 = 256
28: (18) r3 = 0xffffc90000132100
30: (85) call bpf_probe_read_kernel_str#115
 R0=inv(id=0) R1_w=map_value(id=0,off=536,ks=4,vs=1572,umax_value=4294967295,var_off=(0x0;
0xffffffff)) R2_w=inv256
R3_w=map_value(id=0,off=256,ks=4,vs=1056,imm=0)
R6=map_value(id=0,off=536,ks=4,vs=1572,umax_value=4294967295,var_off=(0x00
R1 unbounded memory access, make sure to bounds check any array access
into a map
processed 23 insns (limit 1000000) max_states_per_insn 0 total_states
2 peak_states 2 mark_read 1

libbpf: -- END LOG --
libbpf: failed to load program 'raw_tp/sys_exit'
libbpf: failed to load object 'test_varlen'
libbpf: failed to load BPF skeleton 'test_varlen': -4007
test_varlen:FAIL:skel_open failed to open skeleton
#88 varlen:FAIL
Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED


Disassembly of section raw_tp/sys_exit:

0000000000000000 <handler64_signed>:
       0:       85 00 00 00 0e 00 00 00 call 14
       1:       18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll
       3:       61 11 00 00 00 00 00 00 r1 = *(u32 *)(r1 + 0)
       4:       77 00 00 00 20 00 00 00 r0 >>= 32
       5:       5e 01 25 00 00 00 00 00 if w1 != w0 goto +37 <LBB1_7>
       6:       18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll
       8:       71 11 00 00 00 00 00 00 r1 = *(u8 *)(r1 + 0)
       9:       16 01 21 00 00 00 00 00 if w1 == 0 goto +33 <LBB1_7>
      10:       18 06 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r6 = 0 ll
      12:       18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll
      14:       b4 02 00 00 00 01 00 00 w2 = 256
      15:       18 03 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r3 = 0 ll
      17:       85 00 00 00 73 00 00 00 call 115
      18:       c6 00 07 00 00 00 00 00 if w0 s< 0 goto +7 <LBB1_4>
      19:       18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll
      21:       63 01 00 00 00 00 00 00 *(u32 *)(r1 + 0) = r0
      22:       bc 01 00 00 00 00 00 00 w1 = w0
      23:       18 06 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r6 = 0 ll
      25:       0f 16 00 00 00 00 00 00 r6 += r1

00000000000000d0 <LBB1_4>:
      26:       bf 61 00 00 00 00 00 00 r1 = r6
      27:       b4 02 00 00 00 01 00 00 w2 = 256
      28:       18 03 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r3 = 0 ll
      30:       85 00 00 00 73 00 00 00 call 115
      31:       c6 00 05 00 00 00 00 00 if w0 s< 0 goto +5 <LBB1_6>
      32:       18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll
      34:       63 01 00 00 00 00 00 00 *(u32 *)(r1 + 0) = r0
      35:       bc 01 00 00 00 00 00 00 w1 = w0
      36:       0f 16 00 00 00 00 00 00 r6 += r1

0000000000000128 <LBB1_6>:
      37:       18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll
      39:       1c 16 00 00 00 00 00 00 w6 -= w1
      40:       18 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 r1 = 0 ll
      42:       63 61 00 00 00 00 00 00 *(u32 *)(r1 + 0) = r6

0000000000000158 <LBB1_7>:
      43:       b4 00 00 00 00 00 00 00 w0 = 0
      44:       95 00 00 00 00 00 00 00 exit


ALU32 code looks ok, but it's probably those verifier bugs that you've
fixed just recently that make it impossible to use? Don't know. So one
needs a quite recent kernel to make such code pattern work, while
unsigned variant works fine in practice.

Now, my variant also fails on 5.6 with default int helpers. If we have
longs, though, it suddenly works! Both alu32 and no-alu32!

And it makes sense, that's what I've been arguing all along. long
represent reality, it causes more straightforward code generation, if
you don't aritifically down-cast types. Even with downcasted types, in
many cases it's ok. If there are regressions (which are probably
impossible to avoid, unfortunately), at least in theory just casting
bpf helper's return result to int should be equivalent:

int len = (int)bpf_read_probe_str(...)

But even better is to just fix types of your local variables to match
native BPF size.

> Seems like without a few
> additional clang improvements its going to be hard to get all
> combinations working by just fiddling with the types.

I'd like to see the case yet in which synchronizing types didn't help, actually.

>
> diff --git a/tools/testing/selftests/bpf/progs/test_varlen.c b/tools/testing/selftests/bpf/progs/test_varlen.c
> index 0969185..01c992c 100644
> --- a/tools/testing/selftests/bpf/progs/test_varlen.c
> +++ b/tools/testing/selftests/bpf/progs/test_varlen.c
> @@ -31,20 +31,20 @@ int handler64(void *regs)

[...]
John Fastabend June 24, 2020, 4:48 p.m. UTC | #9
Andrii Nakryiko wrote:
> On Tue, Jun 23, 2020 at 11:04 PM John Fastabend
> <john.fastabend@gmail.com> wrote:
> >
> > Andrii Nakryiko wrote:
> > > On Tue, Jun 23, 2020 at 5:25 PM Yonghong Song <yhs@fb.com> wrote:
> > > >
> > > >
> > > >
> > > > On 6/23/20 4:25 PM, Alexei Starovoitov wrote:
> > > > > On Tue, Jun 23, 2020 at 11:15:58PM +0200, Daniel Borkmann wrote:
> > > > >> On 6/23/20 10:52 PM, Andrii Nakryiko wrote:
> > > > >>> On Tue, Jun 23, 2020 at 1:39 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
> > > > >>>> On 6/23/20 5:22 AM, Andrii Nakryiko wrote:
> > > > >>>>> Add selftest that validates variable-length data reading and concatentation
> > > > >>>>> with one big shared data array. This is a common pattern in production use for
> > > > >>>>> monitoring and tracing applications, that potentially can read a lot of data,
> > > > >>>>> but overall read much less. Such pattern allows to determine precisely what
> > > > >>>>> amount of data needs to be sent over perfbuf/ringbuf and maximize efficiency.
> > > > >>>>>
> > > > >>>>> Signed-off-by: Andrii Nakryiko <andriin@fb.com>
> > > > >>>>
> > > > >>>> Currently getting the below errors on these tests. My last clang/llvm git build
> > > > >>>> is on 4676cf444ea2 ("[Clang] Skip adding begin source location for PragmaLoopHint'd
> > > > >>>> loop when[...]"):
> > > > >>>
> > > > >>> Yeah, you need 02553b91da5d ("bpf: bpf_probe_read_kernel_str() has to
> > > > >>> return amount of data read on success") from bpf tree.
> > > > >>
> > > > >> Fair point, it's in net- but not yet in net-next tree, so bpf-next sync needs
> > > > >> to wait.
> > > > >>
> > > > >>> I'm eagerly awaiting bpf being merged into bpf-next :)
> > > > >>
> > > > >> I'll cherry-pick 02553b91da5d locally for testing and if it passes I'll push
> > > > >> these out.
> > > > >
> > > > > I've merged the bpf_probe_read_kernel_str() fix into bpf-next and 3 extra commits
> > > > > prior to that one so that sha of the bpf_probe_read_kernel_str() fix (02553b91da5de)
> > > > > is exactly the same in bpf/net/linus/bpf-next. I think that shouldn't cause
> > > > > issue during bpf-next pull into net-next and later merge with Linus's tree.
> > > > > Crossing fingers, since we're doing this experiment for the first time.
> > > > >
> > > > > Daniel pushed these 3 commits as well.
> > > > > Now varlen and kernel_reloc tests are good, but we have a different issue :(
> > > > > ./test_progs-no_alu32 -t get_stack_raw_tp
> > > > > is now failing, but for a different reason.
> > > > >
> > > > > 52: (85) call bpf_get_stack#67
> > > > > 53: (bf) r8 = r0
> > > > > 54: (bf) r1 = r8
> > > > > 55: (67) r1 <<= 32
> > > > > 56: (c7) r1 s>>= 32
> > > > > ; if (usize < 0)
> > > > > 57: (c5) if r1 s< 0x0 goto pc+26
> > > > >   R0=inv(id=0,smax_value=800) R1_w=inv(id=0,umax_value=800,var_off=(0x0; 0x3ff)) R6=ctx(id=0,off=0,imm=0) R7=map_value(id=0,off=0,ks=4,vs=1600,imm=0) R8_w=inv(id=0,smax_value=800) R9=inv800 R10=fp0 fp-8=mmmm????
> > > > > ; ksize = bpf_get_stack(ctx, raw_data + usize, max_len - usize, 0);
> > > > > 58: (1f) r9 -= r8
> > > > > ; ksize = bpf_get_stack(ctx, raw_data + usize, max_len - usize, 0);
> > > > > 59: (bf) r2 = r7
> > > > > 60: (0f) r2 += r1
> > > > > regs=1 stack=0 before 52: (85) call bpf_get_stack#67
> > > > > ; ksize = bpf_get_stack(ctx, raw_data + usize, max_len - usize, 0);
> > > > > 61: (bf) r1 = r6
> > > > > 62: (bf) r3 = r9
> > > > > 63: (b7) r4 = 0
> > > > > 64: (85) call bpf_get_stack#67
> > > > >   R0=inv(id=0,smax_value=800) R1_w=ctx(id=0,off=0,imm=0) R2_w=map_value(id=0,off=0,ks=4,vs=1600,umax_value=800,var_off=(0x0; 0x3ff),s32_max_value=1023,u32_max_value=1023) R3_w=inv(id=0,umax_value=9223372036854776608) R4_w=inv0 R6=ctx(id=0?
> > > > > R3 unbounded memory access, use 'var &= const' or 'if (var < const)'
> > > > >
> > > > > In the C code it was this:
> > > > >          usize = bpf_get_stack(ctx, raw_data, max_len, BPF_F_USER_STACK);
> > > > >          if (usize < 0)
> > > > >                  return 0;
> > > > >
> > > > >          ksize = bpf_get_stack(ctx, raw_data + usize, max_len - usize, 0);
> > > > >          if (ksize < 0)
> > > > >                  return 0;
> > > > >
> > > > > We used to have problem with pointer arith in R2.
> > > > > Now it's a problem with two integers in R3.
> > > > > 'if (usize < 0)' is comparing R1 and makes it [0,800], but R8 stays [-inf,800].
> > > > > Both registers represent the same 'usize' variable.
> > > > > Then R9 -= R8 is doing 800 - [-inf, 800]
> > > > > so the result of "max_len - usize" looks unbounded to the verifier while
> > > > > it's obvious in C code that "max_len - usize" should be [0, 800].
> > > > >
> > > > > The following diff 'fixes' the issue for no_alu32:
> > > > > diff --git a/tools/testing/selftests/bpf/progs/test_get_stack_rawtp.c b/tools/testing/selftests/bpf/progs/test_get_stack_rawtp.c
> > > > > index 29817a703984..93058136d608 100644
> > > > > --- a/tools/testing/selftests/bpf/progs/test_get_stack_rawtp.c
> > > > > +++ b/tools/testing/selftests/bpf/progs/test_get_stack_rawtp.c
> > > > > @@ -2,6 +2,7 @@
> > > > >
> > > > >   #include <linux/bpf.h>
> > > > >   #include <bpf/bpf_helpers.h>
> > > > > +#define var_barrier(a) asm volatile ("" : "=r"(a) : "0"(a))
> > > > >
> > > > >   /* Permit pretty deep stack traces */
> > > > >   #define MAX_STACK_RAWTP 100
> > > > > @@ -84,10 +85,12 @@ int bpf_prog1(void *ctx)
> > > > >                  return 0;
> > > > >
> > > > >          usize = bpf_get_stack(ctx, raw_data, max_len, BPF_F_USER_STACK);
> > > > > +       var_barrier(usize);
> > > > >          if (usize < 0)
> > > > >                  return 0;
> > > > >
> > > > >          ksize = bpf_get_stack(ctx, raw_data + usize, max_len - usize, 0);
> > > > > +       var_barrier(ksize);
> > > > >          if (ksize < 0)
> > > > >                  return 0;
> > > > >
> > > > > But it breaks alu32 case.
> > > > >
> > > > > I'm using llvm 11 fwiw.
> > > > >
> > > > > Long term Yonghong is working on llvm support to emit this kind
> > > > > of workarounds automatically.
> > > > > I'm still thinking what to do next. Ideas?
> > > >
> > >
> > > Funny enough, Alexei's fix didn't fix even no_alu32 case for me. Also
> > > have one of the latest clang 11...
> > >
> > > > The following source change will make both alu32 and non-alu32 happy:
> > > >
> > > >   SEC("raw_tracepoint/sys_enter")
> > > >   int bpf_prog1(void *ctx)
> > > >   {
> > > > -       int max_len, max_buildid_len, usize, ksize, total_size;
> > > > +       int max_len, max_buildid_len, total_size;
> > > > +       long usize, ksize;
> > >
> > > This does fix it, both alu32 and no-alu32 pass.
> > >
> > > >          struct stack_trace_t *data;
> > > >          void *raw_data;
> > > >          __u32 key = 0;
> > > >
> > > > I have not checked the reason why it works. Mostly this confirms to
> > > > the function signature so compiler generates more friendly code.
> > >
> > > Yes, it's due to the compiler not doing all the casting/bit shifting.
> > > Just straightforward use of a single register consistently across
> > > conditional jump and offset calculations.
> >
> > Another option is to drop the int->long uapi conversion and write the
> > varlen test using >=0 tests. The below diff works for me also using
> > recent clang-11, but maybe doesn't resolve Andrii's original issue.
> > My concern is if we break existing code in selftests is there a risk
> > users will get breaking code as well? Seems like without a few
> > additional clang improvements its going to be hard to get all
> > combinations working by just fiddling with the types.
> 
> I have bad news for you, John. Try running your variant of test_varlen
> on, say, 5.6 kernel (upstream version). Both alu32 and no-alu32
> version fail (that's with int helpers, of course):

OK. 

[...]

> 
> ALU32 code looks ok, but it's probably those verifier bugs that you've
> fixed just recently that make it impossible to use? Don't know. So one
> needs a quite recent kernel to make such code pattern work, while
> unsigned variant works fine in practice.

Correct bounds tracking broke above.

>
> Now, my variant also fails on 5.6 with default int helpers. If we have
> longs, though, it suddenly works! Both alu32 and no-alu32!

OK that is a win.

> 
> And it makes sense, that's what I've been arguing all along. long
> represent reality, it causes more straightforward code generation, if
> you don't aritifically down-cast types. Even with downcasted types, in
> many cases it's ok. If there are regressions (which are probably
> impossible to avoid, unfortunately), at least in theory just casting
> bpf helper's return result to int should be equivalent:
> 
> int len = (int)bpf_read_probe_str(...)
> 
> But even better is to just fix types of your local variables to match
> native BPF size.

I've already done this on code I own so no problem from me.

> 
> > Seems like without a few
> > additional clang improvements its going to be hard to get all
> > combinations working by just fiddling with the types.
> 
> I'd like to see the case yet in which synchronizing types didn't help, actually.

The case is when the output of read_probe_str(..) is fed back into
a call with an int arg. But we can probably change those to longs
as well. At any rate its obscure and arguably those types just
need to be syncchronized as well.

For what its worth I'm not try to stop the patch I ACKed it at
one point I believe. As long as the <=0 test continues to work
which it does I'm happy. Mostly the point I wanted to make is
the bpf-next verifier is happy with either code. I briefly
forgot you also want to push this into libbpf github most
likely and run it everywhere.

> 
> >
> > diff --git a/tools/testing/selftests/bpf/progs/test_varlen.c b/tools/testing/selftests/bpf/progs/test_varlen.c
> > index 0969185..01c992c 100644
> > --- a/tools/testing/selftests/bpf/progs/test_varlen.c
> > +++ b/tools/testing/selftests/bpf/progs/test_varlen.c
> > @@ -31,20 +31,20 @@ int handler64(void *regs)
> 
> [...]
Alexei Starovoitov June 24, 2020, 6:18 p.m. UTC | #10
On Tue, Jun 23, 2020 at 11:51 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> long
> represent reality, it causes more straightforward code generation, if
> you don't aritifically down-cast types.

yep. s/int/long/ conversion in bpf_helpers_def.h definitely improves
generated code.

> But even better is to just fix types of your local variables to match
> native BPF size.

I've applied int to long conversion for test_get_stack_rawtp.c test for now.

Let's try to keep 100% passing rate for test_progs and test_progs-no_alu32 :)
Andrii Nakryiko June 24, 2020, 6:26 p.m. UTC | #11
On Wed, Jun 24, 2020 at 11:19 AM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Tue, Jun 23, 2020 at 11:51 PM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> >
> > long
> > represent reality, it causes more straightforward code generation, if
> > you don't aritifically down-cast types.
>
> yep. s/int/long/ conversion in bpf_helpers_def.h definitely improves
> generated code.
>
> > But even better is to just fix types of your local variables to match
> > native BPF size.
>
> I've applied int to long conversion for test_get_stack_rawtp.c test for now.
>
> Let's try to keep 100% passing rate for test_progs and test_progs-no_alu32 :)

Yeah, my bad. I was 100% sure that I tested both back when I did the
change, but nothing is 100% in this world, apparently :)

As for test_progs-no_alu32, I'll add them to Travis CI as well (right
now we only run test_progs), that will help. But I'll try to keep
test_progs-no_alu32 in mind when doing tests locally as well.
diff mbox series

Patch

diff --git a/tools/testing/selftests/bpf/prog_tests/varlen.c b/tools/testing/selftests/bpf/prog_tests/varlen.c
new file mode 100644
index 000000000000..7533565e096d
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/varlen.c
@@ -0,0 +1,56 @@ 
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2020 Facebook */
+
+#include <test_progs.h>
+#include <time.h>
+#include "test_varlen.skel.h"
+
+#define CHECK_VAL(got, exp) \
+	CHECK((got) != (exp), "check", "got %ld != exp %ld\n", \
+	      (long)(got), (long)(exp))
+
+void test_varlen(void)
+{
+	int duration = 0, err;
+	struct test_varlen* skel;
+	struct test_varlen__bss *bss;
+	struct test_varlen__data *data;
+	const char str1[] = "Hello, ";
+	const char str2[] = "World!";
+	const char exp_str[] = "Hello, \0World!\0";
+	const int size1 = sizeof(str1);
+	const int size2 = sizeof(str2);
+
+	skel = test_varlen__open_and_load();
+	if (CHECK(!skel, "skel_open", "failed to open skeleton\n"))
+		return;
+	bss = skel->bss;
+	data = skel->data;
+
+	err = test_varlen__attach(skel);
+	if (CHECK(err, "skel_attach", "skeleton attach failed: %d\n", err))
+		goto cleanup;
+
+	bss->test_pid = getpid();
+
+	/* trigger everything */
+	memcpy(bss->buf_in1, str1, size1);
+	memcpy(bss->buf_in2, str2, size2);
+	bss->capture = true;
+	usleep(1);
+	bss->capture = false;
+
+	CHECK_VAL(bss->payload1_len1, size1);
+	CHECK_VAL(bss->payload1_len2, size2);
+	CHECK_VAL(bss->total1, size1 + size2);
+	CHECK(memcmp(bss->payload1, exp_str, size1 + size2), "content_check",
+	      "doesn't match!");
+
+	CHECK_VAL(data->payload2_len1, size1);
+	CHECK_VAL(data->payload2_len2, size2);
+	CHECK_VAL(data->total2, size1 + size2);
+	CHECK(memcmp(data->payload2, exp_str, size1 + size2), "content_check",
+	      "doesn't match!");
+cleanup:
+	test_varlen__destroy(skel);
+}
diff --git a/tools/testing/selftests/bpf/progs/test_varlen.c b/tools/testing/selftests/bpf/progs/test_varlen.c
new file mode 100644
index 000000000000..09691852debf
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/test_varlen.c
@@ -0,0 +1,96 @@ 
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2020 Facebook */
+
+#include "vmlinux.h"
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+#include <bpf/bpf_core_read.h>
+
+#define MAX_LEN 256
+
+char buf_in1[MAX_LEN] = {};
+char buf_in2[MAX_LEN] = {};
+
+int test_pid = 0;
+bool capture = false;
+
+/* .bss */
+long payload1_len1 = 0;
+long payload1_len2 = 0;
+long total1 = 0;
+char payload1[MAX_LEN + MAX_LEN] = {};
+
+/* .data */
+int payload2_len1 = -1;
+int payload2_len2 = -1;
+int total2 = -1;
+char payload2[MAX_LEN + MAX_LEN] = { 1 };
+
+SEC("raw_tp/sys_enter")
+int handler64(void *regs)
+{
+	int pid = bpf_get_current_pid_tgid() >> 32;
+	void *payload = payload1;
+	u64 len;
+
+	/* ignore irrelevant invocations */
+	if (test_pid != pid || !capture)
+		return 0;
+
+	len = bpf_probe_read_kernel_str(payload, MAX_LEN, &buf_in1[0]);
+	if (len <= MAX_LEN) {
+		payload += len;
+		payload1_len1 = len;
+	}
+
+	len = bpf_probe_read_kernel_str(payload, MAX_LEN, &buf_in2[0]);
+	if (len <= MAX_LEN) {
+		payload += len;
+		payload1_len2 = len;
+	}
+
+	total1 = payload - (void *)payload1;
+
+	return 0;
+}
+
+SEC("tp_btf/sys_enter")
+int handler32(void *regs)
+{
+	int pid = bpf_get_current_pid_tgid() >> 32;
+	void *payload = payload2;
+	u32 len;
+
+	/* ignore irrelevant invocations */
+	if (test_pid != pid || !capture)
+		return 0;
+
+	len = bpf_probe_read_kernel_str(payload, MAX_LEN, &buf_in1[0]);
+	if (len <= MAX_LEN) {
+		payload += len;
+		payload2_len1 = len;
+	}
+
+	len = bpf_probe_read_kernel_str(payload, MAX_LEN, &buf_in2[0]);
+	if (len <= MAX_LEN) {
+		payload += len;
+		payload2_len2 = len;
+	}
+
+	total2 = payload - (void *)payload2;
+
+	return 0;
+}
+
+SEC("tp_btf/sys_exit")
+int handler_exit(void *regs)
+{
+	long bla;
+
+	if (bpf_probe_read_kernel(&bla, sizeof(bla), 0))
+		return 1;
+	else
+		return 0;
+}
+
+char LICENSE[] SEC("license") = "GPL";