[bpf-next,0/4] libbpf: add raw BTF type dumping

Message ID 20200929232843.1249318-1-andriin@fb.com

Message

Andrii Nakryiko Sept. 29, 2020, 11:28 p.m. UTC
Add btf_dump__dump_type_raw() API that emits human-readable low-level BTF type
information, same as bpftool output. bpftool is not switched to this API
because bpftool still needs to perform all the same BTF type processing logic
to do JSON output, so benefits are pretty much zero.

Raw BTF type output is extremely useful when debugging issues with BTF. It's
also handy to be able to do that in selftests. Raw BTF type output doesn't
hide any information like BTF-to-C conversion might (e.g., not emitting
BTF_KIND_FUNC, BTF_KIND_VAR and BTF_KIND_DATASEC), so is the most robust way
to look at BTF data without going all the way to deciphering binary BTF info.

Also, now that BTF can be extended with write APIs, teach btf_dump to work
with such modifiable BTFs, including the BTF-to-C conversion APIs. A self-test
to validate such incremental BTF-to-C conversion is added in patch #4.

Andrii Nakryiko (4):
  libbpf: make btf_dump work with modifiable BTF
  libbpf: add raw dumping of BTF types
  selftests/bpf: add checking of raw type dump in BTF writer APIs
    selftests
  selftests/bpf: test "incremental" btf_dump in C format

 tools/lib/bpf/btf.c                           |  17 ++
 tools/lib/bpf/btf.h                           |   1 +
 tools/lib/bpf/btf_dump.c                      | 243 ++++++++++++++++--
 tools/lib/bpf/libbpf.map                      |   1 +
 tools/lib/bpf/libbpf_internal.h               |   1 +
 .../selftests/bpf/prog_tests/btf_dump.c       | 105 ++++++++
 .../selftests/bpf/prog_tests/btf_write.c      |  67 ++++-
 7 files changed, 410 insertions(+), 25 deletions(-)

Comments

Alexei Starovoitov Sept. 30, 2020, 12:03 a.m. UTC | #1
On Tue, Sep 29, 2020 at 04:28:39PM -0700, Andrii Nakryiko wrote:
> Add btf_dump__dump_type_raw() API that emits human-readable low-level BTF type
> information, same as bpftool output. bpftool is not switched to this API
> because bpftool still needs to perform all the same BTF type processing logic
> to do JSON output, so benefits are pretty much zero.

If the only existing user cannot actually use such api it speaks heavily
against adding such api to libbpf. Comparing strings in tests is nice, but
could be done with C output just as well.

Andrii Nakryiko Sept. 30, 2020, 12:44 a.m. UTC | #2
On Tue, Sep 29, 2020 at 5:03 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Tue, Sep 29, 2020 at 04:28:39PM -0700, Andrii Nakryiko wrote:
> > Add btf_dump__dump_type_raw() API that emits human-readable low-level BTF type
> > information, same as bpftool output. bpftool is not switched to this API
> > because bpftool still needs to perform all the same BTF type processing logic
> > to do JSON output, so benefits are pretty much zero.
>
> If the only existing user cannot actually use such api it speaks heavily
> against adding such api to libbpf. Comparing strings in tests is nice, but
> could be done with C output just as well.

It certainly can, it just won't save much code, because bpftool would
still need to have a big switch over BTF type kinds to do JSON output.
I can do such conversion, if you prefer. I'm also thinking about
switching pahole to use this during BTF dedup verbose mode, if Arnaldo
will be fine with changing output format a bit.
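
To make the "big switch" point concrete, here is a minimal sketch (not code
from bpftool or libbpf) of a per-kind dumper that chooses between text and
JSON inside each case. The enum, struct, and function names are hypothetical
stand-ins, and the output strings are simplified placeholders rather than
bpftool's exact format. Moving only the text branches into a library would
still leave the same switch behind for the JSON branches.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Local stand-ins for a few BTF kinds; NOT the kernel's BTF_KIND_* values. */
enum kind { KIND_INT, KIND_PTR, KIND_STRUCT };

/* Hypothetical, flattened view of one type; real BTF is laid out differently. */
struct type_info {
	enum kind kind;
	const char *name;
	unsigned int size;    /* used by INT and STRUCT */
	unsigned int type_id; /* used by PTR */
};

/* One switch over kinds; every case carries both a JSON and a text branch. */
int dump_one(const struct type_info *t, int json, char *buf, size_t n)
{
	switch (t->kind) {
	case KIND_INT:
	case KIND_STRUCT:
		return json
			? snprintf(buf, n, "{\"name\":\"%s\",\"size\":%u}", t->name, t->size)
			: snprintf(buf, n, "'%s' size=%u", t->name, t->size);
	case KIND_PTR:
		return json
			? snprintf(buf, n, "{\"type_id\":%u}", t->type_id)
			: snprintf(buf, n, "type_id=%u", t->type_id);
	}
	return -1;
}

/* Exercises both branches for one INT-like type. */
void dump_demo(void)
{
	struct type_info t = { .kind = KIND_INT, .name = "int", .size = 4 };
	char buf[64];

	dump_one(&t, 0, buf, sizeof(buf));
	assert(strcmp(buf, "'int' size=4") == 0);
	dump_one(&t, 1, buf, sizeof(buf));
	assert(strcmp(buf, "{\"name\":\"int\",\"size\":4}") == 0);
}
```

Calling dump_demo() from a main() runs both checks. Deleting the text
branches would shorten each case but would not remove the switch itself.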

Alexei Starovoitov Sept. 30, 2020, 3:18 a.m. UTC | #3
On Tue, Sep 29, 2020 at 05:44:48PM -0700, Andrii Nakryiko wrote:
> On Tue, Sep 29, 2020 at 5:03 PM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Tue, Sep 29, 2020 at 04:28:39PM -0700, Andrii Nakryiko wrote:
> > > Add btf_dump__dump_type_raw() API that emits human-readable low-level BTF type
> > > information, same as bpftool output. bpftool is not switched to this API
> > > because bpftool still needs to perform all the same BTF type processing logic
> > > to do JSON output, so benefits are pretty much zero.
> >
> > If the only existing user cannot actually use such api it speaks heavily
> > against adding such api to libbpf. Comparing strings in tests is nice, but
> > could be done with C output just as well.
> 
> It certainly can, it just won't save much code, because bpftool would
> still need to have a big switch over BTF type kinds to do JSON output.

So you're saying that most of the dump_btf_type() in bpftool/btf.c will stay as-is.
Only 'if (json_output)' will become unconditional? Hmm.
I know you don't want json in libbpf, but I think it's the point of
making a call on such things. Either libbpf gets to dump both
json and text dump_btf_type()-like output or it stays with C only.
Doing C and this text and not doing json is inconsistent.
Either libbpf can print btf in many different ways or it stays with C.
2nd format is not special in any way.
I don't think that text and json formats bring much value compared to C,
so I would be fine with C only. But if we allow 2nd format we should
do json at the same time too to save bpftool the hassle.
And in the future we should allow 4th and 5th formats.

Andrii Nakryiko Sept. 30, 2020, 6:22 p.m. UTC | #4
On Tue, Sep 29, 2020 at 8:18 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Tue, Sep 29, 2020 at 05:44:48PM -0700, Andrii Nakryiko wrote:
> > On Tue, Sep 29, 2020 at 5:03 PM Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> > >
> > > On Tue, Sep 29, 2020 at 04:28:39PM -0700, Andrii Nakryiko wrote:
> > > > Add btf_dump__dump_type_raw() API that emits human-readable low-level BTF type
> > > > information, same as bpftool output. bpftool is not switched to this API
> > > > because bpftool still needs to perform all the same BTF type processing logic
> > > > to do JSON output, so benefits are pretty much zero.
> > >
> > > If the only existing user cannot actually use such api it speaks heavily
> > > against adding such api to libbpf. Comparing strings in tests is nice, but
> > > could be done with C output just as well.
> >
> > It certainly can, it just won't save much code, because bpftool would
> > still need to have a big switch over BTF type kinds to do JSON output.
>
> So you're saying that most of the dump_btf_type() in bpftool/btf.c will stay as-is.
> Only 'if (json_output)' will become unconditional? Hmm.

Yes.

> I know you don't want json in libbpf, but I think it's the point of
> making a call on such things. Either libbpf gets to dump both
> json and text dump_btf_type()-like output or it stays with C only.

Right, I don't think JSON belongs in libbpf. But I fail to see why
this is the point where we need to make such a decision.

> Doing C and this text and not doing json is inconsistent.

Inconsistent with what? I've never found bpftool's raw BTF dump in
JSON format useful. At all. Saying raw BTF dump is useful and
consistent (?) only if it's both human-readable text and JSON makes no
sense to me. Libbpf doesn't have to re-implement entire bpftool
functionality.

> Either libbpf can print btf in many different ways or it stays with C.
> 2nd format is not special in any way.

I don't understand your point. With my patch it now can dump it as
valid C language definition or as a textual low-level BTF
representation.

If you are saying it should emit it in Go format, Rust format, or
other language-specific way, then sure, maybe, but it sure won't
re-use C-specific logic of btf_dump__dump_type() as is, because it is
C language specific. For Go there would be different logic, just as
for any other language. And someone will have to implement it (and
there would need to be a compelling use case for that, of course). And
it will be a different API, or at least a generic API with some enum
specifying "format" (which is the same thing, really, but inferior for
customizability reasons).
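
A minimal sketch of the two API shapes contrasted above, under the assumption
that "customizability" refers to per-format options. None of these names are
libbpf APIs; the structs, functions, and output strings are hypothetical
placeholders. With one function per format, each can take its own options
struct, whereas the single enum-keyed entry point has one shared signature and
here can only fall back to defaults.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical per-format options; none of these names exist in libbpf. */
struct c_opts   { int indent;  }; /* only meaningful for C output */
struct raw_opts { int show_id; }; /* only meaningful for raw output */

/* Shape 1: one function per format, each free to grow its own options. */
int dump_c(int id, const struct c_opts *o, char *buf, size_t n)
{
	/* "%*s" with an empty string emits `indent` spaces of padding. */
	return snprintf(buf, n, "%*sc:%d", o ? o->indent : 0, "", id);
}

int dump_raw(int id, const struct raw_opts *o, char *buf, size_t n)
{
	return (o && o->show_id) ? snprintf(buf, n, "[%d] raw", id)
				 : snprintf(buf, n, "raw");
}

/* Shape 2: a single entry point keyed by an enum. Its signature has no
 * room for format-specific options, so it passes NULL (defaults). */
enum dump_fmt { FMT_C, FMT_RAW };

int dump_any(enum dump_fmt fmt, int id, char *buf, size_t n)
{
	switch (fmt) {
	case FMT_C:   return dump_c(id, NULL, buf, n);
	case FMT_RAW: return dump_raw(id, NULL, buf, n);
	}
	return -1;
}

/* Shows the generic call using defaults and the specific call using options. */
void api_demo(void)
{
	struct raw_opts o = { .show_id = 1 };
	char buf[32];

	dump_any(FMT_RAW, 7, buf, sizeof(buf));
	assert(strcmp(buf, "raw") == 0);
	dump_raw(7, &o, buf, sizeof(buf));
	assert(strcmp(buf, "[7] raw") == 0);
}
```

Calling api_demo() from a main() runs both checks. An enum-keyed API could
accept a union or an opaque options pointer instead, at the cost of weaker
type checking.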

But JSON is different from that. It's just a more machine-friendly
output of textual low-level BTF dump. It could have been BSON or YAML,
but I hope you don't suggest to emit in those formats as well.

> I don't think that text and json formats bring much value compared to C,
> so I would be fine with C only.

Noted. I disagree and find it very useful all the time, it's pretty
much the only way I look at BTF. C output is not complete: it doesn't
show functions, data sections and variables. It's not a replacement
for raw BTF dump. I don't even consider it as a different "format".
It's an entirely different and complementary (not alternative) view
(interpretation) of BTF.

> But if we allow 2nd format we should
> do json at the same time too to save bpftool the hassle.

There is no hassle for bpftool, code is written and working. Libbpf's
goal is not to minimize bpftool code either. So I hear you, but I
don't think about this the same way.

> And in the future we should allow 4th and 5th formats.

Ok, but there is no contradiction with what I'm doing here.



Regardless, feel free to drop patches #2 and #3, but patch #1 fixes a
real issue, so it would be nice to land it anyway. Patch #4 adds a test
for the changes in patch #1. Let me know if you want me to respin with
just those 2 patches.

Alexei Starovoitov Sept. 30, 2020, 9:29 p.m. UTC | #5
On Wed, Sep 30, 2020 at 11:22:50AM -0700, Andrii Nakryiko wrote:
> 
> If you are saying it should emit it in Go format, Rust format, or
> other language-specific way, then sure, 

Yes. that's what I'm saying. cloudflare and cilium are favoring golang.
Hopefully they can adopt skeleton when it's generated in golang.
It would probably mean some support from libbpf and vmlinux.go
Which means BTF dumping in golang.

> maybe, but it sure won't
> re-use C-specific logic of btf_dump__dump_type() as is, because it is
> C language specific. For Go there would be different logic, just as
> for any other language.

sure. that's fine.

> And someone will have to implement it (and
> there would need to be a compelling use case for that, of course). And
> it will be a different API, or at least a generic API with some enum
> specifying "format" (which is the same thing, really, but inferior for
> customizability reasons).

yes. New or reusing api doesn't matter much.
The question is what dumpers libbpf provides.

> But JSON is different from that. It's just a more machine-friendly
> output of textual low-level BTF dump. It could have been BSON or YAML,
> but I hope you don't suggest to emit in those formats as well.

why not. If libbpf does more than one there is no reason to restrict.

> 
> > I don't think that text and json formats bring much value compared to C,
> > so I would be fine with C only.
> 
> Noted. I disagree and find it very useful all the time, it's pretty
> much the only way I look at BTF. C output is not complete: it doesn't
> show functions, data sections and variables. It's not a replacement
> for raw BTF dump. 

Ok, but it's easy to add dumping of these extra data into vmlinux.h
They can come inside /* */ or as 'extern'.
So C output can be complete and suitable for selftest's strcmp.

> Regardless, feel free to drop patches #2 and #3, but patch #1 fixes a
> real issue, so it would be nice to land it anyway. Patch #4 adds a test
> for the changes in patch #1. Let me know if you want me to respin with
> just those 2 patches.

Applied 1 and 4. I was waiting for the patchwork bot to notice this partial
application, but it looks like it's not that smart... yet.

Andrii Nakryiko Sept. 30, 2020, 10:47 p.m. UTC | #6
On Wed, Sep 30, 2020 at 2:29 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Wed, Sep 30, 2020 at 11:22:50AM -0700, Andrii Nakryiko wrote:
> >
> > If you are saying it should emit it in Go format, Rust format, or
> > other language-specific way, then sure,
>
> Yes. that's what I'm saying. cloudflare and cilium are favoring golang.
> Hopefully they can adopt skeleton when it's generated in golang.
> It would probably mean some support from libbpf and vmlinux.go
> Which means BTF dumping in golang.

Yes, if they were to adopt the skeleton approach, we'd need some sort
of BTF-to-Go struct dumping. But as for vmlinux.h, keep in mind that
that thing is supposed to be only included from the BPF side, which so
far is always pure C (apart from RedBPF approach of compiling Rust
code into BPF code). I don't think we want to have BPF-side code
written in Go?

>
> > maybe, but it sure won't
> > re-use C-specific logic of btf_dump__dump_type() as is, because it is
> > C language specific. For Go there would be different logic, just as
> > for any other language.
>
> sure. that's fine.
>
> > And someone will have to implement it (and
> > there would need to be a compelling use case for that, of course). And
> > it will be a different API, or at least a generic API with some enum
> > specifying "format" (which is the same thing, really, but inferior for
> > customizability reasons).
>
> yes. New or reusing api doesn't matter much.
> The question is what dumpers libbpf provides.
>
> > But JSON is different from that. It's just a more machine-friendly
> > output of textual low-level BTF dump. It could have been BSON or YAML,
> > but I hope you don't suggest to emit in those formats as well.
>
> why not. If libbpf does more than one there is no reason to restrict.

just extra code and maintenance burden without clear benefits, that's
the only reason

>
> >
> > > I don't think that text and json formats bring much value compared to C,
> > > so I would be fine with C only.
> >
> > Noted. I disagree and find it very useful all the time, it's pretty
> > much the only way I look at BTF. C output is not complete: it doesn't
> > show functions, data sections and variables. It's not a replacement
> > for raw BTF dump.
>
> Ok, but it's easy to add dumping of these extra data into vmlinux.h
> They can come inside /* */ or as 'extern'.
> So C output can be complete and suitable for selftest's strcmp.

yeah, comments might work to "augment" vmlinux.h. There is still the
question of output type ordering, it's not always a single unique
ordering, which makes it harder to use for testing arbitrary BTFs. I
was very careful with existing BTF dump tests to ensure the order of
types is unique, but as a general case that's not true.

E.g., these two are equivalent:

struct a;

struct b { struct a *a; };

struct a { struct b *b; };

And:

struct b;

struct a { struct b *b; };

struct b { struct a *a; };
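
As a rough illustration (a sketch, not part of the patch set), both orderings
can be placed in one translation unit by suffixing the names, which is purely
an artifact of this example. The check below only compares sizes and member
offsets, so it demonstrates matching layout rather than full type equivalence.
A strcmp of the two source texts would differ even though the layouts agree.

```c
#include <assert.h>
#include <stddef.h>

/* Ordering 1: forward-declare a, define b, then define a.
 * The numeric suffixes exist only so both orderings can coexist here. */
struct a1;
struct b1 { struct a1 *a; };
struct a1 { struct b1 *b; };

/* Ordering 2: forward-declare b, define a, then define b. */
struct b2;
struct a2 { struct b2 *b; };
struct b2 { struct a2 *a; };

/* Returns 1 when both orderings yield the same sizes and member offsets. */
int same_layout(void)
{
	return sizeof(struct a1) == sizeof(struct a2) &&
	       sizeof(struct b1) == sizeof(struct b2) &&
	       offsetof(struct a1, b) == offsetof(struct a2, b) &&
	       offsetof(struct b1, a) == offsetof(struct b2, a);
}

/* Fails loudly if the two orderings ever disagree on layout. */
void layout_demo(void)
{
	assert(same_layout());
}
```

Calling layout_demo() from a main() runs the check. A test that compares
dumped C text therefore needs the emission order pinned down, or it may
reject output that is semantically identical.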

>
> > Regardless, feel free to drop patches #2 and #3, but patch #1 fixes a
> > real issue, so it would be nice to land it anyway. Patch #4 adds a test
> > for the changes in patch #1. Let me know if you want me to respin with
> > just those 2 patches.
>
> Applied 1 and 4. I was waiting for the patchwork bot to notice this partial

thanks!

> application, but looks like it's not that smart... yet.

software, maybe some day :)