diff mbox series

[RFC,bpf-next,2/8] libbpf: add support for extracting kernel symbol addresses

Message ID 20200612223150.1177182-3-andriin@fb.com
State RFC
Delegated to: BPF Maintainers
Headers show
Series libbpf ksym support and bpftool show PIDs | expand

Commit Message

Andrii Nakryiko June 12, 2020, 10:31 p.m. UTC
Add support for another (in addition to existing Kconfig) special kind of
externs in BPF code, kernel symbol externs. Such externs allow BPF code to
"know" kernel symbol address and either use it for comparisons with kernel
data structures (e.g., struct file's f_op pointer, to distinguish different
kinds of file), or, with the help of bpf_probe_user_kernel(), to follow
pointers and read data from global variables. Kernel symbol addresses are
found through /proc/kallsyms, which should be present in the system.

Currently, such kernel symbol variables are typeless: they have to be defined
as `extern const void <symbol>` and the only operation you can do (in C code)
with them is to take its address. Such extern should reside in a special
section '.ksyms'. bpf_helpers.h header provides __ksym macro for this. Strong
vs weak semantics stays the same as with Kconfig externs. If symbol is not
found in /proc/kallsyms, this will be a failure for strong (non-weak) extern,
but will be defaulted to 0 for weak externs.

If the same symbol is defined multiple times in /proc/kallsyms, then it will
be error if any of the associated addresses differs. In that case, address is
ambiguous, so libbpf falls on the side of caution, rather than confusing user
with randomly chosen address.

In the future, once kernel is extended with variables BTF information, such
ksym externs will be supported in a typed version, which will allow BPF
program to read variable's contents directly, similarly to how it's done for
fentry/fexit input arguments.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
---
 tools/lib/bpf/bpf_helpers.h |   1 +
 tools/lib/bpf/btf.h         |   5 ++
 tools/lib/bpf/libbpf.c      | 138 ++++++++++++++++++++++++++++++++++--
 3 files changed, 139 insertions(+), 5 deletions(-)

Comments

Andrii Nakryiko June 15, 2020, 7:08 p.m. UTC | #1
On Mon, Jun 15, 2020 at 9:44 AM Hao Luo <haoluo@google.com> wrote:
>
> Thanks, Andrii,
>
> This change looks nice! A couple of comments:
>
> 1. A 'void' type variable looks slightly odd from a user's perspective. How about using 'u64' or 'void *'? Or at least, a named type, which aliases to 'void'?

That choice is very deliberate one. `extern const void` is the right
way in C language to access linker-generated symbols, for instance,
which is quite similar to what the intent is her. Having void type is
very explicit that you don't know/care about that value pointed to by
extern address, the only operation you can perform is to get it's
address.

Once we add kernel variables support, that's when types will start to
be specified and libbpf will do extra checks (type matching) and extra
work (generating ldimm64 with BTF ID, for instance), to allow C code
to access data pointed to by extern address.

Switching type to u64 would be misleading in allowing C code to
implicitly dereference value of extern. E.g., there is a big
difference between:

extern u64 bla;

printf("%lld\n", bla); /* de-reference happens here, we get contents
of memory pointed to by "bla" symbol */

printf("%p\n", &bla); /* here we get value of linker symbol/address of
extern variable */

Currently I explicitly support only the latter and want to prevent the
former, until we have kernel variables in BTF. Using `extern void`
makes compiler enforce that only the &bla form is allowed. Everything
else is compilation error.

> 2. About the type size of ksym, IIUC, it looks strange that the values read from kallsyms have 8 bytes but their corresponding vs->size is 4 bytes and vs->type points to 4-byte int. Can we make them of the same size?

That's a bit of a hack on my part. Variable needs to point to some
type, which size will match the size of datasec's varinfo entry. This
is checked and enforced by kernel. I'm looking for 4-byte int, because
it's almost guaranteed that it will be present in program's BTF and I
won't have to explicitly add it (it's because all BPF programs return
int, so it must be in program's BTF already). While 8-byte long is
less likely to be there.

In the future, if we have a nicer way to extend BTF (and we will
soon), we can do this a bit better, but either way that .ksyms DATASEC
type isn't used for anything (there is no map with that DATASEC as a
value type), so it doesn't matter.

>
> Hao
>
> On Fri, Jun 12, 2020 at 3:35 PM Andrii Nakryiko <andriin@fb.com> wrote:
>>
>> Add support for another (in addition to existing Kconfig) special kind of
>> externs in BPF code, kernel symbol externs. Such externs allow BPF code to
>> "know" kernel symbol address and either use it for comparisons with kernel
>> data structures (e.g., struct file's f_op pointer, to distinguish different
>> kinds of file), or, with the help of bpf_probe_user_kernel(), to follow
>> pointers and read data from global variables. Kernel symbol addresses are
>> found through /proc/kallsyms, which should be present in the system.
>>
>> Currently, such kernel symbol variables are typeless: they have to be defined
>> as `extern const void <symbol>` and the only operation you can do (in C code)
>> with them is to take its address. Such extern should reside in a special
>> section '.ksyms'. bpf_helpers.h header provides __ksym macro for this. Strong
>> vs weak semantics stays the same as with Kconfig externs. If symbol is not
>> found in /proc/kallsyms, this will be a failure for strong (non-weak) extern,
>> but will be defaulted to 0 for weak externs.
>>
>> If the same symbol is defined multiple times in /proc/kallsyms, then it will
>> be error if any of the associated addresses differs. In that case, address is
>> ambiguous, so libbpf falls on the side of caution, rather than confusing user
>> with randomly chosen address.
>>
>> In the future, once kernel is extended with variables BTF information, such
>> ksym externs will be supported in a typed version, which will allow BPF
>> program to read variable's contents directly, similarly to how it's done for
>> fentry/fexit input arguments.
>>
>> Signed-off-by: Andrii Nakryiko <andriin@fb.com>
>> ---
>>  tools/lib/bpf/bpf_helpers.h |   1 +
>>  tools/lib/bpf/btf.h         |   5 ++
>>  tools/lib/bpf/libbpf.c      | 138 ++++++++++++++++++++++++++++++++++--
>>  3 files changed, 139 insertions(+), 5 deletions(-)
>>

[...]

>>
>>  enum extern_type {
>>         EXT_UNKNOWN,
>> +       EXT_KSYM,
>>
>>         EXT_KCFG,
>>  };
>
>
> Minor, let EXT_KSYM come after EXT_KCFG.

I wanted ksym externs to go before KCFG ones, but not sure why. I'll
double check, I don't think it should matter.

>
>>
>>

[...]

>> +static int bpf_object__read_kallsyms_file(struct bpf_object *obj)
>> +{
>> +       char sym_type, sym_name[256];
>> +       unsigned long sym_addr;
>> +       struct extern_desc *ext;
>> +       int ret, err = 0;
>> +       FILE *f;
>> +
>> +       f = fopen("/proc/kallsyms", "r");
>> +       if (!f) {
>> +               err = -errno;
>> +               pr_warn("failed to open /proc/kallsyms: %d\n", err);
>> +               return err;
>> +       }
>> +
>> +       while (true) {
>> +               ret = fscanf(f, "%lx %c %s%*[^\n]\n",
>> +                            &sym_addr, &sym_type, sym_name);
>
>
> Maybe better follow the existing pattern in kernel (scripts/kallsyms.c https://github.com/torvalds/linux/blob/master/scripts/kallsyms.c#L177)


oh, didn't know about this "%499s" trick, will change.

>
>>
>> +               if (ret == EOF && feof(f))
>> +                       break;
>> +               if (ret != 3) {
>> +                       err = -EINVAL;
>> +                       goto out;
>> +               }
>> +

[...]
Hao Luo June 16, 2020, 8:05 a.m. UTC | #2
On Mon, Jun 15, 2020 at 12:08 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Mon, Jun 15, 2020 at 9:44 AM Hao Luo <haoluo@google.com> wrote:
> >
> > Thanks, Andrii,
> >
> > This change looks nice! A couple of comments:
> >
> > 1. A 'void' type variable looks slightly odd from a user's perspective. How about using 'u64' or 'void *'? Or at least, a named type, which aliases to 'void'?
>
> That choice is very deliberate one. `extern const void` is the right
> way in C language to access linker-generated symbols, for instance,
> which is quite similar to what the intent is her. Having void type is
> very explicit that you don't know/care about that value pointed to by
> extern address, the only operation you can perform is to get it's
> address.
>
> Once we add kernel variables support, that's when types will start to
> be specified and libbpf will do extra checks (type matching) and extra
> work (generating ldimm64 with BTF ID, for instance), to allow C code
> to access data pointed to by extern address.
>
> Switching type to u64 would be misleading in allowing C code to
> implicitly dereference value of extern. E.g., there is a big
> difference between:
>
> extern u64 bla;
>
> printf("%lld\n", bla); /* de-reference happens here, we get contents
> of memory pointed to by "bla" symbol */
>
> printf("%p\n", &bla); /* here we get value of linker symbol/address of
> extern variable */
>
> Currently I explicitly support only the latter and want to prevent the
> former, until we have kernel variables in BTF. Using `extern void`
> makes compiler enforce that only the &bla form is allowed. Everything
> else is compilation error.
>

Ah, I see. I've been taking the extern variable as an actual variable
that contains the symbol's address, which is the first case. Your
approach makes sense. Thanks for explaining.

> > 2. About the type size of ksym, IIUC, it looks strange that the values read from kallsyms have 8 bytes but their corresponding vs->size is 4 bytes and vs->type points to 4-byte int. Can we make them of the same size?
>
> That's a bit of a hack on my part. Variable needs to point to some
> type, which size will match the size of datasec's varinfo entry. This
> is checked and enforced by kernel. I'm looking for 4-byte int, because
> it's almost guaranteed that it will be present in program's BTF and I
> won't have to explicitly add it (it's because all BPF programs return
> int, so it must be in program's BTF already). While 8-byte long is
> less likely to be there.
>
> In the future, if we have a nicer way to extend BTF (and we will
> soon), we can do this a bit better, but either way that .ksyms DATASEC
> type isn't used for anything (there is no map with that DATASEC as a
> value type), so it doesn't matter.
>

Thanks for explaining, Andrii.

These explanations as comments in the code would be quite helpful, IMHO.
Hao Luo June 17, 2020, 1:24 a.m. UTC | #3
Andrii,

Do you think we need to put the kernel's variables in one single
DATASEC in vmlinux BTF? It looks like all the ksyms in the program
will be under one ".ksyms" section, so we are not able to tell whether
a ksym is from a percpu section or a .rodata section. Without this
information, if the vmlinux has multiple DATASECs, the loader may need
to traverse all of them. If vmlinux BTF has only one DATASEC, it
matches the object's BTF better.

Right now, the percpu vars are in a ".data..percpu" DATASEC in my
patch and the plan seems that we will introduce more DATASECs to hold
other data.

Please let me know your insights here. Thanks.

Hao

On Tue, Jun 16, 2020 at 1:05 AM Hao Luo <haoluo@google.com> wrote:
>
> On Mon, Jun 15, 2020 at 12:08 PM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> >
> > On Mon, Jun 15, 2020 at 9:44 AM Hao Luo <haoluo@google.com> wrote:
> > >
> > > Thanks, Andrii,
> > >
> > > This change looks nice! A couple of comments:
> > >
> > > 1. A 'void' type variable looks slightly odd from a user's perspective. How about using 'u64' or 'void *'? Or at least, a named type, which aliases to 'void'?
> >
> > That choice is very deliberate one. `extern const void` is the right
> > way in C language to access linker-generated symbols, for instance,
> > which is quite similar to what the intent is her. Having void type is
> > very explicit that you don't know/care about that value pointed to by
> > extern address, the only operation you can perform is to get it's
> > address.
> >
> > Once we add kernel variables support, that's when types will start to
> > be specified and libbpf will do extra checks (type matching) and extra
> > work (generating ldimm64 with BTF ID, for instance), to allow C code
> > to access data pointed to by extern address.
> >
> > Switching type to u64 would be misleading in allowing C code to
> > implicitly dereference value of extern. E.g., there is a big
> > difference between:
> >
> > extern u64 bla;
> >
> > printf("%lld\n", bla); /* de-reference happens here, we get contents
> > of memory pointed to by "bla" symbol */
> >
> > printf("%p\n", &bla); /* here we get value of linker symbol/address of
> > extern variable */
> >
> > Currently I explicitly support only the latter and want to prevent the
> > former, until we have kernel variables in BTF. Using `extern void`
> > makes compiler enforce that only the &bla form is allowed. Everything
> > else is compilation error.
> >
>
> Ah, I see. I've been taking the extern variable as an actual variable
> that contains the symbol's address, which is the first case. Your
> approach makes sense. Thanks for explaining.
>
> > > 2. About the type size of ksym, IIUC, it looks strange that the values read from kallsyms have 8 bytes but their corresponding vs->size is 4 bytes and vs->type points to 4-byte int. Can we make them of the same size?
> >
> > That's a bit of a hack on my part. Variable needs to point to some
> > type, which size will match the size of datasec's varinfo entry. This
> > is checked and enforced by kernel. I'm looking for 4-byte int, because
> > it's almost guaranteed that it will be present in program's BTF and I
> > won't have to explicitly add it (it's because all BPF programs return
> > int, so it must be in program's BTF already). While 8-byte long is
> > less likely to be there.
> >
> > In the future, if we have a nicer way to extend BTF (and we will
> > soon), we can do this a bit better, but either way that .ksyms DATASEC
> > type isn't used for anything (there is no map with that DATASEC as a
> > value type), so it doesn't matter.
> >
>
> Thanks for explaining, Andrii.
>
> These explanations as comments in the code would be quite helpful, IMHO.
Andrii Nakryiko June 17, 2020, 1:36 a.m. UTC | #4
On Tue, Jun 16, 2020 at 6:24 PM Hao Luo <haoluo@google.com> wrote:
>
> Andrii,
>
> Do you think we need to put the kernel's variables in one single
> DATASEC in vmlinux BTF? It looks like all the ksyms in the program
> will be under one ".ksyms" section, so we are not able to tell whether
> a ksym is from a percpu section or a .rodata section. Without this
> information, if the vmlinux has multiple DATASECs, the loader may need
> to traverse all of them. If vmlinux BTF has only one DATASEC, it
> matches the object's BTF better.
>
> Right now, the percpu vars are in a ".data..percpu" DATASEC in my
> patch and the plan seems that we will introduce more DATASECs to hold
> other data.
>
> Please let me know your insights here. Thanks.

I think we should keep original DATASECs in vmlinux's BTF, so that
they match ELF sections. Otherwise BTF is going to lie and will cause
confusion down the road in the longer term.

On the BPF program side, though, I think we'll limit it to just two
special sections: .ksyms and .ksyms.percpu. libbpf will have to
traverse all vmlinux DATASECs to find corresponding variables, but
that's ok, it has to do the linear scan either way.

>
> Hao
>
> On Tue, Jun 16, 2020 at 1:05 AM Hao Luo <haoluo@google.com> wrote:
> >
> > On Mon, Jun 15, 2020 at 12:08 PM Andrii Nakryiko
> > <andrii.nakryiko@gmail.com> wrote:
> > >
> > > On Mon, Jun 15, 2020 at 9:44 AM Hao Luo <haoluo@google.com> wrote:
> > > >
> > > > Thanks, Andrii,
> > > >
> > > > This change looks nice! A couple of comments:
> > > >
> > > > 1. A 'void' type variable looks slightly odd from a user's perspective. How about using 'u64' or 'void *'? Or at least, a named type, which aliases to 'void'?
> > >
> > > That choice is very deliberate one. `extern const void` is the right
> > > way in C language to access linker-generated symbols, for instance,
> > > which is quite similar to what the intent is her. Having void type is
> > > very explicit that you don't know/care about that value pointed to by
> > > extern address, the only operation you can perform is to get it's
> > > address.
> > >
> > > Once we add kernel variables support, that's when types will start to
> > > be specified and libbpf will do extra checks (type matching) and extra
> > > work (generating ldimm64 with BTF ID, for instance), to allow C code
> > > to access data pointed to by extern address.
> > >
> > > Switching type to u64 would be misleading in allowing C code to
> > > implicitly dereference value of extern. E.g., there is a big
> > > difference between:
> > >
> > > extern u64 bla;
> > >
> > > printf("%lld\n", bla); /* de-reference happens here, we get contents
> > > of memory pointed to by "bla" symbol */
> > >
> > > printf("%p\n", &bla); /* here we get value of linker symbol/address of
> > > extern variable */
> > >
> > > Currently I explicitly support only the latter and want to prevent the
> > > former, until we have kernel variables in BTF. Using `extern void`
> > > makes compiler enforce that only the &bla form is allowed. Everything
> > > else is compilation error.
> > >
> >
> > Ah, I see. I've been taking the extern variable as an actual variable
> > that contains the symbol's address, which is the first case. Your
> > approach makes sense. Thanks for explaining.
> >
> > > > 2. About the type size of ksym, IIUC, it looks strange that the values read from kallsyms have 8 bytes but their corresponding vs->size is 4 bytes and vs->type points to 4-byte int. Can we make them of the same size?
> > >
> > > That's a bit of a hack on my part. Variable needs to point to some
> > > type, which size will match the size of datasec's varinfo entry. This
> > > is checked and enforced by kernel. I'm looking for 4-byte int, because
> > > it's almost guaranteed that it will be present in program's BTF and I
> > > won't have to explicitly add it (it's because all BPF programs return
> > > int, so it must be in program's BTF already). While 8-byte long is
> > > less likely to be there.
> > >
> > > In the future, if we have a nicer way to extend BTF (and we will
> > > soon), we can do this a bit better, but either way that .ksyms DATASEC
> > > type isn't used for anything (there is no map with that DATASEC as a
> > > value type), so it doesn't matter.
> > >
> >
> > Thanks for explaining, Andrii.
> >
> > These explanations as comments in the code would be quite helpful, IMHO.
Hao Luo June 18, 2020, 7:53 a.m. UTC | #5
Sounds good. Thanks.

On Tue, Jun 16, 2020 at 6:37 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Tue, Jun 16, 2020 at 6:24 PM Hao Luo <haoluo@google.com> wrote:
> >
> > Andrii,
> >
> > Do you think we need to put the kernel's variables in one single
> > DATASEC in vmlinux BTF? It looks like all the ksyms in the program
> > will be under one ".ksyms" section, so we are not able to tell whether
> > a ksym is from a percpu section or a .rodata section. Without this
> > information, if the vmlinux has multiple DATASECs, the loader may need
> > to traverse all of them. If vmlinux BTF has only one DATASEC, it
> > matches the object's BTF better.
> >
> > Right now, the percpu vars are in a ".data..percpu" DATASEC in my
> > patch and the plan seems that we will introduce more DATASECs to hold
> > other data.
> >
> > Please let me know your insights here. Thanks.
>
> I think we should keep original DATASECs in vmlinux's BTF, so that
> they match ELF sections. Otherwise BTF is going to lie and will cause
> confusion down the road in the longer term.
>
> On the BPF program side, though, I think we'll limit it to just two
> special sections: .ksyms and .ksyms.percpu. libbpf will have to
> traverse all vmlinux DATASECs to find corresponding variables, but
> that's ok, it has to do the linear scan either way.
>
> >
> > Hao
> >
> > On Tue, Jun 16, 2020 at 1:05 AM Hao Luo <haoluo@google.com> wrote:
> > >
> > > On Mon, Jun 15, 2020 at 12:08 PM Andrii Nakryiko
> > > <andrii.nakryiko@gmail.com> wrote:
> > > >
> > > > On Mon, Jun 15, 2020 at 9:44 AM Hao Luo <haoluo@google.com> wrote:
> > > > >
> > > > > Thanks, Andrii,
> > > > >
> > > > > This change looks nice! A couple of comments:
> > > > >
> > > > > 1. A 'void' type variable looks slightly odd from a user's perspective. How about using 'u64' or 'void *'? Or at least, a named type, which aliases to 'void'?
> > > >
> > > > That choice is very deliberate one. `extern const void` is the right
> > > > way in C language to access linker-generated symbols, for instance,
> > > > which is quite similar to what the intent is her. Having void type is
> > > > very explicit that you don't know/care about that value pointed to by
> > > > extern address, the only operation you can perform is to get it's
> > > > address.
> > > >
> > > > Once we add kernel variables support, that's when types will start to
> > > > be specified and libbpf will do extra checks (type matching) and extra
> > > > work (generating ldimm64 with BTF ID, for instance), to allow C code
> > > > to access data pointed to by extern address.
> > > >
> > > > Switching type to u64 would be misleading in allowing C code to
> > > > implicitly dereference value of extern. E.g., there is a big
> > > > difference between:
> > > >
> > > > extern u64 bla;
> > > >
> > > > printf("%lld\n", bla); /* de-reference happens here, we get contents
> > > > of memory pointed to by "bla" symbol */
> > > >
> > > > printf("%p\n", &bla); /* here we get value of linker symbol/address of
> > > > extern variable */
> > > >
> > > > Currently I explicitly support only the latter and want to prevent the
> > > > former, until we have kernel variables in BTF. Using `extern void`
> > > > makes compiler enforce that only the &bla form is allowed. Everything
> > > > else is compilation error.
> > > >
> > >
> > > Ah, I see. I've been taking the extern variable as an actual variable
> > > that contains the symbol's address, which is the first case. Your
> > > approach makes sense. Thanks for explaining.
> > >
> > > > > 2. About the type size of ksym, IIUC, it looks strange that the values read from kallsyms have 8 bytes but their corresponding vs->size is 4 bytes and vs->type points to 4-byte int. Can we make them of the same size?
> > > >
> > > > That's a bit of a hack on my part. Variable needs to point to some
> > > > type, which size will match the size of datasec's varinfo entry. This
> > > > is checked and enforced by kernel. I'm looking for 4-byte int, because
> > > > it's almost guaranteed that it will be present in program's BTF and I
> > > > won't have to explicitly add it (it's because all BPF programs return
> > > > int, so it must be in program's BTF already). While 8-byte long is
> > > > less likely to be there.
> > > >
> > > > In the future, if we have a nicer way to extend BTF (and we will
> > > > soon), we can do this a bit better, but either way that .ksyms DATASEC
> > > > type isn't used for anything (there is no map with that DATASEC as a
> > > > value type), so it doesn't matter.
> > > >
> > >
> > > Thanks for explaining, Andrii.
> > >
> > > These explanations as comments in the code would be quite helpful, IMHO.
diff mbox series

Patch

diff --git a/tools/lib/bpf/bpf_helpers.h b/tools/lib/bpf/bpf_helpers.h
index f67dce2af802..a510d8ed716f 100644
--- a/tools/lib/bpf/bpf_helpers.h
+++ b/tools/lib/bpf/bpf_helpers.h
@@ -75,5 +75,6 @@  enum libbpf_tristate {
 };
 
 #define __kconfig __attribute__((section(".kconfig")))
+#define __ksym __attribute__((section(".ksyms")))
 
 #endif
diff --git a/tools/lib/bpf/btf.h b/tools/lib/bpf/btf.h
index 70c1b7ec2bd0..06cd1731c154 100644
--- a/tools/lib/bpf/btf.h
+++ b/tools/lib/bpf/btf.h
@@ -168,6 +168,11 @@  static inline bool btf_kflag(const struct btf_type *t)
 	return BTF_INFO_KFLAG(t->info);
 }
 
+static inline bool btf_is_void(const struct btf_type *t)
+{
+	return btf_kind(t) == BTF_KIND_UNKN;
+}
+
 static inline bool btf_is_int(const struct btf_type *t)
 {
 	return btf_kind(t) == BTF_KIND_INT;
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index b1a36c965fc0..451fb0b5615b 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -285,6 +285,7 @@  struct bpf_struct_ops {
 #define BSS_SEC ".bss"
 #define RODATA_SEC ".rodata"
 #define KCONFIG_SEC ".kconfig"
+#define KSYMS_SEC ".ksyms"
 #define STRUCT_OPS_SEC ".struct_ops"
 
 enum libbpf_map_type {
@@ -329,6 +330,7 @@  struct bpf_map {
 
 enum extern_type {
 	EXT_UNKNOWN,
+	EXT_KSYM,
 	EXT_KCFG,
 };
 
@@ -357,6 +359,9 @@  struct extern_desc {
 			int data_off;
 			bool is_signed;
 		} kcfg;
+		struct {
+			unsigned long addr;
+		} ksym;
 	};
 };
 
@@ -2812,9 +2817,25 @@  static int cmp_externs(const void *_a, const void *_b)
 	return strcmp(a->name, b->name);
 }
 
+static int find_int_btf_id(const struct btf *btf)
+{
+	const struct btf_type *t;
+	int i, n;
+
+	n = btf__get_nr_types(btf);
+	for (i = 1; i <= n; i++) {
+		t = btf__type_by_id(btf, i);
+
+		if (btf_is_int(t) && btf_int_bits(t) == 32)
+			return i;
+	}
+
+	return 0;
+}
+
 static int bpf_object__collect_externs(struct bpf_object *obj)
 {
-	struct btf_type *sec, *kcfg_sec = NULL;
+	struct btf_type *sec, *kcfg_sec = NULL, *ksym_sec = NULL;
 	const struct btf_type *t;
 	struct extern_desc *ext;
 	int i, n, off;
@@ -2895,6 +2916,17 @@  static int bpf_object__collect_externs(struct bpf_object *obj)
 				pr_warn("kconfig extern '%s' type is unsupported\n", ext_name);
 				return -ENOTSUP;
 			}
+		} else if (strcmp(sec_name, KSYMS_SEC) == 0) {
+			const struct btf_type *vt;
+
+			ksym_sec = sec;
+			ext->type = EXT_KSYM;
+
+			vt = skip_mods_and_typedefs(obj->btf, t->type, NULL);
+			if (!btf_is_void(vt)) {
+				pr_warn("ksym extern '%s' is not typeless (void)\n", ext_name);
+				return -ENOTSUP;
+			}
 		} else {
 			pr_warn("unrecognized extern section '%s'\n", sec_name);
 			return -ENOTSUP;
@@ -2908,6 +2940,43 @@  static int bpf_object__collect_externs(struct bpf_object *obj)
 	/* sort externs by type, for kcfg ones also by (align, size, name) */
 	qsort(obj->externs, obj->nr_extern, sizeof(*ext), cmp_externs);
 
+	/* for .ksyms section, we need to turn all externs into allocated
+	 * variables in BTF to pass kernel verification; we do this by
+	 * pretending that each extern is a 8-byte variable
+	 */
+	if (ksym_sec) {
+		/* find existing 4-byte integer type in BTF to use for fake
+		 * extern variables in DATASEC
+		 */
+		int int_btf_id = find_int_btf_id(obj->btf);
+
+		for (i = 0; i < n; i++) {
+			pr_debug("extern #%d (ksym): symbol %d, name %s\n",
+				 i, ext->sym_idx, ext->name);
+		}
+
+		sec = ksym_sec;
+		n = btf_vlen(sec);
+		for (i = 0, off = 0; i < n; i++, off += sizeof(int)) {
+			struct btf_var_secinfo *vs = btf_var_secinfos(sec) + i;
+			struct btf_type *vt;
+
+			vt = (void *)btf__type_by_id(obj->btf, vs->type);
+			ext_name = btf__name_by_offset(obj->btf, vt->name_off);
+			ext = find_extern_by_name(obj, ext_name);
+			if (!ext) {
+				pr_warn("failed to find extern definition for BTF var '%s'\n",
+					ext_name);
+				return -ESRCH;
+			}
+			btf_var(vt)->linkage = BTF_VAR_GLOBAL_ALLOCATED;
+			vt->type = int_btf_id;
+			vs->offset = off;
+			vs->size = sizeof(int);
+		}
+		sec->size = off;
+	}
+
 	if (kcfg_sec) {
 		sec = kcfg_sec;
 		/* for kcfg externs calculate their offsets within a .kconfig map */
@@ -5009,9 +5078,14 @@  bpf_program__relocate(struct bpf_program *prog, struct bpf_object *obj)
 			break;
 		case RELO_EXTERN:
 			ext = &obj->externs[relo->sym_off];
-			insn[0].src_reg = BPF_PSEUDO_MAP_VALUE;
-			insn[0].imm = obj->maps[obj->kconfig_map_idx].fd;
-			insn[1].imm = ext->kcfg.data_off;
+			if (ext->type == EXT_KCFG) {
+				insn[0].src_reg = BPF_PSEUDO_MAP_VALUE;
+				insn[0].imm = obj->maps[obj->kconfig_map_idx].fd;
+				insn[1].imm = ext->kcfg.data_off;
+			} else /* EXT_KSYM */ {
+				insn[0].imm = (__u32)ext->ksym.addr;
+				insn[1].imm = ext->ksym.addr >> 32;
+			}
 			break;
 		case RELO_CALL:
 			err = bpf_program__reloc_text(prog, obj, relo);
@@ -5630,10 +5704,57 @@  static int bpf_object__sanitize_maps(struct bpf_object *obj)
 	return 0;
 }
 
+static int bpf_object__read_kallsyms_file(struct bpf_object *obj)
+{
+	char sym_type, sym_name[256];
+	unsigned long sym_addr;
+	struct extern_desc *ext;
+	int ret, err = 0;
+	FILE *f;
+
+	f = fopen("/proc/kallsyms", "r");
+	if (!f) {
+		err = -errno;
+		pr_warn("failed to open /proc/kallsyms: %d\n", err);
+		return err;
+	}
+
+	while (true) {
+		ret = fscanf(f, "%lx %c %s%*[^\n]\n",
+			     &sym_addr, &sym_type, sym_name);
+		if (ret == EOF && feof(f))
+			break;
+		if (ret != 3) {
+			err = -EINVAL;
+			goto out;
+		}
+
+		ext = find_extern_by_name(obj, sym_name);
+		if (!ext || ext->type != EXT_KSYM)
+			continue;
+
+		if (ext->is_set && ext->ksym.addr != sym_addr) {
+			pr_warn("ksym extern '%s' resolution is ambiguous: 0x%lx or 0x%lx\n",
+				sym_name, ext->ksym.addr, sym_addr);
+			err = -EINVAL;
+			goto out;
+		}
+		if (!ext->is_set) {
+			ext->is_set = true;
+			ext->ksym.addr = sym_addr;
+			pr_debug("ksym extern %s=0x%lx\n", sym_name, sym_addr);
+		}
+	}
+
+out:
+	fclose(f);
+	return err;
+}
+
 static int bpf_object__resolve_externs(struct bpf_object *obj,
 				       const char *extra_kconfig)
 {
-	bool need_config = false;
+	bool need_config = false, need_kallsyms = false;
 	struct extern_desc *ext;
 	void *kcfg_data;
 	int err, i;
@@ -5663,6 +5784,8 @@  static int bpf_object__resolve_externs(struct bpf_object *obj,
 		} else if (ext->type == EXT_KCFG &&
 			   strncmp(ext->name, "CONFIG_", 7) == 0) {
 			need_config = true;
+		} else if (ext->type == EXT_KSYM) {
+			need_kallsyms = true;
 		} else {
 			pr_warn("unrecognized extern '%s'\n", ext->name);
 			return -EINVAL;
@@ -5686,6 +5809,11 @@  static int bpf_object__resolve_externs(struct bpf_object *obj,
 		if (err)
 			return -EINVAL;
 	}
+	if (need_kallsyms) {
+		err = bpf_object__read_kallsyms_file(obj);
+		if (err)
+			return -EINVAL;
+	}
 	for (i = 0; i < obj->nr_extern; i++) {
 		ext = &obj->externs[i];