X-Patchwork-Submitter: Jiong Wang
X-Patchwork-Id: 1028640
From: Jiong Wang
To: ast@kernel.org, daniel@iogearbox.net
Cc: netdev@vger.kernel.org, oss-drivers@netronome.com, Jiong Wang,
    "David S. Miller", Paul Burton, Wang YanQing, Zi Shen Lim,
    Shubham Bansal, "Naveen N. Rao", Sandipan Das, Martin Schwidefsky,
    Heiko Carstens
Subject: [PATCH bpf-next v2 00/16] bpf: propose new jmp32 instructions
Date: Mon, 21 Jan 2019 08:15:37 -0500
Message-Id: <1548076553-31268-1-git-send-email-jiong.wang@netronome.com>
X-Mailer: git-send-email 2.7.4
X-Mailing-List: netdev@vger.kernel.org

v1 -> v2:
  - Updated encoding. Use reserved insn class 0x6 instead of packing with
    existing BPF_JMP. (Alexei)
  - Updated code comments in s390 port. (Martin)
  - Separate JIT function for jeq32_imm in NFP port. (Jakub)
  - Re-implemented auto-testing support. (Jakub)
  - Moved testcases to test_verifier.c, plus more unit tests.
    (Jakub)
  - Fixed JEQ/JNE range deduction. (Jakub)
  - Also supported JSET in this patch set.
  - Fixed/Improved range deduction for all the other operations. All C
    programs under bpf selftest pass verification now.
  - Improved min/max code implementation.
  - Fixed bpftool/disassembler.

The current eBPF ISA has 32-bit sub-registers and defines a set of ALU32
instructions. However, there are no JMP32 instructions, and as a
consequence code-gen for 32-bit sub-registers is not efficient. For
example, explicit sign-extension from 32-bit to 64-bit is needed for
signed comparison.

Adding JMP32 instructions therefore completes the eBPF ISA's support for
32-bit sub-registers. It also matches the 32-bit compare-and-jump
instructions available in most JIT backends, for example x86-64 and
AArch64, so the new eBPF JMP32 instructions can map one-to-one onto them.

A few verifier bugs related to ALU32 have been fixed recently, and the
JMP32 introduced by this set further improves the BPF sub-register
ecosystem. Once this has landed, BPF programs using the 32-bit
sub-register ISA should get reasonably good support from the verifier and
the JIT compilers. Users can then compare the runtime efficiency of a BPF
program under both modes and use whichever one benchmarks better.

From benchmark results on some Cilium BPF programs, on 64-bit arches and
with JMP32 introduced, programs compiled with -mattr=+alu32 (which enables
sub-register usage) are smaller in code size and generally lower in the
number of insns processed by the verifier.

Benchmark results on x86-64
===

Text size in bytes (generated by "size")
---
LLVM code-gen option   default  alu32    alu32/jmp32  change Vs.  change Vs.
                                                      alu32       default
bpf_lb-DLB_L3.o:       6456     6280     6160         -1.91%      -4.58%
bpf_lb-DLB_L4.o:       7848     7664     7136         -6.89%      -9.07%
bpf_lb-DUNKNOWN.o:     2680     2664     2568         -3.60%      -4.18%
bpf_lxc.o:             104824   104744   97360        -7.05%      -7.12%
bpf_netdev.o:          23456    23576    21632        -8.25%      -7.78%
bpf_overlay.o:         16184    16304    14648        -10.16%     -9.49%

Processed instruction number
---
LLVM code-gen option   default  alu32    alu32/jmp32  change Vs.  change Vs.
                                                      alu32       default
bpf_lb-DLB_L3.o:       1579     1281     1295         +1.09%      -17.99%
bpf_lb-DLB_L4.o:       2045     1663     1556         -6.43%      -23.91%
bpf_lb-DUNKNOWN.o:     606      513      501          -2.34%      -17.33%
bpf_lxc.o:             85381    103218   94435        -8.51%      +10.60%
bpf_netdev.o:          5246     5809     5200         -10.48%     -0.08%
bpf_overlay.o:         2443     2705     2456         -9.02%      -0.53%

The results are even better for 32-bit arches like x32, arm32 and nfp,
because some conditional jumps now become JMP32, which does not require
code-gen for comparing the high 32 bits.

Encoding
===

The new JMP32 instructions use the new BPF_JMP32 class, which takes the
reserved eBPF class number 0x6. BPF_JA/CALL/EXIT exist only for BPF_JMP;
they are reserved opcodes for BPF_JMP32.

LLVM support
===

A couple of unit tests have been added and are included in this set. LLVM
code-gen for JMP32 has also been added, so you can compile any BPF C
program with both -mcpu=probe and -mattr=+alu32 specified. If you are
compiling on a machine whose kernel is patched with this set, LLVM will
select the ISA automatically based on the host probe results. Otherwise,
specifying -mcpu=v3 and -mattr=+alu32 also forces use of the JMP32 ISA.

LLVM support can be found at:

  https://github.com/Netronome/llvm/tree/jmp32-v2

(The clang driver has also been taught about the new "v3" processor. I
will send out merge requests for both clang and llvm once the kernel set
has landed.)

JIT backends support
===

A number of JIT backends are supported in this set; SPARC and MIPS are
not. This shouldn't be a big issue for these two ports, because LLVM does
not generate JMP32 insns by default. It only generates them when the host
machine is probed and found to support them.

Thanks.
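
As a rough illustration of the encoding and of the sign-extension point
made earlier, here is a minimal user-space sketch in C. The opcode field
values mirror the kernel UAPI headers, with BPF_JMP32 being the class this
set proposes. The helpers jmp32_op_is_reserved(), jmp32_opcode(), jsgt64()
and jsgt32() are hypothetical names made up for this example and are not
functions from the patches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Opcode field values as in the kernel UAPI headers. */
#define BPF_JMP    0x05  /* existing jump class, compares 64-bit registers */
#define BPF_JMP32  0x06  /* class proposed by this set (reserved so far) */
#define BPF_K      0x00  /* source operand is the immediate */
#define BPF_X      0x08  /* source operand is a register */
#define BPF_JA     0x00
#define BPF_JEQ    0x10
#define BPF_JSGT   0x60
#define BPF_CALL   0x80
#define BPF_EXIT   0x90

#define BPF_CLASS(code) ((code) & 0x07)
#define BPF_OP(code)    ((code) & 0xf0)

/* Hypothetical helper: JA/CALL/EXIT exist only in the BPF_JMP class. */
static bool jmp32_op_is_reserved(uint8_t op)
{
	return op == BPF_JA || op == BPF_CALL || op == BPF_EXIT;
}

/* Hypothetical helper: build the 8-bit opcode of a JMP32 instruction. */
static uint8_t jmp32_opcode(uint8_t op, uint8_t src)
{
	assert(!jmp32_op_is_reserved(op));
	return (uint8_t)(BPF_JMP32 | op | src);
}

/*
 * Semantics sketch of JSGT in each class. The casts rely on the usual
 * two's complement conversion that common compilers implement.
 */
static bool jsgt64(uint64_t dst, uint64_t src)
{
	return (int64_t)dst > (int64_t)src;	/* BPF_JMP: all 64 bits */
}

static bool jsgt32(uint64_t dst, uint64_t src)
{
	/* BPF_JMP32: only the low 32 bits take part in the comparison. */
	return (int32_t)(uint32_t)dst > (int32_t)(uint32_t)src;
}
```

With a 32-bit -1 held zero-extended in a 64-bit register
(0x00000000ffffffff), jsgt64() against 0 sees a large positive value,
whereas jsgt32() sees -1. That is why, without JMP32, the compiler has to
emit an explicit sign-extension before a signed comparison.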

Cc: David S. Miller
Cc: Paul Burton
Cc: Wang YanQing
Cc: Zi Shen Lim
Cc: Shubham Bansal
Cc: Naveen N. Rao
Cc: Sandipan Das
Cc: Martin Schwidefsky
Cc: Heiko Carstens

Jiong Wang (16):
  bpf: allocate 0x06 to new eBPF instruction class JMP32
  bpf: refactor verifier min/max code for condition jump
  bpf: verifier support JMP32
  bpf: disassembler support JMP32
  tools: bpftool: teach cfg code about JMP32
  bpf: interpreter support for JMP32
  bpf: JIT blinds support JMP32
  bpf: functional and min/max reasoning unit tests for JMP32
  x86_64: bpf: implement jitting of JMP32
  x32: bpf: implement jitting of JMP32
  arm64: bpf: implement jitting of JMP32
  arm: bpf: implement jitting of JMP32
  ppc: bpf: implement jitting of JMP32
  s390: bpf: implement jitting of JMP32
  nfp: bpf: implement jitting of JMP32
  selftests: bpf: makefile support sub-register code-gen test mode

 Documentation/networking/filter.txt           |  15 +-
 arch/arm/net/bpf_jit_32.c                     |  53 +-
 arch/arm/net/bpf_jit_32.h                     |   2 +
 arch/arm64/net/bpf_jit_comp.c                 |  37 +-
 arch/powerpc/include/asm/ppc-opcode.h         |   1 +
 arch/powerpc/net/bpf_jit.h                    |   4 +
 arch/powerpc/net/bpf_jit_comp64.c             |  98 +++-
 arch/s390/net/bpf_jit_comp.c                  |  69 ++-
 arch/x86/net/bpf_jit_comp.c                   |  46 +-
 arch/x86/net/bpf_jit_comp32.c                 | 121 ++--
 drivers/net/ethernet/netronome/nfp/bpf/jit.c  |  97 +++-
 drivers/net/ethernet/netronome/nfp/bpf/main.h |  15 +
 include/linux/filter.h                        |  20 +
 include/uapi/linux/bpf.h                      |   1 +
 kernel/bpf/core.c                             | 221 +++-----
 kernel/bpf/disasm.c                           |  34 +-
 kernel/bpf/verifier.c                         | 362 ++++++++----
 samples/bpf/bpf_insn.h                        |  20 +
 tools/bpf/bpftool/cfg.c                       |   9 +-
 tools/include/linux/filter.h                  |  20 +
 tools/include/uapi/linux/bpf.h                |   1 +
 tools/testing/selftests/bpf/Makefile          |  93 ++-
 tools/testing/selftests/bpf/test_verifier.c   | 786 +++++++++++++++++++++++++-
 23 files changed, 1708 insertions(+), 417 deletions(-)