From patchwork Fri Aug 14 19:15:54 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: YiFei Zhu
X-Patchwork-Id: 1345177
X-Patchwork-Delegate: bpf@iogearbox.net
From: YiFei Zhu
To: bpf@vger.kernel.org
Cc: Alexei Starovoitov, Daniel Borkmann, Stanislav Fomichev, Mahesh Bandewar, YiFei Zhu
Subject: [RFC PATCH bpf-next 1/5] bpf: RCU protect used_maps array and count
Date: Fri, 14 Aug 2020 14:15:54 -0500
Message-Id: <7e37411ca33ae89e2a98dd95707a35caf2fd542e.1597427271.git.zhuyifei@google.com>
X-Mailer: git-send-email 2.28.0

From: YiFei Zhu

To support modifying the used_maps array, we use RCU to protect the
use of the counter and the array. A mutex is used as the write-side
lock, and it is initialized in the verifier where the rcu struct is
allocated.

Most uses are non-sleeping and very straightforward, just holding
rcu_read_lock. bpf_check_tail_call can be called for a cBPF program
(by bpf_prog_select_runtime, bpf_migrate_filter) so an extra check is
added for the case when the program did not pass through the eBPF
verifier and the used_maps pointer is NULL.

The Netronome driver does allocate memory while reading the used_maps
array so for simplicity it is made to hold the write-side lock.
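
The read-side pattern described above can be sketched in plain
userspace C. This is only an illustration under stated assumptions,
not code from the patch: the kernel's rcu_read_lock() and
rcu_read_unlock() are replaced by hypothetical stubs that merely count
nesting depth, the map is reduced to two made-up fields, the pointer
is passed in directly instead of being fetched with rcu_dereference(),
and every name ending in _sketch or _stub is invented. It shows the
NULL check for a program that never went through the eBPF verifier,
and a single exit label so that the read-side section is closed on
every return path.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel primitives: they only track nesting. */
static int rcu_depth;
static void rcu_read_lock_stub(void)   { rcu_depth++; }
static void rcu_read_unlock_stub(void) { rcu_depth--; }

enum { MAP_TYPE_ARRAY_SKETCH = 1, MAP_TYPE_PROG_ARRAY_SKETCH = 2 };

/* Simplified map: a type plus a precomputed "compatible" flag. */
struct map_sketch {
	int map_type;
	int compatible;
};

/* Same shape as the patch's count-plus-array struct, minus the rcu_head. */
struct used_maps_sketch {
	unsigned int cnt;
	struct map_sketch **arr;
};

/* Returns 0 if every prog-array map is compatible, -EINVAL otherwise. */
static int check_tail_call_sketch(const struct used_maps_sketch *used_maps)
{
	unsigned int i;
	int ret = 0;

	rcu_read_lock_stub();
	if (!used_maps)		/* program did not pass through the verifier */
		goto out;

	for (i = 0; i < used_maps->cnt; i++) {
		const struct map_sketch *map = used_maps->arr[i];

		if (map->map_type != MAP_TYPE_PROG_ARRAY_SKETCH)
			continue;
		if (!map->compatible) {
			ret = -EINVAL;
			goto out;	/* leave through the unlock path */
		}
	}
out:
	rcu_read_unlock_stub();
	return ret;
}
```

Routing the error through the same label as the success path keeps
the lock and unlock calls paired without repeating the unlock at each
return.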
Signed-off-by: YiFei Zhu
---
 .../net/ethernet/netronome/nfp/bpf/offload.c | 33 +++++++++++------
 include/linux/bpf.h                           | 11 +++++-
 kernel/bpf/core.c                             | 25 ++++++++++---
 kernel/bpf/syscall.c                          | 19 ++++++++--
 kernel/bpf/verifier.c                         | 37 +++++++++++++------
 net/core/dev.c                                | 12 ++++--
 6 files changed, 100 insertions(+), 37 deletions(-)

diff --git a/drivers/net/ethernet/netronome/nfp/bpf/offload.c b/drivers/net/ethernet/netronome/nfp/bpf/offload.c
index ac02369174a9..74ed42b678e2 100644
--- a/drivers/net/ethernet/netronome/nfp/bpf/offload.c
+++ b/drivers/net/ethernet/netronome/nfp/bpf/offload.c
@@ -111,34 +111,45 @@ static int
 nfp_map_ptrs_record(struct nfp_app_bpf *bpf, struct nfp_prog *nfp_prog,
 		    struct bpf_prog *prog)
 {
-	int i, cnt, err;
+	struct bpf_used_maps *used_maps;
+	int i, cnt, err = 0;
+
+	/* We are calling sleepable functions while reading used_maps array */
+	mutex_lock(&prog->aux->used_maps_mutex);
+
+	used_maps = rcu_dereference_protected(prog->aux->used_maps,
+			lockdep_is_held(&prog->aux->used_maps_mutex));
 
 	/* Quickly count the maps we will have to remember */
 	cnt = 0;
-	for (i = 0; i < prog->aux->used_map_cnt; i++)
-		if (bpf_map_offload_neutral(prog->aux->used_maps[i]))
+	for (i = 0; i < used_maps->cnt; i++)
+		if (bpf_map_offload_neutral(used_maps->arr[i]))
 			cnt++;
 	if (!cnt)
-		return 0;
+		goto out;
 
 	nfp_prog->map_records = kmalloc_array(cnt,
 					      sizeof(nfp_prog->map_records[0]),
 					      GFP_KERNEL);
-	if (!nfp_prog->map_records)
-		return -ENOMEM;
+	if (!nfp_prog->map_records) {
+		err = -ENOMEM;
+		goto out;
+	}
 
-	for (i = 0; i < prog->aux->used_map_cnt; i++)
-		if (bpf_map_offload_neutral(prog->aux->used_maps[i])) {
+	for (i = 0; i < used_maps->cnt; i++)
+		if (bpf_map_offload_neutral(used_maps->arr[i])) {
 			err = nfp_map_ptr_record(bpf, nfp_prog,
-						 prog->aux->used_maps[i]);
+						 used_maps->arr[i]);
 			if (err) {
 				nfp_map_ptrs_forget(bpf, nfp_prog);
-				return err;
+				goto out;
 			}
 		}
 	WARN_ON(cnt != nfp_prog->map_records_cnt);
 
-	return 0;
+out:
+	mutex_unlock(&prog->aux->used_maps_mutex);
+	return err;
 }
 
 static int
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index cef4ef0d2b4e..417189b4061d 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -689,9 +689,15 @@ struct bpf_ctx_arg_aux {
 	u32 btf_id;
 };
 
+/* rcu struct for prog used_maps */
+struct bpf_used_maps {
+	u32 cnt;
+	struct bpf_map **arr;
+	struct rcu_head rcu;
+};
+
 struct bpf_prog_aux {
 	atomic64_t refcnt;
-	u32 used_map_cnt;
 	u32 max_ctx_offset;
 	u32 max_pkt_offset;
 	u32 max_tp_access;
@@ -722,7 +728,8 @@ struct bpf_prog_aux {
 	u32 size_poke_tab;
 	struct bpf_ksym ksym;
 	const struct bpf_prog_ops *ops;
-	struct bpf_map **used_maps;
+	struct mutex used_maps_mutex; /* write-side mutex for used_maps */
+	struct bpf_used_maps __rcu *used_maps;
 	struct bpf_prog *prog;
 	struct user_struct *user;
 	u64 load_time; /* ns since boottime */
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index bde93344164d..9766aa0337d9 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -1746,11 +1746,16 @@ bool bpf_prog_array_compatible(struct bpf_array *array,
 
 static int bpf_check_tail_call(const struct bpf_prog *fp)
 {
-	struct bpf_prog_aux *aux = fp->aux;
+	const struct bpf_used_maps *used_maps;
 	int i;
 
-	for (i = 0; i < aux->used_map_cnt; i++) {
-		struct bpf_map *map = aux->used_maps[i];
+	rcu_read_lock();
+	used_maps = rcu_dereference(fp->aux->used_maps);
+	if (!used_maps)
+		goto out;
+
+	for (i = 0; i < used_maps->cnt; i++) {
+		struct bpf_map *map = used_maps->arr[i];
 		struct bpf_array *array;
 
 		if (map->map_type != BPF_MAP_TYPE_PROG_ARRAY)
@@ -1761,6 +1766,8 @@ static int bpf_check_tail_call(const struct bpf_prog *fp)
 			return -EINVAL;
 	}
 
+out:
+	rcu_read_unlock();
 	return 0;
 }
 
@@ -2113,8 +2120,16 @@ void __bpf_free_used_maps(struct bpf_prog_aux *aux,
 
 static void bpf_free_used_maps(struct bpf_prog_aux *aux)
 {
-	__bpf_free_used_maps(aux, aux->used_maps, aux->used_map_cnt);
-	kfree(aux->used_maps);
+	struct bpf_used_maps *used_maps = aux->used_maps;
+
+	if (!used_maps)
+		return;
+
+	__bpf_free_used_maps(aux, used_maps->arr, used_maps->cnt);
+	kfree(used_maps->arr);
+	kfree(used_maps);
+
+	mutex_destroy(&aux->used_maps_mutex);
 }
 
 static void bpf_prog_free_deferred(struct work_struct *work)
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 2f343ce15747..3fde9dc4b595 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -3149,11 +3149,15 @@ static const struct bpf_map *bpf_map_from_imm(const struct bpf_prog *prog,
 					      unsigned long addr, u32 *off,
 					      u32 *type)
 {
+	const struct bpf_used_maps *used_maps;
 	const struct bpf_map *map;
 	int i;
 
-	for (i = 0, *off = 0; i < prog->aux->used_map_cnt; i++) {
-		map = prog->aux->used_maps[i];
+	rcu_read_lock();
+	used_maps = rcu_dereference(prog->aux->used_maps);
+
+	for (i = 0, *off = 0; i < used_maps->cnt; i++) {
+		map = used_maps->arr[i];
 		if (map == (void *)addr) {
 			*type = BPF_PSEUDO_MAP_FD;
 			return map;
@@ -3166,6 +3170,7 @@ static const struct bpf_map *bpf_map_from_imm(const struct bpf_prog *prog,
 		}
 	}
 
+	rcu_read_unlock();
 	return NULL;
 }
 
@@ -3263,6 +3268,7 @@ static int bpf_prog_get_info_by_fd(struct file *file,
 	struct bpf_prog_stats stats;
 	char __user *uinsns;
 	u32 ulen;
+	const struct bpf_used_maps *used_maps;
 	int err;
 
 	err = bpf_check_uarg_tail_zero(uinfo, sizeof(info), info_len);
@@ -3284,19 +3290,24 @@ static int bpf_prog_get_info_by_fd(struct file *file,
 	memcpy(info.tag, prog->tag, sizeof(prog->tag));
 	memcpy(info.name, prog->aux->name, sizeof(prog->aux->name));
 
+	rcu_read_lock();
+	used_maps = rcu_dereference(prog->aux->used_maps);
+
 	ulen = info.nr_map_ids;
-	info.nr_map_ids = prog->aux->used_map_cnt;
+	info.nr_map_ids = used_maps->cnt;
 	ulen = min_t(u32, info.nr_map_ids, ulen);
 	if (ulen) {
 		u32 __user *user_map_ids = u64_to_user_ptr(info.map_ids);
 		u32 i;
 
 		for (i = 0; i < ulen; i++)
-			if (put_user(prog->aux->used_maps[i]->id,
+			if (put_user(used_maps->arr[i]->id,
 				     &user_map_ids[i]))
 				return -EFAULT;
 	}
 
+	rcu_read_unlock();
+
 	err = set_info_rec_size(&info);
 	if (err)
 		return err;
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index b6ccfce3bf4c..9a6ca16667c7 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -11232,25 +11232,38 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr,
 		goto err_release_maps;
 	}
 
-	if (ret == 0 && env->used_map_cnt) {
+	if (ret == 0) {
 		/* if program passed verifier, update used_maps in bpf_prog_info */
-		env->prog->aux->used_maps = kmalloc_array(env->used_map_cnt,
-							  sizeof(env->used_maps[0]),
+		struct bpf_used_maps *used_maps = kzalloc(sizeof(*used_maps),
 							  GFP_KERNEL);
-
-		if (!env->prog->aux->used_maps) {
+		if (!used_maps) {
 			ret = -ENOMEM;
 			goto err_release_maps;
 		}
 
-		memcpy(env->prog->aux->used_maps, env->used_maps,
-		       sizeof(env->used_maps[0]) * env->used_map_cnt);
-		env->prog->aux->used_map_cnt = env->used_map_cnt;
+		if (env->used_map_cnt) {
+			used_maps->arr = kmalloc_array(env->used_map_cnt,
+						       sizeof(env->used_maps[0]),
+						       GFP_KERNEL);
 
-		/* program is valid. Convert pseudo bpf_ld_imm64 into generic
-		 * bpf_ld_imm64 instructions
-		 */
-		convert_pseudo_ld_imm64(env);
+			if (!used_maps->arr) {
+				kfree(used_maps);
+				ret = -ENOMEM;
+				goto err_release_maps;
+			}
+
+			memcpy(used_maps->arr, env->used_maps,
+			       sizeof(env->used_maps[0]) * env->used_map_cnt);
+			used_maps->cnt = env->used_map_cnt;
+
+			/* program is valid. Convert pseudo bpf_ld_imm64 into generic
+			 * bpf_ld_imm64 instructions
+			 */
+			convert_pseudo_ld_imm64(env);
+		}
+
+		env->prog->aux->used_maps = used_maps;
+		mutex_init(&env->prog->aux->used_maps_mutex);
 	}
 
 	if (ret == 0)
diff --git a/net/core/dev.c b/net/core/dev.c
index 7df6c9617321..bebbf8abd9a7 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -5439,17 +5439,23 @@ static int generic_xdp_install(struct net_device *dev, struct netdev_bpf *xdp)
 	int ret = 0;
 
 	if (new) {
+		const struct bpf_used_maps *used_maps;
 		u32 i;
 
+		rcu_read_lock();
+		used_maps = rcu_dereference(new->aux->used_maps);
+
 		/* generic XDP does not work with DEVMAPs that can
 		 * have a bpf_prog installed on an entry
 		 */
-		for (i = 0; i < new->aux->used_map_cnt; i++) {
-			if (dev_map_can_have_prog(new->aux->used_maps[i]))
+		for (i = 0; i < used_maps->cnt; i++) {
+			if (dev_map_can_have_prog(used_maps->arr[i]))
 				return -EINVAL;
-			if (cpu_map_prog_allowed(new->aux->used_maps[i]))
+			if (cpu_map_prog_allowed(used_maps->arr[i]))
 				return -EINVAL;
 		}
+
+		rcu_read_unlock();
 	}
 
 	switch (xdp->command) {

From patchwork Fri Aug 14 19:15:55 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: YiFei Zhu
X-Patchwork-Id: 1345178
X-Patchwork-Delegate: bpf@iogearbox.net
From: YiFei Zhu
To: bpf@vger.kernel.org
Cc: Alexei Starovoitov, Daniel Borkmann, Stanislav Fomichev, Mahesh Bandewar, YiFei Zhu
Subject: [RFC PATCH bpf-next 2/5] bpf: Add BPF_PROG_ADD_MAP syscall
Date: Fri, 14 Aug 2020 14:15:55 -0500
Message-Id: <55a2aeb0c93dd281f24c5e20c6ba3796477234eb.1597427271.git.zhuyifei@google.com>
X-Mailer: git-send-email 2.28.0

From: YiFei Zhu

This syscall attaches a map to a program. It returns -EEXIST if the
map is already attached to the program.

call_rcu is used to free the old used_maps struct after an RCU grace
period.
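
The copy-and-publish update described above can be sketched in
userspace C. This is a minimal illustration under stated assumptions,
not the kernel implementation: struct used_maps_sketch and
add_map_sketch() are made-up names, maps are opaque pointers, a plain
pointer store stands in for rcu_assign_pointer(), and the old copy is
freed immediately because the sketch has no concurrent readers (the
patch defers that free with call_rcu()). The set passed in is assumed
to be heap-allocated.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Count plus array, published through a single pointer. */
struct used_maps_sketch {
	unsigned int cnt;
	const void **arr;	/* opaque map pointers */
};

/*
 * Append map to the set behind *slot by building a new copy.
 * Returns 0 on success, -EEXIST if map is already present,
 * -ENOMEM on allocation failure. On error *slot is unchanged.
 */
static int add_map_sketch(struct used_maps_sketch **slot, const void *map)
{
	struct used_maps_sketch *old = *slot, *new;
	unsigned int i;

	for (i = 0; i < old->cnt; i++)
		if (old->arr[i] == map)
			return -EEXIST;

	new = calloc(1, sizeof(*new));
	if (!new)
		return -ENOMEM;

	new->cnt = old->cnt + 1;
	new->arr = malloc(new->cnt * sizeof(new->arr[0]));
	if (!new->arr) {
		free(new);
		return -ENOMEM;
	}

	if (old->cnt)
		memcpy(new->arr, old->arr, old->cnt * sizeof(old->arr[0]));
	new->arr[old->cnt] = map;

	/* Publish the new copy; readers see either old or new, never a mix. */
	*slot = new;

	/* No readers exist in this sketch, so reclaim the old copy now. */
	free(old->arr);
	free(old);
	return 0;
}
```

Because the count and the array live in one struct behind one
pointer, a reader that loads the pointer once always sees a count
that matches the array it indexes.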
Signed-off-by: YiFei Zhu
---
 include/uapi/linux/bpf.h       |  7 +++
 kernel/bpf/syscall.c           | 83 ++++++++++++++++++++++++++++++++++
 tools/include/uapi/linux/bpf.h |  7 +++
 3 files changed, 97 insertions(+)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index b134e679e9db..01b32036a0f5 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -118,6 +118,7 @@ enum bpf_cmd {
 	BPF_ENABLE_STATS,
 	BPF_ITER_CREATE,
 	BPF_LINK_DETACH,
+	BPF_PROG_ADD_MAP,
 };
 
 enum bpf_map_type {
@@ -648,6 +649,12 @@ union bpf_attr {
 		__u32 flags;
 	} iter_create;
 
+	struct { /* struct used by BPF_PROG_ADD_MAP command */
+		__u32 prog_fd;
+		__u32 map_fd;
+		__u32 flags; /* extra flags */
+	} prog_add_map;
+
 } __attribute__((aligned(8)));
 
 /* The description below is an attempt at providing documentation to eBPF
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 3fde9dc4b595..0564a4291184 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -4144,6 +4144,86 @@ static int bpf_iter_create(union bpf_attr *attr)
 	return err;
 }
 
+#define BPF_PROG_ADD_MAP_LAST_FIELD prog_add_map.flags
+
+static void __bpf_free_used_maps_rcu(struct rcu_head *rcu)
+{
+	struct bpf_used_maps *used_maps = container_of(rcu, struct bpf_used_maps, rcu);
+
+	kfree(used_maps->arr);
+	kfree(used_maps);
+}
+
+static int bpf_prog_add_map(union bpf_attr *attr)
+{
+	struct bpf_prog *prog;
+	struct bpf_map *map;
+	struct bpf_used_maps *used_maps_old, *used_maps_new;
+	int i, ret = 0;
+
+	if (CHECK_ATTR(BPF_PROG_ADD_MAP))
+		return -EINVAL;
+
+	if (attr->prog_add_map.flags)
+		return -EINVAL;
+
+	prog = bpf_prog_get(attr->prog_add_map.prog_fd);
+	if (IS_ERR(prog))
+		return PTR_ERR(prog);
+
+	map = bpf_map_get(attr->prog_add_map.map_fd);
+	if (IS_ERR(map)) {
+		ret = PTR_ERR(map);
+		goto out_prog_put;
+	}
+
+	used_maps_new = kzalloc(sizeof(*used_maps_new), GFP_KERNEL);
+	if (!used_maps_new) {
+		ret = -ENOMEM;
+		goto out_map_put;
+	}
+
+	mutex_lock(&prog->aux->used_maps_mutex);
+
+	used_maps_old = rcu_dereference_protected(prog->aux->used_maps,
+			lockdep_is_held(&prog->aux->used_maps_mutex));
+
+	for (i = 0; i < used_maps_old->cnt; i++)
+		if (used_maps_old->arr[i] == map) {
+			ret = -EEXIST;
+			goto out_unlock;
+		}
+
+	used_maps_new->cnt = used_maps_old->cnt + 1;
+	used_maps_new->arr = kmalloc_array(used_maps_new->cnt,
+					   sizeof(used_maps_new->arr[0]),
+					   GFP_KERNEL);
+	if (!used_maps_new->arr) {
+		ret = -ENOMEM;
+		goto out_unlock;
+	}
+
+	memcpy(used_maps_new->arr, used_maps_old->arr,
+	       sizeof(used_maps_old->arr[0]) * used_maps_old->cnt);
+	used_maps_new->arr[used_maps_old->cnt] = map;
+
+	rcu_assign_pointer(prog->aux->used_maps, used_maps_new);
+	call_rcu(&used_maps_old->rcu, __bpf_free_used_maps_rcu);
+
+out_unlock:
+	mutex_unlock(&prog->aux->used_maps_mutex);
+
+	if (ret)
+		kfree(used_maps_new);
+
+out_map_put:
+	if (ret)
+		bpf_map_put(map);
+out_prog_put:
+	bpf_prog_put(prog);
+	return ret;
+}
+
 SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, size)
 {
 	union bpf_attr attr;
@@ -4277,6 +4357,9 @@ SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, siz
 	case BPF_LINK_DETACH:
 		err = link_detach(&attr);
 		break;
+	case BPF_PROG_ADD_MAP:
+		err = bpf_prog_add_map(&attr);
+		break;
 	default:
 		err = -EINVAL;
 		break;
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index b134e679e9db..01b32036a0f5 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -118,6 +118,7 @@ enum bpf_cmd {
 	BPF_ENABLE_STATS,
 	BPF_ITER_CREATE,
 	BPF_LINK_DETACH,
+	BPF_PROG_ADD_MAP,
 };
 
 enum bpf_map_type {
@@ -648,6 +649,12 @@ union bpf_attr {
 		__u32 flags;
 	} iter_create;
 
+	struct { /* struct used by BPF_PROG_ADD_MAP command */
+		__u32 prog_fd;
+		__u32 map_fd;
+		__u32 flags; /* extra flags */
+	} prog_add_map;
+
 } __attribute__((aligned(8)));
 
 /* The description below is an attempt at providing documentation to eBPF

From patchwork Fri Aug 14 19:15:56 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: YiFei Zhu
X-Patchwork-Id: 1345179
X-Patchwork-Delegate: bpf@iogearbox.net
From: YiFei Zhu
To: bpf@vger.kernel.org
Cc: Alexei Starovoitov, Daniel Borkmann, Stanislav Fomichev, Mahesh Bandewar, YiFei Zhu
Subject: [RFC PATCH bpf-next 3/5] libbpf: Add BPF_PROG_ADD_MAP syscall and use it on .metadata section
Date: Fri, 14 Aug 2020 14:15:56 -0500
X-Mailer: git-send-email 2.28.0

From: YiFei Zhu

The patch adds a simple wrapper bpf_prog_add_map around the syscall.
And when using libbpf to load a program, it will probe the kernel for the support of this syscall, and scan for the .metadata ELF section and load it as an internal map like a .data section. In the case that kernel supports the BPF_PROG_ADD_MAP syscall and a .metadata section exists, the map will be explicitly attached to the program via the syscall immediately after program is loaded. -EEXIST is ignored for this syscall. Signed-off-by: YiFei Zhu --- tools/lib/bpf/bpf.c | 11 +++++ tools/lib/bpf/bpf.h | 1 + tools/lib/bpf/libbpf.c | 97 +++++++++++++++++++++++++++++++++++++++- tools/lib/bpf/libbpf.map | 1 + 4 files changed, 109 insertions(+), 1 deletion(-) diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c index eab14c97c15d..9b2173d4f92c 100644 --- a/tools/lib/bpf/bpf.c +++ b/tools/lib/bpf/bpf.c @@ -872,3 +872,14 @@ int bpf_enable_stats(enum bpf_stats_type type) return sys_bpf(BPF_ENABLE_STATS, &attr, sizeof(attr)); } + +int bpf_prog_add_map(int prog_fd, int map_fd, int flags) +{ + union bpf_attr attr = {}; + + attr.prog_add_map.prog_fd = prog_fd; + attr.prog_add_map.map_fd = map_fd; + attr.prog_add_map.flags = flags; + + return sys_bpf(BPF_PROG_ADD_MAP, &attr, sizeof(attr)); +} diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h index 28855fd5b5f4..d76fee8f84e0 100644 --- a/tools/lib/bpf/bpf.h +++ b/tools/lib/bpf/bpf.h @@ -240,6 +240,7 @@ LIBBPF_API int bpf_task_fd_query(int pid, int fd, __u32 flags, char *buf, enum bpf_stats_type; /* defined in up-to-date linux/bpf.h */ LIBBPF_API int bpf_enable_stats(enum bpf_stats_type type); +LIBBPF_API int bpf_prog_add_map(int prog_fd, int map_fd, int flags); #ifdef __cplusplus } /* extern "C" */ #endif diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 7be04e45d29c..3ab1cb1f2af3 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -180,6 +180,8 @@ struct bpf_capabilities { __u32 btf_func_global:1; /* kernel support for expected_attach_type in BPF_PROG_LOAD */ __u32 exp_attach_type:1; + /* 
kernel support for BPF_PROG_ADD_MAP */ + __u32 prog_add_map:1; }; enum reloc_type { @@ -288,6 +290,7 @@ struct bpf_struct_ops { #define KCONFIG_SEC ".kconfig" #define KSYMS_SEC ".ksyms" #define STRUCT_OPS_SEC ".struct_ops" +#define METADATA_SEC ".metadata" enum libbpf_map_type { LIBBPF_MAP_UNSPEC, @@ -295,6 +298,7 @@ enum libbpf_map_type { LIBBPF_MAP_BSS, LIBBPF_MAP_RODATA, LIBBPF_MAP_KCONFIG, + LIBBPF_MAP_METADATA, }; static const char * const libbpf_type_to_btf_name[] = { @@ -302,6 +306,7 @@ static const char * const libbpf_type_to_btf_name[] = { [LIBBPF_MAP_BSS] = BSS_SEC, [LIBBPF_MAP_RODATA] = RODATA_SEC, [LIBBPF_MAP_KCONFIG] = KCONFIG_SEC, + [LIBBPF_MAP_METADATA] = METADATA_SEC, }; struct bpf_map { @@ -380,6 +385,8 @@ struct bpf_object { size_t nr_maps; size_t maps_cap; + struct bpf_map *metadata_map; + char *kconfig; struct extern_desc *externs; int nr_extern; @@ -403,6 +410,7 @@ struct bpf_object { Elf_Data *rodata; Elf_Data *bss; Elf_Data *st_ops_data; + Elf_Data *metadata; size_t strtabidx; struct { GElf_Shdr shdr; @@ -418,6 +426,7 @@ struct bpf_object { int rodata_shndx; int bss_shndx; int st_ops_shndx; + int metadata_shndx; } efile; /* * All loaded bpf_object is linked in a list, which is @@ -1030,6 +1039,7 @@ static struct bpf_object *bpf_object__new(const char *path, obj->efile.obj_buf_sz = obj_buf_sz; obj->efile.maps_shndx = -1; obj->efile.btf_maps_shndx = -1; + obj->efile.metadata_shndx = -1; obj->efile.data_shndx = -1; obj->efile.rodata_shndx = -1; obj->efile.bss_shndx = -1; @@ -1391,6 +1401,9 @@ bpf_object__init_internal_map(struct bpf_object *obj, enum libbpf_map_type type, if (data) memcpy(map->mmaped, data, data_sz); + if (type == LIBBPF_MAP_METADATA) + obj->metadata_map = map; + pr_debug("map %td is \"%s\"\n", map - obj->maps, map->name); return 0; } @@ -1426,6 +1439,14 @@ static int bpf_object__init_global_data_maps(struct bpf_object *obj) if (err) return err; } + if (obj->efile.metadata_shndx >= 0) { + err = bpf_object__init_internal_map(obj, 
LIBBPF_MAP_METADATA, + obj->efile.metadata_shndx, + obj->efile.metadata->d_buf, + obj->efile.metadata->d_size); + if (err) + return err; + } return 0; } @@ -2665,6 +2686,9 @@ static int bpf_object__elf_collect(struct bpf_object *obj) } else if (strcmp(name, STRUCT_OPS_SEC) == 0) { obj->efile.st_ops_data = data; obj->efile.st_ops_shndx = idx; + } else if (strcmp(name, METADATA_SEC) == 0) { + obj->efile.metadata = data; + obj->efile.metadata_shndx = idx; } else { pr_debug("skip section(%d) %s\n", idx, name); } @@ -3078,7 +3102,8 @@ static bool bpf_object__shndx_is_data(const struct bpf_object *obj, { return shndx == obj->efile.data_shndx || shndx == obj->efile.bss_shndx || - shndx == obj->efile.rodata_shndx; + shndx == obj->efile.rodata_shndx || + shndx == obj->efile.metadata_shndx; } static bool bpf_object__shndx_is_maps(const struct bpf_object *obj, @@ -3099,6 +3124,8 @@ bpf_object__section_to_libbpf_map_type(const struct bpf_object *obj, int shndx) return LIBBPF_MAP_RODATA; else if (shndx == obj->efile.symbols_shndx) return LIBBPF_MAP_KCONFIG; + else if (shndx == obj->efile.metadata_shndx) + return LIBBPF_MAP_METADATA; else return LIBBPF_MAP_UNSPEC; } @@ -3633,6 +3660,59 @@ bpf_object__probe_exp_attach_type(struct bpf_object *obj) return 0; } +static int +bpf_object__probe_prog_add_map(struct bpf_object *obj) +{ + struct bpf_load_program_attr prog_attr; + struct bpf_create_map_attr map_attr; + char *cp, errmsg[STRERR_BUFSIZE]; + struct bpf_insn insns[] = { + BPF_MOV64_IMM(BPF_REG_0, 0), + BPF_EXIT_INSN(), + }; + int prog, map; + + if (!obj->caps.global_data) + return 0; + + memset(&map_attr, 0, sizeof(map_attr)); + map_attr.map_type = BPF_MAP_TYPE_ARRAY; + map_attr.key_size = sizeof(int); + map_attr.value_size = 32; + map_attr.max_entries = 1; + + map = bpf_create_map_xattr(&map_attr); + if (map < 0) { + cp = libbpf_strerror_r(errno, errmsg, sizeof(errmsg)); + pr_warn("Error in %s():%s(%d). 
Couldn't create simple array map.\n", + __func__, cp, errno); + return -errno; + } + + memset(&prog_attr, 0, sizeof(prog_attr)); + prog_attr.prog_type = BPF_PROG_TYPE_SOCKET_FILTER; + prog_attr.insns = insns; + prog_attr.insns_cnt = ARRAY_SIZE(insns); + prog_attr.license = "GPL"; + + prog = bpf_load_program_xattr(&prog_attr, NULL, 0); + if (prog < 0) { + cp = libbpf_strerror_r(errno, errmsg, sizeof(errmsg)); + pr_warn("Error in %s():%s(%d). Couldn't create simple program.\n", + __func__, cp, errno); + + close(map); + return -errno; + } + + if (!bpf_prog_add_map(prog, map, 0)) + obj->caps.prog_add_map = 1; + + close(map); + close(prog); + return 0; +} + static int bpf_object__probe_caps(struct bpf_object *obj) { @@ -3644,6 +3724,7 @@ bpf_object__probe_caps(struct bpf_object *obj) bpf_object__probe_btf_datasec, bpf_object__probe_array_mmap, bpf_object__probe_exp_attach_type, + bpf_object__probe_prog_add_map, }; int i, ret; @@ -5404,6 +5485,20 @@ load_program(struct bpf_program *prog, struct bpf_insn *insns, int insns_cnt, if (ret >= 0) { if (log_buf && load_attr.log_level) pr_debug("verifier log:\n%s", log_buf); + + if (prog->obj->metadata_map && prog->obj->caps.prog_add_map) { + if (bpf_prog_add_map(ret, bpf_map__fd(prog->obj->metadata_map), 0) && + errno != EEXIST) { + int fd = ret; + + ret = -errno; + cp = libbpf_strerror_r(errno, errmsg, sizeof(errmsg)); + pr_warn("add metadata map failed: %s\n", cp); + close(fd); + goto out; + } + } + *pfd = ret; ret = 0; goto out; diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map index 0c4722bfdd0a..86494a464c34 100644 --- a/tools/lib/bpf/libbpf.map +++ b/tools/lib/bpf/libbpf.map @@ -288,6 +288,7 @@ LIBBPF_0.1.0 { bpf_map__set_value_size; bpf_map__type; bpf_map__value_size; + bpf_prog_add_map; bpf_program__attach_xdp; bpf_program__autoload; bpf_program__is_sk_lookup; From patchwork Fri Aug 14 19:15:57 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit 
X-Patchwork-Submitter: YiFei Zhu
X-Patchwork-Id: 1345180
X-Patchwork-Delegate: bpf@iogearbox.net
From: YiFei Zhu
To: bpf@vger.kernel.org
Cc: Alexei Starovoitov, Daniel Borkmann, Stanislav Fomichev, Mahesh Bandewar, YiFei Zhu
Subject: [RFC PATCH bpf-next 4/5] bpftool: support dumping metadata
Date: Fri, 14 Aug 2020 14:15:57 -0500
Message-Id: <618cfa1775ff2b0e45167d215c534d3cf39dbc15.1597427271.git.zhuyifei@google.com>
X-Mailer: git-send-email 2.28.0

From: YiFei Zhu

Added a flag "--metadata" to `bpftool prog list` to dump the metadata contents.
To format the values, some BTF code is placed directly in the metadata
dumping code. Sanity checks on the map and on the kind of the btf_type
make sure we are actually dumping what we expect.

A helper jsonw_reset is added to the json writer so we can reuse the
same json writer without emitting extraneous commas.

Sample output:

  $ bpftool prog --metadata
  6: cgroup_skb  name prog  tag bcf7977d3b93787c  gpl
  [...]
  	btf_id 4
  	metadata:
  		metadata_a = "foo"
  		metadata_b = 1

  $ bpftool prog --metadata --json --pretty
  [{
          "id": 6,
  [...]
          "btf_id": 4,
          "metadata": {
              "metadata_a": "foo",
              "metadata_b": 1
          }
      }
  ]

Signed-off-by: YiFei Zhu
---
 tools/bpf/bpftool/json_writer.c |   6 ++
 tools/bpf/bpftool/json_writer.h |   3 +
 tools/bpf/bpftool/main.c        |  10 +++
 tools/bpf/bpftool/main.h        |   1 +
 tools/bpf/bpftool/prog.c        | 132 ++++++++++++++++++++++++++++++++
 5 files changed, 152 insertions(+)

diff --git a/tools/bpf/bpftool/json_writer.c b/tools/bpf/bpftool/json_writer.c
index 86501cd3c763..7fea83bedf48 100644
--- a/tools/bpf/bpftool/json_writer.c
+++ b/tools/bpf/bpftool/json_writer.c
@@ -119,6 +119,12 @@ void jsonw_pretty(json_writer_t *self, bool on)
 	self->pretty = on;
 }

+void jsonw_reset(json_writer_t *self)
+{
+	assert(self->depth == 0);
+	self->sep = '\0';
+}
+
 /* Basic blocks */
 static void jsonw_begin(json_writer_t *self, int c)
 {
diff --git a/tools/bpf/bpftool/json_writer.h b/tools/bpf/bpftool/json_writer.h
index 35cf1f00f96c..8ace65cdb92f 100644
--- a/tools/bpf/bpftool/json_writer.h
+++ b/tools/bpf/bpftool/json_writer.h
@@ -27,6 +27,9 @@ void jsonw_destroy(json_writer_t **self_p);
 /* Cause output to have pretty whitespace */
 void jsonw_pretty(json_writer_t *self, bool on);

+/* Reset separator to create new JSON */
+void jsonw_reset(json_writer_t *self);
+
 /* Add property name */
 void jsonw_name(json_writer_t *self, const char *name);

diff --git a/tools/bpf/bpftool/main.c b/tools/bpf/bpftool/main.c
index 4a191fcbeb82..a681d568cfa7 100644
--- a/tools/bpf/bpftool/main.c
+++ b/tools/bpf/bpftool/main.c
@@ -28,6 +28,7 @@ bool show_pinned;
 bool block_mount;
 bool verifier_logs;
 bool relaxed_maps;
+bool dump_metadata;
 struct pinned_obj_table prog_table;
 struct pinned_obj_table map_table;
 struct pinned_obj_table link_table;
@@ -351,6 +352,10 @@ static int do_batch(int argc, char **argv)
 	return err;
 }

+enum bpftool_longonly_opts {
+	OPT_METADATA = 256,
+};
+
 int main(int argc, char **argv)
 {
 	static const struct option options[] = {
@@ -362,6 +367,7 @@ int main(int argc, char **argv)
 		{ "mapcompat", no_argument, NULL, 'm' },
 		{ "nomount", no_argument, NULL, 'n' },
 		{ "debug", no_argument, NULL, 'd' },
+		{ "metadata", no_argument, NULL, OPT_METADATA },
 		{ 0 }
 	};
 	int opt, ret;
@@ -371,6 +377,7 @@ int main(int argc, char **argv)
 	json_output = false;
 	show_pinned = false;
 	block_mount = false;
+	dump_metadata = false;
 	bin_name = argv[0];

 	hash_init(prog_table.table);
@@ -412,6 +419,9 @@ int main(int argc, char **argv)
 			libbpf_set_print(print_all_levels);
 			verifier_logs = true;
 			break;
+		case OPT_METADATA:
+			dump_metadata = true;
+			break;
 		default:
 			p_err("unrecognized option '%s'", argv[optind - 1]);
 			if (json_output)
diff --git a/tools/bpf/bpftool/main.h b/tools/bpf/bpftool/main.h
index e3a79b5a9960..54adfda5a9c9 100644
--- a/tools/bpf/bpftool/main.h
+++ b/tools/bpf/bpftool/main.h
@@ -82,6 +82,7 @@ extern bool show_pids;
 extern bool block_mount;
 extern bool verifier_logs;
 extern bool relaxed_maps;
+extern bool dump_metadata;
 extern struct pinned_obj_table prog_table;
 extern struct pinned_obj_table map_table;
 extern struct pinned_obj_table link_table;
diff --git a/tools/bpf/bpftool/prog.c b/tools/bpf/bpftool/prog.c
index 158995d853b0..9f803a84d132 100644
--- a/tools/bpf/bpftool/prog.c
+++ b/tools/bpf/bpftool/prog.c
@@ -151,6 +151,132 @@ static void show_prog_maps(int fd, __u32 num_maps)
 	}
 }

+static void show_prog_metadata(int fd, __u32 num_maps)
+{
+	struct bpf_prog_info prog_info = {};
+	struct bpf_map_info map_info = {};
+	__u32 prog_info_len = sizeof(prog_info);
+	__u32 map_info_len = sizeof(map_info);
+	__u32 map_ids[num_maps];
+	void *value = NULL;
+	struct btf *btf = NULL;
+	const struct btf_type *t_datasec, *t_var;
+	struct btf_var_secinfo *vsi;
+	int key = 0;
+	unsigned int i, vlen;
+	int map_fd;
+	int err;
+
+	prog_info.nr_map_ids = num_maps;
+	prog_info.map_ids = ptr_to_u64(map_ids);
+
+	err = bpf_obj_get_info_by_fd(fd, &prog_info, &prog_info_len);
+	if (err || !prog_info.nr_map_ids)
+		return;
+
+	for (i = 0; i < prog_info.nr_map_ids; i++) {
+		map_fd = bpf_map_get_fd_by_id(map_ids[i]);
+		if (map_fd < 0)
+			return;
+
+		err = bpf_obj_get_info_by_fd(map_fd, &map_info, &map_info_len);
+		if (err)
+			goto out_close;
+
+		if (map_info.type != BPF_MAP_TYPE_ARRAY)
+			continue;
+		if (map_info.key_size != sizeof(int))
+			continue;
+		if (map_info.max_entries != 1)
+			continue;
+		if (!map_info.btf_value_type_id)
+			continue;
+		if (!strstr(map_info.name, ".metadata"))
+			continue;
+
+		goto found;
+	}
+
+	goto out_close;
+
+found:
+	value = malloc(map_info.value_size);
+	if (!value)
+		goto out_close;
+
+	if (bpf_map_lookup_elem(map_fd, &key, value))
+		goto out_free;
+
+	err = btf__get_from_id(map_info.btf_id, &btf);
+	if (err || !btf)
+		goto out_free;
+
+	t_datasec = btf__type_by_id(btf, map_info.btf_value_type_id);
+	if (BTF_INFO_KIND(t_datasec->info) != BTF_KIND_DATASEC)
+		goto out_free;
+
+	vlen = BTF_INFO_VLEN(t_datasec->info);
+	vsi = (struct btf_var_secinfo *)(t_datasec + 1);
+
+	if (json_output) {
+		struct btf_dumper d = {
+			.btf = btf,
+			.jw = json_wtr,
+			.is_plain_text = false,
+		};
+
+		jsonw_name(json_wtr, "metadata");
+
+		jsonw_start_object(json_wtr);
+		for (i = 0; i < vlen; i++) {
+			t_var = btf__type_by_id(btf, vsi[i].type);
+
+			if (BTF_INFO_KIND(t_var->info) != BTF_KIND_VAR)
+				continue;
+
+			jsonw_name(json_wtr, btf__name_by_offset(btf, t_var->name_off));
+			err = btf_dumper_type(&d, t_var->type, value + vsi[i].offset);
+			if (err)
+				break;
+		}
+		jsonw_end_object(json_wtr);
+	} else {
+		json_writer_t *btf_wtr = jsonw_new(stdout);
+		struct btf_dumper d = {
+			.btf = btf,
+			.jw = btf_wtr,
+			.is_plain_text = true,
+		};
+		if (!btf_wtr)
+			goto out_free;
+
+		printf("\tmetadata:");
+
+		for (i = 0; i < vlen; i++) {
+			t_var = btf__type_by_id(btf, vsi[i].type);
+
+			if (BTF_INFO_KIND(t_var->info) != BTF_KIND_VAR)
+				continue;
+
+			printf("\n\t\t%s = ", btf__name_by_offset(btf, t_var->name_off));
+
+			jsonw_reset(btf_wtr);
+			err = btf_dumper_type(&d, t_var->type, value + vsi[i].offset);
+			if (err)
+				break;
+		}
+
+		jsonw_destroy(&btf_wtr);
+	}
+
+out_free:
+	btf__free(btf);
+	free(value);
+
+out_close:
+	close(map_fd);
+}
+
 static void print_prog_header_json(struct bpf_prog_info *info)
 {
 	jsonw_uint_field(json_wtr, "id", info->id);
@@ -228,6 +354,9 @@ static void print_prog_json(struct bpf_prog_info *info, int fd)

 	emit_obj_refs_json(&refs_table, info->id, json_wtr);

+	if (dump_metadata)
+		show_prog_metadata(fd, info->nr_map_ids);
+
 	jsonw_end_object(json_wtr);
 }

@@ -297,6 +426,9 @@ static void print_prog_plain(struct bpf_prog_info *info, int fd)
 	emit_obj_refs_plain(&refs_table, info->id, "\n\tpids ");

 	printf("\n");
+
+	if (dump_metadata)
+		show_prog_metadata(fd, info->nr_map_ids);
 }

 static int show_prog(int fd)

From patchwork Fri Aug 14 19:15:58 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: YiFei Zhu
X-Patchwork-Id: 1345181
X-Patchwork-Delegate: bpf@iogearbox.net
From: YiFei Zhu
To: bpf@vger.kernel.org
Cc: Alexei Starovoitov, Daniel Borkmann, Stanislav Fomichev, Mahesh Bandewar, YiFei Zhu
Subject: [RFC PATCH bpf-next 5/5] selftests/bpf: Test bpftool loading and dumping metadata
Date: Fri, 14 Aug 2020 14:15:58 -0500
Message-Id: <58804d7707fdd2918654b86f5235b201951aa8a9.1597427271.git.zhuyifei@google.com>
X-Mailer: git-send-email 2.28.0

From: YiFei Zhu

This is a simple test to check that loading and dumping metadata works,
whether or not metadata contents are used by the program.

Signed-off-by: YiFei Zhu
---
 tools/testing/selftests/bpf/Makefile          |  3 +-
 .../selftests/bpf/progs/metadata_unused.c     | 15 ++++
 .../selftests/bpf/progs/metadata_used.c       | 15 ++++
 .../selftests/bpf/test_bpftool_metadata.sh    | 82 +++++++++++++++++++
 4 files changed, 114 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/bpf/progs/metadata_unused.c
 create mode 100644 tools/testing/selftests/bpf/progs/metadata_used.c
 create mode 100755 tools/testing/selftests/bpf/test_bpftool_metadata.sh

diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index e7a8cf83ba48..5f559c79a977 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -68,7 +68,8 @@ TEST_PROGS := test_kmod.sh \
 	test_tc_edt.sh \
 	test_xdping.sh \
 	test_bpftool_build.sh \
-	test_bpftool.sh
+	test_bpftool.sh \
+	test_bpftool_metadata.sh \

 TEST_PROGS_EXTENDED := with_addr.sh \
 	with_tunnels.sh \
diff --git a/tools/testing/selftests/bpf/progs/metadata_unused.c b/tools/testing/selftests/bpf/progs/metadata_unused.c
new file mode 100644
index 000000000000..523b3c332426
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/metadata_unused.c
@@ -0,0 +1,15 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#include
+#include
+
+char metadata_a[] SEC(".metadata") = "foo";
+int metadata_b SEC(".metadata") = 1;
+
+SEC("cgroup_skb/egress")
+int prog(struct xdp_md *ctx)
+{
+	return 0;
+}
+
+char _license[] SEC("license") = "GPL";
diff --git a/tools/testing/selftests/bpf/progs/metadata_used.c b/tools/testing/selftests/bpf/progs/metadata_used.c
new file mode 100644
index 000000000000..59785404f7bb
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/metadata_used.c
@@ -0,0 +1,15 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#include
+#include
+
+char metadata_a[] SEC(".metadata") = "bar";
+int metadata_b SEC(".metadata") = 2;
+
+SEC("cgroup_skb/egress")
+int prog(struct xdp_md *ctx)
+{
+	return metadata_b ? 1 : 0;
+}
+
+char _license[] SEC("license") = "GPL";
diff --git a/tools/testing/selftests/bpf/test_bpftool_metadata.sh b/tools/testing/selftests/bpf/test_bpftool_metadata.sh
new file mode 100755
index 000000000000..a7515c09dc2d
--- /dev/null
+++ b/tools/testing/selftests/bpf/test_bpftool_metadata.sh
@@ -0,0 +1,82 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+
+# Kselftest framework requirement - SKIP code is 4.
+ksft_skip=4
+
+TESTNAME=bpftool_metadata
+BPF_FS=$(awk '$3 == "bpf" {print $2; exit}' /proc/mounts)
+BPF_DIR=$BPF_FS/test_$TESTNAME
+
+_cleanup()
+{
+	set +e
+	rm -rf $BPF_DIR 2> /dev/null
+}
+
+cleanup_skip()
+{
+	echo "selftests: $TESTNAME [SKIP]"
+	_cleanup
+
+	exit $ksft_skip
+}
+
+cleanup()
+{
+	if [ "$?" = 0 ]; then
+		echo "selftests: $TESTNAME [PASS]"
+	else
+		echo "selftests: $TESTNAME [FAILED]"
+	fi
+	_cleanup
+}
+
+if [ $(id -u) -ne 0 ]; then
+	echo "selftests: $TESTNAME [SKIP] Need root privileges"
+	exit $ksft_skip
+fi
+
+if [ -z "$BPF_FS" ]; then
+	echo "selftests: $TESTNAME [SKIP] Could not run test without bpffs mounted"
+	exit $ksft_skip
+fi
+
+if ! bpftool version > /dev/null 2>&1; then
+	echo "selftests: $TESTNAME [SKIP] Could not run test without bpftool"
+	exit $ksft_skip
+fi
+
+set -e
+
+trap cleanup_skip EXIT
+
+mkdir $BPF_DIR
+
+trap cleanup EXIT
+
+bpftool prog load metadata_unused.o $BPF_DIR/unused
+
+METADATA_PLAIN="$(bpftool prog --metadata)"
+echo "$METADATA_PLAIN" | grep 'metadata_a = "foo"' > /dev/null
+echo "$METADATA_PLAIN" | grep 'metadata_b = 1' > /dev/null
+
+bpftool prog --metadata --json | grep '"metadata":{"metadata_a":"foo","metadata_b":1}' > /dev/null
+
+bpftool map | grep 'metada.metadata' > /dev/null
+
+rm $BPF_DIR/unused
+
+bpftool prog load metadata_used.o $BPF_DIR/used
+
+METADATA_PLAIN="$(bpftool prog --metadata)"
+echo "$METADATA_PLAIN" | grep 'metadata_a = "bar"' > /dev/null
+echo "$METADATA_PLAIN" | grep 'metadata_b = 2' > /dev/null
+
+bpftool prog --metadata --json | grep '"metadata":{"metadata_a":"bar","metadata_b":2}' > /dev/null
+
+bpftool map | grep 'metada.metadata' > /dev/null
+
+rm $BPF_DIR/used
+
+exit 0