From patchwork Tue Jan 14 16:46:05 2020
X-Patchwork-Submitter: Brian Vazquez
X-Patchwork-Id: 1222946
X-Patchwork-Delegate: bpf@iogearbox.net
Date: Tue, 14 Jan 2020 08:46:05 -0800
Message-Id: <20200114164614.47029-2-brianvv@google.com>
In-Reply-To: <20200114164614.47029-1-brianvv@google.com>
References: <20200114164614.47029-1-brianvv@google.com>
Subject: [PATCH v4 bpf-next 1/9] bpf: add bpf_map_{value_size, update_value, map_copy_value} functions
From: Brian Vazquez
To: Brian Vazquez, Alexei Starovoitov, Daniel Borkmann, "David S. Miller"
Cc: Yonghong Song, Andrii Nakryiko, Stanislav Fomichev, Petar Penkov,
 Willem de Bruijn, linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
 bpf@vger.kernel.org, John Fastabend

This commit moves reusable code from map_lookup_elem and map_update_elem
into new helpers to avoid code duplication in kernel/bpf/syscall.c.

Signed-off-by: Brian Vazquez
Acked-by: John Fastabend
Acked-by: Yonghong Song
Miller" Cc: Yonghong Song , Andrii Nakryiko , Stanislav Fomichev , Petar Penkov , Willem de Bruijn , linux-kernel@vger.kernel.org, netdev@vger.kernel.org, bpf@vger.kernel.org, John Fastabend Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org This commit moves reusable code from map_lookup_elem and map_update_elem to avoid code duplication in kernel/bpf/syscall.c. Signed-off-by: Brian Vazquez Acked-by: John Fastabend Acked-by: Yonghong Song --- kernel/bpf/syscall.c | 280 +++++++++++++++++++++++-------------------- 1 file changed, 152 insertions(+), 128 deletions(-) diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index f9db72a96ec04..08b0b6e40454b 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -129,6 +129,154 @@ static struct bpf_map *find_and_alloc_map(union bpf_attr *attr) return map; } +static u32 bpf_map_value_size(struct bpf_map *map) +{ + if (map->map_type == BPF_MAP_TYPE_PERCPU_HASH || + map->map_type == BPF_MAP_TYPE_LRU_PERCPU_HASH || + map->map_type == BPF_MAP_TYPE_PERCPU_ARRAY || + map->map_type == BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE) + return round_up(map->value_size, 8) * num_possible_cpus(); + else if (IS_FD_MAP(map)) + return sizeof(u32); + else + return map->value_size; +} + +static void maybe_wait_bpf_programs(struct bpf_map *map) +{ + /* Wait for any running BPF programs to complete so that + * userspace, when we return to it, knows that all programs + * that could be running use the new map value. + */ + if (map->map_type == BPF_MAP_TYPE_HASH_OF_MAPS || + map->map_type == BPF_MAP_TYPE_ARRAY_OF_MAPS) + synchronize_rcu(); +} + +static int bpf_map_update_value(struct bpf_map *map, struct fd f, void *key, + void *value, __u64 flags) +{ + int err; + + /* Need to create a kthread, thus must support schedule */ + if (bpf_map_is_dev_bound(map)) { + return bpf_map_offload_update_elem(map, key, value, flags); + } else if (map->map_type == BPF_MAP_TYPE_CPUMAP || + map->map_type == BPF_MAP_TYPE_SOCKHASH || + map->map_type == BPF_MAP_TYPE_SOCKMAP || + map->map_type == BPF_MAP_TYPE_STRUCT_OPS) { + return map->ops->map_update_elem(map, key, value, flags); + } else if (IS_FD_PROG_ARRAY(map)) { + return bpf_fd_array_map_update_elem(map, f.file, key, value, + flags); + } + + /* must increment bpf_prog_active to avoid kprobe+bpf triggering from + * inside bpf map update or delete otherwise deadlocks are possible + */ + preempt_disable(); + __this_cpu_inc(bpf_prog_active); + if (map->map_type == BPF_MAP_TYPE_PERCPU_HASH || + map->map_type == BPF_MAP_TYPE_LRU_PERCPU_HASH) { + err = bpf_percpu_hash_update(map, key, value, flags); + } else if (map->map_type == BPF_MAP_TYPE_PERCPU_ARRAY) { + err = bpf_percpu_array_update(map, key, value, flags); + } else if (map->map_type == BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE) { + err = bpf_percpu_cgroup_storage_update(map, key, value, + flags); + } else if (IS_FD_ARRAY(map)) { + rcu_read_lock(); + err = bpf_fd_array_map_update_elem(map, f.file, key, value, + flags); + rcu_read_unlock(); + } else if (map->map_type == BPF_MAP_TYPE_HASH_OF_MAPS) { + rcu_read_lock(); + err = bpf_fd_htab_map_update_elem(map, f.file, key, value, + flags); + rcu_read_unlock(); + } else if (map->map_type == BPF_MAP_TYPE_REUSEPORT_SOCKARRAY) { + /* rcu_read_lock() is not needed */ + err = bpf_fd_reuseport_array_update_elem(map, key, value, + flags); + } else if (map->map_type == BPF_MAP_TYPE_QUEUE || + map->map_type == BPF_MAP_TYPE_STACK) { + err = map->ops->map_push_elem(map, value, flags); + } else { + 
+		rcu_read_lock();
+		err = map->ops->map_update_elem(map, key, value, flags);
+		rcu_read_unlock();
+	}
+	__this_cpu_dec(bpf_prog_active);
+	preempt_enable();
+	maybe_wait_bpf_programs(map);
+
+	return err;
+}
+
+static int bpf_map_copy_value(struct bpf_map *map, void *key, void *value,
+			      __u64 flags)
+{
+	void *ptr;
+	int err;
+
+	if (bpf_map_is_dev_bound(map)) {
+		err = bpf_map_offload_lookup_elem(map, key, value);
+		return err;
+	}
+
+	preempt_disable();
+	this_cpu_inc(bpf_prog_active);
+	if (map->map_type == BPF_MAP_TYPE_PERCPU_HASH ||
+	    map->map_type == BPF_MAP_TYPE_LRU_PERCPU_HASH) {
+		err = bpf_percpu_hash_copy(map, key, value);
+	} else if (map->map_type == BPF_MAP_TYPE_PERCPU_ARRAY) {
+		err = bpf_percpu_array_copy(map, key, value);
+	} else if (map->map_type == BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE) {
+		err = bpf_percpu_cgroup_storage_copy(map, key, value);
+	} else if (map->map_type == BPF_MAP_TYPE_STACK_TRACE) {
+		err = bpf_stackmap_copy(map, key, value);
+	} else if (IS_FD_ARRAY(map) || IS_FD_PROG_ARRAY(map)) {
+		err = bpf_fd_array_map_lookup_elem(map, key, value);
+	} else if (IS_FD_HASH(map)) {
+		err = bpf_fd_htab_map_lookup_elem(map, key, value);
+	} else if (map->map_type == BPF_MAP_TYPE_REUSEPORT_SOCKARRAY) {
+		err = bpf_fd_reuseport_array_lookup_elem(map, key, value);
+	} else if (map->map_type == BPF_MAP_TYPE_QUEUE ||
+		   map->map_type == BPF_MAP_TYPE_STACK) {
+		err = map->ops->map_peek_elem(map, value);
+	} else if (map->map_type == BPF_MAP_TYPE_STRUCT_OPS) {
+		/* struct_ops map requires directly updating "value" */
+		err = bpf_struct_ops_map_sys_lookup_elem(map, key, value);
+	} else {
+		rcu_read_lock();
+		if (map->ops->map_lookup_elem_sys_only)
+			ptr = map->ops->map_lookup_elem_sys_only(map, key);
+		else
+			ptr = map->ops->map_lookup_elem(map, key);
+		if (IS_ERR(ptr)) {
+			err = PTR_ERR(ptr);
+		} else if (!ptr) {
+			err = -ENOENT;
+		} else {
+			err = 0;
+			if (flags & BPF_F_LOCK)
+				/* lock 'ptr' and copy everything but lock */
+				copy_map_value_locked(map, value, ptr, true);
+			else
+				copy_map_value(map, value, ptr);
+			/* mask lock, since value wasn't zero inited */
+			check_and_init_map_lock(map, value);
+		}
+		rcu_read_unlock();
+	}
+
+	this_cpu_dec(bpf_prog_active);
+	preempt_enable();
+	maybe_wait_bpf_programs(map);
+
+	return err;
+}
+
 static void *__bpf_map_area_alloc(u64 size, int numa_node, bool mmapable)
 {
 	/* We really just want to fail instead of triggering OOM killer
@@ -827,7 +975,7 @@ static int map_lookup_elem(union bpf_attr *attr)
 	void __user *uvalue = u64_to_user_ptr(attr->value);
 	int ufd = attr->map_fd;
 	struct bpf_map *map;
-	void *key, *value, *ptr;
+	void *key, *value;
 	u32 value_size;
 	struct fd f;
 	int err;
@@ -859,75 +1007,14 @@ static int map_lookup_elem(union bpf_attr *attr)
 		goto err_put;
 	}

-	if (map->map_type == BPF_MAP_TYPE_PERCPU_HASH ||
-	    map->map_type == BPF_MAP_TYPE_LRU_PERCPU_HASH ||
-	    map->map_type == BPF_MAP_TYPE_PERCPU_ARRAY ||
-	    map->map_type == BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE)
-		value_size = round_up(map->value_size, 8) * num_possible_cpus();
-	else if (IS_FD_MAP(map))
-		value_size = sizeof(u32);
-	else
-		value_size = map->value_size;
+	value_size = bpf_map_value_size(map);

 	err = -ENOMEM;
 	value = kmalloc(value_size, GFP_USER | __GFP_NOWARN);
 	if (!value)
 		goto free_key;

-	if (bpf_map_is_dev_bound(map)) {
-		err = bpf_map_offload_lookup_elem(map, key, value);
-		goto done;
-	}
-
-	preempt_disable();
-	this_cpu_inc(bpf_prog_active);
-	if (map->map_type == BPF_MAP_TYPE_PERCPU_HASH ||
-	    map->map_type == BPF_MAP_TYPE_LRU_PERCPU_HASH) {
-		err = bpf_percpu_hash_copy(map, key, value);
-	} else if (map->map_type == BPF_MAP_TYPE_PERCPU_ARRAY) {
-		err = bpf_percpu_array_copy(map, key, value);
-	} else if (map->map_type == BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE) {
-		err = bpf_percpu_cgroup_storage_copy(map, key, value);
-	} else if (map->map_type == BPF_MAP_TYPE_STACK_TRACE) {
-		err = bpf_stackmap_copy(map, key, value);
-	} else if (IS_FD_ARRAY(map) || IS_FD_PROG_ARRAY(map)) {
-		err = bpf_fd_array_map_lookup_elem(map, key, value);
-	} else if (IS_FD_HASH(map)) {
-		err = bpf_fd_htab_map_lookup_elem(map, key, value);
-	} else if (map->map_type == BPF_MAP_TYPE_REUSEPORT_SOCKARRAY) {
-		err = bpf_fd_reuseport_array_lookup_elem(map, key, value);
-	} else if (map->map_type == BPF_MAP_TYPE_QUEUE ||
-		   map->map_type == BPF_MAP_TYPE_STACK) {
-		err = map->ops->map_peek_elem(map, value);
-	} else if (map->map_type == BPF_MAP_TYPE_STRUCT_OPS) {
-		/* struct_ops map requires directly updating "value" */
-		err = bpf_struct_ops_map_sys_lookup_elem(map, key, value);
-	} else {
-		rcu_read_lock();
-		if (map->ops->map_lookup_elem_sys_only)
-			ptr = map->ops->map_lookup_elem_sys_only(map, key);
-		else
-			ptr = map->ops->map_lookup_elem(map, key);
-		if (IS_ERR(ptr)) {
-			err = PTR_ERR(ptr);
-		} else if (!ptr) {
-			err = -ENOENT;
-		} else {
-			err = 0;
-			if (attr->flags & BPF_F_LOCK)
-				/* lock 'ptr' and copy everything but lock */
-				copy_map_value_locked(map, value, ptr, true);
-			else
-				copy_map_value(map, value, ptr);
-			/* mask lock, since value wasn't zero inited */
-			check_and_init_map_lock(map, value);
-		}
-		rcu_read_unlock();
-	}
-	this_cpu_dec(bpf_prog_active);
-	preempt_enable();
-
-done:
+	err = bpf_map_copy_value(map, key, value, attr->flags);
 	if (err)
 		goto free_value;

@@ -946,16 +1033,6 @@ static int map_lookup_elem(union bpf_attr *attr)
 	return err;
 }

-static void maybe_wait_bpf_programs(struct bpf_map *map)
-{
-	/* Wait for any running BPF programs to complete so that
-	 * userspace, when we return to it, knows that all programs
-	 * that could be running use the new map value.
-	 */
-	if (map->map_type == BPF_MAP_TYPE_HASH_OF_MAPS ||
-	    map->map_type == BPF_MAP_TYPE_ARRAY_OF_MAPS)
-		synchronize_rcu();
-}

 #define BPF_MAP_UPDATE_ELEM_LAST_FIELD flags

@@ -1011,61 +1088,8 @@ static int map_update_elem(union bpf_attr *attr)
 	if (copy_from_user(value, uvalue, value_size) != 0)
 		goto free_value;

-	/* Need to create a kthread, thus must support schedule */
-	if (bpf_map_is_dev_bound(map)) {
-		err = bpf_map_offload_update_elem(map, key, value, attr->flags);
-		goto out;
-	} else if (map->map_type == BPF_MAP_TYPE_CPUMAP ||
-		   map->map_type == BPF_MAP_TYPE_SOCKHASH ||
-		   map->map_type == BPF_MAP_TYPE_SOCKMAP ||
-		   map->map_type == BPF_MAP_TYPE_STRUCT_OPS) {
-		err = map->ops->map_update_elem(map, key, value, attr->flags);
-		goto out;
-	} else if (IS_FD_PROG_ARRAY(map)) {
-		err = bpf_fd_array_map_update_elem(map, f.file, key, value,
-						   attr->flags);
-		goto out;
-	}
+	err = bpf_map_update_value(map, f, key, value, attr->flags);

-	/* must increment bpf_prog_active to avoid kprobe+bpf triggering from
-	 * inside bpf map update or delete otherwise deadlocks are possible
-	 */
-	preempt_disable();
-	__this_cpu_inc(bpf_prog_active);
-	if (map->map_type == BPF_MAP_TYPE_PERCPU_HASH ||
-	    map->map_type == BPF_MAP_TYPE_LRU_PERCPU_HASH) {
-		err = bpf_percpu_hash_update(map, key, value, attr->flags);
-	} else if (map->map_type == BPF_MAP_TYPE_PERCPU_ARRAY) {
-		err = bpf_percpu_array_update(map, key, value, attr->flags);
-	} else if (map->map_type == BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE) {
-		err = bpf_percpu_cgroup_storage_update(map, key, value,
-						       attr->flags);
-	} else if (IS_FD_ARRAY(map)) {
-		rcu_read_lock();
-		err = bpf_fd_array_map_update_elem(map, f.file, key, value,
-						   attr->flags);
-		rcu_read_unlock();
-	} else if (map->map_type == BPF_MAP_TYPE_HASH_OF_MAPS) {
-		rcu_read_lock();
-		err = bpf_fd_htab_map_update_elem(map, f.file, key, value,
-						  attr->flags);
-		rcu_read_unlock();
-	} else if (map->map_type == BPF_MAP_TYPE_REUSEPORT_SOCKARRAY) {
-		/* rcu_read_lock() is not needed */
-		err = bpf_fd_reuseport_array_update_elem(map, key, value,
-							 attr->flags);
-	} else if (map->map_type == BPF_MAP_TYPE_QUEUE ||
-		   map->map_type == BPF_MAP_TYPE_STACK) {
-		err = map->ops->map_push_elem(map, value, attr->flags);
-	} else {
-		rcu_read_lock();
-		err = map->ops->map_update_elem(map, key, value, attr->flags);
-		rcu_read_unlock();
-	}
-	__this_cpu_dec(bpf_prog_active);
-	preempt_enable();
-	maybe_wait_bpf_programs(map);
-out:
 free_value:
 	kfree(value);
 free_key:

From patchwork Tue Jan 14 16:46:06 2020
X-Patchwork-Submitter: Brian Vazquez
X-Patchwork-Id: 1222928
X-Patchwork-Delegate: bpf@iogearbox.net
Date: Tue, 14 Jan 2020 08:46:06 -0800
Message-Id: <20200114164614.47029-3-brianvv@google.com>
In-Reply-To: <20200114164614.47029-1-brianvv@google.com>
References: <20200114164614.47029-1-brianvv@google.com>
Subject: [PATCH v4 bpf-next 2/9] bpf: add generic support for lookup batch op
From: Brian Vazquez
To: Brian Vazquez, Alexei Starovoitov, Daniel Borkmann, "David S. Miller"
Cc: Yonghong Song, Andrii Nakryiko, Stanislav Fomichev, Petar Penkov,
 Willem de Bruijn, linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
 bpf@vger.kernel.org

This commit introduces generic support for the bpf_map_lookup_batch
command. This implementation can be used by almost all the bpf maps
since its core relies on the existing map_get_next_key and
map_lookup_elem callbacks.

The bpf syscall subcommand introduced is:

  BPF_MAP_LOOKUP_BATCH

The UAPI attribute is:

  struct { /* struct used by BPF_MAP_*_BATCH commands */
          __aligned_u64   in_batch;       /* start batch,
                                           * NULL to start from beginning
                                           */
          __aligned_u64   out_batch;      /* output: next start batch */
          __aligned_u64   keys;
          __aligned_u64   values;
          __u32           count;          /* input/output:
                                           * input: # of key/value
                                           * elements
                                           * output: # of filled elements
                                           */
          __u32           map_fd;
          __u64           elem_flags;
          __u64           flags;
  } batch;

in_batch/out_batch are opaque values used to communicate between user
and kernel space; both must be of key_size length. To start iterating
from the beginning, in_batch must be NULL; count is the number of
key/value elements to retrieve. Note that the 'keys' buffer must be of
key_size * count size and the 'values' buffer must be of
value_size * count size, where value_size must be aligned to 8 bytes by
userspace if it's dealing with percpu maps. 'count' will contain the
number of keys/values successfully retrieved. Note that 'count' is an
input/output variable and it can contain a lower value after a call.

If there are no more entries to retrieve, ENOENT is returned; in that
case count may still be > 0 if some values were copied before the
entries ran out. More generally, if the return code is an error other
than -EFAULT, count indicates the number of elements successfully
processed.

Suggested-by: Stanislav Fomichev
Signed-off-by: Brian Vazquez
Signed-off-by: Yonghong Song
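For illustration, a minimal userspace sketch (not part of this patch)
that drives this command with the raw bpf(2) syscall. It assumes a
kernel with this series applied and a hypothetical map with 4-byte keys
and values; the in_batch/out_batch handling and the ENOENT/count
semantics follow the description above:

  /* Hedged sketch: dump a map with 4-byte keys/values, 128 per call. */
  #include <errno.h>
  #include <unistd.h>
  #include <sys/syscall.h>
  #include <linux/bpf.h>

  static int dump_map(int map_fd)
  {
          __u32 in_batch = 0, out_batch = 0;
          __u32 keys[128], values[128];
          int first = 1, err;

          for (;;) {
                  union bpf_attr attr = {};

                  attr.batch.map_fd = map_fd;
                  /* NULL in_batch on the first call: start from the beginning */
                  attr.batch.in_batch = first ? 0 : (__u64)(unsigned long)&in_batch;
                  attr.batch.out_batch = (__u64)(unsigned long)&out_batch;
                  attr.batch.keys = (__u64)(unsigned long)keys;
                  attr.batch.values = (__u64)(unsigned long)values;
                  attr.batch.count = 128;

                  err = syscall(__NR_bpf, BPF_MAP_LOOKUP_BATCH, &attr,
                                sizeof(attr));
                  if (err && errno != ENOENT)
                          return -1;
                  /* attr.batch.count now holds the # of filled elements;
                   * ... consume keys[0..count-1] / values[0..count-1] ...
                   */
                  if (err)
                          return 0;       /* ENOENT: whole map traversed */
                  in_batch = out_batch;
                  first = 0;
          }
  }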
---
 include/linux/bpf.h      |   5 ++
 include/uapi/linux/bpf.h |  18 +++++
 kernel/bpf/syscall.c     | 154 ++++++++++++++++++++++++++++++++++++++-
 3 files changed, 173 insertions(+), 4 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index aed2bc39d72b6..807744ecaa5a1 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -44,6 +44,8 @@ struct bpf_map_ops {
 	int (*map_get_next_key)(struct bpf_map *map, void *key, void *next_key);
 	void (*map_release_uref)(struct bpf_map *map);
 	void *(*map_lookup_elem_sys_only)(struct bpf_map *map, void *key);
+	int (*map_lookup_batch)(struct bpf_map *map, const union bpf_attr *attr,
+				union bpf_attr __user *uattr);

 	/* funcs callable from userspace and from eBPF programs */
 	void *(*map_lookup_elem)(struct bpf_map *map, void *key);
@@ -982,6 +984,9 @@ void *bpf_map_area_alloc(u64 size, int numa_node);
 void *bpf_map_area_mmapable_alloc(u64 size, int numa_node);
 void bpf_map_area_free(void *base);
 void bpf_map_init_from_attr(struct bpf_map *map, union bpf_attr *attr);
+int generic_map_lookup_batch(struct bpf_map *map,
+			     const union bpf_attr *attr,
+			     union bpf_attr __user *uattr);

 extern int sysctl_unprivileged_bpf_disabled;

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 52966e758fe59..8185f1542daa1 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -107,6 +107,7 @@ enum bpf_cmd {
 	BPF_MAP_LOOKUP_AND_DELETE_ELEM,
 	BPF_MAP_FREEZE,
 	BPF_BTF_GET_NEXT_ID,
+	BPF_MAP_LOOKUP_BATCH,
 };

 enum bpf_map_type {
@@ -420,6 +421,23 @@ union bpf_attr {
 		__u64		flags;
 	};

+	struct { /* struct used by BPF_MAP_*_BATCH commands */
+		__aligned_u64	in_batch;	/* start batch,
+						 * NULL to start from beginning
+						 */
+		__aligned_u64	out_batch;	/* output: next start batch */
+		__aligned_u64	keys;
+		__aligned_u64	values;
+		__u32		count;		/* input/output:
+						 * input: # of key/value
+						 * elements
+						 * output: # of filled elements
+						 */
+		__u32		map_fd;
+		__u64		elem_flags;
+		__u64		flags;
+	} batch;
+
 	struct { /* anonymous struct used by BPF_PROG_LOAD command */
 		__u32		prog_type;	/* one of enum bpf_prog_type */
 		__u32		insn_cnt;
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 08b0b6e40454b..d4acb6eb5ef9e 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -219,10 +219,8 @@ static int bpf_map_copy_value(struct bpf_map *map, void *key, void *value,
 	void *ptr;
 	int err;

-	if (bpf_map_is_dev_bound(map)) {
-		err = bpf_map_offload_lookup_elem(map, key, value);
-		return err;
-	}
+	if (bpf_map_is_dev_bound(map))
+		return bpf_map_offload_lookup_elem(map, key, value);

 	preempt_disable();
 	this_cpu_inc(bpf_prog_active);
@@ -1220,6 +1218,103 @@ static int map_get_next_key(union bpf_attr *attr)
 	return err;
 }

+#define MAP_LOOKUP_RETRIES 3
+
+int generic_map_lookup_batch(struct bpf_map *map,
+			     const union bpf_attr *attr,
+			     union bpf_attr __user *uattr)
+{
+	void __user *uobatch = u64_to_user_ptr(attr->batch.out_batch);
+	void __user *ubatch = u64_to_user_ptr(attr->batch.in_batch);
+	void __user *values = u64_to_user_ptr(attr->batch.values);
+	void __user *keys = u64_to_user_ptr(attr->batch.keys);
+	void *buf, *buf_prevkey, *prev_key, *key, *value;
+	int err, retry = MAP_LOOKUP_RETRIES;
+	u32 value_size, cp, max_count;
+	bool first_key = false;
+
+	if (attr->batch.elem_flags & ~BPF_F_LOCK)
+		return -EINVAL;
+
+	if ((attr->batch.elem_flags & BPF_F_LOCK) &&
+	    !map_value_has_spin_lock(map))
+		return -EINVAL;
+
+	value_size = bpf_map_value_size(map);
+
+	max_count = attr->batch.count;
+	if (!max_count)
+		return 0;
+
+	buf_prevkey = kmalloc(map->key_size, GFP_USER | __GFP_NOWARN);
+	if (!buf_prevkey)
+		return -ENOMEM;
+
+	buf = kmalloc(map->key_size + value_size, GFP_USER | __GFP_NOWARN);
+	if (!buf) {
+		kvfree(buf_prevkey);
+		return -ENOMEM;
+	}
+
+	err = -EFAULT;
+	first_key = false;
+	prev_key = NULL;
+	if (ubatch && copy_from_user(buf_prevkey, ubatch, map->key_size))
+		goto free_buf;
+	key = buf;
+	value = key + map->key_size;
+	if (ubatch)
+		prev_key = buf_prevkey;
+
+	for (cp = 0; cp < max_count;) {
+		rcu_read_lock();
+		err = map->ops->map_get_next_key(map, prev_key, key);
+		rcu_read_unlock();
+		if (err)
+			break;
+		err = bpf_map_copy_value(map, key, value,
+					 attr->batch.elem_flags);
+
+		if (err == -ENOENT) {
+			if (retry) {
+				retry--;
+				continue;
+			}
+			err = -EINTR;
+			break;
+		}
+
+		if (err)
+			goto free_buf;
+
+		if (copy_to_user(keys + cp * map->key_size, key,
+				 map->key_size)) {
+			err = -EFAULT;
+			goto free_buf;
+		}
+		if (copy_to_user(values + cp * value_size, value, value_size)) {
+			err = -EFAULT;
+			goto free_buf;
+		}
+
+		if (!prev_key)
+			prev_key = buf_prevkey;
+
+		swap(prev_key, key);
+		retry = MAP_LOOKUP_RETRIES;
+		cp++;
+	}
+
+	if ((copy_to_user(&uattr->batch.count, &cp, sizeof(cp)) ||
+	     (cp && copy_to_user(uobatch, prev_key, map->key_size))))
+		err = -EFAULT;
+
+free_buf:
+	kfree(buf_prevkey);
+	kfree(buf);
+	return err;
+}
+
 #define BPF_MAP_LOOKUP_AND_DELETE_ELEM_LAST_FIELD value

 static int map_lookup_and_delete_elem(union bpf_attr *attr)
@@ -3076,6 +3171,54 @@ static int bpf_task_fd_query(const union bpf_attr *attr,
 	return err;
 }

+#define BPF_MAP_BATCH_LAST_FIELD batch.flags
+
+#define BPF_DO_BATCH(fn)			\
+	do {					\
+		if (!fn) {			\
+			err = -ENOTSUPP;	\
+			goto err_put;		\
+		}				\
+		err = fn(map, attr, uattr);	\
+	} while (0)
+
+static int bpf_map_do_batch(const union bpf_attr *attr,
+			    union bpf_attr __user *uattr,
+			    int cmd)
+{
+	struct bpf_map *map;
+	int err, ufd;
+	struct fd f;
+
+	if (CHECK_ATTR(BPF_MAP_BATCH))
+		return -EINVAL;
+
+	ufd = attr->batch.map_fd;
+	f = fdget(ufd);
+	map = __bpf_map_get(f);
+	if (IS_ERR(map))
+		return PTR_ERR(map);
+
+	if (cmd == BPF_MAP_LOOKUP_BATCH &&
+	    !(map_get_sys_perms(map, f) & FMODE_CAN_READ)) {
+		err = -EPERM;
+		goto err_put;
+	}
+
+	if (cmd != BPF_MAP_LOOKUP_BATCH &&
+	    !(map_get_sys_perms(map, f) & FMODE_CAN_WRITE)) {
+		err = -EPERM;
+		goto err_put;
+	}
+
+	if (cmd == BPF_MAP_LOOKUP_BATCH)
+		BPF_DO_BATCH(map->ops->map_lookup_batch);
+
+err_put:
+	fdput(f);
+	return err;
+}
+
 SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, size)
 {
 	union bpf_attr attr = {};
@@ -3173,6 +3316,9 @@ SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, siz
 	case BPF_MAP_LOOKUP_AND_DELETE_ELEM:
 		err = map_lookup_and_delete_elem(&attr);
 		break;
+	case BPF_MAP_LOOKUP_BATCH:
+		err = bpf_map_do_batch(&attr, uattr, BPF_MAP_LOOKUP_BATCH);
+		break;
 	default:
 		err = -EINVAL;
 		break;
From patchwork Tue Jan 14 16:46:07 2020
X-Patchwork-Submitter: Brian Vazquez
X-Patchwork-Id: 1222944
X-Patchwork-Delegate: bpf@iogearbox.net
Date: Tue, 14 Jan 2020 08:46:07 -0800
Message-Id: <20200114164614.47029-4-brianvv@google.com>
In-Reply-To: <20200114164614.47029-1-brianvv@google.com>
References: <20200114164614.47029-1-brianvv@google.com>
Subject: [PATCH v4 bpf-next 3/9] bpf: add generic support for update and delete batch ops
From: Brian Vazquez
To: Brian Vazquez, Alexei Starovoitov, Daniel Borkmann, "David S. Miller"
Cc: Yonghong Song, Andrii Nakryiko, Stanislav Fomichev, Petar Penkov,
 Willem de Bruijn, linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
 bpf@vger.kernel.org

This commit adds generic support for update and delete batch ops that
can be used for almost all the bpf maps. These commands share the same
UAPI attr that the lookup and lookup_and_delete batch ops use, and the
syscall commands are:

  BPF_MAP_UPDATE_BATCH
  BPF_MAP_DELETE_BATCH

The main difference between update/delete and lookup batch ops is that
for update/delete the keys/values must be supplied by userspace;
because of that, neither in_batch nor out_batch is used.

Suggested-by: Stanislav Fomichev
Signed-off-by: Brian Vazquez
Signed-off-by: Yonghong Song
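A matching userspace sketch for the update side (again hypothetical
code, same raw-syscall assumptions as the lookup example in patch 2/9);
since in_batch/out_batch are unused here, one call suffices:

  /* Hedged sketch: update 'count' 4-byte keys/values in one syscall.
   * Per the diff below, elem_flags may only carry BPF_F_LOCK here;
   * 0 behaves like BPF_ANY.
   */
  #include <errno.h>
  #include <unistd.h>
  #include <sys/syscall.h>
  #include <linux/bpf.h>

  static int update_batch(int map_fd, __u32 *keys, __u32 *values,
                          __u32 count)
  {
          union bpf_attr attr = {};

          attr.batch.map_fd = map_fd;
          attr.batch.keys = (__u64)(unsigned long)keys;
          attr.batch.values = (__u64)(unsigned long)values;
          attr.batch.count = count;

          if (syscall(__NR_bpf, BPF_MAP_UPDATE_BATCH, &attr, sizeof(attr)))
                  /* attr.batch.count holds the # of elements processed */
                  return -errno;
          return 0;
  }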
---
 include/linux/bpf.h      |  10 ++++
 include/uapi/linux/bpf.h |   2 +
 kernel/bpf/syscall.c     | 115 +++++++++++++++++++++++++++++++++++++++
 3 files changed, 127 insertions(+)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 807744ecaa5a1..05466ad6cf1c5 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -46,6 +46,10 @@ struct bpf_map_ops {
 	void *(*map_lookup_elem_sys_only)(struct bpf_map *map, void *key);
 	int (*map_lookup_batch)(struct bpf_map *map, const union bpf_attr *attr,
 				union bpf_attr __user *uattr);
+	int (*map_update_batch)(struct bpf_map *map, const union bpf_attr *attr,
+				union bpf_attr __user *uattr);
+	int (*map_delete_batch)(struct bpf_map *map, const union bpf_attr *attr,
+				union bpf_attr __user *uattr);

 	/* funcs callable from userspace and from eBPF programs */
 	void *(*map_lookup_elem)(struct bpf_map *map, void *key);
@@ -987,6 +991,12 @@ void bpf_map_init_from_attr(struct bpf_map *map, union bpf_attr *attr);
 int generic_map_lookup_batch(struct bpf_map *map,
 			     const union bpf_attr *attr,
 			     union bpf_attr __user *uattr);
+int generic_map_update_batch(struct bpf_map *map,
+			     const union bpf_attr *attr,
+			     union bpf_attr __user *uattr);
+int generic_map_delete_batch(struct bpf_map *map,
+			     const union bpf_attr *attr,
+			     union bpf_attr __user *uattr);

 extern int sysctl_unprivileged_bpf_disabled;

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 8185f1542daa1..e8df9ca680e0c 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -108,6 +108,8 @@ enum bpf_cmd {
 	BPF_MAP_FREEZE,
 	BPF_BTF_GET_NEXT_ID,
 	BPF_MAP_LOOKUP_BATCH,
+	BPF_MAP_UPDATE_BATCH,
+	BPF_MAP_DELETE_BATCH,
 };

 enum bpf_map_type {
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index d4acb6eb5ef9e..2f631eb67d00c 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -1218,6 +1218,111 @@ static int map_get_next_key(union bpf_attr *attr)
 	return err;
 }

+int generic_map_delete_batch(struct bpf_map *map,
+			     const union bpf_attr *attr,
+			     union bpf_attr __user *uattr)
+{
+	void __user *keys = u64_to_user_ptr(attr->batch.keys);
+	u32 cp, max_count;
+	int err = 0;
+	void *key;
+
+	if (attr->batch.elem_flags & ~BPF_F_LOCK)
+		return -EINVAL;
+
+	if ((attr->batch.elem_flags & BPF_F_LOCK) &&
+	    !map_value_has_spin_lock(map)) {
+		return -EINVAL;
+	}
+
+	max_count = attr->batch.count;
+	if (!max_count)
+		return 0;
+
+	for (cp = 0; cp < max_count; cp++) {
+		key = __bpf_copy_key(keys + cp * map->key_size, map->key_size);
+		if (IS_ERR(key)) {
+			err = PTR_ERR(key);
+			break;
+		}
+
+		if (bpf_map_is_dev_bound(map)) {
+			err = bpf_map_offload_delete_elem(map, key);
+			break;
+		}
+
+		preempt_disable();
+		__this_cpu_inc(bpf_prog_active);
+		rcu_read_lock();
+		err = map->ops->map_delete_elem(map, key);
+		rcu_read_unlock();
+		__this_cpu_dec(bpf_prog_active);
+		preempt_enable();
+		maybe_wait_bpf_programs(map);
+		if (err)
+			break;
+	}
+	if (copy_to_user(&uattr->batch.count, &cp, sizeof(cp)))
+		err = -EFAULT;
+	return err;
+}
+
+int generic_map_update_batch(struct bpf_map *map,
+			     const union bpf_attr *attr,
+			     union bpf_attr __user *uattr)
+{
+	void __user *values = u64_to_user_ptr(attr->batch.values);
+	void __user *keys = u64_to_user_ptr(attr->batch.keys);
+	u32 value_size, cp, max_count;
+	int ufd = attr->map_fd;
+	void *key, *value;
+	struct fd f;
+	int err = 0;
+
+	f = fdget(ufd);
+	if (attr->batch.elem_flags & ~BPF_F_LOCK)
+		return -EINVAL;
+
+	if ((attr->batch.elem_flags & BPF_F_LOCK) &&
+	    !map_value_has_spin_lock(map)) {
+		return -EINVAL;
+	}
+
+	value_size = bpf_map_value_size(map);
+
+	max_count = attr->batch.count;
+	if (!max_count)
+		return 0;
+
+	value = kmalloc(value_size, GFP_USER | __GFP_NOWARN);
+	if (!value)
+		return -ENOMEM;
+
+	for (cp = 0; cp < max_count; cp++) {
+		key = __bpf_copy_key(keys + cp * map->key_size, map->key_size);
+		if (IS_ERR(key)) {
+			err = PTR_ERR(key);
+			break;
+		}
+		err = -EFAULT;
+		if (copy_from_user(value, values + cp * value_size, value_size))
+			break;
+
+		err = bpf_map_update_value(map, f, key, value,
+					   attr->batch.elem_flags);
+
+		if (err)
+			break;
+	}
+
+	if (copy_to_user(&uattr->batch.count, &cp, sizeof(cp)))
+		err = -EFAULT;
+
+	kfree(value);
+	kfree(key);
+	return err;
+}
+
 #define MAP_LOOKUP_RETRIES 3

 int generic_map_lookup_batch(struct bpf_map *map,
@@ -3213,6 +3318,10 @@ static int bpf_map_do_batch(const union bpf_attr *attr,

 	if (cmd == BPF_MAP_LOOKUP_BATCH)
 		BPF_DO_BATCH(map->ops->map_lookup_batch);
+	else if (cmd == BPF_MAP_UPDATE_BATCH)
+		BPF_DO_BATCH(map->ops->map_update_batch);
+	else
+		BPF_DO_BATCH(map->ops->map_delete_batch);

 err_put:
 	fdput(f);
@@ -3319,6 +3428,12 @@ SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, siz
 	case BPF_MAP_LOOKUP_BATCH:
 		err = bpf_map_do_batch(&attr, uattr, BPF_MAP_LOOKUP_BATCH);
 		break;
+	case BPF_MAP_UPDATE_BATCH:
+		err = bpf_map_do_batch(&attr, uattr, BPF_MAP_UPDATE_BATCH);
+		break;
+	case BPF_MAP_DELETE_BATCH:
+		err = bpf_map_do_batch(&attr, uattr, BPF_MAP_DELETE_BATCH);
+		break;
 	default:
 		err = -EINVAL;
 		break;
From patchwork Tue Jan 14 16:46:08 2020
X-Patchwork-Submitter: Brian Vazquez
X-Patchwork-Id: 1222929
X-Patchwork-Delegate: bpf@iogearbox.net
Date: Tue, 14 Jan 2020 08:46:08 -0800
Message-Id: <20200114164614.47029-5-brianvv@google.com>
In-Reply-To: <20200114164614.47029-1-brianvv@google.com>
References: <20200114164614.47029-1-brianvv@google.com>
Subject: [PATCH v4 bpf-next 4/9] bpf: add lookup and update batch ops to arraymap
From: Brian Vazquez
To: Brian Vazquez, Alexei Starovoitov, Daniel Borkmann, "David S. Miller"
Cc: Yonghong Song, Andrii Nakryiko, Stanislav Fomichev, Petar Penkov,
 Willem de Bruijn, linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
 bpf@vger.kernel.org

This adds the generic batch ops functionality to bpf arraymap; note
that since deletion is not a valid operation for arraymap, only the
lookup and update batch ops are added.

Signed-off-by: Brian Vazquez
Acked-by: Yonghong Song
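One consequence worth noting (not spelled out in the patch itself):
because array_map_ops leaves map_delete_batch unset, a
BPF_MAP_DELETE_BATCH call on an array map is caught by the BPF_DO_BATCH
NULL-callback check added by the generic batch plumbing and fails with
ENOTSUPP.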
---
 kernel/bpf/arraymap.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c
index f0d19bbb9211e..95d77770353c9 100644
--- a/kernel/bpf/arraymap.c
+++ b/kernel/bpf/arraymap.c
@@ -503,6 +503,8 @@ const struct bpf_map_ops array_map_ops = {
 	.map_mmap = array_map_mmap,
 	.map_seq_show_elem = array_map_seq_show_elem,
 	.map_check_btf = array_map_check_btf,
+	.map_lookup_batch = generic_map_lookup_batch,
+	.map_update_batch = generic_map_update_batch,
 };

 const struct bpf_map_ops percpu_array_map_ops = {

From patchwork Tue Jan 14 16:46:10 2020
X-Patchwork-Submitter: Brian Vazquez
X-Patchwork-Id: 1222935
X-Patchwork-Delegate: bpf@iogearbox.net
Date: Tue, 14 Jan 2020 08:46:10 -0800
Message-Id: <20200114164614.47029-7-brianvv@google.com>
In-Reply-To: <20200114164614.47029-1-brianvv@google.com>
References: <20200114164614.47029-1-brianvv@google.com>
Subject: [PATCH v4 bpf-next 5/9] bpf: add batch ops to all htab bpf map
From: Brian Vazquez
To: Brian Vazquez, Alexei Starovoitov, Daniel Borkmann, "David S. Miller"
Cc: Yonghong Song, Andrii Nakryiko, Stanislav Fomichev, Petar Penkov,
 Willem de Bruijn, linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
 bpf@vger.kernel.org
From: Yonghong Song

htab can't use the generic batch support due to some problematic
behaviours inherent to the data structure, i.e. while iterating the
bpf map a concurrent program might delete the next entry that the
batch was about to use; in that case there's no easy solution to
retrieve the next entry. The issue has been discussed multiple times
(see [1] and [2]).

The only way htab can be traversed without the problem described above
is by making sure that the map is traversed in whole buckets. This
commit implements those strict requirements for htab; the
implementation follows the same interaction as the generic support,
with some exceptions:

 - If the keys/values buffers are not big enough to traverse a bucket,
   ENOSPC will be returned.
 - out_batch contains the value of the next bucket in the iteration,
   not the next key, but this is transparent for the user since the
   user should never use out_batch for anything other than the bpf
   batch syscalls.

This commit implements BPF_MAP_LOOKUP_BATCH and adds support for the
new command BPF_MAP_LOOKUP_AND_DELETE_BATCH. Note that for the
update/delete batch ops it is possible to use the generic
implementations.

[1] https://lore.kernel.org/bpf/20190724165803.87470-1-brianvv@google.com/
[2] https://lore.kernel.org/bpf/20190906225434.3635421-1-yhs@fb.com/

Signed-off-by: Yonghong Song
Signed-off-by: Brian Vazquez
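For illustration, a hedged userspace sketch (not part of this patch) of
the buffer-growing loop the ENOSPC rule implies: since a whole bucket
must fit, a caller doubles the buffers on ENOSPC. Same raw-syscall
assumptions as the earlier sketches, 4-byte keys and values:

  /* Hedged sketch: drain a hash map via BPF_MAP_LOOKUP_AND_DELETE_BATCH,
   * growing the buffers whenever a single bucket does not fit (ENOSPC).
   */
  #include <errno.h>
  #include <stdlib.h>
  #include <unistd.h>
  #include <sys/syscall.h>
  #include <linux/bpf.h>

  static int drain_htab(int map_fd)
  {
          __u32 in_batch = 0, out_batch = 0, n = 64;
          __u32 *keys = malloc(n * sizeof(*keys));
          __u32 *vals = malloc(n * sizeof(*vals));
          int first = 1, err = -1;

          if (!keys || !vals)
                  goto out;
          for (;;) {
                  union bpf_attr attr = {};

                  attr.batch.map_fd = map_fd;
                  attr.batch.in_batch = first ? 0 : (__u64)(unsigned long)&in_batch;
                  attr.batch.out_batch = (__u64)(unsigned long)&out_batch;
                  attr.batch.keys = (__u64)(unsigned long)keys;
                  attr.batch.values = (__u64)(unsigned long)vals;
                  attr.batch.count = n;

                  err = syscall(__NR_bpf, BPF_MAP_LOOKUP_AND_DELETE_BATCH,
                                &attr, sizeof(attr));
                  if (err && errno == ENOSPC) {
                          __u32 *k, *v;

                          n *= 2; /* one bucket was bigger than n entries */
                          k = realloc(keys, n * sizeof(*k));
                          if (!k)
                                  goto out;
                          keys = k;
                          v = realloc(vals, n * sizeof(*v));
                          if (!v)
                                  goto out;
                          vals = v;
                          continue;
                  }
                  if (err && errno != ENOENT)
                          goto out;       /* real error */
                  /* ... consume attr.batch.count deleted entries ... */
                  if (err) {              /* ENOENT: whole map drained */
                          err = 0;
                          goto out;
                  }
                  in_batch = out_batch;
                  first = 0;
          }
  out:
          free(keys);
          free(vals);
          return err;
  }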
---
 include/linux/bpf.h      |   3 +
 include/uapi/linux/bpf.h |   1 +
 kernel/bpf/hashtab.c     | 258 +++++++++++++++++++++++++++++++++++++++
 kernel/bpf/syscall.c     |   9 +-
 4 files changed, 270 insertions(+), 1 deletion(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 05466ad6cf1c5..3517e32149a4f 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -46,6 +46,9 @@ struct bpf_map_ops {
 	void *(*map_lookup_elem_sys_only)(struct bpf_map *map, void *key);
 	int (*map_lookup_batch)(struct bpf_map *map, const union bpf_attr *attr,
 				union bpf_attr __user *uattr);
+	int (*map_lookup_and_delete_batch)(struct bpf_map *map,
+					   const union bpf_attr *attr,
+					   union bpf_attr __user *uattr);
 	int (*map_update_batch)(struct bpf_map *map, const union bpf_attr *attr,
 				union bpf_attr __user *uattr);
 	int (*map_delete_batch)(struct bpf_map *map, const union bpf_attr *attr,
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index e8df9ca680e0c..9536729a03d57 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -108,6 +108,7 @@ enum bpf_cmd {
 	BPF_MAP_FREEZE,
 	BPF_BTF_GET_NEXT_ID,
 	BPF_MAP_LOOKUP_BATCH,
+	BPF_MAP_LOOKUP_AND_DELETE_BATCH,
 	BPF_MAP_UPDATE_BATCH,
 	BPF_MAP_DELETE_BATCH,
 };
diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
index 22066a62c8c97..d9888acfd632b 100644
--- a/kernel/bpf/hashtab.c
+++ b/kernel/bpf/hashtab.c
@@ -17,6 +17,16 @@
 	(BPF_F_NO_PREALLOC | BPF_F_NO_COMMON_LRU | BPF_F_NUMA_NODE |	\
 	 BPF_F_ACCESS_MASK | BPF_F_ZERO_SEED)

+#define BATCH_OPS(_name)			\
+	.map_lookup_batch =			\
+	_name##_map_lookup_batch,		\
+	.map_lookup_and_delete_batch =		\
+	_name##_map_lookup_and_delete_batch,	\
+	.map_update_batch =			\
+	generic_map_update_batch,		\
+	.map_delete_batch =			\
+	generic_map_delete_batch
+
 struct bucket {
 	struct hlist_nulls_head head;
 	raw_spinlock_t lock;
@@ -1232,6 +1242,250 @@ static void htab_map_seq_show_elem(struct bpf_map *map, void *key,
 	rcu_read_unlock();
 }

+static int
+__htab_map_lookup_and_delete_batch(struct bpf_map *map,
+				   const union bpf_attr *attr,
+				   union bpf_attr __user *uattr,
+				   bool do_delete, bool is_lru_map,
+				   bool is_percpu)
+{
+	struct bpf_htab *htab = container_of(map, struct bpf_htab, map);
+	u32 bucket_cnt, total, key_size, value_size, roundup_key_size;
+	void *keys = NULL, *values = NULL, *value, *dst_key, *dst_val;
+	void __user *uvalues = u64_to_user_ptr(attr->batch.values);
+	void __user *ukeys = u64_to_user_ptr(attr->batch.keys);
+	void *ubatch = u64_to_user_ptr(attr->batch.in_batch);
+	u32 batch, max_count, size, bucket_size;
+	u64 elem_map_flags, map_flags;
+	struct hlist_nulls_head *head;
+	struct hlist_nulls_node *n;
+	unsigned long flags;
+	struct htab_elem *l;
+	struct bucket *b;
+	int ret = 0;
+
+	elem_map_flags = attr->batch.elem_flags;
+	if ((elem_map_flags & ~BPF_F_LOCK) ||
+	    ((elem_map_flags & BPF_F_LOCK) && !map_value_has_spin_lock(map)))
+		return -EINVAL;
+
+	map_flags = attr->batch.flags;
+	if (map_flags)
+		return -EINVAL;
+
+	max_count = attr->batch.count;
+	if (!max_count)
+		return 0;
+
+	batch = 0;
+	if (ubatch && copy_from_user(&batch, ubatch, sizeof(batch)))
+		return -EFAULT;
+
+	if (batch >= htab->n_buckets)
+		return -ENOENT;
+
+	key_size = htab->map.key_size;
+	roundup_key_size = round_up(htab->map.key_size, 8);
+	value_size = htab->map.value_size;
+	size = round_up(value_size, 8);
+	if (is_percpu)
+		value_size = size * num_possible_cpus();
+	total = 0;
+	bucket_size = 1;
+
+alloc:
+	/* We cannot do copy_from_user or copy_to_user inside
+	 * the rcu_read_lock. Allocate enough space here.
+	 */
+	keys = kvmalloc(key_size * bucket_size, GFP_USER | __GFP_NOWARN);
+	values = kvmalloc(value_size * bucket_size, GFP_USER | __GFP_NOWARN);
+	if (!keys || !values) {
+		ret = -ENOMEM;
+		goto out;
+	}
+
+again:
+	preempt_disable();
+	this_cpu_inc(bpf_prog_active);
+	rcu_read_lock();
+again_nocopy:
+	dst_key = keys;
+	dst_val = values;
+	b = &htab->buckets[batch];
+	head = &b->head;
+	raw_spin_lock_irqsave(&b->lock, flags);
+
+	bucket_cnt = 0;
+	hlist_nulls_for_each_entry_rcu(l, n, head, hash_node)
+		bucket_cnt++;
+
+	if (bucket_cnt > (max_count - total)) {
+		if (total == 0)
+			ret = -ENOSPC;
+		raw_spin_unlock_irqrestore(&b->lock, flags);
+		rcu_read_unlock();
+		this_cpu_dec(bpf_prog_active);
+		preempt_enable();
+		goto after_loop;
+	}
+
+	if (bucket_cnt > bucket_size) {
+		bucket_size = bucket_cnt;
+		raw_spin_unlock_irqrestore(&b->lock, flags);
+		rcu_read_unlock();
+		this_cpu_dec(bpf_prog_active);
+		preempt_enable();
+		kvfree(keys);
+		kvfree(values);
+		goto alloc;
+	}
+
+	hlist_nulls_for_each_entry_safe(l, n, head, hash_node) {
+		memcpy(dst_key, l->key, key_size);
+
+		if (is_percpu) {
+			int off = 0, cpu;
+			void __percpu *pptr;
+
+			pptr = htab_elem_get_ptr(l, map->key_size);
+			for_each_possible_cpu(cpu) {
+				bpf_long_memcpy(dst_val + off,
+						per_cpu_ptr(pptr, cpu), size);
+				off += size;
+			}
+		} else {
+			value = l->key + roundup_key_size;
+			if (elem_map_flags & BPF_F_LOCK)
+				copy_map_value_locked(map, dst_val, value,
+						      true);
+			else
+				copy_map_value(map, dst_val, value);
+			check_and_init_map_lock(map, dst_val);
+		}
+		if (do_delete) {
+			hlist_nulls_del_rcu(&l->hash_node);
+			if (is_lru_map)
+				bpf_lru_push_free(&htab->lru, &l->lru_node);
+			else
+				free_htab_elem(htab, l);
+		}
+		dst_key += key_size;
+		dst_val += value_size;
+	}
+
+	raw_spin_unlock_irqrestore(&b->lock, flags);
+	/* If we are not copying data, we can go to next bucket and avoid
+	 * unlocking the rcu.
+	 */
+	if (!bucket_cnt && (batch + 1 < htab->n_buckets)) {
+		batch++;
+		goto again_nocopy;
+	}
+
+	rcu_read_unlock();
+	this_cpu_dec(bpf_prog_active);
+	preempt_enable();
+	if (bucket_cnt && (copy_to_user(ukeys + total * key_size, keys,
+	    key_size * bucket_cnt) ||
+	    copy_to_user(uvalues + total * value_size, values,
+	    value_size * bucket_cnt))) {
+		ret = -EFAULT;
+		goto after_loop;
+	}
+
+	total += bucket_cnt;
+	batch++;
+	if (batch >= htab->n_buckets) {
+		ret = -ENOENT;
+		goto after_loop;
+	}
+	goto again;
+
+after_loop:
+	if (ret && (ret != -ENOENT && ret != -EFAULT))
+		goto out;
+
+	/* copy # of entries and next batch */
+	ubatch = u64_to_user_ptr(attr->batch.out_batch);
+	if (copy_to_user(ubatch, &batch, sizeof(batch)) ||
+	    put_user(total, &uattr->batch.count))
+		ret = -EFAULT;
+
+out:
+	kvfree(keys);
+	kvfree(values);
+	return ret;
+}
+
+static int
+htab_percpu_map_lookup_batch(struct bpf_map *map, const union bpf_attr *attr,
+			     union bpf_attr __user *uattr)
+{
+	return __htab_map_lookup_and_delete_batch(map, attr, uattr, false,
+						  false, true);
+}
+
+static int
+htab_percpu_map_lookup_and_delete_batch(struct bpf_map *map,
+					const union bpf_attr *attr,
+					union bpf_attr __user *uattr)
+{
+	return __htab_map_lookup_and_delete_batch(map, attr, uattr, true,
+						  false, true);
+}
+
+static int
+htab_map_lookup_batch(struct bpf_map *map, const union bpf_attr *attr,
+		      union bpf_attr __user *uattr)
+{
+	return __htab_map_lookup_and_delete_batch(map, attr, uattr, false,
+						  false, false);
+}
+
+static int
+htab_map_lookup_and_delete_batch(struct bpf_map *map,
+				 const union bpf_attr *attr,
+				 union bpf_attr __user *uattr)
+{
+	return __htab_map_lookup_and_delete_batch(map, attr, uattr, true,
+						  false, false);
+}
+
+static int
+htab_lru_percpu_map_lookup_batch(struct bpf_map *map,
+				 const union bpf_attr *attr,
+				 union bpf_attr __user *uattr)
+{
+	return __htab_map_lookup_and_delete_batch(map, attr, uattr, false,
+						  true, true);
+}
+
+static int
+htab_lru_percpu_map_lookup_and_delete_batch(struct bpf_map *map,
+					    const union bpf_attr *attr,
+					    union bpf_attr __user *uattr)
+{
+	return __htab_map_lookup_and_delete_batch(map, attr, uattr, true,
+						  true, true);
+}
+
+static int
+htab_lru_map_lookup_batch(struct bpf_map *map, const union bpf_attr *attr,
+			  union bpf_attr __user *uattr)
+{
+	return __htab_map_lookup_and_delete_batch(map, attr, uattr, false,
+						  true, false);
+}
+
+static int
+htab_lru_map_lookup_and_delete_batch(struct bpf_map *map,
+				     const union bpf_attr *attr,
+				     union bpf_attr __user *uattr)
+{
+	return __htab_map_lookup_and_delete_batch(map, attr, uattr, true,
+						  true, false);
+}
+
 const struct bpf_map_ops htab_map_ops = {
 	.map_alloc_check = htab_map_alloc_check,
 	.map_alloc = htab_map_alloc,
@@ -1242,6 +1496,7 @@ const struct bpf_map_ops htab_map_ops = {
 	.map_delete_elem = htab_map_delete_elem,
 	.map_gen_lookup = htab_map_gen_lookup,
 	.map_seq_show_elem = htab_map_seq_show_elem,
+	BATCH_OPS(htab),
 };

 const struct bpf_map_ops htab_lru_map_ops = {
@@ -1255,6 +1510,7 @@ const struct bpf_map_ops htab_lru_map_ops = {
 	.map_delete_elem = htab_lru_map_delete_elem,
 	.map_gen_lookup = htab_lru_map_gen_lookup,
 	.map_seq_show_elem = htab_map_seq_show_elem,
+	BATCH_OPS(htab_lru),
 };

 /* Called from eBPF program */
@@ -1368,6 +1624,7 @@ const struct bpf_map_ops htab_percpu_map_ops = {
 	.map_update_elem = htab_percpu_map_update_elem,
 	.map_delete_elem = htab_map_delete_elem,
 	.map_seq_show_elem = htab_percpu_map_seq_show_elem,
+	BATCH_OPS(htab_percpu),
 };

 const struct bpf_map_ops htab_lru_percpu_map_ops = {
@@ -1379,6 +1636,7 @@ const struct bpf_map_ops htab_lru_percpu_map_ops = {
 	.map_update_elem = htab_lru_percpu_map_update_elem,
 	.map_delete_elem = htab_lru_map_delete_elem,
 	.map_seq_show_elem = htab_percpu_map_seq_show_elem,
+	BATCH_OPS(htab_lru_percpu),
 };

 static int fd_htab_map_alloc_check(union bpf_attr *attr)
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 2f631eb67d00c..7e4f40657450f 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -3304,7 +3304,8 @@ static int bpf_map_do_batch(const union bpf_attr *attr,
 	if (IS_ERR(map))
 		return PTR_ERR(map);

-	if (cmd == BPF_MAP_LOOKUP_BATCH &&
+	if ((cmd == BPF_MAP_LOOKUP_BATCH ||
+	     cmd == BPF_MAP_LOOKUP_AND_DELETE_BATCH) &&
 	    !(map_get_sys_perms(map, f) & FMODE_CAN_READ)) {
 		err = -EPERM;
 		goto err_put;
@@ -3318,6 +3319,8 @@ static int bpf_map_do_batch(const union bpf_attr *attr,

 	if (cmd == BPF_MAP_LOOKUP_BATCH)
 		BPF_DO_BATCH(map->ops->map_lookup_batch);
+	else if (cmd == BPF_MAP_LOOKUP_AND_DELETE_BATCH)
+		BPF_DO_BATCH(map->ops->map_lookup_and_delete_batch);
 	else if (cmd == BPF_MAP_UPDATE_BATCH)
 		BPF_DO_BATCH(map->ops->map_update_batch);
 	else
@@ -3428,6 +3431,10 @@ SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, siz
 	case BPF_MAP_LOOKUP_BATCH:
 		err = bpf_map_do_batch(&attr, uattr, BPF_MAP_LOOKUP_BATCH);
 		break;
+	case BPF_MAP_LOOKUP_AND_DELETE_BATCH:
+		err = bpf_map_do_batch(&attr, uattr,
+				       BPF_MAP_LOOKUP_AND_DELETE_BATCH);
+		break;
 	case BPF_MAP_UPDATE_BATCH:
 		err = bpf_map_do_batch(&attr, uattr, BPF_MAP_UPDATE_BATCH);
 		break;

From patchwork Tue Jan 14 16:46:11 2020
X-Patchwork-Submitter: Brian Vazquez
X-Patchwork-Id: 1222932
X-Patchwork-Delegate: bpf@iogearbox.net
From patchwork Tue Jan 14 16:46:11 2020 X-Patchwork-Submitter: Brian Vazquez X-Patchwork-Id: 1222932 X-Patchwork-Delegate: bpf@iogearbox.net Date: Tue, 14 Jan 2020 08:46:11 -0800 In-Reply-To: <20200114164614.47029-1-brianvv@google.com> Message-Id: <20200114164614.47029-8-brianvv@google.com> References: <20200114164614.47029-1-brianvv@google.com> Subject: [PATCH v4 bpf-next 6/9] tools/bpf: sync uapi header bpf.h From: Brian Vazquez To: Brian Vazquez , Brian Vazquez , Alexei Starovoitov , Daniel Borkmann , "David S . Miller" Cc: Yonghong Song , Andrii Nakryiko , Stanislav Fomichev , Petar Penkov , Willem de Bruijn , linux-kernel@vger.kernel.org, netdev@vger.kernel.org, bpf@vger.kernel.org From: Yonghong Song sync uapi header include/uapi/linux/bpf.h to tools/include/uapi/linux/bpf.h Signed-off-by: Yonghong Song --- tools/include/uapi/linux/bpf.h | 21 +++++++++++++++++++++ 1 file changed, 21 insertions(+) diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index 52966e758fe59..9536729a03d57 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -107,6 +107,10 @@ enum bpf_cmd { BPF_MAP_LOOKUP_AND_DELETE_ELEM, BPF_MAP_FREEZE, BPF_BTF_GET_NEXT_ID, + BPF_MAP_LOOKUP_BATCH, + BPF_MAP_LOOKUP_AND_DELETE_BATCH, + BPF_MAP_UPDATE_BATCH, + BPF_MAP_DELETE_BATCH, }; enum bpf_map_type { @@ -420,6 +424,23 @@ union bpf_attr { __u64 flags; }; + struct { /* struct used by BPF_MAP_*_BATCH commands */ + __aligned_u64 in_batch; /* start batch, + * NULL to start from beginning + */ + __aligned_u64 out_batch; /* output: next start batch */ + __aligned_u64 keys; + __aligned_u64 values; + __u32 count; /* input/output: + * input: # of key/value + * elements + * output: # of filled elements + */ + __u32 map_fd; + __u64 elem_flags; + __u64 flags; + } batch; + struct { /* anonymous struct used by BPF_PROG_LOAD command */ __u32 prog_type; /* one of enum bpf_prog_type */ __u32 insn_cnt;
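The batch struct added to union bpf_attr above is what userspace fills when driving the new commands through the bpf(2) syscall directly. A minimal sketch of one BPF_MAP_LOOKUP_BATCH call follows; the helper name is illustrative, error handling is elided, and it assumes headers that already contain this series:

#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/bpf.h>

static int lookup_batch_once(int map_fd, __u32 *in_batch, __u32 *out_batch,
			     void *keys, void *values, __u32 *count)
{
	union bpf_attr attr;
	int err;

	memset(&attr, 0, sizeof(attr));
	attr.batch.map_fd = map_fd;
	/* NULL in_batch means: start from the beginning of the map */
	attr.batch.in_batch = (__u64)(unsigned long)in_batch;
	/* the kernel writes the cursor for the next call here */
	attr.batch.out_batch = (__u64)(unsigned long)out_batch;
	attr.batch.keys = (__u64)(unsigned long)keys;
	attr.batch.values = (__u64)(unsigned long)values;
	attr.batch.count = *count;	/* in: buffer capacity in elements */

	err = syscall(__NR_bpf, BPF_MAP_LOOKUP_BATCH, &attr, sizeof(attr));
	*count = attr.batch.count;	/* out: number of elements filled */
	return err;
}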
From patchwork Tue Jan 14 16:46:12 2020 X-Patchwork-Submitter: Brian Vazquez X-Patchwork-Id: 1222934 X-Patchwork-Delegate: bpf@iogearbox.net Date: Tue, 14 Jan 2020 08:46:12 -0800 In-Reply-To: <20200114164614.47029-1-brianvv@google.com> Message-Id: <20200114164614.47029-9-brianvv@google.com> References: <20200114164614.47029-1-brianvv@google.com> Subject: [PATCH v4 bpf-next 7/9] libbpf: add libbpf support to batch ops From: Brian Vazquez To: Brian Vazquez , Brian Vazquez , Alexei Starovoitov , Daniel Borkmann , "David S . Miller" Cc: Yonghong Song , Andrii Nakryiko , Stanislav Fomichev , Petar Penkov , Willem de Bruijn , linux-kernel@vger.kernel.org, netdev@vger.kernel.org, bpf@vger.kernel.org From: Yonghong Song Added four libbpf API functions to support map batch operations: . int bpf_map_delete_batch( ... ) . int bpf_map_lookup_batch( ... ) . int bpf_map_lookup_and_delete_batch( ... ) . int bpf_map_update_batch( ...
) Signed-off-by: Yonghong Song --- tools/lib/bpf/bpf.c | 60 ++++++++++++++++++++++++++++++++++++++++ tools/lib/bpf/bpf.h | 22 +++++++++++++++ tools/lib/bpf/libbpf.map | 4 +++ 3 files changed, 86 insertions(+) diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c index 500afe478e94a..12ce8d275f7dc 100644 --- a/tools/lib/bpf/bpf.c +++ b/tools/lib/bpf/bpf.c @@ -452,6 +452,66 @@ int bpf_map_freeze(int fd) return sys_bpf(BPF_MAP_FREEZE, &attr, sizeof(attr)); } +static int bpf_map_batch_common(int cmd, int fd, void *in_batch, + void *out_batch, void *keys, void *values, + __u32 *count, + const struct bpf_map_batch_opts *opts) +{ + union bpf_attr attr = {}; + int ret; + + if (!OPTS_VALID(opts, bpf_map_batch_opts)) + return -EINVAL; + + memset(&attr, 0, sizeof(attr)); + attr.batch.map_fd = fd; + attr.batch.in_batch = ptr_to_u64(in_batch); + attr.batch.out_batch = ptr_to_u64(out_batch); + attr.batch.keys = ptr_to_u64(keys); + attr.batch.values = ptr_to_u64(values); + if (count) + attr.batch.count = *count; + attr.batch.elem_flags = OPTS_GET(opts, elem_flags, 0); + attr.batch.flags = OPTS_GET(opts, flags, 0); + + ret = sys_bpf(cmd, &attr, sizeof(attr)); + if (count) + *count = attr.batch.count; + + return ret; +} + +int bpf_map_delete_batch(int fd, void *keys, __u32 *count, + const struct bpf_map_batch_opts *opts) +{ + return bpf_map_batch_common(BPF_MAP_DELETE_BATCH, fd, NULL, + NULL, keys, NULL, count, opts); +} + +int bpf_map_lookup_batch(int fd, void *in_batch, void *out_batch, void *keys, + void *values, __u32 *count, + const struct bpf_map_batch_opts *opts) +{ + return bpf_map_batch_common(BPF_MAP_LOOKUP_BATCH, fd, in_batch, + out_batch, keys, values, count, opts); +} + +int bpf_map_lookup_and_delete_batch(int fd, void *in_batch, void *out_batch, + void *keys, void *values, __u32 *count, + const struct bpf_map_batch_opts *opts) +{ + return bpf_map_batch_common(BPF_MAP_LOOKUP_AND_DELETE_BATCH, + fd, in_batch, out_batch, keys, values, + count, opts); +} + +int bpf_map_update_batch(int fd, void *keys, void *values, __u32 *count, + const struct bpf_map_batch_opts *opts) +{ + return bpf_map_batch_common(BPF_MAP_UPDATE_BATCH, fd, NULL, NULL, + keys, values, count, opts); +} + int bpf_obj_pin(int fd, const char *pathname) { union bpf_attr attr; diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h index 56341d117e5ba..b976e77316cca 100644 --- a/tools/lib/bpf/bpf.h +++ b/tools/lib/bpf/bpf.h @@ -127,6 +127,28 @@ LIBBPF_API int bpf_map_lookup_and_delete_elem(int fd, const void *key, LIBBPF_API int bpf_map_delete_elem(int fd, const void *key); LIBBPF_API int bpf_map_get_next_key(int fd, const void *key, void *next_key); LIBBPF_API int bpf_map_freeze(int fd); + +struct bpf_map_batch_opts { + size_t sz; /* size of this struct for forward/backward compatibility */ + __u64 elem_flags; + __u64 flags; +}; +#define bpf_map_batch_opts__last_field flags + +LIBBPF_API int bpf_map_delete_batch(int fd, void *keys, + __u32 *count, + const struct bpf_map_batch_opts *opts); +LIBBPF_API int bpf_map_lookup_batch(int fd, void *in_batch, void *out_batch, + void *keys, void *values, __u32 *count, + const struct bpf_map_batch_opts *opts); +LIBBPF_API int bpf_map_lookup_and_delete_batch(int fd, void *in_batch, + void *out_batch, void *keys, + void *values, __u32 *count, + const struct bpf_map_batch_opts *opts); +LIBBPF_API int bpf_map_update_batch(int fd, void *keys, void *values, + __u32 *count, + const struct bpf_map_batch_opts *opts); + LIBBPF_API int bpf_obj_pin(int fd, const char *pathname); LIBBPF_API int 
bpf_obj_get(const char *pathname); diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map index a19f04e6e3d96..1902a0fc6afcc 100644 --- a/tools/lib/bpf/libbpf.map +++ b/tools/lib/bpf/libbpf.map @@ -214,6 +214,10 @@ LIBBPF_0.0.7 { btf_dump__emit_type_decl; bpf_link__disconnect; bpf_map__attach_struct_ops; + bpf_map_delete_batch; + bpf_map_lookup_and_delete_batch; + bpf_map_lookup_batch; + bpf_map_update_batch; bpf_object__find_program_by_name; bpf_object__attach_skeleton; bpf_object__destroy_skeleton;
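Taken together, a typical consumer of this API drains a map by feeding each call's out_batch cursor back in as in_batch until the kernel reports ENOENT. A sketch against the functions declared above, mirroring the loop structure of the selftests in the next patch (the helper name and the int key/value layout are illustrative):

#include <errno.h>
#include <bpf/bpf.h>
#include <bpf/libbpf.h>

/* Drain up to 'max' int->int entries from map_fd with the batch API.
 * Returns the number of entries read and deleted, or -errno on failure.
 */
static int drain_map(int map_fd, int *keys, int *vals, __u32 max)
{
	DECLARE_LIBBPF_OPTS(bpf_map_batch_opts, opts,
		.elem_flags = 0,
		.flags = 0,
	);
	__u32 batch, count, total = 0;
	int err;

	do {
		count = max - total;
		if (!count)
			break;
		/* first call passes NULL in_batch to start from the top */
		err = bpf_map_lookup_and_delete_batch(map_fd,
						      total ? &batch : NULL,
						      &batch, keys + total,
						      vals + total,
						      &count, &opts);
		if (err && errno != ENOENT)
			return -errno;
		total += count;	/* ENOENT can still deliver a partial batch */
	} while (!err);

	return total;
}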
From patchwork Tue Jan 14 16:46:13 2020 X-Patchwork-Submitter: Brian Vazquez X-Patchwork-Id: 1222939 X-Patchwork-Delegate: bpf@iogearbox.net Date: Tue, 14 Jan 2020 08:46:13 -0800 In-Reply-To: <20200114164614.47029-1-brianvv@google.com> Message-Id: <20200114164614.47029-10-brianvv@google.com> References: <20200114164614.47029-1-brianvv@google.com> Subject: [PATCH v4 bpf-next 8/9] selftests/bpf: add batch ops testing for htab and htab_percpu map From: Brian Vazquez To: Brian Vazquez , Brian Vazquez , Alexei Starovoitov , Daniel Borkmann , "David S . Miller" Cc: Yonghong Song , Andrii Nakryiko , Stanislav Fomichev , Petar Penkov , Willem de Bruijn , linux-kernel@vger.kernel.org, netdev@vger.kernel.org, bpf@vger.kernel.org From: Yonghong Song Tested bpf_map_lookup_batch(), bpf_map_lookup_and_delete_batch(), bpf_map_update_batch(), and bpf_map_delete_batch() functionality. $ ./test_maps ... test_htab_map_batch_ops:PASS test_htab_percpu_map_batch_ops:PASS ... Signed-off-by: Yonghong Song Signed-off-by: Brian Vazquez --- .../bpf/map_tests/htab_map_batch_ops.c | 285 ++++++++++++++++++ 1 file changed, 285 insertions(+) create mode 100644 tools/testing/selftests/bpf/map_tests/htab_map_batch_ops.c diff --git a/tools/testing/selftests/bpf/map_tests/htab_map_batch_ops.c b/tools/testing/selftests/bpf/map_tests/htab_map_batch_ops.c new file mode 100644 index 0000000000000..670d7e6574899 --- /dev/null +++ b/tools/testing/selftests/bpf/map_tests/htab_map_batch_ops.c @@ -0,0 +1,285 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Copyright (c) 2019 Facebook */ +#include <stdio.h> +#include <errno.h> +#include <string.h> + +#include <bpf/bpf.h> +#include <bpf/libbpf.h> + +#include <bpf_util.h> +#include <test_maps.h> + +static void map_batch_update(int map_fd, __u32 max_entries, int *keys, + void *values, bool is_pcpu) +{ + typedef BPF_DECLARE_PERCPU(int, value); + value *v = NULL; + int i, j, err; + + DECLARE_LIBBPF_OPTS(bpf_map_batch_opts, opts, + .elem_flags = 0, + .flags = 0, + ); + + if (is_pcpu) + v = (value *)values; + + for (i = 0; i < max_entries; i++) { + keys[i] = i + 1; + if (is_pcpu) + for (j = 0; j < bpf_num_possible_cpus(); j++) + bpf_percpu(v[i], j) = i + 2 + j; + else + ((int *)values)[i] = i + 2; + } + + err = bpf_map_update_batch(map_fd, keys, values, &max_entries, &opts); + CHECK(err, "bpf_map_update_batch()", "error:%s\n", strerror(errno)); +} + +static void map_batch_verify(int *visited, __u32 max_entries, + int *keys, void *values, bool is_pcpu) +{ + typedef BPF_DECLARE_PERCPU(int, value); + value *v = NULL; + int i, j; + + if (is_pcpu) + v = (value *)values; + + memset(visited, 0, max_entries * sizeof(*visited)); + for (i = 0; i < max_entries; i++) { + + if (is_pcpu) { + for (j = 0; j < bpf_num_possible_cpus(); j++) { + CHECK(keys[i] + 1 + j != bpf_percpu(v[i], j), + "key/value checking", + "error: i %d j %d key %d value %d\n", + i, j, keys[i], bpf_percpu(v[i], j)); + } + } else { + CHECK(keys[i] + 1 != ((int *)values)[i], + "key/value checking", + "error: i %d key %d value %d\n", i, keys[i], + ((int *)values)[i]); + } + + visited[i] = 1; + + } + for (i = 0; i < max_entries; i++) { + CHECK(visited[i] != 1, "visited checking", + "error: keys array at index %d missing\n", i); + } +} + +void __test_map_lookup_and_delete_batch(bool is_pcpu) +{ + int map_type = is_pcpu ?
BPF_MAP_TYPE_PERCPU_HASH : BPF_MAP_TYPE_HASH; + struct bpf_create_map_attr xattr = { + .name = "hash_map", + .map_type = map_type, + .key_size = sizeof(int), + .value_size = sizeof(int), + }; + __u32 batch, count, total, total_success; + typedef BPF_DECLARE_PERCPU(int, value); + int map_fd, *keys, *visited, key; + const __u32 max_entries = 10; + int err, step, value_size; + value pcpu_values[max_entries]; + bool nospace_err; + void *values; + + DECLARE_LIBBPF_OPTS(bpf_map_batch_opts, opts, + .elem_flags = 0, + .flags = 0, + ); + + xattr.max_entries = max_entries; + map_fd = bpf_create_map_xattr(&xattr); + CHECK(map_fd == -1, + "bpf_create_map_xattr()", "error:%s\n", strerror(errno)); + + value_size = is_pcpu ? sizeof(value) : sizeof(int); + keys = malloc(max_entries * sizeof(int)); + if (is_pcpu) + values = pcpu_values; + else + values = malloc(max_entries * sizeof(int)); + visited = malloc(max_entries * sizeof(int)); + CHECK(!keys || !values || !visited, "malloc()", + "error:%s\n", strerror(errno)); + + /* test 1: lookup/delete an empty hash table, -ENOENT */ + count = max_entries; + err = bpf_map_lookup_and_delete_batch(map_fd, NULL, &batch, keys, + values, &count, &opts); + CHECK((err && errno != ENOENT), "empty map", + "error: %s\n", strerror(errno)); + + /* populate elements to the map */ + map_batch_update(map_fd, max_entries, keys, values, is_pcpu); + + /* test 2: lookup/delete with count = 0, success */ + count = 0; + err = bpf_map_lookup_and_delete_batch(map_fd, NULL, &batch, keys, + values, &count, &opts); + CHECK(err, "count = 0", "error: %s\n", strerror(errno)); + + /* test 3: lookup/delete with count = max_entries, success */ + memset(keys, 0, max_entries * sizeof(*keys)); + memset(values, 0, max_entries * value_size); + count = max_entries; + err = bpf_map_lookup_and_delete_batch(map_fd, NULL, &batch, keys, + values, &count, &opts); + CHECK((err && errno != ENOENT), "count = max_entries", + "error: %s\n", strerror(errno)); + CHECK(count != max_entries, "count = max_entries", + "count = %u, max_entries = %u\n", count, max_entries); + map_batch_verify(visited, max_entries, keys, values, is_pcpu); + + /* bpf_map_get_next_key() should return -ENOENT for an empty map. */ + err = bpf_map_get_next_key(map_fd, NULL, &key); + CHECK(!err, "bpf_map_get_next_key()", "error: %s\n", strerror(errno)); + + /* test 4: lookup/delete in a loop with various steps. */ + total_success = 0; + for (step = 1; step < max_entries; step++) { + map_batch_update(map_fd, max_entries, keys, values, is_pcpu); + memset(keys, 0, max_entries * sizeof(*keys)); + memset(values, 0, max_entries * value_size); + total = 0; + /* iteratively lookup/delete elements with 'step' + * elements each + */ + count = step; + nospace_err = false; + while (true) { + err = bpf_map_lookup_batch(map_fd, + total ? &batch : NULL, + &batch, keys + total, + values + + total * value_size, + &count, &opts); + /* It is possible that we are failing due to buffer size + * not big enough. In such cases, let us just exit and + * go with large steps. Note that a buffer size with + * max_entries should always work.
+ */ + if (err && errno == ENOSPC) { + nospace_err = true; + break; + } + + CHECK((err && errno != ENOENT), "lookup with steps", + "error: %s\n", strerror(errno)); + + total += count; + if (err) + break; + + } + if (nospace_err == true) + continue; + + CHECK(total != max_entries, "lookup with steps", + "total = %u, max_entries = %u\n", total, max_entries); + map_batch_verify(visited, max_entries, keys, values, is_pcpu); + + total = 0; + count = step; + while (total < max_entries) { + if (max_entries - total < step) + count = max_entries - total; + err = bpf_map_delete_batch(map_fd, + keys + total, + &count, &opts); + CHECK((err && errno != ENOENT), "delete batch", + "error: %s\n", strerror(errno)); + total += count; + if (err) + break; + } + CHECK(total != max_entries, "delete with steps", + "total = %u, max_entries = %u\n", total, max_entries); + + /* check map is empty, errno == ENOENT */ + err = bpf_map_get_next_key(map_fd, NULL, &key); + CHECK(!err || errno != ENOENT, "bpf_map_get_next_key()", + "error: %s\n", strerror(errno)); + + /* iteratively lookup/delete elements with 'step' + * elements each + */ + map_batch_update(map_fd, max_entries, keys, values, is_pcpu); + memset(keys, 0, max_entries * sizeof(*keys)); + memset(values, 0, max_entries * value_size); + total = 0; + count = step; + nospace_err = false; + while (true) { + err = bpf_map_lookup_and_delete_batch(map_fd, + total ? &batch : NULL, + &batch, keys + total, + values + + total * value_size, + &count, &opts); + /* It is possible that we are failing due to buffer size + * not big enough. In such cases, let us just exit and + * go with large steps. Note that a buffer size with + * max_entries should always work. + */ + if (err && errno == ENOSPC) { + nospace_err = true; + break; + } + + CHECK((err && errno != ENOENT), "lookup with steps", + "error: %s\n", strerror(errno)); + + total += count; + if (err) + break; + } + + if (nospace_err == true) + continue; + + CHECK(total != max_entries, "lookup/delete with steps", + "total = %u, max_entries = %u\n", total, max_entries); + + map_batch_verify(visited, max_entries, keys, values, is_pcpu); + err = bpf_map_get_next_key(map_fd, NULL, &key); + CHECK(!err, "bpf_map_get_next_key()", "error: %s\n", + strerror(errno)); + + total_success++; + } + + CHECK(total_success == 0, "check total_success", + "unexpected failure\n"); + free(keys); + free(visited); + if (!is_pcpu) + free(values); +} + +void htab_map_batch_ops(void) +{ + __test_map_lookup_and_delete_batch(false); + printf("test_%s:PASS\n", __func__); +} + +void htab_percpu_map_batch_ops(void) +{ + __test_map_lookup_and_delete_batch(true); + printf("test_%s:PASS\n", __func__); +} + +void test_htab_map_batch_ops(void) +{ + htab_map_batch_ops(); + htab_percpu_map_batch_ops(); +}
From patchwork Tue Jan 14 16:46:14 2020 X-Patchwork-Submitter: Brian Vazquez X-Patchwork-Id: 1222937 X-Patchwork-Delegate: bpf@iogearbox.net Date: Tue, 14 Jan 2020 08:46:14 -0800 In-Reply-To: <20200114164614.47029-1-brianvv@google.com> Message-Id: <20200114164614.47029-11-brianvv@google.com> References: <20200114164614.47029-1-brianvv@google.com> Subject: [PATCH v4 bpf-next 9/9] selftests/bpf: add batch ops testing to array bpf map From: Brian Vazquez To: Brian Vazquez , Brian Vazquez , Alexei Starovoitov , Daniel Borkmann , "David S . Miller" Cc: Yonghong Song , Andrii Nakryiko , Stanislav Fomichev , Petar Penkov , Willem de Bruijn , linux-kernel@vger.kernel.org, netdev@vger.kernel.org, bpf@vger.kernel.org Tested bpf_map_lookup_batch() and bpf_map_update_batch() functionality. $ ./test_maps ... test_array_map_batch_ops:PASS ...
Signed-off-by: Brian Vazquez Signed-off-by: Yonghong Song --- .../bpf/map_tests/array_map_batch_ops.c | 131 ++++++++++++++++++ 1 file changed, 131 insertions(+) create mode 100644 tools/testing/selftests/bpf/map_tests/array_map_batch_ops.c diff --git a/tools/testing/selftests/bpf/map_tests/array_map_batch_ops.c b/tools/testing/selftests/bpf/map_tests/array_map_batch_ops.c new file mode 100644 index 0000000000000..05b7caea6a444 --- /dev/null +++ b/tools/testing/selftests/bpf/map_tests/array_map_batch_ops.c @@ -0,0 +1,131 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include <stdio.h> +#include <errno.h> +#include <string.h> + +#include <bpf/bpf.h> +#include <bpf/libbpf.h> + +#include <test_maps.h> + +static void map_batch_update(int map_fd, __u32 max_entries, int *keys, + int *values) +{ + int i, err; + + DECLARE_LIBBPF_OPTS(bpf_map_batch_opts, opts, + .elem_flags = 0, + .flags = 0, + ); + + for (i = 0; i < max_entries; i++) { + keys[i] = i; + values[i] = i + 1; + } + + err = bpf_map_update_batch(map_fd, keys, values, &max_entries, &opts); + CHECK(err, "bpf_map_update_batch()", "error:%s\n", strerror(errno)); +} + +static void map_batch_verify(int *visited, __u32 max_entries, + int *keys, int *values) +{ + int i; + + memset(visited, 0, max_entries * sizeof(*visited)); + for (i = 0; i < max_entries; i++) { + CHECK(keys[i] + 1 != values[i], "key/value checking", + "error: i %d key %d value %d\n", i, keys[i], values[i]); + visited[i] = 1; + } + for (i = 0; i < max_entries; i++) { + CHECK(visited[i] != 1, "visited checking", + "error: keys array at index %d missing\n", i); + } +} + +void test_array_map_batch_ops(void) +{ + struct bpf_create_map_attr xattr = { + .name = "array_map", + .map_type = BPF_MAP_TYPE_ARRAY, + .key_size = sizeof(int), + .value_size = sizeof(int), + }; + int map_fd, *keys, *values, *visited; + __u32 count, total, total_success; + const __u32 max_entries = 10000; + bool nospace_err; + __u64 batch = 0; + int err, step; + + DECLARE_LIBBPF_OPTS(bpf_map_batch_opts, opts, + .elem_flags = 0, + .flags = 0, + ); + + xattr.max_entries = max_entries; + map_fd = bpf_create_map_xattr(&xattr); + CHECK(map_fd == -1, + "bpf_create_map_xattr()", "error:%s\n", strerror(errno)); + + keys = malloc(max_entries * sizeof(int)); + values = malloc(max_entries * sizeof(int)); + visited = malloc(max_entries * sizeof(int)); + CHECK(!keys || !values || !visited, "malloc()", "error:%s\n", + strerror(errno)); + + /* populate elements to the map */ + map_batch_update(map_fd, max_entries, keys, values); + + /* test 1: lookup in a loop with various steps. */ + total_success = 0; + for (step = 1; step < max_entries; step++) { + map_batch_update(map_fd, max_entries, keys, values); + map_batch_verify(visited, max_entries, keys, values); + memset(keys, 0, max_entries * sizeof(*keys)); + memset(values, 0, max_entries * sizeof(*values)); + batch = 0; + total = 0; + /* iteratively lookup elements with 'step' + * elements each. + */ + count = step; + nospace_err = false; + while (true) { + err = bpf_map_lookup_batch(map_fd, + total ?
&batch : NULL, &batch, + keys + total, + values + total, + &count, &opts); + + CHECK((err && errno != ENOENT), "lookup with steps", + "error: %s\n", strerror(errno)); + + total += count; + if (err) + break; + + } + + if (nospace_err == true) + continue; + + CHECK(total != max_entries, "lookup with steps", + "total = %u, max_entries = %u\n", total, max_entries); + + map_batch_verify(visited, max_entries, keys, values); + + total_success++; + } + + CHECK(total_success == 0, "check total_success", + "unexpected failure\n"); + + printf("%s:PASS\n", __func__); + + free(keys); + free(values); + free(visited); +}