From patchwork Fri Oct 18 04:07:34 2019
X-Patchwork-Submitter: Toshiaki Makita
X-Patchwork-Id: 1179122
X-Patchwork-Delegate: bpf@iogearbox.net
From: Toshiaki Makita
To: Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu, Yonghong Song, "David S. Miller", Jakub Kicinski, Jesper Dangaard Brouer, John Fastabend, Jamal Hadi Salim, Cong Wang, Jiri Pirko, Pablo Neira Ayuso, Jozsef Kadlecsik, Florian Westphal, Pravin B Shelar
Cc: Toshiaki Makita, netdev@vger.kernel.org, bpf@vger.kernel.org, William Tu, Stanislav Fomichev
Subject: [RFC PATCH v2 bpf-next 01/15] xdp_flow: Add skeleton of XDP based flow offload driver
Date: Fri, 18 Oct 2019 13:07:34 +0900
Message-Id: <20191018040748.30593-2-toshiaki.makita1@gmail.com>
In-Reply-To: <20191018040748.30593-1-toshiaki.makita1@gmail.com>
References: <20191018040748.30593-1-toshiaki.makita1@gmail.com>
X-Mailing-List: bpf@vger.kernel.org

Add the flow offload driver, xdp_flow_core.c, and a skeleton of the UMH
handling mechanism. The driver is not called from anywhere yet.

xdp_flow_setup_block() in xdp_flow_core.c is meant to be called when a
net device is bound to a flow block, e.g. when an ingress qdisc is
added. It loads the xdp_flow kernel module, and the kmod provides
callbacks for the setup phase and the flow insertion phase.
xdp_flow_setup() in the kmod will be called from xdp_flow_setup_block()
when an ingress qdisc is added, and xdp_flow_setup_block_cb() will be
called when a tc flower filter is added. The former requests the UMH to
load the eBPF program; the latter requests the UMH to populate maps for
the flow tables.

No actual processing is implemented in this patch; the following
commits implement it. The overall UMH handling mechanism is modeled
after bpfilter.
Signed-off-by: Toshiaki Makita
---
 net/Kconfig                      |   1 +
 net/Makefile                     |   1 +
 net/xdp_flow/.gitignore          |   1 +
 net/xdp_flow/Kconfig             |  16 +++
 net/xdp_flow/Makefile            |  31 +++++
 net/xdp_flow/msgfmt.h            | 102 ++++++++++++++++
 net/xdp_flow/xdp_flow.h          |  23 ++++
 net/xdp_flow/xdp_flow_core.c     | 127 ++++++++++++++++++++
 net/xdp_flow/xdp_flow_kern_mod.c | 250 +++++++++++++++++++++++++++++++++++++++
 net/xdp_flow/xdp_flow_umh.c      | 116 ++++++++++++++++++
 net/xdp_flow/xdp_flow_umh_blob.S |   7 ++
 11 files changed, 675 insertions(+)
 create mode 100644 net/xdp_flow/.gitignore
 create mode 100644 net/xdp_flow/Kconfig
 create mode 100644 net/xdp_flow/Makefile
 create mode 100644 net/xdp_flow/msgfmt.h
 create mode 100644 net/xdp_flow/xdp_flow.h
 create mode 100644 net/xdp_flow/xdp_flow_core.c
 create mode 100644 net/xdp_flow/xdp_flow_kern_mod.c
 create mode 100644 net/xdp_flow/xdp_flow_umh.c
 create mode 100644 net/xdp_flow/xdp_flow_umh_blob.S

diff --git a/net/Kconfig b/net/Kconfig
index 3101bfcb..369ecd0 100644
--- a/net/Kconfig
+++ b/net/Kconfig
@@ -206,6 +206,7 @@ source "net/bridge/netfilter/Kconfig"
 endif
 source "net/bpfilter/Kconfig"
+source "net/xdp_flow/Kconfig"
 source "net/dccp/Kconfig"
 source "net/sctp/Kconfig"

diff --git a/net/Makefile b/net/Makefile
index 449fc0b..b78d1ef 100644
--- a/net/Makefile
+++ b/net/Makefile
@@ -87,3 +87,4 @@ endif
 obj-$(CONFIG_QRTR) += qrtr/
 obj-$(CONFIG_NET_NCSI) += ncsi/
 obj-$(CONFIG_XDP_SOCKETS) += xdp/
+obj-$(CONFIG_XDP_FLOW) += xdp_flow/

diff --git a/net/xdp_flow/.gitignore b/net/xdp_flow/.gitignore
new file mode 100644
index 0000000..8cad817
--- /dev/null
+++ b/net/xdp_flow/.gitignore
@@ -0,0 +1 @@
+xdp_flow_umh

diff --git a/net/xdp_flow/Kconfig b/net/xdp_flow/Kconfig
new file mode 100644
index 0000000..a4d79fa
--- /dev/null
+++ b/net/xdp_flow/Kconfig
@@ -0,0 +1,16 @@
+# SPDX-License-Identifier: GPL-2.0-only
+menuconfig XDP_FLOW
+	bool "XDP based flow offload engine (XDP_FLOW)"
+	depends on NET && BPF_SYSCALL && MEMFD_CREATE
+	help
+	  This builds experimental
xdp_flow framework that is aiming to + provide flow software offload functionality via XDP + +if XDP_FLOW +config XDP_FLOW_UMH + tristate "xdp_flow kernel module with user mode helper" + depends on $(success,$(srctree)/scripts/cc-can-link.sh $(CC)) + default m + help + This builds xdp_flow kernel module with embedded user mode helper +endif diff --git a/net/xdp_flow/Makefile b/net/xdp_flow/Makefile new file mode 100644 index 0000000..f6138c2 --- /dev/null +++ b/net/xdp_flow/Makefile @@ -0,0 +1,31 @@ +# SPDX-License-Identifier: GPL-2.0 + +obj-$(CONFIG_XDP_FLOW) += xdp_flow_core.o + +ifeq ($(CONFIG_XDP_FLOW_UMH), y) +# builtin xdp_flow_umh should be compiled with -static +# since rootfs isn't mounted at the time of __init +# function is called and do_execv won't find elf interpreter +STATIC := -static +endif + +quiet_cmd_cc_user = CC $@ + cmd_cc_user = $(CC) -Wall -Wmissing-prototypes -O2 -std=gnu89 \ + -I$(srctree)/tools/include/ \ + -c -o $@ $< + +quiet_cmd_ld_user = LD $@ + cmd_ld_user = $(CC) $(STATIC) -o $@ $^ + +$(obj)/xdp_flow_umh.o: $(src)/xdp_flow_umh.c FORCE + $(call if_changed,cc_user) + +$(obj)/xdp_flow_umh: $(obj)/xdp_flow_umh.o + $(call if_changed,ld_user) + +clean-files := xdp_flow_umh + +$(obj)/xdp_flow_umh_blob.o: $(obj)/xdp_flow_umh + +obj-$(CONFIG_XDP_FLOW_UMH) += xdp_flow.o +xdp_flow-objs += xdp_flow_kern_mod.o xdp_flow_umh_blob.o diff --git a/net/xdp_flow/msgfmt.h b/net/xdp_flow/msgfmt.h new file mode 100644 index 0000000..97d8490 --- /dev/null +++ b/net/xdp_flow/msgfmt.h @@ -0,0 +1,102 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _NET_XDP_FLOW_MSGFMT_H +#define _NET_XDP_FLOW_MSGFMT_H + +#include +#include +#include +#include + +#define MAX_XDP_FLOW_ACTIONS 32 + +enum xdp_flow_action_id { + /* ABORT if 0, i.e. 
uninitialized */ + XDP_FLOW_ACTION_ACCEPT = 1, + XDP_FLOW_ACTION_DROP, + XDP_FLOW_ACTION_REDIRECT, + XDP_FLOW_ACTION_VLAN_PUSH, + XDP_FLOW_ACTION_VLAN_POP, + XDP_FLOW_ACTION_VLAN_MANGLE, + XDP_FLOW_ACTION_MANGLE, + XDP_FLOW_ACTION_CSUM, + NR_XDP_FLOW_ACTION, +}; + +struct xdp_flow_action { + enum xdp_flow_action_id id; + union { + int ifindex; /* REDIRECT */ + struct { /* VLAN */ + __be16 proto; + __be16 tci; + } vlan; + }; +}; + +struct xdp_flow_actions { + unsigned int num_actions; + struct xdp_flow_action actions[MAX_XDP_FLOW_ACTIONS]; +}; + +struct xdp_flow_key { + struct { + __u8 dst[ETH_ALEN] __aligned(2); + __u8 src[ETH_ALEN] __aligned(2); + __be16 type; + } eth; + struct { + __be16 tpid; + __be16 tci; + } vlan; + struct { + __u8 proto; + __u8 ttl; + __u8 tos; + __u8 frag; + } ip; + union { + struct { + __be32 src; + __be32 dst; + } ipv4; + struct { + struct in6_addr src; + struct in6_addr dst; + } ipv6; + }; + struct { + __be16 src; + __be16 dst; + } l4port; + struct { + __be16 flags; + } tcp; +} __aligned(BITS_PER_LONG / 8); + +struct xdp_flow { + struct xdp_flow_key key; + struct xdp_flow_key mask; + struct xdp_flow_actions actions; + __u16 priority; +}; + +enum xdp_flow_cmd { + XDP_FLOW_CMD_NOOP = 0, + XDP_FLOW_CMD_LOAD, + XDP_FLOW_CMD_UNLOAD, + XDP_FLOW_CMD_REPLACE, + XDP_FLOW_CMD_DELETE, +}; + +struct mbox_request { + int ifindex; + __u8 cmd; + struct xdp_flow flow; +}; + +struct mbox_reply { + int status; + __u32 id; +}; + +#endif diff --git a/net/xdp_flow/xdp_flow.h b/net/xdp_flow/xdp_flow.h new file mode 100644 index 0000000..656ceab --- /dev/null +++ b/net/xdp_flow/xdp_flow.h @@ -0,0 +1,23 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _LINUX_XDP_FLOW_H +#define _LINUX_XDP_FLOW_H + +#include +#include +#include + +struct xdp_flow_umh_ops { + struct umh_info info; + /* serialize access to this object and UMH */ + struct mutex lock; + flow_setup_cb_t *setup_cb; + int (*setup)(struct net_device *dev, bool do_bind, + struct netlink_ext_ack 
*extack); + int (*start)(void); + bool stop; + struct module *module; +}; + +extern struct xdp_flow_umh_ops xdp_flow_ops; + +#endif diff --git a/net/xdp_flow/xdp_flow_core.c b/net/xdp_flow/xdp_flow_core.c new file mode 100644 index 0000000..8265aef --- /dev/null +++ b/net/xdp_flow/xdp_flow_core.c @@ -0,0 +1,127 @@ +// SPDX-License-Identifier: GPL-2.0 +#include +#include +#include +#include "xdp_flow.h" + +struct xdp_flow_umh_ops xdp_flow_ops; +EXPORT_SYMBOL_GPL(xdp_flow_ops); + +static LIST_HEAD(xdp_block_cb_list); + +static void xdp_flow_block_release(void *cb_priv) +{ + struct net_device *dev = cb_priv; + struct netlink_ext_ack extack; + + mutex_lock(&xdp_flow_ops.lock); + xdp_flow_ops.setup(dev, false, &extack); + module_put(xdp_flow_ops.module); + mutex_unlock(&xdp_flow_ops.lock); +} + +int xdp_flow_setup_block(struct net_device *dev, struct flow_block_offload *f) +{ + struct flow_block_cb *block_cb; + int err = 0; + + /* TODO: Remove this limitation */ + if (!net_eq(current->nsproxy->net_ns, &init_net)) + return -EOPNOTSUPP; + + if (f->binder_type != FLOW_BLOCK_BINDER_TYPE_CLSACT_INGRESS) + return -EOPNOTSUPP; + + mutex_lock(&xdp_flow_ops.lock); + if (!xdp_flow_ops.module) { + mutex_unlock(&xdp_flow_ops.lock); + if (f->command == FLOW_BLOCK_UNBIND) + return -ENOENT; + err = request_module("xdp_flow"); + if (err) + return err; + mutex_lock(&xdp_flow_ops.lock); + if (!xdp_flow_ops.module) { + err = -ECHILD; + goto out; + } + } + if (xdp_flow_ops.stop) { + err = xdp_flow_ops.start(); + if (err) + goto out; + } + + f->driver_block_list = &xdp_block_cb_list; + + switch (f->command) { + case FLOW_BLOCK_BIND: + if (flow_block_cb_is_busy(xdp_flow_ops.setup_cb, dev, + &xdp_block_cb_list)) { + err = -EBUSY; + goto out; + } + + if (!try_module_get(xdp_flow_ops.module)) { + err = -ECHILD; + goto out; + } + + err = xdp_flow_ops.setup(dev, true, f->extack); + if (err) { + module_put(xdp_flow_ops.module); + goto out; + } + + block_cb = 
flow_block_cb_alloc(xdp_flow_ops.setup_cb, dev, dev, + xdp_flow_block_release); + if (IS_ERR(block_cb)) { + xdp_flow_ops.setup(dev, false, f->extack); + module_put(xdp_flow_ops.module); + err = PTR_ERR(block_cb); + goto out; + } + + flow_block_cb_add(block_cb, f); + list_add_tail(&block_cb->driver_list, &xdp_block_cb_list); + break; + case FLOW_BLOCK_UNBIND: + block_cb = flow_block_cb_lookup(f->block, xdp_flow_ops.setup_cb, + dev); + if (!block_cb) { + err = -ENOENT; + goto out; + } + + flow_block_cb_remove(block_cb, f); + list_del(&block_cb->driver_list); + break; + default: + err = -EOPNOTSUPP; + } +out: + mutex_unlock(&xdp_flow_ops.lock); + + return err; +} + +static void xdp_flow_umh_cleanup(struct umh_info *info) +{ + mutex_lock(&xdp_flow_ops.lock); + xdp_flow_ops.stop = true; + fput(info->pipe_to_umh); + fput(info->pipe_from_umh); + info->pid = 0; + mutex_unlock(&xdp_flow_ops.lock); +} + +static int __init xdp_flow_init(void) +{ + mutex_init(&xdp_flow_ops.lock); + xdp_flow_ops.stop = true; + xdp_flow_ops.info.cmdline = "xdp_flow_umh"; + xdp_flow_ops.info.cleanup = &xdp_flow_umh_cleanup; + + return 0; +} +device_initcall(xdp_flow_init); diff --git a/net/xdp_flow/xdp_flow_kern_mod.c b/net/xdp_flow/xdp_flow_kern_mod.c new file mode 100644 index 0000000..14e06ee --- /dev/null +++ b/net/xdp_flow/xdp_flow_kern_mod.c @@ -0,0 +1,250 @@ +// SPDX-License-Identifier: GPL-2.0 +#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt +#include +#include +#include +#include +#include "xdp_flow.h" +#include "msgfmt.h" + +extern char xdp_flow_umh_start; +extern char xdp_flow_umh_end; + +static void shutdown_umh(void) +{ + struct task_struct *tsk; + + if (xdp_flow_ops.stop) + return; + + tsk = get_pid_task(find_vpid(xdp_flow_ops.info.pid), PIDTYPE_PID); + if (tsk) { + send_sig(SIGKILL, tsk, 1); + put_task_struct(tsk); + } +} + +static int transact_umh(struct mbox_request *req, u32 *id) +{ + struct mbox_reply reply; + int ret = -EFAULT; + loff_t pos; + ssize_t n; + + if 
(!xdp_flow_ops.info.pid) + goto out; + + n = __kernel_write(xdp_flow_ops.info.pipe_to_umh, req, sizeof(*req), + &pos); + if (n != sizeof(*req)) { + pr_err("write fail %zd\n", n); + shutdown_umh(); + goto out; + } + + pos = 0; + n = kernel_read(xdp_flow_ops.info.pipe_from_umh, &reply, + sizeof(reply), &pos); + if (n != sizeof(reply)) { + pr_err("read fail %zd\n", n); + shutdown_umh(); + goto out; + } + + ret = reply.status; + if (id) + *id = reply.id; +out: + return ret; +} + +static int xdp_flow_replace(struct net_device *dev, struct flow_cls_offload *f) +{ + return -EOPNOTSUPP; +} + +static int xdp_flow_destroy(struct net_device *dev, struct flow_cls_offload *f) +{ + return -EOPNOTSUPP; +} + +static int xdp_flow_setup_flower(struct net_device *dev, + struct flow_cls_offload *f) +{ + switch (f->command) { + case FLOW_CLS_REPLACE: + return xdp_flow_replace(dev, f); + case FLOW_CLS_DESTROY: + return xdp_flow_destroy(dev, f); + case FLOW_CLS_STATS: + case FLOW_CLS_TMPLT_CREATE: + case FLOW_CLS_TMPLT_DESTROY: + default: + return -EOPNOTSUPP; + } +} + +static int xdp_flow_setup_block_cb(enum tc_setup_type type, void *type_data, + void *cb_priv) +{ + struct flow_cls_common_offload *common = type_data; + struct net_device *dev = cb_priv; + int err = 0; + + if (common->chain_index) { + NL_SET_ERR_MSG_MOD(common->extack, + "Supports only offload of chain 0"); + return -EOPNOTSUPP; + } + + if (type != TC_SETUP_CLSFLOWER) + return -EOPNOTSUPP; + + mutex_lock(&xdp_flow_ops.lock); + if (xdp_flow_ops.stop) { + err = xdp_flow_ops.start(); + if (err) + goto out; + } + + err = xdp_flow_setup_flower(dev, type_data); +out: + mutex_unlock(&xdp_flow_ops.lock); + return err; +} + +static int xdp_flow_setup_bind(struct net_device *dev, + struct netlink_ext_ack *extack) +{ + struct mbox_request *req; + u32 id = 0; + int err; + + req = kzalloc(sizeof(*req), GFP_KERNEL); + if (!req) + return -ENOMEM; + + req->cmd = XDP_FLOW_CMD_LOAD; + req->ifindex = dev->ifindex; + + /* Load bpf in UMH and 
get prog id */ + err = transact_umh(req, &id); + + /* TODO: id will be used to attach bpf prog to XDP + * As we have rtnl_lock, UMH cannot attach prog to XDP + */ + + kfree(req); + + return err; +} + +static int xdp_flow_setup_unbind(struct net_device *dev, + struct netlink_ext_ack *extack) +{ + struct mbox_request *req; + int err; + + req = kzalloc(sizeof(*req), GFP_KERNEL); + if (!req) + return -ENOMEM; + + req->cmd = XDP_FLOW_CMD_UNLOAD; + req->ifindex = dev->ifindex; + + err = transact_umh(req, NULL); + + kfree(req); + + return err; +} + +static int xdp_flow_setup(struct net_device *dev, bool do_bind, + struct netlink_ext_ack *extack) +{ + ASSERT_RTNL(); + + if (!net_eq(dev_net(dev), &init_net)) + return -EINVAL; + + return do_bind ? + xdp_flow_setup_bind(dev, extack) : + xdp_flow_setup_unbind(dev, extack); +} + +static int xdp_flow_test(void) +{ + struct mbox_request *req; + int err; + + req = kzalloc(sizeof(*req), GFP_KERNEL); + if (!req) + return -ENOMEM; + + req->cmd = XDP_FLOW_CMD_NOOP; + err = transact_umh(req, NULL); + + kfree(req); + + return err; +} + +static int start_umh(void) +{ + int err; + + /* fork usermode process */ + err = fork_usermode_blob(&xdp_flow_umh_start, + &xdp_flow_umh_end - &xdp_flow_umh_start, + &xdp_flow_ops.info); + if (err) + return err; + + xdp_flow_ops.stop = false; + pr_info("Loaded xdp_flow_umh pid %d\n", xdp_flow_ops.info.pid); + + /* health check that usermode process started correctly */ + if (xdp_flow_test()) { + shutdown_umh(); + return -EFAULT; + } + + return 0; +} + +static int __init load_umh(void) +{ + int err = 0; + + mutex_lock(&xdp_flow_ops.lock); + if (!xdp_flow_ops.stop) { + err = -EFAULT; + goto err; + } + + err = start_umh(); + if (err) + goto err; + + xdp_flow_ops.setup_cb = &xdp_flow_setup_block_cb; + xdp_flow_ops.setup = &xdp_flow_setup; + xdp_flow_ops.start = &start_umh; + xdp_flow_ops.module = THIS_MODULE; +err: + mutex_unlock(&xdp_flow_ops.lock); + return err; +} + +static void __exit fini_umh(void) +{ + 
mutex_lock(&xdp_flow_ops.lock); + shutdown_umh(); + xdp_flow_ops.module = NULL; + xdp_flow_ops.start = NULL; + xdp_flow_ops.setup = NULL; + xdp_flow_ops.setup_cb = NULL; + mutex_unlock(&xdp_flow_ops.lock); +} +module_init(load_umh); +module_exit(fini_umh); +MODULE_LICENSE("GPL"); diff --git a/net/xdp_flow/xdp_flow_umh.c b/net/xdp_flow/xdp_flow_umh.c new file mode 100644 index 0000000..c642b5b --- /dev/null +++ b/net/xdp_flow/xdp_flow_umh.c @@ -0,0 +1,116 @@ +// SPDX-License-Identifier: GPL-2.0 +#define _GNU_SOURCE +#include +#include +#include +#include +#include +#include +#include "msgfmt.h" + +FILE *kmsg; + +#define pr_log(fmt, prio, ...) fprintf(kmsg, "<%d>xdp_flow_umh: " fmt, \ + LOG_DAEMON | (prio), ##__VA_ARGS__) +#ifdef DEBUG +#define pr_debug(fmt, ...) pr_log(fmt, LOG_DEBUG, ##__VA_ARGS__) +#else +#define pr_debug(fmt, ...) do {} while (0) +#endif +#define pr_info(fmt, ...) pr_log(fmt, LOG_INFO, ##__VA_ARGS__) +#define pr_warn(fmt, ...) pr_log(fmt, LOG_WARNING, ##__VA_ARGS__) +#define pr_err(fmt, ...) 
pr_log(fmt, LOG_ERR, ##__VA_ARGS__) + +static int handle_load(const struct mbox_request *req, __u32 *prog_id) +{ + *prog_id = 0; + + return 0; +} + +static int handle_unload(const struct mbox_request *req) +{ + return 0; +} + +static int handle_replace(struct mbox_request *req) +{ + return -EOPNOTSUPP; +} + +static int handle_delete(const struct mbox_request *req) +{ + return -EOPNOTSUPP; +} + +static void loop(void) +{ + struct mbox_request *req; + + req = malloc(sizeof(struct mbox_request)); + if (!req) { + pr_err("Memory allocation for mbox_request failed\n"); + return; + } + + while (1) { + struct mbox_reply reply; + int n; + + n = read(0, req, sizeof(*req)); + if (n < 0) { + pr_err("read for mbox_request failed: %s\n", + strerror(errno)); + break; + } + if (n != sizeof(*req)) { + pr_err("Invalid request size %d\n", n); + break; + } + + switch (req->cmd) { + case XDP_FLOW_CMD_NOOP: + reply.status = 0; + break; + case XDP_FLOW_CMD_LOAD: + reply.status = handle_load(req, &reply.id); + break; + case XDP_FLOW_CMD_UNLOAD: + reply.status = handle_unload(req); + break; + case XDP_FLOW_CMD_REPLACE: + reply.status = handle_replace(req); + break; + case XDP_FLOW_CMD_DELETE: + reply.status = handle_delete(req); + break; + default: + pr_err("Invalid command %d\n", req->cmd); + reply.status = -EOPNOTSUPP; + } + + n = write(1, &reply, sizeof(reply)); + if (n < 0) { + pr_err("write for mbox_reply failed: %s\n", + strerror(errno)); + break; + } + if (n != sizeof(reply)) { + pr_err("reply written too short: %d\n", n); + break; + } + } + + free(req); +} + +int main(void) +{ + kmsg = fopen("/dev/kmsg", "a"); + setvbuf(kmsg, NULL, _IONBF, 0); + pr_info("Started xdp_flow\n"); + loop(); + fclose(kmsg); + + return 0; +} diff --git a/net/xdp_flow/xdp_flow_umh_blob.S b/net/xdp_flow/xdp_flow_umh_blob.S new file mode 100644 index 0000000..6edcb0e --- /dev/null +++ b/net/xdp_flow/xdp_flow_umh_blob.S @@ -0,0 +1,7 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + .section .rodata, "a" + .global 
xdp_flow_umh_start
+xdp_flow_umh_start:
+	.incbin "net/xdp_flow/xdp_flow_umh"
+	.global xdp_flow_umh_end
+xdp_flow_umh_end:

From patchwork Fri Oct 18 04:07:35 2019
X-Patchwork-Submitter: Toshiaki Makita
X-Patchwork-Id: 1179115
X-Patchwork-Delegate: bpf@iogearbox.net
From: Toshiaki Makita
To: Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu, Yonghong Song, "David S. Miller", Jakub Kicinski, Jesper Dangaard Brouer, John Fastabend, Jamal Hadi Salim, Cong Wang, Jiri Pirko, Pablo Neira Ayuso, Jozsef Kadlecsik, Florian Westphal, Pravin B Shelar
Cc: Toshiaki Makita, netdev@vger.kernel.org, bpf@vger.kernel.org, William Tu, Stanislav Fomichev
Subject: [RFC PATCH v2 bpf-next 02/15] xdp_flow: Add skeleton bpf program for XDP
Date: Fri, 18 Oct 2019 13:07:35 +0900
Message-Id: <20191018040748.30593-3-toshiaki.makita1@gmail.com>
In-Reply-To: <20191018040748.30593-1-toshiaki.makita1@gmail.com>
References: <20191018040748.30593-1-toshiaki.makita1@gmail.com>
X-Mailing-List: netdev@vger.kernel.org

The program is meant to be loaded when a device is bound to an ingress
flow block, and should be attached to XDP on the device. Typically it
should be loaded when a TC ingress or clsact qdisc is added, or when an
nftables offloaded chain is added.

The program is prebuilt and embedded in the UMH instead of being
generated dynamically. This is because TC filters are changed
frequently when used by OVS, and the latency of a TC filter change
would affect datapath latency.
Signed-off-by: Toshiaki Makita --- net/xdp_flow/Makefile | 87 +++++++++++- net/xdp_flow/xdp_flow_kern_bpf.c | 12 ++ net/xdp_flow/xdp_flow_kern_bpf_blob.S | 7 + net/xdp_flow/xdp_flow_umh.c | 243 +++++++++++++++++++++++++++++++++- 4 files changed, 345 insertions(+), 4 deletions(-) create mode 100644 net/xdp_flow/xdp_flow_kern_bpf.c create mode 100644 net/xdp_flow/xdp_flow_kern_bpf_blob.S diff --git a/net/xdp_flow/Makefile b/net/xdp_flow/Makefile index f6138c2..057cc6a 100644 --- a/net/xdp_flow/Makefile +++ b/net/xdp_flow/Makefile @@ -2,25 +2,106 @@ obj-$(CONFIG_XDP_FLOW) += xdp_flow_core.o +XDP_FLOW_PATH ?= $(abspath $(srctree)/$(src)) +TOOLS_PATH := $(XDP_FLOW_PATH)/../../tools + +# Libbpf dependencies +LIBBPF = $(TOOLS_PATH)/lib/bpf/libbpf.a + +LLC ?= llc +CLANG ?= clang +LLVM_OBJCOPY ?= llvm-objcopy +BTF_PAHOLE ?= pahole + +ifdef CROSS_COMPILE +CLANG_ARCH_ARGS = -target $(ARCH) +endif + +BTF_LLC_PROBE := $(shell $(LLC) -march=bpf -mattr=help 2>&1 | grep dwarfris) +BTF_PAHOLE_PROBE := $(shell $(BTF_PAHOLE) --help 2>&1 | grep BTF) +BTF_OBJCOPY_PROBE := $(shell $(LLVM_OBJCOPY) --help 2>&1 | grep -i 'usage.*llvm') +BTF_LLVM_PROBE := $(shell echo "int main() { return 0; }" | \ + $(CLANG) -target bpf -O2 -g -c -x c - -o ./llvm_btf_verify.o; \ + readelf -S ./llvm_btf_verify.o | grep BTF; \ + /bin/rm -f ./llvm_btf_verify.o) + +ifneq ($(BTF_LLVM_PROBE),) + EXTRA_CFLAGS += -g +else +ifneq ($(and $(BTF_LLC_PROBE),$(BTF_PAHOLE_PROBE),$(BTF_OBJCOPY_PROBE)),) + EXTRA_CFLAGS += -g + LLC_FLAGS += -mattr=dwarfris + DWARF2BTF = y +endif +endif + +$(LIBBPF): FORCE +# Fix up variables inherited from Kbuild that tools/ build system won't like + $(MAKE) -C $(dir $@) RM='rm -rf' LDFLAGS= srctree=$(XDP_FLOW_PATH)/../../ O= + +# Verify LLVM compiler tools are available and bpf target is supported by llc +.PHONY: verify_cmds verify_target_bpf $(CLANG) $(LLC) + +verify_cmds: $(CLANG) $(LLC) + @for TOOL in $^ ; do \ + if ! 
(which -- "$${TOOL}" > /dev/null 2>&1); then \ + echo "*** ERROR: Cannot find LLVM tool $${TOOL}" ;\ + exit 1; \ + else true; fi; \ + done + +verify_target_bpf: verify_cmds + @if ! (${LLC} -march=bpf -mattr=help > /dev/null 2>&1); then \ + echo "*** ERROR: LLVM (${LLC}) does not support 'bpf' target" ;\ + echo " NOTICE: LLVM version >= 3.7.1 required" ;\ + exit 2; \ + else true; fi + +$(src)/xdp_flow_kern_bpf.c: verify_target_bpf + +$(obj)/xdp_flow_kern_bpf.o: $(src)/xdp_flow_kern_bpf.c FORCE + @echo " CLANG-bpf " $@ + $(Q)$(CLANG) $(NOSTDINC_FLAGS) $(LINUXINCLUDE) $(EXTRA_CFLAGS) -I$(obj) \ + -I$(srctree)/tools/lib/bpf/ \ + -D__KERNEL__ -D__BPF_TRACING__ -Wno-unused-value -Wno-pointer-sign \ + -D__TARGET_ARCH_$(SRCARCH) -Wno-compare-distinct-pointer-types \ + -Wno-gnu-variable-sized-type-not-at-end \ + -Wno-address-of-packed-member -Wno-tautological-compare \ + -Wno-unknown-warning-option $(CLANG_ARCH_ARGS) \ + -I$(srctree)/samples/bpf/ -include asm_goto_workaround.h \ + -O2 -emit-llvm -c $< -o -| $(LLC) -march=bpf $(LLC_FLAGS) -filetype=obj -o $@ +ifeq ($(DWARF2BTF),y) + $(BTF_PAHOLE) -J $@ +endif + ifeq ($(CONFIG_XDP_FLOW_UMH), y) # builtin xdp_flow_umh should be compiled with -static # since rootfs isn't mounted at the time of __init # function is called and do_execv won't find elf interpreter STATIC := -static +STATICLDLIBS := -lz endif +quiet_cmd_as_user = AS $@ + cmd_as_user = $(AS) -c -o $@ $< + quiet_cmd_cc_user = CC $@ cmd_cc_user = $(CC) -Wall -Wmissing-prototypes -O2 -std=gnu89 \ - -I$(srctree)/tools/include/ \ + -I$(srctree)/tools/lib/ -I$(srctree)/tools/include/ \ -c -o $@ $< quiet_cmd_ld_user = LD $@ - cmd_ld_user = $(CC) $(STATIC) -o $@ $^ + cmd_ld_user = $(CC) $(STATIC) -o $@ $^ $(LIBBPF) -lelf $(STATICLDLIBS) + +$(obj)/xdp_flow_kern_bpf_blob.o: $(src)/xdp_flow_kern_bpf_blob.S \ + $(obj)/xdp_flow_kern_bpf.o + $(call if_changed,as_user) $(obj)/xdp_flow_umh.o: $(src)/xdp_flow_umh.c FORCE $(call if_changed,cc_user) -$(obj)/xdp_flow_umh: 
$(obj)/xdp_flow_umh.o +$(obj)/xdp_flow_umh: $(obj)/xdp_flow_umh.o $(LIBBPF) \ + $(obj)/xdp_flow_kern_bpf_blob.o $(call if_changed,ld_user) clean-files := xdp_flow_umh diff --git a/net/xdp_flow/xdp_flow_kern_bpf.c b/net/xdp_flow/xdp_flow_kern_bpf.c new file mode 100644 index 0000000..74cdb1d --- /dev/null +++ b/net/xdp_flow/xdp_flow_kern_bpf.c @@ -0,0 +1,12 @@ +// SPDX-License-Identifier: GPL-2.0 +#define KBUILD_MODNAME "foo" +#include +#include + +SEC("xdp_flow") +int xdp_flow_prog(struct xdp_md *ctx) +{ + return XDP_PASS; +} + +char _license[] SEC("license") = "GPL"; diff --git a/net/xdp_flow/xdp_flow_kern_bpf_blob.S b/net/xdp_flow/xdp_flow_kern_bpf_blob.S new file mode 100644 index 0000000..d180c1b --- /dev/null +++ b/net/xdp_flow/xdp_flow_kern_bpf_blob.S @@ -0,0 +1,7 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + .section .rodata, "a" + .global xdp_flow_bpf_start +xdp_flow_bpf_start: + .incbin "net/xdp_flow/xdp_flow_kern_bpf.o" + .global xdp_flow_bpf_end +xdp_flow_bpf_end: diff --git a/net/xdp_flow/xdp_flow_umh.c b/net/xdp_flow/xdp_flow_umh.c index c642b5b..85c5c7b 100644 --- a/net/xdp_flow/xdp_flow_umh.c +++ b/net/xdp_flow/xdp_flow_umh.c @@ -6,8 +6,18 @@ #include #include #include +#include +#include +#include +#include +#include +#include +#include #include "msgfmt.h" +extern char xdp_flow_bpf_start; +extern char xdp_flow_bpf_end; +int progfile_fd; FILE *kmsg; #define pr_log(fmt, prio, ...) fprintf(kmsg, "<%d>xdp_flow_umh: " fmt, \ @@ -21,15 +31,241 @@ #define pr_warn(fmt, ...) pr_log(fmt, LOG_WARNING, ##__VA_ARGS__) #define pr_err(fmt, ...) 
pr_log(fmt, LOG_ERR, ##__VA_ARGS__) +#define ERRBUF_SIZE 64 + +/* This key represents a net device */ +struct netdev_info_key { + int ifindex; +}; + +struct netdev_info { + struct netdev_info_key key; + struct hlist_node node; + struct bpf_object *obj; +}; + +DEFINE_HASHTABLE(netdev_info_table, 16); + +static int libbpf_err(int err, char *errbuf) +{ + libbpf_strerror(err, errbuf, ERRBUF_SIZE); + + if (-err < __LIBBPF_ERRNO__START) + return err; + + return -EINVAL; +} + +static int setup(void) +{ + size_t size = &xdp_flow_bpf_end - &xdp_flow_bpf_start; + struct rlimit r = { RLIM_INFINITY, RLIM_INFINITY }; + ssize_t len; + int err; + + if (setrlimit(RLIMIT_MEMLOCK, &r)) { + err = -errno; + pr_err("setrlimit MEMLOCK failed: %s\n", strerror(errno)); + return err; + } + + progfile_fd = memfd_create("xdp_flow_kern_bpf.o", 0); + if (progfile_fd < 0) { + err = -errno; + pr_err("memfd_create failed: %s\n", strerror(errno)); + return err; + } + + len = write(progfile_fd, &xdp_flow_bpf_start, size); + if (len < 0) { + err = -errno; + pr_err("Failed to write bpf prog: %s\n", strerror(errno)); + goto err; + } + + if (len < size) { + pr_err("bpf prog written too short: expected %ld, actual %ld\n", + size, len); + err = -EIO; + goto err; + } + + return 0; +err: + close(progfile_fd); + + return err; +} + +static int load_bpf(int ifindex, struct bpf_object **objp) +{ + struct bpf_object_open_attr attr = {}; + char path[256], errbuf[ERRBUF_SIZE]; + struct bpf_program *prog; + struct bpf_object *obj; + int prog_fd, err; + ssize_t len; + + len = snprintf(path, 256, "/proc/self/fd/%d", progfile_fd); + if (len < 0) { + err = -errno; + pr_err("Failed to setup prog fd path string: %s\n", + strerror(errno)); + return err; + } + + attr.file = path; + attr.prog_type = BPF_PROG_TYPE_XDP; + obj = bpf_object__open_xattr(&attr); + if (IS_ERR_OR_NULL(obj)) { + if (IS_ERR(obj)) { + err = libbpf_err((int)PTR_ERR(obj), errbuf); + } else { + err = -ENOENT; + strerror_r(-err, errbuf, sizeof(errbuf)); 
+ } + pr_err("Cannot open bpf prog: %s\n", errbuf); + return err; + } + + bpf_object__for_each_program(prog, obj) + bpf_program__set_type(prog, attr.prog_type); + + err = bpf_object__load(obj); + if (err) { + err = libbpf_err(err, errbuf); + pr_err("Failed to load bpf prog: %s\n", errbuf); + goto err; + } + + prog = bpf_object__find_program_by_title(obj, "xdp_flow"); + if (!prog) { + pr_err("Cannot find xdp_flow program\n"); + err = -ENOENT; + goto err; + } + + prog_fd = bpf_program__fd(prog); + if (prog_fd < 0) { + err = libbpf_err(prog_fd, errbuf); + pr_err("Invalid program fd: %s\n", errbuf); + goto err; + } + + *objp = obj; + + return prog_fd; +err: + bpf_object__close(obj); + return err; +} + +static int get_netdev_info_keyval(const struct netdev_info_key *key) +{ + return key->ifindex; +} + +static struct netdev_info *find_netdev_info(const struct netdev_info_key *key) +{ + int keyval = get_netdev_info_keyval(key); + struct netdev_info *netdev_info; + + hash_for_each_possible(netdev_info_table, netdev_info, node, keyval) { + if (netdev_info->key.ifindex == key->ifindex) + return netdev_info; + } + + return NULL; +} + +static int get_netdev_info_key(const struct mbox_request *req, + struct netdev_info_key *key) +{ + key->ifindex = req->ifindex; + + return 0; +} + +static struct netdev_info *get_netdev_info(const struct mbox_request *req) +{ + struct netdev_info *netdev_info; + struct netdev_info_key key; + int err; + + err = get_netdev_info_key(req, &key); + if (err) + return ERR_PTR(err); + + netdev_info = find_netdev_info(&key); + if (!netdev_info) { + pr_err("BUG: netdev_info for if %d not found.\n", + key.ifindex); + return ERR_PTR(-ENOENT); + } + + return netdev_info; +} + static int handle_load(const struct mbox_request *req, __u32 *prog_id) { - *prog_id = 0; + struct netdev_info *netdev_info; + struct bpf_prog_info info = {}; + struct netdev_info_key key; + __u32 len = sizeof(info); + int err, prog_fd; + + err = get_netdev_info_key(req, &key); + if 
(err) + return err; + + netdev_info = find_netdev_info(&key); + if (netdev_info) + return 0; + + netdev_info = malloc(sizeof(*netdev_info)); + if (!netdev_info) { + pr_err("malloc for netdev_info failed.\n"); + return -ENOMEM; + } + netdev_info->key.ifindex = key.ifindex; + + prog_fd = load_bpf(req->ifindex, &netdev_info->obj); + if (prog_fd < 0) { + err = prog_fd; + goto err_netdev_info; + } + + err = bpf_obj_get_info_by_fd(prog_fd, &info, &len); + if (err) + goto err_obj; + + *prog_id = info.id; + hash_add(netdev_info_table, &netdev_info->node, + get_netdev_info_keyval(&netdev_info->key)); + pr_debug("XDP program for if %d was loaded\n", req->ifindex); return 0; +err_obj: + bpf_object__close(netdev_info->obj); +err_netdev_info: + free(netdev_info); + + return err; } static int handle_unload(const struct mbox_request *req) { + struct netdev_info *netdev_info; + + netdev_info = get_netdev_info(req); + if (IS_ERR(netdev_info)) + return PTR_ERR(netdev_info); + + hash_del(&netdev_info->node); + bpf_object__close(netdev_info->obj); + free(netdev_info); + pr_debug("XDP program for if %d was closed\n", req->ifindex); + return 0; } @@ -109,7 +345,12 @@ int main(void) kmsg = fopen("/dev/kmsg", "a"); setvbuf(kmsg, NULL, _IONBF, 0); pr_info("Started xdp_flow\n"); + if (setup()) { + fclose(kmsg); + return -1; + } loop(); + close(progfile_fd); fclose(kmsg); return 0;
From patchwork Fri Oct 18 04:07:36 2019
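The UMH patch above embeds the compiled BPF object between the xdp_flow_bpf_start/end symbols, writes it to an anonymous memfd in setup(), and later hands libbpf a /proc/self/fd/<N> path in load_bpf(). The file-handling part of that trick can be sketched as a standalone userspace program; this is an editor's illustration (the blob and function name are hypothetical, libbpf itself is left out):

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Sketch of the memfd trick from setup()/load_bpf() above: write an
 * in-memory blob to an anonymous memfd, then reopen it by pathname via
 * /proc/self/fd/<N>, the way load_bpf() passes a path to libbpf.
 * The blob here is a placeholder, not a real BPF ELF. */
static int memfd_roundtrip(const char *blob, size_t size)
{
	char path[64];
	char buf[128];
	ssize_t len;
	int fd, rfd;

	fd = memfd_create("xdp_flow_kern_bpf.o", 0); /* needs Linux >= 3.17 */
	if (fd < 0)
		return -1;

	len = write(fd, blob, size);
	if (len < 0 || (size_t)len < size)
		goto err;

	/* Reopen the memfd through procfs, as load_bpf() does for libbpf */
	snprintf(path, sizeof(path), "/proc/self/fd/%d", fd);
	rfd = open(path, O_RDONLY);
	if (rfd < 0)
		goto err;

	len = pread(rfd, buf, sizeof(buf), 0);
	close(rfd);
	close(fd);
	return len == (ssize_t)size && !memcmp(buf, blob, size) ? 0 : -1;
err:
	close(fd);
	return -1;
}
```

The /proc/self/fd indirection is what lets a path-based API (bpf_object__open_xattr) consume a file that never touched the filesystem.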
From: Toshiaki Makita
To: Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu, Yonghong Song, "David S. Miller", Jakub Kicinski, Jesper Dangaard Brouer, John Fastabend, Jamal Hadi Salim, Cong Wang, Jiri Pirko, Pablo Neira Ayuso, Jozsef Kadlecsik, Florian Westphal, Pravin B Shelar
Cc: Toshiaki Makita, netdev@vger.kernel.org, bpf@vger.kernel.org, William Tu, Stanislav Fomichev
Subject: [RFC PATCH v2 bpf-next 03/15] bpf: Add API to get program from id
Date: Fri, 18 Oct 2019 13:07:36 +0900
Message-Id: <20191018040748.30593-4-toshiaki.makita1@gmail.com>

Factor out the logic in bpf_prog_get_fd_by_id() and add bpf_prog_get_by_id()/bpf_prog_get_type_dev_by_id(). Also export bpf_prog_get_type_dev_by_id(), which will be used by the following commit to get bpf prog from its id.
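For context, the refactored helper looks up the program in the IDR under a spinlock and takes a reference only if the refcount has not already dropped to zero (bpf_prog_inc_not_zero()). A userspace analogue of that pattern, with illustrative names rather than kernel API, can be sketched as:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Userspace analogue of the lookup pattern in bpf_prog_get_by_id():
 * find the object by id, then take a reference only if the refcount is
 * still nonzero (kernel: idr_find() + bpf_prog_inc_not_zero()).
 * All names here are illustrative, not kernel API. */
struct obj {
	unsigned int id;
	atomic_int refcnt;	/* 0 means the object is being destroyed */
};

/* Mimics refcount_inc_not_zero(): returns 1 and takes a reference,
 * or returns 0 if the count already reached zero. */
static int obj_inc_not_zero(struct obj *o)
{
	int old = atomic_load(&o->refcnt);

	while (old != 0) {
		if (atomic_compare_exchange_weak(&o->refcnt, &old, old + 1))
			return 1;
	}
	return 0;
}

/* A linear table scan stands in for the IDR; in the kernel a spinlock
 * protects the lookup so the object cannot be freed underneath it. */
static struct obj *obj_get_by_id(struct obj *table, size_t n, unsigned int id)
{
	for (size_t i = 0; i < n; i++) {
		if (table[i].id == id)
			return obj_inc_not_zero(&table[i]) ? &table[i] : NULL;
	}
	return NULL;	/* the kernel returns ERR_PTR(-ENOENT) here */
}
```

The inc-not-zero step is the crux: a plain increment could resurrect an object whose last reference was concurrently dropped.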
Signed-off-by: Toshiaki Makita --- include/linux/bpf.h | 8 ++++++++ kernel/bpf/syscall.c | 42 ++++++++++++++++++++++++++++++++++-------- 2 files changed, 42 insertions(+), 8 deletions(-) diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 282e28b..78fe7ef 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -636,6 +636,8 @@ int bpf_prog_array_copy(struct bpf_prog_array *old_array, struct bpf_prog *bpf_prog_get(u32 ufd); struct bpf_prog *bpf_prog_get_type_dev(u32 ufd, enum bpf_prog_type type, bool attach_drv); +struct bpf_prog *bpf_prog_get_type_dev_by_id(u32 id, enum bpf_prog_type type, + bool attach_drv); struct bpf_prog * __must_check bpf_prog_add(struct bpf_prog *prog, int i); void bpf_prog_sub(struct bpf_prog *prog, int i); struct bpf_prog * __must_check bpf_prog_inc(struct bpf_prog *prog); @@ -760,6 +762,12 @@ static inline struct bpf_prog *bpf_prog_get_type_dev(u32 ufd, return ERR_PTR(-EOPNOTSUPP); } +static inline struct bpf_prog * +bpf_prog_get_type_dev_by_id(u32 id, enum bpf_prog_type type, bool attach_drv) +{ + return ERR_PTR(-EOPNOTSUPP); +} + static inline struct bpf_prog * __must_check bpf_prog_add(struct bpf_prog *prog, int i) { diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index 82eabd4..2dd6cfc 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -2139,6 +2139,39 @@ static int bpf_obj_get_next_id(const union bpf_attr *attr, return err; } +static struct bpf_prog *bpf_prog_get_by_id(u32 id) +{ + struct bpf_prog *prog; + + spin_lock_bh(&prog_idr_lock); + prog = idr_find(&prog_idr, id); + if (prog) + prog = bpf_prog_inc_not_zero(prog); + else + prog = ERR_PTR(-ENOENT); + spin_unlock_bh(&prog_idr_lock); + + return prog; +} + +struct bpf_prog *bpf_prog_get_type_dev_by_id(u32 id, enum bpf_prog_type type, + bool attach_drv) +{ + struct bpf_prog *prog; + + prog = bpf_prog_get_by_id(id); + if (IS_ERR(prog)) + return prog; + + if (!bpf_prog_get_ok(prog, &type, attach_drv)) { + bpf_prog_put(prog); + return 
ERR_PTR(-EINVAL); + } + + return prog; +} +EXPORT_SYMBOL_GPL(bpf_prog_get_type_dev_by_id); + #define BPF_PROG_GET_FD_BY_ID_LAST_FIELD prog_id static int bpf_prog_get_fd_by_id(const union bpf_attr *attr) @@ -2153,14 +2186,7 @@ static int bpf_prog_get_fd_by_id(const union bpf_attr *attr) if (!capable(CAP_SYS_ADMIN)) return -EPERM; - spin_lock_bh(&prog_idr_lock); - prog = idr_find(&prog_idr, id); - if (prog) - prog = bpf_prog_inc_not_zero(prog); - else - prog = ERR_PTR(-ENOENT); - spin_unlock_bh(&prog_idr_lock); - + prog = bpf_prog_get_by_id(id); if (IS_ERR(prog)) return PTR_ERR(prog);
From patchwork Fri Oct 18 04:07:37 2019
From: Toshiaki Makita
To: Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu, Yonghong Song, "David S. Miller", Jakub Kicinski, Jesper Dangaard Brouer, John Fastabend, Jamal Hadi Salim, Cong Wang, Jiri Pirko, Pablo Neira Ayuso, Jozsef Kadlecsik, Florian Westphal, Pravin B Shelar
Cc: Toshiaki Makita, netdev@vger.kernel.org, bpf@vger.kernel.org, William Tu, Stanislav Fomichev
Subject: [RFC PATCH v2 bpf-next 04/15] xdp: Export dev_check_xdp and dev_change_xdp
Date: Fri, 18 Oct 2019 13:07:37 +0900
Message-Id: <20191018040748.30593-5-toshiaki.makita1@gmail.com>

Factor out the check and change logic from dev_change_xdp_fd(), and export them for the following commit. Signed-off-by: Toshiaki Makita --- include/linux/netdevice.h | 4 ++ net/core/dev.c | 111 +++++++++++++++++++++++++++++++++++++--------- 2 files changed, 95 insertions(+), 20 deletions(-) diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index 3207e0b..c338a73 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -3707,6 +3707,10 @@ struct sk_buff *dev_hard_start_xmit(struct sk_buff *skb, struct net_device *dev, struct netdev_queue *txq, int *ret); typedef int (*bpf_op_t)(struct net_device *dev, struct netdev_bpf *bpf); +int dev_check_xdp(struct net_device *dev, struct netlink_ext_ack *extack, + bool do_install, u32 *prog_id_p, u32 flags); +int dev_change_xdp(struct net_device *dev, struct netlink_ext_ack *extack, + struct bpf_prog *prog, u32 flags); int dev_change_xdp_fd(struct net_device *dev, struct netlink_ext_ack *extack, int fd, u32 flags); u32 __dev_xdp_query(struct net_device *dev, bpf_op_t xdp_op, diff --git a/net/core/dev.c b/net/core/dev.c index 8bc3dce..9965675 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -8317,23 +8317,24 @@ static void dev_xdp_uninstall(struct net_device
*dev) } /** - * dev_change_xdp_fd - set or clear a bpf program for a device rx path + * dev_check_xdp - check if xdp prog can be [un]installed * @dev: device * @extack: netlink extended ack - * @fd: new program fd or negative value to clear + * @install: flag to install or uninstall + * @prog_id_p: pointer to a storage for program id * @flags: xdp-related flags * - * Set or clear a bpf program for a device + * Check if xdp prog can be [un]installed + * If a program is already loaded, store the prog id to prog_id_p */ -int dev_change_xdp_fd(struct net_device *dev, struct netlink_ext_ack *extack, - int fd, u32 flags) +int dev_check_xdp(struct net_device *dev, struct netlink_ext_ack *extack, + bool install, u32 *prog_id_p, u32 flags) { const struct net_device_ops *ops = dev->netdev_ops; enum bpf_netdev_command query; - struct bpf_prog *prog = NULL; bpf_op_t bpf_op, bpf_chk; bool offload; - int err; + u32 prog_id; ASSERT_RTNL(); @@ -8350,28 +8351,64 @@ int dev_change_xdp_fd(struct net_device *dev, struct netlink_ext_ack *extack, if (bpf_op == bpf_chk) bpf_chk = generic_xdp_install; - if (fd >= 0) { - u32 prog_id; - + if (install) { if (!offload && __dev_xdp_query(dev, bpf_chk, XDP_QUERY_PROG)) { NL_SET_ERR_MSG(extack, "native and generic XDP can't be active at the same time"); return -EEXIST; } prog_id = __dev_xdp_query(dev, bpf_op, query); + if (prog_id_p) + *prog_id_p = prog_id; if ((flags & XDP_FLAGS_UPDATE_IF_NOEXIST) && prog_id) { NL_SET_ERR_MSG(extack, "XDP program already attached"); return -EBUSY; } + } else { + prog_id = __dev_xdp_query(dev, bpf_op, query); + if (prog_id_p) + *prog_id_p = prog_id; + if (!prog_id) + return -ENOENT; + } - prog = bpf_prog_get_type_dev(fd, BPF_PROG_TYPE_XDP, - bpf_op == ops->ndo_bpf); - if (IS_ERR(prog)) - return PTR_ERR(prog); + return 0; +} +EXPORT_SYMBOL_GPL(dev_check_xdp); + +/** + * dev_change_xdp - set or clear a bpf program for a device rx path + * @dev: device + * @extack: netlink extended ack + * @prog: bpf program + *
@flags: xdp-related flags + * + * Set or clear a bpf program for a device. + * Caller must call dev_check_xdp before calling this function to + * check if xdp prog can be [un]installed. + */ +int dev_change_xdp(struct net_device *dev, struct netlink_ext_ack *extack, + struct bpf_prog *prog, u32 flags) +{ + const struct net_device_ops *ops = dev->netdev_ops; + enum bpf_netdev_command query; + bpf_op_t bpf_op; + bool offload; + + ASSERT_RTNL(); + + offload = flags & XDP_FLAGS_HW_MODE; + query = offload ? XDP_QUERY_PROG_HW : XDP_QUERY_PROG; + + bpf_op = ops->ndo_bpf; + if (!bpf_op || (flags & XDP_FLAGS_SKB_MODE)) + bpf_op = generic_xdp_install; + + if (prog) { + u32 prog_id = __dev_xdp_query(dev, bpf_op, query); if (!offload && bpf_prog_is_dev_bound(prog->aux)) { NL_SET_ERR_MSG(extack, "using device-bound program without HW_MODE flag is not supported"); - bpf_prog_put(prog); return -EINVAL; } @@ -8379,13 +8416,47 @@ int dev_change_xdp_fd(struct net_device *dev, struct netlink_ext_ack *extack, bpf_prog_put(prog); return 0; } - } else { - if (!__dev_xdp_query(dev, bpf_op, query)) - return 0; } - err = dev_xdp_install(dev, bpf_op, extack, flags, prog); - if (err < 0 && prog) + return dev_xdp_install(dev, bpf_op, extack, flags, prog); +} +EXPORT_SYMBOL_GPL(dev_change_xdp); + +/** + * dev_change_xdp_fd - set or clear a bpf program for a device rx path + * @dev: device + * @extack: netlink extended ack + * @fd: new program fd or negative value to clear + * @flags: xdp-related flags + * + * Set or clear a bpf program for a device + */ +int dev_change_xdp_fd(struct net_device *dev, struct netlink_ext_ack *extack, + int fd, u32 flags) +{ + struct bpf_prog *prog = NULL; + bool install = fd >= 0; + int err; + + err = dev_check_xdp(dev, extack, install, NULL, flags); + if (err) { + if (!install && err == -ENOENT) + err = 0; + return err; + } + + if (install) { + bool attach_drv; + + attach_drv = dev->netdev_ops->ndo_bpf && + !(flags & XDP_FLAGS_SKB_MODE); + prog = 
bpf_prog_get_type_dev(fd, BPF_PROG_TYPE_XDP, attach_drv); + if (IS_ERR(prog)) + return PTR_ERR(prog); + } + + err = dev_change_xdp(dev, extack, prog, flags); + if (err && prog) bpf_prog_put(prog); return err;
From patchwork Fri Oct 18 04:07:38 2019
From: Toshiaki Makita
To: Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu, Yonghong Song, "David S. Miller", Jakub Kicinski, Jesper Dangaard Brouer, John Fastabend, Jamal Hadi Salim, Cong Wang, Jiri Pirko, Pablo Neira Ayuso, Jozsef Kadlecsik, Florian Westphal, Pravin B Shelar
Cc: Toshiaki Makita, netdev@vger.kernel.org, bpf@vger.kernel.org, William Tu, Stanislav Fomichev
Subject: [RFC PATCH v2 bpf-next 05/15] xdp_flow: Attach bpf prog to XDP in kernel after UMH loaded program
Date: Fri, 18 Oct 2019 13:07:38 +0900
Message-Id: <20191018040748.30593-6-toshiaki.makita1@gmail.com>

As UMH runs under RTNL, it cannot attach XDP from userspace. Thus the kernel, xdp_flow module, installs the XDP program. Signed-off-by: Toshiaki Makita --- net/xdp_flow/xdp_flow_kern_mod.c | 109 ++++++++++++++++++++++++++++++++++++--- 1 file changed, 103 insertions(+), 6 deletions(-) diff --git a/net/xdp_flow/xdp_flow_kern_mod.c b/net/xdp_flow/xdp_flow_kern_mod.c index 14e06ee..2c80590 100644 --- a/net/xdp_flow/xdp_flow_kern_mod.c +++ b/net/xdp_flow/xdp_flow_kern_mod.c @@ -3,10 +3,27 @@ #include #include #include +#include #include +#include #include "xdp_flow.h" #include "msgfmt.h" +struct xdp_flow_prog { + struct rhash_head ht_node; + struct net_device *dev; + struct bpf_prog *prog; +}; + +static const struct rhashtable_params progs_params = { + .key_len = sizeof(struct net_device *), + .key_offset = offsetof(struct xdp_flow_prog, dev), + .head_offset = offsetof(struct xdp_flow_prog, ht_node), + .automatic_shrinking = true, +}; + +static struct rhashtable progs; + extern char xdp_flow_umh_start; extern char xdp_flow_umh_end; @@ -116,10 +133,17 @@ static int xdp_flow_setup_block_cb(enum tc_setup_type type, void *type_data, static int xdp_flow_setup_bind(struct net_device *dev, struct netlink_ext_ack *extack) {
+ u32 flags = XDP_FLAGS_DRV_MODE | XDP_FLAGS_UPDATE_IF_NOEXIST; + struct xdp_flow_prog *prog_node; struct mbox_request *req; + struct bpf_prog *prog; u32 id = 0; int err; + err = dev_check_xdp(dev, extack, true, NULL, flags); + if (err) + return err; + req = kzalloc(sizeof(*req), GFP_KERNEL); if (!req) return -ENOMEM; @@ -129,21 +153,83 @@ static int xdp_flow_setup_bind(struct net_device *dev, /* Load bpf in UMH and get prog id */ err = transact_umh(req, &id); + if (err) + goto out; + + prog = bpf_prog_get_type_dev_by_id(id, BPF_PROG_TYPE_XDP, true); + if (IS_ERR(prog)) { + err = PTR_ERR(prog); + goto err_umh; + } - /* TODO: id will be used to attach bpf prog to XDP - * As we have rtnl_lock, UMH cannot attach prog to XDP - */ + err = dev_change_xdp(dev, extack, prog, flags); + if (err) + goto err_prog; + prog_node = kzalloc(sizeof(*prog_node), GFP_KERNEL); + if (!prog_node) { + err = -ENOMEM; + goto err_xdp; + } + + prog_node->dev = dev; + prog_node->prog = prog; + err = rhashtable_insert_fast(&progs, &prog_node->ht_node, progs_params); + if (err) + goto err_pnode; + + prog = bpf_prog_inc(prog); + if (IS_ERR(prog)) { + err = PTR_ERR(prog); + goto err_rht; + } +out: kfree(req); return err; +err_rht: + rhashtable_remove_fast(&progs, &prog_node->ht_node, progs_params); +err_pnode: + kfree(prog_node); +err_xdp: + dev_change_xdp(dev, extack, NULL, flags); +err_prog: + bpf_prog_put(prog); +err_umh: + req->cmd = XDP_FLOW_CMD_UNLOAD; + transact_umh(req, NULL); + + goto out; } static int xdp_flow_setup_unbind(struct net_device *dev, struct netlink_ext_ack *extack) { + struct xdp_flow_prog *prog_node; + u32 flags = XDP_FLAGS_DRV_MODE; struct mbox_request *req; - int err; + int err, ret = 0; + u32 prog_id = 0; + + prog_node = rhashtable_lookup_fast(&progs, &dev, progs_params); + if (!prog_node) { + pr_warn_once("%s: xdp_flow unbind was requested before bind\n", + dev->name); + return -ENOENT; + } + + err = dev_check_xdp(dev, extack, false, &prog_id, flags); + if (!err && 
prog_id == prog_node->prog->aux->id) { + err = dev_change_xdp(dev, extack, NULL, flags); + if (err) { + pr_warn("Failed to uninstall XDP prog: %d\n", err); + ret = err; + } + } + + bpf_prog_put(prog_node->prog); + rhashtable_remove_fast(&progs, &prog_node->ht_node, progs_params); + kfree(prog_node); req = kzalloc(sizeof(*req), GFP_KERNEL); if (!req) @@ -153,10 +239,12 @@ static int xdp_flow_setup_unbind(struct net_device *dev, req->ifindex = dev->ifindex; err = transact_umh(req, NULL); + if (err) + ret = err; kfree(req); - return err; + return ret; } static int xdp_flow_setup(struct net_device *dev, bool do_bind, @@ -214,7 +302,11 @@ static int start_umh(void) static int __init load_umh(void) { - int err = 0; + int err; + + err = rhashtable_init(&progs, &progs_params); + if (err) + return err; mutex_lock(&xdp_flow_ops.lock); if (!xdp_flow_ops.stop) { @@ -230,8 +322,12 @@ static int __init load_umh(void) xdp_flow_ops.setup = &xdp_flow_setup; xdp_flow_ops.start = &start_umh; xdp_flow_ops.module = THIS_MODULE; + + mutex_unlock(&xdp_flow_ops.lock); + return 0; err: mutex_unlock(&xdp_flow_ops.lock); + rhashtable_destroy(&progs); return err; } @@ -244,6 +340,7 @@ static void __exit fini_umh(void) xdp_flow_ops.setup = NULL; xdp_flow_ops.setup_cb = NULL; mutex_unlock(&xdp_flow_ops.lock); + rhashtable_destroy(&progs); } module_init(load_umh); module_exit(fini_umh);
From patchwork Fri Oct 18 04:07:39 2019
From: Toshiaki Makita
To: Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu, Yonghong Song, "David S. Miller", Jakub Kicinski, Jesper Dangaard Brouer, John Fastabend, Jamal Hadi Salim, Cong Wang, Jiri Pirko, Pablo Neira Ayuso, Jozsef Kadlecsik, Florian Westphal, Pravin B Shelar
Cc: Toshiaki Makita, netdev@vger.kernel.org, bpf@vger.kernel.org, William Tu, Stanislav Fomichev
Subject: [RFC PATCH v2 bpf-next 06/15] xdp_flow: Prepare flow tables in bpf
Date: Fri, 18 Oct 2019 13:07:39 +0900
Message-Id: <20191018040748.30593-7-toshiaki.makita1@gmail.com>

Add maps for flow tables in bpf. TC flower has hash tables for each flow mask ordered by priority. To do the same thing, prepare a hashmap-in-arraymap. As bpf does not provide ordered list, we emulate it by an array. Each array entry has one-byte next index field to implement a list. Also prepare a one-element array to point to the head index of the list. Because of the limitation of bpf maps, the outer array is implemented using two array maps. "flow_masks" is the array to emulate the list and its entries have the priority and mask of each flow table.
For each priority/mask, the same index entry of another map, "flow_tables"
(the hashmap-in-arraymap), points to the actual flow table.

The flow insertion logic in the UMH and the lookup logic in BPF will be
implemented in the following commits.

NOTE: This array-based list emulation might instead be realized by adding
an ordered-list map type; in that case a map iteration API for bpf progs
would also be needed.

Signed-off-by: Toshiaki Makita
---
 net/xdp_flow/umh_bpf.h           | 18 +++++++++++
 net/xdp_flow/xdp_flow_kern_bpf.c | 22 +++++++++++++
 net/xdp_flow/xdp_flow_umh.c      | 70 ++++++++++++++++++++++++++++++++++++++--
 3 files changed, 108 insertions(+), 2 deletions(-)
 create mode 100644 net/xdp_flow/umh_bpf.h

diff --git a/net/xdp_flow/umh_bpf.h b/net/xdp_flow/umh_bpf.h
new file mode 100644
index 0000000..b4fe0c6
--- /dev/null
+++ b/net/xdp_flow/umh_bpf.h
@@ -0,0 +1,18 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _NET_XDP_FLOW_UMH_BPF_H
+#define _NET_XDP_FLOW_UMH_BPF_H
+
+#include "msgfmt.h"
+
+#define MAX_FLOWS 1024
+#define MAX_FLOW_MASKS 255
+#define FLOW_MASKS_TAIL 255
+
+struct xdp_flow_mask_entry {
+	struct xdp_flow_key mask;
+	__u16 priority;
+	short count;
+	int next;
+};
+
+#endif
diff --git a/net/xdp_flow/xdp_flow_kern_bpf.c b/net/xdp_flow/xdp_flow_kern_bpf.c
index 74cdb1d..c101156 100644
--- a/net/xdp_flow/xdp_flow_kern_bpf.c
+++ b/net/xdp_flow/xdp_flow_kern_bpf.c
@@ -2,6 +2,28 @@
 #define KBUILD_MODNAME "foo"
 #include
 #include
+#include "umh_bpf.h"
+
+struct bpf_map_def SEC("maps") flow_masks_head = {
+	.type = BPF_MAP_TYPE_ARRAY,
+	.key_size = sizeof(u32),
+	.value_size = sizeof(int),
+	.max_entries = 1,
+};
+
+struct bpf_map_def SEC("maps") flow_masks = {
+	.type = BPF_MAP_TYPE_ARRAY,
+	.key_size = sizeof(u32),
+	.value_size = sizeof(struct xdp_flow_mask_entry),
+	.max_entries = MAX_FLOW_MASKS,
+};
+
+struct bpf_map_def SEC("maps") flow_tables = {
+	.type = BPF_MAP_TYPE_ARRAY_OF_MAPS,
+	.key_size = sizeof(u32),
+	.value_size = sizeof(u32),
+	.max_entries = MAX_FLOW_MASKS,
+};
 
 SEC("xdp_flow")
 int xdp_flow_prog(struct xdp_md *ctx)
diff --git a/net/xdp_flow/xdp_flow_umh.c b/net/xdp_flow/xdp_flow_umh.c
index 85c5c7b..515c2fd 100644
--- a/net/xdp_flow/xdp_flow_umh.c
+++ b/net/xdp_flow/xdp_flow_umh.c
@@ -13,7 +13,7 @@
 #include
 #include
 #include
-#include "msgfmt.h"
+#include "umh_bpf.h"
 
 extern char xdp_flow_bpf_start;
 extern char xdp_flow_bpf_end;
@@ -99,11 +99,13 @@ static int setup(void)
 
 static int load_bpf(int ifindex, struct bpf_object **objp)
 {
+	int prog_fd, flow_tables_fd, flow_meta_fd, flow_masks_head_fd, err;
+	struct bpf_map *flow_tables, *flow_masks_head;
+	int zero = 0, flow_masks_tail = FLOW_MASKS_TAIL;
 	struct bpf_object_open_attr attr = {};
 	char path[256], errbuf[ERRBUF_SIZE];
 	struct bpf_program *prog;
 	struct bpf_object *obj;
-	int prog_fd, err;
 	ssize_t len;
 
 	len = snprintf(path, 256, "/proc/self/fd/%d", progfile_fd);
@@ -131,6 +133,48 @@ static int load_bpf(int ifindex, struct bpf_object **objp)
 	bpf_object__for_each_program(prog, obj)
 		bpf_program__set_type(prog, attr.prog_type);
 
+	flow_meta_fd = bpf_create_map(BPF_MAP_TYPE_HASH,
+				      sizeof(struct xdp_flow_key),
+				      sizeof(struct xdp_flow_actions),
+				      MAX_FLOWS, 0);
+	if (flow_meta_fd < 0) {
+		err = -errno;
+		pr_err("map creation for flow_tables meta failed: %s\n",
+		       strerror(errno));
+		goto err;
+	}
+
+	flow_tables_fd = bpf_create_map_in_map(BPF_MAP_TYPE_ARRAY_OF_MAPS,
+					       "flow_tables", sizeof(__u32),
+					       flow_meta_fd, MAX_FLOW_MASKS, 0);
+	if (flow_tables_fd < 0) {
+		err = -errno;
+		pr_err("map creation for flow_tables failed: %s\n",
+		       strerror(errno));
+		close(flow_meta_fd);
+		goto err;
+	}
+
+	close(flow_meta_fd);
+
+	flow_tables = bpf_object__find_map_by_name(obj, "flow_tables");
+	if (!flow_tables) {
+		pr_err("Cannot find flow_tables\n");
+		err = -ENOENT;
+		close(flow_tables_fd);
+		goto err;
+	}
+
+	err = bpf_map__reuse_fd(flow_tables, flow_tables_fd);
+	if (err) {
+		err = libbpf_err(err, errbuf);
+		pr_err("Failed to reuse flow_tables fd: %s\n", errbuf);
+		close(flow_tables_fd);
+		goto err;
+	}
+
+	close(flow_tables_fd);
+
 	err = bpf_object__load(obj);
 	if (err) {
 		err = libbpf_err(err, errbuf);
@@ -138,6 +182,28 @@ static int load_bpf(int ifindex, struct bpf_object **objp)
 		goto err;
 	}
 
+	flow_masks_head = bpf_object__find_map_by_name(obj, "flow_masks_head");
+	if (!flow_masks_head) {
+		pr_err("Cannot find flow_masks_head map\n");
+		err = -ENOENT;
+		goto err;
+	}
+
+	flow_masks_head_fd = bpf_map__fd(flow_masks_head);
+	if (flow_masks_head_fd < 0) {
+		err = libbpf_err(flow_masks_head_fd, errbuf);
+		pr_err("Invalid flow_masks_head fd: %s\n", errbuf);
+		goto err;
+	}
+
+	if (bpf_map_update_elem(flow_masks_head_fd, &zero, &flow_masks_tail,
+				0)) {
+		err = -errno;
+		pr_err("Failed to initialize flow_masks_head: %s\n",
+		       strerror(errno));
+		goto err;
+	}
+
 	prog = bpf_object__find_program_by_title(obj, "xdp_flow");
 	if (!prog) {
 		pr_err("Cannot find xdp_flow program\n");

From patchwork Fri Oct 18 04:07:40 2019
X-Patchwork-Submitter: Toshiaki Makita
X-Patchwork-Id: 1179142
X-Patchwork-Delegate: bpf@iogearbox.net
From: Toshiaki Makita
To: Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu, Yonghong Song, "David S. Miller", Jakub Kicinski, Jesper Dangaard Brouer, John Fastabend, Jamal Hadi Salim, Cong Wang, Jiri Pirko, Pablo Neira Ayuso, Jozsef Kadlecsik, Florian Westphal, Pravin B Shelar
Cc: Toshiaki Makita, netdev@vger.kernel.org, bpf@vger.kernel.org, William Tu, Stanislav Fomichev
Subject: [RFC PATCH v2 bpf-next 07/15] xdp_flow: Add flow entry insertion/deletion logic in UMH
Date: Fri, 18 Oct 2019 13:07:40 +0900
Message-Id: <20191018040748.30593-8-toshiaki.makita1@gmail.com>
In-Reply-To: <20191018040748.30593-1-toshiaki.makita1@gmail.com>
References: <20191018040748.30593-1-toshiaki.makita1@gmail.com>

This logic is used when the xdp_flow kmod requests flow insertion/deletion.

On insertion, find a free entry, populate it, then update the next-index
pointer of its previous entry. On deletion, set the next-index pointer of
the previous entry to the next index of the entry being deleted.
Signed-off-by: Toshiaki Makita
---
 net/xdp_flow/umh_bpf.h      |  15 ++
 net/xdp_flow/xdp_flow_umh.c | 470 +++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 483 insertions(+), 2 deletions(-)

diff --git a/net/xdp_flow/umh_bpf.h b/net/xdp_flow/umh_bpf.h
index b4fe0c6..4e4633f 100644
--- a/net/xdp_flow/umh_bpf.h
+++ b/net/xdp_flow/umh_bpf.h
@@ -15,4 +15,19 @@ struct xdp_flow_mask_entry {
 	int next;
 };
 
+static inline bool flow_equal(const struct xdp_flow_key *key1,
+			      const struct xdp_flow_key *key2)
+{
+	long *lkey1 = (long *)key1;
+	long *lkey2 = (long *)key2;
+	int i;
+
+	for (i = 0; i < sizeof(*key1); i += sizeof(long)) {
+		if (*lkey1++ != *lkey2++)
+			return false;
+	}
+
+	return true;
+}
+
 #endif
diff --git a/net/xdp_flow/xdp_flow_umh.c b/net/xdp_flow/xdp_flow_umh.c
index 515c2fd..0588a36 100644
--- a/net/xdp_flow/xdp_flow_umh.c
+++ b/net/xdp_flow/xdp_flow_umh.c
@@ -20,6 +20,8 @@ int progfile_fd;
 FILE *kmsg;
 
+#define zalloc(size) calloc(1, (size))
+
 #define pr_log(fmt, prio, ...) fprintf(kmsg, "<%d>xdp_flow_umh: " fmt, \
 				       LOG_DAEMON | (prio), ##__VA_ARGS__)
 #ifdef DEBUG
@@ -42,6 +44,8 @@ struct netdev_info {
 	struct netdev_info_key key;
 	struct hlist_node node;
 	struct bpf_object *obj;
+	int free_slot_top;
+	int free_slots[MAX_FLOW_MASKS];
 };
 
 DEFINE_HASHTABLE(netdev_info_table, 16);
@@ -272,6 +276,57 @@ static struct netdev_info *get_netdev_info(const struct mbox_request *req)
 	return netdev_info;
 }
 
+static void init_flow_masks_free_slot(struct netdev_info *netdev_info)
+{
+	int i;
+
+	for (i = 0; i < MAX_FLOW_MASKS; i++)
+		netdev_info->free_slots[MAX_FLOW_MASKS - 1 - i] = i;
+	netdev_info->free_slot_top = MAX_FLOW_MASKS - 1;
+}
+
+static int get_flow_masks_free_slot(const struct netdev_info *netdev_info)
+{
+	if (netdev_info->free_slot_top < 0)
+		return -ENOBUFS;
+
+	return netdev_info->free_slots[netdev_info->free_slot_top];
+}
+
+static int add_flow_masks_free_slot(struct netdev_info *netdev_info, int slot)
+{
+	if (unlikely(netdev_info->free_slot_top >= MAX_FLOW_MASKS - 1)) {
+		pr_warn("BUG: free_slot overflow: top=%d, slot=%d\n",
+			netdev_info->free_slot_top, slot);
+		return -EOVERFLOW;
+	}
+
+	netdev_info->free_slots[++netdev_info->free_slot_top] = slot;
+
+	return 0;
+}
+
+static void delete_flow_masks_free_slot(struct netdev_info *netdev_info,
+					int slot)
+{
+	int top_slot;
+
+	if (unlikely(netdev_info->free_slot_top < 0)) {
+		pr_warn("BUG: free_slot underflow: top=%d, slot=%d\n",
+			netdev_info->free_slot_top, slot);
+		return;
+	}
+
+	top_slot = netdev_info->free_slots[netdev_info->free_slot_top];
+	if (unlikely(top_slot != slot)) {
+		pr_warn("BUG: inconsistent free_slot top: top_slot=%d, slot=%d\n",
+			top_slot, slot);
+		return;
+	}
+
+	netdev_info->free_slot_top--;
+}
+
 static int handle_load(const struct mbox_request *req, __u32 *prog_id)
 {
 	struct netdev_info *netdev_info;
@@ -295,6 +350,8 @@ static int handle_load(const struct mbox_request *req, __u32 *prog_id)
 	}
 	netdev_info->key.ifindex = key.ifindex;
 
+	init_flow_masks_free_slot(netdev_info);
+
 	prog_fd = load_bpf(req->ifindex, &netdev_info->obj);
 	if (prog_fd < 0) {
 		err = prog_fd;
@@ -335,14 +392,423 @@ static int handle_unload(const struct mbox_request *req)
 	return 0;
 }
 
+static int get_table_fd(const struct netdev_info *netdev_info,
+			const char *table_name)
+{
+	char errbuf[ERRBUF_SIZE];
+	struct bpf_map *map;
+	int map_fd;
+	int err;
+
+	map = bpf_object__find_map_by_name(netdev_info->obj, table_name);
+	if (!map) {
+		pr_err("BUG: %s map not found.\n", table_name);
+		return -ENOENT;
+	}
+
+	map_fd = bpf_map__fd(map);
+	if (map_fd < 0) {
+		err = libbpf_err(map_fd, errbuf);
+		pr_err("Invalid map fd: %s\n", errbuf);
+		return err;
+	}
+
+	return map_fd;
+}
+
+static int get_flow_masks_head_fd(const struct netdev_info *netdev_info)
+{
+	return get_table_fd(netdev_info, "flow_masks_head");
+}
+
+static int get_flow_masks_head(int head_fd, int *head)
+{
+	int err, zero = 0;
+
+	if (bpf_map_lookup_elem(head_fd, &zero, head)) {
+		err = -errno;
+		pr_err("Cannot get flow_masks_head: %s\n", strerror(errno));
+		return err;
+	}
+
+	return 0;
+}
+
+static int update_flow_masks_head(int head_fd, int head)
+{
+	int err, zero = 0;
+
+	if (bpf_map_update_elem(head_fd, &zero, &head, 0)) {
+		err = -errno;
+		pr_err("Cannot update flow_masks_head: %s\n", strerror(errno));
+		return err;
+	}
+
+	return 0;
+}
+
+static int get_flow_masks_fd(const struct netdev_info *netdev_info)
+{
+	return get_table_fd(netdev_info, "flow_masks");
+}
+
+static int get_flow_tables_fd(const struct netdev_info *netdev_info)
+{
+	return get_table_fd(netdev_info, "flow_tables");
+}
+
+static int __flow_table_insert_elem(int flow_table_fd,
+				    const struct xdp_flow *flow)
+{
+	int err = 0;
+
+	if (bpf_map_update_elem(flow_table_fd, &flow->key, &flow->actions, 0)) {
+		err = -errno;
+		pr_err("Cannot insert flow entry: %s\n",
+		       strerror(errno));
+	}
+
+	return err;
+}
+
+static void __flow_table_delete_elem(int flow_table_fd,
+				     const struct xdp_flow *flow)
+{
+	bpf_map_delete_elem(flow_table_fd, &flow->key);
+}
+
+static int flow_table_insert_elem(struct netdev_info *netdev_info,
+				  const struct xdp_flow *flow)
+{
+	int masks_fd, head_fd, flow_tables_fd, flow_table_fd, free_slot, head;
+	struct xdp_flow_mask_entry *entry, *pentry;
+	int err, cnt, idx, pidx;
+
+	masks_fd = get_flow_masks_fd(netdev_info);
+	if (masks_fd < 0)
+		return masks_fd;
+
+	head_fd = get_flow_masks_head_fd(netdev_info);
+	if (head_fd < 0)
+		return head_fd;
+
+	err = get_flow_masks_head(head_fd, &head);
+	if (err)
+		return err;
+
+	flow_tables_fd = get_flow_tables_fd(netdev_info);
+	if (flow_tables_fd < 0)
+		return flow_tables_fd;
+
+	entry = zalloc(sizeof(*entry));
+	if (!entry) {
+		pr_err("Memory allocation for flow_masks entry failed\n");
+		return -ENOMEM;
+	}
+
+	pentry = zalloc(sizeof(*pentry));
+	if (!pentry) {
+		flow_table_fd = -ENOMEM;
+		pr_err("Memory allocation for flow_masks prev entry failed\n");
+		goto err_entry;
+	}
+
+	idx = head;
+	for (cnt = 0; cnt < MAX_FLOW_MASKS; cnt++) {
+		if (idx == FLOW_MASKS_TAIL)
+			break;
+
+		if (bpf_map_lookup_elem(masks_fd, &idx, entry)) {
+			err = -errno;
+			pr_err("Cannot lookup flow_masks: %s\n",
+			       strerror(errno));
+			goto err;
+		}
+
+		if (entry->priority == flow->priority &&
+		    flow_equal(&entry->mask, &flow->mask)) {
+			__u32 id;
+
+			if (bpf_map_lookup_elem(flow_tables_fd, &idx, &id)) {
+				err = -errno;
+				pr_err("Cannot lookup flow_tables: %s\n",
+				       strerror(errno));
+				goto err;
+			}
+
+			flow_table_fd = bpf_map_get_fd_by_id(id);
+			if (flow_table_fd < 0) {
+				err = -errno;
+				pr_err("Cannot get flow_table fd by id: %s\n",
+				       strerror(errno));
+				goto err;
+			}
+
+			err = __flow_table_insert_elem(flow_table_fd, flow);
+			if (err)
+				goto out;
+
+			entry->count++;
+			if (bpf_map_update_elem(masks_fd, &idx, entry, 0)) {
+				err = -errno;
+				pr_err("Cannot update flow_masks count: %s\n",
+				       strerror(errno));
+				__flow_table_delete_elem(flow_table_fd, flow);
+				goto out;
+			}
+
+			goto out;
+		}
+
+		if (entry->priority > flow->priority)
+			break;
+
+		*pentry = *entry;
+		pidx = idx;
+		idx = entry->next;
+	}
+
+	if (unlikely(cnt == MAX_FLOW_MASKS && idx != FLOW_MASKS_TAIL)) {
+		err = -EINVAL;
+		pr_err("Cannot lookup flow_masks: Broken flow_masks list\n");
+		goto out;
+	}
+
+	/* Flow mask was not found. Create a new one */
+
+	free_slot = get_flow_masks_free_slot(netdev_info);
+	if (free_slot < 0) {
+		err = free_slot;
+		goto err;
+	}
+
+	entry->mask = flow->mask;
+	entry->priority = flow->priority;
+	entry->count = 1;
+	entry->next = idx;
+	if (bpf_map_update_elem(masks_fd, &free_slot, entry, 0)) {
+		err = -errno;
+		pr_err("Cannot update flow_masks: %s\n", strerror(errno));
+		goto err;
+	}
+
+	flow_table_fd = bpf_create_map(BPF_MAP_TYPE_HASH,
+				       sizeof(struct xdp_flow_key),
+				       sizeof(struct xdp_flow_actions),
+				       MAX_FLOWS, 0);
+	if (flow_table_fd < 0) {
+		err = -errno;
+		pr_err("map creation for flow_table failed: %s\n",
+		       strerror(errno));
+		goto err;
+	}
+
+	err = __flow_table_insert_elem(flow_table_fd, flow);
+	if (err)
+		goto out;
+
+	if (bpf_map_update_elem(flow_tables_fd, &free_slot, &flow_table_fd, 0)) {
+		err = -errno;
+		pr_err("Failed to insert flow_table into flow_tables: %s\n",
+		       strerror(errno));
+		goto out;
+	}
+
+	if (cnt == 0) {
+		err = update_flow_masks_head(head_fd, free_slot);
+		if (err)
+			goto err_flow_table;
+	} else {
+		pentry->next = free_slot;
+		/* This effectively only updates one byte of entry->next */
+		if (bpf_map_update_elem(masks_fd, &pidx, pentry, 0)) {
+			err = -errno;
+			pr_err("Cannot update flow_masks prev entry: %s\n",
+			       strerror(errno));
+			goto err_flow_table;
+		}
+	}
+	delete_flow_masks_free_slot(netdev_info, free_slot);
+out:
+	close(flow_table_fd);
+err:
+	free(pentry);
+err_entry:
+	free(entry);
+
+	return err;
+
+err_flow_table:
+	bpf_map_delete_elem(flow_tables_fd, &free_slot);
+
+	goto out;
+}
+
+static int flow_table_delete_elem(struct netdev_info *netdev_info,
+				  const struct xdp_flow *flow)
+{
+	int masks_fd, head_fd, flow_tables_fd, flow_table_fd, head;
+	struct xdp_flow_mask_entry *entry, *pentry;
+	int err, cnt, idx, pidx;
+	__u32 id;
+
+	masks_fd = get_flow_masks_fd(netdev_info);
+	if (masks_fd < 0)
+		return masks_fd;
+
+	head_fd = get_flow_masks_head_fd(netdev_info);
+	if (head_fd < 0)
+		return head_fd;
+
+	err = get_flow_masks_head(head_fd, &head);
+	if (err)
+		return err;
+
+	flow_tables_fd = get_flow_tables_fd(netdev_info);
+	if (flow_tables_fd < 0)
+		return flow_tables_fd;
+
+	entry = zalloc(sizeof(*entry));
+	if (!entry) {
+		pr_err("Memory allocation for flow_masks entry failed\n");
+		return -ENOMEM;
+	}
+
+	pentry = zalloc(sizeof(*pentry));
+	if (!pentry) {
+		err = -ENOMEM;
+		pr_err("Memory allocation for flow_masks prev entry failed\n");
+		goto err_pentry;
+	}
+
+	idx = head;
+	for (cnt = 0; cnt < MAX_FLOW_MASKS; cnt++) {
+		if (idx == FLOW_MASKS_TAIL) {
+			err = -ENOENT;
+			pr_err("Cannot lookup flow_masks: %s\n",
+			       strerror(-err));
+			goto out;
+		}
+
+		if (bpf_map_lookup_elem(masks_fd, &idx, entry)) {
+			err = -errno;
+			pr_err("Cannot lookup flow_masks: %s\n",
+			       strerror(errno));
+			goto out;
+		}
+
+		if (entry->priority > flow->priority) {
+			err = -ENOENT;
+			pr_err("Cannot lookup flow_masks: %s\n",
+			       strerror(-err));
+			goto out;
+		}
+
+		if (entry->priority == flow->priority &&
+		    flow_equal(&entry->mask, &flow->mask))
+			break;
+
+		*pentry = *entry;
+		pidx = idx;
+		idx = entry->next;
+	}
+
+	if (unlikely(cnt == MAX_FLOW_MASKS)) {
+		err = -ENOENT;
+		pr_err("Cannot lookup flow_masks: Broken flow_masks list\n");
+		goto out;
+	}
+
+	if (bpf_map_lookup_elem(flow_tables_fd, &idx, &id)) {
+		err = -errno;
+		pr_err("Cannot lookup flow_tables: %s\n",
+		       strerror(errno));
+		goto out;
+	}
+
+	flow_table_fd = bpf_map_get_fd_by_id(id);
+	if (flow_table_fd < 0) {
+		err = -errno;
+		pr_err("Cannot get flow_table fd by id: %s\n",
+		       strerror(errno));
+		goto out;
+	}
+
+	__flow_table_delete_elem(flow_table_fd, flow);
+	close(flow_table_fd);
+
+	if (--entry->count > 0) {
+		if (bpf_map_update_elem(masks_fd, &idx, entry, 0)) {
+			err = -errno;
+			pr_err("Cannot update flow_masks count: %s\n",
+			       strerror(errno));
+		}
+
+		goto out;
+	}
+
+	if (unlikely(entry->count < 0)) {
+		pr_warn("flow_masks has negative count: %d\n",
+			entry->count);
+	}
+
+	if (cnt == 0) {
+		err = update_flow_masks_head(head_fd, entry->next);
+		if (err)
+			goto out;
+	} else {
+		pentry->next = entry->next;
+		/* This effectively only updates one byte of entry->next */
+		if (bpf_map_update_elem(masks_fd, &pidx, pentry, 0)) {
+			err = -errno;
+			pr_err("Cannot update flow_masks prev entry: %s\n",
+			       strerror(errno));
+			goto out;
+		}
+	}
+
+	bpf_map_delete_elem(flow_tables_fd, &idx);
+	err = add_flow_masks_free_slot(netdev_info, idx);
+	if (err)
+		pr_err("Cannot add flow_masks free slot: %s\n", strerror(-err));
+out:
+	free(pentry);
+err_pentry:
+	free(entry);
+
+	return err;
+}
+
 static int handle_replace(struct mbox_request *req)
 {
-	return -EOPNOTSUPP;
+	struct netdev_info *netdev_info;
+	int err;
+
+	netdev_info = get_netdev_info(req);
+	if (IS_ERR(netdev_info))
+		return PTR_ERR(netdev_info);
+
+	err = flow_table_insert_elem(netdev_info, &req->flow);
+	if (err)
+		return err;
+
+	return 0;
 }
 
 static int handle_delete(const struct mbox_request *req)
 {
-	return -EOPNOTSUPP;
+	struct netdev_info *netdev_info;
+	int err;
+
+	netdev_info = get_netdev_info(req);
+	if (IS_ERR(netdev_info))
+		return PTR_ERR(netdev_info);
+
+	err = flow_table_delete_elem(netdev_info, &req->flow);
+	if (err)
+		return err;
+
+	return 0;
 }
 
 static void loop(void)

From patchwork Fri Oct 18 04:07:41 2019
X-Patchwork-Submitter: Toshiaki Makita
X-Patchwork-Id: 1179146
X-Patchwork-Delegate: bpf@iogearbox.net
From: Toshiaki Makita
To: Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu, Yonghong Song, "David S. Miller", Jakub Kicinski, Jesper Dangaard Brouer, John Fastabend, Jamal Hadi Salim, Cong Wang, Jiri Pirko, Pablo Neira Ayuso, Jozsef Kadlecsik, Florian Westphal, Pravin B Shelar
Cc: Toshiaki Makita, netdev@vger.kernel.org, bpf@vger.kernel.org, William Tu, Stanislav Fomichev
Subject: [RFC PATCH v2 bpf-next 08/15] xdp_flow: Add flow handling and basic actions in bpf prog
Date: Fri, 18 Oct 2019 13:07:41 +0900
Message-Id: <20191018040748.30593-9-toshiaki.makita1@gmail.com>
In-Reply-To: <20191018040748.30593-1-toshiaki.makita1@gmail.com>
References: <20191018040748.30593-1-toshiaki.makita1@gmail.com>

The BPF prog for XDP parses the packet and extracts the flow key, then
looks up an entry in the flow tables. Only the "accept" and "drop" actions
are implemented at this point.
Signed-off-by: Toshiaki Makita
---
 net/xdp_flow/xdp_flow_kern_bpf.c | 297 ++++++++++++++++++++++++++++++++++++++-
 1 file changed, 296 insertions(+), 1 deletion(-)

diff --git a/net/xdp_flow/xdp_flow_kern_bpf.c b/net/xdp_flow/xdp_flow_kern_bpf.c
index c101156..f4a6346 100644
--- a/net/xdp_flow/xdp_flow_kern_bpf.c
+++ b/net/xdp_flow/xdp_flow_kern_bpf.c
@@ -1,9 +1,27 @@
 // SPDX-License-Identifier: GPL-2.0
 #define KBUILD_MODNAME "foo"
 #include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
 #include
 #include "umh_bpf.h"
 
+/* Used when the action only modifies the packet */
+#define _XDP_CONTINUE -1
+
+struct bpf_map_def SEC("maps") debug_stats = {
+	.type = BPF_MAP_TYPE_PERCPU_ARRAY,
+	.key_size = sizeof(u32),
+	.value_size = sizeof(long),
+	.max_entries = 256,
+};
+
 struct bpf_map_def SEC("maps") flow_masks_head = {
 	.type = BPF_MAP_TYPE_ARRAY,
 	.key_size = sizeof(u32),
@@ -25,10 +43,287 @@ struct bpf_map_def SEC("maps") flow_tables = {
 	.max_entries = MAX_FLOW_MASKS,
 };
 
+static inline void account_debug(int idx)
+{
+	long *cnt;
+
+	cnt = bpf_map_lookup_elem(&debug_stats, &idx);
+	if (cnt)
+		*cnt += 1;
+}
+
+static inline void account_action(int act)
+{
+	account_debug(act + 1);
+}
+
+static inline int action_accept(void)
+{
+	account_action(XDP_FLOW_ACTION_ACCEPT);
+	return XDP_PASS;
+}
+
+static inline int action_drop(void)
+{
+	account_action(XDP_FLOW_ACTION_DROP);
+	return XDP_DROP;
+}
+
+static inline int action_redirect(struct xdp_flow_action *action)
+{
+	account_action(XDP_FLOW_ACTION_REDIRECT);
+
+	// TODO: implement this
+	return XDP_ABORTED;
+}
+
+static inline int action_vlan_push(struct xdp_md *ctx,
+				   struct xdp_flow_action *action)
+{
+	account_action(XDP_FLOW_ACTION_VLAN_PUSH);
+
+	// TODO: implement this
+	return XDP_ABORTED;
+}
+
+static inline int action_vlan_pop(struct xdp_md *ctx,
+				  struct xdp_flow_action *action)
+{
+	account_action(XDP_FLOW_ACTION_VLAN_POP);
+
+	// TODO: implement this
+	return XDP_ABORTED;
+}
+
+static inline int action_vlan_mangle(struct xdp_md *ctx,
+				     struct xdp_flow_action *action)
+{
+	account_action(XDP_FLOW_ACTION_VLAN_MANGLE);
+
+	// TODO: implement this
+	return XDP_ABORTED;
+}
+
+static inline int action_mangle(struct xdp_md *ctx,
+				struct xdp_flow_action *action)
+{
+	account_action(XDP_FLOW_ACTION_MANGLE);
+
+	// TODO: implement this
+	return XDP_ABORTED;
+}
+
+static inline int action_csum(struct xdp_md *ctx,
+			      struct xdp_flow_action *action)
+{
+	account_action(XDP_FLOW_ACTION_CSUM);
+
+	// TODO: implement this
+	return XDP_ABORTED;
+}
+
+static inline void __ether_addr_copy(u8 *dst, const u8 *src)
+{
+	u16 *a = (u16 *)dst;
+	const u16 *b = (const u16 *)src;
+
+	a[0] = b[0];
+	a[1] = b[1];
+	a[2] = b[2];
+}
+
+static inline int parse_ipv4(void *data, u64 *nh_off, void *data_end,
+			     struct xdp_flow_key *key)
+{
+	struct iphdr *iph = data + *nh_off;
+
+	if (iph + 1 > data_end)
+		return -1;
+
+	key->ipv4.src = iph->saddr;
+	key->ipv4.dst = iph->daddr;
+	key->ip.ttl = iph->ttl;
+	key->ip.tos = iph->tos;
+	*nh_off += iph->ihl * 4;
+
+	return iph->protocol;
+}
+
+static inline int parse_ipv6(void *data, u64 *nh_off, void *data_end,
+			     struct xdp_flow_key *key)
+{
+	struct ipv6hdr *ip6h = data + *nh_off;
+
+	if (ip6h + 1 > data_end)
+		return -1;
+
+	key->ipv6.src = ip6h->saddr;
+	key->ipv6.dst = ip6h->daddr;
+	key->ip.ttl = ip6h->hop_limit;
+	key->ip.tos = ipv6_get_dsfield(ip6h);
+	*nh_off += sizeof(*ip6h);
+
+	if (ip6h->nexthdr == NEXTHDR_HOP ||
+	    ip6h->nexthdr == NEXTHDR_ROUTING ||
+	    ip6h->nexthdr == NEXTHDR_FRAGMENT ||
+	    ip6h->nexthdr == NEXTHDR_AUTH ||
+	    ip6h->nexthdr == NEXTHDR_NONE ||
+	    ip6h->nexthdr == NEXTHDR_DEST)
+		return 0;
+
+	return ip6h->nexthdr;
+}
+
+#define for_each_flow_mask(entry, head, idx, cnt) \
+	for (entry = bpf_map_lookup_elem(&flow_masks, (head)), \
+	     idx = *(head), cnt = 0; \
+	     entry != NULL && cnt < MAX_FLOW_MASKS; \
+	     idx = entry->next, \
+	     entry = bpf_map_lookup_elem(&flow_masks, &idx), cnt++)
+
+static inline void flow_mask(struct xdp_flow_key *mkey,
+			     const struct xdp_flow_key *key,
+			     const struct xdp_flow_key *mask)
+{
+	long *lmkey = (long *)mkey;
+	long *lmask = (long *)mask;
+	long *lkey = (long *)key;
+	int i;
+
+	for (i = 0; i < sizeof(*mkey); i += sizeof(long))
+		*lmkey++ = *lkey++ & *lmask++;
+}
+
 SEC("xdp_flow")
 int xdp_flow_prog(struct xdp_md *ctx)
 {
-	return XDP_PASS;
+	void *data_end = (void *)(long)ctx->data_end;
+	struct xdp_flow_actions *actions = NULL;
+	void *data = (void *)(long)ctx->data;
+	int cnt, idx, action_idx, zero = 0;
+	struct xdp_flow_mask_entry *entry;
+	struct ethhdr *eth = data;
+	struct xdp_flow_key key;
+	int rc = XDP_DROP;
+	long *value;
+	u16 h_proto;
+	int ipproto;
+	u64 nh_off;
+	int *head;
+
+	account_debug(0);
+
+	nh_off = sizeof(*eth);
+	if (data + nh_off > data_end)
+		return XDP_DROP;
+
+	__builtin_memset(&key, 0, sizeof(key));
+	h_proto = eth->h_proto;
+	__ether_addr_copy(key.eth.dst, eth->h_dest);
+	__ether_addr_copy(key.eth.src, eth->h_source);
+
+	if (eth_type_vlan(h_proto)) {
+		struct vlan_hdr *vhdr;
+
+		vhdr = data + nh_off;
+		nh_off += sizeof(*vhdr);
+		if (data + nh_off > data_end)
+			return XDP_DROP;
+		key.vlan.tpid = h_proto;
+		key.vlan.tci = vhdr->h_vlan_TCI;
+		h_proto = vhdr->h_vlan_encapsulated_proto;
+	}
+	key.eth.type = h_proto;
+
+	if (h_proto == htons(ETH_P_IP))
+		ipproto = parse_ipv4(data, &nh_off, data_end, &key);
+	else if (h_proto == htons(ETH_P_IPV6))
+		ipproto = parse_ipv6(data, &nh_off, data_end, &key);
+	else
+		ipproto = 0;
+	if (ipproto < 0)
+		return XDP_DROP;
+	key.ip.proto = ipproto;
+
+	if (ipproto == IPPROTO_TCP) {
+		struct tcphdr *th = data + nh_off;
+
+		if (th + 1 > data_end)
+			return XDP_DROP;
+
+		key.l4port.src = th->source;
+		key.l4port.dst = th->dest;
+		key.tcp.flags = (*(__be16 *)&tcp_flag_word(th) & htons(0x0FFF));
+	} else if (ipproto == IPPROTO_UDP) {
+		struct udphdr *uh = data + nh_off;
+
+		if (uh + 1 > data_end)
+			return XDP_DROP;
+
+		key.l4port.src = uh->source;
+		key.l4port.dst = uh->dest;
+	}
+
+	head = bpf_map_lookup_elem(&flow_masks_head, &zero);
+	if (!head)
+		return XDP_PASS;
+
+	for_each_flow_mask(entry, head, idx, cnt) {
+		struct xdp_flow_key mkey;
+		void *flow_table;
+
+		flow_table = bpf_map_lookup_elem(&flow_tables, &idx);
+		if (!flow_table)
+			return XDP_ABORTED;
+
+		flow_mask(&mkey, &key, &entry->mask);
+		actions = bpf_map_lookup_elem(flow_table, &mkey);
+		if (actions)
+			break;
+	}
+
+	if (!actions)
+		return XDP_PASS;
+
+	for (action_idx = 0;
+	     action_idx < actions->num_actions &&
+	     action_idx < MAX_XDP_FLOW_ACTIONS;
+	     action_idx++) {
+		struct xdp_flow_action *action;
+		int act;
+
+		action = &actions->actions[action_idx];
+
+		switch (action->id) {
+		case XDP_FLOW_ACTION_ACCEPT:
+			return action_accept();
+		case XDP_FLOW_ACTION_DROP:
+			return action_drop();
+		case XDP_FLOW_ACTION_REDIRECT:
+			return action_redirect(action);
+		case XDP_FLOW_ACTION_VLAN_PUSH:
+			act = action_vlan_push(ctx, action);
+			break;
+		case XDP_FLOW_ACTION_VLAN_POP:
+			act = action_vlan_pop(ctx, action);
+			break;
+		case XDP_FLOW_ACTION_VLAN_MANGLE:
+			act = action_vlan_mangle(ctx, action);
+			break;
+		case XDP_FLOW_ACTION_MANGLE:
+			act = action_mangle(ctx, action);
+			break;
+		case XDP_FLOW_ACTION_CSUM:
+			act = action_csum(ctx, action);
+			break;
+		default:
+			return XDP_ABORTED;
+		}
+		if (act != _XDP_CONTINUE)
+			return act;
+	}
+
+	return XDP_ABORTED;
 }
 
 char _license[] SEC("license") = "GPL";

From patchwork Fri Oct 18 04:07:42 2019
X-Patchwork-Submitter: Toshiaki Makita
X-Patchwork-Id: 1179135
X-Patchwork-Delegate: bpf@iogearbox.net
From: Toshiaki Makita
Subject: [RFC PATCH v2 bpf-next 09/15] xdp_flow: Implement flow replacement/deletion logic in xdp_flow kmod
Date: Fri, 18 Oct 2019 13:07:42 +0900
Message-Id: <20191018040748.30593-10-toshiaki.makita1@gmail.com>
In-Reply-To: <20191018040748.30593-1-toshiaki.makita1@gmail.com>

As struct flow_rule has discrete storages for flow_dissector and key/mask
containers, we need to serialize them in some way to pass them to UMH.
Convert flow_rule into the flow key form used in the xdp_flow bpf prog and
pass it.
Signed-off-by: Toshiaki Makita
---
 net/xdp_flow/xdp_flow_kern_mod.c | 331 ++++++++++++++++++++++++++++++++++++++-
 1 file changed, 329 insertions(+), 2 deletions(-)

diff --git a/net/xdp_flow/xdp_flow_kern_mod.c b/net/xdp_flow/xdp_flow_kern_mod.c
index 2c80590..e70a86a 100644
--- a/net/xdp_flow/xdp_flow_kern_mod.c
+++ b/net/xdp_flow/xdp_flow_kern_mod.c
@@ -3,8 +3,10 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
+#include 
 #include 
 #include "xdp_flow.h"
 #include "msgfmt.h"
@@ -24,9 +26,261 @@ struct xdp_flow_prog {
 
 static struct rhashtable progs;
 
+struct xdp_flow_rule {
+	struct rhash_head ht_node;
+	unsigned long cookie;
+	struct xdp_flow_key key;
+	struct xdp_flow_key mask;
+};
+
+static const struct rhashtable_params rules_params = {
+	.key_len = sizeof(unsigned long),
+	.key_offset = offsetof(struct xdp_flow_rule, cookie),
+	.head_offset = offsetof(struct xdp_flow_rule, ht_node),
+	.automatic_shrinking = true,
+};
+
+static struct rhashtable rules;
+
 extern char xdp_flow_umh_start;
 extern char xdp_flow_umh_end;
 
+static int xdp_flow_parse_actions(struct xdp_flow_actions *actions,
+				  struct flow_action *flow_action,
+				  struct netlink_ext_ack *extack)
+{
+	const struct flow_action_entry *act;
+	int i;
+
+	if (!flow_action_has_entries(flow_action))
+		return 0;
+
+	if (flow_action->num_entries > MAX_XDP_FLOW_ACTIONS)
+		return -ENOBUFS;
+
+	flow_action_for_each(i, act, flow_action) {
+		struct xdp_flow_action *action = &actions->actions[i];
+
+		switch (act->id) {
+		case FLOW_ACTION_ACCEPT:
+			action->id = XDP_FLOW_ACTION_ACCEPT;
+			break;
+		case FLOW_ACTION_DROP:
+			action->id = XDP_FLOW_ACTION_DROP;
+			break;
+		case FLOW_ACTION_REDIRECT:
+		case FLOW_ACTION_VLAN_PUSH:
+		case FLOW_ACTION_VLAN_POP:
+		case FLOW_ACTION_VLAN_MANGLE:
+		case FLOW_ACTION_MANGLE:
+		case FLOW_ACTION_CSUM:
+			/* TODO: implement these */
+			/* fall through */
+		default:
+			NL_SET_ERR_MSG_MOD(extack, "Unsupported action");
+			return -EOPNOTSUPP;
+		}
+	}
+	actions->num_actions = flow_action->num_entries;
+
+	return 0;
+}
+
+static int xdp_flow_parse_ports(struct xdp_flow_key *key,
+				struct xdp_flow_key *mask,
+				struct flow_cls_offload *f, u8 ip_proto)
+{
+	const struct flow_rule *rule = flow_cls_offload_flow_rule(f);
+	struct flow_match_ports match;
+
+	if (!flow_rule_match_key(rule, FLOW_DISSECTOR_KEY_PORTS))
+		return 0;
+
+	if (ip_proto != IPPROTO_TCP && ip_proto != IPPROTO_UDP) {
+		NL_SET_ERR_MSG_MOD(f->common.extack,
+				   "Only UDP and TCP keys are supported");
+		return -EINVAL;
+	}
+
+	flow_rule_match_ports(rule, &match);
+
+	key->l4port.src = match.key->src;
+	mask->l4port.src = match.mask->src;
+	key->l4port.dst = match.key->dst;
+	mask->l4port.dst = match.mask->dst;
+
+	return 0;
+}
+
+static int xdp_flow_parse_tcp(struct xdp_flow_key *key,
+			      struct xdp_flow_key *mask,
+			      struct flow_cls_offload *f, u8 ip_proto)
+{
+	const struct flow_rule *rule = flow_cls_offload_flow_rule(f);
+	struct flow_match_tcp match;
+
+	if (!flow_rule_match_key(rule, FLOW_DISSECTOR_KEY_TCP))
+		return 0;
+
+	if (ip_proto != IPPROTO_TCP) {
+		NL_SET_ERR_MSG_MOD(f->common.extack,
+				   "TCP keys supported only for TCP");
+		return -EINVAL;
+	}
+
+	flow_rule_match_tcp(rule, &match);
+
+	key->tcp.flags = match.key->flags;
+	mask->tcp.flags = match.mask->flags;
+
+	return 0;
+}
+
+static int xdp_flow_parse_ip(struct xdp_flow_key *key,
+			     struct xdp_flow_key *mask,
+			     struct flow_cls_offload *f, __be16 n_proto)
+{
+	const struct flow_rule *rule = flow_cls_offload_flow_rule(f);
+	struct flow_match_ip match;
+
+	if (!flow_rule_match_key(rule, FLOW_DISSECTOR_KEY_IP))
+		return 0;
+
+	if (n_proto != htons(ETH_P_IP) && n_proto != htons(ETH_P_IPV6)) {
+		NL_SET_ERR_MSG_MOD(f->common.extack,
+				   "IP keys supported only for IPv4/6");
+		return -EINVAL;
+	}
+
+	flow_rule_match_ip(rule, &match);
+
+	key->ip.ttl = match.key->ttl;
+	mask->ip.ttl = match.mask->ttl;
+	key->ip.tos = match.key->tos;
+	mask->ip.tos = match.mask->tos;
+
+	return 0;
+}
+
+static int xdp_flow_parse(struct xdp_flow_key *key, struct xdp_flow_key *mask,
+			  struct xdp_flow_actions *actions,
+			  struct flow_cls_offload *f)
+{
+	struct flow_rule *rule = flow_cls_offload_flow_rule(f);
+	struct flow_dissector *dissector = rule->match.dissector;
+	__be16 n_proto = 0, n_proto_mask = 0;
+	u16 addr_type = 0;
+	u8 ip_proto = 0;
+	int err;
+
+	if (dissector->used_keys &
+	    ~(BIT(FLOW_DISSECTOR_KEY_CONTROL) |
+	      BIT(FLOW_DISSECTOR_KEY_BASIC) |
+	      BIT(FLOW_DISSECTOR_KEY_ETH_ADDRS) |
+	      BIT(FLOW_DISSECTOR_KEY_IPV4_ADDRS) |
+	      BIT(FLOW_DISSECTOR_KEY_IPV6_ADDRS) |
+	      BIT(FLOW_DISSECTOR_KEY_PORTS) |
+	      BIT(FLOW_DISSECTOR_KEY_TCP) |
+	      BIT(FLOW_DISSECTOR_KEY_IP) |
+	      BIT(FLOW_DISSECTOR_KEY_VLAN))) {
+		NL_SET_ERR_MSG_MOD(f->common.extack, "Unsupported key");
+		return -EOPNOTSUPP;
+	}
+
+	if (flow_rule_match_key(rule, FLOW_DISSECTOR_KEY_CONTROL)) {
+		struct flow_match_control match;
+
+		flow_rule_match_control(rule, &match);
+		addr_type = match.key->addr_type;
+	}
+
+	if (flow_rule_match_key(rule, FLOW_DISSECTOR_KEY_BASIC)) {
+		struct flow_match_basic match;
+
+		flow_rule_match_basic(rule, &match);
+
+		n_proto = match.key->n_proto;
+		n_proto_mask = match.mask->n_proto;
+		if (n_proto == htons(ETH_P_ALL)) {
+			n_proto = 0;
+			n_proto_mask = 0;
+		}
+
+		key->eth.type = n_proto;
+		mask->eth.type = n_proto_mask;
+
+		if (match.mask->ip_proto) {
+			ip_proto = match.key->ip_proto;
+			key->ip.proto = ip_proto;
+			mask->ip.proto = match.mask->ip_proto;
+		}
+	}
+
+	if (flow_rule_match_key(rule, FLOW_DISSECTOR_KEY_ETH_ADDRS)) {
+		struct flow_match_eth_addrs match;
+
+		flow_rule_match_eth_addrs(rule, &match);
+
+		ether_addr_copy(key->eth.dst, match.key->dst);
+		ether_addr_copy(mask->eth.dst, match.mask->dst);
+		ether_addr_copy(key->eth.src, match.key->src);
+		ether_addr_copy(mask->eth.src, match.mask->src);
+	}
+
+	if (flow_rule_match_key(rule, FLOW_DISSECTOR_KEY_VLAN)) {
+		struct flow_match_vlan match;
+
+		flow_rule_match_vlan(rule, &match);
+
+		key->vlan.tpid = match.key->vlan_tpid;
+		mask->vlan.tpid = match.mask->vlan_tpid;
+		key->vlan.tci = htons(match.key->vlan_id |
+				      (match.key->vlan_priority <<
+				       VLAN_PRIO_SHIFT));
+		mask->vlan.tci = htons(match.mask->vlan_id |
+				       (match.mask->vlan_priority <<
+					VLAN_PRIO_SHIFT));
+	}
+
+	if (addr_type == FLOW_DISSECTOR_KEY_IPV4_ADDRS) {
+		struct flow_match_ipv4_addrs match;
+
+		flow_rule_match_ipv4_addrs(rule, &match);
+
+		key->ipv4.src = match.key->src;
+		mask->ipv4.src = match.mask->src;
+		key->ipv4.dst = match.key->dst;
+		mask->ipv4.dst = match.mask->dst;
+	}
+
+	if (addr_type == FLOW_DISSECTOR_KEY_IPV6_ADDRS) {
+		struct flow_match_ipv6_addrs match;
+
+		flow_rule_match_ipv6_addrs(rule, &match);
+
+		key->ipv6.src = match.key->src;
+		mask->ipv6.src = match.mask->src;
+		key->ipv6.dst = match.key->dst;
+		mask->ipv6.dst = match.mask->dst;
+	}
+
+	err = xdp_flow_parse_ports(key, mask, f, ip_proto);
+	if (err)
+		return err;
+	err = xdp_flow_parse_tcp(key, mask, f, ip_proto);
+	if (err)
+		return err;
+
+	err = xdp_flow_parse_ip(key, mask, f, n_proto);
+	if (err)
+		return err;
+
+	// TODO: encapsulation related tasks
+
+	return xdp_flow_parse_actions(actions, &rule->action,
+				      f->common.extack);
+}
+
 static void shutdown_umh(void)
 {
 	struct task_struct *tsk;
@@ -77,12 +331,78 @@ static int transact_umh(struct mbox_request *req, u32 *id)
 
 static int xdp_flow_replace(struct net_device *dev, struct flow_cls_offload *f)
 {
-	return -EOPNOTSUPP;
+	struct xdp_flow_rule *rule;
+	struct mbox_request *req;
+	int err;
+
+	req = kzalloc(sizeof(*req), GFP_KERNEL);
+	if (!req)
+		return -ENOMEM;
+
+	rule = kzalloc(sizeof(*rule), GFP_KERNEL);
+	if (!rule) {
+		err = -ENOMEM;
+		goto out;
+	}
+
+	req->flow.priority = f->common.prio >> 16;
+	err = xdp_flow_parse(&req->flow.key, &req->flow.mask,
+			     &req->flow.actions, f);
+	if (err)
+		goto err_rule;
+
+	rule->cookie = f->cookie;
+	rule->key = req->flow.key;
+	rule->mask = req->flow.mask;
+	err = rhashtable_insert_fast(&rules, &rule->ht_node, rules_params);
+	if (err)
+		goto err_rule;
+
+	req->cmd = XDP_FLOW_CMD_REPLACE;
+	req->ifindex = dev->ifindex;
+	err = transact_umh(req, NULL);
+	if (err)
+		goto err_rht;
+out:
+	kfree(req);
+
+	return err;
+err_rht:
+	rhashtable_remove_fast(&rules, &rule->ht_node, rules_params);
+err_rule:
+	kfree(rule);
+	goto out;
 }
 
 static int xdp_flow_destroy(struct net_device *dev, struct flow_cls_offload *f)
 {
-	return -EOPNOTSUPP;
+	struct xdp_flow_rule *rule;
+	struct mbox_request *req;
+	int err;
+
+	rule = rhashtable_lookup_fast(&rules, &f->cookie, rules_params);
+	if (!rule)
+		return 0;
+
+	req = kzalloc(sizeof(*req), GFP_KERNEL);
+	if (!req)
+		return -ENOMEM;
+
+	req->flow.priority = f->common.prio >> 16;
+	req->flow.key = rule->key;
+	req->flow.mask = rule->mask;
+	req->cmd = XDP_FLOW_CMD_DELETE;
+	req->ifindex = dev->ifindex;
+	err = transact_umh(req, NULL);
+
+	kfree(req);
+
+	if (!err) {
+		rhashtable_remove_fast(&rules, &rule->ht_node, rules_params);
+		kfree(rule);
+	}
+
+	return err;
 }
 
 static int xdp_flow_setup_flower(struct net_device *dev,
@@ -308,6 +628,10 @@ static int __init load_umh(void)
 	if (err)
 		return err;
 
+	err = rhashtable_init(&rules, &rules_params);
+	if (err)
+		goto err_progs;
+
 	mutex_lock(&xdp_flow_ops.lock);
 	if (!xdp_flow_ops.stop) {
 		err = -EFAULT;
@@ -327,6 +651,8 @@ static int __init load_umh(void)
 	return 0;
 err:
 	mutex_unlock(&xdp_flow_ops.lock);
+	rhashtable_destroy(&rules);
+err_progs:
 	rhashtable_destroy(&progs);
 	return err;
 }
@@ -340,6 +666,7 @@ static void __exit fini_umh(void)
 	xdp_flow_ops.setup = NULL;
 	xdp_flow_ops.setup_cb = NULL;
 	mutex_unlock(&xdp_flow_ops.lock);
+	rhashtable_destroy(&rules);
 	rhashtable_destroy(&progs);
 }
 module_init(load_umh);

From patchwork Fri Oct 18 04:07:43 2019
X-Patchwork-Submitter: Toshiaki Makita
X-Patchwork-Id: 1179129
From: Toshiaki Makita
Subject: [RFC PATCH v2 bpf-next 10/15] xdp_flow: Add netdev feature for enabling flow offload to XDP
Date: Fri, 18 Oct 2019 13:07:43 +0900
Message-Id: <20191018040748.30593-11-toshiaki.makita1@gmail.com>
In-Reply-To: <20191018040748.30593-1-toshiaki.makita1@gmail.com>

The usage would be like this:

$ ethtool -K eth0 flow-offload-xdp on
$ tc qdisc add dev eth0 clsact
$ tc filter add dev eth0 ingress protocol ip flower ...

Then the filters offloaded to XDP are marked as "in_hw".
xdp_flow is using the indirect block mechanism to handle the newly added
feature.

Signed-off-by: Toshiaki Makita
---
 include/linux/netdev_features.h  |  2 ++
 net/core/dev.c                   |  2 ++
 net/core/ethtool.c               |  1 +
 net/xdp_flow/xdp_flow.h          |  5 ++++
 net/xdp_flow/xdp_flow_core.c     | 55 +++++++++++++++++++++++++++++++++++++++-
 net/xdp_flow/xdp_flow_kern_mod.c |  6 +++++
 6 files changed, 70 insertions(+), 1 deletion(-)

diff --git a/include/linux/netdev_features.h b/include/linux/netdev_features.h
index 4b19c54..1063511 100644
--- a/include/linux/netdev_features.h
+++ b/include/linux/netdev_features.h
@@ -80,6 +80,7 @@ enum {
 	NETIF_F_GRO_HW_BIT,		/* Hardware Generic receive offload */
 	NETIF_F_HW_TLS_RECORD_BIT,	/* Offload TLS record */
+	NETIF_F_XDP_FLOW_BIT,		/* Offload flow to XDP */
 
 	/*
 	 * Add your fresh new feature above and remember to update
@@ -150,6 +151,7 @@ enum {
 #define NETIF_F_GSO_UDP_L4	__NETIF_F(GSO_UDP_L4)
 #define NETIF_F_HW_TLS_TX	__NETIF_F(HW_TLS_TX)
 #define NETIF_F_HW_TLS_RX	__NETIF_F(HW_TLS_RX)
+#define NETIF_F_XDP_FLOW	__NETIF_F(XDP_FLOW)
 
 /* Finds the next feature with the highest number of the range of start till 0.
  */

diff --git a/net/core/dev.c b/net/core/dev.c
index 9965675..62e0469 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -9035,6 +9035,8 @@ int register_netdevice(struct net_device *dev)
 	 * software offloads (GSO and GRO).
 	 */
 	dev->hw_features |= NETIF_F_SOFT_FEATURES;
+	if (IS_ENABLED(CONFIG_XDP_FLOW) && dev->netdev_ops->ndo_bpf)
+		dev->hw_features |= NETIF_F_XDP_FLOW;
 	dev->features |= NETIF_F_SOFT_FEATURES;
 
 	if (dev->netdev_ops->ndo_udp_tunnel_add) {

diff --git a/net/core/ethtool.c b/net/core/ethtool.c
index c763106..200aa96 100644
--- a/net/core/ethtool.c
+++ b/net/core/ethtool.c
@@ -111,6 +111,7 @@ int ethtool_op_get_ts_info(struct net_device *dev, struct ethtool_ts_info *info)
 	[NETIF_F_HW_TLS_RECORD_BIT] =	"tls-hw-record",
 	[NETIF_F_HW_TLS_TX_BIT] =	"tls-hw-tx-offload",
 	[NETIF_F_HW_TLS_RX_BIT] =	"tls-hw-rx-offload",
+	[NETIF_F_XDP_FLOW_BIT] =	"flow-offload-xdp",
 };
 
 static const char

diff --git a/net/xdp_flow/xdp_flow.h b/net/xdp_flow/xdp_flow.h
index 656ceab..58f8a229 100644
--- a/net/xdp_flow/xdp_flow.h
+++ b/net/xdp_flow/xdp_flow.h
@@ -20,4 +20,9 @@ struct xdp_flow_umh_ops {
 
 extern struct xdp_flow_umh_ops xdp_flow_ops;
 
+static inline bool xdp_flow_enabled(const struct net_device *dev)
+{
+	return dev->features & NETIF_F_XDP_FLOW;
+}
+
 #endif

diff --git a/net/xdp_flow/xdp_flow_core.c b/net/xdp_flow/xdp_flow_core.c
index 8265aef..f402427 100644
--- a/net/xdp_flow/xdp_flow_core.c
+++ b/net/xdp_flow/xdp_flow_core.c
@@ -20,7 +20,8 @@ static void xdp_flow_block_release(void *cb_priv)
 	mutex_unlock(&xdp_flow_ops.lock);
 }
 
-int xdp_flow_setup_block(struct net_device *dev, struct flow_block_offload *f)
+static int xdp_flow_setup_block(struct net_device *dev,
+				struct flow_block_offload *f)
 {
 	struct flow_block_cb *block_cb;
 	int err = 0;
@@ -32,6 +33,9 @@ int xdp_flow_setup_block(struct net_device *dev, struct flow_block_offload *f)
 	if (f->binder_type != FLOW_BLOCK_BINDER_TYPE_CLSACT_INGRESS)
 		return -EOPNOTSUPP;
 
+	if (f->command == FLOW_BLOCK_BIND && !xdp_flow_enabled(dev))
+		return -EOPNOTSUPP;
+
 	mutex_lock(&xdp_flow_ops.lock);
 	if (!xdp_flow_ops.module) {
 		mutex_unlock(&xdp_flow_ops.lock);
@@ -105,6 +109,50 @@ int xdp_flow_setup_block(struct net_device *dev, struct flow_block_offload *f)
 	return err;
 }
 
+static int xdp_flow_indr_setup_cb(struct net_device *dev, void *cb_priv,
+				  enum tc_setup_type type, void *type_data)
+{
+	switch (type) {
+	case TC_SETUP_BLOCK:
+		return xdp_flow_setup_block(dev, type_data);
+	default:
+		return -EOPNOTSUPP;
+	}
+}
+
+static int xdp_flow_netdevice_event(struct notifier_block *nb,
+				    unsigned long event, void *ptr)
+{
+	struct net_device *dev = netdev_notifier_info_to_dev(ptr);
+	int err;
+
+	if (!dev->netdev_ops->ndo_bpf)
+		return NOTIFY_DONE;
+
+	switch (event) {
+	case NETDEV_REGISTER:
+		err = __flow_indr_block_cb_register(dev, NULL,
+						    xdp_flow_indr_setup_cb,
+						    dev);
+		if (err) {
+			netdev_err(dev,
+				   "Failed to register indirect block setup callback: %d\n",
+				   err);
+		}
+		break;
+	case NETDEV_UNREGISTER:
+		__flow_indr_block_cb_unregister(dev, xdp_flow_indr_setup_cb,
+						dev);
+		break;
+	}
+
+	return NOTIFY_DONE;
+}
+
+static struct notifier_block xdp_flow_notifier_block __read_mostly = {
+	.notifier_call = xdp_flow_netdevice_event,
+};
+
 static void xdp_flow_umh_cleanup(struct umh_info *info)
 {
 	mutex_lock(&xdp_flow_ops.lock);
@@ -117,6 +165,11 @@ static void xdp_flow_umh_cleanup(struct umh_info *info)
 
 static int __init xdp_flow_init(void)
 {
+	int err = register_netdevice_notifier(&xdp_flow_notifier_block);
+
+	if (err)
+		return err;
+
 	mutex_init(&xdp_flow_ops.lock);
 	xdp_flow_ops.stop = true;
 	xdp_flow_ops.info.cmdline = "xdp_flow_umh";

diff --git a/net/xdp_flow/xdp_flow_kern_mod.c b/net/xdp_flow/xdp_flow_kern_mod.c
index e70a86a..ce8a75b 100644
--- a/net/xdp_flow/xdp_flow_kern_mod.c
+++ b/net/xdp_flow/xdp_flow_kern_mod.c
@@ -335,6 +335,12 @@ static int xdp_flow_replace(struct net_device *dev, struct flow_cls_offload *f)
 	struct mbox_request *req;
 	int err;
 
+	if (!xdp_flow_enabled(dev)) {
+		NL_SET_ERR_MSG_MOD(f->common.extack,
+				   "flow-offload-xdp is disabled on net device");
+		return -EOPNOTSUPP;
+	}
+
 	req = kzalloc(sizeof(*req), GFP_KERNEL);
 	if (!req)
 		return -ENOMEM;

From patchwork Fri Oct 18 04:07:44 2019
X-Patchwork-Submitter: Toshiaki Makita
X-Patchwork-Id: 1179133
From: Toshiaki Makita
Subject: [RFC PATCH v2 bpf-next 11/15] xdp_flow: Implement redirect action
Date: Fri, 18 Oct 2019 13:07:44 +0900
Message-Id: <20191018040748.30593-12-toshiaki.makita1@gmail.com>
In-Reply-To: <20191018040748.30593-1-toshiaki.makita1@gmail.com>

Add a devmap for XDP_REDIRECT and use it for redirect action.

Signed-off-by: Toshiaki Makita
---
 net/xdp_flow/umh_bpf.h           |   1 +
 net/xdp_flow/xdp_flow_kern_bpf.c |  14 +++-
 net/xdp_flow/xdp_flow_kern_mod.c |  14 ++++
 net/xdp_flow/xdp_flow_umh.c      | 164 +++++++++++++++++++++++++++++++++++++--
 4 files changed, 186 insertions(+), 7 deletions(-)

diff --git a/net/xdp_flow/umh_bpf.h b/net/xdp_flow/umh_bpf.h
index 4e4633f..a279d0a1 100644
--- a/net/xdp_flow/umh_bpf.h
+++ b/net/xdp_flow/umh_bpf.h
@@ -4,6 +4,7 @@
 
 #include "msgfmt.h"
 
+#define MAX_PORTS 65536
 #define MAX_FLOWS 1024
 #define MAX_FLOW_MASKS 255
 #define FLOW_MASKS_TAIL 255

diff --git a/net/xdp_flow/xdp_flow_kern_bpf.c b/net/xdp_flow/xdp_flow_kern_bpf.c
index f4a6346..381d67e 100644
--- a/net/xdp_flow/xdp_flow_kern_bpf.c
+++ b/net/xdp_flow/xdp_flow_kern_bpf.c
@@ -22,6 +22,13 @@ struct bpf_map_def SEC("maps") debug_stats = {
 	.max_entries = 256,
 };
 
+struct bpf_map_def SEC("maps") output_map = {
+	.type = BPF_MAP_TYPE_DEVMAP,
+	.key_size = sizeof(int),
+	.value_size = sizeof(int),
+	.max_entries = MAX_PORTS,
+};
+
 struct bpf_map_def SEC("maps") flow_masks_head = {
 	.type = BPF_MAP_TYPE_ARRAY,
 	.key_size = sizeof(u32),
@@ -71,10 +78,13 @@ static inline int action_drop(void)
 
 static inline int action_redirect(struct xdp_flow_action *action)
 {
+	int tx_port;
+
 	account_action(XDP_FLOW_ACTION_REDIRECT);
 
-	// TODO: implement this
-	return XDP_ABORTED;
+	tx_port = action->ifindex;
+
+	return bpf_redirect_map(&output_map, tx_port, 0);
 }
 
 static inline int action_vlan_push(struct xdp_md *ctx,

diff --git a/net/xdp_flow/xdp_flow_kern_mod.c b/net/xdp_flow/xdp_flow_kern_mod.c
index ce8a75b..2581b81 100644
--- a/net/xdp_flow/xdp_flow_kern_mod.c
+++ b/net/xdp_flow/xdp_flow_kern_mod.c
@@ -69,6 +69,20 @@ static int xdp_flow_parse_actions(struct xdp_flow_actions *actions,
 			action->id = XDP_FLOW_ACTION_DROP;
 			break;
 		case FLOW_ACTION_REDIRECT:
+			if (!act->dev->netdev_ops->ndo_xdp_xmit) {
+				NL_SET_ERR_MSG_MOD(extack,
+						   "Redirect target interface does not support XDP_TX");
+				return -EOPNOTSUPP;
+			}
+			if (!rhashtable_lookup_fast(&progs, &act->dev,
+						    progs_params)) {
+				NL_SET_ERR_MSG_MOD(extack,
+						   "Need xdp_flow setup on redirect target interface in advance");
+				return -EINVAL;
+			}
+			action->id = XDP_FLOW_ACTION_REDIRECT;
+			action->ifindex = act->dev->ifindex;
+			break;
 		case FLOW_ACTION_VLAN_PUSH:
 		case FLOW_ACTION_VLAN_POP:
 		case FLOW_ACTION_VLAN_MANGLE:

diff --git a/net/xdp_flow/xdp_flow_umh.c b/net/xdp_flow/xdp_flow_umh.c
index 0588a36..54a7f10 100644
--- a/net/xdp_flow/xdp_flow_umh.c
+++ b/net/xdp_flow/xdp_flow_umh.c
@@ -18,6 +18,7 @@ extern char xdp_flow_bpf_start;
 extern char xdp_flow_bpf_end;
 
 int progfile_fd;
+int output_map_fd;
 FILE *kmsg;
 
 #define zalloc(size) calloc(1, (size))
@@ -44,12 +45,22 @@ struct netdev_info {
 	struct netdev_info_key key;
 	struct hlist_node node;
 	struct bpf_object *obj;
+	int devmap_idx;
 	int free_slot_top;
 	int free_slots[MAX_FLOW_MASKS];
 };
 
 DEFINE_HASHTABLE(netdev_info_table, 16);
 
+struct devmap_idx_node {
+	int devmap_idx;
+	struct hlist_node node;
+};
+
+DEFINE_HASHTABLE(devmap_idx_table, 16);
+
+int max_devmap_idx;
+
 static int libbpf_err(int err, char *errbuf)
 {
 	libbpf_strerror(err, errbuf, ERRBUF_SIZE);
@@ -94,6 +105,15 @@ static int setup(void)
 		goto err;
 	}
 
+	output_map_fd = bpf_create_map(BPF_MAP_TYPE_DEVMAP, sizeof(int),
+				       sizeof(int), MAX_PORTS, 0);
+	if (output_map_fd < 0) {
+		err = -errno;
+		pr_err("map creation for output_map failed: %s\n",
+		       strerror(errno));
+		goto err;
+	}
+
 	return 0;
 err:
 	close(progfile_fd);
@@ -101,10 +121,23 @@ static int setup(void)
 	return err;
 }
 
-static int load_bpf(int ifindex, struct bpf_object **objp)
+static void delete_output_map_elem(int idx)
+{
+	char errbuf[ERRBUF_SIZE];
+	int err;
+
+	err = bpf_map_delete_elem(output_map_fd, &idx);
+	if (err) {
+		libbpf_err(err, errbuf);
+		pr_warn("Failed to delete idx %d from output_map: %s\n",
+			idx, errbuf);
+	}
+}
+
+static int load_bpf(int ifindex, int devmap_idx, struct bpf_object **objp)
 {
 	int prog_fd, flow_tables_fd, flow_meta_fd, flow_masks_head_fd, err;
-	struct bpf_map *flow_tables, *flow_masks_head;
+	struct bpf_map *output_map, *flow_tables, *flow_masks_head;
 	int zero = 0, flow_masks_tail = FLOW_MASKS_TAIL;
 	struct bpf_object_open_attr attr = {};
 	char path[256], errbuf[ERRBUF_SIZE];
@@ -137,6 +170,27 @@ static int load_bpf(int ifindex, struct bpf_object **objp)
 	bpf_object__for_each_program(prog, obj)
 		bpf_program__set_type(prog, attr.prog_type);
 
+	output_map = bpf_object__find_map_by_name(obj, "output_map");
+	if (!output_map) {
+		pr_err("Cannot find output_map\n");
+		err = -ENOENT;
+		goto err_obj;
+	}
+
+	err = bpf_map__reuse_fd(output_map, output_map_fd);
+	if (err) {
+		err = libbpf_err(err, errbuf);
+		pr_err("Failed to reuse output_map fd: %s\n", errbuf);
+		goto err_obj;
+	}
+
+	if (bpf_map_update_elem(output_map_fd, &devmap_idx, &ifindex, 0)) {
+		err = -errno;
+		pr_err("Failed to insert idx %d if %d into output_map: %s\n",
+		       devmap_idx, ifindex, strerror(errno));
+		goto err_obj;
+	}
+
 	flow_meta_fd = bpf_create_map(BPF_MAP_TYPE_HASH,
 				      sizeof(struct xdp_flow_key),
 				      sizeof(struct xdp_flow_actions),
@@ -226,6 +280,8 @@ static int load_bpf(int ifindex, struct bpf_object **objp)
 	return prog_fd;
 err:
+	delete_output_map_elem(devmap_idx);
+err_obj:
 	bpf_object__close(obj);
 	return err;
 }
@@ -276,6 +332,56 @@ static struct netdev_info *get_netdev_info(const struct mbox_request *req)
 	return netdev_info;
 }
 
+static struct devmap_idx_node *find_devmap_idx(int devmap_idx)
+{
+	struct devmap_idx_node *node;
+
+	hash_for_each_possible(devmap_idx_table, node, node, devmap_idx) {
+		if (node->devmap_idx == devmap_idx)
+			return node;
+	}
+
+	return NULL;
+}
+
+static int get_new_devmap_idx(void)
+{
+	int offset;
+
+	for (offset = 0; offset < MAX_PORTS; offset++) {
+		int devmap_idx = max_devmap_idx++;
+
+		if (max_devmap_idx >= MAX_PORTS)
+			max_devmap_idx -= MAX_PORTS;
+
+		if (!find_devmap_idx(devmap_idx)) {
+			struct devmap_idx_node *node;
+
+			node = malloc(sizeof(*node));
+			if (!node) {
+				pr_err("malloc for devmap_idx failed\n");
+				return -ENOMEM;
+			}
+			node->devmap_idx = devmap_idx;
+			hash_add(devmap_idx_table, &node->node, devmap_idx);
+
+			return devmap_idx;
+		}
+	}
+
+	return -ENOSPC;
+}
+
+static void delete_devmap_idx(int devmap_idx)
+{
+	struct devmap_idx_node *node = find_devmap_idx(devmap_idx);
+
+	if (node) {
+		hash_del(&node->node);
+		free(node);
+	}
+}
+
 static void init_flow_masks_free_slot(struct netdev_info *netdev_info)
 {
 	int i;
@@ -329,11 +435,11 @@ static void delete_flow_masks_free_slot(struct netdev_info *netdev_info,
 
 static int handle_load(const struct mbox_request *req, __u32 *prog_id)
 {
+	int err, prog_fd, devmap_idx = -1;
 	struct netdev_info *netdev_info;
 	struct bpf_prog_info info = {};
 	struct netdev_info_key key;
 	__u32 len = sizeof(info);
-	int err, prog_fd;
 
 	err = get_netdev_info_key(req, &key);
 	if (err)
@@ -350,12 +456,19 @@ static int handle_load(const struct mbox_request *req, __u32 *prog_id)
 	}
 	netdev_info->key.ifindex = key.ifindex;
 
+	devmap_idx = get_new_devmap_idx();
+	if (devmap_idx < 0) {
+		err = devmap_idx;
+		goto err_netdev_info;
+	}
+	netdev_info->devmap_idx = devmap_idx;
+
 	init_flow_masks_free_slot(netdev_info);
 
-	prog_fd = load_bpf(req->ifindex, &netdev_info->obj);
+	prog_fd = load_bpf(req->ifindex, devmap_idx, &netdev_info->obj);
 	if (prog_fd < 0) {
 		err = prog_fd;
-		goto err_netdev_info;
+		goto err_devmap_idx;
 	}
 
 	err = bpf_obj_get_info_by_fd(prog_fd, &info, &len);
@@ -370,6 +483,8 @@ static int handle_load(const struct mbox_request *req, __u32 *prog_id)
 	return 0;
 err_obj:
 	bpf_object__close(netdev_info->obj);
+err_devmap_idx:
+	delete_devmap_idx(devmap_idx);
 err_netdev_info:
 	free(netdev_info);
@@ -386,12 +501,45 @@ static int handle_unload(const struct mbox_request *req)
 
 	hash_del(&netdev_info->node);
 	bpf_object__close(netdev_info->obj);
+	delete_output_map_elem(netdev_info->devmap_idx);
+	delete_devmap_idx(netdev_info->devmap_idx);
 	free(netdev_info);
 
 	pr_debug("XDP program for if %d was closed\n", req->ifindex);
 
 	return 0;
 }
 
+static int convert_ifindex_to_devmap_idx(struct mbox_request *req)
+{
+	int i;
+
+	for (i = 0; i < req->flow.actions.num_actions; i++) {
+		struct xdp_flow_action *action = &req->flow.actions.actions[i];
+
+		if (action->id == XDP_FLOW_ACTION_REDIRECT) {
+			struct netdev_info *netdev_info;
+			struct netdev_info_key key;
+			int err;
+
+			err = get_netdev_info_key(req, &key);
+			if (err)
+				return err;
+			key.ifindex = action->ifindex;
+
+			netdev_info = find_netdev_info(&key);
+			if (!netdev_info) {
+				pr_err("BUG: Interface %d is not ready for redirect target.\n",
+				       key.ifindex);
+				return -EINVAL;
+			}
+
+			action->ifindex = netdev_info->devmap_idx;
+		}
+	}
+
+	return 0;
+}
+
 static int get_table_fd(const struct netdev_info *netdev_info,
 			const char *table_name)
 {
@@ -788,6 +936,11 @@ static int handle_replace(struct mbox_request *req)
 	if (IS_ERR(netdev_info))
 		return PTR_ERR(netdev_info);
 
+	/* TODO: Use XDP_TX for redirect action when possible */
+	err = convert_ifindex_to_devmap_idx(req);
+	if (err)
+		return err;
+
 	err = flow_table_insert_elem(netdev_info, &req->flow);
 	if (err)
 		return err;
@@ -883,6 +1036,7 @@ int main(void)
 	}
 	loop();
 	close(progfile_fd);
+	close(output_map_fd);
 	fclose(kmsg);
 
 	return 0;

From patchwork Fri Oct 18 04:07:45 2019
X-Patchwork-Submitter: Toshiaki Makita
X-Patchwork-Id: 1179117
X-Patchwork-Delegate: bpf@iogearbox.net
From: Toshiaki Makita
To: Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu,
    Yonghong Song, "David S. Miller", Jakub Kicinski,
    Jesper Dangaard Brouer, John Fastabend, Jamal Hadi Salim, Cong Wang,
    Jiri Pirko, Pablo Neira Ayuso, Jozsef Kadlecsik, Florian Westphal,
    Pravin B Shelar
Cc: Toshiaki Makita, netdev@vger.kernel.org, bpf@vger.kernel.org,
    William Tu, Stanislav Fomichev
Subject: [RFC PATCH v2 bpf-next 12/15] xdp_flow: Implement vlan_push action
Date: Fri, 18 Oct 2019 13:07:45 +0900
Message-Id: <20191018040748.30593-13-toshiaki.makita1@gmail.com>
In-Reply-To: <20191018040748.30593-1-toshiaki.makita1@gmail.com>
References: <20191018040748.30593-1-toshiaki.makita1@gmail.com>
X-Mailing-List: bpf@vger.kernel.org

This is another example action.

Signed-off-by: Toshiaki Makita
---
 net/xdp_flow/xdp_flow_kern_bpf.c | 23 +++++++++++++++++++++--
 net/xdp_flow/xdp_flow_kern_mod.c |  5 +++++
 2 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/net/xdp_flow/xdp_flow_kern_bpf.c b/net/xdp_flow/xdp_flow_kern_bpf.c
index 381d67e..7930349 100644
--- a/net/xdp_flow/xdp_flow_kern_bpf.c
+++ b/net/xdp_flow/xdp_flow_kern_bpf.c
@@ -90,10 +90,29 @@ static inline int action_redirect(struct xdp_flow_action *action)
 static inline int action_vlan_push(struct xdp_md *ctx,
 				   struct xdp_flow_action *action)
 {
+	struct vlan_ethhdr *vehdr;
+	void *data, *data_end;
+	__be16 proto, tci;
+
 	account_action(XDP_FLOW_ACTION_VLAN_PUSH);
 
-	// TODO: implement this
-	return XDP_ABORTED;
+	proto = action->vlan.proto;
+	tci = action->vlan.tci;
+
+	if (bpf_xdp_adjust_head(ctx, -VLAN_HLEN))
+		return XDP_DROP;
+
+	data_end = (void *)(long)ctx->data_end;
+	data = (void *)(long)ctx->data;
+	if (data + VLAN_ETH_HLEN > data_end)
+		return XDP_DROP;
+
+	__builtin_memmove(data, data + VLAN_HLEN, ETH_ALEN * 2);
+	vehdr = data;
+	vehdr->h_vlan_proto = proto;
+	vehdr->h_vlan_TCI = tci;
+
+	return _XDP_CONTINUE;
 }
 
 static inline int action_vlan_pop(struct xdp_md *ctx,
diff --git a/net/xdp_flow/xdp_flow_kern_mod.c b/net/xdp_flow/xdp_flow_kern_mod.c
index 2581b81..7ce1733 100644
--- a/net/xdp_flow/xdp_flow_kern_mod.c
+++ b/net/xdp_flow/xdp_flow_kern_mod.c
@@ -84,6 +84,11 @@ static int xdp_flow_parse_actions(struct xdp_flow_actions *actions,
 		action->ifindex = act->dev->ifindex;
 		break;
 	case FLOW_ACTION_VLAN_PUSH:
+		action->id = XDP_FLOW_ACTION_VLAN_PUSH;
+		action->vlan.tci = act->vlan.vid |
+				   (act->vlan.prio << VLAN_PRIO_SHIFT);
+		action->vlan.proto = act->vlan.proto;
+		break;
 	case FLOW_ACTION_VLAN_POP:
 	case FLOW_ACTION_VLAN_MANGLE:
 	case FLOW_ACTION_MANGLE:

From patchwork Fri Oct 18 04:07:46 2019
X-Patchwork-Submitter: Toshiaki Makita
X-Patchwork-Id: 1179113
X-Patchwork-Delegate: bpf@iogearbox.net
From: Toshiaki Makita
To: Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu,
    Yonghong Song, "David S. Miller", Jakub Kicinski,
    Jesper Dangaard Brouer, John Fastabend, Jamal Hadi Salim, Cong Wang,
    Jiri Pirko, Pablo Neira Ayuso, Jozsef Kadlecsik, Florian Westphal,
    Pravin B Shelar
Cc: Toshiaki Makita, netdev@vger.kernel.org, bpf@vger.kernel.org,
    William Tu, Stanislav Fomichev
Subject: [RFC PATCH v2 bpf-next 13/15] bpf, selftest: Add test for xdp_flow
Date: Fri, 18 Oct 2019 13:07:46 +0900
Message-Id: <20191018040748.30593-14-toshiaki.makita1@gmail.com>
In-Reply-To: <20191018040748.30593-1-toshiaki.makita1@gmail.com>
References: <20191018040748.30593-1-toshiaki.makita1@gmail.com>
X-Mailing-List: bpf@vger.kernel.org

Check if TC flow offloading to XDP works.

Signed-off-by: Toshiaki Makita
---
 tools/testing/selftests/bpf/Makefile         |   1 +
 tools/testing/selftests/bpf/test_xdp_flow.sh | 106 +++++++++++++++++++++++++++
 2 files changed, 107 insertions(+)
 create mode 100755 tools/testing/selftests/bpf/test_xdp_flow.sh

diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index 00d05c5..3db9819 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -55,6 +55,7 @@ TEST_PROGS := test_kmod.sh \
 	test_xdp_redirect.sh \
 	test_xdp_meta.sh \
 	test_xdp_veth.sh \
+	test_xdp_flow.sh \
 	test_offload.py \
 	test_sock_addr.sh \
 	test_tunnel.sh \
diff --git a/tools/testing/selftests/bpf/test_xdp_flow.sh b/tools/testing/selftests/bpf/test_xdp_flow.sh
new file mode 100755
index 0000000..6937454
--- /dev/null
+++ b/tools/testing/selftests/bpf/test_xdp_flow.sh
@@ -0,0 +1,106 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+#
+# Create 2 namespaces with 2 veth peers, and
+# forward packets in-between using xdp_flow
+#
+# NS1(veth11)     NS2(veth22)
+#     |               |
+#     |               |
+#  (veth1)         (veth2)
+#     ^               ^
+#     |   xdp_flow    |
+#     --------------------
+
+# Kselftest framework requirement - SKIP code is 4.
+ksft_skip=4
+
+TESTNAME=xdp_flow
+
+_cleanup()
+{
+	set +e
+	ip link del veth1 2> /dev/null
+	ip link del veth2 2> /dev/null
+	ip netns del ns1 2> /dev/null
+	ip netns del ns2 2> /dev/null
+}
+
+cleanup_skip()
+{
+	echo "selftests: $TESTNAME [SKIP]"
+	_cleanup
+
+	exit $ksft_skip
+}
+
+cleanup()
+{
+	if [ "$?" = 0 ]; then
+		echo "selftests: $TESTNAME [PASS]"
+	else
+		echo "selftests: $TESTNAME [FAILED]"
+	fi
+	_cleanup
+}
+
+if [ $(id -u) -ne 0 ]; then
+	echo "selftests: $TESTNAME [SKIP] Need root privileges"
+	exit $ksft_skip
+fi
+
+if ! ip link set dev lo xdp off > /dev/null 2>&1; then
+	echo "selftests: $TESTNAME [SKIP] Could not run test without the ip xdp support"
+	exit $ksft_skip
+fi
+
+set -e
+
+trap cleanup_skip EXIT
+
+ip netns add ns1
+ip netns add ns2
+
+ip link add veth1 type veth peer name veth11 netns ns1
+ip link add veth2 type veth peer name veth22 netns ns2
+
+ip link set veth1 up
+ip link set veth2 up
+
+ip -n ns1 addr add 10.1.1.11/24 dev veth11
+ip -n ns2 addr add 10.1.1.22/24 dev veth22
+
+ip -n ns1 link set dev veth11 up
+ip -n ns2 link set dev veth22 up
+
+ip -n ns1 link set dev veth11 xdp obj xdp_dummy.o sec xdp_dummy
+ip -n ns2 link set dev veth22 xdp obj xdp_dummy.o sec xdp_dummy
+
+ethtool -K veth1 flow-offload-xdp on
+ethtool -K veth2 flow-offload-xdp on
+
+trap cleanup EXIT
+
+# Adding clsact or ingress will trigger loading bpf prog in UMH
+tc qdisc add dev veth1 clsact
+tc qdisc add dev veth2 clsact
+
+# Adding filter will have UMH populate flow table map
+tc filter add dev veth1 ingress protocol ip flower \
+	dst_ip 10.1.1.0/24 action mirred egress redirect dev veth2
+tc filter add dev veth2 ingress protocol ip flower \
+	dst_ip 10.1.1.0/24 action mirred egress redirect dev veth1
+
+# flows should be offloaded when 'flow-offload-xdp' is enabled on veth
+tc filter show dev veth1 ingress | grep -q not_in_hw && false
+tc filter show dev veth2 ingress | grep -q not_in_hw && false
+
+# ARP is not supported so add filters after in_hw check
+tc filter add dev veth1 ingress protocol arp flower \
+	arp_tip 10.1.1.0/24 action mirred egress redirect dev veth2
+tc filter add dev veth2 ingress protocol arp flower \
+	arp_sip 10.1.1.0/24 action mirred egress redirect dev veth1
+
+ip netns exec ns1 ping -c 1 -W 1 10.1.1.22
+
+exit 0

From patchwork Fri Oct 18 04:07:47 2019
X-Patchwork-Submitter: Toshiaki Makita
X-Patchwork-Id: 1179138
X-Patchwork-Delegate: bpf@iogearbox.net
From: Toshiaki Makita
To: Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu,
    Yonghong Song, "David S. Miller", Jakub Kicinski,
    Jesper Dangaard Brouer, John Fastabend, Jamal Hadi Salim, Cong Wang,
    Jiri Pirko, Pablo Neira Ayuso, Jozsef Kadlecsik, Florian Westphal,
    Pravin B Shelar
Cc: Toshiaki Makita, netdev@vger.kernel.org, bpf@vger.kernel.org,
    William Tu, Stanislav Fomichev
Subject: [RFC PATCH v2 bpf-next 14/15] i40e: prefetch xdp->data before running XDP prog
Date: Fri, 18 Oct 2019 13:07:47 +0900
Message-Id: <20191018040748.30593-15-toshiaki.makita1@gmail.com>
In-Reply-To: <20191018040748.30593-1-toshiaki.makita1@gmail.com>
References: <20191018040748.30593-1-toshiaki.makita1@gmail.com>
X-Mailing-List: netdev@vger.kernel.org

XDP progs are likely to read/write xdp->data. This improves the
performance of xdp_flow.

Signed-off-by: Toshiaki Makita
---
 drivers/net/ethernet/intel/i40e/i40e_txrx.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
index e3f29dc..a85a4ae 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
@@ -2207,6 +2207,7 @@ static struct sk_buff *i40e_run_xdp(struct i40e_ring *rx_ring,
 	if (!xdp_prog)
 		goto xdp_out;
 
+	prefetchw(xdp->data);
 	prefetchw(xdp->data_hard_start); /* xdp_frame write */
 	act = bpf_prog_run_xdp(xdp_prog, xdp);

From patchwork Fri Oct 18 04:07:48 2019
X-Patchwork-Submitter: Toshiaki Makita
X-Patchwork-Id: 1179124
X-Patchwork-Delegate: bpf@iogearbox.net
From: Toshiaki Makita
To: Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu,
    Yonghong Song, "David S. Miller", Jakub Kicinski,
    Jesper Dangaard Brouer, John Fastabend, Jamal Hadi Salim, Cong Wang,
    Jiri Pirko, Pablo Neira Ayuso, Jozsef Kadlecsik, Florian Westphal,
    Pravin B Shelar
Cc: Toshiaki Makita, netdev@vger.kernel.org, bpf@vger.kernel.org,
    William Tu, Stanislav Fomichev
Subject: [RFC PATCH v2 bpf-next 15/15] bpf, hashtab: Compare keys in long
Date: Fri, 18 Oct 2019 13:07:48 +0900
Message-Id: <20191018040748.30593-16-toshiaki.makita1@gmail.com>
In-Reply-To: <20191018040748.30593-1-toshiaki.makita1@gmail.com>
References: <20191018040748.30593-1-toshiaki.makita1@gmail.com>
X-Mailing-List: netdev@vger.kernel.org

memcmp() is generally slow. Compare keys in long if possible.
This improves xdp_flow performance.
Signed-off-by: Toshiaki Makita
---
 kernel/bpf/hashtab.c | 27 +++++++++++++++++++++++++--
 1 file changed, 25 insertions(+), 2 deletions(-)

diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
index 22066a6..8b5ffd4 100644
--- a/kernel/bpf/hashtab.c
+++ b/kernel/bpf/hashtab.c
@@ -417,6 +417,29 @@ static inline struct hlist_nulls_head *select_bucket(struct bpf_htab *htab, u32
 	return &__select_bucket(htab, hash)->head;
 }
 
+/* key1 must be aligned to sizeof long */
+static bool key_equal(void *key1, void *key2, u32 size)
+{
+	/* Check for key1 */
+	BUILD_BUG_ON(!IS_ALIGNED(offsetof(struct htab_elem, key),
+				 sizeof(long)));
+
+	if (IS_ALIGNED((unsigned long)key2 | (unsigned long)size,
+		       sizeof(long))) {
+		unsigned long *lkey1, *lkey2;
+
+		for (lkey1 = key1, lkey2 = key2; size > 0;
+		     lkey1++, lkey2++, size -= sizeof(long)) {
+			if (*lkey1 != *lkey2)
+				return false;
+		}
+
+		return true;
+	}
+
+	return !memcmp(key1, key2, size);
+}
+
 /* this lookup function can only be called with bucket lock taken */
 static struct htab_elem *lookup_elem_raw(struct hlist_nulls_head *head, u32 hash,
 					 void *key, u32 key_size)
@@ -425,7 +448,7 @@ static struct htab_elem *lookup_elem_raw(struct hlist_nulls_head *head, u32 hash
 	struct htab_elem *l;
 
 	hlist_nulls_for_each_entry_rcu(l, n, head, hash_node)
-		if (l->hash == hash && !memcmp(&l->key, key, key_size))
+		if (l->hash == hash && key_equal(&l->key, key, key_size))
 			return l;
 
 	return NULL;
@@ -444,7 +467,7 @@ static struct htab_elem *lookup_nulls_elem_raw(struct hlist_nulls_head *head,
 again:
 	hlist_nulls_for_each_entry_rcu(l, n, head, hash_node)
-		if (l->hash == hash && !memcmp(&l->key, key, key_size))
+		if (l->hash == hash && key_equal(&l->key, key, key_size))
 			return l;
 
 	if (unlikely(get_nulls_value(n) != (hash & (n_buckets - 1))))