From patchwork Tue Aug 13 12:05:45 2019
X-Patchwork-Submitter: Toshiaki Makita
X-Patchwork-Id: 1146280
X-Patchwork-Delegate: bpf@iogearbox.net
From: Toshiaki Makita
To: Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu, Yonghong Song, "David S.
Miller", Jakub Kicinski, Jesper Dangaard Brouer, John Fastabend, Jamal Hadi Salim, Cong Wang, Jiri Pirko
Cc: Toshiaki Makita, netdev@vger.kernel.org, bpf@vger.kernel.org, William Tu
Subject: [RFC PATCH bpf-next 01/14] xdp_flow: Add skeleton of XDP based TC offload driver
Date: Tue, 13 Aug 2019 21:05:45 +0900
Message-Id: <20190813120558.6151-2-toshiaki.makita1@gmail.com>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20190813120558.6151-1-toshiaki.makita1@gmail.com>
References: <20190813120558.6151-1-toshiaki.makita1@gmail.com>

Add the TC offload driver, xdp_flow_core.c, and a skeleton of the UMH
handling mechanism. The driver is not called from anywhere yet.

xdp_flow_setup_block() in xdp_flow_core.c is meant to be called when an
ingress qdisc is added. It loads the xdp_flow kernel module, and the kmod
provides callbacks for the setup phase and the flow insertion phase:
xdp_flow_setup() in the kmod is called from xdp_flow_setup_block() when an
ingress qdisc is added, and xdp_flow_setup_block_cb() is called when a tc
flower filter is added. The former requests the UMH to load the eBPF
program, and the latter requests the UMH to populate maps for flow tables.

No actual processing is implemented in this patch; the following commits
implement it. The overall UMH handling mechanism is modeled on bpfilter.
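The kernel/UMH exchange described above is a simple synchronous mailbox: the kernel writes one fixed-size request down a pipe to the UMH and blocks for exactly one fixed-size reply. A minimal Python model of that protocol follows; the field layout is illustrative only (the real `struct mbox_request` in msgfmt.h also carries a full `struct xdp_flow`):

```python
import os
import struct
import threading

# Illustrative stand-ins for struct mbox_request / mbox_reply from msgfmt.h.
REQ_FMT = "iB"    # int ifindex, __u8 cmd
REP_FMT = "iI"    # int status, __u32 id
XDP_FLOW_CMD_NOOP = 0

def umh_loop(rfd, wfd):
    """Toy UMH loop: read one fixed-size request, answer with one reply."""
    raw = os.read(rfd, struct.calcsize(REQ_FMT))
    ifindex, cmd = struct.unpack(REQ_FMT, raw)
    status = 0 if cmd == XDP_FLOW_CMD_NOOP else -95  # -EOPNOTSUPP
    os.write(wfd, struct.pack(REP_FMT, status, 0))

def transact(wfd, rfd, ifindex, cmd):
    """Kernel-side transact_umh(): one request out, one reply back."""
    os.write(wfd, struct.pack(REQ_FMT, ifindex, cmd))
    raw = os.read(rfd, struct.calcsize(REP_FMT))
    return struct.unpack(REP_FMT, raw)  # (status, id)

# Two pipe pairs, as fork_usermode_blob() sets up for the real UMH.
k2u_r, k2u_w = os.pipe()
u2k_r, u2k_w = os.pipe()
helper = threading.Thread(target=umh_loop, args=(k2u_r, u2k_w))
helper.start()
status, prog_id = transact(k2u_w, u2k_r, 1, XDP_FLOW_CMD_NOOP)
helper.join()
print(status, prog_id)
```

This also illustrates why xdp_flow_ops.lock must serialize callers: with a single pipe pair, replies are matched to requests purely by ordering.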
Signed-off-by: Toshiaki Makita
---
 include/net/flow_offload_xdp.h   |  33 ++++++
 net/Kconfig                      |   1 +
 net/Makefile                     |   1 +
 net/xdp_flow/.gitignore          |   1 +
 net/xdp_flow/Kconfig             |  16 +++
 net/xdp_flow/Makefile            |  31 +++++
 net/xdp_flow/msgfmt.h            | 102 ++++++++++++++++
 net/xdp_flow/xdp_flow_core.c     | 126 ++++++++++++++++++++
 net/xdp_flow/xdp_flow_kern_mod.c | 250 +++++++++++++++++++++++++++++++++++++++
 net/xdp_flow/xdp_flow_umh.c      | 109 +++++++++++++++++
 net/xdp_flow/xdp_flow_umh_blob.S |   7 ++
 11 files changed, 677 insertions(+)
 create mode 100644 include/net/flow_offload_xdp.h
 create mode 100644 net/xdp_flow/.gitignore
 create mode 100644 net/xdp_flow/Kconfig
 create mode 100644 net/xdp_flow/Makefile
 create mode 100644 net/xdp_flow/msgfmt.h
 create mode 100644 net/xdp_flow/xdp_flow_core.c
 create mode 100644 net/xdp_flow/xdp_flow_kern_mod.c
 create mode 100644 net/xdp_flow/xdp_flow_umh.c
 create mode 100644 net/xdp_flow/xdp_flow_umh_blob.S

diff --git a/include/net/flow_offload_xdp.h b/include/net/flow_offload_xdp.h
new file mode 100644
index 0000000..d04a73d
--- /dev/null
+++ b/include/net/flow_offload_xdp.h
@@ -0,0 +1,33 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_FLOW_OFFLOAD_XDP_H
+#define _LINUX_FLOW_OFFLOAD_XDP_H
+
+#include
+#include
+#include
+
+struct xdp_flow_umh_ops {
+	struct umh_info info;
+	/* serialize access to this object and UMH */
+	struct mutex lock;
+	flow_setup_cb_t *setup_cb;
+	int (*setup)(struct net_device *dev, bool do_bind,
+		     struct netlink_ext_ack *extack);
+	int (*start)(void);
+	bool stop;
+	struct module *module;
+};
+
+extern struct xdp_flow_umh_ops xdp_flow_ops;
+
+#ifdef CONFIG_XDP_FLOW
+int xdp_flow_setup_block(struct net_device *dev, struct flow_block_offload *f);
+#else
+static inline int xdp_flow_setup_block(struct net_device *dev,
+				       struct flow_block_offload *f)
+{
+	return -EOPNOTSUPP;
+}
+#endif
+
+#endif
diff --git a/net/Kconfig b/net/Kconfig
index 57f51a2..08d36444 100644
--- a/net/Kconfig
+++ b/net/Kconfig
@@ -206,6 +206,7 @@
 source "net/bridge/netfilter/Kconfig"
 endif
 source "net/bpfilter/Kconfig"
+source "net/xdp_flow/Kconfig"
 
 source "net/dccp/Kconfig"
 source "net/sctp/Kconfig"
diff --git a/net/Makefile b/net/Makefile
index 449fc0b..b78d1ef 100644
--- a/net/Makefile
+++ b/net/Makefile
@@ -87,3 +87,4 @@ endif
 obj-$(CONFIG_QRTR)		+= qrtr/
 obj-$(CONFIG_NET_NCSI)		+= ncsi/
 obj-$(CONFIG_XDP_SOCKETS)	+= xdp/
+obj-$(CONFIG_XDP_FLOW)		+= xdp_flow/
diff --git a/net/xdp_flow/.gitignore b/net/xdp_flow/.gitignore
new file mode 100644
index 0000000..8cad817
--- /dev/null
+++ b/net/xdp_flow/.gitignore
@@ -0,0 +1 @@
+xdp_flow_umh
diff --git a/net/xdp_flow/Kconfig b/net/xdp_flow/Kconfig
new file mode 100644
index 0000000..82e7bf3
--- /dev/null
+++ b/net/xdp_flow/Kconfig
@@ -0,0 +1,16 @@
+# SPDX-License-Identifier: GPL-2.0-only
+menuconfig XDP_FLOW
+	bool "XDP based flow offload engine (XDP_FLOW)"
+	depends on NET && BPF_SYSCALL && NET_CLS_FLOWER && MEMFD_CREATE
+	help
+	  This builds experimental xdp_flow framework that is aiming to
+	  provide flow software offload functionality via XDP
+
+if XDP_FLOW
+config XDP_FLOW_UMH
+	tristate "xdp_flow kernel module with user mode helper"
+	depends on $(success,$(srctree)/scripts/cc-can-link.sh $(CC))
+	default m
+	help
+	  This builds xdp_flow kernel module with embedded user mode helper
+endif
diff --git a/net/xdp_flow/Makefile b/net/xdp_flow/Makefile
new file mode 100644
index 0000000..f6138c2
--- /dev/null
+++ b/net/xdp_flow/Makefile
@@ -0,0 +1,31 @@
+# SPDX-License-Identifier: GPL-2.0
+
+obj-$(CONFIG_XDP_FLOW) += xdp_flow_core.o
+
+ifeq ($(CONFIG_XDP_FLOW_UMH), y)
+# builtin xdp_flow_umh should be compiled with -static
+# since rootfs isn't mounted at the time of __init
+# function is called and do_execv won't find elf interpreter
+STATIC := -static
+endif
+
+quiet_cmd_cc_user = CC      $@
+      cmd_cc_user = $(CC) -Wall -Wmissing-prototypes -O2 -std=gnu89 \
+		    -I$(srctree)/tools/include/ \
+		    -c -o $@ $<
+
+quiet_cmd_ld_user = LD      $@
+      cmd_ld_user = $(CC) $(STATIC) -o $@ $^
+
+$(obj)/xdp_flow_umh.o: $(src)/xdp_flow_umh.c FORCE
+	$(call if_changed,cc_user)
+
+$(obj)/xdp_flow_umh: $(obj)/xdp_flow_umh.o
+	$(call if_changed,ld_user)
+
+clean-files := xdp_flow_umh
+
+$(obj)/xdp_flow_umh_blob.o: $(obj)/xdp_flow_umh
+
+obj-$(CONFIG_XDP_FLOW_UMH) += xdp_flow.o
+xdp_flow-objs += xdp_flow_kern_mod.o xdp_flow_umh_blob.o
diff --git a/net/xdp_flow/msgfmt.h b/net/xdp_flow/msgfmt.h
new file mode 100644
index 0000000..97d8490
--- /dev/null
+++ b/net/xdp_flow/msgfmt.h
@@ -0,0 +1,102 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _NET_XDP_FLOW_MSGFMT_H
+#define _NET_XDP_FLOW_MSGFMT_H
+
+#include
+#include
+#include
+#include
+
+#define MAX_XDP_FLOW_ACTIONS 32
+
+enum xdp_flow_action_id {
+	/* ABORT if 0, i.e. uninitialized */
+	XDP_FLOW_ACTION_ACCEPT = 1,
+	XDP_FLOW_ACTION_DROP,
+	XDP_FLOW_ACTION_REDIRECT,
+	XDP_FLOW_ACTION_VLAN_PUSH,
+	XDP_FLOW_ACTION_VLAN_POP,
+	XDP_FLOW_ACTION_VLAN_MANGLE,
+	XDP_FLOW_ACTION_MANGLE,
+	XDP_FLOW_ACTION_CSUM,
+	NR_XDP_FLOW_ACTION,
+};
+
+struct xdp_flow_action {
+	enum xdp_flow_action_id id;
+	union {
+		int ifindex;		/* REDIRECT */
+		struct {		/* VLAN */
+			__be16 proto;
+			__be16 tci;
+		} vlan;
+	};
+};
+
+struct xdp_flow_actions {
+	unsigned int num_actions;
+	struct xdp_flow_action actions[MAX_XDP_FLOW_ACTIONS];
+};
+
+struct xdp_flow_key {
+	struct {
+		__u8 dst[ETH_ALEN] __aligned(2);
+		__u8 src[ETH_ALEN] __aligned(2);
+		__be16 type;
+	} eth;
+	struct {
+		__be16 tpid;
+		__be16 tci;
+	} vlan;
+	struct {
+		__u8 proto;
+		__u8 ttl;
+		__u8 tos;
+		__u8 frag;
+	} ip;
+	union {
+		struct {
+			__be32 src;
+			__be32 dst;
+		} ipv4;
+		struct {
+			struct in6_addr src;
+			struct in6_addr dst;
+		} ipv6;
+	};
+	struct {
+		__be16 src;
+		__be16 dst;
+	} l4port;
+	struct {
+		__be16 flags;
+	} tcp;
+} __aligned(BITS_PER_LONG / 8);
+
+struct xdp_flow {
+	struct xdp_flow_key key;
+	struct xdp_flow_key mask;
+	struct xdp_flow_actions actions;
+	__u16 priority;
+};
+
+enum xdp_flow_cmd {
+	XDP_FLOW_CMD_NOOP = 0,
+	XDP_FLOW_CMD_LOAD,
+	XDP_FLOW_CMD_UNLOAD,
+	XDP_FLOW_CMD_REPLACE,
+	XDP_FLOW_CMD_DELETE,
+};
+
+struct mbox_request {
+	int ifindex;
+	__u8 cmd;
+	struct xdp_flow flow;
+};
+
+struct mbox_reply {
+	int status;
+	__u32 id;
+};
+
+#endif
diff --git a/net/xdp_flow/xdp_flow_core.c b/net/xdp_flow/xdp_flow_core.c
new file mode 100644
index 0000000..ab84863
--- /dev/null
+++ b/net/xdp_flow/xdp_flow_core.c
@@ -0,0 +1,126 @@
+// SPDX-License-Identifier: GPL-2.0
+#include
+#include
+#include
+#include
+#include
+
+struct xdp_flow_umh_ops xdp_flow_ops;
+EXPORT_SYMBOL_GPL(xdp_flow_ops);
+
+static LIST_HEAD(xdp_block_cb_list);
+
+static void xdp_flow_block_release(void *cb_priv)
+{
+	struct net_device *dev = cb_priv;
+	struct netlink_ext_ack extack;
+
+	mutex_lock(&xdp_flow_ops.lock);
+	xdp_flow_ops.setup(dev, false, &extack);
+	module_put(xdp_flow_ops.module);
+	mutex_unlock(&xdp_flow_ops.lock);
+}
+
+int xdp_flow_setup_block(struct net_device *dev, struct flow_block_offload *f)
+{
+	struct flow_block_cb *block_cb;
+	int err = 0;
+
+	/* TODO: Remove this limitation */
+	if (!net_eq(current->nsproxy->net_ns, &init_net))
+		return -EOPNOTSUPP;
+
+	if (f->binder_type != FLOW_BLOCK_BINDER_TYPE_CLSACT_INGRESS)
+		return -EOPNOTSUPP;
+
+	mutex_lock(&xdp_flow_ops.lock);
+	if (!xdp_flow_ops.module) {
+		mutex_unlock(&xdp_flow_ops.lock);
+		err = request_module("xdp_flow");
+		if (err)
+			return err;
+		mutex_lock(&xdp_flow_ops.lock);
+		if (!xdp_flow_ops.module) {
+			err = -ECHILD;
+			goto out;
+		}
+	}
+	if (xdp_flow_ops.stop) {
+		err = xdp_flow_ops.start();
+		if (err)
+			goto out;
+	}
+
+	f->driver_block_list = &xdp_block_cb_list;
+
+	switch (f->command) {
+	case FLOW_BLOCK_BIND:
+		if (flow_block_cb_is_busy(xdp_flow_ops.setup_cb, dev,
+					  &xdp_block_cb_list)) {
+			err = -EBUSY;
+			goto out;
+		}
+
+		if (!try_module_get(xdp_flow_ops.module)) {
+			err = -ECHILD;
+			goto out;
+		}
+
+		err = xdp_flow_ops.setup(dev, true, f->extack);
+		if (err) {
+			module_put(xdp_flow_ops.module);
+			goto out;
+		}
+
+		block_cb = flow_block_cb_alloc(xdp_flow_ops.setup_cb, dev, dev,
+					       xdp_flow_block_release);
+		if (IS_ERR(block_cb)) {
+			xdp_flow_ops.setup(dev, false, f->extack);
+			module_put(xdp_flow_ops.module);
+			err = PTR_ERR(block_cb);
+			goto out;
+		}
+
+		flow_block_cb_add(block_cb, f);
+		list_add_tail(&block_cb->driver_list, &xdp_block_cb_list);
+		break;
+	case FLOW_BLOCK_UNBIND:
+		block_cb = flow_block_cb_lookup(f->block, xdp_flow_ops.setup_cb,
+						dev);
+		if (!block_cb) {
+			err = -ENOENT;
+			goto out;
+		}
+
+		flow_block_cb_remove(block_cb, f);
+		list_del(&block_cb->driver_list);
+		break;
+	default:
+		err = -EOPNOTSUPP;
+	}
+out:
+	mutex_unlock(&xdp_flow_ops.lock);
+
+	return err;
+}
+
+static void xdp_flow_umh_cleanup(struct umh_info *info)
+{
+	mutex_lock(&xdp_flow_ops.lock);
+	xdp_flow_ops.stop = true;
+	fput(info->pipe_to_umh);
+	fput(info->pipe_from_umh);
+	info->pid = 0;
+	mutex_unlock(&xdp_flow_ops.lock);
+}
+
+static int __init xdp_flow_init(void)
+{
+	mutex_init(&xdp_flow_ops.lock);
+	xdp_flow_ops.stop = true;
+	xdp_flow_ops.info.cmdline = "xdp_flow_umh";
+	xdp_flow_ops.info.cleanup = &xdp_flow_umh_cleanup;
+
+	return 0;
+}
+device_initcall(xdp_flow_init);
diff --git a/net/xdp_flow/xdp_flow_kern_mod.c b/net/xdp_flow/xdp_flow_kern_mod.c
new file mode 100644
index 0000000..823ab65
--- /dev/null
+++ b/net/xdp_flow/xdp_flow_kern_mod.c
@@ -0,0 +1,250 @@
+// SPDX-License-Identifier: GPL-2.0
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+#include
+#include
+#include
+#include
+#include
+#include "msgfmt.h"
+
+extern char xdp_flow_umh_start;
+extern char xdp_flow_umh_end;
+
+static void shutdown_umh(void)
+{
+	struct task_struct *tsk;
+
+	if (xdp_flow_ops.stop)
+		return;
+
+	tsk = get_pid_task(find_vpid(xdp_flow_ops.info.pid), PIDTYPE_PID);
+	if (tsk) {
+		send_sig(SIGKILL, tsk, 1);
+		put_task_struct(tsk);
+	}
+}
+
+static int transact_umh(struct mbox_request *req, u32 *id)
+{
+	struct mbox_reply reply;
+	int ret = -EFAULT;
+	loff_t pos;
+	ssize_t n;
+
+	if (!xdp_flow_ops.info.pid)
+		goto out;
+
+	n = __kernel_write(xdp_flow_ops.info.pipe_to_umh, req, sizeof(*req),
+			   &pos);
+	if (n != sizeof(*req)) {
+		pr_err("write fail %zd\n", n);
+		shutdown_umh();
+		goto out;
+	}
+
+	pos = 0;
+	n = kernel_read(xdp_flow_ops.info.pipe_from_umh, &reply,
+			sizeof(reply), &pos);
+	if (n != sizeof(reply)) {
+		pr_err("read fail %zd\n", n);
+		shutdown_umh();
+		goto out;
+	}
+
+	ret = reply.status;
+	if (id)
+		*id = reply.id;
+out:
+	return ret;
+}
+
+static int xdp_flow_replace(struct net_device *dev, struct flow_cls_offload *f)
+{
+	return -EOPNOTSUPP;
+}
+
+int xdp_flow_destroy(struct net_device *dev, struct flow_cls_offload *f)
+{
+	return -EOPNOTSUPP;
+}
+
+static int xdp_flow_setup_flower(struct net_device *dev,
+				 struct flow_cls_offload *f)
+{
+	switch (f->command) {
+	case FLOW_CLS_REPLACE:
+		return xdp_flow_replace(dev, f);
+	case FLOW_CLS_DESTROY:
+		return xdp_flow_destroy(dev, f);
+	case FLOW_CLS_STATS:
+	case FLOW_CLS_TMPLT_CREATE:
+	case FLOW_CLS_TMPLT_DESTROY:
+	default:
+		return -EOPNOTSUPP;
+	}
+}
+
+static int xdp_flow_setup_block_cb(enum tc_setup_type type, void *type_data,
+				   void *cb_priv)
+{
+	struct flow_cls_common_offload *common = type_data;
+	struct net_device *dev = cb_priv;
+	int err = 0;
+
+	if (common->chain_index) {
+		NL_SET_ERR_MSG(common->extack,
+			       "xdp_flow supports only offload of chain 0");
+		return -EOPNOTSUPP;
+	}
+
+	if (type != TC_SETUP_CLSFLOWER)
+		return -EOPNOTSUPP;
+
+	mutex_lock(&xdp_flow_ops.lock);
+	if (xdp_flow_ops.stop) {
+		err = xdp_flow_ops.start();
+		if (err)
+			goto out;
+	}
+
+	err = xdp_flow_setup_flower(dev, type_data);
+out:
+	mutex_unlock(&xdp_flow_ops.lock);
+	return err;
+}
+
+static int xdp_flow_setup_bind(struct net_device *dev,
+			       struct netlink_ext_ack *extack)
+{
+	struct mbox_request *req;
+	u32 id = 0;
+	int err;
+
+	req = kzalloc(sizeof(*req), GFP_KERNEL);
+	if (!req)
+		return -ENOMEM;
+
+	req->cmd = XDP_FLOW_CMD_LOAD;
+	req->ifindex = dev->ifindex;
+
+	/* Load bpf in UMH and get prog id */
+	err = transact_umh(req, &id);
+
+	/* TODO: id will be used to attach bpf prog to XDP
+	 * As we have rtnl_lock, UMH cannot attach prog to XDP
+	 */
+
+	kfree(req);
+
+	return err;
+}
+
+static int xdp_flow_setup_unbind(struct net_device *dev,
+				 struct netlink_ext_ack *extack)
+{
+	struct mbox_request *req;
+	int err;
+
+	req = kzalloc(sizeof(*req), GFP_KERNEL);
+	if (!req)
+		return -ENOMEM;
+
+	req->cmd = XDP_FLOW_CMD_UNLOAD;
+	req->ifindex = dev->ifindex;
+
+	err = transact_umh(req, NULL);
+
+	kfree(req);
+
+	return err;
+}
+
+static int xdp_flow_setup(struct net_device *dev, bool do_bind,
+			  struct netlink_ext_ack *extack)
+{
+	ASSERT_RTNL();
+
+	if (!net_eq(dev_net(dev), &init_net))
+		return -EINVAL;
+
+	return do_bind ?
+	       xdp_flow_setup_bind(dev, extack) :
+	       xdp_flow_setup_unbind(dev, extack);
+}
+
+static int xdp_flow_test(void)
+{
+	struct mbox_request *req;
+	int err;
+
+	req = kzalloc(sizeof(*req), GFP_KERNEL);
+	if (!req)
+		return -ENOMEM;
+
+	req->cmd = XDP_FLOW_CMD_NOOP;
+	err = transact_umh(req, NULL);
+
+	kfree(req);
+
+	return err;
+}
+
+static int start_umh(void)
+{
+	int err;
+
+	/* fork usermode process */
+	err = fork_usermode_blob(&xdp_flow_umh_start,
+				 &xdp_flow_umh_end - &xdp_flow_umh_start,
+				 &xdp_flow_ops.info);
+	if (err)
+		return err;
+
+	xdp_flow_ops.stop = false;
+	pr_info("Loaded xdp_flow_umh pid %d\n", xdp_flow_ops.info.pid);
+
+	/* health check that usermode process started correctly */
+	if (xdp_flow_test()) {
+		shutdown_umh();
+		return -EFAULT;
+	}
+
+	return 0;
+}
+
+static int __init load_umh(void)
+{
+	int err = 0;
+
+	mutex_lock(&xdp_flow_ops.lock);
+	if (!xdp_flow_ops.stop) {
+		err = -EFAULT;
+		goto err;
+	}
+
+	err = start_umh();
+	if (err)
+		goto err;
+
+	xdp_flow_ops.setup_cb = &xdp_flow_setup_block_cb;
+	xdp_flow_ops.setup = &xdp_flow_setup;
+	xdp_flow_ops.start = &start_umh;
+	xdp_flow_ops.module = THIS_MODULE;
+err:
+	mutex_unlock(&xdp_flow_ops.lock);
+	return err;
+}
+
+static void __exit fini_umh(void)
+{
+	mutex_lock(&xdp_flow_ops.lock);
+	shutdown_umh();
+	xdp_flow_ops.module = NULL;
+	xdp_flow_ops.start = NULL;
+	xdp_flow_ops.setup = NULL;
+	xdp_flow_ops.setup_cb = NULL;
+	mutex_unlock(&xdp_flow_ops.lock);
+}
+module_init(load_umh);
+module_exit(fini_umh);
+MODULE_LICENSE("GPL");
diff --git a/net/xdp_flow/xdp_flow_umh.c b/net/xdp_flow/xdp_flow_umh.c
new file mode 100644
index 0000000..6729bdf
--- /dev/null
+++ b/net/xdp_flow/xdp_flow_umh.c
@@ -0,0 +1,109 @@
+// SPDX-License-Identifier: GPL-2.0
+#define _GNU_SOURCE
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include "msgfmt.h"
+
+/* FIXME: syslog is used for easy debugging. As writing /dev/log can be stuck
+ * due to reader side, should use another log mechanism like kmsg.
+ */
+#define pr_debug(fmt, ...) syslog(LOG_DAEMON | LOG_DEBUG, fmt, ##__VA_ARGS__)
+#define pr_info(fmt, ...) syslog(LOG_DAEMON | LOG_INFO, fmt, ##__VA_ARGS__)
+#define pr_warn(fmt, ...) syslog(LOG_DAEMON | LOG_WARNING, fmt, ##__VA_ARGS__)
+#define pr_err(fmt, ...) syslog(LOG_DAEMON | LOG_ERR, fmt, ##__VA_ARGS__)
+
+static int handle_load(const struct mbox_request *req, __u32 *prog_id)
+{
+	*prog_id = 0;
+
+	return 0;
+}
+
+static int handle_unload(const struct mbox_request *req)
+{
+	return 0;
+}
+
+static int handle_replace(struct mbox_request *req)
+{
+	return -EOPNOTSUPP;
+}
+
+static int handle_delete(const struct mbox_request *req)
+{
+	return -EOPNOTSUPP;
+}
+
+static void loop(void)
+{
+	struct mbox_request *req;
+
+	req = malloc(sizeof(struct mbox_request));
+	if (!req) {
+		pr_err("Memory allocation for mbox_request failed\n");
+		return;
+	}
+
+	while (1) {
+		struct mbox_reply reply;
+		int n;
+
+		n = read(0, req, sizeof(*req));
+		if (n < 0) {
+			pr_err("read for mbox_request failed: %s\n",
+			       strerror(errno));
+			break;
+		}
+		if (n != sizeof(*req)) {
+			pr_err("Invalid request size %d\n", n);
+			break;
+		}
+
+		switch (req->cmd) {
+		case XDP_FLOW_CMD_NOOP:
+			reply.status = 0;
+			break;
+		case XDP_FLOW_CMD_LOAD:
+			reply.status = handle_load(req, &reply.id);
+			break;
+		case XDP_FLOW_CMD_UNLOAD:
+			reply.status = handle_unload(req);
+			break;
+		case XDP_FLOW_CMD_REPLACE:
+			reply.status = handle_replace(req);
+			break;
+		case XDP_FLOW_CMD_DELETE:
+			reply.status = handle_delete(req);
+			break;
+		default:
+			pr_err("Invalid command %d\n", req->cmd);
+			reply.status = -EOPNOTSUPP;
+		}
+
+		n = write(1, &reply, sizeof(reply));
+		if (n < 0) {
+			pr_err("write for mbox_reply failed: %s\n",
+			       strerror(errno));
+			break;
+		}
+		if (n != sizeof(reply)) {
+			pr_err("reply written too short: %d\n", n);
+			break;
+		}
+	}
+
+	free(req);
+}
+
+int main(void)
+{
+	pr_info("Started xdp_flow\n");
+	loop();
+
+	return 0;
+}
diff --git a/net/xdp_flow/xdp_flow_umh_blob.S b/net/xdp_flow/xdp_flow_umh_blob.S
new file mode 100644
index 0000000..6edcb0e
--- /dev/null
+++ b/net/xdp_flow/xdp_flow_umh_blob.S
@@ -0,0 +1,7 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+	.section .rodata, "a"
+	.global xdp_flow_umh_start
+xdp_flow_umh_start:
+	.incbin "net/xdp_flow/xdp_flow_umh"
+	.global xdp_flow_umh_end
+xdp_flow_umh_end:

From patchwork Tue Aug 13 12:05:46 2019
X-Patchwork-Submitter: Toshiaki Makita
X-Patchwork-Id: 1146282
X-Patchwork-Delegate: bpf@iogearbox.net
From: Toshiaki Makita
To: Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu, Yonghong Song, "David S.
Miller", Jakub Kicinski, Jesper Dangaard Brouer, John Fastabend, Jamal Hadi Salim, Cong Wang, Jiri Pirko
Cc: Toshiaki Makita, netdev@vger.kernel.org, bpf@vger.kernel.org, William Tu
Subject: [RFC PATCH bpf-next 02/14] xdp_flow: Add skeleton bpf program for XDP
Date: Tue, 13 Aug 2019 21:05:46 +0900
Message-Id: <20190813120558.6151-3-toshiaki.makita1@gmail.com>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20190813120558.6151-1-toshiaki.makita1@gmail.com>
References: <20190813120558.6151-1-toshiaki.makita1@gmail.com>

The program is meant to be loaded when a device is bound to an ingress TC
block, and it should then be attached to XDP on the device. Typically it
is loaded when a TC ingress or clsact qdisc is added.

The program is prebuilt and embedded in the UMH rather than generated
dynamically. This is because TC filters are changed frequently when used
by OVS, and the latency of a TC filter change would otherwise affect the
latency of the datapath.
Signed-off-by: Toshiaki Makita
---
 net/xdp_flow/Makefile                 |  87 +++++++++++-
 net/xdp_flow/xdp_flow_kern_bpf.c      |  12 ++
 net/xdp_flow/xdp_flow_kern_bpf_blob.S |   7 +
 net/xdp_flow/xdp_flow_umh.c           | 241 +++++++++++++++++++++++++++++++++-
 4 files changed, 343 insertions(+), 4 deletions(-)
 create mode 100644 net/xdp_flow/xdp_flow_kern_bpf.c
 create mode 100644 net/xdp_flow/xdp_flow_kern_bpf_blob.S

diff --git a/net/xdp_flow/Makefile b/net/xdp_flow/Makefile
index f6138c2..b3a0416 100644
--- a/net/xdp_flow/Makefile
+++ b/net/xdp_flow/Makefile
@@ -2,25 +2,106 @@
 
 obj-$(CONFIG_XDP_FLOW) += xdp_flow_core.o
 
+XDP_FLOW_PATH ?= $(abspath $(srctree)/$(src))
+TOOLS_PATH := $(XDP_FLOW_PATH)/../../tools
+
+# Libbpf dependencies
+LIBBPF = $(TOOLS_PATH)/lib/bpf/libbpf.a
+
+LLC ?= llc
+CLANG ?= clang
+LLVM_OBJCOPY ?= llvm-objcopy
+BTF_PAHOLE ?= pahole
+
+ifdef CROSS_COMPILE
+CLANG_ARCH_ARGS = -target $(ARCH)
+endif
+
+BTF_LLC_PROBE := $(shell $(LLC) -march=bpf -mattr=help 2>&1 | grep dwarfris)
+BTF_PAHOLE_PROBE := $(shell $(BTF_PAHOLE) --help 2>&1 | grep BTF)
+BTF_OBJCOPY_PROBE := $(shell $(LLVM_OBJCOPY) --help 2>&1 | grep -i 'usage.*llvm')
+BTF_LLVM_PROBE := $(shell echo "int main() { return 0; }" | \
+		  $(CLANG) -target bpf -O2 -g -c -x c - -o ./llvm_btf_verify.o; \
+		  readelf -S ./llvm_btf_verify.o | grep BTF; \
+		  /bin/rm -f ./llvm_btf_verify.o)
+
+ifneq ($(BTF_LLVM_PROBE),)
+	EXTRA_CFLAGS += -g
+else
+ifneq ($(and $(BTF_LLC_PROBE),$(BTF_PAHOLE_PROBE),$(BTF_OBJCOPY_PROBE)),)
+	EXTRA_CFLAGS += -g
+	LLC_FLAGS += -mattr=dwarfris
+	DWARF2BTF = y
+endif
+endif
+
+$(LIBBPF): FORCE
+# Fix up variables inherited from Kbuild that tools/ build system won't like
+	$(MAKE) -C $(dir $@) RM='rm -rf' LDFLAGS= srctree=$(XDP_FLOW_PATH)/../../ O=
+
+# Verify LLVM compiler tools are available and bpf target is supported by llc
+.PHONY: verify_cmds verify_target_bpf $(CLANG) $(LLC)
+
+verify_cmds: $(CLANG) $(LLC)
+	@for TOOL in $^ ; do \
+		if ! (which -- "$${TOOL}" > /dev/null 2>&1); then \
+			echo "*** ERROR: Cannot find LLVM tool $${TOOL}" ;\
+			exit 1; \
+		else true; fi; \
+	done
+
+verify_target_bpf: verify_cmds
+	@if ! (${LLC} -march=bpf -mattr=help > /dev/null 2>&1); then \
+		echo "*** ERROR: LLVM (${LLC}) does not support 'bpf' target" ;\
+		echo "   NOTICE: LLVM version >= 3.7.1 required" ;\
+		exit 2; \
+	else true; fi
+
+$(src)/xdp_flow_kern_bpf.c: verify_target_bpf
+
+$(obj)/xdp_flow_kern_bpf.o: $(src)/xdp_flow_kern_bpf.c FORCE
+	@echo "  CLANG-bpf " $@
+	$(Q)$(CLANG) $(NOSTDINC_FLAGS) $(LINUXINCLUDE) $(EXTRA_CFLAGS) -I$(obj) \
+		-I$(srctree)/tools/testing/selftests/bpf/ \
+		-D__KERNEL__ -D__BPF_TRACING__ -Wno-unused-value -Wno-pointer-sign \
+		-D__TARGET_ARCH_$(SRCARCH) -Wno-compare-distinct-pointer-types \
+		-Wno-gnu-variable-sized-type-not-at-end \
+		-Wno-address-of-packed-member -Wno-tautological-compare \
+		-Wno-unknown-warning-option $(CLANG_ARCH_ARGS) \
+		-I$(srctree)/samples/bpf/ -include asm_goto_workaround.h \
+		-O2 -emit-llvm -c $< -o -| $(LLC) -march=bpf $(LLC_FLAGS) -filetype=obj -o $@
+ifeq ($(DWARF2BTF),y)
+	$(BTF_PAHOLE) -J $@
+endif
+
 ifeq ($(CONFIG_XDP_FLOW_UMH), y)
 # builtin xdp_flow_umh should be compiled with -static
 # since rootfs isn't mounted at the time of __init
 # function is called and do_execv won't find elf interpreter
 STATIC := -static
+STATICLDLIBS := -lz
 endif
 
+quiet_cmd_as_user = AS      $@
+      cmd_as_user = $(AS) -c -o $@ $<
+
 quiet_cmd_cc_user = CC      $@
       cmd_cc_user = $(CC) -Wall -Wmissing-prototypes -O2 -std=gnu89 \
-		    -I$(srctree)/tools/include/ \
+		    -I$(srctree)/tools/lib/ -I$(srctree)/tools/include/ \
 		    -c -o $@ $<
 
 quiet_cmd_ld_user = LD      $@
-      cmd_ld_user = $(CC) $(STATIC) -o $@ $^
+      cmd_ld_user = $(CC) $(STATIC) -o $@ $^ $(LIBBPF) -lelf $(STATICLDLIBS)
+
+$(obj)/xdp_flow_kern_bpf_blob.o: $(src)/xdp_flow_kern_bpf_blob.S \
+				 $(obj)/xdp_flow_kern_bpf.o
+	$(call if_changed,as_user)
 
 $(obj)/xdp_flow_umh.o: $(src)/xdp_flow_umh.c FORCE
 	$(call if_changed,cc_user)
 
-$(obj)/xdp_flow_umh: $(obj)/xdp_flow_umh.o
+$(obj)/xdp_flow_umh: $(obj)/xdp_flow_umh.o $(LIBBPF) \
+		     $(obj)/xdp_flow_kern_bpf_blob.o
 	$(call if_changed,ld_user)
 
 clean-files := xdp_flow_umh
diff --git a/net/xdp_flow/xdp_flow_kern_bpf.c b/net/xdp_flow/xdp_flow_kern_bpf.c
new file mode 100644
index 0000000..74cdb1d
--- /dev/null
+++ b/net/xdp_flow/xdp_flow_kern_bpf.c
@@ -0,0 +1,12 @@
+// SPDX-License-Identifier: GPL-2.0
+#define KBUILD_MODNAME "foo"
+#include
+#include
+
+SEC("xdp_flow")
+int xdp_flow_prog(struct xdp_md *ctx)
+{
+	return XDP_PASS;
+}
+
+char _license[] SEC("license") = "GPL";
diff --git a/net/xdp_flow/xdp_flow_kern_bpf_blob.S b/net/xdp_flow/xdp_flow_kern_bpf_blob.S
new file mode 100644
index 0000000..d180c1b
--- /dev/null
+++ b/net/xdp_flow/xdp_flow_kern_bpf_blob.S
@@ -0,0 +1,7 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+	.section .rodata, "a"
+	.global xdp_flow_bpf_start
+xdp_flow_bpf_start:
+	.incbin "net/xdp_flow/xdp_flow_kern_bpf.o"
+	.global xdp_flow_bpf_end
+xdp_flow_bpf_end:
diff --git a/net/xdp_flow/xdp_flow_umh.c b/net/xdp_flow/xdp_flow_umh.c
index 6729bdf..734db00 100644
--- a/net/xdp_flow/xdp_flow_umh.c
+++ b/net/xdp_flow/xdp_flow_umh.c
@@ -6,9 +6,19 @@
 #include
 #include
 #include
+#include
+#include
+#include
 #include
+#include
+#include
+#include
 #include "msgfmt.h"
 
+extern char xdp_flow_bpf_start;
+extern char xdp_flow_bpf_end;
+int progfile_fd;
+
 /* FIXME: syslog is used for easy debugging. As writing /dev/log can be stuck
  * due to reader side, should use another log mechanism like kmsg.
  */
@@ -17,15 +27,241 @@
 #define pr_warn(fmt, ...) syslog(LOG_DAEMON | LOG_WARNING, fmt, ##__VA_ARGS__)
 #define pr_err(fmt, ...) syslog(LOG_DAEMON | LOG_ERR, fmt, ##__VA_ARGS__)
 
+#define ERRBUF_SIZE 64
+
+/* This key represents a net device */
+struct netdev_info_key {
+	int ifindex;
+};
+
+struct netdev_info {
+	struct netdev_info_key key;
+	struct hlist_node node;
+	struct bpf_object *obj;
+};
+
+DEFINE_HASHTABLE(netdev_info_table, 16);
+
+static int libbpf_err(int err, char *errbuf)
+{
+	libbpf_strerror(err, errbuf, ERRBUF_SIZE);
+
+	if (-err < __LIBBPF_ERRNO__START)
+		return err;
+
+	return -EINVAL;
+}
+
+static int setup(void)
+{
+	size_t size = &xdp_flow_bpf_end - &xdp_flow_bpf_start;
+	struct rlimit r = { RLIM_INFINITY, RLIM_INFINITY };
+	ssize_t len;
+	int err;
+
+	if (setrlimit(RLIMIT_MEMLOCK, &r)) {
+		err = -errno;
+		pr_err("setrlimit MEMLOCK failed: %s\n", strerror(errno));
+		return err;
+	}
+
+	progfile_fd = memfd_create("xdp_flow_kern_bpf.o", 0);
+	if (progfile_fd < 0) {
+		err = -errno;
+		pr_err("memfd_create failed: %s\n", strerror(errno));
+		return err;
+	}
+
+	len = write(progfile_fd, &xdp_flow_bpf_start, size);
+	if (len < 0) {
+		err = -errno;
+		pr_err("Failed to write bpf prog: %s\n", strerror(errno));
+		goto err;
+	}
+
+	if (len < size) {
+		pr_err("bpf prog written too short: expected %ld, actual %ld\n",
+		       size, len);
+		err = -EIO;
+		goto err;
+	}
+
+	return 0;
+err:
+	close(progfile_fd);
+
+	return err;
+}
+
+static int load_bpf(int ifindex, struct bpf_object **objp)
+{
+	struct bpf_object_open_attr attr = {};
+	char path[256], errbuf[ERRBUF_SIZE];
+	struct bpf_program *prog;
+	struct bpf_object *obj;
+	int prog_fd, err;
+	ssize_t len;
+
+	len = snprintf(path, 256, "/proc/self/fd/%d", progfile_fd);
+	if (len < 0) {
+		err = -errno;
+		pr_err("Failed to setup prog fd path string: %s\n",
+		       strerror(errno));
+		return err;
+	}
+
+	attr.file = path;
+	attr.prog_type = BPF_PROG_TYPE_XDP;
+	obj = bpf_object__open_xattr(&attr);
+	if (IS_ERR_OR_NULL(obj)) {
+		if (IS_ERR(obj)) {
+			err = libbpf_err((int)PTR_ERR(obj), errbuf);
+		} else {
+			err = -ENOENT;
+			strerror_r(-err, errbuf, sizeof(errbuf));
+		}
+		pr_err("Cannot open bpf prog: %s\n", errbuf);
+		return err;
+	}
+
+	bpf_object__for_each_program(prog, obj)
+		bpf_program__set_type(prog, attr.prog_type);
+
+	err = bpf_object__load(obj);
+	if (err) {
+		err = libbpf_err(err, errbuf);
+		pr_err("Failed to load bpf prog: %s\n", errbuf);
+		goto err;
+	}
+
+	prog = bpf_object__find_program_by_title(obj, "xdp_flow");
+	if (!prog) {
+		pr_err("Cannot find xdp_flow program\n");
+		err = -ENOENT;
+		goto err;
+	}
+
+	prog_fd = bpf_program__fd(prog);
+	if (prog_fd < 0) {
+		err = libbpf_err(prog_fd, errbuf);
+		pr_err("Invalid program fd: %s\n", errbuf);
+		goto err;
+	}
+
+	*objp = obj;
+
+	return prog_fd;
+err:
+	bpf_object__close(obj);
+	return err;
+}
+
+static int get_netdev_info_keyval(const struct netdev_info_key *key)
+{
+	return key->ifindex;
+}
+
+static struct netdev_info *find_netdev_info(const struct netdev_info_key *key)
+{
+	int keyval = get_netdev_info_keyval(key);
+	struct netdev_info *netdev_info;
+
+	hash_for_each_possible(netdev_info_table, netdev_info, node, keyval) {
+		if (netdev_info->key.ifindex == key->ifindex)
+			return netdev_info;
+	}
+
+	return NULL;
+}
+
+static int get_netdev_info_key(const struct mbox_request *req,
+			       struct netdev_info_key *key)
+{
+	key->ifindex = req->ifindex;
+
+	return 0;
+}
+
+static struct netdev_info *get_netdev_info(const struct mbox_request *req)
+{
+	struct netdev_info *netdev_info;
+	struct netdev_info_key key;
+	int err;
+
+	err = get_netdev_info_key(req, &key);
+	if (err)
+		return ERR_PTR(err);
+
+	netdev_info = find_netdev_info(&key);
+	if (!netdev_info) {
+		pr_err("BUG: netdev_info for if %d not found.\n",
+		       key.ifindex);
+		return ERR_PTR(-ENOENT);
+	}
+
+	return netdev_info;
+}
+
 static int handle_load(const struct mbox_request *req, __u32 *prog_id)
 {
-	*prog_id = 0;
+	struct netdev_info *netdev_info;
+	struct bpf_prog_info info = {};
+	struct netdev_info_key key;
+	__u32 len = sizeof(info);
+	int err, prog_fd;
+
+	err = get_netdev_info_key(req,
&key); + if (err) + return err; + + netdev_info = find_netdev_info(&key); + if (netdev_info) + return 0; + + netdev_info = malloc(sizeof(*netdev_info)); + if (!netdev_info) { + pr_err("malloc for netdev_info failed.\n"); + return -ENOMEM; + } + netdev_info->key.ifindex = key.ifindex; + + prog_fd = load_bpf(req->ifindex, &netdev_info->obj); + if (prog_fd < 0) { + err = prog_fd; + goto err_netdev_info; + } + + err = bpf_obj_get_info_by_fd(prog_fd, &info, &len); + if (err) + goto err_obj; + + *prog_id = info.id; + hash_add(netdev_info_table, &netdev_info->node, + get_netdev_info_keyval(&netdev_info->key)); + pr_debug("XDP program for if %d was loaded\n", req->ifindex); return 0; +err_obj: + bpf_object__close(netdev_info->obj); +err_netdev_info: + free(netdev_info); + + return err; } static int handle_unload(const struct mbox_request *req) { + struct netdev_info *netdev_info; + + netdev_info = get_netdev_info(req); + if (IS_ERR(netdev_info)) + return PTR_ERR(netdev_info); + + hash_del(&netdev_info->node); + bpf_object__close(netdev_info->obj); + free(netdev_info); + pr_debug("XDP program for if %d was closed\n", req->ifindex); + return 0; } @@ -103,7 +339,10 @@ static void loop(void) int main(void) { pr_info("Started xdp_flow\n"); + if (setup()) + return -1; loop(); + close(progfile_fd); return 0; } From patchwork Tue Aug 13 12:05:47 2019 X-Patchwork-Submitter: Toshiaki Makita X-Patchwork-Id: 1146284 X-Patchwork-Delegate: bpf@iogearbox.net
From: Toshiaki Makita To: Alexei Starovoitov , Daniel Borkmann , Martin KaFai Lau , Song Liu , Yonghong Song , "David S. Miller" , Jakub Kicinski , Jesper Dangaard Brouer , John Fastabend , Jamal Hadi Salim , Cong Wang , Jiri Pirko Cc: Toshiaki Makita , netdev@vger.kernel.org, bpf@vger.kernel.org, William Tu Subject: [RFC PATCH bpf-next 03/14] bpf: Add API to get program from id Date: Tue, 13 Aug 2019 21:05:47 +0900 Message-Id: <20190813120558.6151-4-toshiaki.makita1@gmail.com> In-Reply-To: <20190813120558.6151-1-toshiaki.makita1@gmail.com> References: <20190813120558.6151-1-toshiaki.makita1@gmail.com> Factor out the logic in bpf_prog_get_fd_by_id() and add bpf_prog_get_by_id(). Also export bpf_prog_get_ok(). They are used by the next commit to get bpf prog from its id.
Signed-off-by: Toshiaki Makita --- include/linux/bpf.h | 6 ++++++ kernel/bpf/syscall.c | 26 ++++++++++++++++++-------- 2 files changed, 24 insertions(+), 8 deletions(-) diff --git a/include/linux/bpf.h b/include/linux/bpf.h index f9a5061..d8ad865 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -633,6 +633,7 @@ int bpf_prog_array_copy(struct bpf_prog_array *old_array, struct bpf_prog *bpf_prog_get(u32 ufd); struct bpf_prog *bpf_prog_get_type_dev(u32 ufd, enum bpf_prog_type type, bool attach_drv); +struct bpf_prog *bpf_prog_get_by_id(u32 id); struct bpf_prog * __must_check bpf_prog_add(struct bpf_prog *prog, int i); void bpf_prog_sub(struct bpf_prog *prog, int i); struct bpf_prog * __must_check bpf_prog_inc(struct bpf_prog *prog); @@ -755,6 +756,11 @@ static inline struct bpf_prog *bpf_prog_get_type_dev(u32 ufd, return ERR_PTR(-EOPNOTSUPP); } +static inline struct bpf_prog *bpf_prog_get_by_id(u32 id) +{ + return ERR_PTR(-EOPNOTSUPP); +} + static inline struct bpf_prog * __must_check bpf_prog_add(struct bpf_prog *prog, int i) { diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index 5d141f1..cb5ecc4 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -1495,6 +1495,7 @@ bool bpf_prog_get_ok(struct bpf_prog *prog, return true; } +EXPORT_SYMBOL_GPL(bpf_prog_get_ok); static struct bpf_prog *__bpf_prog_get(u32 ufd, enum bpf_prog_type *attach_type, bool attach_drv) @@ -2122,6 +2123,22 @@ static int bpf_obj_get_next_id(const union bpf_attr *attr, return err; } +struct bpf_prog *bpf_prog_get_by_id(u32 id) +{ + struct bpf_prog *prog; + + spin_lock_bh(&prog_idr_lock); + prog = idr_find(&prog_idr, id); + if (prog) + prog = bpf_prog_inc_not_zero(prog); + else + prog = ERR_PTR(-ENOENT); + spin_unlock_bh(&prog_idr_lock); + + return prog; +} +EXPORT_SYMBOL_GPL(bpf_prog_get_by_id); + #define BPF_PROG_GET_FD_BY_ID_LAST_FIELD prog_id static int bpf_prog_get_fd_by_id(const union bpf_attr *attr) @@ -2136,14 +2153,7 @@ static int 
bpf_prog_get_fd_by_id(const union bpf_attr *attr) if (!capable(CAP_SYS_ADMIN)) return -EPERM; - spin_lock_bh(&prog_idr_lock); - prog = idr_find(&prog_idr, id); - if (prog) - prog = bpf_prog_inc_not_zero(prog); - else - prog = ERR_PTR(-ENOENT); - spin_unlock_bh(&prog_idr_lock); - + prog = bpf_prog_get_by_id(id); if (IS_ERR(prog)) return PTR_ERR(prog); From patchwork Tue Aug 13 12:05:48 2019 X-Patchwork-Submitter: Toshiaki Makita X-Patchwork-Id: 1146286 X-Patchwork-Delegate: bpf@iogearbox.net
From: Toshiaki Makita To: Alexei Starovoitov , Daniel Borkmann , Martin KaFai Lau , Song Liu , Yonghong Song , "David S.
Miller" , Jakub Kicinski , Jesper Dangaard Brouer , John Fastabend , Jamal Hadi Salim , Cong Wang , Jiri Pirko Cc: Toshiaki Makita , netdev@vger.kernel.org, bpf@vger.kernel.org, William Tu Subject: [RFC PATCH bpf-next 04/14] xdp_flow: Attach bpf prog to XDP in kernel after UMH loaded program Date: Tue, 13 Aug 2019 21:05:48 +0900 Message-Id: <20190813120558.6151-5-toshiaki.makita1@gmail.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190813120558.6151-1-toshiaki.makita1@gmail.com> References: <20190813120558.6151-1-toshiaki.makita1@gmail.com> MIME-Version: 1.0 Sender: bpf-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org As UMH runs under RTNL, it cannot attach XDP from userspace. Thus the kernel, xdp_flow module, installs the XDP program. NOTE: As an RFC, XDP-related logic is emulating dev_change_xdp_fd(). I'm thinking I should factor out the logic from dev_change_xdp_fd() and export it instead. Signed-off-by: Toshiaki Makita --- include/linux/netdevice.h | 4 +++ net/core/dev.c | 11 ++++--- net/xdp_flow/xdp_flow_kern_mod.c | 63 ++++++++++++++++++++++++++++++++++++---- 3 files changed, 69 insertions(+), 9 deletions(-) diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index 8829295..c99e022 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -3678,6 +3678,10 @@ struct sk_buff *dev_hard_start_xmit(struct sk_buff *skb, struct net_device *dev, struct netdev_queue *txq, int *ret); typedef int (*bpf_op_t)(struct net_device *dev, struct netdev_bpf *bpf); +int dev_xdp_install(struct net_device *dev, bpf_op_t bpf_op, + struct netlink_ext_ack *extack, u32 flags, + struct bpf_prog *prog); +int generic_xdp_install(struct net_device *dev, struct netdev_bpf *xdp); int dev_change_xdp_fd(struct net_device *dev, struct netlink_ext_ack *extack, int fd, u32 flags); u32 __dev_xdp_query(struct net_device *dev, bpf_op_t xdp_op, diff --git a/net/core/dev.c b/net/core/dev.c index fc676b2..a45d2e4 100644 --- 
a/net/core/dev.c +++ b/net/core/dev.c @@ -5145,7 +5145,7 @@ static void __netif_receive_skb_list(struct list_head *head) memalloc_noreclaim_restore(noreclaim_flag); } -static int generic_xdp_install(struct net_device *dev, struct netdev_bpf *xdp) +int generic_xdp_install(struct net_device *dev, struct netdev_bpf *xdp) { struct bpf_prog *old = rtnl_dereference(dev->xdp_prog); struct bpf_prog *new = xdp->prog; @@ -5177,6 +5177,7 @@ static int generic_xdp_install(struct net_device *dev, struct netdev_bpf *xdp) return ret; } +EXPORT_SYMBOL_GPL(generic_xdp_install); static int netif_receive_skb_internal(struct sk_buff *skb) { @@ -8001,10 +8002,11 @@ u32 __dev_xdp_query(struct net_device *dev, bpf_op_t bpf_op, return xdp.prog_id; } +EXPORT_SYMBOL_GPL(__dev_xdp_query); -static int dev_xdp_install(struct net_device *dev, bpf_op_t bpf_op, - struct netlink_ext_ack *extack, u32 flags, - struct bpf_prog *prog) +int dev_xdp_install(struct net_device *dev, bpf_op_t bpf_op, + struct netlink_ext_ack *extack, u32 flags, + struct bpf_prog *prog) { struct netdev_bpf xdp; @@ -8019,6 +8021,7 @@ static int dev_xdp_install(struct net_device *dev, bpf_op_t bpf_op, return bpf_op(dev, &xdp); } +EXPORT_SYMBOL_GPL(dev_xdp_install); static void dev_xdp_uninstall(struct net_device *dev) { diff --git a/net/xdp_flow/xdp_flow_kern_mod.c b/net/xdp_flow/xdp_flow_kern_mod.c index 823ab65..9cf527d 100644 --- a/net/xdp_flow/xdp_flow_kern_mod.c +++ b/net/xdp_flow/xdp_flow_kern_mod.c @@ -116,10 +116,26 @@ static int xdp_flow_setup_block_cb(enum tc_setup_type type, void *type_data, static int xdp_flow_setup_bind(struct net_device *dev, struct netlink_ext_ack *extack) { + enum bpf_prog_type attach_type = BPF_PROG_TYPE_XDP; struct mbox_request *req; + bpf_op_t bpf_op, bpf_chk; + struct bpf_prog *prog; u32 id = 0; int err; + bpf_op = bpf_chk = dev->netdev_ops->ndo_bpf; + if (!bpf_op) + bpf_op = generic_xdp_install; + else + bpf_chk = generic_xdp_install; + + /* TODO: These checks should be unified with net 
core */ + if (__dev_xdp_query(dev, bpf_chk, XDP_QUERY_PROG)) + return -EEXIST; + + if (__dev_xdp_query(dev, bpf_op, XDP_QUERY_PROG)) + return -EBUSY; + req = kzalloc(sizeof(*req), GFP_KERNEL); if (!req) return -ENOMEM; @@ -129,21 +145,56 @@ static int xdp_flow_setup_bind(struct net_device *dev, /* Load bpf in UMH and get prog id */ err = transact_umh(req, &id); + if (err) + goto out; + + prog = bpf_prog_get_by_id(id); + if (IS_ERR(prog)) { + err = PTR_ERR(prog); + goto err_umh; + } + + if (!bpf_prog_get_ok(prog, &attach_type, false)) { + err = -EINVAL; + goto err_prog; + } - /* TODO: id will be used to attach bpf prog to XDP - * As we have rtnl_lock, UMH cannot attach prog to XDP - */ + /* As we have rtnl_lock, install XDP in kernel */ + err = dev_xdp_install(dev, bpf_op, extack, 0, prog); + if (err) + goto err_prog; + /* TODO: Should get prog once more and save it for later check */ +out: kfree(req); return err; +err_prog: + bpf_prog_put(prog); +err_umh: + req->cmd = XDP_FLOW_CMD_UNLOAD; + transact_umh(req, NULL); + + goto out; } static int xdp_flow_setup_unbind(struct net_device *dev, struct netlink_ext_ack *extack) { struct mbox_request *req; - int err; + int err, ret = 0; + bpf_op_t bpf_op; + + bpf_op = dev->netdev_ops->ndo_bpf; + if (!bpf_op) + bpf_op = generic_xdp_install; + + /* TODO: Should check if prog is not changed */ + err = dev_xdp_install(dev, bpf_op, extack, 0, NULL); + if (err) { + pr_warn("Failed to uninstall XDP prog: %d\n", err); + ret = err; + } req = kzalloc(sizeof(*req), GFP_KERNEL); if (!req) @@ -153,10 +204,12 @@ static int xdp_flow_setup_unbind(struct net_device *dev, req->ifindex = dev->ifindex; err = transact_umh(req, NULL); + if (err) + ret = err; kfree(req); - return err; + return ret; } static int xdp_flow_setup(struct net_device *dev, bool do_bind, From patchwork Tue Aug 13 12:05:49 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Toshiaki Makita X-Patchwork-Id: 
1146288 X-Patchwork-Delegate: bpf@iogearbox.net
From: Toshiaki Makita To: Alexei Starovoitov , Daniel Borkmann , Martin KaFai Lau , Song Liu , Yonghong Song , "David S. Miller" , Jakub Kicinski , Jesper Dangaard Brouer , John Fastabend , Jamal Hadi Salim , Cong Wang , Jiri Pirko Cc: Toshiaki Makita , netdev@vger.kernel.org, bpf@vger.kernel.org, William Tu Subject: [RFC PATCH bpf-next 05/14] xdp_flow: Prepare flow tables in bpf Date: Tue, 13 Aug 2019 21:05:49 +0900 Message-Id: <20190813120558.6151-6-toshiaki.makita1@gmail.com> In-Reply-To: <20190813120558.6151-1-toshiaki.makita1@gmail.com> References: <20190813120558.6151-1-toshiaki.makita1@gmail.com> Add maps for flow tables in bpf. TC flower has hash tables for each flow mask ordered by priority. To do the same thing, prepare hashmap-in-arraymap. As bpf does not provide an ordered list, we emulate one with an array.
Each array entry has one-byte next index field to implement a list. Also prepare a one-element array to point to the head index of the list. Because of the limitation of bpf maps, the outer array is implemented using two array maps. "flow_masks" is the array to emulate the list and its entries have the priority and mask of each flow table. For each priority/mask, the same index entry of another map "flow_tables", which is the hashmap-in-arraymap, points to the actual flow table. The flow insertion logic in UMH and lookup logic in BPF will be implemented in the following commits. NOTE: This list emulation by array may be able to be realized by adding ordered-list type map. In that case we also need map iteration API for bpf progs. Signed-off-by: Toshiaki Makita --- net/xdp_flow/umh_bpf.h | 18 +++++++++++ net/xdp_flow/xdp_flow_kern_bpf.c | 22 +++++++++++++ net/xdp_flow/xdp_flow_umh.c | 70 ++++++++++++++++++++++++++++++++++++++-- 3 files changed, 108 insertions(+), 2 deletions(-) create mode 100644 net/xdp_flow/umh_bpf.h diff --git a/net/xdp_flow/umh_bpf.h b/net/xdp_flow/umh_bpf.h new file mode 100644 index 0000000..b4fe0c6 --- /dev/null +++ b/net/xdp_flow/umh_bpf.h @@ -0,0 +1,18 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _NET_XDP_FLOW_UMH_BPF_H +#define _NET_XDP_FLOW_UMH_BPF_H + +#include "msgfmt.h" + +#define MAX_FLOWS 1024 +#define MAX_FLOW_MASKS 255 +#define FLOW_MASKS_TAIL 255 + +struct xdp_flow_mask_entry { + struct xdp_flow_key mask; + __u16 priority; + short count; + int next; +}; + +#endif diff --git a/net/xdp_flow/xdp_flow_kern_bpf.c b/net/xdp_flow/xdp_flow_kern_bpf.c index 74cdb1d..c101156 100644 --- a/net/xdp_flow/xdp_flow_kern_bpf.c +++ b/net/xdp_flow/xdp_flow_kern_bpf.c @@ -2,6 +2,28 @@ #define KBUILD_MODNAME "foo" #include #include +#include "umh_bpf.h" + +struct bpf_map_def SEC("maps") flow_masks_head = { + .type = BPF_MAP_TYPE_ARRAY, + .key_size = sizeof(u32), + .value_size = sizeof(int), + .max_entries = 1, +}; + +struct bpf_map_def 
SEC("maps") flow_masks = { + .type = BPF_MAP_TYPE_ARRAY, + .key_size = sizeof(u32), + .value_size = sizeof(struct xdp_flow_mask_entry), + .max_entries = MAX_FLOW_MASKS, +}; + +struct bpf_map_def SEC("maps") flow_tables = { + .type = BPF_MAP_TYPE_ARRAY_OF_MAPS, + .key_size = sizeof(u32), + .value_size = sizeof(u32), + .max_entries = MAX_FLOW_MASKS, +}; SEC("xdp_flow") int xdp_flow_prog(struct xdp_md *ctx) diff --git a/net/xdp_flow/xdp_flow_umh.c b/net/xdp_flow/xdp_flow_umh.c index 734db00..e35666a 100644 --- a/net/xdp_flow/xdp_flow_umh.c +++ b/net/xdp_flow/xdp_flow_umh.c @@ -13,7 +13,7 @@ #include #include #include -#include "msgfmt.h" +#include "umh_bpf.h" extern char xdp_flow_bpf_start; extern char xdp_flow_bpf_end; @@ -95,11 +95,13 @@ static int setup(void) static int load_bpf(int ifindex, struct bpf_object **objp) { + int prog_fd, flow_tables_fd, flow_meta_fd, flow_masks_head_fd, err; + struct bpf_map *flow_tables, *flow_masks_head; + int zero = 0, flow_masks_tail = FLOW_MASKS_TAIL; struct bpf_object_open_attr attr = {}; char path[256], errbuf[ERRBUF_SIZE]; struct bpf_program *prog; struct bpf_object *obj; - int prog_fd, err; ssize_t len; len = snprintf(path, 256, "/proc/self/fd/%d", progfile_fd); @@ -127,6 +129,48 @@ static int load_bpf(int ifindex, struct bpf_object **objp) bpf_object__for_each_program(prog, obj) bpf_program__set_type(prog, attr.prog_type); + flow_meta_fd = bpf_create_map(BPF_MAP_TYPE_HASH, + sizeof(struct xdp_flow_key), + sizeof(struct xdp_flow_actions), + MAX_FLOWS, 0); + if (flow_meta_fd < 0) { + err = -errno; + pr_err("map creation for flow_tables meta failed: %s\n", + strerror(errno)); + goto err; + } + + flow_tables_fd = bpf_create_map_in_map(BPF_MAP_TYPE_ARRAY_OF_MAPS, + "flow_tables", sizeof(__u32), + flow_meta_fd, MAX_FLOW_MASKS, 0); + if (flow_tables_fd < 0) { + err = -errno; + pr_err("map creation for flow_tables failed: %s\n", + strerror(errno)); + close(flow_meta_fd); + goto err; + } + + close(flow_meta_fd); + + flow_tables = 
bpf_object__find_map_by_name(obj, "flow_tables"); + if (!flow_tables) { + pr_err("Cannot find flow_tables\n"); + err = -ENOENT; + close(flow_tables_fd); + goto err; + } + + err = bpf_map__reuse_fd(flow_tables, flow_tables_fd); + if (err) { + err = libbpf_err(err, errbuf); + pr_err("Failed to reuse flow_tables fd: %s\n", errbuf); + close(flow_tables_fd); + goto err; + } + + close(flow_tables_fd); + err = bpf_object__load(obj); if (err) { err = libbpf_err(err, errbuf); @@ -134,6 +178,28 @@ static int load_bpf(int ifindex, struct bpf_object **objp) goto err; } + flow_masks_head = bpf_object__find_map_by_name(obj, "flow_masks_head"); + if (!flow_masks_head) { + pr_err("Cannot find flow_masks_head map\n"); + err = -ENOENT; + goto err; + } + + flow_masks_head_fd = bpf_map__fd(flow_masks_head); + if (flow_masks_head_fd < 0) { + err = libbpf_err(flow_masks_head_fd, errbuf); + pr_err("Invalid flow_masks_head fd: %s\n", errbuf); + goto err; + } + + if (bpf_map_update_elem(flow_masks_head_fd, &zero, &flow_masks_tail, + 0)) { + err = -errno; + pr_err("Failed to initialize flow_masks_head: %s\n", + strerror(errno)); + goto err; + } + prog = bpf_object__find_program_by_title(obj, "xdp_flow"); if (!prog) { pr_err("Cannot find xdp_flow program\n"); From patchwork Tue Aug 13 12:05:50 2019 X-Patchwork-Submitter: Toshiaki Makita X-Patchwork-Id: 1146290 X-Patchwork-Delegate: bpf@iogearbox.net
From: Toshiaki Makita To: Alexei Starovoitov , Daniel Borkmann , Martin KaFai Lau , Song Liu , Yonghong Song , "David S. Miller" , Jakub Kicinski , Jesper Dangaard Brouer , John Fastabend , Jamal Hadi Salim , Cong Wang , Jiri Pirko Cc: Toshiaki Makita , netdev@vger.kernel.org, bpf@vger.kernel.org, William Tu Subject: [RFC PATCH bpf-next 06/14] xdp_flow: Add flow entry insertion/deletion logic in UMH Date: Tue, 13 Aug 2019 21:05:50 +0900 Message-Id: <20190813120558.6151-7-toshiaki.makita1@gmail.com> In-Reply-To: <20190813120558.6151-1-toshiaki.makita1@gmail.com> References: <20190813120558.6151-1-toshiaki.makita1@gmail.com> This logic will be used when the xdp_flow kmod requests flow insertion/deletion. On insertion, find a free entry and populate it, then update the next index pointer of its previous entry. On deletion, set the next index pointer of the previous entry to the next index of the entry to be deleted.
Signed-off-by: Toshiaki Makita --- net/xdp_flow/umh_bpf.h | 15 ++ net/xdp_flow/xdp_flow_umh.c | 470 +++++++++++++++++++++++++++++++++++++++++++- 2 files changed, 483 insertions(+), 2 deletions(-) diff --git a/net/xdp_flow/umh_bpf.h b/net/xdp_flow/umh_bpf.h index b4fe0c6..4e4633f 100644 --- a/net/xdp_flow/umh_bpf.h +++ b/net/xdp_flow/umh_bpf.h @@ -15,4 +15,19 @@ struct xdp_flow_mask_entry { int next; }; +static inline bool flow_equal(const struct xdp_flow_key *key1, + const struct xdp_flow_key *key2) +{ + long *lkey1 = (long *)key1; + long *lkey2 = (long *)key2; + int i; + + for (i = 0; i < sizeof(*key1); i += sizeof(long)) { + if (*lkey1++ != *lkey2++) + return false; + } + + return true; +} + #endif diff --git a/net/xdp_flow/xdp_flow_umh.c b/net/xdp_flow/xdp_flow_umh.c index e35666a..9a4769b 100644 --- a/net/xdp_flow/xdp_flow_umh.c +++ b/net/xdp_flow/xdp_flow_umh.c @@ -19,6 +19,8 @@ extern char xdp_flow_bpf_end; int progfile_fd; +#define zalloc(size) calloc(1, (size)) + /* FIXME: syslog is used for easy debugging. As writing /dev/log can be stuck * due to reader side, should use another log mechanism like kmsg. 
*/ @@ -38,6 +40,8 @@ struct netdev_info { struct netdev_info_key key; struct hlist_node node; struct bpf_object *obj; + int free_slot_top; + int free_slots[MAX_FLOW_MASKS]; }; DEFINE_HASHTABLE(netdev_info_table, 16); @@ -268,6 +272,57 @@ static struct netdev_info *get_netdev_info(const struct mbox_request *req) return netdev_info; } +static void init_flow_masks_free_slot(struct netdev_info *netdev_info) +{ + int i; + + for (i = 0; i < MAX_FLOW_MASKS; i++) + netdev_info->free_slots[MAX_FLOW_MASKS - 1 - i] = i; + netdev_info->free_slot_top = MAX_FLOW_MASKS - 1; +} + +static int get_flow_masks_free_slot(const struct netdev_info *netdev_info) +{ + if (netdev_info->free_slot_top < 0) + return -ENOBUFS; + + return netdev_info->free_slots[netdev_info->free_slot_top]; +} + +static int add_flow_masks_free_slot(struct netdev_info *netdev_info, int slot) +{ + if (unlikely(netdev_info->free_slot_top >= MAX_FLOW_MASKS - 1)) { + pr_warn("BUG: free_slot overflow: top=%d, slot=%d\n", + netdev_info->free_slot_top, slot); + return -EOVERFLOW; + } + + netdev_info->free_slots[++netdev_info->free_slot_top] = slot; + + return 0; +} + +static void delete_flow_masks_free_slot(struct netdev_info *netdev_info, + int slot) +{ + int top_slot; + + if (unlikely(netdev_info->free_slot_top < 0)) { + pr_warn("BUG: free_slot underflow: top=%d, slot=%d\n", + netdev_info->free_slot_top, slot); + return; + } + + top_slot = netdev_info->free_slots[netdev_info->free_slot_top]; + if (unlikely(top_slot != slot)) { + pr_warn("BUG: inconsistent free_slot top: top_slot=%d, slot=%d\n", + top_slot, slot); + return; + } + + netdev_info->free_slot_top--; +} + static int handle_load(const struct mbox_request *req, __u32 *prog_id) { struct netdev_info *netdev_info; @@ -291,6 +346,8 @@ static int handle_load(const struct mbox_request *req, __u32 *prog_id) } netdev_info->key.ifindex = key.ifindex; + init_flow_masks_free_slot(netdev_info); + prog_fd = load_bpf(req->ifindex, &netdev_info->obj); if (prog_fd < 0) { err 
= prog_fd; @@ -331,14 +388,423 @@ static int handle_unload(const struct mbox_request *req) return 0; } +static int get_table_fd(const struct netdev_info *netdev_info, + const char *table_name) +{ + char errbuf[ERRBUF_SIZE]; + struct bpf_map *map; + int map_fd; + int err; + + map = bpf_object__find_map_by_name(netdev_info->obj, table_name); + if (!map) { + pr_err("BUG: %s map not found.\n", table_name); + return -ENOENT; + } + + map_fd = bpf_map__fd(map); + if (map_fd < 0) { + err = libbpf_err(map_fd, errbuf); + pr_err("Invalid map fd: %s\n", errbuf); + return err; + } + + return map_fd; +} + +static int get_flow_masks_head_fd(const struct netdev_info *netdev_info) +{ + return get_table_fd(netdev_info, "flow_masks_head"); +} + +static int get_flow_masks_head(int head_fd, int *head) +{ + int err, zero = 0; + + if (bpf_map_lookup_elem(head_fd, &zero, head)) { + err = -errno; + pr_err("Cannot get flow_masks_head: %s\n", strerror(errno)); + return err; + } + + return 0; +} + +static int update_flow_masks_head(int head_fd, int head) +{ + int err, zero = 0; + + if (bpf_map_update_elem(head_fd, &zero, &head, 0)) { + err = -errno; + pr_err("Cannot update flow_masks_head: %s\n", strerror(errno)); + return err; + } + + return 0; +} + +static int get_flow_masks_fd(const struct netdev_info *netdev_info) +{ + return get_table_fd(netdev_info, "flow_masks"); +} + +static int get_flow_tables_fd(const struct netdev_info *netdev_info) +{ + return get_table_fd(netdev_info, "flow_tables"); +} + +static int __flow_table_insert_elem(int flow_table_fd, + const struct xdp_flow *flow) +{ + int err = 0; + + if (bpf_map_update_elem(flow_table_fd, &flow->key, &flow->actions, 0)) { + err = -errno; + pr_err("Cannot insert flow entry: %s\n", + strerror(errno)); + } + + return err; +} + +static void __flow_table_delete_elem(int flow_table_fd, + const struct xdp_flow *flow) +{ + bpf_map_delete_elem(flow_table_fd, &flow->key); +} + +static int flow_table_insert_elem(struct netdev_info *netdev_info, 
+ const struct xdp_flow *flow) +{ + int masks_fd, head_fd, flow_tables_fd, flow_table_fd, free_slot, head; + struct xdp_flow_mask_entry *entry, *pentry; + int err, cnt, idx, pidx; + + masks_fd = get_flow_masks_fd(netdev_info); + if (masks_fd < 0) + return masks_fd; + + head_fd = get_flow_masks_head_fd(netdev_info); + if (head_fd < 0) + return head_fd; + + err = get_flow_masks_head(head_fd, &head); + if (err) + return err; + + flow_tables_fd = get_flow_tables_fd(netdev_info); + if (flow_tables_fd < 0) + return flow_tables_fd; + + entry = zalloc(sizeof(*entry)); + if (!entry) { + pr_err("Memory allocation for flow_masks entry failed\n"); + return -ENOMEM; + } + + pentry = zalloc(sizeof(*pentry)); + if (!pentry) { + flow_table_fd = -ENOMEM; + pr_err("Memory allocation for flow_masks prev entry failed\n"); + goto err_entry; + } + + idx = head; + for (cnt = 0; cnt < MAX_FLOW_MASKS; cnt++) { + if (idx == FLOW_MASKS_TAIL) + break; + + if (bpf_map_lookup_elem(masks_fd, &idx, entry)) { + err = -errno; + pr_err("Cannot lookup flow_masks: %s\n", + strerror(errno)); + goto err; + } + + if (entry->priority == flow->priority && + flow_equal(&entry->mask, &flow->mask)) { + __u32 id; + + if (bpf_map_lookup_elem(flow_tables_fd, &idx, &id)) { + err = -errno; + pr_err("Cannot lookup flow_tables: %s\n", + strerror(errno)); + goto err; + } + + flow_table_fd = bpf_map_get_fd_by_id(id); + if (flow_table_fd < 0) { + err = -errno; + pr_err("Cannot get flow_table fd by id: %s\n", + strerror(errno)); + goto err; + } + + err = __flow_table_insert_elem(flow_table_fd, flow); + if (err) + goto out; + + entry->count++; + if (bpf_map_update_elem(masks_fd, &idx, entry, 0)) { + err = -errno; + pr_err("Cannot update flow_masks count: %s\n", + strerror(errno)); + __flow_table_delete_elem(flow_table_fd, flow); + goto out; + } + + goto out; + } + + if (entry->priority > flow->priority) + break; + + *pentry = *entry; + pidx = idx; + idx = entry->next; + } + + if (unlikely(cnt == MAX_FLOW_MASKS && idx != 
FLOW_MASKS_TAIL)) { + err = -EINVAL; + pr_err("Cannot lookup flow_masks: Broken flow_masks list\n"); + goto out; + } + + /* Flow mask was not found. Create a new one */ + + free_slot = get_flow_masks_free_slot(netdev_info); + if (free_slot < 0) { + err = free_slot; + goto err; + } + + entry->mask = flow->mask; + entry->priority = flow->priority; + entry->count = 1; + entry->next = idx; + if (bpf_map_update_elem(masks_fd, &free_slot, entry, 0)) { + err = -errno; + pr_err("Cannot update flow_masks: %s\n", strerror(errno)); + goto err; + } + + flow_table_fd = bpf_create_map(BPF_MAP_TYPE_HASH, + sizeof(struct xdp_flow_key), + sizeof(struct xdp_flow_actions), + MAX_FLOWS, 0); + if (flow_table_fd < 0) { + err = -errno; + pr_err("map creation for flow_table failed: %s\n", + strerror(errno)); + goto err; + } + + err = __flow_table_insert_elem(flow_table_fd, flow); + if (err) + goto out; + + if (bpf_map_update_elem(flow_tables_fd, &free_slot, &flow_table_fd, 0)) { + err = -errno; + pr_err("Failed to insert flow_table into flow_tables: %s\n", + strerror(errno)); + goto out; + } + + if (cnt == 0) { + err = update_flow_masks_head(head_fd, free_slot); + if (err) + goto err_flow_table; + } else { + pentry->next = free_slot; + /* This effectively only updates one byte of entry->next */ + if (bpf_map_update_elem(masks_fd, &pidx, pentry, 0)) { + err = -errno; + pr_err("Cannot update flow_masks prev entry: %s\n", + strerror(errno)); + goto err_flow_table; + } + } + delete_flow_masks_free_slot(netdev_info, free_slot); +out: + close(flow_table_fd); +err: + free(pentry); +err_entry: + free(entry); + + return err; + +err_flow_table: + bpf_map_delete_elem(flow_tables_fd, &free_slot); + + goto out; +} + +static int flow_table_delete_elem(struct netdev_info *netdev_info, + const struct xdp_flow *flow) +{ + int masks_fd, head_fd, flow_tables_fd, flow_table_fd, head; + struct xdp_flow_mask_entry *entry, *pentry; + int err, cnt, idx, pidx; + __u32 id; + + masks_fd = 
get_flow_masks_fd(netdev_info); + if (masks_fd < 0) + return masks_fd; + + head_fd = get_flow_masks_head_fd(netdev_info); + if (head_fd < 0) + return head_fd; + + err = get_flow_masks_head(head_fd, &head); + if (err) + return err; + + flow_tables_fd = get_flow_tables_fd(netdev_info); + if (flow_tables_fd < 0) + return flow_tables_fd; + + entry = zalloc(sizeof(*entry)); + if (!entry) { + pr_err("Memory allocation for flow_masks entry failed\n"); + return -ENOMEM; + } + + pentry = zalloc(sizeof(*pentry)); + if (!pentry) { + err = -ENOMEM; + pr_err("Memory allocation for flow_masks prev entry failed\n"); + goto err_pentry; + } + + idx = head; + for (cnt = 0; cnt < MAX_FLOW_MASKS; cnt++) { + if (idx == FLOW_MASKS_TAIL) { + err = -ENOENT; + pr_err("Cannot lookup flow_masks: %s\n", + strerror(-err)); + goto out; + } + + if (bpf_map_lookup_elem(masks_fd, &idx, entry)) { + err = -errno; + pr_err("Cannot lookup flow_masks: %s\n", + strerror(errno)); + goto out; + } + + if (entry->priority > flow->priority) { + err = -ENOENT; + pr_err("Cannot lookup flow_masks: %s\n", + strerror(-err)); + goto out; + } + + if (entry->priority == flow->priority && + flow_equal(&entry->mask, &flow->mask)) + break; + + *pentry = *entry; + pidx = idx; + idx = entry->next; + } + + if (unlikely(cnt == MAX_FLOW_MASKS)) { + err = -ENOENT; + pr_err("Cannot lookup flow_masks: Broken flow_masks list\n"); + goto out; + } + + if (bpf_map_lookup_elem(flow_tables_fd, &idx, &id)) { + err = -errno; + pr_err("Cannot lookup flow_tables: %s\n", + strerror(errno)); + goto out; + } + + flow_table_fd = bpf_map_get_fd_by_id(id); + if (flow_table_fd < 0) { + err = -errno; + pr_err("Cannot get flow_table fd by id: %s\n", + strerror(errno)); + goto out; + } + + __flow_table_delete_elem(flow_table_fd, flow); + close(flow_table_fd); + + if (--entry->count > 0) { + if (bpf_map_update_elem(masks_fd, &idx, entry, 0)) { + err = -errno; + pr_err("Cannot update flow_masks count: %s\n", + strerror(errno)); + } + + goto out; + 
} + + if (unlikely(entry->count < 0)) { + pr_warn("flow_masks has negative count: %d\n", + entry->count); + } + + if (cnt == 0) { + err = update_flow_masks_head(head_fd, entry->next); + if (err) + goto out; + } else { + pentry->next = entry->next; + /* This effectively only updates one byte of entry->next */ + if (bpf_map_update_elem(masks_fd, &pidx, pentry, 0)) { + err = -errno; + pr_err("Cannot update flow_masks prev entry: %s\n", + strerror(errno)); + goto out; + } + } + + bpf_map_delete_elem(flow_tables_fd, &idx); + err = add_flow_masks_free_slot(netdev_info, idx); + if (err) + pr_err("Cannot add flow_masks free slot: %s\n", strerror(-err)); +out: + free(pentry); +err_pentry: + free(entry); + + return err; +} + static int handle_replace(struct mbox_request *req) { - return -EOPNOTSUPP; + struct netdev_info *netdev_info; + int err; + + netdev_info = get_netdev_info(req); + if (IS_ERR(netdev_info)) + return PTR_ERR(netdev_info); + + err = flow_table_insert_elem(netdev_info, &req->flow); + if (err) + return err; + + return 0; } static int handle_delete(const struct mbox_request *req) { - return -EOPNOTSUPP; + struct netdev_info *netdev_info; + int err; + + netdev_info = get_netdev_info(req); + if (IS_ERR(netdev_info)) + return PTR_ERR(netdev_info); + + err = flow_table_delete_elem(netdev_info, &req->flow); + if (err) + return err; + + return 0; } static void loop(void) From patchwork Tue Aug 13 12:05:51 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Toshiaki Makita X-Patchwork-Id: 1146292 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Original-To: incoming-bpf@patchwork.ozlabs.org Delivered-To: patchwork-incoming-bpf@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=bpf-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=none dis=none) 
From: Toshiaki Makita
To: Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu, Yonghong Song, "David S. Miller", Jakub Kicinski, Jesper Dangaard Brouer, John Fastabend, Jamal Hadi Salim, Cong Wang, Jiri Pirko
Cc: Toshiaki Makita, netdev@vger.kernel.org, bpf@vger.kernel.org, William Tu
Subject: [RFC PATCH bpf-next 07/14] xdp_flow: Add flow handling and basic actions in bpf prog
Date: Tue, 13 Aug 2019 21:05:51 +0900
Message-Id: <20190813120558.6151-8-toshiaki.makita1@gmail.com>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20190813120558.6151-1-toshiaki.makita1@gmail.com>
References: <20190813120558.6151-1-toshiaki.makita1@gmail.com>

The BPF prog for XDP parses the packet and extracts the flow key, then looks up an entry in the flow tables. Only "accept" and "drop" actions are implemented at this point.
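The masked-lookup idea can be sketched independently of XDP. This is a toy model under assumed names (`struct toy_key`, `toy_mask`, `toy_match`); the real `xdp_flow_key` is larger and holds network-byte-order fields, but the word-at-a-time masking is the same trick the patch's `flow_mask()` (and the UMH's `flow_equal()`) rely on, which is why the key size must be a multiple of `sizeof(long)`:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy 16-byte flow key standing in for struct xdp_flow_key. It has no
 * implicit padding and its size is a multiple of sizeof(long), so it
 * can be masked and compared one machine word at a time. */
struct toy_key {
	_Alignas(long) uint32_t src, dst;	/* addresses (host order here) */
	uint16_t sport, dport;			/* L4 ports */
	uint8_t proto, pad[3];			/* IP protocol + explicit padding */
};

/* Same idea as flow_mask() in this patch: mkey = key & mask,
 * long by long instead of byte by byte. */
static void toy_mask(struct toy_key *mkey, const struct toy_key *key,
		     const struct toy_key *mask)
{
	const long *lkey = (const long *)key;
	const long *lmask = (const long *)mask;
	long *lmkey = (long *)mkey;
	size_t i;

	for (i = 0; i < sizeof(*mkey); i += sizeof(long))
		*lmkey++ = *lkey++ & *lmask++;
}

/* A flow entry matches when masking the packet key with the entry's
 * mask reproduces the entry's (already masked) key. In the BPF prog
 * this comparison is the hash-map lookup with the masked key. */
static int toy_match(const struct toy_key *pkt, const struct toy_key *mask,
		     const struct toy_key *entry_key)
{
	struct toy_key mkey;

	toy_mask(&mkey, pkt, mask);
	return memcmp(&mkey, entry_key, sizeof(mkey)) == 0;
}
```

In the patch the loop over masks does exactly this per mask entry, stopping at the first flow table whose lookup with the masked key succeeds, so lower-priority (earlier) masks win.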
Signed-off-by: Toshiaki Makita --- net/xdp_flow/xdp_flow_kern_bpf.c | 297 ++++++++++++++++++++++++++++++++++++++- 1 file changed, 296 insertions(+), 1 deletion(-) diff --git a/net/xdp_flow/xdp_flow_kern_bpf.c b/net/xdp_flow/xdp_flow_kern_bpf.c index c101156..ceb8a92 100644 --- a/net/xdp_flow/xdp_flow_kern_bpf.c +++ b/net/xdp_flow/xdp_flow_kern_bpf.c @@ -1,9 +1,27 @@ // SPDX-License-Identifier: GPL-2.0 #define KBUILD_MODNAME "foo" #include +#include +#include +#include +#include +#include +#include +#include +#include #include #include "umh_bpf.h" +/* Used when the action only modifies the packet */ +#define _XDP_CONTINUE -1 + +struct bpf_map_def SEC("maps") debug_stats = { + .type = BPF_MAP_TYPE_PERCPU_ARRAY, + .key_size = sizeof(u32), + .value_size = sizeof(long), + .max_entries = 256, +}; + struct bpf_map_def SEC("maps") flow_masks_head = { .type = BPF_MAP_TYPE_ARRAY, .key_size = sizeof(u32), @@ -25,10 +43,287 @@ struct bpf_map_def SEC("maps") flow_tables = { .max_entries = MAX_FLOW_MASKS, }; +static inline void account_debug(int idx) +{ + long *cnt; + + cnt = bpf_map_lookup_elem(&debug_stats, &idx); + if (cnt) + *cnt += 1; +} + +static inline void account_action(int act) +{ + account_debug(act + 1); +} + +static inline int action_accept(void) +{ + account_action(XDP_FLOW_ACTION_ACCEPT); + return XDP_PASS; +} + +static inline int action_drop(void) +{ + account_action(XDP_FLOW_ACTION_DROP); + return XDP_DROP; +} + +static inline int action_redirect(struct xdp_flow_action *action) +{ + account_action(XDP_FLOW_ACTION_REDIRECT); + + // TODO: implement this + return XDP_ABORTED; +} + +static inline int action_vlan_push(struct xdp_md *ctx, + struct xdp_flow_action *action) +{ + account_action(XDP_FLOW_ACTION_VLAN_PUSH); + + // TODO: implement this + return XDP_ABORTED; +} + +static inline int action_vlan_pop(struct xdp_md *ctx, + struct xdp_flow_action *action) +{ + account_action(XDP_FLOW_ACTION_VLAN_POP); + + // TODO: implement this + return XDP_ABORTED; +} + +static 
inline int action_vlan_mangle(struct xdp_md *ctx, + struct xdp_flow_action *action) +{ + account_action(XDP_FLOW_ACTION_VLAN_MANGLE); + + // TODO: implement this + return XDP_ABORTED; +} + +static inline int action_mangle(struct xdp_md *ctx, + struct xdp_flow_action *action) +{ + account_action(XDP_FLOW_ACTION_MANGLE); + + // TODO: implement this + return XDP_ABORTED; +} + +static inline int action_csum(struct xdp_md *ctx, + struct xdp_flow_action *action) +{ + account_action(XDP_FLOW_ACTION_CSUM); + + // TODO: implement this + return XDP_ABORTED; +} + +static inline void __ether_addr_copy(u8 *dst, const u8 *src) +{ + u16 *a = (u16 *)dst; + const u16 *b = (const u16 *)src; + + a[0] = b[0]; + a[1] = b[1]; + a[2] = b[2]; +} + +static inline int parse_ipv4(void *data, u64 *nh_off, void *data_end, + struct xdp_flow_key *key) +{ + struct iphdr *iph = data + *nh_off; + + if (iph + 1 > data_end) + return -1; + + key->ipv4.src = iph->saddr; + key->ipv4.dst = iph->daddr; + key->ip.ttl = iph->ttl; + key->ip.tos = iph->tos; + *nh_off += iph->ihl * 4; + + return iph->protocol; +} + +static inline int parse_ipv6(void *data, u64 *nh_off, void *data_end, + struct xdp_flow_key *key) +{ + struct ipv6hdr *ip6h = data + *nh_off; + + if (ip6h + 1 > data_end) + return -1; + + key->ipv6.src = ip6h->saddr; + key->ipv6.dst = ip6h->daddr; + key->ip.ttl = ip6h->hop_limit; + key->ip.tos = ipv6_get_dsfield(ip6h); + *nh_off += sizeof(*ip6h); + + if (ip6h->nexthdr == NEXTHDR_HOP || + ip6h->nexthdr == NEXTHDR_ROUTING || + ip6h->nexthdr == NEXTHDR_FRAGMENT || + ip6h->nexthdr == NEXTHDR_AUTH || + ip6h->nexthdr == NEXTHDR_NONE || + ip6h->nexthdr == NEXTHDR_DEST) + return 0; + + return ip6h->nexthdr; +} + +#define for_each_flow_mask(entry, head, idx, cnt) \ + for (entry = bpf_map_lookup_elem(&flow_masks, (head)), \ + idx = *(head), cnt = 0; \ + entry != NULL && cnt < MAX_FLOW_MASKS; \ + idx = entry->next, \ + entry = bpf_map_lookup_elem(&flow_masks, &idx), cnt++) + +static inline void 
flow_mask(struct xdp_flow_key *mkey, + const struct xdp_flow_key *key, + const struct xdp_flow_key *mask) +{ + long *lmkey = (long *)mkey; + long *lmask = (long *)mask; + long *lkey = (long *)key; + int i; + + for (i = 0; i < sizeof(*mkey); i += sizeof(long)) + *lmkey++ = *lkey++ & *lmask++; +} + SEC("xdp_flow") int xdp_flow_prog(struct xdp_md *ctx) { - return XDP_PASS; + void *data_end = (void *)(long)ctx->data_end; + struct xdp_flow_actions *actions = NULL; + void *data = (void *)(long)ctx->data; + int cnt, idx, action_idx, zero = 0; + struct xdp_flow_mask_entry *entry; + struct ethhdr *eth = data; + struct xdp_flow_key key; + int rc = XDP_DROP; + long *value; + u16 h_proto; + u32 ipproto; + u64 nh_off; + int *head; + + account_debug(0); + + nh_off = sizeof(*eth); + if (data + nh_off > data_end) + return XDP_DROP; + + __builtin_memset(&key, 0, sizeof(key)); + h_proto = eth->h_proto; + __ether_addr_copy(key.eth.dst, eth->h_dest); + __ether_addr_copy(key.eth.src, eth->h_source); + + if (eth_type_vlan(h_proto)) { + struct vlan_hdr *vhdr; + + vhdr = data + nh_off; + nh_off += sizeof(*vhdr); + if (data + nh_off > data_end) + return XDP_DROP; + key.vlan.tpid = h_proto; + key.vlan.tci = vhdr->h_vlan_TCI; + h_proto = vhdr->h_vlan_encapsulated_proto; + } + key.eth.type = h_proto; + + if (h_proto == htons(ETH_P_IP)) + ipproto = parse_ipv4(data, &nh_off, data_end, &key); + else if (h_proto == htons(ETH_P_IPV6)) + ipproto = parse_ipv6(data, &nh_off, data_end, &key); + else + ipproto = 0; + if (ipproto < 0) + return XDP_DROP; + key.ip.proto = ipproto; + + if (ipproto == IPPROTO_TCP) { + struct tcphdr *th = data + nh_off; + + if (th + 1 > data_end) + return XDP_DROP; + + key.l4port.src = th->source; + key.l4port.dst = th->dest; + key.tcp.flags = (*(__be16 *)&tcp_flag_word(th) & htons(0x0FFF)); + } else if (ipproto == IPPROTO_UDP) { + struct udphdr *uh = data + nh_off; + + if (uh + 1 > data_end) + return XDP_DROP; + + key.l4port.src = uh->source; + key.l4port.dst = uh->dest; + 
} + + head = bpf_map_lookup_elem(&flow_masks_head, &zero); + if (!head) + return XDP_PASS; + + for_each_flow_mask(entry, head, idx, cnt) { + struct xdp_flow_key mkey; + void *flow_table; + + flow_table = bpf_map_lookup_elem(&flow_tables, &idx); + if (!flow_table) + return XDP_ABORTED; + + flow_mask(&mkey, &key, &entry->mask); + actions = bpf_map_lookup_elem(flow_table, &mkey); + if (actions) + break; + } + + if (!actions) + return XDP_PASS; + + for (action_idx = 0; + action_idx < actions->num_actions && + action_idx < MAX_XDP_FLOW_ACTIONS; + action_idx++) { + struct xdp_flow_action *action; + int act; + + action = &actions->actions[action_idx]; + + switch (action->id) { + case XDP_FLOW_ACTION_ACCEPT: + return action_accept(); + case XDP_FLOW_ACTION_DROP: + return action_drop(); + case XDP_FLOW_ACTION_REDIRECT: + return action_redirect(action); + case XDP_FLOW_ACTION_VLAN_PUSH: + act = action_vlan_push(ctx, action); + break; + case XDP_FLOW_ACTION_VLAN_POP: + act = action_vlan_pop(ctx, action); + break; + case XDP_FLOW_ACTION_VLAN_MANGLE: + act = action_vlan_mangle(ctx, action); + break; + case XDP_FLOW_ACTION_MANGLE: + act = action_mangle(ctx, action); + break; + case XDP_FLOW_ACTION_CSUM: + act = action_csum(ctx, action); + break; + default: + return XDP_ABORTED; + } + if (act != _XDP_CONTINUE) + return act; + } + + return XDP_ABORTED; } char _license[] SEC("license") = "GPL"; From patchwork Tue Aug 13 12:05:52 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Toshiaki Makita X-Patchwork-Id: 1146294 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Original-To: incoming-bpf@patchwork.ozlabs.org Delivered-To: patchwork-incoming-bpf@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=bpf-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=none 
From: Toshiaki Makita
To: Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu, Yonghong Song, "David S. Miller", Jakub Kicinski, Jesper Dangaard Brouer, John Fastabend, Jamal Hadi Salim, Cong Wang, Jiri Pirko
Cc: Toshiaki Makita, netdev@vger.kernel.org, bpf@vger.kernel.org, William Tu
Subject: [RFC PATCH bpf-next 08/14] xdp_flow: Implement flow replacement/deletion logic in xdp_flow kmod
Date: Tue, 13 Aug 2019 21:05:52 +0900
Message-Id: <20190813120558.6151-9-toshiaki.makita1@gmail.com>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20190813120558.6151-1-toshiaki.makita1@gmail.com>
References: <20190813120558.6151-1-toshiaki.makita1@gmail.com>

As struct flow_rule keeps the flow_dissector and the key/mask containers in discrete storage, we need to serialize them in some way to pass them to the UMH. Convert the flow_rule into the flow key form used in the xdp_flow BPF prog and pass that.
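The serialization step can be illustrated with a reduced model. `struct flat_key`, `match_basic`, `match_ipv4`, and `flatten()` below are hypothetical stand-ins for `xdp_flow_key`, the flow_dissector match containers, and `xdp_flow_parse()`; the point is only that the discrete key/mask containers collapse into one fixed-size, pointer-free key/mask pair that can be copied into a mailbox request and shipped to the UMH as-is:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical flattened key, standing in for struct xdp_flow_key:
 * fixed size, no pointers, so both halves of the key/mask pair can be
 * embedded directly in an mbox_request. */
struct flat_key {
	uint16_t eth_type;
	uint8_t  ip_proto;
	uint8_t  pad;
	uint32_t ipv4_src, ipv4_dst;
};

/* Stand-ins for two flow_dissector match containers: each carries a
 * key and a mask that the classifier hands over piecewise. */
struct match_basic {
	uint16_t n_proto, n_proto_mask;
	uint8_t  ip_proto, ip_proto_mask;
};

struct match_ipv4 {
	uint32_t src, src_mask, dst, dst_mask;
};

/* Serialize the discrete containers into one key/mask pair; the same
 * shape of work xdp_flow_parse() does for the real flow_rule. */
static void flatten(struct flat_key *key, struct flat_key *mask,
		    const struct match_basic *basic,
		    const struct match_ipv4 *v4)
{
	memset(key, 0, sizeof(*key));	/* unmatched fields stay zero */
	memset(mask, 0, sizeof(*mask));

	key->eth_type = basic->n_proto;
	mask->eth_type = basic->n_proto_mask;
	key->ip_proto = basic->ip_proto;
	mask->ip_proto = basic->ip_proto_mask;

	key->ipv4_src = v4->src;
	mask->ipv4_src = v4->src_mask;
	key->ipv4_dst = v4->dst;
	mask->ipv4_dst = v4->dst_mask;
}
```

Zero-initializing both halves is what makes absent dissector keys behave as wildcards: a zero mask field never constrains the lookup, matching how the BPF prog masks the packet key before the hash-table lookup.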
Signed-off-by: Toshiaki Makita --- net/xdp_flow/xdp_flow_kern_mod.c | 334 ++++++++++++++++++++++++++++++++++++++- 1 file changed, 331 insertions(+), 3 deletions(-) diff --git a/net/xdp_flow/xdp_flow_kern_mod.c b/net/xdp_flow/xdp_flow_kern_mod.c index 9cf527d..fe925db 100644 --- a/net/xdp_flow/xdp_flow_kern_mod.c +++ b/net/xdp_flow/xdp_flow_kern_mod.c @@ -3,13 +3,266 @@ #include #include #include +#include #include #include #include "msgfmt.h" +struct xdp_flow_rule { + struct rhash_head ht_node; + unsigned long cookie; + struct xdp_flow_key key; + struct xdp_flow_key mask; +}; + +static const struct rhashtable_params rules_params = { + .key_len = sizeof(unsigned long), + .key_offset = offsetof(struct xdp_flow_rule, cookie), + .head_offset = offsetof(struct xdp_flow_rule, ht_node), + .automatic_shrinking = true, +}; + +static struct rhashtable rules; + extern char xdp_flow_umh_start; extern char xdp_flow_umh_end; +static int xdp_flow_parse_actions(struct xdp_flow_actions *actions, + struct flow_action *flow_action, + struct netlink_ext_ack *extack) +{ + const struct flow_action_entry *act; + int i; + + if (!flow_action_has_entries(flow_action)) + return 0; + + if (flow_action->num_entries > MAX_XDP_FLOW_ACTIONS) + return -ENOBUFS; + + flow_action_for_each(i, act, flow_action) { + struct xdp_flow_action *action = &actions->actions[i]; + + switch (act->id) { + case FLOW_ACTION_ACCEPT: + action->id = XDP_FLOW_ACTION_ACCEPT; + break; + case FLOW_ACTION_DROP: + action->id = XDP_FLOW_ACTION_DROP; + break; + case FLOW_ACTION_REDIRECT: + case FLOW_ACTION_VLAN_PUSH: + case FLOW_ACTION_VLAN_POP: + case FLOW_ACTION_VLAN_MANGLE: + case FLOW_ACTION_MANGLE: + case FLOW_ACTION_CSUM: + /* TODO: implement these */ + /* fall through */ + default: + NL_SET_ERR_MSG_MOD(extack, "Unsupported action"); + return -EOPNOTSUPP; + } + } + actions->num_actions = flow_action->num_entries; + + return 0; +} + +static int xdp_flow_parse_ports(struct xdp_flow_key *key, + struct xdp_flow_key *mask, + 
struct flow_cls_offload *f, u8 ip_proto) +{ + const struct flow_rule *rule = flow_cls_offload_flow_rule(f); + struct flow_match_ports match; + + if (!flow_rule_match_key(rule, FLOW_DISSECTOR_KEY_PORTS)) + return 0; + + if (ip_proto != IPPROTO_TCP && ip_proto != IPPROTO_UDP) { + NL_SET_ERR_MSG_MOD(f->common.extack, + "Only UDP and TCP keys are supported"); + return -EINVAL; + } + + flow_rule_match_ports(rule, &match); + + key->l4port.src = match.key->src; + mask->l4port.src = match.mask->src; + key->l4port.dst = match.key->dst; + mask->l4port.dst = match.mask->dst; + + return 0; +} + +static int xdp_flow_parse_tcp(struct xdp_flow_key *key, + struct xdp_flow_key *mask, + struct flow_cls_offload *f, u8 ip_proto) +{ + const struct flow_rule *rule = flow_cls_offload_flow_rule(f); + struct flow_match_tcp match; + + if (!flow_rule_match_key(rule, FLOW_DISSECTOR_KEY_TCP)) + return 0; + + if (ip_proto != IPPROTO_TCP) { + NL_SET_ERR_MSG_MOD(f->common.extack, + "TCP keys supported only for TCP"); + return -EINVAL; + } + + flow_rule_match_tcp(rule, &match); + + key->tcp.flags = match.key->flags; + mask->tcp.flags = match.mask->flags; + + return 0; +} + +static int xdp_flow_parse_ip(struct xdp_flow_key *key, + struct xdp_flow_key *mask, + struct flow_cls_offload *f, __be16 n_proto) +{ + const struct flow_rule *rule = flow_cls_offload_flow_rule(f); + struct flow_match_ip match; + + if (!flow_rule_match_key(rule, FLOW_DISSECTOR_KEY_IP)) + return 0; + + if (n_proto != htons(ETH_P_IP) && n_proto != htons(ETH_P_IPV6)) { + NL_SET_ERR_MSG_MOD(f->common.extack, + "IP keys supported only for IPv4/6"); + return -EINVAL; + } + + flow_rule_match_ip(rule, &match); + + key->ip.ttl = match.key->ttl; + mask->ip.ttl = match.mask->ttl; + key->ip.tos = match.key->tos; + mask->ip.tos = match.mask->tos; + + return 0; +} + +static int xdp_flow_parse(struct xdp_flow_key *key, struct xdp_flow_key *mask, + struct xdp_flow_actions *actions, + struct flow_cls_offload *f) +{ + struct flow_rule *rule = 
flow_cls_offload_flow_rule(f); + struct flow_dissector *dissector = rule->match.dissector; + __be16 n_proto = 0, n_proto_mask = 0; + u16 addr_type = 0; + u8 ip_proto = 0; + int err; + + if (dissector->used_keys & + ~(BIT(FLOW_DISSECTOR_KEY_CONTROL) | + BIT(FLOW_DISSECTOR_KEY_BASIC) | + BIT(FLOW_DISSECTOR_KEY_ETH_ADDRS) | + BIT(FLOW_DISSECTOR_KEY_IPV4_ADDRS) | + BIT(FLOW_DISSECTOR_KEY_IPV6_ADDRS) | + BIT(FLOW_DISSECTOR_KEY_PORTS) | + BIT(FLOW_DISSECTOR_KEY_TCP) | + BIT(FLOW_DISSECTOR_KEY_IP) | + BIT(FLOW_DISSECTOR_KEY_VLAN))) { + NL_SET_ERR_MSG_MOD(f->common.extack, "Unsupported key"); + return -EOPNOTSUPP; + } + + if (flow_rule_match_key(rule, FLOW_DISSECTOR_KEY_CONTROL)) { + struct flow_match_control match; + + flow_rule_match_control(rule, &match); + addr_type = match.key->addr_type; + } + + if (flow_rule_match_key(rule, FLOW_DISSECTOR_KEY_BASIC)) { + struct flow_match_basic match; + + flow_rule_match_basic(rule, &match); + + n_proto = match.key->n_proto; + n_proto_mask = match.mask->n_proto; + if (n_proto == htons(ETH_P_ALL)) { + n_proto = 0; + n_proto_mask = 0; + } + + key->eth.type = n_proto; + mask->eth.type = n_proto_mask; + + if (match.mask->ip_proto) { + ip_proto = match.key->ip_proto; + key->ip.proto = ip_proto; + mask->ip.proto = match.mask->ip_proto; + } + } + + if (flow_rule_match_key(rule, FLOW_DISSECTOR_KEY_ETH_ADDRS)) { + struct flow_match_eth_addrs match; + + flow_rule_match_eth_addrs(rule, &match); + + ether_addr_copy(key->eth.dst, match.key->dst); + ether_addr_copy(mask->eth.dst, match.mask->dst); + ether_addr_copy(key->eth.src, match.key->src); + ether_addr_copy(mask->eth.src, match.mask->src); + } + + if (flow_rule_match_key(rule, FLOW_DISSECTOR_KEY_VLAN)) { + struct flow_match_vlan match; + + flow_rule_match_vlan(rule, &match); + + key->vlan.tpid = match.key->vlan_tpid; + mask->vlan.tpid = match.mask->vlan_tpid; + key->vlan.tci = htons(match.key->vlan_id | + (match.key->vlan_priority << + VLAN_PRIO_SHIFT)); + mask->vlan.tci = 
htons(match.mask->vlan_id | + (match.mask->vlan_priority << + VLAN_PRIO_SHIFT)); + } + + if (addr_type == FLOW_DISSECTOR_KEY_IPV4_ADDRS) { + struct flow_match_ipv4_addrs match; + + flow_rule_match_ipv4_addrs(rule, &match); + + key->ipv4.src = match.key->src; + mask->ipv4.src = match.mask->src; + key->ipv4.dst = match.key->dst; + mask->ipv4.dst = match.mask->dst; + } + + if (addr_type == FLOW_DISSECTOR_KEY_IPV6_ADDRS) { + struct flow_match_ipv6_addrs match; + + flow_rule_match_ipv6_addrs(rule, &match); + + key->ipv6.src = match.key->src; + mask->ipv6.src = match.mask->src; + key->ipv6.dst = match.key->dst; + mask->ipv6.dst = match.mask->dst; + } + + err = xdp_flow_parse_ports(key, mask, f, ip_proto); + if (err) + return err; + err = xdp_flow_parse_tcp(key, mask, f, ip_proto); + if (err) + return err; + + err = xdp_flow_parse_ip(key, mask, f, n_proto); + if (err) + return err; + + // TODO: encapsulation related tasks + + return xdp_flow_parse_actions(actions, &rule->action, + f->common.extack); +} + static void shutdown_umh(void) { struct task_struct *tsk; @@ -60,12 +313,78 @@ static int transact_umh(struct mbox_request *req, u32 *id) static int xdp_flow_replace(struct net_device *dev, struct flow_cls_offload *f) { - return -EOPNOTSUPP; + struct xdp_flow_rule *rule; + struct mbox_request *req; + int err; + + req = kzalloc(sizeof(*req), GFP_KERNEL); + if (!req) + return -ENOMEM; + + rule = kzalloc(sizeof(*rule), GFP_KERNEL); + if (!rule) { + err = -ENOMEM; + goto out; + } + + req->flow.priority = f->common.prio >> 16; + err = xdp_flow_parse(&req->flow.key, &req->flow.mask, + &req->flow.actions, f); + if (err) + goto err_parse; + + rule->cookie = f->cookie; + rule->key = req->flow.key; + rule->mask = req->flow.mask; + err = rhashtable_insert_fast(&rules, &rule->ht_node, rules_params); + if (err) + goto err_parse; + + req->cmd = XDP_FLOW_CMD_REPLACE; + req->ifindex = dev->ifindex; + err = transact_umh(req, NULL); + if (err) + goto err_umh; +out: + kfree(req); + + return 
err;
+err_umh:
+	rhashtable_remove_fast(&rules, &rule->ht_node, rules_params);
+err_parse:
+	kfree(rule);
+	goto out;
 }

 int xdp_flow_destroy(struct net_device *dev, struct flow_cls_offload *f)
 {
-	return -EOPNOTSUPP;
+	struct mbox_request *req;
+	struct xdp_flow_rule *rule;
+	int err;
+
+	rule = rhashtable_lookup_fast(&rules, &f->cookie, rules_params);
+	if (!rule)
+		return 0;
+
+	req = kzalloc(sizeof(*req), GFP_KERNEL);
+	if (!req)
+		return -ENOMEM;
+
+	req->flow.priority = f->common.prio >> 16;
+	req->flow.key = rule->key;
+	req->flow.mask = rule->mask;
+	req->cmd = XDP_FLOW_CMD_DELETE;
+	req->ifindex = dev->ifindex;
+	err = transact_umh(req, NULL);
+
+	kfree(req);
+
+	if (!err) {
+		rhashtable_remove_fast(&rules, &rule->ht_node, rules_params);
+		kfree(rule);
+	}
+
+	return err;
 }

 static int xdp_flow_setup_flower(struct net_device *dev,
@@ -267,7 +586,11 @@ static int start_umh(void)

 static int __init load_umh(void)
 {
-	int err = 0;
+	int err;
+
+	err = rhashtable_init(&rules, &rules_params);
+	if (err)
+		return err;

 	mutex_lock(&xdp_flow_ops.lock);
 	if (!xdp_flow_ops.stop) {
@@ -283,8 +606,12 @@ static int __init load_umh(void)
 	xdp_flow_ops.setup = &xdp_flow_setup;
 	xdp_flow_ops.start = &start_umh;
 	xdp_flow_ops.module = THIS_MODULE;
+
+	mutex_unlock(&xdp_flow_ops.lock);
+	return 0;
 err:
 	mutex_unlock(&xdp_flow_ops.lock);
+	rhashtable_destroy(&rules);
 	return err;
 }
@@ -297,6 +624,7 @@ static void __exit fini_umh(void)
 	xdp_flow_ops.setup = NULL;
 	xdp_flow_ops.setup_cb = NULL;
 	mutex_unlock(&xdp_flow_ops.lock);
+	rhashtable_destroy(&rules);
 }

 module_init(load_umh);
 module_exit(fini_umh);

From patchwork Tue Aug 13 12:05:53 2019
From: Toshiaki Makita
To: Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu, Yonghong Song, "David S. Miller", Jakub Kicinski, Jesper Dangaard Brouer, John Fastabend, Jamal Hadi Salim, Cong Wang, Jiri Pirko
Cc: Toshiaki Makita, netdev@vger.kernel.org, bpf@vger.kernel.org, William Tu
Subject: [RFC PATCH bpf-next 09/14] xdp_flow: Add netdev feature for enabling TC flower offload to XDP
Date: Tue, 13 Aug 2019 21:05:53 +0900
Message-Id: <20190813120558.6151-10-toshiaki.makita1@gmail.com>
In-Reply-To: <20190813120558.6151-1-toshiaki.makita1@gmail.com>

The usage would be like this:

$ ethtool -K eth0 tc-offload-xdp on
$ tc qdisc add dev eth0 clsact
$ tc filter add dev eth0 ingress protocol ip flower skip_sw ...

The filters offloaded to XDP are then marked as "in_hw". If the tc flow block is created while tc-offload-xdp is enabled on the device, the block is internally marked as xdp and can only be offloaded to XDP.
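As a rough illustration of what toggling this ethtool flag means in code: it flips a feature bit in dev->features, and the offload path merely tests that bit. The sketch below models this in plain userspace C; the struct and bit position are simplified stand-ins, not the real NETIF_F_XDP_TC machinery from netdev_features.h.

```c
#include <stdint.h>

/* Illustrative stand-ins: the real bit is allocated in
 * netdev_features.h; position 61 here is made up. */
#define NETIF_F_XDP_TC_BIT 61
#define NETIF_F_XDP_TC (1ULL << NETIF_F_XDP_TC_BIT)

struct net_device {
	uint64_t features;	/* currently enabled features */
};

/* Mirrors the helper the patch adds to pkt_cls.h: offload to XDP
 * is allowed only while the ethtool feature is switched on. */
static int tc_xdp_offload_enabled(const struct net_device *dev)
{
	return (dev->features & NETIF_F_XDP_TC) != 0;
}
```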
The reason not to allow HW-offload and XDP-offload at the same time is to avoid the situation where offloading to only one of them succeeds. If we allowed offloading to both, users could not tell which offload succeeded.

NOTE: This makes flows offloaded to XDP look as if they are HW-offloaded, since they will be marked as "in_hw". This could be confusing. Maybe we can add another status, "in_xdp"? Then we could allow both HW- and XDP-offload at the same time.

Signed-off-by: Toshiaki Makita
---
 include/linux/netdev_features.h  |  2 ++
 include/net/pkt_cls.h            |  5 +++
 include/net/sch_generic.h        |  1 +
 net/core/dev.c                   |  2 ++
 net/core/ethtool.c               |  1 +
 net/sched/cls_api.c              | 67 +++++++++++++++++++++++++++++++++++++---
 net/xdp_flow/xdp_flow_kern_mod.c |  6 ++++
 7 files changed, 80 insertions(+), 4 deletions(-)

diff --git a/include/linux/netdev_features.h b/include/linux/netdev_features.h
index 4b19c54..ddd201e 100644
--- a/include/linux/netdev_features.h
+++ b/include/linux/netdev_features.h
@@ -80,6 +80,7 @@ enum {
 	NETIF_F_GRO_HW_BIT,		/* Hardware Generic receive offload */
 	NETIF_F_HW_TLS_RECORD_BIT,	/* Offload TLS record */
+	NETIF_F_XDP_TC_BIT,		/* Offload TC to XDP */

 	/*
 	 * Add your fresh new feature above and remember to update
@@ -150,6 +151,7 @@ enum {
 #define NETIF_F_GSO_UDP_L4	__NETIF_F(GSO_UDP_L4)
 #define NETIF_F_HW_TLS_TX	__NETIF_F(HW_TLS_TX)
 #define NETIF_F_HW_TLS_RX	__NETIF_F(HW_TLS_RX)
+#define NETIF_F_XDP_TC		__NETIF_F(XDP_TC)

 /* Finds the next feature with the highest number of the range of start till 0. */

diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
index e429809..d190aae 100644
--- a/include/net/pkt_cls.h
+++ b/include/net/pkt_cls.h
@@ -610,6 +610,11 @@ static inline bool tc_can_offload_extack(const struct net_device *dev,
 	return true;
 }

+static inline bool tc_xdp_offload_enabled(const struct net_device *dev)
+{
+	return dev->features & NETIF_F_XDP_TC;
+}
+
 static inline bool tc_skip_hw(u32 flags)
 {
 	return (flags & TCA_CLS_FLAGS_SKIP_HW) ?
true : false; diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h index 6b6b012..a4d90b5 100644 --- a/include/net/sch_generic.h +++ b/include/net/sch_generic.h @@ -402,6 +402,7 @@ struct tcf_block { struct flow_block flow_block; struct list_head owner_list; bool keep_dst; + bool xdp; unsigned int offloadcnt; /* Number of oddloaded filters */ unsigned int nooffloaddevcnt; /* Number of devs unable to do offload */ struct { diff --git a/net/core/dev.c b/net/core/dev.c index a45d2e4..d1f980d 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -8680,6 +8680,8 @@ int register_netdevice(struct net_device *dev) * software offloads (GSO and GRO). */ dev->hw_features |= NETIF_F_SOFT_FEATURES; + if (IS_ENABLED(CONFIG_XDP_FLOW)) + dev->hw_features |= NETIF_F_XDP_TC; dev->features |= NETIF_F_SOFT_FEATURES; if (dev->netdev_ops->ndo_udp_tunnel_add) { diff --git a/net/core/ethtool.c b/net/core/ethtool.c index 6288e69..c7e61cf 100644 --- a/net/core/ethtool.c +++ b/net/core/ethtool.c @@ -111,6 +111,7 @@ int ethtool_op_get_ts_info(struct net_device *dev, struct ethtool_ts_info *info) [NETIF_F_HW_TLS_RECORD_BIT] = "tls-hw-record", [NETIF_F_HW_TLS_TX_BIT] = "tls-hw-tx-offload", [NETIF_F_HW_TLS_RX_BIT] = "tls-hw-rx-offload", + [NETIF_F_XDP_TC_BIT] = "tc-offload-xdp", }; static const char diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c index 3565d9a..4c89bab 100644 --- a/net/sched/cls_api.c +++ b/net/sched/cls_api.c @@ -37,6 +37,7 @@ #include #include #include +#include extern const struct nla_policy rtm_tca_policy[TCA_MAX + 1]; @@ -806,7 +807,7 @@ static int tcf_block_offload_cmd(struct tcf_block *block, struct net_device *dev, struct tcf_block_ext_info *ei, enum flow_block_command command, - struct netlink_ext_ack *extack) + bool xdp, struct netlink_ext_ack *extack) { struct flow_block_offload bo = {}; int err; @@ -819,13 +820,39 @@ static int tcf_block_offload_cmd(struct tcf_block *block, bo.extack = extack; INIT_LIST_HEAD(&bo.cb_list); - err = 
dev->netdev_ops->ndo_setup_tc(dev, TC_SETUP_BLOCK, &bo); + if (xdp) + err = xdp_flow_setup_block(dev, &bo); + else + err = dev->netdev_ops->ndo_setup_tc(dev, TC_SETUP_BLOCK, &bo); if (err < 0) return err; return tcf_block_setup(block, &bo); } +static int tcf_block_offload_bind_xdp(struct tcf_block *block, struct Qdisc *q, + struct tcf_block_ext_info *ei, + struct netlink_ext_ack *extack) +{ + struct net_device *dev = q->dev_queue->dev; + int err; + + if (!tc_xdp_offload_enabled(dev) && tcf_block_offload_in_use(block)) { + NL_SET_ERR_MSG(extack, + "Bind to offloaded block failed as dev has tc-offload-xdp disabled"); + return -EOPNOTSUPP; + } + + err = tcf_block_offload_cmd(block, dev, ei, FLOW_BLOCK_BIND, true, + extack); + if (err == -EOPNOTSUPP) { + block->nooffloaddevcnt++; + err = 0; + } + + return err; +} + static int tcf_block_offload_bind(struct tcf_block *block, struct Qdisc *q, struct tcf_block_ext_info *ei, struct netlink_ext_ack *extack) @@ -833,6 +860,15 @@ static int tcf_block_offload_bind(struct tcf_block *block, struct Qdisc *q, struct net_device *dev = q->dev_queue->dev; int err; + if (block->xdp) + return tcf_block_offload_bind_xdp(block, q, ei, extack); + + if (tc_xdp_offload_enabled(dev)) { + NL_SET_ERR_MSG(extack, + "Cannot bind to block created with tc-offload-xdp disabled"); + return -EOPNOTSUPP; + } + if (!dev->netdev_ops->ndo_setup_tc) goto no_offload_dev_inc; @@ -844,7 +880,8 @@ static int tcf_block_offload_bind(struct tcf_block *block, struct Qdisc *q, return -EOPNOTSUPP; } - err = tcf_block_offload_cmd(block, dev, ei, FLOW_BLOCK_BIND, extack); + err = tcf_block_offload_cmd(block, dev, ei, FLOW_BLOCK_BIND, false, + extack); if (err == -EOPNOTSUPP) goto no_offload_dev_inc; if (err) @@ -861,17 +898,35 @@ static int tcf_block_offload_bind(struct tcf_block *block, struct Qdisc *q, return 0; } +static void tcf_block_offload_unbind_xdp(struct tcf_block *block, + struct net_device *dev, + struct tcf_block_ext_info *ei) +{ + int err; + + err = 
tcf_block_offload_cmd(block, dev, ei, FLOW_BLOCK_UNBIND, true,
+				    NULL);
+	if (err == -EOPNOTSUPP)
+		WARN_ON(block->nooffloaddevcnt-- == 0);
+}
+
 static void tcf_block_offload_unbind(struct tcf_block *block, struct Qdisc *q,
 				     struct tcf_block_ext_info *ei)
 {
 	struct net_device *dev = q->dev_queue->dev;
 	int err;

+	if (block->xdp) {
+		tcf_block_offload_unbind_xdp(block, dev, ei);
+		return;
+	}
+
 	tc_indr_block_call(block, dev, ei, FLOW_BLOCK_UNBIND, NULL);

 	if (!dev->netdev_ops->ndo_setup_tc)
 		goto no_offload_dev_dec;
-	err = tcf_block_offload_cmd(block, dev, ei, FLOW_BLOCK_UNBIND, NULL);
+	err = tcf_block_offload_cmd(block, dev, ei, FLOW_BLOCK_UNBIND, false,
+				    NULL);
 	if (err == -EOPNOTSUPP)
 		goto no_offload_dev_dec;
 	return;
@@ -1004,6 +1059,10 @@ static struct tcf_block *tcf_block_create(struct net *net, struct Qdisc *q,
 	/* Don't store q pointer for blocks which are shared */
 	if (!tcf_block_shared(block))
 		block->q = q;
+
+	if (tc_xdp_offload_enabled(q->dev_queue->dev))
+		block->xdp = true;
+
 	return block;
 }

diff --git a/net/xdp_flow/xdp_flow_kern_mod.c b/net/xdp_flow/xdp_flow_kern_mod.c
index fe925db..891b18c 100644
--- a/net/xdp_flow/xdp_flow_kern_mod.c
+++ b/net/xdp_flow/xdp_flow_kern_mod.c
@@ -410,6 +410,12 @@ static int xdp_flow_setup_block_cb(enum tc_setup_type type, void *type_data,
 	struct net_device *dev = cb_priv;
 	int err = 0;

+	if (!tc_xdp_offload_enabled(dev)) {
+		NL_SET_ERR_MSG(common->extack,
+			       "tc-offload-xdp is disabled on net device");
+		return -EOPNOTSUPP;
+	}
+
 	if (common->chain_index) {
 		NL_SET_ERR_MSG(common->extack,
 			       "xdp_flow supports only offload of chain 0");

From patchwork Tue Aug 13 12:05:54 2019
From: Toshiaki Makita
To: Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu, Yonghong Song, "David S. Miller", Jakub Kicinski, Jesper Dangaard Brouer, John Fastabend, Jamal Hadi Salim, Cong Wang, Jiri Pirko
Cc: Toshiaki Makita, netdev@vger.kernel.org, bpf@vger.kernel.org, William Tu
Subject: [RFC PATCH bpf-next 10/14] xdp_flow: Implement redirect action
Date: Tue, 13 Aug 2019 21:05:54 +0900
Message-Id: <20190813120558.6151-11-toshiaki.makita1@gmail.com>
In-Reply-To: <20190813120558.6151-1-toshiaki.makita1@gmail.com>

Add a devmap for XDP_REDIRECT and use it for the redirect action.
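One devmap slot is needed per redirect target, and the umh hands slots out with a round-robin search over MAX_PORTS indices, skipping ones still in use (get_new_devmap_idx() further down in this patch). The sketch below models that scheme in plain userspace C, with an array in place of the hash table the umh uses and a shrunk MAX_PORTS; names mirror the patch, but this is not the actual xdp_flow code.

```c
#include <stdbool.h>

#define MAX_PORTS 8	/* the patch uses 65536; shrunk for illustration */

static bool idx_in_use[MAX_PORTS];
static int max_devmap_idx;	/* next candidate index, wraps around */

/* Round-robin allocation of a free devmap index, as in the umh's
 * get_new_devmap_idx(); returns -1 (the umh returns -ENOSPC) when
 * every slot is taken. */
static int get_new_devmap_idx(void)
{
	int offset;

	for (offset = 0; offset < MAX_PORTS; offset++) {
		int idx = max_devmap_idx++;

		if (max_devmap_idx >= MAX_PORTS)
			max_devmap_idx -= MAX_PORTS;

		if (!idx_in_use[idx]) {
			idx_in_use[idx] = true;
			return idx;
		}
	}

	return -1;
}

static void delete_devmap_idx(int idx)
{
	idx_in_use[idx] = false;
}
```

Note the allocator keeps moving forward after a slot is freed rather than reusing it immediately, which delays reuse of recently released indices.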
Signed-off-by: Toshiaki Makita --- net/xdp_flow/umh_bpf.h | 1 + net/xdp_flow/xdp_flow_kern_bpf.c | 14 +++- net/xdp_flow/xdp_flow_kern_mod.c | 3 + net/xdp_flow/xdp_flow_umh.c | 164 +++++++++++++++++++++++++++++++++++++-- 4 files changed, 175 insertions(+), 7 deletions(-) diff --git a/net/xdp_flow/umh_bpf.h b/net/xdp_flow/umh_bpf.h index 4e4633f..a279d0a1 100644 --- a/net/xdp_flow/umh_bpf.h +++ b/net/xdp_flow/umh_bpf.h @@ -4,6 +4,7 @@ #include "msgfmt.h" +#define MAX_PORTS 65536 #define MAX_FLOWS 1024 #define MAX_FLOW_MASKS 255 #define FLOW_MASKS_TAIL 255 diff --git a/net/xdp_flow/xdp_flow_kern_bpf.c b/net/xdp_flow/xdp_flow_kern_bpf.c index ceb8a92..8f3d359 100644 --- a/net/xdp_flow/xdp_flow_kern_bpf.c +++ b/net/xdp_flow/xdp_flow_kern_bpf.c @@ -22,6 +22,13 @@ struct bpf_map_def SEC("maps") debug_stats = { .max_entries = 256, }; +struct bpf_map_def SEC("maps") output_map = { + .type = BPF_MAP_TYPE_DEVMAP, + .key_size = sizeof(int), + .value_size = sizeof(int), + .max_entries = MAX_PORTS, +}; + struct bpf_map_def SEC("maps") flow_masks_head = { .type = BPF_MAP_TYPE_ARRAY, .key_size = sizeof(u32), @@ -71,10 +78,13 @@ static inline int action_drop(void) static inline int action_redirect(struct xdp_flow_action *action) { + int tx_port; + account_action(XDP_FLOW_ACTION_REDIRECT); - // TODO: implement this - return XDP_ABORTED; + tx_port = action->ifindex; + + return bpf_redirect_map(&output_map, tx_port, 0); } static inline int action_vlan_push(struct xdp_md *ctx, diff --git a/net/xdp_flow/xdp_flow_kern_mod.c b/net/xdp_flow/xdp_flow_kern_mod.c index 891b18c..caa4968 100644 --- a/net/xdp_flow/xdp_flow_kern_mod.c +++ b/net/xdp_flow/xdp_flow_kern_mod.c @@ -51,6 +51,9 @@ static int xdp_flow_parse_actions(struct xdp_flow_actions *actions, action->id = XDP_FLOW_ACTION_DROP; break; case FLOW_ACTION_REDIRECT: + action->id = XDP_FLOW_ACTION_REDIRECT; + action->ifindex = act->dev->ifindex; + break; case FLOW_ACTION_VLAN_PUSH: case FLOW_ACTION_VLAN_POP: case FLOW_ACTION_VLAN_MANGLE: 
diff --git a/net/xdp_flow/xdp_flow_umh.c b/net/xdp_flow/xdp_flow_umh.c index 9a4769b..cbb766a 100644 --- a/net/xdp_flow/xdp_flow_umh.c +++ b/net/xdp_flow/xdp_flow_umh.c @@ -18,6 +18,7 @@ extern char xdp_flow_bpf_start; extern char xdp_flow_bpf_end; int progfile_fd; +int output_map_fd; #define zalloc(size) calloc(1, (size)) @@ -40,12 +41,22 @@ struct netdev_info { struct netdev_info_key key; struct hlist_node node; struct bpf_object *obj; + int devmap_idx; int free_slot_top; int free_slots[MAX_FLOW_MASKS]; }; DEFINE_HASHTABLE(netdev_info_table, 16); +struct devmap_idx_node { + int devmap_idx; + struct hlist_node node; +}; + +DEFINE_HASHTABLE(devmap_idx_table, 16); + +int max_devmap_idx; + static int libbpf_err(int err, char *errbuf) { libbpf_strerror(err, errbuf, ERRBUF_SIZE); @@ -90,6 +101,15 @@ static int setup(void) goto err; } + output_map_fd = bpf_create_map(BPF_MAP_TYPE_DEVMAP, sizeof(int), + sizeof(int), MAX_PORTS, 0); + if (output_map_fd < 0) { + err = -errno; + pr_err("map creation for output_map failed: %s\n", + strerror(errno)); + goto err; + } + return 0; err: close(progfile_fd); @@ -97,10 +117,23 @@ static int setup(void) return err; } -static int load_bpf(int ifindex, struct bpf_object **objp) +static void delete_output_map_elem(int idx) +{ + char errbuf[ERRBUF_SIZE]; + int err; + + err = bpf_map_delete_elem(output_map_fd, &idx); + if (err) { + libbpf_err(err, errbuf); + pr_warn("Failed to delete idx %d from output_map: %s\n", + idx, errbuf); + } +} + +static int load_bpf(int ifindex, int devmap_idx, struct bpf_object **objp) { int prog_fd, flow_tables_fd, flow_meta_fd, flow_masks_head_fd, err; - struct bpf_map *flow_tables, *flow_masks_head; + struct bpf_map *output_map, *flow_tables, *flow_masks_head; int zero = 0, flow_masks_tail = FLOW_MASKS_TAIL; struct bpf_object_open_attr attr = {}; char path[256], errbuf[ERRBUF_SIZE]; @@ -133,6 +166,27 @@ static int load_bpf(int ifindex, struct bpf_object **objp) bpf_object__for_each_program(prog, obj) 
bpf_program__set_type(prog, attr.prog_type); + output_map = bpf_object__find_map_by_name(obj, "output_map"); + if (!output_map) { + pr_err("Cannot find output_map\n"); + err = -ENOENT; + goto err_obj; + } + + err = bpf_map__reuse_fd(output_map, output_map_fd); + if (err) { + err = libbpf_err(err, errbuf); + pr_err("Failed to reuse output_map fd: %s\n", errbuf); + goto err_obj; + } + + if (bpf_map_update_elem(output_map_fd, &devmap_idx, &ifindex, 0)) { + err = -errno; + pr_err("Failed to insert idx %d if %d into output_map: %s\n", + devmap_idx, ifindex, strerror(errno)); + goto err_obj; + } + flow_meta_fd = bpf_create_map(BPF_MAP_TYPE_HASH, sizeof(struct xdp_flow_key), sizeof(struct xdp_flow_actions), @@ -222,6 +276,8 @@ static int load_bpf(int ifindex, struct bpf_object **objp) return prog_fd; err: + delete_output_map_elem(devmap_idx); +err_obj: bpf_object__close(obj); return err; } @@ -272,6 +328,56 @@ static struct netdev_info *get_netdev_info(const struct mbox_request *req) return netdev_info; } +static struct devmap_idx_node *find_devmap_idx(int devmap_idx) +{ + struct devmap_idx_node *node; + + hash_for_each_possible(devmap_idx_table, node, node, devmap_idx) { + if (node->devmap_idx == devmap_idx) + return node; + } + + return NULL; +} + +static int get_new_devmap_idx(void) +{ + int offset; + + for (offset = 0; offset < MAX_PORTS; offset++) { + int devmap_idx = max_devmap_idx++; + + if (max_devmap_idx >= MAX_PORTS) + max_devmap_idx -= MAX_PORTS; + + if (!find_devmap_idx(devmap_idx)) { + struct devmap_idx_node *node; + + node = malloc(sizeof(*node)); + if (!node) { + pr_err("malloc for devmap_idx failed\n"); + return -ENOMEM; + } + node->devmap_idx = devmap_idx; + hash_add(devmap_idx_table, &node->node, devmap_idx); + + return devmap_idx; + } + } + + return -ENOSPC; +} + +static void delete_devmap_idx(int devmap_idx) +{ + struct devmap_idx_node *node = find_devmap_idx(devmap_idx); + + if (node) { + hash_del(&node->node); + free(node); + } +} + static void 
init_flow_masks_free_slot(struct netdev_info *netdev_info) { int i; @@ -325,11 +431,11 @@ static void delete_flow_masks_free_slot(struct netdev_info *netdev_info, static int handle_load(const struct mbox_request *req, __u32 *prog_id) { + int err, prog_fd, devmap_idx = -1; struct netdev_info *netdev_info; struct bpf_prog_info info = {}; struct netdev_info_key key; __u32 len = sizeof(info); - int err, prog_fd; err = get_netdev_info_key(req, &key); if (err) @@ -346,12 +452,19 @@ static int handle_load(const struct mbox_request *req, __u32 *prog_id) } netdev_info->key.ifindex = key.ifindex; + devmap_idx = get_new_devmap_idx(); + if (devmap_idx < 0) { + err = devmap_idx; + goto err_netdev_info; + } + netdev_info->devmap_idx = devmap_idx; + init_flow_masks_free_slot(netdev_info); - prog_fd = load_bpf(req->ifindex, &netdev_info->obj); + prog_fd = load_bpf(req->ifindex, devmap_idx, &netdev_info->obj); if (prog_fd < 0) { err = prog_fd; - goto err_netdev_info; + goto err_devmap_idx; } err = bpf_obj_get_info_by_fd(prog_fd, &info, &len); @@ -366,6 +479,8 @@ static int handle_load(const struct mbox_request *req, __u32 *prog_id) return 0; err_obj: bpf_object__close(netdev_info->obj); +err_devmap_idx: + delete_devmap_idx(devmap_idx); err_netdev_info: free(netdev_info); @@ -382,12 +497,45 @@ static int handle_unload(const struct mbox_request *req) hash_del(&netdev_info->node); bpf_object__close(netdev_info->obj); + delete_output_map_elem(netdev_info->devmap_idx); + delete_devmap_idx(netdev_info->devmap_idx); free(netdev_info); pr_debug("XDP program for if %d was closed\n", req->ifindex); return 0; } +static int convert_ifindex_to_devmap_idx(struct mbox_request *req) +{ + int i; + + for (i = 0; i < req->flow.actions.num_actions; i++) { + struct xdp_flow_action *action = &req->flow.actions.actions[i]; + + if (action->id == XDP_FLOW_ACTION_REDIRECT) { + struct netdev_info *netdev_info; + struct netdev_info_key key; + int err; + + err = get_netdev_info_key(req, &key); + if (err) + 
return err;
+			key.ifindex = action->ifindex;
+
+			netdev_info = find_netdev_info(&key);
+			if (!netdev_info) {
+				pr_err("Cannot redirect to ifindex %d. Please setup xdp_flow on ifindex %d in advance.\n",
+				       key.ifindex, key.ifindex);
+				return -ENOENT;
+			}
+
+			action->ifindex = netdev_info->devmap_idx;
+		}
+	}
+
+	return 0;
+}
+
 static int get_table_fd(const struct netdev_info *netdev_info,
 			const char *table_name)
 {
@@ -784,6 +932,11 @@ static int handle_replace(struct mbox_request *req)
 	if (IS_ERR(netdev_info))
 		return PTR_ERR(netdev_info);

+	/* TODO: Use XDP_TX for redirect action when possible */
+	err = convert_ifindex_to_devmap_idx(req);
+	if (err)
+		return err;
+
 	err = flow_table_insert_elem(netdev_info, &req->flow);
 	if (err)
 		return err;
@@ -875,6 +1028,7 @@ int main(void)
 		return -1;
 	loop();
 	close(progfile_fd);
+	close(output_map_fd);

 	return 0;
 }

From patchwork Tue Aug 13 12:05:55 2019
From: Toshiaki Makita
To: Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu,
    Yonghong Song, "David S. Miller", Jakub Kicinski,
    Jesper Dangaard Brouer, John Fastabend, Jamal Hadi Salim,
    Cong Wang, Jiri Pirko
Cc: Toshiaki Makita, netdev@vger.kernel.org, bpf@vger.kernel.org, William Tu
Subject: [RFC PATCH bpf-next 11/14] xdp_flow: Implement vlan_push action
Date: Tue, 13 Aug 2019 21:05:55 +0900
Message-Id: <20190813120558.6151-12-toshiaki.makita1@gmail.com>
In-Reply-To: <20190813120558.6151-1-toshiaki.makita1@gmail.com>

This is another example action.

Signed-off-by: Toshiaki Makita
---
 net/xdp_flow/xdp_flow_kern_bpf.c | 23 +++++++++++++++++++++--
 net/xdp_flow/xdp_flow_kern_mod.c |  5 +++++
 2 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/net/xdp_flow/xdp_flow_kern_bpf.c b/net/xdp_flow/xdp_flow_kern_bpf.c
index 8f3d359..51e181b 100644
--- a/net/xdp_flow/xdp_flow_kern_bpf.c
+++ b/net/xdp_flow/xdp_flow_kern_bpf.c
@@ -90,10 +90,29 @@ static inline int action_redirect(struct xdp_flow_action *action)
 static inline int action_vlan_push(struct xdp_md *ctx,
 				   struct xdp_flow_action *action)
 {
+	struct vlan_ethhdr *vehdr;
+	void *data, *data_end;
+	__be16 proto, tci;
+
 	account_action(XDP_FLOW_ACTION_VLAN_PUSH);
 
-	// TODO: implement this
-	return XDP_ABORTED;
+	proto = action->vlan.proto;
+	tci = action->vlan.tci;
+
+	if (bpf_xdp_adjust_head(ctx, -VLAN_HLEN))
+		return XDP_DROP;
+
+	data_end = (void *)(long)ctx->data_end;
+	data = (void *)(long)ctx->data;
+	if (data + VLAN_ETH_HLEN > data_end)
+		return XDP_DROP;
+
+	__builtin_memmove(data, data + VLAN_HLEN, ETH_ALEN * 2);
+	vehdr = data;
+	vehdr->h_vlan_proto = proto;
+	vehdr->h_vlan_TCI = tci;
+
+	return _XDP_CONTINUE;
 }
 
 static inline int action_vlan_pop(struct xdp_md *ctx,
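For illustration, the packet surgery done by action_vlan_push() above can be mimicked in plain userspace C. This is a sketch, not part of the patch: `vlan_push()` is a hypothetical helper, the headroom parameter stands in for what bpf_xdp_adjust_head() provides, and bounds checking is reduced to a single headroom test.

```c
/* Userspace sketch of the vlan_push idea: grow headroom by VLAN_HLEN,
 * move the two MAC addresses down into it, then write the VLAN tag
 * (TPID + TCI, both in network byte order) right behind them. */
#include <arpa/inet.h>
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN        6
#define VLAN_HLEN       4	/* TPID (2 bytes) + TCI (2 bytes) */
#define VLAN_PRIO_SHIFT 13

/* pkt points at the Ethernet header; headroom bytes are free before it.
 * Returns the new start of the packet, or NULL if there is no room
 * (the point where bpf_xdp_adjust_head() would fail in the patch). */
static uint8_t *vlan_push(uint8_t *pkt, size_t headroom,
			  uint16_t proto_be, uint16_t tci_be)
{
	uint8_t *new_start;

	if (headroom < VLAN_HLEN)
		return NULL;

	new_start = pkt - VLAN_HLEN;
	/* Move dst + src MAC (12 bytes), like the patch's
	 * __builtin_memmove(data, data + VLAN_HLEN, ETH_ALEN * 2). */
	memmove(new_start, pkt, ETH_ALEN * 2);
	memcpy(new_start + ETH_ALEN * 2, &proto_be, sizeof(proto_be));
	memcpy(new_start + ETH_ALEN * 2 + 2, &tci_be, sizeof(tci_be));
	return new_start;
}
```

The TCI passed in would be built the way the module side of the patch builds it: `vid | (prio << VLAN_PRIO_SHIFT)`, then converted to network byte order.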
diff --git a/net/xdp_flow/xdp_flow_kern_mod.c b/net/xdp_flow/xdp_flow_kern_mod.c
index caa4968..52dc64e 100644
--- a/net/xdp_flow/xdp_flow_kern_mod.c
+++ b/net/xdp_flow/xdp_flow_kern_mod.c
@@ -55,6 +55,11 @@ static int xdp_flow_parse_actions(struct xdp_flow_actions *actions,
 		action->ifindex = act->dev->ifindex;
 		break;
 	case FLOW_ACTION_VLAN_PUSH:
+		action->id = XDP_FLOW_ACTION_VLAN_PUSH;
+		action->vlan.tci = act->vlan.vid |
+				   (act->vlan.prio << VLAN_PRIO_SHIFT);
+		action->vlan.proto = act->vlan.proto;
+		break;
 	case FLOW_ACTION_VLAN_POP:
 	case FLOW_ACTION_VLAN_MANGLE:
 	case FLOW_ACTION_MANGLE:

From patchwork Tue Aug 13 12:05:56 2019
Miller" , Jakub Kicinski , Jesper Dangaard Brouer , John Fastabend , Jamal Hadi Salim , Cong Wang , Jiri Pirko Cc: Toshiaki Makita , netdev@vger.kernel.org, bpf@vger.kernel.org, William Tu Subject: [RFC PATCH bpf-next 12/14] bpf, selftest: Add test for xdp_flow Date: Tue, 13 Aug 2019 21:05:56 +0900 Message-Id: <20190813120558.6151-13-toshiaki.makita1@gmail.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190813120558.6151-1-toshiaki.makita1@gmail.com> References: <20190813120558.6151-1-toshiaki.makita1@gmail.com> MIME-Version: 1.0 Sender: bpf-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org Check if TC flower offloading to XDP works. Signed-off-by: Toshiaki Makita --- tools/testing/selftests/bpf/Makefile | 1 + tools/testing/selftests/bpf/test_xdp_flow.sh | 103 +++++++++++++++++++++++++++ 2 files changed, 104 insertions(+) create mode 100755 tools/testing/selftests/bpf/test_xdp_flow.sh diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile index 3bd0f4a..886702a 100644 --- a/tools/testing/selftests/bpf/Makefile +++ b/tools/testing/selftests/bpf/Makefile @@ -50,6 +50,7 @@ TEST_PROGS := test_kmod.sh \ test_xdp_redirect.sh \ test_xdp_meta.sh \ test_xdp_veth.sh \ + test_xdp_flow.sh \ test_offload.py \ test_sock_addr.sh \ test_tunnel.sh \ diff --git a/tools/testing/selftests/bpf/test_xdp_flow.sh b/tools/testing/selftests/bpf/test_xdp_flow.sh new file mode 100755 index 0000000..cb06f3e --- /dev/null +++ b/tools/testing/selftests/bpf/test_xdp_flow.sh @@ -0,0 +1,103 @@ +#!/bin/sh +# SPDX-License-Identifier: GPL-2.0 +# +# Create 2 namespaces with 2 veth peers, and +# forward packets in-between using xdp_flow +# +# NS1(veth11) NS2(veth22) +# | | +# | | +# (veth1) (veth2) +# ^ ^ +# | xdp_flow | +# -------------------- + +# Kselftest framework requirement - SKIP code is 4. 
+ksft_skip=4
+
+TESTNAME=xdp_flow
+
+_cleanup()
+{
+	set +e
+	ip link del veth1 2> /dev/null
+	ip link del veth2 2> /dev/null
+	ip netns del ns1 2> /dev/null
+	ip netns del ns2 2> /dev/null
+}
+
+cleanup_skip()
+{
+	echo "selftests: $TESTNAME [SKIP]"
+	_cleanup
+
+	exit $ksft_skip
+}
+
+cleanup()
+{
+	if [ "$?" = 0 ]; then
+		echo "selftests: $TESTNAME [PASS]"
+	else
+		echo "selftests: $TESTNAME [FAILED]"
+	fi
+	_cleanup
+}
+
+if [ $(id -u) -ne 0 ]; then
+	echo "selftests: $TESTNAME [SKIP] Need root privileges"
+	exit $ksft_skip
+fi
+
+if ! ip link set dev lo xdp off > /dev/null 2>&1; then
+	echo "selftests: $TESTNAME [SKIP] Could not run test without the ip xdp support"
+	exit $ksft_skip
+fi
+
+set -e
+
+trap cleanup_skip EXIT
+
+ip netns add ns1
+ip netns add ns2
+
+ip link add veth1 type veth peer name veth11 netns ns1
+ip link add veth2 type veth peer name veth22 netns ns2
+
+ip link set veth1 up
+ip link set veth2 up
+
+ip -n ns1 addr add 10.1.1.11/24 dev veth11
+ip -n ns2 addr add 10.1.1.22/24 dev veth22
+
+ip -n ns1 link set dev veth11 up
+ip -n ns2 link set dev veth22 up
+
+ip -n ns1 link set dev veth11 xdp obj xdp_dummy.o sec xdp_dummy
+ip -n ns2 link set dev veth22 xdp obj xdp_dummy.o sec xdp_dummy
+
+ethtool -K veth1 tc-offload-xdp on
+ethtool -K veth2 tc-offload-xdp on
+
+trap cleanup EXIT
+
+# Adding clsact or ingress will trigger loading bpf prog in UMH
+tc qdisc add dev veth1 clsact
+tc qdisc add dev veth2 clsact
+
+# Adding filter will have UMH populate flow table map
+# 'skip_sw' can be accepted only when 'tc-offload-xdp' is enabled on veth
+tc filter add dev veth1 ingress protocol ip flower skip_sw \
+	dst_ip 10.1.1.0/24 action mirred egress redirect dev veth2
+tc filter add dev veth2 ingress protocol ip flower skip_sw \
+	dst_ip 10.1.1.0/24 action mirred egress redirect dev veth1
+
+# ARP is not supported so don't add 'skip_sw'
+tc filter add dev veth1 ingress protocol arp flower \
+	arp_tip 10.1.1.0/24 action mirred egress redirect dev veth2
+tc filter add dev veth2 ingress protocol arp flower \
+	arp_sip 10.1.1.0/24 action mirred egress redirect dev veth1
+
+ip netns exec ns1 ping -c 1 -W 1 10.1.1.22
+
+exit 0

From patchwork Tue Aug 13 12:05:57 2019
Miller" , Jakub Kicinski , Jesper Dangaard Brouer , John Fastabend , Jamal Hadi Salim , Cong Wang , Jiri Pirko Cc: Toshiaki Makita , netdev@vger.kernel.org, bpf@vger.kernel.org, William Tu Subject: [RFC PATCH bpf-next 13/14] i40e: prefetch xdp->data before running XDP prog Date: Tue, 13 Aug 2019 21:05:57 +0900 Message-Id: <20190813120558.6151-14-toshiaki.makita1@gmail.com> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20190813120558.6151-1-toshiaki.makita1@gmail.com> References: <20190813120558.6151-1-toshiaki.makita1@gmail.com> MIME-Version: 1.0 Sender: bpf-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org XDP progs are likely to read/write xdp->data. This improves the performance of xdp_flow. This is included in this series just to demonstrate to what extent xdp_flow performance can increase. Signed-off-by: Toshiaki Makita --- drivers/net/ethernet/intel/i40e/i40e_txrx.c | 1 + 1 file changed, 1 insertion(+) diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c index f162252..ea775ae 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c +++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c @@ -2207,6 +2207,7 @@ static struct sk_buff *i40e_run_xdp(struct i40e_ring *rx_ring, if (!xdp_prog) goto xdp_out; + prefetchw(xdp->data); prefetchw(xdp->data_hard_start); /* xdp_frame write */ act = bpf_prog_run_xdp(xdp_prog, xdp); From patchwork Tue Aug 13 12:05:58 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Toshiaki Makita X-Patchwork-Id: 1146306 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Original-To: incoming-bpf@patchwork.ozlabs.org Delivered-To: patchwork-incoming-bpf@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=bpf-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; 
From: Toshiaki Makita
To: Alexei Starovoitov, Daniel Borkmann, Martin KaFai Lau, Song Liu,
    Yonghong Song, "David S. Miller", Jakub Kicinski,
    Jesper Dangaard Brouer, John Fastabend, Jamal Hadi Salim,
    Cong Wang, Jiri Pirko
Cc: Toshiaki Makita, netdev@vger.kernel.org, bpf@vger.kernel.org, William Tu
Subject: [RFC PATCH bpf-next 14/14] bpf, hashtab: Compare keys in long
Date: Tue, 13 Aug 2019 21:05:58 +0900
Message-Id: <20190813120558.6151-15-toshiaki.makita1@gmail.com>
In-Reply-To: <20190813120558.6151-1-toshiaki.makita1@gmail.com>

memcmp() is generally slow, so compare keys one long at a time where
possible. This improves xdp_flow performance. This patch is included in
this series just to demonstrate to what extent xdp_flow performance can
increase.
Signed-off-by: Toshiaki Makita
---
 kernel/bpf/hashtab.c | 27 +++++++++++++++++++++++++--
 1 file changed, 25 insertions(+), 2 deletions(-)

diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
index 22066a6..8b5ffd4 100644
--- a/kernel/bpf/hashtab.c
+++ b/kernel/bpf/hashtab.c
@@ -417,6 +417,29 @@ static inline struct hlist_nulls_head *select_bucket(struct bpf_htab *htab, u32
 	return &__select_bucket(htab, hash)->head;
 }
 
+/* key1 must be aligned to sizeof long */
+static bool key_equal(void *key1, void *key2, u32 size)
+{
+	/* Check for key1 */
+	BUILD_BUG_ON(!IS_ALIGNED(offsetof(struct htab_elem, key),
+				 sizeof(long)));
+
+	if (IS_ALIGNED((unsigned long)key2 | (unsigned long)size,
+		       sizeof(long))) {
+		unsigned long *lkey1, *lkey2;
+
+		for (lkey1 = key1, lkey2 = key2; size > 0;
+		     lkey1++, lkey2++, size -= sizeof(long)) {
+			if (*lkey1 != *lkey2)
+				return false;
+		}
+
+		return true;
+	}
+
+	return !memcmp(key1, key2, size);
+}
+
 /* this lookup function can only be called with bucket lock taken */
 static struct htab_elem *lookup_elem_raw(struct hlist_nulls_head *head, u32 hash,
					 void *key, u32 key_size)
@@ -425,7 +448,7 @@ static struct htab_elem *lookup_elem_raw(struct hlist_nulls_head *head, u32 hash
 	struct htab_elem *l;
 
 	hlist_nulls_for_each_entry_rcu(l, n, head, hash_node)
-		if (l->hash == hash && !memcmp(&l->key, key, key_size))
+		if (l->hash == hash && key_equal(&l->key, key, key_size))
 			return l;
 
 	return NULL;
@@ -444,7 +467,7 @@ static struct htab_elem *lookup_nulls_elem_raw(struct hlist_nulls_head *head,
 again:
 	hlist_nulls_for_each_entry_rcu(l, n, head, hash_node)
-		if (l->hash == hash && !memcmp(&l->key, key, key_size))
+		if (l->hash == hash && key_equal(&l->key, key, key_size))
 			return l;
 
 	if (unlikely(get_nulls_value(n) != (hash & (n_buckets - 1))))
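The word-wise comparison above can be exercised in userspace. This is a sketch of the same idea, not the kernel code: it checks the alignment of both keys explicitly, whereas the patch only checks key2 and size because key1's alignment is guaranteed at compile time by the BUILD_BUG_ON on the htab_elem key offset.

```c
/* Userspace sketch of key_equal(): when both keys and the size are
 * long-aligned, compare one machine word at a time; otherwise fall
 * back to byte-wise memcmp(). */
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define IS_ALIGNED(x, a) (((x) & ((a) - 1)) == 0)

static bool key_equal(const void *key1, const void *key2, unsigned int size)
{
	/* Fast path: pointers and size all divisible by sizeof(long). */
	if (IS_ALIGNED((unsigned long)key1 | (unsigned long)key2 |
		       (unsigned long)size, sizeof(long))) {
		const unsigned long *l1 = key1;
		const unsigned long *l2 = key2;

		for (; size > 0; l1++, l2++, size -= sizeof(long))
			if (*l1 != *l2)
				return false;
		return true;
	}

	/* Slow path: arbitrary alignment or odd size. */
	return !memcmp(key1, key2, size);
}
```

Since BPF hash map keys are typically small fixed-size structs that are multiples of a word, the fast path is expected to be taken for maps like the xdp_flow flow table.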