From patchwork Mon Apr 8 17:05:56 2019
Subject: [PATCH net-next v4 1/6] net: xdp: refactor XDP attach
From: Toke Høiland-Jørgensen
To: David Miller
Cc: netdev@vger.kernel.org, Jesper Dangaard Brouer, Daniel Borkmann, Alexei Starovoitov, Jakub Kicinski, Björn Töpel
Date: Mon, 08 Apr 2019 19:05:56 +0200
Message-ID: <155474315660.24432.7026778509168675619.stgit@alrua-x1>
In-Reply-To: <155474315642.24432.6179239576879119104.stgit@alrua-x1>

From: Björn Töpel

Generic XDP and driver XDP cannot be enabled at the same time, but today they don't share any state; let's fix that. Here, dev->xdp_prog is used for both driver and generic mode, and the new dev->xdp_target field tracks which mode is active. This removes the need for the XDP_QUERY_PROG command to ndo_bpf.
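As a minimal userspace sketch (not part of this patch, and assuming libbpf's bpf_set_link_xdp_fd() helper), this is how the two attach modes are selected; the flag travels via netlink into dev_change_xdp_fd() on the kernel side:

/* Sketch only: attach an XDP program in generic (skb) or native (driver)
 * mode. bpf_set_link_xdp_fd() is libbpf's netlink wrapper; after this
 * patch the chosen mode is recorded in dev->xdp_target.
 */
#include <stdbool.h>
#include <bpf/libbpf.h>
#include <linux/if_link.h>

static int attach_xdp(int ifindex, int prog_fd, bool generic)
{
	__u32 flags = generic ? XDP_FLAGS_SKB_MODE : XDP_FLAGS_DRV_MODE;

	return bpf_set_link_xdp_fd(ifindex, prog_fd, flags);
}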
Signed-off-by: Björn Töpel
Signed-off-by: Toke Høiland-Jørgensen
---
 include/linux/netdevice.h |   8 +-
 net/core/dev.c            | 148 ++++++++++++++++++++++++++------------------
 net/core/rtnetlink.c      |  14 +---
 3 files changed, 93 insertions(+), 77 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index eb9f05e0863d..68d4bbc44c63 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1957,7 +1957,7 @@ struct net_device {
 	struct cpu_rmap		*rx_cpu_rmap;
 #endif
 	struct hlist_node	index_hlist;
-
+	unsigned int		xdp_target;
 	/*
 	 * Cache lines mostly used on transmit path
 	 */
@@ -2063,7 +2063,8 @@ struct net_device {
 
 static inline bool netif_elide_gro(const struct net_device *dev)
 {
-	if (!(dev->features & NETIF_F_GRO) || dev->xdp_prog)
+	if (!(dev->features & NETIF_F_GRO) ||
+	    dev->xdp_target == XDP_FLAGS_SKB_MODE)
 		return true;
 	return false;
 }
@@ -3702,8 +3703,7 @@ struct sk_buff *dev_hard_start_xmit(struct sk_buff *skb, struct net_device *dev,
 typedef int (*bpf_op_t)(struct net_device *dev, struct netdev_bpf *bpf);
 int dev_change_xdp_fd(struct net_device *dev, struct netlink_ext_ack *extack,
 		      int fd, u32 flags);
-u32 __dev_xdp_query(struct net_device *dev, bpf_op_t xdp_op,
-		    enum bpf_netdev_command cmd);
+u32 __dev_xdp_query(struct net_device *dev, unsigned int target);
 int xdp_umem_query(struct net_device *dev, u16 queue_id);
 
 int __dev_forward_skb(struct net_device *dev, struct sk_buff *skb);
diff --git a/net/core/dev.c b/net/core/dev.c
index d5b1315218d3..feafc3580350 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -5121,35 +5121,25 @@ static void __netif_receive_skb_list(struct list_head *head)
 
 static int generic_xdp_install(struct net_device *dev, struct netdev_bpf *xdp)
 {
-	struct bpf_prog *old = rtnl_dereference(dev->xdp_prog);
 	struct bpf_prog *new = xdp->prog;
-	int ret = 0;
-
-	switch (xdp->command) {
-	case XDP_SETUP_PROG:
-		rcu_assign_pointer(dev->xdp_prog, new);
-		if (old)
-			bpf_prog_put(old);
 
-		if (old && !new) {
+	if (xdp->command == XDP_SETUP_PROG) {
+		if (static_key_enabled(&generic_xdp_needed_key) && !new) {
 			static_branch_dec(&generic_xdp_needed_key);
-		} else if (new && !old) {
+		} else if (new &&
+			   !static_key_enabled(&generic_xdp_needed_key)) {
 			static_branch_inc(&generic_xdp_needed_key);
 			dev_disable_lro(dev);
 			dev_disable_gro_hw(dev);
 		}
-		break;
 
-	case XDP_QUERY_PROG:
-		xdp->prog_id = old ? old->aux->id : 0;
-		break;
+		if (new)
+			bpf_prog_put(new);
 
-	default:
-		ret = -EINVAL;
-		break;
+		return 0;
 	}
 
-	return ret;
+	return -EINVAL;
 }
 
 static int netif_receive_skb_internal(struct sk_buff *skb)
@@ -7981,30 +7971,50 @@ int dev_change_proto_down_generic(struct net_device *dev, bool proto_down)
 }
 EXPORT_SYMBOL(dev_change_proto_down_generic);
 
-u32 __dev_xdp_query(struct net_device *dev, bpf_op_t bpf_op,
-		    enum bpf_netdev_command cmd)
+u32 __dev_xdp_query(struct net_device *dev, unsigned int target)
 {
-	struct netdev_bpf xdp;
+	if (target == XDP_FLAGS_DRV_MODE || target == XDP_FLAGS_SKB_MODE) {
+		struct bpf_prog *prog = rtnl_dereference(dev->xdp_prog);
 
-	if (!bpf_op)
-		return 0;
+		return prog && (dev->xdp_target == target) ? prog->aux->id : 0;
+	}
 
-	memset(&xdp, 0, sizeof(xdp));
-	xdp.command = cmd;
+	if (target == XDP_FLAGS_HW_MODE) {
+		struct netdev_bpf xdp = { .command = XDP_QUERY_PROG_HW };
 
-	/* Query must always succeed. */
-	WARN_ON(bpf_op(dev, &xdp) < 0 && cmd == XDP_QUERY_PROG);
+		if (!dev->netdev_ops->ndo_bpf)
+			return 0;
+		dev->netdev_ops->ndo_bpf(dev, &xdp);
+		return xdp.prog_id;
+	}
 
-	return xdp.prog_id;
+	WARN_ON(1);
+	return 0;
 }
 
-static int dev_xdp_install(struct net_device *dev, bpf_op_t bpf_op,
+static int dev_xdp_install(struct net_device *dev, unsigned int target,
 			   struct netlink_ext_ack *extack, u32 flags,
 			   struct bpf_prog *prog)
 {
-	struct netdev_bpf xdp;
+	const struct net_device_ops *ops = dev->netdev_ops;
+	struct bpf_prog *old = NULL;
+	struct netdev_bpf xdp = {};
+	int ret;
+
+	if (target != XDP_FLAGS_HW_MODE) {
+		if (prog) {
+			prog = bpf_prog_inc(prog);
+			if (IS_ERR(prog))
+				return PTR_ERR(prog);
+		}
+
+		old = rtnl_dereference(dev->xdp_prog);
+		if (!old && !prog)
+			return 0;
+		rcu_assign_pointer(dev->xdp_prog, prog);
+		dev->xdp_target = prog ? target : 0;
+	}
 
-	memset(&xdp, 0, sizeof(xdp));
 	if (flags & XDP_FLAGS_HW_MODE)
 		xdp.command = XDP_SETUP_PROG_HW;
 	else
@@ -8013,7 +8023,24 @@ static int dev_xdp_install(struct net_device *dev, bpf_op_t bpf_op,
 	xdp.flags = flags;
 	xdp.prog = prog;
 
-	return bpf_op(dev, &xdp);
+	if (target == XDP_FLAGS_SKB_MODE)
+		ret = generic_xdp_install(dev, &xdp);
+	else
+		ret = ops->ndo_bpf(dev, &xdp);
+
+	if (target != XDP_FLAGS_HW_MODE) {
+		if (ret) {
+			if (prog)
+				bpf_prog_put(prog);
+			rcu_assign_pointer(dev->xdp_prog, old);
+			dev->xdp_target = old ? target : 0;
+		} else {
+			if (old)
+				bpf_prog_put(old);
+		}
+	}
+
+	return ret;
 }
 
 static void dev_xdp_uninstall(struct net_device *dev)
@@ -8021,27 +8048,28 @@ static void dev_xdp_uninstall(struct net_device *dev)
 	struct netdev_bpf xdp;
 	bpf_op_t ndo_bpf;
 
-	/* Remove generic XDP */
-	WARN_ON(dev_xdp_install(dev, generic_xdp_install, NULL, 0, NULL));
+	/* Remove generic/native XDP */
+	WARN_ON(dev_xdp_install(dev, dev->xdp_target, NULL, 0, NULL));
 
 	/* Remove from the driver */
 	ndo_bpf = dev->netdev_ops->ndo_bpf;
 	if (!ndo_bpf)
 		return;
 
-	memset(&xdp, 0, sizeof(xdp));
-	xdp.command = XDP_QUERY_PROG;
-	WARN_ON(ndo_bpf(dev, &xdp));
-	if (xdp.prog_id)
-		WARN_ON(dev_xdp_install(dev, ndo_bpf, NULL, xdp.prog_flags,
-					NULL));
-
-	/* Remove HW offload */
 	memset(&xdp, 0, sizeof(xdp));
 	xdp.command = XDP_QUERY_PROG_HW;
 	if (!ndo_bpf(dev, &xdp) && xdp.prog_id)
-		WARN_ON(dev_xdp_install(dev, ndo_bpf, NULL, xdp.prog_flags,
-					NULL));
+		WARN_ON(dev_xdp_install(dev, XDP_FLAGS_HW_MODE, NULL,
+					xdp.prog_flags, NULL));
+}
+
+static bool dev_validate_active_xdp(struct net_device *dev, unsigned int target)
+{
+	if (target == XDP_FLAGS_DRV_MODE)
+		return dev->xdp_target != XDP_FLAGS_SKB_MODE;
+	if (target == XDP_FLAGS_SKB_MODE)
+		return dev->xdp_target != XDP_FLAGS_DRV_MODE;
+	return true;
 }
 
 /**
@@ -8057,40 +8085,36 @@ int dev_change_xdp_fd(struct net_device *dev, struct netlink_ext_ack *extack,
 		      int fd, u32 flags)
 {
 	const struct net_device_ops *ops = dev->netdev_ops;
-	enum bpf_netdev_command query;
+	bool offload, drv = !!ops->ndo_bpf;
 	struct bpf_prog *prog = NULL;
-	bpf_op_t bpf_op, bpf_chk;
-	bool offload;
+	unsigned int target;
 	int err;
 
 	ASSERT_RTNL();
 
 	offload = flags & XDP_FLAGS_HW_MODE;
-	query = offload ? XDP_QUERY_PROG_HW : XDP_QUERY_PROG;
+	target = offload ? XDP_FLAGS_HW_MODE : XDP_FLAGS_DRV_MODE;
 
-	bpf_op = bpf_chk = ops->ndo_bpf;
-	if (!bpf_op && (flags & (XDP_FLAGS_DRV_MODE | XDP_FLAGS_HW_MODE))) {
+	if (!drv && (flags & (XDP_FLAGS_DRV_MODE | XDP_FLAGS_HW_MODE))) {
 		NL_SET_ERR_MSG(extack, "underlying driver does not support XDP in native mode");
 		return -EOPNOTSUPP;
 	}
-	if (!bpf_op || (flags & XDP_FLAGS_SKB_MODE))
-		bpf_op = generic_xdp_install;
-	if (bpf_op == bpf_chk)
-		bpf_chk = generic_xdp_install;
+	if (!drv || (flags & XDP_FLAGS_SKB_MODE))
+		target = XDP_FLAGS_SKB_MODE;
+
+	if (!dev_validate_active_xdp(dev, target)) {
+		NL_SET_ERR_MSG(extack, "native and generic XDP can't be active at the same time");
+		return -EEXIST;
+	}
 
 	if (fd >= 0) {
-		if (!offload && __dev_xdp_query(dev, bpf_chk, XDP_QUERY_PROG)) {
-			NL_SET_ERR_MSG(extack, "native and generic XDP can't be active at the same time");
-			return -EEXIST;
-		}
 		if ((flags & XDP_FLAGS_UPDATE_IF_NOEXIST) &&
-		    __dev_xdp_query(dev, bpf_op, query)) {
+		    __dev_xdp_query(dev, target)) {
 			NL_SET_ERR_MSG(extack, "XDP program already attached");
 			return -EBUSY;
 		}
 
-		prog = bpf_prog_get_type_dev(fd, BPF_PROG_TYPE_XDP,
-					     bpf_op == ops->ndo_bpf);
+		prog = bpf_prog_get_type_dev(fd, BPF_PROG_TYPE_XDP, drv);
 		if (IS_ERR(prog))
 			return PTR_ERR(prog);
 
@@ -8101,7 +8125,7 @@ int dev_change_xdp_fd(struct net_device *dev, struct netlink_ext_ack *extack,
 		}
 	}
 
-	err = dev_xdp_install(dev, bpf_op, extack, flags, prog);
+	err = dev_xdp_install(dev, target, extack, flags, prog);
 	if (err < 0 && prog)
 		bpf_prog_put(prog);
 
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index f9b964fd4e4d..d9c7fe34c3a1 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -1362,25 +1362,17 @@ static int rtnl_fill_link_ifmap(struct sk_buff *skb, struct net_device *dev)
 
 static u32 rtnl_xdp_prog_skb(struct net_device *dev)
 {
-	const struct bpf_prog *generic_xdp_prog;
-
-	ASSERT_RTNL();
-
-	generic_xdp_prog = rtnl_dereference(dev->xdp_prog);
-	if (!generic_xdp_prog)
-		return 0;
-	return generic_xdp_prog->aux->id;
+	return __dev_xdp_query(dev, XDP_FLAGS_SKB_MODE);
 }
 
 static u32 rtnl_xdp_prog_drv(struct net_device *dev)
 {
-	return __dev_xdp_query(dev, dev->netdev_ops->ndo_bpf, XDP_QUERY_PROG);
+	return __dev_xdp_query(dev, XDP_FLAGS_DRV_MODE);
 }
 
 static u32 rtnl_xdp_prog_hw(struct net_device *dev)
 {
-	return __dev_xdp_query(dev, dev->netdev_ops->ndo_bpf,
-			       XDP_QUERY_PROG_HW);
+	return __dev_xdp_query(dev, XDP_FLAGS_HW_MODE);
 }
 
 static int rtnl_xdp_report_one(struct sk_buff *skb, struct net_device *dev,

From patchwork Mon Apr 8 17:05:56 2019
Subject: [PATCH net-next v4 2/6] net: xdp: remove XDP_QUERY_PROG
From: Toke Høiland-Jørgensen
To: David Miller
Cc: netdev@vger.kernel.org, Jesper Dangaard Brouer, Daniel Borkmann, Alexei Starovoitov, Jakub Kicinski, Björn Töpel
Date: Mon, 08 Apr 2019 19:05:56 +0200
Message-ID: <155474315667.24432.10067093738731836097.stgit@alrua-x1>
In-Reply-To: <155474315642.24432.6179239576879119104.stgit@alrua-x1>

From: Björn Töpel

With the previous patch, we no longer need the XDP_QUERY_PROG operation, so remove the handling of it from all drivers.
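To illustrate the result, a driver's ndo_bpf callback shrinks to handling only the setup commands; a representative sketch (function names are illustrative, mirroring the nicvf change below):

/* Illustrative sketch of a driver ndo_bpf after this patch: the query
 * command is gone, so only program setup remains. example_xdp_setup()
 * is a placeholder for the driver's own setup routine.
 */
static int example_xdp(struct net_device *netdev, struct netdev_bpf *xdp)
{
	switch (xdp->command) {
	case XDP_SETUP_PROG:
		return example_xdp_setup(netdev, xdp->prog);
	default:
		return -EINVAL;
	}
}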
Signed-off-by: Björn Töpel Signed-off-by: Toke Høiland-Jørgensen --- drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c | 4 --- drivers/net/ethernet/cavium/thunder/nicvf_main.c | 3 --- drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c | 3 --- drivers/net/ethernet/intel/i40e/i40e_main.c | 3 --- drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 4 --- drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 4 --- drivers/net/ethernet/mellanox/mlx4/en_netdev.c | 24 -------------------- drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 18 --------------- .../net/ethernet/netronome/nfp/nfp_net_common.c | 2 -- drivers/net/ethernet/qlogic/qede/qede_filter.c | 3 --- drivers/net/netdevsim/bpf.c | 2 -- drivers/net/netdevsim/netdevsim.h | 2 +- drivers/net/tun.c | 15 ------------- drivers/net/veth.c | 15 ------------- drivers/net/virtio_net.c | 17 -------------- include/linux/netdevice.h | 3 +-- 16 files changed, 2 insertions(+), 120 deletions(-) diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c index 0184ef6f05a7..8b1e77522e18 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c @@ -217,10 +217,6 @@ int bnxt_xdp(struct net_device *dev, struct netdev_bpf *xdp) case XDP_SETUP_PROG: rc = bnxt_xdp_set(bp, xdp->prog); break; - case XDP_QUERY_PROG: - xdp->prog_id = bp->xdp_prog ? bp->xdp_prog->aux->id : 0; - rc = 0; - break; default: rc = -EINVAL; break; diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c b/drivers/net/ethernet/cavium/thunder/nicvf_main.c index aa2be4807191..dfe38f64ca12 100644 --- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c +++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c @@ -1892,9 +1892,6 @@ static int nicvf_xdp(struct net_device *netdev, struct netdev_bpf *xdp) switch (xdp->command) { case XDP_SETUP_PROG: return nicvf_xdp_setup(nic, xdp->prog); - case XDP_QUERY_PROG: - xdp->prog_id = nic->xdp_prog ? nic->xdp_prog->aux->id : 0; - return 0; default: return -EINVAL; } diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c index 2055c97dc22b..4f96a36f64f9 100644 --- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c +++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c @@ -1763,9 +1763,6 @@ static int dpaa2_eth_xdp(struct net_device *dev, struct netdev_bpf *xdp) switch (xdp->command) { case XDP_SETUP_PROG: return setup_xdp(dev, xdp->prog); - case XDP_QUERY_PROG: - xdp->prog_id = priv->xdp_prog ? priv->xdp_prog->aux->id : 0; - break; default: return -EINVAL; } diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c index da62218eb70a..7cd0ccb9ec87 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_main.c +++ b/drivers/net/ethernet/intel/i40e/i40e_main.c @@ -12146,9 +12146,6 @@ static int i40e_xdp(struct net_device *dev, switch (xdp->command) { case XDP_SETUP_PROG: return i40e_xdp_setup(vsi, xdp->prog); - case XDP_QUERY_PROG: - xdp->prog_id = vsi->xdp_prog ? 
vsi->xdp_prog->aux->id : 0; - return 0; case XDP_SETUP_XSK_UMEM: return i40e_xsk_umem_setup(vsi, xdp->xsk.umem, xdp->xsk.queue_id); diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c index 60cec3540dd7..fa03dfefd098 100644 --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c @@ -10289,10 +10289,6 @@ static int ixgbe_xdp(struct net_device *dev, struct netdev_bpf *xdp) switch (xdp->command) { case XDP_SETUP_PROG: return ixgbe_xdp_setup(dev, xdp->prog); - case XDP_QUERY_PROG: - xdp->prog_id = adapter->xdp_prog ? - adapter->xdp_prog->aux->id : 0; - return 0; case XDP_SETUP_XSK_UMEM: return ixgbe_xsk_umem_setup(adapter, xdp->xsk.umem, xdp->xsk.queue_id); diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c index 49e23afa05a2..75fcc148ed97 100644 --- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c +++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c @@ -4484,10 +4484,6 @@ static int ixgbevf_xdp(struct net_device *dev, struct netdev_bpf *xdp) switch (xdp->command) { case XDP_SETUP_PROG: return ixgbevf_xdp_setup(dev, xdp->prog); - case XDP_QUERY_PROG: - xdp->prog_id = adapter->xdp_prog ? - adapter->xdp_prog->aux->id : 0; - return 0; default: return -EINVAL; } diff --git a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c index c1438ae52a11..8850fc35510a 100644 --- a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c +++ b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c @@ -2883,35 +2883,11 @@ static int mlx4_xdp_set(struct net_device *dev, struct bpf_prog *prog) return err; } -static u32 mlx4_xdp_query(struct net_device *dev) -{ - struct mlx4_en_priv *priv = netdev_priv(dev); - struct mlx4_en_dev *mdev = priv->mdev; - const struct bpf_prog *xdp_prog; - u32 prog_id = 0; - - if (!priv->tx_ring_num[TX_XDP]) - return prog_id; - - mutex_lock(&mdev->state_lock); - xdp_prog = rcu_dereference_protected( - priv->rx_ring[0]->xdp_prog, - lockdep_is_held(&mdev->state_lock)); - if (xdp_prog) - prog_id = xdp_prog->aux->id; - mutex_unlock(&mdev->state_lock); - - return prog_id; -} - static int mlx4_xdp(struct net_device *dev, struct netdev_bpf *xdp) { switch (xdp->command) { case XDP_SETUP_PROG: return mlx4_xdp_set(dev, xdp->prog); - case XDP_QUERY_PROG: - xdp->prog_id = mlx4_xdp_query(dev); - return 0; default: return -EINVAL; } diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c index e08a1eb04e22..67fb1738fd84 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c @@ -4293,29 +4293,11 @@ static int mlx5e_xdp_set(struct net_device *netdev, struct bpf_prog *prog) return err; } -static u32 mlx5e_xdp_query(struct net_device *dev) -{ - struct mlx5e_priv *priv = netdev_priv(dev); - const struct bpf_prog *xdp_prog; - u32 prog_id = 0; - - mutex_lock(&priv->state_lock); - xdp_prog = priv->channels.params.xdp_prog; - if (xdp_prog) - prog_id = xdp_prog->aux->id; - mutex_unlock(&priv->state_lock); - - return prog_id; -} - static int mlx5e_xdp(struct net_device *dev, struct netdev_bpf *xdp) { switch (xdp->command) { case XDP_SETUP_PROG: return mlx5e_xdp_set(dev, xdp->prog); - case XDP_QUERY_PROG: - xdp->prog_id = mlx5e_xdp_query(dev); - return 0; default: return -EINVAL; } diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c 
b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c index 961cd5e7bf2b..2101328e6446 100644 --- a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c +++ b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c @@ -3478,8 +3478,6 @@ static int nfp_net_xdp(struct net_device *netdev, struct netdev_bpf *xdp) return nfp_net_xdp_setup_drv(nn, xdp); case XDP_SETUP_PROG_HW: return nfp_net_xdp_setup_hw(nn, xdp); - case XDP_QUERY_PROG: - return xdp_attachment_query(&nn->xdp, xdp); case XDP_QUERY_PROG_HW: return xdp_attachment_query(&nn->xdp_hw, xdp); default: diff --git a/drivers/net/ethernet/qlogic/qede/qede_filter.c b/drivers/net/ethernet/qlogic/qede/qede_filter.c index add922b93d2c..69f4e7d37d01 100644 --- a/drivers/net/ethernet/qlogic/qede/qede_filter.c +++ b/drivers/net/ethernet/qlogic/qede/qede_filter.c @@ -1118,9 +1118,6 @@ int qede_xdp(struct net_device *dev, struct netdev_bpf *xdp) switch (xdp->command) { case XDP_SETUP_PROG: return qede_xdp_set(edev, xdp->prog); - case XDP_QUERY_PROG: - xdp->prog_id = edev->xdp_prog ? edev->xdp_prog->aux->id : 0; - return 0; default: return -EINVAL; } diff --git a/drivers/net/netdevsim/bpf.c b/drivers/net/netdevsim/bpf.c index f92c43453ec6..a6cc8954783d 100644 --- a/drivers/net/netdevsim/bpf.c +++ b/drivers/net/netdevsim/bpf.c @@ -547,8 +547,6 @@ int nsim_bpf(struct net_device *dev, struct netdev_bpf *bpf) ASSERT_RTNL(); switch (bpf->command) { - case XDP_QUERY_PROG: - return xdp_attachment_query(&ns->xdp, bpf); case XDP_QUERY_PROG_HW: return xdp_attachment_query(&ns->xdp_hw, bpf); case XDP_SETUP_PROG: diff --git a/drivers/net/netdevsim/netdevsim.h b/drivers/net/netdevsim/netdevsim.h index 384c254fafc5..c169477d1f5d 100644 --- a/drivers/net/netdevsim/netdevsim.h +++ b/drivers/net/netdevsim/netdevsim.h @@ -122,7 +122,7 @@ static inline void nsim_bpf_uninit(struct netdevsim *ns) static inline int nsim_bpf(struct net_device *dev, struct netdev_bpf *bpf) { - return bpf->command == XDP_QUERY_PROG ? 
0 : -EOPNOTSUPP; + return -EOPNOTSUPP; } static inline int nsim_bpf_disable_tc(struct netdevsim *ns) diff --git a/drivers/net/tun.c b/drivers/net/tun.c index 24d0220b9ba0..c81610802354 100644 --- a/drivers/net/tun.c +++ b/drivers/net/tun.c @@ -1229,26 +1229,11 @@ static int tun_xdp_set(struct net_device *dev, struct bpf_prog *prog, return 0; } -static u32 tun_xdp_query(struct net_device *dev) -{ - struct tun_struct *tun = netdev_priv(dev); - const struct bpf_prog *xdp_prog; - - xdp_prog = rtnl_dereference(tun->xdp_prog); - if (xdp_prog) - return xdp_prog->aux->id; - - return 0; -} - static int tun_xdp(struct net_device *dev, struct netdev_bpf *xdp) { switch (xdp->command) { case XDP_SETUP_PROG: return tun_xdp_set(dev, xdp->prog, xdp->extack); - case XDP_QUERY_PROG: - xdp->prog_id = tun_xdp_query(dev); - return 0; default: return -EINVAL; } diff --git a/drivers/net/veth.c b/drivers/net/veth.c index 569e87a51a33..33ccadbea577 100644 --- a/drivers/net/veth.c +++ b/drivers/net/veth.c @@ -1108,26 +1108,11 @@ static int veth_xdp_set(struct net_device *dev, struct bpf_prog *prog, return err; } -static u32 veth_xdp_query(struct net_device *dev) -{ - struct veth_priv *priv = netdev_priv(dev); - const struct bpf_prog *xdp_prog; - - xdp_prog = priv->_xdp_prog; - if (xdp_prog) - return xdp_prog->aux->id; - - return 0; -} - static int veth_xdp(struct net_device *dev, struct netdev_bpf *xdp) { switch (xdp->command) { case XDP_SETUP_PROG: return veth_xdp_set(dev, xdp->prog, xdp->extack); - case XDP_QUERY_PROG: - xdp->prog_id = veth_xdp_query(dev); - return 0; default: return -EINVAL; } diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index ba246fc475ae..2009a23d57d7 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -2525,28 +2525,11 @@ static int virtnet_xdp_set(struct net_device *dev, struct bpf_prog *prog, return err; } -static u32 virtnet_xdp_query(struct net_device *dev) -{ - struct virtnet_info *vi = netdev_priv(dev); - const struct bpf_prog *xdp_prog; - int i; - - for (i = 0; i < vi->max_queue_pairs; i++) { - xdp_prog = rtnl_dereference(vi->rq[i].xdp_prog); - if (xdp_prog) - return xdp_prog->aux->id; - } - return 0; -} - static int virtnet_xdp(struct net_device *dev, struct netdev_bpf *xdp) { switch (xdp->command) { case XDP_SETUP_PROG: return virtnet_xdp_set(dev, xdp->prog, xdp->extack); - case XDP_QUERY_PROG: - xdp->prog_id = virtnet_xdp_query(dev); - return 0; default: return -EINVAL; } diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index 68d4bbc44c63..db978850bf70 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -864,7 +864,6 @@ enum bpf_netdev_command { */ XDP_SETUP_PROG, XDP_SETUP_PROG_HW, - XDP_QUERY_PROG, XDP_QUERY_PROG_HW, /* BPF program for offload callbacks, invoked at program load time. 
 */
	BPF_OFFLOAD_MAP_ALLOC,
@@ -885,7 +884,7 @@ struct netdev_bpf {
 			struct bpf_prog *prog;
 			struct netlink_ext_ack *extack;
 		};
-		/* XDP_QUERY_PROG, XDP_QUERY_PROG_HW */
+		/* XDP_QUERY_PROG_HW */
 		struct {
 			u32 prog_id;
 			/* flags with which program was installed */

From patchwork Mon Apr 8 17:05:56 2019
Subject: [PATCH net-next v4 3/6] xdp: Refactor devmap code in preparation for subsequent additions
From: Toke Høiland-Jørgensen
To: David Miller
Cc: netdev@vger.kernel.org, Jesper Dangaard Brouer, Daniel Borkmann, Alexei Starovoitov, Jakub Kicinski, Björn Töpel
Date: Mon, 08 Apr 2019 19:05:56 +0200
Message-ID: <155474315675.24432.16407129756640477880.stgit@alrua-x1>
In-Reply-To: <155474315642.24432.6179239576879119104.stgit@alrua-x1>

The subsequent commits introducing default maps and a hash-based ifindex devmap require a bit of refactoring of the devmap code. Perform this first so the subsequent commits become easier to read.

Also split out the final freeing and flushing of devmaps into a workqueue, to make it easier to queue up the freeing of maps in the subsequent patches.

Finally, change the spin lock into a mutex, as subsequent patches add code that needs to be able to sleep while holding the lock.

Signed-off-by: Toke Høiland-Jørgensen
---
 kernel/bpf/devmap.c | 186 +++++++++++++++++++++++++++++++++------------------
 1 file changed, 120 insertions(+), 66 deletions(-)

diff --git a/kernel/bpf/devmap.c b/kernel/bpf/devmap.c
index 191b79948424..92393b283b87 100644
--- a/kernel/bpf/devmap.c
+++ b/kernel/bpf/devmap.c
@@ -48,6 +48,7 @@
 * calls will fail at this point.
*/ #include +#include #include #include #include @@ -75,33 +76,30 @@ struct bpf_dtab { struct bpf_dtab_netdev **netdev_map; unsigned long __percpu *flush_needed; struct list_head list; + struct work_struct free_work; }; -static DEFINE_SPINLOCK(dev_map_lock); +static DEFINE_MUTEX(dev_map_mtx); static LIST_HEAD(dev_map_list); +static struct workqueue_struct *dev_map_wq; +static void __dev_map_free(struct work_struct *work); + static u64 dev_map_bitmap_size(const union bpf_attr *attr) { return BITS_TO_LONGS((u64) attr->max_entries) * sizeof(unsigned long); } -static struct bpf_map *dev_map_alloc(union bpf_attr *attr) +static int dev_map_init_map(struct bpf_dtab *dtab, union bpf_attr *attr, + bool check_memlock) { - struct bpf_dtab *dtab; - int err = -EINVAL; u64 cost; - - if (!capable(CAP_NET_ADMIN)) - return ERR_PTR(-EPERM); + int err; /* check sanity of attributes */ if (attr->max_entries == 0 || attr->key_size != 4 || attr->value_size != 4 || attr->map_flags & ~DEV_CREATE_FLAG_MASK) - return ERR_PTR(-EINVAL); - - dtab = kzalloc(sizeof(*dtab), GFP_USER); - if (!dtab) - return ERR_PTR(-ENOMEM); + return -EINVAL; bpf_map_init_from_attr(&dtab->map, attr); @@ -109,59 +107,70 @@ static struct bpf_map *dev_map_alloc(union bpf_attr *attr) cost = (u64) dtab->map.max_entries * sizeof(struct bpf_dtab_netdev *); cost += dev_map_bitmap_size(attr) * num_possible_cpus(); if (cost >= U32_MAX - PAGE_SIZE) - goto free_dtab; + return -EINVAL; dtab->map.pages = round_up(cost, PAGE_SIZE) >> PAGE_SHIFT; - /* if map size is larger than memlock limit, reject it early */ - err = bpf_map_precharge_memlock(dtab->map.pages); - if (err) - goto free_dtab; - - err = -ENOMEM; + if (check_memlock) { + /* if map size is larger than memlock limit, reject it early */ + err = bpf_map_precharge_memlock(dtab->map.pages); + if (err) + return -EINVAL; + } /* A per cpu bitfield with a bit per possible net device */ dtab->flush_needed = __alloc_percpu_gfp(dev_map_bitmap_size(attr), __alignof__(unsigned long), GFP_KERNEL | __GFP_NOWARN); if (!dtab->flush_needed) - goto free_dtab; + return -ENOMEM; dtab->netdev_map = bpf_map_area_alloc(dtab->map.max_entries * sizeof(struct bpf_dtab_netdev *), dtab->map.numa_node); if (!dtab->netdev_map) - goto free_dtab; + goto free_map; - spin_lock(&dev_map_lock); - list_add_tail_rcu(&dtab->list, &dev_map_list); - spin_unlock(&dev_map_lock); + INIT_WORK(&dtab->free_work, __dev_map_free); - return &dtab->map; -free_dtab: + return 0; + +free_map: free_percpu(dtab->flush_needed); - kfree(dtab); - return ERR_PTR(err); + return -ENOMEM; } -static void dev_map_free(struct bpf_map *map) +static struct bpf_map *dev_map_alloc(union bpf_attr *attr) { - struct bpf_dtab *dtab = container_of(map, struct bpf_dtab, map); - int i, cpu; + struct bpf_dtab *dtab; + int err; - /* At this point bpf_prog->aux->refcnt == 0 and this map->refcnt == 0, - * so the programs (can be more than one that used this map) were - * disconnected from events. Wait for outstanding critical sections in - * these programs to complete. The rcu critical section only guarantees - * no further reads against netdev_map. It does __not__ ensure pending - * flush operations (if any) are complete. 
- */ + if (!capable(CAP_NET_ADMIN)) + return ERR_PTR(-EPERM); - spin_lock(&dev_map_lock); - list_del_rcu(&dtab->list); - spin_unlock(&dev_map_lock); + dtab = kzalloc(sizeof(*dtab), GFP_USER); + if (!dtab) + return ERR_PTR(-ENOMEM); - bpf_clear_redirect_map(map); + err = dev_map_init_map(dtab, attr, true); + if (err) { + kfree(dtab); + return ERR_PTR(err); + } + + mutex_lock(&dev_map_mtx); + list_add_tail_rcu(&dtab->list, &dev_map_list); + mutex_unlock(&dev_map_mtx); + + return &dtab->map; +} + +static void __dev_map_free(struct work_struct *work) +{ + struct bpf_dtab *dtab = container_of(work, struct bpf_dtab, free_work); + int i, cpu; + + /* Make sure all references to this dtab are cleared out. */ synchronize_rcu(); /* To ensure all pending flush operations have completed wait for flush @@ -192,6 +201,26 @@ static void dev_map_free(struct bpf_map *map) kfree(dtab); } +static void dev_map_free(struct bpf_map *map) +{ + struct bpf_dtab *dtab = container_of(map, struct bpf_dtab, map); + + /* At this point bpf_prog->aux->refcnt == 0 and this map->refcnt == 0, + * so the programs (can be more than one that used this map) were + * disconnected from events. Wait for outstanding critical sections in + * these programs to complete. The rcu critical section only guarantees + * no further reads against netdev_map. It does __not__ ensure pending + * flush operations (if any) are complete. + */ + + mutex_lock(&dev_map_mtx); + list_del_rcu(&dtab->list); + mutex_unlock(&dev_map_mtx); + + bpf_clear_redirect_map(map); + queue_work(dev_map_wq, &dtab->free_work); +} + static int dev_map_get_next_key(struct bpf_map *map, void *key, void *next_key) { struct bpf_dtab *dtab = container_of(map, struct bpf_dtab, map); @@ -429,12 +458,42 @@ static int dev_map_delete_elem(struct bpf_map *map, void *key) return 0; } -static int dev_map_update_elem(struct bpf_map *map, void *key, void *value, - u64 map_flags) +static struct bpf_dtab_netdev *__dev_map_alloc_node(struct net *net, + struct bpf_dtab *dtab, + u32 ifindex, + unsigned int bit) { - struct bpf_dtab *dtab = container_of(map, struct bpf_dtab, map); - struct net *net = current->nsproxy->net_ns; gfp_t gfp = GFP_ATOMIC | __GFP_NOWARN; + struct bpf_dtab_netdev *dev; + + dev = kmalloc_node(sizeof(*dev), gfp, dtab->map.numa_node); + if (!dev) + return ERR_PTR(-ENOMEM); + + dev->bulkq = __alloc_percpu_gfp(sizeof(*dev->bulkq), + sizeof(void *), gfp); + if (!dev->bulkq) { + kfree(dev); + return ERR_PTR(-ENOMEM); + } + + dev->dev = dev_get_by_index(net, ifindex); + if (!dev->dev) { + free_percpu(dev->bulkq); + kfree(dev); + return ERR_PTR(-EINVAL); + } + + dev->bit = bit; + dev->dtab = dtab; + + return dev; +} + +static int __dev_map_update_elem(struct net *net, struct bpf_map *map, + void *key, void *value, u64 map_flags) +{ + struct bpf_dtab *dtab = container_of(map, struct bpf_dtab, map); struct bpf_dtab_netdev *dev, *old_dev; u32 i = *(u32 *)key; u32 ifindex = *(u32 *)value; @@ -449,26 +508,9 @@ static int dev_map_update_elem(struct bpf_map *map, void *key, void *value, if (!ifindex) { dev = NULL; } else { - dev = kmalloc_node(sizeof(*dev), gfp, map->numa_node); - if (!dev) - return -ENOMEM; - - dev->bulkq = __alloc_percpu_gfp(sizeof(*dev->bulkq), - sizeof(void *), gfp); - if (!dev->bulkq) { - kfree(dev); - return -ENOMEM; - } - - dev->dev = dev_get_by_index(net, ifindex); - if (!dev->dev) { - free_percpu(dev->bulkq); - kfree(dev); - return -EINVAL; - } - - dev->bit = i; - dev->dtab = dtab; + dev = __dev_map_alloc_node(net, dtab, ifindex, i); + if (IS_ERR(dev)) + 
			return PTR_ERR(dev);
 	}
 
 	/* Use call_rcu() here to ensure rcu critical sections have completed
@@ -482,6 +524,13 @@ static int dev_map_update_elem(struct bpf_map *map, void *key, void *value,
 	return 0;
 }
 
+static int dev_map_update_elem(struct bpf_map *map, void *key, void *value,
+			       u64 map_flags)
+{
+	return __dev_map_update_elem(current->nsproxy->net_ns,
+				     map, key, value, map_flags);
+}
+
 const struct bpf_map_ops dev_map_ops = {
 	.map_alloc = dev_map_alloc,
 	.map_free = dev_map_free,
@@ -537,6 +586,11 @@ static int __init dev_map_init(void)
 	/* Assure tracepoint shadow struct _bpf_dtab_netdev is in sync */
 	BUILD_BUG_ON(offsetof(struct bpf_dtab_netdev, dev) !=
 		     offsetof(struct _bpf_dtab_netdev, dev));
+
+	dev_map_wq = alloc_workqueue("dev_map_wq", 0, 0);
+	if (!dev_map_wq)
+		return -ENOMEM;
+
 	register_netdevice_notifier(&dev_map_notifier);
 	return 0;
 }

From patchwork Mon Apr 8 17:05:56 2019
Subject: [PATCH net-next v4 4/6] xdp: Always use a devmap for XDP_REDIRECT to a device
From: Toke Høiland-Jørgensen
To: David Miller
Cc: netdev@vger.kernel.org, Jesper Dangaard Brouer, Daniel Borkmann, Alexei Starovoitov, Jakub Kicinski, Björn Töpel
Date: Mon, 08 Apr 2019 19:05:56 +0200
Message-ID: <155474315681.24432.16316236828243189182.stgit@alrua-x1>
In-Reply-To: <155474315642.24432.6179239576879119104.stgit@alrua-x1>

An XDP program can redirect packets between interfaces using either the xdp_redirect() helper or the xdp_redirect_map() helper. Apart from the flexibility of updating maps from userspace, the redirect_map() helper also uses the map structure to batch packets, which results in a significant (around 50%) performance boost. However, the xdp_redirect() API is simpler if one just wants to redirect to another interface, which means people tend to use this interface and then wonder why they get worse performance than expected.

This patch seeks to close this performance difference between the two APIs. It achieves this by changing xdp_redirect() to use a hidden devmap for looking up destination interfaces, thus gaining the batching benefit with no visible difference from the user API point of view.

A hidden per-namespace map is allocated when an XDP program that uses the non-map xdp_redirect() helper is first loaded. This map is populated with all available interfaces in its namespace, and kept up to date as interfaces come and go. Once allocated, the map is kept around until the namespace is removed.

The hidden map uses the ifindex as map key, which means it is limited to ifindexes smaller than the map size of 64. A later patch introduces a new map type to lift this restriction.

Performance numbers:

Before patch:
  xdp_redirect:     5426035 pkt/s
  xdp_redirect_map: 8412754 pkt/s

After patch:
  xdp_redirect:     8314702 pkt/s
  xdp_redirect_map: 8411854 pkt/s

This corresponds to a 53% increase in xdp_redirect performance, or a reduction in per-packet processing time by 64 nanoseconds.
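For reference, a sketch of the two BPF program flavors this patch equalizes (map name, section names, include path and the destination ifindex are all illustrative, not taken from the patch):

#include <linux/bpf.h>
#include "bpf_helpers.h"	/* helper declarations, as in samples/bpf */

struct bpf_map_def SEC("maps") tx_port = {
	.type = BPF_MAP_TYPE_DEVMAP,
	.key_size = sizeof(int),
	.value_size = sizeof(int),
	.max_entries = 64,
};

SEC("xdp_redirect_plain")
int xdp_plain(struct xdp_md *ctx)
{
	/* Non-map helper: with this patch it is transparently backed by
	 * the hidden per-namespace devmap, so it batches like the map
	 * variant. Destination ifindex 5 is illustrative. */
	return bpf_redirect(5, 0);
}

SEC("xdp_redirect_map")
int xdp_with_map(struct xdp_md *ctx)
{
	/* Map variant: key 0 must have been populated from userspace
	 * with a destination ifindex. */
	return bpf_redirect_map(&tx_port, 0, 0);
}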
Signed-off-by: Toke Høiland-Jørgensen --- include/linux/bpf.h | 35 ++++++ include/linux/filter.h | 2 include/net/net_namespace.h | 2 include/net/netns/xdp.h | 11 ++ kernel/bpf/devmap.c | 264 +++++++++++++++++++++++++++++++++++++++++++ kernel/bpf/syscall.c | 3 kernel/bpf/verifier.c | 13 ++ net/core/dev.c | 65 ++++++++++- net/core/filter.c | 58 --------- 9 files changed, 395 insertions(+), 58 deletions(-) diff --git a/include/linux/bpf.h b/include/linux/bpf.h index f62897198844..c73ff0ea1bf4 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -26,6 +26,7 @@ struct sock; struct seq_file; struct btf; struct btf_type; +struct net; /* map is generic key/value storage optionally accesible by eBPF programs */ struct bpf_map_ops { @@ -541,6 +542,7 @@ extern const struct bpf_verifier_ops tc_cls_act_analyzer_ops; extern const struct bpf_verifier_ops xdp_analyzer_ops; struct bpf_prog *bpf_prog_get(u32 ufd); +struct bpf_prog *bpf_prog_get_by_id(u32 id); struct bpf_prog *bpf_prog_get_type_dev(u32 ufd, enum bpf_prog_type type, bool attach_drv); struct bpf_prog * __must_check bpf_prog_add(struct bpf_prog *prog, int i); @@ -621,6 +623,11 @@ struct xdp_buff; struct sk_buff; struct bpf_dtab_netdev *__dev_map_lookup_elem(struct bpf_map *map, u32 key); +struct bpf_map *__dev_map_get_default_map(struct net_device *dev); +int dev_map_ensure_default_map(struct net *net); +void dev_map_put_default_map(struct net *net); +int dev_map_inc_redirect_use_count(void); +void dev_map_dec_redirect_use_count(void); void __dev_map_insert_ctx(struct bpf_map *map, u32 index); void __dev_map_flush(struct bpf_map *map); int dev_map_enqueue(struct bpf_dtab_netdev *dst, struct xdp_buff *xdp, @@ -650,6 +657,11 @@ static inline struct bpf_prog *bpf_prog_get(u32 ufd) return ERR_PTR(-EOPNOTSUPP); } +static inline struct bpf_prog *bpf_prog_get_by_id(u32 id) +{ + return ERR_PTR(-EOPNOTSUPP); +} + static inline struct bpf_prog *bpf_prog_get_type_dev(u32 ufd, enum bpf_prog_type type, bool attach_drv) @@ -702,6 +714,29 @@ static inline struct net_device *__dev_map_lookup_elem(struct bpf_map *map, return NULL; } +static inline struct bpf_map *__dev_map_get_default_map(struct net_device *dev) +{ + return NULL; +} + +static inline int dev_map_ensure_default_map(struct net *net) +{ + return 0; +} + +static inline void dev_map_put_default_map(struct net *net) +{ +} + +static inline int dev_map_inc_redirect_use_count(void) +{ + return 0; +} + +static inline void dev_map_dec_redirect_use_count(void) +{ +} + static inline void __dev_map_insert_ctx(struct bpf_map *map, u32 index) { } diff --git a/include/linux/filter.h b/include/linux/filter.h index 6074aa064b54..df6dbf86daf6 100644 --- a/include/linux/filter.h +++ b/include/linux/filter.h @@ -507,6 +507,8 @@ struct bpf_prog { gpl_compatible:1, /* Is filter GPL compatible? */ cb_access:1, /* Is control block accessed? */ dst_needed:1, /* Do we need dst entry? */ + redirect_needed:1, /* Does program need access to xdp_redirect? */ + redirect_used:1, /* Does program use xdp_redirect? */ blinded:1, /* Was blinded */ is_func:1, /* program is a bpf function */ kprobe_override:1, /* Do we override a kprobe? 
*/ diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h index a68ced28d8f4..6706ecc25d8f 100644 --- a/include/net/net_namespace.h +++ b/include/net/net_namespace.h @@ -162,7 +162,7 @@ struct net { #if IS_ENABLED(CONFIG_CAN) struct netns_can can; #endif -#ifdef CONFIG_XDP_SOCKETS +#ifdef CONFIG_BPF_SYSCALL struct netns_xdp xdp; #endif struct sock *diag_nlsk; diff --git a/include/net/netns/xdp.h b/include/net/netns/xdp.h index e5734261ba0a..4d0ac1606175 100644 --- a/include/net/netns/xdp.h +++ b/include/net/netns/xdp.h @@ -4,10 +4,21 @@ #include #include +#include + +struct bpf_dtab; + +struct bpf_dtab_container { + struct bpf_dtab __rcu *dtab; + atomic_t use_cnt; +}; struct netns_xdp { +#ifdef CONFIG_XDP_SOCKETS struct mutex lock; struct hlist_head list; +#endif + struct bpf_dtab_container default_map; }; #endif /* __NETNS_XDP_H__ */ diff --git a/kernel/bpf/devmap.c b/kernel/bpf/devmap.c index 92393b283b87..5f0b517bde21 100644 --- a/kernel/bpf/devmap.c +++ b/kernel/bpf/devmap.c @@ -52,11 +52,16 @@ #include #include #include +#include +#include #define DEV_CREATE_FLAG_MASK \ (BPF_F_NUMA_NODE | BPF_F_RDONLY | BPF_F_WRONLY) #define DEV_MAP_BULK_SIZE 16 +#define DEV_MAP_DEFAULT_SIZE 8 +#define DEV_MAP_MAX_USE_CNT 32768 + struct xdp_bulk_queue { struct xdp_frame *q[DEV_MAP_BULK_SIZE]; struct net_device *dev_rx; @@ -81,6 +86,7 @@ struct bpf_dtab { static DEFINE_MUTEX(dev_map_mtx); static LIST_HEAD(dev_map_list); +static atomic_t global_redirect_use_cnt = ATOMIC_INIT(0); static struct workqueue_struct *dev_map_wq; static void __dev_map_free(struct work_struct *work); @@ -340,6 +346,19 @@ struct bpf_dtab_netdev *__dev_map_lookup_elem(struct bpf_map *map, u32 key) return obj; } +/* This is only being called from xdp_do_redirect() if the xdp_redirect helper + * is used; the default map is allocated on XDP program load if the helper is + * used, so will always be available at this point. + */ +struct bpf_map *__dev_map_get_default_map(struct net_device *dev) +{ + struct net *net = dev_net(dev); + struct bpf_dtab *dtab; + + dtab = rcu_dereference(net->xdp.default_map.dtab); + return &dtab->map; +} + /* Runs under RCU-read-side, plus in softirq under NAPI protection. * Thus, safe percpu variable access. 
*/ @@ -541,14 +560,212 @@ const struct bpf_map_ops dev_map_ops = { .map_check_btf = map_check_no_btf, }; +static inline struct net *bpf_default_map_to_net(struct bpf_dtab_container *cont) +{ + struct netns_xdp *xdp = container_of(cont, struct netns_xdp, default_map); + + return container_of(xdp, struct net, xdp); +} + +static void __dev_map_release_default_map(struct bpf_dtab_container *cont) +{ + struct bpf_dtab *dtab = NULL; + + lockdep_assert_held(&dev_map_mtx); + + dtab = rcu_dereference(cont->dtab); + if (dtab) { + list_del_rcu(&dtab->list); + rcu_assign_pointer(cont->dtab, NULL); + bpf_clear_redirect_map(&dtab->map); + queue_work(dev_map_wq, &dtab->free_work); + } +} + +void dev_map_put_default_map(struct net *net) +{ + mutex_lock(&dev_map_mtx); + if (atomic_dec_and_test(&net->xdp.default_map.use_cnt)) { + __dev_map_release_default_map(&net->xdp.default_map); + } + mutex_unlock(&dev_map_mtx); +} + +static int __init_default_map(struct bpf_dtab_container *cont) +{ + struct net *net = bpf_default_map_to_net(cont); + struct bpf_dtab *dtab, *old_dtab; + int size = DEV_MAP_DEFAULT_SIZE; + struct net_device *netdev; + union bpf_attr attr = {}; + u32 idx; + int err; + + lockdep_assert_held(&dev_map_mtx); + + if (!atomic_read(&global_redirect_use_cnt)) + return 0; + + for_each_netdev(net, netdev) + if (netdev->ifindex >= size) + size <<= 1; + + old_dtab = rcu_dereference(cont->dtab); + if (old_dtab && old_dtab->map.max_entries == size) + return 0; + + dtab = kzalloc(sizeof(*dtab), GFP_USER); + if (!dtab) + return -ENOMEM; + + attr.map_type = BPF_MAP_TYPE_DEVMAP; + attr.max_entries = size; + attr.value_size = 4; + attr.key_size = 4; + + err = dev_map_init_map(dtab, &attr, false); + if (err) { + kfree(dtab); + return err; + } + + for_each_netdev(net, netdev) { + idx = netdev->ifindex; + err = __dev_map_update_elem(net, &dtab->map, &idx, &idx, 0); + if (err) { + queue_work(dev_map_wq, &dtab->free_work); + return err; + } + } + + rcu_assign_pointer(cont->dtab, dtab); + list_add_tail_rcu(&dtab->list, &dev_map_list); + + if (old_dtab) { + list_del_rcu(&old_dtab->list); + bpf_clear_redirect_map(&old_dtab->map); + queue_work(dev_map_wq, &old_dtab->free_work); + } + + return 0; +} + +static int maybe_inc_use_cnt(atomic_t *v) +{ + int use_cnt; + + use_cnt = atomic_inc_return(v); + if (use_cnt > DEV_MAP_MAX_USE_CNT) { + atomic_dec(v); + return -EBUSY; + } + + return use_cnt; +} + +int dev_map_ensure_default_map(struct net *net) +{ + int use_cnt, err = 0; + + mutex_lock(&dev_map_mtx); + use_cnt = maybe_inc_use_cnt(&net->xdp.default_map.use_cnt); + if (use_cnt < 0) { + err = use_cnt; + goto out; + } + + if (use_cnt == 1) + err = __init_default_map(&net->xdp.default_map); + +out: + mutex_unlock(&dev_map_mtx); + return err; +} + +static void __dev_map_dec_redirect_count(void) +{ + struct net *net; + + lockdep_assert_held(&dev_map_mtx); + + if (atomic_dec_and_test(&global_redirect_use_cnt)) + for_each_net_rcu(net) + __dev_map_release_default_map(&net->xdp.default_map); +} + +void dev_map_dec_redirect_use_count(void) +{ + mutex_lock(&dev_map_mtx); + __dev_map_dec_redirect_count(); + mutex_unlock(&dev_map_mtx); +} + +static int __dev_map_init_redirect_use(void) +{ + struct net *net; + int err; + + lockdep_assert_held(&dev_map_mtx); + + for_each_net_rcu(net) { + if (atomic_read(&net->xdp.default_map.use_cnt)) { + err = __init_default_map(&net->xdp.default_map); + if (err) + return err; + } + } + + return 0; +} + +int dev_map_inc_redirect_use_count(void) +{ + int use_cnt, err = 0; + + mutex_lock(&dev_map_mtx); 
+ use_cnt = maybe_inc_use_cnt(&global_redirect_use_cnt); + if (use_cnt < 0) { + err = use_cnt; + goto out; + } + + if (use_cnt == 1) + err = __dev_map_init_redirect_use(); + + if (err) + __dev_map_dec_redirect_count(); + + out: + mutex_unlock(&dev_map_mtx); + return err; +} + static int dev_map_notification(struct notifier_block *notifier, ulong event, void *ptr) { struct net_device *netdev = netdev_notifier_info_to_dev(ptr); + struct net *net = dev_net(netdev); + u32 idx = netdev->ifindex; struct bpf_dtab *dtab; - int i; + int i, err; switch (event) { + case NETDEV_REGISTER: + rcu_read_lock(); + dtab = rcu_dereference(net->xdp.default_map.dtab); + if (dtab) { + err = __dev_map_update_elem(net, &dtab->map, + &idx, &idx, 0); + if (err == -E2BIG) { + mutex_lock(&dev_map_mtx); + err = __init_default_map(&net->xdp.default_map); + if (err) + net_warn_ratelimited("Unable to re-allocate default map, xdp_redirect() may fail on some ifindexes\n"); + mutex_unlock(&dev_map_mtx); + } + } + rcu_read_unlock(); + break; case NETDEV_UNREGISTER: /* This rcu_read_lock/unlock pair is needed because * dev_map_list is an RCU list AND to ensure a delete @@ -581,8 +798,46 @@ static struct notifier_block dev_map_notifier = { .notifier_call = dev_map_notification, }; +#ifdef CONFIG_PROC_FS +static int dev_map_default_show(struct seq_file *seq, void *v) +{ + struct net *net = (struct net *)seq->private; + struct bpf_dtab *dtab; + + dtab = rcu_dereference(net->xdp.default_map.dtab); + seq_printf(seq, "%d %d\n", + atomic_read(&net->xdp.default_map.use_cnt), + dtab ? 1 : 0); + return 0; +} +#endif /* CONFIG_PROC_FS */ + +static int __net_init dev_map_net_init(struct net *net) +{ +#ifdef CONFIG_PROC_FS + proc_create_net_single("default_dev_map", 0444, net->proc_net, + dev_map_default_show, NULL); +#endif + return 0; +} + +static void __net_exit dev_map_net_exit(struct net *net) +{ +#ifdef CONFIG_PROC_FS + remove_proc_entry("default_dev_map", net->proc_net); +#endif +} + + +static struct pernet_operations dev_map_net_ops = { + .init = dev_map_net_init, + .exit = dev_map_net_exit, +}; + static int __init dev_map_init(void) { + int ret; + /* Assure tracepoint shadow struct _bpf_dtab_netdev is in sync */ BUILD_BUG_ON(offsetof(struct bpf_dtab_netdev, dev) != offsetof(struct _bpf_dtab_netdev, dev)); @@ -591,8 +846,15 @@ static int __init dev_map_init(void) if (!dev_map_wq) return -ENOMEM; + ret = register_pernet_subsys(&dev_map_net_ops); + if (ret) { + destroy_workqueue(dev_map_wq); + return ret; + } + register_netdevice_notifier(&dev_map_notifier); return 0; } + subsys_initcall(dev_map_init); diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index afca36f53c49..2b1e691b9d7e 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -1274,6 +1274,9 @@ static void __bpf_prog_put(struct bpf_prog *prog, bool do_idr_lock) kvfree(prog->aux->func_info); bpf_prog_free_linfo(prog); + if (prog->redirect_used) + dev_map_dec_redirect_use_count(); + call_rcu(&prog->aux->rcu, __bpf_prog_put_rcu); } } diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 2fe89138309a..7f2c01911134 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -7646,6 +7646,18 @@ static int fixup_bpf_calls(struct bpf_verifier_env *env) prog->dst_needed = 1; if (insn->imm == BPF_FUNC_get_prandom_u32) bpf_user_rnd_init_once(); + if (insn->imm == BPF_FUNC_redirect) { + prog->redirect_needed = 1; + if (!prog->redirect_used) { + int err; + + err = dev_map_inc_redirect_use_count(); + if (err) + return err; + prog->redirect_used = 1; + } 
+ } + if (insn->imm == BPF_FUNC_override_return) prog->kprobe_override = 1; if (insn->imm == BPF_FUNC_tail_call) { @@ -7655,6 +7667,7 @@ static int fixup_bpf_calls(struct bpf_verifier_env *env) * the program array. */ prog->cb_access = 1; + prog->redirect_needed = 1; env->prog->aux->stack_depth = MAX_BPF_STACK; env->prog->aux->max_pkt_offset = MAX_PACKET_OFF; diff --git a/net/core/dev.c b/net/core/dev.c index feafc3580350..3930777a5c6f 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -7992,6 +7992,23 @@ u32 __dev_xdp_query(struct net_device *dev, unsigned int target) return 0; } +static struct bpf_prog *dev_xdp_get_prog(struct net_device *dev, + unsigned int target) +{ + struct bpf_prog *prog; + + if (WARN_ON(!(target == XDP_FLAGS_DRV_MODE || + target == XDP_FLAGS_SKB_MODE)) + || target != dev->xdp_target) + return NULL; + + prog = rtnl_dereference(dev->xdp_prog); + if (prog) + prog = bpf_prog_inc_not_zero(prog); + + return prog; +} + static int dev_xdp_install(struct net_device *dev, unsigned int target, struct netlink_ext_ack *extack, u32 flags, struct bpf_prog *prog) @@ -8045,7 +8062,8 @@ static int dev_xdp_install(struct net_device *dev, unsigned int target, static void dev_xdp_uninstall(struct net_device *dev) { - struct netdev_bpf xdp; + struct bpf_prog *prog; + struct netdev_bpf xdp; bpf_op_t ndo_bpf; /* Remove generic/native XDP */ @@ -8056,6 +8074,14 @@ static void dev_xdp_uninstall(struct net_device *dev) if (!ndo_bpf) return; + prog = dev_xdp_get_prog(dev, XDP_FLAGS_DRV_MODE); + if (prog) { + if (prog->redirect_needed) + dev_map_put_default_map(dev_net(dev)); + bpf_prog_put(prog); + } + + /* Remove HW offload */ memset(&xdp, 0, sizeof(xdp)); xdp.command = XDP_QUERY_PROG_HW; if (!ndo_bpf(dev, &xdp) && xdp.prog_id) @@ -8087,6 +8113,7 @@ int dev_change_xdp_fd(struct net_device *dev, struct netlink_ext_ack *extack, const struct net_device_ops *ops = dev->netdev_ops; bool offload, drv = !!ops->ndo_bpf; struct bpf_prog *prog = NULL; + int default_map_needed = 0; unsigned int target; int err; @@ -8107,6 +8134,16 @@ int dev_change_xdp_fd(struct net_device *dev, struct netlink_ext_ack *extack, return -EEXIST; } + if (target == XDP_FLAGS_DRV_MODE) { + struct bpf_prog *old_prog = dev_xdp_get_prog(dev, target); + + if (old_prog) { + if (old_prog->redirect_needed) + default_map_needed--; + bpf_prog_put(old_prog); + } + } + if (fd >= 0) { if ((flags & XDP_FLAGS_UPDATE_IF_NOEXIST) && __dev_xdp_query(dev, target)) { @@ -8123,11 +8160,23 @@ int dev_change_xdp_fd(struct net_device *dev, struct netlink_ext_ack *extack, bpf_prog_put(prog); return -EINVAL; } + + if (target == XDP_FLAGS_DRV_MODE && prog->redirect_needed && + ++default_map_needed > 0) { + err = dev_map_ensure_default_map(dev_net(dev)); + if (err) { + NL_SET_ERR_MSG(extack, "unable to allocate default map for xdp_redirect()"); + return err; + } + } } err = dev_xdp_install(dev, target, extack, flags, prog); if (err < 0 && prog) bpf_prog_put(prog); + else if (!err && default_map_needed < 0) + dev_map_put_default_map(dev_net(dev)); + return err; } @@ -9365,6 +9414,7 @@ EXPORT_SYMBOL(unregister_netdev); int dev_change_net_namespace(struct net_device *dev, struct net *net, const char *pat) { int err, new_nsid, new_ifindex; + struct bpf_prog *prog = NULL; ASSERT_RTNL(); @@ -9382,6 +9432,13 @@ int dev_change_net_namespace(struct net_device *dev, struct net *net, const char if (net_eq(dev_net(dev), net)) goto out; + prog = dev_xdp_get_prog(dev, XDP_FLAGS_DRV_MODE); + if (prog && prog->redirect_needed) { + err = 
dev_map_ensure_default_map(net); + if (err) + goto out; + } + /* Pick the destination device name, and ensure * we can use it in the destination network namespace. */ @@ -9420,6 +9477,9 @@ int dev_change_net_namespace(struct net_device *dev, struct net *net, const char call_netdevice_notifiers(NETDEV_UNREGISTER, dev); rcu_barrier(); + if (prog && prog->redirect_needed) + dev_map_put_default_map(dev_net(dev)); + new_nsid = peernet2id_alloc(dev_net(dev), net); /* If there is an ifindex conflict assign a new one */ if (__dev_get_by_index(net, dev->ifindex)) @@ -9467,6 +9527,9 @@ int dev_change_net_namespace(struct net_device *dev, struct net *net, const char synchronize_net(); err = 0; out: + if (prog) + bpf_prog_put(prog); + return err; } EXPORT_SYMBOL_GPL(dev_change_net_namespace); diff --git a/net/core/filter.c b/net/core/filter.c index cdaafa3322db..c062f18f1492 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -3410,58 +3410,6 @@ static const struct bpf_func_proto bpf_xdp_adjust_meta_proto = { .arg2_type = ARG_ANYTHING, }; -static int __bpf_tx_xdp(struct net_device *dev, - struct bpf_map *map, - struct xdp_buff *xdp, - u32 index) -{ - struct xdp_frame *xdpf; - int err, sent; - - if (!dev->netdev_ops->ndo_xdp_xmit) { - return -EOPNOTSUPP; - } - - err = xdp_ok_fwd_dev(dev, xdp->data_end - xdp->data); - if (unlikely(err)) - return err; - - xdpf = convert_to_xdp_frame(xdp); - if (unlikely(!xdpf)) - return -EOVERFLOW; - - sent = dev->netdev_ops->ndo_xdp_xmit(dev, 1, &xdpf, XDP_XMIT_FLUSH); - if (sent <= 0) - return sent; - return 0; -} - -static noinline int -xdp_do_redirect_slow(struct net_device *dev, struct xdp_buff *xdp, - struct bpf_prog *xdp_prog, struct bpf_redirect_info *ri) -{ - struct net_device *fwd; - u32 index = ri->ifindex; - int err; - - fwd = dev_get_by_index_rcu(dev_net(dev), index); - ri->ifindex = 0; - if (unlikely(!fwd)) { - err = -EINVAL; - goto err; - } - - err = __bpf_tx_xdp(fwd, NULL, xdp, 0); - if (unlikely(err)) - goto err; - - _trace_xdp_redirect(dev, xdp_prog, index); - return 0; -err: - _trace_xdp_redirect_err(dev, xdp_prog, index, err); - return err; -} - static int __bpf_tx_xdp_map(struct net_device *dev_rx, void *fwd, struct bpf_map *map, struct xdp_buff *xdp, @@ -3592,10 +3540,10 @@ int xdp_do_redirect(struct net_device *dev, struct xdp_buff *xdp, struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info); struct bpf_map *map = READ_ONCE(ri->map); - if (likely(map)) - return xdp_do_redirect_map(dev, xdp, xdp_prog, map, ri); + if (unlikely(!map)) + map = __dev_map_get_default_map(dev); - return xdp_do_redirect_slow(dev, xdp, xdp_prog, ri); + return xdp_do_redirect_map(dev, xdp, xdp_prog, map, ri); } EXPORT_SYMBOL_GPL(xdp_do_redirect); From patchwork Mon Apr 8 17:05:56 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Toke_H=C3=B8iland-J=C3=B8rgensen?= X-Patchwork-Id: 1081293 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=redhat.com Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 44dGz75gqYz9sR7 for ; Tue, 9 Apr 2019 03:06:11 +1000 (AEST) 
Subject: [PATCH net-next v4 5/6] xdp: Add devmap_idx map type for looking up devices by ifindex
From: Toke Høiland-Jørgensen
To: David Miller
Cc: netdev@vger.kernel.org, Jesper Dangaard Brouer, Daniel Borkmann, Alexei Starovoitov, Jakub Kicinski, Björn Töpel
Date: Mon, 08 Apr 2019 19:05:56 +0200
Message-ID: <155474315690.24432.16404237076945293859.stgit@alrua-x1>
In-Reply-To: <155474315642.24432.6179239576879119104.stgit@alrua-x1>
References: <155474315642.24432.6179239576879119104.stgit@alrua-x1>

A common pattern when using xdp_redirect_map() is to create a device map where the lookup key is simply the ifindex. Because device maps are arrays, this leaves holes in the map, and the map has to be sized to fit the largest ifindex, regardless of how many devices are actually needed in the map.

This patch adds a second type of device map where the key is interpreted as an ifindex and looked up using a hashmap, instead of being used as an array index. This leads to maps being densely packed, so they can be smaller. The default maps used by xdp_redirect() are changed to use the new map type, which means that xdp_redirect() is no longer limited to ifindex < 64, but instead to 64 total simultaneous interfaces per network namespace.

This also provides an easy way to compare the performance of devmap and devmap_idx:

xdp_redirect_map (devmap):  8394560 pkt/s
xdp_redirect (devmap_idx):  8179480 pkt/s
Difference: 215080 pkt/s, or 3.1 nanoseconds per packet.
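To make the new map type's key semantics concrete: both the key and the value must be the interface's ifindex (the update path rejects key != value). Below is a minimal userspace sketch of creating and populating such a map, assuming the bpf_create_map()/bpf_map_update_elem() wrappers from tools/lib/bpf that the selftests in this series also use; the helper name is hypothetical and error handling is abbreviated:

/* Hypothetical sketch: create a devmap_idx map and insert one interface,
 * keyed by its ifindex. Key and value must both be the ifindex, mirroring
 * the idx == val check in __dev_map_idx_update_elem() below.
 */
#include <errno.h>
#include <stdio.h>
#include <unistd.h>
#include <net/if.h>
#include <linux/bpf.h>	/* BPF_MAP_TYPE_DEVMAP_IDX (after this patch) */
#include <bpf/bpf.h>	/* bpf_create_map(), bpf_map_update_elem() */

int add_iface_to_devmap_idx(const char *ifname)
{
	__u32 ifindex = if_nametoindex(ifname);
	int fd;

	if (!ifindex)
		return -errno;

	/* key_size == value_size == 4, as for regular devmaps */
	fd = bpf_create_map(BPF_MAP_TYPE_DEVMAP_IDX, sizeof(__u32),
			    sizeof(__u32), 64, 0);
	if (fd < 0)
		return fd;

	/* key and value are both the ifindex; anything else is rejected */
	if (bpf_map_update_elem(fd, &ifindex, &ifindex, 0)) {
		perror("bpf_map_update_elem");
		close(fd);
		return -1;
	}

	return fd;
}

An XDP program can then pass this map to bpf_redirect_map() with an ifindex as the key; the lookup hashes the ifindex instead of indexing an array, so the map only needs to be sized for the interfaces actually inserted.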
Signed-off-by: Toke Høiland-Jørgensen --- include/linux/bpf.h | 11 + include/linux/bpf_types.h | 1 include/trace/events/xdp.h | 3 include/uapi/linux/bpf.h | 1 kernel/bpf/devmap.c | 232 ++++++++++++++++++++++++++++++- kernel/bpf/verifier.c | 2 net/core/filter.c | 11 + tools/bpf/bpftool/map.c | 1 tools/include/uapi/linux/bpf.h | 1 tools/lib/bpf/libbpf_probes.c | 1 tools/testing/selftests/bpf/test_maps.c | 16 ++ 11 files changed, 264 insertions(+), 16 deletions(-) diff --git a/include/linux/bpf.h b/include/linux/bpf.h index c73ff0ea1bf4..24457da7c6fc 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -623,12 +623,13 @@ struct xdp_buff; struct sk_buff; struct bpf_dtab_netdev *__dev_map_lookup_elem(struct bpf_map *map, u32 key); +struct bpf_dtab_netdev *__dev_map_idx_lookup_elem(struct bpf_map *map, u32 key); struct bpf_map *__dev_map_get_default_map(struct net_device *dev); int dev_map_ensure_default_map(struct net *net); void dev_map_put_default_map(struct net *net); int dev_map_inc_redirect_use_count(void); void dev_map_dec_redirect_use_count(void); -void __dev_map_insert_ctx(struct bpf_map *map, u32 index); +void __dev_map_insert_ctx(struct bpf_map *map, struct bpf_dtab_netdev *dst); void __dev_map_flush(struct bpf_map *map); int dev_map_enqueue(struct bpf_dtab_netdev *dst, struct xdp_buff *xdp, struct net_device *dev_rx); @@ -714,6 +715,12 @@ static inline struct net_device *__dev_map_lookup_elem(struct bpf_map *map, return NULL; } +static inline struct net_device *__dev_map_idx_lookup_elem(struct bpf_map *map, + u32 key) +{ + return NULL; +} + static inline struct bpf_map *__dev_map_get_default_map(struct net_device *dev) { return NULL; @@ -737,7 +744,7 @@ static inline void dev_map_dec_redirect_use_count(void) { } -static inline void __dev_map_insert_ctx(struct bpf_map *map, u32 index) +static inline void __dev_map_insert_ctx(struct bpf_map *map, void *dst) { } diff --git a/include/linux/bpf_types.h b/include/linux/bpf_types.h index 08bf2f1fe553..374c013ca243 100644 --- a/include/linux/bpf_types.h +++ b/include/linux/bpf_types.h @@ -59,6 +59,7 @@ BPF_MAP_TYPE(BPF_MAP_TYPE_ARRAY_OF_MAPS, array_of_maps_map_ops) BPF_MAP_TYPE(BPF_MAP_TYPE_HASH_OF_MAPS, htab_of_maps_map_ops) #ifdef CONFIG_NET BPF_MAP_TYPE(BPF_MAP_TYPE_DEVMAP, dev_map_ops) +BPF_MAP_TYPE(BPF_MAP_TYPE_DEVMAP_IDX, dev_map_idx_ops) #if defined(CONFIG_BPF_STREAM_PARSER) BPF_MAP_TYPE(BPF_MAP_TYPE_SOCKMAP, sock_map_ops) BPF_MAP_TYPE(BPF_MAP_TYPE_SOCKHASH, sock_hash_ops) diff --git a/include/trace/events/xdp.h b/include/trace/events/xdp.h index e95cb86b65cf..fcf006d49f67 100644 --- a/include/trace/events/xdp.h +++ b/include/trace/events/xdp.h @@ -147,7 +147,8 @@ struct _bpf_dtab_netdev { #define devmap_ifindex(fwd, map) \ (!fwd ? 0 : \ - ((map->map_type == BPF_MAP_TYPE_DEVMAP) ? \ + ((map->map_type == BPF_MAP_TYPE_DEVMAP || \ + map->map_type == BPF_MAP_TYPE_DEVMAP_IDX) ? 
\ ((struct _bpf_dtab_netdev *)fwd)->dev->ifindex : 0)) #define _trace_xdp_redirect_map(dev, xdp, fwd, map, idx) \ diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index 837024512baf..84ef3b4355b0 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -132,6 +132,7 @@ enum bpf_map_type { BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE, BPF_MAP_TYPE_QUEUE, BPF_MAP_TYPE_STACK, + BPF_MAP_TYPE_DEVMAP_IDX, }; /* Note that tracing related programs such as diff --git a/kernel/bpf/devmap.c b/kernel/bpf/devmap.c index 5f0b517bde21..eaa56cd7adfe 100644 --- a/kernel/bpf/devmap.c +++ b/kernel/bpf/devmap.c @@ -46,6 +46,12 @@ * notifier hook walks the map we know that new dev references can not be * added by the user because core infrastructure ensures dev_get_by_index() * calls will fail at this point. + * + * The devmap_idx type is a map type which interprets keys as ifindexes and + * indexes these using a hashmap. This allows maps that use ifindex as key to be + * densely packed instead of having holes in the lookup array for unused + * ifindexes. The setup and packet enqueue/send code is shared between the two + * types of devmap; only the lookup and insertion is different. */ #include #include @@ -70,6 +76,8 @@ struct xdp_bulk_queue { struct bpf_dtab_netdev { struct net_device *dev; /* must be first member, due to tracepoint */ + unsigned int ifindex; + struct hlist_node index_hlist; struct bpf_dtab *dtab; unsigned int bit; struct xdp_bulk_queue __percpu *bulkq; @@ -82,6 +90,11 @@ struct bpf_dtab { unsigned long __percpu *flush_needed; struct list_head list; struct work_struct free_work; + + /* these are only used for DEVMAP_IDX type maps */ + unsigned long *bits_used; + struct hlist_head *dev_index_head; + spinlock_t index_lock; }; static DEFINE_MUTEX(dev_map_mtx); @@ -91,6 +104,19 @@ static atomic_t global_redirect_use_cnt = ATOMIC_INIT(0); static struct workqueue_struct *dev_map_wq; static void __dev_map_free(struct work_struct *work); +static struct hlist_head *dev_map_create_hash(void) +{ + int i; + struct hlist_head *hash; + + hash = kmalloc_array(NETDEV_HASHENTRIES, sizeof(*hash), GFP_KERNEL); + if (hash != NULL) + for (i = 0; i < NETDEV_HASHENTRIES; i++) + INIT_HLIST_HEAD(&hash[i]); + + return hash; +} + static u64 dev_map_bitmap_size(const union bpf_attr *attr) { return BITS_TO_LONGS((u64) attr->max_entries) * sizeof(unsigned long); @@ -112,6 +138,11 @@ static int dev_map_init_map(struct bpf_dtab *dtab, union bpf_attr *attr, /* make sure page count doesn't overflow */ cost = (u64) dtab->map.max_entries * sizeof(struct bpf_dtab_netdev *); cost += dev_map_bitmap_size(attr) * num_possible_cpus(); + + if (attr->map_type == BPF_MAP_TYPE_DEVMAP_IDX) + cost += dev_map_bitmap_size(attr) + + sizeof(struct hlist_head) * NETDEV_HASHENTRIES; + if (cost >= U32_MAX - PAGE_SIZE) return -EINVAL; @@ -139,8 +170,25 @@ static int dev_map_init_map(struct bpf_dtab *dtab, union bpf_attr *attr, INIT_WORK(&dtab->free_work, __dev_map_free); + if (attr->map_type == BPF_MAP_TYPE_DEVMAP_IDX) { + dtab->bits_used = kzalloc(dev_map_bitmap_size(attr), + GFP_KERNEL); + if (!dtab->bits_used) + goto free_map_area; + + dtab->dev_index_head = dev_map_create_hash(); + if (!dtab->dev_index_head) + goto free_bitmap; + + spin_lock_init(&dtab->index_lock); + } + return 0; +free_bitmap: + kfree(dtab->bits_used); +free_map_area: + bpf_map_area_free(dtab->netdev_map); free_map: free_percpu(dtab->flush_needed); return -ENOMEM; @@ -202,6 +250,8 @@ static void __dev_map_free(struct work_struct *work) kfree(dev); } + 
kfree(dtab->dev_index_head); + kfree(dtab->bits_used); free_percpu(dtab->flush_needed); bpf_map_area_free(dtab->netdev_map); kfree(dtab); @@ -244,12 +294,76 @@ static int dev_map_get_next_key(struct bpf_map *map, void *key, void *next_key) return 0; } -void __dev_map_insert_ctx(struct bpf_map *map, u32 bit) +static inline struct hlist_head *dev_map_index_hash(struct bpf_dtab *dtab, + int ifindex) +{ + return &dtab->dev_index_head[ifindex & (NETDEV_HASHENTRIES - 1)]; +} + +struct bpf_dtab_netdev *__dev_map_idx_lookup_elem(struct bpf_map *map, u32 key) +{ + struct bpf_dtab *dtab = container_of(map, struct bpf_dtab, map); + struct hlist_head *head = dev_map_index_hash(dtab, key); + struct bpf_dtab_netdev *dev; + + hlist_for_each_entry_rcu(dev, head, index_hlist) + if (dev->ifindex == key) + return dev; + + return NULL; +} + +static int dev_map_idx_get_next_key(struct bpf_map *map, void *key, + void *next_key) +{ + struct bpf_dtab *dtab = container_of(map, struct bpf_dtab, map); + u32 ifindex, *next = next_key; + struct bpf_dtab_netdev *dev, *next_dev; + struct hlist_head *head; + int i = 0; + + if (!key) + goto find_first; + + ifindex = *(u32 *)key; + + dev = __dev_map_idx_lookup_elem(map, ifindex); + if (!dev) + goto find_first; + + next_dev = hlist_entry_safe(rcu_dereference_raw(hlist_next_rcu(&dev->index_hlist)), + struct bpf_dtab_netdev, index_hlist); + + if (next_dev) { + *next = next_dev->ifindex; + return 0; + } + + i = ifindex & (NETDEV_HASHENTRIES - 1); + i++; + + find_first: + for (; i < NETDEV_HASHENTRIES; i++) { + head = dev_map_index_hash(dtab, i); + + next_dev = hlist_entry_safe(rcu_dereference_raw(hlist_first_rcu(head)), + struct bpf_dtab_netdev, + index_hlist); + if (next_dev) { + *next = next_dev->ifindex; + return 0; + } + } + + return -ENOENT; +} + +void __dev_map_insert_ctx(struct bpf_map *map, struct bpf_dtab_netdev *dst) { struct bpf_dtab *dtab = container_of(map, struct bpf_dtab, map); unsigned long *bitmap = this_cpu_ptr(dtab->flush_needed); - __set_bit(bit, bitmap); + __set_bit(dst->bit, bitmap); } static int bq_xmit_all(struct bpf_dtab_netdev *obj, @@ -420,9 +534,16 @@ int dev_map_generic_redirect(struct bpf_dtab_netdev *dst, struct sk_buff *skb, static void *dev_map_lookup_elem(struct bpf_map *map, void *key) { struct bpf_dtab_netdev *obj = __dev_map_lookup_elem(map, *(u32 *)key); - struct net_device *dev = obj ? obj->dev : NULL; - return dev ? &dev->ifindex : NULL; + return obj ? &obj->ifindex : NULL; +} + +static void *dev_map_idx_lookup_elem(struct bpf_map *map, void *key) +{ + struct bpf_dtab_netdev *obj = __dev_map_idx_lookup_elem(map, + *(u32 *)key); + + return obj ? 
&obj->ifindex : NULL; } static void dev_map_flush_old(struct bpf_dtab_netdev *dev) @@ -477,6 +598,43 @@ static int dev_map_delete_elem(struct bpf_map *map, void *key) return 0; } +static int dev_map_idx_delete_elem(struct bpf_map *map, void *key) +{ + struct bpf_dtab *dtab = container_of(map, struct bpf_dtab, map); + struct bpf_dtab_netdev *old_dev; + int k = *(u32 *)key; + + old_dev = __dev_map_idx_lookup_elem(map, k); + if (!old_dev) + return 0; + + spin_lock(&dtab->index_lock); + hlist_del_rcu(&old_dev->index_hlist); + spin_unlock(&dtab->index_lock); + + xchg(&dtab->netdev_map[old_dev->bit], NULL); + clear_bit_unlock(old_dev->bit, dtab->bits_used); + call_rcu(&old_dev->rcu, __dev_map_entry_free); + return 0; +} + +static bool __dev_map_find_bit(struct bpf_dtab *dtab, unsigned int *bit) +{ + unsigned int b = 0; + + retry: + b = find_next_zero_bit(dtab->bits_used, dtab->map.max_entries, b); + + if (b >= dtab->map.max_entries) + return false; + + if (test_and_set_bit_lock(b, dtab->bits_used)) + goto retry; + + *bit = b; + return true; +} + static struct bpf_dtab_netdev *__dev_map_alloc_node(struct net *net, struct bpf_dtab *dtab, u32 ifindex, @@ -503,6 +661,7 @@ static struct bpf_dtab_netdev *__dev_map_alloc_node(struct net *net, return ERR_PTR(-EINVAL); } + dev->ifindex = dev->dev->ifindex; dev->bit = bit; dev->dtab = dtab; @@ -550,6 +709,49 @@ static int dev_map_update_elem(struct bpf_map *map, void *key, void *value, map, key, value, map_flags); } +static int __dev_map_idx_update_elem(struct net *net, struct bpf_map *map, + void *key, void *value, u64 map_flags) +{ + struct bpf_dtab *dtab = container_of(map, struct bpf_dtab, map); + struct bpf_dtab_netdev *dev, *old_dev; + u32 idx = *(u32 *)key; + u32 val = *(u32 *)value; + u32 bit; + + if (idx != val) + return -EINVAL; + if (unlikely(map_flags > BPF_EXIST)) + return -EINVAL; + + old_dev = __dev_map_idx_lookup_elem(map, idx); + if (old_dev) { + if (map_flags & BPF_NOEXIST) + return -EEXIST; + else + return 0; + } + + if (!__dev_map_find_bit(dtab, &bit)) + return -ENOSPC; + dev = __dev_map_alloc_node(net, dtab, idx, bit); + if (IS_ERR(dev)) + return PTR_ERR(dev); + + xchg(&dtab->netdev_map[bit], dev); + spin_lock(&dtab->index_lock); + hlist_add_head_rcu(&dev->index_hlist, + dev_map_index_hash(dtab, dev->ifindex)); + spin_unlock(&dtab->index_lock); + return 0; +} + +static int dev_map_idx_update_elem(struct bpf_map *map, void *key, void *value, + u64 map_flags) +{ + return __dev_map_idx_update_elem(current->nsproxy->net_ns, + map, key, value, map_flags); +} + const struct bpf_map_ops dev_map_ops = { .map_alloc = dev_map_alloc, .map_free = dev_map_free, @@ -560,6 +762,16 @@ const struct bpf_map_ops dev_map_ops = { .map_check_btf = map_check_no_btf, }; +const struct bpf_map_ops dev_map_idx_ops = { + .map_alloc = dev_map_alloc, + .map_free = dev_map_free, + .map_get_next_key = dev_map_idx_get_next_key, + .map_lookup_elem = dev_map_idx_lookup_elem, + .map_update_elem = dev_map_idx_update_elem, + .map_delete_elem = dev_map_idx_delete_elem, + .map_check_btf = map_check_no_btf, +}; + static inline struct net *bpf_default_map_to_net(struct bpf_dtab_container *cont) { struct netns_xdp *xdp = container_of(cont, struct netns_xdp, default_map); @@ -594,8 +806,8 @@ void dev_map_put_default_map(struct net *net) static int __init_default_map(struct bpf_dtab_container *cont) { struct net *net = bpf_default_map_to_net(cont); + int size = DEV_MAP_DEFAULT_SIZE, i = 0; struct bpf_dtab *dtab, *old_dtab; - int size = DEV_MAP_DEFAULT_SIZE; struct net_device 
*netdev; union bpf_attr attr = {}; u32 idx; @@ -607,7 +819,7 @@ static int __init_default_map(struct bpf_dtab_container *cont) return 0; for_each_netdev(net, netdev) - if (netdev->ifindex >= size) + if (++i >= size) size <<= 1; old_dtab = rcu_dereference(cont->dtab); @@ -618,7 +830,7 @@ static int __init_default_map(struct bpf_dtab_container *cont) if (!dtab) return -ENOMEM; - attr.map_type = BPF_MAP_TYPE_DEVMAP; + attr.map_type = BPF_MAP_TYPE_DEVMAP_IDX; attr.max_entries = size; attr.value_size = 4; attr.key_size = 4; @@ -631,7 +843,7 @@ static int __init_default_map(struct bpf_dtab_container *cont) for_each_netdev(net, netdev) { idx = netdev->ifindex; - err = __dev_map_update_elem(net, &dtab->map, &idx, &idx, 0); + err = __dev_map_idx_update_elem(net, &dtab->map, &idx, &idx, 0); if (err) { queue_work(dev_map_wq, &dtab->free_work); return err; @@ -754,8 +966,8 @@ static int dev_map_notification(struct notifier_block *notifier, rcu_read_lock(); dtab = rcu_dereference(net->xdp.default_map.dtab); if (dtab) { - err = __dev_map_update_elem(net, &dtab->map, - &idx, &idx, 0); + err = __dev_map_idx_update_elem(net, &dtab->map, + &idx, &idx, 0); if (err == -E2BIG) { mutex_lock(&dev_map_mtx); err = __init_default_map(&net->xdp.default_map); diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 7f2c01911134..5c7f3aaa5e3f 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -2572,6 +2572,7 @@ static int check_map_func_compatibility(struct bpf_verifier_env *env, * for now. */ case BPF_MAP_TYPE_DEVMAP: + case BPF_MAP_TYPE_DEVMAP_IDX: if (func_id != BPF_FUNC_redirect_map) goto error; break; @@ -2644,6 +2645,7 @@ static int check_map_func_compatibility(struct bpf_verifier_env *env, break; case BPF_FUNC_redirect_map: if (map->map_type != BPF_MAP_TYPE_DEVMAP && + map->map_type != BPF_MAP_TYPE_DEVMAP_IDX && map->map_type != BPF_MAP_TYPE_CPUMAP && map->map_type != BPF_MAP_TYPE_XSKMAP) goto error; diff --git a/net/core/filter.c b/net/core/filter.c index c062f18f1492..958b7c59dba8 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -3418,13 +3418,14 @@ static int __bpf_tx_xdp_map(struct net_device *dev_rx, void *fwd, int err; switch (map->map_type) { - case BPF_MAP_TYPE_DEVMAP: { + case BPF_MAP_TYPE_DEVMAP: + case BPF_MAP_TYPE_DEVMAP_IDX: { struct bpf_dtab_netdev *dst = fwd; err = dev_map_enqueue(dst, xdp, dev_rx); if (unlikely(err)) return err; - __dev_map_insert_ctx(map, index); + __dev_map_insert_ctx(map, dst); break; } case BPF_MAP_TYPE_CPUMAP: { @@ -3457,6 +3458,7 @@ void xdp_do_flush_map(void) if (map) { switch (map->map_type) { case BPF_MAP_TYPE_DEVMAP: + case BPF_MAP_TYPE_DEVMAP_IDX: __dev_map_flush(map); break; case BPF_MAP_TYPE_CPUMAP: @@ -3477,6 +3479,8 @@ static inline void *__xdp_map_lookup_elem(struct bpf_map *map, u32 index) switch (map->map_type) { case BPF_MAP_TYPE_DEVMAP: return __dev_map_lookup_elem(map, index); + case BPF_MAP_TYPE_DEVMAP_IDX: + return __dev_map_idx_lookup_elem(map, index); case BPF_MAP_TYPE_CPUMAP: return __cpu_map_lookup_elem(map, index); case BPF_MAP_TYPE_XSKMAP: @@ -3567,7 +3571,8 @@ static int xdp_do_generic_redirect_map(struct net_device *dev, goto err; } - if (map->map_type == BPF_MAP_TYPE_DEVMAP) { + if (map->map_type == BPF_MAP_TYPE_DEVMAP || + map->map_type == BPF_MAP_TYPE_DEVMAP_IDX) { struct bpf_dtab_netdev *dst = fwd; err = dev_map_generic_redirect(dst, skb, xdp_prog); diff --git a/tools/bpf/bpftool/map.c b/tools/bpf/bpftool/map.c index e0c650d91784..0864ce33df94 100644 --- a/tools/bpf/bpftool/map.c +++ b/tools/bpf/bpftool/map.c @@ 
-37,6 +37,7 @@ const char * const map_type_name[] = {
 	[BPF_MAP_TYPE_ARRAY_OF_MAPS]	= "array_of_maps",
 	[BPF_MAP_TYPE_HASH_OF_MAPS]	= "hash_of_maps",
 	[BPF_MAP_TYPE_DEVMAP]		= "devmap",
+	[BPF_MAP_TYPE_DEVMAP_IDX]	= "devmap_idx",
 	[BPF_MAP_TYPE_SOCKMAP]		= "sockmap",
 	[BPF_MAP_TYPE_CPUMAP]		= "cpumap",
 	[BPF_MAP_TYPE_XSKMAP]		= "xskmap",
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 837024512baf..84ef3b4355b0 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -132,6 +132,7 @@ enum bpf_map_type {
 	BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE,
 	BPF_MAP_TYPE_QUEUE,
 	BPF_MAP_TYPE_STACK,
+	BPF_MAP_TYPE_DEVMAP_IDX,
 };
 
 /* Note that tracing related programs such as
diff --git a/tools/lib/bpf/libbpf_probes.c b/tools/lib/bpf/libbpf_probes.c
index 8c3a1c04dcb2..b87b760a1355 100644
--- a/tools/lib/bpf/libbpf_probes.c
+++ b/tools/lib/bpf/libbpf_probes.c
@@ -172,6 +172,7 @@ bool bpf_probe_map_type(enum bpf_map_type map_type, __u32 ifindex)
 	case BPF_MAP_TYPE_ARRAY_OF_MAPS:
 	case BPF_MAP_TYPE_HASH_OF_MAPS:
 	case BPF_MAP_TYPE_DEVMAP:
+	case BPF_MAP_TYPE_DEVMAP_IDX:
 	case BPF_MAP_TYPE_SOCKMAP:
 	case BPF_MAP_TYPE_CPUMAP:
 	case BPF_MAP_TYPE_XSKMAP:
diff --git a/tools/testing/selftests/bpf/test_maps.c b/tools/testing/selftests/bpf/test_maps.c
index 3c627771f965..63681e4647f9 100644
--- a/tools/testing/selftests/bpf/test_maps.c
+++ b/tools/testing/selftests/bpf/test_maps.c
@@ -519,6 +519,21 @@ static void test_devmap(unsigned int task, void *data)
 	close(fd);
 }
 
+static void test_devmap_idx(unsigned int task, void *data)
+{
+	int fd;
+	__u32 key, value;
+
+	fd = bpf_create_map(BPF_MAP_TYPE_DEVMAP_IDX, sizeof(key), sizeof(value),
+			    2, 0);
+	if (fd < 0) {
+		printf("Failed to create devmap_idx '%s'!\n", strerror(errno));
+		exit(1);
+	}
+
+	close(fd);
+}
+
 static void test_queuemap(unsigned int task, void *data)
 {
 	const int MAP_SIZE = 32;
@@ -1686,6 +1701,7 @@ static void run_all_tests(void)
 	test_arraymap_percpu_many_keys();
 
 	test_devmap(0, NULL);
+	test_devmap_idx(0, NULL);
 	test_sockmap(0, NULL);
 
 	test_map_large();

From patchwork Mon Apr 8 17:05:57 2019
X-Patchwork-Submitter: Toke Høiland-Jørgensen
X-Patchwork-Id: 1081294
X-Patchwork-Delegate: davem@davemloft.net
Subject: [PATCH net-next v4 6/6] selftests/bpf: Add test for default devmap allocation
From: Toke Høiland-Jørgensen
To: David Miller
Cc: netdev@vger.kernel.org, Jesper Dangaard Brouer, Daniel Borkmann, Alexei Starovoitov, Jakub Kicinski, Björn Töpel
Date: Mon, 08 Apr 2019 19:05:57 +0200
Message-ID: <155474315697.24432.13044630044539815309.stgit@alrua-x1>
In-Reply-To: <155474315642.24432.6179239576879119104.stgit@alrua-x1>
References: <155474315642.24432.6179239576879119104.stgit@alrua-x1>

This adds a new selftest checking the allocation and de-allocation of default maps in different network namespaces. It loads the two different kinds of programs that need allocation (programs using tail calls and programs using redirect), and moves interfaces around between namespaces to make sure the default map is correctly allocated and de-allocated in each of the namespaces.
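For reference when reading the script: /proc/net/default_dev_map, added earlier in the series, prints two integers for the current namespace, the default-map use count and whether a map is currently allocated (cf. dev_map_default_show()). A rough C equivalent of the script's check_alloc helper, as a hypothetical sketch (the selftest itself does this in shell):

/* Hypothetical C sketch of the script's check_alloc: parse the
 * "use_cnt allocated" pair that dev_map_default_show() emits for the
 * current network namespace and compare it against expected values.
 */
#include <stdio.h>

static int check_alloc(int exp_use_cnt, int exp_allocated)
{
	int use_cnt = -1, allocated = -1;
	FILE *f = fopen("/proc/net/default_dev_map", "r");

	if (!f)
		return -1;
	if (fscanf(f, "%d %d", &use_cnt, &allocated) != 2)
		use_cnt = allocated = -1;
	fclose(f);

	return (use_cnt == exp_use_cnt && allocated == exp_allocated) ? 0 : -1;
}

int main(void)
{
	/* Expect one redirect user and an allocated map, the state the
	 * script checks for after loading a redirect program. */
	return check_alloc(1, 1) ? 1 : 0;
}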
Signed-off-by: Toke Høiland-Jørgensen
---
 tools/testing/selftests/bpf/Makefile               |  3 +-
 .../selftests/bpf/progs/test_xdp_tail_call.c       | 39 ++++++++
 .../testing/selftests/bpf/test_xdp_devmap_alloc.sh | 94 ++++++++++++++++++++
 3 files changed, 135 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/bpf/progs/test_xdp_tail_call.c
 create mode 100755 tools/testing/selftests/bpf/test_xdp_devmap_alloc.sh

diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index 77b73b892136..07f2f54a6a87 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -54,7 +54,8 @@ TEST_PROGS := test_kmod.sh \
 	test_lwt_ip_encap.sh \
 	test_tcp_check_syncookie.sh \
 	test_tc_tunnel.sh \
-	test_tc_edt.sh
+	test_tc_edt.sh \
+	test_xdp_devmap_alloc.sh
 
 TEST_PROGS_EXTENDED := with_addr.sh \
 	with_tunnels.sh \
diff --git a/tools/testing/selftests/bpf/progs/test_xdp_tail_call.c b/tools/testing/selftests/bpf/progs/test_xdp_tail_call.c
new file mode 100644
index 000000000000..6c89dc4ad341
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/test_xdp_tail_call.c
@@ -0,0 +1,39 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#define KBUILD_MODNAME "xdp_dummy"
+#include <linux/bpf.h>
+#include "bpf_helpers.h"
+
+struct bpf_map_def SEC("maps") jmp_table = {
+	.type = BPF_MAP_TYPE_PROG_ARRAY,
+	.key_size = sizeof(__u32),
+	.value_size = sizeof(__u32),
+	.max_entries = 8,
+};
+
+struct bpf_map_def SEC("maps") arr_map = {
+	.type = BPF_MAP_TYPE_ARRAY,
+	.key_size = sizeof(__u32),
+	.value_size = sizeof(__u32),
+	.max_entries = 1,
+};
+
+
+SEC("xdp_dummy_tail_call")
+int xdp_dummy_prog(struct xdp_md *ctx)
+{
+	long *value;
+	__u32 key = 0;
+
+	/* We just need the call instruction in the program, so it is fine that
+	 * this fails, but it should not be optimised out by the compiler (so
+	 * we can't just do if (false)).
+	 */
+	value = bpf_map_lookup_elem(&arr_map, &key);
+	if (value)
+		bpf_tail_call(ctx, &jmp_table, 1);
+
+	return XDP_PASS;
+}
+
+char _license[] SEC("license") = "GPL";
diff --git a/tools/testing/selftests/bpf/test_xdp_devmap_alloc.sh b/tools/testing/selftests/bpf/test_xdp_devmap_alloc.sh
new file mode 100755
index 000000000000..0431f94f136a
--- /dev/null
+++ b/tools/testing/selftests/bpf/test_xdp_devmap_alloc.sh
@@ -0,0 +1,94 @@
+#!/bin/bash
+
+cleanup()
+{
+	if [ "$?" = "0" ]; then
+		echo "selftests: test_xdp_devmap_alloc [PASS]";
+	else
+		echo "selftests: test_xdp_devmap_alloc [FAILED]";
+	fi
+
+	set +e
+	ip link del veth1 2> /dev/null
+	ip netns del ns1 2> /dev/null
+	ip netns del ns2 2> /dev/null
+}
+
+check_alloc()
+{
+	ns="$1"
+	expected="$2 $3"
+
+	if [[ "$ns" == "root" ]]; then
+		actual=$(< /proc/net/default_dev_map)
+	else
+		actual=$(ip netns exec "$ns" cat /proc/net/default_dev_map)
+	fi
+
+	if [[ "$expected" != "$actual" ]]; then
+		echo "Expected allocation '$expected' got '$actual'" >&2
+		exit 1
+	fi
+}
+
+ip link set dev lo xdp off 2>/dev/null > /dev/null
+if [ $? -ne 0 ]; then
+	echo "selftests: [SKIP] Could not run test without ip xdp support"
+	exit 0
+fi
+set -e
+
+ip netns add ns1
+ip netns add ns2
+
+trap cleanup 0 2 3 6 9
+
+ip link add veth1 type veth peer name veth2
+
+ip link set veth1 netns ns1
+ip link set veth2 netns ns2
+
+check_alloc ns1 0 0
+check_alloc ns2 0 0
+
+# Check that loading an xdp tail call program increases the use counter,
+# but doesn't allocate a map
+ip netns exec ns2 ip link set dev veth2 xdp obj test_xdp_tail_call.o sec xdp_dummy_tail_call
+check_alloc ns2 1 0
+
+# Check that loading a redirect program allocates a map, and
+# removing that program de-allocates the map again.
+ip netns exec ns1 ip link set dev veth1 xdp obj test_xdp_redirect.o sec redirect_to_111
+check_alloc ns1 1 1
+# Now we should have a map allocated in the other ns
+check_alloc ns2 1 1
+ip netns exec ns1 ip link set dev veth1 xdp off
+check_alloc ns1 0 0
+check_alloc ns2 1 0
+ip netns exec ns2 ip link set dev veth2 xdp off
+check_alloc ns2 0 0
+
+# Check that switching between redirect and non-redirect programs correctly
+# allocs/de-allocs the map
+ip netns exec ns1 ip link set dev veth1 xdp obj xdp_dummy.o sec xdp_dummy
+check_alloc ns1 0 0
+ip netns exec ns1 ip -force link set dev veth1 xdp obj test_xdp_redirect.o sec redirect_to_111
+check_alloc ns1 1 1
+ip netns exec ns1 ip -force link set dev veth1 xdp obj xdp_dummy.o sec xdp_dummy
+check_alloc ns1 0 0
+
+ip netns exec ns1 ip link set dev veth1 xdp off
+
+# Check that moving an interface into a namespace will allocate the map
+ip netns del ns1
+ip netns add ns1
+ip link add veth1 type veth peer name veth2
+
+ip link set dev veth1 xdp obj test_xdp_redirect.o sec redirect_to_111
+check_alloc root 1 1
+check_alloc ns1 0 0
+ip link set dev veth1 netns ns1
+check_alloc ns1 1 1
+check_alloc root 0 0
+
+exit 0