From patchwork Tue Oct 8 06:16:52 2019
X-Patchwork-Submitter: "Samudrala, Sridhar"
X-Patchwork-Id: 1173123
From: Sridhar Samudrala
To: magnus.karlsson@intel.com, bjorn.topel@intel.com, netdev@vger.kernel.org, bpf@vger.kernel.org, sridhar.samudrala@intel.com, intel-wired-lan@lists.osuosl.org, maciej.fijalkowski@intel.com, tom.herbert@intel.com
Date: Mon, 7 Oct 2019 23:16:52 -0700
Message-Id: <1570515415-45593-2-git-send-email-sridhar.samudrala@intel.com>
In-Reply-To: <1570515415-45593-1-git-send-email-sridhar.samudrala@intel.com>
References: <1570515415-45593-1-git-send-email-sridhar.samudrala@intel.com>
Subject: [Intel-wired-lan] [PATCH bpf-next 1/4] bpf: introduce bpf_get_prog_id and bpf_set_prog_id helper functions.
List-Id: Intel Wired Ethernet Linux Kernel Driver Development

Currently the users of the bpf prog id access it directly via
prog->aux->id. The next patch in this series introduces a special bpf
prog pointer to support AF_XDP sockets bound to a queue to receive
packets from that queue directly. As the special bpf prog pointer is not
associated with any struct bpf_prog, the prog id is not accessible via
prog->aux.

To abstract this from the users, two helper functions to get and set the
prog id are introduced, and all users are updated to use them.

Signed-off-by: Sridhar Samudrala
---
 drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c     |  2 +-
 drivers/net/ethernet/cavium/thunder/nicvf_main.c  |  2 +-
 drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c  |  2 +-
 drivers/net/ethernet/intel/i40e/i40e_main.c       |  2 +-
 drivers/net/ethernet/intel/ixgbe/ixgbe_main.c     |  3 +--
 drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c |  3 +--
 drivers/net/ethernet/mellanox/mlx4/en_netdev.c    |  3 +--
 drivers/net/ethernet/mellanox/mlx5/core/en_main.c |  3 +--
 drivers/net/ethernet/qlogic/qede/qede_filter.c    |  2 +-
 drivers/net/ethernet/socionext/netsec.c           |  2 +-
 drivers/net/netdevsim/bpf.c                       |  6 +++--
 drivers/net/tun.c                                 |  4 +--
 drivers/net/veth.c                                |  4 +--
 drivers/net/virtio_net.c                          |  3 +--
 include/linux/bpf.h                               |  3 +++
 include/trace/events/xdp.h                        |  4 +--
 kernel/bpf/arraymap.c                             |  2 +-
 kernel/bpf/cgroup.c                               |  2 +-
 kernel/bpf/core.c                                 |  2 +-
 kernel/bpf/syscall.c                              | 30 +++++++++++++++++------
 kernel/events/core.c                              |  2 +-
 kernel/trace/bpf_trace.c                          |  2 +-
 net/core/dev.c                                    |  4 +--
 net/core/flow_dissector.c                         |  2 +-
 net/core/rtnetlink.c                              |  2 +-
 net/core/xdp.c                                    |  2 +-
 net/ipv6/seg6_local.c                             |  2 +-
 net/sched/act_bpf.c                               |  2 +-
 net/sched/cls_bpf.c                               |  2 +-
 29 files changed, 58 insertions(+), 46 deletions(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c
index c6f6f2033880..ef6dd2881264 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_xdp.c
@@ -330,7 +330,7 @@ int bnxt_xdp(struct net_device *dev, struct netdev_bpf *xdp)
 		rc = bnxt_xdp_set(bp, xdp->prog);
 		break;
 	case XDP_QUERY_PROG:
-		xdp->prog_id = bp->xdp_prog ?
bp->xdp_prog->aux->id : 0; + xdp->prog_id = bpf_get_prog_id(bp->xdp_prog); rc = 0; break; default: diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_main.c b/drivers/net/ethernet/cavium/thunder/nicvf_main.c index 40a44dcb3d9b..5c6c680252c1 100644 --- a/drivers/net/ethernet/cavium/thunder/nicvf_main.c +++ b/drivers/net/ethernet/cavium/thunder/nicvf_main.c @@ -1912,7 +1912,7 @@ static int nicvf_xdp(struct net_device *netdev, struct netdev_bpf *xdp) case XDP_SETUP_PROG: return nicvf_xdp_setup(nic, xdp->prog); case XDP_QUERY_PROG: - xdp->prog_id = nic->xdp_prog ? nic->xdp_prog->aux->id : 0; + xdp->prog_id = bpf_get_prog_id(nic->xdp_prog); return 0; default: return -EINVAL; diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c index 162d7d8fb295..8aef671d0731 100644 --- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c +++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c @@ -1831,7 +1831,7 @@ static int dpaa2_eth_xdp(struct net_device *dev, struct netdev_bpf *xdp) case XDP_SETUP_PROG: return setup_xdp(dev, xdp->prog); case XDP_QUERY_PROG: - xdp->prog_id = priv->xdp_prog ? priv->xdp_prog->aux->id : 0; + xdp->prog_id = bpf_get_prog_id(priv->xdp_prog); break; default: return -EINVAL; diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c index 6031223eafab..0a59937b376e 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_main.c +++ b/drivers/net/ethernet/intel/i40e/i40e_main.c @@ -12820,7 +12820,7 @@ static int i40e_xdp(struct net_device *dev, case XDP_SETUP_PROG: return i40e_xdp_setup(vsi, xdp->prog); case XDP_QUERY_PROG: - xdp->prog_id = vsi->xdp_prog ? 
vsi->xdp_prog->aux->id : 0; + xdp->prog_id = bpf_get_prog_id(vsi->xdp_prog); return 0; case XDP_SETUP_XSK_UMEM: return i40e_xsk_umem_setup(vsi, xdp->xsk.umem, diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c index 1ce2397306b9..c51ad24be037 100644 --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c @@ -10283,8 +10283,7 @@ static int ixgbe_xdp(struct net_device *dev, struct netdev_bpf *xdp) case XDP_SETUP_PROG: return ixgbe_xdp_setup(dev, xdp->prog); case XDP_QUERY_PROG: - xdp->prog_id = adapter->xdp_prog ? - adapter->xdp_prog->aux->id : 0; + xdp->prog_id = bpf_get_prog_id(adapter->xdp_prog); return 0; case XDP_SETUP_XSK_UMEM: return ixgbe_xsk_umem_setup(adapter, xdp->xsk.umem, diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c index 076f2da36f27..dcf32a32cde8 100644 --- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c +++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c @@ -4496,8 +4496,7 @@ static int ixgbevf_xdp(struct net_device *dev, struct netdev_bpf *xdp) case XDP_SETUP_PROG: return ixgbevf_xdp_setup(dev, xdp->prog); case XDP_QUERY_PROG: - xdp->prog_id = adapter->xdp_prog ? 
- adapter->xdp_prog->aux->id : 0; + xdp->prog_id = bpf_get_prog_id(adapter->xdp_prog); return 0; default: return -EINVAL; diff --git a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c index 40ec5acf79c0..cf086748306f 100644 --- a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c +++ b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c @@ -2881,8 +2881,7 @@ static u32 mlx4_xdp_query(struct net_device *dev) xdp_prog = rcu_dereference_protected( priv->rx_ring[0]->xdp_prog, lockdep_is_held(&mdev->state_lock)); - if (xdp_prog) - prog_id = xdp_prog->aux->id; + prog_id = bpf_get_prog_id(xdp_prog); mutex_unlock(&mdev->state_lock); return prog_id; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c index 7569287f8f3c..45c247ff05b2 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c @@ -4478,8 +4478,7 @@ static u32 mlx5e_xdp_query(struct net_device *dev) mutex_lock(&priv->state_lock); xdp_prog = priv->channels.params.xdp_prog; - if (xdp_prog) - prog_id = xdp_prog->aux->id; + prog_id = bpf_get_prog_id(xdp_prog); mutex_unlock(&priv->state_lock); return prog_id; diff --git a/drivers/net/ethernet/qlogic/qede/qede_filter.c b/drivers/net/ethernet/qlogic/qede/qede_filter.c index 9a6a9a008714..75376bb85875 100644 --- a/drivers/net/ethernet/qlogic/qede/qede_filter.c +++ b/drivers/net/ethernet/qlogic/qede/qede_filter.c @@ -1119,7 +1119,7 @@ int qede_xdp(struct net_device *dev, struct netdev_bpf *xdp) case XDP_SETUP_PROG: return qede_xdp_set(edev, xdp->prog); case XDP_QUERY_PROG: - xdp->prog_id = edev->xdp_prog ? 
edev->xdp_prog->aux->id : 0; + xdp->prog_id = bpf_get_prog_id(edev->xdp_prog); return 0; default: return -EINVAL; diff --git a/drivers/net/ethernet/socionext/netsec.c b/drivers/net/ethernet/socionext/netsec.c index 55db7fbd43cc..19f5a1d850de 100644 --- a/drivers/net/ethernet/socionext/netsec.c +++ b/drivers/net/ethernet/socionext/netsec.c @@ -1820,7 +1820,7 @@ static int netsec_xdp(struct net_device *ndev, struct netdev_bpf *xdp) case XDP_SETUP_PROG: return netsec_xdp_setup(priv, xdp->prog, xdp->extack); case XDP_QUERY_PROG: - xdp->prog_id = priv->xdp_prog ? priv->xdp_prog->aux->id : 0; + xdp->prog_id = bpf_get_prog_id(priv->xdp_prog); return 0; default: return -EINVAL; diff --git a/drivers/net/netdevsim/bpf.c b/drivers/net/netdevsim/bpf.c index 2b74425822ab..2a24a7a41985 100644 --- a/drivers/net/netdevsim/bpf.c +++ b/drivers/net/netdevsim/bpf.c @@ -104,7 +104,7 @@ nsim_bpf_offload(struct netdevsim *ns, struct bpf_prog *prog, bool oldprog) "bad offload state, expected offload %sto be active", oldprog ? "" : "not "); ns->bpf_offloaded = prog; - ns->bpf_offloaded_id = prog ? 
prog->aux->id : 0; + ns->bpf_offloaded_id = bpf_get_prog_id(prog); nsim_prog_set_loaded(prog, true); return 0; @@ -218,6 +218,7 @@ static int nsim_bpf_create_prog(struct nsim_dev *nsim_dev, { struct nsim_bpf_bound_prog *state; char name[16]; + u32 prog_id; state = kzalloc(sizeof(*state), GFP_KERNEL); if (!state) @@ -235,7 +236,8 @@ static int nsim_bpf_create_prog(struct nsim_dev *nsim_dev, return -ENOMEM; } - debugfs_create_u32("id", 0400, state->ddir, &prog->aux->id); + prog_id = bpf_get_prog_id(prog); + debugfs_create_u32("id", 0400, state->ddir, &prog_id); debugfs_create_file("state", 0400, state->ddir, &state->state, &nsim_bpf_string_fops); debugfs_create_bool("loaded", 0400, state->ddir, &state->is_loaded); diff --git a/drivers/net/tun.c b/drivers/net/tun.c index aab0be40d443..396905d5c59a 100644 --- a/drivers/net/tun.c +++ b/drivers/net/tun.c @@ -1224,10 +1224,8 @@ static u32 tun_xdp_query(struct net_device *dev) const struct bpf_prog *xdp_prog; xdp_prog = rtnl_dereference(tun->xdp_prog); - if (xdp_prog) - return xdp_prog->aux->id; - return 0; + return bpf_get_prog_id(xdp_prog); } static int tun_xdp(struct net_device *dev, struct netdev_bpf *xdp) diff --git a/drivers/net/veth.c b/drivers/net/veth.c index 9f3c839f9e5f..261b0df8dc41 100644 --- a/drivers/net/veth.c +++ b/drivers/net/veth.c @@ -1140,10 +1140,8 @@ static u32 veth_xdp_query(struct net_device *dev) const struct bpf_prog *xdp_prog; xdp_prog = priv->_xdp_prog; - if (xdp_prog) - return xdp_prog->aux->id; - return 0; + return bpf_get_prog_id(xdp_prog); } static int veth_xdp(struct net_device *dev, struct netdev_bpf *xdp) diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index ba98e0971b84..5aa7f95b6c99 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -2521,8 +2521,7 @@ static u32 virtnet_xdp_query(struct net_device *dev) for (i = 0; i < vi->max_queue_pairs; i++) { xdp_prog = rtnl_dereference(vi->rq[i].xdp_prog); - if (xdp_prog) - return xdp_prog->aux->id; + return 
bpf_get_prog_id(xdp_prog); } return 0; } diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 5b9d22338606..e5b023cda42a 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -692,6 +692,9 @@ int bpf_get_file_flag(int flags); int bpf_check_uarg_tail_zero(void __user *uaddr, size_t expected_size, size_t actual_size); +u32 bpf_get_prog_id(const struct bpf_prog *prog); +void bpf_set_prog_id(struct bpf_prog *prog, u32 id); + /* memcpy that is used with 8-byte aligned pointers, power-of-8 size and * forced to use 'long' read/writes to try to atomically copy long counters. * Best-effort only. No barriers here, since it _will_ race with concurrent diff --git a/include/trace/events/xdp.h b/include/trace/events/xdp.h index 8c8420230a10..3369a73c27e1 100644 --- a/include/trace/events/xdp.h +++ b/include/trace/events/xdp.h @@ -39,7 +39,7 @@ TRACE_EVENT(xdp_exception, ), TP_fast_assign( - __entry->prog_id = xdp->aux->id; + __entry->prog_id = bpf_get_prog_id(xdp); __entry->act = act; __entry->ifindex = dev->ifindex; ), @@ -99,7 +99,7 @@ DECLARE_EVENT_CLASS(xdp_redirect_template, ), TP_fast_assign( - __entry->prog_id = xdp->aux->id; + __entry->prog_id = bpf_get_prog_id(xdp); __entry->act = XDP_REDIRECT; __entry->ifindex = dev->ifindex; __entry->err = err; diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c index 1c65ce0098a9..30037d72c176 100644 --- a/kernel/bpf/arraymap.c +++ b/kernel/bpf/arraymap.c @@ -589,7 +589,7 @@ static void prog_fd_array_put_ptr(void *ptr) static u32 prog_fd_array_sys_lookup_elem(void *ptr) { - return ((struct bpf_prog *)ptr)->aux->id; + return bpf_get_prog_id((struct bpf_prog *)ptr); } /* decrement refcnt of all bpf_progs that are stored in this map */ diff --git a/kernel/bpf/cgroup.c b/kernel/bpf/cgroup.c index ddd8addcdb5c..8db882606d54 100644 --- a/kernel/bpf/cgroup.c +++ b/kernel/bpf/cgroup.c @@ -526,7 +526,7 @@ int __cgroup_bpf_query(struct cgroup *cgrp, const union bpf_attr *attr, i = 0; list_for_each_entry(pl, progs, node) 
{ - id = pl->prog->aux->id; + id = bpf_get_prog_id(pl->prog); if (copy_to_user(prog_ids + i, &id, sizeof(id))) return -EFAULT; if (++i == cnt) diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c index 66088a9e9b9e..60ccf0e552fc 100644 --- a/kernel/bpf/core.c +++ b/kernel/bpf/core.c @@ -1832,7 +1832,7 @@ static bool bpf_prog_array_copy_core(struct bpf_prog_array *array, for (item = array->items; item->prog; item++) { if (item->prog == &dummy_bpf_prog.prog) continue; - prog_ids[i] = item->prog->aux->id; + prog_ids[i] = bpf_get_prog_id(item->prog); if (++i == request_cnt) { item++; break; diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index 82eabd4e38ad..205f95af67d2 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -1287,7 +1287,7 @@ static int bpf_prog_alloc_id(struct bpf_prog *prog) spin_lock_bh(&prog_idr_lock); id = idr_alloc_cyclic(&prog_idr, prog, 1, INT_MAX, GFP_ATOMIC); if (id > 0) - prog->aux->id = id; + bpf_set_prog_id(prog, id); spin_unlock_bh(&prog_idr_lock); idr_preload_end(); @@ -1305,7 +1305,7 @@ void bpf_prog_free_id(struct bpf_prog *prog, bool do_idr_lock) * disappears - even if someone grabs an fd to them they are unusable, * simply waiting for refcnt to drop to be freed. 
*/ - if (!prog->aux->id) + if (!bpf_get_prog_id(prog)) return; if (do_idr_lock) @@ -1313,8 +1313,8 @@ void bpf_prog_free_id(struct bpf_prog *prog, bool do_idr_lock) else __acquire(&prog_idr_lock); - idr_remove(&prog_idr, prog->aux->id); - prog->aux->id = 0; + idr_remove(&prog_idr, bpf_get_prog_id(prog)); + bpf_set_prog_id(prog, 0); if (do_idr_lock) spin_unlock_bh(&prog_idr_lock); @@ -1353,6 +1353,22 @@ void bpf_prog_put(struct bpf_prog *prog) } EXPORT_SYMBOL_GPL(bpf_prog_put); +u32 bpf_get_prog_id(const struct bpf_prog *prog) +{ + if (prog) + return prog->aux->id; + + return 0; +} +EXPORT_SYMBOL(bpf_get_prog_id); + +void bpf_set_prog_id(struct bpf_prog *prog, u32 id) +{ + if (prog) + prog->aux->id = id; +} +EXPORT_SYMBOL(bpf_set_prog_id); + static int bpf_prog_release(struct inode *inode, struct file *filp) { struct bpf_prog *prog = filp->private_data; @@ -1406,7 +1422,7 @@ static void bpf_prog_show_fdinfo(struct seq_file *m, struct file *filp) prog->jited, prog_tag, prog->pages * 1ULL << PAGE_SHIFT, - prog->aux->id, + bpf_get_prog_id(prog), stats.nsecs, stats.cnt); } @@ -2329,7 +2345,7 @@ static int bpf_prog_get_info_by_fd(struct bpf_prog *prog, return -EFAULT; info.type = prog->type; - info.id = prog->aux->id; + info.id = bpf_get_prog_id(prog); info.load_time = prog->aux->load_time; info.created_by_uid = from_kuid_munged(current_user_ns(), prog->aux->user->uid); @@ -2792,7 +2808,7 @@ static int bpf_task_fd_query(const union bpf_attr *attr, struct bpf_raw_event_map *btp = raw_tp->btp; err = bpf_task_fd_query_copy(attr, uattr, - raw_tp->prog->aux->id, + bpf_get_prog_id(raw_tp->prog), BPF_FD_TYPE_RAW_TRACEPOINT, btp->tp->name, 0, 0); goto put_file; diff --git a/kernel/events/core.c b/kernel/events/core.c index 4655adbbae10..1410951ca904 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -8035,7 +8035,7 @@ void perf_event_bpf_event(struct bpf_prog *prog, }, .type = type, .flags = flags, - .id = prog->aux->id, + .id = bpf_get_prog_id(prog), }, }; diff 
--git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c index 44bd08f2443b..35e8cd2b6b54 100644 --- a/kernel/trace/bpf_trace.c +++ b/kernel/trace/bpf_trace.c @@ -1426,7 +1426,7 @@ int bpf_get_perf_event_info(const struct perf_event *event, u32 *prog_id, if (prog->type == BPF_PROG_TYPE_PERF_EVENT) return -EOPNOTSUPP; - *prog_id = prog->aux->id; + *prog_id = bpf_get_prog_id(prog); flags = event->tp_event->flags; is_tracepoint = flags & TRACE_EVENT_FL_TRACEPOINT; is_syscall_tp = is_syscall_trace_event(event->tp_event); diff --git a/net/core/dev.c b/net/core/dev.c index 7a456c6a7ad8..866d0ad936a5 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -5286,7 +5286,7 @@ static int generic_xdp_install(struct net_device *dev, struct netdev_bpf *xdp) break; case XDP_QUERY_PROG: - xdp->prog_id = old ? old->aux->id : 0; + xdp->prog_id = bpf_get_prog_id(old); break; default: @@ -8262,7 +8262,7 @@ int dev_change_xdp_fd(struct net_device *dev, struct netlink_ext_ack *extack, return -EINVAL; } - if (prog->aux->id == prog_id) { + if (bpf_get_prog_id(prog) == prog_id) { bpf_prog_put(prog); return 0; } diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c index 6b4b88d1599d..fa8b8e88bfaa 100644 --- a/net/core/flow_dissector.c +++ b/net/core/flow_dissector.c @@ -89,7 +89,7 @@ int skb_flow_dissector_prog_query(const union bpf_attr *attr, attached = rcu_dereference(net->flow_dissector_prog); if (attached) { prog_cnt = 1; - prog_id = attached->aux->id; + prog_id = bpf_get_prog_id(attached); } rcu_read_unlock(); diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c index 49fa910b58af..86fd505b4111 100644 --- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -1387,7 +1387,7 @@ static u32 rtnl_xdp_prog_skb(struct net_device *dev) generic_xdp_prog = rtnl_dereference(dev->xdp_prog); if (!generic_xdp_prog) return 0; - return generic_xdp_prog->aux->id; + return bpf_get_prog_id(generic_xdp_prog); } static u32 rtnl_xdp_prog_drv(struct net_device *dev) diff --git 
a/net/core/xdp.c b/net/core/xdp.c
index d7bf62ffbb5e..0bc9a50eb318 100644
--- a/net/core/xdp.c
+++ b/net/core/xdp.c
@@ -469,7 +469,7 @@ EXPORT_SYMBOL_GPL(__xdp_release_frame);
 int xdp_attachment_query(struct xdp_attachment_info *info,
 			 struct netdev_bpf *bpf)
 {
-	bpf->prog_id = info->prog ? info->prog->aux->id : 0;
+	bpf->prog_id = bpf_get_prog_id(info->prog);
 	bpf->prog_flags = info->prog ? info->flags : 0;
 	return 0;
 }
diff --git a/net/ipv6/seg6_local.c b/net/ipv6/seg6_local.c
index 9d4f75e0d33a..e49987655567 100644
--- a/net/ipv6/seg6_local.c
+++ b/net/ipv6/seg6_local.c
@@ -853,7 +853,7 @@ static int put_nla_bpf(struct sk_buff *skb, struct seg6_local_lwt *slwt)
 	if (!nest)
 		return -EMSGSIZE;
 
-	if (nla_put_u32(skb, SEG6_LOCAL_BPF_PROG, slwt->bpf.prog->aux->id))
+	if (nla_put_u32(skb, SEG6_LOCAL_BPF_PROG, bpf_get_prog_id(slwt->bpf.prog)))
 		return -EMSGSIZE;
 
 	if (slwt->bpf.name &&
diff --git a/net/sched/act_bpf.c b/net/sched/act_bpf.c
index 04b7bd4ec751..d55a28d0adf6 100644
--- a/net/sched/act_bpf.c
+++ b/net/sched/act_bpf.c
@@ -119,7 +119,7 @@ static int tcf_bpf_dump_ebpf_info(const struct tcf_bpf *prog,
 	    nla_put_string(skb, TCA_ACT_BPF_NAME, prog->bpf_name))
 		return -EMSGSIZE;
 
-	if (nla_put_u32(skb, TCA_ACT_BPF_ID, prog->filter->aux->id))
+	if (nla_put_u32(skb, TCA_ACT_BPF_ID, bpf_get_prog_id(prog->filter)))
 		return -EMSGSIZE;
 
 	nla = nla_reserve(skb, TCA_ACT_BPF_TAG, sizeof(prog->filter->tag));
diff --git a/net/sched/cls_bpf.c b/net/sched/cls_bpf.c
index bf10bdaf5012..d35aedb4aee5 100644
--- a/net/sched/cls_bpf.c
+++ b/net/sched/cls_bpf.c
@@ -562,7 +562,7 @@ static int cls_bpf_dump_ebpf_info(const struct cls_bpf_prog *prog,
 	    nla_put_string(skb, TCA_BPF_NAME, prog->bpf_name))
 		return -EMSGSIZE;
 
-	if (nla_put_u32(skb, TCA_BPF_ID, prog->filter->aux->id))
+	if (nla_put_u32(skb, TCA_BPF_ID, bpf_get_prog_id(prog->filter)))
 		return -EMSGSIZE;
 
 	nla = nla_reserve(skb, TCA_BPF_TAG, sizeof(prog->filter->tag));

From patchwork Tue Oct 8 06:16:53 2019
X-Patchwork-Submitter: "Samudrala, Sridhar"
X-Patchwork-Id: 1173120
From: Sridhar Samudrala
To: magnus.karlsson@intel.com, bjorn.topel@intel.com, netdev@vger.kernel.org, bpf@vger.kernel.org, sridhar.samudrala@intel.com, intel-wired-lan@lists.osuosl.org, maciej.fijalkowski@intel.com, tom.herbert@intel.com
Date: Mon, 7 Oct 2019 23:16:53 -0700
Message-Id: <1570515415-45593-3-git-send-email-sridhar.samudrala@intel.com>
In-Reply-To: <1570515415-45593-1-git-send-email-sridhar.samudrala@intel.com>
References: <1570515415-45593-1-git-send-email-sridhar.samudrala@intel.com>
Subject: [Intel-wired-lan] [PATCH bpf-next 2/4] xsk: allow AF_XDP sockets to receive packets directly from a queue
List-Id: Intel Wired Ethernet Linux Kernel Driver Development

Introduce a flag that can be specified during the bind() call of an
AF_XDP socket to receive packets directly from a queue when there is no
XDP program attached to the device.

This is enabled by introducing a special BPF prog pointer called
BPF_PROG_DIRECT_XSK and a new bind flag XDP_DIRECT that can be specified
when binding an AF_XDP socket to a queue.
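
The special-pointer idea can be illustrated with a minimal user-space
sketch. This is not kernel code: the mock_* names and struct layouts are
hypothetical stand-ins for struct bpf_prog and its aux data. Only the 0x1
sentinel value and the NULL/sentinel checks mirror what this series does
in bpf_get_prog_id() and bpf_set_prog_id().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for struct bpf_prog and struct bpf_prog_aux. */
struct mock_prog_aux {
	uint32_t id;
};

struct mock_prog {
	struct mock_prog_aux *aux;
};

/* Sentinel value used by the series; it is never a valid object address. */
#define MOCK_PROG_DIRECT_XSK 0x1UL

/* True when the pointer is the sentinel rather than a real program. */
static int mock_is_direct_xsk_prog(const struct mock_prog *prog)
{
	return (uintptr_t)prog == MOCK_PROG_DIRECT_XSK;
}

/* Returns 0 for NULL and for the sentinel, so neither is dereferenced. */
uint32_t mock_get_prog_id(const struct mock_prog *prog)
{
	if (prog && !mock_is_direct_xsk_prog(prog))
		return prog->aux->id;

	return 0;
}

/* Silently ignores NULL and the sentinel; only real programs get an id. */
void mock_set_prog_id(struct mock_prog *prog, uint32_t id)
{
	if (prog && !mock_is_direct_xsk_prog(prog))
		prog->aux->id = id;
}
```

Because callers go through the two accessors, a query path can hand back
either a real program or the sentinel without special-casing it; the
sentinel simply reports prog id 0, the same as "no program attached".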

At the time of bind(), an AF_XDP socket in XDP_DIRECT mode will attach
BPF_PROG_DIRECT_XSK as a bpf program if there is no other XDP program
attached to the device. The device receive queue is also associated with
the AF_XDP socket.

In the XDP receive path, if the bpf program is a DIRECT_XSK program, the
XDP buffer is passed to the AF_XDP socket that is associated with the
receive queue on which the packet is received.

To avoid any overhead for normal XDP programs, a static key is used to
keep a count of AF_XDP direct xsk sockets, and static_branch_unlikely()
is used to handle receives for direct XDP sockets.

Any attach of a normal XDP program will take precedence, and the direct
xsk program will be removed. The direct XSK program will be attached
again automatically when the normal XDP program is removed, provided
there are still AF_XDP direct sockets associated with that device.

Signed-off-by: Sridhar Samudrala
---
 include/linux/filter.h            | 18 ++++++++++++
 include/linux/netdevice.h         | 10 +++++++
 include/net/xdp_sock.h            |  5 ++++
 include/uapi/linux/if_xdp.h       |  5 ++++
 kernel/bpf/syscall.c              |  7 +++--
 net/core/dev.c                    | 50 +++++++++++++++++++++++++++++++++
 net/core/filter.c                 | 58 +++++++++++++++++++++++++++++++++++++++
 net/xdp/xsk.c                     | 51 ++++++++++++++++++++++++++++++++--
 tools/include/uapi/linux/if_xdp.h |  5 ++++
 9 files changed, 204 insertions(+), 5 deletions(-)

diff --git a/include/linux/filter.h b/include/linux/filter.h
index 2ce57645f3cd..db4ad85d8321 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -585,6 +585,9 @@ struct bpf_redirect_info {
 	struct bpf_map *map;
 	struct bpf_map *map_to_flush;
 	u32 kern_flags;
+#ifdef CONFIG_XDP_SOCKETS
+	struct xdp_sock *xsk;
+#endif
 };
 
 DECLARE_PER_CPU(struct bpf_redirect_info, bpf_redirect_info);
@@ -693,6 +696,16 @@ static inline u32 bpf_prog_run_clear_cb(const struct bpf_prog *prog,
 	return res;
 }
 
+#ifdef CONFIG_XDP_SOCKETS
+#define BPF_PROG_DIRECT_XSK	0x1
+#define bpf_is_direct_xsk_prog(prog) \
+	((unsigned long)prog ==
BPF_PROG_DIRECT_XSK) +DECLARE_STATIC_KEY_FALSE(xdp_direct_xsk_needed); +u32 bpf_direct_xsk(const struct bpf_prog *prog, struct xdp_buff *xdp); +#else +#define bpf_is_direct_xsk_prog(prog) (false) +#endif + static __always_inline u32 bpf_prog_run_xdp(const struct bpf_prog *prog, struct xdp_buff *xdp) { @@ -702,6 +715,11 @@ static __always_inline u32 bpf_prog_run_xdp(const struct bpf_prog *prog, * already takes rcu_read_lock() when fetching the program, so * it's not necessary here anymore. */ +#ifdef CONFIG_XDP_SOCKETS + if (static_branch_unlikely(&xdp_direct_xsk_needed) && + bpf_is_direct_xsk_prog(prog)) + return bpf_direct_xsk(prog, xdp); +#endif return BPF_PROG_RUN(prog, xdp); } diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index 48cc71aae466..f4d0f70aa718 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -743,6 +743,7 @@ struct netdev_rx_queue { struct xdp_rxq_info xdp_rxq; #ifdef CONFIG_XDP_SOCKETS struct xdp_umem *umem; + struct xdp_sock *xsk; #endif } ____cacheline_aligned_in_smp; @@ -1836,6 +1837,10 @@ struct net_device { atomic_t carrier_up_count; atomic_t carrier_down_count; +#ifdef CONFIG_XDP_SOCKETS + u16 direct_xsk_count; +#endif + #ifdef CONFIG_WIRELESS_EXT const struct iw_handler_def *wireless_handlers; struct iw_public_data *wireless_data; @@ -3712,6 +3717,11 @@ int dev_forward_skb(struct net_device *dev, struct sk_buff *skb); bool is_skb_forwardable(const struct net_device *dev, const struct sk_buff *skb); +#ifdef CONFIG_XDP_SOCKETS +int dev_set_direct_xsk_prog(struct net_device *dev); +int dev_clear_direct_xsk_prog(struct net_device *dev); +#endif + static __always_inline int ____dev_forward_skb(struct net_device *dev, struct sk_buff *skb) { diff --git a/include/net/xdp_sock.h b/include/net/xdp_sock.h index c9398ce7960f..9158233d34e1 100644 --- a/include/net/xdp_sock.h +++ b/include/net/xdp_sock.h @@ -76,6 +76,9 @@ struct xsk_map_node { struct xdp_sock **map_entry; }; +/* Flags for the xdp_sock flags 
field. */ +#define XDP_SOCK_DIRECT (1 << 0) + struct xdp_sock { /* struct sock must be the first member of struct xdp_sock */ struct sock sk; @@ -104,6 +107,7 @@ struct xdp_sock { struct list_head map_list; /* Protects map_list */ spinlock_t map_list_lock; + u16 flags; }; struct xdp_buff; @@ -129,6 +133,7 @@ void xsk_set_tx_need_wakeup(struct xdp_umem *umem); void xsk_clear_rx_need_wakeup(struct xdp_umem *umem); void xsk_clear_tx_need_wakeup(struct xdp_umem *umem); bool xsk_umem_uses_need_wakeup(struct xdp_umem *umem); +struct xdp_sock *xdp_get_xsk_from_qid(struct net_device *dev, u16 queue_id); void xsk_map_try_sock_delete(struct xsk_map *map, struct xdp_sock *xs, struct xdp_sock **map_entry); diff --git a/include/uapi/linux/if_xdp.h b/include/uapi/linux/if_xdp.h index be328c59389d..d202b5d40859 100644 --- a/include/uapi/linux/if_xdp.h +++ b/include/uapi/linux/if_xdp.h @@ -25,6 +25,11 @@ * application. */ #define XDP_USE_NEED_WAKEUP (1 << 3) +/* This option allows an AF_XDP socket bound to a queue to receive all + * the packets directly from that queue when there is no XDP program + * attached to the device. 
+ */
+#define XDP_DIRECT (1 << 4)
 
 /* Flags for xsk_umem_config flags */
 #define XDP_UMEM_UNALIGNED_CHUNK_FLAG (1 << 0)
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 205f95af67d2..871d738a78d2 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -1349,13 +1349,14 @@ static void __bpf_prog_put(struct bpf_prog *prog, bool do_idr_lock)
 
 void bpf_prog_put(struct bpf_prog *prog)
 {
-	__bpf_prog_put(prog, true);
+	if (!bpf_is_direct_xsk_prog(prog))
+		__bpf_prog_put(prog, true);
 }
 EXPORT_SYMBOL_GPL(bpf_prog_put);
 
 u32 bpf_get_prog_id(const struct bpf_prog *prog)
 {
-	if (prog)
+	if (prog && !bpf_is_direct_xsk_prog(prog))
 		return prog->aux->id;
 
 	return 0;
@@ -1364,7 +1365,7 @@ EXPORT_SYMBOL(bpf_get_prog_id);
 
 void bpf_set_prog_id(struct bpf_prog *prog, u32 id)
 {
-	if (prog)
+	if (prog && !bpf_is_direct_xsk_prog(prog))
 		prog->aux->id = id;
 }
 EXPORT_SYMBOL(bpf_set_prog_id);
diff --git a/net/core/dev.c b/net/core/dev.c
index 866d0ad936a5..eb3cd718e580 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -8269,6 +8269,10 @@ int dev_change_xdp_fd(struct net_device *dev, struct netlink_ext_ack *extack,
 	} else {
 		if (!__dev_xdp_query(dev, bpf_op, query))
 			return 0;
+#ifdef CONFIG_XDP_SOCKETS
+		if (dev->direct_xsk_count)
+			prog = (void *)BPF_PROG_DIRECT_XSK;
+#endif
 	}
 
 	err = dev_xdp_install(dev, bpf_op, extack, flags, prog);
@@ -8278,6 +8282,52 @@ int dev_change_xdp_fd(struct net_device *dev, struct netlink_ext_ack *extack,
 	return err;
 }
 
+#ifdef CONFIG_XDP_SOCKETS
+int dev_set_direct_xsk_prog(struct net_device *dev)
+{
+	const struct net_device_ops *ops = dev->netdev_ops;
+	struct bpf_prog *prog;
+	bpf_op_t bpf_op;
+
+	ASSERT_RTNL();
+
+	dev->direct_xsk_count++;
+
+	bpf_op = ops->ndo_bpf;
+	if (!bpf_op)
+		return -EOPNOTSUPP;
+
+	if (__dev_xdp_query(dev, bpf_op, XDP_QUERY_PROG))
+		return 0;
+
+	prog = (void *)BPF_PROG_DIRECT_XSK;
+
+	return dev_xdp_install(dev, bpf_op, NULL, XDP_FLAGS_DRV_MODE, prog);
+}
+
+int dev_clear_direct_xsk_prog(struct net_device *dev)
+{
+	const struct net_device_ops *ops = dev->netdev_ops;
+	bpf_op_t bpf_op;
+
+	ASSERT_RTNL();
+
+	dev->direct_xsk_count--;
+
+	if (dev->direct_xsk_count)
+		return 0;
+
+	bpf_op = ops->ndo_bpf;
+	if (!bpf_op)
+		return -EOPNOTSUPP;
+
+	if (__dev_xdp_query(dev, bpf_op, XDP_QUERY_PROG))
+		return 0;
+
+	return dev_xdp_install(dev, bpf_op, NULL, XDP_FLAGS_DRV_MODE, NULL);
+}
+#endif
+
 /**
  *	dev_new_index - allocate an ifindex
  *	@net: the applicable net namespace
diff --git a/net/core/filter.c b/net/core/filter.c
index ed6563622ce3..391d7d600284 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -73,6 +73,7 @@
 #include
 #include
 #include
+#include
 
 /**
  *	sk_filter_trim_cap - run a packet through a socket filter
@@ -3546,6 +3547,22 @@ static int __bpf_tx_xdp_map(struct net_device *dev_rx, void *fwd,
 	return 0;
 }
 
+#ifdef CONFIG_XDP_SOCKETS
+static void xdp_do_flush_xsk(struct bpf_redirect_info *ri)
+{
+	struct xdp_sock *xsk = READ_ONCE(ri->xsk);
+
+	if (xsk) {
+		ri->xsk = NULL;
+		xsk_flush(xsk);
+	}
+}
+#else
+static inline void xdp_do_flush_xsk(struct bpf_redirect_info *ri)
+{
+}
+#endif
+
 void xdp_do_flush_map(void)
 {
 	struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info);
@@ -3568,6 +3585,8 @@ void xdp_do_flush_map(void)
 			break;
 		}
 	}
+
+	xdp_do_flush_xsk(ri);
 }
 EXPORT_SYMBOL_GPL(xdp_do_flush_map);
 
@@ -3631,11 +3650,28 @@ static int xdp_do_redirect_map(struct net_device *dev, struct xdp_buff *xdp,
 	return err;
 }
 
+#ifdef CONFIG_XDP_SOCKETS
+static inline struct xdp_sock *xdp_get_direct_xsk(struct bpf_redirect_info *ri)
+{
+	return READ_ONCE(ri->xsk);
+}
+#else
+static inline struct xdp_sock *xdp_get_direct_xsk(struct bpf_redirect_info *ri)
+{
+	return NULL;
+}
+#endif
+
 int xdp_do_redirect(struct net_device *dev, struct xdp_buff *xdp,
 		    struct bpf_prog *xdp_prog)
 {
 	struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info);
 	struct bpf_map *map = READ_ONCE(ri->map);
+	struct xdp_sock *xsk;
+
+	xsk = xdp_get_direct_xsk(ri);
+	if (xsk)
+		return xsk_rcv(xsk, xdp);
 
 	if (likely(map))
 		return xdp_do_redirect_map(dev, xdp, xdp_prog, map, ri);
@@ -8934,4 +8970,26 @@ const struct bpf_verifier_ops sk_reuseport_verifier_ops = {
 
 const struct bpf_prog_ops sk_reuseport_prog_ops = {
 };
+
+#ifdef CONFIG_XDP_SOCKETS
+DEFINE_STATIC_KEY_FALSE(xdp_direct_xsk_needed);
+EXPORT_SYMBOL_GPL(xdp_direct_xsk_needed);
+
+u32 bpf_direct_xsk(const struct bpf_prog *prog, struct xdp_buff *xdp)
+{
+	struct xdp_sock *xsk;
+
+	xsk = xdp_get_xsk_from_qid(xdp->rxq->dev, xdp->rxq->queue_index);
+	if (xsk) {
+		struct bpf_redirect_info *ri = this_cpu_ptr(&bpf_redirect_info);
+
+		ri->xsk = xsk;
+		return XDP_REDIRECT;
+	}
+
+	return XDP_PASS;
+}
+EXPORT_SYMBOL(bpf_direct_xsk);
+#endif
+
 #endif /* CONFIG_INET */
diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index fa8fbb8fa3c8..8a29939bac7e 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -24,6 +24,7 @@
 #include
 #include
 #include
+#include
 
 #include "xsk_queue.h"
 #include "xdp_umem.h"
@@ -264,6 +265,45 @@ int xsk_generic_rcv(struct xdp_sock *xs, struct xdp_buff *xdp)
 	return err;
 }
 
+static void xdp_clear_direct_xsk(struct xdp_sock *xsk)
+{
+	struct net_device *dev = xsk->dev;
+	u32 qid = xsk->queue_id;
+
+	if (!dev->_rx[qid].xsk)
+		return;
+
+	dev_clear_direct_xsk_prog(dev);
+	dev->_rx[qid].xsk = NULL;
+	static_branch_dec(&xdp_direct_xsk_needed);
+	xsk->flags &= ~XDP_SOCK_DIRECT;
+}
+
+static int xdp_set_direct_xsk(struct xdp_sock *xsk)
+{
+	struct net_device *dev = xsk->dev;
+	u32 qid = xsk->queue_id;
+	int err = 0;
+
+	if (dev->_rx[qid].xsk)
+		return -EEXIST;
+
+	xsk->flags |= XDP_SOCK_DIRECT;
+	static_branch_inc(&xdp_direct_xsk_needed);
+	dev->_rx[qid].xsk = xsk;
+	err = dev_set_direct_xsk_prog(dev);
+	if (err)
+		xdp_clear_direct_xsk(xsk);
+
+	return err;
+}
+
+struct xdp_sock *xdp_get_xsk_from_qid(struct net_device *dev, u16 queue_id)
+{
+	return dev->_rx[queue_id].xsk;
+}
+EXPORT_SYMBOL(xdp_get_xsk_from_qid);
+
 void xsk_umem_complete_tx(struct xdp_umem *umem, u32 nb_entries)
 {
 	xskq_produce_flush_addr_n(umem->cq, nb_entries);
@@ -464,6 +504,11 @@ static void xsk_unbind_dev(struct xdp_sock *xs)
 		return;
 	WRITE_ONCE(xs->state, XSK_UNBOUND);
 
+	if (xs->flags & XDP_SOCK_DIRECT) {
+		rtnl_lock();
+		xdp_clear_direct_xsk(xs);
+		rtnl_unlock();
+	}
 	/* Wait for driver to stop using the xdp socket. */
 	xdp_del_sk_umem(xs->umem, xs);
 	xs->dev = NULL;
@@ -604,7 +649,7 @@ static int xsk_bind(struct socket *sock, struct sockaddr *addr, int addr_len)
 
 	flags = sxdp->sxdp_flags;
 	if (flags & ~(XDP_SHARED_UMEM | XDP_COPY | XDP_ZEROCOPY |
-		      XDP_USE_NEED_WAKEUP))
+		      XDP_USE_NEED_WAKEUP | XDP_DIRECT))
 		return -EINVAL;
 
 	rtnl_lock();
@@ -632,7 +677,7 @@ static int xsk_bind(struct socket *sock, struct sockaddr *addr, int addr_len)
 		struct socket *sock;
 
 		if ((flags & XDP_COPY) || (flags & XDP_ZEROCOPY) ||
-		    (flags & XDP_USE_NEED_WAKEUP)) {
+		    (flags & XDP_USE_NEED_WAKEUP) || (flags & XDP_DIRECT)) {
 			/* Cannot specify flags for shared sockets. */
 			err = -EINVAL;
 			goto out_unlock;
@@ -688,6 +733,8 @@ static int xsk_bind(struct socket *sock, struct sockaddr *addr, int addr_len)
 	xskq_set_umem(xs->rx, xs->umem->size, xs->umem->chunk_mask);
 	xskq_set_umem(xs->tx, xs->umem->size, xs->umem->chunk_mask);
 	xdp_add_sk_umem(xs->umem, xs);
+	if (flags & XDP_DIRECT)
+		err = xdp_set_direct_xsk(xs);
 
 out_unlock:
 	if (err) {
diff --git a/tools/include/uapi/linux/if_xdp.h b/tools/include/uapi/linux/if_xdp.h
index be328c59389d..d202b5d40859 100644
--- a/tools/include/uapi/linux/if_xdp.h
+++ b/tools/include/uapi/linux/if_xdp.h
@@ -25,6 +25,11 @@
  * application.
  */
 #define XDP_USE_NEED_WAKEUP (1 << 3)
+/* This option allows an AF_XDP socket bound to a queue to receive all
+ * the packets directly from that queue when there is no XDP program
+ * attached to the device.
+ */
+#define XDP_DIRECT (1 << 4)
 
 /* Flags for xsk_umem_config flags */
 #define XDP_UMEM_UNALIGNED_CHUNK_FLAG (1 << 0)

From patchwork Tue Oct 8 06:16:54 2019
X-Patchwork-Submitter: "Samudrala, Sridhar"
X-Patchwork-Id: 1173121
From: Sridhar Samudrala
To: magnus.karlsson@intel.com, bjorn.topel@intel.com,
	netdev@vger.kernel.org, bpf@vger.kernel.org,
	sridhar.samudrala@intel.com, intel-wired-lan@lists.osuosl.org,
	maciej.fijalkowski@intel.com, tom.herbert@intel.com
Date: Mon, 7 Oct 2019 23:16:54 -0700
Message-Id: <1570515415-45593-4-git-send-email-sridhar.samudrala@intel.com>
In-Reply-To: <1570515415-45593-1-git-send-email-sridhar.samudrala@intel.com>
References: <1570515415-45593-1-git-send-email-sridhar.samudrala@intel.com>
Subject: [Intel-wired-lan] [PATCH bpf-next 3/4] libbpf: handle AF_XDP sockets created with XDP_DIRECT bind flag.

Don't allow an AF_XDP socket to bind with the XDP_DIRECT bind flag when a
normal XDP program is already attached to the device. Don't attach the
default XDP program when an AF_XDP socket is created with the XDP_DIRECT
bind flag.

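The two rules above can be sketched in isolation as a small decision function. This is only an illustration, not the libbpf code: `xsk_prog_decide()` and `enum xsk_prog_action` are hypothetical names, and the `XDP_DIRECT` value is copied from the if_xdp.h hunk of this series rather than from any released uapi header.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: bit value taken from the if_xdp.h hunk in this series. */
#define XDP_DIRECT (1 << 4)

/* Hypothetical outcomes, for illustration only. */
enum xsk_prog_action {
	XSK_PROG_SKIP,		/* direct mode, nothing attached: load nothing */
	XSK_PROG_LOAD_DEFAULT,	/* normal mode, nothing attached: load default */
	XSK_PROG_REUSE,		/* normal mode, reuse the attached program */
};

/* Returns 0 and stores the action, or -EEXIST when direct mode is
 * requested while an XDP program (prog_id != 0) is already attached.
 */
static int xsk_prog_decide(uint32_t prog_id, uint16_t bind_flags,
			   enum xsk_prog_action *action)
{
	int direct = !!(bind_flags & XDP_DIRECT);

	assert(action != NULL);

	if (!prog_id) {
		*action = direct ? XSK_PROG_SKIP : XSK_PROG_LOAD_DEFAULT;
		return 0;
	}
	if (direct)
		return -EEXIST;

	*action = XSK_PROG_REUSE;
	return 0;
}
```

A caller would pass the program id queried from the device and the socket's bind flags, then act on the returned action.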
Signed-off-by: Sridhar Samudrala
---
 tools/lib/bpf/xsk.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/tools/lib/bpf/xsk.c b/tools/lib/bpf/xsk.c
index d5f4900e5c54..953b479040cd 100644
--- a/tools/lib/bpf/xsk.c
+++ b/tools/lib/bpf/xsk.c
@@ -454,6 +454,9 @@ static int xsk_setup_xdp_prog(struct xsk_socket *xsk)
 		return err;
 
 	if (!prog_id) {
+		if (xsk->config.bind_flags & XDP_DIRECT)
+			return 0;
+
 		err = xsk_create_bpf_maps(xsk);
 		if (err)
 			return err;
@@ -464,6 +467,9 @@ static int xsk_setup_xdp_prog(struct xsk_socket *xsk)
 			return err;
 		}
 	} else {
+		if (xsk->config.bind_flags & XDP_DIRECT)
+			return -EEXIST;
+
 		xsk->prog_fd = bpf_prog_get_fd_by_id(prog_id);
 		err = xsk_lookup_bpf_maps(xsk);
 		if (err) {

From patchwork Tue Oct 8 06:16:55 2019
X-Patchwork-Submitter: "Samudrala, Sridhar"
X-Patchwork-Id: 1173122
From: Sridhar Samudrala
To: magnus.karlsson@intel.com, bjorn.topel@intel.com,
	netdev@vger.kernel.org, bpf@vger.kernel.org,
	sridhar.samudrala@intel.com, intel-wired-lan@lists.osuosl.org,
	maciej.fijalkowski@intel.com, tom.herbert@intel.com
Date: Mon, 7 Oct 2019 23:16:55 -0700
Message-Id: <1570515415-45593-5-git-send-email-sridhar.samudrala@intel.com>
In-Reply-To: <1570515415-45593-1-git-send-email-sridhar.samudrala@intel.com>
References: <1570515415-45593-1-git-send-email-sridhar.samudrala@intel.com>
Subject: [Intel-wired-lan] [PATCH bpf-next 4/4] xdpsock: add an option to create AF_XDP sockets in XDP_DIRECT mode

This option enables an AF_XDP socket to bind with the XDP_DIRECT flag, which
allows packets received on the associated queue to be delivered directly to
the socket when no XDP program is attached.

Signed-off-by: Sridhar Samudrala
Acked-by: Björn Töpel
---
 samples/bpf/xdpsock_user.c | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/samples/bpf/xdpsock_user.c b/samples/bpf/xdpsock_user.c
index 405c4e091f8b..6f4633769968 100644
--- a/samples/bpf/xdpsock_user.c
+++ b/samples/bpf/xdpsock_user.c
@@ -129,6 +129,9 @@ static void print_benchmark(bool running)
 	if (opt_poll)
 		printf("poll() ");
 
+	if (opt_xdp_bind_flags & XDP_DIRECT)
+		printf("direct-xsk ");
+
 	if (running) {
 		printf("running...");
 		fflush(stdout);
@@ -202,7 +205,8 @@ static void int_exit(int sig)
 	dump_stats();
 	xsk_socket__delete(xsks[0]->xsk);
 	(void)xsk_umem__delete(umem);
-	remove_xdp_program();
+	if (!(opt_xdp_bind_flags & XDP_DIRECT))
+		remove_xdp_program();
 
 	exit(EXIT_SUCCESS);
 }
@@ -213,7 +217,8 @@ static void __exit_with_error(int error, const char *file, const char *func,
 	fprintf(stderr, "%s:%s:%i: errno: %d/\"%s\"\n", file, func,
 		line, error, strerror(error));
 	dump_stats();
-	remove_xdp_program();
+	if (!(opt_xdp_bind_flags & XDP_DIRECT))
+		remove_xdp_program();
 	exit(EXIT_FAILURE);
 }
 
@@ -363,6 +368,7 @@ static struct option long_options[] = {
 	{"frame-size", required_argument, 0, 'f'},
 	{"no-need-wakeup", no_argument, 0, 'm'},
 	{"unaligned", no_argument, 0, 'u'},
+	{"direct-xsk", no_argument, 0, 'd'},
 	{0, 0, 0, 0}
 };
 
@@ -386,6 +392,7 @@ static void usage(const char *prog)
 		" -m, --no-need-wakeup Turn off use of driver need wakeup flag.\n"
 		" -f, --frame-size=n Set the frame size (must be a power of two in aligned mode, default is %d).\n"
 		" -u, --unaligned Enable unaligned chunk placement\n"
+		" -d, --direct-xsk Direct packets to XDP socket.\n"
 		"\n";
 	fprintf(stderr, str, prog, XSK_UMEM__DEFAULT_FRAME_SIZE);
 	exit(EXIT_FAILURE);
@@ -398,7 +405,7 @@ static void parse_command_line(int argc, char **argv)
 	opterr = 0;
 
 	for (;;) {
-		c = getopt_long(argc, argv, "Frtli:q:psSNn:czf:mu",
+		c = getopt_long(argc, argv, "Frtli:q:psSNn:czf:mud",
 				long_options, &option_index);
 		if (c == -1)
 			break;
@@ -452,7 +459,9 @@ static void parse_command_line(int argc, char **argv)
 			opt_need_wakeup = false;
 			opt_xdp_bind_flags &= ~XDP_USE_NEED_WAKEUP;
 			break;
-
+		case 'd':
+			opt_xdp_bind_flags |= XDP_DIRECT;
+			break;
 		default:
 			usage(basename(argv[0]));
 		}
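
The option handling added to the sample can be tried out on its own with a reduced parser. The sketch below assumes a hypothetical helper, `parse_bind_flags()`, that knows only the new `-d` / `--direct-xsk` option; the `XDP_DIRECT` value is copied from the if_xdp.h hunk of this series and is not taken from an installed header.

```c
#include <assert.h>
#include <getopt.h>
#include <stddef.h>

/* Assumption: bit value taken from the if_xdp.h hunk in this series. */
#define XDP_DIRECT (1 << 4)

/* Hypothetical helper: returns the bind flags selected on the command
 * line, or -1 when an unrecognised option is seen.
 */
static int parse_bind_flags(int argc, char **argv)
{
	static const struct option long_options[] = {
		{"direct-xsk", no_argument, 0, 'd'},
		{0, 0, 0, 0}
	};
	int flags = 0;
	int c;

	assert(argv != NULL);

	optind = 1;	/* restart scanning so the helper can be reused */
	opterr = 0;	/* no diagnostics for unknown options */

	for (;;) {
		c = getopt_long(argc, argv, "d", long_options, NULL);
		if (c == -1)
			break;

		switch (c) {
		case 'd':
			flags |= XDP_DIRECT;
			break;
		default:
			return -1;
		}
	}

	return flags;
}
```

In the sample itself the resulting flags end up in `opt_xdp_bind_flags`, and the exit paths skip `remove_xdp_program()` when `XDP_DIRECT` is set because no default program was loaded in that mode.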