From patchwork Mon Jul 13 17:46:39 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jakub Sitnicki X-Patchwork-Id: 1328339 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org (client-ip=23.128.96.18; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=reject dis=none) header.from=cloudflare.com Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; secure) header.d=cloudflare.com header.i=@cloudflare.com header.a=rsa-sha256 header.s=google header.b=vkij8c4e; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by ozlabs.org (Postfix) with ESMTP id 4B5B124xMkz9sRN for ; Tue, 14 Jul 2020 03:47:02 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730216AbgGMRrB (ORCPT ); Mon, 13 Jul 2020 13:47:01 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60260 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729826AbgGMRq7 (ORCPT ); Mon, 13 Jul 2020 13:46:59 -0400 Received: from mail-lj1-x241.google.com (mail-lj1-x241.google.com [IPv6:2a00:1450:4864:20::241]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 312F9C061755 for ; Mon, 13 Jul 2020 10:46:59 -0700 (PDT) Received: by mail-lj1-x241.google.com with SMTP id e4so18948453ljn.4 for ; Mon, 13 Jul 2020 10:46:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cloudflare.com; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=opL//BYzhcQPBoxrHchY4vZX82iC++4GHZTFlkl4Nb8=; b=vkij8c4eti22az1c2OF806pGH9GUJG7d3hmjrZ9M0KM7TyJqQtp3lBuyKJnpbZVhL1 q70nFRE7LT9DO7jFOy0smkUdLcO/uWfJL75lqpLV/hVE2ONNY6upC3BqNAB4zmEGMoji BhJD1QHXBHoryQPm/NFBxFEUk2SN0CQQbRu0c= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=opL//BYzhcQPBoxrHchY4vZX82iC++4GHZTFlkl4Nb8=; b=FNPIQPtZXs8SgXMQNNX8xfgml7iOmbT3PuZDQFjyTktgTfOegX6o/Te/m7Iev5EdMX bHj4NKSQeP229FY3eW6MHY+MnHd/1+RYFvkC9TKAbOSxg4LUjnbVhC5pvoGZqsU6PfFs qZOwZgbN2uN6wh1kWXlrIJ9x6zLf9rIRqtkuyNtDk6OxPCt5vKC1YmUi4otBvV/E7iFS bmQuPn8VE3MNkfnbKGZfXjGgi5AvAYBszOmJRNPrNidFeUAp0dghWhVrTry7QA87cnnJ lk1ueCoUnVKpYIRDCaodyRbvUy8wp/UyCLn49wpi43vquN+ydwnxh3V5vVYnv/aBz7lR YxDQ== X-Gm-Message-State: AOAM5321i8Hm65+iMzoKFEMBBo1OTcDUUr9IqL7pKoghKlCEHRlSP6Uf YFionb8T4galZBY1BT+0HRnrSQ== X-Google-Smtp-Source: ABdhPJxOhso6OoTmlFc2xk0wErkRYTNkDXXkmvLn/bCi9OjuQp0edzUKBtoDNdgwmdQP1EAEXqDoSQ== X-Received: by 2002:a2e:a544:: with SMTP id e4mr424887ljn.264.1594662417544; Mon, 13 Jul 2020 10:46:57 -0700 (PDT) Received: from cloudflare.com ([2a02:a310:c262:aa00:b35e:8938:2c2a:ba8b]) by smtp.gmail.com with ESMTPSA id g22sm4741921lfb.43.2020.07.13.10.46.56 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 13 Jul 2020 10:46:57 -0700 (PDT) From: Jakub Sitnicki To: bpf@vger.kernel.org Cc: netdev@vger.kernel.org, kernel-team@cloudflare.com, Alexei Starovoitov , Daniel Borkmann , "David S. Miller" , Jakub Kicinski Subject: [PATCH bpf-next v4 01/16] bpf, netns: Handle multiple link attachments Date: Mon, 13 Jul 2020 19:46:39 +0200 Message-Id: <20200713174654.642628-2-jakub@cloudflare.com> X-Mailer: git-send-email 2.25.4 In-Reply-To: <20200713174654.642628-1-jakub@cloudflare.com> References: <20200713174654.642628-1-jakub@cloudflare.com> MIME-Version: 1.0 Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org Extend the BPF netns link callbacks to rebuild (grow/shrink) or update the prog_array at given position when link gets attached/updated/released. This let's us lift the limit of having just one link attached for the new attach type introduced by subsequent patch. No functional changes intended. Signed-off-by: Jakub Sitnicki Acked-by: Andrii Nakryiko --- Notes: v4: - Document prog_array {delete_safe,update}_at() behavior. (Andrii) - Return -EINVAL/-ENOENT on failure in {delete_safe,update}_at(). (Andrii) - Return -ENOENT on index out of range in link_index(). (Andrii) v3: - New in v3 to support multi-prog attachments. (Alexei) include/linux/bpf.h | 3 ++ kernel/bpf/core.c | 55 +++++++++++++++++++++++ kernel/bpf/net_namespace.c | 90 ++++++++++++++++++++++++++++++++++---- 3 files changed, 139 insertions(+), 9 deletions(-) diff --git a/include/linux/bpf.h b/include/linux/bpf.h index 0cd7f6884c5c..ad9c61ae8640 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -928,6 +928,9 @@ int bpf_prog_array_copy_to_user(struct bpf_prog_array *progs, void bpf_prog_array_delete_safe(struct bpf_prog_array *progs, struct bpf_prog *old_prog); +int bpf_prog_array_delete_safe_at(struct bpf_prog_array *array, int index); +int bpf_prog_array_update_at(struct bpf_prog_array *array, int index, + struct bpf_prog *prog); int bpf_prog_array_copy_info(struct bpf_prog_array *array, u32 *prog_ids, u32 request_cnt, u32 *prog_cnt); diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c index 9df4cc9a2907..7be02e555ab9 100644 --- a/kernel/bpf/core.c +++ b/kernel/bpf/core.c @@ -1958,6 +1958,61 @@ void bpf_prog_array_delete_safe(struct bpf_prog_array *array, } } +/** + * bpf_prog_array_delete_safe_at() - Replaces the program at the given + * index into the program array with + * a dummy no-op program. + * @array: a bpf_prog_array + * @index: the index of the program to replace + * + * Skips over dummy programs, by not counting them, when calculating + * the the position of the program to replace. + * + * Return: + * * 0 - Success + * * -EINVAL - Invalid index value. Must be a non-negative integer. + * * -ENOENT - Index out of range + */ +int bpf_prog_array_delete_safe_at(struct bpf_prog_array *array, int index) +{ + return bpf_prog_array_update_at(array, index, &dummy_bpf_prog.prog); +} + +/** + * bpf_prog_array_update_at() - Updates the program at the given index + * into the program array. + * @array: a bpf_prog_array + * @index: the index of the program to update + * @prog: the program to insert into the array + * + * Skips over dummy programs, by not counting them, when calculating + * the position of the program to update. + * + * Return: + * * 0 - Success + * * -EINVAL - Invalid index value. Must be a non-negative integer. + * * -ENOENT - Index out of range + */ +int bpf_prog_array_update_at(struct bpf_prog_array *array, int index, + struct bpf_prog *prog) +{ + struct bpf_prog_array_item *item; + + if (unlikely(index < 0)) + return -EINVAL; + + for (item = array->items; item->prog; item++) { + if (item->prog == &dummy_bpf_prog.prog) + continue; + if (!index) { + WRITE_ONCE(item->prog, prog); + return 0; + } + index--; + } + return -ENOENT; +} + int bpf_prog_array_copy(struct bpf_prog_array *old_array, struct bpf_prog *exclude_prog, struct bpf_prog *include_prog, diff --git a/kernel/bpf/net_namespace.c b/kernel/bpf/net_namespace.c index 247543380fa6..988c2766ec97 100644 --- a/kernel/bpf/net_namespace.c +++ b/kernel/bpf/net_namespace.c @@ -36,12 +36,50 @@ static void netns_bpf_run_array_detach(struct net *net, bpf_prog_array_free(run_array); } +static int link_index(struct net *net, enum netns_bpf_attach_type type, + struct bpf_netns_link *link) +{ + struct bpf_netns_link *pos; + int i = 0; + + list_for_each_entry(pos, &net->bpf.links[type], node) { + if (pos == link) + return i; + i++; + } + return -ENOENT; +} + +static int link_count(struct net *net, enum netns_bpf_attach_type type) +{ + struct list_head *pos; + int i = 0; + + list_for_each(pos, &net->bpf.links[type]) + i++; + return i; +} + +static void fill_prog_array(struct net *net, enum netns_bpf_attach_type type, + struct bpf_prog_array *prog_array) +{ + struct bpf_netns_link *pos; + unsigned int i = 0; + + list_for_each_entry(pos, &net->bpf.links[type], node) { + prog_array->items[i].prog = pos->link.prog; + i++; + } +} + static void bpf_netns_link_release(struct bpf_link *link) { struct bpf_netns_link *net_link = container_of(link, struct bpf_netns_link, link); enum netns_bpf_attach_type type = net_link->netns_type; + struct bpf_prog_array *old_array, *new_array; struct net *net; + int cnt, idx; mutex_lock(&netns_bpf_mutex); @@ -53,9 +91,27 @@ static void bpf_netns_link_release(struct bpf_link *link) if (!net) goto out_unlock; - netns_bpf_run_array_detach(net, type); + /* Remember link position in case of safe delete */ + idx = link_index(net, type, net_link); list_del(&net_link->node); + cnt = link_count(net, type); + if (!cnt) { + netns_bpf_run_array_detach(net, type); + goto out_unlock; + } + + old_array = rcu_dereference_protected(net->bpf.run_array[type], + lockdep_is_held(&netns_bpf_mutex)); + new_array = bpf_prog_array_alloc(cnt, GFP_KERNEL); + if (!new_array) { + WARN_ON(bpf_prog_array_delete_safe_at(old_array, idx)); + goto out_unlock; + } + fill_prog_array(net, type, new_array); + rcu_assign_pointer(net->bpf.run_array[type], new_array); + bpf_prog_array_free(old_array); + out_unlock: mutex_unlock(&netns_bpf_mutex); } @@ -77,7 +133,7 @@ static int bpf_netns_link_update_prog(struct bpf_link *link, enum netns_bpf_attach_type type = net_link->netns_type; struct bpf_prog_array *run_array; struct net *net; - int ret = 0; + int idx, ret; if (old_prog && old_prog != link->prog) return -EPERM; @@ -95,7 +151,10 @@ static int bpf_netns_link_update_prog(struct bpf_link *link, run_array = rcu_dereference_protected(net->bpf.run_array[type], lockdep_is_held(&netns_bpf_mutex)); - WRITE_ONCE(run_array->items[0].prog, new_prog); + idx = link_index(net, type, net_link); + ret = bpf_prog_array_update_at(run_array, idx, new_prog); + if (ret) + goto out_unlock; old_prog = xchg(&link->prog, new_prog); bpf_prog_put(old_prog); @@ -295,18 +354,28 @@ int netns_bpf_prog_detach(const union bpf_attr *attr) return ret; } +static int netns_bpf_max_progs(enum netns_bpf_attach_type type) +{ + switch (type) { + case NETNS_BPF_FLOW_DISSECTOR: + return 1; + default: + return 0; + } +} + static int netns_bpf_link_attach(struct net *net, struct bpf_link *link, enum netns_bpf_attach_type type) { struct bpf_netns_link *net_link = container_of(link, struct bpf_netns_link, link); struct bpf_prog_array *run_array; - int err; + int cnt, err; mutex_lock(&netns_bpf_mutex); - /* Allow attaching only one prog or link for now */ - if (!list_empty(&net->bpf.links[type])) { + cnt = link_count(net, type); + if (cnt >= netns_bpf_max_progs(type)) { err = -E2BIG; goto out_unlock; } @@ -327,16 +396,19 @@ static int netns_bpf_link_attach(struct net *net, struct bpf_link *link, if (err) goto out_unlock; - run_array = bpf_prog_array_alloc(1, GFP_KERNEL); + run_array = bpf_prog_array_alloc(cnt + 1, GFP_KERNEL); if (!run_array) { err = -ENOMEM; goto out_unlock; } - run_array->items[0].prog = link->prog; - rcu_assign_pointer(net->bpf.run_array[type], run_array); list_add_tail(&net_link->node, &net->bpf.links[type]); + fill_prog_array(net, type, run_array); + run_array = rcu_replace_pointer(net->bpf.run_array[type], run_array, + lockdep_is_held(&netns_bpf_mutex)); + bpf_prog_array_free(run_array); + out_unlock: mutex_unlock(&netns_bpf_mutex); return err; From patchwork Mon Jul 13 17:46:40 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jakub Sitnicki X-Patchwork-Id: 1328341 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org (client-ip=23.128.96.18; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=reject dis=none) header.from=cloudflare.com Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; secure) header.d=cloudflare.com header.i=@cloudflare.com header.a=rsa-sha256 header.s=google header.b=Xsqfq1e+; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by ozlabs.org (Postfix) with ESMTP id 4B5B135D93z9sSJ for ; Tue, 14 Jul 2020 03:47:03 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730274AbgGMRrC (ORCPT ); Mon, 13 Jul 2020 13:47:02 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60268 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729703AbgGMRrB (ORCPT ); Mon, 13 Jul 2020 13:47:01 -0400 Received: from mail-lf1-x142.google.com (mail-lf1-x142.google.com [IPv6:2a00:1450:4864:20::142]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 73324C061755 for ; Mon, 13 Jul 2020 10:47:01 -0700 (PDT) Received: by mail-lf1-x142.google.com with SMTP id y13so9600602lfe.9 for ; Mon, 13 Jul 2020 10:47:01 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cloudflare.com; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=7bIusmK25t+OvYXtTDBqMTp8M1en4/Qe8ODN3kY6Mb8=; b=Xsqfq1e+xdTAC0c7tjEdwIoCDSwNjUjQiZzfj0P6IJ14XNdwt0UwEL1dF0+LQMypNk 9iZ5neOOyUKN0qQaeMDuegpTgeCtj1bN3c2RUM8aoEQGF0euFboqc/k61tdvvnmIgM1d B0lD5ELrlJrpzMzsw3Bo4PsCn4uNO/lkrbnMw= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=7bIusmK25t+OvYXtTDBqMTp8M1en4/Qe8ODN3kY6Mb8=; b=mM2oMAmKNy7vAT5mDeYVKGwUHLfS3tmyGyGyT/b0mulCRB7UudPOR3zvyD/bQUxRyF FlJTR0qRv+2HEjJDKyqhLYmiOkO6QASvqMG+LLIzxY3k2d2zyCm5yFGveeK1rKzrzr/F pOjomLRduvNeA9NvFl+O6s4s/gqA5eegbN9lYQQwo8mc4Q4jU4KAxpkQa96CVbw1UsRP HMy1/y//yoGeOvi3NtLqxv6MOdEqbmOuAdELNLnfA6V8macxyMCdclRemIOGHgpvi/lU oj/Tz9la1pz5IGIvzya84HzzWHtanD1FAzjzaiYr2JQh1mqeV46aUwv9nw4vErX92vrg hLHA== X-Gm-Message-State: AOAM531V0MflPd8mXOZKai42omIWiLMBLIx1Kv9kqyNCAh3/owusEpkY hcBHN5jAlMA/RyO6q2+fq0Ly+w== X-Google-Smtp-Source: ABdhPJwPwJD4cu+xD/Sj8sDOFsFTEzkAzA2g43TxEErlWhmNsfUR23wSTG+ESF8oyNEE4pQvBl5lww== X-Received: by 2002:a19:c886:: with SMTP id y128mr170115lff.98.1594662419713; Mon, 13 Jul 2020 10:46:59 -0700 (PDT) Received: from cloudflare.com ([2a02:a310:c262:aa00:b35e:8938:2c2a:ba8b]) by smtp.gmail.com with ESMTPSA id z16sm4162112ljg.68.2020.07.13.10.46.58 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 13 Jul 2020 10:46:59 -0700 (PDT) From: Jakub Sitnicki To: bpf@vger.kernel.org Cc: netdev@vger.kernel.org, kernel-team@cloudflare.com, Alexei Starovoitov , Daniel Borkmann , "David S. Miller" , Jakub Kicinski , Marek Majkowski Subject: [PATCH bpf-next v4 02/16] bpf: Introduce SK_LOOKUP program type with a dedicated attach point Date: Mon, 13 Jul 2020 19:46:40 +0200 Message-Id: <20200713174654.642628-3-jakub@cloudflare.com> X-Mailer: git-send-email 2.25.4 In-Reply-To: <20200713174654.642628-1-jakub@cloudflare.com> References: <20200713174654.642628-1-jakub@cloudflare.com> MIME-Version: 1.0 Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org Add a new program type BPF_PROG_TYPE_SK_LOOKUP with a dedicated attach type BPF_SK_LOOKUP. The new program kind is to be invoked by the transport layer when looking up a listening socket for a new connection request for connection oriented protocols, or when looking up an unconnected socket for a packet for connection-less protocols. When called, SK_LOOKUP BPF program can select a socket that will receive the packet. This serves as a mechanism to overcome the limits of what bind() API allows to express. Two use-cases driving this work are: (1) steer packets destined to an IP range, on fixed port to a socket 192.0.2.0/24, port 80 -> NGINX socket (2) steer packets destined to an IP address, on any port to a socket 198.51.100.1, any port -> L7 proxy socket In its run-time context program receives information about the packet that triggered the socket lookup. Namely IP version, L4 protocol identifier, and address 4-tuple. Context can be further extended to include ingress interface identifier. To select a socket BPF program fetches it from a map holding socket references, like SOCKMAP or SOCKHASH, and calls bpf_sk_assign(ctx, sk, ...) helper to record the selection. Transport layer then uses the selected socket as a result of socket lookup. This patch only enables the user to attach an SK_LOOKUP program to a network namespace. Subsequent patches hook it up to run on local delivery path in ipv4 and ipv6 stacks. Suggested-by: Marek Majkowski Signed-off-by: Jakub Sitnicki Acked-by: Andrii Nakryiko --- Notes: v4: - Reintroduce narrow load support for most BPF context fields. (Yonghong) - Fix null-ptr-deref in BPF context access when IPv6 address not set. - Unpack v4/v6 IP address union in bpf_sk_lookup context type. - Add verifier support for ARG_PTR_TO_SOCKET_OR_NULL. - Allow resetting socket selection with bpf_sk_assign(ctx, NULL). - Document that bpf_sk_assign accepts a NULL socket. v3: - Allow bpf_sk_assign helper to replace previously selected socket only when BPF_SK_LOOKUP_F_REPLACE flag is set, as a precaution for multiple programs running in series to accidentally override each other's verdict. - Let BPF program decide that load-balancing within a reuseport socket group should be skipped for the socket selected with bpf_sk_assign() by passing BPF_SK_LOOKUP_F_NO_REUSEPORT flag. (Martin) - Extend struct bpf_sk_lookup program context with an 'sk' field containing the selected socket with an intention for multiple attached program running in series to see each other's choices. However, currently the verifier doesn't allow checking if pointer is set. - Use bpf-netns infra for link-based multi-program attachment. (Alexei) - Get rid of macros in convert_ctx_access to make it easier to read. - Disallow 1-,2-byte access to context fields containing IP addresses. v2: - Make bpf_sk_assign reject sockets that don't use RCU freeing. Update bpf_sk_assign docs accordingly. (Martin) - Change bpf_sk_assign proto to take PTR_TO_SOCKET as argument. (Martin) - Fix broken build when CONFIG_INET is not selected. (Martin) - Rename bpf_sk_lookup{} src_/dst_* fields remote_/local_*. (Martin) - Enforce BPF_SK_LOOKUP attach point on load & attach. (Martin) include/linux/bpf-netns.h | 3 + include/linux/bpf.h | 1 + include/linux/bpf_types.h | 2 + include/linux/filter.h | 17 ++++ include/uapi/linux/bpf.h | 77 ++++++++++++++++ kernel/bpf/net_namespace.c | 5 ++ kernel/bpf/syscall.c | 9 ++ kernel/bpf/verifier.c | 10 ++- net/core/filter.c | 179 +++++++++++++++++++++++++++++++++++++ scripts/bpf_helpers_doc.py | 9 +- 10 files changed, 308 insertions(+), 4 deletions(-) diff --git a/include/linux/bpf-netns.h b/include/linux/bpf-netns.h index 4052d649f36d..cb1d849c5d4f 100644 --- a/include/linux/bpf-netns.h +++ b/include/linux/bpf-netns.h @@ -8,6 +8,7 @@ enum netns_bpf_attach_type { NETNS_BPF_INVALID = -1, NETNS_BPF_FLOW_DISSECTOR = 0, + NETNS_BPF_SK_LOOKUP, MAX_NETNS_BPF_ATTACH_TYPE }; @@ -17,6 +18,8 @@ to_netns_bpf_attach_type(enum bpf_attach_type attach_type) switch (attach_type) { case BPF_FLOW_DISSECTOR: return NETNS_BPF_FLOW_DISSECTOR; + case BPF_SK_LOOKUP: + return NETNS_BPF_SK_LOOKUP; default: return NETNS_BPF_INVALID; } diff --git a/include/linux/bpf.h b/include/linux/bpf.h index ad9c61ae8640..f092d13bdd08 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -249,6 +249,7 @@ enum bpf_arg_type { ARG_PTR_TO_INT, /* pointer to int */ ARG_PTR_TO_LONG, /* pointer to long */ ARG_PTR_TO_SOCKET, /* pointer to bpf_sock (fullsock) */ + ARG_PTR_TO_SOCKET_OR_NULL, /* pointer to bpf_sock (fullsock) or NULL */ ARG_PTR_TO_BTF_ID, /* pointer to in-kernel struct */ ARG_PTR_TO_ALLOC_MEM, /* pointer to dynamically allocated memory */ ARG_PTR_TO_ALLOC_MEM_OR_NULL, /* pointer to dynamically allocated memory or NULL */ diff --git a/include/linux/bpf_types.h b/include/linux/bpf_types.h index a18ae82a298a..a52a5688418e 100644 --- a/include/linux/bpf_types.h +++ b/include/linux/bpf_types.h @@ -64,6 +64,8 @@ BPF_PROG_TYPE(BPF_PROG_TYPE_LIRC_MODE2, lirc_mode2, #ifdef CONFIG_INET BPF_PROG_TYPE(BPF_PROG_TYPE_SK_REUSEPORT, sk_reuseport, struct sk_reuseport_md, struct sk_reuseport_kern) +BPF_PROG_TYPE(BPF_PROG_TYPE_SK_LOOKUP, sk_lookup, + struct bpf_sk_lookup, struct bpf_sk_lookup_kern) #endif #if defined(CONFIG_BPF_JIT) BPF_PROG_TYPE(BPF_PROG_TYPE_STRUCT_OPS, bpf_struct_ops, diff --git a/include/linux/filter.h b/include/linux/filter.h index 259377723603..380746f47fa1 100644 --- a/include/linux/filter.h +++ b/include/linux/filter.h @@ -1278,4 +1278,21 @@ struct bpf_sockopt_kern { s32 retval; }; +struct bpf_sk_lookup_kern { + u16 family; + u16 protocol; + struct { + __be32 saddr; + __be32 daddr; + } v4; + struct { + const struct in6_addr *saddr; + const struct in6_addr *daddr; + } v6; + __be16 sport; + u16 dport; + struct sock *selected_sk; + bool no_reuseport; +}; + #endif /* __LINUX_FILTER_H__ */ diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index 548a749aebb3..e2ffeb150d0f 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -189,6 +189,7 @@ enum bpf_prog_type { BPF_PROG_TYPE_STRUCT_OPS, BPF_PROG_TYPE_EXT, BPF_PROG_TYPE_LSM, + BPF_PROG_TYPE_SK_LOOKUP, }; enum bpf_attach_type { @@ -227,6 +228,7 @@ enum bpf_attach_type { BPF_CGROUP_INET6_GETSOCKNAME, BPF_XDP_DEVMAP, BPF_CGROUP_INET_SOCK_RELEASE, + BPF_SK_LOOKUP, __MAX_BPF_ATTACH_TYPE }; @@ -3068,6 +3070,10 @@ union bpf_attr { * * long bpf_sk_assign(struct sk_buff *skb, struct bpf_sock *sk, u64 flags) * Description + * Helper is overloaded depending on BPF program type. This + * description applies to **BPF_PROG_TYPE_SCHED_CLS** and + * **BPF_PROG_TYPE_SCHED_ACT** programs. + * * Assign the *sk* to the *skb*. When combined with appropriate * routing configuration to receive the packet towards the socket, * will cause *skb* to be delivered to the specified socket. @@ -3093,6 +3099,56 @@ union bpf_attr { * **-ESOCKTNOSUPPORT** if the socket type is not supported * (reuseport). * + * long bpf_sk_assign(struct bpf_sk_lookup *ctx, struct bpf_sock *sk, u64 flags) + * Description + * Helper is overloaded depending on BPF program type. This + * description applies to **BPF_PROG_TYPE_SK_LOOKUP** programs. + * + * Select the *sk* as a result of a socket lookup. + * + * For the operation to succeed passed socket must be compatible + * with the packet description provided by the *ctx* object. + * + * L4 protocol (**IPPROTO_TCP** or **IPPROTO_UDP**) must + * be an exact match. While IP family (**AF_INET** or + * **AF_INET6**) must be compatible, that is IPv6 sockets + * that are not v6-only can be selected for IPv4 packets. + * + * Only TCP listeners and UDP unconnected sockets can be + * selected. *sk* can also be NULL to reset any previous + * selection. + * + * *flags* argument can combination of following values: + * + * * **BPF_SK_LOOKUP_F_REPLACE** to override the previous + * socket selection, potentially done by a BPF program + * that ran before us. + * + * * **BPF_SK_LOOKUP_F_NO_REUSEPORT** to skip + * load-balancing within reuseport group for the socket + * being selected. + * + * On success *ctx->sk* will point to the selected socket. + * + * Return + * 0 on success, or a negative errno in case of failure. + * + * * **-EAFNOSUPPORT** if socket family (*sk->family*) is + * not compatible with packet family (*ctx->family*). + * + * * **-EEXIST** if socket has been already selected, + * potentially by another program, and + * **BPF_SK_LOOKUP_F_REPLACE** flag was not specified. + * + * * **-EINVAL** if unsupported flags were specified. + * + * * **-EPROTOTYPE** if socket L4 protocol + * (*sk->protocol*) doesn't match packet protocol + * (*ctx->protocol*). + * + * * **-ESOCKTNOSUPPORT** if socket is not in allowed + * state (TCP listening or UDP unconnected). + * * u64 bpf_ktime_get_boot_ns(void) * Description * Return the time elapsed since system boot, in nanoseconds. @@ -3605,6 +3661,12 @@ enum { BPF_RINGBUF_HDR_SZ = 8, }; +/* BPF_FUNC_sk_assign flags in bpf_sk_lookup context. */ +enum { + BPF_SK_LOOKUP_F_REPLACE = (1ULL << 0), + BPF_SK_LOOKUP_F_NO_REUSEPORT = (1ULL << 1), +}; + /* Mode for BPF_FUNC_skb_adjust_room helper. */ enum bpf_adj_room_mode { BPF_ADJ_ROOM_NET, @@ -4334,4 +4396,19 @@ struct bpf_pidns_info { __u32 pid; __u32 tgid; }; + +/* User accessible data for SK_LOOKUP programs. Add new fields at the end. */ +struct bpf_sk_lookup { + __bpf_md_ptr(struct bpf_sock *, sk); /* Selected socket */ + + __u32 family; /* Protocol family (AF_INET, AF_INET6) */ + __u32 protocol; /* IP protocol (IPPROTO_TCP, IPPROTO_UDP) */ + __u32 remote_ip4; /* Network byte order */ + __u32 remote_ip6[4]; /* Network byte order */ + __u32 remote_port; /* Network byte order */ + __u32 local_ip4; /* Network byte order */ + __u32 local_ip6[4]; /* Network byte order */ + __u32 local_port; /* Host byte order */ +}; + #endif /* _UAPI__LINUX_BPF_H__ */ diff --git a/kernel/bpf/net_namespace.c b/kernel/bpf/net_namespace.c index 988c2766ec97..596c30b963f3 100644 --- a/kernel/bpf/net_namespace.c +++ b/kernel/bpf/net_namespace.c @@ -359,6 +359,8 @@ static int netns_bpf_max_progs(enum netns_bpf_attach_type type) switch (type) { case NETNS_BPF_FLOW_DISSECTOR: return 1; + case NETNS_BPF_SK_LOOKUP: + return 64; default: return 0; } @@ -389,6 +391,9 @@ static int netns_bpf_link_attach(struct net *net, struct bpf_link *link, case NETNS_BPF_FLOW_DISSECTOR: err = flow_dissector_bpf_prog_attach_check(net, link->prog); break; + case NETNS_BPF_SK_LOOKUP: + err = 0; /* nothing to check */ + break; default: err = -EINVAL; break; diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c index 156f51ffada2..945975a14582 100644 --- a/kernel/bpf/syscall.c +++ b/kernel/bpf/syscall.c @@ -2022,6 +2022,10 @@ bpf_prog_load_check_attach(enum bpf_prog_type prog_type, default: return -EINVAL; } + case BPF_PROG_TYPE_SK_LOOKUP: + if (expected_attach_type == BPF_SK_LOOKUP) + return 0; + return -EINVAL; case BPF_PROG_TYPE_EXT: if (expected_attach_type) return -EINVAL; @@ -2756,6 +2760,7 @@ static int bpf_prog_attach_check_attach_type(const struct bpf_prog *prog, case BPF_PROG_TYPE_CGROUP_SOCK: case BPF_PROG_TYPE_CGROUP_SOCK_ADDR: case BPF_PROG_TYPE_CGROUP_SOCKOPT: + case BPF_PROG_TYPE_SK_LOOKUP: return attach_type == prog->expected_attach_type ? 0 : -EINVAL; case BPF_PROG_TYPE_CGROUP_SKB: if (!capable(CAP_NET_ADMIN)) @@ -2817,6 +2822,8 @@ attach_type_to_prog_type(enum bpf_attach_type attach_type) return BPF_PROG_TYPE_CGROUP_SOCKOPT; case BPF_TRACE_ITER: return BPF_PROG_TYPE_TRACING; + case BPF_SK_LOOKUP: + return BPF_PROG_TYPE_SK_LOOKUP; default: return BPF_PROG_TYPE_UNSPEC; } @@ -2955,6 +2962,7 @@ static int bpf_prog_query(const union bpf_attr *attr, case BPF_LIRC_MODE2: return lirc_prog_query(attr, uattr); case BPF_FLOW_DISSECTOR: + case BPF_SK_LOOKUP: return netns_bpf_prog_query(attr, uattr); default: return -EINVAL; @@ -3888,6 +3896,7 @@ static int link_create(union bpf_attr *attr) ret = tracing_bpf_link_attach(attr, prog); break; case BPF_PROG_TYPE_FLOW_DISSECTOR: + case BPF_PROG_TYPE_SK_LOOKUP: ret = netns_bpf_link_create(attr, prog); break; default: diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 2196523c9716..cc6cc92c19e2 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -3879,10 +3879,14 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg, } meta->ref_obj_id = reg->ref_obj_id; } - } else if (arg_type == ARG_PTR_TO_SOCKET) { + } else if (arg_type == ARG_PTR_TO_SOCKET || + arg_type == ARG_PTR_TO_SOCKET_OR_NULL) { expected_type = PTR_TO_SOCKET; - if (type != expected_type) - goto err_type; + if (!(register_is_null(reg) && + arg_type == ARG_PTR_TO_SOCKET_OR_NULL)) { + if (type != expected_type) + goto err_type; + } } else if (arg_type == ARG_PTR_TO_BTF_ID) { expected_type = PTR_TO_BTF_ID; if (type != expected_type) diff --git a/net/core/filter.c b/net/core/filter.c index ddcc0d6209e1..81c462881133 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -9220,6 +9220,185 @@ const struct bpf_verifier_ops sk_reuseport_verifier_ops = { const struct bpf_prog_ops sk_reuseport_prog_ops = { }; + +BPF_CALL_3(bpf_sk_lookup_assign, struct bpf_sk_lookup_kern *, ctx, + struct sock *, sk, u64, flags) +{ + if (unlikely(flags & ~(BPF_SK_LOOKUP_F_REPLACE | + BPF_SK_LOOKUP_F_NO_REUSEPORT))) + return -EINVAL; + if (unlikely(sk && sk_is_refcounted(sk))) + return -ESOCKTNOSUPPORT; /* reject non-RCU freed sockets */ + if (unlikely(sk && sk->sk_state == TCP_ESTABLISHED)) + return -ESOCKTNOSUPPORT; /* reject connected sockets */ + + /* Check if socket is suitable for packet L3/L4 protocol */ + if (sk && sk->sk_protocol != ctx->protocol) + return -EPROTOTYPE; + if (sk && sk->sk_family != ctx->family && + (sk->sk_family == AF_INET || ipv6_only_sock(sk))) + return -EAFNOSUPPORT; + + if (ctx->selected_sk && !(flags & BPF_SK_LOOKUP_F_REPLACE)) + return -EEXIST; + + /* Select socket as lookup result */ + ctx->selected_sk = sk; + ctx->no_reuseport = flags & BPF_SK_LOOKUP_F_NO_REUSEPORT; + return 0; +} + +static const struct bpf_func_proto bpf_sk_lookup_assign_proto = { + .func = bpf_sk_lookup_assign, + .gpl_only = false, + .ret_type = RET_INTEGER, + .arg1_type = ARG_PTR_TO_CTX, + .arg2_type = ARG_PTR_TO_SOCKET_OR_NULL, + .arg3_type = ARG_ANYTHING, +}; + +static const struct bpf_func_proto * +sk_lookup_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) +{ + switch (func_id) { + case BPF_FUNC_sk_assign: + return &bpf_sk_lookup_assign_proto; + case BPF_FUNC_sk_release: + return &bpf_sk_release_proto; + default: + return bpf_base_func_proto(func_id); + } +} + +static bool sk_lookup_is_valid_access(int off, int size, + enum bpf_access_type type, + const struct bpf_prog *prog, + struct bpf_insn_access_aux *info) +{ + if (off < 0 || off >= sizeof(struct bpf_sk_lookup)) + return false; + if (off % size != 0) + return false; + if (type != BPF_READ) + return false; + + switch (off) { + case offsetof(struct bpf_sk_lookup, sk): + info->reg_type = PTR_TO_SOCKET_OR_NULL; + return size == sizeof(__u64); + + case bpf_ctx_range(struct bpf_sk_lookup, family): + case bpf_ctx_range(struct bpf_sk_lookup, protocol): + case bpf_ctx_range(struct bpf_sk_lookup, remote_ip4): + case bpf_ctx_range(struct bpf_sk_lookup, local_ip4): + case bpf_ctx_range_till(struct bpf_sk_lookup, remote_ip6[0], remote_ip6[3]): + case bpf_ctx_range_till(struct bpf_sk_lookup, local_ip6[0], local_ip6[3]): + case bpf_ctx_range(struct bpf_sk_lookup, remote_port): + case bpf_ctx_range(struct bpf_sk_lookup, local_port): + bpf_ctx_record_field_size(info, sizeof(__u32)); + return bpf_ctx_narrow_access_ok(off, size, sizeof(__u32)); + + default: + return false; + } +} + +static u32 sk_lookup_convert_ctx_access(enum bpf_access_type type, + const struct bpf_insn *si, + struct bpf_insn *insn_buf, + struct bpf_prog *prog, + u32 *target_size) +{ + struct bpf_insn *insn = insn_buf; +#if IS_ENABLED(CONFIG_IPV6) + int off; +#endif + + switch (si->off) { + case offsetof(struct bpf_sk_lookup, sk): + *insn++ = BPF_LDX_MEM(BPF_SIZEOF(void *), si->dst_reg, si->src_reg, + offsetof(struct bpf_sk_lookup_kern, selected_sk)); + break; + + case offsetof(struct bpf_sk_lookup, family): + *insn++ = BPF_LDX_MEM(BPF_H, si->dst_reg, si->src_reg, + bpf_target_off(struct bpf_sk_lookup_kern, + family, 2, target_size)); + break; + + case offsetof(struct bpf_sk_lookup, protocol): + *insn++ = BPF_LDX_MEM(BPF_H, si->dst_reg, si->src_reg, + bpf_target_off(struct bpf_sk_lookup_kern, + protocol, 2, target_size)); + break; + + case offsetof(struct bpf_sk_lookup, remote_ip4): + *insn++ = BPF_LDX_MEM(BPF_W, si->dst_reg, si->src_reg, + bpf_target_off(struct bpf_sk_lookup_kern, + v4.saddr, 4, target_size)); + break; + + case offsetof(struct bpf_sk_lookup, local_ip4): + *insn++ = BPF_LDX_MEM(BPF_W, si->dst_reg, si->src_reg, + bpf_target_off(struct bpf_sk_lookup_kern, + v4.daddr, 4, target_size)); + break; + + case bpf_ctx_range_till(struct bpf_sk_lookup, + remote_ip6[0], remote_ip6[3]): +#if IS_ENABLED(CONFIG_IPV6) + off = si->off; + off -= offsetof(struct bpf_sk_lookup, remote_ip6[0]); + off += bpf_target_off(struct in6_addr, s6_addr32[0], 4, target_size); + *insn++ = BPF_LDX_MEM(BPF_SIZEOF(void *), si->dst_reg, si->src_reg, + offsetof(struct bpf_sk_lookup_kern, v6.saddr)); + *insn++ = BPF_JMP_IMM(BPF_JEQ, si->dst_reg, 0, 1); + *insn++ = BPF_LDX_MEM(BPF_W, si->dst_reg, si->dst_reg, off); +#else + *insn++ = BPF_MOV32_IMM(si->dst_reg, 0); +#endif + break; + + case bpf_ctx_range_till(struct bpf_sk_lookup, + local_ip6[0], local_ip6[3]): +#if IS_ENABLED(CONFIG_IPV6) + off = si->off; + off -= offsetof(struct bpf_sk_lookup, local_ip6[0]); + off += bpf_target_off(struct in6_addr, s6_addr32[0], 4, target_size); + *insn++ = BPF_LDX_MEM(BPF_SIZEOF(void *), si->dst_reg, si->src_reg, + offsetof(struct bpf_sk_lookup_kern, v6.daddr)); + *insn++ = BPF_JMP_IMM(BPF_JEQ, si->dst_reg, 0, 1); + *insn++ = BPF_LDX_MEM(BPF_W, si->dst_reg, si->dst_reg, off); +#else + *insn++ = BPF_MOV32_IMM(si->dst_reg, 0); +#endif + break; + + case offsetof(struct bpf_sk_lookup, remote_port): + *insn++ = BPF_LDX_MEM(BPF_H, si->dst_reg, si->src_reg, + bpf_target_off(struct bpf_sk_lookup_kern, + sport, 2, target_size)); + break; + + case offsetof(struct bpf_sk_lookup, local_port): + *insn++ = BPF_LDX_MEM(BPF_H, si->dst_reg, si->src_reg, + bpf_target_off(struct bpf_sk_lookup_kern, + dport, 2, target_size)); + break; + } + + return insn - insn_buf; +} + +const struct bpf_prog_ops sk_lookup_prog_ops = { +}; + +const struct bpf_verifier_ops sk_lookup_verifier_ops = { + .get_func_proto = sk_lookup_func_proto, + .is_valid_access = sk_lookup_is_valid_access, + .convert_ctx_access = sk_lookup_convert_ctx_access, +}; + #endif /* CONFIG_INET */ DEFINE_BPF_DISPATCHER(xdp) diff --git a/scripts/bpf_helpers_doc.py b/scripts/bpf_helpers_doc.py index 6843376733df..5bfa448b4704 100755 --- a/scripts/bpf_helpers_doc.py +++ b/scripts/bpf_helpers_doc.py @@ -404,6 +404,7 @@ class PrinterHelpers(Printer): type_fwds = [ 'struct bpf_fib_lookup', + 'struct bpf_sk_lookup', 'struct bpf_perf_event_data', 'struct bpf_perf_event_value', 'struct bpf_pidns_info', @@ -450,6 +451,7 @@ class PrinterHelpers(Printer): 'struct bpf_perf_event_data', 'struct bpf_perf_event_value', 'struct bpf_pidns_info', + 'struct bpf_sk_lookup', 'struct bpf_sock', 'struct bpf_sock_addr', 'struct bpf_sock_ops', @@ -487,6 +489,11 @@ class PrinterHelpers(Printer): 'struct sk_msg_buff': 'struct sk_msg_md', 'struct xdp_buff': 'struct xdp_md', } + # Helpers overloaded for different context types. + overloaded_helpers = [ + 'bpf_get_socket_cookie', + 'bpf_sk_assign', + ] def print_header(self): header = '''\ @@ -543,7 +550,7 @@ class PrinterHelpers(Printer): for i, a in enumerate(proto['args']): t = a['type'] n = a['name'] - if proto['name'] == 'bpf_get_socket_cookie' and i == 0: + if proto['name'] in self.overloaded_helpers and i == 0: t = 'void' n = 'ctx' one_arg = '{}{}'.format(comma, self.map_type(t)) From patchwork Mon Jul 13 17:46:41 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jakub Sitnicki X-Patchwork-Id: 1328344 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Original-To: incoming-bpf@patchwork.ozlabs.org Delivered-To: patchwork-incoming-bpf@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org (client-ip=23.128.96.18; helo=vger.kernel.org; envelope-from=bpf-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=reject dis=none) header.from=cloudflare.com Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; secure) header.d=cloudflare.com header.i=@cloudflare.com header.a=rsa-sha256 header.s=google header.b=LXhwG2An; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by ozlabs.org (Postfix) with ESMTP id 4B5B181VS6z9sR4 for ; Tue, 14 Jul 2020 03:47:08 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730386AbgGMRrH (ORCPT ); Mon, 13 Jul 2020 13:47:07 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60280 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730297AbgGMRrD (ORCPT ); Mon, 13 Jul 2020 13:47:03 -0400 Received: from mail-lf1-x144.google.com (mail-lf1-x144.google.com [IPv6:2a00:1450:4864:20::144]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7EE6AC061794 for ; Mon, 13 Jul 2020 10:47:03 -0700 (PDT) Received: by mail-lf1-x144.google.com with SMTP id y13so9600668lfe.9 for ; Mon, 13 Jul 2020 10:47:03 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cloudflare.com; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=jDSEEbaluBZs+0tECJKbkPZc+3U6GYzy2UWo7TMx4PQ=; b=LXhwG2Ana8/kJY+LED0FuwqzhBaLI5+UE19TVBMjlpjp33lylKcgcw7e1tCm3DQS3R XaMWDl0XAvahKKP/DUAKFxb/Ez/cIjRpkDBBXtcYC21XlLRnPB7fJZS42HNNQEgXaf/q rPyc5a813qGJ6VEPTkfrBdrR83VMXwJb0pxzc= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=jDSEEbaluBZs+0tECJKbkPZc+3U6GYzy2UWo7TMx4PQ=; b=nr43s5b5zcv+y6TUEV8fda/i5fzKCZ4KpZeefN8dCw46qx8b8Y03OTgEK/OPubwwnl 2BgY4v+apFpBcQ9TeaYkn9+98aOsr7YAH/lsNOvQUhUtM+bo2hl189wUCjcQKb+t7nUA aZY5SJAISl++Z5mMU91yRhk/a9R6jkh+/+vZI3fOsz7xcxmW+JI24uzHjL858pwf3GrW BohEzGY6Hpzo9/AjFAZrZpIJJGlQf3mVnXw6dVcbu7CkhUTg/JkS1pVl1Fyf2XrcfG45 A5JXtnRRBX+RPVMIkt867DUzmmG59RbgQyQMhf+UJhYLLKydFPABdfUIZTwez/69YJld RStw== X-Gm-Message-State: AOAM530b96PeSue2o/gtRWdjsrJqqIQo+uWgO4mqSYKJD7tAxwsSGGic aqXuxZ9CeUt7YEI/2WraHj+KsN4w3EHnvA== X-Google-Smtp-Source: ABdhPJytKuI10VUN5tx6g67NHROdnxuFoPpXxDYHGu+GqkdsG0zOStJQNKfZ1eHGJXHGQEit/WlQ5w== X-Received: by 2002:a19:c886:: with SMTP id y128mr170168lff.98.1594662421680; Mon, 13 Jul 2020 10:47:01 -0700 (PDT) Received: from cloudflare.com ([2a02:a310:c262:aa00:b35e:8938:2c2a:ba8b]) by smtp.gmail.com with ESMTPSA id b11sm4736057lfa.50.2020.07.13.10.47.00 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 13 Jul 2020 10:47:01 -0700 (PDT) From: Jakub Sitnicki To: bpf@vger.kernel.org Cc: netdev@vger.kernel.org, kernel-team@cloudflare.com, Alexei Starovoitov , Daniel Borkmann , "David S. Miller" , Jakub Kicinski Subject: [PATCH bpf-next v4 03/16] inet: Extract helper for selecting socket from reuseport group Date: Mon, 13 Jul 2020 19:46:41 +0200 Message-Id: <20200713174654.642628-4-jakub@cloudflare.com> X-Mailer: git-send-email 2.25.4 In-Reply-To: <20200713174654.642628-1-jakub@cloudflare.com> References: <20200713174654.642628-1-jakub@cloudflare.com> MIME-Version: 1.0 Sender: bpf-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org Prepare for calling into reuseport from __inet_lookup_listener as well. Signed-off-by: Jakub Sitnicki Acked-by: Andrii Nakryiko --- net/ipv4/inet_hashtables.c | 29 ++++++++++++++++++++--------- 1 file changed, 20 insertions(+), 9 deletions(-) diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c index 2bbaaf0c7176..ab64834837c8 100644 --- a/net/ipv4/inet_hashtables.c +++ b/net/ipv4/inet_hashtables.c @@ -246,6 +246,21 @@ static inline int compute_score(struct sock *sk, struct net *net, return score; } +static inline struct sock *lookup_reuseport(struct net *net, struct sock *sk, + struct sk_buff *skb, int doff, + __be32 saddr, __be16 sport, + __be32 daddr, unsigned short hnum) +{ + struct sock *reuse_sk = NULL; + u32 phash; + + if (sk->sk_reuseport) { + phash = inet_ehashfn(net, daddr, hnum, saddr, sport); + reuse_sk = reuseport_select_sock(sk, phash, skb, doff); + } + return reuse_sk; +} + /* * Here are some nice properties to exploit here. The BSD API * does not allow a listening sock to specify the remote port nor the @@ -265,21 +280,17 @@ static struct sock *inet_lhash2_lookup(struct net *net, struct inet_connection_sock *icsk; struct sock *sk, *result = NULL; int score, hiscore = 0; - u32 phash = 0; inet_lhash2_for_each_icsk_rcu(icsk, &ilb2->head) { sk = (struct sock *)icsk; score = compute_score(sk, net, hnum, daddr, dif, sdif, exact_dif); if (score > hiscore) { - if (sk->sk_reuseport) { - phash = inet_ehashfn(net, daddr, hnum, - saddr, sport); - result = reuseport_select_sock(sk, phash, - skb, doff); - if (result) - return result; - } + result = lookup_reuseport(net, sk, skb, doff, + saddr, sport, daddr, hnum); + if (result) + return result; + result = sk; hiscore = score; } From patchwork Mon Jul 13 17:46:42 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jakub Sitnicki X-Patchwork-Id: 1328346 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Original-To: incoming-bpf@patchwork.ozlabs.org Delivered-To: patchwork-incoming-bpf@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org (client-ip=23.128.96.18; helo=vger.kernel.org; envelope-from=bpf-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=reject dis=none) header.from=cloudflare.com Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; secure) header.d=cloudflare.com header.i=@cloudflare.com header.a=rsa-sha256 header.s=google header.b=a1ct+skd; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by ozlabs.org (Postfix) with ESMTP id 4B5B185T1zz9sRW for ; Tue, 14 Jul 2020 03:47:08 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730395AbgGMRrH (ORCPT ); Mon, 13 Jul 2020 13:47:07 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60288 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730319AbgGMRrG (ORCPT ); Mon, 13 Jul 2020 13:47:06 -0400 Received: from mail-lf1-x141.google.com (mail-lf1-x141.google.com [IPv6:2a00:1450:4864:20::141]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A2255C08C5DD for ; Mon, 13 Jul 2020 10:47:05 -0700 (PDT) Received: by mail-lf1-x141.google.com with SMTP id y13so9600749lfe.9 for ; Mon, 13 Jul 2020 10:47:05 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cloudflare.com; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=dcblGvWhBWCzM643yO1GenYw1yXlx9Mn/KsJt0ZGcP4=; b=a1ct+skdVvh1aO+JIKVD9+N25TdD+eLWjFfuh96vrpH+GfEr3P/7OCAOojRnelTgs9 rJS6H9nFEcedN6kk3ttMiHCYzOVOmR3XzgnQDjU50uGHk8N5dTBeyIjInb7xEN+DWzSN NORcgZtRZr4uo/Unc4D7VUPV5wT/Og5akrkt8= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=dcblGvWhBWCzM643yO1GenYw1yXlx9Mn/KsJt0ZGcP4=; b=MaMTi2LR+sJJ+tAXKTGY63vDYqDh84qADX+kLKjaTmTFZuv8czLd+ZOqUmEFLP1HKr Y5uDofjbIhtyHnWGgjTqsLHdmf21YfnXbenRmLBXy/TrNOiXb5ZkreKxjzWSest4XGlv XusE8DNK00LTtziZ1yGDZgO3N5uL/iY8SJEyk6EL5y95l6w8hc9LxTgVArldFZgAEY/w FB2kn85vrJO++gpXfjpmpdC8HWY9t+eZVHZUe2rygXYLhT6BGm7CfQr3LElCxTWXRbeg gwD/uu3gSa+kSeDHfWJ+c1Lr3hlECVPDqQ0QQJv5gRPmDjn82b94/2iDzLZhMQpM7x0u imNg== X-Gm-Message-State: AOAM530XqZEvM9UNfkxpngNmn0gmQ8ieZLlb5tw46QNSETEfv8NAnHP8 EAyNAS8Jpwt9drdB9pzaxsqA0HvVh/VP6Q== X-Google-Smtp-Source: ABdhPJzgtWnsFDRcL2v+crbBG+Jw4Sf07C9czStdGuHZanBVUu43t1WqdYkZH0Ovv+5NU9SJmtgtIg== X-Received: by 2002:a19:e05d:: with SMTP id g29mr142258lfj.217.1594662423437; Mon, 13 Jul 2020 10:47:03 -0700 (PDT) Received: from cloudflare.com ([2a02:a310:c262:aa00:b35e:8938:2c2a:ba8b]) by smtp.gmail.com with ESMTPSA id l22sm5752360lje.81.2020.07.13.10.47.02 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 13 Jul 2020 10:47:02 -0700 (PDT) From: Jakub Sitnicki To: bpf@vger.kernel.org Cc: netdev@vger.kernel.org, kernel-team@cloudflare.com, Alexei Starovoitov , Daniel Borkmann , "David S. Miller" , Jakub Kicinski , Marek Majkowski Subject: [PATCH bpf-next v4 04/16] inet: Run SK_LOOKUP BPF program on socket lookup Date: Mon, 13 Jul 2020 19:46:42 +0200 Message-Id: <20200713174654.642628-5-jakub@cloudflare.com> X-Mailer: git-send-email 2.25.4 In-Reply-To: <20200713174654.642628-1-jakub@cloudflare.com> References: <20200713174654.642628-1-jakub@cloudflare.com> MIME-Version: 1.0 Sender: bpf-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org Run a BPF program before looking up a listening socket on the receive path. Program selects a listening socket to yield as result of socket lookup by calling bpf_sk_assign() helper and returning SK_PASS code. Program can revert its decision by assigning a NULL socket with bpf_sk_assign(). Alternatively, BPF program can also fail the lookup by returning with SK_DROP, or let the lookup continue as usual with SK_PASS on return, when no socket has not been selected with bpf_sk_assign(). Other return values are treated the same as SK_DROP. This lets the user match packets with listening sockets freely at the last possible point on the receive path, where we know that packets are destined for local delivery after undergoing policing, filtering, and routing. With BPF code selecting the socket, directing packets destined to an IP range or to a port range to a single socket becomes possible. In case multiple programs are attached, they are run in series in the order in which they were attached. The end result is determined from return codes of all the programs according to following rules: 1. If any program returned SK_PASS and selected a valid socket, the socket is used as result of socket lookup. 2. If more than one program returned SK_PASS and selected a socket, last selection takes effect. 3. If any program returned SK_DROP or an invalid return code, and no program returned SK_PASS and selected a socket, socket lookup fails with -ECONNREFUSED. 4. If all programs returned SK_PASS and none of them selected a socket, socket lookup continues to htable-based lookup. Suggested-by: Marek Majkowski Signed-off-by: Jakub Sitnicki --- Notes: v4: - Reduce BPF sk_lookup prog return codes to SK_PASS/SK_DROP. (Lorenz) - Default to drop & warn on illegal return value from BPF prog. (Lorenz) - Rename netns_bpf_attach_type_enable/disable to _need/unneed. (Lorenz) - Export bpf_sk_lookup_enabled symbol for CONFIG_IPV6=m (kernel test robot) - Invert return value from bpf_sk_lookup_run_v4 to true on skip reuseport. - Move dedicated prog_array runner close to its callers in filter.h. v3: - Use a static_key to minimize the hook overhead when not used. (Alexei) - Adapt for running an array of attached programs. (Alexei) - Adapt for optionally skipping reuseport selection. (Martin) include/linux/filter.h | 102 +++++++++++++++++++++++++++++++++++++ kernel/bpf/net_namespace.c | 32 +++++++++++- net/core/filter.c | 3 ++ net/ipv4/inet_hashtables.c | 31 +++++++++++ 4 files changed, 167 insertions(+), 1 deletion(-) diff --git a/include/linux/filter.h b/include/linux/filter.h index 380746f47fa1..b9ad0fdabca5 100644 --- a/include/linux/filter.h +++ b/include/linux/filter.h @@ -1295,4 +1295,106 @@ struct bpf_sk_lookup_kern { bool no_reuseport; }; +extern struct static_key_false bpf_sk_lookup_enabled; + +/* Runners for BPF_SK_LOOKUP programs to invoke on socket lookup. + * + * Allowed return values for a BPF SK_LOOKUP program are SK_PASS and + * SK_DROP. Any other return value is treated as SK_DROP. Their + * meaning is as follows: + * + * SK_PASS && ctx.selected_sk != NULL: use selected_sk as lookup result + * SK_PASS && ctx.selected_sk == NULL: continue to htable-based socket lookup + * SK_DROP : terminate lookup with -ECONNREFUSED + * + * This macro aggregates return values and selected sockets from + * multiple BPF programs according to following rules: + * + * 1. If any program returned SK_PASS and a non-NULL ctx.selected_sk, + * macro result is SK_PASS and last ctx.selected_sk is used. + * 2. If any program returned non-SK_PASS return value, + * macro result is the last non-SK_PASS return value. + * 3. Otherwise result is SK_PASS and ctx.selected_sk is NULL. + * + * Caller must ensure that the prog array is non-NULL, and that the + * array as well as the programs it contains remain valid. + */ +#define BPF_PROG_SK_LOOKUP_RUN_ARRAY(array, ctx, func) \ + ({ \ + struct bpf_sk_lookup_kern *_ctx = &(ctx); \ + struct bpf_prog_array_item *_item; \ + struct sock *_selected_sk; \ + struct bpf_prog *_prog; \ + u32 _ret, _last_ret; \ + bool _no_reuseport; \ + \ + migrate_disable(); \ + _last_ret = SK_PASS; \ + _selected_sk = NULL; \ + _no_reuseport = false; \ + _item = &(array)->items[0]; \ + while ((_prog = READ_ONCE(_item->prog))) { \ + /* restore most recent selection */ \ + _ctx->selected_sk = _selected_sk; \ + _ctx->no_reuseport = _no_reuseport; \ + \ + _ret = func(_prog, _ctx); \ + if (_ret == SK_PASS) { \ + /* remember last non-NULL socket */ \ + if (_ctx->selected_sk) { \ + _selected_sk = _ctx->selected_sk; \ + _no_reuseport = _ctx->no_reuseport; \ + } \ + } else { \ + /* remember last non-PASS ret code */ \ + _last_ret = _ret; \ + } \ + _item++; \ + } \ + _ctx->selected_sk = _selected_sk; \ + _ctx->no_reuseport = _no_reuseport; \ + migrate_enable(); \ + _ctx->selected_sk ? SK_PASS : _last_ret; \ + }) + +static inline bool bpf_sk_lookup_run_v4(struct net *net, int protocol, + const __be32 saddr, const __be16 sport, + const __be32 daddr, const u16 dport, + struct sock **psk) +{ + struct bpf_prog_array *run_array; + struct sock *selected_sk = NULL; + bool no_reuseport = false; + + rcu_read_lock(); + run_array = rcu_dereference(net->bpf.run_array[NETNS_BPF_SK_LOOKUP]); + if (run_array) { + struct bpf_sk_lookup_kern ctx = { + .family = AF_INET, + .protocol = protocol, + .v4.saddr = saddr, + .v4.daddr = daddr, + .sport = sport, + .dport = dport, + }; + u32 act; + + act = BPF_PROG_SK_LOOKUP_RUN_ARRAY(run_array, ctx, BPF_PROG_RUN); + if (act == SK_PASS) { + selected_sk = ctx.selected_sk; + no_reuseport = ctx.no_reuseport; + goto unlock; + } + + selected_sk = ERR_PTR(-ECONNREFUSED); + WARN_ONCE(act != SK_DROP, + "Illegal BPF SK_LOOKUP return value %u, expect packet loss!\n", + act); + } +unlock: + rcu_read_unlock(); + *psk = selected_sk; + return no_reuseport; +} + #endif /* __LINUX_FILTER_H__ */ diff --git a/kernel/bpf/net_namespace.c b/kernel/bpf/net_namespace.c index 596c30b963f3..ee3599a51891 100644 --- a/kernel/bpf/net_namespace.c +++ b/kernel/bpf/net_namespace.c @@ -25,6 +25,28 @@ struct bpf_netns_link { /* Protects updates to netns_bpf */ DEFINE_MUTEX(netns_bpf_mutex); +static void netns_bpf_attach_type_unneed(enum netns_bpf_attach_type type) +{ + switch (type) { + case NETNS_BPF_SK_LOOKUP: + static_branch_dec(&bpf_sk_lookup_enabled); + break; + default: + break; + } +} + +static void netns_bpf_attach_type_need(enum netns_bpf_attach_type type) +{ + switch (type) { + case NETNS_BPF_SK_LOOKUP: + static_branch_inc(&bpf_sk_lookup_enabled); + break; + default: + break; + } +} + /* Must be called with netns_bpf_mutex held. */ static void netns_bpf_run_array_detach(struct net *net, enum netns_bpf_attach_type type) @@ -91,6 +113,9 @@ static void bpf_netns_link_release(struct bpf_link *link) if (!net) goto out_unlock; + /* Mark attach point as unused */ + netns_bpf_attach_type_unneed(type); + /* Remember link position in case of safe delete */ idx = link_index(net, type, net_link); list_del(&net_link->node); @@ -414,6 +439,9 @@ static int netns_bpf_link_attach(struct net *net, struct bpf_link *link, lockdep_is_held(&netns_bpf_mutex)); bpf_prog_array_free(run_array); + /* Mark attach point as used */ + netns_bpf_attach_type_need(type); + out_unlock: mutex_unlock(&netns_bpf_mutex); return err; @@ -489,8 +517,10 @@ static void __net_exit netns_bpf_pernet_pre_exit(struct net *net) mutex_lock(&netns_bpf_mutex); for (type = 0; type < MAX_NETNS_BPF_ATTACH_TYPE; type++) { netns_bpf_run_array_detach(net, type); - list_for_each_entry(net_link, &net->bpf.links[type], node) + list_for_each_entry(net_link, &net->bpf.links[type], node) { net_link->net = NULL; /* auto-detach link */ + netns_bpf_attach_type_unneed(type); + } if (net->bpf.progs[type]) bpf_prog_put(net->bpf.progs[type]); } diff --git a/net/core/filter.c b/net/core/filter.c index 81c462881133..3fcb9c8cec4c 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -9221,6 +9221,9 @@ const struct bpf_verifier_ops sk_reuseport_verifier_ops = { const struct bpf_prog_ops sk_reuseport_prog_ops = { }; +DEFINE_STATIC_KEY_FALSE(bpf_sk_lookup_enabled); +EXPORT_SYMBOL(bpf_sk_lookup_enabled); + BPF_CALL_3(bpf_sk_lookup_assign, struct bpf_sk_lookup_kern *, ctx, struct sock *, sk, u64, flags) { diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c index ab64834837c8..4eb4cd8d20dd 100644 --- a/net/ipv4/inet_hashtables.c +++ b/net/ipv4/inet_hashtables.c @@ -299,6 +299,29 @@ static struct sock *inet_lhash2_lookup(struct net *net, return result; } +static inline struct sock *inet_lookup_run_bpf(struct net *net, + struct inet_hashinfo *hashinfo, + struct sk_buff *skb, int doff, + __be32 saddr, __be16 sport, + __be32 daddr, u16 hnum) +{ + struct sock *sk, *reuse_sk; + bool no_reuseport; + + if (hashinfo != &tcp_hashinfo) + return NULL; /* only TCP is supported */ + + no_reuseport = bpf_sk_lookup_run_v4(net, IPPROTO_TCP, + saddr, sport, daddr, hnum, &sk); + if (no_reuseport || IS_ERR_OR_NULL(sk)) + return sk; + + reuse_sk = lookup_reuseport(net, sk, skb, doff, saddr, sport, daddr, hnum); + if (reuse_sk) + sk = reuse_sk; + return sk; +} + struct sock *__inet_lookup_listener(struct net *net, struct inet_hashinfo *hashinfo, struct sk_buff *skb, int doff, @@ -310,6 +333,14 @@ struct sock *__inet_lookup_listener(struct net *net, struct sock *result = NULL; unsigned int hash2; + /* Lookup redirect from BPF */ + if (static_branch_unlikely(&bpf_sk_lookup_enabled)) { + result = inet_lookup_run_bpf(net, hashinfo, skb, doff, + saddr, sport, daddr, hnum); + if (result) + goto done; + } + hash2 = ipv4_portaddr_hash(net, daddr, hnum); ilb2 = inet_lhash2_bucket(hashinfo, hash2); From patchwork Mon Jul 13 17:46:43 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jakub Sitnicki X-Patchwork-Id: 1328348 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org (client-ip=23.128.96.18; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=reject dis=none) header.from=cloudflare.com Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; secure) header.d=cloudflare.com header.i=@cloudflare.com header.a=rsa-sha256 header.s=google header.b=fK+WlYQV; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by ozlabs.org (Postfix) with ESMTP id 4B5B1F3V2vz9sRW for ; Tue, 14 Jul 2020 03:47:13 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730438AbgGMRrL (ORCPT ); Mon, 13 Jul 2020 13:47:11 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60294 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730043AbgGMRrG (ORCPT ); Mon, 13 Jul 2020 13:47:06 -0400 Received: from mail-lj1-x242.google.com (mail-lj1-x242.google.com [IPv6:2a00:1450:4864:20::242]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 87C5FC08C5DE for ; Mon, 13 Jul 2020 10:47:06 -0700 (PDT) Received: by mail-lj1-x242.google.com with SMTP id h22so18948957lji.9 for ; Mon, 13 Jul 2020 10:47:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cloudflare.com; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=QgMFjCP5Xlp83RyJSzaeuU80nMRVl3A4bF+6T6QGTNA=; b=fK+WlYQV0st4Lh29BoNg3l5x8IvT47Afy0ynaX5UT0Ty5cgrbkl7zxrGQtrGlM9Lha om5ZnhlOI6TRCLb9SmdhapWgKA47Blk48aTXly3nde4eThMk1/7K4t0ZJmUQaxqPx+2v sQy0pGDtuuEYMLMCtCjs0PhS1NzvXEPoHzDoM= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=QgMFjCP5Xlp83RyJSzaeuU80nMRVl3A4bF+6T6QGTNA=; b=W6EJZt2Gq9XChsv9QjtRuQf9AdKRZJpDUypv7AXJTyNmVpxmQnH6w2X4BpLU1Zh/zA PHA5G0JvLCYPtRFVizOp9DicWu+/cgj2h5z7zoLvwIXTtrF26Z2+6gWLfbcZgCPxOrOq OBdxYEENPRwRYo+kvhBibA8gjgo70v4TG69TA223yYFKg8jpM/v8L+YeKaw3af7r+hXW KV9sfkKOFHfS/OpxPxjooVRQAuMgg3cwdVhTPqxqKww4xbDrD5jPphwqL198FT7H9wQP 2ntTHEWth/zHSaJihW2Gs+G5JPqfBapc/i9A4svGlgySOEv6Dc/v6Kl2fXv1WI+iLuJ7 5l2A== X-Gm-Message-State: AOAM530nvYDfhL/5Wf9sqJfJc9HAABHJQGLLM7x2QQmE/EW/iQNrd03w 3NbPgB/jaR/J3dAz50t2bEJc3mfPqbsI3w== X-Google-Smtp-Source: ABdhPJxR+0STxQ8suGDJqdTomUPYy155szqwQxy2Qj3XQA7yCHOqUdXVDdFUKEru44pwFZzseB07gg== X-Received: by 2002:a05:651c:120d:: with SMTP id i13mr367008lja.153.1594662425025; Mon, 13 Jul 2020 10:47:05 -0700 (PDT) Received: from cloudflare.com ([2a02:a310:c262:aa00:b35e:8938:2c2a:ba8b]) by smtp.gmail.com with ESMTPSA id g24sm4149166ljl.139.2020.07.13.10.47.04 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 13 Jul 2020 10:47:04 -0700 (PDT) From: Jakub Sitnicki To: bpf@vger.kernel.org Cc: netdev@vger.kernel.org, kernel-team@cloudflare.com, Alexei Starovoitov , Daniel Borkmann , "David S. Miller" , Jakub Kicinski Subject: [PATCH bpf-next v4 05/16] inet6: Extract helper for selecting socket from reuseport group Date: Mon, 13 Jul 2020 19:46:43 +0200 Message-Id: <20200713174654.642628-6-jakub@cloudflare.com> X-Mailer: git-send-email 2.25.4 In-Reply-To: <20200713174654.642628-1-jakub@cloudflare.com> References: <20200713174654.642628-1-jakub@cloudflare.com> MIME-Version: 1.0 Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org Prepare for calling into reuseport from inet6_lookup_listener as well. Signed-off-by: Jakub Sitnicki --- net/ipv6/inet6_hashtables.c | 31 ++++++++++++++++++++++--------- 1 file changed, 22 insertions(+), 9 deletions(-) diff --git a/net/ipv6/inet6_hashtables.c b/net/ipv6/inet6_hashtables.c index fbe9d4295eac..03942eef8ab6 100644 --- a/net/ipv6/inet6_hashtables.c +++ b/net/ipv6/inet6_hashtables.c @@ -111,6 +111,23 @@ static inline int compute_score(struct sock *sk, struct net *net, return score; } +static inline struct sock *lookup_reuseport(struct net *net, struct sock *sk, + struct sk_buff *skb, int doff, + const struct in6_addr *saddr, + __be16 sport, + const struct in6_addr *daddr, + unsigned short hnum) +{ + struct sock *reuse_sk = NULL; + u32 phash; + + if (sk->sk_reuseport) { + phash = inet6_ehashfn(net, daddr, hnum, saddr, sport); + reuse_sk = reuseport_select_sock(sk, phash, skb, doff); + } + return reuse_sk; +} + /* called with rcu_read_lock() */ static struct sock *inet6_lhash2_lookup(struct net *net, struct inet_listen_hashbucket *ilb2, @@ -123,21 +140,17 @@ static struct sock *inet6_lhash2_lookup(struct net *net, struct inet_connection_sock *icsk; struct sock *sk, *result = NULL; int score, hiscore = 0; - u32 phash = 0; inet_lhash2_for_each_icsk_rcu(icsk, &ilb2->head) { sk = (struct sock *)icsk; score = compute_score(sk, net, hnum, daddr, dif, sdif, exact_dif); if (score > hiscore) { - if (sk->sk_reuseport) { - phash = inet6_ehashfn(net, daddr, hnum, - saddr, sport); - result = reuseport_select_sock(sk, phash, - skb, doff); - if (result) - return result; - } + result = lookup_reuseport(net, sk, skb, doff, + saddr, sport, daddr, hnum); + if (result) + return result; + result = sk; hiscore = score; } From patchwork Mon Jul 13 17:46:44 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jakub Sitnicki X-Patchwork-Id: 1328347 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org (client-ip=23.128.96.18; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=reject dis=none) header.from=cloudflare.com Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; secure) header.d=cloudflare.com header.i=@cloudflare.com header.a=rsa-sha256 header.s=google header.b=PrJnbZuc; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by ozlabs.org (Postfix) with ESMTP id 4B5B1C6yxgz9sR4 for ; Tue, 14 Jul 2020 03:47:11 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730464AbgGMRrL (ORCPT ); Mon, 13 Jul 2020 13:47:11 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60302 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730423AbgGMRrI (ORCPT ); Mon, 13 Jul 2020 13:47:08 -0400 Received: from mail-lj1-x244.google.com (mail-lj1-x244.google.com [IPv6:2a00:1450:4864:20::244]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8A84AC061755 for ; Mon, 13 Jul 2020 10:47:08 -0700 (PDT) Received: by mail-lj1-x244.google.com with SMTP id e8so18986773ljb.0 for ; Mon, 13 Jul 2020 10:47:08 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cloudflare.com; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=V0LD3bn+WRHrd+N/cPZAGVVSG99I0PJVGPePLpxK9tw=; b=PrJnbZuc4RuZm6vbyl7Xl3YCGpAOyjWAbOryqvufulLr7lbKmrgpTYZw1BmlgbQRmu VOmxNzrMNFtFhWD//eGPk0HVJoP/0NGtrxuD0BUA5bwjVm4deNKdaAyLifKMughPS5cZ aPw4xBOhgt37xMM4w5ZAOLMe/SS/yuIun9MG4= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=V0LD3bn+WRHrd+N/cPZAGVVSG99I0PJVGPePLpxK9tw=; b=KUeQovqLH4XGJTSKxciTePeqDVqH//7xny57+T70gxOymTIPI4shkvNlYQpSNwM1FG Y+WI8BZzDBvCMQ0d3uZUAma6O24jEOP1IWtkrJd7yWRnYwQFNPAB+QW8C53kNH36Ducp JHCTihZ5958oNDpL728GvN+yjZ/NBy2tqlc94hH7gLSOkEyo0tTjQt+NZcqoznDPBpwy eRtgTnC1tiRsFia4UPRclm/S8EIPuPaG2wpMyub5JMfsqO2T99a0QZmGCY1Wan/Pmqj/ o/eTt+KkFgpCWtaAqwsrPcwBU046LnBGNZfi92ChICoHOFyrpslOwFdtnvQafaDE9KBN w0Og== X-Gm-Message-State: AOAM533NXmf5qIJqj32hJuWEkQyeLliOXagMBuzKhehSMZdgrWT7AZu5 a4z/DQeVbENH16ud7otnd+LPgg== X-Google-Smtp-Source: ABdhPJwjmBeJY92EPBffVFiioAiZDwb9VJR3FpV2hlwwJ3XVrRczEf3WdEE9ZrQeJeo0Qb0NhGNZzA== X-Received: by 2002:a2e:a375:: with SMTP id i21mr384347ljn.403.1594662427013; Mon, 13 Jul 2020 10:47:07 -0700 (PDT) Received: from cloudflare.com ([2a02:a310:c262:aa00:b35e:8938:2c2a:ba8b]) by smtp.gmail.com with ESMTPSA id 132sm4727993lfl.37.2020.07.13.10.47.05 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 13 Jul 2020 10:47:06 -0700 (PDT) From: Jakub Sitnicki To: bpf@vger.kernel.org Cc: netdev@vger.kernel.org, kernel-team@cloudflare.com, Alexei Starovoitov , Daniel Borkmann , "David S. Miller" , Jakub Kicinski , Marek Majkowski Subject: [PATCH bpf-next v4 06/16] inet6: Run SK_LOOKUP BPF program on socket lookup Date: Mon, 13 Jul 2020 19:46:44 +0200 Message-Id: <20200713174654.642628-7-jakub@cloudflare.com> X-Mailer: git-send-email 2.25.4 In-Reply-To: <20200713174654.642628-1-jakub@cloudflare.com> References: <20200713174654.642628-1-jakub@cloudflare.com> MIME-Version: 1.0 Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org Following ipv4 stack changes, run a BPF program attached to netns before looking up a listening socket. Program can return a listening socket to use as result of socket lookup, fail the lookup, or take no action. Suggested-by: Marek Majkowski Signed-off-by: Jakub Sitnicki --- Notes: v4: - Adapt to changes in BPF prog return codes. - Invert return value from bpf_sk_lookup_run_v6 to true on skip reuseport. v3: - Use a static_key to minimize the hook overhead when not used. (Alexei) - Don't copy struct in6_addr when populating BPF prog context. (Martin) - Adapt for running an array of attached programs. (Alexei) - Adapt for optionally skipping reuseport selection. (Martin) include/linux/filter.h | 44 +++++++++++++++++++++++++++++++++++++ net/ipv6/inet6_hashtables.c | 35 +++++++++++++++++++++++++++++ 2 files changed, 79 insertions(+) diff --git a/include/linux/filter.h b/include/linux/filter.h index b9ad0fdabca5..900b71af5580 100644 --- a/include/linux/filter.h +++ b/include/linux/filter.h @@ -1397,4 +1397,48 @@ static inline bool bpf_sk_lookup_run_v4(struct net *net, int protocol, return no_reuseport; } +#if IS_ENABLED(CONFIG_IPV6) +static inline bool bpf_sk_lookup_run_v6(struct net *net, int protocol, + const struct in6_addr *saddr, + const __be16 sport, + const struct in6_addr *daddr, + const u16 dport, + struct sock **psk) +{ + struct bpf_prog_array *run_array; + struct sock *selected_sk = NULL; + bool no_reuseport = false; + + rcu_read_lock(); + run_array = rcu_dereference(net->bpf.run_array[NETNS_BPF_SK_LOOKUP]); + if (run_array) { + struct bpf_sk_lookup_kern ctx = { + .family = AF_INET6, + .protocol = protocol, + .v6.saddr = saddr, + .v6.daddr = daddr, + .sport = sport, + .dport = dport, + }; + u32 act; + + act = BPF_PROG_SK_LOOKUP_RUN_ARRAY(run_array, ctx, BPF_PROG_RUN); + if (act == SK_PASS) { + selected_sk = ctx.selected_sk; + no_reuseport = ctx.no_reuseport; + goto unlock; + } + + selected_sk = ERR_PTR(-ECONNREFUSED); + WARN_ONCE(act != SK_DROP, + "Illegal BPF SK_LOOKUP return value %u, expect packet loss!\n", + act); + } +unlock: + rcu_read_unlock(); + *psk = selected_sk; + return no_reuseport; +} +#endif /* IS_ENABLED(CONFIG_IPV6) */ + #endif /* __LINUX_FILTER_H__ */ diff --git a/net/ipv6/inet6_hashtables.c b/net/ipv6/inet6_hashtables.c index 03942eef8ab6..2d3add9e6116 100644 --- a/net/ipv6/inet6_hashtables.c +++ b/net/ipv6/inet6_hashtables.c @@ -21,6 +21,8 @@ #include #include +extern struct inet_hashinfo tcp_hashinfo; + u32 inet6_ehashfn(const struct net *net, const struct in6_addr *laddr, const u16 lport, const struct in6_addr *faddr, const __be16 fport) @@ -159,6 +161,31 @@ static struct sock *inet6_lhash2_lookup(struct net *net, return result; } +static inline struct sock *inet6_lookup_run_bpf(struct net *net, + struct inet_hashinfo *hashinfo, + struct sk_buff *skb, int doff, + const struct in6_addr *saddr, + const __be16 sport, + const struct in6_addr *daddr, + const u16 hnum) +{ + struct sock *sk, *reuse_sk; + bool no_reuseport; + + if (hashinfo != &tcp_hashinfo) + return NULL; /* only TCP is supported */ + + no_reuseport = bpf_sk_lookup_run_v6(net, IPPROTO_TCP, + saddr, sport, daddr, hnum, &sk); + if (no_reuseport || IS_ERR_OR_NULL(sk)) + return sk; + + reuse_sk = lookup_reuseport(net, sk, skb, doff, saddr, sport, daddr, hnum); + if (reuse_sk) + sk = reuse_sk; + return sk; +} + struct sock *inet6_lookup_listener(struct net *net, struct inet_hashinfo *hashinfo, struct sk_buff *skb, int doff, @@ -170,6 +197,14 @@ struct sock *inet6_lookup_listener(struct net *net, struct sock *result = NULL; unsigned int hash2; + /* Lookup redirect from BPF */ + if (static_branch_unlikely(&bpf_sk_lookup_enabled)) { + result = inet6_lookup_run_bpf(net, hashinfo, skb, doff, + saddr, sport, daddr, hnum); + if (result) + goto done; + } + hash2 = ipv6_portaddr_hash(net, daddr, hnum); ilb2 = inet_lhash2_bucket(hashinfo, hash2); From patchwork Mon Jul 13 17:46:45 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jakub Sitnicki X-Patchwork-Id: 1328349 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Original-To: incoming-bpf@patchwork.ozlabs.org Delivered-To: patchwork-incoming-bpf@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org (client-ip=23.128.96.18; helo=vger.kernel.org; envelope-from=bpf-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=reject dis=none) header.from=cloudflare.com Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; secure) header.d=cloudflare.com header.i=@cloudflare.com header.a=rsa-sha256 header.s=google header.b=cVOFjrb8; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by ozlabs.org (Postfix) with ESMTP id 4B5B1F5Pqxz9sSJ for ; Tue, 14 Jul 2020 03:47:13 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730472AbgGMRrN (ORCPT ); Mon, 13 Jul 2020 13:47:13 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60314 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730447AbgGMRrL (ORCPT ); Mon, 13 Jul 2020 13:47:11 -0400 Received: from mail-lf1-x142.google.com (mail-lf1-x142.google.com [IPv6:2a00:1450:4864:20::142]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 98164C08C5DB for ; Mon, 13 Jul 2020 10:47:10 -0700 (PDT) Received: by mail-lf1-x142.google.com with SMTP id t74so9598018lff.2 for ; Mon, 13 Jul 2020 10:47:10 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cloudflare.com; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=dWS55m6tjKFIXvF8FfZLoHa8RJokM3hk5iam/MiuzJ0=; b=cVOFjrb8nB2k7Yt0IvzlBgZfQPUdFQjnlodhZaKOx2YnK7BtWDJH0wEMAt3IHMLz/u fWs8C0Y/Q9BIml3hS7d6vUIPZfMapAB/d1sBVbwwRACkZdrKwuNsBIk4SzQXyzwvY6Q/ 4w3dbxfpYprvYND1Nieq9QfOoIe8BojwEm3MQ= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=dWS55m6tjKFIXvF8FfZLoHa8RJokM3hk5iam/MiuzJ0=; b=IEMfLpOvBSmd+WF/TnGZi86aA7EfLptUEWvswAQCcoeMj1f/qYRGeGQ3CnhjBqpLdb S1VfVU9lv+HfXz77HYbI2LjzgtQSaDIH/5YB9Uh+BZMAIV59z49BP4qFJP9ILZZs5Ebu CUlo3bTxk2DEentk29GNgUg7uGgCGmIeX4YfL7kTp790gosN/GE4KI4h+KleSIi19bcA Xg3jE8rlfUbShn3ioe2raqcntZvvtifjKn30AzeooXnrbWtpZGS3nZT5xEd6IcEpZoD7 PS+JFTsniqn7bjCJ3x2vgwScauULYvw94ziQ3ahBJWwKuhmasTqE3adlKOj9DZ1h4gxH jP4A== X-Gm-Message-State: AOAM531E8w/Wk+YWhfwujU0Ojbt5MmBSf21D9pF4oRmw7Axq43zpiXWW md/E9YU4j3MfhHOsAx+1PzJSY1UsP/IwUg== X-Google-Smtp-Source: ABdhPJyqO3bEy6jB31tdonFVj7PglltQYEVgBqFXsrOvaZtqcYatMVmXZ3XMCyBhvLasrTku4TjAAQ== X-Received: by 2002:ac2:4422:: with SMTP id w2mr167186lfl.152.1594662428791; Mon, 13 Jul 2020 10:47:08 -0700 (PDT) Received: from cloudflare.com ([2a02:a310:c262:aa00:b35e:8938:2c2a:ba8b]) by smtp.gmail.com with ESMTPSA id s28sm4194053ljm.24.2020.07.13.10.47.07 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 13 Jul 2020 10:47:08 -0700 (PDT) From: Jakub Sitnicki To: bpf@vger.kernel.org Cc: netdev@vger.kernel.org, kernel-team@cloudflare.com, Alexei Starovoitov , Daniel Borkmann , "David S. Miller" , Jakub Kicinski Subject: [PATCH bpf-next v4 07/16] udp: Extract helper for selecting socket from reuseport group Date: Mon, 13 Jul 2020 19:46:45 +0200 Message-Id: <20200713174654.642628-8-jakub@cloudflare.com> X-Mailer: git-send-email 2.25.4 In-Reply-To: <20200713174654.642628-1-jakub@cloudflare.com> References: <20200713174654.642628-1-jakub@cloudflare.com> MIME-Version: 1.0 Sender: bpf-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org Prepare for calling into reuseport from __udp4_lib_lookup as well. Signed-off-by: Jakub Sitnicki --- net/ipv4/udp.c | 34 ++++++++++++++++++++++++---------- 1 file changed, 24 insertions(+), 10 deletions(-) diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c index 31530129f137..0d03e0277263 100644 --- a/net/ipv4/udp.c +++ b/net/ipv4/udp.c @@ -408,6 +408,25 @@ static u32 udp_ehashfn(const struct net *net, const __be32 laddr, udp_ehash_secret + net_hash_mix(net)); } +static inline struct sock *lookup_reuseport(struct net *net, struct sock *sk, + struct sk_buff *skb, + __be32 saddr, __be16 sport, + __be32 daddr, unsigned short hnum) +{ + struct sock *reuse_sk = NULL; + u32 hash; + + if (sk->sk_reuseport && sk->sk_state != TCP_ESTABLISHED) { + hash = udp_ehashfn(net, daddr, hnum, saddr, sport); + reuse_sk = reuseport_select_sock(sk, hash, skb, + sizeof(struct udphdr)); + /* Fall back to scoring if group has connections */ + if (reuseport_has_conns(sk, false)) + return NULL; + } + return reuse_sk; +} + /* called with rcu_read_lock() */ static struct sock *udp4_lib_lookup2(struct net *net, __be32 saddr, __be16 sport, @@ -418,7 +437,6 @@ static struct sock *udp4_lib_lookup2(struct net *net, { struct sock *sk, *result; int score, badness; - u32 hash = 0; result = NULL; badness = 0; @@ -426,15 +444,11 @@ static struct sock *udp4_lib_lookup2(struct net *net, score = compute_score(sk, net, saddr, sport, daddr, hnum, dif, sdif); if (score > badness) { - if (sk->sk_reuseport && - sk->sk_state != TCP_ESTABLISHED) { - hash = udp_ehashfn(net, daddr, hnum, - saddr, sport); - result = reuseport_select_sock(sk, hash, skb, - sizeof(struct udphdr)); - if (result && !reuseport_has_conns(sk, false)) - return result; - } + result = lookup_reuseport(net, sk, skb, + saddr, sport, daddr, hnum); + if (result) + return result; + badness = score; result = sk; } From patchwork Mon Jul 13 17:46:46 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jakub Sitnicki X-Patchwork-Id: 1328350 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org (client-ip=23.128.96.18; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=reject dis=none) header.from=cloudflare.com Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; secure) header.d=cloudflare.com header.i=@cloudflare.com header.a=rsa-sha256 header.s=google header.b=tetYQPuo; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by ozlabs.org (Postfix) with ESMTP id 4B5B1G4g3kz9sSd for ; Tue, 14 Jul 2020 03:47:14 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730466AbgGMRrO (ORCPT ); Mon, 13 Jul 2020 13:47:14 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60320 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730423AbgGMRrM (ORCPT ); Mon, 13 Jul 2020 13:47:12 -0400 Received: from mail-lf1-x144.google.com (mail-lf1-x144.google.com [IPv6:2a00:1450:4864:20::144]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1E50BC061755 for ; Mon, 13 Jul 2020 10:47:12 -0700 (PDT) Received: by mail-lf1-x144.google.com with SMTP id s16so9556947lfp.12 for ; Mon, 13 Jul 2020 10:47:12 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cloudflare.com; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=4adtaF+vh9ZCjhGrf1+RjNro+KhZMebp2/WL08RlLtc=; b=tetYQPuo1HNfbQu7YB33sGi0dcbFBATg/HeznsxFxVoA92dczroFy75rMBQv6SPA69 DNRrA9E4KjSH8CDQP+/fghtq9rWIoRcehmYOs3JDLgee6cQuS52b8K8R+64WU3LsAQ1g 2AhOVjbta5jKnsaj9fyifypGgKb+WDYZKFhZQ= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=4adtaF+vh9ZCjhGrf1+RjNro+KhZMebp2/WL08RlLtc=; b=eieYPaghYYXmthAhFettNLcxQfx4IRokJt/AFgC35x9ctdlF444yq24VTYlMjtQC1m Q1bcBW7obdkGHJKIDEbydij4D4cI0F9sxoznf4kswhtIB2Aur4w29eXDZVsRvOzFiYKZ vmROZz0dyunQW7vT3M9Hg0Mmdu91qtcXn2chm3OiwicfVtTtYWgckrbNhkrKLZ5P2SRM afvWFZQyu6QPQ/6VMl17MXo621tZQORiLYfH9VsJLnOWPa0G2gLJstJt3fcsF27Vlwvp FK99Ag6htHd0vwKp4Ja31pm4zu9qJrjuGRxjo2VaywQXIUDdXfRXzjRAt2wTDWbmLaWE ghMA== X-Gm-Message-State: AOAM530Dmw1I6+wHzhltxtikn4RmXQImLBU463ZMU3bOGcbkGXh33xMm bAOeYcLe/0XzWyZNyb3f7HqWIw== X-Google-Smtp-Source: ABdhPJyj9c5rR9UY+ckpPyOhrZ39eULC/Cyi0TZxxCaOU5hZjpnCZ9safKIrK4XOPvb9HIneCZk5Ng== X-Received: by 2002:a19:22d6:: with SMTP id i205mr175738lfi.50.1594662430546; Mon, 13 Jul 2020 10:47:10 -0700 (PDT) Received: from cloudflare.com ([2a02:a310:c262:aa00:b35e:8938:2c2a:ba8b]) by smtp.gmail.com with ESMTPSA id a17sm4771942lfo.73.2020.07.13.10.47.09 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 13 Jul 2020 10:47:09 -0700 (PDT) From: Jakub Sitnicki To: bpf@vger.kernel.org Cc: netdev@vger.kernel.org, kernel-team@cloudflare.com, Alexei Starovoitov , Daniel Borkmann , "David S. Miller" , Jakub Kicinski , Marek Majkowski Subject: [PATCH bpf-next v4 08/16] udp: Run SK_LOOKUP BPF program on socket lookup Date: Mon, 13 Jul 2020 19:46:46 +0200 Message-Id: <20200713174654.642628-9-jakub@cloudflare.com> X-Mailer: git-send-email 2.25.4 In-Reply-To: <20200713174654.642628-1-jakub@cloudflare.com> References: <20200713174654.642628-1-jakub@cloudflare.com> MIME-Version: 1.0 Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org Following INET/TCP socket lookup changes, modify UDP socket lookup to let BPF program select a receiving socket before searching for a socket by destination address and port as usual. Lookup of connected sockets that match packet 4-tuple is unaffected by this change. BPF program runs, and potentially overrides the lookup result, only if a 4-tuple match was not found. Suggested-by: Marek Majkowski Signed-off-by: Jakub Sitnicki --- Notes: v4: - Adapt to change in bpf_sk_lookup_run_v4 return value semantics. v3: - Use a static_key to minimize the hook overhead when not used. (Alexei) - Adapt for running an array of attached programs. (Alexei) - Adapt for optionally skipping reuseport selection. (Martin) net/ipv4/udp.c | 59 ++++++++++++++++++++++++++++++++++++++++++-------- 1 file changed, 50 insertions(+), 9 deletions(-) diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c index 0d03e0277263..e82db3ab49d3 100644 --- a/net/ipv4/udp.c +++ b/net/ipv4/udp.c @@ -456,6 +456,29 @@ static struct sock *udp4_lib_lookup2(struct net *net, return result; } +static inline struct sock *udp4_lookup_run_bpf(struct net *net, + struct udp_table *udptable, + struct sk_buff *skb, + __be32 saddr, __be16 sport, + __be32 daddr, u16 hnum) +{ + struct sock *sk, *reuse_sk; + bool no_reuseport; + + if (udptable != &udp_table) + return NULL; /* only UDP is supported */ + + no_reuseport = bpf_sk_lookup_run_v4(net, IPPROTO_UDP, + saddr, sport, daddr, hnum, &sk); + if (no_reuseport || IS_ERR_OR_NULL(sk)) + return sk; + + reuse_sk = lookup_reuseport(net, sk, skb, saddr, sport, daddr, hnum); + if (reuse_sk) + sk = reuse_sk; + return sk; +} + /* UDP is nearly always wildcards out the wazoo, it makes no sense to try * harder than this. -DaveM */ @@ -463,27 +486,45 @@ struct sock *__udp4_lib_lookup(struct net *net, __be32 saddr, __be16 sport, __be32 daddr, __be16 dport, int dif, int sdif, struct udp_table *udptable, struct sk_buff *skb) { - struct sock *result; unsigned short hnum = ntohs(dport); unsigned int hash2, slot2; struct udp_hslot *hslot2; + struct sock *result, *sk; hash2 = ipv4_portaddr_hash(net, daddr, hnum); slot2 = hash2 & udptable->mask; hslot2 = &udptable->hash2[slot2]; + /* Lookup connected or non-wildcard socket */ result = udp4_lib_lookup2(net, saddr, sport, daddr, hnum, dif, sdif, hslot2, skb); - if (!result) { - hash2 = ipv4_portaddr_hash(net, htonl(INADDR_ANY), hnum); - slot2 = hash2 & udptable->mask; - hslot2 = &udptable->hash2[slot2]; - - result = udp4_lib_lookup2(net, saddr, sport, - htonl(INADDR_ANY), hnum, dif, sdif, - hslot2, skb); + if (!IS_ERR_OR_NULL(result) && result->sk_state == TCP_ESTABLISHED) + goto done; + + /* Lookup redirect from BPF */ + if (static_branch_unlikely(&bpf_sk_lookup_enabled)) { + sk = udp4_lookup_run_bpf(net, udptable, skb, + saddr, sport, daddr, hnum); + if (sk) { + result = sk; + goto done; + } } + + /* Got non-wildcard socket or error on first lookup */ + if (result) + goto done; + + /* Lookup wildcard sockets */ + hash2 = ipv4_portaddr_hash(net, htonl(INADDR_ANY), hnum); + slot2 = hash2 & udptable->mask; + hslot2 = &udptable->hash2[slot2]; + + result = udp4_lib_lookup2(net, saddr, sport, + htonl(INADDR_ANY), hnum, dif, sdif, + hslot2, skb); +done: if (IS_ERR(result)) return NULL; return result; From patchwork Mon Jul 13 17:46:47 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jakub Sitnicki X-Patchwork-Id: 1328352 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org (client-ip=23.128.96.18; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=reject dis=none) header.from=cloudflare.com Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; secure) header.d=cloudflare.com header.i=@cloudflare.com header.a=rsa-sha256 header.s=google header.b=HQFFUQ/g; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by ozlabs.org (Postfix) with ESMTP id 4B5B1K3p3Pz9sRN for ; Tue, 14 Jul 2020 03:47:17 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730498AbgGMRrQ (ORCPT ); Mon, 13 Jul 2020 13:47:16 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60326 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730423AbgGMRrO (ORCPT ); Mon, 13 Jul 2020 13:47:14 -0400 Received: from mail-lj1-x241.google.com (mail-lj1-x241.google.com [IPv6:2a00:1450:4864:20::241]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id EFA3EC061755 for ; Mon, 13 Jul 2020 10:47:13 -0700 (PDT) Received: by mail-lj1-x241.google.com with SMTP id b25so18931599ljp.6 for ; Mon, 13 Jul 2020 10:47:13 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cloudflare.com; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=Htoi5B3nTzd0JCiGoeXt1gyzcrOgoKflVk9EqYXpwIA=; b=HQFFUQ/gwR7+Tm9gVtPMta13OFDeILtTCdYDunOYGsCbwIYv5mJ9ZWl9FW6kApiJpK wRoWTZ6pORqflBmUXe9/duod6Ebaacs0P6M5aeZWuJYSdD3P9ub8SZgeVE2y/KxTCqaT GOvo773DH1Cez5gXNYEIK2+oPF5IZpKdh5jEw= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=Htoi5B3nTzd0JCiGoeXt1gyzcrOgoKflVk9EqYXpwIA=; b=RxrLmIBwCjAPcZjPZ1XbBUWf+S8TCGqi6nbv9yH63xF0NKmT5v/gcQa7mbDtphmkPt mAu+jHX4WBg2ZsSpkCLYGezJcE5xp185345fZ0ZIF/caMdx1WZEOXYAxgRQHrwDGB0yN QwZYCZokZmyo2NxFjVQskHNmXFSa2MXNvObqRWaBW4j7AS8NpE94Lq975IbBIccHC4eK QLai9IuCcNdSYzLb3OqaGgYrhSKzeCgQTtC8t4jOZFZmax9peu1vWvVbWPTs4OB7sFLW n9Q2xxxegXbtt17fD8vfuqcIdBaGCGrx9ywxB8cAPE10ZL2CkTLFZCAn5USymuqAN7qJ +Hqw== X-Gm-Message-State: AOAM531l4oEdt+MpeT/BM3++EQXUdk9wumRKV5vZQFqVpoMqu5Z1Sd+v JOS3W587zhR1K/lBFhaF83xkBSJ3uQhKFg== X-Google-Smtp-Source: ABdhPJzHDaCgFe74GEpCI8eS4qXtdM8DqQGYrEFuNA2bNi98V41lpwwgD/4Gyi5Dyifdd2IjU2jJFQ== X-Received: by 2002:a2e:99d0:: with SMTP id l16mr359282ljj.209.1594662432404; Mon, 13 Jul 2020 10:47:12 -0700 (PDT) Received: from cloudflare.com ([2a02:a310:c262:aa00:b35e:8938:2c2a:ba8b]) by smtp.gmail.com with ESMTPSA id g142sm4758350lfd.41.2020.07.13.10.47.11 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 13 Jul 2020 10:47:11 -0700 (PDT) From: Jakub Sitnicki To: bpf@vger.kernel.org Cc: netdev@vger.kernel.org, kernel-team@cloudflare.com, Alexei Starovoitov , Daniel Borkmann , "David S. Miller" , Jakub Kicinski Subject: [PATCH bpf-next v4 09/16] udp6: Extract helper for selecting socket from reuseport group Date: Mon, 13 Jul 2020 19:46:47 +0200 Message-Id: <20200713174654.642628-10-jakub@cloudflare.com> X-Mailer: git-send-email 2.25.4 In-Reply-To: <20200713174654.642628-1-jakub@cloudflare.com> References: <20200713174654.642628-1-jakub@cloudflare.com> MIME-Version: 1.0 Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org Prepare for calling into reuseport from __udp6_lib_lookup as well. Signed-off-by: Jakub Sitnicki --- net/ipv6/udp.c | 37 ++++++++++++++++++++++++++----------- 1 file changed, 26 insertions(+), 11 deletions(-) diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c index 7d4151747340..65b843e7acde 100644 --- a/net/ipv6/udp.c +++ b/net/ipv6/udp.c @@ -141,6 +141,27 @@ static int compute_score(struct sock *sk, struct net *net, return score; } +static inline struct sock *lookup_reuseport(struct net *net, struct sock *sk, + struct sk_buff *skb, + const struct in6_addr *saddr, + __be16 sport, + const struct in6_addr *daddr, + unsigned int hnum) +{ + struct sock *reuse_sk = NULL; + u32 hash; + + if (sk->sk_reuseport && sk->sk_state != TCP_ESTABLISHED) { + hash = udp6_ehashfn(net, daddr, hnum, saddr, sport); + reuse_sk = reuseport_select_sock(sk, hash, skb, + sizeof(struct udphdr)); + /* Fall back to scoring if group has connections */ + if (reuseport_has_conns(sk, false)) + return NULL; + } + return reuse_sk; +} + /* called with rcu_read_lock() */ static struct sock *udp6_lib_lookup2(struct net *net, const struct in6_addr *saddr, __be16 sport, @@ -150,7 +171,6 @@ static struct sock *udp6_lib_lookup2(struct net *net, { struct sock *sk, *result; int score, badness; - u32 hash = 0; result = NULL; badness = -1; @@ -158,16 +178,11 @@ static struct sock *udp6_lib_lookup2(struct net *net, score = compute_score(sk, net, saddr, sport, daddr, hnum, dif, sdif); if (score > badness) { - if (sk->sk_reuseport && - sk->sk_state != TCP_ESTABLISHED) { - hash = udp6_ehashfn(net, daddr, hnum, - saddr, sport); - - result = reuseport_select_sock(sk, hash, skb, - sizeof(struct udphdr)); - if (result && !reuseport_has_conns(sk, false)) - return result; - } + result = lookup_reuseport(net, sk, skb, + saddr, sport, daddr, hnum); + if (result) + return result; + result = sk; badness = score; } From patchwork Mon Jul 13 17:46:48 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jakub Sitnicki X-Patchwork-Id: 1328353 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Original-To: incoming-bpf@patchwork.ozlabs.org Delivered-To: patchwork-incoming-bpf@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org (client-ip=23.128.96.18; helo=vger.kernel.org; envelope-from=bpf-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=reject dis=none) header.from=cloudflare.com Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; secure) header.d=cloudflare.com header.i=@cloudflare.com header.a=rsa-sha256 header.s=google header.b=rGs721d7; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by ozlabs.org (Postfix) with ESMTP id 4B5B1L5ggBz9sRW for ; Tue, 14 Jul 2020 03:47:18 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730503AbgGMRrR (ORCPT ); Mon, 13 Jul 2020 13:47:17 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60340 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730500AbgGMRrR (ORCPT ); Mon, 13 Jul 2020 13:47:17 -0400 Received: from mail-lj1-x242.google.com (mail-lj1-x242.google.com [IPv6:2a00:1450:4864:20::242]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id AE802C061794 for ; Mon, 13 Jul 2020 10:47:16 -0700 (PDT) Received: by mail-lj1-x242.google.com with SMTP id d17so18976072ljl.3 for ; Mon, 13 Jul 2020 10:47:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cloudflare.com; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=5sob8efe4WSDw01Inzq02yiJzim8BxMO0L4N+ZlIM8o=; b=rGs721d7iQPNXL/HucdiRuHkUKTsDJYzQEpxJvwOoYJfrjIWorYpldlmrbbNqqif01 KtLYbsU7uprBJn5zuJbF0itAW6NJ9JouthxhDxOAkQVyutKJ/CqxsjxtRb9FfgKviMa7 unu3ASP5qHFuAj9y7X7b+9yFI9p6CFTWcWXvY= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=5sob8efe4WSDw01Inzq02yiJzim8BxMO0L4N+ZlIM8o=; b=KZqCPQ5tlqKZnqTBM9c1f00EGi/R5iP9+n4VWeTNQU+5uNrC4eYgxWX8dgOiZUj0hq XIsXfxTWuxcfNIwB4TD76lCe6Zgs+S3AFyfPK9Kl7q5U6fTXflXxHKkOh/+pIpMIqhKc lySGf6xW4TI6LKc9s+E/ARRAUXchpm89MVmwPBYPUksfl9v/5G/mNF50MOdR9RaFUOND OfRUdnHq2bX/FeQw8K4UePCr4jLNbB9nQC1dDer+OKQL9c31sz3BonBMik1tMuifA8iX ciAjlRF060xnrtv6iwKQjTpm1ET29dThsJNLD+Lhon20aWmhzsU/7/mes6e+joho196Y 2xcA== X-Gm-Message-State: AOAM533araZrlnjmebfQ54ZkCGvmD9Hal88TQDq0+i9gxy+31SxgR6az K6fHxGfGukoPrKVy70FLHYPWa1qGykBXSw== X-Google-Smtp-Source: ABdhPJxJ/247X3MskwFQTvJkeGDTmI+Ln+PB9RT2BVvL/6uU2PYAIx02cPCPukjV3DRjNmFQ0N7wbA== X-Received: by 2002:a2e:b5a8:: with SMTP id f8mr354652ljn.247.1594662434284; Mon, 13 Jul 2020 10:47:14 -0700 (PDT) Received: from cloudflare.com ([2a02:a310:c262:aa00:b35e:8938:2c2a:ba8b]) by smtp.gmail.com with ESMTPSA id b6sm4711778lfe.28.2020.07.13.10.47.13 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 13 Jul 2020 10:47:13 -0700 (PDT) From: Jakub Sitnicki To: bpf@vger.kernel.org Cc: netdev@vger.kernel.org, kernel-team@cloudflare.com, Alexei Starovoitov , Daniel Borkmann , "David S. Miller" , Jakub Kicinski , Marek Majkowski Subject: [PATCH bpf-next v4 10/16] udp6: Run SK_LOOKUP BPF program on socket lookup Date: Mon, 13 Jul 2020 19:46:48 +0200 Message-Id: <20200713174654.642628-11-jakub@cloudflare.com> X-Mailer: git-send-email 2.25.4 In-Reply-To: <20200713174654.642628-1-jakub@cloudflare.com> References: <20200713174654.642628-1-jakub@cloudflare.com> MIME-Version: 1.0 Sender: bpf-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org Same as for udp4, let BPF program override the socket lookup result, by selecting a receiving socket of its choice or failing the lookup, if no connected UDP socket matched packet 4-tuple. Suggested-by: Marek Majkowski Signed-off-by: Jakub Sitnicki --- Notes: v4: - Adapt to change in bpf_sk_lookup_run_v6 return value semantics. v3: - Use a static_key to minimize the hook overhead when not used. (Alexei) - Adapt for running an array of attached programs. (Alexei) - Adapt for optionally skipping reuseport selection. (Martin) net/ipv6/udp.c | 60 ++++++++++++++++++++++++++++++++++++++++++-------- 1 file changed, 51 insertions(+), 9 deletions(-) diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c index 65b843e7acde..d46c62976b5b 100644 --- a/net/ipv6/udp.c +++ b/net/ipv6/udp.c @@ -190,6 +190,31 @@ static struct sock *udp6_lib_lookup2(struct net *net, return result; } +static inline struct sock *udp6_lookup_run_bpf(struct net *net, + struct udp_table *udptable, + struct sk_buff *skb, + const struct in6_addr *saddr, + __be16 sport, + const struct in6_addr *daddr, + u16 hnum) +{ + struct sock *sk, *reuse_sk; + bool no_reuseport; + + if (udptable != &udp_table) + return NULL; /* only UDP is supported */ + + no_reuseport = bpf_sk_lookup_run_v6(net, IPPROTO_UDP, + saddr, sport, daddr, hnum, &sk); + if (no_reuseport || IS_ERR_OR_NULL(sk)) + return sk; + + reuse_sk = lookup_reuseport(net, sk, skb, saddr, sport, daddr, hnum); + if (reuse_sk) + sk = reuse_sk; + return sk; +} + /* rcu_read_lock() must be held */ struct sock *__udp6_lib_lookup(struct net *net, const struct in6_addr *saddr, __be16 sport, @@ -200,25 +225,42 @@ struct sock *__udp6_lib_lookup(struct net *net, unsigned short hnum = ntohs(dport); unsigned int hash2, slot2; struct udp_hslot *hslot2; - struct sock *result; + struct sock *result, *sk; hash2 = ipv6_portaddr_hash(net, daddr, hnum); slot2 = hash2 & udptable->mask; hslot2 = &udptable->hash2[slot2]; + /* Lookup connected or non-wildcard sockets */ result = udp6_lib_lookup2(net, saddr, sport, daddr, hnum, dif, sdif, hslot2, skb); - if (!result) { - hash2 = ipv6_portaddr_hash(net, &in6addr_any, hnum); - slot2 = hash2 & udptable->mask; + if (!IS_ERR_OR_NULL(result) && result->sk_state == TCP_ESTABLISHED) + goto done; + + /* Lookup redirect from BPF */ + if (static_branch_unlikely(&bpf_sk_lookup_enabled)) { + sk = udp6_lookup_run_bpf(net, udptable, skb, + saddr, sport, daddr, hnum); + if (sk) { + result = sk; + goto done; + } + } - hslot2 = &udptable->hash2[slot2]; + /* Got non-wildcard socket or error on first lookup */ + if (result) + goto done; - result = udp6_lib_lookup2(net, saddr, sport, - &in6addr_any, hnum, dif, sdif, - hslot2, skb); - } + /* Lookup wildcard sockets */ + hash2 = ipv6_portaddr_hash(net, &in6addr_any, hnum); + slot2 = hash2 & udptable->mask; + hslot2 = &udptable->hash2[slot2]; + + result = udp6_lib_lookup2(net, saddr, sport, + &in6addr_any, hnum, dif, sdif, + hslot2, skb); +done: if (IS_ERR(result)) return NULL; return result; From patchwork Mon Jul 13 17:46:49 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jakub Sitnicki X-Patchwork-Id: 1328355 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org (client-ip=23.128.96.18; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=reject dis=none) header.from=cloudflare.com Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; secure) header.d=cloudflare.com header.i=@cloudflare.com header.a=rsa-sha256 header.s=google header.b=Ujiga2Pp; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by ozlabs.org (Postfix) with ESMTP id 4B5B1P5h3Jz9sRW for ; Tue, 14 Jul 2020 03:47:21 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730526AbgGMRrU (ORCPT ); Mon, 13 Jul 2020 13:47:20 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60344 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730507AbgGMRrS (ORCPT ); Mon, 13 Jul 2020 13:47:18 -0400 Received: from mail-lj1-x244.google.com (mail-lj1-x244.google.com [IPv6:2a00:1450:4864:20::244]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A1D10C061755 for ; Mon, 13 Jul 2020 10:47:17 -0700 (PDT) Received: by mail-lj1-x244.google.com with SMTP id z24so18955465ljn.8 for ; Mon, 13 Jul 2020 10:47:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cloudflare.com; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=HcHBwnvbK/ZEAWghrBP98aUipgfIWV3+AoVVPtvwDmA=; b=Ujiga2PpuLTYAiGTA+OvFxcvBO+K2VeA1yHrIaOaI/+x2hfQgL20SemEkb6oEloW4u ZpnOGIukTzqtjn7hcc4m1J5fS+84DMc8vOUEsHSfrY0GN5za/BJa0PxRo3GOeUhF/KId RKZOpQw8p27cf5ql6Nc1s+RE1a9DYGV7UNrVA= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=HcHBwnvbK/ZEAWghrBP98aUipgfIWV3+AoVVPtvwDmA=; b=NqsNv+Tv7nZtT7E1t9ZOBPjMftaIr7Ekn7i9Uop/nMaQ5ND3np092WJpmjhzRgRY+X MuOuuN4ia5uuGhTU0CEHppe94LbU4QqF6m2etuS3YiU8vR8uKLZ8XD6FKMU78PppQKv6 Ilj7Dlf30NQFIOgl0wotCJCWv8Qjtm8k37lIEKQc9VpE6QKUAezX70+/XjGQNsNv64Ev iHqkTiJspFwQS1jHM0jlgEGEbkdv3rU0nU4zzfg4iyQIIgi9xNUVW3tzt9nqXpFql0d7 y9G+wFLNcco9PuQwf4WSiLNlj+DS5iEhOeqUia4R8vj8LJvPvGKMGTCaiqjzCZwtbYGp ZiJg== X-Gm-Message-State: AOAM533POhlu12Q6y1cnuisQg1YXviCv/wRXc4k8qsZZbXonXMMtq9S/ lUFblN6ute17pHSux+1jdBAvaA== X-Google-Smtp-Source: ABdhPJwQ9wQS9/jXKGvkcYbCaq2AMZikKQvOmFSJ7Tr+RFkFuc64s8+zqiyK9UNWy2q7FatnmHJPYw== X-Received: by 2002:a2e:8992:: with SMTP id c18mr380078lji.388.1594662436085; Mon, 13 Jul 2020 10:47:16 -0700 (PDT) Received: from cloudflare.com ([2a02:a310:c262:aa00:b35e:8938:2c2a:ba8b]) by smtp.gmail.com with ESMTPSA id f13sm4762032lfs.29.2020.07.13.10.47.15 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 13 Jul 2020 10:47:15 -0700 (PDT) From: Jakub Sitnicki To: bpf@vger.kernel.org Cc: netdev@vger.kernel.org, kernel-team@cloudflare.com, Alexei Starovoitov , Daniel Borkmann , "David S. Miller" , Jakub Kicinski Subject: [PATCH bpf-next v4 11/16] bpf: Sync linux/bpf.h to tools/ Date: Mon, 13 Jul 2020 19:46:49 +0200 Message-Id: <20200713174654.642628-12-jakub@cloudflare.com> X-Mailer: git-send-email 2.25.4 In-Reply-To: <20200713174654.642628-1-jakub@cloudflare.com> References: <20200713174654.642628-1-jakub@cloudflare.com> MIME-Version: 1.0 Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org Newly added program, context type and helper is used by tests in a subsequent patch. Synchronize the header file. Signed-off-by: Jakub Sitnicki --- Notes: v4: - Update after changes to bpf.h in earlier patch. v3: - Update after changes to bpf.h in earlier patch. v2: - Update after changes to bpf.h in earlier patch. tools/include/uapi/linux/bpf.h | 77 ++++++++++++++++++++++++++++++++++ 1 file changed, 77 insertions(+) diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index 548a749aebb3..e2ffeb150d0f 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -189,6 +189,7 @@ enum bpf_prog_type { BPF_PROG_TYPE_STRUCT_OPS, BPF_PROG_TYPE_EXT, BPF_PROG_TYPE_LSM, + BPF_PROG_TYPE_SK_LOOKUP, }; enum bpf_attach_type { @@ -227,6 +228,7 @@ enum bpf_attach_type { BPF_CGROUP_INET6_GETSOCKNAME, BPF_XDP_DEVMAP, BPF_CGROUP_INET_SOCK_RELEASE, + BPF_SK_LOOKUP, __MAX_BPF_ATTACH_TYPE }; @@ -3068,6 +3070,10 @@ union bpf_attr { * * long bpf_sk_assign(struct sk_buff *skb, struct bpf_sock *sk, u64 flags) * Description + * Helper is overloaded depending on BPF program type. This + * description applies to **BPF_PROG_TYPE_SCHED_CLS** and + * **BPF_PROG_TYPE_SCHED_ACT** programs. + * * Assign the *sk* to the *skb*. When combined with appropriate * routing configuration to receive the packet towards the socket, * will cause *skb* to be delivered to the specified socket. @@ -3093,6 +3099,56 @@ union bpf_attr { * **-ESOCKTNOSUPPORT** if the socket type is not supported * (reuseport). * + * long bpf_sk_assign(struct bpf_sk_lookup *ctx, struct bpf_sock *sk, u64 flags) + * Description + * Helper is overloaded depending on BPF program type. This + * description applies to **BPF_PROG_TYPE_SK_LOOKUP** programs. + * + * Select the *sk* as a result of a socket lookup. + * + * For the operation to succeed passed socket must be compatible + * with the packet description provided by the *ctx* object. + * + * L4 protocol (**IPPROTO_TCP** or **IPPROTO_UDP**) must + * be an exact match. While IP family (**AF_INET** or + * **AF_INET6**) must be compatible, that is IPv6 sockets + * that are not v6-only can be selected for IPv4 packets. + * + * Only TCP listeners and UDP unconnected sockets can be + * selected. *sk* can also be NULL to reset any previous + * selection. + * + * *flags* argument can combination of following values: + * + * * **BPF_SK_LOOKUP_F_REPLACE** to override the previous + * socket selection, potentially done by a BPF program + * that ran before us. + * + * * **BPF_SK_LOOKUP_F_NO_REUSEPORT** to skip + * load-balancing within reuseport group for the socket + * being selected. + * + * On success *ctx->sk* will point to the selected socket. + * + * Return + * 0 on success, or a negative errno in case of failure. + * + * * **-EAFNOSUPPORT** if socket family (*sk->family*) is + * not compatible with packet family (*ctx->family*). + * + * * **-EEXIST** if socket has been already selected, + * potentially by another program, and + * **BPF_SK_LOOKUP_F_REPLACE** flag was not specified. + * + * * **-EINVAL** if unsupported flags were specified. + * + * * **-EPROTOTYPE** if socket L4 protocol + * (*sk->protocol*) doesn't match packet protocol + * (*ctx->protocol*). + * + * * **-ESOCKTNOSUPPORT** if socket is not in allowed + * state (TCP listening or UDP unconnected). + * * u64 bpf_ktime_get_boot_ns(void) * Description * Return the time elapsed since system boot, in nanoseconds. @@ -3605,6 +3661,12 @@ enum { BPF_RINGBUF_HDR_SZ = 8, }; +/* BPF_FUNC_sk_assign flags in bpf_sk_lookup context. */ +enum { + BPF_SK_LOOKUP_F_REPLACE = (1ULL << 0), + BPF_SK_LOOKUP_F_NO_REUSEPORT = (1ULL << 1), +}; + /* Mode for BPF_FUNC_skb_adjust_room helper. */ enum bpf_adj_room_mode { BPF_ADJ_ROOM_NET, @@ -4334,4 +4396,19 @@ struct bpf_pidns_info { __u32 pid; __u32 tgid; }; + +/* User accessible data for SK_LOOKUP programs. Add new fields at the end. */ +struct bpf_sk_lookup { + __bpf_md_ptr(struct bpf_sock *, sk); /* Selected socket */ + + __u32 family; /* Protocol family (AF_INET, AF_INET6) */ + __u32 protocol; /* IP protocol (IPPROTO_TCP, IPPROTO_UDP) */ + __u32 remote_ip4; /* Network byte order */ + __u32 remote_ip6[4]; /* Network byte order */ + __u32 remote_port; /* Network byte order */ + __u32 local_ip4; /* Network byte order */ + __u32 local_ip6[4]; /* Network byte order */ + __u32 local_port; /* Host byte order */ +}; + #endif /* _UAPI__LINUX_BPF_H__ */ From patchwork Mon Jul 13 17:46:50 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jakub Sitnicki X-Patchwork-Id: 1328364 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org (client-ip=23.128.96.18; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=reject dis=none) header.from=cloudflare.com Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; secure) header.d=cloudflare.com header.i=@cloudflare.com header.a=rsa-sha256 header.s=google header.b=b0dBiSQB; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by ozlabs.org (Postfix) with ESMTP id 4B5B1b5cCzz9sR4 for ; Tue, 14 Jul 2020 03:47:31 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730540AbgGMRr3 (ORCPT ); Mon, 13 Jul 2020 13:47:29 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60354 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730520AbgGMRrT (ORCPT ); Mon, 13 Jul 2020 13:47:19 -0400 Received: from mail-lj1-x241.google.com (mail-lj1-x241.google.com [IPv6:2a00:1450:4864:20::241]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4ECCBC061755 for ; Mon, 13 Jul 2020 10:47:19 -0700 (PDT) Received: by mail-lj1-x241.google.com with SMTP id b25so18932009ljp.6 for ; Mon, 13 Jul 2020 10:47:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cloudflare.com; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=d2MbTasSERGXmtsRA08a5962M6R0um/2B1uf4vS2E1Q=; b=b0dBiSQBcZlqdEMJRe1nTnIUghXUCsoDaaHaq4rf6ms6Y7YrIWDRjFAKkS29VjgI+j Z6C0f3hfQX2fAIQGrv6KIN1i1lfu/ChX4KXa5HFSlQ0AYjdXvkrrTrNKY2PXzJTZRATO uAXNyt0SLLJ6pa5nEooPUlWlNaZQAwn8u5mJY= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=d2MbTasSERGXmtsRA08a5962M6R0um/2B1uf4vS2E1Q=; b=SrcqPMD9Cto0/n7phjXDlLsOfLd0FiGRfdRZ1cTPEvft3+Kcasogx/odbkOJ723n4U 8ZEy68RQAKGkK/ogXRrv3W6GXnD7TId26Z3AI0cQHzgVlw7Ad3n/QajG+cNCEFL2cY/O 3OPn0KyEhvqGLJ8SMgMUgKOyvgIDMDKk2wfZGk34niaXZgMtlP40QT2RNRIS6yLsNhbN gosbeWqdg1dDAxkVtazGs7mPBfyTQHPyliN6JZ5CXsjONhQ2T8IFOjcXFG79ednJIjBC h9q6zQWyaAKEyxjOygd3xDKTZBGHd28/MzrNUQqbT/+XMPU+iYJju8HsotDlzF/v3SaR t9QQ== X-Gm-Message-State: AOAM532xosfSjs0HDPwvbJyFIYVMHSwcoMdelSD0cTBtx81z8mtWAvX7 jF2fsxTuZM0JhUhaCO5t8b5UIg== X-Google-Smtp-Source: ABdhPJzexv9qPbq0TqbcqZ1ZOG2nZXS9iXfhK1n3cvEFOqJcTMlyk3B+PckxhRvWm9GnvN7DZOkZjQ== X-Received: by 2002:a2e:8059:: with SMTP id p25mr367806ljg.156.1594662437706; Mon, 13 Jul 2020 10:47:17 -0700 (PDT) Received: from cloudflare.com ([2a02:a310:c262:aa00:b35e:8938:2c2a:ba8b]) by smtp.gmail.com with ESMTPSA id h18sm4164630lji.136.2020.07.13.10.47.16 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 13 Jul 2020 10:47:17 -0700 (PDT) From: Jakub Sitnicki To: bpf@vger.kernel.org Cc: netdev@vger.kernel.org, kernel-team@cloudflare.com, Alexei Starovoitov , Daniel Borkmann , "David S. Miller" , Jakub Kicinski Subject: [PATCH bpf-next v4 12/16] libbpf: Add support for SK_LOOKUP program type Date: Mon, 13 Jul 2020 19:46:50 +0200 Message-Id: <20200713174654.642628-13-jakub@cloudflare.com> X-Mailer: git-send-email 2.25.4 In-Reply-To: <20200713174654.642628-1-jakub@cloudflare.com> References: <20200713174654.642628-1-jakub@cloudflare.com> MIME-Version: 1.0 Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org Make libbpf aware of the newly added program type, and assign it a section name. Signed-off-by: Jakub Sitnicki --- Notes: v4: - Add trailing slash to section prefix ("sk_lookup/"). (Andrii) v3: - Move new libbpf symbols to version 0.1.0. - Set expected_attach_type in probe_load for new prog type. v2: - Add new libbpf symbols to version 0.0.9. (Andrii) tools/lib/bpf/libbpf.c | 3 +++ tools/lib/bpf/libbpf.h | 2 ++ tools/lib/bpf/libbpf.map | 2 ++ tools/lib/bpf/libbpf_probes.c | 3 +++ 4 files changed, 10 insertions(+) diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 25e4f77be8d7..1dfdf7d36352 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -6793,6 +6793,7 @@ BPF_PROG_TYPE_FNS(perf_event, BPF_PROG_TYPE_PERF_EVENT); BPF_PROG_TYPE_FNS(tracing, BPF_PROG_TYPE_TRACING); BPF_PROG_TYPE_FNS(struct_ops, BPF_PROG_TYPE_STRUCT_OPS); BPF_PROG_TYPE_FNS(extension, BPF_PROG_TYPE_EXT); +BPF_PROG_TYPE_FNS(sk_lookup, BPF_PROG_TYPE_SK_LOOKUP); enum bpf_attach_type bpf_program__get_expected_attach_type(struct bpf_program *prog) @@ -6973,6 +6974,8 @@ static const struct bpf_sec_def section_defs[] = { BPF_EAPROG_SEC("cgroup/setsockopt", BPF_PROG_TYPE_CGROUP_SOCKOPT, BPF_CGROUP_SETSOCKOPT), BPF_PROG_SEC("struct_ops", BPF_PROG_TYPE_STRUCT_OPS), + BPF_EAPROG_SEC("sk_lookup/", BPF_PROG_TYPE_SK_LOOKUP, + BPF_SK_LOOKUP), }; #undef BPF_PROG_SEC_IMPL diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h index 2335971ed0bd..c2272132e929 100644 --- a/tools/lib/bpf/libbpf.h +++ b/tools/lib/bpf/libbpf.h @@ -350,6 +350,7 @@ LIBBPF_API int bpf_program__set_perf_event(struct bpf_program *prog); LIBBPF_API int bpf_program__set_tracing(struct bpf_program *prog); LIBBPF_API int bpf_program__set_struct_ops(struct bpf_program *prog); LIBBPF_API int bpf_program__set_extension(struct bpf_program *prog); +LIBBPF_API int bpf_program__set_sk_lookup(struct bpf_program *prog); LIBBPF_API enum bpf_prog_type bpf_program__get_type(struct bpf_program *prog); LIBBPF_API void bpf_program__set_type(struct bpf_program *prog, @@ -377,6 +378,7 @@ LIBBPF_API bool bpf_program__is_perf_event(const struct bpf_program *prog); LIBBPF_API bool bpf_program__is_tracing(const struct bpf_program *prog); LIBBPF_API bool bpf_program__is_struct_ops(const struct bpf_program *prog); LIBBPF_API bool bpf_program__is_extension(const struct bpf_program *prog); +LIBBPF_API bool bpf_program__is_sk_lookup(const struct bpf_program *prog); /* * No need for __attribute__((packed)), all members of 'bpf_map_def' diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map index c5d5c7664c3b..6f0856abe299 100644 --- a/tools/lib/bpf/libbpf.map +++ b/tools/lib/bpf/libbpf.map @@ -287,6 +287,8 @@ LIBBPF_0.1.0 { bpf_map__type; bpf_map__value_size; bpf_program__autoload; + bpf_program__is_sk_lookup; bpf_program__set_autoload; + bpf_program__set_sk_lookup; btf__set_fd; } LIBBPF_0.0.9; diff --git a/tools/lib/bpf/libbpf_probes.c b/tools/lib/bpf/libbpf_probes.c index 10cd8d1891f5..5a3d3f078408 100644 --- a/tools/lib/bpf/libbpf_probes.c +++ b/tools/lib/bpf/libbpf_probes.c @@ -78,6 +78,9 @@ probe_load(enum bpf_prog_type prog_type, const struct bpf_insn *insns, case BPF_PROG_TYPE_CGROUP_SOCK_ADDR: xattr.expected_attach_type = BPF_CGROUP_INET4_CONNECT; break; + case BPF_PROG_TYPE_SK_LOOKUP: + xattr.expected_attach_type = BPF_SK_LOOKUP; + break; case BPF_PROG_TYPE_KPROBE: xattr.kern_version = get_kernel_version(); break; From patchwork Mon Jul 13 17:46:51 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jakub Sitnicki X-Patchwork-Id: 1328365 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org (client-ip=23.128.96.18; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=reject dis=none) header.from=cloudflare.com Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; secure) header.d=cloudflare.com header.i=@cloudflare.com header.a=rsa-sha256 header.s=google header.b=UbVbBF9J; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by ozlabs.org (Postfix) with ESMTP id 4B5B1c1b5rz9sRW for ; Tue, 14 Jul 2020 03:47:32 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730548AbgGMRra (ORCPT ); Mon, 13 Jul 2020 13:47:30 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60362 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730507AbgGMRrV (ORCPT ); Mon, 13 Jul 2020 13:47:21 -0400 Received: from mail-lj1-x22e.google.com (mail-lj1-x22e.google.com [IPv6:2a00:1450:4864:20::22e]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2979CC08C5DB for ; Mon, 13 Jul 2020 10:47:21 -0700 (PDT) Received: by mail-lj1-x22e.google.com with SMTP id q4so19010271lji.2 for ; Mon, 13 Jul 2020 10:47:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cloudflare.com; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=AyMfG/HsKAqlEgMq6Li9Y2u77ypCGbN506NWO50QX3Q=; b=UbVbBF9JMEf2IYzaTLaA8hdc6cE25wp+3t7yNtDUTZdpBK6t70DZLbPHnYLPdhPlkQ igvwHgibT54icmNx/nmJe5wNlTBoHUJGUUVH7qHs6NqekxongxFne1hpNHXzrtqDo3Q7 zaCuq9LHKSfNrGn57njCAO3he3gKmRJnObgmg= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=AyMfG/HsKAqlEgMq6Li9Y2u77ypCGbN506NWO50QX3Q=; b=YzJKnMsyK4gTqd5rVW5QoOXKie367XaqvsZgpWKFPHEeW/zHXa/R2nJ/BW9BChDCfo pKSCGjmS35c0FhslTT57oxTH73El7rx1CihR/uzOcb9sAaE02xoXsbsQNIXPpczYtDt2 Ab8Cg4sc1m/Un4d50+NYNEm6TQ0qiy7JAWvYuhoFFx4Tikosr1DGbJQSEiu8XCjJ5gkM Q8cokQAs1XtWIaBn6LNAFB6/RQjZQv20ayjs3qAHuSA8shzuUFprSsm40ZkND4NFGFuP 8175T09ye7arc0VNG/EEs1uo0XJghJI9oQcVXdzRugQfxWmokIskpygtIXvRBjIcL3H2 c3pw== X-Gm-Message-State: AOAM532eKFsog2mE67p+X2LOYmYkUtGiHd5aIhzAdbF0cjW6QQLZVVCg qG0yncdjBJqoysm/TZm7/l7d2g== X-Google-Smtp-Source: ABdhPJwQq0qg5FqzJIChrRpnHkyul5R+1XxHa1aghGXAgvsN05JN3+Ivn0/IlZixXaesnmc/lD1fMw== X-Received: by 2002:a2e:8855:: with SMTP id z21mr386110ljj.325.1594662439641; Mon, 13 Jul 2020 10:47:19 -0700 (PDT) Received: from cloudflare.com ([2a02:a310:c262:aa00:b35e:8938:2c2a:ba8b]) by smtp.gmail.com with ESMTPSA id a22sm4734030lfg.96.2020.07.13.10.47.18 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 13 Jul 2020 10:47:19 -0700 (PDT) From: Jakub Sitnicki To: bpf@vger.kernel.org Cc: netdev@vger.kernel.org, kernel-team@cloudflare.com, Alexei Starovoitov , Daniel Borkmann , "David S. Miller" , Jakub Kicinski Subject: [PATCH bpf-next v4 13/16] tools/bpftool: Add name mappings for SK_LOOKUP prog and attach type Date: Mon, 13 Jul 2020 19:46:51 +0200 Message-Id: <20200713174654.642628-14-jakub@cloudflare.com> X-Mailer: git-send-email 2.25.4 In-Reply-To: <20200713174654.642628-1-jakub@cloudflare.com> References: <20200713174654.642628-1-jakub@cloudflare.com> MIME-Version: 1.0 Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org Make bpftool show human-friendly identifiers for newly introduced program and attach type, BPF_PROG_TYPE_SK_LOOKUP and BPF_SK_LOOKUP, respectively. Signed-off-by: Jakub Sitnicki Acked-by: Andrii Nakryiko --- Notes: v3: - New patch in v3. tools/bpf/bpftool/common.c | 1 + tools/bpf/bpftool/prog.c | 3 ++- 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a/tools/bpf/bpftool/common.c b/tools/bpf/bpftool/common.c index 29f4e7611ae8..9b28c69dd8e4 100644 --- a/tools/bpf/bpftool/common.c +++ b/tools/bpf/bpftool/common.c @@ -64,6 +64,7 @@ const char * const attach_type_name[__MAX_BPF_ATTACH_TYPE] = { [BPF_TRACE_FEXIT] = "fexit", [BPF_MODIFY_RETURN] = "mod_ret", [BPF_LSM_MAC] = "lsm_mac", + [BPF_SK_LOOKUP] = "sk_lookup", }; void p_err(const char *fmt, ...) diff --git a/tools/bpf/bpftool/prog.c b/tools/bpf/bpftool/prog.c index 6863c57effd0..3e6ecc6332e2 100644 --- a/tools/bpf/bpftool/prog.c +++ b/tools/bpf/bpftool/prog.c @@ -59,6 +59,7 @@ const char * const prog_type_name[] = { [BPF_PROG_TYPE_TRACING] = "tracing", [BPF_PROG_TYPE_STRUCT_OPS] = "struct_ops", [BPF_PROG_TYPE_EXT] = "ext", + [BPF_PROG_TYPE_SK_LOOKUP] = "sk_lookup", }; const size_t prog_type_name_size = ARRAY_SIZE(prog_type_name); @@ -1905,7 +1906,7 @@ static int do_help(int argc, char **argv) " cgroup/getsockname4 | cgroup/getsockname6 | cgroup/sendmsg4 |\n" " cgroup/sendmsg6 | cgroup/recvmsg4 | cgroup/recvmsg6 |\n" " cgroup/getsockopt | cgroup/setsockopt |\n" - " struct_ops | fentry | fexit | freplace }\n" + " struct_ops | fentry | fexit | freplace | sk_lookup }\n" " ATTACH_TYPE := { msg_verdict | stream_verdict | stream_parser |\n" " flow_dissector }\n" " METRIC := { cycles | instructions | l1d_loads | llc_misses }\n" From patchwork Mon Jul 13 17:46:52 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jakub Sitnicki X-Patchwork-Id: 1328368 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Original-To: incoming-bpf@patchwork.ozlabs.org Delivered-To: patchwork-incoming-bpf@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org (client-ip=23.128.96.18; helo=vger.kernel.org; envelope-from=bpf-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=reject dis=none) header.from=cloudflare.com Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; secure) header.d=cloudflare.com header.i=@cloudflare.com header.a=rsa-sha256 header.s=google header.b=NvMHrWhe; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by ozlabs.org (Postfix) with ESMTP id 4B5B1h62Xkz9sR4 for ; Tue, 14 Jul 2020 03:47:36 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730554AbgGMRre (ORCPT ); Mon, 13 Jul 2020 13:47:34 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60382 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730262AbgGMRr0 (ORCPT ); Mon, 13 Jul 2020 13:47:26 -0400 Received: from mail-lj1-x242.google.com (mail-lj1-x242.google.com [IPv6:2a00:1450:4864:20::242]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 12EEAC08C5DF for ; Mon, 13 Jul 2020 10:47:24 -0700 (PDT) Received: by mail-lj1-x242.google.com with SMTP id s9so18956001ljm.11 for ; Mon, 13 Jul 2020 10:47:23 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cloudflare.com; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=5VY9p1/52hi4hbQ4htxtTIqUGt8t9p8oZD49pXCtp5Y=; b=NvMHrWhemqv1H3THxKkL0FkfE7vlDxFp9eNx6wpwnT4iVCODukqeasR4iBq1A6BazM 8qCePhoreocDSFNYX3MLEPH0y13INy9qO1c60i5t06BOl9qQMeFt/O8X99B4zpeoeWrK o8wAOrQgp4swgnVaJjQB74pbQOFr7O3UN7R/Q= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=5VY9p1/52hi4hbQ4htxtTIqUGt8t9p8oZD49pXCtp5Y=; b=BugjPEvp2uQpSBmtFePT3hBh45AVjXW+IRBWoa6owP5rv24RlmwLkqVVDcPTEDtk/t /el4p6ddRFJLOPtkWrifOC52qXWzCvcCyoxcateUxyS51IyziAcybLrg2JQZP3iL4kzm D3A8deMPadPbgGPzsrX9zVT3Cez6zaj1QYfQM9zEqPRSWMXsVqsnDycjhgPIiNZ6iUa2 KXSku6K4EGjqg/l+ALZjxGYxGqaFd2/Ey6k04GMpOehIVyJeufEsY06KpZh7hwFY8FeF 3upLa+B/7+NPqyg2O/06jUemmk0lw5Jf7AhuCrOuGdm6kV+eeRZZSNOoSPkz5iQCIUgj SICQ== X-Gm-Message-State: AOAM532sY/M5B2aMdD7pkHnrsWJ9KzIGZwEae9lp5iTb6jPhHwT49A4l 2rsHzz4vALDILQZ2fZSK52HmxmVeeLyXLA== X-Google-Smtp-Source: ABdhPJztwjuJnYnnluy9WSI8DAN4XSFT/4dRKrgoOPePYqJpuB5s3IVpwKILJ7IQNeHan3ibF8w0Fg== X-Received: by 2002:a2e:a0cd:: with SMTP id f13mr344695ljm.343.1594662441867; Mon, 13 Jul 2020 10:47:21 -0700 (PDT) Received: from cloudflare.com ([2a02:a310:c262:aa00:b35e:8938:2c2a:ba8b]) by smtp.gmail.com with ESMTPSA id o1sm4752393lfi.92.2020.07.13.10.47.20 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 13 Jul 2020 10:47:20 -0700 (PDT) From: Jakub Sitnicki To: bpf@vger.kernel.org Cc: netdev@vger.kernel.org, kernel-team@cloudflare.com, Alexei Starovoitov , Daniel Borkmann , "David S. Miller" , Jakub Kicinski Subject: [PATCH bpf-next v4 14/16] selftests/bpf: Add verifier tests for bpf_sk_lookup context access Date: Mon, 13 Jul 2020 19:46:52 +0200 Message-Id: <20200713174654.642628-15-jakub@cloudflare.com> X-Mailer: git-send-email 2.25.4 In-Reply-To: <20200713174654.642628-1-jakub@cloudflare.com> References: <20200713174654.642628-1-jakub@cloudflare.com> MIME-Version: 1.0 Sender: bpf-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org Exercise verifier access checks for bpf_sk_lookup context fields. Signed-off-by: Jakub Sitnicki Acked-by: Andrii Nakryiko --- Notes: v4: - Bring back tests for narrow loads. v3: - Consolidate ACCEPT tests into one. - Deduplicate REJECT tests and arrange them into logical groups. - Add tests for out-of-bounds and unaligned access. - Cover access to newly introduced 'sk' field. v2: - Adjust for fields renames in struct bpf_sk_lookup. .../selftests/bpf/verifier/ctx_sk_lookup.c | 471 ++++++++++++++++++ 1 file changed, 471 insertions(+) create mode 100644 tools/testing/selftests/bpf/verifier/ctx_sk_lookup.c diff --git a/tools/testing/selftests/bpf/verifier/ctx_sk_lookup.c b/tools/testing/selftests/bpf/verifier/ctx_sk_lookup.c new file mode 100644 index 000000000000..24b0852984cf --- /dev/null +++ b/tools/testing/selftests/bpf/verifier/ctx_sk_lookup.c @@ -0,0 +1,471 @@ +{ + "valid 1,2,4,8-byte reads from bpf_sk_lookup", + .insns = { + /* 1-byte read from family field */ + BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, family)), + BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, family) + 1), + BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, family) + 2), + BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, family) + 3), + /* 2-byte read from family field */ + BPF_LDX_MEM(BPF_H, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, family)), + BPF_LDX_MEM(BPF_H, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, family) + 2), + /* 4-byte read from family field */ + BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, family)), + + /* 1-byte read from protocol field */ + BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, protocol)), + BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, protocol) + 1), + BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, protocol) + 2), + BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, protocol) + 3), + /* 2-byte read from protocol field */ + BPF_LDX_MEM(BPF_H, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, protocol)), + BPF_LDX_MEM(BPF_H, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, protocol) + 2), + /* 4-byte read from protocol field */ + BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, protocol)), + + /* 1-byte read from remote_ip4 field */ + BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, remote_ip4)), + BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, remote_ip4) + 1), + BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, remote_ip4) + 2), + BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, remote_ip4) + 3), + /* 2-byte read from remote_ip4 field */ + BPF_LDX_MEM(BPF_H, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, remote_ip4)), + BPF_LDX_MEM(BPF_H, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, remote_ip4) + 2), + /* 4-byte read from remote_ip4 field */ + BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, remote_ip4)), + + /* 1-byte read from remote_ip6 field */ + BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, remote_ip6)), + BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, remote_ip6) + 1), + BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, remote_ip6) + 2), + BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, remote_ip6) + 3), + BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, remote_ip6) + 4), + BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, remote_ip6) + 5), + BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, remote_ip6) + 6), + BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, remote_ip6) + 7), + BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, remote_ip6) + 8), + BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, remote_ip6) + 9), + BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, remote_ip6) + 10), + BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, remote_ip6) + 11), + BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, remote_ip6) + 12), + BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, remote_ip6) + 13), + BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, remote_ip6) + 14), + BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, remote_ip6) + 15), + /* 2-byte read from remote_ip6 field */ + BPF_LDX_MEM(BPF_H, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, remote_ip6)), + BPF_LDX_MEM(BPF_H, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, remote_ip6) + 2), + BPF_LDX_MEM(BPF_H, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, remote_ip6) + 4), + BPF_LDX_MEM(BPF_H, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, remote_ip6) + 6), + BPF_LDX_MEM(BPF_H, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, remote_ip6) + 8), + BPF_LDX_MEM(BPF_H, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, remote_ip6) + 10), + BPF_LDX_MEM(BPF_H, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, remote_ip6) + 12), + BPF_LDX_MEM(BPF_H, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, remote_ip6) + 14), + /* 4-byte read from remote_ip6 field */ + BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, remote_ip6)), + BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, remote_ip6) + 4), + BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, remote_ip6) + 8), + BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, remote_ip6) + 12), + + /* 1-byte read from remote_port field */ + BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, remote_port)), + BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, remote_port) + 1), + BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, remote_port) + 2), + BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, remote_port) + 3), + /* 2-byte read from remote_port field */ + BPF_LDX_MEM(BPF_H, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, remote_port)), + BPF_LDX_MEM(BPF_H, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, remote_port) + 2), + /* 4-byte read from remote_port field */ + BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, remote_port)), + + /* 1-byte read from local_ip4 field */ + BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, local_ip4)), + BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, local_ip4) + 1), + BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, local_ip4) + 2), + BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, local_ip4) + 3), + /* 2-byte read from local_ip4 field */ + BPF_LDX_MEM(BPF_H, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, local_ip4)), + BPF_LDX_MEM(BPF_H, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, local_ip4) + 2), + /* 4-byte read from local_ip4 field */ + BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, local_ip4)), + + /* 1-byte read from local_ip6 field */ + BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, local_ip6)), + BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, local_ip6) + 1), + BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, local_ip6) + 2), + BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, local_ip6) + 3), + BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, local_ip6) + 4), + BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, local_ip6) + 5), + BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, local_ip6) + 6), + BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, local_ip6) + 7), + BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, local_ip6) + 8), + BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, local_ip6) + 9), + BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, local_ip6) + 10), + BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, local_ip6) + 11), + BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, local_ip6) + 12), + BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, local_ip6) + 13), + BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, local_ip6) + 14), + BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, local_ip6) + 15), + /* 2-byte read from local_ip6 field */ + BPF_LDX_MEM(BPF_H, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, local_ip6)), + BPF_LDX_MEM(BPF_H, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, local_ip6) + 2), + BPF_LDX_MEM(BPF_H, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, local_ip6) + 4), + BPF_LDX_MEM(BPF_H, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, local_ip6) + 6), + BPF_LDX_MEM(BPF_H, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, local_ip6) + 8), + BPF_LDX_MEM(BPF_H, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, local_ip6) + 10), + BPF_LDX_MEM(BPF_H, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, local_ip6) + 12), + BPF_LDX_MEM(BPF_H, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, local_ip6) + 14), + /* 4-byte read from local_ip6 field */ + BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, local_ip6)), + BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, local_ip6) + 4), + BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, local_ip6) + 8), + BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, local_ip6) + 12), + + /* 1-byte read from local_port field */ + BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, local_port)), + BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, local_port) + 1), + BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, local_port) + 2), + BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, local_port) + 3), + /* 2-byte read from local_port field */ + BPF_LDX_MEM(BPF_H, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, local_port)), + BPF_LDX_MEM(BPF_H, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, local_port) + 2), + /* 4-byte read from local_port field */ + BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, local_port)), + + /* 8-byte read from sk field */ + BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, sk)), + BPF_EXIT_INSN(), + }, + .result = ACCEPT, + .prog_type = BPF_PROG_TYPE_SK_LOOKUP, + .expected_attach_type = BPF_SK_LOOKUP, +}, +/* invalid 8-byte reads from a 4-byte fields in bpf_sk_lookup */ +{ + "invalid 8-byte read from bpf_sk_lookup family field", + .insns = { + BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, family)), + BPF_EXIT_INSN(), + }, + .errstr = "invalid bpf_context access", + .result = REJECT, + .prog_type = BPF_PROG_TYPE_SK_LOOKUP, + .expected_attach_type = BPF_SK_LOOKUP, +}, +{ + "invalid 8-byte read from bpf_sk_lookup protocol field", + .insns = { + BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, protocol)), + BPF_EXIT_INSN(), + }, + .errstr = "invalid bpf_context access", + .result = REJECT, + .prog_type = BPF_PROG_TYPE_SK_LOOKUP, + .expected_attach_type = BPF_SK_LOOKUP, +}, +{ + "invalid 8-byte read from bpf_sk_lookup remote_ip4 field", + .insns = { + BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, remote_ip4)), + BPF_EXIT_INSN(), + }, + .errstr = "invalid bpf_context access", + .result = REJECT, + .prog_type = BPF_PROG_TYPE_SK_LOOKUP, + .expected_attach_type = BPF_SK_LOOKUP, +}, +{ + "invalid 8-byte read from bpf_sk_lookup remote_ip6 field", + .insns = { + BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, remote_ip6)), + BPF_EXIT_INSN(), + }, + .errstr = "invalid bpf_context access", + .result = REJECT, + .prog_type = BPF_PROG_TYPE_SK_LOOKUP, + .expected_attach_type = BPF_SK_LOOKUP, +}, +{ + "invalid 8-byte read from bpf_sk_lookup remote_port field", + .insns = { + BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, remote_port)), + BPF_EXIT_INSN(), + }, + .errstr = "invalid bpf_context access", + .result = REJECT, + .prog_type = BPF_PROG_TYPE_SK_LOOKUP, + .expected_attach_type = BPF_SK_LOOKUP, +}, +{ + "invalid 8-byte read from bpf_sk_lookup local_ip4 field", + .insns = { + BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, local_ip4)), + BPF_EXIT_INSN(), + }, + .errstr = "invalid bpf_context access", + .result = REJECT, + .prog_type = BPF_PROG_TYPE_SK_LOOKUP, + .expected_attach_type = BPF_SK_LOOKUP, +}, +{ + "invalid 8-byte read from bpf_sk_lookup local_ip6 field", + .insns = { + BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, local_ip6)), + BPF_EXIT_INSN(), + }, + .errstr = "invalid bpf_context access", + .result = REJECT, + .prog_type = BPF_PROG_TYPE_SK_LOOKUP, + .expected_attach_type = BPF_SK_LOOKUP, +}, +{ + "invalid 8-byte read from bpf_sk_lookup local_port field", + .insns = { + BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, local_port)), + BPF_EXIT_INSN(), + }, + .errstr = "invalid bpf_context access", + .result = REJECT, + .prog_type = BPF_PROG_TYPE_SK_LOOKUP, + .expected_attach_type = BPF_SK_LOOKUP, +}, +/* invalid 1,2,4-byte reads from 8-byte fields in bpf_sk_lookup */ +{ + "invalid 4-byte read from bpf_sk_lookup sk field", + .insns = { + BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, sk)), + BPF_EXIT_INSN(), + }, + .errstr = "invalid bpf_context access", + .result = REJECT, + .prog_type = BPF_PROG_TYPE_SK_LOOKUP, + .expected_attach_type = BPF_SK_LOOKUP, +}, +{ + "invalid 2-byte read from bpf_sk_lookup sk field", + .insns = { + BPF_LDX_MEM(BPF_H, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, sk)), + BPF_EXIT_INSN(), + }, + .errstr = "invalid bpf_context access", + .result = REJECT, + .prog_type = BPF_PROG_TYPE_SK_LOOKUP, + .expected_attach_type = BPF_SK_LOOKUP, +}, +{ + "invalid 1-byte read from bpf_sk_lookup sk field", + .insns = { + BPF_LDX_MEM(BPF_B, BPF_REG_0, BPF_REG_1, + offsetof(struct bpf_sk_lookup, sk)), + BPF_EXIT_INSN(), + }, + .errstr = "invalid bpf_context access", + .result = REJECT, + .prog_type = BPF_PROG_TYPE_SK_LOOKUP, + .expected_attach_type = BPF_SK_LOOKUP, +}, +/* out of bounds and unaligned reads from bpf_sk_lookup */ +{ + "invalid 4-byte read past end of bpf_sk_lookup", + .insns = { + BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_1, + sizeof(struct bpf_sk_lookup)), + BPF_EXIT_INSN(), + }, + .errstr = "invalid bpf_context access", + .result = REJECT, + .prog_type = BPF_PROG_TYPE_SK_LOOKUP, + .expected_attach_type = BPF_SK_LOOKUP, +}, +{ + "invalid 4-byte unaligned read from bpf_sk_lookup at odd offset", + .insns = { + BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_1, 1), + BPF_EXIT_INSN(), + }, + .errstr = "invalid bpf_context access", + .result = REJECT, + .prog_type = BPF_PROG_TYPE_SK_LOOKUP, + .expected_attach_type = BPF_SK_LOOKUP, +}, +{ + "invalid 4-byte unaligned read from bpf_sk_lookup at even offset", + .insns = { + BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_1, 2), + BPF_EXIT_INSN(), + }, + .errstr = "invalid bpf_context access", + .result = REJECT, + .prog_type = BPF_PROG_TYPE_SK_LOOKUP, + .expected_attach_type = BPF_SK_LOOKUP, +}, +/* in-bound and out-of-bound writes to bpf_sk_lookup */ +{ + "invalid 8-byte write to bpf_sk_lookup", + .insns = { + BPF_MOV64_IMM(BPF_REG_0, 0xcafe4a11U), + BPF_STX_MEM(BPF_DW, BPF_REG_1, BPF_REG_0, 0), + BPF_EXIT_INSN(), + }, + .errstr = "invalid bpf_context access", + .result = REJECT, + .prog_type = BPF_PROG_TYPE_SK_LOOKUP, + .expected_attach_type = BPF_SK_LOOKUP, +}, +{ + "invalid 4-byte write to bpf_sk_lookup", + .insns = { + BPF_MOV64_IMM(BPF_REG_0, 0xcafe4a11U), + BPF_STX_MEM(BPF_W, BPF_REG_1, BPF_REG_0, 0), + BPF_EXIT_INSN(), + }, + .errstr = "invalid bpf_context access", + .result = REJECT, + .prog_type = BPF_PROG_TYPE_SK_LOOKUP, + .expected_attach_type = BPF_SK_LOOKUP, +}, +{ + "invalid 2-byte write to bpf_sk_lookup", + .insns = { + BPF_MOV64_IMM(BPF_REG_0, 0xcafe4a11U), + BPF_STX_MEM(BPF_H, BPF_REG_1, BPF_REG_0, 0), + BPF_EXIT_INSN(), + }, + .errstr = "invalid bpf_context access", + .result = REJECT, + .prog_type = BPF_PROG_TYPE_SK_LOOKUP, + .expected_attach_type = BPF_SK_LOOKUP, +}, +{ + "invalid 1-byte write to bpf_sk_lookup", + .insns = { + BPF_MOV64_IMM(BPF_REG_0, 0xcafe4a11U), + BPF_STX_MEM(BPF_B, BPF_REG_1, BPF_REG_0, 0), + BPF_EXIT_INSN(), + }, + .errstr = "invalid bpf_context access", + .result = REJECT, + .prog_type = BPF_PROG_TYPE_SK_LOOKUP, + .expected_attach_type = BPF_SK_LOOKUP, +}, +{ + "invalid 4-byte write past end of bpf_sk_lookup", + .insns = { + BPF_MOV64_IMM(BPF_REG_0, 0xcafe4a11U), + BPF_STX_MEM(BPF_W, BPF_REG_1, BPF_REG_0, + sizeof(struct bpf_sk_lookup)), + BPF_EXIT_INSN(), + }, + .errstr = "invalid bpf_context access", + .result = REJECT, + .prog_type = BPF_PROG_TYPE_SK_LOOKUP, + .expected_attach_type = BPF_SK_LOOKUP, +}, From patchwork Mon Jul 13 17:46:53 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jakub Sitnicki X-Patchwork-Id: 1328362 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Original-To: incoming-bpf@patchwork.ozlabs.org Delivered-To: patchwork-incoming-bpf@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org (client-ip=23.128.96.18; helo=vger.kernel.org; envelope-from=bpf-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=reject dis=none) header.from=cloudflare.com Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; secure) header.d=cloudflare.com header.i=@cloudflare.com header.a=rsa-sha256 header.s=google header.b=FxObrd5y; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by ozlabs.org (Postfix) with ESMTP id 4B5B1Y5P2Jz9sR4 for ; Tue, 14 Jul 2020 03:47:29 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730538AbgGMRr2 (ORCPT ); Mon, 13 Jul 2020 13:47:28 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60386 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730528AbgGMRr1 (ORCPT ); Mon, 13 Jul 2020 13:47:27 -0400 Received: from mail-lj1-x243.google.com (mail-lj1-x243.google.com [IPv6:2a00:1450:4864:20::243]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 80E3AC08C5E1 for ; Mon, 13 Jul 2020 10:47:25 -0700 (PDT) Received: by mail-lj1-x243.google.com with SMTP id q7so19004852ljm.1 for ; Mon, 13 Jul 2020 10:47:25 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cloudflare.com; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=3ybsptQ1uDuYH5yr/joHYcohd4KEboQtBfB059oe4Vg=; b=FxObrd5yQWq6ZMDTHzPP1ehrdv0JRCKwevvJG2XyGrLv25bYbEr+6GqSpjuI1zG91D KxRe1uM1CrJSHG7rcUGkJMK9iAu+Lrt2Nzz7ym2JT/fDxcn0kWz7E/UvqHQyvsRpX0PS XkW1OXMuCPdEEiGUCFfnCPzfoGu4qJ9aZa7FA= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=3ybsptQ1uDuYH5yr/joHYcohd4KEboQtBfB059oe4Vg=; b=hUQs6uVQAEoWaz9UC2vOSpKwR+MXkTM80zgoauPA/QgiXiSX6BXyBK7PbHMZ1yy1z3 aqHwRn88NIs5NzNrJijGc5hXgZUvh0bZWsI1rYY3qJPDvva0vjemTuinCjSE0vG1Q5C+ dRU7RBxLJGuaf2MHYSg95Z/d8eG0Oqj68JtDdJLvcZ3KGphVqKL5zC/ACmmMI9IuJiEk Q4cGaRTFaLIc4segJHsjvxiqi23FoY3Z7a6Ws+XSEQtFX4FKW2MeqAzLdnpZIePhvWkj vt6w0jVCSBybAn/7ZpKWNjlgOD39OlLXwl9WmPhYwENumFSfnPxxCCGG8LIFB+arIVsR cbqQ== X-Gm-Message-State: AOAM531cr6980bcI6ABqH3bb5O7DMIbAIiHZfdw4benz79ViUt23fvbC dXP+ylVYFaSwQR8wczHz1N/rtv8fpVfhRg== X-Google-Smtp-Source: ABdhPJw4XghKF8wejD8Baxl8SdCIahGaAt0NbnbabRhRsbM3aemz1CoXDjje9sYW2csvK2oOWdZlOA== X-Received: by 2002:a05:651c:8c:: with SMTP id 12mr354416ljq.420.1594662443573; Mon, 13 Jul 2020 10:47:23 -0700 (PDT) Received: from cloudflare.com ([2a02:a310:c262:aa00:b35e:8938:2c2a:ba8b]) by smtp.gmail.com with ESMTPSA id j25sm4737668lfh.95.2020.07.13.10.47.22 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 13 Jul 2020 10:47:23 -0700 (PDT) From: Jakub Sitnicki To: bpf@vger.kernel.org Cc: netdev@vger.kernel.org, kernel-team@cloudflare.com, Alexei Starovoitov , Daniel Borkmann , "David S. Miller" , Jakub Kicinski Subject: [PATCH bpf-next v4 15/16] selftests/bpf: Rename test_sk_lookup_kern.c to test_ref_track_kern.c Date: Mon, 13 Jul 2020 19:46:53 +0200 Message-Id: <20200713174654.642628-16-jakub@cloudflare.com> X-Mailer: git-send-email 2.25.4 In-Reply-To: <20200713174654.642628-1-jakub@cloudflare.com> References: <20200713174654.642628-1-jakub@cloudflare.com> MIME-Version: 1.0 Sender: bpf-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org Name the BPF C file after the test case that uses it. This frees up "test_sk_lookup" namespace for BPF sk_lookup program tests introduced by the following patch. Signed-off-by: Jakub Sitnicki --- tools/testing/selftests/bpf/prog_tests/reference_tracking.c | 2 +- .../bpf/progs/{test_sk_lookup_kern.c => test_ref_track_kern.c} | 0 2 files changed, 1 insertion(+), 1 deletion(-) rename tools/testing/selftests/bpf/progs/{test_sk_lookup_kern.c => test_ref_track_kern.c} (100%) diff --git a/tools/testing/selftests/bpf/prog_tests/reference_tracking.c b/tools/testing/selftests/bpf/prog_tests/reference_tracking.c index fc0d7f4f02cf..106ca8bb2a8f 100644 --- a/tools/testing/selftests/bpf/prog_tests/reference_tracking.c +++ b/tools/testing/selftests/bpf/prog_tests/reference_tracking.c @@ -3,7 +3,7 @@ void test_reference_tracking(void) { - const char *file = "test_sk_lookup_kern.o"; + const char *file = "test_ref_track_kern.o"; const char *obj_name = "ref_track"; DECLARE_LIBBPF_OPTS(bpf_object_open_opts, open_opts, .object_name = obj_name, diff --git a/tools/testing/selftests/bpf/progs/test_sk_lookup_kern.c b/tools/testing/selftests/bpf/progs/test_ref_track_kern.c similarity index 100% rename from tools/testing/selftests/bpf/progs/test_sk_lookup_kern.c rename to tools/testing/selftests/bpf/progs/test_ref_track_kern.c From patchwork Mon Jul 13 17:46:54 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jakub Sitnicki X-Patchwork-Id: 1328366 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Original-To: incoming-bpf@patchwork.ozlabs.org Delivered-To: patchwork-incoming-bpf@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org (client-ip=23.128.96.18; helo=vger.kernel.org; envelope-from=bpf-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=reject dis=none) header.from=cloudflare.com Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; secure) header.d=cloudflare.com header.i=@cloudflare.com header.a=rsa-sha256 header.s=google header.b=GHL/YsLo; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by ozlabs.org (Postfix) with ESMTP id 4B5B1c1N0kz9sRN for ; Tue, 14 Jul 2020 03:47:32 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730549AbgGMRrb (ORCPT ); Mon, 13 Jul 2020 13:47:31 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60394 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730520AbgGMRr3 (ORCPT ); Mon, 13 Jul 2020 13:47:29 -0400 Received: from mail-lj1-x242.google.com (mail-lj1-x242.google.com [IPv6:2a00:1450:4864:20::242]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BB48EC061794 for ; Mon, 13 Jul 2020 10:47:28 -0700 (PDT) Received: by mail-lj1-x242.google.com with SMTP id d17so18976877ljl.3 for ; Mon, 13 Jul 2020 10:47:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cloudflare.com; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=P0pIonZHPPxYTF/vNNO8A7E98x+1xWSHEhKlRPS9dys=; b=GHL/YsLoRg9DzgAsSWRMunYZ1rHiaKC0gNoYm0bfmjWRVvsnZQeU1PHGcwDEqIOUSQ bJBhIQGcydCDEGfBVQ7GE5qURtPdBiTWDbwhP2o2r7hS97qpszyDXwB34AkDbQY7jEEN RyMCTjgwo4Jb6UDbnb+4744PODIx+//J/ld4Y= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=P0pIonZHPPxYTF/vNNO8A7E98x+1xWSHEhKlRPS9dys=; b=b9EEQmVlyEt+OFi1jkbbxA+jyBNVEp9cuIg8fBn6o0jIBN5XnsZnOrwRzdVlXGgsMQ kLL7xfnmDWVwH/FfNfHRZEy8Hnf4+umFFu8zFz7b45KUVHuQQwBPbmcfs/4dhkyV1vGb +eWtTtseAZ92ChlqbaWDf2jE3uM+gajifxmldR7fv1tR+BICyQcFZ1Jzm8BXs+CiyUJg We1Lain2pZ23Ro17JLtrpMw9rnpQHm+ccoodrFr6MADd35ZuuvDNkvBoj/nb/tjH5IZN uICBBwuIx0EwkO3reSkFDvEAw/74AQgsg7K2aW+V3AyrLbPfZR7jpaR+BbneHovQvxWW K3MA== X-Gm-Message-State: AOAM532sz3c8OryRVXmBQdAfez9RWP2VtCQBdKlcwG6O7TvFmJzopGkk cTYfQ24HWDhMbElULNwJxFSKrzj0u2k2aQ== X-Google-Smtp-Source: ABdhPJxZS+jZ4td9Xk8NMh1YpqrJnXcKFfJ+zxwHklEENjtCDva+Icw8tLKNgMQZhU5wYhFiuapOoA== X-Received: by 2002:a2e:9746:: with SMTP id f6mr436911ljj.68.1594662445812; Mon, 13 Jul 2020 10:47:25 -0700 (PDT) Received: from cloudflare.com ([2a02:a310:c262:aa00:b35e:8938:2c2a:ba8b]) by smtp.gmail.com with ESMTPSA id i8sm4180729ljg.57.2020.07.13.10.47.24 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 13 Jul 2020 10:47:25 -0700 (PDT) From: Jakub Sitnicki To: bpf@vger.kernel.org Cc: netdev@vger.kernel.org, kernel-team@cloudflare.com, Alexei Starovoitov , Daniel Borkmann , "David S. Miller" , Jakub Kicinski Subject: [PATCH bpf-next v4 16/16] selftests/bpf: Tests for BPF_SK_LOOKUP attach point Date: Mon, 13 Jul 2020 19:46:54 +0200 Message-Id: <20200713174654.642628-17-jakub@cloudflare.com> X-Mailer: git-send-email 2.25.4 In-Reply-To: <20200713174654.642628-1-jakub@cloudflare.com> References: <20200713174654.642628-1-jakub@cloudflare.com> MIME-Version: 1.0 Sender: bpf-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org Add tests to test_progs that exercise: - attaching/detaching/querying programs to BPF_SK_LOOKUP hook, - redirecting socket lookup to a socket selected by BPF program, - failing a socket lookup on BPF program's request, - error scenarios for selecting a socket from BPF program, - accessing BPF program context, - attaching and running multiple BPF programs. Run log: # ./test_progs -n 69 #69/1 query lookup prog:OK #69/2 TCP IPv4 redir port:OK #69/3 TCP IPv4 redir addr:OK #69/4 TCP IPv4 redir with reuseport:OK #69/5 TCP IPv4 redir skip reuseport:OK #69/6 TCP IPv6 redir port:OK #69/7 TCP IPv6 redir addr:OK #69/8 TCP IPv4->IPv6 redir port:OK #69/9 TCP IPv6 redir with reuseport:OK #69/10 TCP IPv6 redir skip reuseport:OK #69/11 UDP IPv4 redir port:OK #69/12 UDP IPv4 redir addr:OK #69/13 UDP IPv4 redir with reuseport:OK #69/14 UDP IPv4 redir skip reuseport:OK #69/15 UDP IPv6 redir port:OK #69/16 UDP IPv6 redir addr:OK #69/17 UDP IPv4->IPv6 redir port:OK #69/18 UDP IPv6 redir and reuseport:OK #69/19 UDP IPv6 redir skip reuseport:OK #69/20 TCP IPv4 drop on lookup:OK #69/21 TCP IPv6 drop on lookup:OK #69/22 UDP IPv4 drop on lookup:OK #69/23 UDP IPv6 drop on lookup:OK #69/24 TCP IPv4 drop on reuseport:OK #69/25 TCP IPv6 drop on reuseport:OK #69/26 UDP IPv4 drop on reuseport:OK #69/27 TCP IPv6 drop on reuseport:OK #69/28 sk_assign returns EEXIST:OK #69/29 sk_assign honors F_REPLACE:OK #69/30 sk_assign accepts NULL socket:OK #69/31 access ctx->sk:OK #69/32 narrow access to ctx v4:OK #69/33 narrow access to ctx v6:OK #69/34 sk_assign rejects TCP established:OK #69/35 sk_assign rejects UDP connected:OK #69/36 multi prog - pass, pass:OK #69/37 multi prog - drop, drop:OK #69/38 multi prog - pass, drop:OK #69/39 multi prog - drop, pass:OK #69/40 multi prog - pass, redir:OK #69/41 multi prog - redir, pass:OK #69/42 multi prog - drop, redir:OK #69/43 multi prog - redir, drop:OK #69/44 multi prog - redir, redir:OK #69 sk_lookup:OK Summary: 1/44 PASSED, 0 SKIPPED, 0 FAILED Signed-off-by: Jakub Sitnicki Acked-by: Andrii Nakryiko --- Notes: v4: - Remove system("bpftool ...") call left over from debugging. (Lorenz) - Dedup BPF code that selects a socket. (Lorenz) - Switch from CHECK_FAIL to CHECK macro. (Andrii) - Extract a network_helper that wraps inet_pton. - Don't restore netns now that test_progs does it. - Cover bpf_sk_assign(ctx, NULL) in tests. - Cover narrow loads in tests. - Cover NULL ctx->sk access attempts in tests. - Cover accessing IPv6 ctx fields on IPv4 lookup. v3: - Extend tests to cover new functionality in v3: - multi-prog attachments (query, running, verdict precedence) - socket selecting for the second time with bpf_sk_assign - skipping over reuseport load-balancing v2: - Adjust for fields renames in struct bpf_sk_lookup. tools/testing/selftests/bpf/network_helpers.c | 58 +- tools/testing/selftests/bpf/network_helpers.h | 2 + .../selftests/bpf/prog_tests/sk_lookup.c | 1282 +++++++++++++++++ .../selftests/bpf/progs/test_sk_lookup_kern.c | 639 ++++++++ 4 files changed, 1958 insertions(+), 23 deletions(-) create mode 100644 tools/testing/selftests/bpf/prog_tests/sk_lookup.c create mode 100644 tools/testing/selftests/bpf/progs/test_sk_lookup_kern.c diff --git a/tools/testing/selftests/bpf/network_helpers.c b/tools/testing/selftests/bpf/network_helpers.c index acd08715be2e..f56655690f9b 100644 --- a/tools/testing/selftests/bpf/network_helpers.c +++ b/tools/testing/selftests/bpf/network_helpers.c @@ -73,29 +73,8 @@ int start_server(int family, int type, const char *addr_str, __u16 port, socklen_t len; int fd; - if (family == AF_INET) { - struct sockaddr_in *sin = (void *)&addr; - - sin->sin_family = AF_INET; - sin->sin_port = htons(port); - if (addr_str && - inet_pton(AF_INET, addr_str, &sin->sin_addr) != 1) { - log_err("inet_pton(AF_INET, %s)", addr_str); - return -1; - } - len = sizeof(*sin); - } else { - struct sockaddr_in6 *sin6 = (void *)&addr; - - sin6->sin6_family = AF_INET6; - sin6->sin6_port = htons(port); - if (addr_str && - inet_pton(AF_INET6, addr_str, &sin6->sin6_addr) != 1) { - log_err("inet_pton(AF_INET6, %s)", addr_str); - return -1; - } - len = sizeof(*sin6); - } + if (make_sockaddr(family, addr_str, port, &addr, &len)) + return -1; fd = socket(family, type, 0); if (fd < 0) { @@ -194,3 +173,36 @@ int connect_fd_to_fd(int client_fd, int server_fd, int timeout_ms) return 0; } + +int make_sockaddr(int family, const char *addr_str, __u16 port, + struct sockaddr_storage *addr, socklen_t *len) +{ + if (family == AF_INET) { + struct sockaddr_in *sin = (void *)addr; + + sin->sin_family = AF_INET; + sin->sin_port = htons(port); + if (addr_str && + inet_pton(AF_INET, addr_str, &sin->sin_addr) != 1) { + log_err("inet_pton(AF_INET, %s)", addr_str); + return -1; + } + if (len) + *len = sizeof(*sin); + return 0; + } else if (family == AF_INET6) { + struct sockaddr_in6 *sin6 = (void *)addr; + + sin6->sin6_family = AF_INET6; + sin6->sin6_port = htons(port); + if (addr_str && + inet_pton(AF_INET6, addr_str, &sin6->sin6_addr) != 1) { + log_err("inet_pton(AF_INET6, %s)", addr_str); + return -1; + } + if (len) + *len = sizeof(*sin6); + return 0; + } + return -1; +} diff --git a/tools/testing/selftests/bpf/network_helpers.h b/tools/testing/selftests/bpf/network_helpers.h index f580e82fda58..c3728f6667e4 100644 --- a/tools/testing/selftests/bpf/network_helpers.h +++ b/tools/testing/selftests/bpf/network_helpers.h @@ -37,5 +37,7 @@ int start_server(int family, int type, const char *addr, __u16 port, int timeout_ms); int connect_to_fd(int server_fd, int timeout_ms); int connect_fd_to_fd(int client_fd, int server_fd, int timeout_ms); +int make_sockaddr(int family, const char *addr_str, __u16 port, + struct sockaddr_storage *addr, socklen_t *len); #endif diff --git a/tools/testing/selftests/bpf/prog_tests/sk_lookup.c b/tools/testing/selftests/bpf/prog_tests/sk_lookup.c new file mode 100644 index 000000000000..992e35ee2bc8 --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/sk_lookup.c @@ -0,0 +1,1282 @@ +// SPDX-License-Identifier: GPL-2.0 OR BSD-3-Clause +// Copyright (c) 2020 Cloudflare +/* + * Test BPF attach point for INET socket lookup (BPF_SK_LOOKUP). + * + * Tests exercise: + * - attaching/detaching/querying programs to BPF_SK_LOOKUP hook, + * - redirecting socket lookup to a socket selected by BPF program, + * - failing a socket lookup on BPF program's request, + * - error scenarios for selecting a socket from BPF program, + * - accessing BPF program context, + * - attaching and running multiple BPF programs. + * + * Tests run in a dedicated network namespace. + */ + +#define _GNU_SOURCE +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include + +#include "test_progs.h" +#include "bpf_rlimit.h" +#include "bpf_util.h" +#include "cgroup_helpers.h" +#include "network_helpers.h" +#include "test_sk_lookup_kern.skel.h" + +/* External (address, port) pairs the client sends packets to. */ +#define EXT_IP4 "127.0.0.1" +#define EXT_IP6 "fd00::1" +#define EXT_PORT 7007 + +/* Internal (address, port) pairs the server listens/receives at. */ +#define INT_IP4 "127.0.0.2" +#define INT_IP4_V6 "::ffff:127.0.0.2" +#define INT_IP6 "fd00::2" +#define INT_PORT 8008 + +#define IO_TIMEOUT_SEC 3 + +enum server { + SERVER_A = 0, + SERVER_B = 1, + MAX_SERVERS, +}; + +enum { + PROG1 = 0, + PROG2, +}; + +struct inet_addr { + const char *ip; + unsigned short port; +}; + +struct test { + const char *desc; + struct bpf_program *lookup_prog; + struct bpf_program *reuseport_prog; + struct bpf_map *sock_map; + int sotype; + struct inet_addr connect_to; + struct inet_addr listen_at; + enum server accept_on; +}; + +static __u32 duration; /* for CHECK macro */ + +static bool is_ipv6(const char *ip) +{ + return !!strchr(ip, ':'); +} + +static int attach_reuseport(int sock_fd, struct bpf_program *reuseport_prog) +{ + int err, prog_fd; + + prog_fd = bpf_program__fd(reuseport_prog); + if (prog_fd < 0) { + errno = -prog_fd; + return -1; + } + + err = setsockopt(sock_fd, SOL_SOCKET, SO_ATTACH_REUSEPORT_EBPF, + &prog_fd, sizeof(prog_fd)); + if (err) + return -1; + + return 0; +} + +static socklen_t inetaddr_len(const struct sockaddr_storage *addr) +{ + return (addr->ss_family == AF_INET ? sizeof(struct sockaddr_in) : + addr->ss_family == AF_INET6 ? sizeof(struct sockaddr_in6) : 0); +} + +static int make_socket(int sotype, const char *ip, int port, + struct sockaddr_storage *addr) +{ + struct timeval timeo = { .tv_sec = IO_TIMEOUT_SEC }; + int err, family, fd; + + family = is_ipv6(ip) ? AF_INET6 : AF_INET; + err = make_sockaddr(family, ip, port, addr, NULL); + if (CHECK(err, "make_address", "failed\n")) + return -1; + + fd = socket(addr->ss_family, sotype, 0); + if (CHECK(fd < 0, "socket", "failed\n")) { + log_err("failed to make socket"); + return -1; + } + + err = setsockopt(fd, SOL_SOCKET, SO_SNDTIMEO, &timeo, sizeof(timeo)); + if (CHECK(err, "setsockopt(SO_SNDTIMEO)", "failed\n")) { + log_err("failed to set SNDTIMEO"); + close(fd); + return -1; + } + + err = setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &timeo, sizeof(timeo)); + if (CHECK(err, "setsockopt(SO_RCVTIMEO)", "failed\n")) { + log_err("failed to set RCVTIMEO"); + close(fd); + return -1; + } + + return fd; +} + +static int make_server(int sotype, const char *ip, int port, + struct bpf_program *reuseport_prog) +{ + struct sockaddr_storage addr = {0}; + const int one = 1; + int err, fd = -1; + + fd = make_socket(sotype, ip, port, &addr); + if (fd < 0) + return -1; + + /* Enabled for UDPv6 sockets for IPv4-mapped IPv6 to work. */ + if (sotype == SOCK_DGRAM) { + err = setsockopt(fd, SOL_IP, IP_RECVORIGDSTADDR, &one, + sizeof(one)); + if (CHECK(err, "setsockopt(IP_RECVORIGDSTADDR)", "failed\n")) { + log_err("failed to enable IP_RECVORIGDSTADDR"); + goto fail; + } + } + + if (sotype == SOCK_DGRAM && addr.ss_family == AF_INET6) { + err = setsockopt(fd, SOL_IPV6, IPV6_RECVORIGDSTADDR, &one, + sizeof(one)); + if (CHECK(err, "setsockopt(IPV6_RECVORIGDSTADDR)", "failed\n")) { + log_err("failed to enable IPV6_RECVORIGDSTADDR"); + goto fail; + } + } + + if (sotype == SOCK_STREAM) { + err = setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &one, + sizeof(one)); + if (CHECK(err, "setsockopt(SO_REUSEADDR)", "failed\n")) { + log_err("failed to enable SO_REUSEADDR"); + goto fail; + } + } + + if (reuseport_prog) { + err = setsockopt(fd, SOL_SOCKET, SO_REUSEPORT, &one, + sizeof(one)); + if (CHECK(err, "setsockopt(SO_REUSEPORT)", "failed\n")) { + log_err("failed to enable SO_REUSEPORT"); + goto fail; + } + } + + err = bind(fd, (void *)&addr, inetaddr_len(&addr)); + if (CHECK(err, "bind", "failed\n")) { + log_err("failed to bind listen socket"); + goto fail; + } + + if (sotype == SOCK_STREAM) { + err = listen(fd, SOMAXCONN); + if (CHECK(err, "make_server", "listen")) { + log_err("failed to listen on port %d", port); + goto fail; + } + } + + /* Late attach reuseport prog so we can have one init path */ + if (reuseport_prog) { + err = attach_reuseport(fd, reuseport_prog); + if (CHECK(err, "attach_reuseport", "failed\n")) { + log_err("failed to attach reuseport prog"); + goto fail; + } + } + + return fd; +fail: + close(fd); + return -1; +} + +static int make_client(int sotype, const char *ip, int port) +{ + struct sockaddr_storage addr = {0}; + int err, fd; + + fd = make_socket(sotype, ip, port, &addr); + if (fd < 0) + return -1; + + err = connect(fd, (void *)&addr, inetaddr_len(&addr)); + if (CHECK(err, "make_client", "connect")) { + log_err("failed to connect client socket"); + goto fail; + } + + return fd; +fail: + close(fd); + return -1; +} + +static int send_byte(int fd) +{ + ssize_t n; + + errno = 0; + n = send(fd, "a", 1, 0); + if (CHECK(n <= 0, "send_byte", "send")) { + log_err("failed/partial send"); + return -1; + } + return 0; +} + +static int recv_byte(int fd) +{ + char buf[1]; + ssize_t n; + + n = recv(fd, buf, sizeof(buf), 0); + if (CHECK(n <= 0, "recv_byte", "recv")) { + log_err("failed/partial recv"); + return -1; + } + return 0; +} + +static int tcp_recv_send(int server_fd) +{ + char buf[1]; + int ret, fd; + ssize_t n; + + fd = accept(server_fd, NULL, NULL); + if (CHECK(fd < 0, "accept", "failed\n")) { + log_err("failed to accept"); + return -1; + } + + n = recv(fd, buf, sizeof(buf), 0); + if (CHECK(n <= 0, "recv", "failed\n")) { + log_err("failed/partial recv"); + ret = -1; + goto close; + } + + n = send(fd, buf, n, 0); + if (CHECK(n <= 0, "send", "failed\n")) { + log_err("failed/partial send"); + ret = -1; + goto close; + } + + ret = 0; +close: + close(fd); + return ret; +} + +static void v4_to_v6(struct sockaddr_storage *ss) +{ + struct sockaddr_in6 *v6 = (struct sockaddr_in6 *)ss; + struct sockaddr_in v4 = *(struct sockaddr_in *)ss; + + v6->sin6_family = AF_INET6; + v6->sin6_port = v4.sin_port; + v6->sin6_addr.s6_addr[10] = 0xff; + v6->sin6_addr.s6_addr[11] = 0xff; + memcpy(&v6->sin6_addr.s6_addr[12], &v4.sin_addr.s_addr, 4); +} + +static int udp_recv_send(int server_fd) +{ + char cmsg_buf[CMSG_SPACE(sizeof(struct sockaddr_storage))]; + struct sockaddr_storage _src_addr = { 0 }; + struct sockaddr_storage *src_addr = &_src_addr; + struct sockaddr_storage *dst_addr = NULL; + struct msghdr msg = { 0 }; + struct iovec iov = { 0 }; + struct cmsghdr *cm; + char buf[1]; + int ret, fd; + ssize_t n; + + iov.iov_base = buf; + iov.iov_len = sizeof(buf); + + msg.msg_name = src_addr; + msg.msg_namelen = sizeof(*src_addr); + msg.msg_iov = &iov; + msg.msg_iovlen = 1; + msg.msg_control = cmsg_buf; + msg.msg_controllen = sizeof(cmsg_buf); + + errno = 0; + n = recvmsg(server_fd, &msg, 0); + if (CHECK(n <= 0, "recvmsg", "failed\n")) { + log_err("failed to receive"); + return -1; + } + if (CHECK(msg.msg_flags & MSG_CTRUNC, "recvmsg", "truncated cmsg\n")) + return -1; + + for (cm = CMSG_FIRSTHDR(&msg); cm; cm = CMSG_NXTHDR(&msg, cm)) { + if ((cm->cmsg_level == SOL_IP && + cm->cmsg_type == IP_ORIGDSTADDR) || + (cm->cmsg_level == SOL_IPV6 && + cm->cmsg_type == IPV6_ORIGDSTADDR)) { + dst_addr = (struct sockaddr_storage *)CMSG_DATA(cm); + break; + } + log_err("warning: ignored cmsg at level %d type %d", + cm->cmsg_level, cm->cmsg_type); + } + if (CHECK(!dst_addr, "recvmsg", "missing ORIGDSTADDR\n")) + return -1; + + /* Server socket bound to IPv4-mapped IPv6 address */ + if (src_addr->ss_family == AF_INET6 && + dst_addr->ss_family == AF_INET) { + v4_to_v6(dst_addr); + } + + /* Reply from original destination address. */ + fd = socket(dst_addr->ss_family, SOCK_DGRAM, 0); + if (CHECK(fd < 0, "socket", "failed\n")) { + log_err("failed to create tx socket"); + return -1; + } + + ret = bind(fd, (struct sockaddr *)dst_addr, sizeof(*dst_addr)); + if (CHECK(ret, "bind", "failed\n")) { + log_err("failed to bind tx socket"); + goto out; + } + + msg.msg_control = NULL; + msg.msg_controllen = 0; + n = sendmsg(fd, &msg, 0); + if (CHECK(n <= 0, "sendmsg", "failed\n")) { + log_err("failed to send echo reply"); + ret = -1; + goto out; + } + + ret = 0; +out: + close(fd); + return ret; +} + +static int tcp_echo_test(int client_fd, int server_fd) +{ + int err; + + err = send_byte(client_fd); + if (err) + return -1; + err = tcp_recv_send(server_fd); + if (err) + return -1; + err = recv_byte(client_fd); + if (err) + return -1; + + return 0; +} + +static int udp_echo_test(int client_fd, int server_fd) +{ + int err; + + err = send_byte(client_fd); + if (err) + return -1; + err = udp_recv_send(server_fd); + if (err) + return -1; + err = recv_byte(client_fd); + if (err) + return -1; + + return 0; +} + +static struct bpf_link *attach_lookup_prog(struct bpf_program *prog) +{ + struct bpf_link *link; + int net_fd; + + net_fd = open("/proc/self/ns/net", O_RDONLY); + if (CHECK(net_fd < 0, "open", "failed\n")) { + log_err("failed to open /proc/self/ns/net"); + return NULL; + } + + link = bpf_program__attach_netns(prog, net_fd); + if (CHECK(IS_ERR(link), "bpf_program__attach_netns", "failed\n")) { + errno = -PTR_ERR(link); + log_err("failed to attach program '%s' to netns", + bpf_program__name(prog)); + link = NULL; + } + + close(net_fd); + return link; +} + +static int update_lookup_map(struct bpf_map *map, int index, int sock_fd) +{ + int err, map_fd; + uint64_t value; + + map_fd = bpf_map__fd(map); + if (CHECK(map_fd < 0, "bpf_map__fd", "failed\n")) { + errno = -map_fd; + log_err("failed to get map FD"); + return -1; + } + + value = (uint64_t)sock_fd; + err = bpf_map_update_elem(map_fd, &index, &value, BPF_NOEXIST); + if (CHECK(err, "bpf_map_update_elem", "failed\n")) { + log_err("failed to update redir_map @ %d", index); + return -1; + } + + return 0; +} + +static __u32 link_info_prog_id(struct bpf_link *link) +{ + struct bpf_link_info info = {}; + __u32 info_len = sizeof(info); + int link_fd, err; + + link_fd = bpf_link__fd(link); + if (CHECK(link_fd < 0, "bpf_link__fd", "failed\n")) { + errno = -link_fd; + log_err("bpf_link__fd failed"); + return 0; + } + + err = bpf_obj_get_info_by_fd(link_fd, &info, &info_len); + if (CHECK(err, "bpf_obj_get_info_by_fd", "failed\n")) { + log_err("bpf_obj_get_info_by_fd"); + return 0; + } + if (CHECK(info_len != sizeof(info), "bpf_obj_get_info_by_fd", + "unexpected info len %u\n", info_len)) + return 0; + + return info.prog_id; +} + +static void query_lookup_prog(struct test_sk_lookup_kern *skel) +{ + struct bpf_link *link[3] = {}; + __u32 attach_flags = 0; + __u32 prog_ids[3] = {}; + __u32 prog_cnt = 3; + __u32 prog_id; + int net_fd; + int err; + + net_fd = open("/proc/self/ns/net", O_RDONLY); + if (CHECK(net_fd < 0, "open", "failed\n")) { + log_err("failed to open /proc/self/ns/net"); + return; + } + + link[0] = attach_lookup_prog(skel->progs.lookup_pass); + if (!link[0]) + goto close; + link[1] = attach_lookup_prog(skel->progs.lookup_pass); + if (!link[1]) + goto detach; + link[2] = attach_lookup_prog(skel->progs.lookup_drop); + if (!link[2]) + goto detach; + + err = bpf_prog_query(net_fd, BPF_SK_LOOKUP, 0 /* query flags */, + &attach_flags, prog_ids, &prog_cnt); + if (CHECK(err, "bpf_prog_query", "failed\n")) { + log_err("failed to query lookup prog"); + goto detach; + } + + errno = 0; + if (CHECK(attach_flags != 0, "bpf_prog_query", + "wrong attach_flags on query: %u", attach_flags)) + goto detach; + if (CHECK(prog_cnt != 3, "bpf_prog_query", + "wrong program count on query: %u", prog_cnt)) + goto detach; + prog_id = link_info_prog_id(link[0]); + CHECK(prog_ids[0] != prog_id, "bpf_prog_query", + "invalid program #0 id on query: %u != %u\n", + prog_ids[0], prog_id); + prog_id = link_info_prog_id(link[1]); + CHECK(prog_ids[1] != prog_id, "bpf_prog_query", + "invalid program #1 id on query: %u != %u\n", + prog_ids[1], prog_id); + prog_id = link_info_prog_id(link[2]); + CHECK(prog_ids[2] != prog_id, "bpf_prog_query", + "invalid program #2 id on query: %u != %u\n", + prog_ids[2], prog_id); + +detach: + if (link[2]) + bpf_link__destroy(link[2]); + if (link[1]) + bpf_link__destroy(link[1]); + if (link[0]) + bpf_link__destroy(link[0]); +close: + close(net_fd); +} + +static void run_lookup_prog(const struct test *t) +{ + int client_fd, server_fds[MAX_SERVERS] = { -1 }; + struct bpf_link *lookup_link; + int i, err; + + lookup_link = attach_lookup_prog(t->lookup_prog); + if (!lookup_link) + return; + + for (i = 0; i < ARRAY_SIZE(server_fds); i++) { + server_fds[i] = make_server(t->sotype, t->listen_at.ip, + t->listen_at.port, + t->reuseport_prog); + if (server_fds[i] < 0) + goto close; + + err = update_lookup_map(t->sock_map, i, server_fds[i]); + if (err) + goto close; + + /* want just one server for non-reuseport test */ + if (!t->reuseport_prog) + break; + } + + client_fd = make_client(t->sotype, t->connect_to.ip, t->connect_to.port); + if (client_fd < 0) + goto close; + + if (t->sotype == SOCK_STREAM) + tcp_echo_test(client_fd, server_fds[t->accept_on]); + else + udp_echo_test(client_fd, server_fds[t->accept_on]); + + close(client_fd); +close: + for (i = 0; i < ARRAY_SIZE(server_fds); i++) { + if (server_fds[i] != -1) + close(server_fds[i]); + } + bpf_link__destroy(lookup_link); +} + +static void test_redirect_lookup(struct test_sk_lookup_kern *skel) +{ + const struct test tests[] = { + { + .desc = "TCP IPv4 redir port", + .lookup_prog = skel->progs.redir_port, + .sock_map = skel->maps.redir_map, + .sotype = SOCK_STREAM, + .connect_to = { EXT_IP4, EXT_PORT }, + .listen_at = { EXT_IP4, INT_PORT }, + }, + { + .desc = "TCP IPv4 redir addr", + .lookup_prog = skel->progs.redir_ip4, + .sock_map = skel->maps.redir_map, + .sotype = SOCK_STREAM, + .connect_to = { EXT_IP4, EXT_PORT }, + .listen_at = { INT_IP4, EXT_PORT }, + }, + { + .desc = "TCP IPv4 redir with reuseport", + .lookup_prog = skel->progs.select_sock_a, + .reuseport_prog = skel->progs.select_sock_b, + .sock_map = skel->maps.redir_map, + .sotype = SOCK_STREAM, + .connect_to = { EXT_IP4, EXT_PORT }, + .listen_at = { INT_IP4, INT_PORT }, + .accept_on = SERVER_B, + }, + { + .desc = "TCP IPv4 redir skip reuseport", + .lookup_prog = skel->progs.select_sock_a_no_reuseport, + .reuseport_prog = skel->progs.select_sock_b, + .sock_map = skel->maps.redir_map, + .sotype = SOCK_STREAM, + .connect_to = { EXT_IP4, EXT_PORT }, + .listen_at = { INT_IP4, INT_PORT }, + .accept_on = SERVER_A, + }, + { + .desc = "TCP IPv6 redir port", + .lookup_prog = skel->progs.redir_port, + .sock_map = skel->maps.redir_map, + .sotype = SOCK_STREAM, + .connect_to = { EXT_IP6, EXT_PORT }, + .listen_at = { EXT_IP6, INT_PORT }, + }, + { + .desc = "TCP IPv6 redir addr", + .lookup_prog = skel->progs.redir_ip6, + .sock_map = skel->maps.redir_map, + .sotype = SOCK_STREAM, + .connect_to = { EXT_IP6, EXT_PORT }, + .listen_at = { INT_IP6, EXT_PORT }, + }, + { + .desc = "TCP IPv4->IPv6 redir port", + .lookup_prog = skel->progs.redir_port, + .sock_map = skel->maps.redir_map, + .sotype = SOCK_STREAM, + .connect_to = { EXT_IP4, EXT_PORT }, + .listen_at = { INT_IP4_V6, INT_PORT }, + }, + { + .desc = "TCP IPv6 redir with reuseport", + .lookup_prog = skel->progs.select_sock_a, + .reuseport_prog = skel->progs.select_sock_b, + .sock_map = skel->maps.redir_map, + .sotype = SOCK_STREAM, + .connect_to = { EXT_IP6, EXT_PORT }, + .listen_at = { INT_IP6, INT_PORT }, + .accept_on = SERVER_B, + }, + { + .desc = "TCP IPv6 redir skip reuseport", + .lookup_prog = skel->progs.select_sock_a_no_reuseport, + .reuseport_prog = skel->progs.select_sock_b, + .sock_map = skel->maps.redir_map, + .sotype = SOCK_STREAM, + .connect_to = { EXT_IP6, EXT_PORT }, + .listen_at = { INT_IP6, INT_PORT }, + .accept_on = SERVER_A, + }, + { + .desc = "UDP IPv4 redir port", + .lookup_prog = skel->progs.redir_port, + .sock_map = skel->maps.redir_map, + .sotype = SOCK_DGRAM, + .connect_to = { EXT_IP4, EXT_PORT }, + .listen_at = { EXT_IP4, INT_PORT }, + }, + { + .desc = "UDP IPv4 redir addr", + .lookup_prog = skel->progs.redir_ip4, + .sock_map = skel->maps.redir_map, + .sotype = SOCK_DGRAM, + .connect_to = { EXT_IP4, EXT_PORT }, + .listen_at = { INT_IP4, EXT_PORT }, + }, + { + .desc = "UDP IPv4 redir with reuseport", + .lookup_prog = skel->progs.select_sock_a, + .reuseport_prog = skel->progs.select_sock_b, + .sock_map = skel->maps.redir_map, + .sotype = SOCK_DGRAM, + .connect_to = { EXT_IP4, EXT_PORT }, + .listen_at = { INT_IP4, INT_PORT }, + .accept_on = SERVER_B, + }, + { + .desc = "UDP IPv4 redir skip reuseport", + .lookup_prog = skel->progs.select_sock_a_no_reuseport, + .reuseport_prog = skel->progs.select_sock_b, + .sock_map = skel->maps.redir_map, + .sotype = SOCK_DGRAM, + .connect_to = { EXT_IP4, EXT_PORT }, + .listen_at = { INT_IP4, INT_PORT }, + .accept_on = SERVER_A, + }, + { + .desc = "UDP IPv6 redir port", + .lookup_prog = skel->progs.redir_port, + .sock_map = skel->maps.redir_map, + .sotype = SOCK_DGRAM, + .connect_to = { EXT_IP6, EXT_PORT }, + .listen_at = { EXT_IP6, INT_PORT }, + }, + { + .desc = "UDP IPv6 redir addr", + .lookup_prog = skel->progs.redir_ip6, + .sock_map = skel->maps.redir_map, + .sotype = SOCK_DGRAM, + .connect_to = { EXT_IP6, EXT_PORT }, + .listen_at = { INT_IP6, EXT_PORT }, + }, + { + .desc = "UDP IPv4->IPv6 redir port", + .lookup_prog = skel->progs.redir_port, + .sock_map = skel->maps.redir_map, + .sotype = SOCK_DGRAM, + .listen_at = { INT_IP4_V6, INT_PORT }, + .connect_to = { EXT_IP4, EXT_PORT }, + }, + { + .desc = "UDP IPv6 redir and reuseport", + .lookup_prog = skel->progs.select_sock_a, + .reuseport_prog = skel->progs.select_sock_b, + .sock_map = skel->maps.redir_map, + .sotype = SOCK_DGRAM, + .connect_to = { EXT_IP6, EXT_PORT }, + .listen_at = { INT_IP6, INT_PORT }, + .accept_on = SERVER_B, + }, + { + .desc = "UDP IPv6 redir skip reuseport", + .lookup_prog = skel->progs.select_sock_a_no_reuseport, + .reuseport_prog = skel->progs.select_sock_b, + .sock_map = skel->maps.redir_map, + .sotype = SOCK_DGRAM, + .connect_to = { EXT_IP6, EXT_PORT }, + .listen_at = { INT_IP6, INT_PORT }, + .accept_on = SERVER_A, + }, + }; + const struct test *t; + + for (t = tests; t < tests + ARRAY_SIZE(tests); t++) { + if (test__start_subtest(t->desc)) + run_lookup_prog(t); + } +} + +static void drop_on_lookup(const struct test *t) +{ + struct sockaddr_storage dst = {}; + int client_fd, server_fd, err; + struct bpf_link *lookup_link; + ssize_t n; + + lookup_link = attach_lookup_prog(t->lookup_prog); + if (!lookup_link) + return; + + server_fd = make_server(t->sotype, t->listen_at.ip, t->listen_at.port, + t->reuseport_prog); + if (server_fd < 0) + goto detach; + + client_fd = make_socket(t->sotype, t->connect_to.ip, + t->connect_to.port, &dst); + if (client_fd < 0) + goto close_srv; + + err = connect(client_fd, (void *)&dst, inetaddr_len(&dst)); + if (t->sotype == SOCK_DGRAM) { + err = send_byte(client_fd); + if (err) + goto close_all; + + /* Read out asynchronous error */ + n = recv(client_fd, NULL, 0, 0); + err = n == -1; + } + if (CHECK(!err || errno != ECONNREFUSED, "connect", + "unexpected success or error\n")) + log_err("expected ECONNREFUSED on connect"); + +close_all: + close(client_fd); +close_srv: + close(server_fd); +detach: + bpf_link__destroy(lookup_link); +} + +static void test_drop_on_lookup(struct test_sk_lookup_kern *skel) +{ + const struct test tests[] = { + { + .desc = "TCP IPv4 drop on lookup", + .lookup_prog = skel->progs.lookup_drop, + .sotype = SOCK_STREAM, + .connect_to = { EXT_IP4, EXT_PORT }, + .listen_at = { EXT_IP4, EXT_PORT }, + }, + { + .desc = "TCP IPv6 drop on lookup", + .lookup_prog = skel->progs.lookup_drop, + .sotype = SOCK_STREAM, + .connect_to = { EXT_IP6, EXT_PORT }, + .listen_at = { EXT_IP6, EXT_PORT }, + }, + { + .desc = "UDP IPv4 drop on lookup", + .lookup_prog = skel->progs.lookup_drop, + .sotype = SOCK_DGRAM, + .connect_to = { EXT_IP4, EXT_PORT }, + .listen_at = { EXT_IP4, EXT_PORT }, + }, + { + .desc = "UDP IPv6 drop on lookup", + .lookup_prog = skel->progs.lookup_drop, + .sotype = SOCK_DGRAM, + .connect_to = { EXT_IP6, EXT_PORT }, + .listen_at = { EXT_IP6, INT_PORT }, + }, + }; + const struct test *t; + + for (t = tests; t < tests + ARRAY_SIZE(tests); t++) { + if (test__start_subtest(t->desc)) + drop_on_lookup(t); + } +} + +static void drop_on_reuseport(const struct test *t) +{ + struct sockaddr_storage dst = { 0 }; + int client, server1, server2, err; + struct bpf_link *lookup_link; + ssize_t n; + + lookup_link = attach_lookup_prog(t->lookup_prog); + if (!lookup_link) + return; + + server1 = make_server(t->sotype, t->listen_at.ip, t->listen_at.port, + t->reuseport_prog); + if (server1 < 0) + goto detach; + + err = update_lookup_map(t->sock_map, SERVER_A, server1); + if (err) + goto detach; + + /* second server on destination address we should never reach */ + server2 = make_server(t->sotype, t->connect_to.ip, t->connect_to.port, + NULL /* reuseport prog */); + if (server2 < 0) + goto close_srv1; + + client = make_socket(t->sotype, t->connect_to.ip, + t->connect_to.port, &dst); + if (client < 0) + goto close_srv2; + + err = connect(client, (void *)&dst, inetaddr_len(&dst)); + if (t->sotype == SOCK_DGRAM) { + err = send_byte(client); + if (err) + goto close_all; + + /* Read out asynchronous error */ + n = recv(client, NULL, 0, 0); + err = n == -1; + } + if (CHECK(!err || errno != ECONNREFUSED, "connect", + "unexpected success or error\n")) + log_err("expected ECONNREFUSED on connect"); + +close_all: + close(client); +close_srv2: + close(server2); +close_srv1: + close(server1); +detach: + bpf_link__destroy(lookup_link); +} + +static void test_drop_on_reuseport(struct test_sk_lookup_kern *skel) +{ + const struct test tests[] = { + { + .desc = "TCP IPv4 drop on reuseport", + .lookup_prog = skel->progs.select_sock_a, + .reuseport_prog = skel->progs.reuseport_drop, + .sock_map = skel->maps.redir_map, + .sotype = SOCK_STREAM, + .connect_to = { EXT_IP4, EXT_PORT }, + .listen_at = { INT_IP4, INT_PORT }, + }, + { + .desc = "TCP IPv6 drop on reuseport", + .lookup_prog = skel->progs.select_sock_a, + .reuseport_prog = skel->progs.reuseport_drop, + .sock_map = skel->maps.redir_map, + .sotype = SOCK_STREAM, + .connect_to = { EXT_IP6, EXT_PORT }, + .listen_at = { INT_IP6, INT_PORT }, + }, + { + .desc = "UDP IPv4 drop on reuseport", + .lookup_prog = skel->progs.select_sock_a, + .reuseport_prog = skel->progs.reuseport_drop, + .sock_map = skel->maps.redir_map, + .sotype = SOCK_DGRAM, + .connect_to = { EXT_IP4, EXT_PORT }, + .listen_at = { INT_IP4, INT_PORT }, + }, + { + .desc = "TCP IPv6 drop on reuseport", + .lookup_prog = skel->progs.select_sock_a, + .reuseport_prog = skel->progs.reuseport_drop, + .sock_map = skel->maps.redir_map, + .sotype = SOCK_STREAM, + .connect_to = { EXT_IP6, EXT_PORT }, + .listen_at = { INT_IP6, INT_PORT }, + }, + }; + const struct test *t; + + for (t = tests; t < tests + ARRAY_SIZE(tests); t++) { + if (test__start_subtest(t->desc)) + drop_on_reuseport(t); + } +} + +static void run_sk_assign(struct test_sk_lookup_kern *skel, + struct bpf_program *lookup_prog, + const char *listen_ip, const char *connect_ip) +{ + int client_fd, peer_fd, server_fds[MAX_SERVERS] = { -1 }; + struct bpf_link *lookup_link; + int i, err; + + lookup_link = attach_lookup_prog(lookup_prog); + if (!lookup_link) + return; + + for (i = 0; i < ARRAY_SIZE(server_fds); i++) { + server_fds[i] = make_server(SOCK_STREAM, listen_ip, 0, NULL); + if (server_fds[i] < 0) + goto close_servers; + + err = update_lookup_map(skel->maps.redir_map, i, + server_fds[i]); + if (err) + goto close_servers; + } + + client_fd = make_client(SOCK_STREAM, connect_ip, EXT_PORT); + if (client_fd < 0) + goto close_servers; + + peer_fd = accept(server_fds[SERVER_B], NULL, NULL); + if (CHECK(peer_fd < 0, "accept", "failed\n")) + goto close_client; + + close(peer_fd); +close_client: + close(client_fd); +close_servers: + for (i = 0; i < ARRAY_SIZE(server_fds); i++) { + if (server_fds[i] != -1) + close(server_fds[i]); + } + bpf_link__destroy(lookup_link); +} + +static void run_sk_assign_v4(struct test_sk_lookup_kern *skel, + struct bpf_program *lookup_prog) +{ + run_sk_assign(skel, lookup_prog, INT_IP4, EXT_IP4); +} + +static void run_sk_assign_v6(struct test_sk_lookup_kern *skel, + struct bpf_program *lookup_prog) +{ + run_sk_assign(skel, lookup_prog, INT_IP6, EXT_IP6); +} + +static void run_sk_assign_connected(struct test_sk_lookup_kern *skel, + int sotype) +{ + int err, client_fd, connected_fd, server_fd; + struct bpf_link *lookup_link; + + server_fd = make_server(sotype, EXT_IP4, EXT_PORT, NULL); + if (server_fd < 0) + return; + + connected_fd = make_client(sotype, EXT_IP4, EXT_PORT); + if (connected_fd < 0) + goto out_close_server; + + /* Put a connected socket in redirect map */ + err = update_lookup_map(skel->maps.redir_map, SERVER_A, connected_fd); + if (err) + goto out_close_connected; + + lookup_link = attach_lookup_prog(skel->progs.sk_assign_esocknosupport); + if (!lookup_link) + goto out_close_connected; + + /* Try to redirect TCP SYN / UDP packet to a connected socket */ + client_fd = make_client(sotype, EXT_IP4, EXT_PORT); + if (client_fd < 0) + goto out_unlink_prog; + if (sotype == SOCK_DGRAM) { + send_byte(client_fd); + recv_byte(server_fd); + } + + close(client_fd); +out_unlink_prog: + bpf_link__destroy(lookup_link); +out_close_connected: + close(connected_fd); +out_close_server: + close(server_fd); +} + +static void test_sk_assign_helper(struct test_sk_lookup_kern *skel) +{ + if (test__start_subtest("sk_assign returns EEXIST")) + run_sk_assign_v4(skel, skel->progs.sk_assign_eexist); + if (test__start_subtest("sk_assign honors F_REPLACE")) + run_sk_assign_v4(skel, skel->progs.sk_assign_replace_flag); + if (test__start_subtest("sk_assign accepts NULL socket")) + run_sk_assign_v4(skel, skel->progs.sk_assign_null); + if (test__start_subtest("access ctx->sk")) + run_sk_assign_v4(skel, skel->progs.access_ctx_sk); + if (test__start_subtest("narrow access to ctx v4")) + run_sk_assign_v4(skel, skel->progs.ctx_narrow_access); + if (test__start_subtest("narrow access to ctx v6")) + run_sk_assign_v6(skel, skel->progs.ctx_narrow_access); + if (test__start_subtest("sk_assign rejects TCP established")) + run_sk_assign_connected(skel, SOCK_STREAM); + if (test__start_subtest("sk_assign rejects UDP connected")) + run_sk_assign_connected(skel, SOCK_DGRAM); +} + +struct test_multi_prog { + const char *desc; + struct bpf_program *prog1; + struct bpf_program *prog2; + struct bpf_map *redir_map; + struct bpf_map *run_map; + int expect_errno; + struct inet_addr listen_at; +}; + +static void run_multi_prog_lookup(const struct test_multi_prog *t) +{ + struct sockaddr_storage dst = {}; + int map_fd, server_fd, client_fd; + struct bpf_link *link1, *link2; + int prog_idx, done, err; + + map_fd = bpf_map__fd(t->run_map); + + done = 0; + prog_idx = PROG1; + err = bpf_map_update_elem(map_fd, &prog_idx, &done, BPF_ANY); + if (CHECK(err, "bpf_map_update_elem", "failed\n")) + return; + prog_idx = PROG2; + err = bpf_map_update_elem(map_fd, &prog_idx, &done, BPF_ANY); + if (CHECK(err, "bpf_map_update_elem", "failed\n")) + return; + + link1 = attach_lookup_prog(t->prog1); + if (!link1) + return; + link2 = attach_lookup_prog(t->prog2); + if (!link2) + goto out_unlink1; + + server_fd = make_server(SOCK_STREAM, t->listen_at.ip, + t->listen_at.port, NULL); + if (server_fd < 0) + goto out_unlink2; + + err = update_lookup_map(t->redir_map, SERVER_A, server_fd); + if (err) + goto out_close_server; + + client_fd = make_socket(SOCK_STREAM, EXT_IP4, EXT_PORT, &dst); + if (client_fd < 0) + goto out_close_server; + + err = connect(client_fd, (void *)&dst, inetaddr_len(&dst)); + if (CHECK(err && !t->expect_errno, "connect", + "unexpected error %d\n", errno)) + goto out_close_client; + if (CHECK(err && t->expect_errno && errno != t->expect_errno, + "connect", "unexpected error %d\n", errno)) + goto out_close_client; + + done = 0; + prog_idx = PROG1; + err = bpf_map_lookup_elem(map_fd, &prog_idx, &done); + CHECK(err, "bpf_map_lookup_elem", "failed\n"); + CHECK(!done, "bpf_map_lookup_elem", "PROG1 !done\n"); + + done = 0; + prog_idx = PROG2; + err = bpf_map_lookup_elem(map_fd, &prog_idx, &done); + CHECK(err, "bpf_map_lookup_elem", "failed\n"); + CHECK(!done, "bpf_map_lookup_elem", "PROG2 !done\n"); + +out_close_client: + close(client_fd); +out_close_server: + close(server_fd); +out_unlink2: + bpf_link__destroy(link2); +out_unlink1: + bpf_link__destroy(link1); +} + +static void test_multi_prog_lookup(struct test_sk_lookup_kern *skel) +{ + struct test_multi_prog tests[] = { + { + .desc = "multi prog - pass, pass", + .prog1 = skel->progs.multi_prog_pass1, + .prog2 = skel->progs.multi_prog_pass2, + .listen_at = { EXT_IP4, EXT_PORT }, + }, + { + .desc = "multi prog - drop, drop", + .prog1 = skel->progs.multi_prog_drop1, + .prog2 = skel->progs.multi_prog_drop2, + .listen_at = { EXT_IP4, EXT_PORT }, + .expect_errno = ECONNREFUSED, + }, + { + .desc = "multi prog - pass, drop", + .prog1 = skel->progs.multi_prog_pass1, + .prog2 = skel->progs.multi_prog_drop2, + .listen_at = { EXT_IP4, EXT_PORT }, + .expect_errno = ECONNREFUSED, + }, + { + .desc = "multi prog - drop, pass", + .prog1 = skel->progs.multi_prog_drop1, + .prog2 = skel->progs.multi_prog_pass2, + .listen_at = { EXT_IP4, EXT_PORT }, + .expect_errno = ECONNREFUSED, + }, + { + .desc = "multi prog - pass, redir", + .prog1 = skel->progs.multi_prog_pass1, + .prog2 = skel->progs.multi_prog_redir2, + .listen_at = { INT_IP4, INT_PORT }, + }, + { + .desc = "multi prog - redir, pass", + .prog1 = skel->progs.multi_prog_redir1, + .prog2 = skel->progs.multi_prog_pass2, + .listen_at = { INT_IP4, INT_PORT }, + }, + { + .desc = "multi prog - drop, redir", + .prog1 = skel->progs.multi_prog_drop1, + .prog2 = skel->progs.multi_prog_redir2, + .listen_at = { INT_IP4, INT_PORT }, + }, + { + .desc = "multi prog - redir, drop", + .prog1 = skel->progs.multi_prog_redir1, + .prog2 = skel->progs.multi_prog_drop2, + .listen_at = { INT_IP4, INT_PORT }, + }, + { + .desc = "multi prog - redir, redir", + .prog1 = skel->progs.multi_prog_redir1, + .prog2 = skel->progs.multi_prog_redir2, + .listen_at = { INT_IP4, INT_PORT }, + }, + }; + struct test_multi_prog *t; + + for (t = tests; t < tests + ARRAY_SIZE(tests); t++) { + t->redir_map = skel->maps.redir_map; + t->run_map = skel->maps.run_map; + if (test__start_subtest(t->desc)) + run_multi_prog_lookup(t); + } +} + +static void run_tests(struct test_sk_lookup_kern *skel) +{ + if (test__start_subtest("query lookup prog")) + query_lookup_prog(skel); + test_redirect_lookup(skel); + test_drop_on_lookup(skel); + test_drop_on_reuseport(skel); + test_sk_assign_helper(skel); + test_multi_prog_lookup(skel); +} + +static int switch_netns(void) +{ + static const char * const setup_script[] = { + "ip -6 addr add dev lo " EXT_IP6 "/128 nodad", + "ip -6 addr add dev lo " INT_IP6 "/128 nodad", + "ip link set dev lo up", + NULL, + }; + const char * const *cmd; + int err; + + err = unshare(CLONE_NEWNET); + if (CHECK(err, "unshare", "failed\n")) { + log_err("unshare(CLONE_NEWNET)"); + return -1; + } + + for (cmd = setup_script; *cmd; cmd++) { + err = system(*cmd); + if (CHECK(err, "system", "failed\n")) { + log_err("system(%s)", *cmd); + return -1; + } + } + + return 0; +} + +void test_sk_lookup(void) +{ + struct test_sk_lookup_kern *skel; + int err; + + err = switch_netns(); + if (err) + return; + + skel = test_sk_lookup_kern__open_and_load(); + if (CHECK(!skel, "skel open_and_load", "failed\n")) + return; + + run_tests(skel); + + test_sk_lookup_kern__destroy(skel); +} diff --git a/tools/testing/selftests/bpf/progs/test_sk_lookup_kern.c b/tools/testing/selftests/bpf/progs/test_sk_lookup_kern.c new file mode 100644 index 000000000000..b6d2ca75ac91 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/test_sk_lookup_kern.c @@ -0,0 +1,639 @@ +// SPDX-License-Identifier: GPL-2.0 OR BSD-3-Clause +// Copyright (c) 2020 Cloudflare + +#include +#include +#include +#include +#include +#include + +#include +#include + +#define IP4(a, b, c, d) \ + bpf_htonl((((__u32)(a) & 0xffU) << 24) | \ + (((__u32)(b) & 0xffU) << 16) | \ + (((__u32)(c) & 0xffU) << 8) | \ + (((__u32)(d) & 0xffU) << 0)) +#define IP6(aaaa, bbbb, cccc, dddd) \ + { bpf_htonl(aaaa), bpf_htonl(bbbb), bpf_htonl(cccc), bpf_htonl(dddd) } + +#define MAX_SOCKS 32 + +struct { + __uint(type, BPF_MAP_TYPE_SOCKMAP); + __uint(max_entries, MAX_SOCKS); + __type(key, __u32); + __type(value, __u64); +} redir_map SEC(".maps"); + +struct { + __uint(type, BPF_MAP_TYPE_ARRAY); + __uint(max_entries, 2); + __type(key, int); + __type(value, int); +} run_map SEC(".maps"); + +enum { + PROG1 = 0, + PROG2, +}; + +enum { + SERVER_A = 0, + SERVER_B, +}; + +/* Addressable key/value constants for convenience */ +static const int KEY_PROG1 = PROG1; +static const int KEY_PROG2 = PROG2; +static const int PROG_DONE = 1; + +static const __u32 KEY_SERVER_A = SERVER_A; +static const __u32 KEY_SERVER_B = SERVER_B; + +static const __u16 DST_PORT = 7007; /* Host byte order */ +static const __u32 DST_IP4 = IP4(127, 0, 0, 1); +static const __u32 DST_IP6[] = IP6(0xfd000000, 0x0, 0x0, 0x00000001); + +SEC("sk_lookup/lookup_pass") +int lookup_pass(struct bpf_sk_lookup *ctx) +{ + return SK_PASS; +} + +SEC("sk_lookup/lookup_drop") +int lookup_drop(struct bpf_sk_lookup *ctx) +{ + return SK_DROP; +} + +SEC("sk_reuseport/reuse_pass") +int reuseport_pass(struct sk_reuseport_md *ctx) +{ + return SK_PASS; +} + +SEC("sk_reuseport/reuse_drop") +int reuseport_drop(struct sk_reuseport_md *ctx) +{ + return SK_DROP; +} + +/* Redirect packets destined for port DST_PORT to socket at redir_map[0]. */ +SEC("sk_lookup/redir_port") +int redir_port(struct bpf_sk_lookup *ctx) +{ + struct bpf_sock *sk; + int err; + + if (ctx->local_port != DST_PORT) + return SK_PASS; + + sk = bpf_map_lookup_elem(&redir_map, &KEY_SERVER_A); + if (!sk) + return SK_PASS; + + err = bpf_sk_assign(ctx, sk, 0); + bpf_sk_release(sk); + return err ? SK_DROP : SK_PASS; +} + +/* Redirect packets destined for DST_IP4 address to socket at redir_map[0]. */ +SEC("sk_lookup/redir_ip4") +int redir_ip4(struct bpf_sk_lookup *ctx) +{ + struct bpf_sock *sk; + int err; + + if (ctx->family != AF_INET) + return SK_PASS; + if (ctx->local_port != DST_PORT) + return SK_PASS; + if (ctx->local_ip4 != DST_IP4) + return SK_PASS; + + sk = bpf_map_lookup_elem(&redir_map, &KEY_SERVER_A); + if (!sk) + return SK_PASS; + + err = bpf_sk_assign(ctx, sk, 0); + bpf_sk_release(sk); + return err ? SK_DROP : SK_PASS; +} + +/* Redirect packets destined for DST_IP6 address to socket at redir_map[0]. */ +SEC("sk_lookup/redir_ip6") +int redir_ip6(struct bpf_sk_lookup *ctx) +{ + struct bpf_sock *sk; + int err; + + if (ctx->family != AF_INET6) + return SK_PASS; + if (ctx->local_port != DST_PORT) + return SK_PASS; + if (ctx->local_ip6[0] != DST_IP6[0] || + ctx->local_ip6[1] != DST_IP6[1] || + ctx->local_ip6[2] != DST_IP6[2] || + ctx->local_ip6[3] != DST_IP6[3]) + return SK_PASS; + + sk = bpf_map_lookup_elem(&redir_map, &KEY_SERVER_A); + if (!sk) + return SK_PASS; + + err = bpf_sk_assign(ctx, sk, 0); + bpf_sk_release(sk); + return err ? SK_DROP : SK_PASS; +} + +SEC("sk_lookup/select_sock_a") +int select_sock_a(struct bpf_sk_lookup *ctx) +{ + struct bpf_sock *sk; + int err; + + sk = bpf_map_lookup_elem(&redir_map, &KEY_SERVER_A); + if (!sk) + return SK_PASS; + + err = bpf_sk_assign(ctx, sk, 0); + bpf_sk_release(sk); + return err ? SK_DROP : SK_PASS; +} + +SEC("sk_lookup/select_sock_a_no_reuseport") +int select_sock_a_no_reuseport(struct bpf_sk_lookup *ctx) +{ + struct bpf_sock *sk; + int err; + + sk = bpf_map_lookup_elem(&redir_map, &KEY_SERVER_A); + if (!sk) + return SK_DROP; + + err = bpf_sk_assign(ctx, sk, BPF_SK_LOOKUP_F_NO_REUSEPORT); + bpf_sk_release(sk); + return err ? SK_DROP : SK_PASS; +} + +SEC("sk_reuseport/select_sock_b") +int select_sock_b(struct sk_reuseport_md *ctx) +{ + __u32 key = KEY_SERVER_B; + int err; + + err = bpf_sk_select_reuseport(ctx, &redir_map, &key, 0); + return err ? SK_DROP : SK_PASS; +} + +/* Check that bpf_sk_assign() returns -EEXIST if socket already selected. */ +SEC("sk_lookup/sk_assign_eexist") +int sk_assign_eexist(struct bpf_sk_lookup *ctx) +{ + struct bpf_sock *sk; + int err, ret; + + ret = SK_DROP; + sk = bpf_map_lookup_elem(&redir_map, &KEY_SERVER_B); + if (!sk) + goto out; + err = bpf_sk_assign(ctx, sk, 0); + if (err) + goto out; + bpf_sk_release(sk); + + sk = bpf_map_lookup_elem(&redir_map, &KEY_SERVER_A); + if (!sk) + goto out; + err = bpf_sk_assign(ctx, sk, 0); + if (err != -EEXIST) { + bpf_printk("sk_assign returned %d, expected %d\n", + err, -EEXIST); + goto out; + } + + ret = SK_PASS; /* Success, redirect to KEY_SERVER_B */ +out: + if (sk) + bpf_sk_release(sk); + return ret; +} + +/* Check that bpf_sk_assign(BPF_SK_LOOKUP_F_REPLACE) can override selection. */ +SEC("sk_lookup/sk_assign_replace_flag") +int sk_assign_replace_flag(struct bpf_sk_lookup *ctx) +{ + struct bpf_sock *sk; + int err, ret; + + ret = SK_DROP; + sk = bpf_map_lookup_elem(&redir_map, &KEY_SERVER_A); + if (!sk) + goto out; + err = bpf_sk_assign(ctx, sk, 0); + if (err) + goto out; + bpf_sk_release(sk); + + sk = bpf_map_lookup_elem(&redir_map, &KEY_SERVER_B); + if (!sk) + goto out; + err = bpf_sk_assign(ctx, sk, BPF_SK_LOOKUP_F_REPLACE); + if (err) { + bpf_printk("sk_assign returned %d, expected 0\n", err); + goto out; + } + + ret = SK_PASS; /* Success, redirect to KEY_SERVER_B */ +out: + if (sk) + bpf_sk_release(sk); + return ret; +} + +/* Check that bpf_sk_assign(sk=NULL) is accepted. */ +SEC("sk_lookup/sk_assign_null") +int sk_assign_null(struct bpf_sk_lookup *ctx) +{ + struct bpf_sock *sk = NULL; + int err, ret; + + ret = SK_DROP; + + err = bpf_sk_assign(ctx, NULL, 0); + if (err) { + bpf_printk("sk_assign returned %d, expected 0\n", err); + goto out; + } + + sk = bpf_map_lookup_elem(&redir_map, &KEY_SERVER_B); + if (!sk) + goto out; + err = bpf_sk_assign(ctx, sk, BPF_SK_LOOKUP_F_REPLACE); + if (err) { + bpf_printk("sk_assign returned %d, expected 0\n", err); + goto out; + } + + if (ctx->sk != sk) + goto out; + err = bpf_sk_assign(ctx, NULL, 0); + if (err != -EEXIST) + goto out; + err = bpf_sk_assign(ctx, NULL, BPF_SK_LOOKUP_F_REPLACE); + if (err) + goto out; + err = bpf_sk_assign(ctx, sk, BPF_SK_LOOKUP_F_REPLACE); + if (err) + goto out; + + ret = SK_PASS; /* Success, redirect to KEY_SERVER_B */ +out: + if (sk) + bpf_sk_release(sk); + return ret; +} + +/* Check that selected sk is accessible through context. */ +SEC("sk_lookup/access_ctx_sk") +int access_ctx_sk(struct bpf_sk_lookup *ctx) +{ + struct bpf_sock *sk1 = NULL, *sk2 = NULL; + int err, ret; + + ret = SK_DROP; + + /* Try accessing unassigned (NULL) ctx->sk field */ + if (ctx->sk && ctx->sk->family != AF_INET) + goto out; + + /* Assign a value to ctx->sk */ + sk1 = bpf_map_lookup_elem(&redir_map, &KEY_SERVER_A); + if (!sk1) + goto out; + err = bpf_sk_assign(ctx, sk1, 0); + if (err) + goto out; + if (ctx->sk != sk1) + goto out; + + /* Access ctx->sk fields */ + if (ctx->sk->family != AF_INET || + ctx->sk->type != SOCK_STREAM || + ctx->sk->state != BPF_TCP_LISTEN) + goto out; + + /* Reset selection */ + err = bpf_sk_assign(ctx, NULL, BPF_SK_LOOKUP_F_REPLACE); + if (err) + goto out; + if (ctx->sk) + goto out; + + /* Assign another socket */ + sk2 = bpf_map_lookup_elem(&redir_map, &KEY_SERVER_B); + if (!sk2) + goto out; + err = bpf_sk_assign(ctx, sk2, BPF_SK_LOOKUP_F_REPLACE); + if (err) + goto out; + if (ctx->sk != sk2) + goto out; + + /* Access reassigned ctx->sk fields */ + if (ctx->sk->family != AF_INET || + ctx->sk->type != SOCK_STREAM || + ctx->sk->state != BPF_TCP_LISTEN) + goto out; + + ret = SK_PASS; /* Success, redirect to KEY_SERVER_B */ +out: + if (sk1) + bpf_sk_release(sk1); + if (sk2) + bpf_sk_release(sk2); + return ret; +} + +/* Check narrow loads from ctx fields that support them */ +SEC("sk_lookup/ctx_narrow_access") +int ctx_narrow_access(struct bpf_sk_lookup *ctx) +{ + struct bpf_sock *sk; + int err, family; + __u16 *half; + __u8 *byte; + bool v4; + + v4 = (ctx->family == AF_INET); + + /* Narrow loads from family field */ + byte = (__u8 *)&ctx->family; + half = (__u16 *)&ctx->family; + if (byte[0] != (v4 ? AF_INET : AF_INET6) || + byte[1] != 0 || byte[2] != 0 || byte[3] != 0) + return SK_DROP; + if (half[0] != (v4 ? AF_INET : AF_INET6) || + half[1] != 0) + return SK_DROP; + + byte = (__u8 *)&ctx->protocol; + if (byte[0] != IPPROTO_TCP || + byte[1] != 0 || byte[2] != 0 || byte[3] != 0) + return SK_DROP; + half = (__u16 *)&ctx->protocol; + if (half[0] != IPPROTO_TCP || + half[1] != 0) + return SK_DROP; + + /* Narrow loads from remote_port field. Expect non-0 value. */ + byte = (__u8 *)&ctx->remote_port; + if (byte[0] == 0 && byte[1] == 0 && byte[2] == 0 && byte[3] == 0) + return SK_DROP; + half = (__u16 *)&ctx->remote_port; + if (half[0] == 0 && half[1] == 0) + return SK_DROP; + + /* Narrow loads from local_port field. Expect DST_PORT. */ + byte = (__u8 *)&ctx->local_port; + if (byte[0] != ((DST_PORT >> 0) & 0xff) || + byte[1] != ((DST_PORT >> 8) & 0xff) || + byte[2] != 0 || byte[3] != 0) + return SK_DROP; + half = (__u16 *)&ctx->local_port; + if (half[0] != DST_PORT || + half[1] != 0) + return SK_DROP; + + /* Narrow loads from IPv4 fields */ + if (v4) { + /* Expect non-0.0.0.0 in remote_ip4 */ + byte = (__u8 *)&ctx->remote_ip4; + if (byte[0] == 0 && byte[1] == 0 && + byte[2] == 0 && byte[3] == 0) + return SK_DROP; + half = (__u16 *)&ctx->remote_ip4; + if (half[0] == 0 && half[1] == 0) + return SK_DROP; + + /* Expect DST_IP4 in local_ip4 */ + byte = (__u8 *)&ctx->local_ip4; + if (byte[0] != ((DST_IP4 >> 0) & 0xff) || + byte[1] != ((DST_IP4 >> 8) & 0xff) || + byte[2] != ((DST_IP4 >> 16) & 0xff) || + byte[3] != ((DST_IP4 >> 24) & 0xff)) + return SK_DROP; + half = (__u16 *)&ctx->local_ip4; + if (half[0] != ((DST_IP4 >> 0) & 0xffff) || + half[1] != ((DST_IP4 >> 16) & 0xffff)) + return SK_DROP; + } else { + /* Expect 0.0.0.0 IPs when family != AF_INET */ + byte = (__u8 *)&ctx->remote_ip4; + if (byte[0] != 0 || byte[1] != 0 && + byte[2] != 0 || byte[3] != 0) + return SK_DROP; + half = (__u16 *)&ctx->remote_ip4; + if (half[0] != 0 || half[1] != 0) + return SK_DROP; + + byte = (__u8 *)&ctx->local_ip4; + if (byte[0] != 0 || byte[1] != 0 && + byte[2] != 0 || byte[3] != 0) + return SK_DROP; + half = (__u16 *)&ctx->local_ip4; + if (half[0] != 0 || half[1] != 0) + return SK_DROP; + } + + /* Narrow loads from IPv6 fields */ + if (!v4) { + /* Expenct non-:: IP in remote_ip6 */ + byte = (__u8 *)&ctx->remote_ip6; + if (byte[0] == 0 && byte[1] == 0 && + byte[2] == 0 && byte[3] == 0 && + byte[4] == 0 && byte[5] == 0 && + byte[6] == 0 && byte[7] == 0 && + byte[8] == 0 && byte[9] == 0 && + byte[10] == 0 && byte[11] == 0 && + byte[12] == 0 && byte[13] == 0 && + byte[14] == 0 && byte[15] == 0) + return SK_DROP; + half = (__u16 *)&ctx->remote_ip6; + if (half[0] == 0 && half[1] == 0 && + half[2] == 0 && half[3] == 0 && + half[4] == 0 && half[5] == 0 && + half[6] == 0 && half[7] == 0) + return SK_DROP; + + /* Expect DST_IP6 in local_ip6 */ + byte = (__u8 *)&ctx->local_ip6; + if (byte[0] != ((DST_IP6[0] >> 0) & 0xff) || + byte[1] != ((DST_IP6[0] >> 8) & 0xff) || + byte[2] != ((DST_IP6[0] >> 16) & 0xff) || + byte[3] != ((DST_IP6[0] >> 24) & 0xff) || + byte[4] != ((DST_IP6[1] >> 0) & 0xff) || + byte[5] != ((DST_IP6[1] >> 8) & 0xff) || + byte[6] != ((DST_IP6[1] >> 16) & 0xff) || + byte[7] != ((DST_IP6[1] >> 24) & 0xff) || + byte[8] != ((DST_IP6[2] >> 0) & 0xff) || + byte[9] != ((DST_IP6[2] >> 8) & 0xff) || + byte[10] != ((DST_IP6[2] >> 16) & 0xff) || + byte[11] != ((DST_IP6[2] >> 24) & 0xff) || + byte[12] != ((DST_IP6[3] >> 0) & 0xff) || + byte[13] != ((DST_IP6[3] >> 8) & 0xff) || + byte[14] != ((DST_IP6[3] >> 16) & 0xff) || + byte[15] != ((DST_IP6[3] >> 24) & 0xff)) + return SK_DROP; + half = (__u16 *)&ctx->local_ip6; + if (half[0] != ((DST_IP6[0] >> 0) & 0xffff) || + half[1] != ((DST_IP6[0] >> 16) & 0xffff) || + half[2] != ((DST_IP6[1] >> 0) & 0xffff) || + half[3] != ((DST_IP6[1] >> 16) & 0xffff) || + half[4] != ((DST_IP6[2] >> 0) & 0xffff) || + half[5] != ((DST_IP6[2] >> 16) & 0xffff) || + half[6] != ((DST_IP6[3] >> 0) & 0xffff) || + half[7] != ((DST_IP6[3] >> 16) & 0xffff)) + return SK_DROP; + } else { + /* Expect :: IPs when family != AF_INET6 */ + byte = (__u8 *)&ctx->remote_ip6; + if (byte[0] != 0 || byte[1] != 0 || + byte[2] != 0 || byte[3] != 0 || + byte[4] != 0 || byte[5] != 0 || + byte[6] != 0 || byte[7] != 0 || + byte[8] != 0 || byte[9] != 0 || + byte[10] != 0 || byte[11] != 0 || + byte[12] != 0 || byte[13] != 0 || + byte[14] != 0 || byte[15] != 0) + return SK_DROP; + half = (__u16 *)&ctx->remote_ip6; + if (half[0] != 0 || half[1] != 0 || + half[2] != 0 || half[3] != 0 || + half[4] != 0 || half[5] != 0 || + half[6] != 0 || half[7] != 0) + return SK_DROP; + + byte = (__u8 *)&ctx->local_ip6; + if (byte[0] != 0 || byte[1] != 0 || + byte[2] != 0 || byte[3] != 0 || + byte[4] != 0 || byte[5] != 0 || + byte[6] != 0 || byte[7] != 0 || + byte[8] != 0 || byte[9] != 0 || + byte[10] != 0 || byte[11] != 0 || + byte[12] != 0 || byte[13] != 0 || + byte[14] != 0 || byte[15] != 0) + return SK_DROP; + half = (__u16 *)&ctx->local_ip6; + if (half[0] != 0 || half[1] != 0 || + half[2] != 0 || half[3] != 0 || + half[4] != 0 || half[5] != 0 || + half[6] != 0 || half[7] != 0) + return SK_DROP; + } + + /* Success, redirect to KEY_SERVER_B */ + sk = bpf_map_lookup_elem(&redir_map, &KEY_SERVER_B); + if (sk) { + bpf_sk_assign(ctx, sk, 0); + bpf_sk_release(sk); + } + return SK_PASS; +} + +/* Check that sk_assign rejects SERVER_A socket with -ESOCKNOSUPPORT */ +SEC("sk_lookup/sk_assign_esocknosupport") +int sk_assign_esocknosupport(struct bpf_sk_lookup *ctx) +{ + struct bpf_sock *sk; + int err, ret; + + ret = SK_DROP; + sk = bpf_map_lookup_elem(&redir_map, &KEY_SERVER_A); + if (!sk) + goto out; + + err = bpf_sk_assign(ctx, sk, 0); + if (err != -ESOCKTNOSUPPORT) { + bpf_printk("sk_assign returned %d, expected %d\n", + err, -ESOCKTNOSUPPORT); + goto out; + } + + ret = SK_PASS; /* Success, pass to regular lookup */ +out: + if (sk) + bpf_sk_release(sk); + return ret; +} + +SEC("sk_lookup/multi_prog_pass1") +int multi_prog_pass1(struct bpf_sk_lookup *ctx) +{ + bpf_map_update_elem(&run_map, &KEY_PROG1, &PROG_DONE, BPF_ANY); + return SK_PASS; +} + +SEC("sk_lookup/multi_prog_pass2") +int multi_prog_pass2(struct bpf_sk_lookup *ctx) +{ + bpf_map_update_elem(&run_map, &KEY_PROG2, &PROG_DONE, BPF_ANY); + return SK_PASS; +} + +SEC("sk_lookup/multi_prog_drop1") +int multi_prog_drop1(struct bpf_sk_lookup *ctx) +{ + bpf_map_update_elem(&run_map, &KEY_PROG1, &PROG_DONE, BPF_ANY); + return SK_DROP; +} + +SEC("sk_lookup/multi_prog_drop2") +int multi_prog_drop2(struct bpf_sk_lookup *ctx) +{ + bpf_map_update_elem(&run_map, &KEY_PROG2, &PROG_DONE, BPF_ANY); + return SK_DROP; +} + +static __always_inline int select_server_a(struct bpf_sk_lookup *ctx) +{ + struct bpf_sock *sk; + int err; + + sk = bpf_map_lookup_elem(&redir_map, &KEY_SERVER_A); + if (!sk) + return SK_DROP; + + err = bpf_sk_assign(ctx, sk, 0); + bpf_sk_release(sk); + if (err) + return SK_DROP; + + return SK_PASS; +} + +SEC("sk_lookup/multi_prog_redir1") +int multi_prog_redir1(struct bpf_sk_lookup *ctx) +{ + int ret; + + ret = select_server_a(ctx); + bpf_map_update_elem(&run_map, &KEY_PROG1, &PROG_DONE, BPF_ANY); + return SK_PASS; +} + +SEC("sk_lookup/multi_prog_redir2") +int multi_prog_redir2(struct bpf_sk_lookup *ctx) +{ + int ret; + + ret = select_server_a(ctx); + bpf_map_update_elem(&run_map, &KEY_PROG2, &PROG_DONE, BPF_ANY); + return SK_PASS; +} + +char _license[] SEC("license") = "Dual BSD/GPL"; +__u32 _version SEC("version") = 1;