From patchwork Fri Sep 11 14:30:16 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nicolas Rybowski X-Patchwork-Id: 1362487 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Original-To: incoming-bpf@patchwork.ozlabs.org Delivered-To: patchwork-incoming-bpf@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org (client-ip=23.128.96.18; helo=vger.kernel.org; envelope-from=bpf-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=tessares.net Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=tessares-net.20150623.gappssmtp.com header.i=@tessares-net.20150623.gappssmtp.com header.a=rsa-sha256 header.s=20150623 header.b=jkpy5S66; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by ozlabs.org (Postfix) with ESMTP id 4BnzlD2Cbsz9sSs for ; Sat, 12 Sep 2020 01:12:40 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726073AbgIKPMV (ORCPT ); Fri, 11 Sep 2020 11:12:21 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59316 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726400AbgIKPKM (ORCPT ); Fri, 11 Sep 2020 11:10:12 -0400 Received: from mail-ed1-x544.google.com (mail-ed1-x544.google.com [IPv6:2a00:1450:4864:20::544]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B1941C06135A for ; Fri, 11 Sep 2020 07:31:00 -0700 (PDT) Received: by mail-ed1-x544.google.com with SMTP id q21so10198671edv.1 for ; Fri, 11 Sep 2020 07:31:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=tessares-net.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:mime-version :content-transfer-encoding; bh=7mgZuX+I018CPLyKtPUh7k/9HodG7wIMhXXoeD6+JKU=; b=jkpy5S66o1feRhrtlzl6PzU1fYz4TE/zZjvhcBHu8qQD+l2uP0ug6wbXuJK9XU4+zS Ml+05QzSwfORDdn+2bQvu4VdPK2hc1MOc+CNLNakaj+5eG4vOqQ+6C4yyw6O78nMtoDL fYu9Skcyyh6a42fCJ70NDglzYCpuwLP8yZDm3HKrrSLSbHoGhva5SKerC7dFzpZNzzWf QXwz7BMUIkVhL2MFmhQEV8f0wCTcrqy83/EsGthbVOTMtERTPPy0pauQg8lD+Eyvz4U8 e+xRRDyuXs0jNQBiTm6Ybs236Zx5Na+0jiuaPpAphBmFuJAz0xli0/wSs3jTZHX11Ryr XlUA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:mime-version :content-transfer-encoding; bh=7mgZuX+I018CPLyKtPUh7k/9HodG7wIMhXXoeD6+JKU=; b=frwIpidDJRrkeRTOz60SvdP0glZpN5c+S76wPxiFUgICvg87nqK6GHKBlClLFTZaEE M+LkzrWIgxUOEHJWLBm7o3wIKkAGwN99uMzAGTtODYJQnUIdLwGhC57gAE+HcHeM4kjn 6Y8EzvjDDGQFgy9GL1+mPVdf5J8W2qKMdVwOSTvaxllxmLHNsAvZtwFL9ZC/03QzJxJY EGSf6DBlZmpW2HKmzRGBTx0WQdpiTkIw1luSKnodbHtVBQ2kYAlkfltuJZ1E1C1yAOMF k/eE8YUXsJYbue+t2j5/DikFD6/a6aGkcTl7oJLE1TfxD0LnBe0zyYrzw0UAHqFF6TyO v1ew== X-Gm-Message-State: AOAM5329Zr8h2OMDIumqjKy9ojXISrhFIwY6fK32vquUyT3OzL2M1gbY 2d8F2JIzDdSI/b5ihcYP9mXEaQ== X-Google-Smtp-Source: ABdhPJwhmYbA1MMaXCK6kWdSVFv5YuQOmbMnYfXF79rYVkrYaDq+fAD+mllnm22XYjlgKXbocLQ7ng== X-Received: by 2002:aa7:d29a:: with SMTP id w26mr2271422edq.106.1599834659332; Fri, 11 Sep 2020 07:30:59 -0700 (PDT) Received: from localhost.localdomain ([87.66.33.240]) by smtp.gmail.com with ESMTPSA id y21sm1716261eju.46.2020.09.11.07.30.57 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 11 Sep 2020 07:30:58 -0700 (PDT) From: Nicolas Rybowski To: Alexei Starovoitov , Daniel Borkmann , Martin KaFai Lau , Song Liu , Yonghong Song , Andrii Nakryiko , John Fastabend , KP Singh , "David S. Miller" , Jakub Kicinski Cc: Nicolas Rybowski , Matthieu Baerts , Mat Martineau , netdev@vger.kernel.org, bpf@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH bpf-next v2 1/5] bpf: expose is_mptcp flag to bpf_tcp_sock Date: Fri, 11 Sep 2020 16:30:16 +0200 Message-Id: <20200911143022.414783-1-nicolas.rybowski@tessares.net> X-Mailer: git-send-email 2.28.0 MIME-Version: 1.0 Sender: bpf-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org is_mptcp is a field from struct tcp_sock used to indicate that the current tcp_sock is part of the MPTCP protocol. In this protocol, a first socket (mptcp_sock) is created with sk_protocol set to IPPROTO_MPTCP (=262) for control purpose but it isn't directly on the wire. This is the role of the subflow (kernel) sockets which are classical tcp_sock with sk_protocol set to IPPROTO_TCP. The only way to differentiate such sockets from plain TCP sockets is the is_mptcp field from tcp_sock. Such an exposure in BPF is thus required to be able to differentiate plain TCP sockets from MPTCP subflow sockets in BPF_PROG_TYPE_SOCK_OPS programs. The choice has been made to silently pass the case when CONFIG_MPTCP is unset by defaulting is_mptcp to 0 in order to make BPF independent of the MPTCP configuration. Another solution is to make the verifier fail in 'bpf_tcp_sock_is_valid_ctx_access' but this will add an additional '#ifdef CONFIG_MPTCP' in the BPF code and a same injected BPF program will not run if MPTCP is not set. An example use-case is provided in https://github.com/multipath-tcp/mptcp_net-next/tree/scripts/bpf/examples Suggested-by: Matthieu Baerts Acked-by: Matthieu Baerts Acked-by: Mat Martineau Signed-off-by: Nicolas Rybowski --- include/uapi/linux/bpf.h | 1 + net/core/filter.c | 9 ++++++++- tools/include/uapi/linux/bpf.h | 1 + 3 files changed, 10 insertions(+), 1 deletion(-) diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index 7dd314176df7..7d179eada1c3 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -4060,6 +4060,7 @@ struct bpf_tcp_sock { __u32 delivered; /* Total data packets delivered incl. rexmits */ __u32 delivered_ce; /* Like the above but only ECE marked packets */ __u32 icsk_retransmits; /* Number of unrecovered [RTO] timeouts */ + __u32 is_mptcp; /* Is MPTCP subflow? */ }; struct bpf_sock_tuple { diff --git a/net/core/filter.c b/net/core/filter.c index d266c6941967..dab48528dceb 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -5837,7 +5837,7 @@ bool bpf_tcp_sock_is_valid_access(int off, int size, enum bpf_access_type type, struct bpf_insn_access_aux *info) { if (off < 0 || off >= offsetofend(struct bpf_tcp_sock, - icsk_retransmits)) + is_mptcp)) return false; if (off % size != 0) @@ -5971,6 +5971,13 @@ u32 bpf_tcp_sock_convert_ctx_access(enum bpf_access_type type, case offsetof(struct bpf_tcp_sock, icsk_retransmits): BPF_INET_SOCK_GET_COMMON(icsk_retransmits); break; + case offsetof(struct bpf_tcp_sock, is_mptcp): +#ifdef CONFIG_MPTCP + BPF_TCP_SOCK_GET_COMMON(is_mptcp); +#else + *insn++ = BPF_MOV32_IMM(si->dst_reg, 0); +#endif + break; } return insn - insn_buf; diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index 7dd314176df7..7d179eada1c3 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -4060,6 +4060,7 @@ struct bpf_tcp_sock { __u32 delivered; /* Total data packets delivered incl. rexmits */ __u32 delivered_ce; /* Like the above but only ECE marked packets */ __u32 icsk_retransmits; /* Number of unrecovered [RTO] timeouts */ + __u32 is_mptcp; /* Is MPTCP subflow? */ }; struct bpf_sock_tuple { From patchwork Fri Sep 11 14:30:17 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nicolas Rybowski X-Patchwork-Id: 1362540 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Original-To: incoming-bpf@patchwork.ozlabs.org Delivered-To: patchwork-incoming-bpf@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org (client-ip=23.128.96.18; helo=vger.kernel.org; envelope-from=bpf-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=tessares.net Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=tessares-net.20150623.gappssmtp.com header.i=@tessares-net.20150623.gappssmtp.com header.a=rsa-sha256 header.s=20150623 header.b=qySYWhW0; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by ozlabs.org (Postfix) with ESMTP id 4Bp1lT0HW4z9sTS for ; Sat, 12 Sep 2020 02:43:01 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726166AbgIKQm6 (ORCPT ); Fri, 11 Sep 2020 12:42:58 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59294 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726255AbgIKPKE (ORCPT ); Fri, 11 Sep 2020 11:10:04 -0400 Received: from mail-ej1-x643.google.com (mail-ej1-x643.google.com [IPv6:2a00:1450:4864:20::643]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B3F12C06135E for ; Fri, 11 Sep 2020 07:31:06 -0700 (PDT) Received: by mail-ej1-x643.google.com with SMTP id q13so14066034ejo.9 for ; Fri, 11 Sep 2020 07:31:06 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=tessares-net.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=8voqPtdkbS5RYnHXaRVOj2SC1jKq1kvwIX4dEXxo6Mc=; b=qySYWhW0yC42rPW/IoqWQolhYfXTLM21q7M2YytnHh2yQU4RoGj86gmZePTxz8sAjT GFnqtUFKRVnNvFFSoku5JBmqgqWLDyp3MH8ip3utZK5/D9/Q5oyLP69h10p78VRSCZL4 CjxstQ5GyKsCPPsHl+CbtPcFPb7vjpkHd29RXhjIRjBEWV+X8wVZQMmDOV0ELwoDcLmz HrVgononjhPjTIsOluxMQICFWXylrDWQkjYNSlCBsEJOszmzHOOjtdqU3VA0Jdd7dB1w X8H0violb+3GZwx8SNphe81iv+N5lCd4hWhgNDxulNEkayJY2qBM/9ftAQY8iAf/gPHo GyDg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=8voqPtdkbS5RYnHXaRVOj2SC1jKq1kvwIX4dEXxo6Mc=; b=NgKAKzW4ZTQHcfuno58jAes3qBpFjKL0D+h55uyjtny4O10bSt5+AT150I+xLiuRDT JsULcuvEjwh9FMg3bud5Su3rndNyp2fxrnz2PueKJX1o6KbdDqozIZdDURextk0hVRa2 ZkZ5iOk2mgEImD0dFLdn09RGteP+axTDiSXXAbTLs3Ok0FZwVAJKHNu/QzJQB6e+fBKL mY4CEotUmAZ87r+kJDQ3Vxwo0+tK+OAghl6daElwSzDeU3IgojlbBzpsPRxtztpaZ9tz c2RHO1sW4ZUqWn3+GUL7tnTWTaBfRuEO+z5JBq9n3rTAe9oiS6rgGTqDQV8N7PmdmaMp DlQA== X-Gm-Message-State: AOAM533sbLjUovtt71C/YkkW9yQdnMB9lVa1xICeQuD3XgL0kGlkFNgT nEpXuRsDmGnyR1Bshna4lkbe+Q== X-Google-Smtp-Source: ABdhPJx9aD+LYlhca0oVerMVXjMsebdQaEZZyd5ADDx85BMSztyKMmeinroRPPOAzwBmwtoOQ65huA== X-Received: by 2002:a17:906:5206:: with SMTP id g6mr2404464ejm.292.1599834665328; Fri, 11 Sep 2020 07:31:05 -0700 (PDT) Received: from localhost.localdomain ([87.66.33.240]) by smtp.gmail.com with ESMTPSA id y21sm1716261eju.46.2020.09.11.07.31.04 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 11 Sep 2020 07:31:04 -0700 (PDT) From: Nicolas Rybowski To: Mat Martineau , Matthieu Baerts , "David S. Miller" , Jakub Kicinski , Alexei Starovoitov , Daniel Borkmann , Martin KaFai Lau , Song Liu , Yonghong Song , Andrii Nakryiko , John Fastabend , KP Singh Cc: Nicolas Rybowski , Paolo Abeni , netdev@vger.kernel.org, mptcp@lists.01.org, linux-kernel@vger.kernel.org, bpf@vger.kernel.org Subject: [PATCH bpf-next v2 2/5] mptcp: attach subflow socket to parent cgroup Date: Fri, 11 Sep 2020 16:30:17 +0200 Message-Id: <20200911143022.414783-2-nicolas.rybowski@tessares.net> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20200911143022.414783-1-nicolas.rybowski@tessares.net> References: <20200911143022.414783-1-nicolas.rybowski@tessares.net> MIME-Version: 1.0 Sender: bpf-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org It has been observed that the kernel sockets created for the subflows (except the first one) are not in the same cgroup as their parents. That's because the additional subflows are created by kernel workers. This is a problem with eBPF programs attached to the parent's cgroup won't be executed for the children. But also with any other features of CGroup linked to a sk. This patch fixes this behaviour. As the subflow sockets are created by the kernel, we can't use 'mem_cgroup_sk_alloc' because of the current context being the one of the kworker. This is why we have to do low level memcg manipulation, if required. Suggested-by: Matthieu Baerts Suggested-by: Paolo Abeni Acked-by: Matthieu Baerts Reviewed-by: Mat Martineau Signed-off-by: Nicolas Rybowski Acked-by: Song Liu --- net/mptcp/subflow.c | 27 +++++++++++++++++++++++++++ 1 file changed, 27 insertions(+) diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c index e8cac2655c82..535a3f9f8cfc 100644 --- a/net/mptcp/subflow.c +++ b/net/mptcp/subflow.c @@ -1130,6 +1130,30 @@ int __mptcp_subflow_connect(struct sock *sk, int ifindex, return err; } +static void mptcp_attach_cgroup(struct sock *parent, struct sock *child) +{ +#ifdef CONFIG_SOCK_CGROUP_DATA + struct sock_cgroup_data *parent_skcd = &parent->sk_cgrp_data, + *child_skcd = &child->sk_cgrp_data; + + /* only the additional subflows created by kworkers have to be modified */ + if (cgroup_id(sock_cgroup_ptr(parent_skcd)) != + cgroup_id(sock_cgroup_ptr(child_skcd))) { +#ifdef CONFIG_MEMCG + struct mem_cgroup *memcg = parent->sk_memcg; + + mem_cgroup_sk_free(child); + if (memcg && css_tryget(&memcg->css)) + child->sk_memcg = memcg; +#endif /* CONFIG_MEMCG */ + + cgroup_sk_free(child_skcd); + *child_skcd = *parent_skcd; + cgroup_sk_clone(child_skcd); + } +#endif /* CONFIG_SOCK_CGROUP_DATA */ +} + int mptcp_subflow_create_socket(struct sock *sk, struct socket **new_sock) { struct mptcp_subflow_context *subflow; @@ -1150,6 +1174,9 @@ int mptcp_subflow_create_socket(struct sock *sk, struct socket **new_sock) lock_sock(sf->sk); + /* the newly created socket has to be in the same cgroup as its parent */ + mptcp_attach_cgroup(sk, sf->sk); + /* kernel sockets do not by default acquire net ref, but TCP timer * needs it. */ From patchwork Fri Sep 11 14:30:18 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nicolas Rybowski X-Patchwork-Id: 1362532 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Original-To: incoming-bpf@patchwork.ozlabs.org Delivered-To: patchwork-incoming-bpf@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org (client-ip=23.128.96.18; helo=vger.kernel.org; envelope-from=bpf-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=tessares.net Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=tessares-net.20150623.gappssmtp.com header.i=@tessares-net.20150623.gappssmtp.com header.a=rsa-sha256 header.s=20150623 header.b=GXyxdu47; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by ozlabs.org (Postfix) with ESMTP id 4Bp1f35xFSz9sSs for ; Sat, 12 Sep 2020 02:38:19 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726355AbgIKPMR (ORCPT ); Fri, 11 Sep 2020 11:12:17 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59322 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726402AbgIKPKN (ORCPT ); Fri, 11 Sep 2020 11:10:13 -0400 Received: from mail-ed1-x542.google.com (mail-ed1-x542.google.com [IPv6:2a00:1450:4864:20::542]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B6509C061362 for ; Fri, 11 Sep 2020 07:31:11 -0700 (PDT) Received: by mail-ed1-x542.google.com with SMTP id ay8so10154683edb.8 for ; Fri, 11 Sep 2020 07:31:11 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=tessares-net.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=FKkvcu84sYpCTZ/MXCNyDF95OUtfxDviJDI7snHzfWw=; b=GXyxdu47UAMPwkML/Q5Ud6KV6e1YMZk5GlaBd6ru10Df7EFKq+TUhEx/XYK0JaEhjO PWUz3eTT6jS0nJS1M2aYCzhBO4J9ipXTWVDjSjnwEEBBMRpgAhtJjJfAWwOTE6ZBjLOh +DGLNf3ZKcu0d7JPNvVoY9R/vQbxuqkljrm9aMnOSAAT1Ej3Mq5rKe7EAOTk4nmii9EF NVxaq/sSFQB5/H/t0rWmmYyLmy3mU4+5DNfm2bPXIU7SrOwuKQeX2sv1Tu+itJsNetQg O9bpzAkIm2dCLONVGYsWtNHWYuq2sLIXLVeLYLDGjyf9o2s2v2wamkx/Lt08lN5dF7O/ M6OA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=FKkvcu84sYpCTZ/MXCNyDF95OUtfxDviJDI7snHzfWw=; b=V/BaT2eXu6jUxxCxB+OvtwJcLYgI0DqTLdPUuEERwwK2eCx3pnY9IJYv5Rah0lK6RD Iy7YGtqhRpzlwuJlJt6cg0WBV7op2/riUzyTNe+PCeqGtnYkvksECeWZAFu+7+LephXB BNjo9PSBy3cKNNBSgeyz2ZRNL4Aq4N6ZTDqrsJv19iiHGIemEuSLqIUCPMeiL9ECf1FY RwIPwGv5Qd1PVqYMQd79D86QJEPp6Op2mYcy/anwkZARH6oQ31D+Q2xEGPiBOHvp+i9a 53pxwUBhTAsevd8x+0aCZNHT29+5JO378nUTunIVH7wyDXqqg9vu54cfzL8zNX6m7Bgt JR8Q== X-Gm-Message-State: AOAM533pNw1xYq3D7sqsbxh2y5to/i/buPy6CqXFgyohGHf+DMF8X2uH 6mw6zGKlkpc8x/pyotFrCE1bQw== X-Google-Smtp-Source: ABdhPJzTjkwRowBAQIocdnQ4t5RAq/bQGP277dBKSlublhrCVD2uJiO83E5rqJXUAaY9BiWCsf3Dqg== X-Received: by 2002:a05:6402:1593:: with SMTP id c19mr2330321edv.33.1599834670311; Fri, 11 Sep 2020 07:31:10 -0700 (PDT) Received: from localhost.localdomain ([87.66.33.240]) by smtp.gmail.com with ESMTPSA id y21sm1716261eju.46.2020.09.11.07.31.09 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 11 Sep 2020 07:31:09 -0700 (PDT) From: Nicolas Rybowski To: Alexei Starovoitov , Daniel Borkmann , Martin KaFai Lau , Song Liu , Yonghong Song , Andrii Nakryiko , John Fastabend , KP Singh , "David S. Miller" , Jakub Kicinski , Mat Martineau , Matthieu Baerts Cc: Nicolas Rybowski , linux-kernel@vger.kernel.org, netdev@vger.kernel.org, bpf@vger.kernel.org, mptcp@lists.01.org Subject: [PATCH bpf-next v2 3/5] bpf: add 'bpf_mptcp_sock' structure and helper Date: Fri, 11 Sep 2020 16:30:18 +0200 Message-Id: <20200911143022.414783-3-nicolas.rybowski@tessares.net> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20200911143022.414783-1-nicolas.rybowski@tessares.net> References: <20200911143022.414783-1-nicolas.rybowski@tessares.net> MIME-Version: 1.0 Sender: bpf-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org In order to precisely identify the parent MPTCP connection of a subflow, it is required to access the mptcp_sock's token which uniquely identify a MPTCP connection. This patch adds a new structure 'bpf_mptcp_sock' exposing the 'token' field of the 'mptcp_sock' extracted from a subflow's 'tcp_sock'. It also adds the declaration of a new BPF helper of the same name to expose the newly defined structure in the userspace BPF API. This is the foundation to expose more MPTCP-specific fields through BPF. Currently, it is limited to the field 'token' of the msk but it is easily extensible. Acked-by: Matthieu Baerts Acked-by: Mat Martineau Signed-off-by: Nicolas Rybowski Acked-by: Song Liu --- include/linux/bpf.h | 33 ++++++++++++++++ include/uapi/linux/bpf.h | 14 +++++++ kernel/bpf/verifier.c | 30 ++++++++++++++ net/core/filter.c | 4 ++ net/mptcp/Makefile | 2 + net/mptcp/bpf.c | 72 ++++++++++++++++++++++++++++++++++ scripts/bpf_helpers_doc.py | 2 + tools/include/uapi/linux/bpf.h | 14 +++++++ 8 files changed, 171 insertions(+) create mode 100644 net/mptcp/bpf.c diff --git a/include/linux/bpf.h b/include/linux/bpf.h index c6d9f2c444f4..6be74420f6fa 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -305,6 +305,7 @@ enum bpf_return_type { RET_PTR_TO_SOCK_COMMON_OR_NULL, /* returns a pointer to a sock_common or NULL */ RET_PTR_TO_ALLOC_MEM_OR_NULL, /* returns a pointer to dynamically allocated memory or NULL */ RET_PTR_TO_BTF_ID_OR_NULL, /* returns a pointer to a btf_id or NULL */ + RET_PTR_TO_MPTCP_SOCK_OR_NULL, /* returns a pointer to mptcp_sock or NULL */ }; /* eBPF function prototype used by verifier to allow BPF_CALLs from eBPF programs @@ -385,6 +386,8 @@ enum bpf_reg_type { PTR_TO_RDONLY_BUF_OR_NULL, /* reg points to a readonly buffer or NULL */ PTR_TO_RDWR_BUF, /* reg points to a read/write buffer */ PTR_TO_RDWR_BUF_OR_NULL, /* reg points to a read/write buffer or NULL */ + PTR_TO_MPTCP_SOCK, /* reg points to struct mptcp_sock */ + PTR_TO_MPTCP_SOCK_OR_NULL, /* reg points to struct mptcp_sock or NULL */ }; /* The information passed from prog-specific *_is_valid_access @@ -1785,6 +1788,7 @@ extern const struct bpf_func_proto bpf_skc_to_tcp_timewait_sock_proto; extern const struct bpf_func_proto bpf_skc_to_tcp_request_sock_proto; extern const struct bpf_func_proto bpf_skc_to_udp6_sock_proto; extern const struct bpf_func_proto bpf_copy_from_user_proto; +extern const struct bpf_func_proto bpf_mptcp_sock_proto; const struct bpf_func_proto *bpf_tracing_func_proto( enum bpf_func_id func_id, const struct bpf_prog *prog); @@ -1841,6 +1845,35 @@ struct sk_reuseport_kern { u32 reuseport_id; bool bind_inany; }; + +#ifdef CONFIG_MPTCP +bool bpf_mptcp_sock_is_valid_access(int off, int size, + enum bpf_access_type type, + struct bpf_insn_access_aux *info); + +u32 bpf_mptcp_sock_convert_ctx_access(enum bpf_access_type type, + const struct bpf_insn *si, + struct bpf_insn *insn_buf, + struct bpf_prog *prog, + u32 *target_size); +#else /* CONFIG_MPTCP */ +static inline bool bpf_mptcp_sock_is_valid_access(int off, int size, + enum bpf_access_type type, + struct bpf_insn_access_aux *info) +{ + return false; +} + +static inline u32 bpf_mptcp_sock_convert_ctx_access(enum bpf_access_type type, + const struct bpf_insn *si, + struct bpf_insn *insn_buf, + struct bpf_prog *prog, + u32 *target_size) +{ + return 0; +} +#endif /* CONFIG_MPTCP */ + bool bpf_tcp_sock_is_valid_access(int off, int size, enum bpf_access_type type, struct bpf_insn_access_aux *info); diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index 7d179eada1c3..ee7f6fd67f47 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -3579,6 +3579,15 @@ union bpf_attr { * the data in *dst*. This is a wrapper of **copy_from_user**\ (). * Return * 0 on success, or a negative error in case of failure. + * + * struct bpf_mptcp_sock *bpf_mptcp_sock(struct bpf_sock *sk) + * Description + * This helper gets a **struct bpf_mptcp_sock** pointer from a + * **struct bpf_sock** pointer. + * Return + * A **struct bpf_mptcp_sock** pointer on success, or **NULL** in + * case of failure. + * */ #define __BPF_FUNC_MAPPER(FN) \ FN(unspec), \ @@ -3730,6 +3739,7 @@ union bpf_attr { FN(inode_storage_delete), \ FN(d_path), \ FN(copy_from_user), \ + FN(mptcp_sock), \ /* */ /* integer value in 'imm' field of BPF_CALL instruction selects which helper @@ -4063,6 +4073,10 @@ struct bpf_tcp_sock { __u32 is_mptcp; /* Is MPTCP subflow? */ }; +struct bpf_mptcp_sock { + __u32 token; /* msk token */ +}; + struct bpf_sock_tuple { union { struct { diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index 814bc6c1ad16..8a9ce121addd 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -392,6 +392,7 @@ static bool type_is_sk_pointer(enum bpf_reg_type type) return type == PTR_TO_SOCKET || type == PTR_TO_SOCK_COMMON || type == PTR_TO_TCP_SOCK || + type == PTR_TO_MPTCP_SOCK || type == PTR_TO_XDP_SOCK; } @@ -399,6 +400,7 @@ static bool reg_type_not_null(enum bpf_reg_type type) { return type == PTR_TO_SOCKET || type == PTR_TO_TCP_SOCK || + type == PTR_TO_MPTCP_SOCK || type == PTR_TO_MAP_VALUE || type == PTR_TO_SOCK_COMMON; } @@ -409,6 +411,7 @@ static bool reg_type_may_be_null(enum bpf_reg_type type) type == PTR_TO_SOCKET_OR_NULL || type == PTR_TO_SOCK_COMMON_OR_NULL || type == PTR_TO_TCP_SOCK_OR_NULL || + type == PTR_TO_MPTCP_SOCK_OR_NULL || type == PTR_TO_BTF_ID_OR_NULL || type == PTR_TO_MEM_OR_NULL || type == PTR_TO_RDONLY_BUF_OR_NULL || @@ -427,6 +430,8 @@ static bool reg_type_may_be_refcounted_or_null(enum bpf_reg_type type) type == PTR_TO_SOCKET_OR_NULL || type == PTR_TO_TCP_SOCK || type == PTR_TO_TCP_SOCK_OR_NULL || + type == PTR_TO_MPTCP_SOCK || + type == PTR_TO_MPTCP_SOCK_OR_NULL || type == PTR_TO_MEM || type == PTR_TO_MEM_OR_NULL; } @@ -500,6 +505,8 @@ static const char * const reg_type_str[] = { [PTR_TO_SOCK_COMMON_OR_NULL] = "sock_common_or_null", [PTR_TO_TCP_SOCK] = "tcp_sock", [PTR_TO_TCP_SOCK_OR_NULL] = "tcp_sock_or_null", + [PTR_TO_MPTCP_SOCK] = "mptcp_sock", + [PTR_TO_MPTCP_SOCK_OR_NULL] = "mptcp_sock_or_null", [PTR_TO_TP_BUFFER] = "tp_buffer", [PTR_TO_XDP_SOCK] = "xdp_sock", [PTR_TO_BTF_ID] = "ptr_", @@ -2177,6 +2184,8 @@ static bool is_spillable_regtype(enum bpf_reg_type type) case PTR_TO_SOCK_COMMON_OR_NULL: case PTR_TO_TCP_SOCK: case PTR_TO_TCP_SOCK_OR_NULL: + case PTR_TO_MPTCP_SOCK: + case PTR_TO_MPTCP_SOCK_OR_NULL: case PTR_TO_XDP_SOCK: case PTR_TO_BTF_ID: case PTR_TO_BTF_ID_OR_NULL: @@ -2786,6 +2795,9 @@ static int check_sock_access(struct bpf_verifier_env *env, int insn_idx, case PTR_TO_TCP_SOCK: valid = bpf_tcp_sock_is_valid_access(off, size, t, &info); break; + case PTR_TO_MPTCP_SOCK: + valid = bpf_mptcp_sock_is_valid_access(off, size, t, &info); + break; case PTR_TO_XDP_SOCK: valid = bpf_xdp_sock_is_valid_access(off, size, t, &info); break; @@ -2944,6 +2956,9 @@ static int check_ptr_alignment(struct bpf_verifier_env *env, case PTR_TO_TCP_SOCK: pointer_desc = "tcp_sock "; break; + case PTR_TO_MPTCP_SOCK: + pointer_desc = "mptcp_sock "; + break; case PTR_TO_XDP_SOCK: pointer_desc = "xdp_sock "; break; @@ -4997,6 +5012,10 @@ static int check_helper_call(struct bpf_verifier_env *env, int func_id, int insn mark_reg_known_zero(env, regs, BPF_REG_0); regs[BPF_REG_0].type = PTR_TO_TCP_SOCK_OR_NULL; regs[BPF_REG_0].id = ++env->id_gen; + } else if (fn->ret_type == RET_PTR_TO_MPTCP_SOCK_OR_NULL) { + mark_reg_known_zero(env, regs, BPF_REG_0); + regs[BPF_REG_0].type = PTR_TO_MPTCP_SOCK_OR_NULL; + regs[BPF_REG_0].id = ++env->id_gen; } else if (fn->ret_type == RET_PTR_TO_ALLOC_MEM_OR_NULL) { mark_reg_known_zero(env, regs, BPF_REG_0); regs[BPF_REG_0].type = PTR_TO_MEM_OR_NULL; @@ -5328,6 +5347,8 @@ static int adjust_ptr_min_max_vals(struct bpf_verifier_env *env, case PTR_TO_SOCK_COMMON_OR_NULL: case PTR_TO_TCP_SOCK: case PTR_TO_TCP_SOCK_OR_NULL: + case PTR_TO_MPTCP_SOCK: + case PTR_TO_MPTCP_SOCK_OR_NULL: case PTR_TO_XDP_SOCK: verbose(env, "R%d pointer arithmetic on %s prohibited\n", dst, reg_type_str[ptr_reg->type]); @@ -7048,6 +7069,8 @@ static void mark_ptr_or_null_reg(struct bpf_func_state *state, reg->type = PTR_TO_SOCK_COMMON; } else if (reg->type == PTR_TO_TCP_SOCK_OR_NULL) { reg->type = PTR_TO_TCP_SOCK; + } else if (reg->type == PTR_TO_MPTCP_SOCK_OR_NULL) { + reg->type = PTR_TO_MPTCP_SOCK; } else if (reg->type == PTR_TO_BTF_ID_OR_NULL) { reg->type = PTR_TO_BTF_ID; } else if (reg->type == PTR_TO_MEM_OR_NULL) { @@ -8411,6 +8434,8 @@ static bool regsafe(struct bpf_reg_state *rold, struct bpf_reg_state *rcur, case PTR_TO_SOCK_COMMON_OR_NULL: case PTR_TO_TCP_SOCK: case PTR_TO_TCP_SOCK_OR_NULL: + case PTR_TO_MPTCP_SOCK: + case PTR_TO_MPTCP_SOCK_OR_NULL: case PTR_TO_XDP_SOCK: /* Only valid matches are exact, which memcmp() above * would have accepted @@ -8938,6 +8963,8 @@ static bool reg_type_mismatch_ok(enum bpf_reg_type type) case PTR_TO_SOCK_COMMON_OR_NULL: case PTR_TO_TCP_SOCK: case PTR_TO_TCP_SOCK_OR_NULL: + case PTR_TO_MPTCP_SOCK: + case PTR_TO_MPTCP_SOCK_OR_NULL: case PTR_TO_XDP_SOCK: case PTR_TO_BTF_ID: case PTR_TO_BTF_ID_OR_NULL: @@ -10076,6 +10103,9 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env) case PTR_TO_TCP_SOCK: convert_ctx_access = bpf_tcp_sock_convert_ctx_access; break; + case PTR_TO_MPTCP_SOCK: + convert_ctx_access = bpf_mptcp_sock_convert_ctx_access; + break; case PTR_TO_XDP_SOCK: convert_ctx_access = bpf_xdp_sock_convert_ctx_access; break; diff --git a/net/core/filter.c b/net/core/filter.c index dab48528dceb..a9d964e4764e 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -6890,6 +6890,10 @@ sock_ops_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog) case BPF_FUNC_tcp_sock: return &bpf_tcp_sock_proto; #endif /* CONFIG_INET */ +#ifdef CONFIG_MPTCP + case BPF_FUNC_mptcp_sock: + return &bpf_mptcp_sock_proto; +#endif /* CONFIG_MPTCP */ default: return bpf_base_func_proto(func_id); } diff --git a/net/mptcp/Makefile b/net/mptcp/Makefile index a611968be4d7..aae2ede220ed 100644 --- a/net/mptcp/Makefile +++ b/net/mptcp/Makefile @@ -10,3 +10,5 @@ obj-$(CONFIG_INET_MPTCP_DIAG) += mptcp_diag.o mptcp_crypto_test-objs := crypto_test.o mptcp_token_test-objs := token_test.o obj-$(CONFIG_MPTCP_KUNIT_TESTS) += mptcp_crypto_test.o mptcp_token_test.o + +obj-$(CONFIG_BPF) += bpf.o diff --git a/net/mptcp/bpf.c b/net/mptcp/bpf.c new file mode 100644 index 000000000000..5332469fbb28 --- /dev/null +++ b/net/mptcp/bpf.c @@ -0,0 +1,72 @@ +// SPDX-License-Identifier: GPL-2.0 +/* Multipath TCP + * + * Copyright (c) 2020, Tessares SA. + * + * Author: Nicolas Rybowski + * + */ + +#include + +#include "protocol.h" + +bool bpf_mptcp_sock_is_valid_access(int off, int size, enum bpf_access_type type, + struct bpf_insn_access_aux *info) +{ + if (off < 0 || off >= offsetofend(struct bpf_mptcp_sock, token)) + return false; + + if (off % size != 0) + return false; + + switch (off) { + default: + return size == sizeof(__u32); + } +} + +u32 bpf_mptcp_sock_convert_ctx_access(enum bpf_access_type type, + const struct bpf_insn *si, + struct bpf_insn *insn_buf, + struct bpf_prog *prog, u32 *target_size) +{ + struct bpf_insn *insn = insn_buf; + +#define BPF_MPTCP_SOCK_GET_COMMON(FIELD) \ + do { \ + BUILD_BUG_ON(sizeof_field(struct mptcp_sock, FIELD) > \ + sizeof_field(struct bpf_mptcp_sock, FIELD)); \ + *insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(struct mptcp_sock, FIELD), \ + si->dst_reg, si->src_reg, \ + offsetof(struct mptcp_sock, FIELD)); \ + } while (0) + + if (insn > insn_buf) + return insn - insn_buf; + + switch (si->off) { + case offsetof(struct bpf_mptcp_sock, token): + BPF_MPTCP_SOCK_GET_COMMON(token); + break; + } + + return insn - insn_buf; +} + +BPF_CALL_1(bpf_mptcp_sock, struct sock *, sk) +{ + if (sk_fullsock(sk) && sk->sk_protocol == IPPROTO_TCP && sk_is_mptcp(sk)) { + struct mptcp_subflow_context *mptcp_sfc = mptcp_subflow_ctx(sk); + + return (unsigned long)mptcp_sfc->conn; + } + return (unsigned long)NULL; +} + +const struct bpf_func_proto bpf_mptcp_sock_proto = { + .func = bpf_mptcp_sock, + .gpl_only = false, + .ret_type = RET_PTR_TO_MPTCP_SOCK_OR_NULL, + .arg1_type = ARG_PTR_TO_SOCK_COMMON, +}; diff --git a/scripts/bpf_helpers_doc.py b/scripts/bpf_helpers_doc.py index 08388173973f..5c3e7aad8faa 100755 --- a/scripts/bpf_helpers_doc.py +++ b/scripts/bpf_helpers_doc.py @@ -428,6 +428,7 @@ class PrinterHelpers(Printer): 'struct tcp_request_sock', 'struct udp6_sock', 'struct task_struct', + 'struct bpf_mptcp_sock', 'struct __sk_buff', 'struct sk_msg_md', @@ -474,6 +475,7 @@ class PrinterHelpers(Printer): 'struct udp6_sock', 'struct task_struct', 'struct path', + 'struct bpf_mptcp_sock', } mapped_types = { 'u8': '__u8', diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h index 7d179eada1c3..ee7f6fd67f47 100644 --- a/tools/include/uapi/linux/bpf.h +++ b/tools/include/uapi/linux/bpf.h @@ -3579,6 +3579,15 @@ union bpf_attr { * the data in *dst*. This is a wrapper of **copy_from_user**\ (). * Return * 0 on success, or a negative error in case of failure. + * + * struct bpf_mptcp_sock *bpf_mptcp_sock(struct bpf_sock *sk) + * Description + * This helper gets a **struct bpf_mptcp_sock** pointer from a + * **struct bpf_sock** pointer. + * Return + * A **struct bpf_mptcp_sock** pointer on success, or **NULL** in + * case of failure. + * */ #define __BPF_FUNC_MAPPER(FN) \ FN(unspec), \ @@ -3730,6 +3739,7 @@ union bpf_attr { FN(inode_storage_delete), \ FN(d_path), \ FN(copy_from_user), \ + FN(mptcp_sock), \ /* */ /* integer value in 'imm' field of BPF_CALL instruction selects which helper @@ -4063,6 +4073,10 @@ struct bpf_tcp_sock { __u32 is_mptcp; /* Is MPTCP subflow? */ }; +struct bpf_mptcp_sock { + __u32 token; /* msk token */ +}; + struct bpf_sock_tuple { union { struct { From patchwork Fri Sep 11 14:30:19 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nicolas Rybowski X-Patchwork-Id: 1362481 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org (client-ip=23.128.96.18; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=tessares.net Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=tessares-net.20150623.gappssmtp.com header.i=@tessares-net.20150623.gappssmtp.com header.a=rsa-sha256 header.s=20150623 header.b=a15BUZJR; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by ozlabs.org (Postfix) with ESMTP id 4BnzSC1J58z9sSs for ; Sat, 12 Sep 2020 00:59:39 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726225AbgIKO7h (ORCPT ); Fri, 11 Sep 2020 10:59:37 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57328 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726184AbgIKO6U (ORCPT ); Fri, 11 Sep 2020 10:58:20 -0400 Received: from mail-ej1-x643.google.com (mail-ej1-x643.google.com [IPv6:2a00:1450:4864:20::643]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E45C4C061365 for ; Fri, 11 Sep 2020 07:31:16 -0700 (PDT) Received: by mail-ej1-x643.google.com with SMTP id z22so14077833ejl.7 for ; Fri, 11 Sep 2020 07:31:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=tessares-net.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=+i8R58kw/leW4I2M13LKbuHEybfMrzesYwlJiGTgtNk=; b=a15BUZJRqua56q81WMjBEDQ3iSAVjqj55mNjlT0LHm7fj8zncIWlNPf3SrHkC/eofS MUhGOCYkv34QW1doX+CMNH+ev+OYMNBtjPe9GoTZyrcxRq00tSwp53Z8f4pxs2Jw1udo Z6k1oSYZLFfsSuR/MgGY022bxzWljeVKwumPP+KmaGAHP0/7QGVkbf3SA2+C8GzGvjt5 wDsUMZSigxwNqh19nEEHq3Yn9wkEF05CIk2+FnoopbRLINo3K1qss/pCs8zx7AwKQMhX ROL8xgPAq1U794RkujJ3S5Bi024Rd92qVakoS/PyvX/Vv9XhMrkfYcv/0ovX0sczmaHf sISg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=+i8R58kw/leW4I2M13LKbuHEybfMrzesYwlJiGTgtNk=; b=rR5DtEkxWIeGeGDd7kBmtK39rdo+sdRq28uhZZrLeAxM5xLHzpJop2OuX/hSJR8p1r CyoGE7F0y5pbZ9p9mL3CZp1wpF0ZcP1kAbMC/1X/bwJq3KuvBm0NVAlCiLhXYahCLtDt x/S2poP6qfT7GdQ1SLgjJRTZwH88Dgo5JQajkiKBAA1qzc0Sp7j5V3HJwiymdkHuB9BU sDvpMbxBfcWLeDdWUVtDNaV8hAsRx9zLLgLi99IAKinHSQ7Tkt5mrfn/a4k9PSaLWRDD m8C4rsxT4EvoPJyoQW+9RNEvs+AQ3XJYkO7CRWWAQhfz+DS+ZNtwUN8dY19/S0MfvQtk mmKg== X-Gm-Message-State: AOAM533m21EABI0dNYVb5x2UU/SZZVKnF/haNBMYIDwIN/Aw94QzrnAT JJW7Z+m5pQh+WKwb1xvwP8n1VQ== X-Google-Smtp-Source: ABdhPJy1pF6nMvBpXqraqaRNOATVMTu/betL9JR2wN4Oe0zmBlv0YY5nb7O5C/sfCFUnKtFrAXQASw== X-Received: by 2002:a17:906:1b55:: with SMTP id p21mr2425207ejg.457.1599834675454; Fri, 11 Sep 2020 07:31:15 -0700 (PDT) Received: from localhost.localdomain ([87.66.33.240]) by smtp.gmail.com with ESMTPSA id y21sm1716261eju.46.2020.09.11.07.31.14 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 11 Sep 2020 07:31:14 -0700 (PDT) From: Nicolas Rybowski To: Shuah Khan , Alexei Starovoitov , Daniel Borkmann , Martin KaFai Lau , Song Liu , Yonghong Song , Andrii Nakryiko , John Fastabend , KP Singh Cc: Nicolas Rybowski , Matthieu Baerts , linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org, netdev@vger.kernel.org, bpf@vger.kernel.org Subject: [PATCH bpf-next v2 4/5] bpf: selftests: add MPTCP test base Date: Fri, 11 Sep 2020 16:30:19 +0200 Message-Id: <20200911143022.414783-4-nicolas.rybowski@tessares.net> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20200911143022.414783-1-nicolas.rybowski@tessares.net> References: <20200911143022.414783-1-nicolas.rybowski@tessares.net> MIME-Version: 1.0 Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org This patch adds a base for MPTCP specific tests. It is currently limited to the is_mptcp field in case of plain TCP connection because for the moment there is no easy way to get the subflow sk from a msk in userspace. This implies that we cannot lookup the sk_storage attached to the subflow sk in the sockops program. Acked-by: Matthieu Baerts Signed-off-by: Nicolas Rybowski Acked-by: Song Liu --- Notes: v1 -> v2: - new patch: mandatory selftests (Alexei) tools/testing/selftests/bpf/config | 1 + tools/testing/selftests/bpf/network_helpers.c | 37 +++++- tools/testing/selftests/bpf/network_helpers.h | 3 + .../testing/selftests/bpf/prog_tests/mptcp.c | 119 ++++++++++++++++++ tools/testing/selftests/bpf/progs/mptcp.c | 48 +++++++ 5 files changed, 203 insertions(+), 5 deletions(-) create mode 100644 tools/testing/selftests/bpf/prog_tests/mptcp.c create mode 100644 tools/testing/selftests/bpf/progs/mptcp.c diff --git a/tools/testing/selftests/bpf/config b/tools/testing/selftests/bpf/config index 2118e23ac07a..8377836ea976 100644 --- a/tools/testing/selftests/bpf/config +++ b/tools/testing/selftests/bpf/config @@ -39,3 +39,4 @@ CONFIG_BPF_JIT=y CONFIG_BPF_LSM=y CONFIG_SECURITY=y CONFIG_LIRC=y +CONFIG_MPTCP=y diff --git a/tools/testing/selftests/bpf/network_helpers.c b/tools/testing/selftests/bpf/network_helpers.c index 12ee40284da0..85cbb683965c 100644 --- a/tools/testing/selftests/bpf/network_helpers.c +++ b/tools/testing/selftests/bpf/network_helpers.c @@ -14,6 +14,10 @@ #include "bpf_util.h" #include "network_helpers.h" +#ifndef IPPROTO_MPTCP +#define IPPROTO_MPTCP 262 +#endif + #define clean_errno() (errno == 0 ? "None" : strerror(errno)) #define log_err(MSG, ...) ({ \ int __save = errno; \ @@ -66,8 +70,8 @@ static int settimeo(int fd, int timeout_ms) #define save_errno_close(fd) ({ int __save = errno; close(fd); errno = __save; }) -int start_server(int family, int type, const char *addr_str, __u16 port, - int timeout_ms) +static int start_server_proto(int family, int type, int protocol, + const char *addr_str, __u16 port, int timeout_ms) { struct sockaddr_storage addr = {}; socklen_t len; @@ -76,7 +80,7 @@ int start_server(int family, int type, const char *addr_str, __u16 port, if (make_sockaddr(family, addr_str, port, &addr, &len)) return -1; - fd = socket(family, type, 0); + fd = socket(family, type, protocol); if (fd < 0) { log_err("Failed to create server socket"); return -1; @@ -104,6 +108,19 @@ int start_server(int family, int type, const char *addr_str, __u16 port, return -1; } +int start_server(int family, int type, const char *addr_str, __u16 port, + int timeout_ms) +{ + return start_server_proto(family, type, 0, addr_str, port, timeout_ms); +} + +int start_mptcp_server(int family, const char *addr_str, __u16 port, + int timeout_ms) +{ + return start_server_proto(family, SOCK_STREAM, IPPROTO_MPTCP, addr_str, + port, timeout_ms); +} + int fastopen_connect(int server_fd, const char *data, unsigned int data_len, int timeout_ms) { @@ -153,7 +170,7 @@ static int connect_fd_to_addr(int fd, return 0; } -int connect_to_fd(int server_fd, int timeout_ms) +static int connect_to_fd_proto(int server_fd, int protocol, int timeout_ms) { struct sockaddr_storage addr; struct sockaddr_in *addr_in; @@ -173,7 +190,7 @@ int connect_to_fd(int server_fd, int timeout_ms) } addr_in = (struct sockaddr_in *)&addr; - fd = socket(addr_in->sin_family, type, 0); + fd = socket(addr_in->sin_family, type, protocol); if (fd < 0) { log_err("Failed to create client socket"); return -1; @@ -192,6 +209,16 @@ int connect_to_fd(int server_fd, int timeout_ms) return -1; } +int connect_to_fd(int server_fd, int timeout_ms) +{ + return connect_to_fd_proto(server_fd, 0, timeout_ms); +} + +int connect_to_mptcp_fd(int server_fd, int timeout_ms) +{ + return connect_to_fd_proto(server_fd, IPPROTO_MPTCP, timeout_ms); +} + int connect_fd_to_fd(int client_fd, int server_fd, int timeout_ms) { struct sockaddr_storage addr; diff --git a/tools/testing/selftests/bpf/network_helpers.h b/tools/testing/selftests/bpf/network_helpers.h index 7205f8afdba1..336423a789e9 100644 --- a/tools/testing/selftests/bpf/network_helpers.h +++ b/tools/testing/selftests/bpf/network_helpers.h @@ -35,7 +35,10 @@ extern struct ipv6_packet pkt_v6; int start_server(int family, int type, const char *addr, __u16 port, int timeout_ms); +int start_mptcp_server(int family, const char *addr, __u16 port, + int timeout_ms); int connect_to_fd(int server_fd, int timeout_ms); +int connect_to_mptcp_fd(int server_fd, int timeout_ms); int connect_fd_to_fd(int client_fd, int server_fd, int timeout_ms); int fastopen_connect(int server_fd, const char *data, unsigned int data_len, int timeout_ms); diff --git a/tools/testing/selftests/bpf/prog_tests/mptcp.c b/tools/testing/selftests/bpf/prog_tests/mptcp.c new file mode 100644 index 000000000000..0e65d64868e9 --- /dev/null +++ b/tools/testing/selftests/bpf/prog_tests/mptcp.c @@ -0,0 +1,119 @@ +// SPDX-License-Identifier: GPL-2.0 +#include +#include "cgroup_helpers.h" +#include "network_helpers.h" + +struct mptcp_storage { + __u32 invoked; + __u32 is_mptcp; +}; + +static int verify_sk(int map_fd, int client_fd, const char *msg, __u32 is_mptcp) +{ + int err = 0, cfd = client_fd; + struct mptcp_storage val; + + /* Currently there is no easy way to get back the subflow sk from the MPTCP + * sk, thus we cannot access here the sk_storage associated to the subflow + * sk. Also, there is no sk_storage associated with the MPTCP sk since it + * does not trigger sockops events. + * We silently pass this situation at the moment. + */ + if (is_mptcp == 1) + return 0; + + if (CHECK_FAIL(bpf_map_lookup_elem(map_fd, &cfd, &val) < 0)) { + perror("Failed to read socket storage"); + return -1; + } + + if (val.invoked != 1) { + log_err("%s: unexpected invoked count %d != %d", + msg, val.invoked, 1); + err++; + } + + if (val.is_mptcp != is_mptcp) { + log_err("%s: unexpected bpf_tcp_sock.is_mptcp %d != %d", + msg, val.is_mptcp, is_mptcp); + err++; + } + + return err; +} + +static int run_test(int cgroup_fd, int server_fd, bool is_mptcp) +{ + int client_fd, prog_fd, map_fd, err; + struct bpf_object *obj; + struct bpf_map *map; + + struct bpf_prog_load_attr attr = { + .prog_type = BPF_PROG_TYPE_SOCK_OPS, + .file = "./mptcp.o", + .expected_attach_type = BPF_CGROUP_SOCK_OPS, + }; + + err = bpf_prog_load_xattr(&attr, &obj, &prog_fd); + if (err) { + log_err("Failed to load BPF object"); + return -1; + } + + map = bpf_map__next(NULL, obj); + map_fd = bpf_map__fd(map); + + err = bpf_prog_attach(prog_fd, cgroup_fd, BPF_CGROUP_SOCK_OPS, 0); + if (err) { + log_err("Failed to attach BPF program"); + goto close_bpf_object; + } + + client_fd = is_mptcp ? connect_to_mptcp_fd(server_fd, 0) : + connect_to_fd(server_fd, 0); + if (client_fd < 0) { + err = -1; + goto close_client_fd; + } + + err += is_mptcp ? verify_sk(map_fd, client_fd, "MPTCP subflow socket", 1) : + verify_sk(map_fd, client_fd, "plain TCP socket", 0); + +close_client_fd: + close(client_fd); + +close_bpf_object: + bpf_object__close(obj); + return err; +} + +void test_mptcp(void) +{ + int server_fd, cgroup_fd; + + cgroup_fd = test__join_cgroup("/mptcp"); + if (CHECK_FAIL(cgroup_fd < 0)) + return; + + /* without MPTCP */ + server_fd = start_server(AF_INET, SOCK_STREAM, NULL, 0, 0); + if (CHECK_FAIL(server_fd < 0)) + goto with_mptcp; + + CHECK_FAIL(run_test(cgroup_fd, server_fd, false)); + + close(server_fd); + +with_mptcp: + /* with MPTCP */ + server_fd = start_mptcp_server(AF_INET, NULL, 0, 0); + if (CHECK_FAIL(server_fd < 0)) + goto close_cgroup_fd; + + CHECK_FAIL(run_test(cgroup_fd, server_fd, true)); + + close(server_fd); + +close_cgroup_fd: + close(cgroup_fd); +} diff --git a/tools/testing/selftests/bpf/progs/mptcp.c b/tools/testing/selftests/bpf/progs/mptcp.c new file mode 100644 index 000000000000..be5ee8dac2b3 --- /dev/null +++ b/tools/testing/selftests/bpf/progs/mptcp.c @@ -0,0 +1,48 @@ +// SPDX-License-Identifier: GPL-2.0 +#include +#include + +char _license[] SEC("license") = "GPL"; +__u32 _version SEC("version") = 1; + +struct mptcp_storage { + __u32 invoked; + __u32 is_mptcp; +}; + +struct { + __uint(type, BPF_MAP_TYPE_SK_STORAGE); + __uint(map_flags, BPF_F_NO_PREALLOC); + __type(key, int); + __type(value, struct mptcp_storage); +} socket_storage_map SEC(".maps"); + +SEC("sockops") +int _sockops(struct bpf_sock_ops *ctx) +{ + struct mptcp_storage *storage; + struct bpf_tcp_sock *tcp_sk; + int op = (int)ctx->op; + struct bpf_sock *sk; + + sk = ctx->sk; + if (!sk) + return 1; + + storage = bpf_sk_storage_get(&socket_storage_map, sk, 0, + BPF_SK_STORAGE_GET_F_CREATE); + if (!storage) + return 1; + + if (op != BPF_SOCK_OPS_TCP_CONNECT_CB) + return 1; + + tcp_sk = bpf_tcp_sock(sk); + if (!tcp_sk) + return 1; + + storage->invoked++; + storage->is_mptcp = tcp_sk->is_mptcp; + + return 1; +} From patchwork Fri Sep 11 14:30:20 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nicolas Rybowski X-Patchwork-Id: 1362545 X-Patchwork-Delegate: bpf@iogearbox.net Return-Path: X-Original-To: incoming-bpf@patchwork.ozlabs.org Delivered-To: patchwork-incoming-bpf@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=vger.kernel.org (client-ip=23.128.96.18; helo=vger.kernel.org; envelope-from=bpf-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=tessares.net Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=tessares-net.20150623.gappssmtp.com header.i=@tessares-net.20150623.gappssmtp.com header.a=rsa-sha256 header.s=20150623 header.b=mq0+Z2ot; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by ozlabs.org (Postfix) with ESMTP id 4Bp1n45GmPz9sTS for ; Sat, 12 Sep 2020 02:44:24 +1000 (AEST) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726389AbgIKQoQ (ORCPT ); Fri, 11 Sep 2020 12:44:16 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:58720 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726388AbgIKPIW (ORCPT ); Fri, 11 Sep 2020 11:08:22 -0400 Received: from mail-ej1-x644.google.com (mail-ej1-x644.google.com [IPv6:2a00:1450:4864:20::644]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 96F1CC06136A for ; Fri, 11 Sep 2020 07:31:21 -0700 (PDT) Received: by mail-ej1-x644.google.com with SMTP id z22so14078184ejl.7 for ; Fri, 11 Sep 2020 07:31:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=tessares-net.20150623.gappssmtp.com; s=20150623; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=AifmCwlGQkYNJWlBnZAsy/4oYCtlJh2CI6z+GloOpvw=; b=mq0+Z2otTHI4q6BWKiid1j5pubgJEivA198+nK7hLJEC/gmULm+cPFKIChKQ1QIXXT VfP69f6jo2c/djHc4Ptm9Vmz9NLArYKToVGHHKzKzYpLIT6v1Xlcfr+zEpNWLebmfasV A1V1xIpjsS+9MtmrnQcWeJJTNfPtm44TjzXo2ZmNM94e9OpmE1JnuWO/kAEO1itG+EO/ r8Nd2bYlLPI2BzGI9BuQpL9OhAx9rOgDDVlQ5Fgp2tNYKo9cR2b5tqHGBIoX54XR1lpH Y5aoqaILSDiOPn7a7zj9IB6LIepvIcStRyCkLZRoiqiLpI11U+VfTGIaWKzHgPd/YgT2 d2Hg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=AifmCwlGQkYNJWlBnZAsy/4oYCtlJh2CI6z+GloOpvw=; b=pfjq1xrUK2+bvcZY3fcP6YQHpy95ooSkHeVQVJLjjGQYRy4oXJEI+VEFhOTjhAPnz9 7TXS56/7l2C+BenEvkj2MTrqyi8ba5gDiPa2xUpl5tbKdBoqRxiUDwHPTm3ZEMyJeaH1 VUlehLqO6lVCCXOOwUlLGJLJyzwk7I1BMr2yLc2+cL6pjWGX4gsh6Az0mKd9nmq1LBpm lfenJSEnydcDaLbO2upiOl1Dt+akkgJ1/O01vJV3AACFNgLgduEFUCfeBct8ubOPgWDK nFb5FGELjDg4muMrrXU5CrlblE39Ot3GKBuamAkLo5AfHMsaKBZxtDxMuozoA+GJIlIb Zs+g== X-Gm-Message-State: AOAM531xg5DN81B2a5cljcOFHKI+gBcSZ0EPbWKUlYijnd590171pavY two68B6Tk6VuJ3OOqtD74OKwSg== X-Google-Smtp-Source: ABdhPJzugQqj4flmudtLUAs/WO1IQQCcBDyfUkw8ekBI2jy8c9MqX9LqDLSC4uGs7wvGyrfUUU2uUA== X-Received: by 2002:a17:906:4951:: with SMTP id f17mr2295162ejt.29.1599834680288; Fri, 11 Sep 2020 07:31:20 -0700 (PDT) Received: from localhost.localdomain ([87.66.33.240]) by smtp.gmail.com with ESMTPSA id y21sm1716261eju.46.2020.09.11.07.31.19 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 11 Sep 2020 07:31:19 -0700 (PDT) From: Nicolas Rybowski To: Shuah Khan , Alexei Starovoitov , Daniel Borkmann , Martin KaFai Lau , Song Liu , Yonghong Song , Andrii Nakryiko , John Fastabend , KP Singh Cc: Nicolas Rybowski , Matthieu Baerts , linux-kselftest@vger.kernel.org, netdev@vger.kernel.org, bpf@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH bpf-next v2 5/5] bpf: selftests: add bpf_mptcp_sock() verifier tests Date: Fri, 11 Sep 2020 16:30:20 +0200 Message-Id: <20200911143022.414783-5-nicolas.rybowski@tessares.net> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20200911143022.414783-1-nicolas.rybowski@tessares.net> References: <20200911143022.414783-1-nicolas.rybowski@tessares.net> MIME-Version: 1.0 Sender: bpf-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org This patch adds verifier side tests for the new bpf_mptcp_sock() helper. Here are the new tests : - NULL bpf_sock is correctly handled - We cannot access a field from bpf_mptcp_sock if the latter is NULL - We can access a field from bpf_mptcp_sock if the latter is not NULL - We cannot modify a field from bpf_mptcp_sock. Note that "token" is currently the only field in bpf_mptcp_sock. Currently, there is no easy way to test the token field since we cannot get back the mptcp_sock in userspace, this could be a future amelioration. Acked-by: Matthieu Baerts Signed-off-by: Nicolas Rybowski Acked-by: Song Liu --- Notes: v1 -> v2: - new patch: mandatory selftests (Alexei) tools/testing/selftests/bpf/verifier/sock.c | 63 +++++++++++++++++++++ 1 file changed, 63 insertions(+) diff --git a/tools/testing/selftests/bpf/verifier/sock.c b/tools/testing/selftests/bpf/verifier/sock.c index b1aac2641498..9ce7c7ec3b5e 100644 --- a/tools/testing/selftests/bpf/verifier/sock.c +++ b/tools/testing/selftests/bpf/verifier/sock.c @@ -631,3 +631,66 @@ .prog_type = BPF_PROG_TYPE_SK_REUSEPORT, .result = ACCEPT, }, +{ + "bpf_mptcp_sock(skops->sk): no !skops->sk check", + .insns = { + BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_1, offsetof(struct bpf_sock_ops, sk)), + BPF_EMIT_CALL(BPF_FUNC_mptcp_sock), + BPF_MOV64_IMM(BPF_REG_0, 0), + BPF_EXIT_INSN(), + }, + .prog_type = BPF_PROG_TYPE_SOCK_OPS, + .result = REJECT, + .errstr = "type=sock_or_null expected=sock_common", +}, +{ + "bpf_mptcp_sock(skops->sk): no NULL check on ret", + .insns = { + BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_1, offsetof(struct bpf_sock_ops, sk)), + BPF_JMP_IMM(BPF_JNE, BPF_REG_1, 0, 2), + BPF_MOV64_IMM(BPF_REG_0, 0), + BPF_EXIT_INSN(), + BPF_EMIT_CALL(BPF_FUNC_mptcp_sock), + BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_0, offsetof(struct bpf_mptcp_sock, token)), + BPF_MOV64_IMM(BPF_REG_0, 0), + BPF_EXIT_INSN(), + }, + .prog_type = BPF_PROG_TYPE_SOCK_OPS, + .result = REJECT, + .errstr = "invalid mem access 'mptcp_sock_or_null'", +}, +{ + "bpf_mptcp_sock(skops->sk): msk->token", + .insns = { + BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_1, offsetof(struct bpf_sock_ops, sk)), + BPF_JMP_IMM(BPF_JNE, BPF_REG_1, 0, 2), + BPF_MOV64_IMM(BPF_REG_0, 0), + BPF_EXIT_INSN(), + BPF_EMIT_CALL(BPF_FUNC_mptcp_sock), + BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 1), + BPF_EXIT_INSN(), + BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_0, offsetof(struct bpf_mptcp_sock, token)), + BPF_MOV64_IMM(BPF_REG_0, 0), + BPF_EXIT_INSN(), + }, + .prog_type = BPF_PROG_TYPE_SOCK_OPS, + .result = ACCEPT, +}, +{ + "bpf_mptcp_sock(skops->sk): msk->token cannot be modified", + .insns = { + BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_1, offsetof(struct bpf_sock_ops, sk)), + BPF_JMP_IMM(BPF_JNE, BPF_REG_1, 0, 2), + BPF_MOV64_IMM(BPF_REG_0, 0), + BPF_EXIT_INSN(), + BPF_EMIT_CALL(BPF_FUNC_mptcp_sock), + BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 1), + BPF_EXIT_INSN(), + BPF_ST_MEM(BPF_W, BPF_REG_0, offsetof(struct bpf_mptcp_sock, token), 0x2a), + BPF_MOV64_IMM(BPF_REG_0, 0), + BPF_EXIT_INSN(), + }, + .prog_type = BPF_PROG_TYPE_SOCK_OPS, + .result = REJECT, + .errstr = "cannot write into mptcp_sock", +},