From patchwork Thu Nov 19 21:23:51 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alexander Duyck
X-Patchwork-Id: 1403282
Subject: [net PATCH 1/2] tcp: Allow full IP tos/IPv6 tclass to be reflected in L3 header
From: Alexander Duyck
To: netdev@vger.kernel.org
Cc: bpf@vger.kernel.org, daniel@iogearbox.net, kafai@fb.com, kernel-team@fb.com, edumazet@google.com, brakmo@fb.com, alexanderduyck@fb.com, weiwan@google.com
Date: Thu, 19 Nov 2020 13:23:51 -0800
Message-ID: <160582103106.66684.9841738004971200231.stgit@localhost.localdomain>
In-Reply-To: <160582070138.66684.11785214534154816097.stgit@localhost.localdomain>
References: <160582070138.66684.11785214534154816097.stgit@localhost.localdomain>
User-Agent: StGit/0.23
MIME-Version: 1.0
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org

From: Alexander Duyck

An issue was recently found where DCTCP SYN/ACK packets did not have the
ECT bit set in the L3 header. A bit of code review found that the recent
change referenced below had gone through and added a mask that prevented
the ECN bits from being populated in the L3 header.

This patch addresses that by rolling back the mask so that it is only
applied to the flags coming from the incoming TCP request instead of
applying it to the socket tos/tclass field. With this change the ECT bits
were restored in the SYN/ACK packets in my testing.

One thing that is not addressed by this patch set is the fact that
tcp_reflect_tos appears to be incompatible with ECN based congestion
avoidance algorithms. At a minimum the feature should likely be
documented, which it currently isn't.

Fixes: ac8f1710c12b ("tcp: reflect tos value received in SYN to the socket")
Signed-off-by: Alexander Duyck
Acked-by: Wei Wang
---
 net/ipv4/tcp_ipv4.c |    5 +++--
 net/ipv6/tcp_ipv6.c |    6 +++---
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index c2d5132c523c..c5f8b686aa82 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -981,7 +981,8 @@ static int tcp_v4_send_synack(const struct sock *sk, struct dst_entry *dst,
 	skb = tcp_make_synack(sk, dst, req, foc, synack_type, syn_skb);
 
 	tos = sock_net(sk)->ipv4.sysctl_tcp_reflect_tos ?
-			tcp_rsk(req)->syn_tos : inet_sk(sk)->tos;
+			tcp_rsk(req)->syn_tos & ~INET_ECN_MASK :
+			inet_sk(sk)->tos;
 
 	if (skb) {
 		__tcp_v4_send_check(skb, ireq->ir_loc_addr, ireq->ir_rmt_addr);
@@ -990,7 +991,7 @@ static int tcp_v4_send_synack(const struct sock *sk, struct dst_entry *dst,
 		err = ip_build_and_send_pkt(skb, sk, ireq->ir_loc_addr,
 					    ireq->ir_rmt_addr,
 					    rcu_dereference(ireq->ireq_opt),
-					    tos & ~INET_ECN_MASK);
+					    tos);
 		rcu_read_unlock();
 		err = net_xmit_eval(err);
 	}
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 8db59f4e5f13..3d49e8d0afee 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -530,12 +530,12 @@ static int tcp_v6_send_synack(const struct sock *sk, struct dst_entry *dst,
 		rcu_read_lock();
 		opt = ireq->ipv6_opt;
 		tclass = sock_net(sk)->ipv4.sysctl_tcp_reflect_tos ?
-				tcp_rsk(req)->syn_tos : np->tclass;
+				tcp_rsk(req)->syn_tos & ~INET_ECN_MASK :
+				np->tclass;
 		if (!opt)
 			opt = rcu_dereference(np->opt);
 		err = ip6_xmit(sk, skb, fl6, sk->sk_mark, opt,
-			       tclass & ~INET_ECN_MASK,
-			       sk->sk_priority);
+			       tclass, sk->sk_priority);
 		rcu_read_unlock();
 		err = net_xmit_eval(err);
 	}

From patchwork Thu Nov 19 21:23:58 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Alexander Duyck
X-Patchwork-Id: 1403283
Subject: [net PATCH 2/2] tcp: Set INET_ECN_xmit configuration in tcp_reinit_congestion_control
From: Alexander Duyck
To: netdev@vger.kernel.org
Cc: bpf@vger.kernel.org, daniel@iogearbox.net, kafai@fb.com, kernel-team@fb.com, edumazet@google.com, brakmo@fb.com, alexanderduyck@fb.com, weiwan@google.com
Date: Thu, 19 Nov 2020 13:23:58 -0800
Message-ID: <160582103862.66684.1529849392380485857.stgit@localhost.localdomain>
In-Reply-To: <160582070138.66684.11785214534154816097.stgit@localhost.localdomain>
References: <160582070138.66684.11785214534154816097.stgit@localhost.localdomain>
User-Agent: StGit/0.23
MIME-Version: 1.0
Precedence: bulk
X-Mailing-List: netdev@vger.kernel.org

From: Alexander Duyck

When setting congestion control via a BPF program it is seen that the
SYN/ACK for packets within a given flow will not include the ECT0 flag. A
bit of simple printk debugging shows that when this is configured without
BPF we see the INET_ECN_xmit value initialized in
tcp_assign_congestion_control; however, when we configure this via BPF the
socket is in the closed state and as such it isn't configured, and I do not
see it being initialized when we transition the socket into the listen
state. The result of this is that the ECT0 bit is configured based on
whatever the default state is for the socket.

An easy way to reproduce this is to monitor the following with tcpdump:

	tools/testing/selftests/bpf/test_progs -t bpf_tcp_ca

Without this patch the SYN/ACK will follow whatever the default is. If the
default is dctcp, all SYN/ACK packets will have the ECT0 bit set, and if it
is not then ECT0 will be cleared on all SYN/ACK packets. With this patch
applied the ECT0 bit in the SYN/ACK matches the value seen on the other
packets in the given stream.

Fixes: 91b5b21c7c16 ("bpf: Add support for changing congestion control")
Signed-off-by: Alexander Duyck
---
 net/ipv4/tcp_cong.c |    5 +++++
 1 file changed, 5 insertions(+)

diff --git a/net/ipv4/tcp_cong.c b/net/ipv4/tcp_cong.c
index db47ac24d057..563d016e7478 100644
--- a/net/ipv4/tcp_cong.c
+++ b/net/ipv4/tcp_cong.c
@@ -198,6 +198,11 @@ static void tcp_reinit_congestion_control(struct sock *sk,
 	icsk->icsk_ca_setsockopt = 1;
 	memset(icsk->icsk_ca_priv, 0, sizeof(icsk->icsk_ca_priv));
 
+	if (ca->flags & TCP_CONG_NEEDS_ECN)
+		INET_ECN_xmit(sk);
+	else
+		INET_ECN_dontxmit(sk);
+
 	if (!((1 << sk->sk_state) & (TCPF_CLOSE | TCPF_LISTEN)))
 		tcp_init_congestion_control(sk);
 }
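
The bit arithmetic behind both patches can be checked in userspace. The
following is only a rough sketch, not kernel code: the MODEL_ constants are
assumed to mirror the values of INET_ECN_ECT_0, INET_ECN_MASK and
TCP_CONG_NEEDS_ECN, and every model_ function name is hypothetical. It
compares the SYN/ACK tos selection before and after patch 1, and models the
ECT0 choice that patch 2 adds to tcp_reinit_congestion_control.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed to match the kernel values; the MODEL_ names are hypothetical. */
#define MODEL_ECN_ECT_0      0x02
#define MODEL_ECN_MASK       0x03
#define MODEL_CONG_NEEDS_ECN 0x2

/* Before patch 1: the ECN bits are masked off whichever source is used. */
static inline uint8_t model_synack_tos_old(int reflect_tos, uint8_t syn_tos,
					   uint8_t sk_tos)
{
	uint8_t tos = reflect_tos ? syn_tos : sk_tos;

	return (uint8_t)(tos & ~MODEL_ECN_MASK);
}

/* After patch 1: only the value reflected from the SYN is masked. */
static inline uint8_t model_synack_tos_new(int reflect_tos, uint8_t syn_tos,
					   uint8_t sk_tos)
{
	return reflect_tos ? (uint8_t)(syn_tos & ~MODEL_ECN_MASK) : sk_tos;
}

/* Patch 2: set ECT0 if the congestion control needs ECN, else clear ECN. */
static inline uint8_t model_reinit_ecn(uint8_t sk_tos, unsigned int ca_flags)
{
	if (ca_flags & MODEL_CONG_NEEDS_ECN)
		return (uint8_t)(sk_tos | MODEL_ECN_ECT_0);
	return (uint8_t)(sk_tos & ~MODEL_ECN_MASK);
}

/* Small self-check using a socket tos that carries only ECT0. */
static inline void model_selfcheck(void)
{
	/* Reflection off: the old code drops ECT0, the new code keeps it. */
	assert(model_synack_tos_old(0, 0xBA, 0x02) == 0x00);
	assert(model_synack_tos_new(0, 0xBA, 0x02) == 0x02);
	/* Reflection on: DSCP is copied from the SYN, ECN bits are not. */
	assert(model_synack_tos_new(1, 0xBA, 0x02) == 0xB8);
	/* ECN-based congestion control sets ECT0, others clear it. */
	assert(model_reinit_ecn(0x00, MODEL_CONG_NEEDS_ECN) == 0x02);
	assert(model_reinit_ecn(0x02, 0) == 0x00);
}
```

Calling model_selfcheck() from a test program exercises all three cases. The
tos values 0xBA and 0x02 are arbitrary illustrations and are not taken from
the testing described in the patches.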