From patchwork Tue Aug 13 23:26:26 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Vinicius Peixoto
X-Patchwork-Id: 1972147
From: Vinicius Peixoto
To: kernel-team@lists.ubuntu.com
Subject: [SRU][jammy:linux-gcp][PATCH 1/3] tcp: derive delack_max from rto_min
Date: Tue, 13 Aug 2024 20:26:26 -0300
Message-ID: <20240813232628.408515-2-vinicius.peixoto@canonical.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240813232628.408515-1-vinicius.peixoto@canonical.com>
References: <20240813232628.408515-1-vinicius.peixoto@canonical.com>

From: Eric Dumazet

While BPF allows setting icsk->icsk_delack_max
and/or icsk->icsk_rto_min, we have an ip route
attribute (RTAX_RTO_MIN) to be able to tune rto_min,
but nothing to consequently adjust max delayed ack,
which varies from 40 ms to 200 ms (TCP_DELACK_{MIN|MAX}).

This makes RTAX_RTO_MIN of almost no practical use,
unless customers are in big trouble.

Modern-day datacenter communications want to set
rto_min to ~5 ms, and the max delayed ack one jiffy
smaller to avoid spurious retransmits.

After this patch, an "rto_min 5" route attribute will
effectively lower max delayed ack timers to 4 ms.

Note in the following ss output, "rto:6 ... ato:4"

$ ss -temoi dst XXXXXX
State Recv-Q Send-Q Local Address:Port Peer Address:Port Process
ESTAB 0 0 [2002:a05:6608:295::]:52950 [2002:a05:6608:297::]:41597
  ino:255134 sk:1001 <->
  skmem:(r0,rb1707063,t872,tb262144,f0,w0,o0,bl0,d0) ts sack
  cubic wscale:8,8 rto:6 rtt:0.02/0.002 ato:4 mss:4096 pmtu:4500
  rcvmss:536 advmss:4096 cwnd:10 bytes_sent:54823160 bytes_acked:54823121
  bytes_received:54823120 segs_out:1370582 segs_in:1370580
  data_segs_out:1370579 data_segs_in:1370578 send 16.4Gbps
  pacing_rate 32.6Gbps delivery_rate 1.72Gbps delivered:1370579
  busy:26920ms unacked:1 rcv_rtt:34.615 rcv_space:65920
  rcv_ssthresh:65535 minrtt:0.015 snd_wnd:65536

While we could argue this patch fixes a bug with RTAX_RTO_MIN,
I do not add a Fixes: tag, so that we can soak it a bit before
asking for backports to stable branches.

Signed-off-by: Eric Dumazet
Acked-by: Soheil Hassas Yeganeh
Acked-by: Neal Cardwell
Signed-off-by: David S. Miller
(cherry-picked from commit bbf80d713fe75cfbecda26e7c03a9a8d22af2f4f)
Signed-off-by: Vinicius Peixoto
---
 include/net/tcp.h     |  2 ++
 net/ipv4/tcp.c        |  3 ++-
 net/ipv4/tcp_output.c | 16 +++++++++++++++-
 3 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index 08923ed4278f..204e8e91fd19 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -708,6 +708,8 @@ static inline void tcp_fast_path_check(struct sock *sk)
 		tcp_fast_path_on(tp);
 }
 
+u32 tcp_delack_max(const struct sock *sk);
+
 /* Compute the actual rto_min value */
 static inline u32 tcp_rto_min(struct sock *sk)
 {
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index e348b69ef0f5..95398c10086c 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -3793,7 +3793,8 @@ void tcp_get_info(struct sock *sk, struct tcp_info *info)
 		info->tcpi_options |= TCPI_OPT_SYN_DATA;
 
 	info->tcpi_rto = jiffies_to_usecs(icsk->icsk_rto);
-	info->tcpi_ato = jiffies_to_usecs(icsk->icsk_ack.ato);
+	info->tcpi_ato = jiffies_to_usecs(min(icsk->icsk_ack.ato,
+					      tcp_delack_max(sk)));
 	info->tcpi_snd_mss = tp->mss_cache;
 	info->tcpi_rcv_mss = icsk->icsk_ack.rcv_mss;
 
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 0fb84e57a2d4..5ccced7e72d1 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -3911,6 +3911,20 @@ int tcp_connect(struct sock *sk)
 }
 EXPORT_SYMBOL(tcp_connect);
 
+u32 tcp_delack_max(const struct sock *sk)
+{
+	const struct dst_entry *dst = __sk_dst_get(sk);
+	u32 delack_max = inet_csk(sk)->icsk_delack_max;
+
+	if (dst && dst_metric_locked(dst, RTAX_RTO_MIN)) {
+		u32 rto_min = dst_metric_rtt(dst, RTAX_RTO_MIN);
+		u32 delack_from_rto_min = max_t(int, 1, rto_min - 1);
+
+		delack_max = min_t(u32, delack_max, delack_from_rto_min);
+	}
+	return delack_max;
+}
+
 /* Send out a delayed ack, the caller does the policy checking
  * to see if we should even be here.  See tcp_input.c:tcp_ack_snd_check()
  * for details.
@@ -3946,7 +3960,7 @@ void tcp_send_delayed_ack(struct sock *sk)
 		ato = min(ato, max_ato);
 	}
 
-	ato = min_t(u32, ato, inet_csk(sk)->icsk_delack_max);
+	ato = min_t(u32, ato, tcp_delack_max(sk));
 
 	/* Stay within the limit we were given */
 	timeout = jiffies + ato;

From patchwork Tue Aug 13 23:26:27 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Vinicius Peixoto
X-Patchwork-Id: 1972148
From: Vinicius Peixoto
To: kernel-team@lists.ubuntu.com
Subject: [SRU][jammy:linux-gcp][PATCH 2/3] tcp: derive delack_max with tcp_rto_min helper
Date: Tue, 13 Aug 2024 20:26:27 -0300
Message-ID: <20240813232628.408515-3-vinicius.peixoto@canonical.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240813232628.408515-1-vinicius.peixoto@canonical.com>
References: <20240813232628.408515-1-vinicius.peixoto@canonical.com>

From: Kevin Yang

Rto_min now has multiple sources, ordered by precedence from high
to low: ip route option rto_min, icsk->icsk_rto_min.

When deriving delack_max from rto_min, we should not only use the
ip route option, but should use the tcp_rto_min helper to get the
correct rto_min.

Signed-off-by: Kevin Yang
Reviewed-by: Neal Cardwell
Reviewed-by: Yuchung Cheng
Reviewed-by: Eric Dumazet
Reviewed-by: Tony Lu
Reviewed-by: Jakub Kicinski
Signed-off-by: David S. Miller
(cherry-picked from commit 18fd64d2542292713b0322e6815be059bdee440c)
Signed-off-by: Vinicius Peixoto
---
 net/ipv4/tcp_output.c | 11 ++---------
 1 file changed, 2 insertions(+), 9 deletions(-)

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 5ccced7e72d1..6fb5629c1db3 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -3913,16 +3913,9 @@ EXPORT_SYMBOL(tcp_connect);
 
 u32 tcp_delack_max(const struct sock *sk)
 {
-	const struct dst_entry *dst = __sk_dst_get(sk);
-	u32 delack_max = inet_csk(sk)->icsk_delack_max;
-
-	if (dst && dst_metric_locked(dst, RTAX_RTO_MIN)) {
-		u32 rto_min = dst_metric_rtt(dst, RTAX_RTO_MIN);
-		u32 delack_from_rto_min = max_t(int, 1, rto_min - 1);
+	u32 delack_from_rto_min = max(tcp_rto_min(sk), 2) - 1;
 
-		delack_max = min_t(u32, delack_max, delack_from_rto_min);
-	}
-	return delack_max;
+	return min(inet_csk(sk)->icsk_delack_max, delack_from_rto_min);
 }
 
 /* Send out a delayed ack, the caller does the policy checking

From patchwork Tue Aug 13 23:26:28 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Vinicius Peixoto
X-Patchwork-Id: 1972149
From: Vinicius Peixoto
To: kernel-team@lists.ubuntu.com
Subject: [SRU][jammy:linux-gcp][PATCH 3/3] tcp: add sysctl_tcp_rto_min_us
Date: Tue, 13 Aug 2024 20:26:28 -0300
Message-ID: <20240813232628.408515-4-vinicius.peixoto@canonical.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240813232628.408515-1-vinicius.peixoto@canonical.com>
References: <20240813232628.408515-1-vinicius.peixoto@canonical.com>

From: Kevin Yang

Add a sysctl knob to allow the user to specify a default
rto_min at socket init time, rather than using the hard-coded
200 ms default rto_min.

Note that the rto_min route option has the highest precedence
for configuring this setting, followed by the TCP_BPF_RTO_MIN
socket option, followed by the tcp_rto_min_us sysctl.

Signed-off-by: Kevin Yang
Reviewed-by: Neal Cardwell
Reviewed-by: Yuchung Cheng
Reviewed-by: Eric Dumazet
Reviewed-by: Tony Lu
Reviewed-by: Jakub Kicinski
Signed-off-by: David S. Miller
(backported from commit f086edef71be7174a16c1ed67ac65a085cda28b1)
[vpeixoto: fixed conflicts in include/net/netns/ipv4.h due to missing
commits 18fd64d25422 ("netns-ipv4: reorganize netns_ipv4 fast path
variables"), and 1c106eb01cee ("net: ipv{6,4}: Remove the now
superfluous sentinel elements from ctl_table array"), as well as
context conflicts in net/ipv4/tcp_ipv4.c due to missing commits adding
other unrelated TCP sysctls.]
Signed-off-by: Vinicius Peixoto
---
 Documentation/networking/ip-sysctl.rst | 13 +++++++++++++
 include/net/netns/ipv4.h               |  1 +
 net/ipv4/sysctl_net_ipv4.c             |  8 ++++++++
 net/ipv4/tcp.c                         |  4 +++-
 net/ipv4/tcp_ipv4.c                    |  2 ++
 5 files changed, 27 insertions(+), 1 deletion(-)

diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst
index 7f75767a24f1..9c931bb5fdc3 100644
--- a/Documentation/networking/ip-sysctl.rst
+++ b/Documentation/networking/ip-sysctl.rst
@@ -999,6 +999,19 @@ tcp_rx_skb_cache - BOOLEAN
 
 	Default: 0 (disabled)
 
+tcp_rto_min_us - INTEGER
+	Minimal TCP retransmission timeout (in microseconds). Note that the
+	rto_min route option has the highest precedence for configuring this
+	setting, followed by the TCP_BPF_RTO_MIN socket option, followed by
+	this tcp_rto_min_us sysctl.
+
+	The recommended practice is to use a value less or equal to 200000
+	microseconds.
+
+	Possible Values: 1 - INT_MAX
+
+	Default: 200000
+
 UDP variables
 =============
 
diff --git a/include/net/netns/ipv4.h b/include/net/netns/ipv4.h
index d60a10cfc382..d35337cb7b87 100644
--- a/include/net/netns/ipv4.h
+++ b/include/net/netns/ipv4.h
@@ -137,6 +137,7 @@ struct netns_ipv4 {
 	u8 sysctl_tcp_window_scaling;
 	u8 sysctl_tcp_timestamps;
 	u8 sysctl_tcp_early_retrans;
+	int sysctl_tcp_rto_min_us;
 	u8 sysctl_tcp_recovery;
 	u8 sysctl_tcp_thin_linear_timeouts;
 	u8 sysctl_tcp_slow_start_after_idle;
diff --git a/net/ipv4/sysctl_net_ipv4.c b/net/ipv4/sysctl_net_ipv4.c
index 1f22e72074fd..f3df0a77a27d 100644
--- a/net/ipv4/sysctl_net_ipv4.c
+++ b/net/ipv4/sysctl_net_ipv4.c
@@ -1362,6 +1362,14 @@ static struct ctl_table ipv4_net_table[] = {
 		.extra1		= SYSCTL_ZERO,
 		.extra2		= &two,
 	},
+	{
+		.procname	= "tcp_rto_min_us",
+		.data		= &init_net.ipv4.sysctl_tcp_rto_min_us,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec_minmax,
+		.extra1		= SYSCTL_ONE,
+	},
 	{ }
 };
 
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 95398c10086c..e709031df533 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -417,6 +417,7 @@ void tcp_init_sock(struct sock *sk)
 {
 	struct inet_connection_sock *icsk = inet_csk(sk);
 	struct tcp_sock *tp = tcp_sk(sk);
+	int rto_min_us;
 
 	tp->out_of_order_queue = RB_ROOT;
 	sk->tcp_rtx_queue = RB_ROOT;
@@ -425,7 +426,8 @@ void tcp_init_sock(struct sock *sk)
 	INIT_LIST_HEAD(&tp->tsorted_sent_queue);
 
 	icsk->icsk_rto = TCP_TIMEOUT_INIT;
-	icsk->icsk_rto_min = TCP_RTO_MIN;
+	rto_min_us = READ_ONCE(sock_net(sk)->ipv4.sysctl_tcp_rto_min_us);
+	icsk->icsk_rto_min = usecs_to_jiffies(rto_min_us);
 	icsk->icsk_delack_max = TCP_DELACK_MAX;
 	tp->mdev_us = jiffies_to_usecs(TCP_TIMEOUT_INIT);
 	minmax_reset(&tp->rtt_min, tcp_jiffies32, ~0U);
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index e162bed1916a..37f017d6d82f 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -3217,6 +3217,8 @@ static int __net_init tcp_sk_init(struct net *net)
 	else
 		net->ipv4.tcp_congestion_control = &tcp_reno;
 
+	net->ipv4.sysctl_tcp_rto_min_us = jiffies_to_usecs(TCP_RTO_MIN);
+
 	return 0;
 }
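
Illustrative note (not part of the patches): the net effect of the series
on the delayed-ACK ceiling can be modelled in userspace. The sketch below
is a rough model under stated assumptions, not the kernel code. The struct
and the model_* names are hypothetical; all values are in jiffies; and it
assumes tcp_rto_min() prefers a locked route rto_min over
icsk->icsk_rto_min, which is the precedence the commit messages describe.
With HZ=1000 (an assumption), one jiffy is 1 ms.

```c
#include <stdint.h>

/* Hypothetical userspace stand-in for the socket fields involved.
 * All durations are in jiffies.
 */
struct model_sock {
	int      route_rto_min_locked; /* nonzero: route carries a locked rto_min */
	uint32_t route_rto_min;        /* rto_min taken from the route */
	uint32_t icsk_rto_min;         /* per-socket value (sysctl default or BPF) */
	uint32_t icsk_delack_max;      /* per-socket delayed-ACK ceiling */
};

/* Models the precedence described in the series: the route option wins,
 * otherwise the per-socket value is used.
 */
static uint32_t model_rto_min(const struct model_sock *sk)
{
	return sk->route_rto_min_locked ? sk->route_rto_min : sk->icsk_rto_min;
}

/* Models the expression from patch 2/3:
 * min(icsk_delack_max, max(rto_min, 2) - 1)
 */
static uint32_t model_delack_max(const struct model_sock *sk)
{
	uint32_t rto_min = model_rto_min(sk);
	uint32_t from_rto_min = (rto_min > 2 ? rto_min : 2) - 1;

	return sk->icsk_delack_max < from_rto_min ? sk->icsk_delack_max
						  : from_rto_min;
}
```

Under these assumptions, a socket with a locked route rto_min of 5 and an
icsk_delack_max of 200 gets a ceiling of 4 jiffies, matching the
"rto_min 5" to 4 ms example in patch 1/3. Clamping rto_min to at least 2
before subtracting keeps the result from reaching zero.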