[net-next] tcp: apply a floor of 1 for RTT samples from TCP timestamps

Message ID 20200730234916.2708735-1-jfwang@google.com
State Accepted
Delegated to: David Miller

Commit Message

Jianfeng Wang July 30, 2020, 11:49 p.m. UTC
For retransmitted packets, TCP needs to resort to using TCP timestamps
for computing RTT samples. In the common case where the data and ACK
fall in the same 1-millisecond interval, TCP senders with millisecond-
granularity TCP timestamps compute a ca_rtt_us of 0. This ca_rtt_us
of 0 propagates to rs->rtt_us.
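
A minimal userspace sketch of how the zero sample arises, assuming a 1000 Hz timestamp clock. The helper names (usec_to_ts_tick, ts_rtt_us_unfloored) and the TS_HZ macro are hypothetical and only illustrate the arithmetic; they are not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

#define USEC_PER_SEC 1000000u
#define TS_HZ        1000u   /* assumption: millisecond-granularity timestamps */

/* Hypothetical helper: map a microsecond clock reading to a timestamp tick. */
static uint32_t usec_to_ts_tick(uint64_t now_us)
{
	return (uint32_t)(now_us / (USEC_PER_SEC / TS_HZ));
}

/* Hypothetical helper: RTT sample in usec from the tick difference, no floor. */
static uint32_t ts_rtt_us_unfloored(uint64_t send_us, uint64_t ack_us)
{
	uint32_t delta = usec_to_ts_tick(ack_us) - usec_to_ts_tick(send_us);

	return delta * (USEC_PER_SEC / TS_HZ);
}

/* Data sent at 5000100 us and ACKed at 5000900 us share tick 5000,
 * so the sample is 0 even though 800 us really elapsed.
 */
static void same_tick_example(void)
{
	assert(ts_rtt_us_unfloored(5000100, 5000900) == 0);
}
```

With these assumptions, any data/ACK pair that lands inside one tick yields a sample of 0, regardless of the real elapsed time within that tick.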

This value of 0 can cause performance problems for congestion control
modules. For example, in BBR, a zero min_rtt sample can bring the
min_rtt and BDP estimate down to 0, reduce snd_cwnd, and result in
low throughput. It would be hard to mitigate this with filtering in
the congestion control module, because the proper floor to apply would
depend on the method of RTT sampling (using timestamp options or
internally-saved transmission timestamps).
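
To make the BDP effect concrete, here is a simplified sketch that computes BDP as bandwidth times min_rtt. The function bdp_bytes is hypothetical and is not BBR's actual implementation, which uses its own scaled units; only the multiplication by min_rtt matters here.

```c
#include <assert.h>
#include <stdint.h>

#define USEC_PER_SEC 1000000ULL

/* Hypothetical helper: bandwidth-delay product in bytes. */
static uint64_t bdp_bytes(uint64_t bw_bytes_per_sec, uint32_t min_rtt_us)
{
	return bw_bytes_per_sec * min_rtt_us / USEC_PER_SEC;
}

/* At 1 Gbit/s (125000000 bytes/s), a zero min_rtt collapses the BDP to 0,
 * while a 1 ms min_rtt gives 125000 bytes.
 */
static void bdp_example(void)
{
	assert(bdp_bytes(125000000ULL, 0) == 0);
	assert(bdp_bytes(125000000ULL, 1000) == 125000);
}
```

Any congestion window derived from such a product shrinks toward its minimum once a single zero sample is accepted as the min_rtt.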

This fix applies a floor of 1 for the RTT sample delta from TCP
timestamps, so that seq_rtt_us, ca_rtt_us, and rs->rtt_us will be at
least 1 * (USEC_PER_SEC / TCP_TS_HZ).
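
The floor can be sketched in userspace as follows. The function ts_delta_to_rtt_us and the TS_HZ value of 1000 are assumptions for illustration; the body follows the logic of the patched branch in tcp_ack_update_rtt(), and returning -1 for "no valid sample" is a convention of this sketch.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

#define USEC_PER_SEC 1000000u
#define TS_HZ        1000u   /* assumption: stands in for TCP_TS_HZ */

/* Hypothetical helper: convert a timestamp tick delta to an RTT sample
 * in usec, flooring the delta at 1 tick. Returns -1 when the delta is
 * too large to be a plausible sample.
 */
static int64_t ts_delta_to_rtt_us(uint32_t delta)
{
	int64_t rtt_us = -1;

	if (delta < INT_MAX / (USEC_PER_SEC / TS_HZ)) {
		if (!delta)
			delta = 1;   /* the floor this patch adds */
		rtt_us = delta * (USEC_PER_SEC / TS_HZ);
	}
	return rtt_us;
}

/* With TS_HZ = 1000, the smallest possible sample is 1000 us. */
static void floor_example(void)
{
	assert(ts_delta_to_rtt_us(0) == 1000);
	assert(ts_delta_to_rtt_us(3) == 3000);
}
```

Under these assumptions a same-tick sample is reported as one full tick (1000 us) rather than 0, and nonzero deltas are unchanged.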

Note that the receiver RTT computation in tcp_rcv_rtt_measure() and
min_rtt computation in tcp_update_rtt_min() both already apply a floor
of 1 timestamp tick, so this commit makes the code more consistent in
avoiding this edge case of a value of 0.

Signed-off-by: Jianfeng Wang <jfwang@google.com>
Signed-off-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Kevin Yang <yyd@google.com>
Acked-by: Yuchung Cheng <ycheng@google.com>
---
 net/ipv4/tcp_input.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Neal Cardwell July 31, 2020, 2:36 a.m. UTC | #1
One extra note on this patch: IMHO this is a bug fix that is worth
backporting to stable releases. Normally we would submit a patch like
this to the net branch, but we submitted this to the net-next branch
since Eric advised that this was the best approach, given how late it
is in the v5.8 development cycle.

Apologies that a note to this effect is not in the commit message itself.

best,
neal

David Miller Aug. 4, 2020, 12:54 a.m. UTC | #2
Applied and queued up for -stable, thanks.
Patch

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index a018bafd7bdf..b725288b7e67 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -2950,6 +2950,8 @@  static bool tcp_ack_update_rtt(struct sock *sk, const int flag,
 		u32 delta = tcp_time_stamp(tp) - tp->rx_opt.rcv_tsecr;
 
 		if (likely(delta < INT_MAX / (USEC_PER_SEC / TCP_TS_HZ))) {
+			if (!delta)
+				delta = 1;
 			seq_rtt_us = delta * (USEC_PER_SEC / TCP_TS_HZ);
 			ca_rtt_us = seq_rtt_us;
 		}