Message ID | 20200730234916.2708735-1-jfwang@google.com |
---|---|
State | Accepted |
Delegated to: | David Miller |
Series | [net-next] tcp: apply a floor of 1 for RTT samples from TCP timestamps |
On Thu, Jul 30, 2020 at 7:53 PM Jianfeng Wang <jfwang@google.com> wrote:
>
> For retransmitted packets, TCP needs to resort to using TCP timestamps
> for computing RTT samples. In the common case where the data and ACK
> fall in the same 1-millisecond interval, TCP senders with millisecond-
> granularity TCP timestamps compute a ca_rtt_us of 0. This ca_rtt_us
> of 0 propagates to rs->rtt_us.
>
> This value of 0 can cause performance problems for congestion control
> modules. For example, in BBR, the zero min_rtt sample can bring the
> min_rtt and BDP estimate down to 0, reduce snd_cwnd and result in a
> low throughput. It would be hard to mitigate this with filtering in
> the congestion control module, because the proper floor to apply would
> depend on the method of RTT sampling (using timestamp options or
> internally-saved transmission timestamps).
>
> This fix applies a floor of 1 for the RTT sample delta from TCP
> timestamps, so that seq_rtt_us, ca_rtt_us, and rs->rtt_us will be at
> least 1 * (USEC_PER_SEC / TCP_TS_HZ).
>
> Note that the receiver RTT computation in tcp_rcv_rtt_measure() and
> min_rtt computation in tcp_update_rtt_min() both already apply a floor
> of 1 timestamp tick, so this commit makes the code more consistent in
> avoiding this edge case of a value of 0.
>
> Signed-off-by: Jianfeng Wang <jfwang@google.com>
> Signed-off-by: Neal Cardwell <ncardwell@google.com>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Acked-by: Kevin Yang <yyd@google.com>
> Acked-by: Yuchung Cheng <ycheng@google.com>
> ---

One extra note on this patch: IMHO this is a bug fix that is worth
backporting to stable releases. Normally we would submit a patch like
this to the net branch, but we submitted this to the net-next branch
since Eric advised that this was the best approach, given how late it
is in the v5.8 development cycle. Apologies that a note to this effect
is not in the commit message itself.

best,
neal
From: Jianfeng Wang <jfwang@google.com>
Date: Thu, 30 Jul 2020 23:49:16 +0000

> For retransmitted packets, TCP needs to resort to using TCP timestamps
> for computing RTT samples. In the common case where the data and ACK
> fall in the same 1-millisecond interval, TCP senders with millisecond-
> granularity TCP timestamps compute a ca_rtt_us of 0. This ca_rtt_us
> of 0 propagates to rs->rtt_us.
>
> This value of 0 can cause performance problems for congestion control
> modules. For example, in BBR, the zero min_rtt sample can bring the
> min_rtt and BDP estimate down to 0, reduce snd_cwnd and result in a
> low throughput. It would be hard to mitigate this with filtering in
> the congestion control module, because the proper floor to apply would
> depend on the method of RTT sampling (using timestamp options or
> internally-saved transmission timestamps).
>
> This fix applies a floor of 1 for the RTT sample delta from TCP
> timestamps, so that seq_rtt_us, ca_rtt_us, and rs->rtt_us will be at
> least 1 * (USEC_PER_SEC / TCP_TS_HZ).
>
> Note that the receiver RTT computation in tcp_rcv_rtt_measure() and
> min_rtt computation in tcp_update_rtt_min() both already apply a floor
> of 1 timestamp tick, so this commit makes the code more consistent in
> avoiding this edge case of a value of 0.
>
> Signed-off-by: Jianfeng Wang <jfwang@google.com>
> Signed-off-by: Neal Cardwell <ncardwell@google.com>
> Signed-off-by: Eric Dumazet <edumazet@google.com>
> Acked-by: Kevin Yang <yyd@google.com>
> Acked-by: Yuchung Cheng <ycheng@google.com>

Applied and queued up for -stable, thanks.
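To make the failure mode in the commit message concrete, here is a small
user-space sketch of how a single zero RTT sample can drag a min-RTT
estimate, and with it a bandwidth-delay product, down to 0. This is not
BBR's code: the helper names sketch_min_rtt_us() and sketch_bdp_bytes()
are hypothetical, the minimum is taken over a plain array rather than a
time-windowed filter, and the units (bytes per second, microseconds) are
assumptions chosen for readability.

```c
#include <stdint.h>

/* Hypothetical helper: smallest RTT sample (in microseconds) in an array.
 * A real congestion control module would use a time-windowed min filter. */
static uint32_t sketch_min_rtt_us(const uint32_t *samples, int n)
{
	uint32_t min = UINT32_MAX;
	int i;

	for (i = 0; i < n; i++)
		if (samples[i] < min)
			min = samples[i];
	return min;
}

/* Hypothetical helper: BDP in bytes = bandwidth * min RTT.
 * bw is in bytes per second, min_rtt_us in microseconds. */
static uint64_t sketch_bdp_bytes(uint64_t bw_bytes_per_sec, uint32_t min_rtt_us)
{
	return bw_bytes_per_sec * min_rtt_us / 1000000ULL;
}
```

With samples of 20000, 0 and 18000 us, the minimum is 0 and the BDP is 0
bytes whatever the bandwidth. If the 0 is replaced by one millisecond
tick (1000 us), the same computation at 12500000 bytes per second gives
12500 bytes, which is small but no longer degenerate.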
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index a018bafd7bdf..b725288b7e67 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -2950,6 +2950,8 @@ static bool tcp_ack_update_rtt(struct sock *sk, const int flag,
 		u32 delta = tcp_time_stamp(tp) - tp->rx_opt.rcv_tsecr;
 
 		if (likely(delta < INT_MAX / (USEC_PER_SEC / TCP_TS_HZ))) {
+			if (!delta)
+				delta = 1;
 			seq_rtt_us = delta * (USEC_PER_SEC / TCP_TS_HZ);
 			ca_rtt_us = seq_rtt_us;
 		}
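
The effect of the two added lines can be tried outside the kernel with a
stand-alone sketch of the same arithmetic. The function name
sketch_ts_rtt_us() and the SKETCH_* constants are hypothetical stand-ins
for the kernel's code and macros; SKETCH_TCP_TS_HZ is set to 1000 on the
assumption of the millisecond-granularity timestamps described in the
commit message, and -1 is used here only as this sketch's way of saying
"no sample".

```c
#include <limits.h>
#include <stdint.h>

/* Stand-ins for the kernel's USEC_PER_SEC and TCP_TS_HZ (assumed values). */
#define SKETCH_USEC_PER_SEC 1000000L
#define SKETCH_TCP_TS_HZ    1000L

/* Hypothetical helper mirroring the patched branch: convert a timestamp
 * delta (in ticks) to an RTT sample in microseconds, flooring the delta
 * at 1 tick. Returns -1 if the delta is too large to convert safely. */
static long sketch_ts_rtt_us(uint32_t now_ts, uint32_t tsecr)
{
	const long us_per_tick = SKETCH_USEC_PER_SEC / SKETCH_TCP_TS_HZ;
	uint32_t delta = now_ts - tsecr; /* unsigned math handles wraparound */

	if (delta >= (uint32_t)(INT_MAX / us_per_tick))
		return -1;
	if (!delta)
		delta = 1; /* data and ACK fell in the same tick */
	return (long)delta * us_per_tick;
}
```

Under these assumptions, a data segment and its ACK stamped in the same
tick yield 1000 us instead of 0, while a 3-tick delta still yields
3000 us, so only the degenerate case changes.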