From patchwork Wed Jun 20 20:07:35 2018
X-Patchwork-Submitter: "Kevin(Yudong) Yang"
X-Patchwork-Id: 932378
X-Patchwork-Delegate: davem@davemloft.net
Date: Wed, 20 Jun 2018 16:07:35 -0400
Message-Id: <20180620200735.82085-1-yyd@google.com>
Subject: [PATCH net-next] tcp_bbr: fix bbr pacing rate for internal pacing
From: Kevin Yang
To: David Miller
Cc: netdev@vger.kernel.org, Eric Dumazet, Kevin Yang
X-Mailing-List: netdev@vger.kernel.org

From: Eric Dumazet

This commit makes BBR use only the MSS (without any headers) to
calculate pacing rates when internal TCP-layer pacing is used.

This is necessary to achieve the correct pacing behavior in this case,
since tcp_internal_pacing() uses only the payload length to calculate
pacing delays.
Signed-off-by: Kevin Yang
Signed-off-by: Eric Dumazet
Reviewed-by: Neal Cardwell
---
 include/net/tcp.h     | 11 +++++++++++
 net/ipv4/tcp_bbr.c    |  6 +++++-
 net/ipv4/tcp_output.c | 14 --------------
 3 files changed, 16 insertions(+), 15 deletions(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index 0448e7c5d2b4..822ee49ed0f9 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -1184,6 +1184,17 @@ static inline bool tcp_is_cwnd_limited(const struct sock *sk)
 	return tp->is_cwnd_limited;
 }
 
+/* BBR congestion control needs pacing.
+ * Same remark for SO_MAX_PACING_RATE.
+ * sch_fq packet scheduler is efficiently handling pacing,
+ * but is not always installed/used.
+ * Return true if TCP stack should pace packets itself.
+ */
+static inline bool tcp_needs_internal_pacing(const struct sock *sk)
+{
+	return smp_load_acquire(&sk->sk_pacing_status) == SK_PACING_NEEDED;
+}
+
 /* Something is really bad, we could not queue an additional packet,
  * because qdisc is full or receiver sent a 0 window.
  * We do not want to add fuel to the fire, or abort too early,
diff --git a/net/ipv4/tcp_bbr.c b/net/ipv4/tcp_bbr.c
index 58e2f479ffb4..3b5f45b9e81e 100644
--- a/net/ipv4/tcp_bbr.c
+++ b/net/ipv4/tcp_bbr.c
@@ -205,7 +205,11 @@ static u32 bbr_bw(const struct sock *sk)
  */
 static u64 bbr_rate_bytes_per_sec(struct sock *sk, u64 rate, int gain)
 {
-	rate *= tcp_mss_to_mtu(sk, tcp_sk(sk)->mss_cache);
+	unsigned int mss = tcp_sk(sk)->mss_cache;
+
+	if (!tcp_needs_internal_pacing(sk))
+		mss = tcp_mss_to_mtu(sk, mss);
+	rate *= mss;
 	rate *= gain;
 	rate >>= BBR_SCALE;
 	rate *= USEC_PER_SEC;
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 8e08b409c71e..f8f6129160dd 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -973,17 +973,6 @@ enum hrtimer_restart tcp_pace_kick(struct hrtimer *timer)
 	return HRTIMER_NORESTART;
 }
 
-/* BBR congestion control needs pacing.
- * Same remark for SO_MAX_PACING_RATE.
- * sch_fq packet scheduler is efficiently handling pacing,
- * but is not always installed/used.
- * Return true if TCP stack should pace packets itself.
- */
-static bool tcp_needs_internal_pacing(const struct sock *sk)
-{
-	return smp_load_acquire(&sk->sk_pacing_status) == SK_PACING_NEEDED;
-}
-
 static void tcp_internal_pacing(struct sock *sk, const struct sk_buff *skb)
 {
 	u64 len_ns;
@@ -995,9 +984,6 @@ static void tcp_internal_pacing(struct sock *sk, const struct sk_buff *skb)
 	if (!rate || rate == ~0U)
 		return;
 
-	/* Should account for header sizes as sch_fq does,
-	 * but lets make things simple.
-	 */
 	len_ns = (u64)skb->len * NSEC_PER_SEC;
 	do_div(len_ns, rate);
 	hrtimer_start(&tcp_sk(sk)->pacing_timer,