From patchwork Wed Jan 16 23:05:28 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yuchung Cheng
X-Patchwork-Id: 1026252
X-Patchwork-Delegate: davem@davemloft.net
From: Yuchung Cheng
To: davem@davemloft.net, edumazet@google.com
Cc: netdev@vger.kernel.org, ncardwell@google.com, soheil@google.com,
 Yuchung Cheng
Subject: [PATCH net-next 1/8] tcp: exit if nothing to retransmit on RTO timeout
Date: Wed, 16 Jan 2019 15:05:28 -0800
Message-Id: <20190116230535.162758-2-ycheng@google.com>
X-Mailer: git-send-email 2.20.1.97.g81188d93c3-goog
In-Reply-To: <20190116230535.162758-1-ycheng@google.com>
References: <20190116230535.162758-1-ycheng@google.com>
X-Mailing-List: netdev@vger.kernel.org

Previously TCP only warned if its RTO timer fired while the
retransmission queue was empty, but continuing past the warning would
cause a null pointer dereference later on.
It's better to avoid such a catastrophic failure and simply exit with
a warning.

Signed-off-by: Yuchung Cheng
Signed-off-by: Eric Dumazet
Reviewed-by: Neal Cardwell
Reviewed-by: Soheil Hassas Yeganeh
---
 net/ipv4/tcp_timer.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
index 71a29e9c0620..e7d09e3705b8 100644
--- a/net/ipv4/tcp_timer.c
+++ b/net/ipv4/tcp_timer.c
@@ -443,10 +443,8 @@ void tcp_retransmit_timer(struct sock *sk)
 		 */
 		return;
 	}
-	if (!tp->packets_out)
-		goto out;
-
-	WARN_ON(tcp_rtx_queue_empty(sk));
+	if (!tp->packets_out || WARN_ON_ONCE(tcp_rtx_queue_empty(sk)))
+		return;
 
 	tp->tlp_high_seq = 0;
 

From patchwork Wed Jan 16 23:05:29 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yuchung Cheng
X-Patchwork-Id: 1026251
X-Patchwork-Delegate: davem@davemloft.net
From: Yuchung Cheng
To: davem@davemloft.net, edumazet@google.com
Cc: netdev@vger.kernel.org, ncardwell@google.com, soheil@google.com,
 Yuchung Cheng
Subject: [PATCH net-next 2/8] tcp: always timestamp on every skb transmission
Date: Wed, 16 Jan 2019 15:05:29 -0800
Message-Id: <20190116230535.162758-3-ycheng@google.com>
X-Mailer: git-send-email 2.20.1.97.g81188d93c3-goog
In-Reply-To: <20190116230535.162758-1-ycheng@google.com>
References: <20190116230535.162758-1-ycheng@google.com>
X-Mailing-List: netdev@vger.kernel.org

Previously TCP skbs were not always timestamped if the transmission
failed due to memory or other local issues. This makes deciding when
to abort a socket tricky and complicated because the first
unacknowledged skb's timestamp may be 0 on TCP timeout.

The straightforward fix is to always timestamp the skb on every
transmission attempt. Also, every skb retransmission needs to be
flagged properly to avoid RTT under-estimation. This can happen when
an ACK arrives for the original packet after a previous (spurious)
retransmission has failed.

It's worth noting that this reverts to the old time-stamping style
used before commit 8c72c65b426b ("tcp: update skb->skb_mstamp more
carefully"), which addressed a problem in computing the elapsed time
of a stalled window-probing socket. That problem will be addressed
differently in the next patches with a simpler approach.
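For illustration only, here is a minimal user-space sketch of the two
rules described above: stamp the skb before anything can fail, and
flag every retransmission attempt. It is not kernel code. The toy_*
names, TOY_ENOBUFS and the local_failure switch are hypothetical
stand-ins, and the RTT rule is reduced to a single flag in the spirit
of Karn's rule.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the kernel state this patch touches. */
struct toy_sock {
	uint64_t wstamp_ns;      /* models tp->tcp_wstamp_ns */
	uint64_t clock_cache_ns; /* models tp->tcp_clock_cache */
};

struct toy_skb {
	uint64_t mstamp_ns;      /* models skb->skb_mstamp_ns */
	bool ever_retrans;       /* models TCPCB_EVER_RETRANS */
};

#define TOY_ENOBUFS 105 /* illustrative error code */

/* Stamp first, then attempt the send. local_failure simulates a local
 * problem such as a failed skb clone.
 */
static int toy_transmit(struct toy_sock *sk, struct toy_skb *skb,
			bool local_failure)
{
	assert(sk && skb); /* loosely models the BUG_ON() at entry */

	/* Keep the write stamp monotonic, like max() in the patch. */
	if (sk->clock_cache_ns > sk->wstamp_ns)
		sk->wstamp_ns = sk->clock_cache_ns;
	skb->mstamp_ns = sk->wstamp_ns;

	return local_failure ? -TOY_ENOBUFS : 0;
}

static int toy_retransmit(struct toy_sock *sk, struct toy_skb *skb,
			  bool local_failure)
{
	int err = toy_transmit(sk, skb, local_failure);

	/* Flag unconditionally, so that an ACK for the original copy is
	 * never paired with the stamp of a send that did not happen.
	 */
	skb->ever_retrans = true;
	return err;
}

/* An ACK yields an RTT sample only for never-retransmitted skbs. */
static bool toy_rtt_sample_ok(const struct toy_skb *skb)
{
	return !skb->ever_retrans;
}
```

In this model a failed retransmission still refreshes mstamp_ns, so
the stamp is never left at 0, and the skb is excluded from RTT
sampling even though nothing left the host.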
Signed-off-by: Yuchung Cheng
Signed-off-by: Eric Dumazet
Reviewed-by: Neal Cardwell
Reviewed-by: Soheil Hassas Yeganeh
---
 net/ipv4/tcp_output.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 730bc44dbad9..57a56e205070 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -980,7 +980,6 @@ static void tcp_update_skb_after_send(struct sock *sk, struct sk_buff *skb,
 {
 	struct tcp_sock *tp = tcp_sk(sk);
 
-	skb->skb_mstamp_ns = tp->tcp_wstamp_ns;
 	if (sk->sk_pacing_status != SK_PACING_NONE) {
 		unsigned long rate = sk->sk_pacing_rate;
 
@@ -1028,7 +1027,9 @@ static int __tcp_transmit_skb(struct sock *sk, struct sk_buff *skb,
 
 	BUG_ON(!skb || !tcp_skb_pcount(skb));
 	tp = tcp_sk(sk);
-
+	prior_wstamp = tp->tcp_wstamp_ns;
+	tp->tcp_wstamp_ns = max(tp->tcp_wstamp_ns, tp->tcp_clock_cache);
+	skb->skb_mstamp_ns = tp->tcp_wstamp_ns;
 	if (clone_it) {
 		TCP_SKB_CB(skb)->tx.in_flight = TCP_SKB_CB(skb)->end_seq
 			- tp->snd_una;
@@ -1045,11 +1046,6 @@ static int __tcp_transmit_skb(struct sock *sk, struct sk_buff *skb,
 			return -ENOBUFS;
 	}
 
-	prior_wstamp = tp->tcp_wstamp_ns;
-	tp->tcp_wstamp_ns = max(tp->tcp_wstamp_ns, tp->tcp_clock_cache);
-
-	skb->skb_mstamp_ns = tp->tcp_wstamp_ns;
-
 	inet = inet_sk(sk);
 	tcb = TCP_SKB_CB(skb);
 	memset(&opts, 0, sizeof(opts));
@@ -2937,12 +2933,16 @@ int __tcp_retransmit_skb(struct sock *sk, struct sk_buff *skb, int segs)
 		err = tcp_transmit_skb(sk, skb, 1, GFP_ATOMIC);
 	}
 
+	/* To avoid taking spuriously low RTT samples based on a timestamp
+	 * for a transmit that never happened, always mark EVER_RETRANS
+	 */
+	TCP_SKB_CB(skb)->sacked |= TCPCB_EVER_RETRANS;
+
 	if (BPF_SOCK_OPS_TEST_FLAG(tp, BPF_SOCK_OPS_RETRANS_CB_FLAG))
 		tcp_call_bpf_3arg(sk, BPF_SOCK_OPS_RETRANS_CB,
 				  TCP_SKB_CB(skb)->seq, segs, err);
 
 	if (likely(!err)) {
-		TCP_SKB_CB(skb)->sacked |= TCPCB_EVER_RETRANS;
 		trace_tcp_retransmit_skb(sk, skb);
 	} else if (err != -EBUSY) {
 		NET_ADD_STATS(sock_net(sk), LINUX_MIB_TCPRETRANSFAIL, segs);

From patchwork Wed Jan 16 23:05:30 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yuchung Cheng
X-Patchwork-Id: 1026257
X-Patchwork-Delegate: davem@davemloft.net
From: Yuchung Cheng
To: davem@davemloft.net, edumazet@google.com
Cc: netdev@vger.kernel.org, ncardwell@google.com, soheil@google.com,
 Yuchung Cheng
Subject: [PATCH net-next 3/8] tcp: always set retrans_stamp on recovery
Date: Wed, 16 Jan 2019 15:05:30 -0800
Message-Id: <20190116230535.162758-4-ycheng@google.com>
X-Mailer: git-send-email 2.20.1.97.g81188d93c3-goog
In-Reply-To: <20190116230535.162758-1-ycheng@google.com>
References: <20190116230535.162758-1-ycheng@google.com>
X-Mailing-List: netdev@vger.kernel.org

Previously a TCP socket's retrans_stamp was not set if the
retransmission failed to send.
As a result, if a socket is experiencing local issues that prevent it
from retransmitting packets, determining when to abort the socket is
complicated without knowing the starting time of the recovery, since
retrans_stamp may remain zero.

This complication causes sub-optimal behavior in that TCP may use the
latest, instead of the first, retransmission time to compute the
elapsed time of a connection stalling due to local issues. TCP may
then disregard the TCP retries settings and keep retrying until it
finally succeeds: not a good idea when the local host is already
strained.

The simple fix is to always timestamp the start of a recovery. It's
worth noting that retrans_stamp is also used to compare echo timestamp
values to detect spurious recovery. This patch does not break that
because retrans_stamp is still later than when the original packet was
sent.

Signed-off-by: Yuchung Cheng
Signed-off-by: Eric Dumazet
Reviewed-by: Neal Cardwell
Reviewed-by: Soheil Hassas Yeganeh
---
 net/ipv4/tcp_output.c |  9 ++++-----
 net/ipv4/tcp_timer.c  | 23 +++--------------------
 2 files changed, 7 insertions(+), 25 deletions(-)

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 57a56e205070..d2d494c74811 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -2963,13 +2963,12 @@ int tcp_retransmit_skb(struct sock *sk, struct sk_buff *skb, int segs)
 #endif
 		TCP_SKB_CB(skb)->sacked |= TCPCB_RETRANS;
 		tp->retrans_out += tcp_skb_pcount(skb);
-
-		/* Save stamp of the first retransmit. */
-		if (!tp->retrans_stamp)
-			tp->retrans_stamp = tcp_skb_timestamp(skb);
-
 	}
 
+	/* Save stamp of the first (attempted) retransmit. */
+	if (!tp->retrans_stamp)
+		tp->retrans_stamp = tcp_skb_timestamp(skb);
+
 	if (tp->undo_retrans < 0)
 		tp->undo_retrans = 0;
 	tp->undo_retrans += tcp_skb_pcount(skb);
diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
index e7d09e3705b8..1e61f0bd6e24 100644
--- a/net/ipv4/tcp_timer.c
+++ b/net/ipv4/tcp_timer.c
@@ -22,28 +22,14 @@
 #include
 #include
 
-static u32 tcp_retransmit_stamp(const struct sock *sk)
-{
-	u32 start_ts = tcp_sk(sk)->retrans_stamp;
-
-	if (unlikely(!start_ts)) {
-		struct sk_buff *head = tcp_rtx_queue_head(sk);
-
-		if (!head)
-			return 0;
-		start_ts = tcp_skb_timestamp(head);
-	}
-	return start_ts;
-}
-
 static u32 tcp_clamp_rto_to_user_timeout(const struct sock *sk)
 {
 	struct inet_connection_sock *icsk = inet_csk(sk);
 	u32 elapsed, start_ts;
 	s32 remaining;
 
-	start_ts = tcp_retransmit_stamp(sk);
-	if (!icsk->icsk_user_timeout || !start_ts)
+	start_ts = tcp_sk(sk)->retrans_stamp;
+	if (!icsk->icsk_user_timeout)
 		return icsk->icsk_rto;
 	elapsed = tcp_time_stamp(tcp_sk(sk)) - start_ts;
 	remaining = icsk->icsk_user_timeout - elapsed;
@@ -197,10 +183,7 @@ static bool retransmits_timed_out(struct sock *sk,
 	if (!inet_csk(sk)->icsk_retransmits)
 		return false;
 
-	start_ts = tcp_retransmit_stamp(sk);
-	if (!start_ts)
-		return false;
-
+	start_ts = tcp_sk(sk)->retrans_stamp;
 	if (likely(timeout == 0)) {
 		linear_backoff_thresh = ilog2(TCP_RTO_MAX/rto_base);
 

From patchwork Wed Jan 16 23:05:31 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yuchung Cheng
X-Patchwork-Id: 1026254
X-Patchwork-Delegate: davem@davemloft.net
From: Yuchung Cheng
To: davem@davemloft.net, edumazet@google.com
Cc: netdev@vger.kernel.org, ncardwell@google.com, soheil@google.com,
 Yuchung Cheng
Subject: [PATCH net-next 4/8] tcp: properly track retry time on passive Fast Open
Date: Wed, 16 Jan 2019 15:05:31 -0800
Message-Id: <20190116230535.162758-5-ycheng@google.com>
X-Mailer: git-send-email 2.20.1.97.g81188d93c3-goog
In-Reply-To: <20190116230535.162758-1-ycheng@google.com>
References: <20190116230535.162758-1-ycheng@google.com>
X-Mailing-List: netdev@vger.kernel.org

This patch addresses a corner case in the timeout behavior of a
passive Fast Open socket. A passive Fast Open server may write to and
close the socket while it is still retrying the SYN-ACK to complete
the handshake. After the handshake completes, the server does not
properly stamp the recovery start time (tp->retrans_stamp is 0), and
the socket may abort immediately on the very first FIN timeout,
instead of retrying until it passes the system or user specified
limit.
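To see why a zero retrans_stamp leads to an immediate abort, here is a
minimal user-space sketch, assuming the same wrap-safe comparison that
retransmits_timed_out() performs at its return statement. The name
toy_timed_out is hypothetical, and all values are plain millisecond
counters rather than real socket state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True once (now - start_ts) has reached the timeout, using the signed
 * 32-bit difference so that clock wrap-around is tolerated.
 */
static bool toy_timed_out(uint32_t now_ms, uint32_t start_ts_ms,
			  uint32_t timeout_ms)
{
	/* The signed comparison is only meaningful for timeouts that
	 * fit in 31 bits.
	 */
	assert(timeout_ms <= INT32_MAX);

	return (int32_t)(now_ms - start_ts_ms - timeout_ms) >= 0;
}
```

With an illustrative clock of 5000000 ms and a timeout of 924600 ms,
toy_timed_out(5000000, 0, 924600) is true: a start stamp of 0 makes
the whole clock value count as elapsed time, so the first timeout
already looks like the last. With a start stamp taken 200 ms earlier,
toy_timed_out(5000000, 4999800, 924600) is false and retries continue.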
Signed-off-by: Yuchung Cheng Signed-off-by: Eric Dumazet Reviewed-by: Neal Cardwell Reviewed-by: Soheil Hassas Yeganeh --- net/ipv4/tcp_timer.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c index 1e61f0bd6e24..074de38bafbd 100644 --- a/net/ipv4/tcp_timer.c +++ b/net/ipv4/tcp_timer.c @@ -378,6 +378,7 @@ static void tcp_fastopen_synack_timer(struct sock *sk) struct inet_connection_sock *icsk = inet_csk(sk); int max_retries = icsk->icsk_syn_retries ? : sock_net(sk)->ipv4.sysctl_tcp_synack_retries + 1; /* add one more retry for fastopen */ + struct tcp_sock *tp = tcp_sk(sk); struct request_sock *req; req = tcp_sk(sk)->fastopen_rsk; @@ -395,6 +396,8 @@ static void tcp_fastopen_synack_timer(struct sock *sk) inet_rtx_syn_ack(sk, req); req->num_timeout++; icsk->icsk_retransmits++; + if (!tp->retrans_stamp) + tp->retrans_stamp = tcp_time_stamp(tp); inet_csk_reset_xmit_timer(sk, ICSK_TIME_RETRANS, TCP_TIMEOUT_INIT << req->num_timeout, TCP_RTO_MAX); } From patchwork Wed Jan 16 23:05:32 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yuchung Cheng X-Patchwork-Id: 1026253 X-Patchwork-Delegate: davem@davemloft.net Return-Path: X-Original-To: patchwork-incoming-netdev@ozlabs.org Delivered-To: patchwork-incoming-netdev@ozlabs.org Authentication-Results: ozlabs.org; spf=none (mailfrom) smtp.mailfrom=vger.kernel.org (client-ip=209.132.180.67; helo=vger.kernel.org; envelope-from=netdev-owner@vger.kernel.org; receiver=) Authentication-Results: ozlabs.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=google.com header.i=@google.com header.b="jOkPxWm8"; dkim-atps=neutral Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by ozlabs.org (Postfix) with ESMTP id 43g2r86d8nz9sBQ for ; Thu, 17 Jan 2019 10:06:00 +1100 (AEDT) Received: (majordomo@vger.kernel.org) by 
vger.kernel.org via listexpand id S2387958AbfAPXF7 (ORCPT ); Wed, 16 Jan 2019 18:05:59 -0500 Received: from mail-pl1-f196.google.com ([209.85.214.196]:45626 "EHLO mail-pl1-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2387947AbfAPXF6 (ORCPT ); Wed, 16 Jan 2019 18:05:58 -0500 Received: by mail-pl1-f196.google.com with SMTP id a14so3718092plm.12 for ; Wed, 16 Jan 2019 15:05:58 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=YZNkXrN5sHGZEc2lfkfn6bRtfB+l3g+mYllBUgR11uM=; b=jOkPxWm8nFIqzxmJUH0Ebt27GBuiJOVmoJX1B/UV+sVE4qdO56OJMJCshgPLC92pdi 9EKZU54OL2FrFgcWXU/A62+w+Lact/RdzmmXmuhXMrSAwgQTbHWIDD5bls9S/URIYRBJ zwu+zL4lVvfdfL90rcL5HKsf04pNbTpqRNFhzGEupULFw9yT/P5wI9fZOT/VGpW7qya6 wadtkZfDWqjHF458sD0Gr+z885pO46StOJJcc/DxF0RuaRX7jYItIbxRb32h+pmh1d+E Nuxmd082byWluYcepnWZSAcP9PX2VQS+dgURb5+xtx1sEJ9H3xIsFPXJLz7QnMHbU68A yqrg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=YZNkXrN5sHGZEc2lfkfn6bRtfB+l3g+mYllBUgR11uM=; b=BVBZ1+JKciE8VXwyOaNIvtNpE6JVUxWx3dNJwptOH6SycjDs5GpB88BBkPLCv3Tt8B PESbZlkAWHkbHLOyCF/YdssagjOfmX60AaAQqkcgEsOqkEnlCxUTbrOOGFaVK1j4ZXvZ J1nwGDKtT9FYbZjTh5QZcWUEqtrFtTPB1ru6I206yqyUhHGmbNnbXdbgGE3vyjlHvUen 6p8cBRdlLLM0Qbhf6GdXT3VtwyRtbQ/FZSyjNCkE39wdLxJKmbOq/sFNvCAe55eYoDRu o8GF8GjGk7+UWULbqhdZMICmB6Z86fNylWSpMK1kM7almJppNVRyrUlEQ6/Xk4HGrktC 99Mg== X-Gm-Message-State: AJcUukcrfYEJVnHcw/7mYDBjAkWvBevuymh94sPuHh+nZ2e12ZhbwbC0 ezEfe/wI4jgEaTBb8umYbFqdTgrj0tcg5Q== X-Google-Smtp-Source: ALg8bN73btZ9qezpgQFwZdU332oSItzwBz/wKj+yBQyl48k3gt88l3VhDmnctTde2PANdKPpUNb0Eg== X-Received: by 2002:a17:902:9a02:: with SMTP id v2mr12650064plp.180.1547679957867; Wed, 16 Jan 2019 15:05:57 -0800 (PST) Received: from ycheng2.svl.corp.google.com 
([2620:15c:2c4:201:d660:6c0b:8a4f:4c77]) by smtp.gmail.com with ESMTPSA id k186sm8481087pge.13.2019.01.16.15.05.56 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Wed, 16 Jan 2019 15:05:56 -0800 (PST) From: Yuchung Cheng To: davem@davemloft.net, edumazet@google.com Cc: netdev@vger.kernel.org, ncardwell@google.com, soheil@google.com, Yuchung Cheng Subject: [PATCH net-next 5/8] tcp: create a helper to model exponential backoff Date: Wed, 16 Jan 2019 15:05:32 -0800 Message-Id: <20190116230535.162758-6-ycheng@google.com> X-Mailer: git-send-email 2.20.1.97.g81188d93c3-goog In-Reply-To: <20190116230535.162758-1-ycheng@google.com> References: <20190116230535.162758-1-ycheng@google.com> MIME-Version: 1.0 Sender: netdev-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org Create a helper to model TCP exponential backoff for the next patch. This is pure refactor w no behavior change. Signed-off-by: Yuchung Cheng Signed-off-by: Eric Dumazet Reviewed-by: Neal Cardwell Reviewed-by: Soheil Hassas Yeganeh --- net/ipv4/tcp_timer.c | 31 ++++++++++++++++++------------- 1 file changed, 18 insertions(+), 13 deletions(-) diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c index 074de38bafbd..bcc2f5783e57 100644 --- a/net/ipv4/tcp_timer.c +++ b/net/ipv4/tcp_timer.c @@ -159,7 +159,20 @@ static void tcp_mtu_probing(struct inet_connection_sock *icsk, struct sock *sk) tcp_sync_mss(sk, icsk->icsk_pmtu_cookie); } - +static unsigned int tcp_model_timeout(struct sock *sk, + unsigned int boundary, + unsigned int rto_base) +{ + unsigned int linear_backoff_thresh, timeout; + + linear_backoff_thresh = ilog2(TCP_RTO_MAX / rto_base); + if (boundary <= linear_backoff_thresh) + timeout = ((2 << boundary) - 1) * rto_base; + else + timeout = ((2 << linear_backoff_thresh) - 1) * rto_base + + (boundary - linear_backoff_thresh) * TCP_RTO_MAX; + return jiffies_to_msecs(timeout); +} /** * retransmits_timed_out() - returns true if this connection has 
timed out
  * @sk: The current socket
@@ -177,23 +190,15 @@ static bool retransmits_timed_out(struct sock *sk,
 				  unsigned int boundary,
 				  unsigned int timeout)
 {
-	const unsigned int rto_base = TCP_RTO_MIN;
-	unsigned int linear_backoff_thresh, start_ts;
+	unsigned int start_ts;
 
 	if (!inet_csk(sk)->icsk_retransmits)
 		return false;
 
 	start_ts = tcp_sk(sk)->retrans_stamp;
-	if (likely(timeout == 0)) {
-		linear_backoff_thresh = ilog2(TCP_RTO_MAX/rto_base);
-
-		if (boundary <= linear_backoff_thresh)
-			timeout = ((2 << boundary) - 1) * rto_base;
-		else
-			timeout = ((2 << linear_backoff_thresh) - 1) * rto_base +
-				(boundary - linear_backoff_thresh) * TCP_RTO_MAX;
-		timeout = jiffies_to_msecs(timeout);
-	}
+	if (likely(timeout == 0))
+		timeout = tcp_model_timeout(sk, boundary, TCP_RTO_MIN);
+
 	return (s32)(tcp_time_stamp(tcp_sk(sk)) - start_ts - timeout) >= 0;
 }

From patchwork Wed Jan 16 23:05:33 2019
X-Patchwork-Submitter: Yuchung Cheng
X-Patchwork-Id: 1026255
X-Patchwork-Delegate: davem@davemloft.net
From: Yuchung Cheng
To: davem@davemloft.net, edumazet@google.com
Cc: netdev@vger.kernel.org, ncardwell@google.com, soheil@google.com,
 Yuchung Cheng
Subject: [PATCH net-next 6/8] tcp: simplify window probe aborting on
 USER_TIMEOUT
Date: Wed, 16 Jan 2019 15:05:33 -0800
Message-Id: <20190116230535.162758-7-ycheng@google.com>
In-Reply-To: <20190116230535.162758-1-ycheng@google.com>
References: <20190116230535.162758-1-ycheng@google.com>

Previously we use the next unsent skb's timestamp to determine
when to abort a socket stalling on window probes. This no longer
works as skb timestamp reflects the last instead of the first
transmission.

Instead we can estimate how long the socket has been stalling
with the probe count and the exponential backoff behavior.

Signed-off-by: Yuchung Cheng
Signed-off-by: Eric Dumazet
Reviewed-by: Neal Cardwell
Reviewed-by: Soheil Hassas Yeganeh
---
 net/ipv4/tcp_timer.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
index bcc2f5783e57..c36089aa3515 100644
--- a/net/ipv4/tcp_timer.c
+++ b/net/ipv4/tcp_timer.c
@@ -333,7 +333,6 @@ static void tcp_probe_timer(struct sock *sk)
 	struct sk_buff *skb = tcp_send_head(sk);
 	struct tcp_sock *tp = tcp_sk(sk);
 	int max_probes;
-	u32 start_ts;
 
 	if (tp->packets_out || !skb) {
 		icsk->icsk_probes_out = 0;
@@ -348,12 +347,13 @@ static void tcp_probe_timer(struct sock *sk)
 	 * corresponding system limit. We also implement similar policy when
 	 * we use RTO to probe window in tcp_retransmit_timer().
 	 */
-	start_ts = tcp_skb_timestamp(skb);
-	if (!start_ts)
-		skb->skb_mstamp_ns = tp->tcp_clock_cache;
-	else if (icsk->icsk_user_timeout &&
-		 (s32)(tcp_time_stamp(tp) - start_ts) > icsk->icsk_user_timeout)
-		goto abort;
+	if (icsk->icsk_user_timeout) {
+		u32 elapsed = tcp_model_timeout(sk, icsk->icsk_probes_out,
+						tcp_probe0_base(sk));
+
+		if (elapsed >= icsk->icsk_user_timeout)
+			goto abort;
+	}
 
 	max_probes = sock_net(sk)->ipv4.sysctl_tcp_retries2;
 	if (sock_flag(sk, SOCK_DEAD)) {

From patchwork Wed Jan 16 23:05:34 2019
X-Patchwork-Submitter: Yuchung Cheng
X-Patchwork-Id: 1026256
X-Patchwork-Delegate: davem@davemloft.net
From: Yuchung Cheng
To: davem@davemloft.net, edumazet@google.com
Cc: netdev@vger.kernel.org, ncardwell@google.com, soheil@google.com,
 Yuchung Cheng
Subject: [PATCH net-next 7/8] tcp: retry more conservatively on local
 congestion
Date: Wed, 16 Jan 2019 15:05:34 -0800
Message-Id: <20190116230535.162758-8-ycheng@google.com>
In-Reply-To: <20190116230535.162758-1-ycheng@google.com>
References: <20190116230535.162758-1-ycheng@google.com>

Previously when the sender fails to retransmit a data packet on
timeout due to congestion in the local host (e.g. throttling in
qdisc), it'll retry within an RTO up to 500ms.

In low-RTT networks such as data-centers, RTO is often far below
the default minimum 200ms (and the cap 500ms). Then local host
congestion could trigger a retry storm pouring gas to the fire.
Worse yet, the retry counter (icsk_retransmits) is not properly
updated so the aggressive retry may exceed the system limit
(15 rounds) until the packet finally slips through.

On such rare events, it's wise to retry more conservatively (500ms)
and update the stats properly to reflect these incidents and follow
the system limit. Note that this is consistent with the behavior
when a keep-alive probe is dropped due to local congestion.

Signed-off-by: Yuchung Cheng
Signed-off-by: Eric Dumazet
Reviewed-by: Neal Cardwell
Reviewed-by: Soheil Hassas Yeganeh
---
 net/ipv4/tcp_timer.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
index c36089aa3515..d7399a89469d 100644
--- a/net/ipv4/tcp_timer.c
+++ b/net/ipv4/tcp_timer.c
@@ -500,14 +500,13 @@ void tcp_retransmit_timer(struct sock *sk)
 
 	tcp_enter_loss(sk);
 
+	icsk->icsk_retransmits++;
 	if (tcp_retransmit_skb(sk, tcp_rtx_queue_head(sk), 1) > 0) {
 		/* Retransmission failed because of local congestion,
-		 * do not backoff.
+		 * Let senders fight for local resources conservatively.
 		 */
-		if (!icsk->icsk_retransmits)
-			icsk->icsk_retransmits = 1;
 		inet_csk_reset_xmit_timer(sk, ICSK_TIME_RETRANS,
-					  min(icsk->icsk_rto, TCP_RESOURCE_PROBE_INTERVAL),
+					  TCP_RESOURCE_PROBE_INTERVAL,
 					  TCP_RTO_MAX);
 		goto out;
 	}
@@ -528,7 +527,6 @@ void tcp_retransmit_timer(struct sock *sk)
 	 * the 120 second clamps though!
 	 */
 	icsk->icsk_backoff++;
-	icsk->icsk_retransmits++;
 
 out_reset_timer:
 	/* If stream is thin, use linear timeouts. Since 'icsk_backoff' is

From patchwork Wed Jan 16 23:05:35 2019
X-Patchwork-Submitter: Yuchung Cheng
X-Patchwork-Id: 1026258
X-Patchwork-Delegate: davem@davemloft.net
From: Yuchung Cheng
To: davem@davemloft.net, edumazet@google.com
Cc: netdev@vger.kernel.org, ncardwell@google.com, soheil@google.com,
 Yuchung Cheng
Subject: [PATCH net-next 8/8] tcp: less aggressive window probing on local
 congestion
Date: Wed, 16 Jan 2019 15:05:35 -0800
Message-Id: <20190116230535.162758-9-ycheng@google.com>
In-Reply-To: <20190116230535.162758-1-ycheng@google.com>
References: <20190116230535.162758-1-ycheng@google.com>

Previously when the sender fails to send (original) data packet or
window probes due to congestion in the local host (e.g. throttling
in qdisc), it'll retry within an RTO or two up to 500ms.

In low-RTT networks such as data-centers, RTO is often far below
the default minimum 200ms. Then local host congestion could trigger
a retry storm pouring gas to the fire. Worse yet, the probe counter
(icsk_probes_out) is not properly updated so the aggressive retry
may exceed the system limit (15 rounds) until the packet finally
slips through.

On such rare events, it's wise to retry more conservatively
(500ms) and update the stats properly to reflect these incidents
and follow the system limit. Note that this is consistent with
the behaviors when a keep-alive probe or RTO retry is dropped
due to local congestion.

Signed-off-by: Yuchung Cheng
Signed-off-by: Eric Dumazet
Reviewed-by: Neal Cardwell
Reviewed-by: Soheil Hassas Yeganeh
---
 net/ipv4/tcp_output.c | 22 +++++++---------------
 1 file changed, 7 insertions(+), 15 deletions(-)

diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index d2d494c74811..6527f61f59ff 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -3749,7 +3749,7 @@ void tcp_send_probe0(struct sock *sk)
 	struct inet_connection_sock *icsk = inet_csk(sk);
 	struct tcp_sock *tp = tcp_sk(sk);
 	struct net *net = sock_net(sk);
-	unsigned long probe_max;
+	unsigned long timeout;
 	int err;
 
 	err = tcp_write_wakeup(sk, LINUX_MIB_TCPWINPROBE);
@@ -3761,26 +3761,18 @@ void tcp_send_probe0(struct sock *sk)
 		return;
 	}
 
+	icsk->icsk_probes_out++;
 	if (err <= 0) {
 		if (icsk->icsk_backoff < net->ipv4.sysctl_tcp_retries2)
 			icsk->icsk_backoff++;
-		icsk->icsk_probes_out++;
-		probe_max = TCP_RTO_MAX;
+		timeout = tcp_probe0_when(sk, TCP_RTO_MAX);
 	} else {
 		/* If packet was not sent due to local congestion,
-		 * do not backoff and do not remember icsk_probes_out.
-		 * Let local senders to fight for local resources.
-		 *
-		 * Use accumulated backoff yet.
+		 * Let senders fight for local resources conservatively.
 		 */
-		if (!icsk->icsk_probes_out)
-			icsk->icsk_probes_out = 1;
-		probe_max = TCP_RESOURCE_PROBE_INTERVAL;
-	}
-	tcp_reset_xmit_timer(sk, ICSK_TIME_PROBE0,
-			     tcp_probe0_when(sk, probe_max),
-			     TCP_RTO_MAX,
-			     NULL);
+		timeout = TCP_RESOURCE_PROBE_INTERVAL;
+	}
+	tcp_reset_xmit_timer(sk, ICSK_TIME_PROBE0, timeout, TCP_RTO_MAX, NULL);
 }
 
 int tcp_rtx_synack(const struct sock *sk, struct request_sock *req)
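
Note for readers of the series: the window-probe timer behaviour that
patches 6/8 and 8/8 arrive at can be modelled outside the kernel. The
sketch below is illustrative only and is not kernel code. It works in
milliseconds instead of jiffies, and every name in it (model_timeout_ms,
probe_should_abort, next_probe_delay_ms, struct probe_state) is made up
for this note. The constants come from the commit text (200ms minimum
RTO, 500ms local-congestion interval) and from the "120 second clamps"
comment in tcp_timer.c. The body of tcp_model_timeout() does not appear
in this part of the thread, so model_timeout_ms() is reconstructed from
the lines removed from retransmits_timed_out() earlier in the series;
the clamped doubling in next_probe_delay_ms() is an assumption about
what tcp_probe0_when() computes.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative values in milliseconds; the kernel keeps these in jiffies. */
#define RTO_MIN_MS        200u     /* "default minimum 200ms" in the commit text */
#define RTO_MAX_MS        120000u  /* the "120 second clamp" */
#define RESOURCE_PROBE_MS 500u     /* retry interval on local congestion */

/* floor(log2(v)) for v > 0; stand-in for the kernel's ilog2(). */
static unsigned int ilog2_u32(uint32_t v)
{
	unsigned int r = 0;

	while (v >>= 1)
		r++;
	return r;
}

/* Modelled time covered by `boundary` backed-off attempts starting from
 * rto_base_ms (must be non-zero): doubling until RTO_MAX_MS is reached,
 * then one RTO_MAX_MS per further attempt.
 */
static uint32_t model_timeout_ms(unsigned int boundary, uint32_t rto_base_ms)
{
	unsigned int thresh = ilog2_u32(RTO_MAX_MS / rto_base_ms);

	if (boundary <= thresh)
		return ((2u << boundary) - 1) * rto_base_ms;
	return ((2u << thresh) - 1) * rto_base_ms +
	       (boundary - thresh) * RTO_MAX_MS;
}

/* USER_TIMEOUT check in the style of the patched tcp_probe_timer():
 * a user timeout of 0 means the option is not set.
 */
static bool probe_should_abort(unsigned int probes_out, uint32_t base_ms,
			       uint32_t user_timeout_ms)
{
	return user_timeout_ms &&
	       model_timeout_ms(probes_out, base_ms) >= user_timeout_ms;
}

struct probe_state {
	unsigned int probes_out;
	unsigned int backoff;
};

/* Delay before the next probe, in the style of the patched
 * tcp_send_probe0(). err > 0 stands for "not sent, local congestion".
 */
static uint32_t next_probe_delay_ms(struct probe_state *st, int err,
				    uint32_t base_ms, unsigned int retries2)
{
	uint64_t when;

	st->probes_out++;	/* counted even if the probe never left the host */
	if (err > 0)
		return RESOURCE_PROBE_MS;	/* fixed interval, no backoff */

	if (st->backoff < retries2)
		st->backoff++;
	when = (uint64_t)base_ms << st->backoff;
	return when < RTO_MAX_MS ? (uint32_t)when : RTO_MAX_MS;
}
```

Under this model and a 200ms base, three backed-off probes cover 3000ms
and fifteen cover 924600ms, so a USER_TIMEOUT of 3000ms would abort the
socket once the probe count reaches three. A probe dropped by local
congestion still bumps the counter but is retried after a flat 500ms.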