From patchwork Thu Feb 22 18:29:03 2024
X-Patchwork-Submitter: Philip Cox
X-Patchwork-Id: 1902918
From: Philip Cox
To: kernel-team@lists.ubuntu.com, philip.cox@canonical.com
Subject: [f/j/m:linux-aws][PATCH 1/1] tcp: Add memory barrier to tcp_push()
Date: Thu, 22 Feb 2024 13:29:03 -0500
Message-Id: <20240222182903.1490015-2-philip.cox@canonical.com>
In-Reply-To: <20240222182903.1490015-1-philip.cox@canonical.com>
References: <20240222182903.1490015-1-philip.cox@canonical.com>
List-Id: Kernel team discussions

From: Salvatore Dipietro

BugLink: https://bugs.launchpad.net/bugs/2051727

On CPUs with weak memory models, reads and updates performed by
tcp_push() on the sk variables can be reordered, leaving the socket
throttled when it should not be. The tasklet running tcp_wfree() may
also not observe the memory updates in time and will skip flushing any
packets throttled by tcp_push(), delaying the sending. This can
pathologically cause 40ms extra latency due to bad interactions with
delayed acks.

Adding a memory barrier in tcp_push() removes the bug, similarly to the
previous commit bf06200e732d ("tcp: tsq: fix nonagle handling").
smp_mb__after_atomic() is used to avoid incurring unnecessary overhead
on x86, which is not affected.

The patch has been tested using an AWS c7g.2xlarge instance with Ubuntu
22.04 and Apache Tomcat 9.0.83 running the basic servlet below:

import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;

import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class HelloWorldServlet extends HttpServlet {
    @Override
    protected void doGet(HttpServletRequest request, HttpServletResponse response)
      throws ServletException, IOException {
        response.setContentType("text/html;charset=utf-8");
        OutputStreamWriter osw = new OutputStreamWriter(response.getOutputStream(),"UTF-8");
        String s = "a".repeat(3096);
        osw.write(s,0,s.length());
        osw.flush();
    }
}

Load was applied using wrk2 (https://github.com/kinvolk/wrk2) from an AWS
c6i.8xlarge instance. Before the patch, an additional 40ms of latency is
observed at P99.99 and above; with the patch, the extra latency
disappears.

No patch and tcp_autocorking=1
./wrk -t32 -c128 -d40s --latency -R10000 http://172.31.60.173:8080/hello/hello
  ...
 50.000%    0.91ms
 75.000%    1.13ms
 90.000%    1.46ms
 99.000%    1.74ms
 99.900%    1.89ms
 99.990%   41.95ms  <<< 40+ ms extra latency
 99.999%   48.32ms
100.000%   48.96ms

With patch and tcp_autocorking=1
./wrk -t32 -c128 -d40s --latency -R10000 http://172.31.60.173:8080/hello/hello
  ...
 50.000%    0.90ms
 75.000%    1.13ms
 90.000%    1.45ms
 99.000%    1.72ms
 99.900%    1.83ms
 99.990%    2.11ms  <<< no 40+ ms extra latency
 99.999%    2.53ms
100.000%    2.62ms

The patch has also been tested on x86 (m7i.2xlarge instance), which is
not affected by this issue, and the patch does not introduce any
additional delay.

Fixes: 7aa5470c2c09 ("tcp: tsq: move tsq_flags close to sk_wmem_alloc")
Signed-off-by: Salvatore Dipietro
Reviewed-by: Eric Dumazet
Link: https://lore.kernel.org/r/20240119190133.43698-1-dipiets@amazon.com
Signed-off-by: Paolo Abeni
(cherry picked from commit 7267e8dcad6b2f9fce05a6a06335d7040acbc2b6)
Signed-off-by: Philip Cox
---
 net/ipv4/tcp.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 7e090b3adf2a..ae665ddf6243 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -722,6 +722,7 @@ void tcp_push(struct sock *sk, int flags, int mss_now,
 		if (!test_bit(TSQ_THROTTLED, &sk->sk_tsq_flags)) {
 			NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPAUTOCORKING);
 			set_bit(TSQ_THROTTLED, &sk->sk_tsq_flags);
+			smp_mb__after_atomic();
 		}
 		/* It is possible TX completion already happened
 		 * before we set TSQ_THROTTLED.