From patchwork Wed Jan 2 17:00:19 2019
X-Patchwork-Submitter: Mauricio Faria de Oliveira
X-Patchwork-Id: 1020033
From: Mauricio Faria de Oliveira
To: stable@vger.kernel.org, netdev@vger.kernel.org, Florian Westphal
Cc: Alakesh Haloi, nivedita.singhvi@canonical.com, Pablo Neira Ayuso,
 Jozsef Kadlecsik, "David S. Miller", Yi-Hung Wei
Subject: [PATCH 4.14 0/4] netfilter: xt_connlimit: backport upstream fixes
 for race in connection counting
Date: Wed, 2 Jan 2019 15:00:19 -0200
Message-Id: <20190102170023.10415-1-mfo@canonical.com>
X-Mailer: git-send-email 2.17.1
X-Mailing-List: netdev@vger.kernel.org

Recently, Alakesh Haloi reported the following issue [1] with stable/4.14:

"""
An iptable rule like the following on a multicore systems will result
in accepting more connections than set in the rule.

iptables -A INPUT -p tcp -m tcp --syn --dport 7777 -m connlimit \
      --connlimit-above 2000 --connlimit-mask 0 -j DROP
"""

And proposed a fix that is not in Linus's tree. The discussion went on
to confirm whether the issue was still reproducible with mainline/nf.git
tip, and to either identify the upstream fix or re-submit the
non-upstream fix.

Alakesh was eventually able to test with upstream, and reported that
the issue was still reproducible [2].

On that, our findings diverge, at least in my test environment:

First, I verified that the suggested mainline fix for the issue [3]
indeed fixes it, by testing v4.18 with the commit applied and with it
reverted (it reverts cleanly). The issue is reproducible with the
commit reverted.

Then, with a consistent reproducer, I moved to nf.git, with HEAD on
commit a007232 ("netfilter: nf_conncount: fix argument order to
find_next_bit"), and the issue was not reproducible. That held even
with 20+ threads on the client side (the number Alakesh reported to
achieve 2150+ connections [4]), and also when spreading the network
interface IRQ affinity over more and more CPUs.

Either way, the suggested mainline fix does fix the issue in 4.14 for
at least one environment.

So, it might well be the case that Alakesh's test environment has
differences/subtleties that lead to more connections being accepted,
and that more commits are needed for that particular environment type.

But for now, with one bare-metal environment (24-core server, 4-core
client) verified, I thought of submitting the patches for
review/comments/testing, then looking for additional fixes for that
environment separately.

The fix is PATCH 4/4, and PATCHes 1-3/4 are helpers for a cleaner
backport. All backports are simple, and essentially consist of
refreshing context lines and using older struct/file names.

Reviews from netfilter maintainers are much appreciated, as I have no
previous experience in this area. Although the backports look simple
and build/run correctly, there's usually stuff that only more
experienced people may notice.

Thanks,
Mauricio

Links:
=====

[1] https://www.spinics.net/lists/stable/msg270040.html
[2] https://www.spinics.net/lists/stable/msg273669.html
[3] https://www.spinics.net/lists/stable/msg271300.html
[4] https://www.spinics.net/lists/stable/msg273669.html

Test-case:
=========

- v4.14.91 (original): client achieves 2000+ connections (6000 target)
  with 3 threads.

    server # iptables -F
    server # iptables -A INPUT -p tcp -m tcp --syn --dport 7777 -m connlimit --connlimit-above 2000 --connlimit-mask 0 -j DROP
    server # iptables -L
    Chain INPUT (policy ACCEPT)
    target     prot opt source               destination
    DROP       tcp  --  anywhere             anywhere             tcp dpt:7777 flags:FIN,SYN,RST,ACK/SYN #conn src/0 > 2000

    Chain FORWARD (policy ACCEPT)
    target     prot opt source               destination

    Chain OUTPUT (policy ACCEPT)
    target     prot opt source               destination

    server # ulimit -SHn 65000
    server # ruby server.rb
    <... listening ...>

    client # ulimit -SHn 65000
    client # ruby client.rb 10.230.56.100 7777 6000 3
    Connecting to ["10.230.56.100"]:7777 6000 times with 3
    1
    2
    3
    <...>
    2000
    <...>
    6000
    Target reached. Thread finishing
    6001
    Target reached. Thread finishing
    6002
    Target reached. Thread finishing
    Threads done.
    6002 connections
    press enter to exit

- v4.14.91 + patches: client only achieved 2000 connections.

    server # (same procedure)

    client # (same procedure)
    Connecting to ["10.230.56.100"]:7777 6000 times with 3
    1
    2
    3
    <...>
    2000
    <... blocked for a while...>
    failed to create connection: Connection timed out - connect(2) for "10.230.56.100" port 7777
    failed to create connection: Connection timed out - connect(2) for "10.230.56.100" port 7777
    failed to create connection: Connection timed out - connect(2) for "10.230.56.100" port 7777
    Threads done.
    2000 connections
    press enter to exit

Florian Westphal (2):
  netfilter: xt_connlimit: don't store address in the conn nodes
  netfilter: nf_conncount: fix garbage collection confirm race

Pablo Neira Ayuso (1):
  netfilter: nf_conncount: expose connection list interface

Yi-Hung Wei (1):
  netfilter: nf_conncount: Fix garbage collection with zones

 include/net/netfilter/nf_conntrack_count.h | 15 +++++
 net/netfilter/xt_connlimit.c               | 99 +++++++++++++++++++++++-------
 2 files changed, 91 insertions(+), 23 deletions(-)
 create mode 100644 include/net/netfilter/nf_conntrack_count.h
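
Note on the test-case client: the server.rb/client.rb scripts used in
the test-case are not included in this posting. For anyone who wants to
try a similar test, the following is a minimal sketch in Python of what
such a client might look like. It is an assumption based only on the
output shown in the test-case, not the script that was actually used;
the function name open_connections and its parameters are hypothetical.
Several threads open TCP connections and keep them open until the
target count is reached or a connect attempt fails.

```python
import socket
import threading


def open_connections(host, port, target, num_threads, timeout=5.0):
    """Open TCP connections from several threads and hold them open.

    Hypothetical helper (not the client.rb from the test-case). Each
    thread stops when `target` connections are held or when a connect
    attempt fails. Returns the list of open sockets.
    """
    conns = []
    lock = threading.Lock()

    def worker():
        while True:
            with lock:
                if len(conns) >= target:
                    return  # target reached, thread finishes
            try:
                sock = socket.create_connection((host, port), timeout=timeout)
            except OSError as exc:
                # If the connlimit rule drops the SYN, connect() is expected
                # to time out rather than be refused.
                print(f"failed to create connection: {exc}")
                return
            with lock:
                conns.append(sock)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return conns
```

It could be called as open_connections("10.230.56.100", 7777, 6000, 3),
followed by printing len() of the result. Because the target check and
the connect are not done atomically, the final count can exceed the
target by up to (threads - 1), which would be consistent with the 6002
connections seen above with 3 threads and a 6000 target.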