From patchwork Fri Nov 1 01:23:02 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ilya Maximets X-Patchwork-Id: 2004933 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=openvswitch.org (client-ip=2605:bc80:3010::133; helo=smtp2.osuosl.org; envelope-from=ovs-dev-bounces@openvswitch.org; receiver=patchwork.ozlabs.org) Received: from smtp2.osuosl.org (smtp2.osuosl.org [IPv6:2605:bc80:3010::133]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4Xfjmy0TN5z1xwF for ; Fri, 1 Nov 2024 12:23:45 +1100 (AEDT) Received: from localhost (localhost [127.0.0.1]) by smtp2.osuosl.org (Postfix) with ESMTP id 6AA2C40F0C; Fri, 1 Nov 2024 01:23:43 +0000 (UTC) X-Virus-Scanned: amavis at osuosl.org Received: from smtp2.osuosl.org ([127.0.0.1]) by localhost (smtp2.osuosl.org [127.0.0.1]) (amavis, port 10024) with ESMTP id e14ovTn26nZR; Fri, 1 Nov 2024 01:23:39 +0000 (UTC) X-Comment: SPF check N/A for local connections - client-ip=140.211.9.56; helo=lists.linuxfoundation.org; envelope-from=ovs-dev-bounces@openvswitch.org; receiver= DKIM-Filter: OpenDKIM Filter v2.11.0 smtp2.osuosl.org BFC6C40EB0 Received: from lists.linuxfoundation.org (lf-lists.osuosl.org [140.211.9.56]) by smtp2.osuosl.org (Postfix) with ESMTPS id BFC6C40EB0; Fri, 1 Nov 2024 01:23:39 +0000 (UTC) Received: from lf-lists.osuosl.org (localhost [127.0.0.1]) by lists.linuxfoundation.org (Postfix) with ESMTP id 9E202C08A6; Fri, 1 Nov 2024 01:23:39 +0000 (UTC) X-Original-To: ovs-dev@openvswitch.org Delivered-To: ovs-dev@lists.linuxfoundation.org Received: from smtp3.osuosl.org (smtp3.osuosl.org [140.211.166.136]) by 
lists.linuxfoundation.org (Postfix) with ESMTP id 0ECEBC08A3 for ; Fri, 1 Nov 2024 01:23:39 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by smtp3.osuosl.org (Postfix) with ESMTP id F2DC760625 for ; Fri, 1 Nov 2024 01:23:38 +0000 (UTC) X-Virus-Scanned: amavis at osuosl.org Received: from smtp3.osuosl.org ([127.0.0.1]) by localhost (smtp3.osuosl.org [127.0.0.1]) (amavis, port 10024) with ESMTP id HGRJ3uyVQfBn for ; Fri, 1 Nov 2024 01:23:37 +0000 (UTC) Received-SPF: Pass (mailfrom) identity=mailfrom; client-ip=209.85.221.67; helo=mail-wr1-f67.google.com; envelope-from=i.maximets.ovn@gmail.com; receiver= DMARC-Filter: OpenDMARC Filter v1.4.2 smtp3.osuosl.org 060706060A Authentication-Results: smtp3.osuosl.org; dmarc=none (p=none dis=none) header.from=ovn.org DKIM-Filter: OpenDKIM Filter v2.11.0 smtp3.osuosl.org 060706060A Received: from mail-wr1-f67.google.com (mail-wr1-f67.google.com [209.85.221.67]) by smtp3.osuosl.org (Postfix) with ESMTPS id 060706060A for ; Fri, 1 Nov 2024 01:23:36 +0000 (UTC) Received: by mail-wr1-f67.google.com with SMTP id ffacd0b85a97d-3807dd08cfcso1194659f8f.1 for ; Thu, 31 Oct 2024 18:23:36 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1730424215; x=1731029015; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=Zq6lFVMUs+wr1Pz/nEovIRLPJHvDoXtZj5o1tnR9wKI=; b=hB/+tbQO6FLptgDfDINobX3ndKaG+epSn0oH41TPU2MWcCUGeUXSQBkVMW2+XJDpQl wULo9tshjThyGVdClxv8fUHCLjv/hD6glO65HwPDsinYX1Qnw3v8IYH8WEcAjSmq5eCz w7drVpDajsBH6gREb1vh86CtINChrMUR4ssgU5UHpldvRis/z73xfG/8WPU015wpAFfW pUiiD38jKtWCe4HRunQ7OV6qXbQIzimFPLOz4ARKYjnwv6jVeWW7cQ0m7IDA4fW1QSUz npC5RnFnMSPDOsADse7d1AYy4vEfrm9e3bsfl2WEd6FdwIAjvyqlscrbhw0in5qz1+wm Q3ew== X-Gm-Message-State: AOJu0YyQaSWqLRfGOU7utpbcOJZYFJOZ6eCgTQgcsOlICph6AQB+Zx0l hHU68H0l7k65Mk2yrryUW0cIUZRHyWdMBipUWk5dTmKwy89/sEeWHD3qij1l 
X-Google-Smtp-Source: AGHT+IG/00MH4D2uhPA0UXDGrm736ku6iLvFSQVphTks7RUQUW2URkZX7Jq0PPCVvSGasmB+LBPscQ== X-Received: by 2002:a5d:64a7:0:b0:37c:cfdc:19ba with SMTP id ffacd0b85a97d-381c7a6bc70mr1663822f8f.28.1730424214648; Thu, 31 Oct 2024 18:23:34 -0700 (PDT) Received: from im-t490s.redhat.com (ip-86-49-44-151.bb.vodafone.cz. [86.49.44.151]) by smtp.gmail.com with ESMTPSA id a640c23a62f3a-a9e5663fef4sm126112766b.149.2024.10.31.18.23.32 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 31 Oct 2024 18:23:33 -0700 (PDT)
From: Ilya Maximets
To: ovs-dev@openvswitch.org
Date: Fri, 1 Nov 2024 02:23:02 +0100
Message-ID: <20241101012321.3346333-2-i.maximets@ovn.org>
X-Mailer: git-send-email 2.46.0
In-Reply-To: <20241101012321.3346333-1-i.maximets@ovn.org>
References: <20241101012321.3346333-1-i.maximets@ovn.org>
MIME-Version: 1.0
Subject: [ovs-dev] [PATCH v3 01/10] ipsec: Add a helper function to run commands from the monitor.
X-BeenThere: ovs-dev@openvswitch.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: ,
Cc: Ilya Maximets
Errors-To: ovs-dev-bounces@openvswitch.org
Sender: "dev"

Until now, functions that needed to call external programs like openssl or ipsec commands were using subprocess commands directly. Most of these calls had no failure checks or logging, making it hard to understand what is happening inside the daemon when something doesn't work as intended.

Some commands could also fail to read the command output in full. That might not sound like a big problem, but in practice it causes ovs-monitor-ipsec to deadlock pluto and itself with certain versions of Libreswan (mainly Libreswan 5+). The order of events is as follows:

 1. ovs-monitor-ipsec calls ipsec status redirecting the output to a pipe.
 2. ipsec status calls ipsec whack.
 3. ipsec whack connects to pluto and asks for status.
 4. ovs-monitor-ipsec doesn't read the pipe in full.
 5. ipsec whack blocks on write to the other side of the pipe when it runs out of buffer space.
 6. pluto blocks on sendmsg to ipsec whack for the same reason.
 7. ovs-monitor-ipsec calls another ipsec command and blocks on connection to pluto.

In this scenario the running process is at the mercy of the garbage collector, which doesn't run because we're blocked on calling another ipsec command. All the processes are completely blocked and will not do any work until ipsec whack is killed.

With this change we're introducing a new function that will be used for all the external process execution commands and will read the full output before returning, avoiding the deadlock. It will also log all the failures as warnings and the commands themselves at the debug level.

We'll be adding more logic into this function in later commits as well, so it will not stay that simple.

Acked-by: Roi Dayan
Signed-off-by: Ilya Maximets
Acked-by: Eelco Chaudron
---
 ipsec/ovs-monitor-ipsec.in | 290 +++++++++++++++++--------------------
 1 file changed, 131 insertions(+), 159 deletions(-)

diff --git a/ipsec/ovs-monitor-ipsec.in b/ipsec/ovs-monitor-ipsec.in
index 37c509ac6..771a3c745 100755
--- a/ipsec/ovs-monitor-ipsec.in
+++ b/ipsec/ovs-monitor-ipsec.in
@@ -84,6 +84,28 @@ monitor = None xfrm = None +def run_command(args, description=None): + """ This function runs the process args[0] with args[1:] arguments + and returns a tuple: return-code, stdout, stderr. 
""" + + if not description: + description = "run %s command" % args[0] + + vlog.dbg("Running %s" % args) + proc = subprocess.Popen(args, stdout=subprocess.PIPE, + stderr=subprocess.PIPE) + pout, perr = proc.communicate() + + if proc.returncode or perr: + vlog.warn("Failed to %s; exit code: %d" + % (description, proc.returncode)) + vlog.warn("cmdline: %s" % proc.args) + vlog.warn("stderr: %s" % perr) + vlog.warn("stdout: %s" % pout) + + return proc.returncode, pout.decode(), perr.decode() + + class XFRM(object): """This class is a simple wrapper around ip-xfrm (8) command line utility. We are using this class only for informational purposes @@ -99,13 +121,14 @@ class XFRM(object): where is destination IPv4 address and is SELECTOR of the IPsec policy.""" policies = {} - proc = subprocess.Popen([self.IP, 'xfrm', 'policy'], - stdout=subprocess.PIPE) - while True: - line = proc.stdout.readline().strip().decode() - if line == '': - break - a = line.split(" ") + + ret, pout, perr = run_command([self.IP, 'xfrm', 'policy'], + "get XFRM policies") + if ret: + return policies + + for line in pout.splitlines(): + a = line.strip().split(" ") if len(a) >= 4 and a[0] == "src" and a[2] == "dst": dst = (a[3].split("/"))[0] if dst not in policies: @@ -122,13 +145,14 @@ class XFRM(object): in a dictionary where is destination IPv4 address and is SELECTOR.""" securities = {} - proc = subprocess.Popen([self.IP, 'xfrm', 'state'], - stdout=subprocess.PIPE) - while True: - line = proc.stdout.readline().strip().decode() - if line == '': - break - a = line.split(" ") + + ret, pout, perr = run_command([self.IP, 'xfrm', 'state'], + "get XFRM state") + if ret: + return securities + + for line in pout.splitlines(): + a = line.strip().split(" ") if len(a) >= 4 and a[0] == "sel" \ and a[1] == "src" and a[3] == "dst": remote_ip = a[4].rstrip().split("/")[0] @@ -242,7 +266,7 @@ conn prevent_unencrypted_vxlan f.close() vlog.info("Restarting StrongSwan") - subprocess.call([self.IPSEC, "restart"]) + 
run_command([self.IPSEC, "restart"], "restart StrongSwan") def get_active_conns(self): """This function parses output from 'ipsec status' command. @@ -252,13 +276,13 @@ conn prevent_unencrypted_vxlan sample line from the parsed outpus as . """ conns = {} - proc = subprocess.Popen([self.IPSEC, 'status'], stdout=subprocess.PIPE) + ret, pout, perr = run_command([self.IPSEC, 'status'], + "get active connections") + if ret: + return conns - while True: - line = proc.stdout.readline().strip().decode() - if line == '': - break - tunnel_name = line.split(":") + for line in pout.splitlines(): + tunnel_name = line.strip().split(":") if len(tunnel_name) < 2: continue m = re.match(r"(.*)(-in-\d+|-out-\d+|-\d+).*", tunnel_name[0]) @@ -341,15 +365,11 @@ conn prevent_unencrypted_vxlan Once strongSwan vici bindings will be distributed with major Linux distributions this function could be simplified.""" vlog.info("Refreshing StrongSwan configuration") - proc = subprocess.Popen([self.IPSEC, "update"], - stdout=subprocess.PIPE, - stderr=subprocess.PIPE) - outs, errs = proc.communicate() - if proc.returncode != 0: - vlog.err("StrongSwan failed to update configuration:\n" - "%s \n %s" % (str(outs), str(errs))) - - subprocess.call([self.IPSEC, "rereadsecrets"]) + + run_command([self.IPSEC, "update"], + "update StrongSwan's configuration") + run_command([self.IPSEC, "rereadsecrets"], "re-read secrets") + # "ipsec update" command does not remove those tunnels that were # updated or that disappeared from the ipsec.conf file. 
So, we have # to manually remove them by calling "ipsec stroke down-nb " @@ -382,7 +402,8 @@ conn prevent_unencrypted_vxlan if not tunnel or tunnel.version != ver: vlog.info("%s is outdated %u" % (conn, ver)) - subprocess.call([self.IPSEC, "stroke", "down-nb", conn]) + run_command([self.IPSEC, "stroke", "down-nb", conn], + "stroke the outdated %s" % conn) class LibreSwanHelper(object): @@ -460,13 +481,11 @@ conn prevent_unencrypted_vxlan # Collect version infromation self.IPSEC = libreswan_root_prefix + "/usr/sbin/ipsec" self.IPSEC_AUTO = [self.IPSEC] - proc = subprocess.Popen([self.IPSEC, "--version"], - stdout=subprocess.PIPE, - encoding="latin1") - pout, perr = proc.communicate() - v = re.match("^Libreswan v?(.*)$", pout) + ret, pout, perr = run_command([self.IPSEC, "--version"], + "get Libreswan's version") try: + v = re.match("^Libreswan v?(.*)$", pout.strip()) version = int(v.group(1).split(".")[0]) except: version = 0 @@ -513,7 +532,7 @@ conn prevent_unencrypted_vxlan f.close() vlog.info("Restarting LibreSwan") - subprocess.call([self.IPSEC, "restart"]) + run_command([self.IPSEC, "restart"], "restart Libreswan") def config_init(self): self.conf_file = open(self.IPSEC_CONF, "w") @@ -599,8 +618,10 @@ conn prevent_unencrypted_vxlan def refresh(self, monitor): vlog.info("Refreshing LibreSwan configuration") - subprocess.call(self.IPSEC_AUTO + ["--ctlsocket", self.IPSEC_CTL, - "--config", self.IPSEC_CONF, "--rereadsecrets"]) + run_command(self.IPSEC_AUTO + ["--ctlsocket", self.IPSEC_CTL, + "--config", self.IPSEC_CONF, + "--rereadsecrets"], + "re-read secrets") tunnels = set(monitor.tunnels.keys()) # Delete old connections @@ -627,9 +648,10 @@ conn prevent_unencrypted_vxlan if not tunnel or tunnel.version != ver: vlog.info("%s is outdated %u" % (conn, ver)) - subprocess.call(self.IPSEC_AUTO + ["--ctlsocket", - self.IPSEC_CTL, "--config", - self.IPSEC_CONF, "--delete", conn]) + run_command(self.IPSEC_AUTO + + ["--ctlsocket", self.IPSEC_CTL, + "--config", 
self.IPSEC_CONF, + "--delete", conn], "delete %s" % conn) elif ifname in tunnels: tunnels.remove(ifname) @@ -649,43 +671,43 @@ conn prevent_unencrypted_vxlan # Update shunt policy if changed if monitor.conf_in_use["skb_mark"] != monitor.conf["skb_mark"]: if monitor.conf["skb_mark"]: - subprocess.call(self.IPSEC_AUTO + + run_command(self.IPSEC_AUTO + ["--config", self.IPSEC_CONF, "--ctlsocket", self.IPSEC_CTL, "--add", "--asynchronous", "prevent_unencrypted_gre"]) - subprocess.call(self.IPSEC_AUTO + + run_command(self.IPSEC_AUTO + ["--config", self.IPSEC_CONF, "--ctlsocket", self.IPSEC_CTL, "--add", "--asynchronous", "prevent_unencrypted_geneve"]) - subprocess.call(self.IPSEC_AUTO + + run_command(self.IPSEC_AUTO + ["--config", self.IPSEC_CONF, "--ctlsocket", self.IPSEC_CTL, "--add", "--asynchronous", "prevent_unencrypted_stt"]) - subprocess.call(self.IPSEC_AUTO + + run_command(self.IPSEC_AUTO + ["--config", self.IPSEC_CONF, "--ctlsocket", self.IPSEC_CTL, "--add", "--asynchronous", "prevent_unencrypted_vxlan"]) else: - subprocess.call(self.IPSEC_AUTO + + run_command(self.IPSEC_AUTO + ["--config", self.IPSEC_CONF, "--ctlsocket", self.IPSEC_CTL, "--delete", "--asynchronous", "prevent_unencrypted_gre"]) - subprocess.call(self.IPSEC_AUTO + + run_command(self.IPSEC_AUTO + ["--config", self.IPSEC_CONF, "--ctlsocket", self.IPSEC_CTL, "--delete", "--asynchronous", "prevent_unencrypted_geneve"]) - subprocess.call(self.IPSEC_AUTO + + run_command(self.IPSEC_AUTO + ["--config", self.IPSEC_CONF, "--ctlsocket", self.IPSEC_CTL, "--delete", "--asynchronous", "prevent_unencrypted_stt"]) - subprocess.call(self.IPSEC_AUTO + + run_command(self.IPSEC_AUTO + ["--config", self.IPSEC_CONF, "--ctlsocket", self.IPSEC_CTL, "--delete", @@ -700,14 +722,13 @@ conn prevent_unencrypted_vxlan sample line from the parsed outpus as . 
""" conns = {} - proc = subprocess.Popen([self.IPSEC, 'status', '--ctlsocket', - self.IPSEC_CTL], stdout=subprocess.PIPE) - - while True: - line = proc.stdout.readline().strip().decode() - if line == '': - break + ret, pout, perr = run_command([self.IPSEC, 'status', + '--ctlsocket', self.IPSEC_CTL], + "get active connections") + if ret: + return conns + for line in pout.splitlines(): m = re.search(r"#\d+: \"(.*)\".*", line) if not m: continue @@ -732,15 +753,12 @@ conn prevent_unencrypted_vxlan # the "ipsec auto --start" command is lost. Just retry to make sure # the command is received by LibreSwan. while True: - proc = subprocess.Popen(self.IPSEC_AUTO + - ["--config", self.IPSEC_CONF, - "--ctlsocket", self.IPSEC_CTL, - "--start", - "--asynchronous", conn], - stdout=subprocess.PIPE, - stderr=subprocess.PIPE) - perr = str(proc.stderr.read()) - pout = str(proc.stdout.read()) + ret, pout, perr = run_command(self.IPSEC_AUTO + + ["--config", self.IPSEC_CONF, + "--ctlsocket", self.IPSEC_CTL, + "--start", + "--asynchronous", conn], + "start %s" % conn) if not re.match(r".*Connection refused.*", perr) and \ not re.match(r".*need --listen.*", pout): break @@ -748,101 +766,59 @@ conn prevent_unencrypted_vxlan if re.match(r".*[F|f]ailed to initiate connection.*", pout): vlog.err('Failed to initiate connection through' ' Interface %s.\n' % (conn.split('-')[0])) - vlog.err(pout) + vlog.err("stdout: %s" % pout) def _nss_clear_database(self): """Remove all OVS IPsec related state from the NSS database""" - try: - proc = subprocess.Popen(['certutil', '-L', '-d', - self.IPSEC_D], - stdout=subprocess.PIPE, - stderr=subprocess.PIPE, - universal_newlines=True) - lines = proc.stdout.readlines() - - for line in lines: - s = line.strip().split() - if len(s) < 1: - continue - name = s[0] - if name.startswith(self.CERT_PREFIX): - self._nss_delete_cert(name) - elif name.startswith(self.CERTKEY_PREFIX): - self._nss_delete_cert_and_key(name) + ret, pout, perr = run_command(['certutil', '-L', 
'-d', self.IPSEC_D], + "clear NSS database") + if ret: + return - except Exception as e: - vlog.err("Failed to clear NSS database.\n" + str(e)) + for line in pout.splitlines(): + s = line.strip().split() + if len(s) < 1: + continue + name = s[0] + if name.startswith(self.CERT_PREFIX): + self._nss_delete_cert(name) + elif name.startswith(self.CERTKEY_PREFIX): + self._nss_delete_cert_and_key(name) def _nss_import_cert(self, cert, name, cert_type): """Cert_type is 'CT,,' for the CA certificate and 'P,P,P' for the normal certificate.""" - try: - proc = subprocess.Popen(['certutil', '-A', '-a', '-i', cert, - '-d', self.IPSEC_D, '-n', - name, '-t', cert_type], - stdout=subprocess.PIPE, - stderr=subprocess.PIPE) - proc.wait() - if proc.returncode: - raise Exception(proc.stderr.read()) - except Exception as e: - vlog.err("Failed to import certificate into NSS.\n" + str(e)) + run_command(['certutil', '-A', '-a', '-i', cert, '-d', self.IPSEC_D, + '-n', name, '-t', cert_type], + "import certificate %s into NSS" % name) def _nss_delete_cert(self, name): - try: - proc = subprocess.Popen(['certutil', '-D', '-d', - self.IPSEC_D, '-n', name], - stdout=subprocess.PIPE, - stderr=subprocess.PIPE) - proc.wait() - if proc.returncode: - raise Exception(proc.stderr.read()) - except Exception as e: - vlog.err("Failed to delete certificate from NSS.\n" + str(e)) + run_command(['certutil', '-D', '-d', self.IPSEC_D, '-n', name], + "delete certificate %s from NSS" % name) def _nss_import_cert_and_key(self, cert, key, name): - try: - # Avoid deleting other files - path = os.path.abspath('/tmp/%s.p12' % name) - if not path.startswith('/tmp/'): - raise Exception("Illegal certificate name!") - - # Create p12 file from pem files - proc = subprocess.Popen(['openssl', 'pkcs12', '-export', - '-in', cert, '-inkey', key, '-out', - path, '-name', name, '-passout', 'pass:'], - stdout=subprocess.PIPE, - stderr=subprocess.PIPE) - proc.wait() - if proc.returncode: - raise Exception(proc.stderr.read()) - - # 
Load p12 file to the database - proc = subprocess.Popen(['pk12util', '-i', path, '-d', - self.IPSEC_D, '-W', ''], - stdout=subprocess.PIPE, - stderr=subprocess.PIPE) - proc.wait() - if proc.returncode: - raise Exception(proc.stderr.read()) - - except Exception as e: - vlog.err("Import cert and key failed.\n" + str(e)) + # Avoid deleting other files + path = os.path.abspath('/tmp/%s.p12' % name) + if not path.startswith('/tmp/'): + vlog.err("Illegal certificate name '%s'!" % name) + return + + if run_command(['openssl', 'pkcs12', '-export', + '-in', cert, '-inkey', key, + '-out', path, '-name', name, + '-passout', 'pass:'], + "create p12 file from pem files")[0]: + return + + # Load p12 file to the database + run_command(['pk12util', '-i', path, '-d', self.IPSEC_D, '-W', ''], + "load p12 file to the NSS database") os.remove(path) def _nss_delete_cert_and_key(self, name): - try: - # Delete certificate and private key - proc = subprocess.Popen(['certutil', '-F', '-d', - self.IPSEC_D, '-n', name], - stdout=subprocess.PIPE, - stderr=subprocess.PIPE) - proc.wait() - if proc.returncode: - raise Exception(proc.stderr.read()) - - except Exception as e: - vlog.err("Delete cert and key failed.\n" + str(e)) + # Delete certificate and private key + run_command(['certutil', '-F', '-d', self.IPSEC_D, '-n', name], + "delete certificate and private key for %s" % name) class IPsecTunnel(object): @@ -1220,19 +1196,15 @@ class IPsecMonitor(object): self.ike_helper.refresh(self) def _get_cn_from_cert(self, cert): - try: - proc = subprocess.Popen(['openssl', 'x509', '-noout', '-subject', - '-nameopt', 'RFC2253', '-in', cert], - stdout=subprocess.PIPE, - stderr=subprocess.PIPE) - proc.wait() - if proc.returncode: - raise Exception(proc.stderr.read()) - m = re.search(r"CN=(.+?),", proc.stdout.readline().decode()) - if not m: - raise Exception("No CN in the certificate subject.") - except Exception as e: - vlog.warn(str(e)) + ret, pout, perr = run_command(['openssl', 'x509', '-noout', 
'-subject', + '-nameopt', 'RFC2253', '-in', cert], + "get certificate %s options" % cert) + if ret: + return None + + m = re.search(r"CN=(.+?),", pout.strip()) + if not m: + vlog.warn("No CN in the certificate subject (%s)." % cert) return None return m.group(1) From patchwork Fri Nov 1 01:23:03 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ilya Maximets X-Patchwork-Id: 2004934 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=openvswitch.org (client-ip=2605:bc80:3010::133; helo=smtp2.osuosl.org; envelope-from=ovs-dev-bounces@openvswitch.org; receiver=patchwork.ozlabs.org) Received: from smtp2.osuosl.org (smtp2.osuosl.org [IPv6:2605:bc80:3010::133]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4Xfjn12QsPz1xwF for ; Fri, 1 Nov 2024 12:23:49 +1100 (AEDT) Received: from localhost (localhost [127.0.0.1]) by smtp2.osuosl.org (Postfix) with ESMTP id 6EBF940F13; Fri, 1 Nov 2024 01:23:47 +0000 (UTC) X-Virus-Scanned: amavis at osuosl.org Received: from smtp2.osuosl.org ([127.0.0.1]) by localhost (smtp2.osuosl.org [127.0.0.1]) (amavis, port 10024) with ESMTP id oTV_MSAHQOmQ; Fri, 1 Nov 2024 01:23:44 +0000 (UTC) X-Comment: SPF check N/A for local connections - client-ip=2605:bc80:3010:104::8cd3:938; helo=lists.linuxfoundation.org; envelope-from=ovs-dev-bounces@openvswitch.org; receiver= DKIM-Filter: OpenDKIM Filter v2.11.0 smtp2.osuosl.org 767E640F04 Received: from lists.linuxfoundation.org (lf-lists.osuosl.org [IPv6:2605:bc80:3010:104::8cd3:938]) by smtp2.osuosl.org (Postfix) with ESMTPS id 767E640F04; Fri, 1 Nov 2024 01:23:43 +0000 (UTC) Received: from lf-lists.osuosl.org 
(localhost [127.0.0.1]) by lists.linuxfoundation.org (Postfix) with ESMTP id 3CF0BC08A6; Fri, 1 Nov 2024 01:23:43 +0000 (UTC) X-Original-To: ovs-dev@openvswitch.org Delivered-To: ovs-dev@lists.linuxfoundation.org Received: from smtp2.osuosl.org (smtp2.osuosl.org [IPv6:2605:bc80:3010::133]) by lists.linuxfoundation.org (Postfix) with ESMTP id 9A4BFC08B4 for ; Fri, 1 Nov 2024 01:23:41 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by smtp2.osuosl.org (Postfix) with ESMTP id 7B41940EE6 for ; Fri, 1 Nov 2024 01:23:41 +0000 (UTC) X-Virus-Scanned: amavis at osuosl.org Received: from smtp2.osuosl.org ([127.0.0.1]) by localhost (smtp2.osuosl.org [127.0.0.1]) (amavis, port 10024) with ESMTP id JgdkbXLhShE0 for ; Fri, 1 Nov 2024 01:23:40 +0000 (UTC) Received-SPF: Pass (mailfrom) identity=mailfrom; client-ip=209.85.208.65; helo=mail-ed1-f65.google.com; envelope-from=i.maximets.ovn@gmail.com; receiver= DMARC-Filter: OpenDMARC Filter v1.4.2 smtp2.osuosl.org 003B240ECF Authentication-Results: smtp2.osuosl.org; dmarc=none (p=none dis=none) header.from=ovn.org DKIM-Filter: OpenDKIM Filter v2.11.0 smtp2.osuosl.org 003B240ECF Received: from mail-ed1-f65.google.com (mail-ed1-f65.google.com [209.85.208.65]) by smtp2.osuosl.org (Postfix) with ESMTPS id 003B240ECF for ; Fri, 1 Nov 2024 01:23:39 +0000 (UTC) Received: by mail-ed1-f65.google.com with SMTP id 4fb4d7f45d1cf-5c96936065dso1867853a12.3 for ; Thu, 31 Oct 2024 18:23:39 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1730424218; x=1731029018; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=hYuGQorpM39/sLgsp3x+9y/CLrBziQ23bVDI/ysxkkw=; b=qNL6b5Ovf7BfEein6he4Misok7Xh/p5s9BUDiyU0E8KWm0qYT24pV7gej9BRkdJQ08 06gDcqLuP+wS+V4J2LXyPvZjSrLotYNefCKCWX+W03uWTw/zcweUrVTlErvKulhW9lAq lmVyCpPHzVsKAbrYvWVp/O6UjkGewu6rjo9ZX6bsXAPdZmervAD7E9YVplnF5qdfZPqH 
bu+EYFgRyR4wrGri8fqwQuf7xJI8tB0Aqf7fjrX7mzdc3eNuPXlmPkh9E/5MjaycGldH FVYQnn5vPFvMWCNMnlEVDFyIiZna0dGaWxQYWTIGK3mtyXwZ//3ekLFUqAUVuUxvTw7w RyeQ== X-Gm-Message-State: AOJu0YzxIJHZ042s8oI/lebfLWGKHWE32yscYTLHitvQltb3+fAwOFnw x3MQvgMtkNtOzn84annF/tK/VFvQdHNTyV/PpOQfbvEmhVcEyXXwaOIfBBpd X-Google-Smtp-Source: AGHT+IECFGIN4AmjxIywrVeSeeXKadBmIkN/Jdod3eWjMdQNhPHN+D66YI7v/QmzLSo8OB7Zfok5Sw== X-Received: by 2002:a17:907:96a9:b0:a9a:3cc6:f14c with SMTP id a640c23a62f3a-a9e3a6c8e88mr908704166b.48.1730424217868; Thu, 31 Oct 2024 18:23:37 -0700 (PDT) Received: from im-t490s.redhat.com (ip-86-49-44-151.bb.vodafone.cz. [86.49.44.151]) by smtp.gmail.com with ESMTPSA id a640c23a62f3a-a9e5663fef4sm126112766b.149.2024.10.31.18.23.35 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 31 Oct 2024 18:23:36 -0700 (PDT)
From: Ilya Maximets
To: ovs-dev@openvswitch.org
Date: Fri, 1 Nov 2024 02:23:03 +0100
Message-ID: <20241101012321.3346333-3-i.maximets@ovn.org>
X-Mailer: git-send-email 2.46.0
In-Reply-To: <20241101012321.3346333-1-i.maximets@ovn.org>
References: <20241101012321.3346333-1-i.maximets@ovn.org>
MIME-Version: 1.0
Subject: [ovs-dev] [PATCH v3 02/10] ipsec: libreswan: Fix regexp for connections waiting on child SA.
X-BeenThere: ovs-dev@openvswitch.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: ,
Cc: Ilya Maximets
Errors-To: ovs-dev-bounces@openvswitch.org
Sender: "dev"

Connections waiting on a child SA should be considered active, because pluto is waiting for the other side to react. We should not remove them or try to repair them.

Such connections have extra text between the SA number and the name of the connection. Ideally, we would not parse the output of ipsec status at all, since it's very error prone, but there is, unfortunately, no other interface.

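
The effect of the regexp change can be sketched as follows. This is only an illustration: the two status lines and the helper conn_name() are made up for the example and are not actual 'ipsec status' output, which differs between Libreswan versions. Only the two patterns are taken from the patch.

```python
import re

# Pattern used before this patch: the quoted name must follow "#N: " directly.
OLD = re.compile(r"#\d+: \"(.*)\".*")
# Pattern after this patch: arbitrary text is allowed before the quoted name.
NEW = re.compile(r"#\d+: .*\"(.*)\".*")

# Hypothetical lines, invented for illustration only.
established = '000 #3: "tun-1-in-1":500 some established state'
waiting = '000 #5: pending CHILD SA for "tun-1-out-1"'


def conn_name(pattern, line):
    """Return the connection name captured by pattern, or None."""
    m = pattern.search(line)
    return m.group(1) if m else None


for line in (established, waiting):
    print(conn_name(OLD, line), conn_name(NEW, line))
```

With these made-up inputs the old pattern finds no name in the second line, so such a connection would not be reported as active, while the new pattern extracts the name from both lines.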
Acked-by: Roi Dayan Acked-by: Eelco Chaudron Signed-off-by: Ilya Maximets --- ipsec/ovs-monitor-ipsec.in | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/ipsec/ovs-monitor-ipsec.in b/ipsec/ovs-monitor-ipsec.in index 771a3c745..08df35c36 100755 --- a/ipsec/ovs-monitor-ipsec.in +++ b/ipsec/ovs-monitor-ipsec.in @@ -729,7 +729,7 @@ conn prevent_unencrypted_vxlan return conns for line in pout.splitlines(): - m = re.search(r"#\d+: \"(.*)\".*", line) + m = re.search(r"#\d+: .*\"(.*)\".*", line) if not m: continue From patchwork Fri Nov 1 01:23:04 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ilya Maximets X-Patchwork-Id: 2004936 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=openvswitch.org (client-ip=2605:bc80:3010::138; helo=smtp1.osuosl.org; envelope-from=ovs-dev-bounces@openvswitch.org; receiver=patchwork.ozlabs.org) Received: from smtp1.osuosl.org (smtp1.osuosl.org [IPv6:2605:bc80:3010::138]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4Xfjn94Lbqz1xwF for ; Fri, 1 Nov 2024 12:23:57 +1100 (AEDT) Received: from localhost (localhost [127.0.0.1]) by smtp1.osuosl.org (Postfix) with ESMTP id 95EC98190B; Fri, 1 Nov 2024 01:23:55 +0000 (UTC) X-Virus-Scanned: amavis at osuosl.org Received: from smtp1.osuosl.org ([127.0.0.1]) by localhost (smtp1.osuosl.org [127.0.0.1]) (amavis, port 10024) with ESMTP id 3DTV4AMCy12E; Fri, 1 Nov 2024 01:23:52 +0000 (UTC) X-Comment: SPF check N/A for local connections - client-ip=140.211.9.56; helo=lists.linuxfoundation.org; envelope-from=ovs-dev-bounces@openvswitch.org; receiver= DKIM-Filter: OpenDKIM Filter v2.11.0 
smtp1.osuosl.org 82D2D81992 Received: from lists.linuxfoundation.org (lf-lists.osuosl.org [140.211.9.56]) by smtp1.osuosl.org (Postfix) with ESMTPS id 82D2D81992; Fri, 1 Nov 2024 01:23:52 +0000 (UTC) Received: from lf-lists.osuosl.org (localhost [127.0.0.1]) by lists.linuxfoundation.org (Postfix) with ESMTP id 08565C08AA; Fri, 1 Nov 2024 01:23:52 +0000 (UTC) X-Original-To: ovs-dev@openvswitch.org Delivered-To: ovs-dev@lists.linuxfoundation.org Received: from smtp2.osuosl.org (smtp2.osuosl.org [IPv6:2605:bc80:3010::133]) by lists.linuxfoundation.org (Postfix) with ESMTP id 21D4AC08B4 for ; Fri, 1 Nov 2024 01:23:51 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by smtp2.osuosl.org (Postfix) with ESMTP id A6EB640ED8 for ; Fri, 1 Nov 2024 01:23:46 +0000 (UTC) X-Virus-Scanned: amavis at osuosl.org Received: from smtp2.osuosl.org ([127.0.0.1]) by localhost (smtp2.osuosl.org [127.0.0.1]) (amavis, port 10024) with ESMTP id E0Ba-o-qIsmY for ; Fri, 1 Nov 2024 01:23:44 +0000 (UTC) Received-SPF: Pass (mailfrom) identity=mailfrom; client-ip=209.85.208.68; helo=mail-ed1-f68.google.com; envelope-from=i.maximets.ovn@gmail.com; receiver= DMARC-Filter: OpenDMARC Filter v1.4.2 smtp2.osuosl.org C9B6040EF8 Authentication-Results: smtp2.osuosl.org; dmarc=none (p=none dis=none) header.from=ovn.org DKIM-Filter: OpenDKIM Filter v2.11.0 smtp2.osuosl.org C9B6040EF8 Received: from mail-ed1-f68.google.com (mail-ed1-f68.google.com [209.85.208.68]) by smtp2.osuosl.org (Postfix) with ESMTPS id C9B6040EF8 for ; Fri, 1 Nov 2024 01:23:42 +0000 (UTC) Received: by mail-ed1-f68.google.com with SMTP id 4fb4d7f45d1cf-5c99be0a4bbso2150363a12.2 for ; Thu, 31 Oct 2024 18:23:42 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1730424221; x=1731029021; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; 
bh=+FV11wLGdoYQMCKNv6G8BNOPX+XIAqUNSGy8+kTZcGE=; b=Moo1lB1k/HXsCDl0Ygo1qrCcKHkV33ZRCyePV5dayK7zO6fUUaCpTcmDZ1Uufqop6b 462Z2PDNFxOQVWT7ei+HHimQ1Vh7bNXTaBnAIESDDMF09cK8JPNwfEryt7yB1zE3pTqq opYK5+JNg2ijJnSFvFksLKdFsYpEshTB73sXh1o0ViNywQnZK45wMRgo/MDWpV3PWWTQ 6P/Bvt7LPIu3DCCm9YEhF9yvUnwlRUStj49ZdoIdzkLjbiGEo9BBQAOQQHanuYmsx8kY 8qeqUi8JAunVT5fglVg3OrSK6Eere63cd8RGe8oZAmC6AfP7GaD5rWKs4OGucFnmeuZp L7SQ== X-Gm-Message-State: AOJu0YwxtmYZK366Canij2O2EcbWXhzpcMjts6yWLOvqd9KyWDf9Py/S qAc8B9iujLhusvbgUY+HZ/JAa7uj1cBRXWixzIFcvjl3ZpVi4Z4L7VK0IHrw X-Google-Smtp-Source: AGHT+IGBcEyWmd5ZvwlGV7c6a67uNsNWzcFGBRBoGiA2ur2Xowqq3sUpI0jIdZEDjN12pYKsmWNSJg== X-Received: by 2002:a17:906:6a12:b0:a9a:3718:6d6 with SMTP id a640c23a62f3a-a9e3a7f2373mr1050397866b.58.1730424220273; Thu, 31 Oct 2024 18:23:40 -0700 (PDT) Received: from im-t490s.redhat.com (ip-86-49-44-151.bb.vodafone.cz. [86.49.44.151]) by smtp.gmail.com with ESMTPSA id a640c23a62f3a-a9e5663fef4sm126112766b.149.2024.10.31.18.23.38 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 31 Oct 2024 18:23:39 -0700 (PDT) From: Ilya Maximets To: ovs-dev@openvswitch.org Date: Fri, 1 Nov 2024 02:23:04 +0100 Message-ID: <20241101012321.3346333-4-i.maximets@ovn.org> X-Mailer: git-send-email 2.46.0 In-Reply-To: <20241101012321.3346333-1-i.maximets@ovn.org> References: <20241101012321.3346333-1-i.maximets@ovn.org> MIME-Version: 1.0 Subject: [ovs-dev] [PATCH v3 03/10] ipsec: libreswan: Reconcile missing connections periodically. X-BeenThere: ovs-dev@openvswitch.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Ilya Maximets Errors-To: ovs-dev-bounces@openvswitch.org Sender: "dev" There are cases where ipsec commands may fail to add new connections or remove the old ones. Unfortunately, this means that those connections may actually never be added or removed, since ovs-monitor-ipsec will not re-visit them, unless something else changes. 
Wake up the monitor periodically to check whether something changed in the system or whether some connections still need loading. This addresses two main use cases:

1. A connection failed to start for some reason and was not added to pluto or was not properly started. The logic goes over all the desired, loaded and active connections and makes sure that any undesired connections are removed, non-loaded connections are loaded and non-active connections are brought UP.

2. If pluto restarts, it loads all the connections but doesn't bring them up, because we're using the route (ondemand) activation strategy. This commit notices all the loaded but not active connections and brings them up. This helps avoid dropping the first packets until the connection activates.

The wake-up interval is 15 seconds, to give pluto some breathing room, i.e. a chance to activate the connections properly before we start poking them. Also, if pluto is down, a 15-second interval creates less spam in the logs.

StrongSwan doesn't need such logic, because it supports a single command, 'ipsec update', that re-loads the config as a whole and figures out what configuration changes are needed. But since we're starting all the connections separately with Libreswan, we have to keep track and reconcile manually.

Some more details of the logic are in the comments in the code.
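For readers who want the per-tunnel decision in isolation, here is a minimal sketch of the set logic described above. The helper name plan_reconciliation and its return shape are hypothetical and exist only for this illustration; the patch itself performs these steps inline and runs the ipsec commands directly.

```python
def plan_reconciliation(desired, loaded, active):
    """Decide what to do for one tunnel.

    All three arguments are collections of connection names.  Returns a
    tuple (to_delete, to_start, to_up) of sets.  Hypothetical helper for
    illustration only; it is not part of ovs-monitor-ipsec.
    """
    desired, loaded, active = set(desired), set(loaded), set(active)

    # Loaded or active connections that are no longer desired go away.
    to_delete = (loaded | active) - desired
    loaded -= to_delete
    active -= to_delete

    if desired != loaded:
        # Half-loaded tunnel: connections of one tunnel may share SAs,
        # so drop whatever is left and re-load all desired connections.
        to_delete |= loaded | active
        return to_delete, set(desired), set()

    # Everything desired is loaded; only bring up what is not active yet.
    return to_delete, set(), loaded - active
```

With both connections of a tunnel loaded but none active (the pluto restart case), the sketch returns nothing to delete, nothing to start and both names to bring up.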
Acked-by: Roi Dayan Acked-by: Eelco Chaudron Signed-off-by: Ilya Maximets --- ipsec/ovs-monitor-ipsec.in | 185 ++++++++++++++++++++++++------------- 1 file changed, 123 insertions(+), 62 deletions(-) diff --git a/ipsec/ovs-monitor-ipsec.in b/ipsec/ovs-monitor-ipsec.in index 08df35c36..152c30a13 100755 --- a/ipsec/ovs-monitor-ipsec.in +++ b/ipsec/ovs-monitor-ipsec.in @@ -20,6 +20,7 @@ import os import re import subprocess import sys +import time from string import Template import ovs.daemon @@ -82,6 +83,7 @@ vlog = ovs.vlog.Vlog("ovs-monitor-ipsec") exiting = False monitor = None xfrm = None +RECONCILIATION_INTERVAL = 15 # seconds def run_command(args, description=None): @@ -295,6 +297,9 @@ conn prevent_unencrypted_vxlan return conns + def need_to_reconcile(self, monitor): + return False + def config_init(self): self.conf_file = open(self.IPSEC_CONF, "w") self.secrets_file = open(self.IPSEC_SECRETS, "w") @@ -511,6 +516,7 @@ conn prevent_unencrypted_vxlan self.IPSEC_D = "sql:" + libreswan_root_prefix + ipsec_d self.IPSEC_CTL = libreswan_root_prefix + ipsec_ctl self.conf_file = None + self.last_refresh = time.time() self.secrets_file = None vlog.dbg("Using: " + self.IPSEC) vlog.dbg("Configuration file: " + self.IPSEC_CONF) @@ -622,51 +628,50 @@ conn prevent_unencrypted_vxlan "--config", self.IPSEC_CONF, "--rereadsecrets"], "re-read secrets") - tunnels = set(monitor.tunnels.keys()) - - # Delete old connections - conns_dict = self.get_active_conns() - for ifname, conns in conns_dict.items(): - tunnel = monitor.tunnels.get(ifname) - - for conn in conns: - # IPsec "connection" names must start with Interface name - if not conn.startswith(ifname): - vlog.err("%s does not start with %s" % (conn, ifname)) - continue - - # version number should be the first integer after - # interface name in IPsec "connection" - try: - ver = int(re.findall(r'\d+', conn[len(ifname):])[0]) - except ValueError: - vlog.err("%s does not contain version number") - continue - except IndexError: - 
vlog.err("%s does not contain version number") - continue - if not tunnel or tunnel.version != ver: - vlog.info("%s is outdated %u" % (conn, ver)) - run_command(self.IPSEC_AUTO + - ["--ctlsocket", self.IPSEC_CTL, - "--config", self.IPSEC_CONF, - "--delete", conn], "delete %s" % conn) - elif ifname in tunnels: - tunnels.remove(ifname) - - # Activate new connections - for name in tunnels: - ver = monitor.tunnels[name].version - - if monitor.tunnels[name].conf["tunnel_type"] == "gre": - conn = "%s-%s" % (name, ver) - self._start_ipsec_connection(conn) + loaded_conns = self.get_loaded_conns() + active_conns = self.get_active_conns() + + all_names = set(monitor.tunnels.keys()) | \ + set(loaded_conns.keys()) | \ + set(active_conns.keys()) + + for name in all_names: + desired = set(self.get_conn_names(monitor, name)) + loaded = set(loaded_conns.get(name, dict()).keys()) + active = set(active_conns.get(name, dict()).keys()) + + # Remove all the loaded or active but not desired connections. + for conn in loaded | active: + if conn not in desired: + self._delete_ipsec_connection(conn, "is outdated") + loaded.discard(conn) + active.discard(conn) + + # If not all desired are loaded, remove all the loaded and + # active for this tunnel and re-load only the desired ones. + # Need to do that, because connections for the same tunnel + # may share SAs. If one is loaded and the other is not, + # it means the second one failed, so the shared SA may be in + # a broken state. 
+ if desired != loaded: + for conn in loaded | active: + self._delete_ipsec_connection(conn, "is half-loaded") + loaded.discard(conn) + active.discard(conn) + + for conn in desired: + vlog.info("Starting ipsec connection %s" % conn) + self._start_ipsec_connection(conn, "start") else: - conn_in = "%s-in-%s" % (name, ver) - conn_out = "%s-out-%s" % (name, ver) - self._start_ipsec_connection(conn_in) - self._start_ipsec_connection(conn_out) + # Ask pluto to bring UP connections that are loaded, + # but not active for some reason. + # + # desired == loaded and desired >= loaded + active, + # so loaded >= active + for conn in loaded - active: + vlog.info("Bringing up ipsec connection %s" % conn) + self._start_ipsec_connection(conn, "up") # Update shunt policy if changed if monitor.conf_in_use["skb_mark"] != monitor.conf["skb_mark"]: @@ -713,23 +718,27 @@ conn prevent_unencrypted_vxlan "--delete", "--asynchronous", "prevent_unencrypted_vxlan"]) monitor.conf_in_use["skb_mark"] = monitor.conf["skb_mark"] + self.last_refresh = time.time() + vlog.info("Refreshing is done.") - def get_active_conns(self): + def get_conns_from_status(self, pattern): """This function parses output from 'ipsec status' command. It returns dictionary where is interface name (as in OVSDB) and is another dictionary. This another dictionary uses LibreSwan connection name as and more detailed - sample line from the parsed outpus as . """ + sample line from the parsed outpus as . 'pattern' should + be a regular expression that parses out the connection name. + Only the lines that match the pattern will be parsed. 
""" conns = {} ret, pout, perr = run_command([self.IPSEC, 'status', '--ctlsocket', self.IPSEC_CTL], - "get active connections") + "get ipsec status") if ret: return conns for line in pout.splitlines(): - m = re.search(r"#\d+: .*\"(.*)\".*", line) + m = re.search(pattern, line) if not m: continue @@ -748,25 +757,76 @@ conn prevent_unencrypted_vxlan return conns - def _start_ipsec_connection(self, conn): - # In a corner case, LibreSwan daemon restarts for some reason and - # the "ipsec auto --start" command is lost. Just retry to make sure - # the command is received by LibreSwan. - while True: - ret, pout, perr = run_command(self.IPSEC_AUTO + - ["--config", self.IPSEC_CONF, - "--ctlsocket", self.IPSEC_CTL, - "--start", - "--asynchronous", conn], - "start %s" % conn) - if not re.match(r".*Connection refused.*", perr) and \ - not re.match(r".*need --listen.*", pout): - break + def get_active_conns(self): + return self.get_conns_from_status(r"#\d+: .*\"(.*)\".*") + + def get_loaded_conns(self): + return self.get_conns_from_status(r"\"(.*)\": \d+.*(===|\.\.\.).*") + + def get_conn_names(self, monitor, ifname): + conns = [] + if ifname not in monitor.tunnels: + return conns + + tunnel = monitor.tunnels.get(ifname) + ver = tunnel.version + + if tunnel.conf["tunnel_type"] == "gre": + conns.append("%s-%s" % (ifname, ver)) + else: + conns.append("%s-in-%s" % (ifname, ver)) + conns.append("%s-out-%s" % (ifname, ver)) + + return conns + + def need_to_reconcile(self, monitor): + if time.time() - self.last_refresh < RECONCILIATION_INTERVAL: + return False + + conns_dict = self.get_active_conns() + for ifname, tunnel in monitor.tunnels.items(): + if ifname not in conns_dict: + vlog.info("Connection for port %s is not active, " + "need to reconcile" % ifname) + return True + + existing_conns = conns_dict.get(ifname) + desired_conns = self.get_conn_names(monitor, ifname) + + if set(existing_conns.keys()) != set(desired_conns): + vlog.info("Active connections for port %s %s do not 
match " + "desired %s, need to reconcile" + % (ifname, list(existing_conns.keys()), + desired_conns)) + return True + + return False + + def _delete_ipsec_connection(self, conn, reason): + vlog.info("%s %s, removing" % (conn, reason)) + run_command(self.IPSEC_AUTO + + ["--ctlsocket", self.IPSEC_CTL, + "--config", self.IPSEC_CONF, + "--delete", conn], "delete %s" % conn) + + def _start_ipsec_connection(self, conn, action): + ret, pout, perr = run_command(self.IPSEC_AUTO + + ["--config", self.IPSEC_CONF, + "--ctlsocket", self.IPSEC_CTL, + "--" + action, + "--asynchronous", conn], + "%s %s" % (action, conn)) if re.match(r".*[F|f]ailed to initiate connection.*", pout): vlog.err('Failed to initiate connection through' ' Interface %s.\n' % (conn.split('-')[0])) vlog.err("stdout: %s" % pout) + ret = 1 + + if ret: + # We don't know in which state the connection was left on + # failure. Try to clean it up. + self._delete_ipsec_connection(conn, "--%s failed" % action) def _nss_clear_database(self): """Remove all OVS IPsec related state from the NSS database""" @@ -1192,7 +1252,7 @@ class IPsecMonitor(object): self.ike_helper.clear_tunnel_state(self.tunnels[name]) del self.tunnels[name] - if needs_refresh: + if needs_refresh or self.ike_helper.need_to_reconcile(self): self.ike_helper.refresh(self) def _get_cn_from_cert(self, cert): @@ -1365,6 +1425,7 @@ def main(): poller = ovs.poller.Poller() unixctl_server.wait(poller) idl.wait(poller) + poller.timer_wait(RECONCILIATION_INTERVAL * 1000) poller.block() unixctl_server.close() From patchwork Fri Nov 1 01:23:05 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ilya Maximets X-Patchwork-Id: 2004935 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=openvswitch.org (client-ip=2605:bc80:3010::138; 
helo=smtp1.osuosl.org; envelope-from=ovs-dev-bounces@openvswitch.org; receiver=patchwork.ozlabs.org) Received: from smtp1.osuosl.org (smtp1.osuosl.org [IPv6:2605:bc80:3010::138]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4Xfjn64KqTz1xwF for ; Fri, 1 Nov 2024 12:23:54 +1100 (AEDT) Received: from localhost (localhost [127.0.0.1]) by smtp1.osuosl.org (Postfix) with ESMTP id F2302819D2; Fri, 1 Nov 2024 01:23:52 +0000 (UTC) X-Virus-Scanned: amavis at osuosl.org Received: from smtp1.osuosl.org ([127.0.0.1]) by localhost (smtp1.osuosl.org [127.0.0.1]) (amavis, port 10024) with ESMTP id ulQ4dDslm0xX; Fri, 1 Nov 2024 01:23:50 +0000 (UTC) X-Comment: SPF check N/A for local connections - client-ip=2605:bc80:3010:104::8cd3:938; helo=lists.linuxfoundation.org; envelope-from=ovs-dev-bounces@openvswitch.org; receiver= DKIM-Filter: OpenDKIM Filter v2.11.0 smtp1.osuosl.org 9842D813CC Received: from lists.linuxfoundation.org (lf-lists.osuosl.org [IPv6:2605:bc80:3010:104::8cd3:938]) by smtp1.osuosl.org (Postfix) with ESMTPS id 9842D813CC; Fri, 1 Nov 2024 01:23:50 +0000 (UTC) Received: from lf-lists.osuosl.org (localhost [127.0.0.1]) by lists.linuxfoundation.org (Postfix) with ESMTP id 68816C08A8; Fri, 1 Nov 2024 01:23:50 +0000 (UTC) X-Original-To: ovs-dev@openvswitch.org Delivered-To: ovs-dev@lists.linuxfoundation.org Received: from smtp1.osuosl.org (smtp1.osuosl.org [IPv6:2605:bc80:3010::138]) by lists.linuxfoundation.org (Postfix) with ESMTP id 11DF1C08A6 for ; Fri, 1 Nov 2024 01:23:49 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by smtp1.osuosl.org (Postfix) with ESMTP id A3C4181765 for ; Fri, 1 Nov 2024 01:23:46 +0000 (UTC) X-Virus-Scanned: amavis at osuosl.org Received: from smtp1.osuosl.org ([127.0.0.1]) by localhost (smtp1.osuosl.org [127.0.0.1]) (amavis, port 10024) with ESMTP id 
YHkB6h-PlrGw for ; Fri, 1 Nov 2024 01:23:46 +0000 (UTC) Received-SPF: Pass (mailfrom) identity=mailfrom; client-ip=209.85.218.66; helo=mail-ej1-f66.google.com; envelope-from=i.maximets.ovn@gmail.com; receiver= DMARC-Filter: OpenDMARC Filter v1.4.2 smtp1.osuosl.org 97F928176F Authentication-Results: smtp1.osuosl.org; dmarc=none (p=none dis=none) header.from=ovn.org DKIM-Filter: OpenDKIM Filter v2.11.0 smtp1.osuosl.org 97F928176F Received: from mail-ej1-f66.google.com (mail-ej1-f66.google.com [209.85.218.66]) by smtp1.osuosl.org (Postfix) with ESMTPS id 97F928176F for ; Fri, 1 Nov 2024 01:23:45 +0000 (UTC) Received: by mail-ej1-f66.google.com with SMTP id a640c23a62f3a-a9acafdb745so261206266b.0 for ; Thu, 31 Oct 2024 18:23:45 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1730424223; x=1731029023; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=Fl029zscEulCjwyVKRQENMpMiVk8mHy3/zyIhVoIY1E=; b=Rf5GCL8mZO5s2tLGe/wfa61UHaBsArIQYOtuQ8BhIiQowbzxLlw4QFZwJGzsKSjC3g K9cCEUGisGascJ5adpbR4II1oWmOSCOgYGa9q0N7Y6BjzdHmAxlq+5cK9yd/3/JKuUPM H3unG30LJWbIOtXpzcObJJZuFrsE9CK1XqBphVqEhUd5hI9qoP+RKxglzcOTMWfgxU4H 6CkG/JeBGnZ2xeNVLS07vQoZ5hLGMTWuryTP62E39b+knQbYFJbQqgN8FVHgMFpLTugE 6H3PJZlI+3xD8QXzhaKRk1KORS+fUiKFvYR+KFx8gFb6sfSqfj8lcLG1lMmFff9ZZGFJ PKtA== X-Gm-Message-State: AOJu0Yync9L7InkTbqCuz6I06HbgK2qOAtx9BtE7AouneRwrCJR3RaFc wIkfHdH+It0WWUAdmzoEMHWVndu8agsr4NejvmevGEx0z7E5qyB9N/SbEWBJ X-Google-Smtp-Source: AGHT+IHmbzDgmINh2uH59vZ2E06OZYlLJSZDLE/Hs0PNJ8RMJxVX4+Mv0hY72NhgZd+k0L4YNw+2Wg== X-Received: by 2002:a17:907:ea1:b0:a9a:babb:b916 with SMTP id a640c23a62f3a-a9e55a836fcmr385261566b.15.1730424223335; Thu, 31 Oct 2024 18:23:43 -0700 (PDT) Received: from im-t490s.redhat.com (ip-86-49-44-151.bb.vodafone.cz. 
[86.49.44.151]) by smtp.gmail.com with ESMTPSA id a640c23a62f3a-a9e5663fef4sm126112766b.149.2024.10.31.18.23.41 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 31 Oct 2024 18:23:42 -0700 (PDT) From: Ilya Maximets To: ovs-dev@openvswitch.org Date: Fri, 1 Nov 2024 02:23:05 +0100 Message-ID: <20241101012321.3346333-5-i.maximets@ovn.org> X-Mailer: git-send-email 2.46.0 In-Reply-To: <20241101012321.3346333-1-i.maximets@ovn.org> References: <20241101012321.3346333-1-i.maximets@ovn.org> MIME-Version: 1.0 Subject: [ovs-dev] [PATCH v3 04/10] ipsec: libreswan: Try to bring non-active connections up. X-BeenThere: ovs-dev@openvswitch.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Ilya Maximets Errors-To: ovs-dev-bounces@openvswitch.org Sender: "dev"

Sometimes connections get loaded but, for some reason, do not become active on the first try. We can try to bring them up manually. However, if they are still not active after that, it's better to just remove the connections and add them from scratch, as there must be some internal issue in libreswan that doesn't allow these connections to actually become active.

Note: Once the "defunct" connection is removed, the second connection for the same tunnel will also be removed as "half-loaded". This ensures that all the shared SAs will also be cleaned up, so we can truly start from scratch.
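The "try --up once, then give up and remove" behaviour can be sketched as a small state machine. The class NotActiveTracker and its step() method are hypothetical names used only here; the sketch also leaves out the desired set, so the follow-up "half-loaded" cleanup mentioned in the note is not modelled.

```python
class NotActiveTracker:
    """Track connections that were asked to come up but did not.

    Hypothetical illustration; the patch keeps an equivalent set called
    conns_not_active on the Libreswan helper object.
    """

    def __init__(self):
        self.not_active = set()

    def step(self, loaded, active):
        """Run one reconciliation pass; return (defunct, to_up)."""
        loaded, active = set(loaded), set(active)

        # Untrack connections that became active since the last pass.
        self.not_active -= active

        # Still loaded and still not active after an explicit --up:
        # treat as defunct, to be deleted and re-added from scratch.
        defunct = self.not_active & loaded
        self.not_active -= defunct
        loaded -= defunct

        # Loaded but not active: request --up and remember the request.
        to_up = loaded - active
        self.not_active |= to_up
        return defunct, to_up
```

A connection that stays loaded but inactive is asked to come up on the first pass and is reported as defunct on the second one.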
Acked-by: Eelco Chaudron Acked-by: Roi Dayan Signed-off-by: Ilya Maximets --- ipsec/ovs-monitor-ipsec.in | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/ipsec/ovs-monitor-ipsec.in b/ipsec/ovs-monitor-ipsec.in index 152c30a13..20f6ccb20 100755 --- a/ipsec/ovs-monitor-ipsec.in +++ b/ipsec/ovs-monitor-ipsec.in @@ -516,6 +516,7 @@ conn prevent_unencrypted_vxlan self.IPSEC_D = "sql:" + libreswan_root_prefix + ipsec_d self.IPSEC_CTL = libreswan_root_prefix + ipsec_ctl self.conf_file = None + self.conns_not_active = set() self.last_refresh = time.time() self.secrets_file = None vlog.dbg("Using: " + self.IPSEC) @@ -641,6 +642,14 @@ conn prevent_unencrypted_vxlan loaded = set(loaded_conns.get(name, dict()).keys()) active = set(active_conns.get(name, dict()).keys()) + # Untrack connections that became active. + self.conns_not_active.difference_update(active) + # Remove connections that didn't become active after --start + # and another explicit --up. + for conn in self.conns_not_active & loaded: + self._delete_ipsec_connection(conn, "is defunct") + loaded.remove(conn) + # Remove all the loaded or active but not desired connections. for conn in loaded | active: if conn not in desired: @@ -671,6 +680,8 @@ conn prevent_unencrypted_vxlan # so loaded >= active for conn in loaded - active: vlog.info("Bringing up ipsec connection %s" % conn) + # On failure to --up it will be removed from the set. 
+ self.conns_not_active.add(conn) self._start_ipsec_connection(conn, "up") # Update shunt policy if changed @@ -804,6 +815,7 @@ conn prevent_unencrypted_vxlan def _delete_ipsec_connection(self, conn, reason): vlog.info("%s %s, removing" % (conn, reason)) + self.conns_not_active.discard(conn) run_command(self.IPSEC_AUTO + ["--ctlsocket", self.IPSEC_CTL, "--config", self.IPSEC_CONF, From patchwork Fri Nov 1 01:23:06 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ilya Maximets X-Patchwork-Id: 2004937 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=openvswitch.org (client-ip=2605:bc80:3010::138; helo=smtp1.osuosl.org; envelope-from=ovs-dev-bounces@openvswitch.org; receiver=patchwork.ozlabs.org) Received: from smtp1.osuosl.org (smtp1.osuosl.org [IPv6:2605:bc80:3010::138]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4XfjnP494Dz1xwF for ; Fri, 1 Nov 2024 12:24:09 +1100 (AEDT) Received: from localhost (localhost [127.0.0.1]) by smtp1.osuosl.org (Postfix) with ESMTP id 074F881BBD; Fri, 1 Nov 2024 01:24:08 +0000 (UTC) X-Virus-Scanned: amavis at osuosl.org Received: from smtp1.osuosl.org ([127.0.0.1]) by localhost (smtp1.osuosl.org [127.0.0.1]) (amavis, port 10024) with ESMTP id vS64ilLjUMXW; Fri, 1 Nov 2024 01:24:05 +0000 (UTC) X-Comment: SPF check N/A for local connections - client-ip=140.211.9.56; helo=lists.linuxfoundation.org; envelope-from=ovs-dev-bounces@openvswitch.org; receiver= DKIM-Filter: OpenDKIM Filter v2.11.0 smtp1.osuosl.org 45AE881B83 Received: from lists.linuxfoundation.org (lf-lists.osuosl.org [140.211.9.56]) by smtp1.osuosl.org (Postfix) with ESMTPS id 
45AE881B83; Fri, 1 Nov 2024 01:24:03 +0000 (UTC) Received: from lf-lists.osuosl.org (localhost [127.0.0.1]) by lists.linuxfoundation.org (Postfix) with ESMTP id D9724C08A6; Fri, 1 Nov 2024 01:24:02 +0000 (UTC) X-Original-To: ovs-dev@openvswitch.org Delivered-To: ovs-dev@lists.linuxfoundation.org Received: from smtp3.osuosl.org (smtp3.osuosl.org [IPv6:2605:bc80:3010::136]) by lists.linuxfoundation.org (Postfix) with ESMTP id 0F609C08A3 for ; Fri, 1 Nov 2024 01:24:01 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by smtp3.osuosl.org (Postfix) with ESMTP id 3259060B23 for ; Fri, 1 Nov 2024 01:23:50 +0000 (UTC) X-Virus-Scanned: amavis at osuosl.org Received: from smtp3.osuosl.org ([127.0.0.1]) by localhost (smtp3.osuosl.org [127.0.0.1]) (amavis, port 10024) with ESMTP id Hu-X8s2IatqG for ; Fri, 1 Nov 2024 01:23:49 +0000 (UTC) Received-SPF: Pass (mailfrom) identity=mailfrom; client-ip=209.85.218.68; helo=mail-ej1-f68.google.com; envelope-from=i.maximets.ovn@gmail.com; receiver= DMARC-Filter: OpenDMARC Filter v1.4.2 smtp3.osuosl.org EC17160BB4 Authentication-Results: smtp3.osuosl.org; dmarc=none (p=none dis=none) header.from=ovn.org DKIM-Filter: OpenDKIM Filter v2.11.0 smtp3.osuosl.org EC17160BB4 Received: from mail-ej1-f68.google.com (mail-ej1-f68.google.com [209.85.218.68]) by smtp3.osuosl.org (Postfix) with ESMTPS id EC17160BB4 for ; Fri, 1 Nov 2024 01:23:48 +0000 (UTC) Received: by mail-ej1-f68.google.com with SMTP id a640c23a62f3a-a9e44654ae3so163839566b.1 for ; Thu, 31 Oct 2024 18:23:48 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1730424227; x=1731029027; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=RRfF56jXaQp1yoS/16v8xbcT9W0AyKii5UsMEQysakg=; b=DerPcNTTdbcQRGNDU53BmPMCpWTK8VsurhkXpUr2YDlj1NdV2jAylknBm3ytU8pqxG 
Qbg18SdVOZ2qjSCSzARTH0ERrSqrU8JjMAsuOib9Bq2dIureUfp1TWrWv6MwvCPYInhe XiGltCqYw0Vq4CQorhMg28PgnanKGprYRyE1cmT/P0/XZrwCHy976KmgtFleNnVxV8MM g08qIJSkDAmEWQ9n+T1Z0LOqaGLNuM1rGRao9Mkz4IrqGb6qtUv/euhenLwAe//09Oja 0hRJI2TtSA8gHjBtYP5WfzvHuFdhRdnVOFdNpY+kCtfWsPxy3HQ0UwsVeOz0MZI50a4N Guow== X-Gm-Message-State: AOJu0YyM468yU5eiJFCQBceHgOwCCaoVllIh0pNfh67LRkEGlmnBq2H/ qHV/5MoGxEt2kaNwKq1JtISJtRbQ47/jrmg6jN9bJN604NXk5JeTQ2QF/ddG X-Google-Smtp-Source: AGHT+IG1Tn9nvYtRFJTr3e9JccgjXKjexSuirc6nWr1I7j2vtjuw+pAtCMIQD5xUQ/jBiNepXyhPtw== X-Received: by 2002:a17:907:2d0d:b0:a9a:59f:dfb9 with SMTP id a640c23a62f3a-a9e508ace9bmr498128966b.5.1730424226285; Thu, 31 Oct 2024 18:23:46 -0700 (PDT) Received: from im-t490s.redhat.com (ip-86-49-44-151.bb.vodafone.cz. [86.49.44.151]) by smtp.gmail.com with ESMTPSA id a640c23a62f3a-a9e5663fef4sm126112766b.149.2024.10.31.18.23.44 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 31 Oct 2024 18:23:45 -0700 (PDT) From: Ilya Maximets To: ovs-dev@openvswitch.org Date: Fri, 1 Nov 2024 02:23:06 +0100 Message-ID: <20241101012321.3346333-6-i.maximets@ovn.org> X-Mailer: git-send-email 2.46.0 In-Reply-To: <20241101012321.3346333-1-i.maximets@ovn.org> References: <20241101012321.3346333-1-i.maximets@ovn.org> MIME-Version: 1.0 Subject: [ovs-dev] [PATCH v3 05/10] ipsec: libreswan: Avoid monitor hanging on stuck ipsec commands. X-BeenThere: ovs-dev@openvswitch.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Ilya Maximets Errors-To: ovs-dev-bounces@openvswitch.org Sender: "dev" Multiple versions of Libreswan have an issue where ipsec --start command may get stuck forever. This issue affects many popular versions of Libreswan from 4.5 to 4.15, which are shipped in most modern distributions. When ipsec --start gets stuck, ovs-monitor-ipsec hangs and can't do anything else, so not only this one but all other tunnels are also not being started. 
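One way to keep a stuck child process from blocking the caller is to bound the wait in Popen.communicate() and kill the child on expiry. The sketch below shows that pattern on its own; run_with_timeout is a hypothetical helper, and the value 137 follows the shell convention for a process killed by SIGKILL (128 + 9).

```python
import subprocess
import sys

# 128 + 9, the conventional exit status of a process killed by SIGKILL.
TIMEOUT_EXPIRED = 137


def run_with_timeout(args, timeout):
    """Run 'args' and give up after 'timeout' seconds.

    Hypothetical helper for illustration only.
    Returns (exit_code, stdout, stderr).
    """
    proc = subprocess.Popen(args, stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE)
    try:
        out, err = proc.communicate(timeout=timeout)
    except subprocess.TimeoutExpired:
        # communicate() does not kill the child on timeout, so do it here.
        # Deliberately no wait() afterwards: a stuck process may take a
        # long time to actually go away.
        proc.kill()
        return TIMEOUT_EXPIRED, "", ""
    return proc.returncode, out.decode(), err.decode()


if __name__ == "__main__":
    # A child that sleeps far longer than the timeout stands in for a
    # stuck 'ipsec' command.
    print(run_with_timeout(
        [sys.executable, "-c", "import time; time.sleep(30)"], 1))
```

The trade-off of not waiting is that the killed child is not reaped immediately; that is accepted here in exchange for the caller never blocking on it.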
Add a timeout to the subprocess call, so we do not wait forever. The recently introduced reconciliation process will clean things up and will try to re-add this connection later.

Pluto may take a lot of time to process the --start request. Notably, the time depends on the retransmission timeout, which is 60 seconds by default. However, even at high scale, it doesn't take much more than that in tests. So, a 120-second timeout should be a reasonable default value.

Note: it is observed in practice that the process doesn't actually terminate for a long time, so we can't afford waiting for it. That's the main reason why we're not using subprocess.run() with a timeout option here (it would wait for the killed process to terminate). Besides, we would have to catch the exception anyway.

Reported-at: https://issues.redhat.com/browse/FDP-846 Signed-off-by: Ilya Maximets Acked-by: Eelco Chaudron Acked-by: Roi Dayan --- ipsec/ovs-monitor-ipsec.in | 14 ++++++++++++-- 1 file changed, 12 insertions(+), 2 deletions(-) diff --git a/ipsec/ovs-monitor-ipsec.in b/ipsec/ovs-monitor-ipsec.in index 20f6ccb20..05c1965df 100755 --- a/ipsec/ovs-monitor-ipsec.in +++ b/ipsec/ovs-monitor-ipsec.in @@ -84,6 +84,7 @@ exiting = False monitor = None xfrm = None RECONCILIATION_INTERVAL = 15 # seconds +TIMEOUT_EXPIRED = 137 # Exit code for a SIGKILL (128 + 9). def run_command(args, description=None): @@ -96,7 +97,16 @@ def run_command(args, description=None): vlog.dbg("Running %s" % args) proc = subprocess.Popen(args, stdout=subprocess.PIPE, stderr=subprocess.PIPE) - pout, perr = proc.communicate() + try: + pout, perr = proc.communicate(timeout=120) + ret = proc.returncode + except subprocess.TimeoutExpired: + vlog.warn("Timed out after 120 seconds trying to %s." % description) + pout, perr = b'', b'' + # Just kill the process here. We can't afford waiting for it, + # as it may be stuck and may not actually be terminated.
+ proc.kill() + ret = TIMEOUT_EXPIRED if proc.returncode or perr: vlog.warn("Failed to %s; exit code: %d" @@ -105,7 +115,7 @@ def run_command(args, description=None): vlog.warn("stderr: %s" % perr) vlog.warn("stdout: %s" % pout) - return proc.returncode, pout.decode(), perr.decode() + return ret, pout.decode(), perr.decode() class XFRM(object): From patchwork Fri Nov 1 01:23:07 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ilya Maximets X-Patchwork-Id: 2004939 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=openvswitch.org (client-ip=2605:bc80:3010::138; helo=smtp1.osuosl.org; envelope-from=ovs-dev-bounces@openvswitch.org; receiver=patchwork.ozlabs.org) Received: from smtp1.osuosl.org (smtp1.osuosl.org [IPv6:2605:bc80:3010::138]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4Xfjnb2zvCz1xwF for ; Fri, 1 Nov 2024 12:24:19 +1100 (AEDT) Received: from localhost (localhost [127.0.0.1]) by smtp1.osuosl.org (Postfix) with ESMTP id 4C11C81E47; Fri, 1 Nov 2024 01:24:17 +0000 (UTC) X-Virus-Scanned: amavis at osuosl.org Received: from smtp1.osuosl.org ([127.0.0.1]) by localhost (smtp1.osuosl.org [127.0.0.1]) (amavis, port 10024) with ESMTP id 7MCsDyVcVMMq; Fri, 1 Nov 2024 01:24:16 +0000 (UTC) X-Comment: SPF check N/A for local connections - client-ip=140.211.9.56; helo=lists.linuxfoundation.org; envelope-from=ovs-dev-bounces@openvswitch.org; receiver= DKIM-Filter: OpenDKIM Filter v2.11.0 smtp1.osuosl.org CFB1B81BFA Received: from lists.linuxfoundation.org (lf-lists.osuosl.org [140.211.9.56]) by smtp1.osuosl.org (Postfix) with ESMTPS id CFB1B81BFA; Fri, 1 Nov 2024 01:24:11 +0000 
(UTC) Received: from lf-lists.osuosl.org (localhost [127.0.0.1]) by lists.linuxfoundation.org (Postfix) with ESMTP id 71654C08A8; Fri, 1 Nov 2024 01:24:11 +0000 (UTC) X-Original-To: ovs-dev@openvswitch.org Delivered-To: ovs-dev@lists.linuxfoundation.org Received: from smtp1.osuosl.org (smtp1.osuosl.org [IPv6:2605:bc80:3010::138]) by lists.linuxfoundation.org (Postfix) with ESMTP id 13B2BC08AA for ; Fri, 1 Nov 2024 01:24:10 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by smtp1.osuosl.org (Postfix) with ESMTP id DB43481985 for ; Fri, 1 Nov 2024 01:23:53 +0000 (UTC) X-Virus-Scanned: amavis at osuosl.org Received: from smtp1.osuosl.org ([127.0.0.1]) by localhost (smtp1.osuosl.org [127.0.0.1]) (amavis, port 10024) with ESMTP id FGY5qRkWK5Xh for ; Fri, 1 Nov 2024 01:23:52 +0000 (UTC) Received-SPF: Pass (mailfrom) identity=mailfrom; client-ip=209.85.218.66; helo=mail-ej1-f66.google.com; envelope-from=i.maximets.ovn@gmail.com; receiver= DMARC-Filter: OpenDMARC Filter v1.4.2 smtp1.osuosl.org 843A28195D Authentication-Results: smtp1.osuosl.org; dmarc=none (p=none dis=none) header.from=ovn.org DKIM-Filter: OpenDKIM Filter v2.11.0 smtp1.osuosl.org 843A28195D Received: from mail-ej1-f66.google.com (mail-ej1-f66.google.com [209.85.218.66]) by smtp1.osuosl.org (Postfix) with ESMTPS id 843A28195D for ; Fri, 1 Nov 2024 01:23:51 +0000 (UTC) Received: by mail-ej1-f66.google.com with SMTP id a640c23a62f3a-a9e44654ae3so163842566b.1 for ; Thu, 31 Oct 2024 18:23:51 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1730424229; x=1731029029; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=+ReMbQwTA/j9Kj5oFVlG5cMxA7pEYf3DJn7bTXw2h/4=; b=UKPdiKkhypM9vlgskOH2Z8UnKCbcSS+P7ertyWT0p9l/CxcSOmB+dXBcWR0j1E199V WD3PseZRPU1iltDbL9vVFhfBwLE6nfJ+z1m7QBnZtbvvxQ3T63cyiAiEkPGK/meH6605 
RYY5o+VZrZeGuFXd1mbTJnpQZwEB7IrK0AFpnr/jfailwUK0GWt2WR41MecVVT4fxGa7 0JcrxNIXawZSbFv9doTXWgHlASQ4sb9Tyo+iV7cVDdFM2sJDJDRGRY/N5qjakkcRmkW2 E+aIz7KW4utpYHZyrYqXDsmeXok5EwNCsdoUIUHa5Qg1FuhUsTy6cJHaT9SXESj5wYQZ GgKw== X-Gm-Message-State: AOJu0YyXxOODkh/24Hlgv8Ch3DrZIJajFvbP/wWzFGhLoooGU8Igd80/ /5A2rEvRN5bbr3BMY145MRO2xTCvIorMeWBG/lpQq9m1BFVTi0KvaMhN/B2F X-Google-Smtp-Source: AGHT+IHCqbPFep6rbonKC1IhI1NIGw4Lx8SdKZjQgbgGLDtg3XNkGs9CfMfv0Bv8BAftCuwq4D0STg== X-Received: by 2002:a17:907:868a:b0:a99:e745:d47f with SMTP id a640c23a62f3a-a9e508e50a5mr460988666b.21.1730424229244; Thu, 31 Oct 2024 18:23:49 -0700 (PDT) Received: from im-t490s.redhat.com (ip-86-49-44-151.bb.vodafone.cz. [86.49.44.151]) by smtp.gmail.com with ESMTPSA id a640c23a62f3a-a9e5663fef4sm126112766b.149.2024.10.31.18.23.47 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 31 Oct 2024 18:23:48 -0700 (PDT) From: Ilya Maximets To: ovs-dev@openvswitch.org Date: Fri, 1 Nov 2024 02:23:07 +0100 Message-ID: <20241101012321.3346333-7-i.maximets@ovn.org> X-Mailer: git-send-email 2.46.0 In-Reply-To: <20241101012321.3346333-1-i.maximets@ovn.org> References: <20241101012321.3346333-1-i.maximets@ovn.org> MIME-Version: 1.0 Subject: [ovs-dev] [PATCH v3 06/10] ipsec: Make command timeout configurable. X-BeenThere: ovs-dev@openvswitch.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Ilya Maximets Errors-To: ovs-dev-bounces@openvswitch.org Sender: "dev" Add a new command line option --command-timeout that controls the command timeout. It is important to have this configurable, because the retransmit-timeout is configurable in Libreswan. Also, users may prefer the monitor to be more responsive. ovs-monitor-ipsec options are not documented anywhere, so not trying to address that here. 
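As a rough illustration of how such an option behaves, the sketch below parses --command-timeout with argparse, using the 120-second default from the previous patch. The function name parse_args and the help wording are assumptions made for this example; only the option name, type and default mirror the patch.

```python
import argparse


def parse_args(argv):
    """Parse a command line that accepts --command-timeout.

    Hypothetical stand-alone parser for illustration; the real daemon
    defines many more options.
    """
    parser = argparse.ArgumentParser()
    parser.add_argument("--command-timeout", metavar="TIMEOUT",
                        type=int, default=120,
                        help="Timeout in seconds for external commands.")
    return parser.parse_args(argv)


if __name__ == "__main__":
    # With no option given, the default applies.
    print(parse_args([]).command_timeout)
    # An explicit value overrides it, e.g. for a more responsive monitor.
    print(parse_args(["--command-timeout", "30"]).command_timeout)
```

A deployment that changes Libreswan's retransmit-timeout would presumably want to pick a command timeout that is still comfortably larger than it.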
Signed-off-by: Ilya Maximets Acked-by: Eelco Chaudron Acked-by: Roi Dayan --- ipsec/ovs-monitor-ipsec.in | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/ipsec/ovs-monitor-ipsec.in b/ipsec/ovs-monitor-ipsec.in index 05c1965df..89d1fa3cc 100755 --- a/ipsec/ovs-monitor-ipsec.in +++ b/ipsec/ovs-monitor-ipsec.in @@ -83,6 +83,7 @@ vlog = ovs.vlog.Vlog("ovs-monitor-ipsec") exiting = False monitor = None xfrm = None +command_timeout = None RECONCILIATION_INTERVAL = 15 # seconds TIMEOUT_EXPIRED = 137 # Exit code for a SIGKILL (128 + 9). @@ -98,10 +99,11 @@ def run_command(args, description=None): proc = subprocess.Popen(args, stdout=subprocess.PIPE, stderr=subprocess.PIPE) try: - pout, perr = proc.communicate(timeout=120) + pout, perr = proc.communicate(timeout=command_timeout) ret = proc.returncode except subprocess.TimeoutExpired: - vlog.warn("Timed out after 120 seconds trying to %s." % description) + vlog.warn("Timed out after %s seconds trying to %s." + % (command_timeout, description)) pout, perr = b'', b'' # Just kill the process here. We can't afford waiting for it, # as it may be stuck and may not actually be terminated. @@ -1387,6 +1389,10 @@ def main(): parser.add_argument("--ipsec-ctl", metavar="IPSEC-CTL", help="Use DIR/IPSEC-CTL as location for " " pluto ctl socket (libreswan only).") + parser.add_argument("--command-timeout", metavar="TIMEOUT", + type=int, default=120, + help="Timeout for external commands called by the " + "ovs-monitor-ipsec daemon, e.g. 
ipsec --start.") ovs.vlog.add_args(parser) ovs.daemon.add_args(parser) @@ -1396,11 +1402,13 @@ def main(): global monitor global xfrm + global command_timeout root_prefix = args.root_prefix if args.root_prefix else "" xfrm = XFRM(root_prefix) monitor = IPsecMonitor(root_prefix, args.ike_daemon, not args.no_restart_ike_daemon, args) + command_timeout = args.command_timeout remote = args.database schema_helper = ovs.db.idl.SchemaHelper() From patchwork Fri Nov 1 01:23:08 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ilya Maximets X-Patchwork-Id: 2004938 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=openvswitch.org (client-ip=2605:bc80:3010::133; helo=smtp2.osuosl.org; envelope-from=ovs-dev-bounces@openvswitch.org; receiver=patchwork.ozlabs.org) Received: from smtp2.osuosl.org (smtp2.osuosl.org [IPv6:2605:bc80:3010::133]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4XfjnS0vwcz1xwF for ; Fri, 1 Nov 2024 12:24:12 +1100 (AEDT) Received: from localhost (localhost [127.0.0.1]) by smtp2.osuosl.org (Postfix) with ESMTP id 7F27640F86; Fri, 1 Nov 2024 01:24:10 +0000 (UTC) X-Virus-Scanned: amavis at osuosl.org Received: from smtp2.osuosl.org ([127.0.0.1]) by localhost (smtp2.osuosl.org [127.0.0.1]) (amavis, port 10024) with ESMTP id e1_iUGdjW9iI; Fri, 1 Nov 2024 01:24:09 +0000 (UTC) X-Comment: SPF check N/A for local connections - client-ip=140.211.9.56; helo=lists.linuxfoundation.org; envelope-from=ovs-dev-bounces@openvswitch.org; receiver= DKIM-Filter: OpenDKIM Filter v2.11.0 smtp2.osuosl.org 32FF840EE4 Received: from lists.linuxfoundation.org (lf-lists.osuosl.org 
[140.211.9.56]) by smtp2.osuosl.org (Postfix) with ESMTPS id 32FF840EE4; Fri, 1 Nov 2024 01:24:09 +0000 (UTC) Received: from lf-lists.osuosl.org (localhost [127.0.0.1]) by lists.linuxfoundation.org (Postfix) with ESMTP id 064AFC08A6; Fri, 1 Nov 2024 01:24:09 +0000 (UTC) X-Original-To: ovs-dev@openvswitch.org Delivered-To: ovs-dev@lists.linuxfoundation.org Received: from smtp4.osuosl.org (smtp4.osuosl.org [140.211.166.137]) by lists.linuxfoundation.org (Postfix) with ESMTP id 15531C08A6 for ; Fri, 1 Nov 2024 01:24:07 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by smtp4.osuosl.org (Postfix) with ESMTP id 37ABB40922 for ; Fri, 1 Nov 2024 01:23:55 +0000 (UTC) X-Virus-Scanned: amavis at osuosl.org Received: from smtp4.osuosl.org ([127.0.0.1]) by localhost (smtp4.osuosl.org [127.0.0.1]) (amavis, port 10024) with ESMTP id VN4ksRDZszIr for ; Fri, 1 Nov 2024 01:23:54 +0000 (UTC) Received-SPF: Pass (mailfrom) identity=mailfrom; client-ip=209.85.218.68; helo=mail-ej1-f68.google.com; envelope-from=i.maximets.ovn@gmail.com; receiver= DMARC-Filter: OpenDMARC Filter v1.4.2 smtp4.osuosl.org DD42E408D6 Authentication-Results: smtp4.osuosl.org; dmarc=none (p=none dis=none) header.from=ovn.org DKIM-Filter: OpenDKIM Filter v2.11.0 smtp4.osuosl.org DD42E408D6 Received: from mail-ej1-f68.google.com (mail-ej1-f68.google.com [209.85.218.68]) by smtp4.osuosl.org (Postfix) with ESMTPS id DD42E408D6 for ; Fri, 1 Nov 2024 01:23:53 +0000 (UTC) Received: by mail-ej1-f68.google.com with SMTP id a640c23a62f3a-a9a156513a1so253426466b.0 for ; Thu, 31 Oct 2024 18:23:53 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1730424232; x=1731029032; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=bEGCm+6pXqJetcJwZjyKAfjN6eO6nFY3Y91iDlmS7fE=; b=PJ+DDHhZ3dFPsoegi0/b97XcB7mdokHMZd7Ggfuy50zsjvuZODhvDriqJXUJb8c3NP 
ZTefr5tsSNWbhwNbRcxBoKhGHYtaxLKi2UarJ7YgjPZ9/8isiYzjQ3xiO3Rot5L8fm61 iaEmPLLoRTH6etMxfUZpyqxrOLDaNUTbqGpfGEbuQe1L/g0NgCSFpXo5Cj5hoswQSGGB shwUiAO8vvZrJFUNiAl+Lwe6MuI7VTBKkE99qRMle0Fj4IrApE6ylL5YA9U2w+uUgbud kIcprkeB9gWd60b9ZKuoIzrUMy1BUYD7K931soPWrV5vE6el9VfE0PNAFwZMXvp/4BLI XGbA== X-Gm-Message-State: AOJu0YzLyVemAXefvR4V7GnHt8jnmTZWsz8nFF0iJdO67JVaywm0av6h 9HF0W2x44WDRMv5esaRQKCVmawW4t90NSiStLz9mXw+muwqVDRdFIjsNM1c0 X-Google-Smtp-Source: AGHT+IH0N7RAOVWOmljkPrvotnYzvIP3QAVOZovM0Z6OIziatDLIW68F7kCan/3EynYKKbMZYmHNsA== X-Received: by 2002:a17:907:1b93:b0:a9e:380b:8ce with SMTP id a640c23a62f3a-a9e380b0dfbmr559538366b.35.1730424231613; Thu, 31 Oct 2024 18:23:51 -0700 (PDT) Received: from im-t490s.redhat.com (ip-86-49-44-151.bb.vodafone.cz. [86.49.44.151]) by smtp.gmail.com with ESMTPSA id a640c23a62f3a-a9e5663fef4sm126112766b.149.2024.10.31.18.23.50 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 31 Oct 2024 18:23:50 -0700 (PDT) From: Ilya Maximets To: ovs-dev@openvswitch.org Date: Fri, 1 Nov 2024 02:23:08 +0100 Message-ID: <20241101012321.3346333-8-i.maximets@ovn.org> X-Mailer: git-send-email 2.46.0 In-Reply-To: <20241101012321.3346333-1-i.maximets@ovn.org> References: <20241101012321.3346333-1-i.maximets@ovn.org> MIME-Version: 1.0 Subject: [ovs-dev] [PATCH v3 07/10] system-tests: Verbose cleanup of ports and namespaces. X-BeenThere: ovs-dev@openvswitch.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Ilya Maximets Errors-To: ovs-dev-bounces@openvswitch.org Sender: "dev" Removal of ports and network namespaces can take a significant amount of time, and it is not clear if the test is stuck or actually doing something during that time. Add some logging to cleanup commands to see what is going on. 
Acked-by: Eelco Chaudron Signed-off-by: Ilya Maximets Acked-by: Roi Dayan --- tests/system-common-macros.at | 7 ++----- 1 file changed, 2 insertions(+), 5 deletions(-) diff --git a/tests/system-common-macros.at b/tests/system-common-macros.at index e9be021f3..ff86d15cd 100644 --- a/tests/system-common-macros.at +++ b/tests/system-common-macros.at @@ -2,10 +2,7 @@ # # Delete namespaces from the running OS m4_define([DEL_NAMESPACES], - [m4_foreach([ns], [$@], - [ip netns del ns -]) - ] + [m4_foreach([ns], [$@], [echo removing namespace ns; ip netns del ns])] ) # ADD_NAMESPACES(ns [, ns ... ]) @@ -72,7 +69,7 @@ m4_define([ADD_INT], # m4_define([ADD_VETH], [ AT_CHECK([ip link add $1 type veth peer name ovs-$1 || return 77]) - on_exit 'ip link del ovs-$1' + on_exit 'echo removing interface ovs-$1; ip link del ovs-$1' CONFIGURE_VETH_OFFLOADS([$1]) AT_CHECK([ip link set $1 netns $2]) AT_CHECK([ip link set dev ovs-$1 up]) From patchwork Fri Nov 1 01:23:09 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ilya Maximets X-Patchwork-Id: 2004940 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=openvswitch.org (client-ip=140.211.166.136; helo=smtp3.osuosl.org; envelope-from=ovs-dev-bounces@openvswitch.org; receiver=patchwork.ozlabs.org) Received: from smtp3.osuosl.org (smtp3.osuosl.org [140.211.166.136]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4Xfjnn0pH2z1xwF for ; Fri, 1 Nov 2024 12:24:28 +1100 (AEDT) Received: from localhost (localhost [127.0.0.1]) by smtp3.osuosl.org (Postfix) with ESMTP id 4DDCA60C10; Fri, 1 Nov 2024 01:24:27 +0000 (UTC) X-Virus-Scanned: amavis at osuosl.org 
Received: from smtp3.osuosl.org ([127.0.0.1]) by localhost (smtp3.osuosl.org [127.0.0.1]) (amavis, port 10024) with ESMTP id jC61pkaqWgqs; Fri, 1 Nov 2024 01:24:24 +0000 (UTC) X-Comment: SPF check N/A for local connections - client-ip=2605:bc80:3010:104::8cd3:938; helo=lists.linuxfoundation.org; envelope-from=ovs-dev-bounces@openvswitch.org; receiver= DKIM-Filter: OpenDKIM Filter v2.11.0 smtp3.osuosl.org 15A3960D76 Received: from lists.linuxfoundation.org (lf-lists.osuosl.org [IPv6:2605:bc80:3010:104::8cd3:938]) by smtp3.osuosl.org (Postfix) with ESMTPS id 15A3960D76; Fri, 1 Nov 2024 01:24:24 +0000 (UTC) Received: from lf-lists.osuosl.org (localhost [127.0.0.1]) by lists.linuxfoundation.org (Postfix) with ESMTP id CFF48C08A6; Fri, 1 Nov 2024 01:24:23 +0000 (UTC) X-Original-To: ovs-dev@openvswitch.org Delivered-To: ovs-dev@lists.linuxfoundation.org Received: from smtp4.osuosl.org (smtp4.osuosl.org [140.211.166.137]) by lists.linuxfoundation.org (Postfix) with ESMTP id 0CE54C08A6 for ; Fri, 1 Nov 2024 01:24:23 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by smtp4.osuosl.org (Postfix) with ESMTP id 3ABB6408AF for ; Fri, 1 Nov 2024 01:23:59 +0000 (UTC) X-Virus-Scanned: amavis at osuosl.org Received: from smtp4.osuosl.org ([127.0.0.1]) by localhost (smtp4.osuosl.org [127.0.0.1]) (amavis, port 10024) with ESMTP id IUMRW7WY4JyQ for ; Fri, 1 Nov 2024 01:23:57 +0000 (UTC) Received-SPF: Pass (mailfrom) identity=mailfrom; client-ip=209.85.218.65; helo=mail-ej1-f65.google.com; envelope-from=i.maximets.ovn@gmail.com; receiver= DMARC-Filter: OpenDMARC Filter v1.4.2 smtp4.osuosl.org 8824440934 Authentication-Results: smtp4.osuosl.org; dmarc=none (p=none dis=none) header.from=ovn.org DKIM-Filter: OpenDKIM Filter v2.11.0 smtp4.osuosl.org 8824440934 Received: from mail-ej1-f65.google.com (mail-ej1-f65.google.com [209.85.218.65]) by smtp4.osuosl.org (Postfix) with ESMTPS id 8824440934 for ; Fri, 1 Nov 2024 01:23:56 +0000 (UTC) Received: by mail-ej1-f65.google.com with 
SMTP id a640c23a62f3a-a9a4031f69fso214939866b.0 for ; Thu, 31 Oct 2024 18:23:56 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1730424234; x=1731029034; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=ZzYPrsGyOK0Ffiiyx/BZtFW/GFYIu4WDeaTsjkLtZuo=; b=UBF5IQoXo23heA3v9iAEZnU94rbLTJ4Ygj+1TulQAEntsNk1B+EmiU8IJyu+LMonXn 8ZYMhlBkwAQBxYBD93HgGonoDpUYrZgIsAGKMU5fJMInhbqKmZHjvLdbygRETm7oAMXK V7tOFmxrLTcDQ7bfSlKFrD6T7fmZxHETecpxt9m3R426HsIxKn3lW1qEtXEwhAVt6LC2 kcj4EjaWWmuEvJEef+i0+mVTiznbnYIbPvWmkDMM+FKrlmPnd9SctSMHQOIvORjjOLYS zh7o4TPdeks3kJqhTSM+quO8TyXNAsHXfoim9Ys3iDznVv8fIIrwdAUNZrpG8cu+B1yQ BVgg== X-Gm-Message-State: AOJu0YxDzxIjXnSISeAOUC3FACz4EJAYRhSXvu+DCJlVPiYGuULyMsK6 ThqHX1f25/XQ/viWxYMDAJFKxAyfTGNsaltojxsV9k8O6zAh4EpxOrALFCee X-Google-Smtp-Source: AGHT+IFu1HYpv96rDnY8U2KW34x2Y9lEFFPhWxxzRYvE89Ygd1QgyT7gQBDDb+gkAUJG1xBbQWH+Jw== X-Received: by 2002:a17:907:94c4:b0:a99:fc9a:5363 with SMTP id a640c23a62f3a-a9de5c919e4mr2036929266b.9.1730424233926; Thu, 31 Oct 2024 18:23:53 -0700 (PDT) Received: from im-t490s.redhat.com (ip-86-49-44-151.bb.vodafone.cz. [86.49.44.151]) by smtp.gmail.com with ESMTPSA id a640c23a62f3a-a9e5663fef4sm126112766b.149.2024.10.31.18.23.52 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 31 Oct 2024 18:23:53 -0700 (PDT) From: Ilya Maximets To: ovs-dev@openvswitch.org Date: Fri, 1 Nov 2024 02:23:09 +0100 Message-ID: <20241101012321.3346333-9-i.maximets@ovn.org> X-Mailer: git-send-email 2.46.0 In-Reply-To: <20241101012321.3346333-1-i.maximets@ovn.org> References: <20241101012321.3346333-1-i.maximets@ovn.org> MIME-Version: 1.0 Subject: [ovs-dev] [PATCH v3 08/10] tests: ipsec: Add NxN + reconciliation test. 
X-BeenThere: ovs-dev@openvswitch.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Ilya Maximets Errors-To: ovs-dev-bounces@openvswitch.org Sender: "dev" Add a test to check establishment of IPsec connections among multiple nodes and check the reconciliation logic along the way. The test: - Creates 20 network namespaces. - Starts Libreswan, OVS and ovs-monitor-ipsec in each of them. - Adds a geneve tunnel from each namespace to every other namespace. - Checks that each namespace has all the IPsec connections loaded. - Removes a few connections manually. - Checks that these connections are added back. Unfortunately, many widely used versions of Libreswan have issues of pluto crashing frequently. For that reason the test is trying to bring pluto back online once it finds a dead one. Also, since retransmit-timeout is 60 seconds and our command timeout is 120, we can't actually use the OVS_WAIT_UNTIL macro most of the time, so the checks are done in the custom loop that waits up to 300 seconds. 
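The shape of that custom loop can be sketched in Python as below. This is a hypothetical illustration, not the test code, which is written in shell and m4. The callables `get_loaded` and `revive` are assumptions standing in for parsing `ipsec status` and restarting pluto. The expected count of (N - 1) * 2 loaded connections per node, the 3 second sleep and the limit of 100 iterations are taken from the test.

```python
import time


def expected_connections(nodes):
    # The test expects (NODES - 1) * 2 loaded connections on every node.
    return (nodes - 1) * 2


def wait_for_loaded(get_loaded, revive, expected,
                    interval=3, max_iterations=100, sleep=time.sleep):
    """Poll until `expected` connections are loaded; True on success.

    get_loaded() returns the loaded connection count, or None when the
    IKE daemon does not respond; revive() restarts the daemon.
    """
    for _ in range(max_iterations):
        loaded = get_loaded()
        if loaded is None:
            # The daemon died: bring it back and keep polling.
            revive()
        elif loaded == expected:
            return True
        sleep(interval)
    # 100 iterations of 3 seconds gives the 300 second budget.
    return False
```

With 20 nodes, `expected_connections(20)` evaluates to 38.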
Acked-by: Eelco Chaudron Signed-off-by: Ilya Maximets Acked-by: Roi Dayan --- tests/system-ipsec.at | 138 ++++++++++++++++++++++++++++++++++++++---- 1 file changed, 125 insertions(+), 13 deletions(-) diff --git a/tests/system-ipsec.at b/tests/system-ipsec.at index 1e155fece..de459804b 100644 --- a/tests/system-ipsec.at +++ b/tests/system-ipsec.at @@ -8,6 +8,18 @@ m4_define([IPSEC_SETUP_UNDERLAY], dnl Set up the underlay switch AT_CHECK([ovs-ofctl add-flow br0 "actions=normal"])]) +m4_define([START_PLUTO], [ + rm -f $ovs_base/$1/pluto.pid + mkdir -p $ovs_base/$1/ipsec.d + touch $ovs_base/$1/ipsec.conf + touch $ovs_base/$1/secrets + ipsec initnss --nssdir $ovs_base/$1/ipsec.d + NS_CHECK_EXEC([$1], [ipsec pluto --config $ovs_base/$1/ipsec.conf \ + --ipsecdir $ovs_base/$1 --nssdir $ovs_base/$1/ipsec.d \ + --logfile $ovs_base/$1/pluto.log --secretsfile $ovs_base/$1/secrets \ + --rundir $ovs_base/$1], [0], [], [stderr]) +]) + dnl IPSEC_ADD_NODE([namespace], [device], [address], [peer address])) dnl dnl Creates a dummy host that acts as an IPsec endpoint.
Creates host in @@ -45,15 +57,8 @@ m4_define([IPSEC_ADD_NODE], on_exit "kill_ovs_vswitchd `cat $ovs_base/$1/vswitchd.pid`" dnl Start pluto - mkdir -p $ovs_base/$1/ipsec.d - touch $ovs_base/$1/ipsec.conf - touch $ovs_base/$1/secrets - ipsec initnss --nssdir $ovs_base/$1/ipsec.d - NS_CHECK_EXEC([$1], [ipsec pluto --config $ovs_base/$1/ipsec.conf \ - --ipsecdir $ovs_base/$1 --nssdir $ovs_base/$1/ipsec.d \ - --logfile $ovs_base/$1/pluto.log --secretsfile $ovs_base/$1/secrets \ - --rundir $ovs_base/$1], [0], [], [stderr]) - on_exit "kill `cat $ovs_base/$1/pluto.pid`" + START_PLUTO([$1]) + on_exit 'kill $(cat $ovs_base/$1/pluto.pid)' dnl Start ovs-monitor-ipsec NS_CHECK_EXEC([$1], [ovs-monitor-ipsec unix:${OVS_RUNDIR}/$1/db.sock\ @@ -110,16 +115,18 @@ m4_define([CHECK_LIBRESWAN], dnl IPSEC_STATUS_LOADED([]) dnl dnl Get number of loaded connections from ipsec status -m4_define([IPSEC_STATUS_LOADED], [ipsec --rundir $ovs_base/$1 status | \ +m4_define([IPSEC_STATUS_LOADED], [ + ipsec --rundir $ovs_base/$1 status | \ grep "Total IPsec connections" | \ - sed 's/[[0-9]]* *Total IPsec connections: loaded \([[0-2]]\), active \([[0-2]]\).*/\1/m']) + sed 's/[[0-9]]* *Total IPsec connections: loaded \([[0-9]]*\), active \([[0-9]]*\).*/\1/m']) dnl IPSEC_STATUS_ACTIVE([]) dnl dnl Get number of active connections from ipsec status -m4_define([IPSEC_STATUS_ACTIVE], [ipsec --rundir $ovs_base/$1 status | \ +m4_define([IPSEC_STATUS_ACTIVE], [ + ipsec --rundir $ovs_base/$1 status | \ grep "Total IPsec connections" | \ - sed 's/[[0-9]]* *Total IPsec connections: loaded \([[0-2]]\), active \([[0-2]]\).*/\2/m']) + sed 's/[[0-9]]* *Total IPsec connections: loaded \([[0-9]]*\), active \([[0-9]]*\).*/\2/m']) dnl CHECK_ESP_TRAFFIC() dnl @@ -401,3 +408,108 @@ CHECK_ESP_TRAFFIC OVS_TRAFFIC_VSWITCHD_STOP() AT_CLEANUP + +AT_SETUP([IPsec -- Libreswan NxN geneve tunnels + reconciliation]) +AT_KEYWORDS([ipsec libreswan scale reconciliation]) +dnl Note: Geneve test may not work on older kernels due to 
CVE-2020-25645 +dnl https://bugzilla.redhat.com/show_bug.cgi?id=1883988 + +CHECK_LIBRESWAN() +OVS_TRAFFIC_VSWITCHD_START() +IPSEC_SETUP_UNDERLAY() + +m4_define([NODES], [20]) + +dnl Set up fake hosts. +m4_for([id], [1], NODES, [1], [ + IPSEC_ADD_NODE([node-id], [p-id], 10.1.1.id, 10.1.1.254) + AT_CHECK([ovs-pki -b -d ${ovs_base} -l ${ovs_base}/ovs-pki.log \ + req -u node-id], [0], [stdout]) + AT_CHECK([ovs-pki -b -d ${ovs_base} -l ${ovs_base}/ovs-pki.log \ + self-sign node-id], [0], [stdout]) + AT_CHECK(OVS_VSCTL([node-id], set Open_vSwitch . \ + other_config:certificate=${ovs_base}/node-id-cert.pem \ + other_config:private_key=${ovs_base}/node-id-privkey.pem), + [0], [ignore], [ignore]) + on_exit "ipsec --rundir $ovs_base/node-id status > $ovs_base/node-id/status" +]) + +dnl Create a full mesh of tunnels. +m4_for([LEFT], [1], NODES, [1], [ + m4_for([RIGHT], [1], NODES, [1], [ + if test LEFT -ne RIGHT; then + AT_CHECK(OVS_VSCTL(node-LEFT, add-port br-ipsec tun-RIGHT \ + -- set Interface tun-RIGHT type=geneve options:remote_ip=10.1.1.RIGHT \ + options:remote_cert=${ovs_base}/node-RIGHT-cert.pem), + [0], [ignore], [ignore]) + fi +])]) + +m4_define([WAIT_FOR_LOADED_CONNS], [ + m4_for([id], [1], NODES, [1], [ + echo "================== node-id =========================" + iterations=0 + loaded=0 + dnl Using a custom loop instead of OVS_WAIT_UNTIL, because it may take + dnl much longer than a default timeout. The default retransmit timeout + dnl for pluto is 60 seconds. Also, we need to make sure pluto didn't + dnl crash in the process and revive it if it did, unfortunately. + while true; do + date + AT_CHECK([ipsec --rundir $ovs_base/node-id status 2>&1 \ + | grep -E "whack|Total"], [ignore], [stdout]) + if grep -E 'is Pluto running?|refused' stdout; then + echo "node-id: Pluto died, restarting..." 
+ START_PLUTO([node-id]) + else + loaded=$(IPSEC_STATUS_LOADED(node-id)) + fi + if test "$loaded" -ne $(( (NODES - 1) * 2 )); then + sleep 3 + else + break + fi + let iterations=$iterations+1 + AT_CHECK([test $iterations -lt 100]) + done + ]) +]) + +dnl Wait for all the connections to be loaded to pluto. Not waiting for +dnl them to become active, because if pluto is down on one of the nodes, +dnl some connections may not become active until we revive it. Some +dnl connections may also never become active due to bugs in libreswan 4.x. +WAIT_FOR_LOADED_CONNS() + +AT_CHECK([ipsec auto --help], [ignore], [ignore], [stderr]) +auto=auto +if test -s stderr; then + auto= +fi + +dnl Remove connections for two tunnels. One fully and one partially. +AT_CHECK([ipsec $auto --ctlsocket $ovs_base/node-1/pluto.ctl \ + --config $ovs_base/node-1/ipsec.conf \ + --delete tun-5-out-1], [0], [stdout]) +AT_CHECK([ipsec $auto --ctlsocket $ovs_base/node-1/pluto.ctl \ + --config $ovs_base/node-1/ipsec.conf \ + --delete tun-2-in-1], [0], [stdout]) +AT_CHECK([ipsec $auto --ctlsocket $ovs_base/node-1/pluto.ctl \ + --config $ovs_base/node-1/ipsec.conf \ + --delete tun-2-out-1], [0], [stdout]) + +dnl Wait for the monitor to notice the missing connections. +OVS_WAIT_UNTIL([grep -q 'tun-2.*need to reconcile' \ + $ovs_base/node-1/ovs-monitor-ipsec.log]) + +dnl Wait for all the connections to be loaded back. +WAIT_FOR_LOADED_CONNS() + +dnl These are not necessary, but nice to have in the test log in +dnl order to spot pluto failures during the test. 
+grep -E 'Timed out|outdated|half-loaded|defunct' \ + $ovs_base/node-*/ovs-monitor-ipsec.log +grep -E 'ABORT|ERROR' $ovs_base/node-*/pluto.log + +OVS_TRAFFIC_VSWITCHD_STOP() +AT_CLEANUP From patchwork Fri Nov 1 01:23:10 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ilya Maximets X-Patchwork-Id: 2004941 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=openvswitch.org (client-ip=2605:bc80:3010::133; helo=smtp2.osuosl.org; envelope-from=ovs-dev-bounces@openvswitch.org; receiver=patchwork.ozlabs.org) Received: from smtp2.osuosl.org (smtp2.osuosl.org [IPv6:2605:bc80:3010::133]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4XfjpV0WHWz1xwF for ; Fri, 1 Nov 2024 12:25:05 +1100 (AEDT) Received: from localhost (localhost [127.0.0.1]) by smtp2.osuosl.org (Postfix) with ESMTP id 6186F40F33; Fri, 1 Nov 2024 01:25:04 +0000 (UTC) X-Virus-Scanned: amavis at osuosl.org Received: from smtp2.osuosl.org ([127.0.0.1]) by localhost (smtp2.osuosl.org [127.0.0.1]) (amavis, port 10024) with ESMTP id lB0cex4i6Upp; Fri, 1 Nov 2024 01:25:02 +0000 (UTC) X-Comment: SPF check N/A for local connections - client-ip=140.211.9.56; helo=lists.linuxfoundation.org; envelope-from=ovs-dev-bounces@openvswitch.org; receiver= DKIM-Filter: OpenDKIM Filter v2.11.0 smtp2.osuosl.org B1EAC40F31 Received: from lists.linuxfoundation.org (lf-lists.osuosl.org [140.211.9.56]) by smtp2.osuosl.org (Postfix) with ESMTPS id B1EAC40F31; Fri, 1 Nov 2024 01:25:02 +0000 (UTC) Received: from lf-lists.osuosl.org (localhost [127.0.0.1]) by lists.linuxfoundation.org (Postfix) with ESMTP id 7BE5AC08A6; Fri, 1 Nov 2024 01:25:02 +0000 
(UTC) X-Original-To: ovs-dev@openvswitch.org Delivered-To: ovs-dev@lists.linuxfoundation.org Received: from smtp1.osuosl.org (smtp1.osuosl.org [IPv6:2605:bc80:3010::138]) by lists.linuxfoundation.org (Postfix) with ESMTP id A77CCC08A3 for ; Fri, 1 Nov 2024 01:25:00 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by smtp1.osuosl.org (Postfix) with ESMTP id 3F7E48197D for ; Fri, 1 Nov 2024 01:24:09 +0000 (UTC) X-Virus-Scanned: amavis at osuosl.org Received: from smtp1.osuosl.org ([127.0.0.1]) by localhost (smtp1.osuosl.org [127.0.0.1]) (amavis, port 10024) with ESMTP id DND5MomaWqSB for ; Fri, 1 Nov 2024 01:24:03 +0000 (UTC) Received-SPF: Pass (mailfrom) identity=mailfrom; client-ip=209.85.218.67; helo=mail-ej1-f67.google.com; envelope-from=i.maximets.ovn@gmail.com; receiver= DMARC-Filter: OpenDMARC Filter v1.4.2 smtp1.osuosl.org 2391881B9D Authentication-Results: smtp1.osuosl.org; dmarc=none (p=none dis=none) header.from=ovn.org DKIM-Filter: OpenDKIM Filter v2.11.0 smtp1.osuosl.org 2391881B9D Received: from mail-ej1-f67.google.com (mail-ej1-f67.google.com [209.85.218.67]) by smtp1.osuosl.org (Postfix) with ESMTPS id 2391881B9D for ; Fri, 1 Nov 2024 01:24:00 +0000 (UTC) Received: by mail-ej1-f67.google.com with SMTP id a640c23a62f3a-a9e71401844so12444166b.3 for ; Thu, 31 Oct 2024 18:23:59 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1730424238; x=1731029038; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=IRb+u0IHVwOquEt6lxIBShV+f6uTMB4aRnomVo9+GYY=; b=IhdyxccruaUWxtGibay9wurZVbZ1wMHg/b9TasRGh25tnBrk6Dlv/Im+WpXMyH+3oi P/FG6UC+JCr8whwVtU1b4d4gM5XIPu66fhNtfLs/Do/BZjOUTbwZ7F2qhD/dS7fqPmYy UmRwsimXH6tZnBDBvuttnr6oDScFV23Pur7NJFsa9/Vcca21asjku4TB2B2F1VCqPT07 dIJQIYbBq9AY2gnroEAW7bZohb6TApxFvOtLeXqP/9rgm/4OTuuoWRZrPIB14NjLkBVV 
bUSuawqmGJnqlfM8yerGoE5uOz6M+QCWE9qnVfQYlCHpVtBAXOGdQLWMOuwDdroAYcBp +ocA== X-Gm-Message-State: AOJu0Yz/yIQdVpi8BRJzUU2/JBMk0Gs8HrxDeJMXWO/saGZRPW8P2Qiz fraMB5zk7P4qg5eb+2ioqe6RVzwz4eZio+gNOSgf95B/NGhf/dF+KMaGzceE X-Google-Smtp-Source: AGHT+IHzIVusVNXYlMWum7YAjmDMxDqLbw/11HmQ7fH1Fuoqb+YVXesWBi1vdsgC0mfNwI/ja771rg== X-Received: by 2002:a17:907:97ce:b0:a9a:e91:68c5 with SMTP id a640c23a62f3a-a9e5093efccmr488837866b.33.1730424237687; Thu, 31 Oct 2024 18:23:57 -0700 (PDT) Received: from im-t490s.redhat.com (ip-86-49-44-151.bb.vodafone.cz. [86.49.44.151]) by smtp.gmail.com with ESMTPSA id a640c23a62f3a-a9e5663fef4sm126112766b.149.2024.10.31.18.23.54 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 31 Oct 2024 18:23:55 -0700 (PDT) From: Ilya Maximets To: ovs-dev@openvswitch.org Date: Fri, 1 Nov 2024 02:23:10 +0100 Message-ID: <20241101012321.3346333-10-i.maximets@ovn.org> X-Mailer: git-send-email 2.46.0 In-Reply-To: <20241101012321.3346333-1-i.maximets@ovn.org> References: <20241101012321.3346333-1-i.maximets@ovn.org> MIME-Version: 1.0 Subject: [ovs-dev] [PATCH v3 09/10] tests: ipsec: Check that nodes can ping each other in the NxN test. X-BeenThere: ovs-dev@openvswitch.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Ilya Maximets Errors-To: ovs-dev-bounces@openvswitch.org Sender: "dev" Expand the NxN test with the network connectivity check between all the nodes. Unfortunately, we can't really run this test with Libreswan 4.x, since, due to internal issues in these versions, we are getting into states where everything is loaded and active, but no traffic can pass. This is an internal issue in Libreswan that we can't workaround from the outside. So, the fix is required in Libreswan itself. 4.5 and earlier versions seem to not be affected by this problem, at least not severely affected, but it's easier to just cut off all the 4.x versions from the test. 
3.32 version from Ubuntu 22.04 and Libreswan 5.1 work just fine with this test. Test is relatively long, but it is very valuable, IMO. Besides stressing ovs-monitor-ipsec with various failure and asynchronous connection establishment conditions, which are important for OVS, it also was used to reproduce and fix several bugs in Libreswan 4.x. Unfortunately, not all the issues are understood and fixed yet. Signed-off-by: Ilya Maximets Acked-by: Eelco Chaudron Acked-by: Roi Dayan --- tests/system-ipsec.at | 84 ++++++++++++++++++++++++++++++++++++++----- 1 file changed, 76 insertions(+), 8 deletions(-) diff --git a/tests/system-ipsec.at b/tests/system-ipsec.at index de459804b..4ab384d89 100644 --- a/tests/system-ipsec.at +++ b/tests/system-ipsec.at @@ -71,7 +71,9 @@ m4_define([IPSEC_ADD_NODE], on_exit "kill `cat $ovs_base/$1/ovs-monitor-ipsec.pid`" dnl Set up OVS bridge - NS_EXEC([$1], [ovs-vsctl --db unix:$ovs_base/$1/db.sock add-br br-ipsec])] + NS_CHECK_EXEC([$1], + [ovs-vsctl --db unix:$ovs_base/$1/db.sock add-br br-ipsec \ + -- set-controller br-ipsec punix:$ovs_base/br-ipsec.$1.mgmt])] ) m4_define([IPSEC_ADD_NODE_LEFT], [IPSEC_ADD_NODE(left, p0, $1, $2)]) m4_define([IPSEC_ADD_NODE_RIGHT], [IPSEC_ADD_NODE(right, p1, $1, $2)]) @@ -429,7 +431,8 @@ m4_for([id], [1], NODES, [1], [ self-sign node-id], [0], [stdout]) AT_CHECK(OVS_VSCTL([node-id], set Open_vSwitch . \ other_config:certificate=${ovs_base}/node-id-cert.pem \ - other_config:private_key=${ovs_base}/node-id-privkey.pem), + other_config:private_key=${ovs_base}/node-id-privkey.pem \ + -- set bridge br-ipsec other-config:hwaddr=f2:ff:00:00:00:id), [0], [ignore], [ignore]) on_exit "ipsec --rundir $ovs_base/node-id status > $ovs_base/node-id/status" ]) @@ -445,11 +448,18 @@ m4_for([LEFT], [1], NODES, [1], [ fi ])]) +dnl These are not necessary, but nice to have in the test log in +dnl order to spot pluto failures during the test.
+on_exit "grep -E 'Timed out|outdated|half-loaded|defunct' \ + $ovs_base/node-*/ovs-monitor-ipsec.log" +on_exit "grep -E 'ABORT|ERROR' $ovs_base/node-*/pluto.log" + m4_define([WAIT_FOR_LOADED_CONNS], [ m4_for([id], [1], NODES, [1], [ echo "================== node-id =========================" iterations=0 loaded=0 + active=0 dnl Using a custom loop instead of OVS_WAIT_UNTIL, because it may take dnl much longer than a default timeout. The default retransmit timeout dnl for pluto is 60 seconds. Also, we need to make sure pluto didn't @@ -463,8 +473,11 @@ m4_define([WAIT_FOR_LOADED_CONNS], [ START_PLUTO([node-id]) else loaded=$(IPSEC_STATUS_LOADED(node-id)) + m4_if([$1], [active], + [active=$(IPSEC_STATUS_ACTIVE(node-id))], [active=$loaded]) fi - if test "$loaded" -ne $(( (NODES - 1) * 2 )); then + if test "$loaded" -ne "$(( (NODES - 1) * 2 ))" -o \ + "$loaded" -ne "$active"; then sleep 3 else break @@ -505,11 +518,66 @@ OVS_WAIT_UNTIL([grep -q 'tun-2.*need to reconcile' \ dnl Wait for all the connections to be loaded back. WAIT_FOR_LOADED_CONNS() -dnl These are not necessary, but nice to have in the test log in -dnl order to spot pluto failures during the test. -grep -E 'Timed out|outdated|half-loaded|defunct' \ - $ovs_base/node-*/ovs-monitor-ipsec.log -grep -E 'ABORT|ERROR' $ovs_base/node-*/pluto.log +dnl Next section will check connectivity between all the nodes. +dnl Different versions of Libreswan 4.x have issues where connections +dnl are not being correctly established or never become active in a +dnl way that can not be mitigated from ovs-monitor-ipsec or the test. +dnl So, only checking connectivity for Libreswan 3- or 5+. +dnl Skipping in the middle of the test, so test can still fail while +dnl testing with Libreswan 4, if the first half fails. 
+AT_SKIP_IF([ipsec --version 2>&1 | grep -q 'Libreswan 4\.']) + +dnl Turn off IPv6 and add static ARP entries for all namespaces to avoid +dnl any broadcast / multicast traffic that would otherwise be multiplied +dnl by each node creating a traffic storm. Add specific OpenFlow rules +dnl to forward traffic to exact destinations without any MAC learning. +m4_for([LEFT], [1], NODES, [1], [ + NS_CHECK_EXEC([node-LEFT], [sysctl -w net.ipv6.conf.all.disable_ipv6=1], + [0], [ignore]) + AT_CHECK([ovs-ofctl del-flows unix:$ovs_base/br-ipsec.node-LEFT.mgmt]) + AT_CHECK([ovs-ofctl add-flow unix:$ovs_base/br-ipsec.node-LEFT.mgmt \ + "dl_dst=f2:ff:00:00:00:LEFT actions=LOCAL"]) + m4_for([RIGHT], [1], NODES, [1], [ + if test LEFT -ne RIGHT; then + NS_CHECK_EXEC([node-LEFT], + [ip neigh add 192.0.0.RIGHT lladdr f2:ff:00:00:00:RIGHT dev br-ipsec]) + AT_CHECK([ovs-ofctl add-flow unix:$ovs_base/br-ipsec.node-LEFT.mgmt \ + "dl_dst=f2:ff:00:00:00:RIGHT actions=tun-RIGHT"]) + fi + ]) +]) + +dnl Bring up and add IP addresses for br-ipsec interface. +m4_for([id], [1], NODES, [1], [ + echo "================== node-id =========================" + NS_CHECK_EXEC([node-id], [ip addr add 192.0.0.id/24 dev br-ipsec]) + NS_CHECK_EXEC([node-id], [ip link set dev br-ipsec up]) +]) + +dnl Wait for all the connections to be loaded and active. In case one of +dnl the pluto processes crashed some of the connections may never become +dnl active. But we did run this loop with a pluto reviving logic twice +dnl already, so the chances for pluto to be down here are much lower. +WAIT_FOR_LOADED_CONNS([active]) + +dnl Check the full mesh ping. +m4_for([LEFT], [1], NODES, [1], [ + m4_for([RIGHT], [1], NODES, [1], [ + if test LEFT -ne RIGHT; then + echo "====== ping: node-LEFT --> node-RIGHT ==========" + dnl Ping without checking in case connection will recover after the + dnl first packet. + NS_CHECK_EXEC([node-LEFT], + [ping -q -c 1 -W 2 192.0.0.RIGHT | FORMAT_PING], + [ignore], [stdout]) + dnl Now check. 
If this one fails, there is no actual connectivity.
+            NS_CHECK_EXEC([node-LEFT],
+              [ping -q -c 3 -i 0.1 -W 2 192.0.0.RIGHT | FORMAT_PING],
+              [0], [dnl
+3 packets transmitted, 3 received, 0% packet loss, time 0ms
+])
+        fi
+])])

 OVS_TRAFFIC_VSWITCHD_STOP()
 AT_CLEANUP

From patchwork Fri Nov 1 01:23:11 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ilya Maximets
X-Patchwork-Id: 2004942
From: Ilya Maximets
To: ovs-dev@openvswitch.org
Date: Fri, 1 Nov 2024 02:23:11 +0100
Message-ID: <20241101012321.3346333-11-i.maximets@ovn.org>
X-Mailer: git-send-email 2.46.0
In-Reply-To: <20241101012321.3346333-1-i.maximets@ovn.org>
References: <20241101012321.3346333-1-i.maximets@ovn.org>
Subject: [ovs-dev] [PATCH v3 10/10] ipsec: libreswan: Reduce chances for
 crossing streams.
Cc: Ilya Maximets

Normally ovs-monitor-ipsec will start all the connections it manages.
This is required, because we do not generally know if the other side
of the tunnel is going to initiate the IPsec connection or not. For
example, the other side might not belong to an OVS setup, so it may
not be managed by the other instance of ovs-monitor-ipsec. There are
also issues in Libreswan that may cause the other side to fail the
connection initiation in a way that it will not try again.

However, in many cases the other side is managed by ovs-monitor-ipsec.
And in that scenario there is a high chance that both sides will try
to initiate the connection at the same time. This is known as
crossing streams. Unfortunately, Libreswan, 4.x in particular, doesn't
handle this well and either crashes or ends up in a state where
connections are reported as active, but no traffic can actually go
through.

For tunnels where we create separate incoming and outgoing connections
(geneve), we may start (add + up) the outgoing connection and only
add the incoming one. This would give the other side some time to
initiate, avoiding the crossing streams and giving Libreswan a higher
chance to survive.

We still have to try to bring the incoming connections up at some point
if they do not become active. Reconciliation logic will take care of
this. Next time we check the active connections, we'll try to
reconcile and will bring all the loaded but not active connections up.
So, we're losing at most 15 seconds if something goes wrong.

This change greatly improves stability with Libreswan 4.x. It's still
not enough to enable the ping test for it, but hopefully enough for
real-world setups to not hit the Libreswan issues often.

GRE connections will still be started from both sides. We do already
have some issues in case users name their tunnels with -in- or -out-
in the name, so it's not a new problem, but if the regex accidentally
matches on such a GRE tunnel, we'll again lose at most 15 seconds
before they will be brought up during reconciliation. So, it should
not be a big deal.

Note: ipsec auto in Libreswan < 5 accepts --asynchronous together with
--add, even though the --asynchronous flag is only for up/down/start,
but Libreswan 5 fails the command, so we need to add it conditionally.
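
The decision logic described above can be sketched in isolation as
follows. This is only an illustration, not the code from the patch:
the helper names (initial_action, ipsec_auto_args,
connections_to_bring_up) and the connection names are made up for the
example, and only the regex and the conditional --asynchronous
handling mirror what the patch does.

```python
import re

# Same pattern as in the patch: a connection name ending in "-in-<digits>"
# is treated as an incoming connection.
INCOMING_RE = re.compile(r".*-in-\d+$")


def initial_action(conn):
    """Hypothetical helper: "add" for incoming connections, "start"
    (add + up) for everything else."""
    return "add" if INCOMING_RE.match(conn) else "start"


def ipsec_auto_args(conn, action):
    """Hypothetical helper: build the tail of the 'ipsec auto' arguments.

    --asynchronous is omitted for "add", because Libreswan 5 fails the
    command when the flag is combined with --add.
    """
    asynchronous = [] if action == "add" else ["--asynchronous"]
    return ["--" + action, *asynchronous, conn]


def connections_to_bring_up(loaded, active):
    """Hypothetical helper: connections that are loaded but not active
    and therefore need an explicit "up" during reconciliation."""
    return sorted(set(loaded) - set(active))


if __name__ == "__main__":
    # Illustrative names only; real names are generated by the daemon.
    for name in ("tun-in-1", "tun-out-1", "gre0-1"):
        action = initial_action(name)
        print(name, action, ipsec_auto_args(name, action))
    print(connections_to_bring_up(["tun-in-1", "tun-out-1"], ["tun-out-1"]))
```

A connection whose name happens to match the pattern without being an
incoming one (for example a GRE tunnel with -in- in its name) is only
added at first and is picked up by the reconciliation step on the next
refresh.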

Signed-off-by: Ilya Maximets
Acked-by: Eelco Chaudron
---
 ipsec/ovs-monitor-ipsec.in | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/ipsec/ovs-monitor-ipsec.in b/ipsec/ovs-monitor-ipsec.in
index 89d1fa3cc..6c60c07e3 100755
--- a/ipsec/ovs-monitor-ipsec.in
+++ b/ipsec/ovs-monitor-ipsec.in
@@ -682,8 +682,16 @@ conn prevent_unencrypted_vxlan
                     active.discard(conn)

                 for conn in desired:
-                    vlog.info("Starting ipsec connection %s" % conn)
-                    self._start_ipsec_connection(conn, "start")
+                    # Start (add + up) outgoing connections and only add
+                    # incoming ones. If the other side will not initiate
+                    # the connection and it will not become active, we'll
+                    # bring it up during the next refresh.
+                    if re.match(r".*-in-\d+$", conn):
+                        vlog.info("Adding ipsec connection %s" % conn)
+                        self._start_ipsec_connection(conn, "add")
+                    else:
+                        vlog.info("Starting ipsec connection %s" % conn)
+                        self._start_ipsec_connection(conn, "start")
             else:
                 # Ask pluto to bring UP connections that are loaded,
                 # but not active for some reason.
@@ -834,11 +842,12 @@ conn prevent_unencrypted_vxlan
                         "--delete", conn], "delete %s" % conn)

     def _start_ipsec_connection(self, conn, action):
+        asynchronous = [] if action == "add" else ["--asynchronous"]
         ret, pout, perr = run_command(self.IPSEC_AUTO +
                                       ["--config", self.IPSEC_CONF,
                                        "--ctlsocket", self.IPSEC_CTL,
                                        "--" + action,
-                                       "--asynchronous", conn],
+                                       *asynchronous, conn],
                                       "%s %s" % (action, conn))

         if re.match(r".*[F|f]ailed to initiate connection.*", pout):