From patchwork Tue Oct 29 10:15:03 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ilya Maximets
X-Patchwork-Id: 2003665
From: Ilya Maximets
To: ovs-dev@openvswitch.org
Cc: Ilya Maximets
Date: Tue, 29 Oct 2024 11:15:03 +0100
Message-ID: <20241029101608.2991596-6-i.maximets@ovn.org>
X-Mailer: git-send-email 2.46.0
In-Reply-To: <20241029101608.2991596-1-i.maximets@ovn.org>
References: <20241029101608.2991596-1-i.maximets@ovn.org>
Subject: [ovs-dev] [PATCH 5/9] ipsec: libreswan: Avoid monitor hanging on
 stuck ipsec commands.

Multiple versions of Libreswan have an issue where the ipsec --start
command may get stuck forever.  This issue affects many popular
versions of Libreswan, from 4.5 to 4.15, which are shipped in most
modern distributions.

When ipsec --start gets stuck, ovs-monitor-ipsec hangs and can't do
anything else, so not only this tunnel but all other tunnels are not
started either.

Add a timeout to the subprocess call, so we do not wait forever.
The just-introduced reconciliation process will clean things up and
will try to re-add this connection later.

Pluto may take a lot of time to process the --start request.  Notably,
the time depends on the retransmission timeout, which is 60 seconds by
default.  However, even at high scale, it doesn't take much more than
that in tests.
So, a 120-second timeout should be a reasonable default value.

Note: it is observed in practice that the process doesn't actually
terminate for a long time, so we can't afford waiting for it.  That's
the main reason why we're not using subprocess.run() with a timeout
option here (it would wait).  But also, because we'd have to catch the
exception anyway.

Reported-at: https://issues.redhat.com/browse/FDP-846
Signed-off-by: Ilya Maximets
---
 ipsec/ovs-monitor-ipsec.in | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/ipsec/ovs-monitor-ipsec.in b/ipsec/ovs-monitor-ipsec.in
index 3b2057389..264b055e6 100755
--- a/ipsec/ovs-monitor-ipsec.in
+++ b/ipsec/ovs-monitor-ipsec.in
@@ -82,6 +82,7 @@ vlog = ovs.vlog.Vlog("ovs-monitor-ipsec")
 exiting = False
 monitor = None
 xfrm = None
+TIMEOUT_EXPIRED = 37
 
 
 def run_command(args, description=None):
@@ -94,7 +95,16 @@ def run_command(args, description=None):
     vlog.dbg("Running %s" % args)
     proc = subprocess.Popen(args, stdout=subprocess.PIPE,
                             stderr=subprocess.PIPE)
-    pout, perr = proc.communicate()
+    try:
+        pout, perr = proc.communicate(timeout=120)
+        ret = proc.returncode
+    except subprocess.TimeoutExpired:
+        vlog.warn("Command timed out trying to %s." % description)
+        pout, perr = b'', b''
+        # Just kill the process here.  We can't afford waiting for it,
+        # as it may be stuck and may not actually be terminated.
+        proc.kill()
+        ret = TIMEOUT_EXPIRED
 
     if proc.returncode or len(perr):
         vlog.warn("Failed to %s; exit code: %d"
@@ -103,7 +113,7 @@ def run_command(args, description=None):
         vlog.warn("stderr: %s" % perr)
         vlog.warn("stdout: %s" % pout)
 
-    return proc.returncode, pout or b'', perr or b''
+    return ret, pout or b'', perr or b''
 
 
 class XFRM(object):
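
For reference, the kill-without-waiting pattern described in the note can
be tried in isolation.  The following is only a minimal sketch, not the
code from ovs-monitor-ipsec: the names run_with_timeout and TIMED_OUT are
hypothetical, and a sleeping Python child stands in for a stuck ipsec
command.  subprocess.run(timeout=...) kills the child and then waits for
it to exit, while Popen.communicate(timeout=...) only raises
TimeoutExpired and leaves the follow-up to the caller.

```python
import subprocess
import sys
import time

# Hypothetical sentinel for this sketch; not taken from the patch.
TIMED_OUT = 37


def run_with_timeout(args, timeout):
    """Run args and return (ret, stdout, stderr).

    On timeout the child is sent a kill signal and the function returns
    right away with the sentinel, without waiting for the child to exit.
    """
    proc = subprocess.Popen(args, stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE)
    try:
        pout, perr = proc.communicate(timeout=timeout)
        return proc.returncode, pout, perr
    except subprocess.TimeoutExpired:
        # Deliberately no proc.wait() or second communicate() here:
        # a stuck child may take a long time to actually terminate.
        proc.kill()
        return TIMED_OUT, b'', b''


if __name__ == "__main__":
    # A child that sleeps far longer than the timeout (assumed stand-in
    # for a stuck command).
    sleeper = [sys.executable, "-c", "import time; time.sleep(30)"]
    start = time.monotonic()
    ret, out, err = run_with_timeout(sleeper, timeout=0.5)
    print("ret=%d elapsed=%.1fs" % (ret, time.monotonic() - start))
```

The trade-off in this sketch is that the killed child is never reaped by
the caller, so it may linger as a zombie until the parent exits; that is
the price of never blocking on a process that may not terminate.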