From patchwork Wed Oct 16 14:23:49 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Roberto Bartzen Acosta
X-Patchwork-Id: 1998094
To: dev@openvswitch.org
Date: Wed, 16 Oct 2024 11:23:49 -0300
Message-Id: <20241016142349.66311-1-roberto.acosta@luizalabs.com>
X-Mailer: git-send-email 2.34.1
Subject: [ovs-dev] [PATCH ovn v9] northd: Fix logical router load-balancer nat rules when using DGP.
X-Patchwork-Original-From: Roberto Bartzen Acosta via dev
From: Roberto Bartzen Acosta
Reply-To: Roberto Bartzen Acosta

This commit fixes the build_distr_lrouter_nat_flows_for_lb function to
include a DNAT flow entry for each DGP in use. Since support for multiple
gateway ports per logical router was added, the LR NAT rules pipeline must
include a specific entry for each attached DGP. Otherwise, the inbound
traffic will only be redirected when the incoming LRP matches the
chassis_resident field.

Additionally, this patch adds the ability to use a load balancer with DGPs
attached to multiple chassis. Each DGP can be associated with a different
chassis, and in this case the DNAT rules added by default are not enough
to guarantee outgoing traffic.

To solve the problem of DGPs spread across multiple chassis, this patch
adds a new option, use_stateless_nat, to the load balancer configuration.
If use_stateless_nat is set to true, the logical router that references
this load balancer will use stateless NAT rules when the logical router
has multiple DGPs.
After applying this patch and setting the use_stateless_nat option, the
inbound and/or outbound traffic can pass through any chassis where the DGP
resides without having problems with CT state.

Reported-at: https://bugs.launchpad.net/ubuntu/+source/ovn/+bug/2054322
Fixes: 15348b7b806f ("ovn-northd: Multiple distributed gateway port support.")
Signed-off-by: Roberto Bartzen Acosta <roberto.acosta@luizalabs.com>
---
 .../workflows/ovn-fake-multinode-tests.yml    |   2 +-
 northd/en-lr-stateful.c                       |  12 -
 northd/northd.c                               | 138 ++-
 ovn-nb.xml                                    |  20 +-
 tests/multinode-macros.at                     |  36 +
 tests/multinode.at                            | 991 ++++++++++++++++++
 tests/ovn-northd.at                           | 320 ++++++
 7 files changed, 1479 insertions(+), 40 deletions(-)

diff --git a/.github/workflows/ovn-fake-multinode-tests.yml b/.github/workflows/ovn-fake-multinode-tests.yml
index 5026a3c6c..bf966299d 100644
--- a/.github/workflows/ovn-fake-multinode-tests.yml
+++ b/.github/workflows/ovn-fake-multinode-tests.yml
@@ -149,7 +149,7 @@ jobs:
 
     - name: Start basic cluster
       run: |
-        sudo -E ./ovn_cluster.sh start
+        sudo -E CHASSIS_COUNT=4 GW_COUNT=4 ./ovn_cluster.sh start
         sudo podman exec -it ovn-central-az1-1 ovn-nbctl show
         sudo podman exec -it ovn-central-az1-1 ovn-appctl -t ovn-northd version
         sudo podman exec -it ovn-chassis-1 ovn-appctl -t ovn-controller version
diff --git a/northd/en-lr-stateful.c b/northd/en-lr-stateful.c
index baf1bd2f8..f09691af6 100644
--- a/northd/en-lr-stateful.c
+++ b/northd/en-lr-stateful.c
@@ -516,18 +516,6 @@ lr_stateful_record_create(struct lr_stateful_table *table,
 
     table->array[od->index] = lr_stateful_rec;
 
-    /* Load balancers are not supported (yet) if a logical router has multiple
-     * distributed gateway port. Log a warning. */
-    if (lr_stateful_rec->has_lb_vip && lr_has_multiple_gw_ports(od)) {
-        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1);
-        VLOG_WARN_RL(&rl, "Load-balancers are configured on logical "
-                     "router %s, which has %"PRIuSIZE" distributed "
-                     "gateway ports. Load-balancer is not supported "
-                     "yet when there is more than one distributed "
-                     "gateway port on the router.",
-                     od->nbr->name, od->n_l3dgw_ports);
-    }
-
     return lr_stateful_rec;
 }
 
diff --git a/northd/northd.c b/northd/northd.c
index 0aa0de637..4ae79e827 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -11813,31 +11813,42 @@ static void
 build_distr_lrouter_nat_flows_for_lb(struct lrouter_nat_lb_flows_ctx *ctx,
                                      enum lrouter_nat_lb_flow_type type,
                                      struct ovn_datapath *od,
-                                     struct lflow_ref *lflow_ref)
+                                     struct lflow_ref *lflow_ref,
+                                     struct ovn_port *dgp,
+                                     bool stateless_nat)
 {
-    struct ovn_port *dgp = od->l3dgw_ports[0];
-
-    const char *undnat_action;
-
-    switch (type) {
-    case LROUTER_NAT_LB_FLOW_FORCE_SNAT:
-        undnat_action = "flags.force_snat_for_lb = 1; next;";
-        break;
-    case LROUTER_NAT_LB_FLOW_SKIP_SNAT:
-        undnat_action = "flags.skip_snat_for_lb = 1; next;";
-        break;
-    case LROUTER_NAT_LB_FLOW_NORMAL:
-    case LROUTER_NAT_LB_FLOW_MAX:
-        undnat_action = lrouter_use_common_zone(od)
-                        ? "ct_dnat_in_czone;"
-                        : "ct_dnat;";
-        break;
-    }
+    struct ds dnat_action = DS_EMPTY_INITIALIZER;
 
     /* Store the match lengths, so we can reuse the ds buffer. */
     size_t new_match_len = ctx->new_match->length;
     size_t undnat_match_len = ctx->undnat_match->length;
 
+    /* (NOTE) dnat_action: Add the first LB backend IP as a destination
+     * action of the lr_in_dnat NAT rule. Including the backend IP is useful
+     * for accepting packets coming from a chassis that does not have
+     * previously established conntrack entries. This means that the actions
+     * (ip4.dst + ct_lb_mark) are executed in addition and ip4.dst is not
+     * useful when traffic passes through the same chassis for ingress/egress
+     * packets. However, the actions are complementary in cases where traffic
+     * enters from one chassis, the ack response comes from another chassis,
+     * and the final ack step of the TCP handshake comes from the first
+     * chassis used. Without using stateless NAT, the connection will not be
+     * established because the return packet followed a path through another
+     * chassis and only ct_lb_mark will not be able to receive the ack and
+     * forward it to the right backend. With stateless NAT, the packet
+     * will be accepted and forwarded to the same backend that corresponds to
+     * the previous conntrack entry that is in the SYN_SENT state
+     * (created by ct_lb_mark for the first rcv packet in this flow).
+     */
+    if (stateless_nat) {
+        if (ctx->lb_vip->n_backends) {
+            struct ovn_lb_backend *backend = &ctx->lb_vip->backends[0];
+            bool ipv6 = !IN6_IS_ADDR_V4MAPPED(&backend->ip);
+            ds_put_format(&dnat_action, "%s.dst=%s;", ipv6 ? "ip6" : "ip4",
+                          backend->ip_str);
+        }
+    }
+    ds_put_format(&dnat_action, "%s", ctx->new_action[type]);
 
     const char *meter = NULL;
 
@@ -11847,20 +11858,46 @@ build_distr_lrouter_nat_flows_for_lb(struct lrouter_nat_lb_flows_ctx *ctx,
 
     if (ctx->lb_vip->n_backends || !ctx->lb_vip->empty_backend_rej) {
         ds_put_format(ctx->new_match, " && is_chassis_resident(%s)",
-                      od->l3dgw_ports[0]->cr_port->json_key);
+                      dgp->cr_port->json_key);
     }
 
     ovn_lflow_add_with_hint__(ctx->lflows, od, S_ROUTER_IN_DNAT, ctx->prio,
-                              ds_cstr(ctx->new_match), ctx->new_action[type],
+                              ds_cstr(ctx->new_match), ds_cstr(&dnat_action),
                               NULL, meter, &ctx->lb->nlb->header_,
                               lflow_ref);
 
     ds_truncate(ctx->new_match, new_match_len);
 
+    ds_destroy(&dnat_action);
     if (!ctx->lb_vip->n_backends) {
         return;
     }
 
+    struct ds undnat_action = DS_EMPTY_INITIALIZER;
+    struct ds snat_action = DS_EMPTY_INITIALIZER;
+
+    switch (type) {
+    case LROUTER_NAT_LB_FLOW_FORCE_SNAT:
+        ds_put_format(&undnat_action, "flags.force_snat_for_lb = 1; next;");
+        break;
+    case LROUTER_NAT_LB_FLOW_SKIP_SNAT:
+        ds_put_format(&undnat_action, "flags.skip_snat_for_lb = 1; next;");
+        break;
+    case LROUTER_NAT_LB_FLOW_NORMAL:
+    case LROUTER_NAT_LB_FLOW_MAX:
+        ds_put_format(&undnat_action, "%s",
+                      lrouter_use_common_zone(od) ? "ct_dnat_in_czone;"
+                      : "ct_dnat;");
+        break;
+    }
+
+    /* undnat_action: Remove the ct action from the lr_out_undnat NAT rule.
+     */
+    if (stateless_nat) {
+        ds_clear(&undnat_action);
+        ds_put_format(&undnat_action, "next;");
+    }
+
     /* We need to centralize the LB traffic to properly perform
      * the undnat stage.
      */
@@ -11879,11 +11916,51 @@ build_distr_lrouter_nat_flows_for_lb(struct lrouter_nat_lb_flows_ctx *ctx,
     ds_put_format(ctx->undnat_match, ") && (inport == %s || outport == %s)"
                   " && is_chassis_resident(%s)", dgp->json_key, dgp->json_key,
                   dgp->cr_port->json_key);
+    /* Use the LB protocol as matching criteria for out undnat and snat when
+     * creating LBs with stateless NAT. */
+    if (stateless_nat) {
+        ds_put_format(ctx->undnat_match, " && %s", ctx->lb->proto);
+    }
     ovn_lflow_add_with_hint(ctx->lflows, od, S_ROUTER_OUT_UNDNAT, 120,
-                            ds_cstr(ctx->undnat_match), undnat_action,
-                            &ctx->lb->nlb->header_,
+                            ds_cstr(ctx->undnat_match),
+                            ds_cstr(&undnat_action), &ctx->lb->nlb->header_,
                             lflow_ref);
+
+    /* (NOTE) snat_action: Add a new rule lr_out_snat with LB VIP as source
+     * IP action to perform stateless NAT pipeline completely when the
+     * outgoing packet is redirected to a chassis that does not have an
+     * active conntrack entry. Otherwise, it will not be SNATed by the
+     * ct_lb action because it does not refer to a valid created flow. The
+     * use case for responding to a packet in different chassis is multipath
+     * via ECMP. So, the LB lr_out_snat is created with a lower priority than
+     * the other router pipeline entries, in this case, if the packet is not
+     * SNATed by ct_lb (conntrack lost), it will be SNATed by the LB
+     * stateless NAT rule. Also, SNAT is performed only when the packet
+     * matches the configured LB backend IPs, ports and protocols. Otherwise,
+     * the packet will be forwarded without SNAT interference.
+     */
+    if (stateless_nat) {
+        if (ctx->lb_vip->port_str) {
+            ds_put_format(&snat_action, "%s.src=%s; %s.src=%s; next;",
+                          ctx->lb_vip->address_family == AF_INET6 ?
+                          "ip6" : "ip4",
+                          ctx->lb_vip->vip_str, ctx->lb->proto,
+                          ctx->lb_vip->port_str);
+        } else {
+            ds_put_format(&snat_action, "%s.src=%s; next;",
+                          ctx->lb_vip->address_family == AF_INET6 ?
+                          "ip6" : "ip4",
+                          ctx->lb_vip->vip_str);
+        }
+        ovn_lflow_add_with_hint(ctx->lflows, od, S_ROUTER_OUT_SNAT, 160,
+                                ds_cstr(ctx->undnat_match),
+                                ds_cstr(&snat_action), &ctx->lb->nlb->header_,
+                                lflow_ref);
+    }
+
     ds_truncate(ctx->undnat_match, undnat_match_len);
+    ds_destroy(&undnat_action);
+    ds_destroy(&snat_action);
 }
 
 static void
@@ -12028,6 +12105,8 @@ build_lrouter_nat_flows_for_lb(
      * lflow generation for them.
      */
     size_t index;
+    bool use_stateless_nat = smap_get_bool(&lb->nlb->options,
+                                           "use_stateless_nat", false);
     BITMAP_FOR_EACH_1 (index, bitmap_len, lb_dps->nb_lr_map) {
         struct ovn_datapath *od = lr_datapaths->array[index];
         enum lrouter_nat_lb_flow_type type;
@@ -12049,8 +12128,17 @@ build_lrouter_nat_flows_for_lb(
         if (!od->n_l3dgw_ports) {
             bitmap_set1(gw_dp_bitmap[type], index);
         } else {
-            build_distr_lrouter_nat_flows_for_lb(&ctx, type, od,
-                                                 lb_dps->lflow_ref);
+            /* Create stateless LB NAT rules when using multiple DGPs and
+             * use_stateless_nat is true.
+             */
+            bool stateless_nat = (od->n_l3dgw_ports > 1)
+                ? use_stateless_nat : false;
+            for (size_t i = 0; i < od->n_l3dgw_ports; i++) {
+                struct ovn_port *dgp = od->l3dgw_ports[i];
+                build_distr_lrouter_nat_flows_for_lb(&ctx, type, od,
+                                                     lb_dps->lflow_ref, dgp,
+                                                     stateless_nat);
+            }
         }
 
         if (lb->affinity_timeout) {
diff --git a/ovn-nb.xml b/ovn-nb.xml
index 2836f58f5..739ec152a 100644
--- a/ovn-nb.xml
+++ b/ovn-nb.xml
@@ -2302,6 +2302,16 @@ or local anymore by the ovn-controller. This option is set to false by default.
+ + + If the load balancer is configured with the use_stateless_nat + option set to true, the logical router that references this + load balancer will use stateless NAT rules when the logical router + has multiple distributed gateway ports (DGPs). Otherwise, the outbound + traffic may be dropped in scenarios where each DGP resides on a + different chassis. This option is set to false by default. + @@ -2688,8 +2698,14 @@ or Set of load balancers associated to this logical router. Load balancer - Load balancer rules only work on the Gateway routers or routers with one - and only one distributed gateway port. + rules only work without limitations on the Gateway routers or routers + with one and only one distributed gateway port (DGP). With more than one + DGP, load balancers only work when all DGPs are associated with the same + gateway chassis, so that this chassis can apply and maintain the + conntrack state. To use a load balancer in scenarios with DGPs associated + with different gateway chassis (e.g. ECMP routes), consider setting the + use_stateless_nat option to true in the load balancer + options column. diff --git a/tests/multinode-macros.at b/tests/multinode-macros.at index 5b171885e..698d2c625 100644 --- a/tests/multinode-macros.at +++ b/tests/multinode-macros.at @@ -41,6 +41,23 @@ m4_define([M_START_TCPDUMP], ) +# M_FORMAT_CT([ip-addr]) +# +# Strip content from the piped input which would differ from test to test +# and limit the output to the rows containing 'ip-addr'. +# +m4_define([M_FORMAT_CT], + [[grep -F "dst=$1," | sed -e 's/id=[0-9]*/id=/g' -e 's/state=[0-9_A-Z]*/state=/g' | sort | uniq | sed -e 's/zone=[[0-9]]*/zone=/' -e 's/mark=[[0-9]]*/mark=/' ]]) + +# M_FORMAT_CURL([ip-addr], [port]) +# +# Strip content from the piped input which would differ from test to test +# and limit the output to the rows containing 'ip-addr' and 'port'. 
+# +m4_define([M_FORMAT_CURL], + [[sed 's/\(.*\)Connected to $1 ($1) port $2/Connected to $1 ($1) port $2\n/' | sed 's/\(.*\)200 OK/200 OK\n/' | grep -i -e connected -e "200 OK" | uniq ]]) + + OVS_START_SHELL_HELPERS m_as() { @@ -76,6 +93,25 @@ multinode_nbctl () { m_as ovn-central-az1-1 ovn-nbctl "$@" } +check_fake_multinode_setup_by_nodes() { + check m_as ovn-central-az1-1 ovn-nbctl --wait=sb sync + for c in $1 + do + AT_CHECK([m_as $c ovn-appctl -t ovn-controller version], [0], [ignore]) + done +} + +cleanup_multinode_resources_by_nodes() { + m_as ovn-central-az1-1 rm -f /etc/ovn/ovnnb_db.db + m_as ovn-central-az1-1 /usr/share/ovn/scripts/ovn-ctl restart_northd + check m_as ovn-central-az1-1 ovn-nbctl --wait=sb sync + for c in $1 + do + m_as $c ovs-vsctl del-br br-int + m_as $c ip --all netns delete + done +} + # m_count_rows TABLE [CONDITION...] # # Prints the number of rows in TABLE (that satisfy CONDITION). diff --git a/tests/multinode.at b/tests/multinode.at index 408e1118d..a45dc55cc 100644 --- a/tests/multinode.at +++ b/tests/multinode.at @@ -1591,3 +1591,994 @@ AT_CHECK([cat ch1_eth2.tcpdump], [0], [dnl ]) AT_CLEANUP + +AT_SETUP([ovn multinode load-balancer with multiple DGPs and multiple chassis]) + +# Check that ovn-fake-multinode setup is up and running - requires additional nodes +check_fake_multinode_setup_by_nodes 'ovn-chassis-1 ovn-chassis-2 ovn-chassis-3 ovn-chassis-4 ovn-gw-1 ovn-gw-2' + +# Delete the multinode NB and OVS resources before starting the test. +cleanup_multinode_resources_by_nodes 'ovn-chassis-1 ovn-chassis-2 ovn-chassis-3 ovn-chassis-4 ovn-gw-1 ovn-gw-2' + +# Reset geneve tunnels +for c in ovn-chassis-1 ovn-chassis-2 ovn-chassis-3 ovn-chassis-4 ovn-gw-1 ovn-gw-2 +do + m_as $c ovs-vsctl set open . 
external-ids:ovn-encap-type=geneve +done + +# Network topology +# +# publicp1 (ovn-chassis-3) (20.0.1.3/24) +# | +# overlay +# | +# DGP public1 (ovn-gw-1) (20.0.1.1/24) +# | +# | +# | +# lr0 ------- sw0 --- sw0p1 (ovn-chassis-1) 10.0.1.3/24 +# | | +# | + --- sw0p2 (ovn-chassis-2) 10.0.1.4/24 +# | +# DGP public2 (ovn-gw-2) (30.0.1.1/24) +# | +# overlay +# | +# publicp2 (ovn-chassis-4) (30.0.1.3/24) + +# Delete already used ovs-ports +m_as ovn-chassis-1 ip link del sw0p1-p +m_as ovn-chassis-2 ip link del sw0p2-p +m_as ovn-chassis-2 ip link del sw1p1-p +m_as ovn-chassis-3 ip link del publicp1-p +m_as ovn-chassis-4 ip link del publicp2-p + +# Create East-West switch for LB backends +check multinode_nbctl ls-add sw0 +check multinode_nbctl lsp-add sw0 sw0-port1 +check multinode_nbctl lsp-set-addresses sw0-port1 "50:54:00:00:00:03 10.0.1.3 1000::3" +check multinode_nbctl lsp-add sw0 sw0-port2 +check multinode_nbctl lsp-set-addresses sw0-port2 "50:54:00:00:00:04 10.0.1.4 1000::4" + +m_as ovn-chassis-1 /data/create_fake_vm.sh sw0-port1 sw0p1 50:54:00:00:00:03 1342 10.0.1.3 24 10.0.1.1 1000::3/64 1000::a +m_as ovn-chassis-2 /data/create_fake_vm.sh sw0-port2 sw0p2 50:54:00:00:00:04 1342 10.0.1.4 24 10.0.1.1 1000::4/64 1000::a + +m_wait_for_ports_up + +M_NS_CHECK_EXEC([ovn-chassis-1], [sw0p1], [ping -q -c 3 -i 0.3 -w 2 10.0.1.4 | FORMAT_PING], \ +[0], [dnl +3 packets transmitted, 3 received, 0% packet loss, time 0ms +]) + +M_NS_CHECK_EXEC([ovn-chassis-2], [sw0p2], [ping -q -c 3 -i 0.3 -w 2 10.0.1.3 | FORMAT_PING], \ +[0], [dnl +3 packets transmitted, 3 received, 0% packet loss, time 0ms +]) + +# Create a logical router and attach to sw0 +check multinode_nbctl lr-add lr0 +check multinode_nbctl lrp-add lr0 lr0-sw0 00:00:00:00:ff:01 10.0.1.1/24 1000::a/64 +check multinode_nbctl lsp-add sw0 sw0-lr0 +check multinode_nbctl lsp-set-type sw0-lr0 router +check multinode_nbctl lsp-set-addresses sw0-lr0 router +check multinode_nbctl lsp-set-options sw0-lr0 router-port=lr0-sw0 + +# create 
external connection for N/S traffic using multiple DGPs +check multinode_nbctl ls-add public + +# create external connection for N/S traffic +# DGP public1 +check multinode_nbctl lsp-add public ln-public-1 +check multinode_nbctl lsp-set-type ln-public-1 localnet +check multinode_nbctl lsp-set-addresses ln-public-1 unknown +check multinode_nbctl lsp-set-options ln-public-1 network_name=public1 + +# DGP public2 +check multinode_nbctl lsp-add public ln-public-2 +check multinode_nbctl lsp-set-type ln-public-2 localnet +check multinode_nbctl lsp-set-addresses ln-public-2 unknown +check multinode_nbctl lsp-set-options ln-public-2 network_name=public2 + +# Attach DGP public1 to GW-1 and chassis-3 (overlay connectivity) +m_as ovn-gw-1 ovs-vsctl set open . external-ids:ovn-bridge-mappings=public1:br-ex +m_as ovn-chassis-3 ovs-vsctl set open . external-ids:ovn-bridge-mappings=public1:br-ex + +# Attach DGP public2 to GW-2 and chassis-4 (overlay connectivity) +m_as ovn-gw-2 ovs-vsctl set open . external-ids:ovn-bridge-mappings=public2:br-ex +m_as ovn-chassis-4 ovs-vsctl set open . 
external-ids:ovn-bridge-mappings=public2:br-ex + +# Create the external LR0 port to the DGP public1 +check multinode_nbctl lsp-add public public-port1 +check multinode_nbctl lsp-set-addresses public-port1 "40:54:00:00:00:03 20.0.1.3 2000::3" + +check multinode_nbctl lrp-add lr0 lr0-public-p1 00:00:00:00:ff:02 20.0.1.1/24 2000::a/64 +check multinode_nbctl lsp-add public public-lr0-p1 +check multinode_nbctl lsp-set-type public-lr0-p1 router +check multinode_nbctl lsp-set-addresses public-lr0-p1 router +check multinode_nbctl lsp-set-options public-lr0-p1 router-port=lr0-public-p1 +check multinode_nbctl lrp-set-gateway-chassis lr0-public-p1 ovn-gw-1 10 + +# Create a VM on ovn-chassis-3 in the same public1 overlay +m_as ovn-chassis-3 /data/create_fake_vm.sh public-port1 publicp1 40:54:00:00:00:03 1342 20.0.1.3 24 20.0.1.1 2000::4/64 2000::a + +m_wait_for_ports_up public-port1 + +# Create the external LR0 port to the DGP public2 +check multinode_nbctl lsp-add public public-port2 +check multinode_nbctl lsp-set-addresses public-port2 "60:54:00:00:00:03 30.0.1.3 3000::3" + +check multinode_nbctl lrp-add lr0 lr0-public-p2 00:00:00:00:ff:03 30.0.1.1/24 3000::a/64 +check multinode_nbctl lsp-add public public-lr0-p2 +check multinode_nbctl lsp-set-type public-lr0-p2 router +check multinode_nbctl lsp-set-addresses public-lr0-p2 router +check multinode_nbctl lsp-set-options public-lr0-p2 router-port=lr0-public-p2 +check multinode_nbctl lrp-set-gateway-chassis lr0-public-p2 ovn-gw-2 10 + +# Create a VM on ovn-chassis-4 in the same public2 overlay +m_as ovn-chassis-4 /data/create_fake_vm.sh public-port2 publicp2 60:54:00:00:00:03 1342 30.0.1.3 24 30.0.1.1 3000::4/64 3000::a + +m_wait_for_ports_up public-port2 + +# Add SNAT rules using gateway-port +check multinode_nbctl --gateway-port lr0-public-p1 lr-nat-add lr0 snat 20.0.1.1 10.0.1.0/24 +check multinode_nbctl --gateway-port lr0-public-p2 lr-nat-add lr0 snat 30.0.1.1 10.0.1.0/24 + +M_NS_CHECK_EXEC([ovn-chassis-1], [sw0p1], [ping -q 
-c 3 -i 0.3 -w 2 20.0.1.3 | FORMAT_PING], \ +[0], [dnl +3 packets transmitted, 3 received, 0% packet loss, time 0ms +]) + +M_NS_CHECK_EXEC([ovn-chassis-2], [sw0p2], [ping -q -c 3 -i 0.3 -w 2 30.0.1.3 | FORMAT_PING], \ +[0], [dnl +3 packets transmitted, 3 received, 0% packet loss, time 0ms +]) + +# create LB +check multinode_nbctl lb-add lb0 "172.16.0.100:80" "10.0.1.3:80,10.0.1.4:80" +check multinode_nbctl lr-lb-add lr0 lb0 +check multinode_nbctl ls-lb-add sw0 lb0 + +# Set use_stateless_nat to true +check multinode_nbctl set load_balancer lb0 options:use_stateless_nat=true + +# Start backend http services +M_NS_DAEMONIZE([ovn-chassis-1], [sw0p1], [python3 -m http.server --bind 10.0.1.3 80 >/dev/null 2>&1], [http1.pid]) +M_NS_DAEMONIZE([ovn-chassis-2], [sw0p2], [python3 -m http.server --bind 10.0.1.4 80 >/dev/null 2>&1], [http2.pid]) + +# wait for http server be ready +OVS_WAIT_UNTIL([m_as ovn-chassis-1 ip netns exec sw0p1 ss -tulpn | grep LISTEN | grep 10.0.1.3:80]) +OVS_WAIT_UNTIL([m_as ovn-chassis-2 ip netns exec sw0p2 ss -tulpn | grep LISTEN | grep 10.0.1.4:80]) + +# Flush conntrack entries for easier output parsing of next test. 
+m_as ovn-gw-1 ovs-appctl dpctl/flush-conntrack +m_as ovn-gw-2 ovs-appctl dpctl/flush-conntrack + +M_NS_EXEC([ovn-chassis-3], [publicp1], [sh -c 'curl -v 172.16.0.100:80 --retry 0 --connect-timeout 1 --max-time 1 --local-port 59002 2> curl.out']) +M_NS_CHECK_EXEC([ovn-chassis-3], [publicp1], [sh -c 'cat -v curl.out' | M_FORMAT_CURL([172.16.0.100], [80])], [0], [dnl +Connected to 172.16.0.100 (172.16.0.100) port 80 +200 OK +]) + +M_NS_EXEC([ovn-chassis-4], [publicp2], [sh -c 'curl -v 172.16.0.100:80 --retry 0 --connect-timeout 1 --max-time 1 --local-port 59003 2> curl.out']) +M_NS_CHECK_EXEC([ovn-chassis-4], [publicp2], [sh -c 'cat -v curl.out' | M_FORMAT_CURL([172.16.0.100], [80])], [0], [dnl +Connected to 172.16.0.100 (172.16.0.100) port 80 +200 OK +]) + +m_as ovn-gw-1 ovs-appctl dpctl/flush-conntrack +m_as ovn-gw-2 ovs-appctl dpctl/flush-conntrack + +M_NS_EXEC([ovn-chassis-3], [publicp1], [sh -c 'curl -v 172.16.0.100:80 --retry 0 --connect-timeout 1 --max-time 1 --local-port 59001']) +OVS_WAIT_FOR_OUTPUT([m_as ovn-gw-1 ovs-appctl dpctl/dump-conntrack | M_FORMAT_CT(20.0.1.3) | \ +grep tcp | sed -E -e 's/10.0.1.3|10.0.1.4//g' | sort], [0], [dnl +tcp,orig=(src=20.0.1.3,dst=,sport=59001,dport=80),reply=(src=,dst=20.0.1.3,sport=80,dport=59001),zone=,mark=,protoinfo=(state=) +tcp,orig=(src=20.0.1.3,dst=,sport=59001,dport=80),reply=(src=,dst=20.0.1.3,sport=80,dport=59001),zone=,protoinfo=(state=) +]) + +M_NS_EXEC([ovn-chassis-4], [publicp2], [sh -c 'curl -v 172.16.0.100:80 --retry 0 --connect-timeout 1 --max-time 1 --local-port 59000']) +OVS_WAIT_FOR_OUTPUT([m_as ovn-gw-2 ovs-appctl dpctl/dump-conntrack | M_FORMAT_CT(30.0.1.3) | \ +grep tcp | sed -E -e 's/10.0.1.3|10.0.1.4//g' | sort], [0], [dnl +tcp,orig=(src=30.0.1.3,dst=,sport=59000,dport=80),reply=(src=,dst=30.0.1.3,sport=80,dport=59000),zone=,mark=,protoinfo=(state=) +tcp,orig=(src=30.0.1.3,dst=,sport=59000,dport=80),reply=(src=,dst=30.0.1.3,sport=80,dport=59000),zone=,protoinfo=(state=) +]) + +# create a big file 
on web servers for download +M_NS_EXEC([ovn-chassis-1], [sw0p1], [dd bs=512 count=200000 if=/dev/urandom of=download_file]) +M_NS_EXEC([ovn-chassis-2], [sw0p2], [dd bs=512 count=200000 if=/dev/urandom of=download_file]) + +# Flush conntrack entries for easier output parsing of next test. +m_as ovn-chassis-1 ovs-appctl dpctl/flush-conntrack +m_as ovn-chassis-2 ovs-appctl dpctl/flush-conntrack +m_as ovn-gw-1 ovs-appctl dpctl/flush-conntrack +m_as ovn-gw-2 ovs-appctl dpctl/flush-conntrack + +M_NS_EXEC([ovn-chassis-3], [publicp1], [sh -c 'curl -v -O 172.16.0.100:80/download_file --retry 0 --connect-timeout 1 --max-time 1 --local-port 59004 2>curl.out']) + +gw1_ct=$(m_as ovn-gw-1 ovs-appctl dpctl/dump-conntrack | sed ':a;N;$!ba;s/\n/\\n/g') +gw2_ct=$(m_as ovn-gw-2 ovs-appctl dpctl/dump-conntrack | sed ':a;N;$!ba;s/\n/\\n/g') +chassis1_ct=$(m_as ovn-chassis-1 ovs-appctl dpctl/dump-conntrack | sed ':a;N;$!ba;s/\n/\\n/g') +chassis2_ct=$(m_as ovn-chassis-2 ovs-appctl dpctl/dump-conntrack | sed ':a;N;$!ba;s/\n/\\n/g') +chassis1_flow=$(m_as ovn-chassis-1 ovs-dpctl dump-flows | sed ':a;N;$!ba;s/\n/\\n/g') +chassis2_flow=$(m_as ovn-chassis-2 ovs-dpctl dump-flows | sed ':a;N;$!ba;s/\n/\\n/g') + +OVS_WAIT_FOR_OUTPUT([m_as ovn-chassis-3 ip netns exec publicp1 cat -v curl.out | M_FORMAT_CURL([172.16.0.100], [80])], [0], [dnl +Connected to 172.16.0.100 (172.16.0.100) port 80 +200 OK +]) + +# Check if we have only one backend for the same connection - orig + dest ports +OVS_WAIT_FOR_OUTPUT([echo -e $gw1_ct | M_FORMAT_CT(20.0.1.3) | \ +grep tcp | sed -E -e 's/10.0.1.3|10.0.1.4//g' | sort], [0], [dnl +tcp,orig=(src=20.0.1.3,dst=,sport=59004,dport=80),reply=(src=,dst=20.0.1.3,sport=80,dport=59004),zone=,mark=,protoinfo=(state=) +tcp,orig=(src=20.0.1.3,dst=,sport=59004,dport=80),reply=(src=,dst=20.0.1.3,sport=80,dport=59004),zone=,protoinfo=(state=) +]) + +# Check if gw-2 is empty to ensure that the traffic only come from/to the originator chassis via DGP public1 +AT_CHECK([echo -e 
$gw2_ct | grep "20.0.1.3" -c], [1], [dnl +0 +]) + +# Check the backend IP from ct entries on gw-1 (DGP public1) +backend_check=$(echo -e $chassis1_ct | grep "10.0.1.3,sport=59004,dport=80" -c) + +if [[ $backend_check -gt 0 ]]; then +# Backend resides on ovn-chassis-1 +AT_CHECK([echo -e $chassis1_ct | M_FORMAT_CT(20.0.1.3) | \ +grep tcp], [0], [dnl +tcp,orig=(src=20.0.1.3,dst=10.0.1.3,sport=59004,dport=80),reply=(src=10.0.1.3,dst=20.0.1.3,sport=80,dport=59004),zone=,protoinfo=(state=) +]) + +# Ensure that the traffic only come from ovn-chassis-1 +AT_CHECK([echo -e $chassis2_ct | grep "20.0.1.3" | grep "dport=80" -c], [1], [dnl +0 +]) +AT_CHECK([echo -e $chassis2_flow | grep "20.0.1.3" | grep "dport=80" -c], [1], [dnl +0 +]) +else +# Backend resides on ovn-chassis-2 +AT_CHECK([echo -e $chassis2_ct | M_FORMAT_CT(20.0.1.3) | \ +grep tcp], [0], [dnl +tcp,orig=(src=20.0.1.3,dst=10.0.1.4,sport=59004,dport=80),reply=(src=10.0.1.4,dst=20.0.1.3,sport=80,dport=59004),zone=,protoinfo=(state=) +]) + +# Ensure that the traffic only come from ovn-chassis-2 +AT_CHECK([echo -e $chassis1_ct | grep "20.0.1.3" | grep "dport=80" -c], [1], [dnl +0 +]) +AT_CHECK([echo -e $chassis1_flow | grep "20.0.1.3" | grep "dport=80" -c], [1], [dnl +0 +]) +fi + +# Flush conntrack entries for easier output parsing of next test. 
+m_as ovn-chassis-1 ovs-appctl dpctl/flush-conntrack +m_as ovn-chassis-2 ovs-appctl dpctl/flush-conntrack +m_as ovn-gw-1 ovs-appctl dpctl/flush-conntrack +m_as ovn-gw-2 ovs-appctl dpctl/flush-conntrack + +# Check the flows again for a new source port +M_NS_EXEC([ovn-chassis-3], [publicp1], [sh -c 'curl -v -O 172.16.0.100:80/download_file --retry 0 --connect-timeout 1 --max-time 1 --local-port 59005 2>curl.out']) + +gw1_ct=$(m_as ovn-gw-1 ovs-appctl dpctl/dump-conntrack | sed ':a;N;$!ba;s/\n/\\n/g') +gw2_ct=$(m_as ovn-gw-2 ovs-appctl dpctl/dump-conntrack | sed ':a;N;$!ba;s/\n/\\n/g') +chassis1_ct=$(m_as ovn-chassis-1 ovs-appctl dpctl/dump-conntrack | sed ':a;N;$!ba;s/\n/\\n/g') +chassis2_ct=$(m_as ovn-chassis-2 ovs-appctl dpctl/dump-conntrack | sed ':a;N;$!ba;s/\n/\\n/g') +chassis1_flow=$(m_as ovn-chassis-1 ovs-dpctl dump-flows | sed ':a;N;$!ba;s/\n/\\n/g') +chassis2_flow=$(m_as ovn-chassis-2 ovs-dpctl dump-flows | sed ':a;N;$!ba;s/\n/\\n/g') + +OVS_WAIT_FOR_OUTPUT([m_as ovn-chassis-3 ip netns exec publicp1 cat -v curl.out | M_FORMAT_CURL([172.16.0.100], [80])], [0], [dnl +Connected to 172.16.0.100 (172.16.0.100) port 80 +200 OK +]) + +# Check if we have only one backend for the same connection - orig + dest ports +OVS_WAIT_FOR_OUTPUT([echo -e $gw1_ct | M_FORMAT_CT(20.0.1.3) | \ +grep tcp | sed -E -e 's/10.0.1.3|10.0.1.4//g' | sort], [0], [dnl +tcp,orig=(src=20.0.1.3,dst=,sport=59005,dport=80),reply=(src=,dst=20.0.1.3,sport=80,dport=59005),zone=,mark=,protoinfo=(state=) +tcp,orig=(src=20.0.1.3,dst=,sport=59005,dport=80),reply=(src=,dst=20.0.1.3,sport=80,dport=59005),zone=,protoinfo=(state=) +]) + +# Check if gw-2 is empty to ensure that the traffic only come from/to the originator chassis via DGP public1 +AT_CHECK([echo -e $gw2_ct | grep "20.0.1.3" -c], [1], [dnl +0 +]) + +# Check the backend IP from ct entries on gw-1 (DGP public1) +backend_check=$(echo -e $chassis1_ct | grep "10.0.1.3,sport=59005,dport=80" -c) + +if [[ $backend_check -gt 0 ]]; then +# Backend 
resides on ovn-chassis-1 +# Ensure that the traffic only come from ovn-chassis-1 +AT_CHECK([echo -e $chassis2_ct | grep "20.0.1.3" | grep "dport=80" -c], [1], [dnl +0 +]) +AT_CHECK([echo -e $chassis2_flow | grep "20.0.1.3" | grep "dport=80" -c], [1], [dnl +0 +]) +else +# Backend resides on ovn-chassis-2 +# Ensure that the traffic only come from ovn-chassis-2 +AT_CHECK([echo -e $chassis1_ct | grep "20.0.1.3" | grep "dport=80" -c], [1], [dnl +0 +]) +AT_CHECK([echo -e $chassis1_flow | grep "20.0.1.3" | grep "dport=80" -c], [1], [dnl +0 +]) +fi + +# Flush conntrack entries for easier output parsing of next test. +m_as ovn-chassis-1 ovs-appctl dpctl/flush-conntrack +m_as ovn-chassis-2 ovs-appctl dpctl/flush-conntrack +m_as ovn-gw-1 ovs-appctl dpctl/flush-conntrack +m_as ovn-gw-2 ovs-appctl dpctl/flush-conntrack + +# Start a new test using the second DGP as origin (public2) +M_NS_EXEC([ovn-chassis-4], [publicp2], [sh -c 'curl -v -O 172.16.0.100:80/download_file --retry 0 --connect-timeout 1 --max-time 1 --local-port 59006 2>curl.out']) + +gw1_ct=$(m_as ovn-gw-1 ovs-appctl dpctl/dump-conntrack | sed ':a;N;$!ba;s/\n/\\n/g') +gw2_ct=$(m_as ovn-gw-2 ovs-appctl dpctl/dump-conntrack | sed ':a;N;$!ba;s/\n/\\n/g') +chassis1_ct=$(m_as ovn-chassis-1 ovs-appctl dpctl/dump-conntrack | sed ':a;N;$!ba;s/\n/\\n/g') +chassis2_ct=$(m_as ovn-chassis-2 ovs-appctl dpctl/dump-conntrack | sed ':a;N;$!ba;s/\n/\\n/g') +chassis1_flow=$(m_as ovn-chassis-1 ovs-dpctl dump-flows | sed ':a;N;$!ba;s/\n/\\n/g') +chassis2_flow=$(m_as ovn-chassis-2 ovs-dpctl dump-flows | sed ':a;N;$!ba;s/\n/\\n/g') + +OVS_WAIT_FOR_OUTPUT([m_as ovn-chassis-4 ip netns exec publicp2 cat -v curl.out | M_FORMAT_CURL([172.16.0.100], [80])], [0], [dnl +Connected to 172.16.0.100 (172.16.0.100) port 80 +200 OK +]) + +# Check if we have only one backend for the same connection - orig + dest ports +OVS_WAIT_FOR_OUTPUT([echo -e $gw2_ct | M_FORMAT_CT(30.0.1.3) | \ +grep tcp | sed -E -e 's/10.0.1.3|10.0.1.4//g' | sort], [0], [dnl 
+tcp,orig=(src=30.0.1.3,dst=,sport=59006,dport=80),reply=(src=,dst=30.0.1.3,sport=80,dport=59006),zone=,mark=,protoinfo=(state=) +tcp,orig=(src=30.0.1.3,dst=,sport=59006,dport=80),reply=(src=,dst=30.0.1.3,sport=80,dport=59006),zone=,protoinfo=(state=) +]) + +# Check if gw-1 is empty to ensure that the traffic only come from/to the originator chassis via DGP public2 +AT_CHECK([echo -e $gw1_ct | grep "30.0.1.3" -c], [1], [dnl +0 +]) + +# Check the backend IP from ct entries on gw-2 (DGP public2) +backend_check=$(echo -e $chassis1_ct | grep "10.0.1.3,sport=59006,dport=80" -c) + +if [[ $backend_check -gt 0 ]]; then +# Backend resides on ovn-chassis-1 +AT_CHECK([echo -e $chassis1_ct | M_FORMAT_CT(30.0.1.3) | \ +grep tcp], [0], [dnl +tcp,orig=(src=30.0.1.3,dst=10.0.1.3,sport=59006,dport=80),reply=(src=10.0.1.3,dst=30.0.1.3,sport=80,dport=59006),zone=,protoinfo=(state=) +]) + +# Ensure that the traffic only come from ovn-chassis-1 +AT_CHECK([echo -e $chassis2_ct | grep "30.0.1.3" | grep "dport=80" -c], [1], [dnl +0 +]) +AT_CHECK([echo -e $chassis2_flow | grep "30.0.1.3" | grep "dport=80" -c], [1], [dnl +0 +]) +else +# Backend resides on ovn-chassis-2 +AT_CHECK([echo -e $chassis2_ct | M_FORMAT_CT(30.0.1.3) | \ +grep tcp], [0], [dnl +tcp,orig=(src=30.0.1.3,dst=10.0.1.4,sport=59006,dport=80),reply=(src=10.0.1.4,dst=30.0.1.3,sport=80,dport=59006),zone=,protoinfo=(state=) +]) + +# Ensure that the traffic only come from ovn-chassis-2 +AT_CHECK([echo -e $chassis1_ct | grep "30.0.1.3" | grep "dport=80" -c], [1], [dnl +0 +]) +AT_CHECK([echo -e $chassis1_flow | grep "30.0.1.3" | grep "dport=80" -c], [1], [dnl +0 +]) +fi + +# Flush conntrack entries for easier output parsing of next test. 
+m_as ovn-chassis-1 ovs-appctl dpctl/flush-conntrack +m_as ovn-chassis-2 ovs-appctl dpctl/flush-conntrack +m_as ovn-gw-1 ovs-appctl dpctl/flush-conntrack +m_as ovn-gw-2 ovs-appctl dpctl/flush-conntrack + +# Check the flows again for a new source port using the second DGP as origin (public2) +M_NS_EXEC([ovn-chassis-4], [publicp2], [sh -c 'curl -v -O 172.16.0.100:80/download_file --retry 0 --connect-timeout 1 --max-time 1 --local-port 59007 2>curl.out']) + +gw1_ct=$(m_as ovn-gw-1 ovs-appctl dpctl/dump-conntrack | sed ':a;N;$!ba;s/\n/\\n/g') +gw2_ct=$(m_as ovn-gw-2 ovs-appctl dpctl/dump-conntrack | sed ':a;N;$!ba;s/\n/\\n/g') +chassis1_ct=$(m_as ovn-chassis-1 ovs-appctl dpctl/dump-conntrack | sed ':a;N;$!ba;s/\n/\\n/g') +chassis2_ct=$(m_as ovn-chassis-2 ovs-appctl dpctl/dump-conntrack | sed ':a;N;$!ba;s/\n/\\n/g') +chassis1_flow=$(m_as ovn-chassis-1 ovs-dpctl dump-flows | sed ':a;N;$!ba;s/\n/\\n/g') +chassis2_flow=$(m_as ovn-chassis-2 ovs-dpctl dump-flows | sed ':a;N;$!ba;s/\n/\\n/g') + +OVS_WAIT_FOR_OUTPUT([m_as ovn-chassis-4 ip netns exec publicp2 cat -v curl.out | M_FORMAT_CURL([172.16.0.100], [80])], [0], [dnl +Connected to 172.16.0.100 (172.16.0.100) port 80 +200 OK +]) + +# Check if we have only one backend for the same connection - orig + dest ports +OVS_WAIT_FOR_OUTPUT([echo -e $gw2_ct | M_FORMAT_CT(30.0.1.3) | \ +grep tcp | sed -E -e 's/10.0.1.3|10.0.1.4//g' | sort], [0], [dnl +tcp,orig=(src=30.0.1.3,dst=,sport=59007,dport=80),reply=(src=,dst=30.0.1.3,sport=80,dport=59007),zone=,mark=,protoinfo=(state=) +tcp,orig=(src=30.0.1.3,dst=,sport=59007,dport=80),reply=(src=,dst=30.0.1.3,sport=80,dport=59007),zone=,protoinfo=(state=) +]) + +# Check if gw-1 is empty to ensure that the traffic only come from/to the originator chassis via DGP public2 +AT_CHECK([echo -e $gw1_ct | grep "30.0.1.3" -c], [1], [dnl +0 +]) + +# Check the backend IP from ct entries on gw-2 (DGP public2) +backend_check=$(echo -e $chassis1_ct | grep "10.0.1.3,sport=59007,dport=80" -c) + +if [[
$backend_check -gt 0 ]]; then +# Backend resides on ovn-chassis-1 +AT_CHECK([echo -e $chassis1_ct | M_FORMAT_CT(30.0.1.3) | \ +grep tcp], [0], [dnl +tcp,orig=(src=30.0.1.3,dst=10.0.1.3,sport=59007,dport=80),reply=(src=10.0.1.3,dst=30.0.1.3,sport=80,dport=59007),zone=,protoinfo=(state=) +]) + +# Ensure that the traffic only come from ovn-chassis-1 +AT_CHECK([echo -e $chassis2_ct | grep "30.0.1.3" | grep "dport=80" -c], [1], [dnl +0 +]) +AT_CHECK([echo -e $chassis2_flow | grep "30.0.1.3" | grep "dport=80" -c], [1], [dnl +0 +]) +else +# Backend resides on ovn-chassis-2 +AT_CHECK([echo -e $chassis2_ct | M_FORMAT_CT(30.0.1.3) | \ +grep tcp], [0], [dnl +tcp,orig=(src=30.0.1.3,dst=10.0.1.4,sport=59007,dport=80),reply=(src=10.0.1.4,dst=30.0.1.3,sport=80,dport=59007),zone=,protoinfo=(state=) +]) + +# Ensure that the traffic only come from ovn-chassis-2 +AT_CHECK([echo -e $chassis1_ct | grep "30.0.1.3" | grep "dport=80" -c], [1], [dnl +0 +]) +AT_CHECK([echo -e $chassis1_flow | grep "30.0.1.3" | grep "dport=80" -c], [1], [dnl +0 +]) +fi + +# Check multiple requests coming from DGP's public1 and public2 + +M_NS_EXEC([ovn-chassis-4], [publicp2], [sh -c 'curl -v -O 172.16.0.100:80/download_file --retry 0 --connect-timeout 1 --max-time 1 2>curl.out']) +OVS_WAIT_FOR_OUTPUT([m_as ovn-chassis-4 cat -v curl.out | M_FORMAT_CURL([172.16.0.100], [80])], [0], [dnl +Connected to 172.16.0.100 (172.16.0.100) port 80 +200 OK +]) + +M_NS_EXEC([ovn-chassis-3], [publicp1], [sh -c 'curl -v -O 172.16.0.100:80/download_file --retry 0 --connect-timeout 1 --max-time 1 2>curl.out']) +OVS_WAIT_FOR_OUTPUT([m_as ovn-chassis-3 cat -v curl.out | M_FORMAT_CURL([172.16.0.100], [80])], [0], [dnl +Connected to 172.16.0.100 (172.16.0.100) port 80 +200 OK +]) + +M_NS_EXEC([ovn-chassis-4], [publicp2], [sh -c 'curl -v -O 172.16.0.100:80/download_file --retry 0 --connect-timeout 1 --max-time 1 2>curl.out']) +OVS_WAIT_FOR_OUTPUT([m_as ovn-chassis-4 cat -v curl.out | M_FORMAT_CURL([172.16.0.100], [80])], [0], [dnl 
+Connected to 172.16.0.100 (172.16.0.100) port 80 +200 OK +]) + +M_NS_EXEC([ovn-chassis-3], [publicp1], [sh -c 'curl -v -O 172.16.0.100:80/download_file --retry 0 --connect-timeout 1 --max-time 1 2>curl.out']) +OVS_WAIT_FOR_OUTPUT([m_as ovn-chassis-3 cat -v curl.out | M_FORMAT_CURL([172.16.0.100], [80])], [0], [dnl +Connected to 172.16.0.100 (172.16.0.100) port 80 +200 OK +]) + +# Remove the LB and change the VIP port - different from the backend ports +check multinode_nbctl lb-del lb0 + +# create LB again +check multinode_nbctl lb-add lb0 "172.16.0.100:9000" "10.0.1.3:80,10.0.1.4:80" +check multinode_nbctl lr-lb-add lr0 lb0 +check multinode_nbctl ls-lb-add sw0 lb0 + +# Set use_stateless_nat to true +check multinode_nbctl set load_balancer lb0 options:use_stateless_nat=true + +m_as ovn-gw-1 ovs-appctl dpctl/flush-conntrack +m_as ovn-gw-2 ovs-appctl dpctl/flush-conntrack + +# Check end-to-end request using a new port for VIP +M_NS_EXEC([ovn-chassis-3], [publicp1], [sh -c 'curl -v -O 172.16.0.100:9000/download_file --retry 0 --connect-timeout 1 --max-time 1 --local-port 59008 2>curl.out']) +OVS_WAIT_FOR_OUTPUT([m_as ovn-gw-1 ovs-appctl dpctl/dump-conntrack | M_FORMAT_CT(20.0.1.3) | \ +grep tcp | sed -E -e 's/10.0.1.3|10.0.1.4//g' | sort], [0], [dnl +tcp,orig=(src=20.0.1.3,dst=,sport=59008,dport=80),reply=(src=,dst=20.0.1.3,sport=80,dport=59008),zone=,protoinfo=(state=) +tcp,orig=(src=20.0.1.3,dst=,sport=59008,dport=9000),reply=(src=,dst=20.0.1.3,sport=80,dport=59008),zone=,mark=,protoinfo=(state=) +]) + +OVS_WAIT_FOR_OUTPUT([m_as ovn-chassis-3 cat -v curl.out | M_FORMAT_CURL([172.16.0.100], [9000])], [0], [dnl +Connected to 172.16.0.100 (172.16.0.100) port 9000 +200 OK +]) + +m_as ovn-gw-1 ovs-appctl dpctl/flush-conntrack +m_as ovn-gw-2 ovs-appctl dpctl/flush-conntrack + +# Check end-to-end request using a new port for VIP +M_NS_EXEC([ovn-chassis-4], [publicp2], [sh -c 'curl -v -O 172.16.0.100:9000/download_file --retry 0 --connect-timeout 1 --max-time 1 --local-port 
59008 2>curl.out']) +OVS_WAIT_FOR_OUTPUT([m_as ovn-gw-2 ovs-appctl dpctl/dump-conntrack | M_FORMAT_CT(30.0.1.3) | \ +grep tcp | sed -E -e 's/10.0.1.3|10.0.1.4//g' | sort], [0], [dnl +tcp,orig=(src=30.0.1.3,dst=,sport=59008,dport=80),reply=(src=,dst=30.0.1.3,sport=80,dport=59008),zone=,protoinfo=(state=) +tcp,orig=(src=30.0.1.3,dst=,sport=59008,dport=9000),reply=(src=,dst=30.0.1.3,sport=80,dport=59008),zone=,mark=,protoinfo=(state=) +]) + +OVS_WAIT_FOR_OUTPUT([m_as ovn-chassis-3 cat -v curl.out | M_FORMAT_CURL([172.16.0.100], [9000])], [0], [dnl +Connected to 172.16.0.100 (172.16.0.100) port 9000 +200 OK +]) + +AT_CLEANUP + +AT_SETUP([ovn multinode load-balancer with multiple DGPs and multiple chassis - ECMP environment]) + +# Check that ovn-fake-multinode setup is up and running - requires additional nodes +check_fake_multinode_setup_by_nodes 'ovn-chassis-1 ovn-chassis-2 ovn-chassis-3 ovn-gw-1 ovn-gw-2 ovn-gw-3 ovn-gw-4' + +# Delete the multinode NB and OVS resources before starting the test. +cleanup_multinode_resources_by_nodes 'ovn-chassis-1 ovn-chassis-2 ovn-chassis-3 ovn-gw-1 ovn-gw-2 ovn-gw-3 ovn-gw-4' + +# Reset geneve tunnels +for c in ovn-chassis-1 ovn-chassis-2 ovn-chassis-3 ovn-gw-1 ovn-gw-2 ovn-gw-3 ovn-gw-4 +do + m_as $c ovs-vsctl set open . 
external-ids:ovn-encap-type=geneve +done + +# Network topology +# VM ovn-chassis-3 (40.0.2.3/24) +# | +# sw1 +# | +# lr1 +# | +# +.............................|.............................+ +# | | +# DGP publicp3 (ovn-gw-3) (20.0.2.3/24) DGP publicp4 (ovn-gw-4) (20.0.2.4/24) +# | | +# +.............................+.............................+ +# | +# | (overlay) +# +.............................+.............................+ +# | | +# DGP public1 (ovn-gw-1) (20.0.2.1/24) DGP public2 (ovn-gw-2) (20.0.2.2/24) +# | | +# +.............................+.............................+ +# | +# lr0 (lb0 VIP 172.16.0.100) +# | +# sw0 +# | +# +.............................+.............................+ +# | | +# sw0p1 (ovn-chassis-1) 10.0.2.3/24 sw0p2 (ovn-chassis-2) 10.0.2.4/24 + + +# Delete already used ovs-ports +m_as ovn-chassis-1 ip link del sw0p1-p +m_as ovn-chassis-2 ip link del sw0p2-p +m_as ovn-chassis-2 ip link del sw1p1-p +m_as ovn-chassis-3 ip link del sw1p1-p + +# Create East-West switch for LB backends +check multinode_nbctl ls-add sw0 +check multinode_nbctl lsp-add sw0 sw0-port1 +check multinode_nbctl lsp-set-addresses sw0-port1 "50:54:00:00:00:03 10.0.2.3 1000::3" +check multinode_nbctl lsp-add sw0 sw0-port2 +check multinode_nbctl lsp-set-addresses sw0-port2 "50:54:00:00:00:04 10.0.2.4 1000::4" + +m_as ovn-chassis-1 /data/create_fake_vm.sh sw0-port1 sw0p1 50:54:00:00:00:03 1342 10.0.2.3 24 10.0.2.1 1000::3/64 1000::a +m_as ovn-chassis-2 /data/create_fake_vm.sh sw0-port2 sw0p2 50:54:00:00:00:04 1342 10.0.2.4 24 10.0.2.1 1000::4/64 1000::a + +# Create sw1 for ovn-chassis-3 VM +check multinode_nbctl ls-add sw1 +check multinode_nbctl lsp-add sw1 sw1-port1 +check multinode_nbctl lsp-set-addresses sw1-port1 "70:54:00:00:00:03 40.0.2.3 5000::3" + +m_as ovn-chassis-3 /data/create_fake_vm.sh sw1-port1 sw1p1 70:54:00:00:00:03 1342 40.0.2.3 24 40.0.2.1 5000::3/64 5000::a + +m_wait_for_ports_up + +M_NS_CHECK_EXEC([ovn-chassis-1], [sw0p1], [ping -q -c 3 -i 0.3 -w 2 
10.0.2.4 | FORMAT_PING], \ +[0], [dnl +3 packets transmitted, 3 received, 0% packet loss, time 0ms +]) + +M_NS_CHECK_EXEC([ovn-chassis-2], [sw0p2], [ping -q -c 3 -i 0.3 -w 2 10.0.2.3 | FORMAT_PING], \ +[0], [dnl +3 packets transmitted, 3 received, 0% packet loss, time 0ms +]) + +# Create a logical router and attach to sw0 +check multinode_nbctl lr-add lr0 +check multinode_nbctl lrp-add lr0 lr0-sw0 00:00:00:00:ff:01 10.0.2.1/24 1000::a/64 +check multinode_nbctl lsp-add sw0 sw0-lr0 +check multinode_nbctl lsp-set-type sw0-lr0 router +check multinode_nbctl lsp-set-addresses sw0-lr0 router +check multinode_nbctl lsp-set-options sw0-lr0 router-port=lr0-sw0 + +# create external connection for N/S traffic using multiple DGPs +check multinode_nbctl ls-add public + +# create external connection for N/S traffic +# DGP public1 +check multinode_nbctl lsp-add public ln-public-1 +check multinode_nbctl lsp-set-type ln-public-1 localnet +check multinode_nbctl lsp-set-addresses ln-public-1 unknown +check multinode_nbctl lsp-set-options ln-public-1 network_name=public1 + +# DGP public2 +check multinode_nbctl lsp-add public ln-public-2 +check multinode_nbctl lsp-set-type ln-public-2 localnet +check multinode_nbctl lsp-set-addresses ln-public-2 unknown +check multinode_nbctl lsp-set-options ln-public-2 network_name=public2 + +# Attach DGP public1 to GW-1 public1 (overlay connectivity) +m_as ovn-gw-1 ovs-vsctl set open . external-ids:ovn-bridge-mappings=public1:br-ex + +# Attach DGP public2 to GW-2 public2 (overlay connectivity) +m_as ovn-gw-2 ovs-vsctl set open . 
external-ids:ovn-bridge-mappings=public2:br-ex + +check multinode_nbctl lrp-add lr0 lr0-public-p1 40:54:00:00:00:01 20.0.2.1/24 2000::1/64 +check multinode_nbctl lsp-add public public-lr0-p1 +check multinode_nbctl lsp-set-type public-lr0-p1 router +check multinode_nbctl lsp-set-addresses public-lr0-p1 router +check multinode_nbctl lsp-set-options public-lr0-p1 router-port=lr0-public-p1 +check multinode_nbctl lrp-set-gateway-chassis lr0-public-p1 ovn-gw-1 10 + +m_wait_for_ports_up + +M_NS_CHECK_EXEC([ovn-chassis-1], [sw0p1], [ping -q -c 3 -i 0.3 -w 2 20.0.2.1 | FORMAT_PING], \ +[0], [dnl +3 packets transmitted, 3 received, 0% packet loss, time 0ms +]) + +check multinode_nbctl lrp-add lr0 lr0-public-p2 40:54:00:00:00:02 20.0.2.2/24 2000::2/64 +check multinode_nbctl lsp-add public public-lr0-p2 +check multinode_nbctl lsp-set-type public-lr0-p2 router +check multinode_nbctl lsp-set-addresses public-lr0-p2 router +check multinode_nbctl lsp-set-options public-lr0-p2 router-port=lr0-public-p2 +check multinode_nbctl lrp-set-gateway-chassis lr0-public-p2 ovn-gw-2 10 + +M_NS_CHECK_EXEC([ovn-chassis-2], [sw0p2], [ping -q -c 3 -i 0.3 -w 2 20.0.2.2 | FORMAT_PING], \ +[0], [dnl +3 packets transmitted, 3 received, 0% packet loss, time 0ms +]) + +# Create a logical router and attach to sw1 +check multinode_nbctl lr-add lr1 +check multinode_nbctl lrp-add lr1 lr1-sw1 00:00:00:00:ff:02 40.0.2.1/24 5000::a/64 +check multinode_nbctl lsp-add sw1 sw1-lr1 +check multinode_nbctl lsp-set-type sw1-lr1 router +check multinode_nbctl lsp-set-addresses sw1-lr1 router +check multinode_nbctl lsp-set-options sw1-lr1 router-port=lr1-sw1 + +# create external connection for N/S traffic +# DGP public3 +check multinode_nbctl lsp-add public ln-public-3 +check multinode_nbctl lsp-set-type ln-public-3 localnet +check multinode_nbctl lsp-set-addresses ln-public-3 unknown +check multinode_nbctl lsp-set-options ln-public-3 network_name=public3 + +# Attach DGP public3 to GW-3 public3 (overlay connectivity) 
+m_as ovn-gw-3 ovs-vsctl set open . external-ids:ovn-bridge-mappings=public3:br-ex + +check multinode_nbctl lrp-add lr1 lr1-public-p3 40:54:00:00:00:03 20.0.2.3/24 2000::3/64 +check multinode_nbctl lsp-add public public-lr1-p3 +check multinode_nbctl lsp-set-type public-lr1-p3 router +check multinode_nbctl lsp-set-addresses public-lr1-p3 router +check multinode_nbctl lsp-set-options public-lr1-p3 router-port=lr1-public-p3 +check multinode_nbctl lrp-set-gateway-chassis lr1-public-p3 ovn-gw-3 10 + +M_NS_CHECK_EXEC([ovn-chassis-3], [sw1p1], [ping -q -c 3 -i 0.3 -w 2 40.0.2.1 | FORMAT_PING], \ +[0], [dnl +3 packets transmitted, 3 received, 0% packet loss, time 0ms +]) + +M_NS_CHECK_EXEC([ovn-chassis-3], [sw1p1], [ping -q -c 3 -i 0.3 -w 2 20.0.2.3 | FORMAT_PING], \ +[0], [dnl +3 packets transmitted, 3 received, 0% packet loss, time 0ms +]) + +# Add a default route for multiple DGPs using ECMP - first step +check multinode_nbctl --ecmp lr-route-add lr0 0.0.0.0/0 20.0.2.3 +check multinode_nbctl --ecmp lr-route-add lr1 0.0.0.0/0 20.0.2.1 + +# Add SNAT rules using gateway-port +check multinode_nbctl --gateway-port lr0-public-p1 lr-nat-add lr0 snat 20.0.2.1 10.0.2.0/24 +check multinode_nbctl --gateway-port lr0-public-p2 lr-nat-add lr0 snat 20.0.2.2 10.0.2.0/24 +check multinode_nbctl --gateway-port lr1-public-p3 lr-nat-add lr1 snat 20.0.2.3 40.0.2.0/24 + +M_NS_CHECK_EXEC([ovn-chassis-3], [sw1p1], [ping -q -c 3 -i 0.3 -w 2 20.0.2.1 | FORMAT_PING], \ +[0], [dnl +3 packets transmitted, 3 received, 0% packet loss, time 0ms +]) + +M_NS_CHECK_EXEC([ovn-chassis-3], [sw1p1], [ping -q -c 3 -i 0.3 -w 2 20.0.2.2 | FORMAT_PING], \ +[0], [dnl +3 packets transmitted, 3 received, 0% packet loss, time 0ms +]) + +# Configure the second DGP for the lr1 +# DGP public4 +check multinode_nbctl lsp-add public ln-public-4 +check multinode_nbctl lsp-set-type ln-public-4 localnet +check multinode_nbctl lsp-set-addresses ln-public-4 unknown +check multinode_nbctl lsp-set-options ln-public-4 
network_name=public4 + +# Attach DGP public4 to GW-4 public4 (overlay connectivity) +m_as ovn-gw-4 ovs-vsctl set open . external-ids:ovn-bridge-mappings=public4:br-ex + +check multinode_nbctl lrp-add lr1 lr1-public-p4 40:54:00:00:00:04 20.0.2.4/24 2000::4/64 +check multinode_nbctl lsp-add public public-lr1-p4 +check multinode_nbctl lsp-set-type public-lr1-p4 router +check multinode_nbctl lsp-set-addresses public-lr1-p4 router +check multinode_nbctl lsp-set-options public-lr1-p4 router-port=lr1-public-p4 +check multinode_nbctl lrp-set-gateway-chassis lr1-public-p4 ovn-gw-4 10 + +M_NS_CHECK_EXEC([ovn-chassis-3], [sw1p1], [ping -q -c 3 -i 0.3 -w 2 20.0.2.4 | FORMAT_PING], \ +[0], [dnl +3 packets transmitted, 3 received, 0% packet loss, time 0ms +]) + +# Add SNAT rules using gateway-port +check multinode_nbctl --gateway-port lr1-public-p4 lr-nat-add lr1 snat 20.0.2.4 40.0.2.0/24 + +# Add a default route for multiple DGPs using ECMP - second step (multipath) +check multinode_nbctl --ecmp lr-route-add lr0 0.0.0.0/0 20.0.2.4 +check multinode_nbctl --ecmp lr-route-add lr1 0.0.0.0/0 20.0.2.2 + +# Start backend http services +M_NS_DAEMONIZE([ovn-chassis-1], [sw0p1], [python3 -m http.server --bind 10.0.2.3 80 >/dev/null 2>&1], [http1.pid]) +M_NS_DAEMONIZE([ovn-chassis-2], [sw0p2], [python3 -m http.server --bind 10.0.2.4 80 >/dev/null 2>&1], [http2.pid]) + +# wait for http server be ready +OVS_WAIT_UNTIL([m_as ovn-chassis-1 ip netns exec sw0p1 ss -tulpn | grep LISTEN | grep 10.0.2.3:80]) +OVS_WAIT_UNTIL([m_as ovn-chassis-2 ip netns exec sw0p2 ss -tulpn | grep LISTEN | grep 10.0.2.4:80]) + +# create a big file on web servers for download +M_NS_EXEC([ovn-chassis-1], [sw0p1], [dd bs=512 count=200000 if=/dev/urandom of=download_file]) +M_NS_EXEC([ovn-chassis-2], [sw0p2], [dd bs=512 count=200000 if=/dev/urandom of=download_file]) + +# create LB +check multinode_nbctl lb-add lb0 "172.16.0.100:80" "10.0.2.3:80,10.0.2.4:80" +check multinode_nbctl lr-lb-add lr0 lb0 +check
multinode_nbctl ls-lb-add sw0 lb0 + +check multinode_nbctl --wait=sb sync + +# Flush conntrack entries for easier output parsing of next test. +m_as ovn-gw-1 ovs-appctl dpctl/flush-conntrack +m_as ovn-gw-2 ovs-appctl dpctl/flush-conntrack +m_as ovn-gw-3 ovs-appctl dpctl/flush-conntrack +m_as ovn-gw-4 ovs-appctl dpctl/flush-conntrack + +# Check direct backend traffic using the same LB ports +M_NS_EXEC([ovn-chassis-3], [sw1p1], [sh -c 'curl -v -O 10.0.2.3:80/download_file --retry 0 --connect-timeout 1 --max-time 1 --local-port 59013 2>curl.out']) + +gw1_ct=$(m_as ovn-gw-1 ovs-appctl dpctl/dump-conntrack | sed ':a;N;$!ba;s/\n/\\n/g') +gw2_ct=$(m_as ovn-gw-2 ovs-appctl dpctl/dump-conntrack | sed ':a;N;$!ba;s/\n/\\n/g') +gw3_ct=$(m_as ovn-gw-3 ovs-appctl dpctl/dump-conntrack | sed ':a;N;$!ba;s/\n/\\n/g') +gw4_ct=$(m_as ovn-gw-4 ovs-appctl dpctl/dump-conntrack | sed ':a;N;$!ba;s/\n/\\n/g') + +# Check the backend IP from ct entries on gateways +backend_check_gw1=$(echo -e $gw1_ct | grep "dport=80" | grep "59013" -c) +backend_check_gw2=$(echo -e $gw2_ct | grep "dport=80" | grep "59013" -c) +backend_check_gw3=$(echo -e $gw3_ct | grep "dport=80" | grep "59013" -c) +backend_check_gw4=$(echo -e $gw4_ct | grep "dport=80" | grep "59013" -c) + +chassis_in_use=$(($backend_check_gw1 + $backend_check_gw2 + $backend_check_gw3 + $backend_check_gw4)) + +# If the traffic passes through both gateways (GW-1 and GW-2 OR GW-3 and GW-4) it will be dropped because +# we are bypassing the Stateless NAT solution for LB when we access the backend directly +if [[ $chassis_in_use -gt 2 ]]; then +OVS_WAIT_FOR_OUTPUT([m_as ovn-chassis-3 ip netns exec sw1p1 cat -v curl.out | \ +sed 's/\(.*\)timed out/timed out\n/' | sed 's/\(.*\)connect timeout/timed out\n/' | grep -i -e "timed out" | uniq], [0], [dnl +timed out +]) +else +OVS_WAIT_FOR_OUTPUT([m_as ovn-chassis-3 cat -v curl.out | M_FORMAT_CURL([10.0.2.3], [80])], [0], [dnl +Connected to 10.0.2.3 (10.0.2.3) port 80 +200 OK +]) +fi + +# Flush conntrack 
entries for easier output parsing of next test. +m_as ovn-gw-1 ovs-appctl dpctl/flush-conntrack +m_as ovn-gw-2 ovs-appctl dpctl/flush-conntrack +m_as ovn-gw-3 ovs-appctl dpctl/flush-conntrack +m_as ovn-gw-4 ovs-appctl dpctl/flush-conntrack + +M_NS_EXEC([ovn-chassis-3], [sw1p1], [sh -c 'curl -v -O 10.0.2.4:80/download_file --retry 0 --connect-timeout 1 --max-time 1 --local-port 59014 2>curl.out']) + +gw1_ct=$(m_as ovn-gw-1 ovs-appctl dpctl/dump-conntrack | sed ':a;N;$!ba;s/\n/\\n/g') +gw2_ct=$(m_as ovn-gw-2 ovs-appctl dpctl/dump-conntrack | sed ':a;N;$!ba;s/\n/\\n/g') +gw3_ct=$(m_as ovn-gw-3 ovs-appctl dpctl/dump-conntrack | sed ':a;N;$!ba;s/\n/\\n/g') +gw4_ct=$(m_as ovn-gw-4 ovs-appctl dpctl/dump-conntrack | sed ':a;N;$!ba;s/\n/\\n/g') + +# Check the backend IP from ct entries on gateways +backend_check_gw1=$(echo -e $gw1_ct | grep "dport=80" | grep "59014" -c) +backend_check_gw2=$(echo -e $gw2_ct | grep "dport=80" | grep "59014" -c) +backend_check_gw3=$(echo -e $gw3_ct | grep "dport=80" | grep "59014" -c) +backend_check_gw4=$(echo -e $gw4_ct | grep "dport=80" | grep "59014" -c) + +chassis_in_use=$(($backend_check_gw1 + $backend_check_gw2 + $backend_check_gw3 + $backend_check_gw4)) + +# If the traffic passes through both gateways (GW-1 and GW-2 OR GW-3 and GW-4) it will be dropped because +# we are bypassing the Stateless NAT solution for LB when we access the backend directly +if [[ $chassis_in_use -gt 2 ]]; then +OVS_WAIT_FOR_OUTPUT([m_as ovn-chassis-3 ip netns exec sw1p1 cat -v curl.out | \ +sed 's/\(.*\)timed out/timed out\n/' | sed 's/\(.*\)connect timeout/timed out\n/' | grep -i -e "timed out" | uniq], [0], [dnl +timed out +]) +else +OVS_WAIT_FOR_OUTPUT([m_as ovn-chassis-3 cat -v curl.out | M_FORMAT_CURL([10.0.2.4], [80])], [0], [dnl +Connected to 10.0.2.4 (10.0.2.4) port 80 +200 OK +]) +fi + +# Check the flows again for the LB VIP +M_NS_EXEC([ovn-chassis-3], [sw1p1], [sh -c 'curl -v 172.16.0.100:80 --retry 0 --connect-timeout 1 --max-time 1 --local-port 
59015 2>curl.out']) + +curl_timeout=$(m_as ovn-chassis-3 cat -v curl.out | grep -i -e "timed out" -e "timeout" -c) + +# This may fail because we do not have the flows to work independently of the DGP (DNAT + SNAT for the LB Stateless NAT) +if [[ $curl_timeout -gt 0 ]]; then +OVS_WAIT_FOR_OUTPUT([m_as ovn-chassis-3 ip netns exec sw1p1 cat -v curl.out | \ +sed 's/\(.*\)timed out/timed out\n/' | sed 's/\(.*\)connect timeout/timed out\n/' | grep -i -e "timed out" | uniq], [0], [dnl +timed out +]) +else +OVS_WAIT_FOR_OUTPUT([m_as ovn-chassis-3 cat -v curl.out | M_FORMAT_CURL([172.16.0.100], [80])], [0], [dnl +Connected to 172.16.0.100 (172.16.0.100) port 80 +200 OK +]) +fi + +# Check the flows again for the LB VIP +M_NS_EXEC([ovn-chassis-3], [sw1p1], [sh -c 'curl -v 172.16.0.100:80 --retry 0 --connect-timeout 1 --max-time 1 --local-port 59016 2>curl.out']) + +curl_timeout=$(m_as ovn-chassis-3 cat -v curl.out | grep -i -e "timed out" -e "timeout" -c) + +# This may fail because we do not have the flows to work independently of the DGP (DNAT + SNAT for the LB Stateless NAT) +if [[ $curl_timeout -gt 0 ]]; then +OVS_WAIT_FOR_OUTPUT([m_as ovn-chassis-3 ip netns exec sw1p1 cat -v curl.out | \ +sed 's/\(.*\)timed out/timed out\n/' | sed 's/\(.*\)connect timeout/timed out\n/' | grep -i -e "timed out" | uniq], [0], [dnl +timed out +]) +else +OVS_WAIT_FOR_OUTPUT([m_as ovn-chassis-3 cat -v curl.out | M_FORMAT_CURL([172.16.0.100], [80])], [0], [dnl +Connected to 172.16.0.100 (172.16.0.100) port 80 +200 OK +]) +fi + +# Set use_stateless_nat to true +# Now, if the traffic passes through both gateways (GW-1 and GW-2) it will be forwarded successfully +check multinode_nbctl set load_balancer lb0 options:use_stateless_nat=true + +# Check the flows again for the LB VIP - always needs to be successful regardless of the datapath (one or two gw chassis) +M_NS_EXEC([ovn-chassis-3], [sw1p1], [sh -c 'curl -v -O 172.16.0.100:80/download_file --retry 0 --connect-timeout 1 --max-time 1 
2>curl.out'])
+
+OVS_WAIT_FOR_OUTPUT([m_as ovn-chassis-3 cat -v curl.out | M_FORMAT_CURL([172.16.0.100], [80])], [0], [dnl
+Connected to 172.16.0.100 (172.16.0.100) port 80
+200 OK
+])
+
+M_NS_EXEC([ovn-chassis-3], [sw1p1], [sh -c 'curl -v -O 172.16.0.100:80/download_file --retry 0 --connect-timeout 1 --max-time 1 2>curl.out'])
+
+OVS_WAIT_FOR_OUTPUT([m_as ovn-chassis-3 cat -v curl.out | M_FORMAT_CURL([172.16.0.100], [80])], [0], [dnl
+Connected to 172.16.0.100 (172.16.0.100) port 80
+200 OK
+])
+
+M_NS_EXEC([ovn-chassis-3], [sw1p1], [sh -c 'curl -v -O 172.16.0.100:80/download_file --retry 0 --connect-timeout 1 --max-time 1 2>curl.out'])
+
+OVS_WAIT_FOR_OUTPUT([m_as ovn-chassis-3 cat -v curl.out | M_FORMAT_CURL([172.16.0.100], [80])], [0], [dnl
+Connected to 172.16.0.100 (172.16.0.100) port 80
+200 OK
+])
+
+M_NS_EXEC([ovn-chassis-3], [sw1p1], [sh -c 'curl -v -O 172.16.0.100:80/download_file --retry 0 --connect-timeout 1 --max-time 1 2>curl.out'])
+
+OVS_WAIT_FOR_OUTPUT([m_as ovn-chassis-3 cat -v curl.out | M_FORMAT_CURL([172.16.0.100], [80])], [0], [dnl
+Connected to 172.16.0.100 (172.16.0.100) port 80
+200 OK
+])
+
+# Direct backend traffic using the same LB ports needs to be dropped
+M_NS_EXEC([ovn-chassis-3], [sw1p1], [sh -c 'curl -v -O 10.0.2.3:80/download_file --retry 0 --connect-timeout 1 --max-time 1 2>curl.out'])
+
+OVS_WAIT_FOR_OUTPUT([m_as ovn-chassis-3 ip netns exec sw1p1 cat -v curl.out | \
+sed 's/\(.*\)timed out/timed out\n/' | sed 's/\(.*\)connect timeout/timed out\n/' | grep -i -e "timed out" | uniq], [0], [dnl
+timed out
+])
+
+# Check again using another source port
+M_NS_EXEC([ovn-chassis-3], [sw1p1], [sh -c 'curl -v -O 10.0.2.3:80/download_file --retry 0 --connect-timeout 1 --max-time 1 2>curl.out'])
+
+OVS_WAIT_FOR_OUTPUT([m_as ovn-chassis-3 ip netns exec sw1p1 cat -v curl.out | \
+sed 's/\(.*\)timed out/timed out\n/' | sed 's/\(.*\)connect timeout/timed out\n/' | grep -i -e "timed out" | uniq], [0], [dnl
+timed out
+])
+
+# Start backend http services using different ports from the LB config - check connectivity
+M_NS_DAEMONIZE([ovn-chassis-1], [sw0p1], [python3 -m http.server --bind 10.0.2.3 8080 >/dev/null 2>&1], [http3.pid])
+M_NS_DAEMONIZE([ovn-chassis-2], [sw0p2], [python3 -m http.server --bind 10.0.2.4 8080 >/dev/null 2>&1], [http4.pid])
+
+# Wait for the http servers to be ready
+OVS_WAIT_UNTIL([m_as ovn-chassis-1 ip netns exec sw0p1 ss -tulpn | grep LISTEN | grep 10.0.2.3:8080])
+OVS_WAIT_UNTIL([m_as ovn-chassis-2 ip netns exec sw0p2 ss -tulpn | grep LISTEN | grep 10.0.2.4:8080])
+
+# Flush conntrack entries for easier output parsing of next test.
+m_as ovn-gw-1 ovs-appctl dpctl/flush-conntrack
+m_as ovn-gw-2 ovs-appctl dpctl/flush-conntrack
+m_as ovn-gw-3 ovs-appctl dpctl/flush-conntrack
+m_as ovn-gw-4 ovs-appctl dpctl/flush-conntrack
+
+M_NS_EXEC([ovn-chassis-3], [sw1p1], [sh -c 'curl -v -O 10.0.2.4:8080/download_file --retry 0 --connect-timeout 1 --max-time 1 --local-port 59017 2>curl.out'])
+
+gw1_ct=$(m_as ovn-gw-1 ovs-appctl dpctl/dump-conntrack | sed ':a;N;$!ba;s/\n/\\n/g')
+gw2_ct=$(m_as ovn-gw-2 ovs-appctl dpctl/dump-conntrack | sed ':a;N;$!ba;s/\n/\\n/g')
+gw3_ct=$(m_as ovn-gw-3 ovs-appctl dpctl/dump-conntrack | sed ':a;N;$!ba;s/\n/\\n/g')
+gw4_ct=$(m_as ovn-gw-4 ovs-appctl dpctl/dump-conntrack | sed ':a;N;$!ba;s/\n/\\n/g')
+
+# Check the backend IP from ct entries on gateways
+backend_check_gw1=$(echo -e $gw1_ct | grep "dport=8080" | grep "59017" -c)
+backend_check_gw2=$(echo -e $gw2_ct | grep "dport=8080" | grep "59017" -c)
+backend_check_gw3=$(echo -e $gw3_ct | grep "dport=8080" | grep "59017" -c)
+backend_check_gw4=$(echo -e $gw4_ct | grep "dport=8080" | grep "59017" -c)
+
+chassis_in_use=$(($backend_check_gw1 + $backend_check_gw2 + $backend_check_gw3 + $backend_check_gw4))
+
+# If the traffic passes through both gateways (GW-1 and GW-2 OR GW-3 and GW-4) it will be dropped because
+# we are bypassing the Stateless NAT solution for LB when we access the backend directly
+if [[ $chassis_in_use -gt 2 ]]; then
+OVS_WAIT_FOR_OUTPUT([m_as ovn-chassis-3 ip netns exec sw1p1 cat -v curl.out | \
+sed 's/\(.*\)timed out/timed out\n/' | sed 's/\(.*\)connect timeout/timed out\n/' | grep -i -e "timed out" | uniq], [0], [dnl
+timed out
+])
+else
+OVS_WAIT_FOR_OUTPUT([m_as ovn-chassis-3 cat -v curl.out | M_FORMAT_CURL([10.0.2.4], [8080])], [0], [dnl
+Connected to 10.0.2.4 (10.0.2.4) port 8080
+200 OK
+])
+fi
+
+# Flush conntrack entries for easier output parsing of next test.
+m_as ovn-gw-1 ovs-appctl dpctl/flush-conntrack
+m_as ovn-gw-2 ovs-appctl dpctl/flush-conntrack
+m_as ovn-gw-3 ovs-appctl dpctl/flush-conntrack
+m_as ovn-gw-4 ovs-appctl dpctl/flush-conntrack
+
+# Check again
+M_NS_EXEC([ovn-chassis-3], [sw1p1], [sh -c 'curl -v -O 10.0.2.4:8080/download_file --retry 0 --connect-timeout 1 --max-time 1 --local-port 59018 2>curl.out'])
+
+gw1_ct=$(m_as ovn-gw-1 ovs-appctl dpctl/dump-conntrack | sed ':a;N;$!ba;s/\n/\\n/g')
+gw2_ct=$(m_as ovn-gw-2 ovs-appctl dpctl/dump-conntrack | sed ':a;N;$!ba;s/\n/\\n/g')
+gw3_ct=$(m_as ovn-gw-3 ovs-appctl dpctl/dump-conntrack | sed ':a;N;$!ba;s/\n/\\n/g')
+gw4_ct=$(m_as ovn-gw-4 ovs-appctl dpctl/dump-conntrack | sed ':a;N;$!ba;s/\n/\\n/g')
+
+# Check the backend IP from ct entries on gateways
+backend_check_gw1=$(echo -e $gw1_ct | grep "dport=8080" | grep "59018" -c)
+backend_check_gw2=$(echo -e $gw2_ct | grep "dport=8080" | grep "59018" -c)
+backend_check_gw3=$(echo -e $gw3_ct | grep "dport=8080" | grep "59018" -c)
+backend_check_gw4=$(echo -e $gw4_ct | grep "dport=8080" | grep "59018" -c)
+
+chassis_in_use=$(($backend_check_gw1 + $backend_check_gw2 + $backend_check_gw3 + $backend_check_gw4))
+
+# If the traffic passes through both gateways (GW-1 and GW-2 OR GW-3 and GW-4) it will be dropped because
+# we are bypassing the Stateless NAT solution for LB when we access the backend directly
+if [[ $chassis_in_use -gt 2 ]]; then
+OVS_WAIT_FOR_OUTPUT([m_as ovn-chassis-3 ip netns exec sw1p1 cat -v curl.out | \
+sed 's/\(.*\)timed out/timed out\n/' | sed 's/\(.*\)connect timeout/timed out\n/' | grep -i -e "timed out" | uniq], [0], [dnl
+timed out
+])
+else
+OVS_WAIT_FOR_OUTPUT([m_as ovn-chassis-3 cat -v curl.out | M_FORMAT_CURL([10.0.2.4], [8080])], [0], [dnl
+Connected to 10.0.2.4 (10.0.2.4) port 8080
+200 OK
+])
+fi
+
+AT_CLEANUP
diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
index d6a8c4640..b760f662e 100644
--- a/tests/ovn-northd.at
+++ b/tests/ovn-northd.at
@@ -13872,3 +13872,323 @@ check_no_redirect
 
 AT_CLEANUP
 ])
+
+OVN_FOR_EACH_NORTHD_NO_HV_PARALLELIZATION([
+AT_SETUP([Load balancer with Distributed Gateway Ports (LB + DGP + NAT Stateless)])
+ovn_start
+
+check ovn-nbctl ls-add public
+check ovn-nbctl lr-add lr1
+
+# lr1 DGP ts1
+check ovn-nbctl ls-add ts1
+check ovn-nbctl lrp-add lr1 lr1-ts1 00:00:01:02:03:04 172.16.10.1/24
+check ovn-nbctl lrp-set-gateway-chassis lr1-ts1 chassis-2
+
+# lr1 DGP ts2
+check ovn-nbctl ls-add ts2
+check ovn-nbctl lrp-add lr1 lr1-ts2 00:00:01:02:03:05 172.16.20.1/24
+check ovn-nbctl lrp-set-gateway-chassis lr1-ts2 chassis-3
+
+# lr1 DGP public
+check ovn-nbctl lrp-add lr1 lr1_public 00:de:ad:ff:00:01 173.16.0.1/16
+check ovn-nbctl lrp-add lr1 lr1_s1 00:de:ad:fe:00:02 172.16.0.1/24
+check ovn-nbctl lrp-set-gateway-chassis lr1_public chassis-1
+
+check ovn-nbctl ls-add s1
+# s1 - lr1
+check ovn-nbctl lsp-add s1 s1_lr1
+check ovn-nbctl lsp-set-type s1_lr1 router
+check ovn-nbctl lsp-set-addresses s1_lr1 "00:de:ad:fe:00:02 172.16.0.1"
+check ovn-nbctl lsp-set-options s1_lr1 router-port=lr1_s1
+
+# s1 - backend vm1
+check ovn-nbctl lsp-add s1 vm1
+check ovn-nbctl lsp-set-addresses vm1 "00:de:ad:01:00:01 172.16.0.101"
+
+# s1 - backend vm2
+check ovn-nbctl lsp-add s1 vm2
+check ovn-nbctl lsp-set-addresses vm2 "00:de:ad:01:00:02 172.16.0.102"
+
+# s1 - backend vm3
+check ovn-nbctl lsp-add s1 vm3
+check ovn-nbctl lsp-set-addresses vm3 "00:de:ad:01:00:03 172.16.0.103"
+
+# Add the lr1 DGP ts1 to the public switch
+check ovn-nbctl lsp-add public public_lr1_ts1
+check ovn-nbctl lsp-set-type public_lr1_ts1 router
+check ovn-nbctl lsp-set-addresses public_lr1_ts1 router
+check ovn-nbctl lsp-set-options public_lr1_ts1 router-port=lr1-ts1 nat-addresses=router
+
+# Add the lr1 DGP ts2 to the public switch
+check ovn-nbctl lsp-add public public_lr1_ts2
+check ovn-nbctl lsp-set-type public_lr1_ts2 router
+check ovn-nbctl lsp-set-addresses public_lr1_ts2 router
+check ovn-nbctl lsp-set-options public_lr1_ts2 router-port=lr1-ts2 nat-addresses=router
+
+# Add the lr1 DGP public to the public switch
+check ovn-nbctl lsp-add public public_lr1
+check ovn-nbctl lsp-set-type public_lr1 router
+check ovn-nbctl lsp-set-addresses public_lr1 router
+check ovn-nbctl lsp-set-options public_lr1 router-port=lr1_public nat-addresses=router
+
+# Create the Load Balancer lb1
+check ovn-nbctl --wait=sb lb-add lb1 "30.0.0.1" "172.16.0.103,172.16.0.102,172.16.0.101"
+
+# Set use_stateless_nat to true
+check ovn-nbctl --wait=sb set load_balancer lb1 options:use_stateless_nat=true
+
+# Associate load balancer to s1
+check ovn-nbctl ls-lb-add s1 lb1
+check ovn-nbctl --wait=sb sync
+
+ovn-sbctl dump-flows s1 > s1flows
+AT_CAPTURE_FILE([s1flows])
+
+AT_CHECK([grep "ls_in_pre_stateful" s1flows | ovn_strip_lflows | grep "30.0.0.1"], [0], [dnl
+ table=??(ls_in_pre_stateful ), priority=120 , match=(reg0[[2]] == 1 && ip4.dst == 30.0.0.1), action=(reg1 = 30.0.0.1; ct_lb_mark;)
+])
+AT_CHECK([grep "ls_in_lb" s1flows | ovn_strip_lflows | grep "30.0.0.1"], [0], [dnl
+ table=??(ls_in_lb ), priority=110 , match=(ct.new && ip4.dst == 30.0.0.1), action=(ct_lb_mark(backends=172.16.0.103,172.16.0.102,172.16.0.101);)
+])
+
+# Associate load balancer to lr1 with DGP
+check ovn-nbctl lr-lb-add lr1 lb1
+check ovn-nbctl --wait=sb sync
+
+ovn-sbctl dump-flows lr1 > lr1flows
+AT_CAPTURE_FILE([lr1flows])
+
+# Check stateless NAT rules for load balancer with multiple DGP
+# 1. Check if the backend IPs are in the ipX.dst action
+AT_CHECK([grep "lr_in_dnat" lr1flows | ovn_strip_lflows | grep "30.0.0.1"], [0], [dnl
+ table=??(lr_in_dnat ), priority=110 , match=(ct.new && !ct.rel && ip4 && ip4.dst == 30.0.0.1 && is_chassis_resident("cr-lr1-ts1")), action=(ip4.dst=172.16.0.103;ct_lb_mark(backends=172.16.0.103,172.16.0.102,172.16.0.101);)
+ table=??(lr_in_dnat ), priority=110 , match=(ct.new && !ct.rel && ip4 && ip4.dst == 30.0.0.1 && is_chassis_resident("cr-lr1-ts2")), action=(ip4.dst=172.16.0.103;ct_lb_mark(backends=172.16.0.103,172.16.0.102,172.16.0.101);)
+ table=??(lr_in_dnat ), priority=110 , match=(ct.new && !ct.rel && ip4 && ip4.dst == 30.0.0.1 && is_chassis_resident("cr-lr1_public")), action=(ip4.dst=172.16.0.103;ct_lb_mark(backends=172.16.0.103,172.16.0.102,172.16.0.101);)
+])
+
+# 2. Check if the DGP ports are in the match with action next
+AT_CHECK([grep "lr_out_undnat" lr1flows | ovn_strip_lflows], [0], [dnl
+ table=??(lr_out_undnat ), priority=0 , match=(1), action=(next;)
+ table=??(lr_out_undnat ), priority=120 , match=(ip4 && ((ip4.src == 172.16.0.103) || (ip4.src == 172.16.0.102) || (ip4.src == 172.16.0.101)) && (inport == "lr1-ts1" || outport == "lr1-ts1") && is_chassis_resident("cr-lr1-ts1") && tcp), action=(next;)
+ table=??(lr_out_undnat ), priority=120 , match=(ip4 && ((ip4.src == 172.16.0.103) || (ip4.src == 172.16.0.102) || (ip4.src == 172.16.0.101)) && (inport == "lr1-ts2" || outport == "lr1-ts2") && is_chassis_resident("cr-lr1-ts2") && tcp), action=(next;)
+ table=??(lr_out_undnat ), priority=120 , match=(ip4 && ((ip4.src == 172.16.0.103) || (ip4.src == 172.16.0.102) || (ip4.src == 172.16.0.101)) && (inport == "lr1_public" || outport == "lr1_public") && is_chassis_resident("cr-lr1_public") && tcp), action=(next;)
+])
+
+# 3. Check if the VIP IP is in the ipX.src action
+AT_CHECK([grep "lr_out_snat" lr1flows | ovn_strip_lflows], [0], [dnl
+ table=??(lr_out_snat ), priority=0 , match=(1), action=(next;)
+ table=??(lr_out_snat ), priority=120 , match=(nd_ns), action=(next;)
+ table=??(lr_out_snat ), priority=160 , match=(ip4 && ((ip4.src == 172.16.0.103) || (ip4.src == 172.16.0.102) || (ip4.src == 172.16.0.101)) && (inport == "lr1-ts1" || outport == "lr1-ts1") && is_chassis_resident("cr-lr1-ts1") && tcp), action=(ip4.src=30.0.0.1; next;)
+ table=??(lr_out_snat ), priority=160 , match=(ip4 && ((ip4.src == 172.16.0.103) || (ip4.src == 172.16.0.102) || (ip4.src == 172.16.0.101)) && (inport == "lr1-ts2" || outport == "lr1-ts2") && is_chassis_resident("cr-lr1-ts2") && tcp), action=(ip4.src=30.0.0.1; next;)
+ table=??(lr_out_snat ), priority=160 , match=(ip4 && ((ip4.src == 172.16.0.103) || (ip4.src == 172.16.0.102) || (ip4.src == 172.16.0.101)) && (inport == "lr1_public" || outport == "lr1_public") && is_chassis_resident("cr-lr1_public") && tcp), action=(ip4.src=30.0.0.1; next;)
+])
+
+AT_CLEANUP
+])
+
+OVN_FOR_EACH_NORTHD_NO_HV_PARALLELIZATION([
+AT_SETUP([Load balancer with Distributed Gateway Ports (LB + DGP + NAT Stateless) - IPv6])
+ovn_start
+
+check ovn-nbctl ls-add public
+check ovn-nbctl lr-add lr1
+
+# lr1 DGP ts1
+check ovn-nbctl ls-add ts1
+check ovn-nbctl lrp-add lr1 lr1-ts1 00:00:01:02:03:04 2001:db8:aaaa:1::1/64
+check ovn-nbctl lrp-set-gateway-chassis lr1-ts1 chassis-2
+
+# lr1 DGP ts2
+check ovn-nbctl ls-add ts2
+check ovn-nbctl lrp-add lr1 lr1-ts2 00:00:01:02:03:05 2001:db8:aaaa:2::1/64
+check ovn-nbctl lrp-set-gateway-chassis lr1-ts2 chassis-3
+
+# lr1 DGP public
+check ovn-nbctl lrp-add lr1 lr1_public 00:de:ad:ff:00:01 2001:db8:bbbb::1/64
+check ovn-nbctl lrp-add lr1 lr1_s1 00:de:ad:fe:00:02 2001:db8:aaaa:3::1/64
+check ovn-nbctl lrp-set-gateway-chassis lr1_public chassis-1
+
+check ovn-nbctl ls-add s1
+# s1 - lr1
+check ovn-nbctl lsp-add s1 s1_lr1
+check ovn-nbctl lsp-set-type s1_lr1 router
+check ovn-nbctl lsp-set-addresses s1_lr1 "00:de:ad:fe:00:02 2001:db8:aaaa:3::1"
+check ovn-nbctl lsp-set-options s1_lr1 router-port=lr1_s1
+
+# s1 - backend vm1
+check ovn-nbctl lsp-add s1 vm1
+check ovn-nbctl lsp-set-addresses vm1 "00:de:ad:01:00:01 2001:db8:aaaa:3::101"
+
+# s1 - backend vm2
+check ovn-nbctl lsp-add s1 vm2
+check ovn-nbctl lsp-set-addresses vm2 "00:de:ad:01:00:02 2001:db8:aaaa:3::102"
+
+# s1 - backend vm3
+check ovn-nbctl lsp-add s1 vm3
+check ovn-nbctl lsp-set-addresses vm3 "00:de:ad:01:00:03 2001:db8:aaaa:3::103"
+
+# Add the lr1 DGP ts1 to the public switch
+check ovn-nbctl lsp-add public public_lr1_ts1
+check ovn-nbctl lsp-set-type public_lr1_ts1 router
+check ovn-nbctl lsp-set-addresses public_lr1_ts1 router
+check ovn-nbctl lsp-set-options public_lr1_ts1 router-port=lr1-ts1 nat-addresses=router
+
+# Add the lr1 DGP ts2 to the public switch
+check ovn-nbctl lsp-add public public_lr1_ts2
+check ovn-nbctl lsp-set-type public_lr1_ts2 router
+check ovn-nbctl lsp-set-addresses public_lr1_ts2 router
+check ovn-nbctl lsp-set-options public_lr1_ts2 router-port=lr1-ts2 nat-addresses=router
+
+# Add the lr1 DGP public to the public switch
+check ovn-nbctl lsp-add public public_lr1
+check ovn-nbctl lsp-set-type public_lr1 router
+check ovn-nbctl lsp-set-addresses public_lr1 router
+check ovn-nbctl lsp-set-options public_lr1 router-port=lr1_public nat-addresses=router
+
+# Create the Load Balancer lb1
+check ovn-nbctl --wait=sb lb-add lb1 "2001:db8:cccc::1" "2001:db8:aaaa:3::103,2001:db8:aaaa:3::102,2001:db8:aaaa:3::101"
+
+# Set use_stateless_nat to true
+check ovn-nbctl --wait=sb set load_balancer lb1 options:use_stateless_nat=true
+
+# Associate load balancer to s1
+check ovn-nbctl ls-lb-add s1 lb1
+check ovn-nbctl --wait=sb sync
+
+ovn-sbctl dump-flows s1 > s1flows
+AT_CAPTURE_FILE([s1flows])
+
+AT_CHECK([grep "ls_in_pre_stateful" s1flows | ovn_strip_lflows | grep "2001:db8:cccc::1"], [0], [dnl
+ table=??(ls_in_pre_stateful ), priority=120 , match=(reg0[[2]] == 1 && ip6.dst == 2001:db8:cccc::1), action=(xxreg1 = 2001:db8:cccc::1; ct_lb_mark;)
+])
+AT_CHECK([grep "ls_in_lb" s1flows | ovn_strip_lflows | grep "2001:db8:cccc::1"], [0], [dnl
+ table=??(ls_in_lb ), priority=110 , match=(ct.new && ip6.dst == 2001:db8:cccc::1), action=(ct_lb_mark(backends=2001:db8:aaaa:3::103,2001:db8:aaaa:3::102,2001:db8:aaaa:3::101);)
+])
+
+# Associate load balancer to lr1 with DGP
+check ovn-nbctl lr-lb-add lr1 lb1
+check ovn-nbctl --wait=sb sync
+
+ovn-sbctl dump-flows lr1 > lr1flows
+AT_CAPTURE_FILE([lr1flows])
+
+# Check stateless NAT rules for load balancer with multiple DGP
+# 1. Check if the backend IPs are in the ipX.dst action
+AT_CHECK([grep "lr_in_dnat" lr1flows | ovn_strip_lflows | grep "2001:db8:cccc::1"], [0], [dnl
+ table=??(lr_in_dnat ), priority=110 , match=(ct.new && !ct.rel && ip6 && ip6.dst == 2001:db8:cccc::1 && is_chassis_resident("cr-lr1-ts1")), action=(ip6.dst=2001:db8:aaaa:3::103;ct_lb_mark(backends=2001:db8:aaaa:3::103,2001:db8:aaaa:3::102,2001:db8:aaaa:3::101);)
+ table=??(lr_in_dnat ), priority=110 , match=(ct.new && !ct.rel && ip6 && ip6.dst == 2001:db8:cccc::1 && is_chassis_resident("cr-lr1-ts2")), action=(ip6.dst=2001:db8:aaaa:3::103;ct_lb_mark(backends=2001:db8:aaaa:3::103,2001:db8:aaaa:3::102,2001:db8:aaaa:3::101);)
+ table=??(lr_in_dnat ), priority=110 , match=(ct.new && !ct.rel && ip6 && ip6.dst == 2001:db8:cccc::1 && is_chassis_resident("cr-lr1_public")), action=(ip6.dst=2001:db8:aaaa:3::103;ct_lb_mark(backends=2001:db8:aaaa:3::103,2001:db8:aaaa:3::102,2001:db8:aaaa:3::101);)
+])
+
+# 2. Check if the DGP ports are in the match with action next
+AT_CHECK([grep "lr_out_undnat" lr1flows | ovn_strip_lflows], [0], [dnl
+ table=??(lr_out_undnat ), priority=0 , match=(1), action=(next;)
+ table=??(lr_out_undnat ), priority=120 , match=(ip6 && ((ip6.src == 2001:db8:aaaa:3::103) || (ip6.src == 2001:db8:aaaa:3::102) || (ip6.src == 2001:db8:aaaa:3::101)) && (inport == "lr1-ts1" || outport == "lr1-ts1") && is_chassis_resident("cr-lr1-ts1") && tcp), action=(next;)
+ table=??(lr_out_undnat ), priority=120 , match=(ip6 && ((ip6.src == 2001:db8:aaaa:3::103) || (ip6.src == 2001:db8:aaaa:3::102) || (ip6.src == 2001:db8:aaaa:3::101)) && (inport == "lr1-ts2" || outport == "lr1-ts2") && is_chassis_resident("cr-lr1-ts2") && tcp), action=(next;)
+ table=??(lr_out_undnat ), priority=120 , match=(ip6 && ((ip6.src == 2001:db8:aaaa:3::103) || (ip6.src == 2001:db8:aaaa:3::102) || (ip6.src == 2001:db8:aaaa:3::101)) && (inport == "lr1_public" || outport == "lr1_public") && is_chassis_resident("cr-lr1_public") && tcp), action=(next;)
+])
+
+# 3. Check if the VIP IP is in the ipX.src action
+AT_CHECK([grep "lr_out_snat" lr1flows | ovn_strip_lflows], [0], [dnl
+ table=??(lr_out_snat ), priority=0 , match=(1), action=(next;)
+ table=??(lr_out_snat ), priority=120 , match=(nd_ns), action=(next;)
+ table=??(lr_out_snat ), priority=160 , match=(ip6 && ((ip6.src == 2001:db8:aaaa:3::103) || (ip6.src == 2001:db8:aaaa:3::102) || (ip6.src == 2001:db8:aaaa:3::101)) && (inport == "lr1-ts1" || outport == "lr1-ts1") && is_chassis_resident("cr-lr1-ts1") && tcp), action=(ip6.src=2001:db8:cccc::1; next;)
+ table=??(lr_out_snat ), priority=160 , match=(ip6 && ((ip6.src == 2001:db8:aaaa:3::103) || (ip6.src == 2001:db8:aaaa:3::102) || (ip6.src == 2001:db8:aaaa:3::101)) && (inport == "lr1-ts2" || outport == "lr1-ts2") && is_chassis_resident("cr-lr1-ts2") && tcp), action=(ip6.src=2001:db8:cccc::1; next;)
+ table=??(lr_out_snat ), priority=160 , match=(ip6 && ((ip6.src == 2001:db8:aaaa:3::103) || (ip6.src == 2001:db8:aaaa:3::102) || (ip6.src == 2001:db8:aaaa:3::101)) && (inport == "lr1_public" || outport == "lr1_public") && is_chassis_resident("cr-lr1_public") && tcp), action=(ip6.src=2001:db8:cccc::1; next;)
+])
+
+AT_CLEANUP
+])
+
+OVN_FOR_EACH_NORTHD_NO_HV_PARALLELIZATION([
+AT_SETUP([Load balancer with Distributed Gateway Ports (DGP)])
+ovn_start
+
+check ovn-nbctl ls-add public
+check ovn-nbctl lr-add lr1
+
+# lr1 DGP ts1
+check ovn-nbctl ls-add ts1
+check ovn-nbctl lrp-add lr1 lr1-ts1 00:00:01:02:03:04 172.16.10.1/24
+check ovn-nbctl lrp-set-gateway-chassis lr1-ts1 chassis-1
+
+# lr1 DGP ts2
+check ovn-nbctl ls-add ts2
+check ovn-nbctl lrp-add lr1 lr1-ts2 00:00:01:02:03:05 172.16.20.1/24
+check ovn-nbctl lrp-set-gateway-chassis lr1-ts2 chassis-1
+
+# lr1 DGP public
+check ovn-nbctl lrp-add lr1 lr1_public 00:de:ad:ff:00:01 173.16.0.1/16
+check ovn-nbctl lrp-add lr1 lr1_s1 00:de:ad:fe:00:02 172.16.0.1/24
+check ovn-nbctl lrp-set-gateway-chassis lr1_public chassis-1
+
+check ovn-nbctl ls-add s1
+# s1 - lr1
+check ovn-nbctl lsp-add s1 s1_lr1
+check ovn-nbctl lsp-set-type s1_lr1 router
+check ovn-nbctl lsp-set-addresses s1_lr1 "00:de:ad:fe:00:02 172.16.0.1"
+check ovn-nbctl lsp-set-options s1_lr1 router-port=lr1_s1
+
+# s1 - backend vm1
+check ovn-nbctl lsp-add s1 vm1
+check ovn-nbctl lsp-set-addresses vm1 "00:de:ad:01:00:01 172.16.0.101"
+
+# s1 - backend vm2
+check ovn-nbctl lsp-add s1 vm2
+check ovn-nbctl lsp-set-addresses vm2 "00:de:ad:01:00:02 172.16.0.102"
+
+# s1 - backend vm3
+check ovn-nbctl lsp-add s1 vm3
+check ovn-nbctl lsp-set-addresses vm3 "00:de:ad:01:00:03 172.16.0.103"
+
+# Add the lr1 DGP ts1 to the public switch
+check ovn-nbctl lsp-add public public_lr1_ts1
+check ovn-nbctl lsp-set-type public_lr1_ts1 router
+check ovn-nbctl lsp-set-addresses public_lr1_ts1 router
+check ovn-nbctl lsp-set-options public_lr1_ts1 router-port=lr1-ts1 nat-addresses=router
+
+# Add the lr1 DGP ts2 to the public switch
+check ovn-nbctl lsp-add public public_lr1_ts2
+check ovn-nbctl lsp-set-type public_lr1_ts2 router
+check ovn-nbctl lsp-set-addresses public_lr1_ts2 router
+check ovn-nbctl lsp-set-options public_lr1_ts2 router-port=lr1-ts2 nat-addresses=router
+
+# Add the lr1 DGP public to the public switch
+check ovn-nbctl lsp-add public public_lr1
+check ovn-nbctl lsp-set-type public_lr1 router
+check ovn-nbctl lsp-set-addresses public_lr1 router
+check ovn-nbctl lsp-set-options public_lr1 router-port=lr1_public nat-addresses=router
+
+# Create the Load Balancer lb1
+check ovn-nbctl --wait=sb lb-add lb1 "30.0.0.1" "172.16.0.103,172.16.0.102,172.16.0.101"
+
+# Associate load balancer to s1
+check ovn-nbctl ls-lb-add s1 lb1
+check ovn-nbctl --wait=sb sync
+
+ovn-sbctl dump-flows s1 > s1flows
+AT_CAPTURE_FILE([s1flows])
+
+AT_CHECK([grep "ls_in_pre_stateful" s1flows | ovn_strip_lflows | grep "30.0.0.1"], [0], [dnl
+ table=??(ls_in_pre_stateful ), priority=120 , match=(reg0[[2]] == 1 && ip4.dst == 30.0.0.1), action=(reg1 = 30.0.0.1; ct_lb_mark;)
+])
+AT_CHECK([grep "ls_in_lb" s1flows | ovn_strip_lflows | grep "30.0.0.1"], [0], [dnl
+ table=??(ls_in_lb ), priority=110 , match=(ct.new && ip4.dst == 30.0.0.1), action=(ct_lb_mark(backends=172.16.0.103,172.16.0.102,172.16.0.101);)
+])
+
+# Associate load balancer to lr1 with DGP
+check ovn-nbctl lr-lb-add lr1 lb1
+check ovn-nbctl --wait=sb sync
+
+ovn-sbctl dump-flows lr1 > lr1flows
+AT_CAPTURE_FILE([lr1flows])
+
+AT_CHECK([grep "lr_in_dnat" lr1flows | ovn_strip_lflows | grep "30.0.0.1"], [0], [dnl
+ table=??(lr_in_dnat ), priority=110 , match=(ct.new && !ct.rel && ip4 && ip4.dst == 30.0.0.1 && is_chassis_resident("cr-lr1-ts1")), action=(ct_lb_mark(backends=172.16.0.103,172.16.0.102,172.16.0.101);)
+ table=??(lr_in_dnat ), priority=110 , match=(ct.new && !ct.rel && ip4 && ip4.dst == 30.0.0.1 && is_chassis_resident("cr-lr1-ts2")), action=(ct_lb_mark(backends=172.16.0.103,172.16.0.102,172.16.0.101);)
+ table=??(lr_in_dnat ), priority=110 , match=(ct.new && !ct.rel && ip4 && ip4.dst == 30.0.0.1 && is_chassis_resident("cr-lr1_public")), action=(ct_lb_mark(backends=172.16.0.103,172.16.0.102,172.16.0.101);)
+])
+
+AT_CLEANUP
+])