From patchwork Tue Nov 3 15:51:04 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dumitru Ceara X-Patchwork-Id: 1393166 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=openvswitch.org (client-ip=140.211.166.138; helo=whitealder.osuosl.org; envelope-from=ovs-dev-bounces@openvswitch.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=redhat.com header.i=@redhat.com header.a=rsa-sha256 header.s=mimecast20190719 header.b=dQb8V0Lr; dkim-atps=neutral Received: from whitealder.osuosl.org (smtp1.osuosl.org [140.211.166.138]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4CQZ5M5BxTz9sT6 for ; Wed, 4 Nov 2020 02:51:19 +1100 (AEDT) Received: from localhost (localhost [127.0.0.1]) by whitealder.osuosl.org (Postfix) with ESMTP id 3E00586C51; Tue, 3 Nov 2020 15:51:18 +0000 (UTC) X-Virus-Scanned: amavisd-new at osuosl.org Received: from whitealder.osuosl.org ([127.0.0.1]) by localhost (.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 3Sk0s0tEcLfK; Tue, 3 Nov 2020 15:51:15 +0000 (UTC) Received: from lists.linuxfoundation.org (lf-lists.osuosl.org [140.211.9.56]) by whitealder.osuosl.org (Postfix) with ESMTP id D83FE86A08; Tue, 3 Nov 2020 15:51:15 +0000 (UTC) Received: from lf-lists.osuosl.org (localhost [127.0.0.1]) by lists.linuxfoundation.org (Postfix) with ESMTP id CEFF9C088B; Tue, 3 Nov 2020 15:51:15 +0000 (UTC) X-Original-To: dev@openvswitch.org Delivered-To: ovs-dev@lists.linuxfoundation.org Received: from hemlock.osuosl.org (smtp2.osuosl.org [140.211.166.133]) by lists.linuxfoundation.org (Postfix) with ESMTP id 2FA37C0051 for ; Tue, 3 Nov 2020 15:51:15 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by hemlock.osuosl.org (Postfix) with ESMTP id 2BF0586F82 for ; Tue, 3 Nov 2020 15:51:15 +0000 (UTC) X-Virus-Scanned: amavisd-new at osuosl.org Received: from hemlock.osuosl.org ([127.0.0.1]) by localhost (.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id Fw2Kw0GGsZ0B for ; Tue, 3 Nov 2020 15:51:14 +0000 (UTC) X-Greylist: domain auto-whitelisted by SQLgrey-1.7.6 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [63.128.21.124]) by hemlock.osuosl.org (Postfix) with ESMTPS id 1924A86229 for ; Tue, 3 Nov 2020 15:51:14 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1604418672; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=x8ZysPyUSEXlaZv/i3nUXUyTEuuypFxoO3YqNX1nSbY=; b=dQb8V0LrZApOqDDsJB6d6hyLwqK2ntTCBiRS6DAhofUkVNloqgaVd3mVttzl8ysmadxxf2 8IJlMJp3qO8h3FwZIn0CZGUkeLvoZI7LSPpnIweAuoGKhfUx9w6ou5Z6RFgIpCrouKF6/r HLkSaR9O3kTKtLAgM3UtKXlu4zQUxxg= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-472-vP_1LhrTO7qQQSNazpv-OQ-1; Tue, 03 Nov 2020 10:51:10 -0500 X-MC-Unique: vP_1LhrTO7qQQSNazpv-OQ-1 Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.phx2.redhat.com [10.5.11.15]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id D75C3803F44 for ; Tue, 3 Nov 2020 15:51:09 +0000 (UTC) Received: from dceara.remote.csb (ovpn-114-168.ams2.redhat.com [10.36.114.168]) by smtp.corp.redhat.com (Postfix) with ESMTP id D0D8C7C3F8 for ; Tue, 3 Nov 2020 15:51:08 +0000 (UTC) From: Dumitru Ceara To: dev@openvswitch.org Date: Tue, 3 Nov 2020 16:51:04 +0100 Message-Id: <20201103155101.19095.8338.stgit@dceara.remote.csb> In-Reply-To: <20201103155045.19095.14237.stgit@dceara.remote.csb> References: <20201103155045.19095.14237.stgit@dceara.remote.csb> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.15 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=dceara@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Subject: [ovs-dev] [PATCH ovn 1/2] pinctrl: Directly udpate MAC_Bindings created by self originated GARPs. X-BeenThere: ovs-dev@openvswitch.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: ovs-dev-bounces@openvswitch.org Sender: "dev" OVN uses GARPs to announce changes to locally owned NAT addresses. This is OK when updating upstream router caches but is unnecessary for updating OVN logical router MAC_Bindings. ovn-controller already has the information required for directly updating/inserting the MAC_Bindings that would be created by neighbor routers. This also has the advantage that GARPs don't necessarily need to be flooded in the complete L2 domain of the switch and that router patch ports can be skipped. An upcoming commit will take advantage of this. Suggested-by: Lorenzo Bianconi Fixes: 81e928526b8a ("ovn-controller: Inject GARPs to logical switch pipeline to update neighbors") Signed-off-by: Dumitru Ceara --- controller/pinctrl.c | 108 ++++++++++++++++++++++++++++++++++++++++---------- 1 file changed, 87 insertions(+), 21 deletions(-) diff --git a/controller/pinctrl.c b/controller/pinctrl.c index 08adc10..795b52f 100644 --- a/controller/pinctrl.c +++ b/controller/pinctrl.c @@ -189,7 +189,7 @@ static void run_put_mac_bindings( struct ovsdb_idl_txn *ovnsb_idl_txn, struct ovsdb_idl_index *sbrec_datapath_binding_by_key, struct ovsdb_idl_index *sbrec_port_binding_by_key, - struct ovsdb_idl_index *sbrec_mac_binding_by_lport_ip) + struct ovsdb_idl_index *sbrec_mac_binding_by_lport_ip); OVS_REQUIRES(pinctrl_mutex); static void wait_put_mac_bindings(struct ovsdb_idl_txn *ovnsb_idl_txn); static void flush_put_mac_bindings(void); @@ -200,8 +200,10 @@ static void init_send_garps_rarps(void); static void destroy_send_garps_rarps(void); static void send_garp_rarp_wait(long long int send_garp_rarp_time); static void send_garp_rarp_prepare( + struct ovsdb_idl_txn *ovnsb_idl_txn, struct ovsdb_idl_index *sbrec_port_binding_by_datapath, struct ovsdb_idl_index *sbrec_port_binding_by_name, + struct ovsdb_idl_index *sbrec_mac_binding_by_lport_ip, const struct ovsrec_bridge *, const struct sbrec_chassis *, const struct hmap *local_datapaths, @@ -3146,8 +3148,9 @@ pinctrl_run(struct ovsdb_idl_txn *ovnsb_idl_txn, sbrec_mac_binding_by_lport_ip); run_put_vport_bindings(ovnsb_idl_txn, sbrec_datapath_binding_by_key, sbrec_port_binding_by_key, chassis); - send_garp_rarp_prepare(sbrec_port_binding_by_datapath, - sbrec_port_binding_by_name, br_int, chassis, + send_garp_rarp_prepare(ovnsb_idl_txn, sbrec_port_binding_by_datapath, + sbrec_port_binding_by_name, + sbrec_mac_binding_by_lport_ip, br_int, chassis, local_datapaths, active_tunnels); prepare_ipv6_ras(local_datapaths); prepare_ipv6_prefixd(ovnsb_idl_txn, sbrec_port_binding_by_name, @@ -3838,6 +3841,65 @@ mac_binding_lookup(struct ovsdb_idl_index *sbrec_mac_binding_by_lport_ip, return retval; } +/* Update or add an IP-MAC binding for 'logical_port'. */ +static void +mac_binding_add(struct ovsdb_idl_txn *ovnsb_idl_txn, + struct ovsdb_idl_index *sbrec_mac_binding_by_lport_ip, + const char *logical_port, + const struct sbrec_datapath_binding *dp, + struct eth_addr ea, const char *ip) +{ + /* Convert ethernet argument to string form for database. */ + char mac_string[ETH_ADDR_STRLEN + 1]; + snprintf(mac_string, sizeof mac_string, + ETH_ADDR_FMT, ETH_ADDR_ARGS(ea)); + + const struct sbrec_mac_binding *b = + mac_binding_lookup(sbrec_mac_binding_by_lport_ip, logical_port, ip); + if (!b) { + b = sbrec_mac_binding_insert(ovnsb_idl_txn); + sbrec_mac_binding_set_logical_port(b, logical_port); + sbrec_mac_binding_set_ip(b, ip); + sbrec_mac_binding_set_mac(b, mac_string); + sbrec_mac_binding_set_datapath(b, dp); + } else if (strcmp(b->mac, mac_string)) { + sbrec_mac_binding_set_mac(b, mac_string); + } +} + +/* Simulate the effect of a GARP on local datapaths, i.e., create MAC_Bindings + * on peer router datapaths. + */ +static void +send_garp_locally(struct ovsdb_idl_txn *ovnsb_idl_txn, + struct ovsdb_idl_index *sbrec_mac_binding_by_lport_ip, + const struct hmap *local_datapaths, + const struct sbrec_port_binding *in_pb, + struct eth_addr ea, ovs_be32 ip) +{ + const struct local_datapath *ldp = + get_local_datapath(local_datapaths, in_pb->datapath->tunnel_key); + + ovs_assert(ldp); + for (size_t i = 0; i < ldp->n_peer_ports; i++) { + const struct sbrec_port_binding *local = ldp->peer_ports[i].local; + const struct sbrec_port_binding *remote = ldp->peer_ports[i].remote; + + /* Skip "ingress" port. */ + if (local == in_pb) { + continue; + } + + struct ds ip_s = DS_EMPTY_INITIALIZER; + + ip_format_masked(ip, OVS_BE32_MAX, &ip_s); + mac_binding_add(ovnsb_idl_txn, sbrec_mac_binding_by_lport_ip, + remote->logical_port, remote->datapath, + ea, ds_cstr(&ip_s)); + ds_destroy(&ip_s); + } +} + static void run_put_mac_binding(struct ovsdb_idl_txn *ovnsb_idl_txn, struct ovsdb_idl_index *sbrec_datapath_binding_by_key, @@ -3864,20 +3926,8 @@ run_put_mac_binding(struct ovsdb_idl_txn *ovnsb_idl_txn, struct ds ip_s = DS_EMPTY_INITIALIZER; ipv6_format_mapped(&pmb->ip_key, &ip_s); - - /* Update or add an IP-MAC binding for this logical port. */ - const struct sbrec_mac_binding *b = - mac_binding_lookup(sbrec_mac_binding_by_lport_ip, pb->logical_port, - ds_cstr(&ip_s)); - if (!b) { - b = sbrec_mac_binding_insert(ovnsb_idl_txn); - sbrec_mac_binding_set_logical_port(b, pb->logical_port); - sbrec_mac_binding_set_ip(b, ds_cstr(&ip_s)); - sbrec_mac_binding_set_mac(b, mac_string); - sbrec_mac_binding_set_datapath(b, pb->datapath); - } else if (strcmp(b->mac, mac_string)) { - sbrec_mac_binding_set_mac(b, mac_string); - } + mac_binding_add(ovnsb_idl_txn, sbrec_mac_binding_by_lport_ip, + pb->logical_port, pb->datapath, pmb->mac, ds_cstr(&ip_s)); ds_destroy(&ip_s); } @@ -4019,7 +4069,10 @@ add_garp_rarp(const char *name, const struct eth_addr ea, ovs_be32 ip, /* Add or update a vif for which GARPs need to be announced. */ static void -send_garp_rarp_update(const struct sbrec_port_binding *binding_rec, +send_garp_rarp_update(struct ovsdb_idl_txn *ovnsb_idl_txn, + struct ovsdb_idl_index *sbrec_mac_binding_by_lport_ip, + const struct hmap *local_datapaths, + const struct sbrec_port_binding *binding_rec, struct shash *nat_addresses) { volatile struct garp_rarp_data *garp_rarp = NULL; @@ -4045,6 +4098,11 @@ send_garp_rarp_update(const struct sbrec_port_binding *binding_rec, laddrs->ipv4_addrs[i].addr, binding_rec->datapath->tunnel_key, binding_rec->tunnel_key); + send_garp_locally(ovnsb_idl_txn, + sbrec_mac_binding_by_lport_ip, + local_datapaths, binding_rec, laddrs->ea, + laddrs->ipv4_addrs[i].addr); + } free(name); } @@ -4080,6 +4138,10 @@ send_garp_rarp_update(const struct sbrec_port_binding *binding_rec, laddrs.ea, ip, binding_rec->datapath->tunnel_key, binding_rec->tunnel_key); + if (ip) { + send_garp_locally(ovnsb_idl_txn, sbrec_mac_binding_by_lport_ip, + local_datapaths, binding_rec, laddrs.ea, ip); + } destroy_lport_addresses(&laddrs); break; @@ -5356,8 +5418,10 @@ send_garp_rarp_run(struct rconn *swconn, long long int *send_garp_rarp_time) /* Called by pinctrl_run(). Runs with in the main ovn-controller * thread context. */ static void -send_garp_rarp_prepare(struct ovsdb_idl_index *sbrec_port_binding_by_datapath, +send_garp_rarp_prepare(struct ovsdb_idl_txn *ovnsb_idl_txn, + struct ovsdb_idl_index *sbrec_port_binding_by_datapath, struct ovsdb_idl_index *sbrec_port_binding_by_name, + struct ovsdb_idl_index *sbrec_mac_binding_by_lport_ip, const struct ovsrec_bridge *br_int, const struct sbrec_chassis *chassis, const struct hmap *local_datapaths, @@ -5396,7 +5460,8 @@ send_garp_rarp_prepare(struct ovsdb_idl_index *sbrec_port_binding_by_datapath, const struct sbrec_port_binding *pb = lport_lookup_by_name( sbrec_port_binding_by_name, iface_id); if (pb) { - send_garp_rarp_update(pb, &nat_addresses); + send_garp_rarp_update(ovnsb_idl_txn, sbrec_mac_binding_by_lport_ip, + local_datapaths, pb, &nat_addresses); } } @@ -5406,7 +5471,8 @@ send_garp_rarp_prepare(struct ovsdb_idl_index *sbrec_port_binding_by_datapath, const struct sbrec_port_binding *pb = lport_lookup_by_name(sbrec_port_binding_by_name, gw_port); if (pb) { - send_garp_rarp_update(pb, &nat_addresses); + send_garp_rarp_update(ovnsb_idl_txn, sbrec_mac_binding_by_lport_ip, + local_datapaths, pb, &nat_addresses); } } From patchwork Tue Nov 3 15:51:21 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Dumitru Ceara X-Patchwork-Id: 1393167 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=openvswitch.org (client-ip=140.211.166.138; helo=whitealder.osuosl.org; envelope-from=ovs-dev-bounces@openvswitch.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=redhat.com header.i=@redhat.com header.a=rsa-sha256 header.s=mimecast20190719 header.b=Ei6aLweJ; dkim-atps=neutral Received: from whitealder.osuosl.org (smtp1.osuosl.org [140.211.166.138]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4CQZ5w1KRZz9s0b for ; Wed, 4 Nov 2020 02:51:48 +1100 (AEDT) Received: from localhost (localhost [127.0.0.1]) by whitealder.osuosl.org (Postfix) with ESMTP id BC4D686C62; Tue, 3 Nov 2020 15:51:46 +0000 (UTC) X-Virus-Scanned: amavisd-new at osuosl.org Received: from whitealder.osuosl.org ([127.0.0.1]) by localhost (.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id a-rhizj8fXm3; Tue, 3 Nov 2020 15:51:41 +0000 (UTC) Received: from lists.linuxfoundation.org (lf-lists.osuosl.org [140.211.9.56]) by whitealder.osuosl.org (Postfix) with ESMTP id E77CE86BE5; Tue, 3 Nov 2020 15:51:40 +0000 (UTC) Received: from lf-lists.osuosl.org (localhost [127.0.0.1]) by lists.linuxfoundation.org (Postfix) with ESMTP id C8728C088B; Tue, 3 Nov 2020 15:51:40 +0000 (UTC) X-Original-To: dev@openvswitch.org Delivered-To: ovs-dev@lists.linuxfoundation.org Received: from fraxinus.osuosl.org (smtp4.osuosl.org [140.211.166.137]) by lists.linuxfoundation.org (Postfix) with ESMTP id ECE9CC0051 for ; Tue, 3 Nov 2020 15:51:39 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by fraxinus.osuosl.org (Postfix) with ESMTP id DC56D85ADC for ; Tue, 3 Nov 2020 15:51:39 +0000 (UTC) X-Virus-Scanned: amavisd-new at osuosl.org Received: from fraxinus.osuosl.org ([127.0.0.1]) by localhost (.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id V8bDnyrv3FmM for ; Tue, 3 Nov 2020 15:51:37 +0000 (UTC) X-Greylist: domain auto-whitelisted by SQLgrey-1.7.6 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [63.128.21.124]) by fraxinus.osuosl.org (Postfix) with ESMTPS id 8B20385C86 for ; Tue, 3 Nov 2020 15:51:37 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1604418696; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=nFv+j5xlSdfWsuPcu9y8ZdgyUCYv3Avoapdk1wu2KjI=; b=Ei6aLweJ7tPyHlI6YJ4fPG080rDQW1sXmnEEF14SJ6khcrkhXhFubqwaufPNwK4CSgx0BH z9kEOtJrw2CgqT6Hc3NayKZpu0SLYc0ATA5wJLV+AZfGlywi+Z8xRAr/2uNrJinIOLIzG/ hZwdj9v7sUOHIv5KMBYeqqFIdOaERhY= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-548-djT_Ska9OBOcDoDwW6BPvQ-1; Tue, 03 Nov 2020 10:51:34 -0500 X-MC-Unique: djT_Ska9OBOcDoDwW6BPvQ-1 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.phx2.redhat.com [10.5.11.22]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id AB0605F9C5 for ; Tue, 3 Nov 2020 15:51:33 +0000 (UTC) Received: from dceara.remote.csb (ovpn-114-168.ams2.redhat.com [10.36.114.168]) by smtp.corp.redhat.com (Postfix) with ESMTP id 8114B100239A for ; Tue, 3 Nov 2020 15:51:31 +0000 (UTC) From: Dumitru Ceara To: dev@openvswitch.org Date: Tue, 3 Nov 2020 16:51:21 +0100 Message-Id: <20201103155117.19095.21177.stgit@dceara.remote.csb> In-Reply-To: <20201103155045.19095.14237.stgit@dceara.remote.csb> References: <20201103155045.19095.14237.stgit@dceara.remote.csb> User-Agent: StGit/0.17.1-dirty MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.84 on 10.5.11.22 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=dceara@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Subject: [ovs-dev] [PATCH ovn 2/2] ovn-northd: Limit self originated ARP/ND broadcast domain. X-BeenThere: ovs-dev@openvswitch.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: ovs-dev-bounces@openvswitch.org Sender: "dev" Considering the following large scale deployment: external-network -- public-logical-switch -- router-1 -- sw1 -- VIF-1 -- router-2 -- sw2 -- VIF-2 ... -- router-n -- swn -- VIF-n To avoid hitting the max number of OVS resubmits (4K currently) OVN already restricted the broadcast domain for ARP/ND requests received from the external network and targeting an OVN-owned IP (e.g., owned by router-x). Such packets are not flooded in the broadcast domain of the public logical switch and instead are forwarded only to the port connecting router-x. However, the max number of OVS resubmits can also be hit in the following scenarios: - router-x tries to resolve an IP owned by router-y (e.g., VIF-x trying to reach VIF-y via floating IP). - router-x tries to resolve an IP owned by the external network. Because ARP/ND requests in the above cases are originated by OVN router ports they were being flooded in the complete broadcast domain of the public logical switch. Instead, we now create a new Multicast_Group for each logical switch and add all non-router ports to it. ARP/ND requests are now forwarded to the router port that owns the IP (if any) and then flooded in this restricted MC_FLOOD_L2 broadcast domain. Fixes: 32f5ebb06226 ("ovn-northd: Limit ARP/ND broadcast domain whenever possible.") Signed-off-by: Dumitru Ceara --- lib/mcast-group-index.h | 1 + northd/ovn-northd.8.xml | 19 ++++++----- northd/ovn-northd.c | 84 +++++++++++++++++++++++++++++------------------ tests/ovn.at | 50 +++++++++++++--------------- 4 files changed, 86 insertions(+), 68 deletions(-) diff --git a/lib/mcast-group-index.h b/lib/mcast-group-index.h index ba995ba..72af117 100644 --- a/lib/mcast-group-index.h +++ b/lib/mcast-group-index.h @@ -30,6 +30,7 @@ enum ovn_mcast_tunnel_keys { OVN_MCAST_MROUTER_FLOOD_TUNNEL_KEY, OVN_MCAST_MROUTER_STATIC_TUNNEL_KEY, OVN_MCAST_STATIC_TUNNEL_KEY, + OVN_MCAST_FLOOD_L2_TUNNEL_KEY, OVN_MIN_IP_MULTICAST, OVN_MAX_IP_MULTICAST = OVN_MAX_MULTICAST, }; diff --git a/northd/ovn-northd.8.xml b/northd/ovn-northd.8.xml index 9b96ce9..b37cecd 100644 --- a/northd/ovn-northd.8.xml +++ b/northd/ovn-northd.8.xml @@ -1367,18 +1367,19 @@ output;
  • - Priority-80 flows for each port connected to a logical router - matching self originated GARP/ARP request/ND packets. These packets - are flooded to the MC_FLOOD which contains all logical - ports. + Priority-80 flows for each IP address/VIP/NAT address owned by a + router port connected to the switch. These flows match ARP requests + and ND packets for the specific IP addresses. Matched packets are + forwarded only to the router that owns the IP address and to the + MC_FLOOD_L2 multicast group which contains all non-router + logical ports.
  • - Priority-75 flows for each IP address/VIP/NAT address owned by a - router port connected to the switch. These flows match ARP requests - and ND packets for the specific IP addresses. Matched packets are - forwarded only to the router that owns the IP address and, if - present, to the localnet port of the logical switch. + Priority-75 flows for each port connected to a logical router + matching self originated ARP request/ND packets. These packets + are flooded to the MC_FLOOD_L2 which contains all + non-router logical ports.
  • diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c index 8800a3d..dbe5fa6 100644 --- a/northd/ovn-northd.c +++ b/northd/ovn-northd.c @@ -1559,6 +1559,12 @@ lsp_is_external(const struct nbrec_logical_switch_port *nbsp) } static bool +lsp_is_router(const struct nbrec_logical_switch_port *nbsp) +{ + return !strcmp(nbsp->type, "router"); +} + +static bool lrport_is_enabled(const struct nbrec_logical_router_port *lrport) { return !lrport->enabled || *lrport->enabled; @@ -2488,7 +2494,7 @@ join_logical_ports(struct northd_context *ctx, * to their peers. */ struct ovn_port *op; HMAP_FOR_EACH (op, key_node, ports) { - if (op->nbsp && !strcmp(op->nbsp->type, "router") && !op->derived) { + if (op->nbsp && lsp_is_router(op->nbsp) && !op->derived) { const char *peer_name = smap_get(&op->nbsp->options, "router-port"); if (!peer_name) { continue; @@ -3105,7 +3111,7 @@ ovn_port_update_sbrec(struct northd_context *ctx, sbrec_port_binding_set_nat_addresses(op->sb, NULL, 0); } else { - if (strcmp(op->nbsp->type, "router")) { + if (!lsp_is_router(op->nbsp)) { uint32_t queue_id = smap_get_int( &op->sb->options, "qdisc_queue_id", 0); bool has_qos = port_has_qos_params(&op->nbsp->options); @@ -3843,6 +3849,10 @@ static const struct multicast_group mc_static = static const struct multicast_group mc_unknown = { MC_UNKNOWN, OVN_MCAST_UNKNOWN_TUNNEL_KEY }; +#define MC_FLOOD_L2 "_MC_flood_l2" +static const struct multicast_group mc_flood_l2 = + { MC_FLOOD_L2, OVN_MCAST_FLOOD_L2_TUNNEL_KEY }; + static bool multicast_group_equal(const struct multicast_group *a, const struct multicast_group *b) @@ -6394,12 +6404,11 @@ build_lswitch_rport_arp_req_self_orig_flow(struct ovn_port *op, sset_add(&all_eth_addrs, nat->external_mac); } - - /* Self originated (G)ARP requests/ND need to be flooded as usual. - * Determine that packets are self originated by also matching on - * source MAC. Matching on ingress port is not reliable in case this - * is a VLAN-backed network. - * Priority: 80. + /* Self originated ARP requests/ND need to be flooded to the L2 domain + * (except on router ports). Determine that packets are self originated + * by also matching on source MAC. Matching on ingress port is not + * reliable in case this is a VLAN-backed network. + * Priority: 75. */ const char *eth_addr; @@ -6415,7 +6424,7 @@ build_lswitch_rport_arp_req_self_orig_flow(struct ovn_port *op, ds_cstr(ð_src)); ovn_lflow_add(lflows, od, S_SWITCH_IN_L2_LKUP, priority, ds_cstr(&match), - "outport = \""MC_FLOOD"\"; output;"); + "outport = \""MC_FLOOD_L2"\"; output;"); sset_destroy(&all_eth_addrs); ds_destroy(ð_src); @@ -6461,14 +6470,16 @@ build_lswitch_rport_arp_req_flow_for_ip(struct sset *ips, ds_chomp(&match, ','); ds_put_cstr(&match, "}"); - /* Send a the packet only to the router pipeline and skip flooding it - * in the broadcast domain (except for the localnet port). + /* Send a the packet to the router pipeline. If the switch has non-router + * ports then flood it there as well. */ - for (size_t i = 0; i < od->n_localnet_ports; i++) { - ds_put_format(&actions, "clone { outport = %s; output; }; ", - od->localnet_ports[i]->json_key); + if (od->n_router_ports != od->nbs->n_ports) { + ds_put_format(&actions, "clone {outport = %s; output; }; " + "outport = \""MC_FLOOD_L2"\"; output;", + patch_op->json_key); + } else { + ds_put_format(&actions, "outport = %s; output;", patch_op->json_key); } - ds_put_format(&actions, "outport = %s; output;", patch_op->json_key); ovn_lflow_add_with_hint(lflows, od, S_SWITCH_IN_L2_LKUP, priority, ds_cstr(&match), ds_cstr(&actions), stage_hint); @@ -6498,14 +6509,9 @@ build_lswitch_rport_arp_req_flows(struct ovn_port *op, return; } - /* Self originated (G)ARP requests/ND need to be flooded as usual. - * Priority: 80. - */ - build_lswitch_rport_arp_req_self_orig_flow(op, 80, sw_od, lflows); - /* Forward ARP requests for owned IP addresses (L3, VIP, NAT) only to this * router port. - * Priority: 75. + * Priority: 80. */ struct sset all_ips_v4 = SSET_INITIALIZER(&all_ips_v4); struct sset all_ips_v6 = SSET_INITIALIZER(&all_ips_v6); @@ -6580,17 +6586,28 @@ build_lswitch_rport_arp_req_flows(struct ovn_port *op, if (!sset_is_empty(&all_ips_v4)) { build_lswitch_rport_arp_req_flow_for_ip(&all_ips_v4, AF_INET, sw_op, - sw_od, 75, lflows, + sw_od, 80, lflows, stage_hint); } if (!sset_is_empty(&all_ips_v6)) { build_lswitch_rport_arp_req_flow_for_ip(&all_ips_v6, AF_INET6, sw_op, - sw_od, 75, lflows, + sw_od, 80, lflows, stage_hint); } sset_destroy(&all_ips_v4); sset_destroy(&all_ips_v6); + + /* Self originated ARP requests/ND need to be flooded as usual. + * + * However, if the switch doesn't have any non-router ports we shouldn't + * even try to flood. + * + * Priority: 75. + */ + if (sw_od->n_router_ports != sw_od->nbs->n_ports) { + build_lswitch_rport_arp_req_self_orig_flow(op, 75, sw_od, lflows); + } } static void @@ -6930,7 +6947,7 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports, * - port type is localport */ if (check_lsp_is_up && - !lsp_is_up(op->nbsp) && strcmp(op->nbsp->type, "router") && + !lsp_is_up(op->nbsp) && !lsp_is_router(op->nbsp) && strcmp(op->nbsp->type, "localport")) { continue; } @@ -7005,8 +7022,7 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports, "flags.loopback = 1; " "output; " "};", - !strcmp(op->nbsp->type, "router") ? - "nd_na_router" : "nd_na", + lsp_is_router(op->nbsp) ? "nd_na_router" : "nd_na", op->lsp_addrs[i].ea_s, op->lsp_addrs[i].ipv6_addrs[j].addr_s, op->lsp_addrs[i].ipv6_addrs[j].addr_s, @@ -7088,7 +7104,7 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports, continue; } - if (!lsp_is_enabled(op->nbsp) || !strcmp(op->nbsp->type, "router")) { + if (!lsp_is_enabled(op->nbsp) || lsp_is_router(op->nbsp)) { /* Don't add the DHCP flows if the port is not enabled or if the * port is a router port. */ continue; @@ -7347,7 +7363,7 @@ build_lswitch_flows(struct hmap *datapaths, struct hmap *ports, * broadcast flooding of ARP/ND requests in table 19. We direct the * requests only to the router port that owns the IP address. */ - if (!strcmp(op->nbsp->type, "router")) { + if (lsp_is_router(op->nbsp)) { build_lswitch_rport_arp_req_flows(op->peer, op->od, op, lflows, &op->nbsp->header_); } @@ -10638,7 +10654,7 @@ build_arp_resolve_flows_for_lrouter_port( */ build_lrouter_drop_own_dest(op, S_ROUTER_IN_ARP_RESOLVE, 1, true, lflows); - } else if (op->od->n_router_ports && strcmp(op->nbsp->type, "router") + } else if (op->od->n_router_ports && !lsp_is_router(op->nbsp) && strcmp(op->nbsp->type, "virtual")) { /* This is a logical switch port that backs a VM or a container. * Extract its addresses. For each of the address, go through all @@ -10722,7 +10738,7 @@ build_arp_resolve_flows_for_lrouter_port( } } } - } else if (op->od->n_router_ports && strcmp(op->nbsp->type, "router") + } else if (op->od->n_router_ports && !lsp_is_router(op->nbsp) && !strcmp(op->nbsp->type, "virtual")) { /* This is a virtual port. Add ARP replies for the virtual ip with * the mac of the present active virtual parent. @@ -10826,7 +10842,7 @@ build_arp_resolve_flows_for_lrouter_port( } } } - } else if (!strcmp(op->nbsp->type, "router")) { + } else if (lsp_is_router(op->nbsp)) { /* This is a logical switch port that connects to a router. */ /* The peer of this switch port is the router port for which @@ -11992,6 +12008,10 @@ build_mcast_groups(struct northd_context *ctx, } else if (op->nbsp && lsp_is_enabled(op->nbsp)) { ovn_multicast_add(mcast_groups, &mc_flood, op); + if (!lsp_is_router(op->nbsp)) { + ovn_multicast_add(mcast_groups, &mc_flood_l2, op); + } + /* If this port is connected to a multicast router then add it * to the MC_MROUTER_FLOOD group. */ @@ -12435,7 +12455,7 @@ handle_port_binding_changes(struct northd_context *ctx, struct hmap *ports, continue; } - bool up = (sb->chassis || !strcmp(op->nbsp->type, "router")); + bool up = (sb->chassis || lsp_is_router(op->nbsp)); if (!op->nbsp->up || *op->nbsp->up != up) { nbrec_logical_switch_port_set_up(op->nbsp, &up, 1); } diff --git a/tests/ovn.at b/tests/ovn.at index 4be4840..434ae87 100644 --- a/tests/ovn.at +++ b/tests/ovn.at @@ -3593,7 +3593,7 @@ test_ip() { done } -# test_arp INPORT SHA SPA TPA FLOOD [REPLY_HA] +# test_arp INPORT SHA SPA TPA [REPLY_HA] # # Causes a packet to be received on INPORT. The packet is an ARP # request with SHA, SPA, and TPA as specified. If REPLY_HA is provided, then @@ -3605,25 +3605,21 @@ test_ip() { # SPA and TPA are each 8 hex digits. test_arp() { echo "$@" - local inport=$1 sha=$2 spa=$3 tpa=$4 flood=$5 reply_ha=$6 + local inport=$1 sha=$2 spa=$3 tpa=$4 reply_ha=$5 local request=ffffffffffff${sha}08060001080006040001${sha}${spa}ffffffffffff${tpa} hv=hv`vif_to_hv $inport` as $hv ovs-appctl netdev-dummy/receive vif$inport $request as $hv ovs-appctl ofproto/trace br-int in_port=$inport $request > trace # Expect to receive the broadcast ARP on the other logical switch ports if - # IP address is not configured on the switch patch port or on the router - # port (i.e, $flood == 1). + # IP address is not configured to the switch patch port. local i=`vif_to_ls $inport` local j k for j in 1 2 3; do for k in 1 2 3; do - # Skip ingress port. - if test $i$j$k == $inport; then - continue - fi - - if test X$flood == X1; then + # 192.168.33.254 is configured to the switch patch port for lrp33, + # so no ARP flooding expected for it. + if test $i$j$k != $inport && test $tpa != `ip_to_hex 192 168 33 254`; then echo $request >> $i$j$k.expected fi done @@ -3762,9 +3758,9 @@ for i in 1 2 3; do otherip=`ip_to_hex 192 168 $i$j 55` # Some other IP in subnet externalip=`ip_to_hex 1 2 3 4` # Some other IP not in subnet - test_arp $i$j$k $smac $sip $rip 0 $rmac #4 - test_arp $i$j$k $smac $otherip $rip 0 $rmac #5 - test_arp $i$j$k $smac $sip $otherip 1 #6 + test_arp $i$j$k $smac $sip $rip $rmac #4 + test_arp $i$j$k $smac $otherip $rip $rmac #5 + test_arp $i$j$k $smac $sip $otherip #6 # When rip is 192.168.33.254, ARP request from externalip won't be # filtered, because 192.168.33.254 is configured to switch peer port @@ -3773,7 +3769,7 @@ for i in 1 2 3; do if test $i = 3 && test $j = 3; then lrp33_rsp=$rmac fi - test_arp $i$j$k $smac $externalip $rip 0 $lrp33_rsp #7 + test_arp $i$j$k $smac $externalip $rip $lrp33_rsp #7 # MAC binding should be learned from ARP request. host_mac_pretty=f0:00:00:00:0$i:$j$k @@ -19819,7 +19815,7 @@ match_r1_metadata="metadata=0x${r1_dp_key}" send_arp_request 1 0 ${src_mac} $(ip_to_hex 10 0 0 254) $(ip_to_hex 10 0 0 1) # Verify that the ARP request is sent only to rtr1. -match_arp_req="priority=75.*${match_sw_metadata}.*arp_tpa=10.0.0.1,arp_op=1" +match_arp_req="priority=80.*${match_sw_metadata}.*arp_tpa=10.0.0.1,arp_op=1" match_send_rtr1="load:0x${r1_tnl_key}->NXM_NX_REG15" match_send_rtr2="load:0x${r2_tnl_key}->NXM_NX_REG15" @@ -19843,7 +19839,7 @@ dst_ipv6=00100000000000000000000000000001 send_nd_ns 1 0 ${src_mac} ${src_ipv6} ${dst_ipv6} 751d # Verify that the ND_NS is sent only to rtr1. -match_nd_ns="priority=75.*${match_sw_metadata}.*icmp_type=135.*nd_target=10::1" +match_nd_ns="priority=80.*${match_sw_metadata}.*icmp_type=135.*nd_target=10::1" as hv1 OVS_WAIT_UNTIL([ @@ -19875,7 +19871,7 @@ check ovn-nbctl --wait=hv sync send_arp_request 1 0 ${src_mac} $(ip_to_hex 10 0 0 254) $(ip_to_hex 10 0 0 11) # Verify that the ARP request is sent only to rtr1. -match_arp_req="priority=75.*${match_sw_metadata}.*arp_tpa=10.0.0.11,arp_op=1" +match_arp_req="priority=80.*${match_sw_metadata}.*arp_tpa=10.0.0.11,arp_op=1" match_send_rtr1="load:0x${r1_tnl_key}->NXM_NX_REG15" match_send_rtr2="load:0x${r2_tnl_key}->NXM_NX_REG15" @@ -19899,7 +19895,7 @@ dst_ipv6=00100000000000000000000000000011 send_nd_ns 1 0 ${src_mac} ${src_ipv6} ${dst_ipv6} 751d # Verify that the ND_NS is sent only to rtr1. -match_nd_ns="priority=75.*${match_sw_metadata}.*icmp_type=135.*nd_target=10::11" +match_nd_ns="priority=80.*${match_sw_metadata}.*icmp_type=135.*nd_target=10::11" as hv1 OVS_WAIT_UNTIL([ @@ -19943,7 +19939,7 @@ AT_CAPTURE_FILE([sbflows2]) # - 10.0.0.22, 10::22 - LB VIPs. # - 10.0.0.222, 10::222 - DNAT IPs. as hv1 -AT_CHECK([ovs-ofctl dump-flows br-int | grep -E "priority=75,.*${match_sw_metadata}" | grep -oE "arp_tpa=[[0-9.]]+" | sort], [0], [dnl +AT_CHECK([ovs-ofctl dump-flows br-int | grep -E "priority=80,.*${match_sw_metadata}" | grep -oE "arp_tpa=[[0-9.]]+" | sort], [0], [dnl arp_tpa=10.0.0.1 arp_tpa=10.0.0.11 arp_tpa=10.0.0.111 @@ -19953,7 +19949,7 @@ arp_tpa=10.0.0.2 arp_tpa=10.0.0.22 arp_tpa=10.0.0.222 ]) -AT_CHECK([ovs-ofctl dump-flows br-int | grep -E "priority=75,.*${match_sw_metadata}" | grep -oE "nd_target=[[0-9a-f:]]+" | sort], [0], [dnl +AT_CHECK([ovs-ofctl dump-flows br-int | grep -E "priority=80,.*${match_sw_metadata}" | grep -oE "nd_target=[[0-9a-f:]]+" | sort], [0], [dnl nd_target=10::1 nd_target=10::11 nd_target=10::111 @@ -19969,10 +19965,10 @@ nd_target=fe80::200:ff:fe00:200 # For sw1-rtr1: # - 20.0.0.1, 20::1, fe80::200:1ff:fe00:0 - interface IPs. as hv1 -AT_CHECK([ovs-ofctl dump-flows br-int | grep -E "priority=75,.*${match_sw1_metadata}" | grep -oE "arp_tpa=[[0-9.]]+" | sort], [0], [dnl +AT_CHECK([ovs-ofctl dump-flows br-int | grep -E "priority=80,.*${match_sw1_metadata}" | grep -oE "arp_tpa=[[0-9.]]+" | sort], [0], [dnl arp_tpa=20.0.0.1 ]) -AT_CHECK([ovs-ofctl dump-flows br-int | grep -E "priority=75,.*${match_sw1_metadata}" | grep -oE "nd_target=[[0-9a-f:]]+" | sort], [0], [dnl +AT_CHECK([ovs-ofctl dump-flows br-int | grep -E "priority=80,.*${match_sw1_metadata}" | grep -oE "nd_target=[[0-9a-f:]]+" | sort], [0], [dnl nd_target=20::1 nd_target=fe80::200:1ff:fe00:0 ]) @@ -19984,13 +19980,13 @@ nd_target=fe80::200:1ff:fe00:0 # - 00:00:00:01:00:00 - dnat_and_snat external MAC. # - 00:00:00:02:00:00 - dnat_and_snat external MAC. as hv1 -AT_CHECK([ovs-ofctl dump-flows br-int | grep -E "priority=80,.*${match_sw_metadata}.*arp_op=1" | grep -oE "dl_src=[[0-9a-f:]]+" | sort], [0], [dnl +AT_CHECK([ovs-ofctl dump-flows br-int | grep -E "priority=75,.*${match_sw_metadata}.*arp_op=1" | grep -oE "dl_src=[[0-9a-f:]]+" | sort], [0], [dnl dl_src=00:00:00:00:01:00 dl_src=00:00:00:00:02:00 dl_src=00:00:00:01:00:00 dl_src=00:00:00:02:00:00 ]) -AT_CHECK([ovs-ofctl dump-flows br-int | grep -E "priority=80,.*${match_sw_metadata}.*icmp_type=135" | grep -oE "dl_src=[[0-9a-f:]]+" | sort], [0], [dnl +AT_CHECK([ovs-ofctl dump-flows br-int | grep -E "priority=75,.*${match_sw_metadata}.*icmp_type=135" | grep -oE "dl_src=[[0-9a-f:]]+" | sort], [0], [dnl dl_src=00:00:00:00:01:00 dl_src=00:00:00:00:02:00 dl_src=00:00:00:01:00:00 @@ -20000,7 +19996,7 @@ dl_src=00:00:00:02:00:00 # Self originated ARP/NS with SMACs owned by rtr1-sw1 should be flooded: # - 00:00:01:00:00:00 - interface MAC. as hv1 -AT_CHECK([ovs-ofctl dump-flows br-int | grep -E "priority=80,.*${match_sw1_metadata}.*arp_op=1" | grep -oE "dl_src=[[0-9a-f:]]+" | sort], [0], [dnl +AT_CHECK([ovs-ofctl dump-flows br-int | grep -E "priority=75,.*${match_sw1_metadata}.*arp_op=1" | grep -oE "dl_src=[[0-9a-f:]]+" | sort], [0], [dnl dl_src=00:00:01:00:00:00 ]) @@ -20008,7 +20004,7 @@ dl_src=00:00:01:00:00:00 send_arp_request 1 0 ${src_mac} $(ip_to_hex 10 0 0 254) $(ip_to_hex 10 0 0 111) # Verify that the ARP request is sent only to rtr1. -match_arp_req="priority=75.*${match_sw_metadata}.*arp_tpa=10.0.0.111,arp_op=1" +match_arp_req="priority=80.*${match_sw_metadata}.*arp_tpa=10.0.0.111,arp_op=1" match_send_rtr1="load:0x${r1_tnl_key}->NXM_NX_REG15" match_send_rtr2="load:0x${r2_tnl_key}->NXM_NX_REG15" @@ -20072,7 +20068,7 @@ dst_ipv6=00100000000000000000000000000111 send_nd_ns 1 0 ${src_mac} ${src_ipv6} ${dst_ipv6} 751d # Verify that the ND_NS is sent only to rtr1. -match_nd_ns="priority=75.*${match_sw_metadata}.*icmp_type=135.*nd_target=10::111" +match_nd_ns="priority=80.*${match_sw_metadata}.*icmp_type=135.*nd_target=10::111" as hv1 OVS_WAIT_UNTIL([