From patchwork Thu Jan 11 15:30:28 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Numan Siddique X-Patchwork-Id: 1885676 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=openvswitch.org (client-ip=140.211.166.136; helo=smtp3.osuosl.org; envelope-from=ovs-dev-bounces@openvswitch.org; receiver=patchwork.ozlabs.org) Received: from smtp3.osuosl.org (smtp3.osuosl.org [140.211.166.136]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4T9pYP4bgmz1yPp for ; Fri, 12 Jan 2024 02:31:37 +1100 (AEDT) Received: from localhost (localhost [127.0.0.1]) by smtp3.osuosl.org (Postfix) with ESMTP id 101076F559; Thu, 11 Jan 2024 15:31:35 +0000 (UTC) DKIM-Filter: OpenDKIM Filter v2.11.0 smtp3.osuosl.org 101076F559 X-Virus-Scanned: amavisd-new at osuosl.org Received: from smtp3.osuosl.org ([127.0.0.1]) by localhost (smtp3.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id Aj-l9KBhoOhR; Thu, 11 Jan 2024 15:31:31 +0000 (UTC) Received: from lists.linuxfoundation.org (lf-lists.osuosl.org [IPv6:2605:bc80:3010:104::8cd3:938]) by smtp3.osuosl.org (Postfix) with ESMTPS id 1D51F6F4B9; Thu, 11 Jan 2024 15:31:30 +0000 (UTC) DKIM-Filter: OpenDKIM Filter v2.11.0 smtp3.osuosl.org 1D51F6F4B9 Received: from lf-lists.osuosl.org (localhost [127.0.0.1]) by lists.linuxfoundation.org (Postfix) with ESMTP id E8349C0077; Thu, 11 Jan 2024 15:31:29 +0000 (UTC) X-Original-To: dev@openvswitch.org Delivered-To: ovs-dev@lists.linuxfoundation.org Received: from smtp3.osuosl.org (smtp3.osuosl.org [IPv6:2605:bc80:3010::136]) by lists.linuxfoundation.org (Postfix) with ESMTP id 539C6C0037 for ; Thu, 11 Jan 2024 15:31:29 +0000 (UTC) Received: from localhost (localhost [127.0.0.1]) by smtp3.osuosl.org (Postfix) with ESMTP id 8FE0B61B3D for ; Thu, 11 Jan 2024 15:31:16 +0000 (UTC) DKIM-Filter: OpenDKIM Filter v2.11.0 smtp3.osuosl.org 8FE0B61B3D X-Virus-Scanned: amavisd-new at osuosl.org Received: from smtp3.osuosl.org ([127.0.0.1]) by localhost (smtp3.osuosl.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id b-v1Xvm3PxAM for ; Thu, 11 Jan 2024 15:31:13 +0000 (UTC) Received: from relay3-d.mail.gandi.net (relay3-d.mail.gandi.net [IPv6:2001:4b98:dc4:8::223]) by smtp3.osuosl.org (Postfix) with ESMTPS id 37E3B61B31 for ; Thu, 11 Jan 2024 15:31:13 +0000 (UTC) DKIM-Filter: OpenDKIM Filter v2.11.0 smtp3.osuosl.org 37E3B61B31 Received: by mail.gandi.net (Postfix) with ESMTPSA id 97A4D6000A; Thu, 11 Jan 2024 15:31:10 +0000 (UTC) From: numans@ovn.org To: dev@openvswitch.org Date: Thu, 11 Jan 2024 10:30:28 -0500 Message-ID: <20240111153028.2790191-1-numans@ovn.org> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240111152752.2789854-1-numans@ovn.org> References: <20240111152752.2789854-1-numans@ovn.org> MIME-Version: 1.0 X-GND-Sasl: numans@ovn.org Subject: [ovs-dev] [PATCH ovn v5 07/16] northd: Add a new node 'ls_stateful'. X-BeenThere: ovs-dev@openvswitch.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: ovs-dev-bounces@openvswitch.org Sender: "dev" From: Numan Siddique This new engine now maintains the load balancer and ACL data of a logical switch which was earlier part of northd engine node data. The main inputs to this engine are: - northd node - Port group node A record for each logical switch is maintained in the 'ls_statefuls' hmap table and this record stores the below data which was earlier part of 'struct ovn_datapath'. - bool has_stateful_acl; - bool has_lb_vip; - bool has_acls; - uint64_t max_acl_tier; This engine node becomes an input to 'lflow' node. If a logical switch is configured with - DHCP for all or some of its ports - stateful ACLs and/or load balancers ovn-northd was adding flows to commit the DHCP response generated by ovn-controller to conntrack. With this patch we don't commit the DHCP response to conntrack anymore for the following reasons: 1. There is no need to actually commit the response. 2. Since stateful information is moved to 'ls_stateful' node, it becomes ineffecient to access ls_stateful's data ('has_lb_vip' and 'has_acls') in build_dhcpv4_options_flows(). Signed-off-by: Numan Siddique Acked-by: Dumitru Ceara --- lib/stopwatch-names.h | 1 + northd/automake.mk | 2 + northd/en-lflow.c | 4 + northd/en-lr-stateful.c | 6 +- northd/en-lr-stateful.h | 2 + northd/en-ls-stateful.c | 440 +++++++++++++++++++++++++++++++++++++++ northd/en-ls-stateful.h | 81 +++++++ northd/en-northd.c | 2 +- northd/en-port-group.h | 3 + northd/inc-proc-northd.c | 7 + northd/northd.c | 335 +++++++++++++++-------------- northd/northd.h | 29 ++- northd/ovn-northd.c | 1 + 13 files changed, 753 insertions(+), 160 deletions(-) create mode 100644 northd/en-ls-stateful.c create mode 100644 northd/en-ls-stateful.h diff --git a/lib/stopwatch-names.h b/lib/stopwatch-names.h index e5e41fbfd8..c865f125be 100644 --- a/lib/stopwatch-names.h +++ b/lib/stopwatch-names.h @@ -31,5 +31,6 @@ #define SYNC_METERS_RUN_STOPWATCH_NAME "sync_meters_run" #define LR_NAT_RUN_STOPWATCH_NAME "lr_nat_run" #define LR_STATEFUL_RUN_STOPWATCH_NAME "lr_stateful" +#define LS_STATEFUL_RUN_STOPWATCH_NAME "ls_stateful" #endif diff --git a/northd/automake.mk b/northd/automake.mk index b886356c9c..a178541759 100644 --- a/northd/automake.mk +++ b/northd/automake.mk @@ -28,6 +28,8 @@ northd_ovn_northd_SOURCES = \ northd/en-lr-nat.h \ northd/en-lr-stateful.c \ northd/en-lr-stateful.h \ + northd/en-ls-stateful.c \ + northd/en-ls-stateful.h \ northd/inc-proc-northd.c \ northd/inc-proc-northd.h \ northd/ipam.c \ diff --git a/northd/en-lflow.c b/northd/en-lflow.c index a3a0d62f15..eb6b2a8666 100644 --- a/northd/en-lflow.c +++ b/northd/en-lflow.c @@ -21,6 +21,7 @@ #include "en-lflow.h" #include "en-lr-nat.h" #include "en-lr-stateful.h" +#include "en-ls-stateful.h" #include "en-northd.h" #include "en-meters.h" @@ -44,6 +45,8 @@ lflow_get_input_data(struct engine_node *node, engine_get_input_data("sync_meters", node); struct ed_type_lr_stateful *lr_stateful_data = engine_get_input_data("lr_stateful", node); + struct ed_type_ls_stateful *ls_stateful_data = + engine_get_input_data("ls_stateful", node); lflow_input->nbrec_bfd_table = EN_OVSDB_GET(engine_get_input("NB_bfd", node)); @@ -67,6 +70,7 @@ lflow_get_input_data(struct engine_node *node, lflow_input->lr_ports = &northd_data->lr_ports; lflow_input->ls_port_groups = &pg_data->ls_port_groups; lflow_input->lr_stateful_table = &lr_stateful_data->table; + lflow_input->ls_stateful_table = &ls_stateful_data->table; lflow_input->meter_groups = &sync_meters_data->meter_groups; lflow_input->lb_datapaths_map = &northd_data->lb_datapaths_map; lflow_input->svc_monitor_map = &northd_data->svc_monitor_map; diff --git a/northd/en-lr-stateful.c b/northd/en-lr-stateful.c index dedfdf6747..4287aaddc9 100644 --- a/northd/en-lr-stateful.c +++ b/northd/en-lr-stateful.c @@ -295,7 +295,9 @@ lr_stateful_lb_data_handler(struct engine_node *node, void *data_) /* For all the modified lr_stateful records (re)build the * vip nats. */ HMAPX_FOR_EACH (hmapx_node, &data->trk_data.crupdated) { - lr_stateful_build_vip_nats(hmapx_node->data); + struct lr_stateful_record *lr_stateful_rec = hmapx_node->data; + lr_stateful_build_vip_nats(lr_stateful_rec); + lr_stateful_rec->has_lb_vip = od_has_lb_vip(lr_stateful_rec->od); } engine_set_node_state(node, EN_UPDATED); @@ -510,6 +512,8 @@ lr_stateful_record_init(struct lr_stateful_record *lr_stateful_rec, if (nbr->n_nat) { lr_stateful_build_vip_nats(lr_stateful_rec); } + + lr_stateful_rec->has_lb_vip = od_has_lb_vip(lr_stateful_rec->od); } static struct lr_stateful_input diff --git a/northd/en-lr-stateful.h b/northd/en-lr-stateful.h index 74029c9a6c..5d647d8863 100644 --- a/northd/en-lr-stateful.h +++ b/northd/en-lr-stateful.h @@ -49,6 +49,8 @@ struct lr_stateful_record { const struct ovn_datapath *od; const struct lr_nat_record *lrnat_rec; + bool has_lb_vip; + /* Load Balancer vIPs relevant for this datapath. */ struct ovn_lb_ip_set *lb_ips; diff --git a/northd/en-ls-stateful.c b/northd/en-ls-stateful.c new file mode 100644 index 0000000000..5fa305bc29 --- /dev/null +++ b/northd/en-ls-stateful.c @@ -0,0 +1,440 @@ +/* + * Copyright (c) 2023, Red Hat, Inc. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at: + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +#include + +#include +#include +#include + +/* OVS includes */ +#include "include/openvswitch/hmap.h" +#include "lib/bitmap.h" +#include "lib/socket-util.h" +#include "lib/uuidset.h" +#include "openvswitch/util.h" +#include "openvswitch/vlog.h" +#include "stopwatch.h" + +/* OVN includes */ +#include "en-lb-data.h" +#include "en-ls-stateful.h" +#include "en-port-group.h" +#include "lib/inc-proc-eng.h" +#include "lib/lb.h" +#include "lib/ovn-nb-idl.h" +#include "lib/ovn-sb-idl.h" +#include "lib/ovn-util.h" +#include "lib/stopwatch-names.h" +#include "northd.h" + +VLOG_DEFINE_THIS_MODULE(en_ls_stateful); + +/* Static function declarations. */ +static void ls_stateful_table_init(struct ls_stateful_table *); +static void ls_stateful_table_clear(struct ls_stateful_table *); +static void ls_stateful_table_destroy(struct ls_stateful_table *); +static struct ls_stateful_record *ls_stateful_table_find_( + const struct ls_stateful_table *, const struct nbrec_logical_switch *); +static void ls_stateful_table_build(struct ls_stateful_table *, + const struct ovn_datapaths *ls_datapaths, + const struct ls_port_group_table *); + +static struct ls_stateful_input ls_stateful_get_input_data( + struct engine_node *); + +static struct ls_stateful_record *ls_stateful_record_create( + struct ls_stateful_table *, + const struct ovn_datapath *, + const struct ls_port_group_table *); +static void ls_stateful_record_destroy(struct ls_stateful_record *); +static void ls_stateful_record_init( + struct ls_stateful_record *, + const struct ovn_datapath *, + const struct ls_port_group *, + const struct ls_port_group_table *); +static void ls_stateful_record_reinit( + struct ls_stateful_record *, + const struct ls_port_group *, + const struct ls_port_group_table *); +static bool ls_has_lb_vip(const struct ovn_datapath *); +static void ls_stateful_record_set_acl_flags( + struct ls_stateful_record *, const struct ovn_datapath *, + const struct ls_port_group *, const struct ls_port_group_table *); +static bool ls_stateful_record_set_acl_flags_(struct ls_stateful_record *, + struct nbrec_acl **, + size_t n_acls); +static struct ls_stateful_input ls_stateful_get_input_data( + struct engine_node *); + +struct ls_stateful_input { + const struct ls_port_group_table *ls_port_groups; + const struct ovn_datapaths *ls_datapaths; +}; + +/* public functions. */ +const struct ls_stateful_record * +ls_stateful_table_find(const struct ls_stateful_table *table, + const struct nbrec_logical_switch *nbs) +{ + return ls_stateful_table_find_(table, nbs); +} + +void * +en_ls_stateful_init(struct engine_node *node OVS_UNUSED, + struct engine_arg *arg OVS_UNUSED) +{ + struct ed_type_ls_stateful *data = xzalloc(sizeof *data); + ls_stateful_table_init(&data->table); + hmapx_init(&data->trk_data.crupdated); + return data; +} + +void +en_ls_stateful_cleanup(void *data_) +{ + struct ed_type_ls_stateful *data = data_; + ls_stateful_table_destroy(&data->table); + hmapx_destroy(&data->trk_data.crupdated); +} + +void +en_ls_stateful_clear_tracked_data(void *data_) +{ + struct ed_type_ls_stateful *data = data_; + hmapx_clear(&data->trk_data.crupdated); +} + +void +en_ls_stateful_run(struct engine_node *node, void *data_) +{ + struct ls_stateful_input input_data = ls_stateful_get_input_data(node); + struct ed_type_ls_stateful *data = data_; + + stopwatch_start(LS_STATEFUL_RUN_STOPWATCH_NAME, time_msec()); + + ls_stateful_table_clear(&data->table); + ls_stateful_table_build(&data->table, input_data.ls_datapaths, + input_data.ls_port_groups); + + stopwatch_stop(LS_STATEFUL_RUN_STOPWATCH_NAME, time_msec()); + engine_set_node_state(node, EN_UPDATED); +} + +/* Handler functions. */ +bool +ls_stateful_northd_handler(struct engine_node *node, void *data_) +{ + struct northd_data *northd_data = engine_get_input_data("northd", node); + if (!northd_has_tracked_data(&northd_data->trk_data)) { + return false; + } + + if (!northd_has_ls_lbs_in_tracked_data(&northd_data->trk_data) && + !northd_has_ls_acls_in_tracked_data(&northd_data->trk_data)) { + return true; + } + + struct northd_tracked_data *nd_changes = &northd_data->trk_data; + struct ls_stateful_input input_data = ls_stateful_get_input_data(node); + struct ed_type_ls_stateful *data = data_; + struct hmapx_node *hmapx_node; + + struct hmapx changed_stateful_od = HMAPX_INITIALIZER(&changed_stateful_od); + HMAPX_FOR_EACH (hmapx_node, &nd_changes->ls_with_changed_lbs) { + hmapx_add(&changed_stateful_od, hmapx_node->data); + } + + HMAPX_FOR_EACH (hmapx_node, &nd_changes->ls_with_changed_acls) { + hmapx_add(&changed_stateful_od, hmapx_node->data); + } + + HMAPX_FOR_EACH (hmapx_node, &changed_stateful_od) { + const struct ovn_datapath *od = hmapx_node->data; + + struct ls_stateful_record *ls_stateful_rec = ls_stateful_table_find_( + &data->table, od->nbs); + ovs_assert(ls_stateful_rec); + ls_stateful_record_reinit(ls_stateful_rec, NULL, + input_data.ls_port_groups); + + /* Add the ls_stateful_rec to the tracking data. */ + hmapx_add(&data->trk_data.crupdated, ls_stateful_rec); + } + + if (ls_stateful_has_tracked_data(&data->trk_data)) { + engine_set_node_state(node, EN_UPDATED); + } + + hmapx_destroy(&changed_stateful_od); + + return true; +} + +bool +ls_stateful_port_group_handler(struct engine_node *node, void *data_) +{ + struct port_group_data *pg_data = + engine_get_input_data("port_group", node); + + if (pg_data->ls_port_groups_sets_changed) { + return false; + } + + /* port_group engine node doesn't provide the tracking data yet. + * Loop through all the ls port groups and update the ls_stateful_rec. + * This is still better than returning false. */ + struct ls_stateful_input input_data = ls_stateful_get_input_data(node); + struct ed_type_ls_stateful *data = data_; + const struct ls_port_group *ls_pg; + + LS_PORT_GROUP_TABLE_FOR_EACH (ls_pg, input_data.ls_port_groups) { + struct ls_stateful_record *ls_stateful_rec = + ls_stateful_table_find_(&data->table, ls_pg->nbs); + ovs_assert(ls_stateful_rec); + + bool had_stateful_acl = ls_stateful_rec->has_stateful_acl; + uint64_t max_acl_tier = ls_stateful_rec->max_acl_tier; + bool had_acls = ls_stateful_rec->has_acls; + bool modified = false; + + ls_stateful_record_reinit(ls_stateful_rec, ls_pg, + input_data.ls_port_groups); + + if ((had_stateful_acl != ls_stateful_rec->has_stateful_acl) + || (had_acls != ls_stateful_rec->has_acls) + || max_acl_tier != ls_stateful_rec->max_acl_tier) { + modified = true; + } + + if (modified) { + /* Add the ls_stateful_rec to the tracking data. */ + hmapx_add(&data->trk_data.crupdated, ls_stateful_rec); + } + } + + if (ls_stateful_has_tracked_data(&data->trk_data)) { + engine_set_node_state(node, EN_UPDATED); + } + return true; +} + +/* static functions. */ +static void +ls_stateful_table_init(struct ls_stateful_table *table) +{ + *table = (struct ls_stateful_table) { + .entries = HMAP_INITIALIZER(&table->entries), + }; +} + +static void +ls_stateful_table_destroy(struct ls_stateful_table *table) +{ + ls_stateful_table_clear(table); + hmap_destroy(&table->entries); +} + +static void +ls_stateful_table_clear(struct ls_stateful_table *table) +{ + struct ls_stateful_record *ls_stateful_rec; + HMAP_FOR_EACH_POP (ls_stateful_rec, key_node, &table->entries) { + ls_stateful_record_destroy(ls_stateful_rec); + } +} + +static void +ls_stateful_table_build(struct ls_stateful_table *table, + const struct ovn_datapaths *ls_datapaths, + const struct ls_port_group_table *ls_pgs) +{ + const struct ovn_datapath *od; + HMAP_FOR_EACH (od, key_node, &ls_datapaths->datapaths) { + ls_stateful_record_create(table, od, ls_pgs); + } +} + +struct ls_stateful_record * +ls_stateful_table_find_(const struct ls_stateful_table *table, + const struct nbrec_logical_switch *nbs) +{ + struct ls_stateful_record *rec; + + HMAP_FOR_EACH_WITH_HASH (rec, key_node, + uuid_hash(&nbs->header_.uuid), &table->entries) { + if (nbs == rec->od->nbs) { + return rec; + } + } + return NULL; +} + +static struct ls_stateful_record * +ls_stateful_record_create(struct ls_stateful_table *table, + const struct ovn_datapath *od, + const struct ls_port_group_table *ls_pgs) +{ + struct ls_stateful_record *ls_stateful_rec = + xzalloc(sizeof *ls_stateful_rec); + ls_stateful_rec->od = od; + ls_stateful_record_init(ls_stateful_rec, od, NULL, ls_pgs); + + hmap_insert(&table->entries, &ls_stateful_rec->key_node, + uuid_hash(&ls_stateful_rec->od->nbs->header_.uuid)); + + return ls_stateful_rec; +} + +static void +ls_stateful_record_destroy(struct ls_stateful_record *ls_stateful_rec) +{ + free(ls_stateful_rec); +} + +static void +ls_stateful_record_init(struct ls_stateful_record *ls_stateful_rec, + const struct ovn_datapath *od, + const struct ls_port_group *ls_pg, + const struct ls_port_group_table *ls_pgs) +{ + ls_stateful_rec->has_lb_vip = ls_has_lb_vip(od); + ls_stateful_record_set_acl_flags(ls_stateful_rec, od, ls_pg, ls_pgs); +} + +static void +ls_stateful_record_reinit(struct ls_stateful_record *ls_stateful_rec, + const struct ls_port_group *ls_pg, + const struct ls_port_group_table *ls_pgs) +{ + ls_stateful_record_init(ls_stateful_rec, ls_stateful_rec->od, + ls_pg, ls_pgs); +} + +static bool +lb_has_vip(const struct nbrec_load_balancer *lb) +{ + return !smap_is_empty(&lb->vips); +} + +static bool +lb_group_has_vip(const struct nbrec_load_balancer_group *lb_group) +{ + for (size_t i = 0; i < lb_group->n_load_balancer; i++) { + if (lb_has_vip(lb_group->load_balancer[i])) { + return true; + } + } + return false; +} + +static bool +ls_has_lb_vip(const struct ovn_datapath *od) +{ + for (size_t i = 0; i < od->nbs->n_load_balancer; i++) { + if (lb_has_vip(od->nbs->load_balancer[i])) { + return true; + } + } + + for (size_t i = 0; i < od->nbs->n_load_balancer_group; i++) { + if (lb_group_has_vip(od->nbs->load_balancer_group[i])) { + return true; + } + } + return false; +} + +static void +ls_stateful_record_set_acl_flags(struct ls_stateful_record *ls_stateful_rec, + const struct ovn_datapath *od, + const struct ls_port_group *ls_pg, + const struct ls_port_group_table *ls_pgs) +{ + ls_stateful_rec->has_stateful_acl = false; + ls_stateful_rec->max_acl_tier = 0; + ls_stateful_rec->has_acls = false; + + if (ls_stateful_record_set_acl_flags_(ls_stateful_rec, od->nbs->acls, + od->nbs->n_acls)) { + return; + } + + if (!ls_pg) { + ls_pg = ls_port_group_table_find(ls_pgs, od->nbs); + } + + if (!ls_pg) { + return; + } + + const struct ls_port_group_record *ls_pg_rec; + HMAP_FOR_EACH (ls_pg_rec, key_node, &ls_pg->nb_pgs) { + if (ls_stateful_record_set_acl_flags_(ls_stateful_rec, + ls_pg_rec->nb_pg->acls, + ls_pg_rec->nb_pg->n_acls)) { + return; + } + } +} + +static bool +ls_stateful_record_set_acl_flags_(struct ls_stateful_record *ls_stateful_rec, + struct nbrec_acl **acls, + size_t n_acls) +{ + /* A true return indicates that there are no possible ACL flags + * left to set on ls_stateful record. A false return indicates that + * further ACLs should be explored in case more flags need to be + * set on ls_stateful record. + */ + if (!n_acls) { + return false; + } + + ls_stateful_rec->has_acls = true; + for (size_t i = 0; i < n_acls; i++) { + const struct nbrec_acl *acl = acls[i]; + if (acl->tier > ls_stateful_rec->max_acl_tier) { + ls_stateful_rec->max_acl_tier = acl->tier; + } + if (!ls_stateful_rec->has_stateful_acl + && !strcmp(acl->action, "allow-related")) { + ls_stateful_rec->has_stateful_acl = true; + } + if (ls_stateful_rec->has_stateful_acl && + ls_stateful_rec->max_acl_tier == + nbrec_acl_col_tier.type.value.integer.max) { + return true; + } + } + + return false; +} + +static struct ls_stateful_input +ls_stateful_get_input_data(struct engine_node *node) +{ + const struct northd_data *northd_data = + engine_get_input_data("northd", node); + const struct port_group_data *pg_data = + engine_get_input_data("port_group", node); + + return (struct ls_stateful_input) { + .ls_port_groups = &pg_data->ls_port_groups, + .ls_datapaths = &northd_data->ls_datapaths, + }; +} diff --git a/northd/en-ls-stateful.h b/northd/en-ls-stateful.h new file mode 100644 index 0000000000..cba53e1f29 --- /dev/null +++ b/northd/en-ls-stateful.h @@ -0,0 +1,81 @@ +/* + * Copyright (c) 2023, Red Hat, Inc. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at: + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +#ifndef EN_LS_STATEFUL_H +#define EN_LS_STATEFUL_H 1 + +#include + +/* OVS includes. */ +#include "lib/hmapx.h" +#include "openvswitch/hmap.h" +#include "sset.h" + +/* OVN includes. */ +#include "lib/inc-proc-eng.h" +#include "lib/lb.h" +#include "lib/ovn-nb-idl.h" +#include "lib/ovn-sb-idl.h" +#include "lib/ovn-util.h" +#include "lib/stopwatch-names.h" + +struct ls_stateful_record { + struct hmap_node key_node; + + const struct ovn_datapath *od; + bool has_stateful_acl; + bool has_lb_vip; + bool has_acls; + uint64_t max_acl_tier; +}; + +struct ls_stateful_table { + struct hmap entries; +}; + +#define LS_STATEFUL_TABLE_FOR_EACH(LS_STATEFUL_REC, TABLE) \ + HMAP_FOR_EACH (LS_STATEFUL_REC, key_node, &(TABLE)->entries) + +#define LS_STATEFUL_TABLE_FOR_EACH_IN_P(LS_STATEFUL_REC, JOBID, TABLE) \ + HMAP_FOR_EACH_IN_PARALLEL (LS_STATEFUL_REC, key_node, JOBID, \ + &(TABLE)->entries) + +struct ls_stateful_tracked_data { + /* Created or updated logical switch with LB and ACL data. */ + struct hmapx crupdated; /* Stores 'struct ls_stateful_record'. */ +}; + +struct ed_type_ls_stateful { + struct ls_stateful_table table; + struct ls_stateful_tracked_data trk_data; +}; + +void *en_ls_stateful_init(struct engine_node *, struct engine_arg *); +void en_ls_stateful_cleanup(void *data); +void en_ls_stateful_clear_tracked_data(void *data); +void en_ls_stateful_run(struct engine_node *, void *data); + +bool ls_stateful_northd_handler(struct engine_node *, void *data); +bool ls_stateful_port_group_handler(struct engine_node *, void *data); + +const struct ls_stateful_record *ls_stateful_table_find( + const struct ls_stateful_table *, const struct nbrec_logical_switch *); + +static inline bool +ls_stateful_has_tracked_data(struct ls_stateful_tracked_data *trk_data) { + return !hmapx_is_empty(&trk_data->crupdated); +} + +#endif /* EN_LS_STATEFUL_H */ diff --git a/northd/en-northd.c b/northd/en-northd.c index 546397f3dc..5143603f39 100644 --- a/northd/en-northd.c +++ b/northd/en-northd.c @@ -171,7 +171,7 @@ northd_nb_logical_switch_handler(struct engine_node *node, return false; } - if (northd_has_lsps_in_tracked_data(&nd->trk_data)) { + if (northd_has_tracked_data(&nd->trk_data)) { engine_set_node_state(node, EN_UPDATED); } diff --git a/northd/en-port-group.h b/northd/en-port-group.h index 3b28a23694..54014062ce 100644 --- a/northd/en-port-group.h +++ b/northd/en-port-group.h @@ -48,6 +48,9 @@ struct ls_port_group_record { struct sset ports; /* Subset of 'nb_pg' ports in this record. */ }; +#define LS_PORT_GROUP_TABLE_FOR_EACH(LS_PG, TABLE) \ + HMAP_FOR_EACH (LS_PG, key_node, &(TABLE)->entries) + void ls_port_group_table_init(struct ls_port_group_table *); void ls_port_group_table_clear(struct ls_port_group_table *); void ls_port_group_table_destroy(struct ls_port_group_table *); diff --git a/northd/inc-proc-northd.c b/northd/inc-proc-northd.c index adb38dde78..9ce4279ee8 100644 --- a/northd/inc-proc-northd.c +++ b/northd/inc-proc-northd.c @@ -33,6 +33,7 @@ #include "en-lb-data.h" #include "en-lr-stateful.h" #include "en-lr-nat.h" +#include "en-ls-stateful.h" #include "en-northd.h" #include "en-lflow.h" #include "en-northd-output.h" @@ -150,6 +151,7 @@ static ENGINE_NODE(sync_to_sb_pb, "sync_to_sb_pb"); static ENGINE_NODE_WITH_CLEAR_TRACK_DATA(lb_data, "lb_data"); static ENGINE_NODE_WITH_CLEAR_TRACK_DATA(lr_nat, "lr_nat"); static ENGINE_NODE_WITH_CLEAR_TRACK_DATA(lr_stateful, "lr_stateful"); +static ENGINE_NODE_WITH_CLEAR_TRACK_DATA(ls_stateful, "ls_stateful"); void inc_proc_northd_init(struct ovsdb_idl_loop *nb, struct ovsdb_idl_loop *sb) @@ -200,6 +202,10 @@ void inc_proc_northd_init(struct ovsdb_idl_loop *nb, engine_add_input(&en_lr_stateful, &en_lb_data, lr_stateful_lb_data_handler); + engine_add_input(&en_ls_stateful, &en_northd, ls_stateful_northd_handler); + engine_add_input(&en_ls_stateful, &en_port_group, + ls_stateful_port_group_handler); + engine_add_input(&en_mac_binding_aging, &en_nb_nb_global, NULL); engine_add_input(&en_mac_binding_aging, &en_sb_mac_binding, NULL); engine_add_input(&en_mac_binding_aging, &en_northd, NULL); @@ -222,6 +228,7 @@ void inc_proc_northd_init(struct ovsdb_idl_loop *nb, engine_add_input(&en_lflow, &en_sb_multicast_group, NULL); engine_add_input(&en_lflow, &en_sb_igmp_group, NULL); engine_add_input(&en_lflow, &en_lr_stateful, NULL); + engine_add_input(&en_lflow, &en_ls_stateful, NULL); engine_add_input(&en_lflow, &en_northd, lflow_northd_handler); engine_add_input(&en_lflow, &en_port_group, lflow_port_group_handler); diff --git a/northd/northd.c b/northd/northd.c index f56a7c5ea4..68f2b0bab4 100644 --- a/northd/northd.c +++ b/northd/northd.c @@ -46,6 +46,7 @@ #include "en-lb-data.h" #include "en-lr-nat.h" #include "en-lr-stateful.h" +#include "en-ls-stateful.h" #include "lib/ovn-parallel-hmap.h" #include "ovn/actions.h" #include "ovn/features.h" @@ -576,7 +577,7 @@ lb_group_has_vip(const struct nbrec_load_balancer_group *lb_group) } static bool -ls_has_lb_vip(struct ovn_datapath *od) +ls_has_lb_vip(const struct ovn_datapath *od) { for (size_t i = 0; i < od->nbs->n_load_balancer; i++) { if (lb_has_vip(od->nbs->load_balancer[i])) { @@ -593,7 +594,7 @@ ls_has_lb_vip(struct ovn_datapath *od) } static bool -lr_has_lb_vip(struct ovn_datapath *od) +lr_has_lb_vip(const struct ovn_datapath *od) { for (size_t i = 0; i < od->nbr->n_load_balancer; i++) { if (lb_has_vip(od->nbr->load_balancer[i])) { @@ -609,13 +610,13 @@ lr_has_lb_vip(struct ovn_datapath *od) return false; } -static void -init_lb_for_datapath(struct ovn_datapath *od) +bool +od_has_lb_vip(const struct ovn_datapath *od) { if (od->nbs) { - od->has_lb_vip = ls_has_lb_vip(od); + return ls_has_lb_vip(od); } else { - od->has_lb_vip = lr_has_lb_vip(od); + return lr_has_lb_vip(od); } } @@ -1065,7 +1066,6 @@ join_datapaths(const struct nbrec_logical_switch_table *nbrec_ls_table, init_ipam_info_for_datapath(od); init_mcast_info_for_datapath(od); - init_lb_for_datapath(od); } const struct nbrec_logical_router *nbr; @@ -1096,7 +1096,6 @@ join_datapaths(const struct nbrec_logical_switch_table *nbrec_ls_table, ovs_list_push_back(nb_only, &od->list); } init_mcast_info_for_datapath(od); - init_lb_for_datapath(od); if (smap_get(&od->nbr->options, "chassis")) { od->is_gw_router = true; } @@ -2583,7 +2582,8 @@ get_nat_addresses(const struct ovn_port *op, size_t *n, bool routable_only, size_t n_nats = 0; struct eth_addr mac; if (!op || !op->nbrp || !op->od || !op->od->nbr - || (!op->od->nbr->n_nat && !op->od->has_lb_vip) + || (!op->od->nbr->n_nat && (!lr_stateful_rec + || !lr_stateful_rec->has_lb_vip)) || !eth_addr_from_string(op->nbrp->mac, &mac)) { *n = n_nats; return NULL; @@ -3830,7 +3830,7 @@ build_lrouter_lbs_check(const struct ovn_datapaths *lr_datapaths) HMAP_FOR_EACH (od, key_node, &lr_datapaths->datapaths) { ovs_assert(od->nbr); - if (od->has_lb_vip && od->n_l3dgw_ports > 1 + if (od_has_lb_vip(od) && od->n_l3dgw_ports > 1 && !smap_get(&od->nbr->options, "chassis")) { static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1); VLOG_WARN_RL(&rl, "Load-balancers are configured on logical " @@ -4750,6 +4750,8 @@ destroy_northd_data_tracked_changes(struct northd_data *nd) destroy_tracked_ovn_ports(&trk_changes->trk_lsps); destroy_tracked_lbs(&trk_changes->trk_lbs); hmapx_clear(&trk_changes->lr_with_changed_nats); + hmapx_clear(&trk_changes->ls_with_changed_lbs); + hmapx_clear(&trk_changes->ls_with_changed_acls); nd->trk_data.type = NORTHD_TRACKED_NONE; } @@ -4764,6 +4766,8 @@ init_northd_tracked_data(struct northd_data *nd) hmapx_init(&trk_data->trk_lbs.crupdated); hmapx_init(&trk_data->trk_lbs.deleted); hmapx_init(&trk_data->lr_with_changed_nats); + hmapx_init(&trk_data->ls_with_changed_lbs); + hmapx_init(&trk_data->ls_with_changed_acls); } static void @@ -4777,6 +4781,8 @@ destroy_northd_tracked_data(struct northd_data *nd) hmapx_destroy(&trk_data->trk_lbs.crupdated); hmapx_destroy(&trk_data->trk_lbs.deleted); hmapx_destroy(&trk_data->lr_with_changed_nats); + hmapx_destroy(&trk_data->ls_with_changed_lbs); + hmapx_destroy(&trk_data->ls_with_changed_acls); } /* Check if a changed LSP can be handled incrementally within the I-P engine @@ -4892,6 +4898,7 @@ ls_port_create(struct ovsdb_idl_txn *ovnsb_txn, struct hmap *ls_ports, * - logical switch ports. * - load balancers. * - load balancer groups. + * - ACLs */ static bool ls_changes_can_be_handled( @@ -5116,6 +5123,25 @@ fail: return false; } +static bool +is_acls_seqno_changed(struct nbrec_acl **nb_acls, size_t n_nb_acls) +{ + for (size_t i = 0; i < n_nb_acls; i++) { + if (nbrec_acl_row_get_seqno(nb_acls[i], + OVSDB_IDL_CHANGE_MODIFY) > 0) { + return true; + } + } + + return false; +} + +static bool +is_ls_acls_changed(const struct nbrec_logical_switch *nbs) { + return (nbrec_logical_switch_is_updated(nbs, NBREC_LOGICAL_SWITCH_COL_ACLS) + || is_acls_seqno_changed(nbs->acls, nbs->n_acls)); +} + /* Return true if changes are handled incrementally, false otherwise. * When there are any changes, try to track what's exactly changed and set * northd_data->trk_data accordingly. @@ -5157,6 +5183,10 @@ northd_handle_ls_changes(struct ovsdb_idl_txn *ovnsb_idl_txn, ni, nd, od, &trk_data->trk_lsps)) { goto fail; } + + if (is_ls_acls_changed(changed_ls)) { + hmapx_add(&trk_data->ls_with_changed_acls, od); + } } if (!hmapx_is_empty(&trk_data->trk_lsps.created) @@ -5165,6 +5195,10 @@ northd_handle_ls_changes(struct ovsdb_idl_txn *ovnsb_idl_txn, trk_data->type |= NORTHD_TRACKED_PORTS; } + if (!hmapx_is_empty(&trk_data->ls_with_changed_acls)) { + trk_data->type |= NORTHD_TRACKED_LS_ACLS; + } + return true; fail: @@ -5386,10 +5420,6 @@ northd_handle_sb_port_binding_changes( * due to association of a load balancer (eg. ovn-nbctl ls-lb-add sw0 lb1), * the logical switch datapath is added to the load balancer (represented * by 'struct ovn_lb_datapaths') by calling ovn_lb_datapaths_add_ls(). - * - * - For every 'lb' in the tracked data (trk_lb_data->crupdated_lbs) , - * it gets the associated logical switches and for each switch it - * re-evaluates 'od->has_lb_vip' to reflect any changes to the lb vips. * */ bool northd_handle_lb_data_changes(struct tracked_lb_data *trk_lb_data, @@ -5450,13 +5480,13 @@ northd_handle_lb_data_changes(struct tracked_lb_data *trk_lb_data, lb_dps = ovn_lb_datapaths_find(lb_datapaths_map, lb_uuid); ovs_assert(lb_dps); - /* Re-evaluate 'od->has_lb_vip for od's associated with the - * deleted lb. */ size_t index; BITMAP_FOR_EACH_1 (index, ods_size(ls_datapaths), lb_dps->nb_ls_map) { od = ls_datapaths->array[index]; - init_lb_for_datapath(od); + + /* Add the ls datapath to the northd tracked data. */ + hmapx_add(&nd_changes->ls_with_changed_lbs, od); } hmap_remove(lb_datapaths_map, &lb_dps->hmap_node); @@ -5536,8 +5566,8 @@ northd_handle_lb_data_changes(struct tracked_lb_data *trk_lb_data, } } - /* Re-evaluate 'od->has_lb_vip' */ - init_lb_for_datapath(od); + /* Add the ls datapath to the northd tracked data. */ + hmapx_add(&nd_changes->ls_with_changed_lbs, od); } LIST_FOR_EACH (codlb, list_node, &trk_lb_data->crupdated_lr_lbs) { @@ -5572,9 +5602,6 @@ northd_handle_lb_data_changes(struct tracked_lb_data *trk_lb_data, hmapx_add(&nd_changes->trk_lbs.crupdated, lb_dps); } } - - /* Re-evaluate 'od->has_lb_vip' */ - init_lb_for_datapath(od); } HMAP_FOR_EACH (clb, hmap_node, &trk_lb_data->crupdated_lbs) { @@ -5587,15 +5614,9 @@ northd_handle_lb_data_changes(struct tracked_lb_data *trk_lb_data, BITMAP_FOR_EACH_1 (index, ods_size(ls_datapaths), lb_dps->nb_ls_map) { od = ls_datapaths->array[index]; - /* Re-evaluate 'od->has_lb_vip' */ - init_lb_for_datapath(od); - } - BITMAP_FOR_EACH_1 (index, ods_size(lr_datapaths), - lb_dps->nb_lr_map) { - od = lr_datapaths->array[index]; - /* Re-evaluate 'od->has_lb_vip' */ - init_lb_for_datapath(od); + /* Add the ls datapath to the northd tracked data. */ + hmapx_add(&nd_changes->ls_with_changed_lbs, od); } } @@ -5617,17 +5638,14 @@ northd_handle_lb_data_changes(struct tracked_lb_data *trk_lb_data, for (size_t i = 0; i < lbgrp_dps->n_lr; i++) { od = lbgrp_dps->lr[i]; ovn_lb_datapaths_add_lr(lb_dps, 1, &od); - - /* Re-evaluate 'od->has_lb_vip' */ - init_lb_for_datapath(od); } for (size_t i = 0; i < lbgrp_dps->n_ls; i++) { od = lbgrp_dps->ls[i]; ovn_lb_datapaths_add_ls(lb_dps, 1, &od); - /* Re-evaluate 'od->has_lb_vip' */ - init_lb_for_datapath(od); + /* Add the ls datapath to the northd tracked data. */ + hmapx_add(&nd_changes->ls_with_changed_lbs, od); } /* Add the lb to the northd tracked data. */ @@ -5640,6 +5658,10 @@ northd_handle_lb_data_changes(struct tracked_lb_data *trk_lb_data, nd_changes->type |= NORTHD_TRACKED_LBS; } + if (!hmapx_is_empty(&nd_changes->ls_with_changed_lbs)) { + nd_changes->type |= NORTHD_TRACKED_LS_LBS; + } + return true; } @@ -6573,63 +6595,6 @@ build_dhcpv6_action(struct ovn_port *op, struct in6_addr *offer_ip, return true; } -static bool -od_set_acl_flags(struct ovn_datapath *od, struct nbrec_acl **acls, - size_t n_acls) -{ - /* A true return indicates that there are no possible ACL flags - * left to set on od. A false return indicates that further ACLs - * should be explored in case more flags need to be set on od - */ - if (!n_acls) { - return false; - } - - od->has_acls = true; - for (size_t i = 0; i < n_acls; i++) { - const struct nbrec_acl *acl = acls[i]; - if (acl->tier > od->max_acl_tier) { - od->max_acl_tier = acl->tier; - } - if (!od->has_stateful_acl && !strcmp(acl->action, "allow-related")) { - od->has_stateful_acl = true; - } - if (od->has_stateful_acl && - od->max_acl_tier == nbrec_acl_col_tier.type.value.integer.max) { - return true; - } - } - - return false; -} - -static void -ls_get_acl_flags(struct ovn_datapath *od, - const struct ls_port_group_table *ls_port_groups) -{ - od->has_acls = false; - od->has_stateful_acl = false; - od->max_acl_tier = 0; - - if (od_set_acl_flags(od, od->nbs->acls, od->nbs->n_acls)) { - return; - } - - const struct ls_port_group *ls_pg = - ls_port_group_table_find(ls_port_groups, od->nbs); - if (!ls_pg) { - return; - } - - const struct ls_port_group_record *ls_pg_rec; - HMAP_FOR_EACH (ls_pg_rec, key_node, &ls_pg->nb_pgs) { - if (od_set_acl_flags(od, ls_pg_rec->nb_pg->acls, - ls_pg_rec->nb_pg->n_acls)) { - return; - } - } -} - /* Adds the logical flows in the (in/out) check port sec stage only if * - the lport is disabled or * - lport is of type vtep - to skip the ingress pipeline. @@ -6777,9 +6742,10 @@ build_lswitch_output_port_sec_od(struct ovn_datapath *od, } static void -skip_port_from_conntrack(struct ovn_datapath *od, struct ovn_port *op, - enum ovn_stage in_stage, enum ovn_stage out_stage, - uint16_t priority, struct hmap *lflows) +skip_port_from_conntrack(const struct ovn_datapath *od, struct ovn_port *op, + bool has_stateful_acl, enum ovn_stage in_stage, + enum ovn_stage out_stage, uint16_t priority, + struct hmap *lflows) { /* Can't use ct() for router ports. Consider the following configuration: * lp1(10.0.0.2) on hostA--ls1--lr0--ls2--lp2(10.0.1.2) on hostB, For a @@ -6792,7 +6758,7 @@ skip_port_from_conntrack(struct ovn_datapath *od, struct ovn_port *op, * conntrack state across all chassis. */ const char *ingress_action = "next;"; - const char *egress_action = od->has_stateful_acl + const char *egress_action = has_stateful_acl ? "next;" : "ct_clear; next;"; @@ -6811,7 +6777,7 @@ skip_port_from_conntrack(struct ovn_datapath *od, struct ovn_port *op, } static void -build_stateless_filter(struct ovn_datapath *od, +build_stateless_filter(const struct ovn_datapath *od, const struct nbrec_acl *acl, struct hmap *lflows) { @@ -6832,7 +6798,7 @@ build_stateless_filter(struct ovn_datapath *od, } static void -build_stateless_filters(struct ovn_datapath *od, +build_stateless_filters(const struct ovn_datapath *od, const struct ls_port_group_table *ls_port_groups, struct hmap *lflows) { @@ -6862,9 +6828,7 @@ build_stateless_filters(struct ovn_datapath *od, } static void -build_pre_acls(struct ovn_datapath *od, - const struct ls_port_group_table *ls_port_groups, - struct hmap *lflows) +build_pre_acls(struct ovn_datapath *od, struct hmap *lflows) { /* Ingress and Egress Pre-ACL Table (Priority 0): Packets are * allowed by default. */ @@ -6876,22 +6840,30 @@ build_pre_acls(struct ovn_datapath *od, ovn_lflow_add(lflows, od, S_SWITCH_OUT_PRE_ACL, 110, "eth.src == $svc_monitor_mac", "next;"); +} + +static void +build_ls_stateful_rec_pre_acls(const struct ls_stateful_record *ls_sful_rec, + const struct ls_port_group_table *ls_port_groups, + struct hmap *lflows) +{ + const struct ovn_datapath *od = ls_sful_rec->od; /* If there are any stateful ACL rules in this datapath, we may * send IP packets for some (allow) filters through the conntrack action, * which handles defragmentation, in order to match L4 headers. */ - if (od->has_stateful_acl) { + if (ls_sful_rec->has_stateful_acl) { for (size_t i = 0; i < od->n_router_ports; i++) { struct ovn_port *op = od->router_ports[i]; if (op->enable_router_port_acl) { continue; } - skip_port_from_conntrack(od, op, + skip_port_from_conntrack(od, op, true, S_SWITCH_IN_PRE_ACL, S_SWITCH_OUT_PRE_ACL, 110, lflows); } for (size_t i = 0; i < od->n_localnet_ports; i++) { - skip_port_from_conntrack(od, od->localnet_ports[i], + skip_port_from_conntrack(od, od->localnet_ports[i], true, S_SWITCH_IN_PRE_ACL, S_SWITCH_OUT_PRE_ACL, 110, lflows); @@ -6929,7 +6901,7 @@ build_pre_acls(struct ovn_datapath *od, REGBIT_CONNTRACK_DEFRAG" = 1; next;"); ovn_lflow_add(lflows, od, S_SWITCH_OUT_PRE_ACL, 100, "ip", REGBIT_CONNTRACK_DEFRAG" = 1; next;"); - } else if (od->has_lb_vip) { + } else if (ls_sful_rec->has_lb_vip) { /* We'll build stateless filters if there are LB rules so that * the stateless flows are not tracked in pre-lb. */ build_stateless_filters(od, ls_port_groups, lflows); @@ -7057,30 +7029,40 @@ build_pre_lb(struct ovn_datapath *od, const struct shash *meter_groups, ovn_lflow_add(lflows, od, S_SWITCH_IN_PRE_LB, 0, "1", "next;"); ovn_lflow_add(lflows, od, S_SWITCH_OUT_PRE_LB, 0, "1", "next;"); + /* Do not send statless flows via conntrack */ + ovn_lflow_add(lflows, od, S_SWITCH_IN_PRE_LB, 110, + REGBIT_ACL_STATELESS" == 1", "next;"); + ovn_lflow_add(lflows, od, S_SWITCH_OUT_PRE_LB, 110, + REGBIT_ACL_STATELESS" == 1", "next;"); +} + +static void +build_ls_stateful_rec_pre_lb(const struct ls_stateful_record *ls_stateful_rec, + struct hmap *lflows) +{ + const struct ovn_datapath *od = ls_stateful_rec->od; + for (size_t i = 0; i < od->n_router_ports; i++) { skip_port_from_conntrack(od, od->router_ports[i], + ls_stateful_rec->has_stateful_acl, S_SWITCH_IN_PRE_LB, S_SWITCH_OUT_PRE_LB, 110, lflows); } + /* Localnet ports have no need for going through conntrack, unless * the logical switch has a load balancer. Then, conntrack is necessary * so that traffic arriving via the localnet port can be load * balanced. */ - if (!od->has_lb_vip) { + if (!ls_stateful_rec->has_lb_vip) { for (size_t i = 0; i < od->n_localnet_ports; i++) { skip_port_from_conntrack(od, od->localnet_ports[i], + ls_stateful_rec->has_stateful_acl, S_SWITCH_IN_PRE_LB, S_SWITCH_OUT_PRE_LB, 110, lflows); } } - /* Do not sent statless flows via conntrack */ - ovn_lflow_add(lflows, od, S_SWITCH_IN_PRE_LB, 110, - REGBIT_ACL_STATELESS" == 1", "next;"); - ovn_lflow_add(lflows, od, S_SWITCH_OUT_PRE_LB, 110, - REGBIT_ACL_STATELESS" == 1", "next;"); - /* 'REGBIT_CONNTRACK_NAT' is set to let the pre-stateful table send * packet to conntrack for defragmentation and possibly for unNATting. * @@ -7111,7 +7093,7 @@ build_pre_lb(struct ovn_datapath *od, const struct shash *meter_groups, * ingress pipeline if a load balancer is configured. We can now * add a lflow to drop ct.inv packets. */ - if (od->has_lb_vip) { + if (ls_stateful_rec->has_lb_vip) { ovn_lflow_add(lflows, od, S_SWITCH_IN_PRE_LB, 100, "ip", REGBIT_CONNTRACK_NAT" = 1; next;"); ovn_lflow_add(lflows, od, S_SWITCH_OUT_PRE_LB, @@ -7152,10 +7134,12 @@ build_pre_stateful(struct ovn_datapath *od, } static void -build_acl_hints(struct ovn_datapath *od, +build_acl_hints(const struct ls_stateful_record *ls_sful_rec, const struct chassis_features *features, struct hmap *lflows) { + const struct ovn_datapath *od = ls_sful_rec->od; + /* This stage builds hints for the IN/OUT_ACL stage. Based on various * combinations of ct flags packets may hit only a subset of the logical * flows in the IN/OUT_ACL stage. @@ -7179,13 +7163,13 @@ build_acl_hints(struct ovn_datapath *od, const char *match; /* In any case, advance to the next stage. */ - if (!od->has_acls && !od->has_lb_vip) { + if (!ls_sful_rec->has_acls && !ls_sful_rec->has_lb_vip) { ovn_lflow_add(lflows, od, stage, UINT16_MAX, "1", "next;"); } else { ovn_lflow_add(lflows, od, stage, 0, "1", "next;"); } - if (!od->has_stateful_acl && !od->has_lb_vip) { + if (!ls_sful_rec->has_stateful_acl && !ls_sful_rec->has_lb_vip) { continue; } @@ -7321,10 +7305,10 @@ build_acl_log(struct ds *actions, const struct nbrec_acl *acl, } static void -consider_acl(struct hmap *lflows, struct ovn_datapath *od, +consider_acl(struct hmap *lflows, const struct ovn_datapath *od, const struct nbrec_acl *acl, bool has_stateful, bool ct_masked_mark, const struct shash *meter_groups, - struct ds *match, struct ds *actions) + uint64_t max_acl_tier, struct ds *match, struct ds *actions) { const char *ct_blocked_match = ct_masked_mark ? "ct_mark.blocked" @@ -7361,7 +7345,7 @@ consider_acl(struct hmap *lflows, struct ovn_datapath *od, /* All ACLS will start by matching on their respective tier. */ size_t match_tier_len = 0; ds_clear(match); - if (od->max_acl_tier) { + if (max_acl_tier) { ds_put_format(match, REG_ACL_TIER " == %"PRId64" && ", acl->tier); match_tier_len = match->length; } @@ -7548,12 +7532,15 @@ ovn_update_ipv6_options(struct hmap *lr_ports) #define IPV6_CT_OMIT_MATCH "nd || nd_ra || nd_rs || mldv1 || mldv2" static void -build_acl_action_lflows(struct ovn_datapath *od, struct hmap *lflows, +build_acl_action_lflows(const struct ls_stateful_record *ls_stateful_rec, + struct hmap *lflows, const char *default_acl_action, const struct shash *meter_groups, struct ds *match, struct ds *actions) { + const struct ovn_datapath *od = ls_stateful_rec->od; + enum ovn_stage stages [] = { S_SWITCH_IN_ACL_ACTION, S_SWITCH_IN_ACL_AFTER_LB_ACTION, @@ -7564,7 +7551,7 @@ build_acl_action_lflows(struct ovn_datapath *od, struct hmap *lflows, ds_put_cstr(actions, REGBIT_ACL_VERDICT_ALLOW " = 0; " REGBIT_ACL_VERDICT_DROP " = 0; " REGBIT_ACL_VERDICT_REJECT " = 0; "); - if (od->max_acl_tier) { + if (ls_stateful_rec->max_acl_tier) { ds_put_cstr(actions, REG_ACL_TIER " = 0; "); } @@ -7572,7 +7559,7 @@ build_acl_action_lflows(struct ovn_datapath *od, struct hmap *lflows, for (size_t i = 0; i < ARRAY_SIZE(stages); i++) { enum ovn_stage stage = stages[i]; - if (!od->has_acls) { + if (!ls_stateful_rec->has_acls) { ovn_lflow_add(lflows, od, stage, 0, "1", "next;"); continue; } @@ -7607,7 +7594,7 @@ build_acl_action_lflows(struct ovn_datapath *od, struct hmap *lflows, ovn_lflow_add(lflows, od, stage, 0, "1", ds_cstr(actions)); struct ds tier_actions = DS_EMPTY_INITIALIZER; - for (size_t j = 0; j < od->max_acl_tier; j++) { + for (size_t j = 0; j < ls_stateful_rec->max_acl_tier; j++) { ds_clear(match); ds_put_format(match, REG_ACL_TIER " == %"PRIuSIZE, j); ds_clear(&tier_actions); @@ -7623,7 +7610,7 @@ build_acl_action_lflows(struct ovn_datapath *od, struct hmap *lflows, } static void -build_acl_log_related_flows(struct ovn_datapath *od, struct hmap *lflows, +build_acl_log_related_flows(const struct ovn_datapath *od, struct hmap *lflows, const struct nbrec_acl *acl, bool has_stateful, bool ct_masked_mark, const struct shash *meter_groups, @@ -7696,15 +7683,19 @@ build_acl_log_related_flows(struct ovn_datapath *od, struct hmap *lflows, } static void -build_acls(struct ovn_datapath *od, const struct chassis_features *features, +build_acls(const struct ls_stateful_record *ls_stateful_rec, + const struct chassis_features *features, struct hmap *lflows, const struct ls_port_group_table *ls_port_groups, const struct shash *meter_groups) { + const struct ovn_datapath *od = ls_stateful_rec->od; + const char *default_acl_action = default_acl_drop ? debug_implicit_drop_action() : "next;"; - bool has_stateful = od->has_stateful_acl || od->has_lb_vip; + bool has_stateful = (ls_stateful_rec->has_stateful_acl + || ls_stateful_rec->has_lb_vip); const char *ct_blocked_match = features->ct_no_masked_label ? "ct_mark.blocked" : "ct_label.blocked"; @@ -7718,8 +7709,8 @@ build_acls(struct ovn_datapath *od, const struct chassis_features *features, * * A related rule at priority 1 is added below if there * are any stateful ACLs in this datapath. */ - if (!od->has_acls) { - if (!od->has_lb_vip) { + if (!ls_stateful_rec->has_acls) { + if (!ls_stateful_rec->has_lb_vip) { ovn_lflow_add(lflows, od, S_SWITCH_IN_ACL_EVAL, UINT16_MAX, "1", "next;"); ovn_lflow_add(lflows, od, S_SWITCH_OUT_ACL_EVAL, UINT16_MAX, "1", @@ -7882,7 +7873,8 @@ build_acls(struct ovn_datapath *od, const struct chassis_features *features, meter_groups, &match, &actions); consider_acl(lflows, od, acl, has_stateful, features->ct_no_masked_label, - meter_groups, &match, &actions); + meter_groups, ls_stateful_rec->max_acl_tier, + &match, &actions); } const struct ls_port_group *ls_pg = @@ -7898,7 +7890,8 @@ build_acls(struct ovn_datapath *od, const struct chassis_features *features, meter_groups, &match, &actions); consider_acl(lflows, od, acl, has_stateful, features->ct_no_masked_label, - meter_groups, &match, &actions); + meter_groups, ls_stateful_rec->max_acl_tier, + &match, &actions); } } } @@ -7916,7 +7909,7 @@ build_acls(struct ovn_datapath *od, const struct chassis_features *features, dns_actions); } - if (od->has_acls || od->has_lb_vip) { + if (ls_stateful_rec->has_acls || ls_stateful_rec->has_lb_vip) { /* Add a 34000 priority flow to advance the service monitor reply * packets to skip applying ingress ACLs. */ ovn_lflow_add(lflows, od, S_SWITCH_IN_ACL_EVAL, 34000, @@ -7930,8 +7923,8 @@ build_acls(struct ovn_datapath *od, const struct chassis_features *features, REGBIT_ACL_VERDICT_ALLOW" = 1; next;"); } - build_acl_action_lflows(od, lflows, default_acl_action, meter_groups, - &match, &actions); + build_acl_action_lflows(ls_stateful_rec, lflows, default_acl_action, + meter_groups, &match, &actions); ds_destroy(&match); ds_destroy(&actions); @@ -8576,8 +8569,11 @@ build_stateful(struct ovn_datapath *od, } static void -build_lb_hairpin(struct ovn_datapath *od, struct hmap *lflows) +build_lb_hairpin(const struct ls_stateful_record *ls_stateful_rec, + struct hmap *lflows) { + const struct ovn_datapath *od = ls_stateful_rec->od; + /* Ingress Pre-Hairpin/Nat-Hairpin/Hairpin tabled (Priority 0). * Packets that don't need hairpinning should continue processing. */ @@ -8585,7 +8581,7 @@ build_lb_hairpin(struct ovn_datapath *od, struct hmap *lflows) ovn_lflow_add(lflows, od, S_SWITCH_IN_NAT_HAIRPIN, 0, "1", "next;"); ovn_lflow_add(lflows, od, S_SWITCH_IN_HAIRPIN, 0, "1", "next;"); - if (od->has_lb_vip) { + if (ls_stateful_rec->has_lb_vip) { /* Check if the packet needs to be hairpinned. * Set REGBIT_HAIRPIN in the original direction and * REGBIT_HAIRPIN_REPLY in the reply direction. @@ -9242,9 +9238,7 @@ build_dhcpv4_options_flows(struct ovn_port *op, &op->nbsp->dhcpv4_options->options, "lease_time"); ovs_assert(server_id && server_mac && lease_time); const char *dhcp_actions = - (op->od->has_stateful_acl || op->od->has_lb_vip) - ? REGBIT_ACL_VERDICT_ALLOW" = 1; ct_commit; next;" - : REGBIT_ACL_VERDICT_ALLOW" = 1; next;"; + REGBIT_ACL_VERDICT_ALLOW" = 1; next;"; ds_clear(&match); ds_put_format(&match, "outport == %s && eth.src == %s " "&& ip4.src == %s && udp && udp.src == 67 " @@ -9327,9 +9321,7 @@ build_dhcpv6_options_flows(struct ovn_port *op, ipv6_string_mapped(server_ip, &lla); const char *dhcp6_actions = - (op->od->has_stateful_acl || op->od->has_lb_vip) - ? REGBIT_ACL_VERDICT_ALLOW" = 1; ct_commit; next;" - : REGBIT_ACL_VERDICT_ALLOW" = 1; next;"; + REGBIT_ACL_VERDICT_ALLOW" = 1; next;"; ds_clear(&match); ds_put_format(&match, "outport == %s && eth.src == %s " "&& ip6.src == %s && udp && udp.src == 547 " @@ -9436,22 +9428,16 @@ build_lswitch_lflows_l2_unknown(struct ovn_datapath *od, static void build_lswitch_lflows_pre_acl_and_acl( struct ovn_datapath *od, - const struct ls_port_group_table *ls_port_groups, const struct chassis_features *features, struct hmap *lflows, const struct shash *meter_groups) { ovs_assert(od->nbs); - ls_get_acl_flags(od, ls_port_groups); - - build_pre_acls(od, ls_port_groups, lflows); + build_pre_acls(od, lflows); build_pre_lb(od, meter_groups, lflows); build_pre_stateful(od, features, lflows); - build_acl_hints(od, features, lflows); - build_acls(od, features, lflows, ls_port_groups, meter_groups); build_qos(od, lflows); build_stateful(od, features, lflows); - build_lb_hairpin(od, lflows); build_vtep_hairpin(od, lflows); } @@ -15511,7 +15497,7 @@ build_lrouter_nat_defrag_and_lb( * a dynamically negotiated FTP data channel), but will allow * related traffic such as an ICMP Port Unreachable through * that's generated from a non-listening UDP port. */ - if (od->has_lb_vip && features->ct_lb_related) { + if (lr_stateful_rec->has_lb_vip && features->ct_lb_related) { ds_clear(match); ds_put_cstr(match, "ct.rel && !ct.est && !ct.new"); @@ -15536,7 +15522,7 @@ build_lrouter_nat_defrag_and_lb( * Pass the traffic that is already established to the next table with * proper flags set. */ - if (od->has_lb_vip) { + if (lr_stateful_rec->has_lb_vip) { ds_clear(match); ds_put_format(match, "ct.est && !ct.rel && !ct.new && %s.natted", @@ -15566,7 +15552,7 @@ build_lrouter_nat_defrag_and_lb( * not committed, it would produce ongoing datapath flows with the ct.new * flag set. Some NICs are unable to offload these flows. */ - if (od->is_gw_router && (od->nbr->n_nat || od->has_lb_vip)) { + if (od->is_gw_router && (od->nbr->n_nat || lr_stateful_rec->has_lb_vip)) { /* Do not send ND or ICMP packets to connection tracking. */ ovn_lflow_add(lflows, od, S_ROUTER_OUT_UNDNAT, 100, "nd || nd_rs || nd_ra", "next;"); @@ -15987,6 +15973,22 @@ build_lr_stateful_flows(const struct lr_stateful_record *lr_stateful_rec, meter_groups); } +static void +build_ls_stateful_flows(const struct ls_stateful_record *ls_stateful_rec, + const struct ls_port_group_table *ls_pgs, + const struct chassis_features *features, + const struct shash *meter_groups, + struct hmap *lflows) +{ + ovs_assert(ls_stateful_rec->od); + + build_ls_stateful_rec_pre_acls(ls_stateful_rec, ls_pgs, lflows); + build_ls_stateful_rec_pre_lb(ls_stateful_rec, lflows); + build_acl_hints(ls_stateful_rec, features, lflows); + build_acls(ls_stateful_rec, features, lflows, ls_pgs, meter_groups); + build_lb_hairpin(ls_stateful_rec, lflows); +} + struct lswitch_flow_build_info { const struct ovn_datapaths *ls_datapaths; const struct ovn_datapaths *lr_datapaths; @@ -15994,6 +15996,7 @@ struct lswitch_flow_build_info { const struct hmap *lr_ports; const struct ls_port_group_table *ls_port_groups; const struct lr_stateful_table *lr_stateful_table; + const struct ls_stateful_table *ls_stateful_table; struct hmap *lflows; struct hmap *igmp_groups; const struct shash *meter_groups; @@ -16018,9 +16021,7 @@ build_lswitch_and_lrouter_iterate_by_ls(struct ovn_datapath *od, struct lswitch_flow_build_info *lsi) { ovs_assert(od->nbs); - build_lswitch_lflows_pre_acl_and_acl(od, lsi->ls_port_groups, - lsi->features, - lsi->lflows, + build_lswitch_lflows_pre_acl_and_acl(od, lsi->features, lsi->lflows, lsi->meter_groups); build_fwd_group_lflows(od, lsi->lflows); @@ -16135,6 +16136,7 @@ build_lflows_thread(void *arg) { struct worker_control *control = (struct worker_control *) arg; const struct lr_stateful_record *lr_stateful_rec; + const struct ls_stateful_record *ls_stateful_rec; struct lswitch_flow_build_info *lsi; struct ovn_igmp_group *igmp_group; struct ovn_lb_datapaths *lb_dps; @@ -16259,6 +16261,20 @@ build_lflows_thread(void *arg) lsi->features); } } + + for (bnum = control->id; + bnum <= lsi->ls_stateful_table->entries.mask; + bnum += control->pool->size) + { + LS_STATEFUL_TABLE_FOR_EACH_IN_P (ls_stateful_rec, bnum, + lsi->ls_stateful_table) { + build_ls_stateful_flows(ls_stateful_rec, + lsi->ls_port_groups, + lsi->features, lsi->meter_groups, + lsi->lflows); + } + } + for (bnum = control->id; bnum <= lsi->igmp_groups->mask; bnum += control->pool->size) @@ -16320,6 +16336,7 @@ build_lswitch_and_lrouter_flows( const struct hmap *lr_ports, const struct ls_port_group_table *ls_pgs, const struct lr_stateful_table *lr_stateful_table, + const struct ls_stateful_table *ls_stateful_table, struct hmap *lflows, struct hmap *igmp_groups, const struct shash *meter_groups, @@ -16350,6 +16367,7 @@ build_lswitch_and_lrouter_flows( lsiv[index].lr_ports = lr_ports; lsiv[index].ls_port_groups = ls_pgs; lsiv[index].lr_stateful_table = lr_stateful_table; + lsiv[index].ls_stateful_table = ls_stateful_table; lsiv[index].igmp_groups = igmp_groups; lsiv[index].meter_groups = meter_groups; lsiv[index].lb_dps_map = lb_dps_map; @@ -16375,6 +16393,7 @@ build_lswitch_and_lrouter_flows( free(lsiv); } else { const struct lr_stateful_record *lr_stateful_rec; + const struct ls_stateful_record *ls_stateful_rec; struct ovn_igmp_group *igmp_group; struct ovn_lb_datapaths *lb_dps; struct ovn_datapath *od; @@ -16387,6 +16406,7 @@ build_lswitch_and_lrouter_flows( .lr_ports = lr_ports, .ls_port_groups = ls_pgs, .lr_stateful_table = lr_stateful_table, + .ls_stateful_table = ls_stateful_table, .lflows = lflows, .igmp_groups = igmp_groups, .meter_groups = meter_groups, @@ -16456,6 +16476,12 @@ build_lswitch_and_lrouter_flows( lsi.features); } + LS_STATEFUL_TABLE_FOR_EACH (ls_stateful_rec, ls_stateful_table) { + build_ls_stateful_flows(ls_stateful_rec, lsi.ls_port_groups, + lsi.features, lsi.meter_groups, + lsi.lflows); + } + stopwatch_start(LFLOWS_IGMP_STOPWATCH_NAME, time_msec()); HMAP_FOR_EACH (igmp_group, hmap_node, igmp_groups) { build_lswitch_ip_mcast_igmp_mld(igmp_group, @@ -16552,6 +16578,7 @@ void build_lflows(struct ovsdb_idl_txn *ovnsb_txn, input_data->lr_ports, input_data->ls_port_groups, input_data->lr_stateful_table, + input_data->ls_stateful_table, lflows, &igmp_groups, input_data->meter_groups, diff --git a/northd/northd.h b/northd/northd.h index 46bd2498f4..88406bffee 100644 --- a/northd/northd.h +++ b/northd/northd.h @@ -89,6 +89,8 @@ ods_size(const struct ovn_datapaths *datapaths) return hmap_count(&datapaths->datapaths); } +bool od_has_lb_vip(const struct ovn_datapath *od); + struct tracked_ovn_ports { /* tracked created ports. * hmapx node data is 'struct ovn_port *' */ @@ -118,6 +120,8 @@ enum northd_tracked_data_type { NORTHD_TRACKED_PORTS = (1 << 0), NORTHD_TRACKED_LBS = (1 << 1), NORTHD_TRACKED_LR_NATS = (1 << 2), + NORTHD_TRACKED_LS_LBS = (1 << 3), + NORTHD_TRACKED_LS_ACLS = (1 << 4), }; /* Track what's changed in the northd engine node. @@ -132,6 +136,14 @@ struct northd_tracked_data { /* Tracked logical routers whose NATs have changed. * hmapx node is 'struct ovn_datapath *'. */ struct hmapx lr_with_changed_nats; + + /* Tracked logical switches whose load balancers have changed. + * hmapx node is 'struct ovn_datapath *'. */ + struct hmapx ls_with_changed_lbs; + + /* Tracked logical switches whose ACLs have changed. + * hmapx node is 'struct ovn_datapath *'. */ + struct hmapx ls_with_changed_acls; }; struct northd_data { @@ -180,6 +192,7 @@ struct lflow_input { const struct hmap *lr_ports; const struct ls_port_group_table *ls_port_groups; const struct lr_stateful_table *lr_stateful_table; + const struct ls_stateful_table *ls_stateful_table; const struct shash *meter_groups; const struct hmap *lb_datapaths_map; const struct hmap *bfd_connections; @@ -289,11 +302,7 @@ struct ovn_datapath { struct hmap port_tnlids; uint32_t port_key_hint; - bool has_stateful_acl; - bool has_lb_vip; bool has_unknown; - bool has_acls; - uint64_t max_acl_tier; bool has_vtep_lports; bool has_arp_proxy_port; @@ -541,6 +550,18 @@ northd_has_lr_nats_in_tracked_data(struct northd_tracked_data *trk_nd_changes) return (trk_nd_changes->type & NORTHD_TRACKED_LR_NATS); } +static inline bool +northd_has_ls_lbs_in_tracked_data(struct northd_tracked_data *trk_nd_changes) +{ + return (trk_nd_changes->type & NORTHD_TRACKED_LS_LBS); +} + +static inline bool +northd_has_ls_acls_in_tracked_data(struct northd_tracked_data *trk_nd_changes) +{ + return (trk_nd_changes->type & NORTHD_TRACKED_LS_ACLS); +} + /* Returns 'true' if the IPv4 'addr' is on the same subnet with one of the * IPs configured on the router port. */ diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c index e85e00cc9b..084d675644 100644 --- a/northd/ovn-northd.c +++ b/northd/ovn-northd.c @@ -872,6 +872,7 @@ main(int argc, char *argv[]) stopwatch_create(SYNC_METERS_RUN_STOPWATCH_NAME, SW_MS); stopwatch_create(LR_NAT_RUN_STOPWATCH_NAME, SW_MS); stopwatch_create(LR_STATEFUL_RUN_STOPWATCH_NAME, SW_MS); + stopwatch_create(LS_STATEFUL_RUN_STOPWATCH_NAME, SW_MS); /* Initialize incremental processing engine for ovn-northd */ inc_proc_northd_init(&ovnnb_idl_loop, &ovnsb_idl_loop);