From patchwork Thu Sep 2 08:26:43 2021
X-Patchwork-Submitter: Anton Ivanov
X-Patchwork-Id: 1523521
From: anton.ivanov@cambridgegreys.com
To: ovs-dev@openvswitch.org
Cc: Anton Ivanov
Date: Thu, 2 Sep 2021 09:26:43 +0100
Message-Id: <20210902082643.3916-4-anton.ivanov@cambridgegreys.com>
In-Reply-To: <20210902082643.3916-1-anton.ivanov@cambridgegreys.com>
References: <20210902082643.3916-1-anton.ivanov@cambridgegreys.com>
X-Mailer: git-send-email 2.20.1
Subject: [ovs-dev] [OVN Patch v5 4/4] northd: Restore parallel build with dp_groups
From: Anton Ivanov <anton.ivanov@cambridgegreys.com>

Restore the parallel build with dp_groups, using an rwlock instead of
per-row locking as the underlying mechanism.

This provides an improvement of ~10% end-to-end on ovn-heater under
virtualization, despite awakening some qemu gremlin which makes qemu
climb to silly CPU usage. The gain on bare metal is likely to be
higher.

Signed-off-by: Anton Ivanov <anton.ivanov@cambridgegreys.com>
---
 northd/ovn-northd.c | 183 +++++++++++++++++++++++++++++++++-----------
 1 file changed, 137 insertions(+), 46 deletions(-)

diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
index 28af790bc..f5d49143d 100644
--- a/northd/ovn-northd.c
+++ b/northd/ovn-northd.c
@@ -59,6 +59,7 @@
 #include "unixctl.h"
 #include "util.h"
 #include "uuid.h"
+#include "ovs-thread.h"
 #include "openvswitch/vlog.h"
 
 VLOG_DEFINE_THIS_MODULE(ovn_northd);
@@ -4369,7 +4370,26 @@ ovn_lflow_init(struct ovn_lflow *lflow, struct ovn_datapath *od,
 static bool use_logical_dp_groups = false;
 static bool use_parallel_build = true;
 
-static struct hashrow_locks lflow_locks;
+static struct ovs_rwlock flowtable_lock;
+
+static void ovn_make_multi_lflow(struct ovn_lflow *old_lflow,
+                                 struct ovn_datapath *od,
+                                 struct lflow_state *lflow_map,
+                                 uint32_t hash)
+{
+    hmapx_add(&old_lflow->od_group, od);
+    hmap_remove(&lflow_map->single_od, &old_lflow->hmap_node);
+    if (use_parallel_build) {
+        hmap_insert_fast(&lflow_map->multiple_od, &old_lflow->hmap_node, hash);
+    } else {
+        hmap_insert(&lflow_map->multiple_od, &old_lflow->hmap_node, hash);
+    }
+}
+
+#ifdef __clang__
+#pragma clang diagnostic push
+#pragma clang diagnostic ignored "-Wthread-safety"
+#endif
 
 /* Adds a row with the specified contents to the Logical_Flow table.
  * Version to use when locking is required.
@@ -4385,57 +4405,133 @@ do_ovn_lflow_add(struct lflow_state *lflow_map, struct ovn_datapath *od,
     struct ovn_lflow *old_lflow;
     struct ovn_lflow *lflow;
 
+    /* Fast Path.
+     * See if we can get away without writing - grab a rdlock and check
+     * if we can get away with as little work as possible.
+     */
+
     if (use_logical_dp_groups) {
-        old_lflow = do_ovn_lflow_find(&lflow_map->single_od, NULL, stage,
-                                      priority, match,
+        if (use_parallel_build) {
+            ovs_rwlock_rdlock(&flowtable_lock);
+        }
+        old_lflow = do_ovn_lflow_find(&lflow_map->single_od,
+                                      NULL, stage, priority, match,
                                       actions, ctrl_meter, hash);
         if (old_lflow) {
-            hmapx_add(&old_lflow->od_group, od);
-            /* Found, different, od count went up. Move to multiple od. */
-            if (hmapx_count(&old_lflow->od_group) > 1) {
-                hmap_remove(&lflow_map->single_od, &old_lflow->hmap_node);
+            if (!hmapx_contains(&old_lflow->od_group, od)) {
+                /* od not in od_group, we need to add it and move to
+                 * multiple. */
                 if (use_parallel_build) {
-                    hmap_insert_fast(&lflow_map->multiple_od,
-                                     &old_lflow->hmap_node, hash);
-                } else {
-                    hmap_insert(&lflow_map->multiple_od,
-                                &old_lflow->hmap_node, hash);
+                    /* Upgrade the lock to write, we are likely to
+                     * modify data. */
+                    ovs_rwlock_unlock(&flowtable_lock);
+                    ovs_rwlock_wrlock(&flowtable_lock);
+
+                    /* Check if someone got ahead of us and the flow is already
+                     * in multiple. */
+                    if (!hmap_contains(&lflow_map->single_od,
+                                       &old_lflow->hmap_node)) {
+                        /* Someone did get ahead of us, add the od to the
+                         * group. */
+                        hmapx_add(&old_lflow->od_group, od);
+                        goto done_update_unlock;
+                    }
                 }
+                ovn_make_multi_lflow(old_lflow, od, lflow_map, hash);
+                goto done_update_unlock;
             }
-        } else {
-            /* Not found, lookup in multiple od. */
+        }
+        if (!old_lflow) {
+            /* Not found in single, lookup in multiple od. */
             old_lflow = do_ovn_lflow_find(&lflow_map->multiple_od, NULL,
                                           stage, priority, match,
                                           actions, ctrl_meter, hash);
             if (old_lflow) {
-                hmapx_add(&old_lflow->od_group, od);
+                if (!hmapx_contains(&old_lflow->od_group, od)) {
+                    if (use_parallel_build) {
+                        /* Upgrade lock to write. */
+                        ovs_rwlock_unlock(&flowtable_lock);
+                        ovs_rwlock_wrlock(&flowtable_lock);
+                    }
+                    hmapx_add(&old_lflow->od_group, od);
+                }
             }
         }
+done_update_unlock:
+        if (use_parallel_build) {
+            ovs_rwlock_unlock(&flowtable_lock);
+        }
         if (old_lflow) {
             return;
         }
     }
 
-    lflow = xmalloc(sizeof *lflow);
-    /* While adding new logical flows we're not setting single datapath, but
-     * collecting a group. 'od' will be updated later for all flows with only
-     * one datapath in a group, so it could be hashed correctly. */
-    ovn_lflow_init(lflow, NULL, stage, priority,
-                   xstrdup(match), xstrdup(actions),
-                   io_port ? xstrdup(io_port) : NULL,
-                   nullable_xstrdup(ctrl_meter),
-                   ovn_lflow_hint(stage_hint), where);
-    hmapx_add(&lflow->od_group, od);
-
-    /* Insert "fresh" lflows into single_od. */
+    /* Slow Path.
+     * We could not get away with a minimal, mostly-ro amount of work,
+     * lock with rw and try to do an insert (may end up repeating
+     * some of what we do for ro). */
+    if (use_logical_dp_groups && use_parallel_build) {
+        ovs_rwlock_wrlock(&flowtable_lock);
+    }
     if (!use_parallel_build) {
+        lflow = xmalloc(sizeof *lflow);
+        /* While adding new logical flows we are not setting single datapath,
+         * but collecting a group. 'od' will be updated later for all flows
+         * with only one datapath in a group, so it could be hashed correctly.
+         */
+        ovn_lflow_init(lflow, NULL, stage, priority,
+                       xstrdup(match), xstrdup(actions),
+                       io_port ? xstrdup(io_port) : NULL,
+                       nullable_xstrdup(ctrl_meter),
+                       ovn_lflow_hint(stage_hint), where);
+        hmapx_add(&lflow->od_group, od);
         hmap_insert(&lflow_map->single_od, &lflow->hmap_node, hash);
     } else {
+        if (use_logical_dp_groups) {
+            /* Search again in case someone else got here first. */
+            old_lflow = do_ovn_lflow_find(&lflow_map->single_od,
+                                          NULL, stage, priority, match,
+                                          actions, ctrl_meter, hash);
+            if (old_lflow) {
+                if (!hmapx_contains(&old_lflow->od_group, od)) {
+                    ovn_make_multi_lflow(old_lflow, od, lflow_map, hash);
+                }
+                goto done_add_unlock;
+            }
+            /* Unlikely, but possible, more than one thread got here
+             * ahead of us while we were waiting to acquire a write lock. */
+            old_lflow = do_ovn_lflow_find(&lflow_map->multiple_od, NULL,
+                                          stage, priority, match,
+                                          actions, ctrl_meter, hash);
+            if (old_lflow) {
+                hmapx_add(&old_lflow->od_group, od);
+                goto done_add_unlock;
+            }
+        }
+        lflow = xmalloc(sizeof *lflow);
+        /* While adding new logical flows we're not setting single datapath,
+         * but collecting a group. 'od' will be updated later for all
+         * flows with only one datapath in a group, so it could be hashed
+         * correctly. */
+        ovn_lflow_init(lflow, NULL, stage, priority,
+                       xstrdup(match), xstrdup(actions),
+                       io_port ? xstrdup(io_port) : NULL,
+                       nullable_xstrdup(ctrl_meter),
+                       ovn_lflow_hint(stage_hint), where);
+        hmapx_add(&lflow->od_group, od);
         hmap_insert_fast(&lflow_map->single_od, &lflow->hmap_node, hash);
     }
+done_add_unlock:
+    if (use_logical_dp_groups && use_parallel_build) {
+        ovs_rwlock_unlock(&flowtable_lock);
+    }
 }
 
+#ifdef __clang__
+#pragma clang diagnostic pop
+#endif
+
 /* Adds a row with the specified contents to the Logical_Flow table. */
 static void
 ovn_lflow_add_at(struct lflow_state *lflow_map, struct ovn_datapath *od,
@@ -4453,15 +4549,8 @@ ovn_lflow_add_at(struct lflow_state *lflow_map, struct ovn_datapath *od,
                                           priority, match, actions);
 
-    if (use_logical_dp_groups && use_parallel_build) {
-        lock_hash_row(&lflow_locks, hash);
-        do_ovn_lflow_add(lflow_map, od, hash, stage, priority, match,
-                         actions, io_port, stage_hint, where, ctrl_meter);
-        unlock_hash_row(&lflow_locks, hash);
-    } else {
-        do_ovn_lflow_add(lflow_map, od, hash, stage, priority, match,
-                         actions, io_port, stage_hint, where, ctrl_meter);
-    }
+    do_ovn_lflow_add(lflow_map, od, hash, stage, priority, match,
+                     actions, io_port, stage_hint, where, ctrl_meter);
 }
 
 /* Adds a row with the specified contents to the Logical_Flow table. */
@@ -4555,15 +4644,9 @@ hmap_safe_remove(struct hmap *hmap, struct hmap_node *node)
 static void
 remove_lflow_from_lflows(struct lflow_state *lflows, struct ovn_lflow *lflow)
 {
-    if (use_logical_dp_groups && use_parallel_build) {
-        lock_hash_row(&lflow_locks, lflow->hmap_node.hash);
-    }
     if (!hmap_safe_remove(&lflows->multiple_od, &lflow->hmap_node)) {
         hmap_remove(&lflows->single_od, &lflow->hmap_node);
     }
-    if (use_logical_dp_groups && use_parallel_build) {
-        unlock_hash_row(&lflow_locks, lflow->hmap_node.hash);
-    }
 }
 
 static void
@@ -12967,7 +13050,7 @@ build_lswitch_and_lrouter_flows(struct hmap *datapaths, struct hmap *ports,
         }
     }
 
-    if (use_parallel_build && (!use_logical_dp_groups)) {
+    if (use_parallel_build) {
         struct lflow_state *lflow_segs;
         struct lswitch_flow_build_info *lsiv;
         int index;
@@ -13197,6 +13280,8 @@ reconcile_lflow(struct ovn_lflow *lflow, struct northd_context *ctx,
     ovn_lflow_destroy(lflows, lflow);
 }
 
+static bool needs_init = true;
+
 /* Updates the Logical_Flow and Multicast_Group tables in the OVN_SB database,
  * constructing their contents based on the OVN_NB database. */
 static void
@@ -13211,8 +13296,15 @@ build_lflows(struct northd_context *ctx, struct hmap *datapaths,
     fast_hmap_size_for(&lflows.single_od, max_seen_lflow_size);
     fast_hmap_size_for(&lflows.multiple_od, max_seen_lflow_size);
 
-    if (use_parallel_build && use_logical_dp_groups) {
-        update_hashrow_locks(&lflows.single_od, &lflow_locks);
+    if (use_parallel_build && use_logical_dp_groups && needs_init) {
+        ovs_rwlock_init(&flowtable_lock);
+        /* This happens on first run with parallel+dp_groups.
+         * db_run will re-read use_parallel_build from config and
+         * reset it. This way we get correct sizing for
+         * parallel + dp_groups by doing one single threaded run
+         * on the first iteration. */
+        use_parallel_build = false;
+        needs_init = false;
     }
 
@@ -15139,7 +15231,6 @@ main(int argc, char *argv[])
 
     daemonize_complete();
 
-    init_hash_row_locks(&lflow_locks);
     use_parallel_build = can_parallelize_hashes(false);
 
     /* We want to detect (almost) all changes to the ovn-nb db. */
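
Note on the locking pattern above (illustrative, not part of the patch): ovs_rwlock, like
the pthread rwlock underneath it, has no atomic read-to-write upgrade, which is why
do_ovn_lflow_add() releases the read lock, takes the write lock and then re-checks the
hash tables before modifying anything. A minimal standalone sketch of that
"read-mostly fast path, re-check under the write lock" idiom, written against plain
pthread primitives rather than the ovs-thread wrappers, with a boolean standing in for
the lflow hash-table lookup:

#include <pthread.h>
#include <stdbool.h>

static pthread_rwlock_t table_lock = PTHREAD_RWLOCK_INITIALIZER;
static bool entry_present;      /* stand-in for the lflow hash-table lookup */

static void
insert_entry(void)
{
    entry_present = true;       /* stand-in for hmap_insert() */
}

void
add_entry_if_missing(void)
{
    /* Fast path: shared lock, lookup only. */
    pthread_rwlock_rdlock(&table_lock);
    bool found = entry_present;
    pthread_rwlock_unlock(&table_lock);
    if (found) {
        return;
    }

    /* Slow path: exclusive lock, then re-check, because another thread
     * may have inserted the entry while we held no lock at all. */
    pthread_rwlock_wrlock(&table_lock);
    if (!entry_present) {
        insert_entry();
    }
    pthread_rwlock_unlock(&table_lock);
}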