Message ID | 20240214182616.3260992-1-xsimonar@redhat.com |
---|---|
State | Accepted |
Headers | show |
Series | [ovs-dev] northd: Initialize hmap size in lflow_mgr. | expand |
Context | Check | Description |
---|---|---|
ovsrobot/apply-robot | success | apply and check: success |
ovsrobot/github-robot-_Build_and_Test | success | github build: passed |
ovsrobot/github-robot-_ovn-kubernetes | success | github build: passed |
On Wed, Feb 14, 2024 at 1:26 PM Xavier Simonart <xsimonar@redhat.com> wrote: > > When (re)starting ovn-northd with an existing big nbdb, > the first iteration of northd was very slow as trying to > push all flows in a single bucket. > > Fixes: a623606052ea ("northd: Refactor lflow management into a separate module.") > > Signed-off-by: Xavier Simonart <xsimonar@redhat.com> Hi Xavier, Thanks for the patch. Did the first iteration of north become very slow after the lflow-mgr was added ? If I compare the code before these patches were merged, lflows_temp hmap was not a fast hmap [1]. So I'm still confused how this regression was introduced ? [1] - https://github.com/ovn-org/ovn/blob/branch-23.09/northd/northd.c#L16717 If you see the code I linked above, lflows_temp is declared and initialized as struct hmap lflows_temp = HMAP_INITIALIZER(&lflows_temp); And later a lflow is removed from the "lflows" hmap and inserted into it. And finally it is swapped with "lflows" hmap_swap(lflows, &lflows_temp); hmap_destroy(&lflows_temp); The same thing is done in the lflow-mgr too. I'm fine with this patch if it speeds up the first iteration. I want to understand if this behavior is a regression or not. Thanks Numan > --- > northd/lflow-mgr.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/northd/lflow-mgr.c b/northd/lflow-mgr.c > index 61729e903..af072bf70 100644 > --- a/northd/lflow-mgr.c > +++ b/northd/lflow-mgr.c > @@ -260,6 +260,9 @@ lflow_table_sync_to_sb(struct lflow_table *lflow_table, > struct hmap *lflows = &lflow_table->entries; > struct ovn_lflow *lflow; > > + fast_hmap_size_for(&lflows_temp, > + lflow_table->max_seen_lflow_size); > + > /* Push changes to the Logical_Flow table to database. */ > const struct sbrec_logical_flow *sbflow; > SBREC_LOGICAL_FLOW_TABLE_FOR_EACH_SAFE (sbflow, sb_flow_table) { > -- > 2.41.0 > > _______________________________________________ > dev mailing list > dev@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev >
Hi Numan Thanks for the quick review. On Wed, Feb 14, 2024 at 10:57 PM Numan Siddique <numans@ovn.org> wrote: > On Wed, Feb 14, 2024 at 1:26 PM Xavier Simonart <xsimonar@redhat.com> > wrote: > > > > When (re)starting ovn-northd with an existing big nbdb, > > the first iteration of northd was very slow as trying to > > push all flows in a single bucket. > > > > Fixes: a623606052ea ("northd: Refactor lflow management into a separate > module.") > > > > Signed-off-by: Xavier Simonart <xsimonar@redhat.com> > > Hi Xavier, > > Thanks for the patch. Did the first iteration of north become very > slow after the lflow-mgr was added ? > Yes, it was introduced by "northd: Refactor lflow management into a separate module" We can test this easily using the check-perf tests. Using "check perf TESTSUITEFLAGS="-v 2", it takes around 16 seconds (on my system) before the patch and 213 seconds after. The difference gets exponential if we increase the number of ports. > > If I compare the code before these patches were merged, lflows_temp > hmap was not a fast hmap [1]. > So I'm still confused how this regression was introduced ? > > [1] - > https://github.com/ovn-org/ovn/blob/branch-23.09/northd/northd.c#L16717 > > If you see the code I linked above, lflows_temp is declared and > initialized as > > struct hmap lflows_temp = HMAP_INITIALIZER(&lflows_temp); > > And later a lflow is removed from the "lflows" hmap and inserted into > it. And finally it is swapped with "lflows" > > hmap_swap(lflows, &lflows_temp); > hmap_destroy(&lflows_temp); > > The same thing is done in the lflow-mgr too. > > I'm fine with this patch if it speeds up the first iteration. I want > to understand if this behavior is a regression or not. > > Sorry, I should have added more comments in the commit message. The extra time spent is not in lflow_table_sync_to_sb. But the hmap_swap in lflow_table_sync_to_sb, when the hmap is empty, swaps an empty hmap initialized with 128 buckets with an empty hmap initialized with one bucket. Then, when we build flows the first time, the "fast" hmap has only one bucket and could get millions of flows in this poor bucket. When initialized with 128 buckets, the initial flow build is not great, but still much better ... Another fix could have been to set the number of buckets in lflows_temp to 128, or to avoid the swap if both hmap are empty. > Thanks > Numan > > Thanks Xavier > > > --- > > northd/lflow-mgr.c | 3 +++ > > 1 file changed, 3 insertions(+) > > > > diff --git a/northd/lflow-mgr.c b/northd/lflow-mgr.c > > index 61729e903..af072bf70 100644 > > --- a/northd/lflow-mgr.c > > +++ b/northd/lflow-mgr.c > > @@ -260,6 +260,9 @@ lflow_table_sync_to_sb(struct lflow_table > *lflow_table, > > struct hmap *lflows = &lflow_table->entries; > > struct ovn_lflow *lflow; > > > > + fast_hmap_size_for(&lflows_temp, > > + lflow_table->max_seen_lflow_size); > > + > > /* Push changes to the Logical_Flow table to database. */ > > const struct sbrec_logical_flow *sbflow; > > SBREC_LOGICAL_FLOW_TABLE_FOR_EACH_SAFE (sbflow, sb_flow_table) { > > -- > > 2.41.0 > > > > _______________________________________________ > > dev mailing list > > dev@openvswitch.org > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > > > >
On Thu, Feb 15, 2024 at 5:10 AM Xavier Simonart <xsimonar@redhat.com> wrote: > > Hi Numan > > Thanks for the quick review. > > On Wed, Feb 14, 2024 at 10:57 PM Numan Siddique <numans@ovn.org> wrote: > > > On Wed, Feb 14, 2024 at 1:26 PM Xavier Simonart <xsimonar@redhat.com> > > wrote: > > > > > > When (re)starting ovn-northd with an existing big nbdb, > > > the first iteration of northd was very slow as trying to > > > push all flows in a single bucket. > > > > > > Fixes: a623606052ea ("northd: Refactor lflow management into a separate > > module.") > > > > > > Signed-off-by: Xavier Simonart <xsimonar@redhat.com> > > > > Hi Xavier, > > > > Thanks for the patch. Did the first iteration of north become very > > slow after the lflow-mgr was added ? > > > Yes, it was introduced by "northd: Refactor lflow management into a > separate module" > We can test this easily using the check-perf tests. Using "check perf > TESTSUITEFLAGS="-v 2", it takes around 16 seconds (on my system) before the > patch and 213 seconds after. > The difference gets exponential if we increase the number of ports. > > > > > If I compare the code before these patches were merged, lflows_temp > > hmap was not a fast hmap [1]. > > So I'm still confused how this regression was introduced ? > > > > [1] - > > https://github.com/ovn-org/ovn/blob/branch-23.09/northd/northd.c#L16717 > > > > If you see the code I linked above, lflows_temp is declared and > > initialized as > > > > struct hmap lflows_temp = HMAP_INITIALIZER(&lflows_temp); > > > > And later a lflow is removed from the "lflows" hmap and inserted into > > it. And finally it is swapped with "lflows" > > > > hmap_swap(lflows, &lflows_temp); > > hmap_destroy(&lflows_temp); > > > > The same thing is done in the lflow-mgr too. > > > > I'm fine with this patch if it speeds up the first iteration. I want > > to understand if this behavior is a regression or not. > > > > Sorry, I should have added more comments in the commit message. The extra > time spent is not in lflow_table_sync_to_sb. > But the hmap_swap in lflow_table_sync_to_sb, when the hmap is empty, swaps > an empty hmap initialized with 128 buckets with an > empty hmap initialized with one bucket. > Then, when we build flows the first time, the "fast" hmap has only one > bucket and could get millions of flows in this poor bucket. > When initialized with 128 buckets, the initial flow build is not great, but > still much better ... > Another fix could have been to set the number of buckets in lflows_temp to > 128, or to avoid the swap if both hmap are empty. Now I understand what's going on. Thank you for the fix. I applied to main and backported to branch-24.03. Thanks Numan > > > Thanks > > Numan > > > > Thanks > Xavier > > > > > > --- > > > northd/lflow-mgr.c | 3 +++ > > > 1 file changed, 3 insertions(+) > > > > > > diff --git a/northd/lflow-mgr.c b/northd/lflow-mgr.c > > > index 61729e903..af072bf70 100644 > > > --- a/northd/lflow-mgr.c > > > +++ b/northd/lflow-mgr.c > > > @@ -260,6 +260,9 @@ lflow_table_sync_to_sb(struct lflow_table > > *lflow_table, > > > struct hmap *lflows = &lflow_table->entries; > > > struct ovn_lflow *lflow; > > > > > > + fast_hmap_size_for(&lflows_temp, > > > + lflow_table->max_seen_lflow_size); > > > + > > > /* Push changes to the Logical_Flow table to database. */ > > > const struct sbrec_logical_flow *sbflow; > > > SBREC_LOGICAL_FLOW_TABLE_FOR_EACH_SAFE (sbflow, sb_flow_table) { > > > -- > > > 2.41.0 > > > > > > _______________________________________________ > > > dev mailing list > > > dev@openvswitch.org > > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > > > > > > > > _______________________________________________ > dev mailing list > dev@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
diff --git a/northd/lflow-mgr.c b/northd/lflow-mgr.c index 61729e903..af072bf70 100644 --- a/northd/lflow-mgr.c +++ b/northd/lflow-mgr.c @@ -260,6 +260,9 @@ lflow_table_sync_to_sb(struct lflow_table *lflow_table, struct hmap *lflows = &lflow_table->entries; struct ovn_lflow *lflow; + fast_hmap_size_for(&lflows_temp, + lflow_table->max_seen_lflow_size); + /* Push changes to the Logical_Flow table to database. */ const struct sbrec_logical_flow *sbflow; SBREC_LOGICAL_FLOW_TABLE_FOR_EACH_SAFE (sbflow, sb_flow_table) {
When (re)starting ovn-northd with an existing big nbdb, the first iteration of northd was very slow as trying to push all flows in a single bucket. Fixes: a623606052ea ("northd: Refactor lflow management into a separate module.") Signed-off-by: Xavier Simonart <xsimonar@redhat.com> --- northd/lflow-mgr.c | 3 +++ 1 file changed, 3 insertions(+)