diff mbox series

[ovs-dev] northd: Fall back to 'northd' engine recompute for certain VIF scenarios.

Message ID 20230803192905.84512-1-numans@ovn.org
State Accepted
Headers show
Series [ovs-dev] northd: Fall back to 'northd' engine recompute for certain VIF scenarios. | expand

Checks

Context Check Description
ovsrobot/apply-robot success apply and check: success
ovsrobot/github-robot-_Build_and_Test fail github build: failed
ovsrobot/github-robot-_ovn-kubernetes fail github build: failed

Commit Message

Numan Siddique Aug. 3, 2023, 7:29 p.m. UTC
From: Numan Siddique <numans@ovn.org>

When a logical switch has only router ports and if a new VIF port is
added, both northd engine and lflow engine handle this change
incrementally, but it misses out on adding a few logical flows
where we have checks like :
   if (od->n_router_ports != od->nbs->n_ports) {
        ds_put_format(&actions, "clone {outport = %s; output; }; "
                                "outport = \""MC_FLOOD_L2"\"; output;",
                      patch_op->json_key);
        ....
   } else {
       ds_put_format(&actions, "outport = %s; output;", patch_op->json_key);
   }

The same issue is seen when a VIF port is deleted and after which the
logical switch has only router ports.

This patch fixes this issue by falling back to full recompute of northd
engine node.  It is possible to handle these changes incrementally
in northd engine node but fall back to full recompute in lflow engine
node.  But this patch goes for a simpler fix.  This can be optimized
later if required.

Fixes: b337750e45be ("northd: Incremental processing of VIF changes in 'northd' node.")
CC: Han Zhou <hzhou@ovn.org>
Signed-off-by: Numan Siddique <numans@ovn.org>
---
 northd/northd.c     | 10 ++++++++++
 tests/ovn-northd.at | 28 ++++++++++++++++++++++++++++
 2 files changed, 38 insertions(+)

Comments

Han Zhou Aug. 8, 2023, 7:45 a.m. UTC | #1
On Thu, Aug 3, 2023 at 12:29 PM <numans@ovn.org> wrote:
>
> From: Numan Siddique <numans@ovn.org>
>
> When a logical switch has only router ports and if a new VIF port is
> added, both northd engine and lflow engine handle this change
> incrementally, but it misses out on adding a few logical flows
> where we have checks like :
>    if (od->n_router_ports != od->nbs->n_ports) {
>         ds_put_format(&actions, "clone {outport = %s; output; }; "
>                                 "outport = \""MC_FLOOD_L2"\"; output;",
>                       patch_op->json_key);
>         ....
>    } else {
>        ds_put_format(&actions, "outport = %s; output;",
patch_op->json_key);
>    }
>
> The same issue is seen when a VIF port is deleted and after which the
> logical switch has only router ports.
>
> This patch fixes this issue by falling back to full recompute of northd
> engine node.  It is possible to handle these changes incrementally
> in northd engine node but fall back to full recompute in lflow engine
> node.  But this patch goes for a simpler fix.  This can be optimized
> later if required.
>

Good catch! This reminds me about a ddlog performance problem for this
scenario:
https://mail.openvswitch.org/pipermail/ovs-dev/2021-September/388184.html
+1 for the quick fix. Thanks a lot for identifying such corner cases.
I'd like to work on I-P for such a scenario because it may be quite common
in environments like ovn-k8s when the first LSP is added to a LS or the
last LSP is deleted from it.
Please also find some minor comments below.

With those addressed:
Acked-by: Han Zhou <hzhou@ovn.org>

> Fixes: b337750e45be ("northd: Incremental processing of VIF changes in
'northd' node.")
> CC: Han Zhou <hzhou@ovn.org>
> Signed-off-by: Numan Siddique <numans@ovn.org>
> ---
>  northd/northd.c     | 10 ++++++++++
>  tests/ovn-northd.at | 28 ++++++++++++++++++++++++++++
>  2 files changed, 38 insertions(+)
>
> diff --git a/northd/northd.c b/northd/northd.c
> index b9605862e..462fa83ca 100644
> --- a/northd/northd.c
> +++ b/northd/northd.c
> @@ -5120,6 +5120,11 @@ northd_handle_ls_changes(struct ovsdb_idl_txn
*ovnsb_idl_txn,
>              op->visited = false;
>          }
>
> +        bool only_rports = (od->n_router_ports ==
hmap_count(&od->ports));
> +        if (only_rports) {
> +            goto fail;
> +        }

Would be better to also check if the n_router_ports == 0 then no need to
fallback to recompute.
Would be better to add a comment here to explain why we are doing this
check.

> +
>          /* Compare the individual ports in the old and new Logical
Switches */
>          for (size_t j = 0; j < changed_ls->n_ports; ++j) {
>              struct nbrec_logical_switch_port *new_nbsp =
changed_ls->ports[j];
> @@ -5201,6 +5206,11 @@ northd_handle_ls_changes(struct ovsdb_idl_txn
*ovnsb_idl_txn,
>              }
>          }
>
> +        only_rports = (od->n_router_ports == hmap_count(&od->ports));
> +        if (only_rports) {
> +            goto fail_clean_deleted;
> +        }

Same as above. (comments may be different, because we are checking if the
last LSP was just got deleted)

> +
>          if (!ovs_list_is_empty(&ls_change->added_ports) ||
>              !ovs_list_is_empty(&ls_change->updated_ports) ||
>              !ovs_list_is_empty(&ls_change->deleted_ports)) {
> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
> index 3e06f14c9..912aa5431 100644
> --- a/tests/ovn-northd.at
> +++ b/tests/ovn-northd.at
> @@ -9589,6 +9589,34 @@ check_recompute_counter 0 0
>
>  CHECK_NO_CHANGE_AFTER_RECOMPUTE
>

nit: Better to add a comment to describe the purpose of the tests below
because they are quite different from the above basic tests. (or even
better if just move to a separate test case)

> +check ovn-nbctl --wait=hv ls-del ls0
> +check ovn-nbctl ls-add ls0
> +check ovn-nbctl --wait=sb lr-add lr0
> +ovn-nbctl lrp-add lr0 lr0-ls0 00:00:00:00:ff:01 192.168.0.1/24
> +ovn-nbctl lsp-add ls0 ls0-lr0
> +ovn-nbctl lsp-set-type ls0-lr0 router
> +ovn-nbctl lsp-set-addresses ls0-lr0 router
> +check ovn-nbctl --wait=sb lsp-set-options ls0-lr0 router-port=lr0-ls0
> +
> +CHECK_NO_CHANGE_AFTER_RECOMPUTE
> +
> +ovn-nbctl lb-add lb0 192.168.0.10:80 10.0.0.10:8080
> +check ovn-nbctl --wait=sb ls-lb-add ls0 lb0
> +CHECK_NO_CHANGE_AFTER_RECOMPUTE
> +
> +check as northd ovn-appctl -t NORTHD_TYPE inc-engine/clear-stats
> +# Add a lsp.  northd and lflow engine should recompute since this is
> +# the first lsp added after the router ports.
> +check ovn-nbctl --wait=hv lsp-add ls0 lsp0-1 -- lsp-set-addresses lsp0-1
"aa:aa:aa:00:00:01 192.168.0.11"
> +check_recompute_counter 1 1
> +CHECK_NO_CHANGE_AFTER_RECOMPUTE

nit: after asserting that the recompute is performed (the coutner 1 1),
there is no need to CHECK_NO_CHANGE_AFTER_RECOMPUTE.

> +
> +# Delete the lsp.  northd and lflow engine should recompute.
> +check as northd ovn-appctl -t NORTHD_TYPE inc-engine/clear-stats
> +check ovn-nbctl lsp-del lsp0-1
> +check_recompute_counter 1 1
> +CHECK_NO_CHANGE_AFTER_RECOMPUTE
> +
>  OVN_CLEANUP([hv1])
>  AT_CLEANUP
>  ])
> --
> 2.40.1
>
Numan Siddique Aug. 8, 2023, 6:19 p.m. UTC | #2
On Tue, Aug 8, 2023 at 1:15 PM Han Zhou <hzhou@ovn.org> wrote:
>
> On Thu, Aug 3, 2023 at 12:29 PM <numans@ovn.org> wrote:
> >
> > From: Numan Siddique <numans@ovn.org>
> >
> > When a logical switch has only router ports and if a new VIF port is
> > added, both northd engine and lflow engine handle this change
> > incrementally, but it misses out on adding a few logical flows
> > where we have checks like :
> >    if (od->n_router_ports != od->nbs->n_ports) {
> >         ds_put_format(&actions, "clone {outport = %s; output; }; "
> >                                 "outport = \""MC_FLOOD_L2"\"; output;",
> >                       patch_op->json_key);
> >         ....
> >    } else {
> >        ds_put_format(&actions, "outport = %s; output;",
> patch_op->json_key);
> >    }
> >
> > The same issue is seen when a VIF port is deleted and after which the
> > logical switch has only router ports.
> >
> > This patch fixes this issue by falling back to full recompute of northd
> > engine node.  It is possible to handle these changes incrementally
> > in northd engine node but fall back to full recompute in lflow engine
> > node.  But this patch goes for a simpler fix.  This can be optimized
> > later if required.
> >
>
> Good catch! This reminds me about a ddlog performance problem for this
> scenario:
> https://mail.openvswitch.org/pipermail/ovs-dev/2021-September/388184.html
> +1 for the quick fix. Thanks a lot for identifying such corner cases.
> I'd like to work on I-P for such a scenario because it may be quite common
> in environments like ovn-k8s when the first LSP is added to a LS or the
> last LSP is deleted from it.
> Please also find some minor comments below.
>
> With those addressed:
> Acked-by: Han Zhou <hzhou@ovn.org>

Thanks.  I addressed your comments and applied the patch to the main branch.

I added a separate test case as you suggested.

Numan

>
> > Fixes: b337750e45be ("northd: Incremental processing of VIF changes in
> 'northd' node.")
> > CC: Han Zhou <hzhou@ovn.org>
> > Signed-off-by: Numan Siddique <numans@ovn.org>
> > ---
> >  northd/northd.c     | 10 ++++++++++
> >  tests/ovn-northd.at | 28 ++++++++++++++++++++++++++++
> >  2 files changed, 38 insertions(+)
> >
> > diff --git a/northd/northd.c b/northd/northd.c
> > index b9605862e..462fa83ca 100644
> > --- a/northd/northd.c
> > +++ b/northd/northd.c
> > @@ -5120,6 +5120,11 @@ northd_handle_ls_changes(struct ovsdb_idl_txn
> *ovnsb_idl_txn,
> >              op->visited = false;
> >          }
> >
> > +        bool only_rports = (od->n_router_ports ==
> hmap_count(&od->ports));
> > +        if (only_rports) {
> > +            goto fail;
> > +        }
>
> Would be better to also check if the n_router_ports == 0 then no need to
> fallback to recompute.
> Would be better to add a comment here to explain why we are doing this
> check.
>
> > +
> >          /* Compare the individual ports in the old and new Logical
> Switches */
> >          for (size_t j = 0; j < changed_ls->n_ports; ++j) {
> >              struct nbrec_logical_switch_port *new_nbsp =
> changed_ls->ports[j];
> > @@ -5201,6 +5206,11 @@ northd_handle_ls_changes(struct ovsdb_idl_txn
> *ovnsb_idl_txn,
> >              }
> >          }
> >
> > +        only_rports = (od->n_router_ports == hmap_count(&od->ports));
> > +        if (only_rports) {
> > +            goto fail_clean_deleted;
> > +        }
>
> Same as above. (comments may be different, because we are checking if the
> last LSP was just got deleted)
>
> > +
> >          if (!ovs_list_is_empty(&ls_change->added_ports) ||
> >              !ovs_list_is_empty(&ls_change->updated_ports) ||
> >              !ovs_list_is_empty(&ls_change->deleted_ports)) {
> > diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
> > index 3e06f14c9..912aa5431 100644
> > --- a/tests/ovn-northd.at
> > +++ b/tests/ovn-northd.at
> > @@ -9589,6 +9589,34 @@ check_recompute_counter 0 0
> >
> >  CHECK_NO_CHANGE_AFTER_RECOMPUTE
> >
>
> nit: Better to add a comment to describe the purpose of the tests below
> because they are quite different from the above basic tests. (or even
> better if just move to a separate test case)
>
> > +check ovn-nbctl --wait=hv ls-del ls0
> > +check ovn-nbctl ls-add ls0
> > +check ovn-nbctl --wait=sb lr-add lr0
> > +ovn-nbctl lrp-add lr0 lr0-ls0 00:00:00:00:ff:01 192.168.0.1/24
> > +ovn-nbctl lsp-add ls0 ls0-lr0
> > +ovn-nbctl lsp-set-type ls0-lr0 router
> > +ovn-nbctl lsp-set-addresses ls0-lr0 router
> > +check ovn-nbctl --wait=sb lsp-set-options ls0-lr0 router-port=lr0-ls0
> > +
> > +CHECK_NO_CHANGE_AFTER_RECOMPUTE
> > +
> > +ovn-nbctl lb-add lb0 192.168.0.10:80 10.0.0.10:8080
> > +check ovn-nbctl --wait=sb ls-lb-add ls0 lb0
> > +CHECK_NO_CHANGE_AFTER_RECOMPUTE
> > +
> > +check as northd ovn-appctl -t NORTHD_TYPE inc-engine/clear-stats
> > +# Add a lsp.  northd and lflow engine should recompute since this is
> > +# the first lsp added after the router ports.
> > +check ovn-nbctl --wait=hv lsp-add ls0 lsp0-1 -- lsp-set-addresses lsp0-1
> "aa:aa:aa:00:00:01 192.168.0.11"
> > +check_recompute_counter 1 1
> > +CHECK_NO_CHANGE_AFTER_RECOMPUTE
>
> nit: after asserting that the recompute is performed (the coutner 1 1),
> there is no need to CHECK_NO_CHANGE_AFTER_RECOMPUTE.
>
> > +
> > +# Delete the lsp.  northd and lflow engine should recompute.
> > +check as northd ovn-appctl -t NORTHD_TYPE inc-engine/clear-stats
> > +check ovn-nbctl lsp-del lsp0-1
> > +check_recompute_counter 1 1
> > +CHECK_NO_CHANGE_AFTER_RECOMPUTE
> > +
> >  OVN_CLEANUP([hv1])
> >  AT_CLEANUP
> >  ])
> > --
> > 2.40.1
> >
> _______________________________________________
> dev mailing list
> dev@openvswitch.org
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
diff mbox series

Patch

diff --git a/northd/northd.c b/northd/northd.c
index b9605862e..462fa83ca 100644
--- a/northd/northd.c
+++ b/northd/northd.c
@@ -5120,6 +5120,11 @@  northd_handle_ls_changes(struct ovsdb_idl_txn *ovnsb_idl_txn,
             op->visited = false;
         }
 
+        bool only_rports = (od->n_router_ports == hmap_count(&od->ports));
+        if (only_rports) {
+            goto fail;
+        }
+
         /* Compare the individual ports in the old and new Logical Switches */
         for (size_t j = 0; j < changed_ls->n_ports; ++j) {
             struct nbrec_logical_switch_port *new_nbsp = changed_ls->ports[j];
@@ -5201,6 +5206,11 @@  northd_handle_ls_changes(struct ovsdb_idl_txn *ovnsb_idl_txn,
             }
         }
 
+        only_rports = (od->n_router_ports == hmap_count(&od->ports));
+        if (only_rports) {
+            goto fail_clean_deleted;
+        }
+
         if (!ovs_list_is_empty(&ls_change->added_ports) ||
             !ovs_list_is_empty(&ls_change->updated_ports) ||
             !ovs_list_is_empty(&ls_change->deleted_ports)) {
diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
index 3e06f14c9..912aa5431 100644
--- a/tests/ovn-northd.at
+++ b/tests/ovn-northd.at
@@ -9589,6 +9589,34 @@  check_recompute_counter 0 0
 
 CHECK_NO_CHANGE_AFTER_RECOMPUTE
 
+check ovn-nbctl --wait=hv ls-del ls0
+check ovn-nbctl ls-add ls0
+check ovn-nbctl --wait=sb lr-add lr0
+ovn-nbctl lrp-add lr0 lr0-ls0 00:00:00:00:ff:01 192.168.0.1/24
+ovn-nbctl lsp-add ls0 ls0-lr0
+ovn-nbctl lsp-set-type ls0-lr0 router
+ovn-nbctl lsp-set-addresses ls0-lr0 router
+check ovn-nbctl --wait=sb lsp-set-options ls0-lr0 router-port=lr0-ls0
+
+CHECK_NO_CHANGE_AFTER_RECOMPUTE
+
+ovn-nbctl lb-add lb0 192.168.0.10:80 10.0.0.10:8080
+check ovn-nbctl --wait=sb ls-lb-add ls0 lb0
+CHECK_NO_CHANGE_AFTER_RECOMPUTE
+
+check as northd ovn-appctl -t NORTHD_TYPE inc-engine/clear-stats
+# Add a lsp.  northd and lflow engine should recompute since this is
+# the first lsp added after the router ports.
+check ovn-nbctl --wait=hv lsp-add ls0 lsp0-1 -- lsp-set-addresses lsp0-1 "aa:aa:aa:00:00:01 192.168.0.11"
+check_recompute_counter 1 1
+CHECK_NO_CHANGE_AFTER_RECOMPUTE
+
+# Delete the lsp.  northd and lflow engine should recompute.
+check as northd ovn-appctl -t NORTHD_TYPE inc-engine/clear-stats
+check ovn-nbctl lsp-del lsp0-1
+check_recompute_counter 1 1
+CHECK_NO_CHANGE_AFTER_RECOMPUTE
+
 OVN_CLEANUP([hv1])
 AT_CLEANUP
 ])