Message ID | 20220208121536.3802003-1-xsimonar@redhat.com |
---|---|
State | Superseded |
Headers | show |
Series | [ovs-dev] ovn-controller: Delay ovn-installed if sb db is readonly | expand |
Context | Check | Description |
---|---|---|
ovsrobot/apply-robot | success | apply and check: success |
ovsrobot/github-robot-_Build_and_Test | success | github build: passed |
ovsrobot/github-robot-_ovn-kubernetes | success | github build: passed |
On 2/8/22 13:15, Xavier Simonart wrote: > Before this patch, when all flows for a port have been installed in OVS, > ovn-installed was immediately written to the OVS DB; the notification of > this change to the OVN controller (handled by OVS interface handler) > might cause a full recompute if the SBDB is readonly / still handling > the chassis update in PortBinding. > With this patch, ovn-installed is not written to the OVS DB while the > SBDB is readonly. This reduces the number of (un-necessary) full > recomputes. > Note that this was causing some spurious full recompute in test "ACL > with Port Group conjunction flow efficiency", causing the test to fail > from time to time. The test case was also updated as part of this patch > to check that ovn-installed was always properly set. > > Signed-off-by: Xavier Simonart <xsimonar@redhat.com> > --- Hi Xavier, > controller/binding.c | 2 +- > tests/ovn.at | 10 +++++++++- > 2 files changed, 10 insertions(+), 2 deletions(-) > > diff --git a/controller/binding.c b/controller/binding.c > index 4d62b0858..536cbc513 100644 > --- a/controller/binding.c > +++ b/controller/binding.c > @@ -688,7 +688,7 @@ local_binding_set_up(struct shash *local_bindings, const char *pb_name, > local_binding_find(local_bindings, pb_name); > struct binding_lport *b_lport = local_binding_get_primary_lport(lbinding); > > - if (!ovs_readonly && lbinding && lbinding->iface > + if (!ovs_readonly && !sb_readonly && lbinding && lbinding->iface > && !smap_get_bool(&lbinding->iface->external_ids, > OVN_INSTALLED_EXT_ID, false)) { This makes sense but we're adding a dependency on the SB. In some cases, e.g., at scale when the SB ovn-controller is connected to is snapshotting, it can take a while (a few seconds) until a locally initiated transaction commits and ovn-controller is notified. I'm guessing those are corner cases, but it would be nice if we could test this change at scale with a CMS that uses the "ovn-installed" OVS external-id. AFAIK the only CMS doing that is ovn-kubernetes. Pending some scale test results with OpenShift/Kubernetes with ovn-k8s, and with a few minor comments below: Acked-by: Dumitru Ceara <dceara@redhat.com> > VLOG_INFO("Setting lport %s ovn-installed in OVS", pb_name); > diff --git a/tests/ovn.at b/tests/ovn.at > index 957eb7850..1a009561b 100644 > --- a/tests/ovn.at > +++ b/tests/ovn.at > @@ -28325,6 +28325,8 @@ sim_add hv1 > as hv1 > ovs-vsctl add-br br-phys > ovn_attach n1 br-phys 192.168.0.1 > +#ovn-appctl -t ovn-controller vlog/set dbg > +ovn-appctl -t ovn-sb vlog/set dbg > I'm guessing these were debug leftovers. > ovn-nbctl lr-add lr0 > > @@ -28353,6 +28355,8 @@ ovs-vsctl add-port br-int lsp0-1 -- set interface lsp0-1 external_ids:iface-id=l > > check ovn-nbctl --wait=hv sync > AT_CHECK([test $(ovs-ofctl dump-flows br-int table=44 | grep conjunction | wc -l) == 22]) > +OVS_WAIT_UNTIL([test `ovs-vsctl get Interface lsp0-0 external_ids:ovn-installed` = '"true"']) > +OVS_WAIT_UNTIL([test `ovs-vsctl get Interface lsp0-1 external_ids:ovn-installed` = '"true"']) > > # Save the current lflow_run counter > lflow_run=$(ovn-appctl -t ovn-controller coverage/read-counter lflow_run) > @@ -28377,11 +28381,13 @@ AT_CHECK([test $(ovs-ofctl dump-flows br-int table=44 | grep 192.168 | wc -l) == > ovs-vsctl add-port br-int lsp0-0 -- set interface lsp0-0 external_ids:iface-id=lsp0-0 > check ovn-nbctl --wait=hv sync > AT_CHECK([test $(ovs-ofctl dump-flows br-int table=44 | grep conjunction | wc -l) == 12]) > +OVS_WAIT_UNTIL([test `ovs-vsctl get Interface lsp0-0 external_ids:ovn-installed` = '"true"']) > > # 4. Bind a lsp (lsp9-0) that doesn't belong to pg1, should not see any change. > ovs-vsctl add-port br-int lsp9-0 -- set interface lsp9-0 external_ids:iface-id=lsp9-0 > check ovn-nbctl --wait=hv sync > AT_CHECK([test $(ovs-ofctl dump-flows br-int table=44 | grep conjunction | wc -l) == 12]) > +OVS_WAIT_UNTIL([test `ovs-vsctl get Interface lsp9-0 external_ids:ovn-installed` = '"true"']) > > # 5. Bind another 2 lsps (lsp1-0 lsp1-1) that belong to pg1 and on a different > # LS (ls1), should see conjunction flows doubled (12 x 2 = 24) > @@ -28389,6 +28395,8 @@ ovs-vsctl add-port br-int lsp1-0 -- set interface lsp1-0 external_ids:iface-id=l > ovs-vsctl add-port br-int lsp1-1 -- set interface lsp1-1 external_ids:iface-id=lsp1-1 > check ovn-nbctl --wait=hv sync > AT_CHECK([test $(ovs-ofctl dump-flows br-int table=44 | grep conjunction | wc -l) == 24]) > +OVS_WAIT_UNTIL([test `ovs-vsctl get Interface lsp1-0 external_ids:ovn-installed` = '"true"']) > +OVS_WAIT_UNTIL([test `ovs-vsctl get Interface lsp1-1 external_ids:ovn-installed` = '"true"']) > > # 6. Simulate a SB port-group "del and add" notification to ovn-controller in the > # same IDL iteration. ovn-controller should still program the same flows. In > @@ -28414,11 +28422,11 @@ for i in $(seq 1 10); do > > # Finally check flow count is the same as before. > AT_CHECK([test $(ovs-ofctl dump-flows br-int table=44 | grep conjunction | wc -l) == 24]) > + OVS_WAIT_UNTIL([test `ovs-vsctl get Interface lsp0-0 external_ids:ovn-installed` = '"true"']) > done > > # Make sure all the above was performed with I-P (no recompute) > AT_CHECK([test $(ovn-appctl -t ovn-controller coverage/read-counter lflow_run) == $lflow_run]) > - Accidental? I would've missed it but it actually generates a conflict when applying on top of current main branch. > OVN_CLEANUP([hv1]) > AT_CLEANUP > ]) Thanks, Dumitru
diff --git a/controller/binding.c b/controller/binding.c index 4d62b0858..536cbc513 100644 --- a/controller/binding.c +++ b/controller/binding.c @@ -688,7 +688,7 @@ local_binding_set_up(struct shash *local_bindings, const char *pb_name, local_binding_find(local_bindings, pb_name); struct binding_lport *b_lport = local_binding_get_primary_lport(lbinding); - if (!ovs_readonly && lbinding && lbinding->iface + if (!ovs_readonly && !sb_readonly && lbinding && lbinding->iface && !smap_get_bool(&lbinding->iface->external_ids, OVN_INSTALLED_EXT_ID, false)) { VLOG_INFO("Setting lport %s ovn-installed in OVS", pb_name); diff --git a/tests/ovn.at b/tests/ovn.at index 957eb7850..1a009561b 100644 --- a/tests/ovn.at +++ b/tests/ovn.at @@ -28325,6 +28325,8 @@ sim_add hv1 as hv1 ovs-vsctl add-br br-phys ovn_attach n1 br-phys 192.168.0.1 +#ovn-appctl -t ovn-controller vlog/set dbg +ovn-appctl -t ovn-sb vlog/set dbg ovn-nbctl lr-add lr0 @@ -28353,6 +28355,8 @@ ovs-vsctl add-port br-int lsp0-1 -- set interface lsp0-1 external_ids:iface-id=l check ovn-nbctl --wait=hv sync AT_CHECK([test $(ovs-ofctl dump-flows br-int table=44 | grep conjunction | wc -l) == 22]) +OVS_WAIT_UNTIL([test `ovs-vsctl get Interface lsp0-0 external_ids:ovn-installed` = '"true"']) +OVS_WAIT_UNTIL([test `ovs-vsctl get Interface lsp0-1 external_ids:ovn-installed` = '"true"']) # Save the current lflow_run counter lflow_run=$(ovn-appctl -t ovn-controller coverage/read-counter lflow_run) @@ -28377,11 +28381,13 @@ AT_CHECK([test $(ovs-ofctl dump-flows br-int table=44 | grep 192.168 | wc -l) == ovs-vsctl add-port br-int lsp0-0 -- set interface lsp0-0 external_ids:iface-id=lsp0-0 check ovn-nbctl --wait=hv sync AT_CHECK([test $(ovs-ofctl dump-flows br-int table=44 | grep conjunction | wc -l) == 12]) +OVS_WAIT_UNTIL([test `ovs-vsctl get Interface lsp0-0 external_ids:ovn-installed` = '"true"']) # 4. Bind a lsp (lsp9-0) that doesn't belong to pg1, should not see any change. ovs-vsctl add-port br-int lsp9-0 -- set interface lsp9-0 external_ids:iface-id=lsp9-0 check ovn-nbctl --wait=hv sync AT_CHECK([test $(ovs-ofctl dump-flows br-int table=44 | grep conjunction | wc -l) == 12]) +OVS_WAIT_UNTIL([test `ovs-vsctl get Interface lsp9-0 external_ids:ovn-installed` = '"true"']) # 5. Bind another 2 lsps (lsp1-0 lsp1-1) that belong to pg1 and on a different # LS (ls1), should see conjunction flows doubled (12 x 2 = 24) @@ -28389,6 +28395,8 @@ ovs-vsctl add-port br-int lsp1-0 -- set interface lsp1-0 external_ids:iface-id=l ovs-vsctl add-port br-int lsp1-1 -- set interface lsp1-1 external_ids:iface-id=lsp1-1 check ovn-nbctl --wait=hv sync AT_CHECK([test $(ovs-ofctl dump-flows br-int table=44 | grep conjunction | wc -l) == 24]) +OVS_WAIT_UNTIL([test `ovs-vsctl get Interface lsp1-0 external_ids:ovn-installed` = '"true"']) +OVS_WAIT_UNTIL([test `ovs-vsctl get Interface lsp1-1 external_ids:ovn-installed` = '"true"']) # 6. Simulate a SB port-group "del and add" notification to ovn-controller in the # same IDL iteration. ovn-controller should still program the same flows. In @@ -28414,11 +28422,11 @@ for i in $(seq 1 10); do # Finally check flow count is the same as before. AT_CHECK([test $(ovs-ofctl dump-flows br-int table=44 | grep conjunction | wc -l) == 24]) + OVS_WAIT_UNTIL([test `ovs-vsctl get Interface lsp0-0 external_ids:ovn-installed` = '"true"']) done # Make sure all the above was performed with I-P (no recompute) AT_CHECK([test $(ovn-appctl -t ovn-controller coverage/read-counter lflow_run) == $lflow_run]) - OVN_CLEANUP([hv1]) AT_CLEANUP ])
Before this patch, when all flows for a port have been installed in OVS, ovn-installed was immediately written to the OVS DB; the notification of this change to the OVN controller (handled by OVS interface handler) might cause a full recompute if the SBDB is readonly / still handling the chassis update in PortBinding. With this patch, ovn-installed is not written to the OVS DB while the SBDB is readonly. This reduces the number of (un-necessary) full recomputes. Note that this was causing some spurious full recompute in test "ACL with Port Group conjunction flow efficiency", causing the test to fail from time to time. The test case was also updated as part of this patch to check that ovn-installed was always properly set. Signed-off-by: Xavier Simonart <xsimonar@redhat.com> --- controller/binding.c | 2 +- tests/ovn.at | 10 +++++++++- 2 files changed, 10 insertions(+), 2 deletions(-)