Message ID | 1598869480-28137-1-git-send-email-dceara@redhat.com |
---|---|
State | Superseded |
Headers | show |
Series | [ovs-dev] ovn-controller: Fix incremental processing of Port_Binding deletes. | expand |
On Mon, Aug 31, 2020 at 3:55 PM Dumitru Ceara <dceara@redhat.com> wrote: > If a Port_Binding is deleted from the Southbound DB and the > corresponding OVS interface is also deleted from the OVS DB, and > if both updates are received and processed by ovn-controller in > the same iteration, ovn-controller should not be allowed to write > to the (to be deleted) Port_Binding record stored in memory. > > This commit also adds three new unixctl debug commands for > ovn-controller: > - debug/pause: pause ovn-controller processing, except unixctl commands. > - debug/resume: resume ovn-controller processing. > - debug/status: return the status of the ovn-controller processing. > > These new commands are needed by the test for this scenario as without > them we have no way of ensuring predictable results. Users should not > use these commands in production. This is also why the commands are not > documented. > > CC: Numan Siddique <numans@ovn.org> > Fixes: 6b0f01116bab ("ovn-controller: Handle runtime data changes in flow > output engine") > Reported-by: Tim Rozet <trozet@redhat.com> > Reported-at: https://bugzilla.redhat.com/1871961 > Signed-off-by: Dumitru Ceara <dceara@redhat.com> > --- > Thanks Dumitru for the fix. The issue is seen because we handle OVS interface changes first and then the port binding changes. I think we can address this issue with lesser changes to the code by handling port binding changes first in the runtime data engine. I think the I-P engine respects the order of inputs defined on an engine node right ? If the engine guarantees this, then I think we can take this approach. What do you think ? Thanks Numan > controller/binding.c | 74 > ++++++++++++++++++++++++++++++--------------- > controller/ovn-controller.c | 58 +++++++++++++++++++++++++++++++++++ > tests/ovn.at | 38 +++++++++++++++++++++++ > 3 files changed, 145 insertions(+), 25 deletions(-) > > diff --git a/controller/binding.c b/controller/binding.c > index 3c102dc..80a7898 100644 > --- a/controller/binding.c > +++ b/controller/binding.c > @@ -837,24 +837,40 @@ claim_lport(const struct sbrec_port_binding *pb, > return false; > } > > - if (pb->chassis) { > - VLOG_INFO("Changing chassis for lport %s from %s to %s.", > - pb->logical_port, pb->chassis->name, > - chassis_rec->name); > - } else { > - VLOG_INFO("Claiming lport %s for this chassis.", > pb->logical_port); > - } > - for (int i = 0; i < pb->n_mac; i++) { > - VLOG_INFO("%s: Claiming %s", pb->logical_port, pb->mac[i]); > - } > + /* Update the port_binding with our chassis only if the port > binding > + * is not being deleted from the SB. > + */ > + if (!sbrec_port_binding_is_deleted(pb)) { > + if (pb->chassis) { > + VLOG_INFO("Changing chassis for lport %s from %s to %s.", > + pb->logical_port, pb->chassis->name, > + chassis_rec->name); > + } else { > + VLOG_INFO("Claiming lport %s for this chassis.", > + pb->logical_port); > + } > + for (size_t i = 0; i < pb->n_mac; i++) { > + VLOG_INFO("%s: Claiming %s", pb->logical_port, > pb->mac[i]); > + } > > - sbrec_port_binding_set_chassis(pb, chassis_rec); > + sbrec_port_binding_set_chassis(pb, chassis_rec); > + } > > + /* In any case, update tracked_datapaths as we need to track all > types > + * of updates, including deletes. > + */ > if (tracked_datapaths) { > update_lport_tracking(pb, tracked_datapaths); > } > } > > + /* No need to check and update the encaps if the port_binding is being > + * deleted. > + */ > + if (sbrec_port_binding_is_deleted(pb)) { > + return true; > + } > + > /* Check if the port encap binding, if any, has changed */ > struct sbrec_encap *encap_rec = > sbrec_get_port_encap(chassis_rec, iface_rec); > @@ -879,27 +895,35 @@ static bool > release_lport(const struct sbrec_port_binding *pb, bool sb_readonly, > struct hmap *tracked_datapaths) > { > - if (pb->encap) { > - if (sb_readonly) { > - return false; > + /* Clear the port_binding encap, chassis, and virtual_parent fields > + * only if the port binding is not being deleted from the SB. > + */ > + if (!sbrec_port_binding_is_deleted(pb)) { > + if (pb->encap) { > + if (sb_readonly) { > + return false; > + } > + sbrec_port_binding_set_encap(pb, NULL); > } > - sbrec_port_binding_set_encap(pb, NULL); > - } > > - if (pb->chassis) { > - if (sb_readonly) { > - return false; > + if (pb->chassis) { > + if (sb_readonly) { > + return false; > + } > + sbrec_port_binding_set_chassis(pb, NULL); > } > - sbrec_port_binding_set_chassis(pb, NULL); > - } > > - if (pb->virtual_parent) { > - if (sb_readonly) { > - return false; > + if (pb->virtual_parent) { > + if (sb_readonly) { > + return false; > + } > + sbrec_port_binding_set_virtual_parent(pb, NULL); > } > - sbrec_port_binding_set_virtual_parent(pb, NULL); > } > > + /* In any case, update tracked_datapaths as we need to track all types > + * of updates, including deletes. > + */ > update_lport_tracking(pb, tracked_datapaths); > VLOG_INFO("Releasing lport %s from this chassis.", pb->logical_port); > return true; > diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c > index d299c61..37b09c4 100644 > --- a/controller/ovn-controller.c > +++ b/controller/ovn-controller.c > @@ -73,6 +73,9 @@ static unixctl_cb_func extend_table_list; > static unixctl_cb_func inject_pkt; > static unixctl_cb_func engine_recompute_cmd; > static unixctl_cb_func cluster_state_reset_cmd; > +static unixctl_cb_func debug_pause_execution; > +static unixctl_cb_func debug_resume_execution; > +static unixctl_cb_func debug_status_execution; > > #define DEFAULT_BRIDGE_NAME "br-int" > #define DEFAULT_PROBE_INTERVAL_MSEC 5000 > @@ -2342,6 +2345,14 @@ main(int argc, char *argv[]) > cluster_state_reset_cmd, > &reset_ovnsb_idl_min_index); > > + bool paused = false; > + unixctl_command_register("debug/pause", "", 0, 0, > debug_pause_execution, > + &paused); > + unixctl_command_register("debug/resume", "", 0, 0, > debug_resume_execution, > + &paused); > + unixctl_command_register("debug/status", "", 0, 0, > debug_status_execution, > + &paused); > + > unsigned int ovs_cond_seqno = UINT_MAX; > unsigned int ovnsb_cond_seqno = UINT_MAX; > > @@ -2350,6 +2361,15 @@ main(int argc, char *argv[]) > restart = false; > bool sb_monitor_all = false; > while (!exiting) { > + /* If we're paused just run the unixctl server and skip most of > the > + * processing loop. > + */ > + if (paused) { > + unixctl_server_run(unixctl); > + unixctl_server_wait(unixctl); > + goto loop_done; > + } > + > engine_init_run(); > > struct ovsdb_idl_txn *ovs_idl_txn = > ovsdb_idl_loop_run(&ovs_idl_loop); > @@ -2604,6 +2624,8 @@ main(int argc, char *argv[]) > > ovsdb_idl_track_clear(ovnsb_idl_loop.idl); > ovsdb_idl_track_clear(ovs_idl_loop.idl); > + > +loop_done: > poll_block(); > if (should_service_stop()) { > exiting = true; > @@ -2857,3 +2879,39 @@ cluster_state_reset_cmd(struct unixctl_conn *conn, > int argc OVS_UNUSED, > poll_immediate_wake(); > unixctl_command_reply(conn, NULL); > } > + > +static void > +debug_pause_execution(struct unixctl_conn *conn, int argc OVS_UNUSED, > + const char *argv[] OVS_UNUSED, void *paused_) > +{ > + bool *paused = paused_; > + > + VLOG_INFO("User triggered execution pause."); > + *paused = true; > + unixctl_command_reply(conn, NULL); > +} > + > +static void > +debug_resume_execution(struct unixctl_conn *conn, int argc OVS_UNUSED, > + const char *argv[] OVS_UNUSED, void *paused_) > +{ > + bool *paused = paused_; > + > + VLOG_INFO("User triggered execution resume."); > + *paused = false; > + poll_immediate_wake(); > + unixctl_command_reply(conn, NULL); > +} > + > +static void > +debug_status_execution(struct unixctl_conn *conn, int argc OVS_UNUSED, > + const char *argv[] OVS_UNUSED, void *paused_) > +{ > + bool *paused = paused_; > + > + if (*paused) { > + unixctl_command_reply(conn, "paused"); > + } else { > + unixctl_command_reply(conn, "running"); > + } > +} > diff --git a/tests/ovn.at b/tests/ovn.at > index c4edbd9..5ad51c0 100644 > --- a/tests/ovn.at > +++ b/tests/ovn.at > @@ -21371,3 +21371,41 @@ AT_CHECK([ovn-sbctl find mac ip=10.0.0.2 > mac='"00:00:00:00:03:02"' logical_port= > OVN_CLEANUP([hv1],[hv2]) > > AT_CLEANUP > + > +AT_SETUP([ovn -- Delete Port_Binding and OVS port Incremental Processing]) > +ovn_start > + > +net_add n1 > +sim_add hv1 > +as hv1 > +ovs-vsctl add-br br-phys > +ovn_attach n1 br-phys 192.168.0.10 > + > +ovn-nbctl ls-add ls > +ovn-nbctl lsp-add ls lsp > + > +as hv1 ovs-vsctl \ > + -- add-port br-int vif1 \ > + -- set Interface vif1 external_ids:iface-id=lsp > + > +# Wait for port to be bound. > +OVS_WAIT_UNTIL([test $(ovn-sbctl --columns _uuid --bare list chassis hv1 > | wc -l) -eq 1]) > +ch=$(ovn-sbctl --columns _uuid --bare list chassis hv1) > +OVS_WAIT_UNTIL([test $(ovn-sbctl --columns chassis --bare list > port_binding lsp | grep $ch -c) -eq 1]) > + > +# Pause ovn-controller. > +as hv1 ovn-appctl -t ovn-controller debug/pause > + > +# Delete port binding and OVS port. The updates will be processed in the > same > +# loop in ovn-controller when it resumes. > +ovn-nbctl --wait=sb lsp-del lsp > +as hv1 ovs-vsctl del-port vif1 > + > +# Resume ovn-controller. > +as hv1 ovn-appctl -t ovn-controller debug/resume > + > +# Make sure ovn-controller runs fine. > +OVS_WAIT_UNTIL([test x$(as hv1 ovn-appctl -t ovn-controller debug/status) > = "xrunning"]) > + > +OVN_CLEANUP([hv1]) > +AT_CLEANUP > -- > 1.8.3.1 > > _______________________________________________ > dev mailing list > dev@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > >
On 8/31/20 1:24 PM, Numan Siddique wrote: > > > On Mon, Aug 31, 2020 at 3:55 PM Dumitru Ceara <dceara@redhat.com > <mailto:dceara@redhat.com>> wrote: > > If a Port_Binding is deleted from the Southbound DB and the > corresponding OVS interface is also deleted from the OVS DB, and > if both updates are received and processed by ovn-controller in > the same iteration, ovn-controller should not be allowed to write > to the (to be deleted) Port_Binding record stored in memory. > > This commit also adds three new unixctl debug commands for > ovn-controller: > - debug/pause: pause ovn-controller processing, except unixctl commands. > - debug/resume: resume ovn-controller processing. > - debug/status: return the status of the ovn-controller processing. > > These new commands are needed by the test for this scenario as without > them we have no way of ensuring predictable results. Users should not > use these commands in production. This is also why the commands are not > documented. > > CC: Numan Siddique <numans@ovn.org <mailto:numans@ovn.org>> > Fixes: 6b0f01116bab ("ovn-controller: Handle runtime data changes in > flow output engine") > Reported-by: Tim Rozet <trozet@redhat.com <mailto:trozet@redhat.com>> > Reported-at: https://bugzilla.redhat.com/1871961 > Signed-off-by: Dumitru Ceara <dceara@redhat.com > <mailto:dceara@redhat.com>> > --- > > > Thanks Dumitru for the fix. > > The issue is seen because we handle OVS interface changes first and then > the port binding changes. > I think we can address this issue with lesser changes to the code by > handling port binding changes > first in the runtime data engine. > > I think the I-P engine respects the order of inputs defined on an engine > node right ? If the engine guarantees > this, then I think we can take this approach. > > What do you think ? > > Thanks > Numan > Yes, that works and is indeed a way smaller change. The new test case will fail if the order of the incremental engine nodes is changed back. I'll also add a comment in ovn-controller.c to mention why order is relevant. I'll send a v2 soon. Regards, Dumitru
diff --git a/controller/binding.c b/controller/binding.c index 3c102dc..80a7898 100644 --- a/controller/binding.c +++ b/controller/binding.c @@ -837,24 +837,40 @@ claim_lport(const struct sbrec_port_binding *pb, return false; } - if (pb->chassis) { - VLOG_INFO("Changing chassis for lport %s from %s to %s.", - pb->logical_port, pb->chassis->name, - chassis_rec->name); - } else { - VLOG_INFO("Claiming lport %s for this chassis.", pb->logical_port); - } - for (int i = 0; i < pb->n_mac; i++) { - VLOG_INFO("%s: Claiming %s", pb->logical_port, pb->mac[i]); - } + /* Update the port_binding with our chassis only if the port binding + * is not being deleted from the SB. + */ + if (!sbrec_port_binding_is_deleted(pb)) { + if (pb->chassis) { + VLOG_INFO("Changing chassis for lport %s from %s to %s.", + pb->logical_port, pb->chassis->name, + chassis_rec->name); + } else { + VLOG_INFO("Claiming lport %s for this chassis.", + pb->logical_port); + } + for (size_t i = 0; i < pb->n_mac; i++) { + VLOG_INFO("%s: Claiming %s", pb->logical_port, pb->mac[i]); + } - sbrec_port_binding_set_chassis(pb, chassis_rec); + sbrec_port_binding_set_chassis(pb, chassis_rec); + } + /* In any case, update tracked_datapaths as we need to track all types + * of updates, including deletes. + */ if (tracked_datapaths) { update_lport_tracking(pb, tracked_datapaths); } } + /* No need to check and update the encaps if the port_binding is being + * deleted. + */ + if (sbrec_port_binding_is_deleted(pb)) { + return true; + } + /* Check if the port encap binding, if any, has changed */ struct sbrec_encap *encap_rec = sbrec_get_port_encap(chassis_rec, iface_rec); @@ -879,27 +895,35 @@ static bool release_lport(const struct sbrec_port_binding *pb, bool sb_readonly, struct hmap *tracked_datapaths) { - if (pb->encap) { - if (sb_readonly) { - return false; + /* Clear the port_binding encap, chassis, and virtual_parent fields + * only if the port binding is not being deleted from the SB. + */ + if (!sbrec_port_binding_is_deleted(pb)) { + if (pb->encap) { + if (sb_readonly) { + return false; + } + sbrec_port_binding_set_encap(pb, NULL); } - sbrec_port_binding_set_encap(pb, NULL); - } - if (pb->chassis) { - if (sb_readonly) { - return false; + if (pb->chassis) { + if (sb_readonly) { + return false; + } + sbrec_port_binding_set_chassis(pb, NULL); } - sbrec_port_binding_set_chassis(pb, NULL); - } - if (pb->virtual_parent) { - if (sb_readonly) { - return false; + if (pb->virtual_parent) { + if (sb_readonly) { + return false; + } + sbrec_port_binding_set_virtual_parent(pb, NULL); } - sbrec_port_binding_set_virtual_parent(pb, NULL); } + /* In any case, update tracked_datapaths as we need to track all types + * of updates, including deletes. + */ update_lport_tracking(pb, tracked_datapaths); VLOG_INFO("Releasing lport %s from this chassis.", pb->logical_port); return true; diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c index d299c61..37b09c4 100644 --- a/controller/ovn-controller.c +++ b/controller/ovn-controller.c @@ -73,6 +73,9 @@ static unixctl_cb_func extend_table_list; static unixctl_cb_func inject_pkt; static unixctl_cb_func engine_recompute_cmd; static unixctl_cb_func cluster_state_reset_cmd; +static unixctl_cb_func debug_pause_execution; +static unixctl_cb_func debug_resume_execution; +static unixctl_cb_func debug_status_execution; #define DEFAULT_BRIDGE_NAME "br-int" #define DEFAULT_PROBE_INTERVAL_MSEC 5000 @@ -2342,6 +2345,14 @@ main(int argc, char *argv[]) cluster_state_reset_cmd, &reset_ovnsb_idl_min_index); + bool paused = false; + unixctl_command_register("debug/pause", "", 0, 0, debug_pause_execution, + &paused); + unixctl_command_register("debug/resume", "", 0, 0, debug_resume_execution, + &paused); + unixctl_command_register("debug/status", "", 0, 0, debug_status_execution, + &paused); + unsigned int ovs_cond_seqno = UINT_MAX; unsigned int ovnsb_cond_seqno = UINT_MAX; @@ -2350,6 +2361,15 @@ main(int argc, char *argv[]) restart = false; bool sb_monitor_all = false; while (!exiting) { + /* If we're paused just run the unixctl server and skip most of the + * processing loop. + */ + if (paused) { + unixctl_server_run(unixctl); + unixctl_server_wait(unixctl); + goto loop_done; + } + engine_init_run(); struct ovsdb_idl_txn *ovs_idl_txn = ovsdb_idl_loop_run(&ovs_idl_loop); @@ -2604,6 +2624,8 @@ main(int argc, char *argv[]) ovsdb_idl_track_clear(ovnsb_idl_loop.idl); ovsdb_idl_track_clear(ovs_idl_loop.idl); + +loop_done: poll_block(); if (should_service_stop()) { exiting = true; @@ -2857,3 +2879,39 @@ cluster_state_reset_cmd(struct unixctl_conn *conn, int argc OVS_UNUSED, poll_immediate_wake(); unixctl_command_reply(conn, NULL); } + +static void +debug_pause_execution(struct unixctl_conn *conn, int argc OVS_UNUSED, + const char *argv[] OVS_UNUSED, void *paused_) +{ + bool *paused = paused_; + + VLOG_INFO("User triggered execution pause."); + *paused = true; + unixctl_command_reply(conn, NULL); +} + +static void +debug_resume_execution(struct unixctl_conn *conn, int argc OVS_UNUSED, + const char *argv[] OVS_UNUSED, void *paused_) +{ + bool *paused = paused_; + + VLOG_INFO("User triggered execution resume."); + *paused = false; + poll_immediate_wake(); + unixctl_command_reply(conn, NULL); +} + +static void +debug_status_execution(struct unixctl_conn *conn, int argc OVS_UNUSED, + const char *argv[] OVS_UNUSED, void *paused_) +{ + bool *paused = paused_; + + if (*paused) { + unixctl_command_reply(conn, "paused"); + } else { + unixctl_command_reply(conn, "running"); + } +} diff --git a/tests/ovn.at b/tests/ovn.at index c4edbd9..5ad51c0 100644 --- a/tests/ovn.at +++ b/tests/ovn.at @@ -21371,3 +21371,41 @@ AT_CHECK([ovn-sbctl find mac ip=10.0.0.2 mac='"00:00:00:00:03:02"' logical_port= OVN_CLEANUP([hv1],[hv2]) AT_CLEANUP + +AT_SETUP([ovn -- Delete Port_Binding and OVS port Incremental Processing]) +ovn_start + +net_add n1 +sim_add hv1 +as hv1 +ovs-vsctl add-br br-phys +ovn_attach n1 br-phys 192.168.0.10 + +ovn-nbctl ls-add ls +ovn-nbctl lsp-add ls lsp + +as hv1 ovs-vsctl \ + -- add-port br-int vif1 \ + -- set Interface vif1 external_ids:iface-id=lsp + +# Wait for port to be bound. +OVS_WAIT_UNTIL([test $(ovn-sbctl --columns _uuid --bare list chassis hv1 | wc -l) -eq 1]) +ch=$(ovn-sbctl --columns _uuid --bare list chassis hv1) +OVS_WAIT_UNTIL([test $(ovn-sbctl --columns chassis --bare list port_binding lsp | grep $ch -c) -eq 1]) + +# Pause ovn-controller. +as hv1 ovn-appctl -t ovn-controller debug/pause + +# Delete port binding and OVS port. The updates will be processed in the same +# loop in ovn-controller when it resumes. +ovn-nbctl --wait=sb lsp-del lsp +as hv1 ovs-vsctl del-port vif1 + +# Resume ovn-controller. +as hv1 ovn-appctl -t ovn-controller debug/resume + +# Make sure ovn-controller runs fine. +OVS_WAIT_UNTIL([test x$(as hv1 ovn-appctl -t ovn-controller debug/status) = "xrunning"]) + +OVN_CLEANUP([hv1]) +AT_CLEANUP
If a Port_Binding is deleted from the Southbound DB and the corresponding OVS interface is also deleted from the OVS DB, and if both updates are received and processed by ovn-controller in the same iteration, ovn-controller should not be allowed to write to the (to be deleted) Port_Binding record stored in memory. This commit also adds three new unixctl debug commands for ovn-controller: - debug/pause: pause ovn-controller processing, except unixctl commands. - debug/resume: resume ovn-controller processing. - debug/status: return the status of the ovn-controller processing. These new commands are needed by the test for this scenario as without them we have no way of ensuring predictable results. Users should not use these commands in production. This is also why the commands are not documented. CC: Numan Siddique <numans@ovn.org> Fixes: 6b0f01116bab ("ovn-controller: Handle runtime data changes in flow output engine") Reported-by: Tim Rozet <trozet@redhat.com> Reported-at: https://bugzilla.redhat.com/1871961 Signed-off-by: Dumitru Ceara <dceara@redhat.com> --- controller/binding.c | 74 ++++++++++++++++++++++++++++++--------------- controller/ovn-controller.c | 58 +++++++++++++++++++++++++++++++++++ tests/ovn.at | 38 +++++++++++++++++++++++ 3 files changed, 145 insertions(+), 25 deletions(-)