Message ID | 20240320054127.979228-1-hzhou@ovn.org |
---|---|
State | Accepted |
Headers | show |
Series | [ovs-dev] ovn-controller: Fix busy loop when ofctrl is disconnected. | expand |
Context | Check | Description |
---|---|---|
ovsrobot/apply-robot | success | apply and check: success |
ovsrobot/github-robot-_Build_and_Test | success | github build: passed |
ovsrobot/github-robot-_ovn-kubernetes | success | github build: passed |
On 3/20/24 06:41, Han Zhou wrote: > ovn-controller runs at 100% cpu when OVS exits. This is because the > ofctrl_run is not called while ofctrl_wait is always called in the main > loop. Because of the missing ofctrl_run, it doesn't even detect that the > ofctrl connection is disconnected. > > This patch fixes the issue by always giving a chance to run ofctrl_run > as long as ofctrl_wait is called. > > Fixes: 1d6d953bf883 ("controller: Don't artificially limit group and meter IDs to 16bit.") > Fixes: 94cbc59dc0f1 ("ovn-controller: Fix use of dangling pointers in I-P runtime_data.") > Signed-off-by: Han Zhou <hzhou@ovn.org> > --- Thanks for the patch, Han! I tested it in a sandbox and it fixes the issue. Looks good to me: Acked-by: Dumitru Ceara <dceara@redhat.com> Regards, Dumitru
On Wed, Mar 20, 2024 at 2:56 AM Dumitru Ceara <dceara@redhat.com> wrote: > > On 3/20/24 06:41, Han Zhou wrote: > > ovn-controller runs at 100% cpu when OVS exits. This is because the > > ofctrl_run is not called while ofctrl_wait is always called in the main > > loop. Because of the missing ofctrl_run, it doesn't even detect that the > > ofctrl connection is disconnected. > > > > This patch fixes the issue by always giving a chance to run ofctrl_run > > as long as ofctrl_wait is called. > > > > Fixes: 1d6d953bf883 ("controller: Don't artificially limit group and meter IDs to 16bit.") > > Fixes: 94cbc59dc0f1 ("ovn-controller: Fix use of dangling pointers in I-P runtime_data.") > > Signed-off-by: Han Zhou <hzhou@ovn.org> > > --- > > Thanks for the patch, Han! I tested it in a sandbox and it fixes the > issue. Looks good to me: > > Acked-by: Dumitru Ceara <dceara@redhat.com> > Thanks Dumitru. I applied to main and backported down to branch-23.06. > Regards, > Dumitru >
diff --git a/controller/ofctrl.c b/controller/ofctrl.c index f14cd79a8dbb..1a2d2be791eb 100644 --- a/controller/ofctrl.c +++ b/controller/ofctrl.c @@ -787,7 +787,7 @@ ofctrl_run(const struct ovsrec_bridge *br_int, rconn_run(swconn); - if (!rconn_is_connected(swconn)) { + if (!rconn_is_connected(swconn) || !pending_ct_zones) { return reconnected; } diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c index 1c9960c708bf..c9ff5967a2af 100644 --- a/controller/ovn-controller.c +++ b/controller/ovn-controller.c @@ -5736,10 +5736,11 @@ main(int argc, char *argv[]) } } - if (br_int && ovs_feature_set_discovered()) { + if (br_int) { ct_zones_data = engine_get_data(&en_ct_zones); - if (ct_zones_data && ofctrl_run(br_int, ovs_table, - &ct_zones_data->pending)) { + if (ofctrl_run(br_int, ovs_table, + ct_zones_data ? &ct_zones_data->pending + : NULL)) { static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1, 1); @@ -5748,7 +5749,7 @@ main(int argc, char *argv[]) engine_set_force_recompute(true); } - if (chassis) { + if (chassis && ovs_feature_set_discovered()) { encaps_run(ovs_idl_txn, br_int, sbrec_chassis_table_get(ovnsb_idl_loop.idl), chassis,
ovn-controller runs at 100% cpu when OVS exits. This is because the ofctrl_run is not called while ofctrl_wait is always called in the main loop. Because of the missing ofctrl_run, it doesn't even detect that the ofctrl connection is disconnected. This patch fixes the issue by always giving a chance to run ofctrl_run as long as ofctrl_wait is called. Fixes: 1d6d953bf883 ("controller: Don't artificially limit group and meter IDs to 16bit.") Fixes: 94cbc59dc0f1 ("ovn-controller: Fix use of dangling pointers in I-P runtime_data.") Signed-off-by: Han Zhou <hzhou@ovn.org> --- controller/ofctrl.c | 2 +- controller/ovn-controller.c | 9 +++++---- 2 files changed, 6 insertions(+), 5 deletions(-)