Message ID | 20240910132958.316433-1-amusil@redhat.com |
---|---|
State | Changes Requested |
Headers | show |
Series | [ovs-dev] northd, controller: Recompute immediately after commit failure. | expand |
Context | Check | Description |
---|---|---|
ovsrobot/apply-robot | success | apply and check: success |
ovsrobot/github-robot-_Build_and_Test | success | github build: passed |
ovsrobot/github-robot-_ovn-kubernetes | success | github build: passed |
On 9/10/24 15:29, Ales Musil wrote: > The recompute was requested when the commit failed, however the > engine run might have happened later than expected. This was caused > by poll_block() call, which was waiting for any event in the loop. > Make sure that the engine runs right away by calling > poll_immediate_wake() which ensures that the poll_block() returns > right away without any further waiting. > > Reported-at: https://issues.redhat.com/browse/FDP-753 > Signed-off-by: Ales Musil <amusil@redhat.com> > --- Hi Ales, Thanks for the fix! I wonder if we shouldn't just do this every time engine_set_force_recompute(true) is called. That is, call poll_immediate_wake() inside the engine_set_force_recompute() function. By the look there's no case in whicha forced recompute is requested where we have a reason to wait instead of immediately waking up to do the computation. We could also repurpose the engine_trigger_recompute() function. What do you think? Thanks, Dumitru > controller/ovn-controller.c | 1 + > northd/ovn-northd.c | 2 ++ > 2 files changed, 3 insertions(+) > > diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c > index c48667887..8b8ff3a92 100644 > --- a/controller/ovn-controller.c > +++ b/controller/ovn-controller.c > @@ -5816,6 +5816,7 @@ main(int argc, char *argv[]) > if (!ovsdb_idl_loop_commit_and_wait(&ovnsb_idl_loop)) { > VLOG_INFO("OVNSB commit failed, force recompute next time."); > engine_set_force_recompute(true); > + poll_immediate_wake(); > } > > int ovs_txn_status = ovsdb_idl_loop_commit_and_wait(&ovs_idl_loop); > diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c > index d71114f35..b3c80f012 100644 > --- a/northd/ovn-northd.c > +++ b/northd/ovn-northd.c > @@ -992,12 +992,14 @@ main(int argc, char *argv[]) > VLOG_INFO("OVNNB commit failed, " > "force recompute next time."); > eng_ctx.recompute = true; > + poll_immediate_wake(); > } > > if (!ovsdb_idl_loop_commit_and_wait(&ovnsb_idl_loop)) { > VLOG_INFO("OVNSB commit failed, " > "force recompute next time."); > eng_ctx.recompute = true; > + poll_immediate_wake(); > } > run_memory_trimmer(ovnnb_idl_loop.idl, activity); > } else {
On Mon, Sep 23, 2024 at 4:31 PM Dumitru Ceara <dceara@redhat.com> wrote: > On 9/10/24 15:29, Ales Musil wrote: > > The recompute was requested when the commit failed, however the > > engine run might have happened later than expected. This was caused > > by poll_block() call, which was waiting for any event in the loop. > > Make sure that the engine runs right away by calling > > poll_immediate_wake() which ensures that the poll_block() returns > > right away without any further waiting. > > > > Reported-at: https://issues.redhat.com/browse/FDP-753 > > Signed-off-by: Ales Musil <amusil@redhat.com> > > --- > > Hi Ales, > > Thanks for the fix! > Hi Dumitru, thank you for the review. > I wonder if we shouldn't just do this every time > engine_set_force_recompute(true) is called. That is, call > poll_immediate_wake() inside the engine_set_force_recompute() function. > > By the look there's no case in whicha forced recompute is requested > where we have a reason to wait instead of immediately waking up to do > the computation. > > We could also repurpose the engine_trigger_recompute() function. > > What do you think? > After some offline discussion there is v2 which should be more explicit about what is happening. > > Thanks, > Dumitru > > > controller/ovn-controller.c | 1 + > > northd/ovn-northd.c | 2 ++ > > 2 files changed, 3 insertions(+) > > > > diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c > > index c48667887..8b8ff3a92 100644 > > --- a/controller/ovn-controller.c > > +++ b/controller/ovn-controller.c > > @@ -5816,6 +5816,7 @@ main(int argc, char *argv[]) > > if (!ovsdb_idl_loop_commit_and_wait(&ovnsb_idl_loop)) { > > VLOG_INFO("OVNSB commit failed, force recompute next > time."); > > engine_set_force_recompute(true); > > + poll_immediate_wake(); > > } > > > > int ovs_txn_status = > ovsdb_idl_loop_commit_and_wait(&ovs_idl_loop); > > diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c > > index d71114f35..b3c80f012 100644 > > --- a/northd/ovn-northd.c > > +++ b/northd/ovn-northd.c > > @@ -992,12 +992,14 @@ main(int argc, char *argv[]) > > VLOG_INFO("OVNNB commit failed, " > > "force recompute next time."); > > eng_ctx.recompute = true; > > + poll_immediate_wake(); > > } > > > > if (!ovsdb_idl_loop_commit_and_wait(&ovnsb_idl_loop)) { > > VLOG_INFO("OVNSB commit failed, " > > "force recompute next time."); > > eng_ctx.recompute = true; > > + poll_immediate_wake(); > > } > > run_memory_trimmer(ovnnb_idl_loop.idl, activity); > > } else { > > Thanks, Ales
diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c index c48667887..8b8ff3a92 100644 --- a/controller/ovn-controller.c +++ b/controller/ovn-controller.c @@ -5816,6 +5816,7 @@ main(int argc, char *argv[]) if (!ovsdb_idl_loop_commit_and_wait(&ovnsb_idl_loop)) { VLOG_INFO("OVNSB commit failed, force recompute next time."); engine_set_force_recompute(true); + poll_immediate_wake(); } int ovs_txn_status = ovsdb_idl_loop_commit_and_wait(&ovs_idl_loop); diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c index d71114f35..b3c80f012 100644 --- a/northd/ovn-northd.c +++ b/northd/ovn-northd.c @@ -992,12 +992,14 @@ main(int argc, char *argv[]) VLOG_INFO("OVNNB commit failed, " "force recompute next time."); eng_ctx.recompute = true; + poll_immediate_wake(); } if (!ovsdb_idl_loop_commit_and_wait(&ovnsb_idl_loop)) { VLOG_INFO("OVNSB commit failed, " "force recompute next time."); eng_ctx.recompute = true; + poll_immediate_wake(); } run_memory_trimmer(ovnnb_idl_loop.idl, activity); } else {
The recompute was requested when the commit failed, however the engine run might have happened later than expected. This was caused by poll_block() call, which was waiting for any event in the loop. Make sure that the engine runs right away by calling poll_immediate_wake() which ensures that the poll_block() returns right away without any further waiting. Reported-at: https://issues.redhat.com/browse/FDP-753 Signed-off-by: Ales Musil <amusil@redhat.com> --- controller/ovn-controller.c | 1 + northd/ovn-northd.c | 2 ++ 2 files changed, 3 insertions(+)