diff mbox series

[ovs-dev] northd, controller: Recompute immediately after commit failure.

Message ID 20240910132958.316433-1-amusil@redhat.com
State Changes Requested
Headers show
Series [ovs-dev] northd, controller: Recompute immediately after commit failure. | expand

Checks

Context Check Description
ovsrobot/apply-robot success apply and check: success
ovsrobot/github-robot-_Build_and_Test success github build: passed
ovsrobot/github-robot-_ovn-kubernetes success github build: passed

Commit Message

Ales Musil Sept. 10, 2024, 1:29 p.m. UTC
The recompute was requested when the commit failed, however the
engine run might have happened later than expected. This was caused
by poll_block() call, which was waiting for any event in the loop.
Make sure that the engine runs right away by calling
poll_immediate_wake() which ensures that the poll_block() returns
right away without any further waiting.

Reported-at: https://issues.redhat.com/browse/FDP-753
Signed-off-by: Ales Musil <amusil@redhat.com>
---
 controller/ovn-controller.c | 1 +
 northd/ovn-northd.c         | 2 ++
 2 files changed, 3 insertions(+)

Comments

Dumitru Ceara Sept. 23, 2024, 2:31 p.m. UTC | #1
On 9/10/24 15:29, Ales Musil wrote:
> The recompute was requested when the commit failed, however the
> engine run might have happened later than expected. This was caused
> by poll_block() call, which was waiting for any event in the loop.
> Make sure that the engine runs right away by calling
> poll_immediate_wake() which ensures that the poll_block() returns
> right away without any further waiting.
> 
> Reported-at: https://issues.redhat.com/browse/FDP-753
> Signed-off-by: Ales Musil <amusil@redhat.com>
> ---

Hi Ales,

Thanks for the fix!

I wonder if we shouldn't just do this every time
engine_set_force_recompute(true) is called.  That is, call
poll_immediate_wake() inside the engine_set_force_recompute() function.

By the look there's no case in whicha forced recompute is requested
where we have a reason to wait instead of immediately waking up to do
the computation.

We could also repurpose the engine_trigger_recompute() function.

What do you think?

Thanks,
Dumitru

>  controller/ovn-controller.c | 1 +
>  northd/ovn-northd.c         | 2 ++
>  2 files changed, 3 insertions(+)
> 
> diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
> index c48667887..8b8ff3a92 100644
> --- a/controller/ovn-controller.c
> +++ b/controller/ovn-controller.c
> @@ -5816,6 +5816,7 @@ main(int argc, char *argv[])
>          if (!ovsdb_idl_loop_commit_and_wait(&ovnsb_idl_loop)) {
>              VLOG_INFO("OVNSB commit failed, force recompute next time.");
>              engine_set_force_recompute(true);
> +            poll_immediate_wake();
>          }
>  
>          int ovs_txn_status = ovsdb_idl_loop_commit_and_wait(&ovs_idl_loop);
> diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
> index d71114f35..b3c80f012 100644
> --- a/northd/ovn-northd.c
> +++ b/northd/ovn-northd.c
> @@ -992,12 +992,14 @@ main(int argc, char *argv[])
>                      VLOG_INFO("OVNNB commit failed, "
>                                "force recompute next time.");
>                      eng_ctx.recompute = true;
> +                    poll_immediate_wake();
>                  }
>  
>                  if (!ovsdb_idl_loop_commit_and_wait(&ovnsb_idl_loop)) {
>                      VLOG_INFO("OVNSB commit failed, "
>                                "force recompute next time.");
>                      eng_ctx.recompute = true;
> +                    poll_immediate_wake();
>                  }
>                  run_memory_trimmer(ovnnb_idl_loop.idl, activity);
>              } else {
Ales Musil Sept. 24, 2024, 1:32 p.m. UTC | #2
On Mon, Sep 23, 2024 at 4:31 PM Dumitru Ceara <dceara@redhat.com> wrote:

> On 9/10/24 15:29, Ales Musil wrote:
> > The recompute was requested when the commit failed, however the
> > engine run might have happened later than expected. This was caused
> > by poll_block() call, which was waiting for any event in the loop.
> > Make sure that the engine runs right away by calling
> > poll_immediate_wake() which ensures that the poll_block() returns
> > right away without any further waiting.
> >
> > Reported-at: https://issues.redhat.com/browse/FDP-753
> > Signed-off-by: Ales Musil <amusil@redhat.com>
> > ---
>
> Hi Ales,
>
> Thanks for the fix!
>


Hi Dumitru,

thank you for the review.


> I wonder if we shouldn't just do this every time
> engine_set_force_recompute(true) is called.  That is, call
> poll_immediate_wake() inside the engine_set_force_recompute() function.
>
> By the look there's no case in whicha forced recompute is requested
> where we have a reason to wait instead of immediately waking up to do
> the computation.
>
> We could also repurpose the engine_trigger_recompute() function.
>
> What do you think?
>

After some offline discussion there is v2 which should be more explicit
about what is happening.


>
> Thanks,
> Dumitru
>
> >  controller/ovn-controller.c | 1 +
> >  northd/ovn-northd.c         | 2 ++
> >  2 files changed, 3 insertions(+)
> >
> > diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
> > index c48667887..8b8ff3a92 100644
> > --- a/controller/ovn-controller.c
> > +++ b/controller/ovn-controller.c
> > @@ -5816,6 +5816,7 @@ main(int argc, char *argv[])
> >          if (!ovsdb_idl_loop_commit_and_wait(&ovnsb_idl_loop)) {
> >              VLOG_INFO("OVNSB commit failed, force recompute next
> time.");
> >              engine_set_force_recompute(true);
> > +            poll_immediate_wake();
> >          }
> >
> >          int ovs_txn_status =
> ovsdb_idl_loop_commit_and_wait(&ovs_idl_loop);
> > diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
> > index d71114f35..b3c80f012 100644
> > --- a/northd/ovn-northd.c
> > +++ b/northd/ovn-northd.c
> > @@ -992,12 +992,14 @@ main(int argc, char *argv[])
> >                      VLOG_INFO("OVNNB commit failed, "
> >                                "force recompute next time.");
> >                      eng_ctx.recompute = true;
> > +                    poll_immediate_wake();
> >                  }
> >
> >                  if (!ovsdb_idl_loop_commit_and_wait(&ovnsb_idl_loop)) {
> >                      VLOG_INFO("OVNSB commit failed, "
> >                                "force recompute next time.");
> >                      eng_ctx.recompute = true;
> > +                    poll_immediate_wake();
> >                  }
> >                  run_memory_trimmer(ovnnb_idl_loop.idl, activity);
> >              } else {
>
>
Thanks,
Ales
diff mbox series

Patch

diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c
index c48667887..8b8ff3a92 100644
--- a/controller/ovn-controller.c
+++ b/controller/ovn-controller.c
@@ -5816,6 +5816,7 @@  main(int argc, char *argv[])
         if (!ovsdb_idl_loop_commit_and_wait(&ovnsb_idl_loop)) {
             VLOG_INFO("OVNSB commit failed, force recompute next time.");
             engine_set_force_recompute(true);
+            poll_immediate_wake();
         }
 
         int ovs_txn_status = ovsdb_idl_loop_commit_and_wait(&ovs_idl_loop);
diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
index d71114f35..b3c80f012 100644
--- a/northd/ovn-northd.c
+++ b/northd/ovn-northd.c
@@ -992,12 +992,14 @@  main(int argc, char *argv[])
                     VLOG_INFO("OVNNB commit failed, "
                               "force recompute next time.");
                     eng_ctx.recompute = true;
+                    poll_immediate_wake();
                 }
 
                 if (!ovsdb_idl_loop_commit_and_wait(&ovnsb_idl_loop)) {
                     VLOG_INFO("OVNSB commit failed, "
                               "force recompute next time.");
                     eng_ctx.recompute = true;
+                    poll_immediate_wake();
                 }
                 run_memory_trimmer(ovnnb_idl_loop.idl, activity);
             } else {