diff mbox series

[ovs-dev,1/2] fail-open: Only fail open if we've been disconnected for at least 1 s.

Message ID 20210610224934.2200271-1-blp@ovn.org
State Accepted
Headers show
Series [ovs-dev,1/2] fail-open: Only fail open if we've been disconnected for at least 1 s. | expand

Commit Message

Ben Pfaff June 10, 2021, 10:49 p.m. UTC
The 'last_disconn_secs' member determines whether we're currently in
fail-open mode (see fail_open_is_active()), but before this
commit, fail_open_run() could decide to enter fail-open mode even if
that would set 'last_disconn_secs' to 0 (and thus not really enter it).
This could lead to an endless stream of log messages about entering
fail-open mode, none of which actually does anything.  This fixes the
problem.

(This patch worries me because this functionality has been stable
and unchanged for many years and I wonder how something so simple
is broken.)

Signed-off-by: Ben Pfaff <blp@ovn.org>
---
 ofproto/fail-open.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Ilya Maximets June 14, 2021, 6:59 p.m. UTC | #1
On 6/11/21 12:49 AM, Ben Pfaff wrote:
> The 'last_disconn_secs' member determines whether we're currently in
> fail-open mode (see fail_open_is_active()), but before this
> commit, fail_open_run() could decide to enter fail-open mode even if
> that would set 'last_disconn_secs' to 0 (and thus not really enter it).
> This could lead to an endless stream of log messages about entering
> fail-open mode, none of which actually does anything.  This fixes the
> problem.
> 
> (This patch worries me because this functionality has been stable
> and unchanged for many years and I wonder how something so simple
> is broken.)

Hmm, I suppose it's just not common to disable inactivity probes on
controller connections.

Acked-by: Ilya Maximets <i.maximets@ovn.org>
Ben Pfaff July 2, 2021, 6:04 p.m. UTC | #2
On Mon, Jun 14, 2021 at 08:59:43PM +0200, Ilya Maximets wrote:
> On 6/11/21 12:49 AM, Ben Pfaff wrote:
> > The 'last_disconn_secs' member determines whether we're currently in
> > fail-open mode (see fail_open_is_active()), but before this
> > commit, fail_open_run() could decide to enter fail-open mode even if
> > that would set 'last_disconn_secs' to 0 (and thus not really enter it).
> > This could lead to an endless stream of log messages about entering
> > fail-open mode, none of which actually does anything.  This fixes the
> > problem.
> > 
> > (This patch worries me because this functionality has been stable
> > and unchanged for many years and I wonder how something so simple
> > is broken.)
> 
> Hmm, I suppose it's just not common to disable inactivity probes on
> controller connections.
> 
> Acked-by: Ilya Maximets <i.maximets@ovn.org>

Thanks, applied to master.
diff mbox series

Patch

diff --git a/ofproto/fail-open.c b/ofproto/fail-open.c
index 5b105ba880a3..34b398ecae30 100644
--- a/ofproto/fail-open.c
+++ b/ofproto/fail-open.c
@@ -180,7 +180,7 @@  fail_open_run(struct fail_open *fo)
     int disconn_secs = connmgr_failure_duration(fo->connmgr);
 
     /* Enter fail-open mode if 'fo' is not in it but should be.  */
-    if (disconn_secs >= trigger_duration(fo)) {
+    if (disconn_secs > 0 && disconn_secs >= trigger_duration(fo)) {
         if (!fail_open_is_active(fo)) {
             VLOG_WARN("Could not connect to controller (or switch failed "
                       "controller's post-connection admission control "