Message ID | 20210610224934.2200271-1-blp@ovn.org |
---|---|
State | Accepted |
Headers | show |
Series | [ovs-dev,1/2] fail-open: Only fail open if we've been disconnected for at least 1 s. | expand |
On 6/11/21 12:49 AM, Ben Pfaff wrote: > The 'last_disconn_secs' member determines whether we're currently in > fail-open mode (see fail_open_is_active()), but before this > commit, fail_open_run() could decide to enter fail-open mode even if > that would set 'last_disconn_secs' to 0 (and thus not really enter it). > This could lead to an endless stream of log messages about entering > fail-open mode, none of which actually does anything. This fixes the > problem. > > (This patch worries me because this functionality has been stable > and unchanged for many years and I wonder how something so simple > is broken.) Hmm, I suppose it's just not common to disable inactivity probes on controller connections. Acked-by: Ilya Maximets <i.maximets@ovn.org>
On Mon, Jun 14, 2021 at 08:59:43PM +0200, Ilya Maximets wrote: > On 6/11/21 12:49 AM, Ben Pfaff wrote: > > The 'last_disconn_secs' member determines whether we're currently in > > fail-open mode (see fail_open_is_active()), but before this > > commit, fail_open_run() could decide to enter fail-open mode even if > > that would set 'last_disconn_secs' to 0 (and thus not really enter it). > > This could lead to an endless stream of log messages about entering > > fail-open mode, none of which actually does anything. This fixes the > > problem. > > > > (This patch worries me because this functionality has been stable > > and unchanged for many years and I wonder how something so simple > > is broken.) > > Hmm, I suppose it's just not common to disable inactivity probes on > controller connections. > > Acked-by: Ilya Maximets <i.maximets@ovn.org> Thanks, applied to master.
diff --git a/ofproto/fail-open.c b/ofproto/fail-open.c index 5b105ba880a3..34b398ecae30 100644 --- a/ofproto/fail-open.c +++ b/ofproto/fail-open.c @@ -180,7 +180,7 @@ fail_open_run(struct fail_open *fo) int disconn_secs = connmgr_failure_duration(fo->connmgr); /* Enter fail-open mode if 'fo' is not in it but should be. */ - if (disconn_secs >= trigger_duration(fo)) { + if (disconn_secs > 0 && disconn_secs >= trigger_duration(fo)) { if (!fail_open_is_active(fo)) { VLOG_WARN("Could not connect to controller (or switch failed " "controller's post-connection admission control "
The 'last_disconn_secs' member determines whether we're currently in fail-open mode (see fail_open_is_active()), but before this commit, fail_open_run() could decide to enter fail-open mode even if that would set 'last_disconn_secs' to 0 (and thus not really enter it). This could lead to an endless stream of log messages about entering fail-open mode, none of which actually does anything. This fixes the problem. (This patch worries me because this functionality has been stable and unchanged for many years and I wonder how something so simple is broken.) Signed-off-by: Ben Pfaff <blp@ovn.org> --- ofproto/fail-open.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)