Message ID | 20200304083617.481725-1-numans@ovn.org |
---|---|
State | Accepted |
Headers | show |
Series | [ovs-dev,ovn] ovn-northd: Add the option to configure probe interval | expand |
On Wed, Mar 4, 2020 at 12:36 AM <numans@ovn.org> wrote: > > From: Numan Siddique <numans@ovn.org> > > This patch adds the option 'NB_Global.options:probe_interval' to > configure the probe interval for the North and South db connections > from ovn-northd. > Just a minor comment here. There are so many probe intervals in the whole system, e.g. from NB/SB/IC-NB/IC-SB to clients (northd/ovn-controller/ic, and external CMS clients), from clients to NB/SB/IC-NB/IC-SB. Here this implementation only configures the probe from northd client to NB/SB. Is it better to rename it to northd_probe_interval to avoid confusion? Otherwise: Acked-by: Han Zhou <hzhou@ovn.org> > Requested-by: Girish Moodalbail <gmoodalbail@nvidia.com> > Signed-off-by: Numan Siddique <numans@ovn.org> > --- > northd/ovn-northd.c | 16 ++++++++++++++++ > ovn-nb.xml | 13 +++++++++++++ > 2 files changed, 29 insertions(+) > > diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c > index 0d43322cf..a3746f7ea 100644 > --- a/northd/ovn-northd.c > +++ b/northd/ovn-northd.c > @@ -92,6 +92,10 @@ static bool controller_event_en; > * all locally handled, having just one mac is good enough. */ > static char svc_monitor_mac[ETH_ADDR_STRLEN + 1]; > > +/* Default probe interval for NB and SB DB connections. */ > +#define DEFAULT_PROBE_INTERVAL_MSEC 5000 > +static int probe_interval = DEFAULT_PROBE_INTERVAL_MSEC; > + > #define MAX_OVN_TAGS 4096 > > /* Pipeline stages. */ > @@ -10858,6 +10862,14 @@ ovnnb_db_run(struct northd_context *ctx, > smap_destroy(&options); > } > > + /* Update the probe interval. */ > + probe_interval = smap_get_int(&nb->options, "probe_interval", > + DEFAULT_PROBE_INTERVAL_MSEC); > + > + if (probe_interval > 0 && probe_interval < 1000) { > + probe_interval = 1000; > + } > + > controller_event_en = smap_get_bool(&nb->options, > "controller_event", false); > > @@ -11818,6 +11830,10 @@ main(int argc, char *argv[]) > poll_immediate_wake(); > } > > + > + ovsdb_idl_set_probe_interval(ovnnb_idl_loop.idl, probe_interval); > + ovsdb_idl_set_probe_interval(ovnsb_idl_loop.idl, probe_interval); > + > poll_block(); > if (should_service_stop()) { > exiting = true; > diff --git a/ovn-nb.xml b/ovn-nb.xml > index f30cc9ee9..f9b028aa0 100644 > --- a/ovn-nb.xml > +++ b/ovn-nb.xml > @@ -134,6 +134,19 @@ > > </column> > > + <column name="options" key="probe_interval"> > + <p> > + The inactivity probe interval of the connection to the OVN Northbound > + and Southbound databases, in milliseconds. > + If the value is zero, it disables the connection keepalive feature. > + </p> > + > + <p> > + If the value is nonzero, then it will be forced to a value of > + at least 1000 ms. > + </p> > + </column> > + > <group title="Options for configuring interconnection route advertisement"> > <p> > These options control how routes are advertised between OVN > -- > 2.24.1 > > _______________________________________________ > dev mailing list > dev@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Hello Numan, Thank you for getting this done __. Quick comment: The option set on NB_Global also affects ovn-northd's connection to OVN SB as well. Typically, one would expect any option on NB_Global should affect ovsdb-server managing OVN NB DB alone, right? Should we add similar option to SB_Global as well? Regards, Girish On 3/4/20, 12:36 AM, "numans@ovn.org" <numans@ovn.org> wrote: External email: Use caution opening links or attachments From: Numan Siddique <numans@ovn.org> This patch adds the option 'NB_Global.options:probe_interval' to configure the probe interval for the North and South db connections from ovn-northd. Requested-by: Girish Moodalbail <gmoodalbail@nvidia.com> Signed-off-by: Numan Siddique <numans@ovn.org> --- northd/ovn-northd.c | 16 ++++++++++++++++ ovn-nb.xml | 13 +++++++++++++ 2 files changed, 29 insertions(+) diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c index 0d43322cf..a3746f7ea 100644 --- a/northd/ovn-northd.c +++ b/northd/ovn-northd.c @@ -92,6 +92,10 @@ static bool controller_event_en; * all locally handled, having just one mac is good enough. */ static char svc_monitor_mac[ETH_ADDR_STRLEN + 1]; +/* Default probe interval for NB and SB DB connections. */ +#define DEFAULT_PROBE_INTERVAL_MSEC 5000 +static int probe_interval = DEFAULT_PROBE_INTERVAL_MSEC; + #define MAX_OVN_TAGS 4096 /* Pipeline stages. */ @@ -10858,6 +10862,14 @@ ovnnb_db_run(struct northd_context *ctx, smap_destroy(&options); } + /* Update the probe interval. */ + probe_interval = smap_get_int(&nb->options, "probe_interval", + DEFAULT_PROBE_INTERVAL_MSEC); + + if (probe_interval > 0 && probe_interval < 1000) { + probe_interval = 1000; + } + controller_event_en = smap_get_bool(&nb->options, "controller_event", false); @@ -11818,6 +11830,10 @@ main(int argc, char *argv[]) poll_immediate_wake(); } + + ovsdb_idl_set_probe_interval(ovnnb_idl_loop.idl, probe_interval); + ovsdb_idl_set_probe_interval(ovnsb_idl_loop.idl, probe_interval); + poll_block(); if (should_service_stop()) { exiting = true; diff --git a/ovn-nb.xml b/ovn-nb.xml index f30cc9ee9..f9b028aa0 100644 --- a/ovn-nb.xml +++ b/ovn-nb.xml @@ -134,6 +134,19 @@ </column> + <column name="options" key="probe_interval"> + <p> + The inactivity probe interval of the connection to the OVN Northbound + and Southbound databases, in milliseconds. + If the value is zero, it disables the connection keepalive feature. + </p> + + <p> + If the value is nonzero, then it will be forced to a value of + at least 1000 ms. + </p> + </column> + <group title="Options for configuring interconnection route advertisement"> <p> These options control how routes are advertised between OVN -- 2.24.1 ----------------------------------------------------------------------------------- This email message is for the sole use of the intended recipient(s) and may contain confidential information. Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy all copies of the original message. -----------------------------------------------------------------------------------
On Thu, Mar 5, 2020 at 10:22 AM Girish Moodalbail <gmoodalbail@nvidia.com> wrote: > > Hello Numan, > > Thank you for getting this done __. > > Quick comment: The option set on NB_Global also affects ovn-northd's connection to OVN SB as well. Typically, one would expect any option on NB_Global should affect ovsdb-server managing OVN NB DB alone, right? Should we add similar option to SB_Global as well? Hi Girish, This newly added option probe_interval (which will be renamed to northd_probe_interval) applies to both Northbound and Southbound connections from ovn-northd. So there is no need to add in SB_Global. And CMS is not supposed to do any write txns to Southbound db. I thought just one option is enough for both db connections. You want separate ones ? Thanks Numan > > Regards, > Girish > > > On 3/4/20, 12:36 AM, "numans@ovn.org" <numans@ovn.org> wrote: > > External email: Use caution opening links or attachments > > > From: Numan Siddique <numans@ovn.org> > > This patch adds the option 'NB_Global.options:probe_interval' to > configure the probe interval for the North and South db connections > from ovn-northd. > > Requested-by: Girish Moodalbail <gmoodalbail@nvidia.com> > Signed-off-by: Numan Siddique <numans@ovn.org> > --- > northd/ovn-northd.c | 16 ++++++++++++++++ > ovn-nb.xml | 13 +++++++++++++ > 2 files changed, 29 insertions(+) > > diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c > index 0d43322cf..a3746f7ea 100644 > --- a/northd/ovn-northd.c > +++ b/northd/ovn-northd.c > @@ -92,6 +92,10 @@ static bool controller_event_en; > * all locally handled, having just one mac is good enough. */ > static char svc_monitor_mac[ETH_ADDR_STRLEN + 1]; > > +/* Default probe interval for NB and SB DB connections. */ > +#define DEFAULT_PROBE_INTERVAL_MSEC 5000 > +static int probe_interval = DEFAULT_PROBE_INTERVAL_MSEC; > + > #define MAX_OVN_TAGS 4096 > > /* Pipeline stages. */ > @@ -10858,6 +10862,14 @@ ovnnb_db_run(struct northd_context *ctx, > smap_destroy(&options); > } > > + /* Update the probe interval. */ > + probe_interval = smap_get_int(&nb->options, "probe_interval", > + DEFAULT_PROBE_INTERVAL_MSEC); > + > + if (probe_interval > 0 && probe_interval < 1000) { > + probe_interval = 1000; > + } > + > controller_event_en = smap_get_bool(&nb->options, > "controller_event", false); > > @@ -11818,6 +11830,10 @@ main(int argc, char *argv[]) > poll_immediate_wake(); > } > > + > + ovsdb_idl_set_probe_interval(ovnnb_idl_loop.idl, probe_interval); > + ovsdb_idl_set_probe_interval(ovnsb_idl_loop.idl, probe_interval); > + > poll_block(); > if (should_service_stop()) { > exiting = true; > diff --git a/ovn-nb.xml b/ovn-nb.xml > index f30cc9ee9..f9b028aa0 100644 > --- a/ovn-nb.xml > +++ b/ovn-nb.xml > @@ -134,6 +134,19 @@ > > </column> > > + <column name="options" key="probe_interval"> > + <p> > + The inactivity probe interval of the connection to the OVN Northbound > + and Southbound databases, in milliseconds. > + If the value is zero, it disables the connection keepalive feature. > + </p> > + > + <p> > + If the value is nonzero, then it will be forced to a value of > + at least 1000 ms. > + </p> > + </column> > + > <group title="Options for configuring interconnection route advertisement"> > <p> > These options control how routes are advertised between OVN > -- > 2.24.1 > > > > > ----------------------------------------------------------------------------------- > This email message is for the sole use of the intended recipient(s) and may contain > confidential information. Any unauthorized review, use, disclosure or distribution > is prohibited. If you are not the intended recipient, please contact the sender by > reply email and destroy all copies of the original message. > ----------------------------------------------------------------------------------- > _______________________________________________ > dev mailing list > dev@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Hello Numan, On 3/4/20, 9:58 PM, "Numan Siddique" <numans@ovn.org> wrote: External email: Use caution opening links or attachments On Thu, Mar 5, 2020 at 10:22 AM Girish Moodalbail <gmoodalbail@nvidia.com> wrote: > > Hello Numan, > > Thank you for getting this done __. > > Quick comment: The option set on NB_Global also affects ovn-northd's connection to OVN SB as well. Typically, one would expect any option on NB_Global should affect ovsdb-server managing OVN NB DB alone, right? Should we add similar option to SB_Global as well? Hi Girish, This newly added option probe_interval (which will be renamed to northd_probe_interval) applies to both Northbound and Southbound connections from ovn-northd. So there is no need to add in SB_Global. And CMS is not supposed to do any write txns to Southbound db. I see. If that is the case, then one option should be fine then. Regards, ~Girish ----------------------------------------------------------------------------------- This email message is for the sole use of the intended recipient(s) and may contain confidential information. Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy all copies of the original message. -----------------------------------------------------------------------------------
On Thu, Mar 5, 2020 at 11:40 AM Girish Moodalbail <gmoodalbail@nvidia.com> wrote: > > Hello Numan, > > On 3/4/20, 9:58 PM, "Numan Siddique" <numans@ovn.org> wrote: > > External email: Use caution opening links or attachments > > > On Thu, Mar 5, 2020 at 10:22 AM Girish Moodalbail > <gmoodalbail@nvidia.com> wrote: > > > > Hello Numan, > > > > Thank you for getting this done __. > > > > Quick comment: The option set on NB_Global also affects ovn-northd's connection to OVN SB as well. Typically, one would expect any option on NB_Global should affect ovsdb-server managing OVN NB DB alone, right? Should we add similar option to SB_Global as well? > > Hi Girish, > > This newly added option probe_interval (which will be renamed to > northd_probe_interval) applies to both Northbound and Southbound > connections from ovn-northd. > So there is no need to add in SB_Global. And CMS is not supposed to do > any write txns to Southbound db. > > I see. If that is the case, then one option should be fine then. Thanks Han and Girish for the review and comments. I applied this patch to master with the changes suggested by Han. Thanks Numan > > Regards, > ~Girish > > > > ----------------------------------------------------------------------------------- > This email message is for the sole use of the intended recipient(s) and may contain > confidential information. Any unauthorized review, use, disclosure or distribution > is prohibited. If you are not the intended recipient, please contact the sender by > reply email and destroy all copies of the original message. > ----------------------------------------------------------------------------------- > _______________________________________________ > dev mailing list > dev@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
Hello Numan, We found a minor issue with this patch in our OVN K8s setup. Say, we have set the northd_probe_interval to 50ms, and we have 3 instances of `ovn-northd` running. In this setup, one of the ovn-northd instances will be active, whilst the remaining two instances will be in 'standby' mode. From my reading of ovn-northd.c`main(), the active instance will read the northd_probe_interval value and set it on its NB and SB IDLs. However, for the two standby instances we wouldn't have set that value on their NB and SB IDLs. Until all the standby instances become active, they will still have the default inactivity probe interval value of 5sec. So, in a large k8s cluster we are seeing that the standby ovn-northd instances are not able to latch on to connections for a long time due to the short inactivity probe interval value. Regards, ~Girish On Wed, Mar 4, 2020 at 10:54 PM Numan Siddique <numans@ovn.org> wrote: > On Thu, Mar 5, 2020 at 11:40 AM Girish Moodalbail > <gmoodalbail@nvidia.com> wrote: > > > > Hello Numan, > > > > On 3/4/20, 9:58 PM, "Numan Siddique" <numans@ovn.org> wrote: > > > > External email: Use caution opening links or attachments > > > > > > On Thu, Mar 5, 2020 at 10:22 AM Girish Moodalbail > > <gmoodalbail@nvidia.com> wrote: > > > > > > Hello Numan, > > > > > > Thank you for getting this done __. > > > > > > Quick comment: The option set on NB_Global also affects > ovn-northd's connection to OVN SB as well. Typically, one would expect any > option on NB_Global should affect ovsdb-server managing OVN NB DB alone, > right? Should we add similar option to SB_Global as well? > > > > Hi Girish, > > > > This newly added option probe_interval (which will be renamed to > > northd_probe_interval) applies to both Northbound and Southbound > > connections from ovn-northd. > > So there is no need to add in SB_Global. And CMS is not supposed to > do > > any write txns to Southbound db. > > > > I see. If that is the case, then one option should be fine then. > > Thanks Han and Girish for the review and comments. > > I applied this patch to master with the changes suggested by Han. > > Thanks > Numan > > > > > Regards, > > ~Girish > > > > > > > > > ----------------------------------------------------------------------------------- > > This email message is for the sole use of the intended recipient(s) and > may contain > > confidential information. Any unauthorized review, use, disclosure or > distribution > > is prohibited. If you are not the intended recipient, please contact > the sender by > > reply email and destroy all copies of the original message. > > > ----------------------------------------------------------------------------------- > > _______________________________________________ > > dev mailing list > > dev@openvswitch.org > > https://mail.openvswitch.org/mailman/listinfo/ovs-dev > _______________________________________________ > dev mailing list > dev@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev >
On Wed, Jul 15, 2020 at 10:33 AM Girish Moodalbail <gmoodalbail@gmail.com> wrote: > > Hello Numan, > > We found a minor issue with this patch in our OVN K8s setup. Say, we have set the northd_probe_interval to 50ms, and we have 3 instances of `ovn-northd` running. In this setup, one of the ovn-northd instances will be active, whilst the remaining two instances will be in 'standby' mode. > > From my reading of ovn-northd.c`main(), the active instance will read the northd_probe_interval value and set it on its NB and SB IDLs. However, for the two standby instances we wouldn't have set that value on their NB and SB IDLs. Until all the standby instances become active, they will still have the default inactivity probe interval value of 5sec. > > So, in a large k8s cluster we are seeing that the standby ovn-northd instances are not able to latch on to connections for a long time due to the short inactivity probe interval value. Hi Girish, Thanks for reporting the issue. I'll take a look into it next week. Thanks Numan > > Regards, > ~Girish > > On Wed, Mar 4, 2020 at 10:54 PM Numan Siddique <numans@ovn.org> wrote: >> >> On Thu, Mar 5, 2020 at 11:40 AM Girish Moodalbail >> <gmoodalbail@nvidia.com> wrote: >> > >> > Hello Numan, >> > >> > On 3/4/20, 9:58 PM, "Numan Siddique" <numans@ovn.org> wrote: >> > >> > External email: Use caution opening links or attachments >> > >> > >> > On Thu, Mar 5, 2020 at 10:22 AM Girish Moodalbail >> > <gmoodalbail@nvidia.com> wrote: >> > > >> > > Hello Numan, >> > > >> > > Thank you for getting this done __. >> > > >> > > Quick comment: The option set on NB_Global also affects ovn-northd's connection to OVN SB as well. Typically, one would expect any option on NB_Global should affect ovsdb-server managing OVN NB DB alone, right? Should we add similar option to SB_Global as well? >> > >> > Hi Girish, >> > >> > This newly added option probe_interval (which will be renamed to >> > northd_probe_interval) applies to both Northbound and Southbound >> > connections from ovn-northd. >> > So there is no need to add in SB_Global. And CMS is not supposed to do >> > any write txns to Southbound db. >> > >> > I see. If that is the case, then one option should be fine then. >> >> Thanks Han and Girish for the review and comments. >> >> I applied this patch to master with the changes suggested by Han. >> >> Thanks >> Numan >> >> > >> > Regards, >> > ~Girish >> > >> > >> > >> > ----------------------------------------------------------------------------------- >> > This email message is for the sole use of the intended recipient(s) and may contain >> > confidential information. Any unauthorized review, use, disclosure or distribution >> > is prohibited. If you are not the intended recipient, please contact the sender by >> > reply email and destroy all copies of the original message. >> > ----------------------------------------------------------------------------------- >> > _______________________________________________ >> > dev mailing list >> > dev@openvswitch.org >> > https://mail.openvswitch.org/mailman/listinfo/ovs-dev >> _______________________________________________ >> dev mailing list >> dev@openvswitch.org >> https://mail.openvswitch.org/mailman/listinfo/ovs-dev > > -- > You received this message because you are subscribed to the Google Groups "ovn-kubernetes" group. > To unsubscribe from this group and stop receiving emails from it, send an email to ovn-kubernetes+unsubscribe@googlegroups.com. > To view this discussion on the web visit https://groups.google.com/d/msgid/ovn-kubernetes/CAAF2STRuQX-Ft-S6RLeJFTdvVe2Z46eQsTg8H9nixhDFzOppJw%40mail.gmail.com.
diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c index 0d43322cf..a3746f7ea 100644 --- a/northd/ovn-northd.c +++ b/northd/ovn-northd.c @@ -92,6 +92,10 @@ static bool controller_event_en; * all locally handled, having just one mac is good enough. */ static char svc_monitor_mac[ETH_ADDR_STRLEN + 1]; +/* Default probe interval for NB and SB DB connections. */ +#define DEFAULT_PROBE_INTERVAL_MSEC 5000 +static int probe_interval = DEFAULT_PROBE_INTERVAL_MSEC; + #define MAX_OVN_TAGS 4096 /* Pipeline stages. */ @@ -10858,6 +10862,14 @@ ovnnb_db_run(struct northd_context *ctx, smap_destroy(&options); } + /* Update the probe interval. */ + probe_interval = smap_get_int(&nb->options, "probe_interval", + DEFAULT_PROBE_INTERVAL_MSEC); + + if (probe_interval > 0 && probe_interval < 1000) { + probe_interval = 1000; + } + controller_event_en = smap_get_bool(&nb->options, "controller_event", false); @@ -11818,6 +11830,10 @@ main(int argc, char *argv[]) poll_immediate_wake(); } + + ovsdb_idl_set_probe_interval(ovnnb_idl_loop.idl, probe_interval); + ovsdb_idl_set_probe_interval(ovnsb_idl_loop.idl, probe_interval); + poll_block(); if (should_service_stop()) { exiting = true; diff --git a/ovn-nb.xml b/ovn-nb.xml index f30cc9ee9..f9b028aa0 100644 --- a/ovn-nb.xml +++ b/ovn-nb.xml @@ -134,6 +134,19 @@ </column> + <column name="options" key="probe_interval"> + <p> + The inactivity probe interval of the connection to the OVN Northbound + and Southbound databases, in milliseconds. + If the value is zero, it disables the connection keepalive feature. + </p> + + <p> + If the value is nonzero, then it will be forced to a value of + at least 1000 ms. + </p> + </column> + <group title="Options for configuring interconnection route advertisement"> <p> These options control how routes are advertised between OVN