diff mbox series

[ovs-dev,v3] northd: Remove delay in sb_cfg propagation.

Message ID 20250310104113.49312-1-alekssmirnov@k2.cloud
State Accepted
Delegated to: Dumitru Ceara
Headers show
Series [ovs-dev,v3] northd: Remove delay in sb_cfg propagation. | expand

Checks

Context Check Description
ovsrobot/apply-robot success apply and check: success
ovsrobot/github-robot-_Build_and_Test success github build: passed
ovsrobot/github-robot-_ovn-kubernetes success github build: passed

Commit Message

Aleksandr Smirnov March 10, 2025, 10:40 a.m. UTC
According to the ovn-architecture(7) sb_cfg counter should be
propagated at the moment northd completed transaction of related changes
to the southbound db. However, a processing engine call happened right
between two events, a sb transaction commitment, and initiating
the sb_cfg write. Normally, that processing engine call has nothing
to do, because it has only update from sb db that fires back result
of recent transaction from northd. But if, by some reason, engine
change handler falls into full recompute there will be big delay
before sb_cfg propagated to nb db, and in fact it really happened
in old release, for example 22.09.

The fix defers engine run for one iteration (of main loop)
if we have sb_cfg ready to propagate.

Signed-off-by: Aleksandr Smirnov <alekssmirnov@k2.cloud>
---
v2: Address Mark's comments.
v3: Address Vladislav's comments.
---
 northd/ovn-northd.c | 22 ++++++++++++++++++++--
 tests/ovn-northd.at | 37 +++++++++++++++++++++++++++++++++++++
 2 files changed, 57 insertions(+), 2 deletions(-)

Comments

Dumitru Ceara March 31, 2025, 7:40 p.m. UTC | #1
On 3/10/25 11:40 AM, Aleksandr Smirnov wrote:
> According to the ovn-architecture(7) sb_cfg counter should be
> propagated at the moment northd completed transaction of related changes
> to the southbound db. However, a processing engine call happened right
> between two events, a sb transaction commitment, and initiating
> the sb_cfg write. Normally, that processing engine call has nothing
> to do, because it has only update from sb db that fires back result
> of recent transaction from northd. But if, by some reason, engine
> change handler falls into full recompute there will be big delay
> before sb_cfg propagated to nb db, and in fact it really happened
> in old release, for example 22.09.
> 
> The fix defers engine run for one iteration (of main loop)
> if we have sb_cfg ready to propagate.
> 
> Signed-off-by: Aleksandr Smirnov <alekssmirnov@k2.cloud>
> ---

Hi Aleksandr,

Thanks for the patch and sorry for the delay in reviewing.

> v2: Address Mark's comments.
> v3: Address Vladislav's comments.
> ---
>  northd/ovn-northd.c | 22 ++++++++++++++++++++--
>  tests/ovn-northd.at | 37 +++++++++++++++++++++++++++++++++++++
>  2 files changed, 57 insertions(+), 2 deletions(-)
> 
> diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
> index 72eedbfdb..68a31d0b0 100644
> --- a/northd/ovn-northd.c
> +++ b/northd/ovn-northd.c
> @@ -798,6 +798,15 @@ run_memory_trimmer(struct ovsdb_idl *ovnnb_idl, bool activity)
>      memory_trimmer_wait(mt);
>  }
>  
> +static bool
> +sb_cfg_is_in_sync(struct ovsdb_idl *ovnnb_idl,
> +                  struct ovsdb_idl_loop *sb_loop)
> +{
> +    const struct nbrec_nb_global *nb = nbrec_nb_global_first(ovnnb_idl);
> +
> +    return !(nb && sb_loop->cur_cfg && nb->sb_cfg != sb_loop->cur_cfg);
> +}
> +
>  int
>  main(int argc, char *argv[])
>  {
> @@ -1064,8 +1073,17 @@ main(int argc, char *argv[])
>                  if (ovnnb_txn && ovnsb_txn &&
>                      inc_proc_northd_can_run(&eng_ctx)) {
>                      int64_t loop_start_time = time_wall_msec();
> -                    activity = inc_proc_northd_run(ovnnb_txn, ovnsb_txn,
> -                                                   &eng_ctx);
> +
> +                    if (sb_cfg_is_in_sync(ovnnb_idl_loop.idl,
> +                                          &ovnsb_idl_loop)) {
> +                        activity = inc_proc_northd_run(ovnnb_txn,
> +                                                       ovnsb_txn,
> +                                                       &eng_ctx);
> +                    } else {
> +                        poll_immediate_wake();

I guess this call to poll_immediate_wake() was added after Mark's
comment on v1 [0]:

> I think you should add a call to poll_immediate_wake() here. In
> theory, we should write the new sb_cfg value to the northbound
> database. Then that should trigger an update from the northbound
> database that will wake up northd and process what was skipped in this
> loop. However, since we also know that we definitely have data at hand
> that the incremental engine needs to process, we should not rely on
> the database transactions to work as expected before we process that
> data. Calling poll_immediate_wake() will ensure the loop wakes up
> immediately and performs an incremental engine run.

[0] https://mail.openvswitch.org/pipermail/ovs-dev/2025-February/420812.html

However, even without it we should wake up for the next incoming event,
whichever happens first; that is one of:
- northd's SB transaction to bump SB_Global.nb_cfg (done by
update_sequence_numbers() executed just below) completes
- a different NB/SB update wakes us
- a different scheduled poll wake timeout (e.g., mac binding refresh)
expires

Calling poll_immediate_wake() seems a little excessive but I might be
wrong.  Am I missing a case here?

Regards,
Dumitru

> +                        clear_idl_track = false;
> +                    }
> +
>                      check_and_add_supported_dhcp_opts_to_sb_db(
>                                   ovnsb_txn, ovnsb_idl_loop.idl);
>                      check_and_add_supported_dhcpv6_opts_to_sb_db(
> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
> index cfaba19bf..5db6d4984 100644
> --- a/tests/ovn-northd.at
> +++ b/tests/ovn-northd.at
> @@ -16748,3 +16748,40 @@ check grep -q "Bad configuration: The peer of the switch port 'ls-lrp1' (LRP pee
>  
>  AT_CLEANUP
>  ])
> +
> +OVN_FOR_EACH_NORTHD_NO_HV([
> +AT_SETUP([sb_cfg propagation])
> +ovn_start
> +
> +#
> +# Test engine call does not happen between sb db transaction
> +# commitment and sb_cfg write to the nb db (if have to)
> +#
> +> northd/ovn-northd.log
> +check as northd ovn-appctl -t ovn-northd vlog/set jsonrpc:dbg
> +check as northd ovn-appctl -t ovn-northd vlog/set inc_proc_eng:dbg
> +# Any change that invoke engine processing, for example add logical switch.
> +# nb_cfg bump must be present in order to get feedback as sb_cfg.
> +check ovn-nbctl --wait=sb ls-add sw1
> +#
> +# Get following messages from log file:
> +# 'unix... transact ... Southbound' --> Indicates the pack of updates has been
> +#                                       committed to the sb db.
> +# 'unix... transact ... sb_cfg'  --> Indicates the sb_cfg has been committed to
> +#                                    the sb db.
> +# 'Initializing new run' --> Indicates the engine has been called.
> +#
> +# Then, take first letter of messages, here 'u' or 'I' and form them into one
> +# string.
> +#
> +# Finally, a 'good' string should have two 'u' conjuncted with several 'I'
> +# both as prefix and suffix, while 'bad' string will have 'I' between two 'u'.
> +#
> +call_seq=`grep -E \
> + "(transact.*sb_cfg)|(transact.*Southbound)|(Initializing new run)" \
> + northd/ovn-northd.log | cut -d "|" -f 5- | cut -b 1 | tr -d '\n'`
> +
> +AT_CHECK([echo $call_seq | grep -qE "^I*uuI*$"], [0])
> +
> +AT_CLEANUP
> +])
Aleksandr Smirnov May 5, 2025, 9:19 a.m. UTC | #2
Hello Mark,

Could you please review Dumitru's comment on poll_immediate_wake() usage?
I have no my own solid point of view on this.

Thank you,

Alexander


On 3/31/25 10:40 PM, Dumitru Ceara wrote:
> On 3/10/25 11:40 AM, Aleksandr Smirnov wrote:
>> According to the ovn-architecture(7) sb_cfg counter should be
>> propagated at the moment northd completed transaction of related changes
>> to the southbound db. However, a processing engine call happened right
>> between two events, a sb transaction commitment, and initiating
>> the sb_cfg write. Normally, that processing engine call has nothing
>> to do, because it has only update from sb db that fires back result
>> of recent transaction from northd. But if, by some reason, engine
>> change handler falls into full recompute there will be big delay
>> before sb_cfg propagated to nb db, and in fact it really happened
>> in old release, for example 22.09.
>>
>> The fix defers engine run for one iteration (of main loop)
>> if we have sb_cfg ready to propagate.
>>
>> Signed-off-by: Aleksandr Smirnov <alekssmirnov@k2.cloud>
>> ---
> Hi Aleksandr,
>
> Thanks for the patch and sorry for the delay in reviewing.
>
>> v2: Address Mark's comments.
>> v3: Address Vladislav's comments.
>> ---
>>   northd/ovn-northd.c | 22 ++++++++++++++++++++--
>>   tests/ovn-northd.at | 37 +++++++++++++++++++++++++++++++++++++
>>   2 files changed, 57 insertions(+), 2 deletions(-)
>>
>> diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
>> index 72eedbfdb..68a31d0b0 100644
>> --- a/northd/ovn-northd.c
>> +++ b/northd/ovn-northd.c
>> @@ -798,6 +798,15 @@ run_memory_trimmer(struct ovsdb_idl *ovnnb_idl, bool activity)
>>       memory_trimmer_wait(mt);
>>   }
>>   
>> +static bool
>> +sb_cfg_is_in_sync(struct ovsdb_idl *ovnnb_idl,
>> +                  struct ovsdb_idl_loop *sb_loop)
>> +{
>> +    const struct nbrec_nb_global *nb = nbrec_nb_global_first(ovnnb_idl);
>> +
>> +    return !(nb && sb_loop->cur_cfg && nb->sb_cfg != sb_loop->cur_cfg);
>> +}
>> +
>>   int
>>   main(int argc, char *argv[])
>>   {
>> @@ -1064,8 +1073,17 @@ main(int argc, char *argv[])
>>                   if (ovnnb_txn && ovnsb_txn &&
>>                       inc_proc_northd_can_run(&eng_ctx)) {
>>                       int64_t loop_start_time = time_wall_msec();
>> -                    activity = inc_proc_northd_run(ovnnb_txn, ovnsb_txn,
>> -                                                   &eng_ctx);
>> +
>> +                    if (sb_cfg_is_in_sync(ovnnb_idl_loop.idl,
>> +                                          &ovnsb_idl_loop)) {
>> +                        activity = inc_proc_northd_run(ovnnb_txn,
>> +                                                       ovnsb_txn,
>> +                                                       &eng_ctx);
>> +                    } else {
>> +                        poll_immediate_wake();
> I guess this call to poll_immediate_wake() was added after Mark's
> comment on v1 [0]:
>
>> I think you should add a call to poll_immediate_wake() here. In
>> theory, we should write the new sb_cfg value to the northbound
>> database. Then that should trigger an update from the northbound
>> database that will wake up northd and process what was skipped in this
>> loop. However, since we also know that we definitely have data at hand
>> that the incremental engine needs to process, we should not rely on
>> the database transactions to work as expected before we process that
>> data. Calling poll_immediate_wake() will ensure the loop wakes up
>> immediately and performs an incremental engine run.
> [0] https://mail.openvswitch.org/pipermail/ovs-dev/2025-February/420812.html
>
> However, even without it we should wake up for the next incoming event,
> whichever happens first; that is one of:
> - northd's SB transaction to bump SB_Global.nb_cfg (done by
> update_sequence_numbers() executed just below) completes
> - a different NB/SB update wakes us
> - a different scheduled poll wake timeout (e.g., mac binding refresh)
> expires
>
> Calling poll_immediate_wake() seems a little excessive but I might be
> wrong.  Am I missing a case here?
>
> Regards,
> Dumitru
>
>> +                        clear_idl_track = false;
>> +                    }
>> +
>>                       check_and_add_supported_dhcp_opts_to_sb_db(
>>                                    ovnsb_txn, ovnsb_idl_loop.idl);
>>                       check_and_add_supported_dhcpv6_opts_to_sb_db(
>> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
>> index cfaba19bf..5db6d4984 100644
>> --- a/tests/ovn-northd.at
>> +++ b/tests/ovn-northd.at
>> @@ -16748,3 +16748,40 @@ check grep -q "Bad configuration: The peer of the switch port 'ls-lrp1' (LRP pee
>>   
>>   AT_CLEANUP
>>   ])
>> +
>> +OVN_FOR_EACH_NORTHD_NO_HV([
>> +AT_SETUP([sb_cfg propagation])
>> +ovn_start
>> +
>> +#
>> +# Test engine call does not happen between sb db transaction
>> +# commitment and sb_cfg write to the nb db (if have to)
>> +#
>> +> northd/ovn-northd.log
>> +check as northd ovn-appctl -t ovn-northd vlog/set jsonrpc:dbg
>> +check as northd ovn-appctl -t ovn-northd vlog/set inc_proc_eng:dbg
>> +# Any change that invoke engine processing, for example add logical switch.
>> +# nb_cfg bump must be present in order to get feedback as sb_cfg.
>> +check ovn-nbctl --wait=sb ls-add sw1
>> +#
>> +# Get following messages from log file:
>> +# 'unix... transact ... Southbound' --> Indicates the pack of updates has been
>> +#                                       committed to the sb db.
>> +# 'unix... transact ... sb_cfg'  --> Indicates the sb_cfg has been committed to
>> +#                                    the sb db.
>> +# 'Initializing new run' --> Indicates the engine has been called.
>> +#
>> +# Then, take first letter of messages, here 'u' or 'I' and form them into one
>> +# string.
>> +#
>> +# Finally, a 'good' string should have two 'u' conjuncted with several 'I'
>> +# both as prefix and suffix, while 'bad' string will have 'I' between two 'u'.
>> +#
>> +call_seq=`grep -E \
>> + "(transact.*sb_cfg)|(transact.*Southbound)|(Initializing new run)" \
>> + northd/ovn-northd.log | cut -d "|" -f 5- | cut -b 1 | tr -d '\n'`
>> +
>> +AT_CHECK([echo $call_seq | grep -qE "^I*uuI*$"], [0])
>> +
>> +AT_CLEANUP
>> +])
Mark Michelson May 5, 2025, 1:55 p.m. UTC | #3
On 5/5/25 05:19, Smirnov Aleksandr (K2 Cloud) wrote:
> Hello Mark,
> 
> Could you please review Dumitru's comment on poll_immediate_wake() usage?
> I have no my own solid point of view on this.
> 
> Thank you,
> 
> Alexander
> 
> 
> On 3/31/25 10:40 PM, Dumitru Ceara wrote:
>> On 3/10/25 11:40 AM, Aleksandr Smirnov wrote:
>>> According to the ovn-architecture(7) sb_cfg counter should be
>>> propagated at the moment northd completed transaction of related changes
>>> to the southbound db. However, a processing engine call happened right
>>> between two events, a sb transaction commitment, and initiating
>>> the sb_cfg write. Normally, that processing engine call has nothing
>>> to do, because it has only update from sb db that fires back result
>>> of recent transaction from northd. But if, by some reason, engine
>>> change handler falls into full recompute there will be big delay
>>> before sb_cfg propagated to nb db, and in fact it really happened
>>> in old release, for example 22.09.
>>>
>>> The fix defers engine run for one iteration (of main loop)
>>> if we have sb_cfg ready to propagate.
>>>
>>> Signed-off-by: Aleksandr Smirnov <alekssmirnov@k2.cloud>
>>> ---
>> Hi Aleksandr,
>>
>> Thanks for the patch and sorry for the delay in reviewing.
>>
>>> v2: Address Mark's comments.
>>> v3: Address Vladislav's comments.
>>> ---
>>>    northd/ovn-northd.c | 22 ++++++++++++++++++++--
>>>    tests/ovn-northd.at | 37 +++++++++++++++++++++++++++++++++++++
>>>    2 files changed, 57 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
>>> index 72eedbfdb..68a31d0b0 100644
>>> --- a/northd/ovn-northd.c
>>> +++ b/northd/ovn-northd.c
>>> @@ -798,6 +798,15 @@ run_memory_trimmer(struct ovsdb_idl *ovnnb_idl, bool activity)
>>>        memory_trimmer_wait(mt);
>>>    }
>>>    
>>> +static bool
>>> +sb_cfg_is_in_sync(struct ovsdb_idl *ovnnb_idl,
>>> +                  struct ovsdb_idl_loop *sb_loop)
>>> +{
>>> +    const struct nbrec_nb_global *nb = nbrec_nb_global_first(ovnnb_idl);
>>> +
>>> +    return !(nb && sb_loop->cur_cfg && nb->sb_cfg != sb_loop->cur_cfg);
>>> +}
>>> +
>>>    int
>>>    main(int argc, char *argv[])
>>>    {
>>> @@ -1064,8 +1073,17 @@ main(int argc, char *argv[])
>>>                    if (ovnnb_txn && ovnsb_txn &&
>>>                        inc_proc_northd_can_run(&eng_ctx)) {
>>>                        int64_t loop_start_time = time_wall_msec();
>>> -                    activity = inc_proc_northd_run(ovnnb_txn, ovnsb_txn,
>>> -                                                   &eng_ctx);
>>> +
>>> +                    if (sb_cfg_is_in_sync(ovnnb_idl_loop.idl,
>>> +                                          &ovnsb_idl_loop)) {
>>> +                        activity = inc_proc_northd_run(ovnnb_txn,
>>> +                                                       ovnsb_txn,
>>> +                                                       &eng_ctx);
>>> +                    } else {
>>> +                        poll_immediate_wake();
>> I guess this call to poll_immediate_wake() was added after Mark's
>> comment on v1 [0]:
>>
>>> I think you should add a call to poll_immediate_wake() here. In
>>> theory, we should write the new sb_cfg value to the northbound
>>> database. Then that should trigger an update from the northbound
>>> database that will wake up northd and process what was skipped in this
>>> loop. However, since we also know that we definitely have data at hand
>>> that the incremental engine needs to process, we should not rely on
>>> the database transactions to work as expected before we process that
>>> data. Calling poll_immediate_wake() will ensure the loop wakes up
>>> immediately and performs an incremental engine run.
>> [0] https://mail.openvswitch.org/pipermail/ovs-dev/2025-February/420812.html
>>
>> However, even without it we should wake up for the next incoming event,
>> whichever happens first; that is one of:
>> - northd's SB transaction to bump SB_Global.nb_cfg (done by
>> update_sequence_numbers() executed just below) completes
>> - a different NB/SB update wakes us
>> - a different scheduled poll wake timeout (e.g., mac binding refresh)
>> expires
>>
>> Calling poll_immediate_wake() seems a little excessive but I might be
>> wrong.  Am I missing a case here?

I don't think you're missing a case here. My idea for calling 
poll_immediate_wake() was couched in paranoia and principle.

On the paranoia side, you've listed the expected ways that the loop 
should be woken up based on current code. However, if any bugs should be 
introduced to those systems, or if there are database connection issues 
[1], or there is some change to how the mechanisms work, and they end up 
not waking the loop up, then our reliance on them may cause problems.

On the principle side, knowing that we have data that requires 
processing, it only makes sense that we should tell the poll loop to 
wake up to process that data. This makes the code work in the face of 
unknown bugs or unintended consequences of changes.

If there is active harm in calling poll_immediate_wake() here, then that 
is a different matter, and we should avoid it. However, I don't think 
that's the case.

Thanks,
Mark

[1] I realize that connection issues have more severe consequences than 
simply not waking up the polling loop, but for the purposes of this 
discussion, that is the sort of thing that could cause an expected 
wake-up not to occur.

>>
>> Regards,
>> Dumitru
>>
>>> +                        clear_idl_track = false;
>>> +                    }
>>> +
>>>                        check_and_add_supported_dhcp_opts_to_sb_db(
>>>                                     ovnsb_txn, ovnsb_idl_loop.idl);
>>>                        check_and_add_supported_dhcpv6_opts_to_sb_db(
>>> diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
>>> index cfaba19bf..5db6d4984 100644
>>> --- a/tests/ovn-northd.at
>>> +++ b/tests/ovn-northd.at
>>> @@ -16748,3 +16748,40 @@ check grep -q "Bad configuration: The peer of the switch port 'ls-lrp1' (LRP pee
>>>    
>>>    AT_CLEANUP
>>>    ])
>>> +
>>> +OVN_FOR_EACH_NORTHD_NO_HV([
>>> +AT_SETUP([sb_cfg propagation])
>>> +ovn_start
>>> +
>>> +#
>>> +# Test engine call does not happen between sb db transaction
>>> +# commitment and sb_cfg write to the nb db (if have to)
>>> +#
>>> +> northd/ovn-northd.log
>>> +check as northd ovn-appctl -t ovn-northd vlog/set jsonrpc:dbg
>>> +check as northd ovn-appctl -t ovn-northd vlog/set inc_proc_eng:dbg
>>> +# Any change that invoke engine processing, for example add logical switch.
>>> +# nb_cfg bump must be present in order to get feedback as sb_cfg.
>>> +check ovn-nbctl --wait=sb ls-add sw1
>>> +#
>>> +# Get following messages from log file:
>>> +# 'unix... transact ... Southbound' --> Indicates the pack of updates has been
>>> +#                                       committed to the sb db.
>>> +# 'unix... transact ... sb_cfg'  --> Indicates the sb_cfg has been committed to
>>> +#                                    the sb db.
>>> +# 'Initializing new run' --> Indicates the engine has been called.
>>> +#
>>> +# Then, take first letter of messages, here 'u' or 'I' and form them into one
>>> +# string.
>>> +#
>>> +# Finally, a 'good' string should have two 'u' conjuncted with several 'I'
>>> +# both as prefix and suffix, while 'bad' string will have 'I' between two 'u'.
>>> +#
>>> +call_seq=`grep -E \
>>> + "(transact.*sb_cfg)|(transact.*Southbound)|(Initializing new run)" \
>>> + northd/ovn-northd.log | cut -d "|" -f 5- | cut -b 1 | tr -d '\n'`
>>> +
>>> +AT_CHECK([echo $call_seq | grep -qE "^I*uuI*$"], [0])
>>> +
>>> +AT_CLEANUP
>>> +])
> 
>
Dumitru Ceara May 20, 2025, 12:26 p.m. UTC | #4
Hi Mark, Aleksandr,

On 5/5/25 3:55 PM, Mark Michelson wrote:
> On 5/5/25 05:19, Smirnov Aleksandr (K2 Cloud) wrote:
>> Hello Mark,
>>
>> Could you please review Dumitru's comment on poll_immediate_wake() usage?
>> I have no my own solid point of view on this.
>>
>> Thank you,
>>
>> Alexander
>>
>>
>> On 3/31/25 10:40 PM, Dumitru Ceara wrote:
>>> On 3/10/25 11:40 AM, Aleksandr Smirnov wrote:
>>>> According to the ovn-architecture(7) sb_cfg counter should be
>>>> propagated at the moment northd completed transaction of related
>>>> changes
>>>> to the southbound db. However, a processing engine call happened right
>>>> between two events, a sb transaction commitment, and initiating
>>>> the sb_cfg write. Normally, that processing engine call has nothing
>>>> to do, because it has only update from sb db that fires back result
>>>> of recent transaction from northd. But if, by some reason, engine
>>>> change handler falls into full recompute there will be big delay
>>>> before sb_cfg propagated to nb db, and in fact it really happened
>>>> in old release, for example 22.09.
>>>>
>>>> The fix defers engine run for one iteration (of main loop)
>>>> if we have sb_cfg ready to propagate.
>>>>
>>>> Signed-off-by: Aleksandr Smirnov <alekssmirnov@k2.cloud>
>>>> ---
>>> Hi Aleksandr,
>>>
>>> Thanks for the patch and sorry for the delay in reviewing.
>>>
>>>> v2: Address Mark's comments.
>>>> v3: Address Vladislav's comments.
>>>> ---
>>>>    northd/ovn-northd.c | 22 ++++++++++++++++++++--
>>>>    tests/ovn-northd.at | 37 +++++++++++++++++++++++++++++++++++++
>>>>    2 files changed, 57 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
>>>> index 72eedbfdb..68a31d0b0 100644
>>>> --- a/northd/ovn-northd.c
>>>> +++ b/northd/ovn-northd.c
>>>> @@ -798,6 +798,15 @@ run_memory_trimmer(struct ovsdb_idl *ovnnb_idl,
>>>> bool activity)
>>>>        memory_trimmer_wait(mt);
>>>>    }
>>>>    +static bool
>>>> +sb_cfg_is_in_sync(struct ovsdb_idl *ovnnb_idl,
>>>> +                  struct ovsdb_idl_loop *sb_loop)
>>>> +{
>>>> +    const struct nbrec_nb_global *nb =
>>>> nbrec_nb_global_first(ovnnb_idl);
>>>> +
>>>> +    return !(nb && sb_loop->cur_cfg && nb->sb_cfg != sb_loop-
>>>> >cur_cfg);
>>>> +}
>>>> +
>>>>    int
>>>>    main(int argc, char *argv[])
>>>>    {
>>>> @@ -1064,8 +1073,17 @@ main(int argc, char *argv[])
>>>>                    if (ovnnb_txn && ovnsb_txn &&
>>>>                        inc_proc_northd_can_run(&eng_ctx)) {
>>>>                        int64_t loop_start_time = time_wall_msec();
>>>> -                    activity = inc_proc_northd_run(ovnnb_txn,
>>>> ovnsb_txn,
>>>> -                                                   &eng_ctx);
>>>> +
>>>> +                    if (sb_cfg_is_in_sync(ovnnb_idl_loop.idl,
>>>> +                                          &ovnsb_idl_loop)) {
>>>> +                        activity = inc_proc_northd_run(ovnnb_txn,
>>>> +                                                       ovnsb_txn,
>>>> +                                                       &eng_ctx);
>>>> +                    } else {
>>>> +                        poll_immediate_wake();
>>> I guess this call to poll_immediate_wake() was added after Mark's
>>> comment on v1 [0]:
>>>
>>>> I think you should add a call to poll_immediate_wake() here. In
>>>> theory, we should write the new sb_cfg value to the northbound
>>>> database. Then that should trigger an update from the northbound
>>>> database that will wake up northd and process what was skipped in this
>>>> loop. However, since we also know that we definitely have data at hand
>>>> that the incremental engine needs to process, we should not rely on
>>>> the database transactions to work as expected before we process that
>>>> data. Calling poll_immediate_wake() will ensure the loop wakes up
>>>> immediately and performs an incremental engine run.
>>> [0] https://mail.openvswitch.org/pipermail/ovs-dev/2025-
>>> February/420812.html
>>>
>>> However, even without it we should wake up for the next incoming event,
>>> whichever happens first; that is one of:
>>> - northd's SB transaction to bump SB_Global.nb_cfg (done by
>>> update_sequence_numbers() executed just below) completes
>>> - a different NB/SB update wakes us
>>> - a different scheduled poll wake timeout (e.g., mac binding refresh)
>>> expires
>>>
>>> Calling poll_immediate_wake() seems a little excessive but I might be
>>> wrong.  Am I missing a case here?
> 
> I don't think you're missing a case here. My idea for calling
> poll_immediate_wake() was couched in paranoia and principle.
> 
> On the paranoia side, you've listed the expected ways that the loop
> should be woken up based on current code. However, if any bugs should be
> introduced to those systems, or if there are database connection issues
> [1], or there is some change to how the mechanisms work, and they end up
> not waking the loop up, then our reliance on them may cause problems.
> 
> On the principle side, knowing that we have data that requires
> processing, it only makes sense that we should tell the poll loop to
> wake up to process that data. This makes the code work in the face of
> unknown bugs or unintended consequences of changes.
> 
> If there is active harm in calling poll_immediate_wake() here, then that
> is a different matter, and we should avoid it. However, I don't think
> that's the case.
> 

You're right Mark, it's better to not rely on waking up as side effects
of other events.

I applied the patch to main, thanks!

Best regards,
Dumitru

> Thanks,
> Mark
> 
> [1] I realize that connection issues have more severe consequences than
> simply not waking up the polling loop, but for the purposes of this
> discussion, that is the sort of thing that could cause an expected wake-
> up not to occur.
>
diff mbox series

Patch

diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c
index 72eedbfdb..68a31d0b0 100644
--- a/northd/ovn-northd.c
+++ b/northd/ovn-northd.c
@@ -798,6 +798,15 @@  run_memory_trimmer(struct ovsdb_idl *ovnnb_idl, bool activity)
     memory_trimmer_wait(mt);
 }
 
+static bool
+sb_cfg_is_in_sync(struct ovsdb_idl *ovnnb_idl,
+                  struct ovsdb_idl_loop *sb_loop)
+{
+    const struct nbrec_nb_global *nb = nbrec_nb_global_first(ovnnb_idl);
+
+    return !(nb && sb_loop->cur_cfg && nb->sb_cfg != sb_loop->cur_cfg);
+}
+
 int
 main(int argc, char *argv[])
 {
@@ -1064,8 +1073,17 @@  main(int argc, char *argv[])
                 if (ovnnb_txn && ovnsb_txn &&
                     inc_proc_northd_can_run(&eng_ctx)) {
                     int64_t loop_start_time = time_wall_msec();
-                    activity = inc_proc_northd_run(ovnnb_txn, ovnsb_txn,
-                                                   &eng_ctx);
+
+                    if (sb_cfg_is_in_sync(ovnnb_idl_loop.idl,
+                                          &ovnsb_idl_loop)) {
+                        activity = inc_proc_northd_run(ovnnb_txn,
+                                                       ovnsb_txn,
+                                                       &eng_ctx);
+                    } else {
+                        poll_immediate_wake();
+                        clear_idl_track = false;
+                    }
+
                     check_and_add_supported_dhcp_opts_to_sb_db(
                                  ovnsb_txn, ovnsb_idl_loop.idl);
                     check_and_add_supported_dhcpv6_opts_to_sb_db(
diff --git a/tests/ovn-northd.at b/tests/ovn-northd.at
index cfaba19bf..5db6d4984 100644
--- a/tests/ovn-northd.at
+++ b/tests/ovn-northd.at
@@ -16748,3 +16748,40 @@  check grep -q "Bad configuration: The peer of the switch port 'ls-lrp1' (LRP pee
 
 AT_CLEANUP
 ])
+
+OVN_FOR_EACH_NORTHD_NO_HV([
+AT_SETUP([sb_cfg propagation])
+ovn_start
+
+#
+# Test engine call does not happen between sb db transaction
+# commitment and sb_cfg write to the nb db (if have to)
+#
+> northd/ovn-northd.log
+check as northd ovn-appctl -t ovn-northd vlog/set jsonrpc:dbg
+check as northd ovn-appctl -t ovn-northd vlog/set inc_proc_eng:dbg
+# Any change that invoke engine processing, for example add logical switch.
+# nb_cfg bump must be present in order to get feedback as sb_cfg.
+check ovn-nbctl --wait=sb ls-add sw1
+#
+# Get following messages from log file:
+# 'unix... transact ... Southbound' --> Indicates the pack of updates has been
+#                                       committed to the sb db.
+# 'unix... transact ... sb_cfg'  --> Indicates the sb_cfg has been committed to
+#                                    the sb db.
+# 'Initializing new run' --> Indicates the engine has been called.
+#
+# Then, take first letter of messages, here 'u' or 'I' and form them into one
+# string.
+#
+# Finally, a 'good' string should have two 'u' conjuncted with several 'I'
+# both as prefix and suffix, while 'bad' string will have 'I' between two 'u'.
+#
+call_seq=`grep -E \
+ "(transact.*sb_cfg)|(transact.*Southbound)|(Initializing new run)" \
+ northd/ovn-northd.log | cut -d "|" -f 5- | cut -b 1 | tr -d '\n'`
+
+AT_CHECK([echo $call_seq | grep -qE "^I*uuI*$"], [0])
+
+AT_CLEANUP
+])