Message ID | 936751686098234@gavdw3jrkjwbxdg3.myt.yp-c.yandex.net |
---|---|
State | Accepted |
Headers | show |
Series | [ovs-dev] call ovsrcu_exit() before exit in ovn-northd and ovn-controller to make valgrind happy | expand |
Context | Check | Description |
---|---|---|
ovsrobot/apply-robot | success | apply and check: success |
ovsrobot/github-robot-_ovn-kubernetes | success | github build: passed |
ovsrobot/github-robot-_Build_and_Test | fail | github build: failed |
Thanks, Igor. Acked-by: Mark Michelson <mmichels@redhat.com> On 6/6/23 20:37, Igor Zhukov wrote: > From: Igor Zhukov <ivzhukov@sbercloud.ru> > > You can check logs by running: make check-valgrind TESTSUITEFLAGS="246" > > (Actually almost every test affected but for example we need one test) > > Valgrind message looks like > > ==65437== 304 bytes in 1 blocks are possibly lost in loss record 265 of 289 > ==65437== at 0x483DD99: calloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so) > ==65437== by 0x40149DA: allocate_dtv (dl-tls.c:286) > ==65437== by 0x40149DA: _dl_allocate_tls (dl-tls.c:532) > ==65437== by 0x4BE2322: allocate_stack (allocatestack.c:622) > ==65437== by 0x4BE2322: pthread_create@@GLIBC_2.2.5 (pthread_create.c:660) > ==65437== by 0x283D46: ovs_thread_create (in /home/ivzhukov/upstream/ovn/controller/ovn-controller) > ==65437== by 0x27FFB0: ovsrcu_quiesced (in /home/ivzhukov/upstream/ovn/controller/ovn-controller) > ==65437== by 0x28000B: ovsrcu_quiesce_start (in /home/ivzhukov/upstream/ovn/controller/ovn-controller) > ==65437== by 0x2AFE04: time_poll (in /home/ivzhukov/upstream/ovn/controller/ovn-controller) > ==65437== by 0x2A16A2: poll_block (in /home/ivzhukov/upstream/ovn/controller/ovn-controller) > ==65437== by 0x165019: pinctrl_handler (in /home/ivzhukov/upstream/ovn/controller/ovn-controller) > ==65437== by 0x283BB1: ovsthread_wrapper (in /home/ivzhukov/upstream/ovn/controller/ovn-controller) > ==65437== by 0x4BE1608: start_thread (pthread_create.c:477) > ==65437== by 0x4F5F132: clone (clone.S:95) > > or > > ==63926== 304 bytes in 1 blocks are possibly lost in loss record 128 of 130 > ==63926== at 0x483DD99: calloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so) > ==63926== by 0x40149DA: allocate_dtv (dl-tls.c:286) > ==63926== by 0x40149DA: _dl_allocate_tls (dl-tls.c:532) > ==63926== by 0x4BE2322: allocate_stack (allocatestack.c:622) > ==63926== by 0x4BE2322: pthread_create@@GLIBC_2.2.5 (pthread_create.c:660) > ==63926== by 0x24EE4B: ovs_thread_create (in /home/ivzhukov/upstream/ovn/northd/ovn-northd) > ==63926== by 0x24CB63: ovsrcu_quiesced (in /home/ivzhukov/upstream/ovn/northd/ovn-northd) > ==63926== by 0x24CBBE: ovsrcu_quiesce_start (in /home/ivzhukov/upstream/ovn/northd/ovn-northd) > ==63926== by 0x276DE6: time_poll (in /home/ivzhukov/upstream/ovn/northd/ovn-northd) > ==63926== by 0x26BA03: poll_block (in /home/ivzhukov/upstream/ovn/northd/ovn-northd) > ==63926== by 0x26A9FD: stopwatch_thread (in /home/ivzhukov/upstream/ovn/northd/ovn-northd) > ==63926== by 0x24ECB6: ovsthread_wrapper (in /home/ivzhukov/upstream/ovn/northd/ovn-northd) > ==63926== by 0x4BE1608: start_thread (pthread_create.c:477) > ==63926== by 0x4F5F132: clone (clone.S:95) > > I also noticed that ovs-vswitchd calls ovsrcu_exit() before exit: https://github.com/openvswitch/ovs/blob/0187eadfce4505d502e57c0e688b830f0a1ec728/vswitchd/ovs-vswitchd.c#L150 > > I also stopped a thread pool by calling run_update_worker_pool(0); because ovsrcu_exit(); stucks without it. > > Signed-off-by: Igor Zhukov <ivzhukov@sbercloud.ru> > --- > controller/ovn-controller.c | 1 + > northd/ovn-northd.c | 3 +++ > 2 files changed, 4 insertions(+) > > diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c > index 3a81a13fb..a47406979 100644 > --- a/controller/ovn-controller.c > +++ b/controller/ovn-controller.c > @@ -5620,6 +5620,7 @@ loop_done: > free(cli_system_id); > } > service_stop(); > + ovsrcu_exit(); > > exit(retval); > } > diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c > index 3515b68a2..de807c0a4 100644 > --- a/northd/ovn-northd.c > +++ b/northd/ovn-northd.c > @@ -33,6 +33,7 @@ > #include "lib/ovn-l7.h" > #include "lib/ovn-nb-idl.h" > #include "lib/ovn-sb-idl.h" > +#include "lib/ovs-rcu.h" > #include "openvswitch/poll-loop.h" > #include "simap.h" > #include "stopwatch.h" > @@ -1030,6 +1031,8 @@ main(int argc, char *argv[]) > ovsdb_idl_loop_destroy(&ovnnb_idl_loop); > ovsdb_idl_loop_destroy(&ovnsb_idl_loop); > service_stop(); > + run_update_worker_pool(0); > + ovsrcu_exit(); > > exit(res); > }
I pushed this change to main and all branches back to 22.03. Thanks! On 6/7/23 15:21, Mark Michelson wrote: > Thanks, Igor. > > Acked-by: Mark Michelson <mmichels@redhat.com> > > On 6/6/23 20:37, Igor Zhukov wrote: >> From: Igor Zhukov <ivzhukov@sbercloud.ru> >> >> You can check logs by running: make check-valgrind TESTSUITEFLAGS="246" >> >> (Actually almost every test affected but for example we need one test) >> >> Valgrind message looks like >> >> ==65437== 304 bytes in 1 blocks are possibly lost in loss record 265 >> of 289 >> ==65437== at 0x483DD99: calloc (in >> /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so) >> ==65437== by 0x40149DA: allocate_dtv (dl-tls.c:286) >> ==65437== by 0x40149DA: _dl_allocate_tls (dl-tls.c:532) >> ==65437== by 0x4BE2322: allocate_stack (allocatestack.c:622) >> ==65437== by 0x4BE2322: pthread_create@@GLIBC_2.2.5 >> (pthread_create.c:660) >> ==65437== by 0x283D46: ovs_thread_create (in >> /home/ivzhukov/upstream/ovn/controller/ovn-controller) >> ==65437== by 0x27FFB0: ovsrcu_quiesced (in >> /home/ivzhukov/upstream/ovn/controller/ovn-controller) >> ==65437== by 0x28000B: ovsrcu_quiesce_start (in >> /home/ivzhukov/upstream/ovn/controller/ovn-controller) >> ==65437== by 0x2AFE04: time_poll (in >> /home/ivzhukov/upstream/ovn/controller/ovn-controller) >> ==65437== by 0x2A16A2: poll_block (in >> /home/ivzhukov/upstream/ovn/controller/ovn-controller) >> ==65437== by 0x165019: pinctrl_handler (in >> /home/ivzhukov/upstream/ovn/controller/ovn-controller) >> ==65437== by 0x283BB1: ovsthread_wrapper (in >> /home/ivzhukov/upstream/ovn/controller/ovn-controller) >> ==65437== by 0x4BE1608: start_thread (pthread_create.c:477) >> ==65437== by 0x4F5F132: clone (clone.S:95) >> >> or >> >> ==63926== 304 bytes in 1 blocks are possibly lost in loss record 128 >> of 130 >> ==63926== at 0x483DD99: calloc (in >> /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so) >> ==63926== by 0x40149DA: allocate_dtv (dl-tls.c:286) >> ==63926== by 0x40149DA: _dl_allocate_tls (dl-tls.c:532) >> ==63926== by 0x4BE2322: allocate_stack (allocatestack.c:622) >> ==63926== by 0x4BE2322: pthread_create@@GLIBC_2.2.5 >> (pthread_create.c:660) >> ==63926== by 0x24EE4B: ovs_thread_create (in >> /home/ivzhukov/upstream/ovn/northd/ovn-northd) >> ==63926== by 0x24CB63: ovsrcu_quiesced (in >> /home/ivzhukov/upstream/ovn/northd/ovn-northd) >> ==63926== by 0x24CBBE: ovsrcu_quiesce_start (in >> /home/ivzhukov/upstream/ovn/northd/ovn-northd) >> ==63926== by 0x276DE6: time_poll (in >> /home/ivzhukov/upstream/ovn/northd/ovn-northd) >> ==63926== by 0x26BA03: poll_block (in >> /home/ivzhukov/upstream/ovn/northd/ovn-northd) >> ==63926== by 0x26A9FD: stopwatch_thread (in >> /home/ivzhukov/upstream/ovn/northd/ovn-northd) >> ==63926== by 0x24ECB6: ovsthread_wrapper (in >> /home/ivzhukov/upstream/ovn/northd/ovn-northd) >> ==63926== by 0x4BE1608: start_thread (pthread_create.c:477) >> ==63926== by 0x4F5F132: clone (clone.S:95) >> >> I also noticed that ovs-vswitchd calls ovsrcu_exit() before exit: >> https://github.com/openvswitch/ovs/blob/0187eadfce4505d502e57c0e688b830f0a1ec728/vswitchd/ovs-vswitchd.c#L150 >> >> I also stopped a thread pool by calling run_update_worker_pool(0); >> because ovsrcu_exit(); stucks without it. >> >> Signed-off-by: Igor Zhukov <ivzhukov@sbercloud.ru> >> --- >> controller/ovn-controller.c | 1 + >> northd/ovn-northd.c | 3 +++ >> 2 files changed, 4 insertions(+) >> >> diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c >> index 3a81a13fb..a47406979 100644 >> --- a/controller/ovn-controller.c >> +++ b/controller/ovn-controller.c >> @@ -5620,6 +5620,7 @@ loop_done: >> free(cli_system_id); >> } >> service_stop(); >> + ovsrcu_exit(); >> exit(retval); >> } >> diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c >> index 3515b68a2..de807c0a4 100644 >> --- a/northd/ovn-northd.c >> +++ b/northd/ovn-northd.c >> @@ -33,6 +33,7 @@ >> #include "lib/ovn-l7.h" >> #include "lib/ovn-nb-idl.h" >> #include "lib/ovn-sb-idl.h" >> +#include "lib/ovs-rcu.h" >> #include "openvswitch/poll-loop.h" >> #include "simap.h" >> #include "stopwatch.h" >> @@ -1030,6 +1031,8 @@ main(int argc, char *argv[]) >> ovsdb_idl_loop_destroy(&ovnnb_idl_loop); >> ovsdb_idl_loop_destroy(&ovnsb_idl_loop); >> service_stop(); >> + run_update_worker_pool(0); >> + ovsrcu_exit(); >> exit(res); >> } >
Great! You are welcome!
On 6/9/23 21:13, Mark Michelson wrote: > I pushed this change to main and all branches back to 22.03. Thanks! > Branch 22.03 doesn't have run_update_worker_pool() so this backport broke compilation there. I posted a 22.03-only revert patch: https://patchwork.ozlabs.org/project/ovn/patch/20230614091231.463313-1-dceara@redhat.com/ Regards, Dumitru > On 6/7/23 15:21, Mark Michelson wrote: >> Thanks, Igor. >> >> Acked-by: Mark Michelson <mmichels@redhat.com> >> >> On 6/6/23 20:37, Igor Zhukov wrote: >>> From: Igor Zhukov <ivzhukov@sbercloud.ru> >>> >>> You can check logs by running: make check-valgrind TESTSUITEFLAGS="246" >>> >>> (Actually almost every test affected but for example we need one test) >>> >>> Valgrind message looks like >>> >>> ==65437== 304 bytes in 1 blocks are possibly lost in loss record 265 >>> of 289 >>> ==65437== at 0x483DD99: calloc (in >>> /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so) >>> ==65437== by 0x40149DA: allocate_dtv (dl-tls.c:286) >>> ==65437== by 0x40149DA: _dl_allocate_tls (dl-tls.c:532) >>> ==65437== by 0x4BE2322: allocate_stack (allocatestack.c:622) >>> ==65437== by 0x4BE2322: pthread_create@@GLIBC_2.2.5 >>> (pthread_create.c:660) >>> ==65437== by 0x283D46: ovs_thread_create (in >>> /home/ivzhukov/upstream/ovn/controller/ovn-controller) >>> ==65437== by 0x27FFB0: ovsrcu_quiesced (in >>> /home/ivzhukov/upstream/ovn/controller/ovn-controller) >>> ==65437== by 0x28000B: ovsrcu_quiesce_start (in >>> /home/ivzhukov/upstream/ovn/controller/ovn-controller) >>> ==65437== by 0x2AFE04: time_poll (in >>> /home/ivzhukov/upstream/ovn/controller/ovn-controller) >>> ==65437== by 0x2A16A2: poll_block (in >>> /home/ivzhukov/upstream/ovn/controller/ovn-controller) >>> ==65437== by 0x165019: pinctrl_handler (in >>> /home/ivzhukov/upstream/ovn/controller/ovn-controller) >>> ==65437== by 0x283BB1: ovsthread_wrapper (in >>> /home/ivzhukov/upstream/ovn/controller/ovn-controller) >>> ==65437== by 0x4BE1608: start_thread (pthread_create.c:477) >>> ==65437== by 0x4F5F132: clone (clone.S:95) >>> >>> or >>> >>> ==63926== 304 bytes in 1 blocks are possibly lost in loss record 128 >>> of 130 >>> ==63926== at 0x483DD99: calloc (in >>> /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so) >>> ==63926== by 0x40149DA: allocate_dtv (dl-tls.c:286) >>> ==63926== by 0x40149DA: _dl_allocate_tls (dl-tls.c:532) >>> ==63926== by 0x4BE2322: allocate_stack (allocatestack.c:622) >>> ==63926== by 0x4BE2322: pthread_create@@GLIBC_2.2.5 >>> (pthread_create.c:660) >>> ==63926== by 0x24EE4B: ovs_thread_create (in >>> /home/ivzhukov/upstream/ovn/northd/ovn-northd) >>> ==63926== by 0x24CB63: ovsrcu_quiesced (in >>> /home/ivzhukov/upstream/ovn/northd/ovn-northd) >>> ==63926== by 0x24CBBE: ovsrcu_quiesce_start (in >>> /home/ivzhukov/upstream/ovn/northd/ovn-northd) >>> ==63926== by 0x276DE6: time_poll (in >>> /home/ivzhukov/upstream/ovn/northd/ovn-northd) >>> ==63926== by 0x26BA03: poll_block (in >>> /home/ivzhukov/upstream/ovn/northd/ovn-northd) >>> ==63926== by 0x26A9FD: stopwatch_thread (in >>> /home/ivzhukov/upstream/ovn/northd/ovn-northd) >>> ==63926== by 0x24ECB6: ovsthread_wrapper (in >>> /home/ivzhukov/upstream/ovn/northd/ovn-northd) >>> ==63926== by 0x4BE1608: start_thread (pthread_create.c:477) >>> ==63926== by 0x4F5F132: clone (clone.S:95) >>> >>> I also noticed that ovs-vswitchd calls ovsrcu_exit() before exit: >>> https://github.com/openvswitch/ovs/blob/0187eadfce4505d502e57c0e688b830f0a1ec728/vswitchd/ovs-vswitchd.c#L150 >>> >>> I also stopped a thread pool by calling run_update_worker_pool(0); >>> because ovsrcu_exit(); stucks without it. >>> >>> Signed-off-by: Igor Zhukov <ivzhukov@sbercloud.ru> >>> --- >>> controller/ovn-controller.c | 1 + >>> northd/ovn-northd.c | 3 +++ >>> 2 files changed, 4 insertions(+) >>> >>> diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c >>> index 3a81a13fb..a47406979 100644 >>> --- a/controller/ovn-controller.c >>> +++ b/controller/ovn-controller.c >>> @@ -5620,6 +5620,7 @@ loop_done: >>> free(cli_system_id); >>> } >>> service_stop(); >>> + ovsrcu_exit(); >>> exit(retval); >>> } >>> diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c >>> index 3515b68a2..de807c0a4 100644 >>> --- a/northd/ovn-northd.c >>> +++ b/northd/ovn-northd.c >>> @@ -33,6 +33,7 @@ >>> #include "lib/ovn-l7.h" >>> #include "lib/ovn-nb-idl.h" >>> #include "lib/ovn-sb-idl.h" >>> +#include "lib/ovs-rcu.h" >>> #include "openvswitch/poll-loop.h" >>> #include "simap.h" >>> #include "stopwatch.h" >>> @@ -1030,6 +1031,8 @@ main(int argc, char *argv[]) >>> ovsdb_idl_loop_destroy(&ovnnb_idl_loop); >>> ovsdb_idl_loop_destroy(&ovnsb_idl_loop); >>> service_stop(); >>> + run_update_worker_pool(0); >>> + ovsrcu_exit(); >>> exit(res); >>> } >> > > _______________________________________________ > dev mailing list > dev@openvswitch.org > https://mail.openvswitch.org/mailman/listinfo/ovs-dev
I agree. Yes, we needed to check the build on version 22.03. Sorry.
diff --git a/controller/ovn-controller.c b/controller/ovn-controller.c index 3a81a13fb..a47406979 100644 --- a/controller/ovn-controller.c +++ b/controller/ovn-controller.c @@ -5620,6 +5620,7 @@ loop_done: free(cli_system_id); } service_stop(); + ovsrcu_exit(); exit(retval); } diff --git a/northd/ovn-northd.c b/northd/ovn-northd.c index 3515b68a2..de807c0a4 100644 --- a/northd/ovn-northd.c +++ b/northd/ovn-northd.c @@ -33,6 +33,7 @@ #include "lib/ovn-l7.h" #include "lib/ovn-nb-idl.h" #include "lib/ovn-sb-idl.h" +#include "lib/ovs-rcu.h" #include "openvswitch/poll-loop.h" #include "simap.h" #include "stopwatch.h" @@ -1030,6 +1031,8 @@ main(int argc, char *argv[]) ovsdb_idl_loop_destroy(&ovnnb_idl_loop); ovsdb_idl_loop_destroy(&ovnsb_idl_loop); service_stop(); + run_update_worker_pool(0); + ovsrcu_exit(); exit(res); }