Message ID | e573ada7-7491-86d4-e36d-40032b18c9df@redhat.com |
---|---|
State | New |
Headers | show |
Series | nptl: Fix extraneous testing run by tst-rseq-nptl in the test driver | expand |
On Thu, 11 Jul 2024, Maciej W. Rozycki wrote: > Fix an issue with commit 8f4632deb354 ("Linux: rseq registration tests") > and prevent testing from being run in the process of the test driver > itself rather than just the test child where one has been forked. The > problem here is the unguarded use of a destructor to call a part of the > testing. The destructor function, 'do_rseq_destructor_test' is called > implicitly at program completion, however because it is associated with > the executable itself rather than an individual process, it is called > both in the test child *and* in the test driver itself. Ping for: <https://sourceware.org/pipermail/libc-alpha/2024-July/158277.html>, <https://patchwork.sourceware.org/project/glibc/patch/e573ada7-7491-86d4-e36d-40032b18c9df@redhat.com/>. Maciej
On Thu, 11 Jul 2024, Maciej W. Rozycki wrote: > Fix an issue with commit 8f4632deb354 ("Linux: rseq registration tests") > and prevent testing from being run in the process of the test driver > itself rather than just the test child where one has been forked. The > problem here is the unguarded use of a destructor to call a part of the > testing. The destructor function, 'do_rseq_destructor_test' is called > implicitly at program completion, however because it is associated with > the executable itself rather than an individual process, it is called > both in the test child *and* in the test driver itself. Ping for: <https://sourceware.org/pipermail/libc-alpha/2024-July/158277.html>, <https://patchwork.sourceware.org/project/glibc/patch/e573ada7-7491-86d4-e36d-40032b18c9df@redhat.com/>. Maciej
On 11/07/24 11:07, Maciej W. Rozycki wrote: > Fix an issue with commit 8f4632deb354 ("Linux: rseq registration tests") > and prevent testing from being run in the process of the test driver > itself rather than just the test child where one has been forked. The > problem here is the unguarded use of a destructor to call a part of the > testing. The destructor function, 'do_rseq_destructor_test' is called > implicitly at program completion, however because it is associated with > the executable itself rather than an individual process, it is called > both in the test child *and* in the test driver itself. > > Prevent this from happening by providing a guard variable that only > enables test invocation from 'do_rseq_destructor_test' in the process > that has first run 'do_test'. Consequently extra testing is invoked > from 'do_rseq_destructor_test' only once and in the correct process, > regardless of the use or the lack of of the '--direct' option. Where > called in the controlling test driver process that has neved called > 'do_test' the destructor function silently returns right away without > taking any further actions, letting the test driver fail gracefully > where applicable. > > This arrangement prevents 'tst-rseq-nptl' from ever causing testing to > hang forever and never complete, such as currently happening with the > 'mips-linux-gnu' (o32 ABI) target. > --- > Hi, > > In the course of verifying changes for the BZ #27650 test case, posted > separately, I have come across a couple of tests hanging forever that > cause `mips-linux-gnu' (o32 ABI) testing to never complete unless the > offending tests are killed by hand. > > This patch addresses this issue for the tst-rseq-nptl test. This was a > bit tricky to debug as I saw no immediate cause as to the hang *despite* > the test driver already being used by the test case. Eventually running > via `strace' revealed the actual problem: > > wait4(16905, 0x7fffa930, 0, NULL) = ? ERESTARTSYS (To be restarted) > --- SIGALRM (Alarm clock) --- > SYS_4366() = 0 > kill(-16905, SIGKILL) = 0 > kill(16905, SIGKILL) = 0 > wait4(16905, 0x7fffa5d0, WNOHANG|WSTOPPED, NULL) = 0 > SYS_4265() = -1 ERRNO_516 ((null)) > --- SIGCHLD (Child exited) --- > SYS_4253() = 0 > wait4(16905, [WIFSIGNALED(s) && WTERMSIG(s) == SIGKILL], WNOHANG|WSTOPPED, NULL) = 16905 > write(1, "Timed out: killed the child proc"..., 35) = 35 > write(1, "\n", 1) = 1 > brk(0) = 0x55591000 > brk(0x555b2000) = 0x555b2000 > SYS_4288() = 3 > SYS_4366() = 0 > SYS_4366() = 0 > read(3, "TZif\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\7\0\0\0\7\0"..., 4096) = 1323 > close(3) = 0 > write(1, "Termination time: 2024-07-06T03:"..., 48) = 48 > SYS_4367() = -1 EINVAL (Invalid argument) > rt_sigaction(SIGUSR1, {SIG_DFL}, NULL, 16) = 0 > SYS_4222() = 16904 > getpid() = 16904 > SYS_4266() = 0 > --- SIGUSR1 (User defined signal 1) --- > sigreturn() = 0 > > This was obtained with a slightly outdated `strace' binary, hence the > missing decoding of some syscalls, however output produced is clear: the > test child does get killed, the test driver reports what it is supposed to > in this case, but then, hold on, what's up with that `rt_sigaction' call? > OK, that pointed me at the right place, that is `do_rseq_destructor_test' > having been marked with `__attribute__ ((destructor))'. > > Being a destructor the function is obviously implicitly called at program > completion, and then by both the test child *and* the test driver process > itself. Hence the call to `do_rseq_test' that hangs also causes the test > driver to hang forever, preventing testing from ever completing. > > OK to apply (either now or past the freeze period)? > > Maciej LGTM, thanks. Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> > --- > sysdeps/unix/sysv/linux/tst-rseq-nptl.c | 9 +++++++++ > 1 file changed, 9 insertions(+) > > glibc-tst-rseq-nptl-destructor.diff > Index: glibc/sysdeps/unix/sysv/linux/tst-rseq-nptl.c > =================================================================== > --- glibc.orig/sysdeps/unix/sysv/linux/tst-rseq-nptl.c > +++ glibc/sysdeps/unix/sysv/linux/tst-rseq-nptl.c > @@ -28,6 +28,11 @@ > #include <sys/rseq.h> > #include <unistd.h> > > +/* Set this in 'do_test' only so as to invoke the destructor test in > + the test process only and not 'support_test_main' parent. Otherwise > + the test harness may hang in the destructor if something goes wrong. */ > +static int run_destructor_test; > + > #ifdef RSEQ_SIG > # include <array_length.h> > # include <errno.h> > @@ -236,6 +241,9 @@ do_rseq_test (void) > static void __attribute__ ((destructor)) > do_rseq_destructor_test (void) > { > + if (!run_destructor_test) > + return; > + > /* Cannot use deferred failure reporting after main returns. */ > if (do_rseq_test ()) > FAIL_EXIT1 ("rseq not registered within destructor"); > @@ -254,6 +262,7 @@ do_rseq_test (void) > static int > do_test (void) > { > + run_destructor_test = 1; > return do_rseq_test (); > } > >
On Wed, 14 Aug 2024, Adhemerval Zanella Netto wrote: > LGTM, thanks. > > Reviewed-by: Adhemerval Zanella <adhemerval.zanella@linaro.org> Applied now, thank you for your review. Maciej
Index: glibc/sysdeps/unix/sysv/linux/tst-rseq-nptl.c =================================================================== --- glibc.orig/sysdeps/unix/sysv/linux/tst-rseq-nptl.c +++ glibc/sysdeps/unix/sysv/linux/tst-rseq-nptl.c @@ -28,6 +28,11 @@ #include <sys/rseq.h> #include <unistd.h> +/* Set this in 'do_test' only so as to invoke the destructor test in + the test process only and not 'support_test_main' parent. Otherwise + the test harness may hang in the destructor if something goes wrong. */ +static int run_destructor_test; + #ifdef RSEQ_SIG # include <array_length.h> # include <errno.h> @@ -236,6 +241,9 @@ do_rseq_test (void) static void __attribute__ ((destructor)) do_rseq_destructor_test (void) { + if (!run_destructor_test) + return; + /* Cannot use deferred failure reporting after main returns. */ if (do_rseq_test ()) FAIL_EXIT1 ("rseq not registered within destructor"); @@ -254,6 +262,7 @@ do_rseq_test (void) static int do_test (void) { + run_destructor_test = 1; return do_rseq_test (); }