
[0/2] nptl: Fix termination issues with tst-cancel7

Message ID alpine.DEB.2.21.2408051401320.61955@angie.orcam.me.uk

Message

Maciej W. Rozycki Aug. 5, 2024, 2:22 p.m. UTC
Hi,

 I have observed remote `alpha-linux-gnu' testing to hang forever with 
tst-cancel7 and tst-cancelx7 (which uses the same code) and noticed that 
there is a stray process left that SSH continues waiting for to terminate 
after the test driver has exited, and which has to be killed by hand for 
testing to continue.

 This can be easily reproduced in an otherwise correctly working setup, by 
placing:

  abort ();

just above the call to `xpthread_cancel' in `do_test' and then running 
tst-cancel7 remotely with any relevant target.
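 A minimal sketch of why such a stray process can stall a remote run, 
assuming the hang comes from the leftover process still holding a 
descriptor it inherited from the test driver (this message does not spell 
out the mechanism).  The function name `stray_child_keeps_pipe_open' and 
the pipe standing in for the SSH connection are hypothetical and are not 
part of glibc or of this series:

```c
#include <poll.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical illustration, not taken from tst-cancel7.
   Returns 1 if the pipe is still open after the driver has died
   (a stray child holds the write end), 0 if end-of-file was seen,
   and -1 on error.  */
int
stray_child_keeps_pipe_open (int timeout_ms)
{
  int fds[2];
  if (pipe (fds) != 0)
    return -1;

  pid_t driver = fork ();
  if (driver < 0)
    return -1;
  if (driver == 0)
    {
      /* Driver: start a helper that inherits the write end.  */
      close (fds[0]);
      pid_t helper = fork ();
      if (helper == 0)
        for (;;)
          pause ();
      /* Report the helper's PID so the observer can clean up.  */
      if (write (fds[1], &helper, sizeof helper) != sizeof helper)
        _exit (2);
      /* Avoid leaving a core file behind, then die abnormally.  */
      struct rlimit no_core = { 0, 0 };
      setrlimit (RLIMIT_CORE, &no_core);
      abort ();
    }

  /* Observer: plays the role of the remote shell here.  */
  close (fds[1]);
  pid_t helper = -1;
  if (waitpid (driver, NULL, 0) != driver
      || read (fds[0], &helper, sizeof helper) != sizeof helper)
    {
      close (fds[0]);
      return -1;
    }

  /* The driver is gone.  A timeout here means no hang-up was
     delivered, because the helper still has the pipe open.  */
  struct pollfd pfd = { .fd = fds[0], .events = POLLIN };
  int ready = poll (&pfd, 1, timeout_ms);

  if (helper > 0)
    kill (helper, SIGKILL);
  close (fds[0]);
  return ready < 0 ? -1 : ready == 0;
}
```

 Calling `stray_child_keeps_pipe_open (200)' is expected to return 1 on a 
typical Linux system: the driver has already been reaped, yet the reader 
sees no end-of-file until the helper is killed.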

 In the course of developing a fix I have encountered a locking fault in 
the test case, so I have made this small patch series to address the fault 
in 1/2 and the hang issue in 2/2.

 This has been verified with the `powerpc64le-linux-gnu' (IBM POWER9) 
native target, and then with the same host and the `riscv64-linux-gnu' 
(SiFive FU740), `mips-linux-gnu' (o32 ABI) (MIPS 74Kf), and 
`alpha-linux-gnu' (DEC 21064A) targets.  The change removes the hang when 
tst-cancel7 and tst-cancelx7 fail with the `alpha-linux-gnu' target and 
preserves the graceful failures of these tests with the `mips-linux-gnu' 
target.  Both tests succeed with and without this change applied with the 
`powerpc64le-linux-gnu' and `riscv64-linux-gnu' targets.

 OK to apply?

  Maciej

Comments

Maciej W. Rozycki Aug. 5, 2024, 2:48 p.m. UTC | #1
On Mon, 5 Aug 2024, Maciej W. Rozycki wrote:

>  I have observed remote `alpha-linux-gnu' testing to hang forever with 
> tst-cancel7 and tst-cancelx7 (which uses the same code) and noticed that 
> there is a stray process left that SSH continues waiting for to terminate 
> after the test driver has exited, and which has to be killed by hand for 
> testing to continue.

 I note that I have seen this hang before in `riscv64-linux-gnu' testing: 
<https://sourceware.org/pipermail/libc-alpha/2019-July/105438.html>, so it 
seems like a problem that shows up from time to time if an issue causes 
these test cases not to run as intended.  Regardless, such a glitch must 
not prevent forward progress as a hung test run is about the worst that 
can happen.

  Maciej