diff mbox series

tests/qtest/netdev-socket: Raise connection timeout to 120 seconds

Message ID 20230912133310.60583-1-stefanha@redhat.com
State New
Headers show
Series tests/qtest/netdev-socket: Raise connection timeout to 120 seconds | expand

Commit Message

Stefan Hajnoczi Sept. 12, 2023, 1:33 p.m. UTC
The test still fails intermittently with a 60 second timeout in the
GitLab CI environment. Raise the timeout to 120 seconds.

  576/839 ERROR:../tests/qtest/netdev-socket.c:293:test_stream_unix: assertion failed (resp == expect): ("st0: index=0,type=stream,connection error\r\n" == "st0: index=0,type=stream,unix:/tmp/netdev-socket.UW5IA2/stream_unix\r\n") ERROR
  576/839 qemu:qtest+qtest-sh4 / qtest-sh4/netdev-socket                            ERROR          62.85s   killed by signal 6 SIGABRT
  >>> MALLOC_PERTURB_=249 QTEST_QEMU_BINARY=./qemu-system-sh4 QTEST_QEMU_STORAGE_DAEMON_BINARY=./storage-daemon/qemu-storage-daemon G_TEST_DBUS_DAEMON=/home/gitlab-runner/builds/-LCfcJ2T/0/qemu-project/qemu/tests/dbus-vmstate-daemon.sh QTEST_QEMU_IMG=./qemu-img /home/gitlab-runner/builds/-LCfcJ2T/0/qemu-project/qemu/build/tests/qtest/netdev-socket --tap -k
  ――――――――――――――――――――――――――――――――――――― ✀  ―――――――――――――――――――――――――――――――――――――
  stderr:
  **
  ERROR:../tests/qtest/netdev-socket.c:293:test_stream_unix: assertion failed (resp == expect): ("st0: index=0,type=stream,connection error\r\n" == "st0: index=0,type=stream,unix:/tmp/netdev-socket.UW5IA2/stream_unix\r\n")
  (test program exited with status code -6)

Buglink: https://gitlab.com/qemu-project/qemu/-/issues/1881
Fixes: 417296c8d858 ("tests/qtest/netdev-socket: Raise connection timeout to 60 seconds")
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 tests/qtest/netdev-socket.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Daniel P. Berrangé Sept. 12, 2023, 1:42 p.m. UTC | #1
On Tue, Sep 12, 2023 at 09:33:10AM -0400, Stefan Hajnoczi wrote:
> The test still fails intermittently with a 60 second timeout in the
> GitLab CI environment. Raise the timeout to 120 seconds.
> 
>   576/839 ERROR:../tests/qtest/netdev-socket.c:293:test_stream_unix: assertion failed (resp == expect): ("st0: index=0,type=stream,connection error\r\n" == "st0: index=0,type=stream,unix:/tmp/netdev-socket.UW5IA2/stream_unix\r\n") ERROR
>   576/839 qemu:qtest+qtest-sh4 / qtest-sh4/netdev-socket                            ERROR          62.85s   killed by signal 6 SIGABRT
>   >>> MALLOC_PERTURB_=249 QTEST_QEMU_BINARY=./qemu-system-sh4 QTEST_QEMU_STORAGE_DAEMON_BINARY=./storage-daemon/qemu-storage-daemon G_TEST_DBUS_DAEMON=/home/gitlab-runner/builds/-LCfcJ2T/0/qemu-project/qemu/tests/dbus-vmstate-daemon.sh QTEST_QEMU_IMG=./qemu-img /home/gitlab-runner/builds/-LCfcJ2T/0/qemu-project/qemu/build/tests/qtest/netdev-socket --tap -k
>   ――――――――――――――――――――――――――――――――――――― ✀  ―――――――――――――――――――――――――――――――――――――
>   stderr:
>   **
>   ERROR:../tests/qtest/netdev-socket.c:293:test_stream_unix: assertion failed (resp == expect): ("st0: index=0,type=stream,connection error\r\n" == "st0: index=0,type=stream,unix:/tmp/netdev-socket.UW5IA2/stream_unix\r\n")
>   (test program exited with status code -6)
> 
> Buglink: https://gitlab.com/qemu-project/qemu/-/issues/1881
> Fixes: 417296c8d858 ("tests/qtest/netdev-socket: Raise connection timeout to 60 seconds")
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

That bumped the timeout from 5 seconds to 60 seconds to
cope with intermittent failures, which was a x12
increases. I'm concerned that it would still be failing
in largely the same way after that, and possibly we are
instead hitting a race condition causing setup to fail,
which masquerades as a timeout.

> ---
>  tests/qtest/netdev-socket.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/tests/qtest/netdev-socket.c b/tests/qtest/netdev-socket.c
> index 8eed54801f..b2501d72a1 100644
> --- a/tests/qtest/netdev-socket.c
> +++ b/tests/qtest/netdev-socket.c
> @@ -16,7 +16,7 @@
>  #include "qapi/qobject-input-visitor.h"
>  #include "qapi/qapi-visit-sockets.h"
>  
> -#define CONNECTION_TIMEOUT    60
> +#define CONNECTION_TIMEOUT    120
>  
>  #define EXPECT_STATE(q, e, t)                             \
>  do {                                                      \

I'll add

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>

but with the caveat that i'm only 50/50 on whether this is actually
the right fix. Doesn't hurt to try it, but if 120 seconds still shows
failures I'd say we're hitting a functional race not a timeout.

With regards,
Daniel
Laurent Vivier Sept. 12, 2023, 2:06 p.m. UTC | #2
On 9/12/23 15:42, Daniel P. Berrangé wrote:
> On Tue, Sep 12, 2023 at 09:33:10AM -0400, Stefan Hajnoczi wrote:
>> The test still fails intermittently with a 60 second timeout in the
>> GitLab CI environment. Raise the timeout to 120 seconds.
>>
>>    576/839 ERROR:../tests/qtest/netdev-socket.c:293:test_stream_unix: assertion failed (resp == expect): ("st0: index=0,type=stream,connection error\r\n" == "st0: index=0,type=stream,unix:/tmp/netdev-socket.UW5IA2/stream_unix\r\n") ERROR
>>    576/839 qemu:qtest+qtest-sh4 / qtest-sh4/netdev-socket                            ERROR          62.85s   killed by signal 6 SIGABRT
>>    >>> MALLOC_PERTURB_=249 QTEST_QEMU_BINARY=./qemu-system-sh4 QTEST_QEMU_STORAGE_DAEMON_BINARY=./storage-daemon/qemu-storage-daemon G_TEST_DBUS_DAEMON=/home/gitlab-runner/builds/-LCfcJ2T/0/qemu-project/qemu/tests/dbus-vmstate-daemon.sh QTEST_QEMU_IMG=./qemu-img /home/gitlab-runner/builds/-LCfcJ2T/0/qemu-project/qemu/build/tests/qtest/netdev-socket --tap -k
>>    ――――――――――――――――――――――――――――――――――――― ✀  ―――――――――――――――――――――――――――――――――――――
>>    stderr:
>>    **
>>    ERROR:../tests/qtest/netdev-socket.c:293:test_stream_unix: assertion failed (resp == expect): ("st0: index=0,type=stream,connection error\r\n" == "st0: index=0,type=stream,unix:/tmp/netdev-socket.UW5IA2/stream_unix\r\n")
>>    (test program exited with status code -6)
>>
>> Buglink: https://gitlab.com/qemu-project/qemu/-/issues/1881
>> Fixes: 417296c8d858 ("tests/qtest/netdev-socket: Raise connection timeout to 60 seconds")
>> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> 
> That bumped the timeout from 5 seconds to 60 seconds to
> cope with intermittent failures, which was a x12
> increases. I'm concerned that it would still be failing
> in largely the same way after that, and possibly we are
> instead hitting a race condition causing setup to fail,
> which masquerades as a timeout.
> 
>> ---
>>   tests/qtest/netdev-socket.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/tests/qtest/netdev-socket.c b/tests/qtest/netdev-socket.c
>> index 8eed54801f..b2501d72a1 100644
>> --- a/tests/qtest/netdev-socket.c
>> +++ b/tests/qtest/netdev-socket.c
>> @@ -16,7 +16,7 @@
>>   #include "qapi/qobject-input-visitor.h"
>>   #include "qapi/qapi-visit-sockets.h"
>>   
>> -#define CONNECTION_TIMEOUT    60
>> +#define CONNECTION_TIMEOUT    120
>>   
>>   #define EXPECT_STATE(q, e, t)                             \
>>   do {                                                      \
> 
> I'll add
> 
> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
> 
> but with the caveat that i'm only 50/50 on whether this is actually
> the right fix. Doesn't hurt to try it, but if 120 seconds still shows
> failures I'd say we're hitting a functional race not a timeout.

It can also happen if the first QEMU (server) crashes. Do we have some traces from this side?

Reviewed-by: Laurent Vivier <lvivier@redhat.com>

Thanks,
Laurent
Stefan Hajnoczi Sept. 12, 2023, 2:16 p.m. UTC | #3
Here is a log from the CI, but I don't think it has much information:
https://gitlab.com/qemu-project/qemu/-/jobs/5020899550

Is it possible to detect the crash? Timeouts are hard to diagnose, so
it would be better for the test to detect a terminated child process
and print an error.

Stefan

On Tue, 12 Sept 2023 at 10:08, Laurent Vivier <lvivier@redhat.com> wrote:
>
> On 9/12/23 15:42, Daniel P. Berrangé wrote:
> > On Tue, Sep 12, 2023 at 09:33:10AM -0400, Stefan Hajnoczi wrote:
> >> The test still fails intermittently with a 60 second timeout in the
> >> GitLab CI environment. Raise the timeout to 120 seconds.
> >>
> >>    576/839 ERROR:../tests/qtest/netdev-socket.c:293:test_stream_unix: assertion failed (resp == expect): ("st0: index=0,type=stream,connection error\r\n" == "st0: index=0,type=stream,unix:/tmp/netdev-socket.UW5IA2/stream_unix\r\n") ERROR
> >>    576/839 qemu:qtest+qtest-sh4 / qtest-sh4/netdev-socket                            ERROR          62.85s   killed by signal 6 SIGABRT
> >>    >>> MALLOC_PERTURB_=249 QTEST_QEMU_BINARY=./qemu-system-sh4 QTEST_QEMU_STORAGE_DAEMON_BINARY=./storage-daemon/qemu-storage-daemon G_TEST_DBUS_DAEMON=/home/gitlab-runner/builds/-LCfcJ2T/0/qemu-project/qemu/tests/dbus-vmstate-daemon.sh QTEST_QEMU_IMG=./qemu-img /home/gitlab-runner/builds/-LCfcJ2T/0/qemu-project/qemu/build/tests/qtest/netdev-socket --tap -k
> >>    ――――――――――――――――――――――――――――――――――――― ✀  ―――――――――――――――――――――――――――――――――――――
> >>    stderr:
> >>    **
> >>    ERROR:../tests/qtest/netdev-socket.c:293:test_stream_unix: assertion failed (resp == expect): ("st0: index=0,type=stream,connection error\r\n" == "st0: index=0,type=stream,unix:/tmp/netdev-socket.UW5IA2/stream_unix\r\n")
> >>    (test program exited with status code -6)
> >>
> >> Buglink: https://gitlab.com/qemu-project/qemu/-/issues/1881
> >> Fixes: 417296c8d858 ("tests/qtest/netdev-socket: Raise connection timeout to 60 seconds")
> >> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> >
> > That bumped the timeout from 5 seconds to 60 seconds to
> > cope with intermittent failures, which was a x12
> > increases. I'm concerned that it would still be failing
> > in largely the same way after that, and possibly we are
> > instead hitting a race condition causing setup to fail,
> > which masquerades as a timeout.
> >
> >> ---
> >>   tests/qtest/netdev-socket.c | 2 +-
> >>   1 file changed, 1 insertion(+), 1 deletion(-)
> >>
> >> diff --git a/tests/qtest/netdev-socket.c b/tests/qtest/netdev-socket.c
> >> index 8eed54801f..b2501d72a1 100644
> >> --- a/tests/qtest/netdev-socket.c
> >> +++ b/tests/qtest/netdev-socket.c
> >> @@ -16,7 +16,7 @@
> >>   #include "qapi/qobject-input-visitor.h"
> >>   #include "qapi/qapi-visit-sockets.h"
> >>
> >> -#define CONNECTION_TIMEOUT    60
> >> +#define CONNECTION_TIMEOUT    120
> >>
> >>   #define EXPECT_STATE(q, e, t)                             \
> >>   do {                                                      \
> >
> > I'll add
> >
> > Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
> >
> > but with the caveat that i'm only 50/50 on whether this is actually
> > the right fix. Doesn't hurt to try it, but if 120 seconds still shows
> > failures I'd say we're hitting a functional race not a timeout.
>
> It can also happen if the first QEMU (server) crashes. Do we have some traces from this side?
>
> Reviewed-by: Laurent Vivier <lvivier@redhat.com>
>
> Thanks,
> Laurent
>
>
diff mbox series

Patch

diff --git a/tests/qtest/netdev-socket.c b/tests/qtest/netdev-socket.c
index 8eed54801f..b2501d72a1 100644
--- a/tests/qtest/netdev-socket.c
+++ b/tests/qtest/netdev-socket.c
@@ -16,7 +16,7 @@ 
 #include "qapi/qobject-input-visitor.h"
 #include "qapi/qapi-visit-sockets.h"
 
-#define CONNECTION_TIMEOUT    60
+#define CONNECTION_TIMEOUT    120
 
 #define EXPECT_STATE(q, e, t)                             \
 do {                                                      \