Message ID | 20230912133310.60583-1-stefanha@redhat.com |
---|---|
State | New |
Headers | show |
Series | tests/qtest/netdev-socket: Raise connection timeout to 120 seconds | expand |
On Tue, Sep 12, 2023 at 09:33:10AM -0400, Stefan Hajnoczi wrote: > The test still fails intermittently with a 60 second timeout in the > GitLab CI environment. Raise the timeout to 120 seconds. > > 576/839 ERROR:../tests/qtest/netdev-socket.c:293:test_stream_unix: assertion failed (resp == expect): ("st0: index=0,type=stream,connection error\r\n" == "st0: index=0,type=stream,unix:/tmp/netdev-socket.UW5IA2/stream_unix\r\n") ERROR > 576/839 qemu:qtest+qtest-sh4 / qtest-sh4/netdev-socket ERROR 62.85s killed by signal 6 SIGABRT > >>> MALLOC_PERTURB_=249 QTEST_QEMU_BINARY=./qemu-system-sh4 QTEST_QEMU_STORAGE_DAEMON_BINARY=./storage-daemon/qemu-storage-daemon G_TEST_DBUS_DAEMON=/home/gitlab-runner/builds/-LCfcJ2T/0/qemu-project/qemu/tests/dbus-vmstate-daemon.sh QTEST_QEMU_IMG=./qemu-img /home/gitlab-runner/builds/-LCfcJ2T/0/qemu-project/qemu/build/tests/qtest/netdev-socket --tap -k > ――――――――――――――――――――――――――――――――――――― ✀ ――――――――――――――――――――――――――――――――――――― > stderr: > ** > ERROR:../tests/qtest/netdev-socket.c:293:test_stream_unix: assertion failed (resp == expect): ("st0: index=0,type=stream,connection error\r\n" == "st0: index=0,type=stream,unix:/tmp/netdev-socket.UW5IA2/stream_unix\r\n") > (test program exited with status code -6) > > Buglink: https://gitlab.com/qemu-project/qemu/-/issues/1881 > Fixes: 417296c8d858 ("tests/qtest/netdev-socket: Raise connection timeout to 60 seconds") > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> That bumped the timeout from 5 seconds to 60 seconds to cope with intermittent failures, which was a x12 increases. I'm concerned that it would still be failing in largely the same way after that, and possibly we are instead hitting a race condition causing setup to fail, which masquerades as a timeout. > --- > tests/qtest/netdev-socket.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/tests/qtest/netdev-socket.c b/tests/qtest/netdev-socket.c > index 8eed54801f..b2501d72a1 100644 > --- a/tests/qtest/netdev-socket.c > +++ b/tests/qtest/netdev-socket.c > @@ -16,7 +16,7 @@ > #include "qapi/qobject-input-visitor.h" > #include "qapi/qapi-visit-sockets.h" > > -#define CONNECTION_TIMEOUT 60 > +#define CONNECTION_TIMEOUT 120 > > #define EXPECT_STATE(q, e, t) \ > do { \ I'll add Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> but with the caveat that i'm only 50/50 on whether this is actually the right fix. Doesn't hurt to try it, but if 120 seconds still shows failures I'd say we're hitting a functional race not a timeout. With regards, Daniel
On 9/12/23 15:42, Daniel P. Berrangé wrote: > On Tue, Sep 12, 2023 at 09:33:10AM -0400, Stefan Hajnoczi wrote: >> The test still fails intermittently with a 60 second timeout in the >> GitLab CI environment. Raise the timeout to 120 seconds. >> >> 576/839 ERROR:../tests/qtest/netdev-socket.c:293:test_stream_unix: assertion failed (resp == expect): ("st0: index=0,type=stream,connection error\r\n" == "st0: index=0,type=stream,unix:/tmp/netdev-socket.UW5IA2/stream_unix\r\n") ERROR >> 576/839 qemu:qtest+qtest-sh4 / qtest-sh4/netdev-socket ERROR 62.85s killed by signal 6 SIGABRT >> >>> MALLOC_PERTURB_=249 QTEST_QEMU_BINARY=./qemu-system-sh4 QTEST_QEMU_STORAGE_DAEMON_BINARY=./storage-daemon/qemu-storage-daemon G_TEST_DBUS_DAEMON=/home/gitlab-runner/builds/-LCfcJ2T/0/qemu-project/qemu/tests/dbus-vmstate-daemon.sh QTEST_QEMU_IMG=./qemu-img /home/gitlab-runner/builds/-LCfcJ2T/0/qemu-project/qemu/build/tests/qtest/netdev-socket --tap -k >> ――――――――――――――――――――――――――――――――――――― ✀ ――――――――――――――――――――――――――――――――――――― >> stderr: >> ** >> ERROR:../tests/qtest/netdev-socket.c:293:test_stream_unix: assertion failed (resp == expect): ("st0: index=0,type=stream,connection error\r\n" == "st0: index=0,type=stream,unix:/tmp/netdev-socket.UW5IA2/stream_unix\r\n") >> (test program exited with status code -6) >> >> Buglink: https://gitlab.com/qemu-project/qemu/-/issues/1881 >> Fixes: 417296c8d858 ("tests/qtest/netdev-socket: Raise connection timeout to 60 seconds") >> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> > > That bumped the timeout from 5 seconds to 60 seconds to > cope with intermittent failures, which was a x12 > increases. I'm concerned that it would still be failing > in largely the same way after that, and possibly we are > instead hitting a race condition causing setup to fail, > which masquerades as a timeout. > >> --- >> tests/qtest/netdev-socket.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/tests/qtest/netdev-socket.c b/tests/qtest/netdev-socket.c >> index 8eed54801f..b2501d72a1 100644 >> --- a/tests/qtest/netdev-socket.c >> +++ b/tests/qtest/netdev-socket.c >> @@ -16,7 +16,7 @@ >> #include "qapi/qobject-input-visitor.h" >> #include "qapi/qapi-visit-sockets.h" >> >> -#define CONNECTION_TIMEOUT 60 >> +#define CONNECTION_TIMEOUT 120 >> >> #define EXPECT_STATE(q, e, t) \ >> do { \ > > I'll add > > Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> > > but with the caveat that i'm only 50/50 on whether this is actually > the right fix. Doesn't hurt to try it, but if 120 seconds still shows > failures I'd say we're hitting a functional race not a timeout. It can also happen if the first QEMU (server) crashes. Do we have some traces from this side? Reviewed-by: Laurent Vivier <lvivier@redhat.com> Thanks, Laurent
Here is a log from the CI, but I don't think it has much information: https://gitlab.com/qemu-project/qemu/-/jobs/5020899550 Is it possible to detect the crash? Timeouts are hard to diagnose, so it would be better for the test to detect a terminated child process and print an error. Stefan On Tue, 12 Sept 2023 at 10:08, Laurent Vivier <lvivier@redhat.com> wrote: > > On 9/12/23 15:42, Daniel P. Berrangé wrote: > > On Tue, Sep 12, 2023 at 09:33:10AM -0400, Stefan Hajnoczi wrote: > >> The test still fails intermittently with a 60 second timeout in the > >> GitLab CI environment. Raise the timeout to 120 seconds. > >> > >> 576/839 ERROR:../tests/qtest/netdev-socket.c:293:test_stream_unix: assertion failed (resp == expect): ("st0: index=0,type=stream,connection error\r\n" == "st0: index=0,type=stream,unix:/tmp/netdev-socket.UW5IA2/stream_unix\r\n") ERROR > >> 576/839 qemu:qtest+qtest-sh4 / qtest-sh4/netdev-socket ERROR 62.85s killed by signal 6 SIGABRT > >> >>> MALLOC_PERTURB_=249 QTEST_QEMU_BINARY=./qemu-system-sh4 QTEST_QEMU_STORAGE_DAEMON_BINARY=./storage-daemon/qemu-storage-daemon G_TEST_DBUS_DAEMON=/home/gitlab-runner/builds/-LCfcJ2T/0/qemu-project/qemu/tests/dbus-vmstate-daemon.sh QTEST_QEMU_IMG=./qemu-img /home/gitlab-runner/builds/-LCfcJ2T/0/qemu-project/qemu/build/tests/qtest/netdev-socket --tap -k > >> ――――――――――――――――――――――――――――――――――――― ✀ ――――――――――――――――――――――――――――――――――――― > >> stderr: > >> ** > >> ERROR:../tests/qtest/netdev-socket.c:293:test_stream_unix: assertion failed (resp == expect): ("st0: index=0,type=stream,connection error\r\n" == "st0: index=0,type=stream,unix:/tmp/netdev-socket.UW5IA2/stream_unix\r\n") > >> (test program exited with status code -6) > >> > >> Buglink: https://gitlab.com/qemu-project/qemu/-/issues/1881 > >> Fixes: 417296c8d858 ("tests/qtest/netdev-socket: Raise connection timeout to 60 seconds") > >> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> > > > > That bumped the timeout from 5 seconds to 60 seconds to > > cope with intermittent failures, which was a x12 > > increases. I'm concerned that it would still be failing > > in largely the same way after that, and possibly we are > > instead hitting a race condition causing setup to fail, > > which masquerades as a timeout. > > > >> --- > >> tests/qtest/netdev-socket.c | 2 +- > >> 1 file changed, 1 insertion(+), 1 deletion(-) > >> > >> diff --git a/tests/qtest/netdev-socket.c b/tests/qtest/netdev-socket.c > >> index 8eed54801f..b2501d72a1 100644 > >> --- a/tests/qtest/netdev-socket.c > >> +++ b/tests/qtest/netdev-socket.c > >> @@ -16,7 +16,7 @@ > >> #include "qapi/qobject-input-visitor.h" > >> #include "qapi/qapi-visit-sockets.h" > >> > >> -#define CONNECTION_TIMEOUT 60 > >> +#define CONNECTION_TIMEOUT 120 > >> > >> #define EXPECT_STATE(q, e, t) \ > >> do { \ > > > > I'll add > > > > Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> > > > > but with the caveat that i'm only 50/50 on whether this is actually > > the right fix. Doesn't hurt to try it, but if 120 seconds still shows > > failures I'd say we're hitting a functional race not a timeout. > > It can also happen if the first QEMU (server) crashes. Do we have some traces from this side? > > Reviewed-by: Laurent Vivier <lvivier@redhat.com> > > Thanks, > Laurent > >
diff --git a/tests/qtest/netdev-socket.c b/tests/qtest/netdev-socket.c index 8eed54801f..b2501d72a1 100644 --- a/tests/qtest/netdev-socket.c +++ b/tests/qtest/netdev-socket.c @@ -16,7 +16,7 @@ #include "qapi/qobject-input-visitor.h" #include "qapi/qapi-visit-sockets.h" -#define CONNECTION_TIMEOUT 60 +#define CONNECTION_TIMEOUT 120 #define EXPECT_STATE(q, e, t) \ do { \