| Message ID | 1361356113-11049-7-git-send-email-stefanha@redhat.com |
| --- | --- |
| State | New |
On 2013-02-20 11:28, Stefan Hajnoczi wrote:
> Convert iohandler_select_fill() and iohandler_select_poll() to use
> GPollFD instead of rfds/wfds/xfds.

Since this commit, I'm getting QEMU lock-ups; apparently slirp is
involved (the Linux guest tries to start its network at this point):

(gdb) thread apply all bt

Thread 3 (Thread 0x7fffed0e3700 (LWP 26788)):
#0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:136
#1  0x00007ffff44c7294 in _L_lock_999 () from /lib64/libpthread.so.0
#2  0x00007ffff44c70aa in __pthread_mutex_lock (mutex=0x5555560afcc0) at pthread_mutex_lock.c:61
#3  0x00005555558945e9 in qemu_mutex_lock (mutex=<value optimized out>) at /data/qemu/util/qemu-thread-posix.c:57
#4  0x00005555557d9da5 in kvm_cpu_exec (env=0x55555689c9f0) at /data/qemu/kvm-all.c:1564
#5  0x0000555555780091 in qemu_kvm_cpu_thread_fn (arg=0x55555689c9f0) at /data/qemu/cpus.c:759
#6  0x00007ffff44c4a3f in start_thread (arg=0x7fffed0e3700) at pthread_create.c:297
#7  0x00007ffff2fb871d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#8  0x0000000000000000 in ?? ()

Thread 2 (Thread 0x7fffed8e4700 (LWP 26787)):
#0  sem_timedwait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_timedwait.S:103
#1  0x0000555555894a93 in qemu_sem_timedwait (sem=0x555556090020, ms=<value optimized out>) at /data/qemu/util/qemu-thread-posix.c:237
#2  0x000055555575116e in worker_thread (unused=<value optimized out>) at /data/qemu/thread-pool.c:88
#3  0x00007ffff44c4a3f in start_thread (arg=0x7fffed8e4700) at pthread_create.c:297
#4  0x00007ffff2fb871d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#5  0x0000000000000000 in ?? ()

Thread 1 (Thread 0x7ffff7fab760 (LWP 26784)):
#0  0x00007ffff44cc763 in recvfrom () at ../sysdeps/unix/syscall-template.S:82
#1  0x000055555574b67d in recvfrom (so=0x555556bd2f50) at /usr/include/bits/socket2.h:77
#2  sorecvfrom (so=0x555556bd2f50) at /data/qemu/slirp/socket.c:498
#3  0x000055555574a160 in slirp_pollfds_poll (pollfds=0x555556511240, select_error=0) at /data/qemu/slirp/slirp.c:619
#4  0x000055555570ec99 in main_loop_wait (nonblocking=<value optimized out>) at /data/qemu/main-loop.c:514
#5  0x000055555577821d in main_loop (argc=<value optimized out>, argv=<value optimized out>, envp=<value optimized out>) at /data/qemu/vl.c:2002
#6  main (argc=<value optimized out>, argv=<value optimized out>, envp=<value optimized out>) at /data/qemu/vl.c:4334

Thread 1 blocks in recvfrom(), never returning and never releasing the global lock.

Any idea?

Jan
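For context, the hazard Jan describes is a dispatch path issuing a blocking recvfrom() on a socket that was never actually reported readable, while holding the global mutex. Below is a minimal, self-contained sketch of the defensive pattern in plain POSIX C; it is not QEMU or slirp code, and handle_udp_fd() is an invented name. The idea: only read when revents says so, and pass MSG_DONTWAIT so a spurious dispatch returns EAGAIN instead of hanging the caller.

```c
#include <errno.h>
#include <poll.h>
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

/* Only touch the socket when poll() reported readiness; MSG_DONTWAIT turns
 * any stale readiness into EAGAIN instead of a blocking recvfrom(). */
static void handle_udp_fd(struct pollfd *pfd)
{
    char buf[2048];
    ssize_t n;

    if (!(pfd->revents & POLLIN)) {
        return;                         /* nothing reported, do not block */
    }

    n = recvfrom(pfd->fd, buf, sizeof(buf), MSG_DONTWAIT, NULL, NULL);
    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK)) {
        return;                         /* readiness was stale; retry later */
    }
    if (n >= 0) {
        printf("received %zd bytes\n", n);
    }
}

int main(void)
{
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    struct pollfd pfd = { .fd = fd, .events = POLLIN };

    if (fd < 0) {
        perror("socket");
        return 1;
    }
    /* One "main loop" iteration: wait up to 100 ms, then dispatch. */
    if (poll(&pfd, 1, 100) >= 0) {
        handle_udp_fd(&pfd);
    }
    close(fd);
    return 0;
}
```

Whether the slirp UDP path is missing such a guard after the GPollFD conversion is exactly what the backtrace suggests, but at this point in the thread it is an open question, not an established fact.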
On 02/22/13 18:33, Jan Kiszka wrote:
> On 2013-02-20 11:28, Stefan Hajnoczi wrote:
>> Convert iohandler_select_fill() and iohandler_select_poll() to use
>> GPollFD instead of rfds/wfds/xfds.
>
> Since this commit, I'm getting QEMU lock-ups; apparently slirp is
> involved (the Linux guest tries to start its network at this point):
>
> (gdb) thread apply all bt
>
> Thread 3 (Thread 0x7fffed0e3700 (LWP 26788)):
> #0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:136
> #1  0x00007ffff44c7294 in _L_lock_999 () from /lib64/libpthread.so.0
> #2  0x00007ffff44c70aa in __pthread_mutex_lock (mutex=0x5555560afcc0) at pthread_mutex_lock.c:61
> #3  0x00005555558945e9 in qemu_mutex_lock (mutex=<value optimized out>) at /data/qemu/util/qemu-thread-posix.c:57
> #4  0x00005555557d9da5 in kvm_cpu_exec (env=0x55555689c9f0) at /data/qemu/kvm-all.c:1564
> #5  0x0000555555780091 in qemu_kvm_cpu_thread_fn (arg=0x55555689c9f0) at /data/qemu/cpus.c:759
> #6  0x00007ffff44c4a3f in start_thread (arg=0x7fffed0e3700) at pthread_create.c:297
> #7  0x00007ffff2fb871d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
> #8  0x0000000000000000 in ?? ()
>
> Thread 2 (Thread 0x7fffed8e4700 (LWP 26787)):
> #0  sem_timedwait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/sem_timedwait.S:103
> #1  0x0000555555894a93 in qemu_sem_timedwait (sem=0x555556090020, ms=<value optimized out>) at /data/qemu/util/qemu-thread-posix.c:237
> #2  0x000055555575116e in worker_thread (unused=<value optimized out>) at /data/qemu/thread-pool.c:88
> #3  0x00007ffff44c4a3f in start_thread (arg=0x7fffed8e4700) at pthread_create.c:297
> #4  0x00007ffff2fb871d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
> #5  0x0000000000000000 in ?? ()
>
> Thread 1 (Thread 0x7ffff7fab760 (LWP 26784)):
> #0  0x00007ffff44cc763 in recvfrom () at ../sysdeps/unix/syscall-template.S:82
> #1  0x000055555574b67d in recvfrom (so=0x555556bd2f50) at /usr/include/bits/socket2.h:77
> #2  sorecvfrom (so=0x555556bd2f50) at /data/qemu/slirp/socket.c:498
> #3  0x000055555574a160 in slirp_pollfds_poll (pollfds=0x555556511240, select_error=0) at /data/qemu/slirp/slirp.c:619
> #4  0x000055555570ec99 in main_loop_wait (nonblocking=<value optimized out>) at /data/qemu/main-loop.c:514
> #5  0x000055555577821d in main_loop (argc=<value optimized out>, argv=<value optimized out>, envp=<value optimized out>) at /data/qemu/vl.c:2002
> #6  main (argc=<value optimized out>, argv=<value optimized out>, envp=<value optimized out>) at /data/qemu/vl.c:4334
>
> Thread 1 blocks in recvfrom(), never returning and never releasing the global lock.
>
> Any idea?

Well, I guess recvfrom() shouldn't block in slirp / thread 1, so maybe
slirp_pollfds_poll() finds readiness where it shouldn't -- we possibly
shouldn't even be calling sorecvfrom(). sorecvfrom() belongs to the UDP
branch in slirp_pollfds_poll().

Could this be related to the change we discussed in
<http://thread.gmane.org/gmane.comp.emulators.qemu/192801/focus=193181>?
I guess trace calls would be handy...

FWIW, I find it interesting that slirp doesn't hang after patch 05/10
(which is the slirp conversion) but here. This patch (06/10) converts
the qemu iohandlers. It looks as if qemu_iohandler_poll(), called just
before slirp_pollfds_poll() in main_loop_wait(), "stole" data from
slirp. A mixup between file descriptors?

Laszlo
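To make the "mixup between file descriptors" hypothesis concrete, here is a hypothetical sketch using glib directly; it is not QEMU code, and Handler, fill() and poll_revents() are invented names. It shows the class of bug Laszlo is describing: two subsystems share one GPollFD array and remember indices into it, so a stale or wrong index silently hands one subsystem the revents that belong to the other's descriptor.

```c
#include <glib.h>

typedef struct {
    int fd;
    int pollfds_idx;    /* index recorded at fill time, -1 if unregistered */
} Handler;

/* Register interest and remember where this handler's GPollFD landed. */
static void fill(GArray *pollfds, Handler *h, int events)
{
    GPollFD pfd = { .fd = h->fd, .events = (gushort)events };

    h->pollfds_idx = pollfds->len;
    g_array_append_val(pollfds, pfd);
}

/* Dispatch by index.  If pollfds_idx is stale (recorded during an earlier
 * fill round, or off by one), this returns another descriptor's revents,
 * and the caller may then block on an fd that was never ready. */
static int poll_revents(GArray *pollfds, Handler *h)
{
    if (h->pollfds_idx < 0 || (guint)h->pollfds_idx >= pollfds->len) {
        return 0;
    }
    return g_array_index(pollfds, GPollFD, h->pollfds_idx).revents;
}

int main(void)
{
    GArray *pollfds = g_array_new(FALSE, FALSE, sizeof(GPollFD));
    Handler slirp_sock = { .fd = 0, .pollfds_idx = -1 };   /* stand-in fds */
    Handler iohandler  = { .fd = 1, .pollfds_idx = -1 };

    fill(pollfds, &slirp_sock, G_IO_IN);
    fill(pollfds, &iohandler, G_IO_IN | G_IO_OUT);

    g_poll((GPollFD *)pollfds->data, pollfds->len, 0);

    /* Correct dispatch: each handler consumes only its own slot. */
    g_print("slirp revents=0x%x, iohandler revents=0x%x\n",
            poll_revents(pollfds, &slirp_sock),
            poll_revents(pollfds, &iohandler));

    g_array_free(pollfds, TRUE);
    return 0;
}
```

This only illustrates the suspected failure mode; the thread does not establish that stale or crossed indices are the actual root cause of the hang.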
```diff
diff --git a/include/qemu/main-loop.h b/include/qemu/main-loop.h
index e8059c3..0995288 100644
--- a/include/qemu/main-loop.h
+++ b/include/qemu/main-loop.h
@@ -297,8 +297,8 @@ void qemu_mutex_unlock_iothread(void);
 /* internal interfaces */
 
 void qemu_fd_register(int fd);
-void qemu_iohandler_fill(int *pnfds, fd_set *readfds, fd_set *writefds, fd_set *xfds);
-void qemu_iohandler_poll(fd_set *readfds, fd_set *writefds, fd_set *xfds, int rc);
+void qemu_iohandler_fill(GArray *pollfds);
+void qemu_iohandler_poll(GArray *pollfds, int rc);
 
 QEMUBH *qemu_bh_new(QEMUBHFunc *cb, void *opaque);
 void qemu_bh_schedule_idle(QEMUBH *bh);
diff --git a/iohandler.c b/iohandler.c
index 2523adc..ae2ef8f 100644
--- a/iohandler.c
+++ b/iohandler.c
@@ -39,6 +39,7 @@ typedef struct IOHandlerRecord {
     void *opaque;
     QLIST_ENTRY(IOHandlerRecord) next;
     int fd;
+    int pollfds_idx;
     bool deleted;
 } IOHandlerRecord;
 
@@ -78,6 +79,7 @@ int qemu_set_fd_handler2(int fd,
         ioh->fd_read = fd_read;
         ioh->fd_write = fd_write;
         ioh->opaque = opaque;
+        ioh->pollfds_idx = -1;
         ioh->deleted = 0;
         qemu_notify_event();
     }
@@ -92,38 +94,56 @@ int qemu_set_fd_handler(int fd,
     return qemu_set_fd_handler2(fd, NULL, fd_read, fd_write, opaque);
 }
 
-void qemu_iohandler_fill(int *pnfds, fd_set *readfds, fd_set *writefds, fd_set *xfds)
+void qemu_iohandler_fill(GArray *pollfds)
 {
     IOHandlerRecord *ioh;
 
     QLIST_FOREACH(ioh, &io_handlers, next) {
+        int events = 0;
+
         if (ioh->deleted)
             continue;
         if (ioh->fd_read &&
             (!ioh->fd_read_poll ||
              ioh->fd_read_poll(ioh->opaque) != 0)) {
-            FD_SET(ioh->fd, readfds);
-            if (ioh->fd > *pnfds)
-                *pnfds = ioh->fd;
+            events |= G_IO_IN | G_IO_HUP | G_IO_ERR;
         }
         if (ioh->fd_write) {
-            FD_SET(ioh->fd, writefds);
-            if (ioh->fd > *pnfds)
-                *pnfds = ioh->fd;
+            events |= G_IO_OUT | G_IO_ERR;
+        }
+        if (events) {
+            GPollFD pfd = {
+                .fd = ioh->fd,
+                .events = events,
+            };
+            ioh->pollfds_idx = pollfds->len;
+            g_array_append_val(pollfds, pfd);
+        } else {
+            ioh->pollfds_idx = -1;
         }
     }
 }
 
-void qemu_iohandler_poll(fd_set *readfds, fd_set *writefds, fd_set *xfds, int ret)
+void qemu_iohandler_poll(GArray *pollfds, int ret)
 {
     if (ret > 0) {
         IOHandlerRecord *pioh, *ioh;
 
         QLIST_FOREACH_SAFE(ioh, &io_handlers, next, pioh) {
-            if (!ioh->deleted && ioh->fd_read && FD_ISSET(ioh->fd, readfds)) {
+            int revents = 0;
+
+            if (!ioh->deleted && ioh->pollfds_idx != -1) {
+                GPollFD *pfd = &g_array_index(pollfds, GPollFD,
+                                              ioh->pollfds_idx);
+                revents = pfd->revents;
+            }
+
+            if (!ioh->deleted && ioh->fd_read &&
+                (revents & (G_IO_IN | G_IO_HUP | G_IO_ERR))) {
                 ioh->fd_read(ioh->opaque);
             }
-            if (!ioh->deleted && ioh->fd_write && FD_ISSET(ioh->fd, writefds)) {
+            if (!ioh->deleted && ioh->fd_write &&
+                (revents & (G_IO_OUT | G_IO_ERR))) {
                 ioh->fd_write(ioh->opaque);
             }
diff --git a/main-loop.c b/main-loop.c
index 839e98f..800868a 100644
--- a/main-loop.c
+++ b/main-loop.c
@@ -507,9 +507,9 @@ int main_loop_wait(int nonblocking)
     slirp_update_timeout(&timeout);
     slirp_pollfds_fill(gpollfds);
 #endif
-    qemu_iohandler_fill(&nfds, &rfds, &wfds, &xfds);
+    qemu_iohandler_fill(gpollfds);
     ret = os_host_main_loop_wait(timeout);
-    qemu_iohandler_poll(&rfds, &wfds, &xfds, ret);
+    qemu_iohandler_poll(gpollfds, ret);
 #ifdef CONFIG_SLIRP
     slirp_pollfds_poll(gpollfds, (ret < 0));
 #endif
```
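As a reading aid for the patch above, the following standalone sketch shows the shape of the pattern the series moves to: each main-loop iteration empties the shared GPollFD array, each subsystem appends its descriptors and remembers the indices, g_poll() runs once, and dispatch then reads revents from the recorded slots. The names (run_one_iteration(), read_ready()) are invented and this is not the QEMU main loop, just an assumed minimal equivalent.

```c
#include <glib.h>
#include <unistd.h>

static GArray *gpollfds;
static int watched_fd;
static int watched_idx = -1;

/* Fill phase: append a GPollFD for the watched fd and record its index. */
static void fill(void)
{
    GPollFD pfd = { .fd = watched_fd, .events = G_IO_IN | G_IO_HUP | G_IO_ERR };

    watched_idx = gpollfds->len;
    g_array_append_val(gpollfds, pfd);
}

/* Poll phase: dispatch only from the slot recorded during this fill round. */
static gboolean read_ready(void)
{
    GPollFD *pfd;

    if (watched_idx < 0) {
        return FALSE;
    }
    pfd = &g_array_index(gpollfds, GPollFD, watched_idx);
    return (pfd->revents & (G_IO_IN | G_IO_HUP | G_IO_ERR)) != 0;
}

static void run_one_iteration(void)
{
    /* The array must be emptied before refilling; otherwise indices
     * recorded in earlier iterations point at the wrong entries. */
    g_array_set_size(gpollfds, 0);
    fill();
    g_poll((GPollFD *)gpollfds->data, gpollfds->len, 100 /* ms */);

    if (read_ready()) {
        char buf[256];
        ssize_t n = read(watched_fd, buf, sizeof(buf));
        g_print("read returned %ld bytes\n", (long)n);
    }
}

int main(void)
{
    int fds[2];

    gpollfds = g_array_new(FALSE, FALSE, sizeof(GPollFD));
    if (pipe(fds) < 0) {
        return 1;
    }
    watched_fd = fds[0];
    if (write(fds[1], "x", 1) < 0) {    /* make the read end ready */
        return 1;
    }

    run_one_iteration();

    g_array_free(gpollfds, TRUE);
    return 0;
}
```

The key invariant this illustrates is that recorded indices are only valid for the fill/poll round in which they were written; the emails above are debating whether that invariant, or the dispatch that relies on it, is being violated somewhere between the iohandler and slirp conversions.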
Convert iohandler_select_fill() and iohandler_select_poll() to use
GPollFD instead of rfds/wfds/xfds.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 include/qemu/main-loop.h |  4 ++--
 iohandler.c              | 40 ++++++++++++++++++++++++++++++----------
 main-loop.c              |  4 ++--
 3 files changed, 34 insertions(+), 14 deletions(-)