Message ID | 20091022205727.GA23092@amt.cnet |
---|---|
State | New |
On Thu, Oct 22, 2009 at 06:57:27PM -0200, Marcelo Tosatti wrote:
> On Thu, Oct 22, 2009 at 02:00:15PM +0200, Michael S. Tsirkin wrote:
> > Hi!
> > I'm sometimes getting segfaults when I kill qemu.
> > This time I caught it when qemu was under gdb:
> >
> >
> > Program received signal SIGSEGV, Segmentation fault.
> > [Switching to Thread 0x411d0940 (LWP 14446)]
> > 0x000000000040afb4 in qemu_mod_timer (ts=0x19f0fd0, expire_time=62275467335)
> >     at /home/mst/scm/qemu-kvm/vl.c:1009
> > 1009            if ((alarm_timer->flags & ALARM_FLAG_EXPIRED) == 0) {
> > (gdb) l
> > 1004        ts->next = *pt;
> > 1005        *pt = ts;
> > 1006
> > 1007        /* Rearm if necessary */
> > 1008        if (pt == &active_timers[ts->clock->type]) {
> > 1009            if ((alarm_timer->flags & ALARM_FLAG_EXPIRED) == 0) {
> > 1010                qemu_rearm_alarm_timer(alarm_timer);
> > 1011            }
> > 1012            /* Interrupt execution to force deadline recalculation. */
> > 1013            if (use_icount)
> > (gdb) p alarm_timer
> > $1 = (struct qemu_alarm_timer *) 0x0
> > (gdb) where
> > #0  0x000000000040afb4 in qemu_mod_timer (ts=0x19f0fd0, expire_time=62275467335)
> >     at /home/mst/scm/qemu-kvm/vl.c:1009
> > #1  0x000000000041aadf in virtio_net_handle_tx (vdev=<value optimized out>, vq=0x19f5af0)
> >     at /home/mst/scm/qemu-kvm/hw/virtio-net.c:696
> > #2  0x0000000000421669 in kvm_run (vcpu=0x19d46a0, env=0x19c2250) at /home/mst/scm/qemu-kvm/qemu-kvm.c:797
> > #3  0x00000000004216d6 in kvm_cpu_exec (env=0x83d0f8) at /home/mst/scm/qemu-kvm/qemu-kvm.c:1714
> > #4  0x0000000000422981 in ap_main_loop (_env=<value optimized out>) at /home/mst/scm/qemu-kvm/qemu-kvm.c:1969
> > #5  0x000000377dc06367 in start_thread () from /lib64/libpthread.so.0
> > #6  0x000000377d0d30ad in clone () from /lib64/libc.so.6
> > (gdb)
> >
> > So this probably means that we have already run quit_timers:
> >
> > static void quit_timers(void)
> > {
> >     alarm_timer->stop(alarm_timer);
> >     alarm_timer = NULL;
> > }
> >
> > but kvm vcpu thread is still running.
> >
> >
> > Not sure what the right fix is here: should we stop
> > kvm after main loop has exited?
>
> kvm_main_loop_wait(env, 0) can process the stop request (signalling
> iothread that vcpu is stopped, so it's OK to exit) and continue to
> kvm_cpu_exec.
>
> Can you please try this:

I applied this, and have not yet seen any segfaults at exit.
Not sure whether this means anything as the crash is not
100% reproducible. Push it out to Anthony and we'll see, long term?
Based on the knowledge of how to fix this,
how would you go about reproducing it?

> diff --git a/qemu-kvm.c b/qemu-kvm.c
> index 87ece3d..141c8b1 100644
> --- a/qemu-kvm.c
> +++ b/qemu-kvm.c
> @@ -1931,7 +1931,8 @@ static int kvm_main_loop_cpu(CPUState *env)
>          }
>          if (run_cpu) {
>              kvm_main_loop_wait(env, 0);
> -            kvm_cpu_exec(env);
> +            if (!is_cpu_stopped(env))
> +                kvm_cpu_exec(env);
>          } else {
>              kvm_main_loop_wait(env, 1000);
>          }

> On Thu, Oct 22, 2009 at 06:57:27PM -0200, Marcelo Tosatti wrote:
> > On Thu, Oct 22, 2009 at 02:00:15PM +0200, Michael S. Tsirkin wrote:
> > > Hi!
> > > I'm sometimes getting segfaults when I kill qemu.
> > > This time I caught it when qemu was under gdb:
> > >
> > >
> > > Program received signal SIGSEGV, Segmentation fault.
> > > [Switching to Thread 0x411d0940 (LWP 14446)]
> > > 0x000000000040afb4 in qemu_mod_timer (ts=0x19f0fd0, expire_time=62275467335)
> > >     at /home/mst/scm/qemu-kvm/vl.c:1009
> > > 1009            if ((alarm_timer->flags & ALARM_FLAG_EXPIRED) == 0) {
> > > (gdb) l
> > > 1004        ts->next = *pt;
> > > 1005        *pt = ts;
> > > 1006
> > > 1007        /* Rearm if necessary */
> > > 1008        if (pt == &active_timers[ts->clock->type]) {
> > > 1009            if ((alarm_timer->flags & ALARM_FLAG_EXPIRED) == 0) {
> > > 1010                qemu_rearm_alarm_timer(alarm_timer);
> > > 1011            }
> > > 1012            /* Interrupt execution to force deadline recalculation. */
> > > 1013            if (use_icount)
> > > (gdb) p alarm_timer
> > > $1 = (struct qemu_alarm_timer *) 0x0
> > > (gdb) where
> > > #0  0x000000000040afb4 in qemu_mod_timer (ts=0x19f0fd0, expire_time=62275467335)
> > >     at /home/mst/scm/qemu-kvm/vl.c:1009
> > > #1  0x000000000041aadf in virtio_net_handle_tx (vdev=<value optimized out>, vq=0x19f5af0)
> > >     at /home/mst/scm/qemu-kvm/hw/virtio-net.c:696
> > > #2  0x0000000000421669 in kvm_run (vcpu=0x19d46a0, env=0x19c2250) at /home/mst/scm/qemu-kvm/qemu-kvm.c:797
> > > #3  0x00000000004216d6 in kvm_cpu_exec (env=0x83d0f8) at /home/mst/scm/qemu-kvm/qemu-kvm.c:1714
> > > #4  0x0000000000422981 in ap_main_loop (_env=<value optimized out>) at /home/mst/scm/qemu-kvm/qemu-kvm.c:1969
> > > #5  0x000000377dc06367 in start_thread () from /lib64/libpthread.so.0
> > > #6  0x000000377d0d30ad in clone () from /lib64/libc.so.6
> > > (gdb)
> > >
> > > So this probably means that we have already run quit_timers:
> > >
> > > static void quit_timers(void)
> > > {
> > >     alarm_timer->stop(alarm_timer);
> > >     alarm_timer = NULL;
> > > }
> > >
> > > but kvm vcpu thread is still running.
> > >
> > >
> > > Not sure what the right fix is here: should we stop
> > > kvm after main loop has exited?
> >
> > kvm_main_loop_wait(env, 0) can process the stop request (signalling
> > iothread that vcpu is stopped, so it's OK to exit) and continue to
> > kvm_cpu_exec.
> >
> > Can you please try this:
>
> I applied this, and have not yet seen any segfaults at exit.
> Not sure whether this means anything as the crash is not
> 100% reproducible. Push it out to Anthony and we'll see, long term?
> Based on the knowledge of how to fix this,
> how would you go about reproducing it?

Add code to trigger the race manually, but I'm pretty sure that's it.

Thanks for testing.

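One way to read "add code to trigger the race manually" is to inject a handshake that forces the bad ordering every time, instead of waiting for the scheduler to produce it. The following is a minimal standalone sketch of that idea using POSIX threads, not qemu-kvm code: `race_state`, `vcpu_thread_model` and `force_exit_race` are hypothetical names, the `int *alarm_timer` field is only a stand-in for the real pointer, and the model records the NULL observation rather than dereferencing it. Build with `-pthread`.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model state; none of these names exist in qemu-kvm. */
struct race_state {
    pthread_mutex_t lock;
    pthread_cond_t cond;
    bool stop_requested;  /* I/O thread asked the vcpu to stop */
    bool stopped;         /* vcpu acknowledged the stop request */
    bool timers_gone;     /* I/O thread finished its timer teardown */
    bool guarded;         /* apply the "skip exec when stopped" check */
    int *alarm_timer;     /* stand-in for the global alarm_timer pointer */
    bool saw_null;        /* vcpu reached timer code after teardown */
};

static void *vcpu_thread_model(void *arg)
{
    struct race_state *s = arg;

    pthread_mutex_lock(&s->lock);
    /* Models the wait step: pick up and acknowledge the stop request. */
    while (!s->stop_requested)
        pthread_cond_wait(&s->cond, &s->lock);
    s->stopped = true;
    pthread_cond_broadcast(&s->cond);

    /* Injected delay: hold the vcpu here until teardown has happened. */
    while (!s->timers_gone)
        pthread_cond_wait(&s->cond, &s->lock);

    /* Models the exec step touching the timer; record instead of crash. */
    if (!s->guarded || !s->stopped)
        s->saw_null = (s->alarm_timer == NULL);
    pthread_mutex_unlock(&s->lock);
    return NULL;
}

/* Returns true if the vcpu model touched the timer after teardown. */
static bool force_exit_race(bool guarded)
{
    int timer_storage = 0;
    struct race_state s = { .guarded = guarded, .alarm_timer = &timer_storage };
    pthread_t vcpu;
    int rc;

    pthread_mutex_init(&s.lock, NULL);
    pthread_cond_init(&s.cond, NULL);
    rc = pthread_create(&vcpu, NULL, vcpu_thread_model, &s);
    assert(rc == 0);
    (void)rc;

    /* Models the I/O thread: request stop, wait for the ack, tear down. */
    pthread_mutex_lock(&s.lock);
    s.stop_requested = true;
    pthread_cond_broadcast(&s.cond);
    while (!s.stopped)
        pthread_cond_wait(&s.cond, &s.lock);
    s.alarm_timer = NULL;
    s.timers_gone = true;
    pthread_cond_broadcast(&s.cond);
    pthread_mutex_unlock(&s.lock);

    pthread_join(vcpu, NULL);
    pthread_cond_destroy(&s.cond);
    pthread_mutex_destroy(&s.lock);
    return s.saw_null;
}
```

Calling `force_exit_race(false)` models the unpatched loop and `force_exit_race(true)` models the loop with the stopped check; in the real tree the equivalent would presumably be a temporary delay between the wait and exec steps, which is an assumption rather than something stated in the thread.
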
On Mon, Oct 26, 2009 at 04:43:11PM -0200, Marcelo Tosatti wrote:
> > On Thu, Oct 22, 2009 at 06:57:27PM -0200, Marcelo Tosatti wrote:
> > > On Thu, Oct 22, 2009 at 02:00:15PM +0200, Michael S. Tsirkin wrote:
> > > > Hi!
> > > > I'm sometimes getting segfaults when I kill qemu.
> > > > This time I caught it when qemu was under gdb:
> > > >
> > > >
> > > > Program received signal SIGSEGV, Segmentation fault.
> > > > [Switching to Thread 0x411d0940 (LWP 14446)]
> > > > 0x000000000040afb4 in qemu_mod_timer (ts=0x19f0fd0, expire_time=62275467335)
> > > >     at /home/mst/scm/qemu-kvm/vl.c:1009
> > > > 1009            if ((alarm_timer->flags & ALARM_FLAG_EXPIRED) == 0) {
> > > > (gdb) l
> > > > 1004        ts->next = *pt;
> > > > 1005        *pt = ts;
> > > > 1006
> > > > 1007        /* Rearm if necessary */
> > > > 1008        if (pt == &active_timers[ts->clock->type]) {
> > > > 1009            if ((alarm_timer->flags & ALARM_FLAG_EXPIRED) == 0) {
> > > > 1010                qemu_rearm_alarm_timer(alarm_timer);
> > > > 1011            }
> > > > 1012            /* Interrupt execution to force deadline recalculation. */
> > > > 1013            if (use_icount)
> > > > (gdb) p alarm_timer
> > > > $1 = (struct qemu_alarm_timer *) 0x0
> > > > (gdb) where
> > > > #0  0x000000000040afb4 in qemu_mod_timer (ts=0x19f0fd0, expire_time=62275467335)
> > > >     at /home/mst/scm/qemu-kvm/vl.c:1009
> > > > #1  0x000000000041aadf in virtio_net_handle_tx (vdev=<value optimized out>, vq=0x19f5af0)
> > > >     at /home/mst/scm/qemu-kvm/hw/virtio-net.c:696
> > > > #2  0x0000000000421669 in kvm_run (vcpu=0x19d46a0, env=0x19c2250) at /home/mst/scm/qemu-kvm/qemu-kvm.c:797
> > > > #3  0x00000000004216d6 in kvm_cpu_exec (env=0x83d0f8) at /home/mst/scm/qemu-kvm/qemu-kvm.c:1714
> > > > #4  0x0000000000422981 in ap_main_loop (_env=<value optimized out>) at /home/mst/scm/qemu-kvm/qemu-kvm.c:1969
> > > > #5  0x000000377dc06367 in start_thread () from /lib64/libpthread.so.0
> > > > #6  0x000000377d0d30ad in clone () from /lib64/libc.so.6
> > > > (gdb)
> > > >
> > > > So this probably means that we have already run quit_timers:
> > > >
> > > > static void quit_timers(void)
> > > > {
> > > >     alarm_timer->stop(alarm_timer);
> > > >     alarm_timer = NULL;
> > > > }
> > > >
> > > > but kvm vcpu thread is still running.
> > > >
> > > >
> > > > Not sure what the right fix is here: should we stop
> > > > kvm after main loop has exited?
> > >
> > > kvm_main_loop_wait(env, 0) can process the stop request (signalling
> > > iothread that vcpu is stopped, so it's OK to exit) and continue to
> > > kvm_cpu_exec.
> > >
> > > Can you please try this:
> >
> > I applied this, and have not yet seen any segfaults at exit.
> > Not sure whether this means anything as the crash is not
> > 100% reproducible. Push it out to Anthony and we'll see, long term?
> > Based on the knowledge of how to fix this,
> > how would you go about reproducing it?
>
> Add code to trigger the race manually,

If you like, send a patch adding such code, and I will test it.

> but I'm pretty sure that's it.
>
> Thanks for testing.

diff --git a/qemu-kvm.c b/qemu-kvm.c
index 87ece3d..141c8b1 100644
--- a/qemu-kvm.c
+++ b/qemu-kvm.c
@@ -1931,7 +1931,8 @@ static int kvm_main_loop_cpu(CPUState *env)
         }
         if (run_cpu) {
             kvm_main_loop_wait(env, 0);
-            kvm_cpu_exec(env);
+            if (!is_cpu_stopped(env))
+                kvm_cpu_exec(env);
         } else {
             kvm_main_loop_wait(env, 1000);
         }
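
The reasoning behind the patch, as described in the thread, is that the wait step can acknowledge a stop request and the loop then falls straight through into the exec step, by which time the timers may already be gone. The sketch below is a single-threaded toy model of one loop iteration, with and without the added check. It is not qemu-kvm code: every type and function name here is a hypothetical stand-in, and it assumes for simplicity that timer teardown happens immediately once the stop is acknowledged.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the real qemu-kvm state. */
struct alarm_model { int flags; };
struct vcpu_model {
    bool stop_requested;  /* a stop request is pending */
    bool stopped;         /* the vcpu has acknowledged it */
    int null_derefs;      /* times exec ran with the timer already NULL */
};

static struct alarm_model the_alarm;
static struct alarm_model *alarm_timer = &the_alarm;

/* Models quit_timers(): the pointer is cleared on exit. */
static void quit_timers_model(void)
{
    alarm_timer = NULL;
}

/* Models the wait step: acknowledge a pending stop request. In this
 * model the I/O thread tears the timers down as soon as that happens. */
static void main_loop_wait_model(struct vcpu_model *v)
{
    if (v->stop_requested) {
        v->stopped = true;
        quit_timers_model();
    }
}

/* Models the exec step reaching timer code; counts instead of crashing. */
static void cpu_exec_model(struct vcpu_model *v)
{
    if (alarm_timer == NULL)
        v->null_derefs++;
}

/* One iteration of the loop body; guarded selects the patched variant. */
static void loop_iteration(struct vcpu_model *v, bool guarded)
{
    assert(v != NULL);
    main_loop_wait_model(v);
    if (!guarded || !v->stopped)
        cpu_exec_model(v);
}

/* Returns how many times exec would have touched a NULL alarm_timer. */
static int simulate_exit(bool guarded)
{
    struct vcpu_model v = { .stop_requested = true };

    alarm_timer = &the_alarm;  /* reset the model between runs */
    loop_iteration(&v, guarded);
    return v.null_derefs;
}
```

In this model `simulate_exit(false)` reports one would-be NULL dereference and `simulate_exit(true)` reports none, which mirrors the intent of checking `is_cpu_stopped(env)` before calling `kvm_cpu_exec(env)`.
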