Message ID | 20140603163448.GA11829@amt.cnet
---|---
State | New |
On 03/06/2014 18:34, Marcelo Tosatti wrote:
> Ensure proper env->tsc value for kvmclock_current_nsec calculation.
>
> Reported-by: Marcin Gibuła <m.gibula@beyond.pl>
> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
>
> diff --git a/hw/i386/kvm/clock.c b/hw/i386/kvm/clock.c
> index 6f4ed28a..bef2504 100644
> --- a/hw/i386/kvm/clock.c
> +++ b/hw/i386/kvm/clock.c
> @@ -17,6 +17,7 @@
>  #include "qemu/host-utils.h"
>  #include "sysemu/sysemu.h"
>  #include "sysemu/kvm.h"
> +#include "sysemu/cpus.h"
>  #include "hw/sysbus.h"
>  #include "hw/kvm/clock.h"
>
> @@ -65,6 +66,7 @@ static uint64_t kvmclock_current_nsec(KVMClockState *s)
>
>      cpu_physical_memory_read(kvmclock_struct_pa, &time, sizeof(time));
>
> +    assert(time.tsc_timestamp <= migration_tsc);
>      delta = migration_tsc - time.tsc_timestamp;
>      if (time.tsc_shift < 0) {
>          delta >>= -time.tsc_shift;
> @@ -123,6 +125,8 @@ static void kvmclock_vm_state_change(void *opaque, int running,
>      if (s->clock_valid) {
>          return;
>      }
> +
> +    cpu_synchronize_all_states();
>      ret = kvm_vm_ioctl(kvm_state, KVM_GET_CLOCK, &data);
>      if (ret < 0) {
>          fprintf(stderr, "KVM_GET_CLOCK failed: %s\n", strerror(ret));

This causes a hang during migration, so I'll revert the patch from 2.1.
Sorry.

Paolo
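For readers following the thread: the failure mode the new assert() guards is an unsigned underflow in the TSC delta. Below is a minimal, self-contained sketch of the pvclock scaling math (the struct is a hypothetical stand-in for the fields of the kernel's pvclock_vcpu_time_info; QEMU's real code uses mulu64() rather than the __int128 GCC/Clang extension):

#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the pvclock_vcpu_time_info fields used here. */
struct pvclock_sample {
    uint64_t tsc_timestamp;     /* guest TSC when the record was written */
    uint64_t system_time;       /* kvmclock nanoseconds at that instant  */
    uint32_t tsc_to_system_mul; /* 32.32 fixed-point TSC->ns multiplier  */
    int8_t   tsc_shift;         /* power-of-two pre-scale for the delta  */
};

/* The delta->nanoseconds conversion kvmclock_current_nsec performs.
 * If tsc_now is stale (smaller than tsc_timestamp), the unsigned
 * subtraction wraps around to an enormous bogus delta -- exactly the
 * failure mode the added assert() is meant to catch. */
static uint64_t pvclock_nsec(const struct pvclock_sample *t, uint64_t tsc_now)
{
    assert(t->tsc_timestamp <= tsc_now);
    uint64_t delta = tsc_now - t->tsc_timestamp;

    if (t->tsc_shift < 0) {
        delta >>= -t->tsc_shift;
    } else {
        delta <<= t->tsc_shift;
    }

    /* 64x32-bit multiply, keeping the top 64 bits of the 32.32 result. */
    unsigned __int128 ns = (unsigned __int128)delta * t->tsc_to_system_mul;
    return (uint64_t)(ns >> 32) + t->system_time;
}

int main(void)
{
    /* A 2 GHz TSC: mul = 2^31, shift = 0, so 2 ticks -> 1 ns.
     * delta = 2000 ticks -> 1000 ns, plus system_time = 500.  */
    struct pvclock_sample s = { 1000, 500, 1u << 31, 0 };
    printf("%llu ns\n", (unsigned long long)pvclock_nsec(&s, 3000));
    return 0;
}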
>> @@ -65,6 +66,7 @@ static uint64_t kvmclock_current_nsec(KVMClockState *s)
>>
>>      cpu_physical_memory_read(kvmclock_struct_pa, &time, sizeof(time));
>>
>> +    assert(time.tsc_timestamp <= migration_tsc);
>>      delta = migration_tsc - time.tsc_timestamp;
>>      if (time.tsc_shift < 0) {
>>          delta >>= -time.tsc_shift;
>> @@ -123,6 +125,8 @@ static void kvmclock_vm_state_change(void *opaque,
>> int running,
>>      if (s->clock_valid) {
>>          return;
>>      }
>> +
>> +    cpu_synchronize_all_states();
>>      ret = kvm_vm_ioctl(kvm_state, KVM_GET_CLOCK, &data);
>>      if (ret < 0) {
>>          fprintf(stderr, "KVM_GET_CLOCK failed: %s\n",
>>                  strerror(ret));
>
> This causes a hang during migration, so I'll revert the patch from 2.1.

For me this patch series fixed all hangs I had with migration (at least
with qemu 2.0).
On 15/07/2014 22:03, Marcin Gibuła wrote:
>> This causes a hang during migration, so I'll revert the patch from 2.1.
>
> For me this patch series fixed all hangs I had with migration (at least
> with qemu 2.0).

Unfortunately, someone else bisected their hangs exactly to this patch:

http://permalink.gmane.org/gmane.comp.emulators.qemu/286681

I could not reproduce that, but I couldn't reproduce yours either, so I'm
inclined to trust him and revert the patch.

Paolo
On Tue, Jul 15, 2014 at 10:15:14PM +0200, Paolo Bonzini wrote:
> On 15/07/2014 22:03, Marcin Gibuła wrote:
>>> This causes a hang during migration, so I'll revert the patch from 2.1.
>>
>> For me this patch series fixed all hangs I had with migration (at least
>> with qemu 2.0).
>
> Unfortunately, someone else bisected their hangs exactly to this patch:
>
> http://permalink.gmane.org/gmane.comp.emulators.qemu/286681
>
> I could not reproduce that, but I couldn't reproduce yours either, so
> I'm inclined to trust him and revert the patch.

This patch (or some form of updating env->tsc, see changelog) is
necessary.

I was just about to reproduce it.

Andrey, can you please provide the qemu command line?

Can you attempt to start the VM via the command line and migrate via the
QEMU monitor, then gdb -p to get a backtrace for all threads?
On Tue, Jul 15, 2014 at 05:31:22PM -0300, Marcelo Tosatti wrote:
> On Tue, Jul 15, 2014 at 10:15:14PM +0200, Paolo Bonzini wrote:
>> On 15/07/2014 22:03, Marcin Gibuła wrote:
>>>> This causes a hang during migration, so I'll revert the patch from 2.1.
>>>
>>> For me this patch series fixed all hangs I had with migration (at least
>>> with qemu 2.0).
>>
>> Unfortunately, someone else bisected their hangs exactly to this patch:
>>
>> http://permalink.gmane.org/gmane.comp.emulators.qemu/286681
>>
>> I could not reproduce that, but I couldn't reproduce yours either, so
>> I'm inclined to trust him and revert the patch.
>
> This patch (or some form of updating env->tsc, see changelog) is
> necessary.
>
> I was just about to reproduce it.
>
> Andrey, can you please provide the qemu command line?
>
> Can you attempt to start the VM via the command line and migrate via the
> QEMU monitor, then gdb -p to get a backtrace for all threads?

The backtrace in the following message is for a different problem,
correct?

http://www.mail-archive.com/qemu-devel@nongnu.org/msg246161.html
On 15/07/2014 22:31, Marcelo Tosatti wrote:
> This patch (or some form of updating env->tsc, see changelog) is
> necessary.

Yes, I was going to revert Alex's too.

> I was just about to reproduce it.
>
> Andrey, can you please provide the qemu command line?
>
> Can you attempt to start the VM via the command line and migrate via the
> QEMU monitor, then gdb -p to get a backtrace for all threads?

Andrey provided backtraces for this issue here:

http://article.gmane.org/gmane.comp.emulators.qemu/286679

The first backtrace attached is bt-noreset.txt, the second is
bt-reset.txt, but they are identical.

The backtraces lack symbols for QEMU, but it looks like the VMs are
running, because the CPU threads all look like this:

Thread 4 (Thread 0x7f0054ff9700 (LWP 2915)):
#0  0x00007f00a04dbc37 in ioctl () at ../sysdeps/unix/syscall-template.S:82
#1  0x00007f00a59ea0a9 in ?? ()
#2  0x00007f00a59ea1e5 in ?? ()
#3  0x00007f00a59d59bc in ?? ()
#4  0x00007f00a07b5e9a in start_thread (arg=0x7f0054ff9700) at pthread_create.c:308
#5  0x00007f00a04e33dd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#6  0x0000000000000000 in ?? ()

where the ioctl is likely KVM_RUN.

Paolo
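For context, a simplified sketch (not QEMU's actual code) of the inner loop a vCPU thread runs. The thread spends nearly all of its time blocked inside the KVM_RUN ioctl while the guest executes, which is why a bare ioctl() frame at the top of a CPU thread's stack is a strong hint that the VM is still running:

#include <linux/kvm.h>
#include <sys/ioctl.h>

/* Simplified vCPU loop: enter the guest, handle the exit, re-enter.
 * vcpu_fd is the fd from KVM_CREATE_VCPU; run is the mmap'ed kvm_run
 * structure shared with the kernel. */
static void vcpu_loop(int vcpu_fd, struct kvm_run *run)
{
    for (;;) {
        ioctl(vcpu_fd, KVM_RUN, 0);   /* blocks while the guest executes */
        switch (run->exit_reason) {
        case KVM_EXIT_IO:
            /* emulate the port I/O access, then re-enter the guest */
            break;
        case KVM_EXIT_MMIO:
            /* emulate the MMIO access */
            break;
        default:
            break;
        }
    }
}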
On 15/07/2014 22:34, Marcelo Tosatti wrote:
> The backtrace in the following message is for a different problem,
> correct?
>
> http://www.mail-archive.com/qemu-devel@nongnu.org/msg246161.html

Correct.

Paolo
On Wed, Jul 16, 2014 at 12:43 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> On 15/07/2014 22:34, Marcelo Tosatti wrote:
>
>> The backtrace in the following message is for a different problem,
>> correct?
>>
>> http://www.mail-archive.com/qemu-devel@nongnu.org/msg246161.html
>
> Correct.
>
> Paolo

Sorry for being unclear. The VM is running well; the resulting lockup
visibly affects only the disk. I described it as a lockup, but it is not
a complete lockup: I am able to execute cached executables (and see, for
example, soft lockup warnings for the disk). Unfortunately I had not
prepared a -dbg package, so if you need an exact trace, let me know; I
thought it was only needed for quick verification. I'll be very glad to
provide any other information if necessary.

As I mentioned in the original thread, the issue is not specific to the
iothread code, though I have not tested with a bus other than virtio. If
any of you are interested in investigating this trade-off (the change
that broke things for me fixes things elsewhere), I can offer you a
sandboxed environment with an rbd-ready deployment.

The argument set follows:

qemu-system-x86_64 -enable-kvm -name vm27842 -S \
  -machine pc-i440fx-2.1,accel=kvm,usb=off -m 256 -realtime mlock=off \
  -smp 12,sockets=1,cores=12,threads=12 -numa node,nodeid=0,cpus=0,mem=256 \
  -uuid 9a44bdae-5702-463b-aa1e-d8d85055f6af -nographic -no-user-config \
  -nodefaults \
  -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/vm27842.monitor,server,nowait \
  -mon chardev=charmonitor,id=monitor,mode=control \
  -rtc base=utc,clock=vm,driftfix=slew -global kvm-pit.lost_tick_policy=discard \
  -no-shutdown -boot strict=on \
  -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 \
  -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x5 \
  -drive file=rbd:dev-rack2/vm27842-Kxs:id=qemukvm:key=XXXXXXXX:auth_supported=cephx\;none:mon_host=10.6.0.1\:6789\;10.6.0.4\:6789,if=none,id=drive-virtio-disk0,format=raw,cache=writeback \
  -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x6,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 \
  -netdev tap,fd=23,id=hostnet0,vhost=on,vhostfd=24 \
  -device virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:10:03:40,bus=pci.0,addr=0x3 \
  -netdev tap,fd=25,id=hostnet1,vhost=on,vhostfd=26 \
  -device virtio-net-pci,netdev=hostnet1,id=net1,mac=52:54:00:10:03:41,bus=pci.0,addr=0x4 \
  -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 \
  -chardev socket,id=charchannel0,path=/var/lib/libvirt/qemu/vm27842.sock,server,nowait \
  -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.1
On 15/07/2014 23:05, Andrey Korolyov wrote:
> On Wed, Jul 16, 2014 at 12:43 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>> On 15/07/2014 22:34, Marcelo Tosatti wrote:
>>
>>> The backtrace in the following message is for a different problem,
>>> correct?
>>>
>>> http://www.mail-archive.com/qemu-devel@nongnu.org/msg246161.html
>>
>> Correct.
>>
>> Paolo
>
> Sorry for being unclear. The VM is running well; the resulting lockup
> visibly affects only the disk. I described it as a lockup, but it is not
> a complete lockup: I am able to execute cached executables (and see, for
> example, soft lockup warnings for the disk).

Yeah, that was my understanding. Can you reproduce it without rbd?

Paolo
diff --git a/hw/i386/kvm/clock.c b/hw/i386/kvm/clock.c
index 6f4ed28a..bef2504 100644
--- a/hw/i386/kvm/clock.c
+++ b/hw/i386/kvm/clock.c
@@ -17,6 +17,7 @@
 #include "qemu/host-utils.h"
 #include "sysemu/sysemu.h"
 #include "sysemu/kvm.h"
+#include "sysemu/cpus.h"
 #include "hw/sysbus.h"
 #include "hw/kvm/clock.h"

@@ -65,6 +66,7 @@ static uint64_t kvmclock_current_nsec(KVMClockState *s)

     cpu_physical_memory_read(kvmclock_struct_pa, &time, sizeof(time));

+    assert(time.tsc_timestamp <= migration_tsc);
     delta = migration_tsc - time.tsc_timestamp;
     if (time.tsc_shift < 0) {
         delta >>= -time.tsc_shift;
@@ -123,6 +125,8 @@ static void kvmclock_vm_state_change(void *opaque, int running,
     if (s->clock_valid) {
         return;
     }
+
+    cpu_synchronize_all_states();
     ret = kvm_vm_ioctl(kvm_state, KVM_GET_CLOCK, &data);
     if (ret < 0) {
         fprintf(stderr, "KVM_GET_CLOCK failed: %s\n", strerror(ret));
Ensure proper env->tsc value for kvmclock_current_nsec calculation.

Reported-by: Marcin Gibuła <m.gibula@beyond.pl>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
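To make the commit message concrete, here is a toy model (hypothetical names, not QEMU code) of the stale-snapshot problem the cpu_synchronize_all_states() call addresses: env->tsc is only a lazily refreshed copy of the guest TSC, and computing the kvmclock delta from a stale copy wraps around:

#include <stdint.h>
#include <stdio.h>

static uint64_t kernel_tsc = 5000;  /* what KVM would report right now  */
static uint64_t cached_tsc = 1000;  /* stale snapshot, i.e. env->tsc    */

/* Models cpu_synchronize_all_states(): refresh the user-space copy
 * from the kernel (KVM_GET_MSRS under the hood). */
static void synchronize_state(void)
{
    cached_tsc = kernel_tsc;
}

int main(void)
{
    uint64_t tsc_timestamp = 4000;  /* value in the guest's kvmclock page */

    /* Without a sync, the cached TSC predates the kvmclock record and
     * the unsigned delta wraps around to an absurd value: */
    printf("stale delta:  %llu\n",
           (unsigned long long)(cached_tsc - tsc_timestamp));

    synchronize_state();
    printf("synced delta: %llu\n",
           (unsigned long long)(cached_tsc - tsc_timestamp));
    return 0;
}

Run as written, the first line prints a near-2^64 delta (the bogus value the assert in kvmclock_current_nsec would trip on) and the second prints the sane delta of 1000 ticks.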