Message ID | 20140715210948.GA20036@amt.cnet |
---|---|
State | New |
Headers | show |
On Wed, Jul 16, 2014 at 1:09 AM, Marcelo Tosatti <mtosatti@redhat.com> wrote: > On Tue, Jul 15, 2014 at 06:01:08PM +0400, Andrey Korolyov wrote: >> On Tue, Jul 15, 2014 at 10:52 AM, Andrey Korolyov <andrey@xdel.ru> wrote: >> > On Tue, Jul 15, 2014 at 9:03 AM, Amit Shah <amit.shah@redhat.com> wrote: >> >> On (Sun) 13 Jul 2014 [16:28:56], Andrey Korolyov wrote: >> >>> Hello, >> >>> >> >>> the issue is not specific to the iothread code because generic >> >>> virtio-blk also hangs up: >> >> >> >> Do you know which version works well? If you could bisect, that'll >> >> help a lot. >> >> >> >> Thanks, >> >> Amit >> > >> > Hi, >> > >> > 2.0 works definitely well. I`ll try to finish bisection today, though >> > every step takes about 10 minutes to complete. >> >> Yay! It is even outside of virtio-blk. >> >> commit 9b1786829aefb83f37a8f3135e3ea91c56001b56 >> Author: Marcelo Tosatti <mtosatti@redhat.com> >> Date: Tue Jun 3 13:34:48 2014 -0300 >> >> kvmclock: Ensure proper env->tsc value for kvmclock_current_nsec calculation >> >> Ensure proper env->tsc value for kvmclock_current_nsec calculation. >> >> Reported-by: Marcin Gibuła <m.gibula@beyond.pl> >> Cc: qemu-stable@nongnu.org >> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> >> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> > > Andrey, > > Can you please provide instructions on how to create reproducible > environment? > > The following patch is equivalent to the original patch, for > the purposes of fixing the kvmclock problem. > > Perhaps it becomes easier to spot the reason for the hang you are > experiencing. > > > diff --git a/hw/i386/kvm/clock.c b/hw/i386/kvm/clock.c > index 272a88a..feb5fc5 100644 > --- a/hw/i386/kvm/clock.c > +++ b/hw/i386/kvm/clock.c > @@ -17,7 +17,6 @@ > #include "qemu/host-utils.h" > #include "sysemu/sysemu.h" > #include "sysemu/kvm.h" > -#include "sysemu/cpus.h" > #include "hw/sysbus.h" > #include "hw/kvm/clock.h" > > @@ -66,7 +65,6 @@ static uint64_t kvmclock_current_nsec(KVMClockState *s) > > cpu_physical_memory_read(kvmclock_struct_pa, &time, sizeof(time)); > > - assert(time.tsc_timestamp <= migration_tsc); > delta = migration_tsc - time.tsc_timestamp; > if (time.tsc_shift < 0) { > delta >>= -time.tsc_shift; > @@ -125,8 +123,6 @@ static void kvmclock_vm_state_change(void *opaque, int running, > if (s->clock_valid) { > return; > } > - > - cpu_synchronize_all_states(); > ret = kvm_vm_ioctl(kvm_state, KVM_GET_CLOCK, &data); > if (ret < 0) { > fprintf(stderr, "KVM_GET_CLOCK failed: %s\n", strerror(ret)); > diff --git a/migration.c b/migration.c > index 8d675b3..34f2325 100644 > --- a/migration.c > +++ b/migration.c > @@ -608,6 +608,7 @@ static void *migration_thread(void *opaque) > qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER); > old_vm_running = runstate_is_running(); > > + cpu_synchronize_all_states(); > ret = vm_stop_force_state(RUN_STATE_FINISH_MIGRATE); > if (ret >= 0) { > qemu_file_set_rate_limit(s->file, INT64_MAX); Marcelo, I do not see way easier than creating PoC deployment (involving at least two separated physical nodes) which will act for both as a sender and receiver for migration and for Ceph storage (http://ceph.com/docs/master/start/). For simplicity you probably want to disable cephx, therefore not putting the secret in the CLI. Also you may receive minimal qemu-ready installation using Mirantis` Fuel with Ceph deployment settings (it`ll deploy some Openstack too as a side effect, but the main reason to do things this way is a very high level of provisioning automation, you`ll get necessary environment with multi-node setting with RBD backend in matter of some clicks and some hours). In a meantime, I`ll try to reproduce the issue with iscsi, because I do not want to mess with shared storage and sanlock plugin.
Il 15/07/2014 23:25, Andrey Korolyov ha scritto: > On Wed, Jul 16, 2014 at 1:09 AM, Marcelo Tosatti <mtosatti@redhat.com> wrote: >> On Tue, Jul 15, 2014 at 06:01:08PM +0400, Andrey Korolyov wrote: >>> On Tue, Jul 15, 2014 at 10:52 AM, Andrey Korolyov <andrey@xdel.ru> wrote: >>>> On Tue, Jul 15, 2014 at 9:03 AM, Amit Shah <amit.shah@redhat.com> wrote: >>>>> On (Sun) 13 Jul 2014 [16:28:56], Andrey Korolyov wrote: >>>>>> Hello, >>>>>> >>>>>> the issue is not specific to the iothread code because generic >>>>>> virtio-blk also hangs up: >>>>> >>>>> Do you know which version works well? If you could bisect, that'll >>>>> help a lot. >>>>> >>>>> Thanks, >>>>> Amit >>>> >>>> Hi, >>>> >>>> 2.0 works definitely well. I`ll try to finish bisection today, though >>>> every step takes about 10 minutes to complete. >>> >>> Yay! It is even outside of virtio-blk. >>> >>> commit 9b1786829aefb83f37a8f3135e3ea91c56001b56 >>> Author: Marcelo Tosatti <mtosatti@redhat.com> >>> Date: Tue Jun 3 13:34:48 2014 -0300 >>> >>> kvmclock: Ensure proper env->tsc value for kvmclock_current_nsec calculation >>> >>> Ensure proper env->tsc value for kvmclock_current_nsec calculation. >>> >>> Reported-by: Marcin Gibuła <m.gibula@beyond.pl> >>> Cc: qemu-stable@nongnu.org >>> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com> >>> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> >> >> Andrey, >> >> Can you please provide instructions on how to create reproducible >> environment? >> >> The following patch is equivalent to the original patch, for >> the purposes of fixing the kvmclock problem. >> >> Perhaps it becomes easier to spot the reason for the hang you are >> experiencing. >> >> >> diff --git a/hw/i386/kvm/clock.c b/hw/i386/kvm/clock.c >> index 272a88a..feb5fc5 100644 >> --- a/hw/i386/kvm/clock.c >> +++ b/hw/i386/kvm/clock.c >> @@ -17,7 +17,6 @@ >> #include "qemu/host-utils.h" >> #include "sysemu/sysemu.h" >> #include "sysemu/kvm.h" >> -#include "sysemu/cpus.h" >> #include "hw/sysbus.h" >> #include "hw/kvm/clock.h" >> >> @@ -66,7 +65,6 @@ static uint64_t kvmclock_current_nsec(KVMClockState *s) >> >> cpu_physical_memory_read(kvmclock_struct_pa, &time, sizeof(time)); >> >> - assert(time.tsc_timestamp <= migration_tsc); >> delta = migration_tsc - time.tsc_timestamp; >> if (time.tsc_shift < 0) { >> delta >>= -time.tsc_shift; >> @@ -125,8 +123,6 @@ static void kvmclock_vm_state_change(void *opaque, int running, >> if (s->clock_valid) { >> return; >> } >> - >> - cpu_synchronize_all_states(); >> ret = kvm_vm_ioctl(kvm_state, KVM_GET_CLOCK, &data); >> if (ret < 0) { >> fprintf(stderr, "KVM_GET_CLOCK failed: %s\n", strerror(ret)); >> diff --git a/migration.c b/migration.c >> index 8d675b3..34f2325 100644 >> --- a/migration.c >> +++ b/migration.c >> @@ -608,6 +608,7 @@ static void *migration_thread(void *opaque) >> qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER); >> old_vm_running = runstate_is_running(); >> >> + cpu_synchronize_all_states(); >> ret = vm_stop_force_state(RUN_STATE_FINISH_MIGRATE); >> if (ret >= 0) { >> qemu_file_set_rate_limit(s->file, INT64_MAX); It could also be useful to apply the above patch _and_ revert a096b3a6732f846ec57dc28b47ee9435aa0609bf, then try to reproduce. Paolo
> Andrey, > > Can you please provide instructions on how to create reproducible > environment? > > The following patch is equivalent to the original patch, for > the purposes of fixing the kvmclock problem. > > Perhaps it becomes easier to spot the reason for the hang you are > experiencing. Marcelo, the original reason for patch adding cpu_synchronize_all_states() there was because this bug affected non-migration operations as well - http://lists.gnu.org/archive/html/qemu-devel/2014-06/msg00472.html. Won't moving it only to migration code break these things again? > > diff --git a/hw/i386/kvm/clock.c b/hw/i386/kvm/clock.c > index 272a88a..feb5fc5 100644 > --- a/hw/i386/kvm/clock.c > +++ b/hw/i386/kvm/clock.c > @@ -17,7 +17,6 @@ > #include "qemu/host-utils.h" > #include "sysemu/sysemu.h" > #include "sysemu/kvm.h" > -#include "sysemu/cpus.h" > #include "hw/sysbus.h" > #include "hw/kvm/clock.h" > > @@ -66,7 +65,6 @@ static uint64_t kvmclock_current_nsec(KVMClockState *s) > > cpu_physical_memory_read(kvmclock_struct_pa, &time, sizeof(time)); > > - assert(time.tsc_timestamp <= migration_tsc); > delta = migration_tsc - time.tsc_timestamp; > if (time.tsc_shift < 0) { > delta >>= -time.tsc_shift; > @@ -125,8 +123,6 @@ static void kvmclock_vm_state_change(void *opaque, int running, > if (s->clock_valid) { > return; > } > - > - cpu_synchronize_all_states(); > ret = kvm_vm_ioctl(kvm_state, KVM_GET_CLOCK, &data); > if (ret < 0) { > fprintf(stderr, "KVM_GET_CLOCK failed: %s\n", strerror(ret)); > diff --git a/migration.c b/migration.c > index 8d675b3..34f2325 100644 > --- a/migration.c > +++ b/migration.c > @@ -608,6 +608,7 @@ static void *migration_thread(void *opaque) > qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER); > old_vm_running = runstate_is_running(); > > + cpu_synchronize_all_states(); > ret = vm_stop_force_state(RUN_STATE_FINISH_MIGRATE); > if (ret >= 0) { > qemu_file_set_rate_limit(s->file, INT64_MAX); >
On Wed, Jul 16, 2014 at 09:35:16AM +0200, Marcin Gibuła wrote: > >Andrey, > > > >Can you please provide instructions on how to create reproducible > >environment? > > > >The following patch is equivalent to the original patch, for > >the purposes of fixing the kvmclock problem. > > > >Perhaps it becomes easier to spot the reason for the hang you are > >experiencing. > > Marcelo, > > the original reason for patch adding cpu_synchronize_all_states() > there was because this bug affected non-migration operations as well > - > http://lists.gnu.org/archive/html/qemu-devel/2014-06/msg00472.html. > > Won't moving it only to migration code break these things again? Yes - its just for debug purposes.
diff --git a/hw/i386/kvm/clock.c b/hw/i386/kvm/clock.c index 272a88a..feb5fc5 100644 --- a/hw/i386/kvm/clock.c +++ b/hw/i386/kvm/clock.c @@ -17,7 +17,6 @@ #include "qemu/host-utils.h" #include "sysemu/sysemu.h" #include "sysemu/kvm.h" -#include "sysemu/cpus.h" #include "hw/sysbus.h" #include "hw/kvm/clock.h" @@ -66,7 +65,6 @@ static uint64_t kvmclock_current_nsec(KVMClockState *s) cpu_physical_memory_read(kvmclock_struct_pa, &time, sizeof(time)); - assert(time.tsc_timestamp <= migration_tsc); delta = migration_tsc - time.tsc_timestamp; if (time.tsc_shift < 0) { delta >>= -time.tsc_shift; @@ -125,8 +123,6 @@ static void kvmclock_vm_state_change(void *opaque, int running, if (s->clock_valid) { return; } - - cpu_synchronize_all_states(); ret = kvm_vm_ioctl(kvm_state, KVM_GET_CLOCK, &data); if (ret < 0) { fprintf(stderr, "KVM_GET_CLOCK failed: %s\n", strerror(ret)); diff --git a/migration.c b/migration.c index 8d675b3..34f2325 100644 --- a/migration.c +++ b/migration.c @@ -608,6 +608,7 @@ static void *migration_thread(void *opaque) qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER); old_vm_running = runstate_is_running(); + cpu_synchronize_all_states(); ret = vm_stop_force_state(RUN_STATE_FINISH_MIGRATE); if (ret >= 0) { qemu_file_set_rate_limit(s->file, INT64_MAX);