From patchwork Fri Jan 27 12:24:58 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Laurent Vivier X-Patchwork-Id: 720656 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 3v8ygZ3PbKz9tT8 for ; Fri, 27 Jan 2017 23:26:01 +1100 (AEDT) Received: from localhost ([::1]:44851 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1cX5bH-0007nl-9u for incoming@patchwork.ozlabs.org; Fri, 27 Jan 2017 07:25:55 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:44587) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1cX5aV-0007Sd-Up for qemu-devel@nongnu.org; Fri, 27 Jan 2017 07:25:09 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1cX5aR-00055i-QX for qemu-devel@nongnu.org; Fri, 27 Jan 2017 07:25:07 -0500 Received: from mx1.redhat.com ([209.132.183.28]:41866) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1cX5aR-00055Q-IG; Fri, 27 Jan 2017 07:25:03 -0500 Received: from smtp.corp.redhat.com (int-mx16.intmail.prod.int.phx2.redhat.com [10.5.11.28]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 999C37E9C5; Fri, 27 Jan 2017 12:25:03 +0000 (UTC) Received: from thinkpad.redhat.com (ovpn-116-235.ams2.redhat.com [10.36.116.235]) by smtp.corp.redhat.com (Postfix) with ESMTP id 05A2B1CB9A2; Fri, 27 Jan 2017 12:25:00 +0000 (UTC) From: Laurent Vivier To: qemu-devel@nongnu.org Date: Fri, 27 Jan 2017 13:24:58 +0100 Message-Id: <20170127122458.318-1-lvivier@redhat.com> X-Scanned-By: MIMEDefang 2.74 on 10.5.11.28 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.26]); Fri, 27 Jan 2017 12:25:03 +0000 (UTC) X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x [generic] [fuzzy] X-Received-From: 209.132.183.28 Subject: [Qemu-devel] [PATCH v2] spapr: clock should count only if vm is running X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Laurent Vivier , Thomas Huth , Alexey Kardashevskiy , Marcelo Tosatti , Alexander Graf , qemu-ppc@nongnu.org, Paolo Bonzini , David Gibson Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" This is a port to ppc of the i386 commit: 00f4d64 kvmclock: clock should count only if vm is running We remove timebase_post_load function, and use the VM state change handler to save and restore the guest_timebase (on stop and continue). We keep timebase_pre_save to reduce the clock difference on migration like in: 6053a86 kvmclock: reduce kvmclock difference on migration Time base offset has originally been introduced by commit 98a8b52 spapr: Add support for time base offset migration So while VM is paused, the time is stopped. This allows to have the same result with date (based on Time Base Register) and hwclock (based on "get-time-of-day" RTAS call). Moreover in TCG mode, the Time Base is always paused, so this patch also adjust the behavior between TCG and KVM. VM state field "time_of_the_day_ns" is now useless but we keep it to be able to migrate to older version of the machine. As vmstate_ppc_timebase structure (with timebase_pre_save() and timebase_post_load() functions) was only used by vmstate_spapr, we register the VM state change handler only in ppc_spapr_init(). Signed-off-by: Laurent Vivier --- v2: keep timebase_pre_save() move save and load code to timebase_save() and timebase_load(), to use the timebase_save() in timebase_pre_save() and in cpu_ppc_clock_vm_state_change(), and this eases the patch review. put a "#if defined(CONFIG_KVM)" only around kvm_set_one_reg() don't remove the trace function hw/ppc/ppc.c | 66 ++++++++++++++++++++++++++++++++++------------------ hw/ppc/spapr.c | 6 +++++ target/ppc/cpu-qom.h | 3 +++ 3 files changed, 52 insertions(+), 23 deletions(-) diff --git a/hw/ppc/ppc.c b/hw/ppc/ppc.c index 8945869..24d4392 100644 --- a/hw/ppc/ppc.c +++ b/hw/ppc/ppc.c @@ -847,9 +847,8 @@ static void cpu_ppc_set_tb_clk (void *opaque, uint32_t freq) cpu_ppc_store_purr(cpu, 0x0000000000000000ULL); } -static void timebase_pre_save(void *opaque) +static void timebase_save(PPCTimebase *tb) { - PPCTimebase *tb = opaque; uint64_t ticks = cpu_get_host_ticks(); PowerPCCPU *first_ppc_cpu = POWERPC_CPU(first_cpu); @@ -858,43 +857,30 @@ static void timebase_pre_save(void *opaque) return; } + /* not used anymore, we keep it for compatibility */ tb->time_of_the_day_ns = qemu_clock_get_ns(QEMU_CLOCK_HOST); /* - * tb_offset is only expected to be changed by migration so + * tb_offset is only expected to be changed by QEMU so * there is no need to update it from KVM here */ tb->guest_timebase = ticks + first_ppc_cpu->env.tb_env->tb_offset; } -static int timebase_post_load(void *opaque, int version_id) +static void timebase_load(PPCTimebase *tb) { - PPCTimebase *tb_remote = opaque; CPUState *cpu; PowerPCCPU *first_ppc_cpu = POWERPC_CPU(first_cpu); - int64_t tb_off_adj, tb_off, ns_diff; - int64_t migration_duration_ns, migration_duration_tb, guest_tb, host_ns; + int64_t tb_off_adj, tb_off; unsigned long freq; if (!first_ppc_cpu->env.tb_env) { error_report("No timebase object"); - return -1; + return; } freq = first_ppc_cpu->env.tb_env->tb_freq; - /* - * Calculate timebase on the destination side of migration. - * The destination timebase must be not less than the source timebase. - * We try to adjust timebase by downtime if host clocks are not - * too much out of sync (1 second for now). - */ - host_ns = qemu_clock_get_ns(QEMU_CLOCK_HOST); - ns_diff = MAX(0, host_ns - tb_remote->time_of_the_day_ns); - migration_duration_ns = MIN(NANOSECONDS_PER_SECOND, ns_diff); - migration_duration_tb = muldiv64(freq, migration_duration_ns, - NANOSECONDS_PER_SECOND); - guest_tb = tb_remote->guest_timebase + MIN(0, migration_duration_tb); - tb_off_adj = guest_tb - cpu_get_host_ticks(); + tb_off_adj = tb->guest_timebase - cpu_get_host_ticks(); tb_off = first_ppc_cpu->env.tb_env->tb_offset; trace_ppc_tb_adjust(tb_off, tb_off_adj, tb_off_adj - tb_off, @@ -904,9 +890,44 @@ static int timebase_post_load(void *opaque, int version_id) CPU_FOREACH(cpu) { PowerPCCPU *pcpu = POWERPC_CPU(cpu); pcpu->env.tb_env->tb_offset = tb_off_adj; +#if defined(CONFIG_KVM) + kvm_set_one_reg(cpu, KVM_REG_PPC_TB_OFFSET, + &pcpu->env.tb_env->tb_offset); +#endif } +} - return 0; +void cpu_ppc_clock_vm_state_change(void *opaque, int running, + RunState state) +{ + PPCTimebase *tb = opaque; + + if (running) { + timebase_load(tb); + } else { + timebase_save(tb); + } +} + +/* + * When migrating, read the clock just before migration, + * so that the guest clock counts during the events + * between: + * + * * vm_stop() + * * + * * pre_save() + * + * This reduces clock difference on migration from 5s + * to 0.1s (when max_downtime == 5s), because sending the + * final pages of memory (which happens between vm_stop() + * and pre_save()) takes max_downtime. + */ +static void timebase_pre_save(void *opaque) +{ + PPCTimebase *tb = opaque; + + timebase_save(tb); } const VMStateDescription vmstate_ppc_timebase = { @@ -915,7 +936,6 @@ const VMStateDescription vmstate_ppc_timebase = { .minimum_version_id = 1, .minimum_version_id_old = 1, .pre_save = timebase_pre_save, - .post_load = timebase_post_load, .fields = (VMStateField []) { VMSTATE_UINT64(guest_timebase, PPCTimebase), VMSTATE_INT64(time_of_the_day_ns, PPCTimebase), diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c index a642e66..ee78096 100644 --- a/hw/ppc/spapr.c +++ b/hw/ppc/spapr.c @@ -2116,6 +2116,12 @@ static void ppc_spapr_init(MachineState *machine) qemu_register_reset(spapr_ccs_reset_hook, spapr); qemu_register_boot_set(spapr_boot_set, spapr); + + /* to stop and start vmclock */ + if (kvm_enabled()) { + qemu_add_vm_change_state_handler(cpu_ppc_clock_vm_state_change, + &spapr->tb); + } } static int spapr_kvm_type(const char *vm_type) diff --git a/target/ppc/cpu-qom.h b/target/ppc/cpu-qom.h index d46c31a..b7977ba 100644 --- a/target/ppc/cpu-qom.h +++ b/target/ppc/cpu-qom.h @@ -214,6 +214,9 @@ extern const struct VMStateDescription vmstate_ppc_timebase; .flags = VMS_STRUCT, \ .offset = vmstate_offset_value(_state, _field, PPCTimebase), \ } + +void cpu_ppc_clock_vm_state_change(void *opaque, int running, + RunState state); #endif #endif