From patchwork Mon Oct 21 01:14:18 2013
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: mrhines@linux.vnet.ibm.com
X-Patchwork-Id: 285061
From: mrhines@linux.vnet.ibm.com
To: qemu-devel@nongnu.org
Date: Mon, 21 Oct 2013 01:14:18 +0000
Message-Id: <1382318062-6288-9-git-send-email-mrhines@linux.vnet.ibm.com>
X-Mailer: git-send-email 1.8.1.2
In-Reply-To: <1382318062-6288-1-git-send-email-mrhines@linux.vnet.ibm.com>
References: <1382318062-6288-1-git-send-email-mrhines@linux.vnet.ibm.com>
Cc: aliguori@us.ibm.com, quintela@redhat.com, owasserm@redhat.com,
 onom@us.ibm.com, abali@us.ibm.com, mrhines@us.ibm.com, gokul@us.ibm.com,
 pbonzini@redhat.com
Subject: [Qemu-devel] [RFC PATCH v1: 08/12] mc: modified QMP statistics and
 migration_thread handoff

From: "Michael R. Hines"

In addition to better handling of new QMP statistics associated with
the migration_bitmap and MC performance, we need to transfer control
from the migration thread to the MC thread more cleanly, which means
dynamically allocating the threads and doing the handoff after the
initial live migration has completed.

Signed-off-by: Michael R. Hines
---
 hmp.c                         | 17 ++++++++
 include/migration/migration.h | 14 ++++++-
 migration.c                   | 94 +++++++++++++++++++++++++++----------------
 qapi-schema.json              |  2 +
 savevm.c                      |  5 +--
 5 files changed, 93 insertions(+), 39 deletions(-)

diff --git a/hmp.c b/hmp.c
index 32ee285..43896e9 100644
--- a/hmp.c
+++ b/hmp.c
@@ -202,6 +202,23 @@ void hmp_info_migrate(Monitor *mon, const QDict *qdict)
                        info->disk->total >> 10);
     }
 
+    if (info->has_mc) {
+        monitor_printf(mon, "checkpoints: %" PRIu64 "\n",
+                       info->mc->checkpoints);
+        monitor_printf(mon, "xmit_time: %" PRIu64 " ms\n",
+                       info->mc->xmit_time);
+        monitor_printf(mon, "log_dirty_time: %" PRIu64 " ms\n",
+                       info->mc->log_dirty_time);
+        monitor_printf(mon, "migration_bitmap_time: %" PRIu64 " ms\n",
+                       info->mc->migration_bitmap_time);
+        monitor_printf(mon, "ram_copy_time: %" PRIu64 " ms\n",
+                       info->mc->ram_copy_time);
+        monitor_printf(mon, "copy_mbps: %0.2f mbps\n",
+                       info->mc->copy_mbps);
+        monitor_printf(mon, "throughput: %0.2f mbps\n",
+                       info->mc->mbps);
+    }
+
     if (info->has_xbzrle_cache) {
         monitor_printf(mon, "cache size: %" PRIu64 " bytes\n",
                        info->xbzrle_cache->cache_size);
diff --git a/include/migration/migration.h b/include/migration/migration.h
index fcf7684..a1ab06c 100644
--- a/include/migration/migration.h
+++ b/include/migration/migration.h
@@ -35,13 +35,14 @@ struct MigrationState
     int64_t bandwidth_limit;
     size_t bytes_xfer;
     size_t xfer_limit;
-    QemuThread thread;
+    QemuThread *thread;
     QEMUBH *cleanup_bh;
     QEMUFile *file;
 
     int state;
     MigrationParams params;
     double mbps;
+    double copy_mbps;
     int64_t total_time;
     int64_t downtime;
     int64_t expected_downtime;
@@ -54,6 +55,7 @@ struct MigrationState
     bool enabled_capabilities[MIGRATION_CAPABILITY_MAX];
     int64_t xbzrle_cache_size;
     int64_t setup_time;
+    int64_t checkpoints;
 };
 
 void process_incoming_migration(QEMUFile *f);
@@ -137,6 +139,12 @@ enum {
     MIG_STATE_MC,
     MIG_STATE_COMPLETED,
 };
+
+int mc_enable_buffering(void);
+int mc_start_buffer(void);
+void mc_init_checkpointer(MigrationState *s);
+void mc_process_incoming_checkpoints_if_requested(QEMUFile *f);
+
 void ram_handle_compressed(void *host, uint8_t ch, uint64_t size);
 
 /**
@@ -207,10 +215,14 @@ int ram_control_copy_page(QEMUFile *f,
                           long size);
 
 int migrate_use_mc(void);
+int migrate_use_mc_net(void);
 int migrate_use_mc_rdma_copy(void);
 
 #define MC_VERSION 1
 
+int mc_info_load(QEMUFile *f, void *opaque, int version_id);
+void mc_info_save(QEMUFile *f, void *opaque);
+
 void qemu_rdma_info_save(QEMUFile *f, void *opaque);
 int qemu_rdma_info_load(QEMUFile *f, void *opaque, int version_id);
 #endif
diff --git a/migration.c b/migration.c
index 62dded3..8e0827e 100644
--- a/migration.c
+++ b/migration.c
@@ -172,6 +172,31 @@ static void get_xbzrle_cache_stats(MigrationInfo *info)
     }
 }
 
+static void get_ram_stats(MigrationState *s, MigrationInfo *info)
+{
+    info->has_total_time = true;
+    info->total_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME)
+        - s->total_time;
+
+    info->has_ram = true;
+    info->ram = g_malloc0(sizeof(*info->ram));
+    info->ram->transferred = ram_bytes_transferred();
+    info->ram->total = ram_bytes_total();
+    info->ram->duplicate = dup_mig_pages_transferred();
+    info->ram->skipped = skipped_mig_pages_transferred();
+    info->ram->normal = norm_mig_pages_transferred();
+    info->ram->normal_bytes = norm_mig_bytes_transferred();
+    info->ram->mbps = s->mbps;
+
+    if (blk_mig_active()) {
+        info->has_disk = true;
+        info->disk = g_malloc0(sizeof(*info->disk));
+        info->disk->transferred = blk_mig_bytes_transferred();
+        info->disk->remaining = blk_mig_bytes_remaining();
+        info->disk->total = blk_mig_bytes_total();
+    }
+}
+
 MigrationInfo *qmp_query_migrate(Error **errp)
 {
     MigrationInfo *info = g_malloc0(sizeof(*info));
@@ -197,26 +222,8 @@ MigrationInfo *qmp_query_migrate(Error **errp)
         info->has_setup_time = true;
         info->setup_time = s->setup_time;
 
-        info->has_ram = true;
-        info->ram = g_malloc0(sizeof(*info->ram));
-        info->ram->transferred = ram_bytes_transferred();
-        info->ram->remaining = ram_bytes_remaining();
-        info->ram->total = ram_bytes_total();
-        info->ram->duplicate = dup_mig_pages_transferred();
-        info->ram->skipped = skipped_mig_pages_transferred();
-        info->ram->normal = norm_mig_pages_transferred();
-        info->ram->normal_bytes = norm_mig_bytes_transferred();
+        get_ram_stats(s, info);
         info->ram->dirty_pages_rate = s->dirty_pages_rate;
-        info->ram->mbps = s->mbps;
-
-        if (blk_mig_active()) {
-            info->has_disk = true;
-            info->disk = g_malloc0(sizeof(*info->disk));
-            info->disk->transferred = blk_mig_bytes_transferred();
-            info->disk->remaining = blk_mig_bytes_remaining();
-            info->disk->total = blk_mig_bytes_total();
-        }
-
         get_xbzrle_cache_stats(info);
         break;
     case MIG_STATE_COMPLETED:
@@ -225,22 +232,37 @@ MigrationInfo *qmp_query_migrate(Error **errp)
         info->has_status = true;
         info->status = g_strdup("completed");
         info->has_total_time = true;
-        info->total_time = s->total_time;
+        info->total_time = qemu_clock_get_ms(QEMU_CLOCK_REALTIME)
+            - s->total_time;
         info->has_downtime = true;
         info->downtime = s->downtime;
         info->has_setup_time = true;
         info->setup_time = s->setup_time;
 
-        info->has_ram = true;
-        info->ram = g_malloc0(sizeof(*info->ram));
-        info->ram->transferred = ram_bytes_transferred();
-        info->ram->remaining = 0;
-        info->ram->total = ram_bytes_total();
-        info->ram->duplicate = dup_mig_pages_transferred();
-        info->ram->skipped = skipped_mig_pages_transferred();
-        info->ram->normal = norm_mig_pages_transferred();
-        info->ram->normal_bytes = norm_mig_bytes_transferred();
-        info->ram->mbps = s->mbps;
+        get_ram_stats(s, info);
+        break;
+    case MIG_STATE_MC:
+        info->has_status = true;
+        info->status = g_strdup("checkpointing");
+        info->has_setup_time = true;
+        info->setup_time = s->setup_time;
+        info->has_downtime = true;
+        info->downtime = s->downtime;
+
+        get_ram_stats(s, info);
+        info->ram->dirty_pages_rate = s->dirty_pages_rate;
+        get_xbzrle_cache_stats(info);
+
+
+        info->has_mc = true;
+        info->mc = g_malloc0(sizeof(*info->mc));
+        info->mc->xmit_time = s->xmit_time;
+        info->mc->log_dirty_time = s->log_dirty_time;
+        info->mc->migration_bitmap_time = s->bitmap_time;
+        info->mc->ram_copy_time = s->ram_copy_time;
+        info->mc->copy_mbps = s->copy_mbps;
+        info->mc->mbps = s->mbps;
+        info->mc->checkpoints = s->checkpoints;
         break;
     case MIG_STATE_ERROR:
         info->has_status = true;
@@ -288,14 +310,17 @@ static void migrate_fd_cleanup(void *opaque)
 {
     MigrationState *s = opaque;
 
-    qemu_bh_delete(s->cleanup_bh);
-    s->cleanup_bh = NULL;
+    if (s->cleanup_bh) {
+        qemu_bh_delete(s->cleanup_bh);
+        s->cleanup_bh = NULL;
+    }
 
     if (s->file) {
         DPRINTF("closing file\n");
         qemu_mutex_unlock_iothread();
-        qemu_thread_join(&s->thread);
+        qemu_thread_join(s->thread);
         qemu_mutex_lock_iothread();
+        g_free(s->thread);
 
         qemu_fclose(s->file);
         s->file = NULL;
@@ -670,6 +695,7 @@ void migrate_fd_connect(MigrationState *s)
     /* Notify before starting migration thread */
     notifier_list_notify(&migration_state_notifiers, s);
 
-    qemu_thread_create(&s->thread, migration_thread, s,
+    s->thread = g_malloc0(sizeof(*s->thread));
+    qemu_thread_create(s->thread, migration_thread, s,
                        QEMU_THREAD_JOINABLE);
 }
diff --git a/qapi-schema.json b/qapi-schema.json
index 8e72bcf..e0a430c 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -630,6 +630,8 @@
 #                migration statistics, only returned if XBZRLE feature is on and
 #                status is 'active' or 'completed' (since 1.2)
 #
+# @mc: #optional @MCStats containing detailed Micro-Checkpointing statistics
+#
 # @total-time: #optional total amount of milliseconds since migration started.
 #        If migration has ended, it returns the total migration
 #        time. (since 1.2)
diff --git a/savevm.c b/savevm.c
index f8eb225..3ad4eea 100644
--- a/savevm.c
+++ b/savevm.c
@@ -419,10 +419,7 @@ QEMUFile *qemu_fdopen(int fd, const char *mode)
 {
     QEMUFileSocket *s;
 
-    if (mode == NULL ||
-        (mode[0] != 'r' && mode[0] != 'w') ||
-        mode[1] != 'b' || mode[2] != 0) {
-        fprintf(stderr, "qemu_fdopen: Argument validity check failed\n");
+    if (qemu_file_mode_is_not_valid(mode)) {
         return NULL;
     }
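
For readers following the thread handoff outside of the QEMU tree, the
allocate / join / free lifecycle that the migration.c hunks introduce can
be sketched roughly as below. This is only an illustration under stated
assumptions: it uses plain POSIX threads in place of QemuThread, and
DemoState, demo_worker(), demo_start(), demo_cleanup() and demo_handoff()
are hypothetical names that do not exist in QEMU. Resetting the pointer to
NULL after freeing it is a choice made in this sketch; the patch itself
only calls g_free(s->thread).

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for MigrationState: only what the sketch needs. */
typedef struct {
    pthread_t *thread;  /* heap-allocated handle, one per phase */
    int checkpoints;    /* bumped once by each worker run */
} DemoState;

/* Hypothetical worker standing in for migration_thread / the MC thread. */
static void *demo_worker(void *opaque)
{
    DemoState *s = opaque;
    s->checkpoints++;
    return NULL;
}

/* Allocate the handle dynamically, then start a joinable worker. */
static int demo_start(DemoState *s)
{
    s->thread = calloc(1, sizeof(*s->thread));
    if (!s->thread) {
        return -1;
    }
    if (pthread_create(s->thread, NULL, demo_worker, s) != 0) {
        free(s->thread);
        s->thread = NULL;
        return -1;
    }
    return 0;
}

/* Join the worker first, then release the handle exactly once. */
static void demo_cleanup(DemoState *s)
{
    if (s->thread) {
        pthread_join(*s->thread, NULL);
        free(s->thread);
        s->thread = NULL;   /* sketch-only guard against a second free */
    }
}

/* Run two phases back to back, reusing the same state for each. */
int demo_handoff(void)
{
    DemoState s = { NULL, 0 };

    for (int phase = 0; phase < 2; phase++) {
        if (demo_start(&s) != 0) {
            return -1;
        }
        demo_cleanup(&s);
    }
    return s.checkpoints;
}
```

Calling demo_handoff() runs one worker per phase and returns how many
phases completed. Because each worker is joined before the next one
starts, the counter needs no locking in this sketch; a real handoff with
overlapping threads would.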