From patchwork Tue Mar 27 09:10:37 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Xiao Guangrong X-Patchwork-Id: 892041 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=2001:4830:134:3::11; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) header.from=gmail.com Authentication-Results: ozlabs.org; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.b="ZAAD0BPT"; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 40B2Nb1vLLz9s0m for ; Wed, 28 Mar 2018 20:18:06 +1100 (AEDT) Received: from localhost ([::1]:38184 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1f17DX-0006P9-JU for incoming@patchwork.ozlabs.org; Wed, 28 Mar 2018 05:18:03 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:56664) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1f176M-0000sj-JX for qemu-devel@nongnu.org; Wed, 28 Mar 2018 05:10:40 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1f176L-0004zo-61 for qemu-devel@nongnu.org; Wed, 28 Mar 2018 05:10:38 -0400 Received: from mail-pl0-x244.google.com ([2607:f8b0:400e:c01::244]:44892) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1f176K-0004zO-U4 for qemu-devel@nongnu.org; Wed, 28 Mar 2018 05:10:37 -0400 Received: by mail-pl0-x244.google.com with SMTP id 9-v6so1196446ple.11 for ; Wed, 28 Mar 2018 02:10:36 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=IadA7RXf8Bc1iPVuEx3SNmtkcRJ7JNpOk6YbIEwuGgY=; b=ZAAD0BPT6uqEGXKOtNyn1lrNkODhmdB7MeXdPjF1wuRf4vyR0Q2lpxTsOL1a/ExI6b x9EVbLGC6WqruN3wsKXQBzHdYCvlVIf2IqwH17IQfdwQ6VtoB7JkpYaSbPASbq60rG/N xbiqO30B/aipUxXJOjnjnGz0Mdgq9iZDmRlJKYRf3EuYEHRMUs0ByGHxkSjH7fvVIq4A AyRtbFuggC4BzZqJcOeYzxsRz+t7m4evMxkKsIC83BvVWRq8TgKUEQ0CqUQf69k+ASR9 XWrCQC+uCA8I1LY+WjOSUVK1H4gSzZijShJbojjHvTr9S2IfWL2Lc8I9IEwLvP3iypKe U6SA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=IadA7RXf8Bc1iPVuEx3SNmtkcRJ7JNpOk6YbIEwuGgY=; b=J7sn/w4ssfQDvhljbsG5YjP/bxl9McnhSDTZ1D7s9YGQ7iAIRteFcrtfC01bvyoW3f hG/nh0WBd3TEGLgeFtMJsfJHTJBoclMy5uhW4zs8/4kxw2rFGcYSr9ShZvg9UATd8V/s tN05eAfVFvMzUmwtRkcg0dOIeAdY3fAJvjRdPfNG1mcVK/8nG0AvwBmNJo9U9ScIWpxH iBLciwL7Xo6jnCdt5+p3XrNRwK/dXZnSHFrj5D6wju4bDi1kPj/iVlFUU9+k01dJZXV4 KYszvbveD1wnqxEkiLiQG4/myx8xkoqmPMjtqU+oe2TDmFFW+elV0kEwJufp+h3BG8ol 4IaQ== X-Gm-Message-State: AElRT7HxVLi/DpGF31GV+VZrb6I7VYYTNxIZeD9XMiqhBAds5lqpN4Tn Z5hM/7CuhrZi4PGAYaMIAz4= X-Google-Smtp-Source: AIpwx48gggc5V4EdtW6uR0gdtRl4LYvhFkk9C5kyekQODYNwOv2rewOJovLigUvqEihhyo6+FMn89A== X-Received: by 2002:a17:902:b7cc:: with SMTP id v12-v6mr2977255plz.237.1522228236078; Wed, 28 Mar 2018 02:10:36 -0700 (PDT) Received: from localhost.localdomain ([203.205.141.38]) by smtp.gmail.com with ESMTPSA id q9sm5590648pgs.89.2018.03.28.02.10.33 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Wed, 28 Mar 2018 02:10:35 -0700 (PDT) From: guangrong.xiao@gmail.com X-Google-Original-From: xiaoguangrong@tencent.com To: pbonzini@redhat.com, mst@redhat.com, mtosatti@redhat.com Date: Tue, 27 Mar 2018 17:10:37 +0800 Message-Id: <20180327091043.30220-5-xiaoguangrong@tencent.com> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180327091043.30220-1-xiaoguangrong@tencent.com> References: <20180327091043.30220-1-xiaoguangrong@tencent.com> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c01::244 Subject: [Qemu-devel] [PATCH v2 04/10] migration: detect compression and decompression errors X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: kvm@vger.kernel.org, Xiao Guangrong , qemu-devel@nongnu.org, peterx@redhat.com, dgilbert@redhat.com, wei.w.wang@intel.com, jiang.biao2@zte.com.cn Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" From: Xiao Guangrong Currently the page being compressed is allowed to be updated by the VM on the source QEMU, correspondingly the destination QEMU just ignores the decompression error. However, we completely miss the chance to catch real errors, then the VM is corrupted silently To make the migration more robuster, we copy the page to a buffer first to avoid it being written by VM, then detect and handle the errors of both compression and decompression errors properly Signed-off-by: Xiao Guangrong --- migration/qemu-file.c | 4 ++-- migration/ram.c | 55 +++++++++++++++++++++++++++++++++++---------------- 2 files changed, 40 insertions(+), 19 deletions(-) diff --git a/migration/qemu-file.c b/migration/qemu-file.c index e924cc23c5..a7614e8c28 100644 --- a/migration/qemu-file.c +++ b/migration/qemu-file.c @@ -710,9 +710,9 @@ ssize_t qemu_put_compression_data(QEMUFile *f, z_stream *stream, blen = qemu_compress_data(stream, f->buf + f->buf_index + sizeof(int32_t), blen, p, size); if (blen < 0) { - error_report("Compress Failed!"); - return 0; + return -1; } + qemu_put_be32(f, blen); if (f->ops->writev_buffer) { add_to_iovec(f, f->buf + f->buf_index, blen, false); diff --git a/migration/ram.c b/migration/ram.c index 6b699650ca..e85191c1cb 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -269,7 +269,10 @@ struct CompressParam { QemuCond cond; RAMBlock *block; ram_addr_t offset; + + /* internally used fields */ z_stream stream; + uint8_t *originbuf; }; typedef struct CompressParam CompressParam; @@ -278,6 +281,7 @@ struct DecompressParam { bool quit; QemuMutex mutex; QemuCond cond; + QEMUFile *file; void *des; uint8_t *compbuf; int len; @@ -302,7 +306,7 @@ static QemuMutex decomp_done_lock; static QemuCond decomp_done_cond; static int do_compress_ram_page(QEMUFile *f, z_stream *stream, RAMBlock *block, - ram_addr_t offset); + ram_addr_t offset, uint8_t *source_buf); static void *do_data_compress(void *opaque) { @@ -318,7 +322,8 @@ static void *do_data_compress(void *opaque) param->block = NULL; qemu_mutex_unlock(¶m->mutex); - do_compress_ram_page(param->file, ¶m->stream, block, offset); + do_compress_ram_page(param->file, ¶m->stream, block, offset, + param->originbuf); qemu_mutex_lock(&comp_done_lock); param->done = true; @@ -372,6 +377,7 @@ static void compress_threads_save_cleanup(void) qemu_mutex_destroy(&comp_param[i].mutex); qemu_cond_destroy(&comp_param[i].cond); deflateEnd(&comp_param[i].stream); + g_free(comp_param[i].originbuf); comp_param[i].stream.opaque = NULL; } qemu_mutex_destroy(&comp_done_lock); @@ -395,8 +401,14 @@ static int compress_threads_save_setup(void) qemu_cond_init(&comp_done_cond); qemu_mutex_init(&comp_done_lock); for (i = 0; i < thread_count; i++) { + comp_param[i].originbuf = g_try_malloc(TARGET_PAGE_SIZE); + if (!comp_param[i].originbuf) { + goto exit; + } + if (deflateInit(&comp_param[i].stream, migrate_compress_level()) != Z_OK) { + g_free(comp_param[i].originbuf); goto exit; } comp_param[i].stream.opaque = &comp_param[i]; @@ -1055,7 +1067,7 @@ static int ram_save_page(RAMState *rs, PageSearchStatus *pss, bool last_stage) } static int do_compress_ram_page(QEMUFile *f, z_stream *stream, RAMBlock *block, - ram_addr_t offset) + ram_addr_t offset, uint8_t *source_buf) { RAMState *rs = ram_state; int bytes_sent, blen; @@ -1063,7 +1075,14 @@ static int do_compress_ram_page(QEMUFile *f, z_stream *stream, RAMBlock *block, bytes_sent = save_page_header(rs, f, block, offset | RAM_SAVE_FLAG_COMPRESS_PAGE); - blen = qemu_put_compression_data(f, stream, p, TARGET_PAGE_SIZE); + + /* + * copy it to a internal buffer to avoid it being modified by VM + * so that we can catch up the error during compression and + * decompression + */ + memcpy(source_buf, p, TARGET_PAGE_SIZE); + blen = qemu_put_compression_data(f, stream, source_buf, TARGET_PAGE_SIZE); if (blen < 0) { bytes_sent = 0; qemu_file_set_error(migrate_get_current()->to_dst_file, blen); @@ -2557,7 +2576,7 @@ static void *do_data_decompress(void *opaque) DecompressParam *param = opaque; unsigned long pagesize; uint8_t *des; - int len; + int len, ret; qemu_mutex_lock(¶m->mutex); while (!param->quit) { @@ -2568,13 +2587,13 @@ static void *do_data_decompress(void *opaque) qemu_mutex_unlock(¶m->mutex); pagesize = TARGET_PAGE_SIZE; - /* qemu_uncompress_data() will return failed in some case, - * especially when the page is dirtied when doing the compression, - * it's not a problem because the dirty page will be retransferred - * and uncompress() won't break the data in other pages. - */ - qemu_uncompress_data(¶m->stream, des, pagesize, param->compbuf, - len); + + ret = qemu_uncompress_data(¶m->stream, des, pagesize, + param->compbuf, len); + if (ret < 0) { + error_report("decompress data failed"); + qemu_file_set_error(param->file, ret); + } qemu_mutex_lock(&decomp_done_lock); param->done = true; @@ -2591,12 +2610,12 @@ static void *do_data_decompress(void *opaque) return NULL; } -static void wait_for_decompress_done(void) +static int wait_for_decompress_done(QEMUFile *f) { int idx, thread_count; if (!migrate_use_compression()) { - return; + return 0; } thread_count = migrate_decompress_threads(); @@ -2607,6 +2626,7 @@ static void wait_for_decompress_done(void) } } qemu_mutex_unlock(&decomp_done_lock); + return qemu_file_get_error(f); } static void compress_threads_load_cleanup(void) @@ -2646,7 +2666,7 @@ static void compress_threads_load_cleanup(void) decomp_param = NULL; } -static int compress_threads_load_setup(void) +static int compress_threads_load_setup(QEMUFile *f) { int i, thread_count; @@ -2665,6 +2685,7 @@ static int compress_threads_load_setup(void) } decomp_param[i].stream.opaque = &decomp_param[i]; + decomp_param[i].file = f; qemu_mutex_init(&decomp_param[i].mutex); qemu_cond_init(&decomp_param[i].cond); decomp_param[i].compbuf = g_malloc0(compressBound(TARGET_PAGE_SIZE)); @@ -2719,7 +2740,7 @@ static void decompress_data_with_multi_threads(QEMUFile *f, */ static int ram_load_setup(QEMUFile *f, void *opaque) { - if (compress_threads_load_setup()) { + if (compress_threads_load_setup(f)) { return -1; } @@ -3074,7 +3095,7 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id) } } - wait_for_decompress_done(); + ret |= wait_for_decompress_done(f); rcu_read_unlock(); trace_ram_load_complete(ret, seq_iter); return ret;