From patchwork Fri Aug 14 15:00:15 2009
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Kevin Wolf
X-Patchwork-Id: 31406
From: Kevin Wolf
To: qemu-devel@nongnu.org
Cc: Kevin Wolf
Date: Fri, 14 Aug 2009 17:00:15 +0200
Subject: [Qemu-devel] [PATCH] qcow2: Metadata preallocation
Message-Id: <1250262015-996-1-git-send-email-kwolf@redhat.com>
List-Id: qemu-devel.nongnu.org

This introduces a qemu-img create option for qcow2 which allows the
metadata to be preallocated, i.e. clusters are reserved in the refcount
table and L1/L2 tables, but no data is written to them. Metadata is
quite small, so this happens in almost no time.

Especially with qcow2 on virtio this helps to gain a bit of performance
during the initial writes. However, as soon as you create a snapshot,
we're back to the normal slow speed, obviously. So this isn't the real
fix, but kind of a cheat while we're still having trouble with qcow2 on
virtio.

Note that the option is disabled by default and needs to be specified
explicitly using qemu-img create -f qcow2 -o preallocation=metadata.

Signed-off-by: Kevin Wolf
---
 block/qcow2.c |   83 +++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 block_int.h   |    1 +
 2 files changed, 82 insertions(+), 2 deletions(-)

diff --git a/block/qcow2.c b/block/qcow2.c
index a5bf205..88e0c71 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -638,9 +638,56 @@ static int get_bits_from_size(size_t size)
     return res;
 }
 
+
+static int preallocate(BlockDriverState *bs)
+{
+    BDRVQcowState *s = bs->opaque;
+    uint64_t cluster_offset;
+    uint64_t nb_sectors;
+    uint64_t offset;
+    int num;
+    QCowL2Meta meta;
+
+    nb_sectors = bdrv_getlength(bs) >> 9;
+    offset = 0;
+
+    while (nb_sectors) {
+        num = MIN(nb_sectors, INT_MAX >> 9);
+        cluster_offset = qcow2_alloc_cluster_offset(bs, offset, 0, num, &num,
+            &meta);
+
+        if (cluster_offset == 0) {
+            return -1;
+        }
+
+        if (qcow2_alloc_cluster_link_l2(bs, cluster_offset, &meta) < 0) {
+            qcow2_free_any_clusters(bs, cluster_offset, meta.nb_clusters);
+            return -1;
+        }
+
+        /* TODO Preallocate data if requested */
+
+        nb_sectors -= num;
+        offset += num << 9;
+    }
+
+    /*
+     * It is expected that the image file is large enough to actually contain
+     * all of the allocated clusters (otherwise we get failing reads after
+     * EOF). So just write some zeros to the last sector.
+     */
+    if (cluster_offset != 0) {
+        uint8_t buf[512];
+        memset(buf, 0, 512);
+        bdrv_write(s->hd, (cluster_offset >> 9) + num - 1, buf, 1);
+    }
+
+    return 0;
+}
+
 static int qcow_create2(const char *filename, int64_t total_size,
                         const char *backing_file, const char *backing_format,
-                        int flags, size_t cluster_size)
+                        int flags, size_t cluster_size, int prealloc)
 {
     int fd, header_size, backing_filename_len, l1_size, i, shift, l2_bits;
 
@@ -762,6 +809,16 @@ static int qcow_create2(const char *filename, int64_t total_size,
     qemu_free(s->refcount_table);
     qemu_free(s->refcount_block);
     close(fd);
+
+    /* Preallocate metadata */
+    if (prealloc) {
+        BlockDriverState *bs;
+        bs = bdrv_new("");
+        bdrv_open(bs, filename, BDRV_O_CACHE_WB);
+        preallocate(bs);
+        bdrv_close(bs);
+    }
+
     return 0;
 }
 
@@ -772,6 +829,7 @@ static int qcow_create(const char *filename, QEMUOptionParameter *options)
     uint64_t sectors = 0;
     int flags = 0;
     size_t cluster_size = 65536;
+    int prealloc = 0;
 
     /* Read out options */
     while (options && options->name) {
@@ -787,12 +845,28 @@ static int qcow_create(const char *filename, QEMUOptionParameter *options)
             if (options->value.n) {
                 cluster_size = options->value.n;
             }
+        } else if (!strcmp(options->name, BLOCK_OPT_PREALLOC)) {
+            if (!options->value.s || !strcmp(options->value.s, "off")) {
+                prealloc = 0;
+            } else if (!strcmp(options->value.s, "metadata")) {
+                prealloc = 1;
+            } else {
+                fprintf(stderr, "Invalid preallocation mode: '%s'\n",
+                    options->value.s);
+                return -EINVAL;
+            }
         }
         options++;
     }
 
+    if (backing_file && prealloc) {
+        fprintf(stderr, "Backing file and preallocation cannot be used at "
+            "the same time\n");
+        return -EINVAL;
+    }
+
     return qcow_create2(filename, sectors, backing_file, backing_fmt, flags,
-        cluster_size);
+        cluster_size, prealloc);
 }
 
 static int qcow_make_empty(BlockDriverState *bs)
@@ -982,6 +1056,11 @@ static QEMUOptionParameter qcow_create_options[] = {
         .type = OPT_SIZE,
         .help = "qcow2 cluster size"
     },
+    {
+        .name = BLOCK_OPT_PREALLOC,
+        .type = OPT_STRING,
+        .help = "Preallocation mode (allowed values: off, metadata)"
+    },
     { NULL }
 };
 
diff --git a/block_int.h b/block_int.h
index 8898d91..0902fd4 100644
--- a/block_int.h
+++ b/block_int.h
@@ -37,6 +37,7 @@
 #define BLOCK_OPT_BACKING_FILE  "backing_file"
 #define BLOCK_OPT_BACKING_FMT   "backing_fmt"
 #define BLOCK_OPT_CLUSTER_SIZE  "cluster_size"
+#define BLOCK_OPT_PREALLOC      "preallocation"
 
 typedef struct AIOPool {
     void (*cancel)(BlockDriverAIOCB *acb);
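For readers following the loop in preallocate(), here is a minimal
standalone sketch of the same chunking pattern. It is not the patch's
code. It assumes 512-byte sectors, as the patch does, and replaces
qcow2_alloc_cluster_offset() with a hypothetical callback (alloc_fn) that
may grant fewer sectors than requested. The names walk_sectors() and
alloc_one_cluster() are illustrative only and do not exist in QEMU.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

#define SECTOR_BITS 9 /* 512-byte sectors, matching the ">> 9" in the patch */

/*
 * Hypothetical stand-in for the allocator: it is asked for `requested`
 * sectors at byte `offset` and returns how many it actually covered
 * (<= requested), or a value <= 0 on failure.
 */
typedef int (*alloc_fn)(uint64_t offset, int requested, void *opaque);

/*
 * Walk nb_sectors from byte offset 0 in chunks of at most INT_MAX >> 9
 * sectors, advancing by whatever the allocator granted each time.
 * Returns the number of allocator calls, or -1 on failure. On success
 * *end_offset holds the byte offset reached.
 */
static int64_t walk_sectors(uint64_t nb_sectors, alloc_fn alloc,
                            void *opaque, uint64_t *end_offset)
{
    const int max_chunk = INT_MAX >> SECTOR_BITS;
    uint64_t offset = 0;
    int64_t calls = 0;

    assert(alloc != NULL && end_offset != NULL);

    while (nb_sectors > 0) {
        /* Clamp the request so the sector count always fits in an int. */
        int num = nb_sectors < (uint64_t)max_chunk ? (int)nb_sectors
                                                   : max_chunk;
        int granted = alloc(offset, num, opaque);

        if (granted <= 0 || granted > num) {
            return -1;
        }

        nb_sectors -= (uint64_t)granted;
        offset += (uint64_t)granted << SECTOR_BITS;
        calls++;
    }

    *end_offset = offset;
    return calls;
}

/* Example allocator: grants at most 128 sectors (one 64 KiB cluster). */
static int alloc_one_cluster(uint64_t offset, int requested, void *opaque)
{
    (void)offset;
    (void)opaque;
    return requested < 128 ? requested : 128;
}
```

With a 300-sector image and alloc_one_cluster(), walk_sectors() makes
three calls (128 + 128 + 44 sectors) and ends at byte offset 153600. For
a zero-length image the loop body never runs, which is why the sketch
sets up all of its state before the loop. In the patch, cluster_offset is
declared without an initializer and is read after the loop, so that case
may deserve a second look.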