From patchwork Thu Feb 21 11:27:47 2013
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Dietmar Maurer <dietmar@proxmox.com>
X-Patchwork-Id: 222264
Return-Path: <qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org>
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Received: from lists.gnu.org (lists.gnu.org [208.118.235.17])
	(using TLSv1 with cipher AES256-SHA (256/256 bits))
	(Client did not present a certificate)
	by ozlabs.org (Postfix) with ESMTPS id B7A8F2C0082
	for <incoming@patchwork.ozlabs.org>;
	Thu, 21 Feb 2013 22:28:44 +1100 (EST)
Received: from localhost ([::1]:54674 helo=lists.gnu.org)
	by lists.gnu.org with esmtp (Exim 4.71) (envelope-from
	<qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org>)
	id 1U8UKQ-0004SJ-Lz
	for incoming@patchwork.ozlabs.org; Thu, 21 Feb 2013 06:28:42 -0500
Received: from eggs.gnu.org ([208.118.235.92]:35147)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <dietmar@proxmox.com>) id 1U8UJm-0003fO-Nh
	for qemu-devel@nongnu.org; Thu, 21 Feb 2013 06:28:12 -0500
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <dietmar@proxmox.com>) id 1U8UJh-0001qO-Dv
	for qemu-devel@nongnu.org; Thu, 21 Feb 2013 06:28:02 -0500
Received: from www.maurer-it.com ([213.129.239.114]:52132
	helo=maui.maurer-it.com) by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <dietmar@proxmox.com>) id 1U8UJh-0001oy-1d
	for qemu-devel@nongnu.org; Thu, 21 Feb 2013 06:27:57 -0500
Received: by maui.maurer-it.com (Postfix, from userid 0)
	id 3CC413B0855; Thu, 21 Feb 2013 12:27:55 +0100 (CET)
From: Dietmar Maurer <dietmar@proxmox.com>
To: qemu-devel@nongnu.org
Date: Thu, 21 Feb 2013 12:27:47 +0100
Message-Id: <1361446072-601980-2-git-send-email-dietmar@proxmox.com>
X-Mailer: git-send-email 1.7.2.5
In-Reply-To: <1361446072-601980-1-git-send-email-dietmar@proxmox.com>
References: <1361446072-601980-1-git-send-email-dietmar@proxmox.com>
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 3.x
X-Received-From: 213.129.239.114
Cc: Dietmar Maurer <dietmar@proxmox.com>
Subject: [Qemu-devel] [PATCH v5 1/6] add documenation for new backup
	framework
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org
Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org

Signed-off-by: Dietmar Maurer <dietmar@proxmox.com>
---
 docs/backup.txt |  116 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 116 insertions(+), 0 deletions(-)
 create mode 100644 docs/backup.txt

diff --git a/docs/backup.txt b/docs/backup.txt
new file mode 100644
index 0000000..927d787
--- /dev/null
+++ b/docs/backup.txt
@@ -0,0 +1,116 @@
+Efficient VM backup for qemu
+
+=Requirements=
+
+* Backup to a single archive file
+* Backup needs to contain all data to restore VM (full backup)
+* Do not depend on storage type or image format
+* Avoid use of temporary storage
+* store sparse images efficiently
+
+=Introduction=
+
+Most VM backup solutions use some kind of snapshot to get a consistent
+VM view at a specific point in time. For example, we previously used
+LVM to create a snapshot of all used VM images, which are then copied
+into a tar file.
+
+That basically means that any data written during backup involve
+considerable overhead. For LVM we get the following steps:
+
+1.) read original data (VM write)
+2.) write original data into snapshot (VM write)
+3.) write new data (VM write)
+4.) read data from snapshot (backup)
+5.) write data from snapshot into tar file (backup)
+
+Another approach to backup VM images is to create a new qcow2 image
+which use the old image as base. During backup, writes are redirected
+to the new image, so the old image represents a 'snapshot'. After
+backup, data need to be copied back from new image into the old
+one (commit). So a simple write during backup triggers the following
+steps:
+
+1.) write new data to new image (VM write)
+2.) read data from old image (backup)
+3.) write data from old image into tar file (backup)
+
+4.) read data from new image (commit)
+5.) write data to old image (commit)
+
+This is in fact the same overhead as before. Other tools like qemu
+livebackup produces similar overhead (2 reads, 3 writes).
+
+Some storage types/formats supports internal snapshots using some kind
+of reference counting (rados, sheepdog, dm-thin, qcow2). It would be possible
+to use that for backups, but for now we want to be storage-independent.
+
+=Make it more efficient=
+
+The be more efficient, we simply need to avoid unnecessary steps. The
+following steps are always required:
+
+1.) read old data before it gets overwritten
+2.) write that data into the backup archive
+3.) write new data (VM write)
+
+As you can see, this involves only one read, and two writes.
+
+To make that work, our backup archive need to be able to store image
+data 'out of order'. It is important to notice that this will not work
+with traditional archive formats like tar.
+
+During backup we simply intercept writes, then read existing data and
+store that directly into the archive. After that we can continue the
+write.
+
+==Advantages==
+
+* very good performance (1 read, 2 writes)
+* works on any storage type and image format.
+* avoid usage of temporary storage
+* we can define a new and simple archive format, which is able to
+  store sparse files efficiently.
+
+Note: Storing sparse files is a mess with existing archive
+formats. For example, tar requires information about holes at the
+beginning of the archive.
+
+==Disadvantages==
+
+* we need to define a new archive format
+
+Note: Most existing archive formats are optimized to store small files
+including file attributes. We simply do not need that for VM archives.
+
+* archive contains data 'out of order'
+
+If you want to access image data in sequential order, you need to
+re-order archive data. It would be possible to to that on the fly,
+using temporary files.
+
+Fortunately, a normal restore/extract works perfectly with 'out of
+order' data, because the target files are seekable.
+
+* slow backup storage can slow down VM during backup
+
+It is important to note that we only do sequential writes to the
+backup storage. Furthermore one can compress the backup stream. IMHO,
+it is better to slow down the VM a bit. All other solutions creates
+large amounts of temporary data during backup.
+
+=Archive format requirements=
+
+The basic requirement for such new format is that we can store image
+date 'out of order'. It is also very likely that we have less than 256
+drives/images per VM, and we want to be able to store VM configuration
+files.
+
+We have defined a very simply format with those properties, see:
+
+docs/specs/vma_spec.txt
+
+Please let us know if you know an existing format which provides the
+same functionality.
+
+