From patchwork Thu Jul 15 12:50:57 2010
X-Patchwork-Submitter: Anthony Liguori
X-Patchwork-Id: 58980
From: Anthony Liguori
To: qemu-devel@nongnu.org
Cc: Kevin Wolf, Anthony Liguori, Stefan Hajnoczi
Date: Thu, 15 Jul 2010 07:50:57 -0500
Message-Id: <1279198257-23681-1-git-send-email-aliguori@us.ibm.com>
Subject: [Qemu-devel] [PATCH] Make default invocation of block drivers safer (v3)

CVE-2008-2004 described a vulnerability in QEMU whereby a malicious user
could trick the block probing code into accessing arbitrary files from
within a guest.  To mitigate this, we added an explicit format parameter
to -drive which disables block probing.

Fast forward to today, and the vast majority of users do not use this
parameter.  libvirt does not use this by default, nor does virt-manager.

Most users want block probing, so we should try to make it safer.

This patch adds some logic to the raw device which attempts to detect a
write operation to the beginning of a raw device.  If the first 4 bytes
happen to match an image file that has a backing file that we support, it
scrubs the signature to all zeros.  If a user specifies an explicit format
parameter, this behavior is disabled.

I contend that while a legitimate guest could write such a signature to
the header, we would behave incorrectly anyway upon the next invocation of
QEMU.  This simply changes the incorrect behavior to not involve a
security vulnerability.

I've tested this pretty extensively both in the positive and negative case.
I'm not 100% confident in the block layer's ability to deal with zero-sized
writes, particularly with respect to the aio functions, so some additional
eyes would be appreciated.

Even in the case of a single sector write, we have to make sure to invoke
the completion from a bottom half, so just removing the zero-sized write is
not an option.

Signed-off-by: Anthony Liguori
Acked-by: Kevin Wolf
---
v2 -> v3
 - add an assert to ensure the first iovec element is at least 512 bytes

v1 -> v2
 - be more paranoid about empty iovecs
---
 block.c     |    4 ++
 block/raw.c |  130 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 block_int.h |    1 +
 3 files changed, 135 insertions(+), 0 deletions(-)

diff --git a/block.c b/block.c
index 65cf4dc..f837876 100644
--- a/block.c
+++ b/block.c
@@ -511,6 +511,7 @@ int bdrv_open(BlockDriverState *bs, const char *filename, int flags,
               BlockDriver *drv)
 {
     int ret;
+    int probed = 0;
 
     if (flags & BDRV_O_SNAPSHOT) {
         BlockDriverState *bs1;
@@ -571,6 +572,7 @@ int bdrv_open(BlockDriverState *bs, const char *filename, int flags,
     /* Find the right image format driver */
     if (!drv) {
         drv = find_image_format(filename);
+        probed = 1;
     }
 
     if (!drv) {
@@ -584,6 +586,8 @@ int bdrv_open(BlockDriverState *bs, const char *filename, int flags,
         goto unlink_and_fail;
     }
 
+    bs->probed = probed;
+
     /* If there is a backing file, use it */
     if ((flags & BDRV_O_NO_BACKING) == 0 && bs->backing_file[0] != '\0') {
         char backing_filename[PATH_MAX];
diff --git a/block/raw.c b/block/raw.c
index 4406b8c..1414e77 100644
--- a/block/raw.c
+++ b/block/raw.c
@@ -9,15 +9,82 @@ static int raw_open(BlockDriverState *bs, int flags)
     return 0;
 }
 
+/* check for the user attempting to write something that looks like a
+   block format header to the beginning of the image and fail out.
+*/
+static int check_for_block_signature(BlockDriverState *bs, const uint8_t *buf)
+{
+    static const uint8_t signatures[][4] = {
+        { 'Q', 'F', 'I', 0xfb }, /* qcow/qcow2 */
+        { 'C', 'O', 'W', 'D' }, /* VMDK3 */
+        { 'V', 'M', 'D', 'K' }, /* VMDK4 */
+        { 'O', 'O', 'O', 'M' }, /* UML COW */
+        {}
+    };
+    int i;
+
+    for (i = 0; signatures[i][0] != 0; i++) {
+        if (memcmp(buf, signatures[i], 4) == 0) {
+            return 1;
+        }
+    }
+
+    return 0;
+}
+
+static int check_write_unsafe(BlockDriverState *bs, int64_t sector_num,
+                              const uint8_t *buf, int nb_sectors)
+{
+    /* assume that if the user specifies the format explicitly, then assume
+       that they will continue to do so and provide no safety net */
+    if (!bs->probed) {
+        return 0;
+    }
+
+    if (sector_num == 0 && nb_sectors > 0) {
+        return check_for_block_signature(bs, buf);
+    }
+
+    return 0;
+}
+
 static int raw_read(BlockDriverState *bs, int64_t sector_num,
                     uint8_t *buf, int nb_sectors)
 {
     return bdrv_read(bs->file, sector_num, buf, nb_sectors);
 }
 
+static int raw_write_scrubbed_bootsect(BlockDriverState *bs,
+                                       const uint8_t *buf)
+{
+    uint8_t bootsect[512];
+
+    /* scrub the dangerous signature */
+    memcpy(bootsect, buf, 512);
+    memset(bootsect, 0, 4);
+
+    return bdrv_write(bs->file, 0, bootsect, 1);
+}
+
 static int raw_write(BlockDriverState *bs, int64_t sector_num,
                      const uint8_t *buf, int nb_sectors)
 {
+    if (check_write_unsafe(bs, sector_num, buf, nb_sectors)) {
+        int ret;
+
+        ret = raw_write_scrubbed_bootsect(bs, buf);
+        if (ret < 0) {
+            return ret;
+        }
+
+        ret = bdrv_write(bs->file, 1, buf + 512, nb_sectors - 1);
+        if (ret < 0) {
+            return ret;
+        }
+
+        return ret + 512;
+    }
+
     return bdrv_write(bs->file, sector_num, buf, nb_sectors);
 }
 
@@ -28,10 +95,73 @@ static BlockDriverAIOCB *raw_aio_readv(BlockDriverState *bs,
     return bdrv_aio_readv(bs->file, sector_num, qiov, nb_sectors, cb, opaque);
 }
 
+typedef struct RawScrubberBounce
+{
+    BlockDriverCompletionFunc *cb;
+    void *opaque;
+    QEMUIOVector qiov;
+} RawScrubberBounce;
+
+static void raw_aio_writev_scrubbed(void *opaque, int ret)
+{
+    RawScrubberBounce *b = opaque;
+
+    if (ret < 0) {
+        b->cb(b->opaque, ret);
+    } else {
+        b->cb(b->opaque, ret + 512);
+    }
+
+    qemu_iovec_destroy(&b->qiov);
+    qemu_free(b);
+}
+
 static BlockDriverAIOCB *raw_aio_writev(BlockDriverState *bs,
     int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
     BlockDriverCompletionFunc *cb, void *opaque)
 {
+    const uint8_t *first_buf;
+    int first_buf_index = 0, i;
+
+    /* This is probably being paranoid, but handle cases of zero size
+       vectors. */
+    for (i = 0; i < qiov->niov; i++) {
+        if (qiov->iov[i].iov_len) {
+            assert(qiov->iov[i].iov_len >= 512);
+            first_buf_index = i;
+            break;
+        }
+    }
+
+    first_buf = qiov->iov[first_buf_index].iov_base;
+
+    if (check_write_unsafe(bs, sector_num, first_buf, nb_sectors)) {
+        RawScrubberBounce *b;
+        int ret;
+
+        /* write the first sector using sync I/O */
+        ret = raw_write_scrubbed_bootsect(bs, first_buf);
+        if (ret < 0) {
+            return NULL;
+        }
+
+        /* adjust request to be everything but first sector */
+
+        b = qemu_malloc(sizeof(*b));
+        b->cb = cb;
+        b->opaque = opaque;
+
+        qemu_iovec_init(&b->qiov, qiov->nalloc);
+        qemu_iovec_concat(&b->qiov, qiov, qiov->size);
+
+        b->qiov.size -= 512;
+        b->qiov.iov[first_buf_index].iov_base += 512;
+        b->qiov.iov[first_buf_index].iov_len -= 512;
+
+        return bdrv_aio_writev(bs->file, sector_num + 1, &b->qiov,
+                               nb_sectors - 1, raw_aio_writev_scrubbed, b);
+    }
+
     return bdrv_aio_writev(bs->file, sector_num, qiov, nb_sectors, cb, opaque);
 }
 
diff --git a/block_int.h b/block_int.h
index 877e1e5..96ff4cf 100644
--- a/block_int.h
+++ b/block_int.h
@@ -144,6 +144,7 @@ struct BlockDriverState {
     int encrypted; /* if true, the media is encrypted */
     int valid_key; /* if true, a valid encryption key has been set */
     int sg;        /* if true, the device is a /dev/sg* */
+    int probed;    /* if true, format was probed automatically */
     /* event callback when inserting/removing */
     void (*change_cb)(void *opaque);
     void *change_opaque;
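
For anyone who wants to poke at the scrubbing rule outside of QEMU, here is a
minimal standalone sketch of the check described in the commit message. It is
not part of the patch. The signature table is copied from the patch, but
looks_like_format_header(), copy_first_sector_scrubbed(), SECTOR_SIZE and
SIG_LEN are hypothetical names made up for this illustration, and it works on
plain memory buffers rather than a BlockDriverState.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE 512 /* hypothetical name; the patch hard-codes 512 */
#define SIG_LEN 4       /* hypothetical name; the patch hard-codes 4 */

/* Magic numbers copied from the signature table in the patch. */
static const uint8_t signatures[][SIG_LEN] = {
    { 'Q', 'F', 'I', 0xfb }, /* qcow/qcow2 */
    { 'C', 'O', 'W', 'D' },  /* VMDK3 */
    { 'V', 'M', 'D', 'K' },  /* VMDK4 */
    { 'O', 'O', 'O', 'M' },  /* UML COW */
};

/* Return 1 if the first SIG_LEN bytes of buf match a known signature. */
int looks_like_format_header(const uint8_t *buf)
{
    size_t i;

    for (i = 0; i < sizeof(signatures) / sizeof(signatures[0]); i++) {
        if (memcmp(buf, signatures[i], SIG_LEN) == 0) {
            return 1;
        }
    }
    return 0;
}

/*
 * Copy the first sector of src into dst.  If the format was probed and the
 * sector starts with a known signature, zero the signature bytes in dst.
 * Returns 1 if the signature was scrubbed, 0 otherwise.
 */
int copy_first_sector_scrubbed(uint8_t *dst, const uint8_t *src,
                               size_t len, int probed)
{
    /* same precondition the v3 assert enforces on the first iovec element */
    assert(len >= SECTOR_SIZE);

    memcpy(dst, src, SECTOR_SIZE);
    if (probed && looks_like_format_header(src)) {
        memset(dst, 0, SIG_LEN);
        return 1;
    }
    return 0;
}
```

With probed set, a sector that begins with the qcow magic comes out with its
first four bytes zeroed and the rest intact. With probed clear (explicit
format given), the sector is copied through unchanged.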