diff mbox series

qcow2: Forbid discard in qcow2 v2 images with backing files

Message ID 20200323194429.1717-1-berto@igalia.com
State New
Headers show
Series qcow2: Forbid discard in qcow2 v2 images with backing files | expand

Commit Message

Alberto Garcia March 23, 2020, 7:44 p.m. UTC
A discard request deallocates the selected clusters so they read back
as zeroes. This is done by clearing the cluster offset field and
setting QCOW_OFLAG_ZERO in the L2 entry.

This flag is however only supported when qcow_version >= 3. In older
images the cluster is simply deallocated, exposing any possible stale
data from the backing file.

Since discard is an advisory operation it's safer to simply forbid it
in this scenario.

Note that we are adding this check to qcow2_co_pdiscard() and not to
qcow2_cluster_discard() or discard_in_l2_slice() because the last
two are also used by qcow2_snapshot_create() to discard the clusters
used by the VM state. In this case there's no risk of exposing stale
data to the guest and we really want that the clusters are always
discarded.

Signed-off-by: Alberto Garcia <berto@igalia.com>
---
 block/qcow2.c              |  6 +++
 tests/qemu-iotests/060     |  5 ++-
 tests/qemu-iotests/060.out |  2 -
 tests/qemu-iotests/289     | 90 ++++++++++++++++++++++++++++++++++++++
 tests/qemu-iotests/289.out | 52 ++++++++++++++++++++++
 tests/qemu-iotests/group   |  1 +
 6 files changed, 152 insertions(+), 4 deletions(-)
 create mode 100755 tests/qemu-iotests/289
 create mode 100644 tests/qemu-iotests/289.out

Comments

Max Reitz March 24, 2020, noon UTC | #1
On 23.03.20 20:44, Alberto Garcia wrote:
> A discard request deallocates the selected clusters so they read back
> as zeroes. This is done by clearing the cluster offset field and
> setting QCOW_OFLAG_ZERO in the L2 entry.
> 
> This flag is however only supported when qcow_version >= 3. In older
> images the cluster is simply deallocated, exposing any possible stale
> data from the backing file.
> 
> Since discard is an advisory operation it's safer to simply forbid it
> in this scenario.
> 
> Note that we are adding this check to qcow2_co_pdiscard() and not to
> qcow2_cluster_discard() or discard_in_l2_slice() because the last
> two are also used by qcow2_snapshot_create() to discard the clusters
> used by the VM state. In this case there's no risk of exposing stale
> data to the guest and we really want that the clusters are always
> discarded.

Sounds good to me.

> Signed-off-by: Alberto Garcia <berto@igalia.com>
> ---
>  block/qcow2.c              |  6 +++
>  tests/qemu-iotests/060     |  5 ++-
>  tests/qemu-iotests/060.out |  2 -
>  tests/qemu-iotests/289     | 90 ++++++++++++++++++++++++++++++++++++++
>  tests/qemu-iotests/289.out | 52 ++++++++++++++++++++++
>  tests/qemu-iotests/group   |  1 +
>  6 files changed, 152 insertions(+), 4 deletions(-)
>  create mode 100755 tests/qemu-iotests/289
>  create mode 100644 tests/qemu-iotests/289.out
> 
> diff --git a/block/qcow2.c b/block/qcow2.c
> index d44b45633d..7bb7e392e1 100644
> --- a/block/qcow2.c
> +++ b/block/qcow2.c
> @@ -3763,6 +3763,12 @@ static coroutine_fn int qcow2_co_pdiscard(BlockDriverState *bs,
>      int ret;
>      BDRVQcow2State *s = bs->opaque;
>  
> +    /* If the image does not support QCOW_OFLAG_ZERO then discarding
> +     * clusters could expose stale data from the backing file. */
> +    if (s->qcow_version < 3 && bs->backing) {
> +        return -ENOTSUP;
> +    }
> +
>      if (!QEMU_IS_ALIGNED(offset | bytes, s->cluster_size)) {
>          assert(bytes < s->cluster_size);
>          /* Ignore partial clusters, except for the special case of the
> diff --git a/tests/qemu-iotests/060 b/tests/qemu-iotests/060
> index 043f12904a..4a4fdfb1e1 100755
> --- a/tests/qemu-iotests/060
> +++ b/tests/qemu-iotests/060
> @@ -167,9 +167,10 @@ _make_test_img -o 'compat=0.10' -b "$BACKING_IMG" 1G

More context: This image is created with -o 'compat=0.10', just because
a discard on such an image would result in the cluster being freed.  We
can drop that compat=0.10 bit now.

>  # Write two clusters, the second one enforces creation of an L2 table after
>  # the first data cluster.
>  $QEMU_IO -c 'write 0k 64k' -c 'write 512M 64k' "$TEST_IMG" | _filter_qemu_io
> -# Discard the first cluster. This cluster will soon enough be reallocated and
> +# Free the first cluster. This cluster will soon enough be reallocated and
>  # used for COW.
> -$QEMU_IO -c 'discard 0k 64k' "$TEST_IMG" | _filter_qemu_io
> +poke_file "$TEST_IMG" '262144' "\x00\x00\x00\x00\x00\x00\x00\x00" # 0x40000 - L2 entry
> +poke_file "$TEST_IMG" '131082' "\x00\x00" # 0x2000a - Refcount entry
>  # Now, corrupt the image by marking the second L2 table cluster as free.
>  poke_file "$TEST_IMG" '131084' "\x00\x00" # 0x2000c
>  # Start a write operation requiring COW on the image stopping it right before

[...]

> diff --git a/tests/qemu-iotests/289 b/tests/qemu-iotests/289
> new file mode 100755
> index 0000000000..13b4984721
> --- /dev/null
> +++ b/tests/qemu-iotests/289

[...]

> +_cleanup()
> +{
> +    _cleanup_test_img
> +    rm -f "$TEST_IMG.backing"

I’d call the image $TEST_IMG.base so _cleanup_test_img picks up on it.
(rm-ing test images is also wrong, because with external data files,
there will be more than one file.  It doesn’t matter here anyway because
this test doesn’t support external data files, but, well.)

> +}
> +trap "_cleanup; exit \$status" 0 1 2 3 15
> +
> +# get standard environment, filters and checks
> +. ./common.rc
> +. ./common.filter
> +
> +_supported_fmt qcow2
> +_supported_proto file
> +_supported_os Linux

I’d mark the compat option unsupported because this test will ignore it.
 Furthermore, the refcount_bits and data_file options are really
unsupported, because they won’t work with compat=0.10.

The test itself looks good.

Max
Eric Blake March 24, 2020, 2:46 p.m. UTC | #2
On 3/23/20 2:44 PM, Alberto Garcia wrote:
> A discard request deallocates the selected clusters so they read back
> as zeroes. This is done by clearing the cluster offset field and
> setting QCOW_OFLAG_ZERO in the L2 entry.
> 
> This flag is however only supported when qcow_version >= 3. In older
> images the cluster is simply deallocated, exposing any possible stale
> data from the backing file.
> 
> Since discard is an advisory operation it's safer to simply forbid it
> in this scenario.
> 
> Note that we are adding this check to qcow2_co_pdiscard() and not to
> qcow2_cluster_discard() or discard_in_l2_slice() because the last
> two are also used by qcow2_snapshot_create() to discard the clusters
> used by the VM state. In this case there's no risk of exposing stale
> data to the guest and we really want that the clusters are always
> discarded.
> 
> Signed-off-by: Alberto Garcia <berto@igalia.com>
> ---
>   block/qcow2.c              |  6 +++
>   tests/qemu-iotests/060     |  5 ++-
>   tests/qemu-iotests/060.out |  2 -
>   tests/qemu-iotests/289     | 90 ++++++++++++++++++++++++++++++++++++++
>   tests/qemu-iotests/289.out | 52 ++++++++++++++++++++++
>   tests/qemu-iotests/group   |  1 +
>   6 files changed, 152 insertions(+), 4 deletions(-)
>   create mode 100755 tests/qemu-iotests/289
>   create mode 100644 tests/qemu-iotests/289.out

The actual fix is much smaller than the iotest fallout ;)

> +++ b/tests/qemu-iotests/060
> @@ -167,9 +167,10 @@ _make_test_img -o 'compat=0.10' -b "$BACKING_IMG" 1G
>   # Write two clusters, the second one enforces creation of an L2 table after
>   # the first data cluster.
>   $QEMU_IO -c 'write 0k 64k' -c 'write 512M 64k' "$TEST_IMG" | _filter_qemu_io
> -# Discard the first cluster. This cluster will soon enough be reallocated and
> +# Free the first cluster. This cluster will soon enough be reallocated and
>   # used for COW.
> -$QEMU_IO -c 'discard 0k 64k' "$TEST_IMG" | _filter_qemu_io
> +poke_file "$TEST_IMG" '262144' "\x00\x00\x00\x00\x00\x00\x00\x00" # 0x40000 - L2 entry
> +poke_file "$TEST_IMG" '131082' "\x00\x00" # 0x2000a - Refcount entry

Instead of writing '262144' ... # 0x40000, you could write $((0x40000)) 
in-place.  Similarly for 131082 vs. 0x2000a.

Also, Max has pending patches for adding poke_file_be; if those land 
first, this becomes simpler as:

poke_file_be "$TEST_IMG" $((0x40000)) 8 0 # L2 entry
poke_file_be "$TEST_IMG" $((0x2000a)) 2 0 # Refcount entry

> +++ b/tests/qemu-iotests/289
> @@ -0,0 +1,90 @@
> +#!/usr/bin/env bash
> +#
> +# Test how 'qemu-io -c discard' behaves on v2 and v3 qcow2 images
At any rate, the new test looks reasonable to me. I see you have other 
review comments for improving it, with thos in, you can add

Reviewed-by: Eric Blake <eblake@redhat.com>
Alberto Garcia March 24, 2020, 3:13 p.m. UTC | #3
On Tue 24 Mar 2020 03:46:07 PM CET, Eric Blake wrote:
>> -$QEMU_IO -c 'discard 0k 64k' "$TEST_IMG" | _filter_qemu_io
>> +poke_file "$TEST_IMG" '262144' "\x00\x00\x00\x00\x00\x00\x00\x00" # 0x40000 - L2 entry
>> +poke_file "$TEST_IMG" '131082' "\x00\x00" # 0x2000a - Refcount entry
>
> Instead of writing '262144' ... # 0x40000, you could write
> $((0x40000)) in-place.  Similarly for 131082 vs. 0x2000a.

The exiting poke_file line in that test was using base 10 so I decided
to use it too for consistency.

I actually realized that $rb_offset and $l2_offset are defined, so I
could use those too.

> Also, Max has pending patches for adding poke_file_be; if those land
> first, this becomes simpler as:
>
> poke_file_be "$TEST_IMG" $((0x40000)) 8 0 # L2 entry
> poke_file_be "$TEST_IMG" $((0x2000a)) 2 0 # Refcount entry

I'm fine if those lines are changed when the patch is committed.

Berto
diff mbox series

Patch

diff --git a/block/qcow2.c b/block/qcow2.c
index d44b45633d..7bb7e392e1 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -3763,6 +3763,12 @@  static coroutine_fn int qcow2_co_pdiscard(BlockDriverState *bs,
     int ret;
     BDRVQcow2State *s = bs->opaque;
 
+    /* If the image does not support QCOW_OFLAG_ZERO then discarding
+     * clusters could expose stale data from the backing file. */
+    if (s->qcow_version < 3 && bs->backing) {
+        return -ENOTSUP;
+    }
+
     if (!QEMU_IS_ALIGNED(offset | bytes, s->cluster_size)) {
         assert(bytes < s->cluster_size);
         /* Ignore partial clusters, except for the special case of the
diff --git a/tests/qemu-iotests/060 b/tests/qemu-iotests/060
index 043f12904a..4a4fdfb1e1 100755
--- a/tests/qemu-iotests/060
+++ b/tests/qemu-iotests/060
@@ -167,9 +167,10 @@  _make_test_img -o 'compat=0.10' -b "$BACKING_IMG" 1G
 # Write two clusters, the second one enforces creation of an L2 table after
 # the first data cluster.
 $QEMU_IO -c 'write 0k 64k' -c 'write 512M 64k' "$TEST_IMG" | _filter_qemu_io
-# Discard the first cluster. This cluster will soon enough be reallocated and
+# Free the first cluster. This cluster will soon enough be reallocated and
 # used for COW.
-$QEMU_IO -c 'discard 0k 64k' "$TEST_IMG" | _filter_qemu_io
+poke_file "$TEST_IMG" '262144' "\x00\x00\x00\x00\x00\x00\x00\x00" # 0x40000 - L2 entry
+poke_file "$TEST_IMG" '131082' "\x00\x00" # 0x2000a - Refcount entry
 # Now, corrupt the image by marking the second L2 table cluster as free.
 poke_file "$TEST_IMG" '131084' "\x00\x00" # 0x2000c
 # Start a write operation requiring COW on the image stopping it right before
diff --git a/tests/qemu-iotests/060.out b/tests/qemu-iotests/060.out
index d27692a33c..09caaea865 100644
--- a/tests/qemu-iotests/060.out
+++ b/tests/qemu-iotests/060.out
@@ -105,8 +105,6 @@  wrote 65536/65536 bytes at offset 0
 64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 wrote 65536/65536 bytes at offset 536870912
 64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
-discard 65536/65536 bytes at offset 0
-64 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
 qcow2: Marking image as corrupt: Preventing invalid write on metadata (overlaps with active L2 table); further corruption events will be suppressed
 blkdebug: Suspended request '0'
 write failed: Input/output error
diff --git a/tests/qemu-iotests/289 b/tests/qemu-iotests/289
new file mode 100755
index 0000000000..13b4984721
--- /dev/null
+++ b/tests/qemu-iotests/289
@@ -0,0 +1,90 @@ 
+#!/usr/bin/env bash
+#
+# Test how 'qemu-io -c discard' behaves on v2 and v3 qcow2 images
+#
+# Copyright (C) 2020 Igalia, S.L.
+# Author: Alberto Garcia <berto@igalia.com>
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+#
+
+# creator
+owner=berto@igalia.com
+
+seq=`basename $0`
+echo "QA output created by $seq"
+
+status=1    # failure is the default!
+
+_cleanup()
+{
+    _cleanup_test_img
+    rm -f "$TEST_IMG.backing"
+}
+trap "_cleanup; exit \$status" 0 1 2 3 15
+
+# get standard environment, filters and checks
+. ./common.rc
+. ./common.filter
+
+_supported_fmt qcow2
+_supported_proto file
+_supported_os Linux
+
+echo
+echo "### Test 'qemu-io -c discard' on a QCOW2 image without a backing file"
+echo
+for qcow2_compat in 0.10 1.1; do
+    echo "# Create an image with compat=$qcow2_compat without a backing file"
+    _make_test_img -o "compat=$qcow2_compat" 128k
+
+    echo "# Fill all clusters with data and then discard them"
+    $QEMU_IO -c 'write -P 0x01 0 128k' "$TEST_IMG" | _filter_qemu_io
+    $QEMU_IO -c 'discard 0 128k' "$TEST_IMG" | _filter_qemu_io
+
+    echo "# Read the data from the discarded clusters"
+    $QEMU_IO -c 'read -P 0x00 0 128k' "$TEST_IMG" | _filter_qemu_io
+done
+
+echo
+echo "### Test 'qemu-io -c discard' on a QCOW2 image with a backing file"
+echo
+
+echo "# Create a backing image and fill it with data"
+TEST_IMG="$TEST_IMG.backing" _make_test_img 128k
+$QEMU_IO -c 'write -P 0xff 0 128k' "$TEST_IMG.backing" | _filter_qemu_io
+
+for qcow2_compat in 0.10 1.1; do
+    echo "# Create an image with compat=$qcow2_compat and a backing file"
+    _make_test_img -o "compat=$qcow2_compat" -b "$TEST_IMG.backing"
+
+    echo "# Fill all clusters with data and then discard them"
+    $QEMU_IO -c 'write -P 0x01 0 128k' "$TEST_IMG" | _filter_qemu_io
+    $QEMU_IO -c 'discard 0 128k' "$TEST_IMG" | _filter_qemu_io
+
+    echo "# Read the data from the discarded clusters"
+    if [ "$qcow2_compat" = "1.1" ]; then
+        # In qcow2 v3 clusters are zeroed (with QCOW_OFLAG_ZERO)
+        $QEMU_IO -c 'read -P 0x00 0 128k' "$TEST_IMG" | _filter_qemu_io
+    else
+        # In qcow2 v2 if there's a backing image we cannot zero the clusters
+        # without exposing the backing file data so discard does nothing
+        $QEMU_IO -c 'read -P 0x01 0 128k' "$TEST_IMG" | _filter_qemu_io
+    fi
+done
+
+# success, all done
+echo "*** done"
+rm -f $seq.full
+status=0
diff --git a/tests/qemu-iotests/289.out b/tests/qemu-iotests/289.out
new file mode 100644
index 0000000000..dcd82c6d07
--- /dev/null
+++ b/tests/qemu-iotests/289.out
@@ -0,0 +1,52 @@ 
+QA output created by 289
+
+### Test 'qemu-io -c discard' on a QCOW2 image without a backing file
+
+# Create an image with compat=0.10 without a backing file
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=131072
+# Fill all clusters with data and then discard them
+wrote 131072/131072 bytes at offset 0
+128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+discard 131072/131072 bytes at offset 0
+128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+# Read the data from the discarded clusters
+read 131072/131072 bytes at offset 0
+128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+# Create an image with compat=1.1 without a backing file
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=131072
+# Fill all clusters with data and then discard them
+wrote 131072/131072 bytes at offset 0
+128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+discard 131072/131072 bytes at offset 0
+128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+# Read the data from the discarded clusters
+read 131072/131072 bytes at offset 0
+128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+
+### Test 'qemu-io -c discard' on a QCOW2 image with a backing file
+
+# Create a backing image and fill it with data
+Formatting 'TEST_DIR/t.IMGFMT.backing', fmt=IMGFMT size=131072
+wrote 131072/131072 bytes at offset 0
+128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+# Create an image with compat=0.10 and a backing file
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=131072 backing_file=TEST_DIR/t.IMGFMT.backing
+# Fill all clusters with data and then discard them
+wrote 131072/131072 bytes at offset 0
+128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+discard 131072/131072 bytes at offset 0
+128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+# Read the data from the discarded clusters
+read 131072/131072 bytes at offset 0
+128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+# Create an image with compat=1.1 and a backing file
+Formatting 'TEST_DIR/t.IMGFMT', fmt=IMGFMT size=131072 backing_file=TEST_DIR/t.IMGFMT.backing
+# Fill all clusters with data and then discard them
+wrote 131072/131072 bytes at offset 0
+128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+discard 131072/131072 bytes at offset 0
+128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+# Read the data from the discarded clusters
+read 131072/131072 bytes at offset 0
+128 KiB, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
+*** done
diff --git a/tests/qemu-iotests/group b/tests/qemu-iotests/group
index ec2b2302e5..891b3ce858 100644
--- a/tests/qemu-iotests/group
+++ b/tests/qemu-iotests/group
@@ -295,3 +295,4 @@ 
 284 rw
 286 rw quick
 288 quick
+289 rw auto quick