diff mbox series

iotests/MRCE: Write data to source

Message ID 20211223165308.103793-1-hreitz@redhat.com
State New
Headers show
Series iotests/MRCE: Write data to source | expand

Commit Message

Hanna Czenczek Dec. 23, 2021, 4:53 p.m. UTC
This test assumes that mirror flushes the source when entering the READY
state, and that the format level will pass that flush on to the protocol
level (where we intercept it with blkdebug).

However, apparently that does not happen when using a VMDK image with
zeroed_grain=on, which actually is the default set by testenv.py.  Right
now, Python tests ignore IMGOPTS, though, so this has no effect; but
Vladimir has a series that will change this, so we need to fix this test
before that series lands.

We can fix it by writing data to the source before we start the mirror
job; apparently that makes the (VMDK) format layer change its mind and
pass on the pre-READY flush to the protocol level, so the test passes
again.  (I presume, without any data written, mirror just does a 64M
zero write on the target, which VMDK with zeroed_grain=on basically just
ignores.)

Without this, we do not get a flush, and so blkdebug only sees a single
flush at the end of the job instead of two, and therefore does not
inject an error, which makes the block job complete instead of raising
an error.

Signed-off-by: Hanna Reitz <hreitz@redhat.com>
---
As hinted at above, I think this should be merged before Vladimir's
"iotests: support zstd" series, or said test fails for me with VMDK.
(At least on one system, not the other...  Would be too easy otherwise,
obviously.)
---
 tests/qemu-iotests/tests/mirror-ready-cancel-error | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

Comments

Vladimir Sementsov-Ogievskiy Dec. 23, 2021, 5:50 p.m. UTC | #1
23.12.2021 19:53, Hanna Reitz wrote:
> This test assumes that mirror flushes the source when entering the READY
> state, and that the format level will pass that flush on to the protocol
> level (where we intercept it with blkdebug).
> 
> However, apparently that does not happen when using a VMDK image with
> zeroed_grain=on, which actually is the default set by testenv.py.  Right
> now, Python tests ignore IMGOPTS, though, so this has no effect; but
> Vladimir has a series that will change this, so we need to fix this test
> before that series lands.
> 
> We can fix it by writing data to the source before we start the mirror
> job; apparently that makes the (VMDK) format layer change its mind and
> pass on the pre-READY flush to the protocol level, so the test passes
> again.  (I presume, without any data written, mirror just does a 64M
> zero write on the target, which VMDK with zeroed_grain=on basically just
> ignores.)
> 
> Without this, we do not get a flush, and so blkdebug only sees a single
> flush at the end of the job instead of two, and therefore does not
> inject an error, which makes the block job complete instead of raising
> an error.
> 
> Signed-off-by: Hanna Reitz <hreitz@redhat.com>

Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>

Hmm, relying on flush requests sequence doesn't seem reliable..
Maybe, it's better to insert blockdev-add filter after JOB_READY? blockdev-reopen should work now for "file" link changing.

> ---
> As hinted at above, I think this should be merged before Vladimir's
> "iotests: support zstd" series, or said test fails for me with VMDK.
> (At least on one system, not the other...  Would be too easy otherwise,
> obviously.)
> ---
>   tests/qemu-iotests/tests/mirror-ready-cancel-error | 7 ++++++-
>   1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/tests/qemu-iotests/tests/mirror-ready-cancel-error b/tests/qemu-iotests/tests/mirror-ready-cancel-error
> index f2dc88881f..770ffca379 100755
> --- a/tests/qemu-iotests/tests/mirror-ready-cancel-error
> +++ b/tests/qemu-iotests/tests/mirror-ready-cancel-error
> @@ -36,6 +36,11 @@ class TestMirrorReadyCancelError(iotests.QMPTestCase):
>           assert iotests.qemu_img_create('-f', iotests.imgfmt, target,
>                                          str(image_size)) == 0
>   
> +        # Ensure that mirror will copy something before READY so the
> +        # target format layer will forward the pre-READY flush to its
> +        # file child
> +        assert iotests.qemu_io_silent('-c', 'write -P 1 0 64k', source) == 0
> +
>           self.vm = iotests.VM()
>           self.vm.launch()
>   
> @@ -97,7 +102,7 @@ class TestMirrorReadyCancelError(iotests.QMPTestCase):
>           # Write something so will not leave the job immediately, but
>           # flush first (which will fail, thanks to blkdebug)
>           res = self.vm.qmp('human-monitor-command',
> -                          command_line='qemu-io mirror-top "write 0 64k"')
> +                          command_line='qemu-io mirror-top "write -P 2 0 64k"')
>           self.assert_qmp(res, 'return', '')
>   
>           # Drain status change events
>
Vladimir Sementsov-Ogievskiy Dec. 23, 2021, 5:51 p.m. UTC | #2
23.12.2021 20:50, Vladimir Sementsov-Ogievskiy wrote:
> 
> Hmm, relying on flush requests sequence doesn't seem reliable..
> Maybe, it's better to insert blockdev-add filter after JOB_READY? blockdev-reopen should work now for "file" link changing.

s/blockdev-add/blkdebug/ :)
Hanna Czenczek Jan. 3, 2022, 3:41 p.m. UTC | #3
On 23.12.21 18:50, Vladimir Sementsov-Ogievskiy wrote:
> 23.12.2021 19:53, Hanna Reitz wrote:
>> This test assumes that mirror flushes the source when entering the READY
>> state, and that the format level will pass that flush on to the protocol
>> level (where we intercept it with blkdebug).
>>
>> However, apparently that does not happen when using a VMDK image with
>> zeroed_grain=on, which actually is the default set by testenv.py.  Right
>> now, Python tests ignore IMGOPTS, though, so this has no effect; but
>> Vladimir has a series that will change this, so we need to fix this test
>> before that series lands.
>>
>> We can fix it by writing data to the source before we start the mirror
>> job; apparently that makes the (VMDK) format layer change its mind and
>> pass on the pre-READY flush to the protocol level, so the test passes
>> again.  (I presume, without any data written, mirror just does a 64M
>> zero write on the target, which VMDK with zeroed_grain=on basically just
>> ignores.)
>>
>> Without this, we do not get a flush, and so blkdebug only sees a single
>> flush at the end of the job instead of two, and therefore does not
>> inject an error, which makes the block job complete instead of raising
>> an error.
>>
>> Signed-off-by: Hanna Reitz <hreitz@redhat.com>
>
> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
>
> Hmm, relying on flush requests sequence doesn't seem reliable..
> Maybe, it's better to insert blockdev-add filter after JOB_READY? 
> blockdev-reopen should work now for "file" link changing.

Oh, yes, that would be nice, good idea!

But, on the other hand...  I tried, but unfortunately blockdev-reopen 
doesn’t seem to work that well with vmdk in particular:

$ ./qemu-img create -f vmdk /tmp/test.vmdk 64M
Formatting '/tmp/test.vmdk', fmt=vmdk size=67108864 compat6=off 
hwversion=undefined

$ ./qemu-system-x86_64 \
     -qmp stdio \
     -blockdev file,node-name=file,filename=/tmp/test.vmdk \
     -blockdev vmdk,node-name=vmdk,file=file \
     -blockdev raw,node-name=raw,file=file \
     <<EOF
{"execute": "qmp_capabilities"}
{"execute": "blockdev-reopen",
  "arguments": {
    "options": [{
      "node-name": "vmdk",
      "driver": "vmdk",
      "file": "raw"}]}}
{"execute": "quit"}
EOF
{"QMP": {"version": {"qemu": {"micro": 50, "minor": 2, "major": 6}, 
"package": "v6.2.0-428-g572068bc87"}, "capabilities": ["oob"]}}
{"return": {}}
{"return": {}}
{"return": {}}
{"timestamp": {"seconds": 1641223501, "microseconds": 865524}, "event": 
"SHUTDOWN", "data": {"guest": false, "reason": "host-qmp-quit"}}
[1]    21585 segmentation fault (core dumped)  ./qemu-system-x86_64 -qmp 
stdio -blockdev  -blockdev  -blockdev  <<<''


I don’t quite know what’s wrong, seems to me like some strong reference 
may be missing:

(gdb) bt
#0  0x00005643a784bb93 in bdrv_unset_inherits_from 
(root=root@entry=0x5643a898d990, child=child@entry=0x5643a89856c0, 
tran=tran@entry=0x0) at ../block.c:3172
#1  0x00005643a78524d0 in bdrv_unref_child (child=0x5643a89856c0, 
parent=0x5643a898d990) at ../block.c:3199
#2  bdrv_unref_child (parent=parent@entry=0x5643a898d990, 
child=0x5643a89856c0) at ../block.c:3193
#3  0x00005643a78b0c31 in vmdk_free_extents (bs=0x5643a898d990) at 
../block/vmdk.c:277
#4  0x00005643a78b0c5a in vmdk_close (bs=<optimized out>) at 
../block/vmdk.c:2810
#5  0x00005643a785260c in bdrv_close (bs=0x5643a898d990) at ../block.c:4790
#6  bdrv_delete (bs=<optimized out>) at ../block.c:5248
#7  bdrv_unref (bs=bs@entry=0x5643a898d990) at ../block.c:6675
#8  0x00005643a78436c5 in blockdev_close_all_bdrv_states () at 
../blockdev.c:675
#9  0x00005643a784f24f in bdrv_close_all () at ../block.c:4846
#10 0x00005643a77032de in qemu_cleanup () at ../softmmu/runstate.c:816
#11 0x00005643a7412823 in main (argc=<optimized out>, argv=<optimized 
out>, envp=<optimized out>) at ../softmmu/main.c:51
(gdb) p *child->bs
Cannot access memory at address 0x5646ccad4005


Valgrind points to vmdk_free_extents(), which makes me believe that the 
problem is that changing vmdk’s file child would need some cooperation 
from it to update its s->extents[...].file pointers:

==23055== Invalid read of size 8
==23055==    at 0x194CD0: bdrv_unset_inherits_from (block.c:3172)
==23055==    by 0x19B60F: bdrv_unref_child (block.c:3199)
==23055==    by 0x19B60F: bdrv_unref_child (block.c:3193)
==23055==    by 0x1F9D70: vmdk_free_extents (vmdk.c:277)
==23055==    by 0x1F9D99: vmdk_close (vmdk.c:2810)
==23055==    by 0x19B74B: bdrv_close (block.c:4790)
==23055==    by 0x19B74B: bdrv_delete (block.c:5248)
==23055==    by 0x19B74B: bdrv_unref (block.c:6675)
==23055==    by 0x18C804: blockdev_close_all_bdrv_states (blockdev.c:675)
==23055==    by 0x19838E: bdrv_close_all (block.c:4846)
==23055==    by 0x17F235: main (qemu-storage-daemon.c:354)
==23055==  Address 0x11259770 is 0 bytes inside a block of size 96 free'd
==23055==    at 0x48440E4: free (vg_replace_malloc.c:872)
==23055==    by 0x4AE0DAC: g_free (gmem.c:199)
==23055==    by 0x2ACE21: tran_commit (transactions.c:87)
==23055==    by 0x19D109: bdrv_reopen_multiple (block.c:4325)
==23055==    by 0x18FCB7: qmp_blockdev_reopen (blockdev.c:3599)
==23055==    by 0x272B0D: qmp_marshal_blockdev_reopen 
(qapi-commands-block-core.c:1086)
==23055==    by 0x295238: do_qmp_dispatch_bh (qmp-dispatch.c:129)
==23055==    by 0x2AD8A3: aio_bh_call (async.c:141)
==23055==    by 0x2AD8A3: aio_bh_poll (async.c:169)
==23055==    by 0x29E61D: aio_dispatch (aio-posix.c:381)
==23055==    by 0x2AD4ED: aio_ctx_dispatch (async.c:311)
==23055==    by 0x4ADF12F: UnknownInlinedFun (gmain.c:3381)
==23055==    by 0x4ADF12F: g_main_context_dispatch (gmain.c:4099)
==23055==    by 0x2B8E07: glib_pollfds_poll (main-loop.c:232)
==23055==    by 0x2B8E07: os_host_main_loop_wait (main-loop.c:255)
==23055==    by 0x2B8E07: main_loop_wait (main-loop.c:531)
==23055==  Block was alloc'd at
==23055==    at 0x484186F: malloc (vg_replace_malloc.c:381)
==23055==    by 0x4AE44A8: g_malloc (gmem.c:106)
==23055==    by 0x19A52F: bdrv_attach_child_common (block.c:2912)
==23055==    by 0x19A878: bdrv_attach_child_noperm (block.c:3002)
==23055==    by 0x19BC90: bdrv_attach_child (block.c:3095)
==23055==    by 0x1FC72E: vmdk_open (vmdk.c:1265)
==23055==    by 0x19B1D9: bdrv_open_driver (block.c:1566)
==23055==    by 0x19EAE3: bdrv_open_common (block.c:1858)
==23055==    by 0x19EAE3: bdrv_open_inherit (block.c:3865)
==23055==    by 0x19FB02: bdrv_open (block.c:3958)
==23055==    by 0x18FA9D: qmp_blockdev_add (blockdev.c:3540)
==23055==    by 0x17F1CA: process_options (qemu-storage-daemon.c:216)
==23055==    by 0x17F1CA: main (qemu-storage-daemon.c:338)


So we can’t use blockdev-reopen until this is fixed™, and because this 
kind of is a blocker for your series, I think I’ll just take the patch 
as-is then.

Hanna
Hanna Czenczek Jan. 13, 2022, 4:10 p.m. UTC | #4
On 23.12.21 17:53, Hanna Reitz wrote:
> This test assumes that mirror flushes the source when entering the READY
> state, and that the format level will pass that flush on to the protocol
> level (where we intercept it with blkdebug).
>
> However, apparently that does not happen when using a VMDK image with
> zeroed_grain=on, which actually is the default set by testenv.py.  Right
> now, Python tests ignore IMGOPTS, though, so this has no effect; but
> Vladimir has a series that will change this, so we need to fix this test
> before that series lands.
>
> We can fix it by writing data to the source before we start the mirror
> job; apparently that makes the (VMDK) format layer change its mind and
> pass on the pre-READY flush to the protocol level, so the test passes
> again.  (I presume, without any data written, mirror just does a 64M
> zero write on the target, which VMDK with zeroed_grain=on basically just
> ignores.)
>
> Without this, we do not get a flush, and so blkdebug only sees a single
> flush at the end of the job instead of two, and therefore does not
> inject an error, which makes the block job complete instead of raising
> an error.
>
> Signed-off-by: Hanna Reitz <hreitz@redhat.com>
> ---
> As hinted at above, I think this should be merged before Vladimir's
> "iotests: support zstd" series, or said test fails for me with VMDK.
> (At least on one system, not the other...  Would be too easy otherwise,
> obviously.)
> ---
>   tests/qemu-iotests/tests/mirror-ready-cancel-error | 7 ++++++-
>   1 file changed, 6 insertions(+), 1 deletion(-)

Applied to my block branch.

Hanna
diff mbox series

Patch

diff --git a/tests/qemu-iotests/tests/mirror-ready-cancel-error b/tests/qemu-iotests/tests/mirror-ready-cancel-error
index f2dc88881f..770ffca379 100755
--- a/tests/qemu-iotests/tests/mirror-ready-cancel-error
+++ b/tests/qemu-iotests/tests/mirror-ready-cancel-error
@@ -36,6 +36,11 @@  class TestMirrorReadyCancelError(iotests.QMPTestCase):
         assert iotests.qemu_img_create('-f', iotests.imgfmt, target,
                                        str(image_size)) == 0
 
+        # Ensure that mirror will copy something before READY so the
+        # target format layer will forward the pre-READY flush to its
+        # file child
+        assert iotests.qemu_io_silent('-c', 'write -P 1 0 64k', source) == 0
+
         self.vm = iotests.VM()
         self.vm.launch()
 
@@ -97,7 +102,7 @@  class TestMirrorReadyCancelError(iotests.QMPTestCase):
         # Write something so will not leave the job immediately, but
         # flush first (which will fail, thanks to blkdebug)
         res = self.vm.qmp('human-monitor-command',
-                          command_line='qemu-io mirror-top "write 0 64k"')
+                          command_line='qemu-io mirror-top "write -P 2 0 64k"')
         self.assert_qmp(res, 'return', '')
 
         # Drain status change events