[v2,0/4] mirror: allow specifying working bitmap

Message ID 20240307134711.709816-1-f.ebner@proxmox.com

Fiona Ebner March 7, 2024, 1:47 p.m. UTC
Changes from RFC/v1 (discussion here [0]):
* Add patch to split BackupSyncMode and MirrorSyncMode.
* Drop bitmap-mode parameter and use passed-in bitmap as the working
  bitmap instead. Users can get the desired behaviors by
  using the block-dirty-bitmap-clear and block-dirty-bitmap-merge
  calls (see commit message in patch 2/4 for how exactly).
* Add patch to check whether the target image's cluster size is at
  most the mirror job's granularity. This is optional; it is an extra
  safety check that's useful when the target is a "diff" image that
  does not have previously synced data.
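
As a rough illustration of the clear/merge approach (the authoritative
recipes are in the commit message of patch 2/4, not here), the sketch
below builds QMP commands that mirror with a scratch copy of a bitmap,
so the original bitmap is left untouched. block-dirty-bitmap-add and
block-dirty-bitmap-merge are existing QMP commands. The "bitmap"
argument of blockdev-mirror and its combination with sync=full are
assumptions about what this series adds, and all node, job and bitmap
names are hypothetical.

```python
import json


def scratch_copy_commands(node, source_bitmap, scratch_bitmap):
    """QMP commands that copy source_bitmap into a new scratch bitmap."""
    return [
        # Create an empty bitmap on the same node.
        {"execute": "block-dirty-bitmap-add",
         "arguments": {"node": node, "name": scratch_bitmap}},
        # Merge the dirty bits of the source bitmap into it.
        {"execute": "block-dirty-bitmap-merge",
         "arguments": {"node": node, "target": scratch_bitmap,
                       "bitmaps": [source_bitmap]}},
    ]


def mirror_command(job_id, device, target, bitmap):
    """blockdev-mirror call using `bitmap` as the working bitmap."""
    # ASSUMPTION: "bitmap" is the argument name introduced by this
    # series, and it is used together with sync=full.
    return {"execute": "blockdev-mirror",
            "arguments": {"job-id": job_id, "device": device,
                          "target": target, "sync": "full",
                          "bitmap": bitmap}}


if __name__ == "__main__":
    cmds = scratch_copy_commands("drive0", "repl0", "repl0-work")
    cmds.append(mirror_command("mirror0", "drive0", "target0", "repl0-work"))
    for cmd in cmds:
        print(json.dumps(cmd))
```

Passing the original bitmap directly instead of the scratch copy would
let the mirror job modify it as the working bitmap.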

Use cases:
* Possibility to resume a failed mirror later.
* Possibility to only mirror deltas to a previously mirrored volume.
* Possibility to (efficiently) mirror a drive that was previously
  mirrored via some external mechanism (e.g. ZFS replication).

We have been using the last one in production without any issues for
about 4 years now. In particular, as mentioned in [1]:

> - create bitmap(s)
> - (incrementally) replicate storage volume(s) out of band (using ZFS)
> - incrementally drive mirror as part of a live migration of VM
> - drop bitmap(s)
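
The four quoted steps could be sketched as an ordered plan like the one
below. It is only an illustration: block-dirty-bitmap-add and
block-dirty-bitmap-remove are existing QMP commands, the "bitmap"
argument of blockdev-mirror is assumed to be the one added by this
series, the out-of-band replication is represented by a placeholder
string, and every name passed in is hypothetical.

```python
def replication_plan(node, bitmap, target, job_id):
    """Return the ordered steps of the quoted workflow.

    Each step is a (kind, payload) tuple: kind "qmp" carries a QMP
    command dict, kind "external" carries a description of work done
    outside of QEMU.
    """
    return [
        # 1. create bitmap(s): start tracking guest writes.
        ("qmp", {"execute": "block-dirty-bitmap-add",
                 "arguments": {"node": node, "name": bitmap}}),
        # 2. replicate the storage volume out of band.
        ("external", "replicate volume out of band (e.g. ZFS send/recv)"),
        # 3. mirror only what was dirtied since step 1.
        #    ASSUMPTION: "bitmap" argument as proposed in this series.
        ("qmp", {"execute": "blockdev-mirror",
                 "arguments": {"job-id": job_id, "device": node,
                               "target": target, "sync": "full",
                               "bitmap": bitmap}}),
        # 4. drop bitmap(s) once the migration is finished.
        ("qmp", {"execute": "block-dirty-bitmap-remove",
                 "arguments": {"node": node, "name": bitmap}}),
    ]


if __name__ == "__main__":
    for kind, payload in replication_plan("drive0", "repl0",
                                          "target0", "mirror0"):
        print(kind, payload)
```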


Now, the IO test added in patch 4/4 actually contains yet another use
case, namely doing incremental mirrors to stand-alone qcow2 "diff"
images that only contain the delta and can be rebased later. I had to
adapt the IO test, because its output expected the mirror bitmap to
still be dirty, but nowadays the mirror is apparently already done
when the bitmaps are queried. So I decided to just use
'write-blocking' mode to avoid any potential timing issues.

But this exposed an issue with the diff image approach. If a write is
not aligned to the cluster size of the mirror target, then rebasing the
diff image onto a backing image will not yield the desired result,
because the full cluster is considered to be allocated and will "hide"
some part of the base/backing image. The failure can be seen by either
using 'write-blocking' mode in the IO test or setting the (bitmap)
granularity to 32 KiB rather than the current 64 KiB.

For the latter case, patch 4/4 adds a check. For the former, the
limitation is documented (I'd expect this to be a niche use case in
practice).
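
To make the granularity case concrete, here is a simplified model of
the arithmetic. It assumes that the mirror copies whole
granularity-sized chunks and that the qcow2 target allocates whole
clusters, and it ignores the 'write-blocking' case. The function names
are made up for this example and are not taken from the patches.

```python
KiB = 1024


def round_down(value, align):
    return value // align * align


def round_up(value, align):
    return -(-value // align) * align


def hidden_ranges(write_offset, write_len, granularity, cluster_size):
    """Byte ranges allocated in the diff image but never copied to it.

    After rebasing the diff image onto a backing image, these ranges
    hide the backing data although the mirror never wrote them.
    """
    # Range the mirror copies: the write rounded out to the granularity.
    copy_start = round_down(write_offset, granularity)
    copy_end = round_up(write_offset + write_len, granularity)
    # Range the target allocates: the copy rounded out to full clusters.
    alloc_start = round_down(copy_start, cluster_size)
    alloc_end = round_up(copy_end, cluster_size)
    hidden = []
    if alloc_start < copy_start:
        hidden.append((alloc_start, copy_start))
    if copy_end < alloc_end:
        hidden.append((copy_end, alloc_end))
    return hidden


def target_cluster_size_ok(granularity, cluster_size):
    """The safety condition: cluster size at most the granularity."""
    return cluster_size <= granularity


if __name__ == "__main__":
    # 32 KiB granularity, 64 KiB clusters: half a cluster gets hidden.
    print(hidden_ranges(0, 4 * KiB, 32 * KiB, 64 * KiB))
    # 64 KiB granularity, 64 KiB clusters: nothing is hidden.
    print(hidden_ranges(0, 4 * KiB, 64 * KiB, 64 * KiB))
```

With a 4 KiB write at offset 0, the 32 KiB granularity case leaves the
second half of the 64 KiB cluster allocated without copied data, while
the 64 KiB granularity case leaves nothing hidden.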

[0]: https://lore.kernel.org/qemu-devel/b91dba34-7969-4d51-ba40-96a91038cc54@yandex-team.ru/T/#m4ae27dc8ca1fb053e0a32cc4ffa2cfab6646805c
[1]: https://lore.kernel.org/qemu-devel/1599127031.9uxdp5h9o2.astroid@nora.none/


Fabian Grünbichler (1):
  iotests: add test for bitmap mirror

Fiona Ebner (2):
  qapi/block-core: avoid the re-use of MirrorSyncMode for backup
  blockdev: mirror: check for target's cluster size when using bitmap

John Snow (1):
  mirror: allow specifying working bitmap

 block/backup.c                                |   18 +-
 block/mirror.c                                |  102 +-
 block/monitor/block-hmp-cmds.c                |    2 +-
 block/replication.c                           |    2 +-
 blockdev.c                                    |   84 +-
 include/block/block_int-global-state.h        |    7 +-
 qapi/block-core.json                          |   64 +-
 tests/qemu-iotests/tests/bitmap-sync-mirror   |  571 ++++
 .../qemu-iotests/tests/bitmap-sync-mirror.out | 2946 +++++++++++++++++
 tests/unit/test-block-iothread.c              |    2 +-
 10 files changed, 3729 insertions(+), 69 deletions(-)
 create mode 100755 tests/qemu-iotests/tests/bitmap-sync-mirror
 create mode 100644 tests/qemu-iotests/tests/bitmap-sync-mirror.out