[RFC,v2,0/7] Add persistence to NVMe ZNS emulation

Message ID 20231127085641.3729-1-faithilikerun@gmail.com

Message

Sam Li Nov. 27, 2023, 8:56 a.m. UTC
ZNS emulation follows the NVMe ZNS spec, but the state of namespace
zones does not persist across restarts of QEMU. This series makes the
metadata of ZNS emulation persistent by using new block layer APIs and
a qcow2 image as the backing file. It is the second part, following the
patches that add full zoned storage emulation to the qcow2 driver:
https://patchwork.kernel.org/project/qemu-devel/cover/20231127043703.49489-1-faithilikerun@gmail.com/

The metadata of ZNS emulation is divided into two parts: zone metadata
and zone descriptor extension data. The zone metadata consists of the
zone state, zone type, write pointer (wp) and zone attributes. This zone
information can be packed into a single uint64_t wp field to save space
and simplify access. The layout of each zone's wp field is as follows:
|0000(4)| zone type (1)| zone attr (8)| wp (51) ||
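
As a rough illustration of this layout (a sketch, not code from the
series), the fields could be packed and unpacked as below. It assumes
the diagram reads from the most significant bit to the least
significant bit, so the write pointer occupies the low 51 bits. All
macro and function names are hypothetical.

```c
#include <stdint.h>

/*
 * Assumed layout, reading the diagram MSB first:
 *   bits 63..60  reserved, always zero
 *   bit  59      zone type
 *   bits 58..51  zone attributes
 *   bits 50..0   write pointer
 * Every name below is hypothetical and only serves this sketch.
 */
#define ZNS_WP_BITS     51
#define ZNS_ATTR_BITS   8
#define ZNS_ATTR_SHIFT  ZNS_WP_BITS
#define ZNS_TYPE_SHIFT  (ZNS_WP_BITS + ZNS_ATTR_BITS)
#define ZNS_WP_MASK     ((UINT64_C(1) << ZNS_WP_BITS) - 1)
#define ZNS_ATTR_MASK   ((UINT64_C(1) << ZNS_ATTR_BITS) - 1)

/* Pack the three fields into one 64-bit word; out-of-range bits are dropped. */
static inline uint64_t zns_pack(unsigned type, unsigned attr, uint64_t wp)
{
    return ((uint64_t)(type & 1u) << ZNS_TYPE_SHIFT) |
           (((uint64_t)attr & ZNS_ATTR_MASK) << ZNS_ATTR_SHIFT) |
           (wp & ZNS_WP_MASK);
}

/* Extract the 1-bit zone type. */
static inline unsigned zns_type(uint64_t v)
{
    return (unsigned)((v >> ZNS_TYPE_SHIFT) & 1u);
}

/* Extract the 8-bit zone attributes. */
static inline unsigned zns_attr(uint64_t v)
{
    return (unsigned)((v >> ZNS_ATTR_SHIFT) & ZNS_ATTR_MASK);
}

/* Extract the 51-bit write pointer. */
static inline uint64_t zns_wp(uint64_t v)
{
    return v & ZNS_WP_MASK;
}
```

With 51 bits for the write pointer, the four top bits stay zero for any
input, which matches the reserved field in the diagram.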

The zone descriptor extension data (ZDED) is relatively small compared
to the overall size, so we store the ZDED of all zones in an array
regardless of whether the valid bit is set.
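
A minimal sketch of what such a flat array implies for addressing,
assuming one fixed-size slot of zd_extension_size bytes per zone. The
helper names are hypothetical and do not come from the series.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helpers for a flat ZDED array: every zone owns one slot
 * of zd_extension_size bytes, whether or not its extension is valid.
 */

/* Byte offset of the slot that belongs to zone `zone_idx`. */
static inline size_t zded_offset(uint32_t zone_idx, uint32_t zd_extension_size)
{
    return (size_t)zone_idx * zd_extension_size;
}

/* Total number of bytes needed to hold the slots of all zones. */
static inline size_t zded_array_size(uint32_t nr_zones, uint32_t zd_extension_size)
{
    return (size_t)nr_zones * zd_extension_size;
}
```

The lookup is a single multiplication, at the cost of reserving space
for zones whose extension is never set.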

Creating a ZNS-format qcow2 image file adds one more option,
zd_extension_size, to the zoned device configuration.

To attach this file as an emulated ZNS drive on the QEMU command line, use:
  -drive file=${znsimg},id=nvmezns0,format=qcow2,if=none \
  -device nvme-ns,drive=nvmezns0,bus=nvme0,nsid=1,uuid=xxx \

Sorry, sending this one more time due to network problems.

v1->v2:
- split the [v1 2/5] patch into three (doc, config, block layer API)
- adapt to qcow2 v6

Sam Li (7):
  docs/qcow2: add zd_extension_size option to the zoned format feature
  qcow2: add zd_extension configurations to zoned metadata
  hw/nvme: use blk_get_*() to access zone info in the block layer
  hw/nvme: add blk_get_zone_extension to access zd_extensions
  hw/nvme: make the metadata of ZNS emulation persistent
  hw/nvme: refactor zone append write using block layer APIs
  hw/nvme: make ZDED persistent

 block/block-backend.c             |   88 ++
 block/qcow2.c                     |  119 ++-
 block/qcow2.h                     |    2 +
 docs/interop/qcow2.txt            |    3 +
 hw/nvme/ctrl.c                    | 1247 ++++++++---------------------
 hw/nvme/ns.c                      |  162 +---
 hw/nvme/nvme.h                    |   95 +--
 include/block/block-common.h      |    9 +
 include/block/block_int-common.h  |    8 +
 include/sysemu/block-backend-io.h |   11 +
 include/sysemu/dma.h              |    3 +
 qapi/block-core.json              |    4 +
 system/dma-helpers.c              |   17 +
 13 files changed, 647 insertions(+), 1121 deletions(-)

Comments

Markus Armbruster Nov. 30, 2023, 10:11 a.m. UTC | #1
Sam Li <faithilikerun@gmail.com> writes:

> ZNS emulation follows NVMe ZNS spec but the state of namespace
> zones does not persist across restarts of QEMU. This patch makes the
> metadata of ZNS emulation persistent by using new block layer APIs and
> the qcow2 img as backing file. It is the second part after the patches
> - adding full zoned storage emulation to qcow2 driver.
> https://patchwork.kernel.org/project/qemu-devel/cover/20231127043703.49489-1-faithilikerun@gmail.com/

In the future, please also add this information in machine-readable
form, like this:

  Based-on: <20231127043703.49489-1-faithilikerun@gmail.com>

However, it doesn't apply on top of that series for me.  Got something I
could pull?

[...]
Sam Li Nov. 30, 2023, 10:20 a.m. UTC | #2
On Thu, Nov 30, 2023 at 18:11, Markus Armbruster <armbru@redhat.com> wrote:
>
> Sam Li <faithilikerun@gmail.com> writes:
>
> > ZNS emulation follows NVMe ZNS spec but the state of namespace
> > zones does not persist across restarts of QEMU. This patch makes the
> > metadata of ZNS emulation persistent by using new block layer APIs and
> > the qcow2 img as backing file. It is the second part after the patches
> > - adding full zoned storage emulation to qcow2 driver.
> > https://patchwork.kernel.org/project/qemu-devel/cover/20231127043703.49489-1-faithilikerun@gmail.com/
>
> In the future, also add this information the machine-readable way,
> i.e. like
>
>   Based-on: <20231127043703.49489-1-faithilikerun@gmail.com>
>
> However, it doesn't apply on top of that series for me.  Got something I
> could pull?

Weird, I built this on top of the v6 qcow2 patches. I'll check that after
settling down; I am currently moving to another city.

Thanks,
Sam
Klaus Jensen Jan. 10, 2024, 6:52 a.m. UTC | #3
On Nov 27 16:56, Sam Li wrote:
> [...]

Hi Sam,

This is awesome. For the hw/nvme parts,

Acked-by: Klaus Jensen <k.jensen@samsung.com>

I'll give it a proper R-b when you drop the RFC status.
Sam Li Jan. 22, 2024, 6:43 p.m. UTC | #4
On Wed, Jan 10, 2024 at 07:52, Klaus Jensen <its@irrelevant.dk> wrote:
>
> Hi Sam,
>
> This is awesome. For the hw/nvme parts,
>
> Acked-by: Klaus Jensen <k.jensen@samsung.com>
>
> I'll give it a proper R-b when you drop the RFC status.

Hi Klaus,

Sorry for the late response. I will submit a new RFC patch series very
soon.

Now the zone states should persist. The following is the result of
regression tests on zonefs. It's been a while since I worked on this
series. Please let me know if I made any mistakes.

Thanks,
Sam

[root@guest tests]# ./zonefs-tests.sh /dev/nvme0n1
Gathering information on /dev/nvme0n1...
zonefs-tests on /dev/nvme0n1:
  12 zones (0 conventional zones, 12 sequential zones)
  131072 512B sectors zone size (64 MiB)
  6 max open zones
  8 max active zones
Running tests
...
75 / 112 tests passed (37 skipped, 0 failures)