From patchwork Wed Aug 16 06:46:13 2023
X-Patchwork-Submitter: Sam Li <faithilikerun@gmail.com>
X-Patchwork-Id: 1821704
From: Sam Li <faithilikerun@gmail.com>
To: qemu-devel@nongnu.org
Cc: Peter Xu, stefanha@redhat.com, David Hildenbrand, Kevin Wolf,
    Markus Armbruster, Keith Busch, Hanna Reitz, dmitry.fomichev@wdc.com,
    Eric Blake, Klaus Jensen, Philippe Mathieu-Daudé, dlemoal@kernel.org,
    Paolo Bonzini, hare@suse.de, qemu-block@nongnu.org, Sam Li
Subject: [RFC 1/5] hw/nvme: use blk_get_*() to access zone info in the block layer
Date: Wed, 16 Aug 2023 14:46:13 +0800
Message-Id: <20230816064617.3310-2-faithilikerun@gmail.com>
In-Reply-To: <20230816064617.3310-1-faithilikerun@gmail.com>
References: <20230816064617.3310-1-faithilikerun@gmail.com>

The zone information is contained in the BlockLimits fields. Add
blk_get_*() functions that expose these fields through the block layer,
and update the NVMe device emulation to access zone information through
them.

Signed-off-by: Sam Li <faithilikerun@gmail.com>
---
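For context, this is the intended call pattern for the new accessors; a
minimal sketch, not part of the patch, and the dump function name is
hypothetical:

    /* Query zone geometry and limits from the BlockBackend instead of
     * duplicating them in device properties. Each helper returns 0 (or
     * NULL) when no BlockDriverState is attached, so callers need no
     * extra NULL checks. */
    static void dump_zone_limits(BlockBackend *blk)
    {
        if (!blk_get_zone_model(blk)) {
            return; /* not a zoned backend */
        }
        qemu_log("zone size:          %" PRIu32 "\n", blk_get_zone_size(blk));
        qemu_log("zone capacity:      %" PRIu32 "\n",
                 blk_get_zone_capacity(blk));
        qemu_log("max open zones:     %" PRIu32 "\n",
                 blk_get_max_open_zones(blk));
        qemu_log("max active zones:   %" PRIu32 "\n",
                 blk_get_max_active_zones(blk));
        qemu_log("max append sectors: %" PRIu32 "\n",
                 blk_get_max_append_sectors(blk));
    }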
 block/block-backend.c             | 56 ++++++++++++++++++++++++++++
 block/qcow2.c                     | 20 +++++++++-
 hw/nvme/ctrl.c                    | 34 ++++++-----------
 hw/nvme/ns.c                      | 62 ++++++++++---------------------
 hw/nvme/nvme.h                    |  3 --
 include/sysemu/block-backend-io.h |  7 ++++
 6 files changed, 112 insertions(+), 70 deletions(-)

diff --git a/block/block-backend.c b/block/block-backend.c
index 4009ed5fed..ad410286a0 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -2362,6 +2362,62 @@ int blk_get_max_iov(BlockBackend *blk)
     return blk->root->bs->bl.max_iov;
 }
 
+uint8_t blk_get_zone_model(BlockBackend *blk)
+{
+    BlockDriverState *bs = blk_bs(blk);
+    IO_CODE();
+
+    return bs ? bs->bl.zoned : 0;
+}
+
+uint8_t blk_get_zone_profile(BlockBackend *blk)
+{
+    BlockDriverState *bs = blk_bs(blk);
+    IO_CODE();
+
+    return bs ? bs->bl.zoned_profile : 0;
+}
+
+uint32_t blk_get_zone_size(BlockBackend *blk)
+{
+    BlockDriverState *bs = blk_bs(blk);
+    IO_CODE();
+
+    return bs ? bs->bl.zone_size : 0;
+}
+
+uint32_t blk_get_zone_capacity(BlockBackend *blk)
+{
+    BlockDriverState *bs = blk_bs(blk);
+    IO_CODE();
+
+    return bs ? bs->bl.zone_capacity : 0;
+}
+
+uint32_t blk_get_max_open_zones(BlockBackend *blk)
+{
+    BlockDriverState *bs = blk_bs(blk);
+    IO_CODE();
+
+    return bs ? bs->bl.max_open_zones : 0;
+}
+
+uint32_t blk_get_max_active_zones(BlockBackend *blk)
+{
+    BlockDriverState *bs = blk_bs(blk);
+    IO_CODE();
+
+    return bs ? bs->bl.max_active_zones : 0;
+}
+
+uint32_t blk_get_max_append_sectors(BlockBackend *blk)
+{
+    BlockDriverState *bs = blk_bs(blk);
+    IO_CODE();
+
+    return bs ? bs->bl.max_append_sectors : 0;
+}
+
 void *blk_try_blockalign(BlockBackend *blk, size_t size)
 {
     IO_CODE();
diff --git a/block/qcow2.c b/block/qcow2.c
index 5ccf79cbe7..9de90ccc9f 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -2172,6 +2172,8 @@ static void qcow2_refresh_limits(BlockDriverState *bs, Error **errp)
     bs->bl.pwrite_zeroes_alignment = s->subcluster_size;
     bs->bl.pdiscard_alignment = s->cluster_size;
     bs->bl.zoned = s->zoned_header.zoned;
+    bs->bl.zoned_profile = s->zoned_header.zoned_profile;
+    bs->bl.zone_capacity = s->zoned_header.zone_capacity;
     bs->bl.nr_zones = s->zoned_header.nr_zones;
     bs->wps = s->wps;
     bs->bl.max_append_sectors = s->zoned_header.max_append_sectors;
@@ -4083,8 +4085,22 @@ qcow2_co_create(BlockdevCreateOptions *create_options, Error **errp)
         s->zoned_header.zoned = BLK_Z_HM;
         s->zoned_header.zone_size = qcow2_opts->zone_size;
         s->zoned_header.zone_nr_conv = qcow2_opts->zone_nr_conv;
-        s->zoned_header.max_open_zones = qcow2_opts->max_open_zones;
-        s->zoned_header.max_active_zones = qcow2_opts->max_active_zones;
+
+        if (qcow2_opts->max_active_zones) {
+            if (qcow2_opts->max_open_zones > qcow2_opts->max_active_zones) {
+                error_setg(errp, "max_open_zones (%u) exceeds "
+                           "max_active_zones (%u)", qcow2_opts->max_open_zones,
+                           qcow2_opts->max_active_zones);
+                return -1;
+            }
+
+            if (!qcow2_opts->max_open_zones) {
+                qcow2_opts->max_open_zones = qcow2_opts->max_active_zones;
+            }
+        }
+        s->zoned_header.max_open_zones = qcow2_opts->max_open_zones;
+        s->zoned_header.max_active_zones = qcow2_opts->max_active_zones;
+
         s->zoned_header.max_append_sectors = qcow2_opts->max_append_sectors;
         s->zoned_header.nr_zones = qcow2_opts->size / qcow2_opts->zone_size;
diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index 539d273553..4e1608f0c1 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -417,18 +417,6 @@ static void nvme_assign_zone_state(NvmeNamespace *ns, NvmeZone *zone,
 static uint16_t nvme_zns_check_resources(NvmeNamespace *ns, uint32_t act,
                                          uint32_t opn, uint32_t zrwa)
 {
-    if (ns->params.max_active_zones != 0 &&
-        ns->nr_active_zones + act > ns->params.max_active_zones) {
-        trace_pci_nvme_err_insuff_active_res(ns->params.max_active_zones);
-        return NVME_ZONE_TOO_MANY_ACTIVE | NVME_DNR;
-    }
-
-    if (ns->params.max_open_zones != 0 &&
-        ns->nr_open_zones + opn > ns->params.max_open_zones) {
-        trace_pci_nvme_err_insuff_open_res(ns->params.max_open_zones);
-        return NVME_ZONE_TOO_MANY_OPEN | NVME_DNR;
-    }
-
     if (zrwa > ns->zns.numzrwa) {
         return NVME_NOZRWA | NVME_DNR;
     }
@@ -1988,9 +1976,9 @@ static uint16_t nvme_zrm_reset(NvmeNamespace *ns, NvmeZone *zone)
 static void nvme_zrm_auto_transition_zone(NvmeNamespace *ns)
 {
     NvmeZone *zone;
+    int moz = blk_get_max_open_zones(ns->blkconf.blk);
 
-    if (ns->params.max_open_zones &&
-        ns->nr_open_zones == ns->params.max_open_zones) {
+    if (moz && ns->nr_open_zones == moz) {
         zone = QTAILQ_FIRST(&ns->imp_open_zones);
         if (zone) {
             /*
@@ -2165,7 +2153,7 @@ void nvme_rw_complete_cb(void *opaque, int ret)
         block_acct_done(stats, acct);
     }
 
-    if (ns->params.zoned && nvme_is_write(req)) {
+    if (blk_get_zone_model(blk) && nvme_is_write(req)) {
         nvme_finalize_zoned_write(ns, req);
     }
 
@@ -2887,7 +2875,7 @@ static void nvme_copy_out_completed_cb(void *opaque, int ret)
         goto out;
     }
 
-    if (ns->params.zoned) {
+    if (blk_get_zone_model(ns->blkconf.blk)) {
         nvme_advance_zone_wp(ns, iocb->zone, nlb);
     }
 
@@ -2999,7 +2987,7 @@ static void nvme_copy_in_completed_cb(void *opaque, int ret)
         goto invalid;
     }
 
-    if (ns->params.zoned) {
+    if (blk_get_zone_model(ns->blkconf.blk)) {
         status = nvme_check_zone_write(ns, iocb->zone, iocb->slba, nlb);
         if (status) {
             goto invalid;
@@ -3093,7 +3081,7 @@ static void nvme_do_copy(NvmeCopyAIOCB *iocb)
         }
     }
 
-    if (ns->params.zoned) {
+    if (blk_get_zone_model(ns->blkconf.blk)) {
         status = nvme_check_zone_read(ns, slba, nlb);
         if (status) {
             goto invalid;
@@ -3169,7 +3157,7 @@ static uint16_t nvme_copy(NvmeCtrl *n, NvmeRequest *req)
 
     iocb->slba = le64_to_cpu(copy->sdlba);
 
-    if (ns->params.zoned) {
+    if (blk_get_zone_model(ns->blkconf.blk)) {
         iocb->zone = nvme_get_zone_by_slba(ns, iocb->slba);
         if (!iocb->zone) {
             status = NVME_LBA_RANGE | NVME_DNR;
@@ -3440,7 +3428,7 @@ static uint16_t nvme_read(NvmeCtrl *n, NvmeRequest *req)
         goto invalid;
     }
 
-    if (ns->params.zoned) {
+    if (blk_get_zone_model(blk)) {
         status = nvme_check_zone_read(ns, slba, nlb);
         if (status) {
             trace_pci_nvme_err_zone_read_not_ok(slba, nlb, status);
@@ -3555,7 +3543,7 @@ static uint16_t nvme_do_write(NvmeCtrl *n, NvmeRequest *req, bool append,
         goto invalid;
     }
 
-    if (ns->params.zoned) {
+    if (blk_get_zone_model(blk)) {
         zone = nvme_get_zone_by_slba(ns, slba);
         assert(zone);
 
@@ -3673,7 +3661,7 @@ static uint16_t nvme_get_mgmt_zone_slba_idx(NvmeNamespace *ns, NvmeCmd *c,
     uint32_t dw10 = le32_to_cpu(c->cdw10);
     uint32_t dw11 = le32_to_cpu(c->cdw11);
 
-    if (!ns->params.zoned) {
+    if (!blk_get_zone_model(ns->blkconf.blk)) {
         trace_pci_nvme_err_invalid_opc(c->opcode);
         return NVME_INVALID_OPCODE | NVME_DNR;
     }
@@ -6534,7 +6522,7 @@ done:
 
 static uint16_t nvme_format_check(NvmeNamespace *ns, uint8_t lbaf, uint8_t pi)
 {
-    if (ns->params.zoned) {
+    if (blk_get_zone_model(ns->blkconf.blk)) {
         return NVME_INVALID_FORMAT | NVME_DNR;
     }
diff --git a/hw/nvme/ns.c b/hw/nvme/ns.c
index 44aba8f4d9..f076593ada 100644
--- a/hw/nvme/ns.c
+++ b/hw/nvme/ns.c
@@ -25,7 +25,6 @@
 #include "trace.h"
 
 #define MIN_DISCARD_GRANULARITY (4 * KiB)
-#define NVME_DEFAULT_ZONE_SIZE (128 * MiB)
 
 void nvme_ns_init_format(NvmeNamespace *ns)
 {
@@ -177,19 +176,11 @@ static int nvme_ns_init_blk(NvmeNamespace *ns, Error **errp)
 
 static int nvme_ns_zoned_check_calc_geometry(NvmeNamespace *ns, Error **errp)
 {
-    uint64_t zone_size, zone_cap;
+    BlockBackend *blk = ns->blkconf.blk;
+    uint64_t zone_size = blk_get_zone_size(blk);
+    uint64_t zone_cap = blk_get_zone_capacity(blk);
 
     /* Make sure that the values of ZNS properties are sane */
-    if (ns->params.zone_size_bs) {
-        zone_size = ns->params.zone_size_bs;
-    } else {
-        zone_size = NVME_DEFAULT_ZONE_SIZE;
-    }
-
-    if (ns->params.zone_cap_bs) {
-        zone_cap = ns->params.zone_cap_bs;
-    } else {
-        zone_cap = zone_size;
-    }
-
     if (zone_cap > zone_size) {
         error_setg(errp, "zone capacity %"PRIu64"B exceeds "
                    "zone size %"PRIu64"B", zone_cap, zone_size);
@@ -266,6 +257,7 @@ static void nvme_ns_zoned_init_state(NvmeNamespace *ns)
 
 static void nvme_ns_init_zoned(NvmeNamespace *ns)
 {
+    BlockBackend *blk = ns->blkconf.blk;
     NvmeIdNsZoned *id_ns_z;
     int i;
 
@@ -274,8 +266,8 @@ static void nvme_ns_init_zoned(NvmeNamespace *ns)
     id_ns_z = g_new0(NvmeIdNsZoned, 1);
 
     /* MAR/MOR are zeroes-based, FFFFFFFFFh means no limit */
-    id_ns_z->mar = cpu_to_le32(ns->params.max_active_zones - 1);
-    id_ns_z->mor = cpu_to_le32(ns->params.max_open_zones - 1);
+    id_ns_z->mar = cpu_to_le32(blk_get_max_active_zones(blk) - 1);
+    id_ns_z->mor = cpu_to_le32(blk_get_max_open_zones(blk) - 1);
     id_ns_z->zoc = 0;
     id_ns_z->ozcs = ns->params.cross_zone_read ?
         NVME_ID_NS_ZONED_OZCS_RAZB : 0x00;
@@ -539,6 +531,7 @@ static bool nvme_ns_init_fdp(NvmeNamespace *ns, Error **errp)
 
 static int nvme_ns_check_constraints(NvmeNamespace *ns, Error **errp)
 {
+    BlockBackend *blk = ns->blkconf.blk;
     unsigned int pi_size;
 
     if (!ns->blkconf.blk) {
@@ -577,25 +570,13 @@ static int nvme_ns_check_constraints(NvmeNamespace *ns, Error **errp)
         return -1;
     }
 
-    if (ns->params.zoned && ns->endgrp && ns->endgrp->fdp.enabled) {
+    if (blk_get_zone_profile(blk) == BLK_ZP_ZNS && ns->endgrp &&
+        ns->endgrp->fdp.enabled) {
         error_setg(errp, "cannot be a zoned namespace in an FDP configuration");
         return -1;
     }
 
-    if (ns->params.zoned) {
-        if (ns->params.max_active_zones) {
-            if (ns->params.max_open_zones > ns->params.max_active_zones) {
-                error_setg(errp, "max_open_zones (%u) exceeds "
-                           "max_active_zones (%u)", ns->params.max_open_zones,
-                           ns->params.max_active_zones);
-                return -1;
-            }
-
-            if (!ns->params.max_open_zones) {
-                ns->params.max_open_zones = ns->params.max_active_zones;
-            }
-        }
-
+    if (blk_get_zone_model(blk)) {
         if (ns->params.zd_extension_size) {
             if (ns->params.zd_extension_size & 0x3f) {
                 error_setg(errp, "zone descriptor extension size must be a "
@@ -630,14 +611,14 @@ static int nvme_ns_check_constraints(NvmeNamespace *ns, Error **errp)
             return -1;
         }
 
-        if (ns->params.max_active_zones) {
-            if (ns->params.numzrwa > ns->params.max_active_zones) {
+        int maz = blk_get_max_active_zones(blk);
+        if (maz) {
+            if (ns->params.numzrwa > maz) {
                 error_setg(errp, "number of zone random write area "
                            "resources (zoned.numzrwa, %d) must be less "
                            "than or equal to maximum active resources "
                            "(zoned.max_active_zones, %d)",
-                           ns->params.numzrwa,
-                           ns->params.max_active_zones);
+                           ns->params.numzrwa, maz);
                 return -1;
             }
         }
@@ -660,7 +641,7 @@ int nvme_ns_setup(NvmeNamespace *ns, Error **errp)
     if (nvme_ns_init(ns, errp)) {
         return -1;
     }
-    if (ns->params.zoned) {
+    if (blk_get_zone_model(ns->blkconf.blk)) {
         if (nvme_ns_zoned_check_calc_geometry(ns, errp) != 0) {
             return -1;
         }
@@ -683,15 +664,17 @@ void nvme_ns_drain(NvmeNamespace *ns)
 
 void nvme_ns_shutdown(NvmeNamespace *ns)
 {
-    blk_flush(ns->blkconf.blk);
-    if (ns->params.zoned) {
+    BlockBackend *blk = ns->blkconf.blk;
+
+    blk_flush(blk);
+    if (blk_get_zone_model(blk)) {
         nvme_zoned_ns_shutdown(ns);
     }
 }
 
 void nvme_ns_cleanup(NvmeNamespace *ns)
 {
-    if (ns->params.zoned) {
+    if (blk_get_zone_model(ns->blkconf.blk)) {
         g_free(ns->id_ns_zoned);
         g_free(ns->zone_array);
         g_free(ns->zd_extensions);
@@ -806,11 +789,6 @@ static Property nvme_ns_props[] = {
     DEFINE_PROP_UINT16("mssrl", NvmeNamespace, params.mssrl, 128),
    DEFINE_PROP_UINT32("mcl", NvmeNamespace, params.mcl, 128),
     DEFINE_PROP_UINT8("msrc", NvmeNamespace, params.msrc, 127),
-    DEFINE_PROP_BOOL("zoned", NvmeNamespace, params.zoned, false),
-    DEFINE_PROP_SIZE("zoned.zone_size", NvmeNamespace, params.zone_size_bs,
-                     NVME_DEFAULT_ZONE_SIZE),
-    DEFINE_PROP_SIZE("zoned.zone_capacity", NvmeNamespace, params.zone_cap_bs,
-                     0),
     DEFINE_PROP_BOOL("zoned.cross_read", NvmeNamespace,
                      params.cross_zone_read, false),
     DEFINE_PROP_UINT32("zoned.max_active", NvmeNamespace,
diff --git a/hw/nvme/nvme.h b/hw/nvme/nvme.h
index 5f2ae7b28b..76677a86e9 100644
--- a/hw/nvme/nvme.h
+++ b/hw/nvme/nvme.h
@@ -189,10 +189,7 @@ typedef struct NvmeNamespaceParams {
     uint32_t mcl;
     uint8_t  msrc;
 
-    bool     zoned;
     bool     cross_zone_read;
-    uint64_t zone_size_bs;
-    uint64_t zone_cap_bs;
     uint32_t max_active_zones;
     uint32_t max_open_zones;
     uint32_t zd_extension_size;
diff --git a/include/sysemu/block-backend-io.h b/include/sysemu/block-backend-io.h
index be4dcef59d..3be221e752 100644
--- a/include/sysemu/block-backend-io.h
+++ b/include/sysemu/block-backend-io.h
@@ -99,6 +99,13 @@ void blk_error_action(BlockBackend *blk, BlockErrorAction action,
 void blk_iostatus_set_err(BlockBackend *blk, int error);
 int blk_get_max_iov(BlockBackend *blk);
 int blk_get_max_hw_iov(BlockBackend *blk);
+uint8_t blk_get_zone_model(BlockBackend *blk);
+uint8_t blk_get_zone_profile(BlockBackend *blk);
+uint32_t blk_get_zone_size(BlockBackend *blk);
+uint32_t blk_get_zone_capacity(BlockBackend *blk);
+uint32_t blk_get_max_open_zones(BlockBackend *blk);
+uint32_t blk_get_max_active_zones(BlockBackend *blk);
+uint32_t blk_get_max_append_sectors(BlockBackend *blk);
 void blk_io_plug(void);
 void blk_io_unplug(void);
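One detail worth calling out in the ns.c hunk above: MAR/MOR are
zeroes-based, so the subtraction deliberately relies on unsigned
wrap-around. A minimal illustration (the helper name is hypothetical,
not part of the patch):

    /* MAR/MOR are zeroes-based: a limit of N is encoded as N - 1, and a
     * configured limit of 0 ("unlimited") wraps to 0xFFFFFFFF, which the
     * ZNS spec reads as "no limit". */
    static inline uint32_t nvme_zoned_limit_field(uint32_t limit)
    {
        return limit - 1; /* 0 -> 0xFFFFFFFF */
    }

From patchwork Wed Aug 16 06:46:14 2023
X-Patchwork-Submitter: Sam Li <faithilikerun@gmail.com>
X-Patchwork-Id: 1821705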
From: Sam Li <faithilikerun@gmail.com>
To: qemu-devel@nongnu.org
Cc: Peter Xu, stefanha@redhat.com, David Hildenbrand, Kevin Wolf,
    Markus Armbruster, Keith Busch, Hanna Reitz, dmitry.fomichev@wdc.com,
    Eric Blake, Klaus Jensen, Philippe Mathieu-Daudé, dlemoal@kernel.org,
    Paolo Bonzini, hare@suse.de, qemu-block@nongnu.org, Sam Li
Subject: [RFC 2/5] qcow2: add zone device metadata with zd_extension
Date: Wed, 16 Aug 2023 14:46:14 +0800
Message-Id: <20230816064617.3310-3-faithilikerun@gmail.com>
In-Reply-To: <20230816064617.3310-1-faithilikerun@gmail.com>
References: <20230816064617.3310-1-faithilikerun@gmail.com>

Zone descriptor data is host-defined data that is associated with each
zone. Store zone descriptor extensions in zonedmeta, and add
blk_get_zone_extension() to access the zd_extensions array.

Signed-off-by: Sam Li <faithilikerun@gmail.com>
---
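For reference, the zone descriptor extensions live in one flat buffer
owned by the block layer, with zone i's extension starting at byte
i * zd_extension_size. A lookup sketch (the helper name is hypothetical;
it mirrors nvme_get_zd_extension() in this patch):

    static uint8_t *zd_extension_for_zone(BlockBackend *blk,
                                          uint32_t zone_idx)
    {
        uint8_t *buf = blk_get_zone_extension(blk); /* NULL if none */
        uint32_t esz = blk_get_zd_ext_size(blk);    /* multiple of 64 B */

        return buf ? buf + (size_t)zone_idx * esz : NULL;
    }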
 block/block-backend.c             | 15 ++++++
 block/qcow2.c                     | 86 ++++++++++++++++++++++++++-----
 block/qcow2.h                     |  3 ++
 docs/interop/qcow2.txt            |  2 +
 hw/nvme/ctrl.c                    | 19 ++++---
 hw/nvme/ns.c                      | 24 ++-------
 hw/nvme/nvme.h                    |  7 ---
 include/block/block_int-common.h  |  6 +++
 include/sysemu/block-backend-io.h |  2 +
 qapi/block-core.json              |  3 ++
 10 files changed, 121 insertions(+), 46 deletions(-)

diff --git a/block/block-backend.c b/block/block-backend.c
index ad410286a0..f68c5263f3 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -2418,6 +2418,21 @@ uint32_t blk_get_max_append_sectors(BlockBackend *blk)
     return bs ? bs->bl.max_append_sectors : 0;
 }
 
+uint8_t *blk_get_zone_extension(BlockBackend *blk)
+{
+    BlockDriverState *bs = blk_bs(blk);
+    IO_CODE();
+
+    return bs ? bs->zd_extensions : NULL;
+}
+
+uint32_t blk_get_zd_ext_size(BlockBackend *blk)
+{
+    BlockDriverState *bs = blk_bs(blk);
+    IO_CODE();
+
+    return bs ? bs->bl.zd_extension_size : 0;
+}
+
 void *blk_try_blockalign(BlockBackend *blk, size_t size)
 {
     IO_CODE();
diff --git a/block/qcow2.c b/block/qcow2.c
index 9de90ccc9f..fce1fe83a7 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -340,15 +340,28 @@ static inline int qcow2_refresh_zonedmeta(BlockDriverState *bs)
 {
     int ret;
     BDRVQcow2State *s = bs->opaque;
-    uint64_t *temp = g_malloc(s->zoned_header.zonedmeta_size);
+    uint64_t wps_size = s->zoned_header.zonedmeta_size - s->zded_size;
+    g_autofree uint64_t *temp = NULL;
+    temp = g_malloc(wps_size);
     ret = bdrv_pread(bs->file, s->zoned_header.zonedmeta_offset,
-                     s->zoned_header.zonedmeta_size, temp, 0);
+                     wps_size, temp, 0);
     if (ret < 0) {
-        error_report("Can not read metadata\n");
+        error_report("Can not read metadata");
         return ret;
     }
 
-    memcpy(s->wps->wp, temp, s->zoned_header.zonedmeta_size);
+    g_autofree uint8_t *zded = NULL;
+    zded = g_try_malloc0(s->zded_size);
+    ret = bdrv_pread(bs->file, s->zoned_header.zonedmeta_offset + wps_size,
+                     s->zded_size, zded, 0);
+    if (ret < 0) {
+        error_report("Can not read zded");
+        return ret;
+    }
+
+    memcpy(s->wps->wp, temp, wps_size);
+    memcpy(bs->zd_extensions, zded, s->zded_size);
     return 0;
 }
 
@@ -607,6 +620,8 @@ qcow2_read_extensions(BlockDriverState *bs, uint64_t start_offset,
             zoned_ext.zone_size = be32_to_cpu(zoned_ext.zone_size);
             zoned_ext.zone_capacity = be32_to_cpu(zoned_ext.zone_capacity);
+            zoned_ext.zd_extension_size =
+                be32_to_cpu(zoned_ext.zd_extension_size);
             zoned_ext.nr_zones = be32_to_cpu(zoned_ext.nr_zones);
             zoned_ext.zone_nr_conv = be32_to_cpu(zoned_ext.zone_nr_conv);
             zoned_ext.max_open_zones = be32_to_cpu(zoned_ext.max_open_zones);
@@ -618,8 +633,10 @@ qcow2_read_extensions(BlockDriverState *bs, uint64_t start_offset,
                 be64_to_cpu(zoned_ext.zonedmeta_offset);
             zoned_ext.zonedmeta_size = be64_to_cpu(zoned_ext.zonedmeta_size);
             s->zoned_header = zoned_ext;
             s->wps = g_malloc(sizeof(BlockZoneWps)
-                              + s->zoned_header.zonedmeta_size);
+                              + zoned_ext.zonedmeta_size - s->zded_size);
+            bs->zd_extensions = g_malloc0(s->zded_size);
             ret = qcow2_refresh_zonedmeta(bs);
             if (ret < 0) {
                 error_setg_errno(errp, -ret, "zonedmeta: "
@@ -2174,6 +2191,7 @@ static void qcow2_refresh_limits(BlockDriverState *bs, Error **errp)
     bs->bl.zoned = s->zoned_header.zoned;
     bs->bl.zoned_profile = s->zoned_header.zoned_profile;
     bs->bl.zone_capacity = s->zoned_header.zone_capacity;
+    bs->bl.zd_extension_size = s->zoned_header.zd_extension_size;
     bs->bl.nr_zones = s->zoned_header.nr_zones;
     bs->wps = s->wps;
     bs->bl.max_append_sectors = s->zoned_header.max_append_sectors;
@@ -3369,6 +3387,8 @@ int qcow2_update_header(BlockDriverState *bs)
             .nr_zones = cpu_to_be32(s->zoned_header.nr_zones),
             .zone_size = cpu_to_be32(s->zoned_header.zone_size),
             .zone_capacity = cpu_to_be32(s->zoned_header.zone_capacity),
+            .zd_extension_size =
+                cpu_to_be32(s->zoned_header.zd_extension_size),
             .zone_nr_conv = cpu_to_be32(s->zoned_header.zone_nr_conv),
             .max_open_zones = cpu_to_be32(s->zoned_header.max_open_zones),
             .max_active_zones =
@@ -4075,13 +4095,8 @@ qcow2_co_create(BlockdevCreateOptions *create_options, Error **errp)
 
     if (qcow2_opts->zoned_profile) {
         BDRVQcow2State *s = blk_bs(blk)->opaque;
-        if (!strcmp(qcow2_opts->zoned_profile, "zbc")) {
-            s->zoned_header.zoned_profile = BLK_ZP_ZBC;
-            s->zoned_header.zone_capacity = qcow2_opts->zone_size;
-        } else if (!strcmp(qcow2_opts->zoned_profile, "zns")) {
-            s->zoned_header.zoned_profile = BLK_ZP_ZNS;
-            s->zoned_header.zone_capacity = qcow2_opts->zone_capacity;
-        }
+        uint64_t zded_size = 0;
+
         s->zoned_header.zoned = BLK_Z_HM;
         s->zoned_header.zone_size = qcow2_opts->zone_size;
         s->zoned_header.zone_nr_conv = qcow2_opts->zone_nr_conv;
@@ -4119,6 +4134,33 @@ qcow2_co_create(BlockdevCreateOptions *create_options, Error **errp)
             meta[i] |= ((uint64_t)BLK_ZS_EMPTY << 60);
         }
 
+        if (!g_strcmp0(qcow2_opts->zoned_profile, "zbc")) {
+            s->zoned_header.zoned_profile = BLK_ZP_ZBC;
+            s->zoned_header.zone_capacity = qcow2_opts->zone_size;
+        } else if (!g_strcmp0(qcow2_opts->zoned_profile, "zns")) {
+            s->zoned_header.zoned_profile = BLK_ZP_ZNS;
+            s->zoned_header.zone_capacity = qcow2_opts->zone_capacity;
+
+            if (qcow2_opts->zd_extension_size) {
+                if (qcow2_opts->zd_extension_size & 0x3f) {
+                    error_setg(errp, "zone descriptor extension size must be a "
+                               "multiple of 64B");
+                    return -1;
+                }
+                if ((qcow2_opts->zd_extension_size >> 6) > 0xff) {
+                    error_setg(errp,
+                               "zone descriptor extension size is too large");
+                    return -1;
+                }
+            }
+            s->zoned_header.zd_extension_size = qcow2_opts->zd_extension_size;
+
+            zded_size = s->zoned_header.zd_extension_size *
+                        s->zoned_header.nr_zones;
+        }
+        s->zded_size = zded_size;
+        zoned_meta_size += zded_size;
+
         offset = qcow2_alloc_clusters(blk_bs(blk), zoned_meta_size);
         if (offset < 0) {
             error_setg_errno(errp, -offset, "Could not allocate clusters "
@@ -4138,12 +4180,23 @@ qcow2_co_create(BlockdevCreateOptions *create_options, Error **errp)
             error_setg_errno(errp, -ret, "Could not zero fill zoned metadata");
             goto out;
         }
-        ret = bdrv_pwrite(blk_bs(blk)->file, offset, zoned_meta_size, meta, 0);
+        ret = bdrv_pwrite(blk_bs(blk)->file, offset,
+                          zoned_meta_size - zded_size, meta, 0);
         if (ret < 0) {
             error_setg_errno(errp, -ret, "Could not write zoned metadata "
                              "to disk");
             goto out;
         }
+        if (s->zoned_header.zoned_profile == BLK_ZP_ZNS) {
+            /* Initialize zone descriptor extensions */
+            ret = bdrv_co_pwrite_zeroes(blk_bs(blk)->file,
+                                        offset + zoned_meta_size - zded_size,
+                                        zded_size, 0);
+            if (ret < 0) {
+                error_setg_errno(errp, -ret, "Could not write zone descriptor "
+                                 "extensions to disk");
+                goto out;
+            }
+        }
     }
 
     /* Create a full header (including things like feature table) */
@@ -4290,6 +4343,7 @@ qcow2_co_create_opts(BlockDriver *drv, const char *filename, QemuOpts *opts,
         { BLOCK_OPT_Z_MAS,              "max-append-sectors"},
         { BLOCK_OPT_Z_SIZE,             "zone-size"},
         { BLOCK_OPT_Z_CAP,              "zone-capacity"},
+        { BLOCK_OPT_Z_DEXTSIZE,         "zd-extension-size"},
         { NULL, NULL },
     };
 
@@ -6856,6 +6910,12 @@ static QemuOptsList qcow2_create_opts = {
             .type = QEMU_OPT_SIZE,                              \
             .help = "zone capacity",                            \
         },                                                      \
+        {                                                       \
+            .name = BLOCK_OPT_Z_DEXTSIZE,                       \
+            .type = QEMU_OPT_SIZE,                              \
+            .help = "zone descriptor extension size (defaults " \
+                    "to 0, must be a multiple of 64 bytes)",    \
+        },                                                      \
         {                                                       \
             .name = BLOCK_OPT_Z_NR_COV,                         \
             .type = QEMU_OPT_NUMBER,                            \
diff --git a/block/qcow2.h b/block/qcow2.h
index 38b779ae32..254295cfce 100644
--- a/block/qcow2.h
+++ b/block/qcow2.h
@@ -250,6 +250,8 @@ typedef struct Qcow2ZonedHeaderExtension {
     uint32_t max_append_sectors;
     uint64_t zonedmeta_offset;
     uint64_t zonedmeta_size;
+    uint32_t zd_extension_size; /* must be multiple of 64 B */
+    uint32_t reserved32;
 } QEMU_PACKED Qcow2ZonedHeaderExtension;
 
 typedef struct Qcow2UnknownHeaderExtension {
@@ -445,6 +447,7 @@ typedef struct BDRVQcow2State {
     uint32_t nr_zones_imp_open;
     uint32_t nr_zones_closed;
     BlockZoneWps *wps;
+    uint64_t zded_size;
 } BDRVQcow2State;
 
 typedef struct Qcow2COWRegion {
diff --git a/docs/interop/qcow2.txt b/docs/interop/qcow2.txt
index 739e2c62c6..beb4ead094 100644
--- a/docs/interop/qcow2.txt
+++ b/docs/interop/qcow2.txt
@@ -356,6 +356,8 @@ The fields of the zoned extension are:
                     28 - 31: max_append_sectors
                     32 - 39: zonedmeta_offset
                     40 - 47: zonedmeta_size
+                    48 - 51: zd_extension_size
+                    52 - 55: Reserved, must be zero.
 
 == Full disk encryption header pointer ==
 
diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index 4e1608f0c1..4320f3a15c 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -4010,6 +4010,12 @@ static uint16_t nvme_zone_mgmt_send_zrwa_flush(NvmeCtrl *n, NvmeZone *zone,
     return NVME_SUCCESS;
 }
 
+static inline uint8_t *nvme_get_zd_extension(NvmeNamespace *ns,
+                                             uint32_t zone_idx)
+{
+    return &ns->zd_extensions[zone_idx * blk_get_zd_ext_size(ns->blkconf.blk)];
+}
+
 static uint16_t nvme_zone_mgmt_send(NvmeCtrl *n, NvmeRequest *req)
 {
     NvmeZoneSendCmd *cmd = (NvmeZoneSendCmd *)&req->cmd;
@@ -4094,11 +4100,11 @@ static uint16_t nvme_zone_mgmt_send(NvmeCtrl *n, NvmeRequest *req)
 
     case NVME_ZONE_ACTION_SET_ZD_EXT:
         trace_pci_nvme_set_descriptor_extension(slba, zone_idx);
-        if (all || !ns->params.zd_extension_size) {
+        if (all || !blk_get_zd_ext_size(ns->blkconf.blk)) {
             return NVME_INVALID_FIELD | NVME_DNR;
         }
         zd_ext = nvme_get_zd_extension(ns, zone_idx);
-        status = nvme_h2c(n, zd_ext, ns->params.zd_extension_size, req);
+        status = nvme_h2c(n, zd_ext, blk_get_zd_ext_size(ns->blkconf.blk),
+                          req);
         if (status) {
             trace_pci_nvme_err_zd_extension_map_error(zone_idx);
             return status;
@@ -4189,7 +4195,7 @@ static uint16_t nvme_zone_mgmt_recv(NvmeCtrl *n, NvmeRequest *req)
     if (zra != NVME_ZONE_REPORT && zra != NVME_ZONE_REPORT_EXTENDED) {
         return NVME_INVALID_FIELD | NVME_DNR;
     }
 
-    if (zra == NVME_ZONE_REPORT_EXTENDED && !ns->params.zd_extension_size) {
+    if (zra == NVME_ZONE_REPORT_EXTENDED && !blk_get_zd_ext_size(ns->blkconf.blk)) {
         return NVME_INVALID_FIELD | NVME_DNR;
     }
 
@@ -4211,7 +4217,7 @@ static uint16_t nvme_zone_mgmt_recv(NvmeCtrl *n, NvmeRequest *req)
 
     zone_entry_sz = sizeof(NvmeZoneDescr);
     if (zra == NVME_ZONE_REPORT_EXTENDED) {
-        zone_entry_sz += ns->params.zd_extension_size;
+        zone_entry_sz += blk_get_zd_ext_size(ns->blkconf.blk);
     }
 
     max_zones = (data_size - sizeof(NvmeZoneReportHeader)) / zone_entry_sz;
@@ -4249,11 +4255,12 @@ static uint16_t nvme_zone_mgmt_recv(NvmeCtrl *n, NvmeRequest *req)
         }
 
         if (zra == NVME_ZONE_REPORT_EXTENDED) {
+            int zd_ext_size = blk_get_zd_ext_size(ns->blkconf.blk);
             if (zone->d.za & NVME_ZA_ZD_EXT_VALID) {
                 memcpy(buf_p, nvme_get_zd_extension(ns, zone_idx),
-                       ns->params.zd_extension_size);
+                       zd_ext_size);
             }
-            buf_p += ns->params.zd_extension_size;
+            buf_p += zd_ext_size;
         }
 
         max_zones--;
diff --git a/hw/nvme/ns.c b/hw/nvme/ns.c
index f076593ada..c9c3a54d36 100644
--- a/hw/nvme/ns.c
+++ b/hw/nvme/ns.c
@@ -218,15 +218,15 @@ static int nvme_ns_zoned_check_calc_geometry(NvmeNamespace *ns, Error **errp)
 
 static void nvme_ns_zoned_init_state(NvmeNamespace *ns)
 {
+    BlockBackend *blk = ns->blkconf.blk;
     uint64_t start = 0, zone_size = ns->zone_size;
     uint64_t capacity = ns->num_zones * zone_size;
     NvmeZone *zone;
     int i;
 
     ns->zone_array = g_new0(NvmeZone, ns->num_zones);
-    if (ns->params.zd_extension_size) {
-        ns->zd_extensions = g_malloc0(ns->params.zd_extension_size *
-                                      ns->num_zones);
+    if (blk_get_zone_extension(blk)) {
+        ns->zd_extensions = blk_get_zone_extension(blk);
     }
 
     QTAILQ_INIT(&ns->exp_open_zones);
@@ -275,7 +275,7 @@ static void nvme_ns_init_zoned(NvmeNamespace *ns)
     for (i = 0; i <= ns->id_ns.nlbaf; i++) {
         id_ns_z->lbafe[i].zsze = cpu_to_le64(ns->zone_size);
         id_ns_z->lbafe[i].zdes =
-            ns->params.zd_extension_size >> 6; /* Units of 64B */
+            blk_get_zd_ext_size(blk) >> 6; /* Units of 64B */
     }
 
     if (ns->params.zrwas) {
@@ -577,19 +577,6 @@ static int nvme_ns_check_constraints(NvmeNamespace *ns, Error **errp)
     }
 
     if (blk_get_zone_model(blk)) {
-        if (ns->params.zd_extension_size) {
-            if (ns->params.zd_extension_size & 0x3f) {
-                error_setg(errp, "zone descriptor extension size must be a "
-                           "multiple of 64B");
-                return -1;
-            }
-            if ((ns->params.zd_extension_size >> 6) > 0xff) {
-                error_setg(errp,
-                           "zone descriptor extension size is too large");
-                return -1;
-            }
-        }
-
         if (ns->params.zrwas) {
             if (ns->params.zrwas % ns->blkconf.logical_block_size) {
                 error_setg(errp, "zone random write area size (zoned.zrwas "
@@ -677,7 +664,6 @@ void nvme_ns_cleanup(NvmeNamespace *ns)
     if (blk_get_zone_model(ns->blkconf.blk)) {
         g_free(ns->id_ns_zoned);
         g_free(ns->zone_array);
-        g_free(ns->zd_extensions);
     }
 
     if (ns->endgrp && ns->endgrp->fdp.enabled) {
@@ -795,8 +781,6 @@ static Property nvme_ns_props[] = {
                        params.max_active_zones, 0),
     DEFINE_PROP_UINT32("zoned.max_open", NvmeNamespace,
                        params.max_open_zones, 0),
-    DEFINE_PROP_UINT32("zoned.descr_ext_size", NvmeNamespace,
-                       params.zd_extension_size, 0),
     DEFINE_PROP_UINT32("zoned.numzrwa", NvmeNamespace, params.numzrwa, 0),
     DEFINE_PROP_SIZE("zoned.zrwas", NvmeNamespace, params.zrwas, 0),
     DEFINE_PROP_SIZE("zoned.zrwafg", NvmeNamespace, params.zrwafg, -1),
diff --git a/hw/nvme/nvme.h b/hw/nvme/nvme.h
index 76677a86e9..37007952fc 100644
--- a/hw/nvme/nvme.h
+++ b/hw/nvme/nvme.h
@@ -192,7 +192,6 @@ typedef struct NvmeNamespaceParams {
     bool     cross_zone_read;
     uint32_t max_active_zones;
     uint32_t max_open_zones;
-    uint32_t zd_extension_size;
 
     uint32_t numzrwa;
     uint64_t zrwas;
@@ -315,12 +314,6 @@ static inline bool nvme_wp_is_valid(NvmeZone *zone)
            st != NVME_ZONE_STATE_OFFLINE;
 }
 
-static inline uint8_t *nvme_get_zd_extension(NvmeNamespace *ns,
-                                             uint32_t zone_idx)
-{
-    return &ns->zd_extensions[zone_idx * ns->params.zd_extension_size];
-}
-
 static inline void nvme_aor_inc_open(NvmeNamespace *ns)
 {
     assert(ns->nr_open_zones >= 0);
diff --git a/include/block/block_int-common.h b/include/block/block_int-common.h
index 1dbe820a9b..e16dfe8581 100644
--- a/include/block/block_int-common.h
+++ b/include/block/block_int-common.h
@@ -61,6 +61,7 @@
 #define BLOCK_OPT_Z_MODEL       "zoned"
 #define BLOCK_OPT_Z_SIZE        "zone_size"
 #define BLOCK_OPT_Z_CAP         "zone_capacity"
+#define BLOCK_OPT_Z_DEXTSIZE    "zd_extension_size"
 #define BLOCK_OPT_Z_NR_COV      "zone_nr_conv"
 #define BLOCK_OPT_Z_MAS         "max_append_sectors"
 #define BLOCK_OPT_Z_MAZ         "max_active_zones"
@@ -907,6 +908,9 @@ typedef struct BlockLimits {
     uint32_t max_active_zones;
 
     uint32_t write_granularity;
+
+    /* size of data that is associated with a zone in bytes */
+    uint32_t zd_extension_size;
 } BlockLimits;
 
 typedef struct BdrvOpBlocker BdrvOpBlocker;
@@ -1265,6 +1269,8 @@ struct BlockDriverState {
     /* array of write pointers' location of each zone in the zoned device. */
     BlockZoneWps *wps;
+
+    uint8_t *zd_extensions;
 };
 
 struct BlockBackendRootState {
diff --git a/include/sysemu/block-backend-io.h b/include/sysemu/block-backend-io.h
index 3be221e752..c56ed29c8f 100644
--- a/include/sysemu/block-backend-io.h
+++ b/include/sysemu/block-backend-io.h
@@ -106,6 +106,8 @@ uint32_t blk_get_zone_capacity(BlockBackend *blk);
 uint32_t blk_get_max_open_zones(BlockBackend *blk);
 uint32_t blk_get_max_active_zones(BlockBackend *blk);
 uint32_t blk_get_max_append_sectors(BlockBackend *blk);
+uint8_t *blk_get_zone_extension(BlockBackend *blk);
+uint32_t blk_get_zd_ext_size(BlockBackend *blk);
 void blk_io_plug(void);
 void blk_io_unplug(void);
 
diff --git a/qapi/block-core.json b/qapi/block-core.json
index 0c97ae678b..f71dd18fc3 100644
--- a/qapi/block-core.json
+++ b/qapi/block-core.json
@@ -5024,6 +5024,8 @@
 #     (default: off, since 8.0)
 # @zone-size: The size of a zone of the zoned device (since 8.0)
 # @zone-capacity: The capacity of a zone of the zoned device (since 8.0)
+# @zd-extension-size: Zone descriptor extension size. Must be a multiple of
+#     64 bytes (since 8.0)
 # @zone-nr-conv: The number of conventional zones of the zoned device
 #     (since 8.0)
 # @max-open-zones: The maximal allowed open zones (since 8.0)
@@ -5052,6 +5054,7 @@
             '*zoned-profile': 'str',
             '*zone-size': 'size',
             '*zone-capacity': 'size',
+            '*zd-extension-size': 'size',
             '*zone-nr-conv': 'uint32',
             '*max-open-zones': 'uint32',
             '*max-active-zones': 'uint32',
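Taken together, creating a ZNS-profile qcow2 image that carries zone
descriptor extensions would look roughly like this; the -o option
spellings are inferred from the BLOCK_OPT_Z_* create options added above
and may still change in later revisions:

    qemu-img create -f qcow2 \
        -o zoned_profile=zns,zone_size=64M,zone_capacity=48M \
        -o max_open_zones=16,max_active_zones=32,zd_extension_size=64 \
        zns.qcow2 16G

From patchwork Wed Aug 16 06:46:15 2023
X-Patchwork-Submitter: Sam Li <faithilikerun@gmail.com>
X-Patchwork-Id: 1821706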
From: Sam Li <faithilikerun@gmail.com>
To: qemu-devel@nongnu.org
Cc: Peter Xu, stefanha@redhat.com, David Hildenbrand, Kevin Wolf,
    Markus Armbruster, Keith Busch, Hanna Reitz, dmitry.fomichev@wdc.com,
    Eric Blake, Klaus Jensen, Philippe Mathieu-Daudé, dlemoal@kernel.org,
    Paolo Bonzini, hare@suse.de, qemu-block@nongnu.org, Sam Li
Subject: [RFC 3/5] hw/nvme: make the metadata of ZNS emulation persistent
Date: Wed, 16 Aug 2023 14:46:15 +0800
Message-Id: <20230816064617.3310-4-faithilikerun@gmail.com>
In-Reply-To: <20230816064617.3310-1-faithilikerun@gmail.com>
References: <20230816064617.3310-1-faithilikerun@gmail.com>

The emulated NVMe ZNS devices follow the NVMe ZNS spec, but the state of
namespace zones does not persist across restarts of QEMU. This patch
makes the metadata of the ZNS emulation persistent by using the new
block layer APIs: the ZNS device calls the zone report and zone
management APIs of the block layer, which handles zone state transitions
and manages zone resources.

Signed-off-by: Sam Li <faithilikerun@gmail.com>
---
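As a sketch of the call pattern this moves to, assuming the coroutine
zone APIs from the zoned block-layer series (blk_co_zone_report(),
blk_co_zone_mgmt(); the wrapper function name below is hypothetical and
error handling is trimmed):

    /* The device no longer tracks zone state itself: it fetches zone
     * descriptors from the block layer and issues zone management ops
     * through it. For qcow2 backends the write pointers are then
     * persisted in the image's zoned metadata. */
    static int coroutine_fn zns_sync_with_block_layer(BlockBackend *blk,
                                                      BlockZoneDescriptor *descs,
                                                      unsigned int *nr_zones)
    {
        int ret = blk_co_zone_report(blk, 0, nr_zones, descs);
        if (ret < 0) {
            return ret;
        }

        /* e.g. reset the first zone; its state moves to EMPTY on success */
        return blk_co_zone_mgmt(blk, BLK_ZO_RESET, descs[0].start,
                                descs[0].length);
    }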
 block/block-backend.c             |   15 +
 block/qcow2.c                     |    3 +
 hw/nvme/ctrl.c                    | 1114 ++++++-----------------------
 hw/nvme/ns.c                      |   77 +-
 hw/nvme/nvme.h                    |   85 +--
 include/block/block-common.h      |    8 +
 include/block/block_int-common.h  |    2 +
 include/sysemu/block-backend-io.h |    2 +
 8 files changed, 283 insertions(+), 1023 deletions(-)

diff --git a/block/block-backend.c b/block/block-backend.c
index f68c5263f3..9c95ae0267 100644
--- a/block/block-backend.c
+++ b/block/block-backend.c
@@ -2418,6 +2418,14 @@ uint32_t blk_get_max_append_sectors(BlockBackend *blk)
     return bs ? bs->bl.max_append_sectors : 0;
 }
 
+uint32_t blk_get_nr_zones(BlockBackend *blk)
+{
+    BlockDriverState *bs = blk_bs(blk);
+    IO_CODE();
+
+    return bs ? bs->bl.nr_zones : 0;
+}
+
 uint8_t *blk_get_zone_extension(BlockBackend *blk)
 {
     BlockDriverState *bs = blk_bs(blk);
     IO_CODE();
@@ -2433,6 +2441,13 @@ uint32_t blk_get_zd_ext_size(BlockBackend *blk)
     return bs ? bs->bl.zd_extension_size : 0;
 }
 
+BlockZoneWps *blk_get_zone_wps(BlockBackend *blk)
+{
+    BlockDriverState *bs = blk_bs(blk);
+    IO_CODE();
+
+    return bs ? bs->wps : NULL;
+}
+
 void *blk_try_blockalign(BlockBackend *blk, size_t size)
 {
     IO_CODE();
diff --git a/block/qcow2.c b/block/qcow2.c
index fce1fe83a7..41549dd68b 100644
--- a/block/qcow2.c
+++ b/block/qcow2.c
@@ -4854,6 +4854,9 @@ static int coroutine_fn qcow2_co_zone_mgmt(BlockDriverState *bs, BlockZoneOp op,
     case BLK_ZO_RESET:
         ret = qcow2_reset_zone(bs, index, len);
         break;
+    case BLK_ZO_OFFLINE:
+        ret = qcow2_write_wp_at(bs, &wps->wp[index], index, BLK_ZO_OFFLINE);
+        break;
     default:
         error_report("Unsupported zone op: 0x%x", op);
         ret = -ENOTSUP;
diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
index 4320f3a15c..8d4c08dc4c 100644
--- a/hw/nvme/ctrl.c
+++ b/hw/nvme/ctrl.c
@@ -372,67 +372,6 @@ static inline bool nvme_parse_pid(NvmeNamespace *ns, uint16_t pid,
     return nvme_ph_valid(ns, *ph) && nvme_rg_valid(ns->endgrp, *rg);
 }
 
-static void nvme_assign_zone_state(NvmeNamespace *ns, NvmeZone *zone,
-                                   NvmeZoneState state)
-{
-    if (QTAILQ_IN_USE(zone, entry)) {
-        switch (nvme_get_zone_state(zone)) {
-        case NVME_ZONE_STATE_EXPLICITLY_OPEN:
-            QTAILQ_REMOVE(&ns->exp_open_zones, zone, entry);
-            break;
-        case NVME_ZONE_STATE_IMPLICITLY_OPEN:
-            QTAILQ_REMOVE(&ns->imp_open_zones, zone, entry);
-            break;
-        case NVME_ZONE_STATE_CLOSED:
-            QTAILQ_REMOVE(&ns->closed_zones, zone, entry);
-            break;
-        case NVME_ZONE_STATE_FULL:
-            QTAILQ_REMOVE(&ns->full_zones, zone, entry);
-        default:
-            ;
-        }
-    }
-
-    nvme_set_zone_state(zone, state);
-
-    switch (state) {
-    case NVME_ZONE_STATE_EXPLICITLY_OPEN:
-        QTAILQ_INSERT_TAIL(&ns->exp_open_zones, zone, entry);
-        break;
-    case NVME_ZONE_STATE_IMPLICITLY_OPEN:
-        QTAILQ_INSERT_TAIL(&ns->imp_open_zones, zone, entry);
-        break;
-    case NVME_ZONE_STATE_CLOSED:
-        QTAILQ_INSERT_TAIL(&ns->closed_zones, zone, entry);
-        break;
-    case NVME_ZONE_STATE_FULL:
-        QTAILQ_INSERT_TAIL(&ns->full_zones, zone, entry);
-    case NVME_ZONE_STATE_READ_ONLY:
-        break;
-    default:
-        zone->d.za = 0;
-    }
-}
-
-static uint16_t nvme_zns_check_resources(NvmeNamespace *ns, uint32_t act,
-                                         uint32_t opn, uint32_t zrwa)
-{
-    if (zrwa > ns->zns.numzrwa) {
-        return NVME_NOZRWA | NVME_DNR;
-    }
-
-    return NVME_SUCCESS;
-}
-
-/*
- * Check if we can open a zone without exceeding open/active limits.
- * AOR stands for "Active and Open Resources" (see TP 4053 section 2.5).
- */
-static uint16_t nvme_aor_check(NvmeNamespace *ns, uint32_t act, uint32_t opn)
-{
-    return nvme_zns_check_resources(ns, act, opn, 0);
-}
-
 static NvmeFdpEvent *nvme_fdp_alloc_event(NvmeCtrl *n, NvmeFdpEventBuffer *ebuf)
 {
     NvmeFdpEvent *ret = NULL;
@@ -1769,346 +1708,11 @@ static inline uint32_t nvme_zone_idx(NvmeNamespace *ns, uint64_t slba)
                                     slba / ns->zone_size;
 }
 
-static inline NvmeZone *nvme_get_zone_by_slba(NvmeNamespace *ns, uint64_t slba)
-{
-    uint32_t zone_idx = nvme_zone_idx(ns, slba);
-
-    if (zone_idx >= ns->num_zones) {
-        return NULL;
-    }
-
-    return &ns->zone_array[zone_idx];
-}
-
-static uint16_t nvme_check_zone_state_for_write(NvmeZone *zone)
-{
-    uint64_t zslba = zone->d.zslba;
-
-    switch (nvme_get_zone_state(zone)) {
-    case NVME_ZONE_STATE_EMPTY:
-    case NVME_ZONE_STATE_IMPLICITLY_OPEN:
-    case NVME_ZONE_STATE_EXPLICITLY_OPEN:
-    case NVME_ZONE_STATE_CLOSED:
-        return NVME_SUCCESS;
-    case NVME_ZONE_STATE_FULL:
-        trace_pci_nvme_err_zone_is_full(zslba);
-        return NVME_ZONE_FULL;
-    case NVME_ZONE_STATE_OFFLINE:
-        trace_pci_nvme_err_zone_is_offline(zslba);
-        return NVME_ZONE_OFFLINE;
-    case NVME_ZONE_STATE_READ_ONLY:
-        trace_pci_nvme_err_zone_is_read_only(zslba);
-        return NVME_ZONE_READ_ONLY;
-    default:
-        assert(false);
-    }
-
-    return NVME_INTERNAL_DEV_ERROR;
-}
-
-static uint16_t nvme_check_zone_write(NvmeNamespace *ns, NvmeZone *zone,
-                                      uint64_t slba, uint32_t nlb)
-{
-    uint64_t zcap = nvme_zone_wr_boundary(zone);
-    uint16_t status;
-
-    status = nvme_check_zone_state_for_write(zone);
-    if (status) {
-        return status;
-    }
-
-    if (zone->d.za & NVME_ZA_ZRWA_VALID) {
-        uint64_t ezrwa = zone->w_ptr + 2 * ns->zns.zrwas;
-
-        if (slba < zone->w_ptr || slba + nlb > ezrwa) {
-            trace_pci_nvme_err_zone_invalid_write(slba, zone->w_ptr);
-            return NVME_ZONE_INVALID_WRITE;
-        }
-    } else {
-        if (unlikely(slba != zone->w_ptr)) {
-            trace_pci_nvme_err_write_not_at_wp(slba, zone->d.zslba,
-                                               zone->w_ptr);
-            return NVME_ZONE_INVALID_WRITE;
-        }
-    }
-
-    if (unlikely((slba + nlb) > zcap)) {
-        trace_pci_nvme_err_zone_boundary(slba, nlb, zcap);
-        return NVME_ZONE_BOUNDARY_ERROR;
-    }
-
-    return NVME_SUCCESS;
-}
-
-static uint16_t nvme_check_zone_state_for_read(NvmeZone *zone)
-{
-    switch (nvme_get_zone_state(zone)) {
-    case NVME_ZONE_STATE_EMPTY:
-    case NVME_ZONE_STATE_IMPLICITLY_OPEN:
-    case NVME_ZONE_STATE_EXPLICITLY_OPEN:
-    case NVME_ZONE_STATE_FULL:
-    case NVME_ZONE_STATE_CLOSED:
-    case NVME_ZONE_STATE_READ_ONLY:
-        return NVME_SUCCESS;
-    case NVME_ZONE_STATE_OFFLINE:
-        trace_pci_nvme_err_zone_is_offline(zone->d.zslba);
-        return NVME_ZONE_OFFLINE;
-    default:
-        assert(false);
-    }
-
-    return NVME_INTERNAL_DEV_ERROR;
-}
-
-static uint16_t nvme_check_zone_read(NvmeNamespace *ns, uint64_t slba,
-                                     uint32_t nlb)
-{
-    NvmeZone *zone;
-    uint64_t bndry, end;
-    uint16_t status;
-
-    zone = nvme_get_zone_by_slba(ns, slba);
-    assert(zone);
-
-    bndry = nvme_zone_rd_boundary(ns, zone);
-    end = slba + nlb;
-
-    status = nvme_check_zone_state_for_read(zone);
-    if (status) {
-        ;
-    } else if (unlikely(end > bndry)) {
-        if (!ns->params.cross_zone_read) {
-            status = NVME_ZONE_BOUNDARY_ERROR;
-        } else {
-            /*
-             * Read across zone boundary - check that all subsequent
-             * zones that are being read have an appropriate state.
-             */
-            do {
-                zone++;
-                status = nvme_check_zone_state_for_read(zone);
-                if (status) {
-                    break;
-                }
-            } while (end > nvme_zone_rd_boundary(ns, zone));
-        }
-    }
-
-    return status;
-}
-
-static uint16_t nvme_zrm_finish(NvmeNamespace *ns, NvmeZone *zone)
-{
-    switch (nvme_get_zone_state(zone)) {
-    case NVME_ZONE_STATE_FULL:
-        return NVME_SUCCESS;
-
-    case NVME_ZONE_STATE_IMPLICITLY_OPEN:
-    case NVME_ZONE_STATE_EXPLICITLY_OPEN:
-        nvme_aor_dec_open(ns);
-        /* fallthrough */
-    case NVME_ZONE_STATE_CLOSED:
-        nvme_aor_dec_active(ns);
-
-        if (zone->d.za & NVME_ZA_ZRWA_VALID) {
-            zone->d.za &= ~NVME_ZA_ZRWA_VALID;
-            if (ns->params.numzrwa) {
-                ns->zns.numzrwa++;
-            }
-        }
-
-        /* fallthrough */
-    case NVME_ZONE_STATE_EMPTY:
-        nvme_assign_zone_state(ns, zone, NVME_ZONE_STATE_FULL);
-        return NVME_SUCCESS;
-
-    default:
-        return NVME_ZONE_INVAL_TRANSITION;
-    }
-}
-
-static uint16_t nvme_zrm_close(NvmeNamespace *ns, NvmeZone *zone)
-{
-    switch (nvme_get_zone_state(zone)) {
-    case NVME_ZONE_STATE_EXPLICITLY_OPEN:
-    case NVME_ZONE_STATE_IMPLICITLY_OPEN:
-        nvme_aor_dec_open(ns);
-        nvme_assign_zone_state(ns, zone, NVME_ZONE_STATE_CLOSED);
-        /* fall through */
-    case NVME_ZONE_STATE_CLOSED:
-        return NVME_SUCCESS;
-
-    default:
-        return NVME_ZONE_INVAL_TRANSITION;
-    }
-}
-
-static uint16_t nvme_zrm_reset(NvmeNamespace *ns, NvmeZone *zone)
-{
-    switch (nvme_get_zone_state(zone)) {
-    case NVME_ZONE_STATE_EXPLICITLY_OPEN:
-    case NVME_ZONE_STATE_IMPLICITLY_OPEN:
-        nvme_aor_dec_open(ns);
-        /* fallthrough */
-    case NVME_ZONE_STATE_CLOSED:
-        nvme_aor_dec_active(ns);
-
-        if (zone->d.za & NVME_ZA_ZRWA_VALID) {
-            if (ns->params.numzrwa) {
-                ns->zns.numzrwa++;
-            }
-        }
-
-        /* fallthrough */
-    case NVME_ZONE_STATE_FULL:
-        zone->w_ptr = zone->d.zslba;
-        zone->d.wp = zone->w_ptr;
-        nvme_assign_zone_state(ns, zone, NVME_ZONE_STATE_EMPTY);
-        /* fallthrough */
-    case NVME_ZONE_STATE_EMPTY:
-        return NVME_SUCCESS;
-
-    default:
-        return NVME_ZONE_INVAL_TRANSITION;
-    }
-}
-
-static void nvme_zrm_auto_transition_zone(NvmeNamespace *ns)
-{
-    NvmeZone *zone;
-    int moz = blk_get_max_open_zones(ns->blkconf.blk);
-
-    if (moz && ns->nr_open_zones == moz) {
-        zone = QTAILQ_FIRST(&ns->imp_open_zones);
-        if (zone) {
-            /*
-             * Automatically close this implicitly open zone.
-             */
-            QTAILQ_REMOVE(&ns->imp_open_zones, zone, entry);
-            nvme_zrm_close(ns, zone);
-        }
-    }
-}
-
 enum {
     NVME_ZRM_AUTO = 1 << 0,
     NVME_ZRM_ZRWA = 1 << 1,
 };
 
-static uint16_t nvme_zrm_open_flags(NvmeCtrl *n, NvmeNamespace *ns,
-                                    NvmeZone *zone, int flags)
-{
-    int act = 0;
-    uint16_t status;
-
-    switch (nvme_get_zone_state(zone)) {
-    case NVME_ZONE_STATE_EMPTY:
-        act = 1;
-
-        /* fallthrough */
-
-    case NVME_ZONE_STATE_CLOSED:
-        if (n->params.auto_transition_zones) {
-            nvme_zrm_auto_transition_zone(ns);
-        }
-        status = nvme_zns_check_resources(ns, act, 1,
-                                          (flags & NVME_ZRM_ZRWA) ? 1 : 0);
-        if (status) {
-            return status;
-        }
-
-        if (act) {
-            nvme_aor_inc_active(ns);
-        }
-
-        nvme_aor_inc_open(ns);
-
-        if (flags & NVME_ZRM_AUTO) {
-            nvme_assign_zone_state(ns, zone, NVME_ZONE_STATE_IMPLICITLY_OPEN);
-            return NVME_SUCCESS;
-        }
-
-        /* fallthrough */
-
-    case NVME_ZONE_STATE_IMPLICITLY_OPEN:
-        if (flags & NVME_ZRM_AUTO) {
-            return NVME_SUCCESS;
-        }
-
-        nvme_assign_zone_state(ns, zone, NVME_ZONE_STATE_EXPLICITLY_OPEN);
-
-        /* fallthrough */
-
-    case NVME_ZONE_STATE_EXPLICITLY_OPEN:
-        if (flags & NVME_ZRM_ZRWA) {
-            ns->zns.numzrwa--;
-
-            zone->d.za |= NVME_ZA_ZRWA_VALID;
-        }
-
-        return NVME_SUCCESS;
-
-    default:
-        return NVME_ZONE_INVAL_TRANSITION;
-    }
-}
-
-static inline uint16_t nvme_zrm_auto(NvmeCtrl *n, NvmeNamespace *ns,
-                                     NvmeZone *zone)
-{
-    return nvme_zrm_open_flags(n, ns, zone, NVME_ZRM_AUTO);
-}
-
-static void nvme_advance_zone_wp(NvmeNamespace *ns, NvmeZone *zone,
-                                 uint32_t nlb)
-{
-    zone->d.wp += nlb;
-
-    if (zone->d.wp == nvme_zone_wr_boundary(zone)) {
-        nvme_zrm_finish(ns, zone);
-    }
-}
-
-static void nvme_zoned_zrwa_implicit_flush(NvmeNamespace *ns, NvmeZone *zone,
-                                           uint32_t nlbc)
-{
-    uint16_t nzrwafgs = DIV_ROUND_UP(nlbc, ns->zns.zrwafg);
-
-    nlbc = nzrwafgs * ns->zns.zrwafg;
-
-    trace_pci_nvme_zoned_zrwa_implicit_flush(zone->d.zslba, nlbc);
-
-    zone->w_ptr += nlbc;
-
-    nvme_advance_zone_wp(ns, zone, nlbc);
-}
-
-static void nvme_finalize_zoned_write(NvmeNamespace *ns, NvmeRequest *req)
-{
-    NvmeRwCmd *rw = (NvmeRwCmd *)&req->cmd;
-    NvmeZone *zone;
-    uint64_t slba;
-    uint32_t nlb;
-
-    slba = le64_to_cpu(rw->slba);
-    nlb = le16_to_cpu(rw->nlb) + 1;
-    zone = nvme_get_zone_by_slba(ns, slba);
-    assert(zone);
-
-    if (zone->d.za & NVME_ZA_ZRWA_VALID) {
-        uint64_t ezrwa = zone->w_ptr + ns->zns.zrwas - 1;
-        uint64_t elba = slba + nlb - 1;
-
-        if (elba > ezrwa) {
-            nvme_zoned_zrwa_implicit_flush(ns, zone, elba - ezrwa);
-        }
-
-        return;
-    }
-
-    nvme_advance_zone_wp(ns, zone, nlb);
-}
-
 static inline bool nvme_is_write(NvmeRequest *req)
 {
     NvmeRwCmd *rw = (NvmeRwCmd *)&req->cmd;
@@ -2153,10 +1757,6 @@ void nvme_rw_complete_cb(void *opaque, int ret)
         block_acct_done(stats, acct);
     }
 
-    if (blk_get_zone_model(blk) && nvme_is_write(req)) {
-        nvme_finalize_zoned_write(ns, req);
-    }
-
     nvme_enqueue_req_completion(nvme_cq(req), req);
 }
 
@@ -2861,8 +2461,6 @@ static inline uint16_t nvme_check_copy_mcl(NvmeNamespace *ns,
 static void nvme_copy_out_completed_cb(void *opaque, int ret)
 {
     NvmeCopyAIOCB *iocb = opaque;
-    NvmeRequest *req = iocb->req;
-    NvmeNamespace *ns = req->ns;
     uint32_t nlb;
 
     nvme_copy_source_range_parse(iocb->ranges, iocb->idx, iocb->format, NULL,
@@ -2875,10 +2473,6 @@ static void nvme_copy_out_completed_cb(void *opaque, int ret)
         goto out;
     }
 
-    if (blk_get_zone_model(ns->blkconf.blk)) {
-        nvme_advance_zone_wp(ns, iocb->zone, nlb);
-    }
-
     iocb->idx++;
     iocb->slba += nlb;
 out:
@@ -2987,17 +2581,6 @@ static void nvme_copy_in_completed_cb(void *opaque, int ret)
         goto invalid;
     }
 
-    if (blk_get_zone_model(ns->blkconf.blk)) {
-        status = nvme_check_zone_write(ns, iocb->zone, iocb->slba, nlb);
-        if (status) {
-            goto invalid;
-        }
-
-        if (!(iocb->zone->d.za & NVME_ZA_ZRWA_VALID)) {
-            iocb->zone->w_ptr += nlb;
-        }
-    }
-
     qemu_iovec_reset(&iocb->iov);
     qemu_iovec_add(&iocb->iov, iocb->bounce, len);
 
@@ -3081,13 +2664,6 @@ static void nvme_do_copy(NvmeCopyAIOCB *iocb)
         }
     }
 
-    if (blk_get_zone_model(ns->blkconf.blk)) {
-        status = nvme_check_zone_read(ns, slba, nlb);
-        if (status) {
-            goto invalid;
-        }
-    }
-
     qemu_iovec_reset(&iocb->iov);
     qemu_iovec_add(&iocb->iov, iocb->bounce, len);
 
@@ -3157,19 +2733,6 @@ static uint16_t nvme_copy(NvmeCtrl *n, NvmeRequest *req)
 
     iocb->slba = le64_to_cpu(copy->sdlba);
 
-    if (blk_get_zone_model(ns->blkconf.blk)) {
-        iocb->zone = nvme_get_zone_by_slba(ns, iocb->slba);
-        if (!iocb->zone) {
-            status = NVME_LBA_RANGE | NVME_DNR;
-            goto invalid;
-        }
-
-        status = nvme_zrm_auto(n, ns, iocb->zone);
-        if (status) {
-            goto invalid;
-        }
-    }
-
     status = nvme_check_copy_mcl(ns, iocb, nr);
     if (status) {
         goto invalid;
@@ -3428,14 +2991,6 @@ static uint16_t nvme_read(NvmeCtrl *n, NvmeRequest *req)
         goto invalid;
     }
 
-    if (blk_get_zone_model(blk)) {
-        status = nvme_check_zone_read(ns, slba, nlb);
-        if (status) {
-            trace_pci_nvme_err_zone_read_not_ok(slba, nlb, status);
-            goto invalid;
-        }
-    }
-
     if (NVME_ERR_REC_DULBE(ns->features.err_rec)) {
         status = nvme_check_dulbe(ns, slba, nlb);
         if (status) {
@@ -3511,8 +3066,6 @@ static uint16_t nvme_do_write(NvmeCtrl *n, NvmeRequest *req, bool append,
     uint64_t data_size = nvme_l2b(ns, nlb);
     uint64_t mapped_size = data_size;
     uint64_t data_offset;
-    NvmeZone *zone;
-    NvmeZonedResult *res = (NvmeZonedResult *)&req->cqe;
     BlockBackend *blk = ns->blkconf.blk;
     uint16_t status;
 
@@ -3544,32 +3097,20 @@ static uint16_t nvme_do_write(NvmeCtrl *n, NvmeRequest *req, bool append,
     }
 
     if (blk_get_zone_model(blk)) {
-        zone = nvme_get_zone_by_slba(ns, slba);
-        assert(zone);
+        uint32_t zone_size = blk_get_zone_size(blk);
+        uint32_t zone_idx = slba / zone_size;
+        int64_t zone_start = zone_idx * zone_size;
 
         if (append) {
             bool piremap = !!(ctrl & NVME_RW_PIREMAP);
 
-            if (unlikely(zone->d.za & NVME_ZA_ZRWA_VALID)) {
-                return NVME_INVALID_ZONE_OP | NVME_DNR;
-            }
-
-            if (unlikely(slba != zone->d.zslba)) {
-                trace_pci_nvme_err_append_not_at_start(slba, zone->d.zslba);
-                status = NVME_INVALID_FIELD;
-                goto invalid;
-            }
-
             if (n->params.zasl &&
                 data_size > (uint64_t)n->page_size << n->params.zasl) {
                 trace_pci_nvme_err_zasl(data_size);
                 return NVME_INVALID_FIELD | NVME_DNR;
             }
 
-            slba = zone->w_ptr;
             rw->slba = cpu_to_le64(slba);
-            res->slba = cpu_to_le64(slba);
-
             switch (NVME_ID_NS_DPS_TYPE(ns->id_ns.dps)) {
             case NVME_ID_NS_DPS_TYPE_1:
                 if (!piremap) {
@@ -3581,7 +3122,7 @@ static uint16_t nvme_do_write(NvmeCtrl *n, NvmeRequest *req, bool append,
             case NVME_ID_NS_DPS_TYPE_2:
                 if (piremap) {
                     uint32_t reftag = le32_to_cpu(rw->reftag);
-                    rw->reftag = cpu_to_le32(reftag + (slba - zone->d.zslba));
+                    rw->reftag = cpu_to_le32(reftag + (slba - zone_start));
                 }
 
                 break;
@@ -3595,19 +3136,6 @@ static uint16_t nvme_do_write(NvmeCtrl *n, NvmeRequest *req, bool append,
             }
         }
 
-        status = nvme_check_zone_write(ns, zone, slba, nlb);
-        if (status) {
-            goto invalid;
-        }
-
-        status = nvme_zrm_auto(n, ns, zone);
-        if (status) {
-            goto invalid;
-        }
-
-        if (!(zone->d.za & NVME_ZA_ZRWA_VALID)) {
-            zone->w_ptr += nlb;
-        }
     } else if (ns->endgrp && ns->endgrp->fdp.enabled) {
         nvme_do_write_fdp(n, req, slba, nlb);
     }
@@ -3650,6 +3178,23 @@ static inline uint16_t nvme_write_zeroes(NvmeCtrl *n, NvmeRequest *req)
     return nvme_do_write(n, req, false, true);
 }
 
+typedef struct NvmeZoneCmdAIOCB {
+    NvmeRequest *req;
+    NvmeCmd *cmd;
+    NvmeCtrl *n;
+
+    union {
+        struct {
+            uint32_t partial;
+            unsigned int nr_zones;
+            BlockZoneDescriptor *zones;
+        } zone_report_data;
+        struct {
+            int64_t offset;
+        } zone_append_data;
+    };
+} NvmeZoneCmdAIOCB;
+
 static inline uint16_t nvme_zone_append(NvmeCtrl *n, NvmeRequest *req)
 {
     return nvme_do_write(n, req, true, false);
@@ -3661,7 +3206,7 @@ static uint16_t nvme_get_mgmt_zone_slba_idx(NvmeNamespace *ns, NvmeCmd *c,
     uint32_t dw10 = le32_to_cpu(c->cdw10);
    uint32_t dw11
= le32_to_cpu(c->cdw11); - if (blk_get_zone_model(ns->blkconf.blk)) { + if (!blk_get_zone_model(ns->blkconf.blk)) { trace_pci_nvme_err_invalid_opc(c->opcode); return NVME_INVALID_OPCODE | NVME_DNR; } @@ -3679,198 +3224,21 @@ static uint16_t nvme_get_mgmt_zone_slba_idx(NvmeNamespace *ns, NvmeCmd *c, return NVME_SUCCESS; } -typedef uint16_t (*op_handler_t)(NvmeNamespace *, NvmeZone *, NvmeZoneState, - NvmeRequest *); - -enum NvmeZoneProcessingMask { - NVME_PROC_CURRENT_ZONE = 0, - NVME_PROC_OPENED_ZONES = 1 << 0, - NVME_PROC_CLOSED_ZONES = 1 << 1, - NVME_PROC_READ_ONLY_ZONES = 1 << 2, - NVME_PROC_FULL_ZONES = 1 << 3, -}; - -static uint16_t nvme_open_zone(NvmeNamespace *ns, NvmeZone *zone, - NvmeZoneState state, NvmeRequest *req) -{ - NvmeZoneSendCmd *cmd = (NvmeZoneSendCmd *)&req->cmd; - int flags = 0; - - if (cmd->zsflags & NVME_ZSFLAG_ZRWA_ALLOC) { - uint16_t ozcs = le16_to_cpu(ns->id_ns_zoned->ozcs); - - if (!(ozcs & NVME_ID_NS_ZONED_OZCS_ZRWASUP)) { - return NVME_INVALID_ZONE_OP | NVME_DNR; - } - - if (zone->w_ptr % ns->zns.zrwafg) { - return NVME_NOZRWA | NVME_DNR; - } - - flags = NVME_ZRM_ZRWA; - } - - return nvme_zrm_open_flags(nvme_ctrl(req), ns, zone, flags); -} - -static uint16_t nvme_close_zone(NvmeNamespace *ns, NvmeZone *zone, - NvmeZoneState state, NvmeRequest *req) -{ - return nvme_zrm_close(ns, zone); -} - -static uint16_t nvme_finish_zone(NvmeNamespace *ns, NvmeZone *zone, - NvmeZoneState state, NvmeRequest *req) -{ - return nvme_zrm_finish(ns, zone); -} - -static uint16_t nvme_offline_zone(NvmeNamespace *ns, NvmeZone *zone, - NvmeZoneState state, NvmeRequest *req) -{ - switch (state) { - case NVME_ZONE_STATE_READ_ONLY: - nvme_assign_zone_state(ns, zone, NVME_ZONE_STATE_OFFLINE); - /* fall through */ - case NVME_ZONE_STATE_OFFLINE: - return NVME_SUCCESS; - default: - return NVME_ZONE_INVAL_TRANSITION; - } -} - -static uint16_t nvme_set_zd_ext(NvmeNamespace *ns, NvmeZone *zone) -{ - uint16_t status; - uint8_t state = nvme_get_zone_state(zone); - - if (state == NVME_ZONE_STATE_EMPTY) { - status = nvme_aor_check(ns, 1, 0); - if (status) { - return status; - } - nvme_aor_inc_active(ns); - zone->d.za |= NVME_ZA_ZD_EXT_VALID; - nvme_assign_zone_state(ns, zone, NVME_ZONE_STATE_CLOSED); - return NVME_SUCCESS; - } - - return NVME_ZONE_INVAL_TRANSITION; -} - -static uint16_t nvme_bulk_proc_zone(NvmeNamespace *ns, NvmeZone *zone, - enum NvmeZoneProcessingMask proc_mask, - op_handler_t op_hndlr, NvmeRequest *req) -{ - uint16_t status = NVME_SUCCESS; - NvmeZoneState zs = nvme_get_zone_state(zone); - bool proc_zone; - - switch (zs) { - case NVME_ZONE_STATE_IMPLICITLY_OPEN: - case NVME_ZONE_STATE_EXPLICITLY_OPEN: - proc_zone = proc_mask & NVME_PROC_OPENED_ZONES; - break; - case NVME_ZONE_STATE_CLOSED: - proc_zone = proc_mask & NVME_PROC_CLOSED_ZONES; - break; - case NVME_ZONE_STATE_READ_ONLY: - proc_zone = proc_mask & NVME_PROC_READ_ONLY_ZONES; - break; - case NVME_ZONE_STATE_FULL: - proc_zone = proc_mask & NVME_PROC_FULL_ZONES; - break; - default: - proc_zone = false; - } - - if (proc_zone) { - status = op_hndlr(ns, zone, zs, req); - } - - return status; -} - -static uint16_t nvme_do_zone_op(NvmeNamespace *ns, NvmeZone *zone, - enum NvmeZoneProcessingMask proc_mask, - op_handler_t op_hndlr, NvmeRequest *req) -{ - NvmeZone *next; - uint16_t status = NVME_SUCCESS; - int i; - - if (!proc_mask) { - status = op_hndlr(ns, zone, nvme_get_zone_state(zone), req); - } else { - if (proc_mask & NVME_PROC_CLOSED_ZONES) { - QTAILQ_FOREACH_SAFE(zone, &ns->closed_zones, entry, next) { - status = 
nvme_bulk_proc_zone(ns, zone, proc_mask, op_hndlr, - req); - if (status && status != NVME_NO_COMPLETE) { - goto out; - } - } - } - if (proc_mask & NVME_PROC_OPENED_ZONES) { - QTAILQ_FOREACH_SAFE(zone, &ns->imp_open_zones, entry, next) { - status = nvme_bulk_proc_zone(ns, zone, proc_mask, op_hndlr, - req); - if (status && status != NVME_NO_COMPLETE) { - goto out; - } - } - - QTAILQ_FOREACH_SAFE(zone, &ns->exp_open_zones, entry, next) { - status = nvme_bulk_proc_zone(ns, zone, proc_mask, op_hndlr, - req); - if (status && status != NVME_NO_COMPLETE) { - goto out; - } - } - } - if (proc_mask & NVME_PROC_FULL_ZONES) { - QTAILQ_FOREACH_SAFE(zone, &ns->full_zones, entry, next) { - status = nvme_bulk_proc_zone(ns, zone, proc_mask, op_hndlr, - req); - if (status && status != NVME_NO_COMPLETE) { - goto out; - } - } - } - - if (proc_mask & NVME_PROC_READ_ONLY_ZONES) { - for (i = 0; i < ns->num_zones; i++, zone++) { - status = nvme_bulk_proc_zone(ns, zone, proc_mask, op_hndlr, - req); - if (status && status != NVME_NO_COMPLETE) { - goto out; - } - } - } - } - -out: - return status; -} - -typedef struct NvmeZoneResetAIOCB { +typedef struct NvmeZoneMgmtAIOCB { BlockAIOCB common; BlockAIOCB *aiocb; NvmeRequest *req; int ret; bool all; - int idx; - NvmeZone *zone; -} NvmeZoneResetAIOCB; + uint64_t offset; + uint64_t len; + BlockZoneOp op; +} NvmeZoneMgmtAIOCB; -static void nvme_zone_reset_cancel(BlockAIOCB *aiocb) +static void nvme_zone_mgmt_send_cancel(BlockAIOCB *aiocb) { - NvmeZoneResetAIOCB *iocb = container_of(aiocb, NvmeZoneResetAIOCB, common); - NvmeRequest *req = iocb->req; - NvmeNamespace *ns = req->ns; - - iocb->idx = ns->num_zones; + NvmeZoneMgmtAIOCB *iocb = container_of(aiocb, NvmeZoneMgmtAIOCB, common); iocb->ret = -ECANCELED; @@ -3880,117 +3248,66 @@ static void nvme_zone_reset_cancel(BlockAIOCB *aiocb) } } -static const AIOCBInfo nvme_zone_reset_aiocb_info = { - .aiocb_size = sizeof(NvmeZoneResetAIOCB), - .cancel_async = nvme_zone_reset_cancel, +static const AIOCBInfo nvme_zone_mgmt_aiocb_info = { + .aiocb_size = sizeof(NvmeZoneMgmtAIOCB), + .cancel_async = nvme_zone_mgmt_send_cancel, }; -static void nvme_zone_reset_cb(void *opaque, int ret); +static void nvme_zone_mgmt_send_cb(void *opaque, int ret); -static void nvme_zone_reset_epilogue_cb(void *opaque, int ret) +static void nvme_zone_mgmt_send_epilogue_cb(void *opaque, int ret) { - NvmeZoneResetAIOCB *iocb = opaque; - NvmeRequest *req = iocb->req; - NvmeNamespace *ns = req->ns; - int64_t moff; - int count; + NvmeZoneMgmtAIOCB *iocb = opaque; - if (ret < 0 || iocb->ret < 0 || !ns->lbaf.ms) { - goto out; + /* Report failures only; the request must complete in either case. */ + if (ret < 0) { + iocb->ret = ret; + error_report("zone mgmt op failed: %d", ret); } - moff = nvme_moff(ns, iocb->zone->d.zslba); - count = nvme_m2b(ns, ns->zone_size); - - iocb->aiocb = blk_aio_pwrite_zeroes(ns->blkconf.blk, moff, count, - BDRV_REQ_MAY_UNMAP, - nvme_zone_reset_cb, iocb); - return; -out: - nvme_zone_reset_cb(iocb, ret); + iocb->aiocb = NULL; + iocb->common.cb(iocb->common.opaque, iocb->ret); + qemu_aio_unref(iocb); } -static void nvme_zone_reset_cb(void *opaque, int ret) +static void nvme_zone_mgmt_send_cb(void *opaque, int ret) { - NvmeZoneResetAIOCB *iocb = opaque; + NvmeZoneMgmtAIOCB *iocb = opaque; NvmeRequest *req = iocb->req; NvmeNamespace *ns = req->ns; + BlockBackend *blk = ns->blkconf.blk; - if (iocb->ret < 0) { - goto done; - } else if (ret < 0) { - iocb->ret = ret; - goto done; - } - - if (iocb->zone) { - nvme_zrm_reset(ns, iocb->zone); - - if (!iocb->all) { - goto done; - }
- } - - while (iocb->idx < ns->num_zones) { - NvmeZone *zone = &ns->zone_array[iocb->idx++]; - - switch (nvme_get_zone_state(zone)) { - case NVME_ZONE_STATE_EMPTY: - if (!iocb->all) { - goto done; - } - - continue; - - case NVME_ZONE_STATE_EXPLICITLY_OPEN: - case NVME_ZONE_STATE_IMPLICITLY_OPEN: - case NVME_ZONE_STATE_CLOSED: - case NVME_ZONE_STATE_FULL: - iocb->zone = zone; - break; - - default: - continue; - } - - trace_pci_nvme_zns_zone_reset(zone->d.zslba); - - iocb->aiocb = blk_aio_pwrite_zeroes(ns->blkconf.blk, - nvme_l2b(ns, zone->d.zslba), - nvme_l2b(ns, ns->zone_size), - BDRV_REQ_MAY_UNMAP, - nvme_zone_reset_epilogue_cb, - iocb); - return; - } - -done: - iocb->aiocb = NULL; - - iocb->common.cb(iocb->common.opaque, iocb->ret); - qemu_aio_unref(iocb); + iocb->aiocb = blk_aio_zone_mgmt(blk, iocb->op, iocb->offset, + iocb->len, + nvme_zone_mgmt_send_epilogue_cb, iocb); } -static uint16_t nvme_zone_mgmt_send_zrwa_flush(NvmeCtrl *n, NvmeZone *zone, +static uint16_t nvme_zone_mgmt_send_zrwa_flush(NvmeCtrl *n, uint32_t zidx, uint64_t elba, NvmeRequest *req) { NvmeNamespace *ns = req->ns; uint16_t ozcs = le16_to_cpu(ns->id_ns_zoned->ozcs); - uint64_t wp = zone->d.wp; - uint32_t nlb = elba - wp + 1; - uint16_t status; - + BlockZoneWps *wps = blk_get_zone_wps(ns->blkconf.blk); + uint64_t *wp = &wps->wp[zidx]; + uint64_t raw_wpv = *wp; + /* BDRV_ZP_GET_ZA() leaves the attribute bits in place (bits 51-58), + * so shift them down before comparing against the NVME_ZA_* flags. */ + uint8_t za = BDRV_ZP_GET_ZA(raw_wpv) >> 51; + uint64_t wpv = BDRV_ZP_GET_WP(raw_wpv); + uint32_t nlb = elba - wpv + 1; if (!(ozcs & NVME_ID_NS_ZONED_OZCS_ZRWASUP)) { return NVME_INVALID_ZONE_OP | NVME_DNR; } - if (!(zone->d.za & NVME_ZA_ZRWA_VALID)) { + if (!(za & NVME_ZA_ZRWA_VALID)) { return NVME_INVALID_FIELD | NVME_DNR; } - if (elba < wp || elba > wp + ns->zns.zrwas) { + if (elba < wpv || elba > wpv + ns->zns.zrwas) { return NVME_ZONE_BOUNDARY_ERROR | NVME_DNR; } @@ -3998,37 +3315,36 @@ static uint16_t nvme_zone_mgmt_send_zrwa_flush(NvmeCtrl *n, NvmeZone *zone, return NVME_INVALID_FIELD | NVME_DNR; } - status = nvme_zrm_auto(n, ns, zone); - if (status) { - return status; - } - - zone->w_ptr += nlb; - - nvme_advance_zone_wp(ns, zone, nlb); + *wp += nlb; return NVME_SUCCESS; } static inline uint8_t *nvme_get_zd_extension(NvmeNamespace *ns, - uint32_t zone_idx) + uint32_t zone_idx) { return &ns->zd_extensions[zone_idx * blk_get_zd_ext_size(ns->blkconf.blk)]; } +/* Sentinel for zone send actions that have no block-layer BlockZoneOp */ +#define BLK_ZO_UNSUP 0x22 static uint16_t nvme_zone_mgmt_send(NvmeCtrl *n, NvmeRequest *req) { NvmeZoneSendCmd *cmd = (NvmeZoneSendCmd *)&req->cmd; NvmeNamespace *ns = req->ns; - NvmeZone *zone; - NvmeZoneResetAIOCB *iocb; - uint8_t *zd_ext; + NvmeZoneMgmtAIOCB *iocb; uint64_t slba = 0; uint32_t zone_idx = 0; uint16_t status; uint8_t action = cmd->zsa; + uint8_t *zd_ext; + uint64_t offset, len; + BlockBackend *blk = ns->blkconf.blk; + uint32_t zone_size = blk_get_zone_size(blk); + uint64_t size = (uint64_t)zone_size * blk_get_nr_zones(blk); + BlockZoneOp op = BLK_ZO_UNSUP; bool all; - enum NvmeZoneProcessingMask proc_mask = NVME_PROC_CURRENT_ZONE; all = cmd->zsflags & NVME_ZSFLAG_SELECT_ALL; @@ -4039,82 +3355,51 @@ static uint16_t nvme_zone_mgmt_send(NvmeCtrl *n, NvmeRequest *req) if (status) { return status; } - } - - zone = &ns->zone_array[zone_idx]; - if (slba != zone->d.zslba && action != NVME_ZONE_ACTION_ZRWA_FLUSH) { - trace_pci_nvme_err_unaligned_zone_cmd(action, slba, zone->d.zslba); - return NVME_INVALID_FIELD | NVME_DNR; + len = zone_size; + } else { + len = size; } switch (action) { case NVME_ZONE_ACTION_OPEN:
- if (all) { - proc_mask = NVME_PROC_CLOSED_ZONES; - } + op = BLK_ZO_OPEN; trace_pci_nvme_open_zone(slba, zone_idx, all); - status = nvme_do_zone_op(ns, zone, proc_mask, nvme_open_zone, req); break; case NVME_ZONE_ACTION_CLOSE: - if (all) { - proc_mask = NVME_PROC_OPENED_ZONES; - } + op = BLK_ZO_CLOSE; trace_pci_nvme_close_zone(slba, zone_idx, all); - status = nvme_do_zone_op(ns, zone, proc_mask, nvme_close_zone, req); break; case NVME_ZONE_ACTION_FINISH: - if (all) { - proc_mask = NVME_PROC_OPENED_ZONES | NVME_PROC_CLOSED_ZONES; - } + op = BLK_ZO_FINISH; trace_pci_nvme_finish_zone(slba, zone_idx, all); - status = nvme_do_zone_op(ns, zone, proc_mask, nvme_finish_zone, req); break; case NVME_ZONE_ACTION_RESET: + op = BLK_ZO_RESET; trace_pci_nvme_reset_zone(slba, zone_idx, all); - - iocb = blk_aio_get(&nvme_zone_reset_aiocb_info, ns->blkconf.blk, - nvme_misc_cb, req); - - iocb->req = req; - iocb->ret = 0; - iocb->all = all; - iocb->idx = zone_idx; - iocb->zone = NULL; - - req->aiocb = &iocb->common; - nvme_zone_reset_cb(iocb, 0); - - return NVME_NO_COMPLETE; + break; case NVME_ZONE_ACTION_OFFLINE: - if (all) { - proc_mask = NVME_PROC_READ_ONLY_ZONES; - } + op = BLK_ZO_OFFLINE; trace_pci_nvme_offline_zone(slba, zone_idx, all); - status = nvme_do_zone_op(ns, zone, proc_mask, nvme_offline_zone, req); break; - case NVME_ZONE_ACTION_SET_ZD_EXT: + case NVME_ZONE_ACTION_SET_ZD_EXT: { + int zd_ext_size = blk_get_zd_ext_size(blk); trace_pci_nvme_set_descriptor_extension(slba, zone_idx); - if (all || !blk_get_zd_ext_size(ns->blkconf.blk)) { + if (all || !zd_ext_size) { return NVME_INVALID_FIELD | NVME_DNR; } zd_ext = nvme_get_zd_extension(ns, zone_idx); - status = nvme_h2c(n, zd_ext, blk_get_zd_ext_size(ns->blkconf.blk), req); + status = nvme_h2c(n, zd_ext, zd_ext_size, req); if (status) { trace_pci_nvme_err_zd_extension_map_error(zone_idx); return status; } - - status = nvme_set_zd_ext(ns, zone); - if (status == NVME_SUCCESS) { - trace_pci_nvme_zd_extension_set(zone_idx); - return status; - } + trace_pci_nvme_zd_extension_set(zone_idx); break; + } case NVME_ZONE_ACTION_ZRWA_FLUSH: @@ -4122,16 +3407,34 @@ static uint16_t nvme_zone_mgmt_send(NvmeCtrl *n, NvmeRequest *req) return NVME_INVALID_FIELD | NVME_DNR; } - return nvme_zone_mgmt_send_zrwa_flush(n, zone, slba, req); + return nvme_zone_mgmt_send_zrwa_flush(n, zone_idx, slba, req); default: trace_pci_nvme_err_invalid_mgmt_action(action); status = NVME_INVALID_FIELD; } + if (op != BLK_ZO_UNSUP) { + iocb = blk_aio_get(&nvme_zone_mgmt_aiocb_info, ns->blkconf.blk, + nvme_misc_cb, req); + iocb->req = req; + iocb->ret = 0; + iocb->all = all; + /* Convert it to bytes for accessing block layers */ + offset = nvme_l2b(ns, slba); + iocb->offset = offset; + iocb->len = len; + iocb->op = op; + + req->aiocb = &iocb->common; + nvme_zone_mgmt_send_cb(iocb, 0); + + return NVME_NO_COMPLETE; + } + if (status == NVME_ZONE_INVAL_TRANSITION) { trace_pci_nvme_err_invalid_zone_state_transition(action, slba, - zone->d.za); + TO_DO_ZA); } if (status) { status |= NVME_DNR; @@ -4140,50 +3443,144 @@ static uint16_t nvme_zone_mgmt_send(NvmeCtrl *n, NvmeRequest *req) return status; } -static bool nvme_zone_matches_filter(uint32_t zafs, NvmeZone *zl) +static bool nvme_zone_matches_filter(uint32_t zafs, BlockZoneState zs) { - NvmeZoneState zs = nvme_get_zone_state(zl); - switch (zafs) { case NVME_ZONE_REPORT_ALL: return true; case NVME_ZONE_REPORT_EMPTY: - return zs == NVME_ZONE_STATE_EMPTY; + return zs == BLK_ZS_EMPTY; case NVME_ZONE_REPORT_IMPLICITLY_OPEN: - return zs == NVME_ZONE_STATE_IMPLICITLY_OPEN; + return
zs == BLK_ZS_IOPEN; case NVME_ZONE_REPORT_EXPLICITLY_OPEN: - return zs == NVME_ZONE_STATE_EXPLICITLY_OPEN; + return zs == BLK_ZS_EOPEN; case NVME_ZONE_REPORT_CLOSED: - return zs == NVME_ZONE_STATE_CLOSED; + return zs == BLK_ZS_CLOSED; case NVME_ZONE_REPORT_FULL: - return zs == NVME_ZONE_STATE_FULL; + return zs == BLK_ZS_FULL; case NVME_ZONE_REPORT_READ_ONLY: - return zs == NVME_ZONE_STATE_READ_ONLY; + return zs == BLK_ZS_RDONLY; case NVME_ZONE_REPORT_OFFLINE: - return zs == NVME_ZONE_STATE_OFFLINE; + return zs == BLK_ZS_OFFLINE; default: return false; } } +static void nvme_zone_mgmt_recv_completed_cb(void *opaque, int ret) +{ + NvmeZoneCmdAIOCB *iocb = opaque; + NvmeRequest *req = iocb->req; + NvmeCmd *cmd = iocb->cmd; + uint32_t dw13 = le32_to_cpu(cmd->cdw13); + int64_t zrp_size, j = 0; + uint32_t zrasf; + g_autofree void *buf = NULL; + void *buf_p; + NvmeZoneReportHeader *zrp_hdr; + uint64_t nz = iocb->zone_report_data.nr_zones; + uint64_t nr_matched = 0; + BlockZoneDescriptor *in_zone = iocb->zone_report_data.zones; + NvmeZoneDescr *out_zone; + + if (ret < 0) { + error_report("Invalid zone recv %d", ret); + goto out; + } + + zrasf = (dw13 >> 8) & 0xff; + if (zrasf > NVME_ZONE_REPORT_OFFLINE) { + error_report("invalid zone report filter %u", zrasf); + goto out; + } + + zrp_size = sizeof(NvmeZoneReportHeader) + sizeof(NvmeZoneDescr) * nz; + buf = g_malloc0(zrp_size); + + zrp_hdr = buf; + buf_p = buf + sizeof(NvmeZoneReportHeader); + + for (; j < nz; j++) { + BlockZoneState zs = in_zone[j].state; + if (!nvme_zone_matches_filter(zrasf, zs)) { + continue; + } + + /* Advance only for matching zones so the descriptors stay packed */ + out_zone = buf_p; + buf_p += sizeof(NvmeZoneDescr); + nr_matched++; + + *out_zone = (NvmeZoneDescr) { + .zslba = cpu_to_le64(nvme_b2l(req->ns, in_zone[j].start)), + .zcap = cpu_to_le64(nvme_b2l(req->ns, in_zone[j].cap)), + .wp = cpu_to_le64(nvme_b2l(req->ns, in_zone[j].wp)), + }; + + switch (in_zone[j].type) { + case BLK_ZT_CONV: + out_zone->zt = NVME_ZONE_TYPE_RESERVED; + break; + case BLK_ZT_SWR: + out_zone->zt = NVME_ZONE_TYPE_SEQ_WRITE; + break; + case BLK_ZT_SWP: + out_zone->zt = NVME_ZONE_TYPE_RESERVED; + break; + default: + g_assert_not_reached(); + } + + switch (zs) { + case BLK_ZS_RDONLY: + out_zone->zs = NVME_ZONE_STATE_READ_ONLY << 4; + break; + case BLK_ZS_OFFLINE: + out_zone->zs = NVME_ZONE_STATE_OFFLINE << 4; + break; + case BLK_ZS_EMPTY: + out_zone->zs = NVME_ZONE_STATE_EMPTY << 4; + break; + case BLK_ZS_CLOSED: + out_zone->zs = NVME_ZONE_STATE_CLOSED << 4; + break; + case BLK_ZS_FULL: + out_zone->zs = NVME_ZONE_STATE_FULL << 4; + break; + case BLK_ZS_EOPEN: + out_zone->zs = NVME_ZONE_STATE_EXPLICITLY_OPEN << 4; + break; + case BLK_ZS_IOPEN: + out_zone->zs = NVME_ZONE_STATE_IMPLICITLY_OPEN << 4; + break; + case BLK_ZS_NOT_WP: + out_zone->zs = NVME_ZONE_STATE_RESERVED << 4; + break; + default: + g_assert_not_reached(); + } + } + + /* The header reports the number of descriptors matching the filter */ + zrp_hdr->nr_zones = cpu_to_le64(nr_matched); + + nvme_c2h(iocb->n, (uint8_t *)buf, zrp_size, req); + +out: + g_free(iocb->zone_report_data.zones); + g_free(iocb); +} + static uint16_t nvme_zone_mgmt_recv(NvmeCtrl *n, NvmeRequest *req) { NvmeCmd *cmd = (NvmeCmd *)&req->cmd; NvmeNamespace *ns = req->ns; + BlockBackend *blk = ns->blkconf.blk; + NvmeZoneCmdAIOCB *iocb; /* cdw12 is zero-based number of dwords to return.
Convert to bytes */ uint32_t data_size = (le32_to_cpu(cmd->cdw12) + 1) << 2; uint32_t dw13 = le32_to_cpu(cmd->cdw13); - uint32_t zone_idx, zra, zrasf, partial; - uint64_t max_zones, nr_zones = 0; + uint32_t zone_idx, zra, partial, nr_zones; uint16_t status; uint64_t slba; - NvmeZoneDescr *z; - NvmeZone *zone; - NvmeZoneReportHeader *header; - void *buf, *buf_p; size_t zone_entry_sz; - int i; - + int64_t offset; req->status = NVME_SUCCESS; status = nvme_get_mgmt_zone_slba_idx(ns, cmd, &slba, &zone_idx); @@ -4195,12 +3592,8 @@ static uint16_t nvme_zone_mgmt_recv(NvmeCtrl *n, NvmeRequest *req) if (zra != NVME_ZONE_REPORT && zra != NVME_ZONE_REPORT_EXTENDED) { return NVME_INVALID_FIELD | NVME_DNR; } - if (zra == NVME_ZONE_REPORT_EXTENDED && !blk_get_zd_ext_size(ns->blkconf.blk)){ - return NVME_INVALID_FIELD | NVME_DNR; - } - - zrasf = (dw13 >> 8) & 0xff; - if (zrasf > NVME_ZONE_REPORT_OFFLINE) { + if (zra == NVME_ZONE_REPORT_EXTENDED && + !blk_get_zd_ext_size(ns->blkconf.blk)){ return NVME_INVALID_FIELD | NVME_DNR; } @@ -4213,64 +3606,31 @@ static uint16_t nvme_zone_mgmt_recv(NvmeCtrl *n, NvmeRequest *req) return status; } - partial = (dw13 >> 16) & 0x01; - zone_entry_sz = sizeof(NvmeZoneDescr); if (zra == NVME_ZONE_REPORT_EXTENDED) { - zone_entry_sz += blk_get_zd_ext_size(ns->blkconf.blk) ; + zone_entry_sz += blk_get_zd_ext_size(ns->blkconf.blk); } - max_zones = (data_size - sizeof(NvmeZoneReportHeader)) / zone_entry_sz; - buf = g_malloc0(data_size); - - zone = &ns->zone_array[zone_idx]; - for (i = zone_idx; i < ns->num_zones; i++) { - if (partial && nr_zones >= max_zones) { - break; - } - if (nvme_zone_matches_filter(zrasf, zone++)) { - nr_zones++; - } - } - header = buf; - header->nr_zones = cpu_to_le64(nr_zones); - - buf_p = buf + sizeof(NvmeZoneReportHeader); - for (; zone_idx < ns->num_zones && max_zones > 0; zone_idx++) { - zone = &ns->zone_array[zone_idx]; - if (nvme_zone_matches_filter(zrasf, zone)) { - z = buf_p; - buf_p += sizeof(NvmeZoneDescr); - - z->zt = zone->d.zt; - z->zs = zone->d.zs; - z->zcap = cpu_to_le64(zone->d.zcap); - z->zslba = cpu_to_le64(zone->d.zslba); - z->za = zone->d.za; - - if (nvme_wp_is_valid(zone)) { - z->wp = cpu_to_le64(zone->d.wp); - } else { - z->wp = cpu_to_le64(~0ULL); - } - - if (zra == NVME_ZONE_REPORT_EXTENDED) { - int zd_ext_size = blk_get_zd_ext_size(ns->blkconf.blk); - if (zone->d.za & NVME_ZA_ZD_EXT_VALID) { - memcpy(buf_p, nvme_get_zd_extension(ns, zone_idx), - zd_ext_size); - } - buf_p += zd_ext_size; - } - - max_zones--; - } + offset = nvme_l2b(ns, slba); + nr_zones = (data_size - sizeof(NvmeZoneReportHeader)) / zone_entry_sz; + partial = (dw13 >> 16) & 0x01; + if (!partial) { + nr_zones = blk_get_nr_zones(blk); + offset = 0; } - status = nvme_c2h(n, (uint8_t *)buf, data_size, req); - - g_free(buf); - + iocb = g_malloc0(sizeof(NvmeZoneCmdAIOCB)); + iocb->req = req; + iocb->n = n; + iocb->cmd = cmd; + iocb->zone_report_data.nr_zones = nr_zones; + iocb->zone_report_data.zones = g_malloc0( + sizeof(BlockZoneDescriptor) * nr_zones); + + blk_aio_zone_report(blk, offset, + &iocb->zone_report_data.nr_zones, + iocb->zone_report_data.zones, + nvme_zone_mgmt_recv_completed_cb, iocb); return status; } diff --git a/hw/nvme/ns.c b/hw/nvme/ns.c index c9c3a54d36..e54712de30 100644 --- a/hw/nvme/ns.c +++ b/hw/nvme/ns.c @@ -219,36 +219,10 @@ static int nvme_ns_zoned_check_calc_geometry(NvmeNamespace *ns, Error **errp) static void nvme_ns_zoned_init_state(NvmeNamespace *ns) { BlockBackend *blk = ns->blkconf.blk; - uint64_t start = 0, zone_size = ns->zone_size; 
- uint64_t capacity = ns->num_zones * zone_size; - NvmeZone *zone; - int i; - - ns->zone_array = g_new0(NvmeZone, ns->num_zones); if (blk_get_zone_extension(blk)) { ns->zd_extensions = blk_get_zone_extension(blk); } - QTAILQ_INIT(&ns->exp_open_zones); - QTAILQ_INIT(&ns->imp_open_zones); - QTAILQ_INIT(&ns->closed_zones); - QTAILQ_INIT(&ns->full_zones); - - zone = ns->zone_array; - for (i = 0; i < ns->num_zones; i++, zone++) { - if (start + zone_size > capacity) { - zone_size = capacity - start; - } - zone->d.zt = NVME_ZONE_TYPE_SEQ_WRITE; - nvme_set_zone_state(zone, NVME_ZONE_STATE_EMPTY); - zone->d.za = 0; - zone->d.zcap = ns->zone_capacity; - zone->d.zslba = start; - zone->d.wp = start; - zone->w_ptr = start; - start += zone_size; - } - ns->zone_size_log2 = 0; if (is_power_of_2(ns->zone_size)) { ns->zone_size_log2 = 63 - clz64(ns->zone_size); @@ -319,56 +293,12 @@ static void nvme_ns_init_zoned(NvmeNamespace *ns) ns->id_ns_zoned = id_ns_z; } -static void nvme_clear_zone(NvmeNamespace *ns, NvmeZone *zone) -{ - uint8_t state; - - zone->w_ptr = zone->d.wp; - state = nvme_get_zone_state(zone); - if (zone->d.wp != zone->d.zslba || - (zone->d.za & NVME_ZA_ZD_EXT_VALID)) { - if (state != NVME_ZONE_STATE_CLOSED) { - trace_pci_nvme_clear_ns_close(state, zone->d.zslba); - nvme_set_zone_state(zone, NVME_ZONE_STATE_CLOSED); - } - nvme_aor_inc_active(ns); - QTAILQ_INSERT_HEAD(&ns->closed_zones, zone, entry); - } else { - trace_pci_nvme_clear_ns_reset(state, zone->d.zslba); - if (zone->d.za & NVME_ZA_ZRWA_VALID) { - zone->d.za &= ~NVME_ZA_ZRWA_VALID; - ns->zns.numzrwa++; - } - nvme_set_zone_state(zone, NVME_ZONE_STATE_EMPTY); - } -} - /* * Close all the zones that are currently open. */ static void nvme_zoned_ns_shutdown(NvmeNamespace *ns) { - NvmeZone *zone, *next; - - QTAILQ_FOREACH_SAFE(zone, &ns->closed_zones, entry, next) { - QTAILQ_REMOVE(&ns->closed_zones, zone, entry); - nvme_aor_dec_active(ns); - nvme_clear_zone(ns, zone); - } - QTAILQ_FOREACH_SAFE(zone, &ns->imp_open_zones, entry, next) { - QTAILQ_REMOVE(&ns->imp_open_zones, zone, entry); - nvme_aor_dec_open(ns); - nvme_aor_dec_active(ns); - nvme_clear_zone(ns, zone); - } - QTAILQ_FOREACH_SAFE(zone, &ns->exp_open_zones, entry, next) { - QTAILQ_REMOVE(&ns->exp_open_zones, zone, entry); - nvme_aor_dec_open(ns); - nvme_aor_dec_active(ns); - nvme_clear_zone(ns, zone); - } - - assert(ns->nr_open_zones == 0); + /* Set states (exp/imp_open/closed/full) to empty */ } static NvmeRuHandle *nvme_find_ruh_by_attr(NvmeEnduranceGroup *endgrp, @@ -663,7 +593,6 @@ void nvme_ns_cleanup(NvmeNamespace *ns) { if (blk_get_zone_model(ns->blkconf.blk)) { g_free(ns->id_ns_zoned); - g_free(ns->zone_array); } if (ns->endgrp && ns->endgrp->fdp.enabled) { @@ -777,10 +706,6 @@ static Property nvme_ns_props[] = { DEFINE_PROP_UINT8("msrc", NvmeNamespace, params.msrc, 127), DEFINE_PROP_BOOL("zoned.cross_read", NvmeNamespace, params.cross_zone_read, false), - DEFINE_PROP_UINT32("zoned.max_active", NvmeNamespace, - params.max_active_zones, 0), - DEFINE_PROP_UINT32("zoned.max_open", NvmeNamespace, - params.max_open_zones, 0), DEFINE_PROP_UINT32("zoned.numzrwa", NvmeNamespace, params.numzrwa, 0), DEFINE_PROP_SIZE("zoned.zrwas", NvmeNamespace, params.zrwas, 0), DEFINE_PROP_SIZE("zoned.zrwafg", NvmeNamespace, params.zrwafg, -1), diff --git a/hw/nvme/nvme.h b/hw/nvme/nvme.h index 37007952fc..c2d1b07f88 100644 --- a/hw/nvme/nvme.h +++ b/hw/nvme/nvme.h @@ -150,6 +150,9 @@ static inline NvmeNamespace *nvme_subsys_ns(NvmeSubsystem *subsys, #define NVME_NS(obj) \ 
OBJECT_CHECK(NvmeNamespace, (obj), TYPE_NVME_NS) +#define TO_DO_STATE 0 +#define TO_DO_ZA 0 + typedef struct NvmeZone { NvmeZoneDescr d; uint64_t w_ptr; @@ -190,8 +193,6 @@ typedef struct NvmeNamespaceParams { uint8_t msrc; bool cross_zone_read; - uint32_t max_active_zones; - uint32_t max_open_zones; uint32_t numzrwa; uint64_t zrwas; @@ -228,11 +229,10 @@ typedef struct NvmeNamespace { QTAILQ_ENTRY(NvmeNamespace) entry; NvmeIdNsZoned *id_ns_zoned; - NvmeZone *zone_array; - QTAILQ_HEAD(, NvmeZone) exp_open_zones; - QTAILQ_HEAD(, NvmeZone) imp_open_zones; - QTAILQ_HEAD(, NvmeZone) closed_zones; - QTAILQ_HEAD(, NvmeZone) full_zones; + uint32_t *exp_open_zones; + uint32_t *imp_open_zones; + uint32_t *closed_zones; + uint32_t *full_zones; uint32_t num_zones; uint64_t zone_size; uint64_t zone_capacity; @@ -265,6 +265,12 @@ static inline uint32_t nvme_nsid(NvmeNamespace *ns) return 0; } +/* Bytes to LBAs */ +static inline uint64_t nvme_b2l(NvmeNamespace *ns, uint64_t lba) +{ + return lba >> ns->lbaf.ds; +} + static inline size_t nvme_l2b(NvmeNamespace *ns, uint64_t lba) { return lba << ns->lbaf.ds; @@ -285,70 +291,9 @@ static inline bool nvme_ns_ext(NvmeNamespace *ns) return !!NVME_ID_NS_FLBAS_EXTENDED(ns->id_ns.flbas); } -static inline NvmeZoneState nvme_get_zone_state(NvmeZone *zone) +static inline NvmeZoneState nvme_get_zone_state(uint64_t wp) { - return zone->d.zs >> 4; -} - -static inline void nvme_set_zone_state(NvmeZone *zone, NvmeZoneState state) -{ - zone->d.zs = state << 4; -} - -static inline uint64_t nvme_zone_rd_boundary(NvmeNamespace *ns, NvmeZone *zone) -{ - return zone->d.zslba + ns->zone_size; -} - -static inline uint64_t nvme_zone_wr_boundary(NvmeZone *zone) -{ - return zone->d.zslba + zone->d.zcap; -} - -static inline bool nvme_wp_is_valid(NvmeZone *zone) -{ - uint8_t st = nvme_get_zone_state(zone); - - return st != NVME_ZONE_STATE_FULL && - st != NVME_ZONE_STATE_READ_ONLY && - st != NVME_ZONE_STATE_OFFLINE; -} - -static inline void nvme_aor_inc_open(NvmeNamespace *ns) -{ - assert(ns->nr_open_zones >= 0); - if (ns->params.max_open_zones) { - ns->nr_open_zones++; - assert(ns->nr_open_zones <= ns->params.max_open_zones); - } -} - -static inline void nvme_aor_dec_open(NvmeNamespace *ns) -{ - if (ns->params.max_open_zones) { - assert(ns->nr_open_zones > 0); - ns->nr_open_zones--; - } - assert(ns->nr_open_zones >= 0); -} - -static inline void nvme_aor_inc_active(NvmeNamespace *ns) -{ - assert(ns->nr_active_zones >= 0); - if (ns->params.max_active_zones) { - ns->nr_active_zones++; - assert(ns->nr_active_zones <= ns->params.max_active_zones); - } -} - -static inline void nvme_aor_dec_active(NvmeNamespace *ns) -{ - if (ns->params.max_active_zones) { - assert(ns->nr_active_zones > 0); - ns->nr_active_zones--; - assert(ns->nr_active_zones >= ns->nr_open_zones); - } - assert(ns->nr_active_zones >= 0); + return wp >> 60; } static inline void nvme_fdp_stat_inc(uint64_t *a, uint64_t b) diff --git a/include/block/block-common.h b/include/block/block-common.h index 9f04a772f6..0cbed607a8 100644 --- a/include/block/block-common.h +++ b/include/block/block-common.h @@ -83,6 +83,7 @@ typedef enum BlockZoneOp { BLK_ZO_CLOSE, BLK_ZO_FINISH, BLK_ZO_RESET, + BLK_ZO_OFFLINE, } BlockZoneOp; typedef enum BlockZoneModel { @@ -262,6 +263,13 @@ typedef enum { */ #define BDRV_ZT_IS_CONV(wp) (wp & (1ULL << 63)) +/* + * Clear the zone state, type and attribute information in the wp. 
+ */ +#define BDRV_ZP_GET_WP(wp) (((wp) << 6) >> 6) +#define BDRV_ZP_GET_ZS(wp) ((wp) >> 60) +#define BDRV_ZP_GET_ZA(wp) ((wp) & (((1ULL << 8) - 1ULL) << 51)) + #define BDRV_REQUEST_MAX_SECTORS MIN_CONST(SIZE_MAX >> BDRV_SECTOR_BITS, \ INT_MAX >> BDRV_SECTOR_BITS) #define BDRV_REQUEST_MAX_BYTES (BDRV_REQUEST_MAX_SECTORS << BDRV_SECTOR_BITS) diff --git a/include/block/block_int-common.h b/include/block/block_int-common.h index e16dfe8581..79a62b5271 100644 --- a/include/block/block_int-common.h +++ b/include/block/block_int-common.h @@ -911,6 +911,8 @@ typedef struct BlockLimits { /* size of data that is associated with a zone in bytes */ uint32_t zd_extension_size; + + uint8_t zone_attribute; } BlockLimits; typedef struct BdrvOpBlocker BdrvOpBlocker; diff --git a/include/sysemu/block-backend-io.h b/include/sysemu/block-backend-io.h index c56ed29c8f..f69aa1094a 100644 --- a/include/sysemu/block-backend-io.h +++ b/include/sysemu/block-backend-io.h @@ -106,8 +106,10 @@ uint32_t blk_get_zone_capacity(BlockBackend *blk); uint32_t blk_get_max_open_zones(BlockBackend *blk); uint32_t blk_get_max_active_zones(BlockBackend *blk); uint32_t blk_get_max_append_sectors(BlockBackend *blk); +uint32_t blk_get_nr_zones(BlockBackend *blk); uint8_t *blk_get_zone_extension(BlockBackend *blk); uint32_t blk_get_zd_ext_size(BlockBackend *blk); +BlockZoneWps *blk_get_zone_wps(BlockBackend *blk); void blk_io_plug(void); void blk_io_unplug(void);
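
A note on the packed write pointer used throughout this series: the BDRV_ZP_* macros above keep the zone state in bits 60-63 and the zone attributes in bits 51-58 of the 64-bit per-zone wp value, with the low bits holding the write pointer itself. The following standalone sketch shows how such a value decodes. It is illustrative only: the ZP_* helper names are local to the example (they mirror, but do not quote, the in-tree macros), the sample value is made up, and unlike BDRV_ZP_GET_ZA() the example shifts the attribute field down so it prints as a small integer.

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Mirrors the BDRV_ZP_* layout: state in bits 60-63, attributes in
 * bits 51-58, write pointer in the low bits. */
#define ZP_GET_WP(wp) (((wp) << 6) >> 6)
#define ZP_GET_ZS(wp) ((wp) >> 60)
#define ZP_GET_ZA(wp) (((wp) & (((1ULL << 8) - 1ULL) << 51)) >> 51)

int main(void)
{
    /* Hypothetical packed value: state 2 (implicitly open, assuming the
     * NVMe state encoding), no attributes set, wp at LBA 0x1000. */
    uint64_t wp = (UINT64_C(2) << 60) | UINT64_C(0x1000);

    printf("zs=%" PRIu64 " za=0x%" PRIx64 " wp=0x%" PRIx64 "\n",
           ZP_GET_ZS(wp), ZP_GET_ZA(wp), ZP_GET_WP(wp));
    return 0;
}

Running this prints "zs=2 za=0x0 wp=0x1000", i.e. the state and attribute fields decode independently of the write pointer bits.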