From patchwork Mon Nov 6 17:10:29 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hanna Czenczek X-Patchwork-Id: 1860290 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=redhat.com header.i=@redhat.com header.a=rsa-sha256 header.s=mimecast20190719 header.b=jNYWr7Gg; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=patchwork.ozlabs.org) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4SPHtz3LXyz1yQY for ; Tue, 7 Nov 2023 04:11:23 +1100 (AEDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1r037d-0002Y4-Co; Mon, 06 Nov 2023 12:10:45 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1r037Z-0002XG-94 for qemu-devel@nongnu.org; Mon, 06 Nov 2023 12:10:42 -0500 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1r037X-00070E-8J for qemu-devel@nongnu.org; Mon, 06 Nov 2023 12:10:41 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1699290638; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=ggOWmtxey/harr42Oal2rCuuFLnxtORqeHMKe0yyHzk=; b=jNYWr7GgtXNSRrDpLb+USW76N6PsvaP90Nd2buHFk7l7zZF+sQokAPZbyO8x+i+/wNcBqr rMe30GQ0OnF920yWYZJgu58yraKGHpKsRg1FVnkd56QZtaqBu73psCTF2V9wdiBq/4DBYv hkDina4dEuNSyA8fm02TCad4KLxpSP8= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-507-SnHhuHstPae0P9TSL-ZOPA-1; Mon, 06 Nov 2023 12:10:35 -0500 X-MC-Unique: SnHhuHstPae0P9TSL-ZOPA-1 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.rdu2.redhat.com [10.11.54.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 468EA1019C99; Mon, 6 Nov 2023 17:10:35 +0000 (UTC) Received: from localhost (unknown [10.39.193.251]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 066A7492A; Mon, 6 Nov 2023 17:10:34 +0000 (UTC) From: Hanna Czenczek To: qemu-block@nongnu.org Cc: qemu-devel@nongnu.org, Hanna Czenczek Subject: [PULL 1/3] qcow2: keep reference on zeroize with discard-no-unref enabled Date: Mon, 6 Nov 2023 18:10:29 +0100 Message-ID: <20231106171031.1084277-2-hreitz@redhat.com> In-Reply-To: <20231106171031.1084277-1-hreitz@redhat.com> References: <20231106171031.1084277-1-hreitz@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.11.54.1 Received-SPF: pass client-ip=170.10.129.124; envelope-from=hreitz@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H4=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org From: Jean-Louis Dupond When the discard-no-unref flag is enabled, we keep the reference for normal discard requests. But when a discard is executed on a snapshot/qcow2 image with backing, the discards are saved as zero clusters in the snapshot image. When committing the snapshot to the backing file, not discard_in_l2_slice is called but zero_in_l2_slice. Which did not had any logic to keep the reference when discard-no-unref is enabled. Therefor we add logic in the zero_in_l2_slice call to keep the reference on commit. Fixes: https://gitlab.com/qemu-project/qemu/-/issues/1621 Signed-off-by: Jean-Louis Dupond Message-Id: <20231003125236.216473-2-jean-louis@dupond.be> [hreitz: Made the documentation change more verbose, as discussed on-list] Signed-off-by: Hanna Czenczek --- qapi/block-core.json | 24 ++++++++++++++---------- block/qcow2-cluster.c | 22 ++++++++++++++++++---- qemu-options.hx | 10 +++++++--- 3 files changed, 39 insertions(+), 17 deletions(-) diff --git a/qapi/block-core.json b/qapi/block-core.json index 99961256f2..ca390c5700 100644 --- a/qapi/block-core.json +++ b/qapi/block-core.json @@ -3528,16 +3528,20 @@ # @pass-discard-other: whether discard requests for the data source # should be issued on other occasions where a cluster gets freed # -# @discard-no-unref: when enabled, discards from the guest will not -# cause cluster allocations to be relinquished. This prevents -# qcow2 fragmentation that would be caused by such discards. -# Besides potential performance degradation, such fragmentation -# can lead to increased allocation of clusters past the end of the -# image file, resulting in image files whose file length can grow -# much larger than their guest disk size would suggest. If image -# file length is of concern (e.g. when storing qcow2 images -# directly on block devices), you should consider enabling this -# option. (since 8.1) +# @discard-no-unref: when enabled, data clusters will remain +# preallocated when they are no longer used, e.g. because they are +# discarded or converted to zero clusters. As usual, whether the +# old data is discarded or kept on the protocol level (i.e. in the +# image file) depends on the setting of the pass-discard-request +# option. Keeping the clusters preallocated prevents qcow2 +# fragmentation that would otherwise be caused by freeing and +# re-allocating them later. Besides potential performance +# degradation, such fragmentation can lead to increased allocation +# of clusters past the end of the image file, resulting in image +# files whose file length can grow much larger than their guest disk +# size would suggest. If image file length is of concern (e.g. when +# storing qcow2 images directly on block devices), you should +# consider enabling this option. (since 8.1) # # @overlap-check: which overlap checks to perform for writes to the # image, defaults to 'cached' (since 2.2) diff --git a/block/qcow2-cluster.c b/block/qcow2-cluster.c index 904f00d1b3..5af439bd11 100644 --- a/block/qcow2-cluster.c +++ b/block/qcow2-cluster.c @@ -1983,7 +1983,7 @@ discard_in_l2_slice(BlockDriverState *bs, uint64_t offset, uint64_t nb_clusters, /* If we keep the reference, pass on the discard still */ bdrv_pdiscard(s->data_file, old_l2_entry & L2E_OFFSET_MASK, s->cluster_size); - } + } } qcow2_cache_put(s->l2_table_cache, (void **) &l2_slice); @@ -2061,9 +2061,15 @@ zero_in_l2_slice(BlockDriverState *bs, uint64_t offset, QCow2ClusterType type = qcow2_get_cluster_type(bs, old_l2_entry); bool unmap = (type == QCOW2_CLUSTER_COMPRESSED) || ((flags & BDRV_REQ_MAY_UNMAP) && qcow2_cluster_is_allocated(type)); - uint64_t new_l2_entry = unmap ? 0 : old_l2_entry; + bool keep_reference = + (s->discard_no_unref && type != QCOW2_CLUSTER_COMPRESSED); + uint64_t new_l2_entry = old_l2_entry; uint64_t new_l2_bitmap = old_l2_bitmap; + if (unmap && !keep_reference) { + new_l2_entry = 0; + } + if (has_subclusters(s)) { new_l2_bitmap = QCOW_L2_BITMAP_ALL_ZEROES; } else { @@ -2081,9 +2087,17 @@ zero_in_l2_slice(BlockDriverState *bs, uint64_t offset, set_l2_bitmap(s, l2_slice, l2_index + i, new_l2_bitmap); } - /* Then decrease the refcount */ if (unmap) { - qcow2_free_any_cluster(bs, old_l2_entry, QCOW2_DISCARD_REQUEST); + if (!keep_reference) { + /* Then decrease the refcount */ + qcow2_free_any_cluster(bs, old_l2_entry, QCOW2_DISCARD_REQUEST); + } else if (s->discard_passthrough[QCOW2_DISCARD_REQUEST] && + (type == QCOW2_CLUSTER_NORMAL || + type == QCOW2_CLUSTER_ZERO_ALLOC)) { + /* If we keep the reference, pass on the discard still */ + bdrv_pdiscard(s->data_file, old_l2_entry & L2E_OFFSET_MASK, + s->cluster_size); + } } } diff --git a/qemu-options.hx b/qemu-options.hx index e26230bac5..7809036d8c 100644 --- a/qemu-options.hx +++ b/qemu-options.hx @@ -1457,9 +1457,13 @@ SRST (on/off; default: off) ``discard-no-unref`` - When enabled, discards from the guest will not cause cluster - allocations to be relinquished. This prevents qcow2 fragmentation - that would be caused by such discards. Besides potential + When enabled, data clusters will remain preallocated when they are + no longer used, e.g. because they are discarded or converted to + zero clusters. As usual, whether the old data is discarded or kept + on the protocol level (i.e. in the image file) depends on the + setting of the pass-discard-request option. Keeping the clusters + preallocated prevents qcow2 fragmentation that would otherwise be + caused by freeing and re-allocating them later. Besides potential performance degradation, such fragmentation can lead to increased allocation of clusters past the end of the image file, resulting in image files whose file length can grow much larger From patchwork Mon Nov 6 17:10:30 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hanna Czenczek X-Patchwork-Id: 1860294 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=redhat.com header.i=@redhat.com header.a=rsa-sha256 header.s=mimecast20190719 header.b=XqUukMsw; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=patchwork.ozlabs.org) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4SPHvX5Y6Xz1yQ9 for ; Tue, 7 Nov 2023 04:11:52 +1100 (AEDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1r037k-0002a6-MV; Mon, 06 Nov 2023 12:10:52 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1r037j-0002Zg-MK for qemu-devel@nongnu.org; Mon, 06 Nov 2023 12:10:51 -0500 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1r037d-00071B-HF for qemu-devel@nongnu.org; Mon, 06 Nov 2023 12:10:51 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1699290644; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=tt285uAAiJGay31EKSbK1TdL4ZWi3Wd1g8Th72g+Nwc=; b=XqUukMsw+8flTuTzwA4HB0onZTMcDR6O5ijZoR4TRP3pYmUxERdajsF8krEVO4nPDOJEEU v47yWYhlfWHhaD+lMgY4yEOSHmGRlRBbw4s94R/mLgF4L3lZonnS0bY5T9um2rx5ffxATm 6if0nDpdIht3tQmZ9uyDpa754pIHf0E= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-554-6uRl-xhhMQCHxgTMeIR5og-1; Mon, 06 Nov 2023 12:10:37 -0500 X-MC-Unique: 6uRl-xhhMQCHxgTMeIR5og-1 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.rdu2.redhat.com [10.11.54.8]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id EF218185A794; Mon, 6 Nov 2023 17:10:36 +0000 (UTC) Received: from localhost (unknown [10.39.193.251]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 9FB1EC1596F; Mon, 6 Nov 2023 17:10:36 +0000 (UTC) From: Hanna Czenczek To: qemu-block@nongnu.org Cc: qemu-devel@nongnu.org, Hanna Czenczek Subject: [PULL 2/3] block/file-posix: fix update_zones_wp() caller Date: Mon, 6 Nov 2023 18:10:30 +0100 Message-ID: <20231106171031.1084277-3-hreitz@redhat.com> In-Reply-To: <20231106171031.1084277-1-hreitz@redhat.com> References: <20231106171031.1084277-1-hreitz@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.11.54.8 Received-SPF: pass client-ip=170.10.133.124; envelope-from=hreitz@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H3=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, T_SCC_BODY_TEXT_LINE=-0.01, T_SPF_TEMPERROR=0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org From: Sam Li When the zoned request fail, it needs to update only the wp of the target zones for not disrupting the in-flight writes on these other zones. The wp is updated successfully after the request completes. Fixed the callers with right offset and nr_zones. Signed-off-by: Sam Li Message-Id: <20230825040556.4217-1-faithilikerun@gmail.com> Reviewed-by: Stefan Hajnoczi [hreitz: Rebased and fixed comment spelling] Signed-off-by: Hanna Czenczek --- block/file-posix.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/block/file-posix.c b/block/file-posix.c index 50e2b20d5c..3c0ce9c258 100644 --- a/block/file-posix.c +++ b/block/file-posix.c @@ -2523,7 +2523,10 @@ out: } } } else { - update_zones_wp(bs, s->fd, 0, 1); + /* + * write and append write are not allowed to cross zone boundaries + */ + update_zones_wp(bs, s->fd, offset, 1); } qemu_co_mutex_unlock(&wps->colock); @@ -3470,7 +3473,7 @@ static int coroutine_fn raw_co_zone_mgmt(BlockDriverState *bs, BlockZoneOp op, len >> BDRV_SECTOR_BITS); ret = raw_thread_pool_submit(handle_aiocb_zone_mgmt, &acb); if (ret != 0) { - update_zones_wp(bs, s->fd, offset, i); + update_zones_wp(bs, s->fd, offset, nrz); error_report("ioctl %s failed %d", op_name, ret); return ret; } From patchwork Mon Nov 6 17:10:31 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hanna Czenczek X-Patchwork-Id: 1860291 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=redhat.com header.i=@redhat.com header.a=rsa-sha256 header.s=mimecast20190719 header.b=KUH94Rfu; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=patchwork.ozlabs.org) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4SPHv44KB2z1yQ9 for ; Tue, 7 Nov 2023 04:11:28 +1100 (AEDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1r037e-0002YQ-3X; Mon, 06 Nov 2023 12:10:46 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1r037c-0002Xo-UT for qemu-devel@nongnu.org; Mon, 06 Nov 2023 12:10:44 -0500 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1r037Z-00070Q-PK for qemu-devel@nongnu.org; Mon, 06 Nov 2023 12:10:44 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1699290640; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Du5YL3nQiNSvA7+97bv/GKr2/L/cZNmIRLUQKPEa0OM=; b=KUH94RfuJyVdIY4Ss0zI+fViDUm5XBr6dOkVbxSdB71qRd8lnBqHAPnotgibPpwp5ue64+ a35rh8sltmyY/XVAiDN006evlaA1ZOwDnlL61o/lL4nMgNqeBMOcStHdUKN9Mn8A6zmaf5 1E1Fylp7uicyb3lhbcJNPV0rFBZ8yG8= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-347-9R4IbSFYPvOySlD_HHOQaA-1; Mon, 06 Nov 2023 12:10:39 -0500 X-MC-Unique: 9R4IbSFYPvOySlD_HHOQaA-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 063E08678AA; Mon, 6 Nov 2023 17:10:39 +0000 (UTC) Received: from localhost (unknown [10.39.193.251]) by smtp.corp.redhat.com (Postfix) with ESMTPS id ABAFB40C6EB9; Mon, 6 Nov 2023 17:10:38 +0000 (UTC) From: Hanna Czenczek To: qemu-block@nongnu.org Cc: qemu-devel@nongnu.org, Hanna Czenczek Subject: [PULL 3/3] file-posix: fix over-writing of returning zone_append offset Date: Mon, 6 Nov 2023 18:10:31 +0100 Message-ID: <20231106171031.1084277-4-hreitz@redhat.com> In-Reply-To: <20231106171031.1084277-1-hreitz@redhat.com> References: <20231106171031.1084277-1-hreitz@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.11.54.2 Received-SPF: pass client-ip=170.10.129.124; envelope-from=hreitz@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H4=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org From: Naohiro Aota raw_co_zone_append() sets "s->offset" where "BDRVRawState *s". This pointer is used later at raw_co_prw() to save the block address where the data is written. When multiple IOs are on-going at the same time, a later IO's raw_co_zone_append() call over-writes a former IO's offset address before raw_co_prw() completes. As a result, the former zone append IO returns the initial value (= the start address of the writing zone), instead of the proper address. Fix the issue by passing the offset pointer to raw_co_prw() instead of passing it through s->offset. Also, remove "offset" from BDRVRawState as there is no usage anymore. Fixes: 4751d09adcc3 ("block: introduce zone append write for zoned devices") Signed-off-by: Naohiro Aota Message-Id: <20231030073853.2601162-1-naohiro.aota@wdc.com> Reviewed-by: Sam Li Reviewed-by: Stefan Hajnoczi Signed-off-by: Hanna Czenczek --- block/file-posix.c | 16 +++++++--------- 1 file changed, 7 insertions(+), 9 deletions(-) diff --git a/block/file-posix.c b/block/file-posix.c index 3c0ce9c258..b862406c71 100644 --- a/block/file-posix.c +++ b/block/file-posix.c @@ -160,7 +160,6 @@ typedef struct BDRVRawState { bool has_write_zeroes:1; bool use_linux_aio:1; bool use_linux_io_uring:1; - int64_t *offset; /* offset of zone append operation */ int page_cache_inconsistent; /* errno from fdatasync failure */ bool has_fallocate; bool needs_alignment; @@ -2445,12 +2444,13 @@ static bool bdrv_qiov_is_aligned(BlockDriverState *bs, QEMUIOVector *qiov) return true; } -static int coroutine_fn raw_co_prw(BlockDriverState *bs, uint64_t offset, +static int coroutine_fn raw_co_prw(BlockDriverState *bs, int64_t *offset_ptr, uint64_t bytes, QEMUIOVector *qiov, int type) { BDRVRawState *s = bs->opaque; RawPosixAIOData acb; int ret; + uint64_t offset = *offset_ptr; if (fd_open(bs) < 0) return -EIO; @@ -2513,8 +2513,8 @@ out: uint64_t *wp = &wps->wp[offset / bs->bl.zone_size]; if (!BDRV_ZT_IS_CONV(*wp)) { if (type & QEMU_AIO_ZONE_APPEND) { - *s->offset = *wp; - trace_zbd_zone_append_complete(bs, *s->offset + *offset_ptr = *wp; + trace_zbd_zone_append_complete(bs, *offset_ptr >> BDRV_SECTOR_BITS); } /* Advance the wp if needed */ @@ -2539,14 +2539,14 @@ static int coroutine_fn raw_co_preadv(BlockDriverState *bs, int64_t offset, int64_t bytes, QEMUIOVector *qiov, BdrvRequestFlags flags) { - return raw_co_prw(bs, offset, bytes, qiov, QEMU_AIO_READ); + return raw_co_prw(bs, &offset, bytes, qiov, QEMU_AIO_READ); } static int coroutine_fn raw_co_pwritev(BlockDriverState *bs, int64_t offset, int64_t bytes, QEMUIOVector *qiov, BdrvRequestFlags flags) { - return raw_co_prw(bs, offset, bytes, qiov, QEMU_AIO_WRITE); + return raw_co_prw(bs, &offset, bytes, qiov, QEMU_AIO_WRITE); } static int coroutine_fn raw_co_flush_to_disk(BlockDriverState *bs) @@ -3509,8 +3509,6 @@ static int coroutine_fn raw_co_zone_append(BlockDriverState *bs, int64_t zone_size_mask = bs->bl.zone_size - 1; int64_t iov_len = 0; int64_t len = 0; - BDRVRawState *s = bs->opaque; - s->offset = offset; if (*offset & zone_size_mask) { error_report("sector offset %" PRId64 " is not aligned to zone size " @@ -3531,7 +3529,7 @@ static int coroutine_fn raw_co_zone_append(BlockDriverState *bs, } trace_zbd_zone_append(bs, *offset >> BDRV_SECTOR_BITS); - return raw_co_prw(bs, *offset, len, qiov, QEMU_AIO_ZONE_APPEND); + return raw_co_prw(bs, offset, len, qiov, QEMU_AIO_ZONE_APPEND); } #endif