From patchwork Tue Feb 8 15:36:50 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Emanuele Giuseppe Esposito X-Patchwork-Id: 1589977 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: bilbo.ozlabs.org; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=redhat.com header.i=@redhat.com header.a=rsa-sha256 header.s=mimecast20190719 header.b=hqDyYwGv; dkim-atps=neutral Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by bilbo.ozlabs.org (Postfix) with ESMTPS id 4JtT7S49y6z9s09 for ; Wed, 9 Feb 2022 03:32:18 +1100 (AEDT) Received: from localhost ([::1]:52092 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1nHTPY-0000Q9-N8 for incoming@patchwork.ozlabs.org; Tue, 08 Feb 2022 11:32:12 -0500 Received: from eggs.gnu.org ([209.51.188.92]:47080) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1nHSYP-0000Ib-3x for qemu-devel@nongnu.org; Tue, 08 Feb 2022 10:37:20 -0500 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]:26763) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1nHSYD-0002e0-Kl for qemu-devel@nongnu.org; Tue, 08 Feb 2022 10:37:14 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1644334624; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=w2tBAWxKc13b0G4VpXQRinZ9kc2SVuxYbfgKU6+nyAw=; b=hqDyYwGvm2uxRPt2DqaTfqo5r1XT2SOQwPsKmkmVL7AqUR0ZZxETVr8ZeuTujvDi/85GPp 0kh/6vuyjErLmiKBMI1C9aGtfleEcDC00AlM0l1qsadJCCzVAYQuZCJoQAf0pZoRH19XIE h+21OmfhA49JZwDssDHO5WwNOFGSvvA= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-433-jqQxXrQlM_e1mGVkMU0Jew-1; Tue, 08 Feb 2022 10:37:00 -0500 X-MC-Unique: jqQxXrQlM_e1mGVkMU0Jew-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.phx2.redhat.com [10.5.11.12]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id AB5371898291; Tue, 8 Feb 2022 15:36:59 +0000 (UTC) Received: from virtlab701.virt.lab.eng.bos.redhat.com (virtlab701.virt.lab.eng.bos.redhat.com [10.19.152.228]) by smtp.corp.redhat.com (Postfix) with ESMTP id E81157D475; Tue, 8 Feb 2022 15:36:58 +0000 (UTC) From: Emanuele Giuseppe Esposito To: qemu-block@nongnu.org Subject: [PATCH 1/6] block/io.c: fix bdrv_child_cb_drained_begin invocations from a coroutine Date: Tue, 8 Feb 2022 10:36:50 -0500 Message-Id: <20220208153655.1251658-2-eesposit@redhat.com> In-Reply-To: <20220208153655.1251658-1-eesposit@redhat.com> References: <20220208153655.1251658-1-eesposit@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.12 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=eesposit@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Received-SPF: pass client-ip=170.10.129.124; envelope-from=eesposit@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_NONE=0.001, T_SCC_BODY_TEXT_LINE=-0.01, T_SPF_TEMPERROR=0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Kevin Wolf , Fam Zheng , Vladimir Sementsov-Ogievskiy , Emanuele Giuseppe Esposito , qemu-devel@nongnu.org, Hanna Reitz , Stefan Hajnoczi , Paolo Bonzini , John Snow Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Using bdrv_do_drained_begin_quiesce() in bdrv_child_cb_drained_begin() is not a good idea: the callback might be called when running a drain in a coroutine, and bdrv_drained_begin_poll() does not handle that case, resulting in assertion failure. Instead, bdrv_do_drained_begin with no recursion and poll will accomplish the same thing (invoking bdrv_do_drained_begin_quiesce) but will firstly check if we are already in a coroutine, and exit from that via bdrv_co_yield_to_drain(). Signed-off-by: Emanuele Giuseppe Esposito Reviewed-by: Stefan Hajnoczi --- block.c | 2 +- block/io.c | 7 ++++++- include/block/block-io.h | 8 +++++--- 3 files changed, 12 insertions(+), 5 deletions(-) diff --git a/block.c b/block.c index e42a99d2af..4551eba2aa 100644 --- a/block.c +++ b/block.c @@ -1203,7 +1203,7 @@ static char *bdrv_child_get_parent_desc(BdrvChild *c) static void bdrv_child_cb_drained_begin(BdrvChild *child) { BlockDriverState *bs = child->opaque; - bdrv_do_drained_begin_quiesce(bs, NULL, false); + bdrv_drained_begin_no_poll(bs); } static bool bdrv_child_cb_drained_poll(BdrvChild *child) diff --git a/block/io.c b/block/io.c index 683a32c955..d3ee3e1a7c 100644 --- a/block/io.c +++ b/block/io.c @@ -425,7 +425,7 @@ static void coroutine_fn bdrv_co_yield_to_drain(BlockDriverState *bs, } } -void bdrv_do_drained_begin_quiesce(BlockDriverState *bs, +static void bdrv_do_drained_begin_quiesce(BlockDriverState *bs, BdrvChild *parent, bool ignore_bds_parents) { assert(!qemu_in_coroutine()); @@ -487,6 +487,11 @@ void bdrv_subtree_drained_begin(BlockDriverState *bs) bdrv_do_drained_begin(bs, true, NULL, false, true); } +void bdrv_drained_begin_no_poll(BlockDriverState *bs) +{ + bdrv_do_drained_begin(bs, false, NULL, false, false); +} + /** * This function does not poll, nor must any of its recursively called * functions. The *drained_end_counter pointee will be incremented diff --git a/include/block/block-io.h b/include/block/block-io.h index 7c04bc3049..48d3442970 100644 --- a/include/block/block-io.h +++ b/include/block/block-io.h @@ -308,13 +308,15 @@ bool bdrv_drain_poll(BlockDriverState *bs, bool recursive, void bdrv_drained_begin(BlockDriverState *bs); /** - * bdrv_do_drained_begin_quiesce: + * bdrv_drained_begin_no_poll: * * Quiesces a BDS like bdrv_drained_begin(), but does not wait for already * running requests to complete. + * Same as bdrv_drained_begin(), but do not poll for the subgraph to + * actually become unquiesced. Therefore, no graph changes will occur + * with this function. */ -void bdrv_do_drained_begin_quiesce(BlockDriverState *bs, - BdrvChild *parent, bool ignore_bds_parents); +void bdrv_drained_begin_no_poll(BlockDriverState *bs); /** * Like bdrv_drained_begin, but recursively begins a quiesced section for From patchwork Tue Feb 8 15:36:51 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Emanuele Giuseppe Esposito X-Patchwork-Id: 1589980 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: bilbo.ozlabs.org; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=redhat.com header.i=@redhat.com header.a=rsa-sha256 header.s=mimecast20190719 header.b=HrIqVYx7; dkim-atps=neutral Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by bilbo.ozlabs.org (Postfix) with ESMTPS id 4JtTF41S6xz9s09 for ; Wed, 9 Feb 2022 03:37:11 +1100 (AEDT) Received: from localhost ([::1]:60992 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1nHTUL-0006rE-MY for incoming@patchwork.ozlabs.org; Tue, 08 Feb 2022 11:37:09 -0500 Received: from eggs.gnu.org ([209.51.188.92]:47100) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1nHSYR-0000Iz-8s for qemu-devel@nongnu.org; Tue, 08 Feb 2022 10:37:20 -0500 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:35624) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1nHSYE-0002cj-9E for qemu-devel@nongnu.org; Tue, 08 Feb 2022 10:37:17 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1644334622; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=AmVFxJrqQC59K0UGKZi9lS9YAYryefRrUsV+fmHbpv0=; b=HrIqVYx7TomDFBphq1svS+ApQ5LF+37TG+U09GzY6xyIm09ITv9lkpgpXehjbwGWuuIVmK e9yOvCOoc9GRURHHnjNkXnBhutCJQFtvM6KJBEyhXYaDnCi4qsyuLnNWwCHeZ6tCTr4c8k TP8OYALcwQ0rYuRQrx24/CkPfvj1o/w= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-592-gzbsZAtXNmKJtEQXxQAcxA-1; Tue, 08 Feb 2022 10:37:01 -0500 X-MC-Unique: gzbsZAtXNmKJtEQXxQAcxA-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.phx2.redhat.com [10.5.11.12]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 884BA189829C; Tue, 8 Feb 2022 15:37:00 +0000 (UTC) Received: from virtlab701.virt.lab.eng.bos.redhat.com (virtlab701.virt.lab.eng.bos.redhat.com [10.19.152.228]) by smtp.corp.redhat.com (Postfix) with ESMTP id C56AB7D4AF; Tue, 8 Feb 2022 15:36:59 +0000 (UTC) From: Emanuele Giuseppe Esposito To: qemu-block@nongnu.org Subject: [PATCH 2/6] block.c: bdrv_replace_child_noperm: first remove the child, and then call ->detach() Date: Tue, 8 Feb 2022 10:36:51 -0500 Message-Id: <20220208153655.1251658-3-eesposit@redhat.com> In-Reply-To: <20220208153655.1251658-1-eesposit@redhat.com> References: <20220208153655.1251658-1-eesposit@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.12 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=eesposit@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Received-SPF: pass client-ip=170.10.133.124; envelope-from=eesposit@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Kevin Wolf , Fam Zheng , Vladimir Sementsov-Ogievskiy , Emanuele Giuseppe Esposito , qemu-devel@nongnu.org, Hanna Reitz , Stefan Hajnoczi , Paolo Bonzini , John Snow Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Doing the opposite can make ->detach() (more precisely bdrv_unapply_subtree_drain() in bdrv_child_cb_detach) undo the subtree_drain just performed to protect the removal of the child from the graph, thus making the fully-enabled assert_bdrv_graph_writable fail. Note that assert_bdrv_graph_writable is not yet fully enabled. Signed-off-by: Emanuele Giuseppe Esposito Reviewed-by: Stefan Hajnoczi --- block.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-) diff --git a/block.c b/block.c index 4551eba2aa..ec346a7e2e 100644 --- a/block.c +++ b/block.c @@ -2854,14 +2854,16 @@ static void bdrv_replace_child_noperm(BdrvChild **childp, } if (old_bs) { - /* Detach first so that the recursive drain sections coming from @child + assert_bdrv_graph_writable(old_bs); + QLIST_REMOVE(child, next_parent); + /* + * Detach first so that the recursive drain sections coming from @child * are already gone and we only end the drain sections that came from - * elsewhere. */ + * elsewhere. + */ if (child->klass->detach) { child->klass->detach(child); } - assert_bdrv_graph_writable(old_bs); - QLIST_REMOVE(child, next_parent); } child->bs = new_bs; From patchwork Tue Feb 8 15:36:52 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Emanuele Giuseppe Esposito X-Patchwork-Id: 1590021 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: bilbo.ozlabs.org; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=redhat.com header.i=@redhat.com header.a=rsa-sha256 header.s=mimecast20190719 header.b=IGxeC9XL; dkim-atps=neutral Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by bilbo.ozlabs.org (Postfix) with ESMTPS id 4JtWZX59lrz9s0B for ; Wed, 9 Feb 2022 05:22:28 +1100 (AEDT) Received: from localhost ([::1]:59740 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1nHV8E-0003xH-0w for incoming@patchwork.ozlabs.org; Tue, 08 Feb 2022 13:22:26 -0500 Received: from eggs.gnu.org ([209.51.188.92]:47266) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1nHSYe-0000Tx-Vd for qemu-devel@nongnu.org; Tue, 08 Feb 2022 10:37:33 -0500 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]:35270) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1nHSYE-0002eN-Py for qemu-devel@nongnu.org; Tue, 08 Feb 2022 10:37:32 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1644334626; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=I3/bM+4lQ/kO8LqzI5XVk+qD+YU+/kn83G35vECh8vc=; b=IGxeC9XLLIGscyPH0WKYCMfasnYj9UWe02I6S0sR+vpADm3tHFnOTG+gYQxufHw7R8ENmZ vUFD0Aws81tsGg7orAPteapZwhA1dvjvWZP1yWoe3U+etotBvKecoLXKdIF8MnMEEODMgk Hqn9RPCeMTLKWlF7X2q0UyDfDNjNLm0= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-253-ppz_WKFLNY2TlK3DmFm3Tg-1; Tue, 08 Feb 2022 10:37:03 -0500 X-MC-Unique: ppz_WKFLNY2TlK3DmFm3Tg-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.phx2.redhat.com [10.5.11.12]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 66956363A6; Tue, 8 Feb 2022 15:37:01 +0000 (UTC) Received: from virtlab701.virt.lab.eng.bos.redhat.com (virtlab701.virt.lab.eng.bos.redhat.com [10.19.152.228]) by smtp.corp.redhat.com (Postfix) with ESMTP id A2DF07D473; Tue, 8 Feb 2022 15:37:00 +0000 (UTC) From: Emanuele Giuseppe Esposito To: qemu-block@nongnu.org Subject: [PATCH 3/6] block.c: bdrv_replace_child_noperm: first call ->attach(), and then add child Date: Tue, 8 Feb 2022 10:36:52 -0500 Message-Id: <20220208153655.1251658-4-eesposit@redhat.com> In-Reply-To: <20220208153655.1251658-1-eesposit@redhat.com> References: <20220208153655.1251658-1-eesposit@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.12 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=eesposit@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Received-SPF: pass client-ip=170.10.129.124; envelope-from=eesposit@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Kevin Wolf , Fam Zheng , Vladimir Sementsov-Ogievskiy , Emanuele Giuseppe Esposito , qemu-devel@nongnu.org, Hanna Reitz , Stefan Hajnoczi , Paolo Bonzini , John Snow Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Doing the opposite can make adding the child node to a non-drained node, as apply_subtree_drain is only done in ->attach() and thus make assert_bdrv_graph_writable fail. This can happen for example during a transaction rollback (test 245, test_io_with_graph_changes): 1. a node is removed from the graph, thus it is undrained 2. then something happens, and we need to roll back the transactions through tran_abort() 3. at this point, the current code would first attach the undrained node to the graph via QLIST_INSERT_HEAD, and then call ->attach() that will take care of restoring the drain with apply_subtree_drain(), leaving the node undrained between the two operations. Signed-off-by: Emanuele Giuseppe Esposito Reviewed-by: Stefan Hajnoczi --- block.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/block.c b/block.c index ec346a7e2e..08a6e3a4ef 100644 --- a/block.c +++ b/block.c @@ -2872,8 +2872,6 @@ static void bdrv_replace_child_noperm(BdrvChild **childp, } if (new_bs) { - assert_bdrv_graph_writable(new_bs); - QLIST_INSERT_HEAD(&new_bs->parents, child, next_parent); /* * Detaching the old node may have led to the new node's @@ -2890,6 +2888,10 @@ static void bdrv_replace_child_noperm(BdrvChild **childp, if (child->klass->attach) { child->klass->attach(child); } + + assert_bdrv_graph_writable(new_bs); + QLIST_INSERT_HEAD(&new_bs->parents, child, next_parent); + } /* From patchwork Tue Feb 8 15:36:53 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Emanuele Giuseppe Esposito X-Patchwork-Id: 1589983 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: bilbo.ozlabs.org; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=redhat.com header.i=@redhat.com header.a=rsa-sha256 header.s=mimecast20190719 header.b=gOwnd71l; dkim-atps=neutral Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by bilbo.ozlabs.org (Postfix) with ESMTPS id 4JtTLr6F21z9s09 for ; Wed, 9 Feb 2022 03:42:12 +1100 (AEDT) Received: from localhost ([::1]:41442 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1nHTZC-0004aX-ND for incoming@patchwork.ozlabs.org; Tue, 08 Feb 2022 11:42:10 -0500 Received: from eggs.gnu.org ([209.51.188.92]:47130) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1nHSYU-0000JY-2i for qemu-devel@nongnu.org; Tue, 08 Feb 2022 10:37:22 -0500 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:24907) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1nHSYE-0002e5-70 for qemu-devel@nongnu.org; Tue, 08 Feb 2022 10:37:20 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1644334624; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=wooyD7Yqu7I8E0U9w4WCE3qS2Jo+fC3O3hAAv7wXP64=; b=gOwnd71lD5ePWDY+vnf++xOIntoy5unPpkVDykoIy9dHmYatUsQfZWLeJdjQSCmXvSY0Ay Tf1kkAGYauxBKFcPX1kWr3O8SNmpBUcIV/CyTyMGtLr9SPaoknAq0YuB/QYh20+EQ4kKEO yrYkJVcTndQcfMY0qNjru40UzENItbQ= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-596-dFkCcP-aNIi9IED_udlYhA-1; Tue, 08 Feb 2022 10:37:03 -0500 X-MC-Unique: dFkCcP-aNIi9IED_udlYhA-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.phx2.redhat.com [10.5.11.12]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 44807100C669; Tue, 8 Feb 2022 15:37:02 +0000 (UTC) Received: from virtlab701.virt.lab.eng.bos.redhat.com (virtlab701.virt.lab.eng.bos.redhat.com [10.19.152.228]) by smtp.corp.redhat.com (Postfix) with ESMTP id 811627D473; Tue, 8 Feb 2022 15:37:01 +0000 (UTC) From: Emanuele Giuseppe Esposito To: qemu-block@nongnu.org Subject: [PATCH 4/6] test-bdrv-drain.c: adapt test to the coming subtree drains Date: Tue, 8 Feb 2022 10:36:53 -0500 Message-Id: <20220208153655.1251658-5-eesposit@redhat.com> In-Reply-To: <20220208153655.1251658-1-eesposit@redhat.com> References: <20220208153655.1251658-1-eesposit@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.12 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=eesposit@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Received-SPF: pass client-ip=170.10.133.124; envelope-from=eesposit@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_NONE=0.001, T_SCC_BODY_TEXT_LINE=-0.01, T_SPF_TEMPERROR=0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Kevin Wolf , Fam Zheng , Vladimir Sementsov-Ogievskiy , Emanuele Giuseppe Esposito , qemu-devel@nongnu.org, Hanna Reitz , Stefan Hajnoczi , Paolo Bonzini , John Snow Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" There will be 2 problems in this test when we will add subtree drains in bdrv_replace_child_noperm: - First, the test is inconsistent about taking the AioContext lock when calling bdrv_replace_child_noperm. bdrv_replace_child_noperm is reached in two places: from blk_insert_bs directly, and via block_job_create. Only the second does it with the AioContext lock taken, and there seems to be no reason why the lock is needed. Move aio_context_acquire further down, to just protect block_job_add_bdrv() - Second, test_detach_indirect is only interested in observing the first call to .drained_begin. In the original test, there was only a single subtree drain; however, with additional drains introduced in bdrv_replace_child_noperm(), the test callback would be called too early and/or multiple times. Override the callback only when we actually want to use it, and put back the original after it's been invoked. Signed-off-by: Emanuele Giuseppe Esposito --- tests/unit/test-bdrv-drain.c | 17 ++++++++++++----- 1 file changed, 12 insertions(+), 5 deletions(-) diff --git a/tests/unit/test-bdrv-drain.c b/tests/unit/test-bdrv-drain.c index 4924ceb562..c52ba2db4e 100644 --- a/tests/unit/test-bdrv-drain.c +++ b/tests/unit/test-bdrv-drain.c @@ -912,12 +912,12 @@ static void test_blockjob_common_drain_node(enum drain_type drain_type, blk_insert_bs(blk_target, target, &error_abort); blk_set_allow_aio_context_change(blk_target, true); - aio_context_acquire(ctx); tjob = block_job_create("job0", &test_job_driver, NULL, src, 0, BLK_PERM_ALL, 0, 0, NULL, NULL, &error_abort); tjob->bs = src; job = &tjob->common; + aio_context_acquire(ctx); block_job_add_bdrv(job, "target", target, 0, BLK_PERM_ALL, &error_abort); switch (result) { @@ -1342,15 +1342,18 @@ static void detach_by_parent_aio_cb(void *opaque, int ret) } } +static BdrvChildClass detach_by_driver_cb_class; + static void detach_by_driver_cb_drained_begin(BdrvChild *child) { + /* restore .drained_begin cb, we don't need it anymore. */ + detach_by_driver_cb_class.drained_begin = child_of_bds.drained_begin; + aio_bh_schedule_oneshot(qemu_get_current_aio_context(), detach_indirect_bh, &detach_by_parent_data); child_of_bds.drained_begin(child); } -static BdrvChildClass detach_by_driver_cb_class; - /* * Initial graph: * @@ -1382,8 +1385,6 @@ static void test_detach_indirect(bool by_parent_cb) if (!by_parent_cb) { detach_by_driver_cb_class = child_of_bds; - detach_by_driver_cb_class.drained_begin = - detach_by_driver_cb_drained_begin; } /* Create all involved nodes */ @@ -1441,6 +1442,12 @@ static void test_detach_indirect(bool by_parent_cb) acb = blk_aio_preadv(blk, 0, &qiov, 0, detach_by_parent_aio_cb, NULL); g_assert(acb != NULL); + if (!by_parent_cb) { + /* set .drained_begin cb to run only in the following drain. */ + detach_by_driver_cb_class.drained_begin = + detach_by_driver_cb_drained_begin; + } + /* Drain and check the expected result */ bdrv_subtree_drained_begin(parent_b); From patchwork Tue Feb 8 15:36:54 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Emanuele Giuseppe Esposito X-Patchwork-Id: 1590023 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: bilbo.ozlabs.org; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=redhat.com header.i=@redhat.com header.a=rsa-sha256 header.s=mimecast20190719 header.b=DPlMV2Bn; dkim-atps=neutral Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by bilbo.ozlabs.org (Postfix) with ESMTPS id 4JtWj12q31z9s0B for ; Wed, 9 Feb 2022 05:28:04 +1100 (AEDT) Received: from localhost ([::1]:38880 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1nHVDc-0000z5-4W for incoming@patchwork.ozlabs.org; Tue, 08 Feb 2022 13:28:00 -0500 Received: from eggs.gnu.org ([209.51.188.92]:47262) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1nHSYe-0000Tw-Mg for qemu-devel@nongnu.org; Tue, 08 Feb 2022 10:37:32 -0500 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]:47356) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1nHSYI-0002ee-4K for qemu-devel@nongnu.org; Tue, 08 Feb 2022 10:37:31 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1644334627; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=i9p3DrLXzXtNC11veQfbYrUtwOaG0S532XgvkB92rFU=; b=DPlMV2Bn+HBGt0TgIN8sGP4CBGQp5wdK+jfZoBi6OEIk4zYQsx2tvYdwMl8hv9pGorTa3c OS8OJzbJwGGIyw6lBzfycpx3WKZmWZumrdh9MB2n4HIRg9jZO4BY0gA5Molg5eSluigXJJ sCQIWlz1x6wwCwDMw7STBWbMB1TiZvs= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-625-rvHOCPGtMQm8pbpxHokIWA-1; Tue, 08 Feb 2022 10:37:04 -0500 X-MC-Unique: rvHOCPGtMQm8pbpxHokIWA-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.phx2.redhat.com [10.5.11.12]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 240621091DA2; Tue, 8 Feb 2022 15:37:03 +0000 (UTC) Received: from virtlab701.virt.lab.eng.bos.redhat.com (virtlab701.virt.lab.eng.bos.redhat.com [10.19.152.228]) by smtp.corp.redhat.com (Postfix) with ESMTP id 5FB507D473; Tue, 8 Feb 2022 15:37:02 +0000 (UTC) From: Emanuele Giuseppe Esposito To: qemu-block@nongnu.org Subject: [PATCH 5/6] test-bdrv-drain.c: remove test_detach_by_parent_cb() Date: Tue, 8 Feb 2022 10:36:54 -0500 Message-Id: <20220208153655.1251658-6-eesposit@redhat.com> In-Reply-To: <20220208153655.1251658-1-eesposit@redhat.com> References: <20220208153655.1251658-1-eesposit@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.12 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=eesposit@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Received-SPF: pass client-ip=170.10.129.124; envelope-from=eesposit@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Kevin Wolf , Fam Zheng , Vladimir Sementsov-Ogievskiy , Emanuele Giuseppe Esposito , qemu-devel@nongnu.org, Hanna Reitz , Stefan Hajnoczi , Paolo Bonzini , John Snow Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" This test uses a callback of an I/O function (blk_aio_preadv) to modify the graph, using bdrv_attach_child. This is simply not allowed anymore. I/O cannot change the graph. Before "block/io.c: make bdrv_do_drained_begin_quiesce static and introduce bdrv_drained_begin_no_poll", the test would simply be at risk of failure, because if bdrv_replace_child_noperm() (called to modify the graph) would call a drain, then one callback of .drained_begin() is bdrv_do_drained_begin_quiesce, that specifically asserts that we are not in a coroutine. Now that we fixed the behavior, the drain will invoke a bh in the main loop, so we don't have such problem. However, this test is still illegal and fails because we forbid graph changes from I/O paths. Once we add the required subtree_drains to protect bdrv_replace_child_noperm(), the key problem in this test is in: acb = blk_aio_preadv(blk, 0, &qiov, 0, detach_by_parent_aio_cb, NULL); /* Drain and check the expected result */ bdrv_subtree_drained_begin(parent_b); because the detach_by_parent_aio_cb calls detach_indirect_bh(), that modifies the graph and is invoked during bdrv_subtree_drained_begin(). The call stack is the following: 1. blk_aio_preadv() creates a coroutine, increments in_flight counter and enters the coroutine running blk_aio_read_entry() 2. blk_aio_read_entry() performs the read and then schedules a bh to complete (blk_aio_complete) 3. at this point, subtree_drained_begin() kicks in and waits for all in_flight requests, polling 4. polling allows the bh to be scheduled, so blk_aio_complete runs 5. blk_aio_complete *first* invokes the callback (detach_by_parent_aio_cb) and then decrements the in_flight counter 6. Here we have the problem: detach_by_parent_aio_cb modifies the graph, so both bdrv_unref_child() and bdrv_attach_child() will have subtree_drains inside. And this causes a deadlock, because the nested drain will wait for in_flight counter to go to zero, which is only happening once the drain itself finishes. Different story is test_detach_by_driver_cb(): in this case, detach_by_parent_aio_cb() does not call detach_indirect_bh(), but it is instead called as a bh running in the main loop by detach_by_driver_cb_drained_begin(), the callback for .drained_begin(). This test was added in 231281ab42 and part of the series "Drain fixes and cleanups, part 3" https://lists.nongnu.org/archive/html/qemu-block/2018-05/msg01132.html but as explained above I believe that it is not valid anymore, and can be discarded. Signed-off-by: Emanuele Giuseppe Esposito Reviewed-by: Stefan Hajnoczi --- tests/unit/test-bdrv-drain.c | 46 +++++++++--------------------------- 1 file changed, 11 insertions(+), 35 deletions(-) diff --git a/tests/unit/test-bdrv-drain.c b/tests/unit/test-bdrv-drain.c index c52ba2db4e..ef154b6536 100644 --- a/tests/unit/test-bdrv-drain.c +++ b/tests/unit/test-bdrv-drain.c @@ -1316,7 +1316,6 @@ struct detach_by_parent_data { BdrvChild *child_b; BlockDriverState *c; BdrvChild *child_c; - bool by_parent_cb; }; static struct detach_by_parent_data detach_by_parent_data; @@ -1334,12 +1333,7 @@ static void detach_indirect_bh(void *opaque) static void detach_by_parent_aio_cb(void *opaque, int ret) { - struct detach_by_parent_data *data = &detach_by_parent_data; - g_assert_cmpint(ret, ==, 0); - if (data->by_parent_cb) { - detach_indirect_bh(data); - } } static BdrvChildClass detach_by_driver_cb_class; @@ -1361,31 +1355,24 @@ static void detach_by_driver_cb_drained_begin(BdrvChild *child) * \ / \ * A B C * - * by_parent_cb == true: Test that parent callbacks don't poll - * - * PA has a pending write request whose callback changes the child nodes of - * PB: It removes B and adds C instead. The subtree of PB is drained, which - * will indirectly drain the write request, too. - * - * by_parent_cb == false: Test that bdrv_drain_invoke() doesn't poll + * Test that bdrv_drain_invoke() doesn't poll * * PA's BdrvChildClass has a .drained_begin callback that schedules a BH * that does the same graph change. If bdrv_drain_invoke() calls it, the * state is messed up, but if it is only polled in the single * BDRV_POLL_WHILE() at the end of the drain, this should work fine. */ -static void test_detach_indirect(bool by_parent_cb) +static void test_detach_indirect(void) { BlockBackend *blk; BlockDriverState *parent_a, *parent_b, *a, *b, *c; BdrvChild *child_a, *child_b; BlockAIOCB *acb; + BDRVTestState *s; QEMUIOVector qiov = QEMU_IOVEC_INIT_BUF(qiov, NULL, 0); - if (!by_parent_cb) { - detach_by_driver_cb_class = child_of_bds; - } + detach_by_driver_cb_class = child_of_bds; /* Create all involved nodes */ parent_a = bdrv_new_open_driver(&bdrv_test, "parent-a", BDRV_O_RDWR, @@ -1404,10 +1391,8 @@ static void test_detach_indirect(bool by_parent_cb) /* If we want to get bdrv_drain_invoke() to call aio_poll(), the driver * callback must not return immediately. */ - if (!by_parent_cb) { - BDRVTestState *s = parent_a->opaque; - s->sleep_in_drain_begin = true; - } + s = parent_a->opaque; + s->sleep_in_drain_begin = true; /* Set child relationships */ bdrv_ref(b); @@ -1419,7 +1404,7 @@ static void test_detach_indirect(bool by_parent_cb) bdrv_ref(a); bdrv_attach_child(parent_a, a, "PA-A", - by_parent_cb ? &child_of_bds : &detach_by_driver_cb_class, + &detach_by_driver_cb_class, BDRV_CHILD_DATA, &error_abort); g_assert_cmpint(parent_a->refcnt, ==, 1); @@ -1437,16 +1422,13 @@ static void test_detach_indirect(bool by_parent_cb) .parent_b = parent_b, .child_b = child_b, .c = c, - .by_parent_cb = by_parent_cb, }; acb = blk_aio_preadv(blk, 0, &qiov, 0, detach_by_parent_aio_cb, NULL); g_assert(acb != NULL); - if (!by_parent_cb) { - /* set .drained_begin cb to run only in the following drain. */ - detach_by_driver_cb_class.drained_begin = - detach_by_driver_cb_drained_begin; - } + /* set .drained_begin cb to run only in the following drain. */ + detach_by_driver_cb_class.drained_begin = + detach_by_driver_cb_drained_begin; /* Drain and check the expected result */ bdrv_subtree_drained_begin(parent_b); @@ -1482,14 +1464,9 @@ static void test_detach_indirect(bool by_parent_cb) bdrv_unref(c); } -static void test_detach_by_parent_cb(void) -{ - test_detach_indirect(true); -} - static void test_detach_by_driver_cb(void) { - test_detach_indirect(false); + test_detach_indirect(); } static void test_append_to_drained(void) @@ -2244,7 +2221,6 @@ int main(int argc, char **argv) g_test_add_func("/bdrv-drain/detach/drain_all", test_detach_by_drain_all); g_test_add_func("/bdrv-drain/detach/drain", test_detach_by_drain); g_test_add_func("/bdrv-drain/detach/drain_subtree", test_detach_by_drain_subtree); - g_test_add_func("/bdrv-drain/detach/parent_cb", test_detach_by_parent_cb); g_test_add_func("/bdrv-drain/detach/driver_cb", test_detach_by_driver_cb); g_test_add_func("/bdrv-drain/attach/drain", test_append_to_drained); From patchwork Tue Feb 8 15:36:55 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Emanuele Giuseppe Esposito X-Patchwork-Id: 1590019 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: bilbo.ozlabs.org; dkim=fail reason="signature verification failed" (1024-bit key; unprotected) header.d=redhat.com header.i=@redhat.com header.a=rsa-sha256 header.s=mimecast20190719 header.b=SLPx/R2X; dkim-atps=neutral Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by bilbo.ozlabs.org (Postfix) with ESMTPS id 4JtWSk5dy4z9s0B for ; Wed, 9 Feb 2022 05:17:26 +1100 (AEDT) Received: from localhost ([::1]:53240 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1nHV3M-0007cV-Jq for incoming@patchwork.ozlabs.org; Tue, 08 Feb 2022 13:17:24 -0500 Received: from eggs.gnu.org ([209.51.188.92]:47204) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1nHSYb-0000SG-Vq for qemu-devel@nongnu.org; Tue, 08 Feb 2022 10:37:30 -0500 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:20635) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1nHSYH-0002ei-QX for qemu-devel@nongnu.org; Tue, 08 Feb 2022 10:37:29 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1644334628; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=8ScGcYB9BhOfQXboStI0hGCZ92n6rkbvzVu2F1ND7Z0=; b=SLPx/R2XU+JUmoF1LFKfaexP94Vy3kMnmx0AfEzKxwCF7tlb7CbrPCXFz+Zl5us+sRKx3I 7XRQRso1RRSznOKEj05U44YAxDDrHkiencu33bGrXkwTSRP3lXJvGCo5Natdi/2TvRhEQC LIQuAiqkuwzhy+n2uxsUDtjv9Bv2fzM= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-662-D7b9JGZQNZGoibbzlnwd4w-1; Tue, 08 Feb 2022 10:37:05 -0500 X-MC-Unique: D7b9JGZQNZGoibbzlnwd4w-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.phx2.redhat.com [10.5.11.12]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 02ADD1091DA1; Tue, 8 Feb 2022 15:37:04 +0000 (UTC) Received: from virtlab701.virt.lab.eng.bos.redhat.com (virtlab701.virt.lab.eng.bos.redhat.com [10.19.152.228]) by smtp.corp.redhat.com (Postfix) with ESMTP id 3EEAB7D473; Tue, 8 Feb 2022 15:37:03 +0000 (UTC) From: Emanuele Giuseppe Esposito To: qemu-block@nongnu.org Subject: [PATCH 6/6] jobs: ensure sleep in job_sleep_ns is fully performed Date: Tue, 8 Feb 2022 10:36:55 -0500 Message-Id: <20220208153655.1251658-7-eesposit@redhat.com> In-Reply-To: <20220208153655.1251658-1-eesposit@redhat.com> References: <20220208153655.1251658-1-eesposit@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.12 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=eesposit@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Received-SPF: pass client-ip=170.10.133.124; envelope-from=eesposit@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -27 X-Spam_score: -2.8 X-Spam_bar: -- X-Spam_report: (-2.8 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Kevin Wolf , Fam Zheng , Vladimir Sementsov-Ogievskiy , Emanuele Giuseppe Esposito , qemu-devel@nongnu.org, Hanna Reitz , Stefan Hajnoczi , Paolo Bonzini , John Snow Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" If a drain happens while a job is sleeping, the timeout gets cancelled and the job continues once the drain ends. This is especially bad for the sleep performed in commit and stream jobs, since that is dictated by ratelimit to maintain a certain speed. Basically the execution path is the following: 1. job calls job_sleep_ns, and yield with a timer in @ns ns. 2. meanwhile, a drain is executed, and child_job_drained_{begin/end} could be executed as ->drained_begin() and ->drained_end() callbacks. Therefore child_job_drained_begin() enters the job, that continues execution in job_sleep_ns() and calls job_pause_point_locked(). 3. job_pause_point_locked() detects that we are in the middle of a drain, and firstly deletes any existing timer and then yields again, waiting for ->drained_end(). 4. Once draining is finished, child_job_drained_end() runs and resumes the job. At this point, the timer has been lost and we just resume without checking if enough time has passed. This fix implies that from now onwards, job_sleep_ns will force the job to sleep @ns, even if it is wake up (purposefully or not) in the middle of the sleep. Therefore qemu-iotests test might run a little bit slower, depending on the speed of the job. Setting a job speed to values like "1" is not allowed anymore (unless you want to wait forever). Because of this fix, test_stream_parallel() in tests/qemu-iotests/030 takes too long, since speed of stream job is just 1024 and before it was skipping all the wait thanks to the drains. Increase the speed to 256 * 1024. Exactly the same happens for test 151. Instead we need to sleep less in test_cancel_ready() test-blockjob.c, so that the job will be able to exit the sleep and transition to ready before the main loop asserts. Signed-off-by: Emanuele Giuseppe Esposito Acked-by: Stefan Hajnoczi Acked-by: Vladimir Sementsov-Ogievskiy --- job.c | 19 +++++++++++-------- tests/qemu-iotests/030 | 2 +- tests/qemu-iotests/151 | 4 ++-- tests/unit/test-blockjob.c | 2 +- 4 files changed, 15 insertions(+), 12 deletions(-) diff --git a/job.c b/job.c index c7a41d88d6..72d6c462ed 100644 --- a/job.c +++ b/job.c @@ -639,19 +639,22 @@ void job_yield(Job *job) void coroutine_fn job_sleep_ns(Job *job, int64_t ns) { + int64_t end_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME) + ns; JOB_LOCK_GUARD(); assert(job->busy); - /* Check cancellation *before* setting busy = false, too! */ - if (job_is_cancelled_locked(job)) { - return; - } + do { + /* Check cancellation *before* setting busy = false, too! */ + if (job_is_cancelled_locked(job)) { + return; + } - if (!job_should_pause_locked(job)) { - job_do_yield_locked(job, qemu_clock_get_ns(QEMU_CLOCK_REALTIME) + ns); - } + if (!job_should_pause_locked(job)) { + job_do_yield_locked(job, end_ns); + } - job_pause_point_locked(job); + job_pause_point_locked(job); + } while (qemu_clock_get_ns(QEMU_CLOCK_REALTIME) < end_ns); } /* Assumes the job_mutex is held */ diff --git a/tests/qemu-iotests/030 b/tests/qemu-iotests/030 index 567bf1da67..969b246d0f 100755 --- a/tests/qemu-iotests/030 +++ b/tests/qemu-iotests/030 @@ -248,7 +248,7 @@ class TestParallelOps(iotests.QMPTestCase): pending_jobs.append(job_id) result = self.vm.qmp('block-stream', device=node_name, job_id=job_id, bottom=f'node{i-1}', - speed=1024) + speed=256*1024) self.assert_qmp(result, 'return', {}) # Do this in reverse: After unthrottling them, some jobs may finish diff --git a/tests/qemu-iotests/151 b/tests/qemu-iotests/151 index 93d14193d0..5998beb5c4 100755 --- a/tests/qemu-iotests/151 +++ b/tests/qemu-iotests/151 @@ -129,7 +129,7 @@ class TestActiveMirror(iotests.QMPTestCase): sync='full', copy_mode='write-blocking', buf_size=(1048576 // 4), - speed=1) + speed=1024*1024) self.assert_qmp(result, 'return', {}) # Start an unaligned request to a dirty area @@ -154,7 +154,7 @@ class TestActiveMirror(iotests.QMPTestCase): target='target-node', sync='full', copy_mode='write-blocking', - speed=1) + speed=1024*1024) self.vm.hmp_qemu_io('source', 'break write_aio A') self.vm.hmp_qemu_io('source', 'aio_write 0 1M') # 1 diff --git a/tests/unit/test-blockjob.c b/tests/unit/test-blockjob.c index d6fc52f80a..81ebc7d154 100644 --- a/tests/unit/test-blockjob.c +++ b/tests/unit/test-blockjob.c @@ -184,7 +184,7 @@ static int coroutine_fn cancel_job_run(Job *job, Error **errp) job_transition_to_ready(&s->common.job); } - job_sleep_ns(&s->common.job, 100000); + job_sleep_ns(&s->common.job, 100); } return 0;