From patchwork Wed Nov 16 13:43:43 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Emanuele Giuseppe Esposito X-Patchwork-Id: 1704612 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=redhat.com header.i=@redhat.com header.a=rsa-sha256 header.s=mimecast20190719 header.b=PZrrAofa; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4NC46z38c2z20KC for ; Thu, 17 Nov 2022 00:45:13 +1100 (AEDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1ovIiA-0006Cs-No; Wed, 16 Nov 2022 08:44:18 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ovIi9-0006CM-6X for qemu-devel@nongnu.org; Wed, 16 Nov 2022 08:44:17 -0500 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1ovIi7-0006e4-De for qemu-devel@nongnu.org; Wed, 16 Nov 2022 08:44:16 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1668606254; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding; bh=hzuVg1JoQt1Jv9OxtU1sf1xQ6cxH/J5SAqaUbLqo/0o=; b=PZrrAofak36PtmdKaGEnnxpScnLswvr2H/0Ar2fVlin1TX53OYsmkvTCTBmlR1WAOyHOe+ 5bXRHL/MbzSyIFjbOzIbnNiWsekinhN4WHWAkP3I/EuNL1LhfLXOZxDCI2piR55GdeRi8z Z7ik2+8PdyxUrI0hj937lvfds9dpArw= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-647-AIAbfhR0N2Kg7n48o2EMRw-1; Wed, 16 Nov 2022 08:44:10 -0500 X-MC-Unique: AIAbfhR0N2Kg7n48o2EMRw-1 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.rdu2.redhat.com [10.11.54.1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 85E4C811E75; Wed, 16 Nov 2022 13:44:10 +0000 (UTC) Received: from virtlab701.virt.lab.eng.bos.redhat.com (virtlab701.virt.lab.eng.bos.redhat.com [10.19.152.228]) by smtp.corp.redhat.com (Postfix) with ESMTP id 34CD040C2086; Wed, 16 Nov 2022 13:44:10 +0000 (UTC) From: Emanuele Giuseppe Esposito To: qemu-block@nongnu.org Cc: Kevin Wolf , Hanna Reitz , John Snow , Paolo Bonzini , Vladimir Sementsov-Ogievskiy , Stefan Hajnoczi , Fam Zheng , Eric Blake , Cleber Rosa , qemu-devel@nongnu.org, Emanuele Giuseppe Esposito Subject: [PATCH 00/20] Protect the block layer with a rwlock: part 1 Date: Wed, 16 Nov 2022 08:43:43 -0500 Message-Id: <20221116134403.3051127-1-eesposit@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.1 Received-SPF: pass client-ip=170.10.129.124; envelope-from=eesposit@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org This serie is the first of four series that aim to introduce and use a new graph rwlock in the QEMU block layer. The aim is to replace the current AioContext lock with much fine-grained locks, aimed to protect only specific data. Currently the AioContext lock is used pretty much everywhere, and it's not even clear what it is protecting exactly. The aim of the rwlock is to cover graph modifications: more precisely, when a BlockDriverState parent or child list is modified or read, since it can be concurrently accessed by the main loop and iothreads. The main assumption is that the main loop is the only one allowed to perform graph modifications, and so far this has always been held by the current code. The rwlock is inspired from cpus-common.c implementation, and aims to reduce cacheline bouncing by having per-aiocontext counter of readers. All details and implementation of the lock are in patch 1. We distinguish between writer (main loop, under BQL) that modifies the graph, and readers (all other coroutines running in various AioContext), that go through the graph edges, reading ->parents and->children. The writer (main loop) has an "exclusive" access, so it first waits for current read to finish, and then prevents incoming ones from entering while it has the exclusive access. The readers (coroutines in multiple AioContext) are free to access the graph as long the writer is not modifying the graph. In case it is, they go in a CoQueue and sleep until the writer is done. In this first serie, my aim is to introduce the lock (patches 1-3,6), cover the main graph writer (patch 4), define assertions (patch 5) and start using the read lock in the generated_co_wrapper functions (7-20). Such functions recursively traverse the BlockDriverState graph, so they must take the graph rdlock. We distinguish two cases related to the generated_co_wrapper (often shortened to g_c_w): - qemu_in_coroutine(), which means the function is already running in a coroutine. This means we don't take the lock, because the coroutine must have taken it when it started - !qemu_in_coroutine(), which means we need to create a new coroutine that performs the operation requested. In this case we take the rdlock as soon as the coroutine starts, and release only before finishing. In this and following series, we try to follow the following locking pattern: - bdrv_co_* functions that call BlockDriver callbacks always expect the lock to be taken, therefore they assert. - blk_co_* functions usually call blk_wait_while_drained(), therefore they must take the lock internally. Therefore we introduce generated_co_wrapper_blk, which does not take the rdlock when starting the coroutine. The long term goal of this series is to eventually replace the AioContext lock, so that we can get rid of it once and for all. This serie is based on v4 of "Still more coroutine and various fixes in block layer". Based-on: <20221116122241.2856527-1-eesposit@redhat.com> Thank you, Emanuele Emanuele Giuseppe Esposito (19): graph-lock: introduce BdrvGraphRWlock structure async: register/unregister aiocontext in graph lock list block.c: wrlock in bdrv_replace_child_noperm block: remove unnecessary assert_bdrv_graph_writable() block: assert that graph read and writes are performed correctly graph-lock: implement WITH_GRAPH_RDLOCK_GUARD and GRAPH_RDLOCK_GUARD macros block-coroutine-wrapper.py: take the graph rdlock in bdrv_* functions block-backend: introduce new generated_co_wrapper_blk annotation block-gen: assert that {bdrv/blk}_co_truncate is always called with graph rdlock taken block-gen: assert that bdrv_co_{check/invalidate_cache} are always called with graph rdlock taken block-gen: assert that bdrv_co_pwrite is always called with graph rdlock taken block-gen: assert that bdrv_co_pwrite_{zeros/sync} is always called with graph rdlock taken block-gen: assert that bdrv_co_pread is always called with graph rdlock taken block-gen: assert that {bdrv/blk}_co_flush is always called with graph rdlock taken block-gen: assert that bdrv_co_{read/write}v_vmstate are always called with graph rdlock taken block-gen: assert that bdrv_co_pdiscard is always called with graph rdlock taken block-gen: assert that bdrv_co_common_block_status_above is always called with graph rdlock taken block-gen: assert that bdrv_co_ioctl is always called with graph rdlock taken block-gen: assert that nbd_co_do_establish_connection is always called with graph rdlock taken Paolo Bonzini (1): block: introduce a lock to protect graph operations block.c | 15 +- block/backup.c | 3 + block/block-backend.c | 10 +- block/block-copy.c | 10 +- block/graph-lock.c | 255 +++++++++++++++++++++++++ block/io.c | 15 ++ block/meson.build | 1 + block/mirror.c | 20 +- block/nbd.c | 1 + block/stream.c | 32 ++-- include/block/aio.h | 9 + include/block/block-common.h | 1 + include/block/block_int-common.h | 36 +++- include/block/block_int-global-state.h | 17 -- include/block/block_int-io.h | 2 + include/block/block_int.h | 1 + include/block/graph-lock.h | 180 +++++++++++++++++ include/sysemu/block-backend-io.h | 69 +++---- qemu-img.c | 4 +- scripts/block-coroutine-wrapper.py | 13 +- tests/unit/test-bdrv-drain.c | 2 + util/async.c | 4 + util/meson.build | 1 + 23 files changed, 615 insertions(+), 86 deletions(-) create mode 100644 block/graph-lock.c create mode 100644 include/block/graph-lock.h