Message ID | 20230628141526.293104-12-kwolf@redhat.com
---|---
State | New
Series | [PULL,01/23] iotests: Test active commit with iothread and background I/O
On Jun 28 16:15, Kevin Wolf wrote:
> Now that bdrv_graph_wrlock() temporarily drops the AioContext lock that
> its caller holds, it can poll without causing deadlocks. We can now
> re-enable graph locking.
>
> This reverts commit ad128dff0bf4b6f971d05eb4335a627883a19c1d.
>

I'm seeing a pretty major performance regression on iothread-enabled
virtio-blk (and on some on-going iothread hw/nvme work) with this
applied. Something like ~300k iops prior to this vs ~200k after on my
set up. On master, virtio-blk is currently faster without an iothread
(~215k) than with (~200k).

I bisected the change in iops to this revert.
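For readers following the thread, a condensed sketch of the mechanism the quoted commit message describes, based on the bdrv_graph_wrlock() code restored by the diff at the bottom of this page. The release branch and the polling loop are paraphrased from the surrounding context lines rather than copied verbatim, so treat this as approximate:

```c
void bdrv_graph_wrlock(BlockDriverState *bs)
{
    AioContext *ctx = NULL;

    GLOBAL_STATE_CODE();
    assert(!qatomic_read(&has_writer));

    /*
     * Temporarily release the caller's (non-mainloop) AioContext lock so
     * that the polling below cannot deadlock on a lock we still hold.
     */
    if (bs) {
        ctx = bdrv_get_aio_context(bs);
        if (ctx != qemu_get_aio_context()) {
            aio_context_release(ctx);
        } else {
            ctx = NULL;
        }
    }

    /* Stop newly arriving I/O so readers cannot starve the writer */
    bdrv_drain_all_begin_nopoll();

    /* ... poll the main loop until reader_count() drops to 0 ... */

    bdrv_drain_all_end();

    /* Re-acquire the caller's AioContext lock before returning */
    if (ctx) {
        aio_context_acquire(bdrv_get_aio_context(bs));
    }
}
```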
On 10.07.2023 at 14:22, Klaus Jensen wrote:
> On Jun 28 16:15, Kevin Wolf wrote:
> > Now that bdrv_graph_wrlock() temporarily drops the AioContext lock that
> > its caller holds, it can poll without causing deadlocks. We can now
> > re-enable graph locking.
> >
> > This reverts commit ad128dff0bf4b6f971d05eb4335a627883a19c1d.
> >
>
> I'm seeing a pretty major performance regression on iothread-enabled
> virtio-blk (and on some on-going iothread hw/nvme work) with this
> applied. Something like ~300k iops prior to this vs ~200k after on my
> set up. On master, virtio-blk is currently faster without an iothread
> (~215k) than with (~200k).
>
> I bisected the change in iops to this revert.

Is CONFIG_DEBUG_GRAPH_LOCK enabled in your build? If so, this is
expected to cost some performance. If not, we need to take a look at
what else is causing the regression.

Kevin
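Context for the question: CONFIG_DEBUG_GRAPH_LOCK gates a per-request sanity check that counts graph readers under a global mutex. A condensed sketch of that check, reconstructed from block/graph-lock.c as restored below (the loop body is paraphrased), shows why enabling it is expected to cost IOPS on the I/O fast path:

```c
/* Sketch of the debug-only check that CONFIG_DEBUG_GRAPH_LOCK enables. */
static uint32_t reader_count(void)
{
    BdrvGraphRWlock *brdv_graph;
    uint32_t rd;

    /* Taking this global mutex on every I/O request is the expensive part */
    QEMU_LOCK_GUARD(&aio_context_list_lock);

    /* Sum the per-AioContext reader counters plus any orphaned readers */
    rd = orphaned_reader_count;
    QTAILQ_FOREACH(brdv_graph, &aio_context_list, next_aio) {
        rd += qatomic_read(&brdv_graph->reader_count);
    }

    assert((int32_t)rd >= 0);
    return rd;
}

void assert_bdrv_graph_readable(void)
{
    /* reader_count() is slow due to aio_context_list_lock lock contention */
#ifdef CONFIG_DEBUG_GRAPH_LOCK
    assert(qemu_in_main_thread() || reader_count());
#endif
}
```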
On Jul 10 14:40, Kevin Wolf wrote:
> On 10.07.2023 at 14:22, Klaus Jensen wrote:
> > On Jun 28 16:15, Kevin Wolf wrote:
> > > Now that bdrv_graph_wrlock() temporarily drops the AioContext lock that
> > > its caller holds, it can poll without causing deadlocks. We can now
> > > re-enable graph locking.
> > >
> > > This reverts commit ad128dff0bf4b6f971d05eb4335a627883a19c1d.
> > >
> >
> > I'm seeing a pretty major performance regression on iothread-enabled
> > virtio-blk (and on some on-going iothread hw/nvme work) with this
> > applied. Something like ~300k iops prior to this vs ~200k after on my
> > set up. On master, virtio-blk is currently faster without an iothread
> > (~215k) than with (~200k).
> >
> > I bisected the change in iops to this revert.
>
> Is CONFIG_DEBUG_GRAPH_LOCK enabled in your build? If so, this is
> expected to cost some performance. If not, we need to take a look at
> what else is causing the regression.
>
> Kevin

Argh. Doh. Yes, was enabled and made QUITE the difference.

Sorry for the noise. Thanks!
```diff
diff --git a/block/graph-lock.c b/block/graph-lock.c
index 3bf2591dc4..5e66f01ae8 100644
--- a/block/graph-lock.c
+++ b/block/graph-lock.c
@@ -30,10 +30,8 @@ BdrvGraphLock graph_lock;
 /* Protects the list of aiocontext and orphaned_reader_count */
 static QemuMutex aio_context_list_lock;
 
-#if 0
 /* Written and read with atomic operations. */
 static int has_writer;
-#endif
 
 /*
  * A reader coroutine could move from an AioContext to another.
@@ -90,7 +88,6 @@ void unregister_aiocontext(AioContext *ctx)
     g_free(ctx->bdrv_graph);
 }
 
-#if 0
 static uint32_t reader_count(void)
 {
     BdrvGraphRWlock *brdv_graph;
@@ -108,21 +105,13 @@ static uint32_t reader_count(void)
     assert((int32_t)rd >= 0);
     return rd;
 }
-#endif
 
 void bdrv_graph_wrlock(BlockDriverState *bs)
 {
     AioContext *ctx = NULL;
 
     GLOBAL_STATE_CODE();
-    /*
-     * TODO Some callers hold an AioContext lock when this is called, which
-     * causes deadlocks. Reenable once the AioContext locking is cleaned up (or
-     * AioContext locks are gone).
-     */
-#if 0
     assert(!qatomic_read(&has_writer));
-#endif
 
     /*
      * Release only non-mainloop AioContext. The mainloop often relies on the
@@ -137,7 +126,6 @@ void bdrv_graph_wrlock(BlockDriverState *bs)
         }
     }
 
-#if 0
     /* Make sure that constantly arriving new I/O doesn't cause starvation */
     bdrv_drain_all_begin_nopoll();
 
@@ -166,7 +154,6 @@ void bdrv_graph_wrlock(BlockDriverState *bs)
     } while (reader_count() >= 1);
 
     bdrv_drain_all_end();
-#endif
 
     if (ctx) {
         aio_context_acquire(bdrv_get_aio_context(bs));
@@ -176,7 +163,6 @@ void bdrv_graph_wrlock(BlockDriverState *bs)
 void bdrv_graph_wrunlock(void)
 {
     GLOBAL_STATE_CODE();
-#if 0
     QEMU_LOCK_GUARD(&aio_context_list_lock);
     assert(qatomic_read(&has_writer));
 
@@ -188,13 +174,10 @@ void bdrv_graph_wrunlock(void)
 
     /* Wake up all coroutine that are waiting to read the graph */
     qemu_co_enter_all(&reader_queue, &aio_context_list_lock);
-#endif
 }
 
 void coroutine_fn bdrv_graph_co_rdlock(void)
 {
-    /* TODO Reenable when wrlock is reenabled */
-#if 0
     BdrvGraphRWlock *bdrv_graph;
     bdrv_graph = qemu_get_current_aio_context()->bdrv_graph;
 
@@ -254,12 +237,10 @@ void coroutine_fn bdrv_graph_co_rdlock(void)
             qemu_co_queue_wait(&reader_queue, &aio_context_list_lock);
         }
     }
-#endif
 }
 
 void coroutine_fn bdrv_graph_co_rdunlock(void)
 {
-#if 0
     BdrvGraphRWlock *bdrv_graph;
     bdrv_graph = qemu_get_current_aio_context()->bdrv_graph;
 
@@ -277,7 +258,6 @@ void coroutine_fn bdrv_graph_co_rdunlock(void)
     if (qatomic_read(&has_writer)) {
         aio_wait_kick();
     }
-#endif
 }
 
 void bdrv_graph_rdlock_main_loop(void)
@@ -295,19 +275,13 @@ void bdrv_graph_rdunlock_main_loop(void)
 void assert_bdrv_graph_readable(void)
 {
     /* reader_count() is slow due to aio_context_list_lock lock contention */
-    /* TODO Reenable when wrlock is reenabled */
-#if 0
 #ifdef CONFIG_DEBUG_GRAPH_LOCK
     assert(qemu_in_main_thread() || reader_count());
 #endif
-#endif
 }
 
 void assert_bdrv_graph_writable(void)
 {
     assert(qemu_in_main_thread());
-    /* TODO Reenable when wrlock is reenabled */
-#if 0
     assert(qatomic_read(&has_writer));
-#endif
 }
```
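As a usage note, a rough caller-side sketch of how the re-enabled writer lock is taken around a graph change. This is a hypothetical illustration, not code from this patch: `example_graph_update` is an invented name, while `bdrv_graph_wrlock()` and `bdrv_graph_wrunlock()` match the signatures in the diff above.

```c
#include "block/graph-lock.h"   /* bdrv_graph_wrlock() / bdrv_graph_wrunlock() */

/*
 * Hypothetical graph writer, for illustration only. Code that modifies the
 * BlockDriverState graph runs in the main loop and holds the writer lock
 * across the modification so that no coroutine holds the reader lock while
 * edges change.
 */
static void example_graph_update(BlockDriverState *bs)
{
    GLOBAL_STATE_CODE();

    /*
     * May temporarily drop bs's AioContext lock, drain I/O and poll until
     * all readers have finished (the behaviour re-enabled by this revert).
     */
    bdrv_graph_wrlock(bs);

    /* ... modify bs->children / parent links here ... */

    /* Wakes up coroutines queued in bdrv_graph_co_rdlock(). */
    bdrv_graph_wrunlock();
}
```

On the reader side, coroutine I/O paths bracket graph accesses with bdrv_graph_co_rdlock()/bdrv_graph_co_rdunlock(); those per-AioContext reader counters are what reader_count() sums and what the writer polls on before proceeding.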