Message ID | 20230823235938.1398382-5-stefanha@redhat.com
State      | New
Series     | block-backend: process I/O in the current AioContext
Am 24.08.2023 um 01:59 hat Stefan Hajnoczi geschrieben:
> Use qemu_get_current_aio_context() in mixed wrappers and coroutine
> wrappers so that code runs in the caller's AioContext instead of moving
> to the BlockDriverState's AioContext. This change is necessary for the
> multi-queue block layer where any thread can call into the block layer.
>
> Most wrappers are IO_CODE where it's safe to use the current AioContext
> nowadays. BlockDrivers and the core block layer use their own locks and
> no longer depend on the AioContext lock for thread-safety.
>
> The bdrv_create() wrapper invokes GLOBAL_STATE code. Using the current
> AioContext is safe because this code is only called with the BQL held
> from the main loop thread.
>
> The output of qemu-iotests 051 is sensitive to event loop activity.
> Update the output because the monitor BH runs at a different time,
> causing prompts to be printed differently in the output.
>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

The update for 051 is actually missing from this patch, and so the test
fails.

I missed the dependency on your qio_channel series, so 281 ran into an
abort() for me (see below for the stack trace). I expect that the other
series actually fixes this, but this kind of interaction wasn't really
obvious. How did you make sure that there aren't other places that don't
like this change?

Kevin

(gdb) bt
#0  0x00007f8ef0d2fe5c in __pthread_kill_implementation () at /lib64/libc.so.6
#1  0x00007f8ef0cdfa76 in raise () at /lib64/libc.so.6
#2  0x00007f8ef0cc97fc in abort () at /lib64/libc.so.6
#3  0x00007f8ef0cc971b in _nl_load_domain.cold () at /lib64/libc.so.6
#4  0x00007f8ef0cd8656 in () at /lib64/libc.so.6
#5  0x000055fd19da6af3 in qio_channel_yield (ioc=0x7f8ee0000b70, condition=G_IO_IN) at ../io/channel.c:583
#6  0x000055fd19e0382f in nbd_read_eof (bs=0x55fd1b681350, ioc=0x7f8ee0000b70, buffer=0x55fd1b680da0, size=4, errp=0x0) at ../nbd/client.c:1454
#7  0x000055fd19e03612 in nbd_receive_reply (bs=0x55fd1b681350, ioc=0x7f8ee0000b70, reply=0x55fd1b680da0, errp=0x0) at ../nbd/client.c:1491
#8  0x000055fd19e40575 in nbd_receive_replies (s=0x55fd1b680b00, cookie=1) at ../block/nbd.c:461
#9  0x000055fd19e3fec4 in nbd_co_do_receive_one_chunk (s=0x55fd1b680b00, cookie=1, only_structured=true, request_ret=0x7f8ee8bff91c, qiov=0x7f8ee8bfff10, payload=0x7f8ee8bff9d0, errp=0x7f8ee8bff910) at ../block/nbd.c:844
#10 0x000055fd19e3fd55 in nbd_co_receive_one_chunk (s=0x55fd1b680b00, cookie=1, only_structured=true, request_ret=0x7f8ee8bff91c, qiov=0x7f8ee8bfff10, reply=0x7f8ee8bff9f0, payload=0x7f8ee8bff9d0, errp=0x7f8ee8bff910) at ../block/nbd.c:925
#11 0x000055fd19e3f7b5 in nbd_reply_chunk_iter_receive (s=0x55fd1b680b00, iter=0x7f8ee8bff9d8, cookie=1, qiov=0x7f8ee8bfff10, reply=0x7f8ee8bff9f0, payload=0x7f8ee8bff9d0) at ../block/nbd.c:1008
#12 0x000055fd19e3ecf7 in nbd_co_receive_cmdread_reply (s=0x55fd1b680b00, cookie=1, offset=0, qiov=0x7f8ee8bfff10, request_ret=0x7f8ee8bffad4, errp=0x7f8ee8bffac8) at ../block/nbd.c:1074
#13 0x000055fd19e3c804 in nbd_client_co_preadv (bs=0x55fd1b681350, offset=0, bytes=131072, qiov=0x7f8ee8bfff10, flags=0) at ../block/nbd.c:1258
#14 0x000055fd19e33547 in bdrv_driver_preadv (bs=0x55fd1b681350, offset=0, bytes=131072, qiov=0x7f8ee8bfff10, qiov_offset=0, flags=0) at ../block/io.c:1005
#15 0x000055fd19e2c8bb in bdrv_aligned_preadv (child=0x55fd1c282d90, req=0x7f8ee8bffd90, offset=0, bytes=131072, align=1, qiov=0x7f8ee8bfff10, qiov_offset=0, flags=0) at ../block/io.c:1398
#16 0x000055fd19e2bf7d in bdrv_co_preadv_part (child=0x55fd1c282d90, offset=0, bytes=131072, qiov=0x7f8ee8bfff10, qiov_offset=0, flags=0) at ../block/io.c:1815
#17 0x000055fd19e176bd in blk_co_do_preadv_part (blk=0x55fd1c269c00, offset=0, bytes=131072, qiov=0x7f8ee8bfff10, qiov_offset=0, flags=0) at ../block/block-backend.c:1344
#18 0x000055fd19e17588 in blk_co_preadv (blk=0x55fd1c269c00, offset=0, bytes=131072, qiov=0x7f8ee8bfff10, flags=0) at ../block/block-backend.c:1369
#19 0x000055fd19e17514 in blk_co_pread (blk=0x55fd1c269c00, offset=0, bytes=131072, buf=0x55fd1c16d000, flags=0) at ../block/block-backend.c:1358
#20 0x000055fd19ddcc91 in blk_co_pread_entry (opaque=0x7ffc4bbdd9a0) at block/block-gen.c:1519
#21 0x000055fd19feb2a1 in coroutine_trampoline (i0=460835072, i1=22013) at ../util/coroutine-ucontext.c:177
#22 0x00007f8ef0cf5c20 in __start_context () at /lib64/libc.so.6

io/channel.c:583 is this:

577 void coroutine_fn qio_channel_yield(QIOChannel *ioc,
578                                     GIOCondition condition)
579 {
580     AioContext *ioc_ctx = ioc->ctx ?: qemu_get_aio_context();
581
582     assert(qemu_in_coroutine());
583     assert(in_aio_context_home_thread(ioc_ctx));
584
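The mismatch behind the abort can be sketched in a few lines. The assertion
and both helper functions are taken from the excerpt above; the surrounding
setup is illustrative, not the actual NBD client code:

    /* The NBD client attached its channel to the BDS's AioContext ... */
    AioContext *ioc_ctx = ioc->ctx;
    /* ... but with this patch the request coroutine runs in the caller's. */
    AioContext *caller_ctx = qemu_get_current_aio_context();

    /*
     * qio_channel_yield() requires the yielding coroutine to run in the
     * channel's home thread. When caller_ctx != ioc_ctx, this is the
     * check that aborts at io/channel.c:583.
     */
    assert(in_aio_context_home_thread(ioc_ctx));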
On Fri, Sep 01, 2023 at 07:01:37PM +0200, Kevin Wolf wrote:
> Am 24.08.2023 um 01:59 hat Stefan Hajnoczi geschrieben:
> > Use qemu_get_current_aio_context() in mixed wrappers and coroutine
> > wrappers so that code runs in the caller's AioContext instead of moving
> > to the BlockDriverState's AioContext. This change is necessary for the
> > multi-queue block layer where any thread can call into the block layer.
> >
> > Most wrappers are IO_CODE where it's safe to use the current AioContext
> > nowadays. BlockDrivers and the core block layer use their own locks and
> > no longer depend on the AioContext lock for thread-safety.
> >
> > The bdrv_create() wrapper invokes GLOBAL_STATE code. Using the current
> > AioContext is safe because this code is only called with the BQL held
> > from the main loop thread.
> >
> > The output of qemu-iotests 051 is sensitive to event loop activity.
> > Update the output because the monitor BH runs at a different time,
> > causing prompts to be printed differently in the output.
> >
> > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
>
> The update for 051 is actually missing from this patch, and so the test
> fails.
>
> I missed the dependency on your qio_channel series, so 281 ran into an
> abort() for me (see below for the stack trace). I expect that the other
> series actually fixes this, but this kind of interaction wasn't really
> obvious. How did you make sure that there aren't other places that don't
> like this change?

Only by running qemu-iotests.

Stefan

> Kevin
diff --git a/scripts/block-coroutine-wrapper.py b/scripts/block-coroutine-wrapper.py
index d4a183db61..f93fe154c3 100644
--- a/scripts/block-coroutine-wrapper.py
+++ b/scripts/block-coroutine-wrapper.py
@@ -88,8 +88,6 @@ def __init__(self, wrapper_type: str, return_type: str, name: str,
             raise ValueError(f"no_co function can't be rdlock: {self.name}")
         self.target_name = f'{subsystem}_{subname}'
 
-        self.ctx = self.gen_ctx()
-
         self.get_result = 's->ret = '
         self.ret = 'return s.ret;'
         self.co_ret = 'return '
@@ -162,7 +160,7 @@ def create_mixed_wrapper(func: FuncDecl) -> str:
         {func.co_ret}{name}({ func.gen_list('{name}') });
     }} else {{
         {struct_name} s = {{
-            .poll_state.ctx = {func.ctx},
+            .poll_state.ctx = qemu_get_current_aio_context(),
             .poll_state.in_progress = true,
 
{ func.gen_block('            .{name} = {name},') }
@@ -186,7 +184,7 @@ def create_co_wrapper(func: FuncDecl) -> str:
 {func.return_type} {func.name}({ func.gen_list('{decl}') })
 {{
     {struct_name} s = {{
-        .poll_state.ctx = {func.ctx},
+        .poll_state.ctx = qemu_get_current_aio_context(),
         .poll_state.in_progress = true,
 
{ func.gen_block('        .{name} = {name},') }
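Concretely, a generated mixed wrapper then polls in the caller's AioContext.
The sketch below reconstructs what the block/block-gen.c output for blk_pread()
looks like after this change; the entry point name matches the stack trace
above, while the struct name and the bdrv_poll_co() call follow the generator's
template and are approximations rather than verbatim generated code:

    int blk_pread(BlockBackend *blk, int64_t offset, int64_t bytes,
                  void *buf, BdrvRequestFlags flags)
    {
        if (qemu_in_coroutine()) {
            /* Already in coroutine context: call the co_ version directly. */
            return blk_co_pread(blk, offset, bytes, buf, flags);
        } else {
            BlkCoPread s = {
                /* Poll the caller's AioContext, not the BDS's. */
                .poll_state.ctx = qemu_get_current_aio_context(),
                .poll_state.in_progress = true,
                .blk = blk,
                .offset = offset,
                .bytes = bytes,
                .buf = buf,
                .flags = flags,
            };

            s.poll_state.co = qemu_coroutine_create(blk_co_pread_entry, &s);
            bdrv_poll_co(&s.poll_state);
            return s.ret;
        }
    }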
Use qemu_get_current_aio_context() in mixed wrappers and coroutine
wrappers so that code runs in the caller's AioContext instead of moving
to the BlockDriverState's AioContext. This change is necessary for the
multi-queue block layer where any thread can call into the block layer.

Most wrappers are IO_CODE where it's safe to use the current AioContext
nowadays. BlockDrivers and the core block layer use their own locks and
no longer depend on the AioContext lock for thread-safety.

The bdrv_create() wrapper invokes GLOBAL_STATE code. Using the current
AioContext is safe because this code is only called with the BQL held
from the main loop thread.

The output of qemu-iotests 051 is sensitive to event loop activity.
Update the output because the monitor BH runs at a different time,
causing prompts to be printed differently in the output.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 scripts/block-coroutine-wrapper.py | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)