Message ID | 20231213211544.1601971-4-stefanha@redhat.com |
---|---|
State | New |
Series | aio-posix: call ->poll_end() when removing AioHandler |
On Wed, Dec 13, 2023 at 10:15 PM Stefan Hajnoczi <stefanha@redhat.com> wrote:
> -    /* If a read is in progress, just mark the node as deleted */
> -    if (ctx->walking_handlers > 0) {
> -        QLIST_INSERT_HEAD_RCU(&ctx->deleted_aio_handlers, node, node_deleted);
> -        return false;
> +    /* If polling was started on the node then end it now */
> +    if (ctx->poll_started && node->io_poll_end) {
> +        node->io_poll_end(node->opaque);
> +
> +        /* Poll one last time in case ->io_poll_end() raced with the event */
> +        if (node->io_poll(node->opaque)) {
> +            poll_ready = true;
> +        }

Do you need this at all? If the caller is removing the handlers, they
should have put themselves in a state where they don't care about the
file descriptor becoming readable.

You still have to be careful because aio_bh_schedule_oneshot() can
trigger remote nested event loops (which can cause deadlocks and, I am
especially afraid, can take some time and invalidate the expectation
that you don't need to drop the BQL). But it does simplify this patch
quite a bit.

Paolo
On Wed, Dec 13, 2023 at 10:52:58PM +0100, Paolo Bonzini wrote:
> On Wed, Dec 13, 2023 at 10:15 PM Stefan Hajnoczi <stefanha@redhat.com> wrote:
> > -    /* If a read is in progress, just mark the node as deleted */
> > -    if (ctx->walking_handlers > 0) {
> > -        QLIST_INSERT_HEAD_RCU(&ctx->deleted_aio_handlers, node, node_deleted);
> > -        return false;
> > +    /* If polling was started on the node then end it now */
> > +    if (ctx->poll_started && node->io_poll_end) {
> > +        node->io_poll_end(node->opaque);
> > +
> > +        /* Poll one last time in case ->io_poll_end() raced with the event */
> > +        if (node->io_poll(node->opaque)) {
> > +            poll_ready = true;
> > +        }
>
> Do you need this at all? If the caller is removing the handlers, they
> should have put themselves in a state where they don't care about the
> file descriptor becoming readable.

The caller no longer wishes to monitor the fd. This may be temporary
though. The fd must stay readable so that the caller can monitor it
again in the future and handle pending events.

->io_poll_begin() and ->io_poll_end() are used to bypass the fd while in
polling mode (e.g. disabling virtqueue kicks). This is a performance
optimization that avoids wasting cycles servicing fds when we're polling
anyway.

When the caller does:

1. aio_set_fd_handler(ctx, fd, NULL, ...)     // stop monitoring
2. aio_set_fd_handler(ctx, fd, read_fd, ...)  // start monitoring

then read_fd() should be called if the fd became readable between 1 and 2.

Since the fd may be bypassed until ->io_poll_end() returns, we must poll
one last time to check if an event snuck in right at the end without
making the fd readable. If polling detected an event, then we must do
something. We cannot drop the event.

(An alternative is to poll once before monitoring the fd again. That way
pending events are detected even if the fd wasn't readable. That is
currently not the way aio-posix.c works though.)

Stefan
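To make the bypass mechanism concrete, here is a standalone sketch, not QEMU code: ExampleQueue and the example_* names are invented stand-ins for the ->io_poll_begin()/->io_poll_end()/->io_poll()/->io_poll_ready() callbacks described above. The producer only kicks the eventfd while notifications are enabled, so work submitted during polling mode never makes the fd readable and has to be caught by one final poll after poll_end.

```c
/*
 * Standalone sketch (not QEMU code). ExampleQueue and example_* are
 * made-up names illustrating the handler roles described in the email.
 */
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/eventfd.h>
#include <unistd.h>

typedef struct {
    int kick_fd;          /* eventfd that wakes up the event loop */
    bool notify_enabled;  /* producer writes kick_fd only when true */
    int pending;          /* queued work items */
} ExampleQueue;

static void example_poll_begin(ExampleQueue *q)  /* like ->io_poll_begin() */
{
    q->notify_enabled = false;   /* bypass the fd while busy-polling */
}

static void example_poll_end(ExampleQueue *q)    /* like ->io_poll_end() */
{
    q->notify_enabled = true;    /* fd kicks are needed again */
}

static bool example_poll(ExampleQueue *q)        /* like ->io_poll() */
{
    return q->pending > 0;       /* check for work without the fd */
}

static void example_process(ExampleQueue *q)     /* like ->io_poll_ready() */
{
    uint64_t val;

    (void)read(q->kick_fd, &val, sizeof(val));   /* drain any pending kick */
    printf("processed %d item(s)\n", q->pending);
    q->pending = 0;
}

static void example_submit(ExampleQueue *q)      /* producer side */
{
    uint64_t one = 1;

    q->pending++;
    if (q->notify_enabled) {
        (void)write(q->kick_fd, &one, sizeof(one));
    }
}

int main(void)
{
    ExampleQueue q = {
        .kick_fd = eventfd(0, EFD_NONBLOCK),
        .notify_enabled = true,
    };

    example_poll_begin(&q);   /* event loop enters polling mode */
    example_submit(&q);       /* arrives while the fd is bypassed: no kick */

    /*
     * Leaving polling mode (e.g. because the handler is being removed):
     * the item above never made kick_fd readable, so without this final
     * poll it would be lost until the next kick.
     */
    example_poll_end(&q);
    if (example_poll(&q)) {
        example_process(&q);
    }

    close(q.kick_fd);
    return 0;
}
```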
On Thu, Dec 14, 2023 at 21:12, Stefan Hajnoczi <stefanha@redhat.com> wrote:
> Since the fd may be bypassed until ->io_poll_end() returns, we must poll
> one last time to check if an event snuck in right at the end without
> making the fd readable. If polling detected an event, then we must do
> something. We cannot drop the event.

I agree that in general we cannot. I wonder however if, in the (already
racy) case of a concurrent aio_set_fd_handler(ctx, fd, NULL, ...), you
really need to call poll_ready here.

> (An alternative is to poll once before monitoring the fd again. That way
> pending events are detected even if the fd wasn't readable. That is
> currently not the way aio-posix.c works though.)

Yes, that would be a change. If I understood correctly Hanna's
suggestions in the issue, she also mentioned doing a manual virtqueue
notification before monitoring restarts. So basically my idea boils down
to implementing that, and then cleaning up everything on top.

Paolo

> Stefan
On Thu, Dec 14, 2023 at 09:39:02PM +0100, Paolo Bonzini wrote:
> On Thu, Dec 14, 2023 at 21:12, Stefan Hajnoczi <stefanha@redhat.com> wrote:
> > Since the fd may be bypassed until ->io_poll_end() returns, we must poll
> > one last time to check if an event snuck in right at the end without
> > making the fd readable. If polling detected an event, then we must do
> > something. We cannot drop the event.
>
> I agree that in general we cannot. I wonder however if, in the (already
> racy) case of a concurrent aio_set_fd_handler(ctx, fd, NULL, ...), you
> really need to call poll_ready here.

It doesn't need to happen here but it needs to be recorded so that the
handler (either poll_ready or the fd read handler) runs later. In the
case of eventfds it's easy to write(fd, ...) so that activity will be
picked up again when file descriptor monitoring resumes.

> > (An alternative is to poll once before monitoring the fd again. That way
> > pending events are detected even if the fd wasn't readable. That is
> > currently not the way aio-posix.c works though.)
>
> Yes, that would be a change. If I understood correctly Hanna's
> suggestions in the issue, she also mentioned doing a manual virtqueue
> notification before monitoring restarts. So basically my idea boils down
> to implementing that, and then cleaning up everything on top.

Okay.

Stefan
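A minimal sketch of the eventfd re-arming Stefan mentions (requeue_event is a hypothetical helper, not part of this patch): instead of running the handler during removal, the pending event can be recorded by making the fd readable again, so the read handler fires once monitoring resumes.

```c
/* Hypothetical helper, not part of this patch: re-arm an eventfd so a
 * pending event is noticed once file descriptor monitoring resumes. */
#include <stdint.h>
#include <unistd.h>

void requeue_event(int event_fd)
{
    uint64_t one = 1;

    /* Making the fd readable again means the read handler runs the next
     * time aio_set_fd_handler() starts monitoring it, so the event is
     * recorded rather than dropped. */
    if (write(event_fd, &one, sizeof(one)) != sizeof(one)) {
        /* eventfd writes only fail if the counter would overflow */
    }
}
```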
diff --git a/util/aio-posix.c b/util/aio-posix.c
index 86e232e9d3..2245a9659b 100644
--- a/util/aio-posix.c
+++ b/util/aio-posix.c
@@ -64,8 +64,11 @@ static AioHandler *find_aio_handler(AioContext *ctx, int fd)
     return NULL;
 }
 
-static bool aio_remove_fd_handler(AioContext *ctx, AioHandler *node)
+static void aio_remove_fd_handler(AioContext *ctx, AioHandler *node)
 {
+    bool poll_ready;
+    bool delete_node = false;
+
     /* If the GSource is in the process of being destroyed then
      * g_source_remove_poll() causes an assertion failure. Skip
      * removal in that case, because glib cleans up its state during
@@ -76,25 +79,40 @@ static bool aio_remove_fd_handler(AioContext *ctx, AioHandler *node)
     }
 
     node->pfd.revents = 0;
+    poll_ready = node->poll_ready;
     node->poll_ready = false;
 
     /* If the fd monitor has already marked it deleted, leave it alone */
-    if (QLIST_IS_INSERTED(node, node_deleted)) {
-        return false;
+    if (!QLIST_IS_INSERTED(node, node_deleted)) {
+        /* If a read is in progress, just mark the node as deleted */
+        if (ctx->walking_handlers > 0) {
+            QLIST_INSERT_HEAD_RCU(&ctx->deleted_aio_handlers, node, node_deleted);
+        } else {
+            /* Otherwise, delete it for real. We can't just mark it as
+             * deleted because deleted nodes are only cleaned up while
+             * no one is walking the handlers list.
+             */
+            QLIST_SAFE_REMOVE(node, node_poll);
+            QLIST_REMOVE(node, node);
+            delete_node = true;
+        }
     }
 
-    /* If a read is in progress, just mark the node as deleted */
-    if (ctx->walking_handlers > 0) {
-        QLIST_INSERT_HEAD_RCU(&ctx->deleted_aio_handlers, node, node_deleted);
-        return false;
+    /* If polling was started on the node then end it now */
+    if (ctx->poll_started && node->io_poll_end) {
+        node->io_poll_end(node->opaque);
+
+        /* Poll one last time in case ->io_poll_end() raced with the event */
+        if (node->io_poll(node->opaque)) {
+            poll_ready = true;
+        }
+    }
+    if (poll_ready) {
+        node->io_poll_ready(node->opaque);
+    }
+    if (delete_node) {
+        g_free(node);
     }
-    /* Otherwise, delete it for real. We can't just mark it as
-     * deleted because deleted nodes are only cleaned up while
-     * no one is walking the handlers list.
-     */
-    QLIST_SAFE_REMOVE(node, node_poll);
-    QLIST_REMOVE(node, node);
-    return true;
 }
 
 /* Perform aio_set_fd_handler() in this thread's AioContext */
@@ -109,7 +127,6 @@ static void aio_set_fd_handler_local(AioContext *ctx,
     AioHandler *node;
     AioHandler *new_node = NULL;
     bool is_new = false;
-    bool deleted = false;
     int poll_disable_change;
 
     if (io_poll && !io_poll_ready) {
@@ -166,13 +183,9 @@ static void aio_set_fd_handler_local(AioContext *ctx,
     ctx->fdmon_ops->update(ctx, node, new_node);
 
     if (node) {
-        deleted = aio_remove_fd_handler(ctx, node);
+        aio_remove_fd_handler(ctx, node);
     }
     aio_notify(ctx);
-
-    if (deleted) {
-        g_free(node);
-    }
 }
 
 typedef struct {
Hanna Czenczek found the root cause for --device virtio-scsi-pci,iothread=
hangs across hotplug/unplug.

When AioContext polling has started on the virtqueue host notifier in the
IOThread and the main loop thread calls aio_set_fd_handler(), then the
io_poll_end() callback is not invoked on the virtqueue host notifier. The
virtqueue is left with polling disabled and never detects guest activity
once attached again (because virtqueue kicks are not enabled and
AioContext only starts polling after the fd becomes active for the first
time). The result is that the virtio-scsi device stops responding to its
virtqueue.

Previous patches made aio_set_fd_handler() run in the AioContext's home
thread so it's now possible to call ->io_poll_end() to solve this bug.

Buglink: https://issues.redhat.com/browse/RHEL-3934
Cc: Hanna Czenczek <hreitz@redhat.com>
Cc: Fiona Ebner <f.ebner@proxmox.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 util/aio-posix.c | 53 ++++++++++++++++++++++++++++++------------------
 1 file changed, 33 insertions(+), 20 deletions(-)
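For readers following the commit message, here is a standalone sketch of the failure mode using invented stand-in types (FakeVirtQueue and the fake_* callbacks are not QEMU's VirtQueue or its real host-notifier callbacks): polling mode turns guest kicks off, and if the poll_end step is skipped when the handler is removed while polling is active, the queue stays deaf after it is re-attached, which is what this patch prevents by always calling ->io_poll_end().

```c
/*
 * Standalone sketch of the failure mode, with invented stand-in types
 * (FakeVirtQueue is not QEMU's VirtQueue): polling mode disables guest
 * kicks, and skipping the poll_end step on removal leaves them disabled.
 */
#include <stdbool.h>
#include <stdio.h>

typedef struct {
    bool guest_notifications;  /* does the guest kick the host notifier fd? */
} FakeVirtQueue;

/* like a host notifier's poll_begin: the loop reads the ring directly,
 * so tell the guest it may skip the expensive kick */
static void fake_poll_begin(FakeVirtQueue *vq)
{
    vq->guest_notifications = false;
}

/* like a host notifier's poll_end: kicks are needed again so that the
 * notifier fd becomes readable on new requests */
static void fake_poll_end(FakeVirtQueue *vq)
{
    vq->guest_notifications = true;
}

int main(void)
{
    FakeVirtQueue vq = { .guest_notifications = true };

    fake_poll_begin(&vq);     /* AioContext polling has started */

    /*
     * Bug scenario: the handler is removed here without the poll_end
     * callback running. guest_notifications stays false, so a re-attached
     * device never hears about new requests and AioContext never starts
     * polling again (the fd never becomes active).
     */
    printf("kicks enabled after buggy removal: %d\n", vq.guest_notifications);

    fake_poll_end(&vq);       /* what the patch guarantees during removal */
    printf("kicks enabled after fixed removal: %d\n", vq.guest_notifications);

    return 0;
}
```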