Message ID | 20230816142510.5637-9-farosas@suse.de |
---|---|
State | New |
Headers | show |
Series | Fix segfault on migration return path | expand |
On Wed, Aug 16, 2023 at 11:25:10AM -0300, Fabiano Rosas wrote: > @@ -2003,6 +1980,8 @@ static int open_return_path_on_source(MigrationState *ms) > return -1; > } > > + migration_ioc_register_yank(qemu_file_get_ioc(ms->rp_state.from_dst_file)); I think I didn't really get why it wasn't paired before yesterday. My fault. Registering from_dst_file, afaict, will register two identical yank objects because the ioc is the same. Should we make migration_file_release() not handle the unregister of yank(), but leave that to callers? Then we keep the rule of only register yank for each ioc once. We can also have yet another wrapper migration_file_release_with_yank() to unregister the yank before calling migration_file_release().
On Wed, Aug 16, 2023 at 03:35:24PM -0400, Peter Xu wrote: > On Wed, Aug 16, 2023 at 03:47:26PM -0300, Fabiano Rosas wrote: > > Peter Xu <peterx@redhat.com> writes: > > > > > On Wed, Aug 16, 2023 at 11:25:10AM -0300, Fabiano Rosas wrote: > > >> @@ -2003,6 +1980,8 @@ static int open_return_path_on_source(MigrationState *ms) > > >> return -1; > > >> } > > >> > > >> + migration_ioc_register_yank(qemu_file_get_ioc(ms->rp_state.from_dst_file)); > > > > > > I think I didn't really get why it wasn't paired before yesterday. My > > > fault. > > > > > > Registering from_dst_file, afaict, will register two identical yank objects > > > because the ioc is the same. > > > > > > > Why do we have two QEMUFiles for the same fd again? > > Because qemufile has a "direction" (either read / write)? > > > > > We're bound to crash at some point by trying to qemu_fclose() the two > > QEMUFiles at the same time. > > Even with each qemufile holding a reference on the ioc object? I thought > it won't crash, but if it will please point that out; or fix it would be > even better. > > > > > > Should we make migration_file_release() not handle the unregister of > > > yank(), but leave that to callers? Then we keep the rule of only register > > > yank for each ioc once. > > > > > > > We need the unregister to be at migration_file_release() so that it > > takes benefit of the locking while checking the file for NULL. If it > > moves out then the caller will have to do locking as well. Which > > defeats the purpose of the patch. > > > > I don't understand why you moved the unregister out of channel_close in > > commit 39675ffffb ("migration: Move the yank unregister of channel_close > > out"). You called it a "hack" at the time, but looking at the current > > situation, it seems like a reasonable thing to do: unregister the yank > > when the ioc refcount drops to 1. > > > > I would go even further and say that qemu_fclose should also avoid > > calling qio_channel_close if the ioc refcnt is elevated. > > I'd rather not; I still think it's a hack, always open to be corrected. > > I think the problem is yank can register anything so it's separate from > iochannels. If one would like to have ioc close() automatically > unregister, then one should also register yank transparently without the > ioc user even aware of yank's existance. > > Now the condition is the caller register yank itself, then I think the > caller should unreg it.. not iochannel itself, silently. I just noticed this is not really copying the list.. let me add the cc list back, assuming it was just forgotten. One more thing to mention is, now I kind of agree probably we should register yank over each qemufile, as you raised the concern in the other thread that otherwise qmp_yank() won't set error for the qemufile, which seems to be unexpected.
Peter Xu <peterx@redhat.com> writes: > On Wed, Aug 16, 2023 at 03:35:24PM -0400, Peter Xu wrote: >> On Wed, Aug 16, 2023 at 03:47:26PM -0300, Fabiano Rosas wrote: >> > Peter Xu <peterx@redhat.com> writes: >> > >> > > On Wed, Aug 16, 2023 at 11:25:10AM -0300, Fabiano Rosas wrote: >> > >> @@ -2003,6 +1980,8 @@ static int open_return_path_on_source(MigrationState *ms) >> > >> return -1; >> > >> } >> > >> >> > >> + migration_ioc_register_yank(qemu_file_get_ioc(ms->rp_state.from_dst_file)); >> > > >> > > I think I didn't really get why it wasn't paired before yesterday. My >> > > fault. >> > > >> > > Registering from_dst_file, afaict, will register two identical yank objects >> > > because the ioc is the same. >> > > >> > >> > Why do we have two QEMUFiles for the same fd again? >> >> Because qemufile has a "direction" (either read / write)? >> >> > >> > We're bound to crash at some point by trying to qemu_fclose() the two >> > QEMUFiles at the same time. >> >> Even with each qemufile holding a reference on the ioc object? I thought >> it won't crash, but if it will please point that out; or fix it would be >> even better. You're right, it wouldn't crash. But it's still the same ioc object. If qio_channel_close() is called twice, then we could potentially close the fd twice. Which would either error out or close a reused fd. The window is small though, so probably unlikely to ever happen. >> >> > >> > > Should we make migration_file_release() not handle the unregister of >> > > yank(), but leave that to callers? Then we keep the rule of only register >> > > yank for each ioc once. >> > > >> > >> > We need the unregister to be at migration_file_release() so that it >> > takes benefit of the locking while checking the file for NULL. If it >> > moves out then the caller will have to do locking as well. Which >> > defeats the purpose of the patch. >> > >> > I don't understand why you moved the unregister out of channel_close in >> > commit 39675ffffb ("migration: Move the yank unregister of channel_close >> > out"). You called it a "hack" at the time, but looking at the current >> > situation, it seems like a reasonable thing to do: unregister the yank >> > when the ioc refcount drops to 1. >> > >> > I would go even further and say that qemu_fclose should also avoid >> > calling qio_channel_close if the ioc refcnt is elevated. >> >> I'd rather not; I still think it's a hack, always open to be corrected. It's hard to figure out what you mean by hack at times. Even more when reading a years-old commit message. >> >> I think the problem is yank can register anything so it's separate from >> iochannels. If one would like to have ioc close() automatically >> unregister, then one should also register yank transparently without the >> ioc user even aware of yank's existance. Ok, fair point. >> >> Now the condition is the caller register yank itself, then I think the >> caller should unreg it.. not iochannel itself, silently. I think the issue is that we're linking the yank with the QEMUFile for no reason. The migration_yank_iochannel() performs a qio_channel_shutdown() which is an operation on the fd. The QEMUFile just happens to hold a pointer to the ioc. > > I just noticed this is not really copying the list.. let me add the cc list > back, assuming it was just forgotten. I'm sorry, I hit the wrong key while replying. > One more thing to mention is, now I kind of agree probably we should > register yank over each qemufile, as you raised the concern in the other > thread that otherwise qmp_yank() won't set error for the qemufile, which > seems to be unexpected. I haven't made up my mind yet, but I think I'd rather stop setting that error instead of doing it from other places. A shutdown() is mostly a benign operation intended to end the connection. The fact that we use it in some cases to kick the thread out of a possible hang doesn't seem compelling enough to set -EIO. Of course we currently have no other way to indicate that the file was shutdown, so the -EIO will have to stay and that's a discussion for another day.
On Wed, Aug 16, 2023 at 06:20:58PM -0300, Fabiano Rosas wrote: > > One more thing to mention is, now I kind of agree probably we should > > register yank over each qemufile, as you raised the concern in the other > > thread that otherwise qmp_yank() won't set error for the qemufile, which > > seems to be unexpected. > > I haven't made up my mind yet, but I think I'd rather stop setting that > error instead of doing it from other places. A shutdown() is mostly a > benign operation intended to end the connection. The fact that we use it > in some cases to kick the thread out of a possible hang doesn't seem > compelling enough to set -EIO. > > Of course we currently have no other way to indicate that the file was > shutdown, so the -EIO will have to stay and that's a discussion for > another day. Yes, if we can avoid setting -EIO at all when shutdown that'll also be good, maybe making more sense. Thanks,
diff --git a/migration/migration.c b/migration/migration.c index 7fec57ad7f..22ab7199e4 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -217,6 +217,26 @@ MigrationIncomingState *migration_incoming_get_current(void) return current_incoming; } +static void migration_file_release(QEMUFile **file) +{ + MigrationState *ms = migrate_get_current(); + QEMUFile *tmp; + + /* + * Reset the pointer before releasing it to avoid holding the lock + * for too long. + */ + WITH_QEMU_LOCK_GUARD(&ms->qemu_file_lock) { + tmp = *file; + *file = NULL; + } + + if (tmp) { + migration_ioc_unregister_yank_from_file(tmp); + qemu_fclose(tmp); + } +} + void migration_incoming_transport_cleanup(MigrationIncomingState *mis) { if (mis->socket_address_list) { @@ -1155,8 +1175,6 @@ static void migrate_fd_cleanup(MigrationState *s) qemu_savevm_state_cleanup(); if (s->to_dst_file) { - QEMUFile *tmp; - trace_migrate_fd_cleanup(); qemu_mutex_unlock_iothread(); if (s->migration_thread_running) { @@ -1166,16 +1184,7 @@ static void migrate_fd_cleanup(MigrationState *s) qemu_mutex_lock_iothread(); multifd_save_cleanup(); - qemu_mutex_lock(&s->qemu_file_lock); - tmp = s->to_dst_file; - s->to_dst_file = NULL; - qemu_mutex_unlock(&s->qemu_file_lock); - /* - * Close the file handle without the lock to make sure the - * critical section won't block for long. - */ - migration_ioc_unregister_yank_from_file(tmp); - qemu_fclose(tmp); + migration_file_release(&s->to_dst_file); } /* @@ -1815,38 +1824,6 @@ static int migrate_handle_rp_resume_ack(MigrationState *s, uint32_t value) return 0; } -/* - * Release ms->rp_state.from_dst_file (and postcopy_qemufile_src if - * existed) in a safe way. - */ -static void migration_release_dst_files(MigrationState *ms) -{ - QEMUFile *file; - - WITH_QEMU_LOCK_GUARD(&ms->qemu_file_lock) { - /* - * Reset the from_dst_file pointer first before releasing it, as we - * can't block within lock section - */ - file = ms->rp_state.from_dst_file; - ms->rp_state.from_dst_file = NULL; - } - - /* - * Do the same to postcopy fast path socket too if there is. No - * locking needed because this qemufile should only be managed by - * return path thread. - */ - if (ms->postcopy_qemufile_src) { - migration_ioc_unregister_yank_from_file(ms->postcopy_qemufile_src); - qemu_file_shutdown(ms->postcopy_qemufile_src); - qemu_fclose(ms->postcopy_qemufile_src); - ms->postcopy_qemufile_src = NULL; - } - - qemu_fclose(file); -} - /* * Handles messages sent on the return path towards the source VM * @@ -2003,6 +1980,8 @@ static int open_return_path_on_source(MigrationState *ms) return -1; } + migration_ioc_register_yank(qemu_file_get_ioc(ms->rp_state.from_dst_file)); + trace_open_return_path_on_source(); qemu_thread_create(&ms->rp_state.rp_thread, "return path", @@ -2046,7 +2025,8 @@ static int await_return_path_close_on_source(MigrationState *ms) ret = ms->rp_state.error; ms->rp_state.error = false; - migration_release_dst_files(ms); + migration_file_release(&ms->rp_state.from_dst_file); + migration_file_release(&ms->postcopy_qemufile_src); trace_migration_return_path_end_after(ret); return ret; @@ -2502,26 +2482,9 @@ static MigThrError postcopy_pause(MigrationState *s) assert(s->state == MIGRATION_STATUS_POSTCOPY_ACTIVE); while (true) { - QEMUFile *file; - - /* - * Current channel is possibly broken. Release it. Note that this is - * guaranteed even without lock because to_dst_file should only be - * modified by the migration thread. That also guarantees that the - * unregister of yank is safe too without the lock. It should be safe - * even to be within the qemu_file_lock, but we didn't do that to avoid - * taking more mutex (yank_lock) within qemu_file_lock. TL;DR: we make - * the qemu_file_lock critical section as small as possible. - */ + /* Current channel is possibly broken. Release it. */ assert(s->to_dst_file); - migration_ioc_unregister_yank_from_file(s->to_dst_file); - qemu_mutex_lock(&s->qemu_file_lock); - file = s->to_dst_file; - s->to_dst_file = NULL; - qemu_mutex_unlock(&s->qemu_file_lock); - - qemu_file_shutdown(file); - qemu_fclose(file); + migration_file_release(&s->to_dst_file); /* * We're already pausing, so ignore any errors on the return
We currently have a pattern for cleaning up a migration QEMUFile: qemu_mutex_lock(&s->qemu_file_lock); file = s->file_name; s->file_name = NULL; qemu_mutex_unlock(&s->qemu_file_lock); migration_ioc_unregister_yank_from_file(file); qemu_file_shutdown(file); qemu_fclose(file); There are some considerations for this sequence: - we must clear the pointer under the lock, to avoid TOC/TOU bugs; - the close() and yank unregister expect be given a non-null parameter; - a close() in one thread should not race with a shutdown() in another; - we don't need to shutdown() right before close(); Create a wrapper function to make sure everything works correctly. Note: the return path didn't have a yank handler registered, I added it nonetheless for uniformity. Signed-off-by: Fabiano Rosas <farosas@suse.de> --- migration/migration.c | 91 +++++++++++++------------------------------ 1 file changed, 27 insertions(+), 64 deletions(-)