diff mbox series

[v4,8/8] migration: Add a wrapper to cleanup migration files

Message ID 20230816142510.5637-9-farosas@suse.de
State New
Headers show
Series Fix segfault on migration return path | expand

Commit Message

Fabiano Rosas Aug. 16, 2023, 2:25 p.m. UTC
We currently have a pattern for cleaning up a migration QEMUFile:

  qemu_mutex_lock(&s->qemu_file_lock);
  file = s->file_name;
  s->file_name = NULL;
  qemu_mutex_unlock(&s->qemu_file_lock);

  migration_ioc_unregister_yank_from_file(file);
  qemu_file_shutdown(file);
  qemu_fclose(file);

There are some considerations for this sequence:

- we must clear the pointer under the lock, to avoid TOC/TOU bugs;
- the close() and yank unregister expect be given a non-null parameter;
- a close() in one thread should not race with a shutdown() in another;
- we don't need to shutdown() right before close();

Create a wrapper function to make sure everything works correctly.

Note: the return path didn't have a yank handler registered, I added
      it nonetheless for uniformity.

Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 migration/migration.c | 91 +++++++++++++------------------------------
 1 file changed, 27 insertions(+), 64 deletions(-)

Comments

Peter Xu Aug. 16, 2023, 3:18 p.m. UTC | #1
On Wed, Aug 16, 2023 at 11:25:10AM -0300, Fabiano Rosas wrote:
> @@ -2003,6 +1980,8 @@ static int open_return_path_on_source(MigrationState *ms)
>          return -1;
>      }
>  
> +    migration_ioc_register_yank(qemu_file_get_ioc(ms->rp_state.from_dst_file));

I think I didn't really get why it wasn't paired before yesterday.  My
fault.

Registering from_dst_file, afaict, will register two identical yank objects
because the ioc is the same.

Should we make migration_file_release() not handle the unregister of
yank(), but leave that to callers?  Then we keep the rule of only register
yank for each ioc once.

We can also have yet another wrapper migration_file_release_with_yank() to
unregister the yank before calling migration_file_release().
Peter Xu Aug. 16, 2023, 7:37 p.m. UTC | #2
On Wed, Aug 16, 2023 at 03:35:24PM -0400, Peter Xu wrote:
> On Wed, Aug 16, 2023 at 03:47:26PM -0300, Fabiano Rosas wrote:
> > Peter Xu <peterx@redhat.com> writes:
> > 
> > > On Wed, Aug 16, 2023 at 11:25:10AM -0300, Fabiano Rosas wrote:
> > >> @@ -2003,6 +1980,8 @@ static int open_return_path_on_source(MigrationState *ms)
> > >>          return -1;
> > >>      }
> > >>  
> > >> +    migration_ioc_register_yank(qemu_file_get_ioc(ms->rp_state.from_dst_file));
> > >
> > > I think I didn't really get why it wasn't paired before yesterday.  My
> > > fault.
> > >
> > > Registering from_dst_file, afaict, will register two identical yank objects
> > > because the ioc is the same.
> > >
> > 
> > Why do we have two QEMUFiles for the same fd again?
> 
> Because qemufile has a "direction" (either read / write)?
> 
> > 
> > We're bound to crash at some point by trying to qemu_fclose() the two
> > QEMUFiles at the same time.
> 
> Even with each qemufile holding a reference on the ioc object?  I thought
> it won't crash, but if it will please point that out; or fix it would be
> even better.
> 
> > 
> > > Should we make migration_file_release() not handle the unregister of
> > > yank(), but leave that to callers?  Then we keep the rule of only register
> > > yank for each ioc once.
> > >
> > 
> > We need the unregister to be at migration_file_release() so that it
> > takes benefit of the locking while checking the file for NULL. If it
> > moves out then the caller will have to do locking as well. Which
> > defeats the purpose of the patch.
> > 
> > I don't understand why you moved the unregister out of channel_close in
> > commit 39675ffffb ("migration: Move the yank unregister of channel_close
> > out"). You called it a "hack" at the time, but looking at the current
> > situation, it seems like a reasonable thing to do: unregister the yank
> > when the ioc refcount drops to 1.
> > 
> > I would go even further and say that qemu_fclose should also avoid
> > calling qio_channel_close if the ioc refcnt is elevated.
> 
> I'd rather not; I still think it's a hack, always open to be corrected.
> 
> I think the problem is yank can register anything so it's separate from
> iochannels.  If one would like to have ioc close() automatically
> unregister, then one should also register yank transparently without the
> ioc user even aware of yank's existance.
> 
> Now the condition is the caller register yank itself, then I think the
> caller should unreg it.. not iochannel itself, silently.

I just noticed this is not really copying the list.. let me add the cc list
back, assuming it was just forgotten.

One more thing to mention is, now I kind of agree probably we should
register yank over each qemufile, as you raised the concern in the other
thread that otherwise qmp_yank() won't set error for the qemufile, which
seems to be unexpected.
Fabiano Rosas Aug. 16, 2023, 9:20 p.m. UTC | #3
Peter Xu <peterx@redhat.com> writes:

> On Wed, Aug 16, 2023 at 03:35:24PM -0400, Peter Xu wrote:
>> On Wed, Aug 16, 2023 at 03:47:26PM -0300, Fabiano Rosas wrote:
>> > Peter Xu <peterx@redhat.com> writes:
>> > 
>> > > On Wed, Aug 16, 2023 at 11:25:10AM -0300, Fabiano Rosas wrote:
>> > >> @@ -2003,6 +1980,8 @@ static int open_return_path_on_source(MigrationState *ms)
>> > >>          return -1;
>> > >>      }
>> > >>  
>> > >> +    migration_ioc_register_yank(qemu_file_get_ioc(ms->rp_state.from_dst_file));
>> > >
>> > > I think I didn't really get why it wasn't paired before yesterday.  My
>> > > fault.
>> > >
>> > > Registering from_dst_file, afaict, will register two identical yank objects
>> > > because the ioc is the same.
>> > >
>> > 
>> > Why do we have two QEMUFiles for the same fd again?
>> 
>> Because qemufile has a "direction" (either read / write)?
>> 
>> > 
>> > We're bound to crash at some point by trying to qemu_fclose() the two
>> > QEMUFiles at the same time.
>> 
>> Even with each qemufile holding a reference on the ioc object?  I thought
>> it won't crash, but if it will please point that out; or fix it would be
>> even better.

You're right, it wouldn't crash. But it's still the same ioc object. If
qio_channel_close() is called twice, then we could potentially close the
fd twice. Which would either error out or close a reused fd. The window
is small though, so probably unlikely to ever happen.

>> 
>> > 
>> > > Should we make migration_file_release() not handle the unregister of
>> > > yank(), but leave that to callers?  Then we keep the rule of only register
>> > > yank for each ioc once.
>> > >
>> > 
>> > We need the unregister to be at migration_file_release() so that it
>> > takes benefit of the locking while checking the file for NULL. If it
>> > moves out then the caller will have to do locking as well. Which
>> > defeats the purpose of the patch.
>> > 
>> > I don't understand why you moved the unregister out of channel_close in
>> > commit 39675ffffb ("migration: Move the yank unregister of channel_close
>> > out"). You called it a "hack" at the time, but looking at the current
>> > situation, it seems like a reasonable thing to do: unregister the yank
>> > when the ioc refcount drops to 1.
>> > 
>> > I would go even further and say that qemu_fclose should also avoid
>> > calling qio_channel_close if the ioc refcnt is elevated.
>> 
>> I'd rather not; I still think it's a hack, always open to be corrected.

It's hard to figure out what you mean by hack at times. Even more when
reading a years-old commit message.

>> 
>> I think the problem is yank can register anything so it's separate from
>> iochannels.  If one would like to have ioc close() automatically
>> unregister, then one should also register yank transparently without the
>> ioc user even aware of yank's existance.

Ok, fair point.

>> 
>> Now the condition is the caller register yank itself, then I think the
>> caller should unreg it.. not iochannel itself, silently.

I think the issue is that we're linking the yank with the QEMUFile for
no reason. The migration_yank_iochannel() performs a
qio_channel_shutdown() which is an operation on the fd. The QEMUFile
just happens to hold a pointer to the ioc.

>
> I just noticed this is not really copying the list.. let me add the cc list
> back, assuming it was just forgotten.

I'm sorry, I hit the wrong key while replying.

> One more thing to mention is, now I kind of agree probably we should
> register yank over each qemufile, as you raised the concern in the other
> thread that otherwise qmp_yank() won't set error for the qemufile, which
> seems to be unexpected.

I haven't made up my mind yet, but I think I'd rather stop setting that
error instead of doing it from other places. A shutdown() is mostly a
benign operation intended to end the connection. The fact that we use it
in some cases to kick the thread out of a possible hang doesn't seem
compelling enough to set -EIO.

Of course we currently have no other way to indicate that the file was
shutdown, so the -EIO will have to stay and that's a discussion for
another day.
Peter Xu Aug. 16, 2023, 9:55 p.m. UTC | #4
On Wed, Aug 16, 2023 at 06:20:58PM -0300, Fabiano Rosas wrote:
> > One more thing to mention is, now I kind of agree probably we should
> > register yank over each qemufile, as you raised the concern in the other
> > thread that otherwise qmp_yank() won't set error for the qemufile, which
> > seems to be unexpected.
> 
> I haven't made up my mind yet, but I think I'd rather stop setting that
> error instead of doing it from other places. A shutdown() is mostly a
> benign operation intended to end the connection. The fact that we use it
> in some cases to kick the thread out of a possible hang doesn't seem
> compelling enough to set -EIO.
> 
> Of course we currently have no other way to indicate that the file was
> shutdown, so the -EIO will have to stay and that's a discussion for
> another day.

Yes, if we can avoid setting -EIO at all when shutdown that'll also be
good, maybe making more sense.  Thanks,
diff mbox series

Patch

diff --git a/migration/migration.c b/migration/migration.c
index 7fec57ad7f..22ab7199e4 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -217,6 +217,26 @@  MigrationIncomingState *migration_incoming_get_current(void)
     return current_incoming;
 }
 
+static void migration_file_release(QEMUFile **file)
+{
+    MigrationState *ms = migrate_get_current();
+    QEMUFile *tmp;
+
+    /*
+     * Reset the pointer before releasing it to avoid holding the lock
+     * for too long.
+     */
+    WITH_QEMU_LOCK_GUARD(&ms->qemu_file_lock) {
+        tmp = *file;
+        *file = NULL;
+    }
+
+    if (tmp) {
+        migration_ioc_unregister_yank_from_file(tmp);
+        qemu_fclose(tmp);
+    }
+}
+
 void migration_incoming_transport_cleanup(MigrationIncomingState *mis)
 {
     if (mis->socket_address_list) {
@@ -1155,8 +1175,6 @@  static void migrate_fd_cleanup(MigrationState *s)
     qemu_savevm_state_cleanup();
 
     if (s->to_dst_file) {
-        QEMUFile *tmp;
-
         trace_migrate_fd_cleanup();
         qemu_mutex_unlock_iothread();
         if (s->migration_thread_running) {
@@ -1166,16 +1184,7 @@  static void migrate_fd_cleanup(MigrationState *s)
         qemu_mutex_lock_iothread();
 
         multifd_save_cleanup();
-        qemu_mutex_lock(&s->qemu_file_lock);
-        tmp = s->to_dst_file;
-        s->to_dst_file = NULL;
-        qemu_mutex_unlock(&s->qemu_file_lock);
-        /*
-         * Close the file handle without the lock to make sure the
-         * critical section won't block for long.
-         */
-        migration_ioc_unregister_yank_from_file(tmp);
-        qemu_fclose(tmp);
+        migration_file_release(&s->to_dst_file);
     }
 
     /*
@@ -1815,38 +1824,6 @@  static int migrate_handle_rp_resume_ack(MigrationState *s, uint32_t value)
     return 0;
 }
 
-/*
- * Release ms->rp_state.from_dst_file (and postcopy_qemufile_src if
- * existed) in a safe way.
- */
-static void migration_release_dst_files(MigrationState *ms)
-{
-    QEMUFile *file;
-
-    WITH_QEMU_LOCK_GUARD(&ms->qemu_file_lock) {
-        /*
-         * Reset the from_dst_file pointer first before releasing it, as we
-         * can't block within lock section
-         */
-        file = ms->rp_state.from_dst_file;
-        ms->rp_state.from_dst_file = NULL;
-    }
-
-    /*
-     * Do the same to postcopy fast path socket too if there is.  No
-     * locking needed because this qemufile should only be managed by
-     * return path thread.
-     */
-    if (ms->postcopy_qemufile_src) {
-        migration_ioc_unregister_yank_from_file(ms->postcopy_qemufile_src);
-        qemu_file_shutdown(ms->postcopy_qemufile_src);
-        qemu_fclose(ms->postcopy_qemufile_src);
-        ms->postcopy_qemufile_src = NULL;
-    }
-
-    qemu_fclose(file);
-}
-
 /*
  * Handles messages sent on the return path towards the source VM
  *
@@ -2003,6 +1980,8 @@  static int open_return_path_on_source(MigrationState *ms)
         return -1;
     }
 
+    migration_ioc_register_yank(qemu_file_get_ioc(ms->rp_state.from_dst_file));
+
     trace_open_return_path_on_source();
 
     qemu_thread_create(&ms->rp_state.rp_thread, "return path",
@@ -2046,7 +2025,8 @@  static int await_return_path_close_on_source(MigrationState *ms)
     ret = ms->rp_state.error;
     ms->rp_state.error = false;
 
-    migration_release_dst_files(ms);
+    migration_file_release(&ms->rp_state.from_dst_file);
+    migration_file_release(&ms->postcopy_qemufile_src);
 
     trace_migration_return_path_end_after(ret);
     return ret;
@@ -2502,26 +2482,9 @@  static MigThrError postcopy_pause(MigrationState *s)
     assert(s->state == MIGRATION_STATUS_POSTCOPY_ACTIVE);
 
     while (true) {
-        QEMUFile *file;
-
-        /*
-         * Current channel is possibly broken. Release it.  Note that this is
-         * guaranteed even without lock because to_dst_file should only be
-         * modified by the migration thread.  That also guarantees that the
-         * unregister of yank is safe too without the lock.  It should be safe
-         * even to be within the qemu_file_lock, but we didn't do that to avoid
-         * taking more mutex (yank_lock) within qemu_file_lock.  TL;DR: we make
-         * the qemu_file_lock critical section as small as possible.
-         */
+        /* Current channel is possibly broken. Release it. */
         assert(s->to_dst_file);
-        migration_ioc_unregister_yank_from_file(s->to_dst_file);
-        qemu_mutex_lock(&s->qemu_file_lock);
-        file = s->to_dst_file;
-        s->to_dst_file = NULL;
-        qemu_mutex_unlock(&s->qemu_file_lock);
-
-        qemu_file_shutdown(file);
-        qemu_fclose(file);
+        migration_file_release(&s->to_dst_file);
 
         /*
          * We're already pausing, so ignore any errors on the return