
[RFC,4/7] migration: Drop MultiFDSendParams.quit and cleanup error paths

Message ID 20231022201211.452861-5-peterx@redhat.com
State New
Series migration/multifd: quit unitifications and separate sync packet

Commit Message

Peter Xu Oct. 22, 2023, 8:12 p.m. UTC
Multifd send side has two fields to indicate error quits:

  - MultiFDSendParams.quit
  - &multifd_send_state->exiting

Merge them into the global one.  The replacement is done by changing all
p->quit checks into the global var check.  The global check doesn't need
any lock.

A few more things are done on top of this:

  - multifd_send_terminate_threads()

    Move the xchg() of &multifd_send_state->exiting earlier, so that it
    also covers the tracepoint, migrate_set_error() and migrate_set_state().

  - multifd_send_sync_main()

    In the 2nd loop, add one more check of the global var to make sure we
    don't keep looping if QEMU has already decided to quit.

  - multifd_tls_outgoing_handshake()

    Use multifd_send_terminate_threads() to set the error state.  That has
    a benefit of updating MigrationState.error to that error too, so we can
    persist that 1st error we hit in that specific channel.

  - multifd_new_send_channel_async()

    Take a similar approach to the above: drop the migrate_set_error()
    because multifd_send_terminate_threads() already covers that.  Unwrap
    the helper multifd_new_send_channel_cleanup() along the way; it is not
    really needed.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 migration/multifd.h |  2 --
 migration/multifd.c | 82 ++++++++++++++-------------------------------
 2 files changed, 26 insertions(+), 58 deletions(-)
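
The "exchange once, then tear down" pattern described in the commit message can be sketched in isolation. This is only an illustration under stated assumptions: it uses C11 <stdatomic.h> instead of QEMU's qatomic_xchg() and qatomic_read(), and struct send_state, send_exiting(), send_terminate() and send_terminate_demo() are hypothetical names that do not exist in QEMU.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for multifd_send_state; not QEMU's real layout. */
struct send_state {
    atomic_int exiting;       /* 0 = running, 1 = teardown has started */
    const char *first_error;  /* stand-in for MigrationState.error */
    int shutdown_rounds;      /* times the channels were actually shut down */
};

/* Lock-free check, playing the role of the former p->quit test. */
static bool send_exiting(struct send_state *s)
{
    return atomic_load(&s->exiting) != 0;
}

/*
 * Only the caller that flips 'exiting' from 0 to 1 records its error and
 * shuts the channels down.  Later callers return early, so the first
 * error is the one that persists.
 */
static bool send_terminate(struct send_state *s, const char *err)
{
    if (atomic_exchange(&s->exiting, 1)) {
        return false;         /* teardown was already started elsewhere */
    }
    if (err != NULL) {
        s->first_error = err;
    }
    s->shutdown_rounds++;     /* stand-in for kicking and closing channels */
    return true;
}

/* Usage: two failure paths report errors; only the first one is kept. */
void send_terminate_demo(void)
{
    struct send_state s = { .first_error = NULL };
    atomic_init(&s.exiting, 0);

    assert(!send_exiting(&s));
    assert(send_terminate(&s, "TLS handshake failed"));
    assert(!send_terminate(&s, "channel connect failed"));
    assert(send_exiting(&s));
    assert(strcmp(s.first_error, "TLS handshake failed") == 0);
    assert(s.shutdown_rounds == 1);
}
```

Doing the exchange before any other work means everything after it, including the error bookkeeping, runs at most once, which is the effect the commit message attributes to moving the xchg() earlier.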

Comments

Fabiano Rosas Oct. 23, 2023, 2:42 p.m. UTC | #1
Peter Xu <peterx@redhat.com> writes:

> Multifd send side has two fields to indicate error quits:
>
>   - MultiFDSendParams.quit
>   - &multifd_send_state->exiting
>
> Merge them into the global one.  The replacement is done by changing all
> p->quit checks into the global var check.  The global check doesn't need
> any lock.
>
> A few more things are done on top of this:
>
>   - multifd_send_terminate_threads()
>
>     Move the xchg() of &multifd_send_state->exiting earlier, so that it
>     also covers the tracepoint, migrate_set_error() and migrate_set_state().
>
>   - multifd_send_sync_main()
>
>     In the 2nd loop, add one more check of the global var to make sure we
>     don't keep looping if QEMU has already decided to quit.
>
>   - multifd_tls_outgoing_handshake()
>
>     Use multifd_send_terminate_threads() to set the error state.  That has
>     a benefit of updating MigrationState.error to that error too, so we can
>     persist that 1st error we hit in that specific channel.
>
>   - multifd_new_send_channel_async()
>
>     Take a similar approach to the above: drop the migrate_set_error()
>     because multifd_send_terminate_threads() already covers that.  Unwrap
>     the helper multifd_new_send_channel_cleanup() along the way; it is not
>     really needed.

This all looks good to me. I had a very similar patch in the works. Just
one comment below.

> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  migration/multifd.h |  2 --
>  migration/multifd.c | 82 ++++++++++++++-------------------------------
>  2 files changed, 26 insertions(+), 58 deletions(-)
>
> diff --git a/migration/multifd.h b/migration/multifd.h
> index a835643b48..2acf400085 100644
> --- a/migration/multifd.h
> +++ b/migration/multifd.h
> @@ -97,8 +97,6 @@ typedef struct {
>      QemuMutex mutex;
>      /* is this channel thread running */
>      bool running;
> -    /* should this thread finish */
> -    bool quit;
>      /* multifd flags for each packet */
>      uint32_t flags;
>      /* global number of generated multifd packets */
> diff --git a/migration/multifd.c b/migration/multifd.c
> index 33fb21d0e4..9d458914a9 100644
> --- a/migration/multifd.c
> +++ b/migration/multifd.c
> @@ -411,10 +411,6 @@ static int multifd_send_pages(QEMUFile *f)
>      MultiFDSendParams *p = NULL; /* make happy gcc */
>      MultiFDPages_t *pages = multifd_send_state->pages;
>  
> -    if (qatomic_read(&multifd_send_state->exiting)) {
> -        return -1;
> -    }
> -

I'd keep this. This function can be called from outside of multifd code
so the channels could be completely gone already.

Peter Xu Oct. 23, 2023, 2:53 p.m. UTC | #2
Fabiano,

On Mon, Oct 23, 2023 at 11:42:28AM -0300, Fabiano Rosas wrote:
> > diff --git a/migration/multifd.c b/migration/multifd.c
> > index 33fb21d0e4..9d458914a9 100644
> > --- a/migration/multifd.c
> > +++ b/migration/multifd.c
> > @@ -411,10 +411,6 @@ static int multifd_send_pages(QEMUFile *f)
> >      MultiFDSendParams *p = NULL; /* make happy gcc */
> >      MultiFDPages_t *pages = multifd_send_state->pages;
> >  
> > -    if (qatomic_read(&multifd_send_state->exiting)) {
> > -        return -1;
> > -    }
> > -
> 
> I'd keep this. This function can be called from outside of multifd code
> so the channels could be completely gone already.

I can definitely add it back; nothing hurts.  But I want to make sure I
didn't miss some point.

Do you have a specific path that could trigger what you said?

Fabiano Rosas Oct. 23, 2023, 3:35 p.m. UTC | #3
Peter Xu <peterx@redhat.com> writes:

> Fabiano,
>
> On Mon, Oct 23, 2023 at 11:42:28AM -0300, Fabiano Rosas wrote:
>> > diff --git a/migration/multifd.c b/migration/multifd.c
>> > index 33fb21d0e4..9d458914a9 100644
>> > --- a/migration/multifd.c
>> > +++ b/migration/multifd.c
>> > @@ -411,10 +411,6 @@ static int multifd_send_pages(QEMUFile *f)
>> >      MultiFDSendParams *p = NULL; /* make happy gcc */
>> >      MultiFDPages_t *pages = multifd_send_state->pages;
>> >  
>> > -    if (qatomic_read(&multifd_send_state->exiting)) {
>> > -        return -1;
>> > -    }
>> > -
>> 
>> I'd keep this. This function can be called from outside of multifd code
>> so the channels could be completely gone already.
>
> I can definitely add it back; nothing hurts.  But I want to make sure I
> didn't miss some point.
>
> Do you have a specific path that could trigger what you said?

I don't, just thought of being conservative since this is a multifd
external API (of sorts).

Peter Xu Oct. 23, 2023, 3:54 p.m. UTC | #4
On Mon, Oct 23, 2023 at 12:35:50PM -0300, Fabiano Rosas wrote:
> I don't, just thought of being conservative since this is a multifd
> external API (of sorts).

No worry, let me just keep it there.  Thanks for the quick reviews!
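
The conservative entry guard discussed in this thread can be sketched as follows. This is a simplified illustration, not the code in migration/multifd.c: it uses C11 <stdatomic.h> rather than QEMU's qatomic_read(), it leaves out the semaphore wait and the per-channel mutex, it scans the channels once instead of looping until one is idle, and struct pages_channel, struct pages_state, send_pages() and send_pages_demo() are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical per-channel state; QEMU's MultiFDSendParams has far more. */
struct pages_channel {
    int pending_job;
};

/* Hypothetical stand-in for multifd_send_state. */
struct pages_state {
    atomic_int exiting;            /* set once teardown has started */
    struct pages_channel *params;  /* may be NULL once channels are gone */
    int nchannels;
};

/* Returns the index of the channel that took the job, or -1. */
static int send_pages(struct pages_state *s)
{
    /* Entry guard: bail out before touching any channel state. */
    if (atomic_load(&s->exiting)) {
        return -1;
    }

    /* The real function waits on a semaphore here; omitted in this sketch. */
    for (int i = 0; i < s->nchannels; i++) {
        /* Per-iteration check, so a concurrent quit is noticed promptly. */
        if (atomic_load(&s->exiting)) {
            return -1;
        }
        if (!s->params[i].pending_job) {
            s->params[i].pending_job++;
            return i;
        }
    }
    return -1;                     /* no idle channel found in this pass */
}

/* Usage: a normal send, then a call arriving after teardown. */
void send_pages_demo(void)
{
    struct pages_channel chans[2] = { { .pending_job = 1 },
                                      { .pending_job = 0 } };
    struct pages_state s = { .params = chans, .nchannels = 2 };
    atomic_init(&s.exiting, 0);

    assert(send_pages(&s) == 1);   /* channel 0 is busy, channel 1 is idle */
    assert(chans[1].pending_job == 1);

    atomic_store(&s.exiting, 1);   /* teardown started ... */
    s.params = NULL;               /* ... and the channels are gone */
    assert(send_pages(&s) == -1);  /* guard returns before any dereference */
}
```

In this reduced form the per-iteration check alone would also catch the exit; the entry guard matters more in code that does other work, such as blocking on a semaphore, before reaching the loop.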

Patch

diff --git a/migration/multifd.h b/migration/multifd.h
index a835643b48..2acf400085 100644
--- a/migration/multifd.h
+++ b/migration/multifd.h
@@ -97,8 +97,6 @@  typedef struct {
     QemuMutex mutex;
     /* is this channel thread running */
     bool running;
-    /* should this thread finish */
-    bool quit;
     /* multifd flags for each packet */
     uint32_t flags;
     /* global number of generated multifd packets */
diff --git a/migration/multifd.c b/migration/multifd.c
index 33fb21d0e4..9d458914a9 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -411,10 +411,6 @@  static int multifd_send_pages(QEMUFile *f)
     MultiFDSendParams *p = NULL; /* make happy gcc */
     MultiFDPages_t *pages = multifd_send_state->pages;
 
-    if (qatomic_read(&multifd_send_state->exiting)) {
-        return -1;
-    }
-
     qemu_sem_wait(&multifd_send_state->channels_ready);
     /*
      * next_channel can remain from a previous migration that was
@@ -423,14 +419,11 @@  static int multifd_send_pages(QEMUFile *f)
      */
     next_channel %= migrate_multifd_channels();
     for (i = next_channel;; i = (i + 1) % migrate_multifd_channels()) {
-        p = &multifd_send_state->params[i];
-
-        qemu_mutex_lock(&p->mutex);
-        if (p->quit) {
-            error_report("%s: channel %d has already quit!", __func__, i);
-            qemu_mutex_unlock(&p->mutex);
+        if (qatomic_read(&multifd_send_state->exiting)) {
             return -1;
         }
+        p = &multifd_send_state->params[i];
+        qemu_mutex_lock(&p->mutex);
         if (!p->pending_job) {
             p->pending_job++;
             next_channel = (i + 1) % migrate_multifd_channels();
@@ -485,6 +478,16 @@  static void multifd_send_terminate_threads(Error *err)
 {
     int i;
 
+    /*
+     * We don't want to exit each threads twice.  Depending on where
+     * we get the error, or if there are two independent errors in two
+     * threads at the same time, we can end calling this function
+     * twice.
+     */
+    if (qatomic_xchg(&multifd_send_state->exiting, 1)) {
+        return;
+    }
+
     trace_multifd_send_terminate_threads(err != NULL);
 
     if (err) {
@@ -499,26 +502,13 @@  static void multifd_send_terminate_threads(Error *err)
         }
     }
 
-    /*
-     * We don't want to exit each threads twice.  Depending on where
-     * we get the error, or if there are two independent errors in two
-     * threads at the same time, we can end calling this function
-     * twice.
-     */
-    if (qatomic_xchg(&multifd_send_state->exiting, 1)) {
-        return;
-    }
-
     for (i = 0; i < migrate_multifd_channels(); i++) {
         MultiFDSendParams *p = &multifd_send_state->params[i];
 
-        qemu_mutex_lock(&p->mutex);
-        p->quit = true;
         qemu_sem_post(&p->sem);
         if (p->c) {
             qio_channel_shutdown(p->c, QIO_CHANNEL_SHUTDOWN_BOTH, NULL);
         }
-        qemu_mutex_unlock(&p->mutex);
     }
 }
 
@@ -617,16 +607,13 @@  int multifd_send_sync_main(QEMUFile *f)
     for (i = 0; i < migrate_multifd_channels(); i++) {
         MultiFDSendParams *p = &multifd_send_state->params[i];
 
-        trace_multifd_send_sync_main_signal(p->id);
-
-        qemu_mutex_lock(&p->mutex);
-
-        if (p->quit) {
-            error_report("%s: channel %d has already quit", __func__, i);
-            qemu_mutex_unlock(&p->mutex);
+        if (qatomic_read(&multifd_send_state->exiting)) {
             return -1;
         }
 
+        trace_multifd_send_sync_main_signal(p->id);
+
+        qemu_mutex_lock(&p->mutex);
         p->packet_num = multifd_send_state->packet_num++;
         p->flags |= MULTIFD_FLAG_SYNC;
         p->pending_job++;
@@ -636,6 +623,10 @@  int multifd_send_sync_main(QEMUFile *f)
     for (i = 0; i < migrate_multifd_channels(); i++) {
         MultiFDSendParams *p = &multifd_send_state->params[i];
 
+        if (qatomic_read(&multifd_send_state->exiting)) {
+            return -1;
+        }
+
         qemu_sem_wait(&multifd_send_state->channels_ready);
         trace_multifd_send_sync_main_wait(p->id);
         qemu_sem_wait(&p->sem_sync);
@@ -744,9 +735,6 @@  static void *multifd_send_thread(void *opaque)
             if (flags & MULTIFD_FLAG_SYNC) {
                 qemu_sem_post(&p->sem_sync);
             }
-        } else if (p->quit) {
-            qemu_mutex_unlock(&p->mutex);
-            break;
         } else {
             qemu_mutex_unlock(&p->mutex);
             /* sometimes there are spurious wakeups */
@@ -793,11 +781,7 @@  static void multifd_tls_outgoing_handshake(QIOTask *task,
 
     trace_multifd_tls_outgoing_handshake_error(ioc, error_get_pretty(err));
 
-    /*
-     * Error happen, mark multifd_send_thread status as 'quit' although it
-     * is not created, and then tell who pay attention to me.
-     */
-    p->quit = true;
+    multifd_send_terminate_threads(err);
     multifd_send_kick_main(p);
     error_free(err);
 }
@@ -864,22 +848,6 @@  static bool multifd_channel_connect(MultiFDSendParams *p,
     return true;
 }
 
-static void multifd_new_send_channel_cleanup(MultiFDSendParams *p,
-                                             QIOChannel *ioc, Error *err)
-{
-     migrate_set_error(migrate_get_current(), err);
-     /* Error happen, we need to tell who pay attention to me */
-     multifd_send_kick_main(p);
-     /*
-      * Although multifd_send_thread is not created, but main migration
-      * thread need to judge whether it is running, so we need to mark
-      * its status.
-      */
-     p->quit = true;
-     object_unref(OBJECT(ioc));
-     error_free(err);
-}
-
 static void multifd_new_send_channel_async(QIOTask *task, gpointer opaque)
 {
     MultiFDSendParams *p = opaque;
@@ -897,7 +865,10 @@  static void multifd_new_send_channel_async(QIOTask *task, gpointer opaque)
     }
 
     trace_multifd_new_send_channel_async_error(p->id, local_err);
-    multifd_new_send_channel_cleanup(p, ioc, local_err);
+    multifd_send_terminate_threads(local_err);
+    multifd_send_kick_main(p);
+    object_unref(OBJECT(ioc));
+    error_free(local_err);
 }
 
 static void multifd_new_send_channel_create(gpointer opaque)
@@ -929,7 +900,6 @@  int multifd_save_setup(Error **errp)
         qemu_mutex_init(&p->mutex);
         qemu_sem_init(&p->sem, 0);
         qemu_sem_init(&p->sem_sync, 0);
-        p->quit = false;
         p->pending_job = 0;
         p->id = i;
         p->pages = multifd_pages_init(page_count);