[1/6] migration: Set migration status early in incoming side

Message ID 20230628165542.17214-2-farosas@suse.de
State New
Series migration: Test the new "file:" migration

Commit Message

Fabiano Rosas June 28, 2023, 4:55 p.m. UTC
We are sending a migration event of MIGRATION_STATUS_SETUP at
qemu_start_incoming_migration but never actually setting the state.

This creates a window between qmp_migrate_incoming and
process_incoming_migration_co where the migration status is still
MIGRATION_STATUS_NONE. Calling query-migrate during this time will
return an empty response even though the incoming migration command
has already been issued.
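
As a rough illustration of that window, the sketch below models the
incoming state before and after this patch. The enum and the demo_*
functions are simplified stand-ins invented for this example, not QEMU's
real types or functions. The only behavior assumed from QEMU is that a
state transition takes effect only when the expected old state matches,
and that a status of NONE produces an empty query response.

```c
#include <stdbool.h>

/* Illustrative stand-in for the incoming migration status. */
typedef enum { DEMO_NONE, DEMO_SETUP, DEMO_ACTIVE } DemoStatus;

/* Conditional transition: only applies if the current state matches. */
static void demo_set_state(DemoStatus *s, DemoStatus old_state,
                           DemoStatus new_state)
{
    if (*s == old_state) {
        *s = new_state;
    }
}

/* Stand-in for the status string a query would report ("" when NONE). */
static const char *demo_query_status(DemoStatus s)
{
    switch (s) {
    case DEMO_SETUP:
        return "setup";
    case DEMO_ACTIVE:
        return "active";
    default:
        return "";
    }
}

/* State seen after the incoming command, before the coroutine runs. */
static DemoStatus demo_state_in_window(bool patched)
{
    DemoStatus s = DEMO_NONE;

    if (patched) {
        /* The patch sets SETUP here instead of only sending an event. */
        demo_set_state(&s, DEMO_NONE, DEMO_SETUP);
    }
    return s;
}

/* True if a query issued inside the window reports a status. */
static bool demo_window_reports_status(bool patched)
{
    return demo_query_status(demo_state_in_window(patched))[0] != '\0';
}

/* State once the coroutine has made its transition to ACTIVE. */
static DemoStatus demo_state_after_coroutine(bool patched)
{
    DemoStatus s = demo_state_in_window(patched);

    /* The expected old state differs: NONE before, SETUP after. */
    demo_set_state(&s, patched ? DEMO_SETUP : DEMO_NONE, DEMO_ACTIVE);
    return s;
}
```

In this model the unpatched path reports nothing inside the window, the
patched path reports "setup", and both end up in the active state, which
is why the second transition has to expect SETUP instead of NONE.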

Commit 7cf1fe6d68 ("migration: Add migration events on target side")
added support for the 'events' capability to the incoming side of
migration, but chose to send the SETUP event without setting the
state. I'm assuming this was a mistake.

To avoid introducing a change in behavior, we need to keep sending the
SETUP event, even if the 'events' capability is not set. Add the
force-emit-setup-event migration property to enable it.

Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 migration/migration.c | 17 +++++++++++++++--
 migration/migration.h | 11 +++++++++++
 migration/options.c   | 13 +++++++++++++
 migration/options.h   |  1 +
 4 files changed, 40 insertions(+), 2 deletions(-)

Comments

Peter Xu June 29, 2023, 7:18 p.m. UTC | #1
On Wed, Jun 28, 2023 at 01:55:37PM -0300, Fabiano Rosas wrote:
> We are sending a migration event of MIGRATION_STATUS_SETUP at
> qemu_start_incoming_migration but never actually setting the state.
> 
> This creates a window between qmp_migrate_incoming and
> process_incoming_migration_co where the migration status is still
> MIGRATION_STATUS_NONE. Calling query-migrate during this time will
> return an empty response even though the incoming migration command
> has already been issued.
> 
> Commit 7cf1fe6d68 ("migration: Add migration events on target side")
> added support for the 'events' capability to the incoming side of
> migration, but chose to send the SETUP event without setting the
> state. I'm assuming this was a mistake.
> 
> To avoid introducing a change in behavior, we need to keep sending the
> SETUP event, even if the 'events' capability is not set. Add the
> force-emit-setup-event migration property to enable it.

This is so unfortunate... since qemu 2.4.....

Does it mean that when cap-events is set we can send duplicated events?

The fix makes sense to me in general, but I'm curious whether we can fix
it without a compat bit that preserves the wrong behavior, even at the
risk of breaking someone, in the hope that the only thing they need to
do is enable cap-events if they haven't.  I'd consider that as long as
libvirt is fine.  Does anyone know how libvirt handles this?

In the worst case, if there's major breakage, we can apply a patch adding
the compat bit and copy stable, which should cover all the recent
releases.  And if there are no reports after a few releases, it probably
means we're fine anyway.

It just feels so unfortunate that migration needs to carry so many legacy
issues along the way, so I hope to avoid it if at all possible.

Fabiano Rosas June 30, 2023, 2:57 p.m. UTC | #2
Peter Xu <peterx@redhat.com> writes:

> On Wed, Jun 28, 2023 at 01:55:37PM -0300, Fabiano Rosas wrote:
>> We are sending a migration event of MIGRATION_STATUS_SETUP at
>> qemu_start_incoming_migration but never actually setting the state.
>> 
>> This creates a window between qmp_migrate_incoming and
>> process_incoming_migration_co where the migration status is still
>> MIGRATION_STATUS_NONE. Calling query-migrate during this time will
>> return an empty response even though the incoming migration command
>> has already been issued.
>> 
>> Commit 7cf1fe6d68 ("migration: Add migration events on target side")
>> added support for the 'events' capability to the incoming side of
>> migration, but chose to send the SETUP event without setting the
>> state. I'm assuming this was a mistake.
>> 
>> To avoid introducing a change in behavior, we need to keep sending the
>> SETUP event, even if the 'events' capability is not set. Add the
>> force-emit-setup-event migration property to enable it.
>
> This is so unfortunate... since qemu 2.4.....
>
> Does it mean that when cap-events is set we can send duplicated events?
>

Not with current code because this event was the only one sent directly
without setting the state first, so migrate_generate_event() never runs.

And not with this patch either: I don't send the event directly when
migrate_events() is true, because migrate_generate_event() will already
have sent it.
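
To make the no-duplicates argument concrete, here is a small
self-contained model of the patched logic. The Model struct and the
model_* functions are stand-ins made up for this illustration, not QEMU
code. It assumes migrate_set_state() only generates an event when the
transition succeeds and the 'events' capability is enabled, and it only
counts SETUP events.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins for illustration; not QEMU's real types. */
typedef enum { MODEL_NONE, MODEL_SETUP } ModelStatus;

typedef struct {
    ModelStatus state;      /* plays the role of mis->state */
    bool events_cap;        /* the 'events' capability */
    bool force_emit_setup;  /* the force-emit-setup-event property */
    int setup_events;       /* SETUP events emitted so far */
} Model;

/*
 * Plays the role of migrate_set_state(): change state only if the
 * current state is the expected one and, like
 * migrate_generate_event(), count a SETUP event only when the
 * 'events' capability is on.
 */
static void model_set_state(Model *m, ModelStatus old_state,
                            ModelStatus new_state)
{
    if (m->state != old_state) {
        return;
    }
    m->state = new_state;
    if (m->events_cap && new_state == MODEL_SETUP) {
        m->setup_events++;
    }
}

/* Same condition as migrate_emit_setup_event() in the patch. */
static bool model_emit_setup_event(const Model *m)
{
    return !m->events_cap && m->force_emit_setup;
}

/* Number of SETUP events seen when an incoming migration starts. */
static int model_setup_events(bool events_cap, bool force_emit_setup)
{
    Model m = { MODEL_NONE, events_cap, force_emit_setup, 0 };

    model_set_state(&m, MODEL_NONE, MODEL_SETUP);
    if (model_emit_setup_event(&m)) {
        m.setup_events++;   /* the direct qapi_event_send_migration() */
    }
    assert(m.state == MODEL_SETUP);   /* state is now set in every case */
    return m.setup_events;
}
```

In this model, model_setup_events() never returns 2. It returns 0 only
when both the capability and the property are off, which is the case a
client relying on the old unconditional event would notice.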

> The fix makes sense to me in general, but I'm curious whether we can fix
> it without a compat bit that preserves the wrong behavior, even at the
> risk of breaking someone, in the hope that the only thing they need to
> do is enable cap-events if they haven't.  I'd consider that as long as
> libvirt is fine.  Does anyone know how libvirt handles this?
>

I agree that it would be cleaner for us to just break compatibility and
hope for the best. Any process waiting for the event would hang, but
simply enabling the capability would fix it.

I see libvirt knows about the 'events' capability but I couldn't
determine if it is enabled by default. I'll have to take a deeper look.

Patch

diff --git a/migration/migration.c b/migration/migration.c
index 7c8292d4d4..6da1865e80 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -424,13 +424,26 @@  void migrate_add_address(SocketAddress *address)
 static void qemu_start_incoming_migration(const char *uri, Error **errp)
 {
     const char *p = NULL;
+    MigrationIncomingState *mis = migration_incoming_get_current();
 
     /* URI is not suitable for migration? */
     if (!migration_channels_and_uri_compatible(uri, errp)) {
         return;
     }
 
-    qapi_event_send_migration(MIGRATION_STATUS_SETUP);
+    migrate_set_state(&mis->state, MIGRATION_STATUS_NONE,
+                      MIGRATION_STATUS_SETUP);
+    /*
+     * QMP clients should have set the 'events' migration capability
+     * if they want to receive this event, in which case the
+     * migrate_set_state() call above will have already sent the
+     * event. We still need to send the event for compatibility even
+     * if migration events are disabled.
+     */
+    if (migrate_emit_setup_event()) {
+        qapi_event_send_migration(MIGRATION_STATUS_SETUP);
+    }
+
     if (strstart(uri, "tcp:", &p) ||
         strstart(uri, "unix:", NULL) ||
         strstart(uri, "vsock:", NULL)) {
@@ -524,7 +537,7 @@  process_incoming_migration_co(void *opaque)
 
     mis->largest_page_size = qemu_ram_pagesize_largest();
     postcopy_state_set(POSTCOPY_INCOMING_NONE);
-    migrate_set_state(&mis->state, MIGRATION_STATUS_NONE,
+    migrate_set_state(&mis->state, MIGRATION_STATUS_SETUP,
                       MIGRATION_STATUS_ACTIVE);
 
     mis->loadvm_co = qemu_coroutine_self();
diff --git a/migration/migration.h b/migration/migration.h
index 30c3e97635..05e1e19e4f 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -433,6 +433,17 @@  struct MigrationState {
      */
     uint8_t clear_bitmap_shift;
 
+    /*
+     * Always emit the incoming migration's SETUP event, even when the
+     * 'events' capability is not enabled.
+     *
+     * QMP clients that wish to receive migration events should always
+     * enable the 'events' capability. This property is for
+     * compatibility with clients that rely on the older QEMU behavior
+     * of unconditionally emitting the SETUP event.
+     */
+    bool force_emit_setup_event;
+
     /*
      * This save hostname when out-going migration starts
      */
diff --git a/migration/options.c b/migration/options.c
index b62ab30cd5..b0eda7cb05 100644
--- a/migration/options.c
+++ b/migration/options.c
@@ -95,6 +95,8 @@  Property migration_properties[] = {
                       clear_bitmap_shift, CLEAR_BITMAP_SHIFT_DEFAULT),
     DEFINE_PROP_BOOL("x-preempt-pre-7-2", MigrationState,
                      preempt_pre_7_2, false),
+    DEFINE_PROP_BOOL("force-emit-setup-event", MigrationState,
+                      force_emit_setup_event, true),
 
     /* Migration parameters */
     DEFINE_PROP_UINT8("x-compress-level", MigrationState,
@@ -338,6 +340,17 @@  bool migrate_zero_copy_send(void)
 
 /* pseudo capabilities */
 
+bool migrate_emit_setup_event(void)
+{
+    MigrationState *s = migrate_get_current();
+
+    /*
+     * If migration events are enabled the setup event will have
+     * already been sent.
+     */
+    return !migrate_events() && s->force_emit_setup_event;
+}
+
 bool migrate_multifd_flush_after_each_section(void)
 {
     MigrationState *s = migrate_get_current();
diff --git a/migration/options.h b/migration/options.h
index 45991af3c2..5c9785e455 100644
--- a/migration/options.h
+++ b/migration/options.h
@@ -52,6 +52,7 @@  bool migrate_zero_copy_send(void);
  * check, but they are not a capability.
  */
 
+bool migrate_emit_setup_event(void);
 bool migrate_multifd_flush_after_each_section(void);
 bool migrate_postcopy(void);
 bool migrate_tls(void);