Message ID | 20230202212458.726496-2-peterx@redhat.com |
---|---|
State | New |
Headers | show |
Series | migration: Fix disorder of channel creations | expand |
Peter Xu <peterx@redhat.com> wrote: > The whole idea of multi-channel checks was not properly done, IMHO. > > Currently we check multi-channel in a lot of places, but actually that's > not needed because we only need to check it right after we get the URI and > that should be it. > > If the URI check succeeded, we should never need to check it again because > we must have it. If it check fails, we should fail immediately on either > the qmp_migrate or qmp_migrate_incoming, instead of failingg it later after > the connection established. > > Neither should we fail any set capabiliities like what we used to do here: > > 5ad15e8614 ("migration: allow enabling mutilfd for specific protocol only", 2021-10-19) > > Because logically the URI will only be set later after the capability is > set, so it doesn't make a lot of sense to check the URI type when setting > the capability, because we're checking the cap with an old URI passed in, > and that may not even be the URI we're going to use later. > > This patch mostly reverted all such checks for before, dropping the > variable migrate_allow_multi_channels and helpers. Instead, add a common > helper to check URI for multi-channels for either qmp_migrate and > qmp_migrate_incoming and that should do all the proper checks. The failure > will only trigger with the "migrate" or "migrate_incoming" command, or when > user specified "-incoming xxx" where "xxx" is not "defer". > > With that, make postcopy_preempt_setup() as simple as creating the channel. > > Signed-off-by: Peter Xu <peterx@redhat.com> The idea is right. But I think that changing everything in a single patch is confusing. > --- > migration/migration.c | 56 +++++++++++++++++++--------------------- > migration/migration.h | 3 --- > migration/multifd.c | 12 ++------- > migration/postcopy-ram.c | 14 +--------- > migration/postcopy-ram.h | 2 +- > 5 files changed, 31 insertions(+), 56 deletions(-) > > diff --git a/migration/migration.c b/migration/migration.c > index f4f7d207f0..ef7fceb5d7 100644 > --- a/migration/migration.c > +++ b/migration/migration.c > @@ -182,16 +182,26 @@ static int migration_maybe_pause(MigrationState *s, > int new_state); > static void migrate_fd_cancel(MigrationState *s); > > -static bool migrate_allow_multi_channels = true; > +static bool migration_needs_multiple_sockets(void) > +{ > + return migrate_use_multifd() || migrate_postcopy_preempt(); > +} This function (and use it) makes sense > -void migrate_protocol_allow_multi_channels(bool allow) > +static bool uri_supports_multi_channels(const char *uri) > { > - migrate_allow_multi_channels = allow; > + return strstart(uri, "tcp:", NULL) || strstart(uri, "unix:", NULL) || > + strstart(uri, "vsock:", NULL); Indentation is wrong. Fixing it by hand. This other is also ok with me. > } > > -bool migrate_multi_channels_is_allowed(void) > +static bool migration_uri_validate(const char *uri, Error **errp) > { > - return migrate_allow_multi_channels; > + if (migration_needs_multiple_sockets() && > + !uri_supports_multi_channels(uri)) { > + error_setg(errp, "Migration requires multi-channel URIs (e.g. tcp)"); > + return false; > + } > + > + return true; > } This name is not O:-) What about: migration_channels_and_uri_compatible() No, it is not perfect, but I can think anything else. But validate don't mean anything. I can't know without looking at the function what is the meaning of the result. > static gint page_request_addr_cmp(gconstpointer ap, gconstpointer bp) > @@ -491,12 +501,15 @@ static void qemu_start_incoming_migration(const char *uri, Error **errp) > { > const char *p = NULL; > > - migrate_protocol_allow_multi_channels(false); /* reset it anyway */ > + /* URI is not suitable for migration? */ > + if (!migration_uri_validate(uri, errp)) { > + return; > + } > + > qapi_event_send_migration(MIGRATION_STATUS_SETUP); > if (strstart(uri, "tcp:", &p) || > strstart(uri, "unix:", NULL) || > strstart(uri, "vsock:", NULL)) { > - migrate_protocol_allow_multi_channels(true); > socket_start_incoming_migration(p ? p : uri, errp); > #ifdef CONFIG_RDMA > } else if (strstart(uri, "rdma:", &p)) { > @@ -721,11 +734,6 @@ void migration_fd_process_incoming(QEMUFile *f, Error **errp) > migration_incoming_process(); > } > > -static bool migration_needs_multiple_sockets(void) > -{ > - return migrate_use_multifd() || migrate_postcopy_preempt(); > -} > - > void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp) > { > MigrationIncomingState *mis = migration_incoming_get_current(); > @@ -1347,15 +1355,6 @@ static bool migrate_caps_check(bool *cap_list, > } > #endif > > - > - /* incoming side only */ > - if (runstate_check(RUN_STATE_INMIGRATE) && > - !migrate_multi_channels_is_allowed() && > - cap_list[MIGRATION_CAPABILITY_MULTIFD]) { > - error_setg(errp, "multifd is not supported by current protocol"); > - return false; > - } > - > if (cap_list[MIGRATION_CAPABILITY_POSTCOPY_PREEMPT]) { > if (!cap_list[MIGRATION_CAPABILITY_POSTCOPY_RAM]) { > error_setg(errp, "Postcopy preempt requires postcopy-ram"); > @@ -2440,6 +2439,11 @@ void qmp_migrate(const char *uri, bool has_blk, bool blk, > MigrationState *s = migrate_get_current(); > const char *p = NULL; > > + /* URI is not suitable for migration? */ > + if (!migration_uri_validate(uri, errp)) { > + return; > + } > + > if (!migrate_prepare(s, has_blk && blk, has_inc && inc, > has_resume && resume, errp)) { > /* Error detected, put into errp */ > @@ -2452,11 +2456,9 @@ void qmp_migrate(const char *uri, bool has_blk, bool blk, > } > } > > - migrate_protocol_allow_multi_channels(false); > if (strstart(uri, "tcp:", &p) || > strstart(uri, "unix:", NULL) || > strstart(uri, "vsock:", NULL)) { > - migrate_protocol_allow_multi_channels(true); > socket_start_outgoing_migration(s, p ? p : uri, &local_err); > #ifdef CONFIG_RDMA > } else if (strstart(uri, "rdma:", &p)) { > @@ -4309,12 +4311,8 @@ void migrate_fd_connect(MigrationState *s, Error *error_in) > } > > /* This needs to be done before resuming a postcopy */ > - if (postcopy_preempt_setup(s, &local_err)) { > - error_report_err(local_err); > - migrate_set_state(&s->state, MIGRATION_STATUS_SETUP, > - MIGRATION_STATUS_FAILED); > - migrate_fd_cleanup(s); > - return; > + if (migrate_postcopy_preempt()) { > + postcopy_preempt_setup(s); > } I think that this should go in a different patch. Rest looks ok. Thanks, Juan.
On Wed, Feb 08, 2023 at 08:19:11PM +0100, Juan Quintela wrote: > Peter Xu <peterx@redhat.com> wrote: > > The whole idea of multi-channel checks was not properly done, IMHO. > > > > Currently we check multi-channel in a lot of places, but actually that's > > not needed because we only need to check it right after we get the URI and > > that should be it. > > > > If the URI check succeeded, we should never need to check it again because > > we must have it. If it check fails, we should fail immediately on either > > the qmp_migrate or qmp_migrate_incoming, instead of failingg it later after > > the connection established. > > > > Neither should we fail any set capabiliities like what we used to do here: > > > > 5ad15e8614 ("migration: allow enabling mutilfd for specific protocol only", 2021-10-19) > > > > Because logically the URI will only be set later after the capability is > > set, so it doesn't make a lot of sense to check the URI type when setting > > the capability, because we're checking the cap with an old URI passed in, > > and that may not even be the URI we're going to use later. > > > > This patch mostly reverted all such checks for before, dropping the > > variable migrate_allow_multi_channels and helpers. Instead, add a common > > helper to check URI for multi-channels for either qmp_migrate and > > qmp_migrate_incoming and that should do all the proper checks. The failure > > will only trigger with the "migrate" or "migrate_incoming" command, or when > > user specified "-incoming xxx" where "xxx" is not "defer". > > > > With that, make postcopy_preempt_setup() as simple as creating the channel. > > > > Signed-off-by: Peter Xu <peterx@redhat.com> > > The idea is right. But I think that changing everything in a single > patch is confusing. > > > --- > > migration/migration.c | 56 +++++++++++++++++++--------------------- > > migration/migration.h | 3 --- > > migration/multifd.c | 12 ++------- > > migration/postcopy-ram.c | 14 +--------- > > migration/postcopy-ram.h | 2 +- > > 5 files changed, 31 insertions(+), 56 deletions(-) > > > > diff --git a/migration/migration.c b/migration/migration.c > > index f4f7d207f0..ef7fceb5d7 100644 > > --- a/migration/migration.c > > +++ b/migration/migration.c > > @@ -182,16 +182,26 @@ static int migration_maybe_pause(MigrationState *s, > > int new_state); > > static void migrate_fd_cancel(MigrationState *s); > > > > -static bool migrate_allow_multi_channels = true; > > +static bool migration_needs_multiple_sockets(void) > > +{ > > + return migrate_use_multifd() || migrate_postcopy_preempt(); > > +} > > This function (and use it) makes sense > > > -void migrate_protocol_allow_multi_channels(bool allow) > > +static bool uri_supports_multi_channels(const char *uri) > > { > > - migrate_allow_multi_channels = allow; > > + return strstart(uri, "tcp:", NULL) || strstart(uri, "unix:", NULL) || > > + strstart(uri, "vsock:", NULL); > > Indentation is wrong. Fixing it by hand. Will do. > > This other is also ok with me. > > > } > > > > -bool migrate_multi_channels_is_allowed(void) > > +static bool migration_uri_validate(const char *uri, Error **errp) > > { > > - return migrate_allow_multi_channels; > > + if (migration_needs_multiple_sockets() && > > + !uri_supports_multi_channels(uri)) { > > + error_setg(errp, "Migration requires multi-channel URIs (e.g. tcp)"); > > + return false; > > + } > > + > > + return true; > > } > > This name is not O:-) > > What about: > > migration_channels_and_uri_compatible() > > No, it is not perfect, but I can think anything else. > > But validate don't mean anything. I can't know without looking at the > function what is the meaning of the result. I don't have an obvious preference; I think it means I'll just go ahead and rename it. :) > > > static gint page_request_addr_cmp(gconstpointer ap, gconstpointer bp) > > @@ -491,12 +501,15 @@ static void qemu_start_incoming_migration(const char *uri, Error **errp) > > { > > const char *p = NULL; > > > > - migrate_protocol_allow_multi_channels(false); /* reset it anyway */ > > + /* URI is not suitable for migration? */ > > + if (!migration_uri_validate(uri, errp)) { > > + return; > > + } > > + > > qapi_event_send_migration(MIGRATION_STATUS_SETUP); > > if (strstart(uri, "tcp:", &p) || > > strstart(uri, "unix:", NULL) || > > strstart(uri, "vsock:", NULL)) { > > - migrate_protocol_allow_multi_channels(true); > > socket_start_incoming_migration(p ? p : uri, errp); > > #ifdef CONFIG_RDMA > > } else if (strstart(uri, "rdma:", &p)) { > > @@ -721,11 +734,6 @@ void migration_fd_process_incoming(QEMUFile *f, Error **errp) > > migration_incoming_process(); > > } > > > > -static bool migration_needs_multiple_sockets(void) > > -{ > > - return migrate_use_multifd() || migrate_postcopy_preempt(); > > -} > > - > > void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp) > > { > > MigrationIncomingState *mis = migration_incoming_get_current(); > > @@ -1347,15 +1355,6 @@ static bool migrate_caps_check(bool *cap_list, > > } > > #endif > > > > - > > - /* incoming side only */ > > - if (runstate_check(RUN_STATE_INMIGRATE) && > > - !migrate_multi_channels_is_allowed() && > > - cap_list[MIGRATION_CAPABILITY_MULTIFD]) { > > - error_setg(errp, "multifd is not supported by current protocol"); > > - return false; > > - } > > - > > if (cap_list[MIGRATION_CAPABILITY_POSTCOPY_PREEMPT]) { > > if (!cap_list[MIGRATION_CAPABILITY_POSTCOPY_RAM]) { > > error_setg(errp, "Postcopy preempt requires postcopy-ram"); > > @@ -2440,6 +2439,11 @@ void qmp_migrate(const char *uri, bool has_blk, bool blk, > > MigrationState *s = migrate_get_current(); > > const char *p = NULL; > > > > + /* URI is not suitable for migration? */ > > + if (!migration_uri_validate(uri, errp)) { > > + return; > > + } > > + > > if (!migrate_prepare(s, has_blk && blk, has_inc && inc, > > has_resume && resume, errp)) { > > /* Error detected, put into errp */ > > @@ -2452,11 +2456,9 @@ void qmp_migrate(const char *uri, bool has_blk, bool blk, > > } > > } > > > > - migrate_protocol_allow_multi_channels(false); > > if (strstart(uri, "tcp:", &p) || > > strstart(uri, "unix:", NULL) || > > strstart(uri, "vsock:", NULL)) { > > - migrate_protocol_allow_multi_channels(true); > > socket_start_outgoing_migration(s, p ? p : uri, &local_err); > > #ifdef CONFIG_RDMA > > } else if (strstart(uri, "rdma:", &p)) { > > @@ -4309,12 +4311,8 @@ void migrate_fd_connect(MigrationState *s, Error *error_in) > > } > > > > /* This needs to be done before resuming a postcopy */ > > - if (postcopy_preempt_setup(s, &local_err)) { > > - error_report_err(local_err); > > - migrate_set_state(&s->state, MIGRATION_STATUS_SETUP, > > - MIGRATION_STATUS_FAILED); > > - migrate_fd_cleanup(s); > > - return; > > + if (migrate_postcopy_preempt()) { > > + postcopy_preempt_setup(s); > > } > > I think that this should go in a different patch. It's so small and natural so I "hid" it in. But I agree, I'll split. Thanks!
diff --git a/migration/migration.c b/migration/migration.c index f4f7d207f0..ef7fceb5d7 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -182,16 +182,26 @@ static int migration_maybe_pause(MigrationState *s, int new_state); static void migrate_fd_cancel(MigrationState *s); -static bool migrate_allow_multi_channels = true; +static bool migration_needs_multiple_sockets(void) +{ + return migrate_use_multifd() || migrate_postcopy_preempt(); +} -void migrate_protocol_allow_multi_channels(bool allow) +static bool uri_supports_multi_channels(const char *uri) { - migrate_allow_multi_channels = allow; + return strstart(uri, "tcp:", NULL) || strstart(uri, "unix:", NULL) || + strstart(uri, "vsock:", NULL); } -bool migrate_multi_channels_is_allowed(void) +static bool migration_uri_validate(const char *uri, Error **errp) { - return migrate_allow_multi_channels; + if (migration_needs_multiple_sockets() && + !uri_supports_multi_channels(uri)) { + error_setg(errp, "Migration requires multi-channel URIs (e.g. tcp)"); + return false; + } + + return true; } static gint page_request_addr_cmp(gconstpointer ap, gconstpointer bp) @@ -491,12 +501,15 @@ static void qemu_start_incoming_migration(const char *uri, Error **errp) { const char *p = NULL; - migrate_protocol_allow_multi_channels(false); /* reset it anyway */ + /* URI is not suitable for migration? */ + if (!migration_uri_validate(uri, errp)) { + return; + } + qapi_event_send_migration(MIGRATION_STATUS_SETUP); if (strstart(uri, "tcp:", &p) || strstart(uri, "unix:", NULL) || strstart(uri, "vsock:", NULL)) { - migrate_protocol_allow_multi_channels(true); socket_start_incoming_migration(p ? p : uri, errp); #ifdef CONFIG_RDMA } else if (strstart(uri, "rdma:", &p)) { @@ -721,11 +734,6 @@ void migration_fd_process_incoming(QEMUFile *f, Error **errp) migration_incoming_process(); } -static bool migration_needs_multiple_sockets(void) -{ - return migrate_use_multifd() || migrate_postcopy_preempt(); -} - void migration_ioc_process_incoming(QIOChannel *ioc, Error **errp) { MigrationIncomingState *mis = migration_incoming_get_current(); @@ -1347,15 +1355,6 @@ static bool migrate_caps_check(bool *cap_list, } #endif - - /* incoming side only */ - if (runstate_check(RUN_STATE_INMIGRATE) && - !migrate_multi_channels_is_allowed() && - cap_list[MIGRATION_CAPABILITY_MULTIFD]) { - error_setg(errp, "multifd is not supported by current protocol"); - return false; - } - if (cap_list[MIGRATION_CAPABILITY_POSTCOPY_PREEMPT]) { if (!cap_list[MIGRATION_CAPABILITY_POSTCOPY_RAM]) { error_setg(errp, "Postcopy preempt requires postcopy-ram"); @@ -2440,6 +2439,11 @@ void qmp_migrate(const char *uri, bool has_blk, bool blk, MigrationState *s = migrate_get_current(); const char *p = NULL; + /* URI is not suitable for migration? */ + if (!migration_uri_validate(uri, errp)) { + return; + } + if (!migrate_prepare(s, has_blk && blk, has_inc && inc, has_resume && resume, errp)) { /* Error detected, put into errp */ @@ -2452,11 +2456,9 @@ void qmp_migrate(const char *uri, bool has_blk, bool blk, } } - migrate_protocol_allow_multi_channels(false); if (strstart(uri, "tcp:", &p) || strstart(uri, "unix:", NULL) || strstart(uri, "vsock:", NULL)) { - migrate_protocol_allow_multi_channels(true); socket_start_outgoing_migration(s, p ? p : uri, &local_err); #ifdef CONFIG_RDMA } else if (strstart(uri, "rdma:", &p)) { @@ -4309,12 +4311,8 @@ void migrate_fd_connect(MigrationState *s, Error *error_in) } /* This needs to be done before resuming a postcopy */ - if (postcopy_preempt_setup(s, &local_err)) { - error_report_err(local_err); - migrate_set_state(&s->state, MIGRATION_STATUS_SETUP, - MIGRATION_STATUS_FAILED); - migrate_fd_cleanup(s); - return; + if (migrate_postcopy_preempt()) { + postcopy_preempt_setup(s); } if (resume) { diff --git a/migration/migration.h b/migration/migration.h index 66511ce532..c351872360 100644 --- a/migration/migration.h +++ b/migration/migration.h @@ -474,7 +474,4 @@ void migration_cancel(const Error *error); void populate_vfio_info(MigrationInfo *info); void postcopy_temp_page_reset(PostcopyTmpPage *tmp_page); -bool migrate_multi_channels_is_allowed(void); -void migrate_protocol_allow_multi_channels(bool allow); - #endif diff --git a/migration/multifd.c b/migration/multifd.c index eeb4fb87ee..dfe8eda5bf 100644 --- a/migration/multifd.c +++ b/migration/multifd.c @@ -512,7 +512,7 @@ void multifd_save_cleanup(void) { int i; - if (!migrate_use_multifd() || !migrate_multi_channels_is_allowed()) { + if (!migrate_use_multifd()) { return; } multifd_send_terminate_threads(NULL); @@ -910,10 +910,6 @@ int multifd_save_setup(Error **errp) if (!migrate_use_multifd()) { return 0; } - if (!migrate_multi_channels_is_allowed()) { - error_setg(errp, "multifd is not supported by current protocol"); - return -1; - } thread_count = migrate_multifd_channels(); multifd_send_state = g_malloc0(sizeof(*multifd_send_state)); @@ -1018,7 +1014,7 @@ int multifd_load_cleanup(Error **errp) { int i; - if (!migrate_use_multifd() || !migrate_multi_channels_is_allowed()) { + if (!migrate_use_multifd()) { return 0; } multifd_recv_terminate_threads(NULL); @@ -1172,10 +1168,6 @@ int multifd_load_setup(Error **errp) return 0; } - if (!migrate_multi_channels_is_allowed()) { - error_setg(errp, "multifd is not supported by current protocol"); - return -1; - } thread_count = migrate_multifd_channels(); multifd_recv_state = g_malloc0(sizeof(*multifd_recv_state)); multifd_recv_state->params = g_new0(MultiFDRecvParams, thread_count); diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c index b98e95dab0..e2578dbd21 100644 --- a/migration/postcopy-ram.c +++ b/migration/postcopy-ram.c @@ -1631,22 +1631,10 @@ int postcopy_preempt_wait_channel(MigrationState *s) return s->postcopy_qemufile_src ? 0 : -1; } -int postcopy_preempt_setup(MigrationState *s, Error **errp) +void postcopy_preempt_setup(MigrationState *s) { - if (!migrate_postcopy_preempt()) { - return 0; - } - - if (!migrate_multi_channels_is_allowed()) { - error_setg(errp, "Postcopy preempt is not supported as current " - "migration stream does not support multi-channels."); - return -1; - } - /* Kick an async task to connect */ socket_send_channel_create(postcopy_preempt_send_channel_new, s); - - return 0; } static void postcopy_pause_ram_fast_load(MigrationIncomingState *mis) diff --git a/migration/postcopy-ram.h b/migration/postcopy-ram.h index 25881c4127..d5604cbcf1 100644 --- a/migration/postcopy-ram.h +++ b/migration/postcopy-ram.h @@ -191,7 +191,7 @@ enum PostcopyChannels { }; void postcopy_preempt_new_channel(MigrationIncomingState *mis, QEMUFile *file); -int postcopy_preempt_setup(MigrationState *s, Error **errp); +void postcopy_preempt_setup(MigrationState *s); int postcopy_preempt_wait_channel(MigrationState *s); #endif
The whole idea of multi-channel checks was not properly done, IMHO. Currently we check multi-channel in a lot of places, but actually that's not needed because we only need to check it right after we get the URI and that should be it. If the URI check succeeded, we should never need to check it again because we must have it. If it check fails, we should fail immediately on either the qmp_migrate or qmp_migrate_incoming, instead of failingg it later after the connection established. Neither should we fail any set capabiliities like what we used to do here: 5ad15e8614 ("migration: allow enabling mutilfd for specific protocol only", 2021-10-19) Because logically the URI will only be set later after the capability is set, so it doesn't make a lot of sense to check the URI type when setting the capability, because we're checking the cap with an old URI passed in, and that may not even be the URI we're going to use later. This patch mostly reverted all such checks for before, dropping the variable migrate_allow_multi_channels and helpers. Instead, add a common helper to check URI for multi-channels for either qmp_migrate and qmp_migrate_incoming and that should do all the proper checks. The failure will only trigger with the "migrate" or "migrate_incoming" command, or when user specified "-incoming xxx" where "xxx" is not "defer". With that, make postcopy_preempt_setup() as simple as creating the channel. Signed-off-by: Peter Xu <peterx@redhat.com> --- migration/migration.c | 56 +++++++++++++++++++--------------------- migration/migration.h | 3 --- migration/multifd.c | 12 ++------- migration/postcopy-ram.c | 14 +--------- migration/postcopy-ram.h | 2 +- 5 files changed, 31 insertions(+), 56 deletions(-)