diff mbox series

migration: Remove interface query-migrationthreads

Message ID 20241011153417.516715-1-peterx@redhat.com
State New
Headers show
Series migration: Remove interface query-migrationthreads | expand

Commit Message

Peter Xu Oct. 11, 2024, 3:34 p.m. UTC
This reverts two commits:

671326201dac8fe91222ba0045709f04a8ec3af4
1b1f4ab69c41279a45ccd0d3178e83471e6e4ec1

Meanwhile it adds an entry to removed-features.rst for the
query-migrationthreads QMP command.

This patch originates from another patchset [1] that wanted to cleanup the
interface and add corresponding HMP command, as lots of things are missing
in the query report; so far it only reports the main thread and multifd
sender threads; all the rest migration threads are not reported, including
multifd recv threads.

As pointed out by Dan in the follow up discussions [1], the API is designed
in an awkward way where CPU pinning may not cover the whole lifecycle of
even the thread being reported.  When asked, we also didn't get chance to
hear from the developer who introduced this feature to explain how this API
can be properly used.

OTOH, this feature from debugging POV isn't very helpful either, as all
these information can be easily obtained by GDB.  Esepcially, if with
"-name $VM,debug-threads=on" we do already have names for each migration
threads (which covers more than multifd sender threads).

So it looks like the API isn't helpful in any form as of now, besides it
only adds maintenance burden to migration code, even if not much.

Considering that so far there's totally no justification on how to use this
interface correctly, let's remove this interface instead of cleaning it up.

In this special case, we even go beyond normal deprecation procedure,
because a deprecation process would only make sense when there are existing
users. In this specific case, we expect zero serious users with this API.

[1] https://lore.kernel.org/qemu-devel/20240930195837.825728-1-peterx@redhat.com/

Cc: Jiang Jiacheng <jiangjiacheng@huawei.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 docs/about/removed-features.rst |  6 ++++
 qapi/migration.json             | 27 --------------
 migration/threadinfo.h          | 25 -------------
 migration/migration.c           |  6 ----
 migration/multifd.c             |  5 ---
 migration/threadinfo.c          | 64 ---------------------------------
 migration/meson.build           |  1 -
 7 files changed, 6 insertions(+), 128 deletions(-)
 delete mode 100644 migration/threadinfo.h
 delete mode 100644 migration/threadinfo.c

Comments

Fabiano Rosas Oct. 11, 2024, 5:27 p.m. UTC | #1
Peter Xu <peterx@redhat.com> writes:

> This reverts two commits:
>
> 671326201dac8fe91222ba0045709f04a8ec3af4
> 1b1f4ab69c41279a45ccd0d3178e83471e6e4ec1
>
> Meanwhile it adds an entry to removed-features.rst for the
> query-migrationthreads QMP command.
>
> This patch originates from another patchset [1] that wanted to cleanup the
> interface and add corresponding HMP command, as lots of things are missing
> in the query report; so far it only reports the main thread and multifd
> sender threads; all the rest migration threads are not reported, including
> multifd recv threads.
>
> As pointed out by Dan in the follow up discussions [1], the API is designed
> in an awkward way where CPU pinning may not cover the whole lifecycle of
> even the thread being reported.  When asked, we also didn't get chance to
> hear from the developer who introduced this feature to explain how this API
> can be properly used.
>
> OTOH, this feature from debugging POV isn't very helpful either, as all
> these information can be easily obtained by GDB.  Esepcially, if with
> "-name $VM,debug-threads=on" we do already have names for each migration
> threads (which covers more than multifd sender threads).
>
> So it looks like the API isn't helpful in any form as of now, besides it
> only adds maintenance burden to migration code, even if not much.
>
> Considering that so far there's totally no justification on how to use this
> interface correctly, let's remove this interface instead of cleaning it up.
>
> In this special case, we even go beyond normal deprecation procedure,
> because a deprecation process would only make sense when there are existing
> users. In this specific case, we expect zero serious users with this API.
>
> [1] https://lore.kernel.org/qemu-devel/20240930195837.825728-1-peterx@redhat.com/
>
> Cc: Jiang Jiacheng <jiangjiacheng@huawei.com>
> Signed-off-by: Peter Xu <peterx@redhat.com>

Reviewed-by: Fabiano Rosas <farosas@suse.de>
Dr. David Alan Gilbert Oct. 11, 2024, 5:47 p.m. UTC | #2
* Peter Xu (peterx@redhat.com) wrote:
> This reverts two commits:
> 
> 671326201dac8fe91222ba0045709f04a8ec3af4
> 1b1f4ab69c41279a45ccd0d3178e83471e6e4ec1
> 
> Meanwhile it adds an entry to removed-features.rst for the
> query-migrationthreads QMP command.
> 
> This patch originates from another patchset [1] that wanted to cleanup the
> interface and add corresponding HMP command, as lots of things are missing
> in the query report; so far it only reports the main thread and multifd
> sender threads; all the rest migration threads are not reported, including
> multifd recv threads.
> 
> As pointed out by Dan in the follow up discussions [1], the API is designed
> in an awkward way where CPU pinning may not cover the whole lifecycle of
> even the thread being reported.  When asked, we also didn't get chance to
> hear from the developer who introduced this feature to explain how this API
> can be properly used.

One suggestion for replacing it, might be a way for a management interface
to add threads to a thread-pool, prior to migration; and then have migration
use threads from that pool rather than creating it's own.

Dave

> OTOH, this feature from debugging POV isn't very helpful either, as all
> these information can be easily obtained by GDB.  Esepcially, if with
> "-name $VM,debug-threads=on" we do already have names for each migration
> threads (which covers more than multifd sender threads).
> 
> So it looks like the API isn't helpful in any form as of now, besides it
> only adds maintenance burden to migration code, even if not much.
> 
> Considering that so far there's totally no justification on how to use this
> interface correctly, let's remove this interface instead of cleaning it up.
> 
> In this special case, we even go beyond normal deprecation procedure,
> because a deprecation process would only make sense when there are existing
> users. In this specific case, we expect zero serious users with this API.
> 
> [1] https://lore.kernel.org/qemu-devel/20240930195837.825728-1-peterx@redhat.com/
> 
> Cc: Jiang Jiacheng <jiangjiacheng@huawei.com>
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  docs/about/removed-features.rst |  6 ++++
>  qapi/migration.json             | 27 --------------
>  migration/threadinfo.h          | 25 -------------
>  migration/migration.c           |  6 ----
>  migration/multifd.c             |  5 ---
>  migration/threadinfo.c          | 64 ---------------------------------
>  migration/meson.build           |  1 -
>  7 files changed, 6 insertions(+), 128 deletions(-)
>  delete mode 100644 migration/threadinfo.h
>  delete mode 100644 migration/threadinfo.c
> 
> diff --git a/docs/about/removed-features.rst b/docs/about/removed-features.rst
> index c76b91a88d..02ca43dec7 100644
> --- a/docs/about/removed-features.rst
> +++ b/docs/about/removed-features.rst
> @@ -693,6 +693,12 @@ Use ``multifd-channels`` instead.
>  
>  Use ``multifd-compression`` instead.
>  
> +``query-migrationthreads`` (removed in 9.2)
> +'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
> +
> +Removed with no replacement.  For debugging purpose, please use ``-name
> +$VM,debug-threads=on`` instead.
> +
>  QEMU Machine Protocol (QMP) events
>  ----------------------------------
>  
> diff --git a/qapi/migration.json b/qapi/migration.json
> index 3af6aa1740..5520a51553 100644
> --- a/qapi/migration.json
> +++ b/qapi/migration.json
> @@ -2264,33 +2264,6 @@
>  { 'command': 'query-vcpu-dirty-limit',
>    'returns': [ 'DirtyLimitInfo' ] }
>  
> -##
> -# @MigrationThreadInfo:
> -#
> -# Information about migrationthreads
> -#
> -# @name: the name of migration thread
> -#
> -# @thread-id: ID of the underlying host thread
> -#
> -# Since: 7.2
> -##
> -{ 'struct': 'MigrationThreadInfo',
> -  'data': {'name': 'str',
> -           'thread-id': 'int'} }
> -
> -##
> -# @query-migrationthreads:
> -#
> -# Returns information of migration threads
> -#
> -# Returns: @MigrationThreadInfo
> -#
> -# Since: 7.2
> -##
> -{ 'command': 'query-migrationthreads',
> -  'returns': ['MigrationThreadInfo'] }
> -
>  ##
>  # @snapshot-save:
>  #
> diff --git a/migration/threadinfo.h b/migration/threadinfo.h
> deleted file mode 100644
> index 2f356ff312..0000000000
> --- a/migration/threadinfo.h
> +++ /dev/null
> @@ -1,25 +0,0 @@
> -/*
> - *  Migration Threads info
> - *
> - *  Copyright (c) 2022 HUAWEI TECHNOLOGIES CO., LTD.
> - *
> - *  Authors:
> - *  Jiang Jiacheng <jiangjiacheng@huawei.com>
> - *
> - *  This work is licensed under the terms of the GNU GPL, version 2 or later.
> - *  See the COPYING file in the top-level directory.
> - */
> -
> -#include "qapi/error.h"
> -#include "qapi/qapi-commands-migration.h"
> -
> -typedef struct MigrationThread MigrationThread;
> -
> -struct MigrationThread {
> -    const char *name; /* the name of migration thread */
> -    int thread_id; /* ID of the underlying host thread */
> -    QLIST_ENTRY(MigrationThread) node;
> -};
> -
> -MigrationThread *migration_threads_add(const char *name, int thread_id);
> -void migration_threads_remove(MigrationThread *info);
> diff --git a/migration/migration.c b/migration/migration.c
> index 7609e0feed..12388c451d 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -57,7 +57,6 @@
>  #include "net/announce.h"
>  #include "qemu/queue.h"
>  #include "multifd.h"
> -#include "threadinfo.h"
>  #include "qemu/yank.h"
>  #include "sysemu/cpus.h"
>  #include "yank_functions.h"
> @@ -3466,16 +3465,12 @@ static void qemu_savevm_wait_unplug(MigrationState *s, int old_state,
>  static void *migration_thread(void *opaque)
>  {
>      MigrationState *s = opaque;
> -    MigrationThread *thread = NULL;
>      int64_t setup_start = qemu_clock_get_ms(QEMU_CLOCK_HOST);
>      MigThrError thr_error;
>      bool urgent = false;
>      Error *local_err = NULL;
>      int ret;
>  
> -    thread = migration_threads_add(MIGRATION_THREAD_SRC_MAIN,
> -                                   qemu_get_thread_id());
> -
>      rcu_register_thread();
>  
>      object_ref(OBJECT(s));
> @@ -3573,7 +3568,6 @@ out:
>      migration_iteration_finish(s);
>      object_unref(OBJECT(s));
>      rcu_unregister_thread();
> -    migration_threads_remove(thread);
>      return NULL;
>  }
>  
> diff --git a/migration/multifd.c b/migration/multifd.c
> index 697fe86fdf..659022e817 100644
> --- a/migration/multifd.c
> +++ b/migration/multifd.c
> @@ -26,7 +26,6 @@
>  #include "qemu-file.h"
>  #include "trace.h"
>  #include "multifd.h"
> -#include "threadinfo.h"
>  #include "options.h"
>  #include "qemu/yank.h"
>  #include "io/channel-file.h"
> @@ -570,13 +569,10 @@ int multifd_send_sync_main(void)
>  static void *multifd_send_thread(void *opaque)
>  {
>      MultiFDSendParams *p = opaque;
> -    MigrationThread *thread = NULL;
>      Error *local_err = NULL;
>      int ret = 0;
>      bool use_packets = multifd_use_packets();
>  
> -    thread = migration_threads_add(p->name, qemu_get_thread_id());
> -
>      trace_multifd_send_thread_start(p->id);
>      rcu_register_thread();
>  
> @@ -669,7 +665,6 @@ out:
>      }
>  
>      rcu_unregister_thread();
> -    migration_threads_remove(thread);
>      trace_multifd_send_thread_end(p->id, p->packets_sent);
>  
>      return NULL;
> diff --git a/migration/threadinfo.c b/migration/threadinfo.c
> deleted file mode 100644
> index 262990dd75..0000000000
> --- a/migration/threadinfo.c
> +++ /dev/null
> @@ -1,64 +0,0 @@
> -/*
> - *  Migration Threads info
> - *
> - *  Copyright (c) 2022 HUAWEI TECHNOLOGIES CO., LTD.
> - *
> - *  Authors:
> - *  Jiang Jiacheng <jiangjiacheng@huawei.com>
> - *
> - *  This work is licensed under the terms of the GNU GPL, version 2 or later.
> - *  See the COPYING file in the top-level directory.
> - */
> -
> -#include "qemu/osdep.h"
> -#include "qemu/queue.h"
> -#include "qemu/lockable.h"
> -#include "threadinfo.h"
> -
> -QemuMutex migration_threads_lock;
> -static QLIST_HEAD(, MigrationThread) migration_threads;
> -
> -static void __attribute__((constructor)) migration_threads_init(void)
> -{
> -    qemu_mutex_init(&migration_threads_lock);
> -}
> -
> -MigrationThread *migration_threads_add(const char *name, int thread_id)
> -{
> -    MigrationThread *thread =  g_new0(MigrationThread, 1);
> -    thread->name = name;
> -    thread->thread_id = thread_id;
> -
> -    WITH_QEMU_LOCK_GUARD(&migration_threads_lock) {
> -        QLIST_INSERT_HEAD(&migration_threads, thread, node);
> -    }
> -
> -    return thread;
> -}
> -
> -void migration_threads_remove(MigrationThread *thread)
> -{
> -    QEMU_LOCK_GUARD(&migration_threads_lock);
> -    if (thread) {
> -        QLIST_REMOVE(thread, node);
> -        g_free(thread);
> -    }
> -}
> -
> -MigrationThreadInfoList *qmp_query_migrationthreads(Error **errp)
> -{
> -    MigrationThreadInfoList *head = NULL;
> -    MigrationThreadInfoList **tail = &head;
> -    MigrationThread *thread = NULL;
> -
> -    QEMU_LOCK_GUARD(&migration_threads_lock);
> -    QLIST_FOREACH(thread, &migration_threads, node) {
> -        MigrationThreadInfo *info = g_new0(MigrationThreadInfo, 1);
> -        info->name = g_strdup(thread->name);
> -        info->thread_id = thread->thread_id;
> -
> -        QAPI_LIST_APPEND(tail, info);
> -    }
> -
> -    return head;
> -}
> diff --git a/migration/meson.build b/migration/meson.build
> index 66d3de86f0..28a680e5e2 100644
> --- a/migration/meson.build
> +++ b/migration/meson.build
> @@ -29,7 +29,6 @@ system_ss.add(files(
>    'savevm.c',
>    'socket.c',
>    'tls.c',
> -  'threadinfo.c',
>  ), gnutls, zlib)
>  
>  if get_option('replication').allowed()
> -- 
> 2.45.0
>
Daniel P. Berrangé Oct. 14, 2024, 10:18 a.m. UTC | #3
On Fri, Oct 11, 2024 at 11:34:17AM -0400, Peter Xu wrote:
> This reverts two commits:
> 
> 671326201dac8fe91222ba0045709f04a8ec3af4
> 1b1f4ab69c41279a45ccd0d3178e83471e6e4ec1
> 
> Meanwhile it adds an entry to removed-features.rst for the
> query-migrationthreads QMP command.
> 
> This patch originates from another patchset [1] that wanted to cleanup the
> interface and add corresponding HMP command, as lots of things are missing
> in the query report; so far it only reports the main thread and multifd
> sender threads; all the rest migration threads are not reported, including
> multifd recv threads.
> 
> As pointed out by Dan in the follow up discussions [1], the API is designed
> in an awkward way where CPU pinning may not cover the whole lifecycle of
> even the thread being reported.  When asked, we also didn't get chance to
> hear from the developer who introduced this feature to explain how this API
> can be properly used.
> 
> OTOH, this feature from debugging POV isn't very helpful either, as all
> these information can be easily obtained by GDB.  Esepcially, if with
> "-name $VM,debug-threads=on" we do already have names for each migration
> threads (which covers more than multifd sender threads).
> 
> So it looks like the API isn't helpful in any form as of now, besides it
> only adds maintenance burden to migration code, even if not much.
> 
> Considering that so far there's totally no justification on how to use this
> interface correctly, let's remove this interface instead of cleaning it up.
> 
> In this special case, we even go beyond normal deprecation procedure,
> because a deprecation process would only make sense when there are existing
> users. In this specific case, we expect zero serious users with this API.

We have no way of knowing whether there are existing users of this, or
any other feature in QEMU. This is why we have a formal deprecation
period, rather than immediately deleting existing features.

Yes, there are plenty of reasons why this feature is sub-optimal, but
it is not broken to the extent that it is *impossible* for people to
be using it.

IOW, I don't see that there's anything special here to justify bypassing
our deprecation process here.


With regards,
Daniel
Fabiano Rosas Oct. 14, 2024, 2:22 p.m. UTC | #4
Daniel P. Berrangé <berrange@redhat.com> writes:

> On Fri, Oct 11, 2024 at 11:34:17AM -0400, Peter Xu wrote:
>> This reverts two commits:
>> 
>> 671326201dac8fe91222ba0045709f04a8ec3af4
>> 1b1f4ab69c41279a45ccd0d3178e83471e6e4ec1
>> 
>> Meanwhile it adds an entry to removed-features.rst for the
>> query-migrationthreads QMP command.
>> 
>> This patch originates from another patchset [1] that wanted to cleanup the
>> interface and add corresponding HMP command, as lots of things are missing
>> in the query report; so far it only reports the main thread and multifd
>> sender threads; all the rest migration threads are not reported, including
>> multifd recv threads.
>> 
>> As pointed out by Dan in the follow up discussions [1], the API is designed
>> in an awkward way where CPU pinning may not cover the whole lifecycle of
>> even the thread being reported.  When asked, we also didn't get chance to
>> hear from the developer who introduced this feature to explain how this API
>> can be properly used.
>> 
>> OTOH, this feature from debugging POV isn't very helpful either, as all
>> these information can be easily obtained by GDB.  Esepcially, if with
>> "-name $VM,debug-threads=on" we do already have names for each migration
>> threads (which covers more than multifd sender threads).
>> 
>> So it looks like the API isn't helpful in any form as of now, besides it
>> only adds maintenance burden to migration code, even if not much.
>> 
>> Considering that so far there's totally no justification on how to use this
>> interface correctly, let's remove this interface instead of cleaning it up.
>> 
>> In this special case, we even go beyond normal deprecation procedure,
>> because a deprecation process would only make sense when there are existing
>> users. In this specific case, we expect zero serious users with this API.
>
> We have no way of knowing whether there are existing users of this, or
> any other feature in QEMU. This is why we have a formal deprecation
> period, rather than immediately deleting existing features.
>
> Yes, there are plenty of reasons why this feature is sub-optimal, but
> it is not broken to the extent that it is *impossible* for people to
> be using it.
>
> IOW, I don't see that there's anything special here to justify bypassing
> our deprecation process here.

I have no dog in this race, but as a data point, I see that this was
submitted to libvirt as a new migrationpin command:

https://lists.libvirt.org/archives/list/devel@lists.libvirt.org/thread/FVNAUEVIMLG6F2VCRKHZDUEOLBJCXQHO/#BVEGJVZMMLQMXE263GO5BSIWUDIYIFZU

>
>
> With regards,
> Daniel
Daniel P. Berrangé Oct. 14, 2024, 2:44 p.m. UTC | #5
On Mon, Oct 14, 2024 at 11:22:21AM -0300, Fabiano Rosas wrote:
> Daniel P. Berrangé <berrange@redhat.com> writes:
> 
> > On Fri, Oct 11, 2024 at 11:34:17AM -0400, Peter Xu wrote:
> >> This reverts two commits:
> >> 
> >> 671326201dac8fe91222ba0045709f04a8ec3af4
> >> 1b1f4ab69c41279a45ccd0d3178e83471e6e4ec1
> >> 
> >> Meanwhile it adds an entry to removed-features.rst for the
> >> query-migrationthreads QMP command.
> >> 
> >> This patch originates from another patchset [1] that wanted to cleanup the
> >> interface and add corresponding HMP command, as lots of things are missing
> >> in the query report; so far it only reports the main thread and multifd
> >> sender threads; all the rest migration threads are not reported, including
> >> multifd recv threads.
> >> 
> >> As pointed out by Dan in the follow up discussions [1], the API is designed
> >> in an awkward way where CPU pinning may not cover the whole lifecycle of
> >> even the thread being reported.  When asked, we also didn't get chance to
> >> hear from the developer who introduced this feature to explain how this API
> >> can be properly used.
> >> 
> >> OTOH, this feature from debugging POV isn't very helpful either, as all
> >> these information can be easily obtained by GDB.  Esepcially, if with
> >> "-name $VM,debug-threads=on" we do already have names for each migration
> >> threads (which covers more than multifd sender threads).
> >> 
> >> So it looks like the API isn't helpful in any form as of now, besides it
> >> only adds maintenance burden to migration code, even if not much.
> >> 
> >> Considering that so far there's totally no justification on how to use this
> >> interface correctly, let's remove this interface instead of cleaning it up.
> >> 
> >> In this special case, we even go beyond normal deprecation procedure,
> >> because a deprecation process would only make sense when there are existing
> >> users. In this specific case, we expect zero serious users with this API.
> >
> > We have no way of knowing whether there are existing users of this, or
> > any other feature in QEMU. This is why we have a formal deprecation
> > period, rather than immediately deleting existing features.
> >
> > Yes, there are plenty of reasons why this feature is sub-optimal, but
> > it is not broken to the extent that it is *impossible* for people to
> > be using it.
> >
> > IOW, I don't see that there's anything special here to justify bypassing
> > our deprecation process here.
> 
> I have no dog in this race, but as a data point, I see that this was
> submitted to libvirt as a new migrationpin command:
> 
> https://lists.libvirt.org/archives/list/devel@lists.libvirt.org/thread/FVNAUEVIMLG6F2VCRKHZDUEOLBJCXQHO/#BVEGJVZMMLQMXE263GO5BSIWUDIYIFZU

And unforunately it seems we dropped the ball on reviewing the v2 of
their series and they never ping'd for a response, so this was not
merged. Possibly they're just running it as a local patch to libvirt...


With regards,
Daniel
Peter Xu Oct. 15, 2024, 2:19 p.m. UTC | #6
On Mon, Oct 14, 2024 at 03:44:13PM +0100, Daniel P. Berrangé wrote:
> On Mon, Oct 14, 2024 at 11:22:21AM -0300, Fabiano Rosas wrote:
> > Daniel P. Berrangé <berrange@redhat.com> writes:
> > 
> > > On Fri, Oct 11, 2024 at 11:34:17AM -0400, Peter Xu wrote:
> > >> This reverts two commits:
> > >> 
> > >> 671326201dac8fe91222ba0045709f04a8ec3af4
> > >> 1b1f4ab69c41279a45ccd0d3178e83471e6e4ec1
> > >> 
> > >> Meanwhile it adds an entry to removed-features.rst for the
> > >> query-migrationthreads QMP command.
> > >> 
> > >> This patch originates from another patchset [1] that wanted to cleanup the
> > >> interface and add corresponding HMP command, as lots of things are missing
> > >> in the query report; so far it only reports the main thread and multifd
> > >> sender threads; all the rest migration threads are not reported, including
> > >> multifd recv threads.
> > >> 
> > >> As pointed out by Dan in the follow up discussions [1], the API is designed
> > >> in an awkward way where CPU pinning may not cover the whole lifecycle of
> > >> even the thread being reported.  When asked, we also didn't get chance to
> > >> hear from the developer who introduced this feature to explain how this API
> > >> can be properly used.
> > >> 
> > >> OTOH, this feature from debugging POV isn't very helpful either, as all
> > >> these information can be easily obtained by GDB.  Esepcially, if with
> > >> "-name $VM,debug-threads=on" we do already have names for each migration
> > >> threads (which covers more than multifd sender threads).
> > >> 
> > >> So it looks like the API isn't helpful in any form as of now, besides it
> > >> only adds maintenance burden to migration code, even if not much.
> > >> 
> > >> Considering that so far there's totally no justification on how to use this
> > >> interface correctly, let's remove this interface instead of cleaning it up.
> > >> 
> > >> In this special case, we even go beyond normal deprecation procedure,
> > >> because a deprecation process would only make sense when there are existing
> > >> users. In this specific case, we expect zero serious users with this API.
> > >
> > > We have no way of knowing whether there are existing users of this, or
> > > any other feature in QEMU. This is why we have a formal deprecation
> > > period, rather than immediately deleting existing features.
> > >
> > > Yes, there are plenty of reasons why this feature is sub-optimal, but
> > > it is not broken to the extent that it is *impossible* for people to
> > > be using it.
> > >
> > > IOW, I don't see that there's anything special here to justify bypassing
> > > our deprecation process here.
> > 
> > I have no dog in this race, but as a data point, I see that this was
> > submitted to libvirt as a new migrationpin command:
> > 
> > https://lists.libvirt.org/archives/list/devel@lists.libvirt.org/thread/FVNAUEVIMLG6F2VCRKHZDUEOLBJCXQHO/#BVEGJVZMMLQMXE263GO5BSIWUDIYIFZU
> 
> And unforunately it seems we dropped the ball on reviewing the v2 of
> their series and they never ping'd for a response, so this was not
> merged. Possibly they're just running it as a local patch to libvirt...

If any of us wants to make this working properly, we don't need to rush
removing or deprecating this. Instead we may want to discuss the proper
interface, using the new qmp cmd name that Markus suggested, then (maybe)
deprecate the old name only.

I proposed direct removal majorly because of the two things I mentioned in
the commit message as major design issues:

  (1) Migration threads are dynamic so created with short windows that
  they're not be able to be pinned by the mgmt.

  (2) Only sender threads are reported, while at least recv threads are
  equally important in this case.

I proposed direct removal not only because I want to avoid further fuss on
this API from any migration developer - I think we already spent more time
than we needed.. on this 100+ LOCs.  While, this can be reintroduced in the
proper way again when necessary easily.  Meanwhile I also wanted to avoid
new users see this API at all and use it ignoring dest threads even during
deprecation process.

Jiacheng is in the loop.  Jiacheng, do you want to chime in here?

Thanks,
Daniel P. Berrangé Oct. 15, 2024, 2:30 p.m. UTC | #7
On Tue, Oct 15, 2024 at 10:19:22AM -0400, Peter Xu wrote:
> On Mon, Oct 14, 2024 at 03:44:13PM +0100, Daniel P. Berrangé wrote:
> > On Mon, Oct 14, 2024 at 11:22:21AM -0300, Fabiano Rosas wrote:
> > > Daniel P. Berrangé <berrange@redhat.com> writes:
> > > 
> > > > On Fri, Oct 11, 2024 at 11:34:17AM -0400, Peter Xu wrote:
> > > >> This reverts two commits:
> > > >> 
> > > >> 671326201dac8fe91222ba0045709f04a8ec3af4
> > > >> 1b1f4ab69c41279a45ccd0d3178e83471e6e4ec1
> > > >> 
> > > >> Meanwhile it adds an entry to removed-features.rst for the
> > > >> query-migrationthreads QMP command.
> > > >> 
> > > >> This patch originates from another patchset [1] that wanted to cleanup the
> > > >> interface and add corresponding HMP command, as lots of things are missing
> > > >> in the query report; so far it only reports the main thread and multifd
> > > >> sender threads; all the rest migration threads are not reported, including
> > > >> multifd recv threads.
> > > >> 
> > > >> As pointed out by Dan in the follow up discussions [1], the API is designed
> > > >> in an awkward way where CPU pinning may not cover the whole lifecycle of
> > > >> even the thread being reported.  When asked, we also didn't get chance to
> > > >> hear from the developer who introduced this feature to explain how this API
> > > >> can be properly used.
> > > >> 
> > > >> OTOH, this feature from debugging POV isn't very helpful either, as all
> > > >> these information can be easily obtained by GDB.  Esepcially, if with
> > > >> "-name $VM,debug-threads=on" we do already have names for each migration
> > > >> threads (which covers more than multifd sender threads).
> > > >> 
> > > >> So it looks like the API isn't helpful in any form as of now, besides it
> > > >> only adds maintenance burden to migration code, even if not much.
> > > >> 
> > > >> Considering that so far there's totally no justification on how to use this
> > > >> interface correctly, let's remove this interface instead of cleaning it up.
> > > >> 
> > > >> In this special case, we even go beyond normal deprecation procedure,
> > > >> because a deprecation process would only make sense when there are existing
> > > >> users. In this specific case, we expect zero serious users with this API.
> > > >
> > > > We have no way of knowing whether there are existing users of this, or
> > > > any other feature in QEMU. This is why we have a formal deprecation
> > > > period, rather than immediately deleting existing features.
> > > >
> > > > Yes, there are plenty of reasons why this feature is sub-optimal, but
> > > > it is not broken to the extent that it is *impossible* for people to
> > > > be using it.
> > > >
> > > > IOW, I don't see that there's anything special here to justify bypassing
> > > > our deprecation process here.
> > > 
> > > I have no dog in this race, but as a data point, I see that this was
> > > submitted to libvirt as a new migrationpin command:
> > > 
> > > https://lists.libvirt.org/archives/list/devel@lists.libvirt.org/thread/FVNAUEVIMLG6F2VCRKHZDUEOLBJCXQHO/#BVEGJVZMMLQMXE263GO5BSIWUDIYIFZU
> > 
> > And unforunately it seems we dropped the ball on reviewing the v2 of
> > their series and they never ping'd for a response, so this was not
> > merged. Possibly they're just running it as a local patch to libvirt...
> 
> If any of us wants to make this working properly, we don't need to rush
> removing or deprecating this. Instead we may want to discuss the proper
> interface, using the new qmp cmd name that Markus suggested, then (maybe)
> deprecate the old name only.
> 
> I proposed direct removal majorly because of the two things I mentioned in
> the commit message as major design issues:
> 
>   (1) Migration threads are dynamic so created with short windows that
>   they're not be able to be pinned by the mgmt.
> 
>   (2) Only sender threads are reported, while at least recv threads are
>   equally important in this case.
> 
> I proposed direct removal not only because I want to avoid further fuss on
> this API from any migration developer - I think we already spent more time
> than we needed.. on this 100+ LOCs.  While, this can be reintroduced in the
> proper way again when necessary easily.  Meanwhile I also wanted to avoid
> new users see this API at all and use it ignoring dest threads even during
> deprecation process.

None of those are reasons to bypass the deprecation process.

As annoying as the current design is, we've released it and it
is subject to our usual QAPI stability promise.

The deprecation of the feature is sufficient warning for anyone who
might contemplate using it, to either decide to NOT use it, or to
tell us why we should keep it.

With regards,
Daniel
Peter Xu Oct. 15, 2024, 3:31 p.m. UTC | #8
On Fri, Oct 11, 2024 at 05:47:03PM +0000, Dr. David Alan Gilbert wrote:
> * Peter Xu (peterx@redhat.com) wrote:
> > This reverts two commits:
> > 
> > 671326201dac8fe91222ba0045709f04a8ec3af4
> > 1b1f4ab69c41279a45ccd0d3178e83471e6e4ec1
> > 
> > Meanwhile it adds an entry to removed-features.rst for the
> > query-migrationthreads QMP command.
> > 
> > This patch originates from another patchset [1] that wanted to cleanup the
> > interface and add corresponding HMP command, as lots of things are missing
> > in the query report; so far it only reports the main thread and multifd
> > sender threads; all the rest migration threads are not reported, including
> > multifd recv threads.
> > 
> > As pointed out by Dan in the follow up discussions [1], the API is designed
> > in an awkward way where CPU pinning may not cover the whole lifecycle of
> > even the thread being reported.  When asked, we also didn't get chance to
> > hear from the developer who introduced this feature to explain how this API
> > can be properly used.
> 
> One suggestion for replacing it, might be a way for a management interface
> to add threads to a thread-pool, prior to migration; and then have migration
> use threads from that pool rather than creating it's own.

It could work indeed.  Though that may require some pool QMP command to
either specify cpu pinning or making the thread all name-based, so that the
names need to be passed over to migration again with another API, which
might be slightly complicated.

I think if we do want to support this feature seriously, either we bare
with that short window (can still use a hack to make bw=0 prior to that),
or we can add one stage of migration so it SETUPs everything and halt there
just like what -S for QEMU in general, until another command to kick off
the migration process.

I remember CPR had such a discussion, that if we have such a mechanism, CPR
state (which must be sent earlier than migration stream) can be sent during
this SETUP-only peroid too.  Since we don't have that, the latest
CPR-transfer is relying on something else (a HUP event on CPR channel sent
from dest by closing it) to identify that window, so that src will block at
SETUP phase until the HUP event from destination CPR channel arrives,
saying "cpr state is loaded, let's continue the migration process".

Thanks,
diff mbox series

Patch

diff --git a/docs/about/removed-features.rst b/docs/about/removed-features.rst
index c76b91a88d..02ca43dec7 100644
--- a/docs/about/removed-features.rst
+++ b/docs/about/removed-features.rst
@@ -693,6 +693,12 @@  Use ``multifd-channels`` instead.
 
 Use ``multifd-compression`` instead.
 
+``query-migrationthreads`` (removed in 9.2)
+'''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
+
+Removed with no replacement.  For debugging purpose, please use ``-name
+$VM,debug-threads=on`` instead.
+
 QEMU Machine Protocol (QMP) events
 ----------------------------------
 
diff --git a/qapi/migration.json b/qapi/migration.json
index 3af6aa1740..5520a51553 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -2264,33 +2264,6 @@ 
 { 'command': 'query-vcpu-dirty-limit',
   'returns': [ 'DirtyLimitInfo' ] }
 
-##
-# @MigrationThreadInfo:
-#
-# Information about migrationthreads
-#
-# @name: the name of migration thread
-#
-# @thread-id: ID of the underlying host thread
-#
-# Since: 7.2
-##
-{ 'struct': 'MigrationThreadInfo',
-  'data': {'name': 'str',
-           'thread-id': 'int'} }
-
-##
-# @query-migrationthreads:
-#
-# Returns information of migration threads
-#
-# Returns: @MigrationThreadInfo
-#
-# Since: 7.2
-##
-{ 'command': 'query-migrationthreads',
-  'returns': ['MigrationThreadInfo'] }
-
 ##
 # @snapshot-save:
 #
diff --git a/migration/threadinfo.h b/migration/threadinfo.h
deleted file mode 100644
index 2f356ff312..0000000000
--- a/migration/threadinfo.h
+++ /dev/null
@@ -1,25 +0,0 @@ 
-/*
- *  Migration Threads info
- *
- *  Copyright (c) 2022 HUAWEI TECHNOLOGIES CO., LTD.
- *
- *  Authors:
- *  Jiang Jiacheng <jiangjiacheng@huawei.com>
- *
- *  This work is licensed under the terms of the GNU GPL, version 2 or later.
- *  See the COPYING file in the top-level directory.
- */
-
-#include "qapi/error.h"
-#include "qapi/qapi-commands-migration.h"
-
-typedef struct MigrationThread MigrationThread;
-
-struct MigrationThread {
-    const char *name; /* the name of migration thread */
-    int thread_id; /* ID of the underlying host thread */
-    QLIST_ENTRY(MigrationThread) node;
-};
-
-MigrationThread *migration_threads_add(const char *name, int thread_id);
-void migration_threads_remove(MigrationThread *info);
diff --git a/migration/migration.c b/migration/migration.c
index 7609e0feed..12388c451d 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -57,7 +57,6 @@ 
 #include "net/announce.h"
 #include "qemu/queue.h"
 #include "multifd.h"
-#include "threadinfo.h"
 #include "qemu/yank.h"
 #include "sysemu/cpus.h"
 #include "yank_functions.h"
@@ -3466,16 +3465,12 @@  static void qemu_savevm_wait_unplug(MigrationState *s, int old_state,
 static void *migration_thread(void *opaque)
 {
     MigrationState *s = opaque;
-    MigrationThread *thread = NULL;
     int64_t setup_start = qemu_clock_get_ms(QEMU_CLOCK_HOST);
     MigThrError thr_error;
     bool urgent = false;
     Error *local_err = NULL;
     int ret;
 
-    thread = migration_threads_add(MIGRATION_THREAD_SRC_MAIN,
-                                   qemu_get_thread_id());
-
     rcu_register_thread();
 
     object_ref(OBJECT(s));
@@ -3573,7 +3568,6 @@  out:
     migration_iteration_finish(s);
     object_unref(OBJECT(s));
     rcu_unregister_thread();
-    migration_threads_remove(thread);
     return NULL;
 }
 
diff --git a/migration/multifd.c b/migration/multifd.c
index 697fe86fdf..659022e817 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -26,7 +26,6 @@ 
 #include "qemu-file.h"
 #include "trace.h"
 #include "multifd.h"
-#include "threadinfo.h"
 #include "options.h"
 #include "qemu/yank.h"
 #include "io/channel-file.h"
@@ -570,13 +569,10 @@  int multifd_send_sync_main(void)
 static void *multifd_send_thread(void *opaque)
 {
     MultiFDSendParams *p = opaque;
-    MigrationThread *thread = NULL;
     Error *local_err = NULL;
     int ret = 0;
     bool use_packets = multifd_use_packets();
 
-    thread = migration_threads_add(p->name, qemu_get_thread_id());
-
     trace_multifd_send_thread_start(p->id);
     rcu_register_thread();
 
@@ -669,7 +665,6 @@  out:
     }
 
     rcu_unregister_thread();
-    migration_threads_remove(thread);
     trace_multifd_send_thread_end(p->id, p->packets_sent);
 
     return NULL;
diff --git a/migration/threadinfo.c b/migration/threadinfo.c
deleted file mode 100644
index 262990dd75..0000000000
--- a/migration/threadinfo.c
+++ /dev/null
@@ -1,64 +0,0 @@ 
-/*
- *  Migration Threads info
- *
- *  Copyright (c) 2022 HUAWEI TECHNOLOGIES CO., LTD.
- *
- *  Authors:
- *  Jiang Jiacheng <jiangjiacheng@huawei.com>
- *
- *  This work is licensed under the terms of the GNU GPL, version 2 or later.
- *  See the COPYING file in the top-level directory.
- */
-
-#include "qemu/osdep.h"
-#include "qemu/queue.h"
-#include "qemu/lockable.h"
-#include "threadinfo.h"
-
-QemuMutex migration_threads_lock;
-static QLIST_HEAD(, MigrationThread) migration_threads;
-
-static void __attribute__((constructor)) migration_threads_init(void)
-{
-    qemu_mutex_init(&migration_threads_lock);
-}
-
-MigrationThread *migration_threads_add(const char *name, int thread_id)
-{
-    MigrationThread *thread =  g_new0(MigrationThread, 1);
-    thread->name = name;
-    thread->thread_id = thread_id;
-
-    WITH_QEMU_LOCK_GUARD(&migration_threads_lock) {
-        QLIST_INSERT_HEAD(&migration_threads, thread, node);
-    }
-
-    return thread;
-}
-
-void migration_threads_remove(MigrationThread *thread)
-{
-    QEMU_LOCK_GUARD(&migration_threads_lock);
-    if (thread) {
-        QLIST_REMOVE(thread, node);
-        g_free(thread);
-    }
-}
-
-MigrationThreadInfoList *qmp_query_migrationthreads(Error **errp)
-{
-    MigrationThreadInfoList *head = NULL;
-    MigrationThreadInfoList **tail = &head;
-    MigrationThread *thread = NULL;
-
-    QEMU_LOCK_GUARD(&migration_threads_lock);
-    QLIST_FOREACH(thread, &migration_threads, node) {
-        MigrationThreadInfo *info = g_new0(MigrationThreadInfo, 1);
-        info->name = g_strdup(thread->name);
-        info->thread_id = thread->thread_id;
-
-        QAPI_LIST_APPEND(tail, info);
-    }
-
-    return head;
-}
diff --git a/migration/meson.build b/migration/meson.build
index 66d3de86f0..28a680e5e2 100644
--- a/migration/meson.build
+++ b/migration/meson.build
@@ -29,7 +29,6 @@  system_ss.add(files(
   'savevm.c',
   'socket.c',
   'tls.c',
-  'threadinfo.c',
 ), gnutls, zlib)
 
 if get_option('replication').allowed()