[v2,1/6] migration/multifd: Join the TLS thread

Message ID 20240205194929.28963-2-farosas@suse.de
State New
Series migration/multifd: Fix channel creation vs. cleanup races

Commit Message

Fabiano Rosas Feb. 5, 2024, 7:49 p.m. UTC
We're currently leaking the resources of the TLS thread by not joining
it and also overwriting the p->thread pointer altogether.

Fixes: a1af605bd5 ("migration/multifd: fix hangup with TLS-Multifd due to blocking handshake")
Cc: qemu-stable <qemu-stable@nongnu.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 migration/multifd.c | 8 +++++++-
 migration/multifd.h | 2 ++
 2 files changed, 9 insertions(+), 1 deletion(-)

Comments

Daniel P. Berrangé Feb. 6, 2024, 8:53 a.m. UTC | #1
On Mon, Feb 05, 2024 at 04:49:24PM -0300, Fabiano Rosas wrote:
> We're currently leaking the resources of the TLS thread by not joining
> it and also overwriting the p->thread pointer altogether.

AFAICS, it is not overwriting 'p->thread' because at the time when the
TLS thread is created, the main 'send thread' has not yet been
created. The TLS thread and send thread execution times are mutually
exclusive.

The 'p->running' flag is already set to true when the TLS thread is
created, so the existing cleanup should be working too, so I'm not
seeing a bug that needs fixing here.

> 
> Fixes: a1af605bd5 ("migration/multifd: fix hangup with TLS-Multifd due to blocking handshake")
> Cc: qemu-stable <qemu-stable@nongnu.org>
> Reviewed-by: Peter Xu <peterx@redhat.com>
> Signed-off-by: Fabiano Rosas <farosas@suse.de>
> ---
>  migration/multifd.c | 8 +++++++-
>  migration/multifd.h | 2 ++
>  2 files changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/migration/multifd.c b/migration/multifd.c
> index ef13e2e781..8195c1daf3 100644
> --- a/migration/multifd.c
> +++ b/migration/multifd.c
> @@ -630,6 +630,10 @@ static void multifd_send_terminate_threads(void)
>      for (i = 0; i < migrate_multifd_channels(); i++) {
>          MultiFDSendParams *p = &multifd_send_state->params[i];
>  
> +        if (p->tls_thread_created) {
> +            qemu_thread_join(&p->tls_thread);
> +        }
> +
>          if (p->running) {
>              qemu_thread_join(&p->thread);
>          }
> @@ -921,7 +925,9 @@ static bool multifd_tls_channel_connect(MultiFDSendParams *p,
>      trace_multifd_tls_outgoing_handshake_start(ioc, tioc, hostname);
>      qio_channel_set_name(QIO_CHANNEL(tioc), "multifd-tls-outgoing");
>      p->c = QIO_CHANNEL(tioc);
> -    qemu_thread_create(&p->thread, "multifd-tls-handshake-worker",
> +
> +    p->tls_thread_created = true;
> +    qemu_thread_create(&p->tls_thread, "multifd-tls-handshake-worker",
>                         multifd_tls_handshake_thread, p,
>                         QEMU_THREAD_JOINABLE);
>      return true;
> diff --git a/migration/multifd.h b/migration/multifd.h
> index 78a2317263..720c9d50db 100644
> --- a/migration/multifd.h
> +++ b/migration/multifd.h
> @@ -73,6 +73,8 @@ typedef struct {
>      char *name;
>      /* channel thread id */
>      QemuThread thread;
> +    QemuThread tls_thread;
> +    bool tls_thread_created;
>      /* communication channel */
>      QIOChannel *c;
>      /* is the yank function registered */
> -- 
> 2.35.3
> 

With regards,
Daniel

Peter Xu Feb. 6, 2024, 9:15 a.m. UTC | #2
On Tue, Feb 06, 2024 at 08:53:45AM +0000, Daniel P. Berrangé wrote:
> AFAICS, it is not overwriting 'p->thread' because at the time when the
> TLS thread is created, the main 'send thread' has not yet been
> created. The TLS thread and send thread execution times are mutually
> exclusive.

IIUC it'll be overwritten after the TLS handshake, when the TLS thread
calls multifd_channel_connect() to create the final multifd send thread
using the same p->thread variable:

    qemu_thread_create(&p->thread, p->name, multifd_send_thread, p,
                       QEMU_THREAD_JOINABLE);

That overwrites the handle stored in p->thread for the TLS thread, so the
TLS thread's resources are leaked until QEMU quits, because both threads
are created as JOINABLE.
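
To illustrate the overwrite described above, here is a minimal sketch using
plain POSIX threads rather than QEMU's QemuThread API. Every name in it
(leaked_threads_with_shared_handle, handshake_worker, send_worker) is
hypothetical, and the second thread is created sequentially by the caller
instead of from inside the handshake thread, which is a simplification of
what multifd does:

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-ins for the TLS handshake and send threads. */
static void *handshake_worker(void *opaque)
{
    (void)opaque;
    return NULL;
}

static void *send_worker(void *opaque)
{
    (void)opaque;
    return NULL;
}

/*
 * Create two joinable threads through ONE handle variable and join
 * through that variable once. Returns the number of threads the normal
 * cleanup path could not join.
 */
int leaked_threads_with_shared_handle(void)
{
    pthread_t thread;   /* plays the role of the shared handle */
    pthread_t lost;     /* copy kept only so this demo can clean up */
    int created = 0, joined = 0, rc;

    rc = pthread_create(&thread, NULL, handshake_worker, NULL);
    assert(rc == 0);
    created++;
    lost = thread;

    /* Second create overwrites the only handle to the first thread. */
    rc = pthread_create(&thread, NULL, send_worker, NULL);
    assert(rc == 0);
    created++;

    /* Cleanup can only reach the thread the handle points at now. */
    rc = pthread_join(thread, NULL);
    assert(rc == 0);
    joined++;

    int leaked = created - joined;

    /* Demo-only cleanup; the real code had no such copy to join. */
    rc = pthread_join(lost, NULL);
    assert(rc == 0);
    (void)rc;

    return leaked;
}
```

Under these assumptions the function reports one unjoined thread, which
corresponds to the TLS handshake thread in the discussion.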

Thanks,

Daniel P. Berrangé Feb. 6, 2024, 10:06 a.m. UTC | #3
On Tue, Feb 06, 2024 at 05:15:07PM +0800, Peter Xu wrote:
> On Tue, Feb 06, 2024 at 08:53:45AM +0000, Daniel P. Berrangé wrote:
> > AFAICS, it is not overwriting 'p->thread' because at the time when the
> > TLS thread is created, the main 'send thread' has not yet been
> > created. The TLS thread and send thread execution times are mutually
> > exclusive.
> 
> IIUC it'll be overwritten after the TLS handshake, when the TLS thread
> calls multifd_channel_connect() to create the final multifd send thread
> using the same p->thread variable:
> 
>     qemu_thread_create(&p->thread, p->name, multifd_send_thread, p,
>                        QEMU_THREAD_JOINABLE);
> 
> That overwrites the handle stored in p->thread for the TLS thread, so the
> TLS thread's resources are leaked until QEMU quits, because both threads
> are created as JOINABLE.

Ah yes, missed that, you're right.

With regards,
Daniel
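
For readers following the thread, the pattern the patch adopts (a separate
handle plus a "created" flag, joined first during cleanup) can be sketched
with plain POSIX threads. This is not the QEMU code: DemoChannel and the
demo_* functions are hypothetical, and the handling of the running flag is
simplified compared with multifd.c.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-channel state; not MultiFDSendParams. */
typedef struct {
    pthread_t thread;           /* send thread handle */
    bool running;               /* set once the send thread exists */
    pthread_t tls_thread;       /* dedicated handshake thread handle */
    bool tls_thread_created;
} DemoChannel;

static void *demo_send_thread(void *opaque)
{
    (void)opaque;
    return NULL;
}

/* The handshake thread creates the send thread, using its own field. */
static void *demo_tls_handshake_thread(void *opaque)
{
    DemoChannel *p = opaque;
    int rc = pthread_create(&p->thread, NULL, demo_send_thread, p);

    assert(rc == 0);
    (void)rc;
    p->running = true;
    return NULL;
}

void demo_channel_connect(DemoChannel *p)
{
    int rc;

    /* Flag is set before creation, mirroring the order in the patch. */
    p->tls_thread_created = true;
    rc = pthread_create(&p->tls_thread, NULL, demo_tls_handshake_thread, p);
    assert(rc == 0);
    (void)rc;
}

/* Join the handshake thread first, then the send thread. Returns joins. */
int demo_terminate_threads(DemoChannel *p)
{
    int joined = 0;

    if (p->tls_thread_created) {
        /* After this join, the writes to p->thread and p->running are visible. */
        pthread_join(p->tls_thread, NULL);
        joined++;
    }
    if (p->running) {
        pthread_join(p->thread, NULL);
        joined++;
    }
    return joined;
}
```

With a zero-initialized DemoChannel, calling demo_channel_connect() and
then demo_terminate_threads() joins both threads. Joining the handshake
thread first also means the sketch needs no extra locking around
p->thread and p->running.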

Patch

diff --git a/migration/multifd.c b/migration/multifd.c
index ef13e2e781..8195c1daf3 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -630,6 +630,10 @@  static void multifd_send_terminate_threads(void)
     for (i = 0; i < migrate_multifd_channels(); i++) {
         MultiFDSendParams *p = &multifd_send_state->params[i];
 
+        if (p->tls_thread_created) {
+            qemu_thread_join(&p->tls_thread);
+        }
+
         if (p->running) {
             qemu_thread_join(&p->thread);
         }
@@ -921,7 +925,9 @@  static bool multifd_tls_channel_connect(MultiFDSendParams *p,
     trace_multifd_tls_outgoing_handshake_start(ioc, tioc, hostname);
     qio_channel_set_name(QIO_CHANNEL(tioc), "multifd-tls-outgoing");
     p->c = QIO_CHANNEL(tioc);
-    qemu_thread_create(&p->thread, "multifd-tls-handshake-worker",
+
+    p->tls_thread_created = true;
+    qemu_thread_create(&p->tls_thread, "multifd-tls-handshake-worker",
                        multifd_tls_handshake_thread, p,
                        QEMU_THREAD_JOINABLE);
     return true;
diff --git a/migration/multifd.h b/migration/multifd.h
index 78a2317263..720c9d50db 100644
--- a/migration/multifd.h
+++ b/migration/multifd.h
@@ -73,6 +73,8 @@  typedef struct {
     char *name;
     /* channel thread id */
     QemuThread thread;
+    QemuThread tls_thread;
+    bool tls_thread_created;
     /* communication channel */
     QIOChannel *c;
     /* is the yank function registered */