diff mbox series

[14/14] migration/multifd: Forbid spurious wakeups

Message ID 20240131103111.306523-15-peterx@redhat.com
State New
Headers show
Series migration/multifd: Refactor ->send_prepare() and cleanups | expand

Commit Message

Peter Xu Jan. 31, 2024, 10:31 a.m. UTC
From: Peter Xu <peterx@redhat.com>

Now multifd's logic is designed to have no spurious wakeup.  I still
remember a talk to Juan and he seems to agree we should drop it now, and if
my memory was right it was there because multifd used to hit that when
still debugging.

Let's drop it and see what can explode; as long as it's not reaching
soft-freeze.

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 migration/multifd.c | 3 ---
 1 file changed, 3 deletions(-)

Comments

Fabiano Rosas Jan. 31, 2024, 9:43 p.m. UTC | #1
peterx@redhat.com writes:

> From: Peter Xu <peterx@redhat.com>
>
> Now multifd's logic is designed to have no spurious wakeup.  I still
> remember a talk to Juan and he seems to agree we should drop it now, and if
> my memory was right it was there because multifd used to hit that when
> still debugging.
>
> Let's drop it and see what can explode; as long as it's not reaching
> soft-freeze.
>
> Signed-off-by: Peter Xu <peterx@redhat.com>

Reviewed-by: Fabiano Rosas <farosas@suse.de>
Peter Xu Feb. 1, 2024, 6:01 a.m. UTC | #2
On Wed, Jan 31, 2024 at 06:31:11PM +0800, peterx@redhat.com wrote:
> From: Peter Xu <peterx@redhat.com>
> 
> Now multifd's logic is designed to have no spurious wakeup.  I still
> remember a talk to Juan and he seems to agree we should drop it now, and if
> my memory was right it was there because multifd used to hit that when
> still debugging.
> 
> Let's drop it and see what can explode; as long as it's not reaching
> soft-freeze.
> 
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  migration/multifd.c | 3 ---
>  1 file changed, 3 deletions(-)
> 
> diff --git a/migration/multifd.c b/migration/multifd.c
> index 0f22646f95..bd0e3ea1a5 100644
> --- a/migration/multifd.c
> +++ b/migration/multifd.c
> @@ -766,9 +766,6 @@ static void *multifd_send_thread(void *opaque)
>              p->pending_sync = false;
>              qemu_mutex_unlock(&p->mutex);
>              qemu_sem_post(&p->sem_sync);
> -        } else {
> -            qemu_mutex_unlock(&p->mutex);
> -            /* sometimes there are spurious wakeups */
>          }
>      }
>  
> -- 
> 2.43.0
> 

While removing this is still the goal, I just noticed that _if_ something
spurious wakeup happens then this will not crash qemu, but instead it'll
cause mutex locked forever and deadlock.

A deadlock is less wanted than a crash in this case, so when I repost, I'll
make sure it crashes and does it hard, like squashing this in:

====
diff --git a/migration/multifd.c b/migration/multifd.c
index bd0e3ea1a5..89011f75d9 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -751,7 +751,9 @@ static void *multifd_send_thread(void *opaque)
             p->next_packet_size = 0;
             p->pending_job = false;
             qemu_mutex_unlock(&p->mutex);
-        } else if (p->pending_sync) {
+        } else {
+            /* If not a normal job, must be a sync request */
+            assert(p->pending_sync);
             p->flags = MULTIFD_FLAG_SYNC;
             multifd_send_fill_packet(p);
             ret = qio_channel_write_all(p->c, (void *)p->packet,
====

Fabiano, I'll keep your ACK, but let me know otherwise..
diff mbox series

Patch

diff --git a/migration/multifd.c b/migration/multifd.c
index 0f22646f95..bd0e3ea1a5 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -766,9 +766,6 @@  static void *multifd_send_thread(void *opaque)
             p->pending_sync = false;
             qemu_mutex_unlock(&p->mutex);
             qemu_sem_post(&p->sem_sync);
-        } else {
-            qemu_mutex_unlock(&p->mutex);
-            /* sometimes there are spurious wakeups */
         }
     }