migration: NULL transport_data after freeing

Message ID 20220217170407.24906-1-hreitz@redhat.com
State New
Series migration: NULL transport_data after freeing

Commit Message

Hanna Czenczek Feb. 17, 2022, 5:04 p.m. UTC
migration_incoming_state_destroy() NULLs all objects it frees after they
are freed, presumably so that a subsequent call to the same function
will not free them again, unless new objects have been created in the
meantime.

transport_data is the exception, and it shows exactly this problem: When
an incoming migration uses transport_cleanup() and transport_data, and a
subsequent incoming migration (e.g. loadvm) occurs that does not, then
when this second one is done, it will call transport_cleanup() on the
old transport_data again -- which has already been freed.  This is
sometimes visible in the iotest 201, though for some reason I can only
reproduce it with -m32.

To fix this, call transport_cleanup() only when transport_data is not
NULL (otherwise there is nothing to clean up), and set transport_data to
NULL when it has been cleaned up (i.e. freed).

(transport_cleanup() is used only by migration/socket.c, where
socket_start_incoming_migration_internal() sets both it and
transport_data to non-NULL values.)

Signed-off-by: Hanna Reitz <hreitz@redhat.com>
---
 migration/migration.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
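
For illustration only, the guard-then-NULL pattern described above can be sketched as a small standalone C model. This is not QEMU code: IncomingStateSketch, sketch_transport_cleanup(), sketch_state_destroy() and sketch_two_migrations() are hypothetical names, and the struct keeps only the two fields that matter here. The sketch assumes, as the commit message states, that the cleanup callback frees the data it is handed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the two incoming-state fields involved. */
typedef struct {
    void (*transport_cleanup)(void *data);
    void *transport_data;
} IncomingStateSketch;

static int cleanup_calls; /* counts how often the callback actually ran */

/* Stand-in for a transport cleanup callback: frees the data it is given. */
static void sketch_transport_cleanup(void *data)
{
    assert(data != NULL);
    free(data);
    cleanup_calls++;
}

/* Mirrors the patched logic: clean up only live data, then forget it. */
static void sketch_state_destroy(IncomingStateSketch *mis)
{
    if (mis->transport_cleanup && mis->transport_data) {
        mis->transport_cleanup(mis->transport_data);
        mis->transport_data = NULL;
    }
}

/*
 * The first "migration" sets up transport data; the second one (think
 * loadvm) does not. Returns the number of cleanup calls after both
 * destroy passes.
 */
static int sketch_two_migrations(void)
{
    IncomingStateSketch mis = { sketch_transport_cleanup, malloc(16) };

    cleanup_calls = 0;
    sketch_state_destroy(&mis); /* frees the data and NULLs the pointer */
    sketch_state_destroy(&mis); /* no-op: nothing left to clean up */
    return cleanup_calls;
}
```

Without the `transport_data = NULL` assignment, the second sketch_state_destroy() call would hand the already-freed pointer to the callback again, which is the double free the commit message describes.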

Comments

Dr. David Alan Gilbert Feb. 17, 2022, 5:29 p.m. UTC | #1
* Hanna Reitz (hreitz@redhat.com) wrote:
> migration_incoming_state_destroy() NULLs all objects it frees after they
> are freed, presumably so that a subsequent call to the same function
> will not free them again, unless new objects have been created in the
> meantime.
> 
> transport_data is the exception, and it shows exactly this problem: When
> an incoming migration uses transport_cleanup() and transport_data, and a
> subsequent incoming migration (e.g. loadvm) occurs that does not, then
> when this second one is done, it will call transport_cleanup() on the
> old transport_data again -- which has already been freed.  This is
> sometimes visible in the iotest 201, though for some reason I can only
> reproduce it with -m32.
> 
> To fix this, call transport_cleanup() only when transport_data is not
> NULL (otherwise there is nothing to clean up), and set transport_data to
> NULL when it has been cleaned up (i.e. freed).
> 
> (transport_cleanup() is used only by migration/socket.c, where
> socket_start_incoming_migration_internal() sets both it and
> transport_data to non-NULL values.)
> 
> Signed-off-by: Hanna Reitz <hreitz@redhat.com>

That probably deserves a "Fixes: a59136f" tag.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

> ---
>  migration/migration.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/migration/migration.c b/migration/migration.c
> index bcc385b94b..cdb2e76d02 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -287,8 +287,9 @@ void migration_incoming_state_destroy(void)
>          g_array_free(mis->postcopy_remote_fds, TRUE);
>          mis->postcopy_remote_fds = NULL;
>      }
> -    if (mis->transport_cleanup) {
> +    if (mis->transport_cleanup && mis->transport_data) {
>          mis->transport_cleanup(mis->transport_data);
> +        mis->transport_data = NULL;
>      }
>  
>      qemu_event_reset(&mis->main_thread_load_event);
> -- 
> 2.34.1
>
Peter Xu Feb. 18, 2022, 1:47 a.m. UTC | #2
On Thu, Feb 17, 2022 at 06:04:07PM +0100, Hanna Reitz wrote:
> migration_incoming_state_destroy() NULLs all objects it frees after they
> are freed, presumably so that a subsequent call to the same function
> will not free them again, unless new objects have been created in the
> meantime.
> 
> transport_data is the exception, and it shows exactly this problem: When
> an incoming migration uses transport_cleanup() and transport_data, and a
> subsequent incoming migration (e.g. loadvm) occurs that does not, then
> when this second one is done, it will call transport_cleanup() on the
> old transport_data again -- which has already been freed.  This is
> sometimes visible in the iotest 201, though for some reason I can only
> reproduce it with -m32.
> 
> To fix this, call transport_cleanup() only when transport_data is not
> NULL (otherwise there is nothing to clean up), and set transport_data to
> NULL when it has been cleaned up (i.e. freed).
> 
> (transport_cleanup() is used only by migration/socket.c, where
> socket_start_incoming_migration_internal() sets both it and
> transport_data to non-NULL values.)
> 
> Signed-off-by: Hanna Reitz <hreitz@redhat.com>

I had a similar fix here:

https://lore.kernel.org/qemu-devel/20220216062809.57179-15-peterx@redhat.com/

Though there it was because I need migration_incoming_transport_cleanup()
for other purposes, so the fix came along.

My guess is this small fix will land earlier, if so I'll rebase. :)

Thanks,
Dr. David Alan Gilbert March 2, 2022, 12:17 p.m. UTC | #3
* Peter Xu (peterx@redhat.com) wrote:
> On Thu, Feb 17, 2022 at 06:04:07PM +0100, Hanna Reitz wrote:
> > migration_incoming_state_destroy() NULLs all objects it frees after they
> > are freed, presumably so that a subsequent call to the same function
> > will not free them again, unless new objects have been created in the
> > meantime.
> > 
> > transport_data is the exception, and it shows exactly this problem: When
> > an incoming migration uses transport_cleanup() and transport_data, and a
> > subsequent incoming migration (e.g. loadvm) occurs that does not, then
> > when this second one is done, it will call transport_cleanup() on the
> > old transport_data again -- which has already been freed.  This is
> > sometimes visible in the iotest 201, though for some reason I can only
> > reproduce it with -m32.
> > 
> > To fix this, call transport_cleanup() only when transport_data is not
> > NULL (otherwise there is nothing to clean up), and set transport_data to
> > NULL when it has been cleaned up (i.e. freed).
> > 
> > (transport_cleanup() is used only by migration/socket.c, where
> > socket_start_incoming_migration_internal() sets both it and
> > transport_data to non-NULL values.)
> > 
> > Signed-off-by: Hanna Reitz <hreitz@redhat.com>
> 
> I had a similar fix here:
> 
> https://lore.kernel.org/qemu-devel/20220216062809.57179-15-peterx@redhat.com/
> 
> Though there it was because I need migration_incoming_transport_cleanup()
> for other purposes, so the fix came along.
> 
> My guess is this small fix will land earlier, if so I'll rebase. :)

Actually it didn't, so since I've pulled a chunk of Peter's series in
anyway, I took the fix from there.

Dave

> Thanks,
> 
> -- 
> Peter Xu
> 
>
Patch

diff --git a/migration/migration.c b/migration/migration.c
index bcc385b94b..cdb2e76d02 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -287,8 +287,9 @@  void migration_incoming_state_destroy(void)
         g_array_free(mis->postcopy_remote_fds, TRUE);
         mis->postcopy_remote_fds = NULL;
     }
-    if (mis->transport_cleanup) {
+    if (mis->transport_cleanup && mis->transport_data) {
         mis->transport_cleanup(mis->transport_data);
+        mis->transport_data = NULL;
     }
 
     qemu_event_reset(&mis->main_thread_load_event);