[V4,04/11] migration: preserve suspended for snapshot

Message ID 1693333086-392798-5-git-send-email-steven.sistare@oracle.com
State New
Series fix migration of suspended runstate

Commit Message

Steve Sistare Aug. 29, 2023, 6:17 p.m. UTC
Restoring a snapshot can break a suspended guest.

If a guest is suspended and saved to a snapshot using savevm, and qemu
is terminated and restarted with the -S option, then loadvm does not
restore the guest.  The runstate is running, but the guest is not, because
vm_start was not called.  The root cause is that loadvm does not restore
the runstate (e.g. suspended) from global_state loaded from the state file.

Restore the runstate, and allow the new state transitions that are possible.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
---
 migration/savevm.c | 1 +
 softmmu/runstate.c | 2 ++
 2 files changed, 3 insertions(+)
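
The transition-table part of the change can be pictured with a small
standalone model.  This is only a sketch and not QEMU code: the Toy* names,
the reduced set of states, and the lookup helper are all hypothetical, and
the "before" table keeps only the PRELAUNCH entry visible in the diff
context.  It shows why a lookup for PRELAUNCH -> SUSPENDED fails until the
new entries are present.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced set of run states (not QEMU's RunState enum). */
typedef enum {
    TOY_PRELAUNCH,
    TOY_INMIGRATE,
    TOY_PAUSED,
    TOY_SUSPENDED,
    TOY_STATE_MAX
} ToyRunState;

typedef struct {
    ToyRunState from;
    ToyRunState to;
} ToyTransition;

#define TOY_ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Toy table before the patch: only the entry shown as diff context. */
static const ToyTransition toy_before[] = {
    { TOY_PRELAUNCH, TOY_INMIGRATE },
};

/* Toy table after the patch: the two added entries are included. */
static const ToyTransition toy_after[] = {
    { TOY_PRELAUNCH, TOY_INMIGRATE },
    { TOY_PRELAUNCH, TOY_PAUSED },
    { TOY_PRELAUNCH, TOY_SUSPENDED },
};

/* Return true if (from -> to) is listed in the given table. */
static bool toy_transition_allowed(const ToyTransition *table, size_t n,
                                   ToyRunState from, ToyRunState to)
{
    assert(from < TOY_STATE_MAX && to < TOY_STATE_MAX);
    for (size_t i = 0; i < n; i++) {
        if (table[i].from == from && table[i].to == to) {
            return true;
        }
    }
    return false;
}
```

In this model, toy_transition_allowed(toy_before, TOY_ARRAY_LEN(toy_before),
TOY_PRELAUNCH, TOY_SUSPENDED) is false, while the same query against
toy_after is true.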

Comments

Peter Xu Aug. 30, 2023, 4:22 p.m. UTC | #1
On Tue, Aug 29, 2023 at 11:17:59AM -0700, Steve Sistare wrote:
> Restoring a snapshot can break a suspended guest.
> 
> If a guest is suspended and saved to a snapshot using savevm, and qemu
> is terminated and restarted with the -S option, then loadvm does not
> restore the guest.  The runstate is running, but the guest is not, because
> vm_start was not called.  The root cause is that loadvm does not restore
> the runstate (eg suspended) from global_state loaded from the state file.
> 
> Restore the runstate, and allow the new state transitions that are possible.
> 
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> ---
>  migration/savevm.c | 1 +
>  softmmu/runstate.c | 2 ++
>  2 files changed, 3 insertions(+)
> 
> diff --git a/migration/savevm.c b/migration/savevm.c
> index eba3653..7b9c477 100644
> --- a/migration/savevm.c
> +++ b/migration/savevm.c
> @@ -3194,6 +3194,7 @@ bool load_snapshot(const char *name, const char *vmstate,
>      }
>      aio_context_acquire(aio_context);
>      ret = qemu_loadvm_state(f);
> +    migrate_set_runstate();

I see that some load_snapshot() callers manage the vm states on their own.
Take snapshot_load_job_bh() as an example:

    s->ret = load_snapshot(s->tag, s->vmstate, true, s->devices, s->errp);
    if (s->ret && orig_vm_running) {
        vm_start();
    }

I assume you wanted to unify the state changes here.  Need to fix the
callers too?
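
To make the concern concrete, here is a toy model of the interaction.  It
is a sketch under stated assumptions, not QEMU code: ToyState and
toy_load_then_caller() are hypothetical, the load is assumed to succeed,
and the first assignment stands in for the patch restoring the saved
runstate inside load_snapshot().  The second step mirrors the caller
snippet quoted above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced set of states for illustration only. */
typedef enum {
    TOY_RUNNING,
    TOY_PAUSED,
    TOY_SUSPENDED
} ToyState;

/*
 * Model a successful snapshot load followed by a caller that still
 * manages the VM state itself.
 *
 * orig_vm_running: whether the VM was running before the load.
 * saved:           the runstate recorded in the snapshot.
 */
static ToyState toy_load_then_caller(bool orig_vm_running, ToyState saved)
{
    /* Step 1: the load restores the saved runstate (the patch's intent). */
    ToyState state = saved;

    /* Step 2: the caller restarts the VM if it was running beforehand. */
    if (orig_vm_running) {
        state = TOY_RUNNING;
    }
    return state;
}
```

In this model a snapshot saved as suspended ends up running whenever the
VM was running before the load, so the restored state is lost unless the
caller stops overriding it.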

>      migration_incoming_state_destroy();
>      aio_context_release(aio_context);
>  
> diff --git a/softmmu/runstate.c b/softmmu/runstate.c
> index f3bd862..21d7407 100644
> --- a/softmmu/runstate.c
> +++ b/softmmu/runstate.c
> @@ -77,6 +77,8 @@ typedef struct {
>  
>  static const RunStateTransition runstate_transitions_def[] = {
>      { RUN_STATE_PRELAUNCH, RUN_STATE_INMIGRATE },
> +    { RUN_STATE_PRELAUNCH, RUN_STATE_PAUSED },
> +    { RUN_STATE_PRELAUNCH, RUN_STATE_SUSPENDED },
>  
>      { RUN_STATE_DEBUG, RUN_STATE_RUNNING },
>      { RUN_STATE_DEBUG, RUN_STATE_FINISH_MIGRATE },

Many of the call sites also start loadvm under RUN_STATE_RESTORE_VM.  Do
we need more entries for that?

Steve Sistare Nov. 13, 2023, 6:32 p.m. UTC | #2
On 8/30/2023 12:22 PM, Peter Xu wrote:
> On Tue, Aug 29, 2023 at 11:17:59AM -0700, Steve Sistare wrote:
>> Restoring a snapshot can break a suspended guest.
>>
>> If a guest is suspended and saved to a snapshot using savevm, and qemu
>> is terminated and restarted with the -S option, then loadvm does not
>> restore the guest.  The runstate is running, but the guest is not, because
>> vm_start was not called.  The root cause is that loadvm does not restore
>> the runstate (eg suspended) from global_state loaded from the state file.
>>
>> Restore the runstate, and allow the new state transitions that are possible.
>>
>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
>> ---
>>  migration/savevm.c | 1 +
>>  softmmu/runstate.c | 2 ++
>>  2 files changed, 3 insertions(+)
>>
>> diff --git a/migration/savevm.c b/migration/savevm.c
>> index eba3653..7b9c477 100644
>> --- a/migration/savevm.c
>> +++ b/migration/savevm.c
>> @@ -3194,6 +3194,7 @@ bool load_snapshot(const char *name, const char *vmstate,
>>      }
>>      aio_context_acquire(aio_context);
>>      ret = qemu_loadvm_state(f);
>> +    migrate_set_runstate();
> 
> I see that some load_snapshot() callers manage the vm states on their own.
> Take snapshot_load_job_bh() as an example:
> 
>     s->ret = load_snapshot(s->tag, s->vmstate, true, s->devices, s->errp);
>     if (s->ret && orig_vm_running) {
>         vm_start();
>     }
> 
> I assume you wanted to unify the state changes here.  Need to fix the
> callers too?

Agreed. Fixed in V5.

>>      migration_incoming_state_destroy();
>>      aio_context_release(aio_context);
>>  
>> diff --git a/softmmu/runstate.c b/softmmu/runstate.c
>> index f3bd862..21d7407 100644
>> --- a/softmmu/runstate.c
>> +++ b/softmmu/runstate.c
>> @@ -77,6 +77,8 @@ typedef struct {
>>  
>>  static const RunStateTransition runstate_transitions_def[] = {
>>      { RUN_STATE_PRELAUNCH, RUN_STATE_INMIGRATE },
>> +    { RUN_STATE_PRELAUNCH, RUN_STATE_PAUSED },
>> +    { RUN_STATE_PRELAUNCH, RUN_STATE_SUSPENDED },
>>  
>>      { RUN_STATE_DEBUG, RUN_STATE_RUNNING },
>>      { RUN_STATE_DEBUG, RUN_STATE_FINISH_MIGRATE },
> 
> Many of the call sites also start loadvm under RUN_STATE_RESTORE_VM.  Do
> we need more entries for that?

Agreed. Fixed in V5.

- Steve

Patch

diff --git a/migration/savevm.c b/migration/savevm.c
index eba3653..7b9c477 100644
--- a/migration/savevm.c
+++ b/migration/savevm.c
@@ -3194,6 +3194,7 @@  bool load_snapshot(const char *name, const char *vmstate,
     }
     aio_context_acquire(aio_context);
     ret = qemu_loadvm_state(f);
+    migrate_set_runstate();
     migration_incoming_state_destroy();
     aio_context_release(aio_context);
 
diff --git a/softmmu/runstate.c b/softmmu/runstate.c
index f3bd862..21d7407 100644
--- a/softmmu/runstate.c
+++ b/softmmu/runstate.c
@@ -77,6 +77,8 @@  typedef struct {
 
 static const RunStateTransition runstate_transitions_def[] = {
     { RUN_STATE_PRELAUNCH, RUN_STATE_INMIGRATE },
+    { RUN_STATE_PRELAUNCH, RUN_STATE_PAUSED },
+    { RUN_STATE_PRELAUNCH, RUN_STATE_SUSPENDED },
 
     { RUN_STATE_DEBUG, RUN_STATE_RUNNING },
     { RUN_STATE_DEBUG, RUN_STATE_FINISH_MIGRATE },