diff mbox series

[V1,4/4] cpr: reboot mode

Message ID 1697748466-373230-5-git-send-email-steven.sistare@oracle.com
State New
Headers show
Series Live Update reboot mode | expand

Commit Message

Steven Sistare Oct. 19, 2023, 8:47 p.m. UTC
Add the cpr-reboot migration mode.  Usage:

$ qemu-system-$arch -monitor stdio ...
QEMU 8.1.50 monitor - type 'help' for more information
(qemu) migrate_set_capability x-ignore-shared on
(qemu) migrate_set_parameter mode cpr-reboot
(qemu) migrate -d file:vm.state
(qemu) info status
VM status: paused (postmigrate)
(qemu) quit

$ qemu-system-$arch -monitor stdio -incoming defer ...
QEMU 8.1.50 monitor - type 'help' for more information
(qemu) migrate_set_capability x-ignore-shared on
(qemu) migrate_set_parameter mode cpr-reboot
(qemu) migrate_incoming file:vm.state
(qemu) info status
VM status: running

In this mode, the migrate command saves state to a file, allowing one
to quit qemu, reboot to an updated kernel, and restart an updated version
of qemu.  The caller must specify a migration URI that writes to and reads
from a file.  Unlike normal mode, the use of certain local storage options
does not block the migration, but the caller must not modify guest block
devices between the quit and restart.  The guest RAM memory-backend must
be shared, and the @x-ignore-shared migration capability must be set,
to avoid saving RAM to the file.  Guest RAM must be non-volatile across
reboot, such as by backing it with a dax device, but this is not enforced.
The restarted qemu arguments must match those used to initially start qemu,
plus the -incoming option.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
---
 qapi/migration.json | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

Comments

Juan Quintela Oct. 20, 2023, 9:45 a.m. UTC | #1
Steve Sistare <steven.sistare@oracle.com> wrote:
> Add the cpr-reboot migration mode.  Usage:
>
> $ qemu-system-$arch -monitor stdio ...
> QEMU 8.1.50 monitor - type 'help' for more information
> (qemu) migrate_set_capability x-ignore-shared on
> (qemu) migrate_set_parameter mode cpr-reboot
> (qemu) migrate -d file:vm.state
> (qemu) info status
> VM status: paused (postmigrate)
> (qemu) quit
>
> $ qemu-system-$arch -monitor stdio -incoming defer ...
> QEMU 8.1.50 monitor - type 'help' for more information
> (qemu) migrate_set_capability x-ignore-shared on
> (qemu) migrate_set_parameter mode cpr-reboot
> (qemu) migrate_incoming file:vm.state
> (qemu) info status
> VM status: running
>
> In this mode, the migrate command saves state to a file, allowing one
> to quit qemu, reboot to an updated kernel, and restart an updated version
> of qemu.  The caller must specify a migration URI that writes to and reads
> from a file.  Unlike normal mode, the use of certain local storage options
> does not block the migration, but the caller must not modify guest block
> devices between the quit and restart.  The guest RAM memory-backend must
> be shared, and the @x-ignore-shared migration capability must be set,
> to avoid saving RAM to the file.  Guest RAM must be non-volatile across
> reboot, such as by backing it with a dax device, but this is not enforced.
> The restarted qemu arguments must match those used to initially start qemu,
> plus the -incoming option.

Please, add this message to doc/<somewhere> instead (or additionally) to
the commit log.

> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> ---
>  qapi/migration.json | 16 +++++++++++++++-
>  1 file changed, 15 insertions(+), 1 deletion(-)
>
> diff --git a/qapi/migration.json b/qapi/migration.json
> index 184fb78..2d862fa 100644
> --- a/qapi/migration.json
> +++ b/qapi/migration.json
> @@ -620,9 +620,23 @@
>  #
>  # @normal: the original form of migration. (since 8.2)
>  #
> +# @cpr-reboot: The migrate command saves state to a file, allowing one to
> +#              quit qemu, reboot to an updated kernel, and restart an updated
> +#              version of qemu.  The caller must specify a migration URI
> +#              that writes to and reads from a file.  Unlike normal mode,
> +#              the use of certain local storage options does not block the
> +#              migration, but the caller must not modify guest block devices
> +#              between the quit and restart.  The guest RAM memory-backend
> +#              must be shared, and the @x-ignore-shared migration capability
> +#              must be set, to avoid saving it to the file.  Guest RAM must
> +#              be non-volatile across reboot, such as by backing it with
> +#              a dax device, but this is not enforced.  The restarted qemu
> +#              arguments must match those used to initially start qemu, plus
> +#              the -incoming option. (since 8.2)
> +#
>  ##
>  { 'enum': 'MigMode',
> -  'data': [ 'normal' ] }
> +  'data': [ 'normal', 'cpr-reboot' ] }
>  
>  ##
>  # @BitmapMigrationBitmapAliasTransform:

It only works with file backend, and we don't have any check for that.
Wondering how to add that check.

Additionally, you are not adding a migration test that does exactly what
you put there in the comment.

Thanks, Juan.
Steven Sistare Oct. 20, 2023, 2:09 p.m. UTC | #2
On 10/20/2023 5:45 AM, Juan Quintela wrote:
> Steve Sistare <steven.sistare@oracle.com> wrote:
>> Add the cpr-reboot migration mode.  Usage:
>>
>> $ qemu-system-$arch -monitor stdio ...
>> QEMU 8.1.50 monitor - type 'help' for more information
>> (qemu) migrate_set_capability x-ignore-shared on
>> (qemu) migrate_set_parameter mode cpr-reboot
>> (qemu) migrate -d file:vm.state
>> (qemu) info status
>> VM status: paused (postmigrate)
>> (qemu) quit
>>
>> $ qemu-system-$arch -monitor stdio -incoming defer ...
>> QEMU 8.1.50 monitor - type 'help' for more information
>> (qemu) migrate_set_capability x-ignore-shared on
>> (qemu) migrate_set_parameter mode cpr-reboot
>> (qemu) migrate_incoming file:vm.state
>> (qemu) info status
>> VM status: running
>>
>> In this mode, the migrate command saves state to a file, allowing one
>> to quit qemu, reboot to an updated kernel, and restart an updated version
>> of qemu.  The caller must specify a migration URI that writes to and reads
>> from a file.  Unlike normal mode, the use of certain local storage options
>> does not block the migration, but the caller must not modify guest block
>> devices between the quit and restart.  The guest RAM memory-backend must
>> be shared, and the @x-ignore-shared migration capability must be set,
>> to avoid saving RAM to the file.  Guest RAM must be non-volatile across
>> reboot, such as by backing it with a dax device, but this is not enforced.
>> The restarted qemu arguments must match those used to initially start qemu,
>> plus the -incoming option.
> 
> Please, add this message to doc/<somewhere> instead (or additionally) to
> the commit log.
> 
>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
>> ---
>>  qapi/migration.json | 16 +++++++++++++++-
>>  1 file changed, 15 insertions(+), 1 deletion(-)
>>
>> diff --git a/qapi/migration.json b/qapi/migration.json
>> index 184fb78..2d862fa 100644
>> --- a/qapi/migration.json
>> +++ b/qapi/migration.json
>> @@ -620,9 +620,23 @@
>>  #
>>  # @normal: the original form of migration. (since 8.2)
>>  #
>> +# @cpr-reboot: The migrate command saves state to a file, allowing one to
>> +#              quit qemu, reboot to an updated kernel, and restart an updated
>> +#              version of qemu.  The caller must specify a migration URI
>> +#              that writes to and reads from a file.  Unlike normal mode,
>> +#              the use of certain local storage options does not block the
>> +#              migration, but the caller must not modify guest block devices
>> +#              between the quit and restart.  The guest RAM memory-backend
>> +#              must be shared, and the @x-ignore-shared migration capability
>> +#              must be set, to avoid saving it to the file.  Guest RAM must
>> +#              be non-volatile across reboot, such as by backing it with
>> +#              a dax device, but this is not enforced.  The restarted qemu
>> +#              arguments must match those used to initially start qemu, plus
>> +#              the -incoming option. (since 8.2)
>> +#
>>  ##
>>  { 'enum': 'MigMode',
>> -  'data': [ 'normal' ] }
>> +  'data': [ 'normal', 'cpr-reboot' ] }
>>  
>>  ##
>>  # @BitmapMigrationBitmapAliasTransform:
> 
> It only works with file backend, and we don't have any check for that.
> Wondering how to add that check.

Actually, it works for other backends, but the ram contents are saved in the
state file, which is slower. I should spell that out in the json comment and
in the commit message.

> Additionally, you are not adding a migration test that does exactly what
> you put there in the comment.

I provide tests/avocado/cpr.py in the original long series.  Would you
like me to add it to this series, or post it later?  Would you prefer I
add a test to tests/qtest/migration-test.c?

- Steve
Juan Quintela Oct. 20, 2023, 7:40 p.m. UTC | #3
Steven Sistare <steven.sistare@oracle.com> wrote:
> On 10/20/2023 5:45 AM, Juan Quintela wrote:
>> Steve Sistare <steven.sistare@oracle.com> wrote:
>>> Add the cpr-reboot migration mode.  Usage:
>>>
>>> $ qemu-system-$arch -monitor stdio ...
>>> QEMU 8.1.50 monitor - type 'help' for more information
>>> (qemu) migrate_set_capability x-ignore-shared on
>>> (qemu) migrate_set_parameter mode cpr-reboot
>>> (qemu) migrate -d file:vm.state
>>> (qemu) info status
>>> VM status: paused (postmigrate)
>>> (qemu) quit
>>>
>>> $ qemu-system-$arch -monitor stdio -incoming defer ...
>>> QEMU 8.1.50 monitor - type 'help' for more information
>>> (qemu) migrate_set_capability x-ignore-shared on
>>> (qemu) migrate_set_parameter mode cpr-reboot
>>> (qemu) migrate_incoming file:vm.state
>>> (qemu) info status
>>> VM status: running
>>>
>>> In this mode, the migrate command saves state to a file, allowing one
>>> to quit qemu, reboot to an updated kernel, and restart an updated version
>>> of qemu.  The caller must specify a migration URI that writes to and reads
>>> from a file.  Unlike normal mode, the use of certain local storage options
>>> does not block the migration, but the caller must not modify guest block
>>> devices between the quit and restart.  The guest RAM memory-backend must
>>> be shared, and the @x-ignore-shared migration capability must be set,
>>> to avoid saving RAM to the file.  Guest RAM must be non-volatile across
>>> reboot, such as by backing it with a dax device, but this is not enforced.
>>> The restarted qemu arguments must match those used to initially start qemu,
>>> plus the -incoming option.
>> 
>> Please, add this message to doc/<somewhere> instead (or additionally) to
>> the commit log.
>> 
>>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
>>> ---
>>>  qapi/migration.json | 16 +++++++++++++++-
>>>  1 file changed, 15 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/qapi/migration.json b/qapi/migration.json
>>> index 184fb78..2d862fa 100644
>>> --- a/qapi/migration.json
>>> +++ b/qapi/migration.json
>>> @@ -620,9 +620,23 @@
>>>  #
>>>  # @normal: the original form of migration. (since 8.2)
>>>  #
>>> +# @cpr-reboot: The migrate command saves state to a file, allowing one to
>>> +#              quit qemu, reboot to an updated kernel, and restart an updated
>>> +#              version of qemu.  The caller must specify a migration URI
>>> +#              that writes to and reads from a file.  Unlike normal mode,
>>> +#              the use of certain local storage options does not block the
>>> +#              migration, but the caller must not modify guest block devices
>>> +#              between the quit and restart.  The guest RAM memory-backend
>>> +#              must be shared, and the @x-ignore-shared migration capability
>>> +#              must be set, to avoid saving it to the file.  Guest RAM must
>>> +#              be non-volatile across reboot, such as by backing it with
>>> +#              a dax device, but this is not enforced.  The restarted qemu
>>> +#              arguments must match those used to initially start qemu, plus
>>> +#              the -incoming option. (since 8.2)
>>> +#
>>>  ##
>>>  { 'enum': 'MigMode',
>>> -  'data': [ 'normal' ] }
>>> +  'data': [ 'normal', 'cpr-reboot' ] }
>>>  
>>>  ##
>>>  # @BitmapMigrationBitmapAliasTransform:
>> 
>> It only works with file backend, and we don't have any check for that.
>> Wondering how to add that check.
>
> Actually, it works for other backends, but the ram contents are saved in the
> state file, which is slower. I should spell that out in the json comment and
> in the commit message.

Thanks.
>
>> Additionally, you are not adding a migration test that does exactly what
>> you put there in the comment.
>
> I provide tests/avocado/cpr.py in the original long series.  Would you
> like me to add it to this series, or post it later?  Would you prefer I
> add a test to tests/qtest/migration-test.c?

test/qtest/migration-test.c

please.

Something simple like what you say in the commit should be a good start.

Thanks, Juan.
Peter Xu Oct. 23, 2023, 3:39 p.m. UTC | #4
On Thu, Oct 19, 2023 at 01:47:46PM -0700, Steve Sistare wrote:
> Add the cpr-reboot migration mode.  Usage:
> 
> $ qemu-system-$arch -monitor stdio ...
> QEMU 8.1.50 monitor - type 'help' for more information
> (qemu) migrate_set_capability x-ignore-shared on
> (qemu) migrate_set_parameter mode cpr-reboot
> (qemu) migrate -d file:vm.state
> (qemu) info status
> VM status: paused (postmigrate)
> (qemu) quit
> 
> $ qemu-system-$arch -monitor stdio -incoming defer ...
> QEMU 8.1.50 monitor - type 'help' for more information
> (qemu) migrate_set_capability x-ignore-shared on
> (qemu) migrate_set_parameter mode cpr-reboot
> (qemu) migrate_incoming file:vm.state
> (qemu) info status
> VM status: running
> 
> In this mode, the migrate command saves state to a file, allowing one
> to quit qemu, reboot to an updated kernel, and restart an updated version
> of qemu.  The caller must specify a migration URI that writes to and reads
> from a file.  Unlike normal mode, the use of certain local storage options
> does not block the migration, but the caller must not modify guest block
> devices between the quit and restart.  The guest RAM memory-backend must
> be shared, and the @x-ignore-shared migration capability must be set,
> to avoid saving RAM to the file.  Guest RAM must be non-volatile across
> reboot, such as by backing it with a dax device, but this is not enforced.
> The restarted qemu arguments must match those used to initially start qemu,
> plus the -incoming option.
> 
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> ---
>  qapi/migration.json | 16 +++++++++++++++-
>  1 file changed, 15 insertions(+), 1 deletion(-)
> 
> diff --git a/qapi/migration.json b/qapi/migration.json
> index 184fb78..2d862fa 100644
> --- a/qapi/migration.json
> +++ b/qapi/migration.json
> @@ -620,9 +620,23 @@
>  #
>  # @normal: the original form of migration. (since 8.2)
>  #
> +# @cpr-reboot: The migrate command saves state to a file, allowing one to
> +#              quit qemu, reboot to an updated kernel, and restart an updated
> +#              version of qemu.  The caller must specify a migration URI
> +#              that writes to and reads from a file.  Unlike normal mode,
> +#              the use of certain local storage options does not block the
> +#              migration, but the caller must not modify guest block devices
> +#              between the quit and restart.  The guest RAM memory-backend
> +#              must be shared, and the @x-ignore-shared migration capability
> +#              must be set, to avoid saving it to the file.  Guest RAM must
> +#              be non-volatile across reboot, such as by backing it with
> +#              a dax device, but this is not enforced.  The restarted qemu
> +#              arguments must match those used to initially start qemu, plus
> +#              the -incoming option. (since 8.2)

What happens if someone migrates with non-shared memory, or without
ignore-shared?  Is it only because it'll be slow saving and loading?

If that's required, we should fail the mode set if (1) non-shared memory is
used, or (2) x-ignore-shared is not enabled.  But I had a feeling it's the
other way round.

Reading the whole series, if it's so far all about "local storage", why
"cpr-reboot"?  Why not "local" or "local storage" as the name?

I had a feeling that this patchset mixed a lot of higher level use case
into the mode definition.  IMHO we should provide clear definition of each
mode on what it does.  It's so far not so clear to me, even if I kind of
know what you plan to do.

I tried again google what CPR is for and found this:

https://lists.gnu.org/archive/html/qemu-devel/2020-07/msg08541.html

I also prefer spell it out, at least make it clear on what that means..  I
didn't even see "Checkpoint/restart" words mentioned anywhere in this
patchset.

Besides: do you have a tree somewhere for the whole set of latest CPR work?

Thanks,

> +#
>  ##
>  { 'enum': 'MigMode',
> -  'data': [ 'normal' ] }
> +  'data': [ 'normal', 'cpr-reboot' ] }
>  
>  ##
>  # @BitmapMigrationBitmapAliasTransform:
> -- 
> 1.8.3.1
>
Steven Sistare Oct. 23, 2023, 6:29 p.m. UTC | #5
On 10/23/2023 11:39 AM, Peter Xu wrote:
> On Thu, Oct 19, 2023 at 01:47:46PM -0700, Steve Sistare wrote:
>> Add the cpr-reboot migration mode.  Usage:
>>
>> $ qemu-system-$arch -monitor stdio ...
>> QEMU 8.1.50 monitor - type 'help' for more information
>> (qemu) migrate_set_capability x-ignore-shared on
>> (qemu) migrate_set_parameter mode cpr-reboot
>> (qemu) migrate -d file:vm.state
>> (qemu) info status
>> VM status: paused (postmigrate)
>> (qemu) quit
>>
>> $ qemu-system-$arch -monitor stdio -incoming defer ...
>> QEMU 8.1.50 monitor - type 'help' for more information
>> (qemu) migrate_set_capability x-ignore-shared on
>> (qemu) migrate_set_parameter mode cpr-reboot
>> (qemu) migrate_incoming file:vm.state
>> (qemu) info status
>> VM status: running
>>
>> In this mode, the migrate command saves state to a file, allowing one
>> to quit qemu, reboot to an updated kernel, and restart an updated version
>> of qemu.  The caller must specify a migration URI that writes to and reads
>> from a file.  Unlike normal mode, the use of certain local storage options
>> does not block the migration, but the caller must not modify guest block
>> devices between the quit and restart.  The guest RAM memory-backend must
>> be shared, and the @x-ignore-shared migration capability must be set,
>> to avoid saving RAM to the file.  Guest RAM must be non-volatile across
>> reboot, such as by backing it with a dax device, but this is not enforced.
>> The restarted qemu arguments must match those used to initially start qemu,
>> plus the -incoming option.
>>
>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
>> ---
>>  qapi/migration.json | 16 +++++++++++++++-
>>  1 file changed, 15 insertions(+), 1 deletion(-)
>>
>> diff --git a/qapi/migration.json b/qapi/migration.json
>> index 184fb78..2d862fa 100644
>> --- a/qapi/migration.json
>> +++ b/qapi/migration.json
>> @@ -620,9 +620,23 @@
>>  #
>>  # @normal: the original form of migration. (since 8.2)
>>  #
>> +# @cpr-reboot: The migrate command saves state to a file, allowing one to
>> +#              quit qemu, reboot to an updated kernel, and restart an updated
>> +#              version of qemu.  The caller must specify a migration URI
>> +#              that writes to and reads from a file.  Unlike normal mode,
>> +#              the use of certain local storage options does not block the
>> +#              migration, but the caller must not modify guest block devices
>> +#              between the quit and restart.  The guest RAM memory-backend
>> +#              must be shared, and the @x-ignore-shared migration capability
>> +#              must be set, to avoid saving it to the file.  Guest RAM must
>> +#              be non-volatile across reboot, such as by backing it with
>> +#              a dax device, but this is not enforced.  The restarted qemu
>> +#              arguments must match those used to initially start qemu, plus
>> +#              the -incoming option. (since 8.2)
> 
> What happens if someone migrates with non-shared memory, or without
> ignore-shared?  Is it only because it'll be slow saving and loading?
> 
> If that's required, we should fail the mode set if (1) non-shared memory is
> used, or (2) x-ignore-shared is not enabled.  But I had a feeling it's the
> other way round.

Juan also asked me to clarify this.  I plan to resubmit this:

#                                        ...  Private guest RAM is saved in
#              the file.  To avoid this cost, the guest RAM memory-backend
#              must be shared, and the @x-ignore-shared migration capability
#              must be set.  ...

> 
> Reading the whole series, if it's so far all about "local storage", why
> "cpr-reboot"?  Why not "local" or "local storage" as the name?

The use case is about rebooting and updating the host, so reboot is in 
the name.  Local storage just happens to be allowed for it.

> I had a feeling that this patchset mixed a lot of higher level use case
> into the mode definition.  IMHO we should provide clear definition of each
> mode on what it does.  It's so far not so clear to me, even if I kind of
> know what you plan to do.

I believe I already have, in the cover letter, commit message, and qapi 
definition, at the start of each:

# @cpr-reboot: The migrate command saves state to a file, allowing one to
#              quit qemu, reboot to an updated kernel, and restart an updated
#              version of qemu.

The cover letter hints at the cpr-exec use case, and the long V9 patch series
describes it, and I will make sure the use case comes first when I submit cpr-exec,
which is:
  * much shorter guest downtime than cpr reboot
  * support vfio without requiring guest suspension
  * keep certain character devices alive

> I tried again google what CPR is for and found this:
> 
> https://lists.gnu.org/archive/html/qemu-devel/2020-07/msg08541.html
> 
> I also prefer spell it out, at least make it clear on what that means..  I
> didn't even see "Checkpoint/restart" words mentioned anywhere in this
> patchset.

Will do.

> Besides: do you have a tree somewhere for the whole set of latest CPR work?

I have the V9 patch series:
  https://lore.kernel.org/qemu-devel/1658851843-236870-1-git-send-email-steven.sistare@oracle.com
and I can re-send my proposal for breaking it down into patch sets that I presented in the
qemu community meeting, if you did not save it.

- Steve
Steven Sistare Oct. 23, 2023, 6:39 p.m. UTC | #6
On 10/20/2023 3:40 PM, Juan Quintela wrote:
> Steven Sistare <steven.sistare@oracle.com> wrote:
>> On 10/20/2023 5:45 AM, Juan Quintela wrote:
>>> Steve Sistare <steven.sistare@oracle.com> wrote:
>>>> Add the cpr-reboot migration mode.  Usage:
>>>>
>>>> $ qemu-system-$arch -monitor stdio ...
>>>> QEMU 8.1.50 monitor - type 'help' for more information
>>>> (qemu) migrate_set_capability x-ignore-shared on
>>>> (qemu) migrate_set_parameter mode cpr-reboot
>>>> (qemu) migrate -d file:vm.state
>>>> (qemu) info status
>>>> VM status: paused (postmigrate)
>>>> (qemu) quit
>>>>
>>>> $ qemu-system-$arch -monitor stdio -incoming defer ...
>>>> QEMU 8.1.50 monitor - type 'help' for more information
>>>> (qemu) migrate_set_capability x-ignore-shared on
>>>> (qemu) migrate_set_parameter mode cpr-reboot
>>>> (qemu) migrate_incoming file:vm.state
>>>> (qemu) info status
>>>> VM status: running
>>>>
>>>> In this mode, the migrate command saves state to a file, allowing one
>>>> to quit qemu, reboot to an updated kernel, and restart an updated version
>>>> of qemu.  The caller must specify a migration URI that writes to and reads
>>>> from a file.  Unlike normal mode, the use of certain local storage options
>>>> does not block the migration, but the caller must not modify guest block
>>>> devices between the quit and restart.  The guest RAM memory-backend must
>>>> be shared, and the @x-ignore-shared migration capability must be set,
>>>> to avoid saving RAM to the file.  Guest RAM must be non-volatile across
>>>> reboot, such as by backing it with a dax device, but this is not enforced.
>>>> The restarted qemu arguments must match those used to initially start qemu,
>>>> plus the -incoming option.
>>>
>>> Please, add this message to doc/<somewhere> instead (or additionally) to
>>> the commit log.
>>>
>>>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
>>>> ---
>>>>  qapi/migration.json | 16 +++++++++++++++-
>>>>  1 file changed, 15 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/qapi/migration.json b/qapi/migration.json
>>>> index 184fb78..2d862fa 100644
>>>> --- a/qapi/migration.json
>>>> +++ b/qapi/migration.json
>>>> @@ -620,9 +620,23 @@
>>>>  #
>>>>  # @normal: the original form of migration. (since 8.2)
>>>>  #
>>>> +# @cpr-reboot: The migrate command saves state to a file, allowing one to
>>>> +#              quit qemu, reboot to an updated kernel, and restart an updated
>>>> +#              version of qemu.  The caller must specify a migration URI
>>>> +#              that writes to and reads from a file.  Unlike normal mode,
>>>> +#              the use of certain local storage options does not block the
>>>> +#              migration, but the caller must not modify guest block devices
>>>> +#              between the quit and restart.  The guest RAM memory-backend
>>>> +#              must be shared, and the @x-ignore-shared migration capability
>>>> +#              must be set, to avoid saving it to the file.  Guest RAM must
>>>> +#              be non-volatile across reboot, such as by backing it with
>>>> +#              a dax device, but this is not enforced.  The restarted qemu
>>>> +#              arguments must match those used to initially start qemu, plus
>>>> +#              the -incoming option. (since 8.2)
>>>> +#
>>>>  ##
>>>>  { 'enum': 'MigMode',
>>>> -  'data': [ 'normal' ] }
>>>> +  'data': [ 'normal', 'cpr-reboot' ] }
>>>>  
>>>>  ##
>>>>  # @BitmapMigrationBitmapAliasTransform:
>>>
>>> It only works with file backend, and we don't have any check for that.
>>> Wondering how to add that check.
>>
>> Actually, it works for other backends, but the ram contents are saved in the
>> state file, which is slower. I should spell that out in the json comment and
>> in the commit message.
> 
> Thanks.
>>
>>> Additionally, you are not adding a migration test that does exactly what
>>> you put there in the comment.
>>
>> I provide tests/avocado/cpr.py in the original long series.  Would you
>> like me to add it to this series, or post it later?  Would you prefer I
>> add a test to tests/qtest/migration-test.c?
> 
> test/qtest/migration-test.c
> 
> please.
> 
> Something simple like what you say in the commit should be a good start.

I wrote a new test which I will submit with V2 of this series.  Turned out to be
easy after Fabiano provided test_file_common.

+static void *test_mode_reboot_start(QTestState *from, QTestState *to)
+{
+    migrate_set_parameter_str(from, "mode", "cpr-reboot");
+    migrate_set_parameter_str(to, "mode", "cpr-reboot");
+
+    migrate_set_capability(from, "x-ignore-shared", true);
+    migrate_set_capability(to, "x-ignore-shared", true);
+
+    return NULL;
+}
+
+static void test_mode_reboot(void)
+{
+    g_autofree char *uri = g_strdup_printf("file:%s/%s", tmpfs,
+                                           FILE_TEST_FILENAME);
+    MigrateCommon args = {
+        .start.use_shmem = true,
+        .connect_uri = uri,
+        .listen_uri = "defer",
+        .start_hook = test_mode_reboot_start
+    };
+
+    test_file_common(&args, true);
+}

- Steve
Steven Sistare Oct. 23, 2023, 6:51 p.m. UTC | #7
On 10/23/2023 2:29 PM, Steven Sistare wrote:
> On 10/23/2023 11:39 AM, Peter Xu wrote:
>> On Thu, Oct 19, 2023 at 01:47:46PM -0700, Steve Sistare wrote:
>>> Add the cpr-reboot migration mode.  Usage:
>>>
>>> $ qemu-system-$arch -monitor stdio ...
>>> QEMU 8.1.50 monitor - type 'help' for more information
>>> (qemu) migrate_set_capability x-ignore-shared on
>>> (qemu) migrate_set_parameter mode cpr-reboot
>>> (qemu) migrate -d file:vm.state
>>> (qemu) info status
>>> VM status: paused (postmigrate)
>>> (qemu) quit
>>>
>>> $ qemu-system-$arch -monitor stdio -incoming defer ...
>>> QEMU 8.1.50 monitor - type 'help' for more information
>>> (qemu) migrate_set_capability x-ignore-shared on
>>> (qemu) migrate_set_parameter mode cpr-reboot
>>> (qemu) migrate_incoming file:vm.state
>>> (qemu) info status
>>> VM status: running
>>>
>>> In this mode, the migrate command saves state to a file, allowing one
>>> to quit qemu, reboot to an updated kernel, and restart an updated version
>>> of qemu.  The caller must specify a migration URI that writes to and reads
>>> from a file.  Unlike normal mode, the use of certain local storage options
>>> does not block the migration, but the caller must not modify guest block
>>> devices between the quit and restart.  The guest RAM memory-backend must
>>> be shared, and the @x-ignore-shared migration capability must be set,
>>> to avoid saving RAM to the file.  Guest RAM must be non-volatile across
>>> reboot, such as by backing it with a dax device, but this is not enforced.
>>> The restarted qemu arguments must match those used to initially start qemu,
>>> plus the -incoming option.
>>>
>>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
>>> ---
>>>  qapi/migration.json | 16 +++++++++++++++-
>>>  1 file changed, 15 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/qapi/migration.json b/qapi/migration.json
>>> index 184fb78..2d862fa 100644
>>> --- a/qapi/migration.json
>>> +++ b/qapi/migration.json
>>> @@ -620,9 +620,23 @@
>>>  #
>>>  # @normal: the original form of migration. (since 8.2)
>>>  #
>>> +# @cpr-reboot: The migrate command saves state to a file, allowing one to
>>> +#              quit qemu, reboot to an updated kernel, and restart an updated
>>> +#              version of qemu.  The caller must specify a migration URI
>>> +#              that writes to and reads from a file.  Unlike normal mode,
>>> +#              the use of certain local storage options does not block the
>>> +#              migration, but the caller must not modify guest block devices
>>> +#              between the quit and restart.  The guest RAM memory-backend
>>> +#              must be shared, and the @x-ignore-shared migration capability
>>> +#              must be set, to avoid saving it to the file.  Guest RAM must
>>> +#              be non-volatile across reboot, such as by backing it with
>>> +#              a dax device, but this is not enforced.  The restarted qemu
>>> +#              arguments must match those used to initially start qemu, plus
>>> +#              the -incoming option. (since 8.2)
>>
>> What happens if someone migrates with non-shared memory, or without
>> ignore-shared?  Is it only because it'll be slow saving and loading?
>>
>> If that's required, we should fail the mode set if (1) non-shared memory is
>> used, or (2) x-ignore-shared is not enabled.  But I had a feeling it's the
>> other way round.
> 
> Juan also asked me to clarify this.  I plan to resubmit this:
> 
> #                                        ...  Private guest RAM is saved in
> #              the file.  To avoid this cost, the guest RAM memory-backend
> #              must be shared, and the @x-ignore-shared migration capability
> #              must be set.  ...
> 
>>
>> Reading the whole series, if it's so far all about "local storage", why
>> "cpr-reboot"?  Why not "local" or "local storage" as the name?
> 
> The use case is about rebooting and updating the host, so reboot is in 
> the name.  Local storage just happens to be allowed for it.
> 
>> I had a feeling that this patchset mixed a lot of higher level use case
>> into the mode definition.  IMHO we should provide clear definition of each
>> mode on what it does.  It's so far not so clear to me, even if I kind of
>> know what you plan to do.
> 
> I believe I already have, in the cover letter, commit message, and qapi 
> definition, at the start of each:
> 
> # @cpr-reboot: The migrate command saves state to a file, allowing one to
> #              quit qemu, reboot to an updated kernel, and restart an updated
> #              version of qemu.
> 
> The cover letter hints at the cpr-exec use case, and the long V9 patch series
> describes it, and I will make sure the use case comes first when I submit cpr-exec,
> which is:
    * restart an updated version of qemu     (I buried the lead - steve)
>   * much shorter guest downtime than cpr reboot
>   * support vfio without requiring guest suspension
>   * keep certain character devices alive
> 
>> I tried again google what CPR is for and found this:
>>
>> https://lists.gnu.org/archive/html/qemu-devel/2020-07/msg08541.html
>>
>> I also prefer spell it out, at least make it clear on what that means..  I
>> didn't even see "Checkpoint/restart" words mentioned anywhere in this
>> patchset.
> 
> Will do.
> 
>> Besides: do you have a tree somewhere for the whole set of latest CPR work?
> 
> I have the V9 patch series:
>   https://lore.kernel.org/qemu-devel/1658851843-236870-1-git-send-email-steven.sistare@oracle.com
> and I can re-send my proposal for breaking it down into patch sets that I presented in the
> qemu community meeting, if you did not save it.
> 
> - Steve
Peter Xu Oct. 23, 2023, 7:05 p.m. UTC | #8
On Mon, Oct 23, 2023 at 02:51:50PM -0400, Steven Sistare wrote:
> On 10/23/2023 2:29 PM, Steven Sistare wrote:
> > On 10/23/2023 11:39 AM, Peter Xu wrote:
> >> On Thu, Oct 19, 2023 at 01:47:46PM -0700, Steve Sistare wrote:
> >>> Add the cpr-reboot migration mode.  Usage:
> >>>
> >>> $ qemu-system-$arch -monitor stdio ...
> >>> QEMU 8.1.50 monitor - type 'help' for more information
> >>> (qemu) migrate_set_capability x-ignore-shared on
> >>> (qemu) migrate_set_parameter mode cpr-reboot
> >>> (qemu) migrate -d file:vm.state
> >>> (qemu) info status
> >>> VM status: paused (postmigrate)
> >>> (qemu) quit
> >>>
> >>> $ qemu-system-$arch -monitor stdio -incoming defer ...
> >>> QEMU 8.1.50 monitor - type 'help' for more information
> >>> (qemu) migrate_set_capability x-ignore-shared on
> >>> (qemu) migrate_set_parameter mode cpr-reboot
> >>> (qemu) migrate_incoming file:vm.state
> >>> (qemu) info status
> >>> VM status: running
> >>>
> >>> In this mode, the migrate command saves state to a file, allowing one
> >>> to quit qemu, reboot to an updated kernel, and restart an updated version
> >>> of qemu.  The caller must specify a migration URI that writes to and reads
> >>> from a file.  Unlike normal mode, the use of certain local storage options
> >>> does not block the migration, but the caller must not modify guest block
> >>> devices between the quit and restart.  The guest RAM memory-backend must
> >>> be shared, and the @x-ignore-shared migration capability must be set,
> >>> to avoid saving RAM to the file.  Guest RAM must be non-volatile across
> >>> reboot, such as by backing it with a dax device, but this is not enforced.
> >>> The restarted qemu arguments must match those used to initially start qemu,
> >>> plus the -incoming option.
> >>>
> >>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> >>> ---
> >>>  qapi/migration.json | 16 +++++++++++++++-
> >>>  1 file changed, 15 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/qapi/migration.json b/qapi/migration.json
> >>> index 184fb78..2d862fa 100644
> >>> --- a/qapi/migration.json
> >>> +++ b/qapi/migration.json
> >>> @@ -620,9 +620,23 @@
> >>>  #
> >>>  # @normal: the original form of migration. (since 8.2)
> >>>  #
> >>> +# @cpr-reboot: The migrate command saves state to a file, allowing one to
> >>> +#              quit qemu, reboot to an updated kernel, and restart an updated
> >>> +#              version of qemu.  The caller must specify a migration URI
> >>> +#              that writes to and reads from a file.  Unlike normal mode,
> >>> +#              the use of certain local storage options does not block the
> >>> +#              migration, but the caller must not modify guest block devices
> >>> +#              between the quit and restart.  The guest RAM memory-backend
> >>> +#              must be shared, and the @x-ignore-shared migration capability
> >>> +#              must be set, to avoid saving it to the file.  Guest RAM must
> >>> +#              be non-volatile across reboot, such as by backing it with
> >>> +#              a dax device, but this is not enforced.  The restarted qemu
> >>> +#              arguments must match those used to initially start qemu, plus
> >>> +#              the -incoming option. (since 8.2)
> >>
> >> What happens if someone migrates with non-shared memory, or without
> >> ignore-shared?  Is it only because it'll be slow saving and loading?
> >>
> >> If that's required, we should fail the mode set if (1) non-shared memory is
> >> used, or (2) x-ignore-shared is not enabled.  But I had a feeling it's the
> >> other way round.
> > 
> > Juan also asked me to clarify this.  I plan to resubmit this:
> > 
> > #                                        ...  Private guest RAM is saved in
> > #              the file.  To avoid this cost, the guest RAM memory-backend
> > #              must be shared, and the @x-ignore-shared migration capability
> > #              must be set.  ...

Okay.  We can also avoid mentioning "private guest RAM is saved to ..."
because that's what migration already does.  IMO we can simplify all that
to:

  It is suggested to use share memory with x-ignore-shared when using this
  mode.

> > 
> >>
> >> Reading the whole series, if it's so far all about "local storage", why
> >> "cpr-reboot"?  Why not "local" or "local storage" as the name?
> > 
> > The use case is about rebooting and updating the host, so reboot is in 
> > the name.  Local storage just happens to be allowed for it.
> > 
> >> I had a feeling that this patchset mixed a lot of higher level use case
> >> into the mode definition.  IMHO we should provide clear definition of each
> >> mode on what it does.  It's so far not so clear to me, even if I kind of
> >> know what you plan to do.
> > 
> > I believe I already have, in the cover letter, commit message, and qapi 
> > definition, at the start of each:
> > 
> > # @cpr-reboot: The migrate command saves state to a file, allowing one to
> > #              quit qemu, reboot to an updated kernel, and restart an updated
> > #              version of qemu.

I think this is why I'm confused: above sentence is describing a very
generic migration to file scenario to me.  IOW, I think I can get the same
result described even with normal migration to file, or am I wrong?

IMHO the description here needs to explain the difference and when an user
should use this mode.  I think the real answer resides in your whole set,
I'll try to read that.

In all cases, can we name it something like "live-upgrade" v.s. "normal"?

> > 
> > The cover letter hints at the cpr-exec use case, and the long V9 patch series
> > describes it, and I will make sure the use case comes first when I submit cpr-exec,
> > which is:
>     * restart an updated version of qemu     (I buried the lead - steve)
> >   * much shorter guest downtime than cpr reboot
> >   * support vfio without requiring guest suspension
> >   * keep certain character devices alive
> > 
> >> I tried again google what CPR is for and found this:
> >>
> >> https://lists.gnu.org/archive/html/qemu-devel/2020-07/msg08541.html
> >>
> >> I also prefer spell it out, at least make it clear on what that means..  I
> >> didn't even see "Checkpoint/restart" words mentioned anywhere in this
> >> patchset.
> > 
> > Will do.
> > 
> >> Besides: do you have a tree somewhere for the whole set of latest CPR work?
> > 
> > I have the V9 patch series:
> >   https://lore.kernel.org/qemu-devel/1658851843-236870-1-git-send-email-steven.sistare@oracle.com
> > and I can re-send my proposal for breaking it down into patch sets that I presented in the
> > qemu community meeting, if you did not save it.

No need to resend.  A link is exactly what I need; git tree even better.
I'll comment when I get something when reading that.

Thanks,
Steven Sistare Oct. 23, 2023, 8:06 p.m. UTC | #9
On 10/23/2023 3:05 PM, Peter Xu wrote:
> On Mon, Oct 23, 2023 at 02:51:50PM -0400, Steven Sistare wrote:
>> On 10/23/2023 2:29 PM, Steven Sistare wrote:
>>> On 10/23/2023 11:39 AM, Peter Xu wrote:
>>>> On Thu, Oct 19, 2023 at 01:47:46PM -0700, Steve Sistare wrote:
>>>>> Add the cpr-reboot migration mode.  Usage:
>>>>>
>>>>> $ qemu-system-$arch -monitor stdio ...
>>>>> QEMU 8.1.50 monitor - type 'help' for more information
>>>>> (qemu) migrate_set_capability x-ignore-shared on
>>>>> (qemu) migrate_set_parameter mode cpr-reboot
>>>>> (qemu) migrate -d file:vm.state
>>>>> (qemu) info status
>>>>> VM status: paused (postmigrate)
>>>>> (qemu) quit
>>>>>
>>>>> $ qemu-system-$arch -monitor stdio -incoming defer ...
>>>>> QEMU 8.1.50 monitor - type 'help' for more information
>>>>> (qemu) migrate_set_capability x-ignore-shared on
>>>>> (qemu) migrate_set_parameter mode cpr-reboot
>>>>> (qemu) migrate_incoming file:vm.state
>>>>> (qemu) info status
>>>>> VM status: running
>>>>>
>>>>> In this mode, the migrate command saves state to a file, allowing one
>>>>> to quit qemu, reboot to an updated kernel, and restart an updated version
>>>>> of qemu.  The caller must specify a migration URI that writes to and reads
>>>>> from a file.  Unlike normal mode, the use of certain local storage options
>>>>> does not block the migration, but the caller must not modify guest block
>>>>> devices between the quit and restart.  The guest RAM memory-backend must
>>>>> be shared, and the @x-ignore-shared migration capability must be set,
>>>>> to avoid saving RAM to the file.  Guest RAM must be non-volatile across
>>>>> reboot, such as by backing it with a dax device, but this is not enforced.
>>>>> The restarted qemu arguments must match those used to initially start qemu,
>>>>> plus the -incoming option.
>>>>>
>>>>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
>>>>> ---
>>>>>  qapi/migration.json | 16 +++++++++++++++-
>>>>>  1 file changed, 15 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/qapi/migration.json b/qapi/migration.json
>>>>> index 184fb78..2d862fa 100644
>>>>> --- a/qapi/migration.json
>>>>> +++ b/qapi/migration.json
>>>>> @@ -620,9 +620,23 @@
>>>>>  #
>>>>>  # @normal: the original form of migration. (since 8.2)
>>>>>  #
>>>>> +# @cpr-reboot: The migrate command saves state to a file, allowing one to
>>>>> +#              quit qemu, reboot to an updated kernel, and restart an updated
>>>>> +#              version of qemu.  The caller must specify a migration URI
>>>>> +#              that writes to and reads from a file.  Unlike normal mode,
>>>>> +#              the use of certain local storage options does not block the
>>>>> +#              migration, but the caller must not modify guest block devices
>>>>> +#              between the quit and restart.  The guest RAM memory-backend
>>>>> +#              must be shared, and the @x-ignore-shared migration capability
>>>>> +#              must be set, to avoid saving it to the file.  Guest RAM must
>>>>> +#              be non-volatile across reboot, such as by backing it with
>>>>> +#              a dax device, but this is not enforced.  The restarted qemu
>>>>> +#              arguments must match those used to initially start qemu, plus
>>>>> +#              the -incoming option. (since 8.2)
>>>>
>>>> What happens if someone migrates with non-shared memory, or without
>>>> ignore-shared?  Is it only because it'll be slow saving and loading?
>>>>
>>>> If that's required, we should fail the mode set if (1) non-shared memory is
>>>> used, or (2) x-ignore-shared is not enabled.  But I had a feeling it's the
>>>> other way round.
>>>
>>> Juan also asked me to clarify this.  I plan to resubmit this:
>>>
>>> #                                        ...  Private guest RAM is saved in
>>> #              the file.  To avoid this cost, the guest RAM memory-backend
>>> #              must be shared, and the @x-ignore-shared migration capability
>>> #              must be set.  ...
> 
> Okay.  We can also avoid mentioning "private guest RAM is saved to ..."
> because that's what migration already does.  IMO we can simplify all that
> to:
> 
>   It is suggested to use share memory with x-ignore-shared when using this
>   mode

OK, I'll massage it some more.  I think we should explicitly warn about the
cost of saving all memory.

>>>> Reading the whole series, if it's so far all about "local storage", why
>>>> "cpr-reboot"?  Why not "local" or "local storage" as the name?
>>>
>>> The use case is about rebooting and updating the host, so reboot is in 
>>> the name.  Local storage just happens to be allowed for it.
>>>
>>>> I had a feeling that this patchset mixed a lot of higher level use case
>>>> into the mode definition.  IMHO we should provide clear definition of each
>>>> mode on what it does.  It's so far not so clear to me, even if I kind of
>>>> know what you plan to do.
>>>
>>> I believe I already have, in the cover letter, commit message, and qapi 
>>> definition, at the start of each:
>>>
>>> # @cpr-reboot: The migrate command saves state to a file, allowing one to
>>> #              quit qemu, reboot to an updated kernel, and restart an updated
>>> #              version of qemu.
> 
> I think this is why I'm confused: above sentence is describing a very
> generic migration to file scenario to me.  IOW, I think I can get the same
> result described even with normal migration to file, or am I wrong?

cpr-reboot has fewer blockers than normal migration to a file URI.  Most
importantly, the presence of vfio devices will not block it as long as
the guest is suspended.  That functionality is implemented in the patch
"vfio: allow cpr-reboot migration if suspended" in the V9 series.

I suppose we could use the presence of a "file URI" as the criteria for relaxing 
blockers, and eliminate cpr-reboot mode.  However, by making the mode explicit,
we can add mode-based options such as '-only-migratable <mode>'.

If we decide to delete the explicit reboot mode, I still need to add MigMode and 
per-mode blockers when I submit cpr-exec mode.

> IMHO the description here needs to explain the difference and when an user
> should use this mode.  I think the real answer resides in your whole set,
> I'll try to read that.
> 
> In all cases, can we name it something like "live-upgrade" v.s. "normal"?

I like cpr because it is a short and unique identifier for functions, types, 
variables, and user-visible tokens.  It reduces line wrapping and makes the code
more readable, IMO.

- Steve

>>> The cover letter hints at the cpr-exec use case, and the long V9 patch series
>>> describes it, and I will make sure the use case comes first when I submit cpr-exec,
>>> which is:
>>     * restart an updated version of qemu     (I buried the lead - steve)
>>>   * much shorter guest downtime than cpr reboot
>>>   * support vfio without requiring guest suspension
>>>   * keep certain character devices alive
>>>
>>>> I tried again google what CPR is for and found this:
>>>>
>>>> https://lists.gnu.org/archive/html/qemu-devel/2020-07/msg08541.html
>>>>
>>>> I also prefer spell it out, at least make it clear on what that means..  I
>>>> didn't even see "Checkpoint/restart" words mentioned anywhere in this
>>>> patchset.
>>>
>>> Will do.
>>>
>>>> Besides: do you have a tree somewhere for the whole set of latest CPR work?
>>>
>>> I have the V9 patch series:
>>>   https://lore.kernel.org/qemu-devel/1658851843-236870-1-git-send-email-steven.sistare@oracle.com
>>> and I can re-send my proposal for breaking it down into patch sets that I presented in the
>>> qemu community meeting, if you did not save it.
> 
> No need to resend.  A link is exactly what I need; git tree even better.
> I'll comment when I get something when reading that.
> 
> Thanks,
>
Juan Quintela Oct. 24, 2023, 11:13 a.m. UTC | #10
Steven Sistare <steven.sistare@oracle.com> wrote:
> On 10/20/2023 3:40 PM, Juan Quintela wrote:
>> Steven Sistare <steven.sistare@oracle.com> wrote:
>>> On 10/20/2023 5:45 AM, Juan Quintela wrote:
>>>> Steve Sistare <steven.sistare@oracle.com> wrote:
>>>>> Add the cpr-reboot migration mode.  Usage:
>>>>>
>>>>> $ qemu-system-$arch -monitor stdio ...
>>>>> QEMU 8.1.50 monitor - type 'help' for more information
>>>>> (qemu) migrate_set_capability x-ignore-shared on
>>>>> (qemu) migrate_set_parameter mode cpr-reboot
>>>>> (qemu) migrate -d file:vm.state
>>>>> (qemu) info status
>>>>> VM status: paused (postmigrate)
>>>>> (qemu) quit
>>>>>
>>>>> $ qemu-system-$arch -monitor stdio -incoming defer ...
>>>>> QEMU 8.1.50 monitor - type 'help' for more information
>>>>> (qemu) migrate_set_capability x-ignore-shared on
>>>>> (qemu) migrate_set_parameter mode cpr-reboot
>>>>> (qemu) migrate_incoming file:vm.state
>>>>> (qemu) info status
>>>>> VM status: running
>>>>>
>>>>> In this mode, the migrate command saves state to a file, allowing one
>>>>> to quit qemu, reboot to an updated kernel, and restart an updated version
>>>>> of qemu.  The caller must specify a migration URI that writes to and reads
>>>>> from a file.  Unlike normal mode, the use of certain local storage options
>>>>> does not block the migration, but the caller must not modify guest block
>>>>> devices between the quit and restart.  The guest RAM memory-backend must
>>>>> be shared, and the @x-ignore-shared migration capability must be set,
>>>>> to avoid saving RAM to the file.  Guest RAM must be non-volatile across
>>>>> reboot, such as by backing it with a dax device, but this is not enforced.
>>>>> The restarted qemu arguments must match those used to initially start qemu,
>>>>> plus the -incoming option.
>>>>
>>>> Please, add this message to doc/<somewhere> instead (or additionally) to
>>>> the commit log.
>>>>
>>>>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
>>>>> ---
>>>>>  qapi/migration.json | 16 +++++++++++++++-
>>>>>  1 file changed, 15 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/qapi/migration.json b/qapi/migration.json
>>>>> index 184fb78..2d862fa 100644
>>>>> --- a/qapi/migration.json
>>>>> +++ b/qapi/migration.json
>>>>> @@ -620,9 +620,23 @@
>>>>>  #
>>>>>  # @normal: the original form of migration. (since 8.2)
>>>>>  #
>>>>> +# @cpr-reboot: The migrate command saves state to a file, allowing one to
>>>>> +#              quit qemu, reboot to an updated kernel, and restart an updated
>>>>> +#              version of qemu.  The caller must specify a migration URI
>>>>> +#              that writes to and reads from a file.  Unlike normal mode,
>>>>> +#              the use of certain local storage options does not block the
>>>>> +#              migration, but the caller must not modify guest block devices
>>>>> +#              between the quit and restart.  The guest RAM memory-backend
>>>>> +#              must be shared, and the @x-ignore-shared migration capability
>>>>> +#              must be set, to avoid saving it to the file.  Guest RAM must
>>>>> +#              be non-volatile across reboot, such as by backing it with
>>>>> +#              a dax device, but this is not enforced.  The restarted qemu
>>>>> +#              arguments must match those used to initially start qemu, plus
>>>>> +#              the -incoming option. (since 8.2)
>>>>> +#
>>>>>  ##
>>>>>  { 'enum': 'MigMode',
>>>>> -  'data': [ 'normal' ] }
>>>>> +  'data': [ 'normal', 'cpr-reboot' ] }
>>>>>  
>>>>>  ##
>>>>>  # @BitmapMigrationBitmapAliasTransform:
>>>>
>>>> It only works with file backend, and we don't have any check for that.
>>>> Wondering how to add that check.
>>>
>>> Actually, it works for other backends, but the ram contents are saved in the
>>> state file, which is slower. I should spell that out in the json comment and
>>> in the commit message.
>> 
>> Thanks.
>>>
>>>> Additionally, you are not adding a migration test that does exactly what
>>>> you put there in the comment.
>>>
>>> I provide tests/avocado/cpr.py in the original long series.  Would you
>>> like me to add it to this series, or post it later?  Would you prefer I
>>> add a test to tests/qtest/migration-test.c?
>> 
>> test/qtest/migration-test.c
>> 
>> please.
>> 
>> Something simple like what you say in the commit should be a good start.
>
> I wrote a new test which I will submit with V2 of this series.  Turned out to be
> easy after Fabiano provided test_file_common.
>
> +static void *test_mode_reboot_start(QTestState *from, QTestState *to)
> +{
> +    migrate_set_parameter_str(from, "mode", "cpr-reboot");
> +    migrate_set_parameter_str(to, "mode", "cpr-reboot");
> +
> +    migrate_set_capability(from, "x-ignore-shared", true);
> +    migrate_set_capability(to, "x-ignore-shared", true);
> +
> +    return NULL;
> +}
> +
> +static void test_mode_reboot(void)
> +{
> +    g_autofree char *uri = g_strdup_printf("file:%s/%s", tmpfs,
> +                                           FILE_TEST_FILENAME);
> +    MigrateCommon args = {
> +        .start.use_shmem = true,
> +        .connect_uri = uri,
> +        .listen_uri = "defer",
> +        .start_hook = test_mode_reboot_start
> +    };
> +
> +    test_file_common(&args, true);
> +}
>
> - Steve

As a started, that sounds good.
When you add more features, please add tests acordingly.

Thanks again, Juan.
diff mbox series

Patch

diff --git a/qapi/migration.json b/qapi/migration.json
index 184fb78..2d862fa 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -620,9 +620,23 @@ 
 #
 # @normal: the original form of migration. (since 8.2)
 #
+# @cpr-reboot: The migrate command saves state to a file, allowing one to
+#              quit qemu, reboot to an updated kernel, and restart an updated
+#              version of qemu.  The caller must specify a migration URI
+#              that writes to and reads from a file.  Unlike normal mode,
+#              the use of certain local storage options does not block the
+#              migration, but the caller must not modify guest block devices
+#              between the quit and restart.  The guest RAM memory-backend
+#              must be shared, and the @x-ignore-shared migration capability
+#              must be set, to avoid saving it to the file.  Guest RAM must
+#              be non-volatile across reboot, such as by backing it with
+#              a dax device, but this is not enforced.  The restarted qemu
+#              arguments must match those used to initially start qemu, plus
+#              the -incoming option. (since 8.2)
+#
 ##
 { 'enum': 'MigMode',
-  'data': [ 'normal' ] }
+  'data': [ 'normal', 'cpr-reboot' ] }
 
 ##
 # @BitmapMigrationBitmapAliasTransform: