[PULL,0/4] machine development tool

Message ID 20240304135145.154860-1-davydov-max@yandex-team.ru
State New

Pull-request

https://gitlab.com/davydov-max/qemu.git tags/pull-compare-mt-2024-03-04

Message

Maksim Davydov March 4, 2024, 1:51 p.m. UTC
The following changes since commit e1007b6bab5cf97705bf4f2aaec1f607787355b8:

  Merge tag 'pull-request-2024-03-01' of https://gitlab.com/thuth/qemu into staging (2024-03-01 10:14:32 +0000)

are available in the Git repository at:

  https://gitlab.com/davydov-max/qemu.git tags/pull-compare-mt-2024-03-04

for you to fetch changes up to 7693a2e8518811a907d73a85807ee71dac8fabcb:

  scripts: add script to compare compatibility properties (2024-03-04 14:10:53 +0300)

----------------------------------------------------------------
Please note. This is the first pull request from me.
My public GPG key is available here
https://keys.openpgp.org/vks/v1/by-fingerprint/CDB5BEEF8837142579F5CDFE8E927E10F72F78D4

----------------------------------------------------------------
scripts: add a new script for machine development

----------------------------------------------------------------

Maksim Davydov (4):
  qom: add default value
  qmp: add dump machine type compatibility properties
  python/qemu/machine: add method to retrieve QEMUMachine::binary field
  scripts: add script to compare compatibility properties

 MAINTAINERS                      |   5 +
 hw/core/machine-qmp-cmds.c       |  23 +-
 python/qemu/machine/machine.py   |   5 +
 qapi/machine.json                |  69 ++++-
 qom/qom-qmp-cmds.c               |   1 +
 scripts/compare-machine-types.py | 486 +++++++++++++++++++++++++++++++
 tests/qtest/fuzz/qos_fuzz.c      |   2 +-
 7 files changed, 586 insertions(+), 5 deletions(-)
 create mode 100755 scripts/compare-machine-types.py

Comments

Peter Maydell March 5, 2024, 1:49 p.m. UTC | #1
On Mon, 4 Mar 2024 at 13:52, Maksim Davydov <davydov-max@yandex-team.ru> wrote:
>
> The following changes since commit e1007b6bab5cf97705bf4f2aaec1f607787355b8:
>
>   Merge tag 'pull-request-2024-03-01' of https://gitlab.com/thuth/qemu into staging (2024-03-01 10:14:32 +0000)
>
> are available in the Git repository at:
>
>   https://gitlab.com/davydov-max/qemu.git tags/pull-compare-mt-2024-03-04
>
> for you to fetch changes up to 7693a2e8518811a907d73a85807ee71dac8fabcb:
>
>   scripts: add script to compare compatibility properties (2024-03-04 14:10:53 +0300)
>
> ----------------------------------------------------------------
> Please note. This is the first pull request from me.
> My public GPG key is available here
> https://keys.openpgp.org/vks/v1/by-fingerprint/CDB5BEEF8837142579F5CDFE8E927E10F72F78D4
>
> ----------------------------------------------------------------
> scripts: add a new script for machine development
>
> ----------------------------------------------------------------

Hi; I would prefer this to go through some existing submaintainer
tree if possible, please.

thanks
-- PMM
Markus Armbruster March 5, 2024, 2:43 p.m. UTC | #2
Peter Maydell <peter.maydell@linaro.org> writes:

> On Mon, 4 Mar 2024 at 13:52, Maksim Davydov <davydov-max@yandex-team.ru> wrote:
>>
>> The following changes since commit e1007b6bab5cf97705bf4f2aaec1f607787355b8:
>>
>>   Merge tag 'pull-request-2024-03-01' of https://gitlab.com/thuth/qemu into staging (2024-03-01 10:14:32 +0000)
>>
>> are available in the Git repository at:
>>
>>   https://gitlab.com/davydov-max/qemu.git tags/pull-compare-mt-2024-03-04
>>
>> for you to fetch changes up to 7693a2e8518811a907d73a85807ee71dac8fabcb:
>>
>>   scripts: add script to compare compatibility properties (2024-03-04 14:10:53 +0300)
>>
>> ----------------------------------------------------------------
>> Please note. This is the first pull request from me.
>> My public GPG key is available here
>> https://keys.openpgp.org/vks/v1/by-fingerprint/CDB5BEEF8837142579F5CDFE8E927E10F72F78D4
>>
>> ----------------------------------------------------------------
>> scripts: add a new script for machine development
>>
>> ----------------------------------------------------------------
>
> Hi; I would prefer this to go through some existing submaintainer
> tree if possible, please.

Migration?  QOM?  Not sure.  Cc'ing the maintainers anyway.
Peter Xu March 6, 2024, 1:57 a.m. UTC | #3
On Tue, Mar 05, 2024 at 03:43:41PM +0100, Markus Armbruster wrote:
> Peter Maydell <peter.maydell@linaro.org> writes:
> 
> > On Mon, 4 Mar 2024 at 13:52, Maksim Davydov <davydov-max@yandex-team.ru> wrote:
> >>
> >> The following changes since commit e1007b6bab5cf97705bf4f2aaec1f607787355b8:
> >>
> >>   Merge tag 'pull-request-2024-03-01' of https://gitlab.com/thuth/qemu into staging (2024-03-01 10:14:32 +0000)
> >>
> >> are available in the Git repository at:
> >>
> >>   https://gitlab.com/davydov-max/qemu.git tags/pull-compare-mt-2024-03-04
> >>
> >> for you to fetch changes up to 7693a2e8518811a907d73a85807ee71dac8fabcb:
> >>
> >>   scripts: add script to compare compatibility properties (2024-03-04 14:10:53 +0300)
> >>
> >> ----------------------------------------------------------------
> >> Please note. This is the first pull request from me.
> >> My public GPG key is available here
> >> https://keys.openpgp.org/vks/v1/by-fingerprint/CDB5BEEF8837142579F5CDFE8E927E10F72F78D4
> >>
> >> ----------------------------------------------------------------
> >> scripts: add a new script for machine development
> >>
> >> ----------------------------------------------------------------
> >
> > Hi; I would prefer this to go through some existing submaintainer
> > tree if possible, please.
> 
> Migration?  QOM?  Not sure.  Cc'ing the maintainers anyway.

Yeah, this seems migration-relevant.  However, I'm now slightly confused
about when this script would be useful.

According to:

https://lore.kernel.org/qemu-devel/20240222153912.646053-5-davydov-max@yandex-team.ru/

        This script runs QEMU to obtain compat_props of machines and
        default values of different types of drivers to produce comparison
        table. This table can be used to compare machine types to choose
        the most suitable machine or compare binaries to be sure that
        migration to the newer version will save all device
        properties. Also the json or csv format of this table can be used
        to check does a new machine affect the previous ones by comparing
        tables with and without the new machine.

In regards to "choose the most suitable machine": why do you need to choose
a machine?

If it's a fairly standalone setup, shouldn't we always try to use the latest
machine type if possible (as compat props are normally only used to stay
compatible with old machine types, and the default should always be
preferred)?  If it's a cluster setup, IMHO it should depend on the oldest
QEMU version that is planned to be supported.  I don't yet see how such a
comparison helps in either case.

In regards to "compare binaries to be sure that migration to the newer
version will save all device properties": do we even support migrating
_between_ machine types??

Relying solely on compat properties to detect device compatibility is also
not reliable.  For example, see VMStateField.field_exists() or, similarly,
VMStateDescription.needed(), which are hooks that a device can provide to
decide dynamically what device state is saved/loaded.  Such things are
not reflected in compat properties, so even if the compat properties of all
devices are the same between two machine types, it does not mean that the
migration stream will always be compatible.
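
To make the point concrete, here is a toy model in Python (not QEMU code; the device layout, the "extra" subsection and the property name are all made up for illustration). Two devices carry identical properties, yet a needed()-style hook that looks at runtime state makes them emit different streams:

```python
def serialize(device):
    """Emit the main section plus any subsection whose needed() hook says yes."""
    stream = [("main", device["state"]["main"])]
    for name, needed in device["subsections"].items():
        if needed(device):
            stream.append((name, device["state"][name]))
    return stream


def make_device(extra_in_use):
    # Both devices carry identical properties; only runtime state differs.
    return {
        "props": {"some-prop": "on"},  # hypothetical property
        "state": {"main": 1, "extra": 2, "extra_in_use": extra_in_use},
        "subsections": {"extra": lambda dev: dev["state"]["extra_in_use"]},
    }


idle, busy = make_device(False), make_device(True)
print(idle["props"] == busy["props"])  # same properties
print(serialize(idle))                 # main section only
print(serialize(busy))                 # main section plus "extra"
```

A comparison of properties alone would report these two devices as equal, which is why it cannot stand in for a check of the stream itself.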

Thanks,
Maksim Davydov March 7, 2024, 9:06 a.m. UTC | #4
On 3/6/24 04:57, Peter Xu wrote:
> On Tue, Mar 05, 2024 at 03:43:41PM +0100, Markus Armbruster wrote:
>> Peter Maydell<peter.maydell@linaro.org>  writes:
>>
>>> On Mon, 4 Mar 2024 at 13:52, Maksim Davydov<davydov-max@yandex-team.ru>  wrote:
>>>> The following changes since commit e1007b6bab5cf97705bf4f2aaec1f607787355b8:
>>>>
>>>>    Merge tag 'pull-request-2024-03-01' of https://gitlab.com/thuth/qemu into staging (2024-03-01 10:14:32 +0000)
>>>>
>>>> are available in the Git repository at:
>>>>
>>>>    https://gitlab.com/davydov-max/qemu.git  tags/pull-compare-mt-2024-03-04
>>>>
>>>> for you to fetch changes up to 7693a2e8518811a907d73a85807ee71dac8fabcb:
>>>>
>>>>    scripts: add script to compare compatibility properties (2024-03-04 14:10:53 +0300)
>>>>
>>>> ----------------------------------------------------------------
>>>> Please note. This is the first pull request from me.
>>>> My public GPG key is available here
>>>> https://keys.openpgp.org/vks/v1/by-fingerprint/CDB5BEEF8837142579F5CDFE8E927E10F72F78D4
>>>>
>>>> ----------------------------------------------------------------
>>>> scripts: add a new script for machine development
>>>>
>>>> ----------------------------------------------------------------
>>> Hi; I would prefer this to go through some existing submaintainer
>>> tree if possible, please.
>> Migration?  QOM?  Not sure.  Cc'ing the maintainers anyway.
> Yeah this seems like migration relevant.. however now I'm slightly confused
> on when this script should be useful.
>
> According to:
>
> https://lore.kernel.org/qemu-devel/20240222153912.646053-5-davydov-max@yandex-team.ru/
>
>          This script runs QEMU to obtain compat_props of machines and
>          default values of different types of drivers to produce comparison
>          table. This table can be used to compare machine types to choose
>          the most suitable machine or compare binaries to be sure that
>          migration to the newer version will save all device
>          properties. Also the json or csv format of this table can be used
>          to check does a new machine affect the previous ones by comparing
>          tables with and without the new machine.
>
> In regards to "choose the most suitable machine": why do you need to choose
> a machine?
>
> If it's pretty standalone setup, shouldn't we always try to use the latest
> machine type if possible (as normally compat props are only used to keep
> compatible with old machine types, and the default should always be
> preferred). If it's a cluster setup, IMHO it should depend on the oldest
> QEMU version that plans to be supported.  I don't see how such comparison
> helps yet in either of the cases..
>
> In regards to "compare binaries to be sure that migration to the newer
> version will save all device properties": do we even support migrating
> _between_ machine types??
>
> Sololy relying on compat properties to detect device compatibility is also
> not reliable.  For example, see VMStateField.field_exists() or similarly,
> VMStateDescription.needed(), which are hooks that device can provide to
> dynamically decide what device state to be saved/loaded.  Such things are
> not reflected in compat properties, so even if compat properties of all
> devices are the same between two machine types, it may not mean that the
> migration stream will always be compatible.
>
> Thanks,

In fact, the last commit describes the meaning of this series best. Perhaps
it should have been moved to the cover letter:

Often, many teams have their own "machines" inherited from "upstream" to
manage default values of devices. This is done because of the limitations
imposed by the compatibility requirements or the desire to help their
customers better configure their devices. And since machine type has
a hard-to-read structure, it is very easy to make a mistake when
transferring default values from one machine to another. For example, when
some property is set for the entire abstract class x86_64-cpu (which will be
applied to all models), and then rolled back for a specific model. The
situation is about the same with changing the default values of devices: if
the value changes in the description of the device itself, then you need to
make sure that nothing changes when using the current machine.

Therefore, there was a desire to make a dev tool that will help quickly
expand the definition of a machine or compare several machines from
different binary files. It can be used when rebasing to a new version of
qemu and/or for automatic tests.
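
As a rough illustration of the rollback case, here is a toy model in Python. It is not the script itself: the type hierarchy, the CPU model names and the property name are made up, and it assumes that entries are applied in list order to every type that matches, which simplifies what QEMU really does.

```python
# Toy model: parent of each type. All names below are illustrative,
# not taken from a real QEMU binary.
PARENTS = {
    "x86_64-cpu": None,
    "Haswell-x86_64-cpu": "x86_64-cpu",
    "Skylake-x86_64-cpu": "x86_64-cpu",
}


def is_a(type_name, driver):
    """Return True if type_name is driver or inherits from it."""
    while type_name is not None:
        if type_name == driver:
            return True
        type_name = PARENTS[type_name]
    return False


def effective_props(type_name, compat_props):
    """Apply (driver, property, value) entries in order; later entries win."""
    result = {}
    for driver, prop, value in compat_props:
        if is_a(type_name, driver):
            result[prop] = value
    return result


# A property set for the whole abstract class, then rolled back for one model.
machine_props = [
    ("x86_64-cpu", "some-feature", "off"),
    ("Haswell-x86_64-cpu", "some-feature", "on"),
]

for cpu in ("Haswell-x86_64-cpu", "Skylake-x86_64-cpu"):
    print(cpu, effective_props(cpu, machine_props))
```

Reading the raw list, it is easy to miss that one model ends up with a different value than its siblings; expanding the definition per type makes that visible.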
Peter Xu March 8, 2024, 3:47 a.m. UTC | #5
On Thu, Mar 07, 2024 at 12:06:59PM +0300, Maksim Davydov wrote:
> 
> On 3/6/24 04:57, Peter Xu wrote:
> > On Tue, Mar 05, 2024 at 03:43:41PM +0100, Markus Armbruster wrote:
> > > Peter Maydell<peter.maydell@linaro.org>  writes:
> > > 
> > > > On Mon, 4 Mar 2024 at 13:52, Maksim Davydov<davydov-max@yandex-team.ru>  wrote:
> > > > > The following changes since commit e1007b6bab5cf97705bf4f2aaec1f607787355b8:
> > > > > 
> > > > > >    Merge tag 'pull-request-2024-03-01' of https://gitlab.com/thuth/qemu into staging (2024-03-01 10:14:32 +0000)
> > > > > 
> > > > > are available in the Git repository at:
> > > > > 
> > > > >    https://gitlab.com/davydov-max/qemu.git  tags/pull-compare-mt-2024-03-04
> > > > > 
> > > > > for you to fetch changes up to 7693a2e8518811a907d73a85807ee71dac8fabcb:
> > > > > 
> > > > >    scripts: add script to compare compatibility properties (2024-03-04 14:10:53 +0300)
> > > > > 
> > > > > ----------------------------------------------------------------
> > > > > Please note. This is the first pull request from me.
> > > > > My public GPG key is available here
> > > > > https://keys.openpgp.org/vks/v1/by-fingerprint/CDB5BEEF8837142579F5CDFE8E927E10F72F78D4
> > > > > 
> > > > > ----------------------------------------------------------------
> > > > > scripts: add a new script for machine development
> > > > > 
> > > > > ----------------------------------------------------------------
> > > > Hi; I would prefer this to go through some existing submaintainer
> > > > tree if possible, please.
> > > Migration?  QOM?  Not sure.  Cc'ing the maintainers anyway.
> > Yeah this seems like migration relevant.. however now I'm slightly confused
> > on when this script should be useful.
> > 
> > According to:
> > 
> > https://lore.kernel.org/qemu-devel/20240222153912.646053-5-davydov-max@yandex-team.ru/
> > 
> >          This script runs QEMU to obtain compat_props of machines and
> >          default values of different types of drivers to produce comparison
> >          table. This table can be used to compare machine types to choose
> >          the most suitable machine or compare binaries to be sure that
> >          migration to the newer version will save all device
> >          properties. Also the json or csv format of this table can be used
> >          to check does a new machine affect the previous ones by comparing
> >          tables with and without the new machine.
> > 
> > In regards to "choose the most suitable machine": why do you need to choose
> > a machine?
> > 
> > If it's pretty standalone setup, shouldn't we always try to use the latest
> > machine type if possible (as normally compat props are only used to keep
> > compatible with old machine types, and the default should always be
> > preferred). If it's a cluster setup, IMHO it should depend on the oldest
> > QEMU version that plans to be supported.  I don't see how such comparison
> > helps yet in either of the cases..
> > 
> > In regards to "compare binaries to be sure that migration to the newer
> > version will save all device properties": do we even support migrating
> > _between_ machine types??
> > 
> > Sololy relying on compat properties to detect device compatibility is also
> > not reliable.  For example, see VMStateField.field_exists() or similarly,
> > VMStateDescription.needed(), which are hooks that device can provide to
> > dynamically decide what device state to be saved/loaded.  Such things are
> > not reflected in compat properties, so even if compat properties of all
> > devices are the same between two machine types, it may not mean that the
> > migration stream will always be compatible.
> > 
> > Thanks,
> 
> In fact, the last commit describes the meaning of this series best. Perhaps
> it should have been moved to the cover letter:
> Often, many teams have their own "machines" inherited from "upstream" to
> manage default values of devices. This is done because of the limitations
> imposed by the compatibility requirements or the desire to help their
> customers better configure their devices. And since machine type has
> a hard-to-read structure, it is very easy to make a mistake when
> transferring
> default values from one machine to another. For example, when some property
> is set for the entire abstract class x86_64-cpu (which will be applied to
> all
> models), and then rolled back for a specific model. The situation is about
> the same with changing the default values of devices: if the value changes
> in the description of the device itself, then you need to make sure that
> nothing changes when using the current machine.
> Therefore, there was a desire to make a dev tool that will help quickly
> expand
> the definition of a machine or compare several machines from different
> binary
> files. It can be used when rebasing to a new version of qemu and/or for
> automatic tests.

OK, thanks.

So is it a migration compatibility issue that you care about (migrating VMs
from your old downstream binary to the new downstream binary should always
succeed), or do you care more about the features you want, i.e., making
sure that the specific devices you care about have the properties that you
expect?

I think compat properties are mostly used for migration purposes, but
indeed they can also be used to keep old behaviors of devices, even if the
migration could succeed with or without such a compat property entry.

If it's about migration, I'd like to know whether vmstate-static-checker.py
(under scripts/) could also help your case, perhaps in a better way,
because it directly observes the VMSD structures (which are the ultimate
form on the wire, after all these compat properties have been applied to the
devices).

If it's not about migration, then maybe it's more QOM-relevant, and if so I
don't have a strong opinion. It still seems to make some sense to have a
tool that simply dumps the QOM tree for a machine type with all properties
and compares it between machines and binaries.  I'll leave that to
Markus to decide.

Btw, I tried to apply the patches and build, but failed:

In file included from ../qapi/qapi-schema.json:70:
../qapi/machine.json:224: text required after 'Example:'
[40/2810] Generating trace/trace-hw_ide.h with a custom command
[41/2810] Generating trace/trace-hw_isa.h with a custom command
[42/2810] Generating trace/trace-hw_intc.c with a custom command
[43/2810] Generating trace/trace-hw_mem.h with a custom command
[44/2810] Generating trace/trace-hw_isa.c with a custom command
[45/2810] Generating trace/trace-hw_intc.h with a custom command
[46/2810] Generating trace/trace-hw_mem.c with a custom command
ninja: build stopped: subcommand failed.
make: *** [Makefile:162: run-ninja] Error 1

There also seems to be an assumption in the script that QEMU is built under
"build/".

+default_qemu_binary = 'build/qemu-system-x86_64'
Vladimir Sementsov-Ogievskiy March 18, 2024, 5:08 p.m. UTC | #6
On 08.03.24 06:47, Peter Xu wrote:
> On Thu, Mar 07, 2024 at 12:06:59PM +0300, Maksim Davydov wrote:
>>
>> On 3/6/24 04:57, Peter Xu wrote:
>>> On Tue, Mar 05, 2024 at 03:43:41PM +0100, Markus Armbruster wrote:
>>>> Peter Maydell<peter.maydell@linaro.org>  writes:
>>>>
>>>>> On Mon, 4 Mar 2024 at 13:52, Maksim Davydov<davydov-max@yandex-team.ru>  wrote:
>>>>>> The following changes since commit e1007b6bab5cf97705bf4f2aaec1f607787355b8:
>>>>>>
>>>>>>     Merge tag 'pull-request-2024-03-01' of https://gitlab.com/thuth/qemu into staging (2024-03-01 10:14:32 +0000)
>>>>>>
>>>>>> are available in the Git repository at:
>>>>>>
>>>>>>     https://gitlab.com/davydov-max/qemu.git  tags/pull-compare-mt-2024-03-04
>>>>>>
>>>>>> for you to fetch changes up to 7693a2e8518811a907d73a85807ee71dac8fabcb:
>>>>>>
>>>>>>     scripts: add script to compare compatibility properties (2024-03-04 14:10:53 +0300)
>>>>>>
>>>>>> ----------------------------------------------------------------
>>>>>> Please note. This is the first pull request from me.
>>>>>> My public GPG key is available here
>>>>>> https://keys.openpgp.org/vks/v1/by-fingerprint/CDB5BEEF8837142579F5CDFE8E927E10F72F78D4
>>>>>>
>>>>>> ----------------------------------------------------------------
>>>>>> scripts: add a new script for machine development
>>>>>>
>>>>>> ----------------------------------------------------------------
>>>>> Hi; I would prefer this to go through some existing submaintainer
>>>>> tree if possible, please.
>>>> Migration?  QOM?  Not sure.  Cc'ing the maintainers anyway.
>>> Yeah this seems like migration relevant.. however now I'm slightly confused
>>> on when this script should be useful.
>>>
>>> According to:
>>>
>>> https://lore.kernel.org/qemu-devel/20240222153912.646053-5-davydov-max@yandex-team.ru/
>>>
>>>           This script runs QEMU to obtain compat_props of machines and
>>>           default values of different types of drivers to produce comparison
>>>           table. This table can be used to compare machine types to choose
>>>           the most suitable machine or compare binaries to be sure that
>>>           migration to the newer version will save all device
>>>           properties. Also the json or csv format of this table can be used
>>>           to check does a new machine affect the previous ones by comparing
>>>           tables with and without the new machine.
>>>
>>> In regards to "choose the most suitable machine": why do you need to choose
>>> a machine?
>>>
>>> If it's pretty standalone setup, shouldn't we always try to use the latest
>>> machine type if possible (as normally compat props are only used to keep
>>> compatible with old machine types, and the default should always be
>>> preferred). If it's a cluster setup, IMHO it should depend on the oldest
>>> QEMU version that plans to be supported.  I don't see how such comparison
>>> helps yet in either of the cases..
>>>
>>> In regards to "compare binaries to be sure that migration to the newer
>>> version will save all device properties": do we even support migrating
>>> _between_ machine types??
>>>
>>> Sololy relying on compat properties to detect device compatibility is also
>>> not reliable.  For example, see VMStateField.field_exists() or similarly,
>>> VMStateDescription.needed(), which are hooks that device can provide to
>>> dynamically decide what device state to be saved/loaded.  Such things are
>>> not reflected in compat properties, so even if compat properties of all
>>> devices are the same between two machine types, it may not mean that the
>>> migration stream will always be compatible.
>>>
>>> Thanks,
>>
>> In fact, the last commit describes the meaning of this series best. Perhaps
>> it should have been moved to the cover letter:
>> Often, many teams have their own "machines" inherited from "upstream" to
>> manage default values of devices. This is done because of the limitations
>> imposed by the compatibility requirements or the desire to help their
>> customers better configure their devices. And since machine type has
>> a hard-to-read structure, it is very easy to make a mistake when
>> transferring
>> default values from one machine to another. For example, when some property
>> is set for the entire abstract class x86_64-cpu (which will be applied to
>> all
>> models), and then rolled back for a specific model. The situation is about
>> the same with changing the default values of devices: if the value changes
>> in the description of the device itself, then you need to make sure that
>> nothing changes when using the current machine.
>> Therefore, there was a desire to make a dev tool that will help quickly
>> expand
>> the definition of a machine or compare several machines from different
>> binary
>> files. It can be used when rebasing to a new version of qemu and/or for
>> automatic tests.
> 
> OK, thanks.
> 
> So is it a migration compatibility issue that you care (migrating VMs from
> your old downstream binary to new downstream binary should always succeed),
> or perhaps you care more on making sure the features you wanted, i.e., you
> want to make sure some specific devices that you care will have the
> properties that you expect?

Actually both things.

1. We need a tool to analyze what exactly changes between MTs: whether we want to move to a new upstream MT or not, how much it differs from our downstream MT, and so on.
2. It could also be used to check that a new MT is defined correctly (without breaking old MTs).
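
Both uses can be sketched in a few lines of Python. This is only an illustration under assumptions: the table layout (machine -> {(driver, property): value}), the machine names and the property are hypothetical and do not reflect the real output format of the script.

```python
def diff_machines(table, left, right):
    """Return {(driver, prop): (left_value, right_value)} for differing entries."""
    a, b = table[left], table[right]
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(a.keys() | b.keys())
        if a.get(key) != b.get(key)
    }


def changed_machines(before, after):
    """List machines present in both tables whose properties differ."""
    return [m for m in sorted(before.keys() & after.keys())
            if before[m] != after[m]]


# Hypothetical data: machine -> {(driver, property): value}.
before = {
    "mt-old": {("virtio-blk-device", "some-prop"): "off"},
}
after = {
    "mt-old": {("virtio-blk-device", "some-prop"): "off"},
    "mt-new": {("virtio-blk-device", "some-prop"): "on"},
}

# Use 1: what exactly differs between two MTs.
print(diff_machines(after, "mt-old", "mt-new"))
# Use 2: did adding the new MT change any existing one? Empty list means no.
print(changed_machines(before, after))
```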

> 
> I think compat properties are mostly used for migration purposes, but
> indeed it can also be used to keep old behaviors of devices, even if the
> migration could succed with/without such a compat property entry.
> 
> If it's about migration, I'd like to know whether vmstate-static-checker.py
> could also help your case (under scripts/), perhaps in a better way,
> because it directly observes the VMSD structures (which is the ultimate
> form on wire, after all these compat properties applied to the devices).

Hmm, vmstate-static-checker.py checks a concrete device configuration. So it's a different thing.

> 
> If it's not about migration, then maybe it's more QOM-relevant, and if so I
> don't have a strong opinion. It seems still make some sense to have a tool
> simply dump the QOM tree for a machine type with all properties and compare
> them between machines with some binaries.  For that I'll leave that to
> Markus to decide.

Markus ACKed :)

> 
> Btw, I tried to apply the patches and build, but failed:
> 
> In file included from ../qapi/qapi-schema.json:70:
> ../qapi/machine.json:224: text required after 'Example:'
> [40/2810] Generating trace/trace-hw_ide.h with a custom command
> [41/2810] Generating trace/trace-hw_isa.h with a custom command
> [42/2810] Generating trace/trace-hw_intc.c with a custom command
> [43/2810] Generating trace/trace-hw_mem.h with a custom command
> [44/2810] Generating trace/trace-hw_isa.c with a custom command
> [45/2810] Generating trace/trace-hw_intc.h with a custom command
> [46/2810] Generating trace/trace-hw_mem.c with a custom command
> ninja: build stopped: subcommand failed.
> make: *** [Makefile:162: run-ninja] Error 1
> 

The series missed a change in the QAPI documentation requirements. As I see it, we need 4-space indentation for the Example text.
Max, could you fix this and resend as patches? We also have to replace "Since: 9.0" with "Since: 9.1".


> There also seems to have an assumption that QEMU is built under "build/" in
> the script.
> 
> +default_qemu_binary = 'build/qemu-system-x86_64'
> 

I think it's not a problem for now. Could be changed later if needed.
Maksim Davydov March 18, 2024, 9 p.m. UTC | #7
On 3/8/24 06:47, Peter Xu wrote:
> On Thu, Mar 07, 2024 at 12:06:59PM +0300, Maksim Davydov wrote:
>> On 3/6/24 04:57, Peter Xu wrote:
>>> On Tue, Mar 05, 2024 at 03:43:41PM +0100, Markus Armbruster wrote:
>>>> Peter Maydell<peter.maydell@linaro.org>  writes:
>>>>
>>>>> On Mon, 4 Mar 2024 at 13:52, Maksim Davydov<davydov-max@yandex-team.ru>  wrote:
>>>>>> The following changes since commit e1007b6bab5cf97705bf4f2aaec1f607787355b8:
>>>>>>
>>>>>>     Merge tag 'pull-request-2024-03-01' of https://gitlab.com/thuth/qemu into staging (2024-03-01 10:14:32 +0000)
>>>>>>
>>>>>> are available in the Git repository at:
>>>>>>
>>>>>>     https://gitlab.com/davydov-max/qemu.git  tags/pull-compare-mt-2024-03-04
>>>>>>
>>>>>> for you to fetch changes up to 7693a2e8518811a907d73a85807ee71dac8fabcb:
>>>>>>
>>>>>>     scripts: add script to compare compatibility properties (2024-03-04 14:10:53 +0300)
>>>>>>
>>>>>> ----------------------------------------------------------------
>>>>>> Please note. This is the first pull request from me.
>>>>>> My public GPG key is available here
>>>>>> https://keys.openpgp.org/vks/v1/by-fingerprint/CDB5BEEF8837142579F5CDFE8E927E10F72F78D4
>>>>>>
>>>>>> ----------------------------------------------------------------
>>>>>> scripts: add a new script for machine development
>>>>>>
>>>>>> ----------------------------------------------------------------
>>>>> Hi; I would prefer this to go through some existing submaintainer
>>>>> tree if possible, please.
>>>> Migration?  QOM?  Not sure.  Cc'ing the maintainers anyway.
>>> Yeah this seems like migration relevant.. however now I'm slightly confused
>>> on when this script should be useful.
>>>
>>> According to:
>>>
>>> https://lore.kernel.org/qemu-devel/20240222153912.646053-5-davydov-max@yandex-team.ru/
>>>
>>>           This script runs QEMU to obtain compat_props of machines and
>>>           default values of different types of drivers to produce comparison
>>>           table. This table can be used to compare machine types to choose
>>>           the most suitable machine or compare binaries to be sure that
>>>           migration to the newer version will save all device
>>>           properties. Also the json or csv format of this table can be used
>>>           to check does a new machine affect the previous ones by comparing
>>>           tables with and without the new machine.
>>>
>>> In regards to "choose the most suitable machine": why do you need to choose
>>> a machine?
>>>
>>> If it's pretty standalone setup, shouldn't we always try to use the latest
>>> machine type if possible (as normally compat props are only used to keep
>>> compatible with old machine types, and the default should always be
>>> preferred). If it's a cluster setup, IMHO it should depend on the oldest
>>> QEMU version that plans to be supported.  I don't see how such comparison
>>> helps yet in either of the cases..
>>>
>>> In regards to "compare binaries to be sure that migration to the newer
>>> version will save all device properties": do we even support migrating
>>> _between_ machine types??
>>>
>>> Sololy relying on compat properties to detect device compatibility is also
>>> not reliable.  For example, see VMStateField.field_exists() or similarly,
>>> VMStateDescription.needed(), which are hooks that device can provide to
>>> dynamically decide what device state to be saved/loaded.  Such things are
>>> not reflected in compat properties, so even if compat properties of all
>>> devices are the same between two machine types, it may not mean that the
>>> migration stream will always be compatible.
>>>
>>> Thanks,
>> In fact, the last commit describes the meaning of this series best. Perhaps
>> it should have been moved to the cover letter:
>> Often, many teams have their own "machines" inherited from "upstream" to
>> manage default values of devices. This is done because of the limitations
>> imposed by the compatibility requirements or the desire to help their
>> customers better configure their devices. And since machine type has
>> a hard-to-read structure, it is very easy to make a mistake when
>> transferring
>> default values from one machine to another. For example, when some property
>> is set for the entire abstract class x86_64-cpu (which will be applied to
>> all
>> models), and then rolled back for a specific model. The situation is about
>> the same with changing the default values of devices: if the value changes
>> in the description of the device itself, then you need to make sure that
>> nothing changes when using the current machine.
>> Therefore, there was a desire to make a dev tool that will help quickly
>> expand
>> the definition of a machine or compare several machines from different
>> binary
>> files. It can be used when rebasing to a new version of qemu and/or for
>> automatic tests.
> OK, thanks.
>
> So is it a migration compatibility issue that you care about (migrating VMs
> from your old downstream binary to new downstream binary should always
> succeed), or do you care more about making sure of the features you wanted,
> i.e., you want to make sure some specific devices that you care about will
> have the properties that you expect?
>
> I think compat properties are mostly used for migration purposes, but
> indeed they can also be used to keep old behaviors of devices, even if the
> migration could succeed with/without such a compat property entry.
>
> If it's about migration, I'd like to know whether vmstate-static-checker.py
> could also help your case (under scripts/), perhaps in a better way,
> because it directly observes the VMSD structures (which is the ultimate
> form on wire, after all these compat properties applied to the devices).
>
> If it's not about migration, then maybe it's more QOM-relevant, and if so I
> don't have a strong opinion. It still seems to make some sense to have a tool
> that simply dumps the QOM tree for a machine type with all properties and
> compares them between machines with some binaries.  For that I'll leave it to
> Markus to decide.
>
> Btw, I tried to apply the patches and build, but failed:
>
> In file included from ../qapi/qapi-schema.json:70:
> ../qapi/machine.json:224: text required after 'Example:'
> [40/2810] Generating trace/trace-hw_ide.h with a custom command
> [41/2810] Generating trace/trace-hw_isa.h with a custom command
> [42/2810] Generating trace/trace-hw_intc.c with a custom command
> [43/2810] Generating trace/trace-hw_mem.h with a custom command
> [44/2810] Generating trace/trace-hw_isa.c with a custom command
> [45/2810] Generating trace/trace-hw_intc.h with a custom command
> [46/2810] Generating trace/trace-hw_mem.c with a custom command
> ninja: build stopped: subcommand failed.
> make: *** [Makefile:162: run-ninja] Error 1
>
> There also seems to be an assumption in the script that QEMU is built under
> "build/".
>
> +default_qemu_binary = 'build/qemu-system-x86_64'
>
Sorry for the late response.
This is only the default value; the script has an option, `--qemu-binary`,
to redefine the path to the binary.
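
For illustration, a default path that an option can override might look like
the minimal sketch below. Only default_qemu_binary and `--qemu-binary` come
from the thread; the use of argparse, the single-path argument and the
parse_args() helper are assumptions made for this example, not the script's
actual code.

```python
import argparse

# Default quoted from the patch; it assumes QEMU was built under "build/".
default_qemu_binary = 'build/qemu-system-x86_64'


def parse_args(argv=None):
    """Hypothetical helper: parse the command line for the binary path.

    Falls back to default_qemu_binary when --qemu-binary is not given.
    """
    parser = argparse.ArgumentParser(
        description='Sketch of a default binary path with an override.')
    parser.add_argument(
        '--qemu-binary',
        default=default_qemu_binary,
        help='path to the QEMU binary (default: %(default)s)')
    return parser.parse_args(argv)
```

With no arguments, parse_args([]).qemu_binary is the in-tree default; passing
`--qemu-binary` with another path replaces it, so a build directory other
than "build/" only needs that one option.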
Peter Xu March 18, 2024, 11:27 p.m. UTC | #8
On Mon, Mar 18, 2024 at 08:08:29PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> On 08.03.24 06:47, Peter Xu wrote:
> > On Thu, Mar 07, 2024 at 12:06:59PM +0300, Maksim Davydov wrote:
> > > 
> > > On 3/6/24 04:57, Peter Xu wrote:
> > > > On Tue, Mar 05, 2024 at 03:43:41PM +0100, Markus Armbruster wrote:
> > > > > Peter Maydell <peter.maydell@linaro.org> writes:
> > > > > 
> > > > > > On Mon, 4 Mar 2024 at 13:52, Maksim Davydov <davydov-max@yandex-team.ru> wrote:
> > > > > > > The following changes since commit e1007b6bab5cf97705bf4f2aaec1f607787355b8:
> > > > > > > 
> > > > > > > >     Merge tag 'pull-request-2024-03-01' of https://gitlab.com/thuth/qemu into staging (2024-03-01 10:14:32 +0000)
> > > > > > > 
> > > > > > > are available in the Git repository at:
> > > > > > > 
> > > > > > > >     https://gitlab.com/davydov-max/qemu.git tags/pull-compare-mt-2024-03-04
> > > > > > > 
> > > > > > > for you to fetch changes up to 7693a2e8518811a907d73a85807ee71dac8fabcb:
> > > > > > > 
> > > > > > >     scripts: add script to compare compatibility properties (2024-03-04 14:10:53 +0300)
> > > > > > > 
> > > > > > > ----------------------------------------------------------------
> > > > > > > Please note. This is the first pull request from me.
> > > > > > > My public GPG key is available here
> > > > > > > https://keys.openpgp.org/vks/v1/by-fingerprint/CDB5BEEF8837142579F5CDFE8E927E10F72F78D4
> > > > > > > 
> > > > > > > ----------------------------------------------------------------
> > > > > > > scripts: add a new script for machine development
> > > > > > > 
> > > > > > > ----------------------------------------------------------------
> > > > > > Hi; I would prefer this to go through some existing submaintainer
> > > > > > tree if possible, please.
> > > > > Migration?  QOM?  Not sure.  Cc'ing the maintainers anyway.
> > > > Yeah this seems like migration relevant.. however now I'm slightly confused
> > > > on when this script should be useful.
> > > > 
> > > > According to:
> > > > 
> > > > https://lore.kernel.org/qemu-devel/20240222153912.646053-5-davydov-max@yandex-team.ru/
> > > > 
> > > >           This script runs QEMU to obtain compat_props of machines and
> > > >           default values of different types of drivers to produce comparison
> > > >           table. This table can be used to compare machine types to choose
> > > >           the most suitable machine or compare binaries to be sure that
> > > >           migration to the newer version will save all device
> > > >           properties. Also the json or csv format of this table can be used
> > > >           to check does a new machine affect the previous ones by comparing
> > > >           tables with and without the new machine.
> > > > 
> > > > In regards to "choose the most suitable machine": why do you need to choose
> > > > a machine?
> > > > 
> > > > If it's pretty standalone setup, shouldn't we always try to use the latest
> > > > machine type if possible (as normally compat props are only used to keep
> > > > compatible with old machine types, and the default should always be
> > > > preferred). If it's a cluster setup, IMHO it should depend on the oldest
> > > > QEMU version that plans to be supported.  I don't see how such comparison
> > > > helps yet in either of the cases..
> > > > 
> > > > In regards to "compare binaries to be sure that migration to the newer
> > > > version will save all device properties": do we even support migrating
> > > > _between_ machine types??
> > > > 
> > > > Solely relying on compat properties to detect device compatibility is also
> > > > not reliable.  For example, see VMStateField.field_exists() or similarly,
> > > > VMStateDescription.needed(), which are hooks that device can provide to
> > > > dynamically decide what device state to be saved/loaded.  Such things are
> > > > not reflected in compat properties, so even if compat properties of all
> > > > devices are the same between two machine types, it may not mean that the
> > > > migration stream will always be compatible.
> > > > 
> > > > Thanks,
> > > 
> > > In fact, the last commit describes the meaning of this series best. Perhaps
> > > it should have been moved to the cover letter:
> > > Often, many teams have their own "machines" inherited from "upstream" to
> > > manage default values of devices. This is done because of the limitations
> > > imposed by the compatibility requirements or the desire to help their
> > > customers better configure their devices. And since machine type has
> > > a hard-to-read structure, it is very easy to make a mistake when transferring
> > > default values from one machine to another. For example, when some property
> > > is set for the entire abstract class x86_64-cpu (which will be applied to all
> > > models), and then rolled back for a specific model. The situation is about
> > > the same with changing the default values of devices: if the value changes
> > > in the description of the device itself, then you need to make sure that
> > > nothing changes when using the current machine.
> > > Therefore, there was a desire to make a dev tool that will help quickly expand
> > > the definition of a machine or compare several machines from different binary
> > > files. It can be used when rebasing to a new version of qemu and/or for
> > > automatic tests.
> > 
> > OK, thanks.
> > 
> > So is it a migration compatibility issue that you care about (migrating VMs
> > from your old downstream binary to new downstream binary should always
> > succeed), or do you care more about making sure of the features you wanted,
> > i.e., you want to make sure some specific devices that you care about will
> > have the properties that you expect?
> 
> Actually both things.
> 
> 1. We need a tool to analyze what exactly changes between MTs: whether we want to move to a new upstream MT, how much it differs from our downstream MT, and so on.
> 2. It could also be used to check that a new MT is correctly defined (not breaking old MTs).
> 
> > 
> > I think compat properties are mostly used for migration purposes, but
> > indeed they can also be used to keep old behaviors of devices, even if the
> > migration could succeed with/without such a compat property entry.
> > 
> > If it's about migration, I'd like to know whether vmstate-static-checker.py
> > could also help your case (under scripts/), perhaps in a better way,
> > because it directly observes the VMSD structures (which is the ultimate
> > form on wire, after all these compat properties applied to the devices).
> 
> Hmm, vmstate-static-checker.py checks a concrete device configuration. So it's a different thing.

I don't think so: 'qemu -dump-vmstate' should dump the state of every device
that the binary supports.  Feel free to have a look at
dump_vmstate_json_to_file(), or just try producing a dump.
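
As a rough illustration of comparing two such dumps, here is a minimal sketch.
It assumes only that each dump is a JSON object at the top level; load_dump()
and diff_top_level() are hypothetical helpers written for this example, and
the real checks (fields, versions, subsections) are what
scripts/vmstate-static-checker.py is for.

```python
import json


def load_dump(path):
    """Load a JSON file, e.g. one written by 'qemu -dump-vmstate <file>'."""
    with open(path, encoding='utf-8') as f:
        return json.load(f)


def diff_top_level(src, dst):
    """Compare two JSON objects by their top-level keys.

    Returns three sorted lists: keys only in src, keys only in dst, and
    keys present in both whose values differ.
    """
    only_in_src = sorted(set(src) - set(dst))
    only_in_dst = sorted(set(dst) - set(src))
    changed = sorted(k for k in set(src) & set(dst) if src[k] != dst[k])
    return only_in_src, only_in_dst, changed
```

Calling diff_top_level(load_dump(old_path), load_dump(new_path)) on dumps from
an old and a new binary gives a coarse first look at which entries appeared,
disappeared or changed; it says nothing about whether a change is
migration-safe.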

> 
> > 
> > If it's not about migration, then maybe it's more QOM-relevant, and if so I
> > don't have a strong opinion. It still seems to make some sense to have a tool
> > that simply dumps the QOM tree for a machine type with all properties and
> > compares them between machines with some binaries.  For that I'll leave it to
> > Markus to decide.
> 
> Markus ACKed :)

I haven't seen Markus ack all the patches yet, but if he has, that's okay.
Even so, I think what Peter Maydell suggested is that this series should
go through the QOM tree, rather than a separate pull.

Thanks,