[RFC,21/21] arm/cpu-features: Document custom vcpu model

Message ID	20241025101959.601048-22-eric.auger@redhat.com
State	New
Headers
Return-Path: <qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org>
From: Eric Auger <eric.auger@redhat.com>
To: eric.auger.pro@gmail.com, eric.auger@redhat.com, cohuck@redhat.com, qemu-devel@nongnu.org, qemu-arm@nongnu.org, kvmarm@lists.linux.dev, peter.maydell@linaro.org, richard.henderson@linaro.org, alex.bennee@linaro.org, maz@kernel.org, oliver.upton@linux.dev, sebott@redhat.com, shameerali.kolothum.thodi@huawei.com, armbru@redhat.com, berrange@redhat.com, abologna@redhat.com, jdenemar@redhat.com
Cc: shahuang@redhat.com, mark.rutland@arm.com, philmd@linaro.org, pbonzini@redhat.com
Subject: [RFC 21/21] arm/cpu-features: Document custom vcpu model
Date: Fri, 25 Oct 2024 12:17:40 +0200
Message-ID: <20241025101959.601048-22-eric.auger@redhat.com>
In-Reply-To: <20241025101959.601048-1-eric.auger@redhat.com>
References: <20241025101959.601048-1-eric.auger@redhat.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Received-SPF: pass client-ip=170.10.133.124; envelope-from=eric.auger@redhat.com; helo=us-smtp-delivery-124.mimecast.com
X-Spam_score_int: -23
X-Spam_score: -2.4
X-Spam_bar: --
X-Spam_report: (-2.4 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.263, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H3=0.001, RCVD_IN_MSPIKE_WL=0.001, RCVD_IN_VALIDITY_CERTIFIED_BLOCKED=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no
X-Spam_action: no action
Precedence: list
Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org
Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org
Series	kvm/arm: Introduce a customizable aarch64 KVM host model
[RFC,00/21] kvm/arm: Introduce a customizable aarch64 KVM host model
[RFC,01/21] kvm: kvm_get_writable_id_regs
[RFC,02/21] arm/cpu: Add sysreg definitions in cpu-sysegs.h
[RFC,03/21] arm/cpu: Store aa64isar0 into the idregs arrays
[RFC,04/21] arm/cpu: Store aa64isar1/2 into the idregs array
[RFC,05/21] arm/cpu: Store aa64drf0/1 into the idregs array
[RFC,06/21] arm/cpu: Store aa64mmfr0-3 into the idregs array
[RFC,07/21] arm/cpu: Store aa64drf0/1 into the idregs array
[RFC,08/21] arm/cpu: Store aa64smfr0 into the idregs array
[RFC,09/21] arm/cpu: Store id_isar0-7 into the idregs array
[RFC,10/21] arm/cpu: Store id_mfr0/1 into the idregs array
[RFC,11/21] arm/cpu: Store id_dfr0/1 into the idregs array
[RFC,12/21] arm/cpu: Store id_mmfr0-5 into the idregs array
[RFC,13/21] arm/cpu: Add infra to handle generated ID register definitions
[RFC,14/21] arm/cpu: Add sysreg generation scripts
[RFC,15/21] arm/cpu: Add generated files
[RFC,16/21] arm/kvm: Allow reading all the writable ID registers
[RFC,17/21] arm/kvm: write back modified ID regs to KVM
[RFC,18/21] arm/cpu: Introduce a customizable kvm host cpu model
[RFC,19/21] virt: Allow custom vcpu model in arm virt
[RFC,20/21] arm-qmp-cmds: introspection for custom model
[RFC,21/21] arm/cpu-features: Document custom vcpu model

Eric Auger Oct. 25, 2024, 10:17 a.m. UTC

From: Cornelia Huck <cohuck@redhat.com>

Add some documentation for the custom model.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
---
 docs/system/arm/cpu-features.rst | 55 +++++++++++++++++++++++++++-----
 1 file changed, 47 insertions(+), 8 deletions(-)

Daniel P. Berrangé Oct. 25, 2024, 1:13 p.m. UTC | #1

On Fri, Oct 25, 2024 at 12:17:40PM +0200, Eric Auger wrote:
> From: Cornelia Huck <cohuck@redhat.com>
> 
> Add some documentation for the custom model.
> 
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
> ---
>  docs/system/arm/cpu-features.rst | 55 +++++++++++++++++++++++++++-----
>  1 file changed, 47 insertions(+), 8 deletions(-)

> @@ -167,6 +196,16 @@ disabling many SVE vector lengths would be quite verbose, the ``sve<N>`` CPU
>  properties have special semantics (see "SVE CPU Property Parsing
>  Semantics").
>  
> +The ``custom`` CPU model needs to be configured via individual ID register
> +field properties, for example::
> +
> +  $ qemu-system-aarch64 -M virt -cpu custom,SYSREG_ID_AA64ISAR0_EL1_DP=0x0
> +
> +This forces ID_AA64ISAR0_EL1 DP field to 0.

What is the "baseline" featureset implied by 'custom' ?

On x86 we have the named CPU models each setting a baseline that matches
some corresponding real world silicon. Arm has that too, with TCG at
least. So that way you know what the baseline is that you're toggling
features against.

Experience on x86 was that making arbitrary feature changes on top of the
named models could often backfire, as there are too many scenarios where
code will check for feature "Y", and assume that existence of "Y" implies
existence of "A", "B" and "C" too. So if you invent custom models where
Y is set, but B is missing, there's decent risk of things going wrong in
horrible to debug ways.  With that in mind, best practice is to try to
just use the vanilla named CPU models to the greatest extent possible, and
keep feature toggling to an absolute minimum.  This 'custom' model does
not seem to give us such ability for arm.

With regards,
Daniel
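
The implication problem described above can be made concrete with a small consistency check. The following is a minimal Python sketch, not QEMU or libvirt code: the feature names "Y", "A", "B" and "C" are the placeholders used in this message, and the `IMPLIES` table and `missing_dependencies` helper are hypothetical names introduced only for illustration.

```python
# Hypothetical table: feature -> features that guest code tends to assume
# are present whenever that feature is present.
IMPLIES = {
    "Y": {"A", "B", "C"},
}


def missing_dependencies(enabled):
    """Return {feature: sorted list of implied features that are absent}."""
    enabled = set(enabled)
    problems = {}
    for feature in sorted(enabled):
        absent = IMPLIES.get(feature, set()) - enabled
        if absent:
            problems[feature] = sorted(absent)
    return problems


if __name__ == "__main__":
    # A custom model where Y is set but B is missing gets flagged.
    print(missing_dependencies({"Y", "A", "C"}))
```

A named model sidesteps this class of problem because its feature set is fixed as a whole; a check like this only matters once individual features are toggled.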

Eric Auger Oct. 25, 2024, 1:28 p.m. UTC | #2

Hi Daniel,

On 10/25/24 15:13, Daniel P. Berrangé wrote:
> On Fri, Oct 25, 2024 at 12:17:40PM +0200, Eric Auger wrote:
>> From: Cornelia Huck <cohuck@redhat.com>
>>
>> Add some documentation for the custom model.
>>
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
>> ---
>>  docs/system/arm/cpu-features.rst | 55 +++++++++++++++++++++++++++-----
>>  1 file changed, 47 insertions(+), 8 deletions(-)
>
>> @@ -167,6 +196,16 @@ disabling many SVE vector lengths would be quite verbose, the ``sve<N>`` CPU
>>  properties have special semantics (see "SVE CPU Property Parsing
>>  Semantics").
>>  
>> +The ``custom`` CPU model needs to be configured via individual ID register
>> +field properties, for example::
>> +
>> +  $ qemu-system-aarch64 -M virt -cpu custom,SYSREG_ID_AA64ISAR0_EL1_DP=0x0
>> +
>> +This forces ID_AA64ISAR0_EL1 DP field to 0.
> What is the "baseline" featureset implied by 'custom' ?
there is no baseline at the moment. By default this is a host
passthrough model.
>
> On x86 we have the named CPU models each setting a baseline that matches
> some corresponding real world silicon. Arm has that too, with TCG at
> least. So that way you know what the baseline is that you're toggling
> features against.
Having named models is the next thing. The custom vcpu model is not a named
model. But we don't want a TCG-like CPU model (such as A57) because we
want to be able to migrate between different machines, for example from
Ampere to NVIDIA or between different Ampere systems. So the baseline must
be something usable by both hosts.
>
> Experience on x86 was that making arbitrary feature changes on top of the
> named models could often backfire, as there are too many scenarios where
> code will check for feature "Y", and assume that existence of "Y" implies
> existence of "A", "B" and "C" too. So if you invent custom models where
> Y is set, but B is missing, there's decent risk of things going wrong in
> horrible to debug ways.  With that in mind, best practice is to try to
> just use the vanilla named CPU models to the greatest extent possible, and
> keep feature toggling to an absolute minimum.  This 'custom' model does
> not seem to give us such ability for arm.
The custom model is not yet a named model. It is rather something to
start this kind of discussion.
The code used by the custom vcpu model allows fine tuning of the ID reg
fields. This code could be reused by named models. We can also imagine
that libvirt implements the named models itself, i.e. hardcodes some ID reg
fields. Libvirt could identify which baseline the source and destination
are closest to, choose this baseline, and tune a few ID reg fields if
additional tuning is needed.

Thanks

Eric
>
> With regards,
> Daniel
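
The idea of a baseline "usable by both hosts" can be sketched as a field-wise minimum over an ID register. This is a minimal Python illustration under a loud assumption: every field is treated as an unsigned 4-bit value where a larger number means a superset of features. That holds for many AArch64 ID register fields but not all of them (some fields are signed, and some registers use other widths), so a real implementation would need per-field rules. The `common_baseline` name is hypothetical.

```python
FIELD_WIDTH = 4  # assumption: uniform unsigned 4-bit fields


def common_baseline(reg_a, reg_b):
    """Return the field-wise minimum of two 64-bit ID register values."""
    mask = (1 << FIELD_WIDTH) - 1
    result = 0
    for shift in range(0, 64, FIELD_WIDTH):
        a = (reg_a >> shift) & mask
        b = (reg_b >> shift) & mask
        result |= min(a, b) << shift
    return result


if __name__ == "__main__":
    # Invented register values for a source and a destination host.
    src = 0x0000_1000_0000_0120
    dst = 0x0000_0000_0000_0210
    print(hex(common_baseline(src, dst)))
```

The result only ever contains field values that both inputs can satisfy, which is the property a migration baseline needs under the stated assumption.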

Daniel P. Berrangé Oct. 25, 2024, 1:31 p.m. UTC | #3

On Fri, Oct 25, 2024 at 03:28:35PM +0200, Eric Auger wrote:
> Hi Daniel,
> 
> On 10/25/24 15:13, Daniel P. Berrangé wrote:
> > On Fri, Oct 25, 2024 at 12:17:40PM +0200, Eric Auger wrote:
> >> From: Cornelia Huck <cohuck@redhat.com>
> >>
> >> Add some documentation for the custom model.
> >>
> >> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> >> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
> >> ---
> >>  docs/system/arm/cpu-features.rst | 55 +++++++++++++++++++++++++++-----
> >>  1 file changed, 47 insertions(+), 8 deletions(-)
> >
> >> @@ -167,6 +196,16 @@ disabling many SVE vector lengths would be quite verbose, the ``sve<N>`` CPU
> >>  properties have special semantics (see "SVE CPU Property Parsing
> >>  Semantics").
> >>  
> >> +The ``custom`` CPU model needs to be configured via individual ID register
> >> +field properties, for example::
> >> +
> >> +  $ qemu-system-aarch64 -M virt -cpu custom,SYSREG_ID_AA64ISAR0_EL1_DP=0x0
> >> +
> >> +This forces ID_AA64ISAR0_EL1 DP field to 0.
> > What is the "baseline" featureset implied by 'custom' ?
> there is no baseline at the moment. By default this is a host
> passthrough model.

Why do we need to create "custom" at all, as opposed to just letting
users toggle features on "-cpu host" ? 

With regards,
Daniel

Cornelia Huck Oct. 28, 2024, 4:05 p.m. UTC | #4

On Fri, Oct 25 2024, Daniel P. Berrangé <berrange@redhat.com> wrote:

> On Fri, Oct 25, 2024 at 03:28:35PM +0200, Eric Auger wrote:
>> Hi Daniel,
>> 
>> On 10/25/24 15:13, Daniel P. Berrangé wrote:
>> > On Fri, Oct 25, 2024 at 12:17:40PM +0200, Eric Auger wrote:
>> >> From: Cornelia Huck <cohuck@redhat.com>
>> >>
>> >> Add some documentation for the custom model.
>> >>
>> >> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>> >> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
>> >> ---
>> >>  docs/system/arm/cpu-features.rst | 55 +++++++++++++++++++++++++++-----
>> >>  1 file changed, 47 insertions(+), 8 deletions(-)
>> >
>> >> @@ -167,6 +196,16 @@ disabling many SVE vector lengths would be quite verbose, the ``sve<N>`` CPU
>> >>  properties have special semantics (see "SVE CPU Property Parsing
>> >>  Semantics").
>> >>  
>> >> +The ``custom`` CPU model needs to be configured via individual ID register
>> >> +field properties, for example::
>> >> +
>> >> +  $ qemu-system-aarch64 -M virt -cpu custom,SYSREG_ID_AA64ISAR0_EL1_DP=0x0
>> >> +
>> >> +This forces ID_AA64ISAR0_EL1 DP field to 0.
>> > What is the "baseline" featureset implied by 'custom' ?
>> there is no baseline at the moment. By default this is a host
>> passthrough model.
>
> Why do we need to create "custom" at all, as opposed to just letting
> users toggle features on "-cpu host" ? 

We could consolidate that to the current "host" model, once we figure
out how to handle the currently already existing properties. Models
based on the different architecture extensions would probably be more
useable in the long run; maybe "custom" has a place for testing.

Daniel P. Berrangé Oct. 28, 2024, 4:09 p.m. UTC | #5

On Mon, Oct 28, 2024 at 05:05:44PM +0100, Cornelia Huck wrote:
> On Fri, Oct 25 2024, Daniel P. Berrangé <berrange@redhat.com> wrote:
> 
> > On Fri, Oct 25, 2024 at 03:28:35PM +0200, Eric Auger wrote:
> >> Hi Daniel,
> >> 
> >> On 10/25/24 15:13, Daniel P. Berrangé wrote:
> >> > On Fri, Oct 25, 2024 at 12:17:40PM +0200, Eric Auger wrote:
> >> >> From: Cornelia Huck <cohuck@redhat.com>
> >> >>
> >> >> Add some documentation for the custom model.
> >> >>
> >> >> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> >> >> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
> >> >> ---
> >> >>  docs/system/arm/cpu-features.rst | 55 +++++++++++++++++++++++++++-----
> >> >>  1 file changed, 47 insertions(+), 8 deletions(-)
> >> >
> >> >> @@ -167,6 +196,16 @@ disabling many SVE vector lengths would be quite verbose, the ``sve<N>`` CPU
> >> >>  properties have special semantics (see "SVE CPU Property Parsing
> >> >>  Semantics").
> >> >>  
> >> >> +The ``custom`` CPU model needs to be configured via individual ID register
> >> >> +field properties, for example::
> >> >> +
> >> >> +  $ qemu-system-aarch64 -M virt -cpu custom,SYSREG_ID_AA64ISAR0_EL1_DP=0x0
> >> >> +
> >> >> +This forces ID_AA64ISAR0_EL1 DP field to 0.
> >> > What is the "baseline" featureset implied by 'custom' ?
> >> there is no baseline at the moment. By default this is a host
> >> passthrough model.
> >
> > Why do we need to create "custom" at all, as opposed to just letting
> > users toggle features on "-cpu host" ? 
> 
> We could consolidate that to the current "host" model, once we figure
> out how to handle the currently already existing properties. Models
> based on the different architecture extensions would probably be more
> useable in the long run; maybe "custom" has a place for testing.

If you can set the features against "host", then any testing could
be done with "host" surely, making 'custom' pointless ?

With regards,
Daniel

Cornelia Huck Oct. 28, 2024, 4:29 p.m. UTC | #6

On Mon, Oct 28 2024, Daniel P. Berrangé <berrange@redhat.com> wrote:

> On Mon, Oct 28, 2024 at 05:05:44PM +0100, Cornelia Huck wrote:
>> On Fri, Oct 25 2024, Daniel P. Berrangé <berrange@redhat.com> wrote:
>> 
>> > On Fri, Oct 25, 2024 at 03:28:35PM +0200, Eric Auger wrote:
>> >> Hi Daniel,
>> >> 
>> >> On 10/25/24 15:13, Daniel P. Berrangé wrote:
>> >> > On Fri, Oct 25, 2024 at 12:17:40PM +0200, Eric Auger wrote:
>> >> >> From: Cornelia Huck <cohuck@redhat.com>
>> >> >>
>> >> >> Add some documentation for the custom model.
>> >> >>
>> >> >> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>> >> >> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
>> >> >> ---
>> >> >>  docs/system/arm/cpu-features.rst | 55 +++++++++++++++++++++++++++-----
>> >> >>  1 file changed, 47 insertions(+), 8 deletions(-)
>> >> >
>> >> >> @@ -167,6 +196,16 @@ disabling many SVE vector lengths would be quite verbose, the ``sve<N>`` CPU
>> >> >>  properties have special semantics (see "SVE CPU Property Parsing
>> >> >>  Semantics").
>> >> >>  
>> >> >> +The ``custom`` CPU model needs to be configured via individual ID register
>> >> >> +field properties, for example::
>> >> >> +
>> >> >> +  $ qemu-system-aarch64 -M virt -cpu custom,SYSREG_ID_AA64ISAR0_EL1_DP=0x0
>> >> >> +
>> >> >> +This forces ID_AA64ISAR0_EL1 DP field to 0.
>> >> > What is the "baseline" featureset implied by 'custom' ?
>> >> there is no baseline at the moment. By default this is a host
>> >> passthrough model.
>> >
>> > Why do we need to create "custom" at all, as opposed to just letting
>> > users toggle features on "-cpu host" ? 
>> 
>> We could consolidate that to the current "host" model, once we figure
>> out how to handle the currently already existing properties. Models
>> based on the different architecture extensions would probably be more
>> useable in the long run; maybe "custom" has a place for testing.
>
> If you can set the features against "host", then any testing could
> be done with "host" surely, making 'custom' pointless ?

We might differentiate between "do some consistency checks" and "allow
a completely weird wolpertinger"; if we agree that we don't need it,
then we surely could drop it again.

Kashyap Chamarthy Oct. 28, 2024, 9:17 p.m. UTC | #7

On Fri, Oct 25, 2024 at 12:17:40PM +0200, Eric Auger wrote:
> From: Cornelia Huck <cohuck@redhat.com>
> 
> Add some documentation for the custom model.
> 
> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
> ---
>  docs/system/arm/cpu-features.rst | 55 +++++++++++++++++++++++++++-----
>  1 file changed, 47 insertions(+), 8 deletions(-)
> 
> diff --git a/docs/system/arm/cpu-features.rst b/docs/system/arm/cpu-features.rst
> index a5fb929243..962a2c6c26 100644
> --- a/docs/system/arm/cpu-features.rst
> +++ b/docs/system/arm/cpu-features.rst
> @@ -2,7 +2,10 @@ Arm CPU Features

[...]

> +Using the ``host`` type means the guest is provided all the same CPU
> +features as the host CPU type has.  And, for this reason, the ``host``
> +CPU type should enable all CPU features that the host has by default.
> +
> +In case some features need to be hidden to the guest, ``custom`` model
> +shall be used instead. This is especially useful for migration purpose.
> +
> +The ``custom`` CPU model generally is the better choice if you want more
> +flexibility or stability across different machines or with different kernel
> +versions. 

Does "more flexibility or stability across different machines" also
imply "live migration compatibility across host CPUs"?

> However, even the ``custom`` CPU model will not allow configuring
> +an arbitrary set of features; the ID registers must describe a subset of the
> +host's features, and all differences to the host's configuration must actually
> +be supported by the kernel to be deconfigured.

[...]

> +The ``custom`` CPU model needs to be configured via individual ID register
> +field properties, for example::
> +
> +  $ qemu-system-aarch64 -M virt -cpu custom,SYSREG_ID_AA64ISAR0_EL1_DP=0x0

If possible, it would be really helpful (and user-friendly) to be able
to specify the CPU feature names as you see under /proc/cpuinfo, and be
able to turn the flags on or off:
    
        -M virt -cpu franken,rndr=on,ts=on,fhm=off

(... instead of specifying long system register IDs that group together
a bunch of CPU features.  If I understand it correctly, the register
"ID_AA64ISAR0_EL1" maps to a set of visible features listed here:
https://docs.kernel.org/arch/arm64/cpu-feature-registers.html)
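
To illustrate how one register groups several features, here is a minimal Python sketch that reads and clears individual fields of an ID_AA64ISAR0_EL1 value and builds a property string in the `SYSREG_<register>_<field>=<value>` form used by the patch. The bit positions for RNDR, TS, FHM and DP are given as I understand them from the Arm architecture documentation and should be checked against it; the helper names are hypothetical, and only four fields are listed.

```python
# Least significant bit of selected 4-bit fields in ID_AA64ISAR0_EL1
# (assumed layout, see the lead-in).
ID_AA64ISAR0_EL1_LSB = {
    "RNDR": 60,
    "TS": 52,
    "FHM": 48,
    "DP": 44,
}


def get_field(reg, name):
    """Extract a 4-bit field from the register value."""
    return (reg >> ID_AA64ISAR0_EL1_LSB[name]) & 0xF


def set_field(reg, name, value):
    """Return the register value with the named 4-bit field replaced."""
    lsb = ID_AA64ISAR0_EL1_LSB[name]
    return (reg & ~(0xF << lsb)) | ((value & 0xF) << lsb)


def cpu_property(name, value):
    """Format a property in the style shown in the patch."""
    return f"SYSREG_ID_AA64ISAR0_EL1_{name}={value:#x}"


if __name__ == "__main__":
    reg = (0x2 << 52) | (0x1 << 44)  # invented value: TS=2, DP=1
    print(get_field(reg, "DP"))
    print(hex(set_field(reg, "DP", 0)))
    print("-cpu custom," + cpu_property("DP", 0))
```

A friendlier `rndr=on`-style interface would amount to a lookup table from feature names to (register, field, value) triples layered on top of helpers like these.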


Next, I prefix the below by noting that I wrote it before seeing
Cornelia's reply that the name "custom" is not set in stone:
https://lists.nongnu.org/archive/html/qemu-arm/2024-10/msg00987.html.

I wonder if the word "custom" is starting to get overloaded; on x86:

  - Libvirt itself uses the term "custom" this way, to quote its
    documentation[1] for the 'custom' XML attribute:

      custom
    
      In this mode, the 'cpu' element describes the CPU that should be
      presented to the guest. This is the default when no 'mode'
      attribute is specified. This mode makes it so that a persistent
      guest will see the same hardware no matter what host the guest is
      booted on.

  - Some management tools also follow libvirt and use the term "custom"
    to refer to one of two things, (a) a specific named CPU model that
    libvirt and QEMU recognize, e.g. "Cascadelake-Server"; or (b) a
    named CPU model + extra CPU flags, e.g. this is how OpenStack
    uses[2] "custom" to configure CPU models, and flags that can be
    enabled or disabled via "+" or "-":

      [libvirt]
      cpu_mode = custom
      cpu_model = IvyBridge-IBRS
      cpu_model_extra_flags="ss,+vmx,-pcid [...]"

    (Note the "cpu_mode" there: it is referring to the three possible
    modes that libvirt and QEMU support today: 'host-passthrough',
    'host-model', and named CPU models via "custom".)

    The above config translates to this QEMU command-line:

        -cpu IvyBridge-IBRS,ss=on,vmx=on,pcid=off [...]
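
The translation from the `cpu_model_extra_flags` syntax to the QEMU `-cpu` argument can be sketched as follows. This is a minimal Python illustration of the mapping described above, not code from Nova or libvirt; the function name `flags_to_qemu` is hypothetical, and a bare flag is assumed to mean "enable", as the example above suggests.

```python
def flags_to_qemu(model, extra_flags):
    """Turn 'ss,+vmx,-pcid' style flags into a QEMU -cpu argument."""
    props = []
    for flag in extra_flags.split(","):
        flag = flag.strip()
        if not flag:
            continue
        # A leading '-' disables the flag; '+' or no prefix enables it.
        state = "off" if flag.startswith("-") else "on"
        props.append(f"{flag.lstrip('+-')}={state}")
    return ",".join([model] + props)


if __name__ == "__main__":
    print("-cpu " + flags_to_qemu("IvyBridge-IBRS", "ss,+vmx,-pcid"))
```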

Now if QEMU introduces "custom", it is likely to create some confusion.
But luckily, as referenced above, it is open to change. :)

    * * *

FWIW, I agree with Dan here[3] that it would cause less future pain if
Arm's named CPU models also decide on a "baseline that matches some
corresponding real world silicon".  I've experienced plenty of such
debugging pain in x86-land from years of troubleshooting live migration
bugs involving CPU model (in)compatibility.  (Often, with help from
DanPB and Jiri Denemark).

[1] https://libvirt.org/formatdomain.html#cpu-model-and-topology
[2] https://docs.openstack.org/nova/latest/admin/cpu-models.html#cpu-modes
[3] https://lists.nongnu.org/archive/html/qemu-arm/2024-10/msg00888.html
    — [RFC 21/21] arm/cpu-features: Document custom vcpu model

[...]

Kashyap Chamarthy Oct. 31, 2024, 12:24 p.m. UTC | #8

On Mon, Oct 28, 2024 at 05:29:11PM +0100, Cornelia Huck wrote:
> On Mon, Oct 28 2024, Daniel P. Berrangé <berrange@redhat.com> wrote:

[...]

> >> We could consolidate that to the current "host" model, once we figure
> >> out how to handle the currently already existing properties. Models
> >> based on the different architecture extensions would probably be more
> >> useable in the long run; maybe "custom" has a place for testing.
> >
> > If you can set the features against "host", then any testing could
> > be done with "host" surely, making 'custom' pointless ?
> 
> We might differentiate between "do some consistency checks" and "allow
> a completely weird wolpertinger"; if we agree that we don't need it,
> then we surely could drop it again.

Yeah, FWIW, I agree that it's best to drop "custom" if all the
meaningful tests can be handled by being able to add/remove CPU flags
from `-cpu host`.

Related: I don't see any mention of `-cpu max` here.  Is it not
relevant?  It is currently defined as: "enables all features supported
by the accelerator in the current host".  Does it make sense for `max`
to allow disabling features?  Or is the idea that, why would you choose
`-cpu max` if you want to disable features?  In that case, go with
either:

    -cpu host,feat1=off

Or:

    -cpu some_future_named_model,$feat1=off

?

Peter Maydell Oct. 31, 2024, 12:59 p.m. UTC | #9

On Thu, 31 Oct 2024 at 12:24, Kashyap Chamarthy <kchamart@redhat.com> wrote:
>
> On Mon, Oct 28, 2024 at 05:29:11PM +0100, Cornelia Huck wrote:
> > On Mon, Oct 28 2024, Daniel P. Berrangé <berrange@redhat.com> wrote:
>
> [...]
>
> > >> We could consolidate that to the current "host" model, once we figure
> > >> out how to handle the currently already existing properties. Models
> > >> based on the different architecture extensions would probably be more
> > >> useable in the long run; maybe "custom" has a place for testing.
> > >
> > > If you can set the features against "host", then any testing could
> > > be done with "host" surely, making 'custom' pointless ?
> >
> > We might differentiate between "do some consistency checks" and "allow
> > a completely weird wolpertinger"; if we agree that we don't need it,
> > then we surely could drop it again.
>
> Yeah, FWIW, I agree that it's best to drop "custom" if all the
> meaningful tests can be handled by being able to add/remove CPU flags
> from `-cpu host`.
>
>
> Related: I don't see any mention of `-cpu max` here.  Is it not
> relevant?  It is currently defined as: "enables all features supported
> by the accelerator in the current host".  Does it make sense for `max`
> to allow disabling features?  Or is the idea that, why would you choose
> `-cpu max` if you want to disable features?

Ideally, disabling features would work with any '-cpu' option
(including our existing TCG CPUs). (The main reason for 'max'
is as an option that works whether you're using TCG or KVM.)

-- PMM

Eric Auger Nov. 4, 2024, 2:45 p.m. UTC | #10

Hi

On 10/28/24 17:09, Daniel P. Berrangé wrote:
> On Mon, Oct 28, 2024 at 05:05:44PM +0100, Cornelia Huck wrote:
>> On Fri, Oct 25 2024, Daniel P. Berrangé <berrange@redhat.com> wrote:
>>
>>> On Fri, Oct 25, 2024 at 03:28:35PM +0200, Eric Auger wrote:
>>>> Hi Daniel,
>>>>
>>>> On 10/25/24 15:13, Daniel P. Berrangé wrote:
>>>>> On Fri, Oct 25, 2024 at 12:17:40PM +0200, Eric Auger wrote:
>>>>>> From: Cornelia Huck <cohuck@redhat.com>
>>>>>>
>>>>>> Add some documentation for the custom model.
>>>>>>
>>>>>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>>>>> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
>>>>>> ---
>>>>>>  docs/system/arm/cpu-features.rst | 55 +++++++++++++++++++++++++++-----
>>>>>>  1 file changed, 47 insertions(+), 8 deletions(-)
>>>>>> @@ -167,6 +196,16 @@ disabling many SVE vector lengths would be quite verbose, the ``sve<N>`` CPU
>>>>>>  properties have special semantics (see "SVE CPU Property Parsing
>>>>>>  Semantics").
>>>>>>  
>>>>>> +The ``custom`` CPU model needs to be configured via individual ID register
>>>>>> +field properties, for example::
>>>>>> +
>>>>>> +  $ qemu-system-aarch64 -M virt -cpu custom,SYSREG_ID_AA64ISAR0_EL1_DP=0x0
>>>>>> +
>>>>>> +This forces ID_AA64ISAR0_EL1 DP field to 0.
>>>>> What is the "baseline" featureset implied by 'custom' ?
>>>> there is no baseline at the moment. By default this is a host
>>>> passthrough model.
>>> Why do we need to create "custom" at all, as opposed to just letting
>>> users toggle features on "-cpu host" ? 
>> We could consolidate that to the current "host" model, once we figure
>> out how to handle the currently already existing properties. Models
>> based on the different architecture extensions would probably be more
>> useable in the long run; maybe "custom" has a place for testing.
> If you can set the features against "host", then any testing could
> be done with "host" surely, making 'custom' pointless ?
Yeah I do agree that we may not need to introduce this "custom" model
but could just enhance the existing host model with the capability to tweak
some features. For instance we have the case where migration between 2
Ampere systems fails with the host model but passes if you tweak 1 field in
CTR_EL0. So I think in itself this modality can be useful. Same for
debug/test purposes. As mentioned in the cover letter the number of
writable ID regs continues to grow, and this enhanced host model gives
flexibility to test new support and may provide enhanced debug
capabilities for migration (getting a straight understanding of which ID
reg field(s) cause the migration failure could be helpful, I think).

Thanks

Eric
>
> With regards,
> Daniel
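
Finding "which ID reg field(s) cause the migration failure" amounts to a field-by-field diff of the source and destination values. Below is a minimal Python sketch for CTR_EL0. The field layout is given as I understand it from the Arm architecture documentation and should be verified; the two register values are invented for illustration and are not taken from any Ampere system; `differing_fields` is a hypothetical helper name.

```python
# CTR_EL0 fields as (least significant bit, width); assumed layout.
CTR_EL0_FIELDS = {
    "IminLine": (0, 4),
    "L1Ip": (14, 2),
    "DminLine": (16, 4),
    "ERG": (20, 4),
    "CWG": (24, 4),
    "IDC": (28, 1),
    "DIC": (29, 1),
    "TminLine": (32, 6),
}


def differing_fields(src, dst):
    """Return {field: (src_value, dst_value)} for fields that differ."""
    diffs = {}
    for name, (lsb, width) in CTR_EL0_FIELDS.items():
        mask = (1 << width) - 1
        a = (src >> lsb) & mask
        b = (dst >> lsb) & mask
        if a != b:
            diffs[name] = (a, b)
    return diffs


if __name__ == "__main__":
    src = 0x84448004  # invented source host value
    dst = 0x94448004  # invented destination value, differs in one bit
    print(differing_fields(src, dst))
```

Reporting the differing field by name, rather than two opaque 64-bit values, is the kind of debug aid the message above has in mind.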

Daniel P. Berrangé Nov. 4, 2024, 2:55 p.m. UTC | #11

On Mon, Nov 04, 2024 at 03:45:13PM +0100, Eric Auger wrote:
> Hi
> 
> On 10/28/24 17:09, Daniel P. Berrangé wrote:
> > On Mon, Oct 28, 2024 at 05:05:44PM +0100, Cornelia Huck wrote:
> >> On Fri, Oct 25 2024, Daniel P. Berrangé <berrange@redhat.com> wrote:
> >>
> >>> On Fri, Oct 25, 2024 at 03:28:35PM +0200, Eric Auger wrote:
> >>>> Hi Daniel,
> >>>>
> >>>> On 10/25/24 15:13, Daniel P. Berrangé wrote:
> >>>>> On Fri, Oct 25, 2024 at 12:17:40PM +0200, Eric Auger wrote:
> >>>>>> From: Cornelia Huck <cohuck@redhat.com>
> >>>>>>
> >>>>>> Add some documentation for the custom model.
> >>>>>>
> >>>>>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> >>>>>> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
> >>>>>> ---
> >>>>>>  docs/system/arm/cpu-features.rst | 55 +++++++++++++++++++++++++++-----
> >>>>>>  1 file changed, 47 insertions(+), 8 deletions(-)
> >>>>>> @@ -167,6 +196,16 @@ disabling many SVE vector lengths would be quite verbose, the ``sve<N>`` CPU
> >>>>>>  properties have special semantics (see "SVE CPU Property Parsing
> >>>>>>  Semantics").
> >>>>>>  
> >>>>>> +The ``custom`` CPU model needs to be configured via individual ID register
> >>>>>> +field properties, for example::
> >>>>>> +
> >>>>>> +  $ qemu-system-aarch64 -M virt -cpu custom,SYSREG_ID_AA64ISAR0_EL1_DP=0x0
> >>>>>> +
> >>>>>> +This forces ID_AA64ISAR0_EL1 DP field to 0.
> >>>>> What is the "baseline" featureset implied by 'custom' ?
> >>>> there is no baseline at the moment. By default this is a host
> >>>> passthrough model.
> >>> Why do we need to create "custom" at all, as opposed to just letting
> >>> users toggle features on "-cpu host" ? 
> >> We could consolidate that to the current "host" model, once we figure
> >> out how to handle the currently already existing properties. Models
> >> based on the different architecture extensions would probably be more
> >> useable in the long run; maybe "custom" has a place for testing.
> > If you can set the features against "host", then any testing could
> > be done with "host" surely, making 'custom' pointless ?
> > Yeah I do agree that we may not need to introduce this "custom" model
> > but could just enhance the existing host model with the capability to tweak
> > some features. For instance we have the case where migration between 2
> > Ampere systems fails with the host model but passes if you tweak 1 field in
> > CTR_EL0. So I think in itself this modality can be useful. Same for
> > debug/test purposes. As mentioned in the cover letter the number of
> > writable ID regs continues to grow, and this enhanced host model gives
> > flexibility to test new support and may provide enhanced debug
> > capabilities for migration (getting a straight understanding of which ID
> > reg field(s) cause the migration failure could be helpful, I think).

FYI, in the x86 target the -cpu option has had a "migratable=bool" property
for a long time, which defaults to 'true' for the 'host' model. This causes
QEMU to explicitly drop features which would otherwise prevent migration
between two hosts with identical physical CPUs.

IOW, if there are some bits present in 'host' that cause migration
problems on Ampere hosts, ideally either QEMU (or KVM kmod) would
detect them and turn them off automatically if migratable=true is
set. See commit message in 84f1b92f & 120eee7d1fd for some background
info

NB "migratable" is defined in i386 target code today, but conceptually
we should expand/move that to apply to all targets for consistency,
even if it is effectively a no-op on some targets (eg if they are
guaranteed migratable out of the box already with '-cpu host').

With regards,
Daniel
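
The behaviour described for "migratable" can be pictured as a filter applied to the host feature set. This is a minimal Python sketch of the concept only, not of the i386 implementation: the `apply_migratable` helper is hypothetical, and the feature names in the usage example are illustrative.

```python
def apply_migratable(host_features, unmigratable, migratable=True):
    """Return the guest-visible feature set for a host passthrough model.

    With migratable=True, features known to block migration are dropped;
    with migratable=False, the host set is passed through unchanged.
    """
    features = set(host_features)
    if migratable:
        features -= set(unmigratable)
    return features


if __name__ == "__main__":
    host = {"feat_a", "feat_b", "feat_blocks_migration"}
    blocked = {"feat_blocks_migration"}
    print(sorted(apply_migratable(host, blocked)))
    print(sorted(apply_migratable(host, blocked, migratable=False)))
```

On a target where '-cpu host' is already migratable out of the box, the blocked set would simply be empty and the filter becomes a no-op.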

Cornelia Huck Nov. 4, 2024, 3:10 p.m. UTC | #12

On Mon, Nov 04 2024, Daniel P. Berrangé <berrange@redhat.com> wrote:

> On Mon, Nov 04, 2024 at 03:45:13PM +0100, Eric Auger wrote:
>> Hi
>> 
>> On 10/28/24 17:09, Daniel P. Berrangé wrote:
>> > On Mon, Oct 28, 2024 at 05:05:44PM +0100, Cornelia Huck wrote:
>> >> On Fri, Oct 25 2024, Daniel P. Berrangé <berrange@redhat.com> wrote:
>> >>
>> >>> On Fri, Oct 25, 2024 at 03:28:35PM +0200, Eric Auger wrote:
>> >>>> Hi Daniel,
>> >>>>
>> >>>> On 10/25/24 15:13, Daniel P. Berrangé wrote:
>> >>>>> On Fri, Oct 25, 2024 at 12:17:40PM +0200, Eric Auger wrote:
>> >>>>>> From: Cornelia Huck <cohuck@redhat.com>
>> >>>>>>
>> >>>>>> Add some documentation for the custom model.
>> >>>>>>
>> >>>>>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>> >>>>>> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
>> >>>>>> ---
>> >>>>>>  docs/system/arm/cpu-features.rst | 55 +++++++++++++++++++++++++++-----
>> >>>>>>  1 file changed, 47 insertions(+), 8 deletions(-)
>> >>>>>> @@ -167,6 +196,16 @@ disabling many SVE vector lengths would be quite verbose, the ``sve<N>`` CPU
>> >>>>>>  properties have special semantics (see "SVE CPU Property Parsing
>> >>>>>>  Semantics").
>> >>>>>>  
>> >>>>>> +The ``custom`` CPU model needs to be configured via individual ID register
>> >>>>>> +field properties, for example::
>> >>>>>> +
>> >>>>>> +  $ qemu-system-aarch64 -M virt -cpu custom,SYSREG_ID_AA64ISAR0_EL1_DP=0x0
>> >>>>>> +
>> >>>>>> +This forces ID_AA64ISAR0_EL1 DP field to 0.
>> >>>>> What is the "baseline" featureset implied by 'custom' ?
>> >>>> there is no baseline at the moment. By default this is a host
>> >>>> passthrough model.
>> >>> Why do we need to create "custom" at all, as opposed to just letting
>> >>> users toggle features on "-cpu host" ? 
>> >> We could consolidate that to the current "host" model, once we figure
>> >> out how to handle the currently already existing properties. Models
>> >> based on the different architecture extensions would probably be more
>> >> useable in the long run; maybe "custom" has a place for testing.
>> > If you can set the features against "host", then any testing could
>> > be done with "host" surely, making 'custom' pointless ?
>> Yeah I do agree that we may not need to introduce this "custom" model
>> but could just enhance the existing host model with the capability to tweak
>> some features. For instance we have the case where migration between 2
>> Ampere systems fails with the host model but passes if you tweak 1 field in
>> CTR_EL0. So I think in itself this modality can be useful. Same for
>> debug/test purposes. As mentioned in the cover letter the number of
>> writable ID regs continues to grow, and this enhanced host model gives
>> flexibility to test new support and may provide enhanced debug
>> capabilities for migration (getting a straight understanding of which ID
>> reg field(s) cause the migration failure could be helpful, I think).
>
> FYI, in x86 target the -cpu command has had a "migratable=bool" property
> for a long time , which defaults to 'true' for 'host' model. This causes
> QEMU to explicitly drop features which would otherwise prevent migration
> between two hosts with identical physical CPUs.
>
> IOW, if there are some bits present in 'host' that cause migration
> problems on Ampere hosts, ideally either QEMU (or KVM kmod) would
> detect them and turn them off automatically if migratable=true is
> set. See commit message in 84f1b92f & 120eee7d1fd for some background
> info

How does this work for version-sensitive features -- are they always
defaulting to off? How many features are left with that in the end?

>
> NB "migratable" is defined in i386 target code today, but conceptually
> we should expand/move that to apply to all targets for consistency,
> even if it is effectively a no-op on some targets (eg if they are
> guaranteed migratable out of the box already with '-cpu host').

How does this compare to s390x, which defines some migration-safe cpu
models, based upon the different hw generations? If I look at the QEMU
code for x86 and s390x, the s390x approach seems cleaner to me (probably
because it came later, and therefore could start afresh without having
to care for legacy things.) Given that we'll cook up a new model for Arm
migration as well, we might as well start with a clean implementation :)

(Not sure what this looks like on the libvirt side.)

Daniel P. Berrangé Nov. 4, 2024, 3:24 p.m. UTC | #13

On Mon, Nov 04, 2024 at 04:10:12PM +0100, Cornelia Huck wrote:
> On Mon, Nov 04 2024, Daniel P. Berrangé <berrange@redhat.com> wrote:
> 
> >
> > FYI, in x86 target the -cpu command has had a "migratable=bool" property
> > for a long time , which defaults to 'true' for 'host' model. This causes
> > QEMU to explicitly drop features which would otherwise prevent migration
> > between two hosts with identical physical CPUs.
> >
> > IOW, if there are some bits present in 'host' that cause migration
> > problems on Ampere hosts, ideally either QEMU (or KVM kmod) would
> > detect them and turn them off automatically if migratable=true is
> > set. See commit message in 84f1b92f & 120eee7d1fd for some background
> > info
> 
> How does this work for version-sensitive features -- are they always
> defaulting to off? How many features are left with that in the end?

Do you mean QEMU versions here ? The non-migratable feature list is
just hardcoded in QEMU right now, and there's only 1 of them.
eg grep for 'unmigratable_flags'

Note, that "migratable" property is not defining a general purpose
migration mask between different hw generations. It was specifically
blocking just stuff that is known to make migration impossible, even
if HW is identical on both sides.
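
A minimal Python sketch of the filtering described above may make the
semantics concrete. It is not QEMU code: the function name, the example
feature names and the single entry in the unmigratable set are assumptions
made for illustration only.

```python
# Illustrative only: models the *idea* behind "-cpu host,migratable=on".
# The set below is an assumption for this sketch, not QEMU's actual table.
UNMIGRATABLE_FEATURES = frozenset({"invtsc"})


def host_model_features(host_features, migratable=True):
    """Return the feature set a 'host' model would expose to the guest.

    With migratable=True, features known to block migration even between
    identical hosts are dropped; everything else is passed through as-is.
    """
    features = set(host_features)
    if migratable:
        features -= UNMIGRATABLE_FEATURES
    return features


if __name__ == "__main__":
    host = {"sse4.2", "avx2", "invtsc"}
    print(sorted(host_model_features(host)))                    # filtered
    print(sorted(host_model_features(host, migratable=False)))  # passthrough
```

Note that nothing here masks features that merely differ between hardware
generations; only the hardcoded "never migratable" entries are removed.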

> > NB "migratable" is defined in i386 target code today, but conceptually
> > we should expand/move that to apply to all targets for consistency,
> > even if it is effectively a no-op on some targets (eg if they are
> > guaranteed migratable out of the box already with '-cpu host').
> 
> How does this compare to s390x, which defines some migration-safe cpu
> models, based upon the different hw generations? If I look at the QEMU
> code for x86 and s390x, the s390x approach seems cleaner to me (probably
> because it came later, and therefore could start afresh without having
> to care for legacy things.) Given that we'll cook up a new model for Arm
> migration as well, we might as well start with a clean implementation :)

The impression I get (as a distant observer) is that CPUs on s390x in
general have less complexity to worry about. A combination of not having
a vendor who creates loads of different SKUs for the same CPU model
family with slight variations between each, and also not seeming to have
a situation where CPU flags are known to disappear (or appear) arbitrarily
in microcode updates.

The s390x idea of a "migratable" and "non migratable" model for each
HW generation is a nice simplification, but I can't see how it could
be made to work for x86 when you can't predict ahead of time what
features are going to be removed from existing HW definition by the
next microcode update, or by the next CPU SKU that removes a feature
you had assumed would always be present in a given HW generation.

I don't know much about how the ARM world works, but having lots of vendors
competing with their own custom impls makes me worry complexity will
be closer to x86 than to s390.

If the ARM specifications define a minimum required featureset for each
HW generation, maybe you can define a model based on that ? You might
still want to have vendor specific models though, if there are compelling
features they expose which are optional, or non-standardized. 

With regards,
Daniel

Eric Auger Nov. 4, 2024, 3:34 p.m. UTC | #14

Hi Kashyap,

On 10/28/24 22:17, Kashyap Chamarthy wrote:
> On Fri, Oct 25, 2024 at 12:17:40PM +0200, Eric Auger wrote:
>> From: Cornelia Huck <cohuck@redhat.com>
>>
>> Add some documentation for the custom model.
>>
>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
>> ---
>>  docs/system/arm/cpu-features.rst | 55 +++++++++++++++++++++++++++-----
>>  1 file changed, 47 insertions(+), 8 deletions(-)
>>
>> diff --git a/docs/system/arm/cpu-features.rst b/docs/system/arm/cpu-features.rst
>> index a5fb929243..962a2c6c26 100644
>> --- a/docs/system/arm/cpu-features.rst
>> +++ b/docs/system/arm/cpu-features.rst
>> @@ -2,7 +2,10 @@ Arm CPU Features
> [...]
>
>> +Using the ``host`` type means the guest is provided all the same CPU
>> +features as the host CPU type has.  And, for this reason, the ``host``
>> +CPU type should enable all CPU features that the host has by default.
>> +
>> +In case some features need to be hidden to the guest, ``custom`` model
>> +shall be used instead. This is especially useful for migration purpose.
>> +
>> +The ``custom`` CPU model generally is the better choice if you want more
>> +flexibility or stability across different machines or with different kernel
>> +versions. 
> Does "more flexibility or stability across different machines" also
> imply "live migration compatibility across host CPUs"?
yes that's the goal
>
>> However, even the ``custom`` CPU model will not allow configuring
>> +an arbitrary set of features; the ID registers must describe a subset of the
>> +host's features, and all differences to the host's configuration must actually
>> +be supported by the kernel to be deconfigured.
> [...]
>
>> +The ``custom`` CPU model needs to be configured via individual ID register
>> +field properties, for example::
>> +
>> +  $ qemu-system-aarch64 -M virt -cpu custom,SYSREG_ID_AA64ISAR0_EL1_DP=0x0
> If possible, it would be really helpful (and user-friendly) to be able
> to specify the CPU feature names as you see under /proc/cpuinfo, and be
> able to turn the flags on or off:
>     
>         -M virt -cpu franken,rndr=on,ts=on,fhm=off
>
> (... instead of specifying long system register IDs that groups together
> a bunch of CPU features.  If I understand it correctly, the register
> "ID_AA64ISAR0_EL1" maps to a set of visible features listed here:
> https://docs.kernel.org/arch/arm64/cpu-feature-registers.html)
Not all the writable ID regs are visible through the above technique.
But indeed I think we converged on the idea of using higher-level feature
names rather than ID reg field values.
However, we need to study the feasibility of mapping those high-level
features to ID reg field values.
The downside is that we need to describe this mapping manually. Besides
being cumbersome, this is also error-prone.
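
To illustrate what such a manual mapping might look like, here is a
minimal Python sketch. The table entries and the `cpu_option` helper are
hypothetical; only the SYSREG_<REG>_<FIELD> property naming is taken from
this RFC, and the "enabled" values would need checking against the Arm ARM.

```python
# Hypothetical mapping from user-facing feature names to
# (ID register, field, value written when the feature is enabled).
# These entries are assumptions for illustration, not a verified table.
FEATURE_MAP = {
    "rndr": ("ID_AA64ISAR0_EL1", "RNDR", 0x1),
    "ts": ("ID_AA64ISAR0_EL1", "TS", 0x1),
    "fhm": ("ID_AA64ISAR0_EL1", "FHM", 0x1),
    "dp": ("ID_AA64ISAR0_EL1", "DP", 0x1),
}


def cpu_option(model, features):
    """Build a -cpu argument from {feature name: bool} on/off switches."""
    props = []
    for name, enabled in features.items():
        if name not in FEATURE_MAP:
            raise KeyError(f"no ID register mapping known for feature {name!r}")
        reg, field, on_value = FEATURE_MAP[name]
        value = on_value if enabled else 0x0
        props.append(f"SYSREG_{reg}_{field}={value:#x}")
    return ",".join([model] + props)


if __name__ == "__main__":
    print(cpu_option("custom", {"dp": False}))
```

A boolean switch is a simplification: several ID register fields encode
more than two levels, so a real mapping would need more than on/off per
name, which is part of why maintaining it by hand is error prone.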
>
>
> Next, I prefix the below by noting that I wrote it before seeing
> Cornelia's reply that the name "custom" is not set in stone:
> https://lists.nongnu.org/archive/html/qemu-arm/2024-10/msg00987.html.
>
> I wonder if the word "custom" is starting to get overloaded; on x86:
>
>   - Libvirt itself uses the term "custom" this way, to quote its
>     documentation[1] for the 'custom' XML attribute:
>
>       custom
>     
>       In this mode, the 'cpu' element describes the CPU that should be
>       presented to the guest. This is the default when no 'mode'
>       attribute is specified. This mode makes it so that a persistent
>       guest will see the same hardware no matter what host the guest is
>       booted on.
>
>   - Some management tools also follow libvirt and use the term "custom"
>     to refer to one of two things, (a) a specific named CPU model that
>     libvirt and QEMU recognize, e.g. "Cascadelake-Server"; or (b) a
>     named CPU model + extra CPU flags, e.g. this is how OpenStack
>     uses[2] "custom" to configure CPU models, and flags that can be
>     enabled or disabled via "+" or "-":
>
>       [libvirt]
>       cpu_mode = custom
>       cpu_model = IvyBridge-IBRS
>       cpu_model_extra_flags="ss,+vmx,-pcid [...]"
>
>     (Note the "cpu_mode" there: it is referring to the three possible
>     modes that libvirt and QEMU support today: 'host-passthrough',
>     'host-model', and named CPU models via "custom".)
>
>     The above config translates to this QEMU command-line:
>
>         -cpu IvyBridge-IBRS,ss=on,vmx=on,pcid=off [...]
>
> Now if QEMU introduces "custom", it is likely to create some confusion.
> But luckily, as referenced above, it is open to change. :)
Agreed! Thank you for the references!

Eric
>
>     * * *
>
> FWIW, I agree with Dan here[3] that it would cause less future pain if
> Arm's named CPU models also decides on a "baseline that matches some
> corresponding real world silicon".  I've experienced plenty of such
> debugging pain in x86-land from years of troubleshooting live migration
> bugs involving CPU model (in)compatibility.  (Often, with help from
> DanPB and Jiri Denemark).
>
> [1] https://libvirt.org/formatdomain.html#cpu-model-and-topology
> [2] https://docs.openstack.org/nova/latest/admin/cpu-models.html#cpu-modes
> [3] https://lists.nongnu.org/archive/html/qemu-arm/2024-10/msg00888.html
>     — [RFC 21/21] arm/cpu-features: Document custom vcpu model
>
> [...]
>

Cornelia Huck Nov. 4, 2024, 3:48 p.m. UTC | #15

On Mon, Nov 04 2024, Daniel P. Berrangé <berrange@redhat.com> wrote:

> On Mon, Nov 04, 2024 at 04:10:12PM +0100, Cornelia Huck wrote:
>> On Mon, Nov 04 2024, Daniel P. Berrangé <berrange@redhat.com> wrote:
>> 
>> >
>> > FYI, in x86 target the -cpu command has had a "migratable=bool" property
>> > for a long time , which defaults to 'true' for 'host' model. This causes
>> > QEMU to explicitly drop features which would otherwise prevent migration
>> > between two hosts with identical physical CPUs.
>> >
>> > IOW, if there are some bits present in 'host' that cause migration
>> > problems on Ampere hosts, ideally either QEMU (or KVM kmod) would
>> > detect them and turn them off automatically if migratable=true is
>> > set. See commit message in 84f1b92f & 120eee7d1fd for some background
>> > info
>> 
>> How does this work for version-sensitive features -- are they always
>> defaulting to off? How many features are left with that in the end?
>
> Do you mean QEMU versions here ? The non-migratable feature list is
> just hardcoded in QEMU right now, and there's only 1 of them.
> eg grep for 'unmigratable_flags'
>
> Note, that "migratable" property is not defining a general purpose
> migration mask between different hw generations. It was specifically
> blocking just stuff that is known to make migration impossible, even
> if HW is identical on both sides.

I was more thinking of dependencies on the KVM version -- QEMU versions
are easier to control for, but you don't really know what kernel version
you are running with. In the end, we'd probably need to mark a lot of
things as unmigratable.

>
>> > NB "migratable" is defined in i386 target code today, but conceptually
>> > we should expand/move that to apply to all targets for consistency,
>> > even if it is effectively a no-op on some targets (eg if they are
>> > guaranteed migratable out of the box already with '-cpu host').
>> 
>> How does this compare to s390x, which defines some migration-safe cpu
>> models, based upon the different hw generations? If I look at the QEMU
>> code for x86 and s390x, the s390x approach seems cleaner to me (probably
>> because it came later, and therefore could start afresh without having
>> to care for legacy things.) Given that we'll cook up a new model for Arm
>> migration as well, we might as well start with a clean implementation :)
>
> The impression I get (as a distant observer) is that CPUs on s390x in
> general have less complexity to worry about. A combination of not having
> a vendor who creates loads of different SKUs for the same CPU model
> family with slight variations between each, and also not seeming to have
> a situation where CPU flags are known to disappear (or appear) arbitrarily
> in microcode updates.
>
> The s390x idea of a "migratable" and "non migratable" model for each
> HW generation is a nice simplification, but I can't see how it could
> be made to work for x86 when you can't predict ahead of time what
> features are going to be removed from existing HW definition by the
> next microcode update, or by the next CPU SKU that removes a feature
> you had assumed would always be present in a given HW generation.
>
> I don't know much about how the ARM world works, but having lots of vendors
> competing with their own custom impls makes me worry complexity will
> be closer to x86 than to s390.

My concern was more about code complexity, not hw complexity. We'll
probably end up with a zoo of weird creatures for Arm, but I don't see a
reason why the code would need to have strange things tacked
on. I.e. have a set of arch extensions that you can baseline to, and
have individual cpus on top, so you can deal with both well-known cpus
and more boutique ones.

>
> If the ARM specifications define a minimum required featureset for each
> HW generation, maybe you can define a model based on that ? You might
> still want to have vendor specific models though, if there are compelling
> features they expose which are optional, or non-standardized. 

We have a list of features that are optional for a given arch extension,
and a list of features that are mandatory, so I think we'd be able to
generate a model with the mandatory features only. Models for individual
cpus could be based on these. (There are currently 13 vendors defined in
MIDR, but I'm not sure how often new vendors might be added, and vendors
may also be more or less active.) If you have a baseline of Arm v9.2 or
so, that might already go a long way.
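
As a rough sketch of that layering (an architecture baseline with
individual cpus on top), consider the following Python. All feature names
and extension contents are placeholders; the real mandatory/optional lists
would come from the Arm architecture documentation.

```python
# Placeholder data: features each architecture extension makes mandatory.
# Names such as FEAT_A are invented for this sketch.
ORDER = ["v8.0", "v8.1", "v8.2"]
MANDATORY = {
    "v8.0": {"FEAT_A"},
    "v8.1": {"FEAT_B", "FEAT_C"},
    "v8.2": {"FEAT_D"},
}


def baseline(extension):
    """Union of the mandatory features of `extension` and all earlier ones."""
    features = set()
    for ext in ORDER[: ORDER.index(extension) + 1]:
        features |= MANDATORY[ext]
    return features


def cpu_model(extension, optional=()):
    """A named cpu: its architecture baseline plus the optional features
    that this particular implementation happens to provide."""
    return baseline(extension) | set(optional)


def common_model(*models):
    """Features present in every given model, i.e. what a guest could rely
    on when moving between hosts of those types."""
    return set.intersection(*models)


if __name__ == "__main__":
    vendor_x = cpu_model("v8.2", optional={"FEAT_OPT1"})
    vendor_y = cpu_model("v8.1", optional={"FEAT_OPT1", "FEAT_OPT2"})
    print(sorted(common_model(vendor_x, vendor_y)))
```

The point of the split is that the baseline is derived from data rather
than hand-written per cpu, so boutique cpus only need to list what they
add on top.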

[But I obviously have no idea how well that will work when it meets
reality :)]

Peter Maydell Nov. 4, 2024, 4:30 p.m. UTC | #16

On Mon, 4 Nov 2024 at 15:34, Eric Auger <eric.auger@redhat.com> wrote:
>
> Hi Kashyap,
>
> On 10/28/24 22:17, Kashyap Chamarthy wrote:
> > On Fri, Oct 25, 2024 at 12:17:40PM +0200, Eric Auger wrote:
> >> From: Cornelia Huck <cohuck@redhat.com>
> >>
> >> Add some documentation for the custom model.
> >>
> >> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> >> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
> >> ---
> >>  docs/system/arm/cpu-features.rst | 55 +++++++++++++++++++++++++++-----
> >>  1 file changed, 47 insertions(+), 8 deletions(-)
> >>
> >> diff --git a/docs/system/arm/cpu-features.rst b/docs/system/arm/cpu-features.rst
> >> index a5fb929243..962a2c6c26 100644
> >> --- a/docs/system/arm/cpu-features.rst
> >> +++ b/docs/system/arm/cpu-features.rst
> >> @@ -2,7 +2,10 @@ Arm CPU Features
> > [...]
> >
> >> +Using the ``host`` type means the guest is provided all the same CPU
> >> +features as the host CPU type has.  And, for this reason, the ``host``
> >> +CPU type should enable all CPU features that the host has by default.
> >> +
> >> +In case some features need to be hidden to the guest, ``custom`` model
> >> +shall be used instead. This is especially useful for migration purpose.
> >> +
> >> +The ``custom`` CPU model generally is the better choice if you want more
> >> +flexibility or stability across different machines or with different kernel
> >> +versions.
> > Does "more flexibility or stability across different machines" also
> > imply "live migration compatibility across host CPUs"?
> yes that's the goal
> >
> >> However, even the ``custom`` CPU model will not allow configuring
> >> +an arbitrary set of features; the ID registers must describe a subset of the
> >> +host's features, and all differences to the host's configuration must actually
> >> +be supported by the kernel to be deconfigured.
> > [...]
> >
> >> +The ``custom`` CPU model needs to be configured via individual ID register
> >> +field properties, for example::
> >> +
> >> +  $ qemu-system-aarch64 -M virt -cpu custom,SYSREG_ID_AA64ISAR0_EL1_DP=0x0
> > If possible, it would be really helpful (and user-friendly) to be able
> > to specify the CPU feature names as you see under /proc/cpuinfo, and be
> > able to turn the flags on or off:
> >
> >         -M virt -cpu franken,rndr=on,ts=on,fhm=off
> >
> > (... instead of specifying long system register IDs that groups together
> > a bunch of CPU features.  If I understand it correctly, the register
> > "ID_AA64ISAR0_EL1" maps to a set of visible features listed here:
> > https://docs.kernel.org/arch/arm64/cpu-feature-registers.html)
> Not all the writable ID regs are visible through the above technique.
> But indeed I think we converged on the idea to use higher level feature
> names than ID reg field values.
> However we need to study the feasibility and mappings between those high
> level features and ID reg field values.
> The cons is that we need to describe this mapping manually. Besides
> being cumbersome this is also error prone.

You might be interested in "Arm Architecture Features" on
https://developer.arm.com/Architectures/A-Profile%20Architecture#Downloads
which includes a 1.8MB Features.json which is a machine
readable version of the "what are the features and their
dependencies and ID registers and so on" information.

But note that (a) it is alpha quality and (b) I am not personally
going to try to interpret what might be reasonable to do with it
based on the legal notice attached to it: that's a matter for
you and your lawyer ;-)

-- PMM

Eric Auger Nov. 4, 2024, 5:07 p.m. UTC | #17

Hi Peter,

On 11/4/24 17:30, Peter Maydell wrote:
> On Mon, 4 Nov 2024 at 15:34, Eric Auger <eric.auger@redhat.com> wrote:
>> Hi Kashyap,
>>
>> On 10/28/24 22:17, Kashyap Chamarthy wrote:
>>> On Fri, Oct 25, 2024 at 12:17:40PM +0200, Eric Auger wrote:
>>>> From: Cornelia Huck <cohuck@redhat.com>
>>>>
>>>> Add some documentation for the custom model.
>>>>
>>>> Signed-off-by: Eric Auger <eric.auger@redhat.com>
>>>> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
>>>> ---
>>>>  docs/system/arm/cpu-features.rst | 55 +++++++++++++++++++++++++++-----
>>>>  1 file changed, 47 insertions(+), 8 deletions(-)
>>>>
>>>> diff --git a/docs/system/arm/cpu-features.rst b/docs/system/arm/cpu-features.rst
>>>> index a5fb929243..962a2c6c26 100644
>>>> --- a/docs/system/arm/cpu-features.rst
>>>> +++ b/docs/system/arm/cpu-features.rst
>>>> @@ -2,7 +2,10 @@ Arm CPU Features
>>> [...]
>>>
>>>> +Using the ``host`` type means the guest is provided all the same CPU
>>>> +features as the host CPU type has.  And, for this reason, the ``host``
>>>> +CPU type should enable all CPU features that the host has by default.
>>>> +
>>>> +In case some features need to be hidden to the guest, ``custom`` model
>>>> +shall be used instead. This is especially useful for migration purpose.
>>>> +
>>>> +The ``custom`` CPU model generally is the better choice if you want more
>>>> +flexibility or stability across different machines or with different kernel
>>>> +versions.
>>> Does "more flexibility or stability across different machines" also
>>> imply "live migration compatibility across host CPUs"?
>> yes that's the goal
>>>> However, even the ``custom`` CPU model will not allow configuring
>>>> +an arbitrary set of features; the ID registers must describe a subset of the
>>>> +host's features, and all differences to the host's configuration must actually
>>>> +be supported by the kernel to be deconfigured.
>>> [...]
>>>
>>>> +The ``custom`` CPU model needs to be configured via individual ID register
>>>> +field properties, for example::
>>>> +
>>>> +  $ qemu-system-aarch64 -M virt -cpu custom,SYSREG_ID_AA64ISAR0_EL1_DP=0x0
>>> If possible, it would be really helpful (and user-friendly) to be able
>>> to specify the CPU feature names as you see under /proc/cpuinfo, and be
>>> able to turn the flags on or off:
>>>
>>>         -M virt -cpu franken,rndr=on,ts=on,fhm=off
>>>
>>> (... instead of specifying long system register IDs that groups together
>>> a bunch of CPU features.  If I understand it correctly, the register
>>> "ID_AA64ISAR0_EL1" maps to a set of visible features listed here:
>>> https://docs.kernel.org/arch/arm64/cpu-feature-registers.html)
>> Not all the writable ID regs are visible through the above technique.
>> But indeed I think we converged on the idea to use higher level feature
>> names than ID reg field values.
>> However we need to study the feasibility and mappings between those high
>> level features and ID reg field values.
>> The cons is that we need to describe this mapping manually. Besides
>> being cumbersome this is also error prone.
> You might be interested in "Arm Architecture Features" on
> https://developer.arm.com/Architectures/A-Profile%20Architecture#Downloads
> which includes a 1.8MB Features.json which is a machine
> readable version of the "what are the features and their
> dependencies and ID registers and so on" information.
thank you for the link.
>
> But note that (a) it is alpha quality and (b) I am not personally
> going to try to interpret what might be reasonable to do with it
> based on the legal notice attached to it: that's a matter for
> you and your lawyer ;-)

Thank you for the notice. This may be similar to the ARM xml mentioned
by Marc...

Eric
>
> -- PMM
>

Kashyap Chamarthy Nov. 4, 2024, 6:29 p.m. UTC | #18

On Mon, Nov 04, 2024 at 04:30:17PM +0000, Peter Maydell wrote:
> On Mon, 4 Nov 2024 at 15:34, Eric Auger <eric.auger@redhat.com> wrote:

[...]

> > > If possible, it would be really helpful (and user-friendly) to be able
> > > to specify the CPU feature names as you see under /proc/cpuinfo, and be
> > > able to turn the flags on or off:
> > >
> > >         -M virt -cpu franken,rndr=on,ts=on,fhm=off
> > >
> > > (... instead of specifying long system register IDs that groups together
> > > a bunch of CPU features.  If I understand it correctly, the register
> > > "ID_AA64ISAR0_EL1" maps to a set of visible features listed here:
> > > https://docs.kernel.org/arch/arm64/cpu-feature-registers.html)
> > Not all the writable ID regs are visible through the above technique.
> > But indeed I think we converged on the idea to use higher level feature
> > names than ID reg field values.
> > However we need to study the feasibility and mappings between those high
> > level features and ID reg field values.
> > The cons is that we need to describe this mapping manually. Besides
> > being cumbersome this is also error prone.
> 
> You might be interested in "Arm Architecture Features" on
> https://developer.arm.com/Architectures/A-Profile%20Architecture#Downloads
> which includes a 1.8MB Features.json which is a machine
> readable version of the "what are the features and their
> dependencies and ID registers and so on" information.

I just took a quick look at the JSON file.  It was a bit difficult to get a
sense of the structure of the file.  Maybe it makes more sense to those
who are more Arm-aware than me.  Or maybe that's what you meant by
"alpha quality" below :)

Out of curiosity, I tried to find the features for the register
"ID_AA64ISAR0_EL1".  I was navigating via trial-and-error with `jq`.
There's a lot of "right"/"left" traversing:

    $> jq '.parameters[136]|.constraints[1]|.right|.left|.value' Features.json 
    "FEAT_TME"

The register name is buried under this:

    $> jq '.parameters[136]|.constraints[1]|.right|.right|.left|.arguments' Features.json 
    [
      {
        "_type": "Types.Field",
        "value": {
          "field": "TME",
          "instance": null,
          "name": "ID_AA64ISAR0_EL1",
          "slices": null,
          "state": "AArch64"
        }
      }
    ]


I was naively expecting a more predictable structure such as:

    $> jq '.register["ID_AA64ISAR0_EL1"]|.fields' Features.json
    FEAT_RNG
    FEAT_TLBIOS
    ...
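
Since the register name sits at an unpredictable depth, one way to avoid
the left/right guessing is to walk the whole tree. The Python below is a
minimal sketch that assumes only the node shape visible in the fragment
above (a "_type" of "Types.Field" with "name" and "field" under "value");
the helper name is made up, and the sample document is a tiny stand-in,
not the real Features.json.

```python
import json


def fields_of_register(node, register):
    """Recursively collect the 'field' of every Types.Field node whose
    'name' equals `register`, wherever it is nested in the JSON tree."""
    found = set()
    if isinstance(node, dict):
        value = node.get("value")
        if (node.get("_type") == "Types.Field"
                and isinstance(value, dict)
                and value.get("name") == register
                and value.get("field") is not None):
            found.add(value["field"])
        for child in node.values():
            found |= fields_of_register(child, register)
    elif isinstance(node, list):
        for child in node:
            found |= fields_of_register(child, register)
    return found


if __name__ == "__main__":
    # Stand-in document shaped like the fragment shown above.
    sample = json.loads("""
    {"parameters": [{"constraints": [{"right": {"left": {"arguments": [
        {"_type": "Types.Field",
         "value": {"field": "TME", "instance": null,
                   "name": "ID_AA64ISAR0_EL1", "slices": null,
                   "state": "AArch64"}}]}}}]}]}
    """)
    print(sorted(fields_of_register(sample, "ID_AA64ISAR0_EL1")))
```

This yields register field names (such as TME) rather than FEAT_* names;
tying each field back to its feature would still require interpreting the
surrounding constraint expressions.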

> But note that (a) it is alpha quality and (b) I am not personally
> going to try to interpret what might be reasonable to do with it
> based on the legal notice attached to it: that's a matter for
> you and your lawyer ;-)

Hmm, it sounds like until point (b) is clarified, this file is out of
consideration from an upstream point of view.

> -- PMM
>
