diff mbox series

arm/kvm: add support for MTE

Message ID 20240709060448.251881-1-gankulkarni@os.amperecomputing.com
State New
Headers show
Series arm/kvm: add support for MTE | expand

Commit Message

Ganapatrao Kulkarni July 9, 2024, 6:04 a.m. UTC
Extend the 'mte' property for the virt machine to cover KVM as
well. For KVM, we don't allocate tag memory, but instead enable
the capability.

If MTE has been enabled, we need to disable migration, as we do not
yet have a way to migrate the tags as well. Therefore, MTE will stay
off with KVM unless requested explicitly.

This patch is rework of commit b320e21c48ce64853904bea6631c0158cc2ef227
which broke TCG since it made the TCG -cpu max
report the presence of MTE to the guest even if the board hadn't
enabled MTE by wiring up the tag RAM. This meant that if the guest
then tried to use MTE QEMU would segfault accessing the
non-existent tag RAM.

Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com>
---

This patch is boot tested on ARM64 with KVM and on X86 with TCG for mte=on
and default case(no mte).

 hw/arm/virt.c        | 72 ++++++++++++++++++++++++++------------------
 target/arm/cpu.h     |  1 +
 target/arm/kvm.c     | 40 ++++++++++++++++++++++++
 target/arm/kvm_arm.h | 19 ++++++++++++
 4 files changed, 102 insertions(+), 30 deletions(-)

Comments

Cornelia Huck July 10, 2024, 3:42 p.m. UTC | #1
On Mon, Jul 08 2024, Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com> wrote:

> Extend the 'mte' property for the virt machine to cover KVM as
> well. For KVM, we don't allocate tag memory, but instead enable
> the capability.
>
> If MTE has been enabled, we need to disable migration, as we do not
> yet have a way to migrate the tags as well. Therefore, MTE will stay
> off with KVM unless requested explicitly.
>
> This patch is rework of commit b320e21c48ce64853904bea6631c0158cc2ef227
> which broke TCG since it made the TCG -cpu max
> report the presence of MTE to the guest even if the board hadn't
> enabled MTE by wiring up the tag RAM. This meant that if the guest
> then tried to use MTE QEMU would segfault accessing the
> non-existent tag RAM.

So, the main difference to my original patch is that we don't end up
with MTE in the max model if we didn't configure tag memory, but the
rest of the behaviour stays the same?

>
> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
> Signed-off-by: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com>

(...)

> +void kvm_arm_enable_mte(Object *cpuobj, Error **errp)
> +{
> +    static bool tried_to_enable;
> +    static bool succeeded_to_enable;
> +    Error *mte_migration_blocker = NULL;
> +    int ret;
> +
> +    if (!tried_to_enable) {
> +        /*
> +         * MTE on KVM is enabled on a per-VM basis (and retrying doesn't make
> +         * sense), and we only want a single migration blocker as well.
> +         */
> +        tried_to_enable = true;
> +
> +        ret = kvm_vm_enable_cap(kvm_state, KVM_CAP_ARM_MTE, 0);
> +        if (ret) {
> +            error_setg_errno(errp, -ret, "Failed to enable KVM_CAP_ARM_MTE");
> +            return;
> +        }
> +
> +        /* TODO: Migration is not supported with MTE enabled */

Do you have a plan for enabling migration in the future? From what I
remember, pre-copy support should be doable within QEMU (with a similar
approach to e.g. s390 skey), but post-copy would need a kernel API
extension to support getting additional data while faulting in a page.

> +        error_setg(&mte_migration_blocker,
> +                   "Live migration disabled due to MTE enabled");
> +        if (migrate_add_blocker(&mte_migration_blocker, errp)) {
> +            error_free(mte_migration_blocker);
> +            return;
> +        }
> +        succeeded_to_enable = true;
> +    }
> +    if (succeeded_to_enable) {
> +        object_property_set_bool(cpuobj, "has_mte", true, NULL);
> +    }
> +}
Ganapatrao Kulkarni July 11, 2024, 8:53 a.m. UTC | #2
On 10-07-2024 09:12 pm, Cornelia Huck wrote:
> On Mon, Jul 08 2024, Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com> wrote:
> 
>> Extend the 'mte' property for the virt machine to cover KVM as
>> well. For KVM, we don't allocate tag memory, but instead enable
>> the capability.
>>
>> If MTE has been enabled, we need to disable migration, as we do not
>> yet have a way to migrate the tags as well. Therefore, MTE will stay
>> off with KVM unless requested explicitly.
>>
>> This patch is rework of commit b320e21c48ce64853904bea6631c0158cc2ef227
>> which broke TCG since it made the TCG -cpu max
>> report the presence of MTE to the guest even if the board hadn't
>> enabled MTE by wiring up the tag RAM. This meant that if the guest
>> then tried to use MTE QEMU would segfault accessing the
>> non-existent tag RAM.
> 
> So, the main difference to my original patch is that we don't end up
> with MTE in the max model if we didn't configure tag memory, but the
> rest of the behaviour stays the same?

Yes most of the patch is same. What I changed is to fix the code which 
was advertising MTE feature through PFR1 register for TCG based boot, 
irrespective of the tagged RAM allocated.

> 
>>
>> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
>> Signed-off-by: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com>
> 
> (...)
> 
>> +void kvm_arm_enable_mte(Object *cpuobj, Error **errp)
>> +{
>> +    static bool tried_to_enable;
>> +    static bool succeeded_to_enable;
>> +    Error *mte_migration_blocker = NULL;
>> +    int ret;
>> +
>> +    if (!tried_to_enable) {
>> +        /*
>> +         * MTE on KVM is enabled on a per-VM basis (and retrying doesn't make
>> +         * sense), and we only want a single migration blocker as well.
>> +         */
>> +        tried_to_enable = true;
>> +
>> +        ret = kvm_vm_enable_cap(kvm_state, KVM_CAP_ARM_MTE, 0);
>> +        if (ret) {
>> +            error_setg_errno(errp, -ret, "Failed to enable KVM_CAP_ARM_MTE");
>> +            return;
>> +        }
>> +
>> +        /* TODO: Migration is not supported with MTE enabled */
> 
> Do you have a plan for enabling migration in the future? From what I
> remember, pre-copy support should be doable within QEMU (with a similar
> approach to e.g. s390 skey), but post-copy would need a kernel API
> extension to support getting additional data while faulting in a page.
> 

No plan at the moment.

>> +        error_setg(&mte_migration_blocker,
>> +                   "Live migration disabled due to MTE enabled");
>> +        if (migrate_add_blocker(&mte_migration_blocker, errp)) {
>> +            error_free(mte_migration_blocker);
>> +            return;
>> +        }
>> +        succeeded_to_enable = true;
>> +    }
>> +    if (succeeded_to_enable) {
>> +        object_property_set_bool(cpuobj, "has_mte", true, NULL);
>> +    }
>> +}
> 

Thanks,
Ganapat
Ganapatrao Kulkarni July 15, 2024, 11:27 a.m. UTC | #3
Hi Peter/Richard,

On 09-07-2024 11:34 am, Ganapatrao Kulkarni wrote:
> Extend the 'mte' property for the virt machine to cover KVM as
> well. For KVM, we don't allocate tag memory, but instead enable
> the capability.
> 
> If MTE has been enabled, we need to disable migration, as we do not
> yet have a way to migrate the tags as well. Therefore, MTE will stay
> off with KVM unless requested explicitly.
> 
> This patch is rework of commit b320e21c48ce64853904bea6631c0158cc2ef227
> which broke TCG since it made the TCG -cpu max
> report the presence of MTE to the guest even if the board hadn't
> enabled MTE by wiring up the tag RAM. This meant that if the guest
> then tried to use MTE QEMU would segfault accessing the
> non-existent tag RAM.
> 
> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
> Signed-off-by: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com>
> ---
> 
> This patch is boot tested on ARM64 with KVM and on X86 with TCG for mte=on
> and default case(no mte).

Any comments on this patch?
Can this patch be merged in this merge cycle/window?

> 
>   hw/arm/virt.c        | 72 ++++++++++++++++++++++++++------------------
>   target/arm/cpu.h     |  1 +
>   target/arm/kvm.c     | 40 ++++++++++++++++++++++++
>   target/arm/kvm_arm.h | 19 ++++++++++++
>   4 files changed, 102 insertions(+), 30 deletions(-)
> 
> diff --git a/hw/arm/virt.c b/hw/arm/virt.c
> index b0c68d66a3..cc9db79ec3 100644
> --- a/hw/arm/virt.c
> +++ b/hw/arm/virt.c
> @@ -2206,7 +2206,7 @@ static void machvirt_init(MachineState *machine)
>           exit(1);
>       }
>   
> -    if (vms->mte && (kvm_enabled() || hvf_enabled())) {
> +    if (vms->mte && hvf_enabled()) {
>           error_report("mach-virt: %s does not support providing "
>                        "MTE to the guest CPU",
>                        current_accel_name());
> @@ -2276,39 +2276,51 @@ static void machvirt_init(MachineState *machine)
>           }
>   
>           if (vms->mte) {
> -            /* Create the memory region only once, but link to all cpus. */
> -            if (!tag_sysmem) {
> -                /*
> -                 * The property exists only if MemTag is supported.
> -                 * If it is, we must allocate the ram to back that up.
> -                 */
> -                if (!object_property_find(cpuobj, "tag-memory")) {
> -                    error_report("MTE requested, but not supported "
> -                                 "by the guest CPU");
> -                    exit(1);
> +            if (tcg_enabled()) {
> +                /* Create the memory region only once, but link to all cpus. */
> +                if (!tag_sysmem) {
> +                    /*
> +                     * The property exists only if MemTag is supported.
> +                     * If it is, we must allocate the ram to back that up.
> +                     */
> +                    if (!object_property_find(cpuobj, "tag-memory")) {
> +                        error_report("MTE requested, but not supported "
> +                                     "by the guest CPU");
> +                        exit(1);
> +                    }
> +
> +                    tag_sysmem = g_new(MemoryRegion, 1);
> +                    memory_region_init(tag_sysmem, OBJECT(machine),
> +                                       "tag-memory", UINT64_MAX / 32);
> +
> +                    if (vms->secure) {
> +                        secure_tag_sysmem = g_new(MemoryRegion, 1);
> +                        memory_region_init(secure_tag_sysmem, OBJECT(machine),
> +                                           "secure-tag-memory",
> +                                           UINT64_MAX / 32);
> +
> +                        /* As with ram, secure-tag takes precedence over tag. */
> +                        memory_region_add_subregion_overlap(secure_tag_sysmem,
> +                                                            0, tag_sysmem, -1);
> +                    }
>                   }
>   
> -                tag_sysmem = g_new(MemoryRegion, 1);
> -                memory_region_init(tag_sysmem, OBJECT(machine),
> -                                   "tag-memory", UINT64_MAX / 32);
> -
> +                object_property_set_link(cpuobj, "tag-memory",
> +                                         OBJECT(tag_sysmem), &error_abort);
>                   if (vms->secure) {
> -                    secure_tag_sysmem = g_new(MemoryRegion, 1);
> -                    memory_region_init(secure_tag_sysmem, OBJECT(machine),
> -                                       "secure-tag-memory", UINT64_MAX / 32);
> -
> -                    /* As with ram, secure-tag takes precedence over tag.  */
> -                    memory_region_add_subregion_overlap(secure_tag_sysmem, 0,
> -                                                        tag_sysmem, -1);
> +                    object_property_set_link(cpuobj, "secure-tag-memory",
> +                                             OBJECT(secure_tag_sysmem),
> +                                             &error_abort);
>                   }
> -            }
> -
> -            object_property_set_link(cpuobj, "tag-memory", OBJECT(tag_sysmem),
> -                                     &error_abort);
> -            if (vms->secure) {
> -                object_property_set_link(cpuobj, "secure-tag-memory",
> -                                         OBJECT(secure_tag_sysmem),
> -                                         &error_abort);
> +            } else if (kvm_enabled()) {
> +                if (!kvm_arm_mte_supported()) {
> +                    error_report("MTE requested, but not supported by KVM");
> +                    exit(1);
> +                }
> +                kvm_arm_enable_mte(cpuobj, &error_abort);
> +            } else {
> +                    error_report("MTE requested, but not supported ");
> +                    exit(1);
>               }
>           }
>   
> diff --git a/target/arm/cpu.h b/target/arm/cpu.h
> index d8eb986a04..661d35d8d8 100644
> --- a/target/arm/cpu.h
> +++ b/target/arm/cpu.h
> @@ -1054,6 +1054,7 @@ struct ArchCPU {
>       bool prop_pauth_impdef;
>       bool prop_pauth_qarma3;
>       bool prop_lpa2;
> +    OnOffAuto prop_mte;
>   
>       /* DCZ blocksize, in log_2(words), ie low 4 bits of DCZID_EL0 */
>       uint8_t dcz_blocksize;
> diff --git a/target/arm/kvm.c b/target/arm/kvm.c
> index 70f79eda33..0267a013e4 100644
> --- a/target/arm/kvm.c
> +++ b/target/arm/kvm.c
> @@ -39,6 +39,7 @@
>   #include "hw/acpi/acpi.h"
>   #include "hw/acpi/ghes.h"
>   #include "target/arm/gtimer.h"
> +#include "migration/blocker.h"
>   
>   const KVMCapabilityInfo kvm_arch_required_capabilities[] = {
>       KVM_CAP_LAST_INFO
> @@ -1793,6 +1794,11 @@ bool kvm_arm_sve_supported(void)
>       return kvm_check_extension(kvm_state, KVM_CAP_ARM_SVE);
>   }
>   
> +bool kvm_arm_mte_supported(void)
> +{
> +    return kvm_check_extension(kvm_state, KVM_CAP_ARM_MTE);
> +}
> +
>   QEMU_BUILD_BUG_ON(KVM_ARM64_SVE_VQ_MIN != 1);
>   
>   uint32_t kvm_arm_sve_get_vls(ARMCPU *cpu)
> @@ -2422,3 +2428,37 @@ int kvm_arch_remove_sw_breakpoint(CPUState *cs, struct kvm_sw_breakpoint *bp)
>       }
>       return 0;
>   }
> +
> +void kvm_arm_enable_mte(Object *cpuobj, Error **errp)
> +{
> +    static bool tried_to_enable;
> +    static bool succeeded_to_enable;
> +    Error *mte_migration_blocker = NULL;
> +    int ret;
> +
> +    if (!tried_to_enable) {
> +        /*
> +         * MTE on KVM is enabled on a per-VM basis (and retrying doesn't make
> +         * sense), and we only want a single migration blocker as well.
> +         */
> +        tried_to_enable = true;
> +
> +        ret = kvm_vm_enable_cap(kvm_state, KVM_CAP_ARM_MTE, 0);
> +        if (ret) {
> +            error_setg_errno(errp, -ret, "Failed to enable KVM_CAP_ARM_MTE");
> +            return;
> +        }
> +
> +        /* TODO: Migration is not supported with MTE enabled */
> +        error_setg(&mte_migration_blocker,
> +                   "Live migration disabled due to MTE enabled");
> +        if (migrate_add_blocker(&mte_migration_blocker, errp)) {
> +            error_free(mte_migration_blocker);
> +            return;
> +        }
> +        succeeded_to_enable = true;
> +    }
> +    if (succeeded_to_enable) {
> +        object_property_set_bool(cpuobj, "has_mte", true, NULL);
> +    }
> +}
> diff --git a/target/arm/kvm_arm.h b/target/arm/kvm_arm.h
> index cfaa0d9bc7..4d293618a7 100644
> --- a/target/arm/kvm_arm.h
> +++ b/target/arm/kvm_arm.h
> @@ -188,6 +188,13 @@ bool kvm_arm_pmu_supported(void);
>    */
>   bool kvm_arm_sve_supported(void);
>   
> +/**
> + * kvm_arm_mte_supported:
> + *
> + * Returns: true if KVM can enable MTE, and false otherwise.
> + */
> +bool kvm_arm_mte_supported(void);
> +
>   /**
>    * kvm_arm_get_max_vm_ipa_size:
>    * @ms: Machine state handle
> @@ -214,6 +221,8 @@ void kvm_arm_pvtime_init(ARMCPU *cpu, uint64_t ipa);
>   
>   int kvm_arm_set_irq(int cpu, int irqtype, int irq, int level);
>   
> +void kvm_arm_enable_mte(Object *cpuobj, Error **errp);
> +
>   #else
>   
>   /*
> @@ -235,6 +244,11 @@ static inline bool kvm_arm_sve_supported(void)
>       return false;
>   }
>   
> +static inline bool kvm_arm_mte_supported(void)
> +{
> +    return false;
> +}
> +
>   /*
>    * These functions should never actually be called without KVM support.
>    */
> @@ -283,6 +297,11 @@ static inline uint32_t kvm_arm_sve_get_vls(ARMCPU *cpu)
>       g_assert_not_reached();
>   }
>   
> +static inline void kvm_arm_enable_mte(Object *cpuobj, Error **errp)
> +{
> +    g_assert_not_reached();
> +}
> +
>   #endif
>   
>   #endif

Thanks,
Ganapat
Peter Maydell July 16, 2024, 3:45 p.m. UTC | #4
On Tue, 9 Jul 2024 at 07:05, Ganapatrao Kulkarni
<gankulkarni@os.amperecomputing.com> wrote:
>
> Extend the 'mte' property for the virt machine to cover KVM as
> well. For KVM, we don't allocate tag memory, but instead enable
> the capability.
>
> If MTE has been enabled, we need to disable migration, as we do not
> yet have a way to migrate the tags as well. Therefore, MTE will stay
> off with KVM unless requested explicitly.
>
> This patch is rework of commit b320e21c48ce64853904bea6631c0158cc2ef227
> which broke TCG since it made the TCG -cpu max
> report the presence of MTE to the guest even if the board hadn't
> enabled MTE by wiring up the tag RAM. This meant that if the guest
> then tried to use MTE QEMU would segfault accessing the
> non-existent tag RAM.
>
> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
> Signed-off-by: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com>
> ---

In target/arm/cpu.c:arm_cpu_realizefn() there is this code:

    if (cpu_isar_feature(aa64_mte, cpu)) {
        /*
         * The architectural range of GM blocksize is 2-6, however qemu
         * doesn't support blocksize of 2 (see HELPER(ldgm)).
         */
        if (tcg_enabled()) {
            assert(cpu->gm_blocksize >= 3 && cpu->gm_blocksize <= 6);
        }

#ifndef CONFIG_USER_ONLY
        /*
         * If we do not have tag-memory provided by the machine,
         * reduce MTE support to instructions enabled at EL0.
         * This matches Cortex-A710 BROADCASTMTE input being LOW.
         */
        if (cpu->tag_memory == NULL) {
            cpu->isar.id_aa64pfr1 =
                FIELD_DP64(cpu->isar.id_aa64pfr1, ID_AA64PFR1, MTE, 1);
        }
#endif
    }

With this patch, for KVM we will end up going through the
"squash ID_AA64PFR1_EL1.MTE to 1" codepath, because KVM doesn't
set cpu->tag_memory and this is still using that as its check.

More generally, how does the enabling of the MTE KVM cap
interact with the ID_AA64PFR1_EL1 value that we read from
the host in kvm_arm_get_host_cpu_features() ? We care that we
have the right ID register values because we use ID field
checks to determine whether the vcpu has a feature or not,
even in the KVM case.

Since Cornelia first wrote the patch this is based on, we've
landed gdbstub support for MTE (so gdb can find out which
addresses in the memory map have tags and read and write
those tags). So I think the KVM MTE support now also needs to
handle that. (See aarch64_cpu_register_gdb_commands() in
target/arm/gdbstub64.c.)

thanks
-- PMM
Ganapatrao Kulkarni July 29, 2024, 9:37 a.m. UTC | #5
Hi Peter,


[Apologies for the delayed response]

On 16-07-2024 09:15 pm, Peter Maydell wrote:
> On Tue, 9 Jul 2024 at 07:05, Ganapatrao Kulkarni
> <gankulkarni@os.amperecomputing.com> wrote:
>>
>> Extend the 'mte' property for the virt machine to cover KVM as
>> well. For KVM, we don't allocate tag memory, but instead enable
>> the capability.
>>
>> If MTE has been enabled, we need to disable migration, as we do not
>> yet have a way to migrate the tags as well. Therefore, MTE will stay
>> off with KVM unless requested explicitly.
>>
>> This patch is rework of commit b320e21c48ce64853904bea6631c0158cc2ef227
>> which broke TCG since it made the TCG -cpu max
>> report the presence of MTE to the guest even if the board hadn't
>> enabled MTE by wiring up the tag RAM. This meant that if the guest
>> then tried to use MTE QEMU would segfault accessing the
>> non-existent tag RAM.
>>
>> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
>> Signed-off-by: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com>
>> ---
> 
> In target/arm/cpu.c:arm_cpu_realizefn() there is this code:
> 
>      if (cpu_isar_feature(aa64_mte, cpu)) {
>          /*
>           * The architectural range of GM blocksize is 2-6, however qemu
>           * doesn't support blocksize of 2 (see HELPER(ldgm)).
>           */
>          if (tcg_enabled()) {
>              assert(cpu->gm_blocksize >= 3 && cpu->gm_blocksize <= 6);
>          }
> 
> #ifndef CONFIG_USER_ONLY
>          /*
>           * If we do not have tag-memory provided by the machine,
>           * reduce MTE support to instructions enabled at EL0.
>           * This matches Cortex-A710 BROADCASTMTE input being LOW.
>           */
>          if (cpu->tag_memory == NULL) {
>              cpu->isar.id_aa64pfr1 =
>                  FIELD_DP64(cpu->isar.id_aa64pfr1, ID_AA64PFR1, MTE, 1);
>          }
> #endif
>      }
> 
> With this patch, for KVM we will end up going through the
> "squash ID_AA64PFR1_EL1.MTE to 1" codepath, because KVM doesn't
> set cpu->tag_memory and this is still using that as its check.
> 

I looked at this function and it seems we are not entering this function 
for KVM boot. I do see -DCONFIG_USER_ONLY added to make files.

Also Linux kernel wont detect/enable MTE until unless the 
ID_AA64PFR1_EL1.MTE value is 2(b0010) and above.

> More generally, how does the enabling of the MTE KVM cap
> interact with the ID_AA64PFR1_EL1 value that we read from
> the host in kvm_arm_get_host_cpu_features() ? We care that we
> have the right ID register values because we use ID field
> checks to determine whether the vcpu has a feature or not,
> even in the KVM case.
> 
> Since Cornelia first wrote the patch this is based on, we've
> landed gdbstub support for MTE (so gdb can find out which
> addresses in the memory map have tags and read and write
> those tags). So I think the KVM MTE support now also needs to
> handle that. (See aarch64_cpu_register_gdb_commands() in
> target/arm/gdbstub64.c.)

Ok sure, I will go through this file to add/update MTE part


Thanks,
Ganapat
Alex Bennée July 29, 2024, 10:14 a.m. UTC | #6
Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com> writes:

> Hi Peter,
>
>
> [Apologies for the delayed response]
>
> On 16-07-2024 09:15 pm, Peter Maydell wrote:
>> On Tue, 9 Jul 2024 at 07:05, Ganapatrao Kulkarni
>> <gankulkarni@os.amperecomputing.com> wrote:
>>>
>>> Extend the 'mte' property for the virt machine to cover KVM as
>>> well. For KVM, we don't allocate tag memory, but instead enable
>>> the capability.
>>>
>>> If MTE has been enabled, we need to disable migration, as we do not
>>> yet have a way to migrate the tags as well. Therefore, MTE will stay
>>> off with KVM unless requested explicitly.
>>>
>>> This patch is rework of commit b320e21c48ce64853904bea6631c0158cc2ef227
>>> which broke TCG since it made the TCG -cpu max
>>> report the presence of MTE to the guest even if the board hadn't
>>> enabled MTE by wiring up the tag RAM. This meant that if the guest
>>> then tried to use MTE QEMU would segfault accessing the
>>> non-existent tag RAM.
>>>
>>> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
>>> Signed-off-by: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com>
>>> ---
>> In target/arm/cpu.c:arm_cpu_realizefn() there is this code:
>>      if (cpu_isar_feature(aa64_mte, cpu)) {
>>          /*
>>           * The architectural range of GM blocksize is 2-6, however qemu
>>           * doesn't support blocksize of 2 (see HELPER(ldgm)).
>>           */
>>          if (tcg_enabled()) {
>>              assert(cpu->gm_blocksize >= 3 && cpu->gm_blocksize <= 6);
>>          }
>> #ifndef CONFIG_USER_ONLY
>>          /*
>>           * If we do not have tag-memory provided by the machine,
>>           * reduce MTE support to instructions enabled at EL0.
>>           * This matches Cortex-A710 BROADCASTMTE input being LOW.
>>           */
>>          if (cpu->tag_memory == NULL) {
>>              cpu->isar.id_aa64pfr1 =
>>                  FIELD_DP64(cpu->isar.id_aa64pfr1, ID_AA64PFR1, MTE, 1);
>>          }
>> #endif
>>      }
>> With this patch, for KVM we will end up going through the
>> "squash ID_AA64PFR1_EL1.MTE to 1" codepath, because KVM doesn't
>> set cpu->tag_memory and this is still using that as its check.
>> 
>
> I looked at this function and it seems we are not entering this
> function for KVM boot. I do see -DCONFIG_USER_ONLY added to make
> files.
>
> Also Linux kernel wont detect/enable MTE until unless the
> ID_AA64PFR1_EL1.MTE value is 2(b0010) and above.
>
>> More generally, how does the enabling of the MTE KVM cap
>> interact with the ID_AA64PFR1_EL1 value that we read from
>> the host in kvm_arm_get_host_cpu_features() ? We care that we
>> have the right ID register values because we use ID field
>> checks to determine whether the vcpu has a feature or not,
>> even in the KVM case.
>> Since Cornelia first wrote the patch this is based on, we've
>> landed gdbstub support for MTE (so gdb can find out which
>> addresses in the memory map have tags and read and write
>> those tags). So I think the KVM MTE support now also needs to
>> handle that. (See aarch64_cpu_register_gdb_commands() in
>> target/arm/gdbstub64.c.)
>
> Ok sure, I will go through this file to add/update MTE part

So to be clear the current MTE gdbstub support is linux-user only.
Gustavo has a series on the list that adds the system emulation part:

  Message-Id: <20240722160709.1677430-1-gustavo.romero@linaro.org>
  Date: Mon, 22 Jul 2024 16:07:05 +0000
  Subject: [PATCH 0/4] gdbstub: Add support for MTE in system mode
  From: Gustavo Romero <gustavo.romero@linaro.org>

which of course is focused on TCG. But if the KVM guests sync to the same
registers to cpregs I think most stuff should just work. However the
current code uses the TCG only:

  allocation_tag_mem_probe

which I guess needs a KVM equivalent to query the tag memory?

>
>
> Thanks,
> Ganapat
Ganapatrao Kulkarni July 29, 2024, 10:40 a.m. UTC | #7
On 29-07-2024 03:44 pm, Alex Bennée wrote:
> Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com> writes:
> 
>> Hi Peter,
>>
>>
>> [Apologies for the delayed response]
>>
>> On 16-07-2024 09:15 pm, Peter Maydell wrote:
>>> On Tue, 9 Jul 2024 at 07:05, Ganapatrao Kulkarni
>>> <gankulkarni@os.amperecomputing.com> wrote:
>>>>
>>>> Extend the 'mte' property for the virt machine to cover KVM as
>>>> well. For KVM, we don't allocate tag memory, but instead enable
>>>> the capability.
>>>>
>>>> If MTE has been enabled, we need to disable migration, as we do not
>>>> yet have a way to migrate the tags as well. Therefore, MTE will stay
>>>> off with KVM unless requested explicitly.
>>>>
>>>> This patch is rework of commit b320e21c48ce64853904bea6631c0158cc2ef227
>>>> which broke TCG since it made the TCG -cpu max
>>>> report the presence of MTE to the guest even if the board hadn't
>>>> enabled MTE by wiring up the tag RAM. This meant that if the guest
>>>> then tried to use MTE QEMU would segfault accessing the
>>>> non-existent tag RAM.
>>>>
>>>> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
>>>> Signed-off-by: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com>
>>>> ---
>>> In target/arm/cpu.c:arm_cpu_realizefn() there is this code:
>>>       if (cpu_isar_feature(aa64_mte, cpu)) {
>>>           /*
>>>            * The architectural range of GM blocksize is 2-6, however qemu
>>>            * doesn't support blocksize of 2 (see HELPER(ldgm)).
>>>            */
>>>           if (tcg_enabled()) {
>>>               assert(cpu->gm_blocksize >= 3 && cpu->gm_blocksize <= 6);
>>>           }
>>> #ifndef CONFIG_USER_ONLY
>>>           /*
>>>            * If we do not have tag-memory provided by the machine,
>>>            * reduce MTE support to instructions enabled at EL0.
>>>            * This matches Cortex-A710 BROADCASTMTE input being LOW.
>>>            */
>>>           if (cpu->tag_memory == NULL) {
>>>               cpu->isar.id_aa64pfr1 =
>>>                   FIELD_DP64(cpu->isar.id_aa64pfr1, ID_AA64PFR1, MTE, 1);
>>>           }
>>> #endif
>>>       }
>>> With this patch, for KVM we will end up going through the
>>> "squash ID_AA64PFR1_EL1.MTE to 1" codepath, because KVM doesn't
>>> set cpu->tag_memory and this is still using that as its check.
>>>
>>
>> I looked at this function and it seems we are not entering this
>> function for KVM boot. I do see -DCONFIG_USER_ONLY added to make
>> files.
>>
>> Also Linux kernel wont detect/enable MTE until unless the
>> ID_AA64PFR1_EL1.MTE value is 2(b0010) and above.
>>
>>> More generally, how does the enabling of the MTE KVM cap
>>> interact with the ID_AA64PFR1_EL1 value that we read from
>>> the host in kvm_arm_get_host_cpu_features() ? We care that we
>>> have the right ID register values because we use ID field
>>> checks to determine whether the vcpu has a feature or not,
>>> even in the KVM case.
>>> Since Cornelia first wrote the patch this is based on, we've
>>> landed gdbstub support for MTE (so gdb can find out which
>>> addresses in the memory map have tags and read and write
>>> those tags). So I think the KVM MTE support now also needs to
>>> handle that. (See aarch64_cpu_register_gdb_commands() in
>>> target/arm/gdbstub64.c.)
>>
>> Ok sure, I will go through this file to add/update MTE part
> 
> So to be clear the current MTE gdbstub support is linux-user only.
> Gustavo has a series on the list that adds the system emulation part:
> 
>    Message-Id: <20240722160709.1677430-1-gustavo.romero@linaro.org>
>    Date: Mon, 22 Jul 2024 16:07:05 +0000
>    Subject: [PATCH 0/4] gdbstub: Add support for MTE in system mode
>    From: Gustavo Romero <gustavo.romero@linaro.org>
> 
> which of course is focused on TCG. But if the KVM guests sync to the same
> registers to cpregs I think most stuff should just work. However the
> current code uses the TCG only:
> 
>    allocation_tag_mem_probe
> 
> which I guess needs a KVM equivalent to query the tag memory?

Ok, thanks for the heads-up!.

Thanks,
Ganapat
Ganapatrao Kulkarni July 31, 2024, 12:36 p.m. UTC | #8
On 29-07-2024 04:10 pm, Ganapatrao Kulkarni wrote:
> 
> 
> On 29-07-2024 03:44 pm, Alex Bennée wrote:
>> Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com> writes:
>>
>>> Hi Peter,
>>>
>>>
>>> [Apologies for the delayed response]
>>>
>>> On 16-07-2024 09:15 pm, Peter Maydell wrote:
>>>> On Tue, 9 Jul 2024 at 07:05, Ganapatrao Kulkarni
>>>> <gankulkarni@os.amperecomputing.com> wrote:
>>>>>
>>>>> Extend the 'mte' property for the virt machine to cover KVM as
>>>>> well. For KVM, we don't allocate tag memory, but instead enable
>>>>> the capability.
>>>>>
>>>>> If MTE has been enabled, we need to disable migration, as we do not
>>>>> yet have a way to migrate the tags as well. Therefore, MTE will stay
>>>>> off with KVM unless requested explicitly.
>>>>>
>>>>> This patch is rework of commit 
>>>>> b320e21c48ce64853904bea6631c0158cc2ef227
>>>>> which broke TCG since it made the TCG -cpu max
>>>>> report the presence of MTE to the guest even if the board hadn't
>>>>> enabled MTE by wiring up the tag RAM. This meant that if the guest
>>>>> then tried to use MTE QEMU would segfault accessing the
>>>>> non-existent tag RAM.
>>>>>
>>>>> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
>>>>> Signed-off-by: Ganapatrao Kulkarni 
>>>>> <gankulkarni@os.amperecomputing.com>
>>>>> ---
>>>> In target/arm/cpu.c:arm_cpu_realizefn() there is this code:
>>>>       if (cpu_isar_feature(aa64_mte, cpu)) {
>>>>           /*
>>>>            * The architectural range of GM blocksize is 2-6, however 
>>>> qemu
>>>>            * doesn't support blocksize of 2 (see HELPER(ldgm)).
>>>>            */
>>>>           if (tcg_enabled()) {
>>>>               assert(cpu->gm_blocksize >= 3 && cpu->gm_blocksize <= 6);
>>>>           }
>>>> #ifndef CONFIG_USER_ONLY
>>>>           /*
>>>>            * If we do not have tag-memory provided by the machine,
>>>>            * reduce MTE support to instructions enabled at EL0.
>>>>            * This matches Cortex-A710 BROADCASTMTE input being LOW.
>>>>            */
>>>>           if (cpu->tag_memory == NULL) {
>>>>               cpu->isar.id_aa64pfr1 =
>>>>                   FIELD_DP64(cpu->isar.id_aa64pfr1, ID_AA64PFR1, 
>>>> MTE, 1);
>>>>           }
>>>> #endif
>>>>       }
>>>> With this patch, for KVM we will end up going through the
>>>> "squash ID_AA64PFR1_EL1.MTE to 1" codepath, because KVM doesn't
>>>> set cpu->tag_memory and this is still using that as its check.
>>>>
>>>
>>> I looked at this function and it seems we are not entering this
>>> function for KVM boot. I do see -DCONFIG_USER_ONLY added to make
>>> files.
>>>

My bad, please ignore my previous/above comment.
I did not hit this issue since cpu_isar_feature(aa64_mte, cpu) is 
returning zero/false on my ARM64 platform. Then I dumped the register 
id_aa64pfr1 at QEMU(qemu-system-aarch64) as well in Linux(vanilla 6.10) 
kernel(for ioctl KVM_GET_ONE_REG) and to my surprise, in qemu the value 
is 0x21 however the value at kernel is 0x321(expected value).

Root-caused and it is due to, kernel is hiding[1] the MTE bits of 
ID_AA64PFR1_EL1 register from user/qemu. Need to send the kernel patch 
upstream to revert it, otherwise this check in qemu is dummy.

[1] 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=v6.11-rc1&id=2ac638fc5724f011f8ba1b425667c5592e1571ce

>>> Also Linux kernel wont detect/enable MTE until unless the
>>> ID_AA64PFR1_EL1.MTE value is 2(b0010) and above.
>>>
>>>> More generally, how does the enabling of the MTE KVM cap
>>>> interact with the ID_AA64PFR1_EL1 value that we read from
>>>> the host in kvm_arm_get_host_cpu_features() ? We care that we
>>>> have the right ID register values because we use ID field
>>>> checks to determine whether the vcpu has a feature or not,
>>>> even in the KVM case.
>>>> Since Cornelia first wrote the patch this is based on, we've
>>>> landed gdbstub support for MTE (so gdb can find out which
>>>> addresses in the memory map have tags and read and write
>>>> those tags). So I think the KVM MTE support now also needs to
>>>> handle that. (See aarch64_cpu_register_gdb_commands() in
>>>> target/arm/gdbstub64.c.)
>>>
>>> Ok sure, I will go through this file to add/update MTE part
>>
>> So to be clear the current MTE gdbstub support is linux-user only.
>> Gustavo has a series on the list that adds the system emulation part:
>>
>>    Message-Id: <20240722160709.1677430-1-gustavo.romero@linaro.org>
>>    Date: Mon, 22 Jul 2024 16:07:05 +0000
>>    Subject: [PATCH 0/4] gdbstub: Add support for MTE in system mode
>>    From: Gustavo Romero <gustavo.romero@linaro.org>
>>
>> which of course is focused on TCG. But if the KVM guests sync to the same
>> registers to cpregs I think most stuff should just work. However the
>> current code uses the TCG only:
>>
>>    allocation_tag_mem_probe
>>
>> which I guess needs a KVM equivalent to query the tag memory?
> 
> Ok, thanks for the heads-up!.
> 

Thanks,
Ganapat
Ganapatrao Kulkarni Aug. 2, 2024, 12:34 p.m. UTC | #9
Hi Peter,

On 16-07-2024 09:15 pm, Peter Maydell wrote:
> On Tue, 9 Jul 2024 at 07:05, Ganapatrao Kulkarni
> <gankulkarni@os.amperecomputing.com> wrote:
>>
>> Extend the 'mte' property for the virt machine to cover KVM as
>> well. For KVM, we don't allocate tag memory, but instead enable
>> the capability.
>>
>> If MTE has been enabled, we need to disable migration, as we do not
>> yet have a way to migrate the tags as well. Therefore, MTE will stay
>> off with KVM unless requested explicitly.
>>
>> This patch is rework of commit b320e21c48ce64853904bea6631c0158cc2ef227
>> which broke TCG since it made the TCG -cpu max
>> report the presence of MTE to the guest even if the board hadn't
>> enabled MTE by wiring up the tag RAM. This meant that if the guest
>> then tried to use MTE QEMU would segfault accessing the
>> non-existent tag RAM.
>>
>> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
>> Signed-off-by: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com>
>> ---
> 
> In target/arm/cpu.c:arm_cpu_realizefn() there is this code:
> 
>      if (cpu_isar_feature(aa64_mte, cpu)) {
>          /*
>           * The architectural range of GM blocksize is 2-6, however qemu
>           * doesn't support blocksize of 2 (see HELPER(ldgm)).
>           */
>          if (tcg_enabled()) {
>              assert(cpu->gm_blocksize >= 3 && cpu->gm_blocksize <= 6);
>          }
> 
> #ifndef CONFIG_USER_ONLY
>          /*
>           * If we do not have tag-memory provided by the machine,
>           * reduce MTE support to instructions enabled at EL0.
>           * This matches Cortex-A710 BROADCASTMTE input being LOW.
>           */
>          if (cpu->tag_memory == NULL) {
>              cpu->isar.id_aa64pfr1 =
>                  FIELD_DP64(cpu->isar.id_aa64pfr1, ID_AA64PFR1, MTE, 1);
>          }
> #endif
>      }
> 
> With this patch, for KVM we will end up going through the
> "squash ID_AA64PFR1_EL1.MTE to 1" codepath, because KVM doesn't
> set cpu->tag_memory and this is still using that as its check.
> 
> More generally, how does the enabling of the MTE KVM cap
> interact with the ID_AA64PFR1_EL1 value that we read from
> the host in kvm_arm_get_host_cpu_features() ? We care that we
> have the right ID register values because we use ID field
> checks to determine whether the vcpu has a feature or not,
> even in the KVM case.

Using ID_AA64PFR1_EL1.MTE bits seems to have issues with the Linux 
kernel implementation.
This register is sensitized to user space and value read differs time to 
time. ID_AA64PFR1_EL1 register read at the beginning of qemu code will 
have the MTE bits cleared/masked. However ID_AA64PFR1_EL1.MTE bits are 
unmasked and shows the real value of MTE after ioctl KVM_CAP_ARM_MTE is 
executed to enable MTE.
In QEMU,. isar.id_aa64pfr1 is read at the very beginning and cached, by 
creating dummy(kvm_arm_create_scratch_host_vcpu) interfaces(fds) for 
kvm, vm and vcpu. At later stages use of isar.id_aa64pfr1.mte at 
function like arm_cpu_realizefn does not show the right value and code 
never enters the "if (cpu_isar_feature(aa64_mte, cpu)" loop.

Not sure about other feature bits, but for MTE, using isar.id_aa64pfr1 
may not be appropriate but I do see it is getting used already many 
places of the code. I am not sure how it is behaving on emulator/TCG mode?

Having said that, I have tried to remove the sensitization of 
ID_AA64PFR1_EL1 for MTE bits in the kernel, but that will have bigger 
problem like the VM boot crashes with the default compiled qemu binary 
with normal VM boot itself, since VM detects as MTE present and starts 
init of it.

> 
> Since Cornelia first wrote the patch this is based on, we've
> landed gdbstub support for MTE (so gdb can find out which
> addresses in the memory map have tags and read and write
> those tags). So I think the KVM MTE support now also needs to
> handle that. (See aarch64_cpu_register_gdb_commands() in
> target/arm/gdbstub64.c.)
> 
> thanks
> -- PMM

Thanks,
Ganapat
Cornelia Huck Aug. 2, 2024, 1:16 p.m. UTC | #10
On Fri, Aug 02 2024, Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com> wrote:

> Hi Peter,
>
> On 16-07-2024 09:15 pm, Peter Maydell wrote:
>> On Tue, 9 Jul 2024 at 07:05, Ganapatrao Kulkarni
>> <gankulkarni@os.amperecomputing.com> wrote:
>>>
>>> Extend the 'mte' property for the virt machine to cover KVM as
>>> well. For KVM, we don't allocate tag memory, but instead enable
>>> the capability.
>>>
>>> If MTE has been enabled, we need to disable migration, as we do not
>>> yet have a way to migrate the tags as well. Therefore, MTE will stay
>>> off with KVM unless requested explicitly.
>>>
>>> This patch is rework of commit b320e21c48ce64853904bea6631c0158cc2ef227
>>> which broke TCG since it made the TCG -cpu max
>>> report the presence of MTE to the guest even if the board hadn't
>>> enabled MTE by wiring up the tag RAM. This meant that if the guest
>>> then tried to use MTE QEMU would segfault accessing the
>>> non-existent tag RAM.
>>>
>>> Signed-off-by: Cornelia Huck <cohuck@redhat.com>
>>> Signed-off-by: Ganapatrao Kulkarni <gankulkarni@os.amperecomputing.com>
>>> ---
>> 
>> In target/arm/cpu.c:arm_cpu_realizefn() there is this code:
>> 
>>      if (cpu_isar_feature(aa64_mte, cpu)) {
>>          /*
>>           * The architectural range of GM blocksize is 2-6, however qemu
>>           * doesn't support blocksize of 2 (see HELPER(ldgm)).
>>           */
>>          if (tcg_enabled()) {
>>              assert(cpu->gm_blocksize >= 3 && cpu->gm_blocksize <= 6);
>>          }
>> 
>> #ifndef CONFIG_USER_ONLY
>>          /*
>>           * If we do not have tag-memory provided by the machine,
>>           * reduce MTE support to instructions enabled at EL0.
>>           * This matches Cortex-A710 BROADCASTMTE input being LOW.
>>           */
>>          if (cpu->tag_memory == NULL) {
>>              cpu->isar.id_aa64pfr1 =
>>                  FIELD_DP64(cpu->isar.id_aa64pfr1, ID_AA64PFR1, MTE, 1);
>>          }
>> #endif
>>      }
>> 
>> With this patch, for KVM we will end up going through the
>> "squash ID_AA64PFR1_EL1.MTE to 1" codepath, because KVM doesn't
>> set cpu->tag_memory and this is still using that as its check.
>> 
>> More generally, how does the enabling of the MTE KVM cap
>> interact with the ID_AA64PFR1_EL1 value that we read from
>> the host in kvm_arm_get_host_cpu_features() ? We care that we
>> have the right ID register values because we use ID field
>> checks to determine whether the vcpu has a feature or not,
>> even in the KVM case.
>
> Using ID_AA64PFR1_EL1.MTE bits seems to have issues with the Linux 
> kernel implementation.
> This register is sensitized to user space and value read differs time to 
> time. ID_AA64PFR1_EL1 register read at the beginning of qemu code will 
> have the MTE bits cleared/masked. However ID_AA64PFR1_EL1.MTE bits are 
> unmasked and shows the real value of MTE after ioctl KVM_CAP_ARM_MTE is 
> executed to enable MTE.
> In QEMU,. isar.id_aa64pfr1 is read at the very beginning and cached, by 
> creating dummy(kvm_arm_create_scratch_host_vcpu) interfaces(fds) for 
> kvm, vm and vcpu. At later stages use of isar.id_aa64pfr1.mte at 
> function like arm_cpu_realizefn does not show the right value and code 
> never enters the "if (cpu_isar_feature(aa64_mte, cpu)" loop.
>
> Not sure about other feature bits, but for MTE, using isar.id_aa64pfr1 
> may not be appropriate but I do see it is getting used already many 
> places of the code. I am not sure how it is behaving on emulator/TCG mode?
>
> Having said that, I have tried to remove the sensitization of 
> ID_AA64PFR1_EL1 for MTE bits in the kernel, but that will have bigger 
> problem like the VM boot crashes with the default compiled qemu binary 
> with normal VM boot itself, since VM detects as MTE present and starts 
> init of it.

Not sure if you've seen
https://lore.kernel.org/all/20240723072004.1470688-1-shahuang@redhat.com/,
which aims to make aa64pfr1 writable by userspace -- but still needs
special handling for MTE. (That series also masks out MTEX and MTE_frac
because the Linux kernel does not yet handle them; might become relevant
if we want to support it from QEMU at some time in the future, but then
the command line would also need some thought.)

I fear that we'll be stuck with special handling for the MTE bits in
aa64pfr1 because the kernel will not want to break its userspace API.
Ganapatrao Kulkarni Sept. 10, 2024, 11:57 a.m. UTC | #11
Hi Peter,

On 16-07-2024 09:15 pm, Peter Maydell wrote:
> 
> In target/arm/cpu.c:arm_cpu_realizefn() there is this code:
> 
>      if (cpu_isar_feature(aa64_mte, cpu)) {
>          /*
>           * The architectural range of GM blocksize is 2-6, however qemu
>           * doesn't support blocksize of 2 (see HELPER(ldgm)).
>           */
>          if (tcg_enabled()) {
>              assert(cpu->gm_blocksize >= 3 && cpu->gm_blocksize <= 6);
>          }
> 
> #ifndef CONFIG_USER_ONLY
>          /*
>           * If we do not have tag-memory provided by the machine,
>           * reduce MTE support to instructions enabled at EL0.
>           * This matches Cortex-A710 BROADCASTMTE input being LOW.
>           */
>          if (cpu->tag_memory == NULL) {
>              cpu->isar.id_aa64pfr1 =
>                  FIELD_DP64(cpu->isar.id_aa64pfr1, ID_AA64PFR1, MTE, 1);
>          }
> #endif
>      }
> 
> With this patch, for KVM we will end up going through the
> "squash ID_AA64PFR1_EL1.MTE to 1" codepath, because KVM doesn't
> set cpu->tag_memory and this is still using that as its check.
> 
> More generally, how does the enabling of the MTE KVM cap
> interact with the ID_AA64PFR1_EL1 value that we read from
> the host in kvm_arm_get_host_cpu_features() ? We care that we

Linux kernel masks the MTE bits of register id_aa64pfr1 until unless the 
MTE is enabled for that VM. I have modified to enable 
MTE(KVM_CAP_ARM_MTE) before we read the register id_aa64pfr1 in 
kvm_arm_get_host_cpu_features to make sure we get the unmasked/actual 
MTE bits. I will post this change in the V2 patchset.

> have the right ID register values because we use ID field
> checks to determine whether the vcpu has a feature or not,
> even in the KVM case.
> 
> Since Cornelia first wrote the patch this is based on, we've
> landed gdbstub support for MTE (so gdb can find out which
> addresses in the memory map have tags and read and write
> those tags). So I think the KVM MTE support now also needs to
> handle that. (See aarch64_cpu_register_gdb_commands() in
> target/arm/gdbstub64.c.)

I looked at this code and it looks like, complete code is under
ifdef CONFIG_USER_ONLY and for kvm(target aarch64-softmmu) this is not 
getting enabled. Are you asking to remove these ifdef and make
mte-gdbstub commands available for the KVM mode as well?
Peter Maydell Sept. 10, 2024, 12:23 p.m. UTC | #12
On Tue, 10 Sept 2024 at 12:57, Ganapatrao Kulkarni
<gankulkarni@os.amperecomputing.com> wrote:
> On 16-07-2024 09:15 pm, Peter Maydell wrote:
> > Since Cornelia first wrote the patch this is based on, we've
> > landed gdbstub support for MTE (so gdb can find out which
> > addresses in the memory map have tags and read and write
> > those tags). So I think the KVM MTE support now also needs to
> > handle that. (See aarch64_cpu_register_gdb_commands() in
> > target/arm/gdbstub64.c.)
>
> I looked at this code and it looks like, complete code is under
> ifdef CONFIG_USER_ONLY and for kvm(target aarch64-softmmu) this is not
> getting enabled. Are you asking to remove these ifdef and make
> mte-gdbstub commands available for the KVM mode as well?

The system mode support for mte gdbstub is just about
to land. The current patchset is this one:
https://patchew.org/QEMU/20240906143316.657436-1-gustavo.romero@linaro.org/

thanks
-- PMM
Ganapatrao Kulkarni Sept. 11, 2024, 6:50 a.m. UTC | #13
On 10-09-2024 05:53 pm, Peter Maydell wrote:
> On Tue, 10 Sept 2024 at 12:57, Ganapatrao Kulkarni
> <gankulkarni@os.amperecomputing.com> wrote:
>> On 16-07-2024 09:15 pm, Peter Maydell wrote:
>>> Since Cornelia first wrote the patch this is based on, we've
>>> landed gdbstub support for MTE (so gdb can find out which
>>> addresses in the memory map have tags and read and write
>>> those tags). So I think the KVM MTE support now also needs to
>>> handle that. (See aarch64_cpu_register_gdb_commands() in
>>> target/arm/gdbstub64.c.)
>>
>> I looked at this code and it looks like, complete code is under
>> ifdef CONFIG_USER_ONLY and for kvm(target aarch64-softmmu) this is not
>> getting enabled. Are you asking to remove these ifdef and make
>> mte-gdbstub commands available for the KVM mode as well?
> 
> The system mode support for mte gdbstub is just about
> to land. The current patchset is this one:
> https://patchew.org/QEMU/20240906143316.657436-1-gustavo.romero@linaro.org/
> 

Thanks.
I applied these patches to qemu and compiled gdb as said in the 
cover-letter. Below is the log of the run, help me to interpret the logs.

[root@sut01sys-r214 mte-qemu]# make -C build -j 32 
run-tcg-tests-aarch64-softmmu
make: Entering directory '/home/ganapat/upstream/mte-qemu/build'
   BUILD   aarch64-softmmu guest-tests
   RUN     aarch64-softmmu guest-tests
   TEST    mte on aarch64
   TEST    hello on aarch64
   TEST    interrupt on aarch64
   TEST    memory on aarch64
   TEST    memory-sve on aarch64
   TEST    hello-with-libbb.so on aarch64
   TEST    interrupt-with-libbb.so on aarch64
   TEST    memory-with-libbb.so on aarch64
   TEST    hello-with-libempty.so on aarch64
   TEST    interrupt-with-libempty.so on aarch64
   TEST    memory-with-libempty.so on aarch64
   TEST    interrupt-with-libinline.so on aarch64
   TEST    hello-with-libinline.so on aarch64
   TEST    memory-with-libinline.so on aarch64
   TEST    hello-with-libinsn.so on aarch64
   TEST    memory-with-libinsn.so on aarch64
   TEST    interrupt-with-libinsn.so on aarch64
   TEST    hello-with-libmem.so on aarch64
   TEST    interrupt-with-libmem.so on aarch64
   TEST    hello-with-libsyscall.so on aarch64
   TEST    memory-with-libmem.so on aarch64
   TEST    interrupt-with-libsyscall.so on aarch64
   TEST    softmmu gdbstub support on aarch64
   TEST    memory-with-libsyscall.so on aarch64
   TEST    softmmu gdbstub support on aarch64
   TEST    softmmu gdbstub untimely packets on aarch64
   TEST    softmmu gdbstub support on aarch64
   TEST    gdbstub MTE support on aarch64
   TEST    memory-record on aarch64
qemu-system-aarch64: -gdb 
unix:path=/tmp/tmp80lbp057qemu-gdbstub/gdbstub.socket,server=on: info: 
QEMU waiting for connection on: 
disconnected:unix:/tmp/tmp80lbp057qemu-gdbstub/gdbstub.socket,server=on
qemu-system-aarch64: -gdb 
unix:path=/tmp/tmp8h7kx8kyqemu-gdbstub/gdbstub.socket,server=on: info: 
QEMU waiting for connection on: 
disconnected:unix:/tmp/tmp8h7kx8kyqemu-gdbstub/gdbstub.socket,server=on
qemu-system-aarch64: -gdb 
unix:path=/tmp/tmpo0448vk6qemu-gdbstub/gdbstub.socket,server=on: info: 
QEMU waiting for connection on: 
disconnected:unix:/tmp/tmpo0448vk6qemu-gdbstub/gdbstub.socket,server=on
qemu-system-aarch64: -gdb 
unix:path=/tmp/tmplvwd79e3qemu-gdbstub/gdbstub.socket,server=on: info: 
QEMU waiting for connection on: 
disconnected:unix:/tmp/tmplvwd79e3qemu-gdbstub/gdbstub.socket,server=on
qemu-system-aarch64: -gdb 
unix:path=/tmp/tmpv1t2ffjfqemu-gdbstub/gdbstub.socket,server=on: info: 
QEMU waiting for connection on: 
disconnected:unix:/tmp/tmpv1t2ffjfqemu-gdbstub/gdbstub.socket,server=on
   TEST    memory-replay on aarch64
qemu-system-aarch64: QEMU: Terminated via GDBstub
qemu-system-aarch64: QEMU: Terminated via GDBstub
   GREP    file untimely-packet.gdb.err
qemu-system-aarch64: QEMU: Terminated via GDBstub
qemu-system-aarch64: QEMU: Terminated via GDBstub
make: Leaving directory '/home/ganapat/upstream/mte-qemu/build'
[root@sut01sys-r214 mte-qemu]#

Last few lines of untimely-packet.gdb.err below:

  [remote] Packet received: 
00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 
[24 bytes omitted]
   [remote] Sending packet: $qfThreadInfo#bb
   [remote] Received Ack
   [remote] Packet received: mp01.01
   [remote] Sending packet: $qsThreadInfo#c8
   [remote] Received Ack
   [remote] Packet received: l
   [remote] Sending packet: $p56#db
   [remote] Received Ack
   [remote] Packet received: 000000000000ffff
   [remote] packet_ok: Packet p (fetch-register) is supported
   [remote] Sending packet: $p57#dc
   [remote] Received Ack
   [remote] Packet received: 000000000000ffff
   [remote] Sending packet: $m400027b0,4#8c
   [remote] Received Ack
   [remote] Packet received: 80c2ff10
   [remote] Sending packet: $m400027ac,4#be
   [remote] Received Ack
   [remote] Packet received: 1f2003d5
   [remote] Sending packet: $m400027b0,4#8c
   [remote] Received Ack
   [remote] Packet received: 80c2ff10
   [remote] Sending packet: $qSymbol::#5b
   [remote] Received Ack
   [remote] Packet received:
   [remote] packet_ok: Packet qSymbol (symbol-lookup) is NOT supported
[remote] start_remote_1: exit
[remote] Sending packet: $D;1#b0
[remote] Received Ack
[remote] Packet received: OK
Peter Maydell Sept. 11, 2024, 10:30 a.m. UTC | #14
On Wed, 11 Sept 2024 at 07:50, Ganapatrao Kulkarni
<gankulkarni@os.amperecomputing.com> wrote:
>
>
>
> On 10-09-2024 05:53 pm, Peter Maydell wrote:
> > On Tue, 10 Sept 2024 at 12:57, Ganapatrao Kulkarni
> > <gankulkarni@os.amperecomputing.com> wrote:
> >> On 16-07-2024 09:15 pm, Peter Maydell wrote:
> >>> Since Cornelia first wrote the patch this is based on, we've
> >>> landed gdbstub support for MTE (so gdb can find out which
> >>> addresses in the memory map have tags and read and write
> >>> those tags). So I think the KVM MTE support now also needs to
> >>> handle that. (See aarch64_cpu_register_gdb_commands() in
> >>> target/arm/gdbstub64.c.)
> >>
> >> I looked at this code and it looks like, complete code is under
> >> ifdef CONFIG_USER_ONLY and for kvm(target aarch64-softmmu) this is not
> >> getting enabled. Are you asking to remove these ifdef and make
> >> mte-gdbstub commands available for the KVM mode as well?
> >
> > The system mode support for mte gdbstub is just about
> > to land. The current patchset is this one:
> > https://patchew.org/QEMU/20240906143316.657436-1-gustavo.romero@linaro.org/
> >
>
> Thanks.
> I applied these patches to qemu and compiled gdb as said in the
> cover-letter. Below is the log of the run, help me to interpret the logs.

I've cc'd Gustavo since he's the author of the patches.

> [root@sut01sys-r214 mte-qemu]# make -C build -j 32
> run-tcg-tests-aarch64-softmmu
> make: Entering directory '/home/ganapat/upstream/mte-qemu/build'
>    BUILD   aarch64-softmmu guest-tests
>    RUN     aarch64-softmmu guest-tests
>    TEST    mte on aarch64
>    TEST    hello on aarch64
>    TEST    interrupt on aarch64
>    TEST    memory on aarch64
>    TEST    memory-sve on aarch64
>    TEST    hello-with-libbb.so on aarch64
>    TEST    interrupt-with-libbb.so on aarch64
>    TEST    memory-with-libbb.so on aarch64
>    TEST    hello-with-libempty.so on aarch64
>    TEST    interrupt-with-libempty.so on aarch64
>    TEST    memory-with-libempty.so on aarch64
>    TEST    interrupt-with-libinline.so on aarch64
>    TEST    hello-with-libinline.so on aarch64
>    TEST    memory-with-libinline.so on aarch64
>    TEST    hello-with-libinsn.so on aarch64
>    TEST    memory-with-libinsn.so on aarch64
>    TEST    interrupt-with-libinsn.so on aarch64
>    TEST    hello-with-libmem.so on aarch64
>    TEST    interrupt-with-libmem.so on aarch64
>    TEST    hello-with-libsyscall.so on aarch64
>    TEST    memory-with-libmem.so on aarch64
>    TEST    interrupt-with-libsyscall.so on aarch64
>    TEST    softmmu gdbstub support on aarch64
>    TEST    memory-with-libsyscall.so on aarch64
>    TEST    softmmu gdbstub support on aarch64
>    TEST    softmmu gdbstub untimely packets on aarch64
>    TEST    softmmu gdbstub support on aarch64
>    TEST    gdbstub MTE support on aarch64
>    TEST    memory-record on aarch64
> qemu-system-aarch64: -gdb
> unix:path=/tmp/tmp80lbp057qemu-gdbstub/gdbstub.socket,server=on: info:
> QEMU waiting for connection on:
> disconnected:unix:/tmp/tmp80lbp057qemu-gdbstub/gdbstub.socket,server=on
> qemu-system-aarch64: -gdb
> unix:path=/tmp/tmp8h7kx8kyqemu-gdbstub/gdbstub.socket,server=on: info:
> QEMU waiting for connection on:
> disconnected:unix:/tmp/tmp8h7kx8kyqemu-gdbstub/gdbstub.socket,server=on
> qemu-system-aarch64: -gdb
> unix:path=/tmp/tmpo0448vk6qemu-gdbstub/gdbstub.socket,server=on: info:
> QEMU waiting for connection on:
> disconnected:unix:/tmp/tmpo0448vk6qemu-gdbstub/gdbstub.socket,server=on
> qemu-system-aarch64: -gdb
> unix:path=/tmp/tmplvwd79e3qemu-gdbstub/gdbstub.socket,server=on: info:
> QEMU waiting for connection on:
> disconnected:unix:/tmp/tmplvwd79e3qemu-gdbstub/gdbstub.socket,server=on
> qemu-system-aarch64: -gdb
> unix:path=/tmp/tmpv1t2ffjfqemu-gdbstub/gdbstub.socket,server=on: info:
> QEMU waiting for connection on:
> disconnected:unix:/tmp/tmpv1t2ffjfqemu-gdbstub/gdbstub.socket,server=on
>    TEST    memory-replay on aarch64
> qemu-system-aarch64: QEMU: Terminated via GDBstub
> qemu-system-aarch64: QEMU: Terminated via GDBstub
>    GREP    file untimely-packet.gdb.err
> qemu-system-aarch64: QEMU: Terminated via GDBstub
> qemu-system-aarch64: QEMU: Terminated via GDBstub
> make: Leaving directory '/home/ganapat/upstream/mte-qemu/build'
> [root@sut01sys-r214 mte-qemu]#
>
> Last few lines of untimely-packet.gdb.err below:
>
>   [remote] Packet received:
> 00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
> [24 bytes omitted]
>    [remote] Sending packet: $qfThreadInfo#bb
>    [remote] Received Ack
>    [remote] Packet received: mp01.01
>    [remote] Sending packet: $qsThreadInfo#c8
>    [remote] Received Ack
>    [remote] Packet received: l
>    [remote] Sending packet: $p56#db
>    [remote] Received Ack
>    [remote] Packet received: 000000000000ffff
>    [remote] packet_ok: Packet p (fetch-register) is supported
>    [remote] Sending packet: $p57#dc
>    [remote] Received Ack
>    [remote] Packet received: 000000000000ffff
>    [remote] Sending packet: $m400027b0,4#8c
>    [remote] Received Ack
>    [remote] Packet received: 80c2ff10
>    [remote] Sending packet: $m400027ac,4#be
>    [remote] Received Ack
>    [remote] Packet received: 1f2003d5
>    [remote] Sending packet: $m400027b0,4#8c
>    [remote] Received Ack
>    [remote] Packet received: 80c2ff10
>    [remote] Sending packet: $qSymbol::#5b
>    [remote] Received Ack
>    [remote] Packet received:
>    [remote] packet_ok: Packet qSymbol (symbol-lookup) is NOT supported
> [remote] start_remote_1: exit
> [remote] Sending packet: $D;1#b0
> [remote] Received Ack
> [remote] Packet received: OK
>
>
> --
> Thanks,
> Ganapat/GK

-- PMM
diff mbox series

Patch

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index b0c68d66a3..cc9db79ec3 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -2206,7 +2206,7 @@  static void machvirt_init(MachineState *machine)
         exit(1);
     }
 
-    if (vms->mte && (kvm_enabled() || hvf_enabled())) {
+    if (vms->mte && hvf_enabled()) {
         error_report("mach-virt: %s does not support providing "
                      "MTE to the guest CPU",
                      current_accel_name());
@@ -2276,39 +2276,51 @@  static void machvirt_init(MachineState *machine)
         }
 
         if (vms->mte) {
-            /* Create the memory region only once, but link to all cpus. */
-            if (!tag_sysmem) {
-                /*
-                 * The property exists only if MemTag is supported.
-                 * If it is, we must allocate the ram to back that up.
-                 */
-                if (!object_property_find(cpuobj, "tag-memory")) {
-                    error_report("MTE requested, but not supported "
-                                 "by the guest CPU");
-                    exit(1);
+            if (tcg_enabled()) {
+                /* Create the memory region only once, but link to all cpus. */
+                if (!tag_sysmem) {
+                    /*
+                     * The property exists only if MemTag is supported.
+                     * If it is, we must allocate the ram to back that up.
+                     */
+                    if (!object_property_find(cpuobj, "tag-memory")) {
+                        error_report("MTE requested, but not supported "
+                                     "by the guest CPU");
+                        exit(1);
+                    }
+
+                    tag_sysmem = g_new(MemoryRegion, 1);
+                    memory_region_init(tag_sysmem, OBJECT(machine),
+                                       "tag-memory", UINT64_MAX / 32);
+
+                    if (vms->secure) {
+                        secure_tag_sysmem = g_new(MemoryRegion, 1);
+                        memory_region_init(secure_tag_sysmem, OBJECT(machine),
+                                           "secure-tag-memory",
+                                           UINT64_MAX / 32);
+
+                        /* As with ram, secure-tag takes precedence over tag. */
+                        memory_region_add_subregion_overlap(secure_tag_sysmem,
+                                                            0, tag_sysmem, -1);
+                    }
                 }
 
-                tag_sysmem = g_new(MemoryRegion, 1);
-                memory_region_init(tag_sysmem, OBJECT(machine),
-                                   "tag-memory", UINT64_MAX / 32);
-
+                object_property_set_link(cpuobj, "tag-memory",
+                                         OBJECT(tag_sysmem), &error_abort);
                 if (vms->secure) {
-                    secure_tag_sysmem = g_new(MemoryRegion, 1);
-                    memory_region_init(secure_tag_sysmem, OBJECT(machine),
-                                       "secure-tag-memory", UINT64_MAX / 32);
-
-                    /* As with ram, secure-tag takes precedence over tag.  */
-                    memory_region_add_subregion_overlap(secure_tag_sysmem, 0,
-                                                        tag_sysmem, -1);
+                    object_property_set_link(cpuobj, "secure-tag-memory",
+                                             OBJECT(secure_tag_sysmem),
+                                             &error_abort);
                 }
-            }
-
-            object_property_set_link(cpuobj, "tag-memory", OBJECT(tag_sysmem),
-                                     &error_abort);
-            if (vms->secure) {
-                object_property_set_link(cpuobj, "secure-tag-memory",
-                                         OBJECT(secure_tag_sysmem),
-                                         &error_abort);
+            } else if (kvm_enabled()) {
+                if (!kvm_arm_mte_supported()) {
+                    error_report("MTE requested, but not supported by KVM");
+                    exit(1);
+                }
+                kvm_arm_enable_mte(cpuobj, &error_abort);
+            } else {
+                    error_report("MTE requested, but not supported ");
+                    exit(1);
             }
         }
 
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index d8eb986a04..661d35d8d8 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -1054,6 +1054,7 @@  struct ArchCPU {
     bool prop_pauth_impdef;
     bool prop_pauth_qarma3;
     bool prop_lpa2;
+    OnOffAuto prop_mte;
 
     /* DCZ blocksize, in log_2(words), ie low 4 bits of DCZID_EL0 */
     uint8_t dcz_blocksize;
diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index 70f79eda33..0267a013e4 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -39,6 +39,7 @@ 
 #include "hw/acpi/acpi.h"
 #include "hw/acpi/ghes.h"
 #include "target/arm/gtimer.h"
+#include "migration/blocker.h"
 
 const KVMCapabilityInfo kvm_arch_required_capabilities[] = {
     KVM_CAP_LAST_INFO
@@ -1793,6 +1794,11 @@  bool kvm_arm_sve_supported(void)
     return kvm_check_extension(kvm_state, KVM_CAP_ARM_SVE);
 }
 
+bool kvm_arm_mte_supported(void)
+{
+    return kvm_check_extension(kvm_state, KVM_CAP_ARM_MTE);
+}
+
 QEMU_BUILD_BUG_ON(KVM_ARM64_SVE_VQ_MIN != 1);
 
 uint32_t kvm_arm_sve_get_vls(ARMCPU *cpu)
@@ -2422,3 +2428,37 @@  int kvm_arch_remove_sw_breakpoint(CPUState *cs, struct kvm_sw_breakpoint *bp)
     }
     return 0;
 }
+
+void kvm_arm_enable_mte(Object *cpuobj, Error **errp)
+{
+    static bool tried_to_enable;
+    static bool succeeded_to_enable;
+    Error *mte_migration_blocker = NULL;
+    int ret;
+
+    if (!tried_to_enable) {
+        /*
+         * MTE on KVM is enabled on a per-VM basis (and retrying doesn't make
+         * sense), and we only want a single migration blocker as well.
+         */
+        tried_to_enable = true;
+
+        ret = kvm_vm_enable_cap(kvm_state, KVM_CAP_ARM_MTE, 0);
+        if (ret) {
+            error_setg_errno(errp, -ret, "Failed to enable KVM_CAP_ARM_MTE");
+            return;
+        }
+
+        /* TODO: Migration is not supported with MTE enabled */
+        error_setg(&mte_migration_blocker,
+                   "Live migration disabled due to MTE enabled");
+        if (migrate_add_blocker(&mte_migration_blocker, errp)) {
+            error_free(mte_migration_blocker);
+            return;
+        }
+        succeeded_to_enable = true;
+    }
+    if (succeeded_to_enable) {
+        object_property_set_bool(cpuobj, "has_mte", true, NULL);
+    }
+}
diff --git a/target/arm/kvm_arm.h b/target/arm/kvm_arm.h
index cfaa0d9bc7..4d293618a7 100644
--- a/target/arm/kvm_arm.h
+++ b/target/arm/kvm_arm.h
@@ -188,6 +188,13 @@  bool kvm_arm_pmu_supported(void);
  */
 bool kvm_arm_sve_supported(void);
 
+/**
+ * kvm_arm_mte_supported:
+ *
+ * Returns: true if KVM can enable MTE, and false otherwise.
+ */
+bool kvm_arm_mte_supported(void);
+
 /**
  * kvm_arm_get_max_vm_ipa_size:
  * @ms: Machine state handle
@@ -214,6 +221,8 @@  void kvm_arm_pvtime_init(ARMCPU *cpu, uint64_t ipa);
 
 int kvm_arm_set_irq(int cpu, int irqtype, int irq, int level);
 
+void kvm_arm_enable_mte(Object *cpuobj, Error **errp);
+
 #else
 
 /*
@@ -235,6 +244,11 @@  static inline bool kvm_arm_sve_supported(void)
     return false;
 }
 
+static inline bool kvm_arm_mte_supported(void)
+{
+    return false;
+}
+
 /*
  * These functions should never actually be called without KVM support.
  */
@@ -283,6 +297,11 @@  static inline uint32_t kvm_arm_sve_get_vls(ARMCPU *cpu)
     g_assert_not_reached();
 }
 
+static inline void kvm_arm_enable_mte(Object *cpuobj, Error **errp)
+{
+    g_assert_not_reached();
+}
+
 #endif
 
 #endif