Message ID | 20220928164719.655586-2-clg@kaod.org |
---|---|
State | New |
Series | ast2600: Disable NEON and VFPv4-D32 |
On Wed, 28 Sept 2022 at 16:47, Cédric Le Goater <clg@kaod.org> wrote:
>
> As the Cortex A7 MPCore Technical reference says :
>
> "When FPU option is selected without NEON, the FPU is VFPv4-D16 and
> uses 16 double-precision registers. When the FPU is implemented with
> NEON, the FPU is VFPv4-D32 and uses 32 double-precision registers.
> This register bank is shared with NEON."
>
> Modify the mvfr0 register value of the cortex A7 to advertise only 16
> registers when NEON is not available, and not 32 registers.
>
> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> ---
>  target/arm/cpu.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/target/arm/cpu.c b/target/arm/cpu.c
> index 7ec3281da9aa..01dc74c32add 100644
> --- a/target/arm/cpu.c
> +++ b/target/arm/cpu.c
> @@ -1684,6 +1684,10 @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
>          cpu->isar.id_isar6 = u;
>
>          if (!arm_feature(env, ARM_FEATURE_M)) {

Can you explain why the test is put behind the !ARM_FEATURE_M check?

Reviewed-by: Joel Stanley <joel@jms.id.au>

> +            u = cpu->isar.mvfr0;
> +            u = FIELD_DP32(u, MVFR0, SIMDREG, 1); /* 16 registers */
> +            cpu->isar.mvfr0 = u;
> +
>              u = cpu->isar.mvfr1;
>              u = FIELD_DP32(u, MVFR1, SIMDLS, 0);
>              u = FIELD_DP32(u, MVFR1, SIMDINT, 0);
> --
> 2.37.3
>
On 9/29/22 01:00, Joel Stanley wrote:
> On Wed, 28 Sept 2022 at 16:47, Cédric Le Goater <clg@kaod.org> wrote:
>>
>> As the Cortex A7 MPCore Technical reference says :
>>
>> "When FPU option is selected without NEON, the FPU is VFPv4-D16 and
>> uses 16 double-precision registers. When the FPU is implemented with
>> NEON, the FPU is VFPv4-D32 and uses 32 double-precision registers.
>> This register bank is shared with NEON."
>>
>> Modify the mvfr0 register value of the cortex A7 to advertise only 16
>> registers when NEON is not available, and not 32 registers.
>>
>> Signed-off-by: Cédric Le Goater <clg@kaod.org>
>
>> ---
>>  target/arm/cpu.c | 4 ++++
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/target/arm/cpu.c b/target/arm/cpu.c
>> index 7ec3281da9aa..01dc74c32add 100644
>> --- a/target/arm/cpu.c
>> +++ b/target/arm/cpu.c
>> @@ -1684,6 +1684,10 @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
>>          cpu->isar.id_isar6 = u;
>>
>>          if (!arm_feature(env, ARM_FEATURE_M)) {
>
> Can you explain why the test is put behind the !ARM_FEATURE_M check?

Do you mean the setting of MVFR0 ?

because it was close to the code clearing the SIMD bits (NEON)
of MVFR1 and it seemed more in sync with the specs :

"When FPU option is selected without NEON, the FPU is VFPv4-D16 and
uses 16 double-precision registers. When the FPU is implemented with
NEON, the FPU is VFPv4-D32 and uses 32 double-precision registers.
This register bank is shared with NEON."

(That said, M processors don't have NEON, so this part of the code
should never be reached )

It could be done outside of this test also because SIMDREG = 1 is
a valid value for M processors and the code path :

    if (!cpu->has_neon && !cpu->has_vfp) {

will set MVFR0 to 0 later on if needed.

M55 seems to be a special case though :

    cpu->isar.mvfr1 = 0x12100211

these are the FPU and MVE bits.

C.

>
> Reviewed-by: Joel Stanley <joel@jms.id.au>
>
>> +            u = cpu->isar.mvfr0;
>> +            u = FIELD_DP32(u, MVFR0, SIMDREG, 1); /* 16 registers */
>> +            cpu->isar.mvfr0 = u;
>> +
>>              u = cpu->isar.mvfr1;
>>              u = FIELD_DP32(u, MVFR1, SIMDLS, 0);
>>              u = FIELD_DP32(u, MVFR1, SIMDINT, 0);
>> --
>> 2.37.3
>>
On Wed, 28 Sept 2022 at 17:47, Cédric Le Goater <clg@kaod.org> wrote:
>
> As the Cortex A7 MPCore Technical reference says :
>
> "When FPU option is selected without NEON, the FPU is VFPv4-D16 and
> uses 16 double-precision registers. When the FPU is implemented with
> NEON, the FPU is VFPv4-D32 and uses 32 double-precision registers.
> This register bank is shared with NEON."
>
> Modify the mvfr0 register value of the cortex A7 to advertise only 16
> registers when NEON is not available, and not 32 registers.
>
> Signed-off-by: Cédric Le Goater <clg@kaod.org>
> ---
>  target/arm/cpu.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/target/arm/cpu.c b/target/arm/cpu.c
> index 7ec3281da9aa..01dc74c32add 100644
> --- a/target/arm/cpu.c
> +++ b/target/arm/cpu.c
> @@ -1684,6 +1684,10 @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
>          cpu->isar.id_isar6 = u;
>
>          if (!arm_feature(env, ARM_FEATURE_M)) {
> +            u = cpu->isar.mvfr0;
> +            u = FIELD_DP32(u, MVFR0, SIMDREG, 1); /* 16 registers */
> +            cpu->isar.mvfr0 = u;
> +

Architecturally, Neon implies that you must have 32 dp registers,
but not having Neon does not imply that you must only have 16.
In particular, the Cortex-A15 always implements VFPv4-D32
whether Neon is enabled or not.

If you want to be able to turn off D32 and restrict to 16
registers, I think you need to add a separate property to
control that.

thanks
-- PMM
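For readers following the register arithmetic, here is a minimal standalone sketch of what the patch's FIELD_DP32(u, MVFR0, SIMDREG, 1) line does to the register value. It assumes the A-profile layout in which MVFR0.SIMDReg sits in bits [3:0], with 1 meaning 16 and 2 meaning 32 double-precision registers. The helper names (deposit_field32, extract_field32, mvfr0_dreg_count) are made up for this illustration and are not QEMU functions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed A-profile layout: MVFR0.SIMDReg occupies bits [3:0]. */
#define MVFR0_SIMDREG_SHIFT  0u
#define MVFR0_SIMDREG_LENGTH 4u

/* Hypothetical stand-in for a field deposit such as QEMU's FIELD_DP32():
 * replace `length` bits of `reg` starting at `shift` with `val`. */
static inline uint32_t deposit_field32(uint32_t reg, unsigned shift,
                                       unsigned length, uint32_t val)
{
    uint32_t mask = ((1u << length) - 1u) << shift;
    return (reg & ~mask) | ((val << shift) & mask);
}

/* Hypothetical helper: read `length` bits of `reg` starting at `shift`. */
static inline uint32_t extract_field32(uint32_t reg, unsigned shift,
                                       unsigned length)
{
    return (reg >> shift) & ((1u << length) - 1u);
}

/* Decode SIMDReg into a count of 64-bit registers:
 * 1 means 16 registers, 2 means 32; anything else is treated as none. */
static inline int mvfr0_dreg_count(uint32_t mvfr0)
{
    switch (extract_field32(mvfr0, MVFR0_SIMDREG_SHIFT,
                            MVFR0_SIMDREG_LENGTH)) {
    case 1:
        return 16;
    case 2:
        return 32;
    default:
        return 0;
    }
}
```

Depositing 1 into the field only rewrites the low nibble, so every other MVFR0 field is preserved. The sketch says nothing about when that write is appropriate, which is the point under review here.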
On Thu, 29 Sept 2022 at 08:20, Cédric Le Goater <clg@kaod.org> wrote:
>
> On 9/29/22 01:00, Joel Stanley wrote:
> > On Wed, 28 Sept 2022 at 16:47, Cédric Le Goater <clg@kaod.org> wrote:
> >> @@ -1684,6 +1684,10 @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
> >>          cpu->isar.id_isar6 = u;
> >>
> >>          if (!arm_feature(env, ARM_FEATURE_M)) {
> >
> > Can you explain why the test is put behind the !ARM_FEATURE_M check?
>
> Do you mean the setting of MVFR0 ?
>
> because it was close to the code clearing the SIMD bits (NEON)
> of MVFR1 and it seemed more in sync with the specs :
>
> "When FPU option is selected without NEON, the FPU is VFPv4-D16 and
> uses 16 double-precision registers. When the FPU is implemented with
> NEON, the FPU is VFPv4-D32 and uses 32 double-precision registers.
> This register bank is shared with NEON."
>
> (That said, M processors don't have NEON, so this part of the code
> should never be reached )

They don't have Neon, but that means that cpu->has_neon is false,
so this part of the code *will* be reached.

The reason this sub-part of the "disable Neon" handling is inside
a not-M check is because M-profile has a different assignment for
some of the MVFR1 fields (check the comments in the FIELD definitions
in cpu.h), and zeroing things out based on the A-profile meanings
would be wrong.

thanks
-- PMM
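The hazard described above can be shown with a small standalone sketch. It assumes the architectural layout in which the A-profile SIMDLS, SIMDInt, SIMDSP and SIMDHP fields occupy MVFR1 bits [11:8], [15:12], [19:16] and [23:20], while M-profile reuses [11:8] for MVE and [23:20] for FP16. The patch context only shows SIMDLS and SIMDINT being cleared, so clearing all four fields is an assumption of this sketch, and the helper names are hypothetical rather than QEMU's.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed MVFR1 field positions (each field is 4 bits wide). */
enum {
    MVFR1_SIMDLS_SHIFT  = 8,   /* A-profile; M-profile: MVE  */
    MVFR1_SIMDINT_SHIFT = 12,  /* A-profile only             */
    MVFR1_SIMDSP_SHIFT  = 16,  /* A-profile only             */
    MVFR1_SIMDHP_SHIFT  = 20,  /* A-profile; M-profile: FP16 */
};

/* Hypothetical helper: read the 4-bit field starting at `shift`. */
static inline uint32_t field_get32(uint32_t reg, unsigned shift)
{
    return (reg >> shift) & 0xfu;
}

/* Hypothetical helper: zero the 4-bit field starting at `shift`. */
static inline uint32_t field_clear32(uint32_t reg, unsigned shift)
{
    return reg & ~(0xfu << shift);
}

/* Sketch of a "disable Neon" step that is only valid for A-profile:
 * it zeroes the fields that mean "Advanced SIMD" there. */
static inline uint32_t clear_a_profile_simd_fields(uint32_t mvfr1)
{
    mvfr1 = field_clear32(mvfr1, MVFR1_SIMDLS_SHIFT);
    mvfr1 = field_clear32(mvfr1, MVFR1_SIMDINT_SHIFT);
    mvfr1 = field_clear32(mvfr1, MVFR1_SIMDSP_SHIFT);
    mvfr1 = field_clear32(mvfr1, MVFR1_SIMDHP_SHIFT);
    return mvfr1;
}
```

Applied to the M55 value quoted earlier in the thread (0x12100211), this clearing would also zero bits [11:8] and [23:20], which under the M-profile reading are the MVE and FP16 fields. That is why the clearing needs to stay behind the not-M check.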
On 9/29/22 04:44, Peter Maydell wrote:
> Architecturally, Neon implies that you must have 32 dp registers,
> but not having Neon does not imply that you must only have 16.
> In particular, the Cortex-A15 always implements VFPv4-D32
> whether Neon is enabled or not.

A15 requires VFP == NEON in its configuration.

r~
On Thu, 29 Sept 2022 at 16:22, Richard Henderson <richard.henderson@linaro.org> wrote:
>
> On 9/29/22 04:44, Peter Maydell wrote:
> > Architecturally, Neon implies that you must have 32 dp registers,
> > but not having Neon does not imply that you must only have 16.
> > In particular, the Cortex-A15 always implements VFPv4-D32
> > whether Neon is enabled or not.
>
> A15 requires VFP == NEON in its configuration.

No, it requires that if you have Neon then you have VFP;
but it allows all of:
 * no VFP or Neon
 * VFP, no Neon
 * VFP and Neon

https://developer.arm.com/documentation/ddi0438/i/neon-and-vfp-unit/about-neon-and-vfp-unit

-- PMM
On 9/29/22 13:44, Peter Maydell wrote:
> On Wed, 28 Sept 2022 at 17:47, Cédric Le Goater <clg@kaod.org> wrote:
>>
>> As the Cortex A7 MPCore Technical reference says :
>>
>> "When FPU option is selected without NEON, the FPU is VFPv4-D16 and
>> uses 16 double-precision registers. When the FPU is implemented with
>> NEON, the FPU is VFPv4-D32 and uses 32 double-precision registers.
>> This register bank is shared with NEON."
>>
>> Modify the mvfr0 register value of the cortex A7 to advertise only 16
>> registers when NEON is not available, and not 32 registers.
>>
>> Signed-off-by: Cédric Le Goater <clg@kaod.org>
>> ---
>>  target/arm/cpu.c | 4 ++++
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/target/arm/cpu.c b/target/arm/cpu.c
>> index 7ec3281da9aa..01dc74c32add 100644
>> --- a/target/arm/cpu.c
>> +++ b/target/arm/cpu.c
>> @@ -1684,6 +1684,10 @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
>>          cpu->isar.id_isar6 = u;
>>
>>          if (!arm_feature(env, ARM_FEATURE_M)) {
>> +            u = cpu->isar.mvfr0;
>> +            u = FIELD_DP32(u, MVFR0, SIMDREG, 1); /* 16 registers */
>> +            cpu->isar.mvfr0 = u;
>> +
>
> Architecturally, Neon implies that you must have 32 dp registers,
> but not having Neon does not imply that you must only have 16.
> In particular, the Cortex-A15 always implements VFPv4-D32
> whether Neon is enabled or not.
>
> If you want to be able to turn off D32 and restrict to 16
> registers, I think you need to add a separate property to
> control that.

Something like "vfp-d16" ?

Thanks,

C.
On Fri, 30 Sept 2022 at 15:59, Cédric Le Goater <clg@kaod.org> wrote:
>
> On 9/29/22 13:44, Peter Maydell wrote:
> > If you want to be able to turn off D32 and restrict to 16
> > registers, I think you need to add a separate property to
> > control that.
>
> Something like "vfp-d16" ?

That ends up being a sort of negative-polarity feature.
Maybe "vfp-d32" for "have 32 dregs", with 'no' meaning
"only 16" ?

thanks
-- PMM
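As a rough illustration of the positive-polarity property suggested above, the following standalone sketch models how a boolean "vfp-d32" setting might interact with the Neon and VFP settings. Nothing here is QEMU code: the SketchCPU struct, its field names and sketch_apply_vfp_d32() are hypothetical, and rejecting "Neon without D32" is an assumption drawn from the architectural rule stated earlier in the thread (Neon implies 32 dp registers).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified model of the relevant CPU settings. */
typedef struct {
    bool has_vfp;      /* FPU present                                   */
    bool has_neon;     /* Neon present                                  */
    bool has_vfp_d32;  /* hypothetical "vfp-d32": true = 32 dregs,      */
                       /* false = only 16                               */
    uint32_t mvfr0;    /* SIMDReg assumed to be in bits [3:0]           */
} SketchCPU;

/* Apply the hypothetical property. Returns false when the requested
 * combination is inconsistent (Neon needs all 32 registers). */
static bool sketch_apply_vfp_d32(SketchCPU *cpu)
{
    if (cpu->has_neon && !cpu->has_vfp_d32) {
        return false;
    }
    if (cpu->has_vfp && !cpu->has_vfp_d32) {
        /* Advertise 16 double-precision registers: SIMDReg = 1. */
        cpu->mvfr0 = (cpu->mvfr0 & ~0xfu) | 1u;
    }
    return true;
}
```

With this shape, turning Neon off leaves the register count alone unless the property is also set to 'no', which matches the Cortex-A15 case mentioned earlier where VFPv4-D32 is kept without Neon.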
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index 7ec3281da9aa..01dc74c32add 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -1684,6 +1684,10 @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
         cpu->isar.id_isar6 = u;
 
         if (!arm_feature(env, ARM_FEATURE_M)) {
+            u = cpu->isar.mvfr0;
+            u = FIELD_DP32(u, MVFR0, SIMDREG, 1); /* 16 registers */
+            cpu->isar.mvfr0 = u;
+
             u = cpu->isar.mvfr1;
             u = FIELD_DP32(u, MVFR1, SIMDLS, 0);
             u = FIELD_DP32(u, MVFR1, SIMDINT, 0);
As the Cortex A7 MPCore Technical reference says :

"When FPU option is selected without NEON, the FPU is VFPv4-D16 and
uses 16 double-precision registers. When the FPU is implemented with
NEON, the FPU is VFPv4-D32 and uses 32 double-precision registers.
This register bank is shared with NEON."

Modify the mvfr0 register value of the cortex A7 to advertise only 16
registers when NEON is not available, and not 32 registers.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
---
 target/arm/cpu.c | 4 ++++
 1 file changed, 4 insertions(+)