diff mbox series

[for-8.2,6/7] target/riscv: add 'max' CPU type

Message ID 20230712190149.424675-7-dbarboza@ventanamicro.com
State New
Headers show
Series target/riscv: add 'max' CPU type | expand

Commit Message

Daniel Henrique Barboza July 12, 2023, 7:01 p.m. UTC
The 'max' CPU type is used by tooling to determine what's the most
capable CPU a current QEMU version implements. Other archs such as ARM
implements this type. Let's add it to RISC-V.

What we consider "most capable CPU" in this context are related to
ratified, non-vendor extensions. This means that we want the 'max' CPU
to enable all (possible) ratified extensions by default. The reasoning
behind this design is (1) vendor extensions can conflict with each other
and we won't play favorities deciding which one is default or not and
(2) non-ratified extensions are always prone to changes, not being
stable enough to be enabled by default.

All this said, we're still not able to enable all ratified extensions
due to conflicts between them. Zfinx and all its dependencies aren't
enabled because of a conflict with RVF. zce, zcmp and zcmt are also
disabled due to RVD conflicts. When running with 64 bits we're also
disabling zcf.

Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
---
 target/riscv/cpu-qom.h |  1 +
 target/riscv/cpu.c     | 50 ++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 51 insertions(+)

Comments

Conor Dooley July 12, 2023, 7:22 p.m. UTC | #1
On Wed, Jul 12, 2023 at 04:01:48PM -0300, Daniel Henrique Barboza wrote:
> The 'max' CPU type is used by tooling to determine what's the most
> capable CPU a current QEMU version implements. Other archs such as ARM
> implements this type. Let's add it to RISC-V.
> 
> What we consider "most capable CPU" in this context are related to
> ratified, non-vendor extensions. This means that we want the 'max' CPU
> to enable all (possible) ratified extensions by default. The reasoning
> behind this design is (1) vendor extensions can conflict with each other
> and we won't play favorities deciding which one is default or not and
> (2) non-ratified extensions are always prone to changes, not being
> stable enough to be enabled by default.
> 
> All this said, we're still not able to enable all ratified extensions
> due to conflicts between them. Zfinx and all its dependencies aren't
> enabled because of a conflict with RVF. zce, zcmp and zcmt are also
> disabled due to RVD conflicts. When running with 64 bits we're also
> disabling zcf.
> 
> Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>

This seems like it will be super helpful for CI stuff etc, thanks for
doing it.
Daniel Henrique Barboza July 12, 2023, 8:30 p.m. UTC | #2
On 7/12/23 16:22, Conor Dooley wrote:
> On Wed, Jul 12, 2023 at 04:01:48PM -0300, Daniel Henrique Barboza wrote:
>> The 'max' CPU type is used by tooling to determine what's the most
>> capable CPU a current QEMU version implements. Other archs such as ARM
>> implements this type. Let's add it to RISC-V.
>>
>> What we consider "most capable CPU" in this context are related to
>> ratified, non-vendor extensions. This means that we want the 'max' CPU
>> to enable all (possible) ratified extensions by default. The reasoning
>> behind this design is (1) vendor extensions can conflict with each other
>> and we won't play favorities deciding which one is default or not and
>> (2) non-ratified extensions are always prone to changes, not being
>> stable enough to be enabled by default.
>>
>> All this said, we're still not able to enable all ratified extensions
>> due to conflicts between them. Zfinx and all its dependencies aren't
>> enabled because of a conflict with RVF. zce, zcmp and zcmt are also
>> disabled due to RVD conflicts. When running with 64 bits we're also
>> disabling zcf.
>>
>> Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
> 
> This seems like it will be super helpful for CI stuff etc, thanks for
> doing it.

And Linux actually boots on it, which was remarkable to see. I was expecting something
to blow up I guess.

This is the riscv,isa DT generated:

# cat /proc/device-tree/cpus/cpu@0/riscv,isa
rv64imafdch_zicbom_zicboz_zicsr_zifencei_zihintpause_zawrs_zfa_zfh_zfhmin_zca_zcb_zcd_
zba_zbb_zbc_zbkb_zbkc_zbkx_zbs_zk_zkn_zknd_zkne_zknh_zkr_zks_zksed_zksh_zkt_
zve32f_zve64f_zve64d_smstateen_sscofpmf_sstc_svadu_svinval_svnapot_svpbmt#


I'll put this in the commit message for the next version.

Oh, and I just realized that I forgot to light up all the MISA bits (we're missing
RVV). Guess I'll have to send the v2 right away.


Thanks,


Daniel
Conor Dooley July 12, 2023, 9 p.m. UTC | #3
On Wed, Jul 12, 2023 at 05:30:41PM -0300, Daniel Henrique Barboza wrote:
> On 7/12/23 16:22, Conor Dooley wrote:
> > On Wed, Jul 12, 2023 at 04:01:48PM -0300, Daniel Henrique Barboza wrote:
> > > The 'max' CPU type is used by tooling to determine what's the most
> > > capable CPU a current QEMU version implements. Other archs such as ARM
> > > implements this type. Let's add it to RISC-V.
> > > 
> > > What we consider "most capable CPU" in this context are related to
> > > ratified, non-vendor extensions. This means that we want the 'max' CPU
> > > to enable all (possible) ratified extensions by default. The reasoning
> > > behind this design is (1) vendor extensions can conflict with each other
> > > and we won't play favorities deciding which one is default or not and
> > > (2) non-ratified extensions are always prone to changes, not being
> > > stable enough to be enabled by default.
> > > 
> > > All this said, we're still not able to enable all ratified extensions
> > > due to conflicts between them. Zfinx and all its dependencies aren't
> > > enabled because of a conflict with RVF. zce, zcmp and zcmt are also
> > > disabled due to RVD conflicts. When running with 64 bits we're also
> > > disabling zcf.
> > > 
> > > Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
> > 
> > This seems like it will be super helpful for CI stuff etc, thanks for
> > doing it.
> 
> And Linux actually boots on it, which was remarkable to see. I was expecting something
> to blow up I guess.
> 
> This is the riscv,isa DT generated:
> 
> # cat /proc/device-tree/cpus/cpu@0/riscv,isa
> rv64imafdch_zicbom_zicboz_zicsr_zifencei_zihintpause_zawrs_zfa_zfh_zfhmin_zca_zcb_zcd_
> zba_zbb_zbc_zbkb_zbkc_zbkx_zbs_zk_zkn_zknd_zkne_zknh_zkr_zks_zksed_zksh_zkt_
> zve32f_zve64f_zve64d_smstateen_sscofpmf_sstc_svadu_svinval_svnapot_svpbmt#

Of which an upstream Linux kernel, building using something close to
defconfig, accepts only
rv64imafdch_zicbom_zicboz_zicntr_zicsr_zifencei_zihintpause_zihpm_zba_zbb_zbs_sscofpmf_sstc_svinval_svnapot_svpbmt
so the set of possible things that break could break it has been reduced
somewhat.

btw, I noticed that the default marchid/mimpid have changed. Previously I
used to see something like:
processor       : 15
hart            : 15
isa             : rv64imafdcvh_zicbom_zicboz_zicntr_zicsr_zifencei_zihintpause_zihpm_zba_zbb_zbs_sscofpmf_sstc
mmu             : sv57
mvendorid       : 0x0
marchid         : 0x80032
mimpid          : 0x80032
in /proc/cpuinfo, but "now" I see 0x0 for marchid & mimpid. Is this
change to the default behaviour intentional? I saw "now" in "s because
I applied your patches on top of Alistair's next branch, which contains
the changes to m*id stuff.

Cheers,
Conor.
Daniel Henrique Barboza July 12, 2023, 9:09 p.m. UTC | #4
On 7/12/23 18:00, Conor Dooley wrote:
> On Wed, Jul 12, 2023 at 05:30:41PM -0300, Daniel Henrique Barboza wrote:
>> On 7/12/23 16:22, Conor Dooley wrote:
>>> On Wed, Jul 12, 2023 at 04:01:48PM -0300, Daniel Henrique Barboza wrote:
>>>> The 'max' CPU type is used by tooling to determine what's the most
>>>> capable CPU a current QEMU version implements. Other archs such as ARM
>>>> implements this type. Let's add it to RISC-V.
>>>>
>>>> What we consider "most capable CPU" in this context are related to
>>>> ratified, non-vendor extensions. This means that we want the 'max' CPU
>>>> to enable all (possible) ratified extensions by default. The reasoning
>>>> behind this design is (1) vendor extensions can conflict with each other
>>>> and we won't play favorities deciding which one is default or not and
>>>> (2) non-ratified extensions are always prone to changes, not being
>>>> stable enough to be enabled by default.
>>>>
>>>> All this said, we're still not able to enable all ratified extensions
>>>> due to conflicts between them. Zfinx and all its dependencies aren't
>>>> enabled because of a conflict with RVF. zce, zcmp and zcmt are also
>>>> disabled due to RVD conflicts. When running with 64 bits we're also
>>>> disabling zcf.
>>>>
>>>> Signed-off-by: Daniel Henrique Barboza <dbarboza@ventanamicro.com>
>>>
>>> This seems like it will be super helpful for CI stuff etc, thanks for
>>> doing it.
>>
>> And Linux actually boots on it, which was remarkable to see. I was expecting something
>> to blow up I guess.
>>
>> This is the riscv,isa DT generated:
>>
>> # cat /proc/device-tree/cpus/cpu@0/riscv,isa
>> rv64imafdch_zicbom_zicboz_zicsr_zifencei_zihintpause_zawrs_zfa_zfh_zfhmin_zca_zcb_zcd_
>> zba_zbb_zbc_zbkb_zbkc_zbkx_zbs_zk_zkn_zknd_zkne_zknh_zkr_zks_zksed_zksh_zkt_
>> zve32f_zve64f_zve64d_smstateen_sscofpmf_sstc_svadu_svinval_svnapot_svpbmt#
> 
> Of which an upstream Linux kernel, building using something close to
> defconfig, accepts only
> rv64imafdch_zicbom_zicboz_zicntr_zicsr_zifencei_zihintpause_zihpm_zba_zbb_zbs_sscofpmf_sstc_svinval_svnapot_svpbmt
> so the set of possible things that break could break it has been reduced
> somewhat.
> 
> btw, I noticed that the default marchid/mimpid have changed. Previously I
> used to see something like:
> processor       : 15
> hart            : 15
> isa             : rv64imafdcvh_zicbom_zicboz_zicntr_zicsr_zifencei_zihintpause_zihpm_zba_zbb_zbs_sscofpmf_sstc
> mmu             : sv57
> mvendorid       : 0x0
> marchid         : 0x80032
> mimpid          : 0x80032
> in /proc/cpuinfo, but "now" I see 0x0 for marchid & mimpid. Is this
> change to the default behaviour intentional? I saw "now" in "s because
> I applied your patches on top of Alistair's next branch, which contains
> the changes to m*id stuff.

It is intentional. Those default marchid/mimpid vals were derived from the current
QEMU version ID/build and didn't mean much.

It is still possible to set them via "-cpu rv64,marchid=N,mimpid=N" if needed when
using the generic (rv64,rv32) CPUs. Vendor CPUs can't have their machine IDs changed
via command line.


Thanks,

Daniel

> 
> Cheers,
> Conor.
Conor Dooley July 12, 2023, 9:35 p.m. UTC | #5
On Wed, Jul 12, 2023 at 06:09:10PM -0300, Daniel Henrique Barboza wrote:

> It is intentional. Those default marchid/mimpid vals were derived from the current
> QEMU version ID/build and didn't mean much.
> 
> It is still possible to set them via "-cpu rv64,marchid=N,mimpid=N" if needed when
> using the generic (rv64,rv32) CPUs. Vendor CPUs can't have their machine IDs changed
> via command line.

Sounds good, thanks. I did just now go and check icicle to see what it
would report & it does not boot. I'll go bisect...
Daniel Henrique Barboza July 12, 2023, 9:39 p.m. UTC | #6
On 7/12/23 18:35, Conor Dooley wrote:
> On Wed, Jul 12, 2023 at 06:09:10PM -0300, Daniel Henrique Barboza wrote:
> 
>> It is intentional. Those default marchid/mimpid vals were derived from the current
>> QEMU version ID/build and didn't mean much.
>>
>> It is still possible to set them via "-cpu rv64,marchid=N,mimpid=N" if needed when
>> using the generic (rv64,rv32) CPUs. Vendor CPUs can't have their machine IDs changed
>> via command line.
> 
> Sounds good, thanks. I did just now go and check icicle to see what it
> would report & it does not boot. I'll go bisect...

BTW how are you booting the icicle board nowadays? I remember you mentioning about
some changes in the FDT being required to boot and whatnot.

If it's not too hard I'll add it in my test scripts to keep it under check. Perhaps
we can even add it to QEMU testsuite.


Daniel
Conor Dooley July 12, 2023, 10:14 p.m. UTC | #7
On Wed, Jul 12, 2023 at 06:39:28PM -0300, Daniel Henrique Barboza wrote:
> On 7/12/23 18:35, Conor Dooley wrote:
> > On Wed, Jul 12, 2023 at 06:09:10PM -0300, Daniel Henrique Barboza wrote:
> > 
> > > It is intentional. Those default marchid/mimpid vals were derived from the current
> > > QEMU version ID/build and didn't mean much.
> > > 
> > > It is still possible to set them via "-cpu rv64,marchid=N,mimpid=N" if needed when
> > > using the generic (rv64,rv32) CPUs. Vendor CPUs can't have their machine IDs changed
> > > via command line.
> > 
> > Sounds good, thanks. I did just now go and check icicle to see what it
> > would report & it does not boot. I'll go bisect...
> 
> BTW how are you booting the icicle board nowadays? I remember you mentioning about
> some changes in the FDT being required to boot and whatnot.

I do direct kernel boots, as the HSS doesn't work anymore, and just lie
a bit to QEMU about how much DDR we have.
.PHONY: qemu-icicle
qemu-icicle:
	$(qemu) -M microchip-icicle-kit \
		-m 3G -smp 5 \
		-kernel $(vmlinux_bin) \
		-dtb $(icicle_dtb) \
		-initrd $(initramfs) \
		-display none -serial null \
		-serial stdio \
		-D qemu.log -d unimp

The platform only supports 2 GiB of DDR, not 3, but if I pass 2 to QEMU
it thinks there's 1 GiB at 0x8000_0000 and 1 GiB at 0x10_0000_0000. The
upstream devicetree (and current FPGA reference design) expects there to
be 1 GiB at 0x8000_0000 and 1 GiB at 0x10_4000_0000. If I lie to QEMU,
it thinks there is 1 GiB at 0x8000_0000 and 2 GiB at 0x10_0000_0000, and
things just work. I prefer doing it this way than having to modify the
DT, it is a lot easier to explain to people this way.

I've been meaning to work the support for the icicle & mpfs in QEMU, but
it just gets shunted down the priority list. I'd really like if a proper
boot flow would run in QEMU, which means fixing whatever broke the HSS,
but I've recently picked up maintainership of dt-binding stuff in Linux,
so I've unfortunately got even less time to try and work on it. Maybe
we'll get some new graduate in and I can make them suffer in my stead...

> If it's not too hard I'll add it in my test scripts to keep it under check. Perhaps
> we can even add it to QEMU testsuite.

I don't think it really should be that bad, at least for the direct
kernel boot, which is what I mainly care about, since I use it fairly
often for debugging boot stuff in Linux.

Anyways, aa903cf31391dd505b399627158f1292a6d19896 is the first bad commit:
commit aa903cf31391dd505b399627158f1292a6d19896
Author: Bin Meng <bmeng@tinylab.org>
Date:   Fri Jun 30 23:36:04 2023 +0800

    roms/opensbi: Upgrade from v1.2 to v1.3
    
    Upgrade OpenSBI from v1.2 to v1.3 and the pre-built bios images.

And I see something like:
qemu//build/qemu-system-riscv64 -M microchip-icicle-kit \
        -m 3G -smp 5 \
        -kernel vmlinux.bin \
        -dtb icicle.dtb \
        -initrd initramfs.cpio.gz \
        -display none -serial null \
        -serial stdio \
        -D qemu.log -d unimp
qemu-system-riscv64: warning: disabling zca extension for hart 0x0000000000000000 because privilege spec version does not match
qemu-system-riscv64: warning: disabling zca extension for hart 0x0000000000000001 because privilege spec version does not match
qemu-system-riscv64: warning: disabling zcd extension for hart 0x0000000000000001 because privilege spec version does not match
qemu-system-riscv64: warning: disabling zca extension for hart 0x0000000000000002 because privilege spec version does not match
qemu-system-riscv64: warning: disabling zcd extension for hart 0x0000000000000002 because privilege spec version does not match
qemu-system-riscv64: warning: disabling zca extension for hart 0x0000000000000003 because privilege spec version does not match
qemu-system-riscv64: warning: disabling zcd extension for hart 0x0000000000000003 because privilege spec version does not match
qemu-system-riscv64: warning: disabling zca extension for hart 0x0000000000000004 because privilege spec version does not match
qemu-system-riscv64: warning: disabling zcd extension for hart 0x0000000000000004 because privilege spec version does not match

OpenSBI v1.3
   ____                    _____ ____ _____
  / __ \                  / ____|  _ \_   _|
 | |  | |_ __   ___ _ __ | (___ | |_) || |
 | |  | | '_ \ / _ \ '_ \ \___ \|  _ < | |
 | |__| | |_) |  __/ | | |____) | |_) || |_
  \____/| .__/ \___|_| |_|_____/|___/_____|
        | |
        |_|

init_coldboot: ipi init failed (error -1009)

Just to note, because we use our own firmware that vendors in OpenSBI
and compiles only a significantly cut down number of files from it, we
do not use the fw_dynamic etc flow on our hardware. As a result, we have
not tested v1.3, nor do we have any immediate plans to change our
platform firmware to vendor v1.3 either.

I unless there's something obvious to you, it sounds like I will need to
go and bisect OpenSBI. That's a job for another day though, given the
time.

Cheers,
Conor.
Conor Dooley July 13, 2023, 10:12 p.m. UTC | #8
+CC OpenSBI Mailing list

I've not yet had the chance to bisect this, so adding the OpenSBI folks
to CC in case they might have an idea for what to try.

And a question for you below Daniel.

On Wed, Jul 12, 2023 at 11:14:21PM +0100, Conor Dooley wrote:
> On Wed, Jul 12, 2023 at 06:39:28PM -0300, Daniel Henrique Barboza wrote:
> > On 7/12/23 18:35, Conor Dooley wrote:
> > > On Wed, Jul 12, 2023 at 06:09:10PM -0300, Daniel Henrique Barboza wrote:
> > > 
> > > > It is intentional. Those default marchid/mimpid vals were derived from the current
> > > > QEMU version ID/build and didn't mean much.
> > > > 
> > > > It is still possible to set them via "-cpu rv64,marchid=N,mimpid=N" if needed when
> > > > using the generic (rv64,rv32) CPUs. Vendor CPUs can't have their machine IDs changed
> > > > via command line.
> > > 
> > > Sounds good, thanks. I did just now go and check icicle to see what it
> > > would report & it does not boot. I'll go bisect...
> > 
> > BTW how are you booting the icicle board nowadays? I remember you mentioning about
> > some changes in the FDT being required to boot and whatnot.
> 
> I do direct kernel boots, as the HSS doesn't work anymore, and just lie
> a bit to QEMU about how much DDR we have.
> .PHONY: qemu-icicle
> qemu-icicle:
> 	$(qemu) -M microchip-icicle-kit \
> 		-m 3G -smp 5 \
> 		-kernel $(vmlinux_bin) \
> 		-dtb $(icicle_dtb) \
> 		-initrd $(initramfs) \
> 		-display none -serial null \
> 		-serial stdio \
> 		-D qemu.log -d unimp
> 
> The platform only supports 2 GiB of DDR, not 3, but if I pass 2 to QEMU
> it thinks there's 1 GiB at 0x8000_0000 and 1 GiB at 0x10_0000_0000. The
> upstream devicetree (and current FPGA reference design) expects there to
> be 1 GiB at 0x8000_0000 and 1 GiB at 0x10_4000_0000. If I lie to QEMU,
> it thinks there is 1 GiB at 0x8000_0000 and 2 GiB at 0x10_0000_0000, and
> things just work. I prefer doing it this way than having to modify the
> DT, it is a lot easier to explain to people this way.
> 
> I've been meaning to work the support for the icicle & mpfs in QEMU, but
> it just gets shunted down the priority list. I'd really like if a proper
> boot flow would run in QEMU, which means fixing whatever broke the HSS,
> but I've recently picked up maintainership of dt-binding stuff in Linux,
> so I've unfortunately got even less time to try and work on it. Maybe
> we'll get some new graduate in and I can make them suffer in my stead...
> 
> > If it's not too hard I'll add it in my test scripts to keep it under check. Perhaps
> > we can even add it to QEMU testsuite.
> 
> I don't think it really should be that bad, at least for the direct
> kernel boot, which is what I mainly care about, since I use it fairly
> often for debugging boot stuff in Linux.
> 
> Anyways, aa903cf31391dd505b399627158f1292a6d19896 is the first bad commit:
> commit aa903cf31391dd505b399627158f1292a6d19896
> Author: Bin Meng <bmeng@tinylab.org>
> Date:   Fri Jun 30 23:36:04 2023 +0800
> 
>     roms/opensbi: Upgrade from v1.2 to v1.3
>     
>     Upgrade OpenSBI from v1.2 to v1.3 and the pre-built bios images.
> 
> And I see something like:
> qemu//build/qemu-system-riscv64 -M microchip-icicle-kit \
>         -m 3G -smp 5 \
>         -kernel vmlinux.bin \
>         -dtb icicle.dtb \
>         -initrd initramfs.cpio.gz \
>         -display none -serial null \
>         -serial stdio \
>         -D qemu.log -d unimp

> qemu-system-riscv64: warning: disabling zca extension for hart 0x0000000000000000 because privilege spec version does not match
> qemu-system-riscv64: warning: disabling zca extension for hart 0x0000000000000001 because privilege spec version does not match
> qemu-system-riscv64: warning: disabling zcd extension for hart 0x0000000000000001 because privilege spec version does not match
> qemu-system-riscv64: warning: disabling zca extension for hart 0x0000000000000002 because privilege spec version does not match
> qemu-system-riscv64: warning: disabling zcd extension for hart 0x0000000000000002 because privilege spec version does not match
> qemu-system-riscv64: warning: disabling zca extension for hart 0x0000000000000003 because privilege spec version does not match
> qemu-system-riscv64: warning: disabling zcd extension for hart 0x0000000000000003 because privilege spec version does not match
> qemu-system-riscv64: warning: disabling zca extension for hart 0x0000000000000004 because privilege spec version does not match
> qemu-system-riscv64: warning: disabling zcd extension for hart 0x0000000000000004 because privilege spec version does not match

Why am I seeing these warnings? Does the mpfs machine type need to
disable some things? It only supports rv64imafdc per the DT, and
predates things like Zca existing, so emitting warnings does not seem
fair at all to me!

> 
> OpenSBI v1.3
>    ____                    _____ ____ _____
>   / __ \                  / ____|  _ \_   _|
>  | |  | |_ __   ___ _ __ | (___ | |_) || |
>  | |  | | '_ \ / _ \ '_ \ \___ \|  _ < | |
>  | |__| | |_) |  __/ | | |____) | |_) || |_
>   \____/| .__/ \___|_| |_|_____/|___/_____|
>         | |
>         |_|
> 
> init_coldboot: ipi init failed (error -1009)
> 
> Just to note, because we use our own firmware that vendors in OpenSBI
> and compiles only a significantly cut down number of files from it, we
> do not use the fw_dynamic etc flow on our hardware. As a result, we have
> not tested v1.3, nor do we have any immediate plans to change our
> platform firmware to vendor v1.3 either.
> 
> I unless there's something obvious to you, it sounds like I will need to
> go and bisect OpenSBI. That's a job for another day though, given the
> time.
> 
> Cheers,
> Conor.
Daniel Henrique Barboza July 13, 2023, 10:35 p.m. UTC | #9
On 7/13/23 19:12, Conor Dooley wrote:
> +CC OpenSBI Mailing list
> 
> I've not yet had the chance to bisect this, so adding the OpenSBI folks
> to CC in case they might have an idea for what to try.
> 
> And a question for you below Daniel.
> 
> On Wed, Jul 12, 2023 at 11:14:21PM +0100, Conor Dooley wrote:
>> On Wed, Jul 12, 2023 at 06:39:28PM -0300, Daniel Henrique Barboza wrote:
>>> On 7/12/23 18:35, Conor Dooley wrote:
>>>> On Wed, Jul 12, 2023 at 06:09:10PM -0300, Daniel Henrique Barboza wrote:
>>>>
>>>>> It is intentional. Those default marchid/mimpid vals were derived from the current
>>>>> QEMU version ID/build and didn't mean much.
>>>>>
>>>>> It is still possible to set them via "-cpu rv64,marchid=N,mimpid=N" if needed when
>>>>> using the generic (rv64,rv32) CPUs. Vendor CPUs can't have their machine IDs changed
>>>>> via command line.
>>>>
>>>> Sounds good, thanks. I did just now go and check icicle to see what it
>>>> would report & it does not boot. I'll go bisect...
>>>
>>> BTW how are you booting the icicle board nowadays? I remember you mentioning about
>>> some changes in the FDT being required to boot and whatnot.
>>
>> I do direct kernel boots, as the HSS doesn't work anymore, and just lie
>> a bit to QEMU about how much DDR we have.
>> .PHONY: qemu-icicle
>> qemu-icicle:
>> 	$(qemu) -M microchip-icicle-kit \
>> 		-m 3G -smp 5 \
>> 		-kernel $(vmlinux_bin) \
>> 		-dtb $(icicle_dtb) \
>> 		-initrd $(initramfs) \
>> 		-display none -serial null \
>> 		-serial stdio \
>> 		-D qemu.log -d unimp
>>
>> The platform only supports 2 GiB of DDR, not 3, but if I pass 2 to QEMU
>> it thinks there's 1 GiB at 0x8000_0000 and 1 GiB at 0x10_0000_0000. The
>> upstream devicetree (and current FPGA reference design) expects there to
>> be 1 GiB at 0x8000_0000 and 1 GiB at 0x10_4000_0000. If I lie to QEMU,
>> it thinks there is 1 GiB at 0x8000_0000 and 2 GiB at 0x10_0000_0000, and
>> things just work. I prefer doing it this way than having to modify the
>> DT, it is a lot easier to explain to people this way.
>>
>> I've been meaning to work the support for the icicle & mpfs in QEMU, but
>> it just gets shunted down the priority list. I'd really like if a proper
>> boot flow would run in QEMU, which means fixing whatever broke the HSS,
>> but I've recently picked up maintainership of dt-binding stuff in Linux,
>> so I've unfortunately got even less time to try and work on it. Maybe
>> we'll get some new graduate in and I can make them suffer in my stead...
>>
>>> If it's not too hard I'll add it in my test scripts to keep it under check. Perhaps
>>> we can even add it to QEMU testsuite.
>>
>> I don't think it really should be that bad, at least for the direct
>> kernel boot, which is what I mainly care about, since I use it fairly
>> often for debugging boot stuff in Linux.
>>
>> Anyways, aa903cf31391dd505b399627158f1292a6d19896 is the first bad commit:
>> commit aa903cf31391dd505b399627158f1292a6d19896
>> Author: Bin Meng <bmeng@tinylab.org>
>> Date:   Fri Jun 30 23:36:04 2023 +0800
>>
>>      roms/opensbi: Upgrade from v1.2 to v1.3
>>      
>>      Upgrade OpenSBI from v1.2 to v1.3 and the pre-built bios images.
>>
>> And I see something like:
>> qemu//build/qemu-system-riscv64 -M microchip-icicle-kit \
>>          -m 3G -smp 5 \
>>          -kernel vmlinux.bin \
>>          -dtb icicle.dtb \
>>          -initrd initramfs.cpio.gz \
>>          -display none -serial null \
>>          -serial stdio \
>>          -D qemu.log -d unimp
> 
>> qemu-system-riscv64: warning: disabling zca extension for hart 0x0000000000000000 because privilege spec version does not match
>> qemu-system-riscv64: warning: disabling zca extension for hart 0x0000000000000001 because privilege spec version does not match
>> qemu-system-riscv64: warning: disabling zcd extension for hart 0x0000000000000001 because privilege spec version does not match
>> qemu-system-riscv64: warning: disabling zca extension for hart 0x0000000000000002 because privilege spec version does not match
>> qemu-system-riscv64: warning: disabling zcd extension for hart 0x0000000000000002 because privilege spec version does not match
>> qemu-system-riscv64: warning: disabling zca extension for hart 0x0000000000000003 because privilege spec version does not match
>> qemu-system-riscv64: warning: disabling zcd extension for hart 0x0000000000000003 because privilege spec version does not match
>> qemu-system-riscv64: warning: disabling zca extension for hart 0x0000000000000004 because privilege spec version does not match
>> qemu-system-riscv64: warning: disabling zcd extension for hart 0x0000000000000004 because privilege spec version does not match
> 
> Why am I seeing these warnings? Does the mpfs machine type need to
> disable some things? It only supports rv64imafdc per the DT, and
> predates things like Zca existing, so emitting warnings does not seem
> fair at all to me!

QEMU will disable extensions that are newer than a priv spec version that is set
by the CPU. IIUC the icicle board is running a sifive_u54 CPU by default. That
CPU has a priv spec version 1_10_0. The CPU is also enabling C.

We will enable zca if C is enabled. C and D enabled will also enable zcd. But
then the priv check will disabled both because zca and zcd have priv spec 1_12_0.

This is a side effect for a change that I did a few months ago. Back then we
weren't disabling stuff correctly. The warnings are annoying but are benign.
And apparently the sifive_u54 CPU is being inconsistent for some time and
we noticed just now.

Now, if the icicle board is supposed to have zca and zcd then we have a problem.
We'll need to discuss whether we move sifive_u54 CPU priv spec to 1_12_0 (I'm not
sure how this will affect other boards that uses this CPU) or remove this priv spec
disable code altogether from QEMU.


Thanks,

Daniel



> 
>>
>> OpenSBI v1.3
>>     ____                    _____ ____ _____
>>    / __ \                  / ____|  _ \_   _|
>>   | |  | |_ __   ___ _ __ | (___ | |_) || |
>>   | |  | | '_ \ / _ \ '_ \ \___ \|  _ < | |
>>   | |__| | |_) |  __/ | | |____) | |_) || |_
>>    \____/| .__/ \___|_| |_|_____/|___/_____|
>>          | |
>>          |_|
>>
>> init_coldboot: ipi init failed (error -1009)
>>
>> Just to note, because we use our own firmware that vendors in OpenSBI
>> and compiles only a significantly cut down number of files from it, we
>> do not use the fw_dynamic etc flow on our hardware. As a result, we have
>> not tested v1.3, nor do we have any immediate plans to change our
>> platform firmware to vendor v1.3 either.
>>
>> I unless there's something obvious to you, it sounds like I will need to
>> go and bisect OpenSBI. That's a job for another day though, given the
>> time.
>>
>> Cheers,
>> Conor.
> 
>
Conor Dooley July 13, 2023, 10:47 p.m. UTC | #10
On Thu, Jul 13, 2023 at 07:35:01PM -0300, Daniel Henrique Barboza wrote:
> On 7/13/23 19:12, Conor Dooley wrote:

> > And a question for you below Daniel.
> > 
> > On Wed, Jul 12, 2023 at 11:14:21PM +0100, Conor Dooley wrote:


> > 
> > > qemu-system-riscv64: warning: disabling zca extension for hart 0x0000000000000000 because privilege spec version does not match
> > > qemu-system-riscv64: warning: disabling zca extension for hart 0x0000000000000001 because privilege spec version does not match
> > > qemu-system-riscv64: warning: disabling zcd extension for hart 0x0000000000000001 because privilege spec version does not match
> > > qemu-system-riscv64: warning: disabling zca extension for hart 0x0000000000000002 because privilege spec version does not match
> > > qemu-system-riscv64: warning: disabling zcd extension for hart 0x0000000000000002 because privilege spec version does not match
> > > qemu-system-riscv64: warning: disabling zca extension for hart 0x0000000000000003 because privilege spec version does not match
> > > qemu-system-riscv64: warning: disabling zcd extension for hart 0x0000000000000003 because privilege spec version does not match
> > > qemu-system-riscv64: warning: disabling zca extension for hart 0x0000000000000004 because privilege spec version does not match
> > > qemu-system-riscv64: warning: disabling zcd extension for hart 0x0000000000000004 because privilege spec version does not match
> > 
> > Why am I seeing these warnings? Does the mpfs machine type need to
> > disable some things? It only supports rv64imafdc per the DT, and
> > predates things like Zca existing, so emitting warnings does not seem
> > fair at all to me!
> 
> QEMU will disable extensions that are newer than a priv spec version that is set
> by the CPU. IIUC the icicle board is running a sifive_u54 CPU by default. That
> CPU has a priv spec version 1_10_0. The CPU is also enabling C.
> 
> We will enable zca if C is enabled. C and D enabled will also enable zcd. But
> then the priv check will disabled both because zca and zcd have priv spec 1_12_0.
> 
> This is a side effect for a change that I did a few months ago. Back then we
> weren't disabling stuff correctly.

Yah, I did check out the blame, hence directing it at you. Thanks for
the explanation.

> The warnings are annoying but are benign.

To be honest, benign or not, this is kind of thing is only going to
lead to grief. Even though only the direct kernel boot works, we do
actually have some customers that are using the icicle target in QEMU.

> And apparently the sifive_u54 CPU is being inconsistent for some time and
> we noticed just now.
> Now, if the icicle board is supposed to have zca and zcd then we have a problem.

I don't know, this depends on how you see things in QEMU. I would say
that it supports c, and not Zca/Zcf/Zcd, given it predates the
extensions. I have no interest in retrofitting my devicetree stuff with
them, for example.

> We'll need to discuss whether we move sifive_u54 CPU priv spec to 1_12_0 (I'm not
> sure how this will affect other boards that uses this CPU) or remove this priv spec
> disable code altogether from QEMU.

I think you should stop warning for this? From my dumb-user perspective,
the warning only "scares" me into thinking something is wrong, when
there isn't. I can see a use case for the warning where someone tries to
enable Zca & Co. in their QEMU incantation for a CPU that does not
have the correct privilege level to support it, but I didn't try to set
any options at all in that way, so the warnings seem unfair?

Cheers,
Conor.
Conor Dooley July 13, 2023, 11:04 p.m. UTC | #11
On Thu, Jul 13, 2023 at 11:12:33PM +0100, Conor Dooley wrote:
> +CC OpenSBI Mailing list
> 
> I've not yet had the chance to bisect this, so adding the OpenSBI folks
> to CC in case they might have an idea for what to try.

NVM this, I bisected it. Logs below.

> And a question for you below Daniel.
> 
> On Wed, Jul 12, 2023 at 11:14:21PM +0100, Conor Dooley wrote:
> > On Wed, Jul 12, 2023 at 06:39:28PM -0300, Daniel Henrique Barboza wrote:
> > > On 7/12/23 18:35, Conor Dooley wrote:
> > > > On Wed, Jul 12, 2023 at 06:09:10PM -0300, Daniel Henrique Barboza wrote:
> > > > 
> > > > > It is intentional. Those default marchid/mimpid vals were derived from the current
> > > > > QEMU version ID/build and didn't mean much.
> > > > > 
> > > > > It is still possible to set them via "-cpu rv64,marchid=N,mimpid=N" if needed when
> > > > > using the generic (rv64,rv32) CPUs. Vendor CPUs can't have their machine IDs changed
> > > > > via command line.
> > > > 
> > > > Sounds good, thanks. I did just now go and check icicle to see what it
> > > > would report & it does not boot. I'll go bisect...
> > > 
> > > BTW how are you booting the icicle board nowadays? I remember you mentioning about
> > > some changes in the FDT being required to boot and whatnot.
> > 
> > I do direct kernel boots, as the HSS doesn't work anymore, and just lie
> > a bit to QEMU about how much DDR we have.
> > .PHONY: qemu-icicle
> > qemu-icicle:
> > 	$(qemu) -M microchip-icicle-kit \
> > 		-m 3G -smp 5 \
> > 		-kernel $(vmlinux_bin) \
> > 		-dtb $(icicle_dtb) \
> > 		-initrd $(initramfs) \
> > 		-display none -serial null \
> > 		-serial stdio \
> > 		-D qemu.log -d unimp
> > 
> > The platform only supports 2 GiB of DDR, not 3, but if I pass 2 to QEMU
> > it thinks there's 1 GiB at 0x8000_0000 and 1 GiB at 0x10_0000_0000. The
> > upstream devicetree (and current FPGA reference design) expects there to
> > be 1 GiB at 0x8000_0000 and 1 GiB at 0x10_4000_0000. If I lie to QEMU,
> > it thinks there is 1 GiB at 0x8000_0000 and 2 GiB at 0x10_0000_0000, and
> > things just work. I prefer doing it this way than having to modify the
> > DT, it is a lot easier to explain to people this way.
> > 
> > I've been meaning to work the support for the icicle & mpfs in QEMU, but
> > it just gets shunted down the priority list. I'd really like if a proper
> > boot flow would run in QEMU, which means fixing whatever broke the HSS,
> > but I've recently picked up maintainership of dt-binding stuff in Linux,
> > so I've unfortunately got even less time to try and work on it. Maybe
> > we'll get some new graduate in and I can make them suffer in my stead...
> > 
> > > If it's not too hard I'll add it in my test scripts to keep it under check. Perhaps
> > > we can even add it to QEMU testsuite.
> > 
> > I don't think it really should be that bad, at least for the direct
> > kernel boot, which is what I mainly care about, since I use it fairly
> > often for debugging boot stuff in Linux.
> > 
> > Anyways, aa903cf31391dd505b399627158f1292a6d19896 is the first bad commit:
> > commit aa903cf31391dd505b399627158f1292a6d19896
> > Author: Bin Meng <bmeng@tinylab.org>
> > Date:   Fri Jun 30 23:36:04 2023 +0800
> > 
> >     roms/opensbi: Upgrade from v1.2 to v1.3
> >     
> >     Upgrade OpenSBI from v1.2 to v1.3 and the pre-built bios images.
> > 
> > And I see something like:
> > qemu//build/qemu-system-riscv64 -M microchip-icicle-kit \
> >         -m 3G -smp 5 \
> >         -kernel vmlinux.bin \
> >         -dtb icicle.dtb \
> >         -initrd initramfs.cpio.gz \
> >         -display none -serial null \
> >         -serial stdio \
> >         -D qemu.log -d unimp
> 
> > qemu-system-riscv64: warning: disabling zca extension for hart 0x0000000000000000 because privilege spec version does not match
> > qemu-system-riscv64: warning: disabling zca extension for hart 0x0000000000000001 because privilege spec version does not match
> > qemu-system-riscv64: warning: disabling zcd extension for hart 0x0000000000000001 because privilege spec version does not match
> > qemu-system-riscv64: warning: disabling zca extension for hart 0x0000000000000002 because privilege spec version does not match
> > qemu-system-riscv64: warning: disabling zcd extension for hart 0x0000000000000002 because privilege spec version does not match
> > qemu-system-riscv64: warning: disabling zca extension for hart 0x0000000000000003 because privilege spec version does not match
> > qemu-system-riscv64: warning: disabling zcd extension for hart 0x0000000000000003 because privilege spec version does not match
> > qemu-system-riscv64: warning: disabling zca extension for hart 0x0000000000000004 because privilege spec version does not match
> > qemu-system-riscv64: warning: disabling zcd extension for hart 0x0000000000000004 because privilege spec version does not match
> 
> Why am I seeing these warnings? Does the mpfs machine type need to
> disable some things? It only supports rv64imafdc per the DT, and
> predates things like Zca existing, so emitting warnings does not seem
> fair at all to me!

> > OpenSBI v1.3
> >    ____                    _____ ____ _____
> >   / __ \                  / ____|  _ \_   _|
> >  | |  | |_ __   ___ _ __ | (___ | |_) || |
> >  | |  | | '_ \ / _ \ '_ \ \___ \|  _ < | |
> >  | |__| | |_) |  __/ | | |____) | |_) || |_
> >   \____/| .__/ \___|_| |_|_____/|___/_____|
> >         | |
> >         |_|
> > 
> > init_coldboot: ipi init failed (error -1009)

This can be reproduced using OpenSBI built using `make PLATFORM=generic`
and the QEMU incantation linked above with a -bios argument added to the
incantation.

Thanks,
Conor.

acbd8fce9e5d92f07d344388a3b046f1722ce072 is the first bad commit
commit acbd8fce9e5d92f07d344388a3b046f1722ce072
Author: Anup Patel <apatel@ventanamicro.com>
Date:   Wed Apr 19 21:23:53 2023 +0530

    lib: utils/ipi: Use scratch space to save per-HART MSWI pointer
    
    Instead of using a global array indexed by hartid, we should use
    scratch space to save per-HART MSWI pointer.
    
    Signed-off-by: Anup Patel <apatel@ventanamicro.com>
    Reviewed-by: Andrew Jones <ajones@ventanamicro.com>

 lib/utils/ipi/aclint_mswi.c | 43 +++++++++++++++++++++++++++++++++----------
 1 file changed, 33 insertions(+), 10 deletions(-)
git bisect start
# status: waiting for both good and bad commits
# bad: [2552799a1df30a3dcd2321a8b75d61d06f5fb9fc] include: Bump-up version to 1.3
git bisect bad 2552799a1df30a3dcd2321a8b75d61d06f5fb9fc
# status: waiting for good commit(s), bad commit known
# good: [6b5188ca14e59ce7bf71afe4e7d3d557c3d31bf8] include: Bump-up version to 1.2
git bisect good 6b5188ca14e59ce7bf71afe4e7d3d557c3d31bf8
# good: [6b5188ca14e59ce7bf71afe4e7d3d557c3d31bf8] include: Bump-up version to 1.2
git bisect good 6b5188ca14e59ce7bf71afe4e7d3d557c3d31bf8
# good: [908be1b85c8ff0695ea226fbbf0ff24a779cdece] gpio/starfive: add gpio driver and support gpio reset
git bisect good 908be1b85c8ff0695ea226fbbf0ff24a779cdece
# good: [6bc02dede86c47f87e65293b7099e9caf3b22c29] lib: sbi: Simplify sbi_ipi_process remove goto
git bisect good 6bc02dede86c47f87e65293b7099e9caf3b22c29
# good: [bbff53fe3b6cdd3c9bc084d489640d7ee2a3f831] lib: sbi_pmu: Use heap for per-HART PMU state
git bisect good bbff53fe3b6cdd3c9bc084d489640d7ee2a3f831
# bad: [f0516beae068ffce0d5a79f09a96904a661a25ba] lib: utils/timer: Use scratch space to save per-HART MTIMER pointer
git bisect bad f0516beae068ffce0d5a79f09a96904a661a25ba
# good: [5a8cfcdf19d98b8dc5dd5a087a2eceb7f5b185fb] lib: utils/ipi: Use heap in ACLINT MSWI driver
git bisect good 5a8cfcdf19d98b8dc5dd5a087a2eceb7f5b185fb
# good: [7e5636ac3788451991a65791c5adcc7798dcc22a] lib: utils/timer: Use heap in ACLINT MTIMER driver
git bisect good 7e5636ac3788451991a65791c5adcc7798dcc22a
# bad: [acbd8fce9e5d92f07d344388a3b046f1722ce072] lib: utils/ipi: Use scratch space to save per-HART MSWI pointer
git bisect bad acbd8fce9e5d92f07d344388a3b046f1722ce072
# good: [3c1c972cb69d670ddc309391c4db76f1f19fd77e] lib: utils/fdt: Use heap in FDT domain parsing
git bisect good 3c1c972cb69d670ddc309391c4db76f1f19fd77e
# first bad commit: [acbd8fce9e5d92f07d344388a3b046f1722ce072] lib: utils/ipi: Use scratch space to save per-HART MSWI pointer
Daniel Henrique Barboza July 14, 2023, 1:13 a.m. UTC | #12
On 7/13/23 19:47, Conor Dooley wrote:
> On Thu, Jul 13, 2023 at 07:35:01PM -0300, Daniel Henrique Barboza wrote:
>> On 7/13/23 19:12, Conor Dooley wrote:
> 
>>> And a question for you below Daniel.
>>>
>>> On Wed, Jul 12, 2023 at 11:14:21PM +0100, Conor Dooley wrote:
> 
> 
>>>
>>>> qemu-system-riscv64: warning: disabling zca extension for hart 0x0000000000000000 because privilege spec version does not match
>>>> qemu-system-riscv64: warning: disabling zca extension for hart 0x0000000000000001 because privilege spec version does not match
>>>> qemu-system-riscv64: warning: disabling zcd extension for hart 0x0000000000000001 because privilege spec version does not match
>>>> qemu-system-riscv64: warning: disabling zca extension for hart 0x0000000000000002 because privilege spec version does not match
>>>> qemu-system-riscv64: warning: disabling zcd extension for hart 0x0000000000000002 because privilege spec version does not match
>>>> qemu-system-riscv64: warning: disabling zca extension for hart 0x0000000000000003 because privilege spec version does not match
>>>> qemu-system-riscv64: warning: disabling zcd extension for hart 0x0000000000000003 because privilege spec version does not match
>>>> qemu-system-riscv64: warning: disabling zca extension for hart 0x0000000000000004 because privilege spec version does not match
>>>> qemu-system-riscv64: warning: disabling zcd extension for hart 0x0000000000000004 because privilege spec version does not match
>>>
>>> Why am I seeing these warnings? Does the mpfs machine type need to
>>> disable some things? It only supports rv64imafdc per the DT, and
>>> predates things like Zca existing, so emitting warnings does not seem
>>> fair at all to me!
>>
>> QEMU will disable extensions that are newer than a priv spec version that is set
>> by the CPU. IIUC the icicle board is running a sifive_u54 CPU by default. That
>> CPU has a priv spec version 1_10_0. The CPU is also enabling C.
>>
>> We will enable zca if C is enabled. C and D enabled will also enable zcd. But
>> then the priv check will disabled both because zca and zcd have priv spec 1_12_0.
>>
>> This is a side effect for a change that I did a few months ago. Back then we
>> weren't disabling stuff correctly.
> 
> Yah, I did check out the blame, hence directing it at you. Thanks for
> the explanation.
> 
>> The warnings are annoying but are benign.
> 
> To be honest, benign or not, this is kind of thing is only going to
> lead to grief. Even though only the direct kernel boot works, we do
> actually have some customers that are using the icicle target in QEMU.
> 
>> And apparently the sifive_u54 CPU is being inconsistent for some time and
>> we noticed just now.
>> Now, if the icicle board is supposed to have zca and zcd then we have a problem.
> 
> I don't know, this depends on how you see things in QEMU. I would say
> that it supports c, and not Zca/Zcf/Zcd, given it predates the
> extensions. I have no interest in retrofitting my devicetree stuff with
> them, for example.
> 
>> We'll need to discuss whether we move sifive_u54 CPU priv spec to 1_12_0 (I'm not
>> sure how this will affect other boards that uses this CPU) or remove this priv spec
>> disable code altogether from QEMU.
> 
> I think you should stop warning for this? From my dumb-user perspective,
> the warning only "scares" me into thinking something is wrong, when
> there isn't. I can see a use case for the warning where someone tries to
> enable Zca & Co. in their QEMU incantation for a CPU that does not
> have the correct privilege level to support it, but I didn't try to set
> any options at all in that way, so the warnings seem unfair?


That's a fair criticism. We had similar discussions a few months back. It's weird
to send warnings when the user didn't set the extensions manually, but ATM we
can't tell whether an extension was user enabled or not.

So we can either show unfair warning messages or not show warnings and take the risk
of silently disabling extensions that users enabled in the command line. It seems
that the former is more annoying to deal with than the latter.

I guess I can propose a patch to remove the warnings. We can send warning again
when we have a better solution.


Daniel


> 
> Cheers,
> Conor.
Alistair Francis July 14, 2023, 3:12 a.m. UTC | #13
On Fri, Jul 14, 2023 at 11:14 AM Daniel Henrique Barboza
<dbarboza@ventanamicro.com> wrote:
>
>
>
> On 7/13/23 19:47, Conor Dooley wrote:
> > On Thu, Jul 13, 2023 at 07:35:01PM -0300, Daniel Henrique Barboza wrote:
> >> On 7/13/23 19:12, Conor Dooley wrote:
> >
> >>> And a question for you below Daniel.
> >>>
> >>> On Wed, Jul 12, 2023 at 11:14:21PM +0100, Conor Dooley wrote:
> >
> >
> >>>
> >>>> qemu-system-riscv64: warning: disabling zca extension for hart 0x0000000000000000 because privilege spec version does not match
> >>>> qemu-system-riscv64: warning: disabling zca extension for hart 0x0000000000000001 because privilege spec version does not match
> >>>> qemu-system-riscv64: warning: disabling zcd extension for hart 0x0000000000000001 because privilege spec version does not match
> >>>> qemu-system-riscv64: warning: disabling zca extension for hart 0x0000000000000002 because privilege spec version does not match
> >>>> qemu-system-riscv64: warning: disabling zcd extension for hart 0x0000000000000002 because privilege spec version does not match
> >>>> qemu-system-riscv64: warning: disabling zca extension for hart 0x0000000000000003 because privilege spec version does not match
> >>>> qemu-system-riscv64: warning: disabling zcd extension for hart 0x0000000000000003 because privilege spec version does not match
> >>>> qemu-system-riscv64: warning: disabling zca extension for hart 0x0000000000000004 because privilege spec version does not match
> >>>> qemu-system-riscv64: warning: disabling zcd extension for hart 0x0000000000000004 because privilege spec version does not match
> >>>
> >>> Why am I seeing these warnings? Does the mpfs machine type need to
> >>> disable some things? It only supports rv64imafdc per the DT, and
> >>> predates things like Zca existing, so emitting warnings does not seem
> >>> fair at all to me!
> >>
> >> QEMU will disable extensions that are newer than a priv spec version that is set
> >> by the CPU. IIUC the icicle board is running a sifive_u54 CPU by default. That
> >> CPU has a priv spec version 1_10_0. The CPU is also enabling C.
> >>
> >> We will enable zca if C is enabled. C and D enabled will also enable zcd. But
> >> then the priv check will disabled both because zca and zcd have priv spec 1_12_0.
> >>
> >> This is a side effect for a change that I did a few months ago. Back then we
> >> weren't disabling stuff correctly.
> >
> > Yah, I did check out the blame, hence directing it at you. Thanks for
> > the explanation.
> >
> >> The warnings are annoying but are benign.
> >
> > To be honest, benign or not, this is kind of thing is only going to
> > lead to grief. Even though only the direct kernel boot works, we do
> > actually have some customers that are using the icicle target in QEMU.
> >
> >> And apparently the sifive_u54 CPU is being inconsistent for some time and
> >> we noticed just now.
> >> Now, if the icicle board is supposed to have zca and zcd then we have a problem.
> >
> > I don't know, this depends on how you see things in QEMU. I would say
> > that it supports c, and not Zca/Zcf/Zcd, given it predates the
> > extensions. I have no interest in retrofitting my devicetree stuff with
> > them, for example.
> >
> >> We'll need to discuss whether we move sifive_u54 CPU priv spec to 1_12_0 (I'm not
> >> sure how this will affect other boards that uses this CPU) or remove this priv spec
> >> disable code altogether from QEMU.
> >
> > I think you should stop warning for this? From my dumb-user perspective,
> > the warning only "scares" me into thinking something is wrong, when
> > there isn't. I can see a use case for the warning where someone tries to
> > enable Zca & Co. in their QEMU incantation for a CPU that does not
> > have the correct privilege level to support it, but I didn't try to set
> > any options at all in that way, so the warnings seem unfair?
>
>
> That's a fair criticism. We had similar discussions a few months back. It's weird
> to send warnings when the user didn't set the extensions manually, but ATM we
> can't tell whether an extension was user enabled or not.
>
> So we can either show unfair warning messages or not show warnings and take the risk
> of silently disabling extensions that users enabled in the command line. It seems
> that the former is more annoying to deal with than the latter.
>
> I guess I can propose a patch to remove the warnings. We can send warning again
> when we have a better solution.

A better solution is to just not enable Zca and friends automatically,
or at least look at the priv spec before we do

Alistair

>
>
> Daniel
>
>
> >
> > Cheers,
> > Conor.
>
Anup Patel July 14, 2023, 4:30 a.m. UTC | #14
On Fri, Jul 14, 2023 at 3:43 AM Conor Dooley <conor@kernel.org> wrote:
>
> +CC OpenSBI Mailing list
>
> I've not yet had the chance to bisect this, so adding the OpenSBI folks
> to CC in case they might have an idea for what to try.
>
> And a question for you below Daniel.
>
> On Wed, Jul 12, 2023 at 11:14:21PM +0100, Conor Dooley wrote:
> > On Wed, Jul 12, 2023 at 06:39:28PM -0300, Daniel Henrique Barboza wrote:
> > > On 7/12/23 18:35, Conor Dooley wrote:
> > > > On Wed, Jul 12, 2023 at 06:09:10PM -0300, Daniel Henrique Barboza wrote:
> > > >
> > > > > It is intentional. Those default marchid/mimpid vals were derived from the current
> > > > > QEMU version ID/build and didn't mean much.
> > > > >
> > > > > It is still possible to set them via "-cpu rv64,marchid=N,mimpid=N" if needed when
> > > > > using the generic (rv64,rv32) CPUs. Vendor CPUs can't have their machine IDs changed
> > > > > via command line.
> > > >
> > > > Sounds good, thanks. I did just now go and check icicle to see what it
> > > > would report & it does not boot. I'll go bisect...
> > >
> > > BTW how are you booting the icicle board nowadays? I remember you mentioning about
> > > some changes in the FDT being required to boot and whatnot.
> >
> > I do direct kernel boots, as the HSS doesn't work anymore, and just lie
> > a bit to QEMU about how much DDR we have.
> > .PHONY: qemu-icicle
> > qemu-icicle:
> >       $(qemu) -M microchip-icicle-kit \
> >               -m 3G -smp 5 \
> >               -kernel $(vmlinux_bin) \
> >               -dtb $(icicle_dtb) \
> >               -initrd $(initramfs) \
> >               -display none -serial null \
> >               -serial stdio \
> >               -D qemu.log -d unimp
> >
> > The platform only supports 2 GiB of DDR, not 3, but if I pass 2 to QEMU
> > it thinks there's 1 GiB at 0x8000_0000 and 1 GiB at 0x10_0000_0000. The
> > upstream devicetree (and current FPGA reference design) expects there to
> > be 1 GiB at 0x8000_0000 and 1 GiB at 0x10_4000_0000. If I lie to QEMU,
> > it thinks there is 1 GiB at 0x8000_0000 and 2 GiB at 0x10_0000_0000, and
> > things just work. I prefer doing it this way than having to modify the
> > DT, it is a lot easier to explain to people this way.
> >
> > I've been meaning to work the support for the icicle & mpfs in QEMU, but
> > it just gets shunted down the priority list. I'd really like if a proper
> > boot flow would run in QEMU, which means fixing whatever broke the HSS,
> > but I've recently picked up maintainership of dt-binding stuff in Linux,
> > so I've unfortunately got even less time to try and work on it. Maybe
> > we'll get some new graduate in and I can make them suffer in my stead...
> >
> > > If it's not too hard I'll add it in my test scripts to keep it under check. Perhaps
> > > we can even add it to QEMU testsuite.
> >
> > I don't think it really should be that bad, at least for the direct
> > kernel boot, which is what I mainly care about, since I use it fairly
> > often for debugging boot stuff in Linux.
> >
> > Anyways, aa903cf31391dd505b399627158f1292a6d19896 is the first bad commit:
> > commit aa903cf31391dd505b399627158f1292a6d19896
> > Author: Bin Meng <bmeng@tinylab.org>
> > Date:   Fri Jun 30 23:36:04 2023 +0800
> >
> >     roms/opensbi: Upgrade from v1.2 to v1.3
> >
> >     Upgrade OpenSBI from v1.2 to v1.3 and the pre-built bios images.
> >
> > And I see something like:
> > qemu//build/qemu-system-riscv64 -M microchip-icicle-kit \
> >         -m 3G -smp 5 \
> >         -kernel vmlinux.bin \
> >         -dtb icicle.dtb \
> >         -initrd initramfs.cpio.gz \
> >         -display none -serial null \
> >         -serial stdio \
> >         -D qemu.log -d unimp
>
> > qemu-system-riscv64: warning: disabling zca extension for hart 0x0000000000000000 because privilege spec version does not match
> > qemu-system-riscv64: warning: disabling zca extension for hart 0x0000000000000001 because privilege spec version does not match
> > qemu-system-riscv64: warning: disabling zcd extension for hart 0x0000000000000001 because privilege spec version does not match
> > qemu-system-riscv64: warning: disabling zca extension for hart 0x0000000000000002 because privilege spec version does not match
> > qemu-system-riscv64: warning: disabling zcd extension for hart 0x0000000000000002 because privilege spec version does not match
> > qemu-system-riscv64: warning: disabling zca extension for hart 0x0000000000000003 because privilege spec version does not match
> > qemu-system-riscv64: warning: disabling zcd extension for hart 0x0000000000000003 because privilege spec version does not match
> > qemu-system-riscv64: warning: disabling zca extension for hart 0x0000000000000004 because privilege spec version does not match
> > qemu-system-riscv64: warning: disabling zcd extension for hart 0x0000000000000004 because privilege spec version does not match
>
> Why am I seeing these warnings? Does the mpfs machine type need to
> disable some things? It only supports rv64imafdc per the DT, and
> predates things like Zca existing, so emitting warnings does not seem
> fair at all to me!
>
> >
> > OpenSBI v1.3
> >    ____                    _____ ____ _____
> >   / __ \                  / ____|  _ \_   _|
> >  | |  | |_ __   ___ _ __ | (___ | |_) || |
> >  | |  | | '_ \ / _ \ '_ \ \___ \|  _ < | |
> >  | |__| | |_) |  __/ | | |____) | |_) || |_
> >   \____/| .__/ \___|_| |_|_____/|___/_____|
> >         | |
> >         |_|
> >
> > init_coldboot: ipi init failed (error -1009)
> >
> > Just to note, because we use our own firmware that vendors in OpenSBI
> > and compiles only a significantly cut down number of files from it, we
> > do not use the fw_dynamic etc flow on our hardware. As a result, we have
> > not tested v1.3, nor do we have any immediate plans to change our
> > platform firmware to vendor v1.3 either.
> >
> > I unless there's something obvious to you, it sounds like I will need to
> > go and bisect OpenSBI. That's a job for another day though, given the
> > time.
> >

The real issue is some CPU/HART DT nodes marked as disabled in the
DT passed to OpenSBI 1.3.

This issue does not exist in any of the DTs generated by QEMU but some
of the DTs in the kernel (such as microchip and SiFive board DTs) have
the E-core disabled.

I had discovered this issue in a totally different context after the OpenSBI 1.3
release happened. This issue is already fixed in the latest OpenSBI by the
following commit c6a35733b74aeff612398f274ed19a74f81d1f37 ("lib: utils:
Fix sbi_hartid_to_scratch() usage in ACLINT drivers").

I always assumed that Microchip hss.bin is the preferred BIOS for the
QEMU microchip-icicle-kit machine but I guess that's not true.

At this point, you can either:
1) Use latest OpenSBI on QEMU microchip-icicle-kit machine
2) Ensure CPU0 DT node is enabled in DT when booting on QEMU
    microchip-icicle-kit machine with OpenSBI 1.3

Regards,
Anup
Daniel Henrique Barboza July 14, 2023, 9:26 a.m. UTC | #15
On 7/14/23 00:12, Alistair Francis wrote:
> On Fri, Jul 14, 2023 at 11:14 AM Daniel Henrique Barboza
> <dbarboza@ventanamicro.com> wrote:
>>
>>
>>
>> On 7/13/23 19:47, Conor Dooley wrote:
>>> On Thu, Jul 13, 2023 at 07:35:01PM -0300, Daniel Henrique Barboza wrote:
>>>> On 7/13/23 19:12, Conor Dooley wrote:
>>>
>>>>> And a question for you below Daniel.
>>>>>
>>>>> On Wed, Jul 12, 2023 at 11:14:21PM +0100, Conor Dooley wrote:
>>>
>>>
>>>>>
>>>>>> qemu-system-riscv64: warning: disabling zca extension for hart 0x0000000000000000 because privilege spec version does not match
>>>>>> qemu-system-riscv64: warning: disabling zca extension for hart 0x0000000000000001 because privilege spec version does not match
>>>>>> qemu-system-riscv64: warning: disabling zcd extension for hart 0x0000000000000001 because privilege spec version does not match
>>>>>> qemu-system-riscv64: warning: disabling zca extension for hart 0x0000000000000002 because privilege spec version does not match
>>>>>> qemu-system-riscv64: warning: disabling zcd extension for hart 0x0000000000000002 because privilege spec version does not match
>>>>>> qemu-system-riscv64: warning: disabling zca extension for hart 0x0000000000000003 because privilege spec version does not match
>>>>>> qemu-system-riscv64: warning: disabling zcd extension for hart 0x0000000000000003 because privilege spec version does not match
>>>>>> qemu-system-riscv64: warning: disabling zca extension for hart 0x0000000000000004 because privilege spec version does not match
>>>>>> qemu-system-riscv64: warning: disabling zcd extension for hart 0x0000000000000004 because privilege spec version does not match
>>>>>
>>>>> Why am I seeing these warnings? Does the mpfs machine type need to
>>>>> disable some things? It only supports rv64imafdc per the DT, and
>>>>> predates things like Zca existing, so emitting warnings does not seem
>>>>> fair at all to me!
>>>>
>>>> QEMU will disable extensions that are newer than a priv spec version that is set
>>>> by the CPU. IIUC the icicle board is running a sifive_u54 CPU by default. That
>>>> CPU has a priv spec version 1_10_0. The CPU is also enabling C.
>>>>
>>>> We will enable zca if C is enabled. C and D enabled will also enable zcd. But
>>>> then the priv check will disabled both because zca and zcd have priv spec 1_12_0.
>>>>
>>>> This is a side effect for a change that I did a few months ago. Back then we
>>>> weren't disabling stuff correctly.
>>>
>>> Yah, I did check out the blame, hence directing it at you. Thanks for
>>> the explanation.
>>>
>>>> The warnings are annoying but are benign.
>>>
>>> To be honest, benign or not, this is kind of thing is only going to
>>> lead to grief. Even though only the direct kernel boot works, we do
>>> actually have some customers that are using the icicle target in QEMU.
>>>
>>>> And apparently the sifive_u54 CPU is being inconsistent for some time and
>>>> we noticed just now.
>>>> Now, if the icicle board is supposed to have zca and zcd then we have a problem.
>>>
>>> I don't know, this depends on how you see things in QEMU. I would say
>>> that it supports c, and not Zca/Zcf/Zcd, given it predates the
>>> extensions. I have no interest in retrofitting my devicetree stuff with
>>> them, for example.
>>>
>>>> We'll need to discuss whether we move sifive_u54 CPU priv spec to 1_12_0 (I'm not
>>>> sure how this will affect other boards that uses this CPU) or remove this priv spec
>>>> disable code altogether from QEMU.
>>>
>>> I think you should stop warning for this? From my dumb-user perspective,
>>> the warning only "scares" me into thinking something is wrong, when
>>> there isn't. I can see a use case for the warning where someone tries to
>>> enable Zca & Co. in their QEMU incantation for a CPU that does not
>>> have the correct privilege level to support it, but I didn't try to set
>>> any options at all in that way, so the warnings seem unfair?
>>
>>
>> That's a fair criticism. We had similar discussions a few months back. It's weird
>> to send warnings when the user didn't set the extensions manually, but ATM we
>> can't tell whether an extension was user enabled or not.
>>
>> So we can either show unfair warning messages or not show warnings and take the risk
>> of silently disabling extensions that users enabled in the command line. It seems
>> that the former is more annoying to deal with than the latter.
>>
>> I guess I can propose a patch to remove the warnings. We can send warning again
>> when we have a better solution.
> 
> A better solution is to just not enable Zca and friends automatically,
> or at least look at the priv spec before we do

Good idea. In fact we should do that for all extensions that we're enabling
automatically.


I'll work something out. Thanks,


Daniel

> 
> Alistair
> 
>>
>>
>> Daniel
>>
>>
>>>
>>> Cheers,
>>> Conor.
>>
Conor Dooley July 14, 2023, 10:19 a.m. UTC | #16
On Fri, Jul 14, 2023 at 10:00:19AM +0530, Anup Patel wrote:

> > > OpenSBI v1.3
> > >    ____                    _____ ____ _____
> > >   / __ \                  / ____|  _ \_   _|
> > >  | |  | |_ __   ___ _ __ | (___ | |_) || |
> > >  | |  | | '_ \ / _ \ '_ \ \___ \|  _ < | |
> > >  | |__| | |_) |  __/ | | |____) | |_) || |_
> > >   \____/| .__/ \___|_| |_|_____/|___/_____|
> > >         | |
> > >         |_|
> > >
> > > init_coldboot: ipi init failed (error -1009)
> > >
> > > Just to note, because we use our own firmware that vendors in OpenSBI
> > > and compiles only a significantly cut down number of files from it, we
> > > do not use the fw_dynamic etc flow on our hardware. As a result, we have
> > > not tested v1.3, nor do we have any immediate plans to change our
> > > platform firmware to vendor v1.3 either.
> > >
> > > I unless there's something obvious to you, it sounds like I will need to
> > > go and bisect OpenSBI. That's a job for another day though, given the
> > > time.
> > >
> 
> The real issue is some CPU/HART DT nodes marked as disabled in the
> DT passed to OpenSBI 1.3.
> 
> This issue does not exist in any of the DTs generated by QEMU but some
> of the DTs in the kernel (such as microchip and SiFive board DTs) have
> the E-core disabled.
> 
> I had discovered this issue in a totally different context after the OpenSBI 1.3
> release happened. This issue is already fixed in the latest OpenSBI by the
> following commit c6a35733b74aeff612398f274ed19a74f81d1f37 ("lib: utils:
> Fix sbi_hartid_to_scratch() usage in ACLINT drivers").

Great, thanks Anup! I thought I had tested tip-of-tree too, but
obviously not.

> I always assumed that Microchip hss.bin is the preferred BIOS for the
> QEMU microchip-icicle-kit machine but I guess that's not true.

Unfortunately the HSS has not worked in QEMU for a long time, and while
I would love to fix it, but am pretty stretched for spare time to begin
with.
I usually just do direct kernel boots, which use the OpenSBI that comes
with QEMU, as I am sure you already know :)

> At this point, you can either:
> 1) Use latest OpenSBI on QEMU microchip-icicle-kit machine

> 2) Ensure CPU0 DT node is enabled in DT when booting on QEMU
>     microchip-icicle-kit machine with OpenSBI 1.3

Will OpenSBI disable it? If not, I think option 2) needs to be remove
the DT node. I'll just use tip-of-tree myself & up to the
Conor Dooley July 14, 2023, 12:28 p.m. UTC | #17
On Fri, Jul 14, 2023 at 11:19:34AM +0100, Conor Dooley wrote:
> On Fri, Jul 14, 2023 at 10:00:19AM +0530, Anup Patel wrote:
> 
> > > > OpenSBI v1.3
> > > >    ____                    _____ ____ _____
> > > >   / __ \                  / ____|  _ \_   _|
> > > >  | |  | |_ __   ___ _ __ | (___ | |_) || |
> > > >  | |  | | '_ \ / _ \ '_ \ \___ \|  _ < | |
> > > >  | |__| | |_) |  __/ | | |____) | |_) || |_
> > > >   \____/| .__/ \___|_| |_|_____/|___/_____|
> > > >         | |
> > > >         |_|
> > > >
> > > > init_coldboot: ipi init failed (error -1009)
> > > >
> > > > Just to note, because we use our own firmware that vendors in OpenSBI
> > > > and compiles only a significantly cut down number of files from it, we
> > > > do not use the fw_dynamic etc flow on our hardware. As a result, we have
> > > > not tested v1.3, nor do we have any immediate plans to change our
> > > > platform firmware to vendor v1.3 either.
> > > >
> > > > I unless there's something obvious to you, it sounds like I will need to
> > > > go and bisect OpenSBI. That's a job for another day though, given the
> > > > time.
> > > >
> > 
> > The real issue is some CPU/HART DT nodes marked as disabled in the
> > DT passed to OpenSBI 1.3.
> > 
> > This issue does not exist in any of the DTs generated by QEMU but some
> > of the DTs in the kernel (such as microchip and SiFive board DTs) have
> > the E-core disabled.
> > 
> > I had discovered this issue in a totally different context after the OpenSBI 1.3
> > release happened. This issue is already fixed in the latest OpenSBI by the
> > following commit c6a35733b74aeff612398f274ed19a74f81d1f37 ("lib: utils:
> > Fix sbi_hartid_to_scratch() usage in ACLINT drivers").
> 
> Great, thanks Anup! I thought I had tested tip-of-tree too, but
> obviously not.
> 
> > I always assumed that Microchip hss.bin is the preferred BIOS for the
> > QEMU microchip-icicle-kit machine but I guess that's not true.
> 
> Unfortunately the HSS has not worked in QEMU for a long time, and while
> I would love to fix it, but am pretty stretched for spare time to begin
> with.
> I usually just do direct kernel boots, which use the OpenSBI that comes
> with QEMU, as I am sure you already know :)
> 
> > At this point, you can either:
> > 1) Use latest OpenSBI on QEMU microchip-icicle-kit machine

I forgot to reply to this point, wondering what should be done with
QEMU. Bumping to v1.3 in QEMU introduces a regression here, regardless
of whether I can go and build a fixed version of OpenSBI.

> > 2) Ensure CPU0 DT node is enabled in DT when booting on QEMU
> >     microchip-icicle-kit machine with OpenSBI 1.3
> 
> Will OpenSBI disable it? If not, I think option 2) needs to be remove
> the DT node. I'll just use tip-of-tree myself & up to the 

Clearly didn't finish this comment. It was meant to say "up to the QEMU
maintainers what they want to do on the QEMU side of things".

Thanks,
Conor.
Anup Patel July 14, 2023, 12:35 p.m. UTC | #18
On Fri, Jul 14, 2023 at 3:50 PM Conor Dooley <conor@kernel.org> wrote:
>
> On Fri, Jul 14, 2023 at 10:00:19AM +0530, Anup Patel wrote:
>
> > > > OpenSBI v1.3
> > > >    ____                    _____ ____ _____
> > > >   / __ \                  / ____|  _ \_   _|
> > > >  | |  | |_ __   ___ _ __ | (___ | |_) || |
> > > >  | |  | | '_ \ / _ \ '_ \ \___ \|  _ < | |
> > > >  | |__| | |_) |  __/ | | |____) | |_) || |_
> > > >   \____/| .__/ \___|_| |_|_____/|___/_____|
> > > >         | |
> > > >         |_|
> > > >
> > > > init_coldboot: ipi init failed (error -1009)
> > > >
> > > > Just to note, because we use our own firmware that vendors in OpenSBI
> > > > and compiles only a significantly cut down number of files from it, we
> > > > do not use the fw_dynamic etc flow on our hardware. As a result, we have
> > > > not tested v1.3, nor do we have any immediate plans to change our
> > > > platform firmware to vendor v1.3 either.
> > > >
> > > > I unless there's something obvious to you, it sounds like I will need to
> > > > go and bisect OpenSBI. That's a job for another day though, given the
> > > > time.
> > > >
> >
> > The real issue is some CPU/HART DT nodes marked as disabled in the
> > DT passed to OpenSBI 1.3.
> >
> > This issue does not exist in any of the DTs generated by QEMU but some
> > of the DTs in the kernel (such as microchip and SiFive board DTs) have
> > the E-core disabled.
> >
> > I had discovered this issue in a totally different context after the OpenSBI 1.3
> > release happened. This issue is already fixed in the latest OpenSBI by the
> > following commit c6a35733b74aeff612398f274ed19a74f81d1f37 ("lib: utils:
> > Fix sbi_hartid_to_scratch() usage in ACLINT drivers").
>
> Great, thanks Anup! I thought I had tested tip-of-tree too, but
> obviously not.
>
> > I always assumed that Microchip hss.bin is the preferred BIOS for the
> > QEMU microchip-icicle-kit machine but I guess that's not true.
>
> Unfortunately the HSS has not worked in QEMU for a long time, and while
> I would love to fix it, but am pretty stretched for spare time to begin
> with.
> I usually just do direct kernel boots, which use the OpenSBI that comes
> with QEMU, as I am sure you already know :)
>
> > At this point, you can either:
> > 1) Use latest OpenSBI on QEMU microchip-icicle-kit machine
>
> > 2) Ensure CPU0 DT node is enabled in DT when booting on QEMU
> >     microchip-icicle-kit machine with OpenSBI 1.3
>
> Will OpenSBI disable it? If not, I think option 2) needs to be remove
> the DT node. I'll just use tip-of-tree myself & up to the

Current, FDT fixup code in OpenSBI will disable any CPU DT node
which satisfies any of the following:
1) CPU is not assigned to the current domain
2) CPU does not have "mmu-type" DT property

Regards,
Anup
Atish Patra July 15, 2023, 9:12 a.m. UTC | #19
On Fri, Jul 14, 2023 at 5:29 AM Conor Dooley <conor@kernel.org> wrote:
>
> On Fri, Jul 14, 2023 at 11:19:34AM +0100, Conor Dooley wrote:
> > On Fri, Jul 14, 2023 at 10:00:19AM +0530, Anup Patel wrote:
> >
> > > > > OpenSBI v1.3
> > > > >    ____                    _____ ____ _____
> > > > >   / __ \                  / ____|  _ \_   _|
> > > > >  | |  | |_ __   ___ _ __ | (___ | |_) || |
> > > > >  | |  | | '_ \ / _ \ '_ \ \___ \|  _ < | |
> > > > >  | |__| | |_) |  __/ | | |____) | |_) || |_
> > > > >   \____/| .__/ \___|_| |_|_____/|___/_____|
> > > > >         | |
> > > > >         |_|
> > > > >
> > > > > init_coldboot: ipi init failed (error -1009)
> > > > >
> > > > > Just to note, because we use our own firmware that vendors in OpenSBI
> > > > > and compiles only a significantly cut down number of files from it, we
> > > > > do not use the fw_dynamic etc flow on our hardware. As a result, we have
> > > > > not tested v1.3, nor do we have any immediate plans to change our
> > > > > platform firmware to vendor v1.3 either.
> > > > >
> > > > > I unless there's something obvious to you, it sounds like I will need to
> > > > > go and bisect OpenSBI. That's a job for another day though, given the
> > > > > time.
> > > > >
> > >
> > > The real issue is some CPU/HART DT nodes marked as disabled in the
> > > DT passed to OpenSBI 1.3.
> > >
> > > This issue does not exist in any of the DTs generated by QEMU but some
> > > of the DTs in the kernel (such as microchip and SiFive board DTs) have
> > > the E-core disabled.
> > >
> > > I had discovered this issue in a totally different context after the OpenSBI 1.3
> > > release happened. This issue is already fixed in the latest OpenSBI by the
> > > following commit c6a35733b74aeff612398f274ed19a74f81d1f37 ("lib: utils:
> > > Fix sbi_hartid_to_scratch() usage in ACLINT drivers").
> >
> > Great, thanks Anup! I thought I had tested tip-of-tree too, but
> > obviously not.
> >
> > > I always assumed that Microchip hss.bin is the preferred BIOS for the
> > > QEMU microchip-icicle-kit machine but I guess that's not true.
> >
> > Unfortunately the HSS has not worked in QEMU for a long time, and while
> > I would love to fix it, but am pretty stretched for spare time to begin
> > with.
> > I usually just do direct kernel boots, which use the OpenSBI that comes
> > with QEMU, as I am sure you already know :)
> >
> > > At this point, you can either:
> > > 1) Use latest OpenSBI on QEMU microchip-icicle-kit machine
>
> I forgot to reply to this point, wondering what should be done with
> QEMU. Bumping to v1.3 in QEMU introduces a regression here, regardless
> of whether I can go and build a fixed version of OpenSBI.
>
FYI: The no-map fix went in OpenSBI v1.3. Without the upgrade, any
user using the latest kernel (> v6.4)
may hit those random linear map related issues (in hibernation or EFI
booting path).

There are three possible scenarios:

1. Upgrade to OpenSBI v1.3: Any user of microchip-icicle-kit machine
or sifive fu540 machine users
may hit this issue if the device tree has the disabled hart (e core).
2. No upgrade to OpenSBI v1.2. Any user using hibernation or UEFI may
have issues [1]
3. Include a non-release version OpenSBI in Qemu with the fix as an exception.

#3 probably deviates from policy and sets a bad precedent. So I am not
advocating for it though ;)
For both #1 & #2, the solution would be to use the latest OpenSBI in
-bios argument instead of the stock one.
I could be wrong but my guess is the number of users facing #2 would
be higher than #1.

[1] https://lore.kernel.org/linux-riscv/20230625140931.1266216-1-songshuaishuai@tinylab.org/
> > > 2) Ensure CPU0 DT node is enabled in DT when booting on QEMU
> > >     microchip-icicle-kit machine with OpenSBI 1.3
> >
> > Will OpenSBI disable it? If not, I think option 2) needs to be remove
> > the DT node. I'll just use tip-of-tree myself & up to the
>
> Clearly didn't finish this comment. It was meant to say "up to the QEMU
> maintainers what they want to do on the QEMU side of things".
>
> Thanks,
> Conor.
Alistair Francis July 19, 2023, 1:32 a.m. UTC | #20
On Sat, Jul 15, 2023 at 7:14 PM Atish Patra <atishp@atishpatra.org> wrote:
>
> On Fri, Jul 14, 2023 at 5:29 AM Conor Dooley <conor@kernel.org> wrote:
> >
> > On Fri, Jul 14, 2023 at 11:19:34AM +0100, Conor Dooley wrote:
> > > On Fri, Jul 14, 2023 at 10:00:19AM +0530, Anup Patel wrote:
> > >
> > > > > > OpenSBI v1.3
> > > > > >    ____                    _____ ____ _____
> > > > > >   / __ \                  / ____|  _ \_   _|
> > > > > >  | |  | |_ __   ___ _ __ | (___ | |_) || |
> > > > > >  | |  | | '_ \ / _ \ '_ \ \___ \|  _ < | |
> > > > > >  | |__| | |_) |  __/ | | |____) | |_) || |_
> > > > > >   \____/| .__/ \___|_| |_|_____/|___/_____|
> > > > > >         | |
> > > > > >         |_|
> > > > > >
> > > > > > init_coldboot: ipi init failed (error -1009)
> > > > > >
> > > > > > Just to note, because we use our own firmware that vendors in OpenSBI
> > > > > > and compiles only a significantly cut down number of files from it, we
> > > > > > do not use the fw_dynamic etc flow on our hardware. As a result, we have
> > > > > > not tested v1.3, nor do we have any immediate plans to change our
> > > > > > platform firmware to vendor v1.3 either.
> > > > > >
> > > > > > I unless there's something obvious to you, it sounds like I will need to
> > > > > > go and bisect OpenSBI. That's a job for another day though, given the
> > > > > > time.
> > > > > >
> > > >
> > > > The real issue is some CPU/HART DT nodes marked as disabled in the
> > > > DT passed to OpenSBI 1.3.
> > > >
> > > > This issue does not exist in any of the DTs generated by QEMU but some
> > > > of the DTs in the kernel (such as microchip and SiFive board DTs) have
> > > > the E-core disabled.
> > > >
> > > > I had discovered this issue in a totally different context after the OpenSBI 1.3
> > > > release happened. This issue is already fixed in the latest OpenSBI by the
> > > > following commit c6a35733b74aeff612398f274ed19a74f81d1f37 ("lib: utils:
> > > > Fix sbi_hartid_to_scratch() usage in ACLINT drivers").
> > >
> > > Great, thanks Anup! I thought I had tested tip-of-tree too, but
> > > obviously not.
> > >
> > > > I always assumed that Microchip hss.bin is the preferred BIOS for the
> > > > QEMU microchip-icicle-kit machine but I guess that's not true.
> > >
> > > Unfortunately the HSS has not worked in QEMU for a long time, and while
> > > I would love to fix it, but am pretty stretched for spare time to begin
> > > with.
> > > I usually just do direct kernel boots, which use the OpenSBI that comes
> > > with QEMU, as I am sure you already know :)
> > >
> > > > At this point, you can either:
> > > > 1) Use latest OpenSBI on QEMU microchip-icicle-kit machine
> >
> > I forgot to reply to this point, wondering what should be done with
> > QEMU. Bumping to v1.3 in QEMU introduces a regression here, regardless
> > of whether I can go and build a fixed version of OpenSBI.
> >
> FYI: The no-map fix went in OpenSBI v1.3. Without the upgrade, any
> user using the latest kernel (> v6.4)
> may hit those random linear map related issues (in hibernation or EFI
> booting path).
>
> There are three possible scenarios:
>
> 1. Upgrade to OpenSBI v1.3: Any user of microchip-icicle-kit machine
> or sifive fu540 machine users
> may hit this issue if the device tree has the disabled hart (e core).
> 2. No upgrade to OpenSBI v1.2. Any user using hibernation or UEFI may
> have issues [1]
> 3. Include a non-release version OpenSBI in Qemu with the fix as an exception.
>
> #3 probably deviates from policy and sets a bad precedent. So I am not
> advocating for it though ;)
> For both #1 & #2, the solution would be to use the latest OpenSBI in
> -bios argument instead of the stock one.
> I could be wrong but my guess is the number of users facing #2 would
> be higher than #1.

Thanks for that info Atish!

We are stuck in a bad situation.

The best solution would be if OpenSBI can release a 1.3.1, @Anup Patel
do you think you could do that?

Otherwise I think we should stick with OpenSBI 1.3. Considering that
it fixes UEFI boot issues for the virt board (which would be the most
used) it seems like a best call to make. People using the other boards
are unfortunately stuck building their own OpenSBI release.

If there is no OpenSBI 1.3.1 release we should add something to the
release notes. @Conor Dooley are you able to give a clear sentence on
how the boot fails?

Alistair

>
> [1] https://lore.kernel.org/linux-riscv/20230625140931.1266216-1-songshuaishuai@tinylab.org/
> > > > 2) Ensure CPU0 DT node is enabled in DT when booting on QEMU
> > > >     microchip-icicle-kit machine with OpenSBI 1.3
> > >
> > > Will OpenSBI disable it? If not, I think option 2) needs to be remove
> > > the DT node. I'll just use tip-of-tree myself & up to the
> >
> > Clearly didn't finish this comment. It was meant to say "up to the QEMU
> > maintainers what they want to do on the QEMU side of things".
> >
> > Thanks,
> > Conor.
>
>
>
> --
> Regards,
> Atish
>
Anup Patel July 19, 2023, 5:39 a.m. UTC | #21
On Wed, Jul 19, 2023 at 7:03 AM Alistair Francis <alistair23@gmail.com> wrote:
>
> On Sat, Jul 15, 2023 at 7:14 PM Atish Patra <atishp@atishpatra.org> wrote:
> >
> > On Fri, Jul 14, 2023 at 5:29 AM Conor Dooley <conor@kernel.org> wrote:
> > >
> > > On Fri, Jul 14, 2023 at 11:19:34AM +0100, Conor Dooley wrote:
> > > > On Fri, Jul 14, 2023 at 10:00:19AM +0530, Anup Patel wrote:
> > > >
> > > > > > > OpenSBI v1.3
> > > > > > >    ____                    _____ ____ _____
> > > > > > >   / __ \                  / ____|  _ \_   _|
> > > > > > >  | |  | |_ __   ___ _ __ | (___ | |_) || |
> > > > > > >  | |  | | '_ \ / _ \ '_ \ \___ \|  _ < | |
> > > > > > >  | |__| | |_) |  __/ | | |____) | |_) || |_
> > > > > > >   \____/| .__/ \___|_| |_|_____/|___/_____|
> > > > > > >         | |
> > > > > > >         |_|
> > > > > > >
> > > > > > > init_coldboot: ipi init failed (error -1009)
> > > > > > >
> > > > > > > Just to note, because we use our own firmware that vendors in OpenSBI
> > > > > > > and compiles only a significantly cut down number of files from it, we
> > > > > > > do not use the fw_dynamic etc flow on our hardware. As a result, we have
> > > > > > > not tested v1.3, nor do we have any immediate plans to change our
> > > > > > > platform firmware to vendor v1.3 either.
> > > > > > >
> > > > > > > I unless there's something obvious to you, it sounds like I will need to
> > > > > > > go and bisect OpenSBI. That's a job for another day though, given the
> > > > > > > time.
> > > > > > >
> > > > >
> > > > > The real issue is some CPU/HART DT nodes marked as disabled in the
> > > > > DT passed to OpenSBI 1.3.
> > > > >
> > > > > This issue does not exist in any of the DTs generated by QEMU but some
> > > > > of the DTs in the kernel (such as microchip and SiFive board DTs) have
> > > > > the E-core disabled.
> > > > >
> > > > > I had discovered this issue in a totally different context after the OpenSBI 1.3
> > > > > release happened. This issue is already fixed in the latest OpenSBI by the
> > > > > following commit c6a35733b74aeff612398f274ed19a74f81d1f37 ("lib: utils:
> > > > > Fix sbi_hartid_to_scratch() usage in ACLINT drivers").
> > > >
> > > > Great, thanks Anup! I thought I had tested tip-of-tree too, but
> > > > obviously not.
> > > >
> > > > > I always assumed that Microchip hss.bin is the preferred BIOS for the
> > > > > QEMU microchip-icicle-kit machine but I guess that's not true.
> > > >
> > > > Unfortunately the HSS has not worked in QEMU for a long time, and while
> > > > I would love to fix it, but am pretty stretched for spare time to begin
> > > > with.
> > > > I usually just do direct kernel boots, which use the OpenSBI that comes
> > > > with QEMU, as I am sure you already know :)
> > > >
> > > > > At this point, you can either:
> > > > > 1) Use latest OpenSBI on QEMU microchip-icicle-kit machine
> > >
> > > I forgot to reply to this point, wondering what should be done with
> > > QEMU. Bumping to v1.3 in QEMU introduces a regression here, regardless
> > > of whether I can go and build a fixed version of OpenSBI.
> > >
> > FYI: The no-map fix went in OpenSBI v1.3. Without the upgrade, any
> > user using the latest kernel (> v6.4)
> > may hit those random linear map related issues (in hibernation or EFI
> > booting path).
> >
> > There are three possible scenarios:
> >
> > 1. Upgrade to OpenSBI v1.3: Any user of microchip-icicle-kit machine
> > or sifive fu540 machine users
> > may hit this issue if the device tree has the disabled hart (e core).
> > 2. No upgrade to OpenSBI v1.2. Any user using hibernation or UEFI may
> > have issues [1]
> > 3. Include a non-release version OpenSBI in Qemu with the fix as an exception.
> >
> > #3 probably deviates from policy and sets a bad precedent. So I am not
> > advocating for it though ;)
> > For both #1 & #2, the solution would be to use the latest OpenSBI in
> > -bios argument instead of the stock one.
> > I could be wrong but my guess is the number of users facing #2 would
> > be higher than #1.
>
> Thanks for that info Atish!
>
> We are stuck in a bad situation.
>
> The best solution would be if OpenSBI can release a 1.3.1, @Anup Patel
> do you think you could do that?

OpenSBI has a major number and minor number in the version but it does
not have release/patch number so best would be to treat OpenSBI vX.Y.Z
as bug fixes on-top-of OpenSBI vX.Y. In other words, supervisor software
won't be able to differentiate between OpenSBI vX.Y.Z and OpenSBI vX.Y
using sbi_get_impl_version().

There are only three commits between the ACLINT fix and OpenSBI v1.3
so as one-of case I will go ahead create OpenSBI v1.3.1 containing only
four commits on-top of OpenSBI v1.3

Does this sound okay ?

>
> Otherwise I think we should stick with OpenSBI 1.3. Considering that
> it fixes UEFI boot issues for the virt board (which would be the most
> used) it seems like a best call to make. People using the other boards
> are unfortunately stuck building their own OpenSBI release.
>
> If there is no OpenSBI 1.3.1 release we should add something to the
> release notes. @Conor Dooley are you able to give a clear sentence on
> how the boot fails?
>
> Alistair
>
> >
> > [1] https://lore.kernel.org/linux-riscv/20230625140931.1266216-1-songshuaishuai@tinylab.org/
> > > > > 2) Ensure CPU0 DT node is enabled in DT when booting on QEMU
> > > > >     microchip-icicle-kit machine with OpenSBI 1.3
> > > >
> > > > Will OpenSBI disable it? If not, I think option 2) needs to be remove
> > > > the DT node. I'll just use tip-of-tree myself & up to the
> > >
> > > Clearly didn't finish this comment. It was meant to say "up to the QEMU
> > > maintainers what they want to do on the QEMU side of things".
> > >
> > > Thanks,
> > > Conor.
> >
> >
> >
> > --
> > Regards,
> > Atish
> >

Regards,
Anup
Conor Dooley July 19, 2023, 7:07 a.m. UTC | #22
On Wed, Jul 19, 2023 at 11:32:55AM +1000, Alistair Francis wrote:

> If there is no OpenSBI 1.3.1 release we should add something to the
> release notes. @Conor Dooley are you able to give a clear sentence on
> how the boot fails?

Uhh, I'll give it a shot, but hopefully it is not required :)

In version v1.3, OpenSBI's aclint drivers fail to initialise if they
encounter a disabled CPU node in the devicetree. Attempting to boot
using, for example, the Linux kernel's PolarFire SoC or Freedom U540
devicetrees, will fail with the error:
"init_coldboot: ipi init failed (error -1009)"
Please see OpenSBI commit c6a3573 ("lib: utils: Fix sbi_hartid_to_scratch()
usage in ACLINT drivers")
<https://github.com/riscv-software-src/opensbi/commit/c6a35733b74aeff612398f274ed19a74f81d1f37>
for the fix.
Alistair Francis July 19, 2023, 9:53 a.m. UTC | #23
On Wed, Jul 19, 2023 at 3:39 PM Anup Patel <anup@brainfault.org> wrote:
>
> On Wed, Jul 19, 2023 at 7:03 AM Alistair Francis <alistair23@gmail.com> wrote:
> >
> > On Sat, Jul 15, 2023 at 7:14 PM Atish Patra <atishp@atishpatra.org> wrote:
> > >
> > > On Fri, Jul 14, 2023 at 5:29 AM Conor Dooley <conor@kernel.org> wrote:
> > > >
> > > > On Fri, Jul 14, 2023 at 11:19:34AM +0100, Conor Dooley wrote:
> > > > > On Fri, Jul 14, 2023 at 10:00:19AM +0530, Anup Patel wrote:
> > > > >
> > > > > > > > OpenSBI v1.3
> > > > > > > >    ____                    _____ ____ _____
> > > > > > > >   / __ \                  / ____|  _ \_   _|
> > > > > > > >  | |  | |_ __   ___ _ __ | (___ | |_) || |
> > > > > > > >  | |  | | '_ \ / _ \ '_ \ \___ \|  _ < | |
> > > > > > > >  | |__| | |_) |  __/ | | |____) | |_) || |_
> > > > > > > >   \____/| .__/ \___|_| |_|_____/|___/_____|
> > > > > > > >         | |
> > > > > > > >         |_|
> > > > > > > >
> > > > > > > > init_coldboot: ipi init failed (error -1009)
> > > > > > > >
> > > > > > > > Just to note, because we use our own firmware that vendors in OpenSBI
> > > > > > > > and compiles only a significantly cut down number of files from it, we
> > > > > > > > do not use the fw_dynamic etc flow on our hardware. As a result, we have
> > > > > > > > not tested v1.3, nor do we have any immediate plans to change our
> > > > > > > > platform firmware to vendor v1.3 either.
> > > > > > > >
> > > > > > > > I unless there's something obvious to you, it sounds like I will need to
> > > > > > > > go and bisect OpenSBI. That's a job for another day though, given the
> > > > > > > > time.
> > > > > > > >
> > > > > >
> > > > > > The real issue is some CPU/HART DT nodes marked as disabled in the
> > > > > > DT passed to OpenSBI 1.3.
> > > > > >
> > > > > > This issue does not exist in any of the DTs generated by QEMU but some
> > > > > > of the DTs in the kernel (such as microchip and SiFive board DTs) have
> > > > > > the E-core disabled.
> > > > > >
> > > > > > I had discovered this issue in a totally different context after the OpenSBI 1.3
> > > > > > release happened. This issue is already fixed in the latest OpenSBI by the
> > > > > > following commit c6a35733b74aeff612398f274ed19a74f81d1f37 ("lib: utils:
> > > > > > Fix sbi_hartid_to_scratch() usage in ACLINT drivers").
> > > > >
> > > > > Great, thanks Anup! I thought I had tested tip-of-tree too, but
> > > > > obviously not.
> > > > >
> > > > > > I always assumed that Microchip hss.bin is the preferred BIOS for the
> > > > > > QEMU microchip-icicle-kit machine but I guess that's not true.
> > > > >
> > > > > Unfortunately the HSS has not worked in QEMU for a long time, and while
> > > > > I would love to fix it, but am pretty stretched for spare time to begin
> > > > > with.
> > > > > I usually just do direct kernel boots, which use the OpenSBI that comes
> > > > > with QEMU, as I am sure you already know :)
> > > > >
> > > > > > At this point, you can either:
> > > > > > 1) Use latest OpenSBI on QEMU microchip-icicle-kit machine
> > > >
> > > > I forgot to reply to this point, wondering what should be done with
> > > > QEMU. Bumping to v1.3 in QEMU introduces a regression here, regardless
> > > > of whether I can go and build a fixed version of OpenSBI.
> > > >
> > > FYI: The no-map fix went in OpenSBI v1.3. Without the upgrade, any
> > > user using the latest kernel (> v6.4)
> > > may hit those random linear map related issues (in hibernation or EFI
> > > booting path).
> > >
> > > There are three possible scenarios:
> > >
> > > 1. Upgrade to OpenSBI v1.3: Any user of microchip-icicle-kit machine
> > > or sifive fu540 machine users
> > > may hit this issue if the device tree has the disabled hart (e core).
> > > 2. No upgrade to OpenSBI v1.2. Any user using hibernation or UEFI may
> > > have issues [1]
> > > 3. Include a non-release version OpenSBI in Qemu with the fix as an exception.
> > >
> > > #3 probably deviates from policy and sets a bad precedent. So I am not
> > > advocating for it though ;)
> > > For both #1 & #2, the solution would be to use the latest OpenSBI in
> > > -bios argument instead of the stock one.
> > > I could be wrong but my guess is the number of users facing #2 would
> > > be higher than #1.
> >
> > Thanks for that info Atish!
> >
> > We are stuck in a bad situation.
> >
> > The best solution would be if OpenSBI can release a 1.3.1, @Anup Patel
> > do you think you could do that?
>
> OpenSBI has a major number and minor number in the version but it does
> not have release/patch number so best would be to treat OpenSBI vX.Y.Z
> as bug fixes on-top-of OpenSBI vX.Y. In other words, supervisor software
> won't be able to differentiate between OpenSBI vX.Y.Z and OpenSBI vX.Y
> using sbi_get_impl_version().
>
> There are only three commits between the ACLINT fix and OpenSBI v1.3
> so as one-of case I will go ahead create OpenSBI v1.3.1 containing only
> four commits on-top of OpenSBI v1.3
>
> Does this sound okay ?

That sounds fine to me. It fixes the issue for the Microsemi board and
it's a very small change between 1.3 and 1.3.1

Alistair

>
> >
> > Otherwise I think we should stick with OpenSBI 1.3. Considering that
> > it fixes UEFI boot issues for the virt board (which would be the most
> > used) it seems like a best call to make. People using the other boards
> > are unfortunately stuck building their own OpenSBI release.
> >
> > If there is no OpenSBI 1.3.1 release we should add something to the
> > release notes. @Conor Dooley are you able to give a clear sentence on
> > how the boot fails?
> >
> > Alistair
> >
> > >
> > > [1] https://lore.kernel.org/linux-riscv/20230625140931.1266216-1-songshuaishuai@tinylab.org/
> > > > > > 2) Ensure CPU0 DT node is enabled in DT when booting on QEMU
> > > > > >     microchip-icicle-kit machine with OpenSBI 1.3
> > > > >
> > > > > Will OpenSBI disable it? If not, I think option 2) needs to be remove
> > > > > the DT node. I'll just use tip-of-tree myself & up to the
> > > >
> > > > Clearly didn't finish this comment. It was meant to say "up to the QEMU
> > > > maintainers what they want to do on the QEMU side of things".
> > > >
> > > > Thanks,
> > > > Conor.
> > >
> > >
> > >
> > > --
> > > Regards,
> > > Atish
> > >
>
> Regards,
> Anup
Anup Patel July 19, 2023, 3:21 p.m. UTC | #24
On Wed, Jul 19, 2023 at 3:23 PM Alistair Francis <alistair23@gmail.com> wrote:
>
> On Wed, Jul 19, 2023 at 3:39 PM Anup Patel <anup@brainfault.org> wrote:
> >
> > On Wed, Jul 19, 2023 at 7:03 AM Alistair Francis <alistair23@gmail.com> wrote:
> > >
> > > On Sat, Jul 15, 2023 at 7:14 PM Atish Patra <atishp@atishpatra.org> wrote:
> > > >
> > > > On Fri, Jul 14, 2023 at 5:29 AM Conor Dooley <conor@kernel.org> wrote:
> > > > >
> > > > > On Fri, Jul 14, 2023 at 11:19:34AM +0100, Conor Dooley wrote:
> > > > > > On Fri, Jul 14, 2023 at 10:00:19AM +0530, Anup Patel wrote:
> > > > > >
> > > > > > > > > OpenSBI v1.3
> > > > > > > > >    ____                    _____ ____ _____
> > > > > > > > >   / __ \                  / ____|  _ \_   _|
> > > > > > > > >  | |  | |_ __   ___ _ __ | (___ | |_) || |
> > > > > > > > >  | |  | | '_ \ / _ \ '_ \ \___ \|  _ < | |
> > > > > > > > >  | |__| | |_) |  __/ | | |____) | |_) || |_
> > > > > > > > >   \____/| .__/ \___|_| |_|_____/|___/_____|
> > > > > > > > >         | |
> > > > > > > > >         |_|
> > > > > > > > >
> > > > > > > > > init_coldboot: ipi init failed (error -1009)
> > > > > > > > >
> > > > > > > > > Just to note, because we use our own firmware that vendors in OpenSBI
> > > > > > > > > and compiles only a significantly cut down number of files from it, we
> > > > > > > > > do not use the fw_dynamic etc flow on our hardware. As a result, we have
> > > > > > > > > not tested v1.3, nor do we have any immediate plans to change our
> > > > > > > > > platform firmware to vendor v1.3 either.
> > > > > > > > >
> > > > > > > > > I unless there's something obvious to you, it sounds like I will need to
> > > > > > > > > go and bisect OpenSBI. That's a job for another day though, given the
> > > > > > > > > time.
> > > > > > > > >
> > > > > > >
> > > > > > > The real issue is some CPU/HART DT nodes marked as disabled in the
> > > > > > > DT passed to OpenSBI 1.3.
> > > > > > >
> > > > > > > This issue does not exist in any of the DTs generated by QEMU but some
> > > > > > > of the DTs in the kernel (such as microchip and SiFive board DTs) have
> > > > > > > the E-core disabled.
> > > > > > >
> > > > > > > I had discovered this issue in a totally different context after the OpenSBI 1.3
> > > > > > > release happened. This issue is already fixed in the latest OpenSBI by the
> > > > > > > following commit c6a35733b74aeff612398f274ed19a74f81d1f37 ("lib: utils:
> > > > > > > Fix sbi_hartid_to_scratch() usage in ACLINT drivers").
> > > > > >
> > > > > > Great, thanks Anup! I thought I had tested tip-of-tree too, but
> > > > > > obviously not.
> > > > > >
> > > > > > > I always assumed that Microchip hss.bin is the preferred BIOS for the
> > > > > > > QEMU microchip-icicle-kit machine but I guess that's not true.
> > > > > >
> > > > > > Unfortunately the HSS has not worked in QEMU for a long time, and while
> > > > > > I would love to fix it, but am pretty stretched for spare time to begin
> > > > > > with.
> > > > > > I usually just do direct kernel boots, which use the OpenSBI that comes
> > > > > > with QEMU, as I am sure you already know :)
> > > > > >
> > > > > > > At this point, you can either:
> > > > > > > 1) Use latest OpenSBI on QEMU microchip-icicle-kit machine
> > > > >
> > > > > I forgot to reply to this point, wondering what should be done with
> > > > > QEMU. Bumping to v1.3 in QEMU introduces a regression here, regardless
> > > > > of whether I can go and build a fixed version of OpenSBI.
> > > > >
> > > > FYI: The no-map fix went in OpenSBI v1.3. Without the upgrade, any
> > > > user using the latest kernel (> v6.4)
> > > > may hit those random linear map related issues (in hibernation or EFI
> > > > booting path).
> > > >
> > > > There are three possible scenarios:
> > > >
> > > > 1. Upgrade to OpenSBI v1.3: Any user of microchip-icicle-kit machine
> > > > or sifive fu540 machine users
> > > > may hit this issue if the device tree has the disabled hart (e core).
> > > > 2. No upgrade to OpenSBI v1.2. Any user using hibernation or UEFI may
> > > > have issues [1]
> > > > 3. Include a non-release version OpenSBI in Qemu with the fix as an exception.
> > > >
> > > > #3 probably deviates from policy and sets a bad precedent. So I am not
> > > > advocating for it though ;)
> > > > For both #1 & #2, the solution would be to use the latest OpenSBI in
> > > > -bios argument instead of the stock one.
> > > > I could be wrong but my guess is the number of users facing #2 would
> > > > be higher than #1.
> > >
> > > Thanks for that info Atish!
> > >
> > > We are stuck in a bad situation.
> > >
> > > The best solution would be if OpenSBI can release a 1.3.1, @Anup Patel
> > > do you think you could do that?
> >
> > OpenSBI has a major number and minor number in the version but it does
> > not have release/patch number so best would be to treat OpenSBI vX.Y.Z
> > as bug fixes on-top-of OpenSBI vX.Y. In other words, supervisor software
> > won't be able to differentiate between OpenSBI vX.Y.Z and OpenSBI vX.Y
> > using sbi_get_impl_version().
> >
> > There are only three commits between the ACLINT fix and OpenSBI v1.3
> > so as one-of case I will go ahead create OpenSBI v1.3.1 containing only
> > four commits on-top of OpenSBI v1.3
> >
> > Does this sound okay ?
>
> That sounds fine to me. It fixes the issue for the Microsemi board and
> it's a very small change between 1.3 and 1.3.1

Please check
https://github.com/riscv-software-src/opensbi/releases/tag/v1.3.1

I hope this helps.

Regards,
Anup

>
> Alistair
>
> >
> > >
> > > Otherwise I think we should stick with OpenSBI 1.3. Considering that
> > > it fixes UEFI boot issues for the virt board (which would be the most
> > > used) it seems like a best call to make. People using the other boards
> > > are unfortunately stuck building their own OpenSBI release.
> > >
> > > If there is no OpenSBI 1.3.1 release we should add something to the
> > > release notes. @Conor Dooley are you able to give a clear sentence on
> > > how the boot fails?
> > >
> > > Alistair
> > >
> > > >
> > > > [1] https://lore.kernel.org/linux-riscv/20230625140931.1266216-1-songshuaishuai@tinylab.org/
> > > > > > > 2) Ensure CPU0 DT node is enabled in DT when booting on QEMU
> > > > > > >     microchip-icicle-kit machine with OpenSBI 1.3
> > > > > >
> > > > > > Will OpenSBI disable it? If not, I think option 2) needs to be remove
> > > > > > the DT node. I'll just use tip-of-tree myself & up to the
> > > > >
> > > > > Clearly didn't finish this comment. It was meant to say "up to the QEMU
> > > > > maintainers what they want to do on the QEMU side of things".
> > > > >
> > > > > Thanks,
> > > > > Conor.
> > > >
> > > >
> > > >
> > > > --
> > > > Regards,
> > > > Atish
> > > >
> >
> > Regards,
> > Anup
Bin Meng July 19, 2023, 3:45 p.m. UTC | #25
On Wed, Jul 19, 2023 at 11:22 PM Anup Patel <anup@brainfault.org> wrote:
>
> On Wed, Jul 19, 2023 at 3:23 PM Alistair Francis <alistair23@gmail.com> wrote:
> >
> > On Wed, Jul 19, 2023 at 3:39 PM Anup Patel <anup@brainfault.org> wrote:
> > >
> > > On Wed, Jul 19, 2023 at 7:03 AM Alistair Francis <alistair23@gmail.com> wrote:
> > > >
> > > > On Sat, Jul 15, 2023 at 7:14 PM Atish Patra <atishp@atishpatra.org> wrote:
> > > > >
> > > > > On Fri, Jul 14, 2023 at 5:29 AM Conor Dooley <conor@kernel.org> wrote:
> > > > > >
> > > > > > On Fri, Jul 14, 2023 at 11:19:34AM +0100, Conor Dooley wrote:
> > > > > > > On Fri, Jul 14, 2023 at 10:00:19AM +0530, Anup Patel wrote:
> > > > > > >
> > > > > > > > > > OpenSBI v1.3
> > > > > > > > > >    ____                    _____ ____ _____
> > > > > > > > > >   / __ \                  / ____|  _ \_   _|
> > > > > > > > > >  | |  | |_ __   ___ _ __ | (___ | |_) || |
> > > > > > > > > >  | |  | | '_ \ / _ \ '_ \ \___ \|  _ < | |
> > > > > > > > > >  | |__| | |_) |  __/ | | |____) | |_) || |_
> > > > > > > > > >   \____/| .__/ \___|_| |_|_____/|___/_____|
> > > > > > > > > >         | |
> > > > > > > > > >         |_|
> > > > > > > > > >
> > > > > > > > > > init_coldboot: ipi init failed (error -1009)
> > > > > > > > > >
> > > > > > > > > > Just to note, because we use our own firmware that vendors in OpenSBI
> > > > > > > > > > and compiles only a significantly cut down number of files from it, we
> > > > > > > > > > do not use the fw_dynamic etc flow on our hardware. As a result, we have
> > > > > > > > > > not tested v1.3, nor do we have any immediate plans to change our
> > > > > > > > > > platform firmware to vendor v1.3 either.
> > > > > > > > > >
> > > > > > > > > > I unless there's something obvious to you, it sounds like I will need to
> > > > > > > > > > go and bisect OpenSBI. That's a job for another day though, given the
> > > > > > > > > > time.
> > > > > > > > > >
> > > > > > > >
> > > > > > > > The real issue is some CPU/HART DT nodes marked as disabled in the
> > > > > > > > DT passed to OpenSBI 1.3.
> > > > > > > >
> > > > > > > > This issue does not exist in any of the DTs generated by QEMU but some
> > > > > > > > of the DTs in the kernel (such as microchip and SiFive board DTs) have
> > > > > > > > the E-core disabled.
> > > > > > > >
> > > > > > > > I had discovered this issue in a totally different context after the OpenSBI 1.3
> > > > > > > > release happened. This issue is already fixed in the latest OpenSBI by the
> > > > > > > > following commit c6a35733b74aeff612398f274ed19a74f81d1f37 ("lib: utils:
> > > > > > > > Fix sbi_hartid_to_scratch() usage in ACLINT drivers").
> > > > > > >
> > > > > > > Great, thanks Anup! I thought I had tested tip-of-tree too, but
> > > > > > > obviously not.
> > > > > > >
> > > > > > > > I always assumed that Microchip hss.bin is the preferred BIOS for the
> > > > > > > > QEMU microchip-icicle-kit machine but I guess that's not true.
> > > > > > >
> > > > > > > Unfortunately the HSS has not worked in QEMU for a long time, and while
> > > > > > > I would love to fix it, but am pretty stretched for spare time to begin
> > > > > > > with.
> > > > > > > I usually just do direct kernel boots, which use the OpenSBI that comes
> > > > > > > with QEMU, as I am sure you already know :)
> > > > > > >
> > > > > > > > At this point, you can either:
> > > > > > > > 1) Use latest OpenSBI on QEMU microchip-icicle-kit machine
> > > > > >
> > > > > > I forgot to reply to this point, wondering what should be done with
> > > > > > QEMU. Bumping to v1.3 in QEMU introduces a regression here, regardless
> > > > > > of whether I can go and build a fixed version of OpenSBI.
> > > > > >
> > > > > FYI: The no-map fix went in OpenSBI v1.3. Without the upgrade, any
> > > > > user using the latest kernel (> v6.4)
> > > > > may hit those random linear map related issues (in hibernation or EFI
> > > > > booting path).
> > > > >
> > > > > There are three possible scenarios:
> > > > >
> > > > > 1. Upgrade to OpenSBI v1.3: Any user of microchip-icicle-kit machine
> > > > > or sifive fu540 machine users
> > > > > may hit this issue if the device tree has the disabled hart (e core).
> > > > > 2. No upgrade to OpenSBI v1.2. Any user using hibernation or UEFI may
> > > > > have issues [1]
> > > > > 3. Include a non-release version OpenSBI in Qemu with the fix as an exception.
> > > > >
> > > > > #3 probably deviates from policy and sets a bad precedent. So I am not
> > > > > advocating for it though ;)
> > > > > For both #1 & #2, the solution would be to use the latest OpenSBI in
> > > > > -bios argument instead of the stock one.
> > > > > I could be wrong but my guess is the number of users facing #2 would
> > > > > be higher than #1.
> > > >
> > > > Thanks for that info Atish!
> > > >
> > > > We are stuck in a bad situation.
> > > >
> > > > The best solution would be if OpenSBI can release a 1.3.1, @Anup Patel
> > > > do you think you could do that?
> > >
> > > OpenSBI has a major number and minor number in the version but it does
> > > not have release/patch number so best would be to treat OpenSBI vX.Y.Z
> > > as bug fixes on-top-of OpenSBI vX.Y. In other words, supervisor software
> > > won't be able to differentiate between OpenSBI vX.Y.Z and OpenSBI vX.Y
> > > using sbi_get_impl_version().
> > >
> > > There are only three commits between the ACLINT fix and OpenSBI v1.3
> > > so as one-of case I will go ahead create OpenSBI v1.3.1 containing only
> > > four commits on-top of OpenSBI v1.3
> > >
> > > Does this sound okay ?
> >
> > That sounds fine to me. It fixes the issue for the Microsemi board and
> > it's a very small change between 1.3 and 1.3.1
>
> Please check
> https://github.com/riscv-software-src/opensbi/releases/tag/v1.3.1
>
> I hope this helps.

Hi Alistair,

Do we need to update QEMU's opensbi binaries to v1.3.1?

Hi Anup,

Somehow I cannot see the 'tag' v1.3.1 being populated in the opensbi
git repo. Am I missing anything?

Regards,
Bin
Anup Patel July 19, 2023, 4:10 p.m. UTC | #26
Hi Bin,

On Wed, Jul 19, 2023 at 9:15 PM Bin Meng <bmeng.cn@gmail.com> wrote:
>
> On Wed, Jul 19, 2023 at 11:22 PM Anup Patel <anup@brainfault.org> wrote:
> >
> > On Wed, Jul 19, 2023 at 3:23 PM Alistair Francis <alistair23@gmail.com> wrote:
> > >
> > > On Wed, Jul 19, 2023 at 3:39 PM Anup Patel <anup@brainfault.org> wrote:
> > > >
> > > > On Wed, Jul 19, 2023 at 7:03 AM Alistair Francis <alistair23@gmail.com> wrote:
> > > > >
> > > > > On Sat, Jul 15, 2023 at 7:14 PM Atish Patra <atishp@atishpatra.org> wrote:
> > > > > >
> > > > > > On Fri, Jul 14, 2023 at 5:29 AM Conor Dooley <conor@kernel.org> wrote:
> > > > > > >
> > > > > > > On Fri, Jul 14, 2023 at 11:19:34AM +0100, Conor Dooley wrote:
> > > > > > > > On Fri, Jul 14, 2023 at 10:00:19AM +0530, Anup Patel wrote:
> > > > > > > >
> > > > > > > > > > > OpenSBI v1.3
> > > > > > > > > > >    ____                    _____ ____ _____
> > > > > > > > > > >   / __ \                  / ____|  _ \_   _|
> > > > > > > > > > >  | |  | |_ __   ___ _ __ | (___ | |_) || |
> > > > > > > > > > >  | |  | | '_ \ / _ \ '_ \ \___ \|  _ < | |
> > > > > > > > > > >  | |__| | |_) |  __/ | | |____) | |_) || |_
> > > > > > > > > > >   \____/| .__/ \___|_| |_|_____/|___/_____|
> > > > > > > > > > >         | |
> > > > > > > > > > >         |_|
> > > > > > > > > > >
> > > > > > > > > > > init_coldboot: ipi init failed (error -1009)
> > > > > > > > > > >
> > > > > > > > > > > Just to note, because we use our own firmware that vendors in OpenSBI
> > > > > > > > > > > and compiles only a significantly cut down number of files from it, we
> > > > > > > > > > > do not use the fw_dynamic etc flow on our hardware. As a result, we have
> > > > > > > > > > > not tested v1.3, nor do we have any immediate plans to change our
> > > > > > > > > > > platform firmware to vendor v1.3 either.
> > > > > > > > > > >
> > > > > > > > > > > I unless there's something obvious to you, it sounds like I will need to
> > > > > > > > > > > go and bisect OpenSBI. That's a job for another day though, given the
> > > > > > > > > > > time.
> > > > > > > > > > >
> > > > > > > > >
> > > > > > > > > The real issue is some CPU/HART DT nodes marked as disabled in the
> > > > > > > > > DT passed to OpenSBI 1.3.
> > > > > > > > >
> > > > > > > > > This issue does not exist in any of the DTs generated by QEMU but some
> > > > > > > > > of the DTs in the kernel (such as microchip and SiFive board DTs) have
> > > > > > > > > the E-core disabled.
> > > > > > > > >
> > > > > > > > > I had discovered this issue in a totally different context after the OpenSBI 1.3
> > > > > > > > > release happened. This issue is already fixed in the latest OpenSBI by the
> > > > > > > > > following commit c6a35733b74aeff612398f274ed19a74f81d1f37 ("lib: utils:
> > > > > > > > > Fix sbi_hartid_to_scratch() usage in ACLINT drivers").
> > > > > > > >
> > > > > > > > Great, thanks Anup! I thought I had tested tip-of-tree too, but
> > > > > > > > obviously not.
> > > > > > > >
> > > > > > > > > I always assumed that Microchip hss.bin is the preferred BIOS for the
> > > > > > > > > QEMU microchip-icicle-kit machine but I guess that's not true.
> > > > > > > >
> > > > > > > > Unfortunately the HSS has not worked in QEMU for a long time, and while
> > > > > > > > I would love to fix it, but am pretty stretched for spare time to begin
> > > > > > > > with.
> > > > > > > > I usually just do direct kernel boots, which use the OpenSBI that comes
> > > > > > > > with QEMU, as I am sure you already know :)
> > > > > > > >
> > > > > > > > > At this point, you can either:
> > > > > > > > > 1) Use latest OpenSBI on QEMU microchip-icicle-kit machine
> > > > > > >
> > > > > > > I forgot to reply to this point, wondering what should be done with
> > > > > > > QEMU. Bumping to v1.3 in QEMU introduces a regression here, regardless
> > > > > > > of whether I can go and build a fixed version of OpenSBI.
> > > > > > >
> > > > > > FYI: The no-map fix went in OpenSBI v1.3. Without the upgrade, any
> > > > > > user using the latest kernel (> v6.4)
> > > > > > may hit those random linear map related issues (in hibernation or EFI
> > > > > > booting path).
> > > > > >
> > > > > > There are three possible scenarios:
> > > > > >
> > > > > > 1. Upgrade to OpenSBI v1.3: Any user of microchip-icicle-kit machine
> > > > > > or sifive fu540 machine users
> > > > > > may hit this issue if the device tree has the disabled hart (e core).
> > > > > > 2. No upgrade to OpenSBI v1.2. Any user using hibernation or UEFI may
> > > > > > have issues [1]
> > > > > > 3. Include a non-release version OpenSBI in Qemu with the fix as an exception.
> > > > > >
> > > > > > #3 probably deviates from policy and sets a bad precedent. So I am not
> > > > > > advocating for it though ;)
> > > > > > For both #1 & #2, the solution would be to use the latest OpenSBI in
> > > > > > -bios argument instead of the stock one.
> > > > > > I could be wrong but my guess is the number of users facing #2 would
> > > > > > be higher than #1.
> > > > >
> > > > > Thanks for that info Atish!
> > > > >
> > > > > We are stuck in a bad situation.
> > > > >
> > > > > The best solution would be if OpenSBI can release a 1.3.1, @Anup Patel
> > > > > do you think you could do that?
> > > >
> > > > OpenSBI has a major number and minor number in the version but it does
> > > > not have release/patch number so best would be to treat OpenSBI vX.Y.Z
> > > > as bug fixes on-top-of OpenSBI vX.Y. In other words, supervisor software
> > > > won't be able to differentiate between OpenSBI vX.Y.Z and OpenSBI vX.Y
> > > > using sbi_get_impl_version().
> > > >
> > > > There are only three commits between the ACLINT fix and OpenSBI v1.3
> > > > so as one-of case I will go ahead create OpenSBI v1.3.1 containing only
> > > > four commits on-top of OpenSBI v1.3
> > > >
> > > > Does this sound okay ?
> > >
> > > That sounds fine to me. It fixes the issue for the Microsemi board and
> > > it's a very small change between 1.3 and 1.3.1
> >
> > Please check
> > https://github.com/riscv-software-src/opensbi/releases/tag/v1.3.1
> >
> > I hope this helps.
>
> Hi Alistair,
>
> Do we need to update QEMU's opensbi binaries to v1.3.1?
>
> Hi Anup,
>
> Somehow I cannot see the 'tag' v1.3.1 being populated in the opensbi
> git repo. Am I missing anything?

There is a v1.3.1 tag in https://github.com/riscv-software-src/opensbi
(Try cloning the repo again?)

The commit history of v1.3.1 is v1.3 tag + 5 cherry picked commits
which means the commit history of the master branch is not the same
as the commit history of v1.3.1.

Regards,
Anup
Andreas Schwab July 19, 2023, 4:17 p.m. UTC | #27
On Jul 19 2023, Bin Meng wrote:

>> Please check
>> https://github.com/riscv-software-src/opensbi/releases/tag/v1.3.1
>>
>> I hope this helps.
>
> Hi Alistair,
>
> Do we need to update QEMU's opensbi binaries to v1.3.1?
>
> Hi Anup,
>
> Somehow I cannot see the 'tag' v1.3.1 being populated in the opensbi
> git repo. Am I missing anything?

You need to run git fetch --tags, because the tag is not part of any
branch, thus not fetched automatically.
Bin Meng July 19, 2023, 4:18 p.m. UTC | #28
Hi Anup,

On Thu, Jul 20, 2023 at 12:10 AM Anup Patel <apatel@ventanamicro.com> wrote:
>
> Hi Bin,
>
> On Wed, Jul 19, 2023 at 9:15 PM Bin Meng <bmeng.cn@gmail.com> wrote:
> >
> > On Wed, Jul 19, 2023 at 11:22 PM Anup Patel <anup@brainfault.org> wrote:
> > >
> > > On Wed, Jul 19, 2023 at 3:23 PM Alistair Francis <alistair23@gmail.com> wrote:
> > > >
> > > > On Wed, Jul 19, 2023 at 3:39 PM Anup Patel <anup@brainfault.org> wrote:
> > > > >
> > > > > On Wed, Jul 19, 2023 at 7:03 AM Alistair Francis <alistair23@gmail.com> wrote:
> > > > > >
> > > > > > On Sat, Jul 15, 2023 at 7:14 PM Atish Patra <atishp@atishpatra.org> wrote:
> > > > > > >
> > > > > > > On Fri, Jul 14, 2023 at 5:29 AM Conor Dooley <conor@kernel.org> wrote:
> > > > > > > >
> > > > > > > > On Fri, Jul 14, 2023 at 11:19:34AM +0100, Conor Dooley wrote:
> > > > > > > > > On Fri, Jul 14, 2023 at 10:00:19AM +0530, Anup Patel wrote:
> > > > > > > > >
> > > > > > > > > > > > OpenSBI v1.3
> > > > > > > > > > > >    ____                    _____ ____ _____
> > > > > > > > > > > >   / __ \                  / ____|  _ \_   _|
> > > > > > > > > > > >  | |  | |_ __   ___ _ __ | (___ | |_) || |
> > > > > > > > > > > >  | |  | | '_ \ / _ \ '_ \ \___ \|  _ < | |
> > > > > > > > > > > >  | |__| | |_) |  __/ | | |____) | |_) || |_
> > > > > > > > > > > >   \____/| .__/ \___|_| |_|_____/|___/_____|
> > > > > > > > > > > >         | |
> > > > > > > > > > > >         |_|
> > > > > > > > > > > >
> > > > > > > > > > > > init_coldboot: ipi init failed (error -1009)
> > > > > > > > > > > >
> > > > > > > > > > > > Just to note, because we use our own firmware that vendors in OpenSBI
> > > > > > > > > > > > and compiles only a significantly cut down number of files from it, we
> > > > > > > > > > > > do not use the fw_dynamic etc flow on our hardware. As a result, we have
> > > > > > > > > > > > not tested v1.3, nor do we have any immediate plans to change our
> > > > > > > > > > > > platform firmware to vendor v1.3 either.
> > > > > > > > > > > >
> > > > > > > > > > > > I unless there's something obvious to you, it sounds like I will need to
> > > > > > > > > > > > go and bisect OpenSBI. That's a job for another day though, given the
> > > > > > > > > > > > time.
> > > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > The real issue is some CPU/HART DT nodes marked as disabled in the
> > > > > > > > > > DT passed to OpenSBI 1.3.
> > > > > > > > > >
> > > > > > > > > > This issue does not exist in any of the DTs generated by QEMU but some
> > > > > > > > > > of the DTs in the kernel (such as microchip and SiFive board DTs) have
> > > > > > > > > > the E-core disabled.
> > > > > > > > > >
> > > > > > > > > > I had discovered this issue in a totally different context after the OpenSBI 1.3
> > > > > > > > > > release happened. This issue is already fixed in the latest OpenSBI by the
> > > > > > > > > > following commit c6a35733b74aeff612398f274ed19a74f81d1f37 ("lib: utils:
> > > > > > > > > > Fix sbi_hartid_to_scratch() usage in ACLINT drivers").
> > > > > > > > >
> > > > > > > > > Great, thanks Anup! I thought I had tested tip-of-tree too, but
> > > > > > > > > obviously not.
> > > > > > > > >
> > > > > > > > > > I always assumed that Microchip hss.bin is the preferred BIOS for the
> > > > > > > > > > QEMU microchip-icicle-kit machine but I guess that's not true.
> > > > > > > > >
> > > > > > > > > Unfortunately the HSS has not worked in QEMU for a long time, and while
> > > > > > > > > I would love to fix it, but am pretty stretched for spare time to begin
> > > > > > > > > with.
> > > > > > > > > I usually just do direct kernel boots, which use the OpenSBI that comes
> > > > > > > > > with QEMU, as I am sure you already know :)
> > > > > > > > >
> > > > > > > > > > At this point, you can either:
> > > > > > > > > > 1) Use latest OpenSBI on QEMU microchip-icicle-kit machine
> > > > > > > >
> > > > > > > > I forgot to reply to this point, wondering what should be done with
> > > > > > > > QEMU. Bumping to v1.3 in QEMU introduces a regression here, regardless
> > > > > > > > of whether I can go and build a fixed version of OpenSBI.
> > > > > > > >
> > > > > > > FYI: The no-map fix went in OpenSBI v1.3. Without the upgrade, any
> > > > > > > user using the latest kernel (> v6.4)
> > > > > > > may hit those random linear map related issues (in hibernation or EFI
> > > > > > > booting path).
> > > > > > >
> > > > > > > There are three possible scenarios:
> > > > > > >
> > > > > > > 1. Upgrade to OpenSBI v1.3: Any user of microchip-icicle-kit machine
> > > > > > > or sifive fu540 machine users
> > > > > > > may hit this issue if the device tree has the disabled hart (e core).
> > > > > > > 2. No upgrade to OpenSBI v1.2. Any user using hibernation or UEFI may
> > > > > > > have issues [1]
> > > > > > > 3. Include a non-release version OpenSBI in Qemu with the fix as an exception.
> > > > > > >
> > > > > > > #3 probably deviates from policy and sets a bad precedent. So I am not
> > > > > > > advocating for it though ;)
> > > > > > > For both #1 & #2, the solution would be to use the latest OpenSBI in
> > > > > > > -bios argument instead of the stock one.
> > > > > > > I could be wrong but my guess is the number of users facing #2 would
> > > > > > > be higher than #1.
> > > > > >
> > > > > > Thanks for that info Atish!
> > > > > >
> > > > > > We are stuck in a bad situation.
> > > > > >
> > > > > > The best solution would be if OpenSBI can release a 1.3.1, @Anup Patel
> > > > > > do you think you could do that?
> > > > >
> > > > > OpenSBI has a major number and minor number in the version but it does
> > > > > not have release/patch number so best would be to treat OpenSBI vX.Y.Z
> > > > > as bug fixes on-top-of OpenSBI vX.Y. In other words, supervisor software
> > > > > won't be able to differentiate between OpenSBI vX.Y.Z and OpenSBI vX.Y
> > > > > using sbi_get_impl_version().
> > > > >
> > > > > There are only three commits between the ACLINT fix and OpenSBI v1.3
> > > > > so as one-of case I will go ahead create OpenSBI v1.3.1 containing only
> > > > > four commits on-top of OpenSBI v1.3
> > > > >
> > > > > Does this sound okay ?
> > > >
> > > > That sounds fine to me. It fixes the issue for the Microsemi board and
> > > > it's a very small change between 1.3 and 1.3.1
> > >
> > > Please check
> > > https://github.com/riscv-software-src/opensbi/releases/tag/v1.3.1
> > >
> > > I hope this helps.
> >
> > Hi Alistair,
> >
> > Do we need to update QEMU's opensbi binaries to v1.3.1?
> >
> > Hi Anup,
> >
> > Somehow I cannot see the 'tag' v1.3.1 being populated in the opensbi
> > git repo. Am I missing anything?
>
> There is a v1.3.1 tag in https://github.com/riscv-software-src/opensbi
> (Try cloning the repo again?)
>
> The commit history of v1.3.1 is v1.3 tag + 5 cherry picked commits
> which means the commit history of the master branch is not the same
> as the commit history of v1.3.1.

I see. I was seeing a github warning message when I see the last
commit [1] from tag v1.3.1:

"This commit does not belong to any branch on this repository, and may
belong to a fork outside of the repository."

and was misled. Thanks for the hint. I now added "--tags" when I do a
git fetch and now is seeing the new tag.

[1] https://github.com/riscv-software-src/opensbi/commit/057eb10b6d523540012e6947d5c9f63e95244e94

Regards,
Bin
diff mbox series

Patch

diff --git a/target/riscv/cpu-qom.h b/target/riscv/cpu-qom.h
index 04af50983e..f3fbe37a2c 100644
--- a/target/riscv/cpu-qom.h
+++ b/target/riscv/cpu-qom.h
@@ -30,6 +30,7 @@ 
 #define CPU_RESOLVING_TYPE TYPE_RISCV_CPU
 
 #define TYPE_RISCV_CPU_ANY              RISCV_CPU_TYPE_NAME("any")
+#define TYPE_RISCV_CPU_MAX              RISCV_CPU_TYPE_NAME("max")
 #define TYPE_RISCV_CPU_BASE32           RISCV_CPU_TYPE_NAME("rv32")
 #define TYPE_RISCV_CPU_BASE64           RISCV_CPU_TYPE_NAME("rv64")
 #define TYPE_RISCV_CPU_BASE128          RISCV_CPU_TYPE_NAME("x-rv128")
diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index b61465c8c4..125cf096c4 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -248,6 +248,7 @@  static const char * const riscv_intr_names[] = {
 };
 
 static void riscv_cpu_add_user_properties(Object *obj);
+static void riscv_init_max_cpu_extensions(Object *obj);
 
 const char *riscv_cpu_get_trap_name(target_ulong cause, bool async)
 {
@@ -374,6 +375,25 @@  static void riscv_any_cpu_init(Object *obj)
     cpu->cfg.pmp = true;
 }
 
+static void riscv_max_cpu_init(Object *obj)
+{
+    RISCVCPU *cpu = RISCV_CPU(obj);
+    CPURISCVState *env = &cpu->env;
+    RISCVMXL mlx = MXL_RV64;
+
+#ifdef TARGET_RISCV32
+    mlx = MXL_RV32;
+#endif
+    set_misa(env, mlx, 0);
+    riscv_cpu_add_user_properties(obj);
+    riscv_init_max_cpu_extensions(obj);
+    env->priv_ver = PRIV_VERSION_LATEST;
+#ifndef CONFIG_USER_ONLY
+    set_satp_mode_max_supported(RISCV_CPU(obj), mlx == MXL_RV32 ?
+                                VM_1_10_SV32 : VM_1_10_SV57);
+#endif
+}
+
 #if defined(TARGET_RISCV64)
 static void rv64_base_cpu_init(Object *obj)
 {
@@ -1934,6 +1954,35 @@  static void riscv_cpu_add_user_properties(Object *obj)
     ADD_CPU_PROPERTIES_ARRAY(dev, riscv_cpu_experimental_exts);
 }
 
+/*
+ * The 'max' type CPU will have all possible ratified
+ * non-vendor extensions enabled.
+ */
+static void riscv_init_max_cpu_extensions(Object *obj)
+{
+    RISCVCPU *cpu = RISCV_CPU(obj);
+    CPURISCVState *env = &cpu->env;
+    Property *prop;
+
+    for (prop = riscv_cpu_extensions; prop && prop->name; prop++) {
+        object_property_set_bool(obj, prop->name, true, NULL);
+    }
+
+    /* Zfinx is not compatible with F. Disable it */
+    object_property_set_bool(obj, "zfinx", false, NULL);
+    object_property_set_bool(obj, "zdinx", false, NULL);
+    object_property_set_bool(obj, "zhinx", false, NULL);
+    object_property_set_bool(obj, "zhinxmin", false, NULL);
+
+    object_property_set_bool(obj, "zce", false, NULL);
+    object_property_set_bool(obj, "zcmp", false, NULL);
+    object_property_set_bool(obj, "zcmt", false, NULL);
+
+    if (env->misa_mxl != MXL_RV32) {
+        object_property_set_bool(obj, "zcf", false, NULL);
+    }
+}
+
 static Property riscv_cpu_properties[] = {
     DEFINE_PROP_BOOL("debug", RISCVCPU, cfg.debug, true),
 
@@ -2272,6 +2321,7 @@  static const TypeInfo riscv_cpu_type_infos[] = {
         .abstract = true,
     },
     DEFINE_DYNAMIC_CPU(TYPE_RISCV_CPU_ANY,      riscv_any_cpu_init),
+    DEFINE_DYNAMIC_CPU(TYPE_RISCV_CPU_MAX,      riscv_max_cpu_init),
 #if defined(CONFIG_KVM)
     DEFINE_CPU(TYPE_RISCV_CPU_HOST,             riscv_host_cpu_init),
 #endif