Message ID | 20190211221345.31980-6-lukas.auer@aisec.fraunhofer.de |
---|---|
State | Superseded |
Delegated to: | Andes |
Headers | show |
Series | SMP support for RISC-V | expand |
> -----Original Message----- > From: Lukas Auer [mailto:lukas.auer@aisec.fraunhofer.de] > Sent: Tuesday, February 12, 2019 3:44 AM > To: u-boot@lists.denx.de > Cc: Atish Patra <Atish.Patra@wdc.com>; Anup Patel > <Anup.Patel@wdc.com>; Bin Meng <bmeng.cn@gmail.com>; Andreas > Schwab <schwab@suse.de>; Palmer Dabbelt <palmer@sifive.com>; > Alexander Graf <agraf@suse.de>; Lukas Auer > <lukas.auer@aisec.fraunhofer.de>; Anup Patel <anup@brainfault.org>; Rick > Chen <rick@andestech.com>; Baruch Siach <baruch@tkos.co.il>; Stefan > Roese <sr@denx.de> > Subject: [PATCH 5/7] riscv: add support for multi-hart systems > > On RISC-V, all harts boot independently. To be able to run on a multi-hart > system, U-Boot must be extended with the functionality to manage all harts > in the system. A new config option, CONFIG_MAIN_HART, is used to select > the hart U-Boot runs on. All other harts are halted. > U-Boot can delegate functions to them using smp_call_function(). > > Every hart has a valid pointer to the global data structure and a 8KiB stack by > default. The stack size is set with CONFIG_STACK_SIZE_SHIFT. > > Signed-off-by: Lukas Auer <lukas.auer@aisec.fraunhofer.de> > --- > > arch/riscv/Kconfig | 12 +++++ > arch/riscv/cpu/start.S | 102 ++++++++++++++++++++++++++++++++++- > arch/riscv/include/asm/csr.h | 1 + > 3 files changed, 114 insertions(+), 1 deletion(-) > > diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig index > 3a51339c4d..af8d0f8d67 100644 > --- a/arch/riscv/Kconfig > +++ b/arch/riscv/Kconfig > @@ -140,4 +140,16 @@ config SBI_IPI > default y if RISCV_SMODE > depends on SMP > > +config MAIN_HART > + int "Main hart in system" > + default 0 > + help > + Some SoCs include harts of various sizes, some of which might not > + be suitable for running U-Boot. CONFIG_MAIN_HART is used to > select > + the hart U-Boot runs on. This config option can be avoided altogether if we have lottery based system to select "Main HART" in start.S. With the MAIN_HART config option in-place, every system will have to pick a "Main HART". What if the "Main HART" itself does not come online due to HW failure. Regards, Anup
Hi Lukas > > From: Lukas Auer [mailto:lukas.auer@aisec.fraunhofer.de] > > Sent: Tuesday, February 12, 2019 6:14 AM > > To: u-boot@lists.denx.de > > Cc: Atish Patra; Anup Patel; Bin Meng; Andreas Schwab; Palmer Dabbelt; > > Alexander Graf; Lukas Auer; Anup Patel; Rick Jian-Zhi Chen(陳建志); Baruch Siach; > > Stefan Roese > > Subject: [PATCH 5/7] riscv: add support for multi-hart systems > > > > On RISC-V, all harts boot independently. To be able to run on a multi-hart system, > > U-Boot must be extended with the functionality to manage all harts in the > > system. A new config option, CONFIG_MAIN_HART, is used to select the hart > > U-Boot runs on. All other harts are halted. > > U-Boot can delegate functions to them using smp_call_function(). > > > > Every hart has a valid pointer to the global data structure and a 8KiB stack by > > default. The stack size is set with CONFIG_STACK_SIZE_SHIFT. > > > > Signed-off-by: Lukas Auer <lukas.auer@aisec.fraunhofer.de> > > --- > > > > arch/riscv/Kconfig | 12 +++++ > > arch/riscv/cpu/start.S | 102 ++++++++++++++++++++++++++++++++++- > > arch/riscv/include/asm/csr.h | 1 + > > 3 files changed, 114 insertions(+), 1 deletion(-) > > > > diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig index 3a51339c4d..af8d0f8d67 > > 100644 > > --- a/arch/riscv/Kconfig > > +++ b/arch/riscv/Kconfig > > @@ -140,4 +140,16 @@ config SBI_IPI > > default y if RISCV_SMODE > > depends on SMP > > > > +config MAIN_HART > > + int "Main hart in system" > > + default 0 > > + help > > + Some SoCs include harts of various sizes, some of which might not > > + be suitable for running U-Boot. CONFIG_MAIN_HART is used to select > > + the hart U-Boot runs on. > > + > > +config STACK_SIZE_SHIFT > > + int > > + default 13 > > + > > endmenu > > diff --git a/arch/riscv/cpu/start.S b/arch/riscv/cpu/start.S index > > a30f6f7194..ce7230df37 100644 > > --- a/arch/riscv/cpu/start.S > > +++ b/arch/riscv/cpu/start.S > > @@ -13,6 +13,7 @@ > > #include <config.h> > > #include <common.h> > > #include <elf.h> > > +#include <asm/csr.h> > > #include <asm/encoding.h> > > #include <generated/asm-offsets.h> > > If u-boot boot from flash by itself in M-mode without any FSBL or gdb, in this case there will be no chance to assign a0. Can we add some code as below : _start: +#ifdef CONFIG_RISCV_MMODE + csrr a0, mhartid +#endif /* save hart id and dtb pointer */ mv s0, a0 mv s1, a1 How do you think about it ? Thanks Rick > > @@ -45,6 +46,23 @@ _start: > > /* mask all interrupts */ > > csrw MODE_PREFIX(ie), zero > > > > +#ifdef CONFIG_SMP > > + /* check if hart is within range */ > > + /* s0: hart id */ > > + li t0, CONFIG_NR_CPUS > > + bge s0, t0, hart_out_of_bounds_loop > > +#endif
On Tue, 2019-02-12 at 01:48 +0000, Anup Patel wrote: > > -----Original Message----- > > From: Lukas Auer [mailto:lukas.auer@aisec.fraunhofer.de] > > Sent: Tuesday, February 12, 2019 3:44 AM > > To: u-boot@lists.denx.de > > Cc: Atish Patra <Atish.Patra@wdc.com>; Anup Patel > > <Anup.Patel@wdc.com>; Bin Meng <bmeng.cn@gmail.com>; Andreas > > Schwab <schwab@suse.de>; Palmer Dabbelt <palmer@sifive.com>; > > Alexander Graf <agraf@suse.de>; Lukas Auer > > <lukas.auer@aisec.fraunhofer.de>; Anup Patel <anup@brainfault.org>; > > Rick > > Chen <rick@andestech.com>; Baruch Siach <baruch@tkos.co.il>; Stefan > > Roese <sr@denx.de> > > Subject: [PATCH 5/7] riscv: add support for multi-hart systems > > > > On RISC-V, all harts boot independently. To be able to run on a > > multi-hart > > system, U-Boot must be extended with the functionality to manage > > all harts > > in the system. A new config option, CONFIG_MAIN_HART, is used to > > select > > the hart U-Boot runs on. All other harts are halted. > > U-Boot can delegate functions to them using smp_call_function(). > > > > Every hart has a valid pointer to the global data structure and a > > 8KiB stack by > > default. The stack size is set with CONFIG_STACK_SIZE_SHIFT. > > > > Signed-off-by: Lukas Auer <lukas.auer@aisec.fraunhofer.de> > > --- > > > > arch/riscv/Kconfig | 12 +++++ > > arch/riscv/cpu/start.S | 102 > > ++++++++++++++++++++++++++++++++++- > > arch/riscv/include/asm/csr.h | 1 + > > 3 files changed, 114 insertions(+), 1 deletion(-) > > > > diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig index > > 3a51339c4d..af8d0f8d67 100644 > > --- a/arch/riscv/Kconfig > > +++ b/arch/riscv/Kconfig > > @@ -140,4 +140,16 @@ config SBI_IPI > > default y if RISCV_SMODE > > depends on SMP > > > > +config MAIN_HART > > + int "Main hart in system" > > + default 0 > > + help > > + Some SoCs include harts of various sizes, some of which might > > not > > + be suitable for running U-Boot. CONFIG_MAIN_HART is used to > > select > > + the hart U-Boot runs on. > > This config option can be avoided altogether if we have > lottery based system to select "Main HART" in start.S. > > With the MAIN_HART config option in-place, every system > will have to pick a "Main HART". What if the "Main HART" > itself does not come online due to HW failure. > Good point, I did not consider this. I will add a lottery-based system in the next version. Thanks, Lukas
Hi Rick, On Fri, 2019-02-15 at 14:51 +0800, Rick Chen wrote: > Hi Lukas > > > > From: Lukas Auer [mailto:lukas.auer@aisec.fraunhofer.de] > > > Sent: Tuesday, February 12, 2019 6:14 AM > > > To: u-boot@lists.denx.de > > > Cc: Atish Patra; Anup Patel; Bin Meng; Andreas Schwab; Palmer > > > Dabbelt; > > > Alexander Graf; Lukas Auer; Anup Patel; Rick Jian-Zhi Chen(陳建志); > > > Baruch Siach; > > > Stefan Roese > > > Subject: [PATCH 5/7] riscv: add support for multi-hart systems > > > > > > On RISC-V, all harts boot independently. To be able to run on a > > > multi-hart system, > > > U-Boot must be extended with the functionality to manage all > > > harts in the > > > system. A new config option, CONFIG_MAIN_HART, is used to select > > > the hart > > > U-Boot runs on. All other harts are halted. > > > U-Boot can delegate functions to them using smp_call_function(). > > > > > > Every hart has a valid pointer to the global data structure and a > > > 8KiB stack by > > > default. The stack size is set with CONFIG_STACK_SIZE_SHIFT. > > > > > > Signed-off-by: Lukas Auer <lukas.auer@aisec.fraunhofer.de> > > > --- > > > > > > arch/riscv/Kconfig | 12 +++++ > > > arch/riscv/cpu/start.S | 102 > > > ++++++++++++++++++++++++++++++++++- > > > arch/riscv/include/asm/csr.h | 1 + > > > 3 files changed, 114 insertions(+), 1 deletion(-) > > > > > > diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig index > > > 3a51339c4d..af8d0f8d67 > > > 100644 > > > --- a/arch/riscv/Kconfig > > > +++ b/arch/riscv/Kconfig > > > @@ -140,4 +140,16 @@ config SBI_IPI > > > default y if RISCV_SMODE > > > depends on SMP > > > > > > +config MAIN_HART > > > + int "Main hart in system" > > > + default 0 > > > + help > > > + Some SoCs include harts of various sizes, some of which > > > might not > > > + be suitable for running U-Boot. CONFIG_MAIN_HART is used > > > to select > > > + the hart U-Boot runs on. > > > + > > > +config STACK_SIZE_SHIFT > > > + int > > > + default 13 > > > + > > > endmenu > > > diff --git a/arch/riscv/cpu/start.S b/arch/riscv/cpu/start.S > > > index > > > a30f6f7194..ce7230df37 100644 > > > --- a/arch/riscv/cpu/start.S > > > +++ b/arch/riscv/cpu/start.S > > > @@ -13,6 +13,7 @@ > > > #include <config.h> > > > #include <common.h> > > > #include <elf.h> > > > +#include <asm/csr.h> > > > #include <asm/encoding.h> > > > #include <generated/asm-offsets.h> > > > > > If u-boot boot from flash by itself in M-mode without any FSBL or > gdb, > in this case there will be no chance to assign a0. > Can we add some code as below : > > _start: > +#ifdef CONFIG_RISCV_MMODE > + csrr a0, mhartid > +#endif > /* save hart id and dtb pointer */ > mv s0, a0 > mv s1, a1 > > How do you think about it ? > Good point, thanks! Even without SMP support this could cause issues if we pass an invalid hart ID to Linux. I will add a patch for this. Thanks, Lukas > Thanks > Rick > > > > > @@ -45,6 +46,23 @@ _start: > > > /* mask all interrupts */ > > > csrw MODE_PREFIX(ie), zero > > > > > > +#ifdef CONFIG_SMP > > > + /* check if hart is within range */ > > > + /* s0: hart id */ > > > + li t0, CONFIG_NR_CPUS > > > + bge s0, t0, hart_out_of_bounds_loop > > > +#endif
Hi Lukas > > From: Lukas Auer [mailto:lukas.auer@aisec.fraunhofer.de] > > Sent: Tuesday, February 12, 2019 6:14 AM > > To: u-boot@lists.denx.de > > Cc: Atish Patra; Anup Patel; Bin Meng; Andreas Schwab; Palmer Dabbelt; > > Alexander Graf; Lukas Auer; Anup Patel; Rick Jian-Zhi Chen(陳建志); Baruch Siach; > > Stefan Roese > > Subject: [PATCH 5/7] riscv: add support for multi-hart systems > > > > On RISC-V, all harts boot independently. To be able to run on a multi-hart system, > > U-Boot must be extended with the functionality to manage all harts in the > > system. A new config option, CONFIG_MAIN_HART, is used to select the hart > > U-Boot runs on. All other harts are halted. > > U-Boot can delegate functions to them using smp_call_function(). > > > > Every hart has a valid pointer to the global data structure and a 8KiB stack by > > default. The stack size is set with CONFIG_STACK_SIZE_SHIFT. > > > > Signed-off-by: Lukas Auer <lukas.auer@aisec.fraunhofer.de> > > --- > > > > arch/riscv/Kconfig | 12 +++++ > > arch/riscv/cpu/start.S | 102 ++++++++++++++++++++++++++++++++++- > > arch/riscv/include/asm/csr.h | 1 + > > 3 files changed, 114 insertions(+), 1 deletion(-) > > > > diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig index 3a51339c4d..af8d0f8d67 > > 100644 > > --- a/arch/riscv/Kconfig > > +++ b/arch/riscv/Kconfig > > @@ -140,4 +140,16 @@ config SBI_IPI > > default y if RISCV_SMODE > > depends on SMP > > > > +config MAIN_HART > > + int "Main hart in system" > > + default 0 > > + help > > + Some SoCs include harts of various sizes, some of which might not > > + be suitable for running U-Boot. CONFIG_MAIN_HART is used to select > > + the hart U-Boot runs on. > > + > > +config STACK_SIZE_SHIFT > > + int > > + default 13 > > + > > endmenu > > diff --git a/arch/riscv/cpu/start.S b/arch/riscv/cpu/start.S index > > a30f6f7194..ce7230df37 100644 > > --- a/arch/riscv/cpu/start.S > > +++ b/arch/riscv/cpu/start.S > > @@ -13,6 +13,7 @@ > > #include <config.h> > > #include <common.h> > > #include <elf.h> > > +#include <asm/csr.h> > > #include <asm/encoding.h> > > #include <generated/asm-offsets.h> > > > > @@ -45,6 +46,23 @@ _start: > > /* mask all interrupts */ > > csrw MODE_PREFIX(ie), zero > > > > +#ifdef CONFIG_SMP > > + /* check if hart is within range */ > > + /* s0: hart id */ > > + li t0, CONFIG_NR_CPUS > > + bge s0, t0, hart_out_of_bounds_loop > > +#endif > > + > > +#ifdef CONFIG_SMP > > + /* set xSIE bit to receive IPIs */ > > +#ifdef CONFIG_RISCV_MMODE > > + li t0, MIE_MSIE > > +#else > > + li t0, SIE_SSIE > > +#endif > > + csrs MODE_PREFIX(ie), t0 > > +#endif > > + > > /* > > * Set stackpointer in internal/ex RAM to call board_init_f > > */ > > @@ -56,7 +74,25 @@ call_board_init_f: > > call_board_init_f_0: > > mv a0, sp > > jal board_init_f_alloc_reserve > > + > > + /* > > + * Set global data pointer here for all harts, uninitialized at this > > + * point. > > + */ > > + mv gp, a0 > > + > > + /* setup stack */ > > +#ifdef CONFIG_SMP > > + /* s0: hart id */ > > + slli t0, s0, CONFIG_STACK_SIZE_SHIFT > > + sub sp, a0, t0 > > +#else > > mv sp, a0 > > +#endif > > + > > + /* Continue on main hart, others branch to secondary_hart_loop */ > > + li t0, CONFIG_MAIN_HART > > + bne s0, t0, secondary_hart_loop > > > > la t0, prior_stage_fdt_address > > SREG s1, 0(t0) > > @@ -95,7 +131,14 @@ relocate_code: > > *Set up the stack > > */ > > stack_setup: > > +#ifdef CONFIG_SMP > > + /* s0: hart id */ > > + slli t0, s0, CONFIG_STACK_SIZE_SHIFT > > + sub sp, s2, t0 > > +#else > > mv sp, s2 > > +#endif > > + > > la t0, _start > > sub t6, s4, t0 /* t6 <- relocation offset */ > > beq t0, s4, clear_bss /* skip relocation */ > > @@ -175,13 +218,30 @@ clear_bss: > > add t0, t0, t6 /* t0 <- rel __bss_start in RAM */ > > la t1, __bss_end /* t1 <- rel __bss_end in FLASH */ > > add t1, t1, t6 /* t1 <- rel __bss_end in RAM */ > > - beq t0, t1, call_board_init_r > > + beq t0, t1, relocate_secondary_harts > > > > clbss_l: > > SREG zero, 0(t0) /* clear loop... */ > > addi t0, t0, REGBYTES > > bne t0, t1, clbss_l > > > > +relocate_secondary_harts: > > +#ifdef CONFIG_SMP > > + /* send relocation IPI */ > > + la t0, secondary_hart_relocate > > + add a0, t0, t6 > > + > > + /* store relocation offset */ > > + mv s5, t6 > > + > > + mv a1, s2 > > + mv a2, s3 > > + jal smp_call_function > > + > > + /* restore relocation offset */ > > + mv t6, s5 > > +#endif > > + > > /* > > * We are done. Do not return, instead branch to second part of board > > * initialization, now running from RAM. > > @@ -202,3 +262,43 @@ call_board_init_r: > > * jump to it ... > > */ > > jr t4 /* jump to board_init_r() */ > > + > > +#ifdef CONFIG_SMP > > +hart_out_of_bounds_loop: > > + /* Harts in this loop are out of bounds, increase CONFIG_NR_CPUS. */ > > + wfi > > + j hart_out_of_bounds_loop > > +#endif > > + > > +#ifdef CONFIG_SMP > > +/* SMP relocation entry */ > > +secondary_hart_relocate: > > + /* a1: new sp */ > > + /* a2: new gd */ > > + /* s0: hart id */ > > + > > + /* setup stack */ > > + slli t0, s0, CONFIG_STACK_SIZE_SHIFT > > + sub sp, a1, t0 > > + > > + /* update global data pointer */ > > + mv gp, a2 > > +#endif > > + > > +secondary_hart_loop: > > + wfi > > + > > +#ifdef CONFIG_SMP > > + csrr t0, MODE_PREFIX(ip) > > +#ifdef CONFIG_RISCV_MMODE > > + andi t0, t0, MIE_MSIE > > +#else > > + andi t0, t0, SIE_SSIE > > +#endif > > + beqz t0, secondary_hart_loop > > + > > + mv a0, s0 > > + jal handle_ipi I found that s0 maybe corrupted after execute handle_ipi. Because smp_function will be treated as a return function by compiler, so compiler will generate codes to execute restore after smp_function(). But actually it is a no-return function. So there maybe no chance to execute restore. And s0 will be corrupted somehow. The usage of s0 in v2 flow seem the same as v1. So I reply mail in v1 patch. Thanks Rick > > +#endif > > + > > + j secondary_hart_loop > > diff --git a/arch/riscv/include/asm/csr.h b/arch/riscv/include/asm/csr.h index > > 86136f542c..644e6baa15 100644 > > --- a/arch/riscv/include/asm/csr.h > > +++ b/arch/riscv/include/asm/csr.h > > @@ -46,6 +46,7 @@ > > #endif > > > > /* Interrupt Enable and Interrupt Pending flags */ > > +#define MIE_MSIE _AC(0x00000008, UL) /* Software Interrupt Enable */ > > #define SIE_SSIE _AC(0x00000002, UL) /* Software Interrupt Enable */ > > #define SIE_STIE _AC(0x00000020, UL) /* Timer Interrupt Enable */ > > > > -- > > 2.20.1
Hi Rick, On Thu, 2019-03-07 at 17:30 +0800, Rick Chen wrote: > Hi Lukas > > > > From: Lukas Auer [mailto:lukas.auer@aisec.fraunhofer.de] > > > Sent: Tuesday, February 12, 2019 6:14 AM > > > To: u-boot@lists.denx.de > > > Cc: Atish Patra; Anup Patel; Bin Meng; Andreas Schwab; Palmer > > > Dabbelt; > > > Alexander Graf; Lukas Auer; Anup Patel; Rick Jian-Zhi Chen(陳建志); > > > Baruch Siach; > > > Stefan Roese > > > Subject: [PATCH 5/7] riscv: add support for multi-hart systems > > > > > > On RISC-V, all harts boot independently. To be able to run on a > > > multi-hart system, > > > U-Boot must be extended with the functionality to manage all > > > harts in the > > > system. A new config option, CONFIG_MAIN_HART, is used to select > > > the hart > > > U-Boot runs on. All other harts are halted. > > > U-Boot can delegate functions to them using smp_call_function(). > > > > > > Every hart has a valid pointer to the global data structure and a > > > 8KiB stack by > > > default. The stack size is set with CONFIG_STACK_SIZE_SHIFT. > > > > > > Signed-off-by: Lukas Auer <lukas.auer@aisec.fraunhofer.de> > > > --- > > > > > > arch/riscv/Kconfig | 12 +++++ > > > arch/riscv/cpu/start.S | 102 > > > ++++++++++++++++++++++++++++++++++- > > > arch/riscv/include/asm/csr.h | 1 + > > > 3 files changed, 114 insertions(+), 1 deletion(-) > > > > > > diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig index > > > 3a51339c4d..af8d0f8d67 > > > 100644 > > > --- a/arch/riscv/Kconfig > > > +++ b/arch/riscv/Kconfig > > > @@ -140,4 +140,16 @@ config SBI_IPI > > > default y if RISCV_SMODE > > > depends on SMP > > > > > > +config MAIN_HART > > > + int "Main hart in system" > > > + default 0 > > > + help > > > + Some SoCs include harts of various sizes, some of which > > > might not > > > + be suitable for running U-Boot. CONFIG_MAIN_HART is used > > > to select > > > + the hart U-Boot runs on. > > > + > > > +config STACK_SIZE_SHIFT > > > + int > > > + default 13 > > > + > > > endmenu > > > diff --git a/arch/riscv/cpu/start.S b/arch/riscv/cpu/start.S > > > index > > > a30f6f7194..ce7230df37 100644 > > > --- a/arch/riscv/cpu/start.S > > > +++ b/arch/riscv/cpu/start.S > > > @@ -13,6 +13,7 @@ > > > #include <config.h> > > > #include <common.h> > > > #include <elf.h> > > > +#include <asm/csr.h> > > > #include <asm/encoding.h> > > > #include <generated/asm-offsets.h> > > > > > > @@ -45,6 +46,23 @@ _start: > > > /* mask all interrupts */ > > > csrw MODE_PREFIX(ie), zero > > > > > > +#ifdef CONFIG_SMP > > > + /* check if hart is within range */ > > > + /* s0: hart id */ > > > + li t0, CONFIG_NR_CPUS > > > + bge s0, t0, hart_out_of_bounds_loop > > > +#endif > > > + > > > +#ifdef CONFIG_SMP > > > + /* set xSIE bit to receive IPIs */ > > > +#ifdef CONFIG_RISCV_MMODE > > > + li t0, MIE_MSIE > > > +#else > > > + li t0, SIE_SSIE > > > +#endif > > > + csrs MODE_PREFIX(ie), t0 > > > +#endif > > > + > > > /* > > > * Set stackpointer in internal/ex RAM to call board_init_f > > > */ > > > @@ -56,7 +74,25 @@ call_board_init_f: > > > call_board_init_f_0: > > > mv a0, sp > > > jal board_init_f_alloc_reserve > > > + > > > + /* > > > + * Set global data pointer here for all harts, > > > uninitialized at this > > > + * point. > > > + */ > > > + mv gp, a0 > > > + > > > + /* setup stack */ > > > +#ifdef CONFIG_SMP > > > + /* s0: hart id */ > > > + slli t0, s0, CONFIG_STACK_SIZE_SHIFT > > > + sub sp, a0, t0 > > > +#else > > > mv sp, a0 > > > +#endif > > > + > > > + /* Continue on main hart, others branch to > > > secondary_hart_loop */ > > > + li t0, CONFIG_MAIN_HART > > > + bne s0, t0, secondary_hart_loop > > > > > > la t0, prior_stage_fdt_address > > > SREG s1, 0(t0) > > > @@ -95,7 +131,14 @@ relocate_code: > > > *Set up the stack > > > */ > > > stack_setup: > > > +#ifdef CONFIG_SMP > > > + /* s0: hart id */ > > > + slli t0, s0, CONFIG_STACK_SIZE_SHIFT > > > + sub sp, s2, t0 > > > +#else > > > mv sp, s2 > > > +#endif > > > + > > > la t0, _start > > > sub t6, s4, t0 /* t6 <- relocation offset > > > */ > > > beq t0, s4, clear_bss /* skip relocation */ > > > @@ -175,13 +218,30 @@ clear_bss: > > > add t0, t0, t6 /* t0 <- rel __bss_start in > > > RAM */ > > > la t1, __bss_end /* t1 <- rel __bss_end in > > > FLASH */ > > > add t1, t1, t6 /* t1 <- rel __bss_end in > > > RAM */ > > > - beq t0, t1, call_board_init_r > > > + beq t0, t1, relocate_secondary_harts > > > > > > clbss_l: > > > SREG zero, 0(t0) /* clear loop... */ > > > addi t0, t0, REGBYTES > > > bne t0, t1, clbss_l > > > > > > +relocate_secondary_harts: > > > +#ifdef CONFIG_SMP > > > + /* send relocation IPI */ > > > + la t0, secondary_hart_relocate > > > + add a0, t0, t6 > > > + > > > + /* store relocation offset */ > > > + mv s5, t6 > > > + > > > + mv a1, s2 > > > + mv a2, s3 > > > + jal smp_call_function > > > + > > > + /* restore relocation offset */ > > > + mv t6, s5 > > > +#endif > > > + > > > /* > > > * We are done. Do not return, instead branch to second part of > > > board > > > * initialization, now running from RAM. > > > @@ -202,3 +262,43 @@ call_board_init_r: > > > * jump to it ... > > > */ > > > jr t4 /* jump to board_init_r() > > > */ > > > + > > > +#ifdef CONFIG_SMP > > > +hart_out_of_bounds_loop: > > > + /* Harts in this loop are out of bounds, increase > > > CONFIG_NR_CPUS. */ > > > + wfi > > > + j hart_out_of_bounds_loop > > > +#endif > > > + > > > +#ifdef CONFIG_SMP > > > +/* SMP relocation entry */ > > > +secondary_hart_relocate: > > > + /* a1: new sp */ > > > + /* a2: new gd */ > > > + /* s0: hart id */ > > > + > > > + /* setup stack */ > > > + slli t0, s0, CONFIG_STACK_SIZE_SHIFT > > > + sub sp, a1, t0 > > > + > > > + /* update global data pointer */ > > > + mv gp, a2 > > > +#endif > > > + > > > +secondary_hart_loop: > > > + wfi > > > + > > > +#ifdef CONFIG_SMP > > > + csrr t0, MODE_PREFIX(ip) > > > +#ifdef CONFIG_RISCV_MMODE > > > + andi t0, t0, MIE_MSIE > > > +#else > > > + andi t0, t0, SIE_SSIE > > > +#endif > > > + beqz t0, secondary_hart_loop > > > + > > > + mv a0, s0 > > > + jal handle_ipi > > I found that s0 maybe corrupted after execute handle_ipi. > Because smp_function will be treated as a return function by > compiler, > so compiler will generate codes to execute restore after > smp_function(). > > But actually it is a no-return function. So there maybe no chance to > execute > restore. And s0 will be corrupted somehow. > > The usage of s0 in v2 flow seem the same as v1. > So I reply mail in v1 patch. > Thanks for reporting this issue! What compiler version are you using? I never saw this issue with GCC 8.2.0, because optimization removes the ret following the call to smp_function(), it is called using jr. The registers are therefore restored before jumping to smp_function(). This system is probably too fragile. A simple solution would be to store the hart ID somewhere else, for example register tp. What do you think? Thanks, Lukas
On Sun, Mar 10, 2019 at 7:28 PM Auer, Lukas <lukas.auer@aisec.fraunhofer.de> wrote: > > Hi Rick, > > On Thu, 2019-03-07 at 17:30 +0800, Rick Chen wrote: > > Hi Lukas > > > > > > From: Lukas Auer [mailto:lukas.auer@aisec.fraunhofer.de] > > > > Sent: Tuesday, February 12, 2019 6:14 AM > > > > To: u-boot@lists.denx.de > > > > Cc: Atish Patra; Anup Patel; Bin Meng; Andreas Schwab; Palmer > > > > Dabbelt; > > > > Alexander Graf; Lukas Auer; Anup Patel; Rick Jian-Zhi Chen(陳建志); > > > > Baruch Siach; > > > > Stefan Roese > > > > Subject: [PATCH 5/7] riscv: add support for multi-hart systems > > > > > > > > On RISC-V, all harts boot independently. To be able to run on a > > > > multi-hart system, > > > > U-Boot must be extended with the functionality to manage all > > > > harts in the > > > > system. A new config option, CONFIG_MAIN_HART, is used to select > > > > the hart > > > > U-Boot runs on. All other harts are halted. > > > > U-Boot can delegate functions to them using smp_call_function(). > > > > > > > > Every hart has a valid pointer to the global data structure and a > > > > 8KiB stack by > > > > default. The stack size is set with CONFIG_STACK_SIZE_SHIFT. > > > > > > > > Signed-off-by: Lukas Auer <lukas.auer@aisec.fraunhofer.de> > > > > --- > > > > > > > > arch/riscv/Kconfig | 12 +++++ > > > > arch/riscv/cpu/start.S | 102 > > > > ++++++++++++++++++++++++++++++++++- > > > > arch/riscv/include/asm/csr.h | 1 + > > > > 3 files changed, 114 insertions(+), 1 deletion(-) > > > > > > > > diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig index > > > > 3a51339c4d..af8d0f8d67 > > > > 100644 > > > > --- a/arch/riscv/Kconfig > > > > +++ b/arch/riscv/Kconfig > > > > @@ -140,4 +140,16 @@ config SBI_IPI > > > > default y if RISCV_SMODE > > > > depends on SMP > > > > > > > > +config MAIN_HART > > > > + int "Main hart in system" > > > > + default 0 > > > > + help > > > > + Some SoCs include harts of various sizes, some of which > > > > might not > > > > + be suitable for running U-Boot. CONFIG_MAIN_HART is used > > > > to select > > > > + the hart U-Boot runs on. > > > > + > > > > +config STACK_SIZE_SHIFT > > > > + int > > > > + default 13 > > > > + > > > > endmenu > > > > diff --git a/arch/riscv/cpu/start.S b/arch/riscv/cpu/start.S > > > > index > > > > a30f6f7194..ce7230df37 100644 > > > > --- a/arch/riscv/cpu/start.S > > > > +++ b/arch/riscv/cpu/start.S > > > > @@ -13,6 +13,7 @@ > > > > #include <config.h> > > > > #include <common.h> > > > > #include <elf.h> > > > > +#include <asm/csr.h> > > > > #include <asm/encoding.h> > > > > #include <generated/asm-offsets.h> > > > > > > > > @@ -45,6 +46,23 @@ _start: > > > > /* mask all interrupts */ > > > > csrw MODE_PREFIX(ie), zero > > > > > > > > +#ifdef CONFIG_SMP > > > > + /* check if hart is within range */ > > > > + /* s0: hart id */ > > > > + li t0, CONFIG_NR_CPUS > > > > + bge s0, t0, hart_out_of_bounds_loop > > > > +#endif > > > > + > > > > +#ifdef CONFIG_SMP > > > > + /* set xSIE bit to receive IPIs */ > > > > +#ifdef CONFIG_RISCV_MMODE > > > > + li t0, MIE_MSIE > > > > +#else > > > > + li t0, SIE_SSIE > > > > +#endif > > > > + csrs MODE_PREFIX(ie), t0 > > > > +#endif > > > > + > > > > /* > > > > * Set stackpointer in internal/ex RAM to call board_init_f > > > > */ > > > > @@ -56,7 +74,25 @@ call_board_init_f: > > > > call_board_init_f_0: > > > > mv a0, sp > > > > jal board_init_f_alloc_reserve > > > > + > > > > + /* > > > > + * Set global data pointer here for all harts, > > > > uninitialized at this > > > > + * point. > > > > + */ > > > > + mv gp, a0 > > > > + > > > > + /* setup stack */ > > > > +#ifdef CONFIG_SMP > > > > + /* s0: hart id */ > > > > + slli t0, s0, CONFIG_STACK_SIZE_SHIFT > > > > + sub sp, a0, t0 > > > > +#else > > > > mv sp, a0 > > > > +#endif > > > > + > > > > + /* Continue on main hart, others branch to > > > > secondary_hart_loop */ > > > > + li t0, CONFIG_MAIN_HART > > > > + bne s0, t0, secondary_hart_loop > > > > > > > > la t0, prior_stage_fdt_address > > > > SREG s1, 0(t0) > > > > @@ -95,7 +131,14 @@ relocate_code: > > > > *Set up the stack > > > > */ > > > > stack_setup: > > > > +#ifdef CONFIG_SMP > > > > + /* s0: hart id */ > > > > + slli t0, s0, CONFIG_STACK_SIZE_SHIFT > > > > + sub sp, s2, t0 > > > > +#else > > > > mv sp, s2 > > > > +#endif > > > > + > > > > la t0, _start > > > > sub t6, s4, t0 /* t6 <- relocation offset > > > > */ > > > > beq t0, s4, clear_bss /* skip relocation */ > > > > @@ -175,13 +218,30 @@ clear_bss: > > > > add t0, t0, t6 /* t0 <- rel __bss_start in > > > > RAM */ > > > > la t1, __bss_end /* t1 <- rel __bss_end in > > > > FLASH */ > > > > add t1, t1, t6 /* t1 <- rel __bss_end in > > > > RAM */ > > > > - beq t0, t1, call_board_init_r > > > > + beq t0, t1, relocate_secondary_harts > > > > > > > > clbss_l: > > > > SREG zero, 0(t0) /* clear loop... */ > > > > addi t0, t0, REGBYTES > > > > bne t0, t1, clbss_l > > > > > > > > +relocate_secondary_harts: > > > > +#ifdef CONFIG_SMP > > > > + /* send relocation IPI */ > > > > + la t0, secondary_hart_relocate > > > > + add a0, t0, t6 > > > > + > > > > + /* store relocation offset */ > > > > + mv s5, t6 > > > > + > > > > + mv a1, s2 > > > > + mv a2, s3 > > > > + jal smp_call_function > > > > + > > > > + /* restore relocation offset */ > > > > + mv t6, s5 > > > > +#endif > > > > + > > > > /* > > > > * We are done. Do not return, instead branch to second part of > > > > board > > > > * initialization, now running from RAM. > > > > @@ -202,3 +262,43 @@ call_board_init_r: > > > > * jump to it ... > > > > */ > > > > jr t4 /* jump to board_init_r() > > > > */ > > > > + > > > > +#ifdef CONFIG_SMP > > > > +hart_out_of_bounds_loop: > > > > + /* Harts in this loop are out of bounds, increase > > > > CONFIG_NR_CPUS. */ > > > > + wfi > > > > + j hart_out_of_bounds_loop > > > > +#endif > > > > + > > > > +#ifdef CONFIG_SMP > > > > +/* SMP relocation entry */ > > > > +secondary_hart_relocate: > > > > + /* a1: new sp */ > > > > + /* a2: new gd */ > > > > + /* s0: hart id */ > > > > + > > > > + /* setup stack */ > > > > + slli t0, s0, CONFIG_STACK_SIZE_SHIFT > > > > + sub sp, a1, t0 > > > > + > > > > + /* update global data pointer */ > > > > + mv gp, a2 > > > > +#endif > > > > + > > > > +secondary_hart_loop: > > > > + wfi > > > > + > > > > +#ifdef CONFIG_SMP > > > > + csrr t0, MODE_PREFIX(ip) > > > > +#ifdef CONFIG_RISCV_MMODE > > > > + andi t0, t0, MIE_MSIE > > > > +#else > > > > + andi t0, t0, SIE_SSIE > > > > +#endif > > > > + beqz t0, secondary_hart_loop > > > > + > > > > + mv a0, s0 > > > > + jal handle_ipi > > > > I found that s0 maybe corrupted after execute handle_ipi. > > Because smp_function will be treated as a return function by > > compiler, > > so compiler will generate codes to execute restore after > > smp_function(). > > > > But actually it is a no-return function. So there maybe no chance to > > execute > > restore. And s0 will be corrupted somehow. > > > > The usage of s0 in v2 flow seem the same as v1. > > So I reply mail in v1 patch. > > > > Thanks for reporting this issue! > What compiler version are you using? I never saw this issue with GCC > 8.2.0, because optimization removes the ret following the call to > smp_function(), it is called using jr. The registers are therefore > restored before jumping to smp_function(). > > This system is probably too fragile. A simple solution would be to > store the hart ID somewhere else, for example register tp. What do you > think? I think you can also use scratch/mscratch (if it is not used anywhere else). Regards, Anup
On Sun, 2019-03-10 at 20:24 +0530, Anup Patel wrote: > On Sun, Mar 10, 2019 at 7:28 PM Auer, Lukas > <lukas.auer@aisec.fraunhofer.de> wrote: > > Hi Rick, > > > > On Thu, 2019-03-07 at 17:30 +0800, Rick Chen wrote: > > > Hi Lukas > > > > > > > > From: Lukas Auer [mailto:lukas.auer@aisec.fraunhofer.de] > > > > > Sent: Tuesday, February 12, 2019 6:14 AM > > > > > To: u-boot@lists.denx.de > > > > > Cc: Atish Patra; Anup Patel; Bin Meng; Andreas Schwab; Palmer > > > > > Dabbelt; > > > > > Alexander Graf; Lukas Auer; Anup Patel; Rick Jian-Zhi > > > > > Chen(陳建志); > > > > > Baruch Siach; > > > > > Stefan Roese > > > > > Subject: [PATCH 5/7] riscv: add support for multi-hart > > > > > systems > > > > > > > > > > On RISC-V, all harts boot independently. To be able to run on > > > > > a > > > > > multi-hart system, > > > > > U-Boot must be extended with the functionality to manage all > > > > > harts in the > > > > > system. A new config option, CONFIG_MAIN_HART, is used to > > > > > select > > > > > the hart > > > > > U-Boot runs on. All other harts are halted. > > > > > U-Boot can delegate functions to them using > > > > > smp_call_function(). > > > > > > > > > > Every hart has a valid pointer to the global data structure > > > > > and a > > > > > 8KiB stack by > > > > > default. The stack size is set with CONFIG_STACK_SIZE_SHIFT. > > > > > > > > > > Signed-off-by: Lukas Auer <lukas.auer@aisec.fraunhofer.de> > > > > > --- > > > > > > > > > > arch/riscv/Kconfig | 12 +++++ > > > > > arch/riscv/cpu/start.S | 102 > > > > > ++++++++++++++++++++++++++++++++++- > > > > > arch/riscv/include/asm/csr.h | 1 + > > > > > 3 files changed, 114 insertions(+), 1 deletion(-) > > > > > > > > > > diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig index > > > > > 3a51339c4d..af8d0f8d67 > > > > > 100644 > > > > > --- a/arch/riscv/Kconfig > > > > > +++ b/arch/riscv/Kconfig > > > > > @@ -140,4 +140,16 @@ config SBI_IPI > > > > > default y if RISCV_SMODE > > > > > depends on SMP > > > > > > > > > > +config MAIN_HART > > > > > + int "Main hart in system" > > > > > + default 0 > > > > > + help > > > > > + Some SoCs include harts of various sizes, some of > > > > > which > > > > > might not > > > > > + be suitable for running U-Boot. CONFIG_MAIN_HART is > > > > > used > > > > > to select > > > > > + the hart U-Boot runs on. > > > > > + > > > > > +config STACK_SIZE_SHIFT > > > > > + int > > > > > + default 13 > > > > > + > > > > > endmenu > > > > > diff --git a/arch/riscv/cpu/start.S b/arch/riscv/cpu/start.S > > > > > index > > > > > a30f6f7194..ce7230df37 100644 > > > > > --- a/arch/riscv/cpu/start.S > > > > > +++ b/arch/riscv/cpu/start.S > > > > > @@ -13,6 +13,7 @@ > > > > > #include <config.h> > > > > > #include <common.h> > > > > > #include <elf.h> > > > > > +#include <asm/csr.h> > > > > > #include <asm/encoding.h> > > > > > #include <generated/asm-offsets.h> > > > > > > > > > > @@ -45,6 +46,23 @@ _start: > > > > > /* mask all interrupts */ > > > > > csrw MODE_PREFIX(ie), zero > > > > > > > > > > +#ifdef CONFIG_SMP > > > > > + /* check if hart is within range */ > > > > > + /* s0: hart id */ > > > > > + li t0, CONFIG_NR_CPUS > > > > > + bge s0, t0, hart_out_of_bounds_loop > > > > > +#endif > > > > > + > > > > > +#ifdef CONFIG_SMP > > > > > + /* set xSIE bit to receive IPIs */ > > > > > +#ifdef CONFIG_RISCV_MMODE > > > > > + li t0, MIE_MSIE > > > > > +#else > > > > > + li t0, SIE_SSIE > > > > > +#endif > > > > > + csrs MODE_PREFIX(ie), t0 > > > > > +#endif > > > > > + > > > > > /* > > > > > * Set stackpointer in internal/ex RAM to call board_init_f > > > > > */ > > > > > @@ -56,7 +74,25 @@ call_board_init_f: > > > > > call_board_init_f_0: > > > > > mv a0, sp > > > > > jal board_init_f_alloc_reserve > > > > > + > > > > > + /* > > > > > + * Set global data pointer here for all harts, > > > > > uninitialized at this > > > > > + * point. > > > > > + */ > > > > > + mv gp, a0 > > > > > + > > > > > + /* setup stack */ > > > > > +#ifdef CONFIG_SMP > > > > > + /* s0: hart id */ > > > > > + slli t0, s0, CONFIG_STACK_SIZE_SHIFT > > > > > + sub sp, a0, t0 > > > > > +#else > > > > > mv sp, a0 > > > > > +#endif > > > > > + > > > > > + /* Continue on main hart, others branch to > > > > > secondary_hart_loop */ > > > > > + li t0, CONFIG_MAIN_HART > > > > > + bne s0, t0, secondary_hart_loop > > > > > > > > > > la t0, prior_stage_fdt_address > > > > > SREG s1, 0(t0) > > > > > @@ -95,7 +131,14 @@ relocate_code: > > > > > *Set up the stack > > > > > */ > > > > > stack_setup: > > > > > +#ifdef CONFIG_SMP > > > > > + /* s0: hart id */ > > > > > + slli t0, s0, CONFIG_STACK_SIZE_SHIFT > > > > > + sub sp, s2, t0 > > > > > +#else > > > > > mv sp, s2 > > > > > +#endif > > > > > + > > > > > la t0, _start > > > > > sub t6, s4, t0 /* t6 <- relocation > > > > > offset > > > > > */ > > > > > beq t0, s4, clear_bss /* skip relocation */ > > > > > @@ -175,13 +218,30 @@ clear_bss: > > > > > add t0, t0, t6 /* t0 <- rel > > > > > __bss_start in > > > > > RAM */ > > > > > la t1, __bss_end /* t1 <- rel __bss_end > > > > > in > > > > > FLASH */ > > > > > add t1, t1, t6 /* t1 <- rel __bss_end > > > > > in > > > > > RAM */ > > > > > - beq t0, t1, call_board_init_r > > > > > + beq t0, t1, relocate_secondary_harts > > > > > > > > > > clbss_l: > > > > > SREG zero, 0(t0) /* clear loop... */ > > > > > addi t0, t0, REGBYTES > > > > > bne t0, t1, clbss_l > > > > > > > > > > +relocate_secondary_harts: > > > > > +#ifdef CONFIG_SMP > > > > > + /* send relocation IPI */ > > > > > + la t0, secondary_hart_relocate > > > > > + add a0, t0, t6 > > > > > + > > > > > + /* store relocation offset */ > > > > > + mv s5, t6 > > > > > + > > > > > + mv a1, s2 > > > > > + mv a2, s3 > > > > > + jal smp_call_function > > > > > + > > > > > + /* restore relocation offset */ > > > > > + mv t6, s5 > > > > > +#endif > > > > > + > > > > > /* > > > > > * We are done. Do not return, instead branch to second part > > > > > of > > > > > board > > > > > * initialization, now running from RAM. > > > > > @@ -202,3 +262,43 @@ call_board_init_r: > > > > > * jump to it ... > > > > > */ > > > > > jr t4 /* jump to > > > > > board_init_r() > > > > > */ > > > > > + > > > > > +#ifdef CONFIG_SMP > > > > > +hart_out_of_bounds_loop: > > > > > + /* Harts in this loop are out of bounds, increase > > > > > CONFIG_NR_CPUS. */ > > > > > + wfi > > > > > + j hart_out_of_bounds_loop > > > > > +#endif > > > > > + > > > > > +#ifdef CONFIG_SMP > > > > > +/* SMP relocation entry */ > > > > > +secondary_hart_relocate: > > > > > + /* a1: new sp */ > > > > > + /* a2: new gd */ > > > > > + /* s0: hart id */ > > > > > + > > > > > + /* setup stack */ > > > > > + slli t0, s0, CONFIG_STACK_SIZE_SHIFT > > > > > + sub sp, a1, t0 > > > > > + > > > > > + /* update global data pointer */ > > > > > + mv gp, a2 > > > > > +#endif > > > > > + > > > > > +secondary_hart_loop: > > > > > + wfi > > > > > + > > > > > +#ifdef CONFIG_SMP > > > > > + csrr t0, MODE_PREFIX(ip) > > > > > +#ifdef CONFIG_RISCV_MMODE > > > > > + andi t0, t0, MIE_MSIE > > > > > +#else > > > > > + andi t0, t0, SIE_SSIE > > > > > +#endif > > > > > + beqz t0, secondary_hart_loop > > > > > + > > > > > + mv a0, s0 > > > > > + jal handle_ipi > > > > > > I found that s0 maybe corrupted after execute handle_ipi. > > > Because smp_function will be treated as a return function by > > > compiler, > > > so compiler will generate codes to execute restore after > > > smp_function(). > > > > > > But actually it is a no-return function. So there maybe no chance > > > to > > > execute > > > restore. And s0 will be corrupted somehow. > > > > > > The usage of s0 in v2 flow seem the same as v1. > > > So I reply mail in v1 patch. > > > > > > > Thanks for reporting this issue! > > What compiler version are you using? I never saw this issue with > > GCC > > 8.2.0, because optimization removes the ret following the call to > > smp_function(), it is called using jr. The registers are therefore > > restored before jumping to smp_function(). > > > > This system is probably too fragile. A simple solution would be to > > store the hart ID somewhere else, for example register tp. What do > > you > > think? > > I think you can also use scratch/mscratch (if it is not used anywhere > else). > You are right, sscratch and mscratch are also not currently used in U-Boot. Is there an advantage to use the scratch CSRs instead of tp? Thanks, Lukas
> -----Original Message----- > From: U-Boot <u-boot-bounces@lists.denx.de> On Behalf Of Auer, Lukas > Sent: Sunday, March 10, 2019 11:42 PM > To: anup@brainfault.org > Cc: rickchen36@gmail.com; baruch@tkos.co.il; sr@denx.de; > cmchen@andestech.com; greentime@andestech.com; schwab@suse.de; > palmer@sifive.com; agraf@suse.de; u-boot@lists.denx.de; > kito@andestech.com > Subject: Re: [U-Boot] [PATCH 5/7] riscv: add support for multi-hart systems > > On Sun, 2019-03-10 at 20:24 +0530, Anup Patel wrote: > > On Sun, Mar 10, 2019 at 7:28 PM Auer, Lukas > > <lukas.auer@aisec.fraunhofer.de> wrote: > > > Hi Rick, > > > > > > On Thu, 2019-03-07 at 17:30 +0800, Rick Chen wrote: > > > > Hi Lukas > > > > > > > > > > From: Lukas Auer [mailto:lukas.auer@aisec.fraunhofer.de] > > > > > > Sent: Tuesday, February 12, 2019 6:14 AM > > > > > > To: u-boot@lists.denx.de > > > > > > Cc: Atish Patra; Anup Patel; Bin Meng; Andreas Schwab; Palmer > > > > > > Dabbelt; Alexander Graf; Lukas Auer; Anup Patel; Rick Jian-Zhi > > > > > > Chen(陳建志); > > > > > > Baruch Siach; > > > > > > Stefan Roese > > > > > > Subject: [PATCH 5/7] riscv: add support for multi-hart systems > > > > > > > > > > > > On RISC-V, all harts boot independently. To be able to run on > > > > > > a multi-hart system, U-Boot must be extended with the > > > > > > functionality to manage all harts in the system. A new config > > > > > > option, CONFIG_MAIN_HART, is used to select the hart U-Boot > > > > > > runs on. All other harts are halted. > > > > > > U-Boot can delegate functions to them using > > > > > > smp_call_function(). > > > > > > > > > > > > Every hart has a valid pointer to the global data structure > > > > > > and a 8KiB stack by default. The stack size is set with > > > > > > CONFIG_STACK_SIZE_SHIFT. > > > > > > > > > > > > Signed-off-by: Lukas Auer <lukas.auer@aisec.fraunhofer.de> > > > > > > --- > > > > > > > > > > > > arch/riscv/Kconfig | 12 +++++ > > > > > > arch/riscv/cpu/start.S | 102 > > > > > > ++++++++++++++++++++++++++++++++++- > > > > > > arch/riscv/include/asm/csr.h | 1 + > > > > > > 3 files changed, 114 insertions(+), 1 deletion(-) > > > > > > > > > > > > diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig index > > > > > > 3a51339c4d..af8d0f8d67 > > > > > > 100644 > > > > > > --- a/arch/riscv/Kconfig > > > > > > +++ b/arch/riscv/Kconfig > > > > > > @@ -140,4 +140,16 @@ config SBI_IPI > > > > > > default y if RISCV_SMODE > > > > > > depends on SMP > > > > > > > > > > > > +config MAIN_HART > > > > > > + int "Main hart in system" > > > > > > + default 0 > > > > > > + help > > > > > > + Some SoCs include harts of various sizes, some of > > > > > > which > > > > > > might not > > > > > > + be suitable for running U-Boot. CONFIG_MAIN_HART is > > > > > > used > > > > > > to select > > > > > > + the hart U-Boot runs on. > > > > > > + > > > > > > +config STACK_SIZE_SHIFT > > > > > > + int > > > > > > + default 13 > > > > > > + > > > > > > endmenu > > > > > > diff --git a/arch/riscv/cpu/start.S b/arch/riscv/cpu/start.S > > > > > > index > > > > > > a30f6f7194..ce7230df37 100644 > > > > > > --- a/arch/riscv/cpu/start.S > > > > > > +++ b/arch/riscv/cpu/start.S > > > > > > @@ -13,6 +13,7 @@ > > > > > > #include <config.h> > > > > > > #include <common.h> > > > > > > #include <elf.h> > > > > > > +#include <asm/csr.h> > > > > > > #include <asm/encoding.h> > > > > > > #include <generated/asm-offsets.h> > > > > > > > > > > > > @@ -45,6 +46,23 @@ _start: > > > > > > /* mask all interrupts */ > > > > > > csrw MODE_PREFIX(ie), zero > > > > > > > > > > > > +#ifdef CONFIG_SMP > > > > > > + /* check if hart is within range */ > > > > > > + /* s0: hart id */ > > > > > > + li t0, CONFIG_NR_CPUS > > > > > > + bge s0, t0, hart_out_of_bounds_loop > > > > > > +#endif > > > > > > + > > > > > > +#ifdef CONFIG_SMP > > > > > > + /* set xSIE bit to receive IPIs */ #ifdef > > > > > > +CONFIG_RISCV_MMODE > > > > > > + li t0, MIE_MSIE > > > > > > +#else > > > > > > + li t0, SIE_SSIE > > > > > > +#endif > > > > > > + csrs MODE_PREFIX(ie), t0 > > > > > > +#endif > > > > > > + > > > > > > /* > > > > > > * Set stackpointer in internal/ex RAM to call board_init_f > > > > > > */ > > > > > > @@ -56,7 +74,25 @@ call_board_init_f: > > > > > > call_board_init_f_0: > > > > > > mv a0, sp > > > > > > jal board_init_f_alloc_reserve > > > > > > + > > > > > > + /* > > > > > > + * Set global data pointer here for all harts, > > > > > > uninitialized at this > > > > > > + * point. > > > > > > + */ > > > > > > + mv gp, a0 > > > > > > + > > > > > > + /* setup stack */ > > > > > > +#ifdef CONFIG_SMP > > > > > > + /* s0: hart id */ > > > > > > + slli t0, s0, CONFIG_STACK_SIZE_SHIFT > > > > > > + sub sp, a0, t0 > > > > > > +#else > > > > > > mv sp, a0 > > > > > > +#endif > > > > > > + > > > > > > + /* Continue on main hart, others branch to > > > > > > secondary_hart_loop */ > > > > > > + li t0, CONFIG_MAIN_HART > > > > > > + bne s0, t0, secondary_hart_loop > > > > > > > > > > > > la t0, prior_stage_fdt_address > > > > > > SREG s1, 0(t0) > > > > > > @@ -95,7 +131,14 @@ relocate_code: > > > > > > *Set up the stack > > > > > > */ > > > > > > stack_setup: > > > > > > +#ifdef CONFIG_SMP > > > > > > + /* s0: hart id */ > > > > > > + slli t0, s0, CONFIG_STACK_SIZE_SHIFT > > > > > > + sub sp, s2, t0 > > > > > > +#else > > > > > > mv sp, s2 > > > > > > +#endif > > > > > > + > > > > > > la t0, _start > > > > > > sub t6, s4, t0 /* t6 <- relocation > > > > > > offset > > > > > > */ > > > > > > beq t0, s4, clear_bss /* skip relocation */ > > > > > > @@ -175,13 +218,30 @@ clear_bss: > > > > > > add t0, t0, t6 /* t0 <- rel > > > > > > __bss_start in > > > > > > RAM */ > > > > > > la t1, __bss_end /* t1 <- rel __bss_end > > > > > > in > > > > > > FLASH */ > > > > > > add t1, t1, t6 /* t1 <- rel __bss_end > > > > > > in > > > > > > RAM */ > > > > > > - beq t0, t1, call_board_init_r > > > > > > + beq t0, t1, relocate_secondary_harts > > > > > > > > > > > > clbss_l: > > > > > > SREG zero, 0(t0) /* clear loop... */ > > > > > > addi t0, t0, REGBYTES > > > > > > bne t0, t1, clbss_l > > > > > > > > > > > > +relocate_secondary_harts: > > > > > > +#ifdef CONFIG_SMP > > > > > > + /* send relocation IPI */ > > > > > > + la t0, secondary_hart_relocate > > > > > > + add a0, t0, t6 > > > > > > + > > > > > > + /* store relocation offset */ > > > > > > + mv s5, t6 > > > > > > + > > > > > > + mv a1, s2 > > > > > > + mv a2, s3 > > > > > > + jal smp_call_function > > > > > > + > > > > > > + /* restore relocation offset */ > > > > > > + mv t6, s5 > > > > > > +#endif > > > > > > + > > > > > > /* > > > > > > * We are done. Do not return, instead branch to second part > > > > > > of board > > > > > > * initialization, now running from RAM. > > > > > > @@ -202,3 +262,43 @@ call_board_init_r: > > > > > > * jump to it ... > > > > > > */ > > > > > > jr t4 /* jump to > > > > > > board_init_r() > > > > > > */ > > > > > > + > > > > > > +#ifdef CONFIG_SMP > > > > > > +hart_out_of_bounds_loop: > > > > > > + /* Harts in this loop are out of bounds, increase > > > > > > CONFIG_NR_CPUS. */ > > > > > > + wfi > > > > > > + j hart_out_of_bounds_loop > > > > > > +#endif > > > > > > + > > > > > > +#ifdef CONFIG_SMP > > > > > > +/* SMP relocation entry */ > > > > > > +secondary_hart_relocate: > > > > > > + /* a1: new sp */ > > > > > > + /* a2: new gd */ > > > > > > + /* s0: hart id */ > > > > > > + > > > > > > + /* setup stack */ > > > > > > + slli t0, s0, CONFIG_STACK_SIZE_SHIFT > > > > > > + sub sp, a1, t0 > > > > > > + > > > > > > + /* update global data pointer */ > > > > > > + mv gp, a2 > > > > > > +#endif > > > > > > + > > > > > > +secondary_hart_loop: > > > > > > + wfi > > > > > > + > > > > > > +#ifdef CONFIG_SMP > > > > > > + csrr t0, MODE_PREFIX(ip) > > > > > > +#ifdef CONFIG_RISCV_MMODE > > > > > > + andi t0, t0, MIE_MSIE > > > > > > +#else > > > > > > + andi t0, t0, SIE_SSIE > > > > > > +#endif > > > > > > + beqz t0, secondary_hart_loop > > > > > > + > > > > > > + mv a0, s0 > > > > > > + jal handle_ipi > > > > > > > > I found that s0 maybe corrupted after execute handle_ipi. > > > > Because smp_function will be treated as a return function by > > > > compiler, so compiler will generate codes to execute restore after > > > > smp_function(). > > > > > > > > But actually it is a no-return function. So there maybe no chance > > > > to execute restore. And s0 will be corrupted somehow. > > > > > > > > The usage of s0 in v2 flow seem the same as v1. > > > > So I reply mail in v1 patch. > > > > > > > > > > Thanks for reporting this issue! > > > What compiler version are you using? I never saw this issue with GCC > > > 8.2.0, because optimization removes the ret following the call to > > > smp_function(), it is called using jr. The registers are therefore > > > restored before jumping to smp_function(). > > > > > > This system is probably too fragile. A simple solution would be to > > > store the hart ID somewhere else, for example register tp. What do > > > you think? > > > > I think you can also use scratch/mscratch (if it is not used anywhere > > else). > > > > You are right, sscratch and mscratch are also not currently used in U-Boot. Is > there an advantage to use the scratch CSRs instead of tp? No specific advantage as such. This was another possibility. I suggest you go ahead with TP because mscratch will be used by M-mode U-Boot when linked with OpenSBI as library. Regards, Anup
Hi Lukas Auer, Lukas <lukas.auer@aisec.fraunhofer.de> 於 2019年3月11日 週一 上午2:12寫道: > > On Sun, 2019-03-10 at 20:24 +0530, Anup Patel wrote: > > On Sun, Mar 10, 2019 at 7:28 PM Auer, Lukas > > <lukas.auer@aisec.fraunhofer.de> wrote: > > > Hi Rick, > > > > > > On Thu, 2019-03-07 at 17:30 +0800, Rick Chen wrote: > > > > Hi Lukas > > > > > > > > > > From: Lukas Auer [mailto:lukas.auer@aisec.fraunhofer.de] > > > > > > Sent: Tuesday, February 12, 2019 6:14 AM > > > > > > To: u-boot@lists.denx.de > > > > > > Cc: Atish Patra; Anup Patel; Bin Meng; Andreas Schwab; Palmer > > > > > > Dabbelt; > > > > > > Alexander Graf; Lukas Auer; Anup Patel; Rick Jian-Zhi > > > > > > Chen(陳建志); > > > > > > Baruch Siach; > > > > > > Stefan Roese > > > > > > Subject: [PATCH 5/7] riscv: add support for multi-hart > > > > > > systems > > > > > > > > > > > > On RISC-V, all harts boot independently. To be able to run on > > > > > > a > > > > > > multi-hart system, > > > > > > U-Boot must be extended with the functionality to manage all > > > > > > harts in the > > > > > > system. A new config option, CONFIG_MAIN_HART, is used to > > > > > > select > > > > > > the hart > > > > > > U-Boot runs on. All other harts are halted. > > > > > > U-Boot can delegate functions to them using > > > > > > smp_call_function(). > > > > > > > > > > > > Every hart has a valid pointer to the global data structure > > > > > > and a > > > > > > 8KiB stack by > > > > > > default. The stack size is set with CONFIG_STACK_SIZE_SHIFT. > > > > > > > > > > > > Signed-off-by: Lukas Auer <lukas.auer@aisec.fraunhofer.de> > > > > > > --- > > > > > > > > > > > > arch/riscv/Kconfig | 12 +++++ > > > > > > arch/riscv/cpu/start.S | 102 > > > > > > ++++++++++++++++++++++++++++++++++- > > > > > > arch/riscv/include/asm/csr.h | 1 + > > > > > > 3 files changed, 114 insertions(+), 1 deletion(-) > > > > > > > > > > > > diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig index > > > > > > 3a51339c4d..af8d0f8d67 > > > > > > 100644 > > > > > > --- a/arch/riscv/Kconfig > > > > > > +++ b/arch/riscv/Kconfig > > > > > > @@ -140,4 +140,16 @@ config SBI_IPI > > > > > > default y if RISCV_SMODE > > > > > > depends on SMP > > > > > > > > > > > > +config MAIN_HART > > > > > > + int "Main hart in system" > > > > > > + default 0 > > > > > > + help > > > > > > + Some SoCs include harts of various sizes, some of > > > > > > which > > > > > > might not > > > > > > + be suitable for running U-Boot. CONFIG_MAIN_HART is > > > > > > used > > > > > > to select > > > > > > + the hart U-Boot runs on. > > > > > > + > > > > > > +config STACK_SIZE_SHIFT > > > > > > + int > > > > > > + default 13 > > > > > > + > > > > > > endmenu > > > > > > diff --git a/arch/riscv/cpu/start.S b/arch/riscv/cpu/start.S > > > > > > index > > > > > > a30f6f7194..ce7230df37 100644 > > > > > > --- a/arch/riscv/cpu/start.S > > > > > > +++ b/arch/riscv/cpu/start.S > > > > > > @@ -13,6 +13,7 @@ > > > > > > #include <config.h> > > > > > > #include <common.h> > > > > > > #include <elf.h> > > > > > > +#include <asm/csr.h> > > > > > > #include <asm/encoding.h> > > > > > > #include <generated/asm-offsets.h> > > > > > > > > > > > > @@ -45,6 +46,23 @@ _start: > > > > > > /* mask all interrupts */ > > > > > > csrw MODE_PREFIX(ie), zero > > > > > > > > > > > > +#ifdef CONFIG_SMP > > > > > > + /* check if hart is within range */ > > > > > > + /* s0: hart id */ > > > > > > + li t0, CONFIG_NR_CPUS > > > > > > + bge s0, t0, hart_out_of_bounds_loop > > > > > > +#endif > > > > > > + > > > > > > +#ifdef CONFIG_SMP > > > > > > + /* set xSIE bit to receive IPIs */ > > > > > > +#ifdef CONFIG_RISCV_MMODE > > > > > > + li t0, MIE_MSIE > > > > > > +#else > > > > > > + li t0, SIE_SSIE > > > > > > +#endif > > > > > > + csrs MODE_PREFIX(ie), t0 > > > > > > +#endif > > > > > > + > > > > > > /* > > > > > > * Set stackpointer in internal/ex RAM to call board_init_f > > > > > > */ > > > > > > @@ -56,7 +74,25 @@ call_board_init_f: > > > > > > call_board_init_f_0: > > > > > > mv a0, sp > > > > > > jal board_init_f_alloc_reserve > > > > > > + > > > > > > + /* > > > > > > + * Set global data pointer here for all harts, > > > > > > uninitialized at this > > > > > > + * point. > > > > > > + */ > > > > > > + mv gp, a0 > > > > > > + > > > > > > + /* setup stack */ > > > > > > +#ifdef CONFIG_SMP > > > > > > + /* s0: hart id */ > > > > > > + slli t0, s0, CONFIG_STACK_SIZE_SHIFT > > > > > > + sub sp, a0, t0 > > > > > > +#else > > > > > > mv sp, a0 > > > > > > +#endif > > > > > > + > > > > > > + /* Continue on main hart, others branch to > > > > > > secondary_hart_loop */ > > > > > > + li t0, CONFIG_MAIN_HART > > > > > > + bne s0, t0, secondary_hart_loop > > > > > > > > > > > > la t0, prior_stage_fdt_address > > > > > > SREG s1, 0(t0) > > > > > > @@ -95,7 +131,14 @@ relocate_code: > > > > > > *Set up the stack > > > > > > */ > > > > > > stack_setup: > > > > > > +#ifdef CONFIG_SMP > > > > > > + /* s0: hart id */ > > > > > > + slli t0, s0, CONFIG_STACK_SIZE_SHIFT > > > > > > + sub sp, s2, t0 > > > > > > +#else > > > > > > mv sp, s2 > > > > > > +#endif > > > > > > + > > > > > > la t0, _start > > > > > > sub t6, s4, t0 /* t6 <- relocation > > > > > > offset > > > > > > */ > > > > > > beq t0, s4, clear_bss /* skip relocation */ > > > > > > @@ -175,13 +218,30 @@ clear_bss: > > > > > > add t0, t0, t6 /* t0 <- rel > > > > > > __bss_start in > > > > > > RAM */ > > > > > > la t1, __bss_end /* t1 <- rel __bss_end > > > > > > in > > > > > > FLASH */ > > > > > > add t1, t1, t6 /* t1 <- rel __bss_end > > > > > > in > > > > > > RAM */ > > > > > > - beq t0, t1, call_board_init_r > > > > > > + beq t0, t1, relocate_secondary_harts > > > > > > > > > > > > clbss_l: > > > > > > SREG zero, 0(t0) /* clear loop... */ > > > > > > addi t0, t0, REGBYTES > > > > > > bne t0, t1, clbss_l > > > > > > > > > > > > +relocate_secondary_harts: > > > > > > +#ifdef CONFIG_SMP > > > > > > + /* send relocation IPI */ > > > > > > + la t0, secondary_hart_relocate > > > > > > + add a0, t0, t6 > > > > > > + > > > > > > + /* store relocation offset */ > > > > > > + mv s5, t6 > > > > > > + > > > > > > + mv a1, s2 > > > > > > + mv a2, s3 > > > > > > + jal smp_call_function > > > > > > + > > > > > > + /* restore relocation offset */ > > > > > > + mv t6, s5 > > > > > > +#endif > > > > > > + > > > > > > /* > > > > > > * We are done. Do not return, instead branch to second part > > > > > > of > > > > > > board > > > > > > * initialization, now running from RAM. > > > > > > @@ -202,3 +262,43 @@ call_board_init_r: > > > > > > * jump to it ... > > > > > > */ > > > > > > jr t4 /* jump to > > > > > > board_init_r() > > > > > > */ > > > > > > + > > > > > > +#ifdef CONFIG_SMP > > > > > > +hart_out_of_bounds_loop: > > > > > > + /* Harts in this loop are out of bounds, increase > > > > > > CONFIG_NR_CPUS. */ > > > > > > + wfi > > > > > > + j hart_out_of_bounds_loop > > > > > > +#endif > > > > > > + > > > > > > +#ifdef CONFIG_SMP > > > > > > +/* SMP relocation entry */ > > > > > > +secondary_hart_relocate: > > > > > > + /* a1: new sp */ > > > > > > + /* a2: new gd */ > > > > > > + /* s0: hart id */ > > > > > > + > > > > > > + /* setup stack */ > > > > > > + slli t0, s0, CONFIG_STACK_SIZE_SHIFT > > > > > > + sub sp, a1, t0 > > > > > > + > > > > > > + /* update global data pointer */ > > > > > > + mv gp, a2 > > > > > > +#endif > > > > > > + > > > > > > +secondary_hart_loop: > > > > > > + wfi > > > > > > + > > > > > > +#ifdef CONFIG_SMP > > > > > > + csrr t0, MODE_PREFIX(ip) > > > > > > +#ifdef CONFIG_RISCV_MMODE > > > > > > + andi t0, t0, MIE_MSIE > > > > > > +#else > > > > > > + andi t0, t0, SIE_SSIE > > > > > > +#endif > > > > > > + beqz t0, secondary_hart_loop > > > > > > + > > > > > > + mv a0, s0 > > > > > > + jal handle_ipi > > > > > > > > I found that s0 maybe corrupted after execute handle_ipi. > > > > Because smp_function will be treated as a return function by > > > > compiler, > > > > so compiler will generate codes to execute restore after > > > > smp_function(). > > > > > > > > But actually it is a no-return function. So there maybe no chance > > > > to > > > > execute > > > > restore. And s0 will be corrupted somehow. > > > > > > > > The usage of s0 in v2 flow seem the same as v1. > > > > So I reply mail in v1 patch. > > > > > > > > > > Thanks for reporting this issue! > > > What compiler version are you using? I never saw this issue with > > > GCC > > > 8.2.0, because optimization removes the ret following the call to > > > smp_function(), it is called using jr. The registers are therefore > > > restored before jumping to smp_function(). > > > I found this issue with Andestech's toolchain (GCC version is 7.3.0) And it's code gen will be as below: 00000e94 <handle_ipi>: e94: 4785 c.li a5,1 e96: 04a7e663 bltu a5,a0,ee2 <handle_ipi+0x4e> e9a: 456362ef jal t0,372f0 <__riscv_save_0> e9e: 84aa c.mv s1,a0 ea0: 3539 c.jal cae <riscv_clear_ipi> ea2: cd19 c.beqz a0,ec0 <handle_ipi+0x2c> ea4: 0003d517 auipc a0,0x3d ea8: 7c450513 addi a0,a0,1988 # 3e668 <freqs.8638+0x430> eac: 17c320ef jal ra,33028 <printf> eb0: 0003d517 auipc a0,0x3d eb4: 7b850513 addi a0,a0,1976 # 3e668 <freqs.8638+0x430> eb8: 170320ef jal ra,33028 <printf> ebc: 4583606f j 37314 <__riscv_restore_0> ec0: 47b1 c.li a5,12 ec2: 02f48433 mul s0,s1,a5 s0 : 1 => 0xc ec6: 003407b3 add a5,s0,gp eca: 0bc7a903 lw s2,188(a5) # 800000bc <__bss_end+0x7ff8f718> ece: 3b21 c.jal be6 <invalidate_icache_all> ed0: 008187b3 add a5,gp,s0 ed4: 0c47a603 lw a2,196(a5) ed8: 0c07a583 lw a1,192(a5) edc: 8526 c.mv a0,s1 ede: 9902 c.jalr s2 s2=0x3ff8f194 , s0=0xcsou ee0: bff1 c.j ebc <handle_ipi+0x28> ee2: 8082 c.jr ra As I said last mail, compiler will treat smp_function() as a return call, so restore() will be executed after it. This behavior shall comply with the calling convention. Thanks for your understanding and tolerant. > > > This system is probably too fragile. A simple solution would be to > > > store the hart ID somewhere else, for example register tp. What do > > > you > > > think? > > > > I think you can also use scratch/mscratch (if it is not used anywhere > > else). > > > > You are right, sscratch and mscratch are also not currently used in > U-Boot. Is there an advantage to use the scratch CSRs instead of tp? > I will prefer to use tp. xscratch shall consider mode distinguish additionally. Rick > Thanks, > Lukas
Hi Rick, On Tue, 2019-03-12 at 09:15 +0800, Rick Chen wrote: > Hi Lukas > > Auer, Lukas <lukas.auer@aisec.fraunhofer.de> 於 2019年3月11日 週一 > 上午2:12寫道: > > On Sun, 2019-03-10 at 20:24 +0530, Anup Patel wrote: > > > On Sun, Mar 10, 2019 at 7:28 PM Auer, Lukas > > > <lukas.auer@aisec.fraunhofer.de> wrote: > > > > Hi Rick, > > > > > > > > On Thu, 2019-03-07 at 17:30 +0800, Rick Chen wrote: > > > > > Hi Lukas > > > > > > > > > > > > From: Lukas Auer [mailto:lukas.auer@aisec.fraunhofer.de] > > > > > > > Sent: Tuesday, February 12, 2019 6:14 AM > > > > > > > To: u-boot@lists.denx.de > > > > > > > Cc: Atish Patra; Anup Patel; Bin Meng; Andreas Schwab; > > > > > > > Palmer > > > > > > > Dabbelt; > > > > > > > Alexander Graf; Lukas Auer; Anup Patel; Rick Jian-Zhi > > > > > > > Chen(陳建志); > > > > > > > Baruch Siach; > > > > > > > Stefan Roese > > > > > > > Subject: [PATCH 5/7] riscv: add support for multi-hart > > > > > > > systems > > > > > > > > > > > > > > On RISC-V, all harts boot independently. To be able to > > > > > > > run on > > > > > > > a > > > > > > > multi-hart system, > > > > > > > U-Boot must be extended with the functionality to manage > > > > > > > all > > > > > > > harts in the > > > > > > > system. A new config option, CONFIG_MAIN_HART, is used to > > > > > > > select > > > > > > > the hart > > > > > > > U-Boot runs on. All other harts are halted. > > > > > > > U-Boot can delegate functions to them using > > > > > > > smp_call_function(). > > > > > > > > > > > > > > Every hart has a valid pointer to the global data > > > > > > > structure > > > > > > > and a > > > > > > > 8KiB stack by > > > > > > > default. The stack size is set with > > > > > > > CONFIG_STACK_SIZE_SHIFT. > > > > > > > > > > > > > > Signed-off-by: Lukas Auer <lukas.auer@aisec.fraunhofer.de > > > > > > > > > > > > > > > --- > > > > > > > > > > > > > > arch/riscv/Kconfig | 12 +++++ > > > > > > > arch/riscv/cpu/start.S | 102 > > > > > > > ++++++++++++++++++++++++++++++++++- > > > > > > > arch/riscv/include/asm/csr.h | 1 + > > > > > > > 3 files changed, 114 insertions(+), 1 deletion(-) > > > > > > > > > > > > > > diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig > > > > > > > index > > > > > > > 3a51339c4d..af8d0f8d67 > > > > > > > 100644 > > > > > > > --- a/arch/riscv/Kconfig > > > > > > > +++ b/arch/riscv/Kconfig > > > > > > > @@ -140,4 +140,16 @@ config SBI_IPI > > > > > > > default y if RISCV_SMODE > > > > > > > depends on SMP > > > > > > > > > > > > > > +config MAIN_HART > > > > > > > + int "Main hart in system" > > > > > > > + default 0 > > > > > > > + help > > > > > > > + Some SoCs include harts of various sizes, some of > > > > > > > which > > > > > > > might not > > > > > > > + be suitable for running U-Boot. CONFIG_MAIN_HART > > > > > > > is > > > > > > > used > > > > > > > to select > > > > > > > + the hart U-Boot runs on. > > > > > > > + > > > > > > > +config STACK_SIZE_SHIFT > > > > > > > + int > > > > > > > + default 13 > > > > > > > + > > > > > > > endmenu > > > > > > > diff --git a/arch/riscv/cpu/start.S > > > > > > > b/arch/riscv/cpu/start.S > > > > > > > index > > > > > > > a30f6f7194..ce7230df37 100644 > > > > > > > --- a/arch/riscv/cpu/start.S > > > > > > > +++ b/arch/riscv/cpu/start.S > > > > > > > @@ -13,6 +13,7 @@ > > > > > > > #include <config.h> > > > > > > > #include <common.h> > > > > > > > #include <elf.h> > > > > > > > +#include <asm/csr.h> > > > > > > > #include <asm/encoding.h> > > > > > > > #include <generated/asm-offsets.h> > > > > > > > > > > > > > > @@ -45,6 +46,23 @@ _start: > > > > > > > /* mask all interrupts */ > > > > > > > csrw MODE_PREFIX(ie), zero > > > > > > > > > > > > > > +#ifdef CONFIG_SMP > > > > > > > + /* check if hart is within range */ > > > > > > > + /* s0: hart id */ > > > > > > > + li t0, CONFIG_NR_CPUS > > > > > > > + bge s0, t0, hart_out_of_bounds_loop > > > > > > > +#endif > > > > > > > + > > > > > > > +#ifdef CONFIG_SMP > > > > > > > + /* set xSIE bit to receive IPIs */ > > > > > > > +#ifdef CONFIG_RISCV_MMODE > > > > > > > + li t0, MIE_MSIE > > > > > > > +#else > > > > > > > + li t0, SIE_SSIE > > > > > > > +#endif > > > > > > > + csrs MODE_PREFIX(ie), t0 > > > > > > > +#endif > > > > > > > + > > > > > > > /* > > > > > > > * Set stackpointer in internal/ex RAM to call > > > > > > > board_init_f > > > > > > > */ > > > > > > > @@ -56,7 +74,25 @@ call_board_init_f: > > > > > > > call_board_init_f_0: > > > > > > > mv a0, sp > > > > > > > jal board_init_f_alloc_reserve > > > > > > > + > > > > > > > + /* > > > > > > > + * Set global data pointer here for all harts, > > > > > > > uninitialized at this > > > > > > > + * point. > > > > > > > + */ > > > > > > > + mv gp, a0 > > > > > > > + > > > > > > > + /* setup stack */ > > > > > > > +#ifdef CONFIG_SMP > > > > > > > + /* s0: hart id */ > > > > > > > + slli t0, s0, CONFIG_STACK_SIZE_SHIFT > > > > > > > + sub sp, a0, t0 > > > > > > > +#else > > > > > > > mv sp, a0 > > > > > > > +#endif > > > > > > > + > > > > > > > + /* Continue on main hart, others branch to > > > > > > > secondary_hart_loop */ > > > > > > > + li t0, CONFIG_MAIN_HART > > > > > > > + bne s0, t0, secondary_hart_loop > > > > > > > > > > > > > > la t0, prior_stage_fdt_address > > > > > > > SREG s1, 0(t0) > > > > > > > @@ -95,7 +131,14 @@ relocate_code: > > > > > > > *Set up the stack > > > > > > > */ > > > > > > > stack_setup: > > > > > > > +#ifdef CONFIG_SMP > > > > > > > + /* s0: hart id */ > > > > > > > + slli t0, s0, CONFIG_STACK_SIZE_SHIFT > > > > > > > + sub sp, s2, t0 > > > > > > > +#else > > > > > > > mv sp, s2 > > > > > > > +#endif > > > > > > > + > > > > > > > la t0, _start > > > > > > > sub t6, s4, t0 /* t6 <- relocation > > > > > > > offset > > > > > > > */ > > > > > > > beq t0, s4, clear_bss /* skip relocation > > > > > > > */ > > > > > > > @@ -175,13 +218,30 @@ clear_bss: > > > > > > > add t0, t0, t6 /* t0 <- rel > > > > > > > __bss_start in > > > > > > > RAM */ > > > > > > > la t1, __bss_end /* t1 <- rel > > > > > > > __bss_end > > > > > > > in > > > > > > > FLASH */ > > > > > > > add t1, t1, t6 /* t1 <- rel > > > > > > > __bss_end > > > > > > > in > > > > > > > RAM */ > > > > > > > - beq t0, t1, call_board_init_r > > > > > > > + beq t0, t1, relocate_secondary_harts > > > > > > > > > > > > > > clbss_l: > > > > > > > SREG zero, 0(t0) /* clear loop... */ > > > > > > > addi t0, t0, REGBYTES > > > > > > > bne t0, t1, clbss_l > > > > > > > > > > > > > > +relocate_secondary_harts: > > > > > > > +#ifdef CONFIG_SMP > > > > > > > + /* send relocation IPI */ > > > > > > > + la t0, secondary_hart_relocate > > > > > > > + add a0, t0, t6 > > > > > > > + > > > > > > > + /* store relocation offset */ > > > > > > > + mv s5, t6 > > > > > > > + > > > > > > > + mv a1, s2 > > > > > > > + mv a2, s3 > > > > > > > + jal smp_call_function > > > > > > > + > > > > > > > + /* restore relocation offset */ > > > > > > > + mv t6, s5 > > > > > > > +#endif > > > > > > > + > > > > > > > /* > > > > > > > * We are done. Do not return, instead branch to second > > > > > > > part > > > > > > > of > > > > > > > board > > > > > > > * initialization, now running from RAM. > > > > > > > @@ -202,3 +262,43 @@ call_board_init_r: > > > > > > > * jump to it ... > > > > > > > */ > > > > > > > jr t4 /* jump to > > > > > > > board_init_r() > > > > > > > */ > > > > > > > + > > > > > > > +#ifdef CONFIG_SMP > > > > > > > +hart_out_of_bounds_loop: > > > > > > > + /* Harts in this loop are out of bounds, increase > > > > > > > CONFIG_NR_CPUS. */ > > > > > > > + wfi > > > > > > > + j hart_out_of_bounds_loop > > > > > > > +#endif > > > > > > > + > > > > > > > +#ifdef CONFIG_SMP > > > > > > > +/* SMP relocation entry */ > > > > > > > +secondary_hart_relocate: > > > > > > > + /* a1: new sp */ > > > > > > > + /* a2: new gd */ > > > > > > > + /* s0: hart id */ > > > > > > > + > > > > > > > + /* setup stack */ > > > > > > > + slli t0, s0, CONFIG_STACK_SIZE_SHIFT > > > > > > > + sub sp, a1, t0 > > > > > > > + > > > > > > > + /* update global data pointer */ > > > > > > > + mv gp, a2 > > > > > > > +#endif > > > > > > > + > > > > > > > +secondary_hart_loop: > > > > > > > + wfi > > > > > > > + > > > > > > > +#ifdef CONFIG_SMP > > > > > > > + csrr t0, MODE_PREFIX(ip) > > > > > > > +#ifdef CONFIG_RISCV_MMODE > > > > > > > + andi t0, t0, MIE_MSIE > > > > > > > +#else > > > > > > > + andi t0, t0, SIE_SSIE > > > > > > > +#endif > > > > > > > + beqz t0, secondary_hart_loop > > > > > > > + > > > > > > > + mv a0, s0 > > > > > > > + jal handle_ipi > > > > > > > > > > I found that s0 maybe corrupted after execute handle_ipi. > > > > > Because smp_function will be treated as a return function by > > > > > compiler, > > > > > so compiler will generate codes to execute restore after > > > > > smp_function(). > > > > > > > > > > But actually it is a no-return function. So there maybe no > > > > > chance > > > > > to > > > > > execute > > > > > restore. And s0 will be corrupted somehow. > > > > > > > > > > The usage of s0 in v2 flow seem the same as v1. > > > > > So I reply mail in v1 patch. > > > > > > > > > > > > > Thanks for reporting this issue! > > > > What compiler version are you using? I never saw this issue > > > > with > > > > GCC > > > > 8.2.0, because optimization removes the ret following the call > > > > to > > > > smp_function(), it is called using jr. The registers are > > > > therefore > > > > restored before jumping to smp_function(). > > > > > > I found this issue with Andestech's toolchain (GCC version is 7.3.0) > And it's code gen will be as below: > > 00000e94 <handle_ipi>: > e94: 4785 c.li a5,1 > e96: 04a7e663 bltu a5,a0,ee2 > <handle_ipi+0x4e> > e9a: 456362ef jal t0,372f0 > <__riscv_save_0> > e9e: 84aa c.mv s1,a0 > ea0: 3539 c.jal cae <riscv_clear_ipi> > ea2: cd19 c.beqz a0,ec0 > <handle_ipi+0x2c> > ea4: 0003d517 auipc a0,0x3d > ea8: 7c450513 addi a0,a0,1988 # 3e668 > <freqs.8638+0x430> > eac: 17c320ef jal ra,33028 <printf> > eb0: 0003d517 auipc a0,0x3d > eb4: 7b850513 addi a0,a0,1976 # 3e668 > <freqs.8638+0x430> > eb8: 170320ef jal ra,33028 <printf> > ebc: 4583606f j 37314 > <__riscv_restore_0> > ec0: 47b1 c.li a5,12 > ec2: 02f48433 mul s0,s1,a5 s0 : 1 => > 0xc > ec6: 003407b3 add a5,s0,gp > eca: 0bc7a903 lw s2,188(a5) # 800000bc > <__bss_end+0x7ff8f718> > ece: 3b21 c.jal be6 > <invalidate_icache_all> > ed0: 008187b3 add a5,gp,s0 > ed4: 0c47a603 lw a2,196(a5) > ed8: 0c07a583 lw a1,192(a5) > edc: 8526 c.mv a0,s1 > ede: 9902 c.jalr s2 s2=0x3ff8f194 , > s0=0xcsou > ee0: bff1 c.j ebc <handle_ipi+0x28> > ee2: 8082 c.jr ra > > As I said last mail, compiler will treat smp_function() as a return > call, so restore() will be executed after it. > This behavior shall comply with the calling convention. > > Thanks for your understanding and tolerant. No problem, I was just asking out of interest. Thanks for providing the toolchain output above! > > > > > This system is probably too fragile. A simple solution would be > > > > to > > > > store the hart ID somewhere else, for example register tp. What > > > > do > > > > you > > > > think? > > > > > > I think you can also use scratch/mscratch (if it is not used > > > anywhere > > > else). > > > > > > > You are right, sscratch and mscratch are also not currently used in > > U-Boot. Is there an advantage to use the scratch CSRs instead of > > tp? > > > > I will prefer to use tp. > xscratch shall consider mode distinguish additionally. > Ok, I will update my series and use tp instead of s0. Thanks, Lukas
diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig index 3a51339c4d..af8d0f8d67 100644 --- a/arch/riscv/Kconfig +++ b/arch/riscv/Kconfig @@ -140,4 +140,16 @@ config SBI_IPI default y if RISCV_SMODE depends on SMP +config MAIN_HART + int "Main hart in system" + default 0 + help + Some SoCs include harts of various sizes, some of which might not + be suitable for running U-Boot. CONFIG_MAIN_HART is used to select + the hart U-Boot runs on. + +config STACK_SIZE_SHIFT + int + default 13 + endmenu diff --git a/arch/riscv/cpu/start.S b/arch/riscv/cpu/start.S index a30f6f7194..ce7230df37 100644 --- a/arch/riscv/cpu/start.S +++ b/arch/riscv/cpu/start.S @@ -13,6 +13,7 @@ #include <config.h> #include <common.h> #include <elf.h> +#include <asm/csr.h> #include <asm/encoding.h> #include <generated/asm-offsets.h> @@ -45,6 +46,23 @@ _start: /* mask all interrupts */ csrw MODE_PREFIX(ie), zero +#ifdef CONFIG_SMP + /* check if hart is within range */ + /* s0: hart id */ + li t0, CONFIG_NR_CPUS + bge s0, t0, hart_out_of_bounds_loop +#endif + +#ifdef CONFIG_SMP + /* set xSIE bit to receive IPIs */ +#ifdef CONFIG_RISCV_MMODE + li t0, MIE_MSIE +#else + li t0, SIE_SSIE +#endif + csrs MODE_PREFIX(ie), t0 +#endif + /* * Set stackpointer in internal/ex RAM to call board_init_f */ @@ -56,7 +74,25 @@ call_board_init_f: call_board_init_f_0: mv a0, sp jal board_init_f_alloc_reserve + + /* + * Set global data pointer here for all harts, uninitialized at this + * point. + */ + mv gp, a0 + + /* setup stack */ +#ifdef CONFIG_SMP + /* s0: hart id */ + slli t0, s0, CONFIG_STACK_SIZE_SHIFT + sub sp, a0, t0 +#else mv sp, a0 +#endif + + /* Continue on main hart, others branch to secondary_hart_loop */ + li t0, CONFIG_MAIN_HART + bne s0, t0, secondary_hart_loop la t0, prior_stage_fdt_address SREG s1, 0(t0) @@ -95,7 +131,14 @@ relocate_code: *Set up the stack */ stack_setup: +#ifdef CONFIG_SMP + /* s0: hart id */ + slli t0, s0, CONFIG_STACK_SIZE_SHIFT + sub sp, s2, t0 +#else mv sp, s2 +#endif + la t0, _start sub t6, s4, t0 /* t6 <- relocation offset */ beq t0, s4, clear_bss /* skip relocation */ @@ -175,13 +218,30 @@ clear_bss: add t0, t0, t6 /* t0 <- rel __bss_start in RAM */ la t1, __bss_end /* t1 <- rel __bss_end in FLASH */ add t1, t1, t6 /* t1 <- rel __bss_end in RAM */ - beq t0, t1, call_board_init_r + beq t0, t1, relocate_secondary_harts clbss_l: SREG zero, 0(t0) /* clear loop... */ addi t0, t0, REGBYTES bne t0, t1, clbss_l +relocate_secondary_harts: +#ifdef CONFIG_SMP + /* send relocation IPI */ + la t0, secondary_hart_relocate + add a0, t0, t6 + + /* store relocation offset */ + mv s5, t6 + + mv a1, s2 + mv a2, s3 + jal smp_call_function + + /* restore relocation offset */ + mv t6, s5 +#endif + /* * We are done. Do not return, instead branch to second part of board * initialization, now running from RAM. @@ -202,3 +262,43 @@ call_board_init_r: * jump to it ... */ jr t4 /* jump to board_init_r() */ + +#ifdef CONFIG_SMP +hart_out_of_bounds_loop: + /* Harts in this loop are out of bounds, increase CONFIG_NR_CPUS. */ + wfi + j hart_out_of_bounds_loop +#endif + +#ifdef CONFIG_SMP +/* SMP relocation entry */ +secondary_hart_relocate: + /* a1: new sp */ + /* a2: new gd */ + /* s0: hart id */ + + /* setup stack */ + slli t0, s0, CONFIG_STACK_SIZE_SHIFT + sub sp, a1, t0 + + /* update global data pointer */ + mv gp, a2 +#endif + +secondary_hart_loop: + wfi + +#ifdef CONFIG_SMP + csrr t0, MODE_PREFIX(ip) +#ifdef CONFIG_RISCV_MMODE + andi t0, t0, MIE_MSIE +#else + andi t0, t0, SIE_SSIE +#endif + beqz t0, secondary_hart_loop + + mv a0, s0 + jal handle_ipi +#endif + + j secondary_hart_loop diff --git a/arch/riscv/include/asm/csr.h b/arch/riscv/include/asm/csr.h index 86136f542c..644e6baa15 100644 --- a/arch/riscv/include/asm/csr.h +++ b/arch/riscv/include/asm/csr.h @@ -46,6 +46,7 @@ #endif /* Interrupt Enable and Interrupt Pending flags */ +#define MIE_MSIE _AC(0x00000008, UL) /* Software Interrupt Enable */ #define SIE_SSIE _AC(0x00000002, UL) /* Software Interrupt Enable */ #define SIE_STIE _AC(0x00000020, UL) /* Timer Interrupt Enable */
On RISC-V, all harts boot independently. To be able to run on a multi-hart system, U-Boot must be extended with the functionality to manage all harts in the system. A new config option, CONFIG_MAIN_HART, is used to select the hart U-Boot runs on. All other harts are halted. U-Boot can delegate functions to them using smp_call_function(). Every hart has a valid pointer to the global data structure and a 8KiB stack by default. The stack size is set with CONFIG_STACK_SIZE_SHIFT. Signed-off-by: Lukas Auer <lukas.auer@aisec.fraunhofer.de> --- arch/riscv/Kconfig | 12 +++++ arch/riscv/cpu/start.S | 102 ++++++++++++++++++++++++++++++++++- arch/riscv/include/asm/csr.h | 1 + 3 files changed, 114 insertions(+), 1 deletion(-)