Message ID | 20220511120541.2242797-1-shorne@gmail.com |
---|---|
State | New |
Headers | show |
Series | target/openrisc: Do not reset delay slot flag on early tb exit | expand |
On 5/11/22 05:05, Stafford Horne wrote: > +static void openrisc_cpu_synchronize_from_tb(CPUState *cs, > + const TranslationBlock *tb) > +{ > + OpenRISCCPU *cpu = OPENRISC_CPU(cs); > + > + cpu->env.pc = tb->pc; > +} If mips is a guide, you'd want to set dflag based on tb->flags & TB_FLAGS_DFLAG as well. But I think openrisc is more careful to keep dflag up-to-date. Reviewed-by: Richard Henderson <richard.henderson@linaro.org> r~
On Wed, May 11, 2022 at 07:32:58AM -0700, Richard Henderson wrote: > On 5/11/22 05:05, Stafford Horne wrote: > > +static void openrisc_cpu_synchronize_from_tb(CPUState *cs, > > + const TranslationBlock *tb) > > +{ > > + OpenRISCCPU *cpu = OPENRISC_CPU(cs); > > + > > + cpu->env.pc = tb->pc; > > +} > > If mips is a guide, you'd want to set dflag based on > > tb->flags & TB_FLAGS_DFLAG > > as well. But I think openrisc is more careful to keep dflag up-to-date. I was thinking that too so I left it out. For example: 0xc01e3ffc: l.bf 3 0xc01e4000: l.ori r12, r0, 1 --- ---- c01e3ffc 00000000 movcond_i32 jmp_pc,sr_f,$0x0,$0xc01e4008,$0xc01e4004,ne sync: 0 dead: 0 1 2 3 4 pref=0xffff mov_i32 dflag,$0x1 sync: 0 dead: 0 1 pref=0xffff mov_i32 ppc,$0xc01e3ffc sync: 0 dead: 0 1 pref=0xffff mov_i32 pc,$0xc01e4000 sync: 0 dead: 0 1 pref=0xffff call lookup_tb_ptr,$0x6,$1,tmp7,env dead: 1 pref=none goto_ptr tmp7 dead: 0 set_label $L0 exit_tb $0x7f7b047f3b43 --- ld_i32 tmp0,env,$0xfffffffffffffff0 pref=0xffff brcond_i32 tmp0,$0x0,lt,$L0 dead: 0 ---- c01e4000 00000001 mov_i32 r12,$0x1 sync: 0 dead: 0 1 pref=0xffff mov_i32 dflag,$0x0 sync: 0 dead: 0 1 pref=0xffff mov_i32 ppc,$0xc01e4000 sync: 0 dead: 0 1 pref=0xffff mov_i32 pc,jmp_pc sync: 0 dead: 0 1 pref=0xffff discard jmp_pc pref=none call lookup_tb_ptr,$0x6,$1,tmp4,env dead: 1 pref=none goto_ptr tmp4 dead: 0 set_label $L0 exit_tb $0x7f7b047f3c83 This is an example of a branch followed by a branch delay slot. If we exit the branch delay slot via `exit_tb $0x7f7b047f3c83`. The `mov_i32 dflag,$0x1` instruction would have run from `c01e3ffc` having env already updated. At this point how would tb->flags have the right value? Would it always be set correctly by `cpu_get_tb_cpu_state` in the `lookup_tb_ptr`call? -Stafford
On 5/11/22 14:43, Stafford Horne wrote: > At this point how would tb->flags have the right value? Would it always be set > correctly by `cpu_get_tb_cpu_state` in the `lookup_tb_ptr`call? Well, it would be set by cpu_get_tb_cpu_state in cpu_exec, which is then passed to tb_gen_code. If we go around a loop and look it up a second time, we'll find a tb with a matching set of flags. r~
On Wed, May 11, 2022 at 02:56:37PM -0700, Richard Henderson wrote: > On 5/11/22 14:43, Stafford Horne wrote: > > At this point how would tb->flags have the right value? Would it always be set > > correctly by `cpu_get_tb_cpu_state` in the `lookup_tb_ptr`call? > > Well, it would be set by cpu_get_tb_cpu_state in cpu_exec, which is then > passed to tb_gen_code. If we go around a loop and look it up a second time, > we'll find a tb with a matching set of flags. Right, cpu_get_tb_cpu_state called in lookup_tb_ptr will not update tb->flags. What you mention, that is for when we have to generate a new TB, the tb->flags get set right before tb_gen_code. But for the case where we exit the delay-slot TB due to a pending exception I think the flow would go. TB chain: -> branch-TB : set env->flag 1 -> delay-slot-TB : exit_tb due to condition Exit: -> return to cpu_tb_exec -> tcg_ops->synchronize_from_tb In this case I don't see how the tb->flag would be updated, ooh, I guess it would have been set earlier when the TB was generated. Maybe that is what I am missing. -Stafford
On 5/11/22 15:34, Stafford Horne wrote: > In this case I don't see how the tb->flag would be updated, ooh, I guess it > would have been set earlier when the TB was generated. Maybe that is what I am > missing. Correct, it should be unchanged (and correct) from generation. r~
On Wed, May 11, 2022 at 07:11:20PM -0700, Richard Henderson wrote: > On 5/11/22 15:34, Stafford Horne wrote: > > In this case I don't see how the tb->flag would be updated, ooh, I guess it > > would have been set earlier when the TB was generated. Maybe that is what I am > > missing. > > Correct, it should be unchanged (and correct) from generation. OK, its very clear now thanks. With that said, I am still not convinced we need something like: --- a/target/openrisc/cpu.c +++ b/target/openrisc/cpu.c @@ -37,6 +37,7 @@ static void openrisc_cpu_synchronize_from_tb(CPUState *cs, OpenRISCCPU *cpu = OPENRISC_CPU(cs); cpu->env.pc = tb->pc; + cpu->env.dflag = (tb->flags & TB_FLAGS_DFLAG) ? 1 : 0; } I will leave it out for now as I feel comfortable that the env.dflag will be correct. But if you think of something let me know. -Stafford
diff --git a/target/openrisc/cpu.c b/target/openrisc/cpu.c index dfbafc5236..41d1b2a24a 100644 --- a/target/openrisc/cpu.c +++ b/target/openrisc/cpu.c @@ -21,6 +21,7 @@ #include "qapi/error.h" #include "qemu/qemu-print.h" #include "cpu.h" +#include "exec/exec-all.h" static void openrisc_cpu_set_pc(CPUState *cs, vaddr value) { @@ -30,6 +31,15 @@ static void openrisc_cpu_set_pc(CPUState *cs, vaddr value) cpu->env.dflag = 0; } +static void openrisc_cpu_synchronize_from_tb(CPUState *cs, + const TranslationBlock *tb) +{ + OpenRISCCPU *cpu = OPENRISC_CPU(cs); + + cpu->env.pc = tb->pc; +} + + static bool openrisc_cpu_has_work(CPUState *cs) { return cs->interrupt_request & (CPU_INTERRUPT_HARD | @@ -186,6 +196,7 @@ static const struct SysemuCPUOps openrisc_sysemu_ops = { static const struct TCGCPUOps openrisc_tcg_ops = { .initialize = openrisc_translate_init, + .synchronize_from_tb = openrisc_cpu_synchronize_from_tb, #ifndef CONFIG_USER_ONLY .tlb_fill = openrisc_cpu_tlb_fill,
This was found when running linux crypto algorithm selftests used by wireguard. We found that randomly the tests would fail. We found through investigation that a combination of a tick timer interrupt, raised when executing a delay slot instruction at a page boundary caused the issue. This was caused when handling the TB_EXIT_REQUESTED case in cpu_tb_exec. On OpenRISC, which doesn't implement synchronize_from_tb, set_pc was being used as a fallback. The OpenRISC set_pc implementation clears dflag, which caused the exception handling logic to not account for the delay slot. This was the bug, because it meant when execution resumed after the interrupt was handling it resumed in the wrong place. Fix this by implementing synchronize_from_tb which simply updates pc, and not clear the delay slot flag. Reported-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Stafford Horne <shorne@gmail.com> --- target/openrisc/cpu.c | 11 +++++++++++ 1 file changed, 11 insertions(+)