Message ID | 1432243971-26417-1-git-send-email-aurelien@aurel32.net |
---|---|
State | New |
On 05/21/2015 02:32 PM, Aurelien Jarno wrote:
> When consecutive memory locations are on page boundary a page fault
> might occur when using the LOAD MULTIPLE instruction. In that case real
> hardware doesn't load any register.
>
> This is an important detail in case the base register is in the list
> of registers to be loaded. If a page fault occurs this register might be
> overwritten and when the instruction is later restarted the wrong
> base register value is used.
>
> Fix this by first loading all values from memory and then writing them
> back to the registers.
>
> This fixes random segmentation faults seen in the guest.
>
> Cc: Alexander Graf <agraf@suse.de>
> Cc: Richard Henderson <rth@twiddle.net>
> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
> ---
>  target-s390x/translate.c | 56 +++++++++++++++++++++++++++++++++++++++++++-----
>  1 file changed, 51 insertions(+), 5 deletions(-)

Hmm. Seems to be un/under-specified in the PoO. That said,

Reviewed-by: Richard Henderson <rth@twiddle.net>

It would be nice to know if there ought to be similar up-front access checking
for STM, to avoid errant partial stores.

r~
> On 21.05.2015 at 23:32, Aurelien Jarno <aurelien@aurel32.net> wrote:
>
> When consecutive memory locations are on page boundary a page fault
> might occur when using the LOAD MULTIPLE instruction. In that case real
> hardware doesn't load any register.
>
> This is an important detail in case the base register is in the list
> of registers to be loaded. If a page fault occurs this register might be
> overwritten and when the instruction is later restarted the wrong
> base register value is used.
>
> Fix this by first loading all values from memory and then writing them
> back to the registers.
>
> This fixes random segmentation faults seen in the guest.
>
> Cc: Alexander Graf <agraf@suse.de>
> Cc: Richard Henderson <rth@twiddle.net>
> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>

Looks like you're finding lots of fun corner case bugs in the emulation.
Have you or Richard considered to implement s390x support in Risu yet?
Aarch64 emulation accuracy is remarkable since its introduction.

Alex
On 2015-05-21 14:42, Richard Henderson wrote:
> On 05/21/2015 02:32 PM, Aurelien Jarno wrote:
> > When consecutive memory locations are on page boundary a page fault
> > might occur when using the LOAD MULTIPLE instruction. In that case real
> > hardware doesn't load any register.
> >
> > This is an important detail in case the base register is in the list
> > of registers to be loaded. If a page fault occurs this register might be
> > overwritten and when the instruction is later restarted the wrong
> > base register value is used.
> >
> > Fix this by first loading all values from memory and then writing them
> > back to the registers.
> >
> > This fixes random segmentation faults seen in the guest.
> >
> > Cc: Alexander Graf <agraf@suse.de>
> > Cc: Richard Henderson <rth@twiddle.net>
> > Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
> > ---
> >  target-s390x/translate.c | 56 +++++++++++++++++++++++++++++++++++++++++++-----
> >  1 file changed, 51 insertions(+), 5 deletions(-)
>
> Hmm. Seems to be un/under-specified in the PoO. That said,

There is a small sentence in the PoO, in chapter "Program Execution",
section "Sequence of Storage Reference":

    It can normally be assumed that the execution of
    each instruction occurs as an indivisible event.

> Reviewed-by: Richard Henderson <rth@twiddle.net>
>
> It would be nice to know if there ought to be similar up-front access checking
> for STM, to avoid errant partial stores.

I have just checked, the same is also true for STM instructions, though
it's probably more difficult to fix that in QEMU. Maybe we need a way to
check if a load/store will succeed, preferably without using a helper.
On 2015-05-22 00:00, Alexander Graf wrote:
> Looks like you're finding lots of fun corner case bugs in the
> emulation. Have you or Richard considered to implement s390x
> support in Risu yet? Aarch64 emulation accuracy is remarkable
> since its introduction.

I have just learned about Risu, so no I haven't considered that. Do you
have an idea about the efforts required to port it to another
architecture?
On 23 May 2015 at 09:22, Aurelien Jarno <aurelien@aurel32.net> wrote:
> On 2015-05-22 00:00, Alexander Graf wrote:
>> Looks like you're finding lots of fun corner case bugs in the
>> emulation. Have you or Richard considered to implement s390x
>> support in Risu yet? Aarch64 emulation accuracy is remarkable
>> since its introduction.
>
> I have just learned about Risu, so no I haven't considered that. Do you
> have an idea about the efforts required to port it to another
> architecture?

The C code parts should be pretty trivial to port: we've kept the
architecture-specific parts fairly cleanly separated. More awkward
is the risugen perl script, which currently handles ARM, Thumb and
64-bit ARM instructions but doesn't currently have the same careful
separation of CPU specific bits. Still, the necessary refactoring
should not be too difficult.

-- PMM
On 05/23/2015 12:59 AM, Aurelien Jarno wrote:
> On 2015-05-21 14:42, Richard Henderson wrote:
>> Hmm. Seems to be un/under-specified in the PoO. That said,
>
> There is a small sentence in the PoO, in chapter "Program Execution",
> section "Sequence of Storage Reference":
>
>     It can normally be assumed that the execution of
>     each instruction occurs as an indivisible event.

Ah, I didn't think to look in a different chapter. ;-)

>> It would be nice to know if there ought to be similar up-front access checking
>> for STM, to avoid errant partial stores.
>
> I have just checked, the same is also true for STM instructions, though
> it's probably more difficult to fix that in QEMU. Maybe we need a way to
> check if a load/store will succeed, preferably without using a helper.

I did just suggest a new helper in the "unaligned stores for mips r6"
thread. Therein we provide a probe_write helper that does assert that
the given page is writable, or raise the usual exception. It leaves the
TLB updated, so a subsequent write should take the fast path.

It should be easy enough to extend that with an opcode so that we can
implement this for s390 as

    probe_write addr + n * size - 1
    qemu_st r0, addr
    qemu_st r1, addr + 1*size
    ...

Hopefully for the edge case where both pages are unmapped, producing an
exception pointing to the last byte, rather than the first byte, is
acceptable.

r~
On 23.05.15 21:33, Richard Henderson wrote:
> On 05/23/2015 12:59 AM, Aurelien Jarno wrote:
>> On 2015-05-21 14:42, Richard Henderson wrote:
>>> Hmm. Seems to be un/under-specified in the PoO. That said,
>>
>> There is a small sentence in the PoO, in chapter "Program Execution",
>> section "Sequence of Storage Reference":
>>
>>     It can normally be assumed that the execution of
>>     each instruction occurs as an indivisible event.
>
> Ah, I didn't think to look in a different chapter. ;-)
>
>>> It would be nice to know if there ought to be similar up-front access
>>> checking
>>> for STM, to avoid errant partial stores.
>>
>> I have just checked, the same is also true for STM instructions, though
>> it's probably more difficult to fix that in QEMU. Maybe we need a way to
>> check if a load/store will succeed, preferably without using a helper.
>
> I did just suggest a new helper in the "unaligned stores for mips r6"
> thread. Therein we provide a probe_write helper that does assert that
> the given page is writable, or raise the usual exception. It leaves the
> TLB updated, so a subsequent write should take the fast path.
>
> It should be easy enough to extend that with an opcode so that we can
> implement this for s390 as
>
>     probe_write addr + n * size - 1
>     qemu_st r0, addr
>     qemu_st r1, addr + 1*size
>     ...
>
> Hopefully for the edge case where both pages are unmapped, producing an
> exception pointing to the last byte, rather than the first byte, is
> acceptable.

So that means we should hold off on this patch for now as well and
rather go for the probe approach?

Alex
On 2015-05-25 22:47, Alexander Graf wrote:
>
> On 23.05.15 21:33, Richard Henderson wrote:
> > On 05/23/2015 12:59 AM, Aurelien Jarno wrote:
> >> On 2015-05-21 14:42, Richard Henderson wrote:
> >>> Hmm. Seems to be un/under-specified in the PoO. That said,
> >>
> >> There is a small sentence in the PoO, in chapter "Program Execution",
> >> section "Sequence of Storage Reference":
> >>
> >>     It can normally be assumed that the execution of
> >>     each instruction occurs as an indivisible event.
> >
> > Ah, I didn't think to look in a different chapter. ;-)
> >
> >>> It would be nice to know if there ought to be similar up-front access
> >>> checking
> >>> for STM, to avoid errant partial stores.
> >>
> >> I have just checked, the same is also true for STM instructions, though
> >> it's probably more difficult to fix that in QEMU. Maybe we need a way to
> >> check if a load/store will succeed, preferably without using a helper.
> >
> > I did just suggest a new helper in the "unaligned stores for mips r6"
> > thread. Therein we provide a probe_write helper that does assert that
> > the given page is writable, or raise the usual exception. It leaves the
> > TLB updated, so a subsequent write should take the fast path.
> >
> > It should be easy enough to extend that with an opcode so that we can
> > implement this for s390 as
> >
> >     probe_write addr + n * size - 1
> >     qemu_st r0, addr
> >     qemu_st r1, addr + 1*size
> >     ...
> >
> > Hopefully for the edge case where both pages are unmapped, producing an
> > exception pointing to the last byte, rather than the first byte, is
> > acceptable.
>
> So that means we should hold off on this patch for now as well and
> rather go for the probe approach?

For loads it's a bit different, but I guess we might come up with a better
approach:

    load first word
    load last word
    save first word in the corresponding register
    save second word in the corresponding register
    load words in between and save them in the corresponding registers

So yes it might be a good idea to hold off this patch.
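The ordering proposed above can be sketched in plain C against a toy paged memory. This is only an illustration, not QEMU or TCG code: `pe_mem`, `pe_ld64` and `pe_load_multiple` are hypothetical names, a fault is modelled by returning false, and "second word" is read here as the second value loaded, i.e. the last word. It relies on the observation that at most 16 registers of 8 bytes span 128 bytes, so the range touches at most two pages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define PE_PAGE_SIZE 4096u
#define PE_NPAGES    4u

/* Hypothetical toy guest memory: a page is either mapped or not. */
struct pe_mem {
    bool mapped[PE_NPAGES];
    uint8_t bytes[PE_NPAGES * PE_PAGE_SIZE];
};

/* Toy 64-bit load; returns false to model a page fault. */
static bool pe_ld64(const struct pe_mem *m, uint32_t addr, uint64_t *out)
{
    if (addr > PE_NPAGES * PE_PAGE_SIZE - 8) {
        return false;
    }
    if (!m->mapped[addr / PE_PAGE_SIZE] ||
        !m->mapped[(addr + 7) / PE_PAGE_SIZE]) {
        return false;
    }
    memcpy(out, &m->bytes[addr], 8);
    return true;
}

/* Load registers r1..r3 (wrapping 15 -> 0) from consecutive words.
   Both ends are loaded before any register is written. */
static bool pe_load_multiple(const struct pe_mem *m, uint64_t regs[16],
                             uint32_t addr, unsigned r1, unsigned r3)
{
    uint64_t first, last;
    unsigned n, i;

    assert(r1 < 16 && r3 < 16);
    n = ((r3 - r1) & 15) + 1;           /* number of registers, 1..16 */

    if (!pe_ld64(m, addr, &first) ||
        !pe_ld64(m, addr + 8 * (n - 1), &last)) {
        return false;                   /* fault: no register changed */
    }
    regs[r1] = first;
    regs[r3] = last;

    /* Both pages are now known to be mapped, so these cannot fault. */
    for (i = 1; i + 1 < n; i++) {
        bool ok = pe_ld64(m, addr + 8 * i, &regs[(r1 + i) & 15]);
        assert(ok);
        (void)ok;
    }
    return true;
}
```

With only the first page mapped, a call whose range crosses into the second page returns false and leaves every register untouched. Once both pages are mapped, the same call fills r1..r3 in order.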
On 2015-05-23 12:33, Richard Henderson wrote:
> On 05/23/2015 12:59 AM, Aurelien Jarno wrote:
> >On 2015-05-21 14:42, Richard Henderson wrote:
> >>Hmm. Seems to be un/under-specified in the PoO. That said,
> >
> >There is a small sentence in the PoO, in chapter "Program Execution",
> >section "Sequence of Storage Reference":
> >
> >    It can normally be assumed that the execution of
> >    each instruction occurs as an indivisible event.
>
> Ah, I didn't think to look in a different chapter. ;-)
>
> >>It would be nice to know if there ought to be similar up-front access checking
> >>for STM, to avoid errant partial stores.
> >
> >I have just checked, the same is also true for STM instructions, though
> >it's probably more difficult to fix that in QEMU. Maybe we need a way to
> >check if a load/store will succeed, preferably without using a helper.
>
> I did just suggest a new helper in the "unaligned stores for mips r6"
> thread. Therein we provide a probe_write helper that does assert that the
> given page is writable, or raise the usual exception. It leaves the TLB
> updated, so a subsequent write should take the fast path.

I guess it would work for softmmu, but not in linux-user mode, though
that's even more of a corner case.

> It should be easy enough to extend that with an opcode so that we can
> implement this for s390 as
>
>     probe_write addr + n * size - 1
>     qemu_st r0, addr
>     qemu_st r1, addr + 1*size
>     ...
>
> Hopefully for the edge case where both pages are unmapped, producing an
> exception pointing to the last byte, rather than the first byte, is
> acceptable.

Worst case we can probe the first address and then the last address.
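As a rough illustration of the probe-then-store idea for STM, here is a toy C model that probes the first byte and then the last byte before storing anything. It is a sketch under stated assumptions: `pw_mem`, `pw_probe_write` and `pw_store_multiple` are hypothetical names and do not reflect the signature of any real QEMU helper, and a fault is modelled by returning false rather than raising an exception.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define PW_PAGE_SIZE 4096u
#define PW_NPAGES    4u

/* Hypothetical toy guest memory with a per-page writable flag. */
struct pw_mem {
    bool writable[PW_NPAGES];
    uint8_t bytes[PW_NPAGES * PW_PAGE_SIZE];
};

/* Toy write probe: checks that the byte at addr is on a writable page.
   Nothing is stored. */
static bool pw_probe_write(const struct pw_mem *m, uint32_t addr)
{
    return addr < PW_NPAGES * PW_PAGE_SIZE &&
           m->writable[addr / PW_PAGE_SIZE];
}

/* Store registers r1..r3 (wrapping 15 -> 0) to consecutive words.
   Either every word is stored or none is. */
static bool pw_store_multiple(struct pw_mem *m, const uint64_t regs[16],
                              uint32_t addr, unsigned r1, unsigned r3)
{
    unsigned n, i;

    assert(r1 < 16 && r3 < 16);
    n = ((r3 - r1) & 15) + 1;           /* number of registers, 1..16 */

    /* Probe the first byte, then the last one, so that a failure is
       attributed to the lowest faulting end of the range. */
    if (!pw_probe_write(m, addr) ||
        !pw_probe_write(m, addr + 8 * n - 1)) {
        return false;                   /* fault: memory unchanged */
    }

    /* At most 128 bytes, hence at most two pages, both probed. */
    for (i = 0; i < n; i++) {
        memcpy(&m->bytes[addr + 8 * i], &regs[(r1 + i) & 15], 8);
    }
    return true;
}
```

Probing only the last byte, as in the quoted pseudo-ops, would be one call cheaper. Probing both ends costs an extra check but reports the fault for the first page when both are unmapped.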
On 25.05.15 23:05, Aurelien Jarno wrote:
> On 2015-05-23 12:33, Richard Henderson wrote:
>> On 05/23/2015 12:59 AM, Aurelien Jarno wrote:
>>> On 2015-05-21 14:42, Richard Henderson wrote:
>>>> Hmm. Seems to be un/under-specified in the PoO. That said,
>>>
>>> There is a small sentence in the PoO, in chapter "Program Execution",
>>> section "Sequence of Storage Reference":
>>>
>>>     It can normally be assumed that the execution of
>>>     each instruction occurs as an indivisible event.
>>
>> Ah, I didn't think to look in a different chapter. ;-)
>>
>>>> It would be nice to know if there ought to be similar up-front access checking
>>>> for STM, to avoid errant partial stores.
>>>
>>> I have just checked, the same is also true for STM instructions, though
>>> it's probably more difficult to fix that in QEMU. Maybe we need a way to
>>> check if a load/store will succeed, preferably without using a helper.
>>
>> I did just suggest a new helper in the "unaligned stores for mips r6"
>> thread. Therein we provide a probe_write helper that does assert that the
>> given page is writable, or raise the usual exception. It leaves the TLB
>> updated, so a subsequent write should take the fast path.
>
> I guess it would work for softmmu, but not in linux-user mode, though
> that's even more a corner case.

For linux-user we could just implement probe as

    foo = load_x_bytes(addr)
    store_x_bytes(addr, foo)

or can we have write-only maps there?

Alex
On 25 May 2015 at 22:55, Alexander Graf <agraf@suse.de> wrote:
> For linux-user we could just implement probe as
>
>     foo = load_x_bytes(addr)
>     store_x_bytes(addr, foo)
>
> or can we have write-only maps there?

The guest can mmap() things write-only, so yes.

-- PMM
On 05/25/2015 02:55 PM, Alexander Graf wrote:
> For linux-user we could just implement probe as
>
>     foo = load_x_bytes(addr)
>     store_x_bytes(addr, foo)
>
> or can we have write-only maps there?

One of these days I'm going to enable softmmu for linux-user, at least as an
option. While direct loads and stores are nice, there are a whole pile of
things that Just Don't Work. Especially guests with page sizes smaller than
the host. Very few of the linux-user-test-0.3 suite even load e.g. on
ppc64/aarch64 with a 64k page size.

r~
diff --git a/target-s390x/translate.c b/target-s390x/translate.c
index 52e106e..ddc78a9 100644
--- a/target-s390x/translate.c
+++ b/target-s390x/translate.c
@@ -2436,10 +2436,13 @@ static ExitStatus op_lm32(DisasContext *s, DisasOps *o)
     int r3 = get_field(s->fields, r3);
     TCGv_i64 t = tcg_temp_new_i64();
     TCGv_i64 t4 = tcg_const_i64(4);
+    TCGv_i64 tregs[16];
 
+    /* First load all the values from memory. If a page fault occurs the
+       registers should not be changed. */
     while (1) {
-        tcg_gen_qemu_ld32u(t, o->in2, get_mem_index(s));
-        store_reg32_i64(r1, t);
+        tregs[r1] = tcg_temp_new_i64();
+        tcg_gen_qemu_ld32u(tregs[r1], o->in2, get_mem_index(s));
         if (r1 == r3) {
             break;
         }
@@ -2447,6 +2450,18 @@ static ExitStatus op_lm32(DisasContext *s, DisasOps *o)
         r1 = (r1 + 1) & 15;
     }
 
+    /* When all the values have been loaded, write them back to the
+       registers. */
+    r1 = get_field(s->fields, r1);
+    while (1) {
+        store_reg32_i64(r1, tregs[r1]);
+        tcg_temp_free_i64(tregs[r1]);
+        if (r1 == r3) {
+            break;
+        }
+        r1 = (r1 + 1) & 15;
+    }
+
     tcg_temp_free_i64(t);
     tcg_temp_free_i64(t4);
     return NO_EXIT;
@@ -2458,10 +2473,13 @@ static ExitStatus op_lmh(DisasContext *s, DisasOps *o)
     int r3 = get_field(s->fields, r3);
     TCGv_i64 t = tcg_temp_new_i64();
     TCGv_i64 t4 = tcg_const_i64(4);
+    TCGv_i64 tregs[16];
 
+    /* First load all the values from memory. If a page fault occurs the
+       registers should not be changed. */
     while (1) {
-        tcg_gen_qemu_ld32u(t, o->in2, get_mem_index(s));
-        store_reg32h_i64(r1, t);
+        tregs[r1] = tcg_temp_new_i64();
+        tcg_gen_qemu_ld32u(tregs[r1], o->in2, get_mem_index(s));
         if (r1 == r3) {
             break;
         }
@@ -2469,6 +2487,18 @@ static ExitStatus op_lmh(DisasContext *s, DisasOps *o)
         r1 = (r1 + 1) & 15;
     }
 
+    /* When all the values have been loaded, write them back to the
+       registers. */
+    r1 = get_field(s->fields, r1);
+    while (1) {
+        store_reg32h_i64(r1, tregs[r1]);
+        tcg_temp_free_i64(tregs[r1]);
+        if (r1 == r3) {
+            break;
+        }
+        r1 = (r1 + 1) & 15;
+    }
+
     tcg_temp_free_i64(t);
     tcg_temp_free_i64(t4);
     return NO_EXIT;
@@ -2479,9 +2509,13 @@ static ExitStatus op_lm64(DisasContext *s, DisasOps *o)
     int r1 = get_field(s->fields, r1);
     int r3 = get_field(s->fields, r3);
     TCGv_i64 t8 = tcg_const_i64(8);
+    TCGv_i64 tregs[16];
 
+    /* First load all the values from memory. If a page fault occurs the
+       registers should not be changed. */
     while (1) {
-        tcg_gen_qemu_ld64(regs[r1], o->in2, get_mem_index(s));
+        tregs[r1] = tcg_temp_new_i64();
+        tcg_gen_qemu_ld64(tregs[r1], o->in2, get_mem_index(s));
         if (r1 == r3) {
             break;
         }
@@ -2489,6 +2523,18 @@ static ExitStatus op_lm64(DisasContext *s, DisasOps *o)
         r1 = (r1 + 1) & 15;
     }
 
+    /* When all the values have been loaded, write them back to the
+       registers. */
+    r1 = get_field(s->fields, r1);
+    while (1) {
+        tcg_gen_mov_i64(regs[r1], tregs[r1]);
+        tcg_temp_free_i64(tregs[r1]);
+        if (r1 == r3) {
+            break;
+        }
+        r1 = (r1 + 1) & 15;
+    }
+
     tcg_temp_free_i64(t8);
     return NO_EXIT;
 }
When consecutive memory locations are on page boundary a page fault
might occur when using the LOAD MULTIPLE instruction. In that case real
hardware doesn't load any register.

This is an important detail in case the base register is in the list
of registers to be loaded. If a page fault occurs this register might be
overwritten and when the instruction is later restarted the wrong
base register value is used.

Fix this by first loading all values from memory and then writing them
back to the registers.

This fixes random segmentation faults seen in the guest.

Cc: Alexander Graf <agraf@suse.de>
Cc: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
 target-s390x/translate.c | 56 +++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 51 insertions(+), 5 deletions(-)
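The two-phase idea described in the commit message can be modelled outside of TCG with ordinary C. The sketch below is an assumption-laden toy, not the patch itself: `lm_load` and `lm_load_multiple` are hypothetical names, memory is a small word array, and a read past its end stands in for a page fault. It only shows why staging the values in temporaries keeps the register file intact when a later load fails.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define LM_NREGS     16
#define LM_MEM_WORDS 8      /* toy memory: any index beyond this "faults" */

/* Hypothetical toy load; returns false to model a page fault. */
static bool lm_load(const uint64_t *mem, size_t idx, uint64_t *out)
{
    if (idx >= LM_MEM_WORDS) {
        return false;
    }
    *out = mem[idx];
    return true;
}

/* Load registers r1..r3 (wrapping 15 -> 0) from mem[idx], mem[idx+1], ...
   Phase 1 reads into temporaries; phase 2 commits only if nothing faulted. */
static bool lm_load_multiple(uint64_t regs[LM_NREGS], const uint64_t *mem,
                             size_t idx, unsigned r1, unsigned r3)
{
    uint64_t tmp[LM_NREGS];
    unsigned r;

    assert(r1 < LM_NREGS && r3 < LM_NREGS);

    /* Phase 1: all loads, no register is modified yet. */
    for (r = r1;; r = (r + 1) & 15) {
        if (!lm_load(mem, idx++, &tmp[r])) {
            return false;   /* fault: registers keep their old values */
        }
        if (r == r3) {
            break;
        }
    }

    /* Phase 2: write the staged values back to the registers. */
    for (r = r1;; r = (r + 1) & 15) {
        regs[r] = tmp[r];
        if (r == r3) {
            break;
        }
    }
    return true;
}
```

If the base register is among r1..r3, it is overwritten only in phase 2, after every load has succeeded, so a restarted instruction still sees the original base value.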