Message ID | 1470211348.12584.68.camel@kernel.crashing.org |
---|---|
State | New |
Headers | show |
On 3 August 2016 at 09:02, Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote: > Other archs don't do it, some programs catch signals just fine > and those dumps just clutter the output. Keep the dumps for cases > that aren't supposed to happen such as unknown codes. > > Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> > --- > linux-user/main.c | 14 -------------- > 1 file changed, 14 deletions(-) > > diff --git a/linux-user/main.c b/linux-user/main.c > index eb9975c..8fbc5a6 100644 > --- a/linux-user/main.c > +++ b/linux-user/main.c > @@ -1651,9 +1651,6 @@ void cpu_loop(CPUPPCState *env) > "Aborting\n"); > break; > case POWERPC_EXCP_DSI: /* Data storage exception */ > - EXCP_DUMP(env, "Invalid data memory access: 0x" TARGET_FMT_lx "\n", > - env->spr[SPR_DAR]); > - /* XXX: check this. Seems bugged */ Are you removing these XXX comments because you've checked the error code cases below? If so it would be useful to say so in the commit message... > switch (env->error_code & 0xFF000000) { > case 0x40000000: > case 0x42000000: > @@ -1684,9 +1681,6 @@ void cpu_loop(CPUPPCState *env) > queue_signal(env, info.si_signo, &info); > break; > case POWERPC_EXCP_ISI: /* Instruction storage exception */ > - EXCP_DUMP(env, "Invalid instruction fetch: 0x\n" TARGET_FMT_lx > - "\n", env->spr[SPR_SRR0]); > - /* XXX: check this */ > switch (env->error_code & 0xFF000000) { > case 0x40000000: > info.si_signo = TARGET_SIGSEGV; thanks -- PMM
On Wed, 2016-08-03 at 12:05 +0100, Peter Maydell wrote: > On 3 August 2016 at 09:02, Benjamin Herrenschmidt > > <benh@kernel.crashing.org> wrote: > > > > Other archs don't do it, some programs catch signals just fine > > and those dumps just clutter the output. Keep the dumps for cases > > that aren't supposed to happen such as unknown codes. > > > > > > Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> > > --- > > linux-user/main.c | 14 -------------- > > 1 file changed, 14 deletions(-) > > > > diff --git a/linux-user/main.c b/linux-user/main.c > > index eb9975c..8fbc5a6 100644 > > --- a/linux-user/main.c > > +++ b/linux-user/main.c > > @@ -1651,9 +1651,6 @@ void cpu_loop(CPUPPCState *env) > > "Aborting\n"); > > break; > > case POWERPC_EXCP_DSI: /* Data storage exception */ > > - EXCP_DUMP(env, "Invalid data memory access: 0x" TARGET_FMT_lx "\n", > > - env->spr[SPR_DAR]); > > - /* XXX: check this. Seems bugged */ > > Are you removing these XXX comments because you've checked the > error code cases below? If so it would be useful to say so > in the commit message... Well I fix some of it in another patch... To the best of my understanding we only ever generate 0x40000000 and 0x42000000 via ppc_cpu_handle_mmu_fault() in user_only_helper.c unless there's another path to DSI in user mode that I missed. That being said, the comment isn't useful anyway. Nobody looks at it :-) I can respin with the comment back in or do put some blurb in the cset comment, let me know. BTW. I noticed, the generic path from the original signal coming from the host kernel through to here loses the information that would allow us to differenciate between a map error (no mapping) or a protection fault. It's a bit unfortunate, not sure if we can do much about it though without stopping to hijack the existing MMU translate hook in the CPU class and using a user mode dediated one that takes that sort of info as an argument. Cheers, Ben.
On 3 August 2016 at 12:28, Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote: > To the best of my understanding we only ever generate > 0x40000000 and 0x42000000 via ppc_cpu_handle_mmu_fault() > in user_only_helper.c unless there's another path to DSI > in user mode that I missed. > > That being said, the comment isn't useful anyway. Nobody > looks at it :-) Well, it means when somebody does come along to clean up the code or just to read it they have some idea of whether the code is believed-to-be-good or not. > I can respin with the comment back in or do put some blurb > in the cset comment, let me know. I think if you're going to remove the comment you should specifically say you're doing it because you believe the code is actually correct. > BTW. I noticed, the generic path from the original signal > coming from the host kernel through to here loses the > information that would allow us to differenciate between > a map error (no mapping) or a protection fault. > > It's a bit unfortunate, not sure if we can do much about it > though without stopping to hijack the existing MMU translate > hook in the CPU class and using a user mode dediated one that > takes that sort of info as an argument. Yeah, there's an LTP test that fails because of this (we send a SIGSEGV when we should be sending a SIGBUS). It's a bit painful to fix though, since as you say we've effectively thrown away some information. I'm inclined to put this in the big pile of "bugs we could fix if we ever implemented linux-user-with-softmmu" and otherwise ignore it, unless you have a real-world program that needs this and makes some kind of bodge fix worthwhile. thanks -- PMM
On Wed, 2016-08-03 at 12:32 +0100, Peter Maydell wrote: > Yeah, there's an LTP test that fails because of this (we > send a SIGSEGV when we should be sending a SIGBUS). It's > a bit painful to fix though, since as you say we've > effectively thrown away some information. I'm inclined to > put this in the big pile of "bugs we could fix if we > ever implemented linux-user-with-softmmu" and otherwise > ignore it, unless you have a real-world program that > needs this and makes some kind of bodge fix worthwhile. Nope, not really... If we care, a simpler fix would be to add a "translate_user_fault" hook to the CPU class that takes more info about the original signal than handle_mmu_fault does, and call it when available (with a fallback) from user-exec.c That does mean going through all the cpu_signal_handler() variants in there though to make them extract more useful info. Not sure it's worthwhile... As far user-with-softmmu, I'm not too sure... softmmu significantly increases the overhead of load and stores. Maybe after we add 128-bit integers to TGC to alleviate that a bit ? :-) Cheers, Ben .
On 08/03/2016 05:09 PM, Benjamin Herrenschmidt wrote: > As far user-with-softmmu, I'm not too sure... softmmu significantly > increases the overhead of load and stores. Maybe after we add 128-bit > integers to TGC to alleviate that a bit ? :-) It wouldn't be mandatory, but there are certain bugs we can't fix without it. The big issues to be fixed with softmmu are (1) Host page size > guest page size. E.g. there are many programs (i386, sparc, etc, all with 4k pages) that you can't even load, much less run, on a ppc64 host using a 64k page size. (2) Host virtual address space bits != guest virtual address space bits My alpha emulation has run into this. A real Alpha guest has a 44-bit address space, but an x86_64 host has a 48-bit address space. The x86_64 kernel cannot be persuaded to reliably map memory below (1ul << 44), so I have to pretend than Alpha has a 48-bit address space. (Indeed, I set this to 63 bits, so that it works for even wider va, like on ppc64 and sparc64.) More theoretically, if the guest uses high bits for some purpose (e.g. ia64 segmentation in the top 3 bits), and the host doesn't have a full 64-bit virtual address space, then we cannot even map the program, since we cannot set bits 61-63 to non-zero values. r~
On Sat, 2016-08-06 at 15:23 +0530, Richard Henderson wrote: > On 08/03/2016 05:09 PM, Benjamin Herrenschmidt wrote: > > > > As far user-with-softmmu, I'm not too sure... softmmu significantly > > increases the overhead of load and stores. Maybe after we add 128-bit > > integers to TGC to alleviate that a bit ? :-) > > It wouldn't be mandatory, but there are certain bugs we can't fix without it. > The big issues to be fixed with softmmu are > > (1) Host page size > guest page size. > > E.g. there are many programs (i386, sparc, etc, all with 4k pages) that you > can't even load, much less run, on a ppc64 host using a 64k page size. Can't we advertise the host page size to the guest ? Or there are too many compiled-in assumptions ? > > (2) Host virtual address space bits != guest virtual address space bits > > My alpha emulation has run into this. A real Alpha guest has a 44-bit address > space, but an x86_64 host has a 48-bit address space. The x86_64 kernel cannot > be persuaded to reliably map memory below (1ul << 44), so I have to pretend > than Alpha has a 48-bit address space. (Indeed, I set this to 63 bits, so that > it works for even wider va, like on ppc64 and sparc64.) You can't just set a no-access VMA covering the top of the address space ? Are alpha programs relying on the fact that they won't get addresses above 44 ? > > More theoretically, if the guest uses high bits for some purpose (e.g. ia64 > segmentation in the top 3 bits), and the host doesn't have a full 64-bit > virtual address space, then we cannot even map the program, since we cannot set > bits 61-63 to non-zero values. I see. We could definitely have the option then. Cheers, Ben.
On 08/07/2016 06:20 AM, Benjamin Herrenschmidt wrote: > On Sat, 2016-08-06 at 15:23 +0530, Richard Henderson wrote: >> On 08/03/2016 05:09 PM, Benjamin Herrenschmidt wrote: >>> >>> As far user-with-softmmu, I'm not too sure... softmmu significantly >>> increases the overhead of load and stores. Maybe after we add 128-bit >>> integers to TGC to alleviate that a bit ? :-) >> >> It wouldn't be mandatory, but there are certain bugs we can't fix without it. >> The big issues to be fixed with softmmu are >> >> (1) Host page size > guest page size. >> >> E.g. there are many programs (i386, sparc, etc, all with 4k pages) that you >> can't even load, much less run, on a ppc64 host using a 64k page size. > > Can't we advertise the host page size to the guest ? Or there are too many > compiled-in assumptions ? Sometimes. However: Each guest has a set of page sizes that are legal. The largest of these is baked into the static linker via MAXPAGESIZE. If the host page size is larger than this guest maximum, the layout of the guest binary may be impossible to satisfy on the host. There are examples of this even within linux-user-test-0.3. For those guests whose set of page sizes contains only one size, the page size is often baked into executables via the constant PAGE_SIZE, rather than performing a runtime query on getpagesize(2). r~
diff --git a/linux-user/main.c b/linux-user/main.c index eb9975c..8fbc5a6 100644 --- a/linux-user/main.c +++ b/linux-user/main.c @@ -1651,9 +1651,6 @@ void cpu_loop(CPUPPCState *env) "Aborting\n"); break; case POWERPC_EXCP_DSI: /* Data storage exception */ - EXCP_DUMP(env, "Invalid data memory access: 0x" TARGET_FMT_lx "\n", - env->spr[SPR_DAR]); - /* XXX: check this. Seems bugged */ switch (env->error_code & 0xFF000000) { case 0x40000000: case 0x42000000: @@ -1684,9 +1681,6 @@ void cpu_loop(CPUPPCState *env) queue_signal(env, info.si_signo, &info); break; case POWERPC_EXCP_ISI: /* Instruction storage exception */ - EXCP_DUMP(env, "Invalid instruction fetch: 0x\n" TARGET_FMT_lx - "\n", env->spr[SPR_SRR0]); - /* XXX: check this */ switch (env->error_code & 0xFF000000) { case 0x40000000: info.si_signo = TARGET_SIGSEGV; @@ -1716,7 +1710,6 @@ void cpu_loop(CPUPPCState *env) "Aborting\n"); break; case POWERPC_EXCP_ALIGN: /* Alignment exception */ - EXCP_DUMP(env, "Unaligned memory access\n"); /* XXX: check this */ info.si_signo = TARGET_SIGBUS; info.si_errno = 0; @@ -1729,7 +1722,6 @@ void cpu_loop(CPUPPCState *env) /* XXX: check this */ switch (env->error_code & ~0xF) { case POWERPC_EXCP_FP: - EXCP_DUMP(env, "Floating point program exception\n"); info.si_signo = TARGET_SIGFPE; info.si_errno = 0; switch (env->error_code & 0xF) { @@ -1765,7 +1757,6 @@ void cpu_loop(CPUPPCState *env) } break; case POWERPC_EXCP_INVAL: - EXCP_DUMP(env, "Invalid instruction\n"); info.si_signo = TARGET_SIGILL; info.si_errno = 0; switch (env->error_code & 0xF) { @@ -1789,7 +1780,6 @@ void cpu_loop(CPUPPCState *env) } break; case POWERPC_EXCP_PRIV: - EXCP_DUMP(env, "Privilege violation\n"); info.si_signo = TARGET_SIGILL; info.si_errno = 0; switch (env->error_code & 0xF) { @@ -1819,7 +1809,6 @@ void cpu_loop(CPUPPCState *env) queue_signal(env, info.si_signo, &info); break; case POWERPC_EXCP_FPU: /* Floating-point unavailable exception */ - EXCP_DUMP(env, "No floating point allowed\n"); info.si_signo = TARGET_SIGILL; info.si_errno = 0; info.si_code = TARGET_ILL_COPROC; @@ -1831,7 +1820,6 @@ void cpu_loop(CPUPPCState *env) "Aborting\n"); break; case POWERPC_EXCP_APU: /* Auxiliary processor unavailable */ - EXCP_DUMP(env, "No APU instruction allowed\n"); info.si_signo = TARGET_SIGILL; info.si_errno = 0; info.si_code = TARGET_ILL_COPROC; @@ -1859,7 +1847,6 @@ void cpu_loop(CPUPPCState *env) "Aborting\n"); break; case POWERPC_EXCP_SPEU: /* SPE/embedded floating-point unavail. */ - EXCP_DUMP(env, "No SPE/floating-point instruction allowed\n"); info.si_signo = TARGET_SIGILL; info.si_errno = 0; info.si_code = TARGET_ILL_COPROC; @@ -1923,7 +1910,6 @@ void cpu_loop(CPUPPCState *env) "while in user mode. Aborting\n"); break; case POWERPC_EXCP_VPU: /* Vector unavailable exception */ - EXCP_DUMP(env, "No Altivec instructions allowed\n"); info.si_signo = TARGET_SIGILL; info.si_errno = 0; info.si_code = TARGET_ILL_COPROC;
Other archs don't do it, some programs catch signals just fine and those dumps just clutter the output. Keep the dumps for cases that aren't supposed to happen such as unknown codes. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> --- linux-user/main.c | 14 -------------- 1 file changed, 14 deletions(-)