Message ID | 20230522160242.37261-1-nnmlinux@linux.ibm.com |
---|---|
State | New |
Headers | show |
Series | [v3] target: ppc: Use MSR_HVB bit to get the target endianness for memory dump | expand |
On Mon, 22 May 2023 12:02:42 -0400 Narayana Murty N <nnmlinux@linux.ibm.com> wrote: > Currently on PPC64 qemu always dumps the guest memory in > Big Endian (BE) format even though the guest running in Little Endian > (LE) mode. So crash tool fails to load the dump as illustrated below: > > Log : > $ virsh dump DOMAIN --memory-only dump.file > > Domain 'DOMAIN' dumped to dump.file > > $ crash vmlinux dump.file > > <snip> > crash 8.0.2-1.el9 > > WARNING: endian mismatch: > crash utility: little-endian > dump.file: big-endian > > WARNING: machine type mismatch: > crash utility: PPC64 > dump.file: (unknown) > > crash: dump.file: not a supported file format > <snip> > > This happens because cpu_get_dump_info() passes cpu->env->has_hv_mode > to function ppc_interrupts_little_endian(), the cpu->env->has_hv_mode > always set for powerNV even though the guest is not running in hv mode. > The hv mode should be taken from msr_mask MSR_HVB bit > (cpu->env.msr_mask & MSR_HVB). This patch fixes the issue by passing > MSR_HVB value to ppc_interrupts_little_endian() in order to determine > the guest endianness. > > The crash tool also expects guest kernel endianness should match the > endianness of the dump. > > The patch was tested on POWER9 box booted with Linux as host in > following cases: > > Host-Endianess Qemu-Target-Machine Qemu-Guest-Endianess Qemu-Generated-Guest > Memory-Dump-Format > BE powernv LE KVM guest LE > BE powernv BE KVM guest BE > LE powernv LE KVM guest LE > LE powernv BE KVM guest BE I don't quite understand why KVM is mentioned with the powernv machine. Also have you tried to dump at various moments, e.g. during skiboot and when guest is booted, as in [1] which introduced the code this patch is changing ? [1] https://github.com/qemu/qemu/commit/5609400a422809c89ea788e4d0e13124a617582e. > LE pseries KVM LE KVM guest LE > LE pseries TCG LE guest LE > Fixes: 5609400a4228 ("target/ppc: Set the correct endianness for powernv memory dumps") > Signed-off-by: Narayana Murty N <nnmlinux@linux.ibm.com> > --- > Changes since V2: > commit message modified as per feedbak from Nicholas Piggin. > Changes since V1: > https://lore.kernel.org/qemu-devel/20230420145055.10196-1-nnmlinux@linux.ibm.com/ > The approach to solve the issue was changed based on feedback from > Fabiano Rosas on patch V1. > --- > target/ppc/arch_dump.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/target/ppc/arch_dump.c b/target/ppc/arch_dump.c > index f58e6359d5..a8315659d9 100644 > --- a/target/ppc/arch_dump.c > +++ b/target/ppc/arch_dump.c > @@ -237,7 +237,7 @@ int cpu_get_dump_info(ArchDumpInfo *info, > info->d_machine = PPC_ELF_MACHINE; > info->d_class = ELFCLASS; > > - if (ppc_interrupts_little_endian(cpu, cpu->env.has_hv_mode)) { > + if (ppc_interrupts_little_endian(cpu, !!(cpu->env.msr_mask & MSR_HVB))) { > info->d_endian = ELFDATA2LSB; > } else { > info->d_endian = ELFDATA2MSB;
On 5/22/23 23:50, Greg Kurz wrote: > On Mon, 22 May 2023 12:02:42 -0400 > Narayana Murty N<nnmlinux@linux.ibm.com> wrote: > >> Currently on PPC64 qemu always dumps the guest memory in >> Big Endian (BE) format even though the guest running in Little Endian >> (LE) mode. So crash tool fails to load the dump as illustrated below: >> >> Log : >> $ virsh dump DOMAIN --memory-only dump.file >> >> Domain 'DOMAIN' dumped to dump.file >> >> $ crash vmlinux dump.file >> >> <snip> >> crash 8.0.2-1.el9 >> >> WARNING: endian mismatch: >> crash utility: little-endian >> dump.file: big-endian >> >> WARNING: machine type mismatch: >> crash utility: PPC64 >> dump.file: (unknown) >> >> crash: dump.file: not a supported file format >> <snip> >> >> This happens because cpu_get_dump_info() passes cpu->env->has_hv_mode >> to function ppc_interrupts_little_endian(), the cpu->env->has_hv_mode >> always set for powerNV even though the guest is not running in hv mode. >> The hv mode should be taken from msr_mask MSR_HVB bit >> (cpu->env.msr_mask & MSR_HVB). This patch fixes the issue by passing >> MSR_HVB value to ppc_interrupts_little_endian() in order to determine >> the guest endianness. >> >> The crash tool also expects guest kernel endianness should match the >> endianness of the dump. >> >> The patch was tested on POWER9 box booted with Linux as host in >> following cases: >> >> Host-Endianess Qemu-Target-Machine Qemu-Guest-Endianess Qemu-Generated-Guest >> Memory-Dump-Format >> BE powernv LE KVM guest LE >> BE powernv BE KVM guest BE >> LE powernv LE KVM guest LE >> LE powernv BE KVM guest BE > I don't quite understand why KVM is mentioned with the powernv machine. guest running mode was mentioned. > > Also have you tried to dump at various moments, e.g. during skiboot > and when guest is booted, as in [1] which introduced the code this > patch is changing ? > > [1]https://github.com/qemu/qemu/commit/5609400a422809c89ea788e4d0e13124a617582e. > >> LE pseries KVM LE KVM guest LE >> LE pseries TCG LE guest LE >> > Fixes: 5609400a4228 ("target/ppc: Set the correct endianness for powernv memory dumps") I agree, commit 5609400a4228 fixes endianness detection only for initial stage (skiboot) till endianness switch happens. However, has_hv_mode is just a capability flag which is always set based on command-line param and doesnt really represent current hv state. With this patch, it relies on the current state of the hv state based on the MSR_HVB of the msr_mask. > >> Signed-off-by: Narayana Murty N<nnmlinux@linux.ibm.com> >> --- >> Changes since V2: >> commit message modified as per feedbak from Nicholas Piggin. >> Changes since V1: >> https://lore.kernel.org/qemu-devel/20230420145055.10196-1-nnmlinux@linux.ibm.com/ >> The approach to solve the issue was changed based on feedback from >> Fabiano Rosas on patch V1. >> --- >> target/ppc/arch_dump.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/target/ppc/arch_dump.c b/target/ppc/arch_dump.c >> index f58e6359d5..a8315659d9 100644 >> --- a/target/ppc/arch_dump.c >> +++ b/target/ppc/arch_dump.c >> @@ -237,7 +237,7 @@ int cpu_get_dump_info(ArchDumpInfo *info, >> info->d_machine = PPC_ELF_MACHINE; >> info->d_class = ELFCLASS; >> >> - if (ppc_interrupts_little_endian(cpu, cpu->env.has_hv_mode)) { >> + if (ppc_interrupts_little_endian(cpu, !!(cpu->env.msr_mask & MSR_HVB))) { >> info->d_endian = ELFDATA2LSB; >> } else { >> info->d_endian = ELFDATA2MSB;
On Tue, 23 May 2023 12:20:17 +0530 Narayana Murty N <nnmlinux@linux.vnet.ibm.com> wrote: > > On 5/22/23 23:50, Greg Kurz wrote: > > On Mon, 22 May 2023 12:02:42 -0400 > > Narayana Murty N<nnmlinux@linux.ibm.com> wrote: > > > >> Currently on PPC64 qemu always dumps the guest memory in > >> Big Endian (BE) format even though the guest running in Little Endian > >> (LE) mode. So crash tool fails to load the dump as illustrated below: > >> > >> Log : > >> $ virsh dump DOMAIN --memory-only dump.file > >> > >> Domain 'DOMAIN' dumped to dump.file > >> > >> $ crash vmlinux dump.file > >> > >> <snip> > >> crash 8.0.2-1.el9 > >> > >> WARNING: endian mismatch: > >> crash utility: little-endian > >> dump.file: big-endian > >> > >> WARNING: machine type mismatch: > >> crash utility: PPC64 > >> dump.file: (unknown) > >> > >> crash: dump.file: not a supported file format > >> <snip> > >> > >> This happens because cpu_get_dump_info() passes cpu->env->has_hv_mode > >> to function ppc_interrupts_little_endian(), the cpu->env->has_hv_mode > >> always set for powerNV even though the guest is not running in hv mode. > >> The hv mode should be taken from msr_mask MSR_HVB bit > >> (cpu->env.msr_mask & MSR_HVB). This patch fixes the issue by passing > >> MSR_HVB value to ppc_interrupts_little_endian() in order to determine > >> the guest endianness. > >> > >> The crash tool also expects guest kernel endianness should match the > >> endianness of the dump. > >> > >> The patch was tested on POWER9 box booted with Linux as host in > >> following cases: > >> > >> Host-Endianess Qemu-Target-Machine Qemu-Guest-Endianess Qemu-Generated-Guest > >> Memory-Dump-Format > >> BE powernv LE KVM guest LE > >> BE powernv BE KVM guest BE > >> LE powernv LE KVM guest LE > >> LE powernv BE KVM guest BE > > I don't quite understand why KVM is mentioned with the powernv machine. > > guest running mode was mentioned. > QEMU cannot use KVM on the host to run a powernv machine. The guest is thus necessarily running in TCG mode. Please describe your setup and what exactly you are testing. > > > > Also have you tried to dump at various moments, e.g. during skiboot > > and when guest is booted, as in [1] which introduced the code this > > patch is changing ? > > > > [1]https://github.com/qemu/qemu/commit/5609400a422809c89ea788e4d0e13124a617582e. > > > >> LE pseries KVM LE KVM guest LE > >> LE pseries TCG LE guest LE > >> > > Fixes: 5609400a4228 ("target/ppc: Set the correct endianness for powernv memory dumps") > > I agree, commit 5609400a4228 fixes endianness detection only for initial stage (skiboot) till endianness switch happens. > However, has_hv_mode is just a capability flag which is always set based on command-line param and doesnt really represent current hv state. > With this patch, it relies on the current state of the hv state based on the MSR_HVB of the msr_mask. > Yes I see what your patch is doing. The 'Fixes: 5609400a4228 ...' line is intended to the changelog because it is supposedly a fix to this commit. > > > >> Signed-off-by: Narayana Murty N<nnmlinux@linux.ibm.com> > >> --- > >> Changes since V2: > >> commit message modified as per feedbak from Nicholas Piggin. > >> Changes since V1: > >> https://lore.kernel.org/qemu-devel/20230420145055.10196-1-nnmlinux@linux.ibm.com/ > >> The approach to solve the issue was changed based on feedback from > >> Fabiano Rosas on patch V1. > >> --- > >> target/ppc/arch_dump.c | 2 +- > >> 1 file changed, 1 insertion(+), 1 deletion(-) > >> > >> diff --git a/target/ppc/arch_dump.c b/target/ppc/arch_dump.c > >> index f58e6359d5..a8315659d9 100644 > >> --- a/target/ppc/arch_dump.c > >> +++ b/target/ppc/arch_dump.c > >> @@ -237,7 +237,7 @@ int cpu_get_dump_info(ArchDumpInfo *info, > >> info->d_machine = PPC_ELF_MACHINE; > >> info->d_class = ELFCLASS; > >> > >> - if (ppc_interrupts_little_endian(cpu, cpu->env.has_hv_mode)) { > >> + if (ppc_interrupts_little_endian(cpu, !!(cpu->env.msr_mask & MSR_HVB))) { > >> info->d_endian = ELFDATA2LSB; > >> } else { > >> info->d_endian = ELFDATA2MSB;
On 5/22/23 18:02, Narayana Murty N wrote: > Currently on PPC64 qemu always dumps the guest memory in > Big Endian (BE) format even though the guest running in Little Endian The patch is surely correct. I have problems understanding the config you are testing. PPC Book3s has multiple hypervisor implementations : 1. pHyp (AKA PowerVM) 2. OPAL/PowerNV (AKA Power KVM-HV) 3. OPAL/PowerNV/pSeries (AKA Power KVMHV-on-pSeries) 4. pHyp/pSeries (very recent implementation, I don't know how it is referred to in the kernel) I am leaving the KVM-PR implementation out of the discussions for simplicity. QEMU also supports emulation of 2. and 3. in two different machines PowerNV and pseries, although running pseries guests under a PowerNV machine is slow, so is running pseries guests under pseries. Could you please describe your environment ? Thanks, C. > (LE) mode. So crash tool fails to load the dump as illustrated below: > > Log : > $ virsh dump DOMAIN --memory-only dump.file > > Domain 'DOMAIN' dumped to dump.file > > $ crash vmlinux dump.file > > <snip> > crash 8.0.2-1.el9 > > WARNING: endian mismatch: > crash utility: little-endian > dump.file: big-endian > > WARNING: machine type mismatch: > crash utility: PPC64 > dump.file: (unknown) > > crash: dump.file: not a supported file format > <snip> > > This happens because cpu_get_dump_info() passes cpu->env->has_hv_mode > to function ppc_interrupts_little_endian(), the cpu->env->has_hv_mode > always set for powerNV even though the guest is not running in hv mode. > The hv mode should be taken from msr_mask MSR_HVB bit > (cpu->env.msr_mask & MSR_HVB). This patch fixes the issue by passing > MSR_HVB value to ppc_interrupts_little_endian() in order to determine > the guest endianness. > > The crash tool also expects guest kernel endianness should match the > endianness of the dump. > > The patch was tested on POWER9 box booted with Linux as host in > following cases: > > Host-Endianess Qemu-Target-Machine Qemu-Guest-Endianess Qemu-Generated-Guest > Memory-Dump-Format > BE powernv LE KVM guest LE > BE powernv BE KVM guest BE > LE powernv LE KVM guest LE > LE powernv BE KVM guest BE > LE pseries KVM LE KVM guest LE > LE pseries TCG LE guest LE > > Signed-off-by: Narayana Murty N <nnmlinux@linux.ibm.com> > --- > Changes since V2: > commit message modified as per feedbak from Nicholas Piggin. > Changes since V1: > https://lore.kernel.org/qemu-devel/20230420145055.10196-1-nnmlinux@linux.ibm.com/ > The approach to solve the issue was changed based on feedback from > Fabiano Rosas on patch V1. > --- > target/ppc/arch_dump.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/target/ppc/arch_dump.c b/target/ppc/arch_dump.c > index f58e6359d5..a8315659d9 100644 > --- a/target/ppc/arch_dump.c > +++ b/target/ppc/arch_dump.c > @@ -237,7 +237,7 @@ int cpu_get_dump_info(ArchDumpInfo *info, > info->d_machine = PPC_ELF_MACHINE; > info->d_class = ELFCLASS; > > - if (ppc_interrupts_little_endian(cpu, cpu->env.has_hv_mode)) { > + if (ppc_interrupts_little_endian(cpu, !!(cpu->env.msr_mask & MSR_HVB))) { > info->d_endian = ELFDATA2LSB; > } else { > info->d_endian = ELFDATA2MSB;
On 5/23/23 15:52, Cédric Le Goater wrote: > On 5/22/23 18:02, Narayana Murty N wrote: >> Currently on PPC64 qemu always dumps the guest memory in >> Big Endian (BE) format even though the guest running in Little Endian > > > > The patch is surely correct. I have problems understanding the config > you are testing. PPC Book3s has multiple hypervisor implementations : > > 1. pHyp (AKA PowerVM) > 2. OPAL/PowerNV (AKA Power KVM-HV) > 3. OPAL/PowerNV/pSeries (AKA Power KVMHV-on-pSeries) > 4. pHyp/pSeries (very recent implementation, I don't know how it is > referred to in the kernel) > > I am leaving the KVM-PR implementation out of the discussions for > simplicity. > > QEMU also supports emulation of 2. and 3. in two different machines > PowerNV and pseries, although running pseries guests under a PowerNV > machine is slow, so is running pseries guests under pseries. > > Could you please describe your environment ? > > Thanks, > > C. > > It had been tested target machine OPAL/PowerNV with big endian host os. and also target OPAL/PowerNV/pSeries little endian host setup with qemu. > >> (LE) mode. So crash tool fails to load the dump as illustrated below: >> >> Log : >> $ virsh dump DOMAIN --memory-only dump.file >> >> Domain 'DOMAIN' dumped to dump.file >> >> $ crash vmlinux dump.file >> >> <snip> >> crash 8.0.2-1.el9 >> >> WARNING: endian mismatch: >> crash utility: little-endian >> dump.file: big-endian >> >> WARNING: machine type mismatch: >> crash utility: PPC64 >> dump.file: (unknown) >> >> crash: dump.file: not a supported file format >> <snip> >> >> This happens because cpu_get_dump_info() passes cpu->env->has_hv_mode >> to function ppc_interrupts_little_endian(), the cpu->env->has_hv_mode >> always set for powerNV even though the guest is not running in hv mode. >> The hv mode should be taken from msr_mask MSR_HVB bit >> (cpu->env.msr_mask & MSR_HVB). This patch fixes the issue by passing >> MSR_HVB value to ppc_interrupts_little_endian() in order to determine >> the guest endianness. >> >> The crash tool also expects guest kernel endianness should match the >> endianness of the dump. >> >> The patch was tested on POWER9 box booted with Linux as host in >> following cases: >> >> Host-Endianess Qemu-Target-Machine Qemu-Guest-Endianess >> Qemu-Generated-Guest >> Memory-Dump-Format >> BE powernv LE KVM guest LE >> BE powernv BE KVM guest BE >> LE powernv LE KVM guest LE >> LE powernv BE KVM guest BE >> LE pseries KVM LE KVM guest LE >> LE pseries TCG LE guest LE >> >> Signed-off-by: Narayana Murty N <nnmlinux@linux.ibm.com> >> --- >> Changes since V2: >> commit message modified as per feedbak from Nicholas Piggin. >> Changes since V1: >> https://lore.kernel.org/qemu-devel/20230420145055.10196-1-nnmlinux@linux.ibm.com/ >> >> The approach to solve the issue was changed based on feedback from >> Fabiano Rosas on patch V1. >> --- >> target/ppc/arch_dump.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/target/ppc/arch_dump.c b/target/ppc/arch_dump.c >> index f58e6359d5..a8315659d9 100644 >> --- a/target/ppc/arch_dump.c >> +++ b/target/ppc/arch_dump.c >> @@ -237,7 +237,7 @@ int cpu_get_dump_info(ArchDumpInfo *info, >> info->d_machine = PPC_ELF_MACHINE; >> info->d_class = ELFCLASS; >> - if (ppc_interrupts_little_endian(cpu, cpu->env.has_hv_mode)) { >> + if (ppc_interrupts_little_endian(cpu, !!(cpu->env.msr_mask & >> MSR_HVB))) { >> info->d_endian = ELFDATA2LSB; >> } else { >> info->d_endian = ELFDATA2MSB; >
On Tue May 23, 2023 at 2:02 AM AEST, Narayana Murty N wrote: > Currently on PPC64 qemu always dumps the guest memory in > Big Endian (BE) format even though the guest running in Little Endian > (LE) mode. The guest? Are you talking about the immediate target, or a KVM guest running under that? Or both? > So crash tool fails to load the dump as illustrated below: > > Log : > $ virsh dump DOMAIN --memory-only dump.file > > Domain 'DOMAIN' dumped to dump.file > > $ crash vmlinux dump.file > > <snip> > crash 8.0.2-1.el9 > > WARNING: endian mismatch: > crash utility: little-endian > dump.file: big-endian > > WARNING: machine type mismatch: > crash utility: PPC64 > dump.file: (unknown) > > crash: dump.file: not a supported file format > <snip> > > This happens because cpu_get_dump_info() passes cpu->env->has_hv_mode > to function ppc_interrupts_little_endian(), the cpu->env->has_hv_mode > always set for powerNV even though the guest is not running in hv mode. > The hv mode should be taken from msr_mask MSR_HVB bit > (cpu->env.msr_mask & MSR_HVB). This patch fixes the issue by passing > MSR_HVB value to ppc_interrupts_little_endian() in order to determine > the guest endianness. > > The crash tool also expects guest kernel endianness should match the > endianness of the dump. > > The patch was tested on POWER9 box booted with Linux as host in > following cases: > > Host-Endianess Qemu-Target-Machine Qemu-Guest-Endianess Qemu-Generated-Guest > Memory-Dump-Format > BE powernv LE KVM guest LE > BE powernv BE KVM guest BE > LE powernv LE KVM guest LE > LE powernv BE KVM guest BE > LE pseries KVM LE KVM guest LE > LE pseries TCG LE guest LE I'm still having a bit of trouble with this like the others. First thing first, take "guest" out of it entirely. If I read it right, then the dump will be done in whichever endian the OS running on the machine has set the interrupt endian mode to, and crash expects that to match the vmlinux. Is that correct? For the powernv machine, when the OS runs a KVM guest, the crash dump continues to be generated in the same endianness even if the guest OS is the opposite. If my understanding is correct, I don't see why that is desirable to change it to the arbitrary guest setting. The OS running on the target is the one you want to debug if you dump from the first QEMU. If you want to debug the guest you would dump from the QEMU running in the target that is running the guest, no? For the pseries machine, the TCG nested HV thing probably takes over LPCR when running the nested HV guest. I would argue that's wrong and it should actually match powernv behaviour. Thanks, Nick > > Signed-off-by: Narayana Murty N <nnmlinux@linux.ibm.com> > --- > Changes since V2: > commit message modified as per feedbak from Nicholas Piggin. > Changes since V1: > https://lore.kernel.org/qemu-devel/20230420145055.10196-1-nnmlinux@linux.ibm.com/ > The approach to solve the issue was changed based on feedback from > Fabiano Rosas on patch V1. > --- > target/ppc/arch_dump.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/target/ppc/arch_dump.c b/target/ppc/arch_dump.c > index f58e6359d5..a8315659d9 100644 > --- a/target/ppc/arch_dump.c > +++ b/target/ppc/arch_dump.c > @@ -237,7 +237,7 @@ int cpu_get_dump_info(ArchDumpInfo *info, > info->d_machine = PPC_ELF_MACHINE; > info->d_class = ELFCLASS; > > - if (ppc_interrupts_little_endian(cpu, cpu->env.has_hv_mode)) { > + if (ppc_interrupts_little_endian(cpu, !!(cpu->env.msr_mask & MSR_HVB))) { > info->d_endian = ELFDATA2LSB; > } else { > info->d_endian = ELFDATA2MSB; > -- > 2.39.2
On Tue May 23, 2023 at 2:02 AM AEST, Narayana Murty N wrote: > Changes since V2: > commit message modified as per feedbak from Nicholas Piggin. > Changes since V1: > https://lore.kernel.org/qemu-devel/20230420145055.10196-1-nnmlinux@linux.ibm.com/ > The approach to solve the issue was changed based on feedback from > Fabiano Rosas on patch V1. > --- > target/ppc/arch_dump.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/target/ppc/arch_dump.c b/target/ppc/arch_dump.c > index f58e6359d5..a8315659d9 100644 > --- a/target/ppc/arch_dump.c > +++ b/target/ppc/arch_dump.c > @@ -237,7 +237,7 @@ int cpu_get_dump_info(ArchDumpInfo *info, > info->d_machine = PPC_ELF_MACHINE; > info->d_class = ELFCLASS; > > - if (ppc_interrupts_little_endian(cpu, cpu->env.has_hv_mode)) { > + if (ppc_interrupts_little_endian(cpu, !!(cpu->env.msr_mask & MSR_HVB))) { > info->d_endian = ELFDATA2LSB; > } else { > info->d_endian = ELFDATA2MSB; Oh, and now I see it cpu_get_dump_info just picks the first CPU to test this! So a test that can change at runtime is surely not the right one. If you use MSR[HV] then if you have a SMP machine that is doing a bunch of things and you want to dump to debug the system, this will just randomly give you a wrong-endian dump if CPU0 just happened to be running some KVM guest. I know HILE techically does change at runtime, but at least in practice it is just boot (and maybe kexec or reboot) so that is the least worst option. Part of the problem is perhaps the tools and commands aren't so so suited to ppc's bi-endian nature. But even ignoring all of that, let's say you have all the same endian host and guest kernels and dump format... If you dump host memory then you need host kernel and structures to debug guest kernel/image (assuming crash or the person debugging it is smart enough to make sense of it). So I don't see how you can sanely use the crash dump of host memory with the guest kernel. I must still be missing something. Thanks, Nick
"Nicholas Piggin" <npiggin@gmail.com> writes: > On Tue May 23, 2023 at 2:02 AM AEST, Narayana Murty N wrote: >> Changes since V2: >> commit message modified as per feedbak from Nicholas Piggin. >> Changes since V1: >> https://lore.kernel.org/qemu-devel/20230420145055.10196-1-nnmlinux@linux.ibm.com/ >> The approach to solve the issue was changed based on feedback from >> Fabiano Rosas on patch V1. >> --- >> target/ppc/arch_dump.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/target/ppc/arch_dump.c b/target/ppc/arch_dump.c >> index f58e6359d5..a8315659d9 100644 >> --- a/target/ppc/arch_dump.c >> +++ b/target/ppc/arch_dump.c >> @@ -237,7 +237,7 @@ int cpu_get_dump_info(ArchDumpInfo *info, >> info->d_machine = PPC_ELF_MACHINE; >> info->d_class = ELFCLASS; >> >> - if (ppc_interrupts_little_endian(cpu, cpu->env.has_hv_mode)) { >> + if (ppc_interrupts_little_endian(cpu, !!(cpu->env.msr_mask & MSR_HVB))) { >> info->d_endian = ELFDATA2LSB; >> } else { >> info->d_endian = ELFDATA2MSB; > > Oh, and now I see it cpu_get_dump_info just picks the first CPU to test > this! So a test that can change at runtime is surely not the right one. > If you use MSR[HV] then if you have a SMP machine that is doing a bunch > of things and you want to dump to debug the system, this will just > randomly give you a wrong-endian dump if CPU0 just happened to be > running some KVM guest. > Not sure if you are just thinking out loud about MSR_HV or if you mistook MSR_HVB for MSR_HV. But just in case: The env->msr_mask is what tells us what MSR bits are supported for this CPU, i.e. what features it contains. So MSR_HVB is not equivalent to MSR[HV], but merely informs that this CPU knows about MSR_HV. We then store that information at has_hv_mode. The MSR_HVB bit changes only once (at cpu_ppc_set_vhyp), after we decide whether to use vhyp. So: env->has_hv_mode == cpu supports HV mode; MSR_HVB=1 == cpu supports HV mode AND we allow the OS to run with MSR_HV=1; MSR_HVB=0 == cpu doesn't support HV mode OR cpu supports HV mode, but we don't allow the OS to run with MSR_HV=1 because QEMU is the HV (i.e. vhyp); For the arch_dump code, passing (msr_mask & MSR_HVB) ends up meaning: "can this OS ever run with MSR_HV=1?", which for emulated powernv would be Yes and for pseries (incl. nested) would be No. So for emulated powernv we always look at the emulated HILE and for pseries we always look at LPCR_ILE. > But even ignoring all of that, let's say you have all the same endian > host and guest kernels and dump format... If you dump host memory then > you need host kernel and structures to debug guest kernel/image > (assuming crash or the person debugging it is smart enough to make sense > of it). So I don't see how you can sanely use the crash dump of host > memory with the guest kernel. I must still be missing something. > You're right, the QEMU instance that receives the qmp_dump_guest_memory command should generate a memory dump of whatever OS is running in it at a direct level. So the endianness reported in the dump's ELF should match the guest OS image ELF (roughly, there's -bios which doesn't require an ELF).
On Tue May 30, 2023 at 12:05 AM AEST, Fabiano Rosas wrote: > "Nicholas Piggin" <npiggin@gmail.com> writes: > > > On Tue May 23, 2023 at 2:02 AM AEST, Narayana Murty N wrote: > >> Changes since V2: > >> commit message modified as per feedbak from Nicholas Piggin. > >> Changes since V1: > >> https://lore.kernel.org/qemu-devel/20230420145055.10196-1-nnmlinux@linux.ibm.com/ > >> The approach to solve the issue was changed based on feedback from > >> Fabiano Rosas on patch V1. > >> --- > >> target/ppc/arch_dump.c | 2 +- > >> 1 file changed, 1 insertion(+), 1 deletion(-) > >> > >> diff --git a/target/ppc/arch_dump.c b/target/ppc/arch_dump.c > >> index f58e6359d5..a8315659d9 100644 > >> --- a/target/ppc/arch_dump.c > >> +++ b/target/ppc/arch_dump.c > >> @@ -237,7 +237,7 @@ int cpu_get_dump_info(ArchDumpInfo *info, > >> info->d_machine = PPC_ELF_MACHINE; > >> info->d_class = ELFCLASS; > >> > >> - if (ppc_interrupts_little_endian(cpu, cpu->env.has_hv_mode)) { > >> + if (ppc_interrupts_little_endian(cpu, !!(cpu->env.msr_mask & MSR_HVB))) { > >> info->d_endian = ELFDATA2LSB; > >> } else { > >> info->d_endian = ELFDATA2MSB; > > > > Oh, and now I see it cpu_get_dump_info just picks the first CPU to test > > this! So a test that can change at runtime is surely not the right one. > > If you use MSR[HV] then if you have a SMP machine that is doing a bunch > > of things and you want to dump to debug the system, this will just > > randomly give you a wrong-endian dump if CPU0 just happened to be > > running some KVM guest. > > > > Not sure if you are just thinking out loud about MSR_HV or if you > mistook MSR_HVB for MSR_HV. But just in case: Oh, yes I did confuse the MSR from the mask. Apologies, that makes much of my ranting invalid. > The env->msr_mask is what tells us what MSR bits are supported for this > CPU, i.e. what features it contains. So MSR_HVB is not equivalent to > MSR[HV], but merely informs that this CPU knows about MSR_HV. We then > store that information at has_hv_mode. The MSR_HVB bit changes only > once (at cpu_ppc_set_vhyp), after we decide whether to use vhyp. So: > > env->has_hv_mode == cpu supports HV mode; > > MSR_HVB=1 == cpu supports HV mode AND we allow the OS to run with MSR_HV=1; > > MSR_HVB=0 == cpu doesn't support HV mode OR > cpu supports HV mode, but we don't allow the OS to run with > MSR_HV=1 because QEMU is the HV (i.e. vhyp); > > For the arch_dump code, passing (msr_mask & MSR_HVB) ends up meaning: > "can this OS ever run with MSR_HV=1?", which for emulated powernv would > be Yes and for pseries (incl. nested) would be No. So for emulated > powernv we always look at the emulated HILE and for pseries we always > look at LPCR_ILE. Well then I agree with that, I think the talk of the KVM guest confused me. So in that case I agree with the patch. It does seem like there would be still a problem with a nested KVM guest on pseries though, since it hijacks LPCR among other SPRs, and may modify ILE. It seems like you would need a way to ask vhyp for the host's interrupt endian mode (and would get that from SpaprCpuState's nested_host_state regs. But that could be fixed later. Thanks, Nick
On 5/23/23 15:45, Greg Kurz wrote: > On Tue, 23 May 2023 12:20:17 +0530 > Narayana Murty N<nnmlinux@linux.vnet.ibm.com> wrote: > >> On 5/22/23 23:50, Greg Kurz wrote: >>> On Mon, 22 May 2023 12:02:42 -0400 >>> Narayana Murty N<nnmlinux@linux.ibm.com> wrote: >>> >>>> Currently on PPC64 qemu always dumps the guest memory in >>>> Big Endian (BE) format even though the guest running in Little Endian >>>> (LE) mode. So crash tool fails to load the dump as illustrated below: >>>> >>>> Log : >>>> $ virsh dump DOMAIN --memory-only dump.file >>>> >>>> Domain 'DOMAIN' dumped to dump.file >>>> >>>> $ crash vmlinux dump.file >>>> >>>> <snip> >>>> crash 8.0.2-1.el9 >>>> >>>> WARNING: endian mismatch: >>>> crash utility: little-endian >>>> dump.file: big-endian >>>> >>>> WARNING: machine type mismatch: >>>> crash utility: PPC64 >>>> dump.file: (unknown) >>>> >>>> crash: dump.file: not a supported file format >>>> <snip> >>>> >>>> This happens because cpu_get_dump_info() passes cpu->env->has_hv_mode >>>> to function ppc_interrupts_little_endian(), the cpu->env->has_hv_mode >>>> always set for powerNV even though the guest is not running in hv mode. >>>> The hv mode should be taken from msr_mask MSR_HVB bit >>>> (cpu->env.msr_mask & MSR_HVB). This patch fixes the issue by passing >>>> MSR_HVB value to ppc_interrupts_little_endian() in order to determine >>>> the guest endianness. >>>> >>>> The crash tool also expects guest kernel endianness should match the >>>> endianness of the dump. >>>> >>>> The patch was tested on POWER9 box booted with Linux as host in >>>> following cases: >>>> >>>> Host-Endianess Qemu-Target-Machine Qemu-Guest-Endianess Qemu-Generated-Guest >>>> Memory-Dump-Format >>>> BE powernv LE KVM guest LE >>>> BE powernv BE KVM guest BE >>>> LE powernv LE KVM guest LE >>>> LE powernv BE KVM guest BE >>> I don't quite understand why KVM is mentioned with the powernv machine. >> guest running mode was mentioned. >> > QEMU cannot use KVM on the host to run a powernv machine. The > guest is thus necessarily running in TCG mode. > > Please describe your setup and what exactly you are testing. described in v4. >>> Also have you tried to dump at various moments, e.g. during skiboot >>> and when guest is booted, as in [1] which introduced the code this >>> patch is changing ? >>> >>> [1]https://github.com/qemu/qemu/commit/5609400a422809c89ea788e4d0e13124a617582e. >>> >>>> LE pseries KVM LE KVM guest LE >>>> LE pseries TCG LE guest LE >>>> >>> Fixes: 5609400a4228 ("target/ppc: Set the correct endianness for powernv memory dumps") >> I agree, commit 5609400a4228 fixes endianness detection only for initial stage (skiboot) till endianness switch happens. >> However, has_hv_mode is just a capability flag which is always set based on command-line param and doesnt really represent current hv state. >> With this patch, it relies on the current state of the hv state based on the MSR_HVB of the msr_mask. >> > Yes I see what your patch is doing. The 'Fixes: 5609400a4228 ...' line is > intended to the changelog because it is supposedly a fix to this commit. done >>>> Signed-off-by: Narayana Murty N<nnmlinux@linux.ibm.com> >>>> --- >>>> Changes since V2: >>>> commit message modified as per feedbak from Nicholas Piggin. >>>> Changes since V1: >>>> https://lore.kernel.org/qemu-devel/20230420145055.10196-1-nnmlinux@linux.ibm.com/ >>>> The approach to solve the issue was changed based on feedback from >>>> Fabiano Rosas on patch V1. >>>> --- >>>> target/ppc/arch_dump.c | 2 +- >>>> 1 file changed, 1 insertion(+), 1 deletion(-) >>>> >>>> diff --git a/target/ppc/arch_dump.c b/target/ppc/arch_dump.c >>>> index f58e6359d5..a8315659d9 100644 >>>> --- a/target/ppc/arch_dump.c >>>> +++ b/target/ppc/arch_dump.c >>>> @@ -237,7 +237,7 @@ int cpu_get_dump_info(ArchDumpInfo *info, >>>> info->d_machine = PPC_ELF_MACHINE; >>>> info->d_class = ELFCLASS; >>>> >>>> - if (ppc_interrupts_little_endian(cpu, cpu->env.has_hv_mode)) { >>>> + if (ppc_interrupts_little_endian(cpu, !!(cpu->env.msr_mask & MSR_HVB))) { >>>> info->d_endian = ELFDATA2LSB; >>>> } else { >>>> info->d_endian = ELFDATA2MSB;
diff --git a/target/ppc/arch_dump.c b/target/ppc/arch_dump.c index f58e6359d5..a8315659d9 100644 --- a/target/ppc/arch_dump.c +++ b/target/ppc/arch_dump.c @@ -237,7 +237,7 @@ int cpu_get_dump_info(ArchDumpInfo *info, info->d_machine = PPC_ELF_MACHINE; info->d_class = ELFCLASS; - if (ppc_interrupts_little_endian(cpu, cpu->env.has_hv_mode)) { + if (ppc_interrupts_little_endian(cpu, !!(cpu->env.msr_mask & MSR_HVB))) { info->d_endian = ELFDATA2LSB; } else { info->d_endian = ELFDATA2MSB;
Currently on PPC64 qemu always dumps the guest memory in Big Endian (BE) format even though the guest running in Little Endian (LE) mode. So crash tool fails to load the dump as illustrated below: Log : $ virsh dump DOMAIN --memory-only dump.file Domain 'DOMAIN' dumped to dump.file $ crash vmlinux dump.file <snip> crash 8.0.2-1.el9 WARNING: endian mismatch: crash utility: little-endian dump.file: big-endian WARNING: machine type mismatch: crash utility: PPC64 dump.file: (unknown) crash: dump.file: not a supported file format <snip> This happens because cpu_get_dump_info() passes cpu->env->has_hv_mode to function ppc_interrupts_little_endian(), the cpu->env->has_hv_mode always set for powerNV even though the guest is not running in hv mode. The hv mode should be taken from msr_mask MSR_HVB bit (cpu->env.msr_mask & MSR_HVB). This patch fixes the issue by passing MSR_HVB value to ppc_interrupts_little_endian() in order to determine the guest endianness. The crash tool also expects guest kernel endianness should match the endianness of the dump. The patch was tested on POWER9 box booted with Linux as host in following cases: Host-Endianess Qemu-Target-Machine Qemu-Guest-Endianess Qemu-Generated-Guest Memory-Dump-Format BE powernv LE KVM guest LE BE powernv BE KVM guest BE LE powernv LE KVM guest LE LE powernv BE KVM guest BE LE pseries KVM LE KVM guest LE LE pseries TCG LE guest LE Signed-off-by: Narayana Murty N <nnmlinux@linux.ibm.com> --- Changes since V2: commit message modified as per feedbak from Nicholas Piggin. Changes since V1: https://lore.kernel.org/qemu-devel/20230420145055.10196-1-nnmlinux@linux.ibm.com/ The approach to solve the issue was changed based on feedback from Fabiano Rosas on patch V1. --- target/ppc/arch_dump.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)