Message ID | 20200507114824.788942-1-npiggin@gmail.com |
---|---|
State | New |
Headers | show |
Series | ppc/pnv: Fix NMI system reset SRR1 value | expand |
On Thu, May 07, 2020 at 09:48:24PM +1000, Nicholas Piggin wrote: > Commit a77fed5bd926 ("ppc/pnv: Add support for NMI interface") got the > SRR1 setting wrong for sresets that hit outside of power-save states. > > Fix this, better documenting the source for the bit definitions. > > Fixes: a77fed5bd926 ("ppc/pnv: Add support for NMI interface") got the > Cc: Cédric Le Goater <clg@kaod.org> > Cc: David Gibson <david@gibson.dropbear.id.au> > Signed-off-by: Nicholas Piggin <npiggin@gmail.com> Applied to ppc-for-5.1, thanks. > --- > > Thanks to Cedric for pointing out concerns with a previous MCE patch > that unearthed this as well. Linux does not actually care what these > SRR1[42:45] bits look like for non-powersave sresets, but we should > follow documented behaviour as far as possible. > > hw/ppc/pnv.c | 26 ++++++++++++++++++++------ > 1 file changed, 20 insertions(+), 6 deletions(-) > > diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c > index a3b7a8d0ff..1b4748ce6d 100644 > --- a/hw/ppc/pnv.c > +++ b/hw/ppc/pnv.c > @@ -1986,12 +1986,26 @@ static void pnv_cpu_do_nmi_on_cpu(CPUState *cs, run_on_cpu_data arg) > > cpu_synchronize_state(cs); > ppc_cpu_do_system_reset(cs); > - /* > - * SRR1[42:45] is set to 0100 which the ISA defines as implementation > - * dependent. POWER processors use this for xscom triggered interrupts, > - * which come from the BMC or NMI IPIs. > - */ > - env->spr[SPR_SRR1] |= PPC_BIT(43); > + if (env->spr[SPR_SRR1] & PPC_BITMASK(46, 47)) { > + /* > + * Power-save wakeups, as indicated by non-zero SRR1[46:47] put the > + * wakeup reason in SRR1[42:45], system reset is indicated with 0b0100 > + * (PPC_BIT(43)). > + */ > + if (!(env->spr[SPR_SRR1] & PPC_BIT(43))) { > + warn_report("ppc_cpu_do_system_reset does not set system reset wakeup reason"); > + env->spr[SPR_SRR1] |= PPC_BIT(43); > + } > + } else { > + /* > + * For non-powersave system resets, SRR1[42:45] are defined to be > + * implementation-dependent. The POWER9 User Manual specifies that > + * an external (SCOM driven, which may come from a BMC nmi command or > + * another CPU requesting a NMI IPI) system reset exception should be > + * 0b0010 (PPC_BIT(44)). > + */ > + env->spr[SPR_SRR1] |= PPC_BIT(44); > + } > } > > static void pnv_nmi(NMIState *n, int cpu_index, Error **errp)
On 5/7/20 1:48 PM, Nicholas Piggin wrote: > Commit a77fed5bd926 ("ppc/pnv: Add support for NMI interface") got the > SRR1 setting wrong for sresets that hit outside of power-save states. > > Fix this, better documenting the source for the bit definitions. > > Fixes: a77fed5bd926 ("ppc/pnv: Add support for NMI interface") got the > Cc: Cédric Le Goater <clg@kaod.org> > Cc: David Gibson <david@gibson.dropbear.id.au> > Signed-off-by: Nicholas Piggin <npiggin@gmail.com> We should introduce some defines like the SRR1_WAKE ones in Linux and cleanup powerpc_reset_wakeup(). This function uses cryptic values. That can be done later on as a followup. Reviewed-by: Cédric Le Goater <clg@kaod.org> > --- > > Thanks to Cedric for pointing out concerns with a previous MCE patch > that unearthed this as well. Linux does not actually care what these > SRR1[42:45] bits look like for non-powersave sresets, but we should > follow documented behaviour as far as possible. We should introduce some defines like the SRR1_WAKE ones in Linux and cleanup powerpc_reset_wakeup(). This function uses cryptic values. That can be done later on as a followup. I am currently after a bug which results in a CPU hard lockup because of a pending interrupt. It occurs on a SMP PowerNV machine when it is stressed with IO, such as scp of a big file. I am suspecting more and more an issue with an interrupt being handled when the CPU is coming out of idle. I haven't seen anything wrong in the models. Unless this maybe : /* Pretend to be returning from doze always as we don't lose state */ *msr |= (0x1ull << (63 - 47)); I am not sure how in sync it is with PSSCR. Thanks, C. > hw/ppc/pnv.c | 26 ++++++++++++++++++++------ > 1 file changed, 20 insertions(+), 6 deletions(-) > > diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c > index a3b7a8d0ff..1b4748ce6d 100644 > --- a/hw/ppc/pnv.c > +++ b/hw/ppc/pnv.c > @@ -1986,12 +1986,26 @@ static void pnv_cpu_do_nmi_on_cpu(CPUState *cs, run_on_cpu_data arg) > > cpu_synchronize_state(cs); > ppc_cpu_do_system_reset(cs); > - /* > - * SRR1[42:45] is set to 0100 which the ISA defines as implementation > - * dependent. POWER processors use this for xscom triggered interrupts, > - * which come from the BMC or NMI IPIs. > - */ > - env->spr[SPR_SRR1] |= PPC_BIT(43); > + if (env->spr[SPR_SRR1] & PPC_BITMASK(46, 47)) { > + /* > + * Power-save wakeups, as indicated by non-zero SRR1[46:47] put the > + * wakeup reason in SRR1[42:45], system reset is indicated with 0b0100 > + * (PPC_BIT(43)). > + */ > + if (!(env->spr[SPR_SRR1] & PPC_BIT(43))) { > + warn_report("ppc_cpu_do_system_reset does not set system reset wakeup reason"); > + env->spr[SPR_SRR1] |= PPC_BIT(43); > + } > + } else { > + /* > + * For non-powersave system resets, SRR1[42:45] are defined to be > + * implementation-dependent. The POWER9 User Manual specifies that > + * an external (SCOM driven, which may come from a BMC nmi command or > + * another CPU requesting a NMI IPI) system reset exception should be > + * 0b0010 (PPC_BIT(44)). > + */ > + env->spr[SPR_SRR1] |= PPC_BIT(44); > + } > } > > static void pnv_nmi(NMIState *n, int cpu_index, Error **errp) >
Patchew URL: https://patchew.org/QEMU/20200507114824.788942-1-npiggin@gmail.com/ Hi, This series seems to have some coding style problems. See output below for more information: Message-id: 20200507114824.788942-1-npiggin@gmail.com Subject: [PATCH] ppc/pnv: Fix NMI system reset SRR1 value Type: series === TEST SCRIPT BEGIN === #!/bin/bash git rev-parse base > /dev/null || exit 0 git config --local diff.renamelimit 0 git config --local diff.renames True git config --local diff.algorithm histogram ./scripts/checkpatch.pl --mailback base.. === TEST SCRIPT END === Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384 Switched to a new branch 'test' fbe72cb ppc/pnv: Fix NMI system reset SRR1 value === OUTPUT BEGIN === ERROR: code indent should never use tabs #35: FILE: hw/ppc/pnv.c:1991: +^I * Power-save wakeups, as indicated by non-zero SRR1[46:47] put the$ ERROR: code indent should never use tabs #36: FILE: hw/ppc/pnv.c:1992: +^I * wakeup reason in SRR1[42:45], system reset is indicated with 0b0100$ ERROR: code indent should never use tabs #37: FILE: hw/ppc/pnv.c:1993: +^I * (PPC_BIT(43)).$ ERROR: code indent should never use tabs #38: FILE: hw/ppc/pnv.c:1994: +^I */$ ERROR: line over 90 characters #40: FILE: hw/ppc/pnv.c:1996: + warn_report("ppc_cpu_do_system_reset does not set system reset wakeup reason"); ERROR: code indent should never use tabs #45: FILE: hw/ppc/pnv.c:2001: +^I * For non-powersave system resets, SRR1[42:45] are defined to be$ ERROR: code indent should never use tabs #46: FILE: hw/ppc/pnv.c:2002: +^I * implementation-dependent. The POWER9 User Manual specifies that$ ERROR: code indent should never use tabs #47: FILE: hw/ppc/pnv.c:2003: +^I * an external (SCOM driven, which may come from a BMC nmi command or$ ERROR: code indent should never use tabs #48: FILE: hw/ppc/pnv.c:2004: +^I * another CPU requesting a NMI IPI) system reset exception should be$ ERROR: code indent should never use tabs #49: FILE: hw/ppc/pnv.c:2005: +^I * 0b0010 (PPC_BIT(44)).$ total: 10 errors, 0 warnings, 32 lines checked Commit fbe72cb9d465 (ppc/pnv: Fix NMI system reset SRR1 value) has style problems, please review. If any of these errors are false positives report them to the maintainer, see CHECKPATCH in MAINTAINERS. === OUTPUT END === Test command exited with code: 1 The full log is available at http://patchew.org/logs/20200507114824.788942-1-npiggin@gmail.com/testing.checkpatch/?type=message. --- Email generated automatically by Patchew [https://patchew.org/]. Please send your feedback to patchew-devel@redhat.com
Excerpts from Cédric Le Goater's message of May 8, 2020 3:14 am: > On 5/7/20 1:48 PM, Nicholas Piggin wrote: >> Commit a77fed5bd926 ("ppc/pnv: Add support for NMI interface") got the >> SRR1 setting wrong for sresets that hit outside of power-save states. >> >> Fix this, better documenting the source for the bit definitions. >> >> Fixes: a77fed5bd926 ("ppc/pnv: Add support for NMI interface") got the >> Cc: Cédric Le Goater <clg@kaod.org> >> Cc: David Gibson <david@gibson.dropbear.id.au> >> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> > > We should introduce some defines like the SRR1_WAKE ones in Linux and > cleanup powerpc_reset_wakeup(). This function uses cryptic values. > That can be done later on as a followup. > > Reviewed-by: Cédric Le Goater <clg@kaod.org> Thanks. >> --- >> >> Thanks to Cedric for pointing out concerns with a previous MCE patch >> that unearthed this as well. Linux does not actually care what these >> SRR1[42:45] bits look like for non-powersave sresets, but we should >> follow documented behaviour as far as possible. > > We should introduce some defines like the SRR1_WAKE ones in Linux and > cleanup powerpc_reset_wakeup(). This function uses cryptic values. > That can be done later on as a followup. > > > I am currently after a bug which results in a CPU hard lockup because > of a pending interrupt. It occurs on a SMP PowerNV machine when it is > stressed with IO, such as scp of a big file. > > I am suspecting more and more an issue with an interrupt being handled > when the CPU is coming out of idle. I haven't seen anything wrong in So you can't hit it when booting Linux with powersave=off? Do we model stop with EC=0 properly? Looks like helper_pminsn seems to be doing the right thing there. > the models. Unless this maybe : > > /* Pretend to be returning from doze always as we don't lose state */ > *msr |= (0x1ull << (63 - 47)); > > I am not sure how in sync it is with PSSCR. That should be okay, the hardware can always enter a shallower state than was asked for. Linux will handle it. For testing purpose, we could model deeper states by scribbling on registers and indicating state loss. Aide from SRR1 sleep state value, Linux uses the SRR1 wake reason value to run the interrupt handler, but even if we got SRR1 wrong, Linux eventually enables MSR[EE] so the interrupt should get replayed then (this is what Linux used to do until we added the wake-reason processing for improved performance). But we do appear to get those right in powerpc_reset_wakeup(). Thanks, Nick
On Thu, 7 May 2020 23:51:54 +1000 David Gibson <david@gibson.dropbear.id.au> wrote: > On Thu, May 07, 2020 at 09:48:24PM +1000, Nicholas Piggin wrote: > > Commit a77fed5bd926 ("ppc/pnv: Add support for NMI interface") got the Please note that the culprit patch was merged with a different SHA1: https://git.qemu.org/?p=qemu.git;a=commit;h=01b552b05b0f21f8ff57a508f7ad26f7abbcd123 > > SRR1 setting wrong for sresets that hit outside of power-save states. > > > > Fix this, better documenting the source for the bit definitions. > > > > Fixes: a77fed5bd926 ("ppc/pnv: Add support for NMI interface") got the Fixes: 01b552b05b0f ("ppc/pnv: Add support for NMI interface") > > Cc: Cédric Le Goater <clg@kaod.org> > > Cc: David Gibson <david@gibson.dropbear.id.au> > > Signed-off-by: Nicholas Piggin <npiggin@gmail.com> > > Applied to ppc-for-5.1, thanks. > > --- > > > > Thanks to Cedric for pointing out concerns with a previous MCE patch > > that unearthed this as well. Linux does not actually care what these > > SRR1[42:45] bits look like for non-powersave sresets, but we should > > follow documented behaviour as far as possible. > > > > hw/ppc/pnv.c | 26 ++++++++++++++++++++------ > > 1 file changed, 20 insertions(+), 6 deletions(-) > > > > diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c > > index a3b7a8d0ff..1b4748ce6d 100644 > > --- a/hw/ppc/pnv.c > > +++ b/hw/ppc/pnv.c > > @@ -1986,12 +1986,26 @@ static void pnv_cpu_do_nmi_on_cpu(CPUState *cs, run_on_cpu_data arg) > > > > cpu_synchronize_state(cs); > > ppc_cpu_do_system_reset(cs); > > - /* > > - * SRR1[42:45] is set to 0100 which the ISA defines as implementation > > - * dependent. POWER processors use this for xscom triggered interrupts, > > - * which come from the BMC or NMI IPIs. > > - */ > > - env->spr[SPR_SRR1] |= PPC_BIT(43); > > + if (env->spr[SPR_SRR1] & PPC_BITMASK(46, 47)) { > > + /* > > + * Power-save wakeups, as indicated by non-zero SRR1[46:47] put the > > + * wakeup reason in SRR1[42:45], system reset is indicated with 0b0100 > > + * (PPC_BIT(43)). > > + */ > > + if (!(env->spr[SPR_SRR1] & PPC_BIT(43))) { > > + warn_report("ppc_cpu_do_system_reset does not set system reset wakeup reason"); > > + env->spr[SPR_SRR1] |= PPC_BIT(43); > > + } > > + } else { > > + /* > > + * For non-powersave system resets, SRR1[42:45] are defined to be > > + * implementation-dependent. The POWER9 User Manual specifies that > > + * an external (SCOM driven, which may come from a BMC nmi command or > > + * another CPU requesting a NMI IPI) system reset exception should be > > + * 0b0010 (PPC_BIT(44)). > > + */ > > + env->spr[SPR_SRR1] |= PPC_BIT(44); > > + } > > } > > > > static void pnv_nmi(NMIState *n, int cpu_index, Error **errp) >
>> of a pending interrupt. It occurs on a SMP PowerNV machine when it is >> stressed with IO, such as scp of a big file. >> >> I am suspecting more and more an issue with an interrupt being handled >> when the CPU is coming out of idle. I haven't seen anything wrong in > > So you can't hit it when booting Linux with powersave=off? no. I uploaded 32GB steadily at 3.0MB/s on a smp 2 machine. When powersave is on, a P8 or P9 machine will miss an interrupt quite quickly. This assert can catch a symptom of the failure : @@ -75,6 +83,9 @@ void ppc_set_irq(PowerPCCPU *cpu, int n_ if (level) { env->pending_interrupts |= 1 << n_IRQ; cpu_interrupt(cs, CPU_INTERRUPT_HARD); + if (!(env->pending_interrupts & (1 << n_IRQ))) { + g_assert_not_reached(); + } } else { env->pending_interrupts &= ~(1 << n_IRQ); if (env->pending_interrupts == 0) { env->pending_interrupts is reseted in ppc_set_irq() setting it. I think it is the CPU handling the external IO interrupt which is kicked to wake up in cpu_interrupt(). The IRQ level goes out of sync with what the device expects and things go bad very quickly after. But this is post mortem. I need to find the right spot where to put an assert() to analyze. But, adding too much traces closes the window ... > Do we model stop with EC=0 properly? Looks like helper_pminsn seems to > be doing the right thing there. Yes. It seems so. The CPUs enter nap and come out with PACA_IRQ_EE set. >> the models. Unless this maybe : >> >> /* Pretend to be returning from doze always as we don't lose state */ >> *msr |= (0x1ull << (63 - 47)); >> >> I am not sure how in sync it is with PSSCR. > > That should be okay, the hardware can always enter a shallower state > than was asked for. Linux will handle it. For testing purpose, we could > model deeper states by scribbling on registers and indicating state loss. > > Aide from SRR1 sleep state value, Linux uses the SRR1 wake reason value > to run the interrupt handler, but even if we got SRR1 wrong, Linux > eventually enables MSR[EE] so the interrupt should get replayed then > (this is what Linux used to do until we added the wake-reason processing > for improved performance). > > But we do appear to get those right in powerpc_reset_wakeup(). yes. Still digging. Thanks, C.
On Fri, May 08, 2020 at 10:43:05AM +0200, Greg Kurz wrote: > On Thu, 7 May 2020 23:51:54 +1000 > David Gibson <david@gibson.dropbear.id.au> wrote: > > > On Thu, May 07, 2020 at 09:48:24PM +1000, Nicholas Piggin wrote: > > > Commit a77fed5bd926 ("ppc/pnv: Add support for NMI interface") got the > > Please note that the culprit patch was merged with a different SHA1: > > https://git.qemu.org/?p=qemu.git;a=commit;h=01b552b05b0f21f8ff57a508f7ad26f7abbcd123 > > > > SRR1 setting wrong for sresets that hit outside of power-save states. > > > > > > Fix this, better documenting the source for the bit definitions. > > > > > > Fixes: a77fed5bd926 ("ppc/pnv: Add support for NMI interface") got the > > Fixes: 01b552b05b0f ("ppc/pnv: Add support for NMI interface") Updated in my tree, thanks. > > > > Cc: Cédric Le Goater <clg@kaod.org> > > > Cc: David Gibson <david@gibson.dropbear.id.au> > > > Signed-off-by: Nicholas Piggin <npiggin@gmail.com> > > > > Applied to ppc-for-5.1, thanks. > > > --- > > > > > > Thanks to Cedric for pointing out concerns with a previous MCE patch > > > that unearthed this as well. Linux does not actually care what these > > > SRR1[42:45] bits look like for non-powersave sresets, but we should > > > follow documented behaviour as far as possible. > > > > > > hw/ppc/pnv.c | 26 ++++++++++++++++++++------ > > > 1 file changed, 20 insertions(+), 6 deletions(-) > > > > > > diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c > > > index a3b7a8d0ff..1b4748ce6d 100644 > > > --- a/hw/ppc/pnv.c > > > +++ b/hw/ppc/pnv.c > > > @@ -1986,12 +1986,26 @@ static void pnv_cpu_do_nmi_on_cpu(CPUState *cs, run_on_cpu_data arg) > > > > > > cpu_synchronize_state(cs); > > > ppc_cpu_do_system_reset(cs); > > > - /* > > > - * SRR1[42:45] is set to 0100 which the ISA defines as implementation > > > - * dependent. POWER processors use this for xscom triggered interrupts, > > > - * which come from the BMC or NMI IPIs. > > > - */ > > > - env->spr[SPR_SRR1] |= PPC_BIT(43); > > > + if (env->spr[SPR_SRR1] & PPC_BITMASK(46, 47)) { > > > + /* > > > + * Power-save wakeups, as indicated by non-zero SRR1[46:47] put the > > > + * wakeup reason in SRR1[42:45], system reset is indicated with 0b0100 > > > + * (PPC_BIT(43)). > > > + */ > > > + if (!(env->spr[SPR_SRR1] & PPC_BIT(43))) { > > > + warn_report("ppc_cpu_do_system_reset does not set system reset wakeup reason"); > > > + env->spr[SPR_SRR1] |= PPC_BIT(43); > > > + } > > > + } else { > > > + /* > > > + * For non-powersave system resets, SRR1[42:45] are defined to be > > > + * implementation-dependent. The POWER9 User Manual specifies that > > > + * an external (SCOM driven, which may come from a BMC nmi command or > > > + * another CPU requesting a NMI IPI) system reset exception should be > > > + * 0b0010 (PPC_BIT(44)). > > > + */ > > > + env->spr[SPR_SRR1] |= PPC_BIT(44); > > > + } > > > } > > > > > > static void pnv_nmi(NMIState *n, int cpu_index, Error **errp) > > >
diff --git a/hw/ppc/pnv.c b/hw/ppc/pnv.c index a3b7a8d0ff..1b4748ce6d 100644 --- a/hw/ppc/pnv.c +++ b/hw/ppc/pnv.c @@ -1986,12 +1986,26 @@ static void pnv_cpu_do_nmi_on_cpu(CPUState *cs, run_on_cpu_data arg) cpu_synchronize_state(cs); ppc_cpu_do_system_reset(cs); - /* - * SRR1[42:45] is set to 0100 which the ISA defines as implementation - * dependent. POWER processors use this for xscom triggered interrupts, - * which come from the BMC or NMI IPIs. - */ - env->spr[SPR_SRR1] |= PPC_BIT(43); + if (env->spr[SPR_SRR1] & PPC_BITMASK(46, 47)) { + /* + * Power-save wakeups, as indicated by non-zero SRR1[46:47] put the + * wakeup reason in SRR1[42:45], system reset is indicated with 0b0100 + * (PPC_BIT(43)). + */ + if (!(env->spr[SPR_SRR1] & PPC_BIT(43))) { + warn_report("ppc_cpu_do_system_reset does not set system reset wakeup reason"); + env->spr[SPR_SRR1] |= PPC_BIT(43); + } + } else { + /* + * For non-powersave system resets, SRR1[42:45] are defined to be + * implementation-dependent. The POWER9 User Manual specifies that + * an external (SCOM driven, which may come from a BMC nmi command or + * another CPU requesting a NMI IPI) system reset exception should be + * 0b0010 (PPC_BIT(44)). + */ + env->spr[SPR_SRR1] |= PPC_BIT(44); + } } static void pnv_nmi(NMIState *n, int cpu_index, Error **errp)
Commit a77fed5bd926 ("ppc/pnv: Add support for NMI interface") got the SRR1 setting wrong for sresets that hit outside of power-save states. Fix this, better documenting the source for the bit definitions. Fixes: a77fed5bd926 ("ppc/pnv: Add support for NMI interface") got the Cc: Cédric Le Goater <clg@kaod.org> Cc: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: Nicholas Piggin <npiggin@gmail.com> --- Thanks to Cedric for pointing out concerns with a previous MCE patch that unearthed this as well. Linux does not actually care what these SRR1[42:45] bits look like for non-powersave sresets, but we should follow documented behaviour as far as possible. hw/ppc/pnv.c | 26 ++++++++++++++++++++------ 1 file changed, 20 insertions(+), 6 deletions(-)