Message ID | 20200108153350.4724-1-fbarrat@linux.ibm.com |
---|---|
State | Accepted |
Series | [v2] npu2-opencapi: don't fence on masked XSL errors |
Context | Check | Description |
---|---|---|
snowpatch_ozlabs/apply_patch | success | Successfully applied on branch master (d75e82dbfbb9443efeb3f9a5921ac23605aab469) |
snowpatch_ozlabs/snowpatch_job_snowpatch-skiboot | success | Test snowpatch/job/snowpatch-skiboot on branch master |
snowpatch_ozlabs/snowpatch_job_snowpatch-skiboot-dco | success | Signed-off-by present |
On 9/1/20 2:33 am, Frederic Barrat wrote:
> An upcoming change in the initfile is going to modify the default
> action and fence behavior of some of the NPU FIR2 bits. We're already
> overriding the settings of most of those. The one exception is for
> bits 41 and 42, which are XSL errors impacting 2 links that we
> mask (instead we rely on the subsequent OTL error, which is per link).
>
> The new initfile will fence-on-error for bits 41 and 42. And even if
> the FIRs are masked, the NPU logic could fence the links, which is not
> what we want. So this patch makes sure we don't fence on the FIRs we
> want to ignore. It has no effect on existing firmware.
>
> Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>

Reviewed-by: Andrew Donnellan <ajd@linux.ibm.com>

> ---
> Changelog:
> v2: add comment and use macro for the xsl bits we ignore (Andrew)
>
> hw/npu2-opencapi.c | 11 +++++++++--
> 1 file changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/hw/npu2-opencapi.c b/hw/npu2-opencapi.c
> index ed6650f4..07e81d23 100644
> --- a/hw/npu2-opencapi.c
> +++ b/hw/npu2-opencapi.c
> @@ -1649,7 +1649,7 @@ static int enable_interrupts(struct npu2 *p)
> 	 *   the systems, since we can just fence the brick and keep
> 	 *   the system alive.
> 	 * - the exception to the above is 2 FIRs for XSL errors
> -	 *   resulting of bad AFU behavior, for which we don't want to
> +	 *   resulting from bad AFU behavior, for which we don't want to
> 	 *   checkstop but can't configure to send an error interrupt
> 	 *   either, as the XSL errors are reported on 2 links (the
> 	 *   XSL is shared between 2 links). Instead, we mask
> @@ -1661,7 +1661,8 @@ static int enable_interrupts(struct npu2 *p)
> 	 */
> 	xsl_fault = PPC_BIT(0) | PPC_BIT(1) | PPC_BIT(2) | PPC_BIT(3);
> 	xstop_override = 0x0FFFEFC00F91B000;
> -	xsl_mask = PPC_BIT(41) | PPC_BIT(42);
> +	xsl_mask = NPU2_CHECKSTOP_REG2_XSL_XLAT_REQ_WHILE_SPAP_INVALID |
> +		NPU2_CHECKSTOP_REG2_XSL_INVALID_PEE;
>
> 	xscom_read(p->chip_id, p->xscom_base + NPU2_MISC_FIR2_MASK, &reg);
> 	reg |= xsl_fault | xstop_override | xsl_mask;
> @@ -1677,10 +1678,16 @@ static int enable_interrupts(struct npu2 *p)
> 	 * Make sure the brick is fenced on those errors.
> 	 * Fencing is incompatible with freezing, but there's no
> 	 * freeze defined for FIR2, so we don't have to worry about it
> +	 *
> +	 * For the 2 XSL bits we ignore, we need to make sure they
> +	 * don't fence the link, as the NPU logic could allow it even
> +	 * when masked.
> 	 */
> 	reg = npu2_scom_read(p->chip_id, p->xscom_base, NPU2_MISC_FENCE_ENABLE2,
> 			     NPU2_MISC_DA_LEN_8B);
> 	reg |= xstop_override;
> +	reg &= ~NPU2_CHECKSTOP_REG2_XSL_XLAT_REQ_WHILE_SPAP_INVALID;
> +	reg &= ~NPU2_CHECKSTOP_REG2_XSL_INVALID_PEE;
> 	npu2_scom_write(p->chip_id, p->xscom_base, NPU2_MISC_FENCE_ENABLE2,
> 			NPU2_MISC_DA_LEN_8B, reg);
>
On Thu, Jan 9, 2020 at 2:34 AM Frederic Barrat <fbarrat@linux.ibm.com> wrote:
>
> An upcoming change in the initfile is going to modify the default
> action and fence behavior of some of the NPU FIR2 bits. We're already
> overriding the settings of most of those. The one exception is for
> bits 41 and 42, which are XSL errors impacting 2 links that we
> mask (instead we rely on the subsequent OTL error, which is per link).
>
> The new initfile will fence-on-error for bits 41 and 42. And even if
> the FIRs are masked, the NPU logic could fence the links, which is not
> what we want. So this patch makes sure we don't fence on the FIRs we
> want to ignore. It has no effect on existing firmware.
>
> Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>
> ---
> Changelog:
> v2: add comment and use macro for the xsl bits we ignore (Andrew)

Thanks, merged as 09478eaeef8dc272586a29190d58f47b50ec821b
diff --git a/hw/npu2-opencapi.c b/hw/npu2-opencapi.c
index ed6650f4..07e81d23 100644
--- a/hw/npu2-opencapi.c
+++ b/hw/npu2-opencapi.c
@@ -1649,7 +1649,7 @@ static int enable_interrupts(struct npu2 *p)
 	 *   the systems, since we can just fence the brick and keep
 	 *   the system alive.
 	 * - the exception to the above is 2 FIRs for XSL errors
-	 *   resulting of bad AFU behavior, for which we don't want to
+	 *   resulting from bad AFU behavior, for which we don't want to
 	 *   checkstop but can't configure to send an error interrupt
 	 *   either, as the XSL errors are reported on 2 links (the
 	 *   XSL is shared between 2 links). Instead, we mask
@@ -1661,7 +1661,8 @@ static int enable_interrupts(struct npu2 *p)
 	 */
 	xsl_fault = PPC_BIT(0) | PPC_BIT(1) | PPC_BIT(2) | PPC_BIT(3);
 	xstop_override = 0x0FFFEFC00F91B000;
-	xsl_mask = PPC_BIT(41) | PPC_BIT(42);
+	xsl_mask = NPU2_CHECKSTOP_REG2_XSL_XLAT_REQ_WHILE_SPAP_INVALID |
+		NPU2_CHECKSTOP_REG2_XSL_INVALID_PEE;

 	xscom_read(p->chip_id, p->xscom_base + NPU2_MISC_FIR2_MASK, &reg);
 	reg |= xsl_fault | xstop_override | xsl_mask;
@@ -1677,10 +1678,16 @@ static int enable_interrupts(struct npu2 *p)
 	 * Make sure the brick is fenced on those errors.
 	 * Fencing is incompatible with freezing, but there's no
 	 * freeze defined for FIR2, so we don't have to worry about it
+	 *
+	 * For the 2 XSL bits we ignore, we need to make sure they
+	 * don't fence the link, as the NPU logic could allow it even
+	 * when masked.
 	 */
 	reg = npu2_scom_read(p->chip_id, p->xscom_base, NPU2_MISC_FENCE_ENABLE2,
 			     NPU2_MISC_DA_LEN_8B);
 	reg |= xstop_override;
+	reg &= ~NPU2_CHECKSTOP_REG2_XSL_XLAT_REQ_WHILE_SPAP_INVALID;
+	reg &= ~NPU2_CHECKSTOP_REG2_XSL_INVALID_PEE;
 	npu2_scom_write(p->chip_id, p->xscom_base, NPU2_MISC_FENCE_ENABLE2,
 			NPU2_MISC_DA_LEN_8B, reg);
An upcoming change in the initfile is going to modify the default
action and fence behavior of some of the NPU FIR2 bits. We're already
overriding the settings of most of those. The one exception is for
bits 41 and 42, which are XSL errors impacting 2 links that we
mask (instead we rely on the subsequent OTL error, which is per link).

The new initfile will fence-on-error for bits 41 and 42. And even if
the FIRs are masked, the NPU logic could fence the links, which is not
what we want. So this patch makes sure we don't fence on the FIRs we
want to ignore. It has no effect on existing firmware.

Signed-off-by: Frederic Barrat <fbarrat@linux.ibm.com>

---
Changelog:
v2: add comment and use macro for the xsl bits we ignore (Andrew)

 hw/npu2-opencapi.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)
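
The bit arithmetic described in the commit message can be checked in isolation. What follows is a minimal standalone sketch, not the skiboot code: it involves no SCOM access, the helper names `fir2_mask_value` and `fence_enable2_value` are hypothetical, and `XSL_IGNORED_BITS` is an illustrative stand-in that assumes the two `NPU2_CHECKSTOP_REG2_XSL_*` macros cover exactly bits 41 and 42, as the commit message implies. `PPC_BIT` is written out here in the usual IBM bit numbering, where bit 0 is the most significant bit.

```c
#include <stdint.h>

/* IBM bit numbering: bit 0 is the most significant bit of a 64-bit word. */
#define PPC_BIT(bit)		(0x8000000000000000ULL >> (bit))

/* Values taken from the patch. */
#define XSL_FAULT_BITS		(PPC_BIT(0) | PPC_BIT(1) | PPC_BIT(2) | PPC_BIT(3))
#define XSTOP_OVERRIDE		0x0FFFEFC00F91B000ULL

/*
 * Assumption: stand-in for the two NPU2_CHECKSTOP_REG2_XSL_* macros,
 * taken to be bits 41 and 42 as stated in the commit message.
 */
#define XSL_IGNORED_BITS	(PPC_BIT(41) | PPC_BIT(42))

/*
 * The override value does not contain the two ignored bits, so OR-ing it
 * into the fence enable register never sets them. They can only be set by
 * whatever value the register already held, for example an initfile default.
 */
_Static_assert((XSTOP_OVERRIDE & XSL_IGNORED_BITS) == 0,
	       "override must not cover the ignored XSL bits");

/* Hypothetical helper: value written back to the FIR2 mask register. */
static inline uint64_t fir2_mask_value(uint64_t reg)
{
	/* Mask the default action for every bit handled here. */
	return reg | XSL_FAULT_BITS | XSTOP_OVERRIDE | XSL_IGNORED_BITS;
}

/* Hypothetical helper: value written back to the FIR2 fence enable register. */
static inline uint64_t fence_enable2_value(uint64_t reg)
{
	reg |= XSTOP_OVERRIDE;		/* fence the brick on overridden errors */
	reg &= ~XSL_IGNORED_BITS;	/* never fence on the masked XSL errors */
	return reg;
}
```

Under these assumptions, feeding `fence_enable2_value()` a register value that already has bits 41 and 42 set returns a value with both bits cleared, while a value of 0 simply yields the override. That is consistent with the statement that the patch has no effect on existing firmware, where those fence bits are presumably not set.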