Message ID | 20181101053515.4344-1-vaibhav@linux.ibm.com |
---|---|
State | Accepted |
Series | phb4/capp: Only reset FIR bits that cause capp machine check |
Context | Check | Description |
---|---|---|
snowpatch_ozlabs/apply_patch | success | master/apply_patch Successfully applied |
snowpatch_ozlabs/make_check | success | Test make_check on branch master |
Vaibhav Jain <vaibhav@linux.ibm.com> writes:
> During CAPP recovery do_capp_recovery_scoms() will reset the CAPP Fir
> register just after CAPP recovery is completed. This has an
> unintentional side effect of preventing PRD from analyzing and
> reporting this error. If PRD tries to read the CAPP FIR after opal has
> already reset it, then it logs a critical error complaining "No active
> error bits found".
>
> To prevent this from happening we update do_capp_recovery_scoms() to
> only reset fir bits that cause CAPP machine check (local xstop). This
> is done by reading the CAPP Fir Action0/1 & Mask registers and
> generating a mask which is then written on CAPP_FIR_CLEAR register.
>
> Cc: stable
> Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>

Cheers, Merged to master as of 999246716d2da347aad46a28ed9899b832bffe6c
and into 6.0.x as of bf93742f5c047082a759dda6799e42808e2f9135 for 6.0.11
diff --git a/hw/phb4.c b/hw/phb4.c
index 10df206b..7a1f58e3 100644
--- a/hw/phb4.c
+++ b/hw/phb4.c
@@ -3055,7 +3055,24 @@ static int do_capp_recovery_scoms(struct phb4 *p)

 	/* Check if the recovery failed or passed */
 	if (reg & PPC_BIT(1)) {
+		uint64_t act0, act1, mask, fir;
+
+		/* Use the Action0/1 and mask to only clear the bits
+		 * that cause local checkstop. Other bits needs attention
+		 * of the PRD daemon.
+		 */
+		xscom_read(p->chip_id, CAPP_FIR_ACTION0 + offset, &act0);
+		xscom_read(p->chip_id, CAPP_FIR_ACTION1 + offset, &act1);
+		xscom_read(p->chip_id, CAPP_FIR_MASK + offset, &mask);
+		xscom_read(p->chip_id, CAPP_FIR + offset, &fir);
+
+		fir = ~(fir & ~mask & act0 & act1);
 		PHBDBG(p, "Doing CAPP recovery scoms\n");
+
+		/* update capp fir clearing bits causing local checkstop */
+		PHBDBG(p, "Resetting CAPP Fir with mask 0x%016llX\n", fir);
+		xscom_write(p->chip_id, CAPP_FIR_CLEAR + offset, fir);
+
 		/* disable snoops */
 		xscom_write(p->chip_id, SNOOP_CAPI_CONFIG + offset, 0);
 		load_capp_ucode(p);
diff --git a/include/phb4-capp.h b/include/phb4-capp.h
index 68200ac5..2f309d4c 100644
--- a/include/phb4-capp.h
+++ b/include/phb4-capp.h
@@ -23,6 +23,7 @@
 #define CAPP_APC_MASTER_ARRAY_WRITE_REG		0x2010842  /* Satellite 2 */

 #define CAPP_FIR				0x2010800
+#define CAPP_FIR_CLEAR				0x2010801
 #define CAPP_FIR_MASK				0x2010803
 #define CAPP_FIR_ACTION0			0x2010806
 #define CAPP_FIR_ACTION1			0x2010807
During CAPP recovery do_capp_recovery_scoms() will reset the CAPP Fir
register just after CAPP recovery is completed. This has an
unintentional side effect of preventing PRD from analyzing and
reporting this error. If PRD tries to read the CAPP FIR after opal has
already reset it, then it logs a critical error complaining "No active
error bits found".

To prevent this from happening we update do_capp_recovery_scoms() to
only reset fir bits that cause CAPP machine check (local xstop). This
is done by reading the CAPP Fir Action0/1 & Mask registers and
generating a mask which is then written on CAPP_FIR_CLEAR register.

Cc: stable
Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
---
 hw/phb4.c           | 17 +++++++++++++++++
 include/phb4-capp.h |  1 +
 2 files changed, 18 insertions(+)
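
The mask generation described in the commit message can be tried outside of skiboot with plain integers. The following is only a minimal sketch: the helper names `capp_fir_clear_value()` and `apply_write_and()` are hypothetical and do not exist in skiboot, and it assumes that CAPP_FIR_CLEAR behaves as a write-AND register (which the complement in the patch suggests, but the patch itself does not state).

```c
#include <stdint.h>

/*
 * Hypothetical helper, not part of skiboot.
 *
 * Selects the FIR bits that are active, not masked, and have both
 * Action0 and Action1 set (treated by the patch as local checkstop),
 * then returns the complement so that only those bits are cleared.
 */
static uint64_t capp_fir_clear_value(uint64_t fir, uint64_t mask,
				     uint64_t act0, uint64_t act1)
{
	uint64_t xstop_bits = fir & ~mask & act0 & act1;

	return ~xstop_bits;
}

/*
 * Hypothetical model of a write-AND register (an assumption about the
 * hardware): the written value is ANDed into the current contents.
 */
static uint64_t apply_write_and(uint64_t current, uint64_t written)
{
	return current & written;
}
```

With made-up values fir = 0xF000000000000000, mask = 0x1000000000000000, act0 = 0xD000000000000000 and act1 = 0xB000000000000000, only the most significant bit is unmasked with both actions set. The value to write is therefore 0x7FFFFFFFFFFFFFFF, and under the write-AND assumption the FIR would be left at 0x7000000000000000, so the remaining bits stay visible to PRD.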