Message ID | 20180823084702.25878-1-vaibhav@linux.ibm.com |
---|---|
State | Accepted |
Headers | show |
Series | [RESEND] phb4: Don't probe a PHB if its garded | expand |
Context | Check | Description |
---|---|---|
snowpatch_ozlabs/apply_patch | success | master/apply_patch Successfully applied |
snowpatch_ozlabs/make_check | success | Test make_check on branch master |
Vaibhav Jain <vaibhav@linux.ibm.com> writes: > Presently phb4_probe_stack() causes an exception while trying to probe > a PHB if its garded. This causes skiboot to go into a reboot loop with > following exception log: > > *********************************************** > Fatal MCE at 000000003006ecd4 .probe_phb4+0x570 > CFAR : 00000000300b98a0 > <snip> > Aborting! > CPU 0018 Backtrace: > S: 0000000031cc37e0 R: 000000003001a51c ._abort+0x4c > S: 0000000031cc3860 R: 0000000030028170 .exception_entry+0x180 > S: 0000000031cc3a40 R: 0000000000001f10 * > S: 0000000031cc3c20 R: 000000003006ecb0 .probe_phb4+0x54c > S: 0000000031cc3e30 R: 0000000030014ca4 .main_cpu_entry+0x5b0 > S: 0000000031cc3f00 R: 0000000030002700 boot_entry+0x1b8 > > This is caused as phb4_probe_stack() will ignore all xscom read/write > errors to enable PHB Bars and then tries to perform an mmio to read > PHB Version registers that cause the fatal MCE. > > We fix this by ignoring the PHB probe if the first xscom_write() to > populate the PHB Bar register fails, which indicates that there is > something wrong with the PHB. > > Cc: stable > Fixes: dc21b4db3a2e('hw/phb4: Add initial support') > Reviewed-by: Oliver O'Halloran <oohall@gmail.com> > Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com> Thanks! Merged to master as of 1520d6a1e3aaec74228d213083b68da70729121a and to 6.0.x as of 54ac06ff404bf13d0a8035985ff67e0610213f5a.
diff --git a/hw/phb4.c b/hw/phb4.c index d1245dce..89b0b859 100644 --- a/hw/phb4.c +++ b/hw/phb4.c @@ -5546,6 +5546,7 @@ static void phb4_probe_stack(struct dt_node *stk_node, uint32_t pec_index, char *path; uint64_t capp_ucode_base; unsigned int max_link_speed; + int rc; gcid = dt_get_chip_id(stk_node); stk_index = dt_prop_get_u32(stk_node, "reg"); @@ -5567,9 +5568,17 @@ static void phb4_probe_stack(struct dt_node *stk_node, uint32_t pec_index, /* Initialize PHB register BAR */ phys_map_get(gcid, PHB4_REG_SPC, phb_num, &phb_bar, NULL); - xscom_write(gcid, nest_stack + XPEC_NEST_STK_PHB_REG_BAR, phb_bar << 8); - bar_en |= XPEC_NEST_STK_BAR_EN_PHB; + rc = xscom_write(gcid, nest_stack + XPEC_NEST_STK_PHB_REG_BAR, + phb_bar << 8); + + /* A scom error here probably indicates a defective/garded PHB */ + if (rc != OPAL_SUCCESS) { + prerror("PHB[%d:%d] Unable to set PHB BAR. Error=%d\n", + gcid, phb_num, rc); + return; + } + bar_en |= XPEC_NEST_STK_BAR_EN_PHB; /* Same with INT BAR (ESB) */ phys_map_get(gcid, PHB4_XIVE_ESB, phb_num, &irq_bar, NULL);