Message ID | 20180118040943.13135-1-vaibhav@linux.vnet.ibm.com |
---|---|
State | Accepted |
Series | [PATCH-RESEND] capi: Disable CAPP virtual machines |
On 18/01/18 15:09, Vaibhav Jain wrote:
> When exercising more than one CAPI accelerators simultaneously in
> cache coherency mode, the verification team is seeing a deadlock. To
> fix this a workaround of disabling CAPP virtual machines is
> suggested. These 'virtual machines' let PSL queue multiple CAPP
> commands for servicing by CAPP there by increasing
> throughput. Below is the error scenario described by the h/w team:
>
> " With virtual machines enabled we had a deadlock scenario where with 2
> or more CAPI's in a system you could get in a deadlock scenario due to
> cast-outs that are required break the deadlock (evict lines that
> another CAPI is requesting) get stuck in the virtual machine queue by
> a command ahead of it that is being retried by the same scenario in
> the other CAPI. "
>
> So this patch updates CAPP APC Master Powerbus control
> register during CAPP init to also set Bit(12) that disables CAPP
> virtual machines. This forces processing of CAPP commands from PSL one
> at a time and thereby preventing above mentioned deadlock scenario.
>
> Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>

Thanks for the description - that makes a lot more sense.

Should this be heading to stable?

Reviewed-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>

> ---
> Change-log:
> Resend -> Updated the patch description with more info CAPP virtual
> machines and the error scenario.
> ---
> hw/phb4.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/hw/phb4.c b/hw/phb4.c
> index ff912e1f..8e660b66 100644
> --- a/hw/phb4.c
> +++ b/hw/phb4.c
> @@ -3581,6 +3581,7 @@ static void phb4_init_capp_regs(struct phb4 *p, uint32_t capp_eng)
>  	xscom_read(p->chip_id, APC_MASTER_PB_CTRL + offset, &reg);
>  	reg |= PPC_BIT(0); /* enable cResp exam */
>  	reg |= PPC_BIT(3); /* disable vg not sys */
> +	reg |= PPC_BIT(12);/* HW417025: disable capp virtual machines */
>  	if (p->rev == PHB4_REV_NIMBUS_DD10) {
>  		reg |= PPC_BIT(1);
>  	} else {
>
On 18/01/2018 at 05:09, Vaibhav Jain wrote:
> [...]
> diff --git a/hw/phb4.c b/hw/phb4.c
> index ff912e1f..8e660b66 100644
> --- a/hw/phb4.c
> +++ b/hw/phb4.c
> @@ -3581,6 +3581,7 @@ static void phb4_init_capp_regs(struct phb4 *p, uint32_t capp_eng)
>  	xscom_read(p->chip_id, APC_MASTER_PB_CTRL + offset, &reg);
>  	reg |= PPC_BIT(0); /* enable cResp exam */
>  	reg |= PPC_BIT(3); /* disable vg not sys */
> +	reg |= PPC_BIT(12);/* HW417025: disable capp virtual machines */

Should this patch be applied on all chips?
And same question for all devices using a PSL.

>  	if (p->rev == PHB4_REV_NIMBUS_DD10) {
>  		reg |= PPC_BIT(1);
>  	} else {
>
Hi Christophe,

christophe lombard <clombard@linux.vnet.ibm.com> writes:
> Should this patch be applied on all chips ?
> And same question for all devices using a PSL

The h/w team has confirmed that this needs to be done for all current P9 chips. This is the only workaround for HW417025 in the errata section of the CAPP workbook.
On 18/01/2018 at 05:09, Vaibhav Jain wrote:
> [...]
> +	reg |= PPC_BIT(12);/* HW417025: disable capp virtual machines */
> [...]

Acked-by: Christophe Lombard clombard@linux.vnet.ibm.com
Vaibhav Jain <vaibhav@linux.vnet.ibm.com> writes:
> [...]
> So this patch updates CAPP APC Master Powerbus control
> register during CAPP init to also set Bit(12) that disables CAPP
> virtual machines. This forces processing of CAPP commands from PSL one
> at a time and thereby preventing above mentioned deadlock scenario.
>
> Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
> ---
> Change-log:
> Resend -> Updated the patch description with more info CAPP virtual
> machines and the error scenario.
> ---
> hw/phb4.c | 1 +
> 1 file changed, 1 insertion(+)

Thanks! Merged to master as of 5a959af3fb417c4269b625d9ff2cb204f20728d5
christophe lombard <clombard@linux.vnet.ibm.com> writes:
> On 18/01/2018 at 05:09, Vaibhav Jain wrote:
>> [...]
>> +	reg |= PPC_BIT(12);/* HW417025: disable capp virtual machines */
>> [...]
>
> Acked-by: Christophe Lombard clombard@linux.vnet.ibm.com

You seem to be missing the < > around the email address, which caused patchwork not to pick up the acked-by. Might want to fix that :)
On 01/02/18 19:29, Stewart Smith wrote:
>> Acked-by: Christophe Lombard clombard@linux.vnet.ibm.com
>
> You seem to be missing the < > around the email address, which caused
> patchwork not to pick up the acked-by. Might want to fix that :)
>

The next version of patchwork will fix that particular regex, but yes, gotta maintain the official style ;)
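The tag-format point raised in this exchange can be illustrated with a small check. This is only a sketch: `tag_has_bracketed_email` is a hypothetical helper written for this example, not patchwork's actual parser or regex, and it only tests whether a tag line ends with an address wrapped in `<` and `>`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical checker (NOT patchwork's real parser): returns true when
 * the line ends with an address wrapped in angle brackets, for example
 * "Acked-by: Name <user@host>".
 */
static bool tag_has_bracketed_email(const char *line)
{
	const char *open, *close, *at;

	assert(line != NULL);

	open = strrchr(line, '<');
	close = strrchr(line, '>');
	if (!open || !close || close < open)
		return false;

	/* The closing bracket must be the last character on the line. */
	if (close[1] != '\0')
		return false;

	/* Require an '@' with at least one character on each side. */
	at = memchr(open, '@', (size_t)(close - open));
	return at != NULL && at > open + 1 && at + 1 < close;
}
```

With this helper, the Acked-by line as originally sent is rejected, while the same line with the address in angle brackets is accepted.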
On 01/02/2018 at 09:29, Stewart Smith wrote:
> christophe lombard <clombard@linux.vnet.ibm.com> writes:
>> [...]
>> Acked-by: Christophe Lombard clombard@linux.vnet.ibm.com
>
> You seem to be missing the < > around the email address, which caused
> patchwork not to pick up the acked-by. Might want to fix that :)
>

Oops, sorry about that. I will check this point.
diff --git a/hw/phb4.c b/hw/phb4.c
index ff912e1f..8e660b66 100644
--- a/hw/phb4.c
+++ b/hw/phb4.c
@@ -3581,6 +3581,7 @@ static void phb4_init_capp_regs(struct phb4 *p, uint32_t capp_eng)
 	xscom_read(p->chip_id, APC_MASTER_PB_CTRL + offset, &reg);
 	reg |= PPC_BIT(0); /* enable cResp exam */
 	reg |= PPC_BIT(3); /* disable vg not sys */
+	reg |= PPC_BIT(12);/* HW417025: disable capp virtual machines */
 	if (p->rev == PHB4_REV_NIMBUS_DD10) {
 		reg |= PPC_BIT(1);
 	} else {
When exercising more than one CAPI accelerator simultaneously in
cache coherency mode, the verification team is seeing a deadlock. To
fix this, a workaround of disabling CAPP virtual machines is
suggested. These 'virtual machines' let PSL queue multiple CAPP
commands for servicing by CAPP, thereby increasing
throughput. Below is the error scenario described by the h/w team:

" With virtual machines enabled we had a deadlock scenario where with 2
or more CAPI's in a system you could get in a deadlock scenario due to
cast-outs that are required break the deadlock (evict lines that
another CAPI is requesting) get stuck in the virtual machine queue by
a command ahead of it that is being retried by the same scenario in
the other CAPI. "

So this patch updates the CAPP APC Master Powerbus control
register during CAPP init to also set Bit(12), which disables CAPP
virtual machines. This forces CAPP commands from PSL to be processed one
at a time, thereby preventing the above-mentioned deadlock scenario.

Signed-off-by: Vaibhav Jain <vaibhav@linux.vnet.ibm.com>
---
Change-log:
Resend -> Updated the patch description with more info on CAPP virtual
machines and the error scenario.
---
 hw/phb4.c | 1 +
 1 file changed, 1 insertion(+)
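To make the "set Bit(12)" step concrete, here is a minimal standalone sketch of the read-modify-write on a 64-bit register value. It assumes `PPC_BIT()` follows IBM bit numbering, where bit 0 is the most significant bit; the macro is redefined locally for illustration. `capp_disable_virtual_machines` is a hypothetical helper name, and the sketch does no XSCOM access.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumption: IBM (big-endian) bit numbering, so bit 0 is the most
 * significant bit of a 64-bit register. Defined locally for this
 * illustration only.
 */
#define PPC_BIT(bit) (0x8000000000000000ULL >> (bit))

/*
 * Hypothetical helper: OR the HW417025 workaround bit into a value that
 * was previously read from the APC Master Powerbus control register.
 */
static uint64_t capp_disable_virtual_machines(uint64_t reg)
{
	uint64_t updated = reg | PPC_BIT(12);

	/* A read-modify-write must keep every bit that was already set. */
	assert((updated & reg) == reg);
	return updated;
}
```

Under that numbering, `PPC_BIT(12)` is the mask `0x0008000000000000`, so the OR leaves the existing bits 0 and 3 untouched.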