Message ID | 20180806021742.18265-1-andrew.donnellan@au1.ibm.com |
---|---|
State | Accepted |
Series | hw/npu2: Don't assert if we hit a mixed OpenCAPI/NVLink setup |
Context | Check | Description |
---|---|---|
snowpatch_ozlabs/apply_patch | success | master/apply_patch Successfully applied |
snowpatch_ozlabs/make_check | success | Test make_check on branch master |
Andrew Donnellan <andrew.donnellan@au1.ibm.com> writes:
> If our device tree contains a mix of OpenCAPI and NVLink links, that's a
> problem, but it's not fatal and we should simply abort NPU init rather than
> kill the machine - this is helpful for doing further debugging.
>
> Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>

Merged to master as of 8a8cc857fa3f4a635cd9ef4acbd5abdfbe7872bd

>
> Debating whether or not to send a backport of this to the 6.0 stable
> tree...

Is it something that could/would occur on any system that someone has
shipped? Probably not, so we're likely okay?
On 07/08/18 13:00, Stewart Smith wrote:
> Andrew Donnellan <andrew.donnellan@au1.ibm.com> writes:
>> If our device tree contains a mix of OpenCAPI and NVLink links, that's a
>> problem, but it's not fatal and we should simply abort NPU init rather than
>> kill the machine - this is helpful for doing further debugging.
>>
>> Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
>
> Merged to master as of 8a8cc857fa3f4a635cd9ef4acbd5abdfbe7872bd
>
>>
>> Debating whether or not to send a backport of this to the 6.0 stable
>> tree...
>
> Is it something that could/would occur on any system that someone has
> shipped? Probably not, so we're likely okay?

Yeah, actually they should never hit this in a release - the only case where
we're seeing this is when skiboot-level platform hacks conflict with changes
in HDAT, so I don't think we need a stable backport.
diff --git a/hw/npu2.c b/hw/npu2.c
index acd56c14e1da..5a5e6944c898 100644
--- a/hw/npu2.c
+++ b/hw/npu2.c
@@ -1362,7 +1362,7 @@ static void npu2_probe_phb(struct dt_node *dn)
 	if (ocapi_detected && nvlink_detected) {
 		prlog(PR_ERR, "NPU: NVLink and OpenCAPI devices on same chip not supported\n");
-		assert(false);
+		return;
 	} else if (ocapi_detected) {
 		prlog(PR_INFO, "NPU: OpenCAPI link configuration detected, not initialising NVLink\n");
 		return;
If our device tree contains a mix of OpenCAPI and NVLink links, that's a
problem, but it's not fatal and we should simply abort NPU init rather than
kill the machine - this is helpful for doing further debugging.

Signed-off-by: Andrew Donnellan <andrew.donnellan@au1.ibm.com>
---
Debating whether or not to send a backport of this to the 6.0 stable
tree...
---
 hw/npu2.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
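
For readers who want to see the control flow outside of skiboot, here is a rough standalone sketch of the behaviour after this change. It is not the skiboot code: `probe_links()`, the `probe_result` enum and the use of `fprintf()` in place of `prlog()` are assumptions made purely for illustration (the real `npu2_probe_phb()` returns `void`), and only the two log messages are copied from the patch.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical result codes for illustration only; skiboot's
 * npu2_probe_phb() returns void and has no such enum. */
enum probe_result {
	PROBE_CONTINUE,      /* nothing here stops the probe */
	PROBE_SKIP_OPENCAPI, /* OpenCAPI links found: NVLink init skipped */
	PROBE_SKIP_MIXED,    /* mixed links: NPU init abandoned, no assert */
};

/* Stand-in for the link-type check in the patched function. The mixed
 * case logs an error and returns early instead of calling assert(false),
 * so the caller keeps running and can be debugged further. */
enum probe_result probe_links(bool ocapi_detected, bool nvlink_detected)
{
	if (ocapi_detected && nvlink_detected) {
		fprintf(stderr, "NPU: NVLink and OpenCAPI devices on same chip not supported\n");
		return PROBE_SKIP_MIXED;
	} else if (ocapi_detected) {
		fprintf(stderr, "NPU: OpenCAPI link configuration detected, not initialising NVLink\n");
		return PROBE_SKIP_OPENCAPI;
	}
	return PROBE_CONTINUE;
}
```

Calling `probe_links(true, true)` in this sketch logs the error and hands control back to the caller, which is the point of replacing the assert with a return: the misconfiguration is reported, NPU init is skipped, and the rest of the boot is left alone.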