
nvlink: Enable NPU device BAR before triggering freeze

Message ID: 20160620064048.19333-1-ruscur@russell.cc
State: Accepted

Commit Message

Russell Currey June 20, 2016, 6:40 a.m. UTC
NPU freeze injection works by performing an invalid MMIO read on the NPU
device BAR.  If the BAR isn't enabled, which is the case when the
appropriate driver isn't loaded, this checkstops the machine.

Work around this by making sure the BAR is enabled before performing the
read.  The idea of an error inject doing anything other than an error
inject isn't great, but it's better than unintentionally crashing your
machine.

Also, fix the comment incorrectly stating the operation was a write
instead of a read.

Signed-off-by: Russell Currey <ruscur@russell.cc>
---
 hw/npu.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)
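
The ordering described in the commit message can be illustrated with a small
standalone C model. This is only a sketch under stated assumptions: sim_bar,
sim_bar_enable(), sim_invalid_read() and sim_inject_freeze() are hypothetical
names invented for illustration and are not skiboot APIs (the patch itself
calls npu_dev_bar_update() and in_be64() in hw/npu.c). The model simply
encodes what the commit message states: an invalid read through a disabled
BAR checkstops, while the same read through an enabled BAR produces a freeze.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an NPU device BAR; not the skiboot structure. */
struct sim_bar {
	bool enabled;
	uint64_t base;
};

/* Possible outcomes of the simulated invalid MMIO read. */
enum sim_result {
	SIM_FREEZE,	/* intended result of the error injection */
	SIM_CHECKSTOP	/* machine-wide stop, the bug being avoided */
};

/* Hypothetical helper: marks the BAR as enabled. */
void sim_bar_enable(struct sim_bar *bar)
{
	assert(bar != NULL);
	bar->enabled = true;
}

/* Models the invalid read: freeze if the BAR is enabled, else checkstop. */
enum sim_result sim_invalid_read(const struct sim_bar *bar)
{
	assert(bar != NULL);
	return bar->enabled ? SIM_FREEZE : SIM_CHECKSTOP;
}

/* Mirrors the patched ordering: enable the BAR, then perform the read. */
enum sim_result sim_inject_freeze(struct sim_bar *bar)
{
	sim_bar_enable(bar);
	return sim_invalid_read(bar);
}
```

In this model, calling sim_invalid_read() on a BAR that was never enabled
returns SIM_CHECKSTOP, whereas sim_inject_freeze() enables the BAR first and
returns SIM_FREEZE, at the cost of leaving the BAR enabled as a side effect.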

Comments

Stewart Smith June 21, 2016, 5:58 a.m. UTC | #1
Russell Currey <ruscur@russell.cc> writes:

> NPU freeze injection works by performing an invalid MMIO read on the NPU
> device BAR.  If the BAR isn't enabled, which is the case when the
> appropriate driver isn't loaded, this checkstops the machine.
>
> Work around this by making sure the BAR is enabled before performing the
> read.  The idea of an error inject doing anything other than an error
> inject isn't great, but it's better than unintentionally crashing your
> machine.
>
> Also, fix the comment incorrectly stating the operation was a write
> instead of a read.
>
> Signed-off-by: Russell Currey <ruscur@russell.cc>
> ---
>  hw/npu.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)

Looks good. Merged to master as of 0f23e5b.

Patch

diff --git a/hw/npu.c b/hw/npu.c
index e444b96..834b3e3 100644
--- a/hw/npu.c
+++ b/hw/npu.c
@@ -1118,7 +1118,10 @@  static int64_t npu_err_inject(struct phb *phb, uint32_t pe_num,
 		/* Emulate fence mode. */
 		p->fenced = true;
 	} else {
-		/* Cause a freeze with an invalid MMIO write. */
+		/* Cause a freeze with an invalid MMIO read.  If the BAR is not
+		 * enabled, this will checkstop the machine.
+		 */
+		npu_dev_bar_update(p->chip_id, &dev->bar, true);
 		in_be64((void *)dev->bar.base);
 	}