Message ID | 1483175588-17006-4-git-send-email-caoj.fnst@cn.fujitsu.com |
---|---|
State | New |
Headers | show |
On Sat, Dec 31, 2016 at 05:13:07PM +0800, Cao jin wrote: > From: Chen Fan <chen.fan.fnst@cn.fujitsu.com> > > When physical device has uncorrectable error hanppened, the vfio_pci > driver will signal the uncorrectable error status register value to > corresponding QEMU's vfio-pci device via the eventfd registered by this > device, then, the vfio-pci's error eventfd handler will be invoked in > event loop. > > Construct and pass the aer message to root port, root port will trigger an > interrupt to signal guest, then, the guest driver will do the recovery. > > Note: Now only support non-fatal error's recovery, fatal error will > still result in vm stop. > > Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com> > Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com> > Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com> > --- > hw/vfio/pci.c | 50 ++++++++++++++++++++++++++++++++++++++++++-------- > 1 file changed, 42 insertions(+), 8 deletions(-) > > diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c > index 76a8ac3..9861f72 100644 > --- a/hw/vfio/pci.c > +++ b/hw/vfio/pci.c > @@ -2470,21 +2470,55 @@ static void vfio_put_device(VFIOPCIDevice *vdev) > static void vfio_err_notifier_handler(void *opaque) > { > VFIOPCIDevice *vdev = opaque; > + PCIDevice *dev = &vdev->pdev; > + PCIEAERMsg msg = { > + .severity = 0, > + .source_id = (pci_bus_num(dev->bus) << 8) | dev->devfn, > + }; > + int len; > + uint64_t uncor_status; > + > + /* Read uncorrectable error status from driver */ > + len = read(vdev->err_notifier.rfd, &uncor_status, sizeof(uncor_status)); > + if (len != sizeof(uncor_status)) { > + error_report("vfio-pci: uncor error status reading returns" > + " invalid number of bytes: %d", len); > + return; //Or goto stop? I would definitely suggest this to make sure we don't regress. > + } > + > + if (!(vdev->features & VFIO_FEATURE_ENABLE_AER)) { > + goto stop; > + } > + > + /* Populate the aer msg and send it to root port */ > + if (dev->exp.aer_cap) { > + uint8_t *aer_cap = dev->config + dev->exp.aer_cap; > + bool isfatal = uncor_status & > + pci_get_long(aer_cap + PCI_ERR_UNCOR_SEVER); > + > + if (isfatal) { > + goto stop; > + } > + > + msg.severity = isfatal ? PCI_ERR_ROOT_CMD_FATAL_EN : > + PCI_ERR_ROOT_CMD_NONFATAL_EN; > > - if (!event_notifier_test_and_clear(&vdev->err_notifier)) { > + error_report("vfio-pci device %d sending AER to root port. uncor" > + " status = 0x%"PRIx64, dev->devfn, uncor_status); > + pcie_aer_msg(dev, &msg); > return; > } > > +stop: > /* > - * TBD. Retrieve the error details and decide what action > - * needs to be taken. One of the actions could be to pass > - * the error to the guest and have the guest driver recover > - * from the error. This requires that PCIe capabilities be > - * exposed to the guest. For now, we just terminate the > - * guest to contain the error. > + * Terminate the guest in case of > + * 1. AER capability is not exposed to guest. > + * 2. AER capability is exposed, but error is fatal, only non-fatal > + * error is handled now. > */ > > - error_report("%s(%s) Unrecoverable error detected. Please collect any data possible and then kill the guest", __func__, vdev->vbasedev.name); > + error_report("%s(%s) fatal error detected. Please collect any data" > + " possible and then kill the guest", __func__, vdev->vbasedev.name); > > vm_stop(RUN_STATE_INTERNAL_ERROR); > } > -- > 1.8.3.1 > >
On Sat, 31 Dec 2016 17:13:07 +0800 Cao jin <caoj.fnst@cn.fujitsu.com> wrote: > From: Chen Fan <chen.fan.fnst@cn.fujitsu.com> > > When physical device has uncorrectable error hanppened, the vfio_pci > driver will signal the uncorrectable error status register value to > corresponding QEMU's vfio-pci device via the eventfd registered by this > device, then, the vfio-pci's error eventfd handler will be invoked in > event loop. > > Construct and pass the aer message to root port, root port will trigger an > interrupt to signal guest, then, the guest driver will do the recovery. > > Note: Now only support non-fatal error's recovery, fatal error will > still result in vm stop. Please update the entire commit log, don't just add a note that this now only covers non-fatal errors. > > Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com> > Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com> > Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com> > --- > hw/vfio/pci.c | 50 ++++++++++++++++++++++++++++++++++++++++++-------- > 1 file changed, 42 insertions(+), 8 deletions(-) > > diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c > index 76a8ac3..9861f72 100644 > --- a/hw/vfio/pci.c > +++ b/hw/vfio/pci.c > @@ -2470,21 +2470,55 @@ static void vfio_put_device(VFIOPCIDevice *vdev) > static void vfio_err_notifier_handler(void *opaque) > { > VFIOPCIDevice *vdev = opaque; > + PCIDevice *dev = &vdev->pdev; > + PCIEAERMsg msg = { > + .severity = 0, > + .source_id = (pci_bus_num(dev->bus) << 8) | dev->devfn, > + }; > + int len; > + uint64_t uncor_status; > + > + /* Read uncorrectable error status from driver */ > + len = read(vdev->err_notifier.rfd, &uncor_status, sizeof(uncor_status)); Whoa this seems bogus. In the kernel eventfd_signal() adds the value to the internal counter. We can't guarantee that the user is going to immediately respond to every signal, multiple signals might occur, each incrementing the counter. I don't think we can use the eventfd like this. > + if (len != sizeof(uncor_status)) { > + error_report("vfio-pci: uncor error status reading returns" > + " invalid number of bytes: %d", len); > + return; //Or goto stop? It's bogus use of the eventfd anyway afaict, but event_notifier_test_and_clear() certainly handles at least EINTR. > + } > + > + if (!(vdev->features & VFIO_FEATURE_ENABLE_AER)) { > + goto stop; > + } I'm not sure this should be user selected anymore. > + > + /* Populate the aer msg and send it to root port */ > + if (dev->exp.aer_cap) { > + uint8_t *aer_cap = dev->config + dev->exp.aer_cap; > + bool isfatal = uncor_status & > + pci_get_long(aer_cap + PCI_ERR_UNCOR_SEVER); > + > + if (isfatal) { > + goto stop; > + } QEMU uses spaces not tabs. > + > + msg.severity = isfatal ? PCI_ERR_ROOT_CMD_FATAL_EN : > + PCI_ERR_ROOT_CMD_NONFATAL_EN; We don't get here if isfatal. > > - if (!event_notifier_test_and_clear(&vdev->err_notifier)) { > + error_report("vfio-pci device %d sending AER to root port. uncor" > + " status = 0x%"PRIx64, dev->devfn, uncor_status); > + pcie_aer_msg(dev, &msg); > return; > } > > +stop: > /* > - * TBD. Retrieve the error details and decide what action > - * needs to be taken. One of the actions could be to pass > - * the error to the guest and have the guest driver recover > - * from the error. This requires that PCIe capabilities be > - * exposed to the guest. For now, we just terminate the > - * guest to contain the error. > + * Terminate the guest in case of > + * 1. AER capability is not exposed to guest. > + * 2. AER capability is exposed, but error is fatal, only non-fatal > + * error is handled now. You're also currently requiring the user to enable aer per device. > */ > > - error_report("%s(%s) Unrecoverable error detected. Please collect any data possible and then kill the guest", __func__, vdev->vbasedev.name); > + error_report("%s(%s) fatal error detected. Please collect any data" > + " possible and then kill the guest", __func__, vdev->vbasedev.name); > > vm_stop(RUN_STATE_INTERNAL_ERROR); > }
On 01/19/2017 06:31 AM, Alex Williamson wrote: > On Sat, 31 Dec 2016 17:13:07 +0800 > Cao jin <caoj.fnst@cn.fujitsu.com> wrote: > >> From: Chen Fan <chen.fan.fnst@cn.fujitsu.com> >> >> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c >> index 76a8ac3..9861f72 100644 >> --- a/hw/vfio/pci.c >> +++ b/hw/vfio/pci.c >> @@ -2470,21 +2470,55 @@ static void vfio_put_device(VFIOPCIDevice *vdev) >> static void vfio_err_notifier_handler(void *opaque) >> { >> VFIOPCIDevice *vdev = opaque; >> + PCIDevice *dev = &vdev->pdev; >> + PCIEAERMsg msg = { >> + .severity = 0, >> + .source_id = (pci_bus_num(dev->bus) << 8) | dev->devfn, >> + }; >> + int len; >> + uint64_t uncor_status; >> + >> + /* Read uncorrectable error status from driver */ >> + len = read(vdev->err_notifier.rfd, &uncor_status, sizeof(uncor_status)); > > Whoa this seems bogus. In the kernel eventfd_signal() adds the value > to the internal counter. We can't guarantee that the user is going to > immediately respond to every signal, multiple signals might occur, each > incrementing the counter. I don't think we can use the eventfd like > this. > Ah, got your concern, make sense. Michael had the same comments as you, I didn't got him at that time... I explained the reason in v10 cover letter(5 of changelog): I find that error status register reading in error handler sometime returns ALL F's. I may want to take a look at v10, incorporate your comments, and test to see if the issue still exists. Currently if we only handle non-fatal error, we can still use eventfd like before.
> From: Alex Williamson > Sent: Thursday, January 19, 2017 6:32 AM > > On Sat, 31 Dec 2016 17:13:07 +0800 > Cao jin <caoj.fnst@cn.fujitsu.com> wrote: > > > From: Chen Fan <chen.fan.fnst@cn.fujitsu.com> > > > > When physical device has uncorrectable error hanppened, the vfio_pci > > driver will signal the uncorrectable error status register value to > > corresponding QEMU's vfio-pci device via the eventfd registered by this > > device, then, the vfio-pci's error eventfd handler will be invoked in > > event loop. > > > > Construct and pass the aer message to root port, root port will trigger an > > interrupt to signal guest, then, the guest driver will do the recovery. > > > > Note: Now only support non-fatal error's recovery, fatal error will > > still result in vm stop. > > Please update the entire commit log, don't just add a note that this > now only covers non-fatal errors. > One thing relate to vIOMMU. There is still a TODO task about forwarding IOMMU fault thru VFIO to Qemu, so Qemu vIOMMU has the chance to walk guest remapping structure to emulate virtual IOMMU fault. Likely it also requires eventfd mechanism. Wondering whether making sense to reuse same eventfd for both AER and vIOMMU or using separate eventfd is also fine? Even go with the former option, I don't expect substantial change to this series. Major change is on interface definition - extensible to multiple types of fault/error conditions instead of assuming AER only. Thoughts? Thanks Kevin
On Fri, 20 Jan 2017 06:57:22 +0000 "Tian, Kevin" <kevin.tian@intel.com> wrote: > > From: Alex Williamson > > Sent: Thursday, January 19, 2017 6:32 AM > > > > On Sat, 31 Dec 2016 17:13:07 +0800 > > Cao jin <caoj.fnst@cn.fujitsu.com> wrote: > > > > > From: Chen Fan <chen.fan.fnst@cn.fujitsu.com> > > > > > > When physical device has uncorrectable error hanppened, the vfio_pci > > > driver will signal the uncorrectable error status register value to > > > corresponding QEMU's vfio-pci device via the eventfd registered by this > > > device, then, the vfio-pci's error eventfd handler will be invoked in > > > event loop. > > > > > > Construct and pass the aer message to root port, root port will trigger an > > > interrupt to signal guest, then, the guest driver will do the recovery. > > > > > > Note: Now only support non-fatal error's recovery, fatal error will > > > still result in vm stop. > > > > Please update the entire commit log, don't just add a note that this > > now only covers non-fatal errors. > > > > One thing relate to vIOMMU. There is still a TODO task about forwarding > IOMMU fault thru VFIO to Qemu, so Qemu vIOMMU has the chance to > walk guest remapping structure to emulate virtual IOMMU fault. Likely > it also requires eventfd mechanism. > > Wondering whether making sense to reuse same eventfd for both AER > and vIOMMU or using separate eventfd is also fine? Even go with the > former option, I don't expect substantial change to this series. Major > change is on interface definition - extensible to multiple types of > fault/error conditions instead of assuming AER only. > > Thoughts? We can't really convey any information through an eventfd, it's just a signal, so I don't think we can use the same eventfd for both types of errors. Already here we're discussing the idea of using separate eventfds for fatal vs non-fatal AERs. IOMMU error processing seems like yet another eventfd and likely some region or ioctl mechanism for retrieving the error details since the IOMMU hardware is not directly accessible. Furthermore, such an event might logically be connected to the vfio container rather than the device, so it might not even use the same file descriptor. Thanks, Alex
> From: Alex Williamson [mailto:alex.williamson@redhat.com] > Sent: Saturday, January 21, 2017 2:21 AM > > On Fri, 20 Jan 2017 06:57:22 +0000 > "Tian, Kevin" <kevin.tian@intel.com> wrote: > > > > From: Alex Williamson > > > Sent: Thursday, January 19, 2017 6:32 AM > > > > > > On Sat, 31 Dec 2016 17:13:07 +0800 > > > Cao jin <caoj.fnst@cn.fujitsu.com> wrote: > > > > > > > From: Chen Fan <chen.fan.fnst@cn.fujitsu.com> > > > > > > > > When physical device has uncorrectable error hanppened, the vfio_pci > > > > driver will signal the uncorrectable error status register value to > > > > corresponding QEMU's vfio-pci device via the eventfd registered by this > > > > device, then, the vfio-pci's error eventfd handler will be invoked in > > > > event loop. > > > > > > > > Construct and pass the aer message to root port, root port will trigger an > > > > interrupt to signal guest, then, the guest driver will do the recovery. > > > > > > > > Note: Now only support non-fatal error's recovery, fatal error will > > > > still result in vm stop. > > > > > > Please update the entire commit log, don't just add a note that this > > > now only covers non-fatal errors. > > > > > > > One thing relate to vIOMMU. There is still a TODO task about forwarding > > IOMMU fault thru VFIO to Qemu, so Qemu vIOMMU has the chance to > > walk guest remapping structure to emulate virtual IOMMU fault. Likely > > it also requires eventfd mechanism. > > > > Wondering whether making sense to reuse same eventfd for both AER > > and vIOMMU or using separate eventfd is also fine? Even go with the > > former option, I don't expect substantial change to this series. Major > > change is on interface definition - extensible to multiple types of > > fault/error conditions instead of assuming AER only. > > > > Thoughts? > > We can't really convey any information through an eventfd, it's just a > signal, so I don't think we can use the same eventfd for both types of > errors. Already here we're discussing the idea of using separate > eventfds for fatal vs non-fatal AERs. IOMMU error processing seems > like yet another eventfd and likely some region or ioctl mechanism for > retrieving the error details since the IOMMU hardware is not directly > accessible. Furthermore, such an event might logically be connected to > the vfio container rather than the device, so it might not even use the > same file descriptor. Thanks, > Clear enough. Thanks, Kevin
diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c index 76a8ac3..9861f72 100644 --- a/hw/vfio/pci.c +++ b/hw/vfio/pci.c @@ -2470,21 +2470,55 @@ static void vfio_put_device(VFIOPCIDevice *vdev) static void vfio_err_notifier_handler(void *opaque) { VFIOPCIDevice *vdev = opaque; + PCIDevice *dev = &vdev->pdev; + PCIEAERMsg msg = { + .severity = 0, + .source_id = (pci_bus_num(dev->bus) << 8) | dev->devfn, + }; + int len; + uint64_t uncor_status; + + /* Read uncorrectable error status from driver */ + len = read(vdev->err_notifier.rfd, &uncor_status, sizeof(uncor_status)); + if (len != sizeof(uncor_status)) { + error_report("vfio-pci: uncor error status reading returns" + " invalid number of bytes: %d", len); + return; //Or goto stop? + } + + if (!(vdev->features & VFIO_FEATURE_ENABLE_AER)) { + goto stop; + } + + /* Populate the aer msg and send it to root port */ + if (dev->exp.aer_cap) { + uint8_t *aer_cap = dev->config + dev->exp.aer_cap; + bool isfatal = uncor_status & + pci_get_long(aer_cap + PCI_ERR_UNCOR_SEVER); + + if (isfatal) { + goto stop; + } + + msg.severity = isfatal ? PCI_ERR_ROOT_CMD_FATAL_EN : + PCI_ERR_ROOT_CMD_NONFATAL_EN; - if (!event_notifier_test_and_clear(&vdev->err_notifier)) { + error_report("vfio-pci device %d sending AER to root port. uncor" + " status = 0x%"PRIx64, dev->devfn, uncor_status); + pcie_aer_msg(dev, &msg); return; } +stop: /* - * TBD. Retrieve the error details and decide what action - * needs to be taken. One of the actions could be to pass - * the error to the guest and have the guest driver recover - * from the error. This requires that PCIe capabilities be - * exposed to the guest. For now, we just terminate the - * guest to contain the error. + * Terminate the guest in case of + * 1. AER capability is not exposed to guest. + * 2. AER capability is exposed, but error is fatal, only non-fatal + * error is handled now. */ - error_report("%s(%s) Unrecoverable error detected. Please collect any data possible and then kill the guest", __func__, vdev->vbasedev.name); + error_report("%s(%s) fatal error detected. Please collect any data" + " possible and then kill the guest", __func__, vdev->vbasedev.name); vm_stop(RUN_STATE_INTERNAL_ERROR); }