[v2,0/3] PCI: Add PCI_ERROR_RESPONSE, check for errors

Message ID 20190822200551.129039-1-helgaas@kernel.org

Message

Bjorn Helgaas Aug. 22, 2019, 8:05 p.m. UTC
From: Bjorn Helgaas <bhelgaas@google.com>

Reads from a PCI device may fail if the device has been turned off (put
into D3cold), removed, or if some other error occurs.  The PCI host bridge
typically fabricates ~0 data to complete the CPU's read.

We check for that in a few places, but not in a consistent way.  This
series adds a PCI_ERROR_RESPONSE definition to make the checks more
consistent and easier to find.  Note that ~0 may indicate a PCI error, but
it may also be valid read data, so you need more information (such as
knowing that a register can never contain ~0) before concluding that it's
an error.
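
As a rough illustration of the kind of check described above, here is a minimal userspace sketch. The macro value and both helper names are assumptions made for this example, not necessarily what the patches define. The second helper shows the "more information" case: 0xffff is not a valid Vendor ID, so all-ones there cannot be real data.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed definition for illustration; the series' actual macro may differ. */
#define PCI_ERROR_RESPONSE (~0ULL)

/*
 * Hypothetical helper: true if a 32-bit read returned all ones.
 * This only says the value *might* be fabricated; all ones can
 * also be valid data for some registers.
 */
static inline bool looks_like_error_response32(uint32_t val)
{
	return val == (uint32_t)PCI_ERROR_RESPONSE;
}

/*
 * Hypothetical helper: the Vendor ID lives in the low 16 bits of the
 * first config dword, and 0xffff is never a valid Vendor ID, so all
 * ones there means no device answered the read.
 */
static inline bool vendor_id_read_failed(uint32_t id_dword)
{
	return (uint16_t)id_dword == (uint16_t)PCI_ERROR_RESPONSE;
}
```

The cast to the width of the value being tested is what lets a single definition serve 8-, 16-, and 32-bit reads.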

This series also adds a new check for PCI_ERROR_RESPONSE in the power
management code because that code frequently encounters devices in D3cold,
where we previously misinterpreted ~0 data.  It also uses pci_power_name()
to print D-state names more consistently.
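
To make the misinterpretation concrete: the power state is the low two bits of the PM control/status register, so masking an all-ones read yields 3, which looks like D3hot. The sketch below shows one way to decode it, assuming the register can never legitimately read as all ones. The enum, the mask name, and both functions are hypothetical stand-ins for this example, not the code in the patches.

```c
#include <stdint.h>

/* Low two bits of the PM control/status register encode D0..D3hot. */
#define PM_CTRL_STATE_MASK 0x0003

/* Hypothetical enum for this sketch; the kernel uses its own type. */
enum power_state { D0 = 0, D1, D2, D3HOT, D3COLD };

/*
 * Decode a PM control/status value. An all-ones read is treated as
 * "config space inaccessible" (D3cold) instead of being masked down
 * to 3, which would be mistaken for D3hot.
 */
static inline enum power_state decode_pmcsr(uint16_t pmcsr)
{
	if (pmcsr == (uint16_t)~0)
		return D3COLD;
	return (enum power_state)(pmcsr & PM_CTRL_STATE_MASK);
}

/* Hypothetical stand-in for a name lookup such as pci_power_name(). */
static inline const char *power_state_name(enum power_state s)
{
	static const char *const names[] = {
		"D0", "D1", "D2", "D3hot", "D3cold"
	};
	return names[s];
}
```

Printing states through one lookup function keeps log messages consistent across call sites.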

Rafael, I didn't add your Reviewed-by to "PCI / PM: Return error when
changing power state from D3cold" because I made small changes to try to
make the messages more consistent, and I didn't want to presume they'd be
OK with you.

Changes since v1:
  - Add Rafael's Reviewed-by to the first two patches
  - Drop "PCI / PM: Check for error when reading PME status" because Rafael
    pointed out that some devices can signal PME even when in D3cold, so
    this would require additional rework
  - Drop "PCI / PM: Check for error when reading Power State" because
    Rafael thinks it's mostly redundant

Bjorn Helgaas (3):
  PCI: Add PCI_ERROR_RESPONSE definition
  PCI / PM: Decode D3cold power state correctly
  PCI / PM: Return error when changing power state from D3cold

 drivers/pci/access.c                          | 13 ++++----
 .../pci/controller/dwc/pcie-designware-host.c |  2 +-
 drivers/pci/controller/pci-aardvark.c         |  2 +-
 drivers/pci/controller/pci-mvebu.c            |  4 +--
 drivers/pci/controller/pci-thunder-ecam.c     | 20 ++++++------
 drivers/pci/controller/pci-thunder-pem.c      |  2 +-
 drivers/pci/controller/pcie-altera.c          |  2 +-
 drivers/pci/controller/pcie-iproc.c           |  2 +-
 drivers/pci/controller/pcie-mediatek.c        |  4 +--
 drivers/pci/controller/pcie-rcar.c            |  2 +-
 drivers/pci/controller/pcie-rockchip-host.c   |  2 +-
 drivers/pci/controller/vmd.c                  |  2 +-
 drivers/pci/hotplug/cpqphp_ctrl.c             | 12 +++----
 drivers/pci/hotplug/cpqphp_pci.c              | 20 ++++++------
 drivers/pci/hotplug/pciehp_hpc.c              |  6 ++--
 drivers/pci/pci.c                             | 31 ++++++++++++-------
 drivers/pci/pcie/dpc.c                        |  3 +-
 drivers/pci/pcie/pme.c                        |  4 +--
 drivers/pci/probe.c                           |  4 +--
 drivers/pci/quirks.c                          |  2 +-
 include/linux/pci.h                           |  7 +++++
 21 files changed, 81 insertions(+), 65 deletions(-)

Comments

Mika Westerberg Aug. 23, 2019, 7:22 a.m. UTC | #1
For the whole series,

Reviewed-by: Mika Westerberg <mika.westerberg@linux.intel.com>

Keith Busch Aug. 23, 2019, 3:04 p.m. UTC | #2
Series looks good to me.

Reviewed-by: Keith Busch <kbusch@kernel.org>

Bjorn Helgaas Nov. 14, 2019, 8:19 p.m. UTC | #3
[+cc Andrew]

I applied patches 2 & 3 (tweaked to not depend on the
PCI_ERROR_RESPONSE definition) with reviewed-by from Rafael, Keith,
and Mika to pci/pm for v5.5, thanks everybody for taking a look.

Andrew had good ideas for improving the PCI_ERROR_RESPONSE part, so
it's gone for now but not forgotten.
