Message ID | 20231206224231.732765-2-helgaas@kernel.org |
---|---|
State | New |
Series | PCI/AER: Clean up logging |
On Wed, 6 Dec 2023 16:42:29 -0600
Bjorn Helgaas <helgaas@kernel.org> wrote:

> From: Bjorn Helgaas <bhelgaas@google.com>
>
> The PCIe spec classifies errors as either "Correctable" or "Uncorrectable".
> Previously we printed these as "Corrected" or "Uncorrected". To avoid
> confusion, use the same terms as the spec.
>
> One confusing situation is when one agent detects an error, but another
> agent is responsible for recovery, e.g., by re-attempting the operation.
> The first agent may log a "correctable" error but it has not yet been
> corrected. The recovery agent must report an uncorrectable error if it is
> unable to recover. If we print the first agent's error as "Corrected", it
> gives the false impression that it has already been resolved.
>
> Sample message change:
>
> - pcieport 0000:00:1c.5: AER: Corrected error received: 0000:00:1c.5
> + pcieport 0000:00:1c.5: AER: Correctable error received: 0000:00:1c.5
>
> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

Good to tidy this up. FWIW
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

Hi Bjorn,

Will help prevent confusion. LGTM.

On 12/6/23 16:42, Bjorn Helgaas wrote:
> From: Bjorn Helgaas <bhelgaas@google.com>
>
> The PCIe spec classifies errors as either "Correctable" or "Uncorrectable".
> Previously we printed these as "Corrected" or "Uncorrected". To avoid
> confusion, use the same terms as the spec.
>
> One confusing situation is when one agent detects an error, but another
> agent is responsible for recovery, e.g., by re-attempting the operation.
> The first agent may log a "correctable" error but it has not yet been
> corrected. The recovery agent must report an uncorrectable error if it is
> unable to recover. If we print the first agent's error as "Corrected", it
> gives the false impression that it has already been resolved.
>
> Sample message change:
>
> - pcieport 0000:00:1c.5: AER: Corrected error received: 0000:00:1c.5
> + pcieport 0000:00:1c.5: AER: Correctable error received: 0000:00:1c.5
>
> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
> ---
>  drivers/pci/pcie/aer.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> index 42a3bd35a3e1..20db80018b5d 100644
> --- a/drivers/pci/pcie/aer.c
> +++ b/drivers/pci/pcie/aer.c
> @@ -436,9 +436,9 @@ void pci_aer_exit(struct pci_dev *dev)
>   * AER error strings
>   */
>  static const char *aer_error_severity_string[] = {
> -	"Uncorrected (Non-Fatal)",
> -	"Uncorrected (Fatal)",
> -	"Corrected"
> +	"Uncorrectable (Non-Fatal)",
> +	"Uncorrectable (Fatal)",
> +	"Correctable"
>  };
>
>  static const char *aer_error_layer[] = {

On Tue, Dec 12, 2023 at 09:00:24AM -0600, Terry Bowman wrote:
> Hi Bjorn,
>
> Will help prevent confusion. LGTM.

Thanks a lot for taking a look at these! I'd like to give you credit
in the log, e.g., "Reviewed-by: Terry Bowman <Terry.Bowman@amd.com>",
but I'm OCD enough that I don't want to translate "LGTM" into that all
by myself.

If you want that credit (and, I guess, the privilege of being cc'd
when we find that these patches break something :)), just reply again
with that actual "Reviewed-by:" text in it.

Bjorn

On 12/6/2023 2:42 PM, Bjorn Helgaas wrote:
> From: Bjorn Helgaas <bhelgaas@google.com>
>
> The PCIe spec classifies errors as either "Correctable" or "Uncorrectable".
> Previously we printed these as "Corrected" or "Uncorrected". To avoid
> confusion, use the same terms as the spec.
>
> One confusing situation is when one agent detects an error, but another
> agent is responsible for recovery, e.g., by re-attempting the operation.
> The first agent may log a "correctable" error but it has not yet been
> corrected. The recovery agent must report an uncorrectable error if it is
> unable to recover. If we print the first agent's error as "Corrected", it
> gives the false impression that it has already been resolved.
>
> Sample message change:
>
> - pcieport 0000:00:1c.5: AER: Corrected error received: 0000:00:1c.5
> + pcieport 0000:00:1c.5: AER: Correctable error received: 0000:00:1c.5
>
> Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
> ---

Looks good to me.

Reviewed-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>

>  drivers/pci/pcie/aer.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
> index 42a3bd35a3e1..20db80018b5d 100644
> --- a/drivers/pci/pcie/aer.c
> +++ b/drivers/pci/pcie/aer.c
> @@ -436,9 +436,9 @@ void pci_aer_exit(struct pci_dev *dev)
>   * AER error strings
>   */
>  static const char *aer_error_severity_string[] = {
> -	"Uncorrected (Non-Fatal)",
> -	"Uncorrected (Fatal)",
> -	"Corrected"
> +	"Uncorrectable (Non-Fatal)",
> +	"Uncorrectable (Fatal)",
> +	"Correctable"
>  };
>
>  static const char *aer_error_layer[] = {

diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index 42a3bd35a3e1..20db80018b5d 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -436,9 +436,9 @@ void pci_aer_exit(struct pci_dev *dev)
  * AER error strings
  */
 static const char *aer_error_severity_string[] = {
-	"Uncorrected (Non-Fatal)",
-	"Uncorrected (Fatal)",
-	"Corrected"
+	"Uncorrectable (Non-Fatal)",
+	"Uncorrectable (Fatal)",
+	"Correctable"
 };
 
 static const char *aer_error_layer[] = {
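
For readers who want to see how a severity table like this one feeds into the sample message quoted in the commit log, here is a minimal userspace sketch. It is not the code in aer.c: the enum values are assumed to mirror the kernel's AER_NONFATAL, AER_FATAL and AER_CORRECTABLE indices (0, 1, 2), and `severity_name()` and `format_aer_line()` are hypothetical helpers invented for illustration only.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Assumption: these indices mirror the kernel's AER_NONFATAL (0),
 * AER_FATAL (1) and AER_CORRECTABLE (2). The names are illustrative.
 */
enum sketch_severity {
	SKETCH_NONFATAL = 0,
	SKETCH_FATAL = 1,
	SKETCH_CORRECTABLE = 2,
};

/* Same strings and order as the table after this patch. */
static const char *const sketch_severity_string[] = {
	"Uncorrectable (Non-Fatal)",
	"Uncorrectable (Fatal)",
	"Correctable",
};

/* Hypothetical helper: bounds-checked lookup into the table. */
static const char *severity_name(unsigned int severity)
{
	size_t n = sizeof(sketch_severity_string) /
		   sizeof(sketch_severity_string[0]);

	if (severity >= n)
		return "Unknown";
	return sketch_severity_string[severity];
}

/*
 * Hypothetical helper: builds a line shaped like the sample message in
 * the commit log. Returns what snprintf() returns.
 */
static int format_aer_line(char *buf, size_t len, const char *port,
			   unsigned int severity, const char *source)
{
	return snprintf(buf, len, "pcieport %s: AER: %s error received: %s",
			port, severity_name(severity), source);
}
```

Calling `format_aer_line()` with `SKETCH_CORRECTABLE` and the device `0000:00:1c.5` for both port and source yields the "+" line from the sample message change. The bounds check is a choice made for this sketch, not something the patch adds.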