powerpc/eeh: Dump PHB3 diag-data on frozen PE

Message ID	1384940196-32514-1-git-send-email-shangw@linux.vnet.ibm.com (mailing list archive)
State	Superseded, archived

Headers

From: Gavin Shan <shangw@linux.vnet.ibm.com>
To: linuxppc-dev@lists.ozlabs.org
Subject: [PATCH] powerpc/eeh: Dump PHB3 diag-data on frozen PE
Date: Wed, 20 Nov 2013 17:36:36 +0800
Message-Id: <1384940196-32514-1-git-send-email-shangw@linux.vnet.ibm.com>
Cc: Gavin Shan <shangw@linux.vnet.ibm.com>
Precedence: list
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Errors-To: linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org
Sender: "Linuxppc-dev"
	<linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org>

Commit Message

Gavin Shan Nov. 20, 2013, 9:36 a.m. UTC

While we detect frozen PE on PHB3, it's always meaningful to have
the dumped diag-data for further diagnosis and analysis.

Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
---
 arch/powerpc/platforms/powernv/eeh-ioda.c |    3 +++
 1 file changed, 3 insertions(+)

Comments

Benjamin Herrenschmidt Nov. 20, 2013, 9:38 a.m. UTC | #1

On Wed, 2013-11-20 at 17:36 +0800, Gavin Shan wrote:
> While we detect frozen PE on PHB3, it's always meaningful to have
> the dumped diag-data for further diagnosis and analysis.

Don't we trip that during PCI probing ? For example if we probe behind
a PCI-X bridge (which can exist on an adapter) we'll trip EEH on every
non-existing device won't we ?

Cheers,
Ben.

> Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
> ---
>  arch/powerpc/platforms/powernv/eeh-ioda.c |    3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/arch/powerpc/platforms/powernv/eeh-ioda.c b/arch/powerpc/platforms/powernv/eeh-ioda.c
> index 02245ce..481528d 100644
> --- a/arch/powerpc/platforms/powernv/eeh-ioda.c
> +++ b/arch/powerpc/platforms/powernv/eeh-ioda.c
> @@ -994,8 +994,11 @@ static int ioda_eeh_next_error(struct eeh_pe **pe)
>  			if (ioda_eeh_get_pe(hose, frozen_pe_no, pe))
>  				break;
>  
> +			/* It would be always indicative to have PHB diag-data */
>  			pr_err("EEH: Frozen PE#%x on PHB#%x detected\n",
>  				(*pe)->addr, (*pe)->phb->global_number);
> +			ioda_eeh_phb_diag(hose);
> +
>  			ret = 1;
>  			goto out;
>  		}

Gavin Shan Nov. 20, 2013, 10:09 a.m. UTC | #2

On Wed, Nov 20, 2013 at 08:38:48PM +1100, Benjamin Herrenschmidt wrote:
>On Wed, 2013-11-20 at 17:36 +0800, Gavin Shan wrote:
>> While we detect frozen PE on PHB3, it's always meaningful to have
>> the dumped diag-data for further diagnosis and analysis.
>
>Don't we trip that during PCI probing ? For example if we probe behind
>a PCI-X bridge (which can exist on an adapter) we'll trip EEH on every
>non-existing device won't we ?
>

Yes, we already dump the PHB diag-data when a frozen PE is detected
during PCI probing. Once PCI probing has completed, EEH takes over
and we no longer dump PHB diag-data during PCI config cycles.

I took a close look at what we have in the code. The functions that dump
PHB diag-data (P7IOC & PHB3) need a bit of rework or refactoring, since
we currently dump the same PHB diag-data in both pci.c and eeh-ioda.c.

Besides, I think the appropriate place to dump PHB diag-data (for the EEH
core itself) would be ioda_eeh_get_log(), which is the indirect backend
of eeh_ops::get_log, rather than ioda_eeh_next_error().
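
As a rough illustration of the split described above (frozen-PE detection in
the next-error path, diag-data dump in the get_log backend), here is a minimal
user-space sketch. It is not kernel code and not the revised patch: every
mock_* name, type and field is hypothetical and only stands in for the idea
of an ops table with a get_log callback and a single shared dump helper.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical user-space mock; these are NOT the kernel's EEH types. */
struct mock_phb {
	unsigned int global_number;
	int diag_dumps;		/* number of diag-data dumps so far */
};

struct mock_pe {
	unsigned int addr;
	struct mock_phb *phb;
};

/* Hypothetical ops table, loosely modelled on the idea of eeh_ops. */
struct mock_eeh_ops {
	int (*next_error)(struct mock_phb *phb, unsigned int frozen_pe_no,
			  struct mock_pe *pe);
	int (*get_log)(struct mock_pe *pe);
};

/* Single shared dump helper, so the dump logic lives in one place. */
static void mock_phb_diag(struct mock_phb *phb)
{
	phb->diag_dumps++;
	printf("PHB#%x: diag-data dump %d\n",
	       phb->global_number, phb->diag_dumps);
}

/* Detection path: report the frozen PE, but do not dump diag-data. */
static int mock_next_error(struct mock_phb *phb, unsigned int frozen_pe_no,
			   struct mock_pe *pe)
{
	pe->addr = frozen_pe_no;
	pe->phb = phb;
	printf("EEH: Frozen PE#%x on PHB#%x detected\n",
	       pe->addr, pe->phb->global_number);
	return 1;
}

/* Log path: the diag-data dump happens here instead. */
static int mock_get_log(struct mock_pe *pe)
{
	assert(pe && pe->phb);
	mock_phb_diag(pe->phb);
	return 0;
}

static const struct mock_eeh_ops mock_ops = {
	.next_error = mock_next_error,
	.get_log = mock_get_log,
};
```

In this mock, a caller would invoke mock_ops.next_error() to learn about the
frozen PE and mock_ops.get_log() only when it actually wants the log, so the
dump is triggered from one place rather than from the detection path.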

Ben, please drop this one for now and I'll send the revised one :-)

Thanks,
Gavin

>> Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
>> ---
>>  arch/powerpc/platforms/powernv/eeh-ioda.c |    3 +++
>>  1 file changed, 3 insertions(+)
>> 
>> diff --git a/arch/powerpc/platforms/powernv/eeh-ioda.c b/arch/powerpc/platforms/powernv/eeh-ioda.c
>> index 02245ce..481528d 100644
>> --- a/arch/powerpc/platforms/powernv/eeh-ioda.c
>> +++ b/arch/powerpc/platforms/powernv/eeh-ioda.c
>> @@ -994,8 +994,11 @@ static int ioda_eeh_next_error(struct eeh_pe **pe)
>>  			if (ioda_eeh_get_pe(hose, frozen_pe_no, pe))
>>  				break;
>>  
>> +			/* It would be always indicative to have PHB diag-data */
>>  			pr_err("EEH: Frozen PE#%x on PHB#%x detected\n",
>>  				(*pe)->addr, (*pe)->phb->global_number);
>> +			ioda_eeh_phb_diag(hose);
>> +
>>  			ret = 1;
>>  			goto out;
>>  		}
>
>

diff --git a/arch/powerpc/platforms/powernv/eeh-ioda.c b/arch/powerpc/platforms/powernv/eeh-ioda.c
index 02245ce..481528d 100644
--- a/arch/powerpc/platforms/powernv/eeh-ioda.c
+++ b/arch/powerpc/platforms/powernv/eeh-ioda.c
@@ -994,8 +994,11 @@  static int ioda_eeh_next_error(struct eeh_pe **pe)
 			if (ioda_eeh_get_pe(hose, frozen_pe_no, pe))
 				break;
 
+			/* It would be always indicative to have PHB diag-data */
 			pr_err("EEH: Frozen PE#%x on PHB#%x detected\n",
 				(*pe)->addr, (*pe)->phb->global_number);
+			ioda_eeh_phb_diag(hose);
+
 			ret = 1;
 			goto out;
 		}
