
netdev/phy: Use mdiobus_read() so that proper locks are taken.

Message ID 1317419482-25655-1-git-send-email-david.daney@cavium.com
State Accepted, archived
Delegated to: David Miller

Commit Message

David Daney Sept. 30, 2011, 9:51 p.m. UTC
Accesses to the mdio busses must be done with the mdio_lock to ensure
proper operation.  Conveniently we have the helper function
mdiobus_read() to do that for us.  Let's use it in get_phy_id() instead
of accessing the bus without the lock held.

Signed-off-by: David Daney <david.daney@cavium.com>
---
 drivers/net/phy/phy_device.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

Comments

David Miller Sept. 30, 2011, 10:54 p.m. UTC | #1
From: David Daney <david.daney@cavium.com>
Date: Fri, 30 Sep 2011 14:51:22 -0700

> Accesses to the mdio busses must be done with the mdio_lock to ensure
> proper operation.  Conveniently we have the helper function
> mdiobus_read() to do that for us.  Let's use it in get_phy_id() instead
> of accessing the bus without the lock held.
> 
> Signed-off-by: David Daney <david.daney@cavium.com>

Applied to net-next.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Tabi Timur-B04825 Nov. 11, 2011, 12:29 a.m. UTC | #2
On Fri, Sep 30, 2011 at 4:51 PM, David Daney <david.daney@cavium.com> wrote:
> Accesses to the mdio busses must be done with the mdio_lock to ensure
> proper operation.  Conveniently we have the helper function
> mdiobus_read() to do that for us.  Let's use it in get_phy_id() instead
> of accessing the bus without the lock held.
>
> Signed-off-by: David Daney <david.daney@cavium.com>
> ---

This patch is causing me problems, but I'm not exactly certain how.
The problems only appear when I add some unrelated code to my platform
file (p1022ds.c), but the trap is definitely in the phy code:

Fixed MDIO Bus: probed
INFO: trying to register non-static key.
the code is fine but needs lockdep annotation.
turning off the locking correctness validator.
Call Trace:
[e685dcc0] [c0008c7c] show_stack+0x44/0x154 (unreliable)
[e685dd00] [c007bf74] __lock_acquire+0x1374/0x1824
[e685ddb0] [c007c870] lock_acquire+0x4c/0x68
[e685ddd0] [c0455fe4] mutex_lock_nested+0x6c/0x394
[e685de30] [c02acc08] mdiobus_read+0x3c/0x78
[e685de50] [c02abc98] get_phy_id+0x24/0x80
[e685de70] [c02b1860] fsl_pq_mdio_probe+0x3ac/0x458
[e685deb0] [c027ba2c] platform_drv_probe+0x20/0x30
[e685dec0] [c027a4b0] driver_probe_device+0xa4/0x1d4
[e685dee0] [c027a6a4] __driver_attach+0xc4/0xc8
[e685df00] [c027939c] bus_for_each_dev+0x60/0x9c
[e685df30] [c027a0e4] driver_attach+0x24/0x34
[e685df40] [c0279d30] bus_add_driver+0x1b0/0x278
[e685df70] [c027aab8] driver_register+0x88/0x154
[e685df90] [c027bd5c] platform_driver_register+0x68/0x78
[e685dfa0] [c05d47c8] fsl_pq_mdio_init+0x18/0x28
[e685dfb0] [c0001eb8] do_one_initcall+0x34/0x1ac
[e685dfe0] [c05b984c] kernel_init+0xa0/0x13c
[e685dff0] [c000e588] kernel_thread+0x4c/0x68
Unable to handle kernel paging request for data at address 0x00000000
Faulting instruction address: 0xc0456080
Oops: Kernel access of bad area, sig: 11 [#1]
SMP NR_CPUS=8 P1022 DS
Modules linked in:
NIP: c0456080 LR: c0456068 CTR: 00000000
REGS: e685dd20 TRAP: 0300   Not tainted  (3.2.0-10b-00092-g66f2305-dirty)
MSR: 00021000 <ME,CE>  CR: 22042044  XER: 20000000
DEAR: 00000000, ESR: 00800000
TASK = e6860000[1] 'swapper' THREAD: e685c000 CPU: 0
GPR00: ffffffff e685ddd0 e6860000 e6e58028 e685ddd8 e685c000 e685dde4 00000002
GPR08: 00000000 00000000 00000000 00000000 44042022 40401800 00000000 00000000
GPR16: c0000a00 00000014 3fffffff 03ff9000 00000015 7ff3a68c c061e000 00000000
GPR24: e6e5804c e685ddd8 e6e5802c 00029000 e6860000 c0620000 e685c000 e6e58028
NIP [c0456080] mutex_lock_nested+0x108/0x394
LR [c0456068] mutex_lock_nested+0xf0/0x394
Call Trace:
[e685ddd0] [c0456068] mutex_lock_nested+0xf0/0x394 (unreliable)
[e685de30] [c02acc08] mdiobus_read+0x3c/0x78
[e685de50] [c02abc98] get_phy_id+0x24/0x80
[e685de70] [c02b1860] fsl_pq_mdio_probe+0x3ac/0x458
[e685deb0] [c027ba2c] platform_drv_probe+0x20/0x30
[e685dec0] [c027a4b0] driver_probe_device+0xa4/0x1d4
[e685dee0] [c027a6a4] __driver_attach+0xc4/0xc8
[e685df00] [c027939c] bus_for_each_dev+0x60/0x9c
[e685df30] [c027a0e4] driver_attach+0x24/0x34
[e685df40] [c0279d30] bus_add_driver+0x1b0/0x278
[e685df70] [c027aab8] driver_register+0x88/0x154
[e685df90] [c027bd5c] platform_driver_register+0x68/0x78
[e685dfa0] [c05d47c8] fsl_pq_mdio_init+0x18/0x28
[e685dfb0] [c0001eb8] do_one_initcall+0x34/0x1ac
[e685dfe0] [c05b984c] kernel_init+0xa0/0x13c
[e685dff0] [c000e588] kernel_thread+0x4c/0x68
Instruction dump:
7f24cb78 4bc21141 80bc0004 7fe3fb78 7f24cb78 4bc21325 813f0028 3b1f0024
933f0028 3800ffff 93010008 9121000c 93810010 7c0004ac 7d20f828
---[ end trace 7cc8bbd19b132dac ]---
note: swapper[1] exited with preempt_count 1
Kernel panic - not syncing: Attempted to kill init!
Call Trace:
[e685dc00] [c0008c7c] show_stack+0x44/0x154 (unreliable)
[e685dc40] [c04583a0] panic+0xb4/0x1f0
[e685dc90] [c004724c] do_exit+0x5dc/0x684
[e685dce0] [c000b368] die+0xdc/0x1b4
[e685dd00] [c00137fc] bad_page_fault+0xb4/0xfc
[e685dd10] [c000f998] handle_page_fault+0x7c/0x80
--- Exception: 300 at mutex_lock_nested+0x108/0x394
    LR = mutex_lock_nested+0xf0/0x394
[e685de30] [c02acc08] mdiobus_read+0x3c/0x78
[e685de50] [c02abc98] get_phy_id+0x24/0x80
[e685de70] [c02b1860] fsl_pq_mdio_probe+0x3ac/0x458
[e685deb0] [c027ba2c] platform_drv_probe+0x20/0x30
[e685dec0] [c027a4b0] driver_probe_device+0xa4/0x1d4
[e685dee0] [c027a6a4] __driver_attach+0xc4/0xc8
[e685df00] [c027939c] bus_for_each_dev+0x60/0x9c
[e685df30] [c027a0e4] driver_attach+0x24/0x34
[e685df40] [c0279d30] bus_add_driver+0x1b0/0x278
[e685df70] [c027aab8] driver_register+0x88/0x154
[e685df90] [c027bd5c] platform_driver_register+0x68/0x78
[e685dfa0] [c05d47c8] fsl_pq_mdio_init+0x18/0x28
[e685dfb0] [c0001eb8] do_one_initcall+0x34/0x1ac
[e685dfe0] [c05b984c] kernel_init+0xa0/0x13c
[e685dff0] [c000e588] kernel_thread+0x4c/0x68
Rebooting in 1 seconds..

I'm still trying to narrow down what's causing the problem, but when I
revert this patch, I don't see these traps.

Sometimes, I don't get the trap, but I get a hang on this line:

int mdiobus_read(struct mii_bus *bus, int addr, u32 regnum)
{
	int retval;

	BUG_ON(in_interrupt());

--->	mutex_lock(&bus->mdio_lock);
David Daney Nov. 11, 2011, 12:48 a.m. UTC | #3
On 11/10/2011 04:29 PM, Tabi Timur-B04825 wrote:
> On Fri, Sep 30, 2011 at 4:51 PM, David Daney<david.daney@cavium.com>  wrote:
>> Accesses to the mdio busses must be done with the mdio_lock to ensure
>> proper operation.  Conveniently we have the helper function
>> mdiobus_read() to do that for us.  Let's use it in get_phy_id() instead
>> of accessing the bus without the lock held.
>>
>> Signed-off-by: David Daney<david.daney@cavium.com>
>> ---
>
> This patch is causing me problems, but I'm not exactly certain how.

I think it is because fsl_pq_mdio_probe() is defective.

You are using the bus, by calling fsl_pq_mdio_find_free(), before the 
driver is registered and initialized by calling  of_mdiobus_register().

At least that is my take on it.

David Daney
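
The ordering problem described above can be illustrated with a small user-space sketch in C. It is only a model, not kernel code: every toy_* name is hypothetical, and it assumes that the bus lock is set up only at registration time (in the kernel, mdio_lock is initialized inside mdiobus_register(), as far as the tree of that era shows). Unlike this model, the real mdiobus_read() has no "is the lock initialized" check; it simply takes the mutex, which is consistent with the fault in mutex_lock_nested() reported earlier in the thread.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a mutex that must be initialized before use. */
struct toy_lock {
	bool initialized;
	bool held;
};

/* Hypothetical stand-in for struct mii_bus. */
struct toy_bus {
	struct toy_lock lock;	/* plays the role of mdio_lock */
	int (*read)(struct toy_bus *bus, int addr, int regnum);
};

static int toy_lock_acquire(struct toy_lock *lock)
{
	if (!lock->initialized)
		return -EINVAL;	/* the model reports what the kernel would crash on */
	lock->held = true;
	return 0;
}

static void toy_lock_release(struct toy_lock *lock)
{
	lock->held = false;
}

/* Plays the role of bus registration: the lock becomes usable only here. */
static void toy_bus_register(struct toy_bus *bus)
{
	bus->lock.initialized = true;
	bus->lock.held = false;
}

/* Locked read helper, analogous in spirit to mdiobus_read(). */
static int toy_bus_read(struct toy_bus *bus, int addr, int regnum)
{
	int ret;

	assert(bus->read != NULL);

	ret = toy_lock_acquire(&bus->lock);
	if (ret < 0)
		return ret;

	ret = bus->read(bus, addr, regnum);
	toy_lock_release(&bus->lock);
	return ret;
}

/* Example raw accessor: simply echoes the register number back. */
static int toy_raw_read(struct toy_bus *bus, int addr, int regnum)
{
	(void)bus;
	(void)addr;
	return regnum;
}
```

In this model, calling toy_bus_read() on a bus that has not yet been passed to toy_bus_register() returns -EINVAL, while the same call after registration returns the register value. A probe routine that scans the bus before registering it runs into the first case.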


Timur Tabi Nov. 14, 2011, 5:10 p.m. UTC | #4
Andy Fleming wrote:

> Yes, that is correct. I have a patch, I just have to resend it. Will do
> next time I come to a break in what I'm working on.

Can you point me to that patch, so that I can try it out now?

I fixed some lockdep-related code in my audio driver.  The driver was not initializing a sysfs attr structure, so I added a call to sysfs_attr_init().  But when I do that, I get the following bug report.  I don't understand the connection.

=============================================
[ INFO: possible recursive locking detected ]
3.2.0-10b-00093-gebea711-dirty #10
---------------------------------------------
kworker/1:1/271 is trying to acquire lock:
 (&(&(priv->tx_queue[i]->txlock))->rlock){......}, at: [<c02b2e28>] lock_tx_qs+0x34/0x54

but task is already holding lock:
 (&(&(priv->tx_queue[i]->txlock))->rlock){......}, at: [<c02b2e28>] lock_tx_qs+0x34/0x54

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&(&(priv->tx_queue[i]->txlock))->rlock);
  lock(&(&(priv->tx_queue[i]->txlock))->rlock);

 *** DEADLOCK ***

 May be due to missing lock nesting notation

4 locks held by kworker/1:1/271:
 #0:  (events){.+.+.+}, at: [<c005b7c8>] process_one_work+0x138/0x490
 #1:  ((&(&dev->state_queue)->work)){+.+...}, at: [<c005b7c8>] process_one_work+0x138/0x490
 #2:  (&dev->lock){+.+...}, at: [<c02ab3cc>] phy_state_machine+0x30/0x580
 #3:  (&(&(priv->tx_queue[i]->txlock))->rlock){......}, at: [<c02b2e28>] lock_tx_qs+0x34/0x54

stack backtrace:
Call Trace:
[e69d5d80] [c0008c7c] show_stack+0x44/0x154 (unreliable)
[e69d5dc0] [c007bafc] __lock_acquire+0xefc/0x1824
[e69d5e70] [c007c870] lock_acquire+0x4c/0x68
[e69d5e90] [c04575d8] _raw_spin_lock+0x44/0x60
[e69d5ea0] [c02b2e28] lock_tx_qs+0x34/0x54
[e69d5ec0] [c02b2f24] adjust_link+0x34/0x1d8
[e69d5ef0] [c02ab73c] phy_state_machine+0x3a0/0x580
[e69d5f10] [c005b83c] process_one_work+0x1ac/0x490
[e69d5f50] [c005e4c0] worker_thread+0x18c/0x35c
[e69d5f90] [c00631d4] kthread+0x7c/0x80
[e69d5ff0] [c000e588] kernel_thread+0x4c/0x68

Patch

diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index ff109fe..83a5a5a 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -213,7 +213,7 @@  int get_phy_id(struct mii_bus *bus, int addr, u32 *phy_id)
 
 	/* Grab the bits from PHYIR1, and put them
 	 * in the upper half */
-	phy_reg = bus->read(bus, addr, MII_PHYSID1);
+	phy_reg = mdiobus_read(bus, addr, MII_PHYSID1);
 
 	if (phy_reg < 0)
 		return -EIO;
@@ -221,7 +221,7 @@  int get_phy_id(struct mii_bus *bus, int addr, u32 *phy_id)
 	*phy_id = (phy_reg & 0xffff) << 16;
 
 	/* Grab the bits from PHYIR2, and put them in the lower half */
-	phy_reg = bus->read(bus, addr, MII_PHYSID2);
+	phy_reg = mdiobus_read(bus, addr, MII_PHYSID2);
 
 	if (phy_reg < 0)
 		return -EIO;
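
For illustration, the pattern the patch moves to can be modeled in user space roughly as follows. This is a sketch under stated assumptions, not the kernel implementation: the model_* names are hypothetical, the lock is reduced to a flag so the example needs no threading library, and the ID register values (0x0141 and 0x0cc2) are made up. The register numbers 2 and 3 are the values of MII_PHYSID1 and MII_PHYSID2. The point shown is that every register access goes through one helper that holds the lock around the raw read, and that the two 16-bit ID registers are combined into a single 32-bit PHY ID.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_PHYSID1 2	/* same register number as MII_PHYSID1 */
#define MODEL_PHYSID2 3	/* same register number as MII_PHYSID2 */

/* Hypothetical stand-in for struct mii_bus; the lock is a plain flag. */
struct model_bus {
	bool lock_held;
	int (*read)(struct model_bus *bus, int addr, int regnum);
};

/* Analogous in spirit to mdiobus_read(): raw access only under the lock. */
static int model_locked_read(struct model_bus *bus, int addr, int regnum)
{
	int ret;

	assert(!bus->lock_held);	/* the lock is not recursive */
	bus->lock_held = true;
	ret = bus->read(bus, addr, regnum);
	bus->lock_held = false;
	return ret;
}

/* Mirrors the shape of get_phy_id() after the patch. */
static int model_get_phy_id(struct model_bus *bus, int addr, uint32_t *phy_id)
{
	int reg;

	/* First ID register supplies the upper 16 bits. */
	reg = model_locked_read(bus, addr, MODEL_PHYSID1);
	if (reg < 0)
		return -EIO;
	*phy_id = ((uint32_t)reg & 0xffff) << 16;

	/* Second ID register supplies the lower 16 bits. */
	reg = model_locked_read(bus, addr, MODEL_PHYSID2);
	if (reg < 0)
		return -EIO;
	*phy_id |= (uint32_t)reg & 0xffff;

	return 0;
}

/* Example raw accessor with made-up ID values; checks the lock is held. */
static int model_fake_read(struct model_bus *bus, int addr, int regnum)
{
	(void)addr;
	assert(bus->lock_held);

	if (regnum == MODEL_PHYSID1)
		return 0x0141;
	if (regnum == MODEL_PHYSID2)
		return 0x0cc2;
	return -EIO;
}
```

With model_fake_read() installed as the read callback, model_get_phy_id() returns 0 and stores 0x01410cc2 in *phy_id. The assertion inside the callback fails if any caller bypasses model_locked_read(), which is the class of bug the patch removes from get_phy_id().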