diff mbox series

[BUG] micrel phy suspend/resume

Message ID 20171210164752.GO10595@n2100.armlinux.org.uk
State RFC, archived
Delegated to: David Miller
Headers show
Series [BUG] micrel phy suspend/resume | expand

Commit Message

Russell King (Oracle) Dec. 10, 2017, 4:47 p.m. UTC
Guys,

I've just tripped over a bug with the Micrel PHY driver, but it
really isn't specific to the Micrel PHY driver.

When we suspend, we suspend the PHY and then the MAC driver (eg,
on the ZII board):

[  198.822751] 400d0000.ethernet-1:00: bus : mdio_bus_suspend+0x0/0x34
[  198.822859] __mdiobus_read: 117 400d0000.ethernet-1 00 00 => 3100
[  198.822878] __mdiobus_write: 117 400d0000.ethernet-1 00 00 <= 3900
...
[  198.826235] 400d0000.ethernet: bus : platform_pm_freeze+0x0/0x5c
[  198.826354] __mdiobus_read: 117 400d0000.ethernet-1 00 1f => 9198
[  198.826374] __mdiobus_write: 117 400d0000.ethernet-1 00 1f <= 9198
[  198.826503] __mdiobus_write: 117 400d0000.ethernet-1 00 1b <= 0000
[  198.826699] __mdiobus_read: 117 400d0000.ethernet-1 00 1b => 0000

When we resume, the order is reversed:

[  198.848300] 400d0000.ethernet: bus : platform_pm_thaw+0x0/0x54
[  198.849024] __mdiobus_read: 117 400d0000.ethernet-1 00 1b => 0000
[  198.849120] __mdiobus_read: 117 400d0000.ethernet-1 00 1f => 9198
[  198.849141] __mdiobus_write: 117 400d0000.ethernet-1 00 1f <= 9198
[  198.849243] __mdiobus_write: 117 400d0000.ethernet-1 00 1b <= 0500
[  198.849401] __mdiobus_read: 117 400d0000.ethernet-1 00 00 => 3900
[  198.849419] __mdiobus_write: 117 400d0000.ethernet-1 00 00 <= 3100
[  198.849637] __mdiobus_read: 61 400d0000.ethernet-1 00 01 => 7849
...
[  198.852677] 400d0000.ethernet-1:00: bus : mdio_bus_resume+0x0/0x34
[  198.852778] __mdiobus_read: 117 400d0000.ethernet-1 00 00 => 3100
[  198.852797] __mdiobus_write: 117 400d0000.ethernet-1 00 00 <= 3100

Now, the MAC driver calls phy_stop() and phy_start() from within its
own suspend/resume methods, and at this is done while the PHY is,
as far as the kernel PM code is concerned, suspended.

However, phylib works around that by resuming the PHY itself when
phy_start() is called in this situation.  That looks good, but it
really isn't in this case.  Given the above sequence, we will be in
PHY_HALTED state.

So, when phy_start() is called, the first thing it does is check the
state, and finds PHY_HALTED.  It then tries to enable the PHY
interrupts.  This cause the Micrel driver to write to the PHY to
enable interrupts - it reads 0x1b to ack any pre-existing interrupt,
modifies 0x1f to set the interrupt pin level, and then supposedly
writes 0x1b to enable interrupts.

However, at this point, the PHY is still powered down, as we can see
in the following read of the BMCR - containing 0x3900.  The write to
enable interrupts is ignored, and so interrupts remain disabled.

The resume process continues, and the system resumes, but interrupts
on the PHY remain disabled, and the phylib state machine never
advances.  You can do anything you like with cables etc, as far as
phylib is concerned, the link steadfastly remains "down".

That's a bit of a problem if your platform is running root-NFS through
that network interface.

It looks like some variants of the Micrel phy code work around this by
including in their resume path code to enable PHY interrupts
independently of phylib.

phy_resume() can't be called inside the locked region in phy_start(),
where we enable interrupts, because genphy_resume() also wants to take
phydev->lock - and it wants to do that to safely read-modify-write the
BMCR register.  This, I feel, comes back to an abuse of the phy state
machine lock to also protect atomic bus operations.

If we move over to using the bus lock to protect bus atomic operations
(as I believe we should) then phy_resume() can be called while holding
phydev->lock from within bits of the code that are protecting themselves
from concurrency with the phylib state machine.  It also means that we
can get rid of some of these boolean variables, and most importantly
in this particular case, call phy_resume() before we enable interrupts.

With that arrangement, things look a lot better:

[   74.545584] 400d0000.ethernet: bus : platform_pm_thaw+0x0/0x54
[   74.546010] __mdiobus_read: 117 400d0000.ethernet-1 00 00 => 3900
[   74.546029] __mdiobus_write: 117 400d0000.ethernet-1 00 00 <= 3100
[   74.546264] __mdiobus_read: 117 400d0000.ethernet-1 00 1b => 0000
[   74.546354] __mdiobus_read: 117 400d0000.ethernet-1 00 1f => 8100
[   74.546373] __mdiobus_write: 117 400d0000.ethernet-1 00 1f <= 8100
[   74.546651] __mdiobus_write: 117 400d0000.ethernet-1 00 1b <= 0500

The patch for this is slightly larger than it needs to be because I've
converted more places that do read-modify-write to the new xxx_modify()
accessor, and it also ensures that phy_resume() is consistently called
with the phy state machine lock held.  It also means we can get rid of
the interrupt enable hack for some micrel PHYs, which I suspect comes
from this same root cause.

Something similar ought to be done for phy_suspend() as well, but that's
a little more complex because it also calls get_wol() functions.

More places could probably be converted to phy_modify() too.

 drivers/net/phy/at803x.c     | 24 +++--------------
 drivers/net/phy/mdio_bus.c   | 32 ++++++++++++++++++++++
 drivers/net/phy/micrel.c     | 15 +----------
 drivers/net/phy/phy.c        |  9 +++----
 drivers/net/phy/phy_device.c | 63 +++++++++++++-------------------------------
 include/linux/mdio.h         |  1 +
 include/linux/phy.h          | 18 +++++++++++++
 7 files changed, 77 insertions(+), 85 deletions(-)

Comments

Florian Fainelli Dec. 10, 2017, 11:50 p.m. UTC | #1
Hi Russell,

On 12/10/2017 08:47 AM, Russell King - ARM Linux wrote:
> Guys,
> 
> I've just tripped over a bug with the Micrel PHY driver, but it
> really isn't specific to the Micrel PHY driver.
> 
> When we suspend, we suspend the PHY and then the MAC driver (eg,
> on the ZII board):
> 
> [  198.822751] 400d0000.ethernet-1:00: bus : mdio_bus_suspend+0x0/0x34
> [  198.822859] __mdiobus_read: 117 400d0000.ethernet-1 00 00 => 3100
> [  198.822878] __mdiobus_write: 117 400d0000.ethernet-1 00 00 <= 3900
> ...
> [  198.826235] 400d0000.ethernet: bus : platform_pm_freeze+0x0/0x5c
> [  198.826354] __mdiobus_read: 117 400d0000.ethernet-1 00 1f => 9198
> [  198.826374] __mdiobus_write: 117 400d0000.ethernet-1 00 1f <= 9198
> [  198.826503] __mdiobus_write: 117 400d0000.ethernet-1 00 1b <= 0000
> [  198.826699] __mdiobus_read: 117 400d0000.ethernet-1 00 1b => 0000
> 
> When we resume, the order is reversed:
> 
> [  198.848300] 400d0000.ethernet: bus : platform_pm_thaw+0x0/0x54
> [  198.849024] __mdiobus_read: 117 400d0000.ethernet-1 00 1b => 0000
> [  198.849120] __mdiobus_read: 117 400d0000.ethernet-1 00 1f => 9198
> [  198.849141] __mdiobus_write: 117 400d0000.ethernet-1 00 1f <= 9198
> [  198.849243] __mdiobus_write: 117 400d0000.ethernet-1 00 1b <= 0500
> [  198.849401] __mdiobus_read: 117 400d0000.ethernet-1 00 00 => 3900
> [  198.849419] __mdiobus_write: 117 400d0000.ethernet-1 00 00 <= 3100
> [  198.849637] __mdiobus_read: 61 400d0000.ethernet-1 00 01 => 7849
> ...
> [  198.852677] 400d0000.ethernet-1:00: bus : mdio_bus_resume+0x0/0x34
> [  198.852778] __mdiobus_read: 117 400d0000.ethernet-1 00 00 => 3100
> [  198.852797] __mdiobus_write: 117 400d0000.ethernet-1 00 00 <= 3100
> 
> Now, the MAC driver calls phy_stop() and phy_start() from within its
> own suspend/resume methods, and at this is done while the PHY is,
> as far as the kernel PM code is concerned, suspended.
> 
> However, phylib works around that by resuming the PHY itself when
> phy_start() is called in this situation.  That looks good, but it
> really isn't in this case.  Given the above sequence, we will be in
> PHY_HALTED state.
> 
> So, when phy_start() is called, the first thing it does is check the
> state, and finds PHY_HALTED.  It then tries to enable the PHY
> interrupts.  This cause the Micrel driver to write to the PHY to
> enable interrupts - it reads 0x1b to ack any pre-existing interrupt,
> modifies 0x1f to set the interrupt pin level, and then supposedly
> writes 0x1b to enable interrupts.
> 
> However, at this point, the PHY is still powered down, as we can see
> in the following read of the BMCR - containing 0x3900.  The write to
> enable interrupts is ignored, and so interrupts remain disabled.
> 
> The resume process continues, and the system resumes, but interrupts
> on the PHY remain disabled, and the phylib state machine never
> advances.  You can do anything you like with cables etc, as far as
> phylib is concerned, the link steadfastly remains "down".
> 
> That's a bit of a problem if your platform is running root-NFS through
> that network interface.
> 
> It looks like some variants of the Micrel phy code work around this by
> including in their resume path code to enable PHY interrupts
> independently of phylib.
> 
> phy_resume() can't be called inside the locked region in phy_start(),
> where we enable interrupts, because genphy_resume() also wants to take
> phydev->lock - and it wants to do that to safely read-modify-write the
> BMCR register.  This, I feel, comes back to an abuse of the phy state
> machine lock to also protect atomic bus operations.
> 
> If we move over to using the bus lock to protect bus atomic operations
> (as I believe we should) then phy_resume() can be called while holding
> phydev->lock from within bits of the code that are protecting themselves
> from concurrency with the phylib state machine.  It also means that we
> can get rid of some of these boolean variables, and most importantly
> in this particular case, call phy_resume() before we enable interrupts.
> 
> With that arrangement, things look a lot better:
> 
> [   74.545584] 400d0000.ethernet: bus : platform_pm_thaw+0x0/0x54
> [   74.546010] __mdiobus_read: 117 400d0000.ethernet-1 00 00 => 3900
> [   74.546029] __mdiobus_write: 117 400d0000.ethernet-1 00 00 <= 3100
> [   74.546264] __mdiobus_read: 117 400d0000.ethernet-1 00 1b => 0000
> [   74.546354] __mdiobus_read: 117 400d0000.ethernet-1 00 1f => 8100
> [   74.546373] __mdiobus_write: 117 400d0000.ethernet-1 00 1f <= 8100
> [   74.546651] __mdiobus_write: 117 400d0000.ethernet-1 00 1b <= 0500
> 
> The patch for this is slightly larger than it needs to be because I've
> converted more places that do read-modify-write to the new xxx_modify()
> accessor, and it also ensures that phy_resume() is consistently called
> with the phy state machine lock held.  It also means we can get rid of
> the interrupt enable hack for some micrel PHYs, which I suspect comes
> from this same root cause.

Since this is a bug fix, I would really rather have the ability to
target the "net" tree with a simpler fix that is contained within
drivers/net/phy/phy.c and which does not require the use of phy_modify().

Thanks for doing all the detective work!
Russell King (Oracle) Dec. 11, 2017, 12:06 a.m. UTC | #2
On Sun, Dec 10, 2017 at 03:50:28PM -0800, Florian Fainelli wrote:
> Hi Russell,
> 
> On 12/10/2017 08:47 AM, Russell King - ARM Linux wrote:
> > Guys,
> > 
> > I've just tripped over a bug with the Micrel PHY driver, but it
> > really isn't specific to the Micrel PHY driver.
> > 
> > When we suspend, we suspend the PHY and then the MAC driver (eg,
> > on the ZII board):
> > 
> > [  198.822751] 400d0000.ethernet-1:00: bus : mdio_bus_suspend+0x0/0x34
> > [  198.822859] __mdiobus_read: 117 400d0000.ethernet-1 00 00 => 3100
> > [  198.822878] __mdiobus_write: 117 400d0000.ethernet-1 00 00 <= 3900
> > ...
> > [  198.826235] 400d0000.ethernet: bus : platform_pm_freeze+0x0/0x5c
> > [  198.826354] __mdiobus_read: 117 400d0000.ethernet-1 00 1f => 9198
> > [  198.826374] __mdiobus_write: 117 400d0000.ethernet-1 00 1f <= 9198
> > [  198.826503] __mdiobus_write: 117 400d0000.ethernet-1 00 1b <= 0000
> > [  198.826699] __mdiobus_read: 117 400d0000.ethernet-1 00 1b => 0000
> > 
> > When we resume, the order is reversed:
> > 
> > [  198.848300] 400d0000.ethernet: bus : platform_pm_thaw+0x0/0x54
> > [  198.849024] __mdiobus_read: 117 400d0000.ethernet-1 00 1b => 0000
> > [  198.849120] __mdiobus_read: 117 400d0000.ethernet-1 00 1f => 9198
> > [  198.849141] __mdiobus_write: 117 400d0000.ethernet-1 00 1f <= 9198
> > [  198.849243] __mdiobus_write: 117 400d0000.ethernet-1 00 1b <= 0500
> > [  198.849401] __mdiobus_read: 117 400d0000.ethernet-1 00 00 => 3900
> > [  198.849419] __mdiobus_write: 117 400d0000.ethernet-1 00 00 <= 3100
> > [  198.849637] __mdiobus_read: 61 400d0000.ethernet-1 00 01 => 7849
> > ...
> > [  198.852677] 400d0000.ethernet-1:00: bus : mdio_bus_resume+0x0/0x34
> > [  198.852778] __mdiobus_read: 117 400d0000.ethernet-1 00 00 => 3100
> > [  198.852797] __mdiobus_write: 117 400d0000.ethernet-1 00 00 <= 3100
> > 
> > Now, the MAC driver calls phy_stop() and phy_start() from within its
> > own suspend/resume methods, and at this is done while the PHY is,
> > as far as the kernel PM code is concerned, suspended.
> > 
> > However, phylib works around that by resuming the PHY itself when
> > phy_start() is called in this situation.  That looks good, but it
> > really isn't in this case.  Given the above sequence, we will be in
> > PHY_HALTED state.
> > 
> > So, when phy_start() is called, the first thing it does is check the
> > state, and finds PHY_HALTED.  It then tries to enable the PHY
> > interrupts.  This cause the Micrel driver to write to the PHY to
> > enable interrupts - it reads 0x1b to ack any pre-existing interrupt,
> > modifies 0x1f to set the interrupt pin level, and then supposedly
> > writes 0x1b to enable interrupts.
> > 
> > However, at this point, the PHY is still powered down, as we can see
> > in the following read of the BMCR - containing 0x3900.  The write to
> > enable interrupts is ignored, and so interrupts remain disabled.
> > 
> > The resume process continues, and the system resumes, but interrupts
> > on the PHY remain disabled, and the phylib state machine never
> > advances.  You can do anything you like with cables etc, as far as
> > phylib is concerned, the link steadfastly remains "down".
> > 
> > That's a bit of a problem if your platform is running root-NFS through
> > that network interface.
> > 
> > It looks like some variants of the Micrel phy code work around this by
> > including in their resume path code to enable PHY interrupts
> > independently of phylib.
> > 
> > phy_resume() can't be called inside the locked region in phy_start(),
> > where we enable interrupts, because genphy_resume() also wants to take
> > phydev->lock - and it wants to do that to safely read-modify-write the
> > BMCR register.  This, I feel, comes back to an abuse of the phy state
> > machine lock to also protect atomic bus operations.
> > 
> > If we move over to using the bus lock to protect bus atomic operations
> > (as I believe we should) then phy_resume() can be called while holding
> > phydev->lock from within bits of the code that are protecting themselves
> > from concurrency with the phylib state machine.  It also means that we
> > can get rid of some of these boolean variables, and most importantly
> > in this particular case, call phy_resume() before we enable interrupts.
> > 
> > With that arrangement, things look a lot better:
> > 
> > [   74.545584] 400d0000.ethernet: bus : platform_pm_thaw+0x0/0x54
> > [   74.546010] __mdiobus_read: 117 400d0000.ethernet-1 00 00 => 3900
> > [   74.546029] __mdiobus_write: 117 400d0000.ethernet-1 00 00 <= 3100
> > [   74.546264] __mdiobus_read: 117 400d0000.ethernet-1 00 1b => 0000
> > [   74.546354] __mdiobus_read: 117 400d0000.ethernet-1 00 1f => 8100
> > [   74.546373] __mdiobus_write: 117 400d0000.ethernet-1 00 1f <= 8100
> > [   74.546651] __mdiobus_write: 117 400d0000.ethernet-1 00 1b <= 0500
> > 
> > The patch for this is slightly larger than it needs to be because I've
> > converted more places that do read-modify-write to the new xxx_modify()
> > accessor, and it also ensures that phy_resume() is consistently called
> > with the phy state machine lock held.  It also means we can get rid of
> > the interrupt enable hack for some micrel PHYs, which I suspect comes
> > from this same root cause.
> 
> Since this is a bug fix, I would really rather have the ability to
> target the "net" tree with a simpler fix that is contained within
> drivers/net/phy/phy.c and which does not require the use of phy_modify().

Not really that easy.

The simple solution would be to open-code the phy_modify() inside
genphy_resume() in phy_device.c, and move the phy_resume() call
in phy.c.

However, that leaves at803x taking phydev->lock in its resume path,
which will immediately deadlock.  We couldn't just delete that lock
without ensuring that phy_resume() is always called with phydev->lock
held to preserve the existing locking.  I wouldn't want to simply
delete that lock without some testing of that driver in case there's
a good reason for it being there.

So, to preserve that would mean making sure the two calls to
phy_resume() in phy_device.c are also under the lock, replacing
the at803x.c read-modify-write sequences with open-coded versions,
or convincing ourselves that the locks in at803x_suspend() and
at803x_resume() are not required.

My feeling is that the safest and minimal change for -rc is to add
the phy_modify() helper, switch both genphy_resume() and
at803x_resume() to it and delete the locking there, and ensure that
phy_resume() is always called with phydev->lock taken for a
consistent locking strategy there.

We can prove the phy_modify() helper works elsewhere, which means
the risk is reduced to the at803x_resume() conversion.

The unfortunate issue that gives us is that the calling convention
of "suspend" vs "resume" is different - resume ends up with
phydev->lock always held but suspend doesn't, and that will be
confusing for phy driver developers, and really doesn't help from
the review point of view.
diff mbox series

Patch

diff --git a/drivers/net/phy/at803x.c b/drivers/net/phy/at803x.c
index 4c75cfcdeaec..a20d1f5e4dbf 100644
--- a/drivers/net/phy/at803x.c
+++ b/drivers/net/phy/at803x.c
@@ -258,38 +258,22 @@  static int at803x_suspend(struct phy_device *phydev)
 	int value;
 	int wol_enabled;
 
-	mutex_lock(&phydev->lock);
-
 	value = phy_read(phydev, AT803X_INTR_ENABLE);
 	wol_enabled = value & AT803X_INTR_ENABLE_WOL;
 
-	value = phy_read(phydev, MII_BMCR);
-
 	if (wol_enabled)
-		value |= BMCR_ISOLATE;
+		value = BMCR_ISOLATE;
 	else
-		value |= BMCR_PDOWN;
-
-	phy_write(phydev, MII_BMCR, value);
+		value = BMCR_PDOWN;
 
-	mutex_unlock(&phydev->lock);
+	phy_modify(phydev, MII_BMCR, value, value);
 
 	return 0;
 }
 
 static int at803x_resume(struct phy_device *phydev)
 {
-	int value;
-
-	mutex_lock(&phydev->lock);
-
-	value = phy_read(phydev, MII_BMCR);
-	value &= ~(BMCR_PDOWN | BMCR_ISOLATE);
-	phy_write(phydev, MII_BMCR, value);
-
-	mutex_unlock(&phydev->lock);
-
-	return 0;
+	return phy_modify(phydev, MII_BMCR, BMCR_PDOWN | BMCR_ISOLATE, 0);
 }
 
 static int at803x_probe(struct phy_device *phydev)
diff --git a/drivers/net/phy/mdio_bus.c b/drivers/net/phy/mdio_bus.c
index 1bf3adcdcbac..d5adb00eafdf 100644
--- a/drivers/net/phy/mdio_bus.c
+++ b/drivers/net/phy/mdio_bus.c
@@ -651,6 +651,38 @@  int mdiobus_write(struct mii_bus *bus, int addr, u32 regnum, u16 val)
 EXPORT_SYMBOL(mdiobus_write);
 
 /**
+ * mdiobus_modify - Convenience function for writing a given MII mgmt register
+ * @bus: the mii_bus struct
+ * @addr: the phy address
+ * @regnum: register number to write
+ * @mask: bits to mask off from the register @regnum
+ * @val: new value of bits set in mask to write to @regnum
+ *
+ * NOTE: MUST NOT be called from interrupt context,
+ * because the bus read/write functions may wait for an interrupt
+ * to conclude the operation.
+ */
+int mdiobus_modify(struct mii_bus *bus, int addr, u32 regnum, u16 mask, u16 val)
+{
+	int retval;
+	int err;
+
+	BUG_ON(in_interrupt());
+
+	mutex_lock(&bus->mdio_lock);
+	err = retval = __mdiobus_read(bus, addr, regnum);
+	if (err >= 0) {
+		retval &= ~mask;
+		retval |= val & mask;
+		err = __mdiobus_write(bus, addr, regnum, retval);
+	}
+	mutex_unlock(&bus->mdio_lock);
+
+	return err;
+}
+EXPORT_SYMBOL(mdiobus_modify);
+
+/**
  * mdio_bus_match - determine if given MDIO driver supports the given
  *		    MDIO device
  * @dev: target MDIO device
diff --git a/drivers/net/phy/micrel.c b/drivers/net/phy/micrel.c
index fdb43dd9b5cd..9e9438caa2b7 100644
--- a/drivers/net/phy/micrel.c
+++ b/drivers/net/phy/micrel.c
@@ -711,22 +711,9 @@  static int kszphy_suspend(struct phy_device *phydev)
 
 static int kszphy_resume(struct phy_device *phydev)
 {
-	int ret;
-
 	genphy_resume(phydev);
 
-	ret = kszphy_config_reset(phydev);
-	if (ret)
-		return ret;
-
-	/* Enable PHY Interrupts */
-	if (phy_interrupt_is_valid(phydev)) {
-		phydev->interrupts = PHY_INTERRUPT_ENABLED;
-		if (phydev->drv->config_intr)
-			phydev->drv->config_intr(phydev);
-	}
-
-	return 0;
+	return kszphy_config_reset(phydev);
 }
 
 static int kszphy_probe(struct phy_device *phydev)
diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
index f2a83ab00e71..874740957606 100644
--- a/drivers/net/phy/phy.c
+++ b/drivers/net/phy/phy.c
@@ -844,7 +844,6 @@  EXPORT_SYMBOL(phy_stop);
  */
 void phy_start(struct phy_device *phydev)
 {
-	bool do_resume = false;
 	int err = 0;
 
 	mutex_lock(&phydev->lock);
@@ -857,6 +856,9 @@  void phy_start(struct phy_device *phydev)
 		phydev->state = PHY_UP;
 		break;
 	case PHY_HALTED:
+		/* if phy was suspended, bring the physical link up again */
+		phy_resume(phydev);
+
 		/* make sure interrupts are re-enabled for the PHY */
 		if (phydev->irq != PHY_POLL) {
 			err = phy_enable_interrupts(phydev);
@@ -865,17 +867,12 @@  void phy_start(struct phy_device *phydev)
 		}
 
 		phydev->state = PHY_RESUMING;
-		do_resume = true;
 		break;
 	default:
 		break;
 	}
 	mutex_unlock(&phydev->lock);
 
-	/* if phy was suspended, bring the physical link up again */
-	if (do_resume)
-		phy_resume(phydev);
-
 	phy_trigger_machine(phydev, true);
 }
 EXPORT_SYMBOL(phy_start);
diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 67f25ac29025..d0d98e6f6406 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -135,7 +135,9 @@  static int mdio_bus_phy_resume(struct device *dev)
 	if (!mdio_bus_phy_may_suspend(phydev))
 		goto no_resume;
 
+	mutex_lock(&phydev->lock);
 	ret = phy_resume(phydev);
+	mutex_unlock(&phydev->lock);
 	if (ret < 0)
 		return ret;
 
@@ -1026,7 +1028,9 @@  int phy_attach_direct(struct net_device *dev, struct phy_device *phydev,
 	if (err)
 		goto error;
 
+	mutex_lock(&phydev->lock);
 	phy_resume(phydev);
+	mutex_unlock(&phydev->lock);
 	phy_led_triggers_register(phydev);
 
 	return err;
@@ -1157,6 +1161,8 @@  int phy_resume(struct phy_device *phydev)
 	struct phy_driver *phydrv = to_phy_driver(phydev->mdio.dev.driver);
 	int ret = 0;
 
+	WARN_ON(!mutex_is_locked(&phydev->lock));
+
 	if (phydev->drv && phydrv->resume)
 		ret = phydrv->resume(phydev);
 
@@ -1322,9 +1328,8 @@  static int genphy_config_eee_advert(struct phy_device *phydev)
  */
 int genphy_setup_forced(struct phy_device *phydev)
 {
-	int ctl = phy_read(phydev, MII_BMCR);
+	u16 ctl = 0;
 
-	ctl &= BMCR_LOOPBACK | BMCR_ISOLATE | BMCR_PDOWN;
 	phydev->pause = 0;
 	phydev->asym_pause = 0;
 
@@ -1336,7 +1341,10 @@  int genphy_setup_forced(struct phy_device *phydev)
 	if (DUPLEX_FULL == phydev->duplex)
 		ctl |= BMCR_FULLDPLX;
 
-	return phy_write(phydev, MII_BMCR, ctl);
+	return phy_modify(phydev, MII_BMCR,
+			  BMCR_LOOPBACK | BMCR_ISOLATE | BMCR_PDOWN |
+			  BMCR_SPEED1000 | BMCR_SPEED100 | BMCR_FULLDPLX,
+			  ctl);
 }
 EXPORT_SYMBOL(genphy_setup_forced);
 
@@ -1346,17 +1354,10 @@  EXPORT_SYMBOL(genphy_setup_forced);
  */
 int genphy_restart_aneg(struct phy_device *phydev)
 {
-	int ctl = phy_read(phydev, MII_BMCR);
-
-	if (ctl < 0)
-		return ctl;
-
-	ctl |= BMCR_ANENABLE | BMCR_ANRESTART;
-
 	/* Don't isolate the PHY if we're negotiating */
-	ctl &= ~BMCR_ISOLATE;
-
-	return phy_write(phydev, MII_BMCR, ctl);
+	return phy_modify(phydev, MII_BMCR,
+			  BMCR_ANENABLE | BMCR_ANRESTART | BMCR_ISOLATE,
+			  BMCR_ANENABLE | BMCR_ANRESTART);
 }
 EXPORT_SYMBOL(genphy_restart_aneg);
 
@@ -1622,48 +1623,20 @@  EXPORT_SYMBOL(genphy_config_init);
 
 int genphy_suspend(struct phy_device *phydev)
 {
-	int value;
-
-	mutex_lock(&phydev->lock);
-
-	value = phy_read(phydev, MII_BMCR);
-	phy_write(phydev, MII_BMCR, value | BMCR_PDOWN);
-
-	mutex_unlock(&phydev->lock);
-
-	return 0;
+	return phy_modify(phydev, MII_BMCR, BMCR_PDOWN, BMCR_PDOWN);
 }
 EXPORT_SYMBOL(genphy_suspend);
 
 int genphy_resume(struct phy_device *phydev)
 {
-	int value;
-
-	mutex_lock(&phydev->lock);
-
-	value = phy_read(phydev, MII_BMCR);
-	phy_write(phydev, MII_BMCR, value & ~BMCR_PDOWN);
-
-	mutex_unlock(&phydev->lock);
-
-	return 0;
+	return phy_modify(phydev, MII_BMCR, BMCR_PDOWN, 0);
 }
 EXPORT_SYMBOL(genphy_resume);
 
 int genphy_loopback(struct phy_device *phydev, bool enable)
 {
-	int value;
-
-	value = phy_read(phydev, MII_BMCR);
-	if (value < 0)
-		return value;
-
-	if (enable)
-		value |= BMCR_LOOPBACK;
-	else
-		value &= ~BMCR_LOOPBACK;
-
-	return phy_write(phydev, MII_BMCR, value);
+	return phy_modify(phydev, MII_BMCR, BMCR_LOOPBACK,
+			  enable ? BMCR_LOOPBACK : 0);
 }
 EXPORT_SYMBOL(genphy_loopback);
 
diff --git a/include/linux/mdio.h b/include/linux/mdio.h
index 4be30adc033b..8e69211685be 100644
--- a/include/linux/mdio.h
+++ b/include/linux/mdio.h
@@ -264,6 +264,7 @@  int mdiobus_read(struct mii_bus *bus, int addr, u32 regnum);
 int mdiobus_read_nested(struct mii_bus *bus, int addr, u32 regnum);
 int mdiobus_write(struct mii_bus *bus, int addr, u32 regnum, u16 val);
 int mdiobus_write_nested(struct mii_bus *bus, int addr, u32 regnum, u16 val);
+int mdiobus_modify(struct mii_bus *bus, int addr, u32 regnum, u16 mask, u16 val);
 
 int mdiobus_register_device(struct mdio_device *mdiodev);
 int mdiobus_unregister_device(struct mdio_device *mdiodev);
diff --git a/include/linux/phy.h b/include/linux/phy.h
index 03a259f128f9..afe5d963a98c 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -766,6 +766,24 @@  static inline int __phy_write(struct phy_device *phydev, u32 regnum, u16 val)
 }
 
 /**
+ * phy_modify - Convenience function for modifying a given PHY register
+ * @phydev: the phy_device struct
+ * @regnum: register number to write
+ * @mask: bits to mask off from the register @regnum
+ * @val: new value of bits set in mask to write to @regnum
+ *
+ * NOTE: MUST NOT be called from interrupt context,
+ * because the bus read/write functions may wait for an interrupt
+ * to conclude the operation.
+ */
+static inline int phy_modify(struct phy_device *phydev, u32 regnum, u16 mask,
+			     u16 val)
+{
+	return mdiobus_modify(phydev->mdio.bus, phydev->mdio.addr, regnum,
+			      mask, val);
+}
+
+/**
  * phy_interrupt_is_valid - Convenience function for testing a given PHY irq
  * @phydev: the phy_device struct
  *