[net,4/4] sh_eth: Fix serialisation of interrupt disable with interrupt & NAPI handlers

Message ID 1421930648.1222.289.camel@xylophone.i.decadent.org.uk
State Accepted, archived
Delegated to: David Miller

Commit Message

Ben Hutchings Jan. 22, 2015, 12:44 p.m. UTC
In order to stop the RX path accessing the RX ring while it's being
stopped or resized, we clear the interrupt mask (EESIPR) and then call
free_irq() or synchronize_irq().  This is insufficient because the
interrupt handler or NAPI poller may set EESIPR again after we clear
it.  Also, in sh_eth_set_ringparam() we currently don't disable NAPI
polling at all.

I could easily trigger a crash by running the loop:

   while ethtool -G eth0 rx 128 && ethtool -G eth0 rx 64; do echo -n .; done

and 'ping -f' toward the sh_eth port from another machine.

To fix this:
- Add a software flag (irq_enabled) to signal whether interrupts
  should be enabled
- In the interrupt handler, if the flag is clear then clear EESIPR
  and return
- In the NAPI poller, if the flag is clear then don't set EESIPR
- Set the flag before enabling interrupts in sh_eth_dev_init() and
  sh_eth_set_ringparam()
- Clear the flag and serialise with the interrupt and NAPI
  handlers before clearing EESIPR in sh_eth_close() and
  sh_eth_set_ringparam()

After this, I could run the loop for 100,000 iterations successfully.

Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>
---
 drivers/net/ethernet/renesas/sh_eth.c |   39 +++++++++++++++++++++++++--------
 drivers/net/ethernet/renesas/sh_eth.h |    1 +
 2 files changed, 31 insertions(+), 9 deletions(-)
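
The flag protocol listed in the commit message can be sketched as a small
user-space model.  This is only an illustration, not driver code: struct
model, the model_*() helpers and the MODEL_* constants are hypothetical
names, and the register is just a variable.  It shows that once the flag
is cleared, a late poll completion or a late interrupt can no longer turn
the mask back on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins; the real values live in the driver. */
#define MODEL_EESIPR_ALL 0xffu	/* plays the role of mdp->cd->eesipr_value */
#define MODEL_RX_CHECK   0x01u	/* plays the role of EESR_RX_CHECK */

struct model {
	uint32_t eesipr;	/* simulated interrupt mask register */
	bool irq_enabled;	/* software flag introduced by the patch */
};

/* Set the flag before enabling interrupts. */
static void model_start(struct model *m)
{
	m->irq_enabled = true;
	m->eesipr = MODEL_EESIPR_ALL;
}

/* Interrupt handler: if the flag is clear, clear the mask and return. */
static void model_interrupt(struct model *m)
{
	if (!m->irq_enabled) {
		m->eesipr = 0;
		return;
	}
	/* Normal path: mask RX until the poller has run. */
	m->eesipr = MODEL_EESIPR_ALL & ~MODEL_RX_CHECK;
}

/* End of a NAPI poll: re-enable interrupts only if the flag is set. */
static void model_poll_done(struct model *m)
{
	if (m->irq_enabled)
		m->eesipr = MODEL_EESIPR_ALL;
}

/* Clear the flag first, then clear the mask. */
static void model_stop(struct model *m)
{
	m->irq_enabled = false;
	/* The driver serialises with the handlers at this point
	 * (synchronize_irq(), then napi_synchronize() or napi_disable()).
	 */
	m->eesipr = 0;
}

/* Replay a stop followed by late handler activity; returns the final mask. */
static uint32_t model_late_events_after_stop(void)
{
	struct model m;

	model_start(&m);
	model_interrupt(&m);
	assert(m.eesipr == (MODEL_EESIPR_ALL & ~MODEL_RX_CHECK));

	model_stop(&m);
	model_poll_done(&m);	/* late poll completion */
	model_interrupt(&m);	/* late interrupt */
	return m.eesipr;
}
```

In this model, model_late_events_after_stop() leaves the mask at zero;
without the two flag checks the late poll completion would restore
MODEL_EESIPR_ALL.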

Comments

Sergei Shtylyov Jan. 22, 2015, 1:50 p.m. UTC | #1
Hello.

On 1/22/2015 3:44 PM, Ben Hutchings wrote:

> In order to stop the RX path accessing the RX ring while it's being
> stopped or resized, we clear the interrupt mask (EESIPR) and then call
> free_irq() or synchronize_irq().  This is insufficient because the
> interrupt handler or NAPI poller may set EESIPR again after we clear
> it.

    Hm, how come the interrupt handler gets called when we have disabled all 
interrupts? Is it unmaskable EESR.ECI interrupt? BTW, I'm not seeing where the 
interrupt handler enables interrupts again; only NAPI poller does that AFAIK.

> Also, in sh_eth_set_ringparam() we currently don't disable NAPI
> polling at all.

> I could easily trigger a crash by running the loop:

>     while ethtool -G eth0 rx 128 && ethtool -G eth0 rx 64; do echo -n .; done

    Oh, never done any 'ethtool' tests...

> and 'ping -f' toward the sh_eth port from another machine.

> To fix this:
> - Add a software flag (irq_enabled) to signal whether interrupts
>    should be enabled
> - In the interrupt handler, if the flag is clear then clear EESIPR
>    and return
> - In the NAPI poller, if the flag is clear then don't set EESIPR
> - Set the flag before enabling interrupts in sh_eth_dev_init() and
>    sh_eth_set_ringparam()
> - Clear the flag and serialise with the interrupt and NAPI
>    handlers before clearing EESIPR in sh_eth_close() and
>    sh_eth_set_ringparam()

> After this, I could run the loop for 100,000 iterations successfully.

> Signed-off-by: Ben Hutchings <ben.hutchings@codethink.co.uk>

[...]

> diff --git a/drivers/net/ethernet/renesas/sh_eth.h b/drivers/net/ethernet/renesas/sh_eth.h
> index 7bfaf1c..259d03f 100644
> --- a/drivers/net/ethernet/renesas/sh_eth.h
> +++ b/drivers/net/ethernet/renesas/sh_eth.h
> @@ -513,6 +513,7 @@ struct sh_eth_private {
>   	u32 rx_buf_sz;			/* Based on MTU+slack. */
>   	int edmac_endian;
>   	struct napi_struct napi;
> +	bool irq_enabled;
>   	/* MII transceiver section. */
>   	u32 phy_id;			/* PHY ID */
>   	struct mii_bus *mii_bus;	/* MDIO bus control */

    In order to conserve space, I'd have added that field after 
'vlan_num_ids', just before the 1-bit fields...

WBR, Sergei

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ben Hutchings Jan. 22, 2015, 3:06 p.m. UTC | #2
On Thu, 2015-01-22 at 16:50 +0300, Sergei Shtylyov wrote:
> Hello.
> 
> On 1/22/2015 3:44 PM, Ben Hutchings wrote:
> 
> > In order to stop the RX path accessing the RX ring while it's being
> > stopped or resized, we clear the interrupt mask (EESIPR) and then call
> > free_irq() or synchronize_irq().  This is insufficient because the
> > interrupt handler or NAPI poller may set EESIPR again after we clear
> > it.
> 
>     Hm, how come the interrupt handler gets called when we have disabled all 
> interrupts?

It may be running on another processor and racing with the function that
clears EESIPR.

> Is it unmaskable EESR.ECI interrupt? BTW, I'm not seeing where the 
> interrupt handler enables interrupts again; only NAPI poller does that AFAIK.

Normally it only clears EESR_RX_CHECK, but as it cannot atomically clear
a single bit of EESIPR this can result in setting other bits.
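
As a rough illustration of that race, the unlucky interleaving can be
replayed step by step in user space.  The sketch assumes the handler
writes back a mask value it obtained before the other CPU cleared the
register; MODEL_RX_CHECK, MODEL_ALL and replay_stale_mask_write() are
made-up names, not driver symbols.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_RX_CHECK 0x01u	/* hypothetical stand-in for EESR_RX_CHECK */
#define MODEL_ALL      0xffu	/* hypothetical stand-in for eesipr_value */

/* Replay one interleaving of two CPUs; returns the final mask value. */
static uint32_t replay_stale_mask_write(void)
{
	uint32_t eesipr = MODEL_ALL;	/* simulated mask register */
	uint32_t sampled;

	/* CPU 0 (interrupt handler): obtain the mask it will write back. */
	sampled = eesipr;

	/* CPU 1 (close or set_ringparam): disable all interrupts. */
	eesipr = 0;
	assert(eesipr == 0);

	/* CPU 0: mask RX only, using the now stale value.  Every other
	 * bit that CPU 1 has just cleared is set again.
	 */
	eesipr = sampled & ~MODEL_RX_CHECK;

	return eesipr;
}
```

The function returns a non-zero mask (all bits except the RX one), which
is the state the shutdown path believed it had ruled out.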

> > Also, in sh_eth_set_ringparam() we currently don't disable NAPI
> > polling at all.
> 
> > I could easily trigger a crash by running the loop:
> 
> >     while ethtool -G eth0 rx 128 && ethtool -G eth0 rx 64; do echo -n .; done
> 
>     Oh, never done any 'ethtool' tests...

You should also be able to trigger this by bringing the device up and
down, but you have to wait for the PHY to bring the link up before any
packets will be received in between.  Thus each cycle takes longer.

[...]
> > diff --git a/drivers/net/ethernet/renesas/sh_eth.h b/drivers/net/ethernet/renesas/sh_eth.h
> > index 7bfaf1c..259d03f 100644
> > --- a/drivers/net/ethernet/renesas/sh_eth.h
> > +++ b/drivers/net/ethernet/renesas/sh_eth.h
> > @@ -513,6 +513,7 @@ struct sh_eth_private {
> >   	u32 rx_buf_sz;			/* Based on MTU+slack. */
> >   	int edmac_endian;
> >   	struct napi_struct napi;
> > +	bool irq_enabled;
> >   	/* MII transceiver section. */
> >   	u32 phy_id;			/* PHY ID */
> >   	struct mii_bus *mii_bus;	/* MDIO bus control */
> 
>     In order to conserve space, I'd have added that field after 
> 'vlan_num_ids', just before the 1-bit fields...

I don't think it's worth micro-optimising the size of a per-device
structure.
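
For reference, the size difference under discussion can be illustrated
with two cut-down structures.  These are hypothetical: they borrow a few
field names from this thread but are not the real struct sh_eth_private,
and the bit-field names are assumptions.  Whether the second layout is
actually smaller depends on the ABI, so the sketch only reports the
difference.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Flag placed between two 32-bit fields: padding usually follows it. */
struct layout_spread {
	uint32_t rx_buf_sz;
	int edmac_endian;
	bool irq_enabled;
	uint32_t phy_id;
	int vlan_num_ids;
	unsigned no_ether_link:1;	/* hypothetical 1-bit fields */
	unsigned is_opened:1;
};

/* Flag placed just before the 1-bit fields, as suggested in the review. */
struct layout_compact {
	uint32_t rx_buf_sz;
	int edmac_endian;
	uint32_t phy_id;
	int vlan_num_ids;
	bool irq_enabled;
	unsigned no_ether_link:1;
	unsigned is_opened:1;
};

/* Bytes saved by the compact layout on the current ABI (may be zero). */
static long layout_bytes_saved(void)
{
	return (long)sizeof(struct layout_spread) -
	       (long)sizeof(struct layout_compact);
}
```

On ABIs that let bit-fields share a storage unit with the preceding bool
this typically saves a few bytes per device, which is the scale of
saving being weighed here.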

Ben.



Sergei Shtylyov Jan. 22, 2015, 4:35 p.m. UTC | #3
Hello.

On 01/22/2015 06:06 PM, Ben Hutchings wrote:

>>> In order to stop the RX path accessing the RX ring while it's being
>>> stopped or resized, we clear the interrupt mask (EESIPR) and then call
>>> free_irq() or synchronize_irq().  This is insufficient because the
>>> interrupt handler or NAPI poller may set EESIPR again after we clear
>>> it.

>>      Hm, how come the interrupt handler gets called when we have disabled all
>> interrupts?

> It may be running on another processor and racing with the function that
> clears EESIPR.

    Ah, I didn't think about SMP... but then we need more spinlock protection 
instead, no?

>> Is it unmaskable EESR.ECI interrupt? BTW, I'm not seeing where the
>> interrupt handler enables interrupts again; only NAPI poller does that AFAIK.

> Normally it only clears EESR_RX_CHECK, but as it cannot atomically clear
> a single bit of EESIPR this can result in setting other bits.

    This is again only possible on SMP kernel, right?

[...]

> Ben.

WBR, Sergei

Ben Hutchings Jan. 22, 2015, 5:59 p.m. UTC | #4
On Thu, 2015-01-22 at 19:35 +0300, Sergei Shtylyov wrote:
> Hello.
> 
> On 01/22/2015 06:06 PM, Ben Hutchings wrote:
> 
> >>> In order to stop the RX path accessing the RX ring while it's being
> >>> stopped or resized, we clear the interrupt mask (EESIPR) and then call
> >>> free_irq() or synchronize_irq().  This is insufficient because the
> >>> interrupt handler or NAPI poller may set EESIPR again after we clear
> >>> it.
> 
> >>      Hm, how come the interrupt handler gets called when we have disabled all
> >> interrupts?
> 
> > It may be running on another processor and racing with the function that
> > clears EESIPR.
> 
>     Ah, I didn't think about SMP... but then we need more spinlock protection 
> instead, no?

That's what I tried first.  As we need to serialise with NAPI as well,
and napi_disable() may sleep, we need to call that first, so I ended up
with:

               napi_disable(&mdp->napi);
               spin_lock_irq(&mdp->lock);
               sh_eth_write(ndev, 0x0000, EESIPR);
               spin_unlock_irq(&mdp->lock);
               napi_enable(&mdp->napi);

But after napi_disable() sets the NAPI_STATE_DISABLE bit,
napi_schedule_prep() will return false and so the interrupt handler will
not clear the EESR_RX_CHECK bit any more.  This can leave the interrupt
screaming and prevent the NAPI handler from ever completing, so the
system is livelocked.
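
A simplified user-space model of that livelock follows.  It assumes a
level-triggered RX interrupt that stays asserted until the handler masks
it, and it reduces napi_schedule_prep() to its two checks; struct
napi_model, model_schedule_prep() and count_handler_runs() are
hypothetical names, and the run limit exists only so that the simulation
terminates.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_RX_CHECK 0x01u	/* hypothetical stand-in for EESR_RX_CHECK */
#define MODEL_ALL      0xffu	/* hypothetical stand-in for eesipr_value */

struct napi_model {
	bool disable_pending;	/* models NAPI_STATE_DISABLE */
	bool scheduled;		/* models NAPI_STATE_SCHED */
};

/* Reduced napi_schedule_prep(): fails while a disable is pending. */
static bool model_schedule_prep(struct napi_model *n)
{
	if (n->disable_pending || n->scheduled)
		return false;
	n->scheduled = true;
	return true;
}

/* Count how often the handler runs while RX stays asserted, up to limit. */
static unsigned int count_handler_runs(bool disable_pending,
				       unsigned int limit)
{
	struct napi_model napi = { disable_pending, false };
	uint32_t eesipr = MODEL_ALL;	/* simulated mask register */
	unsigned int runs = 0;

	/* The interrupt fires again for as long as RX is left unmasked. */
	while ((eesipr & MODEL_RX_CHECK) && runs < limit) {
		runs++;
		/* The handler masks RX only if it managed to schedule NAPI. */
		if (model_schedule_prep(&napi))
			eesipr &= ~MODEL_RX_CHECK;
	}
	assert(runs <= limit);
	return runs;
}
```

With disable_pending false the handler runs once and masks RX; with it
true the handler never masks RX and the loop only stops at the limit.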

> >> Is it unmaskable EESR.ECI interrupt? BTW, I'm not seeing where the
> >> interrupt handler enables interrupts again; only NAPI poller does that AFAIK.
> 
> > Normally it only clears EESR_RX_CHECK, but as it cannot atomically clear
> > a single bit of EESIPR this can result in setting other bits.
> 
>     This is again only possible on SMP kernel, right?

Yes.

Ben.



Patch

diff --git a/drivers/net/ethernet/renesas/sh_eth.c b/drivers/net/ethernet/renesas/sh_eth.c
index be7aa43..d475929 100644
--- a/drivers/net/ethernet/renesas/sh_eth.c
+++ b/drivers/net/ethernet/renesas/sh_eth.c
@@ -1316,8 +1316,10 @@  static int sh_eth_dev_init(struct net_device *ndev, bool start)
 		     RFLR);
 
 	sh_eth_write(ndev, sh_eth_read(ndev, EESR), EESR);
-	if (start)
+	if (start) {
+		mdp->irq_enabled = true;
 		sh_eth_write(ndev, mdp->cd->eesipr_value, EESIPR);
+	}
 
 	/* PAUSE Prohibition */
 	val = (sh_eth_read(ndev, ECMR) & ECMR_DM) |
@@ -1653,7 +1655,12 @@  static irqreturn_t sh_eth_interrupt(int irq, void *netdev)
 	if (intr_status & (EESR_RX_CHECK | cd->tx_check | cd->eesr_err_check))
 		ret = IRQ_HANDLED;
 	else
-		goto other_irq;
+		goto out;
+
+	if (!likely(mdp->irq_enabled)) {
+		sh_eth_write(ndev, 0, EESIPR);
+		goto out;
+	}
 
 	if (intr_status & EESR_RX_CHECK) {
 		if (napi_schedule_prep(&mdp->napi)) {
@@ -1684,7 +1691,7 @@  static irqreturn_t sh_eth_interrupt(int irq, void *netdev)
 		sh_eth_error(ndev, intr_status);
 	}
 
-other_irq:
+out:
 	spin_unlock(&mdp->lock);
 
 	return ret;
@@ -1712,7 +1719,8 @@  static int sh_eth_poll(struct napi_struct *napi, int budget)
 	napi_complete(napi);
 
 	/* Reenable Rx interrupts */
-	sh_eth_write(ndev, mdp->cd->eesipr_value, EESIPR);
+	if (mdp->irq_enabled)
+		sh_eth_write(ndev, mdp->cd->eesipr_value, EESIPR);
 out:
 	return budget - quota;
 }
@@ -1970,12 +1978,20 @@  static int sh_eth_set_ringparam(struct net_device *ndev,
 	if (netif_running(ndev)) {
 		netif_device_detach(ndev);
 		netif_tx_disable(ndev);
-		/* Disable interrupts by clearing the interrupt mask. */
+
+		/* Serialise with the interrupt handler and NAPI, then
+		 * disable interrupts.  We have to clear the
+		 * irq_enabled flag first to ensure that interrupts
+		 * won't be re-enabled.
+		 */
+		mdp->irq_enabled = false;
+		synchronize_irq(ndev->irq);
+		napi_synchronize(&mdp->napi);
 		sh_eth_write(ndev, 0x0000, EESIPR);
+
 		/* Stop the chip's Tx and Rx processes. */
 		sh_eth_write(ndev, 0, EDTRR);
 		sh_eth_write(ndev, 0, EDRRR);
-		synchronize_irq(ndev->irq);
 
 		/* Free all the skbuffs in the Rx queue. */
 		sh_eth_ring_free(ndev);
@@ -2001,6 +2017,7 @@  static int sh_eth_set_ringparam(struct net_device *ndev,
 			return ret;
 		}
 
+		mdp->irq_enabled = true;
 		sh_eth_write(ndev, mdp->cd->eesipr_value, EESIPR);
 		/* Setting the Rx mode will start the Rx process. */
 		sh_eth_write(ndev, EDRRR_R, EDRRR);
@@ -2184,7 +2201,13 @@  static int sh_eth_close(struct net_device *ndev)
 
 	netif_stop_queue(ndev);
 
-	/* Disable interrupts by clearing the interrupt mask. */
+	/* Serialise with the interrupt handler and NAPI, then disable
+	 * interrupts.  We have to clear the irq_enabled flag first to
+	 * ensure that interrupts won't be re-enabled.
+	 */
+	mdp->irq_enabled = false;
+	synchronize_irq(ndev->irq);
+	napi_disable(&mdp->napi);
 	sh_eth_write(ndev, 0x0000, EESIPR);
 
 	/* Stop the chip's Tx and Rx processes. */
@@ -2201,8 +2224,6 @@  static int sh_eth_close(struct net_device *ndev)
 
 	free_irq(ndev->irq, ndev);
 
-	napi_disable(&mdp->napi);
-
 	/* Free all the skbuffs in the Rx queue. */
 	sh_eth_ring_free(ndev);
 
diff --git a/drivers/net/ethernet/renesas/sh_eth.h b/drivers/net/ethernet/renesas/sh_eth.h
index 7bfaf1c..259d03f 100644
--- a/drivers/net/ethernet/renesas/sh_eth.h
+++ b/drivers/net/ethernet/renesas/sh_eth.h
@@ -513,6 +513,7 @@  struct sh_eth_private {
 	u32 rx_buf_sz;			/* Based on MTU+slack. */
 	int edmac_endian;
 	struct napi_struct napi;
+	bool irq_enabled;
 	/* MII transceiver section. */
 	u32 phy_id;			/* PHY ID */
 	struct mii_bus *mii_bus;	/* MDIO bus control */