[3/3] via-velocity: Fix races on shared interrupts

Message ID	20100205165545.28832c53@marrow.netinsight.se
State	Changes Requested, archived
Delegated to:	David Miller
Headers
Return-Path: <netdev-owner@vger.kernel.org>
Date: Fri, 5 Feb 2010 16:55:45 +0100
From: Simon Kagstrom <simon.kagstrom@netinsight.net>
To: netdev@vger.kernel.org, davem@davemloft.net
Cc: davej@redhat.com, ben@decadent.org.uk
Subject: [PATCH 3/3] via-velocity: Fix races on shared interrupts
Message-ID: <20100205165545.28832c53@marrow.netinsight.se>
In-Reply-To: <20100205165253.3f316b98@marrow.netinsight.se>
References: <20100205165253.3f316b98@marrow.netinsight.se>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: netdev-owner@vger.kernel.org
Precedence: bulk

Commit Message

Simon Kagstrom Feb. 5, 2010, 3:55 p.m. UTC

This patch fixes two potential races in the velocity driver:

* Move the ACK and error handler to the interrupt handler. This fixes a
  potential race with shared interrupts when the other device interrupts
  before the NAPI poll handler has finished. As the velocity driver hasn't
  acked its own interrupt, it will then steal the interrupt from the
  other device.

* Use spin_trylock in the interrupt handler. To avoid having the
  interrupt off for long periods of time, velocity_poll uses non-irqsave
  spinlocks. In the current code, the interrupt handler will deadlock if,
  e.g., the NAPI poll handler is executing when an interrupt (for another
  device) comes in, since it tries to take the already-held lock.

Signed-off-by: Simon Kagstrom <simon.kagstrom@netinsight.net>
Signed-off-by: Anders Grafstrom <anders.grafstrom@netinsight.net>
---
 drivers/net/via-velocity.c |   26 +++++++++++++++++---------
 1 files changed, 17 insertions(+), 9 deletions(-)
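
The bail-out pattern from the second bullet can be sketched in plain
userspace C. This is only an illustration under stated assumptions, not
the driver code: struct fake_dev, fake_trylock(), fake_unlock() and
fake_intr() are hypothetical stand-ins, and a C11 atomic_flag plays the
role of vptr->lock. The sketch shows that a handler which returns early
when the lock is held cannot deadlock, but also that it leaves the
pending status untouched until it is called again.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; none of this is kernel API. */
enum fake_irqreturn { FAKE_IRQ_NONE, FAKE_IRQ_HANDLED };

struct fake_dev {
	atomic_flag lock;     /* plays the role of vptr->lock */
	unsigned int isr;     /* simulated interrupt status register */
	unsigned int handled; /* number of interrupts serviced so far */
};

static void fake_dev_init(struct fake_dev *dev)
{
	assert(dev);
	atomic_flag_clear(&dev->lock);
	dev->isr = 0;
	dev->handled = 0;
}

/* Returns true if the lock was free and is now held by the caller. */
static bool fake_trylock(struct fake_dev *dev)
{
	return !atomic_flag_test_and_set(&dev->lock);
}

static void fake_unlock(struct fake_dev *dev)
{
	atomic_flag_clear(&dev->lock);
}

/* Handler that gives up instead of spinning when the lock is held. */
static enum fake_irqreturn fake_intr(struct fake_dev *dev)
{
	unsigned int status;

	if (!fake_trylock(dev))
		return FAKE_IRQ_NONE; /* lock held elsewhere: do nothing */

	status = dev->isr;
	if (status == 0) { /* nothing pending for this device */
		fake_unlock(dev);
		return FAKE_IRQ_NONE;
	}

	dev->isr &= ~status; /* acknowledge the bits we saw */
	dev->handled++;
	fake_unlock(dev);
	return FAKE_IRQ_HANDLED;
}
```

Calling fake_intr() while another context holds the lock returns
FAKE_IRQ_NONE and leaves dev->isr set, which is the kind of deferred
event the review comments below ask about.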

Comments

Laurent Chavey Feb. 10, 2010, 5:41 p.m. UTC | #1

On Fri, Feb 5, 2010 at 7:55 AM, Simon Kagstrom
<simon.kagstrom@netinsight.net> wrote:
> This patch fixes two potential races in the velocity driver:
>
> * Move the ACK and error handler to the interrupt handler. This fixes a
>  potential race with shared interrupts when the other device interrupts
>  before the NAPI poll handler has finished. As the velocity driver hasn't
>  acked its own interrupt, it will then steal the interrupt from the
>  other device.
>
> * Use spin_trylock in the interrupt handler. To avoid having the
>  interrupt off for long periods of time, velocity_poll uses non-irqsave
>  spinlocks. In the current code, the interrupt handler will deadlock if
>  e.g., the NAPI poll handler is executing when an interrupt (for another
>  device) comes in since it tries to take the already held lock.
>
> Signed-off-by: Simon Kagstrom <simon.kagstrom@netinsight.net>
> Signed-off-by: Anders Grafstrom <anders.grafstrom@netinsight.net>
> ---
>  drivers/net/via-velocity.c |   26 +++++++++++++++++---------
>  1 files changed, 17 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/net/via-velocity.c b/drivers/net/via-velocity.c
> index 5e213f7..6882e7c 100644
> --- a/drivers/net/via-velocity.c
> +++ b/drivers/net/via-velocity.c
> @@ -2148,16 +2148,8 @@ static int velocity_poll(struct napi_struct *napi, int budget)
>        struct velocity_info *vptr = container_of(napi,
>                        struct velocity_info, napi);
>        unsigned int rx_done;
> -       u32 isr_status;
>
>        spin_lock(&vptr->lock);
> -       isr_status = mac_read_isr(vptr->mac_regs);
> -
> -       /* Ack the interrupt */
> -       mac_write_isr(vptr->mac_regs, isr_status);
> -       if (isr_status & (~(ISR_PRXI | ISR_PPRXI | ISR_PTXI | ISR_PPTXI)))
> -               velocity_error(vptr, isr_status);
> -
>        /*
>         * Do rx and tx twice for performance (taken from the VIA
>         * out-of-tree driver).
> @@ -2194,7 +2186,16 @@ static irqreturn_t velocity_intr(int irq, void *dev_instance)
>        struct velocity_info *vptr = netdev_priv(dev);
>        u32 isr_status;
>
> -       spin_lock(&vptr->lock);
> +       /* Check if the lock is taken, and if so ignore the interrupt. This
> +        * can happen with shared interrupts, where the other device can
> +        * interrupt during velocity_poll (where the lock is held).
> +        *
> +        * With spinlock debugging active on a uniprocessor, this will give
> +        * a warning which can safely be ignored.
> +        */
> +       if (!spin_trylock(&vptr->lock))
> +               return IRQ_NONE;

Does the thread handling the interrupts check whether a new
interrupt was received while it was servicing a previous one?
I am wondering whether an event that generates an interrupt
could be missed.


> +
>        isr_status = mac_read_isr(vptr->mac_regs);
>
>        /* Not us ? */
> @@ -2203,10 +2204,17 @@ static irqreturn_t velocity_intr(int irq, void *dev_instance)
>                return IRQ_NONE;
>        }
>
> +       /* Ack the interrupt */
> +       mac_write_isr(vptr->mac_regs, isr_status);
> +
>        if (likely(napi_schedule_prep(&vptr->napi))) {
>                mac_disable_int(vptr->mac_regs);
>                __napi_schedule(&vptr->napi);
>        }
> +
> +       if (isr_status & (~(ISR_PRXI | ISR_PPRXI | ISR_PTXI | ISR_PPTXI)))
> +               velocity_error(vptr, isr_status);
> +
>        spin_unlock(&vptr->lock);
>
>        return IRQ_HANDLED;
> --
> 1.6.0.4
>
> --
> To unsubscribe from this list: send the line "unsubscribe netdev" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

Simon Kagstrom Feb. 11, 2010, 8:05 a.m. UTC | #2

Hi Laurent!

On Wed, 10 Feb 2010 09:41:59 -0800
Laurent Chavey <chavey@google.com> wrote:

> > -       spin_lock(&vptr->lock);
> > +       /* Check if the lock is taken, and if so ignore the interrupt. This
> > +        * can happen with shared interrupts, where the other device can
> > +        * interrupt during velocity_poll (where the lock is held).
> > +        *
> > +        * With spinlock debugging active on a uniprocessor, this will give
> > +        * a warning which can safely be ignored.
> > +        */
> > +       if (!spin_trylock(&vptr->lock))
> > +               return IRQ_NONE;
> 
> Does the thread handling the interrupts check whether a new
> interrupt was received while it was servicing a previous one?
> I am wondering whether an event that generates an interrupt
> could be missed.

I should say that this particular part of the patch was reworked in
version 2, which David took in here:

  http://git.kernel.org/?p=linux/kernel/git/davem/net-2.6.git;a=commitdiff;h=3f2e8d9f13246382fbda6f03178eef867a9bfbe2

Anyway, velocity_poll will try to empty all events within its budget
from the device and is executing with the device interrupt turned off
(and now also with the local processor interrupts off). If something
is posted while it is exiting, that's fine, since it has either

  1) Consumed its entire budget, in which case it will stay in
  polling mode anyway

or,

  2) Not consumed the budget, in which case it will turn the interrupt
  on again and get the new event promptly.

You are right about the code above though, that one is racy as David
explained here:

  http://permalink.gmane.org/gmane.linux.network/151578
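
The two budget cases described in this reply can be sketched as a small
userspace simulation. It is only a sketch under stated assumptions:
struct fake_nic and fake_poll() are hypothetical names, the event queue
is a plain counter, and no real NAPI calls are involved.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a device in polling mode; not the real driver. */
struct fake_nic {
	int queued;       /* events waiting in the device */
	bool irq_enabled; /* device interrupt currently enabled? */
	bool in_polling;  /* still scheduled for polling? */
};

/*
 * Process at most 'budget' events and return how many were handled.
 * Case 1: the whole budget was used, so stay in polling mode.
 * Case 2: the queue drained early, so leave polling mode and
 *         re-enable the device interrupt.
 */
static int fake_poll(struct fake_nic *nic, int budget)
{
	int done = 0;

	assert(budget > 0);

	while (done < budget && nic->queued > 0) {
		nic->queued--;
		done++;
	}

	if (done < budget) {
		nic->in_polling = false;
		nic->irq_enabled = true;
	}

	return done;
}
```

An event that arrives just as fake_poll() returns is picked up either by
the next poll (case 1) or by the re-enabled interrupt (case 2).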

// Simon

diff --git a/drivers/net/via-velocity.c b/drivers/net/via-velocity.c
index 5e213f7..6882e7c 100644
--- a/drivers/net/via-velocity.c
+++ b/drivers/net/via-velocity.c
@@ -2148,16 +2148,8 @@  static int velocity_poll(struct napi_struct *napi, int budget)
 	struct velocity_info *vptr = container_of(napi,
 			struct velocity_info, napi);
 	unsigned int rx_done;
-	u32 isr_status;
 
 	spin_lock(&vptr->lock);
-	isr_status = mac_read_isr(vptr->mac_regs);
-
-	/* Ack the interrupt */
-	mac_write_isr(vptr->mac_regs, isr_status);
-	if (isr_status & (~(ISR_PRXI | ISR_PPRXI | ISR_PTXI | ISR_PPTXI)))
-		velocity_error(vptr, isr_status);
-
 	/*
 	 * Do rx and tx twice for performance (taken from the VIA
 	 * out-of-tree driver).
@@ -2194,7 +2186,16 @@  static irqreturn_t velocity_intr(int irq, void *dev_instance)
 	struct velocity_info *vptr = netdev_priv(dev);
 	u32 isr_status;
 
-	spin_lock(&vptr->lock);
+	/* Check if the lock is taken, and if so ignore the interrupt. This
+	 * can happen with shared interrupts, where the other device can
+	 * interrupt during velocity_poll (where the lock is held).
+	 *
+	 * With spinlock debugging active on a uniprocessor, this will give
+	 * a warning which can safely be ignored.
+	 */
+	if (!spin_trylock(&vptr->lock))
+		return IRQ_NONE;
+
 	isr_status = mac_read_isr(vptr->mac_regs);
 
 	/* Not us ? */
@@ -2203,10 +2204,17 @@  static irqreturn_t velocity_intr(int irq, void *dev_instance)
 		return IRQ_NONE;
 	}
 
+	/* Ack the interrupt */
+	mac_write_isr(vptr->mac_regs, isr_status);
+
 	if (likely(napi_schedule_prep(&vptr->napi))) {
 		mac_disable_int(vptr->mac_regs);
 		__napi_schedule(&vptr->napi);
 	}
+
+	if (isr_status & (~(ISR_PRXI | ISR_PPRXI | ISR_PTXI | ISR_PPTXI)))
+		velocity_error(vptr, isr_status);
+
 	spin_unlock(&vptr->lock);
 
 	return IRQ_HANDLED;
