diff mbox

[2.6.32,2/3] net: Fix ks8851 snl transmit problem

Message ID 14385191E87B904DBD836449AA30269D021A41@MORGANITE.micrel.com
State Rejected, archived
Delegated to: David Miller
Headers show

Commit Message

Ha, Tristram Dec. 3, 2009, 9:17 p.m. UTC
From: Tristram Ha <Tristram.Ha@micrel.com>

This fixes a transmit problem of the ks8851 snl network driver.  Under heavy TCP traffic the device will stop transmitting.  Turning off the transmit interrupt avoids this issue.  A new workqueue was implemented to replace the functionality of the transmit interrupt processing.

Signed-off-by: Tristram Ha <Tristram.Ha@micrel.com>
---
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

David Miller Dec. 4, 2009, 12:03 a.m. UTC | #1
From: "Ha, Tristram" <Tristram.Ha@Micrel.Com>
Date: Thu, 3 Dec 2009 13:17:48 -0800

> This fixes a transmit problem of the ks8851 snl network driver.  Under heavy TCP traffic the device will stop transmitting.  Turning off the transmit interrupt avoids this issue.  A new workqueue was implemented to replace the functionality of the transmit interrupt processing.
> 
> Signed-off-by: Tristram Ha <Tristram.Ha@micrel.com>

Please describe why the TX interrupt facility cannot be used.
You probably have a race condition or similar, have you tried
to see if there is some issue like that?

This workqueue is going to make performance terrible, at best.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ha, Tristram Dec. 4, 2009, 12:57 a.m. UTC | #2
David Miller wrote:
> From: "Ha, Tristram" <Tristram.Ha@Micrel.Com>
> Date: Thu, 3 Dec 2009 13:17:48 -0800
> 
>> This fixes a transmit problem of the ks8851 snl network driver.
Under heavy TCP traffic the
>> device will stop transmitting.  Turning off the transmit interrupt
avoids this issue.  A new
>> workqueue was implemented to replace the functionality of the
transmit interrupt processing.  
>> 
>> Signed-off-by: Tristram Ha <Tristram.Ha@micrel.com>
> 
> Please describe why the TX interrupt facility cannot be used.
> You probably have a race condition or similar, have you tried to see
if there is some issue like
> that? 
> 
> This workqueue is going to make performance terrible, at best.

I am a software engineer at Micrel, Inc., which developed the KSZ8851
SNL chips.  This transmit problem is a hardware issue as the same
problem is encountered in different drivers like WinCE.  We have not
pinpointed the hardware bug.  I do not think there is something wrong in
the driver.  Turning off the transmit interrupt is just a workaround.

I know the new workqueue implementation is polling.  But in my
experience, interrupting the system for each packet sent likely degrades
the system performance more.  The workqueue is only called when the
transmit buffer is empty, which happens after 3 or 4 packets are sent
quickly in succession.  The workqueue schedules itself again if the
transmit buffer is still not available, but in practice this case never
happens.

The KSZ8851 chips do not need special transmit interrupt handling
routine to cleanup the registers or socket buffers.  It is better to
turn it off for better performance.  Because the KSZ8851 SNL uses SPI,
its driver has restrictions of when to call SPI host to access
registers.  That is why the original driver implementation uses a
workqueue to check the transmit buffer during transmit interrupt
handling.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ha, Tristram Dec. 21, 2009, 9:51 p.m. UTC | #3
Ha, Tristram wrote:
> David Miller wrote:
>> 
>> Please describe why the TX interrupt facility cannot be used.
>> You probably have a race condition or similar, have you tried to see
>> if there is some issue like that?
>> 
>> This workqueue is going to make performance terrible, at best.
> 
> I am a software engineer at Micrel, Inc., which developed the KSZ8851
SNL chips.  This transmit
> problem is a hardware issue as the same problem is encountered in
different drivers like WinCE. 
> We have not pinpointed the hardware bug.  I do not think there is
something wrong in the driver.
> Turning off the transmit interrupt is just a workaround.   
> 
> I know the new workqueue implementation is polling.  But in my
experience, interrupting the
> system for each packet sent likely degrades the system performance
more.  The workqueue is only
> called when the transmit buffer is empty, which happens after 3 or 4
packets are sent quickly in
> succession.  The workqueue schedules itself again if the transmit
buffer is still not available,
> but in practice this case never happens.    
> 
> The KSZ8851 chips do not need special transmit interrupt handling
routine to cleanup the
> registers or socket buffers.  It is better to turn it off for better
performance.  Because the
> KSZ8851 SNL uses SPI, its driver has restrictions of when to call SPI
host to access registers. 
> That is why the original driver implementation uses a workqueue to
check the transmit buffer
> during transmit interrupt handling.    

Upon investigation, I found the ks8851 snl TCP transmit problem is not
related to the transmit side, which is fine, but to the receive side.
The "[PATCH 2.6.33] net: Fix ks8851 snl receive problem" patch I just
submitted should avoid the problem, so this patch is not required to
workaround the problem.

However, I still recommend this patch for performance consideration.  I
did a nuttcp benchmark using the Beagle Zippy combo board.

The original code using transmit done interrupt:

TCP RX	 9.6 Mbps	TCP TX	10.9 Mbps

The new code with transmit interrupt enabled:

TCP RX	 9.7 Mbps	TCP TX	11.5 Mbps

The new code with transmit interrupt disabled:

TCP RX	 9.8 Mbps	TCP TX	11.8 Mbps

The improvement is not much, but as the ks8851 snl device is targeted on
embedded systems that may not have powerful CPUs, the performance gain
is important to those ks8851 snl users.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff -urpN linux-2.6.32.old/drivers/net/ks8851.c linux-2.6.32.new/drivers/net/ks8851.c
--- linux-2.6.32.old/drivers/net/ks8851.c	2009-11-03 11:37:49.000000000 -0800
+++ linux-2.6.32.new/drivers/net/ks8851.c	2009-12-02 15:31:39.000000000 -0800
@@ -111,11 +111,13 @@  struct ks8851_net {
 	struct mii_if_info	mii;
 	struct ks8851_rxctrl	rxctrl;
 
+	struct work_struct	tx_check;
 	struct work_struct	tx_work;
 	struct work_struct	irq_work;
 	struct work_struct	rxctrl_work;
 
 	struct sk_buff_head	txq;
+	int			tx_len;
 
 	struct spi_message	spi_msg1;
 	struct spi_message	spi_msg2;
@@ -573,19 +575,6 @@  static void ks8851_irq_work(struct work_
 	if (status & IRQ_RXPSI)
 		handled |= IRQ_RXPSI;
 
-	if (status & IRQ_TXI) {
-		handled |= IRQ_TXI;
-
-		/* no lock here, tx queue should have been stopped */
-
-		/* update our idea of how much tx space is available to the
-		 * system */
-		ks->tx_space = ks8851_rdreg16(ks, KS_TXMIR);
-
-		if (netif_msg_intr(ks))
-			ks_dbg(ks, "%s: txspace %d\n", __func__, ks->tx_space);
-	}
-
 	if (status & IRQ_RXI)
 		handled |= IRQ_RXI;
 
@@ -623,9 +612,6 @@  static void ks8851_irq_work(struct work_
 
 	mutex_unlock(&ks->lock);
 
-	if (status & IRQ_TXI)
-		netif_wake_queue(ks->netdev);
-
 	enable_irq(ks->netdev->irq);
 }
 
@@ -703,6 +689,17 @@  static void ks8851_done_tx(struct ks8851
 	dev_kfree_skb(txb);
 }
 
+static void ks8851_tx_check(struct work_struct *work)
+{
+	struct ks8851_net *ks = container_of(work, struct ks8851_net, tx_check);
+
+	ks->tx_space = ks8851_rdreg16(ks, KS_TXMIR);
+	if (ks->tx_space > ks->tx_len)
+		netif_wake_queue(ks->netdev);
+	else
+		schedule_work(&ks->tx_check);
+}
+
 /**
  * ks8851_tx_work - process tx packet(s)
  * @work: The work strucutre what was scheduled.
@@ -814,7 +811,6 @@  static int ks8851_net_open(struct net_de
 	/* clear then enable interrupts */
 
 #define STD_IRQ (IRQ_LCI |	/* Link Change */	\
-		 IRQ_TXI |	/* TX done */		\
 		 IRQ_RXI |	/* RX done */		\
 		 IRQ_SPIBEI |	/* SPI bus error */	\
 		 IRQ_TXPSI |	/* TX process stop */	\
@@ -830,6 +826,7 @@  static int ks8851_net_open(struct net_de
 		ks_dbg(ks, "network device %s up\n", dev->name);
 
 	mutex_unlock(&ks->lock);
+	ks8851_write_mac_addr(dev);
 	return 0;
 }
 
@@ -854,6 +851,7 @@  static int ks8851_net_stop(struct net_de
 
 	/* stop any outstanding work */
 	flush_work(&ks->irq_work);
+	flush_work(&ks->tx_check);
 	flush_work(&ks->tx_work);
 	flush_work(&ks->rxctrl_work);
 
@@ -912,14 +910,16 @@  static netdev_tx_t ks8851_start_xmit(str
 
 	if (needed > ks->tx_space) {
 		netif_stop_queue(dev);
+		ks->tx_len = needed;
+		schedule_work(&ks->tx_check);
 		ret = NETDEV_TX_BUSY;
 	} else {
 		ks->tx_space -= needed;
 		skb_queue_tail(&ks->txq, skb);
+		schedule_work(&ks->tx_work);
 	}
 
 	spin_unlock(&ks->statelock);
-	schedule_work(&ks->tx_work);
 
 	return ret;
 }
@@ -1229,6 +1229,7 @@  static int __devinit ks8851_probe(struct
 	mutex_init(&ks->lock);
 	spin_lock_init(&ks->statelock);
 
+	INIT_WORK(&ks->tx_check, ks8851_tx_check);
 	INIT_WORK(&ks->tx_work, ks8851_tx_work);
 	INIT_WORK(&ks->irq_work, ks8851_irq_work);
 	INIT_WORK(&ks->rxctrl_work, ks8851_rxctrl_work);
@@ -1281,6 +1282,7 @@  static int __devinit ks8851_probe(struct
 
 	ks8851_read_selftest(ks);
 	ks8851_init_mac(ks);
+	ks->tx_space = ks8851_rdreg16(ks, KS_TXMIR);
 
 	ret = request_irq(spi->irq, ks8851_irq, IRQF_TRIGGER_LOW,
 			  ndev->name, ks);