diff mbox

[6/6] r8169: print errors when dma mapping fail

Message ID 1287144922-3297-6-git-send-email-sgruszka@redhat.com
State Changes Requested, archived
Delegated to: David Miller
Headers show

Commit Message

Stanislaw Gruszka Oct. 15, 2010, 12:15 p.m. UTC
Print errors because dma mapping failures can cause device to stop
working and will need user intervention to recover.

Signed-off-by: Stanislaw Gruszka <sgruszka@redhat.com>
---
 drivers/net/r8169.c |   13 ++++++++++---
 1 files changed, 10 insertions(+), 3 deletions(-)

Comments

Francois Romieu Oct. 15, 2010, 2:52 p.m. UTC | #1
Stanislaw Gruszka <sgruszka@redhat.com> :
> Print errors because dma mapping failures can cause device to stop
> working and will need user intervention to recover.

I am hesitating (overengineered ? bloaty ? not the right place ?).

The Tx stats are kept up-to-date : Tx failure will go along a Tx drop
stat increase.

Regarding a mapping failure in the Rx path, either it will behave as
an allocation failure at open / resume time - and I have no idea how
the user will recover - or it will happen during a Rx ring refill.

...

Ok, something could have gone unnoticed. Let's try it.
Stanislaw Gruszka Oct. 15, 2010, 3:59 p.m. UTC | #2
On Fri, Oct 15, 2010 at 04:52:01PM +0200, Francois Romieu wrote:
> Stanislaw Gruszka <sgruszka@redhat.com> :
> > Print errors because dma mapping failures can cause device to stop
> > working and will need user intervention to recover.
> 
> I am hesitating (overengineered ? bloaty ? not the right place ?).

As someone who seen lot's of bug reports like "my network device stops
working, nothing in dmesg", or like "my network device stops working,
there is NETDEV WATCHDOG: eth0 (r8169): transmit queue 0 timed out in
dmesg" (what is nothing but useful information), I do no think this is
overengineered or bloaty. I could agree for "not the right place", but
even if the error would be reported by upper layers, exact reason of
the problem will be unknown. Regarding lower layers, I don't think iommu
or other dma code print warning with calltrace in case of failure.

> The Tx stats are kept up-to-date : Tx failure will go along a Tx drop
> stat increase.

In current implementation, I stop tx queue on dma errors, if that
happens the queue can never be started again. I will probably change
that as you suggest not returning NETDEV_TX_BUSY, stopping the queue
is also wrong. But I would like to keep this error messages, perhaps
after adding net_ratelimit() check.
 
> Regarding a mapping failure in the Rx path, either it will behave as
> an allocation failure at open / resume time -

Still it's worth to know exact reason of failure.

> and I have no idea how
> the user will recover - or it will happen during a Rx ring refill.

ifconfig eth0 down/up or reloading module

Stanislaw
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/net/r8169.c b/drivers/net/r8169.c
index 0ef49b4..b27b989 100644
--- a/drivers/net/r8169.c
+++ b/drivers/net/r8169.c
@@ -4022,8 +4022,10 @@  static struct sk_buff *rtl8169_alloc_rx_skb(struct rtl8169_private *tp,
 	skb_reserve(skb, align ? ((pad - 1) & (unsigned long)skb->data) : pad);
 
 	mapping = dma_map_single(d, skb->data, rx_buf_sz, DMA_FROM_DEVICE);
-	if (dma_mapping_error(d, mapping))
+	if (dma_mapping_error(d, mapping)) {
+		netif_err(tp, drv, tp->dev, "Failed to map RX DMA!\n");
 		goto err_free_skb_1;
+	}
 
 	rtl8169_map_to_asic(desc, mapping, rx_buf_sz);
 
@@ -4259,8 +4261,11 @@  static int rtl8169_xmit_frags(struct rtl8169_private *tp, struct sk_buff *skb,
 		len = frag->size;
 		addr = ((void *) page_address(frag->page)) + frag->page_offset;
 		mapping = dma_map_single(d, addr, len, DMA_TO_DEVICE);
-		if (unlikely(dma_mapping_error(d, mapping)))
+		if (unlikely(dma_mapping_error(d, mapping))) {
+			netif_err(tp, drv, tp->dev,
+				  "Failed to map TX fragments DMA!\n");
 			goto err_out;
+		}
 
 		/* anti gcc 2.95.3 bugware (sic) */
 		status = opts1 | len | (RingEnd * !((entry + 1) % NUM_TX_DESC));
@@ -4326,8 +4331,10 @@  static netdev_tx_t rtl8169_start_xmit(struct sk_buff *skb,
 
 	len = skb_headlen(skb);
 	mapping = dma_map_single(d, skb->data, len, DMA_TO_DEVICE);
-	if (unlikely(dma_mapping_error(d, mapping)))
+	if (unlikely(dma_mapping_error(d, mapping))) {
+		netif_err(tp, drv, dev, "Failed to map TX DMA!\n");
 		goto err_stop_0;
+	}
 
 	tp->tx_skb[entry].len = len;
 	txd->addr = cpu_to_le64(mapping);