diff mbox

[net-next,06/13] bna: UDP RX Processing Improvements

Message ID 1386481724-32367-7-git-send-email-rmody@brocade.com
State Changes Requested, archived
Delegated to: David Miller
Headers show

Commit Message

Rasesh Mody Dec. 8, 2013, 5:48 a.m. UTC
Change Details:
 - Prefetch header in GRO path. This reduces napi_frags_skb time from 9% to 5%.
 - Changed the configurable limit of RxQ depth to 16384 (was 2048).
 - Rx unmap structure of 40 bytes is packed to 32 bytes reducing unmap_q size by 16K.
 - bnad_rx_unmap_q elements are cachealigned.

Signed-off-by: Rasesh Mody <rmody@brocade.com>
---
 drivers/net/ethernet/brocade/bna/bnad.c | 5 +++++
 drivers/net/ethernet/brocade/bna/bnad.h | 8 ++++----
 2 files changed, 9 insertions(+), 4 deletions(-)

Comments

David Miller Dec. 10, 2013, 12:17 a.m. UTC | #1
From: Rasesh Mody <rmody@brocade.com>
Date: Sat, 7 Dec 2013 21:48:37 -0800

> @@ -233,13 +233,13 @@ struct bnad_tx_unmap {
>  struct bnad_rx_vector {
>  	DEFINE_DMA_UNMAP_ADDR(dma_addr);
>  	u32			len;
> -};
> +} __packed;

This is absolutely unacceptable, and at best not even mentioned in your
commit message.

There is no reason whatsoever to mark this structure as __packed.

I'm not applying this patch series, sorry.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rasesh Mody Dec. 11, 2013, 12:05 a.m. UTC | #2
Hi Dave,

>From: David Miller [mailto:davem@davemloft.net]
>Sent: Monday, December 09, 2013 4:18 PM
>
>From: Rasesh Mody <rmody@brocade.com>
>Date: Sat, 7 Dec 2013 21:48:37 -0800
>
>> @@ -233,13 +233,13 @@ struct bnad_tx_unmap {  struct bnad_rx_vector {
>>  	DEFINE_DMA_UNMAP_ADDR(dma_addr);
>>  	u32			len;
>> -};
>> +} __packed;
>
>This is absolutely unacceptable, and at best not even mentioned in your
>commit message.
>
>There is no reason whatsoever to mark this structure as __packed.

By having the bnad_rx_unmap structure marked as __packed we were trying to
reduce the unmap_q size by 16K, __packed would ensure the RX unmap
structure of 40 bytes is packed to 32 bytes. This change was also captured
as a part of the commit message of patch #6 under change details.

We'll address your concern and resubmit the patches.

Thanks,
--Rasesh
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Miller Dec. 11, 2013, 2:11 a.m. UTC | #3
From: Rasesh Mody <rmody@brocade.com>
Date: Tue, 10 Dec 2013 16:05:46 -0800

> By having the bnad_rx_unmap structure marked as __packed we were trying to
> reduce the unmap_q size by 16K, __packed would ensure the RX unmap
> structure of 40 bytes is packed to 32 bytes. This change was also captured
> as a part of the commit message of patch #6 under change details.

Do you have any idea what the costs are on certain architectures
for adding this attribute?

When you mark something __packed, the compiler can no longer assume
anything about the alignment of any instance of that structure, nor
nor any structure which is contained within that type either.

So the compiler will, for example, emit byte by byte operations
to load multi-byte quantities on architectures where unaligned
data accesses can trap.

You should only use __packed as the extreme, very last resort method
to achieve a certain goal.  Frankly I think it should never be used
except in cases where layout is dictated by a network protocol or
hardware descriptor _AND_ the layout cannot be achieved using portable
fixed sized types.

--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/net/ethernet/brocade/bna/bnad.c b/drivers/net/ethernet/brocade/bna/bnad.c
index 33aeff2..849de5a 100644
--- a/drivers/net/ethernet/brocade/bna/bnad.c
+++ b/drivers/net/ethernet/brocade/bna/bnad.c
@@ -540,6 +540,11 @@  bnad_cq_setup_skb_frags(struct bna_rcb *rcb, struct sk_buff *skb,
 
 	unmap_q = rcb->unmap_q;
 	bnad = rcb->bnad;
+
+	/* prefetch header */
+	prefetch(page_address(unmap_q->unmap[sop_ci].page) +
+			unmap_q->unmap[sop_ci].page_offset);
+
 	for (vec = 1, ci = sop_ci; vec <= nvecs; vec++) {
 		unmap = &unmap_q->unmap[ci];
 		BNA_QE_INDX_INC(ci, rcb->q_depth);
diff --git a/drivers/net/ethernet/brocade/bna/bnad.h b/drivers/net/ethernet/brocade/bna/bnad.h
index 03de9cc..d1dc930 100644
--- a/drivers/net/ethernet/brocade/bna/bnad.h
+++ b/drivers/net/ethernet/brocade/bna/bnad.h
@@ -84,7 +84,7 @@  struct bnad_rx_ctrl {
 #define BNAD_IOCETH_TIMEOUT	     10000
 
 #define BNAD_MIN_Q_DEPTH		512
-#define BNAD_MAX_RXQ_DEPTH		2048
+#define BNAD_MAX_RXQ_DEPTH		16384
 #define BNAD_MAX_TXQ_DEPTH		2048
 
 #define BNAD_JUMBO_MTU			9000
@@ -233,13 +233,13 @@  struct bnad_tx_unmap {
 struct bnad_rx_vector {
 	DEFINE_DMA_UNMAP_ADDR(dma_addr);
 	u32			len;
-};
+} __packed;
 
 struct bnad_rx_unmap {
 	struct page		*page;
-	u32			page_offset;
 	struct sk_buff		*skb;
 	struct bnad_rx_vector	vector;
+	u32			page_offset;
 };
 
 enum bnad_rxbuf_type {
@@ -257,7 +257,7 @@  struct bnad_rx_unmap_q {
 	int			alloc_order;
 	u32			map_size;
 	enum bnad_rxbuf_type	type;
-	struct bnad_rx_unmap	unmap[0];
+	struct bnad_rx_unmap	unmap[0] ____cacheline_aligned;
 };
 
 #define BNAD_PCI_DEV_IS_CAT2(_bnad) \