| Message ID | 20101204141826.GA5830@Desktop-Junchang |
|---|---|
| State | Changes Requested, archived |
| Delegated to | David Miller |
On Saturday, 4 December 2010 at 22:18 +0800, Junchang Wang wrote:
> I added the prefetchw() in pktgen as follows:
>
> diff --git a/net/core/pktgen.c b/net/core/pktgen.c
> index 2953b2a..512f1ae 100644
> --- a/net/core/pktgen.c
> +++ b/net/core/pktgen.c
> @@ -2660,6 +2660,7 @@ static struct sk_buff *fill_packet_ipv4(struct net_device *odev,
>  		sprintf(pkt_dev->result, "No memory");
>  		return NULL;
>  	}
> +	prefetchw(skb->data);
>
>  	skb_reserve(skb, datalen);
>
> This time, I can check it without rebooting the system. The performance
> gain is 4%-5% (stable). Is a 4% gain worth submitting to the kernel?

Yes, I believe so, pktgen being very specific, but I have a few questions:

Is it with SLUB or SLAB?

How many buffers in the TX ring on your NIC (ethtool -g eth0)?

What is the datalen value here? (You prefetch, then advance skb->data.)

32 or 64-bit kernel?

How many pps do you get before and after the patch?

Thanks
On Saturday, 4 December 2010 at 15:47 +0100, Eric Dumazet wrote:
> On Saturday, 4 December 2010 at 22:18 +0800, Junchang Wang wrote:
> [...]

Also, don't forget to include the prefetchw() in fill_packet_ipv6() as well when submitting your patch.
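For reference, a minimal sketch of what the matching fill_packet_ipv6() hunk might look like, mirroring the ipv4 change above. The context lines and the elided @@ line numbers are assumptions, not taken from a posted patch:

```diff
--- a/net/core/pktgen.c
+++ b/net/core/pktgen.c
@@ ... @@ static struct sk_buff *fill_packet_ipv6(struct net_device *odev,
 		sprintf(pkt_dev->result, "No memory");
 		return NULL;
 	}
+	prefetchw(skb->data);
 
 	skb_reserve(skb, datalen);
```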
On Sat, Dec 04, 2010 at 03:47:38PM +0100, Eric Dumazet wrote:
>
> Yes, I believe so, pktgen being very specific, but I have a few questions:
>
> Is it with SLUB or SLAB?

I had read your discussion about "net: allocate skbs on local node" on the list, so SLUB was used.

BTW, what I observed is that the network subsystem scales well on NUMA systems equipped with a single processor (up to six cores), but performance didn't scale very well with two processors. I have noticed a number of discussions about this on the list. Are there any suggestions? I'd be happy to run tests.

> How many buffers in the TX ring on your NIC (ethtool -g eth0)?

Pre-set maximums:
RX:		4096
RX Mini:	0
RX Jumbo:	0
TX:		4096
Current hardware settings:
RX:		512
RX Mini:	0
RX Jumbo:	0
TX:		512

> What is the datalen value here? (You prefetch, then advance skb->data.)

16. But the following skb_push will pull skb->data back by 14 bytes.

> 32 or 64-bit kernel?

This is a CentOS 5.5 64-bit distribution with the latest net-next.

> How many pps do you get before and after the patch?

An Intel SR1625 server with two E5530 quad-core processors and a single ixgbe-based NIC.
Without prefetch: 8.63 Mpps
With prefetch:    9.03 Mpps
Improvement:      4.6%

Thanks.

--Junchang
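To make the arithmetic behind that answer explicit, here is a commented sketch of the allocation path with the new hint. The function shape and the allocation call are simplifications, and the 64-byte cache-line size is an assumption; only the pointer movement comes from the thread (datalen = 16, 14-byte Ethernet header pushed back):

```c
#include <linux/skbuff.h>
#include <linux/prefetch.h>

/* Simplified sketch of the fill_packet_ipv4() prologue, not pktgen's
 * actual code: it only tracks where skb->data points relative to the
 * address passed to prefetchw().
 */
static struct sk_buff *prefetch_sketch(unsigned int size, int datalen)
{
	struct sk_buff *skb = alloc_skb(size + 64 + datalen, GFP_NOWAIT);

	if (!skb)
		return NULL;

	prefetchw(skb->data);		/* warm the line we are about to write */
	skb_reserve(skb, datalen);	/* skb->data += 16 */

	/* pktgen then builds the frame, starting with the link header: */
	skb_push(skb, 14);		/* skb->data -= 14; net offset is +2 */

	/*
	 * The first stores land 2 bytes past the prefetched address, so with
	 * 64-byte lines and typical skb->data alignment they still hit the
	 * line that prefetchw() pulled in.
	 */
	return skb;
}
```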
On Sunday, 5 December 2010 at 18:56 +0800, Junchang Wang wrote:
> On Sat, Dec 04, 2010 at 03:47:38PM +0100, Eric Dumazet wrote:
> [...]
> > How many pps do you get before and after the patch?
>
> An Intel SR1625 server with two E5530 quad-core processors and a single
> ixgbe-based NIC.
> Without prefetch: 8.63 Mpps
> With prefetch: 9.03 Mpps
> Improvement: 4.6%

Thanks Junchang, please submit your pktgen patch with the two added prefetchw() calls, I'll Ack it :)
```diff
diff --git a/net/core/pktgen.c b/net/core/pktgen.c
index 2953b2a..512f1ae 100644
--- a/net/core/pktgen.c
+++ b/net/core/pktgen.c
@@ -2660,6 +2660,7 @@ static struct sk_buff *fill_packet_ipv4(struct net_device *odev,
 		sprintf(pkt_dev->result, "No memory");
 		return NULL;
 	}
+	prefetchw(skb->data);
 
 	skb_reserve(skb, datalen);
```