diff mbox

[v2,net-next,5/7] xen-netback: process guest rx packets in batches

Message ID 1475573358-32414-6-git-send-email-paul.durrant@citrix.com
State Accepted, archived
Delegated to: David Miller
Headers show

Commit Message

Paul Durrant Oct. 4, 2016, 9:29 a.m. UTC
From: David Vrabel <david.vrabel@citrix.com>

Instead of only placing one skb on the guest rx ring at a time, process
a batch of up-to 64.  This improves performance by ~10% in some tests.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
[re-based]
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
---
Cc: Wei Liu <wei.liu2@citrix.com>
---
 drivers/net/xen-netback/rx.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

Comments

Konrad Rzeszutek Wilk Oct. 4, 2016, 12:47 p.m. UTC | #1
On Tue, Oct 04, 2016 at 10:29:16AM +0100, Paul Durrant wrote:
> From: David Vrabel <david.vrabel@citrix.com>
> 
> Instead of only placing one skb on the guest rx ring at a time, process
> a batch of up-to 64.  This improves performance by ~10% in some tests.

And does it regress latency workloads?

What are those 'some tests' you speak off?

Thanks.
> 
> Signed-off-by: David Vrabel <david.vrabel@citrix.com>
> [re-based]
> Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> ---
> Cc: Wei Liu <wei.liu2@citrix.com>
> ---
>  drivers/net/xen-netback/rx.c | 15 ++++++++++++++-
>  1 file changed, 14 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/xen-netback/rx.c b/drivers/net/xen-netback/rx.c
> index 9548709..ae822b8 100644
> --- a/drivers/net/xen-netback/rx.c
> +++ b/drivers/net/xen-netback/rx.c
> @@ -399,7 +399,7 @@ static void xenvif_rx_extra_slot(struct xenvif_queue *queue,
>  	BUG();
>  }
>  
> -void xenvif_rx_action(struct xenvif_queue *queue)
> +void xenvif_rx_skb(struct xenvif_queue *queue)
>  {
>  	struct xenvif_pkt_state pkt;
>  
> @@ -425,6 +425,19 @@ void xenvif_rx_action(struct xenvif_queue *queue)
>  	xenvif_rx_complete(queue, &pkt);
>  }
>  
> +#define RX_BATCH_SIZE 64
> +
> +void xenvif_rx_action(struct xenvif_queue *queue)
> +{
> +	unsigned int work_done = 0;
> +
> +	while (xenvif_rx_ring_slots_available(queue) &&
> +	       work_done < RX_BATCH_SIZE) {
> +		xenvif_rx_skb(queue);
> +		work_done++;
> +	}
> +}
> +
>  static bool xenvif_rx_queue_stalled(struct xenvif_queue *queue)
>  {
>  	RING_IDX prod, cons;
> -- 
> 2.1.4
> 
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@lists.xen.org
> https://lists.xen.org/xen-devel
Paul Durrant Oct. 4, 2016, 2:02 p.m. UTC | #2
> -----Original Message-----
> From: Konrad Rzeszutek Wilk [mailto:konrad.wilk@oracle.com]
> Sent: 04 October 2016 13:48
> To: Paul Durrant <Paul.Durrant@citrix.com>
> Cc: netdev@vger.kernel.org; xen-devel@lists.xenproject.org; Wei Liu
> <wei.liu2@citrix.com>; David Vrabel <david.vrabel@citrix.com>
> Subject: Re: [Xen-devel] [PATCH v2 net-next 5/7] xen-netback: process
> guest rx packets in batches
> 
> On Tue, Oct 04, 2016 at 10:29:16AM +0100, Paul Durrant wrote:
> > From: David Vrabel <david.vrabel@citrix.com>
> >
> > Instead of only placing one skb on the guest rx ring at a time,
> > process a batch of up-to 64.  This improves performance by ~10% in some
> tests.

I believe the tests are mainly throughput tests, but David would know the specifics.

> 
> And does it regress latency workloads?
> 

It shouldn't, although I have not run ping-pong tests to verify. If packets are only placed on the vif queue singly though then the batching should have no effect, since rx_action will complete and do the push as before.

  Paul

> What are those 'some tests' you speak off?
> 
> Thanks.
> >
> > Signed-off-by: David Vrabel <david.vrabel@citrix.com> [re-based]
> > Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
> > ---
> > Cc: Wei Liu <wei.liu2@citrix.com>
> > ---
> >  drivers/net/xen-netback/rx.c | 15 ++++++++++++++-
> >  1 file changed, 14 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/net/xen-netback/rx.c
> > b/drivers/net/xen-netback/rx.c index 9548709..ae822b8 100644
> > --- a/drivers/net/xen-netback/rx.c
> > +++ b/drivers/net/xen-netback/rx.c
> > @@ -399,7 +399,7 @@ static void xenvif_rx_extra_slot(struct
> xenvif_queue *queue,
> >  	BUG();
> >  }
> >
> > -void xenvif_rx_action(struct xenvif_queue *queue)
> > +void xenvif_rx_skb(struct xenvif_queue *queue)
> >  {
> >  	struct xenvif_pkt_state pkt;
> >
> > @@ -425,6 +425,19 @@ void xenvif_rx_action(struct xenvif_queue
> *queue)
> >  	xenvif_rx_complete(queue, &pkt);
> >  }
> >
> > +#define RX_BATCH_SIZE 64
> > +
> > +void xenvif_rx_action(struct xenvif_queue *queue) {
> > +	unsigned int work_done = 0;
> > +
> > +	while (xenvif_rx_ring_slots_available(queue) &&
> > +	       work_done < RX_BATCH_SIZE) {
> > +		xenvif_rx_skb(queue);
> > +		work_done++;
> > +	}
> > +}
> > +
> >  static bool xenvif_rx_queue_stalled(struct xenvif_queue *queue)  {
> >  	RING_IDX prod, cons;
> > --
> > 2.1.4
> >
> >
> > _______________________________________________
> > Xen-devel mailing list
> > Xen-devel@lists.xen.org
> > https://lists.xen.org/xen-devel
David Vrabel Oct. 4, 2016, 2:51 p.m. UTC | #3
On 04/10/16 13:47, Konrad Rzeszutek Wilk wrote:
> On Tue, Oct 04, 2016 at 10:29:16AM +0100, Paul Durrant wrote:
>> From: David Vrabel <david.vrabel@citrix.com>
>>
>> Instead of only placing one skb on the guest rx ring at a time, process
>> a batch of up-to 64.  This improves performance by ~10% in some tests.
> 
> And does it regress latency workloads?

No. Because the loop outside of these batches is only checking for a
fatal error condition or a disconnection.

> What are those 'some tests' you speak off?

I think it was aggregate intrahost, but I don't remember exactly.

David
diff mbox

Patch

diff --git a/drivers/net/xen-netback/rx.c b/drivers/net/xen-netback/rx.c
index 9548709..ae822b8 100644
--- a/drivers/net/xen-netback/rx.c
+++ b/drivers/net/xen-netback/rx.c
@@ -399,7 +399,7 @@  static void xenvif_rx_extra_slot(struct xenvif_queue *queue,
 	BUG();
 }
 
-void xenvif_rx_action(struct xenvif_queue *queue)
+void xenvif_rx_skb(struct xenvif_queue *queue)
 {
 	struct xenvif_pkt_state pkt;
 
@@ -425,6 +425,19 @@  void xenvif_rx_action(struct xenvif_queue *queue)
 	xenvif_rx_complete(queue, &pkt);
 }
 
+#define RX_BATCH_SIZE 64
+
+void xenvif_rx_action(struct xenvif_queue *queue)
+{
+	unsigned int work_done = 0;
+
+	while (xenvif_rx_ring_slots_available(queue) &&
+	       work_done < RX_BATCH_SIZE) {
+		xenvif_rx_skb(queue);
+		work_done++;
+	}
+}
+
 static bool xenvif_rx_queue_stalled(struct xenvif_queue *queue)
 {
 	RING_IDX prod, cons;