From patchwork Wed Oct 27 21:58:13 2010
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Shirley Ma
X-Patchwork-Id: 69417
X-Patchwork-Delegate: davem@davemloft.net
Return-Path:
X-Original-To: patchwork-incoming@ozlabs.org
Delivered-To: patchwork-incoming@ozlabs.org
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by ozlabs.org (Postfix) with ESMTP id 31D78B70D5
	for ; Thu, 28 Oct 2010 08:58:44 +1100 (EST)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1755551Ab0J0V6U (ORCPT ); Wed, 27 Oct 2010 17:58:20 -0400
Received: from e1.ny.us.ibm.com ([32.97.182.141]:33585 "EHLO e1.ny.us.ibm.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1755305Ab0J0V6S (ORCPT ); Wed, 27 Oct 2010 17:58:18 -0400
Received: from d01relay01.pok.ibm.com (d01relay01.pok.ibm.com [9.56.227.233])
	by e1.ny.us.ibm.com (8.14.4/8.13.1) with ESMTP id o9RLour0009481;
	Wed, 27 Oct 2010 17:50:56 -0400
Received: from d01av02.pok.ibm.com (d01av02.pok.ibm.com [9.56.224.216])
	by d01relay01.pok.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP
	id o9RLwHig365000; Wed, 27 Oct 2010 17:58:17 -0400
Received: from d01av02.pok.ibm.com (loopback [127.0.0.1])
	by d01av02.pok.ibm.com (8.14.4/8.13.1/NCO v10.0 AVout) with ESMTP
	id o9RLwHa8026065; Wed, 27 Oct 2010 19:58:17 -0200
Received: from [9.47.28.66] (localhost-009047028066.beaverton.ibm.com [9.47.28.66])
	by d01av02.pok.ibm.com (8.14.4/8.13.1/NCO v10.0 AVin) with ESMTP
	id o9RLwGwg025924; Wed, 27 Oct 2010 19:58:16 -0200
Subject: [RFC PATCH 1/1] vhost: TX used buffer guest signal accumulation
From: Shirley Ma
To: "mst@redhat.com"
Cc: David Miller, netdev@vger.kernel.org, kvm@vger.kernel.org,
	linux-kernel@vger.kernel.org
Date: Wed, 27 Oct 2010 14:58:13 -0700
Message-ID: <1288216693.17571.38.camel@localhost.localdomain>
Mime-Version: 1.0
X-Mailer: Evolution 2.28.3 (2.28.3-1.fc12)
Sender: netdev-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: netdev@vger.kernel.org

This patch changes vhost TX used buffer guest signaling from one signal
per buffer to one signal per 3/4 of the vring size. The change improves
both bandwidth and CPU utilization of vhost TX for message sizes from 256
to 8K without inducing any regression.

Signed-off-by: Shirley Ma
---
 drivers/vhost/net.c   |   20 +++++++++++++++++++-
 drivers/vhost/vhost.c |   31 +++++++++++++++++++++++++++++++
 drivers/vhost/vhost.h |    3 +++
 3 files changed, 53 insertions(+), 1 deletions(-)

diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 4b4da5b..45e07cd 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -128,6 +128,7 @@ static void handle_tx(struct vhost_net *net)
 	int err, wmem;
 	size_t hdr_size;
 	struct socket *sock;
+	int max_pend = vq->num - (vq->num >> 2);
 
 	sock = rcu_dereference_check(vq->private_data,
 				     lockdep_is_held(&vq->mutex));
@@ -198,7 +199,24 @@ static void handle_tx(struct vhost_net *net)
 		if (err != len)
 			pr_debug("Truncated TX packet: "
 				 " len %d != %zd\n", err, len);
-		vhost_add_used_and_signal(&net->dev, vq, head, 0);
+		/*
+		 * If no pending buffer array is allocated, signal used
+		 * buffers one by one; otherwise, signal used buffers on
+		 * reaching 3/4 of the ring size to reduce CPU utilization.
+		 */
+		if (unlikely(!vq->pend))
+			vhost_add_used_and_signal(&net->dev, vq, head, 0);
+		else {
+			vq->pend[vq->num_pend].id = head;
+			vq->pend[vq->num_pend].len = 0;
+			++vq->num_pend;
+			if (vq->num_pend == max_pend) {
+				vhost_add_used_and_signal_n(&net->dev, vq,
+							    vq->pend,
+							    vq->num_pend);
+				vq->num_pend = 0;
+			}
+		}
 		total_len += len;
 		if (unlikely(total_len >= VHOST_NET_WEIGHT)) {
 			vhost_poll_queue(&vq->poll);
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 94701ff..9486a25 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -170,6 +170,16 @@ static void vhost_vq_reset(struct vhost_dev *dev,
 	vq->call_ctx = NULL;
 	vq->call = NULL;
 	vq->log_ctx = NULL;
+	/* signal pending used buffers */
+	if (vq->pend) {
+		if (vq->num_pend != 0) {
+			vhost_add_used_and_signal_n(dev, vq, vq->pend,
+						    vq->num_pend);
+			vq->num_pend = 0;
+		}
+		kfree(vq->pend);
+	}
+	vq->pend = NULL;
 }
 
 static int vhost_worker(void *data)
@@ -273,7 +283,13 @@ long vhost_dev_init(struct vhost_dev *dev,
 		dev->vqs[i].heads = NULL;
 		dev->vqs[i].dev = dev;
 		mutex_init(&dev->vqs[i].mutex);
+		dev->vqs[i].num_pend = 0;
+		dev->vqs[i].pend = NULL;
 		vhost_vq_reset(dev, dev->vqs + i);
+		/* signal 3/4 of ring size used buffers */
+		dev->vqs[i].pend = kmalloc((dev->vqs[i].num -
+					    (dev->vqs[i].num >> 2)) *
+					   sizeof *dev->vqs[i].pend, GFP_KERNEL);
 		if (dev->vqs[i].handle_kick)
 			vhost_poll_init(&dev->vqs[i].poll,
 					dev->vqs[i].handle_kick, POLLIN, dev);
@@ -599,6 +615,21 @@ static long vhost_set_vring(struct vhost_dev *d, int ioctl, void __user *argp)
 			r = -EINVAL;
 			break;
 		}
+		if (vq->num != s.num) {
+			/* signal used buffers first */
+			if (vq->pend) {
+				if (vq->num_pend != 0) {
+					vhost_add_used_and_signal_n(vq->dev, vq,
+								    vq->pend,
+								    vq->num_pend);
+					vq->num_pend = 0;
+				}
+				kfree(vq->pend);
+			}
+			/* realloc pending used buffers size */
+			vq->pend = kmalloc((s.num - (s.num >> 2)) *
+					   sizeof *vq->pend, GFP_KERNEL);
+		}
 		vq->num = s.num;
 		break;
 	case VHOST_SET_VRING_BASE:
diff --git a/drivers/vhost/vhost.h b/drivers/vhost/vhost.h
index 073d06a..78949c0 100644
--- a/drivers/vhost/vhost.h
+++ b/drivers/vhost/vhost.h
@@ -108,6 +108,9 @@ struct vhost_virtqueue {
 	/* Log write descriptors */
 	void __user *log_base;
 	struct vhost_log *log;
+	/* delay multiple used buffers to signal once */
+	int num_pend;
+	struct vring_used_elem *pend;
 };
 
 struct vhost_dev {
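
For readers who want to reason about the accumulation logic outside the
kernel, the following is a minimal user-space sketch of the idea, not the
vhost code itself. It assumes a hypothetical struct tx_batch standing in for
the num_pend/pend fields of struct vhost_virtqueue, and it only counts
signals instead of calling vhost_add_used_and_signal_n(). All tx_batch_*
names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct vring_used_elem. */
struct used_elem {
	unsigned int id;
	unsigned int len;
};

/* Hypothetical model of the per-virtqueue batching state. */
struct tx_batch {
	unsigned int threshold;  /* ring_size - ring_size / 4, like max_pend */
	unsigned int num_pend;   /* models vq->num_pend */
	struct used_elem *pend;  /* models vq->pend; NULL disables batching */
	unsigned int signals;    /* guest notifications issued so far */
	unsigned int reported;   /* used buffers reported to the guest so far */
};

/* Allocate room for 3/4 of the ring; returns 0 on success, -1 on failure. */
static int tx_batch_init(struct tx_batch *b, unsigned int ring_size)
{
	b->threshold = ring_size - (ring_size >> 2);
	b->num_pend = 0;
	b->signals = 0;
	b->reported = 0;
	b->pend = b->threshold ? calloc(b->threshold, sizeof *b->pend) : NULL;
	return b->pend ? 0 : -1;
}

/* Report all pending buffers with a single signal. */
static void tx_batch_flush(struct tx_batch *b)
{
	if (!b->pend || b->num_pend == 0)
		return;
	b->reported += b->num_pend;
	b->signals++;
	b->num_pend = 0;
}

/* Record one completed TX buffer; signal only when the batch is full. */
static void tx_batch_complete(struct tx_batch *b, unsigned int head)
{
	if (!b->pend) {
		/* No batch array: fall back to one signal per buffer. */
		b->reported++;
		b->signals++;
		return;
	}
	assert(b->num_pend < b->threshold);
	b->pend[b->num_pend].id = head;
	b->pend[b->num_pend].len = 0;
	if (++b->num_pend == b->threshold)
		tx_batch_flush(b);
}

/* Flush whatever is still pending, then release the batch array. */
static void tx_batch_destroy(struct tx_batch *b)
{
	tx_batch_flush(b);
	free(b->pend);
	b->pend = NULL;
}
```

With a ring of 256 entries the threshold is 192, so 200 completions yield
one signal for the first 192 buffers and leave 8 pending until the next
flush. The sketch makes one trade-off easy to see: buffers below the
threshold are not reported until a flush happens, so a real implementation
needs a flush on every path where the queue goes idle or is reconfigured.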