From patchwork Fri Nov 28 12:39:17 2014
X-Patchwork-Submitter: Peter Lieven
X-Patchwork-Id: 415853
Message-ID: <54786CF5.2060705@kamp.de>
Date: Fri, 28 Nov 2014 13:39:17 +0100
From: Peter Lieven <pl@kamp.de>
To: Paolo Bonzini, ming.lei@canonical.com, Kevin Wolf, Stefan Hajnoczi, qemu-devel@nongnu.org, Markus Armbruster
In-Reply-To: <547869DE.3080907@redhat.com>
Subject: Re: [Qemu-devel] [RFC PATCH 3/3] qemu-coroutine: use a ring per thread for the pool

On 28.11.2014 13:26, Paolo Bonzini wrote:
>
> On 28/11/2014 12:46, Peter Lieven wrote:
>>> I get:
>>> Run operation 40000000 iterations 9.883958 s, 4046K operations/s, 247ns per coroutine
>> Ok, understood, it "steals" the whole pool, right? Isn't that bad if we have more
>> than one thread in need of a lot of coroutines?
> Overall the algorithm is expected to adapt. The N threads contribute to
> the global release pool, so the pool will fill up N times faster than if
> you had only one thread.
> There can be some variance, which is why the
> maximum size of the pool is twice the threshold (and probably could be
> tuned better).
>
> Benchmarks are needed on real I/O too, of course, especially with high
> queue depth.

Yes, cool. The atomic operations are a bit tricky at first glance ;-)

Question: Why is the pool_size increment atomic, but the reset to zero not?

Idea: If the release_pool is full, why not put the coroutine into the thread's
alloc_pool instead of throwing it away? :-)

Run operation 40000000 iterations 9.057805 s, 4416K operations/s, 226ns per coroutine

Bug?: The release_pool is not cleaned up on termination, I think.

Peter

Signed-off-by: Peter Lieven <pl@kamp.de>

diff --git a/qemu-coroutine.c b/qemu-coroutine.c
index 6bee354..edea162 100644
--- a/qemu-coroutine.c
+++ b/qemu-coroutine.c
@@ -25,8 +25,9 @@ enum {

 /** Free list to speed up creation */
 static QSLIST_HEAD(, Coroutine) release_pool = QSLIST_HEAD_INITIALIZER(pool);
-static unsigned int pool_size;
+static unsigned int release_pool_size;
 static __thread QSLIST_HEAD(, Coroutine) alloc_pool = QSLIST_HEAD_INITIALIZER(pool);
+static __thread unsigned int alloc_pool_size;

 /* The GPrivate is only used to invoke coroutine_pool_cleanup. */
 static void coroutine_pool_cleanup(void *value);
@@ -39,12 +40,12 @@ Coroutine *qemu_coroutine_create(CoroutineEntry *entry)
     if (CONFIG_COROUTINE_POOL) {
         co = QSLIST_FIRST(&alloc_pool);
         if (!co) {
-            if (pool_size > POOL_BATCH_SIZE) {
-                /* This is not exact; there could be a little skew between pool_size
+            if (release_pool_size > POOL_BATCH_SIZE) {
+                /* This is not exact; there could be a little skew between release_pool_size
                  * and the actual size of alloc_pool. But it is just a heuristic,
                  * it does not need to be perfect.
                  */
-                pool_size = 0;
+                alloc_pool_size = atomic_fetch_and(&release_pool_size, 0);
                 QSLIST_MOVE_ATOMIC(&alloc_pool, &release_pool);
                 co = QSLIST_FIRST(&alloc_pool);
@@ -53,6 +54,8 @@ Coroutine *qemu_coroutine_create(CoroutineEntry *entry)
                  */
                 g_private_set(&dummy_key, &dummy_key);
             }
+        } else {
+            alloc_pool_size--;
         }
         if (co) {
             QSLIST_REMOVE_HEAD(&alloc_pool, pool_next);
@@ -71,10 +74,15 @@ Coroutine *qemu_coroutine_create(CoroutineEntry *entry)
 static void coroutine_delete(Coroutine *co)
 {
     if (CONFIG_COROUTINE_POOL) {
-        if (pool_size < POOL_BATCH_SIZE * 2) {
+        if (release_pool_size < POOL_BATCH_SIZE * 2) {
             co->caller = NULL;
             QSLIST_INSERT_HEAD_ATOMIC(&release_pool, co, pool_next);
-            atomic_inc(&pool_size);
+            atomic_inc(&release_pool_size);
+            return;
+        } else if (alloc_pool_size < POOL_BATCH_SIZE) {
+            co->caller = NULL;
+            QSLIST_INSERT_HEAD(&alloc_pool, co, pool_next);
+            alloc_pool_size++;
             return;
         }
     }
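
P.S.: For readers outside the QEMU tree, here is a standalone sketch of the two-level pool pattern the patch implements, written with C11 atomics instead of QEMU's QSLIST/atomic primitives. All names are invented for illustration, and the steal threshold is simplified (the real patch only steals once more than POOL_BATCH_SIZE elements have accumulated) — this is not QEMU code, just the shape of the idea: a shared lock-free release pool that any thread frees into, a per-thread alloc pool, and a bulk "steal" that atomically exchanges the shared size counter with zero (which is what atomic_fetch_and(&release_pool_size, 0) does in the patch, answering the "Question" above).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

#define POOL_BATCH_SIZE 4

struct node { struct node *next; };

/* Global release pool: a lock-free Treiber stack + approximate size counter. */
static _Atomic(struct node *) release_pool;
static atomic_uint release_pool_size;

/* Per-thread alloc pool: plain variables, touched only by the owning thread. */
static _Thread_local struct node *alloc_pool;
static _Thread_local unsigned alloc_pool_size;

static void pool_free(struct node *n)
{
    if (atomic_load(&release_pool_size) < POOL_BATCH_SIZE * 2) {
        /* Push onto the shared pool without a lock. */
        n->next = atomic_load(&release_pool);
        while (!atomic_compare_exchange_weak(&release_pool, &n->next, n)) { }
        atomic_fetch_add(&release_pool_size, 1);
    } else if (alloc_pool_size < POOL_BATCH_SIZE) {
        /* Shared pool full: keep the element locally (the "Idea" above). */
        n->next = alloc_pool;
        alloc_pool = n;
        alloc_pool_size++;
    } else {
        free(n);
    }
}

static struct node *pool_alloc(void)
{
    if (!alloc_pool && atomic_load(&release_pool_size) > 0) {
        /*
         * Steal the whole shared pool at once. The exchange-to-zero must be
         * atomic because other threads may increment the counter
         * concurrently; a plain "release_pool_size = 0" could lose their
         * updates. The counter can still be slightly skewed relative to the
         * list (two separate atomics), which is why it is only a heuristic.
         */
        alloc_pool_size = atomic_exchange(&release_pool_size, 0);
        alloc_pool = atomic_exchange(&release_pool, NULL);
    }
    if (alloc_pool) {
        struct node *n = alloc_pool;
        alloc_pool = n->next;
        if (alloc_pool_size > 0) {
            alloc_pool_size--;
        }
        return n;
    }
    return malloc(sizeof(struct node));  /* pools empty: allocate fresh */
}
```

Compile with any C11 compiler (e.g. gcc -std=c11). A free followed by an alloc on the same thread round-trips through the shared pool: the free pushes and bumps the counter, the alloc steals the whole pool and pops the element back.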