From patchwork Tue Oct 18 10:47:22 2016
X-Patchwork-Submitter: Peter Lieven
X-Patchwork-Id: 683619
To: "Michael R. Hines" , qemu-devel@nongnu.org
From: Peter Lieven
Message-ID: <73ad59dc-cc88-96ee-2867-9fb628823782@kamp.de>
In-Reply-To: <087cb5a1-1aa4-dac0-83ab-0b9d8024b88d@digitalocean.com>
Date: Tue, 18 Oct 2016 12:47:22 +0200
Subject: Re: [Qemu-devel] [PATCH 00/15] optimize Qemu RSS usage
Cc: kwolf@redhat.com, peter.maydell@linaro.org, patrick@digitalocean.com, mst@redhat.com, blemasurier@digitalocean.com, dgilbert@redhat.com, mreitz@redhat.com, kraxel@redhat.com, pbonzini@redhat.com

On 12.10.2016 at 23:18, Michael R. Hines wrote:
> Peter,
>
> Greetings from DigitalOcean. We're experiencing the same symptoms without this patch.
> We have, collectively, many gigabytes of unplanned-for RSS being used per hypervisor
> that we would like to get rid of =).
>
> Without explicitly trying this patch (will do that ASAP), we immediately noticed that the
> 192MB mentioned immediately melts away (Yay) when we disabled the coroutine thread pool explicitly,
> with another ~100MB in additional stack usage that would likely also go away if we
> applied the entirety of your patch.
>
> Is there any chance you have revisited this or have a timeline for it?

Hi Michael,

the current master already includes some of the patches from this original series. There are still some changes left, but what works for me is the current master + the patch below + invoking qemu with the following environment variable set:

MALLOC_MMAP_THRESHOLD_=32768 qemu-system-x86_64 ....

The last one makes glibc automatically use mmap when a malloc'ed allocation exceeds 32 kByte.

Hope this helps,
Peter

diff --git a/util/qemu-coroutine.c b/util/qemu-coroutine.c
index 5816702..3eaef68 100644
--- a/util/qemu-coroutine.c
+++ b/util/qemu-coroutine.c
@@ -25,8 +25,6 @@ enum {
 };
 
 /** Free list to speed up creation */
-static QSLIST_HEAD(, Coroutine) release_pool = QSLIST_HEAD_INITIALIZER(pool);
-static unsigned int release_pool_size;
 static __thread QSLIST_HEAD(, Coroutine) alloc_pool = QSLIST_HEAD_INITIALIZER(pool);
 static __thread unsigned int alloc_pool_size;
 static __thread Notifier coroutine_pool_cleanup_notifier;
@@ -49,20 +47,10 @@ Coroutine *qemu_coroutine_create(CoroutineEntry *entry)
     if (CONFIG_COROUTINE_POOL) {
         co = QSLIST_FIRST(&alloc_pool);
         if (!co) {
-            if (release_pool_size > POOL_BATCH_SIZE) {
-                /* Slow path; a good place to register the destructor, too. */
-                if (!coroutine_pool_cleanup_notifier.notify) {
-                    coroutine_pool_cleanup_notifier.notify = coroutine_pool_cleanup;
-                    qemu_thread_atexit_add(&coroutine_pool_cleanup_notifier);
-                }
-
-                /* This is not exact; there could be a little skew between
-                 * release_pool_size and the actual size of release_pool. But
-                 * it is just a heuristic, it does not need to be perfect.
-                 */
-                alloc_pool_size = atomic_xchg(&release_pool_size, 0);
-                QSLIST_MOVE_ATOMIC(&alloc_pool, &release_pool);
-                co = QSLIST_FIRST(&alloc_pool);
+            /* Slow path; a good place to register the destructor, too. */
+            if (!coroutine_pool_cleanup_notifier.notify) {
+                coroutine_pool_cleanup_notifier.notify = coroutine_pool_cleanup;
+                qemu_thread_atexit_add(&coroutine_pool_cleanup_notifier);
             }
         }
         if (co) {
@@ -85,11 +73,6 @@ static void coroutine_delete(Coroutine *co)
     co->caller = NULL;
 
     if (CONFIG_COROUTINE_POOL) {
-        if (release_pool_size < POOL_BATCH_SIZE * 2) {
-            QSLIST_INSERT_HEAD_ATOMIC(&release_pool, co, pool_next);
-            atomic_inc(&release_pool_size);
-            return;
-        }
         if (alloc_pool_size < POOL_BATCH_SIZE) {
             QSLIST_INSERT_HEAD(&alloc_pool, co, pool_next);
             alloc_pool_size++;