From patchwork Tue Mar 18 15:56:55 2014
X-Patchwork-Submitter: "Dr. David Alan Gilbert"
X-Patchwork-Id: 331466
From: "Dr. David Alan Gilbert (git)"
To: qemu-devel@nongnu.org
Cc: arei.gonglei@huawei.com, armbru@redhat.com, quintela@redhat.com
Date: Tue, 18 Mar 2014 15:56:55 +0000
Message-Id: <1395158215-20488-1-git-send-email-dgilbert@redhat.com>
Subject: [Qemu-devel] [PATCH 1/1] Stop reinit of XBZRLE.lock

From: "Dr. David Alan Gilbert"

Markus Armbruster spotted that XBZRLE.lock might get initialised multiple
times in the case of a second attempted migration, and that's undefined
behaviour for pthread_mutex_init.

This patch adds a flag to stop re-initialisation; finding somewhere to
cleanly destroy the mutex, where we can guarantee the destruction isn't
happening at the same time another thread might want to take the lock, is
tricky.  It also adds comments to explain the lock use in more detail.

(I considered rewriting this lock-free, but can't justify it given the
simplicity of the fix.)

Signed-off-by: Dr. David Alan Gilbert
Reviewed-by: Markus Armbruster
---
 arch_init.c | 30 ++++++++++++++++++++++++++----
 1 file changed, 26 insertions(+), 4 deletions(-)

diff --git a/arch_init.c b/arch_init.c
index 60c975d..16474b5 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -167,10 +167,13 @@ static struct {
     /* Cache for XBZRLE, Protected by lock. */
     PageCache *cache;
     QemuMutex lock;
+    bool lock_init;        /* True once we've init'd lock */
 } XBZRLE = {
     .encoded_buf = NULL,
     .current_buf = NULL,
     .cache = NULL,
+    /* .lock is initialised in ram_save_setup */
+    .lock_init = false
 };
 /* buffer used for XBZRLE decoding */
 static uint8_t *xbzrle_decoded_buf;
@@ -187,6 +190,11 @@ static void XBZRLE_cache_unlock(void)
     qemu_mutex_unlock(&XBZRLE.lock);
 }
 
+/* Called from qmp_migrate_set_cache_size in the main thread, possibly
+ * while a migration is in progress.  A running migration may be using
+ * the cache and might finish during this call, hence changes to the
+ * cache are protected by XBZRLE.lock.
+ */
 int64_t xbzrle_cache_resize(int64_t new_size)
 {
     PageCache *new_cache, *cache_to_free;
@@ -195,9 +203,12 @@ int64_t xbzrle_cache_resize(int64_t new_size)
         return -1;
     }
 
-    /* no need to lock, the current thread holds qemu big lock */
+    /* The current thread holds the qemu big lock, and we hold it while
+     * creating the cache in ram_save_setup, so it's safe to test whether
+     * the cache exists yet without taking its own lock (which might not
+     * have been init'd yet).
+     */
     if (XBZRLE.cache != NULL) {
-        /* check XBZRLE.cache again later */
         if (pow2floor(new_size) == migrate_xbzrle_cache_size()) {
             return pow2floor(new_size);
         }
@@ -209,7 +220,10 @@ int64_t xbzrle_cache_resize(int64_t new_size)
     }
 
     XBZRLE_cache_lock();
-    /* the XBZRLE.cache may have be destroyed, check it again */
+    /* The migration might have finished between the check above and our
+     * taking the lock, causing XBZRLE.cache to be destroyed, so check
+     * it again.
+     */
     if (XBZRLE.cache != NULL) {
         cache_to_free = XBZRLE.cache;
         XBZRLE.cache = new_cache;
@@ -744,7 +758,15 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
             DPRINTF("Error creating cache\n");
             return -1;
         }
-        qemu_mutex_init(&XBZRLE.lock);
+        /* Mutexes can't be reinit'd without destroying them,
+         * and we've not got a good place to destroy this one that
+         * we can guarantee isn't being reached while another thread
+         * might want to hold the lock.
+         */
+        if (!XBZRLE.lock_init) {
+            XBZRLE.lock_init = true;
+            qemu_mutex_init(&XBZRLE.lock);
+        }
         qemu_mutex_unlock_iothread();
 
         /* We prefer not to abort if there is no memory */