From patchwork Wed Mar 19 09:31:39 2014 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Dr. David Alan Gilbert" X-Patchwork-Id: 331664 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from lists.gnu.org (lists.gnu.org [IPv6:2001:4830:134:3::11]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 43FB02C007C for ; Wed, 19 Mar 2014 20:33:01 +1100 (EST) Received: from localhost ([::1]:39509 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1WQCrq-0001uI-Rm for incoming@patchwork.ozlabs.org; Wed, 19 Mar 2014 05:32:58 -0400 Received: from eggs.gnu.org ([2001:4830:134:3::10]:44147) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1WQCrK-0001o5-QC for qemu-devel@nongnu.org; Wed, 19 Mar 2014 05:32:32 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1WQCrE-0005kA-EH for qemu-devel@nongnu.org; Wed, 19 Mar 2014 05:32:26 -0400 Received: from mx1.redhat.com ([209.132.183.28]:54142) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1WQCrE-0005k6-4h for qemu-devel@nongnu.org; Wed, 19 Mar 2014 05:32:20 -0400 Received: from int-mx01.intmail.prod.int.phx2.redhat.com (int-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.11]) by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id s2J9Vi97027384 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Wed, 19 Mar 2014 05:31:48 -0400 Received: from work-vm (vpn1-6-243.ams2.redhat.com [10.36.6.243]) by int-mx01.intmail.prod.int.phx2.redhat.com (8.13.8/8.13.8) with ESMTP id s2J9VeHg001524 (version=TLSv1/SSLv3 cipher=AES128-SHA bits=128 verify=NO); Wed, 19 Mar 2014 05:31:42 -0400 Date: Wed, 19 Mar 2014 09:31:39 +0000 From: "Dr. David Alan Gilbert" To: Markus Armbruster Message-ID: <20140319093139.GA2371@work-vm> References: <1395158215-20488-1-git-send-email-dgilbert@redhat.com> <871txze7yc.fsf@blackfin.pond.sub.org> <20140318204715.GE2715@work-vm> <8761na7hl8.fsf@blackfin.pond.sub.org> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <8761na7hl8.fsf@blackfin.pond.sub.org> User-Agent: Mutt/1.5.21 (2010-09-15) X-Scanned-By: MIMEDefang 2.67 on 10.5.11.11 X-detected-operating-system: by eggs.gnu.org: GNU/Linux 3.x X-Received-From: 209.132.183.28 Cc: arei.gonglei@huawei.com, qemu-devel@nongnu.org, quintela@redhat.com Subject: Re: [Qemu-devel] [PATCH 1/1] Stop reinit of XBZRLE.lock X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org * Markus Armbruster (armbru@redhat.com) wrote: > "Dr. David Alan Gilbert" writes: > > > * Markus Armbruster (armbru@redhat.com) wrote: > >> "Dr. David Alan Gilbert (git)" writes: > > > > > > > >> > diff --git a/arch_init.c b/arch_init.c > >> > index 60c975d..16474b5 100644 > >> > --- a/arch_init.c > >> > +++ b/arch_init.c > >> > @@ -167,10 +167,13 @@ static struct { > >> > /* Cache for XBZRLE, Protected by lock. */ > >> > PageCache *cache; > >> > QemuMutex lock; > >> > + bool lock_init; /* True once we've init'd lock */ > >> > } XBZRLE = { > >> > .encoded_buf = NULL, > >> > .current_buf = NULL, > >> > .cache = NULL, > >> > + /* .lock is initialised in ram_save_setup */ > >> > + .lock_init = false > >> > }; > >> > >> Redundant initializers. > > > > Given how subtle lock stuff is, I'll take making it obvious as more important. > > That a static variable starts out zeroed is plenty obvious. More > obvious in fact than a bunch of explicit initializers I actually have to > read to see what they mean. > > We rely on implicit zero-initialization of static variables all over the > place. > > >> > /* buffer used for XBZRLE decoding */ > >> > static uint8_t *xbzrle_decoded_buf; > >> > @@ -187,6 +190,11 @@ static void XBZRLE_cache_unlock(void) > >> > qemu_mutex_unlock(&XBZRLE.lock); > >> > } > >> > > >> > +/* called from qmp_migrate_set_cache_size in main thread, possibly while > >> > + * a migration is in progress. > >> > + * A running migration maybe using the cache and might finish during this > >> > >> may be > >> > >> > + * call, hence changes to the cache are protected by XBZRLE.lock(). > >> > + */ > >> > >> Style nit, since I'm nitpicking spelling already: our winged comments > >> usually look like > >> > >> /* > >> * Text > >> */ > > > > Oops, yes; if I need to respin I'll fix that (hmm I wonder if the check > > script could be tweaked to find those). > > > >> > int64_t xbzrle_cache_resize(int64_t new_size) > >> > { > >> > PageCache *new_cache, *cache_to_free; > >> > @@ -195,9 +203,12 @@ int64_t xbzrle_cache_resize(int64_t new_size) > >> > return -1; > >> > } > >> > > >> > - /* no need to lock, the current thread holds qemu big lock */ > >> > + /* The current thread holds qemu big lock, and we hold it while creating > >> > + * the cache in ram_save_setup, thus it's safe to test if the > >> > + * cache exists yet without it's own lock (which might not have been > >> > + * init'd yet) > >> > + */ > >> > if (XBZRLE.cache != NULL) { > >> > - /* check XBZRLE.cache again later */ > >> > if (pow2floor(new_size) == migrate_xbzrle_cache_size()) { > >> > return pow2floor(new_size); > >> > } > >> > @@ -209,7 +220,10 @@ int64_t xbzrle_cache_resize(int64_t new_size) > >> > } > >> > > >> > XBZRLE_cache_lock(); > >> > - /* the XBZRLE.cache may have be destroyed, check it again */ > >> > + /* the migration might have finished between the check above and us > >> > + * taking the lock, causing XBZRLE.cache to be destroyed > >> > + * check it again > >> > + */ > >> > if (XBZRLE.cache != NULL) { > >> > cache_to_free = XBZRLE.cache; > >> > XBZRLE.cache = new_cache; > >> > @@ -744,7 +758,15 @@ static int ram_save_setup(QEMUFile *f, void *opaque) > >> > DPRINTF("Error creating cache\n"); > >> > return -1; > >> > } > >> > - qemu_mutex_init(&XBZRLE.lock); > >> > + /* mutex's can't be reinit'd without destroying them > >> > + * and we've not got a good place to destroy it that > >> > + * we can guarantee isn't being called when we might want > >> > + * to hold the lock. > >> > + */ > >> > + if (!XBZRLE.lock_init) { > >> > + XBZRLE.lock_init = true; > >> > + qemu_mutex_init(&XBZRLE.lock); > >> > + } > >> > qemu_mutex_unlock_iothread(); > >> > > >> > /* We prefer not to abort if there is no memory */ > >> > >> I detest how the locking works in xbzrle_cache_resize(). > > > > Yeh, it's tricky - I think the way to think about it is that the > > lock protects the cache and it's contents, not the pointer. > > Except that's not really true when it swaps the new one in - hmm. > > Yup, and that's what I call confusing and way too subtle. Yes, agreed - this shouldn't be a hard problem that needs subtlety. > >> The first XBZRLE.cache != NULL is not under XBZRLE.lock. As you explain > >> in the comment, this is required, because we foolishly delay lock > >> initialization until a migration starts, and it's safe, because cache > >> creation is also under the BQL. > > > > Some of this is down to the interfaces between migration generally, > > the devices being migrated and the specials for RAM/iterative migration. > > > > 1) A lot of the ram migration data, including XBZRLE, is global data in > > arch_init - this is bad. > > As long as there can only be one migration active at a time, it's not > really bad, isn't it? I find it quite messy; there is state associated with migration all over, I couldn't honestly tell you all the things that need to be initialised, updated, etc when a migration is in progress. I'd rather hang stuff off MigrationState. > > 2) The management of these is generally glued onto the migration > > setup/iterate/complete set of methods used during migration, however > > they don't have any calls that correspond to the actual start/end of > > migration, or the like that could be sanely used to do any init > > that tied up with the actual start end of migration - this is bad. > > Maybe. But why delay the initialization of XBZRLE.lock to start of > migration? All the other members of XBZRLE are initialized on start of > program! That's just fall out from the shock that you couldn't staticly init the lock, the original patch version statically init'd the lock in the same way as the other members. > > 3) Migration is in a separate thread and could finish at any time - this > > is expected but complicates life, especially when (2) means that > > the data structures for RAMs migration etc are cleared up when migration > > finishes. > > Four activities: initialization, cache setup, cache use, cache teardown. > > Initialization can be done during program startup, where we don't have > to worry about threads and locks. The other three can then be simply > put under the lock, and the comment "Protected by lock" becomes actually > true. Sketch appended. It passes "make check", it must be purrfect! I'm mostly OK with this, with one small change at the bottom. > > I don't know the history but (1) seems to arrise from the > > semi-arbitrary split between arch_init.c, memory.c, savevm.c, and probably > > 3 or 4 other files I've failed to mention. > > [...] > > > From 6ab76b05789de80d4e0919404429fadcbc0ea403 Mon Sep 17 00:00:00 2001 > From: Markus Armbruster > Date: Wed, 19 Mar 2014 08:46:51 +0100 > Subject: [PATCH] XBZRLE: Fix and simplify locking of cache WIP > > --- > arch_init.c | 49 ++++++++++++++++++++++------------------------ > include/sysemu/arch_init.h | 1 + > vl.c | 1 + > 3 files changed, 25 insertions(+), 26 deletions(-) > > diff --git a/arch_init.c b/arch_init.c > index 60c975d..e41e9dd 100644 > --- a/arch_init.c > +++ b/arch_init.c > @@ -167,14 +167,16 @@ static struct { > /* Cache for XBZRLE, Protected by lock. */ > PageCache *cache; > QemuMutex lock; > -} XBZRLE = { > - .encoded_buf = NULL, > - .current_buf = NULL, > - .cache = NULL, > -}; > +} XBZRLE; > + > /* buffer used for XBZRLE decoding */ > static uint8_t *xbzrle_decoded_buf; > > +void XBZRLE_init(void) > +{ > + qemu_mutex_init(&XBZRLE.lock); > +} > + > static void XBZRLE_cache_lock(void) > { > if (migrate_use_xbzrle()) > @@ -189,39 +191,35 @@ static void XBZRLE_cache_unlock(void) > > int64_t xbzrle_cache_resize(int64_t new_size) > { > - PageCache *new_cache, *cache_to_free; > + PageCache *new_cache; > + int64_t ret; > > if (new_size < TARGET_PAGE_SIZE) { > return -1; > } > > - /* no need to lock, the current thread holds qemu big lock */ > + XBZRLE_cache_lock(); > + > if (XBZRLE.cache != NULL) { > - /* check XBZRLE.cache again later */ > if (pow2floor(new_size) == migrate_xbzrle_cache_size()) { > - return pow2floor(new_size); > + goto out_new_size; > } > new_cache = cache_init(new_size / TARGET_PAGE_SIZE, > TARGET_PAGE_SIZE); > if (!new_cache) { > DPRINTF("Error creating cache\n"); > - return -1; > + goto out; > } > > - XBZRLE_cache_lock(); > - /* the XBZRLE.cache may have be destroyed, check it again */ > - if (XBZRLE.cache != NULL) { > - cache_to_free = XBZRLE.cache; > - XBZRLE.cache = new_cache; > - } else { > - cache_to_free = new_cache; > - } > - XBZRLE_cache_unlock(); > - > - cache_fini(cache_to_free); > + cache_fini(XBZRLE.cache); > + XBZRLE.cache = new_cache; > } > > - return pow2floor(new_size); > +out_new_size: > + ret = pow2floor(new_size); > +out: > + XBZRLE_cache_unlock(); > + return ret; > } > > /* accounting for migration statistics */ > @@ -735,17 +733,16 @@ static int ram_save_setup(QEMUFile *f, void *opaque) > dirty_rate_high_cnt = 0; > > if (migrate_use_xbzrle()) { > - qemu_mutex_lock_iothread(); > + XBZRLE_cache_lock(); > XBZRLE.cache = cache_init(migrate_xbzrle_cache_size() / > TARGET_PAGE_SIZE, > TARGET_PAGE_SIZE); > if (!XBZRLE.cache) { > - qemu_mutex_unlock_iothread(); > + XBZRLE_cache_unlock(); > DPRINTF("Error creating cache\n"); > return -1; > } > - qemu_mutex_init(&XBZRLE.lock); > - qemu_mutex_unlock_iothread(); > + XBZRLE_cache_unlock(); > > /* We prefer not to abort if there is no memory */ > XBZRLE.encoded_buf = g_try_malloc0(TARGET_PAGE_SIZE); > diff --git a/include/sysemu/arch_init.h b/include/sysemu/arch_init.h > index be71bca..fdc3423 100644 > --- a/include/sysemu/arch_init.h > +++ b/include/sysemu/arch_init.h > @@ -26,6 +26,7 @@ enum { > > extern const uint32_t arch_type; > > +void XBZRLE_init(void); > void select_soundhw(const char *optarg); > void do_acpitable_option(const QemuOpts *opts); > void do_smbios_option(QemuOpts *opts); > diff --git a/vl.c b/vl.c > index f0fe48b..24433c8 100644 > --- a/vl.c > +++ b/vl.c > @@ -3008,6 +3008,7 @@ int main(int argc, char **argv, char **envp) > > qemu_init_auxval(envp); > qemu_cache_utils_init(); > + XBZRLE_init(); > > QLIST_INIT (&vm_change_state_head); > os_setup_early_signal_handling(); > -- > 1.8.1.4 I noticed that in main.c there is a call to a blk_mig_init() - so how about (not even build tested): and that way we're simplifying vl.c rather than teaching it even more about the innards of ram migration. Dave --- Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK diff --git a/vl.c b/vl.c index 842e897..ab8cbf0 100644 --- a/vl.c +++ b/vl.c @@ -4247,6 +4247,7 @@ int main(int argc, char **argv, char **envp) cpu_exec_init_all(); blk_mig_init(); + ram_mig_init(); /* open the virtual block devices */ if (snapshot) @@ -4261,8 +4262,6 @@ int main(int argc, char **argv, char **envp) default_drive(default_floppy, snapshot, IF_FLOPPY, 0, FD_OPTS); default_drive(default_sdcard, snapshot, IF_SD, 0, SD_OPTS); - register_savevm_live(NULL, "ram", 0, 4, &savevm_ram_handlers, NULL); - if (nb_numa_nodes > 0) { int i; diff --git a/arch_init.c b/arch_init.c index 16474b5..4f8376d 100644 --- a/arch_init.c +++ b/arch_init.c @@ -190,6 +190,12 @@ static void XBZRLE_cache_unlock(void) qemu_mutex_unlock(&XBZRLE.lock); } +void ram_mig_init(void) +{ + qemu_mutex_init(&XBZRLE.lock); + register_savevm_live(NULL, "ram", 0, 4, &savevm_ram_handlers, NULL); +} + /* called from qmp_migrate_set_cache_size in main thread, possibly while * a migration is in progress. * A running migration maybe using the cache and might finish during this @@ -1117,7 +1123,7 @@ done: return ret; } -SaveVMHandlers savevm_ram_handlers = { +static SaveVMHandlers savevm_ram_handlers = { .save_live_setup = ram_save_setup, .save_live_iterate = ram_save_iterate, .save_live_complete = ram_save_complete, diff --git a/include/migration/migration.h b/include/migration/migration.h index 3e1e6c7..31fbf17 100644 --- a/include/migration/migration.h +++ b/include/migration/migration.h @@ -113,8 +113,6 @@ void free_xbzrle_decoded_buf(void); void acct_update_position(QEMUFile *f, size_t size, bool zero); -extern SaveVMHandlers savevm_ram_handlers; - uint64_t dup_mig_bytes_transferred(void); uint64_t dup_mig_pages_transferred(void); uint64_t skipped_mig_bytes_transferred(void);