Message ID | 20140319093139.GA2371@work-vm |
---|---|
State | New |
Headers | show |
"Dr. David Alan Gilbert" <dgilbert@redhat.com> writes: > * Markus Armbruster (armbru@redhat.com) wrote: >> "Dr. David Alan Gilbert" <dgilbert@redhat.com> writes: >> >> > * Markus Armbruster (armbru@redhat.com) wrote: >> >> "Dr. David Alan Gilbert (git)" <dgilbert@redhat.com> writes: >> > >> > <snip> >> > >> >> > diff --git a/arch_init.c b/arch_init.c >> >> > index 60c975d..16474b5 100644 >> >> > --- a/arch_init.c >> >> > +++ b/arch_init.c >> >> > @@ -167,10 +167,13 @@ static struct { >> >> > /* Cache for XBZRLE, Protected by lock. */ >> >> > PageCache *cache; >> >> > QemuMutex lock; >> >> > + bool lock_init; /* True once we've init'd lock */ >> >> > } XBZRLE = { >> >> > .encoded_buf = NULL, >> >> > .current_buf = NULL, >> >> > .cache = NULL, >> >> > + /* .lock is initialised in ram_save_setup */ >> >> > + .lock_init = false >> >> > }; >> >> >> >> Redundant initializers. >> > >> > Given how subtle lock stuff is, I'll take making it obvious as >> > more important. >> >> That a static variable starts out zeroed is plenty obvious. More >> obvious in fact than a bunch of explicit initializers I actually have to >> read to see what they mean. >> >> We rely on implicit zero-initialization of static variables all over the >> place. >> >> >> > /* buffer used for XBZRLE decoding */ >> >> > static uint8_t *xbzrle_decoded_buf; >> >> > @@ -187,6 +190,11 @@ static void XBZRLE_cache_unlock(void) >> >> > qemu_mutex_unlock(&XBZRLE.lock); >> >> > } >> >> > >> >> > +/* called from qmp_migrate_set_cache_size in main thread, possibly while >> >> > + * a migration is in progress. >> >> > + * A running migration maybe using the cache and might finish during this >> >> >> >> may be >> >> >> >> > + * call, hence changes to the cache are protected by XBZRLE.lock(). >> >> > + */ >> >> >> >> Style nit, since I'm nitpicking spelling already: our winged comments >> >> usually look like >> >> >> >> /* >> >> * Text >> >> */ >> > >> > Oops, yes; if I need to respin I'll fix that (hmm I wonder if the check >> > script could be tweaked to find those). >> > >> >> > int64_t xbzrle_cache_resize(int64_t new_size) >> >> > { >> >> > PageCache *new_cache, *cache_to_free; >> >> > @@ -195,9 +203,12 @@ int64_t xbzrle_cache_resize(int64_t new_size) >> >> > return -1; >> >> > } >> >> > >> >> > - /* no need to lock, the current thread holds qemu big lock */ >> >> > + /* The current thread holds qemu big lock, and we hold it while creating >> >> > + * the cache in ram_save_setup, thus it's safe to test if the >> >> > + * cache exists yet without it's own lock (which might not have been >> >> > + * init'd yet) >> >> > + */ >> >> > if (XBZRLE.cache != NULL) { >> >> > - /* check XBZRLE.cache again later */ >> >> > if (pow2floor(new_size) == migrate_xbzrle_cache_size()) { >> >> > return pow2floor(new_size); >> >> > } >> >> > @@ -209,7 +220,10 @@ int64_t xbzrle_cache_resize(int64_t new_size) >> >> > } >> >> > >> >> > XBZRLE_cache_lock(); >> >> > - /* the XBZRLE.cache may have be destroyed, check it again */ >> >> > + /* the migration might have finished between the check above and us >> >> > + * taking the lock, causing XBZRLE.cache to be destroyed >> >> > + * check it again >> >> > + */ >> >> > if (XBZRLE.cache != NULL) { >> >> > cache_to_free = XBZRLE.cache; >> >> > XBZRLE.cache = new_cache; >> >> > @@ -744,7 +758,15 @@ static int ram_save_setup(QEMUFile *f, void *opaque) >> >> > DPRINTF("Error creating cache\n"); >> >> > return -1; >> >> > } >> >> > - qemu_mutex_init(&XBZRLE.lock); >> >> > + /* mutex's can't be reinit'd without destroying them >> >> > + * and we've not got a good place to destroy it that >> >> > + * we can guarantee isn't being called when we might want >> >> > + * to hold the lock. >> >> > + */ >> >> > + if (!XBZRLE.lock_init) { >> >> > + XBZRLE.lock_init = true; >> >> > + qemu_mutex_init(&XBZRLE.lock); >> >> > + } >> >> > qemu_mutex_unlock_iothread(); >> >> > >> >> > /* We prefer not to abort if there is no memory */ >> >> >> >> I detest how the locking works in xbzrle_cache_resize(). >> > >> > Yeh, it's tricky - I think the way to think about it is that the >> > lock protects the cache and it's contents, not the pointer. >> > Except that's not really true when it swaps the new one in - hmm. >> >> Yup, and that's what I call confusing and way too subtle. > > Yes, agreed - this shouldn't be a hard problem that needs subtlety. > >> >> The first XBZRLE.cache != NULL is not under XBZRLE.lock. As you explain >> >> in the comment, this is required, because we foolishly delay lock >> >> initialization until a migration starts, and it's safe, because cache >> >> creation is also under the BQL. >> > >> > Some of this is down to the interfaces between migration generally, >> > the devices being migrated and the specials for RAM/iterative migration. >> > >> > 1) A lot of the ram migration data, including XBZRLE, is global data in >> > arch_init - this is bad. >> >> As long as there can only be one migration active at a time, it's not >> really bad, isn't it? > > I find it quite messy; there is state associated with migration all > over, I couldn't honestly tell you all the things that need to be > initialised, updated, etc when a migration is in progress. > I'd rather hang stuff off MigrationState. Point taken. >> > 2) The management of these is generally glued onto the migration >> > setup/iterate/complete set of methods used during migration, however >> > they don't have any calls that correspond to the actual start/end of >> > migration, or the like that could be sanely used to do any init >> > that tied up with the actual start end of migration - this is bad. >> >> Maybe. But why delay the initialization of XBZRLE.lock to start of >> migration? All the other members of XBZRLE are initialized on start of >> program! > > That's just fall out from the shock that you couldn't staticly init > the lock, the original patch version statically init'd the lock in the > same way as the other members. Heh :) >> > 3) Migration is in a separate thread and could finish at any time - this >> > is expected but complicates life, especially when (2) means that >> > the data structures for RAMs migration etc are cleared up when migration >> > finishes. >> >> Four activities: initialization, cache setup, cache use, cache teardown. >> >> Initialization can be done during program startup, where we don't have >> to worry about threads and locks. The other three can then be simply >> put under the lock, and the comment "Protected by lock" becomes actually >> true. Sketch appended. It passes "make check", it must be purrfect! > > I'm mostly OK with this, with one small change at the bottom. > >> > I don't know the history but (1) seems to arrise from the >> > semi-arbitrary split between arch_init.c, memory.c, savevm.c, and probably >> > 3 or 4 other files I've failed to mention. >> >> [...] >> >> >> From 6ab76b05789de80d4e0919404429fadcbc0ea403 Mon Sep 17 00:00:00 2001 >> From: Markus Armbruster <armbru@redhat.com> >> Date: Wed, 19 Mar 2014 08:46:51 +0100 >> Subject: [PATCH] XBZRLE: Fix and simplify locking of cache WIP >> >> --- >> arch_init.c | 49 ++++++++++++++++++++++------------------------ >> include/sysemu/arch_init.h | 1 + >> vl.c | 1 + >> 3 files changed, 25 insertions(+), 26 deletions(-) >> >> diff --git a/arch_init.c b/arch_init.c >> index 60c975d..e41e9dd 100644 >> --- a/arch_init.c >> +++ b/arch_init.c >> @@ -167,14 +167,16 @@ static struct { >> /* Cache for XBZRLE, Protected by lock. */ >> PageCache *cache; >> QemuMutex lock; >> -} XBZRLE = { >> - .encoded_buf = NULL, >> - .current_buf = NULL, >> - .cache = NULL, >> -}; >> +} XBZRLE; >> + >> /* buffer used for XBZRLE decoding */ >> static uint8_t *xbzrle_decoded_buf; >> >> +void XBZRLE_init(void) >> +{ >> + qemu_mutex_init(&XBZRLE.lock); >> +} >> + >> static void XBZRLE_cache_lock(void) >> { >> if (migrate_use_xbzrle()) >> @@ -189,39 +191,35 @@ static void XBZRLE_cache_unlock(void) >> >> int64_t xbzrle_cache_resize(int64_t new_size) >> { >> - PageCache *new_cache, *cache_to_free; >> + PageCache *new_cache; >> + int64_t ret; >> >> if (new_size < TARGET_PAGE_SIZE) { >> return -1; >> } >> >> - /* no need to lock, the current thread holds qemu big lock */ >> + XBZRLE_cache_lock(); >> + >> if (XBZRLE.cache != NULL) { >> - /* check XBZRLE.cache again later */ >> if (pow2floor(new_size) == migrate_xbzrle_cache_size()) { >> - return pow2floor(new_size); >> + goto out_new_size; >> } >> new_cache = cache_init(new_size / TARGET_PAGE_SIZE, >> TARGET_PAGE_SIZE); >> if (!new_cache) { >> DPRINTF("Error creating cache\n"); >> - return -1; >> + goto out; >> } >> >> - XBZRLE_cache_lock(); >> - /* the XBZRLE.cache may have be destroyed, check it again */ >> - if (XBZRLE.cache != NULL) { >> - cache_to_free = XBZRLE.cache; >> - XBZRLE.cache = new_cache; >> - } else { >> - cache_to_free = new_cache; >> - } >> - XBZRLE_cache_unlock(); >> - >> - cache_fini(cache_to_free); >> + cache_fini(XBZRLE.cache); >> + XBZRLE.cache = new_cache; >> } >> >> - return pow2floor(new_size); >> +out_new_size: >> + ret = pow2floor(new_size); >> +out: >> + XBZRLE_cache_unlock(); >> + return ret; >> } >> >> /* accounting for migration statistics */ >> @@ -735,17 +733,16 @@ static int ram_save_setup(QEMUFile *f, void *opaque) >> dirty_rate_high_cnt = 0; >> >> if (migrate_use_xbzrle()) { >> - qemu_mutex_lock_iothread(); >> + XBZRLE_cache_lock(); >> XBZRLE.cache = cache_init(migrate_xbzrle_cache_size() / >> TARGET_PAGE_SIZE, >> TARGET_PAGE_SIZE); >> if (!XBZRLE.cache) { >> - qemu_mutex_unlock_iothread(); >> + XBZRLE_cache_unlock(); >> DPRINTF("Error creating cache\n"); >> return -1; >> } >> - qemu_mutex_init(&XBZRLE.lock); >> - qemu_mutex_unlock_iothread(); >> + XBZRLE_cache_unlock(); >> >> /* We prefer not to abort if there is no memory */ >> XBZRLE.encoded_buf = g_try_malloc0(TARGET_PAGE_SIZE); >> diff --git a/include/sysemu/arch_init.h b/include/sysemu/arch_init.h >> index be71bca..fdc3423 100644 >> --- a/include/sysemu/arch_init.h >> +++ b/include/sysemu/arch_init.h >> @@ -26,6 +26,7 @@ enum { >> >> extern const uint32_t arch_type; >> >> +void XBZRLE_init(void); >> void select_soundhw(const char *optarg); >> void do_acpitable_option(const QemuOpts *opts); >> void do_smbios_option(QemuOpts *opts); >> diff --git a/vl.c b/vl.c >> index f0fe48b..24433c8 100644 >> --- a/vl.c >> +++ b/vl.c >> @@ -3008,6 +3008,7 @@ int main(int argc, char **argv, char **envp) >> >> qemu_init_auxval(envp); >> qemu_cache_utils_init(); >> + XBZRLE_init(); >> >> QLIST_INIT (&vm_change_state_head); >> os_setup_early_signal_handling(); >> -- >> 1.8.1.4 > > I noticed that in main.c there is a call to a blk_mig_init() - so > how about (not even build tested): > > diff --git a/vl.c b/vl.c > index 842e897..ab8cbf0 100644 > --- a/vl.c > +++ b/vl.c > @@ -4247,6 +4247,7 @@ int main(int argc, char **argv, char **envp) > cpu_exec_init_all(); > > blk_mig_init(); > + ram_mig_init(); > > /* open the virtual block devices */ > if (snapshot) I figure this is still early eanough. > @@ -4261,8 +4262,6 @@ int main(int argc, char **argv, char **envp) > default_drive(default_floppy, snapshot, IF_FLOPPY, 0, FD_OPTS); > default_drive(default_sdcard, snapshot, IF_SD, 0, SD_OPTS); > > - register_savevm_live(NULL, "ram", 0, 4, &savevm_ram_handlers, NULL); > - > if (nb_numa_nodes > 0) { > int i; > diff --git a/arch_init.c b/arch_init.c > index 16474b5..4f8376d 100644 > --- a/arch_init.c > +++ b/arch_init.c > @@ -190,6 +190,12 @@ static void XBZRLE_cache_unlock(void) > qemu_mutex_unlock(&XBZRLE.lock); > } > > +void ram_mig_init(void) > +{ > + qemu_mutex_init(&XBZRLE.lock); > + register_savevm_live(NULL, "ram", 0, 4, &savevm_ram_handlers, NULL); > +} > + > /* called from qmp_migrate_set_cache_size in main thread, possibly while > * a migration is in progress. > * A running migration maybe using the cache and might finish during this > @@ -1117,7 +1123,7 @@ done: > return ret; > } > > -SaveVMHandlers savevm_ram_handlers = { > +static SaveVMHandlers savevm_ram_handlers = { > .save_live_setup = ram_save_setup, > .save_live_iterate = ram_save_iterate, > .save_live_complete = ram_save_complete, > > diff --git a/include/migration/migration.h b/include/migration/migration.h > index 3e1e6c7..31fbf17 100644 > --- a/include/migration/migration.h > +++ b/include/migration/migration.h > @@ -113,8 +113,6 @@ void free_xbzrle_decoded_buf(void); > > void acct_update_position(QEMUFile *f, size_t size, bool zero); > > -extern SaveVMHandlers savevm_ram_handlers; > - > uint64_t dup_mig_bytes_transferred(void); > uint64_t dup_mig_pages_transferred(void); > uint64_t skipped_mig_bytes_transferred(void); > > and that way we're simplifying vl.c rather than teaching it even > more about the innards of ram migration. Yes, that's better. Suggest to split into two patches, one to hide savevm_ram_handlers behind ram_mig_init(), and another one to fix and simplify the locking. Care to test it and post?
diff --git a/vl.c b/vl.c index 842e897..ab8cbf0 100644 --- a/vl.c +++ b/vl.c @@ -4247,6 +4247,7 @@ int main(int argc, char **argv, char **envp) cpu_exec_init_all(); blk_mig_init(); + ram_mig_init(); /* open the virtual block devices */ if (snapshot) @@ -4261,8 +4262,6 @@ int main(int argc, char **argv, char **envp) default_drive(default_floppy, snapshot, IF_FLOPPY, 0, FD_OPTS); default_drive(default_sdcard, snapshot, IF_SD, 0, SD_OPTS); - register_savevm_live(NULL, "ram", 0, 4, &savevm_ram_handlers, NULL); - if (nb_numa_nodes > 0) { int i; diff --git a/arch_init.c b/arch_init.c index 16474b5..4f8376d 100644 --- a/arch_init.c +++ b/arch_init.c @@ -190,6 +190,12 @@ static void XBZRLE_cache_unlock(void) qemu_mutex_unlock(&XBZRLE.lock); } +void ram_mig_init(void) +{ + qemu_mutex_init(&XBZRLE.lock); + register_savevm_live(NULL, "ram", 0, 4, &savevm_ram_handlers, NULL); +} + /* called from qmp_migrate_set_cache_size in main thread, possibly while * a migration is in progress. * A running migration maybe using the cache and might finish during this @@ -1117,7 +1123,7 @@ done: return ret; } -SaveVMHandlers savevm_ram_handlers = { +static SaveVMHandlers savevm_ram_handlers = { .save_live_setup = ram_save_setup, .save_live_iterate = ram_save_iterate, .save_live_complete = ram_save_complete, diff --git a/include/migration/migration.h b/include/migration/migration.h index 3e1e6c7..31fbf17 100644 --- a/include/migration/migration.h +++ b/include/migration/migration.h @@ -113,8 +113,6 @@ void free_xbzrle_decoded_buf(void); void acct_update_position(QEMUFile *f, size_t size, bool zero); -extern SaveVMHandlers savevm_ram_handlers; - uint64_t dup_mig_bytes_transferred(void); uint64_t dup_mig_pages_transferred(void); uint64_t skipped_mig_bytes_transferred(void);