Message ID: 1382461164-8854-1-git-send-email-mrhines@linux.vnet.ibm.com
State: New
On 10/22/2013 05:59 PM, mrhines@linux.vnet.ibm.com wrote:
> From: "Michael R. Hines" <mrhines@us.ibm.com>
>
> As far as we can tell, all known bugs have been fixed, and
> there has been very good participation in testing and running.
>
> 1. Parallel RDMA migrations are working
> 2. IPv6 migration is working
> 3. Libvirt patches are ready
> 4. virt-test is working
>
> Any objections to removing the experimental tag?
>
> There is one remaining bug: qemu-system-i386 does not compile
> with RDMA. I have zero access to 32-bit hardware
> using RDMA, so this hasn't been much of a priority. It seems
> safer to *not* submit a non-testable patch than to submit
> a fix just for the sake of compiling =)
>
> Signed-off-by: Michael R. Hines <mrhines@us.ibm.com>
> ---
>
> TODO:
> =====
> -1. 'migrate x-rdma:host:port' and '-incoming x-rdma' options will be
> +1. 'migrate rdma:host:port' and '-incoming rdma' options will be
>     renamed to 'rdma' after the experimental phase of this work has
>     completed upstream.

Shouldn't you remove step 1 and renumber the rest of the list
altogether, rather than just altering the comment to make it out-of-date?

> +++ b/qapi-schema.json
> @@ -615,7 +615,7 @@
>  # This feature allows us to minimize migration traffic for certain work
>  # loads, by sending compressed difference of the pages
>  #
> -# @x-rdma-pin-all: Controls whether or not the entire VM memory footprint is
> +# @rdma-pin-all: Controls whether or not the entire VM memory footprint is
>  #     mlock()'d on demand or all at once. Refer to docs/rdma.txt for usage.
>  #     Disabled by default. Experimental: may (or may not) be renamed after
>  #     further testing is complete. (since 1.6)

I'd also recommend tweaking this to say 'since 1.7', since the spelling
'rdma-pin-all' is new to this release.
On 22/10/2013 21:20, Eric Blake wrote:
>> -# @x-rdma-pin-all: Controls whether or not the entire VM memory footprint is
>> +# @rdma-pin-all: Controls whether or not the entire VM memory footprint is
>>  #     mlock()'d on demand or all at once. Refer to docs/rdma.txt for usage.
>>  #     Disabled by default. Experimental: may (or may not) be renamed after
>>  #     further testing is complete. (since 1.6)
>
> I'd also recommend tweaking this to say 'since 1.7', since the spelling
> 'rdma-pin-all' is new to this release.

I would also leave this as experimental for now.

Basically, the point of the "experimental" designation was to warn that
RDMA protocol changes might not preserve backwards compatibility. The
capability is a separate thing from the protocol, as it would likely
apply to any migration-over-RDMA implementation.

Paolo
On 10/22/2013 04:20 PM, Eric Blake wrote:
> On 10/22/2013 05:59 PM, mrhines@linux.vnet.ibm.com wrote:
>> From: "Michael R. Hines" <mrhines@us.ibm.com>
>>
>> As far as we can tell, all known bugs have been fixed, and
>> there has been very good participation in testing and running.
>>
>> 1. Parallel RDMA migrations are working
>> 2. IPv6 migration is working
>> 3. Libvirt patches are ready
>> 4. virt-test is working
>>
>> Any objections to removing the experimental tag?
>>
>> There is one remaining bug: qemu-system-i386 does not compile
>> with RDMA. I have zero access to 32-bit hardware
>> using RDMA, so this hasn't been much of a priority. It seems
>> safer to *not* submit a non-testable patch than to submit
>> a fix just for the sake of compiling =)
>>
>> Signed-off-by: Michael R. Hines <mrhines@us.ibm.com>
>> ---
>>
>> TODO:
>> =====
>> -1. 'migrate x-rdma:host:port' and '-incoming x-rdma' options will be
>> +1. 'migrate rdma:host:port' and '-incoming rdma' options will be
>>     renamed to 'rdma' after the experimental phase of this work has
>>     completed upstream.
>
> Shouldn't you remove step 1 and renumber the rest of the list
> altogether, rather than just altering the comment to make it out-of-date?

Oops =)

>> +++ b/qapi-schema.json
>> @@ -615,7 +615,7 @@
>>  # This feature allows us to minimize migration traffic for certain work
>>  # loads, by sending compressed difference of the pages
>>  #
>> -# @x-rdma-pin-all: Controls whether or not the entire VM memory footprint is
>> +# @rdma-pin-all: Controls whether or not the entire VM memory footprint is
>>  #     mlock()'d on demand or all at once. Refer to docs/rdma.txt for usage.
>>  #     Disabled by default. Experimental: may (or may not) be renamed after
>>  #     further testing is complete. (since 1.6)
>
> I'd also recommend tweaking this to say 'since 1.7', since the spelling
> 'rdma-pin-all' is new to this release.

Ah, yes. =)
On 10/23/2013 02:25 AM, Paolo Bonzini wrote:
> On 22/10/2013 21:20, Eric Blake wrote:
>>> -# @x-rdma-pin-all: Controls whether or not the entire VM memory footprint is
>>> +# @rdma-pin-all: Controls whether or not the entire VM memory footprint is
>>>  #     mlock()'d on demand or all at once. Refer to docs/rdma.txt for usage.
>>>  #     Disabled by default. Experimental: may (or may not) be renamed after
>>>  #     further testing is complete. (since 1.6)
>>
>> I'd also recommend tweaking this to say 'since 1.7', since the spelling
>> 'rdma-pin-all' is new to this release.
>
> I would also leave this as experimental for now.
>
> Basically, the point of the "experimental" designation was to warn that
> RDMA protocol changes might not preserve backwards compatibility. The
> capability is a separate thing from the protocol, as it would likely
> apply to any migration-over-RDMA implementation.
>
> Paolo

Well, I tried posting libvirt support with this naming scheme,
but they didn't accept it.

Their reason (Daniel, I think) is valid: experimental implies that it
shouldn't be exposed in the management software until it is
deemed stable at some point.

As far as we can tell, it is stable, and that is made very clear using
the new 'setup' state in the migration state machine.

How would we expose it in libvirt as an experimental feature without
labeling it as an experimental feature?

- Michael
On 25/10/2013 16:03, Michael R. Hines wrote:
> Well, I tried posting libvirt support with this naming scheme,
> but they didn't accept it.
>
> Their reason (Daniel, I think) is valid: experimental implies that it
> shouldn't be exposed in the management software until it is
> deemed stable at some point.
>
> As far as we can tell, it is stable, and that is made very clear using
> the new 'setup' state in the migration state machine.

Sure, x-rdma => rdma *is* stable. I'm not sure about x-rdma-pin-all though.

Paolo
On 10/25/2013 11:14 AM, Paolo Bonzini wrote:
> On 25/10/2013 16:03, Michael R. Hines wrote:
>> Well, I tried posting libvirt support with this naming scheme,
>> but they didn't accept it.
>>
>> Their reason (Daniel, I think) is valid: experimental implies that it
>> shouldn't be exposed in the management software until it is
>> deemed stable at some point.
>>
>> As far as we can tell, it is stable, and that is made very clear using
>> the new 'setup' state in the migration state machine.
>
> Sure, x-rdma => rdma *is* stable. I'm not sure about x-rdma-pin-all though.
>
> Paolo

Use of this capability (which is disabled by default) is actually
"safer" than migration without it: you would get an early warning, well
in advance of actually starting the iterative phases, that there was not
sufficient memory available on the destination.

Safety is all relative, of course, but I have successfully performed
thousands of back-to-back RDMA migrations, automatically looped *in a
row*, using a heavy-weight memory-stress benchmark here at IBM.

Migration success is verified by capturing the actual serial console
output of the virtual machine while the benchmark is running and
redirecting each migration's output to a file, to verify that the output
matches the expected output of a successful migration.

Migrations have been tested with both 14GB and 2GB virtual machine sizes
(to make sure I was testing both 32-bit and 64-bit address boundaries).
The benchmark is configured to perform 75% stores and 25% loads and to
use 80% of the allocatable free memory of the VM (i.e. no swapping
allowed).

I have defined a successful migration, per the output file, as follows:

1. The memory benchmark is still running and active (CPU near 100% and
   memory usage is high).
2. There are no kernel panics in the console output (regex keywords
   "panic", "BUG", "oom", etc.).
3. The VM is still responding to network activity (pings).
4. The console is still responsive, printing periodic messages
   throughout the life of the VM to the console from inside the VM,
   using the 'write' command in an infinite loop.

- Michael
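The console-log check in item 2 above can be sketched roughly as follows. This is a minimal illustration of scanning a captured serial log for the failure keywords mentioned in the message, not the actual IBM test harness; the function name and exact regex word boundaries are my own assumptions.

```python
import re

# Failure keywords from the message ("panic", "BUG", "oom", etc.);
# word boundaries keep e.g. "debug" from matching "BUG" is unnecessary
# here since "BUG" is matched case-sensitively, but they do keep
# substrings like "oom" inside unrelated words from false-matching.
FAILURE_PATTERNS = [
    re.compile(r"\bpanic\b", re.IGNORECASE),
    re.compile(r"\bBUG\b"),               # kernel BUG() reports are uppercase
    re.compile(r"\boom\b", re.IGNORECASE) # e.g. "invoked oom-killer"
]

def console_log_ok(log_text):
    """Return True if no kernel-failure keyword appears in the captured log."""
    return not any(pat.search(line)
                   for pat in FAILURE_PATTERNS
                   for line in log_text.splitlines())
```

A harness like the one described would run this over each migration's redirected console file and count the run as failed on the first match.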
On 26/10/2013 10:45, Michael R. Hines wrote:
>> Sure, x-rdma => rdma *is* stable. I'm not sure about x-rdma-pin-all though.
>
> Use of this capability (which is disabled by default) is actually
> "safer" than migration without it: you would get an early warning, well
> in advance of actually starting the iterative phases, that there was not
> sufficient memory available on the destination.

I guess that shows more of my (lack of) knowledge about RDMA. :)

You're obviously right, x-rdma-pin-all could make sense even if we had
fancy caching strategies.

Paolo
diff --git a/docs/rdma.txt b/docs/rdma.txt
index 8d1e003..2c2beba 100644
--- a/docs/rdma.txt
+++ b/docs/rdma.txt
@@ -66,7 +66,7 @@
 bulk-phase round of the migration and can be enabled for extremely
 high-performance RDMA hardware using the following command:

 QEMU Monitor Command:
-$ migrate_set_capability x-rdma-pin-all on # disabled by default
+$ migrate_set_capability rdma-pin-all on # disabled by default

 Performing this action will cause all 8GB to be pinned, so if that's
 not what you want, then please ignore this step altogether.
@@ -93,12 +93,12 @@
 $ migrate_set_speed 40g # or whatever is the MAX of your RDMA device

 Next, on the destination machine, add the following to the QEMU command line:

-qemu ..... -incoming x-rdma:host:port
+qemu ..... -incoming rdma:host:port

 Finally, perform the actual migration on the source machine:

 QEMU Monitor Command:
-$ migrate -d x-rdma:host:port
+$ migrate -d rdma:host:port

 PERFORMANCE
 ===========
@@ -120,8 +120,8 @@
 For example, in the same 8GB RAM example with all 8GB of memory in
 active use and the VM itself is completely idle using the same 40 gbps
 infiniband link:

-1. x-rdma-pin-all disabled total time: approximately 7.5 seconds @ 9.5 Gbps
-2. x-rdma-pin-all enabled total time: approximately 4 seconds @ 26 Gbps
+1. rdma-pin-all disabled total time: approximately 7.5 seconds @ 9.5 Gbps
+2. rdma-pin-all enabled total time: approximately 4 seconds @ 26 Gbps

 These numbers would of course scale up to whatever size virtual machine
 you have to migrate using RDMA.
@@ -407,7 +407,7 @@
 socket is broken during a non-RDMA based migration.

 TODO:
 =====
-1. 'migrate x-rdma:host:port' and '-incoming x-rdma' options will be
+1. 'migrate rdma:host:port' and '-incoming rdma' options will be
    renamed to 'rdma' after the experimental phase of this work has
    completed upstream.
 2. Currently, 'ulimit -l' mlock() limits as well as cgroups swap limits
diff --git a/migration-rdma.c b/migration-rdma.c
index f94f3b4..eeb4302 100644
--- a/migration-rdma.c
+++ b/migration-rdma.c
@@ -3412,7 +3412,7 @@ void rdma_start_outgoing_migration(void *opaque,
     }

     ret = qemu_rdma_source_init(rdma, &local_err,
-        s->enabled_capabilities[MIGRATION_CAPABILITY_X_RDMA_PIN_ALL]);
+        s->enabled_capabilities[MIGRATION_CAPABILITY_RDMA_PIN_ALL]);

     if (ret) {
         goto err;
diff --git a/migration.c b/migration.c
index b4f8462..d9c7a62 100644
--- a/migration.c
+++ b/migration.c
@@ -81,7 +81,7 @@ void qemu_start_incoming_migration(const char *uri, Error **errp)
     if (strstart(uri, "tcp:", &p))
         tcp_start_incoming_migration(p, errp);
 #ifdef CONFIG_RDMA
-    else if (strstart(uri, "x-rdma:", &p))
+    else if (strstart(uri, "rdma:", &p))
         rdma_start_incoming_migration(p, errp);
 #endif
 #if !defined(WIN32)
@@ -423,7 +423,7 @@ void qmp_migrate(const char *uri, bool has_blk, bool blk,
     if (strstart(uri, "tcp:", &p)) {
         tcp_start_outgoing_migration(s, p, &local_err);
 #ifdef CONFIG_RDMA
-    } else if (strstart(uri, "x-rdma:", &p)) {
+    } else if (strstart(uri, "rdma:", &p)) {
         rdma_start_outgoing_migration(s, p, &local_err);
 #endif
 #if !defined(WIN32)
@@ -501,7 +501,7 @@ bool migrate_rdma_pin_all(void)
     s = migrate_get_current();

-    return s->enabled_capabilities[MIGRATION_CAPABILITY_X_RDMA_PIN_ALL];
+    return s->enabled_capabilities[MIGRATION_CAPABILITY_RDMA_PIN_ALL];
 }

 bool migrate_auto_converge(void)
diff --git a/qapi-schema.json b/qapi-schema.json
index 145eca8..4bae4b1 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -615,7 +615,7 @@
 # This feature allows us to minimize migration traffic for certain work
 # loads, by sending compressed difference of the pages
 #
-# @x-rdma-pin-all: Controls whether or not the entire VM memory footprint is
+# @rdma-pin-all: Controls whether or not the entire VM memory footprint is
 #     mlock()'d on demand or all at once. Refer to docs/rdma.txt for usage.
 #     Disabled by default. Experimental: may (or may not) be renamed after
 #     further testing is complete. (since 1.6)
@@ -632,7 +632,7 @@
 # Since: 1.2
 ##
 { 'enum': 'MigrationCapability',
-  'data': ['xbzrle', 'x-rdma-pin-all', 'auto-converge', 'zero-blocks'] }
+  'data': ['xbzrle', 'rdma-pin-all', 'auto-converge', 'zero-blocks'] }

 ##
 # @MigrationCapabilityStatus
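With the qapi-schema.json rename above applied, a management client (such as libvirt) would enable the capability through the QMP 'migrate-set-capabilities' command. A minimal sketch of building that command payload follows; the helper name is invented for illustration, while the command name and argument shape come from the QAPI schema being patched.

```python
import json

def build_set_capability(name, state):
    """Build a QMP 'migrate-set-capabilities' command enabling one capability.

    'capabilities' is a list of MigrationCapabilityStatus objects, each
    with a 'capability' (MigrationCapability enum value) and a 'state'.
    """
    return {
        "execute": "migrate-set-capabilities",
        "arguments": {
            "capabilities": [{"capability": name, "state": state}],
        },
    }

# After this patch, the capability is spelled without the x- prefix:
cmd = build_set_capability("rdma-pin-all", True)
print(json.dumps(cmd))
```

A real client would send this JSON over the QMP socket after the initial 'qmp_capabilities' handshake, before issuing 'migrate' with an 'rdma:host:port' URI.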