Message ID: 1415272128-8273-2-git-send-email-liang.z.li@intel.com
State: New
On 11/06/2014 12:08 PM, Li Liang wrote:
> Give some details about the multiple compression threads and how
> to use it in live migration.
>
> Signed-off-by: Li Liang <liang.z.li@intel.com>
> ---
>  docs/multiple-compression-threads.txt | 128 ++++++++++++++++++++++++++++++++++
>  1 file changed, 128 insertions(+)
>  create mode 100644 docs/multiple-compression-threads.txt
>
> diff --git a/docs/multiple-compression-threads.txt b/docs/multiple-compression-threads.txt
> new file mode 100644
> index 0000000..a5e53de
> --- /dev/null
> +++ b/docs/multiple-compression-threads.txt
> @@ -0,0 +1,128 @@
> +Use multiple (de)compression threads in live migration
> +=================================================================
> +Copyright (C) 2014 Li Liang <liang.z.li@intel.com>

Asserting copyright without also mentioning an open license is awkward
in open source (IANAL, but as I understand it, in some areas, asserting
a copyright without also granting disclaimers merely gets the default
non-open status where the file cannot be copied at all; the license is
essential to make it obvious that the copyright holder INTENDS for the
file to be copied in some circumstances).  Thus, you need to explicitly
call out GPLv2+ (even if it can be argued it was implied by the
top-level LICENSE) or some other compatible license to be safe.

> +
> +
> +Contents:
> +=========
> +* Introduction
> +* When to use
> +* Performance
> +* Usage
> +* TODO
> +
> +Introduction
> +============
> +Instead of sending the guest memory directly, this solution will
> +compress the ram page before sending, after receiving, the data will

s/sending,/sending;/

> +be decompressed. Using compression in live migration can help
> +to reduce the data transferred about 60%, this is very useful when the
> +bandwidth is limited, and the migration time can also be reduced about
> +70% in a typical case.
> +
> +The process of compression will consume additional CPU cycles, and the
> +extra CPU cycles will increase the migration time. On the other hand,
> +the amount of data transferred will reduced, this factor can reduce
> +the migration time. If the process of the compression is quick
> +enough, then the total migration time can be reduced, and multiple
> +compression threads can be used to accelerate the compression process.
> +
> +The decompression speed of zlib is at least 4 times as quickly as

s/quickly/quick/

> +compression, if the source and destination CPU have equal speed,
> +keeping the compression thread count 4 times the decompression
> +thread count can avoid CPU waste.
> +
> +Compression level can be used to control the compression speed and the
> +compression ratio. High compression ratio will take more time, level 0
> +stands for no compression, level 1 stands for the best compression
> +speed, and level 9 stands for the best compression ratio. Users can
> +select a level number between 0 and 9.
> +
> +
> +When to use the multiple compression threads in live migration
> +==============================================================
> +Compression of data will consume lot of extra CPU cycles, in a system

s/lot of//
s/cycles,/cycles; so/

> +with high overhead of CPU, avoid using this feature. When the network
> +bandwidth is very limited and the CPU resource is adequate, use the

s/use the/use of/

> +multiple compression threads will be very helpful. If both the CPU and
> +the network bandwidth are adequate, use multiple compression threads

s/use/use of/

> +can still help to reduce the migration time.
> +
> +Performance
> +===========
> +Test environment:
> +
> +CPU: Intel(R) Xeon(R) CPU E5-2680 0 @ 2.70GHz
> +Socket Count: 2
> +Ram: 128G
> +NIC: Intel I350 (10/100/1000Mbps)
> +Host OS: CentOS 7 64-bit
> +Guest OS: Ubuntu 12.10 64-bit
> +Parameter: qemu-system-x86_64 -enable-kvm -m 1024
> +           /share/ia32e_ubuntu12.10.img -monitor stdio
> +
> +There is no additional application is running on the guest when doing
> +the test.
> +
> +
> +Speed limit: 32MB/s
> +---------------------------------------------------------------
> +                    | original  | compress thread: 8
> +                    | way       | decompress thread: 2
> +                    |           | compression level: 1
> +---------------------------------------------------------------
> +total time(msec):   | 26561     | 7920
> +---------------------------------------------------------------
> +transferred ram(kB):| 877054    | 260641
> +---------------------------------------------------------------
> +throughput(mbps):   | 270.53    | 269.68
> +---------------------------------------------------------------
> +total ram(kB):      | 1057604   | 1057604
> +---------------------------------------------------------------
> +
> +
> +Speed limit: No
> +---------------------------------------------------------------
> +                    | original  | compress thread: 15
> +                    | way       | decompress thread: 4
> +                    |           | compression level: 1
> +---------------------------------------------------------------
> +total time(msec):   | 7611      | 2888
> +---------------------------------------------------------------
> +transferred ram(kB):| 876761    | 262301
> +---------------------------------------------------------------
> +throughput(mbps):   | 943.78    | 744.27
> +---------------------------------------------------------------
> +total ram(kB):      | 1057604   | 1057604
> +---------------------------------------------------------------
> +
> +Usage
> +======
> +1. Verify the destination QEMU version is able to support the multiple
> +compression threads migration:
> +    {qemu} info_migrate_capablilites
> +    {qemu} ... compress: off ...
> +
> +2. Activate compression on the souce:
> +    {qemu} migrate_set_capability compress on
> +
> +3. Set the compression thread count on source:
> +    {qemu} migrate_set_compress_threads 10
> +
> +4. Set the compression level on the source:
> +    {qemu} migrate_set_compress_level 1
> +
> +5. Set the decompression thread count on destination:
> +    {qemu} migrate_set_decompress_threads 5
> +
> +6. Start outgoing migration:
> +    {qemu} migrate -d tcp:destination.host:4444
> +    {qemu} info migrate
> +    Capabilities: ... compress: on
> +    ...
> +
> +TODO
> +====
> +Some faster compression/decompression method such as lz4 and quicklz
> +can help to reduce the CPU consumption when doing (de)compression.
> +Less (de)compression threads are needed when doing the migration.
>
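[Editor's note: the level semantics described in the patch are zlib's, so the
speed/ratio trade-off is easy to see with a few lines of plain Python. This is
an illustration only, not QEMU code; the sample page contents are made up.]

```python
import zlib

# A guest RAM page is 4 KiB; simulate a compressible page, as zeroed or
# text-filled guest pages typically are.
page = (b"some guest page contents " * 200)[:4096]

fast = zlib.compress(page, 1)   # level 1: best compression speed
small = zlib.compress(page, 9)  # level 9: best compression ratio

# Both levels round-trip losslessly; level 9 trades CPU time for size.
assert zlib.decompress(fast) == page
assert zlib.decompress(small) == page
assert len(small) <= len(fast) < len(page)
print(len(page), len(fast), len(small))
```

Level 0 (no compression) is also accepted by zlib.compress and simply stores
the data with a small framing overhead.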
* Li Liang (liang.z.li@intel.com) wrote:
> Give some details about the multiple compression threads and how
> to use it in live migration.
>
> Signed-off-by: Li Liang <liang.z.li@intel.com>
> ---
>  docs/multiple-compression-threads.txt | 128 ++++++++++++++++++++++++++++++++++
>  1 file changed, 128 insertions(+)
>  create mode 100644 docs/multiple-compression-threads.txt
>
> diff --git a/docs/multiple-compression-threads.txt b/docs/multiple-compression-threads.txt
> new file mode 100644
> index 0000000..a5e53de
> --- /dev/null
> +++ b/docs/multiple-compression-threads.txt

Should probably have migration in the title?

> +Usage
> +======
> +1. Verify the destination QEMU version is able to support the multiple
> +compression threads migration:
> +    {qemu} info_migrate_capablilites
> +    {qemu} ... compress: off ...
> +
> +2. Activate compression on the souce:
> +    {qemu} migrate_set_capability compress on
> +
> +3. Set the compression thread count on source:
> +    {qemu} migrate_set_compress_threads 10
> +
> +4. Set the compression level on the source:
> +    {qemu} migrate_set_compress_level 1
> +
> +5. Set the decompression thread count on destination:
> +    {qemu} migrate_set_decompress_threads 5
> +
> +6. Start outgoing migration:
> +    {qemu} migrate -d tcp:destination.host:4444
> +    {qemu} info migrate
> +    Capabilities: ... compress: on
> +    ...
> +
> +TODO
> +====
> +Some faster compression/decompression method such as lz4 and quicklz
> +can help to reduce the CPU consumption when doing (de)compression.
> +Less (de)compression threads are needed when doing the migration.

OK, some high level questions:

1) How does the performance compare to running a separate compressor
process in the stream rather than embedding it in the qemu?

2) Since you're looking at different compression schemes do we need
something in the settings to select it, and to say what makes sense
for the 'compress_level'?  For example I don't know if lz4 or quicklz
have 1-10 for their compression levels?

How do I know which compression schemes are available on any host?

Dave
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
On 11/06/2014 02:24 PM, Dr. David Alan Gilbert wrote:
> * Li Liang (liang.z.li@intel.com) wrote:
>> Give some details about the multiple compression threads and how
>> to use it in live migration.
>>
>> Signed-off-by: Li Liang <liang.z.li@intel.com>
>> ---
>> +TODO
>> +====
>> +Some faster compression/decompression method such as lz4 and quicklz
>> +can help to reduce the CPU consumption when doing (de)compression.
>> +Less (de)compression threads are needed when doing the migration.
>
> OK, some high level questions:
> 1) How does the performance compare to running a separate compressor
> process in the stream rather than embedding it in the qemu?

Interesting question.  I wonder if libvirt should be extended to
optionally insert a compression/decompression filter in the setups it
creates.  Remember, in libvirt tunnelled mode, where libvirt is adding
TLS encryption on top of the migration data stream so that it is not
sniffable from TCP, all data is already going through the path:

  source qemu -> source libvirt -> destination libvirt -> destination qemu
        Unix socket/pipe      TCP socket          Unix socket/pipe

Furthermore, libvirt is ALREADY wired up to use external compression
when doing migration to file (such as supporting multiple compression
formats for 'virsh save'), which looks like:

  qemu -> compressor -> libvirt I/O helper -> file
      pipe         pipe            O_DIRECT file ops

then restoring that image with:

  file -> libvirt I/O helper -> decompressor -> qemu
    O_DIRECT file ops       pipe           pipe

So adding compression in the mix seems like it would be easy for
libvirt to do:

  source qemu -> compressor -> source libvirt -> destination libvirt ...
          pipe           pipe              TCP socket

  ... -> decompressor -> destination qemu
     pipe           pipe

Of course, with an external processor, I don't know if you can get
speedups from having multiple compression threads when all input is
coming serially from a single connection, so your approach of folding
in parallel compression threads directly into qemu may still have some
speed merits.

On the other hand, I'm not sure how your solution is multiplexing the
multiple compression threads into a single migration stream; if you
are still bottlenecked by a single migration stream, what good do you
get by adding multiple (de)compression threads, without some way in
the migration protocol to cleanly call out a fair rotation from the
independent sub-stream of each thread?
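[Editor's note: the external-compressor pipe chains discussed above can be
mimicked in a few lines, using gzip as the stand-in compressor and subprocess
pipes standing in for the qemu/libvirt ends of each hop. This is a sketch of
the idea only; no qemu is involved and the sample stream is made up.]

```python
import subprocess

# Stand-in for the outgoing migration stream (a real one would arrive
# from qemu over a pipe); repetitive data mimics compressible guest RAM.
stream = b"guest ram page " * 65536

# source qemu -> compressor: feed the stream into an external gzip process.
compressed = subprocess.run(
    ["gzip", "-c"], input=stream, stdout=subprocess.PIPE, check=True
).stdout

# ... TCP socket ... -> decompressor -> destination qemu: undo it on arrival.
restored = subprocess.run(
    ["gzip", "-dc"], input=compressed, stdout=subprocess.PIPE, check=True
).stdout

assert restored == stream              # the filter is transparent end to end
assert len(compressed) < len(stream)   # and fewer bytes cross the wire
```

As Eric notes, a single external process like this compresses the stream
serially, which is exactly where in-qemu parallel compression threads could
still win.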
> OK, some high level questions:
>
> 1) How does the performance compare to running a separate compressor
> process in the stream rather than embedding it in the qemu?

I have not done that test, so I don't know the performance.  Maybe I
can do it later.

> 2) Since you're looking at different compression schemes do we need
> something in the settings to select it, and to say what makes sense
> for the 'compress_level'?  For example I don't know if lz4 or quicklz
> have 1-10 for their compression levels?
>
> How do I know which compression schemes are available on any host?

Only LZ4HC supports a compression level, which ranges from 0 to 16.
My implementation does not support selecting different compression
schemes; it only supports selecting a compression level.  Using LZ4HC
can actually help to improve the performance compared to using zlib;
on the other hand, it's not as widespread as zlib, and the license is
another problem.

Liang
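[Editor's note: to make the multiplexing question raised upthread concrete,
here is a toy model of fanning pages out to compression threads and folding
the results back into one ordered stream. The length-prefixed framing here is
invented purely for illustration and is not the patch's actual wire protocol;
zlib stands in for whichever compressor is used.]

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

# Toy model: compress many 4 KiB "guest pages" in parallel, then
# serialize the results, length-prefixed, into a single stream in page
# order.  (zlib releases the GIL, so the threads genuinely overlap.)
PAGE = 4096
pages = [bytes([i % 251]) * PAGE for i in range(64)]

with ThreadPoolExecutor(max_workers=8) as pool:   # the "compress threads"
    blobs = list(pool.map(lambda p: zlib.compress(p, 1), pages))

stream = b"".join(len(b).to_bytes(4, "big") + b for b in blobs)

# Receiver side: walk the single stream, decompressing page by page.
out, off = [], 0
while off < len(stream):
    n = int.from_bytes(stream[off:off + 4], "big")
    off += 4
    out.append(zlib.decompress(stream[off:off + n]))
    off += n

assert out == pages
assert len(stream) < sum(len(p) for p in pages)
```

Note that this naive scheme still serializes output in strict page order;
Eric's question about a "fair rotation" of per-thread sub-streams is about
avoiding exactly that kind of head-of-line blocking.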
diff --git a/docs/multiple-compression-threads.txt b/docs/multiple-compression-threads.txt
new file mode 100644
index 0000000..a5e53de
--- /dev/null
+++ b/docs/multiple-compression-threads.txt
@@ -0,0 +1,128 @@
+Use multiple (de)compression threads in live migration
+=================================================================
+Copyright (C) 2014 Li Liang <liang.z.li@intel.com>
+
+
+Contents:
+=========
+* Introduction
+* When to use
+* Performance
+* Usage
+* TODO
+
+Introduction
+============
+Instead of sending the guest memory directly, this solution will
+compress the ram page before sending, after receiving, the data will
+be decompressed. Using compression in live migration can help
+to reduce the data transferred about 60%, this is very useful when the
+bandwidth is limited, and the migration time can also be reduced about
+70% in a typical case.
+
+The process of compression will consume additional CPU cycles, and the
+extra CPU cycles will increase the migration time. On the other hand,
+the amount of data transferred will reduced, this factor can reduce
+the migration time. If the process of the compression is quick
+enough, then the total migration time can be reduced, and multiple
+compression threads can be used to accelerate the compression process.
+
+The decompression speed of zlib is at least 4 times as quickly as
+compression, if the source and destination CPU have equal speed,
+keeping the compression thread count 4 times the decompression
+thread count can avoid CPU waste.
+
+Compression level can be used to control the compression speed and the
+compression ratio. High compression ratio will take more time, level 0
+stands for no compression, level 1 stands for the best compression
+speed, and level 9 stands for the best compression ratio. Users can
+select a level number between 0 and 9.
+
+
+When to use the multiple compression threads in live migration
+==============================================================
+Compression of data will consume lot of extra CPU cycles, in a system
+with high overhead of CPU, avoid using this feature. When the network
+bandwidth is very limited and the CPU resource is adequate, use the
+multiple compression threads will be very helpful. If both the CPU and
+the network bandwidth are adequate, use multiple compression threads
+can still help to reduce the migration time.
+
+Performance
+===========
+Test environment:
+
+CPU: Intel(R) Xeon(R) CPU E5-2680 0 @ 2.70GHz
+Socket Count: 2
+Ram: 128G
+NIC: Intel I350 (10/100/1000Mbps)
+Host OS: CentOS 7 64-bit
+Guest OS: Ubuntu 12.10 64-bit
+Parameter: qemu-system-x86_64 -enable-kvm -m 1024
+           /share/ia32e_ubuntu12.10.img -monitor stdio
+
+There is no additional application is running on the guest when doing
+the test.
+
+
+Speed limit: 32MB/s
+---------------------------------------------------------------
+                    | original  | compress thread: 8
+                    | way       | decompress thread: 2
+                    |           | compression level: 1
+---------------------------------------------------------------
+total time(msec):   | 26561     | 7920
+---------------------------------------------------------------
+transferred ram(kB):| 877054    | 260641
+---------------------------------------------------------------
+throughput(mbps):   | 270.53    | 269.68
+---------------------------------------------------------------
+total ram(kB):      | 1057604   | 1057604
+---------------------------------------------------------------
+
+
+Speed limit: No
+---------------------------------------------------------------
+                    | original  | compress thread: 15
+                    | way       | decompress thread: 4
+                    |           | compression level: 1
+---------------------------------------------------------------
+total time(msec):   | 7611      | 2888
+---------------------------------------------------------------
+transferred ram(kB):| 876761    | 262301
+---------------------------------------------------------------
+throughput(mbps):   | 943.78    | 744.27
+---------------------------------------------------------------
+total ram(kB):      | 1057604   | 1057604
+---------------------------------------------------------------
+
+Usage
+======
+1. Verify the destination QEMU version is able to support the multiple
+compression threads migration:
+    {qemu} info_migrate_capablilites
+    {qemu} ... compress: off ...
+
+2. Activate compression on the souce:
+    {qemu} migrate_set_capability compress on
+
+3. Set the compression thread count on source:
+    {qemu} migrate_set_compress_threads 10
+
+4. Set the compression level on the source:
+    {qemu} migrate_set_compress_level 1
+
+5. Set the decompression thread count on destination:
+    {qemu} migrate_set_decompress_threads 5
+
+6. Start outgoing migration:
+    {qemu} migrate -d tcp:destination.host:4444
+    {qemu} info migrate
+    Capabilities: ... compress: on
+    ...
+
+TODO
+====
+Some faster compression/decompression method such as lz4 and quicklz
+can help to reduce the CPU consumption when doing (de)compression.
+Less (de)compression threads are needed when doing the migration.
Give some details about the multiple compression threads and how
to use it in live migration.

Signed-off-by: Li Liang <liang.z.li@intel.com>
---
 docs/multiple-compression-threads.txt | 128 ++++++++++++++++++++++++++++++++++
 1 file changed, 128 insertions(+)
 create mode 100644 docs/multiple-compression-threads.txt