Message ID: 1364291919-19563-7-git-send-email-pl@kamp.de
State: New
On 26.03.2013 at 10:58, Peter Lieven wrote:
> virtually all dup pages are zero pages. remove
> the special is_dup_page() function and use the
> optimized buffer_find_nonzero_offset() function
> instead.
>
> here buffer_find_nonzero_offset() is used directly
> to avoid the unnecessary additional checks in
> buffer_is_zero().
>
> raw performance gain checking 1 GByte zeroed memory
> over is_dup_page() is approx. 10-12% with SSE2
> and 8-10% with unsigned long arithmetic.
>
> Signed-off-by: Peter Lieven <pl@kamp.de>
> Reviewed-by: Orit Wasserman <owasserm@redhat.com>
> Reviewed-by: Eric Blake <eblake@redhat.com>

Okay, so I bisected again and this is the second patch that is involved
in the slowness of qemu-iotests case 007.

The problem seems to be that the RAM of a guest is in fact _not_ zeroed
during initialisation. It hits my test case reliably because I'm running
with MALLOC_PERTURB_. Now I'm wondering if in practice this happens only
under such test conditions, or if real guests could be affected as well
and we should make sure to get zeroed memory for RAM.

Kevin
On 05/04/2013 21:23, Kevin Wolf wrote:
>> [...]
> Okay, so I bisected again and this is the second patch that is involved
> in the slowness of qemu-iotests case 007.
>
> The problem seems to be that the RAM of a guest is in fact _not_ zeroed
> during initialisation. It hits my test case reliably because I'm running
> with MALLOC_PERTURB_. Now I'm wondering if in practice this happens only
> under such test conditions, or if real guests could be affected as well
> and we should make sure to get zeroed memory for RAM.

I think we should MADV_DONTNEED it.

Paolo
On 05.04.2013 at 22:00, Paolo Bonzini <pbonzini@redhat.com> wrote:
> On 05/04/2013 21:23, Kevin Wolf wrote:
>> [...]
>
> I think we should MADV_DONTNEED it.

This does not guarantee that the memory is unmapped, afaik. Sadly, I think
we have to revert "migration: do not sent zero pages in bulk stage".

The memory assigned by posix_memalign() is most likely zero, as all
GFP_USER pages are zeroed out by the kernel on allocation (at least under
Linux), but if a page is reused in the same process it is not necessarily
zero anymore.

What I was trying to address with this patch: the memset() performed when
receiving a zero page at the target was allocating memory, and the
MADV_DONTNEED was not immediately dropping the page again. This led to
memory pressure and swapping etc. on overcommitted systems.
What I would propose as a solution, after reverting the "do not sent zero
pages" patch, is something like this when receiving a compressed page:

    if (ch != 0 || !is_zero_page(host)) {
        memset(host, ch, TARGET_PAGE_SIZE);
    }

Regarding Kevin's observation of the speed regression in iotest 007: if
MALLOC_PERTURB_ is used there are simply no zero pages, only dup pages in
memory. On a real system the observation is that pages are either zero or
not dup at all.

Peter

> Paolo
On 05.04.2013 at 22:00, Paolo Bonzini <pbonzini@redhat.com> wrote:
> On 05/04/2013 21:23, Kevin Wolf wrote:
>> [...]
>
> I think we should MADV_DONTNEED it.

I have to correct myself: on Linux it seems that MADV_DONTNEED guarantees
that subsequent reads will return a zero-filled page.

If I am right with that, we could MADV_DONTNEED the memory on startup and
keep not sending zero pages in the bulk stage if we are running under
Linux. Am I right with that?

Peter
On 06/04/2013 00:06, Peter Lieven wrote:
>>> I think we should MADV_DONTNEED it.
> I have to correct myself: on Linux it seems that MADV_DONTNEED guarantees
> that subsequent reads will return a zero-filled page.
>
> If I am right with that, we could MADV_DONTNEED the memory on startup and
> keep not sending zero pages in the bulk stage if we are running under
> Linux. Am I right with that?

Actually we can use mmap+munmap to allocate a well-aligned, always-zero
block for the RAM.

Paolo
diff --git a/arch_init.c b/arch_init.c
index 35974c2..dd5deff 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -146,19 +146,10 @@ int qemu_read_default_config_files(bool userconfig)
     return 0;
 }
 
-static int is_dup_page(uint8_t *page)
+static inline bool is_zero_page(uint8_t *p)
 {
-    VECTYPE *p = (VECTYPE *)page;
-    VECTYPE val = SPLAT(page);
-    int i;
-
-    for (i = 0; i < TARGET_PAGE_SIZE / sizeof(VECTYPE); i++) {
-        if (!ALL_EQ(val, p[i])) {
-            return 0;
-        }
-    }
-
-    return 1;
+    return buffer_find_nonzero_offset(p, TARGET_PAGE_SIZE) ==
+           TARGET_PAGE_SIZE;
 }
 
 /* struct contains XBZRLE cache and a static page
@@ -445,12 +436,12 @@ static int ram_save_block(QEMUFile *f, bool last_stage)
             /* In doubt sent page as normal */
             bytes_sent = -1;
 
-            if (is_dup_page(p)) {
+            if (is_zero_page(p)) {
                 acct_info.dup_pages++;
                 bytes_sent = save_block_hdr(f, block, offset, cont,
                                             RAM_SAVE_FLAG_COMPRESS);
-                qemu_put_byte(f, *p);
-                bytes_sent += 1;
+                qemu_put_byte(f, 0);
+                bytes_sent++;
             } else if (migrate_use_xbzrle()) {