Message ID: 4CE49053.3000608@cn.fujitsu.com
State: New
On 11/17/2010 08:32 PM, Wen Congyang wrote:
> When the total sent page size is larger than max_factor
> times the size of the guest OS's memory, stop the
> iteration.
> The default value of max_factor is 3.
>
> This is similar to Xen.
>
> Signed-off-by: Wen Congyang

I'm strongly opposed to doing this.  I think Xen gets this totally wrong.

Migration is a contract.  When you set the stop time, you're saying that
you only want the guest to experience a fixed amount of downtime.
Stopping the guest after some arbitrary number of iterations makes the
downtime non-deterministic.  With a very large guest, this could wreak
havoc, causing dropped networking connections, etc.

It's totally unsafe.

If a management tool wants this behavior, it can set a timeout and
explicitly stop the guest during the live migration.  IMHO, such a
management tool is not doing its job properly, but it can still be
implemented.

Regards,

Anthony Liguori

> ---
>  arch_init.c |   13 ++++++++++++-
>  1 files changed, 12 insertions(+), 1 deletions(-)
>
> diff --git a/arch_init.c b/arch_init.c
> index 4486925..67e90f8 100644
> --- a/arch_init.c
> +++ b/arch_init.c
> @@ -212,6 +212,14 @@ uint64_t ram_bytes_total(void)
>      return total;
>  }
>
> +static uint64_t ram_blocks_total(void)
> +{
> +    return ram_bytes_total() / TARGET_PAGE_SIZE;
> +}
> +
> +static uint64_t blocks_transferred = 0;
> +static int max_factor = 3;
> +
>  int ram_save_live(Monitor *mon, QEMUFile *f, int stage, void *opaque)
>  {
>      ram_addr_t addr;
> @@ -234,6 +242,7 @@ int ram_save_live(Monitor *mon, QEMUFile *f, int stage, void *opaque)
>          bytes_transferred = 0;
>          last_block = NULL;
>          last_offset = 0;
> +        blocks_transferred = 0;
>
>          /* Make sure all dirty bits are set */
>          QLIST_FOREACH(block, &ram_list.blocks, next) {
> @@ -266,6 +275,7 @@ int ram_save_live(Monitor *mon, QEMUFile *f, int stage, void *opaque)
>
>          bytes_sent = ram_save_block(f);
>          bytes_transferred += bytes_sent;
> +        blocks_transferred += !!bytes_sent;
>          if (bytes_sent == 0) { /* no more blocks */
>              break;
>          }
> @@ -295,7 +305,8 @@ int ram_save_live(Monitor *mon, QEMUFile *f, int stage, void *opaque)
>
>      expected_time = ram_save_remaining() * TARGET_PAGE_SIZE / bwidth;
>
> -    return (stage == 2) && (expected_time <= migrate_max_downtime());
> +    return (stage == 2) && ((expected_time <= migrate_max_downtime())
> +            || (blocks_transferred > ram_blocks_total() * max_factor));
>  }
>
>  static inline void *host_from_stream_offset(QEMUFile *f,
On 2010-11-20 10:23, Anthony Liguori wrote:
> On 11/17/2010 08:32 PM, Wen Congyang wrote:
>> When the total sent page size is larger than max_factor
>> times the size of the guest OS's memory, stop the
>> iteration.
>> The default value of max_factor is 3.
>>
>> This is similar to Xen.
>>
>> Signed-off-by: Wen Congyang
>
> I'm strongly opposed to doing this.  I think Xen gets this totally wrong.
>
> Migration is a contract.  When you set the stop time, you're saying that
> you only want the guest to experience a fixed amount of downtime.
> Stopping the guest after some arbitrary number of iterations makes the
> downtime non-deterministic.  With a very large guest, this could wreak
> havoc, causing dropped networking connections, etc.

Thanks for your comment.

As a developer, I know about the downtime, but a user does not. When he
live-migrates a very large guest without setting the stop time, he is not
saying "I want the guest to experience a fixed amount of downtime"; he
only wants to migrate the guest in a short time. The migration should be
done within some minutes, not run forever. If we set the stop time too
large, this could also wreak havoc, causing dropped networking
connections, etc.

I think we can do it as follows:
1. If the user does not set the stop time, we should complete the
   migration in a short time.
2. If the user sets the stop time, we do it as now.

> It's totally unsafe.
>
> If a management tool wants this behavior, it can set a timeout and
> explicitly stop the guest during the live migration.  IMHO, such a
> management tool is not doing its job properly, but it can still be
> implemented.
>
> Regards,
>
> Anthony Liguori
On Fri, 19 Nov 2010 20:23:55 -0600, Anthony Liguori <anthony@codemonkey.ws> wrote:
> On 11/17/2010 08:32 PM, Wen Congyang wrote:
> > When the total sent page size is larger than max_factor
> > times the size of the guest OS's memory, stop the
> > iteration.
> > The default value of max_factor is 3.
> >
> > This is similar to Xen.
> >
> > Signed-off-by: Wen Congyang
>
> I'm strongly opposed to doing this.  I think Xen gets this totally wrong.
>
> Migration is a contract.  When you set the stop time, you're saying that
> you only want the guest to experience a fixed amount of downtime.
> Stopping the guest after some arbitrary number of iterations makes the
> downtime non-deterministic.  With a very large guest, this could wreak
> havoc, causing dropped networking connections, etc.
>
> It's totally unsafe.
>
> If a management tool wants this behavior, it can set a timeout and
> explicitly stop the guest during the live migration.  IMHO, such a
> management tool is not doing its job properly, but it can still be
> implemented.

Hmm, is there any information available to management tools about "the
reason the migration failed was that it never ends because of new dirty
pages", or something similar? I'd be glad to know that cold migration
will succeed at a high rate before stopping the machine, even when live
migration has failed by timeout. If the network, or the target node
being too busy, is the reason for the failure, cold migration will also
be in trouble and we'll see a longer downtime than expected.

I think it's helpful to show how the transfer went, as in "sent 3x the
guest's pages but failed."

Any idea?

Thanks,
-Kame
On Fri, Nov 19, 2010 at 08:23:55PM -0600, Anthony Liguori wrote:
> On 11/17/2010 08:32 PM, Wen Congyang wrote:
> > When the total sent page size is larger than max_factor
> > times the size of the guest OS's memory, stop the
> > iteration.
> > The default value of max_factor is 3.
> >
> > This is similar to Xen.
> >
> > Signed-off-by: Wen Congyang
>
> I'm strongly opposed to doing this.  I think Xen gets this totally wrong.

Additionally, please don't just change the behaviour of existing
commands. Make any new behaviour opt-in.
diff --git a/arch_init.c b/arch_init.c
index 4486925..67e90f8 100644
--- a/arch_init.c
+++ b/arch_init.c
@@ -212,6 +212,14 @@ uint64_t ram_bytes_total(void)
     return total;
 }
 
+static uint64_t ram_blocks_total(void)
+{
+    return ram_bytes_total() / TARGET_PAGE_SIZE;
+}
+
+static uint64_t blocks_transferred = 0;
+static int max_factor = 3;
+
 int ram_save_live(Monitor *mon, QEMUFile *f, int stage, void *opaque)
 {
     ram_addr_t addr;
@@ -234,6 +242,7 @@ int ram_save_live(Monitor *mon, QEMUFile *f, int stage, void *opaque)
         bytes_transferred = 0;
         last_block = NULL;
         last_offset = 0;
+        blocks_transferred = 0;
 
         /* Make sure all dirty bits are set */
         QLIST_FOREACH(block, &ram_list.blocks, next) {
@@ -266,6 +275,7 @@ int ram_save_live(Monitor *mon, QEMUFile *f, int stage, void *opaque)
 
         bytes_sent = ram_save_block(f);
         bytes_transferred += bytes_sent;
+        blocks_transferred += !!bytes_sent;
         if (bytes_sent == 0) { /* no more blocks */
             break;
         }
@@ -295,7 +305,8 @@ int ram_save_live(Monitor *mon, QEMUFile *f, int stage, void *opaque)
 
     expected_time = ram_save_remaining() * TARGET_PAGE_SIZE / bwidth;
 
-    return (stage == 2) && (expected_time <= migrate_max_downtime());
+    return (stage == 2) && ((expected_time <= migrate_max_downtime())
+            || (blocks_transferred > ram_blocks_total() * max_factor));
 }
 
 static inline void *host_from_stream_offset(QEMUFile *f,
When the total sent page size is larger than max_factor
times the size of the guest OS's memory, stop the
iteration.
The default value of max_factor is 3.

This is similar to Xen.

Signed-off-by: Wen Congyang

---
 arch_init.c |   13 ++++++++++++-
 1 files changed, 12 insertions(+), 1 deletions(-)