From patchwork Mon May 11 11:10:44 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Lukas Straub
X-Patchwork-Id: 1287612
Date: Mon, 11 May 2020 13:10:44 +0200
From: Lukas Straub
To: qemu-devel
Cc: Hailiang Zhang, "Dr. David Alan Gilbert", Juan Quintela
Subject: [PATCH 1/6] migration/colo.c: Use event instead of semaphore
X-BeenThere: qemu-devel@nongnu.org

If multiple packets miscompare in a short timeframe, the semaphore
value will be increased multiple times. This causes multiple
checkpoints even if one would be sufficient.

Fix this by using an event instead of a semaphore for triggering
checkpoints. Now, checkpoint requests will be ignored until the
checkpoint event is sent to colo-compare (which releases the
miscompared packets).

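As a rough illustration of the difference, here is a single-threaded toy
model. It is not QEMU code: ToySem, ToyEvent and the checkpoints_with_*
helpers are hypothetical names made up for this sketch, whereas the real
QemuSemaphore and QemuEvent primitives are thread-safe and block. The point
is only that a counting semaphore remembers every post, while an event
remembers just that it was set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy counting semaphore: every post is remembered individually. */
typedef struct { unsigned count; } ToySem;

void toy_sem_post(ToySem *s)
{
    assert(s != NULL);
    s->count++;
}

/* Non-blocking wait: consumes one post if there is one. */
bool toy_sem_trywait(ToySem *s)
{
    assert(s != NULL);
    if (s->count == 0) {
        return false;
    }
    s->count--;
    return true;
}

/* Toy event: setting it repeatedly has no additional effect. */
typedef struct { bool set; } ToyEvent;

void toy_event_set(ToyEvent *e)
{
    assert(e != NULL);
    e->set = true;
}

void toy_event_reset(ToyEvent *e)
{
    assert(e != NULL);
    e->set = false;
}

/* Checkpoints run after `requests` back-to-back miscompare notifications. */
unsigned checkpoints_with_semaphore(unsigned requests)
{
    ToySem sem = { 0 };
    unsigned done = 0;

    for (unsigned i = 0; i < requests; i++) {
        toy_sem_post(&sem);
    }
    while (toy_sem_trywait(&sem)) {
        done++;                 /* one checkpoint per remembered post */
    }
    return done;
}

unsigned checkpoints_with_event(unsigned requests)
{
    ToyEvent ev = { false };
    unsigned done = 0;

    for (unsigned i = 0; i < requests; i++) {
        toy_event_set(&ev);     /* later requests are absorbed */
    }
    while (ev.set) {
        toy_event_reset(&ev);   /* reset before the checkpoint is announced */
        done++;
    }
    return done;
}
```

In this model, three back-to-back requests lead to three checkpoints with
the semaphore and to a single checkpoint with the event.
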
Benchmark results (iperf3):
Client-to-server tcp:
without patch: ~66 Mbit/s
with patch: ~61 Mbit/s
Server-to-client tcp:
without patch: ~702 Kbit/s
with patch: ~16 Mbit/s

Signed-off-by: Lukas Straub
Reviewed-by: zhanghailiang
---
 migration/colo.c      | 9 +++++----
 migration/migration.h | 4 ++--
 2 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/migration/colo.c b/migration/colo.c
index a54ac84f41..09168627bc 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -430,6 +430,7 @@ static int colo_do_checkpoint_transaction(MigrationState *s,
         goto out;
     }
 
+    qemu_event_reset(&s->colo_checkpoint_event);
     colo_notify_compares_event(NULL, COLO_EVENT_CHECKPOINT, &local_err);
     if (local_err) {
         goto out;
@@ -580,7 +581,7 @@ static void colo_process_checkpoint(MigrationState *s)
             goto out;
         }
 
-        qemu_sem_wait(&s->colo_checkpoint_sem);
+        qemu_event_wait(&s->colo_checkpoint_event);
 
         if (s->state != MIGRATION_STATUS_COLO) {
             goto out;
@@ -628,7 +629,7 @@ out:
     colo_compare_unregister_notifier(&packets_compare_notifier);
     timer_del(s->colo_delay_timer);
     timer_free(s->colo_delay_timer);
-    qemu_sem_destroy(&s->colo_checkpoint_sem);
+    qemu_event_destroy(&s->colo_checkpoint_event);
 
     /*
      * Must be called after failover BH is completed,
@@ -645,7 +646,7 @@ void colo_checkpoint_notify(void *opaque)
     MigrationState *s = opaque;
     int64_t next_notify_time;
 
-    qemu_sem_post(&s->colo_checkpoint_sem);
+    qemu_event_set(&s->colo_checkpoint_event);
     s->colo_checkpoint_time = qemu_clock_get_ms(QEMU_CLOCK_HOST);
     next_notify_time = s->colo_checkpoint_time +
                     s->parameters.x_checkpoint_delay;
@@ -655,7 +656,7 @@ void colo_checkpoint_notify(void *opaque)
 void migrate_start_colo_process(MigrationState *s)
 {
     qemu_mutex_unlock_iothread();
-    qemu_sem_init(&s->colo_checkpoint_sem, 0);
+    qemu_event_init(&s->colo_checkpoint_event, false);
     s->colo_delay_timer = timer_new_ms(QEMU_CLOCK_HOST,
                                 colo_checkpoint_notify, s);
 
diff --git a/migration/migration.h b/migration/migration.h
index 507284e563..f617960522 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -215,8 +215,8 @@ struct MigrationState
     /* The semaphore is used to notify COLO thread that failover is finished */
     QemuSemaphore colo_exit_sem;
 
-    /* The semaphore is used to notify COLO thread to do checkpoint */
-    QemuSemaphore colo_checkpoint_sem;
+    /* The event is used to notify COLO thread to do checkpoint */
+    QemuEvent colo_checkpoint_event;
     int64_t colo_checkpoint_time;
     QEMUTimer *colo_delay_timer;
 

From patchwork Mon May 11 11:10:48 2020
X-Patchwork-Submitter: Lukas Straub
X-Patchwork-Id: 1287617
Date: Mon, 11 May 2020 13:10:48 +0200
From: Lukas Straub
To: qemu-devel
Cc: Hailiang Zhang, "Dr. David Alan Gilbert", Juan Quintela
Subject: [PATCH 2/6] migration/colo.c: Use cpu_synchronize_all_states()
Message-ID: <9675031ce557b73ebd10e7bd20ebbf57f30b177c.1589193382.git.lukasstraub2@web.de>

cpu_synchronize_all_pre_loadvm() marks all vcpus as dirty, so the
registers are loaded from CPUState before we continue running the vm.
However, if we failover during checkpoint, CPUState is not initialized
and the registers are loaded with garbage. This causes guest hangs and
crashes.

Fix this by using cpu_synchronize_all_states(), which initializes
CPUState from the current cpu registers in addition to marking the
vcpus as dirty.

Signed-off-by: Lukas Straub
Reviewed-by: Dr. David Alan Gilbert
---
 migration/colo.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/migration/colo.c b/migration/colo.c
index 09168627bc..6b2ad35aa4 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -696,7 +696,7 @@ static void colo_incoming_process_checkpoint(MigrationIncomingState *mis,
     }
 
     qemu_mutex_lock_iothread();
-    cpu_synchronize_all_pre_loadvm();
+    cpu_synchronize_all_states();
     ret = qemu_loadvm_state_main(mis->from_src_file, mis);
     qemu_mutex_unlock_iothread();
 

From patchwork Mon May 11 11:10:51 2020
X-Patchwork-Submitter: Lukas Straub
X-Patchwork-Id: 1287618
Date: Mon, 11 May 2020 13:10:51 +0200
From: Lukas Straub
To: qemu-devel
Cc: Hailiang Zhang, "Dr. David Alan Gilbert", Juan Quintela
Subject: [PATCH 3/6] migration/colo.c: Flush ram cache only after receiving device state
Message-ID: <3289d007d494cb0e2f05b1cf4ae6a78d300fede3.1589193382.git.lukasstraub2@web.de>

If we succeed in receiving ram state, but fail receiving the device
state, there will be a mismatch between the two.

Fix this by flushing the ram cache only after the vmstate has been
received.

Signed-off-by: Lukas Straub
Reviewed-by: zhanghailiang
---
 migration/colo.c | 1 +
 migration/ram.c  | 5 +----
 migration/ram.h  | 1 +
 3 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/migration/colo.c b/migration/colo.c
index 6b2ad35aa4..2947363ae5 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -739,6 +739,7 @@ static void colo_incoming_process_checkpoint(MigrationIncomingState *mis,
 
     qemu_mutex_lock_iothread();
     vmstate_loading = true;
+    colo_flush_ram_cache();
     ret = qemu_load_device_state(fb);
     if (ret < 0) {
         error_setg(errp, "COLO: load device state failed");
diff --git a/migration/ram.c b/migration/ram.c
index 04f13feb2e..5baec5fce9 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -3313,7 +3313,7 @@ static bool postcopy_is_running(void)
  * Flush content of RAM cache into SVM's memory.
  * Only flush the pages that be dirtied by PVM or SVM or both.
  */
-static void colo_flush_ram_cache(void)
+void colo_flush_ram_cache(void)
 {
     RAMBlock *block = NULL;
     void *dst_host;
@@ -3585,9 +3585,6 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
     }
     trace_ram_load_complete(ret, seq_iter);
 
-    if (!ret && migration_incoming_in_colo_state()) {
-        colo_flush_ram_cache();
-    }
     return ret;
 }
 
diff --git a/migration/ram.h b/migration/ram.h
index 5ceaff7cb4..2eeaacfa13 100644
--- a/migration/ram.h
+++ b/migration/ram.h
@@ -65,6 +65,7 @@ int ram_dirty_bitmap_reload(MigrationState *s, RAMBlock *rb);
 
 /* ram cache */
 int colo_init_ram_cache(void);
+void colo_flush_ram_cache(void);
 void colo_release_ram_cache(void);
 void colo_incoming_start_dirty_log(void);
 

From patchwork Mon May 11 11:10:55 2020
X-Patchwork-Submitter: Lukas Straub
X-Patchwork-Id: 1287621
Date: Mon, 11 May 2020 13:10:55 +0200
From: Lukas Straub
To: qemu-devel
Cc: Hailiang Zhang, "Dr. David Alan Gilbert", Juan Quintela
Subject: [PATCH 4/6] migration/colo.c: Relaunch failover even if there was an error

If vmstate_loading is true, secondary_vm_do_failover will set failover
status to FAILOVER_STATUS_RELAUNCH and return success without
initiating failover. However, if there is an error during the
vmstate_loading section, failover isn't relaunched. Instead we then
wait for failover on colo_incoming_sem.

Fix this by relaunching failover even if there was an error. Also, to
make this work properly, set vmstate_loading to false when returning
during the vmstate_loading section.

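The interplay between the deferred request and the vmstate_loading flag can
be sketched with a small toy model. Everything below is hypothetical
(ToySecondary, the TOY_* states and the toy_* functions are invented for
this sketch and are not the QEMU failover API); it only mirrors the
behaviour described above: a request made while loading is recorded as
"relaunch", and relaunching only helps if the loading flag was cleared on
the error path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { TOY_NONE, TOY_RELAUNCH, TOY_ACTIVE } ToyStatus;

typedef struct {
    ToyStatus status;
    bool vmstate_loading;
    unsigned failovers_started;
} ToySecondary;

/* A failover request is deferred while device state is being loaded. */
void toy_request_failover(ToySecondary *s)
{
    assert(s != NULL);
    if (s->vmstate_loading) {
        s->status = TOY_RELAUNCH;   /* remember the request, do nothing yet */
        return;
    }
    s->status = TOY_ACTIVE;
    s->failovers_started++;
}

/* Retry a request that was deferred earlier. */
void toy_relaunch_if_pending(ToySecondary *s)
{
    assert(s != NULL);
    if (s->status == TOY_RELAUNCH) {
        s->status = TOY_NONE;
        toy_request_failover(s);
    }
}

/*
 * Scenario: failover is requested while a checkpoint is loading device
 * state, then the load fails. Returns how many failovers actually started.
 */
unsigned toy_failovers_after_load_error(bool clear_loading_on_error)
{
    ToySecondary s = { TOY_NONE, false, 0 };

    s.vmstate_loading = true;       /* checkpoint enters the loading section */
    toy_request_failover(&s);       /* request arrives mid-load: deferred */

    /* The load fails here and the loading section returns early. */
    if (clear_loading_on_error) {
        s.vmstate_loading = false;
    }

    toy_relaunch_if_pending(&s);
    return s.failovers_started;
}
```

In this model, toy_failovers_after_load_error(false) starts no failover
because the relaunched request is deferred again, while
toy_failovers_after_load_error(true) starts exactly one.
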
Signed-off-by: Lukas Straub
Reviewed-by: zhanghailiang
---
 migration/colo.c | 17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/migration/colo.c b/migration/colo.c
index 2947363ae5..a69782efc5 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -743,6 +743,7 @@ static void colo_incoming_process_checkpoint(MigrationIncomingState *mis,
     ret = qemu_load_device_state(fb);
     if (ret < 0) {
         error_setg(errp, "COLO: load device state failed");
+        vmstate_loading = false;
         qemu_mutex_unlock_iothread();
         return;
     }
@@ -751,6 +752,7 @@ static void colo_incoming_process_checkpoint(MigrationIncomingState *mis,
     replication_get_error_all(&local_err);
     if (local_err) {
         error_propagate(errp, local_err);
+        vmstate_loading = false;
         qemu_mutex_unlock_iothread();
         return;
     }
@@ -759,6 +761,7 @@ static void colo_incoming_process_checkpoint(MigrationIncomingState *mis,
     replication_do_checkpoint_all(&local_err);
     if (local_err) {
         error_propagate(errp, local_err);
+        vmstate_loading = false;
         qemu_mutex_unlock_iothread();
         return;
     }
@@ -770,6 +773,7 @@ static void colo_incoming_process_checkpoint(MigrationIncomingState *mis,
 
     if (local_err) {
         error_propagate(errp, local_err);
+        vmstate_loading = false;
         qemu_mutex_unlock_iothread();
         return;
     }
@@ -780,9 +784,6 @@ static void colo_incoming_process_checkpoint(MigrationIncomingState *mis,
     qemu_mutex_unlock_iothread();
 
     if (failover_get_state() == FAILOVER_STATUS_RELAUNCH) {
-        failover_set_state(FAILOVER_STATUS_RELAUNCH,
-                        FAILOVER_STATUS_NONE);
-        failover_request_active(NULL);
         return;
     }
 
@@ -881,6 +882,14 @@ void *colo_process_incoming_thread(void *opaque)
             error_report_err(local_err);
             break;
         }
+
+        if (failover_get_state() == FAILOVER_STATUS_RELAUNCH) {
+            failover_set_state(FAILOVER_STATUS_RELAUNCH,
+                            FAILOVER_STATUS_NONE);
+            failover_request_active(NULL);
+            break;
+        }
+
         if (failover_get_state() != FAILOVER_STATUS_NONE) {
             error_report("failover request");
             break;
@@ -888,8 +897,6 @@ void *colo_process_incoming_thread(void *opaque)
     }
 
 out:
-    vmstate_loading = false;
-
     /*
      * There are only two reasons we can get here, some error happened
      * or the user triggered failover.

From patchwork Mon May 11 11:10:58 2020
X-Patchwork-Submitter: Lukas Straub
X-Patchwork-Id: 1287622
t=1589195460; bh=tUg9o884OMIklfCikslkyZW37XNcYKoP71AiwmXwais=; h=X-UI-Sender-Class:Date:From:To:Cc:Subject:In-Reply-To:References; b=HhJZf8whG4m37KkvMIcoWqfkCyE/Mg7MyLwP2gg7lpwBnG7fzd8pCniv2bw+9BbHM Z63KpGKlgELSbElYbVjWqA2LB/5nOy2ikXUr/wf3l+jyCw/X+WZ65sTpqe6gal2THP dYQaf2SHD8GKkHOVrQvslKU6Vl+3QPaqnxnz8diA= X-UI-Sender-Class: c548c8c5-30a9-4db5-a2e7-cb6cb037b8f9 Received: from luklap ([89.247.255.192]) by smtp.web.de (mrweb005 [213.165.67.108]) with ESMTPSA (Nemesis) id 1M4KNZ-1jXpLu3hQX-000MFN; Mon, 11 May 2020 13:10:59 +0200 Date: Mon, 11 May 2020 13:10:58 +0200 From: Lukas Straub To: qemu-devel Subject: [PATCH 5/6] migration/qemu-file.c: Don't ratelimit a shutdown fd Message-ID: In-Reply-To: References: MIME-Version: 1.0 X-Provags-ID: V03:K1:wRaJQIVGOrJKahlvJBpGxF5IfAavcvQhBJS93SOFVHN+nTFMCV1 bi/9HMv+nsF9Z2f3o9Muz+4NaL7ZCnOZY9nbWsKvgbFhlhVqgMeXfZI3gtKol51E4EYIAMz Mb9KU0wPrYMEFcTeKllvK4gJR09kOJKq/RhKCLg12fn+AaKEWjcm3MjhOFBxXNrr8YYuqhp TiKfbAeByRad7kv94SDiQ== X-UI-Out-Filterresults: notjunk:1;V03:K0:Z2atHHipKfU=:1U5dNSkQ3wUvDdYHthQHdO Al/GrL7FYMkWYRJZRk9+vs+NCTY7AqHGkvNTl4ND7C3RDFWY4I3ykfBIo6yrbdVr3LBjObRlC bzyqSdQHYa4hWl+EAjMNO7pdtz/WumSWYFqTLRneX/PYjYE1tzo3FcrzkZQZfXJ8bhFmGLRqY Kl/y6W10hDlR4gIXOtuljunWI63qhFgCwkso3UhAx7DW58YmAGr3DQ9Py/uyO+Bn7T6Bn895r NXJ0Q13+UVYsYQ2MYQRDZhSRVadK3PWLPE1D7ZDxL8+y/bFu2D/ntYSyNR772aLlUJOuuFZhJ WhGyiCkC2xXbCyxM7dNX5kHnOxxSHOX7fDSRdJ0dWbKxkSj8PVD2MOBFRV0VaKtUX+GAkOqW8 7fd89LKsUgDyax1J8vu1mSVcK9HlrQWPilwMLmHFIEn3qlUN5T04f1kEqSTBFaNpYRSuITeB7 AUMF3qMDwzZnEXxn8bf/yJx2wwcA1jYkmsg9Rbg3GinT42TCyNIAw2XJPnqhQ73Bt60u4GFjy p9KaOug47O4LIlGjAYRQvrwVeGMPN46jC6JH7PIHj8U+6EpjyNr/UqWs4V+HXVuZxrbO6MIMN P2KchcoM5yqxntNKZIsLO4gUHF24MCehH1Bj1YVYgkXBEYVSl21gf/1JSLNAT4OgjiZclBJtw GbpPZvheF9bvEjFuM/L2y5eXfVhECHtlFluC6nGl73e/UBBT5FfopsDIsIwrZtv9Ow6qxGN4u Hyh3VBZivjmPh8FjcP7Xn92OBirJPYg8hUQTHJYzw2CIsJzAHLEjntjP41qAuwKVuxCOQKeZs +kCvdXqJRWX+ET/UsFXejRlQifWLk7QiR9CzzXifnSHYFXxroO6Ax0GkH3fE/nSYfUGyd2nAK 
gPjfwV6zV6+68KPaFWIvQyw+Endutr4TMf2gSN4ke+5zaqbXtJAIYQnuAHCM2xjDBBxigv43Q rEaKAK18hN1MjjM1WzjFD3hjc4vIBeszSHa0Gnsl83HLEt2Lv0hyxNzfSqIZjqXC+wlwTBRGZ FbYccZB4UsJ61s8wyiUPzLzrJGnyrDJ5/ED+ZEzsg1luPxLVQia6wHEDZNpvShicDZhVVX4am xsanau6M4irFkoW/c6SBbLH06rtWE1NVkcPWFs5cfhOJoMZEfMk7BtZvzhVRrGq7uzy5z7QQI hEZoflI3OZCOvA6oEjpUvlt22DGKmwkHYc1CvkB4zbH1ghZumeJBomhERmlUH6NoL194HijCT rfzs4vgFx+HbE2rnF Received-SPF: pass client-ip=212.227.15.14; envelope-from=lukasstraub2@web.de; helo=mout.web.de X-detected-operating-system: by eggs.gnu.org: First seen = 2020/05/11 07:10:54 X-ACL-Warn: Detected OS = Linux 2.2.x-3.x [generic] X-Spam_score_int: -25 X-Spam_score: -2.6 X-Spam_bar: -- X-Spam_report: (-2.6 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, FREEMAIL_ENVFROM_END_DIGIT=0.25, FREEMAIL_FROM=0.001, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H2=-0.001, SPF_PASS=-0.001 autolearn=_AUTOLEARN X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Hailiang Zhang , "Dr. David Alan Gilbert" , Juan Quintela Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" This causes the migration thread to hang if we failover during checkpoint. A shutdown fd won't cause network traffic anyway. 
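To illustrate the hang described above, here is a toy model in C. It is
not QEMU code: ToyFile, toy_rate_limit, toy_put and toy_send are
hypothetical names, and the polling sender is only an assumption about how
a caller might react to a nonzero rate-limit result. The sketch shows that
a sender which keeps waiting while "rate limited" never reaches the write
when a shut-down file reports 1, but reaches it (and sees the error) when
the file reports 0.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for QEMUFile; only what the sketch needs. */
typedef struct ToyFile {
    bool shutdown;  /* set once the fd has been shut down */
    int error;      /* sticky error code, 0 if none */
} ToyFile;

/*
 * Same shape as qemu_file_rate_limit(): nonzero means "stop sending for
 * now". shutdown_result selects the behaviour before (1) or after (0)
 * the patch.
 */
static int toy_rate_limit(const ToyFile *f, int shutdown_result)
{
    if (f->shutdown) {
        return shutdown_result;
    }
    if (f->error) {
        return 1;
    }
    return 0;
}

/* Hypothetical write: fails on a shut-down file and records the error. */
static void toy_put(ToyFile *f)
{
    if (f->shutdown) {
        f->error = -5;
    }
}

/*
 * Hypothetical sender that polls until it is no longer throttled.
 * Returns the number of polls it needed to reach the write, or -1 if it
 * was still throttled after max_polls (the "hang" in this model).
 */
static int toy_send(ToyFile *f, int shutdown_result, int max_polls)
{
    assert(max_polls > 0);
    for (int polls = 1; polls <= max_polls; polls++) {
        if (toy_rate_limit(f, shutdown_result)) {
            continue;           /* throttled: try again later */
        }
        toy_put(f);             /* not throttled: attempt the write */
        return polls;
    }
    return -1;
}
```

With shutdown_result set to 0, the sender gets to toy_put() on the first
poll and the error becomes visible to it, which is the behaviour the
one-line change aims for in this simplified picture.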
Signed-off-by: Lukas Straub
---
 migration/qemu-file.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/migration/qemu-file.c b/migration/qemu-file.c
index 1c3a358a14..0748b5810f 100644
--- a/migration/qemu-file.c
+++ b/migration/qemu-file.c
@@ -660,7 +660,7 @@ int64_t qemu_ftell(QEMUFile *f)
 int qemu_file_rate_limit(QEMUFile *f)
 {
     if (f->shutdown) {
-        return 1;
+        return 0;
     }
     if (qemu_file_get_error(f)) {
         return 1;

From patchwork Mon May 11 11:11:01 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Lukas Straub
X-Patchwork-Id: 1287628
Date: Mon, 11 May 2020 13:11:01 +0200
From: Lukas Straub
To: qemu-devel
Cc: Hailiang Zhang, "Dr. David Alan Gilbert", Juan Quintela
Subject: [PATCH 6/6] migration/colo.c: Move colo_notify_compares_event to the right place

If the secondary has to fail over during checkpointing, it is still in the
old state (i.e. a different state than the primary). Thus we can't expose
the primary state until after the checkpoint is sent.

This fixes sporadic resets of client connections during failover.
Signed-off-by: Lukas Straub
Reviewed-by: zhanghailiang
---
 migration/colo.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/migration/colo.c b/migration/colo.c
index a69782efc5..a3fc21e86e 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -430,12 +430,6 @@ static int colo_do_checkpoint_transaction(MigrationState *s,
         goto out;
     }
 
-    qemu_event_reset(&s->colo_checkpoint_event);
-    colo_notify_compares_event(NULL, COLO_EVENT_CHECKPOINT, &local_err);
-    if (local_err) {
-        goto out;
-    }
-
     /* Disable block migration */
     migrate_set_block_enabled(false, &local_err);
     qemu_mutex_lock_iothread();
@@ -494,6 +488,12 @@ static int colo_do_checkpoint_transaction(MigrationState *s,
         goto out;
     }
 
+    qemu_event_reset(&s->colo_checkpoint_event);
+    colo_notify_compares_event(NULL, COLO_EVENT_CHECKPOINT, &local_err);
+    if (local_err) {
+        goto out;
+    }
+
     colo_receive_check_message(s->rp_state.from_dst_file,
                        COLO_MESSAGE_VMSTATE_LOADED, &local_err);
     if (local_err) {
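
The ordering argument in the commit message of this last patch can be
sketched with a toy model in C. It is not QEMU code: ToyPair,
toy_checkpoint and toy_clients_consistent are hypothetical names, states
are reduced to version numbers, and "checkpoint sent" is assumed to be the
point from which the secondary can resume in the new state. The sketch
compares notifying (exposing the primary state to clients) before and
after the checkpoint is sent when a failover interrupts the checkpoint.

```c
#include <stdbool.h>

/* Hypothetical model of a primary/secondary pair, states as versions. */
typedef struct ToyPair {
    int primary_state;    /* state the primary is currently running */
    int secondary_state;  /* newest state the secondary can resume from */
    int exposed_state;    /* newest state clients have observed */
} ToyPair;

/*
 * One checkpoint attempt. notify_first models the old placement of the
 * notification (before the state is sent); otherwise it happens after.
 * failover_midway models a failover before the state reaches the
 * secondary. Returns true if the checkpoint completed.
 */
static bool toy_checkpoint(ToyPair *p, bool notify_first, bool failover_midway)
{
    if (notify_first) {
        p->exposed_state = p->primary_state;    /* old placement */
    }
    if (failover_midway) {
        return false;   /* secondary takes over in its old state */
    }
    p->secondary_state = p->primary_state;      /* checkpoint sent */
    if (!notify_first) {
        p->exposed_state = p->primary_state;    /* new placement */
    }
    return true;
}

/* Clients stay consistent if they never saw more than the survivor has. */
static bool toy_clients_consistent(const ToyPair *p)
{
    return p->exposed_state <= p->secondary_state;
}
```

In this model, only the combination of notifying first and failing over
midway leaves clients having seen a state the secondary does not have,
which corresponds to the connection resets the patch describes.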