From patchwork Tue Mar 10 09:14:43 2020
X-Patchwork-Submitter: "Longpeng (Mike, Cloud Infrastructure Service Product Dept.)"
X-Patchwork-Id: 1252038
From: "Longpeng(Mike)"
Subject: [RFC] cpus: avoid get stuck in pause_all_vcpus
Date: Tue, 10 Mar 2020 17:14:43 +0800
Message-ID: <20200310091443.1326-1-longpeng2@huawei.com>

From: Longpeng

We found an issue when the guest is rebooted repeatedly during migration:
it can cause the migration thread to never be woken up again.
       main thread                      |       migration thread
                                        |
LOCK BQL                                |
...                                     |
main_loop_should_exit                   |
  pause_all_vcpus                       |
    1. set all cpus ->stop=true         |
       and then kick                    |
    2. return if all cpus are paused    |
       (by '->stopped == true'), else   |
    3. qemu_cond_wait [BQL UNLOCK]      |
                                        | LOCK BQL
                                        | ...
                                        | do_vm_stop
                                        |   pause_all_vcpus
                                        |     (A) set all cpus ->stop=true
                                        |         and then kick
                                        |     (B) return if all cpus are paused
                                        |         (by '->stopped == true'), else
                                        |     (C) qemu_cond_wait [BQL UNLOCK]
    4. be woken up and LOCK BQL         |     (D) be woken up BUT wait for BQL
    5. goto 2.                          |         (BQL is still LOCKed)
  resume_all_vcpus                      |
    1. set all cpus ->stop=false        |
       and ->stopped=false              |
    ...                                 |
BQL UNLOCK                              |     (E) LOCK BQL
                                        |     (F) goto B. [but stopped is
                                        |         false now!]
                                        |
Finally, the migration thread sleeps at step (C) forever.

Note: This patch is just for discussing this issue; I'm looking forward to
your suggestions, thanks!

Cc: Peter Maydell
Cc: Dr. David Alan Gilbert
Cc: qemu-devel@nongnu.org
Signed-off-by: Longpeng
---
 cpus.c | 41 ++++++++++++++++++++++++++++++++++++-----
 1 file changed, 36 insertions(+), 5 deletions(-)

diff --git a/cpus.c b/cpus.c
index b4f8b84..15e8b21 100644
--- a/cpus.c
+++ b/cpus.c
@@ -1857,10 +1857,30 @@ static bool all_vcpus_paused(void)
     return true;
 }
 
+static bool all_vcpus_resumed(void)
+{
+    CPUState *cpu;
+
+    CPU_FOREACH(cpu) {
+        if (cpu->stopped) {
+            return false;
+        }
+    }
+
+    return true;
+}
+
 void pause_all_vcpus(void)
 {
     CPUState *cpu;
 
+    /* We need to drop the replay_lock so any vCPU threads woken up
+     * can finish their replay tasks
+     */
+retry_unlock:
+    replay_mutex_unlock();
+
+retry_pause:
     qemu_clock_enable(QEMU_CLOCK_VIRTUAL, false);
     CPU_FOREACH(cpu) {
         if (qemu_cpu_is_self(cpu)) {
@@ -1871,13 +1891,17 @@ void pause_all_vcpus(void)
         }
     }
 
-    /* We need to drop the replay_lock so any vCPU threads woken up
-     * can finish their replay tasks
-     */
-    replay_mutex_unlock();
-
     while (!all_vcpus_paused()) {
         qemu_cond_wait(&qemu_pause_cond, &qemu_global_mutex);
+        /*
+         * All of the vcpus may be resumed due to a race with other
+         * threads doing pause && resume, and we would get stuck as a
+         * result, so we need to request the pause again if that occurs.
+         */
+        if (all_vcpus_resumed()) {
+            goto retry_pause;
+        }
+
         CPU_FOREACH(cpu) {
             qemu_cpu_kick(cpu);
         }
@@ -1886,6 +1910,13 @@ void pause_all_vcpus(void)
     qemu_mutex_unlock_iothread();
     replay_mutex_lock();
     qemu_mutex_lock_iothread();
+    /*
+     * The vcpus may be resumed while the mutex is unlocked, so we must
+     * make sure all of the vcpus are paused before returning.
+     */
+    if (!all_vcpus_paused()) {
+        goto retry_unlock;
+    }
 }
 
 void cpu_resume(CPUState *cpu)
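For reference, the recheck-and-retry idea at the heart of the patch can be
shown outside QEMU. Below is a minimal, self-contained pthread sketch; the
names (bql, pause_cond, pause_all, stopper_fn, NR_VCPUS) are toy stand-ins
of mine, not QEMU APIs, and the single stopper thread only exercises the
happy path rather than reproducing the race:

/* pause_retry.c: toy model of the recheck-and-retry pause, not QEMU code.
 * Build with: gcc -pthread pause_retry.c -o pause_retry
 */
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

#define NR_VCPUS 2

/* Toy stand-ins for the BQL, qemu_pause_cond and per-vCPU state. */
static pthread_mutex_t bql = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t pause_cond = PTHREAD_COND_INITIALIZER;
static struct { bool stop; bool stopped; } vcpus[NR_VCPUS];

static bool all_paused(void)
{
    for (int i = 0; i < NR_VCPUS; i++) {
        if (!vcpus[i].stopped) {
            return false;
        }
    }
    return true;
}

static bool all_resumed(void)
{
    for (int i = 0; i < NR_VCPUS; i++) {
        if (vcpus[i].stopped) {
            return false;
        }
    }
    return true;
}

/* Called with 'bql' held, mirroring the shape of pause_all_vcpus(). */
static void pause_all(void)
{
retry:
    for (int i = 0; i < NR_VCPUS; i++) {
        vcpus[i].stop = true;       /* request every vCPU to stop */
    }
    /* (a real implementation would kick the vCPU threads here) */

    while (!all_paused()) {
        /* drops 'bql' while sleeping, like qemu_cond_wait() */
        pthread_cond_wait(&pause_cond, &bql);
        /*
         * While we slept without the lock, a concurrent pause/resume
         * pair may have consumed our wakeup and cleared 'stopped'.
         * If every vCPU is running again, our request is lost and no
         * further signal will come, so re-issue it instead of waiting
         * forever -- this is the race the patch closes.
         */
        if (all_resumed()) {
            goto retry;
        }
    }
}

/* Stand-in for the vCPU side: mark everything stopped and signal. */
static void *stopper_fn(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&bql);
    for (int i = 0; i < NR_VCPUS; i++) {
        vcpus[i].stop = false;
        vcpus[i].stopped = true;
    }
    pthread_cond_broadcast(&pause_cond);
    pthread_mutex_unlock(&bql);
    return NULL;
}

int main(void)
{
    pthread_t t;

    pthread_mutex_lock(&bql);
    pthread_create(&t, NULL, stopper_fn, NULL);
    pause_all();                    /* returns once every vCPU is stopped */
    printf("all vCPUs paused\n");
    pthread_mutex_unlock(&bql);
    pthread_join(t, NULL);
    return 0;
}

The key line is the all_resumed() check after pthread_cond_wait(): because
the lock is dropped while sleeping, a concurrent pause/resume pair may have
cleared 'stopped' before the waiter runs again, in which case the pause
request must be re-issued or the caller waits forever. The patch applies the
same predicate re-check after every point where the BQL is released,
including around replay_mutex_lock() via the retry_unlock label.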