From patchwork Thu May 11 12:07:28 2017
X-Patchwork-Submitter: "Zhoujian (jay)"
X-Patchwork-Id: 761083
From: "Zhoujian (jay)"
To: "Dr. David Alan Gilbert" , yanghongyang
Cc: Zhanghailiang , "quintela@redhat.com" , "wangxin (U)" ,
 "qemu-devel@nongnu.org" , "Gonglei (Arei)" , Huangzhichao ,
 "pbonzini@redhat.com" , "Herongguang (Stephen)"
Date: Thu, 11 May 2017 12:07:28 +0000
Subject: Re: [Qemu-devel] About QEMU BQL and dirty log switch in Migration
In-Reply-To: <20170424164244.GJ2362@work-vm>
References: <830bfc39-56c7-a901-9ebb-77d6e7a5614c@huawei.com>
 <874lxeovrg.fsf@secure.mitica>
 <7cd332ec-48d4-1feb-12e2-97b50b04e028@huawei.com>
 <20170424164244.GJ2362@work-vm>

Hi all,

After applying the patch below, the time taken by the
memory_global_dirty_log_stop() function is down to milliseconds for a 4T
memory guest, but I'm not sure whether this patch will trigger other
problems. Does this patch make sense?
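For anyone who wants to reproduce the measurement, a minimal sketch of how
the stop path could be timed from inside QEMU follows. The wrapper function,
its name, and where it would be called are hypothetical; only
memory_global_dirty_log_stop() is the real QEMU function under discussion
(it takes no arguments in this QEMU version):

/*
 * Rough, hypothetical instrumentation sketch (not part of the patch):
 * one way to time the dirty-log-stop path from inside QEMU.  Only
 * memory_global_dirty_log_stop() is the real QEMU call being discussed;
 * the wrapper and its call site are made up for illustration.
 */
#include <stdio.h>
#include <time.h>

#include "exec/memory.h"    /* declares memory_global_dirty_log_stop() */

static void timed_global_dirty_log_stop(void)
{
    struct timespec t0, t1;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    memory_global_dirty_log_stop();     /* runs with the BQL held */
    clock_gettime(CLOCK_MONOTONIC, &t1);

    fprintf(stderr, "memory_global_dirty_log_stop: %ld ms\n",
            (t1.tv_sec - t0.tv_sec) * 1000 +
            (t1.tv_nsec - t0.tv_nsec) / 1000000);
}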
> * Yang Hongyang (yanghongyang@huawei.com) wrote:
> >
> > On 2017/4/24 20:06, Juan Quintela wrote:
> > > Yang Hongyang wrote:
> > >> Hi all,
> > >>
> > >> We found dirty log switch costs more then 13 seconds while
> > >> migrating a 4T memory guest, and dirty log switch is currently
> > >> protected by QEMU BQL. This causes guest freeze for a long time
> > >> when switching dirty log on, and the migration downtime is
> > >> unacceptable.
> > >> Are there any chance to optimize the time cost for dirty log switch
> > >> operation?
> > >> Or move the time consuming operation out of the QEMU BQL?
> > >
> > > Hi
> > >
> > > Could you specify what do you mean by dirty log switch?
> > > The one inside kvm?
> > > The merge between kvm one and migration bitmap?
> >
> > The call of the following functions:
> > memory_global_dirty_log_start/stop();
>
> I suppose there's a few questions;
> a) Do we actually need the BQL - and if so why
> b) What actually takes 13s?  It's probably worth figuring out where it
>    goes, the whole bitmap is only 1GB isn't it even on a 4TB machine, and
>    even the simplest way to fill that takes way less than 13s.
>
> Dave
>
> > >
> > > Thanks, Juan.
> > >
> > --
> > Thanks,
> > Yang
>
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

Regards,
Jay Zhou

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 464da93..fe26ee5 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -8313,6 +8313,8 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
 				enum kvm_mr_change change)
 {
 	int nr_mmu_pages = 0;
+	int i;
+	struct kvm_vcpu *vcpu;
 
 	if (!kvm->arch.n_requested_mmu_pages)
 		nr_mmu_pages = kvm_mmu_calculate_mmu_pages(kvm);
@@ -8328,14 +8330,15 @@ void kvm_arch_commit_memory_region(struct kvm *kvm,
 	 * in the source machine (for example if live migration fails), small
 	 * sptes will remain around and cause bad performance.
 	 *
-	 * Scan sptes if dirty logging has been stopped, dropping those
-	 * which can be collapsed into a single large-page spte.  Later
-	 * page faults will create the large-page sptes.
+	 * Reset each vcpu's mmu, then page faults will create the large-page
+	 * sptes later.
 	 */
 	if ((change != KVM_MR_DELETE) &&
 		(old->flags & KVM_MEM_LOG_DIRTY_PAGES) &&
-		!(new->flags & KVM_MEM_LOG_DIRTY_PAGES))
-		kvm_mmu_zap_collapsible_sptes(kvm, new);
+		!(new->flags & KVM_MEM_LOG_DIRTY_PAGES)) {
+		kvm_for_each_vcpu(i, vcpu, kvm)
+			kvm_mmu_reset_context(vcpu);
+	}
 
 	/*
 	 * Set up write protection and/or dirty logging for the new slot.
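As a rough illustration of why the change helps: the old path scans the
slot's last-level sptes (as the removed comment in the patch describes), so
its cost grows with guest memory, while the patched path does one MMU reset
per vCPU and defers the large-page rebuild to later page faults. A
back-of-envelope comparison is below; the vCPU count is an arbitrary
example, not taken from the report above.

/*
 * Back-of-envelope comparison only: work done on the dirty-log-stop path
 * by the old code (scan every last-level spte of the slot) versus the
 * patched code (one MMU reset per vCPU).  Assumes the 4T guest is backed
 * almost entirely by 4 KiB sptes while dirty logging is enabled.
 */
#include <stdio.h>

int main(void)
{
    unsigned long long guest_mem = 4ULL << 40;  /* 4 TiB of guest memory */
    unsigned long long spte_size = 4ULL << 10;  /* 4 KiB sptes during dirty logging */
    int nr_vcpus = 64;                          /* arbitrary example value */

    printf("sptes the old path would scan:  %llu\n", guest_mem / spte_size);
    printf("MMU resets in the patched path: %d\n", nr_vcpus);
    return 0;
}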