[v11,1/7] mm: ksm: Export ksm_madvise()

Message ID	20191125030631.7716-2-bharata@linux.ibm.com (mailing list archive)
State	Not Applicable
Headers	show Return-Path: <linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org> Gateway: Authorized Use Only! Violators will be prosecuted for <linuxppc-dev@lists.ozlabs.org> from <bharata@linux.ibm.com>; Mon, 25 Nov 2019 03:06:45 -0000 Gateway: Authorized Use Only! Violators will be prosecuted; (version=TLSv1/SSLv3 cipher=AES256-GCM-SHA384 bits=256/256) Mon, 25 Nov 2019 03:06:40 -0000 From: Bharata B Rao <bharata@linux.ibm.com> To: linuxppc-dev@lists.ozlabs.org, kvm-ppc@vger.kernel.org, linux-mm@kvack.org Subject: [PATCH v11 1/7] mm: ksm: Export ksm_madvise() Date: Mon, 25 Nov 2019 08:36:25 +0530 In-Reply-To: <20191125030631.7716-1-bharata@linux.ibm.com> References: <20191125030631.7716-1-bharata@linux.ibm.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Message-Id: <20191125030631.7716-2-bharata@linux.ibm.com> Precedence: list Cc: Andrea Arcangeli <aarcange@redhat.com>, Hugh Dickins <hughd@google.com>, linuxram@us.ibm.com, cclaudio@linux.ibm.com, Bharata B Rao <bharata@linux.ibm.com>, jglisse@redhat.com, aneesh.kumar@linux.vnet.ibm.com, paulus@au1.ibm.com, sukadev@linux.vnet.ibm.com, hch@lst.de Errors-To: linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org Sender: "Linuxppc-dev" <linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org>
Series	KVM: PPC: Driver to manage pages of secure guest \| expand [v11,0/7] KVM: PPC: Driver to manage pages of secure guest [v11,1/7] mm: ksm: Export ksm_madvise() [v11,2/7] KVM: PPC: Support for running secure guests [v11,3/7] KVM: PPC: Shared pages support for secure guests [v11,4/7] KVM: PPC: Radix changes for secure guest [v11,5/7] KVM: PPC: Handle memory plug/unplug to secure VM [v11,6/7] KVM: PPC: Support reset of secure guest [v11,7/7] KVM: PPC: Ultravisor: Add PPC_UV config option

Message ID

20191125030631.7716-2-bharata@linux.ibm.com (mailing list archive)

State

Not Applicable

Headers

From: Bharata B Rao <bharata@linux.ibm.com>
To: linuxppc-dev@lists.ozlabs.org, kvm-ppc@vger.kernel.org,
	linux-mm@kvack.org
Subject: [PATCH v11 1/7] mm: ksm: Export ksm_madvise()
Date: Mon, 25 Nov 2019 08:36:25 +0530
In-Reply-To: <20191125030631.7716-1-bharata@linux.ibm.com>
References: <20191125030631.7716-1-bharata@linux.ibm.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Message-Id: <20191125030631.7716-2-bharata@linux.ibm.com>
Precedence: list
Cc: Andrea Arcangeli <aarcange@redhat.com>, Hugh Dickins <hughd@google.com>, 
	linuxram@us.ibm.com, cclaudio@linux.ibm.com,
	Bharata B Rao <bharata@linux.ibm.com>, jglisse@redhat.com,
	aneesh.kumar@linux.vnet.ibm.com, paulus@au1.ibm.com,
	sukadev@linux.vnet.ibm.com, hch@lst.de
Errors-To: linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org
Sender: "Linuxppc-dev"
	<linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org>

Series

KVM: PPC: Driver to manage pages of secure guest | expand

Checks

Context	Check	Description
snowpatch_ozlabs/apply_patch	success	Successfully applied on branch powerpc/merge (f5b8031d0193757ee977b2cee25065a4e6200615)
snowpatch_ozlabs/checkpatch	success	total: 0 errors, 0 warnings, 0 checks, 7 lines checked
snowpatch_ozlabs/needsstable	success	Patch has no Fixes tags

Context

Check

Description

snowpatch_ozlabs/apply_patch

success

Successfully applied on branch powerpc/merge (f5b8031d0193757ee977b2cee25065a4e6200615)

snowpatch_ozlabs/checkpatch

success

total: 0 errors, 0 warnings, 0 checks, 7 lines checked

snowpatch_ozlabs/needsstable

success

Patch has no Fixes tags

Commit Message

Bharata B Rao Nov. 25, 2019, 3:06 a.m. UTC

On PEF-enabled POWER platforms that support running of secure guests,
secure pages of the guest are represented by device private pages
in the host. Such pages needn't participate in KSM merging. This is
achieved by using ksm_madvise() call which need to be exported
since KVM PPC can be a kernel module.

Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
Acked-by: Paul Mackerras <paulus@ozlabs.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
---
 mm/ksm.c | 1 +
 1 file changed, 1 insertion(+)

Comments

Bharata B Rao Nov. 25, 2019, 3:09 a.m. UTC | #1

On Mon, Nov 25, 2019 at 08:36:25AM +0530, Bharata B Rao wrote:
> On PEF-enabled POWER platforms that support running of secure guests,
> secure pages of the guest are represented by device private pages
> in the host. Such pages needn't participate in KSM merging. This is
> achieved by using ksm_madvise() call which need to be exported
> since KVM PPC can be a kernel module.
> 
> Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
> Acked-by: Paul Mackerras <paulus@ozlabs.org>
> Cc: Andrea Arcangeli <aarcange@redhat.com>
> Cc: Hugh Dickins <hughd@google.com>

Just want to point out that I observe a kernel crash when KSM is
dealing with device private pages. More details about the crash here:

https://lore.kernel.org/linuxppc-dev/20191115141006.GA21409@in.ibm.com/

Regards,
Bharata.

Hugh Dickins Nov. 27, 2019, 3:59 a.m. UTC | #2

On Mon, 25 Nov 2019, Bharata B Rao wrote:

> On PEF-enabled POWER platforms that support running of secure guests,
> secure pages of the guest are represented by device private pages
> in the host. Such pages needn't participate in KSM merging. This is
> achieved by using ksm_madvise() call which need to be exported
> since KVM PPC can be a kernel module.
> 
> Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
> Acked-by: Paul Mackerras <paulus@ozlabs.org>
> Cc: Andrea Arcangeli <aarcange@redhat.com>
> Cc: Hugh Dickins <hughd@google.com>

I can say
Acked-by: Hugh Dickins <hughd@google.com>
to this one.

But not to your 2/7 which actually makes use of it: because sadly it
needs down_write(&kvm->mm->mmap_sem) for the case when it switches off
VM_MERGEABLE in vma->vm_flags.  That's frustrating, since I think it's
the only operation for which down_read() is not good enough.

I have no idea how contended that mmap_sem is likely to be, nor how
many to-be-secured pages that vma is likely to contain: you might find
it okay simply to go with it down_write throughout, or you might want
to start out with it down_read, and only restart with down_write (then
perhaps downgrade_write later) when you see VM_MERGEABLE is set.

The crash you got (thanks for the link): that will be because your
migrate_vma_pages() had already been applied to a page that was
already being shared via KSM.

But if these secure pages are expected to be few and far between,
maybe you'd prefer to keep VM_MERGEABLE, and add per-page checks
of some kind into mm/ksm.c, to skip over these surprising hybrids.

Hugh

> ---
>  mm/ksm.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/mm/ksm.c b/mm/ksm.c
> index dbee2eb4dd05..e45b02ad3f0b 100644
> --- a/mm/ksm.c
> +++ b/mm/ksm.c
> @@ -2478,6 +2478,7 @@ int ksm_madvise(struct vm_area_struct *vma, unsigned long start,
>  
>  	return 0;
>  }
> +EXPORT_SYMBOL_GPL(ksm_madvise);
>  
>  int __ksm_enter(struct mm_struct *mm)
>  {
> -- 
> 2.21.0

Bharata B Rao Nov. 27, 2019, 6:53 a.m. UTC | #3

On Tue, Nov 26, 2019 at 07:59:49PM -0800, Hugh Dickins wrote:
> On Mon, 25 Nov 2019, Bharata B Rao wrote:
> 
> > On PEF-enabled POWER platforms that support running of secure guests,
> > secure pages of the guest are represented by device private pages
> > in the host. Such pages needn't participate in KSM merging. This is
> > achieved by using ksm_madvise() call which need to be exported
> > since KVM PPC can be a kernel module.
> > 
> > Signed-off-by: Bharata B Rao <bharata@linux.ibm.com>
> > Acked-by: Paul Mackerras <paulus@ozlabs.org>
> > Cc: Andrea Arcangeli <aarcange@redhat.com>
> > Cc: Hugh Dickins <hughd@google.com>
> 
> I can say
> Acked-by: Hugh Dickins <hughd@google.com>
> to this one.
> 
> But not to your 2/7 which actually makes use of it: because sadly it
> needs down_write(&kvm->mm->mmap_sem) for the case when it switches off
> VM_MERGEABLE in vma->vm_flags.  That's frustrating, since I think it's
> the only operation for which down_read() is not good enough.

Oh ok! Thanks for pointing this out.

> 
> I have no idea how contended that mmap_sem is likely to be, nor how
> many to-be-secured pages that vma is likely to contain: you might find
> it okay simply to go with it down_write throughout, or you might want
> to start out with it down_read, and only restart with down_write (then
> perhaps downgrade_write later) when you see VM_MERGEABLE is set.

Using down_write throughtout is not easy as we do migrate_vma_pages()
from fault path (->migrate_to_ram()) too. Here we come with down_read
already held.

Starting with down_read and restarting with down_write if VM_MERGEABLE
is set -- this also looks a bit difficult as we will have challenges
with locking order if we release mmap_sem in between and re-acquire.

So I think I will start with down_write in this particular case
and will downgrade_write as soon as ksm_madvise() is complete.

> 
> The crash you got (thanks for the link): that will be because your
> migrate_vma_pages() had already been applied to a page that was
> already being shared via KSM.
> 
> But if these secure pages are expected to be few and far between,
> maybe you'd prefer to keep VM_MERGEABLE, and add per-page checks
> of some kind into mm/ksm.c, to skip over these surprising hybrids.

I did bail out from a few routines in mm/ksm.c with
is_device_private_page(page) check, but that wasn't good enough and
I encountered crashes in different code paths. Guess a bit more
understanding of KSM internals would be required before retrying that.

However since all the pages of the guest except for a few will be turned
into secure pages early during boot, it appears better if secure guests
don't participate in in KSM merging at all.

Regards,
Bharata.

diff --git a/mm/ksm.c b/mm/ksm.c
index dbee2eb4dd05..e45b02ad3f0b 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -2478,6 +2478,7 @@  int ksm_madvise(struct vm_area_struct *vma, unsigned long start,
 
 	return 0;
 }
+EXPORT_SYMBOL_GPL(ksm_madvise);
 
 int __ksm_enter(struct mm_struct *mm)
 {