[0/3] Postcopy migration: Add userfaultfd- user-mode-only capability

Message ID 20211014091551.15201-1-lma@suse.com

Message

Lin Ma Oct. 14, 2021, 9:15 a.m. UTC
Since kernel v5.11, an unprivileged user (without the CAP_SYS_PTRACE capability)
must pass UFFD_USER_MODE_ONLY to userfaultfd() if the unprivileged_userfaultfd
sysctl knob is 0.
Please refer to https://lwn.net/Articles/819834/ and the kernel commits:
37cd0575 userfaultfd: add UFFD_USER_MODE_ONLY
d0d4730a userfaultfd: add user-mode only option to unprivileged_userfaultfd sysctl knob

This patch set adds a migration capability to pass UFFD_USER_MODE_ONLY
for postcopy migration.

Lin Ma (3):
  migration: introduce postcopy-uffd-usermode-only capability
  migration: postcopy-uffd-usermode-only documentation
  tests: add postcopy-uffd-usermode-only capability into migration-test

 docs/devel/migration.rst     |  9 +++++++++
 migration/migration.c        |  9 +++++++++
 migration/migration.h        |  1 +
 migration/postcopy-ram.c     | 22 +++++++++++++++++++---
 qapi/migration.json          |  8 +++++++-
 tests/qtest/migration-test.c | 11 +++++++++--
 6 files changed, 54 insertions(+), 6 deletions(-)

Comments

Peter Xu Oct. 14, 2021, 11:43 p.m. UTC | #1
On Thu, Oct 14, 2021 at 05:15:48PM +0800, Lin Ma wrote:
> Since kernel v5.11, an unprivileged user (without the CAP_SYS_PTRACE capability)
> must pass UFFD_USER_MODE_ONLY to userfaultfd() if the unprivileged_userfaultfd
> sysctl knob is 0.
> Please refer to https://lwn.net/Articles/819834/ and the kernel commits:
> 37cd0575 userfaultfd: add UFFD_USER_MODE_ONLY
> d0d4730a userfaultfd: add user-mode only option to unprivileged_userfaultfd sysctl knob
> 
> This patch set adds a migration capability to pass UFFD_USER_MODE_ONLY
> for postcopy migration.

Then it's at least no KVM, no vhost, am I right?  Could I ask whether there is
a real user behind this?  Thanks,

lma Oct. 15, 2021, 5:38 a.m. UTC | #2
On 2021-10-15 07:43, Peter Xu wrote:
> Then it's at least no KVM, no vhost, am I right?  Could I ask whether there
> is a real user behind this?  Thanks,

Well, "user-mode-only" has nothing to do with QEMU's user-mode emulation.

The unprivileged_userfaultfd sysctl knob controls whether unprivileged
users can use the userfaultfd system call:
  Set it to 1 to allow unprivileged users to use the userfaultfd system
  call.
  Set it to 0 to restrict userfaultfd to privileged users (with the
  CAP_SYS_PTRACE capability).

If the host's unprivileged_userfaultfd sysctl knob is 0 (the default
since host kernel v5.11), QEMU must pass the UFFD_USER_MODE_ONLY flag
when creating the userfaultfd object for postcopy migration if it runs
as an unprivileged user.

Before host kernel v5.11, if the host's unprivileged_userfaultfd sysctl
knob is 0, postcopy migration is not allowed when QEMU runs as an
unprivileged user.
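
To make the rule above concrete, here is a minimal sketch in C. It is not
taken from the patch set: the helper names uffd_required_flag() and
uffd_open_with_fallback() are hypothetical, and the first helper only models
the v5.11+ behaviour described in this mail. The second helper shows one
possible way to fall back to UFFD_USER_MODE_ONLY when the plain call fails
with EPERM.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Same value as in the kernel's <linux/userfaultfd.h> (v5.11 and later);
 * defined here so the sketch also builds against older headers. */
#ifndef UFFD_USER_MODE_ONLY
#define UFFD_USER_MODE_ONLY 1
#endif

/*
 * Hypothetical helper: the extra flag a caller has to pass to userfaultfd()
 * on a v5.11+ kernel, given whether it holds CAP_SYS_PTRACE and the value
 * of the unprivileged_userfaultfd sysctl knob.
 */
int uffd_required_flag(bool has_cap_sys_ptrace, int sysctl_knob)
{
    if (has_cap_sys_ptrace || sysctl_knob == 1) {
        return 0;                   /* no restriction on the fault origin */
    }
    return UFFD_USER_MODE_ONLY;     /* only user-mode faults are handled */
}

/*
 * Hypothetical helper: try an unrestricted userfaultfd first and retry with
 * UFFD_USER_MODE_ONLY if the kernel refuses with EPERM. Returns the file
 * descriptor, or -1 with errno set. *user_mode_only reports which variant
 * succeeded.
 */
int uffd_open_with_fallback(bool *user_mode_only)
{
    long fd;

    assert(user_mode_only != NULL);
    *user_mode_only = false;

    fd = syscall(SYS_userfaultfd, O_CLOEXEC | O_NONBLOCK);
    if (fd < 0 && errno == EPERM) {
        fd = syscall(SYS_userfaultfd,
                     O_CLOEXEC | O_NONBLOCK | UFFD_USER_MODE_ONLY);
        *user_mode_only = (fd >= 0);
    }
    return (int)fd;
}
```

On a kernel older than v5.11 the retry is expected to fail as well, since
the flag is not known there, so the caller still ends up without a usable
file descriptor.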

Thanks,
Lin

Peter Xu Oct. 15, 2021, 6:12 a.m. UTC | #3
On Fri, Oct 15, 2021 at 01:38:06PM +0800, lma wrote:
> Well, "user-mode-only" has nothing to do with QEMU's user-mode
> emulation.

I didn't follow why you thought my question was about "user-mode emulation".

To ask in another way: once this new capability is set, QEMU will get a SIGBUS
and the VM will crash during postcopy migration as soon as either KVM or
vhost-kernel faults on any of the missing pages, am I right?

Thanks,

lma Oct. 15, 2021, 8:16 a.m. UTC | #4
On 2021-10-15 14:12, Peter Xu wrote:
> I didn't follow why you thought my question was about "user-mode
> emulation".

Sorry about the misunderstanding.

> To ask in another way: once this new capability is set, QEMU will get a
> SIGBUS and the VM will crash during postcopy migration as soon as either
> KVM or vhost-kernel faults on any of the missing pages, am I right?

Oops... Yes, you're right. It would indeed cause QEMU to crash on the
destination due to faults on missing pages.
This patch set, and my idea of introducing this capability to QEMU, are
wrong.

Thanks,
Lin

Peter Xu Oct. 15, 2021, 8:28 a.m. UTC | #5
On Fri, Oct 15, 2021 at 04:16:15PM +0800, lma wrote:
> > I didn't follow why you thought my question was about "user-mode
> > emulation".
> Sorry about the misunderstanding.

No worries. :)

> > To ask in another way: once this new capability is set, QEMU will get a
> > SIGBUS and the VM will crash during postcopy migration as soon as either
> > KVM or vhost-kernel faults on any of the missing pages, am I right?
>
> Oops... Yes, you're right. It would indeed cause QEMU to crash on the
> destination due to faults on missing pages.
> This patch set, and my idea of introducing this capability to QEMU, are wrong.

I can't say it's wrong; it may just need some more thought on how to make it
applicable.

We'd need to make sure that no kernel module accesses guest pages, but I think
that will be very hard to guarantee.  For example, QEMU may issue a read()
syscall with guest pages passed in as the buffer (so the kernel fills the
buffer before the syscall returns).  If such a page is missing on the
destination, that also triggers a kernel page fault and crashes QEMU, even if
no KVM/vhost-kernel is used.  We'd need to dig out everything like that.
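
The distinction drawn above can be written down as a small model in C. This
is only a sketch of the reasoning in this thread, not kernel or QEMU code:
the enum, the function names and the list of fault origins are hypothetical,
and it assumes (as discussed above) that KVM, vhost-kernel and syscall
buffer accesses all fault in kernel mode.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical labels for where a fault on a missing guest page comes from. */
enum fault_origin {
    ORIGIN_QEMU_USERSPACE,  /* QEMU code touching guest RAM directly */
    ORIGIN_KVM,             /* KVM resolving a guest access in the kernel */
    ORIGIN_VHOST_KERNEL,    /* vhost-kernel accessing guest buffers */
    ORIGIN_SYSCALL_BUFFER,  /* e.g. read() with a guest page as the buffer */
};

/* In this model only direct QEMU accesses fault in user mode. */
bool fault_is_kernel_mode(enum fault_origin origin)
{
    return origin != ORIGIN_QEMU_USERSPACE;
}

/*
 * True if the fault is handed to the userfaultfd reader, so that postcopy
 * can fetch the page. With UFFD_USER_MODE_ONLY, kernel-mode faults are not
 * handed over and the access fails instead.
 */
bool fault_reaches_handler(bool uffd_user_mode_only, enum fault_origin origin)
{
    return !(uffd_user_mode_only && fault_is_kernel_mode(origin));
}

/* Count the origins in the list whose faults would not be handled. */
size_t count_unhandled(bool uffd_user_mode_only,
                       const enum fault_origin *origins, size_t n)
{
    size_t unhandled = 0;

    for (size_t i = 0; i < n; i++) {
        if (!fault_reaches_handler(uffd_user_mode_only, origins[i])) {
            unhandled++;
        }
    }
    return unhandled;
}
```

In this model every origin except direct QEMU userspace access becomes
unsafe once UFFD_USER_MODE_ONLY is set, which is why each such path would
have to be found and avoided.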

The other thing is my original question about whether it would be useful in
any way.  I worry that it won't help anyone because, AFAIU, any real user of
migration (I believe mostly public/private clouds) will definitely be at least
KVM based, as TCG could be too slow.  They'll then simply enable unprivileged
uffd on the hosts, since even if that is unsafe, it is only as unsafe as
things were before unprivileged_userfaultfd was introduced.

Thanks,

lma Oct. 15, 2021, 9:49 a.m. UTC | #6
On 2021-10-15 16:28, Peter Xu wrote:
> I can't say it's wrong; it may just need some more thought on how to make
> it applicable.
>
> We'd need to make sure that no kernel module accesses guest pages, but I
> think that will be very hard to guarantee.  For example, QEMU may issue a
> read() syscall with guest pages passed in as the buffer (so the kernel
> fills the buffer before the syscall returns).  If such a page is missing
> on the destination, that also triggers a kernel page fault and crashes
> QEMU, even if no KVM/vhost-kernel is used.  We'd need to dig out
> everything like that.

Yeah, it's hard to avoid page faults in the kernel completely.

> The other thing is my original question about whether it would be useful
> in any way.  I worry that it won't help anyone because, AFAIU, any real
> user of migration (I believe mostly public/private clouds) will definitely
> be at least KVM based, as TCG could be too slow.  They'll then simply
> enable unprivileged uffd on the hosts, since even if that is unsafe, it is
> only as unsafe as things were before unprivileged_userfaultfd was
> introduced.

It seems that this capability is useless for QEMU/KVM so far :-)

Thanks for your information!

Lin