Message ID | 20211014091551.15201-1-lma@suse.com |
---|---|
Series | Postcopy migration: Add userfaultfd- user-mode-only capability |

On Thu, Oct 14, 2021 at 05:15:48PM +0800, Lin Ma wrote:
> Since kernel v5.11, an unprivileged user (without the CAP_SYS_PTRACE
> capability) must pass UFFD_USER_MODE_ONLY to userfaultfd if the
> unprivileged_userfaultfd sysctl knob is 0.
> Please refer to https://lwn.net/Articles/819834/ and the kernel commits:
> 37cd0575 userfaultfd: add UFFD_USER_MODE_ONLY
> d0d4730a userfaultfd: add user-mode only option to unprivileged_userfaultfd sysctl knob
>
> This patch set adds a migration capability to pass UFFD_USER_MODE_ONLY
> for postcopy migration.

Then that means at least no KVM and no vhost, am I right? Could I ask whether there is a real user behind this?

Thanks,

On 2021-10-15 07:43, Peter Xu wrote:
> On Thu, Oct 14, 2021 at 05:15:48PM +0800, Lin Ma wrote:
>> [...]
>> This patch set adds a migration capability to pass UFFD_USER_MODE_ONLY
>> for postcopy migration.
>
> Then it's at least no KVM, no vhost, am I right? Could I ask is there
> a real user behind this? Thanks,

Well, "user-mode-only" has nothing to do with QEMU's user-mode emulation.

The unprivileged_userfaultfd sysctl knob controls whether unprivileged users can use the userfaultfd system call.
Set it to 1 to allow unprivileged users to use userfaultfd.
Set it to 0 to restrict userfaultfd to privileged users (those with the CAP_SYS_PTRACE capability).

If the host's unprivileged_userfaultfd knob is 0 (the default since host kernel v5.11), QEMU must pass the UFFD_USER_MODE_ONLY flag when it creates the userfaultfd object for postcopy migration while running as an unprivileged user.

Before host kernel v5.11, if the host's unprivileged_userfaultfd knob is 0, postcopy migration is not allowed when QEMU runs as an unprivileged user.

Thanks,
Lin
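
The rules in the message above can be sketched as a small decision function. This is only an illustration, not QEMU code: the helper names `read_unprivileged_userfaultfd` and `userfaultfd_flags` are hypothetical, the `kernel_has_user_mode_only` parameter stands in for "host kernel is v5.11 or later", and the value of `UFFD_USER_MODE_ONLY` is assumed to be 1 as in `<linux/userfaultfd.h>`.

```python
import os

# Assumed to match the definition in <linux/userfaultfd.h>.
UFFD_USER_MODE_ONLY = 1
SYSCTL_PATH = "/proc/sys/vm/unprivileged_userfaultfd"


def read_unprivileged_userfaultfd(path=SYSCTL_PATH):
    """Return the sysctl knob as an int, or None if it cannot be read."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def userfaultfd_flags(knob, has_cap_sys_ptrace, kernel_has_user_mode_only=True):
    """Return the flags to pass to userfaultfd(2), or None if creation
    is expected to be refused.

    knob: value of unprivileged_userfaultfd (0 or 1; None is treated like 0).
    has_cap_sys_ptrace: whether the caller holds CAP_SYS_PTRACE.
    kernel_has_user_mode_only: hypothetical stand-in for "kernel >= v5.11".
    """
    flags = os.O_CLOEXEC | os.O_NONBLOCK
    if knob == 1 or has_cap_sys_ptrace:
        # Unrestricted: kernel-mode faults can be handled as well.
        return flags
    if kernel_has_user_mode_only:
        # Knob is 0 and the caller is unprivileged: only user-mode faults.
        return flags | UFFD_USER_MODE_ONLY
    # Older kernel, knob 0, unprivileged caller: userfaultfd is not available.
    return None
```

For example, `userfaultfd_flags(read_unprivileged_userfaultfd(), has_cap_sys_ptrace=False)` gives the flags an unprivileged process would need on the current host, under the assumptions stated above.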

On Fri, Oct 15, 2021 at 01:38:06PM +0800, lma wrote:
> Well, The "user-mode-only" has nothing to do with qemu's user-mode
> emulation.

I don't follow why you thought my question was about "user-mode emulation".

To ask it another way: once this new capability is set, QEMU will get a SIGBUS and the VM will crash during postcopy migration as soon as either KVM or vhost-kernel faults on any of the missing pages, am I right?

Thanks,

On 2021-10-15 14:12, Peter Xu wrote:
> I didn't follow why you thought my question was about "user-mode
> emulation"..

Sorry about the misunderstanding.

> To ask in another way: after this new cap set, qemu will get a SIGBUS and VM
> will crash during postcopy migrating as long as either KVM or vhost-kernel
> faulted on any of the missing pages, am I right?

Oops... Yes, you're right. It does indeed cause QEMU to crash on the destination when a fault hits a missing page.
This patch set, and my idea of introducing this capability to QEMU, are wrong.

Thanks,
Lin

On Fri, Oct 15, 2021 at 04:16:15PM +0800, lma wrote:
> Sorry about the misunderstanding.

No worries. :)

> Oops...Yes, you're right. It indeed causes qemu crash on destination due to
> fault on missing pages.
> This patch set and my thought about introducing this cap to qemu are wrong.

I can't say it's wrong; it may just need more thought on how to make it applicable.

We'd need to make sure no kernel module accesses guest pages, and I think that will be very hard to guarantee. For example, QEMU may issue a read() syscall with guest pages passed in as the buffer (so the kernel fills the buffer by the time the syscall returns). If such a page is still missing on the destination, that also triggers a kernel page fault and crashes QEMU, even when neither KVM nor vhost-kernel is used. We'd need to dig out every case like that.

The other thing is my original question about whether this would be useful in any way. I worry it won't help anyone, because as far as I understand any real user of migration (mostly public/private clouds, I believe) will at least be KVM-based, as TCG could be too slow. They will then simply enable unprivileged userfaultfd on their hosts, since even if that is unsafe, it is only as unsafe as things were before unprivileged_userfaultfd was introduced.

Thanks,
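
The argument above can be written down as a toy model. It is a sketch of the reasoning in this thread only, not of the kernel's fault path: the `Access` type, the `fault_outcome` and `fatal_sources` helpers, and the list of access sources are all hypothetical, and "fatal" stands for the crash described in the thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Access:
    source: str      # who touches the guest page (illustrative label)
    in_kernel: bool  # True when the page fault is taken in kernel mode


def fault_outcome(access, page_missing, user_mode_only):
    """Classify one access to a guest page on the postcopy destination."""
    if not page_missing:
        return "ok"
    if user_mode_only and access.in_kernel:
        # A user-mode-only userfaultfd cannot resolve kernel-mode faults.
        return "fatal"
    # The fault is handed to the userfaultfd handler, which fetches the page.
    return "resolved"


# Hypothetical examples of who might touch a missing guest page.
ACCESSES = [
    Access("TCG vCPU thread load/store", in_kernel=False),
    Access("KVM guest memory access", in_kernel=True),
    Access("vhost-kernel ring access", in_kernel=True),
    Access("read() into a guest-page buffer", in_kernel=True),
]


def fatal_sources(accesses, user_mode_only):
    """List the sources that would hit a fatal fault on a missing page."""
    return [a.source for a in accesses
            if fault_outcome(a, True, user_mode_only) == "fatal"]
```

In this model, `fatal_sources(ACCESSES, user_mode_only=True)` lists every kernel-side access, including the plain read() case, which is why removing KVM and vhost-kernel alone would not be enough.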

On 2021-10-15 16:28, Peter Xu wrote:
> We'll need to make sure no kernel module will access guest pages, however I
> think it'll be so hard to guarantee. For example, there can be some read()
> syscall from qemu initiated with guest pages passed in as the buffer (so the
> kernel will fill up the buffer when syscall returns), then if that page is
> missing on dst then that'll also trigger a kernel page fault and it'll crash
> qemu too even if no kvm/vhost-kernel is used. We'll need to dig out everything
> like that.

Yeah, it's hard to avoid page faults in the kernel completely.

> The other thing is about my original question on whether it'll be useful in any
> way, and I just worry it won't help anyone, because afaiu any real user of
> migration (I believe it's majorly public/private cloud) will definitely at
> least be kvm based as tcg could be too slow. Then they'll simply enable the
> unprivileged uffd on the hosts, since even if it's unsafe it'll be at least as
> unsafe as before unprivileged_userfaultfd is introduced.

It seems this capability is useless for QEMU/KVM so far :-)
Thanks for the information!

Lin