
[v3,28/28] docs: update Xen-on-KVM documentation

Message ID 20231025145042.627381-29-dwmw2@infradead.org
State New
Series Get Xen PV shim running in QEMU, add net & console

Commit Message

David Woodhouse Oct. 25, 2023, 2:50 p.m. UTC
From: David Woodhouse <dwmw@amazon.co.uk>

Add notes about console and network support, and how to launch PV guests.
Clean up the disk configuration examples now that that's simpler, and
remove the comment about IDE unplug on q35/AHCI now that it's fixed.

Also update stale avocado test filename in MAINTAINERS.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
---
 MAINTAINERS              |   2 +-
 docs/system/i386/xen.rst | 100 ++++++++++++++++++++++++++++-----------
 2 files changed, 73 insertions(+), 29 deletions(-)

Comments

Eric Blake Oct. 25, 2023, 6:20 p.m. UTC | #1
On Wed, Oct 25, 2023 at 03:50:42PM +0100, David Woodhouse wrote:
> From: David Woodhouse <dwmw@amazon.co.uk>
> 
> Add notes about console and network support, and how to launch PV guests.
> Clean up the disk configuration examples now that that's simpler, and
> remove the comment about IDE unplug on q35/AHCI now that it's fixed.
> 
> Also update stale avocado test filename in MAINTAINERS.
> 
> Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
> ---
> +Xen paravirtual devices
> +-----------------------
> +
> +The Xen PCI platform device is enabled automatically for a Xen guest. This
> +allows a guest to unplug all emulated devices, in order to use paravirtual
> +block and network drivers instead.
> +
> +Those paravirtual Xen block, network (and console) devices can be created
> +through the command line, and/or hot-plugged.
> +
> +To provide a Xen console device, define a character device and then a device
> +of type ``xen-console`` to connect to it. For the Xen console equivalent of
> +the handy ``-serial mon:stdio`` option, for example:
> +
> +.. parsed-literal::
> +   -chardev -chardev stdio,mux=on,id=char0,signal=off -mon char0 \\
> +   -device xen-console,chardev=char0

Is -chardev supposed to appear twice here?

...
> +
> +Booting Xen PV guests
> +---------------------
> +
> +Booting PV guest kernels is possible by using the Xen PV shim (a version of Xen
> +itself, designed to run inside a Xen HVM guest and provide memory management
> +services for one guest alone).
> +
> +The Xen binary is provided as the ``-kernel`` and the guest kernel itself (or
> +PV Grub image) as the ``-initrd`` image, which actually just means the first
> +multiboot "module". For example:
> +
> +.. parsed-literal::
> +
> +  |qemu_system| --accel kvm,xen-version=0x40011,kernel-irqchip=split \\
> +       -chardev stdio,id=char0 -device xen-console,chardev=char0 \\
> +       -display none  -m 1G  -kernel xen -initrd bzImage \\
> +       -append "pv-shim console=xen,pv -- console=hvc0 root=/dev/xvda1" \\
> +       -drive file=${GUEST_IMAGE},if=xen

Is the space between -- and console= intentional?
David Woodhouse Oct. 25, 2023, 6:26 p.m. UTC | #2
On Wed, 2023-10-25 at 13:20 -0500, Eric Blake wrote:
> On Wed, Oct 25, 2023 at 03:50:42PM +0100, David Woodhouse wrote:
> > From: David Woodhouse <dwmw@amazon.co.uk>
> > 
> > Add notes about console and network support, and how to launch PV guests.
> > Clean up the disk configuration examples now that that's simpler, and
> > remove the comment about IDE unplug on q35/AHCI now that it's fixed.
> > 
> > Also update stale avocado test filename in MAINTAINERS.
> > 
> > Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
> > ---
> > +Xen paravirtual devices
> > +-----------------------
> > +
> > +The Xen PCI platform device is enabled automatically for a Xen guest. This
> > +allows a guest to unplug all emulated devices, in order to use paravirtual
> > +block and network drivers instead.
> > +
> > +Those paravirtual Xen block, network (and console) devices can be created
> > +through the command line, and/or hot-plugged.
> > +
> > +To provide a Xen console device, define a character device and then a device
> > +of type ``xen-console`` to connect to it. For the Xen console equivalent of
> > +the handy ``-serial mon:stdio`` option, for example:
> > +
> > +.. parsed-literal::
> > +   -chardev -chardev stdio,mux=on,id=char0,signal=off -mon char0 \\
> > +   -device xen-console,chardev=char0
> 
> Is -chardev supposed to appear twice here?

It is not. Will fix; thanks.

> ...
> > +
> > +Booting Xen PV guests
> > +---------------------
> > +
> > +Booting PV guest kernels is possible by using the Xen PV shim (a version of Xen
> > +itself, designed to run inside a Xen HVM guest and provide memory management
> > +services for one guest alone).
> > +
> > +The Xen binary is provided as the ``-kernel`` and the guest kernel itself (or
> > +PV Grub image) as the ``-initrd`` image, which actually just means the first
> > +multiboot "module". For example:
> > +
> > +.. parsed-literal::
> > +
> > +  |qemu_system| --accel kvm,xen-version=0x40011,kernel-irqchip=split \\
> > +       -chardev stdio,id=char0 -device xen-console,chardev=char0 \\
> > +       -display none  -m 1G  -kernel xen -initrd bzImage \\
> > +       -append "pv-shim console=xen,pv -- console=hvc0 root=/dev/xvda1" \\
> > +       -drive file=${GUEST_IMAGE},if=xen
> 
> Is the space between -- and console= intentional?

Yes, that one is correct. The -- is how you separate Xen's command line
(on the left) from the guest kernel command line (on the right).
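
As a rough illustration of that convention (a minimal sketch, not Xen's
actual parser; the helper name ``split_xen_cmdline`` is hypothetical), the
``-append`` string from the example divides at the first " -- " like this:

```python
from typing import Tuple


def split_xen_cmdline(append: str) -> Tuple[str, str]:
    """Split a combined command line at the first " -- " separator.

    Everything to the left is treated as Xen's own command line and
    everything to the right as the guest kernel's. If no separator is
    present, the guest part is empty.
    """
    xen_args, _sep, guest_args = append.partition(" -- ")
    return xen_args.strip(), guest_args.strip()


xen_args, guest_args = split_xen_cmdline(
    "pv-shim console=xen,pv -- console=hvc0 root=/dev/xvda1"
)
print("Xen:  ", xen_args)
print("guest:", guest_args)
```

The surrounding spaces are part of the separator, which is why the space
between ``--`` and ``console=`` matters.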
Andrew Cooper Oct. 25, 2023, 6:56 p.m. UTC | #3
On 25/10/2023 7:26 pm, David Woodhouse wrote:
> On Wed, 2023-10-25 at 13:20 -0500, Eric Blake wrote:
>> On Wed, Oct 25, 2023 at 03:50:42PM +0100, David Woodhouse wrote:
>>> +
>>> +Booting Xen PV guests
>>> +---------------------
>>> +
>>> +Booting PV guest kernels is possible by using the Xen PV shim (a version of Xen
>>> +itself, designed to run inside a Xen HVM guest and provide memory management
>>> +services for one guest alone).
>>> +
>>> +The Xen binary is provided as the ``-kernel`` and the guest kernel itself (or
>>> +PV Grub image) as the ``-initrd`` image, which actually just means the first
>>> +multiboot "module". For example:
>>> +
>>> +.. parsed-literal::
>>> +
>>> +  |qemu_system| --accel kvm,xen-version=0x40011,kernel-irqchip=split \\
>>> +       -chardev stdio,id=char0 -device xen-console,chardev=char0 \\
>>> +       -display none  -m 1G  -kernel xen -initrd bzImage \\
>>> +       -append "pv-shim console=xen,pv -- console=hvc0 root=/dev/xvda1" \\
>>> +       -drive file=${GUEST_IMAGE},if=xen
>> Is the space between -- and console= intentional?
> Yes, that one is correct. The -- is how you separate Xen's command line
> (on the left) from the guest kernel command line (on the right).

To expand on this a bit.

Multiboot1 supports multiple modules but only a single command line.  As
one of the modules passed to Xen is the dom0 kernel, we need some way to
pass its command line, hence the " -- ".

Multiboot2 and PVH support a command line per module, which is the
preferred way to pass the commandlines, given a choice.

~Andrew
David Woodhouse Oct. 25, 2023, 7:02 p.m. UTC | #4
On Wed, 2023-10-25 at 19:56 +0100, Andrew Cooper wrote:
> On 25/10/2023 7:26 pm, David Woodhouse wrote:
>  
> > On Wed, 2023-10-25 at 13:20 -0500, Eric Blake wrote:
> >  
> > > On Wed, Oct 25, 2023 at 03:50:42PM +0100, David Woodhouse wrote:
> >  
> > >  
> > > > +
> > > > +Booting Xen PV guests
> > > > +---------------------
> > > > +
> > > > +Booting PV guest kernels is possible by using the Xen PV shim (a version of Xen
> > > > +itself, designed to run inside a Xen HVM guest and provide memory management
> > > > +services for one guest alone).
> > > > +
> > > > +The Xen binary is provided as the ``-kernel`` and the guest kernel itself (or
> > > > +PV Grub image) as the ``-initrd`` image, which actually just means the first
> > > > +multiboot "module". For example:
> > > > +
> > > > +.. parsed-literal::
> > > > +
> > > > +  |qemu_system| --accel kvm,xen-version=0x40011,kernel-irqchip=split \\
> > > > +       -chardev stdio,id=char0 -device xen-console,chardev=char0 \\
> > > > +       -display none  -m 1G  -kernel xen -initrd bzImage \\
> > > > +       -append "pv-shim console=xen,pv -- console=hvc0 root=/dev/xvda1" \\
> > > > +       -drive file=${GUEST_IMAGE},if=xen
> > > Is the space between -- and console= intentional?
> > Yes, that one is correct. The -- is how you separate Xen's command line
> > (on the left) from the guest kernel command line (on the right).
>  
>  To expand on this a bit.
>  
>  Multiboot1 supports multiple modules but only a single command
> line.  As one of the modules passed to Xen is the dom0 kernel, we
> need some way to pass its command line, hence the " -- ".
>  
>  Multiboot2 and PVH support a command line per module, which is the
> preferred way to pass the commandlines, given a choice.
>  

Thanks.

Indeed, I had *originally* thought I was going to need to implement
Multiboot2 in qemu in order to boot Shim + PV guest, but it turns out
we can make it work with just Multiboot1 support.

As long as the guest kernel doesn't want an *actual* initrd, that is ;)
Kevin Wolf Oct. 26, 2023, 8:26 a.m. UTC | #5
Am 25.10.2023 um 20:56 hat Andrew Cooper geschrieben:
> On 25/10/2023 7:26 pm, David Woodhouse wrote:
> > On Wed, 2023-10-25 at 13:20 -0500, Eric Blake wrote:
> >> On Wed, Oct 25, 2023 at 03:50:42PM +0100, David Woodhouse wrote:
> >>> +
> >>> +Booting Xen PV guests
> >>> +---------------------
> >>> +
> >>> +Booting PV guest kernels is possible by using the Xen PV shim (a version of Xen
> >>> +itself, designed to run inside a Xen HVM guest and provide memory management
> >>> +services for one guest alone).
> >>> +
> >>> +The Xen binary is provided as the ``-kernel`` and the guest kernel itself (or
> >>> +PV Grub image) as the ``-initrd`` image, which actually just means the first
> >>> +multiboot "module". For example:
> >>> +
> >>> +.. parsed-literal::
> >>> +
> >>> +  |qemu_system| --accel kvm,xen-version=0x40011,kernel-irqchip=split \\
> >>> +       -chardev stdio,id=char0 -device xen-console,chardev=char0 \\
> >>> +       -display none  -m 1G  -kernel xen -initrd bzImage \\
> >>> +       -append "pv-shim console=xen,pv -- console=hvc0 root=/dev/xvda1" \\
> >>> +       -drive file=${GUEST_IMAGE},if=xen
> >> Is the space between -- and console= intentional?
> > Yes, that one is correct. The -- is how you separate Xen's command line
> > (on the left) from the guest kernel command line (on the right).
> 
> To expand on this a bit.
> 
> Multiboot1 supports multiple modules but only a single command line.  As
> one of the modules passed to Xen is the dom0 kernel, we need some way to
> pass its command line, hence the " -- ".

That's not right, even Multiboot 1 contains a 'string' field in the
module structure that is defined to typically hold a command line. The
exact meaning is OS dependent, so Xen could use it however it wants.

In QEMU (and I believe this is the same behaviour as in GRUB),
everything before the space in an -initrd argument is treated as a
filename to load, everything after it is just passed as the command
line.

So it would have been entirely possible to use -initrd 'bzImage
console=hvc0 root=/dev/xvda1' if Xen worked like that.

> Multiboot2 and PVH support a command line per module, which is the
> preferred way to pass the commandlines, given a choice.

Multiboot 2 seems to integrate the string in a variable length module
structure instead of just having a pointer in a fixed length one, but
the model behind it is essentially the same as before.

Kevin
David Woodhouse Oct. 26, 2023, 9:25 a.m. UTC | #6
On Thu, 2023-10-26 at 10:26 +0200, Kevin Wolf wrote:
> 
> > > > > +.. parsed-literal::
> > > > > +
> > > > > +  |qemu_system| --accel kvm,xen-version=0x40011,kernel-irqchip=split \\
> > > > > +       -chardev stdio,id=char0 -device xen-console,chardev=char0 \\
> > > > > +       -display none  -m 1G  -kernel xen -initrd bzImage \\
> > > > > +       -append "pv-shim console=xen,pv -- console=hvc0 root=/dev/xvda1" \\
> > > > > +       -drive file=${GUEST_IMAGE},if=xen
> > > > Is the space between -- and console= intentional?
> > > Yes, that one is correct. The -- is how you separate Xen's command line
> > > (on the left) from the guest kernel command line (on the right).
> > 
> > To expand on this a bit.
> > 
> > Multiboot1 supports multiple modules but only a single command line.  As
> > one of the modules passed to Xen is the dom0 kernel, we need some way to
> > pass its command line, hence the " -- ".
> 
> That's not right, even Multiboot 1 contains a 'string' field in the
> module structure that is defined to typically hold a command line. The
> exact meaning is OS dependent, so Xen could use it however it wants.
> 
> In QEMU (and I believe this is the same behaviour as in GRUB),
> everything before the space in an -initrd argument is treated as a
> filename to load, everything after it is just passed as the command
> line.
> 
> So it would have been entirely possible to use -initrd 'bzImage
> console=hvc0 root=/dev/xvda1' if Xen worked like that.

Xen does allow that too. I didn't realise our multiboot loader did though.

So yes, you *can* use  -initrd 'bzImage root=/dev/xvda1'. 

And you can even load more than one module, it seems. Separate them by
commas, so -initrd 'bzImage,initrd.img' should work.

You can even do both at the same time. If you have commas on the kernel
command line, *double* them:

 -initrd 'bzImage root=/dev/xvda earlyprintk=xen,,keep,initrd.img'

I'll update the documentation accordingly.
David Woodhouse Oct. 26, 2023, 4:25 p.m. UTC | #7
On Thu, 2023-10-26 at 10:25 +0100, David Woodhouse wrote:
> 
> > So it would have been entirely possible to use -initrd 'bzImage
> > console=hvc0 root=/dev/xvda1' if Xen worked like that.
> 
> Xen does allow that too. I didn't realise our multiboot loader did though.
> 
> So yes, you *can* use  -initrd 'bzImage root=/dev/xvda1'. 
> 
> And you can even load more than one module, it seems. Separate them by
> commas, so -initrd 'bzImage,initrd.img' should work.
> 
> You can even do both at the same time. If you have commas on the kernel
> command line, *double* them:
> 
>  -initrd 'bzImage root=/dev/xvda earlyprintk=xen,,keep,initrd.img'
> 
> I'll update the documentation accordingly.

https://git.infradead.org/users/dwmw2/qemu.git/commitdiff/0b13c0ae39b


+Booting Xen PV guests
+---------------------
+
+Booting PV guest kernels is possible by using the Xen PV shim (a version of Xen
+itself, designed to run inside a Xen HVM guest and provide memory management
+services for one guest alone).
+
+The Xen binary is provided as the ``-kernel`` and the guest kernel itself (or
+PV Grub image) as the ``-initrd`` image, which actually just means the first
+multiboot "module". For example:
+
+.. parsed-literal::
+
+  |qemu_system| --accel kvm,xen-version=0x40011,kernel-irqchip=split \\
+       -chardev stdio,id=char0 -device xen-console,chardev=char0 \\
+       -display none  -m 1G  -kernel xen -initrd bzImage \\
+       -append "pv-shim console=xen,pv -- console=hvc0 root=/dev/xvda1" \\
+       -drive file=${GUEST_IMAGE},if=xen
+
+The Xen image must be built with the ``CONFIG_XEN_GUEST`` and ``CONFIG_PV_SHIM``
+options, and as of Xen 4.17, Xen's PV shim mode does not support using a serial
+port; it must have a Xen console or it will panic.
+
+The example above provides the guest kernel command line after a separator
+(" ``--`` ") on the Xen command line, and does not provide the guest kernel
+with an actual initramfs, which would need to be listed as a second multiboot
+module. For more complicated alternatives, see the
+:ref:`documentation <initrd-reference-label>` for the ``-initrd`` option.
+


I also fixed up the -initrd documentation so that it actually mentions
how to quote commas, using a Xen PV launch as an example:

 ``-initrd "file1 arg=foo,file2"``
     This syntax is only available with multiboot.
 
-    Use file1 and file2 as modules and pass arg=foo as parameter to the
-    first module.
+    Use file1 and file2 as modules and pass ``arg=foo`` as parameter to the
+    first module. Commas can be provided in module parameters by doubling
+    them on the command line to escape them:
+
+``-initrd "bzImage earlyprintk=xen,,keep root=/dev/xvda1,initrd.img"``
+    Multiboot only. Use bzImage as the first module with
+    "``earlyprintk=xen,keep root=/dev/xvda1``" as its command line,
+    and initrd.img as the second module.
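
To make the comma-doubling rule concrete, here is a minimal sketch of how
such an ``-initrd`` argument breaks down into modules. It follows the
behaviour described in this thread rather than QEMU's actual multiboot
loader code, and the function name ``parse_initrd_arg`` is hypothetical:

```python
from typing import List, Tuple


def parse_initrd_arg(arg: str) -> List[Tuple[str, str]]:
    """Split an -initrd argument into (filename, command line) pairs.

    A single comma separates modules; a doubled comma (",,") stands for
    a literal comma inside a module's command line. Within each module,
    the text before the first space is the file to load and the rest is
    its command line.
    """
    modules: List[str] = []
    current: List[str] = []
    i = 0
    while i < len(arg):
        if arg[i] == ",":
            if i + 1 < len(arg) and arg[i + 1] == ",":
                # Doubled comma: keep one literal comma, skip both.
                current.append(",")
                i += 2
                continue
            # Single comma: end of the current module.
            modules.append("".join(current))
            current = []
        else:
            current.append(arg[i])
        i += 1
    modules.append("".join(current))

    result = []
    for module in modules:
        filename, _sep, cmdline = module.partition(" ")
        result.append((filename, cmdline))
    return result


for filename, cmdline in parse_initrd_arg(
    "bzImage earlyprintk=xen,,keep root=/dev/xvda1,initrd.img"
):
    print(repr(filename), repr(cmdline))
```

With the example argument, the first module is ``bzImage`` with the command
line ``earlyprintk=xen,keep root=/dev/xvda1`` and the second is
``initrd.img`` with no command line.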

Patch

diff --git a/MAINTAINERS b/MAINTAINERS
index d36aa44661..0fcc454ccd 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -490,7 +490,7 @@  S: Supported
 F: include/sysemu/kvm_xen.h
 F: target/i386/kvm/xen*
 F: hw/i386/kvm/xen*
-F: tests/avocado/xen_guest.py
+F: tests/avocado/kvm_xen_guest.py
 
 Guest CPU Cores (other accelerators)
 ------------------------------------
diff --git a/docs/system/i386/xen.rst b/docs/system/i386/xen.rst
index f06765e88c..6214c4571e 100644
--- a/docs/system/i386/xen.rst
+++ b/docs/system/i386/xen.rst
@@ -15,46 +15,24 @@  Setup
 -----
 
 Xen mode is enabled by setting the ``xen-version`` property of the KVM
-accelerator, for example for Xen 4.10:
+accelerator, for example for Xen 4.17:
 
 .. parsed-literal::
 
-  |qemu_system| --accel kvm,xen-version=0x4000a,kernel-irqchip=split
+  |qemu_system| --accel kvm,xen-version=0x40011,kernel-irqchip=split
 
 Additionally, virtual APIC support can be advertised to the guest through the
 ``xen-vapic`` CPU flag:
 
 .. parsed-literal::
 
-  |qemu_system| --accel kvm,xen-version=0x4000a,kernel-irqchip=split --cpu host,+xen_vapic
+  |qemu_system| --accel kvm,xen-version=0x40011,kernel-irqchip=split --cpu host,+xen-vapic
 
 When Xen support is enabled, QEMU changes hypervisor identification (CPUID
 0x40000000..0x4000000A) to Xen. The KVM identification and features are not
 advertised to a Xen guest. If Hyper-V is also enabled, the Xen identification
 moves to leaves 0x40000100..0x4000010A.
 
-The Xen platform device is enabled automatically for a Xen guest. This allows
-a guest to unplug all emulated devices, in order to use Xen PV block and network
-drivers instead. Under Xen, the boot disk is typically available both via IDE
-emulation, and as a PV block device. Guest bootloaders typically use IDE to load
-the guest kernel, which then unplugs the IDE and continues with the Xen PV block
-device.
-
-This configuration can be achieved as follows
-
-.. parsed-literal::
-
-  |qemu_system| -M pc --accel kvm,xen-version=0x4000a,kernel-irqchip=split \\
-       -drive file=${GUEST_IMAGE},if=none,id=disk,file.locking=off -device xen-disk,drive=disk,vdev=xvda \\
-       -drive file=${GUEST_IMAGE},index=2,media=disk,file.locking=off,if=ide
-
-It is necessary to use the pc machine type, as the q35 machine uses AHCI instead
-of legacy IDE, and AHCI disks are not unplugged through the Xen PV unplug
-mechanism.
-
-VirtIO devices can also be used; Linux guests may need to be dissuaded from
-umplugging them by adding 'xen_emul_unplug=never' on their command line.
-
 Properties
 ----------
 
@@ -63,7 +41,10 @@  The following properties exist on the KVM accelerator object:
 ``xen-version``
   This property contains the Xen version in ``XENVER_version`` form, with the
   major version in the top 16 bits and the minor version in the low 16 bits.
-  Setting this property enables the Xen guest support.
+  Setting this property enables the Xen guest support. If Xen version 4.5 or
+  greater is specified, the HVM leaf in Xen CPUID is populated. Xen version
+  4.6 enables the vCPU ID in CPUID, and version 4.17 advertises vCPU upcall
+  vector support to the guest.
 
 ``xen-evtchn-max-pirq``
   Xen PIRQs represent an emulated physical interrupt, either GSI or MSI, which
@@ -83,8 +64,71 @@  The following properties exist on the KVM accelerator object:
   through simultaneous grants. For guests with large numbers of PV devices and
   high throughput, it may be desirable to increase this value.
 
-OS requirements
----------------
+Xen paravirtual devices
+-----------------------
+
+The Xen PCI platform device is enabled automatically for a Xen guest. This
+allows a guest to unplug all emulated devices, in order to use paravirtual
+block and network drivers instead.
+
+Those paravirtual Xen block, network (and console) devices can be created
+through the command line, and/or hot-plugged.
+
+To provide a Xen console device, define a character device and then a device
+of type ``xen-console`` to connect to it. For the Xen console equivalent of
+the handy ``-serial mon:stdio`` option, for example:
+
+.. parsed-literal::
+   -chardev stdio,mux=on,id=char0,signal=off -mon char0 \\
+   -device xen-console,chardev=char0
+
+The Xen network device is ``xen-net-device``, which becomes the default NIC
+model for emulated Xen guests, meaning that just the default ``-nic user``
+should automatically work and present a Xen network device to the guest.
+
+Disks can be configured with '``-drive file=${GUEST_IMAGE},if=xen``' and will
+appear to the guest as ``xvda`` onwards.
+
+Under Xen, the boot disk is typically available both via IDE emulation, and
+as a PV block device. Guest bootloaders typically use IDE to load the guest
+kernel, which then unplugs the IDE and continues with the Xen PV block device.
+
+This configuration can be achieved as follows:
+
+.. parsed-literal::
+
+  |qemu_system| --accel kvm,xen-version=0x40011,kernel-irqchip=split \\
+       -drive file=${GUEST_IMAGE},if=xen \\
+       -drive file=${GUEST_IMAGE},file.locking=off,if=ide
+
+VirtIO devices can also be used; Linux guests may need to be dissuaded from
+unplugging them by adding '``xen_emul_unplug=never``' on their command line.
+
+Booting Xen PV guests
+---------------------
+
+Booting PV guest kernels is possible by using the Xen PV shim (a version of Xen
+itself, designed to run inside a Xen HVM guest and provide memory management
+services for one guest alone).
+
+The Xen binary is provided as the ``-kernel`` and the guest kernel itself (or
+PV Grub image) as the ``-initrd`` image, which actually just means the first
+multiboot "module". For example:
+
+.. parsed-literal::
+
+  |qemu_system| --accel kvm,xen-version=0x40011,kernel-irqchip=split \\
+       -chardev stdio,id=char0 -device xen-console,chardev=char0 \\
+       -display none  -m 1G  -kernel xen -initrd bzImage \\
+       -append "pv-shim console=xen,pv -- console=hvc0 root=/dev/xvda1" \\
+       -drive file=${GUEST_IMAGE},if=xen
+
+The Xen image must be built with the ``CONFIG_XEN_GUEST`` and ``CONFIG_PV_SHIM``
+options, and as of Xen 4.17, Xen's PV shim mode does not support using a serial
+port; it must have a Xen console or it will panic.
+
+Host OS requirements
+--------------------
 
 The minimal Xen support in the KVM accelerator requires the host to be running
 Linux v5.12 or newer. Later versions add optimisations: Linux v5.17 added