From patchwork Sun Sep 13 12:43:35 2015
X-Patchwork-Submitter: Laszlo Ersek
X-Patchwork-Id: 517158
From: Laszlo Ersek
To: qemu-devel@nongnu.org
Date: Sun, 13 Sep 2015 14:43:35 +0200
Message-Id: <1442148227-17343-2-git-send-email-lersek@redhat.com>
In-Reply-To: <1442148227-17343-1-git-send-email-lersek@redhat.com>
References: <55F5647C.6030901@redhat.com>
 <1442148227-17343-1-git-send-email-lersek@redhat.com>
Cc: Gal Hammer, Paolo Bonzini, "Michael S. Tsirkin", Igor Mammedov
Subject: [Qemu-devel] [PATCH FYI 01/13] docs: describe QEMU's VMGenID design

Cc: Paolo Bonzini
Cc: Gal Hammer
Cc: Igor Mammedov
Cc: "Michael S. Tsirkin"
Signed-off-by: Laszlo Ersek
Acked-by: Michael S. Tsirkin
---

Notes:
    fyi:
    - move from docs/specs/ to docs/ [Eric, Paolo]
    - fix grammar [Eric]
    - clarify that requirement R1e covers ROM and MMIO too [Michael]
    - replace '"BOCHS"' with '"BOCHS "' in the DataTableRegion operator, so
      that the OEM ID argument matches ACPI_BUILD_APPNAME6 exactly
    - remove the _CRS with the IO descriptor in it, because Windows' VMGENID
      driver chokes on that (but is okay with the absence of the _CRS). See
      for more.

    rfc:
    - This is based on the super long private email discussion we had two
      months ago, plus on the IRL discussion between Michael and myself @ the
      KVM Forum 2015.

 docs/vmgenid.txt | 336 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 336 insertions(+)
 create mode 100644 docs/vmgenid.txt

diff --git a/docs/vmgenid.txt b/docs/vmgenid.txt
new file mode 100644
index 0000000..4a9c1d0
--- /dev/null
+++ b/docs/vmgenid.txt
@@ -0,0 +1,336 @@
+Virtual Machine Generation ID Device
+====================================
+
+The Microsoft specification entitled "Virtual Machine Generation ID",
+maintained at , defines an ACPI
+feature that allows the guest OSPM to recognize when it has been returned "to
+an earlier point in time", e.g. by restoring from snapshot, or by incoming
+migration. Quoting the spec,
+
+  The virtual machine generation ID is a feature whereby the virtual machine's
+  BIOS will expose a new ID. This is a 128-bit, cryptographically random
+  integer value identifier that will be different every time the virtual
+  machine executes from a different configuration file -- such as executing
+  from a recovered snapshot, or executing after restoring from backup. [...]
+
+The document you are reading now extracts the requirements set forth by the
+VMGenID spec for hypervisors that intend to provide the feature, and describes
+QEMU's implementation. The design below targets both SeaBIOS and OVMF as
+compatible guest firmwares, without any changes to either of them.
+
+Requirements
+------------
+
+These requirements are extracted from the "How to implement virtual machine
+generation ID support in a virtualization platform" section of the
+specification, dated August 1, 2012.
+
+R1a. The generation ID shall live in an 8-byte aligned buffer.
+
+R1b. The buffer holding the generation ID shall be in guest RAM, ROM, or device
+     MMIO range.
+
+R1c. The buffer holding the generation ID shall be kept separate from areas
+     used by the operating system.
+
+R1d. The buffer shall not be covered by an AddressRangeMemory or
+     AddressRangeACPI entry in the E820 or UEFI memory map.
+
+R1e. The generation ID shall not live in a page frame that could be mapped with
+     caching disabled. (In other words, regardless of whether the generation ID
+     lives in RAM, ROM or MMIO, it shall only be mapped as cacheable.)
+
+R2 to R5. [These AML requirements are isolated well enough in the Microsoft
+          specification for us to simply refer to them here.]
+
+R6. The hypervisor shall expose a _HID (hardware identifier) object in the
+    VMGenId device's scope that is unique to the hypervisor vendor.
+
+Generation ID buffer design
+---------------------------
+
+QEMU places the generation ID buffer inside a separate fw_cfg blob that is
+exposed to the guest OS with the ACPI linker/loader.
+
+The structure of the blob is as follows. Offsets, sizes and numeric values are
+given in decimal; furthermore the latter are encoded in little endian.
+
+  Offs  Field               Size  Value
+  ----  ------------------  ----  ------------------------------------
+     0  System Description    36
+        Table Header
+     0    Signature            4  "UEFI"
+     4    Length               4  62
+     8    Revision             1  1
+     9    Checksum             1  0
+    10    OEMID                6  ACPI_BUILD_APPNAME6 ("BOCHS ")
+    16    OEM Table ID         8  "QEMUPARM"
+    24    OEM Revision         4  1
+    28    Creator ID           4  ACPI_BUILD_APPNAME4 ("BXPC")
+    32    Creator Revision     4  1
+
+    36  UEFI Table            18
+        Sub-Header
+    36    Identifier          16  417a5dff-bf4b-4abc-a839-6593bb41f452
+    52    DataOffset           2  54
+
+    54  ADDR base pointer      8  62
+  ....................................................................
+    62  OVMF SDT Header       36  zeroes
+        probe suppressor
+    98  VMGenID alignment      6  zeroes
+        padding
+   104  generation ID         16  128-bit VMGenID
+   120  fw_cfg blob         3976  zeroes
+        padding
+  4096
+
+The fw_cfg blob is divided into two parts conceptually (separated by the dotted
+line in the diagram). The first part, up to and excluding offset 62, is a
+"UEFI" ACPI Table, governed by the UEFI specification 2.5, Appendix O. The
+second part is mainly padding, but it also contains the generation ID.
+
+The "UEFI" ACPI Table -- in the first part -- is a "normal" ACPI table whose
+generic header is defined by the ACPI specification, but for which the UEFI
+spec defines the "UEFI" signature and adds two more fixed fields, "Identifier"
+and "DataOffset".
+
+- The Identifier field carries a 128-bit GUID, and enables firmware
+  implementors to install several "UEFI" tables with different internal
+  structures, enabling OSPM to tell them apart based on the (Type-)Identifier
+  GUID field.
+
+  For the purposes of QEMU's VMGenID implementation, we generated a new GUID
+  with the "uuidgen" utility. It should be different from all other
+  "Identifier" values, present and future, but otherwise no other software need
+  be aware of the concrete GUID value we generated.
+
+- The DataOffset field is just an offset into the table where the actual
+  (Identifier-specific) data starts.
+
+  For the purposes of QEMU's VMGenID implementation, we simply set it to the
+  offset of the next (QEMU-specific) field, "ADDR base pointer".
+
+Linker/loader commands
+----------------------
+
+The name of the fw_cfg blob is "etc/acpi/qemuparam". The ALLOCATE command that
+instructs the guest firmware to download this fw_cfg blob specifies an
+alignment of 4096, and the blob will have size 4096 too.
+
+An ADD_POINTER command links the "UEFI" ACPI Table at the start of the blob
+into the RSDT.
+
+Another ADD_POINTER command relocates the "ADDR base pointer" field to the
+absolute address of the "OVMF SDT Header probe suppressor" field, within the
+same blob.
+
+After this relocation, an ADD_CHECKSUM command updates the Checksum field,
+covering the entire "UEFI" ACPI Table (which extends up to and excluding offset
+62).
+
+Blob behavior under SeaBIOS
+---------------------------
+
+(Most of the complexity in the blob is ignored when the guest firmware is
+SeaBIOS.)
+
+- SeaBIOS's ACPI linker/loader client allocates the blob in normal RAM
+  (satisfying R1b).
+
+- Because the ALLOCATE command prescribes an alignment of 4KB, and the blob's
+  size is also 4KB, the allocation covers a standalone page frame in full
+  (satisfying R1e).
+
+- The 128-bit VMGenID field is located at offset 104 within that page,
+  resulting in a guest-physical address divisible by 8 (satisfying R1a).
+
+- The blob is marked as Reserved in the E820 map (satisfying R1c and R1d).
+
+- The "UEFI" ACPI Table at the start of the blob is linked into the RSDT,
+  in-place.
+
+- The "ADDR" AML method (see later) is allowed to refer to the "UEFI" ACPI
+  Table with the DataTableRegion operator, because the table is located in
+  memory marked as AddressRangeReserved.
+
+- The "ADDR base pointer" field points at "OVMF SDT Header probe suppressor",
+  which is right after the "UEFI" ACPI Table inside the blob. At OSPM runtime,
+  the "ADDR" AML method reads the "ADDR base pointer" field, and adds 42, to
+  arrive at the address of the VMGenID field.
+
+    blob @ page offset 0               RSDT
+    +-----------------------+         +-----+
+    | "UEFI" ACPI Table <---------+   | ... |
+    | +-------------------+ |     |   | ... |
+    | | ...               | |     +---- ... |
+    | | ...               | |         +-----+
+    | | ADDR base pointer -----+
+    | +-------------------+ |  |
+    | probe suppressor <-------+
+    | VMGenID @ offset 104  |
+    | padding               |
+    +-----------------------+
+
+Blob behavior under OVMF
+------------------------
+
+The complexity in the blob is required by the two-pass nature of OVMF's ACPI
+linker/loader client, which in turn comes from the fact that OVMF has to
+dissect blobs into individual ACPI tables vs. "other things", tracking the
+ADD_POINTER commands, so that tables can be installed individually, with
+EFI_ACPI_TABLE_PROTOCOL.
+
+- OVMF's ACPI linker/loader client allocates the blob in normal RAM (satisfying
+  R1b).
+
+- Because the ALLOCATE command prescribes an alignment of 4KB, and the blob's
+  size is also 4KB, the allocation covers a standalone page frame in full
+  (satisfying R1e).
+
+- The 128-bit VMGenID field is located at offset 104 within that page,
+  resulting in a guest-physical address divisible by 8 (satisfying R1a).
+
+- OVMF's ACPI linker/loader allocates the blob in EfiACPIMemoryNVS type memory,
+  therefore it is marked as such in the UEFI memmap (satisfying R1c and R1d).
+
+- OVMF identifies the "UEFI" ACPI Table at the start of the blob in the second
+  pass, following the ADD_POINTER command that is meant to link the table into
+  the RSDT. OVMF installs a *copy* of the "UEFI" ACPI Table with
+  EFI_ACPI_TABLE_PROTOCOL (linking the copy into both RSDT and XSDT). Given the
+  "UEFI" signature of the table, EFI_ACPI_TABLE_PROTOCOL places the copy of the
+  table in EfiACPIMemoryNVS type memory.
+
+- The "ADDR" AML method (see later) is allowed to refer to the "UEFI" ACPI
+  Table with the DataTableRegion operator, because the table is located in
+  memory marked as AddressRangeNVS.
+
+- The "ADDR base pointer" field inside the installed table points at "OVMF SDT
+  Header probe suppressor" in the original blob. Because this field is filled
+  with zeros, OVMF's table identification heuristics unconditionally report a
+  negative when it tracks the relevant ADD_POINTER command to it in the second
+  pass. Therefore the blob is marked as "hosts something else than just ACPI
+  tables", and it is preserved permanently (in the same EfiACPIMemoryNVS type
+  memory where it has been originally allocated).
+
+  At OSPM runtime, the "ADDR" AML method reads the "ADDR base pointer" field,
+  and adds 42, to arrive at the address of the VMGenID field.
+
+    blob @ page offset 0                RSDT         XSDT
+    +-----------------------------+    +-----+      +-----+
+    | "UEFI" ACPI Table (in blob) |    | ... |      | ... |
+    | +-------------------------+ |    | ... ---+   | ... ---------------+
+    | |XXXXXXXXXXXXXXXXXXXXXXXXX| |    +-----+  |   +-----+              |
+    | |XXXXXXX [unused] XXXXXXXX| |             |                        |
+    | |XXXXXXXXXXXXXXXXXXXXXXXXX| |             +------------------------+
+    | +-------------------------+ |                                      |
+    | probe suppressor <-------------+  "UEFI" ACPI Table (installed) <--+
+    | VMGenID @ offset 104        |  |  +---------------------------+
+    | padding                     |  |  | ...                       |
+    +-----------------------------+  |  | ...                       |
+                                     +--- ADDR base pointer         |
+                                        +---------------------------+
+
+ACPI device, control methods
+----------------------------
+
+Requirements R2 through R6 of the VMGenID specification are satisfied with the
+following ACPI logic, exposed by QEMU's ACPI generator in one of the SSDTs, and
+installed by both guest firmwares as such.
+
+The basic idea is that, when the appropriate guest driver calls the ADDR method
+(see R4), OSPM locates the generation ID field in the 4KB blob that lives in
+E820 Reserved (SeaBIOS) or EfiACPIMemoryNVS type (OVMF) memory. The
+guest-physical address of the field is communicated to QEMU via IO ports
+[0x512..0x519] inclusive. Then QEMU is cued through IO port 0x51A to refresh
+(and keep refreshing when appropriate) the generation ID at the passed back
+address. Finally, the method returns the address to the guest driver too, in
+the format required by R4.
+
+    Scope(\_SB) {
+        Device (VMGI) {
+            /* satisfy R2 */
+            Name (_CID, "VM_Gen_Counter")
+
+            /* satisfy R3 */
+            Name (_DDN, "VM_Gen_Counter")
+
+            /* satisfy R6 */
+            Name (_HID, "QEMU0002")
+
+            /* Device status: present, enabled & decoding resources, should be
+             * shown in the UI, functioning properly.
+             */
+            Name (_STA, 0xF)
+
+            /* Satisfy R4.
+             *
+             * This method is serialized because it creates named objects.
+             */
+            Method (ADDR, 0, Serialized) {
+                /* The 8-byte integer field defined as ADBP below is the
+                 * "ADDR base pointer" field in the UEFI ACPI Table.
+                 *
+                 * The DataTableRegion() operator locates that ACPI table by
+                 * scanning the RSDT/XSDT using the (SignatureString,
+                 * OemIDString, OemTableIDString) triplet as key.
+                 *
+                 * Windows XP would normally crash on the DataTableRegion()
+                 * operator, but it never calls the ADDR method, hence it never
+                 * reaches or evaluates DataTableRegion().
+                 */
+                DataTableRegion (TBLR, "UEFI", "BOCHS ", "QEMUPARM")
+                Field (TBLR, AnyAcc, NoLock, Preserve) {
+                    Offset (54),
+                    ADBP, 64
+                }
+
+                /* The first two 4-byte ports are used to communicate the
+                 * 64-bit guest-physical address of the actual (relocated)
+                 * 128-bit generation ID field to QEMU, in little endian
+                 * encoding, so that QEMU can rewrite that field in guest RAM.
+                 *
+                 * A write to the last 1-byte port signals that the address has
+                 * been written fully, and QEMU is free to dereference it.
+                 */
+                OperationRegion (VMGR, SystemIO, 0x512, 9)
+                Field (VMGR, DWordAcc, NoLock, Preserve) {
+                    PTLO, 32,
+                    PTHI, 32,
+                    AccessAs (ByteAcc),
+                    DONE, 8
+                }
+
+                /* The ADBP field points to the "OVMF SDT Header probe
+                 * suppressor" area in the blob, at offset 62. In order to
+                 * arrive at the generation ID field at offset 104, we must add
+                 * 42 dynamically.
+                 *
+                 * The RESU buffer below will contain the result of the
+                 * addition. The ADFU field exposes it as an 8-byte integer
+                 * (for storing the sum), while the ADLO and ADHI fields enable
+                 * us to access the result in two separate 4-byte integers.
+                 * This exact integer width is especially important for
+                 * composing the package object that the ADDR method must
+                 * return.
+                 */
+                Name (RESU, Buffer (8) {})
+                CreateQWordField (RESU, 0, ADFU)
+                CreateDWordField (RESU, 0, ADLO)
+                CreateDWordField (RESU, 4, ADHI)
+
+                Add (ADBP, 42, ADFU)
+                Store (ADLO, PTLO)
+                Store (ADHI, PTHI)
+                Store (0, DONE)
+                Return (Package (2) { ADLO, ADHI })
+            }
+        }
+    }
+
+    /* satisfy R5 */
+    Scope (\_GPE) {
+        Method (_E04) {
+            Notify (\_SB.VMGI, 0x80)
+        }
+    }
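
As a reading aid for the blob layout and the ADDR arithmetic in the patch above, here is a minimal Python sketch; it is not part of the patch and not QEMU code. The function names (build_blob, apply_linker_commands, addr_method) and the example base address are hypothetical. Storing the Identifier GUID in mixed-endian form (Python's bytes_le) is an assumption about how the "little endian" note applies to GUIDs. Only the in-blob ADD_POINTER and the ADD_CHECKSUM commands are modeled; linking the table into the RSDT is left out.

```python
import struct
import uuid
from typing import Tuple

BLOB_SIZE = 4096
UEFI_TABLE_LEN = 62      # the "UEFI" ACPI Table covers offsets [0, 62)
CHECKSUM_OFF = 9         # Checksum field in the SDT header
ADDR_BASE_PTR_OFF = 54   # "ADDR base pointer" field
SUPPRESSOR_OFF = 62      # "OVMF SDT Header probe suppressor"
VMGENID_OFF = 104        # 128-bit generation ID
IDENTIFIER = uuid.UUID("417a5dff-bf4b-4abc-a839-6593bb41f452")


def build_blob(vmgenid: bytes) -> bytearray:
    """Lay out the 4096-byte fw_cfg blob as in the offset table."""
    assert len(vmgenid) == 16
    blob = bytearray(BLOB_SIZE)
    # 36-byte System Description Table Header, little endian, no padding.
    struct.pack_into("<4sIBB6s8sI4sI", blob, 0,
                     b"UEFI", UEFI_TABLE_LEN, 1, 0,
                     b"BOCHS ", b"QEMUPARM", 1, b"BXPC", 1)
    # UEFI table sub-header: Identifier (assumed mixed-endian), DataOffset.
    blob[36:52] = IDENTIFIER.bytes_le
    struct.pack_into("<H", blob, 52, ADDR_BASE_PTR_OFF)
    # Before relocation the pointer holds a blob-relative offset (62).
    struct.pack_into("<Q", blob, ADDR_BASE_PTR_OFF, SUPPRESSOR_OFF)
    blob[VMGENID_OFF:VMGENID_OFF + 16] = vmgenid
    return blob


def apply_linker_commands(blob: bytearray, base: int) -> None:
    """Mimic the in-blob ADD_POINTER relocation, then ADD_CHECKSUM."""
    assert base % BLOB_SIZE == 0  # ALLOCATE asks for 4096-byte alignment
    (offset,) = struct.unpack_from("<Q", blob, ADDR_BASE_PTR_OFF)
    struct.pack_into("<Q", blob, ADDR_BASE_PTR_OFF, base + offset)
    # ACPI checksum: all bytes of the table must sum to 0 modulo 256.
    blob[CHECKSUM_OFF] = 0
    blob[CHECKSUM_OFF] = -sum(blob[:UEFI_TABLE_LEN]) & 0xFF


def addr_method(blob: bytearray) -> Tuple[int, int]:
    """Arithmetic of the ADDR method: ADBP + 42, returned as (ADLO, ADHI)."""
    (adbp,) = struct.unpack_from("<Q", blob, ADDR_BASE_PTR_OFF)
    adfu = (adbp + (VMGENID_OFF - SUPPRESSOR_OFF)) & 0xFFFFFFFFFFFFFFFF
    return adfu & 0xFFFFFFFF, adfu >> 32


# Example run with a hypothetical page-aligned guest-physical address.
example_blob = build_blob(bytes(range(16)))
apply_linker_commands(example_blob, 0x7FFEF000)
example_lo, example_hi = addr_method(example_blob)
```

Because the base is 4096-byte aligned and 104 is a multiple of 8, the returned address is 8-byte aligned for any base, which is the property R1a asks for.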