mbox series

[v4,0/7] Rework x86 page table walks

Message ID 20240723010545.3648706-1-porter@cs.unc.edu
Headers show
Series Rework x86 page table walks | expand

Message

Don Porter July 23, 2024, 1:05 a.m. UTC
This version of the 'info pg' command adopts Peter Maydell's request
to write guest-agnostic page table iterator and accessor code, along
with architecture-specific hooks.  The first patch in this series
contributes a generic page table iterator and an x86 instantiation.
As a client, we first introduce an 'info pg' monitor command, as well
as a compressing callback hook for creating succinct page table
representations.

After this, each successive patch replaces an exisitng x86 page table
walker with a use of common iterator code.

I could use advice on how to ensure this is sufficiently well tested.
I used 'make check', 'make check-avocado', and 'make check-tcg' which
all behave comparably to master branch on my test system; what is the
typical standard for testing something like a page table related
change?

As far as generality, I have only attempted this on x86, but I expect
the design would work for any similar radix-tree style page table.

Per David Gilbert's suggestion, I was careful to ensure that monitor
calls do not perturb TLB state (see the read-only flag in some
functions).

Version 3 of this patch series moves 'info pg' into common monitor
code and implements the architecture-specific code hooks.  I did not
do this with the 'info mem' and 'info tlb' commands, since they have
implementations on other ISAs.

Version 4 of this patch series adopts significant feedback on both the
monitor side and the page table hooks.  I believe this addresses all
of the feedback from Richard Henderson on this, and considerably
reworked the code to use fewer hooks, recur on nested paging, and
hoist permission and reserved bit checking into common code.

Unfortunately, this exposes issues with nested paging and internal
virtualization APIs (or lack thereof).  AMD and Intel have some
architectural differences in how VM state is accessed; similarly, each
accelerator backend on qemu stores virtualization state in different
ways.  Since I cannot test the transitive closure of these
configurations, I focused on the test cases I had at hand (tcg and
kvm, with a guest that issues Intel-style EPT pages), and to detect
and warn on an unsupported configuration.

Finally, the only way to reliably get some architectural state from
kvm involved importing the vmcs12 definition from Linux.  I wasn't
sure whether to put this under the linux headers, or somewhere else,
since this definition is not in the standard Linux headers.

Thank you,
Don


Don Porter (7):
  Code motion: expose some TCG definitions for page table walk
    consolidation.
  Import vmcs12 definition from Linux/KVM
  Add an "info pg" command that prints the current page tables
  Convert 'info tlb' to use generic iterator.
  Convert 'info mem' to use generic iterator
  Convert x86_cpu_get_memory_mapping() to use generic iterators
  Convert x86_mmu_translate() to use common code.

 hmp-commands-info.hx                 |   13 +
 hw/core/cpu-sysemu.c                 |  168 +++-
 hw/core/machine-qmp-cmds.c           |  243 ++++++
 include/hw/core/cpu.h                |   78 +-
 include/hw/core/sysemu-cpu-ops.h     |  157 +++-
 include/monitor/hmp-target.h         |    1 +
 qapi/machine.json                    |   17 +
 system/memory_mapping.c              |    2 +-
 target/i386/arch_memory_mapping.c    | 1195 +++++++++++++++++++++-----
 target/i386/cpu.c                    |   25 +-
 target/i386/cpu.h                    |   64 +-
 target/i386/helper.c                 |   36 +
 target/i386/kvm/kvm.c                |   68 ++
 target/i386/kvm/vmcs12.h             |  213 +++++
 target/i386/monitor.c                |  797 +++++++----------
 target/i386/tcg/helper-tcg.h         |   32 +
 target/i386/tcg/seg_helper.c         |   36 -
 target/i386/tcg/sysemu/excp_helper.c |  454 +---------
 18 files changed, 2423 insertions(+), 1176 deletions(-)
 create mode 100644 target/i386/kvm/vmcs12.h

--
2.34.1

Comments

Richard Henderson July 24, 2024, 3:39 a.m. UTC | #1
Hi Don.

In addition to the other issues, this really needs to be broken up into many more patches.

Every patch should do *one* thing:

   - Code motion
   - Introduce an API
   - Introduce target-specific support for an API
   - Use an API to implement a monitor command
   - etc

Patch 3, 'Add an "info pg" command ...' attempts to do all of these at once, and so is at 
least 2000 lines too long, which makes the whole thing extraordinarily hard to review.


r~
Don Porter Aug. 2, 2024, 4:53 p.m. UTC | #2
On 7/23/24 23:39, Richard Henderson wrote:
> Hi Don.
>
> In addition to the other issues, this really needs to be broken up 
> into many more patches.
>
> Every patch should do *one* thing:
>
>   - Code motion
>   - Introduce an API
>   - Introduce target-specific support for an API
>   - Use an API to implement a monitor command
>   - etc
>
> Patch 3, 'Add an "info pg" command ...' attempts to do all of these at 
> once, and so is at least 2000 lines too long, which makes the whole 
> thing extraordinarily hard to review.
>
>
> r~

Hi Richard (and others),

I just wanted to acknowledge that this and the other feedback makes 
sense and I can apply them and send a new version, with the exception of 
what to do about KVM guest register state for Intel VT.

Thanks,

Don