[v9,00/10] Increased address space for 64 bit

Message ID 20240919124511.282088-1-benjamin@sipsolutions.net

Benjamin Berg Sept. 19, 2024, 12:45 p.m. UTC
From: Benjamin Berg <benjamin.berg@intel.com>

The new version of the patchset uses execveat on a memfd instead of
cloning twice to disable rseq. This should be much more robust going
forward as it will also avoid issues with other new features like mseal.
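
For readers unfamiliar with the technique, here is a minimal sketch of
executing a binary from a memfd with a single fork. It is an illustration
under stated assumptions, not the code from this series: the helper names
and the "stub-sketch" memfd name are hypothetical, and raw syscalls are used
because older C libraries lack wrappers for memfd_create and execveat.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <linux/memfd.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef AT_EMPTY_PATH
#define AT_EMPTY_PATH 0x1000	/* value from the Linux UAPI headers */
#endif

/* Copy the file at src_path into a new memfd; returns the fd or -1. */
static int copy_to_memfd(const char *src_path)
{
	char buf[4096];
	ssize_t n;
	int src, fd;

	src = open(src_path, O_RDONLY);
	if (src < 0)
		return -1;

	/* "stub-sketch" is only a debugging label, chosen for this example. */
	fd = syscall(SYS_memfd_create, "stub-sketch", MFD_CLOEXEC);
	if (fd < 0) {
		close(src);
		return -1;
	}

	while ((n = read(src, buf, sizeof(buf))) > 0) {
		if (write(fd, buf, n) != n) {
			n = -1;
			break;
		}
	}
	close(src);
	if (n < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/*
 * Fork once and execute the memfd contents in the child. Returns the
 * child's exit status, or -1 on failure.
 */
static int run_from_memfd(const char *src_path, char *const argv[])
{
	char *const envp[] = { NULL };
	int status, fd;
	pid_t pid;

	fd = copy_to_memfd(src_path);
	if (fd < 0)
		return -1;

	pid = fork();
	if (pid == 0) {
		/* An empty path plus AT_EMPTY_PATH executes the fd itself. */
		syscall(SYS_execveat, fd, "", argv, envp, AT_EMPTY_PATH);
		_exit(127);	/* only reached if execveat failed */
	}
	close(fd);
	if (pid < 0 || waitpid(pid, &status, 0) != pid)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because the child starts from a fresh exec, it does not inherit per-process
state set up by the C library in the parent, which is the property the
paragraph above relies on.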

This patchset fixes a few bugs, adds a new method of discovering the
host task size, and switches to four-level page tables on 64 bit.
Together, these changes make the userspace TASK_SIZE much larger, so
userspace applications that need a large virtual address range now
work.

One such application is ASAN, which uses a fixed address in memory that
would otherwise not be addressable.
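
To give a rough sense of scale, the following sketch computes how much
virtual address space a page-table tree can cover. It assumes 4 KiB pages
and 512 entries per table level (the usual x86-64 layout); both constants
and the helper names are assumptions made for this example. The effective
TASK_SIZE is additionally capped by what the host can address.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT	12	/* assumed: 4 KiB pages */
#define SKETCH_BITS_PER_LEVEL	9	/* assumed: 512 entries per table */

/* Number of virtual address bits covered by a tree with this many levels. */
static unsigned int va_bits(unsigned int levels)
{
	return SKETCH_PAGE_SHIFT + levels * SKETCH_BITS_PER_LEVEL;
}

/* Size in bytes of the address range covered by that tree. */
static uint64_t va_size(unsigned int levels)
{
	return UINT64_C(1) << va_bits(levels);
}
```

Under these assumptions, three levels cover 39 bits (512 GiB) and four
levels cover 48 bits (256 TiB).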

v9:
* Drop support for 3-level page tables (on top of Tiwei Bie's patch)
* Add patches to set PR_SET_PDEATHSIG for more robust cleanup
* Clean up top address calculation a bit more
* Use code address for stub data location
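
As background for the PR_SET_PDEATHSIG item above, a minimal sketch of how
a child process can ask to be killed when its parent dies. The helper names
and the choice of SIGKILL are assumptions for illustration; the patches
themselves may differ.

```c
#include <signal.h>
#include <sys/prctl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Ask the kernel to send SIGKILL to the calling process when its parent
 * dies. Returns 0 on success, -1 on failure.
 */
static int die_with_parent(pid_t expected_parent)
{
	if (prctl(PR_SET_PDEATHSIG, SIGKILL) < 0)
		return -1;

	/* The parent may already have died before prctl() took effect. */
	if (getppid() != expected_parent)
		return -1;
	return 0;
}

/* Read back the configured parent death signal, or -1 on failure. */
static int parent_death_signal(void)
{
	int sig = 0;

	if (prctl(PR_GET_PDEATHSIG, &sig) < 0)
		return -1;
	return sig;
}
```

A child would typically call die_with_parent() right after fork(), passing
the parent PID it recorded before forking.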

v8:
* Make changes suggested by Johannes Berg

v7:
* Plenty of changes to fix 32 bit and improve the logic

v6:
* Apply fixes pointed out by Tiwei Bie
* Add temporary file fallback as memfd is not always supported

v5:
* Use execveat with memfd instead of double clone

v4:
* Do not use WNOHANG in wait for CLONE_VFORK

v3:
* Undo incorrect change in child wait loop

v2:
* Improved double clone logic using CLONE_VFORK
* Kconfig fixes pointed out by Tiwei Bie

Benjamin Berg (10):
  um: Add generic stub_syscall1 function
  um: use execveat to create userspace MMs
  um: Set parent death signal for userspace process
  um: Set parent death signal for winch thread/process
  um: Add compile time assert that stub fits on a page
  um: Calculate stub data address relative to stub code
  um: Limit TASK_SIZE to the addressable range
  um: Discover host_task_size from envp
  um: clear all memory in new userspace processes
  um: Switch to 4 level page tables on 64 bit

 arch/um/Kconfig                               |   4 +-
 arch/um/Makefile                              |   3 +-
 arch/um/drivers/chan_user.c                   |   3 +
 arch/um/include/asm/page.h                    |  14 +-
 arch/um/include/asm/pgalloc.h                 |  11 +-
 .../{pgtable-3level.h => pgtable-4level.h}    |  40 +++-
 arch/um/include/asm/pgtable.h                 |   8 +-
 arch/um/include/shared/as-layout.h            |   2 +-
 arch/um/include/shared/os.h                   |   3 -
 arch/um/include/shared/skas/stub-data.h       |  11 ++
 arch/um/kernel/dyn.lds.S                      |   3 +
 arch/um/kernel/mem.c                          |  17 +-
 arch/um/kernel/skas/.gitignore                |   2 +
 arch/um/kernel/skas/Makefile                  |  33 +++-
 arch/um/kernel/skas/mmu.c                     |  25 +--
 arch/um/kernel/skas/stub_exe.c                |  91 +++++++++
 arch/um/kernel/skas/stub_exe_embed.S          |  11 ++
 arch/um/kernel/um_arch.c                      |  34 +++-
 arch/um/os-Linux/main.c                       |   9 +-
 arch/um/os-Linux/mem.c                        |   2 +-
 arch/um/os-Linux/skas/process.c               | 181 ++++++++++++------
 arch/x86/um/Kconfig                           |   3 -
 arch/x86/um/os-Linux/Makefile                 |   2 +-
 arch/x86/um/os-Linux/task_size.c              | 151 ---------------
 arch/x86/um/shared/sysdep/stub_32.h           |  10 +-
 arch/x86/um/shared/sysdep/stub_64.h           |  19 +-
 26 files changed, 420 insertions(+), 272 deletions(-)
 rename arch/um/include/asm/{pgtable-3level.h => pgtable-4level.h} (66%)
 create mode 100644 arch/um/kernel/skas/.gitignore
 create mode 100644 arch/um/kernel/skas/stub_exe.c
 create mode 100644 arch/um/kernel/skas/stub_exe_embed.S
 delete mode 100644 arch/x86/um/os-Linux/task_size.c