From patchwork Thu Jul 4 16:27:11 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Benjamin Berg X-Patchwork-Id: 1956975 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; secure) header.d=lists.infradead.org header.i=@lists.infradead.org header.a=rsa-sha256 header.s=bombadil.20210309 header.b=zeGbbiJI; dkim=fail reason="signature verification failed" (2048-bit key; secure) header.d=sipsolutions.net header.i=@sipsolutions.net header.a=rsa-sha256 header.s=mail header.b=ph4fDmZ/; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=none (no SPF record) smtp.mailfrom=lists.infradead.org (client-ip=2607:7c80:54:3::133; helo=bombadil.infradead.org; envelope-from=linux-um-bounces+incoming=patchwork.ozlabs.org@lists.infradead.org; receiver=patchwork.ozlabs.org) Received: from bombadil.infradead.org (bombadil.infradead.org [IPv6:2607:7c80:54:3::133]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4WFMWW38Zkz1xqs for ; Fri, 5 Jul 2024 02:27:51 +1000 (AEST) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender:List-Subscribe:List-Help :List-Post:List-Archive:List-Unsubscribe:List-Id:Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-ID:Date:Subject:Cc:To:From: Reply-To:Content-Type:Content-ID:Content-Description:Resent-Date:Resent-From: Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Owner; bh=/RHcd/yUmz0Gk8fIDLX6wsZ8ME7gFNvXydNGe/JBc3w=; b=zeGbbiJIsPB2nDbPTmjp2xR+D6 cxEjam3leWr62sD22bVYPc3bQN+fmyB/jy/l8cmwwfaDd+nXvnCtCPMKgWm68p6RycFhNNWe/toHE 87xGRTMxtQBQeBzI9g1GaF13q1XsZbaUXGHpJ1Xjsu00vX99WWfWKpu/Ekkwx2D44rsaeGH9rtaK+ Je+T1UuDAmDz4kx0Rx5rcfCoXPTDr85r5w1ew5dIwZL127NhAjnpxn7IWbsj2QJfsKnMhhYRbdVgw sANeqJG3DEbQMb0WXpfhrltTtOZtVCOZ7B9i1O578bef1t1p1SqqR6h4hYFfefffrdMd+DxzbMPTS V+z0UWLA==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1sPPJF-0000000DqFF-2ins; Thu, 04 Jul 2024 16:27:49 +0000 Received: from s3.sipsolutions.net ([2a01:4f8:242:246e::2] helo=sipsolutions.net) by bombadil.infradead.org with esmtps (Exim 4.97.1 #2 (Red Hat Linux)) id 1sPPJC-0000000DqDO-2TGp for linux-um@lists.infradead.org; Thu, 04 Jul 2024 16:27:48 +0000 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=sipsolutions.net; s=mail; h=Content-Transfer-Encoding:MIME-Version: References:In-Reply-To:Message-ID:Date:Subject:Cc:To:From:Content-Type:Sender :Reply-To:Content-ID:Content-Description:Resent-Date:Resent-From:Resent-To: Resent-Cc:Resent-Message-ID; bh=/RHcd/yUmz0Gk8fIDLX6wsZ8ME7gFNvXydNGe/JBc3w=; t=1720110466; x=1721320066; b=ph4fDmZ/iFKZj10Hch8KYx0eQHH4fzKNTl4W/nMulv/m6tF V+fugDy9OUsraD8ltw6kcW259EF7mFQ+SNh4Y9d/1nUyCIa0kmZzTr1mPXx4D84H+aV4c5EXOihJf zWBeRvM52Xo6FN0DESHXSweEtQfNosn47hTJ6SOnpM+fET6Ar7msvEI0eqmAFb8SV3QDd/uxqjZW3 JHsp117s7tzwt6OKotgby63naUbPa3F61a2b9joIU6Zg9+4nRbmoQ8RnTSrgqEcMmWYqnmJLKroV7 5GBdzuI+lUPM17+0fZ2aHgmwXHNBv2IU9fv0OEWqFYhaN5M2dwlLLLVMVlVVkMqQ==; Received: by sipsolutions.net with esmtpsa (TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim 4.97) (envelope-from ) id 1sPPJ9-0000000DOu8-06sm; Thu, 04 Jul 2024 18:27:43 +0200 From: Benjamin Berg To: linux-um@lists.infradead.org Cc: Benjamin Berg Subject: [PATCH v7 1/7] um: Add generic stub_syscall1 function Date: Thu, 4 Jul 2024 18:27:11 +0200 Message-ID: <20240704162717.1417338-2-benjamin@sipsolutions.net> X-Mailer: git-send-email 2.45.2 In-Reply-To: <20240704162717.1417338-1-benjamin@sipsolutions.net> References: <20240704162717.1417338-1-benjamin@sipsolutions.net> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20240704_092746_649922_B8E60712 X-CRM114-Status: UNSURE ( 7.31 ) X-CRM114-Notice: Please train this message. X-Spam-Score: -0.2 (/) X-Spam-Report: Spam detection software, running on the system "bombadil.infradead.org", has NOT identified this incoming email as spam. The original message has been attached to this so you can view it or label similar future email. If you have any questions, see the administrator of that system for details. Content preview: From: Benjamin Berg The 64bit version did not have a stub_syscall1 function yet. Add it as it will be useful to implement a static binary for stub loading. Signed-off-by: Benjamin Berg --- arch/x86/um/shared/sysdep/stub_64.h | 11 +++++++++++ 1 file changed, 11 insertions(+) Content analysis details: (-0.2 points, 5.0 required) pts rule name description ---- ---------------------- -------------------------------------------------- -0.0 SPF_HELO_PASS SPF: HELO matches SPF record -0.0 SPF_PASS SPF: sender matches SPF record -0.1 DKIM_VALID_AU Message has a valid DKIM or DK signature from author's domain -0.1 DKIM_VALID_EF Message has a valid DKIM or DK signature from envelope-from domain -0.1 DKIM_VALID Message has at least one valid DKIM or DK signature 0.1 DKIM_SIGNED Message has a DKIM or DK signature, not necessarily valid X-BeenThere: linux-um@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-um" Errors-To: linux-um-bounces+incoming=patchwork.ozlabs.org@lists.infradead.org From: Benjamin Berg The 64bit version did not have a stub_syscall1 function yet. Add it as it will be useful to implement a static binary for stub loading. Signed-off-by: Benjamin Berg --- arch/x86/um/shared/sysdep/stub_64.h | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/arch/x86/um/shared/sysdep/stub_64.h b/arch/x86/um/shared/sysdep/stub_64.h index 67f44284f1aa..8e4ff39dcade 100644 --- a/arch/x86/um/shared/sysdep/stub_64.h +++ b/arch/x86/um/shared/sysdep/stub_64.h @@ -28,6 +28,17 @@ static __always_inline long stub_syscall0(long syscall) return ret; } +static __always_inline long stub_syscall1(long syscall, long arg1) +{ + long ret; + + __asm__ volatile (__syscall + : "=a" (ret) + : "0" (syscall), "D" (arg1) : __syscall_clobber ); + + return ret; +} + static __always_inline long stub_syscall2(long syscall, long arg1, long arg2) { long ret; From patchwork Thu Jul 4 16:27:12 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Benjamin Berg X-Patchwork-Id: 1956976 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; secure) header.d=lists.infradead.org header.i=@lists.infradead.org header.a=rsa-sha256 header.s=bombadil.20210309 header.b=j0+VrJWH; dkim=fail reason="signature verification failed" (2048-bit key; secure) header.d=sipsolutions.net header.i=@sipsolutions.net header.a=rsa-sha256 header.s=mail header.b=r3YbC6vc; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=none (no SPF record) smtp.mailfrom=lists.infradead.org (client-ip=2607:7c80:54:3::133; helo=bombadil.infradead.org; envelope-from=linux-um-bounces+incoming=patchwork.ozlabs.org@lists.infradead.org; receiver=patchwork.ozlabs.org) Received: from bombadil.infradead.org (bombadil.infradead.org [IPv6:2607:7c80:54:3::133]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4WFMWX5TbSz1xpP for ; Fri, 5 Jul 2024 02:27:52 +1000 (AEST) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender:List-Subscribe:List-Help :List-Post:List-Archive:List-Unsubscribe:List-Id:Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-ID:Date:Subject:Cc:To:From: Reply-To:Content-Type:Content-ID:Content-Description:Resent-Date:Resent-From: Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Owner; bh=E93e4GpNH+Hcp58UzLVXHt82OzpdISwr0ut/TgWwcFE=; b=j0+VrJWHXyrGzquswkwl/NtSyP vilTzAyUkciZLX/mdkJQGq2Jqva/DpFf9O855nLDF1zv+hlQm07+ZGGnbmrRgHKcN9nt634z/W7o4 f7fTWsE0dGJgnUoiC16LP/coJwzXeTKNp5SmS6MFyQmPu+XOBKny34vyAUTOy4U8PweplHRzsaj3O JkBB0blQIRkdw5/wwPzwTpk736RyyDwE6U+noKHnmpEeKaChJ70j4idC6TUrJJCgvkYQVMEtBETdF Ho1O6O0eGZRn2KJEvysfrUkb6OTHjU0Yxk5YkVU3fUlzPHzn3DWyu74W/Yhb3sZM6wzGzZI5gmebD O7zaHWAQ==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1sPPJH-0000000DqGD-0Ggx; Thu, 04 Jul 2024 16:27:51 +0000 Received: from s3.sipsolutions.net ([2a01:4f8:242:246e::2] helo=sipsolutions.net) by bombadil.infradead.org with esmtps (Exim 4.97.1 #2 (Red Hat Linux)) id 1sPPJD-0000000DqDf-0Sfz for linux-um@lists.infradead.org; Thu, 04 Jul 2024 16:27:49 +0000 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=sipsolutions.net; s=mail; h=Content-Transfer-Encoding:MIME-Version: References:In-Reply-To:Message-ID:Date:Subject:Cc:To:From:Content-Type:Sender :Reply-To:Content-ID:Content-Description:Resent-Date:Resent-From:Resent-To: Resent-Cc:Resent-Message-ID; bh=E93e4GpNH+Hcp58UzLVXHt82OzpdISwr0ut/TgWwcFE=; t=1720110466; x=1721320066; b=r3YbC6vcwmt/NTAj9oz1QZneO6RNScuT+Gk91mvGhKtMeMG pnCFXPBRDiAnOMMpMBexFfZjMJKRWx+jq7R8QAg2qcABadxlAyP0vahRYRUsFnXPPRseEAL0lGe9f Zw9uS62Og2f+qd+FtvsT8uA1nLktnkk0I2+2dmJ3HV7CIqUanRpBJFzWmXj17zO4xqNxZRNV6y5aI yO5m7BJ1plNtJBumYJeENAQV1B/UeSp8c7Zs7f3LEh6BrLoUwiVN7toNPMQ+VuU5vHi7ecD4OwBO4 aHJdpsYIvSjehuUPBU0LCTTUPJjGeW0DxM/4dR0UO9gVdOzLaJVsL371WsXSwqmw==; Received: by sipsolutions.net with esmtpsa (TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim 4.97) (envelope-from ) id 1sPPJA-0000000DOu8-3zSH; Thu, 04 Jul 2024 18:27:45 +0200 From: Benjamin Berg To: linux-um@lists.infradead.org Cc: Benjamin Berg Subject: [PATCH v7 2/7] um: use execveat to create userspace MMs Date: Thu, 4 Jul 2024 18:27:12 +0200 Message-ID: <20240704162717.1417338-3-benjamin@sipsolutions.net> X-Mailer: git-send-email 2.45.2 In-Reply-To: <20240704162717.1417338-1-benjamin@sipsolutions.net> References: <20240704162717.1417338-1-benjamin@sipsolutions.net> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20240704_092747_612692_09F03110 X-CRM114-Status: GOOD ( 33.31 ) X-Spam-Score: -0.2 (/) X-Spam-Report: Spam detection software, running on the system "bombadil.infradead.org", has NOT identified this incoming email as spam. The original message has been attached to this so you can view it or label similar future email. If you have any questions, see the administrator of that system for details. Content preview: From: Benjamin Berg Using clone will not undo features that have been enabled by libc. An example of this already happening is rseq, which could cause the kernel to read/write memory of the userspace process. In the futu [...] Content analysis details: (-0.2 points, 5.0 required) pts rule name description ---- ---------------------- -------------------------------------------------- -0.0 SPF_HELO_PASS SPF: HELO matches SPF record -0.0 SPF_PASS SPF: sender matches SPF record -0.1 DKIM_VALID_AU Message has a valid DKIM or DK signature from author's domain -0.1 DKIM_VALID_EF Message has a valid DKIM or DK signature from envelope-from domain -0.1 DKIM_VALID Message has at least one valid DKIM or DK signature 0.1 DKIM_SIGNED Message has a DKIM or DK signature, not necessarily valid X-BeenThere: linux-um@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-um" Errors-To: linux-um-bounces+incoming=patchwork.ozlabs.org@lists.infradead.org From: Benjamin Berg Using clone will not undo features that have been enabled by libc. An example of this already happening is rseq, which could cause the kernel to read/write memory of the userspace process. In the future the standard library might also use mseal by default to protect itself, which would also thwart our attempts at unmapping everything. Solve all this by taking a step back and doing an execve into a tiny static binary that sets up the minimal environment required for the stub without using any standard library. That way we have a clean execution environment that is fully under the control of UML. Note that this changes things a bit as the FDs are not anymore shared with the kernel. Instead, we explicitly share the FDs for the physical memory and all existing iomem regions. Doing this is fine, as iomem regions cannot be added at runtime. Signed-off-by: Benjamin Berg --- v7: - Rename stub_elf to stub_exe - Move into architecture independent directory - Fix 32 bit issues - Improve tempfile logic - Other cleanups v6: - Apply fixes pointed out by Tiwei Bie - Add temporary file fallback as memfd is not always supported --- arch/um/include/shared/skas/stub-data.h | 11 ++ arch/um/kernel/skas/.gitignore | 2 + arch/um/kernel/skas/Makefile | 33 ++++- arch/um/kernel/skas/stub_exe.c | 86 +++++++++++ arch/um/kernel/skas/stub_exe_embed.S | 11 ++ arch/um/os-Linux/mem.c | 2 +- arch/um/os-Linux/skas/process.c | 183 ++++++++++++++++-------- 7 files changed, 269 insertions(+), 59 deletions(-) create mode 100644 arch/um/kernel/skas/.gitignore create mode 100644 arch/um/kernel/skas/stub_exe.c create mode 100644 arch/um/kernel/skas/stub_exe_embed.S diff --git a/arch/um/include/shared/skas/stub-data.h b/arch/um/include/shared/skas/stub-data.h index 2b6b44759dfa..3fbdda727373 100644 --- a/arch/um/include/shared/skas/stub-data.h +++ b/arch/um/include/shared/skas/stub-data.h @@ -12,6 +12,17 @@ #include #include +struct stub_init_data { + unsigned long stub_start; + + int stub_code_fd; + unsigned long stub_code_offset; + int stub_data_fd; + unsigned long stub_data_offset; + + unsigned long segv_handler; +}; + #define STUB_NEXT_SYSCALL(s) \ ((struct stub_syscall *) (((unsigned long) s) + (s)->cmd_len)) diff --git a/arch/um/kernel/skas/.gitignore b/arch/um/kernel/skas/.gitignore new file mode 100644 index 000000000000..c3409ced0f38 --- /dev/null +++ b/arch/um/kernel/skas/.gitignore @@ -0,0 +1,2 @@ +stub_exe +stub_exe.dbg diff --git a/arch/um/kernel/skas/Makefile b/arch/um/kernel/skas/Makefile index 6f86d53e3d69..fbb61968055f 100644 --- a/arch/um/kernel/skas/Makefile +++ b/arch/um/kernel/skas/Makefile @@ -3,14 +3,43 @@ # Copyright (C) 2002 - 2007 Jeff Dike (jdike@{addtoit,linux.intel}.com) # -obj-y := stub.o mmu.o process.o syscall.o uaccess.o +obj-y := stub.o mmu.o process.o syscall.o uaccess.o \ + stub_exe_embed.o + +# Stub executable + +stub_exe_objs-y := stub_exe.o + +stub_exe_objs := $(foreach F,$(stub_exe_objs-y),$(obj)/$F) + +# Object file containing the ELF executable +$(obj)/stub_exe_embed.o: $(src)/stub_exe_embed.S $(obj)/stub_exe + +$(obj)/stub_exe.dbg: $(stub_exe_objs) FORCE + $(call if_changed,stub_exe) + +$(obj)/stub_exe: OBJCOPYFLAGS := -S +$(obj)/stub_exe: $(obj)/stub_exe.dbg FORCE + $(call if_changed,objcopy) + +quiet_cmd_stub_exe = STUB_EXE $@ + cmd_stub_exe = $(CC) -nostdlib -o $@ \ + $(KBUILD_CFLAGS) $(STUB_EXE_LDFLAGS) \ + $(filter %.o,$^) + +STUB_EXE_LDFLAGS = -n -static + +targets += stub_exe.dbg stub_exe $(stub_exe_objs-y) + +# end # stub.o is in the stub, so it can't be built with profiling # GCC hardened also auto-enables -fpic, but we need %ebx so it can't work -> # disable it CFLAGS_stub.o := $(CFLAGS_NO_HARDENING) -UNPROFILE_OBJS := stub.o +CFLAGS_stub_exe.o := $(CFLAGS_NO_HARDENING) +UNPROFILE_OBJS := stub.o stub_exe.o KCOV_INSTRUMENT := n include $(srctree)/arch/um/scripts/Makefile.rules diff --git a/arch/um/kernel/skas/stub_exe.c b/arch/um/kernel/skas/stub_exe.c new file mode 100644 index 000000000000..bec471aeda30 --- /dev/null +++ b/arch/um/kernel/skas/stub_exe.c @@ -0,0 +1,86 @@ +#include +#include +#include +#include +#include + +void _start(void); + +noinline static void real_init(void) +{ + struct stub_init_data init_data; + unsigned long res; + struct { + void *ss_sp; + int ss_flags; + size_t ss_size; + } stack; + struct { + void *sa_handler_; + unsigned long sa_flags; + void *sa_restorer; + unsigned long long sa_mask; + } sa = {}; + + /* set a nice name */ + stub_syscall2(__NR_prctl, PR_SET_NAME, (unsigned long)"uml-userspace"); + + /* read information from STDIN and close it */ + res = stub_syscall3(__NR_read, 0, + (unsigned long)&init_data, sizeof(init_data)); + if (res != sizeof(init_data)) + stub_syscall1(__NR_exit, 10); + + stub_syscall1(__NR_close, 0); + + /* map stub code + data */ + res = stub_syscall6(STUB_MMAP_NR, + init_data.stub_start, UM_KERN_PAGE_SIZE, + PROT_READ | PROT_EXEC, MAP_FIXED | MAP_SHARED, + init_data.stub_code_fd, init_data.stub_code_offset); + if (res != init_data.stub_start) + stub_syscall1(__NR_exit, 11); + + res = stub_syscall6(STUB_MMAP_NR, + init_data.stub_start + UM_KERN_PAGE_SIZE, + STUB_DATA_PAGES * UM_KERN_PAGE_SIZE, + PROT_READ | PROT_WRITE, MAP_FIXED | MAP_SHARED, + init_data.stub_data_fd, init_data.stub_data_offset); + if (res != init_data.stub_start + UM_KERN_PAGE_SIZE) + stub_syscall1(__NR_exit, 12); + + /* setup signal stack inside stub data */ + stack.ss_flags = 0; + stack.ss_size = STUB_DATA_PAGES * UM_KERN_PAGE_SIZE; + stack.ss_sp = (void *)init_data.stub_start + UM_KERN_PAGE_SIZE; + stub_syscall2(__NR_sigaltstack, (unsigned long)&stack, 0); + + /* register SIGSEGV handler (SA_RESTORER, the handler never returns) */ + sa.sa_flags = SA_ONSTACK | SA_NODEFER | SA_SIGINFO | 0x04000000; + sa.sa_handler_ = (void *) init_data.segv_handler; + sa.sa_restorer = NULL; + sa.sa_mask = 0L; /* No need to mask anything */ + res = stub_syscall4(__NR_rt_sigaction, SIGSEGV, (unsigned long)&sa, 0, + sizeof(sa.sa_mask)); + if (res != 0) + stub_syscall1(__NR_exit, 13); + + stub_syscall4(__NR_ptrace, PTRACE_TRACEME, 0, 0, 0); + + stub_syscall2(__NR_kill, stub_syscall0(__NR_getpid), SIGSTOP); + + stub_syscall1(__NR_exit, 14); + + __builtin_unreachable(); +} + +void _start(void) +{ + char *alloc; + + /* Make enough space for the stub (including space for alignment) */ + alloc = __builtin_alloca((1 + 2 * STUB_DATA_PAGES - 1) * UM_KERN_PAGE_SIZE); + asm volatile("" : "+r,m"(alloc) : : "memory"); + + real_init(); +} diff --git a/arch/um/kernel/skas/stub_exe_embed.S b/arch/um/kernel/skas/stub_exe_embed.S new file mode 100644 index 000000000000..6d8914fbe8f1 --- /dev/null +++ b/arch/um/kernel/skas/stub_exe_embed.S @@ -0,0 +1,11 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#include +#include + +__INITDATA + +SYM_DATA_START(stub_exe_start) + .incbin "arch/um/kernel/skas/stub_exe" +SYM_DATA_END_LABEL(stub_exe_start, SYM_L_GLOBAL, stub_exe_end) + +__FINIT diff --git a/arch/um/os-Linux/mem.c b/arch/um/os-Linux/mem.c index cf44d386f23c..857e3deab293 100644 --- a/arch/um/os-Linux/mem.c +++ b/arch/um/os-Linux/mem.c @@ -42,7 +42,7 @@ void kasan_map_memory(void *start, size_t len) } /* Set by make_tempfile() during early boot. */ -static char *tempdir = NULL; +char *tempdir = NULL; /* Check if dir is on tmpfs. Return 0 if yes, -1 if no or error. */ static int __init check_tmpfs(const char *dir) diff --git a/arch/um/os-Linux/skas/process.c b/arch/um/os-Linux/skas/process.c index f7088345b3fc..aea1dbe48a92 100644 --- a/arch/um/os-Linux/skas/process.c +++ b/arch/um/os-Linux/skas/process.c @@ -10,8 +10,11 @@ #include #include #include +#include +#include #include #include +#include #include #include #include @@ -189,69 +192,137 @@ static void handle_trap(int pid, struct uml_pt_regs *regs) extern char __syscall_stub_start[]; -/** - * userspace_tramp() - userspace trampoline - * @stack: pointer to the new userspace stack page - * - * The userspace trampoline is used to setup a new userspace process in start_userspace() after it was clone()'ed. - * This function will run on a temporary stack page. - * It ptrace()'es itself, then - * Two pages are mapped into the userspace address space: - * - STUB_CODE (with EXEC), which contains the skas stub code - * - STUB_DATA (with R/W), which contains a data page that is used to transfer certain data between the UML userspace process and the UML kernel. - * Also for the userspace process a SIGSEGV handler is installed to catch pagefaults in the userspace process. - * And last the process stops itself to give control to the UML kernel for this userspace process. - * - * Return: Always zero, otherwise the current userspace process is ended with non null exit() call - */ +static int stub_exe_fd; + static int userspace_tramp(void *stack) { - struct sigaction sa; - void *addr; - int fd; + char *const argv[] = { "uml-userspace", NULL }; + int pipe_fds[2]; unsigned long long offset; - unsigned long segv_handler = STUB_CODE + - (unsigned long) stub_segv_handler - - (unsigned long) __syscall_stub_start; - - ptrace(PTRACE_TRACEME, 0, 0, 0); - - signal(SIGTERM, SIG_DFL); - signal(SIGWINCH, SIG_IGN); - - fd = phys_mapping(uml_to_phys(__syscall_stub_start), &offset); - addr = mmap64((void *) STUB_CODE, UM_KERN_PAGE_SIZE, - PROT_EXEC, MAP_FIXED | MAP_PRIVATE, fd, offset); - if (addr == MAP_FAILED) { - os_info("mapping mmap stub at 0x%lx failed, errno = %d\n", - STUB_CODE, errno); - exit(1); + struct stub_init_data init_data = { + .stub_start = STUB_START, + .segv_handler = STUB_CODE + + (unsigned long) stub_segv_handler - + (unsigned long) __syscall_stub_start, + }; + struct iomem_region *iomem; + int ret; + + init_data.stub_code_fd = phys_mapping(uml_to_phys(__syscall_stub_start), + &offset); + init_data.stub_code_offset = MMAP_OFFSET(offset); + + init_data.stub_data_fd = phys_mapping(uml_to_phys(stack), &offset); + init_data.stub_data_offset = MMAP_OFFSET(offset); + + /* Set CLOEXEC on all FDs and then unset on all memory related FDs */ + close_range(0, ~0U, CLOSE_RANGE_CLOEXEC); + + fcntl(init_data.stub_data_fd, F_SETFD, 0); + for (iomem = iomem_regions; iomem; iomem = iomem->next) + fcntl(iomem->fd, F_SETFD, 0); + + /* Create a pipe for init_data (no CLOEXEC) and dup2 to STDIN */ + if (pipe2(pipe_fds, 0)) + exit(2); + + close(0); + if (dup2(pipe_fds[0], 0) < 0) { + close(pipe_fds[0]); + close(pipe_fds[1]); + exit(3); } + close(pipe_fds[0]); + + /* Write init_data and close write side */ + ret = write(pipe_fds[1], &init_data, sizeof(init_data)); + close(pipe_fds[1]); + + if (ret != sizeof(init_data)) + exit(4); + + execveat(stub_exe_fd, "", argv, NULL, AT_EMPTY_PATH); - fd = phys_mapping(uml_to_phys(stack), &offset); - addr = mmap((void *) STUB_DATA, - STUB_DATA_PAGES * UM_KERN_PAGE_SIZE, PROT_READ | PROT_WRITE, - MAP_FIXED | MAP_SHARED, fd, offset); - if (addr == MAP_FAILED) { - os_info("mapping segfault stack at 0x%lx failed, errno = %d\n", - STUB_DATA, errno); - exit(1); + exit(5); +} + +extern char stub_exe_start[]; +extern char stub_exe_end[]; + +extern char *tempdir; + +#define STUB_EXE_NAME_TEMPLATE "/uml-userspace-XXXXXX" + +#ifndef MFD_EXEC +#define MFD_EXEC 0x0010U +#endif + +static int __init init_stub_exe_fd(void) +{ + size_t len = 0; + char *tmpfile = NULL; + int res; + + stub_exe_fd = memfd_create("uml-userspace", + MFD_EXEC | MFD_CLOEXEC | MFD_ALLOW_SEALING); + + if (stub_exe_fd < 0) { + printk(UM_KERN_INFO "Could not create executable memfd, using temporary file!"); + + tmpfile = malloc(strlen(tempdir) + + strlen(STUB_EXE_NAME_TEMPLATE) + 1); + if (tmpfile == NULL) + panic("Failed to allocate memory for stub executable name"); + + strcpy(tmpfile, tempdir); + strcat(tmpfile, STUB_EXE_NAME_TEMPLATE); + + stub_exe_fd = mkstemp(tmpfile); + if (stub_exe_fd < 0) + panic("Could not create temporary file for stub binary: %d", + errno); } - set_sigstack((void *) STUB_DATA, STUB_DATA_PAGES * UM_KERN_PAGE_SIZE); - sigemptyset(&sa.sa_mask); - sa.sa_flags = SA_ONSTACK | SA_NODEFER | SA_SIGINFO; - sa.sa_sigaction = (void *) segv_handler; - sa.sa_restorer = NULL; - if (sigaction(SIGSEGV, &sa, NULL) < 0) { - os_info("%s - setting SIGSEGV handler failed - errno = %d\n", - __func__, errno); - exit(1); + while (len < stub_exe_end - stub_exe_start) { + res = write(stub_exe_fd, stub_exe_start + len, + stub_exe_end - stub_exe_start - len); + if (res < 0) { + if (errno == EINTR) + continue; + + if (tmpfile) + unlink(tmpfile); + panic("%s: Failed write to memfd: %d", __func__, errno); + } + + len += res; + } + + if (!tmpfile) { + fcntl(stub_exe_fd, F_ADD_SEALS, + F_SEAL_WRITE | F_SEAL_SHRINK | F_SEAL_GROW | F_SEAL_SEAL); + } else { + /* Only executable by us */ + if (fchmod(stub_exe_fd, 00500) < 0) { + unlink(tmpfile); + panic("Could not make stub binary excutable: %d", + errno); + } + + close(stub_exe_fd); + stub_exe_fd = open(tmpfile, O_RDONLY | O_CLOEXEC | O_NOFOLLOW); + if (stub_exe_fd < 0) { + unlink(tmpfile); + panic("Could not reopen stub binary: %d", errno); + } + + unlink(tmpfile); + free(tmpfile); } - kill(os_getpid(), SIGSTOP); return 0; } +__initcall(init_stub_exe_fd); int userspace_pid[NR_CPUS]; @@ -270,7 +341,7 @@ int start_userspace(unsigned long stub_stack) { void *stack; unsigned long sp; - int pid, status, n, flags, err; + int pid, status, n, err; /* setup a temporary stack page */ stack = mmap(NULL, UM_KERN_PAGE_SIZE, @@ -286,10 +357,10 @@ int start_userspace(unsigned long stub_stack) /* set stack pointer to the end of the stack page, so it can grow downwards */ sp = (unsigned long)stack + UM_KERN_PAGE_SIZE; - flags = CLONE_FILES | SIGCHLD; - /* clone into new userspace process */ - pid = clone(userspace_tramp, (void *) sp, flags, (void *) stub_stack); + pid = clone(userspace_tramp, (void *) sp, + CLONE_VFORK | CLONE_VM | SIGCHLD, + (void *)stub_stack); if (pid < 0) { err = -errno; printk(UM_KERN_ERR "%s : clone failed, errno = %d\n", From patchwork Thu Jul 4 16:27:13 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Benjamin Berg X-Patchwork-Id: 1956977 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; secure) header.d=lists.infradead.org header.i=@lists.infradead.org header.a=rsa-sha256 header.s=bombadil.20210309 header.b=dZstaRxe; dkim=fail reason="signature verification failed" (2048-bit key; secure) header.d=sipsolutions.net header.i=@sipsolutions.net header.a=rsa-sha256 header.s=mail header.b=k7PCn8Um; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=none (no SPF record) smtp.mailfrom=lists.infradead.org (client-ip=2607:7c80:54:3::133; helo=bombadil.infradead.org; envelope-from=linux-um-bounces+incoming=patchwork.ozlabs.org@lists.infradead.org; receiver=patchwork.ozlabs.org) Received: from bombadil.infradead.org (bombadil.infradead.org [IPv6:2607:7c80:54:3::133]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4WFMWY2b3xz1xqs for ; Fri, 5 Jul 2024 02:27:53 +1000 (AEST) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender:List-Subscribe:List-Help :List-Post:List-Archive:List-Unsubscribe:List-Id:Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-ID:Date:Subject:Cc:To:From: Reply-To:Content-Type:Content-ID:Content-Description:Resent-Date:Resent-From: Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Owner; bh=vjQoz/5CTFM/VYY0zOeSPbxeRzdNCA+DsDOOxwBCxy4=; b=dZstaRxehZ0wUpeTnhBQe9D60t B2n27c8o5/FCPSTefjID/hg0z78QRMRsojpQrVCh8L1ufdCyOkS9Abvi+FLWJe19iD8ji72AM4Dad WejaoBi0tRaL8/GCEKi0QkBimdzIZUiBcjUc+6eDo7mfLQ95N5JYR/FPp9pb+fDsG9pxDctujysRV zD9MHph55eD/dEImKlQTv7IGYBbOSpRZm09TAKb7T4RyawMwsBXHeYIezLeaYoKmKkJsj5o8Rn9Q/ cvKNjZPjCIpe5AurDpwM+PyznxmCgUF3tvJ+YiEKVkxmZfEv1GZ3OCMjGAooPqydCbtP1cHbwsm20 8KMR344Q==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1sPPJH-0000000DqGl-30N3; Thu, 04 Jul 2024 16:27:51 +0000 Received: from s3.sipsolutions.net ([2a01:4f8:242:246e::2] helo=sipsolutions.net) by bombadil.infradead.org with esmtps (Exim 4.97.1 #2 (Red Hat Linux)) id 1sPPJF-0000000DqEr-2Naf for linux-um@lists.infradead.org; Thu, 04 Jul 2024 16:27:50 +0000 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=sipsolutions.net; s=mail; h=Content-Transfer-Encoding:MIME-Version: References:In-Reply-To:Message-ID:Date:Subject:Cc:To:From:Content-Type:Sender :Reply-To:Content-ID:Content-Description:Resent-Date:Resent-From:Resent-To: Resent-Cc:Resent-Message-ID; bh=vjQoz/5CTFM/VYY0zOeSPbxeRzdNCA+DsDOOxwBCxy4=; t=1720110469; x=1721320069; b=k7PCn8UmWpg09NwZOxtlYwupwORf+JclG+K3iTaC2ghTxoY QporXxWtaDQI/mIwIisfq8nnZzHqiFaWXvHjOU+SECJZhryyAnZOCjTvIrTHzAOfgvuQ+5E32aJtc V+OfSZdNLmAlF14K+to/6z4BVjTc67B0wiVuam+gJnM5xZ37dNdAXfAORNotS1nORlQi7A1f4ORWo 3+lXVdEK9UR3VYd3bouL0sZ47NPRsvT0xQyvJTcbLZwRkMFgOP+1gT5/kMIsYZUPoH77la2tLNfxp AashoXUibjfqu8tuafUm/nIUmSpdTrtH5FG2sCHnWGknT/ii4N2qaSCt6b1rIDpg==; Received: by sipsolutions.net with esmtpsa (TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim 4.97) (envelope-from ) id 1sPPJC-0000000DOu8-2ksy; Thu, 04 Jul 2024 18:27:47 +0200 From: Benjamin Berg To: linux-um@lists.infradead.org Cc: Benjamin Berg Subject: [PATCH v7 3/7] um: Fix stub_start address calculation Date: Thu, 4 Jul 2024 18:27:13 +0200 Message-ID: <20240704162717.1417338-4-benjamin@sipsolutions.net> X-Mailer: git-send-email 2.45.2 In-Reply-To: <20240704162717.1417338-1-benjamin@sipsolutions.net> References: <20240704162717.1417338-1-benjamin@sipsolutions.net> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20240704_092749_635723_BE6D76E0 X-CRM114-Status: UNSURE ( 9.43 ) X-CRM114-Notice: Please train this message. X-Spam-Score: -0.2 (/) X-Spam-Report: Spam detection software, running on the system "bombadil.infradead.org", has NOT identified this incoming email as spam. The original message has been attached to this so you can view it or label similar future email. If you have any questions, see the administrator of that system for details. Content preview: From: Benjamin Berg The calculation was wrong as it only subtracted one and then rounded down for alignment. However, this is incorrect if host_task_size is not already aligned. Content analysis details: (-0.2 points, 5.0 required) pts rule name description ---- ---------------------- -------------------------------------------------- -0.0 SPF_HELO_PASS SPF: HELO matches SPF record -0.0 SPF_PASS SPF: sender matches SPF record -0.1 DKIM_VALID_AU Message has a valid DKIM or DK signature from author's domain -0.1 DKIM_VALID_EF Message has a valid DKIM or DK signature from envelope-from domain -0.1 DKIM_VALID Message has at least one valid DKIM or DK signature 0.1 DKIM_SIGNED Message has a DKIM or DK signature, not necessarily valid X-BeenThere: linux-um@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-um" Errors-To: linux-um-bounces+incoming=patchwork.ozlabs.org@lists.infradead.org From: Benjamin Berg The calculation was wrong as it only subtracted one and then rounded down for alignment. However, this is incorrect if host_task_size is not already aligned. This probably worked fine because on 64 bit the host_task_size is bigger than returned by os_get_top_address. Signed-off-by: Benjamin Berg --- arch/um/kernel/um_arch.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/arch/um/kernel/um_arch.c b/arch/um/kernel/um_arch.c index 8e594cda6d77..25cd2c6d7e95 100644 --- a/arch/um/kernel/um_arch.c +++ b/arch/um/kernel/um_arch.c @@ -328,7 +328,8 @@ int __init linux_main(int argc, char **argv) /* reserve a few pages for the stubs (taking care of data alignment) */ /* align the data portion */ BUILD_BUG_ON(!is_power_of_2(STUB_DATA_PAGES)); - stub_start = (host_task_size - 1) & ~(STUB_DATA_PAGES * PAGE_SIZE - 1); + stub_start = (host_task_size - STUB_DATA_PAGES * PAGE_SIZE) & + ~(STUB_DATA_PAGES * PAGE_SIZE - 1); /* another page for the code portion */ stub_start -= PAGE_SIZE; host_task_size = stub_start; From patchwork Thu Jul 4 16:27:14 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Benjamin Berg X-Patchwork-Id: 1956978 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; secure) header.d=lists.infradead.org header.i=@lists.infradead.org header.a=rsa-sha256 header.s=bombadil.20210309 header.b=CoxdXdFk; dkim=fail reason="signature verification failed" (2048-bit key; secure) header.d=sipsolutions.net header.i=@sipsolutions.net header.a=rsa-sha256 header.s=mail header.b=c4W8+QZm; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=none (no SPF record) smtp.mailfrom=lists.infradead.org (client-ip=2607:7c80:54:3::133; helo=bombadil.infradead.org; envelope-from=linux-um-bounces+incoming=patchwork.ozlabs.org@lists.infradead.org; receiver=patchwork.ozlabs.org) Received: from bombadil.infradead.org (bombadil.infradead.org [IPv6:2607:7c80:54:3::133]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4WFMWc6dTMz1xpP for ; Fri, 5 Jul 2024 02:27:56 +1000 (AEST) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender:List-Subscribe:List-Help :List-Post:List-Archive:List-Unsubscribe:List-Id:Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-ID:Date:Subject:Cc:To:From: Reply-To:Content-Type:Content-ID:Content-Description:Resent-Date:Resent-From: Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Owner; bh=NYxm6HJmMWAKJ6fAshLmO7RJHKejdCZeUmDhjTK4Zwg=; b=CoxdXdFkWEJFR4Y9l1IrTA650K LRSQojvbNYOQWD1WjFgygOhMDHxFbS/4PkXgwR3CxqhOrYCsCpkMaTOBn7pn/6RT3fpOz3oS/bm/N pqiQqrVOgO3PzVVlYgctS5bglHwpnlDRZ4wUkX9+2nOZ+EKZtKDcyiocfq+O1gcEVJDZHBOB/GkWg P2qjp8IaiiQteT92/tS8sS2M0g5k9oXDNzATx7kqqEDqzObKA0x2XYy+eJWOEPtlJUjgNlx1XLqY5 eV6LsMsDfKgckU9q1IecoLUtm/KhxR6uaEv1q06LgC6u/04+0R07jd2KE2qvWrG7e3qaY/diprv7H D095iC3g==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1sPPJL-0000000DqIe-0jOe; Thu, 04 Jul 2024 16:27:55 +0000 Received: from s3.sipsolutions.net ([2a01:4f8:242:246e::2] helo=sipsolutions.net) by bombadil.infradead.org with esmtps (Exim 4.97.1 #2 (Red Hat Linux)) id 1sPPJI-0000000DqH2-2y4g for linux-um@lists.infradead.org; Thu, 04 Jul 2024 16:27:53 +0000 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=sipsolutions.net; s=mail; h=Content-Transfer-Encoding:MIME-Version: References:In-Reply-To:Message-ID:Date:Subject:Cc:To:From:Content-Type:Sender :Reply-To:Content-ID:Content-Description:Resent-Date:Resent-From:Resent-To: Resent-Cc:Resent-Message-ID; bh=NYxm6HJmMWAKJ6fAshLmO7RJHKejdCZeUmDhjTK4Zwg=; t=1720110472; x=1721320072; b=c4W8+QZmTdLUaAnH87fmLOkVMM6G4yZTg8aoMljzp8+a+Ja I9qqZiCzYz1Hl9/GU9eZCBn5T9uc1gooraX5Ymur1ZPfJmiaxAsYZzEN7eebNAN93Z2XrTbeUvyx9 nmhY9AQsqv9jKJSbzEBWTCeYRAnFFfUsiD1XUZDnF95Iv/lWH1evWex2151KF6B/WS0+9cMCL8eZC Jmua5b2tKwfkC6P8FMSaIA2B4FgTcjvVsOorNRWmRDb+egmBzFjulVsl9ar7UILkK3KX2sCXADhd4 faB0cx1uVDxXOAVta7Sd+uXX+jSV57xu0La/XxrImYcCY1cdjNUJzrgbE7mf7pKQ==; Received: by sipsolutions.net with esmtpsa (TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim 4.97) (envelope-from ) id 1sPPJE-0000000DOu8-2Yfu; Thu, 04 Jul 2024 18:27:49 +0200 From: Benjamin Berg To: linux-um@lists.infradead.org Cc: Benjamin Berg Subject: [PATCH v7 4/7] um: Limit TASK_SIZE to the addressable range Date: Thu, 4 Jul 2024 18:27:14 +0200 Message-ID: <20240704162717.1417338-5-benjamin@sipsolutions.net> X-Mailer: git-send-email 2.45.2 In-Reply-To: <20240704162717.1417338-1-benjamin@sipsolutions.net> References: <20240704162717.1417338-1-benjamin@sipsolutions.net> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20240704_092752_771170_C93FD218 X-CRM114-Status: GOOD ( 10.09 ) X-Spam-Score: -0.2 (/) X-Spam-Report: Spam detection software, running on the system "bombadil.infradead.org", has NOT identified this incoming email as spam. The original message has been attached to this so you can view it or label similar future email. If you have any questions, see the administrator of that system for details. Content preview: From: Benjamin Berg We may have a TASK_SIZE from the host that is bigger than UML is able to address with a three-level pagetable. Guard against that by clipping the maximum TASK_SIZE to the maximum addressable area. Content analysis details: (-0.2 points, 5.0 required) pts rule name description ---- ---------------------- -------------------------------------------------- -0.0 SPF_HELO_PASS SPF: HELO matches SPF record -0.0 SPF_PASS SPF: sender matches SPF record -0.1 DKIM_VALID_AU Message has a valid DKIM or DK signature from author's domain -0.1 DKIM_VALID_EF Message has a valid DKIM or DK signature from envelope-from domain -0.1 DKIM_VALID Message has at least one valid DKIM or DK signature 0.1 DKIM_SIGNED Message has a DKIM or DK signature, not necessarily valid X-BeenThere: linux-um@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-um" Errors-To: linux-um-bounces+incoming=patchwork.ozlabs.org@lists.infradead.org From: Benjamin Berg We may have a TASK_SIZE from the host that is bigger than UML is able to address with a three-level pagetable. Guard against that by clipping the maximum TASK_SIZE to the maximum addressable area. Signed-off-by: Benjamin Berg --- v7: Fix integer overflow on 32 bit with 3-level page tables --- arch/um/kernel/um_arch.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/arch/um/kernel/um_arch.c b/arch/um/kernel/um_arch.c index 25cd2c6d7e95..f82dd4e854f3 100644 --- a/arch/um/kernel/um_arch.c +++ b/arch/um/kernel/um_arch.c @@ -334,11 +334,16 @@ int __init linux_main(int argc, char **argv) stub_start -= PAGE_SIZE; host_task_size = stub_start; + /* Limit TASK_SIZE to what is addressable by the page table */ + task_size = host_task_size; + if (task_size > (unsigned long long) PTRS_PER_PGD * PGDIR_SIZE) + task_size = PTRS_PER_PGD * PGDIR_SIZE; + /* * TASK_SIZE needs to be PGDIR_SIZE aligned or else exit_mmap craps * out */ - task_size = host_task_size & PGDIR_MASK; + task_size = task_size & PGDIR_MASK; /* OS sanity checks that need to happen before the kernel runs */ os_early_checks(); From patchwork Thu Jul 4 16:27:15 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Benjamin Berg X-Patchwork-Id: 1956979 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; secure) header.d=lists.infradead.org header.i=@lists.infradead.org header.a=rsa-sha256 header.s=bombadil.20210309 header.b=mduaXVKt; dkim=fail reason="signature verification failed" (2048-bit key; secure) header.d=sipsolutions.net header.i=@sipsolutions.net header.a=rsa-sha256 header.s=mail header.b=duHM46dV; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=none (no SPF record) smtp.mailfrom=lists.infradead.org (client-ip=2607:7c80:54:3::133; helo=bombadil.infradead.org; envelope-from=linux-um-bounces+incoming=patchwork.ozlabs.org@lists.infradead.org; receiver=patchwork.ozlabs.org) Received: from bombadil.infradead.org (bombadil.infradead.org [IPv6:2607:7c80:54:3::133]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4WFMWg1x7xz1xpP for ; Fri, 5 Jul 2024 02:27:59 +1000 (AEST) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender:List-Subscribe:List-Help :List-Post:List-Archive:List-Unsubscribe:List-Id:Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-ID:Date:Subject:Cc:To:From: Reply-To:Content-Type:Content-ID:Content-Description:Resent-Date:Resent-From: Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Owner; bh=ckQWU8mOQ4AXcUvipA7PGWmjkmRvzOUC6H232d/Gy8g=; b=mduaXVKt0QPFq0m0aFrQorSoVx WfRyw04s/srp2zGWUUJ022ryjzLpSr9Xg86rD0aO4r0+g30o1P8gYLBb7qhCoshDAzeh5Cvk02RfQ /v3Ye9lGmCzFguUZOyK5aP0QkDxVAVqQxvGyPm/ROZb5/+Gz23IL6SSYSRNrJZ12q834U7MNP5Psn nkt+oN2wTMoAY77hSrg0WAeuCF0GItWO5wQpEOMjVFYkykXJIiOMCa9BBu4ePw5/iUJiUUKjlfAsS RNARlALMcrXJ46rj+sxdoIQAECGCbtvBU4CsO8ymD5Vg0CF3FJ/Swx76zI7Bk+UlO9w45zjjTwSY6 9pULAA2Q==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1sPPJN-0000000DqJe-2ZIB; Thu, 04 Jul 2024 16:27:57 +0000 Received: from s3.sipsolutions.net ([2a01:4f8:242:246e::2] helo=sipsolutions.net) by bombadil.infradead.org with esmtps (Exim 4.97.1 #2 (Red Hat Linux)) id 1sPPJK-0000000DqIA-33gH for linux-um@lists.infradead.org; Thu, 04 Jul 2024 16:27:56 +0000 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=sipsolutions.net; s=mail; h=Content-Transfer-Encoding:MIME-Version: References:In-Reply-To:Message-ID:Date:Subject:Cc:To:From:Content-Type:Sender :Reply-To:Content-ID:Content-Description:Resent-Date:Resent-From:Resent-To: Resent-Cc:Resent-Message-ID; bh=ckQWU8mOQ4AXcUvipA7PGWmjkmRvzOUC6H232d/Gy8g=; t=1720110474; x=1721320074; b=duHM46dVuGBqMzoBhKQqbcbnm7k4fQ48WdFYGPZbjPhV4Oa A/dCb+dM9isp/Zz9UOaWqbK2Re2dm2rWUmM2hr65E8R5A68iHfYmyRyTQ8LEEDz5zU1Kh4o+bBz1G D60NuMpfvnETG+f+otwvEcBmq9zDz7HA7/SXAHepTfFBfn/mFx0xkkiAApcXMu6dlw/sHbQKYbiP3 BtXbIlAXnR2B+WQN0juXrr7FnYEC+DteoSADSdz8icwEUm02LsYJupG7cmTND4ZPFcGz27BM49Vvo J/CLKsvvDHig5s3zQDr+h4kJYHaDRjX5W4N4vXHSrPw3DDkFDKgIoma954BKDc6w==; Received: by sipsolutions.net with esmtpsa (TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim 4.97) (envelope-from ) id 1sPPJH-0000000DOu8-1cQp; Thu, 04 Jul 2024 18:27:51 +0200 From: Benjamin Berg To: linux-um@lists.infradead.org Cc: Benjamin Berg Subject: [PATCH v7 5/7] um: Discover host_task_size from envp Date: Thu, 4 Jul 2024 18:27:15 +0200 Message-ID: <20240704162717.1417338-6-benjamin@sipsolutions.net> X-Mailer: git-send-email 2.45.2 In-Reply-To: <20240704162717.1417338-1-benjamin@sipsolutions.net> References: <20240704162717.1417338-1-benjamin@sipsolutions.net> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20240704_092754_958148_B57FE80F X-CRM114-Status: GOOD ( 31.13 ) X-Spam-Score: -0.2 (/) X-Spam-Report: Spam detection software, running on the system "bombadil.infradead.org", has NOT identified this incoming email as spam. The original message has been attached to this so you can view it or label similar future email. If you have any questions, see the administrator of that system for details. Content preview: From: Benjamin Berg When loading the UML binary, the host kernel will place the stack at the highest possible address. It will then map the program name and environment variables onto the start of the stack. Content analysis details: (-0.2 points, 5.0 required) pts rule name description ---- ---------------------- -------------------------------------------------- -0.0 SPF_HELO_PASS SPF: HELO matches SPF record -0.0 SPF_PASS SPF: sender matches SPF record -0.1 DKIM_VALID_AU Message has a valid DKIM or DK signature from author's domain -0.1 DKIM_VALID_EF Message has a valid DKIM or DK signature from envelope-from domain -0.1 DKIM_VALID Message has at least one valid DKIM or DK signature 0.1 DKIM_SIGNED Message has a DKIM or DK signature, not necessarily valid X-BeenThere: linux-um@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-um" Errors-To: linux-um-bounces+incoming=patchwork.ozlabs.org@lists.infradead.org From: Benjamin Berg When loading the UML binary, the host kernel will place the stack at the highest possible address. It will then map the program name and environment variables onto the start of the stack. As such, an easy way to figure out the host_task_size is to use the highest pointer to an environment variable as a reference. Ensure that this works by disabling address layout randomization and re-executing UML in case it was enabled. This increases the available TASK_SIZE for 64 bit UML considerably. Signed-off-by: Benjamin Berg --- v7: Also use the same logic on 32bit --- arch/um/include/shared/as-layout.h | 2 +- arch/um/include/shared/os.h | 2 +- arch/um/kernel/um_arch.c | 4 +- arch/um/os-Linux/main.c | 9 +- arch/x86/um/os-Linux/task_size.c | 152 ++--------------------------- 5 files changed, 22 insertions(+), 147 deletions(-) diff --git a/arch/um/include/shared/as-layout.h b/arch/um/include/shared/as-layout.h index 06292fca5a4d..b69cb8dcfeed 100644 --- a/arch/um/include/shared/as-layout.h +++ b/arch/um/include/shared/as-layout.h @@ -48,7 +48,7 @@ extern unsigned long brk_start; extern unsigned long host_task_size; extern unsigned long stub_start; -extern int linux_main(int argc, char **argv); +extern int linux_main(int argc, char **argv, char **envp); extern void uml_finishsetup(void); struct siginfo; diff --git a/arch/um/include/shared/os.h b/arch/um/include/shared/os.h index 9a039d6f1f74..10c83fcde7b0 100644 --- a/arch/um/include/shared/os.h +++ b/arch/um/include/shared/os.h @@ -330,7 +330,7 @@ extern int __ignore_sigio_fd(int fd); extern int get_pty(void); /* sys-$ARCH/task_size.c */ -extern unsigned long os_get_top_address(void); +extern unsigned long os_get_top_address(char **envp); long syscall(long number, ...); diff --git a/arch/um/kernel/um_arch.c b/arch/um/kernel/um_arch.c index f82dd4e854f3..8ac2f9e39b3b 100644 --- a/arch/um/kernel/um_arch.c +++ b/arch/um/kernel/um_arch.c @@ -302,7 +302,7 @@ static void parse_cache_line(char *line) } } -int __init linux_main(int argc, char **argv) +int __init linux_main(int argc, char **argv, char **envp) { unsigned long avail, diff; unsigned long virtmem_size, max_physmem; @@ -324,7 +324,7 @@ int __init linux_main(int argc, char **argv) if (have_console == 0) add_arg(DEFAULT_COMMAND_LINE_CONSOLE); - host_task_size = os_get_top_address(); + host_task_size = os_get_top_address(envp); /* reserve a few pages for the stubs (taking care of data alignment) */ /* align the data portion */ BUILD_BUG_ON(!is_power_of_2(STUB_DATA_PAGES)); diff --git a/arch/um/os-Linux/main.c b/arch/um/os-Linux/main.c index f98ff79cdbf7..9a61b1767795 100644 --- a/arch/um/os-Linux/main.c +++ b/arch/um/os-Linux/main.c @@ -11,6 +11,7 @@ #include #include #include +#include #include #include #include @@ -108,6 +109,12 @@ int __init main(int argc, char **argv, char **envp) char **new_argv; int ret, i, err; + /* Disable randomization and re-exec if it was changed successfully */ + ret = personality(PER_LINUX | ADDR_NO_RANDOMIZE); + if (ret >= 0 && (ret & (PER_LINUX | ADDR_NO_RANDOMIZE)) != + (PER_LINUX | ADDR_NO_RANDOMIZE)) + execve("/proc/self/exe", argv, envp); + set_stklim(); setup_env_path(); @@ -140,7 +147,7 @@ int __init main(int argc, char **argv, char **envp) #endif change_sig(SIGPIPE, 0); - ret = linux_main(argc, argv); + ret = linux_main(argc, argv, envp); /* * Disable SIGPROF - I have no idea why libc doesn't do this or turn diff --git a/arch/x86/um/os-Linux/task_size.c b/arch/x86/um/os-Linux/task_size.c index 1dc9adc20b1c..a91599799b1a 100644 --- a/arch/x86/um/os-Linux/task_size.c +++ b/arch/x86/um/os-Linux/task_size.c @@ -1,151 +1,19 @@ // SPDX-License-Identifier: GPL-2.0 -#include -#include -#include -#include -#include -#ifdef __i386__ - -static jmp_buf buf; - -static void segfault(int sig) -{ - longjmp(buf, 1); -} - -static int page_ok(unsigned long page) -{ - unsigned long *address = (unsigned long *) (page << UM_KERN_PAGE_SHIFT); - unsigned long n = ~0UL; - void *mapped = NULL; - int ok = 0; - - /* - * First see if the page is readable. If it is, it may still - * be a VDSO, so we go on to see if it's writable. If not - * then try mapping memory there. If that fails, then we're - * still in the kernel area. As a sanity check, we'll fail if - * the mmap succeeds, but gives us an address different from - * what we wanted. - */ - if (setjmp(buf) == 0) - n = *address; - else { - mapped = mmap(address, UM_KERN_PAGE_SIZE, - PROT_READ | PROT_WRITE, - MAP_FIXED | MAP_PRIVATE | MAP_ANONYMOUS, -1, 0); - if (mapped == MAP_FAILED) - return 0; - if (mapped != address) - goto out; - } - - /* - * Now, is it writeable? If so, then we're in user address - * space. If not, then try mprotecting it and try the write - * again. - */ - if (setjmp(buf) == 0) { - *address = n; - ok = 1; - goto out; - } else if (mprotect(address, UM_KERN_PAGE_SIZE, - PROT_READ | PROT_WRITE) != 0) - goto out; - - if (setjmp(buf) == 0) { - *address = n; - ok = 1; - } - - out: - if (mapped != NULL) - munmap(mapped, UM_KERN_PAGE_SIZE); - return ok; -} - -unsigned long os_get_top_address(void) +unsigned long os_get_top_address(char **envp) { - struct sigaction sa, old; - unsigned long bottom = 0; - /* - * A 32-bit UML on a 64-bit host gets confused about the VDSO at - * 0xffffe000. It is mapped, is readable, can be reprotected writeable - * and written. However, exec discovers later that it can't be - * unmapped. So, just set the highest address to be checked to just - * below it. This might waste some address space on 4G/4G 32-bit - * hosts, but shouldn't hurt otherwise. - */ - unsigned long top = 0xffffd000 >> UM_KERN_PAGE_SHIFT; - unsigned long test, original; + unsigned long top_addr = (unsigned long) &top_addr; + int i; - printf("Locating the bottom of the address space ... "); - fflush(stdout); - - /* - * We're going to be longjmping out of the signal handler, so - * SA_DEFER needs to be set. - */ - sa.sa_handler = segfault; - sigemptyset(&sa.sa_mask); - sa.sa_flags = SA_NODEFER; - if (sigaction(SIGSEGV, &sa, &old)) { - perror("os_get_top_address"); - exit(1); - } - - /* Manually scan the address space, bottom-up, until we find - * the first valid page (or run out of them). - */ - for (bottom = 0; bottom < top; bottom++) { - if (page_ok(bottom)) - break; - } - - /* If we've got this far, we ran out of pages. */ - if (bottom == top) { - fprintf(stderr, "Unable to determine bottom of address " - "space.\n"); - exit(1); - } - - printf("0x%lx\n", bottom << UM_KERN_PAGE_SHIFT); - printf("Locating the top of the address space ... "); - fflush(stdout); - - original = bottom; - - /* This could happen with a 4G/4G split */ - if (page_ok(top)) - goto out; - - do { - test = bottom + (top - bottom) / 2; - if (page_ok(test)) - bottom = test; - else - top = test; - } while (top - bottom > 1); - -out: - /* Restore the old SIGSEGV handling */ - if (sigaction(SIGSEGV, &old, NULL)) { - perror("os_get_top_address"); - exit(1); + /* The earliest variable should be after the program name in ELF */ + for (i = 0; envp[i]; i++) { + if ((unsigned long) envp[i] > top_addr) + top_addr = (unsigned long) envp[i]; } - top <<= UM_KERN_PAGE_SHIFT; - printf("0x%lx\n", top); - return top; -} - -#else + top_addr &= ~(UM_KERN_PAGE_SIZE - 1); + top_addr += UM_KERN_PAGE_SIZE; -unsigned long os_get_top_address(void) -{ - /* The old value of CONFIG_TOP_ADDR */ - return 0x7fc0002000; + return top_addr; } -#endif From patchwork Thu Jul 4 16:27:16 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Benjamin Berg X-Patchwork-Id: 1956980 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; secure) header.d=lists.infradead.org header.i=@lists.infradead.org header.a=rsa-sha256 header.s=bombadil.20210309 header.b=DlW/O9Wx; dkim=fail reason="signature verification failed" (2048-bit key; secure) header.d=sipsolutions.net header.i=@sipsolutions.net header.a=rsa-sha256 header.s=mail header.b=Aqvm92VA; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=none (no SPF record) smtp.mailfrom=lists.infradead.org (client-ip=2607:7c80:54:3::133; helo=bombadil.infradead.org; envelope-from=linux-um-bounces+incoming=patchwork.ozlabs.org@lists.infradead.org; receiver=patchwork.ozlabs.org) Received: from bombadil.infradead.org (bombadil.infradead.org [IPv6:2607:7c80:54:3::133]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4WFMWj4Ydtz1xpP for ; Fri, 5 Jul 2024 02:28:01 +1000 (AEST) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender:List-Subscribe:List-Help :List-Post:List-Archive:List-Unsubscribe:List-Id:Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-ID:Date:Subject:Cc:To:From: Reply-To:Content-Type:Content-ID:Content-Description:Resent-Date:Resent-From: Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Owner; bh=INihewrT92CzIbsoaGqN3U4W5g6odN915DCeWJorosU=; b=DlW/O9WxmVT2qcMlX5e2zYT/5/ G4VqsC6Y4B4PytdEripKyCqqtRP/Jow8qWYHq8UWA+NuWhkzOhq5L7Ymkdz2cZGgMOIKFcJBNuynO yWWXdNE941qrOUG2nddUIwbo44Lp2TAP8SnhB668+KTAR1qI8REVJSo7Q2NgPCvmohvYz89TZsXn2 AAxVbVNNdLyTSrzYNahtV9MMGQs7VqJNe83/afHsdgg1PnD9peYvxwo6Dlzt3cEplwkFuc7YhcvMn aP15FBmUC3yZskJgCeIkE0b5mS/zaOEt4uE8xUIQ0qvK9LCT92+luUcxiB5jK3JjEzFIIXQ8I4jdj +wCBRKoQ==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1sPPJQ-0000000DqKv-02xd; Thu, 04 Jul 2024 16:28:00 +0000 Received: from s3.sipsolutions.net ([2a01:4f8:242:246e::2] helo=sipsolutions.net) by bombadil.infradead.org with esmtps (Exim 4.97.1 #2 (Red Hat Linux)) id 1sPPJN-0000000DqJQ-2R7I for linux-um@lists.infradead.org; Thu, 04 Jul 2024 16:27:58 +0000 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=sipsolutions.net; s=mail; h=Content-Transfer-Encoding:MIME-Version: References:In-Reply-To:Message-ID:Date:Subject:Cc:To:From:Content-Type:Sender :Reply-To:Content-ID:Content-Description:Resent-Date:Resent-From:Resent-To: Resent-Cc:Resent-Message-ID; bh=INihewrT92CzIbsoaGqN3U4W5g6odN915DCeWJorosU=; t=1720110477; x=1721320077; b=Aqvm92VAvYjCuG9/dB1Qzi6oi+2Yvl/VcmOYXHKFvXrrvqY gt61lYEFhCn/ZWxpR3QEA28uC7UGlHurlvPqsb4YjOosqmHRbstQBPTyLdIKhP6MpVbOJLZFuLQll S64nSz6sPqPhcivs0QsGpLARZqS+ylSIe3iUc+LeMoDo3FSq3x7qZxzeKZhi4iGH1c9sV1nvnavcK 0Mo2m+Yqr9l46pqtE2VmjG86raP1ZBut+WuvRgYONDyVcUxjrQ2DAhVS5IyH5Ay2jTnhbDS9U80FH fYa7cRybx8A2IW09O/zK20VImbpDj/aFRvYfP3JBaWdJ7GVLR57QODBF83RmMYrQ==; Received: by sipsolutions.net with esmtpsa (TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim 4.97) (envelope-from ) id 1sPPJK-0000000DOu8-01X6; Thu, 04 Jul 2024 18:27:55 +0200 From: Benjamin Berg To: linux-um@lists.infradead.org Cc: Benjamin Berg Subject: [PATCH v7 6/7] um: clear all memory in new userspace processes Date: Thu, 4 Jul 2024 18:27:16 +0200 Message-ID: <20240704162717.1417338-7-benjamin@sipsolutions.net> X-Mailer: git-send-email 2.45.2 In-Reply-To: <20240704162717.1417338-1-benjamin@sipsolutions.net> References: <20240704162717.1417338-1-benjamin@sipsolutions.net> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20240704_092757_654281_DF27438A X-CRM114-Status: GOOD ( 15.92 ) X-Spam-Score: -0.2 (/) X-Spam-Report: Spam detection software, running on the system "bombadil.infradead.org", has NOT identified this incoming email as spam. The original message has been attached to this so you can view it or label similar future email. If you have any questions, see the administrator of that system for details. Content preview: From: Benjamin Berg With the change to use execve() we can now safely clear the memory up to STUB_START as rseq will not be trying to use memory in that region. Also, on 64 bit the previous changes should mean that there [...] Content analysis details: (-0.2 points, 5.0 required) pts rule name description ---- ---------------------- -------------------------------------------------- -0.0 SPF_HELO_PASS SPF: HELO matches SPF record -0.0 SPF_PASS SPF: sender matches SPF record -0.1 DKIM_VALID_AU Message has a valid DKIM or DK signature from author's domain -0.1 DKIM_VALID_EF Message has a valid DKIM or DK signature from envelope-from domain -0.1 DKIM_VALID Message has at least one valid DKIM or DK signature 0.1 DKIM_SIGNED Message has a DKIM or DK signature, not necessarily valid X-BeenThere: linux-um@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-um" Errors-To: linux-um-bounces+incoming=patchwork.ozlabs.org@lists.infradead.org From: Benjamin Berg With the change to use execve() we can now safely clear the memory up to STUB_START as rseq will not be trying to use memory in that region. Also, on 64 bit the previous changes should mean that there is no usable memory range above the stub. Make the change and remove the comment as it is not needed anymore. --- arch/um/kernel/skas/mmu.c | 25 ++----------------------- 1 file changed, 2 insertions(+), 23 deletions(-) diff --git a/arch/um/kernel/skas/mmu.c b/arch/um/kernel/skas/mmu.c index 47f98d87ea3c..bf64702d9e04 100644 --- a/arch/um/kernel/skas/mmu.c +++ b/arch/um/kernel/skas/mmu.c @@ -40,29 +40,8 @@ int init_new_context(struct task_struct *task, struct mm_struct *mm) goto out_free; } - /* - * Ensure the new MM is clean and nothing unwanted is mapped. - * - * TODO: We should clear the memory up to STUB_START to ensure there is - * nothing mapped there, i.e. we (currently) have: - * - * |- user memory -|- unused -|- stub -|- unused -| - * ^ TASK_SIZE ^ STUB_START - * - * Meaning we have two unused areas where we may still have valid - * mappings from our internal clone(). That isn't really a problem as - * userspace is not going to access them, but it is definitely not - * correct. - * - * However, we are "lucky" and if rseq is configured, then on 32 bit - * it will fall into the first empty range while on 64 bit it is going - * to use an anonymous mapping in the second range. As such, things - * continue to work for now as long as we don't start unmapping these - * areas. - * - * Change this to STUB_START once we have a clean userspace. - */ - unmap(new_id, 0, TASK_SIZE); + /* Ensure the new MM is clean and nothing unwanted is mapped */ + unmap(new_id, 0, STUB_START); return 0; From patchwork Thu Jul 4 16:27:17 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Benjamin Berg X-Patchwork-Id: 1956981 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; secure) header.d=lists.infradead.org header.i=@lists.infradead.org header.a=rsa-sha256 header.s=bombadil.20210309 header.b=HMzPJVn2; dkim=fail reason="signature verification failed" (2048-bit key; secure) header.d=sipsolutions.net header.i=@sipsolutions.net header.a=rsa-sha256 header.s=mail header.b=a+7KJW6s; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=none (no SPF record) smtp.mailfrom=lists.infradead.org (client-ip=2607:7c80:54:3::133; helo=bombadil.infradead.org; envelope-from=linux-um-bounces+incoming=patchwork.ozlabs.org@lists.infradead.org; receiver=patchwork.ozlabs.org) Received: from bombadil.infradead.org (bombadil.infradead.org [IPv6:2607:7c80:54:3::133]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4WFMWp0h5pz1xpP for ; Fri, 5 Jul 2024 02:28:06 +1000 (AEST) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender:List-Subscribe:List-Help :List-Post:List-Archive:List-Unsubscribe:List-Id:Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-ID:Date:Subject:Cc:To:From: Reply-To:Content-Type:Content-ID:Content-Description:Resent-Date:Resent-From: Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Owner; bh=AYOKRaixBXyWK/dGAWwZ6jALUihDgJCQNmaVd3Tm8eE=; b=HMzPJVn2N8sOKaDjD1zWlT3c6W lstICr+xFy6JywLSTXMfQ5ANl+DJA34fYL3k25DFoZhb9c8cY5rYDZLi6U/nTdn+ctAionvmG3vtk VFfnPFdV0ToSRaZfaCo/N3gTP5F6vwd0f+bPXsQRbcB6/RED1O6HFNQyBlbuP7zEGKZmFOsFwt+17 jC4VIf6+idFfFJsDLAQWREFdwEPqcontJelJa5fgdxKGCsNFaVFMet0UZY28z6FQ8jhLPWTr6O3Ci mAcoEdrHqmy3Zqgrtz7jAwt3VQ4XhiD05/x/UQtVDt6IpcUC1AMW3sUTe4tHxpFrzozpePNSNDwwp uSeLF7ug==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.97.1 #2 (Red Hat Linux)) id 1sPPJU-0000000DqMA-1v1N; Thu, 04 Jul 2024 16:28:04 +0000 Received: from s3.sipsolutions.net ([2a01:4f8:242:246e::2] helo=sipsolutions.net) by bombadil.infradead.org with esmtps (Exim 4.97.1 #2 (Red Hat Linux)) id 1sPPJR-0000000DqLQ-2vvX for linux-um@lists.infradead.org; Thu, 04 Jul 2024 16:28:03 +0000 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=sipsolutions.net; s=mail; h=Content-Transfer-Encoding:MIME-Version: References:In-Reply-To:Message-ID:Date:Subject:Cc:To:From:Content-Type:Sender :Reply-To:Content-ID:Content-Description:Resent-Date:Resent-From:Resent-To: Resent-Cc:Resent-Message-ID; bh=AYOKRaixBXyWK/dGAWwZ6jALUihDgJCQNmaVd3Tm8eE=; t=1720110481; x=1721320081; b=a+7KJW6seXBnkJ3lsNqgPkCSJAyYx8viKML6F74DQoZ8eUi 5SUnf9aVYwuea8YNcZiaGX1e7deDkq1kTDOyY8j/qZ0EtqnzLq+C+cGkBNV27LDYjEIbHeKdd472B l9kSooTWGevBzxtXPguYZhO/v7RHLMLaSEj3xwoLPTIj23UgEG7XOdxQJevORd64bJMImdo+fDE5l 2K76mSdVAQtTBCzGXMVRKr67+BVj+pBN71EJm6T8KrML1vqe1eRaWZXnOKAcYKFF7CUz8D33pA43U Pa78Xp3By0sWY5fDeGuy5g0gAbi9UhAS5j0zbVjDQStGsyY0zSBdq9QWyZeH+JAQ==; Received: by sipsolutions.net with esmtpsa (TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim 4.97) (envelope-from ) id 1sPPJM-0000000DOu8-28Wa; Thu, 04 Jul 2024 18:27:59 +0200 From: Benjamin Berg To: linux-um@lists.infradead.org Cc: Benjamin Berg Subject: [PATCH v7 7/7] um: Add 4 level page table support Date: Thu, 4 Jul 2024 18:27:17 +0200 Message-ID: <20240704162717.1417338-8-benjamin@sipsolutions.net> X-Mailer: git-send-email 2.45.2 In-Reply-To: <20240704162717.1417338-1-benjamin@sipsolutions.net> References: <20240704162717.1417338-1-benjamin@sipsolutions.net> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20240704_092801_930753_A07F4B9B X-CRM114-Status: GOOD ( 21.44 ) X-Spam-Score: -0.2 (/) X-Spam-Report: Spam detection software, running on the system "bombadil.infradead.org", has NOT identified this incoming email as spam. The original message has been attached to this so you can view it or label similar future email. If you have any questions, see the administrator of that system for details. Content preview: From: Benjamin Berg The larger memory space is useful to support more applications inside UML. One example for this is ASAN instrumentation of userspace applications which requires addresses that would otherwise not be a [...] Content analysis details: (-0.2 points, 5.0 required) pts rule name description ---- ---------------------- -------------------------------------------------- -0.0 SPF_HELO_PASS SPF: HELO matches SPF record -0.0 SPF_PASS SPF: sender matches SPF record -0.1 DKIM_VALID_AU Message has a valid DKIM or DK signature from author's domain -0.1 DKIM_VALID_EF Message has a valid DKIM or DK signature from envelope-from domain -0.1 DKIM_VALID Message has at least one valid DKIM or DK signature 0.1 DKIM_SIGNED Message has a DKIM or DK signature, not necessarily valid X-BeenThere: linux-um@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-um" Errors-To: linux-um-bounces+incoming=patchwork.ozlabs.org@lists.infradead.org From: Benjamin Berg The larger memory space is useful to support more applications inside UML. One example for this is ASAN instrumentation of userspace applications which requires addresses that would otherwise not be available. Signed-off-by: Benjamin Berg --- v7: - Reword options and fix documentation of x86-64 default v2: - Do not hide option behind the EXPERT flag - Fix typo in new "Two-level pagetables" option --- arch/um/Kconfig | 1 + arch/um/include/asm/page.h | 14 +++- arch/um/include/asm/pgalloc.h | 11 ++- arch/um/include/asm/pgtable-4level.h | 119 +++++++++++++++++++++++++++ arch/um/include/asm/pgtable.h | 6 +- arch/um/kernel/mem.c | 17 +++- arch/x86/um/Kconfig | 38 ++++++--- 7 files changed, 189 insertions(+), 17 deletions(-) create mode 100644 arch/um/include/asm/pgtable-4level.h diff --git a/arch/um/Kconfig b/arch/um/Kconfig index dca84fd6d00a..7f93609ad63d 100644 --- a/arch/um/Kconfig +++ b/arch/um/Kconfig @@ -210,6 +210,7 @@ config MMAPPER config PGTABLE_LEVELS int + default 4 if 4_LEVEL_PGTABLES default 3 if 3_LEVEL_PGTABLES default 2 diff --git a/arch/um/include/asm/page.h b/arch/um/include/asm/page.h index 9ef9a8aedfa6..c3b2ae03b60c 100644 --- a/arch/um/include/asm/page.h +++ b/arch/um/include/asm/page.h @@ -57,14 +57,22 @@ typedef unsigned long long phys_t; typedef struct { unsigned long pte; } pte_t; typedef struct { unsigned long pgd; } pgd_t; -#ifdef CONFIG_3_LEVEL_PGTABLES +#if CONFIG_PGTABLE_LEVELS > 2 + typedef struct { unsigned long pmd; } pmd_t; #define pmd_val(x) ((x).pmd) #define __pmd(x) ((pmd_t) { (x) } ) -#endif -#define pte_val(x) ((x).pte) +#if CONFIG_PGTABLE_LEVELS > 3 +typedef struct { unsigned long pud; } pud_t; +#define pud_val(x) ((x).pud) +#define __pud(x) ((pud_t) { (x) } ) + +#endif /* CONFIG_PGTABLE_LEVELS > 3 */ +#endif /* CONFIG_PGTABLE_LEVELS > 2 */ + +#define pte_val(x) ((x).pte) #define pte_get_bits(p, bits) ((p).pte & (bits)) #define pte_set_bits(p, bits) ((p).pte |= (bits)) diff --git a/arch/um/include/asm/pgalloc.h b/arch/um/include/asm/pgalloc.h index de5e31c64793..04fb4e6969a4 100644 --- a/arch/um/include/asm/pgalloc.h +++ b/arch/um/include/asm/pgalloc.h @@ -31,7 +31,7 @@ do { \ tlb_remove_page_ptdesc((tlb), (page_ptdesc(pte))); \ } while (0) -#ifdef CONFIG_3_LEVEL_PGTABLES +#if CONFIG_PGTABLE_LEVELS > 2 #define __pmd_free_tlb(tlb, pmd, address) \ do { \ @@ -39,6 +39,15 @@ do { \ tlb_remove_page_ptdesc((tlb), virt_to_ptdesc(pmd)); \ } while (0) +#if CONFIG_PGTABLE_LEVELS > 3 + +#define __pud_free_tlb(tlb, pud, address) \ +do { \ + pagetable_pud_dtor(virt_to_ptdesc(pud)); \ + tlb_remove_page_ptdesc((tlb), virt_to_ptdesc(pud)); \ +} while (0) + +#endif #endif #endif diff --git a/arch/um/include/asm/pgtable-4level.h b/arch/um/include/asm/pgtable-4level.h new file mode 100644 index 000000000000..f912fcc16b7a --- /dev/null +++ b/arch/um/include/asm/pgtable-4level.h @@ -0,0 +1,119 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * Copyright 2003 PathScale Inc + * Derived from include/asm-i386/pgtable.h + */ + +#ifndef __UM_PGTABLE_4LEVEL_H +#define __UM_PGTABLE_4LEVEL_H + +#include + +/* PGDIR_SHIFT determines what a fourth-level page table entry can map */ + +#define PGDIR_SHIFT 39 +#define PGDIR_SIZE (1UL << PGDIR_SHIFT) +#define PGDIR_MASK (~(PGDIR_SIZE-1)) + +/* PUD_SHIFT determines the size of the area a third-level page table can + * map + */ + +#define PUD_SHIFT 30 +#define PUD_SIZE (1UL << PUD_SHIFT) +#define PUD_MASK (~(PUD_SIZE-1)) + +/* PMD_SHIFT determines the size of the area a second-level page table can + * map + */ + +#define PMD_SHIFT 21 +#define PMD_SIZE (1UL << PMD_SHIFT) +#define PMD_MASK (~(PMD_SIZE-1)) + +/* + * entries per page directory level + */ + +#define PTRS_PER_PTE 512 +#define PTRS_PER_PMD 512 +#define PTRS_PER_PUD 512 +#define PTRS_PER_PGD 512 + +#define USER_PTRS_PER_PGD ((TASK_SIZE + (PGDIR_SIZE - 1)) / PGDIR_SIZE) + +#define pte_ERROR(e) \ + printk("%s:%d: bad pte %p(%016lx).\n", __FILE__, __LINE__, &(e), \ + pte_val(e)) +#define pmd_ERROR(e) \ + printk("%s:%d: bad pmd %p(%016lx).\n", __FILE__, __LINE__, &(e), \ + pmd_val(e)) +#define pud_ERROR(e) \ + printk("%s:%d: bad pud %p(%016lx).\n", __FILE__, __LINE__, &(e), \ + pud_val(e)) +#define pgd_ERROR(e) \ + printk("%s:%d: bad pgd %p(%016lx).\n", __FILE__, __LINE__, &(e), \ + pgd_val(e)) + +#define pud_none(x) (!(pud_val(x) & ~_PAGE_NEWPAGE)) +#define pud_bad(x) ((pud_val(x) & (~PAGE_MASK & ~_PAGE_USER)) != _KERNPG_TABLE) +#define pud_present(x) (pud_val(x) & _PAGE_PRESENT) +#define pud_populate(mm, pud, pmd) \ + set_pud(pud, __pud(_PAGE_TABLE + __pa(pmd))) + +#define set_pud(pudptr, pudval) (*(pudptr) = (pudval)) + +#define p4d_none(x) (!(p4d_val(x) & ~_PAGE_NEWPAGE)) +#define p4d_bad(x) ((p4d_val(x) & (~PAGE_MASK & ~_PAGE_USER)) != _KERNPG_TABLE) +#define p4d_present(x) (p4d_val(x) & _PAGE_PRESENT) +#define p4d_populate(mm, p4d, pud) \ + set_p4d(p4d, __p4d(_PAGE_TABLE + __pa(pud))) + +#define set_p4d(p4dptr, p4dval) (*(p4dptr) = (p4dval)) + + +static inline int pgd_newpage(pgd_t pgd) +{ + return(pgd_val(pgd) & _PAGE_NEWPAGE); +} + +static inline void pgd_mkuptodate(pgd_t pgd) { pgd_val(pgd) &= ~_PAGE_NEWPAGE; } + +#define set_pmd(pmdptr, pmdval) (*(pmdptr) = (pmdval)) + +static inline void pud_clear (pud_t *pud) +{ + set_pud(pud, __pud(_PAGE_NEWPAGE)); +} + +static inline void p4d_clear (p4d_t *p4d) +{ + set_p4d(p4d, __p4d(_PAGE_NEWPAGE)); +} + +#define pud_page(pud) phys_to_page(pud_val(pud) & PAGE_MASK) +#define pud_pgtable(pud) ((pmd_t *) __va(pud_val(pud) & PAGE_MASK)) + +#define p4d_page(p4d) phys_to_page(p4d_val(p4d) & PAGE_MASK) +#define p4d_pgtable(p4d) ((pud_t *) __va(p4d_val(p4d) & PAGE_MASK)) + +static inline unsigned long pte_pfn(pte_t pte) +{ + return phys_to_pfn(pte_val(pte)); +} + +static inline pte_t pfn_pte(unsigned long page_nr, pgprot_t pgprot) +{ + pte_t pte; + phys_t phys = pfn_to_phys(page_nr); + + pte_set_val(pte, phys, pgprot); + return pte; +} + +static inline pmd_t pfn_pmd(unsigned long page_nr, pgprot_t pgprot) +{ + return __pmd((page_nr << PAGE_SHIFT) | pgprot_val(pgprot)); +} + +#endif diff --git a/arch/um/include/asm/pgtable.h b/arch/um/include/asm/pgtable.h index 5bb397b65efb..9ab3e34e8100 100644 --- a/arch/um/include/asm/pgtable.h +++ b/arch/um/include/asm/pgtable.h @@ -24,9 +24,11 @@ /* We borrow bit 10 to store the exclusive marker in swap PTEs. */ #define _PAGE_SWP_EXCLUSIVE 0x400 -#ifdef CONFIG_3_LEVEL_PGTABLES +#if CONFIG_PGTABLE_LEVELS == 4 +#include +#elif CONFIG_PGTABLE_LEVELS == 3 #include -#else +#elif CONFIG_PGTABLE_LEVELS == 2 #include #endif diff --git a/arch/um/kernel/mem.c b/arch/um/kernel/mem.c index a5b4fe2ad931..e7c262265c31 100644 --- a/arch/um/kernel/mem.c +++ b/arch/um/kernel/mem.c @@ -98,7 +98,7 @@ static void __init one_page_table_init(pmd_t *pmd) static void __init one_md_table_init(pud_t *pud) { -#ifdef CONFIG_3_LEVEL_PGTABLES +#if CONFIG_PGTABLE_LEVELS > 2 pmd_t *pmd_table = (pmd_t *) memblock_alloc_low(PAGE_SIZE, PAGE_SIZE); if (!pmd_table) panic("%s: Failed to allocate %lu bytes align=%lx\n", @@ -109,6 +109,19 @@ static void __init one_md_table_init(pud_t *pud) #endif } +static void __init one_ud_table_init(p4d_t *p4d) +{ +#if CONFIG_PGTABLE_LEVELS > 3 + pud_t *pud_table = (pud_t *) memblock_alloc_low(PAGE_SIZE, PAGE_SIZE); + if (!pud_table) + panic("%s: Failed to allocate %lu bytes align=%lx\n", + __func__, PAGE_SIZE, PAGE_SIZE); + + set_p4d(p4d, __p4d(_KERNPG_TABLE + (unsigned long) __pa(pud_table))); + BUG_ON(pud_table != pud_offset(p4d, 0)); +#endif +} + static void __init fixrange_init(unsigned long start, unsigned long end, pgd_t *pgd_base) { @@ -126,6 +139,8 @@ static void __init fixrange_init(unsigned long start, unsigned long end, for ( ; (i < PTRS_PER_PGD) && (vaddr < end); pgd++, i++) { p4d = p4d_offset(pgd, vaddr); + if (p4d_none(*p4d)) + one_ud_table_init(p4d); pud = pud_offset(p4d, vaddr); if (pud_none(*pud)) one_md_table_init(pud); diff --git a/arch/x86/um/Kconfig b/arch/x86/um/Kconfig index 186f13268401..f7a527ad704c 100644 --- a/arch/x86/um/Kconfig +++ b/arch/x86/um/Kconfig @@ -28,16 +28,34 @@ config X86_64 def_bool 64BIT select MODULES_USE_ELF_RELA -config 3_LEVEL_PGTABLES - bool "Three-level pagetables" if !64BIT - default 64BIT - help - Three-level pagetables will let UML have more than 4G of physical - memory. All the memory that can't be mapped directly will be treated - as high memory. - - However, this it experimental on 32-bit architectures, so if unsure say - N (on x86-64 it's automatically enabled, instead, as it's safe there). +choice + prompt "Pagetable levels" + default 2_LEVEL_PGTABLES if !64BIT + default 4_LEVEL_PGTABLES if 64BIT + + config 2_LEVEL_PGTABLES + bool "Two-level pagetables" if !64BIT + depends on !64BIT + help + Two-level page table for 32-bit architectures. + + config 3_LEVEL_PGTABLES + bool "Three-level pagetables" if 64BIT || (!64BIT && EXPERT) + help + Three-level pagetables will let UML have more than 4G of + physical memory. All the memory that can't be mapped + directly will be treated as high memory. + + However, this it experimental on 32-bit architectures, so if + unsure say N + + config 4_LEVEL_PGTABLES + bool "Four-level pagetables" if 64BIT + depends on 64BIT + help + Four-level pagetables, results in a bigger address space + which can be useful for some applications (e.g. ASAN). +endchoice config ARCH_HAS_SC_SIGNALS def_bool !64BIT