From patchwork Sun Mar 17 03:13:15 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "H.J. Lu"
X-Patchwork-Id: 1912851
From: "H.J. Lu"
To: libc-alpha@sourceware.org
Cc: fweimer@redhat.com
Subject: [PATCH v2] x86-64: Allocate state buffer space for RDI, RSI and RBX
Date: Sat, 16 Mar 2024 20:13:15 -0700
Message-ID: <20240317031315.565195-1-hjl.tools@gmail.com>
X-Mailer: git-send-email 2.44.0

_dl_tlsdesc_dynamic preserves RDI, RSI and RBX before realigning stack.
After realigning stack, it saves RCX, RDX, R8, R9, R10 and R11.  Define
TLSDESC_CALL_STATE_SAVE_OFFSET to allocate space for all integer registers
and round up the state size to 64 bytes to avoid clobbering saved RDI,
RSI and RBX values on stack by xsave to STATE_SAVE_OFFSET(%rsp).  This
fixes BZ #31501.
---
 sysdeps/x86/cpu-features.c                 | 11 ++-
 sysdeps/x86/sysdep.h                       |  8 ++
 sysdeps/x86_64/Makefile                    | 19 +++++
 sysdeps/x86_64/tst-gnu2-tls2-x86-64-mod0.c | 19 +++++
 sysdeps/x86_64/tst-gnu2-tls2-x86-64-mod1.S | 87 ++++++++++++++++++++++
 sysdeps/x86_64/tst-gnu2-tls2-x86-64-mod2.c | 19 +++++
 sysdeps/x86_64/tst-gnu2-tls2-x86-64.c      | 21 ++++++
 7 files changed, 180 insertions(+), 4 deletions(-)
 create mode 100644 sysdeps/x86_64/tst-gnu2-tls2-x86-64-mod0.c
 create mode 100644 sysdeps/x86_64/tst-gnu2-tls2-x86-64-mod1.S
 create mode 100644 sysdeps/x86_64/tst-gnu2-tls2-x86-64-mod2.c
 create mode 100644 sysdeps/x86_64/tst-gnu2-tls2-x86-64.c

diff --git a/sysdeps/x86/cpu-features.c b/sysdeps/x86/cpu-features.c
index 4ea373dffa..5e9c167417 100644
--- a/sysdeps/x86/cpu-features.c
+++ b/sysdeps/x86/cpu-features.c
@@ -311,7 +311,7 @@ update_active (struct cpu_features *cpu_features)
           /* NB: On AMX capable processors, ebx always includes AMX
              states.  */
           unsigned int xsave_state_full_size
-            = ALIGN_UP (ebx + STATE_SAVE_OFFSET, 64);
+            = ALIGN_UP (ebx + TLSDESC_CALL_STATE_SAVE_OFFSET, 64);
 
           cpu_features->xsave_state_size
             = xsave_state_full_size;
@@ -401,8 +401,10 @@ update_active (struct cpu_features *cpu_features)
                   unsigned int amx_size
                     = (xstate_amx_comp_offsets[31]
                        + xstate_amx_comp_sizes[31]);
-                  amx_size = ALIGN_UP (amx_size + STATE_SAVE_OFFSET,
-                                       64);
+                  amx_size
+                    = ALIGN_UP ((amx_size
+                                 + TLSDESC_CALL_STATE_SAVE_OFFSET),
+                                64);
                   /* Set xsave_state_full_size to the compact AMX
                      state size for XSAVEC.  NB: xsave_state_full_size
                      is only used in _dl_tlsdesc_dynamic_xsave and
@@ -410,7 +412,8 @@ update_active (struct cpu_features *cpu_features)
                   cpu_features->xsave_state_full_size = amx_size;
 #endif
                   cpu_features->xsave_state_size
-                    = ALIGN_UP (size + STATE_SAVE_OFFSET, 64);
+                    = ALIGN_UP (size + TLSDESC_CALL_STATE_SAVE_OFFSET,
+                                64);
                   CPU_FEATURE_SET (cpu_features, XSAVEC);
                 }
             }
diff --git a/sysdeps/x86/sysdep.h b/sysdeps/x86/sysdep.h
index db8e576e91..262d4083e2 100644
--- a/sysdeps/x86/sysdep.h
+++ b/sysdeps/x86/sysdep.h
@@ -46,6 +46,13 @@
    red-zone into account.  */
 # define STATE_SAVE_OFFSET (8 * 7 + 8)
 
+/* _dl_tlsdesc_dynamic preserves RDI, RSI and RBX before realigning
+   stack.  After realigning stack, it saves RCX, RDX, R8, R9, R10 and
+   R11.  Allocate space for all integer registers and round up the state
+   size to 64 bytes to avoid clobbering saved RDI, RSI and RBX values on
+   stack by xsave on STATE_SAVE_OFFSET(%rsp).  */
+# define TLSDESC_CALL_STATE_SAVE_OFFSET (STATE_SAVE_OFFSET + 64)
+
 /* Save SSE, AVX, AVX512, mask, bound and APX registers.  Bound and
    APX registers are mutually exclusive.  */
 # define STATE_SAVE_MASK \
@@ -68,6 +75,7 @@
 /* Offset for fxsave/xsave area used by _dl_tlsdesc_dynamic.  Since i386
    doesn't have red-zone, use 0 here.  */
 # define STATE_SAVE_OFFSET 0
+# define TLSDESC_CALL_STATE_SAVE_OFFSET 0
 
 /* Save SSE, AVX, AXV512, mask and bound registers.  */
 # define STATE_SAVE_MASK \
diff --git a/sysdeps/x86_64/Makefile b/sysdeps/x86_64/Makefile
index 66b21954f3..e21e4b96ab 100644
--- a/sysdeps/x86_64/Makefile
+++ b/sysdeps/x86_64/Makefile
@@ -217,6 +217,25 @@ valgrind-suppressions-tst-valgrind-smoke = \
         --suppressions=$(..)sysdeps/x86_64/tst-valgrind-smoke.supp
 endif
 
+tests += \
+  tst-gnu2-tls2-x86-64 \
+# tests
+
+modules-names += \
+  tst-gnu2-tls2-x86-64-mod0 \
+  tst-gnu2-tls2-x86-64-mod1 \
+  tst-gnu2-tls2-x86-64-mod2 \
+# modules-names
+
+$(objpfx)tst-gnu2-tls2-x86-64: $(shared-thread-library)
+$(objpfx)tst-gnu2-tls2-x86-64.out: \
+  $(objpfx)tst-gnu2-tls2-x86-64-mod0.so \
+  $(objpfx)tst-gnu2-tls2-x86-64-mod1.so \
+  $(objpfx)tst-gnu2-tls2-x86-64-mod2.so
+
+CFLAGS-tst-gnu2-tls2-x86-64-mod0.c += -mtls-dialect=gnu2
+CFLAGS-tst-gnu2-tls2-x86-64-mod2.c += -mtls-dialect=gnu2
+
 endif # $(subdir) == elf
 
 ifeq ($(subdir),csu)
diff --git a/sysdeps/x86_64/tst-gnu2-tls2-x86-64-mod0.c b/sysdeps/x86_64/tst-gnu2-tls2-x86-64-mod0.c
new file mode 100644
index 0000000000..40b3ec5c82
--- /dev/null
+++ b/sysdeps/x86_64/tst-gnu2-tls2-x86-64-mod0.c
@@ -0,0 +1,19 @@
+/* DSO used by tst-gnu2-tls2-x86-64.
+   Copyright (C) 2024 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
+
+#include
diff --git a/sysdeps/x86_64/tst-gnu2-tls2-x86-64-mod1.S b/sysdeps/x86_64/tst-gnu2-tls2-x86-64-mod1.S
new file mode 100644
index 0000000000..449ddd5c9d
--- /dev/null
+++ b/sysdeps/x86_64/tst-gnu2-tls2-x86-64-mod1.S
@@ -0,0 +1,87 @@
+/* Check if TLSDESC relocation preserves %rdi, %rsi and %rbx.
+   Copyright (C) 2024 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
+
+#include
+
+/* On AVX512 machines, OFFSET == 104 caused _dl_tlsdesc_dynamic_xsavec
+   to clobber %rdi, %rsi and %rbx.  On Intel AVX CPUs, the state size
+   is 960 bytes and this test didn't fail.  It may be due to the unused
+   last 128 bytes.  On AMD AVX CPUs, the state size is 832 bytes and
+   this test might fail without the fix.  */
+#ifndef OFFSET
+# define OFFSET 104
+#endif
+
+        .text
+        .p2align 4
+        .globl  apply_tls
+        .type   apply_tls, @function
+apply_tls:
+        cfi_startproc
+        _CET_ENDBR
+        pushq   %rbp
+        cfi_def_cfa_offset (16)
+        cfi_offset (6, -16)
+        movdqu  (%RDI_LP), %xmm0
+        lea     tls_var1@TLSDESC(%rip), %RAX_LP
+        mov     %RSP_LP, %RBP_LP
+        cfi_def_cfa_register (6)
+        /* Align stack to 64 bytes.  */
+        and     $-64, %RSP_LP
+        sub     $OFFSET, %RSP_LP
+        pushq   %rbx
+        /* Set %ebx to 0xbadbeef.  */
+        movl    $0xbadbeef, %ebx
+        movl    $0xbadbeef, %esi
+        movq    %rdi, saved_rdi(%rip)
+        movq    %rsi, saved_rsi(%rip)
+        call    *tls_var1@TLSCALL(%RAX_LP)
+        /* Check if _dl_tlsdesc_dynamic preserves %rdi, %rsi and %rbx.  */
+        cmpq    saved_rdi(%rip), %rdi
+        jne     L(hlt)
+        cmpq    saved_rsi(%rip), %rsi
+        jne     L(hlt)
+        cmpl    $0xbadbeef, %ebx
+        jne     L(hlt)
+        add     %fs:0, %RAX_LP
+        movups  %xmm0, 32(%RAX_LP)
+        movdqu  16(%RDI_LP), %xmm1
+        mov     %RAX_LP, %RBX_LP
+        movups  %xmm1, 48(%RAX_LP)
+        lea     32(%RBX_LP), %RAX_LP
+        pop     %rbx
+        leave
+        cfi_def_cfa (7, 8)
+        ret
+L(hlt):
+        hlt
+        cfi_endproc
+        .size   apply_tls, .-apply_tls
+        .hidden tls_var1
+        .globl  tls_var1
+        .section        .tbss,"awT",@nobits
+        .align 16
+        .type   tls_var1, @object
+        .size   tls_var1, 3200
+tls_var1:
+        .zero   3200
+        .local  saved_rdi
+        .comm   saved_rdi,8,8
+        .local  saved_rsi
+        .comm   saved_rsi,8,8
+        .section        .note.GNU-stack,"",@progbits
diff --git a/sysdeps/x86_64/tst-gnu2-tls2-x86-64-mod2.c b/sysdeps/x86_64/tst-gnu2-tls2-x86-64-mod2.c
new file mode 100644
index 0000000000..c12b81a49b
--- /dev/null
+++ b/sysdeps/x86_64/tst-gnu2-tls2-x86-64-mod2.c
@@ -0,0 +1,19 @@
+/* DSO used by tst-gnu2-tls2-x86-64.
+   Copyright (C) 2024 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
+
+#include
diff --git a/sysdeps/x86_64/tst-gnu2-tls2-x86-64.c b/sysdeps/x86_64/tst-gnu2-tls2-x86-64.c
new file mode 100644
index 0000000000..7d51f488bd
--- /dev/null
+++ b/sysdeps/x86_64/tst-gnu2-tls2-x86-64.c
@@ -0,0 +1,21 @@
+/* Check if TLSDESC relocation preserves %rdi, %rsi and %rbx.
+   Copyright (C) 2024 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
+
+#define MOD(i) "tst-gnu2-tls2-x86-64-mod" #i ".so"
+
+#include
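
The size arithmetic changed in update_active can be sketched in
standalone C.  This is an illustration only, not glibc code: the two
macro values are copied from the x86-64 branch of the sysdep.h hunk,
align_up is a hypothetical stand-in for glibc's ALIGN_UP, and
old_state_size and new_state_size are names invented here for the
expressions before and after the patch.

```c
#include <stddef.h>

/* Values copied from the x86-64 branch of the sysdeps/x86/sysdep.h
   hunk in this patch.  */
#define STATE_SAVE_OFFSET (8 * 7 + 8)
#define TLSDESC_CALL_STATE_SAVE_OFFSET (STATE_SAVE_OFFSET + 64)

/* Hypothetical stand-in for glibc's ALIGN_UP.  ALIGN must be a power
   of two.  */
size_t
align_up (size_t value, size_t align)
{
  return (value + align - 1) & ~(align - 1);
}

/* State size as computed before the patch (illustrative name).  */
size_t
old_state_size (size_t xsave_size)
{
  return align_up (xsave_size + STATE_SAVE_OFFSET, 64);
}

/* State size as computed with the patch (illustrative name).  */
size_t
new_state_size (size_t xsave_size)
{
  return align_up (xsave_size + TLSDESC_CALL_STATE_SAVE_OFFSET, 64);
}
```

For the 832-byte state size quoted in the mod1.S comment,
old_state_size (832) is 896 and new_state_size (832) is 960.  Since the
extra offset is itself a multiple of 64, the patched size is always
exactly 64 bytes larger than the old one.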