Message ID | 20240104160043.1641757-1-hjl.tools@gmail.com |
---|---|
State | New |
Series | [v6] elf: Add ELF_DYNAMIC_AFTER_RELOC to rewrite PLT |
On Thu, Jan 4, 2024 at 8:01 AM H.J. Lu <hjl.tools@gmail.com> wrote:
>
> Changes in v6:
>
> 1. Print skipped PLT entries.
> 2. Always generate DT_X86_64_PLT* tags when supported.
> 3. Never generate DT_X86_64_PLT* tags on ld.so.
> 4. Don't check for ld.so in x86_64_dynamic_after_reloc to reduce the
> run-time overhead.
>
> Changes in v5:
>
> 1. Remove plt_rewrite_enabled and set plt_rewrite in set_plt_rewrite to
> reduce the run-time overhead.
> 2. Use __glibc_likely and __glibc_unlikely in x86_64_dynamic_after_reloc
> to reduce the run-time overhead.
>
> Changes in v4:
>
> 1. Update set_plt_rewrite.
> 2. Add /* fallthrough */ to case R_X86_64_JUMP_SLOT:.
> 3. Remove plt_rewrite_bias since it is always zero.
> 4. Skip and silently ignore unsupported r_addend.
> 5. Update copyright year to 2024.
> 6. Drop the mprotect (PROT_EXEC | PROT_READ) check.
>
> Changes in v3:
>
> 1. Define and use macros for instruction opcodes and sizes.
> 2. Use INT32_MIN and UINT32_MAX instead of 0x80000000ULL and
> 0xffffffffULL.
> 3. Replace 3 1-byte writes with a 4-byte write to write JMPABS.
> 4. Verify that DT_X86_64_PLTSZ is a multiple of DT_X86_64_PLTENT.
> 5. Add some comments for mprotect test.
> 6. Update plt_rewrite logic.
>
> Add ELF_DYNAMIC_AFTER_RELOC to allow target specific processing after
> relocation.
>
> For x86-64, add
>
> #define DT_X86_64_PLT (DT_LOPROC + 0)
> #define DT_X86_64_PLTSZ (DT_LOPROC + 1)
> #define DT_X86_64_PLTENT (DT_LOPROC + 3)
>
> 1. DT_X86_64_PLT: The address of the procedure linkage table.
> 2. DT_X86_64_PLTSZ: The total size, in bytes, of the procedure linkage
> table.
> 3. DT_X86_64_PLTENT: The size, in bytes, of a procedure linkage table
> entry.
>
> With the r_addend field of the R_X86_64_JUMP_SLOT relocation set to the
> memory offset of the indirect branch instruction.
>
> Define ELF_DYNAMIC_AFTER_RELOC for x86-64 to rewrite the PLT section
> with direct branch after relocation when the lazy binding is disabled.
> > PLT rewrite is disabled by default since SELinux may disallow modifying > code pages and ld.so can't detect it in all cases. Add > > $ GLIBC_TUNABLES=glibc.cpu.plt_rewrite=1 > > to enable PLT rewrite at run-time. > --- > elf/dynamic-link.h | 5 + > elf/elf.h | 5 + > elf/tst-glibcelf.py | 1 + > manual/tunables.texi | 9 ++ > scripts/glibcelf.py | 4 + > sysdeps/x86/cet-control.h | 12 ++ > sysdeps/x86/cpu-features.c | 20 ++- > sysdeps/x86/dl-procruntime.c | 1 + > sysdeps/x86/dl-tunables.list | 5 + > sysdeps/x86_64/Makefile | 27 ++++ > sysdeps/x86_64/configure | 35 +++++ > sysdeps/x86_64/configure.ac | 4 + > sysdeps/x86_64/dl-dtprocnum.h | 21 +++ > sysdeps/x86_64/dl-machine.h | 216 ++++++++++++++++++++++++++- > sysdeps/x86_64/link_map.h | 22 +++ > sysdeps/x86_64/tst-plt-rewrite1.c | 31 ++++ > sysdeps/x86_64/tst-plt-rewritemod1.c | 32 ++++ > 17 files changed, 448 insertions(+), 2 deletions(-) > create mode 100644 sysdeps/x86_64/dl-dtprocnum.h > create mode 100644 sysdeps/x86_64/link_map.h > create mode 100644 sysdeps/x86_64/tst-plt-rewrite1.c > create mode 100644 sysdeps/x86_64/tst-plt-rewritemod1.c > > diff --git a/elf/dynamic-link.h b/elf/dynamic-link.h > index 8cdf7bde09..83d834ecaf 100644 > --- a/elf/dynamic-link.h > +++ b/elf/dynamic-link.h > @@ -177,6 +177,10 @@ elf_machine_lazy_rel (struct link_map *map, struct r_scope_elem *scope[], > } \ > } while (0); > > +# ifndef ELF_DYNAMIC_AFTER_RELOC > +# define ELF_DYNAMIC_AFTER_RELOC(map, lazy) > +# endif > + > /* This can't just be an inline function because GCC is too dumb > to inline functions containing inlines themselves. 
*/ > # ifdef RTLD_BOOTSTRAP > @@ -192,6 +196,7 @@ elf_machine_lazy_rel (struct link_map *map, struct r_scope_elem *scope[], > ELF_DYNAMIC_DO_RELR (map); \ > ELF_DYNAMIC_DO_REL ((map), (scope), edr_lazy, skip_ifunc); \ > ELF_DYNAMIC_DO_RELA ((map), (scope), edr_lazy, skip_ifunc); \ > + ELF_DYNAMIC_AFTER_RELOC ((map), (edr_lazy)); \ > } while (0) > > #endif > diff --git a/elf/elf.h b/elf/elf.h > index ca6a7d9d67..455731663c 100644 > --- a/elf/elf.h > +++ b/elf/elf.h > @@ -3639,6 +3639,11 @@ enum > /* x86-64 sh_type values. */ > #define SHT_X86_64_UNWIND 0x70000001 /* Unwind information. */ > > +/* x86-64 d_tag values. */ > +#define DT_X86_64_PLT (DT_LOPROC + 0) > +#define DT_X86_64_PLTSZ (DT_LOPROC + 1) > +#define DT_X86_64_PLTENT (DT_LOPROC + 3) > +#define DT_X86_64_NUM 4 > > /* AM33 relocations. */ > #define R_MN10300_NONE 0 /* No reloc. */ > diff --git a/elf/tst-glibcelf.py b/elf/tst-glibcelf.py > index 00cd2bba85..c191636a99 100644 > --- a/elf/tst-glibcelf.py > +++ b/elf/tst-glibcelf.py > @@ -187,6 +187,7 @@ DT_VALNUM > DT_VALRNGHI > DT_VALRNGLO > DT_VERSIONTAGNUM > +DT_X86_64_NUM > ELFCLASSNUM > ELFDATANUM > EM_NUM > diff --git a/manual/tunables.texi b/manual/tunables.texi > index b31f16da84..f9bd83622e 100644 > --- a/manual/tunables.texi > +++ b/manual/tunables.texi > @@ -57,6 +57,7 @@ glibc.pthread.stack_cache_size: 0x2800000 (min: 0x0, max: 0xffffffffffffffff) > glibc.cpu.hwcap_mask: 0x6 (min: 0x0, max: 0xffffffffffffffff) > glibc.malloc.mmap_max: 0 (min: 0, max: 2147483647) > glibc.elision.skip_trylock_internal_abort: 3 (min: 0, max: 2147483647) > +glibc.cpu.plt_rewrite: 0 (min: 0, max: 1) > glibc.malloc.tcache_unsorted_limit: 0x0 (min: 0x0, max: 0xffffffffffffffff) > glibc.cpu.x86_ibt: > glibc.cpu.hwcaps: > @@ -614,6 +615,14 @@ this tunable. > This tunable is specific to 64-bit x86-64. 
> @end deftp
>
> +@deftp Tunable glibc.cpu.plt_rewrite
> +When this tunable is set to @code{1}, the dynamic linker will rewrite
> +the PLT section with direct branch after relocation if possible when
> +the lazy binding is disabled.
> +

This doesn't read well. Maybe

When this tunable is set to @code{1}, the dynamic linker will attempt
to rewrite the PLT section with a direct branch after relocation. It may
fail to do so, for example when lazy binding is enabled.

> +This tunable is specific to x86-64.
> +@end deftp
> +
> @node Memory Related Tunables
> @section Memory Related Tunables
> @cindex memory related tunables
> diff --git a/scripts/glibcelf.py b/scripts/glibcelf.py
> index c5e5dda48e..5f3813f326 100644
> --- a/scripts/glibcelf.py
> +++ b/scripts/glibcelf.py
> @@ -439,6 +439,8 @@ class DtRISCV(Dt):
>      """Supplemental DT_* constants for EM_RISCV."""
>  class DtSPARC(Dt):
>      """Supplemental DT_* constants for EM_SPARC."""
> +class DtX86_64(Dt):
> +    """Supplemental DT_* constants for EM_X86_64."""
>  _dt_skip = '''
>  DT_ENCODING DT_PROCNUM
>  DT_ADDRRNGLO DT_ADDRRNGHI DT_ADDRNUM
> @@ -451,6 +453,7 @@ DT_MIPS_NUM
>  DT_PPC_NUM
>  DT_PPC64_NUM
>  DT_SPARC_NUM
> +DT_X86_64_NUM
>  '''.strip().split()
>  _register_elf_h(DtAARCH64, prefix='DT_AARCH64_', skip=_dt_skip, parent=Dt)
>  _register_elf_h(DtALPHA, prefix='DT_ALPHA_', skip=_dt_skip, parent=Dt)
> @@ -461,6 +464,7 @@ _register_elf_h(DtPPC, prefix='DT_PPC_', skip=_dt_skip, parent=Dt)
>  _register_elf_h(DtPPC64, prefix='DT_PPC64_', skip=_dt_skip, parent=Dt)
>  _register_elf_h(DtRISCV, prefix='DT_RISCV_', skip=_dt_skip, parent=Dt)
>  _register_elf_h(DtSPARC, prefix='DT_SPARC_', skip=_dt_skip, parent=Dt)
> +_register_elf_h(DtX86_64, prefix='DT_X86_64_', skip=_dt_skip, parent=Dt)
>  _register_elf_h(Dt, skip=_dt_skip, ranges=True)
>  del _dt_skip
>
> diff --git a/sysdeps/x86/cet-control.h b/sysdeps/x86/cet-control.h
> index a45d59bf8c..81e7bb4bd8 100644
> --- a/sysdeps/x86/cet-control.h
> +++ b/sysdeps/x86/cet-control.h
> @@ -32,10 +32,22 @@ enum dl_x86_cet_control
>    cet_permissive
>  };
>
> +/* PLT rewrite control. */
> +enum dl_plt_rewrite_control
> +{
> +  /* No PLT rewrite. */
> +  plt_rewrite_none,
> +  /* Rewrite PLT with JMP at run-time. */
> +  plt_rewrite_jmp,
> +  /* Rewrite PLT with JMPABS at run-time. */
> +  plt_rewrite_jmpabs
> +};
> +
>  struct dl_x86_feature_control
>  {
>    enum dl_x86_cet_control ibt : 2;
>    enum dl_x86_cet_control shstk : 2;
> +  enum dl_plt_rewrite_control plt_rewrite : 2;
>  };
>
>  #endif /* cet-control.h */
> diff --git a/sysdeps/x86/cpu-features.c b/sysdeps/x86/cpu-features.c
> index f193ea7a2d..ccf1350b72 100644
> --- a/sysdeps/x86/cpu-features.c
> +++ b/sysdeps/x86/cpu-features.c
> @@ -27,6 +27,21 @@
>  extern void TUNABLE_CALLBACK (set_hwcaps) (tunable_val_t *)
>    attribute_hidden;
>
> +#ifdef SHARED
> +static void
> +TUNABLE_CALLBACK (set_plt_rewrite) (tunable_val_t *valp)
> +{
> +  if (valp->numval != 0)
> +    {
> +      /* Use JMPABS only on APX processors. */
> +      const struct cpu_features *cpu_features = __get_cpu_features ();
> +      GL(dl_x86_feature_control).plt_rewrite
> +        = (CPU_FEATURE_PRESENT_P (cpu_features, APX_F)
> +           ? plt_rewrite_jmpabs : plt_rewrite_jmp);
> +    }
> +}
> +#endif
> +
>  #ifdef __LP64__
>  static void
>  TUNABLE_CALLBACK (set_prefer_map_32bit_exec) (tunable_val_t *valp)
> @@ -1108,7 +1123,10 @@ no_cpuid:
>    TUNABLE_CALLBACK (set_x86_shstk));
>  #endif
>
> -#ifndef SHARED
> +#ifdef SHARED
> +  TUNABLE_GET (plt_rewrite, tunable_val_t *,
> +               TUNABLE_CALLBACK (set_plt_rewrite));
> +#else
>    /* NB: In libc.a, call init_cacheinfo. */
>    init_cacheinfo ();
>  #endif
> diff --git a/sysdeps/x86/dl-procruntime.c b/sysdeps/x86/dl-procruntime.c
> index 4d25d9f327..15b3d0d878 100644
> --- a/sysdeps/x86/dl-procruntime.c
> +++ b/sysdeps/x86/dl-procruntime.c
> @@ -67,6 +67,7 @@ PROCINFO_CLASS struct dl_x86_feature_control _dl_x86_feature_control
>  = {
>    .ibt = DEFAULT_DL_X86_CET_CONTROL,
>    .shstk = DEFAULT_DL_X86_CET_CONTROL,
> +  .plt_rewrite = plt_rewrite_none,
>  }
>  # endif
>  # if !defined SHARED || defined PROCINFO_DECL
> diff --git a/sysdeps/x86/dl-tunables.list b/sysdeps/x86/dl-tunables.list
> index 147a7270ec..e2e441e1b7 100644
> --- a/sysdeps/x86/dl-tunables.list
> +++ b/sysdeps/x86/dl-tunables.list
> @@ -66,5 +66,10 @@ glibc {
>      x86_shared_cache_size {
>        type: SIZE_T
>      }
> +    plt_rewrite {
> +      type: INT_32
> +      minval: 0
> +      maxval: 1
> +    }

Should the max value be at least 2, given that there are three distinct
states (none, jmp, jmpabs)?

>    }
>  }
> diff --git a/sysdeps/x86_64/Makefile b/sysdeps/x86_64/Makefile
> index 00120ca9ca..374bca80d0 100644
> --- a/sysdeps/x86_64/Makefile
> +++ b/sysdeps/x86_64/Makefile
> @@ -1,6 +1,14 @@
>  # The i387 `long double' is a distinct type we support.
>  long-double-fcts = yes
>
> +ifeq (yes,$(have-z-mark-plt))
> +# Always generate DT_X86_64_PLT* tags.
> +sysdep-LDFLAGS += -Wl,-z,mark-plt
> +# Never generate DT_X86_64_PLT* tags on ld.so to avoid changing its own
> +# PLT.
> +LDFLAGS-rtld += -Wl,-z,nomark-plt > +endif > + > ifeq ($(subdir),csu) > gen-as-const-headers += link-defines.sym > endif > @@ -175,6 +183,25 @@ ifeq (no,$(build-hardcoded-path-in-tests)) > tests-container += tst-glibc-hwcaps-cache > endif > > +ifeq (yes,$(have-z-mark-plt)) > +tests += \ > + tst-plt-rewrite1 \ > +# tests > +modules-names += \ > + tst-plt-rewritemod1 \ > +# modules-names > + > +tst-plt-rewrite1-no-pie = yes > +LDFLAGS-tst-plt-rewrite1 = -Wl,-z,now > +LDFLAGS-tst-plt-rewritemod1.so = -Wl,-z,now > +tst-plt-rewrite1-ENV = GLIBC_TUNABLES=glibc.cpu.plt_rewrite=1 LD_DEBUG=files:bindings > +$(objpfx)tst-plt-rewrite1: $(objpfx)tst-plt-rewritemod1.so > +$(objpfx)tst-plt-rewrite1.out: /dev/null $(objpfx)tst-plt-rewrite1 > + $(tst-plt-rewrite1-ENV) $(make-test-out) > $@ 2>&1; \ > + grep -q -E "changing 'bar' PLT entry in .*/elf/tst-plt-rewritemod1.so' to direct branch" $@; \ > + $(evaluate-test) > +endif > + > endif # $(subdir) == elf > > ifeq ($(subdir),csu) > diff --git a/sysdeps/x86_64/configure b/sysdeps/x86_64/configure > index b4a80b8035..418cc4a9b8 100755 > --- a/sysdeps/x86_64/configure > +++ b/sysdeps/x86_64/configure > @@ -25,6 +25,41 @@ printf "%s\n" "$libc_cv_cc_mprefer_vector_width" >&6; } > config_vars="$config_vars > config-cflags-mprefer-vector-width = $libc_cv_cc_mprefer_vector_width" > > +{ printf "%s\n" "$as_me:${as_lineno-$LINENO}: checking for linker that supports -z mark-plt" >&5 > +printf %s "checking for linker that supports -z mark-plt... " >&6; } > +libc_linker_feature=no > +cat > conftest.c <<EOF > +int _start (void) { return 42; } > +EOF > +if { ac_try='${CC-cc} $CFLAGS $CPPFLAGS $LDFLAGS $no_ssp > + -Wl,-z,mark-plt -nostdlib -nostartfiles > + -fPIC -shared -o conftest.so conftest.c > + 1>&5' > + { { eval echo "\"\$as_me\":${as_lineno-$LINENO}: \"$ac_try\""; } >&5 > + (eval $ac_try) 2>&5 > + ac_status=$? > + printf "%s\n" "$as_me:${as_lineno-$LINENO}: \$? 
= $ac_status" >&5 > + test $ac_status = 0; }; } > +then > + if ${CC-cc} $CFLAGS $CPPFLAGS $LDFLAGS $no_ssp -Wl,-z,mark-plt -nostdlib \ > + -nostartfiles -fPIC -shared -o conftest.so conftest.c 2>&1 \ > + | grep "warning: -z mark-plt ignored" > /dev/null 2>&1; then > + true > + else > + libc_linker_feature=yes > + fi > +fi > +rm -f conftest* > +if test $libc_linker_feature = yes; then > + libc_cv_z_mark_plt=yes > +else > + libc_cv_z_mark_plt=no > +fi > +{ printf "%s\n" "$as_me:${as_lineno-$LINENO}: result: $libc_linker_feature" >&5 > +printf "%s\n" "$libc_linker_feature" >&6; } > +config_vars="$config_vars > +have-z-mark-plt = $libc_cv_z_mark_plt" > + > if test x"$build_mathvec" = xnotset; then > build_mathvec=yes > fi > diff --git a/sysdeps/x86_64/configure.ac b/sysdeps/x86_64/configure.ac > index 937d1aff7e..d1f803c02e 100644 > --- a/sysdeps/x86_64/configure.ac > +++ b/sysdeps/x86_64/configure.ac > @@ -10,6 +10,10 @@ LIBC_TRY_CC_OPTION([-mprefer-vector-width=128], > LIBC_CONFIG_VAR([config-cflags-mprefer-vector-width], > [$libc_cv_cc_mprefer_vector_width]) > > +LIBC_LINKER_FEATURE([-z mark-plt], [-Wl,-z,mark-plt], > + [libc_cv_z_mark_plt=yes], [libc_cv_z_mark_plt=no]) > +LIBC_CONFIG_VAR([have-z-mark-plt], [$libc_cv_z_mark_plt]) > + > if test x"$build_mathvec" = xnotset; then > build_mathvec=yes > fi > diff --git a/sysdeps/x86_64/dl-dtprocnum.h b/sysdeps/x86_64/dl-dtprocnum.h > new file mode 100644 > index 0000000000..cefacb5387 > --- /dev/null > +++ b/sysdeps/x86_64/dl-dtprocnum.h > @@ -0,0 +1,21 @@ > +/* Configuration of lookup functions. x64-64 version. > + Copyright (C) 2024 Free Software Foundation, Inc. > + This file is part of the GNU C Library. > + > + The GNU C Library is free software; you can redistribute it and/or > + modify it under the terms of the GNU Lesser General Public > + License as published by the Free Software Foundation; either > + version 2.1 of the License, or (at your option) any later version. 
> + > + The GNU C Library is distributed in the hope that it will be useful, > + but WITHOUT ANY WARRANTY; without even the implied warranty of > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > + Lesser General Public License for more details. > + > + You should have received a copy of the GNU Lesser General Public > + License along with the GNU C Library; if not, see > + <https://www.gnu.org/licenses/>. */ > + > +/* Number of extra dynamic section entries for this architecture. By > + default there are none. */ > +#define DT_THISPROCNUM DT_X86_64_NUM > diff --git a/sysdeps/x86_64/dl-machine.h b/sysdeps/x86_64/dl-machine.h > index e0a9b14469..b075482ef4 100644 > --- a/sysdeps/x86_64/dl-machine.h > +++ b/sysdeps/x86_64/dl-machine.h > @@ -22,6 +22,7 @@ > #define ELF_MACHINE_NAME "x86_64" > > #include <assert.h> > +#include <stdint.h> > #include <sys/param.h> > #include <sysdep.h> > #include <tls.h> > @@ -35,6 +36,9 @@ > # define RTLD_START_ENABLE_X86_FEATURES > #endif > > +/* Translate a processor specific dynamic tag to the index in l_info array. */ > +#define DT_X86_64(x) (DT_X86_64_##x - DT_LOPROC + DT_NUM) > + > /* Return nonzero iff ELF header is compatible with the running host. 
*/ > static inline int __attribute__ ((unused)) > elf_machine_matches_host (const ElfW(Ehdr) *ehdr) > @@ -312,8 +316,10 @@ and creates an unsatisfiable circular dependency.\n", > > switch (r_type) > { > - case R_X86_64_GLOB_DAT: > case R_X86_64_JUMP_SLOT: > + map->l_has_jump_slot_reloc = true; > + /* fallthrough */ > + case R_X86_64_GLOB_DAT: > *reloc_addr = value; > break; > > @@ -549,3 +555,211 @@ elf_machine_lazy_rel (struct link_map *map, struct r_scope_elem *scope[], > } > > #endif /* RESOLVE_MAP */ > + > +#if !defined ELF_DYNAMIC_AFTER_RELOC && !defined RTLD_BOOTSTRAP \ > + && defined SHARED > +# define ELF_DYNAMIC_AFTER_RELOC(map, lazy) \ > + x86_64_dynamic_after_reloc (map, (lazy)) > + > +# define JMP32_INSN_OPCODE 0xe9 > +# define JMP32_INSN_SIZE 5 > +# define JMPABS_INSN_OPCODE 0xa100d5 > +# define JMPABS_INSN_SIZE 11 > +# define INT3_INSN_OPCODE 0xcc > + > +static const char * > +x86_64_reloc_symbol_name (struct link_map *map, const ElfW(Rela) *reloc) > +{ > + const ElfW(Sym) *const symtab > + = (const void *) map->l_info[DT_SYMTAB]->d_un.d_ptr; > + const ElfW(Sym) *const refsym = &symtab[ELFW (R_SYM) (reloc->r_info)]; > + const char *strtab = (const char *) map->l_info[DT_STRTAB]->d_un.d_ptr; > + return strtab + refsym->st_name; > +} > + > +static void > +x86_64_rewrite_plt (struct link_map *map, ElfW(Addr) plt_rewrite) > +{ > + ElfW(Addr) l_addr = map->l_addr; > + ElfW(Addr) pltent = map->l_info[DT_X86_64 (PLTENT)]->d_un.d_val; > + ElfW(Addr) start = map->l_info[DT_JMPREL]->d_un.d_ptr; > + ElfW(Addr) size = map->l_info[DT_PLTRELSZ]->d_un.d_val; > + const ElfW(Rela) *reloc = (const void *) start; > + const ElfW(Rela) *reloc_end = (const void *) (start + size); > + > + unsigned int feature_1 = THREAD_GETMEM (THREAD_SELF, > + header.feature_1); > + bool ibt_enabled_p > + = (feature_1 & GNU_PROPERTY_X86_FEATURE_1_IBT) != 0; > + > + if (__glibc_unlikely (GLRO(dl_debug_mask) & DL_DEBUG_FILES)) > + _dl_debug_printf ("\nchanging PLT in '%s' to direct 
branch\n",
> +                      DSO_FILENAME (map->l_name));
> +
> +  for (; reloc < reloc_end; reloc++)
> +    if (ELFW(R_TYPE) (reloc->r_info) == R_X86_64_JUMP_SLOT)
> +      {
> +        /* Get the value from the GOT entry. */
> +        ElfW(Addr) value = *(ElfW(Addr) *) (l_addr + reloc->r_offset);
> +
> +        /* Get the corresponding PLT entry from r_addend. */
> +        ElfW(Addr) branch_start = l_addr + reloc->r_addend;
> +        /* Skip ENDBR64 if IBT isn't enabled. */
> +        if (!ibt_enabled_p)
> +          branch_start = ALIGN_DOWN (branch_start, pltent);

Will the only preceding code always be the ENDBR64? If so, why not just
replace the alignment stuff with `- ENDBR64_INSN_SIZE`? Likewise below.

> +        /* Get the displacement from the branch target. */
> +        ElfW(Addr) disp = value - branch_start - JMP32_INSN_SIZE;
> +        ElfW(Addr) plt_end;
> +        ElfW(Addr) pad;
> +
> +        plt_end = (branch_start & -pltent) + pltent;

For this to make sense, `pltent` needs to be a power of 2, which is not
checked below in `x86_64_dynamic_after_reloc`. Likewise at the ALIGN_DOWN
above.

Also, better codegen: `(branch_start | (pltent - 1)) + 1`

> +
> +        /* Update the PLT entry. */
> +        if (((uint64_t) disp + (uint64_t) ((uint32_t) INT32_MIN))
> +            <= (uint64_t) UINT32_MAX)
> +          {
> +            pad = branch_start + JMP32_INSN_SIZE;
> +
> +            if (__glibc_unlikely (pad > plt_end))
> +              continue;
> +
> +            /* If the target branch can be reached with a direct branch,
> +               rewrite the PLT entry with a direct branch. */
> +            if (__glibc_unlikely (GLRO(dl_debug_mask) & DL_DEBUG_BINDINGS))
> +              {
> +                const char *sym_name = x86_64_reloc_symbol_name (map,
> +                                                                 reloc);
> +                _dl_debug_printf ("changing '%s' PLT entry in '%s' to "
> +                                  "direct branch\n", sym_name,
> +                                  DSO_FILENAME (map->l_name));
> +              }
> +
> +            /* Write out direct branch. */
> +            *(uint8_t *) branch_start = JMP32_INSN_OPCODE;
> +            *((uint32_t *) (branch_start + 1)) = disp;
> +          }
> +        else
> +          {
> +            if (GL(dl_x86_feature_control).plt_rewrite
> +                != plt_rewrite_jmpabs)
> +              {
> +                if (__glibc_unlikely (GLRO(dl_debug_mask)
> +                                      & DL_DEBUG_BINDINGS))
> +                  {
> +                    const char *sym_name
> +                      = x86_64_reloc_symbol_name (map, reloc);
> +                    _dl_debug_printf ("skipping '%s' PLT entry in '%s'\n",
> +                                      sym_name,
> +                                      DSO_FILENAME (map->l_name));
> +                  }
> +                continue;
> +              }
> +
> +            pad = branch_start + JMPABS_INSN_SIZE;
> +
> +            if (__glibc_unlikely (pad > plt_end))
> +              continue;
> +
> +            /* Rewrite the PLT entry with JMPABS. */
> +            if (__glibc_unlikely (GLRO(dl_debug_mask) & DL_DEBUG_BINDINGS))
> +              {
> +                const char *sym_name = x86_64_reloc_symbol_name (map,
> +                                                                 reloc);
> +                _dl_debug_printf ("changing '%s' PLT entry in '%s' to "
> +                                  "JMPABS\n", sym_name,
> +                                  DSO_FILENAME (map->l_name));
> +              }
> +
> +            /* "jmpabs $target" for 64-bit displacement. NB: JMPABS has
> +               a 3-byte opcode + 64bit address. There is a 1-byte overlap
> +               between 4-byte write and 8-byte write. */
> +            *(uint32_t *) (branch_start) = JMPABS_INSN_OPCODE;
> +            *(uint64_t *) (branch_start + 3) = value;
> +          }
> +
> +        /* Fill the unused part of the PLT entry with INT3. */
> +        for (; pad < plt_end; pad++)
> +          *(uint8_t *) pad = INT3_INSN_OPCODE;

nit: Can you put braces around this for loop? It's a bit deceiving to the
eye with the brace directly after it.

> +      }
> +}
> +
> +static inline void
> +x86_64_rewrite_plt_in_place (struct link_map *map)
> +{
> +  /* Adjust DT_X86_64_PLT address and DT_X86_64_PLTSZ values.
*/ > + ElfW(Addr) plt = (map->l_info[DT_X86_64 (PLT)]->d_un.d_ptr > + + map->l_addr); > + size_t pagesize = GLRO(dl_pagesize); > + ElfW(Addr) plt_aligned = ALIGN_DOWN (plt, pagesize); > + size_t pltsz = (map->l_info[DT_X86_64 (PLTSZ)]->d_un.d_val > + + plt - plt_aligned); > + > + if (__glibc_unlikely (GLRO(dl_debug_mask) & DL_DEBUG_FILES)) > + _dl_debug_printf ("\nchanging PLT in '%s' to writable\n", > + DSO_FILENAME (map->l_name)); > + > + if (__glibc_unlikely (__mprotect ((void *) plt_aligned, pltsz, > + PROT_WRITE | PROT_READ) < 0)) > + { > + if (__glibc_unlikely (GLRO(dl_debug_mask) & DL_DEBUG_FILES)) > + _dl_debug_printf ("\nfailed to change PLT in '%s' to writable\n", > + DSO_FILENAME (map->l_name)); > + return; > + } > + > + x86_64_rewrite_plt (map, plt_aligned); > + > + if (__glibc_unlikely (GLRO(dl_debug_mask) & DL_DEBUG_FILES)) > + _dl_debug_printf ("\nchanging PLT in '%s' back to read-only\n", > + DSO_FILENAME (map->l_name)); > + > + if (__glibc_unlikely (__mprotect ((void *) plt_aligned, pltsz, > + PROT_EXEC | PROT_READ) < 0)) > + _dl_signal_error (0, DSO_FILENAME (map->l_name), NULL, > + "failed to change PLT back to read-only"); > +} > + > +/* Rewrite PLT entries to direct branch if possible. */ > + > +static inline void > +x86_64_dynamic_after_reloc (struct link_map *map, int lazy) > +{ > + /* Ignore DT_X86_64_PLT if the lazy binding is enabled. */ > + if (lazy != 0) > + return; > + > + /* Ignore DT_X86_64_PLT if PLT rewrite isn't enabled. */ > + if (__glibc_likely (GL(dl_x86_feature_control).plt_rewrite > + == plt_rewrite_none)) > + return; > + > + if (__glibc_likely (map->l_info[DT_X86_64 (PLT)] == NULL)) > + return; > + > + /* Ignore DT_X86_64_PLT if there is no R_X86_64_JUMP_SLOT. */ > + if (map->l_has_jump_slot_reloc == 0) > + return; > + > + /* Ignore DT_X86_64_PLT if > + 1. DT_JMPREL isn't available or its value is 0. > + 2. DT_PLTRELSZ is 0. > + 3. DT_X86_64_PLTENT isn't available or its value is smaller than > + 16 bytes. > + 4. 
DT_X86_64_PLTSZ isn't available or its value is smaller than > + DT_X86_64_PLTENT's value or isn't a multiple of DT_X86_64_PLTENT's > + value. */ > + if (map->l_info[DT_JMPREL] == NULL > + || map->l_info[DT_JMPREL]->d_un.d_ptr == 0 > + || map->l_info[DT_PLTRELSZ]->d_un.d_val == 0 > + || map->l_info[DT_X86_64 (PLTSZ)] == NULL > + || map->l_info[DT_X86_64 (PLTENT)] == NULL > + || map->l_info[DT_X86_64 (PLTENT)]->d_un.d_val < 16 > + || (map->l_info[DT_X86_64 (PLTSZ)]->d_un.d_val > + < map->l_info[DT_X86_64 (PLTENT)]->d_un.d_val) > + || (map->l_info[DT_X86_64 (PLTSZ)]->d_un.d_val > + % map->l_info[DT_X86_64 (PLTENT)]->d_un.d_val) != 0) > + return; > + > + x86_64_rewrite_plt_in_place (map); > +} > +#endif > diff --git a/sysdeps/x86_64/link_map.h b/sysdeps/x86_64/link_map.h > new file mode 100644 > index 0000000000..537f56ace5 > --- /dev/null > +++ b/sysdeps/x86_64/link_map.h > @@ -0,0 +1,22 @@ > +/* Additional fields in struct link_map. x86-64 version. > + Copyright (C) 2024 Free Software Foundation, Inc. > + This file is part of the GNU C Library. > + > + The GNU C Library is free software; you can redistribute it and/or > + modify it under the terms of the GNU Lesser General Public > + License as published by the Free Software Foundation; either > + version 2.1 of the License, or (at your option) any later version. > + > + The GNU C Library is distributed in the hope that it will be useful, > + but WITHOUT ANY WARRANTY; without even the implied warranty of > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > + Lesser General Public License for more details. > + > + You should have received a copy of the GNU Lesser General Public > + License along with the GNU C Library; if not, see > + <https://www.gnu.org/licenses/>. */ > + > +/* Has R_X86_64_JUMP_SLOT relocation. 
*/ > +bool l_has_jump_slot_reloc; > + > +#include <sysdeps/x86/link_map.h> > diff --git a/sysdeps/x86_64/tst-plt-rewrite1.c b/sysdeps/x86_64/tst-plt-rewrite1.c > new file mode 100644 > index 0000000000..86785957e2 > --- /dev/null > +++ b/sysdeps/x86_64/tst-plt-rewrite1.c > @@ -0,0 +1,31 @@ > +/* Test PLT rewrite. > + Copyright (C) 2024 Free Software Foundation, Inc. > + This file is part of the GNU C Library. > + > + The GNU C Library is free software; you can redistribute it and/or > + modify it under the terms of the GNU Lesser General Public > + License as published by the Free Software Foundation; either > + version 2.1 of the License, or (at your option) any later version. > + > + The GNU C Library is distributed in the hope that it will be useful, > + but WITHOUT ANY WARRANTY; without even the implied warranty of > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > + Lesser General Public License for more details. > + > + You should have received a copy of the GNU Lesser General Public > + License along with the GNU C Library; if not, see > + <https://www.gnu.org/licenses/>. */ > + > +#include <string.h> > +#include <support/check.h> > + > +extern const char *foo (void); > + > +static int > +do_test (void) > +{ > + TEST_COMPARE (strcmp (foo (), "PLT rewrite works"), 0); > + return 0; > +} > + > +#include <support/test-driver.c> > diff --git a/sysdeps/x86_64/tst-plt-rewritemod1.c b/sysdeps/x86_64/tst-plt-rewritemod1.c > new file mode 100644 > index 0000000000..99f21fba5a > --- /dev/null > +++ b/sysdeps/x86_64/tst-plt-rewritemod1.c > @@ -0,0 +1,32 @@ > +/* Check PLT rewrite works correctly. > + Copyright (C) 2024 Free Software Foundation, Inc. > + This file is part of the GNU C Library. 
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <https://www.gnu.org/licenses/>. */
> +
> +/* foo calls bar with indirect branch via PLT. PLT rewrite should
> +   change it to direct branch. */
> +
> +const char *
> +bar (void)
> +{
> +  return "PLT rewrite works";
> +}
> +
> +const char *
> +foo (void)
> +{
> +  return bar ();
> +}
> --
> 2.43.0
>

Can you run clang-format on your changes as a whole?
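
To make the two arithmetic points raised on x86_64_rewrite_plt concrete, here is a minimal standalone sketch. It is not glibc code: the helper names (is_power_of_two, plt_entry_end_mask, plt_entry_end_or, fits_in_rel32) are hypothetical and exist only for illustration. It assumes, as the review comment does, that the masking form is only meaningful when the entry size is a power of two, and it copies the patch's 32-bit displacement test as written.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true iff x is a nonzero power of two.  */
static bool
is_power_of_two (uint64_t x)
{
  return x != 0 && (x & (x - 1)) == 0;
}

/* End of the PLT entry containing addr, in the form used by the patch:
   (addr & -pltent) + pltent.  Only meaningful for power-of-two pltent.  */
static uint64_t
plt_entry_end_mask (uint64_t addr, uint64_t pltent)
{
  assert (is_power_of_two (pltent));
  return (addr & -pltent) + pltent;
}

/* The form suggested in review: (addr | (pltent - 1)) + 1.
   Same result as the masking form when pltent is a power of two.  */
static uint64_t
plt_entry_end_or (uint64_t addr, uint64_t pltent)
{
  assert (is_power_of_two (pltent));
  return (addr | (pltent - 1)) + 1;
}

/* Same test as in the patch: true iff disp, read as a signed 64-bit
   value, lies in [INT32_MIN, INT32_MAX].  Adding 0x80000000 shifts that
   range onto [0, UINT32_MAX], so one unsigned compare suffices.  */
static bool
fits_in_rel32 (uint64_t disp)
{
  return (disp + (uint64_t) ((uint32_t) INT32_MIN)) <= (uint64_t) UINT32_MAX;
}
```

Note that an entry size such as 24 would pass the checks quoted from x86_64_dynamic_after_reloc (at least 16, and DT_X86_64_PLTSZ a multiple of it) while failing is_power_of_two, which is the gap the comment points at.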
On Thu, Jan 4, 2024 at 6:00 PM Noah Goldstein <goldstein.w.n@gmail.com> wrote: > > On Thu, Jan 4, 2024 at 8:01 AM H.J. Lu <hjl.tools@gmail.com> wrote: > > > > Changes in v6: > > > > 1. Print skipped PLT entries. > > 2. Always generate DT_X86_64_PLT* tags when supported. > > 3. Never generate DT_X86_64_PLT* tags on ld.so. > > 4. Don't check for ld.so in x86_64_dynamic_after_reloc to reduce the > > run-time overhead. > > > > Changes in v5: > > > > 1. Remove plt_rewrite_enabled and set plt_rewrite in set_plt_rewrite to > > reduce the run-time overhead. > > 2. Use __glibc_likely and __glibc_unlikely in x86_64_dynamic_after_reloc > > to reduce the run-time overhead. > > > > Changes in v4: > > > > 1. Update set_plt_rewrite. > > 2. Add /* fallthrough */ to case R_X86_64_JUMP_SLOT:. > > 3. Remove plt_rewrite_bias since it is always zero. > > 4. Skip and silently ignore unsupported r_addend. > > 5. Update copyright year to 2024. > > 6. Drop the mprotect (PROT_EXEC | PROT_READ) check. > > > > Changes in v3: > > > > 1. Define and use macros for instruction opcodes and sizes. > > 2. Use INT32_MIN and UINT32_MAX instead of 0x80000000ULL and > > 0xffffffffULL. > > 3. Replace 3 1-byte writes with a 4-byte write to write JMPABS. > > 4. Verify that DT_X86_64_PLTSZ is a multiple of DT_X86_64_PLTENT. > > 5. Add some comments for mprotect test. > > 6. Update plt_rewrite logic. > > > > Add ELF_DYNAMIC_AFTER_RELOC to allow target specific processing after > > relocation. > > > > For x86-64, add > > > > #define DT_X86_64_PLT (DT_LOPROC + 0) > > #define DT_X86_64_PLTSZ (DT_LOPROC + 1) > > #define DT_X86_64_PLTENT (DT_LOPROC + 3) > > > > 1. DT_X86_64_PLT: The address of the procedure linkage table. > > 2. DT_X86_64_PLTSZ: The total size, in bytes, of the procedure linkage > > table. > > 3. DT_X86_64_PLTENT: The size, in bytes, of a procedure linkage table > > entry. 
> > > > With the r_addend field of the R_X86_64_JUMP_SLOT relocation set to the > > memory offset of the indirect branch instruction. > > > > Define ELF_DYNAMIC_AFTER_RELOC for x86-64 to rewrite the PLT section > > with direct branch after relocation when the lazy binding is disabled. > > > > PLT rewrite is disabled by default since SELinux may disallow modifying > > code pages and ld.so can't detect it in all cases. Add > > > > $ GLIBC_TUNABLES=glibc.cpu.plt_rewrite=1 > > > > to enable PLT rewrite at run-time. > > --- > > elf/dynamic-link.h | 5 + > > elf/elf.h | 5 + > > elf/tst-glibcelf.py | 1 + > > manual/tunables.texi | 9 ++ > > scripts/glibcelf.py | 4 + > > sysdeps/x86/cet-control.h | 12 ++ > > sysdeps/x86/cpu-features.c | 20 ++- > > sysdeps/x86/dl-procruntime.c | 1 + > > sysdeps/x86/dl-tunables.list | 5 + > > sysdeps/x86_64/Makefile | 27 ++++ > > sysdeps/x86_64/configure | 35 +++++ > > sysdeps/x86_64/configure.ac | 4 + > > sysdeps/x86_64/dl-dtprocnum.h | 21 +++ > > sysdeps/x86_64/dl-machine.h | 216 ++++++++++++++++++++++++++- > > sysdeps/x86_64/link_map.h | 22 +++ > > sysdeps/x86_64/tst-plt-rewrite1.c | 31 ++++ > > sysdeps/x86_64/tst-plt-rewritemod1.c | 32 ++++ > > 17 files changed, 448 insertions(+), 2 deletions(-) > > create mode 100644 sysdeps/x86_64/dl-dtprocnum.h > > create mode 100644 sysdeps/x86_64/link_map.h > > create mode 100644 sysdeps/x86_64/tst-plt-rewrite1.c > > create mode 100644 sysdeps/x86_64/tst-plt-rewritemod1.c > > > > diff --git a/elf/dynamic-link.h b/elf/dynamic-link.h > > index 8cdf7bde09..83d834ecaf 100644 > > --- a/elf/dynamic-link.h > > +++ b/elf/dynamic-link.h > > @@ -177,6 +177,10 @@ elf_machine_lazy_rel (struct link_map *map, struct r_scope_elem *scope[], > > } \ > > } while (0); > > > > +# ifndef ELF_DYNAMIC_AFTER_RELOC > > +# define ELF_DYNAMIC_AFTER_RELOC(map, lazy) > > +# endif > > + > > /* This can't just be an inline function because GCC is too dumb > > to inline functions containing inlines themselves. 
*/ > > # ifdef RTLD_BOOTSTRAP > > @@ -192,6 +196,7 @@ elf_machine_lazy_rel (struct link_map *map, struct r_scope_elem *scope[], > > ELF_DYNAMIC_DO_RELR (map); \ > > ELF_DYNAMIC_DO_REL ((map), (scope), edr_lazy, skip_ifunc); \ > > ELF_DYNAMIC_DO_RELA ((map), (scope), edr_lazy, skip_ifunc); \ > > + ELF_DYNAMIC_AFTER_RELOC ((map), (edr_lazy)); \ > > } while (0) > > > > #endif > > diff --git a/elf/elf.h b/elf/elf.h > > index ca6a7d9d67..455731663c 100644 > > --- a/elf/elf.h > > +++ b/elf/elf.h > > @@ -3639,6 +3639,11 @@ enum > > /* x86-64 sh_type values. */ > > #define SHT_X86_64_UNWIND 0x70000001 /* Unwind information. */ > > > > +/* x86-64 d_tag values. */ > > +#define DT_X86_64_PLT (DT_LOPROC + 0) > > +#define DT_X86_64_PLTSZ (DT_LOPROC + 1) > > +#define DT_X86_64_PLTENT (DT_LOPROC + 3) > > +#define DT_X86_64_NUM 4 > > > > /* AM33 relocations. */ > > #define R_MN10300_NONE 0 /* No reloc. */ > > diff --git a/elf/tst-glibcelf.py b/elf/tst-glibcelf.py > > index 00cd2bba85..c191636a99 100644 > > --- a/elf/tst-glibcelf.py > > +++ b/elf/tst-glibcelf.py > > @@ -187,6 +187,7 @@ DT_VALNUM > > DT_VALRNGHI > > DT_VALRNGLO > > DT_VERSIONTAGNUM > > +DT_X86_64_NUM > > ELFCLASSNUM > > ELFDATANUM > > EM_NUM > > diff --git a/manual/tunables.texi b/manual/tunables.texi > > index b31f16da84..f9bd83622e 100644 > > --- a/manual/tunables.texi > > +++ b/manual/tunables.texi > > @@ -57,6 +57,7 @@ glibc.pthread.stack_cache_size: 0x2800000 (min: 0x0, max: 0xffffffffffffffff) > > glibc.cpu.hwcap_mask: 0x6 (min: 0x0, max: 0xffffffffffffffff) > > glibc.malloc.mmap_max: 0 (min: 0, max: 2147483647) > > glibc.elision.skip_trylock_internal_abort: 3 (min: 0, max: 2147483647) > > +glibc.cpu.plt_rewrite: 0 (min: 0, max: 1) > > glibc.malloc.tcache_unsorted_limit: 0x0 (min: 0x0, max: 0xffffffffffffffff) > > glibc.cpu.x86_ibt: > > glibc.cpu.hwcaps: > > @@ -614,6 +615,14 @@ this tunable. > > This tunable is specific to 64-bit x86-64. 
> > @end deftp > > > > +@deftp Tunable glibc.cpu.plt_rewrite > > +When this tunable is set to @code{1}, the dynamic linker will rewrite > > +the PLT section with direct branch after relocation if possible when > > +the lazy binding is disabled. > > + > > This doesn't read well. Maybe > > When this tunable is set to @code{1}, the dynamic linker will attempt > to rewrite the PLT section with a direct branch after relocation. It may > fail to do so, for example when lazy binding is enabled. Will update. > > +This tunable is specific to x86-64. > > +@end deftp > > + > > @node Memory Related Tunables > > @section Memory Related Tunables > > @cindex memory related tunables > > diff --git a/scripts/glibcelf.py b/scripts/glibcelf.py > > index c5e5dda48e..5f3813f326 100644 > > --- a/scripts/glibcelf.py > > +++ b/scripts/glibcelf.py > > @@ -439,6 +439,8 @@ class DtRISCV(Dt): > > """Supplemental DT_* constants for EM_RISCV.""" > > class DtSPARC(Dt): > > """Supplemental DT_* constants for EM_SPARC.""" > > +class DtX86_64(Dt): > > + """Supplemental DT_* constants for EM_X86_64.""" > > _dt_skip = ''' > > DT_ENCODING DT_PROCNUM > > DT_ADDRRNGLO DT_ADDRRNGHI DT_ADDRNUM > > @@ -451,6 +453,7 @@ DT_MIPS_NUM > > DT_PPC_NUM > > DT_PPC64_NUM > > DT_SPARC_NUM > > +DT_X86_64_NUM > > '''.strip().split() > > _register_elf_h(DtAARCH64, prefix='DT_AARCH64_', skip=_dt_skip, parent=Dt) > > _register_elf_h(DtALPHA, prefix='DT_ALPHA_', skip=_dt_skip, parent=Dt) > > @@ -461,6 +464,7 @@ _register_elf_h(DtPPC, prefix='DT_PPC_', skip=_dt_skip, parent=Dt) > > _register_elf_h(DtPPC64, prefix='DT_PPC64_', skip=_dt_skip, parent=Dt) > > _register_elf_h(DtRISCV, prefix='DT_RISCV_', skip=_dt_skip, parent=Dt) > > _register_elf_h(DtSPARC, prefix='DT_SPARC_', skip=_dt_skip, parent=Dt) > > +_register_elf_h(DtX86_64, prefix='DT_X86_64_', skip=_dt_skip, parent=Dt) > > _register_elf_h(Dt, skip=_dt_skip, ranges=True) > > del _dt_skip > > > > diff --git a/sysdeps/x86/cet-control.h b/sysdeps/x86/cet-control.h > > 
index a45d59bf8c..81e7bb4bd8 100644 > > --- a/sysdeps/x86/cet-control.h > > +++ b/sysdeps/x86/cet-control.h > > @@ -32,10 +32,22 @@ enum dl_x86_cet_control > > cet_permissive > > }; > > > > +/* PLT rewrite control. */ > > +enum dl_plt_rewrite_control > > +{ > > + /* No PLT rewrite. */ > > + plt_rewrite_none, > > + /* Rewrite PLT with JMP at run-time. */ > > + plt_rewrite_jmp, > > + /* Rewrite PLT with JMPABS at run-time. */ > > + plt_rewrite_jmpabs > > +}; > > + > > struct dl_x86_feature_control > > { > > enum dl_x86_cet_control ibt : 2; > > enum dl_x86_cet_control shstk : 2; > > + enum dl_plt_rewrite_control plt_rewrite : 2; > > }; > > > > #endif /* cet-control.h */ > > diff --git a/sysdeps/x86/cpu-features.c b/sysdeps/x86/cpu-features.c > > index f193ea7a2d..ccf1350b72 100644 > > --- a/sysdeps/x86/cpu-features.c > > +++ b/sysdeps/x86/cpu-features.c > > @@ -27,6 +27,21 @@ > > extern void TUNABLE_CALLBACK (set_hwcaps) (tunable_val_t *) > > attribute_hidden; > > > > +#ifdef SHARED > > +static void > > +TUNABLE_CALLBACK (set_plt_rewrite) (tunable_val_t *valp) > > +{ > > + if (valp->numval != 0) > > + { > > + /* Use JMPABS only on APX processors. */ > > + const struct cpu_features *cpu_features = __get_cpu_features (); > > + GL(dl_x86_feature_control).plt_rewrite > > + = (CPU_FEATURE_PRESENT_P (cpu_features, APX_F) > > + ? plt_rewrite_jmpabs : plt_rewrite_jmp); > > + } > > +} > > +#endif > > + > > #ifdef __LP64__ > > static void > > TUNABLE_CALLBACK (set_prefer_map_32bit_exec) (tunable_val_t *valp) > > @@ -1108,7 +1123,10 @@ no_cpuid: > > TUNABLE_CALLBACK (set_x86_shstk)); > > #endif > > > > -#ifndef SHARED > > +#ifdef SHARED > > + TUNABLE_GET (plt_rewrite, tunable_val_t *, > > + TUNABLE_CALLBACK (set_plt_rewrite)); > > +#else > > /* NB: In libc.a, call init_cacheinfo. 
*/ > > init_cacheinfo (); > > #endif > > diff --git a/sysdeps/x86/dl-procruntime.c b/sysdeps/x86/dl-procruntime.c > > index 4d25d9f327..15b3d0d878 100644 > > --- a/sysdeps/x86/dl-procruntime.c > > +++ b/sysdeps/x86/dl-procruntime.c > > @@ -67,6 +67,7 @@ PROCINFO_CLASS struct dl_x86_feature_control _dl_x86_feature_control > > = { > > .ibt = DEFAULT_DL_X86_CET_CONTROL, > > .shstk = DEFAULT_DL_X86_CET_CONTROL, > > + .plt_rewrite = plt_rewrite_none, > > } > > # endif > > # if !defined SHARED || defined PROCINFO_DECL > > diff --git a/sysdeps/x86/dl-tunables.list b/sysdeps/x86/dl-tunables.list > > index 147a7270ec..e2e441e1b7 100644 > > --- a/sysdeps/x86/dl-tunables.list > > +++ b/sysdeps/x86/dl-tunables.list > > @@ -66,5 +66,10 @@ glibc { > > x86_shared_cache_size { > > type: SIZE_T > > } > > + plt_rewrite { > > + type: INT_32 > > + minval: 0 > > + maxval: 1 > > + } > Should the max value be at least 2, given that there are three > distinct states (none, jmp, jmpabs)? I will take a look. > > } > > } > > diff --git a/sysdeps/x86_64/Makefile b/sysdeps/x86_64/Makefile > > index 00120ca9ca..374bca80d0 100644 > > --- a/sysdeps/x86_64/Makefile > > +++ b/sysdeps/x86_64/Makefile > > @@ -1,6 +1,14 @@ > > # The i387 `long double' is a distinct type we support. > > long-double-fcts = yes > > > > +ifeq (yes,$(have-z-mark-plt)) > > +# Always generate DT_X86_64_PLT* tags. > > +sysdep-LDFLAGS += -Wl,-z,mark-plt > > +# Never generate DT_X86_64_PLT* tags on ld.so to avoid changing its own > > +# PLT. 
> > +LDFLAGS-rtld += -Wl,-z,nomark-plt > > +endif > > + > > ifeq ($(subdir),csu) > > gen-as-const-headers += link-defines.sym > > endif > > @@ -175,6 +183,25 @@ ifeq (no,$(build-hardcoded-path-in-tests)) > > tests-container += tst-glibc-hwcaps-cache > > endif > > > > +ifeq (yes,$(have-z-mark-plt)) > > +tests += \ > > + tst-plt-rewrite1 \ > > +# tests > > +modules-names += \ > > + tst-plt-rewritemod1 \ > > +# modules-names > > + > > +tst-plt-rewrite1-no-pie = yes > > +LDFLAGS-tst-plt-rewrite1 = -Wl,-z,now > > +LDFLAGS-tst-plt-rewritemod1.so = -Wl,-z,now > > +tst-plt-rewrite1-ENV = GLIBC_TUNABLES=glibc.cpu.plt_rewrite=1 LD_DEBUG=files:bindings > > +$(objpfx)tst-plt-rewrite1: $(objpfx)tst-plt-rewritemod1.so > > +$(objpfx)tst-plt-rewrite1.out: /dev/null $(objpfx)tst-plt-rewrite1 > > + $(tst-plt-rewrite1-ENV) $(make-test-out) > $@ 2>&1; \ > > + grep -q -E "changing 'bar' PLT entry in .*/elf/tst-plt-rewritemod1.so' to direct branch" $@; \ > > + $(evaluate-test) > > +endif > > + > > endif # $(subdir) == elf > > > > ifeq ($(subdir),csu) > > diff --git a/sysdeps/x86_64/configure b/sysdeps/x86_64/configure > > index b4a80b8035..418cc4a9b8 100755 > > --- a/sysdeps/x86_64/configure > > +++ b/sysdeps/x86_64/configure > > @@ -25,6 +25,41 @@ printf "%s\n" "$libc_cv_cc_mprefer_vector_width" >&6; } > > config_vars="$config_vars > > config-cflags-mprefer-vector-width = $libc_cv_cc_mprefer_vector_width" > > > > +{ printf "%s\n" "$as_me:${as_lineno-$LINENO}: checking for linker that supports -z mark-plt" >&5 > > +printf %s "checking for linker that supports -z mark-plt... " >&6; } > > +libc_linker_feature=no > > +cat > conftest.c <<EOF > > +int _start (void) { return 42; } > > +EOF > > +if { ac_try='${CC-cc} $CFLAGS $CPPFLAGS $LDFLAGS $no_ssp > > + -Wl,-z,mark-plt -nostdlib -nostartfiles > > + -fPIC -shared -o conftest.so conftest.c > > + 1>&5' > > + { { eval echo "\"\$as_me\":${as_lineno-$LINENO}: \"$ac_try\""; } >&5 > > + (eval $ac_try) 2>&5 > > + ac_status=$? 
> > + printf "%s\n" "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5 > > + test $ac_status = 0; }; } > > +then > > + if ${CC-cc} $CFLAGS $CPPFLAGS $LDFLAGS $no_ssp -Wl,-z,mark-plt -nostdlib \ > > + -nostartfiles -fPIC -shared -o conftest.so conftest.c 2>&1 \ > > + | grep "warning: -z mark-plt ignored" > /dev/null 2>&1; then > > + true > > + else > > + libc_linker_feature=yes > > + fi > > +fi > > +rm -f conftest* > > +if test $libc_linker_feature = yes; then > > + libc_cv_z_mark_plt=yes > > +else > > + libc_cv_z_mark_plt=no > > +fi > > +{ printf "%s\n" "$as_me:${as_lineno-$LINENO}: result: $libc_linker_feature" >&5 > > +printf "%s\n" "$libc_linker_feature" >&6; } > > +config_vars="$config_vars > > +have-z-mark-plt = $libc_cv_z_mark_plt" > > + > > if test x"$build_mathvec" = xnotset; then > > build_mathvec=yes > > fi > > diff --git a/sysdeps/x86_64/configure.ac b/sysdeps/x86_64/configure.ac > > index 937d1aff7e..d1f803c02e 100644 > > --- a/sysdeps/x86_64/configure.ac > > +++ b/sysdeps/x86_64/configure.ac > > @@ -10,6 +10,10 @@ LIBC_TRY_CC_OPTION([-mprefer-vector-width=128], > > LIBC_CONFIG_VAR([config-cflags-mprefer-vector-width], > > [$libc_cv_cc_mprefer_vector_width]) > > > > +LIBC_LINKER_FEATURE([-z mark-plt], [-Wl,-z,mark-plt], > > + [libc_cv_z_mark_plt=yes], [libc_cv_z_mark_plt=no]) > > +LIBC_CONFIG_VAR([have-z-mark-plt], [$libc_cv_z_mark_plt]) > > + > > if test x"$build_mathvec" = xnotset; then > > build_mathvec=yes > > fi > > diff --git a/sysdeps/x86_64/dl-dtprocnum.h b/sysdeps/x86_64/dl-dtprocnum.h > > new file mode 100644 > > index 0000000000..cefacb5387 > > --- /dev/null > > +++ b/sysdeps/x86_64/dl-dtprocnum.h > > @@ -0,0 +1,21 @@ > > +/* Configuration of lookup functions. x64-64 version. > > + Copyright (C) 2024 Free Software Foundation, Inc. > > + This file is part of the GNU C Library. 
> > + > > + The GNU C Library is free software; you can redistribute it and/or > > + modify it under the terms of the GNU Lesser General Public > > + License as published by the Free Software Foundation; either > > + version 2.1 of the License, or (at your option) any later version. > > + > > + The GNU C Library is distributed in the hope that it will be useful, > > + but WITHOUT ANY WARRANTY; without even the implied warranty of > > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > > + Lesser General Public License for more details. > > + > > + You should have received a copy of the GNU Lesser General Public > > + License along with the GNU C Library; if not, see > > + <https://www.gnu.org/licenses/>. */ > > + > > +/* Number of extra dynamic section entries for this architecture. By > > + default there are none. */ > > +#define DT_THISPROCNUM DT_X86_64_NUM > > diff --git a/sysdeps/x86_64/dl-machine.h b/sysdeps/x86_64/dl-machine.h > > index e0a9b14469..b075482ef4 100644 > > --- a/sysdeps/x86_64/dl-machine.h > > +++ b/sysdeps/x86_64/dl-machine.h > > @@ -22,6 +22,7 @@ > > #define ELF_MACHINE_NAME "x86_64" > > > > #include <assert.h> > > +#include <stdint.h> > > #include <sys/param.h> > > #include <sysdep.h> > > #include <tls.h> > > @@ -35,6 +36,9 @@ > > # define RTLD_START_ENABLE_X86_FEATURES > > #endif > > > > +/* Translate a processor specific dynamic tag to the index in l_info array. */ > > +#define DT_X86_64(x) (DT_X86_64_##x - DT_LOPROC + DT_NUM) > > + > > /* Return nonzero iff ELF header is compatible with the running host. 
*/ > > static inline int __attribute__ ((unused)) > > elf_machine_matches_host (const ElfW(Ehdr) *ehdr) > > @@ -312,8 +316,10 @@ and creates an unsatisfiable circular dependency.\n", > > > > switch (r_type) > > { > > - case R_X86_64_GLOB_DAT: > > case R_X86_64_JUMP_SLOT: > > + map->l_has_jump_slot_reloc = true; > > + /* fallthrough */ > > + case R_X86_64_GLOB_DAT: > > *reloc_addr = value; > > break; > > > > @@ -549,3 +555,211 @@ elf_machine_lazy_rel (struct link_map *map, struct r_scope_elem *scope[], > > } > > > > #endif /* RESOLVE_MAP */ > > + > > +#if !defined ELF_DYNAMIC_AFTER_RELOC && !defined RTLD_BOOTSTRAP \ > > + && defined SHARED > > +# define ELF_DYNAMIC_AFTER_RELOC(map, lazy) \ > > + x86_64_dynamic_after_reloc (map, (lazy)) > > + > > +# define JMP32_INSN_OPCODE 0xe9 > > +# define JMP32_INSN_SIZE 5 > > +# define JMPABS_INSN_OPCODE 0xa100d5 > > +# define JMPABS_INSN_SIZE 11 > > +# define INT3_INSN_OPCODE 0xcc > > + > > +static const char * > > +x86_64_reloc_symbol_name (struct link_map *map, const ElfW(Rela) *reloc) > > +{ > > + const ElfW(Sym) *const symtab > > + = (const void *) map->l_info[DT_SYMTAB]->d_un.d_ptr; > > + const ElfW(Sym) *const refsym = &symtab[ELFW (R_SYM) (reloc->r_info)]; > > + const char *strtab = (const char *) map->l_info[DT_STRTAB]->d_un.d_ptr; > > + return strtab + refsym->st_name; > > +} > > + > > +static void > > +x86_64_rewrite_plt (struct link_map *map, ElfW(Addr) plt_rewrite) > > +{ > > + ElfW(Addr) l_addr = map->l_addr; > > + ElfW(Addr) pltent = map->l_info[DT_X86_64 (PLTENT)]->d_un.d_val; > > + ElfW(Addr) start = map->l_info[DT_JMPREL]->d_un.d_ptr; > > + ElfW(Addr) size = map->l_info[DT_PLTRELSZ]->d_un.d_val; > > + const ElfW(Rela) *reloc = (const void *) start; > > + const ElfW(Rela) *reloc_end = (const void *) (start + size); > > + > > + unsigned int feature_1 = THREAD_GETMEM (THREAD_SELF, > > + header.feature_1); > > + bool ibt_enabled_p > > + = (feature_1 & GNU_PROPERTY_X86_FEATURE_1_IBT) != 0; > > + > > + if 
(__glibc_unlikely (GLRO(dl_debug_mask) & DL_DEBUG_FILES)) > > + _dl_debug_printf ("\nchanging PLT in '%s' to direct branch\n", > > + DSO_FILENAME (map->l_name)); > > + > > + for (; reloc < reloc_end; reloc++) > > + if (ELFW(R_TYPE) (reloc->r_info) == R_X86_64_JUMP_SLOT) > > + { > > + /* Get the value from the GOT entry. */ > > + ElfW(Addr) value = *(ElfW(Addr) *) (l_addr + reloc->r_offset); > > + > > + /* Get the corresponding PLT entry from r_addend. */ > > + ElfW(Addr) branch_start = l_addr + reloc->r_addend; > > + /* Skip ENDBR64 if IBT isn't enabled. */ > > + if (!ibt_enabled_p) > > + branch_start = ALIGN_DOWN (branch_start, pltent); > > Will the only preceding code always be the ENDBR64? > If so, why not just replace the alignment logic with `- ENDBR64_INSN_SIZE`? ENDBR64 isn't required. If CET isn't enabled, there is no ENDBR64. > Likewise below. > > > + /* Get the displacement from the branch target. */ > > + ElfW(Addr) disp = value - branch_start - JMP32_INSN_SIZE; > > + ElfW(Addr) plt_end; > > + ElfW(Addr) pad; > > + > > + plt_end = (branch_start & -pltent) + pltent; > For this to make sense, `pltent` needs to be a power of 2, which is not > checked below in `x86_64_dynamic_after_reloc`. Likewise at > the ALIGN_DOWN above. It is safe. The x86-64 psABI commit https://gitlab.com/x86-psABIs/x86-64-ABI/-/commit/6d824a52a42d173eb838b879616c1be5870b593e states: "All entries have the same size and are aligned to the entry size.". > Also, better codegen: `(branch_start | (pltent - 1)) + 1` > > + > > + /* Update the PLT entry. */ > > + if (((uint64_t) disp + (uint64_t) ((uint32_t) INT32_MIN)) > > + <= (uint64_t) UINT32_MAX) > > + { > > + pad = branch_start + JMP32_INSN_SIZE; > > + > > + if (__glibc_unlikely (pad > plt_end)) > > + continue; > > + > > + /* If the target branch can be reached with a direct branch, > > + rewrite the PLT entry with a direct branch. 
*/ > > + if (__glibc_unlikely (GLRO(dl_debug_mask) & DL_DEBUG_BINDINGS)) > > + { > > + const char *sym_name = x86_64_reloc_symbol_name (map, > > + reloc); > > + _dl_debug_printf ("changing '%s' PLT entry in '%s' to " > > + "direct branch\n", sym_name, > > + DSO_FILENAME (map->l_name)); > > + } > > + > > + /* Write out direct branch. */ > > + *(uint8_t *) branch_start = JMP32_INSN_OPCODE; > > + *((uint32_t *) (branch_start + 1)) = disp; > > + } > > + else > > + { > > + if (GL(dl_x86_feature_control).plt_rewrite > > + != plt_rewrite_jmpabs) > > + { > > + if (__glibc_unlikely (GLRO(dl_debug_mask) > > + & DL_DEBUG_BINDINGS)) > > + { > > + const char *sym_name > > + = x86_64_reloc_symbol_name (map, reloc); > > + _dl_debug_printf ("skipping '%s' PLT entry in '%s'\n", > > + sym_name, > > + DSO_FILENAME (map->l_name)); > > + } > > + continue; > > + } > > + > > + pad = branch_start + JMPABS_INSN_SIZE; > > + > > + if (__glibc_unlikely (pad > plt_end)) > > + continue; > > + > > + /* Rewrite the PLT entry with JMPABS. */ > > + if (__glibc_unlikely (GLRO(dl_debug_mask) & DL_DEBUG_BINDINGS)) > > + { > > + const char *sym_name = x86_64_reloc_symbol_name (map, > > + reloc); > > + _dl_debug_printf ("changing '%s' PLT entry in '%s' to " > > + "JMPABS\n", sym_name, > > + DSO_FILENAME (map->l_name)); > > + } > > + > > + /* "jmpabs $target" for 64-bit displacement. NB: JMPABS has > > + a 3-byte opcode + 64bit address. There is a 1-byte overlap > > + between 4-byte write and 8-byte write. */ > > + *(uint32_t *) (branch_start) = JMPABS_INSN_OPCODE; > > + *(uint64_t *) (branch_start + 3) = value; > > + } > > + > > + /* Fill the unused part of the PLT entry with INT3. */ > > + for (; pad < plt_end; pad++) > > + *(uint8_t *) pad = INT3_INSN_OPCODE; > nit: can you put braces around this for loop? It's a bit deceiving > to the eye with the brace directly after it. It is quite clear to me since I wrote it. Can you be more specific? 
> > + } > > +} > > + > > +static inline void > > +x86_64_rewrite_plt_in_place (struct link_map *map) > > +{ > > + /* Adjust DT_X86_64_PLT address and DT_X86_64_PLTSZ values. */ > > + ElfW(Addr) plt = (map->l_info[DT_X86_64 (PLT)]->d_un.d_ptr > > + + map->l_addr); > > + size_t pagesize = GLRO(dl_pagesize); > > + ElfW(Addr) plt_aligned = ALIGN_DOWN (plt, pagesize); > > + size_t pltsz = (map->l_info[DT_X86_64 (PLTSZ)]->d_un.d_val > > + + plt - plt_aligned); > > + > > + if (__glibc_unlikely (GLRO(dl_debug_mask) & DL_DEBUG_FILES)) > > + _dl_debug_printf ("\nchanging PLT in '%s' to writable\n", > > + DSO_FILENAME (map->l_name)); > > + > > + if (__glibc_unlikely (__mprotect ((void *) plt_aligned, pltsz, > > + PROT_WRITE | PROT_READ) < 0)) > > + { > > + if (__glibc_unlikely (GLRO(dl_debug_mask) & DL_DEBUG_FILES)) > > + _dl_debug_printf ("\nfailed to change PLT in '%s' to writable\n", > > + DSO_FILENAME (map->l_name)); > > + return; > > + } > > + > > + x86_64_rewrite_plt (map, plt_aligned); > > + > > + if (__glibc_unlikely (GLRO(dl_debug_mask) & DL_DEBUG_FILES)) > > + _dl_debug_printf ("\nchanging PLT in '%s' back to read-only\n", > > + DSO_FILENAME (map->l_name)); > > + > > + if (__glibc_unlikely (__mprotect ((void *) plt_aligned, pltsz, > > + PROT_EXEC | PROT_READ) < 0)) > > + _dl_signal_error (0, DSO_FILENAME (map->l_name), NULL, > > + "failed to change PLT back to read-only"); > > +} > > + > > +/* Rewrite PLT entries to direct branch if possible. */ > > + > > +static inline void > > +x86_64_dynamic_after_reloc (struct link_map *map, int lazy) > > +{ > > + /* Ignore DT_X86_64_PLT if the lazy binding is enabled. */ > > + if (lazy != 0) > > + return; > > + > > + /* Ignore DT_X86_64_PLT if PLT rewrite isn't enabled. 
*/ > > + if (__glibc_likely (GL(dl_x86_feature_control).plt_rewrite > > + == plt_rewrite_none)) > > + return; > > + > > + if (__glibc_likely (map->l_info[DT_X86_64 (PLT)] == NULL)) > > + return; > > + > > + /* Ignore DT_X86_64_PLT if there is no R_X86_64_JUMP_SLOT. */ > > + if (map->l_has_jump_slot_reloc == 0) > > + return; > > + > > + /* Ignore DT_X86_64_PLT if > > + 1. DT_JMPREL isn't available or its value is 0. > > + 2. DT_PLTRELSZ is 0. > > + 3. DT_X86_64_PLTENT isn't available or its value is smaller than > > + 16 bytes. > > + 4. DT_X86_64_PLTSZ isn't available or its value is smaller than > > + DT_X86_64_PLTENT's value or isn't a multiple of DT_X86_64_PLTENT's > > + value. */ > > + if (map->l_info[DT_JMPREL] == NULL > > + || map->l_info[DT_JMPREL]->d_un.d_ptr == 0 > > + || map->l_info[DT_PLTRELSZ]->d_un.d_val == 0 > > + || map->l_info[DT_X86_64 (PLTSZ)] == NULL > > + || map->l_info[DT_X86_64 (PLTENT)] == NULL > > + || map->l_info[DT_X86_64 (PLTENT)]->d_un.d_val < 16 > > + || (map->l_info[DT_X86_64 (PLTSZ)]->d_un.d_val > > + < map->l_info[DT_X86_64 (PLTENT)]->d_un.d_val) > > + || (map->l_info[DT_X86_64 (PLTSZ)]->d_un.d_val > > + % map->l_info[DT_X86_64 (PLTENT)]->d_un.d_val) != 0) > > + return; > > + > > + x86_64_rewrite_plt_in_place (map); > > +} > > +#endif > > diff --git a/sysdeps/x86_64/link_map.h b/sysdeps/x86_64/link_map.h > > new file mode 100644 > > index 0000000000..537f56ace5 > > --- /dev/null > > +++ b/sysdeps/x86_64/link_map.h > > @@ -0,0 +1,22 @@ > > +/* Additional fields in struct link_map. x86-64 version. > > + Copyright (C) 2024 Free Software Foundation, Inc. > > + This file is part of the GNU C Library. > > + > > + The GNU C Library is free software; you can redistribute it and/or > > + modify it under the terms of the GNU Lesser General Public > > + License as published by the Free Software Foundation; either > > + version 2.1 of the License, or (at your option) any later version. 
> > + > > + The GNU C Library is distributed in the hope that it will be useful, > > + but WITHOUT ANY WARRANTY; without even the implied warranty of > > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > > + Lesser General Public License for more details. > > + > > + You should have received a copy of the GNU Lesser General Public > > + License along with the GNU C Library; if not, see > > + <https://www.gnu.org/licenses/>. */ > > + > > +/* Has R_X86_64_JUMP_SLOT relocation. */ > > +bool l_has_jump_slot_reloc; > > + > > +#include <sysdeps/x86/link_map.h> > > diff --git a/sysdeps/x86_64/tst-plt-rewrite1.c b/sysdeps/x86_64/tst-plt-rewrite1.c > > new file mode 100644 > > index 0000000000..86785957e2 > > --- /dev/null > > +++ b/sysdeps/x86_64/tst-plt-rewrite1.c > > @@ -0,0 +1,31 @@ > > +/* Test PLT rewrite. > > + Copyright (C) 2024 Free Software Foundation, Inc. > > + This file is part of the GNU C Library. > > + > > + The GNU C Library is free software; you can redistribute it and/or > > + modify it under the terms of the GNU Lesser General Public > > + License as published by the Free Software Foundation; either > > + version 2.1 of the License, or (at your option) any later version. > > + > > + The GNU C Library is distributed in the hope that it will be useful, > > + but WITHOUT ANY WARRANTY; without even the implied warranty of > > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > > + Lesser General Public License for more details. > > + > > + You should have received a copy of the GNU Lesser General Public > > + License along with the GNU C Library; if not, see > > + <https://www.gnu.org/licenses/>. 
*/ > > + > > +#include <string.h> > > +#include <support/check.h> > > + > > +extern const char *foo (void); > > + > > +static int > > +do_test (void) > > +{ > > + TEST_COMPARE (strcmp (foo (), "PLT rewrite works"), 0); > > + return 0; > > +} > > + > > +#include <support/test-driver.c> > > diff --git a/sysdeps/x86_64/tst-plt-rewritemod1.c b/sysdeps/x86_64/tst-plt-rewritemod1.c > > new file mode 100644 > > index 0000000000..99f21fba5a > > --- /dev/null > > +++ b/sysdeps/x86_64/tst-plt-rewritemod1.c > > @@ -0,0 +1,32 @@ > > +/* Check PLT rewrite works correctly. > > + Copyright (C) 2024 Free Software Foundation, Inc. > > + This file is part of the GNU C Library. > > + > > + The GNU C Library is free software; you can redistribute it and/or > > + modify it under the terms of the GNU Lesser General Public > > + License as published by the Free Software Foundation; either > > + version 2.1 of the License, or (at your option) any later version. > > + > > + The GNU C Library is distributed in the hope that it will be useful, > > + but WITHOUT ANY WARRANTY; without even the implied warranty of > > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > > + Lesser General Public License for more details. > > + > > + You should have received a copy of the GNU Lesser General Public > > + License along with the GNU C Library; if not, see > > + <https://www.gnu.org/licenses/>. */ > > + > > +/* foo calls bar with indirect branch via PLT. PLT rewrite should > > + change it to direct branch. */ > > + > > +const char * > > +bar (void) > > +{ > > + return "PLT rewrite works"; > > +} > > + > > +const char * > > +foo (void) > > +{ > > + return bar (); > > +} > > -- > > 2.43.0 > > > > can you run clang-format on your changes as a whole? Will do. Thanks.
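For readers following the review, the two arithmetic points discussed above (the 32-bit displacement test and the `plt_end` computation) can be checked outside ld.so. What follows is only a minimal standalone sketch, not code from the patch: the helper names `fits_rel32`, `plt_end_and` and `plt_end_or` are hypothetical, and the addresses in the usage note are made up. It relies on the identity `(x | m) == (x & ~m) + m`, which makes the two `plt_end` forms equal for any entry size; whether the result is the real end of the PLT entry still depends on entries being aligned to a power-of-two entry size, which is the assumption the review questioned.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper mirroring the range test in the patch: DISP is a
   64-bit two's-complement displacement.  Adding 2^31 maps the signed
   32-bit range [-2^31, 2^31 - 1] onto [0, 2^32 - 1], so one unsigned
   comparison decides whether a 5-byte "jmp rel32" can reach the target.  */
int
fits_rel32 (uint64_t disp)
{
  return (disp + (uint64_t) (uint32_t) INT32_MIN) <= (uint64_t) UINT32_MAX;
}

/* End of the PLT entry containing BRANCH_START, in the form used by the
   patch.  -PLTENT is unsigned negation, i.e. ~(PLTENT - 1).  */
uint64_t
plt_end_and (uint64_t branch_start, uint64_t pltent)
{
  assert (pltent != 0);
  return (branch_start & -pltent) + pltent;
}

/* The same value in the form suggested in the review.  */
uint64_t
plt_end_or (uint64_t branch_start, uint64_t pltent)
{
  assert (pltent != 0);
  return (branch_start | (pltent - 1)) + 1;
}
```

With a 16-byte entry starting at the made-up address 0x1030, a branch at 0x1034 (just past a 4-byte ENDBR64) gives 0x1040 from both `plt_end_and` and `plt_end_or`, so the choice between them is about generated code, not about the result.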
On Thu, Jan 4, 2024 at 6:14 PM H.J. Lu <hjl.tools@gmail.com> wrote: > > On Thu, Jan 4, 2024 at 6:00 PM Noah Goldstein <goldstein.w.n@gmail.com> wrote: > > > > On Thu, Jan 4, 2024 at 8:01 AM H.J. Lu <hjl.tools@gmail.com> wrote: > > > > > > Changes in v6: > > > > > > 1. Print skipped PLT entries. > > > 2. Always generate DT_X86_64_PLT* tags when supported. > > > 3. Never generate DT_X86_64_PLT* tags on ld.so. > > > 4. Don't check for ld.so in x86_64_dynamic_after_reloc to reduce the > > > run-time overhead. > > > > > > Changes in v5: > > > > > > 1. Remove plt_rewrite_enabled and set plt_rewrite in set_plt_rewrite to > > > reduce the run-time overhead. > > > 2. Use __glibc_likely and __glibc_unlikely in x86_64_dynamic_after_reloc > > > to reduce the run-time overhead. > > > > > > Changes in v4: > > > > > > 1. Update set_plt_rewrite. > > > 2. Add /* fallthrough */ to case R_X86_64_JUMP_SLOT:. > > > 3. Remove plt_rewrite_bias since it is always zero. > > > 4. Skip and silently ignore unsupported r_addend. > > > 5. Update copyright year to 2024. > > > 6. Drop the mprotect (PROT_EXEC | PROT_READ) check. > > > > > > Changes in v3: > > > > > > 1. Define and use macros for instruction opcodes and sizes. > > > 2. Use INT32_MIN and UINT32_MAX instead of 0x80000000ULL and > > > 0xffffffffULL. > > > 3. Replace 3 1-byte writes with a 4-byte write to write JMPABS. > > > 4. Verify that DT_X86_64_PLTSZ is a multiple of DT_X86_64_PLTENT. > > > 5. Add some comments for mprotect test. > > > 6. Update plt_rewrite logic. > > > > > > Add ELF_DYNAMIC_AFTER_RELOC to allow target specific processing after > > > relocation. > > > > > > For x86-64, add > > > > > > #define DT_X86_64_PLT (DT_LOPROC + 0) > > > #define DT_X86_64_PLTSZ (DT_LOPROC + 1) > > > #define DT_X86_64_PLTENT (DT_LOPROC + 3) > > > > > > 1. DT_X86_64_PLT: The address of the procedure linkage table. > > > 2. DT_X86_64_PLTSZ: The total size, in bytes, of the procedure linkage > > > table. > > > 3. 
DT_X86_64_PLTENT: The size, in bytes, of a procedure linkage table > > > entry. > > > > > > With the r_addend field of the R_X86_64_JUMP_SLOT relocation set to the > > > memory offset of the indirect branch instruction. > > > > > > Define ELF_DYNAMIC_AFTER_RELOC for x86-64 to rewrite the PLT section > > > with direct branch after relocation when the lazy binding is disabled. > > > > > > PLT rewrite is disabled by default since SELinux may disallow modifying > > > code pages and ld.so can't detect it in all cases. Add > > > > > > $ GLIBC_TUNABLES=glibc.cpu.plt_rewrite=1 > > > > > > to enable PLT rewrite at run-time. > > > --- > > > elf/dynamic-link.h | 5 + > > > elf/elf.h | 5 + > > > elf/tst-glibcelf.py | 1 + > > > manual/tunables.texi | 9 ++ > > > scripts/glibcelf.py | 4 + > > > sysdeps/x86/cet-control.h | 12 ++ > > > sysdeps/x86/cpu-features.c | 20 ++- > > > sysdeps/x86/dl-procruntime.c | 1 + > > > sysdeps/x86/dl-tunables.list | 5 + > > > sysdeps/x86_64/Makefile | 27 ++++ > > > sysdeps/x86_64/configure | 35 +++++ > > > sysdeps/x86_64/configure.ac | 4 + > > > sysdeps/x86_64/dl-dtprocnum.h | 21 +++ > > > sysdeps/x86_64/dl-machine.h | 216 ++++++++++++++++++++++++++- > > > sysdeps/x86_64/link_map.h | 22 +++ > > > sysdeps/x86_64/tst-plt-rewrite1.c | 31 ++++ > > > sysdeps/x86_64/tst-plt-rewritemod1.c | 32 ++++ > > > 17 files changed, 448 insertions(+), 2 deletions(-) > > > create mode 100644 sysdeps/x86_64/dl-dtprocnum.h > > > create mode 100644 sysdeps/x86_64/link_map.h > > > create mode 100644 sysdeps/x86_64/tst-plt-rewrite1.c > > > create mode 100644 sysdeps/x86_64/tst-plt-rewritemod1.c > > > > > > diff --git a/elf/dynamic-link.h b/elf/dynamic-link.h > > > index 8cdf7bde09..83d834ecaf 100644 > > > --- a/elf/dynamic-link.h > > > +++ b/elf/dynamic-link.h > > > @@ -177,6 +177,10 @@ elf_machine_lazy_rel (struct link_map *map, struct r_scope_elem *scope[], > > > } \ > > > } while (0); > > > > > > +# ifndef ELF_DYNAMIC_AFTER_RELOC > > > +# define 
ELF_DYNAMIC_AFTER_RELOC(map, lazy) > > > +# endif > > > + > > > /* This can't just be an inline function because GCC is too dumb > > > to inline functions containing inlines themselves. */ > > > # ifdef RTLD_BOOTSTRAP > > > @@ -192,6 +196,7 @@ elf_machine_lazy_rel (struct link_map *map, struct r_scope_elem *scope[], > > > ELF_DYNAMIC_DO_RELR (map); \ > > > ELF_DYNAMIC_DO_REL ((map), (scope), edr_lazy, skip_ifunc); \ > > > ELF_DYNAMIC_DO_RELA ((map), (scope), edr_lazy, skip_ifunc); \ > > > + ELF_DYNAMIC_AFTER_RELOC ((map), (edr_lazy)); \ > > > } while (0) > > > > > > #endif > > > diff --git a/elf/elf.h b/elf/elf.h > > > index ca6a7d9d67..455731663c 100644 > > > --- a/elf/elf.h > > > +++ b/elf/elf.h > > > @@ -3639,6 +3639,11 @@ enum > > > /* x86-64 sh_type values. */ > > > #define SHT_X86_64_UNWIND 0x70000001 /* Unwind information. */ > > > > > > +/* x86-64 d_tag values. */ > > > +#define DT_X86_64_PLT (DT_LOPROC + 0) > > > +#define DT_X86_64_PLTSZ (DT_LOPROC + 1) > > > +#define DT_X86_64_PLTENT (DT_LOPROC + 3) > > > +#define DT_X86_64_NUM 4 > > > > > > /* AM33 relocations. */ > > > #define R_MN10300_NONE 0 /* No reloc. 
*/ > > > diff --git a/elf/tst-glibcelf.py b/elf/tst-glibcelf.py > > > index 00cd2bba85..c191636a99 100644 > > > --- a/elf/tst-glibcelf.py > > > +++ b/elf/tst-glibcelf.py > > > @@ -187,6 +187,7 @@ DT_VALNUM > > > DT_VALRNGHI > > > DT_VALRNGLO > > > DT_VERSIONTAGNUM > > > +DT_X86_64_NUM > > > ELFCLASSNUM > > > ELFDATANUM > > > EM_NUM > > > diff --git a/manual/tunables.texi b/manual/tunables.texi > > > index b31f16da84..f9bd83622e 100644 > > > --- a/manual/tunables.texi > > > +++ b/manual/tunables.texi > > > @@ -57,6 +57,7 @@ glibc.pthread.stack_cache_size: 0x2800000 (min: 0x0, max: 0xffffffffffffffff) > > > glibc.cpu.hwcap_mask: 0x6 (min: 0x0, max: 0xffffffffffffffff) > > > glibc.malloc.mmap_max: 0 (min: 0, max: 2147483647) > > > glibc.elision.skip_trylock_internal_abort: 3 (min: 0, max: 2147483647) > > > +glibc.cpu.plt_rewrite: 0 (min: 0, max: 1) > > > glibc.malloc.tcache_unsorted_limit: 0x0 (min: 0x0, max: 0xffffffffffffffff) > > > glibc.cpu.x86_ibt: > > > glibc.cpu.hwcaps: > > > @@ -614,6 +615,14 @@ this tunable. > > > This tunable is specific to 64-bit x86-64. > > > @end deftp > > > > > > +@deftp Tunable glibc.cpu.plt_rewrite > > > +When this tunable is set to @code{1}, the dynamic linker will rewrite > > > +the PLT section with direct branch after relocation if possible when > > > +the lazy binding is disabled. > > > + > > > > This doesn't read well. Maybe > > > > When this tunable is set to @code{1}, the dynamic linker will attempt > > to rewrite the PLT section with a direct branch after relocation. It may > > fail to do so, for example when lazy binding is enabled. > > Will update. > > > > +This tunable is specific to x86-64. 
> > > +@end deftp > > > + > > > @node Memory Related Tunables > > > @section Memory Related Tunables > > > @cindex memory related tunables > > > diff --git a/scripts/glibcelf.py b/scripts/glibcelf.py > > > index c5e5dda48e..5f3813f326 100644 > > > --- a/scripts/glibcelf.py > > > +++ b/scripts/glibcelf.py > > > @@ -439,6 +439,8 @@ class DtRISCV(Dt): > > > """Supplemental DT_* constants for EM_RISCV.""" > > > class DtSPARC(Dt): > > > """Supplemental DT_* constants for EM_SPARC.""" > > > +class DtX86_64(Dt): > > > + """Supplemental DT_* constants for EM_X86_64.""" > > > _dt_skip = ''' > > > DT_ENCODING DT_PROCNUM > > > DT_ADDRRNGLO DT_ADDRRNGHI DT_ADDRNUM > > > @@ -451,6 +453,7 @@ DT_MIPS_NUM > > > DT_PPC_NUM > > > DT_PPC64_NUM > > > DT_SPARC_NUM > > > +DT_X86_64_NUM > > > '''.strip().split() > > > _register_elf_h(DtAARCH64, prefix='DT_AARCH64_', skip=_dt_skip, parent=Dt) > > > _register_elf_h(DtALPHA, prefix='DT_ALPHA_', skip=_dt_skip, parent=Dt) > > > @@ -461,6 +464,7 @@ _register_elf_h(DtPPC, prefix='DT_PPC_', skip=_dt_skip, parent=Dt) > > > _register_elf_h(DtPPC64, prefix='DT_PPC64_', skip=_dt_skip, parent=Dt) > > > _register_elf_h(DtRISCV, prefix='DT_RISCV_', skip=_dt_skip, parent=Dt) > > > _register_elf_h(DtSPARC, prefix='DT_SPARC_', skip=_dt_skip, parent=Dt) > > > +_register_elf_h(DtX86_64, prefix='DT_X86_64_', skip=_dt_skip, parent=Dt) > > > _register_elf_h(Dt, skip=_dt_skip, ranges=True) > > > del _dt_skip > > > > > > diff --git a/sysdeps/x86/cet-control.h b/sysdeps/x86/cet-control.h > > > index a45d59bf8c..81e7bb4bd8 100644 > > > --- a/sysdeps/x86/cet-control.h > > > +++ b/sysdeps/x86/cet-control.h > > > @@ -32,10 +32,22 @@ enum dl_x86_cet_control > > > cet_permissive > > > }; > > > > > > +/* PLT rewrite control. */ > > > +enum dl_plt_rewrite_control > > > +{ > > > + /* No PLT rewrite. */ > > > + plt_rewrite_none, > > > + /* Rewrite PLT with JMP at run-time. */ > > > + plt_rewrite_jmp, > > > + /* Rewrite PLT with JMPABS at run-time. 
*/ > > > + plt_rewrite_jmpabs > > > +}; > > > + > > > struct dl_x86_feature_control > > > { > > > enum dl_x86_cet_control ibt : 2; > > > enum dl_x86_cet_control shstk : 2; > > > + enum dl_plt_rewrite_control plt_rewrite : 2; > > > }; > > > > > > #endif /* cet-control.h */ > > > diff --git a/sysdeps/x86/cpu-features.c b/sysdeps/x86/cpu-features.c > > > index f193ea7a2d..ccf1350b72 100644 > > > --- a/sysdeps/x86/cpu-features.c > > > +++ b/sysdeps/x86/cpu-features.c > > > @@ -27,6 +27,21 @@ > > > extern void TUNABLE_CALLBACK (set_hwcaps) (tunable_val_t *) > > > attribute_hidden; > > > > > > +#ifdef SHARED > > > +static void > > > +TUNABLE_CALLBACK (set_plt_rewrite) (tunable_val_t *valp) > > > +{ > > > + if (valp->numval != 0) > > > + { > > > + /* Use JMPABS only on APX processors. */ > > > + const struct cpu_features *cpu_features = __get_cpu_features (); > > > + GL(dl_x86_feature_control).plt_rewrite > > > + = (CPU_FEATURE_PRESENT_P (cpu_features, APX_F) > > > + ? plt_rewrite_jmpabs : plt_rewrite_jmp); > > > + } > > > +} > > > +#endif > > > + > > > #ifdef __LP64__ > > > static void > > > TUNABLE_CALLBACK (set_prefer_map_32bit_exec) (tunable_val_t *valp) > > > @@ -1108,7 +1123,10 @@ no_cpuid: > > > TUNABLE_CALLBACK (set_x86_shstk)); > > > #endif > > > > > > -#ifndef SHARED > > > +#ifdef SHARED > > > + TUNABLE_GET (plt_rewrite, tunable_val_t *, > > > + TUNABLE_CALLBACK (set_plt_rewrite)); > > > +#else > > > /* NB: In libc.a, call init_cacheinfo. 
*/ > > > init_cacheinfo (); > > > #endif > > > diff --git a/sysdeps/x86/dl-procruntime.c b/sysdeps/x86/dl-procruntime.c > > > index 4d25d9f327..15b3d0d878 100644 > > > --- a/sysdeps/x86/dl-procruntime.c > > > +++ b/sysdeps/x86/dl-procruntime.c > > > @@ -67,6 +67,7 @@ PROCINFO_CLASS struct dl_x86_feature_control _dl_x86_feature_control > > > = { > > > .ibt = DEFAULT_DL_X86_CET_CONTROL, > > > .shstk = DEFAULT_DL_X86_CET_CONTROL, > > > + .plt_rewrite = plt_rewrite_none, > > > } > > > # endif > > > # if !defined SHARED || defined PROCINFO_DECL > > > diff --git a/sysdeps/x86/dl-tunables.list b/sysdeps/x86/dl-tunables.list > > > index 147a7270ec..e2e441e1b7 100644 > > > --- a/sysdeps/x86/dl-tunables.list > > > +++ b/sysdeps/x86/dl-tunables.list > > > @@ -66,5 +66,10 @@ glibc { > > > x86_shared_cache_size { > > > type: SIZE_T > > > } > > > + plt_rewrite { > > > + type: INT_32 > > > + minval: 0 > > > + maxval: 1 > > > + } > > should max value be at least 2 given that there are three > > distinct states (none, jmp, jmpabs). > > I will take a look. > > > > } > > > } > > > diff --git a/sysdeps/x86_64/Makefile b/sysdeps/x86_64/Makefile > > > index 00120ca9ca..374bca80d0 100644 > > > --- a/sysdeps/x86_64/Makefile > > > +++ b/sysdeps/x86_64/Makefile > > > @@ -1,6 +1,14 @@ > > > # The i387 `long double' is a distinct type we support. > > > long-double-fcts = yes > > > > > > +ifeq (yes,$(have-z-mark-plt)) > > > +# Always generate DT_X86_64_PLT* tags. > > > +sysdep-LDFLAGS += -Wl,-z,mark-plt > > > +# Never generate DT_X86_64_PLT* tags on ld.so to avoid changing its own > > > +# PLT. 
> > > +LDFLAGS-rtld += -Wl,-z,nomark-plt > > > +endif > > > + > > > ifeq ($(subdir),csu) > > > gen-as-const-headers += link-defines.sym > > > endif > > > @@ -175,6 +183,25 @@ ifeq (no,$(build-hardcoded-path-in-tests)) > > > tests-container += tst-glibc-hwcaps-cache > > > endif > > > > > > +ifeq (yes,$(have-z-mark-plt)) > > > +tests += \ > > > + tst-plt-rewrite1 \ > > > +# tests > > > +modules-names += \ > > > + tst-plt-rewritemod1 \ > > > +# modules-names > > > + > > > +tst-plt-rewrite1-no-pie = yes > > > +LDFLAGS-tst-plt-rewrite1 = -Wl,-z,now > > > +LDFLAGS-tst-plt-rewritemod1.so = -Wl,-z,now > > > +tst-plt-rewrite1-ENV = GLIBC_TUNABLES=glibc.cpu.plt_rewrite=1 LD_DEBUG=files:bindings > > > +$(objpfx)tst-plt-rewrite1: $(objpfx)tst-plt-rewritemod1.so > > > +$(objpfx)tst-plt-rewrite1.out: /dev/null $(objpfx)tst-plt-rewrite1 > > > + $(tst-plt-rewrite1-ENV) $(make-test-out) > $@ 2>&1; \ > > > + grep -q -E "changing 'bar' PLT entry in .*/elf/tst-plt-rewritemod1.so' to direct branch" $@; \ > > > + $(evaluate-test) > > > +endif > > > + > > > endif # $(subdir) == elf > > > > > > ifeq ($(subdir),csu) > > > diff --git a/sysdeps/x86_64/configure b/sysdeps/x86_64/configure > > > index b4a80b8035..418cc4a9b8 100755 > > > --- a/sysdeps/x86_64/configure > > > +++ b/sysdeps/x86_64/configure > > > @@ -25,6 +25,41 @@ printf "%s\n" "$libc_cv_cc_mprefer_vector_width" >&6; } > > > config_vars="$config_vars > > > config-cflags-mprefer-vector-width = $libc_cv_cc_mprefer_vector_width" > > > > > > +{ printf "%s\n" "$as_me:${as_lineno-$LINENO}: checking for linker that supports -z mark-plt" >&5 > > > +printf %s "checking for linker that supports -z mark-plt... 
" >&6; } > > > +libc_linker_feature=no > > > +cat > conftest.c <<EOF > > > +int _start (void) { return 42; } > > > +EOF > > > +if { ac_try='${CC-cc} $CFLAGS $CPPFLAGS $LDFLAGS $no_ssp > > > + -Wl,-z,mark-plt -nostdlib -nostartfiles > > > + -fPIC -shared -o conftest.so conftest.c > > > + 1>&5' > > > + { { eval echo "\"\$as_me\":${as_lineno-$LINENO}: \"$ac_try\""; } >&5 > > > + (eval $ac_try) 2>&5 > > > + ac_status=$? > > > + printf "%s\n" "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5 > > > + test $ac_status = 0; }; } > > > +then > > > + if ${CC-cc} $CFLAGS $CPPFLAGS $LDFLAGS $no_ssp -Wl,-z,mark-plt -nostdlib \ > > > + -nostartfiles -fPIC -shared -o conftest.so conftest.c 2>&1 \ > > > + | grep "warning: -z mark-plt ignored" > /dev/null 2>&1; then > > > + true > > > + else > > > + libc_linker_feature=yes > > > + fi > > > +fi > > > +rm -f conftest* > > > +if test $libc_linker_feature = yes; then > > > + libc_cv_z_mark_plt=yes > > > +else > > > + libc_cv_z_mark_plt=no > > > +fi > > > +{ printf "%s\n" "$as_me:${as_lineno-$LINENO}: result: $libc_linker_feature" >&5 > > > +printf "%s\n" "$libc_linker_feature" >&6; } > > > +config_vars="$config_vars > > > +have-z-mark-plt = $libc_cv_z_mark_plt" > > > + > > > if test x"$build_mathvec" = xnotset; then > > > build_mathvec=yes > > > fi > > > diff --git a/sysdeps/x86_64/configure.ac b/sysdeps/x86_64/configure.ac > > > index 937d1aff7e..d1f803c02e 100644 > > > --- a/sysdeps/x86_64/configure.ac > > > +++ b/sysdeps/x86_64/configure.ac > > > @@ -10,6 +10,10 @@ LIBC_TRY_CC_OPTION([-mprefer-vector-width=128], > > > LIBC_CONFIG_VAR([config-cflags-mprefer-vector-width], > > > [$libc_cv_cc_mprefer_vector_width]) > > > > > > +LIBC_LINKER_FEATURE([-z mark-plt], [-Wl,-z,mark-plt], > > > + [libc_cv_z_mark_plt=yes], [libc_cv_z_mark_plt=no]) > > > +LIBC_CONFIG_VAR([have-z-mark-plt], [$libc_cv_z_mark_plt]) > > > + > > > if test x"$build_mathvec" = xnotset; then > > > build_mathvec=yes > > > fi > > > diff --git 
a/sysdeps/x86_64/dl-dtprocnum.h b/sysdeps/x86_64/dl-dtprocnum.h > > > new file mode 100644 > > > index 0000000000..cefacb5387 > > > --- /dev/null > > > +++ b/sysdeps/x86_64/dl-dtprocnum.h > > > @@ -0,0 +1,21 @@ > > > +/* Configuration of lookup functions. x64-64 version. > > > + Copyright (C) 2024 Free Software Foundation, Inc. > > > + This file is part of the GNU C Library. > > > + > > > + The GNU C Library is free software; you can redistribute it and/or > > > + modify it under the terms of the GNU Lesser General Public > > > + License as published by the Free Software Foundation; either > > > + version 2.1 of the License, or (at your option) any later version. > > > + > > > + The GNU C Library is distributed in the hope that it will be useful, > > > + but WITHOUT ANY WARRANTY; without even the implied warranty of > > > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > > > + Lesser General Public License for more details. > > > + > > > + You should have received a copy of the GNU Lesser General Public > > > + License along with the GNU C Library; if not, see > > > + <https://www.gnu.org/licenses/>. */ > > > + > > > +/* Number of extra dynamic section entries for this architecture. By > > > + default there are none. */ > > > +#define DT_THISPROCNUM DT_X86_64_NUM > > > diff --git a/sysdeps/x86_64/dl-machine.h b/sysdeps/x86_64/dl-machine.h > > > index e0a9b14469..b075482ef4 100644 > > > --- a/sysdeps/x86_64/dl-machine.h > > > +++ b/sysdeps/x86_64/dl-machine.h > > > @@ -22,6 +22,7 @@ > > > #define ELF_MACHINE_NAME "x86_64" > > > > > > #include <assert.h> > > > +#include <stdint.h> > > > #include <sys/param.h> > > > #include <sysdep.h> > > > #include <tls.h> > > > @@ -35,6 +36,9 @@ > > > # define RTLD_START_ENABLE_X86_FEATURES > > > #endif > > > > > > +/* Translate a processor specific dynamic tag to the index in l_info array. 
*/ > > > +#define DT_X86_64(x) (DT_X86_64_##x - DT_LOPROC + DT_NUM) > > > + > > > /* Return nonzero iff ELF header is compatible with the running host. */ > > > static inline int __attribute__ ((unused)) > > > elf_machine_matches_host (const ElfW(Ehdr) *ehdr) > > > @@ -312,8 +316,10 @@ and creates an unsatisfiable circular dependency.\n", > > > > > > switch (r_type) > > > { > > > - case R_X86_64_GLOB_DAT: > > > case R_X86_64_JUMP_SLOT: > > > + map->l_has_jump_slot_reloc = true; > > > + /* fallthrough */ > > > + case R_X86_64_GLOB_DAT: > > > *reloc_addr = value; > > > break; > > > > > > @@ -549,3 +555,211 @@ elf_machine_lazy_rel (struct link_map *map, struct r_scope_elem *scope[], > > > } > > > > > > #endif /* RESOLVE_MAP */ > > > + > > > +#if !defined ELF_DYNAMIC_AFTER_RELOC && !defined RTLD_BOOTSTRAP \ > > > + && defined SHARED > > > +# define ELF_DYNAMIC_AFTER_RELOC(map, lazy) \ > > > + x86_64_dynamic_after_reloc (map, (lazy)) > > > + > > > +# define JMP32_INSN_OPCODE 0xe9 > > > +# define JMP32_INSN_SIZE 5 > > > +# define JMPABS_INSN_OPCODE 0xa100d5 > > > +# define JMPABS_INSN_SIZE 11 > > > +# define INT3_INSN_OPCODE 0xcc > > > + > > > +static const char * > > > +x86_64_reloc_symbol_name (struct link_map *map, const ElfW(Rela) *reloc) > > > +{ > > > + const ElfW(Sym) *const symtab > > > + = (const void *) map->l_info[DT_SYMTAB]->d_un.d_ptr; > > > + const ElfW(Sym) *const refsym = &symtab[ELFW (R_SYM) (reloc->r_info)]; > > > + const char *strtab = (const char *) map->l_info[DT_STRTAB]->d_un.d_ptr; > > > + return strtab + refsym->st_name; > > > +} > > > + > > > +static void > > > +x86_64_rewrite_plt (struct link_map *map, ElfW(Addr) plt_rewrite) > > > +{ > > > + ElfW(Addr) l_addr = map->l_addr; > > > + ElfW(Addr) pltent = map->l_info[DT_X86_64 (PLTENT)]->d_un.d_val; > > > + ElfW(Addr) start = map->l_info[DT_JMPREL]->d_un.d_ptr; > > > + ElfW(Addr) size = map->l_info[DT_PLTRELSZ]->d_un.d_val; > > > + const ElfW(Rela) *reloc = (const void *) start; > > > + const 
ElfW(Rela) *reloc_end = (const void *) (start + size); > > > + > > > + unsigned int feature_1 = THREAD_GETMEM (THREAD_SELF, > > > + header.feature_1); > > > + bool ibt_enabled_p > > > + = (feature_1 & GNU_PROPERTY_X86_FEATURE_1_IBT) != 0; > > > + > > > + if (__glibc_unlikely (GLRO(dl_debug_mask) & DL_DEBUG_FILES)) > > > + _dl_debug_printf ("\nchanging PLT in '%s' to direct branch\n", > > > + DSO_FILENAME (map->l_name)); > > > + > > > + for (; reloc < reloc_end; reloc++) > > > + if (ELFW(R_TYPE) (reloc->r_info) == R_X86_64_JUMP_SLOT) > > > + { > > > + /* Get the value from the GOT entry. */ > > > + ElfW(Addr) value = *(ElfW(Addr) *) (l_addr + reloc->r_offset); > > > + > > > + /* Get the corresponding PLT entry from r_addend. */ > > > + ElfW(Addr) branch_start = l_addr + reloc->r_addend; > > > + /* Skip ENDBR64 if IBT isn't enabled. */ > > > + if (!ibt_enabled_p) > > > + branch_start = ALIGN_DOWN (branch_start, pltent); > > > > Will the only preceding code always be the ENDBR64? > > If so, why not just replace the alignment with `- ENDBR64_INSN_SIZE`? > > ENDBR64 isn't required. If CET isn't enabled, there is no ENDBR64. > > > Likewise below. > > > > > + /* Get the displacement from the branch target. */ > > > + ElfW(Addr) disp = value - branch_start - JMP32_INSN_SIZE; > > > + ElfW(Addr) plt_end; > > > + ElfW(Addr) pad; > > > + > > > + plt_end = (branch_start & -pltent) + pltent; > > For this to make sense, `pltent` needs to be a power of 2, which is not > > checked below in `x86_64_dynamic_after_reloc`. Likewise at > > the ALIGN_DOWN above. > > It is safe. From > > https://gitlab.com/x86-psABIs/x86-64-ABI/-/commit/6d824a52a42d173eb838b879616c1be5870b593e > > there is "All entries have the same size and are aligned to the entry size.". > But aligning to X with `V & -X` only works if X is a power of two. > > It also gives better codegen: `(branch_start | (pltent - 1)) + 1` > > > + > > > + /* Update the PLT entry.
*/ > > > + if (((uint64_t) disp + (uint64_t) ((uint32_t) INT32_MIN)) > > > + <= (uint64_t) UINT32_MAX) > > > + { > > > + pad = branch_start + JMP32_INSN_SIZE; > > > + > > > + if (__glibc_unlikely (pad > plt_end)) > > > + continue; > > > + > > > + /* If the target branch can be reached with a direct branch, > > > + rewrite the PLT entry with a direct branch. */ > > > + if (__glibc_unlikely (GLRO(dl_debug_mask) & DL_DEBUG_BINDINGS)) > > > + { > > > + const char *sym_name = x86_64_reloc_symbol_name (map, > > > + reloc); > > > + _dl_debug_printf ("changing '%s' PLT entry in '%s' to " > > > + "direct branch\n", sym_name, > > > + DSO_FILENAME (map->l_name)); > > > + } > > > + > > > + /* Write out direct branch. */ > > > + *(uint8_t *) branch_start = JMP32_INSN_OPCODE; > > > + *((uint32_t *) (branch_start + 1)) = disp; > > > + } > > > + else > > > + { > > > + if (GL(dl_x86_feature_control).plt_rewrite > > > + != plt_rewrite_jmpabs) > > > + { > > > + if (__glibc_unlikely (GLRO(dl_debug_mask) > > > + & DL_DEBUG_BINDINGS)) > > > + { > > > + const char *sym_name > > > + = x86_64_reloc_symbol_name (map, reloc); > > > + _dl_debug_printf ("skipping '%s' PLT entry in '%s'\n", > > > + sym_name, > > > + DSO_FILENAME (map->l_name)); > > > + } > > > + continue; > > > + } > > > + > > > + pad = branch_start + JMPABS_INSN_SIZE; > > > + > > > + if (__glibc_unlikely (pad > plt_end)) > > > + continue; > > > + > > > + /* Rewrite the PLT entry with JMPABS. */ > > > + if (__glibc_unlikely (GLRO(dl_debug_mask) & DL_DEBUG_BINDINGS)) > > > + { > > > + const char *sym_name = x86_64_reloc_symbol_name (map, > > > + reloc); > > > + _dl_debug_printf ("changing '%s' PLT entry in '%s' to " > > > + "JMPABS\n", sym_name, > > > + DSO_FILENAME (map->l_name)); > > > + } > > > + > > > + /* "jmpabs $target" for 64-bit displacement. NB: JMPABS has > > > + a 3-byte opcode + 64bit address. There is a 1-byte overlap > > > + between 4-byte write and 8-byte write. 
*/ > > > + *(uint32_t *) (branch_start) = JMPABS_INSN_OPCODE; > > > + *(uint64_t *) (branch_start + 3) = value; > > > + } > > > + > > > + /* Fill the unused part of the PLT entry with INT3. */ > > > + for (; pad < plt_end; pad++) > > > + *(uint8_t *) pad = INT3_INSN_OPCODE; > > Nit: can you put braces around this for loop? It's a bit deceiving > > to the eye with the brace directly after it. > > It is quite clear to me since I wrote it. Can you be more specific? Just seeing `for(;;) }`, it's easy to misread it as `for(;;) {}`. It's not an issue; feel free to ignore. > > > > + } > > > +} > > > + > > > +static inline void > > > +x86_64_rewrite_plt_in_place (struct link_map *map) > > > +{ > > > + /* Adjust DT_X86_64_PLT address and DT_X86_64_PLTSZ values. */ > > > + ElfW(Addr) plt = (map->l_info[DT_X86_64 (PLT)]->d_un.d_ptr > > > + + map->l_addr); > > > + size_t pagesize = GLRO(dl_pagesize); > > > + ElfW(Addr) plt_aligned = ALIGN_DOWN (plt, pagesize); > > > + size_t pltsz = (map->l_info[DT_X86_64 (PLTSZ)]->d_un.d_val > > > + + plt - plt_aligned); > > > + > > > + if (__glibc_unlikely (GLRO(dl_debug_mask) & DL_DEBUG_FILES)) > > > + _dl_debug_printf ("\nchanging PLT in '%s' to writable\n", > > > + DSO_FILENAME (map->l_name)); > > > + > > > + if (__glibc_unlikely (__mprotect ((void *) plt_aligned, pltsz, > > > + PROT_WRITE | PROT_READ) < 0)) > > > + { > > > + if (__glibc_unlikely (GLRO(dl_debug_mask) & DL_DEBUG_FILES)) > > > + _dl_debug_printf ("\nfailed to change PLT in '%s' to writable\n", > > > + DSO_FILENAME (map->l_name)); > > > + return; > > > + } > > > + > > > + x86_64_rewrite_plt (map, plt_aligned); > > > + > > > + if (__glibc_unlikely (GLRO(dl_debug_mask) & DL_DEBUG_FILES)) > > > + _dl_debug_printf ("\nchanging PLT in '%s' back to read-only\n", > > > + DSO_FILENAME (map->l_name)); > > > + > > > + if (__glibc_unlikely (__mprotect ((void *) plt_aligned, pltsz, > > > + PROT_EXEC | PROT_READ) < 0)) > > > + _dl_signal_error (0, DSO_FILENAME (map->l_name), NULL, > > > +
"failed to change PLT back to read-only"); > > > +} > > > + > > > +/* Rewrite PLT entries to direct branch if possible. */ > > > + > > > +static inline void > > > +x86_64_dynamic_after_reloc (struct link_map *map, int lazy) > > > +{ > > > + /* Ignore DT_X86_64_PLT if the lazy binding is enabled. */ > > > + if (lazy != 0) > > > + return; > > > + > > > + /* Ignore DT_X86_64_PLT if PLT rewrite isn't enabled. */ > > > + if (__glibc_likely (GL(dl_x86_feature_control).plt_rewrite > > > + == plt_rewrite_none)) > > > + return; > > > + > > > + if (__glibc_likely (map->l_info[DT_X86_64 (PLT)] == NULL)) > > > + return; > > > + > > > + /* Ignore DT_X86_64_PLT if there is no R_X86_64_JUMP_SLOT. */ > > > + if (map->l_has_jump_slot_reloc == 0) > > > + return; > > > + > > > + /* Ignore DT_X86_64_PLT if > > > + 1. DT_JMPREL isn't available or its value is 0. > > > + 2. DT_PLTRELSZ is 0. > > > + 3. DT_X86_64_PLTENT isn't available or its value is smaller than > > > + 16 bytes. > > > + 4. DT_X86_64_PLTSZ isn't available or its value is smaller than > > > + DT_X86_64_PLTENT's value or isn't a multiple of DT_X86_64_PLTENT's > > > + value. 
*/ > > > + if (map->l_info[DT_JMPREL] == NULL > > > + || map->l_info[DT_JMPREL]->d_un.d_ptr == 0 > > > + || map->l_info[DT_PLTRELSZ]->d_un.d_val == 0 > > > + || map->l_info[DT_X86_64 (PLTSZ)] == NULL > > > + || map->l_info[DT_X86_64 (PLTENT)] == NULL > > > + || map->l_info[DT_X86_64 (PLTENT)]->d_un.d_val < 16 > > > + || (map->l_info[DT_X86_64 (PLTSZ)]->d_un.d_val > > > + < map->l_info[DT_X86_64 (PLTENT)]->d_un.d_val) > > > + || (map->l_info[DT_X86_64 (PLTSZ)]->d_un.d_val > > > + % map->l_info[DT_X86_64 (PLTENT)]->d_un.d_val) != 0) > > > + return; > > > + > > > + x86_64_rewrite_plt_in_place (map); > > > +} > > > +#endif > > > diff --git a/sysdeps/x86_64/link_map.h b/sysdeps/x86_64/link_map.h > > > new file mode 100644 > > > index 0000000000..537f56ace5 > > > --- /dev/null > > > +++ b/sysdeps/x86_64/link_map.h > > > @@ -0,0 +1,22 @@ > > > +/* Additional fields in struct link_map. x86-64 version. > > > + Copyright (C) 2024 Free Software Foundation, Inc. > > > + This file is part of the GNU C Library. > > > + > > > + The GNU C Library is free software; you can redistribute it and/or > > > + modify it under the terms of the GNU Lesser General Public > > > + License as published by the Free Software Foundation; either > > > + version 2.1 of the License, or (at your option) any later version. > > > + > > > + The GNU C Library is distributed in the hope that it will be useful, > > > + but WITHOUT ANY WARRANTY; without even the implied warranty of > > > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > > > + Lesser General Public License for more details. > > > + > > > + You should have received a copy of the GNU Lesser General Public > > > + License along with the GNU C Library; if not, see > > > + <https://www.gnu.org/licenses/>. */ > > > + > > > +/* Has R_X86_64_JUMP_SLOT relocation. 
*/ > > > +bool l_has_jump_slot_reloc; > > > + > > > +#include <sysdeps/x86/link_map.h> > > > diff --git a/sysdeps/x86_64/tst-plt-rewrite1.c b/sysdeps/x86_64/tst-plt-rewrite1.c > > > new file mode 100644 > > > index 0000000000..86785957e2 > > > --- /dev/null > > > +++ b/sysdeps/x86_64/tst-plt-rewrite1.c > > > @@ -0,0 +1,31 @@ > > > +/* Test PLT rewrite. > > > + Copyright (C) 2024 Free Software Foundation, Inc. > > > + This file is part of the GNU C Library. > > > + > > > + The GNU C Library is free software; you can redistribute it and/or > > > + modify it under the terms of the GNU Lesser General Public > > > + License as published by the Free Software Foundation; either > > > + version 2.1 of the License, or (at your option) any later version. > > > + > > > + The GNU C Library is distributed in the hope that it will be useful, > > > + but WITHOUT ANY WARRANTY; without even the implied warranty of > > > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > > > + Lesser General Public License for more details. > > > + > > > + You should have received a copy of the GNU Lesser General Public > > > + License along with the GNU C Library; if not, see > > > + <https://www.gnu.org/licenses/>. */ > > > + > > > +#include <string.h> > > > +#include <support/check.h> > > > + > > > +extern const char *foo (void); > > > + > > > +static int > > > +do_test (void) > > > +{ > > > + TEST_COMPARE (strcmp (foo (), "PLT rewrite works"), 0); > > > + return 0; > > > +} > > > + > > > +#include <support/test-driver.c> > > > diff --git a/sysdeps/x86_64/tst-plt-rewritemod1.c b/sysdeps/x86_64/tst-plt-rewritemod1.c > > > new file mode 100644 > > > index 0000000000..99f21fba5a > > > --- /dev/null > > > +++ b/sysdeps/x86_64/tst-plt-rewritemod1.c > > > @@ -0,0 +1,32 @@ > > > +/* Check PLT rewrite works correctly. > > > + Copyright (C) 2024 Free Software Foundation, Inc. > > > + This file is part of the GNU C Library. 
> > > + > > > + The GNU C Library is free software; you can redistribute it and/or > > > + modify it under the terms of the GNU Lesser General Public > > > + License as published by the Free Software Foundation; either > > > + version 2.1 of the License, or (at your option) any later version. > > > + > > > + The GNU C Library is distributed in the hope that it will be useful, > > > + but WITHOUT ANY WARRANTY; without even the implied warranty of > > > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > > > + Lesser General Public License for more details. > > > + > > > + You should have received a copy of the GNU Lesser General Public > > > + License along with the GNU C Library; if not, see > > > + <https://www.gnu.org/licenses/>. */ > > > + > > > +/* foo calls bar with indirect branch via PLT. PLT rewrite should > > > + change it to direct branch. */ > > > + > > > +const char * > > > +bar (void) > > > +{ > > > + return "PLT rewrite works"; > > > +} > > > + > > > +const char * > > > +foo (void) > > > +{ > > > + return bar (); > > > +} > > > -- > > > 2.43.0 > > > > > > > can you run clang-format on your changes as a whole? > > Will do. > > Thanks. > > > -- > H.J.
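To make the alignment point from the review above concrete, here is a minimal standalone sketch in C. It is not part of the patch. The helper names (is_power_of_two, entry_end_mask, entry_end_or, entry_end_div, check_entry_end) are hypothetical, and plain uint64_t stands in for ElfW(Addr). It compares the patch's `(branch_start & -pltent) + pltent`, the `(branch_start | (pltent - 1)) + 1` form suggested in review, and a division-based reference that assumes entries start at multiples of the entry size. The two bit-twiddling forms agree with the reference only when the entry size is a power of two.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not in the patch): true iff x is a nonzero
   power of two.  */
bool
is_power_of_two (uint64_t x)
{
  return x != 0 && (x & (x - 1)) == 0;
}

/* End of the PLT entry containing addr, computed as in the patch.
   Unsigned negation is well defined: -pltent == 2^64 - pltent.  */
uint64_t
entry_end_mask (uint64_t addr, uint64_t pltent)
{
  return (addr & -pltent) + pltent;
}

/* The alternative form suggested in review.  */
uint64_t
entry_end_or (uint64_t addr, uint64_t pltent)
{
  return (addr | (pltent - 1)) + 1;
}

/* Reference computation: valid for any nonzero entry size, assuming
   entries start at multiples of pltent.  */
uint64_t
entry_end_div (uint64_t addr, uint64_t pltent)
{
  return (addr / pltent + 1) * pltent;
}

/* Self-check illustrating where the forms agree and where they do not.  */
void
check_entry_end (void)
{
  /* Entry size 16 (a power of two): all three forms agree.  */
  for (uint64_t addr = 0x1000; addr < 0x1040; addr++)
    {
      assert (entry_end_mask (addr, 16) == entry_end_div (addr, 16));
      assert (entry_end_or (addr, 16) == entry_end_div (addr, 16));
    }

  /* Entry size 24 (not a power of two): the mask form no longer
     yields the end of the entry that contains the address.  */
  assert (!is_power_of_two (24));
  assert (entry_end_div (50, 24) == 72);
  assert (entry_end_mask (50, 24) != entry_end_div (50, 24));
}
```

Under these assumptions, one option would be an extra power-of-two test on the DT_X86_64_PLTENT value alongside the existing sanity checks in x86_64_dynamic_after_reloc; whether that is needed in practice depends on what linkers actually emit.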
On Thu, Jan 4, 2024 at 6:46 PM Noah Goldstein <goldstein.w.n@gmail.com> wrote: > > On Thu, Jan 4, 2024 at 6:14 PM H.J. Lu <hjl.tools@gmail.com> wrote: > > > > On Thu, Jan 4, 2024 at 6:00 PM Noah Goldstein <goldstein.w.n@gmail.com> wrote: > > > > > > On Thu, Jan 4, 2024 at 8:01 AM H.J. Lu <hjl.tools@gmail.com> wrote: > > > > > > > > Changes in v6: > > > > > > > > 1. Print skipped PLT entries. > > > > 2. Always generate DT_X86_64_PLT* tags when supported. > > > > 3. Never generate DT_X86_64_PLT* tags on ld.so. > > > > 4. Don't check for ld.so in x86_64_dynamic_after_reloc to reduce the > > > > run-time overhead. > > > > > > > > Changes in v5: > > > > > > > > 1. Remove plt_rewrite_enabled and set plt_rewrite in set_plt_rewrite to > > > > reduce the run-time overhead. > > > > 2. Use __glibc_likely and __glibc_unlikely in x86_64_dynamic_after_reloc > > > > to reduce the run-time overhead. > > > > > > > > Changes in v4: > > > > > > > > 1. Update set_plt_rewrite. > > > > 2. Add /* fallthrough */ to case R_X86_64_JUMP_SLOT:. > > > > 3. Remove plt_rewrite_bias since it is always zero. > > > > 4. Skip and silently ignore unsupported r_addend. > > > > 5. Update copyright year to 2024. > > > > 6. Drop the mprotect (PROT_EXEC | PROT_READ) check. > > > > > > > > Changes in v3: > > > > > > > > 1. Define and use macros for instruction opcodes and sizes. > > > > 2. Use INT32_MIN and UINT32_MAX instead of 0x80000000ULL and > > > > 0xffffffffULL. > > > > 3. Replace 3 1-byte writes with a 4-byte write to write JMPABS. > > > > 4. Verify that DT_X86_64_PLTSZ is a multiple of DT_X86_64_PLTENT. > > > > 5. Add some comments for mprotect test. > > > > 6. Update plt_rewrite logic. > > > > > > > > Add ELF_DYNAMIC_AFTER_RELOC to allow target specific processing after > > > > relocation. > > > > > > > > For x86-64, add > > > > > > > > #define DT_X86_64_PLT (DT_LOPROC + 0) > > > > #define DT_X86_64_PLTSZ (DT_LOPROC + 1) > > > > #define DT_X86_64_PLTENT (DT_LOPROC + 3) > > > > > > > > 1. 
DT_X86_64_PLT: The address of the procedure linkage table. > > > > 2. DT_X86_64_PLTSZ: The total size, in bytes, of the procedure linkage > > > > table. > > > > 3. DT_X86_64_PLTENT: The size, in bytes, of a procedure linkage table > > > > entry. > > > > > > > > With the r_addend field of the R_X86_64_JUMP_SLOT relocation set to the > > > > memory offset of the indirect branch instruction. > > > > > > > > Define ELF_DYNAMIC_AFTER_RELOC for x86-64 to rewrite the PLT section > > > > with direct branch after relocation when the lazy binding is disabled. > > > > > > > > PLT rewrite is disabled by default since SELinux may disallow modifying > > > > code pages and ld.so can't detect it in all cases. Add > > > > > > > > $ GLIBC_TUNABLES=glibc.cpu.plt_rewrite=1 > > > > > > > > to enable PLT rewrite at run-time. > > > > --- > > > > elf/dynamic-link.h | 5 + > > > > elf/elf.h | 5 + > > > > elf/tst-glibcelf.py | 1 + > > > > manual/tunables.texi | 9 ++ > > > > scripts/glibcelf.py | 4 + > > > > sysdeps/x86/cet-control.h | 12 ++ > > > > sysdeps/x86/cpu-features.c | 20 ++- > > > > sysdeps/x86/dl-procruntime.c | 1 + > > > > sysdeps/x86/dl-tunables.list | 5 + > > > > sysdeps/x86_64/Makefile | 27 ++++ > > > > sysdeps/x86_64/configure | 35 +++++ > > > > sysdeps/x86_64/configure.ac | 4 + > > > > sysdeps/x86_64/dl-dtprocnum.h | 21 +++ > > > > sysdeps/x86_64/dl-machine.h | 216 ++++++++++++++++++++++++++- > > > > sysdeps/x86_64/link_map.h | 22 +++ > > > > sysdeps/x86_64/tst-plt-rewrite1.c | 31 ++++ > > > > sysdeps/x86_64/tst-plt-rewritemod1.c | 32 ++++ > > > > 17 files changed, 448 insertions(+), 2 deletions(-) > > > > create mode 100644 sysdeps/x86_64/dl-dtprocnum.h > > > > create mode 100644 sysdeps/x86_64/link_map.h > > > > create mode 100644 sysdeps/x86_64/tst-plt-rewrite1.c > > > > create mode 100644 sysdeps/x86_64/tst-plt-rewritemod1.c > > > > > > > > diff --git a/elf/dynamic-link.h b/elf/dynamic-link.h > > > > index 8cdf7bde09..83d834ecaf 100644 > > > > --- a/elf/dynamic-link.h > > 
> > +++ b/elf/dynamic-link.h > > > > @@ -177,6 +177,10 @@ elf_machine_lazy_rel (struct link_map *map, struct r_scope_elem *scope[], > > > > } \ > > > > } while (0); > > > > > > > > +# ifndef ELF_DYNAMIC_AFTER_RELOC > > > > +# define ELF_DYNAMIC_AFTER_RELOC(map, lazy) > > > > +# endif > > > > + > > > > /* This can't just be an inline function because GCC is too dumb > > > > to inline functions containing inlines themselves. */ > > > > # ifdef RTLD_BOOTSTRAP > > > > @@ -192,6 +196,7 @@ elf_machine_lazy_rel (struct link_map *map, struct r_scope_elem *scope[], > > > > ELF_DYNAMIC_DO_RELR (map); \ > > > > ELF_DYNAMIC_DO_REL ((map), (scope), edr_lazy, skip_ifunc); \ > > > > ELF_DYNAMIC_DO_RELA ((map), (scope), edr_lazy, skip_ifunc); \ > > > > + ELF_DYNAMIC_AFTER_RELOC ((map), (edr_lazy)); \ > > > > } while (0) > > > > > > > > #endif > > > > diff --git a/elf/elf.h b/elf/elf.h > > > > index ca6a7d9d67..455731663c 100644 > > > > --- a/elf/elf.h > > > > +++ b/elf/elf.h > > > > @@ -3639,6 +3639,11 @@ enum > > > > /* x86-64 sh_type values. */ > > > > #define SHT_X86_64_UNWIND 0x70000001 /* Unwind information. */ > > > > > > > > +/* x86-64 d_tag values. */ > > > > +#define DT_X86_64_PLT (DT_LOPROC + 0) > > > > +#define DT_X86_64_PLTSZ (DT_LOPROC + 1) > > > > +#define DT_X86_64_PLTENT (DT_LOPROC + 3) > > > > +#define DT_X86_64_NUM 4 > > > > > > > > /* AM33 relocations. */ > > > > #define R_MN10300_NONE 0 /* No reloc. 
*/ > > > > diff --git a/elf/tst-glibcelf.py b/elf/tst-glibcelf.py > > > > index 00cd2bba85..c191636a99 100644 > > > > --- a/elf/tst-glibcelf.py > > > > +++ b/elf/tst-glibcelf.py > > > > @@ -187,6 +187,7 @@ DT_VALNUM > > > > DT_VALRNGHI > > > > DT_VALRNGLO > > > > DT_VERSIONTAGNUM > > > > +DT_X86_64_NUM > > > > ELFCLASSNUM > > > > ELFDATANUM > > > > EM_NUM > > > > diff --git a/manual/tunables.texi b/manual/tunables.texi > > > > index b31f16da84..f9bd83622e 100644 > > > > --- a/manual/tunables.texi > > > > +++ b/manual/tunables.texi > > > > @@ -57,6 +57,7 @@ glibc.pthread.stack_cache_size: 0x2800000 (min: 0x0, max: 0xffffffffffffffff) > > > > glibc.cpu.hwcap_mask: 0x6 (min: 0x0, max: 0xffffffffffffffff) > > > > glibc.malloc.mmap_max: 0 (min: 0, max: 2147483647) > > > > glibc.elision.skip_trylock_internal_abort: 3 (min: 0, max: 2147483647) > > > > +glibc.cpu.plt_rewrite: 0 (min: 0, max: 1) > > > > glibc.malloc.tcache_unsorted_limit: 0x0 (min: 0x0, max: 0xffffffffffffffff) > > > > glibc.cpu.x86_ibt: > > > > glibc.cpu.hwcaps: > > > > @@ -614,6 +615,14 @@ this tunable. > > > > This tunable is specific to 64-bit x86-64. > > > > @end deftp > > > > > > > > +@deftp Tunable glibc.cpu.plt_rewrite > > > > +When this tunable is set to @code{1}, the dynamic linker will rewrite > > > > +the PLT section with direct branch after relocation if possible when > > > > +the lazy binding is disabled. > > > > + > > > > > > This doesn't read well. Maybe > > > > > > When this tunable is set to @code{1}, the dynamic linker will attempt > > > to rewrite the PLT section with a direct branch after relocation. It may > > > fail to do so, for example when lazy binding is enabled. > > > > Will update. > > > > > > +This tunable is specific to x86-64. 
> > > > +@end deftp > > > > + > > > > @node Memory Related Tunables > > > > @section Memory Related Tunables > > > > @cindex memory related tunables > > > > diff --git a/scripts/glibcelf.py b/scripts/glibcelf.py > > > > index c5e5dda48e..5f3813f326 100644 > > > > --- a/scripts/glibcelf.py > > > > +++ b/scripts/glibcelf.py > > > > @@ -439,6 +439,8 @@ class DtRISCV(Dt): > > > > """Supplemental DT_* constants for EM_RISCV.""" > > > > class DtSPARC(Dt): > > > > """Supplemental DT_* constants for EM_SPARC.""" > > > > +class DtX86_64(Dt): > > > > + """Supplemental DT_* constants for EM_X86_64.""" > > > > _dt_skip = ''' > > > > DT_ENCODING DT_PROCNUM > > > > DT_ADDRRNGLO DT_ADDRRNGHI DT_ADDRNUM > > > > @@ -451,6 +453,7 @@ DT_MIPS_NUM > > > > DT_PPC_NUM > > > > DT_PPC64_NUM > > > > DT_SPARC_NUM > > > > +DT_X86_64_NUM > > > > '''.strip().split() > > > > _register_elf_h(DtAARCH64, prefix='DT_AARCH64_', skip=_dt_skip, parent=Dt) > > > > _register_elf_h(DtALPHA, prefix='DT_ALPHA_', skip=_dt_skip, parent=Dt) > > > > @@ -461,6 +464,7 @@ _register_elf_h(DtPPC, prefix='DT_PPC_', skip=_dt_skip, parent=Dt) > > > > _register_elf_h(DtPPC64, prefix='DT_PPC64_', skip=_dt_skip, parent=Dt) > > > > _register_elf_h(DtRISCV, prefix='DT_RISCV_', skip=_dt_skip, parent=Dt) > > > > _register_elf_h(DtSPARC, prefix='DT_SPARC_', skip=_dt_skip, parent=Dt) > > > > +_register_elf_h(DtX86_64, prefix='DT_X86_64_', skip=_dt_skip, parent=Dt) > > > > _register_elf_h(Dt, skip=_dt_skip, ranges=True) > > > > del _dt_skip > > > > > > > > diff --git a/sysdeps/x86/cet-control.h b/sysdeps/x86/cet-control.h > > > > index a45d59bf8c..81e7bb4bd8 100644 > > > > --- a/sysdeps/x86/cet-control.h > > > > +++ b/sysdeps/x86/cet-control.h > > > > @@ -32,10 +32,22 @@ enum dl_x86_cet_control > > > > cet_permissive > > > > }; > > > > > > > > +/* PLT rewrite control. */ > > > > +enum dl_plt_rewrite_control > > > > +{ > > > > + /* No PLT rewrite. */ > > > > + plt_rewrite_none, > > > > + /* Rewrite PLT with JMP at run-time. 
*/ > > > > + plt_rewrite_jmp, > > > > + /* Rewrite PLT with JMPABS at run-time. */ > > > > + plt_rewrite_jmpabs > > > > +}; > > > > + > > > > struct dl_x86_feature_control > > > > { > > > > enum dl_x86_cet_control ibt : 2; > > > > enum dl_x86_cet_control shstk : 2; > > > > + enum dl_plt_rewrite_control plt_rewrite : 2; > > > > }; > > > > > > > > #endif /* cet-control.h */ > > > > diff --git a/sysdeps/x86/cpu-features.c b/sysdeps/x86/cpu-features.c > > > > index f193ea7a2d..ccf1350b72 100644 > > > > --- a/sysdeps/x86/cpu-features.c > > > > +++ b/sysdeps/x86/cpu-features.c > > > > @@ -27,6 +27,21 @@ > > > > extern void TUNABLE_CALLBACK (set_hwcaps) (tunable_val_t *) > > > > attribute_hidden; > > > > > > > > +#ifdef SHARED > > > > +static void > > > > +TUNABLE_CALLBACK (set_plt_rewrite) (tunable_val_t *valp) > > > > +{ > > > > + if (valp->numval != 0) > > > > + { > > > > + /* Use JMPABS only on APX processors. */ > > > > + const struct cpu_features *cpu_features = __get_cpu_features (); > > > > + GL(dl_x86_feature_control).plt_rewrite > > > > + = (CPU_FEATURE_PRESENT_P (cpu_features, APX_F) > > > > + ? plt_rewrite_jmpabs : plt_rewrite_jmp); > > > > + } > > > > +} > > > > +#endif > > > > + > > > > #ifdef __LP64__ > > > > static void > > > > TUNABLE_CALLBACK (set_prefer_map_32bit_exec) (tunable_val_t *valp) > > > > @@ -1108,7 +1123,10 @@ no_cpuid: > > > > TUNABLE_CALLBACK (set_x86_shstk)); > > > > #endif > > > > > > > > -#ifndef SHARED > > > > +#ifdef SHARED > > > > + TUNABLE_GET (plt_rewrite, tunable_val_t *, > > > > + TUNABLE_CALLBACK (set_plt_rewrite)); > > > > +#else > > > > /* NB: In libc.a, call init_cacheinfo. 
*/ > > > > init_cacheinfo (); > > > > #endif > > > > diff --git a/sysdeps/x86/dl-procruntime.c b/sysdeps/x86/dl-procruntime.c > > > > index 4d25d9f327..15b3d0d878 100644 > > > > --- a/sysdeps/x86/dl-procruntime.c > > > > +++ b/sysdeps/x86/dl-procruntime.c > > > > @@ -67,6 +67,7 @@ PROCINFO_CLASS struct dl_x86_feature_control _dl_x86_feature_control > > > > = { > > > > .ibt = DEFAULT_DL_X86_CET_CONTROL, > > > > .shstk = DEFAULT_DL_X86_CET_CONTROL, > > > > + .plt_rewrite = plt_rewrite_none, > > > > } > > > > # endif > > > > # if !defined SHARED || defined PROCINFO_DECL > > > > diff --git a/sysdeps/x86/dl-tunables.list b/sysdeps/x86/dl-tunables.list > > > > index 147a7270ec..e2e441e1b7 100644 > > > > --- a/sysdeps/x86/dl-tunables.list > > > > +++ b/sysdeps/x86/dl-tunables.list > > > > @@ -66,5 +66,10 @@ glibc { > > > > x86_shared_cache_size { > > > > type: SIZE_T > > > > } > > > > + plt_rewrite { > > > > + type: INT_32 > > > > + minval: 0 > > > > + maxval: 1 > > > > + } > > > should max value be at least 2 given that there are three > > > distinct states (none, jmp, jmpabs). > > > > I will take a look. > > > > > > } > > > > } > > > > diff --git a/sysdeps/x86_64/Makefile b/sysdeps/x86_64/Makefile > > > > index 00120ca9ca..374bca80d0 100644 > > > > --- a/sysdeps/x86_64/Makefile > > > > +++ b/sysdeps/x86_64/Makefile > > > > @@ -1,6 +1,14 @@ > > > > # The i387 `long double' is a distinct type we support. > > > > long-double-fcts = yes > > > > > > > > +ifeq (yes,$(have-z-mark-plt)) > > > > +# Always generate DT_X86_64_PLT* tags. > > > > +sysdep-LDFLAGS += -Wl,-z,mark-plt > > > > +# Never generate DT_X86_64_PLT* tags on ld.so to avoid changing its own > > > > +# PLT. 
> > > > +LDFLAGS-rtld += -Wl,-z,nomark-plt > > > > +endif > > > > + > > > > ifeq ($(subdir),csu) > > > > gen-as-const-headers += link-defines.sym > > > > endif > > > > @@ -175,6 +183,25 @@ ifeq (no,$(build-hardcoded-path-in-tests)) > > > > tests-container += tst-glibc-hwcaps-cache > > > > endif > > > > > > > > +ifeq (yes,$(have-z-mark-plt)) > > > > +tests += \ > > > > + tst-plt-rewrite1 \ > > > > +# tests > > > > +modules-names += \ > > > > + tst-plt-rewritemod1 \ > > > > +# modules-names > > > > + > > > > +tst-plt-rewrite1-no-pie = yes > > > > +LDFLAGS-tst-plt-rewrite1 = -Wl,-z,now > > > > +LDFLAGS-tst-plt-rewritemod1.so = -Wl,-z,now > > > > +tst-plt-rewrite1-ENV = GLIBC_TUNABLES=glibc.cpu.plt_rewrite=1 LD_DEBUG=files:bindings > > > > +$(objpfx)tst-plt-rewrite1: $(objpfx)tst-plt-rewritemod1.so > > > > +$(objpfx)tst-plt-rewrite1.out: /dev/null $(objpfx)tst-plt-rewrite1 > > > > + $(tst-plt-rewrite1-ENV) $(make-test-out) > $@ 2>&1; \ > > > > + grep -q -E "changing 'bar' PLT entry in .*/elf/tst-plt-rewritemod1.so' to direct branch" $@; \ > > > > + $(evaluate-test) > > > > +endif > > > > + > > > > endif # $(subdir) == elf > > > > > > > > ifeq ($(subdir),csu) > > > > diff --git a/sysdeps/x86_64/configure b/sysdeps/x86_64/configure > > > > index b4a80b8035..418cc4a9b8 100755 > > > > --- a/sysdeps/x86_64/configure > > > > +++ b/sysdeps/x86_64/configure > > > > @@ -25,6 +25,41 @@ printf "%s\n" "$libc_cv_cc_mprefer_vector_width" >&6; } > > > > config_vars="$config_vars > > > > config-cflags-mprefer-vector-width = $libc_cv_cc_mprefer_vector_width" > > > > > > > > +{ printf "%s\n" "$as_me:${as_lineno-$LINENO}: checking for linker that supports -z mark-plt" >&5 > > > > +printf %s "checking for linker that supports -z mark-plt... 
" >&6; } > > > > +libc_linker_feature=no > > > > +cat > conftest.c <<EOF > > > > +int _start (void) { return 42; } > > > > +EOF > > > > +if { ac_try='${CC-cc} $CFLAGS $CPPFLAGS $LDFLAGS $no_ssp > > > > + -Wl,-z,mark-plt -nostdlib -nostartfiles > > > > + -fPIC -shared -o conftest.so conftest.c > > > > + 1>&5' > > > > + { { eval echo "\"\$as_me\":${as_lineno-$LINENO}: \"$ac_try\""; } >&5 > > > > + (eval $ac_try) 2>&5 > > > > + ac_status=$? > > > > + printf "%s\n" "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5 > > > > + test $ac_status = 0; }; } > > > > +then > > > > + if ${CC-cc} $CFLAGS $CPPFLAGS $LDFLAGS $no_ssp -Wl,-z,mark-plt -nostdlib \ > > > > + -nostartfiles -fPIC -shared -o conftest.so conftest.c 2>&1 \ > > > > + | grep "warning: -z mark-plt ignored" > /dev/null 2>&1; then > > > > + true > > > > + else > > > > + libc_linker_feature=yes > > > > + fi > > > > +fi > > > > +rm -f conftest* > > > > +if test $libc_linker_feature = yes; then > > > > + libc_cv_z_mark_plt=yes > > > > +else > > > > + libc_cv_z_mark_plt=no > > > > +fi > > > > +{ printf "%s\n" "$as_me:${as_lineno-$LINENO}: result: $libc_linker_feature" >&5 > > > > +printf "%s\n" "$libc_linker_feature" >&6; } > > > > +config_vars="$config_vars > > > > +have-z-mark-plt = $libc_cv_z_mark_plt" > > > > + > > > > if test x"$build_mathvec" = xnotset; then > > > > build_mathvec=yes > > > > fi > > > > diff --git a/sysdeps/x86_64/configure.ac b/sysdeps/x86_64/configure.ac > > > > index 937d1aff7e..d1f803c02e 100644 > > > > --- a/sysdeps/x86_64/configure.ac > > > > +++ b/sysdeps/x86_64/configure.ac > > > > @@ -10,6 +10,10 @@ LIBC_TRY_CC_OPTION([-mprefer-vector-width=128], > > > > LIBC_CONFIG_VAR([config-cflags-mprefer-vector-width], > > > > [$libc_cv_cc_mprefer_vector_width]) > > > > > > > > +LIBC_LINKER_FEATURE([-z mark-plt], [-Wl,-z,mark-plt], > > > > + [libc_cv_z_mark_plt=yes], [libc_cv_z_mark_plt=no]) > > > > +LIBC_CONFIG_VAR([have-z-mark-plt], [$libc_cv_z_mark_plt]) > > > > + > > > > if test 
x"$build_mathvec" = xnotset; then > > > > build_mathvec=yes > > > > fi > > > > diff --git a/sysdeps/x86_64/dl-dtprocnum.h b/sysdeps/x86_64/dl-dtprocnum.h > > > > new file mode 100644 > > > > index 0000000000..cefacb5387 > > > > --- /dev/null > > > > +++ b/sysdeps/x86_64/dl-dtprocnum.h > > > > @@ -0,0 +1,21 @@ > > > > +/* Configuration of lookup functions. x64-64 version. > > > > + Copyright (C) 2024 Free Software Foundation, Inc. > > > > + This file is part of the GNU C Library. > > > > + > > > > + The GNU C Library is free software; you can redistribute it and/or > > > > + modify it under the terms of the GNU Lesser General Public > > > > + License as published by the Free Software Foundation; either > > > > + version 2.1 of the License, or (at your option) any later version. > > > > + > > > > + The GNU C Library is distributed in the hope that it will be useful, > > > > + but WITHOUT ANY WARRANTY; without even the implied warranty of > > > > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > > > > + Lesser General Public License for more details. > > > > + > > > > + You should have received a copy of the GNU Lesser General Public > > > > + License along with the GNU C Library; if not, see > > > > + <https://www.gnu.org/licenses/>. */ > > > > + > > > > +/* Number of extra dynamic section entries for this architecture. By > > > > + default there are none. 
*/ > > > > +#define DT_THISPROCNUM DT_X86_64_NUM > > > > diff --git a/sysdeps/x86_64/dl-machine.h b/sysdeps/x86_64/dl-machine.h > > > > index e0a9b14469..b075482ef4 100644 > > > > --- a/sysdeps/x86_64/dl-machine.h > > > > +++ b/sysdeps/x86_64/dl-machine.h > > > > @@ -22,6 +22,7 @@ > > > > #define ELF_MACHINE_NAME "x86_64" > > > > > > > > #include <assert.h> > > > > +#include <stdint.h> > > > > #include <sys/param.h> > > > > #include <sysdep.h> > > > > #include <tls.h> > > > > @@ -35,6 +36,9 @@ > > > > # define RTLD_START_ENABLE_X86_FEATURES > > > > #endif > > > > > > > > +/* Translate a processor specific dynamic tag to the index in l_info array. */ > > > > +#define DT_X86_64(x) (DT_X86_64_##x - DT_LOPROC + DT_NUM) > > > > + > > > > /* Return nonzero iff ELF header is compatible with the running host. */ > > > > static inline int __attribute__ ((unused)) > > > > elf_machine_matches_host (const ElfW(Ehdr) *ehdr) > > > > @@ -312,8 +316,10 @@ and creates an unsatisfiable circular dependency.\n", > > > > > > > > switch (r_type) > > > > { > > > > - case R_X86_64_GLOB_DAT: > > > > case R_X86_64_JUMP_SLOT: > > > > + map->l_has_jump_slot_reloc = true; > > > > + /* fallthrough */ > > > > + case R_X86_64_GLOB_DAT: > > > > *reloc_addr = value; > > > > break; > > > > > > > > @@ -549,3 +555,211 @@ elf_machine_lazy_rel (struct link_map *map, struct r_scope_elem *scope[], > > > > } > > > > > > > > #endif /* RESOLVE_MAP */ > > > > + > > > > +#if !defined ELF_DYNAMIC_AFTER_RELOC && !defined RTLD_BOOTSTRAP \ > > > > + && defined SHARED > > > > +# define ELF_DYNAMIC_AFTER_RELOC(map, lazy) \ > > > > + x86_64_dynamic_after_reloc (map, (lazy)) > > > > + > > > > +# define JMP32_INSN_OPCODE 0xe9 > > > > +# define JMP32_INSN_SIZE 5 > > > > +# define JMPABS_INSN_OPCODE 0xa100d5 > > > > +# define JMPABS_INSN_SIZE 11 > > > > +# define INT3_INSN_OPCODE 0xcc > > > > + > > > > +static const char * > > > > +x86_64_reloc_symbol_name (struct link_map *map, const ElfW(Rela) *reloc) > > > > +{ > > > 
> + const ElfW(Sym) *const symtab > > > > + = (const void *) map->l_info[DT_SYMTAB]->d_un.d_ptr; > > > > + const ElfW(Sym) *const refsym = &symtab[ELFW (R_SYM) (reloc->r_info)]; > > > > + const char *strtab = (const char *) map->l_info[DT_STRTAB]->d_un.d_ptr; > > > > + return strtab + refsym->st_name; > > > > +} > > > > + > > > > +static void > > > > +x86_64_rewrite_plt (struct link_map *map, ElfW(Addr) plt_rewrite) > > > > +{ > > > > + ElfW(Addr) l_addr = map->l_addr; > > > > + ElfW(Addr) pltent = map->l_info[DT_X86_64 (PLTENT)]->d_un.d_val; > > > > + ElfW(Addr) start = map->l_info[DT_JMPREL]->d_un.d_ptr; > > > > + ElfW(Addr) size = map->l_info[DT_PLTRELSZ]->d_un.d_val; > > > > + const ElfW(Rela) *reloc = (const void *) start; > > > > + const ElfW(Rela) *reloc_end = (const void *) (start + size); > > > > + > > > > + unsigned int feature_1 = THREAD_GETMEM (THREAD_SELF, > > > > + header.feature_1); > > > > + bool ibt_enabled_p > > > > + = (feature_1 & GNU_PROPERTY_X86_FEATURE_1_IBT) != 0; > > > > + > > > > + if (__glibc_unlikely (GLRO(dl_debug_mask) & DL_DEBUG_FILES)) > > > > + _dl_debug_printf ("\nchanging PLT in '%s' to direct branch\n", > > > > + DSO_FILENAME (map->l_name)); > > > > + > > > > + for (; reloc < reloc_end; reloc++) > > > > + if (ELFW(R_TYPE) (reloc->r_info) == R_X86_64_JUMP_SLOT) > > > > + { > > > > + /* Get the value from the GOT entry. */ > > > > + ElfW(Addr) value = *(ElfW(Addr) *) (l_addr + reloc->r_offset); > > > > + > > > > + /* Get the corresponding PLT entry from r_addend. */ > > > > + ElfW(Addr) branch_start = l_addr + reloc->r_addend; > > > > + /* Skip ENDBR64 if IBT isn't enabled. */ > > > > + if (!ibt_enabled_p) > > > > + branch_start = ALIGN_DOWN (branch_start, pltent); > > > > > > Will the only preceding code always be the ENDBR64? > > > If so why not just replace the alignment stuff with `- ENDBR64_INSN_SIZE`? > > > > ENDBR64 isn't required. If CET isn't enabled, there is no ENDBR64. > > > > > Likewise below. 
> > > > > > > + /* Get the displacement from the branch target. */ > > > > + ElfW(Addr) disp = value - branch_start - JMP32_INSN_SIZE; > > > > + ElfW(Addr) plt_end; > > > > + ElfW(Addr) pad; > > > > + > > > > + plt_end = (branch_start & -pltent) + pltent; > > > for this to make sense `pltent` needs to a power of 2 which is not > > > checked below in `x86_64_dynamic_after_reloc`. Likewise at > > > the ALIGN_DOWN above. > > > > It is safe. From > > > > https://gitlab.com/x86-psABIs/x86-64-ABI/-/commit/6d824a52a42d173eb838b879616c1be5870b593e > > > > there is "All entries have the same size and are aligned to the entry size.". > > > but the method of aligning to X by doing: `V & -X` only works if X is a power > of two. We should update the psABI. > > > also better codegen: `(branch_start | (pltent - 1)) + 1` > > > > + > > > > + /* Update the PLT entry. */ > > > > + if (((uint64_t) disp + (uint64_t) ((uint32_t) INT32_MIN)) > > > > + <= (uint64_t) UINT32_MAX) > > > > + { > > > > + pad = branch_start + JMP32_INSN_SIZE; > > > > + > > > > + if (__glibc_unlikely (pad > plt_end)) > > > > + continue; > > > > + > > > > + /* If the target branch can be reached with a direct branch, > > > > + rewrite the PLT entry with a direct branch. */ > > > > + if (__glibc_unlikely (GLRO(dl_debug_mask) & DL_DEBUG_BINDINGS)) > > > > + { > > > > + const char *sym_name = x86_64_reloc_symbol_name (map, > > > > + reloc); > > > > + _dl_debug_printf ("changing '%s' PLT entry in '%s' to " > > > > + "direct branch\n", sym_name, > > > > + DSO_FILENAME (map->l_name)); > > > > + } > > > > + > > > > + /* Write out direct branch. 
*/ > > > > + *(uint8_t *) branch_start = JMP32_INSN_OPCODE; > > > > + *((uint32_t *) (branch_start + 1)) = disp; > > > > + } > > > > + else > > > > + { > > > > + if (GL(dl_x86_feature_control).plt_rewrite > > > > + != plt_rewrite_jmpabs) > > > > + { > > > > + if (__glibc_unlikely (GLRO(dl_debug_mask) > > > > + & DL_DEBUG_BINDINGS)) > > > > + { > > > > + const char *sym_name > > > > + = x86_64_reloc_symbol_name (map, reloc); > > > > + _dl_debug_printf ("skipping '%s' PLT entry in '%s'\n", > > > > + sym_name, > > > > + DSO_FILENAME (map->l_name)); > > > > + } > > > > + continue; > > > > + } > > > > + > > > > + pad = branch_start + JMPABS_INSN_SIZE; > > > > + > > > > + if (__glibc_unlikely (pad > plt_end)) > > > > + continue; > > > > + > > > > + /* Rewrite the PLT entry with JMPABS. */ > > > > + if (__glibc_unlikely (GLRO(dl_debug_mask) & DL_DEBUG_BINDINGS)) > > > > + { > > > > + const char *sym_name = x86_64_reloc_symbol_name (map, > > > > + reloc); > > > > + _dl_debug_printf ("changing '%s' PLT entry in '%s' to " > > > > + "JMPABS\n", sym_name, > > > > + DSO_FILENAME (map->l_name)); > > > > + } > > > > + > > > > + /* "jmpabs $target" for 64-bit displacement. NB: JMPABS has > > > > + a 3-byte opcode + 64bit address. There is a 1-byte overlap > > > > + between 4-byte write and 8-byte write. */ > > > > + *(uint32_t *) (branch_start) = JMPABS_INSN_OPCODE; > > > > + *(uint64_t *) (branch_start + 3) = value; > > > > + } > > > > + > > > > + /* Fill the unused part of the PLT entry with INT3. */ > > > > + for (; pad < plt_end; pad++) > > > > + *(uint8_t *) pad = INT3_INSN_OPCODE; > > > nit: can you put braces around this for loop? Its a bit deceiving > > > to the eye with the brace directly after it. > > > > It is quite clear to me since I wrote it. Can you be more specific? > > just seeing: > `for(;;) }` its easy to imagine it as `for(;;) {}`. its not an issue. > feel free to ignore. I will keep it as is. 
> > > > > > + } > > > > +} > > > > + > > > > +static inline void > > > > +x86_64_rewrite_plt_in_place (struct link_map *map) > > > > +{ > > > > + /* Adjust DT_X86_64_PLT address and DT_X86_64_PLTSZ values. */ > > > > + ElfW(Addr) plt = (map->l_info[DT_X86_64 (PLT)]->d_un.d_ptr > > > > + + map->l_addr); > > > > + size_t pagesize = GLRO(dl_pagesize); > > > > + ElfW(Addr) plt_aligned = ALIGN_DOWN (plt, pagesize); > > > > + size_t pltsz = (map->l_info[DT_X86_64 (PLTSZ)]->d_un.d_val > > > > + + plt - plt_aligned); > > > > + > > > > + if (__glibc_unlikely (GLRO(dl_debug_mask) & DL_DEBUG_FILES)) > > > > + _dl_debug_printf ("\nchanging PLT in '%s' to writable\n", > > > > + DSO_FILENAME (map->l_name)); > > > > + > > > > + if (__glibc_unlikely (__mprotect ((void *) plt_aligned, pltsz, > > > > + PROT_WRITE | PROT_READ) < 0)) > > > > + { > > > > + if (__glibc_unlikely (GLRO(dl_debug_mask) & DL_DEBUG_FILES)) > > > > + _dl_debug_printf ("\nfailed to change PLT in '%s' to writable\n", > > > > + DSO_FILENAME (map->l_name)); > > > > + return; > > > > + } > > > > + > > > > + x86_64_rewrite_plt (map, plt_aligned); > > > > + > > > > + if (__glibc_unlikely (GLRO(dl_debug_mask) & DL_DEBUG_FILES)) > > > > + _dl_debug_printf ("\nchanging PLT in '%s' back to read-only\n", > > > > + DSO_FILENAME (map->l_name)); > > > > + > > > > + if (__glibc_unlikely (__mprotect ((void *) plt_aligned, pltsz, > > > > + PROT_EXEC | PROT_READ) < 0)) > > > > + _dl_signal_error (0, DSO_FILENAME (map->l_name), NULL, > > > > + "failed to change PLT back to read-only"); > > > > +} > > > > + > > > > +/* Rewrite PLT entries to direct branch if possible. */ > > > > + > > > > +static inline void > > > > +x86_64_dynamic_after_reloc (struct link_map *map, int lazy) > > > > +{ > > > > + /* Ignore DT_X86_64_PLT if the lazy binding is enabled. */ > > > > + if (lazy != 0) > > > > + return; > > > > + > > > > + /* Ignore DT_X86_64_PLT if PLT rewrite isn't enabled. 
*/ > > > > + if (__glibc_likely (GL(dl_x86_feature_control).plt_rewrite > > > > + == plt_rewrite_none)) > > > > + return; > > > > + > > > > + if (__glibc_likely (map->l_info[DT_X86_64 (PLT)] == NULL)) > > > > + return; > > > > + > > > > + /* Ignore DT_X86_64_PLT if there is no R_X86_64_JUMP_SLOT. */ > > > > + if (map->l_has_jump_slot_reloc == 0) > > > > + return; > > > > + > > > > + /* Ignore DT_X86_64_PLT if > > > > + 1. DT_JMPREL isn't available or its value is 0. > > > > + 2. DT_PLTRELSZ is 0. > > > > + 3. DT_X86_64_PLTENT isn't available or its value is smaller than > > > > + 16 bytes. > > > > + 4. DT_X86_64_PLTSZ isn't available or its value is smaller than > > > > + DT_X86_64_PLTENT's value or isn't a multiple of DT_X86_64_PLTENT's > > > > + value. */ > > > > + if (map->l_info[DT_JMPREL] == NULL > > > > + || map->l_info[DT_JMPREL]->d_un.d_ptr == 0 > > > > + || map->l_info[DT_PLTRELSZ]->d_un.d_val == 0 > > > > + || map->l_info[DT_X86_64 (PLTSZ)] == NULL > > > > + || map->l_info[DT_X86_64 (PLTENT)] == NULL > > > > + || map->l_info[DT_X86_64 (PLTENT)]->d_un.d_val < 16 > > > > + || (map->l_info[DT_X86_64 (PLTSZ)]->d_un.d_val > > > > + < map->l_info[DT_X86_64 (PLTENT)]->d_un.d_val) > > > > + || (map->l_info[DT_X86_64 (PLTSZ)]->d_un.d_val > > > > + % map->l_info[DT_X86_64 (PLTENT)]->d_un.d_val) != 0) > > > > + return; > > > > + > > > > + x86_64_rewrite_plt_in_place (map); > > > > +} > > > > +#endif > > > > diff --git a/sysdeps/x86_64/link_map.h b/sysdeps/x86_64/link_map.h > > > > new file mode 100644 > > > > index 0000000000..537f56ace5 > > > > --- /dev/null > > > > +++ b/sysdeps/x86_64/link_map.h > > > > @@ -0,0 +1,22 @@ > > > > +/* Additional fields in struct link_map. x86-64 version. > > > > + Copyright (C) 2024 Free Software Foundation, Inc. > > > > + This file is part of the GNU C Library. 
> > > > + > > > > + The GNU C Library is free software; you can redistribute it and/or > > > > + modify it under the terms of the GNU Lesser General Public > > > > + License as published by the Free Software Foundation; either > > > > + version 2.1 of the License, or (at your option) any later version. > > > > + > > > > + The GNU C Library is distributed in the hope that it will be useful, > > > > + but WITHOUT ANY WARRANTY; without even the implied warranty of > > > > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > > > > + Lesser General Public License for more details. > > > > + > > > > + You should have received a copy of the GNU Lesser General Public > > > > + License along with the GNU C Library; if not, see > > > > + <https://www.gnu.org/licenses/>. */ > > > > + > > > > +/* Has R_X86_64_JUMP_SLOT relocation. */ > > > > +bool l_has_jump_slot_reloc; > > > > + > > > > +#include <sysdeps/x86/link_map.h> > > > > diff --git a/sysdeps/x86_64/tst-plt-rewrite1.c b/sysdeps/x86_64/tst-plt-rewrite1.c > > > > new file mode 100644 > > > > index 0000000000..86785957e2 > > > > --- /dev/null > > > > +++ b/sysdeps/x86_64/tst-plt-rewrite1.c > > > > @@ -0,0 +1,31 @@ > > > > +/* Test PLT rewrite. > > > > + Copyright (C) 2024 Free Software Foundation, Inc. > > > > + This file is part of the GNU C Library. > > > > + > > > > + The GNU C Library is free software; you can redistribute it and/or > > > > + modify it under the terms of the GNU Lesser General Public > > > > + License as published by the Free Software Foundation; either > > > > + version 2.1 of the License, or (at your option) any later version. > > > > + > > > > + The GNU C Library is distributed in the hope that it will be useful, > > > > + but WITHOUT ANY WARRANTY; without even the implied warranty of > > > > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > > > > + Lesser General Public License for more details. 
> > > > + > > > > + You should have received a copy of the GNU Lesser General Public > > > > + License along with the GNU C Library; if not, see > > > > + <https://www.gnu.org/licenses/>. */ > > > > + > > > > +#include <string.h> > > > > +#include <support/check.h> > > > > + > > > > +extern const char *foo (void); > > > > + > > > > +static int > > > > +do_test (void) > > > > +{ > > > > + TEST_COMPARE (strcmp (foo (), "PLT rewrite works"), 0); > > > > + return 0; > > > > +} > > > > + > > > > +#include <support/test-driver.c> > > > > diff --git a/sysdeps/x86_64/tst-plt-rewritemod1.c b/sysdeps/x86_64/tst-plt-rewritemod1.c > > > > new file mode 100644 > > > > index 0000000000..99f21fba5a > > > > --- /dev/null > > > > +++ b/sysdeps/x86_64/tst-plt-rewritemod1.c > > > > @@ -0,0 +1,32 @@ > > > > +/* Check PLT rewrite works correctly. > > > > + Copyright (C) 2024 Free Software Foundation, Inc. > > > > + This file is part of the GNU C Library. > > > > + > > > > + The GNU C Library is free software; you can redistribute it and/or > > > > + modify it under the terms of the GNU Lesser General Public > > > > + License as published by the Free Software Foundation; either > > > > + version 2.1 of the License, or (at your option) any later version. > > > > + > > > > + The GNU C Library is distributed in the hope that it will be useful, > > > > + but WITHOUT ANY WARRANTY; without even the implied warranty of > > > > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > > > > + Lesser General Public License for more details. > > > > + > > > > + You should have received a copy of the GNU Lesser General Public > > > > + License along with the GNU C Library; if not, see > > > > + <https://www.gnu.org/licenses/>. */ > > > > + > > > > +/* foo calls bar with indirect branch via PLT. PLT rewrite should > > > > + change it to direct branch. 
*/
> > > > +
> > > > +const char *
> > > > +bar (void)
> > > > +{
> > > > +  return "PLT rewrite works";
> > > > +}
> > > > +
> > > > +const char *
> > > > +foo (void)
> > > > +{
> > > > +  return bar ();
> > > > +}
> > > > --
> > > > 2.43.0
> > > >
> > >
> > > can you run clang-format on your changes as a whole?
> >
> > Will do.

I tried clang-format. It winds up changing the existing code format,
like ElfW(Addr) to ElfW (Addr).
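
For reference, a minimal sketch of the alignment point raised in the review above. The helper names (entry_end_mask, entry_end_or, entry_end_ref, is_power_of_2) are hypothetical and not part of the patch, and the addresses in the note below are made up for illustration. It compares the `(branch_start & -pltent) + pltent` form used in x86_64_rewrite_plt with the suggested `(branch_start | (pltent - 1)) + 1` form against a division-based reference that does not assume a power-of-two entry size.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers for illustration only; none of these names
   appear in the patch.  */

/* End of the PLT entry containing ADDR, in the form the patch uses.
   Only correct when PLTENT is a power of two.  */
static uint64_t
entry_end_mask (uint64_t addr, uint64_t pltent)
{
  return (addr & -pltent) + pltent;
}

/* The form suggested in review.  Also assumes a power-of-two PLTENT.  */
static uint64_t
entry_end_or (uint64_t addr, uint64_t pltent)
{
  return (addr | (pltent - 1)) + 1;
}

/* Reference: the next multiple of PLTENT strictly above ADDR.  Works
   for any non-zero PLTENT, assuming entries start at multiples of
   PLTENT.  */
static uint64_t
entry_end_ref (uint64_t addr, uint64_t pltent)
{
  return (addr / pltent + 1) * pltent;
}

/* A check of the kind the review asks for before relying on masking.  */
static bool
is_power_of_2 (uint64_t x)
{
  return x != 0 && (x & (x - 1)) == 0;
}
```

With a 16-byte entry all three forms agree: for address 0x1004 each returns 0x1010. With a 24-byte entry they do not: for address 0x1010 the mask form returns 0x1018 while the reference returns 0x1020, which is why the masking is only safe if the entry size is known to be a power of two.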
On Thu, Jan 4, 2024 at 7:12 PM H.J. Lu <hjl.tools@gmail.com> wrote: > > On Thu, Jan 4, 2024 at 6:46 PM Noah Goldstein <goldstein.w.n@gmail.com> wrote: > > > > On Thu, Jan 4, 2024 at 6:14 PM H.J. Lu <hjl.tools@gmail.com> wrote: > > > > > > On Thu, Jan 4, 2024 at 6:00 PM Noah Goldstein <goldstein.w.n@gmail.com> wrote: > > > > > > > > On Thu, Jan 4, 2024 at 8:01 AM H.J. Lu <hjl.tools@gmail.com> wrote: > > > > > > > > > > Changes in v6: > > > > > > > > > > 1. Print skipped PLT entries. > > > > > 2. Always generate DT_X86_64_PLT* tags when supported. > > > > > 3. Never generate DT_X86_64_PLT* tags on ld.so. > > > > > 4. Don't check for ld.so in x86_64_dynamic_after_reloc to reduce the > > > > > run-time overhead. > > > > > > > > > > Changes in v5: > > > > > > > > > > 1. Remove plt_rewrite_enabled and set plt_rewrite in set_plt_rewrite to > > > > > reduce the run-time overhead. > > > > > 2. Use __glibc_likely and __glibc_unlikely in x86_64_dynamic_after_reloc > > > > > to reduce the run-time overhead. > > > > > > > > > > Changes in v4: > > > > > > > > > > 1. Update set_plt_rewrite. > > > > > 2. Add /* fallthrough */ to case R_X86_64_JUMP_SLOT:. > > > > > 3. Remove plt_rewrite_bias since it is always zero. > > > > > 4. Skip and silently ignore unsupported r_addend. > > > > > 5. Update copyright year to 2024. > > > > > 6. Drop the mprotect (PROT_EXEC | PROT_READ) check. > > > > > > > > > > Changes in v3: > > > > > > > > > > 1. Define and use macros for instruction opcodes and sizes. > > > > > 2. Use INT32_MIN and UINT32_MAX instead of 0x80000000ULL and > > > > > 0xffffffffULL. > > > > > 3. Replace 3 1-byte writes with a 4-byte write to write JMPABS. > > > > > 4. Verify that DT_X86_64_PLTSZ is a multiple of DT_X86_64_PLTENT. > > > > > 5. Add some comments for mprotect test. > > > > > 6. Update plt_rewrite logic. > > > > > > > > > > Add ELF_DYNAMIC_AFTER_RELOC to allow target specific processing after > > > > > relocation. 
> > > > > > > > > > For x86-64, add > > > > > > > > > > #define DT_X86_64_PLT (DT_LOPROC + 0) > > > > > #define DT_X86_64_PLTSZ (DT_LOPROC + 1) > > > > > #define DT_X86_64_PLTENT (DT_LOPROC + 3) > > > > > > > > > > 1. DT_X86_64_PLT: The address of the procedure linkage table. > > > > > 2. DT_X86_64_PLTSZ: The total size, in bytes, of the procedure linkage > > > > > table. > > > > > 3. DT_X86_64_PLTENT: The size, in bytes, of a procedure linkage table > > > > > entry. > > > > > > > > > > With the r_addend field of the R_X86_64_JUMP_SLOT relocation set to the > > > > > memory offset of the indirect branch instruction. > > > > > > > > > > Define ELF_DYNAMIC_AFTER_RELOC for x86-64 to rewrite the PLT section > > > > > with direct branch after relocation when the lazy binding is disabled. > > > > > > > > > > PLT rewrite is disabled by default since SELinux may disallow modifying > > > > > code pages and ld.so can't detect it in all cases. Add > > > > > > > > > > $ GLIBC_TUNABLES=glibc.cpu.plt_rewrite=1 > > > > > > > > > > to enable PLT rewrite at run-time. 
> > > > > --- > > > > > elf/dynamic-link.h | 5 + > > > > > elf/elf.h | 5 + > > > > > elf/tst-glibcelf.py | 1 + > > > > > manual/tunables.texi | 9 ++ > > > > > scripts/glibcelf.py | 4 + > > > > > sysdeps/x86/cet-control.h | 12 ++ > > > > > sysdeps/x86/cpu-features.c | 20 ++- > > > > > sysdeps/x86/dl-procruntime.c | 1 + > > > > > sysdeps/x86/dl-tunables.list | 5 + > > > > > sysdeps/x86_64/Makefile | 27 ++++ > > > > > sysdeps/x86_64/configure | 35 +++++ > > > > > sysdeps/x86_64/configure.ac | 4 + > > > > > sysdeps/x86_64/dl-dtprocnum.h | 21 +++ > > > > > sysdeps/x86_64/dl-machine.h | 216 ++++++++++++++++++++++++++- > > > > > sysdeps/x86_64/link_map.h | 22 +++ > > > > > sysdeps/x86_64/tst-plt-rewrite1.c | 31 ++++ > > > > > sysdeps/x86_64/tst-plt-rewritemod1.c | 32 ++++ > > > > > 17 files changed, 448 insertions(+), 2 deletions(-) > > > > > create mode 100644 sysdeps/x86_64/dl-dtprocnum.h > > > > > create mode 100644 sysdeps/x86_64/link_map.h > > > > > create mode 100644 sysdeps/x86_64/tst-plt-rewrite1.c > > > > > create mode 100644 sysdeps/x86_64/tst-plt-rewritemod1.c > > > > > > > > > > diff --git a/elf/dynamic-link.h b/elf/dynamic-link.h > > > > > index 8cdf7bde09..83d834ecaf 100644 > > > > > --- a/elf/dynamic-link.h > > > > > +++ b/elf/dynamic-link.h > > > > > @@ -177,6 +177,10 @@ elf_machine_lazy_rel (struct link_map *map, struct r_scope_elem *scope[], > > > > > } \ > > > > > } while (0); > > > > > > > > > > +# ifndef ELF_DYNAMIC_AFTER_RELOC > > > > > +# define ELF_DYNAMIC_AFTER_RELOC(map, lazy) > > > > > +# endif > > > > > + > > > > > /* This can't just be an inline function because GCC is too dumb > > > > > to inline functions containing inlines themselves. 
*/ > > > > > # ifdef RTLD_BOOTSTRAP > > > > > @@ -192,6 +196,7 @@ elf_machine_lazy_rel (struct link_map *map, struct r_scope_elem *scope[], > > > > > ELF_DYNAMIC_DO_RELR (map); \ > > > > > ELF_DYNAMIC_DO_REL ((map), (scope), edr_lazy, skip_ifunc); \ > > > > > ELF_DYNAMIC_DO_RELA ((map), (scope), edr_lazy, skip_ifunc); \ > > > > > + ELF_DYNAMIC_AFTER_RELOC ((map), (edr_lazy)); \ > > > > > } while (0) > > > > > > > > > > #endif > > > > > diff --git a/elf/elf.h b/elf/elf.h > > > > > index ca6a7d9d67..455731663c 100644 > > > > > --- a/elf/elf.h > > > > > +++ b/elf/elf.h > > > > > @@ -3639,6 +3639,11 @@ enum > > > > > /* x86-64 sh_type values. */ > > > > > #define SHT_X86_64_UNWIND 0x70000001 /* Unwind information. */ > > > > > > > > > > +/* x86-64 d_tag values. */ > > > > > +#define DT_X86_64_PLT (DT_LOPROC + 0) > > > > > +#define DT_X86_64_PLTSZ (DT_LOPROC + 1) > > > > > +#define DT_X86_64_PLTENT (DT_LOPROC + 3) > > > > > +#define DT_X86_64_NUM 4 > > > > > > > > > > /* AM33 relocations. */ > > > > > #define R_MN10300_NONE 0 /* No reloc. 
*/ > > > > > diff --git a/elf/tst-glibcelf.py b/elf/tst-glibcelf.py > > > > > index 00cd2bba85..c191636a99 100644 > > > > > --- a/elf/tst-glibcelf.py > > > > > +++ b/elf/tst-glibcelf.py > > > > > @@ -187,6 +187,7 @@ DT_VALNUM > > > > > DT_VALRNGHI > > > > > DT_VALRNGLO > > > > > DT_VERSIONTAGNUM > > > > > +DT_X86_64_NUM > > > > > ELFCLASSNUM > > > > > ELFDATANUM > > > > > EM_NUM > > > > > diff --git a/manual/tunables.texi b/manual/tunables.texi > > > > > index b31f16da84..f9bd83622e 100644 > > > > > --- a/manual/tunables.texi > > > > > +++ b/manual/tunables.texi > > > > > @@ -57,6 +57,7 @@ glibc.pthread.stack_cache_size: 0x2800000 (min: 0x0, max: 0xffffffffffffffff) > > > > > glibc.cpu.hwcap_mask: 0x6 (min: 0x0, max: 0xffffffffffffffff) > > > > > glibc.malloc.mmap_max: 0 (min: 0, max: 2147483647) > > > > > glibc.elision.skip_trylock_internal_abort: 3 (min: 0, max: 2147483647) > > > > > +glibc.cpu.plt_rewrite: 0 (min: 0, max: 1) > > > > > glibc.malloc.tcache_unsorted_limit: 0x0 (min: 0x0, max: 0xffffffffffffffff) > > > > > glibc.cpu.x86_ibt: > > > > > glibc.cpu.hwcaps: > > > > > @@ -614,6 +615,14 @@ this tunable. > > > > > This tunable is specific to 64-bit x86-64. > > > > > @end deftp > > > > > > > > > > +@deftp Tunable glibc.cpu.plt_rewrite > > > > > +When this tunable is set to @code{1}, the dynamic linker will rewrite > > > > > +the PLT section with direct branch after relocation if possible when > > > > > +the lazy binding is disabled. > > > > > + > > > > > > > > This doesn't read well. Maybe > > > > > > > > When this tunable is set to @code{1}, the dynamic linker will attempt > > > > to rewrite the PLT section with a direct branch after relocation. It may > > > > fail to do so, for example when lazy binding is enabled. > > > > > > Will update. > > > > > > > > +This tunable is specific to x86-64. 
> > > > > +@end deftp > > > > > + > > > > > @node Memory Related Tunables > > > > > @section Memory Related Tunables > > > > > @cindex memory related tunables > > > > > diff --git a/scripts/glibcelf.py b/scripts/glibcelf.py > > > > > index c5e5dda48e..5f3813f326 100644 > > > > > --- a/scripts/glibcelf.py > > > > > +++ b/scripts/glibcelf.py > > > > > @@ -439,6 +439,8 @@ class DtRISCV(Dt): > > > > > """Supplemental DT_* constants for EM_RISCV.""" > > > > > class DtSPARC(Dt): > > > > > """Supplemental DT_* constants for EM_SPARC.""" > > > > > +class DtX86_64(Dt): > > > > > + """Supplemental DT_* constants for EM_X86_64.""" > > > > > _dt_skip = ''' > > > > > DT_ENCODING DT_PROCNUM > > > > > DT_ADDRRNGLO DT_ADDRRNGHI DT_ADDRNUM > > > > > @@ -451,6 +453,7 @@ DT_MIPS_NUM > > > > > DT_PPC_NUM > > > > > DT_PPC64_NUM > > > > > DT_SPARC_NUM > > > > > +DT_X86_64_NUM > > > > > '''.strip().split() > > > > > _register_elf_h(DtAARCH64, prefix='DT_AARCH64_', skip=_dt_skip, parent=Dt) > > > > > _register_elf_h(DtALPHA, prefix='DT_ALPHA_', skip=_dt_skip, parent=Dt) > > > > > @@ -461,6 +464,7 @@ _register_elf_h(DtPPC, prefix='DT_PPC_', skip=_dt_skip, parent=Dt) > > > > > _register_elf_h(DtPPC64, prefix='DT_PPC64_', skip=_dt_skip, parent=Dt) > > > > > _register_elf_h(DtRISCV, prefix='DT_RISCV_', skip=_dt_skip, parent=Dt) > > > > > _register_elf_h(DtSPARC, prefix='DT_SPARC_', skip=_dt_skip, parent=Dt) > > > > > +_register_elf_h(DtX86_64, prefix='DT_X86_64_', skip=_dt_skip, parent=Dt) > > > > > _register_elf_h(Dt, skip=_dt_skip, ranges=True) > > > > > del _dt_skip > > > > > > > > > > diff --git a/sysdeps/x86/cet-control.h b/sysdeps/x86/cet-control.h > > > > > index a45d59bf8c..81e7bb4bd8 100644 > > > > > --- a/sysdeps/x86/cet-control.h > > > > > +++ b/sysdeps/x86/cet-control.h > > > > > @@ -32,10 +32,22 @@ enum dl_x86_cet_control > > > > > cet_permissive > > > > > }; > > > > > > > > > > +/* PLT rewrite control. 
*/ > > > > > +enum dl_plt_rewrite_control > > > > > +{ > > > > > + /* No PLT rewrite. */ > > > > > + plt_rewrite_none, > > > > > + /* Rewrite PLT with JMP at run-time. */ > > > > > + plt_rewrite_jmp, > > > > > + /* Rewrite PLT with JMPABS at run-time. */ > > > > > + plt_rewrite_jmpabs > > > > > +}; > > > > > + > > > > > struct dl_x86_feature_control > > > > > { > > > > > enum dl_x86_cet_control ibt : 2; > > > > > enum dl_x86_cet_control shstk : 2; > > > > > + enum dl_plt_rewrite_control plt_rewrite : 2; > > > > > }; > > > > > > > > > > #endif /* cet-control.h */ > > > > > diff --git a/sysdeps/x86/cpu-features.c b/sysdeps/x86/cpu-features.c > > > > > index f193ea7a2d..ccf1350b72 100644 > > > > > --- a/sysdeps/x86/cpu-features.c > > > > > +++ b/sysdeps/x86/cpu-features.c > > > > > @@ -27,6 +27,21 @@ > > > > > extern void TUNABLE_CALLBACK (set_hwcaps) (tunable_val_t *) > > > > > attribute_hidden; > > > > > > > > > > +#ifdef SHARED > > > > > +static void > > > > > +TUNABLE_CALLBACK (set_plt_rewrite) (tunable_val_t *valp) > > > > > +{ > > > > > + if (valp->numval != 0) > > > > > + { > > > > > + /* Use JMPABS only on APX processors. */ > > > > > + const struct cpu_features *cpu_features = __get_cpu_features (); > > > > > + GL(dl_x86_feature_control).plt_rewrite > > > > > + = (CPU_FEATURE_PRESENT_P (cpu_features, APX_F) > > > > > + ? plt_rewrite_jmpabs : plt_rewrite_jmp); > > > > > + } > > > > > +} > > > > > +#endif > > > > > + > > > > > #ifdef __LP64__ > > > > > static void > > > > > TUNABLE_CALLBACK (set_prefer_map_32bit_exec) (tunable_val_t *valp) > > > > > @@ -1108,7 +1123,10 @@ no_cpuid: > > > > > TUNABLE_CALLBACK (set_x86_shstk)); > > > > > #endif > > > > > > > > > > -#ifndef SHARED > > > > > +#ifdef SHARED > > > > > + TUNABLE_GET (plt_rewrite, tunable_val_t *, > > > > > + TUNABLE_CALLBACK (set_plt_rewrite)); > > > > > +#else > > > > > /* NB: In libc.a, call init_cacheinfo. 
*/ > > > > > init_cacheinfo (); > > > > > #endif > > > > > diff --git a/sysdeps/x86/dl-procruntime.c b/sysdeps/x86/dl-procruntime.c > > > > > index 4d25d9f327..15b3d0d878 100644 > > > > > --- a/sysdeps/x86/dl-procruntime.c > > > > > +++ b/sysdeps/x86/dl-procruntime.c > > > > > @@ -67,6 +67,7 @@ PROCINFO_CLASS struct dl_x86_feature_control _dl_x86_feature_control > > > > > = { > > > > > .ibt = DEFAULT_DL_X86_CET_CONTROL, > > > > > .shstk = DEFAULT_DL_X86_CET_CONTROL, > > > > > + .plt_rewrite = plt_rewrite_none, > > > > > } > > > > > # endif > > > > > # if !defined SHARED || defined PROCINFO_DECL > > > > > diff --git a/sysdeps/x86/dl-tunables.list b/sysdeps/x86/dl-tunables.list > > > > > index 147a7270ec..e2e441e1b7 100644 > > > > > --- a/sysdeps/x86/dl-tunables.list > > > > > +++ b/sysdeps/x86/dl-tunables.list > > > > > @@ -66,5 +66,10 @@ glibc { > > > > > x86_shared_cache_size { > > > > > type: SIZE_T > > > > > } > > > > > + plt_rewrite { > > > > > + type: INT_32 > > > > > + minval: 0 > > > > > + maxval: 1 > > > > > + } > > > > should max value be at least 2 given that there are three > > > > distinct states (none, jmp, jmpabs). > > > > > > I will take a look. > > > > > > > > } > > > > > } > > > > > diff --git a/sysdeps/x86_64/Makefile b/sysdeps/x86_64/Makefile > > > > > index 00120ca9ca..374bca80d0 100644 > > > > > --- a/sysdeps/x86_64/Makefile > > > > > +++ b/sysdeps/x86_64/Makefile > > > > > @@ -1,6 +1,14 @@ > > > > > # The i387 `long double' is a distinct type we support. > > > > > long-double-fcts = yes > > > > > > > > > > +ifeq (yes,$(have-z-mark-plt)) > > > > > +# Always generate DT_X86_64_PLT* tags. > > > > > +sysdep-LDFLAGS += -Wl,-z,mark-plt > > > > > +# Never generate DT_X86_64_PLT* tags on ld.so to avoid changing its own > > > > > +# PLT. 
> > > > > +LDFLAGS-rtld += -Wl,-z,nomark-plt > > > > > +endif > > > > > + > > > > > ifeq ($(subdir),csu) > > > > > gen-as-const-headers += link-defines.sym > > > > > endif > > > > > @@ -175,6 +183,25 @@ ifeq (no,$(build-hardcoded-path-in-tests)) > > > > > tests-container += tst-glibc-hwcaps-cache > > > > > endif > > > > > > > > > > +ifeq (yes,$(have-z-mark-plt)) > > > > > +tests += \ > > > > > + tst-plt-rewrite1 \ > > > > > +# tests > > > > > +modules-names += \ > > > > > + tst-plt-rewritemod1 \ > > > > > +# modules-names > > > > > + > > > > > +tst-plt-rewrite1-no-pie = yes > > > > > +LDFLAGS-tst-plt-rewrite1 = -Wl,-z,now > > > > > +LDFLAGS-tst-plt-rewritemod1.so = -Wl,-z,now > > > > > +tst-plt-rewrite1-ENV = GLIBC_TUNABLES=glibc.cpu.plt_rewrite=1 LD_DEBUG=files:bindings > > > > > +$(objpfx)tst-plt-rewrite1: $(objpfx)tst-plt-rewritemod1.so > > > > > +$(objpfx)tst-plt-rewrite1.out: /dev/null $(objpfx)tst-plt-rewrite1 > > > > > + $(tst-plt-rewrite1-ENV) $(make-test-out) > $@ 2>&1; \ > > > > > + grep -q -E "changing 'bar' PLT entry in .*/elf/tst-plt-rewritemod1.so' to direct branch" $@; \ > > > > > + $(evaluate-test) > > > > > +endif > > > > > + > > > > > endif # $(subdir) == elf > > > > > > > > > > ifeq ($(subdir),csu) > > > > > diff --git a/sysdeps/x86_64/configure b/sysdeps/x86_64/configure > > > > > index b4a80b8035..418cc4a9b8 100755 > > > > > --- a/sysdeps/x86_64/configure > > > > > +++ b/sysdeps/x86_64/configure > > > > > @@ -25,6 +25,41 @@ printf "%s\n" "$libc_cv_cc_mprefer_vector_width" >&6; } > > > > > config_vars="$config_vars > > > > > config-cflags-mprefer-vector-width = $libc_cv_cc_mprefer_vector_width" > > > > > > > > > > +{ printf "%s\n" "$as_me:${as_lineno-$LINENO}: checking for linker that supports -z mark-plt" >&5 > > > > > +printf %s "checking for linker that supports -z mark-plt... 
" >&6; } > > > > > +libc_linker_feature=no > > > > > +cat > conftest.c <<EOF > > > > > +int _start (void) { return 42; } > > > > > +EOF > > > > > +if { ac_try='${CC-cc} $CFLAGS $CPPFLAGS $LDFLAGS $no_ssp > > > > > + -Wl,-z,mark-plt -nostdlib -nostartfiles > > > > > + -fPIC -shared -o conftest.so conftest.c > > > > > + 1>&5' > > > > > + { { eval echo "\"\$as_me\":${as_lineno-$LINENO}: \"$ac_try\""; } >&5 > > > > > + (eval $ac_try) 2>&5 > > > > > + ac_status=$? > > > > > + printf "%s\n" "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5 > > > > > + test $ac_status = 0; }; } > > > > > +then > > > > > + if ${CC-cc} $CFLAGS $CPPFLAGS $LDFLAGS $no_ssp -Wl,-z,mark-plt -nostdlib \ > > > > > + -nostartfiles -fPIC -shared -o conftest.so conftest.c 2>&1 \ > > > > > + | grep "warning: -z mark-plt ignored" > /dev/null 2>&1; then > > > > > + true > > > > > + else > > > > > + libc_linker_feature=yes > > > > > + fi > > > > > +fi > > > > > +rm -f conftest* > > > > > +if test $libc_linker_feature = yes; then > > > > > + libc_cv_z_mark_plt=yes > > > > > +else > > > > > + libc_cv_z_mark_plt=no > > > > > +fi > > > > > +{ printf "%s\n" "$as_me:${as_lineno-$LINENO}: result: $libc_linker_feature" >&5 > > > > > +printf "%s\n" "$libc_linker_feature" >&6; } > > > > > +config_vars="$config_vars > > > > > +have-z-mark-plt = $libc_cv_z_mark_plt" > > > > > + > > > > > if test x"$build_mathvec" = xnotset; then > > > > > build_mathvec=yes > > > > > fi > > > > > diff --git a/sysdeps/x86_64/configure.ac b/sysdeps/x86_64/configure.ac > > > > > index 937d1aff7e..d1f803c02e 100644 > > > > > --- a/sysdeps/x86_64/configure.ac > > > > > +++ b/sysdeps/x86_64/configure.ac > > > > > @@ -10,6 +10,10 @@ LIBC_TRY_CC_OPTION([-mprefer-vector-width=128], > > > > > LIBC_CONFIG_VAR([config-cflags-mprefer-vector-width], > > > > > [$libc_cv_cc_mprefer_vector_width]) > > > > > > > > > > +LIBC_LINKER_FEATURE([-z mark-plt], [-Wl,-z,mark-plt], > > > > > + [libc_cv_z_mark_plt=yes], [libc_cv_z_mark_plt=no]) > > > > > 
+LIBC_CONFIG_VAR([have-z-mark-plt], [$libc_cv_z_mark_plt]) > > > > > + > > > > > if test x"$build_mathvec" = xnotset; then > > > > > build_mathvec=yes > > > > > fi > > > > > diff --git a/sysdeps/x86_64/dl-dtprocnum.h b/sysdeps/x86_64/dl-dtprocnum.h > > > > > new file mode 100644 > > > > > index 0000000000..cefacb5387 > > > > > --- /dev/null > > > > > +++ b/sysdeps/x86_64/dl-dtprocnum.h > > > > > @@ -0,0 +1,21 @@ > > > > > +/* Configuration of lookup functions. x64-64 version. > > > > > + Copyright (C) 2024 Free Software Foundation, Inc. > > > > > + This file is part of the GNU C Library. > > > > > + > > > > > + The GNU C Library is free software; you can redistribute it and/or > > > > > + modify it under the terms of the GNU Lesser General Public > > > > > + License as published by the Free Software Foundation; either > > > > > + version 2.1 of the License, or (at your option) any later version. > > > > > + > > > > > + The GNU C Library is distributed in the hope that it will be useful, > > > > > + but WITHOUT ANY WARRANTY; without even the implied warranty of > > > > > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > > > > > + Lesser General Public License for more details. > > > > > + > > > > > + You should have received a copy of the GNU Lesser General Public > > > > > + License along with the GNU C Library; if not, see > > > > > + <https://www.gnu.org/licenses/>. */ > > > > > + > > > > > +/* Number of extra dynamic section entries for this architecture. By > > > > > + default there are none. 
*/ > > > > > +#define DT_THISPROCNUM DT_X86_64_NUM > > > > > diff --git a/sysdeps/x86_64/dl-machine.h b/sysdeps/x86_64/dl-machine.h > > > > > index e0a9b14469..b075482ef4 100644 > > > > > --- a/sysdeps/x86_64/dl-machine.h > > > > > +++ b/sysdeps/x86_64/dl-machine.h > > > > > @@ -22,6 +22,7 @@ > > > > > #define ELF_MACHINE_NAME "x86_64" > > > > > > > > > > #include <assert.h> > > > > > +#include <stdint.h> > > > > > #include <sys/param.h> > > > > > #include <sysdep.h> > > > > > #include <tls.h> > > > > > @@ -35,6 +36,9 @@ > > > > > # define RTLD_START_ENABLE_X86_FEATURES > > > > > #endif > > > > > > > > > > +/* Translate a processor specific dynamic tag to the index in l_info array. */ > > > > > +#define DT_X86_64(x) (DT_X86_64_##x - DT_LOPROC + DT_NUM) > > > > > + > > > > > /* Return nonzero iff ELF header is compatible with the running host. */ > > > > > static inline int __attribute__ ((unused)) > > > > > elf_machine_matches_host (const ElfW(Ehdr) *ehdr) > > > > > @@ -312,8 +316,10 @@ and creates an unsatisfiable circular dependency.\n", > > > > > > > > > > switch (r_type) > > > > > { > > > > > - case R_X86_64_GLOB_DAT: > > > > > case R_X86_64_JUMP_SLOT: > > > > > + map->l_has_jump_slot_reloc = true; > > > > > + /* fallthrough */ > > > > > + case R_X86_64_GLOB_DAT: > > > > > *reloc_addr = value; > > > > > break; > > > > > > > > > > @@ -549,3 +555,211 @@ elf_machine_lazy_rel (struct link_map *map, struct r_scope_elem *scope[], > > > > > } > > > > > > > > > > #endif /* RESOLVE_MAP */ > > > > > + > > > > > +#if !defined ELF_DYNAMIC_AFTER_RELOC && !defined RTLD_BOOTSTRAP \ > > > > > + && defined SHARED > > > > > +# define ELF_DYNAMIC_AFTER_RELOC(map, lazy) \ > > > > > + x86_64_dynamic_after_reloc (map, (lazy)) > > > > > + > > > > > +# define JMP32_INSN_OPCODE 0xe9 > > > > > +# define JMP32_INSN_SIZE 5 > > > > > +# define JMPABS_INSN_OPCODE 0xa100d5 > > > > > +# define JMPABS_INSN_SIZE 11 > > > > > +# define INT3_INSN_OPCODE 0xcc > > > > > + > > > > > +static const 
char * > > > > > +x86_64_reloc_symbol_name (struct link_map *map, const ElfW(Rela) *reloc) > > > > > +{ > > > > > + const ElfW(Sym) *const symtab > > > > > + = (const void *) map->l_info[DT_SYMTAB]->d_un.d_ptr; > > > > > + const ElfW(Sym) *const refsym = &symtab[ELFW (R_SYM) (reloc->r_info)]; > > > > > + const char *strtab = (const char *) map->l_info[DT_STRTAB]->d_un.d_ptr; > > > > > + return strtab + refsym->st_name; > > > > > +} > > > > > + > > > > > +static void > > > > > +x86_64_rewrite_plt (struct link_map *map, ElfW(Addr) plt_rewrite) > > > > > +{ > > > > > + ElfW(Addr) l_addr = map->l_addr; > > > > > + ElfW(Addr) pltent = map->l_info[DT_X86_64 (PLTENT)]->d_un.d_val; > > > > > + ElfW(Addr) start = map->l_info[DT_JMPREL]->d_un.d_ptr; > > > > > + ElfW(Addr) size = map->l_info[DT_PLTRELSZ]->d_un.d_val; > > > > > + const ElfW(Rela) *reloc = (const void *) start; > > > > > + const ElfW(Rela) *reloc_end = (const void *) (start + size); > > > > > + > > > > > + unsigned int feature_1 = THREAD_GETMEM (THREAD_SELF, > > > > > + header.feature_1); > > > > > + bool ibt_enabled_p > > > > > + = (feature_1 & GNU_PROPERTY_X86_FEATURE_1_IBT) != 0; > > > > > + > > > > > + if (__glibc_unlikely (GLRO(dl_debug_mask) & DL_DEBUG_FILES)) > > > > > + _dl_debug_printf ("\nchanging PLT in '%s' to direct branch\n", > > > > > + DSO_FILENAME (map->l_name)); > > > > > + > > > > > + for (; reloc < reloc_end; reloc++) > > > > > + if (ELFW(R_TYPE) (reloc->r_info) == R_X86_64_JUMP_SLOT) > > > > > + { > > > > > + /* Get the value from the GOT entry. */ > > > > > + ElfW(Addr) value = *(ElfW(Addr) *) (l_addr + reloc->r_offset); > > > > > + > > > > > + /* Get the corresponding PLT entry from r_addend. */ > > > > > + ElfW(Addr) branch_start = l_addr + reloc->r_addend; > > > > > + /* Skip ENDBR64 if IBT isn't enabled. */ > > > > > + if (!ibt_enabled_p) > > > > > + branch_start = ALIGN_DOWN (branch_start, pltent); > > > > > > > > Will the only preceding code always be the ENDBR64? 
> > > > If so, why not just replace the alignment stuff with `- ENDBR64_INSN_SIZE`? > > > > > > ENDBR64 isn't required. If CET isn't enabled, there is no ENDBR64. > > > > > > > Likewise below. > > > > > > > > > + /* Get the displacement from the branch target. */ > > > > > + ElfW(Addr) disp = value - branch_start - JMP32_INSN_SIZE; > > > > > + ElfW(Addr) plt_end; > > > > > + ElfW(Addr) pad; > > > > > + > > > > > + plt_end = (branch_start & -pltent) + pltent; > > > > For this to make sense, `pltent` needs to be a power of 2, which is not > > > > checked below in `x86_64_dynamic_after_reloc`. Likewise at > > > > the ALIGN_DOWN above. > > > > > > It is safe. From > > > > > > https://gitlab.com/x86-psABIs/x86-64-ABI/-/commit/6d824a52a42d173eb838b879616c1be5870b593e > > > > > > there is "All entries have the same size and are aligned to the entry size.". > > > > > But the method of aligning to X by doing `V & -X` only works if X is a power > > of two. > > We should update the psABI. > > > > > Also, better codegen: `(branch_start | (pltent - 1)) + 1` > > > > > + > > > > > + /* Update the PLT entry. */ > > > > > + if (((uint64_t) disp + (uint64_t) ((uint32_t) INT32_MIN)) > > > > > + <= (uint64_t) UINT32_MAX) > > > > > + { > > > > > + pad = branch_start + JMP32_INSN_SIZE; > > > > > + > > > > > + if (__glibc_unlikely (pad > plt_end)) > > > > > + continue; > > > > > + > > > > > + /* If the target branch can be reached with a direct branch, > > > > > + rewrite the PLT entry with a direct branch. */ > > > > > + if (__glibc_unlikely (GLRO(dl_debug_mask) & DL_DEBUG_BINDINGS)) > > > > > + { > > > > > + const char *sym_name = x86_64_reloc_symbol_name (map, > > > > > + reloc); > > > > > + _dl_debug_printf ("changing '%s' PLT entry in '%s' to " > > > > > + "direct branch\n", sym_name, > > > > > + DSO_FILENAME (map->l_name)); > > > > > + } > > > > > + > > > > > + /* Write out direct branch.
*/ > > > > > + *(uint8_t *) branch_start = JMP32_INSN_OPCODE; > > > > > + *((uint32_t *) (branch_start + 1)) = disp; > > > > > + } > > > > > + else > > > > > + { > > > > > + if (GL(dl_x86_feature_control).plt_rewrite > > > > > + != plt_rewrite_jmpabs) > > > > > + { > > > > > + if (__glibc_unlikely (GLRO(dl_debug_mask) > > > > > + & DL_DEBUG_BINDINGS)) > > > > > + { > > > > > + const char *sym_name > > > > > + = x86_64_reloc_symbol_name (map, reloc); > > > > > + _dl_debug_printf ("skipping '%s' PLT entry in '%s'\n", > > > > > + sym_name, > > > > > + DSO_FILENAME (map->l_name)); > > > > > + } > > > > > + continue; > > > > > + } > > > > > + > > > > > + pad = branch_start + JMPABS_INSN_SIZE; > > > > > + > > > > > + if (__glibc_unlikely (pad > plt_end)) > > > > > + continue; > > > > > + > > > > > + /* Rewrite the PLT entry with JMPABS. */ > > > > > + if (__glibc_unlikely (GLRO(dl_debug_mask) & DL_DEBUG_BINDINGS)) > > > > > + { > > > > > + const char *sym_name = x86_64_reloc_symbol_name (map, > > > > > + reloc); > > > > > + _dl_debug_printf ("changing '%s' PLT entry in '%s' to " > > > > > + "JMPABS\n", sym_name, > > > > > + DSO_FILENAME (map->l_name)); > > > > > + } > > > > > + > > > > > + /* "jmpabs $target" for 64-bit displacement. NB: JMPABS has > > > > > + a 3-byte opcode + 64bit address. There is a 1-byte overlap > > > > > + between 4-byte write and 8-byte write. */ > > > > > + *(uint32_t *) (branch_start) = JMPABS_INSN_OPCODE; > > > > > + *(uint64_t *) (branch_start + 3) = value; > > > > > + } > > > > > + > > > > > + /* Fill the unused part of the PLT entry with INT3. */ > > > > > + for (; pad < plt_end; pad++) > > > > > + *(uint8_t *) pad = INT3_INSN_OPCODE; > > > > nit: can you put braces around this for loop? It's a bit deceiving > > > > to the eye with the brace directly after it. > > > > > > It is quite clear to me since I wrote it. Can you be more specific? > > > > Just seeing: > > `for(;;) }`, it's easy to imagine it as `for(;;) {}`. It's not an issue.
> > feel free to ignore. > > I will keep it as is. > > > > > > > > > + } > > > > > +} > > > > > + > > > > > +static inline void > > > > > +x86_64_rewrite_plt_in_place (struct link_map *map) > > > > > +{ > > > > > + /* Adjust DT_X86_64_PLT address and DT_X86_64_PLTSZ values. */ > > > > > + ElfW(Addr) plt = (map->l_info[DT_X86_64 (PLT)]->d_un.d_ptr > > > > > + + map->l_addr); > > > > > + size_t pagesize = GLRO(dl_pagesize); > > > > > + ElfW(Addr) plt_aligned = ALIGN_DOWN (plt, pagesize); > > > > > + size_t pltsz = (map->l_info[DT_X86_64 (PLTSZ)]->d_un.d_val > > > > > + + plt - plt_aligned); > > > > > + > > > > > + if (__glibc_unlikely (GLRO(dl_debug_mask) & DL_DEBUG_FILES)) > > > > > + _dl_debug_printf ("\nchanging PLT in '%s' to writable\n", > > > > > + DSO_FILENAME (map->l_name)); > > > > > + > > > > > + if (__glibc_unlikely (__mprotect ((void *) plt_aligned, pltsz, > > > > > + PROT_WRITE | PROT_READ) < 0)) > > > > > + { > > > > > + if (__glibc_unlikely (GLRO(dl_debug_mask) & DL_DEBUG_FILES)) > > > > > + _dl_debug_printf ("\nfailed to change PLT in '%s' to writable\n", > > > > > + DSO_FILENAME (map->l_name)); > > > > > + return; > > > > > + } > > > > > + > > > > > + x86_64_rewrite_plt (map, plt_aligned); > > > > > + > > > > > + if (__glibc_unlikely (GLRO(dl_debug_mask) & DL_DEBUG_FILES)) > > > > > + _dl_debug_printf ("\nchanging PLT in '%s' back to read-only\n", > > > > > + DSO_FILENAME (map->l_name)); > > > > > + > > > > > + if (__glibc_unlikely (__mprotect ((void *) plt_aligned, pltsz, > > > > > + PROT_EXEC | PROT_READ) < 0)) > > > > > + _dl_signal_error (0, DSO_FILENAME (map->l_name), NULL, > > > > > + "failed to change PLT back to read-only"); > > > > > +} > > > > > + > > > > > +/* Rewrite PLT entries to direct branch if possible. */ > > > > > + > > > > > +static inline void > > > > > +x86_64_dynamic_after_reloc (struct link_map *map, int lazy) > > > > > +{ > > > > > + /* Ignore DT_X86_64_PLT if the lazy binding is enabled. 
*/ > > > > > + if (lazy != 0) > > > > > + return; > > > > > + > > > > > + /* Ignore DT_X86_64_PLT if PLT rewrite isn't enabled. */ > > > > > + if (__glibc_likely (GL(dl_x86_feature_control).plt_rewrite > > > > > + == plt_rewrite_none)) > > > > > + return; > > > > > + > > > > > + if (__glibc_likely (map->l_info[DT_X86_64 (PLT)] == NULL)) > > > > > + return; > > > > > + > > > > > + /* Ignore DT_X86_64_PLT if there is no R_X86_64_JUMP_SLOT. */ > > > > > + if (map->l_has_jump_slot_reloc == 0) > > > > > + return; > > > > > + > > > > > + /* Ignore DT_X86_64_PLT if > > > > > + 1. DT_JMPREL isn't available or its value is 0. > > > > > + 2. DT_PLTRELSZ is 0. > > > > > + 3. DT_X86_64_PLTENT isn't available or its value is smaller than > > > > > + 16 bytes. > > > > > + 4. DT_X86_64_PLTSZ isn't available or its value is smaller than > > > > > + DT_X86_64_PLTENT's value or isn't a multiple of DT_X86_64_PLTENT's > > > > > + value. */ > > > > > + if (map->l_info[DT_JMPREL] == NULL > > > > > + || map->l_info[DT_JMPREL]->d_un.d_ptr == 0 > > > > > + || map->l_info[DT_PLTRELSZ]->d_un.d_val == 0 > > > > > + || map->l_info[DT_X86_64 (PLTSZ)] == NULL > > > > > + || map->l_info[DT_X86_64 (PLTENT)] == NULL > > > > > + || map->l_info[DT_X86_64 (PLTENT)]->d_un.d_val < 16 > > > > > + || (map->l_info[DT_X86_64 (PLTSZ)]->d_un.d_val > > > > > + < map->l_info[DT_X86_64 (PLTENT)]->d_un.d_val) > > > > > + || (map->l_info[DT_X86_64 (PLTSZ)]->d_un.d_val > > > > > + % map->l_info[DT_X86_64 (PLTENT)]->d_un.d_val) != 0) > > > > > + return; > > > > > + > > > > > + x86_64_rewrite_plt_in_place (map); > > > > > +} > > > > > +#endif > > > > > diff --git a/sysdeps/x86_64/link_map.h b/sysdeps/x86_64/link_map.h > > > > > new file mode 100644 > > > > > index 0000000000..537f56ace5 > > > > > --- /dev/null > > > > > +++ b/sysdeps/x86_64/link_map.h > > > > > @@ -0,0 +1,22 @@ > > > > > +/* Additional fields in struct link_map. x86-64 version. > > > > > + Copyright (C) 2024 Free Software Foundation, Inc. 
> > > > > + This file is part of the GNU C Library. > > > > > + > > > > > + The GNU C Library is free software; you can redistribute it and/or > > > > > + modify it under the terms of the GNU Lesser General Public > > > > > + License as published by the Free Software Foundation; either > > > > > + version 2.1 of the License, or (at your option) any later version. > > > > > + > > > > > + The GNU C Library is distributed in the hope that it will be useful, > > > > > + but WITHOUT ANY WARRANTY; without even the implied warranty of > > > > > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > > > > > + Lesser General Public License for more details. > > > > > + > > > > > + You should have received a copy of the GNU Lesser General Public > > > > > + License along with the GNU C Library; if not, see > > > > > + <https://www.gnu.org/licenses/>. */ > > > > > + > > > > > +/* Has R_X86_64_JUMP_SLOT relocation. */ > > > > > +bool l_has_jump_slot_reloc; > > > > > + > > > > > +#include <sysdeps/x86/link_map.h> > > > > > diff --git a/sysdeps/x86_64/tst-plt-rewrite1.c b/sysdeps/x86_64/tst-plt-rewrite1.c > > > > > new file mode 100644 > > > > > index 0000000000..86785957e2 > > > > > --- /dev/null > > > > > +++ b/sysdeps/x86_64/tst-plt-rewrite1.c > > > > > @@ -0,0 +1,31 @@ > > > > > +/* Test PLT rewrite. > > > > > + Copyright (C) 2024 Free Software Foundation, Inc. > > > > > + This file is part of the GNU C Library. > > > > > + > > > > > + The GNU C Library is free software; you can redistribute it and/or > > > > > + modify it under the terms of the GNU Lesser General Public > > > > > + License as published by the Free Software Foundation; either > > > > > + version 2.1 of the License, or (at your option) any later version. > > > > > + > > > > > + The GNU C Library is distributed in the hope that it will be useful, > > > > > + but WITHOUT ANY WARRANTY; without even the implied warranty of > > > > > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
See the GNU > > > > > + Lesser General Public License for more details. > > > > > + > > > > > + You should have received a copy of the GNU Lesser General Public > > > > > + License along with the GNU C Library; if not, see > > > > > + <https://www.gnu.org/licenses/>. */ > > > > > + > > > > > +#include <string.h> > > > > > +#include <support/check.h> > > > > > + > > > > > +extern const char *foo (void); > > > > > + > > > > > +static int > > > > > +do_test (void) > > > > > +{ > > > > > + TEST_COMPARE (strcmp (foo (), "PLT rewrite works"), 0); > > > > > + return 0; > > > > > +} > > > > > + > > > > > +#include <support/test-driver.c> > > > > > diff --git a/sysdeps/x86_64/tst-plt-rewritemod1.c b/sysdeps/x86_64/tst-plt-rewritemod1.c > > > > > new file mode 100644 > > > > > index 0000000000..99f21fba5a > > > > > --- /dev/null > > > > > +++ b/sysdeps/x86_64/tst-plt-rewritemod1.c > > > > > @@ -0,0 +1,32 @@ > > > > > +/* Check PLT rewrite works correctly. > > > > > + Copyright (C) 2024 Free Software Foundation, Inc. > > > > > + This file is part of the GNU C Library. > > > > > + > > > > > + The GNU C Library is free software; you can redistribute it and/or > > > > > + modify it under the terms of the GNU Lesser General Public > > > > > + License as published by the Free Software Foundation; either > > > > > + version 2.1 of the License, or (at your option) any later version. > > > > > + > > > > > + The GNU C Library is distributed in the hope that it will be useful, > > > > > + but WITHOUT ANY WARRANTY; without even the implied warranty of > > > > > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > > > > > + Lesser General Public License for more details. > > > > > + > > > > > + You should have received a copy of the GNU Lesser General Public > > > > > + License along with the GNU C Library; if not, see > > > > > + <https://www.gnu.org/licenses/>. */ > > > > > + > > > > > +/* foo calls bar with indirect branch via PLT. 
PLT rewrite should > > > > > + change it to direct branch. */ > > > > > + > > > > > +const char * > > > > > +bar (void) > > > > > +{ > > > > > + return "PLT rewrite works"; > > > > > +} > > > > > + > > > > > +const char * > > > > > +foo (void) > > > > > +{ > > > > > + return bar (); > > > > > +} > > > > > -- > > > > > 2.43.0 > > > > > > > > > > > > > > Can you run clang-format on your changes as a whole? > > > > > > Will do. > > I tried clang-format. It winds up changing the existing code > format, like ElfW(Addr) to ElfW (Addr). You can format a region. > > -- > H.J.
diff --git a/elf/dynamic-link.h b/elf/dynamic-link.h index 8cdf7bde09..83d834ecaf 100644 --- a/elf/dynamic-link.h +++ b/elf/dynamic-link.h @@ -177,6 +177,10 @@ elf_machine_lazy_rel (struct link_map *map, struct r_scope_elem *scope[], } \ } while (0); +# ifndef ELF_DYNAMIC_AFTER_RELOC +# define ELF_DYNAMIC_AFTER_RELOC(map, lazy) +# endif + /* This can't just be an inline function because GCC is too dumb to inline functions containing inlines themselves. */ # ifdef RTLD_BOOTSTRAP @@ -192,6 +196,7 @@ elf_machine_lazy_rel (struct link_map *map, struct r_scope_elem *scope[], ELF_DYNAMIC_DO_RELR (map); \ ELF_DYNAMIC_DO_REL ((map), (scope), edr_lazy, skip_ifunc); \ ELF_DYNAMIC_DO_RELA ((map), (scope), edr_lazy, skip_ifunc); \ + ELF_DYNAMIC_AFTER_RELOC ((map), (edr_lazy)); \ } while (0) #endif diff --git a/elf/elf.h b/elf/elf.h index ca6a7d9d67..455731663c 100644 --- a/elf/elf.h +++ b/elf/elf.h @@ -3639,6 +3639,11 @@ enum /* x86-64 sh_type values. */ #define SHT_X86_64_UNWIND 0x70000001 /* Unwind information. */ +/* x86-64 d_tag values. */ +#define DT_X86_64_PLT (DT_LOPROC + 0) +#define DT_X86_64_PLTSZ (DT_LOPROC + 1) +#define DT_X86_64_PLTENT (DT_LOPROC + 3) +#define DT_X86_64_NUM 4 /* AM33 relocations. */ #define R_MN10300_NONE 0 /* No reloc. 
*/ diff --git a/elf/tst-glibcelf.py b/elf/tst-glibcelf.py index 00cd2bba85..c191636a99 100644 --- a/elf/tst-glibcelf.py +++ b/elf/tst-glibcelf.py @@ -187,6 +187,7 @@ DT_VALNUM DT_VALRNGHI DT_VALRNGLO DT_VERSIONTAGNUM +DT_X86_64_NUM ELFCLASSNUM ELFDATANUM EM_NUM diff --git a/manual/tunables.texi b/manual/tunables.texi index b31f16da84..f9bd83622e 100644 --- a/manual/tunables.texi +++ b/manual/tunables.texi @@ -57,6 +57,7 @@ glibc.pthread.stack_cache_size: 0x2800000 (min: 0x0, max: 0xffffffffffffffff) glibc.cpu.hwcap_mask: 0x6 (min: 0x0, max: 0xffffffffffffffff) glibc.malloc.mmap_max: 0 (min: 0, max: 2147483647) glibc.elision.skip_trylock_internal_abort: 3 (min: 0, max: 2147483647) +glibc.cpu.plt_rewrite: 0 (min: 0, max: 1) glibc.malloc.tcache_unsorted_limit: 0x0 (min: 0x0, max: 0xffffffffffffffff) glibc.cpu.x86_ibt: glibc.cpu.hwcaps: @@ -614,6 +615,14 @@ this tunable. This tunable is specific to 64-bit x86-64. @end deftp +@deftp Tunable glibc.cpu.plt_rewrite +When this tunable is set to @code{1}, the dynamic linker will rewrite +the PLT section with direct branch after relocation if possible when +the lazy binding is disabled. + +This tunable is specific to x86-64. 
+@end deftp + @node Memory Related Tunables @section Memory Related Tunables @cindex memory related tunables diff --git a/scripts/glibcelf.py b/scripts/glibcelf.py index c5e5dda48e..5f3813f326 100644 --- a/scripts/glibcelf.py +++ b/scripts/glibcelf.py @@ -439,6 +439,8 @@ class DtRISCV(Dt): """Supplemental DT_* constants for EM_RISCV.""" class DtSPARC(Dt): """Supplemental DT_* constants for EM_SPARC.""" +class DtX86_64(Dt): + """Supplemental DT_* constants for EM_X86_64.""" _dt_skip = ''' DT_ENCODING DT_PROCNUM DT_ADDRRNGLO DT_ADDRRNGHI DT_ADDRNUM @@ -451,6 +453,7 @@ DT_MIPS_NUM DT_PPC_NUM DT_PPC64_NUM DT_SPARC_NUM +DT_X86_64_NUM '''.strip().split() _register_elf_h(DtAARCH64, prefix='DT_AARCH64_', skip=_dt_skip, parent=Dt) _register_elf_h(DtALPHA, prefix='DT_ALPHA_', skip=_dt_skip, parent=Dt) @@ -461,6 +464,7 @@ _register_elf_h(DtPPC, prefix='DT_PPC_', skip=_dt_skip, parent=Dt) _register_elf_h(DtPPC64, prefix='DT_PPC64_', skip=_dt_skip, parent=Dt) _register_elf_h(DtRISCV, prefix='DT_RISCV_', skip=_dt_skip, parent=Dt) _register_elf_h(DtSPARC, prefix='DT_SPARC_', skip=_dt_skip, parent=Dt) +_register_elf_h(DtX86_64, prefix='DT_X86_64_', skip=_dt_skip, parent=Dt) _register_elf_h(Dt, skip=_dt_skip, ranges=True) del _dt_skip diff --git a/sysdeps/x86/cet-control.h b/sysdeps/x86/cet-control.h index a45d59bf8c..81e7bb4bd8 100644 --- a/sysdeps/x86/cet-control.h +++ b/sysdeps/x86/cet-control.h @@ -32,10 +32,22 @@ enum dl_x86_cet_control cet_permissive }; +/* PLT rewrite control. */ +enum dl_plt_rewrite_control +{ + /* No PLT rewrite. */ + plt_rewrite_none, + /* Rewrite PLT with JMP at run-time. */ + plt_rewrite_jmp, + /* Rewrite PLT with JMPABS at run-time. 
*/ + plt_rewrite_jmpabs +}; + struct dl_x86_feature_control { enum dl_x86_cet_control ibt : 2; enum dl_x86_cet_control shstk : 2; + enum dl_plt_rewrite_control plt_rewrite : 2; }; #endif /* cet-control.h */ diff --git a/sysdeps/x86/cpu-features.c b/sysdeps/x86/cpu-features.c index f193ea7a2d..ccf1350b72 100644 --- a/sysdeps/x86/cpu-features.c +++ b/sysdeps/x86/cpu-features.c @@ -27,6 +27,21 @@ extern void TUNABLE_CALLBACK (set_hwcaps) (tunable_val_t *) attribute_hidden; +#ifdef SHARED +static void +TUNABLE_CALLBACK (set_plt_rewrite) (tunable_val_t *valp) +{ + if (valp->numval != 0) + { + /* Use JMPABS only on APX processors. */ + const struct cpu_features *cpu_features = __get_cpu_features (); + GL(dl_x86_feature_control).plt_rewrite + = (CPU_FEATURE_PRESENT_P (cpu_features, APX_F) + ? plt_rewrite_jmpabs : plt_rewrite_jmp); + } +} +#endif + #ifdef __LP64__ static void TUNABLE_CALLBACK (set_prefer_map_32bit_exec) (tunable_val_t *valp) @@ -1108,7 +1123,10 @@ no_cpuid: TUNABLE_CALLBACK (set_x86_shstk)); #endif -#ifndef SHARED +#ifdef SHARED + TUNABLE_GET (plt_rewrite, tunable_val_t *, + TUNABLE_CALLBACK (set_plt_rewrite)); +#else /* NB: In libc.a, call init_cacheinfo. 
*/ init_cacheinfo (); #endif diff --git a/sysdeps/x86/dl-procruntime.c b/sysdeps/x86/dl-procruntime.c index 4d25d9f327..15b3d0d878 100644 --- a/sysdeps/x86/dl-procruntime.c +++ b/sysdeps/x86/dl-procruntime.c @@ -67,6 +67,7 @@ PROCINFO_CLASS struct dl_x86_feature_control _dl_x86_feature_control = { .ibt = DEFAULT_DL_X86_CET_CONTROL, .shstk = DEFAULT_DL_X86_CET_CONTROL, + .plt_rewrite = plt_rewrite_none, } # endif # if !defined SHARED || defined PROCINFO_DECL diff --git a/sysdeps/x86/dl-tunables.list b/sysdeps/x86/dl-tunables.list index 147a7270ec..e2e441e1b7 100644 --- a/sysdeps/x86/dl-tunables.list +++ b/sysdeps/x86/dl-tunables.list @@ -66,5 +66,10 @@ glibc { x86_shared_cache_size { type: SIZE_T } + plt_rewrite { + type: INT_32 + minval: 0 + maxval: 1 + } } } diff --git a/sysdeps/x86_64/Makefile b/sysdeps/x86_64/Makefile index 00120ca9ca..374bca80d0 100644 --- a/sysdeps/x86_64/Makefile +++ b/sysdeps/x86_64/Makefile @@ -1,6 +1,14 @@ # The i387 `long double' is a distinct type we support. long-double-fcts = yes +ifeq (yes,$(have-z-mark-plt)) +# Always generate DT_X86_64_PLT* tags. +sysdep-LDFLAGS += -Wl,-z,mark-plt +# Never generate DT_X86_64_PLT* tags on ld.so to avoid changing its own +# PLT. 
+LDFLAGS-rtld += -Wl,-z,nomark-plt +endif + ifeq ($(subdir),csu) gen-as-const-headers += link-defines.sym endif @@ -175,6 +183,25 @@ ifeq (no,$(build-hardcoded-path-in-tests)) tests-container += tst-glibc-hwcaps-cache endif +ifeq (yes,$(have-z-mark-plt)) +tests += \ + tst-plt-rewrite1 \ +# tests +modules-names += \ + tst-plt-rewritemod1 \ +# modules-names + +tst-plt-rewrite1-no-pie = yes +LDFLAGS-tst-plt-rewrite1 = -Wl,-z,now +LDFLAGS-tst-plt-rewritemod1.so = -Wl,-z,now +tst-plt-rewrite1-ENV = GLIBC_TUNABLES=glibc.cpu.plt_rewrite=1 LD_DEBUG=files:bindings +$(objpfx)tst-plt-rewrite1: $(objpfx)tst-plt-rewritemod1.so +$(objpfx)tst-plt-rewrite1.out: /dev/null $(objpfx)tst-plt-rewrite1 + $(tst-plt-rewrite1-ENV) $(make-test-out) > $@ 2>&1; \ + grep -q -E "changing 'bar' PLT entry in .*/elf/tst-plt-rewritemod1.so' to direct branch" $@; \ + $(evaluate-test) +endif + endif # $(subdir) == elf ifeq ($(subdir),csu) diff --git a/sysdeps/x86_64/configure b/sysdeps/x86_64/configure index b4a80b8035..418cc4a9b8 100755 --- a/sysdeps/x86_64/configure +++ b/sysdeps/x86_64/configure @@ -25,6 +25,41 @@ printf "%s\n" "$libc_cv_cc_mprefer_vector_width" >&6; } config_vars="$config_vars config-cflags-mprefer-vector-width = $libc_cv_cc_mprefer_vector_width" +{ printf "%s\n" "$as_me:${as_lineno-$LINENO}: checking for linker that supports -z mark-plt" >&5 +printf %s "checking for linker that supports -z mark-plt... " >&6; } +libc_linker_feature=no +cat > conftest.c <<EOF +int _start (void) { return 42; } +EOF +if { ac_try='${CC-cc} $CFLAGS $CPPFLAGS $LDFLAGS $no_ssp + -Wl,-z,mark-plt -nostdlib -nostartfiles + -fPIC -shared -o conftest.so conftest.c + 1>&5' + { { eval echo "\"\$as_me\":${as_lineno-$LINENO}: \"$ac_try\""; } >&5 + (eval $ac_try) 2>&5 + ac_status=$? + printf "%s\n" "$as_me:${as_lineno-$LINENO}: \$? 
= $ac_status" >&5 + test $ac_status = 0; }; } +then + if ${CC-cc} $CFLAGS $CPPFLAGS $LDFLAGS $no_ssp -Wl,-z,mark-plt -nostdlib \ + -nostartfiles -fPIC -shared -o conftest.so conftest.c 2>&1 \ + | grep "warning: -z mark-plt ignored" > /dev/null 2>&1; then + true + else + libc_linker_feature=yes + fi +fi +rm -f conftest* +if test $libc_linker_feature = yes; then + libc_cv_z_mark_plt=yes +else + libc_cv_z_mark_plt=no +fi +{ printf "%s\n" "$as_me:${as_lineno-$LINENO}: result: $libc_linker_feature" >&5 +printf "%s\n" "$libc_linker_feature" >&6; } +config_vars="$config_vars +have-z-mark-plt = $libc_cv_z_mark_plt" + if test x"$build_mathvec" = xnotset; then build_mathvec=yes fi diff --git a/sysdeps/x86_64/configure.ac b/sysdeps/x86_64/configure.ac index 937d1aff7e..d1f803c02e 100644 --- a/sysdeps/x86_64/configure.ac +++ b/sysdeps/x86_64/configure.ac @@ -10,6 +10,10 @@ LIBC_TRY_CC_OPTION([-mprefer-vector-width=128], LIBC_CONFIG_VAR([config-cflags-mprefer-vector-width], [$libc_cv_cc_mprefer_vector_width]) +LIBC_LINKER_FEATURE([-z mark-plt], [-Wl,-z,mark-plt], + [libc_cv_z_mark_plt=yes], [libc_cv_z_mark_plt=no]) +LIBC_CONFIG_VAR([have-z-mark-plt], [$libc_cv_z_mark_plt]) + if test x"$build_mathvec" = xnotset; then build_mathvec=yes fi diff --git a/sysdeps/x86_64/dl-dtprocnum.h b/sysdeps/x86_64/dl-dtprocnum.h new file mode 100644 index 0000000000..cefacb5387 --- /dev/null +++ b/sysdeps/x86_64/dl-dtprocnum.h @@ -0,0 +1,21 @@ +/* Configuration of lookup functions. x64-64 version. + Copyright (C) 2024 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. 
+ + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <https://www.gnu.org/licenses/>. */ + +/* Number of extra dynamic section entries for this architecture. By + default there are none. */ +#define DT_THISPROCNUM DT_X86_64_NUM diff --git a/sysdeps/x86_64/dl-machine.h b/sysdeps/x86_64/dl-machine.h index e0a9b14469..b075482ef4 100644 --- a/sysdeps/x86_64/dl-machine.h +++ b/sysdeps/x86_64/dl-machine.h @@ -22,6 +22,7 @@ #define ELF_MACHINE_NAME "x86_64" #include <assert.h> +#include <stdint.h> #include <sys/param.h> #include <sysdep.h> #include <tls.h> @@ -35,6 +36,9 @@ # define RTLD_START_ENABLE_X86_FEATURES #endif +/* Translate a processor specific dynamic tag to the index in l_info array. */ +#define DT_X86_64(x) (DT_X86_64_##x - DT_LOPROC + DT_NUM) + /* Return nonzero iff ELF header is compatible with the running host. 
*/ static inline int __attribute__ ((unused)) elf_machine_matches_host (const ElfW(Ehdr) *ehdr) @@ -312,8 +316,10 @@ and creates an unsatisfiable circular dependency.\n", switch (r_type) { - case R_X86_64_GLOB_DAT: case R_X86_64_JUMP_SLOT: + map->l_has_jump_slot_reloc = true; + /* fallthrough */ + case R_X86_64_GLOB_DAT: *reloc_addr = value; break; @@ -549,3 +555,211 @@ elf_machine_lazy_rel (struct link_map *map, struct r_scope_elem *scope[], } #endif /* RESOLVE_MAP */ + +#if !defined ELF_DYNAMIC_AFTER_RELOC && !defined RTLD_BOOTSTRAP \ + && defined SHARED +# define ELF_DYNAMIC_AFTER_RELOC(map, lazy) \ + x86_64_dynamic_after_reloc (map, (lazy)) + +# define JMP32_INSN_OPCODE 0xe9 +# define JMP32_INSN_SIZE 5 +# define JMPABS_INSN_OPCODE 0xa100d5 +# define JMPABS_INSN_SIZE 11 +# define INT3_INSN_OPCODE 0xcc + +static const char * +x86_64_reloc_symbol_name (struct link_map *map, const ElfW(Rela) *reloc) +{ + const ElfW(Sym) *const symtab + = (const void *) map->l_info[DT_SYMTAB]->d_un.d_ptr; + const ElfW(Sym) *const refsym = &symtab[ELFW (R_SYM) (reloc->r_info)]; + const char *strtab = (const char *) map->l_info[DT_STRTAB]->d_un.d_ptr; + return strtab + refsym->st_name; +} + +static void +x86_64_rewrite_plt (struct link_map *map, ElfW(Addr) plt_rewrite) +{ + ElfW(Addr) l_addr = map->l_addr; + ElfW(Addr) pltent = map->l_info[DT_X86_64 (PLTENT)]->d_un.d_val; + ElfW(Addr) start = map->l_info[DT_JMPREL]->d_un.d_ptr; + ElfW(Addr) size = map->l_info[DT_PLTRELSZ]->d_un.d_val; + const ElfW(Rela) *reloc = (const void *) start; + const ElfW(Rela) *reloc_end = (const void *) (start + size); + + unsigned int feature_1 = THREAD_GETMEM (THREAD_SELF, + header.feature_1); + bool ibt_enabled_p + = (feature_1 & GNU_PROPERTY_X86_FEATURE_1_IBT) != 0; + + if (__glibc_unlikely (GLRO(dl_debug_mask) & DL_DEBUG_FILES)) + _dl_debug_printf ("\nchanging PLT in '%s' to direct branch\n", + DSO_FILENAME (map->l_name)); + + for (; reloc < reloc_end; reloc++) + if (ELFW(R_TYPE) (reloc->r_info) == 
R_X86_64_JUMP_SLOT) + { + /* Get the value from the GOT entry. */ + ElfW(Addr) value = *(ElfW(Addr) *) (l_addr + reloc->r_offset); + + /* Get the corresponding PLT entry from r_addend. */ + ElfW(Addr) branch_start = l_addr + reloc->r_addend; + /* Skip ENDBR64 if IBT isn't enabled. */ + if (!ibt_enabled_p) + branch_start = ALIGN_DOWN (branch_start, pltent); + /* Get the displacement from the branch target. */ + ElfW(Addr) disp = value - branch_start - JMP32_INSN_SIZE; + ElfW(Addr) plt_end; + ElfW(Addr) pad; + + plt_end = (branch_start & -pltent) + pltent; + + /* Update the PLT entry. */ + if (((uint64_t) disp + (uint64_t) ((uint32_t) INT32_MIN)) + <= (uint64_t) UINT32_MAX) + { + pad = branch_start + JMP32_INSN_SIZE; + + if (__glibc_unlikely (pad > plt_end)) + continue; + + /* If the target branch can be reached with a direct branch, + rewrite the PLT entry with a direct branch. */ + if (__glibc_unlikely (GLRO(dl_debug_mask) & DL_DEBUG_BINDINGS)) + { + const char *sym_name = x86_64_reloc_symbol_name (map, + reloc); + _dl_debug_printf ("changing '%s' PLT entry in '%s' to " + "direct branch\n", sym_name, + DSO_FILENAME (map->l_name)); + } + + /* Write out direct branch. */ + *(uint8_t *) branch_start = JMP32_INSN_OPCODE; + *((uint32_t *) (branch_start + 1)) = disp; + } + else + { + if (GL(dl_x86_feature_control).plt_rewrite + != plt_rewrite_jmpabs) + { + if (__glibc_unlikely (GLRO(dl_debug_mask) + & DL_DEBUG_BINDINGS)) + { + const char *sym_name + = x86_64_reloc_symbol_name (map, reloc); + _dl_debug_printf ("skipping '%s' PLT entry in '%s'\n", + sym_name, + DSO_FILENAME (map->l_name)); + } + continue; + } + + pad = branch_start + JMPABS_INSN_SIZE; + + if (__glibc_unlikely (pad > plt_end)) + continue; + + /* Rewrite the PLT entry with JMPABS. 
*/ + if (__glibc_unlikely (GLRO(dl_debug_mask) & DL_DEBUG_BINDINGS)) + { + const char *sym_name = x86_64_reloc_symbol_name (map, + reloc); + _dl_debug_printf ("changing '%s' PLT entry in '%s' to " + "JMPABS\n", sym_name, + DSO_FILENAME (map->l_name)); + } + + /* "jmpabs $target" for 64-bit displacement. NB: JMPABS has + a 3-byte opcode + 64bit address. There is a 1-byte overlap + between 4-byte write and 8-byte write. */ + *(uint32_t *) (branch_start) = JMPABS_INSN_OPCODE; + *(uint64_t *) (branch_start + 3) = value; + } + + /* Fill the unused part of the PLT entry with INT3. */ + for (; pad < plt_end; pad++) + *(uint8_t *) pad = INT3_INSN_OPCODE; + } +} + +static inline void +x86_64_rewrite_plt_in_place (struct link_map *map) +{ + /* Adjust DT_X86_64_PLT address and DT_X86_64_PLTSZ values. */ + ElfW(Addr) plt = (map->l_info[DT_X86_64 (PLT)]->d_un.d_ptr + + map->l_addr); + size_t pagesize = GLRO(dl_pagesize); + ElfW(Addr) plt_aligned = ALIGN_DOWN (plt, pagesize); + size_t pltsz = (map->l_info[DT_X86_64 (PLTSZ)]->d_un.d_val + + plt - plt_aligned); + + if (__glibc_unlikely (GLRO(dl_debug_mask) & DL_DEBUG_FILES)) + _dl_debug_printf ("\nchanging PLT in '%s' to writable\n", + DSO_FILENAME (map->l_name)); + + if (__glibc_unlikely (__mprotect ((void *) plt_aligned, pltsz, + PROT_WRITE | PROT_READ) < 0)) + { + if (__glibc_unlikely (GLRO(dl_debug_mask) & DL_DEBUG_FILES)) + _dl_debug_printf ("\nfailed to change PLT in '%s' to writable\n", + DSO_FILENAME (map->l_name)); + return; + } + + x86_64_rewrite_plt (map, plt_aligned); + + if (__glibc_unlikely (GLRO(dl_debug_mask) & DL_DEBUG_FILES)) + _dl_debug_printf ("\nchanging PLT in '%s' back to read-only\n", + DSO_FILENAME (map->l_name)); + + if (__glibc_unlikely (__mprotect ((void *) plt_aligned, pltsz, + PROT_EXEC | PROT_READ) < 0)) + _dl_signal_error (0, DSO_FILENAME (map->l_name), NULL, + "failed to change PLT back to read-only"); +} + +/* Rewrite PLT entries to direct branch if possible. 
*/ + +static inline void +x86_64_dynamic_after_reloc (struct link_map *map, int lazy) +{ + /* Ignore DT_X86_64_PLT if the lazy binding is enabled. */ + if (lazy != 0) + return; + + /* Ignore DT_X86_64_PLT if PLT rewrite isn't enabled. */ + if (__glibc_likely (GL(dl_x86_feature_control).plt_rewrite + == plt_rewrite_none)) + return; + + if (__glibc_likely (map->l_info[DT_X86_64 (PLT)] == NULL)) + return; + + /* Ignore DT_X86_64_PLT if there is no R_X86_64_JUMP_SLOT. */ + if (map->l_has_jump_slot_reloc == 0) + return; + + /* Ignore DT_X86_64_PLT if + 1. DT_JMPREL isn't available or its value is 0. + 2. DT_PLTRELSZ is 0. + 3. DT_X86_64_PLTENT isn't available or its value is smaller than + 16 bytes. + 4. DT_X86_64_PLTSZ isn't available or its value is smaller than + DT_X86_64_PLTENT's value or isn't a multiple of DT_X86_64_PLTENT's + value. */ + if (map->l_info[DT_JMPREL] == NULL + || map->l_info[DT_JMPREL]->d_un.d_ptr == 0 + || map->l_info[DT_PLTRELSZ]->d_un.d_val == 0 + || map->l_info[DT_X86_64 (PLTSZ)] == NULL + || map->l_info[DT_X86_64 (PLTENT)] == NULL + || map->l_info[DT_X86_64 (PLTENT)]->d_un.d_val < 16 + || (map->l_info[DT_X86_64 (PLTSZ)]->d_un.d_val + < map->l_info[DT_X86_64 (PLTENT)]->d_un.d_val) + || (map->l_info[DT_X86_64 (PLTSZ)]->d_un.d_val + % map->l_info[DT_X86_64 (PLTENT)]->d_un.d_val) != 0) + return; + + x86_64_rewrite_plt_in_place (map); +} +#endif diff --git a/sysdeps/x86_64/link_map.h b/sysdeps/x86_64/link_map.h new file mode 100644 index 0000000000..537f56ace5 --- /dev/null +++ b/sysdeps/x86_64/link_map.h @@ -0,0 +1,22 @@ +/* Additional fields in struct link_map. x86-64 version. + Copyright (C) 2024 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. 
+ + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <https://www.gnu.org/licenses/>. */ + +/* Has R_X86_64_JUMP_SLOT relocation. */ +bool l_has_jump_slot_reloc; + +#include <sysdeps/x86/link_map.h> diff --git a/sysdeps/x86_64/tst-plt-rewrite1.c b/sysdeps/x86_64/tst-plt-rewrite1.c new file mode 100644 index 0000000000..86785957e2 --- /dev/null +++ b/sysdeps/x86_64/tst-plt-rewrite1.c @@ -0,0 +1,31 @@ +/* Test PLT rewrite. + Copyright (C) 2024 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <https://www.gnu.org/licenses/>. 
*/ + +#include <string.h> +#include <support/check.h> + +extern const char *foo (void); + +static int +do_test (void) +{ + TEST_COMPARE (strcmp (foo (), "PLT rewrite works"), 0); + return 0; +} + +#include <support/test-driver.c> diff --git a/sysdeps/x86_64/tst-plt-rewritemod1.c b/sysdeps/x86_64/tst-plt-rewritemod1.c new file mode 100644 index 0000000000..99f21fba5a --- /dev/null +++ b/sysdeps/x86_64/tst-plt-rewritemod1.c @@ -0,0 +1,32 @@ +/* Check PLT rewrite works correctly. + Copyright (C) 2024 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <https://www.gnu.org/licenses/>. */ + +/* foo calls bar with indirect branch via PLT. PLT rewrite should + change it to direct branch. */ + +const char * +bar (void) +{ + return "PLT rewrite works"; +} + +const char * +foo (void) +{ + return bar (); +}
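For readers following the range check and the "jmp rel32" rewrite in x86_64_rewrite_plt, here is a minimal standalone sketch of that arithmetic. It is not part of the patch. The helper names disp_fits_rel32 and encode_jmp32 are hypothetical, and the code writes into an ordinary byte buffer with simulated addresses rather than touching a live PLT or calling mprotect. The opcode and size macros are copied from the patch. The JMPABS path is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same values as the macros in the patch.  */
#define JMP32_INSN_OPCODE 0xe9
#define JMP32_INSN_SIZE 5
#define INT3_INSN_OPCODE 0xcc

/* Hypothetical helper.  Return true if DISP, read as a signed 64-bit
   value, fits in a signed 32-bit immediate.  Adding 2^31 maps the
   range [-2^31, 2^31 - 1] onto [0, 2^32 - 1], so one unsigned compare
   is enough.  This mirrors the test used in x86_64_rewrite_plt.  */
static bool
disp_fits_rel32 (uint64_t disp)
{
  return disp + (uint64_t) (uint32_t) INT32_MIN <= (uint64_t) UINT32_MAX;
}

/* Hypothetical helper.  ENTRY is a buffer of PLTENT bytes standing in
   for one PLT entry that would live at address ENTRY_ADDR.  Encode
   "jmp rel32" to TARGET at byte offset OFF inside the entry and fill
   the bytes after the instruction with INT3.  Return false and leave
   ENTRY unchanged if TARGET is out of rel32 range or the instruction
   does not fit in the entry.  */
static bool
encode_jmp32 (uint8_t *entry, size_t pltent, size_t off,
              uint64_t entry_addr, uint64_t target)
{
  assert (entry != NULL && off < pltent);

  uint64_t branch_start = entry_addr + off;
  /* The displacement is relative to the end of the 5-byte jump.  */
  uint64_t disp = target - branch_start - JMP32_INSN_SIZE;

  if (!disp_fits_rel32 (disp) || off + JMP32_INSN_SIZE > pltent)
    return false;

  /* Opcode byte followed by the 32-bit displacement, little-endian.  */
  entry[off] = JMP32_INSN_OPCODE;
  for (int i = 0; i < 4; i++)
    entry[off + 1 + i] = (uint8_t) (disp >> (8 * i));

  /* Pad the unused tail of the entry with INT3.  */
  memset (entry + off + JMP32_INSN_SIZE, INT3_INSN_OPCODE,
          pltent - off - JMP32_INSN_SIZE);
  return true;
}
```

As a worked case, take a 16-byte entry at 0x1000, an offset of 4 (the branch after ENDBR64) and a target of 0x2000. The displacement is 0x2000 - 0x1004 - 5 = 0xff7, so bytes 4..8 become e9 f7 0f 00 00 and bytes 9..15 become cc. A target 4 GiB away fails the range check, which is the case where the patch either skips the entry or falls back to JMPABS.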