mbox series

[00/33] RFC: RELRO link maps

Message ID cover.1688499219.git.fweimer@redhat.com
Headers show
Series RFC: RELRO link maps | expand

Message

Florian Weimer July 4, 2023, 8:02 p.m. UTC
This series introduces a new security hardening.  The main goal is to
prevent direct overwrite attacks on the internal l_info[DT_FINI] and
l_infi[DT_FINI_ARRAY] members of the link map.  These are attractive
targets because the pointers are not obfuscated.  The path to DWARF data
via _dl_find_object is kept read-only as well.

During dlopen/dlmopen and dlclose, dynamic linker data structures are
temporarily made read-write using mprotect.  To offset the overhead from
that, the penultimate commit in the series implements a hash table for
dlopen.  It speeds up dlopen with many shared objects in the process,
and makes up for the additional mprotect overhead while starting Emacs,
for example.

The last commit optionally uses memory protection keys if available.
POWER and x86-64 CPU support that, but the x86-64 support is not usable
due to adverse interactions with signal handler.  The new tunable can be
used force their use, though.  This is purely a kernel limitation; a
different mode of operation could be selected with a pkey_alloc flag.
POWER also has a software limitation: modern system tends to use the
radix MMU, and protection keys are not implemented for that (only for
the hash MMU).  Not surprisingly, with memory protection keys active, I
could not measure any overhead (and the dlopen hash table often makes
startup and dlopen operations a bit faster).

There are some limitations, so this is 2.39 material at this point.

The mprotect-based implementation is still quite slow, even with the
speedup from the dlopen hash table.  This is very visible with the
nptl/tst-stack4 test case: on my machine, its run time grows from 3.3
seconds to to 4.3 seconds.  (It remains at 3.3 seconds if I force the
use of memory protection keys.)  I have some ideas for speeding up
things further, but the memprotect overhead is quite unavoidable
unfortunately.  Things like plugin discover might be somewhat similar to
what nptl/tst-stack4 does.  But I expect these rapid dlopen/dlclose
sequences to be somewhat unusual, so the hardening should not have much
of an impact in practice.

The new data structures are more complex than what we had before.  The
dl-protmem.c allocator is not protected by a fork lock (unlike malloc,
which was used before).  This means that there is a possiblity that
dlopen will not work after multi-threaded fork.  We sort-of support that
as an extension (although GLRO(dl_load_write_lock) can still cause hangs
even today, I believe).  We can probably trade memory corruption for
hangs by careful use of GLRO(dl_load_write_lock).

I would like to add some tests for memory leaks, for example by marking
all protected memory allocations by chasing pointers from the roots, and
then checking that the gaps correspond exactly to the free chunks in the
protected memory allocator.

Maybe I should remove the debugging memory allocator and only add the
region-based one.

Some of the commits in the series could perhaps be reordered and merged.

To counteract the dlclose ooverhead, we could stop unloading converters
in the iconv subsystem.  It does not seem to be a vert useful thing to
do, now that we have the dlopen hash table and many DSOs do not slow us
down much (as long RTLD_LOCAL is used, of course).

More data structures should eventually be added under the GLPM umbrella.
The head of the link map lists, GL(dl_ns)[nsid]._ns_loaded, is a good
candidate for this.  These changes could happen incrementally over time.

Thanks,
Florian


Florian Weimer (33):
  support: Add <support/memprobe.h> for protection flags probing
  misc: Enable internal use of memory protection keys
  elf: Remove _dl_sysdep_open_object hook function
  elf: Eliminate second loop in find_version in dl-version.c
  elf: In rtld_setup_main_map, assume ld.so has a DYNAMIC segment
  elf: Remove version assert in check_match in elf/dl-lookup.c
  elf: Disambiguate some failures in _dl_load_cache_lookup
  elf: Eliminate alloca in open_verify
  Do not export <alloc_buffer.h> functions from libc
  elf: Make <alloc_buffer.h> usable in ld.so
  elf: Merge the three implementations of _dl_dst_substitute
  elf: _dl_find_object may return 1 during early startup (bug 30515)
  elf: Move __rtld_malloc_init_stubs call into _dl_start_final
  elf: Merge __dl_libc_freemem into __rtld_libc_freeres
  elf: Use struct link_map_private for the internal link map
  elf: Remove run-time-writable fields from struct link_map_private
  elf: Move l_tls_offset into read-write part of link map
  elf: Allocate auditor state after read-write link map
  elf: Move link map fields used by dependency sorting to writable part
  elf: Split _dl_lookup_map, _dl_map_new_object from _dl_map_object
  elf: Add l_soname accessor function for DT_SONAME values
  elf: _dl_rtld_map should not exist in static builds
  elf: Introduce GLPM accessor for the protected memory area
  elf: Bootstrap allocation for future protected memory allocator
  elf: Implement a basic protected memory allocator
  elf: Move most of the _dl_find_object data to the protected heap
  elf: Switch to a region-based protected memory allocator
  elf: Determine the caller link map in _dl_open
  elf: Add fast path to dlopen for fully-opened maps
  elf: Use _dl_find_object instead of _dl_find_dso_for_object in dlopen
  elf: Put critical _dl_find_object pointers into protected memory area
  elf: Add hash tables to speed up DT_NEEDED, dlopen lookups
  elf: Use memory protection keys for the protected memory allocator

 NEWS                                          |   4 +
 csu/libc-start.c                              |   7 +-
 csu/libc-tls.c                                |   8 +-
 debug/backtracesyms.c                         |   4 +-
 debug/backtracesymsfd.c                       |   6 +-
 dlfcn/dladdr1.c                               |   7 +-
 dlfcn/dlinfo.c                                |   4 +-
 dlfcn/tst-dlinfo-phdr.c                       |  15 +-
 elf/Makefile                                  |  25 +-
 elf/circleload1.c                             |  18 +-
 elf/dl-addr-obj.c                             |   4 +-
 elf/dl-addr.c                                 |  13 +-
 elf/dl-audit.c                                |  25 +-
 elf/dl-cache.c                                |  33 +-
 elf/dl-call-libc-early-init.c                 |   2 +-
 elf/dl-call_fini.c                            |  11 +-
 elf/dl-close.c                                | 181 ++---
 elf/dl-debug.c                                |  12 -
 elf/dl-deps.c                                 | 177 +++--
 elf/dl-diagnostics.c                          |   2 +
 elf/dl-find_object.c                          | 169 ++---
 elf/dl-find_object.h                          |  21 +-
 elf/dl-fini.c                                 |  16 +-
 elf/dl-fptr.c                                 |   6 +-
 elf/dl-init.c                                 |  22 +-
 elf/dl-iteratephdr.c                          |  11 +-
 elf/dl-libc.c                                 | 115 +--
 elf/dl-libc_freeres.c                         |  94 ++-
 elf/dl-libname.c                              | 275 +++++++
 elf/dl-libname.h                              | 121 ++++
 elf/dl-load.c                                 | 499 ++++++-------
 elf/dl-load.h                                 |  10 +-
 elf/dl-lookup-direct.c                        |   5 +-
 elf/dl-lookup.c                               | 150 ++--
 elf/dl-machine-reject-phdr.h                  |   4 +-
 elf/dl-map-segments.h                         |  16 +-
 elf/dl-minimal.c                              |   4 +-
 elf/dl-misc.c                                 |  20 -
 elf/dl-object.c                               | 181 +++--
 elf/dl-open.c                                 | 225 +++---
 elf/dl-profile.c                              |   4 +-
 elf/dl-protmem-internal.h                     | 100 +++
 elf/dl-protmem.c                              | 679 ++++++++++++++++++
 elf/dl-protmem.h                              | 102 +++
 elf/dl-protmem_bootstrap.h                    |  36 +
 elf/dl-reloc-static-pie.c                     |   7 +-
 elf/dl-reloc.c                                |  46 +-
 elf/dl-runtime.c                              |   6 +-
 elf/dl-setup_hash.c                           |   2 +-
 elf/dl-sort-maps.c                            |  53 +-
 elf/dl-static-tls.h                           |  10 +-
 elf/dl-support.c                              |  46 +-
 elf/dl-sym-post.h                             |   6 +-
 elf/dl-sym.c                                  |  10 +-
 elf/dl-symaddr.c                              |   2 +-
 elf/dl-sysdep-open.h                          |  45 --
 elf/dl-tls.c                                  |  61 +-
 elf/dl-tunables.list                          |   6 +
 elf/dl-unmap-segments.h                       |   2 +-
 elf/dl-usage.c                                |   2 +-
 elf/dl-version.c                              |  77 +-
 elf/do-rel.h                                  |  19 +-
 elf/dynamic-link.h                            |  14 +-
 elf/get-dynamic-info.h                        |  12 +-
 elf/libc-early-init.h                         |   6 +-
 elf/loadtest.c                                |  34 +-
 elf/neededtest.c                              |  18 +-
 elf/neededtest2.c                             |  18 +-
 elf/neededtest3.c                             |  18 +-
 elf/neededtest4.c                             |  18 +-
 elf/pldd-xx.c                                 |  19 +-
 elf/pldd.c                                    |   1 +
 elf/rtld.c                                    | 455 ++++++------
 elf/rtld_static_init.c                        |   2 +-
 elf/setup-vdso.h                              |  48 +-
 elf/sotruss-lib.c                             |   5 +-
 elf/sprof.c                                   |  27 +-
 elf/tlsdeschtab.h                             |   4 +-
 elf/tst-_dl_addr_inside_object.c              |  13 +-
 elf/tst-audit19a.c                            |   2 +-
 elf/tst-auditmod28.c                          |  11 +
 elf/tst-dl-protmem.c                          | 364 ++++++++++
 elf/tst-dl_find_object-threads.c              |   6 +-
 elf/tst-dl_find_object.c                      |  19 +-
 elf/tst-relro-linkmap-disabled-mod1.c         |  46 ++
 elf/tst-relro-linkmap-disabled-mod2.c         |   2 +
 elf/tst-relro-linkmap-disabled.c              |  64 ++
 elf/tst-relro-linkmap-mod1.c                  |  42 ++
 elf/tst-relro-linkmap-mod2.c                  |   2 +
 elf/tst-relro-linkmap-mod3.c                  |   2 +
 elf/tst-relro-linkmap.c                       | 112 +++
 elf/tst-rtld-list-tunables.exp                |   1 +
 elf/tst-rtld-nomem.c                          | 177 +++++
 elf/tst-tls6.c                                |   8 +-
 elf/tst-tls7.c                                |   8 +-
 elf/tst-tls8.c                                |  24 +-
 elf/unload.c                                  |  10 +-
 elf/unload2.c                                 |  10 +-
 htl/pt-alloc.c                                |   7 +-
 include/alloc_buffer.h                        |  26 +-
 include/dlfcn.h                               |   6 +-
 include/link.h                                | 178 +++--
 include/rtld-malloc.h                         |   5 +-
 include/set-freeres.h                         |   1 -
 libio/vtables.c                               |   2 +-
 malloc/Makefile                               |   6 +-
 malloc/Versions                               |   7 -
 malloc/alloc_buffer_alloc_array.c             |   1 -
 malloc/alloc_buffer_allocate.c                |   1 -
 malloc/alloc_buffer_copy_bytes.c              |   1 -
 malloc/alloc_buffer_copy_string.c             |   1 -
 malloc/alloc_buffer_create_failure.c          |   7 +-
 malloc/set-freeres.c                          |   2 -
 malloc/tst-alloc_buffer.c                     |   4 +
 manual/tunables.texi                          |  29 +
 nptl/Versions                                 |   3 +-
 nptl/pthread_create.c                         |   8 +
 nptl_db/db_info.c                             |   3 +-
 nptl_db/structs.def                           |   3 +-
 nptl_db/td_thr_tlsbase.c                      |  12 +-
 nss/Makefile                                  |   4 +-
 stdlib/cxa_thread_atexit_impl.c               |  10 +-
 stdlib/tst-tls-atexit.c                       |  10 +-
 support/Makefile                              |   3 +
 support/memprobe.h                            |  43 ++
 support/support-alloc_buffer.c                |  26 +
 support/support_memprobe.c                    | 251 +++++++
 support/tst-support_memprobe.c                | 118 +++
 sysdeps/aarch64/dl-bti.c                      |  14 +-
 sysdeps/aarch64/dl-lookupcfg.h                |   4 +-
 sysdeps/aarch64/dl-machine.h                  |  29 +-
 sysdeps/aarch64/dl-prop.h                     |  12 +-
 sysdeps/aarch64/dl-tlsdesc.h                  |   2 +-
 sysdeps/aarch64/tlsdesc.c                     |   2 +-
 sysdeps/alpha/dl-machine.h                    |  24 +-
 sysdeps/arc/dl-machine.h                      |  21 +-
 sysdeps/arm/dl-lookupcfg.h                    |   4 +-
 sysdeps/arm/dl-machine.h                      |  43 +-
 sysdeps/arm/dl-tlsdesc.h                      |   2 +-
 sysdeps/arm/tlsdesc.c                         |   2 +-
 sysdeps/csky/dl-machine.h                     |  22 +-
 sysdeps/generic/dl-debug.h                    |   2 +-
 sysdeps/generic/dl-early_mmap.h               |  35 +
 sysdeps/generic/dl-fptr.h                     |   4 +-
 sysdeps/generic/dl-prop.h                     |   8 +-
 sysdeps/generic/dl-protected.h                |  10 +-
 sysdeps/generic/dl-protmem-pkey.h             |  20 +
 sysdeps/generic/ldsodefs.h                    | 277 ++++---
 sysdeps/generic/rtld_static_init.h            |   3 +-
 sysdeps/hppa/dl-fptr.c                        |  10 +-
 sysdeps/hppa/dl-lookupcfg.h                   |   6 +-
 sysdeps/hppa/dl-machine.h                     |  29 +-
 sysdeps/hppa/dl-runtime.c                     |   4 +-
 sysdeps/hppa/dl-runtime.h                     |   2 +-
 sysdeps/hppa/dl-symaddr.c                     |   2 +-
 sysdeps/htl/pthreadP.h                        |   2 +-
 sysdeps/i386/dl-machine.h                     |  41 +-
 sysdeps/i386/dl-tlsdesc.h                     |   2 +-
 sysdeps/i386/tlsdesc.c                        |   2 +-
 sysdeps/ia64/dl-lookupcfg.h                   |   6 +-
 sysdeps/ia64/dl-machine.h                     |  29 +-
 sysdeps/loongarch/dl-machine.h                |  19 +-
 sysdeps/loongarch/dl-tls.h                    |   2 +-
 sysdeps/m68k/dl-machine.h                     |  20 +-
 sysdeps/m68k/dl-tls.h                         |   2 +-
 sysdeps/microblaze/dl-machine.h               |  23 +-
 sysdeps/mips/Makefile                         |   6 +
 sysdeps/mips/dl-debug.h                       |   2 +-
 sysdeps/mips/dl-machine-reject-phdr.h         |  20 +-
 sysdeps/mips/dl-machine.h                     |  74 +-
 sysdeps/mips/dl-tls.h                         |   2 +-
 sysdeps/mips/dl-trampoline.c                  |  19 +-
 sysdeps/nios2/dl-init.c                       |   6 +-
 sysdeps/nios2/dl-machine.h                    |  19 +-
 sysdeps/nios2/dl-tls.h                        |   2 +-
 sysdeps/nptl/dl-mutex.c                       |   2 +-
 sysdeps/or1k/dl-machine.h                     |  20 +-
 sysdeps/powerpc/dl-tls.h                      |   2 +-
 sysdeps/powerpc/powerpc32/dl-machine.c        |  19 +-
 sysdeps/powerpc/powerpc32/dl-machine.h        |  40 +-
 sysdeps/powerpc/powerpc64/dl-machine.c        |   8 +-
 sysdeps/powerpc/powerpc64/dl-machine.h        |  48 +-
 sysdeps/riscv/dl-machine.h                    |  26 +-
 sysdeps/riscv/dl-tls.h                        |   2 +-
 sysdeps/s390/s390-32/dl-machine.h             |  29 +-
 sysdeps/s390/s390-64/dl-machine.h             |  29 +-
 sysdeps/sh/dl-machine.h                       |  36 +-
 sysdeps/sparc/sparc32/dl-machine.h            |  24 +-
 sysdeps/sparc/sparc64/dl-irel.h               |   2 +-
 sysdeps/sparc/sparc64/dl-machine.h            |  27 +-
 sysdeps/sparc/sparc64/dl-plt.h                |   4 +-
 sysdeps/unix/sysv/linux/dl-early_allocate.c   |  17 +-
 sysdeps/unix/sysv/linux/dl-early_mmap.h       |  41 ++
 sysdeps/unix/sysv/linux/dl-origin.c           |   1 -
 sysdeps/unix/sysv/linux/dl-protmem-pkey.h     |  23 +
 sysdeps/unix/sysv/linux/dl-sysdep.c           |   2 +
 sysdeps/unix/sysv/linux/dl-vdso.h             |   2 +-
 .../sysv/linux/include/bits/mman-shared.h     |  16 +
 sysdeps/unix/sysv/linux/pkey_get.c            |   5 +-
 sysdeps/unix/sysv/linux/pkey_mprotect.c       |   4 +-
 sysdeps/unix/sysv/linux/pkey_set.c            |   5 +-
 sysdeps/unix/sysv/linux/powerpc/libc-start.c  |   2 +-
 .../sysv/linux/powerpc/powerpc64/ldsodefs.h   |  14 +-
 .../sysv/linux/powerpc/powerpc64/pkey_get.c   |   4 +-
 .../sysv/linux/powerpc/powerpc64/pkey_set.c   |   4 +-
 .../sysv/linux/powerpc/rtld_static_init.h     |   3 +-
 sysdeps/unix/sysv/linux/syscalls.list         |   4 +-
 sysdeps/unix/sysv/linux/x86/dl-protmem-pkey.h |  26 +
 sysdeps/unix/sysv/linux/x86/pkey_get.c        |   5 +-
 sysdeps/unix/sysv/linux/x86/pkey_set.c        |   5 +-
 sysdeps/x86/dl-cet.c                          |   4 +-
 sysdeps/x86/dl-lookupcfg.h                    |   4 +-
 sysdeps/x86/dl-prop.h                         |  29 +-
 sysdeps/x86_64/dl-machine.h                   |  39 +-
 sysdeps/x86_64/dl-tlsdesc.h                   |   2 +-
 sysdeps/x86_64/tlsdesc.c                      |   2 +-
 216 files changed, 5271 insertions(+), 2444 deletions(-)
 create mode 100644 elf/dl-libname.c
 create mode 100644 elf/dl-libname.h
 create mode 100644 elf/dl-protmem-internal.h
 create mode 100644 elf/dl-protmem.c
 create mode 100644 elf/dl-protmem.h
 create mode 100644 elf/dl-protmem_bootstrap.h
 delete mode 100644 elf/dl-sysdep-open.h
 create mode 100644 elf/tst-dl-protmem.c
 create mode 100644 elf/tst-relro-linkmap-disabled-mod1.c
 create mode 100644 elf/tst-relro-linkmap-disabled-mod2.c
 create mode 100644 elf/tst-relro-linkmap-disabled.c
 create mode 100644 elf/tst-relro-linkmap-mod1.c
 create mode 100644 elf/tst-relro-linkmap-mod2.c
 create mode 100644 elf/tst-relro-linkmap-mod3.c
 create mode 100644 elf/tst-relro-linkmap.c
 create mode 100644 elf/tst-rtld-nomem.c
 create mode 100644 support/memprobe.h
 create mode 100644 support/support-alloc_buffer.c
 create mode 100644 support/support_memprobe.c
 create mode 100644 support/tst-support_memprobe.c
 create mode 100644 sysdeps/generic/dl-early_mmap.h
 create mode 100644 sysdeps/generic/dl-protmem-pkey.h
 create mode 100644 sysdeps/unix/sysv/linux/dl-early_mmap.h
 create mode 100644 sysdeps/unix/sysv/linux/dl-protmem-pkey.h
 create mode 100644 sysdeps/unix/sysv/linux/include/bits/mman-shared.h
 create mode 100644 sysdeps/unix/sysv/linux/x86/dl-protmem-pkey.h


base-commit: e18c293af0ece38921ad71fbd76ff8049c3b2d67

Comments

Florian Weimer July 4, 2023, 8:07 p.m. UTC | #1
I forgot to mention that I tested this on aarch64-linux-gnu,
i686-linux-gnu, powerpc64le-linux-gnu (with the hash MMU and protection
keys on POWER8, and with the radix MMU on POWER10, so without protection
keys), and x86-64-linux-gnu.  Force-enabling protection keys on x86-64
causes malloc/tst-mallocfork* tests to fail because we cannot read
dynamic linker data structures anymore during lazy binding.

Thanks,
Florian
Carlos O'Donell July 5, 2023, 3:54 p.m. UTC | #2
On 7/4/23 16:07, Florian Weimer via Libc-alpha wrote:
> I forgot to mention that I tested this on aarch64-linux-gnu,
> i686-linux-gnu, powerpc64le-linux-gnu (with the hash MMU and protection
> keys on POWER8, and with the radix MMU on POWER10, so without protection
> keys), and x86-64-linux-gnu.  Force-enabling protection keys on x86-64
> causes malloc/tst-mallocfork* tests to fail because we cannot read
> dynamic linker data structures anymore during lazy binding.

Fails pre-commit CI for 32-bit ARM:
https://patchwork.sourceware.org/project/glibc/patch/477cc628fed2769f25399d7674080dd257a80d46.1688499219.git.fweimer@redhat.com/

		=== glibc tests ===

Running glibc:dlfcn ...
FAIL: dlfcn/tststatic4 

		=== Results Summary ===
Florian Weimer July 5, 2023, 3:57 p.m. UTC | #3
* Carlos O'Donell:

> On 7/4/23 16:07, Florian Weimer via Libc-alpha wrote:
>> I forgot to mention that I tested this on aarch64-linux-gnu,
>> i686-linux-gnu, powerpc64le-linux-gnu (with the hash MMU and protection
>> keys on POWER8, and with the radix MMU on POWER10, so without protection
>> keys), and x86-64-linux-gnu.  Force-enabling protection keys on x86-64
>> causes malloc/tst-mallocfork* tests to fail because we cannot read
>> dynamic linker data structures anymore during lazy binding.
>
> Fails pre-commit CI for 32-bit ARM:
> https://patchwork.sourceware.org/project/glibc/patch/477cc628fed2769f25399d7674080dd257a80d46.1688499219.git.fweimer@redhat.com/
>
> 		=== glibc tests ===
>
> Running glibc:dlfcn ...
> FAIL: dlfcn/tststatic4 
>
> 		=== Results Summary ===

Yes, already saw that report.  Any idea why?  I don't have an easy way
to reproduce failures on that architecture, unfortunately.

Thanks,
Florian
Carlos O'Donell July 5, 2023, 5:48 p.m. UTC | #4
On 7/5/23 11:57, Florian Weimer wrote:
> * Carlos O'Donell:
> 
>> On 7/4/23 16:07, Florian Weimer via Libc-alpha wrote:
>>> I forgot to mention that I tested this on aarch64-linux-gnu,
>>> i686-linux-gnu, powerpc64le-linux-gnu (with the hash MMU and protection
>>> keys on POWER8, and with the radix MMU on POWER10, so without protection
>>> keys), and x86-64-linux-gnu.  Force-enabling protection keys on x86-64
>>> causes malloc/tst-mallocfork* tests to fail because we cannot read
>>> dynamic linker data structures anymore during lazy binding.
>>
>> Fails pre-commit CI for 32-bit ARM:
>> https://patchwork.sourceware.org/project/glibc/patch/477cc628fed2769f25399d7674080dd257a80d46.1688499219.git.fweimer@redhat.com/
>>
>> 		=== glibc tests ===
>>
>> Running glibc:dlfcn ...
>> FAIL: dlfcn/tststatic4 
>>
>> 		=== Results Summary ===
> 
> Yes, already saw that report.  Any idea why?  I don't have an easy way
> to reproduce failures on that architecture, unfortunately.

I don't know why. 

Maxim and I have discussed what to do in these scenarios and what expectations the
community should have about the pre-commit CI test owner.

My understanding is that Linaro is here to help determine why the failure occurred
and work with you to find a solution.

Notes:
https://sourceware.org/glibc/wiki/PreCommitCI
Adhemerval Zanella July 5, 2023, 5:58 p.m. UTC | #5
On 05/07/23 14:48, Carlos O'Donell wrote:
> On 7/5/23 11:57, Florian Weimer wrote:
>> * Carlos O'Donell:
>>
>>> On 7/4/23 16:07, Florian Weimer via Libc-alpha wrote:
>>>> I forgot to mention that I tested this on aarch64-linux-gnu,
>>>> i686-linux-gnu, powerpc64le-linux-gnu (with the hash MMU and protection
>>>> keys on POWER8, and with the radix MMU on POWER10, so without protection
>>>> keys), and x86-64-linux-gnu.  Force-enabling protection keys on x86-64
>>>> causes malloc/tst-mallocfork* tests to fail because we cannot read
>>>> dynamic linker data structures anymore during lazy binding.
>>>
>>> Fails pre-commit CI for 32-bit ARM:
>>> https://patchwork.sourceware.org/project/glibc/patch/477cc628fed2769f25399d7674080dd257a80d46.1688499219.git.fweimer@redhat.com/
>>>
>>> 		=== glibc tests ===
>>>
>>> Running glibc:dlfcn ...
>>> FAIL: dlfcn/tststatic4 
>>>
>>> 		=== Results Summary ===
>>
>> Yes, already saw that report.  Any idea why?  I don't have an easy way
>> to reproduce failures on that architecture, unfortunately.
> 
> I don't know why. 
> 
> Maxim and I have discussed what to do in these scenarios and what expectations the
> community should have about the pre-commit CI test owner.
> 
> My understanding is that Linaro is here to help determine why the failure occurred
> and work with you to find a solution.
> 
> Notes:
> https://sourceware.org/glibc/wiki/PreCommitCI

I will take a look of what might be happening here.
Florian Weimer July 7, 2023, 11:10 a.m. UTC | #6
* Adhemerval Zanella Netto:

> On 05/07/23 14:48, Carlos O'Donell wrote:
>> On 7/5/23 11:57, Florian Weimer wrote:
>>> * Carlos O'Donell:
>>>
>>>> On 7/4/23 16:07, Florian Weimer via Libc-alpha wrote:
>>>>> I forgot to mention that I tested this on aarch64-linux-gnu,
>>>>> i686-linux-gnu, powerpc64le-linux-gnu (with the hash MMU and protection
>>>>> keys on POWER8, and with the radix MMU on POWER10, so without protection
>>>>> keys), and x86-64-linux-gnu.  Force-enabling protection keys on x86-64
>>>>> causes malloc/tst-mallocfork* tests to fail because we cannot read
>>>>> dynamic linker data structures anymore during lazy binding.
>>>>
>>>> Fails pre-commit CI for 32-bit ARM:
>>>> https://patchwork.sourceware.org/project/glibc/patch/477cc628fed2769f25399d7674080dd257a80d46.1688499219.git.fweimer@redhat.com/
>>>>
>>>> 		=== glibc tests ===
>>>>
>>>> Running glibc:dlfcn ...
>>>> FAIL: dlfcn/tststatic4 
>>>>
>>>> 		=== Results Summary ===
>>>
>>> Yes, already saw that report.  Any idea why?  I don't have an easy way
>>> to reproduce failures on that architecture, unfortunately.
>> 
>> I don't know why. 
>> 
>> Maxim and I have discussed what to do in these scenarios and what expectations the
>> community should have about the pre-commit CI test owner.
>> 
>> My understanding is that Linaro is here to help determine why the failure occurred
>> and work with you to find a solution.
>> 
>> Notes:
>> https://sourceware.org/glibc/wiki/PreCommitCI
>
> I will take a look of what might be happening here.

I can't find the failure anymore.

What's the target triplet for this builder?  What's the closest
build-many-glibcs.py configuration?

Thanks,
Florian
Florian Weimer July 7, 2023, 12:42 p.m. UTC | #7
* Florian Weimer:

>> I will take a look of what might be happening here.
>
> I can't find the failure anymore.
>
> What's the target triplet for this builder?  What's the closest
> build-many-glibcs.py configuration?

I managed to reproduce it using the arm-linux-gnueabihf configureation:

openat(AT_FDCWD, "./", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
read(3, 0xffcd9a28, 512)                = -1 EISDIR (Is a directory)
close(3)                                = 0
dlopen [initial] (NULL): ./: cannot read file data: Is a directory
exit_group(1)                           = ?

Or alternatively, using a different LD_LIBRARY_PATH setting with an
absolute path:

openat(AT_FDCWD, "/root/build/", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
read(3, 0xffe12f18, 512)                = -1 EISDIR (Is a directory)
close(3)                                = 0
dlopen [initial] (NULL): /root/build/: cannot read file data: Is a directory
exit_group(1)                           = ?

I assume it's the same failure.

It's because there's no vDSO:

00010000-00098000 r-xp 00000000 fd:00 11958481                           /root/build/dlfcn/tststatic4
00098000-000b5000 rw-p 00087000 fd:00 11958481                           /root/build/dlfcn/tststatic4
000b5000-000b8000 rw-p 00000000 00:00 0                                  [heap]
000b8000-000da000 rw-p 00000000 00:00 0                                  [heap]
f7fef000-f7ff0000 r-xp 00000000 00:00 0                                  [sigpage]
fffcf000-ffff0000 rw-p 00000000 00:00 0                                  [stack]
ffff0000-ffff1000 r-xp 00000000 00:00 0                                  [vectors]

This means that the main map is the only map, and l_prev and l_next are
NULL.  The _dl_lookup_map function treats this as the early startup
situation, where things show up in the hash table but are not actually
on the list.

I think I can #ifdef out that check for !SHARED.

Thanks,
Florian
Adhemerval Zanella July 7, 2023, 12:48 p.m. UTC | #8
On 07/07/23 09:42, Florian Weimer wrote:
> * Florian Weimer:
> 
>>> I will take a look of what might be happening here.
>>
>> I can't find the failure anymore.
>>
>> What's the target triplet for this builder?  What's the closest
>> build-many-glibcs.py configuration?
> 
> I managed to reproduce it using the arm-linux-gnueabihf configureation:
> 
> openat(AT_FDCWD, "./", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
> read(3, 0xffcd9a28, 512)                = -1 EISDIR (Is a directory)
> close(3)                                = 0
> dlopen [initial] (NULL): ./: cannot read file data: Is a directory
> exit_group(1)                           = ?
> 
> Or alternatively, using a different LD_LIBRARY_PATH setting with an
> absolute path:
> 
> openat(AT_FDCWD, "/root/build/", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
> read(3, 0xffe12f18, 512)                = -1 EISDIR (Is a directory)
> close(3)                                = 0
> dlopen [initial] (NULL): /root/build/: cannot read file data: Is a directory
> exit_group(1)                           = ?
> 
> I assume it's the same failure.
> 
> It's because there's no vDSO:
> 
> 00010000-00098000 r-xp 00000000 fd:00 11958481                           /root/build/dlfcn/tststatic4
> 00098000-000b5000 rw-p 00087000 fd:00 11958481                           /root/build/dlfcn/tststatic4
> 000b5000-000b8000 rw-p 00000000 00:00 0                                  [heap]
> 000b8000-000da000 rw-p 00000000 00:00 0                                  [heap]
> f7fef000-f7ff0000 r-xp 00000000 00:00 0                                  [sigpage]
> fffcf000-ffff0000 rw-p 00000000 00:00 0                                  [stack]
> ffff0000-ffff1000 r-xp 00000000 00:00 0                                  [vectors]
> 
> This means that the main map is the only map, and l_prev and l_next are
> NULL.  The _dl_lookup_map function treats this as the early startup
> situation, where things show up in the hash table but are not actually
> on the list.
> 
> I think I can #ifdef out that check for !SHARED.

I also seeing extra regressions on arm-linux-gnueabihf:

FAIL: elf/tst-audit11
FAIL: elf/tst-audit12
FAIL: elf/tst-audit2
FAIL: elf/tst-audit9
FAIL: elf/tst-tls-ie-dlmopen


$ cat elf/tst-audit11.out
Start

$ cat elf/tst-audit12.out
Start

$ cat elf/tst-audit2.out
version: 2
objopen: 0, �t��
objopen: 0, /home/adhemerval.zanella/projects/glibc/build/arm-linux-gnueabihf/elf/ld.so
activity: add
objsearch: libc.so.6, LA_SET_ORIG
objsearch: /home/adhemerval.zanella/projects/glibc/build/arm-linux-gnueabihf/libc.so.6, LA_SER_RUNPATH
objopen: 0, /home/adhemerval.zanella/projects/glibc/build/arm-linux-gnueabihf/libc.so.6
symbind32: symname=calloc, st_value=0x7af938, ndx=68, flags=8
symbind32: symname=free, st_value=0xf7709774, ndx=1547, flags=8
symbind32: symname=malloc, st_value=0xf7708ed0, ndx=397, flags=8
symbind32: symname=realloc, st_value=0xf7709b20, ndx=2033, flags=8
symbind32: symname=__tls_get_addr, st_value=0xf7b786a4, ndx=19, flags=3
symbind32: symname=malloc, st_value=0xf7708ed0, ndx=397, flags=0
pltenter: symname=malloc, st_value=0xf7708ed0, ndx=397, flags=0
symbind32: symname=__tunable_get_val, st_value=0xf7b79610, ndx=30, flags=0
pltenter: symname=__tunable_get_val, st_value=0xf7b79610, ndx=30, flags=0
pltenter: symname=__tunable_get_val, st_value=0xf7b79610, ndx=30, flags=0
pltenter: symname=__tunable_get_val, st_value=0xf7b79610, ndx=30, flags=0
pltenter: symname=__tunable_get_val, st_value=0xf7b79610, ndx=30, flags=0
pltenter: symname=__tunable_get_val, st_value=0xf7b79610, ndx=30, flags=0
pltenter: symname=__tunable_get_val, st_value=0xf7b79610, ndx=30, flags=0
pltenter: symname=__tunable_get_val, st_value=0xf7b79610, ndx=30, flags=0
pltenter: symname=__tunable_get_val, st_value=0xf7b79610, ndx=30, flags=0
pltenter: symname=__tunable_get_val, st_value=0xf7b79610, ndx=30, flags=0
pltenter: symname=__tunable_get_val, st_value=0xf7b79610, ndx=30, flags=0
pltenter: symname=__tunable_get_val, st_value=0xf7b79610, ndx=30, flags=0
pltenter: symname=__tunable_get_val, st_value=0xf7b79610, ndx=30, flags=0
symbind32: symname=memset, st_value=0xf770fbf0, ndx=1300, flags=0
pltenter: symname=memset, st_value=0xf770fbf0, ndx=1300, flags=0
pltenter: symname=__tunable_get_val, st_value=0xf7b79610, ndx=30, flags=0
pltenter: symname=__tunable_get_val, st_value=0xf7b79610, ndx=30, flags=0
pltenter: symname=__tunable_get_val, st_value=0xf7b79610, ndx=30, flags=0
activity: consistent
symbind32: symname=mmap, st_value=0xf777fc5c, ndx=3064, flags=0
pltenter: symname=mmap, st_value=0xf777fc5c, ndx=3064, flags=0
symbind32: symname=__libc_start_main, st_value=0xf7682b1c, ndx=710, flags=0
pltenter: symname=__libc_start_main, st_value=0xf7682b1c, ndx=710, flags=0
pltenter: symname=mmap, st_value=0xf777fc5c, ndx=3064, flags=0
symbind32: symname=_dl_audit_preinit, st_value=0xf7b7b2d0, ndx=38, flags=0
pltenter: symname=_dl_audit_preinit, st_value=0xf7b7b2d0, ndx=38, flags=0
preinit
pltenter: symname=memset, st_value=0xf770fbf0, ndx=1300, flags=0
symbind32: symname=getenv, st_value=0xf76a0e1c, ndx=1900, flags=0
pltenter: symname=getenv, st_value=0xf76a0e1c, ndx=1900, flags=0
symbind32: symname=mallopt, st_value=0xf770b28c, ndx=1591, flags=0
pltenter: symname=mallopt, st_value=0xf770b28c, ndx=1591, flags=0
symbind32: symname=getopt_long, st_value=0xf774736c, ndx=1467, flags=0
pltenter: symname=getopt_long, st_value=0xf774736c, ndx=1467, flags=0
pltenter: symname=getenv, st_value=0xf76a0e1c, ndx=1900, flags=0
pltenter: symname=getenv, st_value=0xf76a0e1c, ndx=1900, flags=0
symbind32: symname=setvbuf, st_value=0xf76d49ac, ndx=1342, flags=0
pltenter: symname=setvbuf, st_value=0xf76d49ac, ndx=1342, flags=0
pltenter: symname=getenv, st_value=0xf76a0e1c, ndx=1900, flags=0
pltenter: symname=getenv, st_value=0xf76a0e1c, ndx=1900, flags=0
symbind32: symname=fork, st_value=0xf7741ab8, ndx=2327, flags=0
pltenter: symname=fork, st_value=0xf7741ab8, ndx=2327, flags=0
symbind32: symname=signal, st_value=0xf769c164, ndx=973, flags=0
pltenter: symname=signal, st_value=0xf769c164, ndx=973, flags=0
symbind32: symname=alarm, st_value=0xf773cb88, ndx=350, flags=0
pltenter: symname=alarm, st_value=0xf773cb88, ndx=350, flags=0
pltenter: symname=signal, st_value=0xf769c164, ndx=973, flags=0
symbind32: symname=waitpid, st_value=0xf7765508, ndx=2882, flags=0
pltenter: symname=waitpid, st_value=0xf7765508, ndx=2882, flags=0
symbind32: symname=setrlimit64, st_value=0xf7779db8, ndx=2349, flags=0
pltenter: symname=setrlimit64, st_value=0xf7779db8, ndx=2349, flags=0
symbind32: symname=setpgid, st_value=0xf7762740, ndx=1838, flags=0
pltenter: symname=setpgid, st_value=0xf7762740, ndx=1838, flags=0
pltenter: symname=getenv, st_value=0xf76a0e1c, ndx=1900, flags=0
symbind32: symname=dlopen, st_value=0xf76eab28, ndx=1984, flags=0
pltenter: symname=dlopen, st_value=0xf76eab28, ndx=1984, flags=0
objsearch: $ORIGIN/tst-auditmod9b.so, LA_SET_ORIG
activity: delete
symbind32: symname=__cxa_finalize, st_value=0xf769f328, ndx=232, flags=0
pltenter: symname=__cxa_finalize, st_value=0xf769f328, ndx=232, flags=0
objclose
objclose
objclose
activity: consistent

(the objopen: seems bogus)

$ cat elf/tst-audit9.out
$

$ cat elf/tst-tls-ie-dlmopen.out
maintls[1000]:   0xf7a83f88 .. 0xf7a84370
var0[480]:       0x15b8320 .. 0x15b8500 global-dynamic
var1[120]:       0x15b86a8 .. 0x15b8720 global-dynamic
var2[24]:        0x15b8f00 .. 0x15b8f18 global-dynamic
var3[16]:        0x15b90c0 .. 0x15b90d0 global-dynamic
var6[576]:       0xf7a844b0 .. 0xf7a846f0 initial-exec
The next dlmopen should fail...
...OK failed with: /home/adhemerval.zanella/projects/glibc/build/arm-linux-gnueabihf/elf/tst-tls-ie-mod4.so: cannot allocate memory in static TLS block.
Didn't expect signal from child: got `Segmentation fault'
Florian Weimer July 7, 2023, 1:18 p.m. UTC | #9
* Adhemerval Zanella Netto:

> I also seeing extra regressions on arm-linux-gnueabihf:
>
> FAIL: elf/tst-audit11
> FAIL: elf/tst-audit12
> FAIL: elf/tst-audit2
> FAIL: elf/tst-audit9
> FAIL: elf/tst-tls-ie-dlmopen
>
>
> $ cat elf/tst-audit11.out
> Start
>
> $ cat elf/tst-audit12.out
> Start
>
> $ cat elf/tst-audit2.out
> version: 2

This one I can't reproduce:

# LD_AUDIT=elf/tst-auditmod1.so elf/ld.so --library-path .:elf elf/tst-audit2
[…]
objclose
objclose
objclose
activity: consistent
# echo $?
0

I'm going to repost the series with the vDSO fix.

Thanks,
Florian
Adhemerval Zanella July 7, 2023, 1:58 p.m. UTC | #10
On 07/07/23 10:18, Florian Weimer wrote:
> * Adhemerval Zanella Netto:
> 
>> I also seeing extra regressions on arm-linux-gnueabihf:
>>
>> FAIL: elf/tst-audit11
>> FAIL: elf/tst-audit12
>> FAIL: elf/tst-audit2
>> FAIL: elf/tst-audit9
>> FAIL: elf/tst-tls-ie-dlmopen
>>
>>
>> $ cat elf/tst-audit11.out
>> Start
>>
>> $ cat elf/tst-audit12.out
>> Start
>>
>> $ cat elf/tst-audit2.out
>> version: 2
> 
> This one I can't reproduce:
> 
> # LD_AUDIT=elf/tst-auditmod1.so elf/ld.so --library-path .:elf elf/tst-audit2
> […]
> objclose
> objclose
> objclose
> activity: consistent
> # echo $?
> 0
> 
> I'm going to repost the series with the vDSO fix.

Alright, I will check again once you repost it.  I have not look this series in
detail, but I wonder if you could split in first adding the dlopen hash speedup
and then add the hardening, as two different sets.  I am not sure how hard or
if it would be feasible.
Florian Weimer July 7, 2023, 2:55 p.m. UTC | #11
* Adhemerval Zanella Netto:

> Alright, I will check again once you repost it.  I have not look this
> series in detail, but I wonder if you could split in first adding the
> dlopen hash speedup and then add the hardening, as two different sets.
> I am not sure how hard or if it would be feasible.

The struct link_map_private change conflicts to some degree.  But having
the protected memory allocator really simplifies two things: we no
longer have to worry about switching from the minimal malloc to the
libc.so malloc.  It's always the protected memory allocator, so
_dl_protmem_free works during early startup and once user code starts
running.  The other thing is that valgrind and mtrace do not see the
allocations, so we don't need to free them.  I alluded this in the cover
letter; I think we should add something to report leaked memory based on
tracing from the data structure roots, rather than figuring out a safe
way how to deallocate things during process shutdown.

Thanks,
Florian