Message ID | cover.1688499219.git.fweimer@redhat.com |
---|---|
Series | RFC: RELRO link maps |
I forgot to mention that I tested this on aarch64-linux-gnu, i686-linux-gnu, powerpc64le-linux-gnu (with the hash MMU and protection keys on POWER8, and with the radix MMU on POWER10, so without protection keys), and x86-64-linux-gnu. Force-enabling protection keys on x86-64 causes malloc/tst-mallocfork* tests to fail because we cannot read dynamic linker data structures anymore during lazy binding. Thanks, Florian
On 7/4/23 16:07, Florian Weimer via Libc-alpha wrote:
> I forgot to mention that I tested this on aarch64-linux-gnu,
> i686-linux-gnu, powerpc64le-linux-gnu (with the hash MMU and protection
> keys on POWER8, and with the radix MMU on POWER10, so without protection
> keys), and x86-64-linux-gnu. Force-enabling protection keys on x86-64
> causes malloc/tst-mallocfork* tests to fail because we cannot read
> dynamic linker data structures anymore during lazy binding.

Fails pre-commit CI for 32-bit ARM:
https://patchwork.sourceware.org/project/glibc/patch/477cc628fed2769f25399d7674080dd257a80d46.1688499219.git.fweimer@redhat.com/

=== glibc tests ===

Running glibc:dlfcn ...
FAIL: dlfcn/tststatic4

=== Results Summary ===
* Carlos O'Donell:

> Fails pre-commit CI for 32-bit ARM:
> https://patchwork.sourceware.org/project/glibc/patch/477cc628fed2769f25399d7674080dd257a80d46.1688499219.git.fweimer@redhat.com/
>
> === glibc tests ===
>
> Running glibc:dlfcn ...
> FAIL: dlfcn/tststatic4
>
> === Results Summary ===

Yes, already saw that report. Any idea why? I don't have an easy way to reproduce failures on that architecture, unfortunately.

Thanks,
Florian
On 7/5/23 11:57, Florian Weimer wrote:
> Yes, already saw that report. Any idea why? I don't have an easy way
> to reproduce failures on that architecture, unfortunately.

I don't know why.

Maxim and I have discussed what to do in these scenarios and what expectations the community should have about the pre-commit CI test owner.

My understanding is that Linaro is here to help determine why the failure occurred and work with you to find a solution.

Notes:
https://sourceware.org/glibc/wiki/PreCommitCI
On 05/07/23 14:48, Carlos O'Donell wrote:
> I don't know why.
>
> Maxim and I have discussed what to do in these scenarios and what expectations the
> community should have about the pre-commit CI test owner.
>
> My understanding is that Linaro is here to help determine why the failure occurred
> and work with you to find a solution.
>
> Notes:
> https://sourceware.org/glibc/wiki/PreCommitCI

I will take a look at what might be happening here.
* Adhemerval Zanella Netto:

> I will take a look at what might be happening here.

I can't find the failure anymore.

What's the target triplet for this builder? What's the closest build-many-glibcs.py configuration?

Thanks,
Florian
* Florian Weimer:

>> I will take a look at what might be happening here.
>
> I can't find the failure anymore.
>
> What's the target triplet for this builder? What's the closest
> build-many-glibcs.py configuration?

I managed to reproduce it using the arm-linux-gnueabihf configuration:

  openat(AT_FDCWD, "./", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
  read(3, 0xffcd9a28, 512) = -1 EISDIR (Is a directory)
  close(3) = 0
  dlopen [initial] (NULL): ./: cannot read file data: Is a directory
  exit_group(1) = ?

Or alternatively, using a different LD_LIBRARY_PATH setting with an absolute path:

  openat(AT_FDCWD, "/root/build/", O_RDONLY|O_LARGEFILE|O_CLOEXEC) = 3
  read(3, 0xffe12f18, 512) = -1 EISDIR (Is a directory)
  close(3) = 0
  dlopen [initial] (NULL): /root/build/: cannot read file data: Is a directory
  exit_group(1) = ?

I assume it's the same failure.

It's because there's no vDSO:

  00010000-00098000 r-xp 00000000 fd:00 11958481 /root/build/dlfcn/tststatic4
  00098000-000b5000 rw-p 00087000 fd:00 11958481 /root/build/dlfcn/tststatic4
  000b5000-000b8000 rw-p 00000000 00:00 0 [heap]
  000b8000-000da000 rw-p 00000000 00:00 0 [heap]
  f7fef000-f7ff0000 r-xp 00000000 00:00 0 [sigpage]
  fffcf000-ffff0000 rw-p 00000000 00:00 0 [stack]
  ffff0000-ffff1000 r-xp 00000000 00:00 0 [vectors]

This means that the main map is the only map, and l_prev and l_next are NULL. The _dl_lookup_map function treats this as the early startup situation, where things show up in the hash table but are not actually on the list.

I think I can #ifdef out that check for !SHARED.

Thanks,
Florian
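The ambiguity described in the message above can be sketched in a few lines of C. This is only an illustration under stated assumptions: `struct map`, `looks_unlinked`, and `list_append` are hypothetical names invented here, not glibc's `struct link_map` or the actual `_dl_lookup_map` logic. The sketch shows why "both neighbour pointers are NULL" cannot distinguish a map that is not yet on the list from a map that is the list's only element, which is the situation in a static binary without a vDSO.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a link map node; not glibc's struct link_map.  */
struct map
{
  struct map *prev;
  struct map *next;
};

/* Heuristic under discussion: a map without neighbours is assumed to be
   not yet on the list (the "early startup" case).  */
static bool
looks_unlinked (const struct map *m)
{
  return m->prev == NULL && m->next == NULL;
}

/* Append M to the doubly linked list whose first element is *HEAD.  */
static void
list_append (struct map **head, struct map *m)
{
  assert (head != NULL && m != NULL);
  m->prev = NULL;
  m->next = NULL;
  if (*head == NULL)
    {
      /* M becomes the only element: both pointers stay NULL.  */
      *head = m;
      return;
    }
  struct map *tail = *head;
  while (tail->next != NULL)
    tail = tail->next;
  tail->next = m;
  m->prev = tail;
}
```

With a single appended element, `looks_unlinked` still returns true even though the element is on the list; once a second element (for example a vDSO map) is appended, the first element has a non-NULL `next` and the heuristic gives the expected answer.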
On 07/07/23 09:42, Florian Weimer wrote:
> I managed to reproduce it using the arm-linux-gnueabihf configuration:
> [...]
> It's because there's no vDSO:
> [...]
> This means that the main map is the only map, and l_prev and l_next are
> NULL. The _dl_lookup_map function treats this as the early startup
> situation, where things show up in the hash table but are not actually
> on the list.
>
> I think I can #ifdef out that check for !SHARED.

I am also seeing extra regressions on arm-linux-gnueabihf:

FAIL: elf/tst-audit11
FAIL: elf/tst-audit12
FAIL: elf/tst-audit2
FAIL: elf/tst-audit9
FAIL: elf/tst-tls-ie-dlmopen

$ cat elf/tst-audit11.out
Start

$ cat elf/tst-audit12.out
Start

$ cat elf/tst-audit2.out
version: 2
objopen: 0, �t��
objopen: 0, /home/adhemerval.zanella/projects/glibc/build/arm-linux-gnueabihf/elf/ld.so
activity: add
objsearch: libc.so.6, LA_SET_ORIG
objsearch: /home/adhemerval.zanella/projects/glibc/build/arm-linux-gnueabihf/libc.so.6, LA_SER_RUNPATH
objopen: 0, /home/adhemerval.zanella/projects/glibc/build/arm-linux-gnueabihf/libc.so.6
symbind32: symname=calloc, st_value=0x7af938, ndx=68, flags=8
symbind32: symname=free, st_value=0xf7709774, ndx=1547, flags=8
symbind32: symname=malloc, st_value=0xf7708ed0, ndx=397, flags=8
symbind32: symname=realloc, st_value=0xf7709b20, ndx=2033, flags=8
symbind32: symname=__tls_get_addr, st_value=0xf7b786a4, ndx=19, flags=3
symbind32: symname=malloc, st_value=0xf7708ed0, ndx=397, flags=0
pltenter: symname=malloc, st_value=0xf7708ed0, ndx=397, flags=0
symbind32: symname=__tunable_get_val, st_value=0xf7b79610, ndx=30, flags=0
pltenter: symname=__tunable_get_val, st_value=0xf7b79610, ndx=30, flags=0
pltenter: symname=__tunable_get_val, st_value=0xf7b79610, ndx=30, flags=0
pltenter: symname=__tunable_get_val, st_value=0xf7b79610, ndx=30, flags=0
pltenter: symname=__tunable_get_val, st_value=0xf7b79610, ndx=30, flags=0
pltenter: symname=__tunable_get_val, st_value=0xf7b79610, ndx=30, flags=0
pltenter: symname=__tunable_get_val, st_value=0xf7b79610, ndx=30, flags=0
pltenter: symname=__tunable_get_val, st_value=0xf7b79610, ndx=30, flags=0
pltenter: symname=__tunable_get_val, st_value=0xf7b79610, ndx=30, flags=0
pltenter: symname=__tunable_get_val, st_value=0xf7b79610, ndx=30, flags=0
pltenter: symname=__tunable_get_val, st_value=0xf7b79610, ndx=30, flags=0
pltenter: symname=__tunable_get_val, st_value=0xf7b79610, ndx=30, flags=0
pltenter: symname=__tunable_get_val, st_value=0xf7b79610, ndx=30, flags=0
symbind32: symname=memset, st_value=0xf770fbf0, ndx=1300, flags=0
pltenter: symname=memset, st_value=0xf770fbf0, ndx=1300, flags=0
pltenter: symname=__tunable_get_val, st_value=0xf7b79610, ndx=30, flags=0
pltenter: symname=__tunable_get_val, st_value=0xf7b79610, ndx=30, flags=0
pltenter: symname=__tunable_get_val, st_value=0xf7b79610, ndx=30, flags=0
activity: consistent
symbind32: symname=mmap, st_value=0xf777fc5c, ndx=3064, flags=0
pltenter: symname=mmap, st_value=0xf777fc5c, ndx=3064, flags=0
symbind32: symname=__libc_start_main, st_value=0xf7682b1c, ndx=710, flags=0
pltenter: symname=__libc_start_main, st_value=0xf7682b1c, ndx=710, flags=0
pltenter: symname=mmap, st_value=0xf777fc5c, ndx=3064, flags=0
symbind32: symname=_dl_audit_preinit, st_value=0xf7b7b2d0, ndx=38, flags=0
pltenter: symname=_dl_audit_preinit, st_value=0xf7b7b2d0, ndx=38, flags=0
preinit
pltenter: symname=memset, st_value=0xf770fbf0, ndx=1300, flags=0
symbind32: symname=getenv, st_value=0xf76a0e1c, ndx=1900, flags=0
pltenter: symname=getenv, st_value=0xf76a0e1c, ndx=1900, flags=0
symbind32: symname=mallopt, st_value=0xf770b28c, ndx=1591, flags=0
pltenter: symname=mallopt, st_value=0xf770b28c, ndx=1591, flags=0
symbind32: symname=getopt_long, st_value=0xf774736c, ndx=1467, flags=0
pltenter: symname=getopt_long, st_value=0xf774736c, ndx=1467, flags=0
pltenter: symname=getenv, st_value=0xf76a0e1c, ndx=1900, flags=0
pltenter: symname=getenv, st_value=0xf76a0e1c, ndx=1900, flags=0
symbind32: symname=setvbuf, st_value=0xf76d49ac, ndx=1342, flags=0
pltenter: symname=setvbuf, st_value=0xf76d49ac, ndx=1342, flags=0
pltenter: symname=getenv, st_value=0xf76a0e1c, ndx=1900, flags=0
pltenter: symname=getenv, st_value=0xf76a0e1c, ndx=1900, flags=0
symbind32: symname=fork, st_value=0xf7741ab8, ndx=2327, flags=0
pltenter: symname=fork, st_value=0xf7741ab8, ndx=2327, flags=0
symbind32: symname=signal, st_value=0xf769c164, ndx=973, flags=0
pltenter: symname=signal, st_value=0xf769c164, ndx=973, flags=0
symbind32: symname=alarm, st_value=0xf773cb88, ndx=350, flags=0
pltenter: symname=alarm, st_value=0xf773cb88, ndx=350, flags=0
pltenter: symname=signal, st_value=0xf769c164, ndx=973, flags=0
symbind32: symname=waitpid, st_value=0xf7765508, ndx=2882, flags=0
pltenter: symname=waitpid, st_value=0xf7765508, ndx=2882, flags=0
symbind32: symname=setrlimit64, st_value=0xf7779db8, ndx=2349, flags=0
pltenter: symname=setrlimit64, st_value=0xf7779db8, ndx=2349, flags=0
symbind32: symname=setpgid, st_value=0xf7762740, ndx=1838, flags=0
pltenter: symname=setpgid, st_value=0xf7762740, ndx=1838, flags=0
pltenter: symname=getenv, st_value=0xf76a0e1c, ndx=1900, flags=0
symbind32: symname=dlopen, st_value=0xf76eab28, ndx=1984, flags=0
pltenter: symname=dlopen, st_value=0xf76eab28, ndx=1984, flags=0
objsearch: $ORIGIN/tst-auditmod9b.so, LA_SET_ORIG
activity: delete
symbind32: symname=__cxa_finalize, st_value=0xf769f328, ndx=232, flags=0
pltenter: symname=__cxa_finalize, st_value=0xf769f328, ndx=232, flags=0
objclose
objclose
objclose
activity: consistent

(the objopen: seems bogus)

$ cat elf/tst-audit9.out
$

$ cat elf/tst-tls-ie-dlmopen.out
maintls[1000]: 0xf7a83f88 .. 0xf7a84370
var0[480]: 0x15b8320 .. 0x15b8500 global-dynamic
var1[120]: 0x15b86a8 .. 0x15b8720 global-dynamic
var2[24]: 0x15b8f00 .. 0x15b8f18 global-dynamic
var3[16]: 0x15b90c0 .. 0x15b90d0 global-dynamic
var6[576]: 0xf7a844b0 .. 0xf7a846f0 initial-exec
The next dlmopen should fail...
...OK failed with: /home/adhemerval.zanella/projects/glibc/build/arm-linux-gnueabihf/elf/tst-tls-ie-mod4.so: cannot allocate memory in static TLS block.
Didn't expect signal from child: got `Segmentation fault'
* Adhemerval Zanella Netto:

> I am also seeing extra regressions on arm-linux-gnueabihf:
>
> FAIL: elf/tst-audit11
> FAIL: elf/tst-audit12
> FAIL: elf/tst-audit2
> FAIL: elf/tst-audit9
> FAIL: elf/tst-tls-ie-dlmopen
>
> $ cat elf/tst-audit11.out
> Start
>
> $ cat elf/tst-audit12.out
> Start
>
> $ cat elf/tst-audit2.out
> version: 2

This one I can't reproduce:

  # LD_AUDIT=elf/tst-auditmod1.so elf/ld.so --library-path .:elf elf/tst-audit2
  […]
  objclose
  objclose
  objclose
  activity: consistent
  # echo $?
  0

I'm going to repost the series with the vDSO fix.

Thanks,
Florian
On 07/07/23 10:18, Florian Weimer wrote:
> This one I can't reproduce:
>
> # LD_AUDIT=elf/tst-auditmod1.so elf/ld.so --library-path .:elf elf/tst-audit2
> […]
> objclose
> objclose
> objclose
> activity: consistent
> # echo $?
> 0
>
> I'm going to repost the series with the vDSO fix.

Alright, I will check again once you repost it.

I have not looked at this series in detail, but I wonder if you could split it into two separate sets: first adding the dlopen hash speedup, and then adding the hardening. I am not sure how hard that would be, or whether it is feasible.
* Adhemerval Zanella Netto:

> Alright, I will check again once you repost it. I have not looked at this
> series in detail, but I wonder if you could split it into two separate
> sets: first adding the dlopen hash speedup, and then adding the hardening.
> I am not sure how hard that would be, or whether it is feasible.

The struct link_map_private change conflicts to some degree. But having the protected memory allocator really simplifies two things. First, we no longer have to worry about switching from the minimal malloc to the libc.so malloc: it's always the protected memory allocator, so _dl_protmem_free works both during early startup and once user code starts running. Second, valgrind and mtrace do not see the allocations, so we don't need to free them. I alluded to this in the cover letter; I think we should add something to report leaked memory based on tracing from the data structure roots, rather than figuring out a safe way to deallocate things during process shutdown.

Thanks,
Florian
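For readers unfamiliar with the general idea of a protected memory allocator, here is a minimal sketch, assuming a Linux system with `mmap` and `mprotect`. It is not the allocator from this series: `struct protmem` and the `protmem_*` functions are hypothetical names made up for illustration, and the design (a bump allocator over one anonymous mapping that is read-only except while it is being updated) is an assumption about how such an allocator could work. Because the memory comes directly from `mmap` instead of `malloc`, tools that track `malloc` calls would presumably not see these allocations, which matches the remark about valgrind and mtrace.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical bump allocator over a normally read-only mapping.
   All names are illustrative; none of this is glibc code.  */
struct protmem
{
  char *base;   /* Start of the anonymous mapping.  */
  size_t size;  /* Total size of the mapping in bytes.  */
  size_t used;  /* Bytes handed out so far.  */
};

/* Map PAGES pages read-only.  Returns 0 on success, -1 on failure.  */
static int
protmem_init (struct protmem *pm, size_t pages)
{
  long page = sysconf (_SC_PAGESIZE);
  if (page <= 0 || pages == 0)
    return -1;
  pm->size = pages * (size_t) page;
  pm->used = 0;
  void *p = mmap (NULL, pm->size, PROT_READ,
                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (p == MAP_FAILED)
    return -1;
  pm->base = p;
  return 0;
}

/* Make the region writable before an update.  */
static int
protmem_begin (struct protmem *pm)
{
  return mprotect (pm->base, pm->size, PROT_READ | PROT_WRITE);
}

/* Make the region read-only again after an update.  */
static int
protmem_end (struct protmem *pm)
{
  return mprotect (pm->base, pm->size, PROT_READ);
}

/* Hand out N bytes, rounded up to 16.  The caller may only write to the
   result between protmem_begin and protmem_end.  Returns NULL if the
   region is exhausted.  */
static void *
protmem_alloc (struct protmem *pm, size_t n)
{
  assert (pm->base != NULL);
  size_t aligned = (n + 15) & ~(size_t) 15;
  if (aligned < n || aligned > pm->size - pm->used)
    return NULL;
  void *result = pm->base + pm->used;
  pm->used += aligned;
  return result;
}
```

A caller would bracket every modification with `protmem_begin` and `protmem_end`; outside that window a stray write to the data faults instead of silently corrupting it. The sketch has no free operation and no thread safety, both of which a real allocator would need to address.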