Message ID | 20190121213530.23803-1-mathieu.desnoyers@efficios.com |
---|---|
State | New |
Series | [RFC,1/4] glibc: Perform rseq(2) registration at C startup and thread creation (v6) |
----- On Jan 21, 2019, at 4:35 PM, Mathieu Desnoyers mathieu.desnoyers@efficios.com wrote: [...] > diff --git a/sysdeps/unix/sysv/linux/sys/rseq.h > b/sysdeps/unix/sysv/linux/sys/rseq.h > new file mode 100644 > index 0000000000..61937fb193 > --- /dev/null > +++ b/sysdeps/unix/sysv/linux/sys/rseq.h > @@ -0,0 +1,64 @@ [...] > + > +#ifndef _SYS_RSEQ_H > +#define _SYS_RSEQ_H 1 > + > +/* We use the structures declarations from the kernel headers. */ > +#include <linux/rseq.h> > +#include <stdint.h> > + > +/* Signature required before each abort handler code. */ > +#define RSEQ_SIG 0x53053053 I recalled that aarch64 defines RSEQ_SIG to a different value which maps to a valid trap instruction. So I plan to move the RSEQ_SIG define to per-arch headers like this: sysdeps/unix/sysv/linux/aarch64/bits/rseq.h | 24 ++ sysdeps/unix/sysv/linux/arm/bits/rseq.h | 24 ++ sysdeps/unix/sysv/linux/bits/rseq.h | 23 ++ sysdeps/unix/sysv/linux/mips/bits/rseq.h | 24 ++ sysdeps/unix/sysv/linux/powerpc/bits/rseq.h | 24 ++ sysdeps/unix/sysv/linux/s390/bits/rseq.h | 24 ++ sysdeps/unix/sysv/linux/x86/bits/rseq.h | 24 ++ where "bits/rseq.h" contains a #error: # error "Architecture does not define RSEQ_SIG. sys/rseq.h will now include <bits/rseq.h>. > + > +enum rseq_register_state > +{ > + /* Value RSEQ_REGISTER_ALLOWED means it is allowed to update > + the refcount field and to register/unregister rseq. */ > + RSEQ_REGISTER_ALLOWED = 0, > + /* Value RSEQ_REGISTER_NESTED means it is temporarily forbidden > + to update the refcount field or to register/unregister rseq. */ > + RSEQ_REGISTER_NESTED = 1, I plan to rename "RSEQ_REGISTER_NESTED" to "RSEQ_REGISTER_ONGOING" which seems to better represent the current registration state. Please let me know if anything is wrong with those changes. Thanks, Mathieu > + /* Value RSEQ_REGISTER_EXITING means it is forbidden to update the > + refcount field or to register/unregister rseq for the rest of the > + thread's lifetime. 
*/ > + RSEQ_REGISTER_EXITING = 2, > +};
On Tue, 29 Jan 2019, Mathieu Desnoyers wrote: > I recalled that aarch64 defines RSEQ_SIG to a different value which maps to > a valid trap instruction. So I plan to move the RSEQ_SIG define to per-arch > headers like this: > > sysdeps/unix/sysv/linux/aarch64/bits/rseq.h | 24 ++ > sysdeps/unix/sysv/linux/arm/bits/rseq.h | 24 ++ > sysdeps/unix/sysv/linux/bits/rseq.h | 23 ++ > sysdeps/unix/sysv/linux/mips/bits/rseq.h | 24 ++ > sysdeps/unix/sysv/linux/powerpc/bits/rseq.h | 24 ++ > sysdeps/unix/sysv/linux/s390/bits/rseq.h | 24 ++ > sysdeps/unix/sysv/linux/x86/bits/rseq.h | 24 ++ > > where "bits/rseq.h" contains a #error: > > # error "Architecture does not define RSEQ_SIG. > > sys/rseq.h will now include <bits/rseq.h>. We're trying to reduce the number of cases where most or all new glibc architecture ports need to provide a bits/ header, by making the generic headers handle the common case. So a generic header with a #error, and lots of architecture-specific headers mostly with the same value for RSEQ_SIG, seems unfortunate. I'd hope the generic header could use a generic value, with architecture-specific variants only for architectures with some reason for a different value.
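The layout suggested above (one generic value, overridden only where an architecture has a reason to differ) could be sketched roughly as follows. This is an illustration only, not the layout glibc ended up with: `EXAMPLE_ARCH_RSEQ_SIG`, `EXAMPLE_RSEQ_SIG` and `example_rseq_sig` are hypothetical names, and the fallback simply reuses the 0x53053053 value from the RFC patch.

```c
#include <stdint.h>

/* A hypothetical per-architecture header would define
   EXAMPLE_ARCH_RSEQ_SIG only when the architecture needs its own value
   (for example, the encoding of a trap instruction).  Nothing defines
   it in this sketch, so the generic fallback below is selected.  */

#ifdef EXAMPLE_ARCH_RSEQ_SIG
# define EXAMPLE_RSEQ_SIG EXAMPLE_ARCH_RSEQ_SIG
#else
/* Generic fallback: the value proposed in the RFC patch's sys/rseq.h.  */
# define EXAMPLE_RSEQ_SIG 0x53053053u
#endif

/* Return the signature that was selected at compile time.  */
uint32_t
example_rseq_sig (void)
{
  return EXAMPLE_RSEQ_SIG;
}
```

With this shape a new port needs no header of its own unless it overrides the value. The cost is that an architecture relying on the fallback is committed to it as soon as applications are built against it.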
----- On Jan 29, 2019, at 4:56 PM, Joseph Myers joseph@codesourcery.com wrote: > On Tue, 29 Jan 2019, Mathieu Desnoyers wrote: > >> I recalled that aarch64 defines RSEQ_SIG to a different value which maps to >> a valid trap instruction. So I plan to move the RSEQ_SIG define to per-arch >> headers like this: >> >> sysdeps/unix/sysv/linux/aarch64/bits/rseq.h | 24 ++ >> sysdeps/unix/sysv/linux/arm/bits/rseq.h | 24 ++ >> sysdeps/unix/sysv/linux/bits/rseq.h | 23 ++ >> sysdeps/unix/sysv/linux/mips/bits/rseq.h | 24 ++ >> sysdeps/unix/sysv/linux/powerpc/bits/rseq.h | 24 ++ >> sysdeps/unix/sysv/linux/s390/bits/rseq.h | 24 ++ >> sysdeps/unix/sysv/linux/x86/bits/rseq.h | 24 ++ >> >> where "bits/rseq.h" contains a #error: >> >> # error "Architecture does not define RSEQ_SIG. >> >> sys/rseq.h will now include <bits/rseq.h>. > > We're trying to reduce the number of cases where most or all new glibc > architecture ports need to provide a bits/ header, by making the generic > headers handle the common case. So a generic header with a #error, and > lots of architecture-specific headers mostly with the same value for > RSEQ_SIG, seems unfortunate. I'd hope the generic header could use a > generic value, with architecture-specific variants only for architectures > with some reason for a different value. The issue here is that it would require us to decide right away what RSEQ_SIG is appropriate for all other Linux architectures supported by glibc. There are a few reasons for which an architecture can be required to specify its own RSEQ_SIG. For instance, it may need to map to an instruction defined in the instruction set, thus ensuring objdump does not get confused, and in other cases that the processor speculative execution happening just before the RSEQ_SIG really stops at the signature (hence the trap instruction on aarch64). 
I'm worried that if we introduce a "default" RSEQ_SIG value for architectures currently not supported by RSEQ and we then introduce an architecture-specific signature value in the future, some applications will try to build with wrong signatures, and when the rseq system call eventually gets implemented for those architectures and an end-user upgrades his kernel, those signatures won't match between glibc rseq registration and the application rseq abort handlers, thus leading to hard-to-reproduce segmentation faults delivered by the kernel checking those signatures upon rseq abort. This upgrade story is far from ideal. My thinking was to put the #error in the generic header, so architectures that are not supported yet cannot build against rseq.h at all, so we don't end up in a broken upgrade scenario. I'm open to alternative ways to do it though, as long as we don't let not-yet-supported architectures build broken code. Thoughts? Thanks, Mathieu
On Tue, 29 Jan 2019, Mathieu Desnoyers wrote: > My thinking was to put the #error in the generic header, so architectures that > are not supported yet cannot build against rseq.h at all, so we don't end up > in a broken upgrade scenario. I'm open to alternative ways to do it though, as > long as we don't let not-yet-supported architectures build broken code. Any case with #error in installed glibc headers needs special-casing in check-installed-headers.sh (and, thus, such errors are to be discouraged). Cases where architectures commonly need their own bits/ headers, especially where those are likely to need updating for new kernel versions, are also discouraged. Furthermore, a normal check for glibc headers updates needed for a new kernel version would only involve examining uapi headers (and the non-uapi linux/socket.h for new address families, an unfortunate existing wart in this area). As far as I can see, this value isn't defined in any uapi header, which makes it especially likely to be missed in such a check. Furthermore, I'm hoping to add more glibc tests for consistency of such constants between glibc and the kernel, to ensure any such updates missing are caught automatically through test failures - but that doesn't work if the constants in question aren't in a uapi header. If this constant were in a uapi header, the glibc header could just include that - is the issue that it's not actually an interface between glibc and the kernel at all, but some kind of purely-userspace interface? 
We very definitely wish to keep to a minimum the cases where updates need to be done separately in glibc by each architecture maintainer (that's just a recipe for some updates getting missed accidentally) - meaning that there needs to be a clear way in which someone can tell, globally for all architectures, whether the set of such architecture-specific headers for this constant in glibc is complete and current, and when it needs updating (and this should be as similar as possible to such checks for any other header constant).
----- On Jan 29, 2019, at 9:40 PM, Joseph Myers joseph@codesourcery.com wrote: > On Tue, 29 Jan 2019, Mathieu Desnoyers wrote: > >> My thinking was to put the #error in the generic header, so architectures that >> are not supported yet cannot build against rseq.h at all, so we don't end up >> in a broken upgrade scenario. I'm open to alternative ways to do it though, as >> long as we don't let not-yet-supported architectures build broken code. > > Any case with #error in installed glibc headers needs special-casing in > check-installed-headers.sh (and, thus, such errors are to be discouraged). One alternative to #error would be to have an empty generic bits/rseq.h that does _not_ define RSEQ_SIG. This way, it would be possible to include sys/rseq.h from an architecture that does not define RSEQ_SIG yet, but it would not cause any build failure. It's only if the code try to use RSEQ_SIG that it would fail to compile because undefined. > Cases where architectures commonly need their own bits/ headers, > especially where those are likely to need updating for new kernel > versions, are also discouraged. The per-arch bits/rseq.h headers, once they define a specific value for RSEQ_SIG, should never ever change that value. > Furthermore, a normal check for glibc > headers updates needed for a new kernel version would only involve > examining uapi headers (and the non-uapi linux/socket.h for new address > families, an unfortunate existing wart in this area). As far as I can > see, this value isn't defined in any uapi header, which makes it > especially likely to be missed in such a check. Furthermore, I'm hoping > to add more glibc tests for consistency of such constants between glibc > and the kernel, to ensure any such updates missing are caught > automatically through test failures - but that doesn't work if the > constants in question aren't in a uapi header. 
> > If this constant were in a uapi header, the glibc header could just > include that - is the issue that it's not actually an interface between > glibc and the kernel at all, but some kind of purely-userspace interface? The rseq uapi headers do not enforce the value of RSEQ_SIG. The role of the kernel wrt signature is to receive it as sys_rseq argument, and then validate that abort targets are prefixed with the signature before moving the instruction pointer there. Therefore, it's up to user-space to agree on the RSEQ_SIG value across all code using rseq within a process. Since glibc will be registering rseq and exposing public headers, it appears that glibc would be the appropriate project to define the RSEQ_SIG value for each architecture. > > We very definitely wish to keep to a minimum the cases where updates need > to be done separately in glibc by each architecture maintainer (that's > just a recipe for some updates getting missed accidentally) - meaning that > there needs to be a clear way in which someone can tell, globally for all > architectures, whether the set of such architecture-specific headers for > this constant in glibc is complete and current, and when it needs updating > (and this should be as similar to possible to such checks for any other > header constant). Currently, I use #ifdef __NR_rseq from uapi unistd.h to check whether the kernel headers implement the rseq system call for the target architecture. With the approach of having an empty bits/rseq.h for architectures not yet supporting rseq in glibc, one way to check that glibc implements RSEQ_SIG for all architectures that have the rseq system call wired up in uapi would be:

#include <unistd.h>
#include <sys/rseq.h>
#if defined (__NR_rseq) && !defined (RSEQ_SIG)
# error "UAPI headers support rseq system call, but glibc does not define RSEQ_SIG."
#endif

Would that take care of your concerns? Thanks, Mathieu
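A runnable variant of the check above, as a sketch: it reports which situation applies instead of stopping the build with #error, so it compiles on any system. Several things here are assumptions and not part of the proposal: `__NR_rseq` is picked up through `<sys/syscall.h>`, `<sys/rseq.h>` is included only when the compiler can find it (using `__has_include`), and the enum and function names are hypothetical.

```c
#include <stddef.h>

#ifdef __linux__
# include <sys/syscall.h>   /* Defines __NR_rseq if the kernel headers know rseq.  */
#endif

#if defined (__has_include)
# if __has_include (<sys/rseq.h>)
#  include <sys/rseq.h>     /* May or may not define RSEQ_SIG.  */
# endif
#endif

/* Hypothetical names: the three situations the check distinguishes.  */
enum example_rseq_sig_coverage
{
  EXAMPLE_NO_RSEQ_SYSCALL,  /* Kernel headers do not wire up rseq.  */
  EXAMPLE_SIG_DEFINED,      /* rseq is wired up and RSEQ_SIG is defined.  */
  EXAMPLE_SIG_MISSING       /* The case the proposed #error would catch.  */
};

/* Evaluate the same preprocessor conditions as the proposed check,
   but return the result so that a test can examine it.  */
enum example_rseq_sig_coverage
example_rseq_sig_coverage (void)
{
#if !defined (__NR_rseq)
  return EXAMPLE_NO_RSEQ_SYSCALL;
#elif defined (RSEQ_SIG)
  return EXAMPLE_SIG_DEFINED;
#else
  return EXAMPLE_SIG_MISSING;
#endif
}

/* Human-readable description of a coverage value.  */
const char *
example_coverage_name (enum example_rseq_sig_coverage c)
{
  switch (c)
    {
    case EXAMPLE_NO_RSEQ_SYSCALL: return "no rseq syscall in kernel headers";
    case EXAMPLE_SIG_DEFINED:     return "rseq syscall and RSEQ_SIG both present";
    default:                      return "rseq syscall present, RSEQ_SIG missing";
    }
}
```

A test harness could fail only on `EXAMPLE_SIG_MISSING`. That keeps the check out of the installed headers.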
On Wed, Jan 30, 2019 at 1:03 PM Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote: > ----- On Jan 29, 2019, at 9:40 PM, Joseph Myers joseph@codesourcery.com wrote: > > On Tue, 29 Jan 2019, Mathieu Desnoyers wrote: > > > >> My thinking was to put the #error in the generic header, so architectures that > >> are not supported yet cannot build against rseq.h at all, so we don't end up > >> in a broken upgrade scenario. I'm open to alternative ways to do it though, as > >> long as we don't let not-yet-supported architectures build broken code. > > > > Any case with #error in installed glibc headers needs special-casing in > > check-installed-headers.sh (and, thus, such errors are to be discouraged). > > One alternative to #error would be to have an empty generic bits/rseq.h > that does _not_ define RSEQ_SIG. This way, it would be possible to > include sys/rseq.h from an architecture that does not define RSEQ_SIG > yet, but it would not cause any build failure. It's only if the code > try to use RSEQ_SIG that it would fail to compile because undefined. You seem to be clinging to an approach where every architecture (that does support rseq(2), which is hopefully going to be all of them in the near future) has to define its own bits/rseq.h. That's exactly the thing we don't want. Could you please explain why you believe it is more important to have build errors, in the short term, on architectures that don't support rseq(2) yet, than to improve the maintainability of the code in the long term? > > If this constant were in a uapi header, the glibc header could just > > include that - is the issue that it's not actually an interface between > > glibc and the kernel at all, but some kind of purely-userspace interface? > > The rseq uapi headers do not enforce the value of RSEQ_SIG. 
The role of the > kernel wrt signature is to receive it as sys_rseq argument, and then validate > that abort targets are prefixed with the signature before moving the > instruction pointer there. In that case, is there any reason not to use the same value on _all_ architectures? Or maybe the same value on all 32-bit architectures, and another one on all 64-bit architectures? (Based only on what you just said, I can imagine a reason: if this "signature" is baked into the code segments of programs, then it may need to be chosen for each architecture so it doesn't collide with the encoding of any valid machine instruction. But then it would be fixed as part of the ABI and the kernel should know what it is, rather than having to be told during process startup. It's quite possible I have misunderstood. I have not been following this discussion closely.) zw
* Zack Weinberg: > In that case, is there any reason not to use the same value on _all_ > architectures? Or maybe the same value on all 32-bit architectures, > and another one on all 64-bit architectures? I think the intent here is to use a value that would be extremely unlikely to appear in the instruction stream, so that jump target pointer in struct rseq_cs can be somewhat validated before the program counter is updated. This doesn't have to be a trapping instruction (in fact, the default trapping instruction would probably be a bad choice because the instruction does appear in the text segment quite frequently). But even if we choose an optimal value for all currently supported architectures, we might add an architecture in the future which prefers a different value. Thanks, Florian
----- On Jan 30, 2019, at 1:30 PM, Zack Weinberg zackw@panix.com wrote: > On Wed, Jan 30, 2019 at 1:03 PM Mathieu Desnoyers > <mathieu.desnoyers@efficios.com> wrote: >> ----- On Jan 29, 2019, at 9:40 PM, Joseph Myers joseph@codesourcery.com wrote: >> > On Tue, 29 Jan 2019, Mathieu Desnoyers wrote: >> > >> >> My thinking was to put the #error in the generic header, so architectures that >> >> are not supported yet cannot build against rseq.h at all, so we don't end up >> >> in a broken upgrade scenario. I'm open to alternative ways to do it though, as >> >> long as we don't let not-yet-supported architectures build broken code. >> > >> > Any case with #error in installed glibc headers needs special-casing in >> > check-installed-headers.sh (and, thus, such errors are to be discouraged). >> >> One alternative to #error would be to have an empty generic bits/rseq.h >> that does _not_ define RSEQ_SIG. This way, it would be possible to >> include sys/rseq.h from an architecture that does not define RSEQ_SIG >> yet, but it would not cause any build failure. It's only if the code >> try to use RSEQ_SIG that it would fail to compile because undefined. > > You seem to be clinging to an approach where every architecture (that > does support rseq(2), which is hopefully going to be all of them in > the near future) has to define its own bits/rseq.h. That's exactly > the thing we don't want. > > Could you please explain why you believe it is more important to have > build errors, in the short term, on architectures that don't support > rseq(2) yet, than to improve the maintainability of the code in the > long term? I'm open to alternative solutions, provided that they don't lead to glibc/kernel headers/kernel image upgrade scenarios that trigger segmentation faults. I just lack imagination in my search to find a solution that fulfills the non-segfault-on-upgrade requirement without the burden of a per-architecture bits/rseq.h to define RSEQ_SIG. 
If we find a solution that fulfills both goals, I'm all for it. Note that the alternative solution I hint at above does not use #error, but rather just leaves RSEQ_SIG undefined for architectures that don't support rseq yet. It still requires to have per-architecture bits/rseq.h, which appears to be a maintainability burden though. > >> > If this constant were in a uapi header, the glibc header could just >> > include that - is the issue that it's not actually an interface between >> > glibc and the kernel at all, but some kind of purely-userspace interface? >> >> The rseq uapi headers do not enforce the value of RSEQ_SIG. The role of the >> kernel wrt signature is to receive it as sys_rseq argument, and then validate >> that abort targets are prefixed with the signature before moving the >> instruction pointer there. > > In that case, is there any reason not to use the same value on _all_ > architectures? Or maybe the same value on all 32-bit architectures, > and another one on all 64-bit architectures? As I stated earlier in the thread, we've encountered a few reasons that justify overriding the RSEQ_SIG for specific architectures so far: - Playing nice with objdump quirks by ensuring the signature is a valid instruction, - Ensuring the architecture's speculative execution stops at the signature by making sure this is a specific trap instruction that is not found typically elsewhere in the code, - Ensuring the chosen RSEQ_SIG does not happen to be a common byte sequence frequently present in the code segments of each target architecture. > > (Based only on what you just said, I can imagine a reason: if this > "signature" is baked into the code segments of programs, then it may > need to be chosen for each architecture so it doesn't collide with the > encoding of any valid machine instruction. But then it would be fixed > as part of the ABI and the kernel should know what it is, rather than > having to be told during process startup. 
It's quite possible I have > misunderstood. I have not been following this discussion closely.) As you figured out, this signature needs to be baked into the application and library code. However, the way rseq uapi is defined, the choice of the signature is left to user-space. It is not defined by the kernel uapi. The kernel just validates that signature, which is received as input parameter to the rseq system call, and needs to be found before abort target labels. The main reason why we left the signature choice to user-space is that we may have to eventually turn the "fixed" signature into a randomly assigned one for hardened processes. We have to recall that the purpose of this signature is to protect against control flow redirection by an attacker through the rseq abort mechanism, so I prefer not to fix it to a static value within the kernel uapi, and leave that decision to user-space, which would allow randomizing it on process startup for hardened processes in the future. Thanks, Mathieu > > zw
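The validation described above can be modelled in user space. The sketch below assumes the signature is a 32-bit value stored in the four bytes immediately before the abort target. A plain byte buffer stands in for the code segment, and `example_abort_target_has_signature` is a hypothetical name. It is a model of the check, not the kernel's code.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Value proposed for RSEQ_SIG in the RFC patch's sys/rseq.h.  */
#define EXAMPLE_RSEQ_SIG 0x53053053u

/* Return 1 if the 32-bit word stored just before TEXT[ABORT_OFF] equals
   SIG, and 0 otherwise.  SIG plays the role of the value passed to the
   rseq system call at registration time.  */
int
example_abort_target_has_signature (const unsigned char *text,
                                    size_t text_len, size_t abort_off,
                                    uint32_t sig)
{
  uint32_t found;

  /* The target must lie inside the buffer, with room for the
     signature in front of it.  */
  if (abort_off < sizeof found || abort_off >= text_len)
    return 0;

  /* memcpy avoids unaligned access and aliasing problems.  */
  memcpy (&found, text + abort_off - sizeof found, sizeof found);
  return found == sig;
}
```

If the application's abort handlers were built with one signature and the thread was registered with another, the comparison fails. That is the mismatch the upgrade scenario in this thread is concerned with.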
On Wed, 30 Jan 2019, Mathieu Desnoyers wrote: > #if defined (__NR_rseq) && !defined (RSEQ_SIG) > # error "UAPI headers support rseq system call, but glibc does not define RSEQ_SIG." > #endif > > Would that take care of your concerns ? That would of course need appropriate conditionals based on the most recent kernel version for which a given glibc version has been updated, so that using new kernel headers with an existing glibc release does not make the build fail (cf. the test of syscall-names.list). And being able to write such a test only solves one half of the problem - it needs to be easy to determine what value to put in that header in glibc for an architecture that's newly gained support in the kernel, *without* needing any architecture expertise.
----- On Jan 30, 2019, at 4:10 PM, Joseph Myers joseph@codesourcery.com wrote: > On Wed, 30 Jan 2019, Mathieu Desnoyers wrote: > >> #if defined (__NR_rseq) && !defined (RSEQ_SIG) >> # error "UAPI headers support rseq system call, but glibc does not define >> RSEQ_SIG." >> #endif >> >> Would that take care of your concerns ? > > That would of course need appropriate conditionals based on the most > recent kernel version for which a given glibc version has been updated, so > that using new kernel headers with an existing glibc release does not make > the build fail (cf. the test of syscall-names.list). The test I hint at above would not be for the glibc build per se. It would be for a check that glibc implements support for all the system calls available in the kernel headers (if such a test target currently exists). > And being able to > write such a test only solves one half of the problem - it needs to be > easy to determine what value to put in that header in glibc for an > architecture that's newly gained support in the kernel, *without* needing > any architecture expertise. I'm afraid this requirement is incompatible with the nature of the RSEQ signature. This signature may be required to be a specific trap instruction by the architecture, so deciding on its value without architecture expertise is not possible. Thanks, Mathieu
----- On Jan 31, 2019, at 11:37 AM, Mathieu Desnoyers mathieu.desnoyers@efficios.com wrote: > ----- On Jan 30, 2019, at 4:10 PM, Joseph Myers joseph@codesourcery.com wrote: > >> On Wed, 30 Jan 2019, Mathieu Desnoyers wrote: >> >>> #if defined (__NR_rseq) && !defined (RSEQ_SIG) >>> # error "UAPI headers support rseq system call, but glibc does not define >>> RSEQ_SIG." >>> #endif >>> >>> Would that take care of your concerns ? >> >> That would of course need appropriate conditionals based on the most >> recent kernel version for which a given glibc version has been updated, so >> that using new kernel headers with an existing glibc release does not make >> the build fail (cf. the test of syscall-names.list). > > The test I hint at above would not be for the glibc build per se. It would > be for a check that glibc implements support for all the system calls > available in the kernel headers (if such a test target currently exists). > >> And being able to >> write such a test only solves one half of the problem - it needs to be >> easy to determine what value to put in that header in glibc for an >> architecture that's newly gained support in the kernel, *without* needing >> any architecture expertise. > > I'm afraid this requirement is incompatible with the nature of the RSEQ > signature. This signature may be required to be a specific trap instruction > by the architecture, so deciding on its value without architecture expertise > is not possible. Just to clarify a point: the "success criterion" I'm aiming for here is to provide a rseq integration that does not cause foreseeable user crashes on upgrade. I'm all for taking into account the maintenance burden on glibc maintainers as a metric in the implementation choices made, but at this point, I don't see how we can achieve success without introducing architecture headers for the RSEQ_SIG signature. 
If you have ideas on how to further minimize the maintenance burden for glibc maintainers while still meeting the success criterion, I'm all ears. Thanks, Mathieu
diff --git a/NEWS b/NEWS index f488821af1..b238eaa391 100644 --- a/NEWS +++ b/NEWS @@ -35,6 +35,12 @@ Major new features: different directory. This is a GNU extension and similar to the Solaris function of the same name. +* Support for automatically registering threads with the Linux rseq(2) + system call has been added. This system call is implemented starting + from Linux 4.18. In order to be activated, it requires that glibc is built + against kernel headers that include this system call, and that glibc + detects availability of that system call at runtime. + Deprecated and removed features, and other changes affecting compatibility: * The glibc.tune tunable namespace has been renamed to glibc.cpu and the diff --git a/csu/libc-start.c b/csu/libc-start.c index 494132368f..dc39e09685 100644 --- a/csu/libc-start.c +++ b/csu/libc-start.c @@ -22,6 +22,7 @@ #include <ldsodefs.h> #include <exit-thread.h> #include <libc-internal.h> +#include <rseq-internal.h> #include <elf/dl-tunables.h> @@ -140,7 +141,10 @@ LIBC_START_MAIN (int (*main) (int, char **, char ** MAIN_AUXVEC_DECL), __libc_multiple_libcs = &_dl_starting_up && !_dl_starting_up; -#ifndef SHARED +#ifdef SHARED + /* Register rseq ABI to the kernel. */ + (void) rseq_register_current_thread (); +#else _dl_relocate_static_pie (); char **ev = &argv[argc + 1]; @@ -218,6 +222,9 @@ LIBC_START_MAIN (int (*main) (int, char **, char ** MAIN_AUXVEC_DECL), } # endif + /* Register rseq ABI to the kernel. */ + (void) rseq_register_current_thread (); + /* Initialize libpthread if linked in. */ if (__pthread_initialize_minimal != NULL) __pthread_initialize_minimal (); @@ -230,8 +237,7 @@ LIBC_START_MAIN (int (*main) (int, char **, char ** MAIN_AUXVEC_DECL), # else __pointer_chk_guard_local = pointer_chk_guard; # endif - -#endif /* !SHARED */ +#endif /* Register the destructor of the dynamic linker if there is any. 
*/ if (__glibc_likely (rtld_fini != NULL)) diff --git a/misc/Makefile b/misc/Makefile index c2c9994d17..4175cbb3d3 100644 --- a/misc/Makefile +++ b/misc/Makefile @@ -36,7 +36,8 @@ headers := sys/uio.h bits/uio-ext.h bits/uio_lim.h \ syslog.h sys/syslog.h \ bits/syslog.h bits/syslog-ldbl.h bits/syslog-path.h bits/error.h \ bits/select2.h bits/hwcap.h sys/auxv.h \ - sys/sysmacros.h bits/sysmacros.h bits/types/struct_iovec.h + sys/sysmacros.h bits/sysmacros.h bits/types/struct_iovec.h \ + rseq-internal.h routines := brk sbrk sstk ioctl \ readv writev preadv preadv64 pwritev pwritev64 \ diff --git a/misc/rseq-internal.h b/misc/rseq-internal.h new file mode 100644 index 0000000000..915122e4bf --- /dev/null +++ b/misc/rseq-internal.h @@ -0,0 +1,34 @@ +/* Copyright (C) 2018 Free Software Foundation, Inc. + This file is part of the GNU C Library. + Contributed by Mathieu Desnoyers <mathieu.desnoyers@efficios.com>, 2018. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <http://www.gnu.org/licenses/>. 
*/ + +#ifndef RSEQ_INTERNAL_H +#define RSEQ_INTERNAL_H + +static inline int +rseq_register_current_thread (void) +{ + return -1; +} + +static inline int +rseq_unregister_current_thread (void) +{ + return -1; +} + +#endif /* rseq-internal.h */ diff --git a/nptl/pthread_create.c b/nptl/pthread_create.c index fe75d04113..bb80f97d2d 100644 --- a/nptl/pthread_create.c +++ b/nptl/pthread_create.c @@ -33,6 +33,7 @@ #include <default-sched.h> #include <futex-internal.h> #include <tls-setup.h> +#include <rseq-internal.h> #include "libioP.h" #include <shlib-compat.h> @@ -378,6 +379,7 @@ __free_tcb (struct pthread *pd) START_THREAD_DEFN { struct pthread *pd = START_THREAD_SELF; + bool has_rseq = false; #if HP_TIMING_AVAIL /* Remember the time when the thread was started. */ @@ -396,6 +398,9 @@ START_THREAD_DEFN if (__glibc_unlikely (atomic_exchange_acq (&pd->setxid_futex, 0) == -2)) futex_wake (&pd->setxid_futex, 1, FUTEX_PRIVATE); + /* Register rseq TLS to the kernel. */ + has_rseq = !rseq_register_current_thread (); + #ifdef __NR_set_robust_list # ifndef __ASSUME_SET_ROBUST_LIST if (__set_robust_list_avail >= 0) @@ -573,6 +578,10 @@ START_THREAD_DEFN } #endif + /* Unregister rseq TLS from kernel. 
*/
+  if (has_rseq && rseq_unregister_current_thread ())
+    abort();
+
   advise_stack_range (pd->stackblock, pd->stackblock_size, (uintptr_t) pd,
                       pd->guardsize);
 
diff --git a/sysdeps/unix/sysv/linux/Makefile b/sysdeps/unix/sysv/linux/Makefile
index 72b6b641d5..b18e1cd450 100644
--- a/sysdeps/unix/sysv/linux/Makefile
+++ b/sysdeps/unix/sysv/linux/Makefile
@@ -1,5 +1,5 @@
 ifeq ($(subdir),csu)
-sysdep_routines += errno-loc
+sysdep_routines += errno-loc rseq-sym
 endif
 
 ifeq ($(subdir),assert)
@@ -43,7 +43,8 @@ sysdep_headers += sys/mount.h sys/acct.h sys/sysctl.h \
                   bits/siginfo-arch.h bits/siginfo-consts-arch.h \
                   bits/procfs.h bits/procfs-id.h bits/procfs-extra.h \
                   bits/procfs-prregset.h bits/mman-map-flags-generic.h \
-                  bits/msq-pad.h bits/sem-pad.h bits/shmlba.h bits/shm-pad.h
+                  bits/msq-pad.h bits/sem-pad.h bits/shmlba.h bits/shm-pad.h \
+                  sys/rseq.h
 
 tests += tst-clone tst-clone2 tst-clone3 tst-fanotify tst-personality \
          tst-quota tst-sync_file_range tst-sysconf-iov_max tst-ttyname \
diff --git a/sysdeps/unix/sysv/linux/Versions b/sysdeps/unix/sysv/linux/Versions
index 336c13b57d..777ea723f8 100644
--- a/sysdeps/unix/sysv/linux/Versions
+++ b/sysdeps/unix/sysv/linux/Versions
@@ -171,6 +171,10 @@ libc {
     mlock2;
     pkey_alloc; pkey_free; pkey_set; pkey_get; pkey_mprotect;
   }
+  GLIBC_2.29 {
+    __rseq_abi;
+    __rseq_lib_abi;
+  }
   GLIBC_PRIVATE {
     # functions used in other libraries
     __syscall_rt_sigqueueinfo;
diff --git a/sysdeps/unix/sysv/linux/aarch64/libc.abilist b/sysdeps/unix/sysv/linux/aarch64/libc.abilist
index e66c741d04..8877199b6f 100644
--- a/sysdeps/unix/sysv/linux/aarch64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/aarch64/libc.abilist
@@ -2138,4 +2138,6 @@ GLIBC_2.28 thrd_current F
 GLIBC_2.28 thrd_equal F
 GLIBC_2.28 thrd_sleep F
 GLIBC_2.28 thrd_yield F
+GLIBC_2.29 __rseq_abi T 0x20
+GLIBC_2.29 __rseq_lib_abi T 0x8
 GLIBC_2.29 posix_spawn_file_actions_addchdir_np F
diff --git a/sysdeps/unix/sysv/linux/alpha/libc.abilist b/sysdeps/unix/sysv/linux/alpha/libc.abilist
index 8df162fe99..0a0db6c29f 100644
--- a/sysdeps/unix/sysv/linux/alpha/libc.abilist
+++ b/sysdeps/unix/sysv/linux/alpha/libc.abilist
@@ -2033,6 +2033,8 @@ GLIBC_2.28 thrd_current F
 GLIBC_2.28 thrd_equal F
 GLIBC_2.28 thrd_sleep F
 GLIBC_2.28 thrd_yield F
+GLIBC_2.29 __rseq_abi T 0x20
+GLIBC_2.29 __rseq_lib_abi T 0x8
 GLIBC_2.29 posix_spawn_file_actions_addchdir_np F
 GLIBC_2.3 __ctype_b_loc F
 GLIBC_2.3 __ctype_tolower_loc F
diff --git a/sysdeps/unix/sysv/linux/arm/libc.abilist b/sysdeps/unix/sysv/linux/arm/libc.abilist
index 43c804f9dc..65746b1050 100644
--- a/sysdeps/unix/sysv/linux/arm/libc.abilist
+++ b/sysdeps/unix/sysv/linux/arm/libc.abilist
@@ -123,6 +123,8 @@ GLIBC_2.28 thrd_current F
 GLIBC_2.28 thrd_equal F
 GLIBC_2.28 thrd_sleep F
 GLIBC_2.28 thrd_yield F
+GLIBC_2.29 __rseq_abi T 0x20
+GLIBC_2.29 __rseq_lib_abi T 0x8
 GLIBC_2.29 posix_spawn_file_actions_addchdir_np F
 GLIBC_2.4 _Exit F
 GLIBC_2.4 _IO_2_1_stderr_ D 0xa0
diff --git a/sysdeps/unix/sysv/linux/hppa/libc.abilist b/sysdeps/unix/sysv/linux/hppa/libc.abilist
index 88b01c2e75..52c33c08a5 100644
--- a/sysdeps/unix/sysv/linux/hppa/libc.abilist
+++ b/sysdeps/unix/sysv/linux/hppa/libc.abilist
@@ -1880,6 +1880,8 @@ GLIBC_2.28 thrd_current F
 GLIBC_2.28 thrd_equal F
 GLIBC_2.28 thrd_sleep F
 GLIBC_2.28 thrd_yield F
+GLIBC_2.29 __rseq_abi T 0x20
+GLIBC_2.29 __rseq_lib_abi T 0x8
 GLIBC_2.29 posix_spawn_file_actions_addchdir_np F
 GLIBC_2.3 __ctype_b_loc F
 GLIBC_2.3 __ctype_tolower_loc F
diff --git a/sysdeps/unix/sysv/linux/i386/libc.abilist b/sysdeps/unix/sysv/linux/i386/libc.abilist
index 6d02f31612..fe4a3e4f83 100644
--- a/sysdeps/unix/sysv/linux/i386/libc.abilist
+++ b/sysdeps/unix/sysv/linux/i386/libc.abilist
@@ -2045,6 +2045,8 @@ GLIBC_2.28 thrd_current F
 GLIBC_2.28 thrd_equal F
 GLIBC_2.28 thrd_sleep F
 GLIBC_2.28 thrd_yield F
+GLIBC_2.29 __rseq_abi T 0x20
+GLIBC_2.29 __rseq_lib_abi T 0x8
 GLIBC_2.29 posix_spawn_file_actions_addchdir_np F
 GLIBC_2.3 __ctype_b_loc F
 GLIBC_2.3 __ctype_tolower_loc F
diff --git a/sysdeps/unix/sysv/linux/ia64/libc.abilist b/sysdeps/unix/sysv/linux/ia64/libc.abilist
index 4249712611..8db1b3a508 100644
--- a/sysdeps/unix/sysv/linux/ia64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/ia64/libc.abilist
@@ -1914,6 +1914,8 @@ GLIBC_2.28 thrd_current F
 GLIBC_2.28 thrd_equal F
 GLIBC_2.28 thrd_sleep F
 GLIBC_2.28 thrd_yield F
+GLIBC_2.29 __rseq_abi T 0x20
+GLIBC_2.29 __rseq_lib_abi T 0x8
 GLIBC_2.29 posix_spawn_file_actions_addchdir_np F
 GLIBC_2.3 __ctype_b_loc F
 GLIBC_2.3 __ctype_tolower_loc F
diff --git a/sysdeps/unix/sysv/linux/m68k/coldfire/libc.abilist b/sysdeps/unix/sysv/linux/m68k/coldfire/libc.abilist
index d47b808862..d2ee6f238f 100644
--- a/sysdeps/unix/sysv/linux/m68k/coldfire/libc.abilist
+++ b/sysdeps/unix/sysv/linux/m68k/coldfire/libc.abilist
@@ -124,6 +124,8 @@ GLIBC_2.28 thrd_current F
 GLIBC_2.28 thrd_equal F
 GLIBC_2.28 thrd_sleep F
 GLIBC_2.28 thrd_yield F
+GLIBC_2.29 __rseq_abi T 0x20
+GLIBC_2.29 __rseq_lib_abi T 0x8
 GLIBC_2.29 posix_spawn_file_actions_addchdir_np F
 GLIBC_2.4 _Exit F
 GLIBC_2.4 _IO_2_1_stderr_ D 0x98
diff --git a/sysdeps/unix/sysv/linux/m68k/m680x0/libc.abilist b/sysdeps/unix/sysv/linux/m68k/m680x0/libc.abilist
index d5e38308be..3250be752f 100644
--- a/sysdeps/unix/sysv/linux/m68k/m680x0/libc.abilist
+++ b/sysdeps/unix/sysv/linux/m68k/m680x0/libc.abilist
@@ -1989,6 +1989,8 @@ GLIBC_2.28 thrd_current F
 GLIBC_2.28 thrd_equal F
 GLIBC_2.28 thrd_sleep F
 GLIBC_2.28 thrd_yield F
+GLIBC_2.29 __rseq_abi T 0x20
+GLIBC_2.29 __rseq_lib_abi T 0x8
 GLIBC_2.29 posix_spawn_file_actions_addchdir_np F
 GLIBC_2.3 __ctype_b_loc F
 GLIBC_2.3 __ctype_tolower_loc F
diff --git a/sysdeps/unix/sysv/linux/microblaze/libc.abilist b/sysdeps/unix/sysv/linux/microblaze/libc.abilist
index 8596b84399..90507bd369 100644
--- a/sysdeps/unix/sysv/linux/microblaze/libc.abilist
+++ b/sysdeps/unix/sysv/linux/microblaze/libc.abilist
@@ -2130,4 +2130,6 @@ GLIBC_2.28 thrd_current F
 GLIBC_2.28 thrd_equal F
 GLIBC_2.28 thrd_sleep F
 GLIBC_2.28 thrd_yield F
+GLIBC_2.29 __rseq_abi T 0x20
+GLIBC_2.29 __rseq_lib_abi T 0x8
 GLIBC_2.29 posix_spawn_file_actions_addchdir_np F
diff --git a/sysdeps/unix/sysv/linux/mips/mips32/fpu/libc.abilist b/sysdeps/unix/sysv/linux/mips/mips32/fpu/libc.abilist
index 88e0f896d5..f5ad6e893f 100644
--- a/sysdeps/unix/sysv/linux/mips/mips32/fpu/libc.abilist
+++ b/sysdeps/unix/sysv/linux/mips/mips32/fpu/libc.abilist
@@ -1967,6 +1967,8 @@ GLIBC_2.28 thrd_current F
 GLIBC_2.28 thrd_equal F
 GLIBC_2.28 thrd_sleep F
 GLIBC_2.28 thrd_yield F
+GLIBC_2.29 __rseq_abi T 0x20
+GLIBC_2.29 __rseq_lib_abi T 0x8
 GLIBC_2.29 posix_spawn_file_actions_addchdir_np F
 GLIBC_2.3 __ctype_b_loc F
 GLIBC_2.3 __ctype_tolower_loc F
diff --git a/sysdeps/unix/sysv/linux/mips/mips32/nofpu/libc.abilist b/sysdeps/unix/sysv/linux/mips/mips32/nofpu/libc.abilist
index aff7462c34..1549be6a0e 100644
--- a/sysdeps/unix/sysv/linux/mips/mips32/nofpu/libc.abilist
+++ b/sysdeps/unix/sysv/linux/mips/mips32/nofpu/libc.abilist
@@ -1965,6 +1965,8 @@ GLIBC_2.28 thrd_current F
 GLIBC_2.28 thrd_equal F
 GLIBC_2.28 thrd_sleep F
 GLIBC_2.28 thrd_yield F
+GLIBC_2.29 __rseq_abi T 0x20
+GLIBC_2.29 __rseq_lib_abi T 0x8
 GLIBC_2.29 posix_spawn_file_actions_addchdir_np F
 GLIBC_2.3 __ctype_b_loc F
 GLIBC_2.3 __ctype_tolower_loc F
diff --git a/sysdeps/unix/sysv/linux/mips/mips64/n32/libc.abilist b/sysdeps/unix/sysv/linux/mips/mips64/n32/libc.abilist
index 71d82444aa..33aea43f05 100644
--- a/sysdeps/unix/sysv/linux/mips/mips64/n32/libc.abilist
+++ b/sysdeps/unix/sysv/linux/mips/mips64/n32/libc.abilist
@@ -1973,6 +1973,8 @@ GLIBC_2.28 thrd_current F
 GLIBC_2.28 thrd_equal F
 GLIBC_2.28 thrd_sleep F
 GLIBC_2.28 thrd_yield F
+GLIBC_2.29 __rseq_abi T 0x20
+GLIBC_2.29 __rseq_lib_abi T 0x8
 GLIBC_2.29 posix_spawn_file_actions_addchdir_np F
 GLIBC_2.3 __ctype_b_loc F
 GLIBC_2.3 __ctype_tolower_loc F
diff --git a/sysdeps/unix/sysv/linux/mips/mips64/n64/libc.abilist b/sysdeps/unix/sysv/linux/mips/mips64/n64/libc.abilist
index de6c53d293..8a43cbde55 100644
--- a/sysdeps/unix/sysv/linux/mips/mips64/n64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/mips/mips64/n64/libc.abilist
@@ -1968,6 +1968,8 @@ GLIBC_2.28 thrd_current F
 GLIBC_2.28 thrd_equal F
 GLIBC_2.28 thrd_sleep F
 GLIBC_2.28 thrd_yield F
+GLIBC_2.29 __rseq_abi T 0x20
+GLIBC_2.29 __rseq_lib_abi T 0x8
 GLIBC_2.29 posix_spawn_file_actions_addchdir_np F
 GLIBC_2.3 __ctype_b_loc F
 GLIBC_2.3 __ctype_tolower_loc F
diff --git a/sysdeps/unix/sysv/linux/nios2/libc.abilist b/sysdeps/unix/sysv/linux/nios2/libc.abilist
index e724bab9fb..7cc6936aa3 100644
--- a/sysdeps/unix/sysv/linux/nios2/libc.abilist
+++ b/sysdeps/unix/sysv/linux/nios2/libc.abilist
@@ -2171,4 +2171,6 @@ GLIBC_2.28 thrd_current F
 GLIBC_2.28 thrd_equal F
 GLIBC_2.28 thrd_sleep F
 GLIBC_2.28 thrd_yield F
+GLIBC_2.29 __rseq_abi T 0x20
+GLIBC_2.29 __rseq_lib_abi T 0x8
 GLIBC_2.29 posix_spawn_file_actions_addchdir_np F
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libc.abilist b/sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libc.abilist
index e9ecbccb71..aed1d82ebe 100644
--- a/sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libc.abilist
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc32/fpu/libc.abilist
@@ -1993,6 +1993,8 @@ GLIBC_2.28 thrd_current F
 GLIBC_2.28 thrd_equal F
 GLIBC_2.28 thrd_sleep F
 GLIBC_2.28 thrd_yield F
+GLIBC_2.29 __rseq_abi T 0x20
+GLIBC_2.29 __rseq_lib_abi T 0x8
 GLIBC_2.29 posix_spawn_file_actions_addchdir_np F
 GLIBC_2.3 __ctype_b_loc F
 GLIBC_2.3 __ctype_tolower_loc F
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libc.abilist b/sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libc.abilist
index da83ea6028..bbd918191c 100644
--- a/sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libc.abilist
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc32/nofpu/libc.abilist
@@ -1997,6 +1997,8 @@ GLIBC_2.28 thrd_current F
 GLIBC_2.28 thrd_equal F
 GLIBC_2.28 thrd_sleep F
 GLIBC_2.28 thrd_yield F
+GLIBC_2.29 __rseq_abi T 0x20
+GLIBC_2.29 __rseq_lib_abi T 0x8
 GLIBC_2.29 posix_spawn_file_actions_addchdir_np F
 GLIBC_2.3 __ctype_b_loc F
 GLIBC_2.3 __ctype_tolower_loc F
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc64/libc-le.abilist b/sysdeps/unix/sysv/linux/powerpc/powerpc64/libc-le.abilist
index 4535b40d15..9be7eccefc 100644
--- a/sysdeps/unix/sysv/linux/powerpc/powerpc64/libc-le.abilist
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc64/libc-le.abilist
@@ -2228,4 +2228,6 @@ GLIBC_2.28 thrd_current F
 GLIBC_2.28 thrd_equal F
 GLIBC_2.28 thrd_sleep F
 GLIBC_2.28 thrd_yield F
+GLIBC_2.29 __rseq_abi T 0x20
+GLIBC_2.29 __rseq_lib_abi T 0x8
 GLIBC_2.29 posix_spawn_file_actions_addchdir_np F
diff --git a/sysdeps/unix/sysv/linux/powerpc/powerpc64/libc.abilist b/sysdeps/unix/sysv/linux/powerpc/powerpc64/libc.abilist
index 65725de4f0..7c2820c28f 100644
--- a/sysdeps/unix/sysv/linux/powerpc/powerpc64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/powerpc/powerpc64/libc.abilist
@@ -123,6 +123,8 @@ GLIBC_2.28 thrd_current F
 GLIBC_2.28 thrd_equal F
 GLIBC_2.28 thrd_sleep F
 GLIBC_2.28 thrd_yield F
+GLIBC_2.29 __rseq_abi T 0x20
+GLIBC_2.29 __rseq_lib_abi T 0x8
 GLIBC_2.29 posix_spawn_file_actions_addchdir_np F
 GLIBC_2.3 _Exit F
 GLIBC_2.3 _IO_2_1_stderr_ D 0xe0
diff --git a/sysdeps/unix/sysv/linux/riscv/rv64/libc.abilist b/sysdeps/unix/sysv/linux/riscv/rv64/libc.abilist
index bbb3c4a8e7..0dbcc7d565 100644
--- a/sysdeps/unix/sysv/linux/riscv/rv64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/riscv/rv64/libc.abilist
@@ -2100,4 +2100,6 @@ GLIBC_2.28 thrd_current F
 GLIBC_2.28 thrd_equal F
 GLIBC_2.28 thrd_sleep F
 GLIBC_2.28 thrd_yield F
+GLIBC_2.29 __rseq_abi T 0x20
+GLIBC_2.29 __rseq_lib_abi T 0x8
 GLIBC_2.29 posix_spawn_file_actions_addchdir_np F
diff --git a/sysdeps/unix/sysv/linux/rseq-internal.h b/sysdeps/unix/sysv/linux/rseq-internal.h
new file mode 100644
index 0000000000..cb7b2c6f8c
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/rseq-internal.h
@@ -0,0 +1,91 @@
+/* Copyright (C) 2018 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+   Contributed by Mathieu Desnoyers <mathieu.desnoyers@efficios.com>, 2018.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>. */
+
+#ifndef RSEQ_INTERNAL_H
+#define RSEQ_INTERNAL_H
+
+#include <sysdep.h>
+
+#ifdef __NR_rseq
+
+#include <errno.h>
+#include <sys/rseq.h>
+
+static inline int
+rseq_register_current_thread (void)
+{
+  int rc, ret = 0;
+  INTERNAL_SYSCALL_DECL (err);
+
+  if (__rseq_abi.cpu_id == RSEQ_CPU_ID_REGISTRATION_FAILED)
+    return -1;
+  /* Temporarily prevent nested signal handlers from registering rseq. */
+  __rseq_lib_abi.register_state = RSEQ_REGISTER_NESTED;
+  if (__rseq_lib_abi.refcount == UINT_MAX)
+    {
+      ret = -1;
+      goto end;
+    }
+  if (__rseq_lib_abi.refcount++)
+    goto end;
+  rc = INTERNAL_SYSCALL_CALL (rseq, err, &__rseq_abi, sizeof (struct rseq),
+                              0, RSEQ_SIG);
+  if (!rc)
+    goto end;
+  if (INTERNAL_SYSCALL_ERRNO (rc, err) != EBUSY)
+    __rseq_abi.cpu_id = RSEQ_CPU_ID_REGISTRATION_FAILED;
+  ret = -1;
+end:
+  __rseq_lib_abi.register_state = RSEQ_REGISTER_ALLOWED;
+  return ret;
+}
+
+static inline int
+rseq_unregister_current_thread (void)
+{
+  int rc, ret = 0;
+  INTERNAL_SYSCALL_DECL (err);
+
+  /* Setting __rseq_register_state = RSEQ_REGISTER_EXITING for the rest of the
+     thread lifetime. Ensures signal handlers nesting just before thread exit
+     don't try to register rseq. */
+  __rseq_lib_abi.register_state = RSEQ_REGISTER_EXITING;
+  __rseq_lib_abi.refcount = 0;
+  rc = INTERNAL_SYSCALL_CALL (rseq, err, &__rseq_abi, sizeof (struct rseq),
+                              RSEQ_FLAG_UNREGISTER, RSEQ_SIG);
+  if (!rc)
+    goto end;
+  ret = -1;
+end:
+  return ret;
+}
+#else
+static inline int
+rseq_register_current_thread (void)
+{
+  return -1;
+}
+
+static inline int
+rseq_unregister_current_thread (void)
+{
+  return -1;
+}
+#endif
+
+#endif /* rseq-internal.h */
diff --git a/sysdeps/unix/sysv/linux/rseq-sym.c b/sysdeps/unix/sysv/linux/rseq-sym.c
new file mode 100644
index 0000000000..99b277e9d6
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/rseq-sym.c
@@ -0,0 +1,54 @@
+/* Copyright (C) 2018 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+   Contributed by Mathieu Desnoyers <mathieu.desnoyers@efficios.com>, 2018.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>. */
+
+#include <sys/syscall.h>
+#include <stdint.h>
+
+#ifdef __NR_rseq
+#include <sys/rseq.h>
+#else
+
+enum rseq_cpu_id_state {
+  RSEQ_CPU_ID_UNINITIALIZED = -1,
+  RSEQ_CPU_ID_REGISTRATION_FAILED = -2,
+};
+
+/* linux/rseq.h defines struct rseq as aligned on 32 bytes. The kernel ABI
+   size is 20 bytes. */
+struct rseq {
+  uint32_t cpu_id_start;
+  uint32_t cpu_id;
+  uint64_t rseq_cs;
+  uint32_t flags;
+} __attribute__ ((aligned(4 * sizeof(uint64_t))));
+
+struct rseq_lib_abi
+{
+  uint32_t register_state;
+  uint32_t refcount;
+};
+
+#endif
+
+/* volatile because fields can be read/updated by the kernel. */
+__thread volatile struct rseq __rseq_abi = {
+  .cpu_id = RSEQ_CPU_ID_UNINITIALIZED,
+};
+
+/* volatile because fields can be read/updated by signal handlers. */
+__thread volatile struct rseq_lib_abi __rseq_lib_abi;
diff --git a/sysdeps/unix/sysv/linux/s390/s390-32/libc.abilist b/sysdeps/unix/sysv/linux/s390/s390-32/libc.abilist
index e85ac2a178..eff957032a 100644
--- a/sysdeps/unix/sysv/linux/s390/s390-32/libc.abilist
+++ b/sysdeps/unix/sysv/linux/s390/s390-32/libc.abilist
@@ -2002,6 +2002,8 @@ GLIBC_2.28 thrd_current F
 GLIBC_2.28 thrd_equal F
 GLIBC_2.28 thrd_sleep F
 GLIBC_2.28 thrd_yield F
+GLIBC_2.29 __rseq_abi T 0x20
+GLIBC_2.29 __rseq_lib_abi T 0x8
 GLIBC_2.29 posix_spawn_file_actions_addchdir_np F
 GLIBC_2.3 __ctype_b_loc F
 GLIBC_2.3 __ctype_tolower_loc F
diff --git a/sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist b/sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist
index d56931022c..32bfb25c1f 100644
--- a/sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/s390/s390-64/libc.abilist
@@ -1908,6 +1908,8 @@ GLIBC_2.28 thrd_equal F
 GLIBC_2.28 thrd_sleep F
 GLIBC_2.28 thrd_yield F
 GLIBC_2.29 __fentry__ F
+GLIBC_2.29 __rseq_abi T 0x20
+GLIBC_2.29 __rseq_lib_abi T 0x8
 GLIBC_2.29 posix_spawn_file_actions_addchdir_np F
 GLIBC_2.3 __ctype_b_loc F
 GLIBC_2.3 __ctype_tolower_loc F
diff --git a/sysdeps/unix/sysv/linux/sh/libc.abilist b/sysdeps/unix/sysv/linux/sh/libc.abilist
index ff939a15c4..a12d738d86 100644
--- a/sysdeps/unix/sysv/linux/sh/libc.abilist
+++ b/sysdeps/unix/sysv/linux/sh/libc.abilist
@@ -1884,6 +1884,8 @@ GLIBC_2.28 thrd_current F
 GLIBC_2.28 thrd_equal F
 GLIBC_2.28 thrd_sleep F
 GLIBC_2.28 thrd_yield F
+GLIBC_2.29 __rseq_abi T 0x20
+GLIBC_2.29 __rseq_lib_abi T 0x8
 GLIBC_2.29 posix_spawn_file_actions_addchdir_np F
 GLIBC_2.3 __ctype_b_loc F
 GLIBC_2.3 __ctype_tolower_loc F
diff --git a/sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist b/sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist
index 64fa9e10a5..3333634dfa 100644
--- a/sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist
+++ b/sysdeps/unix/sysv/linux/sparc/sparc32/libc.abilist
@@ -1996,6 +1996,8 @@ GLIBC_2.28 thrd_current F
 GLIBC_2.28 thrd_equal F
 GLIBC_2.28 thrd_sleep F
 GLIBC_2.28 thrd_yield F
+GLIBC_2.29 __rseq_abi T 0x20
+GLIBC_2.29 __rseq_lib_abi T 0x8
 GLIBC_2.29 posix_spawn_file_actions_addchdir_np F
 GLIBC_2.3 __ctype_b_loc F
 GLIBC_2.3 __ctype_tolower_loc F
diff --git a/sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist b/sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist
index db909d1506..02db9f6669 100644
--- a/sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/sparc/sparc64/libc.abilist
@@ -1937,6 +1937,8 @@ GLIBC_2.28 thrd_current F
 GLIBC_2.28 thrd_equal F
 GLIBC_2.28 thrd_sleep F
 GLIBC_2.28 thrd_yield F
+GLIBC_2.29 __rseq_abi T 0x20
+GLIBC_2.29 __rseq_lib_abi T 0x8
 GLIBC_2.29 posix_spawn_file_actions_addchdir_np F
 GLIBC_2.3 __ctype_b_loc F
 GLIBC_2.3 __ctype_tolower_loc F
diff --git a/sysdeps/unix/sysv/linux/sys/rseq.h b/sysdeps/unix/sysv/linux/sys/rseq.h
new file mode 100644
index 0000000000..61937fb193
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/sys/rseq.h
@@ -0,0 +1,64 @@
+/* Copyright (C) 2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+   Contributed by Mathieu Desnoyers <mathieu.desnoyers@efficios.com>, 2019.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>. */
+
+#ifndef _SYS_RSEQ_H
+#define _SYS_RSEQ_H 1
+
+/* We use the structures declarations from the kernel headers. */
+#include <linux/rseq.h>
+#include <stdint.h>
+
+/* Signature required before each abort handler code. */
+#define RSEQ_SIG 0x53053053
+
+enum rseq_register_state
+{
+  /* Value RSEQ_REGISTER_ALLOWED means it is allowed to update
+     the refcount field and to register/unregister rseq. */
+  RSEQ_REGISTER_ALLOWED = 0,
+  /* Value RSEQ_REGISTER_NESTED means it is temporarily forbidden
+     to update the refcount field or to register/unregister rseq. */
+  RSEQ_REGISTER_NESTED = 1,
+  /* Value RSEQ_REGISTER_EXITING means it is forbidden to update the
+     refcount field or to register/unregister rseq for the rest of the
+     thread's lifetime. */
+  RSEQ_REGISTER_EXITING = 2,
+};
+
+struct rseq_lib_abi
+{
+  uint32_t register_state; /* enum rseq_register_state. */
+  /* The refcount field keeps track of rseq users, so early adopters
+     of rseq can cooperate amongst each other and with glibc to
+     share rseq thread registration. The refcount field can only be
+     updated when allowed by the value of field register_state.
+     Registering rseq should be performed when incrementing refcount
+     from 0 to 1, and unregistering rseq should be performed when
+     decrementing refcount from 1 to 0. */
+  uint32_t refcount;
+};
+
+/* volatile because fields can be read/updated by the kernel. */
+extern __thread volatile struct rseq __rseq_abi
+__attribute__ ((tls_model ("initial-exec")));
+
+/* volatile because fields can be read/updated by signal handlers. */
+extern __thread volatile struct rseq_lib_abi __rseq_lib_abi
+__attribute__ ((tls_model ("initial-exec")));
+
+#endif /* sys/rseq.h */
diff --git a/sysdeps/unix/sysv/linux/x86_64/64/libc.abilist b/sysdeps/unix/sysv/linux/x86_64/64/libc.abilist
index 3b175f104b..417d8ab9a6 100644
--- a/sysdeps/unix/sysv/linux/x86_64/64/libc.abilist
+++ b/sysdeps/unix/sysv/linux/x86_64/64/libc.abilist
@@ -1895,6 +1895,8 @@ GLIBC_2.28 thrd_current F
 GLIBC_2.28 thrd_equal F
 GLIBC_2.28 thrd_sleep F
 GLIBC_2.28 thrd_yield F
+GLIBC_2.29 __rseq_abi T 0x20
+GLIBC_2.29 __rseq_lib_abi T 0x8
 GLIBC_2.29 posix_spawn_file_actions_addchdir_np F
 GLIBC_2.3 __ctype_b_loc F
 GLIBC_2.3 __ctype_tolower_loc F
diff --git a/sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist b/sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist
index 1b57710477..ef5ad4160d 100644
--- a/sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist
+++ b/sysdeps/unix/sysv/linux/x86_64/x32/libc.abilist
@@ -2146,4 +2146,6 @@ GLIBC_2.28 thrd_current F
 GLIBC_2.28 thrd_equal F
 GLIBC_2.28 thrd_sleep F
 GLIBC_2.28 thrd_yield F
+GLIBC_2.29 __rseq_abi T 0x20
+GLIBC_2.29 __rseq_lib_abi T 0x8
 GLIBC_2.29 posix_spawn_file_actions_addchdir_np F
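
For reviewers who want to cross-check the abilist sizes (0x20 for __rseq_abi, 0x8 for __rseq_lib_abi) against the fallback definitions in rseq-sym.c, here is a minimal sketch. The sketch_ names are hypothetical copies made for illustration only, so the snippet does not depend on <linux/rseq.h> or on this patch being applied; it assumes a GCC-compatible compiler for the aligned attribute.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical copy of the fallback struct rseq from rseq-sym.c, renamed
   so it cannot clash with the kernel header.  */
struct sketch_rseq
{
  uint32_t cpu_id_start;
  uint32_t cpu_id;
  uint64_t rseq_cs;
  uint32_t flags;
} __attribute__ ((aligned (4 * sizeof (uint64_t))));

/* Hypothetical copy of struct rseq_lib_abi.  */
struct sketch_rseq_lib_abi
{
  uint32_t register_state;
  uint32_t refcount;
};

/* The sizes advertised by the "T 0x20" and "T 0x8" abilist entries.  */
_Static_assert (sizeof (struct sketch_rseq) == 0x20,
                "struct is padded to its 32-byte alignment");
_Static_assert (sizeof (struct sketch_rseq_lib_abi) == 0x8,
                "two uint32_t fields");

/* Bytes covered by the fields themselves: the "20 bytes" mentioned in
   the rseq-sym.c comment.  */
size_t
sketch_rseq_used_bytes (void)
{
  return offsetof (struct sketch_rseq, flags) + sizeof (uint32_t);
}
```

Calling sketch_rseq_used_bytes () from a test program gives the number of bytes the declared fields cover; the remaining bytes up to 0x20 are padding introduced by the alignment attribute.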
Register rseq(2) TLS for each thread (including main), and unregister
for each thread (excluding main). "rseq" stands for Restartable
Sequences.

See the rseq(2) man page proposed here:
  https://lkml.org/lkml/2018/9/19/647

This patch is based on glibc commit a502c5294.

The rseq(2) system call was merged into Linux 4.18.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
CC: Carlos O'Donell <carlos@redhat.com>
CC: Florian Weimer <fweimer@redhat.com>
CC: Joseph Myers <joseph@codesourcery.com>
CC: Szabolcs Nagy <szabolcs.nagy@arm.com>
CC: Thomas Gleixner <tglx@linutronix.de>
CC: Ben Maurer <bmaurer@fb.com>
CC: Peter Zijlstra <peterz@infradead.org>
CC: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
CC: Boqun Feng <boqun.feng@gmail.com>
CC: Will Deacon <will.deacon@arm.com>
CC: Dave Watson <davejwatson@fb.com>
CC: Paul Turner <pjt@google.com>
CC: Rich Felker <dalias@libc.org>
CC: libc-alpha@sourceware.org
CC: linux-kernel@vger.kernel.org
CC: linux-api@vger.kernel.org
---
Changes since v1:
- Move __rseq_refcount to an extra field at the end of __rseq_abi to
  eliminate one symbol.

  All libraries/programs which try to register rseq (glibc,
  early-adopter applications, early-adopter libraries) should use the
  rseq refcount. It becomes part of the ABI within a user-space
  process, but it's not part of the ABI shared with the kernel per se.
- Restructure how this code is organized so glibc keeps building on
  non-Linux targets.
- Use non-weak symbol for __rseq_abi.
- Move rseq registration/unregistration implementation into its own
  nptl/rseq.c compile unit.
- Move __rseq_abi symbol under GLIBC_2.29.

Changes since v2:
- Move __rseq_refcount to its own symbol, which is less ugly than
  trying to play tricks with the rseq uapi.
- Move __rseq_abi from nptl to csu (C start up), so it can be used
  across glibc, including memory allocator and sched_getcpu(). The
  __rseq_refcount symbol is kept in nptl, because there is no reason
  to use it elsewhere in glibc.

Changes since v3:
- Set __rseq_refcount TLS to 1 on register/set to 0 on unregister
  because glibc is the first/last user.
- Unconditionally register/unregister rseq at thread start/exit,
  because glibc is the first/last user.
- Add missing abilist items.
- Rebase on glibc master commit a502c5294.
- Add NEWS entry.

Changes since v4:
- Do not use "weak" symbols for __rseq_abi and __rseq_refcount. Based
  on "System V Application Binary Interface", weak only affects the
  link editor, not the dynamic linker.
- Install a new sys/rseq.h system header on Linux, which contains the
  RSEQ_SIG definition, __rseq_abi declaration and __rseq_refcount
  declaration. Move those definition/declarations from
  rseq-internal.h to the installed sys/rseq.h header.
- Considering that rseq is only available on Linux, move csu/rseq.c
  to sysdeps/unix/sysv/linux/rseq-sym.c.
- Move __rseq_refcount from nptl/rseq.c to
  sysdeps/unix/sysv/linux/rseq-sym.c, so it is only defined on Linux.
- Move both ABI definitions for __rseq_abi and __rseq_refcount to
  sysdeps/unix/sysv/linux/Versions, so they only appear on Linux.
- Document __rseq_abi and __rseq_refcount volatile.
- Document the RSEQ_SIG signature define.
- Move registration functions from rseq.c to rseq-internal.h static
  inline functions. Introduce empty stubs in misc/rseq-internal.h,
  which can be overridden by architecture code in
  sysdeps/unix/sysv/linux/rseq-internal.h.
- Rename __rseq_register_current_thread and
  __rseq_unregister_current_thread to rseq_register_current_thread and
  rseq_unregister_current_thread, now that those are only visible as
  internal static inline functions.
- Invoke rseq_register_current_thread() from libc-start.c
  LIBC_START_MAIN rather than nptl init, so applications not linked
  against libpthread.so have rseq registered for their main() thread.
  Note that it is invoked separately for SHARED and !SHARED builds.

Changes since v5:
- Replace __rseq_refcount by __rseq_lib_abi, which contains two
  uint32_t: register_state and refcount. The "register_state" field
  allows inhibiting rseq registration from signal handlers nested on
  top of glibc registration and occurring after rseq unregistration by
  glibc.
- Introduce enum rseq_register_state, which contains the states
  allowed for the struct rseq_lib_abi register_state field.
---
 NEWS                                          |  6 ++
 csu/libc-start.c                              | 12 ++-
 misc/Makefile                                 |  3 +-
 misc/rseq-internal.h                          | 34 +++++++
 nptl/pthread_create.c                         |  9 ++
 sysdeps/unix/sysv/linux/Makefile              |  5 +-
 sysdeps/unix/sysv/linux/Versions              |  4 +
 sysdeps/unix/sysv/linux/aarch64/libc.abilist  |  2 +
 sysdeps/unix/sysv/linux/alpha/libc.abilist    |  2 +
 sysdeps/unix/sysv/linux/arm/libc.abilist      |  2 +
 sysdeps/unix/sysv/linux/hppa/libc.abilist     |  2 +
 sysdeps/unix/sysv/linux/i386/libc.abilist     |  2 +
 sysdeps/unix/sysv/linux/ia64/libc.abilist     |  2 +
 .../sysv/linux/m68k/coldfire/libc.abilist     |  2 +
 .../unix/sysv/linux/m68k/m680x0/libc.abilist  |  2 +
 .../unix/sysv/linux/microblaze/libc.abilist   |  2 +
 .../sysv/linux/mips/mips32/fpu/libc.abilist   |  2 +
 .../sysv/linux/mips/mips32/nofpu/libc.abilist |  2 +
 .../sysv/linux/mips/mips64/n32/libc.abilist   |  2 +
 .../sysv/linux/mips/mips64/n64/libc.abilist   |  2 +
 sysdeps/unix/sysv/linux/nios2/libc.abilist    |  2 +
 .../linux/powerpc/powerpc32/fpu/libc.abilist  |  2 +
 .../powerpc/powerpc32/nofpu/libc.abilist      |  2 +
 .../linux/powerpc/powerpc64/libc-le.abilist   |  2 +
 .../sysv/linux/powerpc/powerpc64/libc.abilist |  2 +
 .../unix/sysv/linux/riscv/rv64/libc.abilist   |  2 +
 sysdeps/unix/sysv/linux/rseq-internal.h       | 91 +++++++++++++++++++
 sysdeps/unix/sysv/linux/rseq-sym.c            | 54 +++++++++++
 .../unix/sysv/linux/s390/s390-32/libc.abilist |  2 +
 .../unix/sysv/linux/s390/s390-64/libc.abilist |  2 +
 sysdeps/unix/sysv/linux/sh/libc.abilist       |  2 +
 .../sysv/linux/sparc/sparc32/libc.abilist     |  2 +
 .../sysv/linux/sparc/sparc64/libc.abilist     |  2 +
 sysdeps/unix/sysv/linux/sys/rseq.h            | 64 +++++++++++++
 .../unix/sysv/linux/x86_64/64/libc.abilist    |  2 +
 .../unix/sysv/linux/x86_64/x32/libc.abilist   |  2 +
 36 files changed, 328 insertions(+), 6 deletions(-)
 create mode 100644 misc/rseq-internal.h
 create mode 100644 sysdeps/unix/sysv/linux/rseq-internal.h
 create mode 100644 sysdeps/unix/sysv/linux/rseq-sym.c
 create mode 100644 sysdeps/unix/sysv/linux/sys/rseq.h
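
To illustrate how an early-adopter library might cooperate with the refcount and register_state fields described in the changelog, here is a rough sketch. It is not the glibc implementation: all sketch_ names are hypothetical, the per-thread state is modelled by a plain struct rather than TLS, and the rseq system call is replaced by a mock so the logic can be exercised without kernel support. Rolling the refcount back when registration fails is a choice made for this sketch only.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors enum rseq_register_state from the patch; the SKETCH_ prefix
   marks these as illustrative names, not glibc identifiers.  */
enum sketch_register_state
{
  SKETCH_REGISTER_ALLOWED = 0,
  SKETCH_REGISTER_NESTED = 1,
  SKETCH_REGISTER_EXITING = 2,
};

/* Hypothetical model of one thread: the two struct rseq_lib_abi fields
   plus bookkeeping that stands in for the kernel.  */
struct sketch_thread
{
  uint32_t register_state;
  uint32_t refcount;
  int kernel_registered;  /* 1 while the mock kernel holds a registration.  */
  int syscall_count;      /* Number of mock rseq calls issued.  */
};

/* Mock of the rseq system call; no real system call is made.  */
int
sketch_mock_rseq (struct sketch_thread *t, int unregister)
{
  t->syscall_count++;
  if (unregister)
    {
      if (!t->kernel_registered)
        return -1;
      t->kernel_registered = 0;
      return 0;
    }
  if (t->kernel_registered)
    return -1;
  t->kernel_registered = 1;
  return 0;
}

/* Take one reference; only the 0 -> 1 transition reaches the kernel.  */
int
sketch_register (struct sketch_thread *t)
{
  int ret = 0;

  if (t->register_state != SKETCH_REGISTER_ALLOWED)
    return -1;
  /* Keep nested signal handlers away while the count is updated.  */
  t->register_state = SKETCH_REGISTER_NESTED;
  if (t->refcount == UINT32_MAX)
    ret = -1;
  else if (t->refcount++ == 0)
    {
      ret = sketch_mock_rseq (t, 0);
      if (ret != 0)
        t->refcount--;  /* Sketch-only rollback on failure.  */
    }
  t->register_state = SKETCH_REGISTER_ALLOWED;
  return ret;
}

/* Drop one reference; only the 1 -> 0 transition reaches the kernel.  */
int
sketch_unregister (struct sketch_thread *t)
{
  int ret = 0;

  if (t->register_state != SKETCH_REGISTER_ALLOWED)
    return -1;
  t->register_state = SKETCH_REGISTER_NESTED;
  if (t->refcount == 0)
    ret = -1;
  else if (--t->refcount == 0)
    ret = sketch_mock_rseq (t, 1);
  t->register_state = SKETCH_REGISTER_ALLOWED;
  return ret;
}
```

With a zero-initialized struct sketch_thread, two sketch_register calls followed by two sketch_unregister calls issue exactly two mock system calls, and any attempt made while register_state is SKETCH_REGISTER_EXITING is refused without touching the refcount.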