Message ID | 1450496359-12069-1-git-send-email-nic.dade@gmail.com |
---|---|
State | New |
Headers | show |
On Fri, Dec 18, 2015 at 07:39:19PM -0800, Nicolas S. Dade wrote: > This supercedes the previous pselect patch from 16 Dec 2015. > > Linux has a pselect syscall since 2.6.something. Using it > rather than emulating it with sigprocmask+select+sigprocmask > is smaller code, and works properly. (The emulation has > race conditions when unblocked signals arrive before or > after the select) > > The tv.nsec >= 1E9 handling comes from uclibc's linux select() > implementation, which itself uses pselect() internally if the > pselect syscall exists. I though it would be good to do the > same here. > > Note that although the libc pselect() API has 6 arguments, > the linux kernel syscall as 7 arguments. There is an extra, > somewhat vestigial, sizeof the signal mask argument. > > Signed-off-by: Nicolas S. Dade <nic.dade@gmail.com> > --- > libc/sysdeps/linux/common/pselect.c | 52 +++++++++++++++++++++++++++++++++++++ > 1 file changed, 52 insertions(+) > > diff --git a/libc/sysdeps/linux/common/pselect.c b/libc/sysdeps/linux/common/pselect.c > index bf19ce3..3f1dd28 100644 > --- a/libc/sysdeps/linux/common/pselect.c > +++ b/libc/sysdeps/linux/common/pselect.c > @@ -30,6 +30,57 @@ static int __NC(pselect)(int nfds, fd_set *readfds, fd_set *writefds, > fd_set *exceptfds, const struct timespec *timeout, > const sigset_t *sigmask) > { > +#ifdef __NR_pselect6 > +#define NSEC_PER_SEC 1000000000L > + struct timespec _ts, *ts = 0; > + if (timeout) { > + /* The Linux kernel can in some situations update the timeout value. > + * We do not want that so use a local variable. > + */ > + _ts = *timeout; > + > + /* GNU extension: allow for timespec values where the sub-sec > + * field is equal to or more than 1 second. The kernel will > + * reject this on us, so take care of the time shift ourself. > + * Some applications (like readline and linphone) do this. > + * See 'clarification on select() type calls and invalid timeouts' > + * on the POSIX general list for more information. > + */ > + if (_ts.tv_nsec >= NSEC_PER_SEC) { > + _ts.tv_sec += _ts.tv_nsec / NSEC_PER_SEC; > + _ts.tv_nsec %= NSEC_PER_SEC; > + } > + > + ts = &_ts; > + } > + > + /* The pselect6 syscall API is strange. It wants a 7th arg to be > + * the sizeof(*sigmask). However syscalls with > 6 arguments aren't > + * supported on linux. So arguments 6 and 7 are stuffed in a struct > + * and a pointer to that struct is passed as the 6th argument to > + * the syscall. > + * Glibc stuffs arguments 6 and 7 in a ulong[2]. Linux reads > + * them as if there were a struct { sigset_t*; size_t } in > + * userspace. There woudl be trouble if userspace and the kernel are > + * compiled differently enough that size_t isn't the same as ulong, > + * but not enough to trigger the compat layer in linux. I can't > + * think of such a case, so I'm using linux's struct. > + * Furthermore Glibc sets the sigsetsize to _NSIG/8. However linux > + * checks for sizeof(sigset_t), which internally is a ulong array. > + * This means that if _NSIG isn't a multiple of BITS_PER_LONG then > + * linux will refuse glibc's value. So I prefer sizeof(sigset_t) for > + * the value of sigsetsize. > + */ > + struct { > + const sigset_t *sigmask; > + size_t sigsetsize; > + } args67 = { > + sigmask, > + sizeof(sigset_t), > + }; This might work if uclibc defines sigset_t not to have the ridiculous expansion space for 1024 signals like glibc does, but it still seems fragile. I'd use _NSIG/8. That's what we do in musl. (If the size you pass to the kernel is wrong you get EINVAL or something like that.) Rich
Hi Nicolas,
Nicolas S. Dade wrote,
> This supercedes the previous pselect patch from 16 Dec 2015.
Can you resend next time with PATCH v3 in the subject and
a short Changelog after the comment ---. Thanks.
What you think about Rich comment?
best regards
Waldemar
> v3 Sure. Next time. > if ... expansion space Well uclibc does not do that. It defines sigset_t to match the kernel, and even has a comment about that. /* A 'sigset_t' has a bit for each signal. * glibc has space for 1024 signals (!), but most arches supported * by Linux have 64 signals, and only MIPS has 128. * There seems to be some historical baggage in sparc[64] * where they might have (or had in the past) 32 signals only, * I hope it's irrelevant now. * Signal 0 does not exist, so we have signals 1..64, not 0..63. * In uclibc, kernel and userspace sigset_t is always the same. * BTW, struct sigaction is also the same on kernel and userspace side. */ #if defined(__mips__) # define _SIGSET_NWORDS (128 / (8 * sizeof (unsigned long))) #else # define _SIGSET_NWORDS (64 / (8 * sizeof (unsigned long))) #endif typedef struct { unsigned long __val[_SIGSET_NWORDS]; } __sigset_t; The patched uclibc pselect works properly on ARM (32-bit), both in behavior and watching the operation in strace. ARM is the only uclibc target to which I have easy access. -Nicolas Dade On Sat, Dec 19, 2015 at 12:58 AM, Waldemar Brodkorb <wbx@openadk.org> wrote: > Hi Nicolas, > Nicolas S. Dade wrote, > > > This supercedes the previous pselect patch from 16 Dec 2015. > > Can you resend next time with PATCH v3 in the subject and > a short Changelog after the comment ---. Thanks. > > What you think about Rich comment? > > best regards > Waldemar >
diff --git a/libc/sysdeps/linux/common/pselect.c b/libc/sysdeps/linux/common/pselect.c index bf19ce3..3f1dd28 100644 --- a/libc/sysdeps/linux/common/pselect.c +++ b/libc/sysdeps/linux/common/pselect.c @@ -30,6 +30,57 @@ static int __NC(pselect)(int nfds, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, const struct timespec *timeout, const sigset_t *sigmask) { +#ifdef __NR_pselect6 +#define NSEC_PER_SEC 1000000000L + struct timespec _ts, *ts = 0; + if (timeout) { + /* The Linux kernel can in some situations update the timeout value. + * We do not want that so use a local variable. + */ + _ts = *timeout; + + /* GNU extension: allow for timespec values where the sub-sec + * field is equal to or more than 1 second. The kernel will + * reject this on us, so take care of the time shift ourself. + * Some applications (like readline and linphone) do this. + * See 'clarification on select() type calls and invalid timeouts' + * on the POSIX general list for more information. + */ + if (_ts.tv_nsec >= NSEC_PER_SEC) { + _ts.tv_sec += _ts.tv_nsec / NSEC_PER_SEC; + _ts.tv_nsec %= NSEC_PER_SEC; + } + + ts = &_ts; + } + + /* The pselect6 syscall API is strange. It wants a 7th arg to be + * the sizeof(*sigmask). However syscalls with > 6 arguments aren't + * supported on linux. So arguments 6 and 7 are stuffed in a struct + * and a pointer to that struct is passed as the 6th argument to + * the syscall. + * Glibc stuffs arguments 6 and 7 in a ulong[2]. Linux reads + * them as if there were a struct { sigset_t*; size_t } in + * userspace. There woudl be trouble if userspace and the kernel are + * compiled differently enough that size_t isn't the same as ulong, + * but not enough to trigger the compat layer in linux. I can't + * think of such a case, so I'm using linux's struct. + * Furthermore Glibc sets the sigsetsize to _NSIG/8. However linux + * checks for sizeof(sigset_t), which internally is a ulong array. + * This means that if _NSIG isn't a multiple of BITS_PER_LONG then + * linux will refuse glibc's value. So I prefer sizeof(sigset_t) for + * the value of sigsetsize. + */ + struct { + const sigset_t *sigmask; + size_t sigsetsize; + } args67 = { + sigmask, + sizeof(sigset_t), + }; + + return INLINE_SYSCALL(pselect6, 6, nfds, readfds, writefds, exceptfds, ts, &args67); +#else struct timeval tval; int retval; sigset_t savemask; @@ -57,6 +108,7 @@ static int __NC(pselect)(int nfds, fd_set *readfds, fd_set *writefds, sigprocmask (SIG_SETMASK, &savemask, NULL); return retval; +#endif } CANCELLABLE_SYSCALL(int, pselect, (int nfds, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, const struct timespec *timeout, const sigset_t *sigmask),
This supercedes the previous pselect patch from 16 Dec 2015. Linux has a pselect syscall since 2.6.something. Using it rather than emulating it with sigprocmask+select+sigprocmask is smaller code, and works properly. (The emulation has race conditions when unblocked signals arrive before or after the select) The tv.nsec >= 1E9 handling comes from uclibc's linux select() implementation, which itself uses pselect() internally if the pselect syscall exists. I though it would be good to do the same here. Note that although the libc pselect() API has 6 arguments, the linux kernel syscall as 7 arguments. There is an extra, somewhat vestigial, sizeof the signal mask argument. Signed-off-by: Nicolas S. Dade <nic.dade@gmail.com> --- libc/sysdeps/linux/common/pselect.c | 52 +++++++++++++++++++++++++++++++++++++ 1 file changed, 52 insertions(+)