Message ID | 1418774081.7165.71.camel@triegel.csb |
---|---|
State | New |
Headers | show |
> I want to push back on this a little for two reasons. First, while I > agree that we don't want to have two interfaces for the same thing, this > isn't exactly the case: lll_futex_ has no error checking and is just a > wrapper for whatever the underlying kernel/... provides. The futex > wrappers I introduced do have error checking. The lack of error checking in lll_futex_* (and their callers) is a bug, as agreed here previously in discussion with kernel folks. Frankly, I thought that introducing error checking uniformly to our futex uses was the motivation for your changes. Do you actually think that we want to reach an end state in which we have some uses of futex doing error checking and some uses not doing it? I find it hard to imagine we want that. > Second, it's hard to easily commit to doing something that isn't clearly > defined, especially so close to the freeze. I agree that we shouldn't > let introduce junk or duplicated functionality, but it may be > unrealistic to try to get all-or-nothing instead of incremental changes. I didn't say the changes should be all-or-nothing. Almost certainly they should not be. I said I wanted you to commit to changing all futex uses as part of the whole effort. That has nothing to do with making it all one change. It's simply about your personal commitment to keep working on the issue until all the changes are finished. As to the proximity of the freeze, firstly we don't yet have any actual date for the freeze. Secondly, I think this kind of cleanup work is best done without big delays in the middle and so probably should be all done in the same cycle. If it's too much for you to finish in this cycle, then perhaps the whole thing should wait until the next cycle. Personally, I'd rather we just get on with it now and if that pushes the next release freeze out a few weeks, that's fine with me. > There is assembly that calls futex syscalls, which needs at least the > macros for the different futex ops, and the syscall number in certain > cases. Those things are currently in lowlevellock.h. > lowlevellock-futex.h needs the same information, and it should not > include lowlevellock.h. Do we have assembly code making futex syscalls that we actually want to keep? My impression was that we wanted to get rid of pretty much all of the assembly files for this stuff, and it's simply us moving slowly for each machine that's why we haven't done that yet. > It doesn't work currently on i386 due to six-argument syscalls not being > supported, AFAIU. If someone can add that I'd appreciate it; it would > save me finding out how to do that properly. Ah. I think there are several of us who could tackle that. > > If you start with a strawman proposal for the complete new internal API, > > then we can all work together to figure out how to clean up existing code. > > I'd start with just futex (timed)wait and wake, so what's in my patch. > That covers most of the uses. The other ops are mostly for the > low-level locks, and I need to make a pass over the error handling for > these (but this was already discussed in the futex error handling > thread). As long as you have a rough idea how everything else will fit in so that we can be confident that dealing with the other operations won't make us want to reconsider the whole organization of the internal API, that seems OK. > I agree that the existance custom lowlevellock.h and the related > assembly files is an issue we want to fix. But I think we can start > making the futex facility more generic independently of that. Clearly we can start. But the existence of the old assembly code seems to be a significant barrier to doing all the cleanup in the most straightforward ways. I think it is probably just going to be net easier at the end of the day if we take some time first to clean out all the old assembly code and just don't have those wrinkles to think about when revamping the rest. > 1) Have an OS-call interface that just does that. This is what > lowlevellock-futex.h is currently, AFAIU. This is implemented > differently on each OS. That's roughly what it is currently. But it's underspecified and it doesn't have a very clean internal API. This should be replaced by something that isn't organized and named as an implementation detail of the lll layer, cleans up the private flag stuff, and cleans up the API overall wrt things I've mentioned before like the stupid negated errno protocol and using type-checking inlines rather than loosy-goosy macros. That's not to say that this cleanup is what must happen first. But an end state where we still have lowlevellock-futex.h at all in anything like its current form is not desireable. > Make sure that there is a lowlevellock-futex.h on each arch. The > attached patch does this for x86_64 and i386. More details below. I'm not clear on what that patch is accomplishing that's materially different from just using the existing linux/lowlevellock-futex.h on x86_64 and i386. But please post that patch in its own thread so it can get proper review and discussion separate from the rest of this discussion. > 2) Have an OS-independent internal futex interface, with error checking. > That's what's in my patch. This uses the interface from 1). OK. > 3) Move generic, non-low-level-lock code over to using the interface > from 2). The new semaphore does that. I have a new condvar > implementation which is just missing the PI handling, which would do the > same. I'm working on a new rwlock that would use it. Etc. OK. > 4) Use the interface from 2) in generic low-level lock. This should be > fine and without significant performance implications because all futex > ops are on the slow path, with the exception of the PI mutexes. But if > you're doing a syscall anyway, doing a few more instructions more or > less won't matter that much. OK. 3 and 4 could happen in either order, or in parallel, right? > 5) Remove custom low-level lock implementations after reviewing the > performance implications of such removals. This need not wait for any other step. Aside from the i386 issue, this can be done today and IMHO the sooner it's done the better. > Does this sound reasonable? Are you OK with doing steps 1 and 2 before > the freeze, and do as much of 3) as possible before the freeze? At the moment I'm feeling ambivalent about splitting any of thus work up across a freeze. But each step as described is sufficiently self-contained that it's probably fine just to do each as it's ready (with very careful and thorough review and testing) regardless of where that falls relative to a freeze deadline. > This uses generic lowlevellock-futex.h for x86_64. It also adds a few > #ifdef __ASSEMBLER__ to the generic Linux one to make this work. Post that separately by itself. Sounds trivial. > With i386, this doesn't work because of lack of support for six-argument > syscalls (see above); thus, this patch just splits out all the > lll_futex_* calls and related stuff into a i386-specific > lowlevellock-futex.h file. This has fewer features than the generic > one, but the only users are the current interface from 2), which just > have futex (timed)wait and wake. And it works currently, so this should > be fine and we can add stuff as necessary later on (or, better, move to > the generic lowlevellock-futex.h). Post that separately by itself. It sounds like it might be a fine intermediate change we could do right away. Making i386 INTERNAL_SYSCALL handle the 6-argument case will probably be fiddly enough that it takes a bit longer. > * microblaze, s390, hppa all use INTERNAL_SYSCALL; I believe those > could just use generic Linux lowlevellock-futex.h. Yes. I think the right approach here is just to post a patch doing the removal and let the machine maintainers agree or raise issues. If you are in a position to do a performance comparison, either empirically and/or by eyeballing generated code, then I'm sure that's helpful to report. But ultimately the responsibility is with each machine maintainer. I suspect that at least e.g. microblaze and hppa maintainers will be happy to do the removal just based on no-semantic-regressions testing and not bother to fret about feared performance issues. > * ia64 uses DO_INLINE_SYSCALL. Not sure whether that could use generic > Linux futexes too. ia64's INTERNAL_SYSCALL is a pretty trivial wrapper around DO_INLINE_SYSCALL, so I think this is almost certainly just fine. > * sh has custom asm. Not sure what to do about that. Suggest removing it and see if the maintainer even cares to worry about whether the performance is affected. > * sparc uses INTERNAL_SYSCALL, so could be moved to generic Linux, but > there is a change (whose purpose/motivation I currently don't > understand): Bug Dave to clarify the requirements for that. Thanks, Roland
On 12/16/2014 08:07 PM, Roland McGrath wrote: > As to the proximity of the freeze, firstly we don't yet have any actual > date for the freeze. Secondly, I think this kind of cleanup work is best > done without big delays in the middle and so probably should be all done in > the same cycle. If it's too much for you to finish in this cycle, then > perhaps the whole thing should wait until the next cycle. Personally, I'd > rather we just get on with it now and if that pushes the next release > freeze out a few weeks, that's fine with me. I have tentatively set the freeze date is set as January 9th. Cheers, Carlos.
On Tue, 2014-12-16 at 17:07 -0800, Roland McGrath wrote: > > I want to push back on this a little for two reasons. First, while I > > agree that we don't want to have two interfaces for the same thing, this > > isn't exactly the case: lll_futex_ has no error checking and is just a > > wrapper for whatever the underlying kernel/... provides. The futex > > wrappers I introduced do have error checking. > > The lack of error checking in lll_futex_* (and their callers) is a bug, as > agreed here previously in discussion with kernel folks. Frankly, I thought > that introducing error checking uniformly to our futex uses was the > motivation for your changes. It is. I do want to have the low-level locks check the return values of their futex calls. To me, that doesn't conflict with having another interface that is just what the underlying kernels/... provide. We can also get rid of this other interface and do error checking in each of the per kernel/... implementations. Does that clarify what I meant? > Do you actually think that we want to reach an end state in which we have > some uses of futex doing error checking and some uses not doing it? > I find it hard to imagine we want that. No, but having layered interfaces, where only the top-most one does error checking and is used throughout glibc, doesn't seem obviously bad to me. > > Second, it's hard to easily commit to doing something that isn't clearly > > defined, especially so close to the freeze. I agree that we shouldn't > > let introduce junk or duplicated functionality, but it may be > > unrealistic to try to get all-or-nothing instead of incremental changes. > > I didn't say the changes should be all-or-nothing. > Almost certainly they should not be. > > I said I wanted you to commit to changing all futex uses as part of the > whole effort. That has nothing to do with making it all one change. It's > simply about your personal commitment to keep working on the issue until > all the changes are finished. > > As to the proximity of the freeze, firstly we don't yet have any actual > date for the freeze. The message Carlos sent out states otherwise. I believed we do want to have fixed release dates, and I assumed that the release date would be Dec 30. The Jan 9 date Carlos mentioned gives a bit more time, but not much. > Secondly, I think this kind of cleanup work is best > done without big delays in the middle and so probably should be all done in > the same cycle. If it's too much for you to finish in this cycle, then > perhaps the whole thing should wait until the next cycle. What would this mean for the new semaphore I posted? Are you okay with it using the current lll_futex_* operations and custom error checking? Or do you want something like the futex-with-builtin-error-checking interface that I posted? In the latter case, this would block the semaphore from going in. > Personally, I'd > rather we just get on with it now and if that pushes the next release > freeze out a few weeks, that's fine with me. > > > There is assembly that calls futex syscalls, which needs at least the > > macros for the different futex ops, and the syscall number in certain > > cases. Those things are currently in lowlevellock.h. > > lowlevellock-futex.h needs the same information, and it should not > > include lowlevellock.h. > > Do we have assembly code making futex syscalls that we actually want to > keep? My impression was that we wanted to get rid of pretty much all of > the assembly files for this stuff, and it's simply us moving slowly for > each machine that's why we haven't done that yet. I would also hope that we can remove all of them. But actually doing that may be a bigger project. The futex-using assembly files I see are on x86, x86_64, and sh. Its the low-level lock stuff (including robust), pthread barrier, pthread condvar, pthread rwlock, pthread semaphore, cancellation. I've posted a new semaphore, and as I've said, am working on condvar and rwlock. Cancellation may be something that ends up doing futex calls from assembly. But given that futexes are on the slow path, the assembly could probably just call C futex function instead. > > It doesn't work currently on i386 due to six-argument syscalls not being > > supported, AFAIU. If someone can add that I'd appreciate it; it would > > save me finding out how to do that properly. > > Ah. I think there are several of us who could tackle that. Thanks. > > > If you start with a strawman proposal for the complete new internal API, > > > then we can all work together to figure out how to clean up existing code. > > > > I'd start with just futex (timed)wait and wake, so what's in my patch. > > That covers most of the uses. The other ops are mostly for the > > low-level locks, and I need to make a pass over the error handling for > > these (but this was already discussed in the futex error handling > > thread). > > As long as you have a rough idea how everything else will fit in so that we > can be confident that dealing with the other operations won't make us want > to reconsider the whole organization of the internal API, that seems OK. > > > I agree that the existance custom lowlevellock.h and the related > > assembly files is an issue we want to fix. But I think we can start > > making the futex facility more generic independently of that. > > Clearly we can start. But the existence of the old assembly code seems to > be a significant barrier to doing all the cleanup in the most > straightforward ways. I think it is probably just going to be net easier > at the end of the day if we take some time first to clean out all the old > assembly code and just don't have those wrinkles to think about when > revamping the rest. > > > 1) Have an OS-call interface that just does that. This is what > > lowlevellock-futex.h is currently, AFAIU. This is implemented > > differently on each OS. > > That's roughly what it is currently. But it's underspecified and it > doesn't have a very clean internal API. This should be replaced by > something that isn't organized and named as an implementation detail of the > lll layer, Agreed that the name needs to be changed. > cleans up the private flag stuff, Yes, that would be good. > and cleans up the API overall > wrt things I've mentioned before like the stupid negated errno protocol and > using type-checking inlines rather than loosy-goosy macros. > > That's not to say that this cleanup is what must happen first. But an end > state where we still have lowlevellock-futex.h at all in anything like its > current form is not desireable. OK. One thing we need to agree on is whether we need this interface in the end at all, or whether we make the interface from 2) (see below) the one that each OS implements. What do you think? Keeping 1) would make it easier for us to expose it to users in the future (I do remember you not wanting to care about this now, though). Keeping 1) isn't helping for internal use if the errors that it can return are incompatible between the different OS implementations. What is NaCl's error specification for the futex ops? Given that much of this refactoring is to make non-Linux futex possible, it would be helpful if you can summarize what you need and want. > > Make sure that there is a lowlevellock-futex.h on each arch. The > > attached patch does this for x86_64 and i386. More details below. > > I'm not clear on what that patch is accomplishing that's materially > different from just using the existing linux/lowlevellock-futex.h on x86_64 > and i386. But please post that patch in its own thread so it can get > proper review and discussion separate from the rest of this discussion. OK. > > 2) Have an OS-independent internal futex interface, with error checking. > > That's what's in my patch. This uses the interface from 1). > > OK. > > > 3) Move generic, non-low-level-lock code over to using the interface > > from 2). The new semaphore does that. I have a new condvar > > implementation which is just missing the PI handling, which would do the > > same. I'm working on a new rwlock that would use it. Etc. > > OK. > > > 4) Use the interface from 2) in generic low-level lock. This should be > > fine and without significant performance implications because all futex > > ops are on the slow path, with the exception of the PI mutexes. But if > > you're doing a syscall anyway, doing a few more instructions more or > > less won't matter that much. > > OK. 3 and 4 could happen in either order, or in parallel, right? Yes. > > 5) Remove custom low-level lock implementations after reviewing the > > performance implications of such removals. > > This need not wait for any other step. Aside from the i386 issue, this can > be done today and IMHO the sooner it's done the better. That's true, but in contrast to the other steps, this may require more time to assess the impact on performance. > > This uses generic lowlevellock-futex.h for x86_64. It also adds a few > > #ifdef __ASSEMBLER__ to the generic Linux one to make this work. > > Post that separately by itself. Sounds trivial. OK. > > With i386, this doesn't work because of lack of support for six-argument > > syscalls (see above); thus, this patch just splits out all the > > lll_futex_* calls and related stuff into a i386-specific > > lowlevellock-futex.h file. This has fewer features than the generic > > one, but the only users are the current interface from 2), which just > > have futex (timed)wait and wake. And it works currently, so this should > > be fine and we can add stuff as necessary later on (or, better, move to > > the generic lowlevellock-futex.h). > > Post that separately by itself. It sounds like it might be a fine > intermediate change we could do right away. Making i386 INTERNAL_SYSCALL > handle the 6-argument case will probably be fiddly enough that it takes > a bit longer. OK. > > * microblaze, s390, hppa all use INTERNAL_SYSCALL; I believe those > > could just use generic Linux lowlevellock-futex.h. > > Yes. I think the right approach here is just to post a patch doing the > removal and let the machine maintainers agree or raise issues. If you are > in a position to do a performance comparison, either empirically and/or by > eyeballing generated code, then I'm sure that's helpful to report. But > ultimately the responsibility is with each machine maintainer. I suspect > that at least e.g. microblaze and hppa maintainers will be happy to do the > removal just based on no-semantic-regressions testing and not bother to > fret about feared performance issues. OK. The futex ops are on the slow path, so performance differences won't matter much (in contrast to lowlevellock.h, which contains the fast paths too). > > * ia64 uses DO_INLINE_SYSCALL. Not sure whether that could use generic > > Linux futexes too. > > ia64's INTERNAL_SYSCALL is a pretty trivial wrapper around > DO_INLINE_SYSCALL, so I think this is almost certainly just fine. OK. > > * sh has custom asm. Not sure what to do about that. > > Suggest removing it and see if the maintainer even cares to worry about > whether the performance is affected. OK. > > * sparc uses INTERNAL_SYSCALL, so could be moved to generic Linux, but > > there is a change (whose purpose/motivation I currently don't > > understand): > > Bug Dave to clarify the requirements for that. OK.
> It is. I do want to have the low-level locks check the return values of > their futex calls. To me, that doesn't conflict with having another > interface that is just what the underlying kernels/... provide. We can > also get rid of this other interface and do error checking in each of > the per kernel/... implementations. Does that clarify what I meant? Yes, that makes sense. On further reflection, I think it does make sense to have the error-checking layer be outside the OS-encapsulation layer so that we have foolproof consistency of the expected error protocol of the OS layer. > The message Carlos sent out states otherwise. I believed we do want to > have fixed release dates, and I assumed that the release date would be > Dec 30. The Jan 9 date Carlos mentioned gives a bit more time, but not > much. Indeed. I don't particularly advocate the fixed date approach, but I haven't objected to it, and won't. > What would this mean for the new semaphore I posted? Are you okay with > it using the current lll_futex_* operations and custom error checking? If that is fixing some real, observable bug independent of the futex error handling question, then I think it's fine as an intermediate step to do now. If it is nothing but internal reorganization or the only kind of bug it's fixing is the lack of futex error handling that is a more pervasive problem (and there's nothing really notable about the semaphore instance of that problem)--or is a purely speculative theoretical bug--then I don't think it's worth perturbing things before the release. > I would also hope that we can remove all of them. But actually doing > that may be a bigger project. The futex-using assembly files I see are > on x86, x86_64, and sh. Its the low-level lock stuff (including > robust), pthread barrier, pthread condvar, pthread rwlock, pthread > semaphore, cancellation. The 6-argument syscall issue on i386 is the only blocker we know of, right? But indeed we wanted to be conservative about perturbing performance issues for i386/x86_64, and just establishing confidence about that will take some time. I had hoped someone like you would have done that during this cycle, but since actual focus on it is only beginning about now, it might well need to wait at least until right after the release. > Cancellation may be something that ends up doing futex calls from > assembly. Can you point to particular code you're thinking of? > One thing we need to agree on is whether we need this interface in the > end at all, or whether we make the interface from 2) (see below) the one > that each OS implements. What do you think? On the one hand I want to avoid extra layers when possible. On the other hand, I am adamant about having any sanity-check enforcement of error protocols be done in code that is OS-independent so that we do not risk skew between OS-specific implementations of the sanity checks. Perhaps that right balance is to have some common code that does the error protocol assertions for each piece of the internal interface, but as utility code that the OS-specific layer calls rather than as an extra layer around it. Either of those two approaches should be fairly easy to refactor into the other later. > Keeping 1) would make it easier for us to expose it to users in the > future (I do remember you not wanting to care about this now, though). > Keeping 1) isn't helping for internal use if the errors that it can > return are incompatible between the different OS implementations. Right. As I said before, I want to ignore speculation about a new user API if considering it would complicate the ideal internal cleanup we can do in its absence. Off hand, > What is NaCl's error specification for the futex ops? Given that much > of this refactoring is to make non-Linux futex possible, it would be > helpful if you can summarize what you need and want. It's not formally specified. Currently it's quite simple: only basic wake and wait/timed-wait are supported, and the only errors possible are EFAULT and EAGAIN (just for value mismatch). It's likely that it will grow over time to cover a larger subset of the Linux features and their nontrivial failure modes. It will always be the intent that it align pretty well with the Linux semantics. > > > 5) Remove custom low-level lock implementations after reviewing the > > > performance implications of such removals. > > > > This need not wait for any other step. Aside from the i386 issue, this can > > be done today and IMHO the sooner it's done the better. > > That's true, but in contrast to the other steps, this may require more > time to assess the impact on performance. Agreed. Thanks, Roland
On Wed, 2014-12-17 at 15:34 -0800, Roland McGrath wrote: > > It is. I do want to have the low-level locks check the return values of > > their futex calls. To me, that doesn't conflict with having another > > interface that is just what the underlying kernels/... provide. We can > > also get rid of this other interface and do error checking in each of > > the per kernel/... implementations. Does that clarify what I meant? > > Yes, that makes sense. On further reflection, I think it does make sense > to have the error-checking layer be outside the OS-encapsulation layer so > that we have foolproof consistency of the expected error protocol of the OS > layer. OK. > > What would this mean for the new semaphore I posted? Are you okay with > > it using the current lll_futex_* operations and custom error checking? > > If that is fixing some real, observable bug independent of the futex error > handling question, then I think it's fine as an intermediate step to do > now. It is fixing a correctness bug. Carlos said he's currently reviewing that, so I'll try to focus on doing more of the futex clean-up now, hoping that I get enough of it done before the freeze (and with time for you to review) to make adding the new internal futex interface fine. Sounds good? I'm traveling this week and thus don't have lots of time to spend on this, but I'll see how far I can get. > > I would also hope that we can remove all of them. But actually doing > > that may be a bigger project. The futex-using assembly files I see are > > on x86, x86_64, and sh. Its the low-level lock stuff (including > > robust), pthread barrier, pthread condvar, pthread rwlock, pthread > > semaphore, cancellation. > > The 6-argument syscall issue on i386 is the only blocker we know of, right? > But indeed we wanted to be conservative about perturbing performance issues > for i386/x86_64, and just establishing confidence about that will take some > time. I had hoped someone like you would have done that during this cycle, > but since actual focus on it is only beginning about now, it might well > need to wait at least until right after the release. No, I haven't, sorry. i386 6-argument syscall is a blocker in terms of using Linux-generic lowlevellock-futex.h everywhere. The other obstacle is that I haven't heard back from the hppa, microblaze, and ia64 maintainers; it seems okay to remove the custom lowlevellock.h on these archs, but I can't test that easily. Without that removal, I can't do certain follow-up cleanups like adding a futex_wait that accepts absolute timeouts in just Linux-generic lowlevellock-futex.h. > > Cancellation may be something that ends up doing futex calls from > > assembly. > > Can you point to particular code you're thinking of? AFAIR the plan we had for how to change cancellation, we wanted it to be based on checking whether the instruction pointer falls into a certain code region, not using the enable/disable calls we have now. We'd put the syscall into a tightly-controlled code region whose addresses we'd know, and then would be able to check whether cancellation happened in there or not. I don't have pointers to the emails handy, but I'll look for it. OTOH, maybe that could just use a cancellable version of INTERNAL_SYSCALL, and we'd be fine. > > One thing we need to agree on is whether we need this interface in the > > end at all, or whether we make the interface from 2) (see below) the one > > that each OS implements. What do you think? > > On the one hand I want to avoid extra layers when possible. On the other > hand, I am adamant about having any sanity-check enforcement of error > protocols be done in code that is OS-independent so that we do not risk > skew between OS-specific implementations of the sanity checks. OK. > Perhaps that right balance is to have some common code that does the error > protocol assertions for each piece of the internal interface, but as > utility code that the OS-specific layer calls rather than as an extra layer > around it. Either of those two approaches should be fairly easy to > refactor into the other later. OK, and I agree that such refactoring seems straightforward. I'd start with having the layering that I have now (with error checking in the futex_* wrappers in my patches, that call the lll_futex_* functions for now), if that's fine for you. This just seems to be easier to implement incrementally. Also, we get to move uses off from the lll_futex_* names into just futex_*, whcih seems right. The next patches I plan to prepare are splitting the current futex_timed_wait into two variants for relative and absolute timeouts, respectively (absolute is used in pthread_*, relative is in aio_suspend, for example). That needs an additional lll_futex_timed_wait variant, but the upside is that we can use this to consolidate much duplicated code in pthread_*, and start using the CLOCK_REALTIME optimization with futex_timed_wait_bitset everywhere. Next, I plan to transition more uses of lll_futex_* to futex_*. If any of that sounds wrong to you, please let me know soon.
commit a2848a4447dfb84635002e9dfa77031b8bbcac79 Author: Torvald Riegel <triegel@redhat.com> Date: Wed Dec 17 00:02:45 2014 +0100 [WIP] Split out futex functions from lowlevellock.h. diff --git a/sysdeps/unix/sysv/linux/i386/lowlevellock-futex.h b/sysdeps/unix/sysv/linux/i386/lowlevellock-futex.h new file mode 100644 index 0000000..2b5ae53 --- /dev/null +++ b/sysdeps/unix/sysv/linux/i386/lowlevellock-futex.h @@ -0,0 +1,136 @@ +/* Copyright (C) 2014 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <http://www.gnu.org/licenses/>. */ + +#ifndef _LOWLEVELLOCK_FUTEX_H +#define _LOWLEVELLOCK_FUTEX_H 1 + +#define FUTEX_WAIT 0 +#define FUTEX_WAKE 1 +#define FUTEX_CMP_REQUEUE 4 +#define FUTEX_WAKE_OP 5 +#define FUTEX_LOCK_PI 6 +#define FUTEX_UNLOCK_PI 7 +#define FUTEX_TRYLOCK_PI 8 +#define FUTEX_WAIT_BITSET 9 +#define FUTEX_WAKE_BITSET 10 +#define FUTEX_WAIT_REQUEUE_PI 11 +#define FUTEX_CMP_REQUEUE_PI 12 +#define FUTEX_PRIVATE_FLAG 128 +#define FUTEX_CLOCK_REALTIME 256 + +#define FUTEX_BITSET_MATCH_ANY 0xffffffff + +#define FUTEX_OP_CLEAR_WAKE_IF_GT_ONE ((4 << 24) | 1) + +/* Values for 'private' parameter of locking macros. Yes, the + definition seems to be backwards. But it is not. The bit will be + reversed before passing to the system call. */ +#define LLL_PRIVATE 0 +#define LLL_SHARED FUTEX_PRIVATE_FLAG + + +#if IS_IN (libc) || IS_IN (rtld) +/* In libc.so or ld.so all futexes are private. */ +# ifdef __ASSUME_PRIVATE_FUTEX +# define __lll_private_flag(fl, private) \ + ((fl) | FUTEX_PRIVATE_FLAG) +# else +# define __lll_private_flag(fl, private) \ + ((fl) | THREAD_GETMEM (THREAD_SELF, header.private_futex)) +# endif +#else +# ifdef __ASSUME_PRIVATE_FUTEX +# define __lll_private_flag(fl, private) \ + (((fl) | FUTEX_PRIVATE_FLAG) ^ (private)) +# else +# define __lll_private_flag(fl, private) \ + (__builtin_constant_p (private) \ + ? ((private) == 0 \ + ? ((fl) | THREAD_GETMEM (THREAD_SELF, header.private_futex)) \ + : (fl)) \ + : ({ unsigned int __fl = ((private) ^ FUTEX_PRIVATE_FLAG); \ + asm ("andl %%gs:%P1, %0" : "+r" (__fl) \ + : "i" (offsetof (struct pthread, header.private_futex))); \ + __fl | (fl); })) +# endif +#endif + + +#ifndef __ASSEMBLER__ + +/* To avoid naming conflicts with lowlevellock.h, use a different prefix + here. */ +#ifdef PIC +# define LLLF_EBX_LOAD "xchgl %2, %%ebx\n" +# define LLLF_EBX_REG "D" +#else +# define LLLF_EBX_LOAD +# define LLLF_EBX_REG "b" +#endif + +#ifdef I386_USE_SYSENTER +# ifdef SHARED +# define LLLF_ENTER_KERNEL "call *%%gs:%P6\n\t" +# else +# define LLLF_ENTER_KERNEL "call *_dl_sysinfo\n\t" +# endif +#else +# define LLLF_ENTER_KERNEL "int $0x80\n\t" +#endif + + +#define lll_futex_wait(futex, val, private) \ + lll_futex_timed_wait (futex, val, NULL, private) + + +#define lll_futex_timed_wait(futex, val, timeout, private) \ + ({ \ + int __status; \ + register __typeof (val) _val asm ("edx") = (val); \ + __asm __volatile (LLLF_EBX_LOAD \ + LLLF_ENTER_KERNEL \ + LLLF_EBX_LOAD \ + : "=a" (__status) \ + : "0" (SYS_futex), LLLF_EBX_REG (futex), "S" (timeout), \ + "c" (__lll_private_flag (FUTEX_WAIT, private)), \ + "d" (_val), "i" (offsetof (tcbhead_t, sysinfo)) \ + : "memory"); \ + __status; \ + }) + + +#define lll_futex_wake(futex, nr, private) \ + ({ \ + int __status; \ + register __typeof (nr) _nr asm ("edx") = (nr); \ + LIBC_PROBE (lll_futex_wake, 3, futex, nr, private); \ + __asm __volatile (LLLF_EBX_LOAD \ + LLLF_ENTER_KERNEL \ + LLLF_EBX_LOAD \ + : "=a" (__status) \ + : "0" (SYS_futex), LLLF_EBX_REG (futex), \ + "c" (__lll_private_flag (FUTEX_WAKE, private)), \ + "d" (_nr), \ + "i" (0) /* phony, to align next arg's number */, \ + "i" (offsetof (tcbhead_t, sysinfo))); \ + __status; \ + }) + + +#endif /* !__ASSEMBLER__ */ + +#endif /* lowlevellock-futex.h */ diff --git a/sysdeps/unix/sysv/linux/i386/lowlevellock.h b/sysdeps/unix/sysv/linux/i386/lowlevellock.h index 1032f4b..c3528a8 100644 --- a/sysdeps/unix/sysv/linux/i386/lowlevellock.h +++ b/sysdeps/unix/sysv/linux/i386/lowlevellock.h @@ -45,57 +45,10 @@ # endif #endif +#include <lowlevellock-futex.h> + +/* XXX Remove when no assembler code uses futexes anymore. */ #define SYS_futex 240 -#define FUTEX_WAIT 0 -#define FUTEX_WAKE 1 -#define FUTEX_CMP_REQUEUE 4 -#define FUTEX_WAKE_OP 5 -#define FUTEX_LOCK_PI 6 -#define FUTEX_UNLOCK_PI 7 -#define FUTEX_TRYLOCK_PI 8 -#define FUTEX_WAIT_BITSET 9 -#define FUTEX_WAKE_BITSET 10 -#define FUTEX_WAIT_REQUEUE_PI 11 -#define FUTEX_CMP_REQUEUE_PI 12 -#define FUTEX_PRIVATE_FLAG 128 -#define FUTEX_CLOCK_REALTIME 256 - -#define FUTEX_BITSET_MATCH_ANY 0xffffffff - -#define FUTEX_OP_CLEAR_WAKE_IF_GT_ONE ((4 << 24) | 1) - -/* Values for 'private' parameter of locking macros. Yes, the - definition seems to be backwards. But it is not. The bit will be - reversed before passing to the system call. */ -#define LLL_PRIVATE 0 -#define LLL_SHARED FUTEX_PRIVATE_FLAG - - -#if IS_IN (libc) || IS_IN (rtld) -/* In libc.so or ld.so all futexes are private. */ -# ifdef __ASSUME_PRIVATE_FUTEX -# define __lll_private_flag(fl, private) \ - ((fl) | FUTEX_PRIVATE_FLAG) -# else -# define __lll_private_flag(fl, private) \ - ((fl) | THREAD_GETMEM (THREAD_SELF, header.private_futex)) -# endif -#else -# ifdef __ASSUME_PRIVATE_FUTEX -# define __lll_private_flag(fl, private) \ - (((fl) | FUTEX_PRIVATE_FLAG) ^ (private)) -# else -# define __lll_private_flag(fl, private) \ - (__builtin_constant_p (private) \ - ? ((private) == 0 \ - ? ((fl) | THREAD_GETMEM (THREAD_SELF, header.private_futex)) \ - : (fl)) \ - : ({ unsigned int __fl = ((private) ^ FUTEX_PRIVATE_FLAG); \ - asm ("andl %%gs:%P1, %0" : "+r" (__fl) \ - : "i" (offsetof (struct pthread, header.private_futex))); \ - __fl | (fl); })) -# endif -#endif #ifndef __ASSEMBLER__ @@ -126,43 +79,6 @@ /* Delay in spinlock loop. */ #define BUSY_WAIT_NOP asm ("rep; nop") -#define lll_futex_wait(futex, val, private) \ - lll_futex_timed_wait (futex, val, NULL, private) - - -#define lll_futex_timed_wait(futex, val, timeout, private) \ - ({ \ - int __status; \ - register __typeof (val) _val asm ("edx") = (val); \ - __asm __volatile (LLL_EBX_LOAD \ - LLL_ENTER_KERNEL \ - LLL_EBX_LOAD \ - : "=a" (__status) \ - : "0" (SYS_futex), LLL_EBX_REG (futex), "S" (timeout), \ - "c" (__lll_private_flag (FUTEX_WAIT, private)), \ - "d" (_val), "i" (offsetof (tcbhead_t, sysinfo)) \ - : "memory"); \ - __status; \ - }) - - -#define lll_futex_wake(futex, nr, private) \ - ({ \ - int __status; \ - register __typeof (nr) _nr asm ("edx") = (nr); \ - LIBC_PROBE (lll_futex_wake, 3, futex, nr, private); \ - __asm __volatile (LLL_EBX_LOAD \ - LLL_ENTER_KERNEL \ - LLL_EBX_LOAD \ - : "=a" (__status) \ - : "0" (SYS_futex), LLL_EBX_REG (futex), \ - "c" (__lll_private_flag (FUTEX_WAKE, private)), \ - "d" (_nr), \ - "i" (0) /* phony, to align next arg's number */, \ - "i" (offsetof (tcbhead_t, sysinfo))); \ - __status; \ - }) - /* NB: in the lll_trylock macro we simply return the value in %eax after the cmpxchg instruction. In case the operation succeded this @@ -381,43 +297,37 @@ extern int __lll_timedlock_elision (int *futex, short *adapt_count, (futex != LLL_LOCK_INITIALIZER) /* The kernel notifies a process which uses CLONE_CHILD_CLEARTID via futex - wakeup when the clone terminates. The memory location contains the - thread ID while the clone is running and is reset to zero - afterwards. - - The macro parameter must not have any side effect. */ + wake-up when the clone terminates. The memory location contains the + thread ID while the clone is running and is reset to zero by the kernel + afterwards. The kernel up to version 3.16.3 does not use the private futex + operations for futex wake-up when the clone terminates. */ #define lll_wait_tid(tid) \ - do { \ - int __ignore; \ - register __typeof (tid) _tid asm ("edx") = (tid); \ - if (_tid != 0) \ - __asm __volatile (LLL_EBX_LOAD \ - "1:\tmovl %1, %%eax\n\t" \ - LLL_ENTER_KERNEL \ - "cmpl $0, (%%ebx)\n\t" \ - "jne 1b\n\t" \ - LLL_EBX_LOAD \ - : "=&a" (__ignore) \ - : "i" (SYS_futex), LLL_EBX_REG (&tid), "S" (0), \ - "c" (FUTEX_WAIT), "d" (_tid), \ - "i" (offsetof (tcbhead_t, sysinfo)) \ - : "memory"); \ + do { \ + __typeof (tid) __tid; \ + while ((__tid = (tid)) != 0) \ + lll_futex_wait (&(tid), __tid, LLL_SHARED);\ } while (0) extern int __lll_timedwait_tid (int *tid, const struct timespec *abstime) __attribute__ ((regparm (2))) attribute_hidden; + +/* As lll_wait_tid, but with a timeout. If the timeout occurs then return + ETIMEDOUT. If ABSTIME is invalid, return EINVAL. + XXX Note that this differs from the generic version in that we do the + error checking here and not in __lll_timedwait_tid. */ #define lll_timedwait_tid(tid, abstime) \ ({ \ int __result = 0; \ - if (tid != 0) \ + if ((tid) != 0) \ { \ - if (abstime->tv_nsec < 0 || abstime->tv_nsec >= 1000000000) \ + if ((abstime)->tv_nsec < 0 || (abstime)->tv_nsec >= 1000000000) \ __result = EINVAL; \ else \ - __result = __lll_timedwait_tid (&tid, abstime); \ + __result = __lll_timedwait_tid (&(tid), (abstime)); \ } \ __result; }) + extern int __lll_lock_elision (int *futex, short *adapt_count, int private) attribute_hidden; diff --git a/sysdeps/unix/sysv/linux/lowlevellock-futex.h b/sysdeps/unix/sysv/linux/lowlevellock-futex.h index 8927661..907b4c7 100644 --- a/sysdeps/unix/sysv/linux/lowlevellock-futex.h +++ b/sysdeps/unix/sysv/linux/lowlevellock-futex.h @@ -19,10 +19,11 @@ #ifndef _LOWLEVELLOCK_FUTEX_H #define _LOWLEVELLOCK_FUTEX_H 1 +#ifndef __ASSEMBLER__ #include <sysdep.h> #include <tls.h> #include <kernel-features.h> - +#endif #define FUTEX_WAIT 0 #define FUTEX_WAKE 1 @@ -48,6 +49,7 @@ #define LLL_PRIVATE 0 #define LLL_SHARED FUTEX_PRIVATE_FLAG +#ifndef __ASSEMBLER__ #if IS_IN (libc) || IS_IN (rtld) /* In libc.so or ld.so all futexes are private. */ @@ -133,5 +135,6 @@ private), \ nr_wake, nr_move, mutex, val) +#endif /* !__ASSEMBLER__ */ #endif /* lowlevellock-futex.h */ diff --git a/sysdeps/unix/sysv/linux/x86_64/lowlevellock.h b/sysdeps/unix/sysv/linux/x86_64/lowlevellock.h index 2f0cf5c..56570ee 100644 --- a/sysdeps/unix/sysv/linux/x86_64/lowlevellock.h +++ b/sysdeps/unix/sysv/linux/x86_64/lowlevellock.h @@ -45,59 +45,13 @@ # endif #endif +#include <lowlevellock-futex.h> + +/* XXX Remove when no assembler code uses futexes anymore. */ #define SYS_futex __NR_futex -#define FUTEX_WAIT 0 -#define FUTEX_WAKE 1 -#define FUTEX_CMP_REQUEUE 4 -#define FUTEX_WAKE_OP 5 -#define FUTEX_LOCK_PI 6 -#define FUTEX_UNLOCK_PI 7 -#define FUTEX_TRYLOCK_PI 8 -#define FUTEX_WAIT_BITSET 9 -#define FUTEX_WAKE_BITSET 10 -#define FUTEX_WAIT_REQUEUE_PI 11 -#define FUTEX_CMP_REQUEUE_PI 12 -#define FUTEX_PRIVATE_FLAG 128 -#define FUTEX_CLOCK_REALTIME 256 - -#define FUTEX_BITSET_MATCH_ANY 0xffffffff - -#define FUTEX_OP_CLEAR_WAKE_IF_GT_ONE ((4 << 24) | 1) - -/* Values for 'private' parameter of locking macros. Yes, the - definition seems to be backwards. But it is not. The bit will be - reversed before passing to the system call. */ -#define LLL_PRIVATE 0 -#define LLL_SHARED FUTEX_PRIVATE_FLAG #ifndef __ASSEMBLER__ -#if IS_IN (libc) || IS_IN (rtld) -/* In libc.so or ld.so all futexes are private. */ -# ifdef __ASSUME_PRIVATE_FUTEX -# define __lll_private_flag(fl, private) \ - ((fl) | FUTEX_PRIVATE_FLAG) -# else -# define __lll_private_flag(fl, private) \ - ((fl) | THREAD_GETMEM (THREAD_SELF, header.private_futex)) -# endif -#else -# ifdef __ASSUME_PRIVATE_FUTEX -# define __lll_private_flag(fl, private) \ - (((fl) | FUTEX_PRIVATE_FLAG) ^ (private)) -# else -# define __lll_private_flag(fl, private) \ - (__builtin_constant_p (private) \ - ? ((private) == 0 \ - ? ((fl) | THREAD_GETMEM (THREAD_SELF, header.private_futex)) \ - : (fl)) \ - : ({ unsigned int __fl = ((private) ^ FUTEX_PRIVATE_FLAG); \ - asm ("andl %%fs:%P1, %0" : "+r" (__fl) \ - : "i" (offsetof (struct pthread, header.private_futex))); \ - __fl | (fl); })) -# endif -#endif - /* Initializer for lock. */ #define LLL_LOCK_INITIALIZER (0) #define LLL_LOCK_INITIALIZER_LOCKED (1) @@ -106,39 +60,6 @@ /* Delay in spinlock loop. */ #define BUSY_WAIT_NOP asm ("rep; nop") -#define lll_futex_wait(futex, val, private) \ - lll_futex_timed_wait(futex, val, NULL, private) - - -#define lll_futex_timed_wait(futex, val, timeout, private) \ - ({ \ - register const struct timespec *__to __asm ("r10") = timeout; \ - int __status; \ - register __typeof (val) _val __asm ("edx") = (val); \ - __asm __volatile ("syscall" \ - : "=a" (__status) \ - : "0" (SYS_futex), "D" (futex), \ - "S" (__lll_private_flag (FUTEX_WAIT, private)), \ - "d" (_val), "r" (__to) \ - : "memory", "cc", "r11", "cx"); \ - __status; \ - }) - - -#define lll_futex_wake(futex, nr, private) \ - ({ \ - int __status; \ - register __typeof (nr) _nr __asm ("edx") = (nr); \ - LIBC_PROBE (lll_futex_wake, 3, futex, nr, private); \ - __asm __volatile ("syscall" \ - : "=a" (__status) \ - : "0" (SYS_futex), "D" (futex), \ - "S" (__lll_private_flag (FUTEX_WAKE, private)), \ - "d" (_nr) \ - : "memory", "cc", "r10", "r11", "cx"); \ - __status; \ - }) - /* NB: in the lll_trylock macro we simply return the value in %eax after the cmpxchg instruction. In case the operation succeded this @@ -378,58 +299,38 @@ extern int __lll_timedlock_elision (int *futex, short *adapt_count, } \ while (0) -/* Returns non-zero if error happened, zero if success. */ -#define lll_futex_requeue(ftx, nr_wake, nr_move, mutex, val, private) \ - ({ int __res; \ - register int __nr_move __asm ("r10") = nr_move; \ - register void *__mutex __asm ("r8") = mutex; \ - register int __val __asm ("r9") = val; \ - __asm __volatile ("syscall" \ - : "=a" (__res) \ - : "0" (__NR_futex), "D" ((void *) ftx), \ - "S" (__lll_private_flag (FUTEX_CMP_REQUEUE, \ - private)), "d" (nr_wake), \ - "r" (__nr_move), "r" (__mutex), "r" (__val) \ - : "cx", "r11", "cc", "memory"); \ - __res < 0; }) - #define lll_islocked(futex) \ (futex != LLL_LOCK_INITIALIZER) /* The kernel notifies a process which uses CLONE_CHILD_CLEARTID via futex - wakeup when the clone terminates. The memory location contains the - thread ID while the clone is running and is reset to zero - afterwards. - - The macro parameter must not have any side effect. */ + wake-up when the clone terminates. The memory location contains the + thread ID while the clone is running and is reset to zero by the kernel + afterwards. The kernel up to version 3.16.3 does not use the private futex + operations for futex wake-up when the clone terminates. */ #define lll_wait_tid(tid) \ - do { \ - int __ignore; \ - register __typeof (tid) _tid asm ("edx") = (tid); \ - if (_tid != 0) \ - __asm __volatile ("xorq %%r10, %%r10\n\t" \ - "1:\tmovq %2, %%rax\n\t" \ - "syscall\n\t" \ - "cmpl $0, (%%rdi)\n\t" \ - "jne 1b" \ - : "=&a" (__ignore) \ - : "S" (FUTEX_WAIT), "i" (SYS_futex), "D" (&tid), \ - "d" (_tid) \ - : "memory", "cc", "r10", "r11", "cx"); \ + do { \ + __typeof (tid) __tid; \ + while ((__tid = (tid)) != 0) \ + lll_futex_wait (&(tid), __tid, LLL_SHARED);\ } while (0) -extern int __lll_timedwait_tid (int *tid, const struct timespec *abstime) +extern int __lll_timedwait_tid (int *, const struct timespec *) attribute_hidden; + +/* As lll_wait_tid, but with a timeout. If the timeout occurs then return + ETIMEDOUT. If ABSTIME is invalid, return EINVAL. + XXX Note that this differs from the generic version in that we do the + error checking here and not in __lll_timedwait_tid. */ #define lll_timedwait_tid(tid, abstime) \ ({ \ int __result = 0; \ - if (tid != 0) \ + if ((tid) != 0) \ { \ - if (abstime->tv_nsec < 0 || abstime->tv_nsec >= 1000000000) \ + if ((abstime)->tv_nsec < 0 || (abstime)->tv_nsec >= 1000000000) \ __result = EINVAL; \ else \ - __result = __lll_timedwait_tid (&tid, abstime); \ + __result = __lll_timedwait_tid (&(tid), (abstime)); \ } \ __result; })