Message ID | alpine.DEB.2.11.1504131033550.13447@stedding.saclay.inria.fr |
---|---|
State | New |
Headers | show |
On 13/04/15 13:42 +0200, Marc Glisse wrote: >this patch makes std::distance(list.first(),list.end()) a constant >time operation when optimizing, with no penalty for other calls. We >could do the test always (no __builtin_constant_p) but then it will >add a small runtime penalty for other calls, someone else can take >responsibility for that. I like the way you've done it. No penalty for other calls is great and IMHO it's OK that the optimisation only happens when optimising. (Does it work even at -Og?) >I could protect the whole overload with >#ifdef __OPTIMIZE__ (at -O0 the compiler does not remove the test >++end==first as dead code), but I assume it is better to minimize >differences. Agreed. >There are other ways to specialize distance, but >overloading __distance seems to have the least drawbacks (most others >involve a new extra file). From an optimization POV, it would be a bit >better to avoid the while loop and instead call a separate function >that does it (say the regular __distance), it would make the function >smaller and thus easier to inline, but it is simpler this way and >works. Is there any (dis)advantage to making one call the other, to avoid duplicating the function bodies? template<typename _Tp> inline ptrdiff_t __distance(_GLIBCXX_STD_C::_List_iterator<_Tp> __first, _GLIBCXX_STD_C::_List_iterator<_Tp> __last, input_iterator_tag __tag) { typedef _GLIBCXX_STD_C::_List_const_iterator<_Tp> _CIter; return std::__distance(_CIter(__first), _CIter(__last), __tag); } >We could do something similar for std::set, but C++ will not let us >find the address of _Rb_tree_impl::_M_node_count from that of >_Rb_tree_impl::_M_header, except when _Key_compare is pod, which >luckily is an easily available information. Avoiding this complication >is why I insisted on wrapping the size of std::list in a >_List_node<size_t> for gcc-5, which may look a bit strange at first >sight. Sadly, that node is going to look even stranger when I finish adding support for C++11 allocators, as the type of node becomes dependent on the allocator's pointer, which makes _List_node<size_t> much more complicated :-( I'll have to remember to add additional __distance overloads to handle the new _List_ptr_iterator and _List_ptr_const_iterator types that will be used for fancy pointers (although if I forget the optimisation will still work for std::list<T, std::allocator<T>>, just not for the vanishingly small number of people using allocators with fancy pointers). >Index: include/bits/stl_iterator_base_funcs.h >=================================================================== >--- include/bits/stl_iterator_base_funcs.h (revision 222041) >+++ include/bits/stl_iterator_base_funcs.h (working copy) >@@ -107,22 +107,21 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION > * n may be negative. > * > * For random access iterators, this uses their @c + and @c - operations > * and are constant time. For other %iterator classes they are linear time. > */ > template<typename _InputIterator> > inline typename iterator_traits<_InputIterator>::difference_type > distance(_InputIterator __first, _InputIterator __last) > { > // concept requirements -- taken care of in __distance >- return std::__distance(__first, __last, >- std::__iterator_category(__first)); >+ return __distance(__first, __last, std::__iterator_category(__first)); This should still be a qualified call to std::__distance, otherwise the compiler might end up instantiating things to figure out if there are any associated namespaces, e.g. for vector<unique_ptr<T>>::iterator we don't want to know T's base classes and rheir associated namespaces. >+ // Detect when distance is used to compute the size of the whole list. >+ template<typename _Tp> >+ inline ptrdiff_t >+ __distance(_GLIBCXX_STD_C::_List_iterator<_Tp> __first, >+ _GLIBCXX_STD_C::_List_iterator<_Tp> __last, >+ input_iterator_tag) >+ { >+ typedef _GLIBCXX_STD_C::_List_node<size_t> _Sentinel; >+ _GLIBCXX_STD_C::_List_iterator<_Tp> __beyond = __last; >+ ++__beyond; >+ bool __whole = __first == __beyond; >+ if (__builtin_constant_p (__whole) && __whole) This is clever :-) This shouldn't interfere with any changes we might need to test before backporting to the gcc-5-branch, so with the std:: qualification restored on the call to __distance it's OK for trunk. (I'll trust your judgment/masurements for whether it's worth making the _List_iterator overload call the _List_const_iterator one.)
On Mon, 13 Apr 2015, Jonathan Wakely wrote: > On 13/04/15 13:42 +0200, Marc Glisse wrote: >> this patch makes std::distance(list.first(),list.end()) a constant time >> operation when optimizing, with no penalty for other calls. We could do the >> test always (no __builtin_constant_p) but then it will add a small runtime >> penalty for other calls, someone else can take responsibility for that. > > I like the way you've done it. No penalty for other calls is great > and IMHO it's OK that the optimisation only happens when optimising. > > (Does it work even at -Og?) No, the testcase currently passes with -O1 but not -Og. >> I could protect the whole overload with #ifdef __OPTIMIZE__ (at -O0 the >> compiler does not remove the test ++end==first as dead code), but I assume >> it is better to minimize differences. > > Agreed. > >> There are other ways to specialize distance, but overloading __distance >> seems to have the least drawbacks (most others involve a new extra file). >> From an optimization POV, it would be a bit better to avoid the while loop >> and instead call a separate function that does it (say the regular >> __distance), it would make the function smaller and thus easier to inline, >> but it is simpler this way and works. > > Is there any (dis)advantage to making one call the other, to avoid > duplicating the function bodies? > > template<typename _Tp> > inline ptrdiff_t > __distance(_GLIBCXX_STD_C::_List_iterator<_Tp> __first, > _GLIBCXX_STD_C::_List_iterator<_Tp> __last, > input_iterator_tag __tag) > { > typedef _GLIBCXX_STD_C::_List_const_iterator<_Tp> _CIter; > return std::__distance(_CIter(__first), _CIter(__last), __tag); > } I don't see any disadvantage, I'll do that. >> We could do something similar for std::set, but C++ will not let us find >> the address of _Rb_tree_impl::_M_node_count from that of >> _Rb_tree_impl::_M_header, except when _Key_compare is pod, which luckily is >> an easily available information. Avoiding this complication is why I >> insisted on wrapping the size of std::list in a _List_node<size_t> for >> gcc-5, which may look a bit strange at first sight. > > Sadly, that node is going to look even stranger when I finish adding > support for C++11 allocators, as the type of node becomes dependent on > the allocator's pointer, which makes _List_node<size_t> much more > complicated :-( But then I assume _List_node_base is the part really getting more complicated, so without looking too closely it seems almost orthogonal. > I'll have to remember to add additional __distance overloads to handle > the new _List_ptr_iterator and _List_ptr_const_iterator types that > will be used for fancy pointers (although if I forget the optimisation > will still work for std::list<T, std::allocator<T>>, just not for the > vanishingly small number of people using allocators with fancy > pointers). > >> Index: include/bits/stl_iterator_base_funcs.h >> =================================================================== >> --- include/bits/stl_iterator_base_funcs.h (revision 222041) >> +++ include/bits/stl_iterator_base_funcs.h (working copy) >> @@ -107,22 +107,21 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION >> * n may be negative. >> * >> * For random access iterators, this uses their @c + and @c - operations >> * and are constant time. For other %iterator classes they are linear >> time. >> */ >> template<typename _InputIterator> >> inline typename iterator_traits<_InputIterator>::difference_type >> distance(_InputIterator __first, _InputIterator __last) >> { >> // concept requirements -- taken care of in __distance >> - return std::__distance(__first, __last, >> - std::__iterator_category(__first)); >> + return __distance(__first, __last, >> std::__iterator_category(__first)); > > This should still be a qualified call to std::__distance, otherwise the > compiler might end up instantiating things to figure out if there are > any associated namespaces, e.g. for vector<unique_ptr<T>>::iterator we > don't want to know T's base classes and rheir associated namespaces. Then the approach will not work. C++ overload resolution will not consider the later __distance declarations in stl_list.h if the call is qualified (I didn't change it gratuitously). A forward declaration of list iterators and those __distance overloads in bits/stl_iterator_base_funcs.h is not very appealing but may work (or it may cause issues, I don't know). Otherwise, I guess we are back to creating a new file bits/list_iterator.h, which <list> includes if <iterator> has already been included and vice versa, and which provides overloads for distance directly. >> + // Detect when distance is used to compute the size of the whole list. >> + template<typename _Tp> >> + inline ptrdiff_t >> + __distance(_GLIBCXX_STD_C::_List_iterator<_Tp> __first, >> + _GLIBCXX_STD_C::_List_iterator<_Tp> __last, >> + input_iterator_tag) >> + { >> + typedef _GLIBCXX_STD_C::_List_node<size_t> _Sentinel; >> + _GLIBCXX_STD_C::_List_iterator<_Tp> __beyond = __last; >> + ++__beyond; >> + bool __whole = __first == __beyond; >> + if (__builtin_constant_p (__whole) && __whole) > > This is clever :-) > > This shouldn't interfere with any changes we might need to test before > backporting to the gcc-5-branch, so with the std:: qualification > restored on the call to __distance it's OK for trunk. > > (I'll trust your judgment/masurements for whether it's worth making > the _List_iterator overload call the _List_const_iterator one.)
On 13/04/15 16:14 +0200, Marc Glisse wrote: >On Mon, 13 Apr 2015, Jonathan Wakely wrote: > >>On 13/04/15 13:42 +0200, Marc Glisse wrote: >>>this patch makes std::distance(list.first(),list.end()) a constant >>>time operation when optimizing, with no penalty for other calls. >>>We could do the test always (no __builtin_constant_p) but then it >>>will add a small runtime penalty for other calls, someone else can >>>take responsibility for that. >> >>I like the way you've done it. No penalty for other calls is great >>and IMHO it's OK that the optimisation only happens when optimising. >> >>(Does it work even at -Og?) > >No, the testcase currently passes with -O1 but not -Og. OK, we can live with that. >>Sadly, that node is going to look even stranger when I finish adding >>support for C++11 allocators, as the type of node becomes dependent on >>the allocator's pointer, which makes _List_node<size_t> much more >>complicated :-( > >But then I assume _List_node_base is the part really getting more >complicated, so without looking too closely it seems almost >orthogonal. In order to avoid making any changes to std::list<T, std::allocator<T>> I'm adding an entire parallel hierarchy of a new node base type (parameterized on the allocator's void_pointer type), a new node type and new iterators (parameterized on the pointer type), which will only be used when !is_pointer<Alloc::pointer>. So it will just mean adding two new overloads for the two new iterator types. Not a big deal, as long as I remember to do it :-) >>This should still be a qualified call to std::__distance, otherwise the >>compiler might end up instantiating things to figure out if there are >>any associated namespaces, e.g. for vector<unique_ptr<T>>::iterator we >>don't want to know T's base classes and rheir associated namespaces. > >Then the approach will not work. C++ overload resolution will not >consider the later __distance declarations in stl_list.h if the call >is qualified (I didn't change it gratuitously). Ahhh, yes, of course. >A forward declaration >of list iterators and those __distance overloads in >bits/stl_iterator_base_funcs.h is not very appealing but may work (or >it may cause issues, I don't know). Otherwise, I guess we are back to >creating a new file bits/list_iterator.h, which <list> includes if ><iterator> has already been included and vice versa, and which >provides overloads for distance directly. I don't have a preference, but I think the forward declarations should work without problems. <list> includes bits/stl_iterator_base_funcs.h so if the forward declarations didn't match the definitions for some reason we'd know right away.
Index: include/bits/stl_iterator_base_funcs.h =================================================================== --- include/bits/stl_iterator_base_funcs.h (revision 222041) +++ include/bits/stl_iterator_base_funcs.h (working copy) @@ -107,22 +107,21 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION * n may be negative. * * For random access iterators, this uses their @c + and @c - operations * and are constant time. For other %iterator classes they are linear time. */ template<typename _InputIterator> inline typename iterator_traits<_InputIterator>::difference_type distance(_InputIterator __first, _InputIterator __last) { // concept requirements -- taken care of in __distance - return std::__distance(__first, __last, - std::__iterator_category(__first)); + return __distance(__first, __last, std::__iterator_category(__first)); } template<typename _InputIterator, typename _Distance> inline void __advance(_InputIterator& __i, _Distance __n, input_iterator_tag) { // concept requirements __glibcxx_function_requires(_InputIteratorConcept<_InputIterator>) _GLIBCXX_DEBUG_ASSERT(__n >= 0); while (__n--) Index: include/bits/stl_list.h =================================================================== --- include/bits/stl_list.h (revision 222041) +++ include/bits/stl_list.h (working copy) @@ -1861,13 +1861,64 @@ _GLIBCXX_END_NAMESPACE_CXX11 operator>=(const list<_Tp, _Alloc>& __x, const list<_Tp, _Alloc>& __y) { return !(__x < __y); } /// See std::list::swap(). template<typename _Tp, typename _Alloc> inline void swap(list<_Tp, _Alloc>& __x, list<_Tp, _Alloc>& __y) { __x.swap(__y); } _GLIBCXX_END_NAMESPACE_CONTAINER + +#if _GLIBCXX_USE_CXX11_ABI +_GLIBCXX_BEGIN_NAMESPACE_VERSION + + // Detect when distance is used to compute the size of the whole list. + template<typename _Tp> + inline ptrdiff_t + __distance(_GLIBCXX_STD_C::_List_iterator<_Tp> __first, + _GLIBCXX_STD_C::_List_iterator<_Tp> __last, + input_iterator_tag) + { + typedef _GLIBCXX_STD_C::_List_node<size_t> _Sentinel; + _GLIBCXX_STD_C::_List_iterator<_Tp> __beyond = __last; + ++__beyond; + bool __whole = __first == __beyond; + if (__builtin_constant_p (__whole) && __whole) + return static_cast<const _Sentinel*>(__last._M_node)->_M_data; + + ptrdiff_t __n = 0; + while (__first != __last) + { + ++__first; + ++__n; + } + return __n; + } + + template<typename _Tp> + inline ptrdiff_t + __distance(_GLIBCXX_STD_C::_List_const_iterator<_Tp> __first, + _GLIBCXX_STD_C::_List_const_iterator<_Tp> __last, + input_iterator_tag) + { + typedef _GLIBCXX_STD_C::_List_node<size_t> _Sentinel; + _GLIBCXX_STD_C::_List_const_iterator<_Tp> __beyond = __last; + ++__beyond; + bool __whole = __first == __beyond; + if (__builtin_constant_p (__whole) && __whole) + return static_cast<const _Sentinel*>(__last._M_node)->_M_data; + + ptrdiff_t __n = 0; + while (__first != __last) + { + ++__first; + ++__n; + } + return __n; + } + +_GLIBCXX_END_NAMESPACE_VERSION +#endif } // namespace std #endif /* _STL_LIST_H */ Index: testsuite/23_containers/list/61347.cc =================================================================== --- testsuite/23_containers/list/61347.cc (revision 0) +++ testsuite/23_containers/list/61347.cc (working copy) @@ -0,0 +1,49 @@ +// { dg-options "-std=gnu++11 -O2 -D_GLIBCXX_USE_CXX11_ABI" } + +// Copyright (C) 2015 Free Software Foundation, Inc. +// +// This file is part of the GNU ISO C++ Library. This library is free +// software; you can redistribute it and/or modify it under the +// terms of the GNU General Public License as published by the +// Free Software Foundation; either version 3, or (at your option) +// any later version. + +// This library is distributed in the hope that it will be useful, +// but WITHOUT ANY WARRANTY; without even the implied warranty of +// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +// GNU General Public License for more details. + +// You should have received a copy of the GNU General Public License along +// with this library; see the file COPYING3. If not see +// <http://www.gnu.org/licenses/>. + +#include <list> +#include <iterator> +#include <testsuite_hooks.h> + +__attribute__((__noinline__, __noclone__)) +void testm(std::list<short>& l) +{ + bool b = std::distance(l.begin(), l.end()) == l.size(); + VERIFY( __builtin_constant_p(b) ); + VERIFY( b ); +} + +__attribute__((__noinline__, __noclone__)) +void testc(const std::list<short>& l) +{ + bool b = std::distance(l.begin(), l.end()) == l.size(); + VERIFY( __builtin_constant_p(b) ); + VERIFY( b ); +} + +int main() +{ + bool test __attribute__((unused)) = true; + +#if _GLIBCXX_USE_DUAL_ABI + std::list<short> l; + testm(l); + testc(l); +#endif +}