Message ID | 879ddf66-abed-49b5-bbdf-2628696223f8@gmail.com |
---|---|
State | New |
Headers | show |
Series | libstdc++: Do not use memset _Hashtable buckets allocation | expand |
On Thu, 13 Jun 2024 at 18:40, François Dumont <frs.dumont@gmail.com> wrote: > > Hi > > Following your recent change here: > > https://gcc.gnu.org/pipermail/libstdc++/2024-June/058998.html > > I think we also need to fix the memset at bucket allocation level. > > I did it trying also to be more fancy pointer friendly by running > __uninitialized_default_n_a on the allocator returned pointer rather > than on the __to_address result. I wonder if an __uninitialized_fill_n_a > would have been better ? Doing so I also had to call std::_Destroy on > deallocation. Let me know if it is too early. You don't need the RAII guard. Initializing Alloc::pointer isn't allowed to throw exceptions: "An allocator type X shall meet the Cpp17CopyConstructible requirements (Table 32). The XX::pointer, XX::const_pointer, XX::void_pointer, and XX::const_void_pointer types shall meet the Cpp17Nullable- Pointer requirements (Table 36). No constructor, comparison operator function, copy operation, move operation, or swap operation on these pointer types shall exit via an exception." And you should not pass the allocator to the __uninitialized_xxx call, nor the _Destroy call. We don't want to use the allocator's construct/destroy members for those pointers. They are not container elements. I think either uninitialized_fill_n with nullptr or __uninitialized_default_n is fine. Not the _a forms taking an allocator though. > I also wonder if the compiler will be able to optimize it to a memset > call ? I'm interested to work on it if you confirm that it won't. It will do whatever is fastest, which might be memset or might be vectorized code to zero it out (which is probably what libc memset does too). > > libstdc++: Do not use memset in _Hashtable buckets allocation > > Using memset is incorrect if the __bucket_ptr type is non-trivial, or > does not use an all-zero bit pattern for its null value. > > Replace the use of memset with std::__uinitialized_default_n_a to set the > pointers to nullptr. Doing so and corresponding std::_Destroy when > deallocating > buckets. > > libstdc++-v3/ChangeLog: > > * include/bits/hashtable_policy.h > (_Hashtable_alloc::_M_allocate_buckets): Do not use memset to zero > out bucket pointers. > (_Hashtable_alloc::_M_deallocate_buckets): Add destroy of buckets. > > > I hope you won't ask for copy rights on the changelog entry :-) > > Tested under Linux x64, ok to commit ? > > François
On Thu, 13 Jun 2024 at 19:57, Jonathan Wakely <jwakely@redhat.com> wrote: > > On Thu, 13 Jun 2024 at 18:40, François Dumont <frs.dumont@gmail.com> wrote: > > > > Hi > > > > Following your recent change here: > > > > https://gcc.gnu.org/pipermail/libstdc++/2024-June/058998.html > > > > I think we also need to fix the memset at bucket allocation level. > > > > I did it trying also to be more fancy pointer friendly by running > > __uninitialized_default_n_a on the allocator returned pointer rather > > than on the __to_address result. I wonder if an __uninitialized_fill_n_a > > would have been better ? Doing so I also had to call std::_Destroy on > > deallocation. Let me know if it is too early. > > You don't need the RAII guard. Initializing Alloc::pointer isn't > allowed to throw exceptions: > > "An allocator type X shall meet the Cpp17CopyConstructible > requirements (Table 32). The XX::pointer, > XX::const_pointer, XX::void_pointer, and XX::const_void_pointer types > shall meet the Cpp17Nullable- > Pointer requirements (Table 36). No constructor, comparison operator > function, copy operation, move > operation, or swap operation on these pointer types shall exit via an > exception." > > And you should not pass the allocator to the __uninitialized_xxx call, > nor the _Destroy call. We don't want to use the allocator's > construct/destroy members for those pointers. They are not container > elements. > > I think either uninitialized_fill_n with nullptr or > __uninitialized_default_n is fine. Not the _a forms taking an > allocator though. And I'd use _Destroy_n(_M_buckets, _M_bucket_count) > > > I also wonder if the compiler will be able to optimize it to a memset > > call ? I'm interested to work on it if you confirm that it won't. > > It will do whatever is fastest, which might be memset or might be > vectorized code to zero it out (which is probably what libc memset > does too). > > > > > libstdc++: Do not use memset in _Hashtable buckets allocation > > > > Using memset is incorrect if the __bucket_ptr type is non-trivial, or > > does not use an all-zero bit pattern for its null value. > > > > Replace the use of memset with std::__uinitialized_default_n_a to set the > > pointers to nullptr. Doing so and corresponding std::_Destroy when > > deallocating > > buckets. > > > > libstdc++-v3/ChangeLog: > > > > * include/bits/hashtable_policy.h > > (_Hashtable_alloc::_M_allocate_buckets): Do not use memset to zero > > out bucket pointers. > > (_Hashtable_alloc::_M_deallocate_buckets): Add destroy of buckets. > > > > > > I hope you won't ask for copy rights on the changelog entry :-) > > > > Tested under Linux x64, ok to commit ? > > > > François
Here is the simplified patch then. libstdc++: Do not use memset in _Hashtable buckets allocation Using memset is incorrect if the __bucket_ptr type is non-trivial, or does not use an all-zero bit pattern for its null value. Replace the use of memset with std::__uinitialized_default_n to set the pointers to nullptr. Doing so and corresponding std::_Destroy_n when deallocating buckets. libstdc++-v3/ChangeLog: * include/bits/hashtable_policy.h (_Hashtable_alloc::_M_allocate_buckets): Do not use memset to zero out bucket pointers. (_Hashtable_alloc::_M_deallocate_buckets): Add destroy of buckets. Tested under Linux x64, ok to commit ? François On 13/06/2024 20:58, Jonathan Wakely wrote: > On Thu, 13 Jun 2024 at 19:57, Jonathan Wakely <jwakely@redhat.com> wrote: >> On Thu, 13 Jun 2024 at 18:40, François Dumont <frs.dumont@gmail.com> wrote: >>> Hi >>> >>> Following your recent change here: >>> >>> https://gcc.gnu.org/pipermail/libstdc++/2024-June/058998.html >>> >>> I think we also need to fix the memset at bucket allocation level. >>> >>> I did it trying also to be more fancy pointer friendly by running >>> __uninitialized_default_n_a on the allocator returned pointer rather >>> than on the __to_address result. I wonder if an __uninitialized_fill_n_a >>> would have been better ? Doing so I also had to call std::_Destroy on >>> deallocation. Let me know if it is too early. >> You don't need the RAII guard. Initializing Alloc::pointer isn't >> allowed to throw exceptions: >> >> "An allocator type X shall meet the Cpp17CopyConstructible >> requirements (Table 32). The XX::pointer, >> XX::const_pointer, XX::void_pointer, and XX::const_void_pointer types >> shall meet the Cpp17Nullable- >> Pointer requirements (Table 36). No constructor, comparison operator >> function, copy operation, move >> operation, or swap operation on these pointer types shall exit via an >> exception." >> >> And you should not pass the allocator to the __uninitialized_xxx call, >> nor the _Destroy call. We don't want to use the allocator's >> construct/destroy members for those pointers. They are not container >> elements. >> >> I think either uninitialized_fill_n with nullptr or >> __uninitialized_default_n is fine. Not the _a forms taking an >> allocator though. > And I'd use _Destroy_n(_M_buckets, _M_bucket_count) > > >>> I also wonder if the compiler will be able to optimize it to a memset >>> call ? I'm interested to work on it if you confirm that it won't. >> It will do whatever is fastest, which might be memset or might be >> vectorized code to zero it out (which is probably what libc memset >> does too). >> >>> libstdc++: Do not use memset in _Hashtable buckets allocation >>> >>> Using memset is incorrect if the __bucket_ptr type is non-trivial, or >>> does not use an all-zero bit pattern for its null value. >>> >>> Replace the use of memset with std::__uinitialized_default_n_a to set the >>> pointers to nullptr. Doing so and corresponding std::_Destroy when >>> deallocating >>> buckets. >>> >>> libstdc++-v3/ChangeLog: >>> >>> * include/bits/hashtable_policy.h >>> (_Hashtable_alloc::_M_allocate_buckets): Do not use memset to zero >>> out bucket pointers. >>> (_Hashtable_alloc::_M_deallocate_buckets): Add destroy of buckets. >>> >>> >>> I hope you won't ask for copy rights on the changelog entry :-) >>> >>> Tested under Linux x64, ok to commit ? >>> >>> François diff --git a/libstdc++-v3/include/bits/hashtable_policy.h b/libstdc++-v3/include/bits/hashtable_policy.h index 26def24f24e..510108a1a6f 100644 --- a/libstdc++-v3/include/bits/hashtable_policy.h +++ b/libstdc++-v3/include/bits/hashtable_policy.h @@ -33,8 +33,9 @@ #include <tuple> // for std::tuple, std::forward_as_tuple #include <bits/functional_hash.h> // for __is_fast_hash -#include <bits/stl_algobase.h> // for std::min, std::is_permutation. +#include <bits/stl_algobase.h> // for std::min, std::is_permutation #include <bits/stl_pair.h> // for std::pair +#include <bits/stl_uninitialized.h> // for __uninitialized_default_n #include <ext/aligned_buffer.h> // for __gnu_cxx::__aligned_buffer #include <ext/alloc_traits.h> // for std::__alloc_rebind #include <ext/numeric_traits.h> // for __gnu_cxx::__int_traits @@ -2069,11 +2070,9 @@ namespace __detail -> __buckets_ptr { __buckets_alloc_type __alloc(_M_node_allocator()); - auto __ptr = __buckets_alloc_traits::allocate(__alloc, __bkt_count); - __buckets_ptr __p = std::__to_address(__ptr); - __builtin_memset(__p, 0, __bkt_count * sizeof(__node_base_ptr)); - return __p; + std::__uninitialized_default_n(__ptr, __bkt_count); + return std::__to_address(__ptr); } template<typename _NodeAlloc> @@ -2085,6 +2084,7 @@ namespace __detail typedef typename __buckets_alloc_traits::pointer _Ptr; auto __ptr = std::pointer_traits<_Ptr>::pointer_to(*__bkts); __buckets_alloc_type __alloc(_M_node_allocator()); + std::_Destroy_n(__ptr, __bkt_count); __buckets_alloc_traits::deallocate(__alloc, __ptr, __bkt_count); }
On Sat, 15 Jun 2024 at 14:04, François Dumont <frs.dumont@gmail.com> wrote: > > Here is the simplified patch then. The use of std::__to_address seems wrong. The allocator returns a __buckets_ptr, and that function returns a __buckets_ptr, so it should just be returned unchanged, not by converting to a raw pointer with __to_address. > > libstdc++: Do not use memset in _Hashtable buckets allocation > > Using memset is incorrect if the __bucket_ptr type is non-trivial, or > does not use an all-zero bit pattern for its null value. > > Replace the use of memset with std::__uinitialized_default_n to set the > pointers to nullptr. Doing so and corresponding std::_Destroy_n > when deallocating > buckets. > > libstdc++-v3/ChangeLog: > > * include/bits/hashtable_policy.h > (_Hashtable_alloc::_M_allocate_buckets): Do not use memset > to zero > out bucket pointers. > (_Hashtable_alloc::_M_deallocate_buckets): Add destroy of > buckets. > > Tested under Linux x64, ok to commit ? > > François > > On 13/06/2024 20:58, Jonathan Wakely wrote: > > On Thu, 13 Jun 2024 at 19:57, Jonathan Wakely <jwakely@redhat.com> wrote: > >> On Thu, 13 Jun 2024 at 18:40, François Dumont <frs.dumont@gmail.com> wrote: > >>> Hi > >>> > >>> Following your recent change here: > >>> > >>> https://gcc.gnu.org/pipermail/libstdc++/2024-June/058998.html > >>> > >>> I think we also need to fix the memset at bucket allocation level. > >>> > >>> I did it trying also to be more fancy pointer friendly by running > >>> __uninitialized_default_n_a on the allocator returned pointer rather > >>> than on the __to_address result. I wonder if an __uninitialized_fill_n_a > >>> would have been better ? Doing so I also had to call std::_Destroy on > >>> deallocation. Let me know if it is too early. > >> You don't need the RAII guard. Initializing Alloc::pointer isn't > >> allowed to throw exceptions: > >> > >> "An allocator type X shall meet the Cpp17CopyConstructible > >> requirements (Table 32). The XX::pointer, > >> XX::const_pointer, XX::void_pointer, and XX::const_void_pointer types > >> shall meet the Cpp17Nullable- > >> Pointer requirements (Table 36). No constructor, comparison operator > >> function, copy operation, move > >> operation, or swap operation on these pointer types shall exit via an > >> exception." > >> > >> And you should not pass the allocator to the __uninitialized_xxx call, > >> nor the _Destroy call. We don't want to use the allocator's > >> construct/destroy members for those pointers. They are not container > >> elements. > >> > >> I think either uninitialized_fill_n with nullptr or > >> __uninitialized_default_n is fine. Not the _a forms taking an > >> allocator though. > > And I'd use _Destroy_n(_M_buckets, _M_bucket_count) > > > > > >>> I also wonder if the compiler will be able to optimize it to a memset > >>> call ? I'm interested to work on it if you confirm that it won't. > >> It will do whatever is fastest, which might be memset or might be > >> vectorized code to zero it out (which is probably what libc memset > >> does too). > >> > >>> libstdc++: Do not use memset in _Hashtable buckets allocation > >>> > >>> Using memset is incorrect if the __bucket_ptr type is non-trivial, or > >>> does not use an all-zero bit pattern for its null value. > >>> > >>> Replace the use of memset with std::__uinitialized_default_n_a to set the > >>> pointers to nullptr. Doing so and corresponding std::_Destroy when > >>> deallocating > >>> buckets. > >>> > >>> libstdc++-v3/ChangeLog: > >>> > >>> * include/bits/hashtable_policy.h > >>> (_Hashtable_alloc::_M_allocate_buckets): Do not use memset to zero > >>> out bucket pointers. > >>> (_Hashtable_alloc::_M_deallocate_buckets): Add destroy of buckets. > >>> > >>> > >>> I hope you won't ask for copy rights on the changelog entry :-) > >>> > >>> Tested under Linux x64, ok to commit ? > >>> > >>> François
On Mon, 17 Jun 2024 at 11:18, Jonathan Wakely <jwakely@redhat.com> wrote: > > On Sat, 15 Jun 2024 at 14:04, François Dumont <frs.dumont@gmail.com> wrote: > > > > Here is the simplified patch then. > > The use of std::__to_address seems wrong. > > The allocator returns a __buckets_ptr, and that function returns a > __buckets_ptr, so it should just be returned unchanged, not by > converting to a raw pointer with __to_address. It was already wrong, but we should fix that now not keep it wrong. Using __to_address to get a pointer to pass to memset was correct. But the result of the __to_address call was used to initialize another __buckets_ptr variable. Which is what we already had before calling __to_address. It would have made sense like this: auto __ptr = __buckets_alloc_traits::allocate(__alloc, __bkt_count); auto* __p = std::__to_address(__ptr); __builtin_memset(__p, 0, __bkt_count * sizeof(__node_base_ptr)); return __ptr; i.e. __p should be a raw pointer (not a __buckets_ptr), and then it should return __ptr not __p. But that isn't what we had. Anyway, now that we're not using memset, we don't need any raw pointer at all, so don't need std::__to_address at all. > > > > > > libstdc++: Do not use memset in _Hashtable buckets allocation > > > > Using memset is incorrect if the __bucket_ptr type is non-trivial, or > > does not use an all-zero bit pattern for its null value. > > > > Replace the use of memset with std::__uinitialized_default_n to set the > > pointers to nullptr. Doing so and corresponding std::_Destroy_n > > when deallocating > > buckets. > > > > libstdc++-v3/ChangeLog: > > > > * include/bits/hashtable_policy.h > > (_Hashtable_alloc::_M_allocate_buckets): Do not use memset > > to zero > > out bucket pointers. > > (_Hashtable_alloc::_M_deallocate_buckets): Add destroy of > > buckets. > > > > Tested under Linux x64, ok to commit ? > > > > François > > > > On 13/06/2024 20:58, Jonathan Wakely wrote: > > > On Thu, 13 Jun 2024 at 19:57, Jonathan Wakely <jwakely@redhat.com> wrote: > > >> On Thu, 13 Jun 2024 at 18:40, François Dumont <frs.dumont@gmail.com> wrote: > > >>> Hi > > >>> > > >>> Following your recent change here: > > >>> > > >>> https://gcc.gnu.org/pipermail/libstdc++/2024-June/058998.html > > >>> > > >>> I think we also need to fix the memset at bucket allocation level. > > >>> > > >>> I did it trying also to be more fancy pointer friendly by running > > >>> __uninitialized_default_n_a on the allocator returned pointer rather > > >>> than on the __to_address result. I wonder if an __uninitialized_fill_n_a > > >>> would have been better ? Doing so I also had to call std::_Destroy on > > >>> deallocation. Let me know if it is too early. > > >> You don't need the RAII guard. Initializing Alloc::pointer isn't > > >> allowed to throw exceptions: > > >> > > >> "An allocator type X shall meet the Cpp17CopyConstructible > > >> requirements (Table 32). The XX::pointer, > > >> XX::const_pointer, XX::void_pointer, and XX::const_void_pointer types > > >> shall meet the Cpp17Nullable- > > >> Pointer requirements (Table 36). No constructor, comparison operator > > >> function, copy operation, move > > >> operation, or swap operation on these pointer types shall exit via an > > >> exception." > > >> > > >> And you should not pass the allocator to the __uninitialized_xxx call, > > >> nor the _Destroy call. We don't want to use the allocator's > > >> construct/destroy members for those pointers. They are not container > > >> elements. > > >> > > >> I think either uninitialized_fill_n with nullptr or > > >> __uninitialized_default_n is fine. Not the _a forms taking an > > >> allocator though. > > > And I'd use _Destroy_n(_M_buckets, _M_bucket_count) > > > > > > > > >>> I also wonder if the compiler will be able to optimize it to a memset > > >>> call ? I'm interested to work on it if you confirm that it won't. > > >> It will do whatever is fastest, which might be memset or might be > > >> vectorized code to zero it out (which is probably what libc memset > > >> does too). > > >> > > >>> libstdc++: Do not use memset in _Hashtable buckets allocation > > >>> > > >>> Using memset is incorrect if the __bucket_ptr type is non-trivial, or > > >>> does not use an all-zero bit pattern for its null value. > > >>> > > >>> Replace the use of memset with std::__uinitialized_default_n_a to set the > > >>> pointers to nullptr. Doing so and corresponding std::_Destroy when > > >>> deallocating > > >>> buckets. > > >>> > > >>> libstdc++-v3/ChangeLog: > > >>> > > >>> * include/bits/hashtable_policy.h > > >>> (_Hashtable_alloc::_M_allocate_buckets): Do not use memset to zero > > >>> out bucket pointers. > > >>> (_Hashtable_alloc::_M_deallocate_buckets): Add destroy of buckets. > > >>> > > >>> > > >>> I hope you won't ask for copy rights on the changelog entry :-) > > >>> > > >>> Tested under Linux x64, ok to commit ? > > >>> > > >>> François
diff --git a/libstdc++-v3/include/bits/hashtable_policy.h b/libstdc++-v3/include/bits/hashtable_policy.h index 26def24f24e..6456c53e8b8 100644 --- a/libstdc++-v3/include/bits/hashtable_policy.h +++ b/libstdc++-v3/include/bits/hashtable_policy.h @@ -33,8 +33,9 @@ #include <tuple> // for std::tuple, std::forward_as_tuple #include <bits/functional_hash.h> // for __is_fast_hash -#include <bits/stl_algobase.h> // for std::min, std::is_permutation. +#include <bits/stl_algobase.h> // for std::min, std::is_permutation #include <bits/stl_pair.h> // for std::pair +#include <bits/stl_uninitialized.h> // for __uninitialized_default_n_a #include <ext/aligned_buffer.h> // for __gnu_cxx::__aligned_buffer #include <ext/alloc_traits.h> // for std::__alloc_rebind #include <ext/numeric_traits.h> // for __gnu_cxx::__int_traits @@ -2068,12 +2069,39 @@ namespace __detail _Hashtable_alloc<_NodeAlloc>::_M_allocate_buckets(std::size_t __bkt_count) -> __buckets_ptr { - __buckets_alloc_type __alloc(_M_node_allocator()); + // RAII guard for allocated storage. + struct _Guard_alloc + { + __buckets_ptr _M_storage; // Storage to deallocate + std::size_t _M_len; + __buckets_alloc_type& _M_alloc; + + _Guard_alloc(__buckets_ptr __s, std::size_t __l, + __buckets_alloc_type& __alloc) + : _M_storage(__s), _M_len(__l), _M_alloc(__alloc) + { } + _Guard_alloc(const _Guard_alloc&) = delete; + + ~_Guard_alloc() + { + if (_M_storage) + __buckets_alloc_traits::deallocate(_M_alloc, _M_storage, _M_len); + } + + __buckets_ptr + _M_release() + { + auto __res = _M_storage; + _M_storage = nullptr; + return __res; + } + }; + __buckets_alloc_type __alloc(_M_node_allocator()); auto __ptr = __buckets_alloc_traits::allocate(__alloc, __bkt_count); - __buckets_ptr __p = std::__to_address(__ptr); - __builtin_memset(__p, 0, __bkt_count * sizeof(__node_base_ptr)); - return __p; + _Guard_alloc __guard(__ptr, __bkt_count, __alloc); + std::__uninitialized_default_n_a(__ptr, __bkt_count, __alloc); + return std::__to_address(__guard._M_release()); } template<typename _NodeAlloc> @@ -2085,6 +2113,7 @@ namespace __detail typedef typename __buckets_alloc_traits::pointer _Ptr; auto __ptr = std::pointer_traits<_Ptr>::pointer_to(*__bkts); __buckets_alloc_type __alloc(_M_node_allocator()); + std::_Destroy(__ptr, __ptr + __bkt_count, __alloc); __buckets_alloc_traits::deallocate(__alloc, __ptr, __bkt_count); }