diff mbox series

[3/7] libstdc++: Inline memmove optimizations for std::copy etc. [PR115444]

Message ID 20241015142630.2148792-3-jwakely@redhat.com
State New
Headers show
Series [1/7] libstdc++: Refactor std::uninitialized_{copy, fill, fill_n} algos [PR68350] | expand

Commit Message

Jonathan Wakely Oct. 15, 2024, 2:20 p.m. UTC
This is a slightly different approach to C++98 compatibility than used
in patch 1/1 of this series for the uninitialized algos. It worked out a
bit cleaner this way for these algos, I think.

Tested x86_64-linux.

-- >8 --

This removes all the __copy_move class template specializations that
decide how to optimize std::copy and std::copy_n. We can inline those
optimizations into the algorithms, using if-constexpr (and macros for
C++98 compatibility) and remove the code dispatching to the various
class template specializations.

Doing this means we implement the optimization directly for std::copy_n
instead of deferring to std::copy, That avoids the unwanted consequence
of advancing the iterator in copy_n only to take the difference later to
get back to the length that we already had in copy_n originally (as
described in PR 115444).

With the new flattened implementations, we can also lower contiguous
iterators to pointers in std::copy/std::copy_n/std::copy_backwards, so
that they benefit from the same memmove optimizations as pointers.
There's a subtlety though: contiguous iterators can potentially throw
exceptions to exit the algorithm early.  So we can only transform the
loop to memmove if dereferencing the iterator is noexcept. We don't
check that incrementing the iterator is noexcept because we advance the
contiguous iterators before using memmove, so that if incrementing would
throw, that happens first. I am writing a proposal (P3249R0) which would
make this unnecessary, so I hope we can drop the nothrow requirements
later.

This change also solves PR 114817 by checking is_trivially_assignable
before optimizing copy/copy_n etc. to memmove. It's not enough to check
that the types are trivially copyable (a precondition for using memmove
at all), we also need to check that the specific assignment that would
be performed by the algorithm is also trivial. Replacing a non-trivial
assignment with memmove would be observable, so not allowed.

libstdc++-v3/ChangeLog:

	PR libstdc++/115444
	PR libstdc++/114817
	* include/bits/stl_algo.h (__copy_n): Remove generic overload
	and overload for random access iterators.
	(copy_n): Inline generic version of __copy_n here. Do not defer
	to std::copy for random access iterators.
	* include/bits/stl_algobase.h (__copy_move): Remove.
	(__nothrow_contiguous_iterator, __memcpyable_iterators): New
	concepts.
	(__assign_one, _GLIBCXX_TO_ADDR, _GLIBCXX_ADVANCE): New helpers.
	(__copy_move_a2): Inline __copy_move logic and conditional
	memmove optimization into the most generic overload.
	(__copy_n_a): Likewise.
	(__copy_move_backward): Remove.
	(__copy_move_backward_a2): Inline __copy_move_backward logic and
	memmove optimization into the most generic overload.
	* testsuite/20_util/specialized_algorithms/uninitialized_copy/114817.cc:
	New test.
	* testsuite/20_util/specialized_algorithms/uninitialized_copy_n/114817.cc:
	New test.
	* testsuite/25_algorithms/copy/114817.cc: New test.
	* testsuite/25_algorithms/copy/115444.cc: New test.
	* testsuite/25_algorithms/copy_n/114817.cc: New test.
---
 libstdc++-v3/include/bits/stl_algo.h          |  24 +-
 libstdc++-v3/include/bits/stl_algobase.h      | 426 +++++++++---------
 .../uninitialized_copy/114817.cc              |  39 ++
 .../uninitialized_copy_n/114817.cc            |  39 ++
 .../testsuite/25_algorithms/copy/114817.cc    |  38 ++
 .../testsuite/25_algorithms/copy/115444.cc    |  93 ++++
 .../testsuite/25_algorithms/copy_n/114817.cc  |  38 ++
 7 files changed, 469 insertions(+), 228 deletions(-)
 create mode 100644 libstdc++-v3/testsuite/20_util/specialized_algorithms/uninitialized_copy/114817.cc
 create mode 100644 libstdc++-v3/testsuite/20_util/specialized_algorithms/uninitialized_copy_n/114817.cc
 create mode 100644 libstdc++-v3/testsuite/25_algorithms/copy/114817.cc
 create mode 100644 libstdc++-v3/testsuite/25_algorithms/copy/115444.cc
 create mode 100644 libstdc++-v3/testsuite/25_algorithms/copy_n/114817.cc

Comments

Jonathan Wakely Oct. 16, 2024, 3:17 p.m. UTC | #1
On Tue, 15 Oct 2024 at 15:29, Jonathan Wakely <jwakely@redhat.com> wrote:
>
> This is a slightly different approach to C++98 compatibility than used
> in patch 1/1 of this series for the uninitialized algos. It worked out a
> bit cleaner this way for these algos, I think.
>
> Tested x86_64-linux.
>
> -- >8 --
>
> This removes all the __copy_move class template specializations that
> decide how to optimize std::copy and std::copy_n. We can inline those
> optimizations into the algorithms, using if-constexpr (and macros for
> C++98 compatibility) and remove the code dispatching to the various
> class template specializations.
>
> Doing this means we implement the optimization directly for std::copy_n
> instead of deferring to std::copy, That avoids the unwanted consequence
> of advancing the iterator in copy_n only to take the difference later to
> get back to the length that we already had in copy_n originally (as
> described in PR 115444).
>
> With the new flattened implementations, we can also lower contiguous
> iterators to pointers in std::copy/std::copy_n/std::copy_backwards, so
> that they benefit from the same memmove optimizations as pointers.
> There's a subtlety though: contiguous iterators can potentially throw
> exceptions to exit the algorithm early.  So we can only transform the
> loop to memmove if dereferencing the iterator is noexcept. We don't
> check that incrementing the iterator is noexcept because we advance the
> contiguous iterators before using memmove, so that if incrementing would
> throw, that happens first. I am writing a proposal (P3249R0) which would

Oops, got my own paper number wrong, I've fixed that locally.
The paper is online now:
https://isocpp.org/files/papers/P3349R0.html

> make this unnecessary, so I hope we can drop the nothrow requirements
> later.
>
> This change also solves PR 114817 by checking is_trivially_assignable
> before optimizing copy/copy_n etc. to memmove. It's not enough to check
> that the types are trivially copyable (a precondition for using memmove
> at all), we also need to check that the specific assignment that would
> be performed by the algorithm is also trivial. Replacing a non-trivial
> assignment with memmove would be observable, so not allowed.
>
> libstdc++-v3/ChangeLog:
>
>         PR libstdc++/115444
>         PR libstdc++/114817
>         * include/bits/stl_algo.h (__copy_n): Remove generic overload
>         and overload for random access iterators.
>         (copy_n): Inline generic version of __copy_n here. Do not defer
>         to std::copy for random access iterators.
>         * include/bits/stl_algobase.h (__copy_move): Remove.
>         (__nothrow_contiguous_iterator, __memcpyable_iterators): New
>         concepts.
>         (__assign_one, _GLIBCXX_TO_ADDR, _GLIBCXX_ADVANCE): New helpers.
>         (__copy_move_a2): Inline __copy_move logic and conditional
>         memmove optimization into the most generic overload.
>         (__copy_n_a): Likewise.
>         (__copy_move_backward): Remove.
>         (__copy_move_backward_a2): Inline __copy_move_backward logic and
>         memmove optimization into the most generic overload.
>         * testsuite/20_util/specialized_algorithms/uninitialized_copy/114817.cc:
>         New test.
>         * testsuite/20_util/specialized_algorithms/uninitialized_copy_n/114817.cc:
>         New test.
>         * testsuite/25_algorithms/copy/114817.cc: New test.
>         * testsuite/25_algorithms/copy/115444.cc: New test.
>         * testsuite/25_algorithms/copy_n/114817.cc: New test.
> ---
>  libstdc++-v3/include/bits/stl_algo.h          |  24 +-
>  libstdc++-v3/include/bits/stl_algobase.h      | 426 +++++++++---------
>  .../uninitialized_copy/114817.cc              |  39 ++
>  .../uninitialized_copy_n/114817.cc            |  39 ++
>  .../testsuite/25_algorithms/copy/114817.cc    |  38 ++
>  .../testsuite/25_algorithms/copy/115444.cc    |  93 ++++
>  .../testsuite/25_algorithms/copy_n/114817.cc  |  38 ++
>  7 files changed, 469 insertions(+), 228 deletions(-)
>  create mode 100644 libstdc++-v3/testsuite/20_util/specialized_algorithms/uninitialized_copy/114817.cc
>  create mode 100644 libstdc++-v3/testsuite/20_util/specialized_algorithms/uninitialized_copy_n/114817.cc
>  create mode 100644 libstdc++-v3/testsuite/25_algorithms/copy/114817.cc
>  create mode 100644 libstdc++-v3/testsuite/25_algorithms/copy/115444.cc
>  create mode 100644 libstdc++-v3/testsuite/25_algorithms/copy_n/114817.cc
>
> diff --git a/libstdc++-v3/include/bits/stl_algo.h b/libstdc++-v3/include/bits/stl_algo.h
> index a1ef665506d..489ce7e14d2 100644
> --- a/libstdc++-v3/include/bits/stl_algo.h
> +++ b/libstdc++-v3/include/bits/stl_algo.h
> @@ -665,25 +665,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>        return __result;
>      }
>
> -  template<typename _InputIterator, typename _Size, typename _OutputIterator>
> -    _GLIBCXX20_CONSTEXPR
> -    _OutputIterator
> -    __copy_n(_InputIterator __first, _Size __n,
> -            _OutputIterator __result, input_iterator_tag)
> -    {
> -      return std::__niter_wrap(__result,
> -                              __copy_n_a(__first, __n,
> -                                         std::__niter_base(__result), true));
> -    }
> -
> -  template<typename _RandomAccessIterator, typename _Size,
> -          typename _OutputIterator>
> -    _GLIBCXX20_CONSTEXPR
> -    inline _OutputIterator
> -    __copy_n(_RandomAccessIterator __first, _Size __n,
> -            _OutputIterator __result, random_access_iterator_tag)
> -    { return std::copy(__first, __first + __n, __result); }
> -
>    /**
>     *  @brief Copies the range [first,first+n) into [result,result+n).
>     *  @ingroup mutating_algorithms
> @@ -714,8 +695,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>        __glibcxx_requires_can_increment(__first, __n2);
>        __glibcxx_requires_can_increment(__result, __n2);
>
> -      return std::__copy_n(__first, __n2, __result,
> -                          std::__iterator_category(__first));
> +      auto __res = std::__copy_n_a(std::__niter_base(__first), __n2,
> +                                  std::__niter_base(__result), true);
> +      return std::__niter_wrap(__result, std::move(__res));
>      }
>
>    /**
> diff --git a/libstdc++-v3/include/bits/stl_algobase.h b/libstdc++-v3/include/bits/stl_algobase.h
> index 751b7ad119b..5f77b00be9b 100644
> --- a/libstdc++-v3/include/bits/stl_algobase.h
> +++ b/libstdc++-v3/include/bits/stl_algobase.h
> @@ -77,6 +77,7 @@
>  #endif
>  #if __cplusplus >= 202002L
>  # include <compare>
> +# include <bits/ptr_traits.h> // std::to_address
>  #endif
>
>  namespace std _GLIBCXX_VISIBILITY(default)
> @@ -308,110 +309,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>        return __a;
>      }
>
> -  // All of these auxiliary structs serve two purposes.  (1) Replace
> -  // calls to copy with memmove whenever possible.  (Memmove, not memcpy,
> -  // because the input and output ranges are permitted to overlap.)
> -  // (2) If we're using random access iterators, then write the loop as
> -  // a for loop with an explicit count.
> -
> -  template<bool _IsMove, bool _IsSimple, typename _Category>
> -    struct __copy_move
> -    {
> -      template<typename _II, typename _OI>
> -       _GLIBCXX20_CONSTEXPR
> -       static _OI
> -       __copy_m(_II __first, _II __last, _OI __result)
> -       {
> -         for (; __first != __last; ++__result, (void)++__first)
> -           *__result = *__first;
> -         return __result;
> -       }
> -    };
> -
> -#if __cplusplus >= 201103L
> -  template<typename _Category>
> -    struct __copy_move<true, false, _Category>
> -    {
> -      template<typename _II, typename _OI>
> -       _GLIBCXX20_CONSTEXPR
> -       static _OI
> -       __copy_m(_II __first, _II __last, _OI __result)
> -       {
> -         for (; __first != __last; ++__result, (void)++__first)
> -           *__result = std::move(*__first);
> -         return __result;
> -       }
> -    };
> -#endif
> -
> -  template<>
> -    struct __copy_move<false, false, random_access_iterator_tag>
> -    {
> -      template<typename _II, typename _OI>
> -       _GLIBCXX20_CONSTEXPR
> -       static _OI
> -       __copy_m(_II __first, _II __last, _OI __result)
> -       {
> -         typedef typename iterator_traits<_II>::difference_type _Distance;
> -         for(_Distance __n = __last - __first; __n > 0; --__n)
> -           {
> -             *__result = *__first;
> -             ++__first;
> -             ++__result;
> -           }
> -         return __result;
> -       }
> -
> -      template<typename _Tp, typename _Up>
> -       static void
> -       __assign_one(_Tp* __to, _Up* __from)
> -       { *__to = *__from; }
> -    };
> -
> -#if __cplusplus >= 201103L
> -  template<>
> -    struct __copy_move<true, false, random_access_iterator_tag>
> -    {
> -      template<typename _II, typename _OI>
> -       _GLIBCXX20_CONSTEXPR
> -       static _OI
> -       __copy_m(_II __first, _II __last, _OI __result)
> -       {
> -         typedef typename iterator_traits<_II>::difference_type _Distance;
> -         for(_Distance __n = __last - __first; __n > 0; --__n)
> -           {
> -             *__result = std::move(*__first);
> -             ++__first;
> -             ++__result;
> -           }
> -         return __result;
> -       }
> -
> -      template<typename _Tp, typename _Up>
> -       static void
> -       __assign_one(_Tp* __to, _Up* __from)
> -       { *__to = std::move(*__from); }
> -    };
> -#endif
> -
> -  template<bool _IsMove>
> -    struct __copy_move<_IsMove, true, random_access_iterator_tag>
> -    {
> -      template<typename _Tp, typename _Up>
> -       _GLIBCXX20_CONSTEXPR
> -       static _Up*
> -       __copy_m(_Tp* __first, _Tp* __last, _Up* __result)
> -       {
> -         const ptrdiff_t _Num = __last - __first;
> -         if (__builtin_expect(_Num > 1, true))
> -           __builtin_memmove(__result, __first, sizeof(_Tp) * _Num);
> -         else if (_Num == 1)
> -           std::__copy_move<_IsMove, false, random_access_iterator_tag>::
> -             __assign_one(__result, __first);
> -         return __result + _Num;
> -       }
> -    };
> -
>  _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
>
>    template<typename _Tp, typename _Ref, typename _Ptr>
> @@ -461,21 +358,127 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
>         _GLIBCXX_STD_C::_Deque_iterator<_CharT, _CharT&, _CharT*>);
>  #endif // HOSTED
>
> -  template<bool _IsMove, typename _II, typename _OI>
> -    _GLIBCXX20_CONSTEXPR
> -    inline _OI
> -    __copy_move_a2(_II __first, _II __last, _OI __result)
> -    {
> -      typedef typename iterator_traits<_II>::iterator_category _Category;
> -#ifdef __cpp_lib_is_constant_evaluated
> -      if (std::is_constant_evaluated())
> -       return std::__copy_move<_IsMove, false, _Category>::
> -         __copy_m(__first, __last, __result);
> +#if __cpp_lib_concepts
> +  // N.B. this is not the same as nothrow-forward-iterator, which doesn't
> +  // require noexcept operations, it just says it's undefined if they throw.
> +  // Here we require them to be actually noexcept.
> +  template<typename _Iter>
> +    concept __nothrow_contiguous_iterator
> +      = contiguous_iterator<_Iter> && requires (_Iter __i) {
> +       // If this operation can throw then the iterator could cause
> +       // the algorithm to exit early via an exception, in which case
> +       // we can't use memcpy.
> +       { *__i } noexcept;
> +      };
> +
> +  template<typename _OutIter, typename _InIter, typename _Sent = _InIter>
> +    concept __memcpyable_iterators
> +      = __nothrow_contiguous_iterator<_OutIter>
> +         && __nothrow_contiguous_iterator<_InIter>
> +         && sized_sentinel_for<_Sent, _InIter>
> +         && requires (_OutIter __o, _InIter __i, _Sent __s) {
> +           requires !!__memcpyable<decltype(std::to_address(__o)),
> +                                   decltype(std::to_address(__i))>::__value;
> +           { __i != __s } noexcept;
> +         };
>  #endif
> -      return std::__copy_move<_IsMove, __memcpyable<_OI, _II>::__value,
> -                             _Category>::__copy_m(__first, __last, __result);
> +
> +#if __cplusplus < 201103L
> +  // Used by __copy_move_a2, __copy_n_a and __copy_move_backward_a2 to
> +  // get raw pointers so that calls to __builtin_memmove will compile,
> +  // because C++98 can't use 'if constexpr' so statements that use memmove
> +  // with pointer arguments need to also compile for arbitrary iterator types.
> +  template<typename _Iter> __attribute__((__always_inline__))
> +    inline void* __ptr_or_null(_Iter) { return 0; }
> +  template<typename _Tp> __attribute__((__always_inline__))
> +  inline void* __ptr_or_null(_Tp* __p) { return (void*)__p; }
> +# define _GLIBCXX_TO_ADDR(P) std::__ptr_or_null(P)
> +  // Used to advance output iterators (std::advance requires InputIterator).
> +  template<typename _Iter> __attribute__((__always_inline__))
> +    inline void __ptr_advance(_Iter&, ptrdiff_t) { }
> +  template<typename _Tp> __attribute__((__always_inline__))
> +    inline void __ptr_advance(_Tp*& __p, ptrdiff_t __n) { __p += __n; }
> +# define _GLIBCXX_ADVANCE(P, N) std::__ptr_advance(P, N)
> +#else
> +  // For C++11 mode the __builtin_memmove calls are guarded by 'if constexpr'
> +  // so we know the iterators used with memmove are guaranteed to be pointers.
> +# define _GLIBCXX_TO_ADDR(P) P
> +# define _GLIBCXX_ADVANCE(P, N) P += N
> +#endif
> +
> +#pragma GCC diagnostic push
> +#pragma GCC diagnostic ignored "-Wc++17-extensions"
> +  template<bool _IsMove, typename _OutIter, typename _InIter>
> +    __attribute__((__always_inline__)) _GLIBCXX20_CONSTEXPR
> +    inline void
> +    __assign_one(_OutIter& __out, _InIter& __in)
> +    {
> +#if __cplusplus >= 201103L
> +      if constexpr (_IsMove)
> +       *__out = std::move(*__in);
> +      else
> +#endif
> +       *__out = *__in;
>      }
>
> +  template<bool _IsMove, typename _InIter, typename _Sent, typename _OutIter>
> +    _GLIBCXX20_CONSTEXPR
> +    inline _OutIter
> +    __copy_move_a2(_InIter __first, _Sent __last, _OutIter __result)
> +    {
> +      typedef __decltype(*__first) _InRef;
> +      typedef __decltype(*__result) _OutRef;
> +      if _GLIBCXX_CONSTEXPR (!__is_trivially_assignable(_OutRef, _InRef))
> +       { } /* Skip the optimizations and use the loop at the end. */
> +#ifdef __cpp_lib_is_constant_evaluated
> +      else if (std::is_constant_evaluated())
> +       { } /* Skip the optimizations and use the loop at the end. */
> +#endif
> +      else if _GLIBCXX_CONSTEXPR (__memcpyable<_OutIter, _InIter>::__value)
> +       {
> +         ptrdiff_t __n = std::distance(__first, __last);
> +         if (__builtin_expect(__n > 1, true))
> +           {
> +             __builtin_memmove(_GLIBCXX_TO_ADDR(__result),
> +                               _GLIBCXX_TO_ADDR(__first),
> +                               __n * sizeof(*__first));
> +             _GLIBCXX_ADVANCE(__result, __n);
> +           }
> +         else if (__n == 1)
> +           {
> +             std::__assign_one<_IsMove>(__result, __first);
> +             ++__result;
> +           }
> +         return __result;
> +       }
> +#if __cpp_lib_concepts
> +      else if constexpr (__memcpyable_iterators<_OutIter, _InIter, _Sent>)
> +       {
> +         if (auto __n = __last - __first; __n > 1) [[likely]]
> +           {
> +             void* __dest = std::to_address(__result);
> +             const void* __src = std::to_address(__first);
> +             size_t __nbytes = __n * sizeof(iter_value_t<_InIter>);
> +             // Advance the iterators first, in case doing so throws.
> +             __result += __n;
> +             __first += __n;
> +             __builtin_memmove(__dest, __src, __nbytes);
> +           }
> +         else if (__n == 1)
> +           {
> +             std::__assign_one<_IsMove>(__result, __first);
> +             ++__result;
> +           }
> +         return __result;
> +       }
> +#endif
> +
> +      for (; __first != __last; ++__result, (void)++__first)
> +       std::__assign_one<_IsMove>(__result, __first);
> +      return __result;
> +    }
> +#pragma GCC diagnostic pop
> +
>    template<bool _IsMove,
>            typename _Tp, typename _Ref, typename _Ptr, typename _OI>
>      _OI
> @@ -537,12 +540,56 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
>                   const ::__gnu_debug::_Safe_iterator<_IIte, _ISeq, _ICat>&,
>                   const ::__gnu_debug::_Safe_iterator<_OIte, _OSeq, _OCat>&);
>
> +#pragma GCC diagnostic push
> +#pragma GCC diagnostic ignored "-Wc++17-extensions" // for if-constexpr
>    template<typename _InputIterator, typename _Size, typename _OutputIterator>
>      _GLIBCXX20_CONSTEXPR
>      _OutputIterator
>      __copy_n_a(_InputIterator __first, _Size __n, _OutputIterator __result,
>                bool)
>      {
> +      typedef __decltype(*__first) _InRef;
> +      typedef __decltype(*__result) _OutRef;
> +      if _GLIBCXX_CONSTEXPR (!__is_trivially_assignable(_OutRef, _InRef))
> +       { } /* Skip the optimizations and use the loop at the end. */
> +#ifdef __cpp_lib_is_constant_evaluated
> +      else if (std::is_constant_evaluated())
> +       { } /* Skip the optimizations and use the loop at the end. */
> +#endif
> +      else if _GLIBCXX_CONSTEXPR (__memcpyable<_OutputIterator,
> +                                              _InputIterator>::__value)
> +       {
> +         if (__builtin_expect(__n > 1, true))
> +           {
> +             __builtin_memmove(_GLIBCXX_TO_ADDR(__result),
> +                               _GLIBCXX_TO_ADDR(__first),
> +                               __n * sizeof(*__first));
> +             _GLIBCXX_ADVANCE(__result, __n);
> +           }
> +         else if (__n == 1)
> +           *__result++ = *__first;
> +         return __result;
> +       }
> +#if __cpp_lib_concepts
> +      else if constexpr (__memcpyable_iterators<_OutputIterator,
> +                                               _InputIterator>)
> +       {
> +         if (__n > 1) [[likely]]
> +           {
> +             void* __dest = std::to_address(__result);
> +             const void* __src = std::to_address(__first);
> +             size_t __nbytes = __n * sizeof(iter_value_t<_InputIterator>);
> +             // Advance the iterators first, in case doing so throws.
> +             __result += __n;
> +             __first += __n;
> +             __builtin_memmove(__dest, __src, __nbytes);
> +           }
> +         else if (__n == 1)
> +           *__result++ = *__first;
> +         return __result;
> +       }
> +#endif
> +
>        if (__n > 0)
>         {
>           while (true)
> @@ -557,6 +604,7 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
>         }
>        return __result;
>      }
> +#pragma GCC diagnostic pop
>
>  #if _GLIBCXX_HOSTED
>    template<typename _CharT, typename _Size>
> @@ -644,105 +692,69 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
>  #define _GLIBCXX_MOVE3(_Tp, _Up, _Vp) std::copy(_Tp, _Up, _Vp)
>  #endif
>
> -  template<bool _IsMove, bool _IsSimple, typename _Category>
> -    struct __copy_move_backward
> -    {
> -      template<typename _BI1, typename _BI2>
> -       _GLIBCXX20_CONSTEXPR
> -       static _BI2
> -       __copy_move_b(_BI1 __first, _BI1 __last, _BI2 __result)
> -       {
> -         while (__first != __last)
> -           *--__result = *--__last;
> -         return __result;
> -       }
> -    };
> -
> -#if __cplusplus >= 201103L
> -  template<typename _Category>
> -    struct __copy_move_backward<true, false, _Category>
> -    {
> -      template<typename _BI1, typename _BI2>
> -       _GLIBCXX20_CONSTEXPR
> -       static _BI2
> -       __copy_move_b(_BI1 __first, _BI1 __last, _BI2 __result)
> -       {
> -         while (__first != __last)
> -           *--__result = std::move(*--__last);
> -         return __result;
> -       }
> -    };
> -#endif
> -
> -  template<>
> -    struct __copy_move_backward<false, false, random_access_iterator_tag>
> -    {
> -      template<typename _BI1, typename _BI2>
> -       _GLIBCXX20_CONSTEXPR
> -       static _BI2
> -       __copy_move_b(_BI1 __first, _BI1 __last, _BI2 __result)
> -       {
> -         typename iterator_traits<_BI1>::difference_type
> -           __n = __last - __first;
> -         for (; __n > 0; --__n)
> -           *--__result = *--__last;
> -         return __result;
> -       }
> -    };
> -
> -#if __cplusplus >= 201103L
> -  template<>
> -    struct __copy_move_backward<true, false, random_access_iterator_tag>
> -    {
> -      template<typename _BI1, typename _BI2>
> -       _GLIBCXX20_CONSTEXPR
> -       static _BI2
> -       __copy_move_b(_BI1 __first, _BI1 __last, _BI2 __result)
> -       {
> -         typename iterator_traits<_BI1>::difference_type
> -           __n = __last - __first;
> -         for (; __n > 0; --__n)
> -           *--__result = std::move(*--__last);
> -         return __result;
> -       }
> -    };
> -#endif
> -
> -  template<bool _IsMove>
> -    struct __copy_move_backward<_IsMove, true, random_access_iterator_tag>
> -    {
> -      template<typename _Tp, typename _Up>
> -       _GLIBCXX20_CONSTEXPR
> -       static _Up*
> -       __copy_move_b(_Tp* __first, _Tp* __last, _Up* __result)
> -       {
> -         const ptrdiff_t _Num = __last - __first;
> -         if (__builtin_expect(_Num > 1, true))
> -           __builtin_memmove(__result - _Num, __first, sizeof(_Tp) * _Num);
> -         else if (_Num == 1)
> -           std::__copy_move<_IsMove, false, random_access_iterator_tag>::
> -             __assign_one(__result - 1, __first);
> -         return __result - _Num;
> -       }
> -    };
> -
> +#pragma GCC diagnostic push
> +#pragma GCC diagnostic ignored "-Wc++17-extensions"
>    template<bool _IsMove, typename _BI1, typename _BI2>
>      _GLIBCXX20_CONSTEXPR
>      inline _BI2
>      __copy_move_backward_a2(_BI1 __first, _BI1 __last, _BI2 __result)
>      {
> -      typedef typename iterator_traits<_BI1>::iterator_category _Category;
> +      typedef __decltype(*__first) _InRef;
> +      typedef __decltype(*__result) _OutRef;
> +      if _GLIBCXX_CONSTEXPR (!__is_trivially_assignable(_OutRef, _InRef))
> +       { } /* Skip the optimizations and use the loop at the end. */
>  #ifdef __cpp_lib_is_constant_evaluated
> -      if (std::is_constant_evaluated())
> -       return std::__copy_move_backward<_IsMove, false, _Category>::
> -         __copy_move_b(__first, __last, __result);
> +      else if (std::is_constant_evaluated())
> +       { } /* Skip the optimizations and use the loop at the end. */
>  #endif
> -      return std::__copy_move_backward<_IsMove,
> -                                      __memcpyable<_BI2, _BI1>::__value,
> -                                      _Category>::__copy_move_b(__first,
> -                                                                __last,
> -                                                                __result);
> +      else if _GLIBCXX_CONSTEXPR (__memcpyable<_BI2, _BI1>::__value)
> +       {
> +         ptrdiff_t __n = std::distance(__first, __last);
> +         std::advance(__result, -__n);
> +         if (__builtin_expect(__n > 1, true))
> +           {
> +             __builtin_memmove(_GLIBCXX_TO_ADDR(__result),
> +                               _GLIBCXX_TO_ADDR(__first),
> +                               __n * sizeof(*__first));
> +           }
> +         else if (__n == 1)
> +           std::__assign_one<_IsMove>(__result, __first);
> +         return __result;
> +       }
> +#if __cpp_lib_concepts
> +      else if constexpr (__memcpyable_iterators<_BI2, _BI1>)
> +       {
> +         if (auto __n = __last - __first; __n > 1) [[likely]]
> +           {
> +             const void* __src = std::to_address(__first);
> +             // Advance the iterators first, in case doing so throws.
> +             __result -= __n;
> +             __first += __n;
> +             void* __dest = std::to_address(__result);
> +             size_t __nbytes = __n * sizeof(iter_value_t<_BI1>);
> +             __builtin_memmove(__dest, __src, __nbytes);
> +           }
> +         else if (__n == 1)
> +           {
> +             --__result;
> +             std::__assign_one<_IsMove>(__result, __first);
> +           }
> +         return __result;
> +       }
> +#endif
> +
> +      while (__first != __last)
> +       {
> +         --__last;
> +         --__result;
> +         std::__assign_one<_IsMove>(__result, __last);
> +       }
> +      return __result;
>      }
> +#pragma GCC diagnostic pop
> +
> +#undef _GLIBCXX_TO_ADDR
> +#undef _GLIBCXX_ADVANCE
>
>    template<bool _IsMove, typename _BI1, typename _BI2>
>      _GLIBCXX20_CONSTEXPR
> diff --git a/libstdc++-v3/testsuite/20_util/specialized_algorithms/uninitialized_copy/114817.cc b/libstdc++-v3/testsuite/20_util/specialized_algorithms/uninitialized_copy/114817.cc
> new file mode 100644
> index 00000000000..531b863e143
> --- /dev/null
> +++ b/libstdc++-v3/testsuite/20_util/specialized_algorithms/uninitialized_copy/114817.cc
> @@ -0,0 +1,39 @@
> +// { dg-do run { target c++11 } }
> +
> +// Bug libstdc++/114817 - Wrong codegen for std::copy of
> +// "trivially copyable but not trivially assignable" type
> +
> +#include <memory>
> +#include <testsuite_hooks.h>
> +
> +int constructions = 0;
> +
> +struct NonTrivialCons
> +{
> +  NonTrivialCons() = default;
> +  NonTrivialCons(int v) : val(v) { }
> +  NonTrivialCons(const volatile NonTrivialCons&) = delete;
> +  template<class = void>
> +    NonTrivialCons(const NonTrivialCons& o)
> +    : val(o.val)
> +    {
> +      ++constructions;
> +    }
> +
> +  int val;
> +};
> +
> +static_assert(std::is_trivially_copyable<NonTrivialCons>::value);
> +
> +int main()
> +{
> +  NonTrivialCons src[2]{1, 2};
> +  NonTrivialCons dst[2];
> +#if __cplusplus < 201703L
> +  constructions = 0;
> +#endif
> +  std::uninitialized_copy(src, src+2, dst);
> +  VERIFY( constructions == 2 );
> +  VERIFY( dst[0].val == src[0].val );
> +  VERIFY( dst[1].val == src[1].val );
> +}
> diff --git a/libstdc++-v3/testsuite/20_util/specialized_algorithms/uninitialized_copy_n/114817.cc b/libstdc++-v3/testsuite/20_util/specialized_algorithms/uninitialized_copy_n/114817.cc
> new file mode 100644
> index 00000000000..531b863e143
> --- /dev/null
> +++ b/libstdc++-v3/testsuite/20_util/specialized_algorithms/uninitialized_copy_n/114817.cc
> @@ -0,0 +1,39 @@
> +// { dg-do run { target c++11 } }
> +
> +// Bug libstdc++/114817 - Wrong codegen for std::copy of
> +// "trivially copyable but not trivially assignable" type
> +
> +#include <memory>
> +#include <testsuite_hooks.h>
> +
> +int constructions = 0;
> +
> +struct NonTrivialCons
> +{
> +  NonTrivialCons() = default;
> +  NonTrivialCons(int v) : val(v) { }
> +  NonTrivialCons(const volatile NonTrivialCons&) = delete;
> +  template<class = void>
> +    NonTrivialCons(const NonTrivialCons& o)
> +    : val(o.val)
> +    {
> +      ++constructions;
> +    }
> +
> +  int val;
> +};
> +
> +static_assert(std::is_trivially_copyable<NonTrivialCons>::value);
> +
> +int main()
> +{
> +  NonTrivialCons src[2]{1, 2};
> +  NonTrivialCons dst[2];
> +#if __cplusplus < 201703L
> +  constructions = 0;
> +#endif
> +  std::uninitialized_copy(src, src+2, dst);
> +  VERIFY( constructions == 2 );
> +  VERIFY( dst[0].val == src[0].val );
> +  VERIFY( dst[1].val == src[1].val );
> +}
> diff --git a/libstdc++-v3/testsuite/25_algorithms/copy/114817.cc b/libstdc++-v3/testsuite/25_algorithms/copy/114817.cc
> new file mode 100644
> index 00000000000..b5fcc6bb037
> --- /dev/null
> +++ b/libstdc++-v3/testsuite/25_algorithms/copy/114817.cc
> @@ -0,0 +1,38 @@
> +// { dg-do run { target c++11 } }
> +
> +// Bug libstdc++/114817 - Wrong codegen for std::copy of
> +// "trivially copyable but not trivially assignable" type
> +
> +#include <algorithm>
> +#include <testsuite_hooks.h>
> +
> +int assignments = 0;
> +
> +struct NonTrivialAssignment
> +{
> +  NonTrivialAssignment(int v) : val(v) { }
> +  NonTrivialAssignment(const NonTrivialAssignment&) = default;
> +  void operator=(const volatile NonTrivialAssignment&) = delete;
> +  template<class = void>
> +    NonTrivialAssignment&
> +    operator=(const NonTrivialAssignment& o)
> +    {
> +      ++assignments;
> +      val = o.val;
> +      return *this;
> +    }
> +
> +  int val;
> +};
> +
> +static_assert(std::is_trivially_copyable<NonTrivialAssignment>::value);
> +
> +int main()
> +{
> +  NonTrivialAssignment src[2]{1, 2};
> +  NonTrivialAssignment dst[2]{3, 4};
> +  std::copy(src, src+2, dst);
> +  VERIFY( assignments == 2 );
> +  VERIFY( dst[0].val == src[0].val );
> +  VERIFY( dst[1].val == src[1].val );
> +}
> diff --git a/libstdc++-v3/testsuite/25_algorithms/copy/115444.cc b/libstdc++-v3/testsuite/25_algorithms/copy/115444.cc
> new file mode 100644
> index 00000000000..fa629abea5f
> --- /dev/null
> +++ b/libstdc++-v3/testsuite/25_algorithms/copy/115444.cc
> @@ -0,0 +1,93 @@
> +// { dg-do run }
> +// { dg-require-normal-mode "debug mode checks use operator-(Iter, Iter)" }
> +
> +#include <algorithm>
> +#include <iterator>
> +#include <testsuite_hooks.h>
> +
> +const int g = 0;
> +
> +struct Iter {
> +  typedef long difference_type;
> +  typedef int value_type;
> +  typedef const int& reference;
> +  typedef const int* pointer;
> +#if __cpp_lib_concepts
> +  using iterator_category = std::contiguous_iterator_tag;
> +#else
> +  typedef std::random_access_iterator_tag iterator_category;
> +#endif
> +
> +  Iter(const int* p = 0, const int* limit = 0) : ptr(p), limit(limit) { }
> +
> +  const int& operator*() const {
> +#ifdef __cpp_exceptions
> +    if (ptr == limit)
> +      throw 1;
> +#endif
> +    return *ptr;
> +  }
> +  const int* operator->() const { return ptr; }
> +  const int& operator[](long n) const { return ptr[n]; }
> +
> +  Iter& operator++() { ++ptr; return *this; }
> +  Iter operator++(int) { Iter tmp = *this; ++ptr; return tmp; }
> +  Iter& operator--() { --ptr; return *this; }
> +  Iter operator--(int) { Iter tmp = *this; --ptr; return tmp; }
> +
> +  Iter& operator+=(int n) { ptr += n; return *this; }
> +  Iter& operator-=(int n) { ptr -= n; return *this; }
> +
> +  friend Iter operator+(int n, Iter it) { return it += n; }
> +  friend Iter operator+(Iter it, int n) { return it += n; }
> +  friend Iter operator-(Iter it, int n) { return it -= n; }
> +
> +  bool operator==(const Iter& it) const { return ptr == it.ptr; }
> +  bool operator!=(const Iter& it) const { return ptr != it.ptr; }
> +  bool operator<(const Iter& it) const { return ptr < it.ptr; }
> +  bool operator>(const Iter& it) const { return ptr > it.ptr; }
> +  bool operator<=(const Iter& it) const { return ptr <= it.ptr; }
> +  bool operator>=(const Iter& it) const { return ptr >= it.ptr; }
> +
> +  // std::copy should not need to take the difference between two iterators:
> +  friend int operator-(Iter, Iter) { VERIFY( ! "operator- called" ); }
> +
> +private:
> +  const int* ptr;
> +  const int* limit;
> +};
> +
> +void
> +test_pr115444_no_difference()
> +{
> +  int from = 1;
> +  int to = 0;
> +  Iter iter(&from);
> +  // This should not use operator-(Iter, Iter)
> +  std::copy(iter, iter+1, &to);
> +}
> +
> +void
> +test_pr115444_exceptional()
> +{
> +#if __cpp_exceptions
> +  int from[3] = { 1, 2, 3 };
> +  int to[3] = { -1, -1, -1 };
> +  Iter iter(from, from+2);
> +  try {
> +    std::copy(iter, iter + 3, to);
> +  } catch (int) {
> +  }
> +  // std::copy should exit via exception on third dereference.
> +  // This precludes using memcpy or memmove to optimize the copying.
> +  VERIFY( to[0] == 1 );
> +  VERIFY( to[1] == 2 );
> +  VERIFY( to[2] == -1 );
> +#endif
> +}
> +
> +int main()
> +{
> +  test_pr115444_no_difference();
> +  test_pr115444_exceptional();
> +}
> diff --git a/libstdc++-v3/testsuite/25_algorithms/copy_n/114817.cc b/libstdc++-v3/testsuite/25_algorithms/copy_n/114817.cc
> new file mode 100644
> index 00000000000..09e181f3fd0
> --- /dev/null
> +++ b/libstdc++-v3/testsuite/25_algorithms/copy_n/114817.cc
> @@ -0,0 +1,38 @@
> +// { dg-do run { target c++11 } }
> +
> +// Bug libstdc++/114817 - Wrong codegen for std::copy of
> +// "trivially copyable but not trivially assignable" type
> +
> +#include <algorithm>
> +#include <testsuite_hooks.h>
> +
> +int assignments = 0;
> +
> +struct NonTrivialAssignment
> +{
> +  NonTrivialAssignment(int v) : val(v) { }
> +  NonTrivialAssignment(const NonTrivialAssignment&) = default;
> +  void operator=(const volatile NonTrivialAssignment&) = delete;
> +  template<class = void>
> +    NonTrivialAssignment&
> +    operator=(const NonTrivialAssignment& o)
> +    {
> +      ++assignments;
> +      val = o.val;
> +      return *this;
> +    }
> +
> +  int val;
> +};
> +
> +static_assert(std::is_trivially_copyable<NonTrivialAssignment>::value);
> +
> +int main()
> +{
> +  NonTrivialAssignment src[2]{1, 2};
> +  NonTrivialAssignment dst[2]{3, 4};
> +  std::copy_n(src, 2, dst);
> +  VERIFY( assignments == 2 );
> +  VERIFY( dst[0].val == src[0].val );
> +  VERIFY( dst[1].val == src[1].val );
> +}
> --
> 2.46.2
>
Patrick Palka Oct. 17, 2024, 2:04 a.m. UTC | #2
On Tue, 15 Oct 2024, Jonathan Wakely wrote:

> This is a slightly different approach to C++98 compatibility than used
> in patch 1/1 of this series for the uninitialized algos. It worked out a
> bit cleaner this way for these algos, I think.
> 
> Tested x86_64-linux.
> 
> -- >8 --
> 
> This removes all the __copy_move class template specializations that
> decide how to optimize std::copy and std::copy_n. We can inline those
> optimizations into the algorithms, using if-constexpr (and macros for
> C++98 compatibility) and remove the code dispatching to the various
> class template specializations.
> 
> Doing this means we implement the optimization directly for std::copy_n
> instead of deferring to std::copy, That avoids the unwanted consequence
> of advancing the iterator in copy_n only to take the difference later to
> get back to the length that we already had in copy_n originally (as
> described in PR 115444).
> 
> With the new flattened implementations, we can also lower contiguous
> iterators to pointers in std::copy/std::copy_n/std::copy_backwards, so
> that they benefit from the same memmove optimizations as pointers.
> There's a subtlety though: contiguous iterators can potentially throw
> exceptions to exit the algorithm early.  So we can only transform the
> loop to memmove if dereferencing the iterator is noexcept. We don't
> check that incrementing the iterator is noexcept because we advance the
> contiguous iterators before using memmove, so that if incrementing would
> throw, that happens first. I am writing a proposal (P3249R0) which would
> make this unnecessary, so I hope we can drop the nothrow requirements
> later.
> 
> This change also solves PR 114817 by checking is_trivially_assignable
> before optimizing copy/copy_n etc. to memmove. It's not enough to check
> that the types are trivially copyable (a precondition for using memmove
> at all), we also need to check that the specific assignment that would
> be performed by the algorithm is also trivial. Replacing a non-trivial
> assignment with memmove would be observable, so not allowed.
> 
> libstdc++-v3/ChangeLog:
> 
> 	PR libstdc++/115444
> 	PR libstdc++/114817
> 	* include/bits/stl_algo.h (__copy_n): Remove generic overload
> 	and overload for random access iterators.
> 	(copy_n): Inline generic version of __copy_n here. Do not defer
> 	to std::copy for random access iterators.
> 	* include/bits/stl_algobase.h (__copy_move): Remove.
> 	(__nothrow_contiguous_iterator, __memcpyable_iterators): New
> 	concepts.
> 	(__assign_one, _GLIBCXX_TO_ADDR, _GLIBCXX_ADVANCE): New helpers.
> 	(__copy_move_a2): Inline __copy_move logic and conditional
> 	memmove optimization into the most generic overload.
> 	(__copy_n_a): Likewise.
> 	(__copy_move_backward): Remove.
> 	(__copy_move_backward_a2): Inline __copy_move_backward logic and
> 	memmove optimization into the most generic overload.
> 	* testsuite/20_util/specialized_algorithms/uninitialized_copy/114817.cc:
> 	New test.
> 	* testsuite/20_util/specialized_algorithms/uninitialized_copy_n/114817.cc:
> 	New test.
> 	* testsuite/25_algorithms/copy/114817.cc: New test.
> 	* testsuite/25_algorithms/copy/115444.cc: New test.
> 	* testsuite/25_algorithms/copy_n/114817.cc: New test.
> ---
>  libstdc++-v3/include/bits/stl_algo.h          |  24 +-
>  libstdc++-v3/include/bits/stl_algobase.h      | 426 +++++++++---------
>  .../uninitialized_copy/114817.cc              |  39 ++
>  .../uninitialized_copy_n/114817.cc            |  39 ++
>  .../testsuite/25_algorithms/copy/114817.cc    |  38 ++
>  .../testsuite/25_algorithms/copy/115444.cc    |  93 ++++
>  .../testsuite/25_algorithms/copy_n/114817.cc  |  38 ++
>  7 files changed, 469 insertions(+), 228 deletions(-)
>  create mode 100644 libstdc++-v3/testsuite/20_util/specialized_algorithms/uninitialized_copy/114817.cc
>  create mode 100644 libstdc++-v3/testsuite/20_util/specialized_algorithms/uninitialized_copy_n/114817.cc
>  create mode 100644 libstdc++-v3/testsuite/25_algorithms/copy/114817.cc
>  create mode 100644 libstdc++-v3/testsuite/25_algorithms/copy/115444.cc
>  create mode 100644 libstdc++-v3/testsuite/25_algorithms/copy_n/114817.cc
> 
> diff --git a/libstdc++-v3/include/bits/stl_algo.h b/libstdc++-v3/include/bits/stl_algo.h
> index a1ef665506d..489ce7e14d2 100644
> --- a/libstdc++-v3/include/bits/stl_algo.h
> +++ b/libstdc++-v3/include/bits/stl_algo.h
> @@ -665,25 +665,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>        return __result;
>      }
>  
> -  template<typename _InputIterator, typename _Size, typename _OutputIterator>
> -    _GLIBCXX20_CONSTEXPR
> -    _OutputIterator
> -    __copy_n(_InputIterator __first, _Size __n,
> -	     _OutputIterator __result, input_iterator_tag)
> -    {
> -      return std::__niter_wrap(__result,
> -			       __copy_n_a(__first, __n,
> -					  std::__niter_base(__result), true));
> -    }
> -
> -  template<typename _RandomAccessIterator, typename _Size,
> -	   typename _OutputIterator>
> -    _GLIBCXX20_CONSTEXPR
> -    inline _OutputIterator
> -    __copy_n(_RandomAccessIterator __first, _Size __n,
> -	     _OutputIterator __result, random_access_iterator_tag)
> -    { return std::copy(__first, __first + __n, __result); }
> -
>    /**
>     *  @brief Copies the range [first,first+n) into [result,result+n).
>     *  @ingroup mutating_algorithms
> @@ -714,8 +695,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>        __glibcxx_requires_can_increment(__first, __n2);
>        __glibcxx_requires_can_increment(__result, __n2);
>  
> -      return std::__copy_n(__first, __n2, __result,
> -			   std::__iterator_category(__first));
> +      auto __res = std::__copy_n_a(std::__niter_base(__first), __n2,
> +				   std::__niter_base(__result), true);
> +      return std::__niter_wrap(__result, std::move(__res));
>      }
>  
>    /**
> diff --git a/libstdc++-v3/include/bits/stl_algobase.h b/libstdc++-v3/include/bits/stl_algobase.h
> index 751b7ad119b..5f77b00be9b 100644
> --- a/libstdc++-v3/include/bits/stl_algobase.h
> +++ b/libstdc++-v3/include/bits/stl_algobase.h
> @@ -77,6 +77,7 @@
>  #endif
>  #if __cplusplus >= 202002L
>  # include <compare>
> +# include <bits/ptr_traits.h> // std::to_address
>  #endif
>  
>  namespace std _GLIBCXX_VISIBILITY(default)
> @@ -308,110 +309,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>        return __a;
>      }
>  
> -  // All of these auxiliary structs serve two purposes.  (1) Replace
> -  // calls to copy with memmove whenever possible.  (Memmove, not memcpy,
> -  // because the input and output ranges are permitted to overlap.)
> -  // (2) If we're using random access iterators, then write the loop as
> -  // a for loop with an explicit count.
> -
> -  template<bool _IsMove, bool _IsSimple, typename _Category>
> -    struct __copy_move
> -    {
> -      template<typename _II, typename _OI>
> -	_GLIBCXX20_CONSTEXPR
> -	static _OI
> -	__copy_m(_II __first, _II __last, _OI __result)
> -	{
> -	  for (; __first != __last; ++__result, (void)++__first)
> -	    *__result = *__first;
> -	  return __result;
> -	}
> -    };
> -
> -#if __cplusplus >= 201103L
> -  template<typename _Category>
> -    struct __copy_move<true, false, _Category>
> -    {
> -      template<typename _II, typename _OI>
> -	_GLIBCXX20_CONSTEXPR
> -	static _OI
> -	__copy_m(_II __first, _II __last, _OI __result)
> -	{
> -	  for (; __first != __last; ++__result, (void)++__first)
> -	    *__result = std::move(*__first);
> -	  return __result;
> -	}
> -    };
> -#endif
> -
> -  template<>
> -    struct __copy_move<false, false, random_access_iterator_tag>
> -    {
> -      template<typename _II, typename _OI>
> -	_GLIBCXX20_CONSTEXPR
> -	static _OI
> -	__copy_m(_II __first, _II __last, _OI __result)
> -	{
> -	  typedef typename iterator_traits<_II>::difference_type _Distance;
> -	  for(_Distance __n = __last - __first; __n > 0; --__n)
> -	    {
> -	      *__result = *__first;
> -	      ++__first;
> -	      ++__result;
> -	    }
> -	  return __result;
> -	}
> -
> -      template<typename _Tp, typename _Up>
> -	static void
> -	__assign_one(_Tp* __to, _Up* __from)
> -	{ *__to = *__from; }
> -    };
> -
> -#if __cplusplus >= 201103L
> -  template<>
> -    struct __copy_move<true, false, random_access_iterator_tag>
> -    {
> -      template<typename _II, typename _OI>
> -	_GLIBCXX20_CONSTEXPR
> -	static _OI
> -	__copy_m(_II __first, _II __last, _OI __result)
> -	{
> -	  typedef typename iterator_traits<_II>::difference_type _Distance;
> -	  for(_Distance __n = __last - __first; __n > 0; --__n)
> -	    {
> -	      *__result = std::move(*__first);
> -	      ++__first;
> -	      ++__result;
> -	    }
> -	  return __result;
> -	}
> -
> -      template<typename _Tp, typename _Up>
> -	static void
> -	__assign_one(_Tp* __to, _Up* __from)
> -	{ *__to = std::move(*__from); }
> -    };
> -#endif
> -
> -  template<bool _IsMove>
> -    struct __copy_move<_IsMove, true, random_access_iterator_tag>
> -    {
> -      template<typename _Tp, typename _Up>
> -	_GLIBCXX20_CONSTEXPR
> -	static _Up*
> -	__copy_m(_Tp* __first, _Tp* __last, _Up* __result)
> -	{
> -	  const ptrdiff_t _Num = __last - __first;
> -	  if (__builtin_expect(_Num > 1, true))
> -	    __builtin_memmove(__result, __first, sizeof(_Tp) * _Num);
> -	  else if (_Num == 1)
> -	    std::__copy_move<_IsMove, false, random_access_iterator_tag>::
> -	      __assign_one(__result, __first);
> -	  return __result + _Num;
> -	}
> -    };
> -
>  _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
>  
>    template<typename _Tp, typename _Ref, typename _Ptr>
> @@ -461,21 +358,127 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
>  	_GLIBCXX_STD_C::_Deque_iterator<_CharT, _CharT&, _CharT*>);
>  #endif // HOSTED
>  
> -  template<bool _IsMove, typename _II, typename _OI>
> -    _GLIBCXX20_CONSTEXPR
> -    inline _OI
> -    __copy_move_a2(_II __first, _II __last, _OI __result)
> -    {
> -      typedef typename iterator_traits<_II>::iterator_category _Category;
> -#ifdef __cpp_lib_is_constant_evaluated
> -      if (std::is_constant_evaluated())
> -	return std::__copy_move<_IsMove, false, _Category>::
> -	  __copy_m(__first, __last, __result);
> +#if __cpp_lib_concepts
> +  // N.B. this is not the same as nothrow-forward-iterator, which doesn't
> +  // require noexcept operations, it just says it's undefined if they throw.
> +  // Here we require them to be actually noexcept.
> +  template<typename _Iter>
> +    concept __nothrow_contiguous_iterator
> +      = contiguous_iterator<_Iter> && requires (_Iter __i) {
> +	// If this operation can throw then the iterator could cause
> +	// the algorithm to exit early via an exception, in which case
> +	// we can't use memcpy.
> +	{ *__i } noexcept;
> +      };
> +
> +  template<typename _OutIter, typename _InIter, typename _Sent = _InIter>
> +    concept __memcpyable_iterators
> +      = __nothrow_contiguous_iterator<_OutIter>
> +	  && __nothrow_contiguous_iterator<_InIter>
> +	  && sized_sentinel_for<_Sent, _InIter>
> +	  && requires (_OutIter __o, _InIter __i, _Sent __s) {
> +	    requires !!__memcpyable<decltype(std::to_address(__o)),
> +				    decltype(std::to_address(__i))>::__value;
> +	    { __i != __s } noexcept;
> +	  };
>  #endif
> -      return std::__copy_move<_IsMove, __memcpyable<_OI, _II>::__value,
> -			      _Category>::__copy_m(__first, __last, __result);
> +
> +#if __cplusplus < 201103L
> +  // Used by __copy_move_a2, __copy_n_a and __copy_move_backward_a2 to
> +  // get raw pointers so that calls to __builtin_memmove will compile,
> +  // because C++98 can't use 'if constexpr' so statements that use memmove
> +  // with pointer arguments need to also compile for arbitrary iterator types.
> +  template<typename _Iter> __attribute__((__always_inline__))
> +    inline void* __ptr_or_null(_Iter) { return 0; }
> +  template<typename _Tp> __attribute__((__always_inline__))
> +  inline void* __ptr_or_null(_Tp* __p) { return (void*)__p; }
> +# define _GLIBCXX_TO_ADDR(P) std::__ptr_or_null(P)
> +  // Used to advance output iterators (std::advance requires InputIterator).
> +  template<typename _Iter> __attribute__((__always_inline__))
> +    inline void __ptr_advance(_Iter&, ptrdiff_t) { }
> +  template<typename _Tp> __attribute__((__always_inline__))
> +    inline void __ptr_advance(_Tp*& __p, ptrdiff_t __n) { __p += __n; }
> +# define _GLIBCXX_ADVANCE(P, N) std::__ptr_advance(P, N)
> +#else
> +  // For C++11 mode the __builtin_memmove calls are guarded by 'if constexpr'
> +  // so we know the iterators used with memmove are guaranteed to be pointers.
> +# define _GLIBCXX_TO_ADDR(P) P
> +# define _GLIBCXX_ADVANCE(P, N) P += N
> +#endif
> +
> +#pragma GCC diagnostic push
> +#pragma GCC diagnostic ignored "-Wc++17-extensions"
> +  template<bool _IsMove, typename _OutIter, typename _InIter>
> +    __attribute__((__always_inline__)) _GLIBCXX20_CONSTEXPR
> +    inline void
> +    __assign_one(_OutIter& __out, _InIter& __in)
> +    {
> +#if __cplusplus >= 201103L
> +      if constexpr (_IsMove)
> +	*__out = std::move(*__in);
> +      else
> +#endif
> +	*__out = *__in;
>      }
>  
> +  template<bool _IsMove, typename _InIter, typename _Sent, typename _OutIter>
> +    _GLIBCXX20_CONSTEXPR
> +    inline _OutIter
> +    __copy_move_a2(_InIter __first, _Sent __last, _OutIter __result)
> +    {
> +      typedef __decltype(*__first) _InRef;
> +      typedef __decltype(*__result) _OutRef;
> +      if _GLIBCXX_CONSTEXPR (!__is_trivially_assignable(_OutRef, _InRef))
> +	{ } /* Skip the optimizations and use the loop at the end. */
> +#ifdef __cpp_lib_is_constant_evaluated
> +      else if (std::is_constant_evaluated())

Maybe check std::__is_constant_evaluated() instead since it's always defined?

> +	{ } /* Skip the optimizations and use the loop at the end. */
> +#endif
> +      else if _GLIBCXX_CONSTEXPR (__memcpyable<_OutIter, _InIter>::__value)
> +	{
> +	  ptrdiff_t __n = std::distance(__first, __last);
> +	  if (__builtin_expect(__n > 1, true))

I guess [[__likely__]] can't be used here since this is a C++98 code
path.

LGTM

> +	    {
> +	      __builtin_memmove(_GLIBCXX_TO_ADDR(__result),
> +				_GLIBCXX_TO_ADDR(__first),
> +				__n * sizeof(*__first));
> +	      _GLIBCXX_ADVANCE(__result, __n);
> +	    }
> +	  else if (__n == 1)
> +	    {
> +	      std::__assign_one<_IsMove>(__result, __first);
> +	      ++__result;
> +	    }
> +	  return __result;
> +	}
> +#if __cpp_lib_concepts
> +      else if constexpr (__memcpyable_iterators<_OutIter, _InIter, _Sent>)
> +	{
> +	  if (auto __n = __last - __first; __n > 1) [[likely]]
> +	    {
> +	      void* __dest = std::to_address(__result);
> +	      const void* __src = std::to_address(__first);
> +	      size_t __nbytes = __n * sizeof(iter_value_t<_InIter>);
> +	      // Advance the iterators first, in case doing so throws.
> +	      __result += __n;
> +	      __first += __n;
> +	      __builtin_memmove(__dest, __src, __nbytes);
> +	    }
> +	  else if (__n == 1)
> +	    {
> +	      std::__assign_one<_IsMove>(__result, __first);
> +	      ++__result;
> +	    }
> +	  return __result;
> +	}
> +#endif
> +
> +      for (; __first != __last; ++__result, (void)++__first)
> +	std::__assign_one<_IsMove>(__result, __first);
> +      return __result;
> +    }
> +#pragma GCC diagnostic pop
> +
>    template<bool _IsMove,
>  	   typename _Tp, typename _Ref, typename _Ptr, typename _OI>
>      _OI
> @@ -537,12 +540,56 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
>  		  const ::__gnu_debug::_Safe_iterator<_IIte, _ISeq, _ICat>&,
>  		  const ::__gnu_debug::_Safe_iterator<_OIte, _OSeq, _OCat>&);
>  
> +#pragma GCC diagnostic push
> +#pragma GCC diagnostic ignored "-Wc++17-extensions" // for if-constexpr
>    template<typename _InputIterator, typename _Size, typename _OutputIterator>
>      _GLIBCXX20_CONSTEXPR
>      _OutputIterator
>      __copy_n_a(_InputIterator __first, _Size __n, _OutputIterator __result,
>  	       bool)
>      {
> +      typedef __decltype(*__first) _InRef;
> +      typedef __decltype(*__result) _OutRef;
> +      if _GLIBCXX_CONSTEXPR (!__is_trivially_assignable(_OutRef, _InRef))
> +	{ } /* Skip the optimizations and use the loop at the end. */
> +#ifdef __cpp_lib_is_constant_evaluated
> +      else if (std::is_constant_evaluated())
> +	{ } /* Skip the optimizations and use the loop at the end. */
> +#endif
> +      else if _GLIBCXX_CONSTEXPR (__memcpyable<_OutputIterator,
> +					       _InputIterator>::__value)
> +	{
> +	  if (__builtin_expect(__n > 1, true))
> +	    {
> +	      __builtin_memmove(_GLIBCXX_TO_ADDR(__result),
> +				_GLIBCXX_TO_ADDR(__first),
> +				__n * sizeof(*__first));
> +	      _GLIBCXX_ADVANCE(__result, __n);
> +	    }
> +	  else if (__n == 1)
> +	    *__result++ = *__first;
> +	  return __result;
> +	}
> +#if __cpp_lib_concepts
> +      else if constexpr (__memcpyable_iterators<_OutputIterator,
> +						_InputIterator>)
> +	{
> +	  if (__n > 1) [[likely]]
> +	    {
> +	      void* __dest = std::to_address(__result);
> +	      const void* __src = std::to_address(__first);
> +	      size_t __nbytes = __n * sizeof(iter_value_t<_InputIterator>);
> +	      // Advance the iterators first, in case doing so throws.
> +	      __result += __n;
> +	      __first += __n;
> +	      __builtin_memmove(__dest, __src, __nbytes);
> +	    }
> +	  else if (__n == 1)
> +	    *__result++ = *__first;
> +	  return __result;
> +	}
> +#endif
> +
>        if (__n > 0)
>  	{
>  	  while (true)
> @@ -557,6 +604,7 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
>  	}
>        return __result;
>      }
> +#pragma GCC diagnostic pop
>  
>  #if _GLIBCXX_HOSTED
>    template<typename _CharT, typename _Size>
> @@ -644,105 +692,69 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
>  #define _GLIBCXX_MOVE3(_Tp, _Up, _Vp) std::copy(_Tp, _Up, _Vp)
>  #endif
>  
> -  template<bool _IsMove, bool _IsSimple, typename _Category>
> -    struct __copy_move_backward
> -    {
> -      template<typename _BI1, typename _BI2>
> -	_GLIBCXX20_CONSTEXPR
> -	static _BI2
> -	__copy_move_b(_BI1 __first, _BI1 __last, _BI2 __result)
> -	{
> -	  while (__first != __last)
> -	    *--__result = *--__last;
> -	  return __result;
> -	}
> -    };
> -
> -#if __cplusplus >= 201103L
> -  template<typename _Category>
> -    struct __copy_move_backward<true, false, _Category>
> -    {
> -      template<typename _BI1, typename _BI2>
> -	_GLIBCXX20_CONSTEXPR
> -	static _BI2
> -	__copy_move_b(_BI1 __first, _BI1 __last, _BI2 __result)
> -	{
> -	  while (__first != __last)
> -	    *--__result = std::move(*--__last);
> -	  return __result;
> -	}
> -    };
> -#endif
> -
> -  template<>
> -    struct __copy_move_backward<false, false, random_access_iterator_tag>
> -    {
> -      template<typename _BI1, typename _BI2>
> -	_GLIBCXX20_CONSTEXPR
> -	static _BI2
> -	__copy_move_b(_BI1 __first, _BI1 __last, _BI2 __result)
> -	{
> -	  typename iterator_traits<_BI1>::difference_type
> -	    __n = __last - __first;
> -	  for (; __n > 0; --__n)
> -	    *--__result = *--__last;
> -	  return __result;
> -	}
> -    };
> -
> -#if __cplusplus >= 201103L
> -  template<>
> -    struct __copy_move_backward<true, false, random_access_iterator_tag>
> -    {
> -      template<typename _BI1, typename _BI2>
> -	_GLIBCXX20_CONSTEXPR
> -	static _BI2
> -	__copy_move_b(_BI1 __first, _BI1 __last, _BI2 __result)
> -	{
> -	  typename iterator_traits<_BI1>::difference_type
> -	    __n = __last - __first;
> -	  for (; __n > 0; --__n)
> -	    *--__result = std::move(*--__last);
> -	  return __result;
> -	}
> -    };
> -#endif
> -
> -  template<bool _IsMove>
> -    struct __copy_move_backward<_IsMove, true, random_access_iterator_tag>
> -    {
> -      template<typename _Tp, typename _Up>
> -	_GLIBCXX20_CONSTEXPR
> -	static _Up*
> -	__copy_move_b(_Tp* __first, _Tp* __last, _Up* __result)
> -	{
> -	  const ptrdiff_t _Num = __last - __first;
> -	  if (__builtin_expect(_Num > 1, true))
> -	    __builtin_memmove(__result - _Num, __first, sizeof(_Tp) * _Num);
> -	  else if (_Num == 1)
> -	    std::__copy_move<_IsMove, false, random_access_iterator_tag>::
> -	      __assign_one(__result - 1, __first);
> -	  return __result - _Num;
> -	}
> -    };
> -
> +#pragma GCC diagnostic push
> +#pragma GCC diagnostic ignored "-Wc++17-extensions"
>    template<bool _IsMove, typename _BI1, typename _BI2>
>      _GLIBCXX20_CONSTEXPR
>      inline _BI2
>      __copy_move_backward_a2(_BI1 __first, _BI1 __last, _BI2 __result)
>      {
> -      typedef typename iterator_traits<_BI1>::iterator_category _Category;
> +      typedef __decltype(*__first) _InRef;
> +      typedef __decltype(*__result) _OutRef;
> +      if _GLIBCXX_CONSTEXPR (!__is_trivially_assignable(_OutRef, _InRef))
> +       { } /* Skip the optimizations and use the loop at the end. */
>  #ifdef __cpp_lib_is_constant_evaluated
> -      if (std::is_constant_evaluated())
> -	return std::__copy_move_backward<_IsMove, false, _Category>::
> -	  __copy_move_b(__first, __last, __result);
> +      else if (std::is_constant_evaluated())
> +       { } /* Skip the optimizations and use the loop at the end. */
>  #endif
> -      return std::__copy_move_backward<_IsMove,
> -				       __memcpyable<_BI2, _BI1>::__value,
> -				       _Category>::__copy_move_b(__first,
> -								 __last,
> -								 __result);
> +      else if _GLIBCXX_CONSTEXPR (__memcpyable<_BI2, _BI1>::__value)
> +	{
> +	  ptrdiff_t __n = std::distance(__first, __last);
> +	  std::advance(__result, -__n);
> +	  if (__builtin_expect(__n > 1, true))
> +	    {
> +	      __builtin_memmove(_GLIBCXX_TO_ADDR(__result),
> +				_GLIBCXX_TO_ADDR(__first),
> +				__n * sizeof(*__first));
> +	    }
> +	  else if (__n == 1)
> +	    std::__assign_one<_IsMove>(__result, __first);
> +	  return __result;
> +	}
> +#if __cpp_lib_concepts
> +      else if constexpr (__memcpyable_iterators<_BI2, _BI1>)
> +	{
> +	  if (auto __n = __last - __first; __n > 1) [[likely]]
> +	    {
> +	      const void* __src = std::to_address(__first);
> +	      // Advance the iterators first, in case doing so throws.
> +	      __result -= __n;
> +	      __first += __n;
> +	      void* __dest = std::to_address(__result);
> +	      size_t __nbytes = __n * sizeof(iter_value_t<_BI1>);
> +	      __builtin_memmove(__dest, __src, __nbytes);
> +	    }
> +	  else if (__n == 1)
> +	    {
> +	      --__result;
> +	      std::__assign_one<_IsMove>(__result, __first);
> +	    }
> +	  return __result;
> +	}
> +#endif
> +
> +      while (__first != __last)
> +	{
> +	  --__last;
> +	  --__result;
> +	  std::__assign_one<_IsMove>(__result, __last);
> +	}
> +      return __result;
>      }
> +#pragma GCC diagnostic pop
> +
> +#undef _GLIBCXX_TO_ADDR
> +#undef _GLIBCXX_ADVANCE
>  
>    template<bool _IsMove, typename _BI1, typename _BI2>
>      _GLIBCXX20_CONSTEXPR
> diff --git a/libstdc++-v3/testsuite/20_util/specialized_algorithms/uninitialized_copy/114817.cc b/libstdc++-v3/testsuite/20_util/specialized_algorithms/uninitialized_copy/114817.cc
> new file mode 100644
> index 00000000000..531b863e143
> --- /dev/null
> +++ b/libstdc++-v3/testsuite/20_util/specialized_algorithms/uninitialized_copy/114817.cc
> @@ -0,0 +1,39 @@
> +// { dg-do run { target c++11 } }
> +
> +// Bug libstdc++/114817 - Wrong codegen for std::copy of
> +// "trivially copyable but not trivially assignable" type
> +
> +#include <memory>
> +#include <testsuite_hooks.h>
> +
> +int constructions = 0;
> +
> +struct NonTrivialCons
> +{
> +  NonTrivialCons() = default;
> +  NonTrivialCons(int v) : val(v) { }
> +  NonTrivialCons(const volatile NonTrivialCons&) = delete;
> +  template<class = void>
> +    NonTrivialCons(const NonTrivialCons& o)
> +    : val(o.val)
> +    {
> +      ++constructions;
> +    }
> +
> +  int val;
> +};
> +
> +static_assert(std::is_trivially_copyable<NonTrivialCons>::value);
> +
> +int main()
> +{
> +  NonTrivialCons src[2]{1, 2};
> +  NonTrivialCons dst[2];
> +#if __cplusplus < 201703L
> +  constructions = 0;
> +#endif
> +  std::uninitialized_copy(src, src+2, dst);
> +  VERIFY( constructions == 2 );
> +  VERIFY( dst[0].val == src[0].val );
> +  VERIFY( dst[1].val == src[1].val );
> +}
> diff --git a/libstdc++-v3/testsuite/20_util/specialized_algorithms/uninitialized_copy_n/114817.cc b/libstdc++-v3/testsuite/20_util/specialized_algorithms/uninitialized_copy_n/114817.cc
> new file mode 100644
> index 00000000000..531b863e143
> --- /dev/null
> +++ b/libstdc++-v3/testsuite/20_util/specialized_algorithms/uninitialized_copy_n/114817.cc
> @@ -0,0 +1,39 @@
> +// { dg-do run { target c++11 } }
> +
> +// Bug libstdc++/114817 - Wrong codegen for std::copy of
> +// "trivially copyable but not trivially assignable" type
> +
> +#include <memory>
> +#include <testsuite_hooks.h>
> +
> +int constructions = 0;
> +
> +struct NonTrivialCons
> +{
> +  NonTrivialCons() = default;
> +  NonTrivialCons(int v) : val(v) { }
> +  NonTrivialCons(const volatile NonTrivialCons&) = delete;
> +  template<class = void>
> +    NonTrivialCons(const NonTrivialCons& o)
> +    : val(o.val)
> +    {
> +      ++constructions;
> +    }
> +
> +  int val;
> +};
> +
> +static_assert(std::is_trivially_copyable<NonTrivialCons>::value);
> +
> +int main()
> +{
> +  NonTrivialCons src[2]{1, 2};
> +  NonTrivialCons dst[2];
> +#if __cplusplus < 201703L
> +  constructions = 0;
> +#endif
> +  std::uninitialized_copy(src, src+2, dst);
> +  VERIFY( constructions == 2 );
> +  VERIFY( dst[0].val == src[0].val );
> +  VERIFY( dst[1].val == src[1].val );
> +}
> diff --git a/libstdc++-v3/testsuite/25_algorithms/copy/114817.cc b/libstdc++-v3/testsuite/25_algorithms/copy/114817.cc
> new file mode 100644
> index 00000000000..b5fcc6bb037
> --- /dev/null
> +++ b/libstdc++-v3/testsuite/25_algorithms/copy/114817.cc
> @@ -0,0 +1,38 @@
> +// { dg-do run { target c++11 } }
> +
> +// Bug libstdc++/114817 - Wrong codegen for std::copy of
> +// "trivially copyable but not trivially assignable" type
> +
> +#include <algorithm>
> +#include <testsuite_hooks.h>
> +
> +int assignments = 0;
> +
> +struct NonTrivialAssignment
> +{
> +  NonTrivialAssignment(int v) : val(v) { }
> +  NonTrivialAssignment(const NonTrivialAssignment&) = default;
> +  void operator=(const volatile NonTrivialAssignment&) = delete;
> +  template<class = void>
> +    NonTrivialAssignment&
> +    operator=(const NonTrivialAssignment& o)
> +    {
> +      ++assignments;
> +      val = o.val;
> +      return *this;
> +    }
> +
> +  int val;
> +};
> +
> +static_assert(std::is_trivially_copyable<NonTrivialAssignment>::value);
> +
> +int main()
> +{
> +  NonTrivialAssignment src[2]{1, 2};
> +  NonTrivialAssignment dst[2]{3, 4};
> +  std::copy(src, src+2, dst);
> +  VERIFY( assignments == 2 );
> +  VERIFY( dst[0].val == src[0].val );
> +  VERIFY( dst[1].val == src[1].val );
> +}
> diff --git a/libstdc++-v3/testsuite/25_algorithms/copy/115444.cc b/libstdc++-v3/testsuite/25_algorithms/copy/115444.cc
> new file mode 100644
> index 00000000000..fa629abea5f
> --- /dev/null
> +++ b/libstdc++-v3/testsuite/25_algorithms/copy/115444.cc
> @@ -0,0 +1,93 @@
> +// { dg-do run }
> +// { dg-require-normal-mode "debug mode checks use operator-(Iter, Iter)" }
> +
> +#include <algorithm>
> +#include <iterator>
> +#include <testsuite_hooks.h>
> +
> +const int g = 0;
> +
> +struct Iter {
> +  typedef long difference_type;
> +  typedef int value_type;
> +  typedef const int& reference;
> +  typedef const int* pointer;
> +#if __cpp_lib_concepts
> +  using iterator_category = std::contiguous_iterator_tag;
> +#else
> +  typedef std::random_access_iterator_tag iterator_category;
> +#endif
> +
> +  Iter(const int* p = 0, const int* limit = 0) : ptr(p), limit(limit) { }
> +
> +  const int& operator*() const {
> +#ifdef __cpp_exceptions
> +    if (ptr == limit)
> +      throw 1;
> +#endif
> +    return *ptr;
> +  }
> +  const int* operator->() const { return ptr; }
> +  const int& operator[](long n) const { return ptr[n]; }
> +
> +  Iter& operator++() { ++ptr; return *this; }
> +  Iter operator++(int) { Iter tmp = *this; ++ptr; return tmp; }
> +  Iter& operator--() { --ptr; return *this; }
> +  Iter operator--(int) { Iter tmp = *this; --ptr; return tmp; }
> +
> +  Iter& operator+=(int n) { ptr += n; return *this; }
> +  Iter& operator-=(int n) { ptr -= n; return *this; }
> +
> +  friend Iter operator+(int n, Iter it) { return it += n; }
> +  friend Iter operator+(Iter it, int n) { return it += n; }
> +  friend Iter operator-(Iter it, int n) { return it -= n; }
> +
> +  bool operator==(const Iter& it) const { return ptr == it.ptr; }
> +  bool operator!=(const Iter& it) const { return ptr != it.ptr; }
> +  bool operator<(const Iter& it) const { return ptr < it.ptr; }
> +  bool operator>(const Iter& it) const { return ptr > it.ptr; }
> +  bool operator<=(const Iter& it) const { return ptr <= it.ptr; }
> +  bool operator>=(const Iter& it) const { return ptr >= it.ptr; }
> +
> +  // std::copy should not need to take the difference between two iterators:
> +  friend int operator-(Iter, Iter) { VERIFY( ! "operator- called" ); }
> +
> +private:
> +  const int* ptr;
> +  const int* limit;
> +};
> +
> +void
> +test_pr115444_no_difference()
> +{
> +  int from = 1;
> +  int to = 0;
> +  Iter iter(&from);
> +  // This should not use operator-(Iter, Iter)
> +  std::copy(iter, iter+1, &to);
> +}
> +
> +void
> +test_pr115444_exceptional()
> +{
> +#if __cpp_exceptions
> +  int from[3] = { 1, 2, 3 };
> +  int to[3] = { -1, -1, -1 };
> +  Iter iter(from, from+2);
> +  try {
> +    std::copy(iter, iter + 3, to);
> +  } catch (int) {
> +  }
> +  // std::copy should exit via exception on third dereference.
> +  // This precludes using memcpy or memmove to optimize the copying.
> +  VERIFY( to[0] == 1 );
> +  VERIFY( to[1] == 2 );
> +  VERIFY( to[2] == -1 );
> +#endif
> +}
> +
> +int main()
> +{
> +  test_pr115444_no_difference();
> +  test_pr115444_exceptional();
> +}
> diff --git a/libstdc++-v3/testsuite/25_algorithms/copy_n/114817.cc b/libstdc++-v3/testsuite/25_algorithms/copy_n/114817.cc
> new file mode 100644
> index 00000000000..09e181f3fd0
> --- /dev/null
> +++ b/libstdc++-v3/testsuite/25_algorithms/copy_n/114817.cc
> @@ -0,0 +1,38 @@
> +// { dg-do run { target c++11 } }
> +
> +// Bug libstdc++/114817 - Wrong codegen for std::copy of
> +// "trivially copyable but not trivially assignable" type
> +
> +#include <algorithm>
> +#include <testsuite_hooks.h>
> +
> +int assignments = 0;
> +
> +struct NonTrivialAssignment
> +{
> +  NonTrivialAssignment(int v) : val(v) { }
> +  NonTrivialAssignment(const NonTrivialAssignment&) = default;
> +  void operator=(const volatile NonTrivialAssignment&) = delete;
> +  template<class = void>
> +    NonTrivialAssignment&
> +    operator=(const NonTrivialAssignment& o)
> +    {
> +      ++assignments;
> +      val = o.val;
> +      return *this;
> +    }
> +
> +  int val;
> +};
> +
> +static_assert(std::is_trivially_copyable<NonTrivialAssignment>::value);
> +
> +int main()
> +{
> +  NonTrivialAssignment src[2]{1, 2};
> +  NonTrivialAssignment dst[2]{3, 4};
> +  std::copy_n(src, 2, dst);
> +  VERIFY( assignments == 2 );
> +  VERIFY( dst[0].val == src[0].val );
> +  VERIFY( dst[1].val == src[1].val );
> +}
> -- 
> 2.46.2
> 
>
Jonathan Wakely Oct. 17, 2024, 7:15 a.m. UTC | #3
On Thu, 17 Oct 2024, 03:04 Patrick Palka, <ppalka@redhat.com> wrote:

> On Tue, 15 Oct 2024, Jonathan Wakely wrote:
>
> > This is a slightly different approach to C++98 compatibility than used
> > in patch 1/1 of this series for the uninitialized algos. It worked out a
> > bit cleaner this way for these algos, I think.
> >
> > Tested x86_64-linux.
> >
> > -- >8 --
> >
> > This removes all the __copy_move class template specializations that
> > decide how to optimize std::copy and std::copy_n. We can inline those
> > optimizations into the algorithms, using if-constexpr (and macros for
> > C++98 compatibility) and remove the code dispatching to the various
> > class template specializations.
> >
> > Doing this means we implement the optimization directly for std::copy_n
> > instead of deferring to std::copy, That avoids the unwanted consequence
> > of advancing the iterator in copy_n only to take the difference later to
> > get back to the length that we already had in copy_n originally (as
> > described in PR 115444).
> >
> > With the new flattened implementations, we can also lower contiguous
> > iterators to pointers in std::copy/std::copy_n/std::copy_backwards, so
> > that they benefit from the same memmove optimizations as pointers.
> > There's a subtlety though: contiguous iterators can potentially throw
> > exceptions to exit the algorithm early.  So we can only transform the
> > loop to memmove if dereferencing the iterator is noexcept. We don't
> > check that incrementing the iterator is noexcept because we advance the
> > contiguous iterators before using memmove, so that if incrementing would
> > throw, that happens first. I am writing a proposal (P3249R0) which would
> > make this unnecessary, so I hope we can drop the nothrow requirements
> > later.
> >
> > This change also solves PR 114817 by checking is_trivially_assignable
> > before optimizing copy/copy_n etc. to memmove. It's not enough to check
> > that the types are trivially copyable (a precondition for using memmove
> > at all), we also need to check that the specific assignment that would
> > be performed by the algorithm is also trivial. Replacing a non-trivial
> > assignment with memmove would be observable, so not allowed.
> >
> > libstdc++-v3/ChangeLog:
> >
> >       PR libstdc++/115444
> >       PR libstdc++/114817
> >       * include/bits/stl_algo.h (__copy_n): Remove generic overload
> >       and overload for random access iterators.
> >       (copy_n): Inline generic version of __copy_n here. Do not defer
> >       to std::copy for random access iterators.
> >       * include/bits/stl_algobase.h (__copy_move): Remove.
> >       (__nothrow_contiguous_iterator, __memcpyable_iterators): New
> >       concepts.
> >       (__assign_one, _GLIBCXX_TO_ADDR, _GLIBCXX_ADVANCE): New helpers.
> >       (__copy_move_a2): Inline __copy_move logic and conditional
> >       memmove optimization into the most generic overload.
> >       (__copy_n_a): Likewise.
> >       (__copy_move_backward): Remove.
> >       (__copy_move_backward_a2): Inline __copy_move_backward logic and
> >       memmove optimization into the most generic overload.
> >       *
> testsuite/20_util/specialized_algorithms/uninitialized_copy/114817.cc:
> >       New test.
> >       *
> testsuite/20_util/specialized_algorithms/uninitialized_copy_n/114817.cc:
> >       New test.
> >       * testsuite/25_algorithms/copy/114817.cc: New test.
> >       * testsuite/25_algorithms/copy/115444.cc: New test.
> >       * testsuite/25_algorithms/copy_n/114817.cc: New test.
> > ---
> >  libstdc++-v3/include/bits/stl_algo.h          |  24 +-
> >  libstdc++-v3/include/bits/stl_algobase.h      | 426 +++++++++---------
> >  .../uninitialized_copy/114817.cc              |  39 ++
> >  .../uninitialized_copy_n/114817.cc            |  39 ++
> >  .../testsuite/25_algorithms/copy/114817.cc    |  38 ++
> >  .../testsuite/25_algorithms/copy/115444.cc    |  93 ++++
> >  .../testsuite/25_algorithms/copy_n/114817.cc  |  38 ++
> >  7 files changed, 469 insertions(+), 228 deletions(-)
> >  create mode 100644
> libstdc++-v3/testsuite/20_util/specialized_algorithms/uninitialized_copy/114817.cc
> >  create mode 100644
> libstdc++-v3/testsuite/20_util/specialized_algorithms/uninitialized_copy_n/114817.cc
> >  create mode 100644 libstdc++-v3/testsuite/25_algorithms/copy/114817.cc
> >  create mode 100644 libstdc++-v3/testsuite/25_algorithms/copy/115444.cc
> >  create mode 100644 libstdc++-v3/testsuite/25_algorithms/copy_n/114817.cc
> >
> > diff --git a/libstdc++-v3/include/bits/stl_algo.h
> b/libstdc++-v3/include/bits/stl_algo.h
> > index a1ef665506d..489ce7e14d2 100644
> > --- a/libstdc++-v3/include/bits/stl_algo.h
> > +++ b/libstdc++-v3/include/bits/stl_algo.h
> > @@ -665,25 +665,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> >        return __result;
> >      }
> >
> > -  template<typename _InputIterator, typename _Size, typename
> _OutputIterator>
> > -    _GLIBCXX20_CONSTEXPR
> > -    _OutputIterator
> > -    __copy_n(_InputIterator __first, _Size __n,
> > -          _OutputIterator __result, input_iterator_tag)
> > -    {
> > -      return std::__niter_wrap(__result,
> > -                            __copy_n_a(__first, __n,
> > -                                       std::__niter_base(__result),
> true));
> > -    }
> > -
> > -  template<typename _RandomAccessIterator, typename _Size,
> > -        typename _OutputIterator>
> > -    _GLIBCXX20_CONSTEXPR
> > -    inline _OutputIterator
> > -    __copy_n(_RandomAccessIterator __first, _Size __n,
> > -          _OutputIterator __result, random_access_iterator_tag)
> > -    { return std::copy(__first, __first + __n, __result); }
> > -
> >    /**
> >     *  @brief Copies the range [first,first+n) into [result,result+n).
> >     *  @ingroup mutating_algorithms
> > @@ -714,8 +695,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> >        __glibcxx_requires_can_increment(__first, __n2);
> >        __glibcxx_requires_can_increment(__result, __n2);
> >
> > -      return std::__copy_n(__first, __n2, __result,
> > -                        std::__iterator_category(__first));
> > +      auto __res = std::__copy_n_a(std::__niter_base(__first), __n2,
> > +                                std::__niter_base(__result), true);
> > +      return std::__niter_wrap(__result, std::move(__res));
> >      }
> >
> >    /**
> > diff --git a/libstdc++-v3/include/bits/stl_algobase.h
> b/libstdc++-v3/include/bits/stl_algobase.h
> > index 751b7ad119b..5f77b00be9b 100644
> > --- a/libstdc++-v3/include/bits/stl_algobase.h
> > +++ b/libstdc++-v3/include/bits/stl_algobase.h
> > @@ -77,6 +77,7 @@
> >  #endif
> >  #if __cplusplus >= 202002L
> >  # include <compare>
> > +# include <bits/ptr_traits.h> // std::to_address
> >  #endif
> >
> >  namespace std _GLIBCXX_VISIBILITY(default)
> > @@ -308,110 +309,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> >        return __a;
> >      }
> >
> > -  // All of these auxiliary structs serve two purposes.  (1) Replace
> > -  // calls to copy with memmove whenever possible.  (Memmove, not
> memcpy,
> > -  // because the input and output ranges are permitted to overlap.)
> > -  // (2) If we're using random access iterators, then write the loop as
> > -  // a for loop with an explicit count.
> > -
> > -  template<bool _IsMove, bool _IsSimple, typename _Category>
> > -    struct __copy_move
> > -    {
> > -      template<typename _II, typename _OI>
> > -     _GLIBCXX20_CONSTEXPR
> > -     static _OI
> > -     __copy_m(_II __first, _II __last, _OI __result)
> > -     {
> > -       for (; __first != __last; ++__result, (void)++__first)
> > -         *__result = *__first;
> > -       return __result;
> > -     }
> > -    };
> > -
> > -#if __cplusplus >= 201103L
> > -  template<typename _Category>
> > -    struct __copy_move<true, false, _Category>
> > -    {
> > -      template<typename _II, typename _OI>
> > -     _GLIBCXX20_CONSTEXPR
> > -     static _OI
> > -     __copy_m(_II __first, _II __last, _OI __result)
> > -     {
> > -       for (; __first != __last; ++__result, (void)++__first)
> > -         *__result = std::move(*__first);
> > -       return __result;
> > -     }
> > -    };
> > -#endif
> > -
> > -  template<>
> > -    struct __copy_move<false, false, random_access_iterator_tag>
> > -    {
> > -      template<typename _II, typename _OI>
> > -     _GLIBCXX20_CONSTEXPR
> > -     static _OI
> > -     __copy_m(_II __first, _II __last, _OI __result)
> > -     {
> > -       typedef typename iterator_traits<_II>::difference_type _Distance;
> > -       for(_Distance __n = __last - __first; __n > 0; --__n)
> > -         {
> > -           *__result = *__first;
> > -           ++__first;
> > -           ++__result;
> > -         }
> > -       return __result;
> > -     }
> > -
> > -      template<typename _Tp, typename _Up>
> > -     static void
> > -     __assign_one(_Tp* __to, _Up* __from)
> > -     { *__to = *__from; }
> > -    };
> > -
> > -#if __cplusplus >= 201103L
> > -  template<>
> > -    struct __copy_move<true, false, random_access_iterator_tag>
> > -    {
> > -      template<typename _II, typename _OI>
> > -     _GLIBCXX20_CONSTEXPR
> > -     static _OI
> > -     __copy_m(_II __first, _II __last, _OI __result)
> > -     {
> > -       typedef typename iterator_traits<_II>::difference_type _Distance;
> > -       for(_Distance __n = __last - __first; __n > 0; --__n)
> > -         {
> > -           *__result = std::move(*__first);
> > -           ++__first;
> > -           ++__result;
> > -         }
> > -       return __result;
> > -     }
> > -
> > -      template<typename _Tp, typename _Up>
> > -     static void
> > -     __assign_one(_Tp* __to, _Up* __from)
> > -     { *__to = std::move(*__from); }
> > -    };
> > -#endif
> > -
> > -  template<bool _IsMove>
> > -    struct __copy_move<_IsMove, true, random_access_iterator_tag>
> > -    {
> > -      template<typename _Tp, typename _Up>
> > -     _GLIBCXX20_CONSTEXPR
> > -     static _Up*
> > -     __copy_m(_Tp* __first, _Tp* __last, _Up* __result)
> > -     {
> > -       const ptrdiff_t _Num = __last - __first;
> > -       if (__builtin_expect(_Num > 1, true))
> > -         __builtin_memmove(__result, __first, sizeof(_Tp) * _Num);
> > -       else if (_Num == 1)
> > -         std::__copy_move<_IsMove, false, random_access_iterator_tag>::
> > -           __assign_one(__result, __first);
> > -       return __result + _Num;
> > -     }
> > -    };
> > -
> >  _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
> >
> >    template<typename _Tp, typename _Ref, typename _Ptr>
> > @@ -461,21 +358,127 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
> >       _GLIBCXX_STD_C::_Deque_iterator<_CharT, _CharT&, _CharT*>);
> >  #endif // HOSTED
> >
> > -  template<bool _IsMove, typename _II, typename _OI>
> > -    _GLIBCXX20_CONSTEXPR
> > -    inline _OI
> > -    __copy_move_a2(_II __first, _II __last, _OI __result)
> > -    {
> > -      typedef typename iterator_traits<_II>::iterator_category
> _Category;
> > -#ifdef __cpp_lib_is_constant_evaluated
> > -      if (std::is_constant_evaluated())
> > -     return std::__copy_move<_IsMove, false, _Category>::
> > -       __copy_m(__first, __last, __result);
> > +#if __cpp_lib_concepts
> > +  // N.B. this is not the same as nothrow-forward-iterator, which
> doesn't
> > +  // require noexcept operations, it just says it's undefined if they
> throw.
> > +  // Here we require them to be actually noexcept.
> > +  template<typename _Iter>
> > +    concept __nothrow_contiguous_iterator
> > +      = contiguous_iterator<_Iter> && requires (_Iter __i) {
> > +     // If this operation can throw then the iterator could cause
> > +     // the algorithm to exit early via an exception, in which case
> > +     // we can't use memcpy.
> > +     { *__i } noexcept;
> > +      };
> > +
> > +  template<typename _OutIter, typename _InIter, typename _Sent =
> _InIter>
> > +    concept __memcpyable_iterators
> > +      = __nothrow_contiguous_iterator<_OutIter>
> > +       && __nothrow_contiguous_iterator<_InIter>
> > +       && sized_sentinel_for<_Sent, _InIter>
> > +       && requires (_OutIter __o, _InIter __i, _Sent __s) {
> > +         requires !!__memcpyable<decltype(std::to_address(__o)),
> > +
>  decltype(std::to_address(__i))>::__value;
> > +         { __i != __s } noexcept;
> > +       };
> >  #endif
> > -      return std::__copy_move<_IsMove, __memcpyable<_OI, _II>::__value,
> > -                           _Category>::__copy_m(__first, __last,
> __result);
> > +
> > +#if __cplusplus < 201103L
> > +  // Used by __copy_move_a2, __copy_n_a and __copy_move_backward_a2 to
> > +  // get raw pointers so that calls to __builtin_memmove will compile,
> > +  // because C++98 can't use 'if constexpr' so statements that use
> memmove
> > +  // with pointer arguments need to also compile for arbitrary iterator
> types.
> > +  template<typename _Iter> __attribute__((__always_inline__))
> > +    inline void* __ptr_or_null(_Iter) { return 0; }
> > +  template<typename _Tp> __attribute__((__always_inline__))
> > +  inline void* __ptr_or_null(_Tp* __p) { return (void*)__p; }
> > +# define _GLIBCXX_TO_ADDR(P) std::__ptr_or_null(P)
> > +  // Used to advance output iterators (std::advance requires
> InputIterator).
> > +  template<typename _Iter> __attribute__((__always_inline__))
> > +    inline void __ptr_advance(_Iter&, ptrdiff_t) { }
> > +  template<typename _Tp> __attribute__((__always_inline__))
> > +    inline void __ptr_advance(_Tp*& __p, ptrdiff_t __n) { __p += __n; }
> > +# define _GLIBCXX_ADVANCE(P, N) std::__ptr_advance(P, N)
> > +#else
> > +  // For C++11 mode the __builtin_memmove calls are guarded by 'if
> constexpr'
> > +  // so we know the iterators used with memmove are guaranteed to be
> pointers.
> > +# define _GLIBCXX_TO_ADDR(P) P
> > +# define _GLIBCXX_ADVANCE(P, N) P += N
> > +#endif
> > +
> > +#pragma GCC diagnostic push
> > +#pragma GCC diagnostic ignored "-Wc++17-extensions"
> > +  template<bool _IsMove, typename _OutIter, typename _InIter>
> > +    __attribute__((__always_inline__)) _GLIBCXX20_CONSTEXPR
> > +    inline void
> > +    __assign_one(_OutIter& __out, _InIter& __in)
> > +    {
> > +#if __cplusplus >= 201103L
> > +      if constexpr (_IsMove)
> > +     *__out = std::move(*__in);
> > +      else
> > +#endif
> > +     *__out = *__in;
> >      }
> >
> > +  template<bool _IsMove, typename _InIter, typename _Sent, typename
> _OutIter>
> > +    _GLIBCXX20_CONSTEXPR
> > +    inline _OutIter
> > +    __copy_move_a2(_InIter __first, _Sent __last, _OutIter __result)
> > +    {
> > +      typedef __decltype(*__first) _InRef;
> > +      typedef __decltype(*__result) _OutRef;
> > +      if _GLIBCXX_CONSTEXPR (!__is_trivially_assignable(_OutRef,
> _InRef))
> > +     { } /* Skip the optimizations and use the loop at the end. */
> > +#ifdef __cpp_lib_is_constant_evaluated
> > +      else if (std::is_constant_evaluated())
>
> Maybe check std::__is_constant_evaluated() instead since it's always
> defined?
>


For some reason I thought it wasn't defined for C++98, but it is, and it's
always_inline so won't add any runtime overhead.



> > +     { } /* Skip the optimizations and use the loop at the end. */
> > +#endif
> > +      else if _GLIBCXX_CONSTEXPR (__memcpyable<_OutIter,
> _InIter>::__value)
> > +     {
> > +       ptrdiff_t __n = std::distance(__first, __last);
> > +       if (__builtin_expect(__n > 1, true))
>
> I guess [[__likely__]] can't be used here since this is a C++98 code
> path.
>

Yes. I think Jason recently made [[...]] attributes work in C++98 mode, but
I don't know if clang supports that.



> LGTM
>
> > +         {
> > +           __builtin_memmove(_GLIBCXX_TO_ADDR(__result),
> > +                             _GLIBCXX_TO_ADDR(__first),
> > +                             __n * sizeof(*__first));
> > +           _GLIBCXX_ADVANCE(__result, __n);
> > +         }
> > +       else if (__n == 1)
> > +         {
> > +           std::__assign_one<_IsMove>(__result, __first);
> > +           ++__result;
> > +         }
> > +       return __result;
> > +     }
> > +#if __cpp_lib_concepts
> > +      else if constexpr (__memcpyable_iterators<_OutIter, _InIter,
> _Sent>)
> > +     {
> > +       if (auto __n = __last - __first; __n > 1) [[likely]]
> > +         {
> > +           void* __dest = std::to_address(__result);
> > +           const void* __src = std::to_address(__first);
> > +           size_t __nbytes = __n * sizeof(iter_value_t<_InIter>);
> > +           // Advance the iterators first, in case doing so throws.
> > +           __result += __n;
> > +           __first += __n;
> > +           __builtin_memmove(__dest, __src, __nbytes);
> > +         }
> > +       else if (__n == 1)
> > +         {
> > +           std::__assign_one<_IsMove>(__result, __first);
> > +           ++__result;
> > +         }
> > +       return __result;
> > +     }
> > +#endif
> > +
> > +      for (; __first != __last; ++__result, (void)++__first)
> > +     std::__assign_one<_IsMove>(__result, __first);
> > +      return __result;
> > +    }
> > +#pragma GCC diagnostic pop
> > +
> >    template<bool _IsMove,
> >          typename _Tp, typename _Ref, typename _Ptr, typename _OI>
> >      _OI
> > @@ -537,12 +540,56 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
> >                 const ::__gnu_debug::_Safe_iterator<_IIte, _ISeq,
> _ICat>&,
> >                 const ::__gnu_debug::_Safe_iterator<_OIte, _OSeq,
> _OCat>&);
> >
> > +#pragma GCC diagnostic push
> > +#pragma GCC diagnostic ignored "-Wc++17-extensions" // for if-constexpr
> >    template<typename _InputIterator, typename _Size, typename
> _OutputIterator>
> >      _GLIBCXX20_CONSTEXPR
> >      _OutputIterator
> >      __copy_n_a(_InputIterator __first, _Size __n, _OutputIterator
> __result,
> >              bool)
> >      {
> > +      typedef __decltype(*__first) _InRef;
> > +      typedef __decltype(*__result) _OutRef;
> > +      if _GLIBCXX_CONSTEXPR (!__is_trivially_assignable(_OutRef,
> _InRef))
> > +     { } /* Skip the optimizations and use the loop at the end. */
> > +#ifdef __cpp_lib_is_constant_evaluated
> > +      else if (std::is_constant_evaluated())
> > +     { } /* Skip the optimizations and use the loop at the end. */
> > +#endif
> > +      else if _GLIBCXX_CONSTEXPR (__memcpyable<_OutputIterator,
> > +                                            _InputIterator>::__value)
> > +     {
> > +       if (__builtin_expect(__n > 1, true))
> > +         {
> > +           __builtin_memmove(_GLIBCXX_TO_ADDR(__result),
> > +                             _GLIBCXX_TO_ADDR(__first),
> > +                             __n * sizeof(*__first));
> > +           _GLIBCXX_ADVANCE(__result, __n);
> > +         }
> > +       else if (__n == 1)
> > +         *__result++ = *__first;
> > +       return __result;
> > +     }
> > +#if __cpp_lib_concepts
> > +      else if constexpr (__memcpyable_iterators<_OutputIterator,
> > +                                             _InputIterator>)
> > +     {
> > +       if (__n > 1) [[likely]]
> > +         {
> > +           void* __dest = std::to_address(__result);
> > +           const void* __src = std::to_address(__first);
> > +           size_t __nbytes = __n * sizeof(iter_value_t<_InputIterator>);
> > +           // Advance the iterators first, in case doing so throws.
> > +           __result += __n;
> > +           __first += __n;
> > +           __builtin_memmove(__dest, __src, __nbytes);
> > +         }
> > +       else if (__n == 1)
> > +         *__result++ = *__first;
> > +       return __result;
> > +     }
> > +#endif
> > +
> >        if (__n > 0)
> >       {
> >         while (true)
> > @@ -557,6 +604,7 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
> >       }
> >        return __result;
> >      }
> > +#pragma GCC diagnostic pop
> >
> >  #if _GLIBCXX_HOSTED
> >    template<typename _CharT, typename _Size>
> > @@ -644,105 +692,69 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
> >  #define _GLIBCXX_MOVE3(_Tp, _Up, _Vp) std::copy(_Tp, _Up, _Vp)
> >  #endif
> >
> > -  template<bool _IsMove, bool _IsSimple, typename _Category>
> > -    struct __copy_move_backward
> > -    {
> > -      template<typename _BI1, typename _BI2>
> > -     _GLIBCXX20_CONSTEXPR
> > -     static _BI2
> > -     __copy_move_b(_BI1 __first, _BI1 __last, _BI2 __result)
> > -     {
> > -       while (__first != __last)
> > -         *--__result = *--__last;
> > -       return __result;
> > -     }
> > -    };
> > -
> > -#if __cplusplus >= 201103L
> > -  template<typename _Category>
> > -    struct __copy_move_backward<true, false, _Category>
> > -    {
> > -      template<typename _BI1, typename _BI2>
> > -     _GLIBCXX20_CONSTEXPR
> > -     static _BI2
> > -     __copy_move_b(_BI1 __first, _BI1 __last, _BI2 __result)
> > -     {
> > -       while (__first != __last)
> > -         *--__result = std::move(*--__last);
> > -       return __result;
> > -     }
> > -    };
> > -#endif
> > -
> > -  template<>
> > -    struct __copy_move_backward<false, false,
> random_access_iterator_tag>
> > -    {
> > -      template<typename _BI1, typename _BI2>
> > -     _GLIBCXX20_CONSTEXPR
> > -     static _BI2
> > -     __copy_move_b(_BI1 __first, _BI1 __last, _BI2 __result)
> > -     {
> > -       typename iterator_traits<_BI1>::difference_type
> > -         __n = __last - __first;
> > -       for (; __n > 0; --__n)
> > -         *--__result = *--__last;
> > -       return __result;
> > -     }
> > -    };
> > -
> > -#if __cplusplus >= 201103L
> > -  template<>
> > -    struct __copy_move_backward<true, false, random_access_iterator_tag>
> > -    {
> > -      template<typename _BI1, typename _BI2>
> > -     _GLIBCXX20_CONSTEXPR
> > -     static _BI2
> > -     __copy_move_b(_BI1 __first, _BI1 __last, _BI2 __result)
> > -     {
> > -       typename iterator_traits<_BI1>::difference_type
> > -         __n = __last - __first;
> > -       for (; __n > 0; --__n)
> > -         *--__result = std::move(*--__last);
> > -       return __result;
> > -     }
> > -    };
> > -#endif
> > -
> > -  template<bool _IsMove>
> > -    struct __copy_move_backward<_IsMove, true,
> random_access_iterator_tag>
> > -    {
> > -      template<typename _Tp, typename _Up>
> > -     _GLIBCXX20_CONSTEXPR
> > -     static _Up*
> > -     __copy_move_b(_Tp* __first, _Tp* __last, _Up* __result)
> > -     {
> > -       const ptrdiff_t _Num = __last - __first;
> > -       if (__builtin_expect(_Num > 1, true))
> > -         __builtin_memmove(__result - _Num, __first, sizeof(_Tp) *
> _Num);
> > -       else if (_Num == 1)
> > -         std::__copy_move<_IsMove, false, random_access_iterator_tag>::
> > -           __assign_one(__result - 1, __first);
> > -       return __result - _Num;
> > -     }
> > -    };
> > -
> > +#pragma GCC diagnostic push
> > +#pragma GCC diagnostic ignored "-Wc++17-extensions"
> >    template<bool _IsMove, typename _BI1, typename _BI2>
> >      _GLIBCXX20_CONSTEXPR
> >      inline _BI2
> >      __copy_move_backward_a2(_BI1 __first, _BI1 __last, _BI2 __result)
> >      {
> > -      typedef typename iterator_traits<_BI1>::iterator_category
> _Category;
> > +      typedef __decltype(*__first) _InRef;
> > +      typedef __decltype(*__result) _OutRef;
> > +      if _GLIBCXX_CONSTEXPR (!__is_trivially_assignable(_OutRef,
> _InRef))
> > +       { } /* Skip the optimizations and use the loop at the end. */
> >  #ifdef __cpp_lib_is_constant_evaluated
> > -      if (std::is_constant_evaluated())
> > -     return std::__copy_move_backward<_IsMove, false, _Category>::
> > -       __copy_move_b(__first, __last, __result);
> > +      else if (std::is_constant_evaluated())
> > +       { } /* Skip the optimizations and use the loop at the end. */
> >  #endif
> > -      return std::__copy_move_backward<_IsMove,
> > -                                    __memcpyable<_BI2, _BI1>::__value,
> > -                                    _Category>::__copy_move_b(__first,
> > -                                                              __last,
> > -                                                              __result);
> > +      else if _GLIBCXX_CONSTEXPR (__memcpyable<_BI2, _BI1>::__value)
> > +     {
> > +       ptrdiff_t __n = std::distance(__first, __last);
> > +       std::advance(__result, -__n);
> > +       if (__builtin_expect(__n > 1, true))
> > +         {
> > +           __builtin_memmove(_GLIBCXX_TO_ADDR(__result),
> > +                             _GLIBCXX_TO_ADDR(__first),
> > +                             __n * sizeof(*__first));
> > +         }
> > +       else if (__n == 1)
> > +         std::__assign_one<_IsMove>(__result, __first);
> > +       return __result;
> > +     }
> > +#if __cpp_lib_concepts
> > +      else if constexpr (__memcpyable_iterators<_BI2, _BI1>)
> > +     {
> > +       if (auto __n = __last - __first; __n > 1) [[likely]]
> > +         {
> > +           const void* __src = std::to_address(__first);
> > +           // Advance the iterators first, in case doing so throws.
> > +           __result -= __n;
> > +           __first += __n;
> > +           void* __dest = std::to_address(__result);
> > +           size_t __nbytes = __n * sizeof(iter_value_t<_BI1>);
> > +           __builtin_memmove(__dest, __src, __nbytes);
> > +         }
> > +       else if (__n == 1)
> > +         {
> > +           --__result;
> > +           std::__assign_one<_IsMove>(__result, __first);
> > +         }
> > +       return __result;
> > +     }
> > +#endif
> > +
> > +      while (__first != __last)
> > +     {
> > +       --__last;
> > +       --__result;
> > +       std::__assign_one<_IsMove>(__result, __last);
> > +     }
> > +      return __result;
> >      }
> > +#pragma GCC diagnostic pop
> > +
> > +#undef _GLIBCXX_TO_ADDR
> > +#undef _GLIBCXX_ADVANCE
> >
> >    template<bool _IsMove, typename _BI1, typename _BI2>
> >      _GLIBCXX20_CONSTEXPR
> > diff --git
> a/libstdc++-v3/testsuite/20_util/specialized_algorithms/uninitialized_copy/114817.cc
> b/libstdc++-v3/testsuite/20_util/specialized_algorithms/uninitialized_copy/114817.cc
> > new file mode 100644
> > index 00000000000..531b863e143
> > --- /dev/null
> > +++
> b/libstdc++-v3/testsuite/20_util/specialized_algorithms/uninitialized_copy/114817.cc
> > @@ -0,0 +1,39 @@
> > +// { dg-do run { target c++11 } }
> > +
> > +// Bug libstdc++/114817 - Wrong codegen for std::copy of
> > +// "trivially copyable but not trivially assignable" type
> > +
> > +#include <memory>
> > +#include <testsuite_hooks.h>
> > +
> > +int constructions = 0;
> > +
> > +struct NonTrivialCons
> > +{
> > +  NonTrivialCons() = default;
> > +  NonTrivialCons(int v) : val(v) { }
> > +  NonTrivialCons(const volatile NonTrivialCons&) = delete;
> > +  template<class = void>
> > +    NonTrivialCons(const NonTrivialCons& o)
> > +    : val(o.val)
> > +    {
> > +      ++constructions;
> > +    }
> > +
> > +  int val;
> > +};
> > +
> > +static_assert(std::is_trivially_copyable<NonTrivialCons>::value);
> > +
> > +int main()
> > +{
> > +  NonTrivialCons src[2]{1, 2};
> > +  NonTrivialCons dst[2];
> > +#if __cplusplus < 201703L
> > +  constructions = 0;
> > +#endif
> > +  std::uninitialized_copy(src, src+2, dst);
> > +  VERIFY( constructions == 2 );
> > +  VERIFY( dst[0].val == src[0].val );
> > +  VERIFY( dst[1].val == src[1].val );
> > +}
> > diff --git
> a/libstdc++-v3/testsuite/20_util/specialized_algorithms/uninitialized_copy_n/114817.cc
> b/libstdc++-v3/testsuite/20_util/specialized_algorithms/uninitialized_copy_n/114817.cc
> > new file mode 100644
> > index 00000000000..531b863e143
> > --- /dev/null
> > +++
> b/libstdc++-v3/testsuite/20_util/specialized_algorithms/uninitialized_copy_n/114817.cc
> > @@ -0,0 +1,39 @@
> > +// { dg-do run { target c++11 } }
> > +
> > +// Bug libstdc++/114817 - Wrong codegen for std::copy of
> > +// "trivially copyable but not trivially assignable" type
> > +
> > +#include <memory>
> > +#include <testsuite_hooks.h>
> > +
> > +int constructions = 0;
> > +
> > +struct NonTrivialCons
> > +{
> > +  NonTrivialCons() = default;
> > +  NonTrivialCons(int v) : val(v) { }
> > +  NonTrivialCons(const volatile NonTrivialCons&) = delete;
> > +  template<class = void>
> > +    NonTrivialCons(const NonTrivialCons& o)
> > +    : val(o.val)
> > +    {
> > +      ++constructions;
> > +    }
> > +
> > +  int val;
> > +};
> > +
> > +static_assert(std::is_trivially_copyable<NonTrivialCons>::value);
> > +
> > +int main()
> > +{
> > +  NonTrivialCons src[2]{1, 2};
> > +  NonTrivialCons dst[2];
> > +#if __cplusplus < 201703L
> > +  constructions = 0;
> > +#endif
> > +  std::uninitialized_copy(src, src+2, dst);
> > +  VERIFY( constructions == 2 );
> > +  VERIFY( dst[0].val == src[0].val );
> > +  VERIFY( dst[1].val == src[1].val );
> > +}
> > diff --git a/libstdc++-v3/testsuite/25_algorithms/copy/114817.cc
> b/libstdc++-v3/testsuite/25_algorithms/copy/114817.cc
> > new file mode 100644
> > index 00000000000..b5fcc6bb037
> > --- /dev/null
> > +++ b/libstdc++-v3/testsuite/25_algorithms/copy/114817.cc
> > @@ -0,0 +1,38 @@
> > +// { dg-do run { target c++11 } }
> > +
> > +// Bug libstdc++/114817 - Wrong codegen for std::copy of
> > +// "trivially copyable but not trivially assignable" type
> > +
> > +#include <algorithm>
> > +#include <testsuite_hooks.h>
> > +
> > +int assignments = 0;
> > +
> > +struct NonTrivialAssignment
> > +{
> > +  NonTrivialAssignment(int v) : val(v) { }
> > +  NonTrivialAssignment(const NonTrivialAssignment&) = default;
> > +  void operator=(const volatile NonTrivialAssignment&) = delete;
> > +  template<class = void>
> > +    NonTrivialAssignment&
> > +    operator=(const NonTrivialAssignment& o)
> > +    {
> > +      ++assignments;
> > +      val = o.val;
> > +      return *this;
> > +    }
> > +
> > +  int val;
> > +};
> > +
> > +static_assert(std::is_trivially_copyable<NonTrivialAssignment>::value);
> > +
> > +int main()
> > +{
> > +  NonTrivialAssignment src[2]{1, 2};
> > +  NonTrivialAssignment dst[2]{3, 4};
> > +  std::copy(src, src+2, dst);
> > +  VERIFY( assignments == 2 );
> > +  VERIFY( dst[0].val == src[0].val );
> > +  VERIFY( dst[1].val == src[1].val );
> > +}
> > diff --git a/libstdc++-v3/testsuite/25_algorithms/copy/115444.cc
> b/libstdc++-v3/testsuite/25_algorithms/copy/115444.cc
> > new file mode 100644
> > index 00000000000..fa629abea5f
> > --- /dev/null
> > +++ b/libstdc++-v3/testsuite/25_algorithms/copy/115444.cc
> > @@ -0,0 +1,93 @@
> > +// { dg-do run }
> > +// { dg-require-normal-mode "debug mode checks use operator-(Iter,
> Iter)" }
> > +
> > +#include <algorithm>
> > +#include <iterator>
> > +#include <testsuite_hooks.h>
> > +
> > +const int g = 0;
> > +
> > +struct Iter {
> > +  typedef long difference_type;
> > +  typedef int value_type;
> > +  typedef const int& reference;
> > +  typedef const int* pointer;
> > +#if __cpp_lib_concepts
> > +  using iterator_category = std::contiguous_iterator_tag;
> > +#else
> > +  typedef std::random_access_iterator_tag iterator_category;
> > +#endif
> > +
> > +  Iter(const int* p = 0, const int* limit = 0) : ptr(p), limit(limit) {
> }
> > +
> > +  const int& operator*() const {
> > +#ifdef __cpp_exceptions
> > +    if (ptr == limit)
> > +      throw 1;
> > +#endif
> > +    return *ptr;
> > +  }
> > +  const int* operator->() const { return ptr; }
> > +  const int& operator[](long n) const { return ptr[n]; }
> > +
> > +  Iter& operator++() { ++ptr; return *this; }
> > +  Iter operator++(int) { Iter tmp = *this; ++ptr; return tmp; }
> > +  Iter& operator--() { --ptr; return *this; }
> > +  Iter operator--(int) { Iter tmp = *this; --ptr; return tmp; }
> > +
> > +  Iter& operator+=(int n) { ptr += n; return *this; }
> > +  Iter& operator-=(int n) { ptr -= n; return *this; }
> > +
> > +  friend Iter operator+(int n, Iter it) { return it += n; }
> > +  friend Iter operator+(Iter it, int n) { return it += n; }
> > +  friend Iter operator-(Iter it, int n) { return it -= n; }
> > +
> > +  bool operator==(const Iter& it) const { return ptr == it.ptr; }
> > +  bool operator!=(const Iter& it) const { return ptr != it.ptr; }
> > +  bool operator<(const Iter& it) const { return ptr < it.ptr; }
> > +  bool operator>(const Iter& it) const { return ptr > it.ptr; }
> > +  bool operator<=(const Iter& it) const { return ptr <= it.ptr; }
> > +  bool operator>=(const Iter& it) const { return ptr >= it.ptr; }
> > +
> > +  // std::copy should not need to take the difference between two
> iterators:
> > +  friend int operator-(Iter, Iter) { VERIFY( ! "operator- called" ); }
> > +
> > +private:
> > +  const int* ptr;
> > +  const int* limit;
> > +};
> > +
> > +void
> > +test_pr115444_no_difference()
> > +{
> > +  int from = 1;
> > +  int to = 0;
> > +  Iter iter(&from);
> > +  // This should not use operator-(Iter, Iter)
> > +  std::copy(iter, iter+1, &to);
> > +}
> > +
> > +void
> > +test_pr115444_exceptional()
> > +{
> > +#if __cpp_exceptions
> > +  int from[3] = { 1, 2, 3 };
> > +  int to[3] = { -1, -1, -1 };
> > +  Iter iter(from, from+2);
> > +  try {
> > +    std::copy(iter, iter + 3, to);
> > +  } catch (int) {
> > +  }
> > +  // std::copy should exit via exception on third dereference.
> > +  // This precludes using memcpy or memmove to optimize the copying.
> > +  VERIFY( to[0] == 1 );
> > +  VERIFY( to[1] == 2 );
> > +  VERIFY( to[2] == -1 );
> > +#endif
> > +}
> > +
> > +int main()
> > +{
> > +  test_pr115444_no_difference();
> > +  test_pr115444_exceptional();
> > +}
> > diff --git a/libstdc++-v3/testsuite/25_algorithms/copy_n/114817.cc
> b/libstdc++-v3/testsuite/25_algorithms/copy_n/114817.cc
> > new file mode 100644
> > index 00000000000..09e181f3fd0
> > --- /dev/null
> > +++ b/libstdc++-v3/testsuite/25_algorithms/copy_n/114817.cc
> > @@ -0,0 +1,38 @@
> > +// { dg-do run { target c++11 } }
> > +
> > +// Bug libstdc++/114817 - Wrong codegen for std::copy of
> > +// "trivially copyable but not trivially assignable" type
> > +
> > +#include <algorithm>
> > +#include <testsuite_hooks.h>
> > +
> > +int assignments = 0;
> > +
> > +struct NonTrivialAssignment
> > +{
> > +  NonTrivialAssignment(int v) : val(v) { }
> > +  NonTrivialAssignment(const NonTrivialAssignment&) = default;
> > +  void operator=(const volatile NonTrivialAssignment&) = delete;
> > +  template<class = void>
> > +    NonTrivialAssignment&
> > +    operator=(const NonTrivialAssignment& o)
> > +    {
> > +      ++assignments;
> > +      val = o.val;
> > +      return *this;
> > +    }
> > +
> > +  int val;
> > +};
> > +
> > +static_assert(std::is_trivially_copyable<NonTrivialAssignment>::value);
> > +
> > +int main()
> > +{
> > +  NonTrivialAssignment src[2]{1, 2};
> > +  NonTrivialAssignment dst[2]{3, 4};
> > +  std::copy_n(src, 2, dst);
> > +  VERIFY( assignments == 2 );
> > +  VERIFY( dst[0].val == src[0].val );
> > +  VERIFY( dst[1].val == src[1].val );
> > +}
> > --
> > 2.46.2
> >
> >
>
>
diff mbox series

Patch

diff --git a/libstdc++-v3/include/bits/stl_algo.h b/libstdc++-v3/include/bits/stl_algo.h
index a1ef665506d..489ce7e14d2 100644
--- a/libstdc++-v3/include/bits/stl_algo.h
+++ b/libstdc++-v3/include/bits/stl_algo.h
@@ -665,25 +665,6 @@  _GLIBCXX_BEGIN_NAMESPACE_VERSION
       return __result;
     }
 
-  template<typename _InputIterator, typename _Size, typename _OutputIterator>
-    _GLIBCXX20_CONSTEXPR
-    _OutputIterator
-    __copy_n(_InputIterator __first, _Size __n,
-	     _OutputIterator __result, input_iterator_tag)
-    {
-      return std::__niter_wrap(__result,
-			       __copy_n_a(__first, __n,
-					  std::__niter_base(__result), true));
-    }
-
-  template<typename _RandomAccessIterator, typename _Size,
-	   typename _OutputIterator>
-    _GLIBCXX20_CONSTEXPR
-    inline _OutputIterator
-    __copy_n(_RandomAccessIterator __first, _Size __n,
-	     _OutputIterator __result, random_access_iterator_tag)
-    { return std::copy(__first, __first + __n, __result); }
-
   /**
    *  @brief Copies the range [first,first+n) into [result,result+n).
    *  @ingroup mutating_algorithms
@@ -714,8 +695,9 @@  _GLIBCXX_BEGIN_NAMESPACE_VERSION
       __glibcxx_requires_can_increment(__first, __n2);
       __glibcxx_requires_can_increment(__result, __n2);
 
-      return std::__copy_n(__first, __n2, __result,
-			   std::__iterator_category(__first));
+      auto __res = std::__copy_n_a(std::__niter_base(__first), __n2,
+				   std::__niter_base(__result), true);
+      return std::__niter_wrap(__result, std::move(__res));
     }
 
   /**
diff --git a/libstdc++-v3/include/bits/stl_algobase.h b/libstdc++-v3/include/bits/stl_algobase.h
index 751b7ad119b..5f77b00be9b 100644
--- a/libstdc++-v3/include/bits/stl_algobase.h
+++ b/libstdc++-v3/include/bits/stl_algobase.h
@@ -77,6 +77,7 @@ 
 #endif
 #if __cplusplus >= 202002L
 # include <compare>
+# include <bits/ptr_traits.h> // std::to_address
 #endif
 
 namespace std _GLIBCXX_VISIBILITY(default)
@@ -308,110 +309,6 @@  _GLIBCXX_BEGIN_NAMESPACE_VERSION
       return __a;
     }
 
-  // All of these auxiliary structs serve two purposes.  (1) Replace
-  // calls to copy with memmove whenever possible.  (Memmove, not memcpy,
-  // because the input and output ranges are permitted to overlap.)
-  // (2) If we're using random access iterators, then write the loop as
-  // a for loop with an explicit count.
-
-  template<bool _IsMove, bool _IsSimple, typename _Category>
-    struct __copy_move
-    {
-      template<typename _II, typename _OI>
-	_GLIBCXX20_CONSTEXPR
-	static _OI
-	__copy_m(_II __first, _II __last, _OI __result)
-	{
-	  for (; __first != __last; ++__result, (void)++__first)
-	    *__result = *__first;
-	  return __result;
-	}
-    };
-
-#if __cplusplus >= 201103L
-  template<typename _Category>
-    struct __copy_move<true, false, _Category>
-    {
-      template<typename _II, typename _OI>
-	_GLIBCXX20_CONSTEXPR
-	static _OI
-	__copy_m(_II __first, _II __last, _OI __result)
-	{
-	  for (; __first != __last; ++__result, (void)++__first)
-	    *__result = std::move(*__first);
-	  return __result;
-	}
-    };
-#endif
-
-  template<>
-    struct __copy_move<false, false, random_access_iterator_tag>
-    {
-      template<typename _II, typename _OI>
-	_GLIBCXX20_CONSTEXPR
-	static _OI
-	__copy_m(_II __first, _II __last, _OI __result)
-	{
-	  typedef typename iterator_traits<_II>::difference_type _Distance;
-	  for(_Distance __n = __last - __first; __n > 0; --__n)
-	    {
-	      *__result = *__first;
-	      ++__first;
-	      ++__result;
-	    }
-	  return __result;
-	}
-
-      template<typename _Tp, typename _Up>
-	static void
-	__assign_one(_Tp* __to, _Up* __from)
-	{ *__to = *__from; }
-    };
-
-#if __cplusplus >= 201103L
-  template<>
-    struct __copy_move<true, false, random_access_iterator_tag>
-    {
-      template<typename _II, typename _OI>
-	_GLIBCXX20_CONSTEXPR
-	static _OI
-	__copy_m(_II __first, _II __last, _OI __result)
-	{
-	  typedef typename iterator_traits<_II>::difference_type _Distance;
-	  for(_Distance __n = __last - __first; __n > 0; --__n)
-	    {
-	      *__result = std::move(*__first);
-	      ++__first;
-	      ++__result;
-	    }
-	  return __result;
-	}
-
-      template<typename _Tp, typename _Up>
-	static void
-	__assign_one(_Tp* __to, _Up* __from)
-	{ *__to = std::move(*__from); }
-    };
-#endif
-
-  template<bool _IsMove>
-    struct __copy_move<_IsMove, true, random_access_iterator_tag>
-    {
-      template<typename _Tp, typename _Up>
-	_GLIBCXX20_CONSTEXPR
-	static _Up*
-	__copy_m(_Tp* __first, _Tp* __last, _Up* __result)
-	{
-	  const ptrdiff_t _Num = __last - __first;
-	  if (__builtin_expect(_Num > 1, true))
-	    __builtin_memmove(__result, __first, sizeof(_Tp) * _Num);
-	  else if (_Num == 1)
-	    std::__copy_move<_IsMove, false, random_access_iterator_tag>::
-	      __assign_one(__result, __first);
-	  return __result + _Num;
-	}
-    };
-
 _GLIBCXX_BEGIN_NAMESPACE_CONTAINER
 
   template<typename _Tp, typename _Ref, typename _Ptr>
@@ -461,21 +358,127 @@  _GLIBCXX_END_NAMESPACE_CONTAINER
 	_GLIBCXX_STD_C::_Deque_iterator<_CharT, _CharT&, _CharT*>);
 #endif // HOSTED
 
-  template<bool _IsMove, typename _II, typename _OI>
-    _GLIBCXX20_CONSTEXPR
-    inline _OI
-    __copy_move_a2(_II __first, _II __last, _OI __result)
-    {
-      typedef typename iterator_traits<_II>::iterator_category _Category;
-#ifdef __cpp_lib_is_constant_evaluated
-      if (std::is_constant_evaluated())
-	return std::__copy_move<_IsMove, false, _Category>::
-	  __copy_m(__first, __last, __result);
+#if __cpp_lib_concepts
+  // N.B. this is not the same as nothrow-forward-iterator, which doesn't
+  // require noexcept operations, it just says it's undefined if they throw.
+  // Here we require them to be actually noexcept.
+  template<typename _Iter>
+    concept __nothrow_contiguous_iterator
+      = contiguous_iterator<_Iter> && requires (_Iter __i) {
+	// If this operation can throw then the iterator could cause
+	// the algorithm to exit early via an exception, in which case
+	// we can't use memcpy.
+	{ *__i } noexcept;
+      };
+
+  template<typename _OutIter, typename _InIter, typename _Sent = _InIter>
+    concept __memcpyable_iterators
+      = __nothrow_contiguous_iterator<_OutIter>
+	  && __nothrow_contiguous_iterator<_InIter>
+	  && sized_sentinel_for<_Sent, _InIter>
+	  && requires (_OutIter __o, _InIter __i, _Sent __s) {
+	    requires !!__memcpyable<decltype(std::to_address(__o)),
+				    decltype(std::to_address(__i))>::__value;
+	    { __i != __s } noexcept;
+	  };
 #endif
-      return std::__copy_move<_IsMove, __memcpyable<_OI, _II>::__value,
-			      _Category>::__copy_m(__first, __last, __result);
+
+#if __cplusplus < 201103L
+  // Used by __copy_move_a2, __copy_n_a and __copy_move_backward_a2 to
+  // get raw pointers so that calls to __builtin_memmove will compile,
+  // because C++98 can't use 'if constexpr' so statements that use memmove
+  // with pointer arguments need to also compile for arbitrary iterator types.
+  template<typename _Iter> __attribute__((__always_inline__))
+    inline void* __ptr_or_null(_Iter) { return 0; }
+  template<typename _Tp> __attribute__((__always_inline__))
+  inline void* __ptr_or_null(_Tp* __p) { return (void*)__p; }
+# define _GLIBCXX_TO_ADDR(P) std::__ptr_or_null(P)
+  // Used to advance output iterators (std::advance requires InputIterator).
+  template<typename _Iter> __attribute__((__always_inline__))
+    inline void __ptr_advance(_Iter&, ptrdiff_t) { }
+  template<typename _Tp> __attribute__((__always_inline__))
+    inline void __ptr_advance(_Tp*& __p, ptrdiff_t __n) { __p += __n; }
+# define _GLIBCXX_ADVANCE(P, N) std::__ptr_advance(P, N)
+#else
+  // For C++11 mode the __builtin_memmove calls are guarded by 'if constexpr'
+  // so we know the iterators used with memmove are guaranteed to be pointers.
+# define _GLIBCXX_TO_ADDR(P) P
+# define _GLIBCXX_ADVANCE(P, N) P += N
+#endif
+
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Wc++17-extensions"
+  template<bool _IsMove, typename _OutIter, typename _InIter>
+    __attribute__((__always_inline__)) _GLIBCXX20_CONSTEXPR
+    inline void
+    __assign_one(_OutIter& __out, _InIter& __in)
+    {
+#if __cplusplus >= 201103L
+      if constexpr (_IsMove)
+	*__out = std::move(*__in);
+      else
+#endif
+	*__out = *__in;
     }
 
+  template<bool _IsMove, typename _InIter, typename _Sent, typename _OutIter>
+    _GLIBCXX20_CONSTEXPR
+    inline _OutIter
+    __copy_move_a2(_InIter __first, _Sent __last, _OutIter __result)
+    {
+      typedef __decltype(*__first) _InRef;
+      typedef __decltype(*__result) _OutRef;
+      if _GLIBCXX_CONSTEXPR (!__is_trivially_assignable(_OutRef, _InRef))
+	{ } /* Skip the optimizations and use the loop at the end. */
+#ifdef __cpp_lib_is_constant_evaluated
+      else if (std::is_constant_evaluated())
+	{ } /* Skip the optimizations and use the loop at the end. */
+#endif
+      else if _GLIBCXX_CONSTEXPR (__memcpyable<_OutIter, _InIter>::__value)
+	{
+	  ptrdiff_t __n = std::distance(__first, __last);
+	  if (__builtin_expect(__n > 1, true))
+	    {
+	      __builtin_memmove(_GLIBCXX_TO_ADDR(__result),
+				_GLIBCXX_TO_ADDR(__first),
+				__n * sizeof(*__first));
+	      _GLIBCXX_ADVANCE(__result, __n);
+	    }
+	  else if (__n == 1)
+	    {
+	      std::__assign_one<_IsMove>(__result, __first);
+	      ++__result;
+	    }
+	  return __result;
+	}
+#if __cpp_lib_concepts
+      else if constexpr (__memcpyable_iterators<_OutIter, _InIter, _Sent>)
+	{
+	  if (auto __n = __last - __first; __n > 1) [[likely]]
+	    {
+	      void* __dest = std::to_address(__result);
+	      const void* __src = std::to_address(__first);
+	      size_t __nbytes = __n * sizeof(iter_value_t<_InIter>);
+	      // Advance the iterators first, in case doing so throws.
+	      __result += __n;
+	      __first += __n;
+	      __builtin_memmove(__dest, __src, __nbytes);
+	    }
+	  else if (__n == 1)
+	    {
+	      std::__assign_one<_IsMove>(__result, __first);
+	      ++__result;
+	    }
+	  return __result;
+	}
+#endif
+
+      for (; __first != __last; ++__result, (void)++__first)
+	std::__assign_one<_IsMove>(__result, __first);
+      return __result;
+    }
+#pragma GCC diagnostic pop
+
   template<bool _IsMove,
 	   typename _Tp, typename _Ref, typename _Ptr, typename _OI>
     _OI
@@ -537,12 +540,56 @@  _GLIBCXX_END_NAMESPACE_CONTAINER
 		  const ::__gnu_debug::_Safe_iterator<_IIte, _ISeq, _ICat>&,
 		  const ::__gnu_debug::_Safe_iterator<_OIte, _OSeq, _OCat>&);
 
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Wc++17-extensions" // for if-constexpr
   template<typename _InputIterator, typename _Size, typename _OutputIterator>
     _GLIBCXX20_CONSTEXPR
     _OutputIterator
     __copy_n_a(_InputIterator __first, _Size __n, _OutputIterator __result,
 	       bool)
     {
+      typedef __decltype(*__first) _InRef;
+      typedef __decltype(*__result) _OutRef;
+      if _GLIBCXX_CONSTEXPR (!__is_trivially_assignable(_OutRef, _InRef))
+	{ } /* Skip the optimizations and use the loop at the end. */
+#ifdef __cpp_lib_is_constant_evaluated
+      else if (std::is_constant_evaluated())
+	{ } /* Skip the optimizations and use the loop at the end. */
+#endif
+      else if _GLIBCXX_CONSTEXPR (__memcpyable<_OutputIterator,
+					       _InputIterator>::__value)
+	{
+	  if (__builtin_expect(__n > 1, true))
+	    {
+	      __builtin_memmove(_GLIBCXX_TO_ADDR(__result),
+				_GLIBCXX_TO_ADDR(__first),
+				__n * sizeof(*__first));
+	      _GLIBCXX_ADVANCE(__result, __n);
+	    }
+	  else if (__n == 1)
+	    *__result++ = *__first;
+	  return __result;
+	}
+#if __cpp_lib_concepts
+      else if constexpr (__memcpyable_iterators<_OutputIterator,
+						_InputIterator>)
+	{
+	  if (__n > 1) [[likely]]
+	    {
+	      void* __dest = std::to_address(__result);
+	      const void* __src = std::to_address(__first);
+	      size_t __nbytes = __n * sizeof(iter_value_t<_InputIterator>);
+	      // Advance the iterators first, in case doing so throws.
+	      __result += __n;
+	      __first += __n;
+	      __builtin_memmove(__dest, __src, __nbytes);
+	    }
+	  else if (__n == 1)
+	    *__result++ = *__first;
+	  return __result;
+	}
+#endif
+
       if (__n > 0)
 	{
 	  while (true)
@@ -557,6 +604,7 @@  _GLIBCXX_END_NAMESPACE_CONTAINER
 	}
       return __result;
     }
+#pragma GCC diagnostic pop
 
 #if _GLIBCXX_HOSTED
   template<typename _CharT, typename _Size>
@@ -644,105 +692,69 @@  _GLIBCXX_END_NAMESPACE_CONTAINER
 #define _GLIBCXX_MOVE3(_Tp, _Up, _Vp) std::copy(_Tp, _Up, _Vp)
 #endif
 
-  template<bool _IsMove, bool _IsSimple, typename _Category>
-    struct __copy_move_backward
-    {
-      template<typename _BI1, typename _BI2>
-	_GLIBCXX20_CONSTEXPR
-	static _BI2
-	__copy_move_b(_BI1 __first, _BI1 __last, _BI2 __result)
-	{
-	  while (__first != __last)
-	    *--__result = *--__last;
-	  return __result;
-	}
-    };
-
-#if __cplusplus >= 201103L
-  template<typename _Category>
-    struct __copy_move_backward<true, false, _Category>
-    {
-      template<typename _BI1, typename _BI2>
-	_GLIBCXX20_CONSTEXPR
-	static _BI2
-	__copy_move_b(_BI1 __first, _BI1 __last, _BI2 __result)
-	{
-	  while (__first != __last)
-	    *--__result = std::move(*--__last);
-	  return __result;
-	}
-    };
-#endif
-
-  template<>
-    struct __copy_move_backward<false, false, random_access_iterator_tag>
-    {
-      template<typename _BI1, typename _BI2>
-	_GLIBCXX20_CONSTEXPR
-	static _BI2
-	__copy_move_b(_BI1 __first, _BI1 __last, _BI2 __result)
-	{
-	  typename iterator_traits<_BI1>::difference_type
-	    __n = __last - __first;
-	  for (; __n > 0; --__n)
-	    *--__result = *--__last;
-	  return __result;
-	}
-    };
-
-#if __cplusplus >= 201103L
-  template<>
-    struct __copy_move_backward<true, false, random_access_iterator_tag>
-    {
-      template<typename _BI1, typename _BI2>
-	_GLIBCXX20_CONSTEXPR
-	static _BI2
-	__copy_move_b(_BI1 __first, _BI1 __last, _BI2 __result)
-	{
-	  typename iterator_traits<_BI1>::difference_type
-	    __n = __last - __first;
-	  for (; __n > 0; --__n)
-	    *--__result = std::move(*--__last);
-	  return __result;
-	}
-    };
-#endif
-
-  template<bool _IsMove>
-    struct __copy_move_backward<_IsMove, true, random_access_iterator_tag>
-    {
-      template<typename _Tp, typename _Up>
-	_GLIBCXX20_CONSTEXPR
-	static _Up*
-	__copy_move_b(_Tp* __first, _Tp* __last, _Up* __result)
-	{
-	  const ptrdiff_t _Num = __last - __first;
-	  if (__builtin_expect(_Num > 1, true))
-	    __builtin_memmove(__result - _Num, __first, sizeof(_Tp) * _Num);
-	  else if (_Num == 1)
-	    std::__copy_move<_IsMove, false, random_access_iterator_tag>::
-	      __assign_one(__result - 1, __first);
-	  return __result - _Num;
-	}
-    };
-
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Wc++17-extensions"
   template<bool _IsMove, typename _BI1, typename _BI2>
     _GLIBCXX20_CONSTEXPR
     inline _BI2
     __copy_move_backward_a2(_BI1 __first, _BI1 __last, _BI2 __result)
     {
-      typedef typename iterator_traits<_BI1>::iterator_category _Category;
+      typedef __decltype(*__first) _InRef;
+      typedef __decltype(*__result) _OutRef;
+      if _GLIBCXX_CONSTEXPR (!__is_trivially_assignable(_OutRef, _InRef))
+       { } /* Skip the optimizations and use the loop at the end. */
 #ifdef __cpp_lib_is_constant_evaluated
-      if (std::is_constant_evaluated())
-	return std::__copy_move_backward<_IsMove, false, _Category>::
-	  __copy_move_b(__first, __last, __result);
+      else if (std::is_constant_evaluated())
+       { } /* Skip the optimizations and use the loop at the end. */
 #endif
-      return std::__copy_move_backward<_IsMove,
-				       __memcpyable<_BI2, _BI1>::__value,
-				       _Category>::__copy_move_b(__first,
-								 __last,
-								 __result);
+      else if _GLIBCXX_CONSTEXPR (__memcpyable<_BI2, _BI1>::__value)
+	{
+	  ptrdiff_t __n = std::distance(__first, __last);
+	  std::advance(__result, -__n);
+	  if (__builtin_expect(__n > 1, true))
+	    {
+	      __builtin_memmove(_GLIBCXX_TO_ADDR(__result),
+				_GLIBCXX_TO_ADDR(__first),
+				__n * sizeof(*__first));
+	    }
+	  else if (__n == 1)
+	    std::__assign_one<_IsMove>(__result, __first);
+	  return __result;
+	}
+#if __cpp_lib_concepts
+      else if constexpr (__memcpyable_iterators<_BI2, _BI1>)
+	{
+	  if (auto __n = __last - __first; __n > 1) [[likely]]
+	    {
+	      const void* __src = std::to_address(__first);
+	      // Advance the iterators first, in case doing so throws.
+	      __result -= __n;
+	      __first += __n;
+	      void* __dest = std::to_address(__result);
+	      size_t __nbytes = __n * sizeof(iter_value_t<_BI1>);
+	      __builtin_memmove(__dest, __src, __nbytes);
+	    }
+	  else if (__n == 1)
+	    {
+	      --__result;
+	      std::__assign_one<_IsMove>(__result, __first);
+	    }
+	  return __result;
+	}
+#endif
+
+      while (__first != __last)
+	{
+	  --__last;
+	  --__result;
+	  std::__assign_one<_IsMove>(__result, __last);
+	}
+      return __result;
     }
+#pragma GCC diagnostic pop
+
+#undef _GLIBCXX_TO_ADDR
+#undef _GLIBCXX_ADVANCE
 
   template<bool _IsMove, typename _BI1, typename _BI2>
     _GLIBCXX20_CONSTEXPR
diff --git a/libstdc++-v3/testsuite/20_util/specialized_algorithms/uninitialized_copy/114817.cc b/libstdc++-v3/testsuite/20_util/specialized_algorithms/uninitialized_copy/114817.cc
new file mode 100644
index 00000000000..531b863e143
--- /dev/null
+++ b/libstdc++-v3/testsuite/20_util/specialized_algorithms/uninitialized_copy/114817.cc
@@ -0,0 +1,39 @@ 
+// { dg-do run { target c++11 } }
+
+// Bug libstdc++/114817 - Wrong codegen for std::copy of
+// "trivially copyable but not trivially assignable" type
+
+#include <memory>
+#include <testsuite_hooks.h>
+
+int constructions = 0;
+
+struct NonTrivialCons
+{
+  NonTrivialCons() = default;
+  NonTrivialCons(int v) : val(v) { }
+  NonTrivialCons(const volatile NonTrivialCons&) = delete;
+  template<class = void>
+    NonTrivialCons(const NonTrivialCons& o)
+    : val(o.val)
+    {
+      ++constructions;
+    }
+
+  int val;
+};
+
+static_assert(std::is_trivially_copyable<NonTrivialCons>::value);
+
+int main()
+{
+  NonTrivialCons src[2]{1, 2};
+  NonTrivialCons dst[2];
+#if __cplusplus < 201703L
+  constructions = 0;
+#endif
+  std::uninitialized_copy(src, src+2, dst);
+  VERIFY( constructions == 2 );
+  VERIFY( dst[0].val == src[0].val );
+  VERIFY( dst[1].val == src[1].val );
+}
diff --git a/libstdc++-v3/testsuite/20_util/specialized_algorithms/uninitialized_copy_n/114817.cc b/libstdc++-v3/testsuite/20_util/specialized_algorithms/uninitialized_copy_n/114817.cc
new file mode 100644
index 00000000000..531b863e143
--- /dev/null
+++ b/libstdc++-v3/testsuite/20_util/specialized_algorithms/uninitialized_copy_n/114817.cc
@@ -0,0 +1,39 @@ 
+// { dg-do run { target c++11 } }
+
+// Bug libstdc++/114817 - Wrong codegen for std::copy of
+// "trivially copyable but not trivially assignable" type
+
+#include <memory>
+#include <testsuite_hooks.h>
+
+int constructions = 0;
+
+struct NonTrivialCons
+{
+  NonTrivialCons() = default;
+  NonTrivialCons(int v) : val(v) { }
+  NonTrivialCons(const volatile NonTrivialCons&) = delete;
+  template<class = void>
+    NonTrivialCons(const NonTrivialCons& o)
+    : val(o.val)
+    {
+      ++constructions;
+    }
+
+  int val;
+};
+
+static_assert(std::is_trivially_copyable<NonTrivialCons>::value);
+
+int main()
+{
+  NonTrivialCons src[2]{1, 2};
+  NonTrivialCons dst[2];
+#if __cplusplus < 201703L
+  constructions = 0;
+#endif
+  std::uninitialized_copy(src, src+2, dst);
+  VERIFY( constructions == 2 );
+  VERIFY( dst[0].val == src[0].val );
+  VERIFY( dst[1].val == src[1].val );
+}
diff --git a/libstdc++-v3/testsuite/25_algorithms/copy/114817.cc b/libstdc++-v3/testsuite/25_algorithms/copy/114817.cc
new file mode 100644
index 00000000000..b5fcc6bb037
--- /dev/null
+++ b/libstdc++-v3/testsuite/25_algorithms/copy/114817.cc
@@ -0,0 +1,38 @@ 
+// { dg-do run { target c++11 } }
+
+// Bug libstdc++/114817 - Wrong codegen for std::copy of
+// "trivially copyable but not trivially assignable" type
+
+#include <algorithm>
+#include <testsuite_hooks.h>
+
+int assignments = 0;
+
+struct NonTrivialAssignment
+{
+  NonTrivialAssignment(int v) : val(v) { }
+  NonTrivialAssignment(const NonTrivialAssignment&) = default;
+  void operator=(const volatile NonTrivialAssignment&) = delete;
+  template<class = void>
+    NonTrivialAssignment&
+    operator=(const NonTrivialAssignment& o)
+    {
+      ++assignments;
+      val = o.val;
+      return *this;
+    }
+
+  int val;
+};
+
+static_assert(std::is_trivially_copyable<NonTrivialAssignment>::value);
+
+int main()
+{
+  NonTrivialAssignment src[2]{1, 2};
+  NonTrivialAssignment dst[2]{3, 4};
+  std::copy(src, src+2, dst);
+  VERIFY( assignments == 2 );
+  VERIFY( dst[0].val == src[0].val );
+  VERIFY( dst[1].val == src[1].val );
+}
diff --git a/libstdc++-v3/testsuite/25_algorithms/copy/115444.cc b/libstdc++-v3/testsuite/25_algorithms/copy/115444.cc
new file mode 100644
index 00000000000..fa629abea5f
--- /dev/null
+++ b/libstdc++-v3/testsuite/25_algorithms/copy/115444.cc
@@ -0,0 +1,93 @@ 
+// { dg-do run }
+// { dg-require-normal-mode "debug mode checks use operator-(Iter, Iter)" }
+
+#include <algorithm>
+#include <iterator>
+#include <testsuite_hooks.h>
+
+const int g = 0;
+
+struct Iter {
+  typedef long difference_type;
+  typedef int value_type;
+  typedef const int& reference;
+  typedef const int* pointer;
+#if __cpp_lib_concepts
+  using iterator_category = std::contiguous_iterator_tag;
+#else
+  typedef std::random_access_iterator_tag iterator_category;
+#endif
+
+  Iter(const int* p = 0, const int* limit = 0) : ptr(p), limit(limit) { }
+
+  const int& operator*() const {
+#ifdef __cpp_exceptions
+    if (ptr == limit)
+      throw 1;
+#endif
+    return *ptr;
+  }
+  const int* operator->() const { return ptr; }
+  const int& operator[](long n) const { return ptr[n]; }
+
+  Iter& operator++() { ++ptr; return *this; }
+  Iter operator++(int) { Iter tmp = *this; ++ptr; return tmp; }
+  Iter& operator--() { --ptr; return *this; }
+  Iter operator--(int) { Iter tmp = *this; --ptr; return tmp; }
+
+  Iter& operator+=(int n) { ptr += n; return *this; }
+  Iter& operator-=(int n) { ptr -= n; return *this; }
+
+  friend Iter operator+(int n, Iter it) { return it += n; }
+  friend Iter operator+(Iter it, int n) { return it += n; }
+  friend Iter operator-(Iter it, int n) { return it -= n; }
+
+  bool operator==(const Iter& it) const { return ptr == it.ptr; }
+  bool operator!=(const Iter& it) const { return ptr != it.ptr; }
+  bool operator<(const Iter& it) const { return ptr < it.ptr; }
+  bool operator>(const Iter& it) const { return ptr > it.ptr; }
+  bool operator<=(const Iter& it) const { return ptr <= it.ptr; }
+  bool operator>=(const Iter& it) const { return ptr >= it.ptr; }
+
+  // std::copy should not need to take the difference between two iterators:
+  friend int operator-(Iter, Iter) { VERIFY( ! "operator- called" ); }
+
+private:
+  const int* ptr;
+  const int* limit;
+};
+
+void
+test_pr115444_no_difference()
+{
+  int from = 1;
+  int to = 0;
+  Iter iter(&from);
+  // This should not use operator-(Iter, Iter)
+  std::copy(iter, iter+1, &to);
+}
+
+void
+test_pr115444_exceptional()
+{
+#if __cpp_exceptions
+  int from[3] = { 1, 2, 3 };
+  int to[3] = { -1, -1, -1 };
+  Iter iter(from, from+2);
+  try {
+    std::copy(iter, iter + 3, to);
+  } catch (int) {
+  }
+  // std::copy should exit via exception on third dereference.
+  // This precludes using memcpy or memmove to optimize the copying.
+  VERIFY( to[0] == 1 );
+  VERIFY( to[1] == 2 );
+  VERIFY( to[2] == -1 );
+#endif
+}
+
+int main()
+{
+  test_pr115444_no_difference();
+  test_pr115444_exceptional();
+}
diff --git a/libstdc++-v3/testsuite/25_algorithms/copy_n/114817.cc b/libstdc++-v3/testsuite/25_algorithms/copy_n/114817.cc
new file mode 100644
index 00000000000..09e181f3fd0
--- /dev/null
+++ b/libstdc++-v3/testsuite/25_algorithms/copy_n/114817.cc
@@ -0,0 +1,38 @@ 
+// { dg-do run { target c++11 } }
+
+// Bug libstdc++/114817 - Wrong codegen for std::copy of
+// "trivially copyable but not trivially assignable" type
+
+#include <algorithm>
+#include <testsuite_hooks.h>
+
+int assignments = 0;
+
+struct NonTrivialAssignment
+{
+  NonTrivialAssignment(int v) : val(v) { }
+  NonTrivialAssignment(const NonTrivialAssignment&) = default;
+  void operator=(const volatile NonTrivialAssignment&) = delete;
+  template<class = void>
+    NonTrivialAssignment&
+    operator=(const NonTrivialAssignment& o)
+    {
+      ++assignments;
+      val = o.val;
+      return *this;
+    }
+
+  int val;
+};
+
+static_assert(std::is_trivially_copyable<NonTrivialAssignment>::value);
+
+int main()
+{
+  NonTrivialAssignment src[2]{1, 2};
+  NonTrivialAssignment dst[2]{3, 4};
+  std::copy_n(src, 2, dst);
+  VERIFY( assignments == 2 );
+  VERIFY( dst[0].val == src[0].val );
+  VERIFY( dst[1].val == src[1].val );
+}