From patchwork Wed Jun 5 17:47:27 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jonathan Wakely X-Patchwork-Id: 1943996 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=redhat.com header.i=@redhat.com header.a=rsa-sha256 header.s=mimecast20190719 header.b=OvHmCjFY; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4VvZkJ1G0Fz20WK for ; Thu, 6 Jun 2024 03:50:32 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 4AB083873CAE for ; Wed, 5 Jun 2024 17:50:30 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by sourceware.org (Postfix) with ESMTPS id 78EA73873C80 for ; Wed, 5 Jun 2024 17:50:00 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 78EA73873C80 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=redhat.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 78EA73873C80 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1717609803; cv=none; b=sdzCqqs38uB2TUAxEEb23P2FWWXXaZTsTZiL3OyGUFeXHC0s9DBO4zCR2svnr9TcWs2h73VyYN/LeHrbNTPYI+0Us/VRIFoQ2eNvU6X/wFihkzVaj1QhulJ1gaOKkL/FbgmpEcbqt04XU6lkpEHCLpx69ILrKmbXvOQCZN2HZ/M= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1717609803; c=relaxed/simple; bh=vTRLfQ63If31duIj9fhWsaGPMJQt5SsiJ24jL438fW4=; h=DKIM-Signature:From:To:Subject:Date:Message-ID:MIME-Version; b=wOWmIfl7C2bn/t6vyxUD5oeiduYPrHlAVwm3raiL0F+F27hE6BSBzKHbieUV8sHdHViws8OjM8IbePlJBj4coHCJyRuOxCIctrmNKZ8+4UtognoHCIWYmZZctiglkcqGuZpd3IDPb2NDqckrDAdWAqqULQ6dRaUX4eS/7TRU4Ok= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1717609800; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=n2WzD1ireaB2WPdn8dW43SZ2FXE4yProzNIwVdgt0TY=; b=OvHmCjFYIvE5vKR6yeRFOf5fXJ4mqLtqCzgqxAIzxGvk8fme/h51P0o2A+WkCqZ0OLiPPU FqD1rpHMlhLd18sa/wnhsqj4bLip1nOmAkEu1CJ4U9T/zEFrRJDAaIxSA+kBk6mt20gsYM pOyo9YsTj6Ag4Z0plN+SrmF/vylSThI= Received: from mx-prod-mc-05.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-355-eHEhehx4OESo64bwUj73lQ-1; Wed, 05 Jun 2024 13:49:56 -0400 X-MC-Unique: eHEhehx4OESo64bwUj73lQ-1 Received: from mx-prod-int-02.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-02.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.15]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-05.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id C45FB197700A; Wed, 5 Jun 2024 17:49:55 +0000 (UTC) Received: from localhost (unknown [10.39.192.32]) by mx-prod-int-02.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id C7EFC1956086; Wed, 5 Jun 2024 17:49:54 +0000 (UTC) From: Jonathan Wakely To: libstdc++@gcc.gnu.org, gcc-patches@gcc.gnu.org Subject: [PATCH v2] libstdc++: Use memchr to optimize std::find [PR88545] Date: Wed, 5 Jun 2024 18:47:27 +0100 Message-ID: <20240605174953.143341-1-jwakely@redhat.com> In-Reply-To: <20240605153233.119881-1-jwakely@redhat.com> References: <20240605153233.119881-1-jwakely@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.0 on 10.30.177.15 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-Spam-Status: No, score=-10.7 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H4, RCVD_IN_MSPIKE_WL, RCVD_IN_SBL_CSS, SPF_HELO_NONE, SPF_NONE, TXREP, T_SCC_BODY_TEXT_LINE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org This v2 patch uses std::is_constant_evaluated() so the algos don't try to use memchr in constant expressions, and removes the _GLIBCXX_NODISCARD I added to std::find (it should still be added, but as a separate patch). Tamar has asked that I don't push this until he compares the performance to a vectorized std::find_if, so I'll wait. -- >8 -- This optimizes std::find to use memchr when searching for an integer in a range of bytes. libstdc++-v3/ChangeLog: PR libstdc++/88545 PR libstdc++/115040 * include/bits/cpp_type_traits.h (__can_use_memchr_for_find): New variable template. * include/bits/ranges_util.h (__find_fn): Use memchr when possible. * include/bits/stl_algo.h (find): Likewise. * testsuite/25_algorithms/find/bytes.cc: New test. --- libstdc++-v3/include/bits/cpp_type_traits.h | 13 ++ libstdc++-v3/include/bits/ranges_util.h | 18 +++ libstdc++-v3/include/bits/stl_algo.h | 33 +++++ .../testsuite/25_algorithms/find/bytes.cc | 135 ++++++++++++++++++ 4 files changed, 199 insertions(+) create mode 100644 libstdc++-v3/testsuite/25_algorithms/find/bytes.cc diff --git a/libstdc++-v3/include/bits/cpp_type_traits.h b/libstdc++-v3/include/bits/cpp_type_traits.h index 59f1a1875eb..466e6792a11 100644 --- a/libstdc++-v3/include/bits/cpp_type_traits.h +++ b/libstdc++-v3/include/bits/cpp_type_traits.h @@ -35,6 +35,10 @@ #pragma GCC system_header #include +#include +#if __glibcxx_type_trait_variable_templates +# include // is_same_v, is_integral_v +#endif // // This file provides some compile-time information about various types. @@ -589,6 +593,15 @@ __INT_N(__GLIBCXX_TYPE_INT_N_3) { static constexpr bool __value = false; }; #endif +#if __glibcxx_type_trait_variable_templates + template + constexpr bool __can_use_memchr_for_find + // Can only use memchr to search for narrow characters and std::byte. + = __is_byte<_ValT>::__value + // And only if the value to find is an integer (or is also std::byte). + && (is_same_v<_Tp, _ValT> || is_integral_v<_Tp>); +#endif + // // Move iterator type // diff --git a/libstdc++-v3/include/bits/ranges_util.h b/libstdc++-v3/include/bits/ranges_util.h index 9b79c3a229d..03239fd8af6 100644 --- a/libstdc++-v3/include/bits/ranges_util.h +++ b/libstdc++-v3/include/bits/ranges_util.h @@ -34,6 +34,7 @@ # include # include # include +# include // __can_use_memchr_for_find #ifdef __glibcxx_ranges namespace std _GLIBCXX_VISIBILITY(default) @@ -494,6 +495,23 @@ namespace ranges operator()(_Iter __first, _Sent __last, const _Tp& __value, _Proj __proj = {}) const { + if constexpr (is_same_v<_Proj, identity>) + if constexpr(__can_use_memchr_for_find, _Tp>) + if constexpr (sized_sentinel_for<_Sent, _Iter>) + if constexpr (contiguous_iterator<_Iter>) + if (!is_constant_evaluated()) + { + auto __n = __last - __first; + if (__n > 0) + { + const int __ival = static_cast(__value); + const void* __p0 = std::to_address(__first); + if (auto __p1 = __builtin_memchr(__p0, __ival, __n)) + __n = (const char*)__p1 - (const char*)__p0; + } + return __first + __n; + } + while (__first != __last && !(std::__invoke(__proj, *__first) == __value)) ++__first; diff --git a/libstdc++-v3/include/bits/stl_algo.h b/libstdc++-v3/include/bits/stl_algo.h index 1a996aa61da..4daaf60f289 100644 --- a/libstdc++-v3/include/bits/stl_algo.h +++ b/libstdc++-v3/include/bits/stl_algo.h @@ -3846,6 +3846,39 @@ _GLIBCXX_BEGIN_NAMESPACE_ALGO __glibcxx_function_requires(_EqualOpConcept< typename iterator_traits<_InputIterator>::value_type, _Tp>) __glibcxx_requires_valid_range(__first, __last); + +#if __cpp_if_constexpr && __glibcxx_type_trait_variable_templates + using _ValT = typename iterator_traits<_InputIterator>::value_type; + if constexpr (__can_use_memchr_for_find<_ValT, _Tp>) + { + // If converting the value to the 1-byte value_type alters its value, + // then it would not be found by std::find using equality comparison. + // We need to check this here, because otherwise something like + // memchr("a", 'a'+256, 1) would give a false positive match. + if (static_cast<_ValT>(__val) != __val) + return __last; + else if (!__is_constant_evaluated()) + { + const void* __p0 = nullptr; + if constexpr (is_pointer_v) + __p0 = std::__niter_base(__first); +#if __cpp_lib_concepts + else if constexpr (contiguous_iterator<_InputIterator>) + __p0 = std::to_address(__first); +#endif + + if (__p0) + { + const int __ival = static_cast(__val); + if (auto __n = std::distance(__first, __last); __n > 0) + if (auto __p1 = __builtin_memchr(__p0, __ival, __n)) + return __first + ((const char*)__p1 - (const char*)__p0); + return __last; + } + } + } +#endif + return std::__find_if(__first, __last, __gnu_cxx::__ops::__iter_equals_val(__val)); } diff --git a/libstdc++-v3/testsuite/25_algorithms/find/bytes.cc b/libstdc++-v3/testsuite/25_algorithms/find/bytes.cc new file mode 100644 index 00000000000..2f643c4c4b4 --- /dev/null +++ b/libstdc++-v3/testsuite/25_algorithms/find/bytes.cc @@ -0,0 +1,135 @@ +// { dg-do run } + +#include +#include // std::byte +#include + +// PR libstdc++/88545 made std::find use memchr as an optimization. +// This test verifies that it didn't change any semantics. + +template +void +test_char() +{ + const C a[] = { (C)'a', (C)'b', (C)'c', (C)'d' }; + const C* end = a + sizeof(a); + const C* res = std::find(a, end, a[0]); + VERIFY( res == a ); + res = std::find(a, end, a[2]); + VERIFY( res == a+2 ); + res = std::find(a, end, a[0] + 256); + VERIFY( res == end ); + res = std::find(a, end, a[0] - 256); + VERIFY( res == end ); + res = std::find(a, end, 256); + VERIFY( res == end ); + +#ifdef __cpp_lib_ranges + res = std::ranges::find(a, a[0]); + VERIFY( res == a ); + res = std::ranges::find(a, a[2]); + VERIFY( res == a+2 ); + res = std::ranges::find(a, a[0] + 256); + VERIFY( res == end ); + res = std::ranges::find(a, a[0] - 256); + VERIFY( res == end ); + res = std::ranges::find(a, 256); + VERIFY( res == end ); +#endif +} + +// Trivial type of size 1, with custom equality. +struct S { + bool operator==(const S&) const { return true; }; + char c; +}; + +// Trivial type of size 1, with custom equality. +enum E +#if __cplusplus >= 201103L +: unsigned char +#endif +{ e1 = 1, e255 = 255 }; + +bool operator==(E l, E r) { return (l % 3) == (r % 3); } + +struct X { char c; }; +bool operator==(X, char) { return false; } +bool operator==(char, X) { return false; } + +bool operator==(E, char) { return false; } +bool operator==(char, E) { return false; } + +void +test_non_characters() +{ + S s[3] = { {'a'}, {'b'}, {'c'} }; + S sx = {'x'}; + S* sres = std::find(s, s+3, sx); + VERIFY( sres == s ); // memchr optimization would not find a match + + E e[3] = { E(1), E(2), E(3) }; + E* eres = std::find(e, e+3, E(4)); + VERIFY( eres == e ); // memchr optimization would not find a match + + char x[1] = { 'x' }; + X xx = { 'x' }; + char* xres = std::find(x, x+1, xx); + VERIFY( xres == x+1 ); // memchr optimization would find a match + xres = std::find(x, x+1, E('x')); + VERIFY( xres == x+1 ); // memchr optimization would find a match + +#ifdef __cpp_lib_byte + std::byte b[] = { std::byte{0}, std::byte{1}, std::byte{2}, std::byte{3} }; + std::byte* bres = std::find(b, b+4, std::byte{4}); + VERIFY( bres == b+4 ); + bres = std::find(b, b+2, std::byte{3}); + VERIFY( bres == b+2 ); + bres = std::find(b, b+3, std::byte{3}); + VERIFY( bres == b+3 ); +#endif + +#ifdef __cpp_lib_ranges + sres = std::ranges::find(s, sx); + VERIFY( sres == s ); + + eres = std::ranges::find(e, e+3, E(4)); + VERIFY( eres == e ); + + xres = std::ranges::find(x, xx); + VERIFY( xres == std::ranges::end(x) ); + + bres = std::ranges::find(b, std::byte{4}); + VERIFY( bres == b+4 ); + bres = std::ranges::find(b, b+2, std::byte{3}); + VERIFY( bres == b+2 ); + bres = std::ranges::find(b, std::byte{3}); + VERIFY( bres == b+3 ); + + xres = std::find(x, x+1, xx); + VERIFY( xres == std::ranges::end(x) ); + xres = std::find(x, x+1, E('x')); + VERIFY( xres == std::ranges::end(x) ); +#endif +} + +int main() +{ + test_char(); + test_char(); + test_char(); + test_non_characters(); + +#if __cpp_lib_constexpr_algorithms + static_assert( [] { + char c[] = "abcd"; + return std::find(c, c+4, 'b') == c+1; + }() ); +#ifdef __cpp_lib_ranges + static_assert( [] { + char c[] = "abcd"; + return std::ranges::find(c, 'b') == c+1; + }() ); +#endif +#endif +}