From patchwork Thu Oct 19 04:55:48 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?q?Fran=C3=A7ois_Dumont?= X-Patchwork-Id: 1851398 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20230601 header.b=XGsJPKx9; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4S9wQz1hDvz20cx for ; Thu, 19 Oct 2023 15:56:09 +1100 (AEDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 8E3A7385735E for ; Thu, 19 Oct 2023 04:56:07 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mail-lj1-x22d.google.com (mail-lj1-x22d.google.com [IPv6:2a00:1450:4864:20::22d]) by sourceware.org (Postfix) with ESMTPS id BD61A3858D1E; Thu, 19 Oct 2023 04:55:53 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org BD61A3858D1E Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org BD61A3858D1E Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2a00:1450:4864:20::22d ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1697691356; cv=none; b=h3AKW6Ws+XEDdeOBZ4zTvZnm8GvG/ygLlJstSffXIFEeknVs6kvTNUD+rJH+fGLEXssFknlwFcKN1sF/EJcBE7IospWgnp3R4fsdDzM2xblWorZLSoW47UY6V+FZ6Xvy8jwxcD3oJC8bHMYMh+2VNkH2f9wvHmsSbRI4y9X9OtA= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1697691356; c=relaxed/simple; bh=VFUTzzAruPG2DFcFVO9n/TJR3EsdLd43F2TK71Z0e4E=; h=DKIM-Signature:Message-ID:Date:MIME-Version:To:From:Subject; b=QQTxfCg604wD6xngSeSZg28oU54IpnuuPX8bf74zkjcA4Jb83sPuJGcEWWoejfHm1tt5gNdnTgsGbPqHe1sZQXQbEOBM2C1PXVrxQ/8fM11ikwxneuvFfuhh8/uLU2S80q5VeDaXjETX3XtNutEr5Mf9Xy51ggmJkzSGgQpjcdk= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-lj1-x22d.google.com with SMTP id 38308e7fff4ca-2c50d1b9f22so70604351fa.0; Wed, 18 Oct 2023 21:55:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1697691352; x=1698296152; darn=gcc.gnu.org; h=subject:from:cc:to:content-language:user-agent:mime-version:date :message-id:from:to:cc:subject:date:message-id:reply-to; bh=RNwpac+NSgHMty9F7/7Ytdbo8L2nlR7obJl8h5LrkVg=; b=XGsJPKx9C1S4C63CKHDMZBYRKmapkLchyJ+E1DRoovnTCy+ixau+CDX5LqW10tTEvn 2SHcxsHiXtcepbE7SLTUjQNabGEB4H2KErALER318qCC/2wkTw+uirZXHgz+5EhjfQg1 9bnmvee+cSiPYseOEzaLHCMgxr1M7j/0PaRSzRIBwSG9Kb5C03e+8Tca6q23f2una+nb 0vxob8m0LnmlBUKi8QuTB8lzLfaFbesKv1UbNMe6fb5Mufb0QrOVZRyC7um3P5S+n0ZX w+e1MCJn40X63IRoIxTfINahRlGHtRAYUASml53FafcUNG6nxpY8dmU+Bo2Nem7UsaHE 8wjQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1697691352; x=1698296152; h=subject:from:cc:to:content-language:user-agent:mime-version:date :message-id:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=RNwpac+NSgHMty9F7/7Ytdbo8L2nlR7obJl8h5LrkVg=; b=WdrxrEvpp8QZYFNx00fNi38nQQ8lHtJfsv6pnofXejoQzNhpm/gvbduA4/k1iscTHv bkiZ/3Yg71kJKQKO4CwCXQfuAT7dvYwDvQrC8xMARv7lYJ62Zbyupzzua3euAqXhlw4j Ffuyad6GtBr8FBnijoUkYy52MwSO0AnmCyzm7u0e8NlDRP8rnhfx6drm3Rs6AQizyEUf cOZlfJfvzy3RlrQuqzTrwTNtfbE223Ml90Efakr40RB+2rz7wGlr0D4k/qjcjkiTkUgS U/ppkWr93RYTqhSgn4xgyFikXbDmlSquQDRdrAbjdfRVCVv51z7bBEiRe/rfdvqhAq4/ 4u3w== X-Gm-Message-State: AOJu0YzLevKWmgFZZlpP8noL03PMFrTAF5oBPoBY2Cstj78rp2AkPNQo MmEvjyj65Z+Wtpl3EFyJJBp+rUSkDTo= X-Google-Smtp-Source: AGHT+IEz5bVCqVVfmODpuzSKa8okn0mue7K7ZfKmV9uU1PQvaiK1XDnsiBX+boADVpnnT8R4GKe/ng== X-Received: by 2002:a2e:a589:0:b0:2c5:1abb:7077 with SMTP id m9-20020a2ea589000000b002c51abb7077mr505406ljp.1.1697691351216; Wed, 18 Oct 2023 21:55:51 -0700 (PDT) Received: from [10.78.0.77] ([89.207.171.155]) by smtp.gmail.com with ESMTPSA id f13-20020a1c6a0d000000b003fe23b10fdfsm3307057wmc.36.2023.10.18.21.55.49 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Wed, 18 Oct 2023 21:55:50 -0700 (PDT) Message-ID: Date: Thu, 19 Oct 2023 06:55:48 +0200 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Content-Language: en-US To: libstdc++ Cc: gcc-patches From: =?utf-8?q?Fran=C3=A7ois_Dumont?= Subject: [PATCH][_Hashtable] Fix merge X-Spam-Status: No, score=-10.4 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org libstdc++: [_Hashtable] Do not reuse untrusted cached hash code On merge reuse merged node cached hash code only if we are on the same type of hash and this hash is stateless. Usage of function pointers or std::function as hash functor will prevent this optimization. libstdc++-v3/ChangeLog     * include/bits/hashtable_policy.h     (_Hash_code_base::_M_hash_code(const _Hash&, const _Hash_node_value<>&)): Remove.     (_Hash_code_base::_M_hash_code<_H2>(const _H2&, const _Hash_node_value<>&)): Remove.     * include/bits/hashtable.h     (_M_src_hash_code<_H2>(const _H2&, const key_type&, const __node_value_type&)): New.     (_M_merge_unique<>, _M_merge_multi<>): Use latter.     * testsuite/23_containers/unordered_map/modifiers/merge.cc     (test04, test05, test06): New test cases. Tested under Linux x86_64, ok to commit ? François diff --git a/libstdc++-v3/include/bits/hashtable.h b/libstdc++-v3/include/bits/hashtable.h index 4c12dc895b2..f69acfe5213 100644 --- a/libstdc++-v3/include/bits/hashtable.h +++ b/libstdc++-v3/include/bits/hashtable.h @@ -1109,6 +1109,20 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION return { __n, this->_M_node_allocator() }; } + // Check and if needed compute hash code using _Hash as __n _M_hash_code, + // if present, was computed using _H2. + template + __hash_code + _M_src_hash_code(const _H2&, const key_type& __k, + const __node_value_type& __src_n) const + { + if constexpr (std::is_same_v<_H2, _Hash>) + if constexpr (std::is_empty_v<_Hash>) + return this->_M_hash_code(__src_n); + + return this->_M_hash_code(__k); + } + public: // Extract a node. node_type @@ -1146,7 +1160,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION auto __pos = __i++; const key_type& __k = _ExtractKey{}(*__pos); __hash_code __code - = this->_M_hash_code(__src.hash_function(), *__pos._M_cur); + = _M_src_hash_code(__src.hash_function(), __k, *__pos._M_cur); size_type __bkt = _M_bucket_index(__code); if (_M_find_node(__bkt, __k, __code) == nullptr) { @@ -1174,8 +1188,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION for (auto __i = __src.cbegin(), __end = __src.cend(); __i != __end;) { auto __pos = __i++; + const key_type& __k = _ExtractKey{}(*__pos); __hash_code __code - = this->_M_hash_code(__src.hash_function(), *__pos._M_cur); + = _M_src_hash_code(__src.hash_function(), __k, *__pos._M_cur); auto __nh = __src.extract(__pos); __hint = _M_insert_multi_node(__hint, __code, __nh._M_ptr)._M_cur; __nh._M_ptr = nullptr; diff --git a/libstdc++-v3/include/bits/hashtable_policy.h b/libstdc++-v3/include/bits/hashtable_policy.h index 86b32fb15f2..5d162463dc3 100644 --- a/libstdc++-v3/include/bits/hashtable_policy.h +++ b/libstdc++-v3/include/bits/hashtable_policy.h @@ -1319,19 +1319,6 @@ namespace __detail return _M_hash()(__k); } - __hash_code - _M_hash_code(const _Hash&, - const _Hash_node_value<_Value, true>& __n) const - { return __n._M_hash_code; } - - // Compute hash code using _Hash as __n _M_hash_code, if present, was - // computed using _H2. - template - __hash_code - _M_hash_code(const _H2&, - const _Hash_node_value<_Value, __cache_hash_code>& __n) const - { return _M_hash_code(_ExtractKey{}(__n._M_v())); } - __hash_code _M_hash_code(const _Hash_node_value<_Value, false>& __n) const { return _M_hash_code(_ExtractKey{}(__n._M_v())); } diff --git a/libstdc++-v3/testsuite/23_containers/unordered_map/modifiers/merge.cc b/libstdc++-v3/testsuite/23_containers/unordered_map/modifiers/merge.cc index b140ce452aa..c051b58137a 100644 --- a/libstdc++-v3/testsuite/23_containers/unordered_map/modifiers/merge.cc +++ b/libstdc++-v3/testsuite/23_containers/unordered_map/modifiers/merge.cc @@ -17,15 +17,29 @@ // { dg-do run { target c++17 } } +#include +#include #include #include #include using test_type = std::unordered_map; -struct hash { - auto operator()(int i) const noexcept { return ~std::hash()(i); } -}; +template + struct xhash + { + auto operator()(const T& i) const noexcept + { return ~std::hash()(i); } + }; + + +namespace std +{ + template + struct __is_fast_hash> : __is_fast_hash> + { }; +} + struct equal : std::equal_to<> { }; template @@ -64,7 +78,7 @@ test02() { const test_type c0{ {1, 10}, {2, 20}, {3, 30} }; test_type c1 = c0; - std::unordered_map c2( c0.begin(), c0.end() ); + std::unordered_map, equal> c2( c0.begin(), c0.end() ); c1.merge(c2); VERIFY( c1 == c0 ); @@ -89,7 +103,7 @@ test03() { const test_type c0{ {1, 10}, {2, 20}, {3, 30} }; test_type c1 = c0; - std::unordered_multimap c2( c0.begin(), c0.end() ); + std::unordered_multimap, equal> c2( c0.begin(), c0.end() ); c1.merge(c2); VERIFY( c1 == c0 ); VERIFY( equal_elements(c2, c0) ); @@ -125,10 +139,164 @@ test03() VERIFY( c2.empty() ); } +void +test04() +{ + const std::unordered_map c0 + { {"one", 10}, {"two", 20}, {"three", 30} }; + + std::unordered_map c1 = c0; + std::unordered_multimap c2( c0.begin(), c0.end() ); + c1.merge(c2); + VERIFY( c1 == c0 ); + VERIFY( equal_elements(c2, c0) ); + + c1.clear(); + c1.merge(c2); + VERIFY( c1 == c0 ); + VERIFY( c2.empty() ); + + c2.merge(c1); + VERIFY( c1.empty() ); + VERIFY( equal_elements(c2, c0) ); + + c1 = c0; + c2.merge(c1); + VERIFY( c1.empty() ); + VERIFY( c2.size() == (2 * c0.size()) ); + VERIFY( c2.count("one") == 2 ); + VERIFY( c2.count("two") == 2 ); + VERIFY( c2.count("three") == 2 ); + + c1.merge(c2); + VERIFY( c1 == c0 ); + VERIFY( equal_elements(c2, c0) ); + + c1.merge(std::move(c2)); + VERIFY( c1 == c0 ); + VERIFY( equal_elements(c2, c0) ); + + c1.clear(); + c1.merge(std::move(c2)); + VERIFY( c1 == c0 ); + VERIFY( c2.empty() ); +} + +void +test05() +{ + const std::unordered_map c0 + { {"one", 10}, {"two", 20}, {"three", 30} }; + + std::unordered_map c1 = c0; + std::unordered_multimap, equal> c2( c0.begin(), c0.end() ); + c1.merge(c2); + VERIFY( c1 == c0 ); + VERIFY( equal_elements(c2, c0) ); + + c1.clear(); + c1.merge(c2); + VERIFY( c1 == c0 ); + VERIFY( c2.empty() ); + + c2.merge(c1); + VERIFY( c1.empty() ); + VERIFY( equal_elements(c2, c0) ); + + c1 = c0; + c2.merge(c1); + VERIFY( c1.empty() ); + VERIFY( c2.size() == (2 * c0.size()) ); + VERIFY( c2.count("one") == 2 ); + VERIFY( c2.count("two") == 2 ); + VERIFY( c2.count("three") == 2 ); + + c1.merge(c2); + VERIFY( c1 == c0 ); + VERIFY( equal_elements(c2, c0) ); + + c1.merge(std::move(c2)); + VERIFY( c1 == c0 ); + VERIFY( equal_elements(c2, c0) ); + + c1.clear(); + c1.merge(std::move(c2)); + VERIFY( c1 == c0 ); + VERIFY( c2.empty() ); +} + +template + using hash_f = + std::function; + +std::size_t +hash_func(const std::string& str) +{ return std::hash{}(str); } + +std::size_t +xhash_func(const std::string& str) +{ return xhash{}(str); } + +namespace std +{ + template + struct __is_fast_hash> : __is_fast_hash> + { }; +} + +void +test06() +{ + const std::unordered_map, equal> + c0({ {"one", 10}, {"two", 20}, {"three", 30} }, 3, &hash_func); + + std::unordered_map, equal> + c1(3, &hash_func); + c1 = c0; + std::unordered_multimap, equal> + c2(c0.begin(), c0.end(), 3, &xhash_func); + c1.merge(c2); + VERIFY( c1 == c0 ); + VERIFY( equal_elements(c2, c0) ); + + c1.clear(); + c1.merge(c2); + VERIFY( c1 == c0 ); + VERIFY( c2.empty() ); + + c2.merge(c1); + VERIFY( c1.empty() ); + VERIFY( equal_elements(c2, c0) ); + + c1 = c0; + c2.merge(c1); + VERIFY( c1.empty() ); + VERIFY( c2.size() == (2 * c0.size()) ); + VERIFY( c2.count("one") == 2 ); + VERIFY( c2.count("two") == 2 ); + VERIFY( c2.count("three") == 2 ); + + c1.merge(c2); + VERIFY( c1 == c0 ); + VERIFY( equal_elements(c2, c0) ); + + c1.merge(std::move(c2)); + VERIFY( c1 == c0 ); + VERIFY( equal_elements(c2, c0) ); + + c1.clear(); + c1.merge(std::move(c2)); + VERIFY( c1 == c0 ); + VERIFY( c2.empty() ); +} + int main() { test01(); test02(); test03(); + test04(); + test05(); + test06(); }