From patchwork Wed Oct 2 18:11:42 2024
X-Patchwork-Submitter: Dmitry Ilvokhin
X-Patchwork-Id: 1992122
Date: Wed, 2 Oct 2024 18:11:42 +0000
From: Dmitry Ilvokhin
To: libstdc++@gcc.gnu.org
Cc: gcc-patches@gcc.gnu.org
Subject: [PATCH] libstdc++: Unroll loop in load_bytes function

Instead of looping over every byte of the tail,
unroll the loop manually with a switch statement. Compilers (at least
GCC and Clang) then generate a jump table [1], which is faster on a
microbenchmark [2].

[1]: https://godbolt.org/z/aE8Mq3j5G
[2]: https://quick-bench.com/q/ylYLW2R22AZKRvameYYtbYxag24

libstdc++-v3/ChangeLog:

        * libsupc++/hash_bytes.cc (load_bytes): Unroll loop using
        switch statement.

Signed-off-by: Dmitry Ilvokhin
---
 libstdc++-v3/libsupc++/hash_bytes.cc | 27 +++++++++++++++++++++++----
 1 file changed, 23 insertions(+), 4 deletions(-)

diff --git a/libstdc++-v3/libsupc++/hash_bytes.cc b/libstdc++-v3/libsupc++/hash_bytes.cc
index 3665375096a..294a7323dd0 100644
--- a/libstdc++-v3/libsupc++/hash_bytes.cc
+++ b/libstdc++-v3/libsupc++/hash_bytes.cc
@@ -50,10 +50,29 @@ namespace
   load_bytes(const char* p, int n)
   {
     std::size_t result = 0;
-    --n;
-    do
-      result = (result << 8) + static_cast<unsigned char>(p[n]);
-    while (--n >= 0);
+    switch(n & 7)
+      {
+      case 7:
+        result |= std::size_t(p[6]) << 48;
+        [[gnu::fallthrough]];
+      case 6:
+        result |= std::size_t(p[5]) << 40;
+        [[gnu::fallthrough]];
+      case 5:
+        result |= std::size_t(p[4]) << 32;
+        [[gnu::fallthrough]];
+      case 4:
+        result |= std::size_t(p[3]) << 24;
+        [[gnu::fallthrough]];
+      case 3:
+        result |= std::size_t(p[2]) << 16;
+        [[gnu::fallthrough]];
+      case 2:
+        result |= std::size_t(p[1]) << 8;
+        [[gnu::fallthrough]];
+      case 1:
+        result |= std::size_t(p[0]);
+      };
     return result;
   }
 
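
One way to check that the unrolled form matches the loop it replaces is to
compare the two for every tail length on a buffer that contains bytes with
the high bit set. The sketch below is standalone and is not part of the
patch. It makes two assumptions that are labeled in the comments: it uses
std::uint64_t instead of std::size_t so the 48-bit shift is valid on any
target, and it widens each byte through unsigned char, as the removed loop
does. Where plain char is signed, std::size_t(p[i]) would sign-extend a byte
>= 0x80 and set the bits above it. The names load_bytes_loop,
load_bytes_switch, byte_at and variants_agree are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Reference: the loop removed by the patch. Valid for 1 <= n <= 7.
// Assumption: std::uint64_t stands in for a 64-bit std::size_t.
inline std::uint64_t
load_bytes_loop(const char* p, int n)
{
  std::uint64_t result = 0;
  --n;
  do
    result = (result << 8) + static_cast<unsigned char>(p[n]);
  while (--n >= 0);
  return result;
}

// Hypothetical helper (not in the patch): widen one byte without
// sign extension by going through unsigned char first.
inline std::uint64_t
byte_at(const char* p, int i)
{ return static_cast<unsigned char>(p[i]); }

// Unrolled variant shaped like the patch, with the unsigned char
// conversion added as an assumption of this sketch.
inline std::uint64_t
load_bytes_switch(const char* p, int n)
{
  std::uint64_t result = 0;
  switch (n & 7)
    {
    case 7:
      result |= byte_at(p, 6) << 48;
      [[gnu::fallthrough]];
    case 6:
      result |= byte_at(p, 5) << 40;
      [[gnu::fallthrough]];
    case 5:
      result |= byte_at(p, 4) << 32;
      [[gnu::fallthrough]];
    case 4:
      result |= byte_at(p, 3) << 24;
      [[gnu::fallthrough]];
    case 3:
      result |= byte_at(p, 2) << 16;
      [[gnu::fallthrough]];
    case 2:
      result |= byte_at(p, 1) << 8;
      [[gnu::fallthrough]];
    case 1:
      result |= byte_at(p, 0);
    }
  return result;
}

// Returns true if both variants agree for every tail length 1..7 on a
// buffer that includes bytes with the high bit set.
inline bool
variants_agree()
{
  const char buf[7] = { '\x01', '\x80', '\xff', '\x7f', '\xa5', '\x00', '\xfe' };
  for (int n = 1; n <= 7; ++n)
    if (load_bytes_loop(buf, n) != load_bytes_switch(buf, n))
      return false;
  return true;
}
```

Calling variants_agree() from a small test driver exercises all seven tail
lengths. The byte values were chosen so that a sign-extending conversion
would produce a visible mismatch on targets where char is signed.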