From patchwork Fri Jun 7 05:29:23 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Stephen Face X-Patchwork-Id: 1944900 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=gmail.com header.i=@gmail.com header.a=rsa-sha256 header.s=20230601 header.b=YTmgc7vw; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4VwVBm0S33z20Q5 for ; Fri, 7 Jun 2024 15:29:51 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id F267E38DC2DC for ; Fri, 7 Jun 2024 05:29:49 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mail-oi1-x22a.google.com (mail-oi1-x22a.google.com [IPv6:2607:f8b0:4864:20::22a]) by sourceware.org (Postfix) with ESMTPS id 2588939B2ECE; Fri, 7 Jun 2024 05:29:28 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 2588939B2ECE Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=gmail.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=gmail.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 2588939B2ECE Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2607:f8b0:4864:20::22a ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1717738169; cv=none; b=jBUBgZV5+v9py/ZyVIF5GkOWoWkgfS2FjmxEycavZlf0fMzNJ62NCMNq85fPirSTPrk1CUU9kuvNwzaKJcEaNMgkL+PIzbSZQIaZfVXSP/+YCbKKcRGTMDRrjK3kmk/qW1U4gvS1nJ96aHhx91T4fZXY0dvDNX+CxlTfdSpQ+aA= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1717738169; c=relaxed/simple; bh=JII3pNkyLGa/zfIMWPPxjSiwEZ241q+wYTTQAzZnWjE=; h=DKIM-Signature:Message-ID:Date:MIME-Version:From:Subject:To; b=o39mRpKwdysxql8ceQ+64OIuzZiizfIfRMONBZTF2fP331CarKMbejPKi8TFqUNgpNtFHBZugxzP0uawZT5cOs2STajFYMzSSGUOEpRz22M7WCYU1aXClmc0h7wSwFHN2Jq+InQBCPMBYySS1wNh2raKYSeY6aKbnBe++NSKav8= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-oi1-x22a.google.com with SMTP id 5614622812f47-3d1fe4d1ebaso911534b6e.3; Thu, 06 Jun 2024 22:29:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1717738166; x=1718342966; darn=gcc.gnu.org; h=content-transfer-encoding:cc:to:subject:from:content-language :user-agent:mime-version:date:message-id:from:to:cc:subject:date :message-id:reply-to; bh=BzFxar/vrN02pveFOx7ffRRZXfqhmO5vZegMk9YO4vU=; b=YTmgc7vwRjFkGvUBOgUJwhV+VxH7vay6DqV55FTDOnpmXTYM1T8VEoaa2fc6ZOjySK V1FdxAVf5AWuOp+Csf5h7H0fm6GGJ4Ya3l/8xZ7xOgwsLAd3UDokwYPIq1V0WtYCcRU8 7k1Ap47rMiTphj3XeNX6QtLtyh99kuMPhPfXsZRuqyJ6zzMa9FqNiy90N8rveJNhLaPf RzDRMKrN4kKic2cjJpNptxuYd/rhh4GROZXR+HnG8fcBtkiGGE3sjc1tGOd4dThuwSFh Six70YGUvuikXJN7RawbUUQTkLYzhrXo94oG7rCidUXT9kOeIvEj4TNJ3hacpofQAgVJ BR6Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1717738166; x=1718342966; h=content-transfer-encoding:cc:to:subject:from:content-language :user-agent:mime-version:date:message-id:x-gm-message-state:from:to :cc:subject:date:message-id:reply-to; bh=BzFxar/vrN02pveFOx7ffRRZXfqhmO5vZegMk9YO4vU=; b=i0Dg/WFzSjJ3U3m55JsBe+UN1Nc4h+sFy7DJNQ4NMFBgyWDYOZmZUOCsyWHvGJ48VD /LoQtUx3KTNqukQeVKfFJj9tdBH7lU7uvJf+s1lcha4eLCFvF1Xk8cp7Xbbiuu7MUAV9 Bb0RV8H2nQeq+wW0AKWGP4lFe1N/+TCglLELFlywZrBq2TzNRbdlpE/48BR1zUuKCCpq r3riBaIfrlHrKOGNwhJLL5+1nnB5EVcYMfosKFhC8rUucR/iMXV+LR3tz5qFXK0yDKud 64EoA7aNUKTd4A/Lq45dClceOPCEwfkZdfjrXp2/ho8Z6lxQeQD8gtnZ0bajV8CoO8v1 2zYQ== X-Gm-Message-State: AOJu0Yx1CIlAoRcUAerGnIYgC3/AMTktvAfLiV5q0XBQW1IkDjOhq3kV S6KvBINX3FXOnxN9qZYR2pvwP1g3DVbnpYATt8lOXoj9kUVEgbxidHdfew== X-Google-Smtp-Source: AGHT+IFaMH62vA0B7gXFreHrhs3oimuwDYZyaWNSvnaQuejcrag5fyfOEDZ3RupAzihfXGAtenUFSw== X-Received: by 2002:a05:6808:4cc:b0:3c9:c831:1439 with SMTP id 5614622812f47-3d210f51ec8mr1390789b6e.54.1717738166275; Thu, 06 Jun 2024 22:29:26 -0700 (PDT) Received: from ?IPV6:2604:4080:1153:80f1:c1e0:9b3:6b2f:8a5a? ([2604:4080:1153:80f1:c1e0:9b3:6b2f:8a5a]) by smtp.gmail.com with ESMTPSA id d2e1a72fcca58-703fd50de91sm1895544b3a.188.2024.06.06.22.29.25 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Thu, 06 Jun 2024 22:29:25 -0700 (PDT) Message-ID: <9e5636e8-224f-4dde-9c60-489b4aad6534@gmail.com> Date: Thu, 6 Jun 2024 22:29:23 -0700 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Content-Language: en-US From: Stephen Face Subject: [PATCH] libstdc++: Optimize std::gcd To: libstdc++@gcc.gnu.org Cc: gcc-patches@gcc.gnu.org X-Spam-Status: No, score=-12.8 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, FREEMAIL_FROM, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org This patch is to optimize the runtime execution of gcd. Mathematically, it computes with the same algorithm as before, but subtractions and branches are rearranged to encourage generation of code that can use flags from the subtractions for conditional moves. Additionally, most pairs of integers are coprime, so this patch also includes a check for one of the integers to be equal to 1, and then it will exit the loop early in this case. libstdc++-v3/ChangeLog: * include/std/numeric(__gcd): Optimize. --- I have tested this on x86_64-linux and aarch64-linux. I have tested the timing with random distributions of small inputs and large inputs on a couple of machines with -O2 and found decreases in execution time from 20% to 60% depending on the machine and distribution of inputs. libstdc++-v3/include/std/numeric | 21 +++++++++++---------- 1 file changed, 11 insertions(+), 10 deletions(-) diff --git a/libstdc++-v3/include/std/numeric b/libstdc++-v3/include/std/numeric index c912db4a519..3c9e8387a0e 100644 --- a/libstdc++-v3/include/std/numeric +++ b/libstdc++-v3/include/std/numeric @@ -148,19 +148,20 @@ namespace __detail while (true) { - if (__m > __n) - { - _Tp __tmp = __m; - __m = __n; - __n = __tmp; - } + _Tp __m_minus_n = __m - __n; + if (__m_minus_n == 0) + return __m << __k; - __n -= __m; + _Tp __next_m = __m < __n ? __m : __n; - if (__n == 0) - return __m << __k; + if (__next_m == 1) + return __next_m << __k; + + _Tp __n_minus_m = __n - __m; + __n = __n < __m ? __m_minus_n : __n_minus_m; + __m = __next_m; - __n >>= std::__countr_zero(__n); + __n >>= std::__countr_zero(__m_minus_n); } } } // namespace __detail