From patchwork Fri Oct 18 11:17:58 2024
X-Patchwork-Submitter: Richard Sandiford
X-Patchwork-Id: 1999071
From: Richard Sandiford
To: rguenther@suse.de, gcc-patches@gcc.gnu.org
Cc: Richard Sandiford
Subject: [PATCH 1/9] Make more places handle exact_div like trunc_div
Date: Fri, 18 Oct 2024 12:17:58 +0100
Message-Id: <20241018111806.4026759-2-richard.sandiford@arm.com>
In-Reply-To: <20241018111806.4026759-1-richard.sandiford@arm.com>
References: <20241018111806.4026759-1-richard.sandiford@arm.com>

I tried to look for places where we were handling TRUNC_DIV_EXPR more
favourably than EXACT_DIV_EXPR.  Most of the places that I looked at
but didn't change were handling div/mod pairs.
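[Editorial note: before the details, the equivalence these rules rely on can be sketched in plain C.  This is an illustrative, hypothetical example, not part of the patch: an exact division is simply a division whose remainder is known to be zero, so it agrees with truncating division on every operand it is defined for, and any rule that is safe for trunc_div on such operands is also safe for exact_div.]

```c
#include <assert.h>

/* Truncating division by 1 << shift, written the way GCC can emit it
   for a non-negative dividend: a plain right shift.  */
static int
div_pow2_as_shift (int x, int shift)
{
  assert (x >= 0);
  return x >> shift;
}

/* An "exact" division: the caller guarantees a zero remainder.  On
   every such input its result agrees with truncating division, which
   is why exact_div can reuse the trunc_div rules.  */
static int
exact_div_pow2 (int x, int shift)
{
  assert (x % (1 << shift) == 0);
  return x / (1 << shift);
}
```

For example, both functions map x = 40, shift = 3 to the same value; the point of the patch is that match.pd previously only matched the trunc_div spelling.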
But there are bound to be others that I missed...

gcc/
	* match.pd: Extend some rules to handle exact_div like trunc_div.
	* tree.h (trunc_div_p): New function.
	* tree-ssa-loop-niter.cc (is_rshift_by_1): Use it.
	* tree-ssa-loop-ivopts.cc (force_expr_to_var_cost): Handle
	EXACT_DIV_EXPR.
---
 gcc/match.pd                | 60 +++++++++++++++++++------------------
 gcc/tree-ssa-loop-ivopts.cc |  2 ++
 gcc/tree-ssa-loop-niter.cc  |  2 +-
 gcc/tree.h                  | 13 ++++++++
 4 files changed, 47 insertions(+), 30 deletions(-)

diff --git a/gcc/match.pd b/gcc/match.pd
index 12d81fcac0d..4aea028a866 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -492,27 +492,28 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
    of A starting from shift's type sign bit are zero, as
    (unsigned long long) (1 << 31) is -2147483648ULL, not 2147483648ULL,
    so it is valid only if A >> 31 is zero.  */
-(simplify
- (trunc_div (convert?@0 @3) (convert2? (lshift integer_onep@1 @2)))
- (if ((TYPE_UNSIGNED (type) || tree_expr_nonnegative_p (@0))
-      && (!VECTOR_TYPE_P (type)
-	  || target_supports_op_p (type, RSHIFT_EXPR, optab_vector)
-	  || target_supports_op_p (type, RSHIFT_EXPR, optab_scalar))
-      && (useless_type_conversion_p (type, TREE_TYPE (@1))
-	  || (element_precision (type) >= element_precision (TREE_TYPE (@1))
-	      && (TYPE_UNSIGNED (TREE_TYPE (@1))
-		  || (element_precision (type)
-		      == element_precision (TREE_TYPE (@1)))
-		  || (INTEGRAL_TYPE_P (type)
-		      && (tree_nonzero_bits (@0)
-			  & wi::mask (element_precision (TREE_TYPE (@1)) - 1,
-				      true,
-				      element_precision (type))) == 0)))))
-  (if (!VECTOR_TYPE_P (type)
-       && useless_type_conversion_p (TREE_TYPE (@3), TREE_TYPE (@1))
-       && element_precision (TREE_TYPE (@3)) < element_precision (type))
-   (convert (rshift @3 @2))
-   (rshift @0 @2))))
+(for div (trunc_div exact_div)
+ (simplify
+  (div (convert?@0 @3) (convert2? (lshift integer_onep@1 @2)))
+  (if ((TYPE_UNSIGNED (type) || tree_expr_nonnegative_p (@0))
+       && (!VECTOR_TYPE_P (type)
+	   || target_supports_op_p (type, RSHIFT_EXPR, optab_vector)
+	   || target_supports_op_p (type, RSHIFT_EXPR, optab_scalar))
+       && (useless_type_conversion_p (type, TREE_TYPE (@1))
+	   || (element_precision (type) >= element_precision (TREE_TYPE (@1))
+	       && (TYPE_UNSIGNED (TREE_TYPE (@1))
+		   || (element_precision (type)
+		       == element_precision (TREE_TYPE (@1)))
+		   || (INTEGRAL_TYPE_P (type)
+		       && (tree_nonzero_bits (@0)
+			   & wi::mask (element_precision (TREE_TYPE (@1)) - 1,
+				       true,
+				       element_precision (type))) == 0)))))
+   (if (!VECTOR_TYPE_P (type)
+	&& useless_type_conversion_p (TREE_TYPE (@3), TREE_TYPE (@1))
+	&& element_precision (TREE_TYPE (@3)) < element_precision (type))
+    (convert (rshift @3 @2))
+    (rshift @0 @2)))))

 /* Preserve explicit divisions by 0: the C++ front-end wants to detect
    undefined behavior in constexpr evaluation, and assuming that the division
@@ -947,13 +948,14 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
       { build_one_cst (utype); })))))))

 /* Simplify (unsigned t * 2)/2 -> unsigned t & 0x7FFFFFFF.  */
-(simplify
- (trunc_div (mult @0 integer_pow2p@1) @1)
- (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) && TYPE_UNSIGNED (TREE_TYPE (@0)))
-  (bit_and @0 { wide_int_to_tree
-		(type, wi::mask (TYPE_PRECISION (type)
-				 - wi::exact_log2 (wi::to_wide (@1)),
-				 false, TYPE_PRECISION (type))); })))
+(for div (trunc_div exact_div)
+ (simplify
+  (div (mult @0 integer_pow2p@1) @1)
+  (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) && TYPE_UNSIGNED (TREE_TYPE (@0)))
+   (bit_and @0 { wide_int_to_tree
+		 (type, wi::mask (TYPE_PRECISION (type)
+				  - wi::exact_log2 (wi::to_wide (@1)),
+				  false, TYPE_PRECISION (type))); }))))

 /* Simplify (unsigned t / 2) * 2 -> unsigned t & ~1.  */
 (simplify
@@ -5715,7 +5717,7 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 /* Sink binary operation to branches, but only if we can fold it.  */
 (for op (tcc_comparison plus minus mult bit_and bit_ior bit_xor
-	 lshift rshift rdiv trunc_div ceil_div floor_div round_div
+	 lshift rshift rdiv trunc_div ceil_div floor_div round_div exact_div
	 trunc_mod ceil_mod floor_mod round_mod min max)
 /* (c ? a : b) op (c ? d : e)  -->  c ? (a op d) : (b op e) */
 (simplify

diff --git a/gcc/tree-ssa-loop-ivopts.cc b/gcc/tree-ssa-loop-ivopts.cc
index 7441324aec2..68d091dc4a3 100644
--- a/gcc/tree-ssa-loop-ivopts.cc
+++ b/gcc/tree-ssa-loop-ivopts.cc
@@ -4369,6 +4369,7 @@ force_expr_to_var_cost (tree expr, bool speed)
     case PLUS_EXPR:
     case MINUS_EXPR:
     case MULT_EXPR:
+    case EXACT_DIV_EXPR:
     case TRUNC_DIV_EXPR:
     case BIT_AND_EXPR:
     case BIT_IOR_EXPR:
@@ -4482,6 +4483,7 @@ force_expr_to_var_cost (tree expr, bool speed)
	return comp_cost (target_spill_cost [speed], 0);
       break;

+    case EXACT_DIV_EXPR:
     case TRUNC_DIV_EXPR:
       /* Division by power of two is usually cheap, so we allow it.  Forbid
	 anything else.  */

diff --git a/gcc/tree-ssa-loop-niter.cc b/gcc/tree-ssa-loop-niter.cc
index f87731ef892..38be41b06b3 100644
--- a/gcc/tree-ssa-loop-niter.cc
+++ b/gcc/tree-ssa-loop-niter.cc
@@ -2328,7 +2328,7 @@ is_rshift_by_1 (gassign *stmt)
   if (gimple_assign_rhs_code (stmt) == RSHIFT_EXPR
       && integer_onep (gimple_assign_rhs2 (stmt)))
     return true;
-  if (gimple_assign_rhs_code (stmt) == TRUNC_DIV_EXPR
+  if (trunc_div_p (gimple_assign_rhs_code (stmt))
       && tree_fits_shwi_p (gimple_assign_rhs2 (stmt))
       && tree_to_shwi (gimple_assign_rhs2 (stmt)) == 2)
     return true;

diff --git a/gcc/tree.h b/gcc/tree.h
index d324a3f42a6..db91cd685dc 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -5601,6 +5601,19 @@ struct_ptr_hash (const void *a)
   return (intptr_t)*x >> 4;
 }

+/* Return true if CODE can be treated as a truncating division.
+
+   EXACT_DIV_EXPR can be treated as a truncating division in which the
+   remainder is known to be zero.  However, if trunc_div_p gates the
+   generation of new IL, the conservative choice for that new IL is
+   TRUNC_DIV_EXPR rather than CODE.  Using CODE (EXACT_DIV_EXPR) would
+   only be correct if the transformation preserves exactness.  */
+inline bool
+trunc_div_p (tree_code code)
+{
+  return code == TRUNC_DIV_EXPR || code == EXACT_DIV_EXPR;
+}
+
 /* Return nonzero if CODE is a tree code that represents a truth value.  */
 inline bool
 truth_value_p (enum tree_code code)
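[Editorial note: the asymmetry described in the trunc_div_p comment can be shown with a plain-C sketch (illustrative only, not from the patch): rewriting an exact division as a truncating one is always value-preserving, but the reverse direction is only valid when the remainder really is zero, so new IL gated on trunc_div_p should default to the truncating form.]

```c
#include <assert.h>

/* Truncating division: well-defined for any remainder.  */
static int
truncating_div (int x, int y)
{
  return x / y;
}

/* Exact division: only meaningful when the remainder is zero, in
   which case it coincides with truncating division.  Lowering an
   exact division to this is safe; the converse is not, because a
   truncating division may have a nonzero remainder.  */
static int
exact_division (int x, int y)
{
  assert (x % y == 0);
  return x / y;
}
```

For instance, truncating_div (7, 2) is a perfectly good operation (it yields 3), but exact_division's precondition fails for it; only inputs like (6, 2) satisfy both, and there the two agree.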
From patchwork Fri Oct 18 11:17:59 2024
X-Patchwork-Submitter: Richard Sandiford
X-Patchwork-Id: 1999068
From: Richard Sandiford
To: rguenther@suse.de, gcc-patches@gcc.gnu.org
Cc: Richard Sandiford
Subject: [PATCH 2/9] Use get_nonzero_bits to simplify trunc_div to exact_div
Date: Fri, 18 Oct 2024 12:17:59 +0100
Message-Id: <20241018111806.4026759-3-richard.sandiford@arm.com>
In-Reply-To: <20241018111806.4026759-1-richard.sandiford@arm.com>
References: <20241018111806.4026759-1-richard.sandiford@arm.com>
There are a limited number of existing rules that benefit from knowing
that a division is exact.  Later patches will add more.

gcc/
	* match.pd: Simplify X / (1 << C) to X /[ex] (1 << C) if the low
	C bits of X are clear.

gcc/testsuite/
	* gcc.dg/tree-ssa/cmpexactdiv-6.c: New test.
---
 gcc/match.pd                                  |  9 ++++++
 gcc/testsuite/gcc.dg/tree-ssa/cmpexactdiv-6.c | 29 +++++++++++++++++++
 2 files changed, 38 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/cmpexactdiv-6.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 4aea028a866..b952225b08c 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -5431,6 +5431,15 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
			   TYPE_PRECISION (type)), 0))
    (convert @0)))

+#if GIMPLE
+/* X / (1 << C) -> X /[ex] (1 << C) if the low C bits of X are clear.  */
+(simplify
+ (trunc_div (with_possible_nonzero_bits2 @0) integer_pow2p@1)
+ (if (INTEGRAL_TYPE_P (type)
+      && !TYPE_UNSIGNED (type)
+      && wi::multiple_of_p (get_nonzero_bits (@0), wi::to_wide (@1), SIGNED))
+  (exact_div @0 @1)))
+#endif
+
 /* (X /[ex] A) * A -> X.  */
 (simplify

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/cmpexactdiv-6.c b/gcc/testsuite/gcc.dg/tree-ssa/cmpexactdiv-6.c
new file mode 100644
index 00000000000..82d517b05ab
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/cmpexactdiv-6.c
@@ -0,0 +1,29 @@
+/* { dg-options "-O2 -fdump-tree-optimized-raw" } */
+
+typedef __INTPTR_TYPE__ intptr_t;
+
+int
+f1 (int x, int y)
+{
+  if ((x & 1) || (y & 1))
+    __builtin_unreachable ();
+  x /= 2;
+  y /= 2;
+  return x < y;
+}
+
+int
+f2 (void *ptr1, void *ptr2, void *ptr3)
+{
+  ptr1 = __builtin_assume_aligned (ptr1, 4);
+  ptr2 = __builtin_assume_aligned (ptr2, 4);
+  ptr3 = __builtin_assume_aligned (ptr3, 4);
+  intptr_t diff1 = (intptr_t) ptr1 - (intptr_t) ptr2;
+  intptr_t diff2 = (intptr_t) ptr1 - (intptr_t) ptr3;
+  diff1 /= 2;
+  diff2 /= 2;
+  return diff1 < diff2;
+}
+
+/* { dg-final { scan-tree-dump-not {<[a-z]*_div_expr,} "optimized" } } */
+/* { dg-final { scan-tree-dump-not {[...]
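[Editorial note: the behaviour that cmpexactdiv-6.c checks can be sketched as a runnable C program.  This is an illustrative example, not part of the patch: once the low bit of x and y is known to be clear, x / 2 is exact, and x / 2 < y / 2 is equivalent to x < y, so the divisions can be folded away entirely.]

```c
#include <assert.h>

/* Mirror of the f1 test above: when x and y are known even, each
   division by 2 is exact, so comparing the halves is the same as
   comparing the originals.  */
static int
cmp_halves (int x, int y)
{
  assert ((x & 1) == 0 && (y & 1) == 0);  /* low bit clear: /2 is exact */
  return x / 2 < y / 2;
}

static int
cmp_direct (int x, int y)
{
  return x < y;
}
```

For even operands the two functions always agree, e.g. on (-4, 6) and (8, 2).  For odd operands they can differ — cmp_halves (2, 3) would compare 1 with 1 — which is exactly why the rule is gated on the low bits being provably clear.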
From patchwork Fri Oct 18 11:18:00 2024
X-Patchwork-Submitter: Richard Sandiford
X-Patchwork-Id: 1999073
From: Richard Sandiford
To: rguenther@suse.de, gcc-patches@gcc.gnu.org
Cc: Richard Sandiford
Subject: [PATCH 3/9] Simplify X /[ex] Y cmp Z -> X cmp (Y * Z)
Date: Fri, 18 Oct 2024 12:18:00 +0100
Message-Id: <20241018111806.4026759-4-richard.sandiford@arm.com>
In-Reply-To: <20241018111806.4026759-1-richard.sandiford@arm.com>
References: <20241018111806.4026759-1-richard.sandiford@arm.com>
This patch applies X /[ex] Y cmp Z -> X cmp (Y * Z) when Y * Z is
representable.

The closest check for "is representable" on range operations seemed
to be overflow_free_p.  However, that is designed for testing existing
operations and so takes the definedness of signed overflow into
account.  Here, the question is whether we can create an entirely new
value.  The patch adds a new optional argument to overflow_free_p to
distinguish these cases.  It also adds a wrapper, so that it isn't
necessary to specify TRIO_VARYING.

I couldn't find a good way of removing the duplication between the
two operand orders.  The rules are (in a loose sense) symmetric, but
they're not based on commutativity.

gcc/
	* range-op.h (range_query_type): New enum.
	(range_op_handler::fits_type_p): New function.
	(range_operator::overflow_free_p): Add an argument to specify the
	type of query.
	(range_op_handler::overflow_free_p): Likewise.
	* range-op-mixed.h (operator_plus::overflow_free_p): Likewise.
	(operator_minus::overflow_free_p): Likewise.
	(operator_mult::overflow_free_p): Likewise.
	* range-op.cc (range_op_handler::overflow_free_p): Likewise.
	(range_operator::overflow_free_p): Likewise.
	(operator_plus::overflow_free_p): Likewise.
	(operator_minus::overflow_free_p): Likewise.
	(operator_mult::overflow_free_p): Likewise.
	* match.pd: Simplify X /[ex] Y cmp Z -> X cmp (Y * Z) when
	Y * Z is representable.

gcc/testsuite/
	* gcc.dg/tree-ssa/cmpexactdiv-7.c: New test.
	* gcc.dg/tree-ssa/cmpexactdiv-8.c: Likewise.
---
 gcc/match.pd                                  | 21 +++++++++++++
 gcc/range-op-mixed.h                          |  9 ++++--
 gcc/range-op.cc                               | 19 ++++++------
 gcc/range-op.h                                | 31 +++++++++++++++++--
 gcc/testsuite/gcc.dg/tree-ssa/cmpexactdiv-7.c | 21 +++++++++++++
 gcc/testsuite/gcc.dg/tree-ssa/cmpexactdiv-8.c | 20 ++++++++++++
 6 files changed, 107 insertions(+), 14 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/cmpexactdiv-7.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/cmpexactdiv-8.c

diff --git a/gcc/match.pd b/gcc/match.pd
index b952225b08c..1b1d38cf105 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -2679,6 +2679,27 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
	 (le (minus (convert:etype @0) { lo; }) { hi; })
	 (gt (minus (convert:etype @0) { lo; }) { hi; })))))))))

+#if GIMPLE
+/* X /[ex] Y cmp Z -> X cmp (Y * Z), if Y * Z is representable.  */
+(for cmp (simple_comparison)
+ (simplify
+  (cmp (exact_div:s @0 @1) @2)
+  (with { int_range_max r1, r2; }
+   (if (INTEGRAL_TYPE_P (type)
+	&& get_range_query (cfun)->range_of_expr (r1, @1)
+	&& get_range_query (cfun)->range_of_expr (r2, @2)
+	&& range_op_handler (MULT_EXPR).fits_type_p (r1, r2))
+    (cmp @0 (mult @1 @2)))))
+ (simplify
+  (cmp @2 (exact_div:s @0 @1))
+  (with { int_range_max r1, r2; }
+   (if (INTEGRAL_TYPE_P (type)
+	&& get_range_query (cfun)->range_of_expr (r1, @1)
+	&& get_range_query (cfun)->range_of_expr (r2, @2)
+	&& range_op_handler (MULT_EXPR).fits_type_p (r1, r2))
+    (cmp (mult @1 @2) @0)))))
+#endif
+
 /* X + Z < Y + Z is the same as X < Y when there is no overflow.
  */
 (for op (lt le ge gt)
  (simplify

diff --git a/gcc/range-op-mixed.h b/gcc/range-op-mixed.h
index cc1db2f6775..402cb87c2b2 100644
--- a/gcc/range-op-mixed.h
+++ b/gcc/range-op-mixed.h
@@ -539,7 +539,8 @@ public:
		      const irange &rh) const final override;
   virtual bool overflow_free_p (const irange &lh, const irange &rh,
-				relation_trio = TRIO_VARYING) const;
+				relation_trio = TRIO_VARYING,
+				range_query_type = QUERY_WITH_GIMPLE_UB) const;
   // Check compatibility of all operands.
   bool operand_check_p (tree t1, tree t2, tree t3) const final override
     { return range_compatible_p (t1, t2) && range_compatible_p (t1, t3); }
@@ -615,7 +616,8 @@ public:
		      const irange &rh) const final override;
   virtual bool overflow_free_p (const irange &lh, const irange &rh,
-				relation_trio = TRIO_VARYING) const;
+				relation_trio = TRIO_VARYING,
+				range_query_type = QUERY_WITH_GIMPLE_UB) const;
   // Check compatibility of all operands.
   bool operand_check_p (tree t1, tree t2, tree t3) const final override
     { return range_compatible_p (t1, t2) && range_compatible_p (t1, t3); }
@@ -701,7 +703,8 @@ public:
			  const REAL_VALUE_TYPE &rh_lb,
			  const REAL_VALUE_TYPE &rh_ub,
			  relation_kind kind) const final override;
   virtual bool overflow_free_p (const irange &lh, const irange &rh,
-				relation_trio = TRIO_VARYING) const;
+				relation_trio = TRIO_VARYING,
+				range_query_type = QUERY_WITH_GIMPLE_UB) const;
   // Check compatibility of all operands.
   bool operand_check_p (tree t1, tree t2, tree t3) const final override
     { return range_compatible_p (t1, t2) && range_compatible_p (t1, t3); }

diff --git a/gcc/range-op.cc b/gcc/range-op.cc
index 3f5cf083440..1634cebd1bd 100644
--- a/gcc/range-op.cc
+++ b/gcc/range-op.cc
@@ -470,7 +470,8 @@ range_op_handler::op1_op2_relation (const vrange &lhs,
 bool
 range_op_handler::overflow_free_p (const vrange &lh,
				    const vrange &rh,
-				    relation_trio rel) const
+				    relation_trio rel,
+				    range_query_type query_type) const
 {
   gcc_checking_assert (m_operator);
   switch (dispatch_kind (lh, lh, rh))
@@ -478,7 +479,7 @@ range_op_handler::overflow_free_p (const vrange &lh,
       case RO_III:
	return m_operator->overflow_free_p(as_a <irange> (lh),
					   as_a <irange> (rh),
-					   rel);
+					   rel, query_type);
       default:
	return false;
     }
@@ -823,7 +824,7 @@ range_operator::op1_op2_relation_effect (irange &lhs_range ATTRIBUTE_UNUSED,
 bool
 range_operator::overflow_free_p (const irange &, const irange &,
-				 relation_trio) const
+				 relation_trio, range_query_type) const
 {
   return false;
 }
@@ -4532,13 +4533,13 @@ range_op_table::initialize_integral_ops ()
 bool
 operator_plus::overflow_free_p (const irange &lh, const irange &rh,
-				relation_trio) const
+				relation_trio, range_query_type query_type) const
 {
   if (lh.undefined_p () || rh.undefined_p ())
     return false;

   tree type = lh.type ();
-  if (TYPE_OVERFLOW_UNDEFINED (type))
+  if (query_type == QUERY_WITH_GIMPLE_UB && TYPE_OVERFLOW_UNDEFINED (type))
     return true;

   wi::overflow_type ovf;
@@ -4563,13 +4564,13 @@ operator_plus::overflow_free_p (const irange &lh, const irange &rh,
 bool
 operator_minus::overflow_free_p (const irange &lh, const irange &rh,
-				 relation_trio) const
+				 relation_trio, range_query_type query_type) const
 {
   if (lh.undefined_p () || rh.undefined_p ())
     return false;

   tree type = lh.type ();
-  if (TYPE_OVERFLOW_UNDEFINED (type))
+  if (query_type == QUERY_WITH_GIMPLE_UB && TYPE_OVERFLOW_UNDEFINED (type))
     return true;

   wi::overflow_type ovf;
@@ -4594,13 +4595,13 @@ operator_minus::overflow_free_p (const irange &lh, const irange &rh,
 bool
 operator_mult::overflow_free_p (const irange &lh, const irange &rh,
-				relation_trio) const
+				relation_trio, range_query_type query_type) const
 {
   if (lh.undefined_p () || rh.undefined_p ())
     return false;

   tree type = lh.type ();
-  if (TYPE_OVERFLOW_UNDEFINED (type))
+  if (query_type == QUERY_WITH_GIMPLE_UB && TYPE_OVERFLOW_UNDEFINED (type))
     return true;

   wi::overflow_type ovf;

diff --git a/gcc/range-op.h b/gcc/range-op.h
index e415f87d7e6..3c69fdd2812 100644
--- a/gcc/range-op.h
+++ b/gcc/range-op.h
@@ -32,6 +32,18 @@ enum range_op_dispatch_type
   DISPATCH_OP1_OP2_RELATION
 };

+enum range_query_type
+{
+  // Take advantage of gimple's rules about undefined behavior.  This is
+  // usually the right choice when querying existing operations.
+  QUERY_WITH_GIMPLE_UB,
+
+  // The result of the operation must follow the rules of natural arithmetic.
+  // This is usually the right choice when querying whether we can create
+  // a new operation.
+  QUERY_NATURAL_ARITH
+};
+
 // This class is implemented for each kind of operator supported by
 // the range generator.  It serves various purposes.
 //
@@ -225,7 +237,8 @@ public:
				const frange &op2) const;
   virtual bool overflow_free_p (const irange &lh, const irange &rh,
-				relation_trio = TRIO_VARYING) const;
+				relation_trio = TRIO_VARYING,
+				range_query_type = QUERY_WITH_GIMPLE_UB) const;
   // Compatability check for operands.
   virtual bool operand_check_p (tree, tree, tree) const;
@@ -319,7 +332,10 @@ public:
		     const vrange &op1, const vrange &op2) const;
   bool overflow_free_p (const vrange &lh, const vrange &rh,
-			relation_trio = TRIO_VARYING) const;
+			relation_trio = TRIO_VARYING,
+			range_query_type = QUERY_WITH_GIMPLE_UB) const;
+  bool fits_type_p (const irange &lh, const irange &rh,
+		    relation_trio = TRIO_VARYING) const;
   bool operand_check_p (tree, tree, tree) const;
 protected:
   unsigned dispatch_kind (const vrange &lhs, const vrange &op1,
@@ -330,6 +346,17 @@ protected:
   range_operator *m_operator;
 };

+// Test whether the result of the operator fits within the type.
+// This follows the rules of natural arithmetic and so disallows
+// any form of overflow or wrapping.
+
+inline bool
+range_op_handler::fits_type_p (const irange &lh, const irange &rh,
+			       relation_trio trio) const
+{
+  return overflow_free_p (lh, rh, trio, QUERY_NATURAL_ARITH);
+}
+
 // Cast the range in R to TYPE if R supports TYPE.
 inline bool

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/cmpexactdiv-7.c b/gcc/testsuite/gcc.dg/tree-ssa/cmpexactdiv-7.c
new file mode 100644
index 00000000000..8a33bbd30f0
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/cmpexactdiv-7.c
@@ -0,0 +1,21 @@
+/* { dg-options "-O2 -fdump-tree-optimized-raw" } */
+
+#define TEST_CMP(FN, OP)			\
+  int						\
+  FN (int x, int y)				\
+  {						\
+    if (x & 7)					\
+      __builtin_unreachable ();			\
+    x /= 4;					\
+    return x OP (int) (y & (~0U >> 3));		\
+  }
+
+TEST_CMP (f1, <)
+TEST_CMP (f2, <=)
+TEST_CMP (f3, ==)
+TEST_CMP (f4, !=)
+TEST_CMP (f5, >=)
+TEST_CMP (f6, >)
+
+/* { dg-final { scan-tree-dump-not {<[a-z]*_div_expr, } "optimized" } } */
+/* { dg-final { scan-tree-dump-not {[...]

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/cmpexactdiv-8.c b/gcc/testsuite/gcc.dg/tree-ssa/cmpexactdiv-8.c
new file mode 100644
[...]> 2)); \
+  }
+
+TEST_CMP (f1, <)
+TEST_CMP (f2, <=)
+TEST_CMP (f3, ==)
+TEST_CMP (f4, !=)
+TEST_CMP (f5, >=)
+TEST_CMP (f6, >)
+
+/* { dg-final { scan-tree-dump-times {<[a-z]*_div_expr, } 6 "optimized" } } */
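[Editorial note: the transformation of this patch, and why it is gated on Y * Z being representable, can be sketched in runnable C.  This is an illustrative example, not part of the patch, using the constant divisor 8: when x is a multiple of 8, x / 8 == z can become x == 8 * z, provided 8 * z fits in int; evaluating the product in a wider type shows the two forms agree.]

```c
#include <assert.h>
#include <limits.h>

/* The exact-division form: the caller guarantees x is a multiple
   of 8, so x / 8 is an exact division.  */
static int
cmp_div_form (int x, int z)
{
  assert (x % 8 == 0);
  return x / 8 == z;
}

/* The multiplied-out form that the rule produces.  The widening to
   long long stands in for the fits_type_p check: if 8 * z is not
   representable as int, then x / 8 can never equal z, and no int
   multiplication may be created.  */
static int
cmp_mul_form (int x, int z)
{
  long long prod = 8LL * z;	/* widen so the product cannot overflow */
  if (prod < INT_MIN || prod > INT_MAX)
    return 0;			/* z is out of range of any x / 8 */
  return x == (int) prod;
}
```

For any x that is a multiple of 8, cmp_div_form and cmp_mul_form agree for every z; the fits_type_p guard is what makes it safe to emit the multiplication in the original type.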
From patchwork Fri Oct 18 11:18:01 2024
X-Patchwork-Submitter: Richard Sandiford
X-Patchwork-Id: 1999069
From: Richard Sandiford
To: rguenther@suse.de, gcc-patches@gcc.gnu.org
Cc: Richard Sandiford
Subject: [PATCH 4/9] Simplify (X /[ex] C1) * (C1 * C2) -> X * C2
Date: Fri, 18 Oct 2024 12:18:01 +0100
Message-Id: <20241018111806.4026759-5-richard.sandiford@arm.com>
In-Reply-To: <20241018111806.4026759-1-richard.sandiford@arm.com>
References: <20241018111806.4026759-1-richard.sandiford@arm.com>

gcc/
	* match.pd: Simplify (X /[ex] C1) * (C1 * C2) -> X * C2.

gcc/testsuite/
	* gcc.dg/tree-ssa/mulexactdiv-1.c: New test.
	* gcc.dg/tree-ssa/mulexactdiv-2.c: Likewise.
	* gcc.dg/tree-ssa/mulexactdiv-3.c: Likewise.
	* gcc.dg/tree-ssa/mulexactdiv-4.c: Likewise.
* gcc.target/aarch64/sve/cnt_fold_1.c: Likewise. * gcc.target/aarch64/sve/cnt_fold_2.c: Likewise. --- gcc/match.pd | 8 ++ gcc/testsuite/gcc.dg/tree-ssa/mulexactdiv-1.c | 23 ++++ gcc/testsuite/gcc.dg/tree-ssa/mulexactdiv-2.c | 19 +++ gcc/testsuite/gcc.dg/tree-ssa/mulexactdiv-3.c | 21 ++++ gcc/testsuite/gcc.dg/tree-ssa/mulexactdiv-4.c | 14 +++ .../gcc.target/aarch64/sve/cnt_fold_1.c | 110 ++++++++++++++++++ .../gcc.target/aarch64/sve/cnt_fold_2.c | 55 +++++++++ 7 files changed, 250 insertions(+) create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/mulexactdiv-1.c create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/mulexactdiv-2.c create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/mulexactdiv-3.c create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/mulexactdiv-4.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/cnt_fold_1.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/cnt_fold_2.c diff --git a/gcc/match.pd b/gcc/match.pd index 1b1d38cf105..6677bc06d80 100644 --- a/gcc/match.pd +++ b/gcc/match.pd @@ -31,6 +31,7 @@ along with GCC; see the file COPYING3. If not see zerop initializer_each_zero_or_onep CONSTANT_CLASS_P + poly_int_tree_p tree_expr_nonnegative_p tree_expr_nonzero_p integer_valued_real_p @@ -5467,6 +5468,13 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) (mult (convert1? (exact_div @0 @@1)) (convert2? @1)) (convert @0)) +/* (X /[ex] C1) * (C1 * C2) -> X * C2. */ +(simplify + (mult (convert? (exact_div @0 INTEGER_CST@1)) poly_int_tree_p@2) + (with { poly_widest_int factor; } + (if (multiple_p (wi::to_poly_widest (@2), wi::to_widest (@1), &factor)) + (mult (convert @0) { wide_int_to_tree (type, factor); })))) + /* Simplify (A / B) * B + (A % B) -> A. 
*/ (for div (trunc_div ceil_div floor_div round_div) mod (trunc_mod ceil_mod floor_mod round_mod) diff --git a/gcc/testsuite/gcc.dg/tree-ssa/mulexactdiv-1.c b/gcc/testsuite/gcc.dg/tree-ssa/mulexactdiv-1.c new file mode 100644 index 00000000000..fa853eb7dff --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/mulexactdiv-1.c @@ -0,0 +1,23 @@ +/* { dg-options "-O2 -fdump-tree-optimized-raw" } */ + +#define TEST_CMP(FN, DIV, MUL) \ + int \ + FN (int x) \ + { \ + if (x & 7) \ + __builtin_unreachable (); \ + x /= DIV; \ + return x * MUL; \ + } + +TEST_CMP (f1, 2, 6) +TEST_CMP (f2, 2, 10) +TEST_CMP (f3, 4, 80) +TEST_CMP (f4, 8, 200) + +/* { dg-final { scan-tree-dump-not {<[a-z]*_div_expr, } "optimized" } } */ +/* { dg-final { scan-tree-dump-not {> 1) & -2) +TEST_CMP (f2, int, 4, unsigned long, -8) +TEST_CMP (f3, int, 8, unsigned int, -24) +TEST_CMP (f4, long, 2, int, (~0U >> 1) & -2) +TEST_CMP (f5, long, 4, unsigned int, 100) +TEST_CMP (f6, long, 8, unsigned long, 200) + +/* { dg-final { scan-tree-dump-not {<[a-z]*_div_expr, } "optimized" } } */ +/* { dg-final { scan-tree-dump-not { + +/* +** f1: +** cntd x([0-9]+) +** mul w0, (w0, w\1|w\1, w0) +** ret +*/ +int +f1 (int x) +{ + if (x & 1) + __builtin_unreachable (); + x /= 2; + return x * svcntw(); +} + +/* +** f2: +** cntd x([0-9]+) +** mul w0, (w0, w\1|w\1, w0) +** ret +*/ +int +f2 (int x) +{ + if (x & 3) + __builtin_unreachable (); + x /= 4; + return x * svcnth(); +} + +/* +** f3: +** cntd x([0-9]+) +** mul w0, (w0, w\1|w\1, w0) +** ret +*/ +int +f3 (int x) +{ + if (x & 7) + __builtin_unreachable (); + x /= 8; + return x * svcntb(); +} + +/* +** f4: +** cntw x([0-9]+) +** mul w0, (w0, w\1|w\1, w0) +** ret +*/ +int +f4 (int x) +{ + if (x & 1) + __builtin_unreachable (); + x /= 2; + return x * svcnth(); +} + +/* +** f5: +** cntw x([0-9]+) +** mul w0, (w0, w\1|w\1, w0) +** ret +*/ +int +f5 (int x) +{ + if (x & 3) + __builtin_unreachable (); + x /= 4; + return x * svcntb(); +} + +/* +** f6: +** cnth x([0-9]+) +** mul w0, 
(w0, w\1|w\1, w0) +** ret +*/ +int +f6 (int x) +{ + if (x & 1) + __builtin_unreachable (); + x /= 2; + return x * svcntb(); +} + +/* +** f7: +** cntb x([0-9]+) +** mul w0, (w0, w\1|w\1, w0) +** ret +*/ +int +f7 (int x) +{ + if (x & 15) + __builtin_unreachable (); + x /= 16; + return x * svcntb() * 16; +} diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cnt_fold_2.c b/gcc/testsuite/gcc.target/aarch64/sve/cnt_fold_2.c new file mode 100644 index 00000000000..7412b7b964e --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/cnt_fold_2.c @@ -0,0 +1,55 @@ +/* { dg-do compile } */ +/* { dg-options "-O2" } */ + +#include + +int +f1 (int x) +{ + x /= 2; + return x * svcntw(); +} + +int +f2 (int x) +{ + x /= 4; + return x * svcnth(); +} + +int +f3 (int x) +{ + x /= 8; + return x * svcntb(); +} + +int +f4 (int x) +{ + x /= 2; + return x * svcnth(); +} + +int +f5 (int x) +{ + x /= 4; + return x * svcntb(); +} + +int +f6 (int x) +{ + x /= 2; + return x * svcntb(); +} + +int +f7 (int x) +{ + x /= 16; + return x * svcntb() * 16; +} + +/* { dg-final { scan-assembler-times {\tasr\t} 7 } } */ From patchwork Fri Oct 18 11:18:02 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Sandiford X-Patchwork-Id: 1999070 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with 
From: Richard Sandiford
To: rguenther@suse.de, gcc-patches@gcc.gnu.org
Subject: [PATCH 5/9] Generalise ((X /[ex] A) +- B) * A -> X +- A * B rule
Date: Fri, 18 Oct 2024 12:18:02 +0100
Message-Id: <20241018111806.4026759-6-richard.sandiford@arm.com>
In-Reply-To: <20241018111806.4026759-1-richard.sandiford@arm.com>

match.pd had a rule to simplify ((X /[ex] A) +- B) * A -> X +- A * B
when A and B are INTEGER_CSTs.  This patch extends it to handle the
case where the outer multiplication is by a multiple of A, not just
A itself.  It also handles addition and multiplication of poly_ints.
(Exact division by a poly_int seems unlikely.)

I'm not sure why minus is handled here.  Wouldn't minus of a constant
be canonicalised to a plus?

gcc/
	* match.pd: Generalise ((X /[ex] A) +- B) * A -> X +- A * B rule
	to ((X /[ex] C1) +- C2) * (C1 * C3) -> (X * C3) +- (C1 * C2 * C3).

gcc/testsuite/
	* gcc.dg/tree-ssa/mulexactdiv-5.c: New test.
	* gcc.dg/tree-ssa/mulexactdiv-6.c: Likewise.
	* gcc.dg/tree-ssa/mulexactdiv-7.c: Likewise.
	* gcc.dg/tree-ssa/mulexactdiv-8.c: Likewise.
	* gcc.target/aarch64/sve/cnt_fold_3.c: Likewise.
--- gcc/match.pd | 38 +++++++----- gcc/testsuite/gcc.dg/tree-ssa/mulexactdiv-5.c | 29 +++++++++ gcc/testsuite/gcc.dg/tree-ssa/mulexactdiv-6.c | 59 +++++++++++++++++++ gcc/testsuite/gcc.dg/tree-ssa/mulexactdiv-7.c | 22 +++++++ gcc/testsuite/gcc.dg/tree-ssa/mulexactdiv-8.c | 20 +++++++ .../gcc.target/aarch64/sve/cnt_fold_3.c | 40 +++++++++++++ 6 files changed, 194 insertions(+), 14 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/mulexactdiv-5.c create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/mulexactdiv-6.c create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/mulexactdiv-7.c create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/mulexactdiv-8.c create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/cnt_fold_3.c diff --git a/gcc/match.pd b/gcc/match.pd index 6677bc06d80..268316456c3 100644 --- a/gcc/match.pd +++ b/gcc/match.pd @@ -5493,24 +5493,34 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) optab_vector))) (eq (trunc_mod @0 @1) { build_zero_cst (TREE_TYPE (@0)); }))) -/* ((X /[ex] A) +- B) * A --> X +- A * B. */ +/* ((X /[ex] C1) +- C2) * (C1 * C3) --> (X * C3) +- (C1 * C2 * C3). */ (for op (plus minus) (simplify - (mult (convert1? (op (convert2? (exact_div @0 INTEGER_CST@@1)) INTEGER_CST@2)) @1) - (if (tree_nop_conversion_p (type, TREE_TYPE (@2)) - && tree_nop_conversion_p (TREE_TYPE (@0), TREE_TYPE (@2))) - (with - { - wi::overflow_type overflow; - wide_int mul = wi::mul (wi::to_wide (@1), wi::to_wide (@2), - TYPE_SIGN (type), &overflow); - } + (mult (convert1? (op (convert2? 
(exact_div @0 INTEGER_CST@1)) + poly_int_tree_p@2)) + poly_int_tree_p@3) + (with { poly_widest_int factor; } + (if (tree_nop_conversion_p (type, TREE_TYPE (@2)) + && tree_nop_conversion_p (TREE_TYPE (@0), TREE_TYPE (@2)) + && multiple_p (wi::to_poly_widest (@3), wi::to_widest (@1), &factor)) + (with + { + wi::overflow_type overflow; + wide_int mul; + } (if (types_match (type, TREE_TYPE (@2)) - && types_match (TREE_TYPE (@0), TREE_TYPE (@2)) && !overflow) - (op @0 { wide_int_to_tree (type, mul); }) + && types_match (TREE_TYPE (@0), TREE_TYPE (@2)) + && TREE_CODE (@2) == INTEGER_CST + && TREE_CODE (@3) == INTEGER_CST + && (mul = wi::mul (wi::to_wide (@2), wi::to_wide (@3), + TYPE_SIGN (type), &overflow), + !overflow)) + (op (mult @0 { wide_int_to_tree (type, factor); }) + { wide_int_to_tree (type, mul); }) (with { tree utype = unsigned_type_for (type); } - (convert (op (convert:utype @0) - (mult (convert:utype @1) (convert:utype @2)))))))))) + (convert (op (mult (convert:utype @0) + { wide_int_to_tree (utype, factor); }) + (mult (convert:utype @3) (convert:utype @2))))))))))) /* Canonicalization of binary operations. 
*/ diff --git a/gcc/testsuite/gcc.dg/tree-ssa/mulexactdiv-5.c b/gcc/testsuite/gcc.dg/tree-ssa/mulexactdiv-5.c new file mode 100644 index 00000000000..37cd676fff6 --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/mulexactdiv-5.c @@ -0,0 +1,29 @@ +/* { dg-options "-O2 -fdump-tree-optimized-raw" } */ + +#define TEST_CMP(FN, DIV, ADD, MUL) \ + int \ + FN (int x) \ + { \ + if (x & 7) \ + __builtin_unreachable (); \ + x /= DIV; \ + x += ADD; \ + return x * MUL; \ + } + +TEST_CMP (f1, 2, 1, 6) +TEST_CMP (f2, 2, 2, 10) +TEST_CMP (f3, 4, 3, 80) +TEST_CMP (f4, 8, 4, 200) + +/* { dg-final { scan-tree-dump-not {<[a-z]*_div_expr,} "optimized" } } */ +/* { dg-final { scan-tree-dump-not {> 1, 6) +TEST_CMP (f2, 2, ~(~0U >> 2), 10) + +void +cmp1 (int x) +{ + if (x & 3) + __builtin_unreachable (); + + int y = x / 4; + y += (int) (~0U / 3U); + y *= 8; + + unsigned z = x; + z *= 2U; + z += ~0U / 3U * 8U; + + if (y != (int) z) + __builtin_abort (); +} + + +void +cmp2 (int x) +{ + if (x & 63) + __builtin_unreachable (); + + unsigned y = x / 64; + y += 100U; + int y2 = (int) y * 256; + + unsigned z = x; + z *= 4U; + z += 25600U; + + if (y2 != (int) z) + __builtin_abort (); +} + +/* { dg-final { scan-tree-dump-not {<[a-z]*_div_expr,} "optimized" } } */ +/* { dg-final { scan-tree-dump-not { + +/* +** f1: +** sub x0, x1, x0 +** incb x0 +** ret +*/ +uint64_t +f1 (int *ptr1, int *ptr2) +{ + return ((ptr2 - ptr1) + svcntw ()) * 4; +} + +/* +** f2: +** sub x0, x1, x0 +** incb x0, all, mul #4 +** ret +*/ +uint64_t +f2 (int *ptr1, int *ptr2) +{ + return ((ptr2 - ptr1) + svcntb ()) * 4; +} + +/* +** f3: +** sub x0, x1, x0 +** ret +*/ +uint64_t +f3 (int *ptr1, int *ptr2) +{ + return (((int *) ((char *) ptr2 + svcntb ()) - ptr1) - svcntw ()) * 4; +} From patchwork Fri Oct 18 11:18:03 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Sandiford X-Patchwork-Id: 1999074 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org 
From: Richard Sandiford
To: rguenther@suse.de, gcc-patches@gcc.gnu.org
Subject: [PATCH 6/9] Try to simplify (X >> C1) << (C1 + C2) -> X << C2
Date: Fri, 18 Oct 2024 12:18:03 +0100
Message-Id: <20241018111806.4026759-7-richard.sandiford@arm.com>
In-Reply-To: <20241018111806.4026759-1-richard.sandiford@arm.com>

This patch adds a rule to simplify (X >> C1) << (C1 + C2) -> X << C2
when the low C1 bits of X are known to be zero.  Any single conversion
can take place between the shifts.  E.g. for a truncating conversion,
any extra bits of X that are preserved by truncating after the shift
are immediately lost by the shift left.
And the sign bits used for an extending conversion are the same as the sign bits used for the rshift. (A double conversion of say int->unsigned->uint64_t would be wrong though.) gcc/ * match.pd: Simplify (X >> C1) << (C1 + C2) -> X << C2 if the low C1 bits of X are zero. gcc/testsuite/ * gcc.dg/tree-ssa/shifts-1.c: New test. * gcc.dg/tree-ssa/shifts-2.c: Likewise. --- gcc/match.pd | 13 +++++ gcc/testsuite/gcc.dg/tree-ssa/shifts-1.c | 61 ++++++++++++++++++++++++ gcc/testsuite/gcc.dg/tree-ssa/shifts-2.c | 21 ++++++++ 3 files changed, 95 insertions(+) create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/shifts-1.c create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/shifts-2.c diff --git a/gcc/match.pd b/gcc/match.pd index 268316456c3..540582dc984 100644 --- a/gcc/match.pd +++ b/gcc/match.pd @@ -4902,6 +4902,19 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) - TYPE_PRECISION (TREE_TYPE (@2))))) (bit_and (convert @0) (lshift { build_minus_one_cst (type); } @1)))) +#if GIMPLE +/* (X >> C1) << (C1 + C2) -> X << C2 if the low C1 bits of X are zero. */ +(simplify + (lshift (convert? (rshift (with_possible_nonzero_bits2 @0) INTEGER_CST@1)) + INTEGER_CST@2) + (if (INTEGRAL_TYPE_P (type) + && wi::ltu_p (wi::to_wide (@1), element_precision (type)) + && wi::ltu_p (wi::to_wide (@2), element_precision (type)) + && wi::to_widest (@2) >= wi::to_widest (@1) + && wi::to_widest (@1) <= wi::ctz (get_nonzero_bits (@0))) + (lshift (convert @0) (minus @2 @1)))) +#endif + /* For (x << c) >> c, optimize into x & ((unsigned)-1 >> c) for unsigned x OR truncate into the precision(type) - c lowest bits of signed x (if they have mode precision or a precision of 1). 
*/ diff --git a/gcc/testsuite/gcc.dg/tree-ssa/shifts-1.c b/gcc/testsuite/gcc.dg/tree-ssa/shifts-1.c new file mode 100644 index 00000000000..d88500ca8dd --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/shifts-1.c @@ -0,0 +1,61 @@ +/* { dg-options "-O2 -fdump-tree-optimized-raw" } */ + +unsigned int +f1 (unsigned int x) +{ + if (x & 3) + __builtin_unreachable (); + x >>= 2; + return x << 3; +} + +unsigned int +f2 (unsigned int x) +{ + if (x & 3) + __builtin_unreachable (); + unsigned char y = x; + y >>= 2; + return y << 3; +} + +unsigned long +f3 (unsigned int x) +{ + if (x & 3) + __builtin_unreachable (); + x >>= 2; + return (unsigned long) x << 3; +} + +int +f4 (int x) +{ + if (x & 15) + __builtin_unreachable (); + x >>= 4; + return x << 5; +} + +unsigned int +f5 (int x) +{ + if (x & 31) + __builtin_unreachable (); + x >>= 5; + return x << 6; +} + +unsigned int +f6 (unsigned int x) +{ + if (x & 1) + __builtin_unreachable (); + x >>= 1; + return x << (sizeof (int) * __CHAR_BIT__ - 1); +} + +/* { dg-final { scan-tree-dump-not {<[a-z]*_div_expr,} "optimized" } } */ +/* { dg-final { scan-tree-dump-not {>= 3; + return x << 4; +} + +unsigned int +f2 (unsigned int x) +{ + if (x & 3) + __builtin_unreachable (); + x >>= 2; + return x << 1; +} + +/* { dg-final { scan-tree-dump-times { X-Patchwork-Id: 1999072 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=8.43.85.97; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 
From: Richard Sandiford
To: rguenther@suse.de, gcc-patches@gcc.gnu.org
Subject:
[PATCH 7/9] Handle POLY_INT_CSTs in get_nonzero_bits
Date: Fri, 18 Oct 2024 12:18:04 +0100
Message-Id: <20241018111806.4026759-8-richard.sandiford@arm.com>
In-Reply-To: <20241018111806.4026759-1-richard.sandiford@arm.com>

This patch extends get_nonzero_bits to handle POLY_INT_CSTs.  The
easiest (but also most useful) case is that the number of trailing
zeros in the runtime value is at least the number of trailing zeros
in each individual component.

In principle, we could do this for coeffs 1 and above only, and then
OR in coeff 0.  This would give ~0x11 for [14, 32], say.  But that's
future work.

gcc/
	* tree-ssanames.cc (get_nonzero_bits): Handle POLY_INT_CSTs.
	* match.pd (with_possible_nonzero_bits): Likewise.

gcc/testsuite/
	* gcc.target/aarch64/sve/cnt_fold_4.c: New test.
---
 gcc/match.pd                                  |  2 +
 .../gcc.target/aarch64/sve/cnt_fold_4.c       | 61 +++++++++++++++++++
 gcc/tree-ssanames.cc                          |  3 +
 3 files changed, 66 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/cnt_fold_4.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 540582dc984..41903554478 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -2893,6 +2893,8 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
    possibly set.
*/ (match with_possible_nonzero_bits INTEGER_CST@0) +(match with_possible_nonzero_bits + POLY_INT_CST@0) (match with_possible_nonzero_bits SSA_NAME@0 (if (INTEGRAL_TYPE_P (TREE_TYPE (@0)) || POINTER_TYPE_P (TREE_TYPE (@0))))) diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cnt_fold_4.c b/gcc/testsuite/gcc.target/aarch64/sve/cnt_fold_4.c new file mode 100644 index 00000000000..b7a53701993 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/sve/cnt_fold_4.c @@ -0,0 +1,61 @@ +/* { dg-do compile } */ +/* { dg-options "-O2" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include + +/* +** f1: +** cnth x0 +** ret +*/ +uint64_t +f1 () +{ + uint64_t x = svcntw (); + x >>= 2; + return x << 3; +} + +/* +** f2: +** [^\n]+ +** [^\n]+ +** ... +** ret +*/ +uint64_t +f2 () +{ + uint64_t x = svcntd (); + x >>= 2; + return x << 3; +} + +/* +** f3: +** cntb x0, all, mul #4 +** ret +*/ +uint64_t +f3 () +{ + uint64_t x = svcntd (); + x >>= 1; + return x << 6; +} + +/* +** f4: +** [^\n]+ +** [^\n]+ +** ... +** ret +*/ +uint64_t +f4 () +{ + uint64_t x = svcntd (); + x >>= 2; + return x << 2; +} diff --git a/gcc/tree-ssanames.cc b/gcc/tree-ssanames.cc index 4f83fcbb517..d2d1ec18797 100644 --- a/gcc/tree-ssanames.cc +++ b/gcc/tree-ssanames.cc @@ -505,6 +505,9 @@ get_nonzero_bits (const_tree name) /* Use element_precision instead of TYPE_PRECISION so complex and vector types get a non-zero precision. 
  */
   unsigned int precision = element_precision (TREE_TYPE (name));
+  if (POLY_INT_CST_P (name))
+    return -known_alignment (wi::to_poly_wide (name));
+
   if (POINTER_TYPE_P (TREE_TYPE (name)))
     {
       struct ptr_info_def *pi = SSA_NAME_PTR_INFO (name);

From patchwork Fri Oct 18 11:18:05 2024
X-Patchwork-Submitter: Richard Sandiford
X-Patchwork-Id: 1999075
From: Richard Sandiford
To: rguenther@suse.de, gcc-patches@gcc.gnu.org
Subject: [PATCH 8/9] Try to simplify (X >> C1) * (C2 << C1) -> X * C2
Date: Fri, 18 Oct 2024 12:18:05 +0100
Message-Id: <20241018111806.4026759-9-richard.sandiford@arm.com>
In-Reply-To: <20241018111806.4026759-1-richard.sandiford@arm.com>

This patch adds a rule to simplify (X >> C1) * (C2 << C1) -> X * C2
when the low C1 bits of X are known to be zero.  As with the earlier
(X >> C1) << (C1 + C2) patch, any single conversion is allowed between
the shift and the multiplication.

gcc/
	* match.pd: Simplify (X >> C1) * (C2 << C1) -> X * C2 if the
	low C1 bits of X are zero.

gcc/testsuite/
	* gcc.dg/tree-ssa/shifts-3.c: New test.
	* gcc.dg/tree-ssa/shifts-4.c: Likewise.
	* gcc.target/aarch64/sve/cnt_fold_5.c: Likewise.
---
 gcc/match.pd                                  | 13 ++++
 gcc/testsuite/gcc.dg/tree-ssa/shifts-3.c      | 65 +++++++++++++++++++
 gcc/testsuite/gcc.dg/tree-ssa/shifts-4.c      | 23 +++++++
 .../gcc.target/aarch64/sve/cnt_fold_5.c       | 38 +++++++++++
 4 files changed, 139 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/shifts-3.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/shifts-4.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/cnt_fold_5.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 41903554478..85f5eeefa08 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -4915,6 +4915,19 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
        && wi::to_widest (@2) >= wi::to_widest (@1)
        && wi::to_widest (@1) <= wi::ctz (get_nonzero_bits (@0)))
    (lshift (convert @0) (minus @2 @1))))
+
+/* (X >> C1) * (C2 << C1) -> X * C2 if the low C1 bits of X are zero.  */
+(simplify
+ (mult (convert?
(rshift (with_possible_nonzero_bits2 @0) INTEGER_CST@1))
+       poly_int_tree_p@2)
+ (with { poly_widest_int factor; }
+  (if (INTEGRAL_TYPE_P (type)
+       && wi::ltu_p (wi::to_wide (@1), element_precision (type))
+       && wi::to_widest (@1) <= wi::ctz (get_nonzero_bits (@0))
+       && multiple_p (wi::to_poly_widest (@2),
+		      widest_int (1) << tree_to_uhwi (@1),
+		      &factor))
+   (mult (convert @0) { wide_int_to_tree (type, factor); }))))
 #endif

 /* For (x << c) >> c, optimize into x & ((unsigned)-1 >> c) for
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/shifts-3.c b/gcc/testsuite/gcc.dg/tree-ssa/shifts-3.c
new file mode 100644
index 00000000000..dcff518e630
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/shifts-3.c
@@ -0,0 +1,65 @@
+/* { dg-options "-O2 -fdump-tree-optimized-raw" } */
+
+unsigned int
+f1 (unsigned int x)
+{
+  if (x & 3)
+    __builtin_unreachable ();
+  x >>= 2;
+  return x * 20;
+}
+
+unsigned int
+f2 (unsigned int x)
+{
+  if (x & 3)
+    __builtin_unreachable ();
+  unsigned char y = x;
+  y >>= 2;
+  return y * 36;
+}
+
+unsigned long
+f3 (unsigned int x)
+{
+  if (x & 3)
+    __builtin_unreachable ();
+  x >>= 2;
+  return (unsigned long) x * 88;
+}
+
+int
+f4 (int x)
+{
+  if (x & 15)
+    __builtin_unreachable ();
+  x >>= 4;
+  return x * 48;
+}
+
+unsigned int
+f5 (int x)
+{
+  if (x & 31)
+    __builtin_unreachable ();
+  x >>= 5;
+  return x * 3200;
+}
+
+unsigned int
+f6 (unsigned int x)
+{
+  if (x & 1)
+    __builtin_unreachable ();
+  x >>= 1;
+  return x * (~0U / 3 & -2);
+}
+
+/* { dg-final { scan-tree-dump-not {<[a-z]*_div_expr,} "optimized" } } */
+/* { dg-final { scan-tree-dump-not {<rshift_expr,} "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/shifts-4.c b/gcc/testsuite/gcc.dg/tree-ssa/shifts-4.c
new file mode 100644
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/shifts-4.c
@@ -0,0 +1,23 @@
+/* { dg-options "-O2 -fdump-tree-optimized-raw" } */
+
+unsigned int
+f1 (unsigned int x)
+{
+  if (x & 3)
+    __builtin_unreachable ();
+  x >>= 2;
+  return x * 10;
+}
+
+unsigned int
+f2 (unsigned int x)
+{
+  if (x & 3)
+    __builtin_unreachable ();
+  x >>= 3;
+  return x * 24;
+}
+
+/* { dg-final { scan-tree-dump-times {<rshift_expr,} 2 "optimized" } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cnt_fold_5.c b/gcc/testsuite/gcc.target/aarch64/sve/cnt_fold_5.c
new file mode 100644
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/cnt_fold_5.c
@@ -0,0 +1,38 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include <arm_sve.h>
+
+/*
+** f1:
+** ...
+**	cntd	[^\n]+
+** ...
+**	mul	[^\n]+
+**	ret
+*/
+uint64_t
+f1 (int x)
+{
+  if (x & 3)
+    __builtin_unreachable ();
+  x >>= 2;
+  return (uint64_t) x * svcnth ();
+}
+
+/*
+** f2:
+** ...
+**	asr	[^\n]+
+** ...
+**	ret
+*/
+uint64_t
+f2 (int x)
+{
+  if (x & 3)
+    __builtin_unreachable ();
+  x >>= 2;
+  return (uint64_t) x * svcntw ();
+}

From patchwork Fri Oct 18 11:18:06 2024
From: Richard Sandiford <richard.sandiford@arm.com>
To: rguenther@suse.de, gcc-patches@gcc.gnu.org
Subject: [PATCH 9/9] Record nonzero bits in the irange_bitmask of POLY_INT_CSTs
Date: Fri, 18 Oct 2024 12:18:06 +0100
Message-Id: <20241018111806.4026759-10-richard.sandiford@arm.com>
In-Reply-To: <20241018111806.4026759-1-richard.sandiford@arm.com>
At the moment, ranger punts entirely on POLY_INT_CSTs.  Numerical
ranges are a bit difficult, unless we do start modelling bounds on
the indeterminates.  But we can at least track the nonzero bits.

gcc/
	* value-query.cc (range_query::get_tree_range): Use get_nonzero_bits
	to populate the irange_bitmask of a POLY_INT_CST.

gcc/testsuite/
	* gcc.target/aarch64/sve/cnt_fold_6.c: New test.
---
 .../gcc.target/aarch64/sve/cnt_fold_6.c       | 75 +++++++++++++++++++
 gcc/value-query.cc                            |  7 ++
 2 files changed, 82 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/cnt_fold_6.c

diff --git a/gcc/testsuite/gcc.target/aarch64/sve/cnt_fold_6.c b/gcc/testsuite/gcc.target/aarch64/sve/cnt_fold_6.c
new file mode 100644
index 00000000000..9d9e1ca9330
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/cnt_fold_6.c
@@ -0,0 +1,75 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include <arm_sve.h>
+
+/*
+** f1:
+** ...
+**	cntb	(x[0-9]+)
+** ...
+**	add	x[0-9]+, \1, #?16
+** ...
+**	csel	[^\n]+
+**	ret
+*/
+uint64_t
+f1 (int x)
+{
+  uint64_t y = x ? svcnth () : svcnth () + 8;
+  y >>= 3;
+  y <<= 4;
+  return y;
+}
+
+/*
+** f2:
+** ...
+**	(?:and|[al]sr)	[^\n]+
+** ...
+**	ret
+*/
+uint64_t
+f2 (int x)
+{
+  uint64_t y = x ? svcnth () : svcnth () + 8;
+  y >>= 4;
+  y <<= 5;
+  return y;
+}
+
+/*
+** f3:
+** ...
+**	cntw	(x[0-9]+)
+** ...
+**	add	x[0-9]+, \1, #?16
+** ...
+**	csel	[^\n]+
+**	ret
+*/
+uint64_t
+f3 (int x)
+{
+  uint64_t y = x ? svcntd () : svcntd () + 8;
+  y >>= 1;
+  y <<= 2;
+  return y;
+}
+
+/*
+** f4:
+** ...
+**	(?:and|[al]sr)	[^\n]+
+** ...
+**	ret
+*/
+uint64_t
+f4 (int x)
+{
+  uint64_t y = x ?
svcntd () : svcntd () + 8;
+  y >>= 2;
+  y <<= 3;
+  return y;
+}
diff --git a/gcc/value-query.cc b/gcc/value-query.cc
index cac2cb5b2bc..34499da1a98 100644
--- a/gcc/value-query.cc
+++ b/gcc/value-query.cc
@@ -375,6 +375,13 @@ range_query::get_tree_range (vrange &r, tree expr, gimple *stmt,
       }

     default:
+      if (POLY_INT_CST_P (expr))
+	{
+	  unsigned int precision = TYPE_PRECISION (type);
+	  r.set_varying (type);
+	  r.update_bitmask ({ wi::zero (precision), get_nonzero_bits (expr) });
+	  return true;
+	}
       break;
     }

   if (BINARY_CLASS_P (expr) || COMPARISON_CLASS_P (expr))