From patchwork Fri Sep 6 07:06:35 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filip Kastl X-Patchwork-Id: 1981642 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=suse.cz header.i=@suse.cz header.a=rsa-sha256 header.s=susede2_rsa header.b=NL8GNT5Y; dkim=pass header.d=suse.cz header.i=@suse.cz header.a=ed25519-sha256 header.s=susede2_ed25519 header.b=WHkOk/mq; dkim=pass (1024-bit key) header.d=suse.cz header.i=@suse.cz header.a=rsa-sha256 header.s=susede2_rsa header.b=xQ+NZFdu; dkim=neutral header.d=suse.cz header.i=@suse.cz header.a=ed25519-sha256 header.s=susede2_ed25519 header.b=HhiCPkbm; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4X0S2x2WQqz1yh1 for ; Fri, 6 Sep 2024 17:07:05 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 2CD7B384A4A3 for ; Fri, 6 Sep 2024 07:07:03 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from smtp-out1.suse.de (smtp-out1.suse.de [195.135.223.130]) by sourceware.org (Postfix) with ESMTPS id E3ABE384A84E for ; Fri, 6 Sep 2024 07:06:36 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org E3ABE384A84E Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=suse.cz Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=suse.cz ARC-Filter: OpenARC Filter v1.0.0 sourceware.org E3ABE384A84E Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=195.135.223.130 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725606399; cv=none; b=Ui8lbt56xzHav7tfDYs3Ge7LwPOXpL2Kgyl+F8s65RbKpIq86xhyn9rhFDprXkps7geE4hW3fi4SNRbinKM9aB+pbF1O7liFMcYo5h6p8eDgbOsSlqgxEwvCUbvuuWL2cD6uRjMeSsX2VPS3yzkfhxnVUXrYYtvH3pbAXFBzG/8= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1725606399; c=relaxed/simple; bh=JtFF4CRmao2LUQ4LrPa9ONG6S8UYMvckzqkKSa5hUhk=; h=DKIM-Signature:DKIM-Signature:DKIM-Signature:DKIM-Signature:Date: From:To:Subject:Message-ID:MIME-Version; b=P6z/+J4R24GxsQLcGeZqF711ygc0G9JlqgfZoANapK5BM/0ZQBJZZ+rqilSrCQeq1sA9yoMyrCH943pBWV/g6j+o8kpnSobDi+qELEBG/tBlaiq4pfmazEbl3ZYR8LfXQMfq7TbNt/gGaWssG4E0auUltrnKqULXwVKYgsG+3p4= ARC-Authentication-Results: i=1; server2.sourceware.org Received: from imap1.dmz-prg2.suse.org (imap1.dmz-prg2.suse.org [IPv6:2a07:de40:b281:104:10:150:64:97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by smtp-out1.suse.de (Postfix) with ESMTPS id BE1A521AC0; Fri, 6 Sep 2024 07:06:35 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_rsa; t=1725606396; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type; bh=5NzFyjKKLKIWrgztVZxmJwXQAjacJ4gRA3q0VLtfMKU=; b=NL8GNT5YDSm+zR93XiyCP6oqSaAzRdrZOiacmn3vfgv0Fo/by7WQEQTu4fFBvRaRb5xgX8 F0hfbv3YDiZMZ66fhsyassSN+wPnLVw3lhjfV5xZldLmXO34OYsJumj2M3xqxspO/hXn1H vfiPdZtpQh96UJ7bqGjHCcD6OonJGGg= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_ed25519; t=1725606396; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type; bh=5NzFyjKKLKIWrgztVZxmJwXQAjacJ4gRA3q0VLtfMKU=; b=WHkOk/mqjgVIY1turfZNrnfx+ShRoNNpLCBn9mt22OLpsPuR6LI9MmluuWRMrP+Z1scZK0 K3fzLQ1Qab4bavBg== Authentication-Results: smtp-out1.suse.de; dkim=pass header.d=suse.cz header.s=susede2_rsa header.b=xQ+NZFdu; dkim=pass header.d=suse.cz header.s=susede2_ed25519 header.b=HhiCPkbm DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_rsa; t=1725606395; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type; bh=5NzFyjKKLKIWrgztVZxmJwXQAjacJ4gRA3q0VLtfMKU=; b=xQ+NZFduupr9oM/QOXyJmWrZRvXmam12qs45y5mxP4IvwP7AGnqNHxL+BNQc4dQy9KCk/N uSEr0yMSK1KKCCurt5YXN67Hc0SuqfpJOqiHeA1tQci5DVmxc8DRktEqkdtYsIj4PImcqc EhV4oeokjcyFvm0vAeABvRr5EGgYp/8= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_ed25519; t=1725606395; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type; bh=5NzFyjKKLKIWrgztVZxmJwXQAjacJ4gRA3q0VLtfMKU=; b=HhiCPkbmacWdu9yXmYpSZeAFYap2bC6loVxdJ9jcFinyYf5IWfLPttFTdXfi3oO1V4FR/D 65UKGOObPSimq1AQ== Received: from imap1.dmz-prg2.suse.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by imap1.dmz-prg2.suse.org (Postfix) with ESMTPS id ADD7C1395F; Fri, 6 Sep 2024 07:06:35 +0000 (UTC) Received: from dovecot-director2.suse.de ([2a07:de40:b281:106:10:150:64:167]) by imap1.dmz-prg2.suse.org with ESMTPSA id VbdoKvup2malLgAAD6G6ig (envelope-from ); Fri, 06 Sep 2024 07:06:35 +0000 Date: Fri, 6 Sep 2024 09:06:35 +0200 From: Filip Kastl To: gcc-patches@gcc.gnu.org Cc: maxim.kuvyrkov@linaro.org, mjambor@suse.cz, rguenther@suse.de Subject: [PATCH] gimple ssa: Don't use __builtin_popcount in switch exp transform [PR116616] Message-ID: MIME-Version: 1.0 Content-Disposition: inline X-Rspamd-Queue-Id: BE1A521AC0 X-Spam-Level: X-Spamd-Result: default: False [-4.51 / 50.00]; BAYES_HAM(-3.00)[100.00%]; NEURAL_HAM_LONG(-1.00)[-1.000]; R_DKIM_ALLOW(-0.20)[suse.cz:s=susede2_rsa,suse.cz:s=susede2_ed25519]; NEURAL_HAM_SHORT(-0.20)[-1.000]; MIME_GOOD(-0.10)[text/plain]; MX_GOOD(-0.01)[]; RCVD_VIA_SMTP_AUTH(0.00)[]; FUZZY_BLOCKED(0.00)[rspamd.com]; DBL_BLOCKED_OPENRESOLVER(0.00)[suse.cz:dkim,suse.cz:email]; MIME_TRACE(0.00)[0:+]; MISSING_XM_UA(0.00)[]; ARC_NA(0.00)[]; RCVD_TLS_ALL(0.00)[]; DKIM_SIGNED(0.00)[suse.cz:s=susede2_rsa,suse.cz:s=susede2_ed25519]; FROM_EQ_ENVFROM(0.00)[]; FROM_HAS_DN(0.00)[]; RCPT_COUNT_THREE(0.00)[4]; RCVD_COUNT_TWO(0.00)[2]; TO_MATCH_ENVRCPT_ALL(0.00)[]; TO_DN_NONE(0.00)[]; DNSWL_BLOCKED(0.00)[2a07:de40:b281:104:10:150:64:97:from,2a07:de40:b281:106:10:150:64:167:received]; DKIM_TRACE(0.00)[suse.cz:+] X-Rspamd-Server: rspamd2.dmz-prg2.suse.org X-Rspamd-Action: no action X-Spam-Score: -4.51 X-Spam-Status: No, score=-11.7 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, SPF_HELO_NONE, SPF_PASS, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org Hi, bootstrapped and regtested on x86_64-linux. Ok to push? Thanks, Filip Kastl ---- 8< ---- Switch exponential transformation in the switch conversion pass currently generates tmp1 = __builtin_popcount (var); tmp2 = tmp1 == 1; when inserting code to determine if var is power of two. If the target doesn't support expanding the builtin as special instructions switch conversion relies on this whole pattern being expanded as bitmagic. However, it is possible that other GIMPLE optimizations move the two statements of the pattern apart. In that case the builtin becomes a libgcc call in the final binary. The call is slow and in case of freestanding programs can result in linking error (this bug was originally found while compiling Linux kernel). This patch modifies switch conversion to insert the bitmagic (var ^ (var - 1)) > (var - 1) instead of the builtin. gcc/ChangeLog: PR tree-optimization/116616 * tree-switch-conversion.cc (can_pow2p): Remove this function. (gen_pow2p): Generate bitmagic instead of a builtin. Remove the TYPE parameter. (switch_conversion::is_exp_index_transform_viable): Don't call can_pow2p. (switch_conversion::exp_index_transform): Call gen_pow2p without the TYPE parameter. * tree-switch-conversion.h: Remove m_exp_index_transform_pow2p_type. gcc/testsuite/ChangeLog: PR tree-optimization/116616 * gcc.target/i386/switch-exp-transform-1.c: Don't test for presence of the POPCOUNT internal fn call. Signed-off-by: Filip Kastl --- .../gcc.target/i386/switch-exp-transform-1.c | 7 +- gcc/tree-switch-conversion.cc | 82 ++++--------------- gcc/tree-switch-conversion.h | 6 +- 3 files changed, 19 insertions(+), 76 deletions(-) diff --git a/gcc/testsuite/gcc.target/i386/switch-exp-transform-1.c b/gcc/testsuite/gcc.target/i386/switch-exp-transform-1.c index a8c9e03e515..4832f5b52c3 100644 --- a/gcc/testsuite/gcc.target/i386/switch-exp-transform-1.c +++ b/gcc/testsuite/gcc.target/i386/switch-exp-transform-1.c @@ -1,10 +1,8 @@ /* { dg-do compile } */ -/* { dg-options "-O2 -fdump-tree-switchconv -fdump-tree-widening_mul -mpopcnt -mbmi" } */ +/* { dg-options "-O2 -fdump-tree-switchconv -mbmi" } */ /* Checks that exponential index transform enables switch conversion to convert - this switch into an array lookup. Also checks that the "index variable is a - power of two" check has been generated and that it has been later expanded - into an internal function. */ + this switch into an array lookup. */ int foo(unsigned bar) { @@ -30,4 +28,3 @@ int foo(unsigned bar) } /* { dg-final { scan-tree-dump "CSWTCH" "switchconv" } } */ -/* { dg-final { scan-tree-dump "POPCOUNT" "widening_mul" } } */ diff --git a/gcc/tree-switch-conversion.cc b/gcc/tree-switch-conversion.cc index c1332a26094..2e42611f9fd 100644 --- a/gcc/tree-switch-conversion.cc +++ b/gcc/tree-switch-conversion.cc @@ -133,75 +133,27 @@ gen_log2 (tree op, location_t loc, tree *result, tree type) return stmts; } -/* Is it possible to efficiently check that a value of TYPE is a power of 2? - - If yes, returns TYPE. If no, returns NULL_TREE. May also return another - type. This indicates that logarithm of the variable can be computed but - only after it is converted to this type. - - Also see gen_pow2p. */ - -static tree -can_pow2p (tree type) -{ - /* __builtin_popcount supports the unsigned type or its long and long long - variants. Choose the smallest out of those that can still fit TYPE. */ - int prec = TYPE_PRECISION (type); - int i_prec = TYPE_PRECISION (unsigned_type_node); - int li_prec = TYPE_PRECISION (long_unsigned_type_node); - int lli_prec = TYPE_PRECISION (long_long_unsigned_type_node); - - if (prec <= i_prec) - return unsigned_type_node; - else if (prec <= li_prec) - return long_unsigned_type_node; - else if (prec <= lli_prec) - return long_long_unsigned_type_node; - else - return NULL_TREE; -} - -/* Build a sequence of gimple statements checking that OP is a power of 2. Use - special optabs if target supports them. Return the result as a - boolean_type_node ssa name through RESULT. Assumes that OP's value will - be non-negative. The generated check may give arbitrary answer for negative - values. - - Before computing the check, OP may have to be converted to another type. - This should be specified in TYPE. Use can_pow2p to decide what this type - should be. - - Should only be used if can_pow2p returns true for type of OP. */ +/* Build a sequence of gimple statements checking that OP is a power of 2. + Return the result as a boolean_type_node ssa name through RESULT. Assumes + that OP's value will be non-negative. The generated check may give + arbitrary answer for negative values. */ static gimple_seq -gen_pow2p (tree op, location_t loc, tree *result, tree type) +gen_pow2p (tree op, location_t loc, tree *result) { gimple_seq stmts = NULL; gimple_stmt_iterator gsi = gsi_last (stmts); - built_in_function fn; - if (type == unsigned_type_node) - fn = BUILT_IN_POPCOUNT; - else if (type == long_unsigned_type_node) - fn = BUILT_IN_POPCOUNTL; - else - { - fn = BUILT_IN_POPCOUNTLL; - gcc_checking_assert (type == long_long_unsigned_type_node); - } + tree type = TREE_TYPE (op); + + /* Build (op ^ (op - 1)) > (op - 1). */ + tree tmp1 = gimple_build (&gsi, false, GSI_NEW_STMT, loc, MINUS_EXPR, type, + op, build_one_cst (type)); + tree tmp2 = gimple_build (&gsi, false, GSI_NEW_STMT, loc, BIT_XOR_EXPR, type, + op, tmp1); + *result = gimple_build (&gsi, false, GSI_NEW_STMT, loc, GT_EXPR, + boolean_type_node, tmp2, tmp1); - tree orig_type = TREE_TYPE (op); - tree tmp1; - if (type != orig_type) - tmp1 = gimple_convert (&gsi, false, GSI_NEW_STMT, loc, type, op); - else - tmp1 = op; - /* Build __builtin_popcount{l,ll} (op) == 1. */ - tree tmp2 = gimple_build (&gsi, false, GSI_NEW_STMT, loc, - as_combined_fn (fn), integer_type_node, tmp1); - *result = gimple_build (&gsi, false, GSI_NEW_STMT, loc, EQ_EXPR, - boolean_type_node, tmp2, - build_one_cst (integer_type_node)); return stmts; } @@ -371,9 +323,6 @@ switch_conversion::is_exp_index_transform_viable (gswitch *swtch) m_exp_index_transform_log2_type = can_log2 (index_type, opt_type); if (!m_exp_index_transform_log2_type) return false; - m_exp_index_transform_pow2p_type = can_pow2p (index_type); - if (!m_exp_index_transform_pow2p_type) - return false; /* Check that each case label corresponds only to one value (no case 1..3). */ @@ -467,8 +416,7 @@ switch_conversion::exp_index_transform (gswitch *swtch) new_edge2->probability = profile_probability::even (); tree tmp; - gimple_seq stmts = gen_pow2p (index, UNKNOWN_LOCATION, &tmp, - m_exp_index_transform_pow2p_type); + gimple_seq stmts = gen_pow2p (index, UNKNOWN_LOCATION, &tmp); gsi = gsi_last_bb (cond_bb); gsi_insert_seq_after (&gsi, stmts, GSI_LAST_NEW_STMT); gcond *stmt_cond = gimple_build_cond (NE_EXPR, tmp, boolean_false_node, diff --git a/gcc/tree-switch-conversion.h b/gcc/tree-switch-conversion.h index 14610499e5f..6468995eb31 100644 --- a/gcc/tree-switch-conversion.h +++ b/gcc/tree-switch-conversion.h @@ -920,11 +920,9 @@ public: bool m_exp_index_transform_applied; /* If switch conversion decided exponential index transform is viable, here - will be stored the types to which index variable has to be converted - before the logarithm and the "is power of 2" operations which are part of - the transform. */ + will be stored the type to which index variable has to be converted + before the logarithm operation which is a part of the transform. */ tree m_exp_index_transform_log2_type; - tree m_exp_index_transform_pow2p_type; }; void