From patchwork Fri Aug 9 17:18:54 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Sandiford X-Patchwork-Id: 1971036 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4WgVyJ5xMGz1yYl for ; Sat, 10 Aug 2024 03:19:20 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 92B1A385C6CF for ; Fri, 9 Aug 2024 17:19:18 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id EF8193858CD9 for ; Fri, 9 Aug 2024 17:18:56 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org EF8193858CD9 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=arm.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org EF8193858CD9 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=217.140.110.172 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1723223940; cv=none; b=kRZn3YCmK3qgLoS308i0q+SYp9SHJMF9F6dl4LESrtTCbdaWdzEnyeyj1oikmAdVB9nZ5HDvfOprX5pA9ho36CBply+wyVMXKc47OSpK8yPT8l1Sif33RcolGE9Xl0B2sf+5S1QqY4CL6nT5OSW9qtIGfEjCPzuL/8TFC3C0FaE= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1723223940; c=relaxed/simple; bh=1wwxkzoHgEJ5UxkBwBYNsrcBXb1Y125ETiAtiuyv/qU=; h=From:To:Subject:Date:Message-ID:MIME-Version; b=coRfybd70gRO2W1bBYGSbckTqHXPkY5zBDOLRx75Ar9VfOYP2n5ZQAXNecH3DJ4AdunRIlXfqKGBg/WoshhIzXcV82aAtSmFkxf7Dbs5nykLzM+dB+ljXlzyFeWYRUOzUoBNE4gxAu8Nkbo7+ZA46xzUPTlS8H/zmsSF8oTudg0= ARC-Authentication-Results: i=1; server2.sourceware.org Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 7DAEA13D5 for ; Fri, 9 Aug 2024 10:19:22 -0700 (PDT) Received: from localhost (e121540-lin.manchester.arm.com [10.32.110.72]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 3CCD63F6A8 for ; Fri, 9 Aug 2024 10:18:56 -0700 (PDT) From: Richard Sandiford To: gcc-patches@gcc.gnu.org Mail-Followup-To: gcc-patches@gcc.gnu.org, richard.sandiford@arm.com Subject: [PATCH] Use splay-tree-utils.h in tree-ssa-sccvn [PR30920] Date: Fri, 09 Aug 2024 18:18:54 +0100 Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) MIME-Version: 1.0 X-Spam-Status: No, score=-19.1 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_NONE, KAM_DMARC_STATUS, KAM_LAZY_DOMAIN_SECURITY, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org This patch is an attempt to gauge opinion on one way of fixing PR30920. The PR points out that the libiberty splay tree implementation does not implement the algorithm described by Sleator and Tarjan and has unclear complexity bounds. (It's also somewhat dangerous in that splay_tree_min and splay_tree_max walk the tree without splaying, meaning that they are fully linear in the worst case, rather than amortised logarithmic.) These properties have been carried over to typed-splay-tree.h. We could fix those problems directly in the existing implementations, and probably should for libiberty. But when I added rtl-ssa, I also added a third(!) splay tree implementation: splay-tree-utils.h. In response to Jeff's understandable unease about having three implementations, I was supposed to go back during the next stage 1 and reduce it to no more than two. I never did that. :-( splay-tree-utils.h is so called because rtl-ssa uses splay trees in structures that are relatively small and very size-sensitive. I therefore wanted to be able to embed the splay tree links directly in the structures, rather than pay the penalty of using separate nodes with one-way or two-way links between them. There were also operations for which it was convenient to treat the splay tree root as an explicitly managed cursor, rather than treating the tree as a pure ADT. The interface is therefore a bit more low-level than for the other implementations. I wondered whether the same trade-offs might apply to users of the libiberty splay trees. The first one I looked at in detail was SCC value numbering, which seemed like it would benefit from using splay-tree-utils.h directly. The patch does that. It also adds a couple of new helper routines to splay-tree-utils.h. I don't expect this approach to be the right one for every use of splay trees. E.g. splay tree used for omp gimplification would certainly need separate nodes. Tested on aarch64-linux-gnu & x86_64-linux-gnu. OK to install? Richard gcc/ * splay-tree-utils.h (rooted_splay_tree::insert_relative) (rooted_splay_tree::lookup_le): New functions. (rooted_splay_tree::remove_root_and_splay_next): Likewise. * splay-tree-utils.h (rooted_splay_tree::insert_relative): New function, extracted from... (rooted_splay_tree::insert): ...here. (rooted_splay_tree::lookup_le): New function. (rooted_splay_tree::remove_root_and_splay_next): Likewise. * tree-ssa-sccvn.cc (pd_range::m_children): New member variable. (vn_walk_cb_data::vn_walk_cb_data): Initialize first_range. (vn_walk_cb_data::known_ranges): Use a default_splay_tree. (vn_walk_cb_data::~vn_walk_cb_data): Remove freeing of known_ranges. (pd_range_compare, pd_range_alloc, pd_range_dealloc): Delete. (vn_walk_cb_data::push_partial_def): Rewrite splay tree operations to use splay-tree-utils.h. * rtl-ssa/accesses.cc (function_info::add_use): Use insert_relative. --- gcc/rtl-ssa/accesses.cc | 8 +-- gcc/splay-tree-utils.h | 29 +++++++++++ gcc/splay-tree-utils.tcc | 69 ++++++++++++++++++++++--- gcc/tree-ssa-sccvn.cc | 106 +++++++++++++-------------------------- 4 files changed, 131 insertions(+), 81 deletions(-) diff --git a/gcc/rtl-ssa/accesses.cc b/gcc/rtl-ssa/accesses.cc index 5e9077545a8..ef99759871a 100644 --- a/gcc/rtl-ssa/accesses.cc +++ b/gcc/rtl-ssa/accesses.cc @@ -1232,16 +1232,16 @@ function_info::add_use (use_info *use) need_use_splay_tree (def); int comparison = lookup_use (def->m_use_tree, insn); gcc_checking_assert (comparison != 0); - splay_tree_node *neighbor = def->m_use_tree.root (); + use_info *neighbor = def->m_use_tree.root ()->value (); // If USE comes before NEIGHBOR, insert USE to NEIGHBOR's left, // otherwise insert USE to NEIGHBOR's right. auto *use_node = allocate> (use); - def->m_use_tree.insert_child (neighbor, comparison > 0, use_node); + def->m_use_tree.insert_relative (comparison, use_node); if (comparison > 0) - insert_use_after (use, neighbor->value ()); + insert_use_after (use, neighbor); else - insert_use_before (use, neighbor->value ()); + insert_use_before (use, neighbor); } void diff --git a/gcc/splay-tree-utils.h b/gcc/splay-tree-utils.h index 8344808f6d1..9526e0ba336 100644 --- a/gcc/splay-tree-utils.h +++ b/gcc/splay-tree-utils.h @@ -185,6 +185,21 @@ public: template bool insert (node_type new_node, Comparator compare); + // Insert NEW_NODE into the splay tree. If the tree is currently non-empty, + // COMPARISON is < 0 if NEW_NODE comes immediate before the current root, + // or > 0 if NEW_NODE comes immediately after the current root. + // + // On return, NEW_NODE is the root of the tree. + // + // For example, this can be used in the construct: + // + // int comparison = tree.lookup (...); + // if (comparison != 0) + // tree.insert_relative (comparison, create_new_node ()); + // + // Complexity: O(1) + void insert_relative (int comparison, node_type new_node); + // Insert NEW_NODE into the splay tree, given that NEW_NODE is the // maximum node of the new tree. On return, NEW_NODE is also the // root of the tree. @@ -212,6 +227,14 @@ public: // Otherwise amortized O(log N), worst-cast O(N). void remove_root (); + // Remove the root node of the splay tree. If the root node was not + // the maximum node, bring the next node to the root and return true. + // Return false otherwise. + // + // Complexity: O(1) if removing the maximum node. Otherwise amortized + // O(log N), worst-cast O(N). + bool remove_root_and_splay_next (); + // Split the left child of the current root out into a separate tree // and return the new tree. rooted_splay_tree split_before_root (); @@ -326,6 +349,12 @@ public: int lookup (LeftPredicate want_something_smaller, RightPredicate want_something_bigger); + // Like lookup, but always pick a node that is no bigger than the one + // being searched for, if such a node exists. + template + int lookup_le (LeftPredicate want_something_smaller, + RightPredicate want_something_bigger); + // Keep the ability to print subtrees. using parent::print; diff --git a/gcc/splay-tree-utils.tcc b/gcc/splay-tree-utils.tcc index 5c36bb55f78..6118233a5f6 100644 --- a/gcc/splay-tree-utils.tcc +++ b/gcc/splay-tree-utils.tcc @@ -340,15 +340,30 @@ rooted_splay_tree::insert (node_type new_node, Comparator compare) if (comparison == 0) return false; - // Insert NEW_NODE before M_ROOT if COMPARISON < 0 and after M_ROOT - // otherwise. - set_child (new_node, comparison < 0, m_root); - set_child (new_node, comparison > 0, get_child (m_root, comparison > 0)); - set_child (m_root, comparison > 0, nullptr); - m_root = new_node; + insert_relative (comparison, new_node); return true; } +// See the comment above the declaration. +template +inline void +rooted_splay_tree::insert_relative (int comparison, + node_type new_node) +{ + gcc_checking_assert (!get_child (new_node, 0) + && !get_child (new_node, 1) + && (!m_root || comparison != 0)); + if (m_root) + { + // Insert NEW_NODE before M_ROOT if COMPARISON < 0 and after M_ROOT + // otherwise. + set_child (new_node, comparison < 0, m_root); + set_child (new_node, comparison > 0, get_child (m_root, comparison > 0)); + set_child (m_root, comparison > 0, node_type ()); + } + m_root = new_node; +} + // See the comment above the declaration. template inline void @@ -398,6 +413,35 @@ rooted_splay_tree::remove_root () set_child (node, 1, node_type ()); } +// See the comment above the declaration. +template +inline bool +rooted_splay_tree::remove_root_and_splay_next () +{ + node_type node = m_root; + node_type right = get_child (node, 1); + if (right) + { + // Bring the minimum right-hand node to the root. + if (get_child (right, 0)) + { + right = parent::template splay_limit<0> (right); + gcc_checking_assert (!get_child (right, 0)); + } + set_child (right, 0, get_child (node, 0)); + m_root = right; + } + else + m_root = get_child (node, 0); + if (m_root) + set_parent (m_root, node_type ()); + + // Clear the links from NODE. Its parent is already node_type (). + set_child (node, 0, node_type ()); + set_child (node, 1, node_type ()); + return right; +} + // See the comment above the declaration. template inline rooted_splay_tree @@ -730,6 +774,19 @@ rooted_splay_tree::lookup (LeftPredicate want_something_smaller, return result; } +// See the comment above the declaration. +template +template +int +rooted_splay_tree::lookup_le (LeftPredicate want_something_smaller, + RightPredicate want_something_bigger) +{ + int comparison = lookup (want_something_smaller, want_something_bigger); + if (comparison < 0 && splay_prev_node ()) + comparison = 1; + return comparison; +} + // See the comment above the declaration. template template diff --git a/gcc/tree-ssa-sccvn.cc b/gcc/tree-ssa-sccvn.cc index 0639ba426ff..4370d09d9d8 100644 --- a/gcc/tree-ssa-sccvn.cc +++ b/gcc/tree-ssa-sccvn.cc @@ -21,7 +21,6 @@ along with GCC; see the file COPYING3. If not see #include "config.h" #include "system.h" #include "coretypes.h" -#include "splay-tree.h" #include "backend.h" #include "rtl.h" #include "tree.h" @@ -33,6 +32,7 @@ along with GCC; see the file COPYING3. If not see #include "emit-rtl.h" #include "cgraph.h" #include "gimple-pretty-print.h" +#include "splay-tree-utils.h" #include "alias.h" #include "fold-const.h" #include "stor-layout.h" @@ -1832,6 +1832,7 @@ struct pd_range { HOST_WIDE_INT offset; HOST_WIDE_INT size; + pd_range *m_children[2]; }; struct pd_data @@ -1853,8 +1854,8 @@ struct vn_walk_cb_data mask (mask_), masked_result (NULL_TREE), same_val (NULL_TREE), vn_walk_kind (vn_walk_kind_), tbaa_p (tbaa_p_), redundant_store_removal_p (redundant_store_removal_p_), - saved_operands (vNULL), first_set (-2), first_base_set (-2), - known_ranges (NULL) + saved_operands (vNULL), first_range (), first_set (-2), + first_base_set (-2) { if (!last_vuse_ptr) last_vuse_ptr = &last_vuse; @@ -1924,7 +1925,7 @@ struct vn_walk_cb_data pd_range first_range; alias_set_type first_set; alias_set_type first_base_set; - splay_tree known_ranges; + default_splay_tree known_ranges; obstack ranges_obstack; static constexpr HOST_WIDE_INT bufsize = 64; }; @@ -1932,10 +1933,7 @@ struct vn_walk_cb_data vn_walk_cb_data::~vn_walk_cb_data () { if (known_ranges) - { - splay_tree_delete (known_ranges); - obstack_free (&ranges_obstack, NULL); - } + obstack_free (&ranges_obstack, NULL); saved_operands.release (); } @@ -1961,32 +1959,6 @@ vn_walk_cb_data::finish (alias_set_type set, alias_set_type base_set, tree val) vr->type, operands, val); } -/* pd_range splay-tree helpers. */ - -static int -pd_range_compare (splay_tree_key offset1p, splay_tree_key offset2p) -{ - HOST_WIDE_INT offset1 = *(HOST_WIDE_INT *)offset1p; - HOST_WIDE_INT offset2 = *(HOST_WIDE_INT *)offset2p; - if (offset1 < offset2) - return -1; - else if (offset1 > offset2) - return 1; - return 0; -} - -static void * -pd_tree_alloc (int size, void *data_) -{ - vn_walk_cb_data *data = (vn_walk_cb_data *)data_; - return obstack_alloc (&data->ranges_obstack, size); -} - -static void -pd_tree_dealloc (void *, void *) -{ -} - /* Push PD to the vector of partial definitions returning a value when we are ready to combine things with VUSE, SET and MAXSIZEI, NULL when we want to continue looking for partial defs or -1 @@ -2055,51 +2027,43 @@ vn_walk_cb_data::push_partial_def (pd_data pd, /* ??? Optimize the case where the 2nd partial def completes things. */ gcc_obstack_init (&ranges_obstack); - known_ranges = splay_tree_new_with_allocator (pd_range_compare, 0, 0, - pd_tree_alloc, - pd_tree_dealloc, this); - splay_tree_insert (known_ranges, - (splay_tree_key)&first_range.offset, - (splay_tree_value)&first_range); - } - - pd_range newr = { pd.offset, pd.size }; - splay_tree_node n; - /* Lookup the predecessor of offset + 1 and see if we need to merge. */ - HOST_WIDE_INT loffset = newr.offset + 1; - if ((n = splay_tree_predecessor (known_ranges, (splay_tree_key)&loffset)) - && ((r = (pd_range *)n->value), true) + known_ranges.insert_max_node (&first_range); + } + /* Lookup the offset and see if we need to merge. */ + int comparison = known_ranges.lookup_le + ([&] (pd_range *r) { return pd.offset < r->offset; }, + [&] (pd_range *r) { return pd.offset > r->offset; }); + r = known_ranges.root (); + if (comparison >= 0 && ranges_known_overlap_p (r->offset, r->size + 1, - newr.offset, newr.size)) + pd.offset, pd.size)) { /* Ignore partial defs already covered. Here we also drop shadowed clobbers arriving here at the floor. */ - if (known_subrange_p (newr.offset, newr.size, r->offset, r->size)) + if (known_subrange_p (pd.offset, pd.size, r->offset, r->size)) return NULL; - r->size - = MAX (r->offset + r->size, newr.offset + newr.size) - r->offset; + r->size = MAX (r->offset + r->size, pd.offset + pd.size) - r->offset; } else { - /* newr.offset wasn't covered yet, insert the range. */ - r = XOBNEW (&ranges_obstack, pd_range); - *r = newr; - splay_tree_insert (known_ranges, (splay_tree_key)&r->offset, - (splay_tree_value)r); - } - /* Merge r which now contains newr and is a member of the splay tree with - adjacent overlapping ranges. */ - pd_range *rafter; - while ((n = splay_tree_successor (known_ranges, - (splay_tree_key)&r->offset)) - && ((rafter = (pd_range *)n->value), true) - && ranges_known_overlap_p (r->offset, r->size + 1, - rafter->offset, rafter->size)) - { - r->size = MAX (r->offset + r->size, - rafter->offset + rafter->size) - r->offset; - splay_tree_remove (known_ranges, (splay_tree_key)&rafter->offset); + /* pd.offset wasn't covered yet, insert the range. */ + void *addr = XOBNEW (&ranges_obstack, pd_range); + r = new (addr) pd_range { pd.offset, pd.size, {} }; + known_ranges.insert_relative (comparison, r); } + /* Merge r which now contains pd's range and is a member of the splay + tree with adjacent overlapping ranges. */ + if (known_ranges.splay_next_node ()) + do + { + pd_range *rafter = known_ranges.root (); + if (!ranges_known_overlap_p (r->offset, r->size + 1, + rafter->offset, rafter->size)) + break; + r->size = MAX (r->offset + r->size, + rafter->offset + rafter->size) - r->offset; + } + while (known_ranges.remove_root_and_splay_next ()); /* If we get a clobber, fail. */ if (TREE_CLOBBER_P (pd.rhs)) return (void *)-1; @@ -2109,7 +2073,7 @@ vn_walk_cb_data::push_partial_def (pd_data pd, partial_defs.safe_push (pd); } - /* Now we have merged newr into the range tree. When we have covered + /* Now we have merged pd's range into the range tree. When we have covered [offseti, sizei] then the tree will contain exactly one node which has the desired properties and it will be 'r'. */ if (!known_subrange_p (0, maxsizei, r->offset, r->size))