From patchwork Thu Dec 10 21:35:59 2020
From: Andrew MacLeod
To: gcc-patches
Subject: [patch] [PR tree-optimization/98174] Reduce memory requirements for ranger
Date: Thu, 10 Dec 2020 16:35:59 -0500

With very large
CFGs, ranger's on-entry cache is not particularly efficient. One thing I never got to was recognizing that if an SSA_NAME is never used in an outgoing edge calculation, then its range never changes: the global range is sufficient, and we do not need to propagate it through the on-entry cache. Many SSA_NAMEs fall into this category.

This patch implements that by preprocessing the exit of each block (which would be done anyway; we just do it up front now) and keeping a cumulative map of everything that is exported. Anything without its bit set is a "pure global", meaning nothing ever changes its range. Then, when it comes time to populate the on-entry cache, we simply bail if the name is a pure global. This seems to resolve most of the memory issues I am seeing.

One side effect is that ranger no longer fully propagates the non-null property, which is a product of statement side effects. This is the only thing in ranger which can affect a range but isn't a block export. It won't impact the hybrid EVRP model, since EVRP continues to track this information. I will keep an eye on whether it has any impact on the other couple of passes which are using ranger. If necessary, we can easily restrict this to hybrid mode only, but for this release that may not be needed. Next release, I plan to handle non-null processing in a completely different way anyway, when we add the facility to query for general statement side effects, so this is in fact a forward-compatible change.

There is still a time increase in pr91257 that I am continuing to look at to see if there is anything further to be done; it seems mostly related to PHIs and the new edge processing. At least this seems to resolve the excessive memory issues in the specified PRs. I will also continue to monitor whether we need to re-enable anything for non-null processing outside of hybrid mode.

Bootstrapped on x86_64-pc-linux-gnu, no regressions, pushed.
Andrew

commit 7f359556a772e26eabf8d31e53aae1de6f2f200d
Author: Andrew MacLeod
Date:   Thu Dec 10 14:59:14 2020 -0500

    Reduce memory requirements for ranger

    Calculate block exit info upfront, and then any SSA_NAME which is never
    used in an outgoing range calculation is a pure global and can bypass
    the on-entry cache.

            PR tree-optimization/98174
            * gimple-range-cache.cc (ranger_cache::ssa_range_in_bb): Only
            push poor values to be examined if it isn't a pure global.
            (ranger_cache::block_range): Don't process pure globals.
            (ranger_cache::fill_block_cache): Adjust has_edge_range call.
            * gimple-range-gori.cc (gori_map::all_outgoing): New bitmap.
            (gori_map::gori_map): Allocate all_outgoing.
            (gori_map::is_export_p): No specified BB returns global context.
            (gori_map::calculate_gori): Accumulate each block into global.
            (gori_compute::gori_compute): Preprocess each block for exports.
            (gori_compute::has_edge_range_p): No edge returns global context.
            * gimple-range-gori.h (has_edge_range_p): Provide default
            parameter.

diff --git a/gcc/gimple-range-cache.cc b/gcc/gimple-range-cache.cc
index b01563c83f9..edebad45a50 100644
--- a/gcc/gimple-range-cache.cc
+++ b/gcc/gimple-range-cache.cc
@@ -779,8 +779,10 @@ ranger_cache::ssa_range_in_bb (irange &r, tree name, basic_block bb)
   // Look for the on-entry value of name in BB from the cache.
   else if (!m_on_entry.get_bb_range (r, name, bb))
     {
-      // If it has no entry then mark this as a poor value.
-      if (push_poor_value (bb, name))
+      // If it has no entry but should, then mark this as a poor value.
+      // Its not a poor value if it does not have *any* edge ranges,
+      // Then global range is as good as it gets.
+      if (has_edge_range_p (name) && push_poor_value (bb, name))
	{
	  if (DEBUG_RANGE_CACHE)
	    {
@@ -812,6 +814,11 @@ ranger_cache::block_range (irange &r, basic_block bb, tree name, bool calc)
 {
   gcc_checking_assert (gimple_range_ssa_p (name));
 
+  // If there are no range calculations anywhere in the IL, global range
+  // applies everywhere, so don't bother caching it.
+  if (!has_edge_range_p (name))
+    return false;
+
   if (calc)
     {
       gimple *def_stmt = SSA_NAME_DEF_STMT (name);
@@ -1072,7 +1079,7 @@ ranger_cache::fill_block_cache (tree name, basic_block bb, basic_block def_bb)
	{
	  if (DEBUG_RANGE_CACHE)
	    fprintf (dump_file, "has cache, ");
-	  if (!r.undefined_p () || has_edge_range_p (e, name))
+	  if (!r.undefined_p () || has_edge_range_p (name, e))
	    {
	      add_to_update (node);
	      if (DEBUG_RANGE_CACHE)
diff --git a/gcc/gimple-range-gori.cc b/gcc/gimple-range-gori.cc
index af3609e414e..ac13718f7e6 100644
--- a/gcc/gimple-range-gori.cc
+++ b/gcc/gimple-range-gori.cc
@@ -229,17 +229,18 @@ public:
   gori_map ();
   ~gori_map ();
-  bool is_export_p (tree name, basic_block bb);
+  bool is_export_p (tree name, basic_block bb = NULL);
   bool def_chain_in_export_p (tree name, basic_block bb);
+  bitmap exports (basic_block bb);
   void dump (FILE *f);
   void dump (FILE *f, basic_block bb);
 private:
   bitmap_obstack m_bitmaps;
   vec<bitmap> m_outgoing;	// BB: Outgoing ranges calculatable on edges
+  bitmap all_outgoing;		// All outgoing ranges combined.
   void maybe_add_gori (tree name, basic_block bb);
   void calculate_gori (basic_block bb);
-  bitmap exports (basic_block bb);
 };
@@ -250,6 +251,7 @@ gori_map::gori_map ()
   m_outgoing.create (0);
   m_outgoing.safe_grow_cleared (last_basic_block_for_fn (cfun));
   bitmap_obstack_initialize (&m_bitmaps);
+  all_outgoing = BITMAP_ALLOC (&m_bitmaps);
 }
 
 // Free any memory the GORI map allocated.
@@ -276,6 +278,9 @@ gori_map::exports (basic_block bb)
 bool
 gori_map::is_export_p (tree name, basic_block bb)
 {
+  // If no BB is specified, test if it is exported anywhere in the IL.
+  if (!bb)
+    return bitmap_bit_p (all_outgoing, SSA_NAME_VERSION (name));
   return bitmap_bit_p (exports (bb), SSA_NAME_VERSION (name));
 }
 
@@ -342,6 +347,8 @@ gori_map::calculate_gori (basic_block bb)
       name = gimple_range_ssa_p (gimple_switch_index (gs));
       maybe_add_gori (name, gimple_bb (stmt));
     }
+  // Add this bitmap to the aggregate list of all outgoing names.
+  bitmap_ior_into (all_outgoing, m_outgoing[bb->index]);
 }
 
 // Dump the table information for BB to file F.
@@ -438,6 +445,16 @@ gori_compute::gori_compute ()
   m_bool_zero = int_range<2> (boolean_false_node, boolean_false_node);
   m_bool_one = int_range<2> (boolean_true_node, boolean_true_node);
   m_gori_map = new gori_map;
+  unsigned x, lim = last_basic_block_for_fn (cfun);
+  // Calculate outgoing range info upfront.  This will fully populate the
+  // all_outgoing bitmap which will help eliminate processing of names
+  // which never have their ranges adjusted.
+  for (x = 0; x < lim ; x++)
+    {
+      basic_block bb = BASIC_BLOCK_FOR_FN (cfun, x);
+      if (bb)
+	m_gori_map->exports (bb);
+    }
 }
 
 // Destruct a gori_compute_object.
@@ -969,8 +986,12 @@ gori_compute::compute_operand1_and_operand2_range
 // Return TRUE if a range can be calcalated for NAME on edge E.
 
 bool
-gori_compute::has_edge_range_p (edge e, tree name)
+gori_compute::has_edge_range_p (tree name, edge e)
 {
+  // If no edge is specified, check if NAME is an export on any edge.
+  if (!e)
+    return m_gori_map->is_export_p (name);
+
   return (m_gori_map->is_export_p (name, e->src)
	  || m_gori_map->def_chain_in_export_p (name, e->src));
 }
diff --git a/gcc/gimple-range-gori.h b/gcc/gimple-range-gori.h
index 8ef452bf433..6cbf532c435 100644
--- a/gcc/gimple-range-gori.h
+++ b/gcc/gimple-range-gori.h
@@ -29,10 +29,13 @@ along with GCC; see the file COPYING3.  If not see
 //
 // There are 2 primary entry points:
 //
-// has_edge_range_p (edge e, tree name)
+// has_edge_range_p (tree name, edge e)
 //   returns true if the outgoing edge *may* be able to produce range
 //   information for ssa_name NAME on edge E.
 //   FALSE is returned if this edge does not affect the range of NAME.
+//   if no edge is specified, return TRUE if name may have a value calculated
+//   on *ANY* edge that has been seen.  FALSE indicates that the global value
+//   is applicable everywhere that has been processed.
 //
 // outgoing_edge_range_p (irange &range, edge e, tree name)
 //   Actually does the calculation of RANGE for name on E
@@ -68,7 +71,7 @@ public:
   gori_compute ();
   ~gori_compute ();
   bool outgoing_edge_range_p (irange &r, edge e, tree name);
-  bool has_edge_range_p (edge e, tree name);
+  bool has_edge_range_p (tree name, edge e = NULL);
   void dump (FILE *f);
 protected:
   virtual void ssa_range_in_bb (irange &r, tree name, basic_block bb);