diff mbox series

[PR,tree-optimization/98174] Reduce memory requirements for ranger

Message ID eed87b60-2bfa-e5e1-b2fe-afaad8cbbfb2@redhat.com
State New
Headers show
Series [PR,tree-optimization/98174] Reduce memory requirements for ranger | expand

Commit Message

Andrew MacLeod Dec. 10, 2020, 9:35 p.m. UTC
With very large CFG's ranger on entry cache is not particularly efficient.

One thing I never got to was recognizing that if an ssa-name is never 
used in an outgoing edge calculation, then its range never changes.. the 
global range is sufficient and we do not need to propagate the on-entry 
cache with it.  Many SSA_NAMES fall into this category.

This patch implements this by preprocessing the exit of each block 
(which would be done anyway, we just do it up front now) and keeping a 
cumulative map of anything which is exported.   Anything without the bit 
set is a "pure global" which means nothing ever changes the range.

Then when it comes time to populate the on-entry cache, we just bail if 
its a pure global.

THis seems to resolve most of the memory issues I am seeing. One side 
effect is ranger now doesnt fully propagate the non-null property which 
is a product of statement side effects.  This is the only thing in 
ranger which can affect a range but isnt a block export.   This won't 
impact the hybrid EVRP model since EVRP continues to track this 
information.   I will keep an eye on whether this has any impact on the 
other couple of passes which are using ranger..  If necessary, we can 
easily make this a hybrid mode only mode, but for this release it may 
not be needed.

Next release, I plan to handle the non-null processing in a completely 
different way anyway when we add the facility to query for general 
statement side effects.  so this is in fact a future compatible change.

There is still an time increase issue in pr91257 that I am continuing to 
look at to see if there is anything further to be done;  it seems mostly 
related to PHIs and the new edge processing. At least this seems to 
resolve the excessive memory issues in the specified PRs.

I will also continue to monitor if we need to re-enable anything for the 
non-null processing outside of hybrid.

Bootstrapped on x86_64-pc-linux-gnu, no regressions , pushed.

Andrew
commit 7f359556a772e26eabf8d31e53aae1de6f2f200d
Author: Andrew MacLeod <amacleod@redhat.com>
Date:   Thu Dec 10 14:59:14 2020 -0500

    Reduce memory requirements for ranger
    
    Calculate block exit info upfront, and then any SSA_NAME which is never
    used in an outgoing range calculation is a pure global and can bypass the
    on-entry cache.
    
            PR tree-optimization/98174
            * gimple-range-cache.cc (ranger_cache::ssa_range_in_bb): Only push
            poor values to be examined if it isn't a pure global.
            (ranger_cache::block_range): Don't process pure globals.
            (ranger_cache::fill_block_cache): Adjust has_edge_range call.
            * gimple-range-gori.cc (gori_map::all_outgoing): New bitmap.
            (gori_map::gori_map): Allocate all_outgoing.
            (gori_map::is_export_p): No specified BB returns global context.
            (gori_map::calculate_gori): Accumulate each block into global.
            (gori_compute::gori_compute): Preprocess each block for exports.
            (gori_compute::has_edge_range_p): No edge returns global context.
            * gimple-range-gori.h (has_edge_range_p): Provide default parameter.
diff mbox series

Patch

diff --git a/gcc/gimple-range-cache.cc b/gcc/gimple-range-cache.cc
index b01563c83f9..edebad45a50 100644
--- a/gcc/gimple-range-cache.cc
+++ b/gcc/gimple-range-cache.cc
@@ -779,8 +779,10 @@  ranger_cache::ssa_range_in_bb (irange &r, tree name, basic_block bb)
   // Look for the on-entry value of name in BB from the cache.
   else if (!m_on_entry.get_bb_range (r, name, bb))
     {
-      // If it has no entry then mark this as a poor value.
-      if (push_poor_value (bb, name))
+      // If it has no entry but should, then mark this as a poor value.
+      // Its not a poor value if it does not have *any* edge ranges,
+      // Then global range is as good as it gets.
+      if (has_edge_range_p (name) && push_poor_value (bb, name))
 	{
 	  if (DEBUG_RANGE_CACHE)
 	    {
@@ -812,6 +814,11 @@  ranger_cache::block_range (irange &r, basic_block bb, tree name, bool calc)
 {
   gcc_checking_assert (gimple_range_ssa_p (name));
 
+  // If there are no range calculations anywhere in the IL, global range
+  // applies everywhere, so don't bother caching it.
+  if (!has_edge_range_p (name))
+    return false;
+
   if (calc)
     {
       gimple *def_stmt = SSA_NAME_DEF_STMT (name);
@@ -1072,7 +1079,7 @@  ranger_cache::fill_block_cache (tree name, basic_block bb, basic_block def_bb)
 	    {
 	      if (DEBUG_RANGE_CACHE)
 		fprintf (dump_file, "has cache, ");
-	      if (!r.undefined_p () || has_edge_range_p (e, name))
+	      if (!r.undefined_p () || has_edge_range_p (name, e))
 		{
 		  add_to_update (node);
 		  if (DEBUG_RANGE_CACHE)
diff --git a/gcc/gimple-range-gori.cc b/gcc/gimple-range-gori.cc
index af3609e414e..ac13718f7e6 100644
--- a/gcc/gimple-range-gori.cc
+++ b/gcc/gimple-range-gori.cc
@@ -229,17 +229,18 @@  public:
   gori_map ();
   ~gori_map ();
 
-  bool is_export_p (tree name, basic_block bb);
+  bool is_export_p (tree name, basic_block bb = NULL);
   bool def_chain_in_export_p (tree name, basic_block bb);
+  bitmap exports (basic_block bb);
 
   void dump (FILE *f);
   void dump (FILE *f, basic_block bb);
 private:
   bitmap_obstack m_bitmaps;
   vec<bitmap> m_outgoing;	// BB: Outgoing ranges calculatable on edges
+  bitmap all_outgoing;		// All outgoing ranges combined. 
   void maybe_add_gori (tree name, basic_block bb);
   void calculate_gori (basic_block bb);
-  bitmap exports (basic_block bb);
 };
 
 
@@ -250,6 +251,7 @@  gori_map::gori_map ()
   m_outgoing.create (0);
   m_outgoing.safe_grow_cleared (last_basic_block_for_fn (cfun));
   bitmap_obstack_initialize (&m_bitmaps);
+  all_outgoing = BITMAP_ALLOC (&m_bitmaps);
 }
 
 // Free any memory the GORI map allocated.
@@ -276,6 +278,9 @@  gori_map::exports (basic_block bb)
 bool
 gori_map::is_export_p (tree name, basic_block bb)
 {
+  // If no BB is specified, test if it is exported anywhere in the IL.
+  if (!bb)
+    return bitmap_bit_p (all_outgoing, SSA_NAME_VERSION (name));
   return bitmap_bit_p (exports (bb), SSA_NAME_VERSION (name));
 }
 
@@ -342,6 +347,8 @@  gori_map::calculate_gori (basic_block bb)
       name = gimple_range_ssa_p (gimple_switch_index (gs));
       maybe_add_gori (name, gimple_bb (stmt));
     }
+  // Add this bitmap to the aggregate list of all outgoing names.
+  bitmap_ior_into (all_outgoing, m_outgoing[bb->index]);
 }
 
 // Dump the table information for BB to file F.
@@ -438,6 +445,16 @@  gori_compute::gori_compute ()
   m_bool_zero = int_range<2> (boolean_false_node, boolean_false_node);
   m_bool_one = int_range<2> (boolean_true_node, boolean_true_node);
   m_gori_map = new gori_map;
+  unsigned x, lim = last_basic_block_for_fn (cfun);
+  // Calculate outgoing range info upfront.  This will fully populate the
+  // all_outgoing bitmap which will help eliminate processing of names
+  // which never have their ranges adjusted.
+  for (x = 0; x < lim ; x++)
+    {
+      basic_block bb = BASIC_BLOCK_FOR_FN (cfun, x);
+      if (bb)
+	m_gori_map->exports (bb);
+    }
 }
 
 // Destruct a gori_compute_object.
@@ -969,8 +986,12 @@  gori_compute::compute_operand1_and_operand2_range
 // Return TRUE if a range can be calcalated for NAME on edge E.
 
 bool
-gori_compute::has_edge_range_p (edge e, tree name)
+gori_compute::has_edge_range_p (tree name, edge e)
 {
+  // If no edge is specified, check if NAME is an export on any edge.
+  if (!e)
+    return m_gori_map->is_export_p (name);
+
   return (m_gori_map->is_export_p (name, e->src)
 	  || m_gori_map->def_chain_in_export_p (name, e->src));
 }
diff --git a/gcc/gimple-range-gori.h b/gcc/gimple-range-gori.h
index 8ef452bf433..6cbf532c435 100644
--- a/gcc/gimple-range-gori.h
+++ b/gcc/gimple-range-gori.h
@@ -29,10 +29,13 @@  along with GCC; see the file COPYING3.  If not see
 //
 // There are 2 primary entry points:
 //
-// has_edge_range_p (edge e, tree name)  
+// has_edge_range_p (tree name, edge e)
 //   returns true if the outgoing edge *may* be able to produce range
 //   information for ssa_name NAME on edge E.
 //   FALSE is returned if this edge does not affect the range of NAME.
+//   if no edge is specified, return TRUE if name may have a value calculated
+//   on *ANY* edge that has been seen.  FALSE indicates that the global value
+//   is applicable everywhere that has been processed.
 //
 // outgoing_edge_range_p (irange &range, edge e, tree name)
 //   Actually does the calculation of RANGE for name on E
@@ -68,7 +71,7 @@  public:
   gori_compute ();
   ~gori_compute ();
   bool outgoing_edge_range_p (irange &r, edge e, tree name);
-  bool has_edge_range_p (edge e, tree name);
+  bool has_edge_range_p (tree name, edge e = NULL);
   void dump (FILE *f);
 protected:
   virtual void ssa_range_in_bb (irange &r, tree name, basic_block bb);