diff mbox series

[COMMITTED,2/2] PR tree-optimization/114855 - Adjust rangers recomputation depth based on the number of BBs.

Message ID c2816f11-af73-4a5b-8ce2-3631c7355a1a@redhat.com
State New
Headers show
Series [COMMITTED,1/2] PR tree-optimization/114855 - Limit equivalency processing in rangers cache. | expand

Commit Message

Andrew MacLeod Aug. 9, 2024, 7:18 p.m. UTC
The other place a lot of time is being spent is in the recomputation 
section. We have a depth limit parameter (ranger-recompuation-depth) 
which defaults to 5, and specifies the number of statements back we will 
look to recompute.. this allows us to recompute some relatively 
complicated things.

Again, when the CFG is very large, this testcase shows that this can be 
very expensive, and we should probably prune that down as the CFG 
grows.  This patch reduces the specified recomputation depth value by 
one for every 4096 blocks in the CFG.  Which means for any testcase over 
about 16,000
  basic blocks, we will only look at recomputing single statements.  
This helps to reduce the appearance of quadratic-ness.

Original time at -O1 was 1506 seconds, with 1215 seconds spent in DOM.
The previous patch dropped it to 846 seconds total, with 589 seconds 
spent in DOM.  thats about a 50% reduction in DOM.
This patch drops the time to a total of 416 seconds, with 214 being 
spent in DOM, so overal about a 90% time reduction in DOM for this test 
case.

Then vast majority of time remaining is spent recalculating edges.

Bootstrapped on x86_64-pc-linux-gnu with no regressions.  Pushed.

Andrew
diff mbox series

Patch

From 9e4da946c4263a4c89d5fc365b3c97ae244c5018 Mon Sep 17 00:00:00 2001
From: Andrew MacLeod <amacleod@redhat.com>
Date: Thu, 8 Aug 2024 16:37:28 -0400
Subject: [PATCH 2/2] Adjust rangers recomputation depth based on the number of
 BBs.

As the number of block increase, recomputations can become more
expensive.  Adjust the depth limit to avoid excessive compile time.

	PR tree-optimization/114855
	* gimple-range-gori.cc (gori_compute::gori_compute): Adjust
	ranger_recompute_depth limit based on the number of BBs.
	(gori_compute::may_recompute_p): Use previosuly calculated value.
	* gimple-range-gori.h (gori_compute::m_recompute_depth): New.
---
 gcc/gimple-range-gori.cc | 12 ++++++++----
 gcc/gimple-range-gori.h  |  1 +
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/gcc/gimple-range-gori.cc b/gcc/gimple-range-gori.cc
index a31e3be65f7..f2e2b5049aa 100644
--- a/gcc/gimple-range-gori.cc
+++ b/gcc/gimple-range-gori.cc
@@ -567,6 +567,13 @@  gori_compute::gori_compute (gori_map &map, int not_executable_flag,
   m_bool_one = range_true ();
   if (dump_file && (param_ranger_debug & RANGER_DEBUG_GORI))
     tracer.enable_trace ();
+
+  // Reduce maximum recompute depth based on the size of the CFG to avoid
+  // excessive compuations in large CFGs.
+  m_recompute_depth = (int) param_ranger_recompute_depth
+		      - (int) last_basic_block_for_fn (cfun) / 4096;
+  if (m_recompute_depth < 1)
+    m_recompute_depth = 1;
 }
 
 gori_compute::~gori_compute ()
@@ -1327,10 +1334,7 @@  gori_compute::may_recompute_p (tree name, basic_block bb, int depth)
     {
       // -1 indicates a default param, convert it to the real default.
       if (depth == -1)
-	{
-	  depth = (int)param_ranger_recompute_depth;
-	  gcc_checking_assert (depth >= 1);
-	}
+	depth = m_recompute_depth;
 
       bool res = m_map.is_export_p (dep1, bb);
       if (res || depth <= 1)
diff --git a/gcc/gimple-range-gori.h b/gcc/gimple-range-gori.h
index 11019e38471..97e051cd317 100644
--- a/gcc/gimple-range-gori.h
+++ b/gcc/gimple-range-gori.h
@@ -206,6 +206,7 @@  private:
 
   range_tracer tracer;
   int m_not_executable_flag;
+  int m_recompute_depth;
 };
 
 // These APIs are used to query GORI if there are ranges generated on an edge.
-- 
2.45.0