Message ID | 1bb04945-d13b-4805-b9ed-0be4a5c773fc@redhat.com |
---|---|
State | New |
Headers | show |
Series | Add param for bb limit to invoke fast_vrp. | expand |
On Fri, Jun 21, 2024 at 3:02 PM Andrew MacLeod <amacleod@redhat.com> wrote: > > This patch adds > > --param=vrp-block-limit=N > > When the basic block counter for a function exceeded 'N' , VRP is > invoked with the new fast_vrp algorithm instead. This algorithm uses a > lot less memory and processing power, although it does get a few less > things. > > Primary motivation is cases like > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114855 in which the 3 VRP > passes consume about 600 seconds of the compile time, and a lot of > memory. With fast_vrp, it spends less than 10 seconds total in the > 3 passes of VRP. This test case has about 400,000 basic blocks. > > The default for N in this patch is 150,000, arbitrarily chosen. > > This bootstraps, (and I bootstrapped it with --param=vrp-block-limit=0 > as well) on x86_64-pc-linux-gnu, with no regressions. > > What do you think, OK for trunk? + if (last_basic_block_for_fn (fun) > param_vrp_block_limit || + &data == &pass_data_fast_vrp) || goes to the next line. Btw, we have -Wdisabled-optimization for these cases which should say sth like "function has excess of %d number of basic blocks (--param vrp-block-limit=%d), using fast VRP algorithm" or so in this case. As I wrote in the PR the priority should be -O1 compile-time performance and memory use. Richard. > Andrew > > PS sorry,. it doesn't help the threader in that PR :-( It's probably one of the known bottlenecks there - IIRC the path range queries make O(n^2) work. I can look at this at some point as I've dealt with the slowness of this pass in the past. There is param_max_jump_thread_paths that should limit searching but there is IIRC no way to limit the work path ranger actually does when resolving the query. Richard.
On 6/22/24 09:15, Richard Biener wrote: > On Fri, Jun 21, 2024 at 3:02 PM Andrew MacLeod <amacleod@redhat.com> wrote: >> This patch adds >> >> --param=vrp-block-limit=N >> >> When the basic block counter for a function exceeded 'N' , VRP is >> invoked with the new fast_vrp algorithm instead. This algorithm uses a >> lot less memory and processing power, although it does get a few less >> things. >> >> Primary motivation is cases like >> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114855 in which the 3 VRP >> passes consume about 600 seconds of the compile time, and a lot of >> memory. With fast_vrp, it spends less than 10 seconds total in the >> 3 passes of VRP. This test case has about 400,000 basic blocks. >> >> The default for N in this patch is 150,000, arbitrarily chosen. >> >> This bootstraps, (and I bootstrapped it with --param=vrp-block-limit=0 >> as well) on x86_64-pc-linux-gnu, with no regressions. >> >> What do you think, OK for trunk? > + if (last_basic_block_for_fn (fun) > param_vrp_block_limit || > + &data == &pass_data_fast_vrp) > > || goes to the next line. > > Btw, we have -Wdisabled-optimization for these cases which should > say sth like "function has excess of %d number of basic blocks > (--param vrp-block-limit=%d), using fast VRP algorithm" > or so in this case. > > As I wrote in the PR the priority should be -O1 compile-time > performance and memory use. Yeah, I just wanted to use it as a model for "bad" cases for ranger. Adjusted patch attached which now issues the warning. I also found that the transitive relations were causing a small blowup in time for relation processing now that I turned relations on for fast VRP. I commited a patch and fast_vrp no longer does transitives. If you want to experiment with enabling fast VRP at -O1, it should be fast all the time now. I think :-) This testcase runs in about 95 seconds on my test machine. if I turn on VRP, a single VRP pass takes about 2.5 seconds. Its all set up, you can just add: NEXT_PASS (pass_fast_vrp); at an appropriate spot. > Richard. > >> Andrew >> >> PS sorry,. it doesn't help the threader in that PR :-( > It's probably one of the known bottlenecks there - IIRC the path range > queries make O(n^2) work. I can look at this at some point as I've > dealt with the slowness of this pass in the past. > > There is param_max_jump_thread_paths that should limit searching > but there is IIRC no way to limit the work path ranger actually does > when resolving the query. > Yeah, Id like to talk to Aldy about revamping the threader now that some of the newer facilities are available that fast_vrp uses. We can calculate all the outgoing ranges for a block at once with : // Fill ssa-cache R with any outgoing ranges on edge E, using QUERY. bool gori_on_edge (class ssa_cache &r, edge e, range_query *query = NULL); This is what the fast_vrp routines uses. We can gather all range restrictions generated from an edge efficiently just once and then intersect them with a known range as we walk the different paths. We don't need the gori exports , nor any of the other on-demand bits where we calculate each export range dynamically.. I suspect it would reduce the workload and memory impact quite a bit, but I'm not really familiar with exactly how the threader uses those things. It'd require some minor tweaking to the lazy_ssa_cache to make the bitmap of names set accessible. This would provide similar functionality to what the gori export () routine provides. Both relations and inferred ranges should only need to be calculated once per block as well and could/should/would be applied the same way if they are present. I don't *think* the threader uses any of the def chains, but Aldy can chip in. Andrew
On Mon, Jun 24, 2024 at 7:20 PM Andrew MacLeod <amacleod@redhat.com> wrote: > > > On 6/22/24 09:15, Richard Biener wrote: > > On Fri, Jun 21, 2024 at 3:02 PM Andrew MacLeod <amacleod@redhat.com> wrote: > >> This patch adds > >> > >> --param=vrp-block-limit=N > >> > >> When the basic block counter for a function exceeded 'N' , VRP is > >> invoked with the new fast_vrp algorithm instead. This algorithm uses a > >> lot less memory and processing power, although it does get a few less > >> things. > >> > >> Primary motivation is cases like > >> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114855 in which the 3 VRP > >> passes consume about 600 seconds of the compile time, and a lot of > >> memory. With fast_vrp, it spends less than 10 seconds total in the > >> 3 passes of VRP. This test case has about 400,000 basic blocks. > >> > >> The default for N in this patch is 150,000, arbitrarily chosen. > >> > >> This bootstraps, (and I bootstrapped it with --param=vrp-block-limit=0 > >> as well) on x86_64-pc-linux-gnu, with no regressions. > >> > >> What do you think, OK for trunk? > > + if (last_basic_block_for_fn (fun) > param_vrp_block_limit || > > + &data == &pass_data_fast_vrp) > > > > || goes to the next line. > > > > Btw, we have -Wdisabled-optimization for these cases which should > > say sth like "function has excess of %d number of basic blocks > > (--param vrp-block-limit=%d), using fast VRP algorithm" > > or so in this case. > > > > As I wrote in the PR the priority should be -O1 compile-time > > performance and memory use. > > > Yeah, I just wanted to use it as a model for "bad" cases for ranger. > Adjusted patch attached which now issues the warning. > > I also found that the transitive relations were causing a small blowup > in time for relation processing now that I turned relations on for fast > VRP. I commited a patch and fast_vrp no longer does transitives. > > If you want to experiment with enabling fast VRP at -O1, it should be > fast all the time now. I think :-) This testcase runs in about 95 > seconds on my test machine. if I turn on VRP, a single VRP pass takes > about 2.5 seconds. Its all set up, you can just add: > > NEXT_PASS (pass_fast_vrp); > > at an appropriate spot. > > > Richard. > > > >> Andrew > >> > >> PS sorry,. it doesn't help the threader in that PR :-( > > It's probably one of the known bottlenecks there - IIRC the path range > > queries make O(n^2) work. I can look at this at some point as I've > > dealt with the slowness of this pass in the past. > > > > There is param_max_jump_thread_paths that should limit searching > > but there is IIRC no way to limit the work path ranger actually does > > when resolving the query. > > > Yeah, Id like to talk to Aldy about revamping the threader now that some > of the newer facilities are available that fast_vrp uses. > > We can calculate all the outgoing ranges for a block at once with : > > // Fill ssa-cache R with any outgoing ranges on edge E, using QUERY. > bool gori_on_edge (class ssa_cache &r, edge e, range_query *query = > NULL); > > This is what the fast_vrp routines uses. We can gather all range > restrictions generated from an edge efficiently just once and then > intersect them with a known range as we walk the different paths. We > don't need the gori exports , nor any of the other on-demand bits where > we calculate each export range dynamically.. I suspect it would reduce > the workload and memory impact quite a bit, but I'm not really familiar > with exactly how the threader uses those things. > > It'd require some minor tweaking to the lazy_ssa_cache to make the > bitmap of names set accessible. This would provide similar > functionality to what the gori export () routine provides. Both > relations and inferred ranges should only need to be calculated once per > block as well and could/should/would be applied the same way if they are > present. I don't *think* the threader uses any of the def chains, but > Aldy can chip in. + warning (OPT_Wdisabled_optimization, + "Using fast VRP algorithm. %d basic blocks" + " exceeds %s%d limit", + n_basic_blocks_for_fn (fun), + "--param=vrp-block-limit=", + param_vrp_block_limit); This should be: warning (OPT_Wdisabled_optimization, "Using fast VRP algorithm. %d basic blocks" " exceeds %<%--param=vrp-block-limit=d%> limit", n_basic_blocks_for_fn (fun), param_vrp_block_limit); I had thought it was mentioned that options should be quoted but it is not mentioned in the coding conventions: https://gcc.gnu.org/codingconventions.html#Diagnostics But it is mentioned in https://inbox.sourceware.org/gcc/2d2bd844-2de4-ecff-7a07-b2235075074c@gmail.com/ ; This is why you were getting an error as you mentioned on IRC. > > Andrew
On Mon, Jun 24, 2024 at 7:35 PM Andrew Pinski <pinskia@gmail.com> wrote: > > On Mon, Jun 24, 2024 at 7:20 PM Andrew MacLeod <amacleod@redhat.com> wrote: > > > > > > On 6/22/24 09:15, Richard Biener wrote: > > > On Fri, Jun 21, 2024 at 3:02 PM Andrew MacLeod <amacleod@redhat.com> wrote: > > >> This patch adds > > >> > > >> --param=vrp-block-limit=N > > >> > > >> When the basic block counter for a function exceeded 'N' , VRP is > > >> invoked with the new fast_vrp algorithm instead. This algorithm uses a > > >> lot less memory and processing power, although it does get a few less > > >> things. > > >> > > >> Primary motivation is cases like > > >> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114855 in which the 3 VRP > > >> passes consume about 600 seconds of the compile time, and a lot of > > >> memory. With fast_vrp, it spends less than 10 seconds total in the > > >> 3 passes of VRP. This test case has about 400,000 basic blocks. > > >> > > >> The default for N in this patch is 150,000, arbitrarily chosen. > > >> > > >> This bootstraps, (and I bootstrapped it with --param=vrp-block-limit=0 > > >> as well) on x86_64-pc-linux-gnu, with no regressions. > > >> > > >> What do you think, OK for trunk? > > > + if (last_basic_block_for_fn (fun) > param_vrp_block_limit || > > > + &data == &pass_data_fast_vrp) > > > > > > || goes to the next line. > > > > > > Btw, we have -Wdisabled-optimization for these cases which should > > > say sth like "function has excess of %d number of basic blocks > > > (--param vrp-block-limit=%d), using fast VRP algorithm" > > > or so in this case. > > > > > > As I wrote in the PR the priority should be -O1 compile-time > > > performance and memory use. > > > > > > Yeah, I just wanted to use it as a model for "bad" cases for ranger. > > Adjusted patch attached which now issues the warning. > > > > I also found that the transitive relations were causing a small blowup > > in time for relation processing now that I turned relations on for fast > > VRP. I commited a patch and fast_vrp no longer does transitives. > > > > If you want to experiment with enabling fast VRP at -O1, it should be > > fast all the time now. I think :-) This testcase runs in about 95 > > seconds on my test machine. if I turn on VRP, a single VRP pass takes > > about 2.5 seconds. Its all set up, you can just add: > > > > NEXT_PASS (pass_fast_vrp); > > > > at an appropriate spot. > > > > > Richard. > > > > > >> Andrew > > >> > > >> PS sorry,. it doesn't help the threader in that PR :-( > > > It's probably one of the known bottlenecks there - IIRC the path range > > > queries make O(n^2) work. I can look at this at some point as I've > > > dealt with the slowness of this pass in the past. > > > > > > There is param_max_jump_thread_paths that should limit searching > > > but there is IIRC no way to limit the work path ranger actually does > > > when resolving the query. > > > > > Yeah, Id like to talk to Aldy about revamping the threader now that some > > of the newer facilities are available that fast_vrp uses. > > > > We can calculate all the outgoing ranges for a block at once with : > > > > // Fill ssa-cache R with any outgoing ranges on edge E, using QUERY. > > bool gori_on_edge (class ssa_cache &r, edge e, range_query *query = > > NULL); > > > > This is what the fast_vrp routines uses. We can gather all range > > restrictions generated from an edge efficiently just once and then > > intersect them with a known range as we walk the different paths. We > > don't need the gori exports , nor any of the other on-demand bits where > > we calculate each export range dynamically.. I suspect it would reduce > > the workload and memory impact quite a bit, but I'm not really familiar > > with exactly how the threader uses those things. > > > > It'd require some minor tweaking to the lazy_ssa_cache to make the > > bitmap of names set accessible. This would provide similar > > functionality to what the gori export () routine provides. Both > > relations and inferred ranges should only need to be calculated once per > > block as well and could/should/would be applied the same way if they are > > present. I don't *think* the threader uses any of the def chains, but > > Aldy can chip in. > > + warning (OPT_Wdisabled_optimization, > + "Using fast VRP algorithm. %d basic blocks" > + " exceeds %s%d limit", > + n_basic_blocks_for_fn (fun), > + "--param=vrp-block-limit=", > + param_vrp_block_limit); > > This should be: > warning (OPT_Wdisabled_optimization, "Using fast VRP algorithm. %d basic blocks" > " exceeds %<%--param=vrp-block-limit=d%> limit", > n_basic_blocks_for_fn (fun), param_vrp_block_limit); > > I had thought it was mentioned that options should be quoted but it is > not mentioned in the coding conventions: > https://gcc.gnu.org/codingconventions.html#Diagnostics > > But it is mentioned in > https://inbox.sourceware.org/gcc/2d2bd844-2de4-ecff-7a07-b2235075074c@gmail.com/ > ; This is why you were getting an error as you mentioned on IRC. Note I filed https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115627 to add it to the coding conventions. I might take a stab at adding the rules mentioned in https://inbox.sourceware.org/gcc-patches/8ac62fe2-e4bf-0922-4947-fca9567a0561@gmail.com/ since those are the ones which are detected right now. Thanks, Andrew > > > > > > Andrew
On 6/24/24 22:35, Andrew Pinski wrote: > On Mon, Jun 24, 2024 at 7:20 PM Andrew MacLeod <amacleod@redhat.com> wrote: >> // Fill ssa-cache R with any outgoing ranges on edge E, using QUERY. >> bool gori_on_edge (class ssa_cache &r, edge e, range_query *query = >> NULL); >> >> This is what the fast_vrp routines uses. We can gather all range >> restrictions generated from an edge efficiently just once and then >> intersect them with a known range as we walk the different paths. We >> don't need the gori exports , nor any of the other on-demand bits where >> we calculate each export range dynamically.. I suspect it would reduce >> the workload and memory impact quite a bit, but I'm not really familiar >> with exactly how the threader uses those things. >> >> It'd require some minor tweaking to the lazy_ssa_cache to make the >> bitmap of names set accessible. This would provide similar >> functionality to what the gori export () routine provides. Both >> relations and inferred ranges should only need to be calculated once per >> block as well and could/should/would be applied the same way if they are >> present. I don't *think* the threader uses any of the def chains, but >> Aldy can chip in. > + warning (OPT_Wdisabled_optimization, > + "Using fast VRP algorithm. %d basic blocks" > + " exceeds %s%d limit", > + n_basic_blocks_for_fn (fun), > + "--param=vrp-block-limit=", > + param_vrp_block_limit); > > This should be: > warning (OPT_Wdisabled_optimization, "Using fast VRP algorithm. %d basic blocks" > " exceeds %<%--param=vrp-block-limit=d%> limit", > n_basic_blocks_for_fn (fun), param_vrp_block_limit); > > I had thought it was mentioned that options should be quoted but it is > not mentioned in the coding conventions: > https://gcc.gnu.org/codingconventions.html#Diagnostics > > But it is mentioned in > https://inbox.sourceware.org/gcc/2d2bd844-2de4-ecff-7a07-b2235075074c@gmail.com/ > ; This is why you were getting an error as you mentioned on IRC. > > I didnt do that because when I did, everything in bracketed by the %< %> in the warning came out in bold text. is that the intended effect? Andrew.
On Mon, 2024-06-24 at 21:27 -0700, Andrew Pinski wrote: > On Mon, Jun 24, 2024 at 7:35 PM Andrew Pinski <pinskia@gmail.com> > wrote: > > > > On Mon, Jun 24, 2024 at 7:20 PM Andrew MacLeod > > <amacleod@redhat.com> wrote: > > > > > > > > > On 6/22/24 09:15, Richard Biener wrote: > > > > On Fri, Jun 21, 2024 at 3:02 PM Andrew MacLeod > > > > <amacleod@redhat.com> wrote: > > > > > This patch adds > > > > > > > > > > --param=vrp-block-limit=N > > > > > > > > > > When the basic block counter for a function exceeded 'N' , > > > > > VRP is > > > > > invoked with the new fast_vrp algorithm instead. This > > > > > algorithm uses a > > > > > lot less memory and processing power, although it does get a > > > > > few less > > > > > things. > > > > > > > > > > Primary motivation is cases like > > > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114855 in which > > > > > the 3 VRP > > > > > passes consume about 600 seconds of the compile time, and a > > > > > lot of > > > > > memory. With fast_vrp, it spends less than 10 seconds > > > > > total in the > > > > > 3 passes of VRP. This test case has about 400,000 basic > > > > > blocks. > > > > > > > > > > The default for N in this patch is 150,000, arbitrarily > > > > > chosen. > > > > > > > > > > This bootstraps, (and I bootstrapped it with --param=vrp- > > > > > block-limit=0 > > > > > as well) on x86_64-pc-linux-gnu, with no regressions. > > > > > > > > > > What do you think, OK for trunk? > > > > + if (last_basic_block_for_fn (fun) > > > > > param_vrp_block_limit || > > > > + &data == &pass_data_fast_vrp) > > > > > > > > > > goes to the next line. > > > > > > > > Btw, we have -Wdisabled-optimization for these cases which > > > > should > > > > say sth like "function has excess of %d number of basic blocks > > > > (--param vrp-block-limit=%d), using fast VRP algorithm" > > > > or so in this case. > > > > > > > > As I wrote in the PR the priority should be -O1 compile-time > > > > performance and memory use. > > > > > > > > > Yeah, I just wanted to use it as a model for "bad" cases for > > > ranger. > > > Adjusted patch attached which now issues the warning. > > > > > > I also found that the transitive relations were causing a small > > > blowup > > > in time for relation processing now that I turned relations on > > > for fast > > > VRP. I commited a patch and fast_vrp no longer does transitives. > > > > > > If you want to experiment with enabling fast VRP at -O1, it > > > should be > > > fast all the time now. I think :-) This testcase runs in > > > about 95 > > > seconds on my test machine. if I turn on VRP, a single VRP pass > > > takes > > > about 2.5 seconds. Its all set up, you can just add: > > > > > > NEXT_PASS (pass_fast_vrp); > > > > > > at an appropriate spot. > > > > > > > Richard. > > > > > > > > > Andrew > > > > > > > > > > PS sorry,. it doesn't help the threader in that PR :-( > > > > It's probably one of the known bottlenecks there - IIRC the > > > > path range > > > > queries make O(n^2) work. I can look at this at some point as > > > > I've > > > > dealt with the slowness of this pass in the past. > > > > > > > > There is param_max_jump_thread_paths that should limit > > > > searching > > > > but there is IIRC no way to limit the work path ranger actually > > > > does > > > > when resolving the query. > > > > > > > Yeah, Id like to talk to Aldy about revamping the threader now > > > that some > > > of the newer facilities are available that fast_vrp uses. > > > > > > We can calculate all the outgoing ranges for a block at once with > > > : > > > > > > // Fill ssa-cache R with any outgoing ranges on edge E, using > > > QUERY. > > > bool gori_on_edge (class ssa_cache &r, edge e, range_query > > > *query = > > > NULL); > > > > > > This is what the fast_vrp routines uses. We can gather all range > > > restrictions generated from an edge efficiently just once and > > > then > > > intersect them with a known range as we walk the different paths. > > > We > > > don't need the gori exports , nor any of the other on-demand bits > > > where > > > we calculate each export range dynamically.. I suspect it would > > > reduce > > > the workload and memory impact quite a bit, but I'm not really > > > familiar > > > with exactly how the threader uses those things. > > > > > > It'd require some minor tweaking to the lazy_ssa_cache to make > > > the > > > bitmap of names set accessible. This would provide similar > > > functionality to what the gori export () routine provides. Both > > > relations and inferred ranges should only need to be calculated > > > once per > > > block as well and could/should/would be applied the same way if > > > they are > > > present. I don't *think* the threader uses any of the def > > > chains, but > > > Aldy can chip in. > > > > + warning (OPT_Wdisabled_optimization, > > + "Using fast VRP algorithm. %d basic blocks" > > + " exceeds %s%d limit", > > + n_basic_blocks_for_fn (fun), > > + "--param=vrp-block-limit=", > > + param_vrp_block_limit); > > > > This should be: > > warning (OPT_Wdisabled_optimization, "Using fast VRP algorithm. %d > > basic blocks" > > " exceeds %<%--param=vrp-block-limit=d%> limit", > > n_basic_blocks_for_fn (fun), param_vrp_block_limit); > > > > I had thought it was mentioned that options should be quoted but it > > is > > not mentioned in the coding conventions: > > https://gcc.gnu.org/codingconventions.html#Diagnostics > > > > But it is mentioned in > > https://inbox.sourceware.org/gcc/2d2bd844-2de4-ecff-7a07-b2235075074c@gmail.com/ > > ; This is why you were getting an error as you mentioned on IRC. > > Note I filed https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115627 to > add it to the coding conventions. I might take a stab at adding the > rules mentioned in > https://inbox.sourceware.org/gcc-patches/8ac62fe2-e4bf-0922-4947-fca9567a0561@gmail.com/ > since those are the ones which are detected right now. FWIW there's some overlap between: https://gcc.gnu.org/codingconventions.html#Diagnostics and: https://gcc.gnu.org/onlinedocs/gccint/Guidelines-for-Diagnostics.html#Quoting Command-line options ought to be quoted in diagnostic messages. In particular as of GCC 14 [1] such quoted strings will automatically get a URL to the documentation for the option in a sufficiently capable terminal. I'm hoping to add something similar for quoted *attributes* in GCC 15; see [2]. FWIW I'm also looking at something that could support better HTML output (e.g. div classes) for such quoted output (e.g. via markdown via SARIF output), but that's more experimental. Dave [1] e.g. r14-6920-g9e49746da303b8 [2] https://gcc.gnu.org/pipermail/gcc-patches/2024-June/655541.html
From 3bb9bd3ca8038676e45b0bddcda91cbed7e51662 Mon Sep 17 00:00:00 2001 From: Andrew MacLeod <amacleod@redhat.com> Date: Mon, 17 Jun 2024 11:38:46 -0400 Subject: [PATCH 4/5] Add param for bb limit to invoke fast_vrp. If the basic block count is too high, simply use fast_vrp for all VRP passes. gcc/doc/ * invoke.texi (vrp-block-limit): Document. gcc/ * params.opt (-param=vrp-block-limit): New. * tree-vrp.cc (fvrp_folder::execute): Invoke fast_vrp if block count exceeds limit. --- gcc/doc/invoke.texi | 3 +++ gcc/params.opt | 4 ++++ gcc/tree-vrp.cc | 4 ++-- 3 files changed, 9 insertions(+), 2 deletions(-) diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index 5d7a87fde86..f2f8f6334dc 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -16840,6 +16840,9 @@ this parameter. The default value of this parameter is 50. @item vect-induction-float Enable loop vectorization of floating point inductions. +@item vrp-block-limit +Maximum number of basic blocks before VRP switches to a lower memory algorithm. + @item vrp-sparse-threshold Maximum number of basic blocks before VRP uses a sparse bitmap cache. diff --git a/gcc/params.opt b/gcc/params.opt index d34ef545bf0..c17ba17b91b 100644 --- a/gcc/params.opt +++ b/gcc/params.opt @@ -1198,6 +1198,10 @@ The maximum factor which the loop vectorizer applies to the cost of statements i Common Joined UInteger Var(param_vect_induction_float) Init(1) IntegerRange(0, 1) Param Optimization Enable loop vectorization of floating point inductions. +-param=vrp-block-limit= +Common Joined UInteger Var(param_vrp_block_limit) Init(150000) Optimization Param +Maximum number of basic blocks before VRP switches to a fast model with less memory requirements. + -param=vrp-sparse-threshold= Common Joined UInteger Var(param_vrp_sparse_threshold) Init(3000) Optimization Param Maximum number of basic blocks before VRP uses a sparse bitmap cache. diff --git a/gcc/tree-vrp.cc b/gcc/tree-vrp.cc index 4fc33e63e7d..eef02146ec6 100644 --- a/gcc/tree-vrp.cc +++ b/gcc/tree-vrp.cc @@ -1330,9 +1330,9 @@ public: unsigned int execute (function *fun) final override { // Check for fast vrp. - if (&data == &pass_data_fast_vrp) + if (last_basic_block_for_fn (fun) > param_vrp_block_limit || + &data == &pass_data_fast_vrp) return execute_fast_vrp (fun, final_p); - return execute_ranger_vrp (fun, final_p); } -- 2.45.0