[google,gcc-4_8] Tree Loop Unrolling - Relax code size increase with -O2

Message ID CAAs8HmxrTeR715cWCst6CNkzO+sCpJQw2n8w1tau6iB-aAvocQ@mail.gmail.com
State New

Commit Message

Sriraman Tallam Jan. 22, 2014, 12:46 a.m. UTC
On Tue, Jan 21, 2014 at 2:49 PM, Xinliang David Li <davidxl@google.com> wrote:
> I think it might be better to introduce a new parameter for the max peel
> insns at O2 (e.g., call it MAX_O2_COMPLETELY_PEEL_INSN or
> MAX_DEFAULT_...) and use the same logic in your patch to override the
> MAX_COMPLETELY_PEELED_INSN parameter at O2.
>
> That way, we don't need a hard-coded factor of 2.

Patch attached with that change.

Sri

>
> In the longer run, we really need better cost/benefit analysis, but
> that is independent.
>
> David
>
> On Tue, Jan 21, 2014 at 1:49 PM, Sriraman Tallam <tmsriram@google.com> wrote:
>> Hi,
>>
>>      Currently, the tree unrolling pass (cunroll) does not allow any code
>> size growth at -O2.  Code size growth is permitted only with -O3 or
>> -funroll-loops/-fpeel-loops. I have created a patch to allow
>> limited code size growth at -O2. With -funroll-loops the maximum
>> allowed code growth is 400 unrolled insns. I have set it to 200
>> unrolled insns at -O2.  This patch improves an image processing
>> benchmark by 20% and improves most benchmarks by 1-2%. The code size
>> increase is <1% for all the benchmarks except the image processing
>> benchmark, whose code size increases by 6%.
>>
>>      I am working on getting this patch reviewed for trunk. Here is
>> the discussion on this:
>> http://gcc.gnu.org/ml/gcc-patches/2013-11/msg02643.html  I have
>> incorporated the comments on making the patch simpler. I will
>> follow up on that patch to trunk by also getting data on limiting
>> complete peeling with O2.
>>
>> Is this ok for the google branch?
>>
>> Thanks
>> Sri
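
For illustration, the kind of loop this change targets might look like the sketch below. It is a hypothetical example, not taken from the patch or from the benchmarks mentioned above: the function name, the `TAPS` constant, and the filter shape are all assumptions. The trip count is a compile-time constant, so a complete unroller can replace the loop with straight-line code when the unrolled body fits within the instruction limit.

```c
#include <stddef.h>

/* Hypothetical example, not from the patch: a 9-tap weighted sum such as
   one output pixel of a 3x3 image filter.  The trip count is a
   compile-time constant, which makes the loop a candidate for complete
   unrolling if the unrolled size stays within the unroller's limit.  */
enum { TAPS = 9 };

int weighted_sum(const int *pixels, const int *weights)
{
  int acc = 0;
  for (size_t i = 0; i < TAPS; i++)
    acc += pixels[i] * weights[i];
  return acc;
}
```

Whether a given compiler build actually unrolls this loop depends on its cost model and parameter values; comparing the assembly produced at -O2 with and without the patch would be one way to check.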

Comments

Xinliang David Li Jan. 22, 2014, 12:51 a.m. UTC | #1
ok.

David


Patch

Index: params.def
===================================================================
--- params.def	(revision 206638)
+++ params.def	(working copy)
@@ -339,6 +339,11 @@  DEFPARAM(PARAM_MAX_COMPLETELY_PEELED_INSNS,
 	"max-completely-peeled-insns",
 	"The maximum number of insns of a completely peeled loop",
 	400, 0, 0)
+/* The maximum number of insns of a completely peeled loop at -O2.  */
+DEFPARAM(PARAM_MAX_DEFAULT_COMPLETELY_PEELED_INSNS,
+	"max-default-completely-peeled-insns",
+	"The maximum number of insns of a completely peeled loop at -O2",
+	200, 0, 0)
 /* The maximum number of peelings of a single loop that is peeled completely.  */
 DEFPARAM(PARAM_MAX_COMPLETELY_PEEL_TIMES,
 	"max-completely-peel-times",
Index: opts.c
===================================================================
--- opts.c	(revision 206638)
+++ opts.c	(working copy)
@@ -855,6 +855,18 @@  finish_options (struct gcc_options *opts, struct g
             0, opts->x_param_values, opts_set->x_param_values);
     }
 
+  /* At -O2, set PARAM_MAX_COMPLETELY_PEELED_INSNS to the -O2 default when
+     none of -funroll-loops, -fpeel-loops or -funroll-all-loops is set.  */
+  if (optimize == 2 && !opts->x_flag_unroll_loops && !opts->x_flag_peel_loops
+      && !opts->x_flag_unroll_all_loops)
+
+    {
+      maybe_set_param_value
+       (PARAM_MAX_COMPLETELY_PEELED_INSNS,
+        PARAM_VALUE (PARAM_MAX_DEFAULT_COMPLETELY_PEELED_INSNS),
+	opts->x_param_values, opts_set->x_param_values);
+    }
+
   /* Set PARAM_MAX_STORES_TO_SINK to 0 if either vectorization or if-conversion
      is disabled.  */
   if ((!opts->x_flag_tree_loop_vectorize && !opts->x_flag_tree_slp_vectorize)
Index: tree-ssa-loop.c
===================================================================
--- tree-ssa-loop.c	(revision 206638)
+++ tree-ssa-loop.c	(working copy)
@@ -467,7 +467,7 @@  tree_complete_unroll (void)
 
   return tree_unroll_loops_completely (flag_unroll_loops
 				       || flag_peel_loops
-				       || optimize >= 3, true);
+				       || optimize >= 2, true);
 }
 
 static bool
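
The combined effect of the opts.c and tree-ssa-loop.c hunks can be modelled outside of GCC with the small sketch below. This is a simplified stand-in under stated assumptions, not GCC code: `struct peel_options`, `effective_peel_limit` and `may_increase_size` are hypothetical names, and the `max_insns_set_by_user` field approximates the behaviour of `maybe_set_param_value`, which leaves a parameter alone when the user set it explicitly.

```c
#include <stdbool.h>

/* Simplified stand-in for the option state the patch consults.
   Field names are illustrative, not GCC's.  */
struct peel_options {
  int optimize;               /* -O level */
  bool unroll_loops;          /* -funroll-loops */
  bool peel_loops;            /* -fpeel-loops */
  bool unroll_all_loops;      /* -funroll-all-loops */
  bool max_insns_set_by_user; /* --param max-completely-peeled-insns given */
  int max_insns;              /* max-completely-peeled-insns */
  int max_default_insns;      /* max-default-completely-peeled-insns */
};

/* Models the opts.c hunk: at exactly -O2 with no unroll/peel flag, the
   -O2 limit replaces the general one unless the user set the latter.  */
int effective_peel_limit(const struct peel_options *o)
{
  if (o->optimize == 2 && !o->unroll_loops && !o->peel_loops
      && !o->unroll_all_loops && !o->max_insns_set_by_user)
    return o->max_default_insns;
  return o->max_insns;
}

/* Models the tree-ssa-loop.c hunk: growth is now allowed from -O2 up.  */
bool may_increase_size(const struct peel_options *o)
{
  return o->unroll_loops || o->peel_loops || o->optimize >= 2;
}
```

With the defaults from params.def (400 and 200), a plain -O2 build would get a limit of 200 in this model, while adding -funroll-loops or moving to -O3 keeps 400.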