Message ID | 56D3C8D9.5020508@linaro.org |
---|---|
State | New |
Headers | show |
On 29 February 2016 at 05:28, kugan <kugan.vivekanandarajah@linaro.org> wrote: > >> That looks better, but I think the unordered_remove will break operand >> sorting >> and thus you probably don't handle x + x + x + x + y + y + y + y + y + >> y + z + z + z + z >> optimally. >> >> I'd say you simply want to avoid the recursion and collect a vector of >> [start, end] pairs >> before doing any modification to the ops vector. > > > Hi Richard, > > Is the attached patch looks better? > Minor comment, I've noticed typos in your updated comment: "There should be two multiplication left in test1 (inculding one generated" should be "There should be two multiplications left in test1 (including one generated" > Thanks, > Kugan
On Wed, Mar 2, 2016 at 3:28 PM, Christophe Lyon <christophe.lyon@linaro.org> wrote: > On 29 February 2016 at 05:28, kugan <kugan.vivekanandarajah@linaro.org> wrote: >> >>> That looks better, but I think the unordered_remove will break operand >>> sorting >>> and thus you probably don't handle x + x + x + x + y + y + y + y + y + >>> y + z + z + z + z >>> optimally. >>> >>> I'd say you simply want to avoid the recursion and collect a vector of >>> [start, end] pairs >>> before doing any modification to the ops vector. >> >> >> Hi Richard, >> >> Is the attached patch looks better? >> > > Minor comment, I've noticed typos in your updated comment: > "There should be two multiplication left in test1 (inculding one generated" > should be > "There should be two multiplications left in test1 (including one generated" +/* Transoform repeated addition of same values into multiply with + constant. */ Transform +static void +transform_add_to_multiply (gimple_stmt_iterator *gsi, gimple *stmt, vec<operand_entry *> *ops) split the long line op_list looks redundant - ops[start]->op gives you the desired value already and if you use a vec<std::pair<int, int>> you can have a more C++ish start,end pair. + tree tmp = make_temp_ssa_name (TREE_TYPE (op), NULL, "reassocmul"); + gassign *mul_stmt = gimple_build_assign (tmp, MULT_EXPR, + op, build_int_cst (TREE_TYPE(op), count)); this won't work for floating point or complex numbers - you need to use sth like fold_convert (TREE_TYPE (op), build_int_cst (integer_type_node, count)); For FP types you need to guard the transform with flag_unsafe_math_optimizations + gimple_set_location (mul_stmt, gimple_location (stmt)); + gimple_set_uid (mul_stmt, gimple_uid (stmt)); + gsi_insert_before (gsi, mul_stmt, GSI_SAME_STMT); I think you do not want to set the stmt uid and you want to insert the stmt right after the def of op (or at the original first add - though you can't get your hands at that easily). You also don't want to set the location to the last stmt of the whole add sequence - simply leave it unset. + oe = operand_entry_pool.allocate (); + oe->op = tmp; + oe->rank = get_rank (op) * count; ? Why that? oe->rank should be get_rank (tmp). + oe->id = 0; other places use next_operand_entry_id++. I think you want to simply use add_to_ops_vec (oe, tmp); here for all of the above. Please return whether you did any optimization and do the qsort of the operand vector only if you did sth. Testcase with FP math missing. Likewise with complex or vector math. Thanks, Richard. >> Thanks, >> Kugan
On Tue, Apr 19, 2016 at 1:56 PM, Richard Biener <richard.guenther@gmail.com> wrote: > On Wed, Mar 2, 2016 at 3:28 PM, Christophe Lyon > <christophe.lyon@linaro.org> wrote: >> On 29 February 2016 at 05:28, kugan <kugan.vivekanandarajah@linaro.org> wrote: >>> >>>> That looks better, but I think the unordered_remove will break operand >>>> sorting >>>> and thus you probably don't handle x + x + x + x + y + y + y + y + y + >>>> y + z + z + z + z >>>> optimally. >>>> >>>> I'd say you simply want to avoid the recursion and collect a vector of >>>> [start, end] pairs >>>> before doing any modification to the ops vector. >>> >>> >>> Hi Richard, >>> >>> Is the attached patch looks better? >>> >> >> Minor comment, I've noticed typos in your updated comment: >> "There should be two multiplication left in test1 (inculding one generated" >> should be >> "There should be two multiplications left in test1 (including one generated" > > +/* Transoform repeated addition of same values into multiply with > + constant. */ > > Transform > > +static void > +transform_add_to_multiply (gimple_stmt_iterator *gsi, gimple *stmt, > vec<operand_entry *> *ops) > > split the long line > > op_list looks redundant - ops[start]->op gives you the desired value > already and if you > use a vec<std::pair<int, int>> you can have a more C++ish start,end pair. > > + tree tmp = make_temp_ssa_name (TREE_TYPE (op), NULL, "reassocmul"); > + gassign *mul_stmt = gimple_build_assign (tmp, MULT_EXPR, > + op, build_int_cst > (TREE_TYPE(op), count)); > > this won't work for floating point or complex numbers - you need to use sth like > fold_convert (TREE_TYPE (op), build_int_cst (integer_type_node, count)); > > For FP types you need to guard the transform with flag_unsafe_math_optimizations > > + gimple_set_location (mul_stmt, gimple_location (stmt)); > + gimple_set_uid (mul_stmt, gimple_uid (stmt)); > + gsi_insert_before (gsi, mul_stmt, GSI_SAME_STMT); > > I think you do not want to set the stmt uid and you want to insert the > stmt right > after the def of op (or at the original first add - though you can't > get your hands at > that easily). You also don't want to set the location to the last stmt of the > whole add sequence - simply leave it unset. > > + oe = operand_entry_pool.allocate (); > + oe->op = tmp; > + oe->rank = get_rank (op) * count; > > ? Why that? oe->rank should be get_rank (tmp). > > + oe->id = 0; > > other places use next_operand_entry_id++. I think you want to simply > use add_to_ops_vec (oe, tmp); here for all of the above. > > Please return whether you did any optimization and do the > qsort of the operand vector only if you did sth. > > Testcase with FP math missing. Likewise with complex or vector math. Btw, does it handle associating x + 3 * x + x to 5 * x ? Richard. > Thanks, > Richard. > >>> Thanks, >>> Kugan
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr63586.c b/gcc/testsuite/gcc.dg/tree-ssa/pr63586.c index e69de29..a002bdd 100644 --- a/gcc/testsuite/gcc.dg/tree-ssa/pr63586.c +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr63586.c @@ -0,0 +1,59 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-reassoc1" } */ + +unsigned f1 (unsigned x) +{ + unsigned y = x + x; + y = y + x; + y = y + x; + y = y + x; + y = y + x; + y = y + x; + y = y + x; + return y; +} + +unsigned f2 (unsigned x, unsigned z) +{ + unsigned y = x + x; + y = y + x; + y = y + x; + y = y + z; + y = y + z; + y = y + z; + y = y + z; + return y; +} + +unsigned f3 (unsigned x, unsigned z, unsigned k) +{ + unsigned y = x + x; + y = y + x; + y = y + x; + y = y + z; + y = y + z; + y = y + z; + y = y + k; + return y; +} + +unsigned f4 (unsigned x, unsigned z, unsigned k) +{ + unsigned y = k + x; + y = y + x; + y = y + x; + y = y + z; + y = y + z; + y = y + z; + y = y + x; + return y; +} + +unsigned f5 (unsigned x, unsigned y, unsigned z) +{ + return x + x + x + x + y + y + y + y + y + + y + z + z + z + z; +} + + +/* { dg-final { scan-tree-dump-times "\\\*" 10 "reassoc1" } } */ diff --git a/gcc/testsuite/gcc.dg/tree-ssa/reassoc-14.c b/gcc/testsuite/gcc.dg/tree-ssa/reassoc-14.c index 62802d1..16ebc86 100644 --- a/gcc/testsuite/gcc.dg/tree-ssa/reassoc-14.c +++ b/gcc/testsuite/gcc.dg/tree-ssa/reassoc-14.c @@ -19,6 +19,7 @@ unsigned int test2 (unsigned int x, unsigned int y, unsigned int z, return tmp1 + tmp2 + tmp3; } -/* There should be one multiplication left in test1 and three in test2. */ +/* There should be two multiplication left in test1 (inculding one generated + when converting addition to multiplication) and three in test2. */ -/* { dg-final { scan-tree-dump-times "\\\*" 4 "reassoc1" } } */ +/* { dg-final { scan-tree-dump-times "\\\*" 5 "reassoc1" } } */ diff --git a/gcc/tree-ssa-reassoc.c b/gcc/tree-ssa-reassoc.c index 17eb64f..0a43faf 100644 --- a/gcc/tree-ssa-reassoc.c +++ b/gcc/tree-ssa-reassoc.c @@ -1698,6 +1698,79 @@ eliminate_redundant_comparison (enum tree_code opcode, return false; } +/* Transoform repeated addition of same values into multiply with + constant. */ +static void +transform_add_to_multiply (gimple_stmt_iterator *gsi, gimple *stmt, vec<operand_entry *> *ops) +{ + operand_entry *oe; + tree op = NULL_TREE; + int j; + int i, start = -1, end = 0, count = 0; + vec<int> start_inds = vNULL; + vec<int> end_inds = vNULL; + vec<tree> op_list = vNULL; + + /* Look for repeated operands. */ + FOR_EACH_VEC_ELT (*ops, i, oe) + { + if (start == -1) + { + count = 1; + op = oe->op; + start = i; + } + else if (operand_equal_p (oe->op, op, 0)) + { + count++; + end = i; + } + else + { + if (count > 1) + { + start_inds.safe_push (start); + end_inds.safe_push (end); + op_list.safe_push (op); + } + count = 1; + op = oe->op; + start = i; + } + } + + if (count > 1) + { + start_inds.safe_push (start); + end_inds.safe_push (end); + op_list.safe_push (op); + } + + for (j = start_inds.length () - 1; j >= 0; --j) + { + /* Convert repeated operand addition to multiplication. */ + start = start_inds[j]; + end = end_inds[j]; + op = op_list[j]; + count = end - start + 1; + for (i = end; i >= start; --i) + ops->unordered_remove (i); + tree tmp = make_temp_ssa_name (TREE_TYPE (op), NULL, "reassocmul"); + gassign *mul_stmt = gimple_build_assign (tmp, MULT_EXPR, + op, build_int_cst (TREE_TYPE(op), count)); + gimple_set_location (mul_stmt, gimple_location (stmt)); + gimple_set_uid (mul_stmt, gimple_uid (stmt)); + gsi_insert_before (gsi, mul_stmt, GSI_SAME_STMT); + oe = operand_entry_pool.allocate (); + oe->op = tmp; + oe->rank = get_rank (op) * count; + oe->id = 0; + oe->count = 1; + ops->safe_push (oe); + } +} + + /* Perform various identities and other optimizations on the list of operand entries, stored in OPS. The tree code for the binary operation between all the operands is OPCODE. */ @@ -4922,6 +4995,12 @@ reassociate_bb (basic_block bb) && flag_unsafe_math_optimizations) powi_result = attempt_builtin_powi (stmt, &ops); + if (rhs_code == PLUS_EXPR) + { + transform_add_to_multiply (&gsi, stmt, &ops); + ops.qsort (sort_by_operand_rank); + } + /* If the operand vector is now empty, all operands were consumed by the __builtin_powi optimization. */ if (ops.length () == 0)