Message ID | 011c01d86a16$baaf6030$300e2090$@nextmovesoftware.com |
---|---|
State | New |
Headers | show |
Series | Simplify logic in tree-scalar-evolution's expensive_expression_p. | expand |
On Tue, May 17, 2022 at 7:51 PM Roger Sayle <roger@nextmovesoftware.com> wrote: > > > This patch simplifies tree-scalar-evolution's expensive_expression_p, but > produces identical results; the replacement implementation is just smaller > (uses less memory), faster and easier to understand. > > The current idiom (introduced to fix PR90726) looks like: > > hash_map<tree, uint64_t> cache; > uint64_t expanded_size = 0; > return (expression_expensive_p (expr, cache, expanded_size) > || expanded_size > cache.elements ()); > > Here the recursive function computes expanded_size, effectively the > number of tree nodes visited, which is then only used in the comparison > against cache.elements(), i.e. to check whether the number of visited > nodes is greater than the number of unique visited nodes. This is > equivalent to instead checking where expression_expensive_p's recursion > visits any node more than once. > > Instead of using a map to cache the "cost" of revisited sub-trees, the > same outcome can be determined using a set, and immediately returning > true as soon as encountering a previously seen tree node, avoiding the > unnecessary "cost"/expanded_size computation. [A simplification analogous > to checking STL's empty() instead of comparing size() with zero]. > > The semantics of expensive_expression_p (both before and after) are > quite reasonable, as calling unshare_expr on a generic tree can result > in an exponential growth in the number of gimple statements, hence > any "shared" nodes are indeed expensive. If shared nodes are to be > allowed, they'll need to be managed explicitly with SAVE_EXPR (or similar > mechanism) that avoids exponential growth. Indeed. The code seems to allow for doing "better" than counting the cost of a sub-expression either as one or giving up on the whole expression though while the cleanup removes this possibility. In fact the predicate currently allows arbitrary expensive (read: large) expressions and the result of expression_expensive_p (x) and expression_expensive_p (unshare_expr (x)) isn't equal which IMHO is a bug. > This patch has been tested on x86_64-pc-linux-gnu with make bootstrap > and make -k check, both with and without --target_board=unix{-m32}, with > no new failures. Is this a reasonable clean-up for mainline? So I think the change goes in the wrong direction, even if it preserves the current semantics. The bug is probably in the callers allowing arbitrarily large expressions so maybe the API can be changed to provide a max_size argument. I know I placed the extra expression expansion limit ontop of the previous implementation which just looked for an expensive operation. Also note that technically unshare_expr isn't necessary if gimplification would cope with shared trees - possibly a variant which, instead of unsharing in-expression uses, would insert SAVE_EXPRs, would do the trick here. Richard. > > 2022-05-17 Roger Sayle <roger@nextmovesoftware.com> > > gcc/ChangeLog > * tree_scalar_evolution.cc (expression_expensive_p): Change type > of cache from hash_map to hash_set, and remove cost argument. > When expr appears in the hash_set, return true. Calculation of > cost (and updating hash_map) is no longer required. > (expression_expensive_p): Simplify top-level implementation. > > > Thanks in advance, > Roger > -- >
diff --git a/gcc/tree-scalar-evolution.cc b/gcc/tree-scalar-evolution.cc index 72ceb40..347dede 100644 --- a/gcc/tree-scalar-evolution.cc +++ b/gcc/tree-scalar-evolution.cc @@ -3352,8 +3352,7 @@ scev_finalize (void) for scev_const_prop. */ static bool -expression_expensive_p (tree expr, hash_map<tree, uint64_t> &cache, - uint64_t &cost) +expression_expensive_p (tree expr, hash_set<tree> &cache) { enum tree_code code; @@ -3377,19 +3376,11 @@ expression_expensive_p (tree expr, hash_map<tree, uint64_t> &cache, return true; } - bool visited_p; - uint64_t &local_cost = cache.get_or_insert (expr, &visited_p); - if (visited_p) - { - uint64_t tem = cost + local_cost; - if (tem < cost) - return true; - cost = tem; - return false; - } - local_cost = 1; + /* If we've encountered this expression before, it would be duplicated + by unshare_expr, which makes this expression expensive. */ + if (cache.add (expr)) + return true; - uint64_t op_cost = 0; if (code == CALL_EXPR) { tree arg; @@ -3431,16 +3422,14 @@ expression_expensive_p (tree expr, hash_map<tree, uint64_t> &cache, } FOR_EACH_CALL_EXPR_ARG (arg, iter, expr) - if (expression_expensive_p (arg, cache, op_cost)) + if (expression_expensive_p (arg, cache)) return true; - *cache.get (expr) += op_cost; - cost += op_cost + 1; return false; } if (code == COND_EXPR) { - if (expression_expensive_p (TREE_OPERAND (expr, 0), cache, op_cost) + if (expression_expensive_p (TREE_OPERAND (expr, 0), cache) || (EXPR_P (TREE_OPERAND (expr, 1)) && EXPR_P (TREE_OPERAND (expr, 2))) /* If either branch has side effects or could trap. */ @@ -3448,13 +3437,9 @@ expression_expensive_p (tree expr, hash_map<tree, uint64_t> &cache, || generic_expr_could_trap_p (TREE_OPERAND (expr, 1)) || TREE_SIDE_EFFECTS (TREE_OPERAND (expr, 0)) || generic_expr_could_trap_p (TREE_OPERAND (expr, 0)) - || expression_expensive_p (TREE_OPERAND (expr, 1), - cache, op_cost) - || expression_expensive_p (TREE_OPERAND (expr, 2), - cache, op_cost)) + || expression_expensive_p (TREE_OPERAND (expr, 1), cache) + || expression_expensive_p (TREE_OPERAND (expr, 2), cache)) return true; - *cache.get (expr) += op_cost; - cost += op_cost + 1; return false; } @@ -3462,15 +3447,13 @@ expression_expensive_p (tree expr, hash_map<tree, uint64_t> &cache, { case tcc_binary: case tcc_comparison: - if (expression_expensive_p (TREE_OPERAND (expr, 1), cache, op_cost)) + if (expression_expensive_p (TREE_OPERAND (expr, 1), cache)) return true; /* Fallthru. */ case tcc_unary: - if (expression_expensive_p (TREE_OPERAND (expr, 0), cache, op_cost)) + if (expression_expensive_p (TREE_OPERAND (expr, 0), cache)) return true; - *cache.get (expr) += op_cost; - cost += op_cost + 1; return false; default: @@ -3481,10 +3464,8 @@ expression_expensive_p (tree expr, hash_map<tree, uint64_t> &cache, bool expression_expensive_p (tree expr) { - hash_map<tree, uint64_t> cache; - uint64_t expanded_size = 0; - return (expression_expensive_p (expr, cache, expanded_size) - || expanded_size > cache.elements ()); + hash_set<tree> cache; + return expression_expensive_p (expr, cache); } /* Do final value replacement for LOOP, return true if we did anything. */