From patchwork Mon Sep 7 19:32:44 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Martin Jambor
X-Patchwork-Id: 1359229
From: Martin Jambor
To: GCC Patches
Cc: Jan Hubicka
Subject: [PATCH 1/6] ipa: Bundle vectors describing argument values
Date: Mon, 07 Sep 2020 21:32:44 +0200

Hi,

this large patch is a mostly mechanical change which aims to replace
uses of separate vectors describing known scalar values (usually called
known_vals or known_csts), known aggregate values (known_aggs), known
virtual call contexts (known_contexts) and known value ranges
(known_value_ranges) with uses of either the new type
ipa_call_arg_values or ipa_auto_call_arg_values, both of which simply
contain these vectors.

The need for two distinct types comes from the fact that when the
vectors are constructed from jump functions or lattices, we really
should use auto_vecs with embedded storage allocated on the stack.  On
the other hand, the bundle in ipa_call_context can be allocated on the
heap when in cache, once for each call graph node.

ipa_call_context is constructible from ipa_auto_call_arg_values, but
then its vectors must not be resized, otherwise they will stop pointing
to the stack ones.  Unfortunately, I don't think the structure embedded
in ipa_call_context can be made constant because we need to manipulate
and deallocate it when in cache.

Last week I bootstrapped and tested (and LTO-bootstrapped) this patch
individually; this week I did so for the whole patch set.  OK for trunk?

Thanks,

Martin


gcc/ChangeLog:

2020-09-01  Martin Jambor

	* ipa-prop.h (ipa_auto_call_arg_values): New type.
	(class ipa_call_arg_values): Likewise.
	(ipa_get_indirect_edge_target): Replaced vector arguments with
	ipa_call_arg_values in declaration.  Added an overload for
	ipa_auto_call_arg_values.
	* ipa-fnsummary.h (ipa_call_context): Removed members m_known_vals,
	m_known_contexts, m_known_aggs, duplicate_from, release and equal_to,
	new members m_avals, store_to_cache and equivalent_to_p.  Adjusted
	constructor arguments.
	(estimate_ipcp_clone_size_and_time): Replaced vector arguments
	with ipa_auto_call_arg_values in declaration.
	(evaluate_properties_for_edge): Likewise.
	* ipa-cp.c (ipa_get_indirect_edge_target): Adjusted to work on
	ipa_call_arg_values rather than on separate vectors.  Added an
	overload for ipa_auto_call_arg_values.
	(devirtualization_time_bonus): Adjusted to work on
	ipa_auto_call_arg_values rather than on separate vectors.
	(gather_context_independent_values): Adjusted to work on
	ipa_auto_call_arg_values rather than on separate vectors.
	(perform_estimation_of_a_value): Likewise.
	(estimate_local_effects): Likewise.
	(modify_known_vectors_with_val): Adjusted both variants to work on
	ipa_auto_call_arg_values and renamed them to
	copy_known_vectors_add_val.
	(decide_about_value): Adjusted to work on ipa_call_arg_values rather
	than on separate vectors.
	(decide_whether_version_node): Likewise.
	* ipa-fnsummary.c (evaluate_conditions_for_known_args): Likewise.
	(evaluate_properties_for_edge): Likewise.
	(ipa_fn_summary_t::duplicate): Likewise.
	(estimate_edge_devirt_benefit): Adjusted to work on
	ipa_call_arg_values rather than on separate vectors.
	(estimate_edge_size_and_time): Likewise.
	(estimate_calls_size_and_time_1): Likewise.
	(summarize_calls_size_and_time): Adjusted calls to
	estimate_edge_size_and_time.
	(estimate_calls_size_and_time): Adjusted to work on
	ipa_call_arg_values rather than on separate vectors.
	(ipa_call_context::ipa_call_context): Construct from a pointer to
	ipa_auto_call_arg_values instead of individual vectors.
	(ipa_call_context::duplicate_from): Adjusted to access vectors within
	m_avals.
	(ipa_call_context::release): Likewise.
	(ipa_call_context::equal_to): Likewise.
	(ipa_call_context::estimate_size_and_time): Adjusted to work on
	ipa_call_arg_values rather than on separate vectors.
	(estimate_ipcp_clone_size_and_time): Adjusted to work with
	ipa_auto_call_arg_values rather than on separate vectors.
	(ipa_merge_fn_summary_after_inlining): Likewise.  Adjusted call to
	estimate_edge_size_and_time.
	(ipa_update_overall_fn_summary): Adjusted call to
	estimate_edge_size_and_time.
	* ipa-inline-analysis.c (do_estimate_edge_time): Adjusted to work with
	ipa_auto_call_arg_values rather than with separate vectors.
	(do_estimate_edge_size): Likewise.
	(do_estimate_edge_hints): Likewise.
	* ipa-prop.c (ipa_auto_call_arg_values::~ipa_auto_call_arg_values):
	New destructor.
---
 gcc/ipa-cp.c              | 245 ++++++++++-----------
 gcc/ipa-fnsummary.c       | 446 +++++++++++++++++---------------------
 gcc/ipa-fnsummary.h       |  27 +--
 gcc/ipa-inline-analysis.c |  41 +---
 gcc/ipa-prop.c            |  10 +
 gcc/ipa-prop.h            | 112 +++++++++-
 6 files changed, 452 insertions(+), 429 deletions(-)

diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
index b3e7d41ea10..292dd7e5bdf 100644
--- a/gcc/ipa-cp.c
+++ b/gcc/ipa-cp.c
@@ -3117,30 +3117,40 @@ ipa_get_indirect_edge_target_1 (struct cgraph_edge *ie,
   return target;
 }
 
-
-/* If an indirect edge IE can be turned into a direct one based on KNOWN_CSTS,
-   KNOWN_CONTEXTS (which can be vNULL) or KNOWN_AGGS (which also can be vNULL)
-   return the destination.  */
+/* If an indirect edge IE can be turned into a direct one based on data in
+   AVALS, return the destination.  Store into *SPECULATIVE a boolean determining
+   whether the discovered target is only speculative guess.
*/ tree ipa_get_indirect_edge_target (struct cgraph_edge *ie, - vec known_csts, - vec known_contexts, - vec known_aggs, + ipa_call_arg_values *avals, bool *speculative) { - return ipa_get_indirect_edge_target_1 (ie, known_csts, known_contexts, - known_aggs, NULL, speculative); + return ipa_get_indirect_edge_target_1 (ie, avals->m_known_vals, + avals->m_known_contexts, + avals->m_known_aggs, + NULL, speculative); } -/* Calculate devirtualization time bonus for NODE, assuming we know KNOWN_CSTS - and KNOWN_CONTEXTS. */ +/* The same functionality as above overloaded for ipa_auto_call_arg_values. */ + +tree +ipa_get_indirect_edge_target (struct cgraph_edge *ie, + ipa_auto_call_arg_values *avals, + bool *speculative) +{ + return ipa_get_indirect_edge_target_1 (ie, avals->m_known_vals, + avals->m_known_contexts, + avals->m_known_aggs, + NULL, speculative); +} + +/* Calculate devirtualization time bonus for NODE, assuming we know information + about arguments stored in AVALS. */ static int devirtualization_time_bonus (struct cgraph_node *node, - vec known_csts, - vec known_contexts, - vec known_aggs) + ipa_auto_call_arg_values *avals) { struct cgraph_edge *ie; int res = 0; @@ -3153,8 +3163,7 @@ devirtualization_time_bonus (struct cgraph_node *node, tree target; bool speculative; - target = ipa_get_indirect_edge_target (ie, known_csts, known_contexts, - known_aggs, &speculative); + target = ipa_get_indirect_edge_target (ie, avals, &speculative); if (!target) continue; @@ -3306,32 +3315,27 @@ context_independent_aggregate_values (class ipcp_param_lattices *plats) return res; } -/* Allocate KNOWN_CSTS, KNOWN_CONTEXTS and, if non-NULL, KNOWN_AGGS and - populate them with values of parameters that are known independent of the - context. INFO describes the function. If REMOVABLE_PARAMS_COST is - non-NULL, the movement cost of all removable parameters will be stored in - it. 
*/ +/* Grow vectors in AVALS and fill them with information about values of + parameters that are known to be independent of the context. Only calculate + m_known_aggs if CALCULATE_AGGS is true. INFO describes the function. If + REMOVABLE_PARAMS_COST is non-NULL, the movement cost of all removable + parameters will be stored in it. + + TODO: Also grow context independent value range vectors. */ static bool gather_context_independent_values (class ipa_node_params *info, - vec *known_csts, - vec - *known_contexts, - vec *known_aggs, + ipa_auto_call_arg_values *avals, + bool calculate_aggs, int *removable_params_cost) { int i, count = ipa_get_param_count (info); bool ret = false; - known_csts->create (0); - known_contexts->create (0); - known_csts->safe_grow_cleared (count, true); - known_contexts->safe_grow_cleared (count, true); - if (known_aggs) - { - known_aggs->create (0); - known_aggs->safe_grow_cleared (count, true); - } + avals->m_known_vals.safe_grow_cleared (count, true); + avals->m_known_contexts.safe_grow_cleared (count, true); + if (calculate_aggs) + avals->m_known_aggs.safe_grow_cleared (count, true); if (removable_params_cost) *removable_params_cost = 0; @@ -3345,7 +3349,7 @@ gather_context_independent_values (class ipa_node_params *info, { ipcp_value *val = lat->values; gcc_checking_assert (TREE_CODE (val->value) != TREE_BINFO); - (*known_csts)[i] = val->value; + avals->m_known_vals[i] = val->value; if (removable_params_cost) *removable_params_cost += estimate_move_cost (TREE_TYPE (val->value), false); @@ -3363,15 +3367,15 @@ gather_context_independent_values (class ipa_node_params *info, /* Do not account known context as reason for cloning. We can see if it permits devirtualization. 
*/ if (ctxlat->is_single_const ()) - (*known_contexts)[i] = ctxlat->values->value; + avals->m_known_contexts[i] = ctxlat->values->value; - if (known_aggs) + if (calculate_aggs) { vec agg_items; struct ipa_agg_value_set *agg; agg_items = context_independent_aggregate_values (plats); - agg = &(*known_aggs)[i]; + agg = &avals->m_known_aggs[i]; agg->items = agg_items; agg->by_ref = plats->aggs_by_ref; ret |= !agg_items.is_empty (); @@ -3381,25 +3385,23 @@ gather_context_independent_values (class ipa_node_params *info, return ret; } -/* Perform time and size measurement of NODE with the context given in - KNOWN_CSTS, KNOWN_CONTEXTS and KNOWN_AGGS, calculate the benefit and cost - given BASE_TIME of the node without specialization, REMOVABLE_PARAMS_COST of - all context-independent removable parameters and EST_MOVE_COST of estimated - movement of the considered parameter and store it into VAL. */ +/* Perform time and size measurement of NODE with the context given in AVALS, + calculate the benefit compared to the node without specialization and store + it into VAL. Take into account REMOVABLE_PARAMS_COST of all + context-independent or unused removable parameters and EST_MOVE_COST, the + estimated movement of the considered parameter. 
*/ static void -perform_estimation_of_a_value (cgraph_node *node, vec known_csts, - vec known_contexts, - vec known_aggs, - int removable_params_cost, - int est_move_cost, ipcp_value_base *val) +perform_estimation_of_a_value (cgraph_node *node, + ipa_auto_call_arg_values *avals, + int removable_params_cost, int est_move_cost, + ipcp_value_base *val) { int size, time_benefit; sreal time, base_time; ipa_hints hints; - estimate_ipcp_clone_size_and_time (node, known_csts, known_contexts, - known_aggs, &size, &time, + estimate_ipcp_clone_size_and_time (node, avals, &size, &time, &base_time, &hints); base_time -= time; if (base_time > 65535) @@ -3412,8 +3414,7 @@ perform_estimation_of_a_value (cgraph_node *node, vec known_csts, time_benefit = 0; else time_benefit = base_time.to_int () - + devirtualization_time_bonus (node, known_csts, known_contexts, - known_aggs) + + devirtualization_time_bonus (node, avals) + hint_time_bonus (node, hints) + removable_params_cost + est_move_cost; @@ -3454,9 +3455,6 @@ estimate_local_effects (struct cgraph_node *node) { class ipa_node_params *info = IPA_NODE_REF (node); int i, count = ipa_get_param_count (info); - vec known_csts; - vec known_contexts; - vec known_aggs; bool always_const; int removable_params_cost; @@ -3466,11 +3464,10 @@ estimate_local_effects (struct cgraph_node *node) if (dump_file && (dump_flags & TDF_DETAILS)) fprintf (dump_file, "\nEstimating effects for %s.\n", node->dump_name ()); - always_const = gather_context_independent_values (info, &known_csts, - &known_contexts, &known_aggs, + ipa_auto_call_arg_values avals; + always_const = gather_context_independent_values (info, &avals, true, &removable_params_cost); - int devirt_bonus = devirtualization_time_bonus (node, known_csts, - known_contexts, known_aggs); + int devirt_bonus = devirtualization_time_bonus (node, &avals); if (always_const || devirt_bonus || (removable_params_cost && node->can_change_signature)) { @@ -3482,8 +3479,7 @@ estimate_local_effects (struct 
cgraph_node *node) init_caller_stats (&stats); node->call_for_symbol_thunks_and_aliases (gather_caller_stats, &stats, false); - estimate_ipcp_clone_size_and_time (node, known_csts, known_contexts, - known_aggs, &size, &time, + estimate_ipcp_clone_size_and_time (node, &avals, &size, &time, &base_time, &hints); time -= devirt_bonus; time -= hint_time_bonus (node, hints); @@ -3536,18 +3532,17 @@ estimate_local_effects (struct cgraph_node *node) if (lat->bottom || !lat->values - || known_csts[i]) + || avals.m_known_vals[i]) continue; for (val = lat->values; val; val = val->next) { gcc_checking_assert (TREE_CODE (val->value) != TREE_BINFO); - known_csts[i] = val->value; + avals.m_known_vals[i] = val->value; int emc = estimate_move_cost (TREE_TYPE (val->value), true); - perform_estimation_of_a_value (node, known_csts, known_contexts, - known_aggs, - removable_params_cost, emc, val); + perform_estimation_of_a_value (node, &avals, removable_params_cost, + emc, val); if (dump_file && (dump_flags & TDF_DETAILS)) { @@ -3559,7 +3554,7 @@ estimate_local_effects (struct cgraph_node *node) val->local_time_benefit, val->local_size_cost); } } - known_csts[i] = NULL_TREE; + avals.m_known_vals[i] = NULL_TREE; } for (i = 0; i < count; i++) @@ -3574,15 +3569,14 @@ estimate_local_effects (struct cgraph_node *node) if (ctxlat->bottom || !ctxlat->values - || !known_contexts[i].useless_p ()) + || !avals.m_known_contexts[i].useless_p ()) continue; for (val = ctxlat->values; val; val = val->next) { - known_contexts[i] = val->value; - perform_estimation_of_a_value (node, known_csts, known_contexts, - known_aggs, - removable_params_cost, 0, val); + avals.m_known_contexts[i] = val->value; + perform_estimation_of_a_value (node, &avals, removable_params_cost, + 0, val); if (dump_file && (dump_flags & TDF_DETAILS)) { @@ -3594,20 +3588,18 @@ estimate_local_effects (struct cgraph_node *node) val->local_time_benefit, val->local_size_cost); } } - known_contexts[i] = ipa_polymorphic_call_context (); + 
avals.m_known_contexts[i] = ipa_polymorphic_call_context (); } for (i = 0; i < count; i++) { class ipcp_param_lattices *plats = ipa_get_parm_lattices (info, i); - struct ipa_agg_value_set *agg; - struct ipcp_agg_lattice *aglat; if (plats->aggs_bottom || !plats->aggs) continue; - agg = &known_aggs[i]; - for (aglat = plats->aggs; aglat; aglat = aglat->next) + ipa_agg_value_set *agg = &avals.m_known_aggs[i]; + for (ipcp_agg_lattice *aglat = plats->aggs; aglat; aglat = aglat->next) { ipcp_value *val; if (aglat->bottom || !aglat->values @@ -3624,8 +3616,7 @@ estimate_local_effects (struct cgraph_node *node) item.value = val->value; agg->items.safe_push (item); - perform_estimation_of_a_value (node, known_csts, known_contexts, - known_aggs, + perform_estimation_of_a_value (node, &avals, removable_params_cost, 0, val); if (dump_file && (dump_flags & TDF_DETAILS)) @@ -3645,10 +3636,6 @@ estimate_local_effects (struct cgraph_node *node) } } } - - known_csts.release (); - known_contexts.release (); - ipa_release_agg_values (known_aggs); } @@ -5372,31 +5359,34 @@ copy_useful_known_contexts (vec known_contexts) return vNULL; } -/* Copy KNOWN_CSTS and modify the copy according to VAL and INDEX. If - non-empty, replace KNOWN_CONTEXTS with its copy too. */ +/* Copy known scalar values from AVALS into KNOWN_CSTS and modify the copy + according to VAL and INDEX. If non-empty, replace KNOWN_CONTEXTS with its + copy too. */ static void -modify_known_vectors_with_val (vec *known_csts, - vec *known_contexts, - ipcp_value *val, - int index) +copy_known_vectors_add_val (ipa_auto_call_arg_values *avals, + vec *known_csts, + vec *known_contexts, + ipcp_value *val, int index) { - *known_csts = known_csts->copy (); - *known_contexts = copy_useful_known_contexts (*known_contexts); + *known_csts = avals->m_known_vals.copy (); + *known_contexts = copy_useful_known_contexts (avals->m_known_contexts); (*known_csts)[index] = val->value; } -/* Replace KNOWN_CSTS with its copy. 
Also copy KNOWN_CONTEXTS and modify the - copy according to VAL and INDEX. */ +/* Copy known scalar values from AVALS into KNOWN_CSTS. Similarly, copy + contexts to KNOWN_CONTEXTS and modify the copy according to VAL and + INDEX. */ static void -modify_known_vectors_with_val (vec *known_csts, - vec *known_contexts, - ipcp_value *val, - int index) +copy_known_vectors_add_val (ipa_auto_call_arg_values *avals, + vec *known_csts, + vec *known_contexts, + ipcp_value *val, + int index) { - *known_csts = known_csts->copy (); - *known_contexts = known_contexts->copy (); + *known_csts = avals->m_known_vals.copy (); + *known_contexts = avals->m_known_contexts.copy (); (*known_contexts)[index] = val->value; } @@ -5433,16 +5423,15 @@ ipcp_val_agg_replacement_ok_p (ipa_agg_replacement_value *, return offset == -1; } -/* Decide whether to create a special version of NODE for value VAL of parameter - at the given INDEX. If OFFSET is -1, the value is for the parameter itself, - otherwise it is stored at the given OFFSET of the parameter. KNOWN_CSTS, - KNOWN_CONTEXTS and KNOWN_AGGS describe the other already known values. */ +/* Decide whether to create a special version of NODE for value VAL of + parameter at the given INDEX. If OFFSET is -1, the value is for the + parameter itself, otherwise it is stored at the given OFFSET of the + parameter. AVALS describes the other already known values. 
*/ template static bool decide_about_value (struct cgraph_node *node, int index, HOST_WIDE_INT offset, - ipcp_value *val, vec known_csts, - vec known_contexts) + ipcp_value *val, ipa_auto_call_arg_values *avals) { struct ipa_agg_replacement_value *aggvals; int freq_sum, caller_count; @@ -5492,13 +5481,16 @@ decide_about_value (struct cgraph_node *node, int index, HOST_WIDE_INT offset, fprintf (dump_file, " Creating a specialized node of %s.\n", node->dump_name ()); + vec known_csts; + vec known_contexts; + callers = gather_edges_for_value (val, node, caller_count); if (offset == -1) - modify_known_vectors_with_val (&known_csts, &known_contexts, val, index); + copy_known_vectors_add_val (avals, &known_csts, &known_contexts, val, index); else { - known_csts = known_csts.copy (); - known_contexts = copy_useful_known_contexts (known_contexts); + known_csts = avals->m_known_vals.copy (); + known_contexts = copy_useful_known_contexts (avals->m_known_contexts); } find_more_scalar_values_for_callers_subset (node, known_csts, callers); find_more_contexts_for_caller_subset (node, &known_contexts, callers); @@ -5522,8 +5514,6 @@ decide_whether_version_node (struct cgraph_node *node) { class ipa_node_params *info = IPA_NODE_REF (node); int i, count = ipa_get_param_count (info); - vec known_csts; - vec known_contexts; bool ret = false; if (count == 0) @@ -5533,8 +5523,8 @@ decide_whether_version_node (struct cgraph_node *node) fprintf (dump_file, "\nEvaluating opportunities for %s.\n", node->dump_name ()); - gather_context_independent_values (info, &known_csts, &known_contexts, - NULL, NULL); + ipa_auto_call_arg_values avals; + gather_context_independent_values (info, &avals, false, NULL); for (i = 0; i < count;i++) { @@ -5543,12 +5533,11 @@ decide_whether_version_node (struct cgraph_node *node) ipcp_lattice *ctxlat = &plats->ctxlat; if (!lat->bottom - && !known_csts[i]) + && !avals.m_known_vals[i]) { ipcp_value *val; for (val = lat->values; val; val = val->next) - ret |= 
decide_about_value (node, i, -1, val, known_csts, - known_contexts); + ret |= decide_about_value (node, i, -1, val, &avals); } if (!plats->aggs_bottom) @@ -5557,22 +5546,20 @@ decide_whether_version_node (struct cgraph_node *node) ipcp_value *val; for (aglat = plats->aggs; aglat; aglat = aglat->next) if (!aglat->bottom && aglat->values - /* If the following is false, the one value is in - known_aggs. */ + /* If the following is false, the one value has been considered + for cloning for all contexts. */ && (plats->aggs_contain_variable || !aglat->is_single_const ())) for (val = aglat->values; val; val = val->next) - ret |= decide_about_value (node, i, aglat->offset, val, - known_csts, known_contexts); + ret |= decide_about_value (node, i, aglat->offset, val, &avals); } if (!ctxlat->bottom - && known_contexts[i].useless_p ()) + && avals.m_known_contexts[i].useless_p ()) { ipcp_value *val; for (val = ctxlat->values; val; val = val->next) - ret |= decide_about_value (node, i, -1, val, known_csts, - known_contexts); + ret |= decide_about_value (node, i, -1, val, &avals); } info = IPA_NODE_REF (node); @@ -5595,11 +5582,9 @@ decide_whether_version_node (struct cgraph_node *node) if (!adjust_callers_for_value_intersection (callers, node)) { /* If node is not called by anyone, or all its caller edges are - self-recursive, the node is not really be in use, no need to - do cloning. */ + self-recursive, the node is not really in use, no need to do + cloning. 
*/ callers.release (); - known_csts.release (); - known_contexts.release (); info->do_clone_for_all_contexts = false; return ret; } @@ -5608,6 +5593,9 @@ decide_whether_version_node (struct cgraph_node *node) fprintf (dump_file, " - Creating a specialized node of %s " "for all known contexts.\n", node->dump_name ()); + vec known_csts = avals.m_known_vals.copy (); + vec known_contexts + = copy_useful_known_contexts (avals.m_known_contexts); find_more_scalar_values_for_callers_subset (node, known_csts, callers); find_more_contexts_for_caller_subset (node, &known_contexts, callers); ipa_agg_replacement_value *aggvals @@ -5625,11 +5613,6 @@ decide_whether_version_node (struct cgraph_node *node) IPA_NODE_REF (clone)->is_all_contexts_clone = true; ret = true; } - else - { - known_csts.release (); - known_contexts.release (); - } return ret; } diff --git a/gcc/ipa-fnsummary.c b/gcc/ipa-fnsummary.c index 86d01addb44..d8b95aca307 100644 --- a/gcc/ipa-fnsummary.c +++ b/gcc/ipa-fnsummary.c @@ -320,19 +320,18 @@ set_hint_predicate (predicate **p, predicate new_predicate) is always false in the second and also builtin_constant_p tests cannot use the fact that parameter is indeed a constant. - KNOWN_VALS is partial mapping of parameters of NODE to constant values. - KNOWN_AGGS is a vector of aggregate known offset/value set for each - parameter. Return clause of possible truths. When INLINE_P is true, assume - that we are inlining. + When INLINE_P is true, assume that we are inlining. AVAL contains known + information about argument values. The function does not modify its content + and so AVALs could also be of type ipa_call_arg_values but so far all + callers work with the auto version and so we avoid the conversion for + convenience. - ERROR_MARK means compile time invariant. */ + ERROR_MARK value of an argument means compile time invariant. 
*/ static void evaluate_conditions_for_known_args (struct cgraph_node *node, bool inline_p, - vec known_vals, - vec known_value_ranges, - vec known_aggs, + ipa_auto_call_arg_values *avals, clause_t *ret_clause, clause_t *ret_nonspec_clause) { @@ -351,38 +350,33 @@ evaluate_conditions_for_known_args (struct cgraph_node *node, /* We allow call stmt to have fewer arguments than the callee function (especially for K&R style programs). So bound check here (we assume - known_aggs vector, if non-NULL, has the same length as - known_vals). */ - gcc_checking_assert (!known_aggs.length () || !known_vals.length () - || (known_vals.length () == known_aggs.length ())); + m_known_aggs vector is either empty or has the same length as + m_known_vals). */ + gcc_checking_assert (!avals->m_known_aggs.length () + || !avals->m_known_vals.length () + || (avals->m_known_vals.length () + == avals->m_known_aggs.length ())); if (c->agg_contents) { - struct ipa_agg_value_set *agg; - if (c->code == predicate::changed && !c->by_ref - && c->operand_num < (int)known_vals.length () - && (known_vals[c->operand_num] == error_mark_node)) + && (avals->safe_sval_at(c->operand_num) == error_mark_node)) continue; - if (c->operand_num < (int)known_aggs.length ()) + if (ipa_agg_value_set *agg = avals->safe_aggval_at (c->operand_num)) { - agg = &known_aggs[c->operand_num]; - val = ipa_find_agg_cst_for_param (agg, - c->operand_num - < (int) known_vals.length () - ? 
known_vals[c->operand_num] - : NULL, - c->offset, c->by_ref); + tree sval = avals->safe_sval_at (c->operand_num); + val = ipa_find_agg_cst_for_param (agg, sval, c->offset, + c->by_ref); } else val = NULL_TREE; } - else if (c->operand_num < (int) known_vals.length ()) + else { - val = known_vals[c->operand_num]; - if (val == error_mark_node && c->code != predicate::changed) + val = avals->safe_sval_at (c->operand_num); + if (val && val == error_mark_node && c->code != predicate::changed) val = NULL_TREE; } @@ -446,53 +440,54 @@ evaluate_conditions_for_known_args (struct cgraph_node *node, continue; } } - if (c->operand_num < (int) known_value_ranges.length () + if (c->operand_num < (int) avals->m_known_value_ranges.length () && !c->agg_contents - && !known_value_ranges[c->operand_num].undefined_p () - && !known_value_ranges[c->operand_num].varying_p () - && TYPE_SIZE (c->type) - == TYPE_SIZE (known_value_ranges[c->operand_num].type ()) && (!val || TREE_CODE (val) != INTEGER_CST)) { - value_range vr = known_value_ranges[c->operand_num]; - if (!useless_type_conversion_p (c->type, vr.type ())) + value_range vr = avals->m_known_value_ranges[c->operand_num]; + if (!vr.undefined_p () + && !vr.varying_p () + && (TYPE_SIZE (c->type) == TYPE_SIZE (vr.type ()))) { - value_range res; - range_fold_unary_expr (&res, NOP_EXPR, - c->type, &vr, vr.type ()); - vr = res; - } - tree type = c->type; - - for (j = 0; vec_safe_iterate (c->param_ops, j, &op); j++) - { - if (vr.varying_p () || vr.undefined_p ()) - break; - - value_range res; - if (!op->val[0]) - range_fold_unary_expr (&res, op->code, op->type, &vr, type); - else if (!op->val[1]) + if (!useless_type_conversion_p (c->type, vr.type ())) { - value_range op0 (op->val[0], op->val[0]); - range_fold_binary_expr (&res, op->code, op->type, - op->index ? &op0 : &vr, - op->index ? 
&vr : &op0); + value_range res; + range_fold_unary_expr (&res, NOP_EXPR, + c->type, &vr, vr.type ()); + vr = res; + } + tree type = c->type; + + for (j = 0; vec_safe_iterate (c->param_ops, j, &op); j++) + { + if (vr.varying_p () || vr.undefined_p ()) + break; + + value_range res; + if (!op->val[0]) + range_fold_unary_expr (&res, op->code, op->type, &vr, type); + else if (!op->val[1]) + { + value_range op0 (op->val[0], op->val[0]); + range_fold_binary_expr (&res, op->code, op->type, + op->index ? &op0 : &vr, + op->index ? &vr : &op0); + } + else + gcc_unreachable (); + type = op->type; + vr = res; + } + if (!vr.varying_p () && !vr.undefined_p ()) + { + value_range res; + value_range val_vr (c->val, c->val); + range_fold_binary_expr (&res, c->code, boolean_type_node, + &vr, + &val_vr); + if (res.zero_p ()) + continue; } - else - gcc_unreachable (); - type = op->type; - vr = res; - } - if (!vr.varying_p () && !vr.undefined_p ()) - { - value_range res; - value_range val_vr (c->val, c->val); - range_fold_binary_expr (&res, c->code, boolean_type_node, - &vr, - &val_vr); - if (res.zero_p ()) - continue; } } @@ -538,24 +533,20 @@ fre_will_run_p (struct cgraph_node *node) (if non-NULL) conditions evaluated for nonspecialized clone called in a given context. - KNOWN_VALS_PTR and KNOWN_AGGS_PTR must be non-NULL and will be filled by - known constant and aggregate values of parameters. - - KNOWN_CONTEXT_PTR, if non-NULL, will be filled by polymorphic call contexts - of parameter used by a polymorphic call. */ + Vectors in AVALS will be populated with useful known information about + argument values - information not known to have any uses will be omitted - + except for m_known_contexts which will only be calculated if + COMPUTE_CONTEXTS is true. 
*/ void evaluate_properties_for_edge (struct cgraph_edge *e, bool inline_p, clause_t *clause_ptr, clause_t *nonspec_clause_ptr, - vec *known_vals_ptr, - vec - *known_contexts_ptr, - vec *known_aggs_ptr) + ipa_auto_call_arg_values *avals, + bool compute_contexts) { struct cgraph_node *callee = e->callee->ultimate_alias_target (); class ipa_fn_summary *info = ipa_fn_summaries->get (callee); - auto_vec known_value_ranges; class ipa_edge_args *args; if (clause_ptr) @@ -563,7 +554,7 @@ evaluate_properties_for_edge (struct cgraph_edge *e, bool inline_p, if (ipa_node_params_sum && !e->call_stmt_cannot_inline_p - && (info->conds || known_contexts_ptr) + && (info->conds || compute_contexts) && (args = IPA_EDGE_REF (e)) != NULL) { struct cgraph_node *caller; @@ -608,15 +599,15 @@ evaluate_properties_for_edge (struct cgraph_edge *e, bool inline_p, if (cst) { gcc_checking_assert (TREE_CODE (cst) != TREE_BINFO); - if (!known_vals_ptr->length ()) - vec_safe_grow_cleared (known_vals_ptr, count, true); - (*known_vals_ptr)[i] = cst; + if (!avals->m_known_vals.length ()) + avals->m_known_vals.safe_grow_cleared (count, true); + avals->m_known_vals[i] = cst; } else if (inline_p && !es->param[i].change_prob) { - if (!known_vals_ptr->length ()) - vec_safe_grow_cleared (known_vals_ptr, count, true); - (*known_vals_ptr)[i] = error_mark_node; + if (!avals->m_known_vals.length ()) + avals->m_known_vals.safe_grow_cleared (count, true); + avals->m_known_vals[i] = error_mark_node; } /* If we failed to get simple constant, try value range. 
*/ @@ -624,19 +615,20 @@ evaluate_properties_for_edge (struct cgraph_edge *e, bool inline_p, && vrp_will_run_p (caller) && ipa_is_param_used_by_ipa_predicates (callee_pi, i)) { - value_range vr + value_range vr = ipa_value_range_from_jfunc (caller_parms_info, e, jf, ipa_get_type (callee_pi, i)); if (!vr.undefined_p () && !vr.varying_p ()) { - if (!known_value_ranges.length ()) + if (!avals->m_known_value_ranges.length ()) { - known_value_ranges.safe_grow (count, true); + avals->m_known_value_ranges.safe_grow (count, true); for (int i = 0; i < count; ++i) - new (&known_value_ranges[i]) value_range (); + new (&avals->m_known_value_ranges[i]) + value_range (); } - known_value_ranges[i] = vr; + avals->m_known_value_ranges[i] = vr; } } @@ -648,25 +640,25 @@ evaluate_properties_for_edge (struct cgraph_edge *e, bool inline_p, caller, &jf->agg); if (agg.items.length ()) { - if (!known_aggs_ptr->length ()) - vec_safe_grow_cleared (known_aggs_ptr, count, true); - (*known_aggs_ptr)[i] = agg; + if (!avals->m_known_aggs.length ()) + avals->m_known_aggs.safe_grow_cleared (count, true); + avals->m_known_aggs[i] = agg; } } } /* For calls used in polymorphic calls we further determine polymorphic call context. 
*/ - if (known_contexts_ptr + if (compute_contexts && ipa_is_param_used_by_polymorphic_call (callee_pi, i)) { ipa_polymorphic_call_context ctx = ipa_context_from_jfunc (caller_parms_info, e, i, jf); if (!ctx.useless_p ()) { - if (!known_contexts_ptr->length ()) - known_contexts_ptr->safe_grow_cleared (count, true); - (*known_contexts_ptr)[i] + if (!avals->m_known_contexts.length ()) + avals->m_known_contexts.safe_grow_cleared (count, true); + avals->m_known_contexts[i] = ipa_context_from_jfunc (caller_parms_info, e, i, jf); } } @@ -685,18 +677,14 @@ evaluate_properties_for_edge (struct cgraph_edge *e, bool inline_p, cst = NULL; if (cst) { - if (!known_vals_ptr->length ()) - vec_safe_grow_cleared (known_vals_ptr, count, true); - (*known_vals_ptr)[i] = cst; + if (!avals->m_known_vals.length ()) + avals->m_known_vals.safe_grow_cleared (count, true); + avals->m_known_vals[i] = cst; } } } - evaluate_conditions_for_known_args (callee, inline_p, - *known_vals_ptr, - known_value_ranges, - *known_aggs_ptr, - clause_ptr, + evaluate_conditions_for_known_args (callee, inline_p, avals, clause_ptr, nonspec_clause_ptr); } @@ -781,7 +769,7 @@ ipa_fn_summary_t::duplicate (cgraph_node *src, vec *entry = info->size_time_table; /* Use SRC parm info since it may not be copied yet. 
*/ class ipa_node_params *parms_info = IPA_NODE_REF (src); - vec known_vals = vNULL; + ipa_auto_call_arg_values avals; int count = ipa_get_param_count (parms_info); int i, j; clause_t possible_truths; @@ -792,7 +780,7 @@ ipa_fn_summary_t::duplicate (cgraph_node *src, struct cgraph_edge *edge, *next; info->size_time_table = 0; - known_vals.safe_grow_cleared (count, true); + avals.m_known_vals.safe_grow_cleared (count, true); for (i = 0; i < count; i++) { struct ipa_replace_map *r; @@ -801,20 +789,17 @@ ipa_fn_summary_t::duplicate (cgraph_node *src, { if (r->parm_num == i) { - known_vals[i] = r->new_tree; + avals.m_known_vals[i] = r->new_tree; break; } } } evaluate_conditions_for_known_args (dst, false, - known_vals, - vNULL, - vNULL, + &avals, &possible_truths, /* We are going to specialize, so ignore nonspec truths. */ NULL); - known_vals.release (); info->account_size_time (0, 0, true_pred, true_pred); @@ -3009,15 +2994,14 @@ compute_fn_summary_for_current (void) return 0; } -/* Estimate benefit devirtualizing indirect edge IE, provided KNOWN_VALS, - KNOWN_CONTEXTS and KNOWN_AGGS. */ +/* Estimate benefit devirtualizing indirect edge IE and return true if it can + be devirtualized and inlined, provided m_known_vals, m_known_contexts and + m_known_aggs in AVALS. Return false straight away if AVALS is NULL. 
*/ static bool estimate_edge_devirt_benefit (struct cgraph_edge *ie, int *size, int *time, - vec known_vals, - vec known_contexts, - vec known_aggs) + ipa_call_arg_values *avals) { tree target; struct cgraph_node *callee; @@ -3025,13 +3009,13 @@ estimate_edge_devirt_benefit (struct cgraph_edge *ie, enum availability avail; bool speculative; - if (!known_vals.length () && !known_contexts.length ()) + if (!avals + || (!avals->m_known_vals.length () && !avals->m_known_contexts.length ())) return false; if (!opt_for_fn (ie->caller->decl, flag_indirect_inlining)) return false; - target = ipa_get_indirect_edge_target (ie, known_vals, known_contexts, - known_aggs, &speculative); + target = ipa_get_indirect_edge_target (ie, avals, &speculative); if (!target || speculative) return false; @@ -3055,17 +3039,13 @@ estimate_edge_devirt_benefit (struct cgraph_edge *ie, } /* Increase SIZE, MIN_SIZE (if non-NULL) and TIME for size and time needed to - handle edge E with probability PROB. - Set HINTS if edge may be devirtualized. - KNOWN_VALS, KNOWN_AGGS and KNOWN_CONTEXTS describe context of the call - site. */ + handle edge E with probability PROB. Set HINTS accordingly if edge may be + devirtualized. AVALS, if non-NULL, describes the context of the call site + as far as values of parameters are concerned. 
*/ static inline void estimate_edge_size_and_time (struct cgraph_edge *e, int *size, int *min_size, - sreal *time, - vec known_vals, - vec known_contexts, - vec known_aggs, + sreal *time, ipa_call_arg_values *avals, ipa_hints *hints) { class ipa_call_summary *es = ipa_call_summaries->get (e); @@ -3074,8 +3054,7 @@ estimate_edge_size_and_time (struct cgraph_edge *e, int *size, int *min_size, int cur_size; if (!e->callee && hints && e->maybe_hot_p () - && estimate_edge_devirt_benefit (e, &call_size, &call_time, - known_vals, known_contexts, known_aggs)) + && estimate_edge_devirt_benefit (e, &call_size, &call_time, avals)) *hints |= INLINE_HINT_indirect_call; cur_size = call_size * ipa_fn_summary::size_scale; *size += cur_size; @@ -3087,9 +3066,9 @@ estimate_edge_size_and_time (struct cgraph_edge *e, int *size, int *min_size, /* Increase SIZE, MIN_SIZE and TIME for size and time needed to handle all - calls in NODE. POSSIBLE_TRUTHS, KNOWN_VALS, KNOWN_AGGS and KNOWN_CONTEXTS - describe context of the call site. - + calls in NODE. POSSIBLE_TRUTHS and AVALS describe the context of the call + site. + Helper for estimate_calls_size_and_time which does the same but (in most cases) faster. 
*/ @@ -3098,9 +3077,7 @@ estimate_calls_size_and_time_1 (struct cgraph_node *node, int *size, int *min_size, sreal *time, ipa_hints *hints, clause_t possible_truths, - vec known_vals, - vec known_contexts, - vec known_aggs) + ipa_call_arg_values *avals) { struct cgraph_edge *e; for (e = node->callees; e; e = e->next_callee) @@ -3109,10 +3086,8 @@ estimate_calls_size_and_time_1 (struct cgraph_node *node, int *size, { gcc_checking_assert (!ipa_call_summaries->get (e)); estimate_calls_size_and_time_1 (e->callee, size, min_size, time, - hints, - possible_truths, - known_vals, known_contexts, - known_aggs); + hints, possible_truths, avals); + continue; } class ipa_call_summary *es = ipa_call_summaries->get (e); @@ -3130,9 +3105,7 @@ estimate_calls_size_and_time_1 (struct cgraph_node *node, int *size, so we do not need to compute probabilities. */ estimate_edge_size_and_time (e, size, es->predicate ? NULL : min_size, - time, - known_vals, known_contexts, - known_aggs, hints); + time, avals, hints); } } for (e = node->indirect_calls; e; e = e->next_callee) @@ -3142,9 +3115,7 @@ estimate_calls_size_and_time_1 (struct cgraph_node *node, int *size, || es->predicate->evaluate (possible_truths)) estimate_edge_size_and_time (e, size, es->predicate ? 
NULL : min_size, - time, - known_vals, known_contexts, known_aggs, - hints); + time, avals, hints); } } @@ -3166,8 +3137,7 @@ summarize_calls_size_and_time (struct cgraph_node *node, int size = 0; sreal time = 0; - estimate_edge_size_and_time (e, &size, NULL, &time, - vNULL, vNULL, vNULL, NULL); + estimate_edge_size_and_time (e, &size, NULL, &time, NULL, NULL); struct predicate pred = true; class ipa_call_summary *es = ipa_call_summaries->get (e); @@ -3181,8 +3151,7 @@ summarize_calls_size_and_time (struct cgraph_node *node, int size = 0; sreal time = 0; - estimate_edge_size_and_time (e, &size, NULL, &time, - vNULL, vNULL, vNULL, NULL); + estimate_edge_size_and_time (e, &size, NULL, &time, NULL, NULL); struct predicate pred = true; class ipa_call_summary *es = ipa_call_summaries->get (e); @@ -3193,17 +3162,15 @@ summarize_calls_size_and_time (struct cgraph_node *node, } /* Increase SIZE, MIN_SIZE and TIME for size and time needed to handle all - calls in NODE. POSSIBLE_TRUTHS, KNOWN_VALS, KNOWN_AGGS and KNOWN_CONTEXTS - describe context of the call site. */ + calls in NODE. POSSIBLE_TRUTHS and AVALS (the latter if non-NULL) describe + context of the call site. */ static void estimate_calls_size_and_time (struct cgraph_node *node, int *size, int *min_size, sreal *time, ipa_hints *hints, clause_t possible_truths, - vec known_vals, - vec known_contexts, - vec known_aggs) + ipa_call_arg_values *avals) { class ipa_fn_summary *sum = ipa_fn_summaries->get (node); bool use_table = true; @@ -3222,9 +3189,10 @@ estimate_calls_size_and_time (struct cgraph_node *node, int *size, use_table = false; /* If there is an indirect edge that may be optimized, we need to go the slow way. 
*/ - else if ((known_vals.length () - || known_contexts.length () - || known_aggs.length ()) && hints) + else if (avals && hints + && (avals->m_known_vals.length () + || avals->m_known_contexts.length () + || avals->m_known_aggs.length ())) { class ipa_node_params *params_summary = IPA_NODE_REF (node); unsigned int nargs = params_summary @@ -3233,13 +3201,13 @@ estimate_calls_size_and_time (struct cgraph_node *node, int *size, for (unsigned int i = 0; i < nargs && use_table; i++) { if (ipa_is_param_used_by_indirect_call (params_summary, i) - && ((known_vals.length () > i && known_vals[i]) - || (known_aggs.length () > i - && known_aggs[i].items.length ()))) + && (avals->safe_sval_at (i) + || (avals->m_known_aggs.length () > i + && avals->m_known_aggs[i].items.length ()))) use_table = false; else if (ipa_is_param_used_by_polymorphic_call (params_summary, i) - && (known_contexts.length () > i - && !known_contexts[i].useless_p ())) + && (avals->m_known_contexts.length () > i + && !avals->m_known_contexts[i].useless_p ())) use_table = false; } } @@ -3282,8 +3250,7 @@ estimate_calls_size_and_time (struct cgraph_node *node, int *size, < ipa_fn_summary::max_size_time_table_size) { estimate_calls_size_and_time_1 (node, &old_size, NULL, &old_time, NULL, - possible_truths, known_vals, - known_contexts, known_aggs); + possible_truths, avals); gcc_assert (*size == old_size); if (time && (*time - old_time > 1 || *time - old_time < -1) && dump_file) @@ -3295,31 +3262,22 @@ estimate_calls_size_and_time (struct cgraph_node *node, int *size, /* Slow path by walking all edges. */ else estimate_calls_size_and_time_1 (node, size, min_size, time, hints, - possible_truths, known_vals, known_contexts, - known_aggs); + possible_truths, avals); } -/* Default constructor for ipa call context. - Memory allocation of known_vals, known_contexts - and known_aggs vectors is owned by the caller, but can - be release by ipa_call_context::release. - - inline_param_summary is owned by the caller. 
*/ -ipa_call_context::ipa_call_context (cgraph_node *node, - clause_t possible_truths, +/* Main constructor for ipa call context. Memory allocation of ARG_VALUES + is owned by the caller. INLINE_PARAM_SUMMARY is also owned by the + caller. */ + +ipa_call_context::ipa_call_context (cgraph_node *node, clause_t possible_truths, clause_t nonspec_possible_truths, - vec known_vals, - vec - known_contexts, - vec known_aggs, vec - inline_param_summary) + inline_param_summary, + ipa_auto_call_arg_values *arg_values) : m_node (node), m_possible_truths (possible_truths), m_nonspec_possible_truths (nonspec_possible_truths), m_inline_param_summary (inline_param_summary), - m_known_vals (known_vals), - m_known_contexts (known_contexts), - m_known_aggs (known_aggs) + m_avals (arg_values) { } @@ -3350,47 +3308,50 @@ ipa_call_context::duplicate_from (const ipa_call_context &ctx) break; } } - m_known_vals = vNULL; - if (ctx.m_known_vals.exists ()) + m_avals.m_known_vals = vNULL; + if (ctx.m_avals.m_known_vals.exists ()) { - unsigned int n = MIN (ctx.m_known_vals.length (), nargs); + unsigned int n = MIN (ctx.m_avals.m_known_vals.length (), nargs); for (unsigned int i = 0; i < n; i++) if (ipa_is_param_used_by_indirect_call (params_summary, i) - && ctx.m_known_vals[i]) + && ctx.m_avals.m_known_vals[i]) { - m_known_vals = ctx.m_known_vals.copy (); + m_avals.m_known_vals = ctx.m_avals.m_known_vals.copy (); break; } } - m_known_contexts = vNULL; - if (ctx.m_known_contexts.exists ()) + m_avals.m_known_contexts = vNULL; + if (ctx.m_avals.m_known_contexts.exists ()) { - unsigned int n = MIN (ctx.m_known_contexts.length (), nargs); + unsigned int n = MIN (ctx.m_avals.m_known_contexts.length (), nargs); for (unsigned int i = 0; i < n; i++) if (ipa_is_param_used_by_polymorphic_call (params_summary, i) - && !ctx.m_known_contexts[i].useless_p ()) + && !ctx.m_avals.m_known_contexts[i].useless_p ()) { - m_known_contexts = ctx.m_known_contexts.copy (); + m_avals.m_known_contexts = 
ctx.m_avals.m_known_contexts.copy (); break; } } - m_known_aggs = vNULL; - if (ctx.m_known_aggs.exists ()) + m_avals.m_known_aggs = vNULL; + if (ctx.m_avals.m_known_aggs.exists ()) { - unsigned int n = MIN (ctx.m_known_aggs.length (), nargs); + unsigned int n = MIN (ctx.m_avals.m_known_aggs.length (), nargs); for (unsigned int i = 0; i < n; i++) if (ipa_is_param_used_by_indirect_call (params_summary, i) - && !ctx.m_known_aggs[i].is_empty ()) + && !ctx.m_avals.m_known_aggs[i].is_empty ()) { - m_known_aggs = ipa_copy_agg_values (ctx.m_known_aggs); + m_avals.m_known_aggs + = ipa_copy_agg_values (ctx.m_avals.m_known_aggs); break; } } + + m_avals.m_known_value_ranges = vNULL; } /* Release memory used by known_vals/contexts/aggs vectors. @@ -3404,11 +3365,11 @@ ipa_call_context::release (bool all) /* See if context is initialized at first place. */ if (!m_node) return; - ipa_release_agg_values (m_known_aggs, all); + ipa_release_agg_values (m_avals.m_known_aggs, all); if (all) { - m_known_vals.release (); - m_known_contexts.release (); + m_avals.m_known_vals.release (); + m_avals.m_known_contexts.release (); m_inline_param_summary.release (); } } @@ -3454,77 +3415,81 @@ ipa_call_context::equal_to (const ipa_call_context &ctx) return false; } } - if (m_known_vals.exists () || ctx.m_known_vals.exists ()) + if (m_avals.m_known_vals.exists () || ctx.m_avals.m_known_vals.exists ()) { for (unsigned int i = 0; i < nargs; i++) { if (!ipa_is_param_used_by_indirect_call (params_summary, i)) continue; - if (i >= m_known_vals.length () || !m_known_vals[i]) + if (i >= m_avals.m_known_vals.length () || !m_avals.m_known_vals[i]) { - if (i < ctx.m_known_vals.length () && ctx.m_known_vals[i]) + if (i < ctx.m_avals.m_known_vals.length () + && ctx.m_avals.m_known_vals[i]) return false; continue; } - if (i >= ctx.m_known_vals.length () || !ctx.m_known_vals[i]) + if (i >= ctx.m_avals.m_known_vals.length () + || !ctx.m_avals.m_known_vals[i]) { - if (i < m_known_vals.length () && 
m_known_vals[i]) + if (i < m_avals.m_known_vals.length () && m_avals.m_known_vals[i]) return false; continue; } - if (m_known_vals[i] != ctx.m_known_vals[i]) + if (m_avals.m_known_vals[i] != ctx.m_avals.m_known_vals[i]) return false; } } - if (m_known_contexts.exists () || ctx.m_known_contexts.exists ()) + if (m_avals.m_known_contexts.exists () + || ctx.m_avals.m_known_contexts.exists ()) { for (unsigned int i = 0; i < nargs; i++) { if (!ipa_is_param_used_by_polymorphic_call (params_summary, i)) continue; - if (i >= m_known_contexts.length () - || m_known_contexts[i].useless_p ()) + if (i >= m_avals.m_known_contexts.length () + || m_avals.m_known_contexts[i].useless_p ()) { - if (i < ctx.m_known_contexts.length () - && !ctx.m_known_contexts[i].useless_p ()) + if (i < ctx.m_avals.m_known_contexts.length () + && !ctx.m_avals.m_known_contexts[i].useless_p ()) return false; continue; } - if (i >= ctx.m_known_contexts.length () - || ctx.m_known_contexts[i].useless_p ()) + if (i >= ctx.m_avals.m_known_contexts.length () + || ctx.m_avals.m_known_contexts[i].useless_p ()) { - if (i < m_known_contexts.length () - && !m_known_contexts[i].useless_p ()) + if (i < m_avals.m_known_contexts.length () + && !m_avals.m_known_contexts[i].useless_p ()) return false; continue; } - if (!m_known_contexts[i].equal_to - (ctx.m_known_contexts[i])) + if (!m_avals.m_known_contexts[i].equal_to + (ctx.m_avals.m_known_contexts[i])) return false; } } - if (m_known_aggs.exists () || ctx.m_known_aggs.exists ()) + if (m_avals.m_known_aggs.exists () || ctx.m_avals.m_known_aggs.exists ()) { for (unsigned int i = 0; i < nargs; i++) { if (!ipa_is_param_used_by_indirect_call (params_summary, i)) continue; - if (i >= m_known_aggs.length () || m_known_aggs[i].is_empty ()) + if (i >= m_avals.m_known_aggs.length () + || m_avals.m_known_aggs[i].is_empty ()) { - if (i < ctx.m_known_aggs.length () - && !ctx.m_known_aggs[i].is_empty ()) + if (i < ctx.m_avals.m_known_aggs.length () + && 
!ctx.m_avals.m_known_aggs[i].is_empty ()) return false; continue; } - if (i >= ctx.m_known_aggs.length () - || ctx.m_known_aggs[i].is_empty ()) + if (i >= ctx.m_avals.m_known_aggs.length () + || ctx.m_avals.m_known_aggs[i].is_empty ()) { - if (i < m_known_aggs.length () - && !m_known_aggs[i].is_empty ()) + if (i < m_avals.m_known_aggs.length () + && !m_avals.m_known_aggs[i].is_empty ()) return false; continue; } - if (!m_known_aggs[i].equal_to (ctx.m_known_aggs[i])) + if (!m_avals.m_known_aggs[i].equal_to (ctx.m_avals.m_known_aggs[i])) return false; } } @@ -3574,7 +3539,7 @@ ipa_call_context::estimate_size_and_time (int *ret_size, estimate_calls_size_and_time (m_node, &size, &min_size, ret_time ? &time : NULL, ret_hints ? &hints : NULL, m_possible_truths, - m_known_vals, m_known_contexts, m_known_aggs); + &m_avals); sreal nonspecialized_time = time; @@ -3681,22 +3646,16 @@ ipa_call_context::estimate_size_and_time (int *ret_size, void estimate_ipcp_clone_size_and_time (struct cgraph_node *node, - vec known_vals, - vec - known_contexts, - vec known_aggs, + ipa_auto_call_arg_values *avals, int *ret_size, sreal *ret_time, sreal *ret_nonspec_time, ipa_hints *hints) { clause_t clause, nonspec_clause; - /* TODO: Also pass known value ranges. 
*/ - evaluate_conditions_for_known_args (node, false, known_vals, vNULL, - known_aggs, &clause, &nonspec_clause); - ipa_call_context ctx (node, clause, nonspec_clause, - known_vals, known_contexts, - known_aggs, vNULL); + evaluate_conditions_for_known_args (node, false, avals, &clause, + &nonspec_clause); + ipa_call_context ctx (node, clause, nonspec_clause, vNULL, avals); ctx.estimate_size_and_time (ret_size, NULL, ret_time, ret_nonspec_time, hints); } @@ -3914,10 +3873,8 @@ ipa_merge_fn_summary_after_inlining (struct cgraph_edge *edge) if (callee_info->conds) { - auto_vec known_vals; - auto_vec known_aggs; - evaluate_properties_for_edge (edge, true, &clause, NULL, - &known_vals, NULL, &known_aggs); + ipa_auto_call_arg_values avals; + evaluate_properties_for_edge (edge, true, &clause, NULL, &avals, false); } if (ipa_node_params_sum && callee_info->conds) { @@ -4011,8 +3968,7 @@ ipa_merge_fn_summary_after_inlining (struct cgraph_edge *edge) int edge_size = 0; sreal edge_time = 0; - estimate_edge_size_and_time (edge, &edge_size, NULL, &edge_time, vNULL, - vNULL, vNULL, 0); + estimate_edge_size_and_time (edge, &edge_size, NULL, &edge_time, NULL, 0); /* Unaccount size and time of the optimized out call. */ info->account_size_time (-edge_size, -edge_time, es->predicate ? 
*es->predicate : true, @@ -4054,7 +4010,7 @@ ipa_update_overall_fn_summary (struct cgraph_node *node, bool reset) estimate_calls_size_and_time (node, &size_info->size, &info->min_size, &info->time, NULL, ~(clause_t) (1 << predicate::false_condition), - vNULL, vNULL, vNULL); + NULL); size_info->size = RDIV (size_info->size, ipa_fn_summary::size_scale); info->min_size = RDIV (info->min_size, ipa_fn_summary::size_scale); } diff --git a/gcc/ipa-fnsummary.h b/gcc/ipa-fnsummary.h index c6ddc9f3199..dfcde9f91b8 100644 --- a/gcc/ipa-fnsummary.h +++ b/gcc/ipa-fnsummary.h @@ -297,10 +297,8 @@ public: ipa_call_context (cgraph_node *node, clause_t possible_truths, clause_t nonspec_possible_truths, - vec known_vals, - vec known_contexts, - vec known_aggs, - vec m_inline_param_summary); + vec inline_param_summary, + ipa_auto_call_arg_values *arg_values); ipa_call_context () : m_node(NULL) { @@ -328,14 +326,9 @@ private: /* Inline summary maintains info about change probabilities. */ vec m_inline_param_summary; - /* The following is used only to resolve indirect calls. */ - - /* Vector describing known values of parameters. */ - vec m_known_vals; - /* Vector describing known polymorphic call contexts. */ - vec m_known_contexts; - /* Vector describing known aggregate values. */ - vec m_known_aggs; + /* Even after having calculated clauses, the information about argument + values is used to resolve indirect calls. 
*/ + ipa_call_arg_values m_avals; }; extern fast_call_summary *ipa_call_summaries; @@ -349,9 +342,7 @@ void ipa_free_fn_summary (void); void ipa_free_size_summary (void); void inline_analyze_function (struct cgraph_node *node); void estimate_ipcp_clone_size_and_time (struct cgraph_node *, - vec, - vec, - vec, + ipa_auto_call_arg_values *, int *, sreal *, sreal *, ipa_hints *); void ipa_merge_fn_summary_after_inlining (struct cgraph_edge *edge); @@ -363,10 +354,8 @@ void evaluate_properties_for_edge (struct cgraph_edge *e, bool inline_p, clause_t *clause_ptr, clause_t *nonspec_clause_ptr, - vec *known_vals_ptr, - vec - *known_contexts_ptr, - vec *); + ipa_auto_call_arg_values *avals, + bool compute_contexts); void ipa_fnsummary_c_finalize (void); HOST_WIDE_INT ipa_get_stack_frame_offset (struct cgraph_node *node); diff --git a/gcc/ipa-inline-analysis.c b/gcc/ipa-inline-analysis.c index 148efbc09ef..d2ae8196d09 100644 --- a/gcc/ipa-inline-analysis.c +++ b/gcc/ipa-inline-analysis.c @@ -184,20 +184,16 @@ do_estimate_edge_time (struct cgraph_edge *edge, sreal *ret_nonspec_time) ipa_hints hints; struct cgraph_node *callee; clause_t clause, nonspec_clause; - auto_vec known_vals; - auto_vec known_contexts; - auto_vec known_aggs; + ipa_auto_call_arg_values avals; class ipa_call_summary *es = ipa_call_summaries->get (edge); int min_size = -1; callee = edge->callee->ultimate_alias_target (); gcc_checking_assert (edge->inline_failed); - evaluate_properties_for_edge (edge, true, - &clause, &nonspec_clause, &known_vals, - &known_contexts, &known_aggs); - ipa_call_context ctx (callee, clause, nonspec_clause, known_vals, - known_contexts, known_aggs, es->param); + evaluate_properties_for_edge (edge, true, &clause, &nonspec_clause, + &avals, true); + ipa_call_context ctx (callee, clause, nonspec_clause, es->param, &avals); if (node_context_cache != NULL) { node_context_summary *e = node_context_cache->get_create (callee); @@ -255,7 +251,6 @@ do_estimate_edge_time (struct 
cgraph_edge *edge, sreal *ret_nonspec_time) : edge->caller->count.ipa ()))) hints |= INLINE_HINT_known_hot; - ctx.release (); gcc_checking_assert (size >= 0); gcc_checking_assert (time >= 0); @@ -307,9 +302,6 @@ do_estimate_edge_size (struct cgraph_edge *edge) int size; struct cgraph_node *callee; clause_t clause, nonspec_clause; - auto_vec known_vals; - auto_vec known_contexts; - auto_vec known_aggs; /* When we do caching, use do_estimate_edge_time to populate the entry. */ @@ -325,14 +317,11 @@ do_estimate_edge_size (struct cgraph_edge *edge) /* Early inliner runs without caching, go ahead and do the dirty work. */ gcc_checking_assert (edge->inline_failed); - evaluate_properties_for_edge (edge, true, - &clause, &nonspec_clause, - &known_vals, &known_contexts, - &known_aggs); - ipa_call_context ctx (callee, clause, nonspec_clause, known_vals, - known_contexts, known_aggs, vNULL); + ipa_auto_call_arg_values avals; + evaluate_properties_for_edge (edge, true, &clause, &nonspec_clause, + &avals, true); + ipa_call_context ctx (callee, clause, nonspec_clause, vNULL, &avals); ctx.estimate_size_and_time (&size, NULL, NULL, NULL, NULL); - ctx.release (); return size; } @@ -346,9 +335,6 @@ do_estimate_edge_hints (struct cgraph_edge *edge) ipa_hints hints; struct cgraph_node *callee; clause_t clause, nonspec_clause; - auto_vec known_vals; - auto_vec known_contexts; - auto_vec known_aggs; /* When we do caching, use do_estimate_edge_time to populate the entry. */ @@ -364,14 +350,11 @@ do_estimate_edge_hints (struct cgraph_edge *edge) /* Early inliner runs without caching, go ahead and do the dirty work. 
*/ gcc_checking_assert (edge->inline_failed); - evaluate_properties_for_edge (edge, true, - &clause, &nonspec_clause, - &known_vals, &known_contexts, - &known_aggs); - ipa_call_context ctx (callee, clause, nonspec_clause, known_vals, - known_contexts, known_aggs, vNULL); + ipa_auto_call_arg_values avals; + evaluate_properties_for_edge (edge, true, &clause, &nonspec_clause, + &avals, true); + ipa_call_context ctx (callee, clause, nonspec_clause, vNULL, &avals); ctx.estimate_size_and_time (NULL, NULL, NULL, NULL, &hints); - ctx.release (); hints |= simple_edge_hints (edge); return hints; } diff --git a/gcc/ipa-prop.c b/gcc/ipa-prop.c index b28c78eeab4..230625a89bb 100644 --- a/gcc/ipa-prop.c +++ b/gcc/ipa-prop.c @@ -5795,4 +5795,14 @@ ipa_agg_value::equal_to (const ipa_agg_value &other) return offset == other.offset && operand_equal_p (value, other.value, 0); } + +/* Destructor also removing individual aggregate values. */ + +ipa_auto_call_arg_values::~ipa_auto_call_arg_values () +{ + ipa_release_agg_values (m_known_aggs, false); +} + + + #include "gt-ipa-prop.h" diff --git a/gcc/ipa-prop.h b/gcc/ipa-prop.h index 23fcf905ef3..8b2edf6300c 100644 --- a/gcc/ipa-prop.h +++ b/gcc/ipa-prop.h @@ -433,6 +433,107 @@ ipa_get_jf_ancestor_type_preserved (struct ipa_jump_func *jfunc) return jfunc->value.ancestor.agg_preserved; } +/* Class for allocating a bundle of various potentially known properties about + actual arguments of a particular call on stack for the usual case and on + heap only if there are unusually many arguments. The data is deallocated + when the instance of this class goes out of scope or is otherwise + destructed. */ + +class ipa_auto_call_arg_values +{ +public: + ~ipa_auto_call_arg_values (); + + /* If m_known_vals (vector of known "scalar" values) is sufficiently long, + return its element at INDEX, otherwise return NULL. */ + tree safe_sval_at (int index) + { + /* TODO: Assert non-negative index here and test. 
*/ + if ((unsigned) index < m_known_vals.length ()) + return m_known_vals[index]; + return NULL; + } + + /* If m_known_aggs is sufficiently long, return the pointer to its element + at INDEX, otherwise return NULL. */ + ipa_agg_value_set *safe_aggval_at (int index) + { + /* TODO: Assert non-negative index here and test. */ + if ((unsigned) index < m_known_aggs.length ()) + return &m_known_aggs[index]; + return NULL; + } + + /* Vector describing known values of parameters. */ + auto_vec m_known_vals; + + /* Vector describing known polymorphic call contexts. */ + auto_vec m_known_contexts; + + /* Vector describing known aggregate values. */ + auto_vec m_known_aggs; + + /* Vector describing known value ranges of arguments. */ + auto_vec m_known_value_ranges; +}; + +/* Class bundling the various potentially known properties about actual + arguments of a particular call. This variant does not deallocate the + bundled data in any way. */ + +class ipa_call_arg_values +{ +public: + /* Default constructor, setting the vectors to empty ones. */ + ipa_call_arg_values () + {} + + /* Construct this general variant of the bundle from the variant which uses + auto_vecs to hold the vectors. This means that vectors of objects + constructed with this constructor should not be changed because if they + get reallocated, the member vectors and the underlying auto_vecs would get + out of sync. */ + ipa_call_arg_values (ipa_auto_call_arg_values *aavals) + : m_known_vals (aavals->m_known_vals), + m_known_contexts (aavals->m_known_contexts), + m_known_aggs (aavals->m_known_aggs), + m_known_value_ranges (aavals->m_known_value_ranges) + {} + + /* If m_known_vals (vector of known "scalar" values) is sufficiently long, + return its element at INDEX, otherwise return NULL. */ + tree safe_sval_at (int index) + { + /* TODO: Assert non-negative index here and test. 
*/ + if ((unsigned) index < m_known_vals.length ()) + return m_known_vals[index]; + return NULL; + } + + /* If m_known_aggs is sufficiently long, return the pointer to its element + at INDEX, otherwise return NULL. */ + ipa_agg_value_set *safe_aggval_at (int index) + { + /* TODO: Assert non-negative index here and test. */ + if ((unsigned) index < m_known_aggs.length ()) + return &m_known_aggs[index]; + return NULL; + } + + /* Vector describing known values of parameters. */ + vec m_known_vals = vNULL; + + /* Vector describing known polymorphic call contexts. */ + vec m_known_contexts = vNULL; + + /* Vector describing known aggregate values. */ + vec m_known_aggs = vNULL; + + /* Vector describing known value ranges of arguments. */ + vec m_known_value_ranges = vNULL; +}; + + /* Summary describing a single formal parameter. */ struct GTY(()) ipa_param_descriptor @@ -970,12 +1071,13 @@ void ipa_initialize_node_params (struct cgraph_node *node); bool ipa_propagate_indirect_call_infos (struct cgraph_edge *cs, vec *new_edges); -/* Indirect edge and binfo processing. */ +/* Indirect edge processing and target discovery. 
*/ tree ipa_get_indirect_edge_target (struct cgraph_edge *ie, - vec, - vec, - vec, - bool *); + ipa_call_arg_values *avals, + bool *speculative); +tree ipa_get_indirect_edge_target (struct cgraph_edge *ie, + ipa_auto_call_arg_values *avals, + bool *speculative); struct cgraph_edge *ipa_make_edge_direct_to_target (struct cgraph_edge *, tree, bool speculative = false); tree ipa_impossible_devirt_target (struct cgraph_edge *, tree); From patchwork Mon Sep 7 19:33:56 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Martin Jambor X-Patchwork-Id: 1359230 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=8.43.85.97; helo=sourceware.org; envelope-from=gcc-patches-bounces@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=suse.cz Received: from sourceware.org (server2.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4Bldkf2T7Pz9sTK for ; Tue, 8 Sep 2020 05:34:02 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id C265F393C86C; Mon, 7 Sep 2020 19:33:59 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mx2.suse.de (mx2.suse.de [195.135.220.15]) by sourceware.org (Postfix) with ESMTPS id 5CEFB386186A for ; Mon, 7 Sep 2020 19:33:57 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 5CEFB386186A Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=suse.cz Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=mjambor@suse.cz X-Virus-Scanned: 
by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 3CB6AAC3F for ; Mon, 7 Sep 2020 19:33:57 +0000 (UTC) From: Martin Jambor To: GCC Patches Subject: [PATCH 2/6] ipa: Introduce ipa_cached_call_context User-Agent: Notmuch/0.30 (https://notmuchmail.org) Emacs/26.3 (x86_64-suse-linux-gnu) Date: Mon, 07 Sep 2020 21:33:56 +0200 Message-ID: MIME-Version: 1.0 X-Spam-Status: No, score=-3038.0 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_STATUS, RCVD_IN_MSPIKE_H3, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Jan Hubicka Errors-To: gcc-patches-bounces@gcc.gnu.org Sender: "Gcc-patches" Hi, as we discussed with Honza on the mailing list last week, making the cached call context structure distinct from the normal one may make it clearer that the cached data need to be explicitly deallocated. This patch implements that division. It is not mandatory for the overall main goals of the patch set and can be dropped if deemed superfluous. Last week I bootstrapped and tested (and LTO-bootstrapped) this patch individually, this week I did so for the whole patch set. OK for trunk? Thanks, Martin gcc/ChangeLog: 2020-09-02 Martin Jambor * ipa-fnsummary.h (ipa_cached_call_context): New forward declaration and class. (class ipa_call_context): Make friend ipa_cached_call_context. Moved methods duplicate_from and release to it too. * ipa-fnsummary.c (ipa_call_context::duplicate_from): Moved to class ipa_cached_call_context. (ipa_call_context::release): Likewise, removed the parameter. * ipa-inline-analysis.c (node_context_cache_entry): Change the type of ctx to ipa_cached_call_context. 
(do_estimate_edge_time): Remove parameter from the call to ipa_cached_call_context::release. --- gcc/ipa-fnsummary.c | 21 ++++++++------------- gcc/ipa-fnsummary.h | 16 ++++++++++++++-- gcc/ipa-inline-analysis.c | 4 ++-- 3 files changed, 24 insertions(+), 17 deletions(-) diff --git a/gcc/ipa-fnsummary.c b/gcc/ipa-fnsummary.c index d8b95aca307..aaccc203b3b 100644 --- a/gcc/ipa-fnsummary.c +++ b/gcc/ipa-fnsummary.c @@ -3284,7 +3284,7 @@ ipa_call_context::ipa_call_context (cgraph_node *node, clause_t possible_truths, /* Set THIS to be a duplicate of CTX. Copy all relevant info. */ void -ipa_call_context::duplicate_from (const ipa_call_context &ctx) +ipa_cached_call_context::duplicate_from (const ipa_call_context &ctx) { m_node = ctx.m_node; m_possible_truths = ctx.m_possible_truths; @@ -3354,24 +3354,19 @@ ipa_call_context::duplicate_from (const ipa_call_context &ctx) m_avals.m_known_value_ranges = vNULL; } -/* Release memory used by known_vals/contexts/aggs vectors. - If ALL is true release also inline_param_summary. - This happens when context was previously duplicated to be stored - into cache. */ +/* Release memory used by known_vals/contexts/aggs vectors. and + inline_param_summary. */ void -ipa_call_context::release (bool all) +ipa_cached_call_context::release () { /* See if context is initialized at first place. */ if (!m_node) return; - ipa_release_agg_values (m_avals.m_known_aggs, all); - if (all) - { - m_avals.m_known_vals.release (); - m_avals.m_known_contexts.release (); - m_inline_param_summary.release (); - } + ipa_release_agg_values (m_avals.m_known_aggs, true); + m_avals.m_known_vals.release (); + m_avals.m_known_contexts.release (); + m_inline_param_summary.release (); } /* Return true if CTX describes the same call context as THIS. 
*/ diff --git a/gcc/ipa-fnsummary.h b/gcc/ipa-fnsummary.h index dfcde9f91b8..8e5659f62aa 100644 --- a/gcc/ipa-fnsummary.h +++ b/gcc/ipa-fnsummary.h @@ -287,6 +287,8 @@ public: ipa_call_summary *dst_data); }; +class ipa_cached_call_context; + /* This object describe a context of call. That is a summary of known information about its parameters. Main purpose of this context is to give more realistic estimations of function runtime, size and @@ -307,8 +309,6 @@ public: sreal *ret_time, sreal *ret_nonspecialized_time, ipa_hints *ret_hints); - void duplicate_from (const ipa_call_context &ctx); - void release (bool all = false); bool equal_to (const ipa_call_context &); bool exists_p () { @@ -329,6 +329,18 @@ private: /* Even after having calculated clauses, the information about argument values is used to resolve indirect calls. */ ipa_call_arg_values m_avals; + + friend ipa_cached_call_context; +}; + +/* Variant of ipa_call_context that is stored in a cache over a longer period + of time. */ + +class ipa_cached_call_context : public ipa_call_context +{ +public: + void duplicate_from (const ipa_call_context &ctx); + void release (); }; extern fast_call_summary *ipa_call_summaries; diff --git a/gcc/ipa-inline-analysis.c b/gcc/ipa-inline-analysis.c index d2ae8196d09..b7af77f7b9b 100644 --- a/gcc/ipa-inline-analysis.c +++ b/gcc/ipa-inline-analysis.c @@ -57,7 +57,7 @@ fast_call_summary *edge_growth_cache = NULL; class node_context_cache_entry { public: - ipa_call_context ctx; + ipa_cached_call_context ctx; sreal time, nonspec_time; int size; ipa_hints hints; @@ -226,7 +226,7 @@ do_estimate_edge_time (struct cgraph_edge *edge, sreal *ret_nonspec_time) node_context_cache_miss++; else node_context_cache_clear++; - e->entry.ctx.release (true); + e->entry.ctx.release (); ctx.estimate_size_and_time (&size, &min_size, &time, &nonspec_time, &hints); e->entry.size = size; From patchwork Mon Sep 7 19:35:35 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Martin Jambor
X-Patchwork-Id: 1359231
From: Martin Jambor
To: GCC Patches
Subject: [PATCH 3/6] ipa: Bundle estimates of ipa_call_context::estimate_size_and_time
Date: Mon, 07 Sep 2020 21:35:35 +0200
Cc: Jan Hubicka

Hi,

a subsequent patch adds another two estimates computed by
ipa_call_context::estimate_size_and_time.  Because the function has a
separate output parameter for each value it computes, that would leave
it with too many parameters.  Therefore, this patch collapses all
those output parameters into one output structure.

Last week I bootstrapped and tested (and LTO-bootstrapped) the series
up to this patch in particular; this week I did so for the whole patch
set.  OK for trunk?

Thanks,

Martin

gcc/ChangeLog:

2020-09-02  Martin Jambor

	* ipa-inline-analysis.c (do_estimate_edge_time): Adjusted to use
	ipa_call_estimates.
	(do_estimate_edge_size): Likewise.
	(do_estimate_edge_hints): Likewise.
	* ipa-fnsummary.h (struct ipa_call_estimates): New type.
	(ipa_call_context::estimate_size_and_time): Adjusted declaration.
	(estimate_ipcp_clone_size_and_time): Likewise.
	* ipa-cp.c (hint_time_bonus): Changed the type of the second argument
	to ipa_call_estimates.
	(perform_estimation_of_a_value): Adjusted to use ipa_call_estimates.
	(estimate_local_effects): Likewise.
	* ipa-fnsummary.c (ipa_call_context::estimate_size_and_time): Adjusted
	to return estimates in a single ipa_call_estimates parameter.
	(estimate_ipcp_clone_size_and_time): Likewise.
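
For readers skimming the series, the change in calling convention can be
sketched roughly as follows.  This is only an illustration and not code
from the patch: call_estimates and the toy estimator are hypothetical
stand-ins, double and unsigned are used in place of sreal and ipa_hints,
and the constants are made up.

```cpp
#include <cassert>

// Hypothetical stand-in for ipa_call_estimates (plain types, not GCC's).
struct call_estimates
{
  int size = 0;
  int min_size = 0;
  double time = 0.0;
  double nonspecialized_time = 0.0;
  unsigned hints = 0;
};

// Toy estimator with made-up constants.  Sizes are always filled in;
// times and hints are only written when the caller asks for them.
void
estimate_size_and_time (call_estimates *estimates,
                        bool est_times = true, bool est_hints = true)
{
  estimates->size = 12;
  estimates->min_size = 4;
  if (est_times)
    {
      estimates->time = 3.5;
      estimates->nonspecialized_time = 5.0;
    }
  if (est_hints)
    estimates->hints = 1u;
}

// A size-only query: one structure and two flags replace the list of
// NULL output pointers the old interface needed.
int
estimate_size_only ()
{
  call_estimates estimates;
  estimate_size_and_time (&estimates, false, false);
  assert (estimates.time == 0.0 && estimates.hints == 0);
  return estimates.size;
}
```

A caller that wants everything simply passes the structure and reads
the fields it needs, which is the shape the adjusted callers in
ipa-cp.c and ipa-inline-analysis.c take.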
--- gcc/ipa-cp.c | 45 ++++++++++++++--------------- gcc/ipa-fnsummary.c | 60 +++++++++++++++++++-------------------- gcc/ipa-fnsummary.h | 36 +++++++++++++++++------ gcc/ipa-inline-analysis.c | 47 +++++++++++++++++------------- 4 files changed, 105 insertions(+), 83 deletions(-) diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c index 292dd7e5bdf..77c84a6ed5d 100644 --- a/gcc/ipa-cp.c +++ b/gcc/ipa-cp.c @@ -3196,12 +3196,13 @@ devirtualization_time_bonus (struct cgraph_node *node, return res; } -/* Return time bonus incurred because of HINTS. */ +/* Return time bonus incurred because of hints stored in ESTIMATES. */ static int -hint_time_bonus (cgraph_node *node, ipa_hints hints) +hint_time_bonus (cgraph_node *node, const ipa_call_estimates &estimates) { int result = 0; + ipa_hints hints = estimates.hints; if (hints & (INLINE_HINT_loop_iterations | INLINE_HINT_loop_stride)) result += opt_for_fn (node->decl, param_ipa_cp_loop_hint_bonus); return result; @@ -3397,15 +3398,13 @@ perform_estimation_of_a_value (cgraph_node *node, int removable_params_cost, int est_move_cost, ipcp_value_base *val) { - int size, time_benefit; - sreal time, base_time; - ipa_hints hints; + int time_benefit; + ipa_call_estimates estimates; - estimate_ipcp_clone_size_and_time (node, avals, &size, &time, - &base_time, &hints); - base_time -= time; - if (base_time > 65535) - base_time = 65535; + estimate_ipcp_clone_size_and_time (node, avals, &estimates); + sreal time_delta = estimates.nonspecialized_time - estimates.time; + if (time_delta > 65535) + time_delta = 65535; /* Extern inline functions have no cloning local time benefits because they will be inlined anyway. 
The only reason to clone them is if it enables @@ -3413,11 +3412,12 @@ perform_estimation_of_a_value (cgraph_node *node, if (DECL_EXTERNAL (node->decl) && DECL_DECLARED_INLINE_P (node->decl)) time_benefit = 0; else - time_benefit = base_time.to_int () + time_benefit = time_delta.to_int () + devirtualization_time_bonus (node, avals) - + hint_time_bonus (node, hints) + + hint_time_bonus (node, estimates) + removable_params_cost + est_move_cost; + int size = estimates.size; gcc_checking_assert (size >=0); /* The inliner-heuristics based estimates may think that in certain contexts some functions do not have any size at all but we want @@ -3472,23 +3472,21 @@ estimate_local_effects (struct cgraph_node *node) || (removable_params_cost && node->can_change_signature)) { struct caller_statistics stats; - ipa_hints hints; - sreal time, base_time; - int size; + ipa_call_estimates estimates; init_caller_stats (&stats); node->call_for_symbol_thunks_and_aliases (gather_caller_stats, &stats, false); - estimate_ipcp_clone_size_and_time (node, &avals, &size, &time, - &base_time, &hints); - time -= devirt_bonus; - time -= hint_time_bonus (node, hints); - time -= removable_params_cost; - size -= stats.n_calls * removable_params_cost; + estimate_ipcp_clone_size_and_time (node, &avals, &estimates); + sreal time = estimates.nonspecialized_time - estimates.time; + time += devirt_bonus; + time += hint_time_bonus (node, estimates); + time += removable_params_cost; + int size = estimates.size - stats.n_calls * removable_params_cost; if (dump_file) fprintf (dump_file, " - context independent values, size: %i, " - "time_benefit: %f\n", size, (base_time - time).to_double ()); + "time_benefit: %f\n", size, (time).to_double ()); if (size <= 0 || node->local) { @@ -3499,8 +3497,7 @@ estimate_local_effects (struct cgraph_node *node) "known contexts, code not going to grow.\n"); } else if (good_cloning_opportunity_p (node, - MIN ((base_time - time).to_int (), - 65536), + MIN ((time).to_int (), 
65536), stats.freq_sum, stats.count_sum, size)) { diff --git a/gcc/ipa-fnsummary.c b/gcc/ipa-fnsummary.c index aaccc203b3b..2404a92291c 100644 --- a/gcc/ipa-fnsummary.c +++ b/gcc/ipa-fnsummary.c @@ -3491,18 +3491,14 @@ ipa_call_context::equal_to (const ipa_call_context &ctx) return true; } -/* Estimate size and time needed to execute call in the given context. - Additionally determine hints determined by the context. Finally compute - minimal size needed for the call that is independent on the call context and - can be used for fast estimates. Return the values in RET_SIZE, - RET_MIN_SIZE, RET_TIME and RET_HINTS. */ +/* Fill in the selected fields in ESTIMATES with value estimated for call in + this context. Always compute size and min_size. Only compute time and + nonspecialized_time if EST_TIMES is true. Only compute hints if EST_HINTS + is true. */ void -ipa_call_context::estimate_size_and_time (int *ret_size, - int *ret_min_size, - sreal *ret_time, - sreal *ret_nonspecialized_time, - ipa_hints *ret_hints) +ipa_call_context::estimate_size_and_time (ipa_call_estimates *estimates, + bool est_times, bool est_hints) { class ipa_fn_summary *info = ipa_fn_summaries->get (m_node); size_time_entry *e; @@ -3532,8 +3528,8 @@ ipa_call_context::estimate_size_and_time (int *ret_size, if (m_node->callees || m_node->indirect_calls) estimate_calls_size_and_time (m_node, &size, &min_size, - ret_time ? &time : NULL, - ret_hints ? &hints : NULL, m_possible_truths, + est_times ? &time : NULL, + est_hints ? &hints : NULL, m_possible_truths, &m_avals); sreal nonspecialized_time = time; @@ -3560,7 +3556,7 @@ ipa_call_context::estimate_size_and_time (int *ret_size, known to be constant in a specialized setting. 
*/ if (nonconst) size += e->size; - if (!ret_time) + if (!est_times) continue; nonspecialized_time += e->time; if (!nonconst) @@ -3600,7 +3596,7 @@ ipa_call_context::estimate_size_and_time (int *ret_size, if (time > nonspecialized_time) time = nonspecialized_time; - if (ret_hints) + if (est_hints) { if (info->loop_iterations && !info->loop_iterations->evaluate (m_possible_truths)) @@ -3618,18 +3614,23 @@ ipa_call_context::estimate_size_and_time (int *ret_size, min_size = RDIV (min_size, ipa_fn_summary::size_scale); if (dump_file && (dump_flags & TDF_DETAILS)) - fprintf (dump_file, "\n size:%i time:%f nonspec time:%f\n", (int) size, - time.to_double (), nonspecialized_time.to_double ()); - if (ret_time) - *ret_time = time; - if (ret_nonspecialized_time) - *ret_nonspecialized_time = nonspecialized_time; - if (ret_size) - *ret_size = size; - if (ret_min_size) - *ret_min_size = min_size; - if (ret_hints) - *ret_hints = hints; + { + if (est_times) + fprintf (dump_file, "\n size:%i time:%f nonspec time:%f\n", + (int) size, time.to_double (), + nonspecialized_time.to_double ()); + else + fprintf (dump_file, "\n size:%i (time not estimated)\n", (int) size); + } + if (est_times) + { + estimates->time = time; + estimates->nonspecialized_time = nonspecialized_time; + } + estimates->size = size; + estimates->min_size = min_size; + if (est_hints) + estimates->hints = hints; return; } @@ -3642,17 +3643,14 @@ ipa_call_context::estimate_size_and_time (int *ret_size, void estimate_ipcp_clone_size_and_time (struct cgraph_node *node, ipa_auto_call_arg_values *avals, - int *ret_size, sreal *ret_time, - sreal *ret_nonspec_time, - ipa_hints *hints) + ipa_call_estimates *estimates) { clause_t clause, nonspec_clause; evaluate_conditions_for_known_args (node, false, avals, &clause, &nonspec_clause); ipa_call_context ctx (node, clause, nonspec_clause, vNULL, avals); - ctx.estimate_size_and_time (ret_size, NULL, ret_time, - ret_nonspec_time, hints); + ctx.estimate_size_and_time (estimates); 
} /* Return stack frame offset where frame of NODE is supposed to start inside diff --git a/gcc/ipa-fnsummary.h b/gcc/ipa-fnsummary.h index 8e5659f62aa..7b68e7ce096 100644 --- a/gcc/ipa-fnsummary.h +++ b/gcc/ipa-fnsummary.h @@ -287,6 +287,29 @@ public: ipa_call_summary *dst_data); }; +/* Estimated execution times, code sizes and other information about the + code executing a call described by ipa_call_context. */ + +struct ipa_call_estimates +{ + /* Estimated size needed to execute call in the given context. */ + int size; + + /* Minimal size needed for the call that is + independent on the call context + and can be used for fast estimates. */ + int min_size; + + /* Estimated time needed to execute call in the given context. */ + sreal time; + + /* Estimated time needed to execute the function when not ignoring + computations known to be constant in this context. */ + sreal nonspecialized_time; + + /* Further discovered reasons why to inline or specialize the give calls. */ + ipa_hints hints; +}; + class ipa_cached_call_context; /* This object describe a context of call. 
That is a summary of known @@ -305,10 +328,8 @@ public: : m_node(NULL) { } - void estimate_size_and_time (int *ret_size, int *ret_min_size, - sreal *ret_time, - sreal *ret_nonspecialized_time, - ipa_hints *ret_hints); + void estimate_size_and_time (ipa_call_estimates *estimates, + bool est_times = true, bool est_hints = true); bool equal_to (const ipa_call_context &); bool exists_p () { @@ -353,10 +374,9 @@ void ipa_dump_hints (FILE *f, ipa_hints); void ipa_free_fn_summary (void); void ipa_free_size_summary (void); void inline_analyze_function (struct cgraph_node *node); -void estimate_ipcp_clone_size_and_time (struct cgraph_node *, - ipa_auto_call_arg_values *, - int *, sreal *, sreal *, - ipa_hints *); +void estimate_ipcp_clone_size_and_time (struct cgraph_node *node, + ipa_auto_call_arg_values *avals, + ipa_call_estimates *estimates); void ipa_merge_fn_summary_after_inlining (struct cgraph_edge *edge); void ipa_update_overall_fn_summary (struct cgraph_node *node, bool reset = true); void compute_fn_summary (struct cgraph_node *, bool); diff --git a/gcc/ipa-inline-analysis.c b/gcc/ipa-inline-analysis.c index b7af77f7b9b..acbf82e84d9 100644 --- a/gcc/ipa-inline-analysis.c +++ b/gcc/ipa-inline-analysis.c @@ -208,16 +208,12 @@ do_estimate_edge_time (struct cgraph_edge *edge, sreal *ret_nonspec_time) && !opt_for_fn (callee->decl, flag_profile_partial_training) && !callee->count.ipa_p ()) { - sreal chk_time, chk_nonspec_time; - int chk_size, chk_min_size; - - ipa_hints chk_hints; - ctx.estimate_size_and_time (&chk_size, &chk_min_size, - &chk_time, &chk_nonspec_time, - &chk_hints); - gcc_assert (chk_size == size && chk_time == time - && chk_nonspec_time == nonspec_time - && chk_hints == hints); + ipa_call_estimates chk_estimates; + ctx.estimate_size_and_time (&chk_estimates); + gcc_assert (chk_estimates.size == size + && chk_estimates.time == time + && chk_estimates.nonspecialized_time == nonspec_time + && chk_estimates.hints == hints); } } else @@ -227,18 +223,28 @@ 
do_estimate_edge_time (struct cgraph_edge *edge, sreal *ret_nonspec_time) else node_context_cache_clear++; e->entry.ctx.release (); - ctx.estimate_size_and_time (&size, &min_size, - &time, &nonspec_time, &hints); + ipa_call_estimates estimates; + ctx.estimate_size_and_time (&estimates); + size = estimates.size; e->entry.size = size; + time = estimates.time; e->entry.time = time; + nonspec_time = estimates.nonspecialized_time; e->entry.nonspec_time = nonspec_time; + hints = estimates.hints; e->entry.hints = hints; e->entry.ctx.duplicate_from (ctx); } } else - ctx.estimate_size_and_time (&size, &min_size, - &time, &nonspec_time, &hints); + { + ipa_call_estimates estimates; + ctx.estimate_size_and_time (&estimates); + size = estimates.size; + time = estimates.time; + nonspec_time = estimates.nonspecialized_time; + hints = estimates.hints; + } /* When we have profile feedback, we can quite safely identify hot edges and for those we disable size limits. Don't do that when @@ -321,8 +327,9 @@ do_estimate_edge_size (struct cgraph_edge *edge) evaluate_properties_for_edge (edge, true, &clause, &nonspec_clause, &avals, true); ipa_call_context ctx (callee, clause, nonspec_clause, vNULL, &avals); - ctx.estimate_size_and_time (&size, NULL, NULL, NULL, NULL); - return size; + ipa_call_estimates estimates; + ctx.estimate_size_and_time (&estimates, false, false); + return estimates.size; } @@ -332,7 +339,6 @@ do_estimate_edge_size (struct cgraph_edge *edge) ipa_hints do_estimate_edge_hints (struct cgraph_edge *edge) { - ipa_hints hints; struct cgraph_node *callee; clause_t clause, nonspec_clause; @@ -341,7 +347,7 @@ do_estimate_edge_hints (struct cgraph_edge *edge) if (edge_growth_cache != NULL) { do_estimate_edge_time (edge); - hints = edge_growth_cache->get (edge)->hints; + ipa_hints hints = edge_growth_cache->get (edge)->hints; gcc_checking_assert (hints); return hints - 1; } @@ -354,8 +360,9 @@ do_estimate_edge_hints (struct cgraph_edge *edge) evaluate_properties_for_edge 
(edge, true, &clause, &nonspec_clause, &avals, true); ipa_call_context ctx (callee, clause, nonspec_clause, vNULL, &avals); - ctx.estimate_size_and_time (NULL, NULL, NULL, NULL, &hints); - hints |= simple_edge_hints (edge); + ipa_call_estimates estimates; + ctx.estimate_size_and_time (&estimates, false, true); + ipa_hints hints = estimates.hints | simple_edge_hints (edge); return hints; } From patchwork Mon Sep 7 19:36:43 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Martin Jambor X-Patchwork-Id: 1359232 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=sourceware.org; envelope-from=gcc-patches-bounces@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=suse.cz Received: from sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4Bldnt5nTrz9sSP for ; Tue, 8 Sep 2020 05:36:50 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id A01FF394743D; Mon, 7 Sep 2020 19:36:48 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mx2.suse.de (mx2.suse.de [195.135.220.15]) by sourceware.org (Postfix) with ESMTPS id 88040394743C for ; Mon, 7 Sep 2020 19:36:44 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 88040394743C Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=suse.cz Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=mjambor@suse.cz X-Virus-Scanned: 
by amavisd-new at test-mx.suse.de
From: Martin Jambor
To: GCC Patches
Subject: [PATCH 4/6] ipa: Multiple predicates for loop properties, with frequencies
Date: Mon, 07 Sep 2020 21:36:43 +0200
Cc: Jan Hubicka

Hi,

this patch enhances the ability of IPA to reason about the conditions
under which loops in a function have known iteration counts or
strides.  It replaces the single predicates, which currently hold a
conjunction of predicates for all loops, with vectors capable of
holding multiple predicates, each with a cumulative frequency of loops
with the property.

IPA-CP then uses this frequency to boost much more aggressively its
heuristic score for cloning opportunities which make iteration counts
or strides of frequent loops compile-time constants.

Last week I bootstrapped and tested (and LTO-bootstrapped) the series
up to this patch in particular; this week I did so for the whole patch
set.  OK for trunk?

Thanks,

Martin

gcc/ChangeLog:

2020-09-03  Martin Jambor

	* ipa-fnsummary.h (ipa_freqcounting_predicate): New type.
	(ipa_fn_summary): Change the type of loop_iterations and loop_strides
	to vectors of ipa_freqcounting_predicate.
	(ipa_fn_summary::ipa_fn_summary): Construct the new vectors.
	(ipa_call_estimates): New fields loops_with_known_iterations and
	loops_with_known_strides.
	* ipa-cp.c (hint_time_bonus): Multiply param_ipa_cp_loop_hint_bonus
	with the expected frequencies of loops with known iteration count or
	stride.
	* ipa-fnsummary.c (add_freqcounting_predicate): New function.
	(ipa_fn_summary::~ipa_fn_summary): Release the new vectors instead of
	just two predicates.
	(remap_hint_predicate_after_duplication): Replace with function
	remap_freqcounting_preds_after_dup.
	(ipa_fn_summary_t::duplicate): Use it or duplicate new vectors.
	(ipa_dump_fn_summary): Dump the new vectors.
	(analyze_function_body): Compute the loop property vectors.
	(ipa_call_context::estimate_size_and_time): Calculate also
	loops_with_known_iterations and loops_with_known_strides.  Adjusted
	dumping accordingly.
	(remap_hint_predicate): Replace with function
	remap_freqcounting_predicate.
	(ipa_merge_fn_summary_after_inlining): Use it.
	(inline_read_section): Stream loopcounting vectors instead of two
	simple predicates.
	(ipa_fn_summary_write): Likewise.
	* params.opt (ipa-max-loop-predicates): New parameter.
	* doc/invoke.texi (ipa-max-loop-predicates): Document new param.

gcc/testsuite/ChangeLog:

2020-09-03  Martin Jambor

	* gcc.dg/ipa/ipcp-loophint-1.c: New test.
---
 gcc/doc/invoke.texi | 4 +
 gcc/ipa-cp.c | 9 +
 gcc/ipa-fnsummary.c | 318 ++++++++++++++-------
 gcc/ipa-fnsummary.h | 38 ++-
 gcc/params.opt | 4 +
 gcc/testsuite/gcc.dg/ipa/ipcp-loophint-1.c | 29 ++
 6 files changed, 288 insertions(+), 114 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/ipa/ipcp-loophint-1.c

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index bca8c856dc8..d539e58a11c 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -13259,6 +13259,10 @@ of iterations of a loop known, it adds a bonus of @option{ipa-cp-loop-hint-bonus} to the profitability score of the candidate. 
+@item ipa-max-loop-predicates +The maximum number of different predicates IPA will use to describe when +loops in a function have known properties. + @item ipa-max-aa-steps During its analysis of function bodies, IPA-CP employs alias analysis in order to track values pointed to by function parameters. In order diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c index 77c84a6ed5d..f6320c787de 100644 --- a/gcc/ipa-cp.c +++ b/gcc/ipa-cp.c @@ -3205,6 +3205,15 @@ hint_time_bonus (cgraph_node *node, const ipa_call_estimates &estimates) ipa_hints hints = estimates.hints; if (hints & (INLINE_HINT_loop_iterations | INLINE_HINT_loop_stride)) result += opt_for_fn (node->decl, param_ipa_cp_loop_hint_bonus); + + sreal bonus_for_one = opt_for_fn (node->decl, param_ipa_cp_loop_hint_bonus); + + if (hints & INLINE_HINT_loop_iterations) + result += (estimates.loops_with_known_iterations * bonus_for_one).to_int (); + + if (hints & INLINE_HINT_loop_stride) + result += (estimates.loops_with_known_strides * bonus_for_one).to_int (); + return result; } diff --git a/gcc/ipa-fnsummary.c b/gcc/ipa-fnsummary.c index 2404a92291c..aec1c319a65 100644 --- a/gcc/ipa-fnsummary.c +++ b/gcc/ipa-fnsummary.c @@ -310,6 +310,36 @@ set_hint_predicate (predicate **p, predicate new_predicate) } } +/* Find if NEW_PREDICATE is already in V and if so, increment its freq. + Otherwise add a new item to the vector with this predicate and frerq equal + to add_freq, unless the number of predicates would exceed MAX_NUM_PREDICATES + in which case the function does nothing. 
*/ + +static void +add_freqcounting_predicate (vec **v, + const predicate &new_predicate, sreal add_freq, + unsigned max_num_predicates) +{ + if (new_predicate == false || new_predicate == true) + return; + ipa_freqcounting_predicate *f; + for (int i = 0; vec_safe_iterate (*v, i, &f); i++) + if (new_predicate == f->predicate) + { + f->freq += add_freq; + return; + } + if (vec_safe_length (*v) >= max_num_predicates) + /* Too many different predicates to account for. */ + return; + + ipa_freqcounting_predicate fcp; + fcp.predicate = NULL; + set_hint_predicate (&fcp.predicate, new_predicate); + fcp.freq = add_freq; + vec_safe_push (*v, fcp); + return; +} /* Compute what conditions may or may not hold given information about parameters. RET_CLAUSE returns truths that may hold in a specialized copy, @@ -710,13 +740,17 @@ ipa_call_summary::~ipa_call_summary () ipa_fn_summary::~ipa_fn_summary () { - if (loop_iterations) - edge_predicate_pool.remove (loop_iterations); - if (loop_stride) - edge_predicate_pool.remove (loop_stride); + unsigned len = vec_safe_length (loop_iterations); + for (unsigned i = 0; i < len; i++) + edge_predicate_pool.remove ((*loop_iterations)[i].predicate); + len = vec_safe_length (loop_strides); + for (unsigned i = 0; i < len; i++) + edge_predicate_pool.remove ((*loop_strides)[i].predicate); vec_free (conds); vec_free (size_time_table); vec_free (call_size_time_table); + vec_free (loop_iterations); + vec_free (loop_strides); } void @@ -729,24 +763,33 @@ ipa_fn_summary_t::remove_callees (cgraph_node *node) ipa_call_summaries->remove (e); } -/* Same as remap_predicate_after_duplication but handle hint predicate *P. - Additionally care about allocating new memory slot for updated predicate - and set it to NULL when it becomes true or false (and thus uninteresting). - */ +/* Duplicate predicates in loop hint vector, allocating memory for them and + remove and deallocate any uninteresting (true or false) ones. Return the + result. 
*/ -static void -remap_hint_predicate_after_duplication (predicate **p, - clause_t possible_truths) +static vec * +remap_freqcounting_preds_after_dup (vec *v, + clause_t possible_truths) { - predicate new_predicate; + if (vec_safe_length (v) == 0) + return NULL; - if (!*p) - return; + vec *res = v->copy (); + int len = res->length(); + for (int i = len - 1; i >= 0; i--) + { + predicate new_predicate + = (*res)[i].predicate->remap_after_duplication (possible_truths); + /* We do not want to free previous predicate; it is used by node + origin. */ + (*res)[i].predicate = NULL; + set_hint_predicate (&(*res)[i].predicate, new_predicate); - new_predicate = (*p)->remap_after_duplication (possible_truths); - /* We do not want to free previous predicate; it is used by node origin. */ - *p = NULL; - set_hint_predicate (p, new_predicate); + if (!(*res)[i].predicate) + res->unordered_remove (i); + } + + return res; } @@ -859,9 +902,11 @@ ipa_fn_summary_t::duplicate (cgraph_node *src, optimized_out_size += es->call_stmt_size * ipa_fn_summary::size_scale; edge_set_predicate (edge, &new_predicate); } - remap_hint_predicate_after_duplication (&info->loop_iterations, + info->loop_iterations + = remap_freqcounting_preds_after_dup (info->loop_iterations, possible_truths); - remap_hint_predicate_after_duplication (&info->loop_stride, + info->loop_strides + = remap_freqcounting_preds_after_dup (info->loop_strides, possible_truths); /* If inliner or someone after inliner will ever start producing @@ -873,17 +918,21 @@ ipa_fn_summary_t::duplicate (cgraph_node *src, else { info->size_time_table = vec_safe_copy (info->size_time_table); - if (info->loop_iterations) + info->loop_iterations = vec_safe_copy (info->loop_iterations); + info->loop_strides = vec_safe_copy (info->loop_strides); + + ipa_freqcounting_predicate *f; + for (int i = 0; vec_safe_iterate (info->loop_iterations, i, &f); i++) { - predicate p = *info->loop_iterations; - info->loop_iterations = NULL; - set_hint_predicate 
(&info->loop_iterations, p); + predicate p = *f->predicate; + f->predicate = NULL; + set_hint_predicate (&f->predicate, p); } - if (info->loop_stride) + for (int i = 0; vec_safe_iterate (info->loop_strides, i, &f); i++) { - predicate p = *info->loop_stride; - info->loop_stride = NULL; - set_hint_predicate (&info->loop_stride, p); + predicate p = *f->predicate; + f->predicate = NULL; + set_hint_predicate (&f->predicate, p); } } if (!dst->inlined_to) @@ -1042,15 +1091,28 @@ ipa_dump_fn_summary (FILE *f, struct cgraph_node *node) } fprintf (f, "\n"); } - if (s->loop_iterations) + ipa_freqcounting_predicate *fcp; + bool first_fcp = true; + for (int i = 0; vec_safe_iterate (s->loop_iterations, i, &fcp); i++) { - fprintf (f, " loop iterations:"); - s->loop_iterations->dump (f, s->conds); + if (first_fcp) + { + fprintf (f, " loop iterations:"); + first_fcp = false; + } + fprintf (f, " %3.2f for ", fcp->freq.to_double ()); + fcp->predicate->dump (f, s->conds); } - if (s->loop_stride) + first_fcp = true; + for (int i = 0; vec_safe_iterate (s->loop_strides, i, &fcp); i++) { - fprintf (f, " loop stride:"); - s->loop_stride->dump (f, s->conds); + if (first_fcp) + { + fprintf (f, " loop strides:"); + first_fcp = false; + } + fprintf (f, " %3.2f for :", fcp->freq.to_double ()); + fcp->predicate->dump (f, s->conds); } fprintf (f, " calls:\n"); dump_ipa_call_summary (f, 4, node, s); @@ -2499,12 +2561,13 @@ analyze_function_body (struct cgraph_node *node, bool early) if (fbi.info) compute_bb_predicates (&fbi, node, info, params_summary); + const profile_count entry_count = ENTRY_BLOCK_PTR_FOR_FN (cfun)->count; order = XNEWVEC (int, n_basic_blocks_for_fn (cfun)); nblocks = pre_and_rev_post_order_compute (NULL, order, false); for (n = 0; n < nblocks; n++) { bb = BASIC_BLOCK_FOR_FN (cfun, order[n]); - freq = bb->count.to_sreal_scale (ENTRY_BLOCK_PTR_FOR_FN (cfun)->count); + freq = bb->count.to_sreal_scale (entry_count); if (clobber_only_eh_bb_p (bb)) { if (dump_file && (dump_flags & 
TDF_DETAILS)) @@ -2743,24 +2806,29 @@ analyze_function_body (struct cgraph_node *node, bool early) if (nonconstant_names.exists () && !early) { + ipa_fn_summary *s = ipa_fn_summaries->get (node); class loop *loop; - predicate loop_iterations = true; - predicate loop_stride = true; + unsigned max_loop_predicates = opt_for_fn (node->decl, + param_ipa_max_loop_predicates); if (dump_file && (dump_flags & TDF_DETAILS)) flow_loops_dump (dump_file, NULL, 0); scev_initialize (); FOR_EACH_LOOP (loop, 0) { + predicate loop_iterations = true; + sreal header_freq; vec exits; edge ex; unsigned int j; class tree_niter_desc niter_desc; - if (loop->header->aux) - bb_predicate = *(predicate *) loop->header->aux; - else - bb_predicate = false; + if (!loop->header->aux) + continue; + profile_count phdr_count = loop_preheader_edge (loop)->count (); + sreal phdr_freq = phdr_count.to_sreal_scale (entry_count); + + bb_predicate = *(predicate *) loop->header->aux; exits = get_loop_exit_edges (loop); FOR_EACH_VEC_ELT (exits, j, ex) if (number_of_iterations_exit (loop, ex, &niter_desc, false) @@ -2775,10 +2843,10 @@ analyze_function_body (struct cgraph_node *node, bool early) will_be_nonconstant = bb_predicate & will_be_nonconstant; if (will_be_nonconstant != true && will_be_nonconstant != false) - /* This is slightly inprecise. We may want to represent each - loop with independent predicate. 
*/ loop_iterations &= will_be_nonconstant; } + add_freqcounting_predicate (&s->loop_iterations, loop_iterations, + phdr_freq, max_loop_predicates); exits.release (); } @@ -2788,14 +2856,17 @@ analyze_function_body (struct cgraph_node *node, bool early) for (loop = loops_for_fn (cfun)->tree_root->inner; loop != NULL; loop = loop->next) { + predicate loop_stride = true; basic_block *body = get_loop_body (loop); + profile_count phdr_count = loop_preheader_edge (loop)->count (); + sreal phdr_freq = phdr_count.to_sreal_scale (entry_count); for (unsigned i = 0; i < loop->num_nodes; i++) { gimple_stmt_iterator gsi; - if (body[i]->aux) - bb_predicate = *(predicate *) body[i]->aux; - else - bb_predicate = false; + if (!body[i]->aux) + continue; + + bb_predicate = *(predicate *) body[i]->aux; for (gsi = gsi_start_bb (body[i]); !gsi_end_p (gsi); gsi_next (&gsi)) { @@ -2824,16 +2895,13 @@ analyze_function_body (struct cgraph_node *node, bool early) will_be_nonconstant = bb_predicate & will_be_nonconstant; if (will_be_nonconstant != true && will_be_nonconstant != false) - /* This is slightly inprecise. We may want to represent - each loop with independent predicate. 
*/ loop_stride = loop_stride & will_be_nonconstant; } } + add_freqcounting_predicate (&s->loop_strides, loop_stride, + phdr_freq, max_loop_predicates); free (body); } - ipa_fn_summary *s = ipa_fn_summaries->get (node); - set_hint_predicate (&s->loop_iterations, loop_iterations); - set_hint_predicate (&s->loop_stride, loop_stride); scev_finalize (); } FOR_ALL_BB_FN (bb, my_function) @@ -3506,6 +3574,8 @@ ipa_call_context::estimate_size_and_time (ipa_call_estimates *estimates, sreal time = 0; int min_size = 0; ipa_hints hints = 0; + sreal loops_with_known_iterations = 0; + sreal loops_with_known_strides = 0; int i; if (dump_file && (dump_flags & TDF_DETAILS)) @@ -3598,16 +3668,27 @@ ipa_call_context::estimate_size_and_time (ipa_call_estimates *estimates, if (est_hints) { - if (info->loop_iterations - && !info->loop_iterations->evaluate (m_possible_truths)) - hints |= INLINE_HINT_loop_iterations; - if (info->loop_stride - && !info->loop_stride->evaluate (m_possible_truths)) - hints |= INLINE_HINT_loop_stride; if (info->scc_no) hints |= INLINE_HINT_in_scc; if (DECL_DECLARED_INLINE_P (m_node->decl)) hints |= INLINE_HINT_declared_inline; + + ipa_freqcounting_predicate *fcp; + for (i = 0; vec_safe_iterate (info->loop_iterations, i, &fcp); i++) + if (!fcp->predicate->evaluate (m_possible_truths)) + { + hints |= INLINE_HINT_loop_iterations; + loops_with_known_iterations += fcp->freq; + } + estimates->loops_with_known_iterations = loops_with_known_iterations; + + for (i = 0; vec_safe_iterate (info->loop_strides, i, &fcp); i++) + if (!fcp->predicate->evaluate (m_possible_truths)) + { + hints |= INLINE_HINT_loop_stride; + loops_with_known_strides += fcp->freq; + } + estimates->loops_with_known_strides = loops_with_known_strides; } size = RDIV (size, ipa_fn_summary::size_scale); @@ -3615,12 +3696,15 @@ ipa_call_context::estimate_size_and_time (ipa_call_estimates *estimates, if (dump_file && (dump_flags & TDF_DETAILS)) { + fprintf (dump_file, "\n size:%i", (int) size); if 
(est_times) - fprintf (dump_file, "\n size:%i time:%f nonspec time:%f\n", - (int) size, time.to_double (), - nonspecialized_time.to_double ()); - else - fprintf (dump_file, "\n size:%i (time not estimated)\n", (int) size); + fprintf (dump_file, " time:%f nonspec time:%f", + time.to_double (), nonspecialized_time.to_double ()); + if (est_hints) + fprintf (dump_file, " loops with known iterations:%f " + "known strides:%f", loops_with_known_iterations.to_double (), + loops_with_known_strides.to_double ()); + fprintf (dump_file, "\n"); } if (est_times) { @@ -3809,32 +3893,29 @@ remap_edge_summaries (struct cgraph_edge *inlined_edge, } } -/* Same as remap_predicate, but set result into hint *HINT. */ +/* Run remap_after_inlining on each predicate in V. */ static void -remap_hint_predicate (class ipa_fn_summary *info, - class ipa_node_params *params_summary, - class ipa_fn_summary *callee_info, - predicate **hint, - vec operand_map, - vec offset_map, - clause_t possible_truths, - predicate *toplev_predicate) -{ - predicate p; +remap_freqcounting_predicate (class ipa_fn_summary *info, + class ipa_node_params *params_summary, + class ipa_fn_summary *callee_info, + vec *v, + vec operand_map, + vec offset_map, + clause_t possible_truths, + predicate *toplev_predicate) - if (!*hint) - return; - p = (*hint)->remap_after_inlining - (info, params_summary, callee_info, - operand_map, offset_map, - possible_truths, *toplev_predicate); - if (p != false && p != true) +{ + ipa_freqcounting_predicate *fcp; + for (int i = 0; vec_safe_iterate (v, i, &fcp); i++) { - if (!*hint) - set_hint_predicate (hint, p); - else - **hint &= p; + predicate p + = fcp->predicate->remap_after_inlining (info, params_summary, + callee_info, operand_map, + offset_map, possible_truths, + *toplev_predicate); + if (p != false && p != true) + *fcp->predicate &= p; } } @@ -3942,12 +4023,12 @@ ipa_merge_fn_summary_after_inlining (struct cgraph_edge *edge) remap_edge_summaries (edge, edge->callee, info, 
params_summary, callee_info, operand_map, offset_map, clause, &toplev_predicate); - remap_hint_predicate (info, params_summary, callee_info, - &callee_info->loop_iterations, - operand_map, offset_map, clause, &toplev_predicate); - remap_hint_predicate (info, params_summary, callee_info, - &callee_info->loop_stride, - operand_map, offset_map, clause, &toplev_predicate); + remap_freqcounting_predicate (info, params_summary, callee_info, + info->loop_iterations, operand_map, + offset_map, clause, &toplev_predicate); + remap_freqcounting_predicate (info, params_summary, callee_info, + info->loop_strides, operand_map, + offset_map, clause, &toplev_predicate); HOST_WIDE_INT stack_frame_offset = ipa_get_stack_frame_offset (edge->callee); HOST_WIDE_INT peak = stack_frame_offset + callee_info->estimated_stack_size; @@ -4271,12 +4352,34 @@ inline_read_section (struct lto_file_decl_data *file_data, const char *data, info->size_time_table->quick_push (e); } - p.stream_in (&ib); - if (info) - set_hint_predicate (&info->loop_iterations, p); - p.stream_in (&ib); - if (info) - set_hint_predicate (&info->loop_stride, p); + count2 = streamer_read_uhwi (&ib); + for (j = 0; j < count2; j++) + { + p.stream_in (&ib); + sreal fcp_freq = sreal::stream_in (&ib); + if (info) + { + ipa_freqcounting_predicate fcp; + fcp.predicate = NULL; + set_hint_predicate (&fcp.predicate, p); + fcp.freq = fcp_freq; + vec_safe_push (info->loop_iterations, fcp); + } + } + count2 = streamer_read_uhwi (&ib); + for (j = 0; j < count2; j++) + { + p.stream_in (&ib); + sreal fcp_freq = sreal::stream_in (&ib); + if (info) + { + ipa_freqcounting_predicate fcp; + fcp.predicate = NULL; + set_hint_predicate (&fcp.predicate, p); + fcp.freq = fcp_freq; + vec_safe_push (info->loop_strides, fcp); + } + } for (e = node->callees; e; e = e->next_callee) read_ipa_call_summary (&ib, e, info != NULL); for (e = node->indirect_calls; e; e = e->next_callee) @@ -4436,14 +4539,19 @@ ipa_fn_summary_write (void) 
e->exec_predicate.stream_out (ob); e->nonconst_predicate.stream_out (ob); } - if (info->loop_iterations) - info->loop_iterations->stream_out (ob); - else - streamer_write_uhwi (ob, 0); - if (info->loop_stride) - info->loop_stride->stream_out (ob); - else - streamer_write_uhwi (ob, 0); + ipa_freqcounting_predicate *fcp; + streamer_write_uhwi (ob, vec_safe_length (info->loop_iterations)); + for (i = 0; vec_safe_iterate (info->loop_iterations, i, &fcp); i++) + { + fcp->predicate->stream_out (ob); + fcp->freq.stream_out (ob); + } + streamer_write_uhwi (ob, vec_safe_length (info->loop_strides)); + for (i = 0; vec_safe_iterate (info->loop_strides, i, &fcp); i++) + { + fcp->predicate->stream_out (ob); + fcp->freq.stream_out (ob); + } for (edge = cnode->callees; edge; edge = edge->next_callee) write_ipa_call_summary (ob, edge); for (edge = cnode->indirect_calls; edge; edge = edge->next_callee) diff --git a/gcc/ipa-fnsummary.h b/gcc/ipa-fnsummary.h index 7b68e7ce096..2dc9f74f68a 100644 --- a/gcc/ipa-fnsummary.h +++ b/gcc/ipa-fnsummary.h @@ -101,6 +101,19 @@ public: } }; +/* Structure to capture how frequently some interesting events occur given a + particular predicate. The structure is used to estimate how often we + encounter loops with known iteration count or stride in various + contexts. */ + +struct GTY(()) ipa_freqcounting_predicate +{ + /* The described event happens with this frequency... */ + sreal freq; + /* ...when this predicate evaluates to false. */ + class predicate * GTY((skip)) predicate; +}; + /* Function inlining information. 
*/ class GTY(()) ipa_fn_summary { @@ -112,8 +125,9 @@ public: inlinable (false), single_caller (false), fp_expressions (false), estimated_stack_size (false), time (0), conds (NULL), - size_time_table (NULL), call_size_time_table (NULL), loop_iterations (NULL), - loop_stride (NULL), growth (0), scc_no (0) + size_time_table (NULL), call_size_time_table (NULL), + loop_iterations (NULL), loop_strides (NULL), + growth (0), scc_no (0) { } @@ -125,7 +139,7 @@ public: estimated_stack_size (s.estimated_stack_size), time (s.time), conds (s.conds), size_time_table (s.size_time_table), call_size_time_table (NULL), - loop_iterations (s.loop_iterations), loop_stride (s.loop_stride), + loop_iterations (s.loop_iterations), loop_strides (s.loop_strides), growth (s.growth), scc_no (s.scc_no) {} @@ -164,12 +178,10 @@ public: vec *size_time_table; vec *call_size_time_table; - /* Predicate on when some loop in the function becomes to have known - bounds. */ - predicate * GTY((skip)) loop_iterations; - /* Predicate on when some loop in the function becomes to have known - stride. */ - predicate * GTY((skip)) loop_stride; + /* Predicates on when some loops in the function can have known bounds. */ + vec *loop_iterations; + /* Predicates on when some loops in the function can have known strides. */ + vec *loop_strides; /* Estimated growth for inlining all copies of the function before start of small functions inlining. This value will get out of date as the callers are duplicated, but @@ -308,6 +320,14 @@ struct ipa_call_estimates /* Further discovered reasons why to inline or specialize the give calls. */ ipa_hints hints; + + /* Frequency how often a loop with known number of iterations is encountered. + Calculated with hints. */ + sreal loops_with_known_iterations; + + /* Frequency how often a loop with known strides is encountered. Calculated + with hints. 
*/ + sreal loops_with_known_strides; }; class ipa_cached_call_context; diff --git a/gcc/params.opt b/gcc/params.opt index f39e5d1a012..97509963d71 100644 --- a/gcc/params.opt +++ b/gcc/params.opt @@ -230,6 +230,10 @@ Maximum number of aggregate content items for a parameter in jump functions and Common Joined UInteger Var(param_ipa_max_param_expr_ops) Init(10) Param Optimization Maximum number of operations in a parameter expression that can be handled by IPA analysis. +-param=ipa-max-loop-predicates= +Common Joined UInteger Var(param_ipa_max_loop_predicates) Init(16) Param Optimization +Maximum number of different predicates used to track properties of loops in IPA analysis. + -param=ipa-max-switch-predicate-bounds= Common Joined UInteger Var(param_ipa_max_switch_predicate_bounds) Init(5) Param Optimization Maximal number of boundary endpoints of case ranges of switch statement used during IPA function summary generation. diff --git a/gcc/testsuite/gcc.dg/ipa/ipcp-loophint-1.c b/gcc/testsuite/gcc.dg/ipa/ipcp-loophint-1.c new file mode 100644 index 00000000000..6d049af68af --- /dev/null +++ b/gcc/testsuite/gcc.dg/ipa/ipcp-loophint-1.c @@ -0,0 +1,29 @@ +/* { dg-do compile } */ +/* { dg-options "-O3 -fdump-ipa-cp-details" } */ + +extern int *o, *p, *q, *r; + +#define FUNCTIONS fa(), fb(), fc(), fd(), fe(), ff(), fg() + +extern void FUNCTIONS; + +void foo (int c) +{ + FUNCTIONS; + FUNCTIONS; + for (int i = 0; i < 100; i++) + { + for (int j = 0; j < c; j++) + o[i] = p[i] + q[i] * r[i]; + } + FUNCTIONS; + FUNCTIONS; +} + +void bar() +{ + foo (8); + p[4]++; +} + +/* { dg-final { scan-ipa-dump {with known iterations:[1-9]} "cp" } } */ From patchwork Mon Sep 7 19:38:47 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Martin Jambor X-Patchwork-Id: 1359234 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; 
spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=sourceware.org; envelope-from=gcc-patches-bounces@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=suse.cz Received: from sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4BldrH62V4z9sSP for ; Tue, 8 Sep 2020 05:38:54 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id AE1A9394743E; Mon, 7 Sep 2020 19:38:51 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mx2.suse.de (mx2.suse.de [195.135.220.15]) by sourceware.org (Postfix) with ESMTPS id 9DEA3394743C for ; Mon, 7 Sep 2020 19:38:49 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 9DEA3394743C Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=suse.cz Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=mjambor@suse.cz X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 28A52ACDB for ; Mon, 7 Sep 2020 19:38:49 +0000 (UTC) From: Martin Jambor To: GCC Patches Subject: [PATCH 5/6] ipa-cp: Add dumping of overall_size after cloning User-Agent: Notmuch/0.30 (https://notmuchmail.org) Emacs/26.3 (x86_64-suse-linux-gnu) Date: Mon, 07 Sep 2020 21:38:47 +0200 Message-ID: MIME-Version: 1.0 X-Spam-Status: No, score=-3038.0 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_STATUS, RCVD_IN_MSPIKE_H3, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on 
server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Jan Hubicka Errors-To: gcc-patches-bounces@gcc.gnu.org Sender: "Gcc-patches"

Hi,

when experimenting with IPA-CP parameters, especially when looking into exchange2_r, it has been very useful to know what the value of overall_size is at different stages of the decision process. This patch therefore adds it to the generated dumps.

Bootstrapped and tested and LTO bootstrapped on x86_64 as a part of the whole series. OK for trunk?

Thanks,

Martin

gcc/ChangeLog:

2020-09-07  Martin Jambor

	* ipa-cp.c (estimate_local_effects): Add overall_size to dumped string.
	(decide_about_value): Add dumping new overall_size.
---
 gcc/ipa-cp.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c index f6320c787de..12acf24c553 100644 --- a/gcc/ipa-cp.c +++ b/gcc/ipa-cp.c @@ -3517,7 +3517,8 @@ estimate_local_effects (struct cgraph_node *node) if (dump_file) fprintf (dump_file, " Decided to specialize for all " - "known contexts, growth deemed beneficial.\n"); + "known contexts, growth (to %li) deemed " + "beneficial.\n", overall_size); } else if (dump_file && (dump_flags & TDF_DETAILS)) fprintf (dump_file, " Not cloning for all contexts because " @@ -5506,6 +5507,9 @@ decide_about_value (struct cgraph_node *node, int index, HOST_WIDE_INT offset, val->spec_node = create_specialized_node (node, known_csts, known_contexts, aggvals, callers); overall_size += val->local_size_cost; + if (dump_file && (dump_flags & TDF_DETAILS)) + fprintf (dump_file, " overall size reached %li\n", + overall_size); /* TODO: If for some lattice there is only one other known value left, make a special node for it too.
*/ From patchwork Mon Sep 7 19:41:10 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Martin Jambor X-Patchwork-Id: 1359236 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=sourceware.org; envelope-from=gcc-patches-bounces@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=suse.cz Received: from sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4Bldv50W8wz9sSP for ; Tue, 8 Sep 2020 05:41:21 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 62197394743E; Mon, 7 Sep 2020 19:41:18 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mx2.suse.de (mx2.suse.de [195.135.220.15]) by sourceware.org (Postfix) with ESMTPS id 5712E394743D for ; Mon, 7 Sep 2020 19:41:11 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.3.2 sourceware.org 5712E394743D Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=suse.cz Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=mjambor@suse.cz X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.221.27]) by mx2.suse.de (Postfix) with ESMTP id 2FC37AF0B for ; Mon, 7 Sep 2020 19:41:11 +0000 (UTC) From: Martin Jambor To: GCC Patches Subject: [PATCH 6/6] ipa-cp: Separate and increase the large-unit parameter User-Agent: Notmuch/0.30 (https://notmuchmail.org) Emacs/26.3 (x86_64-suse-linux-gnu) Date: 
Mon, 07 Sep 2020 21:41:10 +0200 Message-ID: MIME-Version: 1.0 X-Spam-Status: No, score=-3038.0 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_STATUS, RCVD_IN_MSPIKE_H3, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Jan Hubicka Errors-To: gcc-patches-bounces@gcc.gnu.org Sender: "Gcc-patches"

Hi,

a previous patch in the series has taught IPA-CP to identify the important cloning opportunities in 548.exchange2_r as worthwhile on their own, but the optimization is still prevented from taking place because of the overall unit-growth limit. This patch raises that limit so that it takes place and the benchmark runs 30% faster (on AMD Zen2 CPU at least).

Before this patch, IPA-CP uses the following formulae to arrive at the overall_size limit:

  base = MAX(orig_size, param_large_unit_insns)
  overall_size_limit = base + base * param_ipa_cp_unit_growth / 100

where param_ipa_cp_unit_growth defaults to 10 and param_large_unit_insns defaults to 10000.

The problem with exchange2 (at least on zen2 but I have had a quick look on aarch64 too) is that the original estimated unit size is 10513, so param_large_unit_insns does not apply and the default limit is therefore 11564. That is good enough for only one of the ideal 8 clonings; we need the limit to be at least 16291.

I would like to raise param_ipa_cp_unit_growth a little bit more soon too, but most certainly not to 55. Therefore, the large_unit must be increased. In this patch, I decided to decouple the inlining and ipa-cp large-unit parameters. It also makes sense because IPA-CP uses it only at -O3 while inlining uses it also at -O2 (IIUC).
But if we agree we can try raising param_large_unit_insns to 13-14 thousand "instructions," perhaps it is not necessary. But then again, it may make sense to actually increase the IPA-CP limit further.

I plan to experiment with IPA-CP tuning on a larger set of programs. Meanwhile, mainly to address the 548.exchange2_r regression, I'm suggesting this simple change.

Bootstrapped and tested and LTO bootstrapped on x86_64 as a part of the whole series. OK for trunk?

Thanks,

Martin

gcc/ChangeLog:

2020-09-07  Martin Jambor

	* params.opt (ipa-cp-large-unit-insns): New parameter.
	* ipa-cp.c (get_max_overall_size): Use the new parameter.
---
 gcc/ipa-cp.c   | 2 +-
 gcc/params.opt | 4 ++++
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
index 12acf24c553..2152f9e5876 100644
--- a/gcc/ipa-cp.c
+++ b/gcc/ipa-cp.c
@@ -3448,7 +3448,7 @@ static long
 get_max_overall_size (cgraph_node *node)
 {
   long max_new_size = orig_overall_size;
-  long large_unit = opt_for_fn (node->decl, param_large_unit_insns);
+  long large_unit = opt_for_fn (node->decl, param_ipa_cp_large_unit_insns);
   if (max_new_size < large_unit)
     max_new_size = large_unit;
   int unit_growth = opt_for_fn (node->decl, param_ipa_cp_unit_growth);
diff --git a/gcc/params.opt b/gcc/params.opt
index 97509963d71..ef2c1f81dd7 100644
--- a/gcc/params.opt
+++ b/gcc/params.opt
@@ -214,6 +214,10 @@ Percentage penalty functions containing a single call to another function will r
 Common Joined UInteger Var(param_ipa_cp_unit_growth) Init(10) Param Optimization
 How much can given compilation unit grow because of the interprocedural constant propagation (in percent).
 
+-param=ipa-cp-large-unit-insns=
+Common Joined UInteger Var(param_ipa_cp_large_unit_insns) Optimization Init(16000) Param
+The size of translation unit that IPA-CP pass considers large.
+
 -param=ipa-cp-value-list-size=
 Common Joined UInteger Var(param_ipa_cp_value_list_size) Init(8) Param Optimization
 Maximum size of a list of values associated with each parameter for interprocedural constant propagation.