From patchwork Sat Dec 29 23:41:38 2018
X-Patchwork-Submitter: Martin Jambor
X-Patchwork-Id: 1019392
From: Martin Jambor
To: GCC Patches
Cc: Jan Hubicka
Subject: [WIP] Reimplementation of IPA-SRA
User-Agent: Notmuch/0.26 (https://notmuchmail.org) Emacs/26.1 (x86_64-suse-linux-gnu)
Date: Sun, 30 Dec 2018 00:41:38 +0100
MIME-Version: 1.0

Hi,

this is a reimplementation of IPA-SRA that makes it a full IPA pass, one
that can handle strongly connected components in the call graph, can take
advantage of LTO, and does not oddly switch functions in the pass pipeline
the way our current quasi-IPA SRA does.

It is still a work in progress, but it already passes (LTO) bootstrap and
testing, and I have compiled quite a lot of code with it.  I still want to
review some of its uglier bits and think about how to make them more
elegant, but I tend to think the basic ideas behind it will not change very
much.  Currently it sits before IPA-CP, but I will experiment with
switching their order.

At the moment I am sending it because I promised it to Honza, but anyone
interested in IPA work is welcome to have a look and comment.

Unlike the current IPA-SRA, it can also remove return values, even in
SCCs.  On the other hand, it is less powerful when it comes to structures
passed by reference.
By design it will not create references to bits of an aggregate, because
that is probably less than useless, and it also usually cannot split
aggregates passed by reference that are just passed on to another function
(where splitting would be useful), because it cannot perform the same
alias analysis as the current implementation, which already knows what
types it should look at.  On the other hand, parameter removal and
splitting also work in SCCs.

It also avoids tons of useless work.  In 523.xalancbmk_r, the old IPA-SRA
splits over eight thousand parameters, whereas the new one with LTO splits
just eight (not eight thousand, just eight).  It turns out that half of
the parameters processed by the current pass are in functions that are
early-inlined, and the other half are in unreachable functions.

Any comments welcome,

Martin


2018-12-20  Martin Jambor

	* coretypes.h (cgraph_edge): Declare.
	* ipa-param-manipulation.c: Rewrite.
	* ipa-param-manipulation.h: Likewise.
	* Makefile.in (GTFILES): Added ipa-param-manipulation.h and
	ipa-sra.c.
	(OBJS): Added ipa-sra.o.
	* cgraph.h (ipa_replace_map): Removed fields old_tree, replace_p
	and ref_p.
	(struct cgraph_clone_info): Removed fields args_to_skip and
	combined_args_to_skip, new fields param_adjustments and
	performed_splits.
	(cgraph_node::create_clone): Changed parameters to use
	ipa_param_adjustments.
	(cgraph_node::create_virtual_clone): Likewise.
	(cgraph_node::create_virtual_clone_with_body): Likewise.
	(tree_function_versioning): Likewise.
	(cgraph_build_function_type_skip_args): Removed.
	* cgraph.c (cgraph_edge::redirect_call_stmt_to_callee): Convert
	to using ipa_param_adjustments.
	(clone_of_p): Likewise.
	* cgraphclones.c (cgraph_build_function_type_skip_args): Removed.
	(build_function_decl_skip_args): Likewise.
	(duplicate_thunk_for_node): Adjust parameters using
	ipa_param_body_adjustments, copy param_adjustments instead of
	args_to_skip.
	(cgraph_node::create_clone): Convert to using
	ipa_param_adjustments.
	(cgraph_node::create_virtual_clone): Likewise.
	(cgraph_node::create_version_clone_with_body): Likewise.
	(cgraph_materialize_clone): Likewise.
	(symbol_table::materialize_all_clones): Likewise.
	* ipa-fnsummary.c (ipa_fn_summary_t::duplicate): Simplify
	ipa_replace_map check.
	* ipa-cp.c (get_replacement_map): Do not initialize removed
	fields.
	(initialize_node_lattices): Make aware that some parameters might
	have already been removed.
	(want_remove_some_param_p): New function.
	(create_specialized_node): Convert to using ipa_param_adjustments
	and deal with possibly pre-existing adjustments.
	* lto-cgraph.c (output_cgraph_opt_summary_p): Likewise.
	(output_node_opt_summary): Do not stream removed fields.  Stream
	parameter adjustments instead of arguments to skip.
	(input_node_opt_summary): Likewise.
	* lto-section-in.c (lto_section_name): Added ipa-sra section.
	* lto-streamer.h (lto_section_type): Likewise.
	* tree-inline.h (copy_body_data): New field killed_new_ssa_names.
	(copy_decl_to_var): Declare.
	* tree-inline.c (update_clone_info): Do not remap old_tree.
	(remap_gimple_stmt): Use ipa_param_body_adjustments to modify
	gimple statements, walk all extra generated statements and remap
	their operands.
	(redirect_all_calls): Add killed SSA names to a hash set.
	(remap_ssa_name): Do not remap killed SSA names.
	(copy_arguments_for_versioning): Renamed to
	copy_arguments_nochange, half of the functionality moved to
	ipa_param_body_adjustments.
	(copy_decl_to_var): Make exported.
	(copy_body): Destroy killed_new_ssa_names hash set.
	(expand_call_inline): Remap performed splits.
	(update_clone_info): Likewise.
	(tree_function_versioning): Simplify tree_map processing.  Updated
	to accept ipa_param_adjustments and use
	ipa_param_body_adjustments.
	* tree-inline.h (struct copy_body_data): New field
	param_body_adjs.
	* omp-simd-clone.c (simd_clone_vector_of_formal_parm_types):
	Adjust for the new interface.
	(simd_clone_clauses_extract): Likewise, make args an auto_vec.
	(simd_clone_compute_base_data_type): Likewise.
	(simd_clone_init_simd_arrays): Adjust for the new interface.
	(simd_clone_adjust_argument_types): Likewise.
	(struct modify_stmt_info): Likewise.
	(ipa_simd_modify_stmt_ops): Likewise.
	(ipa_simd_modify_function_body): Likewise.
	(simd_clone_adjust): Likewise.
	* tree-sra.c: Removed IPA-SRA.  Include tree-sra.h.
	(type_internals_preclude_sra_p): Make public.
	* tree-sra.h: New file.
	* ipa-inline-transform.c (save_inline_function_body): Update to
	reflect the new tree_function_versioning signature.
	* ipa-prop.c (adjust_agg_replacement_values): Use a helper from
	ipa_param_adjustments to get current parameter indices.
	(ipcp_modif_dom_walker::before_dom_children): Likewise.
	(ipcp_update_bits): Likewise.
	(ipcp_update_vr): Likewise.
	* ipa-split.c (split_function): Convert to using
	ipa_param_adjustments.
	* ipa-sra.c: New file.
	* multiple_target.c (create_target_clone): Update to reflect the
	new type of create_version_clone_with_body.
	* trans-mem.c (ipa_tm_create_version): Update to reflect the new
	type of tree_function_versioning.
	* tree-sra.c (modify_function): Update to reflect the new type of
	tree_function_versioning.
	* params.def (PARAM_IPA_SRA_MAX_REPLACEMENT): New.
	(PARAM_SRA_MAX_TYPE_CHECK_STEPS): New.
	* passes.def: Remove old IPA-SRA and add the new one.
	* tree-pass.h (make_pass_early_ipa_sra): Remove declaration.
	(make_pass_ipa_sra): Declare.

testsuite/
	* g++.dg/ipa/pr81248.C: Adjust dg-options and dump-scan.
	* gcc.dg/ipa/ipa-sra-1.c: Likewise.
	* gcc.dg/ipa/ipa-sra-10.c: Likewise.
	* gcc.dg/ipa/ipa-sra-11.c: Likewise.
	* gcc.dg/ipa/ipa-sra-3.c: Likewise.
	* gcc.dg/ipa/ipa-sra-4.c: Likewise.
	* gcc.dg/ipa/ipa-sra-5.c: Likewise.
	* gcc.dg/ipa/ipacost-2.c: Disable ipa-sra.
	* gcc.dg/ipa/ipcp-agg-9.c: Likewise.
	* gcc.dg/ipa/pr78121.c: Adjust scan pattern.
	* gcc.dg/ipa/vrp1.c: Likewise.
	* gcc.dg/ipa/vrp2.c: Likewise.
	* gcc.dg/ipa/vrp3.c: Likewise.
	* gcc.dg/ipa/vrp7.c: Likewise.
	* gcc.dg/ipa/vrp8.c: Likewise.
	* gcc.dg/noreorder.c: Use noipa attribute instead of noinline.
	* gcc.dg/ipa/20040703-wpa.c: New test.
---
 gcc/Makefile.in                         |    3 +-
 gcc/cgraph.c                            |  127 +-
 gcc/cgraph.h                            |   32 +-
 gcc/cgraphclones.c                      |  215 +-
 gcc/coretypes.h                         |    1 +
 gcc/ipa-cp.c                            |  173 +-
 gcc/ipa-fnsummary.c                     |    4 +-
 gcc/ipa-inline-transform.c              |    3 +-
 gcc/ipa-param-manipulation.c            | 2106 ++++++++++---
 gcc/ipa-param-manipulation.h            |  312 +-
 gcc/ipa-prop.c                          |  103 +-
 gcc/ipa-split.c                         |   31 +-
 gcc/ipa-sra.c                           | 3823 +++++++++++++++++++++++
 gcc/lto-cgraph.c                        |  123 +-
 gcc/lto-section-in.c                    |    3 +-
 gcc/lto-streamer.h                      |    1 +
 gcc/multiple_target.c                   |    5 +-
 gcc/omp-simd-clone.c                    |  209 +-
 gcc/params.def                          |   13 +
 gcc/passes.def                          |    2 +-
 gcc/testsuite/g++.dg/ipa/pr81248.C      |    4 +-
 gcc/testsuite/gcc.dg/ipa/20040703-wpa.c |  151 +
 gcc/testsuite/gcc.dg/ipa/ipa-sra-1.c    |    4 +-
 gcc/testsuite/gcc.dg/ipa/ipa-sra-10.c   |    4 +-
 gcc/testsuite/gcc.dg/ipa/ipa-sra-11.c   |    6 +-
 gcc/testsuite/gcc.dg/ipa/ipa-sra-3.c    |    7 +-
 gcc/testsuite/gcc.dg/ipa/ipa-sra-4.c    |    8 +-
 gcc/testsuite/gcc.dg/ipa/ipa-sra-5.c    |    4 +-
 gcc/testsuite/gcc.dg/ipa/ipacost-2.c    |    4 +-
 gcc/testsuite/gcc.dg/ipa/ipcp-agg-9.c   |    2 +-
 gcc/testsuite/gcc.dg/ipa/pr78121.c      |    2 +-
 gcc/testsuite/gcc.dg/ipa/vrp1.c         |    4 +-
 gcc/testsuite/gcc.dg/ipa/vrp2.c         |    4 +-
 gcc/testsuite/gcc.dg/ipa/vrp3.c         |    2 +-
 gcc/testsuite/gcc.dg/ipa/vrp7.c         |    2 +-
 gcc/testsuite/gcc.dg/ipa/vrp8.c         |    2 +-
 gcc/testsuite/gcc.dg/noreorder.c        |    6 +-
 gcc/trans-mem.c                         |    3 +-
 gcc/tree-inline.c                       |  361 ++-
 gcc/tree-inline.h                       |   10 +
 gcc/tree-pass.h                         |    2 +-
 gcc/tree-sra.c                          | 1862 +---------
 gcc/tree-sra.h                          |   31 +
 43 files changed, 6724 insertions(+), 3050 deletions(-)
 create mode 100644 gcc/ipa-sra.c
 create mode 100644 gcc/testsuite/gcc.dg/ipa/20040703-wpa.c
 create mode 100644 gcc/tree-sra.h

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 7960cace16a..c5035693683 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1367,6 +1367,7 @@ OBJS = \
 	init-regs.o \
 	internal-fn.o \
 	ipa-cp.o \
+	ipa-sra.o \
 	ipa-devirt.o \
 	ipa-fnsummary.o \
 	ipa-polymorphic-call.o \
@@ -2525,7 +2526,7 @@ GTFILES = $(CPPLIB_H) $(srcdir)/input.h $(srcdir)/coretypes.h \ $(srcdir)/reload.h $(srcdir)/caller-save.c $(srcdir)/symtab.c \ $(srcdir)/alias.c $(srcdir)/bitmap.c $(srcdir)/cselib.c $(srcdir)/cgraph.c \ $(srcdir)/ipa-prop.c $(srcdir)/ipa-cp.c $(srcdir)/ipa-utils.h \ - $(srcdir)/dbxout.c \ + $(srcdir)/ipa-param-manipulation.h $(srcdir)/ipa-sra.c $(srcdir)/dbxout.c \ $(srcdir)/signop.h \ $(srcdir)/dwarf2out.h \ $(srcdir)/dwarf2asm.c \ diff --git a/gcc/cgraph.c b/gcc/cgraph.c index 850a9b62469..c6befe5b45b 100644 --- a/gcc/cgraph.c +++ b/gcc/cgraph.c @@ -1349,7 +1349,7 @@ cgraph_edge::redirect_call_stmt_to_callee (void) if (flag_checking && decl) { cgraph_node *node = cgraph_node::get (decl); - gcc_assert (!node || !node->clone.combined_args_to_skip); + gcc_assert (!node || !node->clone.param_adjustments); } if (symtab->dump_file) @@ -1357,25 +1357,36 @@ cgraph_edge::redirect_call_stmt_to_callee (void) fprintf (symtab->dump_file, "updating call of %s -> %s: ", e->caller->dump_name (), e->callee->dump_name ()); print_gimple_stmt (symtab->dump_file, e->call_stmt, 0, dump_flags); - if (e->callee->clone.combined_args_to_skip) + if (e->callee->clone.param_adjustments) + e->callee->clone.param_adjustments->dump (symtab->dump_file); + unsigned performed_len + = vec_safe_length (e->caller->clone.performed_splits); + if (performed_len > 0) + fprintf (symtab->dump_file, "Performed splits records:\n"); + for (unsigned i = 0; i < performed_len; i++) { - fprintf (symtab->dump_file, " combined args to skip: "); - dump_bitmap (symtab->dump_file, - e->callee->clone.combined_args_to_skip); + ipa_param_performed_split *sm + = &(*e->caller->clone.performed_splits)[i]; + print_node_brief (symtab->dump_file, " dummy_decl: ", sm->dummy_decl, + TDF_UID); + fprintf (symtab->dump_file, ", unit_offset: %u\n", sm->unit_offset); } } - if (e->callee->clone.combined_args_to_skip) + if (ipa_param_adjustments *padjs = e->callee->clone.param_adjustments) { - int lp_nr; + /* We 
need to defer cleaning EH info on the new statement to + fixup-cfg. We may not have dominator information at this point + and thus would end up with unreachable blocks and have no way + to communicate that we need to run CFG cleanup then. */ + int lp_nr = lookup_stmt_eh_lp (e->call_stmt); + if (lp_nr != 0) + remove_stmt_from_eh_lp (e->call_stmt); - new_stmt = e->call_stmt; - if (e->callee->clone.combined_args_to_skip) - new_stmt - = gimple_call_copy_skip_args (new_stmt, - e->callee->clone.combined_args_to_skip); tree old_fntype = gimple_call_fntype (e->call_stmt); - gimple_call_set_fndecl (new_stmt, e->callee->decl); + new_stmt = padjs->modify_call (e->call_stmt, + e->caller->clone.performed_splits, + e->callee->decl, false); cgraph_node *origin = e->callee; while (origin->clone_of) origin = origin->clone_of; @@ -1386,92 +1397,12 @@ cgraph_edge::redirect_call_stmt_to_callee (void) gimple_call_set_fntype (new_stmt, TREE_TYPE (e->callee->decl)); else { - bitmap skip = e->callee->clone.combined_args_to_skip; - tree t = cgraph_build_function_type_skip_args (old_fntype, skip, - false); - gimple_call_set_fntype (new_stmt, t); - } - - if (gimple_vdef (new_stmt) - && TREE_CODE (gimple_vdef (new_stmt)) == SSA_NAME) - SSA_NAME_DEF_STMT (gimple_vdef (new_stmt)) = new_stmt; - - gsi = gsi_for_stmt (e->call_stmt); - - /* For optimized away parameters, add on the caller side - before the call - DEBUG D#X => parm_Y(D) - stmts and associate D#X with parm in decl_debug_args_lookup - vector to say for debug info that if parameter parm had been passed, - it would have value parm_Y(D). 
*/ - if (e->callee->clone.combined_args_to_skip && MAY_HAVE_DEBUG_BIND_STMTS) - { - vec **debug_args - = decl_debug_args_lookup (e->callee->decl); - tree old_decl = gimple_call_fndecl (e->call_stmt); - if (debug_args && old_decl) - { - tree parm; - unsigned i = 0, num; - unsigned len = vec_safe_length (*debug_args); - unsigned nargs = gimple_call_num_args (e->call_stmt); - for (parm = DECL_ARGUMENTS (old_decl), num = 0; - parm && num < nargs; - parm = DECL_CHAIN (parm), num++) - if (bitmap_bit_p (e->callee->clone.combined_args_to_skip, num) - && is_gimple_reg (parm)) - { - unsigned last = i; - - while (i < len && (**debug_args)[i] != DECL_ORIGIN (parm)) - i += 2; - if (i >= len) - { - i = 0; - while (i < last - && (**debug_args)[i] != DECL_ORIGIN (parm)) - i += 2; - if (i >= last) - continue; - } - tree ddecl = (**debug_args)[i + 1]; - tree arg = gimple_call_arg (e->call_stmt, num); - if (!useless_type_conversion_p (TREE_TYPE (ddecl), - TREE_TYPE (arg))) - { - tree rhs1; - if (!fold_convertible_p (TREE_TYPE (ddecl), arg)) - continue; - if (TREE_CODE (arg) == SSA_NAME - && gimple_assign_cast_p (SSA_NAME_DEF_STMT (arg)) - && (rhs1 - = gimple_assign_rhs1 (SSA_NAME_DEF_STMT (arg))) - && useless_type_conversion_p (TREE_TYPE (ddecl), - TREE_TYPE (rhs1))) - arg = rhs1; - else - arg = fold_convert (TREE_TYPE (ddecl), arg); - } - - gimple *def_temp - = gimple_build_debug_bind (ddecl, unshare_expr (arg), - e->call_stmt); - gsi_insert_before (&gsi, def_temp, GSI_SAME_STMT); - } - } + tree new_fntype = padjs->build_new_function_type (old_fntype, true); + gimple_call_set_fntype (new_stmt, new_fntype); } - gsi_replace (&gsi, new_stmt, false); - /* We need to defer cleaning EH info on the new statement to - fixup-cfg. We may not have dominator information at this point - and thus would end up with unreachable blocks and have no way - to communicate that we need to run CFG cleanup then. 
*/ - lp_nr = lookup_stmt_eh_lp (e->call_stmt); if (lp_nr != 0) - { - remove_stmt_from_eh_lp (e->call_stmt); - add_stmt_to_eh_lp (new_stmt, lp_nr); - } + add_stmt_to_eh_lp (new_stmt, lp_nr); } else { @@ -2982,8 +2913,8 @@ clone_of_p (cgraph_node *node, cgraph_node *node2) if (skipped_thunk) { - if (!node2->clone.args_to_skip - || !bitmap_bit_p (node2->clone.args_to_skip, 0)) + if (!node2->clone.param_adjustments + || node2->clone.param_adjustments->first_param_intact_p ()) return false; if (node2->former_clone_of == node->decl) return true; diff --git a/gcc/cgraph.h b/gcc/cgraph.h index 51cea066ad3..d4d98c1c577 100644 --- a/gcc/cgraph.h +++ b/gcc/cgraph.h @@ -24,6 +24,7 @@ along with GCC; see the file COPYING3. If not see #include "profile-count.h" #include "ipa-ref.h" #include "plugin-api.h" +#include "ipa-param-manipulation.h" extern void debuginfo_early_init (void); extern void debuginfo_init (void); @@ -734,23 +735,19 @@ struct GTY(()) cgraph_global_info { will be replaced by another tree while versioning. */ struct GTY(()) ipa_replace_map { - /* The tree that will be replaced. */ - tree old_tree; /* The new (replacing) tree. */ tree new_tree; /* Parameter number to replace, when old_tree is NULL. */ int parm_num; - /* True when a substitution should be done, false otherwise. */ - bool replace_p; - /* True when we replace a reference to old_tree. */ - bool ref_p; }; struct GTY(()) cgraph_clone_info { vec *tree_map; - bitmap args_to_skip; - bitmap combined_args_to_skip; + ipa_param_adjustments *param_adjustments; + /* Lists of all splits with their offsets for each dummy variables + representing a replaced-by-splits parameter. 
*/ + vec *performed_splits; }; enum cgraph_simd_clone_arg_type @@ -970,15 +967,16 @@ public: vec redirect_callers, bool call_duplication_hook, cgraph_node *new_inlined_to, - bitmap args_to_skip, const char *suffix = NULL); + ipa_param_adjustments *param_adjustments, + const char *suffix = NULL); /* Create callgraph node clone with new declaration. The actual body will be copied later at compilation stage. The name of the new clone will be constructed from the name of the original node, SUFFIX and NUM_SUFFIX. */ cgraph_node *create_virtual_clone (vec redirect_callers, vec *tree_map, - bitmap args_to_skip, const char * suffix, - unsigned num_suffix); + ipa_param_adjustments *param_adjustments, + const char * suffix, unsigned num_suffix); /* cgraph node being removed from symbol table; see if its entry can be replaced by other inline clone. */ @@ -1022,9 +1020,9 @@ public: Return the new version's cgraph node. */ cgraph_node *create_version_clone_with_body (vec redirect_callers, - vec *tree_map, bitmap args_to_skip, - bool skip_return, bitmap bbs_to_copy, basic_block new_entry_block, - const char *clone_name); + vec *tree_map, + ipa_param_adjustments *param_adjustments, + bitmap bbs_to_copy, basic_block new_entry_block, const char *clone_name); /* Insert a new cgraph_function_version_info node into cgraph_fnver_htab corresponding to cgraph_node. 
*/ @@ -2401,14 +2399,12 @@ tree clone_function_name (tree decl, const char *suffix, tree clone_function_name (tree decl, const char *suffix); void tree_function_versioning (tree, tree, vec *, - bool, bitmap, bool, bitmap, basic_block); + ipa_param_adjustments *, + bool, bitmap, basic_block); void dump_callgraph_transformation (const cgraph_node *original, const cgraph_node *clone, const char *suffix); -tree cgraph_build_function_type_skip_args (tree orig_type, bitmap args_to_skip, - bool skip_return); - /* In cgraphbuild.c */ int compute_call_stmt_bb_frequency (tree, basic_block bb); void record_references_in_initializer (tree, bool); diff --git a/gcc/cgraphclones.c b/gcc/cgraphclones.c index f8076920fff..01bd66ec00e 100644 --- a/gcc/cgraphclones.c +++ b/gcc/cgraphclones.c @@ -142,99 +142,6 @@ cgraph_edge::clone (cgraph_node *n, gcall *call_stmt, unsigned stmt_uid, return new_edge; } -/* Build variant of function type ORIG_TYPE skipping ARGS_TO_SKIP and the - return value if SKIP_RETURN is true. */ - -tree -cgraph_build_function_type_skip_args (tree orig_type, bitmap args_to_skip, - bool skip_return) -{ - tree new_type = NULL; - tree args, new_args = NULL; - tree new_reversed; - int i = 0; - - for (args = TYPE_ARG_TYPES (orig_type); args && args != void_list_node; - args = TREE_CHAIN (args), i++) - if (!args_to_skip || !bitmap_bit_p (args_to_skip, i)) - new_args = tree_cons (NULL_TREE, TREE_VALUE (args), new_args); - - new_reversed = nreverse (new_args); - if (args) - { - if (new_reversed) - TREE_CHAIN (new_args) = void_list_node; - else - new_reversed = void_list_node; - } - - /* Use copy_node to preserve as much as possible from original type - (debug info, attribute lists etc.) - Exception is METHOD_TYPEs must have THIS argument. - When we are asked to remove it, we need to build new FUNCTION_TYPE - instead. 
*/ - if (TREE_CODE (orig_type) != METHOD_TYPE - || !args_to_skip - || !bitmap_bit_p (args_to_skip, 0)) - { - new_type = build_distinct_type_copy (orig_type); - TYPE_ARG_TYPES (new_type) = new_reversed; - } - else - { - new_type - = build_distinct_type_copy (build_function_type (TREE_TYPE (orig_type), - new_reversed)); - TYPE_CONTEXT (new_type) = TYPE_CONTEXT (orig_type); - } - - if (skip_return) - TREE_TYPE (new_type) = void_type_node; - - return new_type; -} - -/* Build variant of function decl ORIG_DECL skipping ARGS_TO_SKIP and the - return value if SKIP_RETURN is true. - - Arguments from DECL_ARGUMENTS list can't be removed now, since they are - linked by TREE_CHAIN directly. The caller is responsible for eliminating - them when they are being duplicated (i.e. copy_arguments_for_versioning). */ - -static tree -build_function_decl_skip_args (tree orig_decl, bitmap args_to_skip, - bool skip_return) -{ - tree new_decl = copy_node (orig_decl); - tree new_type; - - new_type = TREE_TYPE (orig_decl); - if (prototype_p (new_type) - || (skip_return && !VOID_TYPE_P (TREE_TYPE (new_type)))) - new_type - = cgraph_build_function_type_skip_args (new_type, args_to_skip, - skip_return); - TREE_TYPE (new_decl) = new_type; - - /* For declarations setting DECL_VINDEX (i.e. methods) - we expect first argument to be THIS pointer. */ - if (args_to_skip && bitmap_bit_p (args_to_skip, 0)) - DECL_VINDEX (new_decl) = NULL_TREE; - - /* When signature changes, we need to clear builtin info. */ - if (fndecl_built_in_p (new_decl) - && args_to_skip - && !bitmap_empty_p (args_to_skip)) - { - DECL_BUILT_IN_CLASS (new_decl) = NOT_BUILT_IN; - DECL_FUNCTION_CODE (new_decl) = (enum built_in_function) 0; - } - /* The FE might have information and assumptions about the other - arguments. */ - DECL_LANG_SPECIFIC (new_decl) = NULL; - return new_decl; -} - /* Set flags of NEW_NODE and its decl. NEW_NODE is a newly created private clone or its thunk. 
*/ @@ -282,35 +189,21 @@ duplicate_thunk_for_node (cgraph_node *thunk, cgraph_node *node) return cs->caller; tree new_decl; - if (!node->clone.args_to_skip) - new_decl = copy_node (thunk->decl); - else + if (node->clone.param_adjustments) { /* We do not need to duplicate this_adjusting thunks if we have removed this. */ if (thunk->thunk.this_adjusting - && bitmap_bit_p (node->clone.args_to_skip, 0)) + && !node->clone.param_adjustments->first_param_intact_p ()) return node; - new_decl = build_function_decl_skip_args (thunk->decl, - node->clone.args_to_skip, - false); + new_decl = copy_node (thunk->decl); + ipa_param_body_adjustments body_adj (node->clone.param_adjustments, + new_decl); + body_adj.modify_formal_parameters (); } - - tree *link = &DECL_ARGUMENTS (new_decl); - int i = 0; - for (tree pd = DECL_ARGUMENTS (thunk->decl); pd; pd = DECL_CHAIN (pd), i++) - { - if (!node->clone.args_to_skip - || !bitmap_bit_p (node->clone.args_to_skip, i)) - { - tree nd = copy_node (pd); - DECL_CONTEXT (nd) = new_decl; - *link = nd; - link = &DECL_CHAIN (nd); - } - } - *link = NULL_TREE; + else + new_decl = copy_node (thunk->decl); gcc_checking_assert (!DECL_STRUCT_FUNCTION (new_decl)); gcc_checking_assert (!DECL_INITIAL (new_decl)); @@ -332,8 +225,7 @@ duplicate_thunk_for_node (cgraph_node *thunk, cgraph_node *node) new_thunk->thunk = thunk->thunk; new_thunk->unique_name = in_lto_p; new_thunk->former_clone_of = thunk->decl; - new_thunk->clone.args_to_skip = node->clone.args_to_skip; - new_thunk->clone.combined_args_to_skip = node->clone.combined_args_to_skip; + new_thunk->clone.param_adjustments = node->clone.param_adjustments; cgraph_edge *e = new_thunk->create_edge (node, NULL, new_thunk->count); symtab->call_edge_duplication_hooks (thunk->callees, e); @@ -416,7 +308,11 @@ dump_callgraph_transformation (const cgraph_node *original, If the new node is being inlined into another one, NEW_INLINED_TO should be the outline function the new one is (even indirectly) inlined to. 
All hooks will see this in node's global.inlined_to, when invoked. Can be NULL if the - node is not inlined. */ + node is not inlined. + + If PARAM_ADJUSTMENTS is non-NULL, the parameter manipulation information + will be overwritten by the new structure. Otherwise the new node will + share parameter manipulation information with the original node. */ cgraph_node * cgraph_node::create_clone (tree new_decl, profile_count prof_count, @@ -424,7 +320,8 @@ cgraph_node::create_clone (tree new_decl, profile_count prof_count, vec redirect_callers, bool call_duplication_hook, cgraph_node *new_inlined_to, - bitmap args_to_skip, const char *suffix) + ipa_param_adjustments *param_adjustments, + const char *suffix) { cgraph_node *new_node = symtab->create_empty (); cgraph_edge *e; @@ -468,19 +365,13 @@ cgraph_node::create_clone (tree new_decl, profile_count prof_count, new_node->merged_comdat = merged_comdat; new_node->thunk = thunk; + if (param_adjustments) + new_node->clone.param_adjustments = param_adjustments; + else + new_node->clone.param_adjustments = clone.param_adjustments; new_node->clone.tree_map = NULL; - new_node->clone.args_to_skip = args_to_skip; + new_node->clone.performed_splits = vec_safe_copy (clone.performed_splits); new_node->split_part = split_part; - if (!args_to_skip) - new_node->clone.combined_args_to_skip = clone.combined_args_to_skip; - else if (clone.combined_args_to_skip) - { - new_node->clone.combined_args_to_skip = BITMAP_GGC_ALLOC (); - bitmap_ior (new_node->clone.combined_args_to_skip, - clone.combined_args_to_skip, args_to_skip); - } - else - new_node->clone.combined_args_to_skip = args_to_skip; FOR_EACH_VEC_ELT (redirect_callers, i, e) { @@ -622,8 +513,8 @@ clone_function_name (tree decl, const char *suffix) cgraph_node * cgraph_node::create_virtual_clone (vec redirect_callers, vec *tree_map, - bitmap args_to_skip, const char * suffix, - unsigned num_suffix) + ipa_param_adjustments *param_adjustments, + const char * suffix, unsigned 
num_suffix) { tree old_decl = decl; cgraph_node *new_node = NULL; @@ -633,13 +524,16 @@ cgraph_node::create_virtual_clone (vec redirect_callers, char *name; gcc_checking_assert (local.versionable); - gcc_assert (local.can_change_signature || !args_to_skip); + /* TODO: It would be nice if we could recognize that param_adjustments do not + actually perform any changes, but at the moment let's require it simply + does not exist. */ + gcc_assert (local.can_change_signature || !param_adjustments); /* Make a new FUNCTION_DECL tree node */ - if (!args_to_skip) + if (!param_adjustments) new_decl = copy_node (old_decl); else - new_decl = build_function_decl_skip_args (old_decl, args_to_skip, false); + new_decl = param_adjustments->adjust_decl (old_decl); /* These pointers represent function body and will be populated only when clone is materialized. */ @@ -663,7 +557,8 @@ cgraph_node::create_virtual_clone (vec redirect_callers, SET_DECL_RTL (new_decl, NULL); new_node = create_clone (new_decl, count, false, - redirect_callers, false, NULL, args_to_skip, suffix); + redirect_callers, false, NULL, param_adjustments, + suffix); /* Update the properties. Make clone visible only within this translation unit. Make sure @@ -1017,9 +912,9 @@ cgraph_node::create_version_clone (tree new_decl, cgraph_node * cgraph_node::create_version_clone_with_body (vec redirect_callers, - vec *tree_map, bitmap args_to_skip, - bool skip_return, bitmap bbs_to_copy, basic_block new_entry_block, - const char *suffix) + vec *tree_map, + ipa_param_adjustments *param_adjustments, + bitmap bbs_to_copy, basic_block new_entry_block, const char *suffix) { tree old_decl = decl; cgraph_node *new_version_node = NULL; @@ -1028,14 +923,16 @@ cgraph_node::create_version_clone_with_body if (!tree_versionable_function_p (old_decl)) return NULL; - gcc_assert (local.can_change_signature || !args_to_skip); + /* TODO: Restore an assert that we do not change signature if + local.can_change_signature is false. 
We cannot just check that + param_adjustments is NULL because unfortunately ipa-split removes return + values from such functions. */ /* Make a new FUNCTION_DECL tree node for the new version. */ - if (!args_to_skip && !skip_return) - new_decl = copy_node (old_decl); + if (param_adjustments) + new_decl = param_adjustments->adjust_decl (old_decl); else - new_decl - = build_function_decl_skip_args (old_decl, args_to_skip, skip_return); + new_decl = copy_node (old_decl); /* Generate a new name for the new version. */ DECL_NAME (new_decl) = clone_function_name_numbered (old_decl, suffix); @@ -1057,8 +954,8 @@ cgraph_node::create_version_clone_with_body new_version_node->ipa_transforms_to_apply = ipa_transforms_to_apply.copy (); /* Copy the OLD_VERSION_NODE function tree to the new version. */ - tree_function_versioning (old_decl, new_decl, tree_map, false, args_to_skip, - skip_return, bbs_to_copy, new_entry_block); + tree_function_versioning (old_decl, new_decl, tree_map, param_adjustments, + false, bbs_to_copy, new_entry_block); /* Update the new version's properties. Make The new version visible only within this translation unit. Make sure @@ -1098,9 +995,8 @@ cgraph_materialize_clone (cgraph_node *node) node->former_clone_of = node->clone_of->former_clone_of; /* Copy the OLD_VERSION_NODE function tree to the new version. 
*/ tree_function_versioning (node->clone_of->decl, node->decl, - node->clone.tree_map, true, - node->clone.args_to_skip, false, - NULL, NULL); + node->clone.tree_map, node->clone.param_adjustments, + true, NULL, NULL); if (symtab->dump_file) { dump_function_to_file (node->clone_of->decl, symtab->dump_file, @@ -1175,28 +1071,15 @@ symbol_table::materialize_all_clones (void) { ipa_replace_map *replace_info; replace_info = (*node->clone.tree_map)[i]; - print_generic_expr (symtab->dump_file, - replace_info->old_tree); - fprintf (symtab->dump_file, " -> "); + fprintf (symtab->dump_file, "%i -> ", + (*node->clone.tree_map)[i]->parm_num); print_generic_expr (symtab->dump_file, replace_info->new_tree); - fprintf (symtab->dump_file, "%s%s;", - replace_info->replace_p ? "(replace)":"", - replace_info->ref_p ? "(ref)":""); } fprintf (symtab->dump_file, "\n"); } - if (node->clone.args_to_skip) - { - fprintf (symtab->dump_file, " args_to_skip: "); - dump_bitmap (symtab->dump_file, - node->clone.args_to_skip); - } - if (node->clone.args_to_skip) - { - fprintf (symtab->dump_file, " combined_args_to_skip:"); - dump_bitmap (symtab->dump_file, node->clone.combined_args_to_skip); - } + if (node->clone.param_adjustments) + node->clone.param_adjustments->dump (symtab->dump_file); } cgraph_materialize_clone (node); stabilized = false; diff --git a/gcc/coretypes.h b/gcc/coretypes.h index 271cce8e20f..c926a0a64f0 100644 --- a/gcc/coretypes.h +++ b/gcc/coretypes.h @@ -141,6 +141,7 @@ struct gomp_teams; class symtab_node; struct cgraph_node; class varpool_node; +struct cgraph_edge; union section; typedef union section section; diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c index 57589153614..da067bdf78b 100644 --- a/gcc/ipa-cp.c +++ b/gcc/ipa-cp.c @@ -1165,7 +1165,10 @@ initialize_node_lattices (struct cgraph_node *node) int i; gcc_checking_assert (node->has_gimple_body_p ()); - if (node->local.local) + + if (!ipa_get_param_count (info)) + disable = true; + else if (node->local.local) { int 
caller_count = 0; node->call_for_symbol_thunks_and_aliases (count_callers, &caller_count, @@ -1187,32 +1190,72 @@ initialize_node_lattices (struct cgraph_node *node) disable = true; } - for (i = 0; i < ipa_get_param_count (info); i++) + if (dump_file && (dump_flags & TDF_DETAILS) + && !node->alias && !node->thunk.thunk_p) { - struct ipcp_param_lattices *plats = ipa_get_parm_lattices (info, i); - plats->m_value_range.init (); + fprintf (dump_file, "Initializing lattices of %s\n", + node->dump_name ()); + if (disable || variable) + fprintf (dump_file, " Marking all lattices as %s\n", + disable ? "BOTTOM" : "VARIABLE"); } - if (disable || variable) + auto_vec surviving_params; + bool pre_modified = false; + if (!disable && node->clone.param_adjustments) { - for (i = 0; i < ipa_get_param_count (info); i++) + /* At the moment all IPA optimizations should use the number of + parameters of the prevailing decl as the m_always_copy_start. + Handling any other value would complicate the code below, so for the + time being let's only assert it is so.
*/ + gcc_assert ((node->clone.param_adjustments->m_always_copy_start + == ipa_get_param_count (info)) + || node->clone.param_adjustments->m_always_copy_start < 0); + + pre_modified = true; + node->clone.param_adjustments->get_surviving_params (&surviving_params); + + if (dump_file && (dump_flags & TDF_DETAILS) + && !node->alias && !node->thunk.thunk_p) { - struct ipcp_param_lattices *plats = ipa_get_parm_lattices (info, i); - if (disable) + bool first = true; + for (int j = 0; j < ipa_get_param_count (info); j++) { - plats->itself.set_to_bottom (); - plats->ctxlat.set_to_bottom (); - set_agg_lats_to_bottom (plats); - plats->bits_lattice.set_to_bottom (); - plats->m_value_range.set_to_bottom (); + if (j < (int) surviving_params.length () + && surviving_params[j]) + continue; + if (first) + { + fprintf (dump_file, + " The following parameters are dead on arrival:"); + first = false; + } + fprintf (dump_file, " %u", j); } - else + if (!first) + fprintf (dump_file, "\n"); + } + } + + for (i = 0; i < ipa_get_param_count (info); i++) + { + struct ipcp_param_lattices *plats = ipa_get_parm_lattices (info, i); + if (disable + || (pre_modified && (surviving_params.length () <= (unsigned) i + || !surviving_params[i]))) + { + plats->itself.set_to_bottom (); + plats->ctxlat.set_to_bottom (); + set_agg_lats_to_bottom (plats); + plats->bits_lattice.set_to_bottom (); + plats->m_value_range.set_to_bottom (); + } + else + { + plats->m_value_range.init (); + if (variable) set_all_contains_variable (plats); } - if (dump_file && (dump_flags & TDF_DETAILS) - && !node->alias && !node->thunk.thunk_p) - fprintf (dump_file, "Marking all lattices of %s as %s\n", - node->dump_name (), disable ? 
"BOTTOM" : "VARIABLE"); } for (ie = node->indirect_calls; ie; ie = ie->next_callee) @@ -3632,12 +3675,8 @@ get_replacement_map (struct ipa_node_params *info, tree value, int parm_num) print_generic_expr (dump_file, value); fprintf (dump_file, "\n"); } - replace_map->old_tree = NULL; replace_map->parm_num = parm_num; replace_map->new_tree = value; - replace_map->replace_p = true; - replace_map->ref_p = false; - return replace_map; } @@ -3775,6 +3814,34 @@ update_specialized_profile (struct cgraph_node *new_node, dump_profile_updates (orig_node, new_node); } +/* Return true if we would like to remove a parameter from NODE when cloning it + with KNOWN_CSTS scalar constants. */ + +static bool +want_remove_some_param_p (cgraph_node *node, vec known_csts) +{ + auto_vec surviving; + bool filled_vec = false; + ipa_node_params *info = IPA_NODE_REF (node); + int i, count = ipa_get_param_count (info); + for (i = 0; i < count; i++) + { + if (!known_csts[i] && ipa_is_param_used (info, i)) + continue; + + if (!filled_vec) + { + if (!node->clone.param_adjustments) + return true; + node->clone.param_adjustments->get_surviving_params (&surviving); + filled_vec = true; + } + if (surviving.length() < (unsigned) i && surviving[i]) + return true; + } + return false; +} + /* Create a specialized version of NODE with known constants in KNOWN_CSTS, known contexts in KNOWN_CONTEXTS and known aggregate values in AGGVALS and redirect all edges in CALLERS to it. 
*/ @@ -3788,31 +3855,65 @@ create_specialized_node (struct cgraph_node *node, { struct ipa_node_params *new_info, *info = IPA_NODE_REF (node); vec<ipa_replace_map *, va_gc> *replace_trees = NULL; + vec<ipa_adjusted_param, va_gc> *new_params = NULL; struct ipa_agg_replacement_value *av; struct cgraph_node *new_node; int i, count = ipa_get_param_count (info); - bitmap args_to_skip; - + ipa_param_adjustments *old_adjustments = node->clone.param_adjustments; + ipa_param_adjustments *new_adjustments; gcc_assert (!info->ipcp_orig_node); + gcc_assert (node->local.can_change_signature + || !old_adjustments); - if (node->local.can_change_signature) + if (old_adjustments) { - args_to_skip = BITMAP_GGC_ALLOC (); - for (i = 0; i < count; i++) + /* At the moment all IPA optimizations should use the number of + parameters of the prevailing decl as the m_always_copy_start. + Handling any other value would complicate the code below, so for the + time being let's only assert it is so. */ + gcc_assert (old_adjustments->m_always_copy_start == count + || old_adjustments->m_always_copy_start < 0); + int old_adj_count = vec_safe_length (old_adjustments->m_adj_params); + for (i = 0; i < old_adj_count; i++) { - tree t = known_csts[i]; - - if (t || !ipa_is_param_used (info, i)) - bitmap_set_bit (args_to_skip, i); + ipa_adjusted_param *old_adj = &(*old_adjustments->m_adj_params)[i]; + if (!node->local.can_change_signature + || old_adj->op != IPA_PARAM_OP_COPY + || (!known_csts[old_adj->base_index] + && ipa_is_param_used (info, old_adj->base_index))) + { + ipa_adjusted_param new_adj; + memcpy (&new_adj, old_adj, sizeof (new_adj)); + new_adj.prev_clone_adjustment = true; + new_adj.prev_clone_index = i; + vec_safe_push (new_params, new_adj); + } } + bool skip_return = old_adjustments->m_skip_return; + new_adjustments = (new (ggc_alloc<ipa_param_adjustments> ()) + ipa_param_adjustments (new_params, count, + skip_return)); } - else + else if (node->local.can_change_signature + && want_remove_some_param_p (node, known_csts)) { - args_to_skip = NULL; - if (dump_file && 
(dump_flags & TDF_DETAILS)) - fprintf (dump_file, " cannot change function signature\n"); + ipa_adjusted_param adj; + memset (&adj, 0, sizeof (adj)); + adj.op = IPA_PARAM_OP_COPY; + for (i = 0; i < count; i++) + if (!known_csts[i] && ipa_is_param_used (info, i)) + { + adj.base_index = i; + adj.prev_clone_index = i; + vec_safe_push (new_params, adj); + } + new_adjustments = (new (ggc_alloc ()) + ipa_param_adjustments (new_params, count, false)); } + else + new_adjustments = NULL; + replace_trees = vec_safe_copy (node->clone.tree_map); for (i = 0; i < count; i++) { tree t = known_csts[i]; @@ -3841,7 +3942,7 @@ create_specialized_node (struct cgraph_node *node, IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME ( node->decl))); new_node = node->create_virtual_clone (callers, replace_trees, - args_to_skip, "constprop", + new_adjustments, "constprop", suffix_counter); suffix_counter++; diff --git a/gcc/ipa-fnsummary.c b/gcc/ipa-fnsummary.c index 1c43b31104b..10251d6abd8 100644 --- a/gcc/ipa-fnsummary.c +++ b/gcc/ipa-fnsummary.c @@ -623,9 +623,7 @@ ipa_fn_summary_t::duplicate (cgraph_node *src, for (j = 0; vec_safe_iterate (dst->clone.tree_map, j, &r); j++) { - if (((!r->old_tree && r->parm_num == i) - || (r->old_tree && r->old_tree == ipa_get_param (parms_info, i))) - && r->replace_p && !r->ref_p) + if (r->parm_num == i) { known_vals[i] = r->new_tree; break; diff --git a/gcc/ipa-inline-transform.c b/gcc/ipa-inline-transform.c index 0e749985291..7e5e6d7f5b6 100644 --- a/gcc/ipa-inline-transform.c +++ b/gcc/ipa-inline-transform.c @@ -583,8 +583,7 @@ save_inline_function_body (struct cgraph_node *node) /* Copy the OLD_VERSION_NODE function tree to the new version. */ tree_function_versioning (node->decl, first_clone->decl, - NULL, true, NULL, false, - NULL, NULL); + NULL, NULL, true, NULL, NULL); /* The function will be short lived and removed after we inline all the clones, but make it internal so we won't confuse ourself. 
*/ diff --git a/gcc/ipa-param-manipulation.c b/gcc/ipa-param-manipulation.c index 1e3a92a125f..3363f49baec 100644 --- a/gcc/ipa-param-manipulation.c +++ b/gcc/ipa-param-manipulation.c @@ -28,164 +28,197 @@ along with GCC; see the file COPYING3. If not see #include "ssa.h" #include "cgraph.h" #include "fold-const.h" +#include "tree-eh.h" #include "stor-layout.h" #include "gimplify.h" #include "gimple-iterator.h" #include "gimplify-me.h" +#include "tree-cfg.h" #include "tree-dfa.h" #include "ipa-param-manipulation.h" #include "print-tree.h" #include "gimple-pretty-print.h" #include "builtins.h" +#include "tree-ssa.h" +#include "tree-inline.h" +#include "gimple-walk.h" -/* Return a heap allocated vector containing formal parameters of FNDECL. */ +/* Actual prefixes of different newly synthesized parameters. Keep in sync + with IPA_PARAM_PREFIX_* defines. */ -vec<tree> -ipa_get_vector_of_formal_parms (tree fndecl) +static const char *ipa_param_prefixes[] = {"SYNTH", + "ISRA", + "simd", + "mask"}; + +/* Names of parameters for dumping. Keep in sync with enum ipa_parm_op. */ + +static const char *ipa_param_op_names[] = {"IPA_PARAM_OP_UNDEFINED", + "IPA_PARAM_OP_COPY", + "IPA_PARAM_OP_NEW", + "IPA_PARAM_OP_SPLIT"}; + +/* Fill an empty vector ARGS with PARM_DECLs representing formal parameters of + FNDECL. The function should not be called during LTO WPA phase except for + thunks (or functions with bodies streamed in). */ + +void +ipa_fill_vector_with_formal_parms (vec<tree> *args, tree fndecl) { - vec<tree> args; int count; tree parm; - gcc_assert (!flag_wpa); + /* Safety check that we do not attempt to use the function in WPA, except + when the function is a thunk and then we have DECL_ARGUMENTS or when we + have already explicitly loaded its body.
*/ + gcc_assert (!flag_wpa + || DECL_ARGUMENTS (fndecl) + || gimple_has_body_p (fndecl)); count = 0; for (parm = DECL_ARGUMENTS (fndecl); parm; parm = DECL_CHAIN (parm)) count++; - args.create (count); + args->reserve_exact (count); for (parm = DECL_ARGUMENTS (fndecl); parm; parm = DECL_CHAIN (parm)) - args.quick_push (parm); - - return args; + args->quick_push (parm); } -/* Return a heap allocated vector containing types of formal parameters of +/* Fill an empty vector TYPES with trees representing formal parameters of function type FNTYPE. */ -vec -ipa_get_vector_of_formal_parm_types (tree fntype) +void +ipa_fill_vector_with_formal_parm_types (vec *types, tree fntype) { - vec types; int count = 0; tree t; for (t = TYPE_ARG_TYPES (fntype); t; t = TREE_CHAIN (t)) count++; - types.create (count); + types->reserve_exact (count); for (t = TYPE_ARG_TYPES (fntype); t; t = TREE_CHAIN (t)) - types.quick_push (TREE_VALUE (t)); - - return types; + types->quick_push (TREE_VALUE (t)); } -/* Modify the function declaration FNDECL and its type according to the plan in - ADJUSTMENTS. It also sets base fields of individual adjustments structures - to reflect the actual parameters being modified which are determined by the - base_index field. */ +/* Dump the adjustments in the vector ADJUSTMENTS to dump_file in a human + friendly way, assuming they are meant to be applied to FNDECL. */ void -ipa_modify_formal_parameters (tree fndecl, ipa_parm_adjustment_vec adjustments) -{ - vec oparms = ipa_get_vector_of_formal_parms (fndecl); - tree orig_type = TREE_TYPE (fndecl); - tree old_arg_types = TYPE_ARG_TYPES (orig_type); - - /* The following test is an ugly hack, some functions simply don't have any - arguments in their type. This is probably a bug but well... 
*/ - bool care_for_types = (old_arg_types != NULL_TREE); - bool last_parm_void; - vec otypes; - if (care_for_types) - { - last_parm_void = (TREE_VALUE (tree_last (old_arg_types)) - == void_type_node); - otypes = ipa_get_vector_of_formal_parm_types (orig_type); - if (last_parm_void) - gcc_assert (oparms.length () + 1 == otypes.length ()); - else - gcc_assert (oparms.length () == otypes.length ()); - } - else - { - last_parm_void = false; - otypes.create (0); - } +ipa_dump_adjusted_parameters (FILE *f, + vec *adj_params) +{ + unsigned i, len = vec_safe_length (adj_params); + bool first = true; - int len = adjustments.length (); - tree *link = &DECL_ARGUMENTS (fndecl); - tree new_arg_types = NULL; - for (int i = 0; i < len; i++) + fprintf (f, "IPA adjusted parameters: "); + for (i = 0; i < len; i++) { - struct ipa_parm_adjustment *adj; - gcc_assert (link); + struct ipa_adjusted_param *apm; + apm = &(*adj_params)[i]; - adj = &adjustments[i]; - tree parm; - if (adj->op == IPA_PARM_OP_NEW) - parm = NULL; + if (!first) + fprintf (f, " "); else - parm = oparms[adj->base_index]; - adj->base = parm; + first = false; - if (adj->op == IPA_PARM_OP_COPY) + fprintf (f, "%i. %s %s", i, ipa_param_op_names[apm->op], + apm->prev_clone_adjustment ? 
"prev_clone_adjustment " : ""); + switch (apm->op) { - if (care_for_types) - new_arg_types = tree_cons (NULL_TREE, otypes[adj->base_index], - new_arg_types); - *link = parm; - link = &DECL_CHAIN (parm); + case IPA_PARAM_OP_UNDEFINED: + break; + + case IPA_PARAM_OP_COPY: + fprintf (f, ", base_index: %u", apm->base_index); + fprintf (f, ", prev_clone_index: %u", apm->prev_clone_index); + break; + + case IPA_PARAM_OP_SPLIT: + fprintf (f, ", offset: %u", apm->unit_offset); + /* fall-through */ + case IPA_PARAM_OP_NEW: + fprintf (f, ", base_index: %u", apm->base_index); + fprintf (f, ", prev_clone_index: %u", apm->prev_clone_index); + print_node_brief (f, ", type: ", apm->type, 0); + print_node_brief (f, ", alias type: ", apm->alias_ptr_type, 0); + fprintf (f, " prefix: %s, reverse: %u, by_ref: %u", + ipa_param_prefixes[apm->param_prefix_index], + apm->reverse, apm->by_ref); + break; } - else if (adj->op != IPA_PARM_OP_REMOVE) - { - tree new_parm; - tree ptype; + fprintf (f, "\n"); + } +} - if (adj->by_ref) - ptype = build_pointer_type (adj->type); - else +/* Fill NEW_TYPES with types of a function after its current OTYPES have been + modified as described in ADJ_PARAMS. */ + +static void +fill_vector_of_new_param_types (vec *new_types, vec *otypes, + vec *adj_params, + bool use_prev_indices) +{ + unsigned adj_len = vec_safe_length (adj_params); + new_types->reserve_exact (adj_len); + for (unsigned i = 0; i < adj_len ; i++) + { + ipa_adjusted_param *apm = &(*adj_params)[i]; + if (apm->op == IPA_PARAM_OP_COPY) + { + unsigned index + = use_prev_indices ? apm->prev_clone_index : apm->base_index; + /* The following needs to be handled gracefully by breaking because + of type mismatches. This happens with LTO but apparently also in + Fortran with -fcoarray=lib -O2 -lcaf_single -latomic. 
*/ + if (index >= otypes->length ()) + continue; + new_types->quick_push ((*otypes)[index]); + } + else if (apm->op == IPA_PARAM_OP_NEW + || apm->op == IPA_PARAM_OP_SPLIT) + { + tree ntype; + if (apm->by_ref) { - ptype = adj->type; - if (is_gimple_reg_type (ptype) - && TYPE_MODE (ptype) != BLKmode) + ntype = build_pointer_type (apm->type); + if (is_gimple_reg_type (ntype) + && TYPE_MODE (ntype) != BLKmode) { - unsigned malign = GET_MODE_ALIGNMENT (TYPE_MODE (ptype)); - if (TYPE_ALIGN (ptype) != malign) - ptype = build_aligned_type (ptype, malign); + unsigned malign = GET_MODE_ALIGNMENT (TYPE_MODE (ntype)); + if (TYPE_ALIGN (ntype) != malign) + ntype = build_aligned_type (ntype, malign); } } - - if (care_for_types) - new_arg_types = tree_cons (NULL_TREE, ptype, new_arg_types); - - new_parm = build_decl (UNKNOWN_LOCATION, PARM_DECL, NULL_TREE, - ptype); - const char *prefix = adj->arg_prefix ? adj->arg_prefix : "SYNTH"; - DECL_NAME (new_parm) = create_tmp_var_name (prefix); - DECL_ARTIFICIAL (new_parm) = 1; - DECL_ARG_TYPE (new_parm) = ptype; - DECL_CONTEXT (new_parm) = fndecl; - TREE_USED (new_parm) = 1; - DECL_IGNORED_P (new_parm) = 1; - layout_decl (new_parm, 0); - - if (adj->op == IPA_PARM_OP_NEW) - adj->base = NULL; else - adj->base = parm; - adj->new_decl = new_parm; - - *link = new_parm; - link = &DECL_CHAIN (new_parm); + ntype = apm->type; + new_types->quick_push (ntype); } + else + gcc_unreachable (); } +} - *link = NULL_TREE; +/* Build and return a function type just like ORIG_TYPE but with parameter + types given in NEW_PARAM_TYPES - which can be NULL if, but only if, + ORIG_TYPE itself has NULL TYPE_ARG_TYPES. If METHOD2FUNC is true, also make + it a FUNCTION_TYPE instead of METHOD_TYPE.
*/ - tree new_reversed = NULL; - if (care_for_types) +static tree +build_adjusted_function_type (tree orig_type, vec *new_param_types, + bool method2func, bool skip_return) +{ + tree new_arg_types = NULL; + if (TYPE_ARG_TYPES (orig_type)) { - new_reversed = nreverse (new_arg_types); + gcc_checking_assert (new_param_types); + bool last_parm_void = (TREE_VALUE (tree_last (TYPE_ARG_TYPES (orig_type))) + == void_type_node); + unsigned len = new_param_types->length (); + for (unsigned i = 0; i < len; i++) + new_arg_types = tree_cons (NULL_TREE, (*new_param_types)[i], + new_arg_types); + + tree new_reversed = nreverse (new_arg_types); if (last_parm_void) { if (new_reversed) @@ -193,227 +226,588 @@ ipa_modify_formal_parameters (tree fndecl, ipa_parm_adjustment_vec adjustments) else new_reversed = void_list_node; } + new_arg_types = new_reversed; } /* Use copy_node to preserve as much as possible from original type (debug info, attribute lists etc.) - Exception is METHOD_TYPEs must have THIS argument. - When we are asked to remove it, we need to build new FUNCTION_TYPE - instead. */ + Exception is METHOD_TYPEs which must have THIS argument and when we are + asked to remove it, we need to build new FUNCTION_TYPE instead. */ tree new_type = NULL; - if (TREE_CODE (orig_type) != METHOD_TYPE - || (adjustments[0].op == IPA_PARM_OP_COPY - && adjustments[0].base_index == 0)) + if (method2func) + { + tree ret_type; + if (skip_return) + ret_type = void_type_node; + else + ret_type = TREE_TYPE (orig_type); + + new_type + = build_distinct_type_copy (build_function_type (ret_type, + new_arg_types)); + TYPE_CONTEXT (new_type) = TYPE_CONTEXT (orig_type); + } + else { new_type = build_distinct_type_copy (orig_type); - TYPE_ARG_TYPES (new_type) = new_reversed; + TYPE_ARG_TYPES (new_type) = new_arg_types; + if (skip_return) + TREE_TYPE (new_type) = void_type_node; + } + + return new_type; +} + + +/* Return the maximum index in any IPA_PARAM_OP_COPY adjustment or -1 if there + is none. 
*/ + +int +ipa_param_adjustments::get_max_base_index () +{ + unsigned adj_len = vec_safe_length (m_adj_params); + int max_index = -1; + for (unsigned i = 0; i < adj_len; i++) + { + ipa_adjusted_param *apm = &(*m_adj_params)[i]; + if (apm->op == IPA_PARAM_OP_COPY + && max_index < apm->base_index) + max_index = apm->base_index; + } + return max_index; +} + + +/* Fill SURVIVING_PARAMS with an array of bools where each one says whether a + parameter that originally was at that position still survives in the given + clone or is removed/replaced. If the final array is smaller than an index + of an original parameter, that parameter also did not survive. That a + parameter survives does not mean it has the same index as before. */ + +void +ipa_param_adjustments::get_surviving_params (vec<bool> *surviving_params) +{ + unsigned adj_len = vec_safe_length (m_adj_params); + int max_index = get_max_base_index (); + + if (max_index < 0) + return; + surviving_params->reserve_exact (max_index + 1); + surviving_params->quick_grow_cleared (max_index + 1); + for (unsigned i = 0; i < adj_len; i++) + { + ipa_adjusted_param *apm = &(*m_adj_params)[i]; + if (apm->op == IPA_PARAM_OP_COPY) + (*surviving_params)[apm->base_index] = true; + } +} + +/* Fill NEW_INDICES with new indices of each surviving parameter or -1 for + those which do not survive. Any parameter outside of the length of the + vector does not survive. There is currently no support for a parameter to + be copied to two distinct new parameters.
*/ + +void +ipa_param_adjustments::get_updated_indices (vec<int> *new_indices) +{ + unsigned adj_len = vec_safe_length (m_adj_params); + int max_index = get_max_base_index (); + + if (max_index < 0) + return; + unsigned res_len = max_index + 1; + new_indices->reserve_exact (res_len); + for (unsigned i = 0; i < res_len; i++) + new_indices->quick_push (-1); + for (unsigned i = 0; i < adj_len; i++) + { + ipa_adjusted_param *apm = &(*m_adj_params)[i]; + if (apm->op == IPA_PARAM_OP_COPY) + (*new_indices)[apm->base_index] = i; + } +} + +/* Return true if the first parameter (assuming there was one) survives the + transformation intact and remains the first one. */ + +bool +ipa_param_adjustments::first_param_intact_p () +{ + return (!vec_safe_is_empty (m_adj_params) + && (*m_adj_params)[0].op == IPA_PARAM_OP_COPY + && (*m_adj_params)[0].base_index == 0); +} + +/* Return true if we have to change what has formerly been a method into a + function. */ + +bool +ipa_param_adjustments::method2func_p (tree orig_type) +{ + return ((TREE_CODE (orig_type) == METHOD_TYPE) && !first_param_intact_p ()); +} + +/* Given function type OLD_TYPE, return a new type derived from it after + performing all stored modifications. TYPE_IS_ORIGINAL_P should be true when + OLD_TYPE refers to the type before any IPA transformations, as opposed to a + type that can be an intermediate one in between various IPA + transformations.
*/ + +tree +ipa_param_adjustments::build_new_function_type (tree old_type, + bool type_is_original_p) +{ + auto_vec new_param_types, *new_param_types_p; + if (prototype_p (old_type)) + { + auto_vec otypes; + ipa_fill_vector_with_formal_parm_types (&otypes, old_type); + fill_vector_of_new_param_types (&new_param_types, &otypes, m_adj_params, + !type_is_original_p); + new_param_types_p = &new_param_types; } else + new_param_types_p = NULL; + + return build_adjusted_function_type (old_type, new_param_types_p, + method2func_p (old_type), m_skip_return); +} + +/* Build variant of function decl ORIG_DECL skipping ARGS_TO_SKIP and the + return value if SKIP_RETURN is true. Arguments from DECL_ARGUMENTS list are + not removed now, since they are linked by TREE_CHAIN directly and not + accessible in LTO during WPA. The caller is responsible for eliminating + them when clones are properly materialized. */ + +tree +ipa_param_adjustments::adjust_decl (tree orig_decl) +{ + tree new_decl = copy_node (orig_decl); + tree orig_type = TREE_TYPE (orig_decl); + if (prototype_p (orig_type) + || (m_skip_return && !VOID_TYPE_P (TREE_TYPE (orig_type)))) { - new_type - = build_distinct_type_copy (build_function_type (TREE_TYPE (orig_type), - new_reversed)); - TYPE_CONTEXT (new_type) = TYPE_CONTEXT (orig_type); - DECL_VINDEX (fndecl) = NULL_TREE; + tree new_type = build_new_function_type (orig_type, false); + TREE_TYPE (new_decl) = new_type; } + if (method2func_p (orig_type)) + DECL_VINDEX (new_decl) = NULL_TREE; /* When signature changes, we need to clear builtin info. 
*/ - if (fndecl_built_in_p (fndecl)) + if (fndecl_built_in_p (new_decl)) { - DECL_BUILT_IN_CLASS (fndecl) = NOT_BUILT_IN; - DECL_FUNCTION_CODE (fndecl) = (enum built_in_function) 0; + DECL_BUILT_IN_CLASS (new_decl) = NOT_BUILT_IN; + DECL_FUNCTION_CODE (new_decl) = (enum built_in_function) 0; } - TREE_TYPE (fndecl) = new_type; - DECL_VIRTUAL_P (fndecl) = 0; - DECL_LANG_SPECIFIC (fndecl) = NULL; - otypes.release (); - oparms.release (); + DECL_VIRTUAL_P (new_decl) = 0; + DECL_LANG_SPECIFIC (new_decl) = NULL; + + return new_decl; } -/* Modify actual arguments of a function call CS as indicated in ADJUSTMENTS. - If this is a directly recursive call, CS must be NULL. Otherwise it must - contain the corresponding call graph edge. */ +/* Wrapper around get_base_ref_and_offset for cases interesting for IPA-SRA + transformations. Return true if EXPR has an interesting form and fill in + *BASE_P and *UNIT_OFFSET_P with the appropriate info. */ -void -ipa_modify_call_arguments (struct cgraph_edge *cs, gcall *stmt, - ipa_parm_adjustment_vec adjustments) -{ - struct cgraph_node *current_node = cgraph_node::get (current_function_decl); - vec vargs; - vec **debug_args = NULL; - gcall *new_stmt; - gimple_stmt_iterator gsi, prev_gsi; - tree callee_decl; - int i, len; +static bool +isra_get_ref_base_and_offset (tree expr, tree *base_p, unsigned *unit_offset_p) +{ + HOST_WIDE_INT offset, size; + bool reverse; + tree base + = get_ref_base_and_extent_hwi (expr, &offset, &size, &reverse); + if (!base || size < 0) + return false; + + if ((offset % BITS_PER_UNIT) != 0) + return false; + + if (TREE_CODE (base) == MEM_REF) + { + poly_int64 plmoff = mem_ref_offset (base).force_shwi (); + HOST_WIDE_INT moff; + bool is_cst = plmoff.is_constant (&moff); + if (!is_cst) + return false; + offset += moff * BITS_PER_UNIT; + base = TREE_OPERAND (base, 0); + } + + if (offset < 0 || (offset / BITS_PER_UNIT) > UINT_MAX) + return false; + + *base_p = base; + *unit_offset_p = offset / BITS_PER_UNIT; + 
return true; +} + +/* Return true if APM describes a transitive split, i.e. one that happened for + both the caller and the callee. */ - static bool +transitive_split_p (vec<ipa_param_performed_split, va_gc> *performed_splits, + tree expr, unsigned *sm_idx, unsigned *unit_offset_p) +{ + tree base; + if (!isra_get_ref_base_and_offset (expr, &base, unit_offset_p)) + return false; + + if (TREE_CODE (base) == SSA_NAME) + { + base = SSA_NAME_VAR (base); + if (!base) + return false; + } + + unsigned len = vec_safe_length (performed_splits); + for (unsigned i = 0; i < len; i++) + { + ipa_param_performed_split *sm = &(*performed_splits)[i]; + if (sm->dummy_decl == base) + { + *sm_idx = i; + return true; + } + } + return false; +} + +struct transitive_split_map +{ + tree repl; + unsigned base_index; + unsigned unit_offset; +}; + +/* Build all structures necessary to handle transitive splits. !!!Doc */ + +static unsigned +init_transitive_splits (vec<ipa_param_performed_split, va_gc> *performed_splits, + gcall *stmt, vec<unsigned> *index_map, + auto_vec<transitive_split_map> *trans_map) +{ + unsigned phony_arguments = 0; + unsigned stmt_idx = 0, base_index = 0; + unsigned nargs = gimple_call_num_args (stmt); + while (stmt_idx < nargs) + { + unsigned unit_offset_delta; + tree base_arg = gimple_call_arg (stmt, stmt_idx); + + if (phony_arguments > 0) + index_map->safe_push (stmt_idx); + + unsigned sm_idx; + stmt_idx++; + if (transitive_split_p (performed_splits, base_arg, &sm_idx, + &unit_offset_delta)) + { + if (phony_arguments == 0) + /* We have optimistically avoided constructing index_map so far but + now it is clear it will be necessary, so let's create the easy + bit we skipped until now.
*/ + for (unsigned k = 0; k < stmt_idx; k++) + index_map->safe_push (k); + + tree dummy = (*performed_splits)[sm_idx].dummy_decl; + for (unsigned j = sm_idx; j < performed_splits->length (); j++) + { + ipa_param_performed_split *caller_split + = &(*performed_splits)[j]; + if (caller_split->dummy_decl != dummy) + break; + gcc_assert (stmt_idx < nargs); + tree arg = gimple_call_arg (stmt, stmt_idx); + + struct transitive_split_map tsm; + tsm.repl = arg; + tsm.base_index = base_index; + gcc_assert (caller_split->unit_offset >= unit_offset_delta); + tsm.unit_offset = (caller_split->unit_offset - unit_offset_delta); + trans_map->safe_push (tsm); + + phony_arguments++; + stmt_idx++; + } + } + base_index++; + } + return phony_arguments; +} + +/* Modify actual arguments of a function call in statement STMT, assuming it + calls CALLEE_DECL. PERFORMED_SPLITS must describe the splits of parameters + already performed in the caller or be NULL if there are none. If + UPDATE_REFERENCES is true, also remove the references of the caller to the + original statement. Return the new statement that replaced the old one. + When invoked, cfun and current_function_decl have to be set to the + caller. */ + +gcall * +ipa_param_adjustments::modify_call (gcall *stmt, + vec<ipa_param_performed_split, va_gc> *performed_splits, + tree callee_decl, bool update_references) +{ + unsigned len = vec_safe_length (m_adj_params); + auto_vec<tree> vargs (len); + tree old_decl = m_old_decl ?
m_old_decl : gimple_call_fndecl (stmt); + unsigned old_nargs = gimple_call_num_args (stmt); + auto_vec kept (old_nargs); + kept.quick_grow_cleared (old_nargs); + + auto_vec index_map; + auto_vec trans_map; + bool transitive_remapping = false; + if (performed_splits) + { + unsigned removed = init_transitive_splits (performed_splits, + stmt, &index_map, &trans_map); + if (removed > 0) + { + transitive_remapping = true; + old_nargs -= removed; + } + } - gsi = gsi_for_stmt (stmt); - prev_gsi = gsi; + cgraph_node *current_node = cgraph_node::get (current_function_decl); + if (update_references) + current_node->remove_stmt_references (stmt); + + gimple_stmt_iterator gsi = gsi_for_stmt (stmt); + gimple_stmt_iterator prev_gsi = gsi; gsi_prev (&prev_gsi); - for (i = 0; i < len; i++) + for (unsigned i = 0; i < len; i++) { - struct ipa_parm_adjustment *adj; + ipa_adjusted_param *apm = &(*m_adj_params)[i]; - adj = &adjustments[i]; + gcc_assert (apm->op != IPA_PARAM_OP_UNDEFINED + /* Any transformation that introduces unspecified new + parameters needs to transform actual arguments itself. */ + && apm->op != IPA_PARAM_OP_NEW); - if (adj->op == IPA_PARM_OP_COPY) + if (apm->op == IPA_PARAM_OP_COPY) { - tree arg = gimple_call_arg (stmt, adj->base_index); + unsigned index = apm->base_index; + if (index >= old_nargs) + /* Can happen if the original call has argument mismatch, + ignore. */ + continue; + if (transitive_remapping) + index = index_map[apm->base_index]; + + tree arg = gimple_call_arg (stmt, index); vargs.quick_push (arg); + kept[index] = true; + continue; } - else if (adj->op != IPA_PARM_OP_REMOVE) + + /* At the moment the only user of IPA_PARAM_OP_NEW modifies calls itself. + If we ever want to support it during WPA IPA stage, we'll need a + mechanism to call into the IPA passes that introduced them. Currently + we simply mandate that IPA infrastructure understands all argument + modifications. 
Remember, edge redirection/modification is done only + once, not in steps for each pass modifying the callee like clone + materialization. */ + gcc_assert (apm->op == IPA_PARAM_OP_SPLIT); + + /* We have to handle transitive changes differently using the maps we + have created before. So look into them first. */ + + tree repl = NULL_TREE; + for (unsigned j = 0; j < trans_map.length (); j++) + if (trans_map[j].base_index == apm->base_index + && trans_map[j].unit_offset == apm->unit_offset) + { + repl = trans_map[j].repl; + break; + } + if (repl) { - tree expr, base, off; - location_t loc; - unsigned int deref_align = 0; - bool deref_base = false; - - /* We create a new parameter out of the value of the old one, we can - do the following kind of transformations: - - - A scalar passed by reference is converted to a scalar passed by - value. (adj->by_ref is false and the type of the original - actual argument is a pointer to a scalar). - - - A part of an aggregate is passed instead of the whole aggregate. - The part can be passed either by value or by reference, this is - determined by value of adj->by_ref. Moreover, the code below - handles both situations when the original aggregate is passed by - value (its type is not a pointer) and when it is passed by - reference (it is a pointer to an aggregate). - - When the new argument is passed by reference (adj->by_ref is true) - it must be a part of an aggregate and therefore we form it by - simply taking the address of a reference inside the original - aggregate. 
*/ - - poly_int64 byte_offset = exact_div (adj->offset, BITS_PER_UNIT); - base = gimple_call_arg (stmt, adj->base_index); - loc = gimple_location (stmt); - - if (TREE_CODE (base) != ADDR_EXPR - && POINTER_TYPE_P (TREE_TYPE (base))) - off = build_int_cst (adj->alias_ptr_type, byte_offset); - else + vargs.quick_push (repl); + continue; + } + + unsigned index = apm->base_index; + if (index >= old_nargs) + /* Can happen if the original call has argument mismatch, ignore. */ + continue; + if (transitive_remapping) + index = index_map[apm->base_index]; + tree base = gimple_call_arg (stmt, index); + + /* We create a new parameter out of the value of the old one, we can + do the following kind of transformations: + + - A scalar passed by reference is converted to a scalar passed by + value. (apm->by_ref is false and the type of the original + actual argument is a pointer to a scalar). + + - A part of an aggregate is passed instead of the whole aggregate. + The part can be passed either by value or by reference, this is + determined by value of apm->by_ref. Moreover, the code below + handles both situations when the original aggregate is passed by + value (its type is not a pointer) and when it is passed by + reference (it is a pointer to an aggregate). + + When the new argument is passed by reference (apm->by_ref is true) + it must be a part of an aggregate and therefore we form it by + simply taking the address of a reference inside the original + aggregate. 
*/ + + location_t loc = gimple_location (stmt); + + tree off; + bool deref_base = false; + unsigned int deref_align = 0; + if (TREE_CODE (base) != ADDR_EXPR && POINTER_TYPE_P (TREE_TYPE (base))) + off = build_int_cst (apm->alias_ptr_type, apm->unit_offset); + else + { + bool addrof; + if (TREE_CODE (base) == ADDR_EXPR) { - poly_int64 base_offset; - tree prev_base; - bool addrof; + base = TREE_OPERAND (base, 0); + addrof = true; + } + else + addrof = false; - if (TREE_CODE (base) == ADDR_EXPR) - { - base = TREE_OPERAND (base, 0); - addrof = true; - } - else - addrof = false; - prev_base = base; - base = get_addr_base_and_unit_offset (base, &base_offset); - /* Aggregate arguments can have non-invariant addresses. */ - if (!base) - { - base = build_fold_addr_expr (prev_base); - off = build_int_cst (adj->alias_ptr_type, byte_offset); - } - else if (TREE_CODE (base) == MEM_REF) - { - if (!addrof) - { - deref_base = true; - deref_align = TYPE_ALIGN (TREE_TYPE (base)); - } - off = build_int_cst (adj->alias_ptr_type, - base_offset + byte_offset); - off = int_const_binop (PLUS_EXPR, TREE_OPERAND (base, 1), - off); - base = TREE_OPERAND (base, 0); - } - else + tree prev_base = base; + poly_int64 base_offset; + base = get_addr_base_and_unit_offset (base, &base_offset); + + /* Aggregate arguments can have non-invariant addresses. 
*/ + if (!base) + { + base = build_fold_addr_expr (prev_base); + off = build_int_cst (apm->alias_ptr_type, apm->unit_offset); + } + else if (TREE_CODE (base) == MEM_REF) + { + if (!addrof) { - off = build_int_cst (adj->alias_ptr_type, - base_offset + byte_offset); - base = build_fold_addr_expr (base); + deref_base = true; + deref_align = TYPE_ALIGN (TREE_TYPE (base)); } + off = build_int_cst (apm->alias_ptr_type, + base_offset + apm->unit_offset); + off = int_const_binop (PLUS_EXPR, TREE_OPERAND (base, 1), + off); + base = TREE_OPERAND (base, 0); } - - if (!adj->by_ref) + else { - tree type = adj->type; - unsigned int align; - unsigned HOST_WIDE_INT misalign; + off = build_int_cst (apm->alias_ptr_type, + base_offset + apm->unit_offset); + base = build_fold_addr_expr (base); + } + } - if (deref_base) - { - align = deref_align; - misalign = 0; - } - else - { - get_pointer_alignment_1 (base, &align, &misalign); - if (TYPE_ALIGN (type) > align) - align = TYPE_ALIGN (type); - } - misalign += (offset_int::from (wi::to_wide (off), - SIGNED).to_short_addr () - * BITS_PER_UNIT); - misalign = misalign & (align - 1); - if (misalign != 0) - align = least_bit_hwi (misalign); - if (align < TYPE_ALIGN (type)) - type = build_aligned_type (type, align); - base = force_gimple_operand_gsi (&gsi, base, - true, NULL, true, GSI_SAME_STMT); - expr = fold_build2_loc (loc, MEM_REF, type, base, off); - REF_REVERSE_STORAGE_ORDER (expr) = adj->reverse; - /* If expr is not a valid gimple call argument emit - a load into a temporary. 
*/ - if (is_gimple_reg_type (TREE_TYPE (expr))) - { - gimple *tem = gimple_build_assign (NULL_TREE, expr); - if (gimple_in_ssa_p (cfun)) - { - gimple_set_vuse (tem, gimple_vuse (stmt)); - expr = make_ssa_name (TREE_TYPE (expr), tem); - } - else - expr = create_tmp_reg (TREE_TYPE (expr)); - gimple_assign_set_lhs (tem, expr); - gimple_set_location (tem, loc); - gsi_insert_before (&gsi, tem, GSI_SAME_STMT); - } + tree expr; + if (!apm->by_ref) + { + tree type = apm->type; + unsigned int align; + unsigned HOST_WIDE_INT misalign; + + if (deref_base) + { + align = deref_align; + misalign = 0; } else { - expr = fold_build2_loc (loc, MEM_REF, adj->type, base, off); - REF_REVERSE_STORAGE_ORDER (expr) = adj->reverse; - expr = build_fold_addr_expr (expr); - expr = force_gimple_operand_gsi (&gsi, expr, - true, NULL, true, GSI_SAME_STMT); + get_pointer_alignment_1 (base, &align, &misalign); + /* All users must make sure that we can be optimistic when it + comes to alignment in this case (by inspecting the final users + of these new parameters). */ + if (TYPE_ALIGN (type) > align) + align = TYPE_ALIGN (type); + } + misalign += (offset_int::from (wi::to_wide (off), + SIGNED).to_short_addr () + * BITS_PER_UNIT); + misalign = misalign & (align - 1); + if (misalign != 0) + align = least_bit_hwi (misalign); + if (align < TYPE_ALIGN (type)) + type = build_aligned_type (type, align); + base = force_gimple_operand_gsi (&gsi, base, + true, NULL, true, GSI_SAME_STMT); + expr = fold_build2_loc (loc, MEM_REF, type, base, off); + REF_REVERSE_STORAGE_ORDER (expr) = apm->reverse; + /* If expr is not a valid gimple call argument emit + a load into a temporary. 
*/ + if (is_gimple_reg_type (TREE_TYPE (expr))) + { + gimple *tem = gimple_build_assign (NULL_TREE, expr); + if (gimple_in_ssa_p (cfun)) + { + gimple_set_vuse (tem, gimple_vuse (stmt)); + expr = make_ssa_name (TREE_TYPE (expr), tem); + } + else + expr = create_tmp_reg (TREE_TYPE (expr)); + gimple_assign_set_lhs (tem, expr); + gsi_insert_before (&gsi, tem, GSI_SAME_STMT); } - vargs.quick_push (expr); } - if (adj->op != IPA_PARM_OP_COPY && MAY_HAVE_DEBUG_BIND_STMTS) + else { - unsigned int ix; - tree ddecl = NULL_TREE, origin = DECL_ORIGIN (adj->base), arg; - gimple *def_temp; + expr = fold_build2_loc (loc, MEM_REF, apm->type, base, off); + REF_REVERSE_STORAGE_ORDER (expr) = apm->reverse; + expr = build_fold_addr_expr (expr); + expr = force_gimple_operand_gsi (&gsi, expr, + true, NULL, true, GSI_SAME_STMT); + } + vargs.quick_push (expr); + } + + if (m_always_copy_start >= 0) + for (unsigned i = m_always_copy_start; i < old_nargs; i++) + vargs.safe_push (gimple_call_arg (stmt, i)); + + /* For optimized away parameters, add on the caller side + before the call + DEBUG D#X => parm_Y(D) + stmts and associate D#X with parm in decl_debug_args_lookup + vector to say for debug info that if parameter parm had been passed, + it would have value parm_Y(D). 
*/ + if (MAY_HAVE_DEBUG_STMTS && old_decl && callee_decl) + { + vec **debug_args = NULL; + unsigned i = 0; + for (tree old_parm = DECL_ARGUMENTS (old_decl); + old_parm && i < old_nargs && ((int) i) < m_always_copy_start; + old_parm = DECL_CHAIN (old_parm), i++) + { + if (!is_gimple_reg (old_parm) || kept[i]) + continue; + tree origin = DECL_ORIGIN (old_parm); + tree arg = gimple_call_arg (stmt, i); - arg = gimple_call_arg (stmt, adj->base_index); if (!useless_type_conversion_p (TREE_TYPE (origin), TREE_TYPE (arg))) { if (!fold_convertible_p (TREE_TYPE (origin), arg)) continue; - arg = fold_convert_loc (gimple_location (stmt), - TREE_TYPE (origin), arg); + tree rhs1; + if (TREE_CODE (arg) == SSA_NAME + && gimple_assign_cast_p (SSA_NAME_DEF_STMT (arg)) + && (rhs1 + = gimple_assign_rhs1 (SSA_NAME_DEF_STMT (arg))) + && useless_type_conversion_p (TREE_TYPE (origin), + TREE_TYPE (rhs1))) + arg = rhs1; + else + arg = fold_convert_loc (gimple_location (stmt), + TREE_TYPE (origin), arg); } if (debug_args == NULL) debug_args = decl_debug_args_insert (callee_decl); + unsigned int ix; + tree ddecl = NULL_TREE; for (ix = 0; vec_safe_iterate (*debug_args, ix, &ddecl); ix += 2) if (ddecl == origin) { @@ -430,7 +824,8 @@ ipa_modify_call_arguments (struct cgraph_edge *cs, gcall *stmt, vec_safe_push (*debug_args, origin); vec_safe_push (*debug_args, ddecl); } - def_temp = gimple_build_debug_bind (ddecl, unshare_expr (arg), stmt); + gimple *def_temp = gimple_build_debug_bind (ddecl, + unshare_expr (arg), stmt); gsi_insert_before (&gsi, def_temp, GSI_SAME_STMT); } } @@ -441,10 +836,34 @@ ipa_modify_call_arguments (struct cgraph_edge *cs, gcall *stmt, print_gimple_stmt (dump_file, gsi_stmt (gsi), 0); } - new_stmt = gimple_build_call_vec (callee_decl, vargs); - vargs.release (); - if (gimple_call_lhs (stmt)) - gimple_call_set_lhs (new_stmt, gimple_call_lhs (stmt)); + gcall *new_stmt = gimple_build_call_vec (callee_decl, vargs); + + if (tree lhs = gimple_call_lhs (stmt)) + { + if 
(!m_skip_return) + gimple_call_set_lhs (new_stmt, lhs); + else if (TREE_CODE (lhs) == SSA_NAME) + { + /* LHS should now be a default-def SSA. Unfortunately default-def + SSA_NAMEs need a backing variable (or at least some code examining + SSAs assumes it is non-NULL). So we either have to re-use the + decl we have at hand or introduce a new one. */ + tree repl = create_tmp_var (TREE_TYPE (lhs), "removed_return"); + repl = get_or_create_ssa_default_def (cfun, repl); + SSA_NAME_IS_DEFAULT_DEF (repl) = true; + imm_use_iterator ui; + use_operand_p use_p; + gimple *using_stmt; + FOR_EACH_IMM_USE_STMT (using_stmt, ui, lhs) + { + FOR_EACH_IMM_USE_ON_STMT (use_p, ui) + { + SET_USE (use_p, repl); + } + update_stmt (using_stmt); + } + } + } gimple_set_block (new_stmt, gimple_block (stmt)); if (gimple_has_location (stmt)) @@ -457,7 +876,8 @@ ipa_modify_call_arguments (struct cgraph_edge *cs, gcall *stmt, if (gimple_vdef (stmt)) { gimple_set_vdef (new_stmt, gimple_vdef (stmt)); - SSA_NAME_DEF_STMT (gimple_vdef (new_stmt)) = new_stmt; + if (TREE_CODE (gimple_vdef (new_stmt)) == SSA_NAME) + SSA_NAME_DEF_STMT (gimple_vdef (new_stmt)) = new_stmt; } } @@ -468,120 +888,384 @@ ipa_modify_call_arguments (struct cgraph_edge *cs, gcall *stmt, fprintf (dump_file, "\n"); } gsi_replace (&gsi, new_stmt, true); - if (cs) - cs->set_call_stmt (new_stmt); - do - { - current_node->record_stmt_references (gsi_stmt (gsi)); - gsi_prev (&gsi); - } - while (gsi_stmt (gsi) != gsi_stmt (prev_gsi)); + if (update_references) + do + { + current_node->record_stmt_references (gsi_stmt (gsi)); + gsi_prev (&gsi); + } + while (gsi_stmt (gsi) != gsi_stmt (prev_gsi)); + return new_stmt; +} + +/* Note that this variant will always re-record references. */ +/* FIXME: This is for early IPA-SRA only, consider removing when it is + gone. */ +void +ipa_param_adjustments::modify_call (struct cgraph_edge *cs) +{ + gcall *old_call = cs->call_stmt; + tree callee_decl = cs->callee ?
cs->callee->decl : NULL; + /* TODO: Set current_function_decl? */ + gcall *new_call = modify_call (old_call, NULL, callee_decl, true); + cs->set_call_stmt (new_call); } -/* Return true iff BASE_INDEX is in ADJUSTMENTS more than once. */ +/* Dump information contained in the object in textual form to F. */ -static bool -index_in_adjustments_multiple_times_p (int base_index, - ipa_parm_adjustment_vec adjustments) +void +ipa_param_adjustments::dump (FILE *f) { - int i, len = adjustments.length (); - bool one = false; + fprintf (f, "m_always_copy_start: %i\n", m_always_copy_start); + ipa_dump_adjusted_parameters (f, m_adj_params); + if (m_skip_return) + fprintf (f, "Will SKIP return.\n"); +} - for (i = 0; i < len; i++) - { - struct ipa_parm_adjustment *adj; - adj = &adjustments[i]; +/* Dump information contained in the object in textual form to stderr. */ - if (adj->base_index == base_index) - { - if (one) - return true; - else - one = true; - } - } - return false; +void +ipa_param_adjustments::debug () +{ + dump (stderr); } -/* Return adjustments that should have the same effect on function parameters - and call arguments as if they were first changed according to adjustments in - INNER and then by adjustments in OUTER. */ +/* Register that REPLACEMENT should replace parameter described in APM and + optionally as DUMMY to mark transitive splits across calls.
*/ -ipa_parm_adjustment_vec -ipa_combine_adjustments (ipa_parm_adjustment_vec inner, - ipa_parm_adjustment_vec outer) +void +ipa_param_body_adjustments::register_replacement (ipa_adjusted_param *apm, + tree replacement, + tree dummy) { - int i, outlen = outer.length (); - int inlen = inner.length (); - int removals = 0; - ipa_parm_adjustment_vec adjustments, tmp; + gcc_checking_assert (apm->op == IPA_PARAM_OP_SPLIT + || apm->op == IPA_PARAM_OP_NEW); + gcc_checking_assert (!apm->prev_clone_adjustment); + ipa_param_body_replacement psr; + psr.base = m_oparms[apm->base_index]; + psr.repl = replacement; + psr.dummy = dummy; + psr.unit_offset = apm->unit_offset; + psr.by_ref = apm->by_ref; + psr.reverse = apm->reverse; + m_replacements.safe_push (psr); +} - tmp.create (inlen); - for (i = 0; i < inlen; i++) - { - struct ipa_parm_adjustment *n; - n = &inner[i]; +/* Copy or not, as appropriate given COPY_PARM_DECLS and ID, a pre-existing + PARM_DECL T so that it can be included in the parameters of the modified + function. */ - if (n->op == IPA_PARM_OP_REMOVE) - removals++; - else +tree +ipa_param_body_adjustments::carry_over_param (tree t, bool copy_parm_decls) +{ + tree new_parm; + if (copy_parm_decls) + { + if (m_id) { - /* FIXME: Handling of new arguments are not implemented yet. */ - gcc_assert (n->op != IPA_PARM_OP_NEW); - tmp.quick_push (*n); + new_parm = remap_decl (t, m_id); + if (TREE_CODE (new_parm) != PARM_DECL) + new_parm = m_id->copy_decl (t, m_id); } + else + new_parm = copy_node (t); } + else + new_parm = t; + return new_parm; +} - adjustments.create (outlen + removals); - for (i = 0; i < outlen; i++) - { - struct ipa_parm_adjustment r; - struct ipa_parm_adjustment *out = &outer[i]; - struct ipa_parm_adjustment *in = &tmp[out->base_index]; +/* Common initialization. 
*/ - memset (&r, 0, sizeof (r)); - gcc_assert (in->op != IPA_PARM_OP_REMOVE); - if (out->op == IPA_PARM_OP_REMOVE) +void +ipa_param_body_adjustments::common_initialization (bool copy_parm_decls, + tree old_fndecl, + tree *vars, + vec *tree_map) +{ + tree fndecl = old_fndecl ? old_fndecl : m_fndecl; + ipa_fill_vector_with_formal_parms (&m_oparms, fndecl); + auto_vec otypes; + if (TYPE_ARG_TYPES (TREE_TYPE (fndecl)) != NULL_TREE) + ipa_fill_vector_with_formal_parm_types (&otypes, TREE_TYPE (fndecl)); + else + { + auto_vec oparms; + ipa_fill_vector_with_formal_parms (&oparms, fndecl); + unsigned ocount = oparms.length (); + m_new_types.reserve_exact (ocount); + for (unsigned i = 0; i < ocount; i++) + otypes.quick_push (TREE_TYPE (oparms[i])); + } + fill_vector_of_new_param_types (&m_new_types, &otypes, m_adj_params, true); + + auto_vec kept; + kept.reserve_exact (m_oparms.length ()); + kept.quick_grow_cleared (m_oparms.length ()); + auto_vec isra_dummy_decls; + isra_dummy_decls.reserve_exact (m_oparms.length ()); + isra_dummy_decls.quick_grow_cleared (m_oparms.length ()); + + unsigned adj_len = vec_safe_length (m_adj_params); + m_method2func = ((TREE_CODE (TREE_TYPE (m_fndecl)) == METHOD_TYPE) + && (adj_len == 0 + || (*m_adj_params)[0].op != IPA_PARAM_OP_COPY + || (*m_adj_params)[0].base_index != 0)); + + m_new_decls.reserve_exact (adj_len); + for (unsigned i = 0; i < adj_len ; i++) + { + ipa_adjusted_param *apm = &(*m_adj_params)[i]; + unsigned prev_index = apm->prev_clone_index; + tree new_parm; + if (apm->op == IPA_PARAM_OP_COPY + || apm->prev_clone_adjustment) { - if (!index_in_adjustments_multiple_times_p (in->base_index, tmp)) + kept[prev_index] = true; + new_parm = carry_over_param (m_oparms[prev_index], copy_parm_decls); + m_new_decls.quick_push (new_parm); + } + else if (apm->op == IPA_PARAM_OP_NEW + || apm->op == IPA_PARAM_OP_SPLIT) + { + tree new_type = m_new_types[i]; + gcc_checking_assert (new_type); + new_parm = build_decl (UNKNOWN_LOCATION, PARM_DECL, 
NULL_TREE, + new_type); + const char *prefix = ipa_param_prefixes[apm->param_prefix_index]; + DECL_NAME (new_parm) = create_tmp_var_name (prefix); + DECL_ARTIFICIAL (new_parm) = 1; + DECL_ARG_TYPE (new_parm) = new_type; + DECL_CONTEXT (new_parm) = m_fndecl; + TREE_USED (new_parm) = 1; + DECL_IGNORED_P (new_parm) = 1; + layout_decl (new_parm, 0); + m_new_decls.quick_push (new_parm); + + if (apm->op == IPA_PARAM_OP_SPLIT) { - r.op = IPA_PARM_OP_REMOVE; - adjustments.quick_push (r); + m_split_modifications_p = true; + + if (m_id) + { + tree dummy_decl; + if (!isra_dummy_decls[prev_index]) + { + dummy_decl = copy_decl_to_var (m_oparms[prev_index], + m_id); + /* Any attempt to remap this dummy in this particular + instance of clone materialization should yield + itself. */ + insert_decl_map (m_id, dummy_decl, dummy_decl); + + DECL_CHAIN (dummy_decl) = *vars; + *vars = dummy_decl; + isra_dummy_decls[prev_index] = dummy_decl; + } + else + dummy_decl = isra_dummy_decls[prev_index]; + + register_replacement (apm, new_parm, dummy_decl); + gcc_checking_assert (m_adjustments); + ipa_param_performed_split ps; + ps.dummy_decl = dummy_decl; + ps.unit_offset = apm->unit_offset; + vec_safe_push (m_id->dst_node->clone.performed_splits, ps); + } + else + register_replacement (apm, new_parm); } - continue; - } + } else - { - /* FIXME: Handling of new arguments are not implemented yet. */ - gcc_assert (out->op != IPA_PARM_OP_NEW); - } + gcc_unreachable (); + } - r.base_index = in->base_index; - r.type = out->type; + unsigned op_len = m_oparms.length (); + for (unsigned i = 0; i < op_len; i++) + if (!kept[i]) + { + /* We operate in different modes with and without id when it comes to + converting remaining uses of removed PARM_DECLs (which do not + however use the initial value) to VAR_DECL copies. With id, we rely + on its mapping and create a replacement straight away. Without it, + we have our own mechanism. 
Just don't mix them, that is why you + should not call replace_removed_params_ssa_names or + perform_cfun_body_modifications when you construct with ID not equel + to NULL. */ + if (m_id) + { + if (!m_id->decl_map->get (m_oparms[i])) + { + /* TODO: Perhaps at least aggregate-type params could re-use + their isra_dummy_decl here? */ + tree var = copy_decl_to_var (m_oparms[i], m_id); + insert_decl_map (m_id, m_oparms[i], var); + /* Declare this new variable. */ + DECL_CHAIN (var) = *vars; + *vars = var; + } + } + else + { + m_removed_decls.safe_push (m_oparms[i]); + m_removed_map.put (m_oparms[i], m_removed_decls.length () - 1); + } + } - /* FIXME: Create nonlocal value too. */ + if (!MAY_HAVE_DEBUG_STMTS) + return; - if (in->op == IPA_PARM_OP_COPY && out->op == IPA_PARM_OP_COPY) - r.op = IPA_PARM_OP_COPY; - else if (in->op == IPA_PARM_OP_COPY) - r.offset = out->offset; - else if (out->op == IPA_PARM_OP_COPY) - r.offset = in->offset; - else - r.offset = in->offset + out->offset; - adjustments.quick_push (r); + auto_vec index_mapping; + bool need_remap = false; + + if (m_id && m_id->src_node->clone.param_adjustments) + { + ipa_param_adjustments *prev_adjustments + = m_id->src_node->clone.param_adjustments; + prev_adjustments->get_updated_indices (&index_mapping); + need_remap = true; } - for (i = 0; i < inlen; i++) + /* Do not output debuginfo for parameter declarations as if they vanished + when they were in fact replaced by a constant. */ + if (tree_map) + for (unsigned i = 0; i < tree_map->length (); i++) + { + int parm_num = (*tree_map)[i]->parm_num; + gcc_assert (parm_num >= 0); + if (need_remap) + parm_num = index_mapping[parm_num]; + kept[parm_num] = true; + } + + for (unsigned i = 0; i < op_len; i++) + if (!kept[i] && is_gimple_reg (m_oparms[i])) + m_reset_debug_decls.safe_push (m_oparms[i]); +} + +/* Constructor of ipa_param_body_adjustments performing all necessary + initializations. 
*/ + +ipa_param_body_adjustments +::ipa_param_body_adjustments (vec *adj_params, + tree fndecl) + : m_adj_params (adj_params), m_adjustments (NULL), m_reset_debug_decls (), + m_split_modifications_p (false), m_fndecl (fndecl), m_id (NULL), + m_oparms (), m_new_decls (), m_new_types (), m_replacements (), + m_removed_decls (), m_removed_map (), m_method2func (false) +{ + common_initialization (false, NULL, NULL, NULL); +} + +ipa_param_body_adjustments +::ipa_param_body_adjustments (ipa_param_adjustments *adjustments, + tree fndecl) + : m_adj_params (adjustments->m_adj_params), m_adjustments (adjustments), + m_reset_debug_decls (), m_split_modifications_p (false), m_fndecl (fndecl), + m_id (NULL), m_oparms (), m_new_decls (), m_new_types (), + m_replacements (), m_removed_decls (), m_removed_map (), + m_method2func (false) +{ + common_initialization (false, NULL, NULL, NULL); +} + +ipa_param_body_adjustments +::ipa_param_body_adjustments (ipa_param_adjustments *adjustments, + tree fndecl, tree old_fndecl, + bool copy_parm_decls, copy_body_data *id, + tree *vars, + vec *tree_map) + : m_adj_params (adjustments->m_adj_params), m_adjustments (adjustments), + m_reset_debug_decls (), m_split_modifications_p (false), m_fndecl (fndecl), + m_id (id), m_oparms (), m_new_decls (), m_new_types (), m_replacements (), + m_removed_decls (), m_removed_map (), m_method2func (false) +{ + common_initialization (copy_parm_decls, old_fndecl, vars, tree_map); +} + +/* Chain new param decls up and return them. */ + +tree +ipa_param_body_adjustments::get_new_param_chain () +{ + tree result; + tree *link = &result; + + unsigned len = vec_safe_length (m_adj_params); + for (unsigned i = 0; i < len; i++) { - struct ipa_parm_adjustment *n = &inner[i]; + tree new_decl = m_new_decls[i]; + *link = new_decl; + link = &DECL_CHAIN (new_decl); + } + *link = NULL_TREE; + return result; +} + +/* Modify the function parameters FNDECL and its type according to the plan in + ADJUSTMENTS. 
If ORIG_OLD_DECL is true, the curreent m_fndecl has not + already been adjusted with ipa_param_adjustments::adjust_decl and so + equivalent changes to the DECL will also be made. */ + +void +ipa_param_body_adjustments::modify_formal_parameters () +{ + tree orig_type = TREE_TYPE (m_fndecl); + DECL_ARGUMENTS (m_fndecl) = get_new_param_chain (); - if (n->op == IPA_PARM_OP_REMOVE) - adjustments.quick_push (*n); + /* When signature changes, we need to clear builtin info. */ + if (fndecl_built_in_p (m_fndecl)) + { + DECL_BUILT_IN_CLASS (m_fndecl) = NOT_BUILT_IN; + DECL_FUNCTION_CODE (m_fndecl) = (enum built_in_function) 0; } - tmp.release (); - return adjustments; + /* At this point, removing return value is only implemented when going + through tree_function_versioning, not when modifying function body + directly. */ + gcc_assert (!m_adjustments || !m_adjustments->m_skip_return); + tree new_type = build_adjusted_function_type (orig_type, &m_new_types, + m_method2func, false); + + TREE_TYPE (m_fndecl) = new_type; + DECL_VIRTUAL_P (m_fndecl) = 0; + DECL_LANG_SPECIFIC (m_fndecl) = NULL; + if (m_method2func) + DECL_VINDEX (m_fndecl) = NULL_TREE; +} + +/* Given BASE and UNIT_OFFSET, find the corresponding record among replacement + structures. */ + +ipa_param_body_replacement * +ipa_param_body_adjustments::lookup_replacement_1 (tree base, + unsigned unit_offset) +{ + unsigned int len = m_replacements.length (); + for (unsigned i = 0; i < len; i++) + { + ipa_param_body_replacement *pbr = &m_replacements[i]; + + if (pbr->base == base + && (pbr->unit_offset == unit_offset)) + return pbr; + } + return NULL; +} + +/* Given BASE and UNIT_OFFSET, find the corresponding replacement expression + and return it, assuming it is known it does not hold value by reference or + in reverse storage order. 
*/ + +tree +ipa_param_body_adjustments::lookup_replacement (tree base, unsigned unit_offset) +{ + ipa_param_body_replacement *pbr = lookup_replacement_1 (base, unit_offset); + if (!pbr) + return NULL; + gcc_assert (!pbr->by_ref && !pbr->reverse); + return pbr->repl; } /* If T is an SSA_NAME, return NULL if it is not a default def or @@ -602,165 +1286,649 @@ get_ssa_base_param (tree t, bool ignore_default_def) return t; } -/* Given an expression, return an adjustment entry specifying the - transformation to be done on EXPR. If no suitable adjustment entry - was found, returns NULL. +/* Given an expression, return the structure describing how it should be + replaced if it accesses a part of a split parameter or NULL otherwise. - If IGNORE_DEFAULT_DEF is set, consider SSA_NAMEs which are not a - default def, otherwise bail on them. + Do not free the result, it will be deallocated when the object is destroyed. - If CONVERT is non-NULL, this function will set *CONVERT if the - expression provided is a component reference. ADJUSTMENTS is the - adjustments vector. */ + If IGNORE_DEFAULT_DEF is cleared, consider only SSA_NAMEs of PARM_DECLs + which are default definitions, if set, consider all SSA_NAMEs of + PARM_DECLs. 
*/ -ipa_parm_adjustment * -ipa_get_adjustment_candidate (tree **expr, bool *convert, - ipa_parm_adjustment_vec adjustments, - bool ignore_default_def) +ipa_param_body_replacement * +ipa_param_body_adjustments::get_expr_replacement_1 (tree expr, + bool ignore_default_def) { - if (TREE_CODE (**expr) == BIT_FIELD_REF - || TREE_CODE (**expr) == IMAGPART_EXPR - || TREE_CODE (**expr) == REALPART_EXPR) - { - *expr = &TREE_OPERAND (**expr, 0); - if (convert) - *convert = true; - } + tree base; + unsigned unit_offset; - poly_int64 offset, size, max_size; - bool reverse; - tree base - = get_ref_base_and_extent (**expr, &offset, &size, &max_size, &reverse); - if (!base || !known_size_p (size) || !known_size_p (max_size)) + if (!isra_get_ref_base_and_offset (expr, &base, &unit_offset)) return NULL; - if (TREE_CODE (base) == MEM_REF) - { - offset += mem_ref_offset (base).force_shwi () * BITS_PER_UNIT; - base = TREE_OPERAND (base, 0); - } - base = get_ssa_base_param (base, ignore_default_def); if (!base || TREE_CODE (base) != PARM_DECL) return NULL; + return lookup_replacement_1 (base, unit_offset); +} - struct ipa_parm_adjustment *cand = NULL; - unsigned int len = adjustments.length (); - for (unsigned i = 0; i < len; i++) - { - struct ipa_parm_adjustment *adj = &adjustments[i]; +/* Given an expression, return the structure describing how it should be + replaced just like in the previous function but return directly the + expression assuming that it is known it does not hold value by reference or + in reverse storage order. 
*/ - if (adj->base == base - && (known_eq (adj->offset, offset) || adj->op == IPA_PARM_OP_REMOVE)) - { - cand = adj; - break; - } +tree +ipa_param_body_adjustments::get_expr_replacement (tree expr, + bool ignore_default_def) +{ + ipa_param_body_replacement *pbr = get_expr_replacement_1 (expr, + ignore_default_def); + if (!pbr) + return NULL; + gcc_assert (!pbr->by_ref && !pbr->reverse); + return pbr->repl; +} + +/* Given OLD_DECL, which is a PARM_DECL of a parameter that is being removed + (which includes it being split or replaced), return a new variable that + should be used for any SSA names that will remain in the function that + previously belonged to OLD_DECL. */ + +tree +ipa_param_body_adjustments::get_replacement_ssa_base (tree old_decl) +{ + unsigned *idx = m_removed_map.get (old_decl); + if (!idx) + return NULL; + + tree repl; + if (TREE_CODE (m_removed_decls[*idx]) == PARM_DECL) + { + gcc_assert (m_removed_decls[*idx] == old_decl); + repl = copy_var_decl (old_decl, DECL_NAME (old_decl), + TREE_TYPE (old_decl)); + m_removed_decls[*idx] = repl; } + else + repl = m_removed_decls[*idx]; + return repl; +} + +/* If OLD_NAME, which is being defined by statement STMT, is an SSA_NAME of a + parameter which is to be removed because its value is not used, create a new + SSA_NAME relating to a replacement VAR_DECL, replace all uses of the + original with it and return it. If there is no need to re-map, return NULL. + ADJUSTMENTS is a pointer to a vector of IPA-SRA adjustments. 
*/ + +tree +ipa_param_body_adjustments::replace_removed_params_ssa_names (tree old_name, + gimple *stmt) +{ + gcc_assert (!m_id); + if (TREE_CODE (old_name) != SSA_NAME) + return NULL; + + tree decl = SSA_NAME_VAR (old_name); + if (decl == NULL_TREE + || TREE_CODE (decl) != PARM_DECL) + return NULL; - if (!cand || cand->op == IPA_PARM_OP_COPY || cand->op == IPA_PARM_OP_REMOVE) + tree repl = get_replacement_ssa_base (decl); + if (!repl) return NULL; - return cand; + + tree new_name = make_ssa_name (repl, stmt); + SSA_NAME_OCCURS_IN_ABNORMAL_PHI (new_name) + = SSA_NAME_OCCURS_IN_ABNORMAL_PHI (old_name); + + if (dump_file) + { + fprintf (dump_file, "replacing an SSA name of a removed param "); + print_generic_expr (dump_file, old_name); + fprintf (dump_file, " with "); + print_generic_expr (dump_file, new_name); + fprintf (dump_file, "\n"); + } + + replace_uses_by (old_name, new_name); + return new_name; } -/* If the expression *EXPR should be replaced by a reduction of a parameter, do - so. ADJUSTMENTS is a pointer to a vector of adjustments. CONVERT - specifies whether the function should care about type incompatibility the - current and new expressions. If it is false, the function will leave - incompatibility issues to the caller. Return true iff the expression - was modified. */ +/* If the expression *EXPR_P should be replaced by a reduction of a parameter, + do so. CONVERT specifies whether the function should care about type + incompatibility of the current and new expressions. If it is false, the + function will leave incompatibility issues to the caller, but it will be + overridden if BIT_FIELD_REF, IMAGPART_EXPR or REALPART_EXPR is encountered. + Return true iff the expression was modified. CALL_ARG should be true when + the modification is done as a part of re-mapping a call argument. 
*/ bool -ipa_modify_expr (tree *expr, bool convert, - ipa_parm_adjustment_vec adjustments) +ipa_param_body_adjustments::modify_expr (tree *expr_p, bool convert) { - struct ipa_parm_adjustment *cand - = ipa_get_adjustment_candidate (&expr, &convert, adjustments, false); - if (!cand) + tree expr = *expr_p; + + if (TREE_CODE (expr) == BIT_FIELD_REF + || TREE_CODE (expr) == IMAGPART_EXPR + || TREE_CODE (expr) == REALPART_EXPR) + { + expr_p = &TREE_OPERAND (expr, 0); + expr = *expr_p; + convert = true; + } + + ipa_param_body_replacement *pbr = get_expr_replacement_1 (expr, false); + if (!pbr) return false; - tree src; - if (cand->by_ref) + tree repl; + if (pbr->by_ref) { - src = build_simple_mem_ref (cand->new_decl); - REF_REVERSE_STORAGE_ORDER (src) = cand->reverse; + repl = build_simple_mem_ref (pbr->repl); + REF_REVERSE_STORAGE_ORDER (repl) = pbr->reverse; } else - src = cand->new_decl; + repl = pbr->repl; if (dump_file && (dump_flags & TDF_DETAILS)) { fprintf (dump_file, "About to replace expr "); - print_generic_expr (dump_file, *expr); + print_generic_expr (dump_file, expr); fprintf (dump_file, " with "); - print_generic_expr (dump_file, src); + print_generic_expr (dump_file, repl); fprintf (dump_file, "\n"); } - if (convert && !useless_type_conversion_p (TREE_TYPE (*expr), cand->type)) + if (convert && !useless_type_conversion_p (TREE_TYPE (expr), + TREE_TYPE (repl))) { - tree vce = build1 (VIEW_CONVERT_EXPR, TREE_TYPE (*expr), src); - *expr = vce; + tree vce = build1 (VIEW_CONVERT_EXPR, TREE_TYPE (expr), repl); + *expr_p = vce; } else - *expr = src; + *expr_p = repl; return true; } -/* Dump the adjustments in the vector ADJUSTMENTS to dump_file in a human - friendly way, assuming they are meant to be applied to FNDECL. */ +/* If the statement STMT contains any expressions that need to be replaced with + a different one as noted by ADJUSTMENTS, do so. Handle any potential type + incompatibilities.
If any conversion statements have to be pre-pended to + STMT, they will be added to EXTRA_STMTS. Return true iff the statement was + modified. */ -void -ipa_dump_param_adjustments (FILE *file, ipa_parm_adjustment_vec adjustments, - tree fndecl) +bool +ipa_param_body_adjustments::modify_assignment (gimple *stmt, + gimple_seq *extra_stmts) { - int i, len = adjustments.length (); - bool first = true; - vec parms = ipa_get_vector_of_formal_parms (fndecl); + tree *lhs_p, *rhs_p; + bool any; - fprintf (file, "IPA param adjustments: "); - for (i = 0; i < len; i++) + if (!gimple_assign_single_p (stmt)) + return false; + + rhs_p = gimple_assign_rhs1_ptr (stmt); + lhs_p = gimple_assign_lhs_ptr (stmt); + + any = modify_expr (lhs_p, false); + any |= modify_expr (rhs_p, false); + if (any) { - struct ipa_parm_adjustment *adj; - adj = &adjustments[i]; + tree new_rhs = NULL_TREE; - if (!first) - fprintf (file, " "); - else - first = false; + if (!useless_type_conversion_p (TREE_TYPE (*lhs_p), TREE_TYPE (*rhs_p))) + { + if (TREE_CODE (*rhs_p) == CONSTRUCTOR) + { + /* V_C_Es of constructors can cause trouble (PR 42714). */ + if (is_gimple_reg_type (TREE_TYPE (*lhs_p))) + *rhs_p = build_zero_cst (TREE_TYPE (*lhs_p)); + else + *rhs_p = build_constructor (TREE_TYPE (*lhs_p), + NULL); + } + else + new_rhs = fold_build1_loc (gimple_location (stmt), + VIEW_CONVERT_EXPR, TREE_TYPE (*lhs_p), + *rhs_p); + } + else if (REFERENCE_CLASS_P (*rhs_p) + && is_gimple_reg_type (TREE_TYPE (*lhs_p)) + && !is_gimple_reg (*lhs_p)) + /* This can happen when an assignment in between two single field + structures is turned into an assignment in between two pointers to + scalars (PR 42237).
*/ + new_rhs = *rhs_p; + + if (new_rhs) + { + tree tmp = force_gimple_operand (new_rhs, extra_stmts, true, + NULL_TREE); + gimple_assign_set_rhs1 (stmt, tmp); + } + + return true; + } + + return any; +} + +struct simple_tree_swap_info +{ + tree from, to; + bool done; +}; + +/* Simple remapper to remap a split parameter to a special dummy copy so that + edge redirections can detect transitive redirections. */ + +static tree +remap_split_decl_to_dummy (tree *tp, int *walk_subtrees, void *data) +{ + tree t = *tp; + + if (DECL_P (t) || TREE_CODE (t) == SSA_NAME) + { + struct simple_tree_swap_info *swapinfo + = (struct simple_tree_swap_info *) data; + if (t == swapinfo->from + || (TREE_CODE (t) == SSA_NAME + && SSA_NAME_VAR (t) == swapinfo->from)) + { + *tp = swapinfo->to; + swapinfo->done = true; + } + *walk_subtrees = 0; + } + else if (TYPE_P (t)) + *walk_subtrees = 0; + else + *walk_subtrees = 1; + return NULL_TREE; +} + +bool +ipa_param_body_adjustments::modify_call_stmt (gcall **stmt_p) +{ + gcall *stmt = *stmt_p; + auto_vec pass_through_args; + auto_vec pass_through_bases; - fprintf (file, "%i. 
base_index: %i - ", i, adj->base_index); - print_generic_expr (file, parms[adj->base_index]); - if (adj->base) + if (m_split_modifications_p && m_id) + { + for (unsigned i = 0; i < gimple_call_num_args (stmt); i++) { - fprintf (file, ", base: "); - print_generic_expr (file, adj->base); + tree t = gimple_call_arg (stmt, i); + gcc_assert (TREE_CODE (t) != BIT_FIELD_REF + && TREE_CODE (t) != IMAGPART_EXPR + && TREE_CODE (t) != REALPART_EXPR); + + tree base; + unsigned unit_offset; + if (!isra_get_ref_base_and_offset (t, &base, &unit_offset)) + continue; + + bool by_ref = false; + if (TREE_CODE (base) == SSA_NAME) + { + if (!SSA_NAME_IS_DEFAULT_DEF (base)) + continue; + base = SSA_NAME_VAR (base); + gcc_checking_assert (base); + by_ref = true; + } + if (TREE_CODE (base) != PARM_DECL) + continue; + + bool base_among_replacements = false; + unsigned int repl_list_len = m_replacements.length (); + for (unsigned j = 0; j < repl_list_len; j++) + { + ipa_param_body_replacement *pbr = &m_replacements[j]; + if (pbr->base == base) + { + base_among_replacements = true; + break; + } + } + if (!base_among_replacements) + continue; + + /* We still have to distinguish between an end-use that we have to + transform now and a pass-through, which happens in the following + two cases. */ + + /* TODO: After we adjust ptr_parm_has_nonarg_uses to also consider + &MEM_REF[ssa_name + offset], we will also have to detect that case + here. */ + + if (TREE_CODE (t) == SSA_NAME + && SSA_NAME_IS_DEFAULT_DEF (t) + && SSA_NAME_VAR (t) + && TREE_CODE (SSA_NAME_VAR (t)) == PARM_DECL) + { + /* This must be a by_reference pass-through. */ + gcc_assert (POINTER_TYPE_P (TREE_TYPE (t))); + pass_through_args.safe_push (i); + pass_through_bases.safe_push (base); + } + else if (!by_ref && AGGREGATE_TYPE_P (TREE_TYPE (t))) + { + /* Currently IPA-SRA guarantees the aggregate access type + exactly matches in this case. 
So if it does not match, it is + a pass-through argument that will be sorted out at edge + redirection time. */ + ipa_param_body_replacement *pbr + = lookup_replacement_1 (base, unit_offset); + + if (!pbr + || (TYPE_MAIN_VARIANT (TREE_TYPE (t)) + != TYPE_MAIN_VARIANT (TREE_TYPE (pbr->repl)))) + { + pass_through_args.safe_push (i); + pass_through_bases.safe_push (base); + } + } } - if (adj->new_decl) + } + + unsigned nargs = gimple_call_num_args (stmt); + if (!pass_through_args.is_empty ()) + { + auto_vec vargs; + unsigned pt_idx = 0; + for (unsigned i = 0; i < nargs; i++) { - fprintf (file, ", new_decl: "); - print_generic_expr (file, adj->new_decl); + if (pt_idx < pass_through_args.length () + && i == pass_through_args[pt_idx]) + { + tree base = pass_through_bases[pt_idx]; + pt_idx++; + unsigned j = 0; + while (m_replacements[j].base != base) + j++; + + /* Map Base will get mapped to the special transitive-isra marker + dummy decl. */ + struct simple_tree_swap_info swapinfo; + swapinfo.from = base; + swapinfo.to = m_replacements[j].dummy; + swapinfo.done = false; + tree arg = gimple_call_arg (stmt, i); + walk_tree (&arg, remap_split_decl_to_dummy, &swapinfo, NULL); + gcc_assert (swapinfo.done); + vargs.safe_push (arg); + /* Now let's push all replacements so that all gimple register + ones get correct SSA_NAMES. Edge redirection will weed out + the dummy argument as well as all unused replacements + later. 
*/ + unsigned int repl_list_len = m_replacements.length (); + for (; j < repl_list_len; j++) + { + if (m_replacements[j].base != base) + break; + vargs.safe_push (m_replacements[j].repl); + } + } + else + { + tree t = gimple_call_arg (stmt, i); + modify_expr (&t, true); + vargs.safe_push (t); + } } - if (adj->new_ssa_base) + gcall *new_stmt = gimple_build_call_vec (gimple_call_fndecl (stmt), + vargs); + gimple_call_set_chain (new_stmt, gimple_call_chain (stmt)); + gimple_call_copy_flags (new_stmt, stmt); + if (tree lhs = gimple_call_lhs (stmt)) { - fprintf (file, ", new_ssa_base: "); - print_generic_expr (file, adj->new_ssa_base); + modify_expr (&lhs, false); + gimple_call_set_lhs (new_stmt, lhs); } + *stmt_p = new_stmt; + return true; + } - if (adj->op == IPA_PARM_OP_COPY) - fprintf (file, ", copy_param"); - else if (adj->op == IPA_PARM_OP_REMOVE) - fprintf (file, ", remove_param"); - else + /* Otherwise, no need to rebuild the statement, let's just modify arguments + and the LHS if/as appropriate. 
*/ + bool modified = false; + for (unsigned i = 0; i < nargs; i++) + { + tree *t = gimple_call_arg_ptr (stmt, i); + modified |= modify_expr (t, true); + } + + if (gimple_call_lhs (stmt)) + { + tree *t = gimple_call_lhs_ptr (stmt); + modified |= modify_expr (t, false); + } + + return modified; +} + +bool +ipa_param_body_adjustments::modify_gimple_stmt (gimple **stmt, + gimple_seq *extra_stmts) +{ + bool modified = false; + tree *t; + + switch (gimple_code (*stmt)) + { + case GIMPLE_RETURN: + t = gimple_return_retval_ptr (as_a (*stmt)); + if (m_adjustments && m_adjustments->m_skip_return) + *t = NULL_TREE; + else if (*t != NULL_TREE) + modified |= modify_expr (t, true); + break; + + case GIMPLE_ASSIGN: + modified |= modify_assignment (*stmt, extra_stmts); + break; + + case GIMPLE_CALL: + modified |= modify_call_stmt ((gcall **) stmt); + break; + + case GIMPLE_ASM: + { + gasm *asm_stmt = as_a (*stmt); + for (unsigned i = 0; i < gimple_asm_ninputs (asm_stmt); i++) + { + t = &TREE_VALUE (gimple_asm_input_op (asm_stmt, i)); + modified |= modify_expr (t, true); + } + for (unsigned i = 0; i < gimple_asm_noutputs (asm_stmt); i++) + { + t = &TREE_VALUE (gimple_asm_output_op (asm_stmt, i)); + modified |= modify_expr (t, false); + } + } + break; + + default: + break; + } + return modified; +} + + +/* Traverse body of the current function and perform the requested adjustments. + Return true iff the CFG has been changed. 
*/ + +bool +ipa_param_body_adjustments::modify_cfun_body () +{ + bool cfg_changed = false; + basic_block bb; + + FOR_EACH_BB_FN (bb, cfun) + { + gimple_stmt_iterator gsi; + + for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi); gsi_next (&gsi)) + { + gphi *phi = as_a (gsi_stmt (gsi)); + tree new_lhs, old_lhs = gimple_phi_result (phi); + new_lhs = replace_removed_params_ssa_names (old_lhs, phi); + if (new_lhs) + { + gimple_phi_set_result (phi, new_lhs); + release_ssa_name (old_lhs); + } + } + + gsi = gsi_start_bb (bb); + while (!gsi_end_p (gsi)) { - fprintf (file, ", offset "); - print_dec (adj->offset, file); + gimple *stmt = gsi_stmt (gsi); + gimple *stmt_copy = stmt; + gimple_seq extra_stmts = NULL; + bool modified = modify_gimple_stmt (&stmt, &extra_stmts); + if (stmt != stmt_copy) + { + gcc_checking_assert (modified); + gsi_replace (&gsi, stmt, false); + } + if (!gimple_seq_empty_p (extra_stmts)) + gsi_insert_seq_before (&gsi, extra_stmts, GSI_SAME_STMT); + + def_operand_p defp; + ssa_op_iter iter; + FOR_EACH_SSA_DEF_OPERAND (defp, stmt, iter, SSA_OP_DEF) + { + tree old_def = DEF_FROM_PTR (defp); + if (tree new_def = replace_removed_params_ssa_names (old_def, + stmt)) + { + SET_DEF (defp, new_def); + release_ssa_name (old_def); + modified = true; + } + } + + if (modified) + { + update_stmt (stmt); + if (maybe_clean_eh_stmt (stmt) + && gimple_purge_dead_eh_edges (gimple_bb (stmt))) + cfg_changed = true; + } + gsi_next (&gsi); } - if (adj->by_ref) - fprintf (file, ", by_ref"); - print_node_brief (file, ", type: ", adj->type, 0); - fprintf (file, "\n"); } - parms.release (); + + return cfg_changed; +} + +/* Call gimple_debug_bind_reset_value on all debug statements describing + gimple register parameters that are being removed or replaced. 
*/ + +void +ipa_param_body_adjustments::reset_debug_stmts () +{ + int i, len; + gimple_stmt_iterator *gsip = NULL, gsi; + + if (MAY_HAVE_DEBUG_STMTS && single_succ_p (ENTRY_BLOCK_PTR_FOR_FN (cfun))) + { + gsi = gsi_after_labels (single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun))); + gsip = &gsi; + } + len = m_reset_debug_decls.length (); + for (i = 0; i < len; i++) + { + imm_use_iterator ui; + gimple *stmt; + gdebug *def_temp; + tree name, vexpr, copy = NULL_TREE; + use_operand_p use_p; + tree decl = m_reset_debug_decls[i]; + + gcc_checking_assert (is_gimple_reg (decl)); + name = ssa_default_def (cfun, decl); + vexpr = NULL; + if (name) + FOR_EACH_IMM_USE_STMT (stmt, ui, name) + { + if (gimple_clobber_p (stmt)) + { + gimple_stmt_iterator cgsi = gsi_for_stmt (stmt); + unlink_stmt_vdef (stmt); + gsi_remove (&cgsi, true); + release_defs (stmt); + continue; + } + /* All other users must have been removed by function body + modification. */ + gcc_assert (is_gimple_debug (stmt)); + if (vexpr == NULL && gsip != NULL) + { + vexpr = make_node (DEBUG_EXPR_DECL); + def_temp = gimple_build_debug_source_bind (vexpr, decl, NULL); + DECL_ARTIFICIAL (vexpr) = 1; + TREE_TYPE (vexpr) = TREE_TYPE (name); + SET_DECL_MODE (vexpr, DECL_MODE (decl)); + gsi_insert_before (gsip, def_temp, GSI_SAME_STMT); + } + if (vexpr) + { + FOR_EACH_IMM_USE_ON_STMT (use_p, ui) + SET_USE (use_p, vexpr); + } + else + gimple_debug_bind_reset_value (stmt); + update_stmt (stmt); + } + /* Create a VAR_DECL for debug info purposes. 
*/ + if (!DECL_IGNORED_P (decl)) + { + copy = build_decl (DECL_SOURCE_LOCATION (current_function_decl), + VAR_DECL, DECL_NAME (decl), + TREE_TYPE (decl)); + if (DECL_PT_UID_SET_P (decl)) + SET_DECL_PT_UID (copy, DECL_PT_UID (decl)); + TREE_ADDRESSABLE (copy) = TREE_ADDRESSABLE (decl); + TREE_READONLY (copy) = TREE_READONLY (decl); + TREE_THIS_VOLATILE (copy) = TREE_THIS_VOLATILE (decl); + DECL_GIMPLE_REG_P (copy) = DECL_GIMPLE_REG_P (decl); + DECL_ARTIFICIAL (copy) = DECL_ARTIFICIAL (decl); + DECL_IGNORED_P (copy) = DECL_IGNORED_P (decl); + DECL_ABSTRACT_ORIGIN (copy) = DECL_ORIGIN (decl); + DECL_SEEN_IN_BIND_EXPR_P (copy) = 1; + SET_DECL_RTL (copy, 0); + TREE_USED (copy) = 1; + DECL_CONTEXT (copy) = current_function_decl; + add_local_decl (cfun, copy); + DECL_CHAIN (copy) + = BLOCK_VARS (DECL_INITIAL (current_function_decl)); + BLOCK_VARS (DECL_INITIAL (current_function_decl)) = copy; + } + if (gsip != NULL && copy && target_for_debug_bind (decl)) + { + gcc_assert (TREE_CODE (decl) == PARM_DECL); + if (vexpr) + def_temp = gimple_build_debug_bind (copy, vexpr, NULL); + else + def_temp = gimple_build_debug_source_bind (copy, decl, + NULL); + gsi_insert_before (gsip, def_temp, GSI_SAME_STMT); + } + } +} + +/* Perform all necessary body changes to change signature, body and debug info + of fun according to adjustments passed at construction. Return true if CFG + was changed in any way. */ + +bool +ipa_param_body_adjustments::perform_cfun_body_modifications () +{ + bool cfg_changed; + modify_formal_parameters (); + cfg_changed = modify_cfun_body (); + reset_debug_stmts (); + + return cfg_changed; } diff --git a/gcc/ipa-param-manipulation.h b/gcc/ipa-param-manipulation.h index 84bc42d5196..c14b9376b08 100644 --- a/gcc/ipa-param-manipulation.h +++ b/gcc/ipa-param-manipulation.h @@ -21,100 +21,276 @@ along with GCC; see the file COPYING3. 
If not see #ifndef IPA_PARAM_MANIPULATION_H #define IPA_PARAM_MANIPULATION_H +/* Indices into ipa_param_prefixes to identify a human-readable prefix for newly + synthesized parameters. Keep in sync with the array. */ +#define IPA_PARAM_PREFIX_SYNTH 0 +#define IPA_PARAM_PREFIX_ISRA 1 +#define IPA_PARAM_PREFIX_SIMD 2 +#define IPA_PARAM_PREFIX_MASK 3 + +/* We do not support manipulating functions with more than + 1< *adj_params); + +/* Structure to remember the split performed on a node so that edge redirection + (i.e. splitting arguments of call statements) knows how split formal + parameters of the caller are represented. */ + +struct GTY(()) ipa_param_performed_split +{ + /* The dummy VAR_DECL that was created instead of the split parameter that + sits in the call in the meantime between clone materialization and call + redirection. */ + tree dummy_decl; + /* Offset into the original parameter. */ + unsigned unit_offset; +}; + +/* Class used to record planned modifications to parameters of a function and + also to perform necessary modifications at the caller side at the gimple + level. */ + +class GTY(()) ipa_param_adjustments +{ +public: + /* Constructor from NEW_PARAMS showing what the new parameters should look + like, plus copying any pre-existing actual arguments starting from argument + with index ALWAYS_COPY_START (if non-negative, negative means do not copy + anything beyond what is described in NEW_PARAMS), and SKIP_RETURN, which + indicates that the function should return void after transformation. */ + + /* TODO: OLD_DECL is only necessary to support generating debuginfo for the + old early IPA SRA. Will be removed after transitioning to true + IPA-SRA.
*/ + + ipa_param_adjustments (vec *new_params, + int always_copy_start, bool skip_return, + tree old_decl = NULL) + : m_adj_params (new_params), m_always_copy_start (always_copy_start), + m_skip_return (skip_return), m_old_decl (old_decl) + {} + + gcall *modify_call (gcall *stmt, + vec *performed_splits, + tree callee_decl, bool update_references); + void modify_call (cgraph_edge *cs); + bool first_param_intact_p (); + tree build_new_function_type (tree old_type, bool type_is_original_p); + tree adjust_decl (tree orig_decl); + void get_surviving_params (vec *surviving_params); + void get_updated_indices (vec *new_indices); + + void dump (FILE *f); + void debug (); + + /* What the known part of the arguments should look like. */ + vec *m_adj_params; + + /* If non-negative, copy any arguments starting at this offset. */ + int m_always_copy_start; + /* If true, make the function not return any value. */ + bool m_skip_return; + + /* TODO: The following field is only necessary to support generating + debuginfo for the old early IPA SRA. Will be removed after transitioning + to IPA-SRA. */ + tree m_old_decl; +private: + ipa_param_adjustments () {} + + void init (vec *cur_params); + int get_max_base_index (); + bool method2func_p (tree orig_type); +}; + +/* Structure used to map expressions accessing split or replaced parameters to + new PARM_DECLs. TODO: Even though there will usually be only a few, should + we use a hash? */ + +struct ipa_param_body_replacement +{ + tree base, repl, dummy; + unsigned unit_offset; + bool by_ref; + bool reverse; +}; + +struct ipa_replace_map; + +/* Class used when actually performing adjustments to formal parameters of + a function to map accesses that need to be replaced to replacements.
*/ + +class ipa_param_body_adjustments +{ +public: + ipa_param_body_adjustments (ipa_param_adjustments *adjustments, + tree fndecl, tree old_fndecl, + bool copy_parm_decls, struct copy_body_data *id, + tree *vars, + vec *tree_map); + ipa_param_body_adjustments (ipa_param_adjustments *adjustments, + tree fndecl); + ipa_param_body_adjustments (vec *adj_params, + tree fndecl); + + bool perform_cfun_body_modifications (); + + void modify_formal_parameters (); + void register_replacement (ipa_adjusted_param *apm, tree replacement, + tree dummy = NULL_TREE); + tree lookup_replacement (tree base, unsigned unit_offset); + tree get_expr_replacement (tree expr, bool ignore_default_def); + tree get_replacement_ssa_base (tree old_decl); + bool modify_gimple_stmt (gimple **stmt, gimple_seq *extra_stmts); + tree get_new_param_chain (); + + vec *m_adj_params; + ipa_param_adjustments *m_adjustments; + + /* Vector of old parameter declarations that must have their debug bind + statements re-mapped and debug decls created. */ + + auto_vec m_reset_debug_decls; + + /* Set to true if there are any IPA_PARAM_OP_SPLIT adjustments among stored + adjustments. */ + bool m_split_modifications_p; +private: + tree carry_over_param (tree t, bool copy_parm_decls); + void common_initialization (bool copy_parm_decls, tree old_fndecl, + tree *vars, + vec *tree_map); + unsigned get_base_index (ipa_adjusted_param *apm); + ipa_param_body_replacement *lookup_replacement_1 (tree base, + unsigned unit_offset); + ipa_param_body_replacement *get_expr_replacement_1 (tree expr, + bool ignore_default_def); + tree replace_removed_params_ssa_names (tree old_name, gimple *stmt); + bool modify_expr (tree *expr_p, bool convert); + bool modify_assignment (gimple *stmt, gimple_seq *extra_stmts); + bool modify_call_stmt (gcall **stmt_p); + bool modify_cfun_body (); + void reset_debug_stmts (); + + + /* Declaration of the function that is being transformed. 
*/ + + tree m_fndecl; + + /* If non-NULL, the tree-inline master data structure guiding materialization + of the current clone. */ + struct copy_body_data *m_id; + + /* Vector of old parameter declarations (before changing them). */ + + auto_vec m_oparms; + + /* Vector of parameter declarations the function will have after + transformation. */ + + auto_vec m_new_decls; + + /* If the function type has non-NULL TYPE_ARG_TYPES, this is the vector of + these types after transformation, otherwise an empty one. */ + + auto_vec m_new_types; + + /* Vector of structures telling how to replace old parameters in the + function body. */ + + auto_vec m_replacements; + + /* Vector for remapping SSA_BASES from old parameter declarations that are + being removed as a part of the transformation. Before a new VAR_DECL is + created, it holds the old PARM_DECL; once the variable is built, it is + stored here. */ + + auto_vec m_removed_decls; + + /* Hash to quickly look up the item in m_removed_decls given the old decl. */ + + hash_map m_removed_map; + + /* True iff the transformed function is a class method that is about to lose
*/ + + bool m_method2func; }; -typedef vec ipa_parm_adjustment_vec; - -vec ipa_get_vector_of_formal_parms (tree fndecl); -vec ipa_get_vector_of_formal_parm_types (tree fntype); -void ipa_modify_formal_parameters (tree fndecl, ipa_parm_adjustment_vec); -void ipa_modify_call_arguments (struct cgraph_edge *, gcall *, - ipa_parm_adjustment_vec); -ipa_parm_adjustment_vec ipa_combine_adjustments (ipa_parm_adjustment_vec, - ipa_parm_adjustment_vec); -void ipa_dump_param_adjustments (FILE *, ipa_parm_adjustment_vec, tree); - -bool ipa_modify_expr (tree *, bool, ipa_parm_adjustment_vec); -ipa_parm_adjustment *ipa_get_adjustment_candidate (tree **, bool *, - ipa_parm_adjustment_vec, - bool); +void ipa_fill_vector_with_formal_parms (vec *args, tree fndecl); +void ipa_fill_vector_with_formal_parm_types (vec *types, tree fntype); #endif /* IPA_PARAM_MANIPULATION_H */ diff --git a/gcc/ipa-prop.c b/gcc/ipa-prop.c index 05e666e0588..309bdcf90d4 100644 --- a/gcc/ipa-prop.c +++ b/gcc/ipa-prop.c @@ -4825,31 +4825,24 @@ adjust_agg_replacement_values (struct cgraph_node *node, struct ipa_agg_replacement_value *aggval) { struct ipa_agg_replacement_value *v; - int i, c = 0, d = 0, *adj; - if (!node->clone.combined_args_to_skip) + if (!node->clone.param_adjustments) return; + auto_vec new_indices; + node->clone.param_adjustments->get_updated_indices (&new_indices); for (v = aggval; v; v = v->next) { - gcc_assert (v->index >= 0); - if (c < v->index) - c = v->index; - } - c++; - - adj = XALLOCAVEC (int, c); - for (i = 0; i < c; i++) - if (bitmap_bit_p (node->clone.combined_args_to_skip, i)) - { - adj[i] = -1; - d++; - } - else - adj[i] = i - d; + gcc_checking_assert (v->index >= 0); - for (v = aggval; v; v = v->next) - v->index = adj[v->index]; + if ((unsigned) v->index < new_indices.length ()) + v->index = new_indices[v->index]; + else + /* This can happen if we know about a constant passed by reference by + an argument which is never actually used for anything, let alone + loading that 
constant. */ + v->index = -1; + } } /* Dominator walker driving the ipcp modification phase. */ @@ -4973,24 +4966,41 @@ ipcp_modif_dom_walker::before_dom_children (basic_block bb) static void ipcp_update_bits (struct cgraph_node *node) { - tree parm = DECL_ARGUMENTS (node->decl); - tree next_parm = parm; ipcp_transformation *ts = ipcp_get_transformation_summary (node); if (!ts || vec_safe_length (ts->bits) == 0) return; - vec &bits = *ts->bits; unsigned count = bits.length (); + if (!count) + return; - for (unsigned i = 0; i < count; ++i, parm = next_parm) + auto_vec new_indices; + bool need_remapping = false; + if (node->clone.param_adjustments) { - if (node->clone.combined_args_to_skip - && bitmap_bit_p (node->clone.combined_args_to_skip, i)) - continue; + node->clone.param_adjustments->get_updated_indices (&new_indices); + need_remapping = true; + } + auto_vec parm_decls; + ipa_fill_vector_with_formal_parms (&parm_decls, node->decl); + for (unsigned i = 0; i < count; ++i) + { + tree parm; + if (need_remapping) + { + if (i >= new_indices.length ()) + continue; + int idx = new_indices[i]; + if (idx < 0) + continue; + parm = parm_decls[idx]; + } + else + parm = parm_decls[i]; gcc_checking_assert (parm); - next_parm = DECL_CHAIN (parm); + if (!bits[i] || !(INTEGRAL_TYPE_P (TREE_TYPE (parm)) @@ -5065,22 +5075,42 @@ ipcp_update_bits (struct cgraph_node *node) static void ipcp_update_vr (struct cgraph_node *node) { - tree fndecl = node->decl; - tree parm = DECL_ARGUMENTS (fndecl); - tree next_parm = parm; ipcp_transformation *ts = ipcp_get_transformation_summary (node); if (!ts || vec_safe_length (ts->m_vr) == 0) return; const vec &vr = *ts->m_vr; unsigned count = vr.length (); + if (!count) + return; - for (unsigned i = 0; i < count; ++i, parm = next_parm) + auto_vec new_indices; + bool need_remapping = false; + if (node->clone.param_adjustments) { - if (node->clone.combined_args_to_skip - && bitmap_bit_p (node->clone.combined_args_to_skip, i)) - continue; + 
node->clone.param_adjustments->get_updated_indices (&new_indices); + need_remapping = true; + } + auto_vec parm_decls; + ipa_fill_vector_with_formal_parms (&parm_decls, node->decl); + + for (unsigned i = 0; i < count; ++i) + { + tree parm; + int remapped_idx; + if (need_remapping) + { + if (i >= new_indices.length ()) + continue; + remapped_idx = new_indices[i]; + if (remapped_idx < 0) + continue; + } + else + remapped_idx = i; + + parm = parm_decls[remapped_idx]; + gcc_checking_assert (parm); - next_parm = DECL_CHAIN (parm); tree ddef = ssa_default_def (DECL_STRUCT_FUNCTION (node->decl), parm); if (!ddef || !is_gimple_reg (parm)) @@ -5095,7 +5125,8 @@ ipcp_update_vr (struct cgraph_node *node) { if (dump_file) { - fprintf (dump_file, "Setting value range of param %u ", i); + fprintf (dump_file, "Setting value range of param %u " + "(now %i) ", i, remapped_idx); fprintf (dump_file, "%s[", (vr[i].type == VR_ANTI_RANGE) ? "~" : ""); print_decs (vr[i].min, dump_file); diff --git a/gcc/ipa-split.c b/gcc/ipa-split.c index 38f5bcf00a6..699ade00f89 100644 --- a/gcc/ipa-split.c +++ b/gcc/ipa-split.c @@ -1322,13 +1322,38 @@ split_function (basic_block return_bb, struct split_point *split_point, } } + ipa_param_adjustments *adjustments; + bool skip_return = (!split_part_return_p + || !split_point->split_part_set_retval); + /* TODO: Perhaps get rid of args_to_skip entirely, after we make sure the + debug info generation and discrepancy avoiding works well too. 
*/ + if ((args_to_skip && !bitmap_empty_p (args_to_skip)) + || skip_return) + { + vec *new_params = NULL; + unsigned j; + for (parm = DECL_ARGUMENTS (current_function_decl), j = 0; + parm; parm = DECL_CHAIN (parm), j++) + if (!args_to_skip || !bitmap_bit_p (args_to_skip, j)) + { + ipa_adjusted_param adj; + memset (&adj, 0, sizeof (adj)); + adj.op = IPA_PARAM_OP_COPY; + adj.base_index = j; + adj.prev_clone_index = j; + vec_safe_push (new_params, adj); + } + adjustments = new ipa_param_adjustments (new_params, j, skip_return); + } + else + adjustments = NULL; + /* Now create the actual clone. */ cgraph_edge::rebuild_edges (); node = cur_node->create_version_clone_with_body - (vNULL, NULL, args_to_skip, - !split_part_return_p || !split_point->split_part_set_retval, + (vNULL, NULL, adjustments, split_point->split_bbs, split_point->entry_bb, "part"); - + delete adjustments; node->split_part = true; if (cur_node->same_comdat_group) diff --git a/gcc/ipa-sra.c b/gcc/ipa-sra.c new file mode 100644 index 00000000000..53c1623b4bf --- /dev/null +++ b/gcc/ipa-sra.c @@ -0,0 +1,3823 @@ +/* Interprocedural scalar replacement of aggregates + Copyright (C) 2005-2018 Free Software Foundation, Inc. + + Contributed by Martin Jambor + +This file is part of GCC. + +GCC is free software; you can redistribute it and/or modify it under +the terms of the GNU General Public License as published by the Free +Software Foundation; either version 3, or (at your option) any later +version. + +GCC is distributed in the hope that it will be useful, but WITHOUT ANY +WARRANTY; without even the implied warranty of MERCHANTABILITY or +FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License +for more details. + +You should have received a copy of the GNU General Public License +along with GCC; see the file COPYING3. If not see +. 
*/ + +#include "config.h" +#include "system.h" +#include "coretypes.h" +#include "backend.h" +#include "target.h" +#include "rtl.h" +#include "tree.h" +#include "gimple.h" +#include "predict.h" +#include "alloc-pool.h" +#include "tree-pass.h" +#include "ssa.h" +#include "cgraph.h" +#include "print-tree.h" +#include "gimple-pretty-print.h" +#include "alias.h" +#include "fold-const.h" +#include "tree-eh.h" +#include "stor-layout.h" +#include "gimplify.h" +#include "gimple-iterator.h" +#include "gimplify-me.h" +#include "gimple-walk.h" +#include "tree-cfg.h" +#include "tree-dfa.h" +#include "tree-ssa.h" +#include "tree-sra.h" +#include "symbol-summary.h" +#include "ipa-prop.h" +#include "params.h" +#include "dbgcnt.h" +#include "ipa-fnsummary.h" +#include "tree-inline.h" +#include "ipa-inline.h" +#include "ipa-utils.h" +#include "builtins.h" +#include "cfganal.h" +#include "errors.h" +#include "tree-streamer.h" + + +/* Number of bits used to track the size of an aggregate in bytes + interprocedurally. */ +#define ISRA_ARG_SIZE_LIMIT 16 + +/* Structure describing accesses to a specific portion of an aggregate + parameter, as given by the offset and size. Any smaller accesses that occur + within a function that fall within another access form a tree. The pass + cannot analyze parameters with only partially overlapping accesses. */ + +struct GTY(()) param_access +{ + /* Type that a potential replacement should have. This field only has + meaning in the summary building and transformation phases, when it is + reconstructed from the body. Must not be touched in the IPA analysis + stage. */ + tree type; + + /* Alias reference type to be used in MEM_REFs when adjusting caller + arguments. */ + tree alias_ptr_type; + + /* Values returned by get_ref_base_and_extent but converted to bytes and + stored as unsigned ints.
*/ + unsigned unit_offset; + unsigned unit_size; + + /* Set once we are sure that the access will really end up in a potentially + transformed function - initially not set for function arguments. */ + unsigned definitive : 1; + /* Set when initially we have allowed overlaps for this access and so it + eventually needs checking for overlaps. */ + /* !!! Use for testing only, otherwise not worth having, let's simply check + all final splits. */ + unsigned check_overlaps : 1; +}; + +/* This structure has the same purpose as the one above and additionally it + contains some fields that are only necessary in the summary generation + phase. */ + +struct gensum_param_access +{ + /* Values returned by get_ref_base_and_extent. */ + HOST_WIDE_INT offset; + HOST_WIDE_INT size; + + /* If this access has any children (in terms of the definition above), this + points to the first one. */ + struct gensum_param_access *first_child; + + /* In intraprocedural SRA, pointer to the next sibling in the access tree as + described above. */ + struct gensum_param_access *next_sibling; + + /* Type that a potential replacement should have. This field only has + meaning in the summary building and transformation phases, when it is + reconstructed from the body. Must not be touched in the IPA analysis + stage. */ + tree type; + + /* Alias reference type to be used in MEM_REFs when adjusting caller + arguments. */ + tree alias_ptr_type; + + /* Have there been writes to or reads from this exact location except for as + arguments to a function call that can be tracked. */ + bool nonarg; +}; + +/* Summary describing a parameter in the IPA stages. */ + +/* !!! TODO: Probably remove the m_prefixes here. */ +struct GTY(()) isra_param_desc +{ + /* List of access representatives to the parameters, sorted according to + their offset. */ + vec *m_accesses; + + /* If the below is non-zero, this is the number of uses as actual + arguments. */ + int m_call_uses; /* !!! Unnecessary?
*/ + + /* How many of the call uses are passed to nodes in other SCC components. */ + int m_scc_uses; + + /* Number of times this parameter has been directly passed to */ + unsigned ptr_pt_count; /* !!! Probably entirely unnecessary */ + + /* Unit size limit of total size of all replacements. */ + unsigned m_param_size_limit : ISRA_ARG_SIZE_LIMIT; + /* Sum of unit sizes of all definitive replacements. */ + unsigned m_size_reached : ISRA_ARG_SIZE_LIMIT; + + /* A parameter that is used only in call arguments and can be removed if all + concerned actual arguments are removed. */ + unsigned m_locally_unused : 1; + /* An aggregate that is a candidate for breaking up or complete removal. */ + unsigned m_split_candidate : 1; + /* Is this a parameter passing stuff by reference? */ + unsigned m_by_ref : 1; +}; + +/* Structure used when generating summaries that describes a parameter. */ + +struct gensum_param_desc +{ + /* Roots of param_accesses. */ + gensum_param_access *x_accesses; /* !!! x_ */ + + /* Number of */ + unsigned m_access_count; + + /* If the below is non-zero, this is the number of uses as actual + arguments. */ + int m_call_uses; + + /* Number of times this parameter has been directly passed to */ + unsigned ptr_pt_count; + + /* Size limit of total size of all replacements. */ + unsigned param_size_limit; /* !!? m_ ? */ + /* Sum of sizes of nonarg accesses. */ + unsigned nonarg_acc_size; /* !!? m_ ? */ + + /* A parameter that is used only in call arguments and can be removed if all + concerned actual arguments are removed. */ + bool m_locally_unused; + /* An aggregate that is a candidate for breaking up or complete removal. */ + bool m_split_candidate; + /* Is this a parameter passing stuff by reference? */ + bool m_by_ref; + + /* The number of this parameter as the parameters are ordered in the function + decl. */ + int m_param_number; + + /* For parameters passing data by reference, this is the parameter index used + to compute indices into bb_dereferences.
*/ + int m_deref_index; +}; + +/* Properly deallocate m_accesses of DESC. */ + +static void +free_param_decl_accesses (isra_param_desc *desc) +{ + /* !!? Now that desc is in GGC, perhaps we should leave at least the + deallocation of m_accesses elements to the GC. Let's do so after + testing. */ + unsigned len = vec_safe_length (desc->m_accesses); + for (unsigned i = 0; i < len; ++i) + ggc_free ((*desc->m_accesses)[i]); + vec_free (desc->m_accesses); +} + +/* Class used to convey information about functions from the + intra-procedural analysis stage to the inter-procedural one. */ + +class GTY((for_user)) isra_func_summary +{ +public: + /* Initialize the object. */ + + isra_func_summary () + : m_parameters (NULL), m_candidate (false), m_returns_value (false), + m_return_ignored (false), m_queued (false) + {} + + /* Destroy m_parameters. */ + + ~isra_func_summary (); + + /* Mark the function as not a candidate for any IPA-SRA transformation. + Return true if it was a candidate until now. */ + + bool zap (); + + /* Vector of parameter descriptors corresponding to the function being + analyzed. */ + vec *m_parameters; + + /* Whether the node is even a candidate for any IPA-SRA transformation at + all. */ + unsigned m_candidate : 1; + + /* Whether the original function returns any value. */ + unsigned m_returns_value : 1; + + /* Set to true if all call statements do not actually use the returned + value. */ + + unsigned m_return_ignored : 1; + + /* Whether the node is already queued in the IPA-SRA stack during processing + of call graph SCCs. */ + + unsigned m_queued : 1; +}; + +/* Perhaps remove because everything is in GC anyway, but let's keep it around + for testing since we have it. */ + +isra_func_summary::~isra_func_summary () +{ + unsigned len = vec_safe_length (m_parameters); + for (unsigned i = 0; i < len; ++i) + free_param_decl_accesses (&(*m_parameters)[i]); + vec_free (m_parameters); +} + + +/* Mark the function as not a candidate for any IPA-SRA transformation.
Return
+   true if it was a candidate until now.  */
+
+bool isra_func_summary::zap ()
+{
+  bool ret = m_candidate;
+  m_candidate = false;
+  vec_free (m_parameters);
+  return ret;
+}
+
+#define IPA_SRA_MAX_PARAM_FLOW_LEN 7
+
+/* Structure to describe which formal parameters feed into a particular actual
+   argument.  */
+
+/* !!! this can easily be turned into a more compact representation.  */
+struct isra_param_flow
+{
+  /* Number of elements in array inputs that contain valid data.  */
+  char length;
+  /* Indices of formal parameters that feed into the described actual
+     argument.  */
+  char inputs[IPA_SRA_MAX_PARAM_FLOW_LEN];
+
+  /* True when the value of this actual copy is a portion of a formal
+     parameter.  */
+  unsigned aggregate_pass_through : 1; /* !!? bad name? Also, active! */
+  /* True when the value of this actual copy is a verbatim pass through of an
+     obtained pointer.  */
+  unsigned pointer_pass_through : 1; /* !!? bad name? Also, active! */
+  /* True when it is safe to copy access candidates here from the callee, which
+     would mean introducing dereferences into callers of the caller.  */
+  unsigned safe_to_import_accesses : 1;
+
+  /* The number of the formal parameter that is passed through.  */
+  unsigned param_number;
+  /* Offset within the formal parameter.  */
+  unsigned unit_offset;
+  /* Size of the portion of the formal parameter.  */
+  unsigned unit_size;
+};
+
+/* Structure used to convey information about calls from the intra-procedural
+   analysis stage to the inter-procedural one.  */
+
+class isra_call_summary
+{
+public:
+  isra_call_summary ()
+    : m_inputs (), m_return_ignored (false), m_return_returned (false),
+      m_bit_aligned_arg (false)
+  {}
+
+  void init_inputs (unsigned arg_count);
+  void dump (FILE *f);
+
+  /* Information about what formal parameters of the caller are used to compute
+     individual actual arguments of this call.  */
+
+  auto_vec <isra_param_flow> m_inputs; /* !!! Rename to arg_flow or sth similar?
*/
+
+  /* Set to true if the call statement does not have a LHS.  */
+  unsigned m_return_ignored : 1;
+
+  /* Set to true if the LHS of call statement is only used to construct the
+     return value of the caller.  */
+  unsigned m_return_returned : 1;
+
+  /* Set when any of the call arguments are not byte-aligned.  */
+  unsigned m_bit_aligned_arg : 1;
+};
+
+/* Class to manage function summaries.  */
+
+class GTY((user)) ipa_sra_function_summaries
+  : public function_summary <isra_func_summary *>
+{
+public:
+  ipa_sra_function_summaries (symbol_table *table, bool ggc):
+    function_summary <isra_func_summary *> (table, ggc) { }
+
+  /* Hook that is called by summary when a node is duplicated.  */
+  virtual void duplicate (cgraph_node *,
+			  cgraph_node *,
+			  isra_func_summary *old_sum,
+			  isra_func_summary *new_sum)
+  {
+    /* TODO: Somehow stop copying when ISRA is doing the cloning, it is
+       useless.  */
+    new_sum->m_candidate = old_sum->m_candidate;
+    new_sum->m_returns_value = old_sum->m_returns_value;
+    new_sum->m_return_ignored = old_sum->m_return_ignored;
+    gcc_assert (!old_sum->m_queued);
+    new_sum->m_queued = false;
+
+    unsigned param_count = vec_safe_length (old_sum->m_parameters);
+    if (!param_count)
+      return;
+    vec_safe_reserve_exact (new_sum->m_parameters, param_count);
+    new_sum->m_parameters->quick_grow_cleared (param_count);
+    for (unsigned i = 0; i < param_count; i++)
+      {
+	isra_param_desc *s = &(*old_sum->m_parameters)[i];
+	isra_param_desc *d = &(*new_sum->m_parameters)[i];
+
+	d->m_call_uses = s->m_call_uses;
+	d->m_scc_uses = s->m_scc_uses;
+	d->ptr_pt_count = s->ptr_pt_count;
+	d->m_param_size_limit = s->m_param_size_limit;
+	d->m_size_reached = s->m_size_reached;
+	d->m_locally_unused = s->m_locally_unused;
+	d->m_split_candidate = s->m_split_candidate;
+	d->m_by_ref = s->m_by_ref;
+
+	unsigned acc_count = vec_safe_length (s->m_accesses);
+	vec_safe_reserve_exact (d->m_accesses, acc_count);
+	for (unsigned j = 0; j < acc_count; j++)
+	  {
+	    param_access *from = (*s->m_accesses)[j];
+	    param_access *to = ggc_cleared_alloc <param_access> ();
+	    to->type = from->type;
+	    to->alias_ptr_type = from->alias_ptr_type;
+	    to->unit_offset = from->unit_offset;
+	    to->unit_size = from->unit_size;
+	    to->definitive = from->definitive;
+	    to->check_overlaps = from->check_overlaps;
+	    d->m_accesses->quick_push (to);
+	  }
+      }
+  }
+};
+
+static GTY(()) ipa_sra_function_summaries *func_sums;
+
+/* Class to manage call summaries.  */
+
+class ipa_sra_call_summaries: public call_summary <isra_call_summary *>
+{
+public:
+  ipa_sra_call_summaries (symbol_table *table):
+    call_summary <isra_call_summary *> (table) { }
+};
+
+static ipa_sra_call_summaries *call_sums;
+
+
+/* Initialize m_inputs of a particular instance of isra_call_summary.
+   ARG_COUNT is the number of actual arguments passed.  */
+
+void
+isra_call_summary::init_inputs (unsigned arg_count)
+{
+  if (arg_count == 0)
+    {
+      gcc_checking_assert (m_inputs.length () == 0);
+      return;
+    }
+  if (m_inputs.length () == 0)
+    {
+      m_inputs.reserve_exact (arg_count);
+      m_inputs.quick_grow_cleared (arg_count);
+    }
+  else
+    gcc_checking_assert (arg_count == m_inputs.length ());
+}
+
+/* Dump all information in call summary to F.
*/
+
+void
+isra_call_summary::dump (FILE *f)
+{
+  if (m_return_ignored)
+    fprintf (f, "    return value ignored\n");
+  if (m_return_returned)
+    fprintf (f, "    return value used only to compute caller return value\n");
+  for (unsigned i = 0; i < m_inputs.length (); i++)
+    {
+      fprintf (f, "    Parameter %u:\n", i);
+      isra_param_flow *ipf = &m_inputs[i];
+
+      if (ipf->length)
+	{
+	  bool first = true;
+	  fprintf (f, "      Scalar param sources: ");
+	  for (int j = 0; j < ipf->length; j++)
+	    {
+	      if (!first)
+		fprintf (f, ", ");
+	      else
+		first = false;
+	      fprintf (f, "%i", (int) ipf->inputs[j]);
+	    }
+	  fprintf (f, "\n");
+	}
+      if (ipf->aggregate_pass_through)
+	fprintf (f, "      Aggregate pass through from param %u, "
+		 "unit offset: %u, unit size: %u\n",
+		 ipf->param_number, ipf->unit_offset, ipf->unit_size);
+      if (ipf->pointer_pass_through)
+	fprintf (f, "      Pointer pass through from param %u, "
+		 "safe_to_import_accesses: %u\n",
+		 ipf->param_number, ipf->safe_to_import_accesses);
+    }
+}
+
+/* With all GTY stuff done, we can move to the anonymous namespace.  */
+namespace {
+
+/* Return false if the function is apparently unsuitable for IPA-SRA based on
+   its attributes, return true otherwise.  NODE is the cgraph node of the
+   current function.
*/
+
+static bool
+ipa_sra_preliminary_function_checks (cgraph_node *node)
+{
+  if (!node->local.can_change_signature)
+    {
+      if (dump_file)
+	fprintf (dump_file, "Function can not change signature.\n");
+      return false;
+    }
+
+  if (!tree_versionable_function_p (node->decl))
+    {
+      if (dump_file)
+	fprintf (dump_file, "Function is not versionable.\n");
+      return false;
+    }
+
+  if (!opt_for_fn (node->decl, optimize)
+      || !opt_for_fn (node->decl, flag_ipa_sra))
+    {
+      if (dump_file)
+	fprintf (dump_file, "Not optimizing or IPA-SRA turned off for this "
+		 "function.\n");
+      return false;
+    }
+
+  if (DECL_VIRTUAL_P (node->decl))
+    {
+      if (dump_file)
+	fprintf (dump_file, "Function is a virtual method.\n");
+      return false;
+    }
+
+  struct function *fun = DECL_STRUCT_FUNCTION (node->decl);
+  if (fun->stdarg)
+    {
+      if (dump_file)
+	fprintf (dump_file, "Function uses stdarg.\n");
+      return false;
+    }
+
+  if (TYPE_ATTRIBUTES (TREE_TYPE (node->decl)))
+    {
+      if (dump_file)
+	fprintf (dump_file, "Function type has attributes.\n");
+      return false;
+    }
+
+  if (DECL_DISREGARD_INLINE_LIMITS (node->decl))
+    {
+      if (dump_file)
+	fprintf (dump_file, "Always inline function will be inlined "
+		 "anyway.\n");
+      return false;
+    }
+
+  return true;
+}
+
+/* Quick mapping from a decl to its param descriptor.  */
+/* TODO: Make local? */
+
+static hash_map<tree, gensum_param_desc *> *decl2desc;
+
+/* Countdown of allowed alias analysis steps during summary building.  */
+
+static int aa_walking_limit;
+
+/* This is a table in which for each basic block and parameter there is a
+   distance (offset + size) in that parameter which is dereferenced and
+   accessed in that BB.  */
+static HOST_WIDE_INT *bb_dereferences = NULL;
+/* How many by-reference parameters there are in the current function.  */
+static int by_ref_count;
+
+/* Bitmap of BBs that can cause the function to "stop" progressing by
+   returning, throwing externally, looping infinitely or calling a function
+   which might abort etc.
*/ +static bitmap final_bbs; + +/* Print access tree starting at ACCESS to F. */ + +static void +dump_gensum_access (FILE *f, gensum_param_access *access, unsigned indent) +{ + fprintf (f, " "); + for (unsigned i = 0; i < indent; i++) + fprintf (f, " "); + fprintf (f, " * Access to offset: " HOST_WIDE_INT_PRINT_DEC, + access->offset); + fprintf (f, ", size: " HOST_WIDE_INT_PRINT_DEC, access->size); + fprintf (f, ", type: "); + print_generic_expr (f, access->type); + fprintf (f, ", alias_ptr_type: "); + print_generic_expr (f, access->alias_ptr_type); + fprintf (f, ", nonarg: %u\n", access->nonarg); + for (gensum_param_access *ch = access->first_child; + ch; + ch = ch->next_sibling) + dump_gensum_access (f, ch, indent + 2); +} + + +/* Print access tree starting at ACCESS to F. */ + +static void +dump_isra_access (FILE *f, param_access *access) +{ + fprintf (f, " * Access to unit offset: %u", access->unit_offset); + fprintf (f, ", unit size: %u", access->unit_size); + fprintf (f, ", type: "); + print_generic_expr (f, access->type); + fprintf (f, ", alias_ptr_type: "); + print_generic_expr (f, access->alias_ptr_type); + fprintf (f, ", definitive: %u, check_overlaps: %u\n", access->definitive, + access->check_overlaps); +} + +DEBUG_FUNCTION void +debug_isra_access (param_access *access) +{ + dump_isra_access (stderr, access); +} + +/* Dump DESC to F. */ + +static void +dump_gensum_param_descriptor (FILE *f, gensum_param_desc *desc) +{ + if (desc->m_locally_unused) + { + fprintf (f, " unused with %i call_uses\n", desc->m_call_uses); + } + if (!desc->m_split_candidate) + { + fprintf (f, " not a candidate\n"); + return; + } + if (desc->m_by_ref) + fprintf (f, " by_ref with %u pass throughs\n", desc->ptr_pt_count); + + for (gensum_param_access *acc = desc->x_accesses; + acc; + acc = acc->next_sibling) + dump_gensum_access (f, acc, 2); +} + +/* Dump all parameter descriptors in IFS, assuming it describes FNDECl, to + F. 
*/
+
+static void
+dump_gensum_param_descriptors (FILE *f, tree fndecl,
+			       vec<gensum_param_desc> *param_descriptions)
+{
+  tree parm = DECL_ARGUMENTS (fndecl);
+  for (unsigned i = 0;
+       i < param_descriptions->length ();
+       ++i, parm = DECL_CHAIN (parm))
+    {
+      fprintf (f, "    Descriptor for parameter %i ", i);
+      print_generic_expr (f, parm, TDF_UID);
+      fprintf (f, "\n");
+      dump_gensum_param_descriptor (f, &(*param_descriptions)[i]);
+    }
+}
+
+
+/* Dump DESC to F.  */
+
+static void
+dump_isra_param_descriptor (FILE *f, isra_param_desc *desc)
+{
+  if (desc->m_locally_unused)
+    {
+      fprintf (f, "    unused with %i call_uses\n", desc->m_call_uses);
+    }
+  if (!desc->m_split_candidate)
+    {
+      fprintf (f, "    not a candidate\n");
+      return;
+    }
+  fprintf (f, "    param_size_limit: %u, size_reached: %u\n",
+	   desc->m_param_size_limit, desc->m_size_reached);
+  if (desc->m_by_ref)
+    fprintf (f, "    by_ref with %u pass throughs\n", desc->ptr_pt_count);
+
+  for (unsigned i = 0; i < vec_safe_length (desc->m_accesses); ++i)
+    {
+      param_access *access = (*desc->m_accesses)[i];
+      dump_isra_access (f, access);
+    }
+}
+
+/* Dump all parameter descriptors in IFS, assuming it describes FNDECL, to
+   F.  */
+
+static void
+dump_isra_param_descriptors (FILE *f, tree fndecl,
+			     isra_func_summary *ifs)
+{
+  tree parm = DECL_ARGUMENTS (fndecl);
+  if (!ifs->m_parameters)
+    {
+      fprintf (f, "    parameter descriptors not available\n");
+      return;
+    }
+
+  for (unsigned i = 0;
+       i < ifs->m_parameters->length ();
+       ++i, parm = DECL_CHAIN (parm))
+    {
+      fprintf (f, "    Descriptor for parameter %i ", i);
+      print_generic_expr (f, parm, TDF_UID);
+      fprintf (f, "\n");
+      dump_isra_param_descriptor (f, &(*ifs->m_parameters)[i]);
+    }
+}
+
+/* Add SRC to PARAM_FLOW, unless it is already there.  Return false if it
+   would exceed the limit or would not fit in a char, otherwise return true.
*/
+
+static bool
+add_src_to_param_flow (isra_param_flow *param_flow, int src)
+{
+  gcc_checking_assert (src >= 0);
+  if (src > CHAR_MAX
+      || param_flow->length == IPA_SRA_MAX_PARAM_FLOW_LEN)
+    return false;
+
+  param_flow->inputs[(int) param_flow->length] = src;
+  param_flow->length++;
+  return true;
+}
+
+/* Inspect all uses of NAME and simple arithmetic calculations involving NAME
+   in NODE and return a negative number if any of them is used for anything
+   other than an actual call argument, a simple arithmetic operation or a
+   debug statement.  If there are no such uses, return the number of actual
+   arguments that this parameter eventually feeds to (or zero if there is
+   none).  For each such argument, mark PARM_NUM as one of its sources.
+   ANALYZED is a bitmap that tracks which SSA names we have already started
+   investigating.  */
+
+static int
+isra_track_scalar_value_uses (cgraph_node *node, tree name, int parm_num,
+			      bitmap analyzed)
+{
+  int res = 0;
+  imm_use_iterator imm_iter;
+  gimple *stmt;
+
+  FOR_EACH_IMM_USE_STMT (stmt, imm_iter, name)
+    {
+      if (is_gimple_debug (stmt))
+	continue;
+
+      /* TODO: I guess we could handle at least const builtin functions like
+	 arithmetic operations below.
*/
+      if (is_gimple_call (stmt))
+	{
+	  int all_uses = 0;
+	  use_operand_p use_p;
+	  FOR_EACH_IMM_USE_ON_STMT (use_p, imm_iter)
+	    all_uses++;
+
+	  gcall *call = as_a <gcall *> (stmt);
+	  unsigned arg_count;
+	  if (gimple_call_internal_p (call)
+	      || (arg_count = gimple_call_num_args (call)) == 0)
+	    {
+	      res = -1;
+	      BREAK_FROM_IMM_USE_STMT (imm_iter);
+	    }
+
+	  cgraph_edge *cs = node->get_edge (stmt);
+	  gcc_checking_assert (cs);
+	  isra_call_summary *csum = call_sums->get_create (cs);
+	  csum->init_inputs (arg_count);
+
+	  int simple_uses = 0;
+	  for (unsigned i = 0; i < arg_count; i++)
+	    if (gimple_call_arg (call, i) == name)
+	      {
+		if (!add_src_to_param_flow (&csum->m_inputs[i], parm_num))
+		  {
+		    simple_uses = -1;
+		    break;
+		  }
+		simple_uses++;
+	      }
+
+	  if (simple_uses < 0
+	      || all_uses != simple_uses)
+	    {
+	      res = -1;
+	      BREAK_FROM_IMM_USE_STMT (imm_iter);
+	    }
+	  res += all_uses;
+	}
+      else if ((is_gimple_assign (stmt) && !gimple_has_volatile_ops (stmt))
+	       || gimple_code (stmt) == GIMPLE_PHI)
+	{
+	  tree lhs;
+	  if (gimple_code (stmt) == GIMPLE_PHI)
+	    lhs = gimple_phi_result (stmt);
+	  else
+	    lhs = gimple_assign_lhs (stmt);
+
+	  if (TREE_CODE (lhs) != SSA_NAME)
+	    {
+	      res = -1;
+	      BREAK_FROM_IMM_USE_STMT (imm_iter);
+	    }
+	  gcc_assert (!gimple_vdef (stmt));
+	  if (bitmap_set_bit (analyzed, SSA_NAME_VERSION (lhs)))
+	    {
+	      int tmp = isra_track_scalar_value_uses (node, lhs, parm_num,
+						      analyzed);
+	      if (tmp < 0)
+		{
+		  res = tmp;
+		  BREAK_FROM_IMM_USE_STMT (imm_iter);
+		}
+	      res += tmp;
+	    }
+	}
+      else
+	{
+	  res = -1;
+	  BREAK_FROM_IMM_USE_STMT (imm_iter);
+	}
+    }
+  return res;
+}
+
+/* Inspect all uses of PARM, which must be a gimple register, in FUN (which is
+   also described by NODE) and simple arithmetic calculations involving PARM
+   and return false if any of them is used for anything other than an actual
+   call argument, a simple arithmetic operation or a debug statement.
If
+   there are no such uses, return true and store the number of actual arguments
+   that this parameter eventually feeds to (or zero if there is none) to
+   *CALL_USES_P.  For each such argument, mark PARM_NUM as one of its
+   sources.  */
+
+static bool
+isra_track_scalar_param (function *fun, cgraph_node *node, tree parm,
+			 int parm_num, int *call_uses_p)
+{
+  gcc_checking_assert (is_gimple_reg (parm));
+
+  tree name = ssa_default_def (fun, parm);
+  if (!name || has_zero_uses (name))
+    {
+      *call_uses_p = 0;
+      return true;
+    }
+
+  bitmap analyzed = BITMAP_ALLOC (NULL);
+  int call_uses = isra_track_scalar_value_uses (node, name, parm_num, analyzed);
+  BITMAP_FREE (analyzed);
+  if (call_uses < 0)
+    return false;
+  *call_uses_p = call_uses;
+  return true;
+}
+
+/* Scan immediate uses of a default definition SSA name of a parameter PARM and
+   examine whether there are any nonarg uses that are not actual arguments or
+   otherwise infeasible uses.  If so, return true, otherwise return false.
+   Create pass-through IPA flow records for any direct uses as call arguments
+   and if returning false, store their number into *PT_COUNT_P.  NODE and FUN
+   must represent the function that is currently analyzed, PARM_NUM must be the
+   index of the analyzed parameter.
*/
+
+static bool
+ptr_parm_has_nonarg_uses (cgraph_node *node, function *fun, tree parm,
+			  int parm_num, unsigned *pt_count_p)
+{
+  imm_use_iterator ui;
+  gimple *stmt;
+  tree name = ssa_default_def (fun, parm);
+  bool ret = false;
+  unsigned pt_count = 0;
+
+  if (!name || has_zero_uses (name))
+    return false;
+
+  FOR_EACH_IMM_USE_STMT (stmt, ui, name)
+    {
+      unsigned uses_ok = 0;
+      use_operand_p use_p;
+
+      if (is_gimple_debug (stmt))
+	continue;
+
+      if (gimple_assign_single_p (stmt))
+	{
+	  tree rhs = gimple_assign_rhs1 (stmt);
+	  while (handled_component_p (rhs))
+	    rhs = TREE_OPERAND (rhs, 0);
+	  if (TREE_CODE (rhs) == MEM_REF
+	      && TREE_OPERAND (rhs, 0) == name
+	      && integer_zerop (TREE_OPERAND (rhs, 1))
+	      && types_compatible_p (TREE_TYPE (rhs),
+				     TREE_TYPE (TREE_TYPE (name)))
+	      && !TREE_THIS_VOLATILE (rhs))
+	    uses_ok++;
+	}
+      else if (is_gimple_call (stmt))
+	{
+	  gcall *call = as_a <gcall *> (stmt);
+	  unsigned arg_count;
+	  if (gimple_call_internal_p (call)
+	      || (arg_count = gimple_call_num_args (call)) == 0)
+	    {
+	      ret = true;
+	      BREAK_FROM_IMM_USE_STMT (ui);
+	    }
+
+	  cgraph_edge *cs = node->get_edge (stmt);
+	  gcc_checking_assert (cs);
+	  isra_call_summary *csum = call_sums->get_create (cs);
+	  csum->init_inputs (arg_count);
+
+	  for (unsigned i = 0; i < arg_count; ++i)
+	    {
+	      tree arg = gimple_call_arg (stmt, i);
+
+	      if (arg == name)
+		{
+		  /* TODO: Allow &MEM_REF[name + offset] here,
+		     ipa_param_body_adjustments::modify_call_stmt has to be
+		     adjusted too.  */
+		  csum->m_inputs[i].pointer_pass_through = true;
+		  csum->m_inputs[i].param_number = parm_num;
+		  pt_count++;
+		  uses_ok++;
+		  continue;
+		}
+	      /* TODO: Calculate offset here and we can also consider
+		 ADDR_EXPR's of MEM_REFs a pass-through.
*/
+
+	      while (handled_component_p (arg))
+		arg = TREE_OPERAND (arg, 0);
+	      if (TREE_CODE (arg) == MEM_REF
+		  && TREE_OPERAND (arg, 0) == name
+		  && integer_zerop (TREE_OPERAND (arg, 1))
+		  && types_compatible_p (TREE_TYPE (arg),
+					 TREE_TYPE (TREE_TYPE (name)))
+		  && !TREE_THIS_VOLATILE (arg))
+		uses_ok++;
+	    }
+	}
+
+      /* If the number of valid uses does not match the number of
+	 uses in this stmt there is an unhandled use.  */
+      unsigned all_uses = 0;
+      FOR_EACH_IMM_USE_ON_STMT (use_p, ui)
+	all_uses++;
+
+      gcc_checking_assert (uses_ok <= all_uses);
+      if (uses_ok != all_uses)
+	{
+	  ret = true;
+	  BREAK_FROM_IMM_USE_STMT (ui);
+	}
+    }
+
+  *pt_count_p = pt_count;
+  return ret;
+}
+
+/* Initialize vector of parameter descriptors of NODE.  Return true if there
+   are any candidates for any optimization.  */
+
+static bool
+create_parameter_descriptors (cgraph_node *node,
+			      vec<gensum_param_desc> *param_descriptions)
+{
+  function *fun = DECL_STRUCT_FUNCTION (node->decl);
+  bool ret = false;
+
+  decl2desc = new hash_map<tree, gensum_param_desc *>;
+  int num = 0;
+  for (tree parm = DECL_ARGUMENTS (node->decl);
+       parm;
+       parm = DECL_CHAIN (parm), num++)
+    {
+      const char *msg;
+      gensum_param_desc *desc = &(*param_descriptions)[num];
+      /* !!?
the vector is grown cleared in the caller */ + memset (desc, 0, sizeof (*desc)); + desc->m_param_number = num; + decl2desc->put (parm, desc); + + if (dump_file && (dump_flags & TDF_DETAILS)) + print_generic_expr (dump_file, parm, TDF_UID); + + int scalar_call_uses; + tree type = TREE_TYPE (parm); + if (TREE_THIS_VOLATILE (parm)) + { + if (dump_file && (dump_flags & TDF_DETAILS)) + fprintf (dump_file, " not a candidate, is volatile\n"); + continue; + } + if (!is_gimple_reg_type (type) && is_va_list_type (type)) + { + if (dump_file && (dump_flags & TDF_DETAILS)) + fprintf (dump_file, " not a candidate, is a va_list type\n"); + continue; + } + if (TREE_ADDRESSABLE (parm)) + { + if (dump_file && (dump_flags & TDF_DETAILS)) + fprintf (dump_file, " not a candidate, is addressable\n"); + continue; + } + if (TREE_ADDRESSABLE (type)) + { + if (dump_file && (dump_flags & TDF_DETAILS)) + fprintf (dump_file, " not a candidate, type cannot be split\n"); + continue; + } + + if (is_gimple_reg (parm) + && isra_track_scalar_param (fun, node, parm, num, &scalar_call_uses)) + { + if (dump_file && (dump_flags & TDF_DETAILS)) + fprintf (dump_file, " is a scalar with only %i call uses\n", + scalar_call_uses); + + desc->m_locally_unused = true; + desc->m_call_uses = scalar_call_uses; + ret = true; + } + + if (POINTER_TYPE_P (type)) + { + type = TREE_TYPE (type); + + if (TREE_CODE (type) == FUNCTION_TYPE + || TREE_CODE (type) == METHOD_TYPE) + { + if (dump_file && (dump_flags & TDF_DETAILS)) + fprintf (dump_file, " not a candidate, reference to " + "a function\n"); + continue; + } + if (TYPE_VOLATILE (type)) + { + if (dump_file && (dump_flags & TDF_DETAILS)) + fprintf (dump_file, " not a candidate, reference to " + "a volatile type\n"); + continue; + } + if (TREE_CODE (type) == ARRAY_TYPE + && TYPE_NONALIASED_COMPONENT (type)) + { + if (dump_file && (dump_flags & TDF_DETAILS)) + fprintf (dump_file, " not a candidate, reference to a" + "nonaliased component array\n"); + continue; + } + if 
(!is_gimple_reg (parm)) + { + if (dump_file && (dump_flags & TDF_DETAILS)) + fprintf (dump_file, " not a candidate, a reference which is " + "not a gimple register (probably addressable)\n"); + continue; + } + if (ptr_parm_has_nonarg_uses (node, fun, parm, num, + &desc->ptr_pt_count)) + { + if (dump_file && (dump_flags & TDF_DETAILS)) + fprintf (dump_file, " not a candidate, reference has " + "nonarg uses\n"); + continue; + } + if (is_va_list_type (type)) + { + if (dump_file && (dump_flags & TDF_DETAILS)) + fprintf (dump_file, " not a candidate, reference to " + "a va list\n"); + continue; + } + desc->m_by_ref = true; + } + else if (!AGGREGATE_TYPE_P (type)) + { + /* This is in an else branch because scalars passed by reference are + still candidates to be passed by value. */ + if (dump_file && (dump_flags & TDF_DETAILS)) + fprintf (dump_file, " not a candidate, not an aggregate\n"); + continue; + } + + if (!COMPLETE_TYPE_P (type)) + { + if (dump_file && (dump_flags & TDF_DETAILS)) + fprintf (dump_file, " not a candidate, not a complete type\n"); + continue; + } + if (!tree_fits_uhwi_p (TYPE_SIZE (type))) + { + if (dump_file && (dump_flags & TDF_DETAILS)) + fprintf (dump_file, " not a candidate, size not representable\n"); + continue; + } + unsigned HOST_WIDE_INT type_size + = tree_to_uhwi (TYPE_SIZE (type)) / BITS_PER_UNIT; + if (type_size == 0 + || type_size >= 1 << ISRA_ARG_SIZE_LIMIT) + { + if (dump_file && (dump_flags & TDF_DETAILS)) + fprintf (dump_file, " not a candidate, has zero or huge size\n"); + continue; + } + unsigned ttl = PARAM_VALUE (PARAM_SRA_MAX_TYPE_CHECK_STEPS); + if (type_internals_preclude_sra_p (type, &msg, &ttl)) + { + if (dump_file && (dump_flags & TDF_DETAILS)) + fprintf (dump_file, " not a candidate, %s\n", msg); + continue; + } + + if (dump_file && (dump_flags & TDF_DETAILS)) + fprintf (dump_file, " is a candidate\n"); + + ret = true; + desc->m_split_candidate = true; + if (desc->m_by_ref) + desc->m_deref_index = by_ref_count++; + } + 
return ret;
+}
+
+/* Return pointer to the descriptor of parameter DECL or NULL if there is none
+   (which can happen for static chains).  */
+
+static gensum_param_desc *
+get_gensum_param_desc (tree decl)
+{
+  gcc_checking_assert (TREE_CODE (decl) == PARM_DECL);
+  gensum_param_desc **slot = decl2desc->get (decl);
+  if (!slot)
+    /* This can happen for static chains which we cannot handle so far.  */
+    return NULL;
+  gcc_checking_assert (*slot);
+  return *slot;
+}
+
+
+/* Remove parameter described by DESC from candidates for IPA-SRA and write
+   REASON to the dump file if there is one.  */
+
+/* !!? Perhaps rename to emphasize this prevents splitting, not removal? */
+
+static void
+disqualify_split_candidate (gensum_param_desc *desc, const char *reason)
+{
+  if (!desc->m_split_candidate)
+    return;
+
+  if (dump_file && (dump_flags & TDF_DETAILS))
+    fprintf (dump_file, "! Disqualifying parameter number %i - %s\n",
+	     desc->m_param_number, reason);
+
+  desc->m_split_candidate = false;
+}
+
+/* Remove DECL from candidates for IPA-SRA and write REASON to the dump file if
+   there is one.  */
+
+static void
+disqualify_split_candidate (tree decl, const char *reason)
+{
+  gensum_param_desc *desc = get_gensum_param_desc (decl);
+  if (desc)
+    disqualify_split_candidate (desc, reason);
+}
+
+/* Allocate a new access to DESC and fill it in with OFFSET and SIZE.  But
+   first, check that there are not too many of them already.  If so, do not
+   allocate anything and return NULL.  */
+
+static gensum_param_access *
+allocate_access (gensum_param_desc *desc,
+		 HOST_WIDE_INT offset, HOST_WIDE_INT size)
+{
+  if (desc->m_access_count
+      == (unsigned) PARAM_VALUE (PARAM_IPA_SRA_MAX_REPLACEMENTS))
+    {
+      disqualify_split_candidate (desc, "Too many replacement candidates");
+      return NULL;
+    }
+
+  /* !!!
TODO: allocate from an obstack */
+  gensum_param_access *access = new gensum_param_access ();
+  memset (access, 0, sizeof (*access));
+  access->offset = offset;
+  access->size = size;
+  return access;
+}
+
+/* In what context scan_expr_access has been called, whether it deals with a
+   load, a function argument, or a store.  */
+
+enum isra_scan_context {ISRA_CTX_LOAD, ISRA_CTX_ARG, ISRA_CTX_STORE};
+
+/* Return an access describing memory access to the variable described by DESC
+   at OFFSET with SIZE in context CTX, starting at pointer to the linked list
+   at a certain tree level FIRST.  Attempt to create it if it does not exist,
+   but fail and return NULL if there are already too many accesses, if it would
+   create a partially overlapping access or if an access would end up inside a
+   non-call access.  */
+
+static gensum_param_access *
+get_access_1 (gensum_param_desc *desc, gensum_param_access **first,
+	      HOST_WIDE_INT offset, HOST_WIDE_INT size, isra_scan_context ctx)
+{
+  gensum_param_access *access = *first, **ptr = first;
+
+  if (!access)
+    {
+      /* No pre-existing access at this level, just create it.  */
+      gensum_param_access *a = allocate_access (desc, offset, size);
+      if (!a)
+	return NULL;
+      *first = a;
+      return *first;
+    }
+
+  if (access->offset >= offset + size)
+    {
+      /* We want to squeeze it in front of the very first access, just do
+	 it.  */
+      gensum_param_access *r = allocate_access (desc, offset, size);
+      if (!r)
+	return NULL;
+      r->next_sibling = access;
+      *first = r;
+      return r;
+    }
+
+  /* Skip all accesses that have to come before us until the next sibling is
+     already too far.  */
+  while (offset >= access->offset + access->size
+	 && access->next_sibling
+	 && access->next_sibling->offset < offset + size)
+    {
+      ptr = &access->next_sibling;
+      access = access->next_sibling;
+    }
+
+  /* At this point we know we do not belong before access.
*/ + gcc_assert (access->offset < offset + size); + + if (access->offset == offset && access->size == size) + /* We found what we were looking for. */ + return access; + + if (access->offset <= offset + && access->offset + access->size >= offset + size) + { + /* We fit into access which is larger than us. We need to find/create + something below access. But we only allow nesting in call + arguments. */ + if (access->nonarg) + return NULL; + + return get_access_1 (desc, &access->first_child, offset, size, ctx); + } + + if (offset <= access->offset + && offset + size >= access->offset + access->size) + /* We are actually bigger than access, which fully fits into us, take its + place and make all accesses fitting into it its children. */ + { + if (ctx != ISRA_CTX_ARG) + return NULL; + + gensum_param_access *r = allocate_access (desc, offset, size); + if (!r) + return NULL; + r->first_child = access; + + while (access->next_sibling + && access->next_sibling->offset < offset + size) + access = access->next_sibling; + if (access->offset + access->size > offset + size) + { + /* This must be a different access, which are sorted, so the + following must be true and this signals a partial overlap. */ + gcc_assert (access->offset > offset); + return NULL; + } + + r->next_sibling = access->next_sibling; + access->next_sibling = NULL; + *ptr = r; + return r; + } + + if (offset >= access->offset + access->size) + { + /* We belong after access. */ + gensum_param_access *r = allocate_access (desc, offset, size); + if (!r) + return NULL; + r->next_sibling = access->next_sibling; + access->next_sibling = r; + return r; + } + + if (offset < access->offset) + { + /* We know the following, otherwise we would have created a + super-access. */ + gcc_checking_assert (offset + size < access->offset + access->size); + return NULL; + } + + if (offset + size > access->offset + access->size) + { + /* Likewise. 
*/
+      gcc_checking_assert (offset > access->offset);
+      return NULL;
+    }
+
+  gcc_unreachable ();
+}
+
+/* Return an access describing memory access to the variable described by DESC
+   at OFFSET with SIZE in context CTX, and mark it as used in context CTX.
+   Attempt to create it if it does not exist, but fail and return NULL if there
+   are already too many accesses, if it would create a partially overlapping
+   access or if an access would end up inside a non-call access.  */
+
+static gensum_param_access *
+get_access (gensum_param_desc *desc, HOST_WIDE_INT offset, HOST_WIDE_INT size,
+	    isra_scan_context ctx)
+{
+  gcc_checking_assert (desc->m_split_candidate);
+
+  gensum_param_access *access = get_access_1 (desc, &desc->x_accesses, offset,
+					      size, ctx);
+  if (!access)
+    {
+      disqualify_split_candidate (desc, "Bad access overlap");
+      return NULL;
+    }
+
+  switch (ctx)
+    {
+    case ISRA_CTX_STORE:
+      gcc_assert (!desc->m_by_ref);
+      /* Fall-through */
+    case ISRA_CTX_LOAD:
+      access->nonarg = true;
+      break;
+    case ISRA_CTX_ARG:
+      break;
+    }
+
+  return access;
+}
+
+/* Verify that parameter access tree starting with ACCESS is in good shape.
+   PARENT_OFFSET and PARENT_SIZE are the corresponding fields of the parent of
+   ACCESS or zero if there is none.
*/
+
+static bool
+verify_access_tree_1 (gensum_param_access *access, HOST_WIDE_INT parent_offset,
+		      HOST_WIDE_INT parent_size)
+{
+  while (access)
+    {
+      gcc_assert (access->offset >= 0 && access->size > 0);
+
+      if (parent_size != 0)
+	{
+	  if (access->offset < parent_offset)
+	    {
+	      error ("Access offset before parent offset");
+	      return true;
+	    }
+	  if (access->size >= parent_size)
+	    {
+	      error ("Access size greater or equal to its parent size");
+	      return true;
+	    }
+	  if (access->offset + access->size > parent_offset + parent_size)
+	    {
+	      error ("Access terminates outside of its parent");
+	      return true;
+	    }
+	}
+
+      if (verify_access_tree_1 (access->first_child, access->offset,
+				access->size))
+	return true;
+
+      if (access->next_sibling
+	  && (access->next_sibling->offset < access->offset + access->size))
+	{
+	  error ("Access overlaps with its sibling");
+	  return true;
+	}
+
+      access = access->next_sibling;
+    }
+  return false;
+}
+
+/* Verify that parameter access tree starting with ACCESS is in good shape,
+   halt compilation and dump the tree to stderr if not.  */
+
+DEBUG_FUNCTION void
+isra_verify_access_tree (gensum_param_access *access)
+{
+  if (verify_access_tree_1 (access, 0, 0))
+    {
+      for (; access; access = access->next_sibling)
+	dump_gensum_access (stderr, access, 2);
+      internal_error ("IPA-SRA access verification failed");
+    }
+}
+
+
+/* Callback of walk_stmt_load_store_addr_ops visit_addr used to determine
+   GIMPLE_ASM operands with memory constraints which cannot be scalarized.  */
+
+static bool
+asm_visit_addr (gimple *, tree op, tree, void *)
+{
+  op = get_base_address (op);
+  if (op
+      && TREE_CODE (op) == PARM_DECL)
+    disqualify_split_candidate (op, "Non-scalarizable GIMPLE_ASM operand.");
+
+  return false;
+}
+
+/* Mark a dereference of parameter identified by DESC of distance DIST in a
+   basic block BB, unless the BB has already been marked as potentially
+   final.
*/ + +static void +mark_param_dereference (gensum_param_desc *desc, HOST_WIDE_INT dist, + basic_block bb) +{ + gcc_assert (desc->m_by_ref); + gcc_checking_assert (desc->m_split_candidate); + + if (bitmap_bit_p (final_bbs, bb->index)) + return; + + int idx = bb->index * by_ref_count + desc->m_deref_index; + if (bb_dereferences[idx] < dist) + bb_dereferences[idx] = dist; +} + +/* Return true, if any potential replacements should use NEW_TYPE as opposed to + previously recorded OLD_TYPE. */ + +static bool +type_prevails_p (tree old_type, tree new_type) +{ + if (old_type == new_type) + return false; + + /* Non-aggregates are always better. */ + if (!is_gimple_reg_type (old_type) + && is_gimple_reg_type (new_type)) + return true; + if (is_gimple_reg_type (old_type) + && !is_gimple_reg_type (new_type)) + return false; + + /* Prefer any complex or vector type over any other scalar type. */ + if (TREE_CODE (old_type) != COMPLEX_TYPE + && TREE_CODE (old_type) != VECTOR_TYPE + && (TREE_CODE (new_type) == COMPLEX_TYPE + || TREE_CODE (new_type) == VECTOR_TYPE)) + return true; + if ((TREE_CODE (old_type) == COMPLEX_TYPE + || TREE_CODE (old_type) == VECTOR_TYPE) + && TREE_CODE (new_type) != COMPLEX_TYPE + && TREE_CODE (new_type) != VECTOR_TYPE) + return false; + + /* Use the integral type with the bigger precision first. */ + if (INTEGRAL_TYPE_P (old_type) + && INTEGRAL_TYPE_P (new_type)) + return (TYPE_PRECISION (new_type) > TYPE_PRECISION (old_type)); + + /* Put any integral type with non-full precision last. */ + if (INTEGRAL_TYPE_P (old_type) + && (TREE_INT_CST_LOW (TYPE_SIZE (old_type)) + != TYPE_PRECISION (old_type))) + return true; + if (INTEGRAL_TYPE_P (new_type) + && (TREE_INT_CST_LOW (TYPE_SIZE (new_type)) + != TYPE_PRECISION (new_type))) + return false; + /* Stabilize the selection. */ + return TYPE_UID (old_type) < TYPE_UID (new_type); +} + +/* When scanning an expression which is a call argument, this structure + specifies the call and the position of the argument. 
*/ + +struct scan_call_info +{ + /* Call graph edge representing the call. */ + cgraph_edge *cs; + /* Total number of arguments in the call. */ + unsigned argument_count; + /* Number of the actual argument being scanned. */ + unsigned arg_idx; +}; + +/* Record use of the part of a parameter described by DESC at OFFSET with SIZE + in a call argument described by CALL_INFO. */ + +static void +record_nonregister_call_use (gensum_param_desc *desc, + scan_call_info *call_info, + HOST_WIDE_INT offset, HOST_WIDE_INT size) +{ + isra_call_summary *csum = call_sums->get_create (call_info->cs); + csum->init_inputs (call_info->argument_count); + + isra_param_flow *param_flow = &csum->m_inputs[call_info->arg_idx]; + param_flow->aggregate_pass_through = true; + param_flow->param_number = desc->m_param_number; + + gcc_checking_assert ((offset % BITS_PER_UNIT) == 0); + gcc_checking_assert ((size % BITS_PER_UNIT) == 0); + param_flow->unit_offset = offset / BITS_PER_UNIT; + param_flow->unit_size = size / BITS_PER_UNIT; + + desc->m_call_uses++; +} + +/* Callback of walk_aliased_vdefs, just mark that there was a possible + modification. */ + +static bool +mark_maybe_modified (ao_ref *, tree, void *data) +{ + bool *maybe_modified = (bool *) data; + *maybe_modified = true; + return true; +} + +/* Analyze expression EXPR from GIMPLE for accesses to parameters. CTX + specifies whether EXPR is used in a load, store or as a call argument. BB + should be the basic block in which EXPR resides. If CTX specifies call + argument context, CALL_INFO must describe the call and argument position, + otherwise it is ignored. 
*/ + +static void +scan_expr_access (tree expr, gimple *stmt, isra_scan_context ctx, + basic_block bb, scan_call_info *call_info = NULL) +{ + poly_int64 poffset, psize, pmax_size; + HOST_WIDE_INT offset, size, max_size; + tree base; + bool deref = false; + bool reverse; + + if (TREE_CODE (expr) == BIT_FIELD_REF + || TREE_CODE (expr) == IMAGPART_EXPR + || TREE_CODE (expr) == REALPART_EXPR) + expr = TREE_OPERAND (expr, 0); + + base = get_ref_base_and_extent (expr, &poffset, &psize, &pmax_size, &reverse); + + if (TREE_CODE (base) == MEM_REF) + { + tree op = TREE_OPERAND (base, 0); + if (TREE_CODE (op) != SSA_NAME + || !SSA_NAME_IS_DEFAULT_DEF (op)) + return; + base = SSA_NAME_VAR (op); + if (!base) + return; + deref = true; + } + if (TREE_CODE (base) != PARM_DECL) + return; + + /* !!! Move get_gensum_param_desc here and then disqualify using it. */ + if (!poffset.is_constant (&offset) + || !psize.is_constant (&size) + || !pmax_size.is_constant (&max_size)) + { + disqualify_split_candidate (base, "Encountered a polynomial-sized " + "access."); + return; + } + if (size < 0 || size != max_size) + { + disqualify_split_candidate (base, "Encountered a variable sized access."); + return; + } + + if (TREE_CODE (expr) == COMPONENT_REF + && DECL_BIT_FIELD (TREE_OPERAND (expr, 1))) + { + disqualify_split_candidate (base, "Encountered a bit-field access."); + return; + } + gcc_assert (offset >= 0); + gcc_assert ((offset % BITS_PER_UNIT) == 0); + gcc_assert ((size % BITS_PER_UNIT) == 0); + if ((offset / BITS_PER_UNIT) >= UINT_MAX + || (size / BITS_PER_UNIT) >= UINT_MAX) + { + disqualify_split_candidate (base, "Encountered an access with too big " + "offset or size"); + return; + } + + gensum_param_desc *desc = get_gensum_param_desc (base); + if (!desc || !desc->m_split_candidate) + return; + + tree type = TREE_TYPE (expr); + unsigned int exp_align = get_object_alignment (expr); + + if (exp_align < TYPE_ALIGN (type)) + { + disqualify_split_candidate (desc, "Underaligned access."); 
+ return; + } + + if (deref) + { + if (!desc->m_by_ref) + { + disqualify_split_candidate (desc, "Dereferencing a non-reference."); + return; + } + else if (ctx == ISRA_CTX_STORE) + { + disqualify_split_candidate (desc, "Storing to data passed by " + "reference."); + return; + } + + if (!aa_walking_limit) + { + disqualify_split_candidate (desc, "Out of alias analysis step " + "limit."); + return; + } + + gcc_checking_assert (gimple_vuse (stmt)); + bool maybe_modified = false; + ao_ref ar; + + ao_ref_init (&ar, expr); + bitmap visited = BITMAP_ALLOC (NULL); + int walked = walk_aliased_vdefs (&ar, gimple_vuse (stmt), + mark_maybe_modified, &maybe_modified, + &visited, NULL, aa_walking_limit); + BITMAP_FREE (visited); + if (walked > 0) + { + gcc_assert (aa_walking_limit > walked); + aa_walking_limit = aa_walking_limit - walked; + } + if (walked < 0) + aa_walking_limit = 0; + if (maybe_modified || walked < 0) + { + disqualify_split_candidate (desc, "Data passed by reference possibly " + "modified through an alias."); + return; + } + else + mark_param_dereference (desc, offset + size, bb); + } + else + /* Pointer parameters with direct uses should have been ruled out by + analyzing SSA default def when looking at the parameters. */ + gcc_assert (!desc->m_by_ref); + + gensum_param_access *access = get_access (desc, offset, size, ctx); + if (!access) + return; + + if (ctx == ISRA_CTX_ARG) + { + gcc_checking_assert (call_info); + if (!deref) + record_nonregister_call_use (desc, call_info, offset, size); + else + /* This is not a pass-through of a pointer, this is a use like any + other. 
*/ + access->nonarg = true; + } + + if (!access->type) + { + access->type = type; + access->alias_ptr_type = reference_alias_ptr_type (expr); + } + else + { + if (exp_align < TYPE_ALIGN (access->type)) + { + disqualify_split_candidate (desc, "Reference has lower alignment " + "than a previous one."); + return; + } + if (access->alias_ptr_type != reference_alias_ptr_type (expr)) + { + disqualify_split_candidate (desc, "Multiple alias pointer types."); + return; + } + if (!deref + && (AGGREGATE_TYPE_P (type) || AGGREGATE_TYPE_P (access->type)) + && (TYPE_MAIN_VARIANT (access->type) != TYPE_MAIN_VARIANT (type))) + { + /* We need the same aggregate type on all accesses to be able to + distinguish transformation spots from pass-through arguments in + the transformation phase. */ + disqualify_split_candidate (desc, "We do not support aggregate " + "type punning."); + return; + } + + if (type_prevails_p (access->type, type)) + access->type = type; + } +} + +/* Scan body function described by NODE and FUN and create access trees for + parameters. 
*/ + +static void +scan_function (cgraph_node *node, struct function *fun) +{ + basic_block bb; + + FOR_EACH_BB_FN (bb, fun) + { + gimple_stmt_iterator gsi; + for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi)) + { + gimple *stmt = gsi_stmt (gsi); + + if (stmt_can_throw_external (fun, stmt)) + bitmap_set_bit (final_bbs, bb->index); + switch (gimple_code (stmt)) + { + case GIMPLE_RETURN: + { + tree t = gimple_return_retval (as_a <greturn *> (stmt)); + if (t != NULL_TREE) + scan_expr_access (t, stmt, ISRA_CTX_LOAD, bb); + bitmap_set_bit (final_bbs, bb->index); + } + break; + + case GIMPLE_ASSIGN: + if (gimple_assign_single_p (stmt) + && !gimple_clobber_p (stmt)) + { + tree rhs = gimple_assign_rhs1 (stmt); + scan_expr_access (rhs, stmt, ISRA_CTX_LOAD, bb); + tree lhs = gimple_assign_lhs (stmt); + scan_expr_access (lhs, stmt, ISRA_CTX_STORE, bb); + } + break; + + case GIMPLE_CALL: + { + unsigned argument_count = gimple_call_num_args (stmt); + scan_call_info call_info; + call_info.cs = node->get_edge (stmt); + call_info.argument_count = argument_count; + + for (unsigned i = 0; i < argument_count; i++) + { + call_info.arg_idx = i; + scan_expr_access (gimple_call_arg (stmt, i), stmt, + ISRA_CTX_ARG, bb, &call_info); + } + + tree lhs = gimple_call_lhs (stmt); + if (lhs) + scan_expr_access (lhs, stmt, ISRA_CTX_STORE, bb); + int flags = gimple_call_flags (stmt); + if ((flags & (ECF_CONST | ECF_PURE)) == 0) + bitmap_set_bit (final_bbs, bb->index); + } + break; + + case GIMPLE_ASM: + { + gasm *asm_stmt = as_a <gasm *> (stmt); + walk_stmt_load_store_addr_ops (asm_stmt, NULL, NULL, NULL, + asm_visit_addr); + bitmap_set_bit (final_bbs, bb->index); + + for (unsigned i = 0; i < gimple_asm_ninputs (asm_stmt); i++) + { + tree t = TREE_VALUE (gimple_asm_input_op (asm_stmt, i)); + scan_expr_access (t, stmt, ISRA_CTX_LOAD, bb); + } + for (unsigned i = 0; i < gimple_asm_noutputs (asm_stmt); i++) + { + tree t = TREE_VALUE (gimple_asm_output_op (asm_stmt, i)); + scan_expr_access (t, stmt, 
ISRA_CTX_STORE, bb); + } + } + break; + + default: + break; + } + } + } +} + +/* Return true if SSA_NAME NAME is only used in return statements, or if + results of any operations it is involved in are only used in return + statements. ANALYZED is a bitmap that tracks which SSA names we have + already started investigating. */ + +static bool +ssa_name_only_returned_p (tree name, bitmap analyzed) +{ + bool res = true; + imm_use_iterator imm_iter; + gimple *stmt; + + FOR_EACH_IMM_USE_STMT (stmt, imm_iter, name) + { + if (is_gimple_debug (stmt)) + continue; + + if (gimple_code (stmt) == GIMPLE_RETURN) + { + tree t = gimple_return_retval (as_a <greturn *> (stmt)); + if (t != name) + { + res = false; + BREAK_FROM_IMM_USE_STMT (imm_iter); + } + } + else if ((is_gimple_assign (stmt) && !gimple_has_volatile_ops (stmt)) + || gimple_code (stmt) == GIMPLE_PHI) + { + /* TODO: And perhaps for const function calls too? */ + tree lhs; + if (gimple_code (stmt) == GIMPLE_PHI) + lhs = gimple_phi_result (stmt); + else + lhs = gimple_assign_lhs (stmt); + + if (TREE_CODE (lhs) != SSA_NAME) + { + res = false; + BREAK_FROM_IMM_USE_STMT (imm_iter); + } + gcc_assert (!gimple_vdef (stmt)); + if (bitmap_set_bit (analyzed, SSA_NAME_VERSION (lhs)) + && !ssa_name_only_returned_p (lhs, analyzed)) + { + res = false; + BREAK_FROM_IMM_USE_STMT (imm_iter); + } + } + else + { + res = false; + BREAK_FROM_IMM_USE_STMT (imm_iter); + } + } + return res; +} + +/* Inspect the uses of the return value of the call associated with CS, and if + it is not used or if it is only used to construct the return value of the + caller, mark it as such in call or caller summary. Also check for + misaligned arguments. */ + +static void +isra_analyze_call (cgraph_edge *cs) +{ + gcall *call_stmt = cs->call_stmt; + unsigned count = gimple_call_num_args (call_stmt); + isra_call_summary *csum = call_sums->get_create (cs); + csum->init_inputs (count); /* !!? Try avoiding calling this. 
*/ + for (unsigned i = 0; i < count; i++) + { + tree arg = gimple_call_arg (call_stmt, i); + if (is_gimple_reg (arg)) + continue; + + tree offset; + poly_int64 bitsize, bitpos; + machine_mode mode; + int unsignedp, reversep, volatilep = 0; + get_inner_reference (arg, &bitsize, &bitpos, &offset, &mode, + &unsignedp, &reversep, &volatilep); + if (!multiple_p (bitpos, BITS_PER_UNIT)) + { + csum->m_bit_aligned_arg = true; + break; + } + } + + tree lhs = gimple_call_lhs (call_stmt); + if (lhs) + { + /* TODO: Also detect unused and/or only forwarded to + return aggregates. */ + if (TREE_CODE (lhs) == SSA_NAME) + { + bitmap analyzed = BITMAP_ALLOC (NULL); + if (ssa_name_only_returned_p (lhs, analyzed)) + csum->m_return_returned = true; + BITMAP_FREE (analyzed); + } + } + else + csum->m_return_ignored = true; +} + +/* Look at all calls going out of NODE and perform all analyses necessary for + IPA-SRA that are not done at body scan time, or that are done even when the + body is not scanned because the function is not a candidate. */ + +static void +isra_analyze_all_outgoing_calls (cgraph_node *node) +{ + for (cgraph_edge *cs = node->callees; cs; cs = cs->next_callee) + isra_analyze_call (cs); + for (cgraph_edge *cs = node->indirect_calls; cs; cs = cs->next_callee) + isra_analyze_call (cs); +} + +/* Dump a dereferences table with heading STR to file F. 
*/ + +static void +dump_dereferences_table (FILE *f, struct function *fun, const char *str) +{ + basic_block bb; + + fprintf (f, "%s", str); + FOR_BB_BETWEEN (bb, ENTRY_BLOCK_PTR_FOR_FN (fun), + EXIT_BLOCK_PTR_FOR_FN (fun), next_bb) + { + fprintf (f, "%4i %i ", bb->index, bitmap_bit_p (final_bbs, bb->index)); + if (bb != EXIT_BLOCK_PTR_FOR_FN (fun)) + { + int i; + for (i = 0; i < by_ref_count; i++) + { + int idx = bb->index * by_ref_count + i; + fprintf (f, " %4" HOST_WIDE_INT_PRINT "d", bb_dereferences[idx]); + } + } + fprintf (f, "\n"); + } + fprintf (f, "\n"); +} + +/* Propagate distances in bb_dereferences in the opposite direction than the + control flow edges, in each step storing the maximum of the current value + and the minimum of all successors. These steps are repeated until the table + stabilizes. Note that BBs which might terminate the functions (according to + final_bbs bitmap) are never updated in this way. */ + +static void +propagate_dereference_distances (struct function *fun) +{ + basic_block bb; + + if (dump_file && (dump_flags & TDF_DETAILS)) + dump_dereferences_table (dump_file, fun, + "Dereference table before propagation:\n"); + + auto_vec<basic_block> queue (last_basic_block_for_fn (fun)); + queue.quick_push (ENTRY_BLOCK_PTR_FOR_FN (fun)); + FOR_EACH_BB_FN (bb, fun) + { + queue.quick_push (bb); + bb->aux = bb; + } + + while (!queue.is_empty ()) + { + edge_iterator ei; + edge e; + bool change = false; + int i; + + bb = queue.pop (); + bb->aux = NULL; + + if (bitmap_bit_p (final_bbs, bb->index)) + continue; + + for (i = 0; i < by_ref_count; i++) + { + int idx = bb->index * by_ref_count + i; + bool first = true; + HOST_WIDE_INT inh = 0; + + FOR_EACH_EDGE (e, ei, bb->succs) + { + int succ_idx = e->dest->index * by_ref_count + i; + + if (e->src == EXIT_BLOCK_PTR_FOR_FN (fun)) + continue; + + if (first) + { + first = false; + inh = bb_dereferences [succ_idx]; + } + else if (bb_dereferences [succ_idx] < inh) + inh = bb_dereferences [succ_idx]; + } 
+ + if (!first && bb_dereferences[idx] < inh) + { + bb_dereferences[idx] = inh; + change = true; + } + } + + if (change && !bitmap_bit_p (final_bbs, bb->index)) + FOR_EACH_EDGE (e, ei, bb->preds) + { + if (e->src->aux) + continue; + + e->src->aux = e->src; + queue.quick_push (e->src); + } + } + + if (dump_file && (dump_flags & TDF_DETAILS)) + dump_dereferences_table (dump_file, fun, + "Dereference table after propagation:\n"); +} + +/* Perform basic checks on ACCESS to PARM described by DESC and all its + children, return true if the parameter cannot be split, otherwise return + false and update *NONARG_ACC_SIZE and *ONLY_CALLS. ENTRY_BB_INDEX must be + the index of the entry BB in the function of PARM. */ + +static bool +check_gensum_access (tree parm, gensum_param_desc *desc, + gensum_param_access *access, + HOST_WIDE_INT *nonarg_acc_size, bool *only_calls, + int entry_bb_index) +{ + if (access->nonarg) + { + *only_calls = false; + *nonarg_acc_size += access->size; + } + /* Do not decompose a non-BLKmode param in a way that would create + BLKmode params. Especially for by-reference passing (thus, + pointer-type param) this is hardly worthwhile. */ + if (DECL_MODE (parm) != BLKmode + && TYPE_MODE (access->type) == BLKmode) + { + disqualify_split_candidate (desc, "Would convert a non-BLK to a BLK."); + return true; + } + + if (desc->m_by_ref) + { + int idx = (entry_bb_index * by_ref_count + desc->m_deref_index); + if ((access->offset + access->size) > bb_dereferences[idx]) + { + disqualify_split_candidate (desc, "Would create a possibly " + "illegal dereference in a caller."); + return true; + } + } + + for (gensum_param_access *ch = access->first_child; + ch; + ch = ch->next_sibling) + if (check_gensum_access (parm, desc, ch, nonarg_acc_size, only_calls, + entry_bb_index)) + return true; + + return false; +} + +/* Copy data from FROM and all of its children to a vector of accesses in IPA + descriptor DESC. 
*/ + +static void +copy_accesses_to_ipa_desc (gensum_param_access *from, isra_param_desc *desc) +{ + param_access *to = ggc_cleared_alloc <param_access> (); + gcc_checking_assert ((from->offset % BITS_PER_UNIT) == 0); + gcc_checking_assert ((from->size % BITS_PER_UNIT) == 0); + to->unit_offset = from->offset / BITS_PER_UNIT; + to->unit_size = from->size / BITS_PER_UNIT; + to->type = from->type; + to->alias_ptr_type = from->alias_ptr_type; + to->definitive = from->nonarg; + to->check_overlaps = !from->nonarg; + vec_safe_push (desc->m_accesses, to); + + for (gensum_param_access *ch = from->first_child; + ch; + ch = ch->next_sibling) + copy_accesses_to_ipa_desc (ch, desc); +} + +/* Analyze function body scan results stored in PARAM_DESCRIPTIONS, detect + possible transformations and store information about them in the function + summary. NODE, FUN and IFS are all various structures describing the + currently analyzed function. */ + +static void +process_scan_results (cgraph_node *node, struct function *fun, + isra_func_summary *ifs, + vec<gensum_param_desc> *param_descriptions) +{ + bool check_pass_throughs = false; + bool dereferences_propagated = false; + tree parm = DECL_ARGUMENTS (node->decl); + unsigned param_count = param_descriptions->length (); + + for (unsigned desc_index = 0; + desc_index < param_count; + desc_index++, parm = DECL_CHAIN (parm)) + { + gensum_param_desc *desc = &(*param_descriptions)[desc_index]; + if (!desc->m_locally_unused && !desc->m_split_candidate) + continue; + + if (flag_checking) + isra_verify_access_tree (desc->x_accesses); + + if (!dereferences_propagated + && desc->m_by_ref + && desc->x_accesses) + { + propagate_dereference_distances (fun); + dereferences_propagated = true; + } + + HOST_WIDE_INT nonarg_acc_size = 0; + bool only_calls = true; + + int entry_bb_index = ENTRY_BLOCK_PTR_FOR_FN (fun)->index; + for (gensum_param_access *acc = desc->x_accesses; + acc; + acc = acc->next_sibling) + if (check_gensum_access (parm, desc, acc, &nonarg_acc_size, 
&only_calls, + entry_bb_index)) + continue; + + if (only_calls) + desc->m_locally_unused = true; + + HOST_WIDE_INT cur_param_size + = tree_to_uhwi (TYPE_SIZE (TREE_TYPE (parm))); + HOST_WIDE_INT param_size_limit; + if (!desc->m_by_ref || optimize_function_for_size_p (fun)) + param_size_limit = cur_param_size; + else + param_size_limit = (PARAM_VALUE (PARAM_IPA_SRA_PTR_GROWTH_FACTOR) + * cur_param_size); + if (nonarg_acc_size > param_size_limit + || (!desc->m_by_ref && nonarg_acc_size == param_size_limit)) + { + disqualify_split_candidate (desc, "Would result in a too big set of " + "replacements."); + } + else + { + /* create_parameter_descriptors makes sure unit sizes of all + candidate parameters fit unsigned integers restricted to + ISRA_ARG_SIZE_LIMIT bits. */ + desc->param_size_limit = param_size_limit / BITS_PER_UNIT; + desc->nonarg_acc_size = nonarg_acc_size / BITS_PER_UNIT; + if (desc->m_split_candidate && desc->ptr_pt_count) + { + gcc_assert (desc->m_by_ref); /* TODO: Remove after testing. */ + check_pass_throughs = true; + } + } + } + + /* When a pointer parameter is passed through to a callee, in which it is + only used to read only one or a few items, we can attempt to transform it + to obtaining and passing through the items instead of the pointer. But we + must take extra care that 1) we do not introduce any segfault by moving + dereferences above control flow and that 2) the data is not modified + through an alias in this function. The IPA analysis must not introduce + any access candidates unless it can prove both. + + The current solution is very crude as it consists of ensuring that the + call postdominates entry BB and that the definition of VUSE of the call is + default definition. TODO: For non-recursive callees in the same + compilation unit we could do better by doing analysis in topological order + and looking into access candidates of callees, using their alias_ptr_types + to attempt real AA. 
We could also use the maximum known dereferenced + offset in this function at IPA level but chances are that it is smaller + than the one in the callee (if the candidate survives the relatively modest + replacement size limit). + + TODO: Measure the overhead and the effect of just being pessimistic. + Maybe this is only -O3 material? + */ + bool pdoms_calculated = false; + if (check_pass_throughs) + for (cgraph_edge *cs = node->callees; cs; cs = cs->next_callee) + { + gcall *call_stmt = cs->call_stmt; + tree vuse = gimple_vuse (call_stmt); + + /* If the callee is a const function, we don't get a VUSE. In such + a case there will be no memory accesses in the called function (or + the const attribute is wrong) and then we just don't care. */ + bool uses_memory_as_obtained = vuse && SSA_NAME_IS_DEFAULT_DEF (vuse); + + unsigned count = gimple_call_num_args (call_stmt); + isra_call_summary *csum = call_sums->get_create (cs); + csum->init_inputs (count); + for (unsigned argidx = 0; argidx < count; argidx++) + { + if (!csum->m_inputs[argidx].pointer_pass_through) + continue; + unsigned pidx = csum->m_inputs[argidx].param_number; + gensum_param_desc *desc = &(*param_descriptions)[pidx]; + if (!desc->m_split_candidate) + { + csum->m_inputs[argidx].pointer_pass_through = false; + continue; + } + if (!uses_memory_as_obtained) + continue; + /* Post-dominator check placed last, hoping that it usually won't + be needed. 
*/ + + if (!pdoms_calculated) + { + push_cfun (fun); + add_noreturn_fake_exit_edges (); + connect_infinite_loops_to_exit (); + calculate_dominance_info (CDI_POST_DOMINATORS); + pdoms_calculated = true; + } + if (dominated_by_p (CDI_POST_DOMINATORS, + gimple_bb (call_stmt), + single_succ (ENTRY_BLOCK_PTR_FOR_FN (fun)))) + csum->m_inputs[argidx].safe_to_import_accesses = true; + } + + } + if (pdoms_calculated) + { + free_dominance_info (CDI_POST_DOMINATORS); + remove_fake_exit_edges (); + pop_cfun (); + } + + vec_safe_reserve_exact (ifs->m_parameters, param_count); + ifs->m_parameters->quick_grow_cleared (param_count); + for (unsigned desc_index = 0; desc_index < param_count; desc_index++) + { + gensum_param_desc *s = &(*param_descriptions)[desc_index]; + isra_param_desc *d = &(*ifs->m_parameters)[desc_index]; + + d->m_call_uses = s->m_call_uses; + d->ptr_pt_count = s->ptr_pt_count; + d->m_param_size_limit = s->param_size_limit; + d->m_size_reached = s->nonarg_acc_size; + d->m_locally_unused = s->m_locally_unused; + d->m_split_candidate = s->m_split_candidate; + d->m_by_ref = s->m_by_ref; + + for (gensum_param_access *acc = s->x_accesses; + acc; + acc = acc->next_sibling) + copy_accesses_to_ipa_desc (acc, d); + } +} + +/* Intraprocedural part of IPA-SRA analysis. Scan function body of NODE and + create a summary structure describing IPA-SRA opportunities and constraints + in it. 
*/ + +static void +ipa_sra_summarize_function (cgraph_node *node) +{ + if (dump_file) + fprintf (dump_file, "Creating summary for %s/%i:\n", node->name (), + node->order); + if (!ipa_sra_preliminary_function_checks (node)) + return; + isra_func_summary *ifs = func_sums->get_create (node); + ifs->m_candidate = true; + tree ret = TREE_TYPE (TREE_TYPE (node->decl)); + ifs->m_returns_value = (TREE_CODE (ret) != VOID_TYPE); + + unsigned count = 0; + for (tree parm = DECL_ARGUMENTS (node->decl); parm; parm = DECL_CHAIN (parm)) + count++; + + struct function *fun = DECL_STRUCT_FUNCTION (node->decl); + if (count > 0) + { + auto_vec<gensum_param_desc> param_descriptions (count); + param_descriptions.reserve_exact (count); + param_descriptions.quick_grow_cleared (count); + + if (create_parameter_descriptors (node, &param_descriptions)) + { + final_bbs = BITMAP_ALLOC (NULL); + bb_dereferences = XCNEWVEC (HOST_WIDE_INT, + by_ref_count + * last_basic_block_for_fn (fun)); + aa_walking_limit = PARAM_VALUE (PARAM_IPA_MAX_AA_STEPS); + scan_function (node, fun); + + if (dump_file) + { + dump_gensum_param_descriptors (dump_file, node->decl, + &param_descriptions); + fprintf (dump_file, "----------------------------------------\n"); + } + } + process_scan_results (node, fun, ifs, &param_descriptions); + + if (dump_file) + dump_isra_param_descriptors (dump_file, node->decl, ifs); + if (bb_dereferences) + { + free (bb_dereferences); + bb_dereferences = NULL; + BITMAP_FREE (final_bbs); + final_bbs = NULL; + } + } + isra_analyze_all_outgoing_calls (node); + + delete decl2desc; + decl2desc = NULL; + if (dump_file) + fprintf (dump_file, "\n\n"); + return; +} + +/* Intraprocedural part of IPA-SRA analysis. Scan bodies of all functions in + this compilation unit and create summary structures describing IPA-SRA + opportunities and constraints in them. 
*/ + +static void +ipa_sra_generate_summary (void) +{ + struct cgraph_node *node; + + gcc_checking_assert (!func_sums); + gcc_checking_assert (!call_sums); + func_sums + = (new (ggc_cleared_alloc <ipa_sra_function_summaries> ()) + ipa_sra_function_summaries (symtab, true)); + call_sums = new ipa_sra_call_summaries (symtab); + + FOR_EACH_FUNCTION_WITH_GIMPLE_BODY (node) + ipa_sra_summarize_function (node); + return; +} + +/* Write intraprocedural analysis information about edge E into a stream for + LTO WPA. */ + +static void +isra_write_edge_summary (output_block *ob, cgraph_edge *e) +{ + isra_call_summary *csum = call_sums->get (e); + unsigned input_count = csum->m_inputs.length (); + streamer_write_uhwi (ob, input_count); + for (unsigned i = 0; i < input_count; i++) + { + isra_param_flow *ipf = &csum->m_inputs[i]; + streamer_write_hwi (ob, ipf->length); + bitpack_d bp = bitpack_create (ob->main_stream); + for (int j = 0; j < ipf->length; j++) + bp_pack_value (&bp, ipf->inputs[j], 8); + bp_pack_value (&bp, ipf->aggregate_pass_through, 1); + bp_pack_value (&bp, ipf->pointer_pass_through, 1); + bp_pack_value (&bp, ipf->safe_to_import_accesses, 1); + streamer_write_bitpack (&bp); + streamer_write_uhwi (ob, ipf->param_number); + streamer_write_uhwi (ob, ipf->unit_offset); + streamer_write_uhwi (ob, ipf->unit_size); + } + bitpack_d bp = bitpack_create (ob->main_stream); + bp_pack_value (&bp, csum->m_return_ignored, 1); + bp_pack_value (&bp, csum->m_return_returned, 1); + bp_pack_value (&bp, csum->m_bit_aligned_arg, 1); + streamer_write_bitpack (&bp); +} + +/* Write intraprocedural analysis information about NODE and all of its outgoing + edges into a stream for LTO WPA. 
*/ + +static void +isra_write_node_summary (output_block *ob, cgraph_node *node) +{ + isra_func_summary *ifs = func_sums->get (node); + lto_symtab_encoder_t encoder = ob->decl_state->symtab_node_encoder; + int node_ref = lto_symtab_encoder_encode (encoder, node); + streamer_write_uhwi (ob, node_ref); + + unsigned param_desc_count = vec_safe_length (ifs->m_parameters); + streamer_write_uhwi (ob, param_desc_count); + for (unsigned i = 0; i < param_desc_count; i++) + { + isra_param_desc *desc = &(*ifs->m_parameters)[i]; + unsigned access_count = vec_safe_length (desc->m_accesses); + streamer_write_uhwi (ob, access_count); + for (unsigned j = 0; j < access_count; j++) + { + param_access *acc = (*desc->m_accesses)[j]; + stream_write_tree (ob, acc->type, true); + stream_write_tree (ob, acc->alias_ptr_type, true); + streamer_write_uhwi (ob, acc->unit_offset); + streamer_write_uhwi (ob, acc->unit_size); + bitpack_d bp = bitpack_create (ob->main_stream); + bp_pack_value (&bp, acc->definitive, 1); + bp_pack_value (&bp, acc->check_overlaps, 1); + streamer_write_bitpack (&bp); + } + streamer_write_hwi (ob, desc->m_call_uses); + gcc_assert (desc->m_scc_uses == 0); + streamer_write_uhwi (ob, desc->ptr_pt_count); + streamer_write_uhwi (ob, desc->m_param_size_limit); + streamer_write_uhwi (ob, desc->m_size_reached); + bitpack_d bp = bitpack_create (ob->main_stream); + bp_pack_value (&bp, desc->m_locally_unused, 1); + bp_pack_value (&bp, desc->m_split_candidate, 1); + bp_pack_value (&bp, desc->m_by_ref, 1); + streamer_write_bitpack (&bp); + } + bitpack_d bp = bitpack_create (ob->main_stream); + bp_pack_value (&bp, ifs->m_candidate, 1); + bp_pack_value (&bp, ifs->m_returns_value, 1); + bp_pack_value (&bp, ifs->m_return_ignored, 1); + gcc_assert (!ifs->m_queued); + streamer_write_bitpack (&bp); + + for (cgraph_edge *e = node->callees; e; e = e->next_callee) + isra_write_edge_summary (ob, e); + for (cgraph_edge *e = node->indirect_calls; e; e = e->next_callee) + 
isra_write_edge_summary (ob, e); +} + +/* Write intraprocedural analysis information into a stream for LTO WPA. */ + +static void +ipa_sra_write_summary (void) +{ + if (!func_sums || !call_sums) + return; + + struct output_block *ob = create_output_block (LTO_section_ipa_sra); + lto_symtab_encoder_t encoder = ob->decl_state->symtab_node_encoder; + ob->symbol = NULL; + + unsigned int count = 0; + lto_symtab_encoder_iterator lsei; + for (lsei = lsei_start_function_in_partition (encoder); + !lsei_end_p (lsei); + lsei_next_function_in_partition (&lsei)) + { + cgraph_node *node = lsei_cgraph_node (lsei); + if (node->has_gimple_body_p () + && func_sums->get (node) != NULL) + count++; + } + streamer_write_uhwi (ob, count); + + /* Process all of the functions. */ + for (lsei = lsei_start_function_in_partition (encoder); !lsei_end_p (lsei); + lsei_next_function_in_partition (&lsei)) + { + cgraph_node *node = lsei_cgraph_node (lsei); + if (node->has_gimple_body_p () + && func_sums->get (node) != NULL) + isra_write_node_summary (ob, node); + } + streamer_write_char_stream (ob->main_stream, 0); + produce_asm (ob, NULL); + destroy_output_block (ob); +} + +/* Read intraprocedural analysis information about edge E from a stream for + LTO WPA. 
*/ + +static void +isra_read_edge_summary (struct lto_input_block *ib, cgraph_edge *cs) +{ + isra_call_summary *csum = call_sums->get_create (cs); + unsigned input_count = streamer_read_uhwi (ib); + csum->init_inputs (input_count); + for (unsigned i = 0; i < input_count; i++) + { + isra_param_flow *ipf = &csum->m_inputs[i]; + ipf->length = streamer_read_hwi (ib); + bitpack_d bp = streamer_read_bitpack (ib); + for (int j = 0; j < ipf->length; j++) + ipf->inputs[j] = bp_unpack_value (&bp, 8); + ipf->aggregate_pass_through = bp_unpack_value (&bp, 1); + ipf->pointer_pass_through = bp_unpack_value (&bp, 1); + ipf->safe_to_import_accesses = bp_unpack_value (&bp, 1); + ipf->param_number = streamer_read_uhwi (ib); + ipf->unit_offset = streamer_read_uhwi (ib); + ipf->unit_size = streamer_read_uhwi (ib); + } + bitpack_d bp = streamer_read_bitpack (ib); + csum->m_return_ignored = bp_unpack_value (&bp, 1); + csum->m_return_returned = bp_unpack_value (&bp, 1); + csum->m_bit_aligned_arg = bp_unpack_value (&bp, 1); +} + +/* Read intraprocedural analysis information about NODE and all of its outgoing + edges from a stream for LTO WPA. 
+   */
+
+static void
+isra_read_node_info (struct lto_input_block *ib, cgraph_node *node,
+                     struct data_in *data_in)
+{
+  isra_func_summary *ifs = func_sums->get_create (node);
+  unsigned param_desc_count = streamer_read_uhwi (ib);
+  if (param_desc_count > 0)
+    {
+      vec_safe_reserve_exact (ifs->m_parameters, param_desc_count);
+      ifs->m_parameters->quick_grow_cleared (param_desc_count);
+    }
+  for (unsigned i = 0; i < param_desc_count; i++)
+    {
+      isra_param_desc *desc = &(*ifs->m_parameters)[i];
+      unsigned access_count = streamer_read_uhwi (ib);
+      for (unsigned j = 0; j < access_count; j++)
+        {
+          param_access *acc = ggc_cleared_alloc<param_access> ();
+          acc->type = stream_read_tree (ib, data_in);
+          acc->alias_ptr_type = stream_read_tree (ib, data_in);
+          acc->unit_offset = streamer_read_uhwi (ib);
+          acc->unit_size = streamer_read_uhwi (ib);
+          bitpack_d bp = streamer_read_bitpack (ib);
+          acc->definitive = bp_unpack_value (&bp, 1);
+          acc->check_overlaps = bp_unpack_value (&bp, 1);
+          vec_safe_push (desc->m_accesses, acc);
+        }
+      desc->m_call_uses = streamer_read_hwi (ib);
+      desc->m_scc_uses = 0;
+      desc->ptr_pt_count = streamer_read_uhwi (ib);
+      desc->m_param_size_limit = streamer_read_uhwi (ib);
+      desc->m_size_reached = streamer_read_uhwi (ib);
+      bitpack_d bp = streamer_read_bitpack (ib);
+      desc->m_locally_unused = bp_unpack_value (&bp, 1);
+      desc->m_split_candidate = bp_unpack_value (&bp, 1);
+      desc->m_by_ref = bp_unpack_value (&bp, 1);
+    }
+  bitpack_d bp = streamer_read_bitpack (ib);
+  ifs->m_candidate = bp_unpack_value (&bp, 1);
+  ifs->m_returns_value = bp_unpack_value (&bp, 1);
+  ifs->m_return_ignored = bp_unpack_value (&bp, 1);
+  ifs->m_queued = 0;
+
+  for (cgraph_edge *e = node->callees; e; e = e->next_callee)
+    isra_read_edge_summary (ib, e);
+  for (cgraph_edge *e = node->indirect_calls; e; e = e->next_callee)
+    isra_read_edge_summary (ib, e);
+}
+
+/* Read IPA-SRA summaries from a section in file FILE_DATA of length LEN with
+   data DATA.
+   TODO: This function was copied almost verbatim from ipa-prop.c, that
+   cannot be right.  */
+
+static void
+isra_read_summary_section (struct lto_file_decl_data *file_data,
+                           const char *data, size_t len)
+{
+  const struct lto_function_header *header
+    = (const struct lto_function_header *) data;
+  const int cfg_offset = sizeof (struct lto_function_header);
+  const int main_offset = cfg_offset + header->cfg_size;
+  const int string_offset = main_offset + header->main_size;
+  struct data_in *data_in;
+  unsigned int i;
+  unsigned int count;
+
+  lto_input_block ib_main ((const char *) data + main_offset,
+                           header->main_size, file_data->mode_table);
+
+  data_in
+    = lto_data_in_create (file_data, (const char *) data + string_offset,
+                          header->string_size, vNULL);
+  count = streamer_read_uhwi (&ib_main);
+
+  for (i = 0; i < count; i++)
+    {
+      unsigned int index;
+      struct cgraph_node *node;
+      lto_symtab_encoder_t encoder;
+
+      index = streamer_read_uhwi (&ib_main);
+      encoder = file_data->symtab_node_encoder;
+      node = dyn_cast<cgraph_node *> (lto_symtab_encoder_deref (encoder,
+                                                                index));
+      gcc_assert (node->definition);
+      isra_read_node_info (&ib_main, node, data_in);
+    }
+  lto_free_section_data (file_data, LTO_section_ipa_sra, NULL, data,
+                         len);
+  lto_data_in_delete (data_in);
+}
+
+/* Read intraprocedural analysis information from a stream for LTO WPA.
+   */
+
+static void
+ipa_sra_read_summary (void)
+{
+  struct lto_file_decl_data **file_data_vec = lto_get_file_decl_data ();
+  struct lto_file_decl_data *file_data;
+  unsigned int j = 0;
+
+  gcc_checking_assert (!func_sums);
+  gcc_checking_assert (!call_sums);
+  func_sums
+    = (new (ggc_cleared_alloc <ipa_sra_function_summaries> ())
+       ipa_sra_function_summaries (symtab, true));
+  call_sums = new ipa_sra_call_summaries (symtab);
+
+  while ((file_data = file_data_vec[j++]))
+    {
+      size_t len;
+      const char *data = lto_get_section_data (file_data, LTO_section_ipa_sra,
+                                               NULL, &len);
+      if (data)
+        isra_read_summary_section (file_data, data, len);
+    }
+}
+
+/* Dump all IPA-SRA summary data for all cgraph nodes and edges to file F.  */
+
+static void
+ipa_sra_dump_all_summaries (FILE *f)
+{
+  cgraph_node *node;
+  FOR_EACH_FUNCTION_WITH_GIMPLE_BODY (node)
+    {
+      fprintf (f, "\nSummary for node %s:\n", node->dump_name ());
+
+      isra_func_summary *ifs = func_sums->get (node);
+      if (!ifs)
+        {
+          fprintf (f, "  Function does not have any associated IPA-SRA "
+                   "summary\n");
+          continue;
+        }
+      if (!ifs->m_candidate)
+        {
+          fprintf (f, "  Not a candidate function\n");
+          continue;
+        }
+      if (ifs->m_returns_value)
+        fprintf (f, "  Returns value\n");
+      if (vec_safe_is_empty (ifs->m_parameters))
+        fprintf (f, "  No parameter information.\n");
+      else
+        for (unsigned i = 0; i < ifs->m_parameters->length (); ++i)
+          {
+            fprintf (f, "  Descriptor for parameter %i:\n", i);
+            dump_isra_param_descriptor (f, &(*ifs->m_parameters)[i]);
+          }
+      fprintf (f, "\n");
+
+      struct cgraph_edge *cs;
+      for (cs = node->callees; cs; cs = cs->next_callee)
+        {
+          fprintf (f, "  Summary for edge %s->%s:\n",
+                   cs->caller->dump_name (), cs->callee->dump_name ());
+          isra_call_summary *csum = call_sums->get (cs);
+          if (csum)
+            csum->dump (f);
+          else
+            fprintf (f, "    Call summary is MISSING!\n");
+        }
+
+    }
+  fprintf (f, "\n\n");
+}
+
+/* Perform function-scope viability tests that can only be made at the IPA
+   level and return false if the function is deemed unsuitable for
+   IPA-SRA.  */
+
+static bool
+ipa_sra_ipa_function_checks (cgraph_node *node)
+{
+  if (!node->can_be_local_p ())
+    {
+      if (dump_file)
+        fprintf (dump_file, "Function %s disqualified because it cannot be "
+                 "made local.\n", node->dump_name ());
+      return false;
+    }
+  if (!node->local.can_change_signature)
+    {
+      if (dump_file)
+        fprintf (dump_file, "Function cannot change signature.\n");
+      return false;
+    }
+
+  return true;
+}
+
+/* Issues found by check_for_caller_issues.  */
+
+struct caller_issues
+{
+  /* There is a thunk among callers.  */
+  bool thunk;
+  /* Call site with no available information.  */
+  bool unknown_callsite;
+  /* There is a bit-aligned argument at one of the call sites.  */
+  bool bit_aligned_argument;
+};
+
+/* Worker for call_for_symbol_and_aliases, set any flags of passed
+   caller_issues that apply.  */
+
+static bool
+check_for_caller_issues (struct cgraph_node *node, void *data)
+{
+  struct caller_issues *issues = (struct caller_issues *) data;
+
+  for (cgraph_edge *cs = node->callers; cs; cs = cs->next_caller)
+    {
+      if (cs->caller->thunk.thunk_p)
+        {
+          issues->thunk = true;
+          /* TODO: We should be able to process at least some types of
+             thunks.
+             */
+          return true;
+        }
+
+      isra_call_summary *csum = call_sums->get (cs);
+      if (!csum)
+        {
+          issues->unknown_callsite = true;
+          return true;
+        }
+
+      if (csum->m_bit_aligned_arg)
+        issues->bit_aligned_argument = true;
+    }
+  return false;
+}
+
+/* Look at all incoming edges to NODE, including aliases and thunks, and look
+   for problems.  Return true if NODE's type should not be modified at
+   all.  */
+
+static bool
+check_all_callers_for_issues (cgraph_node *node)
+{
+  struct caller_issues issues;
+  memset (&issues, 0, sizeof (issues));
+
+  node->call_for_symbol_and_aliases (check_for_caller_issues, &issues, true);
+  if (issues.unknown_callsite)
+    {
+      if (dump_file && (dump_flags & TDF_DETAILS))
+        fprintf (dump_file, "A call of %s has not been analyzed.  Disabling "
+                 "all modifications.\n", node->dump_name ());
+      return true;
+    }
+  /* TODO: We should be able to process at least some types of thunks.  */
+  if (issues.thunk)
+    {
+      if (dump_file && (dump_flags & TDF_DETAILS))
+        fprintf (dump_file, "A call of %s is through a thunk, which is not"
+                 " handled yet.  Disabling all modifications.\n",
+                 node->dump_name ());
+      return true;
+    }
+
+  if (issues.bit_aligned_argument)
+    {
+      /* Let's only remove parameters from such functions.  TODO: We could
+         only prevent splitting the problematic parameters if anybody thinks
+         it is worth it.  */
+      if (dump_file && (dump_flags & TDF_DETAILS))
+        fprintf (dump_file, "A call of %s has a bit-aligned aggregate "
+                 "argument, disabling parameter splitting.\n",
+                 node->dump_name ());
+
+      isra_func_summary *ifs = func_sums->get (node);
+      gcc_checking_assert (ifs);
+      unsigned param_count = vec_safe_length (ifs->m_parameters);
+      for (unsigned i = 0; i < param_count; i++)
+        (*ifs->m_parameters)[i].m_split_candidate = false;
+    }
+  return false;
+}
+
+/* Count the number of times formal parameters feed into an actual argument
+   of a call within the same SCC.
+   */
+
+static void
+count_param_scc_uses (cgraph_edge *cs)
+{
+  isra_func_summary *from_ifs = func_sums->get (cs->caller);
+  gcc_checking_assert (from_ifs);
+  if (!from_ifs->m_parameters)
+    return;
+  isra_call_summary *csum = call_sums->get (cs);
+  gcc_checking_assert (csum);
+  unsigned args_count = csum->m_inputs.length ();
+
+  enum availability availability;
+  cgraph_node *callee = cs->callee->function_symbol (&availability);
+  isra_func_summary *to_ifs = func_sums->get (callee);
+  if (!to_ifs || !to_ifs->m_candidate
+      || vec_safe_is_empty (to_ifs->m_parameters))
+    return;
+
+  for (unsigned i = 0; i < args_count; i++)
+    {
+      isra_param_flow *ipf = &csum->m_inputs[i];
+      for (int j = 0; j < ipf->length; j++)
+        (*from_ifs->m_parameters)[ipf->inputs[j]].m_scc_uses++;
+
+      if (ipf->aggregate_pass_through)
+        (*from_ifs->m_parameters)[ipf->param_number].m_scc_uses++;
+    }
+}
+
+/* Find the access with corresponding OFFSET and SIZE among accesses in
+   PARAM_DESC and return it or NULL if such an access is not there.  */
+
+static param_access *
+find_param_access (isra_param_desc *param_desc, unsigned offset,
+                   unsigned size)
+{
+  unsigned pclen = vec_safe_length (param_desc->m_accesses);
+
+  for (unsigned i = 0; i < pclen; i++)
+    if ((*param_desc->m_accesses)[i]->unit_offset == offset
+        && (*param_desc->m_accesses)[i]->unit_size == size)
+      return (*param_desc->m_accesses)[i];
+
+  return NULL;
+}
+
+/* Return true iff the total size SIZE of definite replacements would violate
+   the limit set for it in DESC.  */
+
+static bool
+size_would_violate_limit_p (isra_param_desc *desc, unsigned size)
+{
+  unsigned limit = desc->m_param_size_limit;
+  if (size > limit
+      || (!desc->m_by_ref && size == limit))
+    return true;
+  return false;
+}
+
+/* Increase reached size of DESC by SIZE or disqualify it if it would violate
+   the set limit.
+   */
+
+static void
+bump_reached_size (isra_param_desc *desc, unsigned size)
+{
+  unsigned after = desc->m_size_reached + size;
+  if (size_would_violate_limit_p (desc, after))
+    {
+      if (dump_file && (dump_flags & TDF_DETAILS))
+        fprintf (dump_file, "    ...size limit reached, disqualifying "
+                 "candidate\n");
+      desc->m_split_candidate = false;
+      return;
+    }
+  desc->m_size_reached = after;
+}
+
+/* Take all actions required to deal with indirect call edge CS, for both
+   parameter removal and splitting.  */
+
+static void
+process_indirect_edge (cgraph_edge *cs)
+{
+  isra_func_summary *from_ifs = func_sums->get (cs->caller);
+  gcc_checking_assert (from_ifs);
+  isra_call_summary *csum = call_sums->get (cs);
+  gcc_checking_assert (csum);
+  unsigned args_count = csum->m_inputs.length ();
+
+  for (unsigned i = 0; i < args_count; i++)
+    {
+      isra_param_flow *ipf = &csum->m_inputs[i];
+      for (int j = 0; j < ipf->length; j++)
+        {
+          int input_idx = ipf->inputs[j];
+          (*from_ifs->m_parameters)[input_idx].m_locally_unused = false;
+          (*from_ifs->m_parameters)[input_idx].m_split_candidate = false;
+        }
+
+      if (ipf->pointer_pass_through)
+        {
+          isra_param_desc *param_desc
+            = &(*from_ifs->m_parameters)[ipf->param_number];
+          param_desc->m_split_candidate = false;
+        }
+      if (ipf->aggregate_pass_through)
+        {
+          isra_param_desc *param_desc
+            = &(*from_ifs->m_parameters)[ipf->param_number];
+
+          param_desc->m_locally_unused = false;
+          if (!param_desc->m_split_candidate)
+            continue;
+          gcc_assert (!param_desc->m_by_ref);
+          param_access *pacc = find_param_access (param_desc,
+                                                  ipf->unit_offset,
+                                                  ipf->unit_size);
+          gcc_checking_assert (pacc);
+          bump_reached_size (param_desc, pacc->unit_size);
+          pacc->definitive = true;
+          ipf->aggregate_pass_through = false;
+        }
+    }
+}
+
+/* Propagate parameter removal information through cross-SCC edge CS,
+   i.e. decrease the use count in the caller parameter descriptor for each
+   use in this call.
+   */
+
+static void
+param_removal_cross_scc_edge (cgraph_edge *cs)
+{
+  enum availability availability;
+  cgraph_node *callee = cs->callee->function_symbol (&availability);
+  isra_func_summary *to_ifs = func_sums->get (callee);
+  if (!to_ifs || !to_ifs->m_candidate
+      || vec_safe_is_empty (to_ifs->m_parameters))
+    return;
+  isra_func_summary *from_ifs = func_sums->get (cs->caller);
+  gcc_checking_assert (from_ifs);
+
+  isra_call_summary *csum = call_sums->get (cs);
+  gcc_checking_assert (csum);
+  unsigned args_count = csum->m_inputs.length ();
+  unsigned param_count = vec_safe_length (to_ifs->m_parameters);
+
+  for (unsigned i = 0; (i < args_count) && (i < param_count); i++)
+    {
+      isra_param_desc *dest_desc = &(*to_ifs->m_parameters)[i];
+      if (dest_desc->m_locally_unused
+          && (dest_desc->m_call_uses == dest_desc->m_scc_uses))
+        {
+          isra_param_flow *ipf = &csum->m_inputs[i];
+          for (int j = 0; j < ipf->length; j++)
+            {
+              int input_idx = ipf->inputs[j];
+              if ((*from_ifs->m_parameters)[input_idx].m_locally_unused)
+                (*from_ifs->m_parameters)[input_idx].m_call_uses--;
+            }
+
+          if (ipf->aggregate_pass_through
+              && (*from_ifs->m_parameters)[ipf->param_number]
+                   .m_locally_unused)
+            (*from_ifs->m_parameters)[ipf->param_number].m_call_uses--;
+        }
+    }
+}
+
+/* Unless it is already there, push NODE which is also described by IFS to
+   STACK.  */
+
+static void
+isra_push_node_to_stack (cgraph_node *node, isra_func_summary *ifs,
+                         vec<cgraph_node *> *stack)
+{
+  if (!ifs->m_queued)
+    {
+      ifs->m_queued = true;
+      stack->safe_push (node);
+    }
+}
+
+/* If parameter with index INPUT_IDX is marked as locally unused, mark it as
+   used and push CALLER on STACK.
+   */
+
+static void
+isra_mark_caller_param_used (isra_func_summary *from_ifs, int input_idx,
+                             cgraph_node *caller, vec<cgraph_node *> *stack)
+{
+  if ((*from_ifs->m_parameters)[input_idx].m_locally_unused)
+    {
+      (*from_ifs->m_parameters)[input_idx].m_locally_unused = false;
+      isra_push_node_to_stack (caller, from_ifs, stack);
+    }
+}
+
+
+/* Propagate information that any parameter is not used only locally within
+   an SCC across CS to the caller, which must be in the same SCC as the
+   callee.  Push any callers that need to be re-processed to STACK.  */
+
+static void
+propagate_nonlocal_across_edge (cgraph_edge *cs, vec<cgraph_node *> *stack)
+{
+  isra_func_summary *from_ifs = func_sums->get (cs->caller);
+  if (!from_ifs || vec_safe_is_empty (from_ifs->m_parameters))
+    return;
+
+  isra_call_summary *csum = call_sums->get (cs);
+  gcc_checking_assert (csum);
+  unsigned args_count = csum->m_inputs.length ();
+  enum availability availability;
+  cgraph_node *callee = cs->callee->function_symbol (&availability);
+  isra_func_summary *to_ifs = func_sums->get (callee);
+
+  unsigned param_count
+    = to_ifs ? vec_safe_length (to_ifs->m_parameters) : 0;
+  for (unsigned i = 0; i < args_count; i++)
+    {
+      if (i < param_count)
+        {
+          isra_param_desc *dest_desc = &(*to_ifs->m_parameters)[i];
+
+          if (dest_desc->m_locally_unused)
+            {
+              int d = dest_desc->m_call_uses - dest_desc->m_scc_uses;
+              gcc_assert (d >= 0);
+              if (d == 0)
+                /* The number of uses matches exactly the number of times
+                   this parameter is passed to a function within the SCC, so
+                   far so good.  */
+                continue;
+            }
+        }
+
+      /* The argument is passed to a function which needs it (or there is a
+         weird parameter-argument count mismatch), we must mark the parameter
+         as used also in callers within this SCC.
+         */
+      isra_param_flow *ipf = &csum->m_inputs[i];
+      for (int j = 0; j < ipf->length; j++)
+        {
+          int input_idx = ipf->inputs[j];
+          isra_mark_caller_param_used (from_ifs, input_idx, cs->caller,
+                                       stack);
+        }
+      if (ipf->aggregate_pass_through
+          && (*from_ifs->m_parameters)[ipf->param_number].m_locally_unused)
+        isra_mark_caller_param_used (from_ifs, ipf->param_number,
+                                     cs->caller, stack);
+    }
+}
+
+/* Propagate information that any parameter is not used only locally within
+   an SCC to all callers of NODE that are in the same SCC.  Push any callers
+   that need to be re-processed to STACK.  */
+
+static bool
+propagate_nonarg_to_css_callers (cgraph_node *node, void *data)
+{
+  vec<cgraph_node *> *stack = (vec<cgraph_node *> *) data;
+  cgraph_edge *cs;
+  for (cs = node->callers; cs; cs = cs->next_caller)
+    if (ipa_edge_within_scc (cs))
+      propagate_nonlocal_across_edge (cs, stack);
+  return false;
+}
+
+/* Return true iff all definitive accesses in ARG_DESC are also present as
+   definitive accesses in PARAM_DESC.  */
+
+static bool
+all_callee_accesses_present_p (isra_param_desc *param_desc,
+                               isra_param_desc *arg_desc)
+{
+  unsigned aclen = vec_safe_length (arg_desc->m_accesses);
+  for (unsigned j = 0; j < aclen; j++)
+    {
+      param_access *argacc = (*arg_desc->m_accesses)[j];
+      if (!argacc->definitive)
+        continue;
+      param_access *pacc = find_param_access (param_desc,
+                                              argacc->unit_offset,
+                                              argacc->unit_size);
+      if (!pacc || !pacc->definitive)
+        return false;
+    }
+  return true;
+}
+
+/* Type internal to function pull_accesses_from_callee.  Unfortunately
+   gcc 4.8 does not allow instantiating an auto_vec with a type defined
+   within a function.  */
+enum acc_prop_kind {ACC_PROP_DONT, ACC_PROP_COPY, ACC_PROP_DEFINITIVE};
+
+
+/* Attempt to propagate all definite accesses from ARG_DESC to PARAM_DESC,
+   if they would not violate some constraint there.  If successful, return
+   NULL, otherwise return the string reason for failure (which can be
+   written to the dump file).
+   In case of success, set *CHANGE_P to true if propagation actually changed
+   anything.  */
+
+static const char *
+pull_accesses_from_callee (isra_param_desc *param_desc,
+                           isra_param_desc *arg_desc,
+                           unsigned delta_offset, unsigned arg_size,
+                           bool *change_p)
+{
+  unsigned pclen = vec_safe_length (param_desc->m_accesses);
+  unsigned aclen = vec_safe_length (arg_desc->m_accesses);
+  unsigned prop_count = 0;
+  unsigned prop_size = 0;
+  bool change = false;
+
+  auto_vec <enum acc_prop_kind> prop_kinds (aclen);
+  for (unsigned j = 0; j < aclen; j++)
+    {
+      param_access *argacc = (*arg_desc->m_accesses)[j];
+      prop_kinds.safe_push (ACC_PROP_DONT);
+
+      if (arg_size > 0
+          && argacc->unit_offset + argacc->unit_size > arg_size)
+        return "callee access outside size boundary";
+
+      if (!argacc->definitive)
+        continue;
+
+      unsigned offset = argacc->unit_offset + delta_offset;
+      /* Given that accesses are initially stored according to increasing
+         offset and decreasing size in case of equal offsets, the following
+         searches could be written more efficiently (if we kept the ordering
+         when copying).  But the number of accesses is capped at
+         PARAM_IPA_SRA_MAX_REPLACEMENTS (so most likely 8) and the code gets
+         messy quickly, so let's improve on that only if necessary and above
+         all incrementally.  */
+
+      bool exact_match = false;
+      for (unsigned i = 0; i < pclen; i++)
+        {
+          /* Check for overlaps.
+             */
+          param_access *pacc = (*param_desc->m_accesses)[i];
+          if (pacc->unit_offset == offset
+              && pacc->unit_size == argacc->unit_size)
+            {
+              if (argacc->alias_ptr_type != pacc->alias_ptr_type
+                  || !types_compatible_p (argacc->type, pacc->type))
+                return "propagated access types would not match existing "
+                       "ones";
+
+              exact_match = true;
+              if (!pacc->definitive)
+                {
+                  prop_kinds[j] = ACC_PROP_DEFINITIVE;
+                  prop_size += argacc->unit_size;
+                  change = true;
+                }
+              break;
+            }
+
+          if (offset < pacc->unit_offset + pacc->unit_size
+              && offset + argacc->unit_size > pacc->unit_offset)
+            {
+              /* None permissible with load or store accesses, possible to
+                 fit into argument ones.  */
+              if (pacc->definitive
+                  || offset < pacc->unit_offset
+                  || (offset + argacc->unit_size
+                      > pacc->unit_offset + pacc->unit_size))
+                return "a propagated access would conflict in caller";
+            }
+        }
+
+      if (!exact_match)
+        {
+          prop_kinds[j] = ACC_PROP_COPY;
+          prop_count++;
+          prop_size += argacc->unit_size;
+          change = true;
+        }
+    }
+
+  if (!change)
+    return NULL;
+
+  if ((prop_count + pclen
+       > (unsigned) PARAM_VALUE (PARAM_IPA_SRA_MAX_REPLACEMENTS))
+      || size_would_violate_limit_p (param_desc,
+                                     param_desc->m_size_reached + prop_size))
+    return "propagating accesses would violate the count or size limit";
+
+  *change_p = true;
+  for (unsigned j = 0; j < aclen; j++)
+    {
+      if (prop_kinds[j] == ACC_PROP_COPY)
+        {
+          param_access *argacc = (*arg_desc->m_accesses)[j];
+
+          param_access *copy = ggc_cleared_alloc<param_access> ();
+          copy->unit_offset = argacc->unit_offset + delta_offset;
+          copy->unit_size = argacc->unit_size;
+          copy->type = argacc->type;
+          copy->alias_ptr_type = argacc->alias_ptr_type;
+          copy->definitive = true;
+          vec_safe_push (param_desc->m_accesses, copy);
+        }
+      else if (prop_kinds[j] == ACC_PROP_DEFINITIVE)
+        {
+          param_access *argacc = (*arg_desc->m_accesses)[j];
+          param_access *csp
+            = find_param_access (param_desc,
+                                 argacc->unit_offset + delta_offset,
+                                 argacc->unit_size);
+          csp->definitive = true;
+        }
+    }
+
+  param_desc->m_size_reached += prop_size;
+
+  return NULL;
+}
+
+/* Propagate parameter splitting information through call graph edge CS.
+   Return true if any changes that might need to be propagated within SCCs
+   have been made.  */
+
+static bool
+param_splitting_across_edge (cgraph_edge *cs)
+{
+  bool res = false;
+  bool cross_scc = !ipa_edge_within_scc (cs);
+  enum availability availability;
+  cgraph_node *callee = cs->callee->function_symbol (&availability);
+  isra_func_summary *from_ifs = func_sums->get (cs->caller);
+  gcc_checking_assert (from_ifs && from_ifs->m_parameters);
+
+  isra_call_summary *csum = call_sums->get (cs);
+  gcc_checking_assert (csum);
+  unsigned args_count = csum->m_inputs.length ();
+  isra_func_summary *to_ifs = func_sums->get (callee);
+  unsigned param_count
+    = ((to_ifs && to_ifs->m_candidate)
+       ? vec_safe_length (to_ifs->m_parameters)
+       : 0);
+
+  if (dump_file && (dump_flags & TDF_DETAILS))
+    fprintf (dump_file, "Splitting across %s->%s:\n",
+             cs->caller->dump_name (), callee->dump_name ());
+
+  unsigned i;
+  for (i = 0; (i < args_count) && (i < param_count); i++)
+    {
+      isra_param_desc *arg_desc = &(*to_ifs->m_parameters)[i];
+      isra_param_flow *ipf = &csum->m_inputs[i];
+
+      if (arg_desc->m_locally_unused && !arg_desc->m_call_uses)
+        {
+          if (dump_file && (dump_flags & TDF_DETAILS))
+            fprintf (dump_file, "  ->%u: unused in callee\n", i);
+          ipf->pointer_pass_through = false;
+          continue;
+        }
+
+      if (ipf->pointer_pass_through)
+        {
+          int idx = ipf->param_number;
+          isra_param_desc *param_desc = &(*from_ifs->m_parameters)[idx];
+          if (!param_desc->m_split_candidate)
+            continue;
+          gcc_assert (param_desc->m_by_ref);
+
+          if (!arg_desc->m_split_candidate || !arg_desc->m_by_ref)
+            {
+              if (dump_file && (dump_flags & TDF_DETAILS))
+                fprintf (dump_file, "  %u->%u: not candidate or not by "
+                         "reference in callee\n", idx, i);
+              param_desc->m_split_candidate = false;
+              ipf->pointer_pass_through = false;
+              res = true;
+            }
+          else if
+            (!ipf->safe_to_import_accesses)
+            {
+              if (!all_callee_accesses_present_p (param_desc, arg_desc))
+                {
+                  if (dump_file && (dump_flags & TDF_DETAILS))
+                    fprintf (dump_file, "  %u->%u: cannot import "
+                             "accesses.\n", idx, i);
+                  param_desc->m_split_candidate = false;
+                  ipf->pointer_pass_through = false;
+                  res = true;
+
+                }
+              else
+                {
+                  if (dump_file && (dump_flags & TDF_DETAILS))
+                    fprintf (dump_file, "  %u->%u: verified callee accesses "
+                             "present.\n", idx, i);
+                  if (cross_scc)
+                    ipf->pointer_pass_through = false;
+                }
+            }
+          else
+            {
+              const char *pull_failure
+                = pull_accesses_from_callee (param_desc, arg_desc, 0, 0,
+                                             &res);
+              if (pull_failure)
+                {
+                  if (dump_file && (dump_flags & TDF_DETAILS))
+                    fprintf (dump_file, "  %u->%u: by_ref access pull "
+                             "failed: %s.\n", idx, i, pull_failure);
+                  param_desc->m_split_candidate = false;
+                  ipf->pointer_pass_through = false;
+                  res = true;
+                }
+              else
+                {
+                  if (dump_file && (dump_flags & TDF_DETAILS))
+                    fprintf (dump_file, "  %u->%u: by_ref access pull "
+                             "succeeded.\n", idx, i);
+                  if (cross_scc)
+                    ipf->pointer_pass_through = false;
+                }
+            }
+        }
+      else if (ipf->aggregate_pass_through)
+        {
+          int idx = ipf->param_number;
+          isra_param_desc *param_desc = &(*from_ifs->m_parameters)[idx];
+          if (!param_desc->m_split_candidate)
+            continue;
+          gcc_assert (!param_desc->m_by_ref);
+          param_access *pacc = find_param_access (param_desc,
+                                                  ipf->unit_offset,
+                                                  ipf->unit_size);
+          gcc_checking_assert (pacc);
+
+          if (pacc->definitive)
+            {
+              if (dump_file && (dump_flags & TDF_DETAILS))
+                fprintf (dump_file, "  %u->%u: already definitive\n",
+                         idx, i);
+              ipf->aggregate_pass_through = false;
+            }
+          else if (!arg_desc->m_split_candidate || arg_desc->m_by_ref)
+            {
+              if (dump_file && (dump_flags & TDF_DETAILS))
+                fprintf (dump_file, "  %u->%u: not candidate or by "
+                         "reference in callee\n", idx, i);
+              bump_reached_size (param_desc, pacc->unit_size);
+              pacc->definitive = true;
+              ipf->aggregate_pass_through = false;
+              res = true;
+            }
+          else
+            {
+              const char
+                *pull_failure
+                = pull_accesses_from_callee (param_desc, arg_desc,
+                                             ipf->unit_offset,
+                                             ipf->unit_size, &res);
+              if (pull_failure)
+                {
+                  if (dump_file && (dump_flags & TDF_DETAILS))
+                    fprintf (dump_file, "  %u->%u: arg access pull "
+                             "failed: %s.\n", idx, i, pull_failure);
+                  bump_reached_size (param_desc, pacc->unit_size);
+                  pacc->definitive = true;
+                  res = true;
+                  ipf->aggregate_pass_through = false;
+                }
+              else
+                {
+                  if (dump_file && (dump_flags & TDF_DETAILS))
+                    fprintf (dump_file, "  %u->%u: arg access pull "
+                             "succeeded.\n", idx, i);
+                  if (cross_scc)
+                    ipf->aggregate_pass_through = false;
+                }
+            }
+        }
+    }
+
+  /* Handle argument-parameter count mismatches.  */
+  for (; (i < args_count); i++)
+    {
+      isra_param_flow *ipf = &csum->m_inputs[i];
+
+      if (ipf->pointer_pass_through || ipf->aggregate_pass_through)
+        {
+          int idx = ipf->param_number;
+          isra_param_desc *param_desc = &(*from_ifs->m_parameters)[idx];
+          if (!param_desc->m_split_candidate)
+            continue;
+
+          if (dump_file && (dump_flags & TDF_DETAILS))
+            fprintf (dump_file, "  %u->%u: no corresponding formal "
+                     "parameter\n", idx, i);
+          param_desc->m_split_candidate = false;
+          ipf->pointer_pass_through = false;
+          ipf->aggregate_pass_through = false;
+          res = true;
+        }
+    }
+  return res;
+}
+
+/* Check for any overlaps of definite param accesses among splitting
+   candidates and if any are found disqualify them and return true.
+   */
+
+static bool
+validate_splitting_overlaps (cgraph_node *node)
+{
+  bool res = false;
+  isra_func_summary *ifs = func_sums->get (node);
+  if (!ifs || !ifs->m_candidate)
+    return res;
+  if (dump_file && (dump_flags & TDF_DETAILS))
+    fprintf (dump_file, "Validating splits for %s\n", node->dump_name ());
+  unsigned param_count = vec_safe_length (ifs->m_parameters);
+
+  for (unsigned pidx = 0; pidx < param_count; pidx++)
+    {
+      isra_param_desc *desc = &(*ifs->m_parameters)[pidx];
+      if (!desc->m_split_candidate
+          || (desc->m_locally_unused
+              && desc->m_call_uses == desc->m_scc_uses))
+        continue;
+
+      bool definitive_access_present = false;
+      unsigned pclen = vec_safe_length (desc->m_accesses);
+      for (unsigned i = 0; i < pclen; i++)
+        {
+          param_access *a1 = (*desc->m_accesses)[i];
+
+          if (!a1->definitive)
+            continue;
+          definitive_access_present = true;
+          bool overlap = false;
+          for (unsigned j = i + 1; j < pclen; j++)
+            {
+              param_access *a2 = (*desc->m_accesses)[j];
+              if (a2->definitive
+                  && a1->unit_offset < a2->unit_offset + a2->unit_size
+                  && a1->unit_offset + a1->unit_size > a2->unit_offset)
+                {
+                  overlap = true;
+                  break;
+                }
+            }
+          if (overlap)
+            {
+              if (dump_file && (dump_flags & TDF_DETAILS))
+                fprintf (dump_file, "Disqualifying parameter %u of %s "
+                         "because of a late discovered overlap\n",
+                         pidx, node->dump_name ());
+              desc->m_split_candidate = false;
+              res = true;
+              /* !!! Remove after testing.  */
+              gcc_assert (a1->check_overlaps);
+              break;
+            }
+        }
+      /* !!? remove after testing?  */
+      gcc_checking_assert (definitive_access_present);
+    }
+  return res;
+}
+
+/* Worker for call_for_symbol_and_aliases.  If all of NODE's callers ignore
+   the return value, or come from the same SCC and use the return value only
+   to compute their own return value, return false, otherwise return true.
+   */
+
+static bool
+propagate_unused_ret_first_stage (cgraph_node *node, void *)
+{
+  for (cgraph_edge *cs = node->callers; cs; cs = cs->next_caller)
+    {
+      isra_call_summary *csum = call_sums->get (cs);
+      gcc_checking_assert (csum);
+      if (csum->m_return_ignored)
+        continue;
+      if (!csum->m_return_returned)
+        return true;
+
+      isra_func_summary *from_ifs = func_sums->get (cs->caller);
+      if (!from_ifs || !from_ifs->m_candidate)
+        return true;
+
+      if (!ipa_edge_within_scc (cs)
+          && !from_ifs->m_return_ignored)
+        return true;
+    }
+
+  return false;
+}
+
+/* Do final processing of results of IPA propagation regarding NODE, clone
+   it if appropriate.  */
+
+static void
+process_isra_node_results (cgraph_node *node)
+{
+  isra_func_summary *ifs = func_sums->get (node);
+  if (!ifs)
+    return;
+
+  if (dump_file)
+    {
+      fprintf (dump_file, "\nEvaluating analysis results for %s\n",
+               node->dump_name ());
+    }
+
+  unsigned param_count = vec_safe_length (ifs->m_parameters);
+  bool will_change_function = false;
+  if (ifs->m_returns_value && ifs->m_return_ignored)
+    {
+      will_change_function = true;
+      if (dump_file)
+        fprintf (dump_file, "  Will remove return value.\n");
+    }
+  else
+    for (unsigned i = 0; i < param_count; i++)
+      {
+        isra_param_desc *desc = &(*ifs->m_parameters)[i];
+        if ((desc->m_locally_unused
+             && desc->m_call_uses == desc->m_scc_uses)
+            || desc->m_split_candidate)
+          {
+            will_change_function = true;
+            break;
+          }
+      }
+  if (!will_change_function)
+    return;
+
+  /* Currently IPA-SRA is the first IPA pass creating param_adjustments.  If
+     that ever changes, we'll have to add logic to combine pre-existing
+     adjustments with the modifications IPA-SRA wishes to make, similar to
+     what is done in IPA-CP.
+     */
+  gcc_assert (!node->clone.param_adjustments);
+  vec<ipa_adjusted_param, va_gc> *new_params = NULL;
+  for (unsigned parm_num = 0; parm_num < param_count; parm_num++)
+    {
+      isra_param_desc *desc = &(*ifs->m_parameters)[parm_num];
+      if (desc->m_locally_unused
+          && desc->m_call_uses == desc->m_scc_uses)
+        {
+          if (dump_file)
+            fprintf (dump_file, "  Will remove parameter %u\n", parm_num);
+          continue;
+        }
+
+      if (!desc->m_split_candidate)
+        {
+          ipa_adjusted_param adj;
+          memset (&adj, 0, sizeof (adj));
+          adj.op = IPA_PARAM_OP_COPY;
+          adj.base_index = parm_num;
+          adj.prev_clone_index = parm_num;
+          vec_safe_push (new_params, adj);
+          continue;
+        }
+
+      if (dump_file)
+        fprintf (dump_file, "  Will split parameter %u\n", parm_num);
+      unsigned aclen = vec_safe_length (desc->m_accesses);
+      for (unsigned j = 0; j < aclen; j++)
+        {
+          param_access *pa = (*desc->m_accesses)[j];
+          if (!pa->definitive)
+            continue;
+          if (dump_file)
+            fprintf (dump_file, "    - component at byte offset %u, "
+                     "size %u\n", pa->unit_offset, pa->unit_size);
+
+          ipa_adjusted_param adj;
+          memset (&adj, 0, sizeof (adj));
+          adj.op = IPA_PARAM_OP_SPLIT;
+          adj.base_index = parm_num;
+          adj.prev_clone_index = parm_num;
+          adj.param_prefix_index = IPA_PARAM_PREFIX_ISRA;
+          adj.reverse = false; /* FIXME: Really? */
+          adj.type = pa->type;
+          adj.alias_ptr_type = pa->alias_ptr_type;
+          adj.unit_offset = pa->unit_offset;
+          vec_safe_push (new_params, adj);
+        }
+    }
+  ipa_param_adjustments *new_adjustments
+    = (new (ggc_alloc <ipa_param_adjustments> ())
+       ipa_param_adjustments (new_params, param_count,
+                              ifs->m_returns_value
+                              && ifs->m_return_ignored));
+
+  if (dump_file && (dump_flags & TDF_DETAILS))
+    {
+      fprintf (dump_file, "\n  Created adjustments:\n");
+      new_adjustments->dump (dump_file);
+    }
+
+  vec<cgraph_node *> callers = node->collect_callers ();
+  node->create_virtual_clone (callers, NULL, new_adjustments, "isra", 0);
+  callers.release ();
+}
+
+/* Run the interprocedural part of IPA-SRA.
+   */
+
+static unsigned int
+ipa_sra_analysis (void)
+{
+  if (dump_file)
+    {
+      fprintf (dump_file, "\n============ IPA SRA IPA stage ============\n");
+      ipa_sra_dump_all_summaries (dump_file);
+    }
+
+  /* !!! In LTO, this will fail, we need to stream in the summaries.  */
+  gcc_checking_assert (func_sums);
+  gcc_checking_assert (call_sums);
+
+  cgraph_node **order = XCNEWVEC (cgraph_node *, symtab->cgraph_count);
+  auto_vec<cgraph_node *> stack;
+  int node_scc_count = ipa_reduced_postorder (order, true, true, NULL);
+
+  /* One sweep from callees to callers for parameter removal and
+     splitting.  */
+  for (int i = 0; i < node_scc_count; i++)
+    {
+      cgraph_node *scc_rep = order[i];
+      vec<cgraph_node *> cycle_nodes = ipa_get_nodes_in_cycle (scc_rep);
+      unsigned j;
+
+      /* Preliminary IPA function level checks and first step of parameter
+         removal.  */
+      cgraph_node *v;
+      FOR_EACH_VEC_ELT (cycle_nodes, j, v)
+        {
+          isra_func_summary *ifs = func_sums->get (v);
+          if (!ifs)
+            continue;
+          if (!ifs->m_candidate)
+            {
+              gcc_checking_assert (vec_safe_is_empty (ifs->m_parameters));
+              continue;
+            }
+          if (!ipa_sra_ipa_function_checks (v)
+              || check_all_callers_for_issues (v))
+            {
+              ifs->zap ();
+              continue;
+            }
+
+          for (cgraph_edge *cs = v->indirect_calls; cs; cs = cs->next_callee)
+            process_indirect_edge (cs);
+          for (cgraph_edge *cs = v->callees; cs; cs = cs->next_callee)
+            if (ipa_edge_within_scc (cs))
+              count_param_scc_uses (cs);
+            else
+              param_removal_cross_scc_edge (cs);
+        }
+
+      /* Undoing optimistic assumptions for intra-SCC edges during parameter
+         removal.  */
+      FOR_EACH_VEC_ELT (cycle_nodes, j, v)
+        v->call_for_symbol_thunks_and_aliases
+          (propagate_nonarg_to_css_callers, &stack, true);
+
+      while (!stack.is_empty ())
+        {
+          cgraph_node *v = stack.pop ();
+          isra_func_summary *ifs = func_sums->get (v);
+          gcc_checking_assert (ifs && ifs->m_queued);
+          ifs->m_queued = false;
+
+          v->call_for_symbol_thunks_and_aliases
+            (propagate_nonarg_to_css_callers, &stack, true);
+        }
+
+      /* Parameter splitting.
*/
+      bool repeat_scc_propagation;
+      do
+        {
+          repeat_scc_propagation = false;
+          bool repeat_edge_propagation;
+          do
+            {
+              repeat_edge_propagation = false;
+              FOR_EACH_VEC_ELT (cycle_nodes, j, v)
+                {
+                  isra_func_summary *ifs = func_sums->get (v);
+                  if (!ifs || !ifs->m_candidate || !ifs->m_parameters)
+                    continue;
+                  for (cgraph_edge *cs = v->callees; cs; cs = cs->next_callee)
+                    if (param_splitting_across_edge (cs))
+                      repeat_edge_propagation = true;
+                }
+            }
+          while (repeat_edge_propagation);
+
+          FOR_EACH_VEC_ELT (cycle_nodes, j, v)
+            if (validate_splitting_overlaps (v))
+              repeat_scc_propagation = true;
+          if (dump_file && (dump_flags & TDF_DETAILS))
+            fprintf (dump_file, "\n");
+        }
+      while (repeat_scc_propagation);
+
+      cycle_nodes.release ();
+    }
+
+  /* One swoop from caller to callees for result removal.  */
+  for (int i = node_scc_count - 1; i >= 0; i--)
+    {
+      cgraph_node *scc_rep = order[i];
+      vec<cgraph_node *> cycle_nodes = ipa_get_nodes_in_cycle (scc_rep);
+      unsigned j;
+
+      cgraph_node *v;
+      FOR_EACH_VEC_ELT (cycle_nodes, j, v)
+        {
+          isra_func_summary *ifs = func_sums->get (v);
+          if (!ifs || !ifs->m_candidate)
+            continue;
+
+          bool return_needed
+            = v->call_for_symbol_and_aliases (propagate_unused_ret_first_stage,
+                                              NULL, true);
+          ifs->m_return_ignored = !return_needed;
+          if (return_needed)
+            isra_push_node_to_stack (v, ifs, &stack);
+        }
+
+      while (!stack.is_empty ())
+        {
+          cgraph_node *node = stack.pop ();
+          isra_func_summary *ifs = func_sums->get (node);
+          gcc_checking_assert (ifs && ifs->m_queued);
+          ifs->m_queued = false;
+
+          for (cgraph_edge *cs = node->callees; cs; cs = cs->next_callee)
+            if (ipa_edge_within_scc (cs)
+                && call_sums->get (cs)->m_return_returned)
+              {
+                enum availability av;
+                cgraph_node *callee = cs->callee->function_symbol (&av);
+                isra_func_summary *to_ifs = func_sums->get (callee);
+                if (to_ifs && to_ifs->m_return_ignored)
+                  {
+                    to_ifs->m_return_ignored = false;
+                    isra_push_node_to_stack (callee, to_ifs, &stack);
+                  }
+              }
+        }
+      cycle_nodes.release ();
+    }
+
+
if (dump_file && (dump_flags & TDF_DETAILS)) + { + fprintf (dump_file, "\n============ IPA SRA PROP RESULTS ============\n"); + ipa_sra_dump_all_summaries (dump_file); + } + + ipa_free_postorder_info (); + free (order); + + cgraph_node *node; + + FOR_EACH_FUNCTION_WITH_GIMPLE_BODY (node) + process_isra_node_results (node); + + func_sums->release (); + func_sums = NULL; + call_sums->release (); + call_sums = NULL; + + if (dump_file) + fprintf (dump_file, "\nIPA SRA IPA analysis done\n\n"); + return 0; +} + + +const pass_data pass_data_ipa_sra = +{ + IPA_PASS, /* type */ + "sra", /* name */ + OPTGROUP_NONE, /* optinfo_flags */ + TV_IPA_SRA, /* tv_id */ + 0, /* properties_required */ + 0, /* properties_provided */ + 0, /* properties_destroyed */ + 0, /* todo_flags_start */ + ( TODO_dump_symtab | TODO_remove_functions ), /* todo_flags_finish */ +}; + +class pass_ipa_sra : public ipa_opt_pass_d +{ +public: + pass_ipa_sra (gcc::context *ctxt) + : ipa_opt_pass_d (pass_data_ipa_sra, ctxt, + ipa_sra_generate_summary, /* generate_summary */ + ipa_sra_write_summary, /* write_summary */ + ipa_sra_read_summary, /* read_summary */ + NULL , /* write_optimization_summary */ + NULL, /* read_optimization_summary */ + NULL, /* stmt_fixup */ + 0, /* function_transform_todo_flags_start */ + NULL, /* function_transform */ + NULL) /* variable_transform */ + {} + + /* opt_pass methods: */ + virtual bool gate (function *) + { + /* FIXME: We should remove the optimize check after we ensure we never run + IPA passes when not optimizing. 
*/ + return (flag_ipa_sra && optimize); + } + + virtual unsigned int execute (function *) { return ipa_sra_analysis (); } + +}; // class pass_ipa_sra + +} // anon namespace + +ipa_opt_pass_d * +make_pass_ipa_sra (gcc::context *ctxt) +{ + return new pass_ipa_sra (ctxt); +} + + +#include "gt-ipa-sra.h" diff --git a/gcc/lto-cgraph.c b/gcc/lto-cgraph.c index 45138fd2f0c..9ff413ebe91 100644 --- a/gcc/lto-cgraph.c +++ b/gcc/lto-cgraph.c @@ -1808,8 +1808,7 @@ output_cgraph_opt_summary_p (struct cgraph_node *node) { return ((node->clone_of || node->former_clone_of) && (node->clone.tree_map - || node->clone.args_to_skip - || node->clone.combined_args_to_skip)); + || node->clone.param_adjustments)); } /* Output optimization summary for EDGE to OB. */ @@ -1826,42 +1825,54 @@ output_node_opt_summary (struct output_block *ob, struct cgraph_node *node, lto_symtab_encoder_t encoder) { - unsigned int index; - bitmap_iterator bi; struct ipa_replace_map *map; - struct bitpack_d bp; int i; struct cgraph_edge *e; - if (node->clone.args_to_skip) - { - streamer_write_uhwi (ob, bitmap_count_bits (node->clone.args_to_skip)); - EXECUTE_IF_SET_IN_BITMAP (node->clone.args_to_skip, 0, index, bi) - streamer_write_uhwi (ob, index); - } - else - streamer_write_uhwi (ob, 0); - if (node->clone.combined_args_to_skip) + /* TODO: Should this code be moved to ipa-param-manipulation? 
*/ + struct bitpack_d bp; + bp = bitpack_create (ob->main_stream); + bp_pack_value (&bp, (node->clone.param_adjustments != NULL), 1); + streamer_write_bitpack (&bp); + if (ipa_param_adjustments *adjustments = node->clone.param_adjustments) { - streamer_write_uhwi (ob, bitmap_count_bits (node->clone.combined_args_to_skip)); - EXECUTE_IF_SET_IN_BITMAP (node->clone.combined_args_to_skip, 0, index, bi) - streamer_write_uhwi (ob, index); + streamer_write_uhwi (ob, vec_safe_length (adjustments->m_adj_params)); + ipa_adjusted_param *adj; + FOR_EACH_VEC_SAFE_ELT (adjustments->m_adj_params, i, adj) + { + bp = bitpack_create (ob->main_stream); + bp_pack_value (&bp, adj->base_index, IPA_PARAM_MAX_INDEX_BITS); + bp_pack_value (&bp, adj->prev_clone_index, IPA_PARAM_MAX_INDEX_BITS); + bp_pack_value (&bp, adj->op, 2); + bp_pack_value (&bp, adj->param_prefix_index, 2); + bp_pack_value (&bp, adj->prev_clone_adjustment, 1); + bp_pack_value (&bp, adj->reverse, 1); + bp_pack_value (&bp, adj->by_ref, 1); + bp_pack_value (&bp, adj->user_flag, 1); + streamer_write_bitpack (&bp); + if (adj->op == IPA_PARAM_OP_SPLIT + || adj->op == IPA_PARAM_OP_NEW) + { + stream_write_tree (ob, adj->type, true); + if (adj->op == IPA_PARAM_OP_SPLIT) + { + stream_write_tree (ob, adj->alias_ptr_type, true); + streamer_write_uhwi (ob, adj->unit_offset); + } + } + } + streamer_write_hwi (ob, adjustments->m_always_copy_start); + bp = bitpack_create (ob->main_stream); + bp_pack_value (&bp, node->clone.param_adjustments->m_skip_return, 1); + streamer_write_bitpack (&bp); } - else - streamer_write_uhwi (ob, 0); + streamer_write_uhwi (ob, vec_safe_length (node->clone.tree_map)); FOR_EACH_VEC_SAFE_ELT (node->clone.tree_map, i, map) { - /* At the moment we assume all old trees to be PARM_DECLs, because we have no - mechanism to store function local declarations into summaries. 
*/
-      gcc_assert (!map->old_tree);
       streamer_write_uhwi (ob, map->parm_num);
       gcc_assert (EXPR_LOCATION (map->new_tree) == UNKNOWN_LOCATION);
       stream_write_tree (ob, map->new_tree, true);
-      bp = bitpack_create (ob->main_stream);
-      bp_pack_value (&bp, map->replace_p, 1);
-      bp_pack_value (&bp, map->ref_p, 1);
-      streamer_write_bitpack (&bp);
     }
 
   if (lto_symtab_encoder_in_partition_p (encoder, node))
@@ -1926,26 +1937,50 @@ input_node_opt_summary (struct cgraph_node *node,
 {
   int i;
   int count;
-  int bit;
-  struct bitpack_d bp;
   struct cgraph_edge *e;
 
-  count = streamer_read_uhwi (ib_main);
-  if (count)
-    node->clone.args_to_skip = BITMAP_GGC_ALLOC ();
-  for (i = 0; i < count; i++)
-    {
-      bit = streamer_read_uhwi (ib_main);
-      bitmap_set_bit (node->clone.args_to_skip, bit);
-    }
-  count = streamer_read_uhwi (ib_main);
-  if (count)
-    node->clone.combined_args_to_skip = BITMAP_GGC_ALLOC ();
-  for (i = 0; i < count; i++)
+  /* TODO: Should this code be moved to ipa-param-manipulation?  */
+  struct bitpack_d bp;
+  bp = streamer_read_bitpack (ib_main);
+  bool have_adjustments = bp_unpack_value (&bp, 1);
+  if (have_adjustments)
     {
-      bit = streamer_read_uhwi (ib_main);
-      bitmap_set_bit (node->clone.combined_args_to_skip, bit);
+      count = streamer_read_uhwi (ib_main);
+      vec<ipa_adjusted_param, va_gc> *new_params = NULL;
+      for (i = 0; i < count; i++)
+        {
+          ipa_adjusted_param adj;
+          memset (&adj, 0, sizeof (adj));
+          bp = streamer_read_bitpack (ib_main);
+          adj.base_index = bp_unpack_value (&bp, IPA_PARAM_MAX_INDEX_BITS);
+          adj.prev_clone_index
+            = bp_unpack_value (&bp, IPA_PARAM_MAX_INDEX_BITS);
+          adj.op = (enum ipa_parm_op) bp_unpack_value (&bp, 2);
+          adj.param_prefix_index = bp_unpack_value (&bp, 2);
+          adj.prev_clone_adjustment = bp_unpack_value (&bp, 1);
+          adj.reverse = bp_unpack_value (&bp, 1);
+          adj.by_ref = bp_unpack_value (&bp, 1);
+          adj.user_flag = bp_unpack_value (&bp, 1);
+          if (adj.op == IPA_PARAM_OP_SPLIT
+              || adj.op == IPA_PARAM_OP_NEW)
+            {
+              adj.type = stream_read_tree (ib_main, data_in);
+              if (adj.op ==
IPA_PARAM_OP_SPLIT)
+                {
+                  adj.alias_ptr_type = stream_read_tree (ib_main, data_in);
+                  adj.unit_offset = streamer_read_uhwi (ib_main);
+                }
+            }
+          vec_safe_push (new_params, adj);
+        }
+      int always_copy_start = streamer_read_hwi (ib_main);
+      bp = streamer_read_bitpack (ib_main);
+      bool skip_return = bp_unpack_value (&bp, 1);
+      node->clone.param_adjustments
+        = (new (ggc_alloc<ipa_param_adjustments> ())
+           ipa_param_adjustments (new_params, always_copy_start, skip_return));
     }
+
   count = streamer_read_uhwi (ib_main);
   for (i = 0; i < count; i++)
     {
@@ -1953,11 +1988,7 @@ input_node_opt_summary (struct cgraph_node *node,
       vec_safe_push (node->clone.tree_map, map);
       map->parm_num = streamer_read_uhwi (ib_main);
-      map->old_tree = NULL;
       map->new_tree = stream_read_tree (ib_main, data_in);
-      bp = streamer_read_bitpack (ib_main);
-      map->replace_p = bp_unpack_value (&bp, 1);
-      map->ref_p = bp_unpack_value (&bp, 1);
     }
   for (e = node->callees; e; e = e->next_callee)
     input_edge_opt_summary (e, ib_main);
diff --git a/gcc/lto-section-in.c b/gcc/lto-section-in.c
index f4d340ff5a3..e10b14f9a32 100644
--- a/gcc/lto-section-in.c
+++ b/gcc/lto-section-in.c
@@ -52,7 +52,8 @@ const char *lto_section_name[LTO_N_SECTION_TYPES] =
   "icf",
   "offload_table",
   "mode_table",
-  "hsa"
+  "hsa",
+  "ipa-sra"
 };
diff --git a/gcc/lto-streamer.h b/gcc/lto-streamer.h
index dd279f6762b..fc381313355 100644
--- a/gcc/lto-streamer.h
+++ b/gcc/lto-streamer.h
@@ -234,6 +234,7 @@ enum lto_section_type
   LTO_section_offload_table,
   LTO_section_mode_table,
   LTO_section_ipa_hsa,
+  LTO_section_ipa_sra,
   LTO_N_SECTION_TYPES   /* Must be last.
*/
 };
diff --git a/gcc/multiple_target.c b/gcc/multiple_target.c
index 5225e46bf04..0338bbb04a3 100644
--- a/gcc/multiple_target.c
+++ b/gcc/multiple_target.c
@@ -301,9 +301,8 @@ create_target_clone (cgraph_node *node, bool definition, char *name)
   if (definition)
     {
       new_node = node->create_version_clone_with_body (vNULL, NULL,
-                                                       NULL, false,
-                                                       NULL, NULL,
-                                                       name);
+                                                       NULL, NULL,
+                                                       NULL, name);
       new_node->force_output = true;
     }
   else
diff --git a/gcc/omp-simd-clone.c b/gcc/omp-simd-clone.c
index ba03bd50fe6..0312d7c8fd0 100644
--- a/gcc/omp-simd-clone.c
+++ b/gcc/omp-simd-clone.c
@@ -86,21 +86,23 @@ simd_clone_struct_copy (struct cgraph_simd_clone *to,
            * sizeof (struct cgraph_simd_clone_arg))));
 }
 
-/* Return vector of parameter types of function FNDECL.  This uses
-   TYPE_ARG_TYPES if available, otherwise falls back to types of
+/* Fill an empty vector ARGS with parameter types of function FNDECL.  This
+   uses TYPE_ARG_TYPES if available, otherwise falls back to types of
    DECL_ARGUMENTS types.  */
 
-static vec<tree>
-simd_clone_vector_of_formal_parm_types (tree fndecl)
+static void
+simd_clone_vector_of_formal_parm_types (vec<tree> *args, tree fndecl)
 {
   if (TYPE_ARG_TYPES (TREE_TYPE (fndecl)))
-    return ipa_get_vector_of_formal_parm_types (TREE_TYPE (fndecl));
-  vec<tree> args = ipa_get_vector_of_formal_parms (fndecl);
+    {
+      ipa_fill_vector_with_formal_parm_types (args, TREE_TYPE (fndecl));
+      return;
+    }
+  ipa_fill_vector_with_formal_parms (args, fndecl);
   unsigned int i;
   tree arg;
-  FOR_EACH_VEC_ELT (args, i, arg)
-    args[i] = TREE_TYPE (args[i]);
-  return args;
+  FOR_EACH_VEC_ELT (*args, i, arg)
+    (*args)[i] = TREE_TYPE ((*args)[i]);
 }
 
 /* Given a simd function in NODE, extract the simd specific
@@ -113,7 +115,8 @@ static struct cgraph_simd_clone *
 simd_clone_clauses_extract (struct cgraph_node *node, tree clauses,
                            bool *inbranch_specified)
 {
-  vec<tree> args = simd_clone_vector_of_formal_parm_types (node->decl);
+  auto_vec<tree> args;
+  simd_clone_vector_of_formal_parm_types (&args, node->decl);
   tree t;
int n;
   *inbranch_specified = false;
@@ -192,14 +195,12 @@ simd_clone_clauses_extract (struct cgraph_node *node, tree clauses,
        {
          warning_at (OMP_CLAUSE_LOCATION (t), 0,
                      "ignoring large linear step");
-         args.release ();
          return NULL;
        }
      else if (integer_zerop (step))
        {
          warning_at (OMP_CLAUSE_LOCATION (t), 0,
                      "ignoring zero linear step");
-         args.release ();
          return NULL;
        }
      else
@@ -259,7 +260,6 @@ simd_clone_clauses_extract (struct cgraph_node *node, tree clauses,
       warning_at (DECL_SOURCE_LOCATION (node->decl), 0,
                   "ignoring %<#pragma omp declare simd%> on function "
                   "with %<_Atomic%> qualified return type");
-      args.release ();
       return NULL;
     }
 
@@ -274,7 +274,6 @@ simd_clone_clauses_extract (struct cgraph_node *node, tree clauses,
       return NULL;
     }
 
-  args.release ();
   return clone_info;
 }
 
@@ -299,14 +298,14 @@ simd_clone_compute_base_data_type (struct cgraph_node *node,
      such parameter.  */
   else
     {
-      vec<tree> map = simd_clone_vector_of_formal_parm_types (fndecl);
+      auto_vec<tree> map;
+      simd_clone_vector_of_formal_parm_types (&map, fndecl);
       for (unsigned int i = 0; i < clone_info->nargs; ++i)
        if (clone_info->args[i].arg_type == SIMD_CLONE_ARG_TYPE_VECTOR)
          {
            type = map[i];
            break;
          }
-      map.release ();
     }
 
   /* c) If the characteristic data type determined by a) or b) above
@@ -437,7 +436,7 @@ simd_clone_create (struct cgraph_node *old_node)
        return NULL;
       old_node->get_body ();
       new_node = old_node->create_version_clone_with_body (vNULL, NULL, NULL,
-                                                          false, NULL, NULL,
+                                                          NULL, NULL,
                                                           "simdclone");
     }
   else
@@ -559,31 +558,33 @@ create_tmp_simd_array (const char *prefix, tree type, int simdlen)
 
    NODE is the function whose arguments are to be adjusted.
 
-   Returns an adjustment vector that will be filled describing how the
-   argument types will be adjusted.  */
+   If NODE does not represent a function definition, returns NULL.  Otherwise
+   returns an adjustment class that will be filled describing how the argument
+   declarations will be remapped.
New arguments which are not to be remapped
+   are marked with USER_FLAG.  */
 
-static ipa_parm_adjustment_vec
+static ipa_param_body_adjustments *
 simd_clone_adjust_argument_types (struct cgraph_node *node)
 {
-  vec<tree> args;
-  ipa_parm_adjustment_vec adjustments;
+  auto_vec<tree> args;
   if (node->definition)
-    args = ipa_get_vector_of_formal_parms (node->decl);
+    ipa_fill_vector_with_formal_parms (&args, node->decl);
   else
-    args = simd_clone_vector_of_formal_parm_types (node->decl);
-  adjustments.create (args.length ());
-  unsigned i, j, veclen;
-  struct ipa_parm_adjustment adj;
+    simd_clone_vector_of_formal_parm_types (&args, node->decl);
   struct cgraph_simd_clone *sc = node->simdclone;
+  vec<ipa_adjusted_param, va_gc> *new_params = NULL;
+  vec_safe_reserve (new_params, sc->nargs);
+  unsigned i, j, veclen;
   for (i = 0; i < sc->nargs; ++i)
     {
+      ipa_adjusted_param adj;
       memset (&adj, 0, sizeof (adj));
       tree parm = args[i];
       tree parm_type = node->definition ? TREE_TYPE (parm) : parm;
       adj.base_index = i;
-      adj.base = parm;
+      adj.prev_clone_index = i;
 
       sc->args[i].orig_arg = node->definition ? parm : NULL_TREE;
       sc->args[i].orig_type = parm_type;
@@ -592,7 +593,7 @@ simd_clone_adjust_argument_types (struct cgraph_node *node)
        {
        default:
          /* No adjustment necessary for scalar arguments.
*/ - adj.op = IPA_PARM_OP_COPY; + adj.op = IPA_PARAM_OP_COPY; break; case SIMD_CLONE_ARG_TYPE_LINEAR_UVAL_CONSTANT_STEP: case SIMD_CLONE_ARG_TYPE_LINEAR_UVAL_VARIABLE_STEP: @@ -601,7 +602,7 @@ simd_clone_adjust_argument_types (struct cgraph_node *node) = create_tmp_simd_array (IDENTIFIER_POINTER (DECL_NAME (parm)), TREE_TYPE (parm_type), sc->simdlen); - adj.op = IPA_PARM_OP_COPY; + adj.op = IPA_PARAM_OP_COPY; break; case SIMD_CLONE_ARG_TYPE_LINEAR_VAL_CONSTANT_STEP: case SIMD_CLONE_ARG_TYPE_LINEAR_VAL_VARIABLE_STEP: @@ -613,7 +614,8 @@ simd_clone_adjust_argument_types (struct cgraph_node *node) veclen /= GET_MODE_BITSIZE (SCALAR_TYPE_MODE (parm_type)); if (veclen > sc->simdlen) veclen = sc->simdlen; - adj.arg_prefix = "simd"; + adj.op = IPA_PARAM_OP_NEW; + adj.param_prefix_index = IPA_PARAM_PREFIX_SIMD; if (POINTER_TYPE_P (parm_type)) adj.type = build_vector_type (pointer_sized_int_node, veclen); else @@ -621,13 +623,15 @@ simd_clone_adjust_argument_types (struct cgraph_node *node) sc->args[i].vector_type = adj.type; for (j = veclen; j < sc->simdlen; j += veclen) { - adjustments.safe_push (adj); + vec_safe_push (new_params, adj); if (j == veclen) { memset (&adj, 0, sizeof (adj)); - adj.op = IPA_PARM_OP_NEW; - adj.arg_prefix = "simd"; + adj.op = IPA_PARAM_OP_NEW; + adj.user_flag = 1; + adj.param_prefix_index = IPA_PARAM_PREFIX_SIMD; adj.base_index = i; + adj.prev_clone_index = i; adj.type = sc->args[i].vector_type; } } @@ -638,18 +642,20 @@ simd_clone_adjust_argument_types (struct cgraph_node *node) ? 
IDENTIFIER_POINTER (DECL_NAME (parm)) : NULL, parm_type, sc->simdlen); } - adjustments.safe_push (adj); + vec_safe_push (new_params, adj); } if (sc->inbranch) { tree base_type = simd_clone_compute_base_data_type (sc->origin, sc); - + ipa_adjusted_param adj; memset (&adj, 0, sizeof (adj)); - adj.op = IPA_PARM_OP_NEW; - adj.arg_prefix = "mask"; + adj.op = IPA_PARAM_OP_NEW; + adj.user_flag = 1; + adj.param_prefix_index = IPA_PARAM_PREFIX_MASK; adj.base_index = i; + adj.prev_clone_index = i; if (INTEGRAL_TYPE_P (base_type) || POINTER_TYPE_P (base_type)) veclen = sc->vecsize_int; else @@ -664,10 +670,10 @@ simd_clone_adjust_argument_types (struct cgraph_node *node) adj.type = build_vector_type (pointer_sized_int_node, veclen); else adj.type = build_vector_type (base_type, veclen); - adjustments.safe_push (adj); + vec_safe_push (new_params, adj); for (j = veclen; j < sc->simdlen; j += veclen) - adjustments.safe_push (adj); + vec_safe_push (new_params, adj); /* We have previously allocated one extra entry for the mask. Use it and fill it. 
*/ @@ -692,7 +698,13 @@ simd_clone_adjust_argument_types (struct cgraph_node *node) } if (node->definition) - ipa_modify_formal_parameters (node->decl, adjustments); + { + ipa_param_body_adjustments *adjustments + = new ipa_param_body_adjustments (new_params, node->decl); + + adjustments->modify_formal_parameters (); + return adjustments; + } else { tree new_arg_types = NULL_TREE, new_reversed; @@ -701,12 +713,12 @@ simd_clone_adjust_argument_types (struct cgraph_node *node) last_parm_void = true; gcc_assert (TYPE_ARG_TYPES (TREE_TYPE (node->decl))); - j = adjustments.length (); + j = vec_safe_length (new_params); for (i = 0; i < j; i++) { - struct ipa_parm_adjustment *adj = &adjustments[i]; + struct ipa_adjusted_param *adj = &(*new_params)[i]; tree ptype; - if (adj->op == IPA_PARM_OP_COPY) + if (adj->op == IPA_PARAM_OP_COPY) ptype = args[adj->base_index]; else ptype = adj->type; @@ -724,11 +736,8 @@ simd_clone_adjust_argument_types (struct cgraph_node *node) tree new_type = build_distinct_type_copy (TREE_TYPE (node->decl)); TYPE_ARG_TYPES (new_type) = new_reversed; TREE_TYPE (node->decl) = new_type; - - adjustments.release (); + return NULL; } - args.release (); - return adjustments; } /* Initialize and copy the function arguments in NODE to their @@ -737,7 +746,7 @@ simd_clone_adjust_argument_types (struct cgraph_node *node) static gimple_seq simd_clone_init_simd_arrays (struct cgraph_node *node, - ipa_parm_adjustment_vec adjustments) + ipa_param_body_adjustments *adjustments) { gimple_seq seq = NULL; unsigned i = 0, j = 0, k; @@ -746,7 +755,7 @@ simd_clone_init_simd_arrays (struct cgraph_node *node, arg; arg = DECL_CHAIN (arg), i++, j++) { - if (adjustments[j].op == IPA_PARM_OP_COPY + if ((*adjustments->m_adj_params)[j].op == IPA_PARAM_OP_COPY || POINTER_TYPE_P (TREE_TYPE (arg))) continue; @@ -811,7 +820,7 @@ simd_clone_init_simd_arrays (struct cgraph_node *node, /* Callback info for ipa_simd_modify_stmt_ops below. 
*/ struct modify_stmt_info { - ipa_parm_adjustment_vec adjustments; + ipa_param_body_adjustments *adjustments; gimple *stmt; /* True if the parent statement was modified by ipa_simd_modify_stmt_ops. */ @@ -831,15 +840,20 @@ ipa_simd_modify_stmt_ops (tree *tp, int *walk_subtrees, void *data) tree *orig_tp = tp; if (TREE_CODE (*tp) == ADDR_EXPR) tp = &TREE_OPERAND (*tp, 0); - struct ipa_parm_adjustment *cand = NULL; + + if (TREE_CODE (*tp) == BIT_FIELD_REF + || TREE_CODE (*tp) == IMAGPART_EXPR + || TREE_CODE (*tp) == REALPART_EXPR) + tp = &TREE_OPERAND (*tp, 0); + + tree repl = NULL_TREE; if (TREE_CODE (*tp) == PARM_DECL) - cand = ipa_get_adjustment_candidate (&tp, NULL, info->adjustments, true); + repl = info->adjustments->get_expr_replacement (*tp, true); else if (TYPE_P (*tp)) *walk_subtrees = 0; - tree repl = NULL_TREE; - if (cand) - repl = unshare_expr (cand->new_decl); + if (repl) + repl = unshare_expr (repl); else { if (tp != orig_tp) @@ -905,70 +919,56 @@ ipa_simd_modify_stmt_ops (tree *tp, int *walk_subtrees, void *data) static void ipa_simd_modify_function_body (struct cgraph_node *node, - ipa_parm_adjustment_vec adjustments, + ipa_param_body_adjustments *adjustments, tree retval_array, tree iter) { basic_block bb; - unsigned int i, j, l; + unsigned int i, j; + - /* Re-use the adjustments array, but this time use it to replace - every function argument use to an offset into the corresponding - simd_array. */ + /* Register replacements for every function argument use to an offset into + the corresponding simd_array. 
*/ for (i = 0, j = 0; i < node->simdclone->nargs; ++i, ++j) { - if (!node->simdclone->args[i].vector_arg) + if (!node->simdclone->args[i].vector_arg + || (*adjustments->m_adj_params)[j].user_flag) continue; tree basetype = TREE_TYPE (node->simdclone->args[i].orig_arg); tree vectype = TREE_TYPE (node->simdclone->args[i].vector_arg); - adjustments[j].new_decl - = build4 (ARRAY_REF, - basetype, - node->simdclone->args[i].simd_array, - iter, - NULL_TREE, NULL_TREE); - if (adjustments[j].op == IPA_PARM_OP_NONE - && simd_clone_subparts (vectype) < node->simdclone->simdlen) + tree r = build4 (ARRAY_REF, basetype, node->simdclone->args[i].simd_array, + iter, NULL_TREE, NULL_TREE); + adjustments->register_replacement (&(*adjustments->m_adj_params)[j], r); + + if (simd_clone_subparts (vectype) < node->simdclone->simdlen) j += node->simdclone->simdlen / simd_clone_subparts (vectype) - 1; } - l = adjustments.length (); tree name; - FOR_EACH_SSA_NAME (i, name, cfun) { + tree base_var; if (SSA_NAME_VAR (name) - && TREE_CODE (SSA_NAME_VAR (name)) == PARM_DECL) + && TREE_CODE (SSA_NAME_VAR (name)) == PARM_DECL + && (base_var + = adjustments->get_replacement_ssa_base (SSA_NAME_VAR (name)))) { - for (j = 0; j < l; j++) - if (SSA_NAME_VAR (name) == adjustments[j].base - && adjustments[j].new_decl) - { - tree base_var; - if (adjustments[j].new_ssa_base == NULL_TREE) - { - base_var - = copy_var_decl (adjustments[j].base, - DECL_NAME (adjustments[j].base), - TREE_TYPE (adjustments[j].base)); - adjustments[j].new_ssa_base = base_var; - } - else - base_var = adjustments[j].new_ssa_base; - if (SSA_NAME_IS_DEFAULT_DEF (name)) - { - bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun)); - gimple_stmt_iterator gsi = gsi_after_labels (bb); - tree new_decl = unshare_expr (adjustments[j].new_decl); - set_ssa_default_def (cfun, adjustments[j].base, NULL_TREE); - SET_SSA_NAME_VAR_OR_IDENTIFIER (name, base_var); - SSA_NAME_IS_DEFAULT_DEF (name) = 0; - gimple *stmt = gimple_build_assign (name, new_decl); 
- gsi_insert_before (&gsi, stmt, GSI_SAME_STMT); - } - else - SET_SSA_NAME_VAR_OR_IDENTIFIER (name, base_var); - } + if (SSA_NAME_IS_DEFAULT_DEF (name)) + { + tree old_decl = SSA_NAME_VAR (name); + bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun)); + gimple_stmt_iterator gsi = gsi_after_labels (bb); + tree repl = adjustments->lookup_replacement (old_decl, 0); + gcc_checking_assert (repl); + repl = unshare_expr (repl); + set_ssa_default_def (cfun, old_decl, NULL_TREE); + SET_SSA_NAME_VAR_OR_IDENTIFIER (name, base_var); + SSA_NAME_IS_DEFAULT_DEF (name) = 0; + gimple *stmt = gimple_build_assign (name, repl); + gsi_insert_before (&gsi, stmt, GSI_SAME_STMT); + } + else + SET_SSA_NAME_VAR_OR_IDENTIFIER (name, base_var); } } @@ -1115,8 +1115,9 @@ simd_clone_adjust (struct cgraph_node *node) targetm.simd_clone.adjust (node); tree retval = simd_clone_adjust_return_type (node); - ipa_parm_adjustment_vec adjustments + ipa_param_body_adjustments *adjustments = simd_clone_adjust_argument_types (node); + gcc_assert (adjustments); push_gimplify_context (); @@ -1128,7 +1129,7 @@ simd_clone_adjust (struct cgraph_node *node) tree iter1 = make_ssa_name (iter); tree iter2 = NULL_TREE; ipa_simd_modify_function_body (node, adjustments, retval, iter1); - adjustments.release (); + delete adjustments; /* Initialize the iteration variable. */ basic_block entry_bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun)); diff --git a/gcc/params.def b/gcc/params.def index 6f98fccd291..3e7a10c41ce 100644 --- a/gcc/params.def +++ b/gcc/params.def @@ -1031,6 +1031,13 @@ DEFPARAM (PARAM_IPA_SRA_PTR_GROWTH_FACTOR, "that ipa-sra replaces a pointer to an aggregate with.", 2, 0, 0) +DEFPARAM (PARAM_IPA_SRA_MAX_REPLACEMENTS, + "ipa-sra-max-replacements", + "Maximum pieces that IPA-SRA tracks per formal parameter, as " + "a consequence, also the maximum number of replacements of a formal " + "parameter. 
", + 8, 0, 16) + DEFPARAM (PARAM_TM_MAX_AGGREGATE_SIZE, "tm-max-aggregate-size", "Size in bytes after which thread-local aggregates should be " @@ -1050,6 +1057,12 @@ DEFPARAM (PARAM_SRA_MAX_SCALARIZATION_SIZE_SIZE, "considered for scalarization when compiling for size.", 0, 0, 0) +DEFPARAM (PARAM_SRA_MAX_TYPE_CHECK_STEPS, + "sra-max-type-check-steps", + "Maximum number of steps SRA optimizations should perform when " + "evaluating whether a type can be scalarized.", + 64, 0, 0) + DEFPARAM (PARAM_IPA_CP_VALUE_LIST_SIZE, "ipa-cp-value-list-size", "Maximum size of a list of values associated with each parameter for " diff --git a/gcc/passes.def b/gcc/passes.def index 144df4fa417..b6e258f2098 100644 --- a/gcc/passes.def +++ b/gcc/passes.def @@ -89,7 +89,6 @@ along with GCC; see the file COPYING3. If not see NEXT_PASS (pass_dse); NEXT_PASS (pass_cd_dce); NEXT_PASS (pass_phiopt, true /* early_p */); - NEXT_PASS (pass_early_ipa_sra); NEXT_PASS (pass_tail_recursion); NEXT_PASS (pass_convert_switch); NEXT_PASS (pass_cleanup_eh); @@ -146,6 +145,7 @@ along with GCC; see the file COPYING3. 
If not see
   NEXT_PASS (pass_ipa_whole_program_visibility);
   NEXT_PASS (pass_ipa_profile);
   NEXT_PASS (pass_ipa_icf);
+  NEXT_PASS (pass_ipa_sra);
   NEXT_PASS (pass_ipa_devirt);
   NEXT_PASS (pass_ipa_cp);
   NEXT_PASS (pass_ipa_cdtor_merge);
diff --git a/gcc/testsuite/g++.dg/ipa/pr81248.C b/gcc/testsuite/g++.dg/ipa/pr81248.C
index d7796ff7ab7..b79710fc048 100644
--- a/gcc/testsuite/g++.dg/ipa/pr81248.C
+++ b/gcc/testsuite/g++.dg/ipa/pr81248.C
@@ -1,5 +1,5 @@
 // { dg-do compile { target c++17 } }
-// { dg-options "-O2 -fdump-tree-eipa_sra" }
+// { dg-options "-O2 -fdump-ipa-sra" }
 
 #include
 
@@ -37,4 +37,4 @@ int main() {
   f(n2);
 }
 
-// { dg-final { scan-tree-dump-times "Adjusting call" 2 "eipa_sra" } }
+// { dg-final { scan-ipa-dump "Will split parameter 0" "sra" } }
diff --git a/gcc/testsuite/gcc.dg/ipa/20040703-wpa.c b/gcc/testsuite/gcc.dg/ipa/20040703-wpa.c
new file mode 100644
index 00000000000..b1a318be886
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ipa/20040703-wpa.c
@@ -0,0 +1,151 @@
+/* With -fwhole-program this is an excellent testcase for inlining IPA-SRAed
+   functions into each other.  */
+/* { dg-do run } */
+/* { dg-options "-O2 -w -fno-ipa-cp -fwhole-program" } */
+/* { dg-require-effective-target int32plus } */
+
+#define PART_PRECISION (sizeof (cpp_num_part) * 8)
+
+typedef unsigned int cpp_num_part;
+typedef struct cpp_num cpp_num;
+struct cpp_num
+{
+  cpp_num_part high;
+  cpp_num_part low;
+  int unsignedp;  /* True if value should be treated as unsigned.  */
+  int overflow;   /* True if the most recent calculation overflowed.
*/ +}; + +static int +num_positive (cpp_num num, unsigned int precision) +{ + if (precision > PART_PRECISION) + { + precision -= PART_PRECISION; + return (num.high & (cpp_num_part) 1 << (precision - 1)) == 0; + } + + return (num.low & (cpp_num_part) 1 << (precision - 1)) == 0; +} + +static cpp_num +num_trim (cpp_num num, unsigned int precision) +{ + if (precision > PART_PRECISION) + { + precision -= PART_PRECISION; + if (precision < PART_PRECISION) + num.high &= ((cpp_num_part) 1 << precision) - 1; + } + else + { + if (precision < PART_PRECISION) + num.low &= ((cpp_num_part) 1 << precision) - 1; + num.high = 0; + } + + return num; +} + +/* Shift NUM, of width PRECISION, right by N bits. */ +static cpp_num +num_rshift (cpp_num num, unsigned int precision, unsigned int n) +{ + cpp_num_part sign_mask; + int x = num_positive (num, precision); + + if (num.unsignedp || x) + sign_mask = 0; + else + sign_mask = ~(cpp_num_part) 0; + + if (n >= precision) + num.high = num.low = sign_mask; + else + { + /* Sign-extend. 
*/ + if (precision < PART_PRECISION) + num.high = sign_mask, num.low |= sign_mask << precision; + else if (precision < 2 * PART_PRECISION) + num.high |= sign_mask << (precision - PART_PRECISION); + + if (n >= PART_PRECISION) + { + n -= PART_PRECISION; + num.low = num.high; + num.high = sign_mask; + } + + if (n) + { + num.low = (num.low >> n) | (num.high << (PART_PRECISION - n)); + num.high = (num.high >> n) | (sign_mask << (PART_PRECISION - n)); + } + } + + num = num_trim (num, precision); + num.overflow = 0; + return num; +} + #define num_zerop(num) ((num.low | num.high) == 0) +#define num_eq(num1, num2) (num1.low == num2.low && num1.high == num2.high) + +cpp_num +num_lshift (cpp_num num, unsigned int precision, unsigned int n) +{ + if (n >= precision) + { + num.overflow = !num.unsignedp && !num_zerop (num); + num.high = num.low = 0; + } + else + { + cpp_num orig; + unsigned int m = n; + + orig = num; + if (m >= PART_PRECISION) + { + m -= PART_PRECISION; + num.high = num.low; + num.low = 0; + } + if (m) + { + num.high = (num.high << m) | (num.low >> (PART_PRECISION - m)); + num.low <<= m; + } + num = num_trim (num, precision); + + if (num.unsignedp) + num.overflow = 0; + else + { + cpp_num maybe_orig = num_rshift (num, precision, n); + num.overflow = !num_eq (orig, maybe_orig); + } + } + + return num; +} + +unsigned int precision = 64; +unsigned int n = 16; + +cpp_num num = { 0, 3, 0, 0 }; + +int main() +{ + cpp_num res = num_lshift (num, 64, n); + + if (res.low != 0x30000) + abort (); + + if (res.high != 0) + abort (); + + if (res.overflow != 0) + abort (); + + exit (0); +} diff --git a/gcc/testsuite/gcc.dg/ipa/ipa-sra-1.c b/gcc/testsuite/gcc.dg/ipa/ipa-sra-1.c index 4db904b419e..4a22e3978f9 100644 --- a/gcc/testsuite/gcc.dg/ipa/ipa-sra-1.c +++ b/gcc/testsuite/gcc.dg/ipa/ipa-sra-1.c @@ -1,5 +1,5 @@ /* { dg-do run } */ -/* { dg-options "-O2 -fipa-sra -fdump-tree-eipa_sra-details" } */ +/* { dg-options "-O2 -fipa-sra -fdump-ipa-sra-details" } */ struct bovid { @@ 
-36,4 +36,4 @@ main (int argc, char *argv[])
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "About to replace expr" 2 "eipa_sra" } } */
+/* { dg-final { scan-ipa-dump "Will split parameter" "sra" } } */
diff --git a/gcc/testsuite/gcc.dg/ipa/ipa-sra-10.c b/gcc/testsuite/gcc.dg/ipa/ipa-sra-10.c
index 24b64d1234a..4ecdba1b8b1 100644
--- a/gcc/testsuite/gcc.dg/ipa/ipa-sra-10.c
+++ b/gcc/testsuite/gcc.dg/ipa/ipa-sra-10.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fipa-sra -fdump-tree-eipa_sra-details" } */
+/* { dg-options "-O2 -fipa-sra -fdump-ipa-sra" } */
 
 extern void consume (int);
 extern int glob, glob1, glob2;
@@ -31,4 +31,4 @@ bar (int a)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-times "replacing an SSA name of a removed param" 4 "eipa_sra" } } */
+/* { dg-final { scan-ipa-dump "Will remove parameter 0" "sra" } } */
diff --git a/gcc/testsuite/gcc.dg/ipa/ipa-sra-11.c b/gcc/testsuite/gcc.dg/ipa/ipa-sra-11.c
index e91423a62fb..61c02c1c47c 100644
--- a/gcc/testsuite/gcc.dg/ipa/ipa-sra-11.c
+++ b/gcc/testsuite/gcc.dg/ipa/ipa-sra-11.c
@@ -1,5 +1,5 @@
-/* { dg-do run } */
-/* { dg-options "-O2 -fipa-sra -fdump-tree-eipa_sra-details" } */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fipa-sra -fdump-ipa-sra-details" } */
 
 struct bovid
 {
@@ -36,4 +36,4 @@ main (int argc, char *argv[])
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-not "About to replace expr" "eipa_sra" } } */
+/* { dg-final { scan-ipa-dump-not "Will split parameter" "sra" } } */
diff --git a/gcc/testsuite/gcc.dg/ipa/ipa-sra-3.c b/gcc/testsuite/gcc.dg/ipa/ipa-sra-3.c
index 23dec2a661e..37c7008c55e 100644
--- a/gcc/testsuite/gcc.dg/ipa/ipa-sra-3.c
+++ b/gcc/testsuite/gcc.dg/ipa/ipa-sra-3.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fipa-sra -fdump-tree-eipa_sra-details" } */
+/* { dg-options "-O2 -fipa-sra -fdump-ipa-sra" } */
 
 struct bovid
 {
@@ -34,5 +34,6 @@ void caller (void)
   return;
 }
 
-/* { dg-final { scan-tree-dump "base: z, remove_param" "eipa_sra" } } */
-/* { dg-final { scan-tree-dump "base: calf, remove_param" "eipa_sra" } } */ +/* { dg-final { scan-ipa-dump "Will split parameter 0" "sra" } } */ +/* { dg-final { scan-ipa-dump "Will remove parameter 1" "sra" } } */ +/* { dg-final { scan-ipa-dump "Will remove parameter 2" "sra" } } */ diff --git a/gcc/testsuite/gcc.dg/ipa/ipa-sra-4.c b/gcc/testsuite/gcc.dg/ipa/ipa-sra-4.c index 50ac179b225..fdbd5e5d72d 100644 --- a/gcc/testsuite/gcc.dg/ipa/ipa-sra-4.c +++ b/gcc/testsuite/gcc.dg/ipa/ipa-sra-4.c @@ -1,5 +1,5 @@ /* { dg-do compile } */ -/* { dg-options "-O2 -fipa-sra -fdump-tree-eipa_sra-details" } */ +/* { dg-options "-O2 -fipa-sra -fno-ipa-pure-const -fdump-ipa-sra" } */ static int __attribute__((noinline)) @@ -61,7 +61,5 @@ void caller (void) return; } -/* { dg-final { scan-tree-dump "About to replace expr \\*i_.*D. with ISRA" "eipa_sra" } } */ -/* { dg-final { scan-tree-dump "About to replace expr \\*l_.*D. with ISRA" "eipa_sra" } } */ -/* { dg-final { scan-tree-dump-times "About to replace expr \*j_.*D. with ISRA" 0 "eipa_sra" } } */ -/* { dg-final { scan-tree-dump-times "About to replace expr \*k_.*D. 
with ISRA" 0 "eipa_sra" } } */ +/* { dg-final { scan-ipa-dump-times "Will split parameter" 2 "sra" } } */ + diff --git a/gcc/testsuite/gcc.dg/ipa/ipa-sra-5.c b/gcc/testsuite/gcc.dg/ipa/ipa-sra-5.c index 3310a6df2e7..8a7568119b1 100644 --- a/gcc/testsuite/gcc.dg/ipa/ipa-sra-5.c +++ b/gcc/testsuite/gcc.dg/ipa/ipa-sra-5.c @@ -1,5 +1,5 @@ /* { dg-do compile } */ -/* { dg-options "-O2 -fipa-sra -fdump-tree-eipa_sra-details" } */ +/* { dg-options "-O2 -fipa-sra -fdump-ipa-sra" } */ static int * __attribute__((noinline,used)) @@ -16,4 +16,4 @@ int *caller (void) return ox (&a, &b); } -/* { dg-final { scan-tree-dump-times "base: j, remove_param" 0 "eipa_sra" } } */ +/* { dg-final { scan-ipa-dump-times "Will split parameter" 0 "sra" } } */ diff --git a/gcc/testsuite/gcc.dg/ipa/ipacost-2.c b/gcc/testsuite/gcc.dg/ipa/ipacost-2.c index 43f01147091..e0501db1ae5 100644 --- a/gcc/testsuite/gcc.dg/ipa/ipacost-2.c +++ b/gcc/testsuite/gcc.dg/ipa/ipacost-2.c @@ -1,5 +1,5 @@ /* { dg-do compile } */ -/* { dg-options "-O3 -fipa-cp -fipa-cp-clone -fdump-ipa-cp -fno-early-inlining -fdump-tree-optimized -fno-ipa-icf" } */ +/* { dg-options "-O3 -fipa-cp -fipa-cp-clone -fdump-ipa-cp -fno-early-inlining -fno-ipa-sra -fdump-tree-optimized -fno-ipa-icf" } */ /* { dg-add-options bind_pic_locally } */ int array[100]; @@ -72,7 +72,7 @@ main() } /* { dg-final { scan-ipa-dump-times "Creating a specialized node of i_can_be_propagated_fully2" 1 "cp" } } */ -/* { dg-final { scan-ipa-dump-times "Creating a specialized node of i_can_be_propagated_fully/" 1 "cp" } } */ +/* { dg-final { scan-ipa-dump-times "Creating a specialized node of i_can_be_propagated_fully\[./\]" 1 "cp" } } */ /* { dg-final { scan-ipa-dump-not "Creating a specialized node of i_can_not_be_propagated_fully2" "cp" } } */ /* { dg-final { scan-ipa-dump-not "Creating a specialized node of i_can_not_be_propagated_fully/" "cp" } } */ /* { dg-final { scan-tree-dump-not "i_can_be_propagated_fully \\(" "optimized" } } */ diff --git 
a/gcc/testsuite/gcc.dg/ipa/ipcp-agg-9.c b/gcc/testsuite/gcc.dg/ipa/ipcp-agg-9.c index 2005a10dc15..c69a285b287 100644 --- a/gcc/testsuite/gcc.dg/ipa/ipcp-agg-9.c +++ b/gcc/testsuite/gcc.dg/ipa/ipcp-agg-9.c @@ -1,6 +1,6 @@ /* Verify that IPA-CP can make edges direct based on aggregate contents. */ /* { dg-do compile } */ -/* { dg-options "-O3 -fno-early-inlining -fdump-ipa-cp -fdump-ipa-inline" } */ +/* { dg-options "-O3 -fno-early-inlining -fno-ipa-sra -fdump-ipa-cp -fdump-ipa-inline" } */ struct S { diff --git a/gcc/testsuite/gcc.dg/ipa/pr78121.c b/gcc/testsuite/gcc.dg/ipa/pr78121.c index 4a0ae187256..19d6eda22f8 100644 --- a/gcc/testsuite/gcc.dg/ipa/pr78121.c +++ b/gcc/testsuite/gcc.dg/ipa/pr78121.c @@ -13,4 +13,4 @@ static void fn1(c) unsigned char c; void fn3() { fn1 (267); } -/* { dg-final { scan-ipa-dump-times "Setting value range of param 0 \\\[11, 35\\\]" 1 "cp" } } */ +/* { dg-final { scan-ipa-dump "Setting value range of param 0 \\(now 0\\) \\\[11, 35\\\]" "cp" } } */ diff --git a/gcc/testsuite/gcc.dg/ipa/vrp1.c b/gcc/testsuite/gcc.dg/ipa/vrp1.c index 72a3139851c..e32a13c3d6a 100644 --- a/gcc/testsuite/gcc.dg/ipa/vrp1.c +++ b/gcc/testsuite/gcc.dg/ipa/vrp1.c @@ -28,5 +28,5 @@ int main () return 0; } -/* { dg-final { scan-ipa-dump "Setting value range of param 0 \\\[6," "cp" } } */ -/* { dg-final { scan-ipa-dump "Setting value range of param 0 \\\[0, 999\\\]" "cp" } } */ +/* { dg-final { scan-ipa-dump "Setting value range of param 0 \\(now 0\\) \\\[6," "cp" } } */ +/* { dg-final { scan-ipa-dump "Setting value range of param 0 \\(now 0\\) \\\[0, 999\\\]" "cp" } } */ diff --git a/gcc/testsuite/gcc.dg/ipa/vrp2.c b/gcc/testsuite/gcc.dg/ipa/vrp2.c index c720e5ce8d4..31909bdbf24 100644 --- a/gcc/testsuite/gcc.dg/ipa/vrp2.c +++ b/gcc/testsuite/gcc.dg/ipa/vrp2.c @@ -31,5 +31,5 @@ int main () return 0; } -/* { dg-final { scan-ipa-dump "Setting value range of param 0 \\\[4," "cp" } } */ -/* { dg-final { scan-ipa-dump "Setting value range of param 0 \\\[0, 11\\\]" 
"cp" } } */ +/* { dg-final { scan-ipa-dump "Setting value range of param 0 \\(now 0\\) \\\[4," "cp" } } */ +/* { dg-final { scan-ipa-dump "Setting value range of param 0 \\(now 0\\) \\\[0, 11\\\]" "cp" } } */ diff --git a/gcc/testsuite/gcc.dg/ipa/vrp3.c b/gcc/testsuite/gcc.dg/ipa/vrp3.c index fb5d54aca82..9b1dcf98b25 100644 --- a/gcc/testsuite/gcc.dg/ipa/vrp3.c +++ b/gcc/testsuite/gcc.dg/ipa/vrp3.c @@ -27,4 +27,4 @@ int main () return 0; } -/* { dg-final { scan-ipa-dump-times "Setting value range of param 0 \\\[0, 9\\\]" 2 "cp" } } */ +/* { dg-final { scan-ipa-dump-times "Setting value range of param 0 \\(now 0\\) \\\[0, 9\\\]" 2 "cp" } } */ diff --git a/gcc/testsuite/gcc.dg/ipa/vrp7.c b/gcc/testsuite/gcc.dg/ipa/vrp7.c index e4e0bc66a64..ca5aa29e975 100644 --- a/gcc/testsuite/gcc.dg/ipa/vrp7.c +++ b/gcc/testsuite/gcc.dg/ipa/vrp7.c @@ -29,4 +29,4 @@ int main () return 0; } -/* { dg-final { scan-ipa-dump-times "Setting value range of param 0 \\\[-10, 9\\\]" 1 "cp" } } */ +/* { dg-final { scan-ipa-dump-times "Setting value range of param 0 \\(now 0\\) \\\[-10, 9\\\]" 1 "cp" } } */ diff --git a/gcc/testsuite/gcc.dg/ipa/vrp8.c b/gcc/testsuite/gcc.dg/ipa/vrp8.c index 55832b0886e..0ac5fb5277d 100644 --- a/gcc/testsuite/gcc.dg/ipa/vrp8.c +++ b/gcc/testsuite/gcc.dg/ipa/vrp8.c @@ -39,4 +39,4 @@ main () return 0; } -/* { dg-final { scan-ipa-dump-times "Setting value range of param 0 \\\[-10, 9\\\]" 1 "cp" } } */ +/* { dg-final { scan-ipa-dump-times "Setting value range of param 0 \\(now 0\\) \\\[-10, 9\\\]" 1 "cp" } } */ diff --git a/gcc/testsuite/gcc.dg/noreorder.c b/gcc/testsuite/gcc.dg/noreorder.c index 7d40a2958a4..e413b689dd6 100644 --- a/gcc/testsuite/gcc.dg/noreorder.c +++ b/gcc/testsuite/gcc.dg/noreorder.c @@ -13,7 +13,7 @@ static int func2(void); asm("firstasm"); -NOREORDER __attribute__((noinline)) int bozo(void) +NOREORDER __attribute__((noipa)) int bozo(void) { f2(3); func2(); @@ -21,14 +21,14 @@ NOREORDER __attribute__((noinline)) int bozo(void) asm("jukjuk"); 
-NOREORDER __attribute__((noinline)) static int func1(void) +NOREORDER __attribute__((noipa)) static int func1(void) { f2(1); } asm("barbar"); -NOREORDER __attribute__((noinline)) static int func2(void) +NOREORDER __attribute__((noipa)) static int func2(void) { func1(); } diff --git a/gcc/trans-mem.c b/gcc/trans-mem.c index bb7146bc153..778aa875ff1 100644 --- a/gcc/trans-mem.c +++ b/gcc/trans-mem.c @@ -5009,8 +5009,7 @@ ipa_tm_create_version (struct cgraph_node *old_node) } tree_function_versioning (old_decl, new_decl, - NULL, false, NULL, - false, NULL, NULL); + NULL, NULL, false, NULL, NULL); } record_tm_clone_pair (old_decl, new_decl); diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c index 8c4c82e54e8..4abe176856f 100644 --- a/gcc/tree-inline.c +++ b/gcc/tree-inline.c @@ -128,7 +128,6 @@ static void copy_bind_expr (tree *, int *, copy_body_data *); static void declare_inline_vars (tree, tree); static void remap_save_expr (tree *, hash_map *, int *); static void prepend_lexical_block (tree current_block, tree new_block); -static tree copy_decl_to_var (tree, copy_body_data *); static tree copy_result_decl_to_var (tree, copy_body_data *); static tree copy_decl_maybe_to_var (tree, copy_body_data *); static gimple_seq remap_gimple_stmt (gimple *, copy_body_data *); @@ -191,7 +190,17 @@ remap_ssa_name (tree name, copy_body_data *id) n = id->decl_map->get (name); if (n) - return unshare_expr (*n); + { + if (id->killed_new_ssa_names + && id->killed_new_ssa_names->contains (*n)) + { + gcc_assert (processing_debug_stmt); + processing_debug_stmt = -1; + return name; + } + + return unshare_expr (*n); + } if (processing_debug_stmt) { @@ -1762,6 +1771,21 @@ remap_gimple_stmt (gimple *stmt, copy_body_data *id) gcc_assert (n); gimple_set_block (copy, *n); } + if (id->param_body_adjs) + { + gimple_seq extra_stmts = NULL; + id->param_body_adjs->modify_gimple_stmt (©, &extra_stmts); + if (!gimple_seq_empty_p (extra_stmts)) + { + memset (&wi, 0, sizeof (wi)); + wi.info = id; + 
for (gimple_stmt_iterator egsi = gsi_start (extra_stmts); + !gsi_end_p (egsi); + gsi_next (&egsi)) + walk_gimple_op (gsi_stmt (egsi), remap_gimple_op_r, &wi); + gimple_seq_add_seq (&stmts, extra_stmts); + } + } if (id->reset_location) gimple_set_location (copy, input_location); @@ -2644,10 +2668,20 @@ redirect_all_calls (copy_body_data * id, basic_block bb) gimple *stmt = gsi_stmt (si); if (is_gimple_call (stmt)) { + tree old_lhs = gimple_call_lhs (stmt); struct cgraph_edge *edge = id->dst_node->get_edge (stmt); if (edge) { edge->redirect_call_stmt_to_callee (); + if (old_lhs + && TREE_CODE (old_lhs) == SSA_NAME + && !gimple_call_lhs (edge->call_stmt)) + { + if (!id->killed_new_ssa_names) + id->killed_new_ssa_names = new hash_set (16); + id->killed_new_ssa_names->add (old_lhs); + } + if (stmt == last && id->call_stmt && maybe_clean_eh_stmt (stmt)) gimple_purge_dead_eh_edges (bb); } @@ -2962,6 +2996,8 @@ copy_body (copy_body_data *id, body = copy_cfg_body (id, entry_block_map, exit_block_map, new_entry); copy_debug_stmts (id); + delete id->killed_new_ssa_names; + id->killed_new_ssa_names = NULL; return body; } @@ -4677,6 +4713,38 @@ expand_call_inline (basic_block bb, gimple *stmt, copy_body_data *id) /* Add local vars in this inlined callee to caller. */ add_local_variables (id->src_cfun, cfun, id); + if (id->src_node->clone.performed_splits) + { + /* Any calls from the inlined function will be turned into calls from the + function we inline into. We must preserve notes about how to split + parameters such calls should be redirected/updated. 
*/ + unsigned len = vec_safe_length (id->src_node->clone.performed_splits); + for (unsigned i = 0; i < len; i++) + { + ipa_param_performed_split ps + = (*id->src_node->clone.performed_splits)[i]; + ps.dummy_decl = remap_decl (ps.dummy_decl, id); + vec_safe_push (id->dst_node->clone.performed_splits, ps); + } + + if (flag_checking) + { + len = vec_safe_length (id->dst_node->clone.performed_splits); + for (unsigned i = 0; i < len; i++) + { + ipa_param_performed_split *ps1 + = &(*id->dst_node->clone.performed_splits)[i]; + for (unsigned j = i + 1; j < len; j++) + { + ipa_param_performed_split *ps2 + = &(*id->dst_node->clone.performed_splits)[j]; + gcc_assert (ps1->dummy_decl != ps2->dummy_decl + || ps1->unit_offset != ps2->unit_offset); + } + } + } + } + if (dump_enabled_p ()) { char buf[128]; @@ -5497,7 +5565,7 @@ copy_decl_for_dup_finish (copy_body_data *id, tree decl, tree copy) return copy; } -static tree +tree copy_decl_to_var (tree decl, copy_body_data *id) { tree copy, type; @@ -5580,38 +5648,24 @@ copy_decl_maybe_to_var (tree decl, copy_body_data *id) return copy_decl_no_change (decl, id); } -/* Return a copy of the function's argument tree. */ +/* Return a copy of the function's argument tree without any modifications. */ + static tree -copy_arguments_for_versioning (tree orig_parm, copy_body_data * id, - bitmap args_to_skip, tree *vars) +copy_arguments_nochange (tree orig_parm, copy_body_data * id) { tree arg, *parg; tree new_parm = NULL; - int i = 0; parg = &new_parm; - - for (arg = orig_parm; arg; arg = DECL_CHAIN (arg), i++) - if (!args_to_skip || !bitmap_bit_p (args_to_skip, i)) - { - tree new_tree = remap_decl (arg, id); - if (TREE_CODE (new_tree) != PARM_DECL) - new_tree = id->copy_decl (arg, id); - lang_hooks.dup_lang_specific_decl (new_tree); - *parg = new_tree; - parg = &DECL_CHAIN (new_tree); - } - else if (!id->decl_map->get (arg)) - { - /* Make an equivalent VAR_DECL. 
If the argument was used - as temporary variable later in function, the uses will be - replaced by local variable. */ - tree var = copy_decl_to_var (arg, id); - insert_decl_map (id, arg, var); - /* Declare this new variable. */ - DECL_CHAIN (var) = *vars; - *vars = var; - } + for (arg = orig_parm; arg; arg = DECL_CHAIN (arg)) + { + tree new_tree = remap_decl (arg, id); + if (TREE_CODE (new_tree) != PARM_DECL) + new_tree = id->copy_decl (arg, id); + lang_hooks.dup_lang_specific_decl (new_tree); + *parg = new_tree; + parg = &DECL_CHAIN (new_tree); + } return new_parm; } @@ -5720,6 +5774,18 @@ delete_unreachable_blocks_update_callgraph (copy_body_data *id) static void update_clone_info (copy_body_data * id) { + vec *cur_performed_splits + = id->dst_node->clone.performed_splits; + if (cur_performed_splits) + { + unsigned len = cur_performed_splits->length (); + for (unsigned i = 0; i < len; i++) + { + ipa_param_performed_split *ps = &(*cur_performed_splits)[i]; + ps->dummy_decl = remap_decl (ps->dummy_decl, id); + } + } + struct cgraph_node *node; if (!id->dst_node->clones) return; @@ -5733,10 +5799,55 @@ update_clone_info (copy_body_data * id) { struct ipa_replace_map *replace_info; replace_info = (*node->clone.tree_map)[i]; - walk_tree (&replace_info->old_tree, copy_tree_body_r, id, NULL); walk_tree (&replace_info->new_tree, copy_tree_body_r, id, NULL); } } + if (node->clone.performed_splits) + { + unsigned len = vec_safe_length (node->clone.performed_splits); + for (unsigned i = 0; i < len; i++) + { + ipa_param_performed_split *ps + = &(*node->clone.performed_splits)[i]; + ps->dummy_decl = remap_decl (ps->dummy_decl, id); + } + } + if (unsigned len = vec_safe_length (cur_performed_splits)) + { + /* We do not want to add current performed splits when we are saving + a copy of function body for later during inlining, that would just + duplicate all entries. So let's have a look whether anything + referring to the first dummy_decl is present. 
*/ + unsigned dst_len = vec_safe_length (node->clone.performed_splits); + ipa_param_performed_split *first = &(*cur_performed_splits)[0]; + for (unsigned i = 0; i < dst_len; i++) + if ((*node->clone.performed_splits)[i].dummy_decl + == first->dummy_decl) + { + len = 0; + break; + } + + for (unsigned i = 0; i < len; i++) + vec_safe_push (node->clone.performed_splits, + (*cur_performed_splits)[i]); + if (flag_checking) + { + for (unsigned i = 0; i < dst_len; i++) + { + ipa_param_performed_split *ps1 + = &(*node->clone.performed_splits)[i]; + for (unsigned j = i + 1; j < dst_len; j++) + { + ipa_param_performed_split *ps2 + = &(*node->clone.performed_splits)[j]; + gcc_assert (ps1->dummy_decl != ps2->dummy_decl + || ps1->unit_offset != ps2->unit_offset); + } + } + } + } + if (node->clones) node = node->clones; else if (node->next_sibling_clone) @@ -5769,8 +5880,8 @@ update_clone_info (copy_body_data * id) void tree_function_versioning (tree old_decl, tree new_decl, vec *tree_map, - bool update_clones, bitmap args_to_skip, - bool skip_return, bitmap blocks_to_copy, + ipa_param_adjustments *param_adjustments, + bool update_clones, bitmap blocks_to_copy, basic_block new_entry) { struct cgraph_node *old_version_node; @@ -5782,7 +5893,6 @@ tree_function_versioning (tree old_decl, tree new_decl, basic_block old_entry_block, bb; auto_vec init_stmts; tree vars = NULL_TREE; - bitmap debug_args_to_skip = args_to_skip; gcc_assert (TREE_CODE (old_decl) == FUNCTION_DECL && TREE_CODE (new_decl) == FUNCTION_DECL); @@ -5858,96 +5968,79 @@ tree_function_versioning (tree old_decl, tree new_decl, DECL_STRUCT_FUNCTION (new_decl)->static_chain_decl = copy_static_chain (p, &id); + auto_vec new_param_indices; + ipa_param_adjustments *old_param_adjustments + = old_version_node->clone.param_adjustments; + if (old_param_adjustments) + old_param_adjustments->get_updated_indices (&new_param_indices); + /* If there's a tree_map, prepare for substitution. 
*/ if (tree_map) for (i = 0; i < tree_map->length (); i++) { gimple *init; replace_info = (*tree_map)[i]; - if (replace_info->replace_p) + + int p = replace_info->parm_num; + if (old_param_adjustments) + p = new_param_indices[p]; + + tree parm; + tree req_type, new_type; + + for (parm = DECL_ARGUMENTS (old_decl); p; + parm = DECL_CHAIN (parm)) + p--; + tree old_tree = parm; + req_type = TREE_TYPE (parm); + new_type = TREE_TYPE (replace_info->new_tree); + if (!useless_type_conversion_p (req_type, new_type)) { - int parm_num = -1; - if (!replace_info->old_tree) - { - int p = replace_info->parm_num; - tree parm; - tree req_type, new_type; - - for (parm = DECL_ARGUMENTS (old_decl); p; - parm = DECL_CHAIN (parm)) - p--; - replace_info->old_tree = parm; - parm_num = replace_info->parm_num; - req_type = TREE_TYPE (parm); - new_type = TREE_TYPE (replace_info->new_tree); - if (!useless_type_conversion_p (req_type, new_type)) - { - if (fold_convertible_p (req_type, replace_info->new_tree)) - replace_info->new_tree - = fold_build1 (NOP_EXPR, req_type, - replace_info->new_tree); - else if (TYPE_SIZE (req_type) == TYPE_SIZE (new_type)) - replace_info->new_tree - = fold_build1 (VIEW_CONVERT_EXPR, req_type, - replace_info->new_tree); - else - { - if (dump_file) - { - fprintf (dump_file, " const "); - print_generic_expr (dump_file, - replace_info->new_tree); - fprintf (dump_file, - " can't be converted to param "); - print_generic_expr (dump_file, parm); - fprintf (dump_file, "\n"); - } - replace_info->old_tree = NULL; - } - } - } + if (fold_convertible_p (req_type, replace_info->new_tree)) + replace_info->new_tree + = fold_build1 (NOP_EXPR, req_type, replace_info->new_tree); + else if (TYPE_SIZE (req_type) == TYPE_SIZE (new_type)) + replace_info->new_tree + = fold_build1 (VIEW_CONVERT_EXPR, req_type, + replace_info->new_tree); else - gcc_assert (TREE_CODE (replace_info->old_tree) == PARM_DECL); - if (replace_info->old_tree) { - init = setup_one_parameter (&id, 
replace_info->old_tree, - replace_info->new_tree, id.src_fn, - NULL, - &vars); - if (init) - init_stmts.safe_push (init); - if (MAY_HAVE_DEBUG_BIND_STMTS && args_to_skip) + if (dump_file) { - if (parm_num == -1) - { - tree parm; - int p; - for (parm = DECL_ARGUMENTS (old_decl), p = 0; parm; - parm = DECL_CHAIN (parm), p++) - if (parm == replace_info->old_tree) - { - parm_num = p; - break; - } - } - if (parm_num != -1) - { - if (debug_args_to_skip == args_to_skip) - { - debug_args_to_skip = BITMAP_ALLOC (NULL); - bitmap_copy (debug_args_to_skip, args_to_skip); - } - bitmap_clear_bit (debug_args_to_skip, parm_num); - } + fprintf (dump_file, " const "); + print_generic_expr (dump_file, + replace_info->new_tree); + fprintf (dump_file, + " can't be converted to param "); + print_generic_expr (dump_file, parm); + fprintf (dump_file, "\n"); } + old_tree = NULL; } } + + if (old_tree) + { + init = setup_one_parameter (&id, old_tree, replace_info->new_tree, + id.src_fn, NULL, &vars); + if (init) + init_stmts.safe_push (init); + } } - /* Copy the function's arguments. 
*/ - if (DECL_ARGUMENTS (old_decl) != NULL_TREE) + + ipa_param_body_adjustments *param_body_adjs = NULL; + if (param_adjustments) + { + param_body_adjs = new ipa_param_body_adjustments (param_adjustments, + new_decl, old_decl, + true, &id, &vars, + tree_map); + id.param_body_adjs = param_body_adjs; + DECL_ARGUMENTS (new_decl) = param_body_adjs->get_new_param_chain (); + } + else if (DECL_ARGUMENTS (old_decl) != NULL_TREE) DECL_ARGUMENTS (new_decl) - = copy_arguments_for_versioning (DECL_ARGUMENTS (old_decl), &id, - args_to_skip, &vars); + = copy_arguments_nochange (DECL_ARGUMENTS (old_decl), &id); DECL_INITIAL (new_decl) = remap_blocks (DECL_INITIAL (id.src_fn), &id); BLOCK_SUPERCONTEXT (DECL_INITIAL (new_decl)) = new_decl; @@ -5960,7 +6053,8 @@ tree_function_versioning (tree old_decl, tree new_decl, if (DECL_RESULT (old_decl) == NULL_TREE) ; - else if (skip_return && !VOID_TYPE_P (TREE_TYPE (DECL_RESULT (old_decl)))) + else if (param_adjustments && param_adjustments->m_skip_return + && !VOID_TYPE_P (TREE_TYPE (DECL_RESULT (old_decl)))) { DECL_RESULT (new_decl) = build_decl (DECL_SOURCE_LOCATION (DECL_RESULT (old_decl)), @@ -6058,29 +6152,34 @@ tree_function_versioning (tree old_decl, tree new_decl, } } - if (debug_args_to_skip && MAY_HAVE_DEBUG_BIND_STMTS) + /* TODO: I don't quite understand how exactly this is different from what + ipa_param_body_adjustments::reset_debug_stmts statement does but it is + quite a bit different. Which is strange, we should want to do the same + thing. 
*/ + if (param_body_adjs && MAY_HAVE_DEBUG_BIND_STMTS) { - tree parm; vec **debug_args = NULL; unsigned int len = 0; - for (parm = DECL_ARGUMENTS (old_decl), i = 0; - parm; parm = DECL_CHAIN (parm), i++) - if (bitmap_bit_p (debug_args_to_skip, i) && is_gimple_reg (parm)) - { - tree ddecl; + unsigned reset_len = param_body_adjs->m_reset_debug_decls.length (); - if (debug_args == NULL) - { - debug_args = decl_debug_args_insert (new_decl); - len = vec_safe_length (*debug_args); - } - ddecl = make_node (DEBUG_EXPR_DECL); - DECL_ARTIFICIAL (ddecl) = 1; - TREE_TYPE (ddecl) = TREE_TYPE (parm); - SET_DECL_MODE (ddecl, DECL_MODE (parm)); - vec_safe_push (*debug_args, DECL_ORIGIN (parm)); - vec_safe_push (*debug_args, ddecl); - } + for (i = 0; i < reset_len; i++) + { + tree parm = param_body_adjs->m_reset_debug_decls[i]; + gcc_assert (is_gimple_reg (parm)); + tree ddecl; + + if (debug_args == NULL) + { + debug_args = decl_debug_args_insert (new_decl); + len = vec_safe_length (*debug_args); + } + ddecl = make_node (DEBUG_EXPR_DECL); + DECL_ARTIFICIAL (ddecl) = 1; + TREE_TYPE (ddecl) = TREE_TYPE (parm); + SET_DECL_MODE (ddecl, DECL_MODE (parm)); + vec_safe_push (*debug_args, DECL_ORIGIN (parm)); + vec_safe_push (*debug_args, ddecl); + } if (debug_args != NULL) { /* On the callee side, add @@ -6106,7 +6205,7 @@ tree_function_versioning (tree old_decl, tree new_decl, if (var == NULL_TREE) break; vexpr = make_node (DEBUG_EXPR_DECL); - parm = (**debug_args)[i]; + tree parm = (**debug_args)[i]; DECL_ARTIFICIAL (vexpr) = 1; TREE_TYPE (vexpr) = TREE_TYPE (parm); SET_DECL_MODE (vexpr, DECL_MODE (parm)); @@ -6118,9 +6217,7 @@ tree_function_versioning (tree old_decl, tree new_decl, while (i > len); } } - - if (debug_args_to_skip && debug_args_to_skip != args_to_skip) - BITMAP_FREE (debug_args_to_skip); + delete param_body_adjs; free_dominance_info (CDI_DOMINATORS); free_dominance_info (CDI_POST_DOMINATORS); diff --git a/gcc/tree-inline.h b/gcc/tree-inline.h index 
29caab71154..4f44527c67b 100644 --- a/gcc/tree-inline.h +++ b/gcc/tree-inline.h @@ -151,6 +151,15 @@ struct copy_body_data /* A list of addressable local variables remapped into the caller when inlining a call within an OpenMP SIMD-on-SIMT loop. */ vec *dst_simt_vars; + + /* Class managing changes to function parameters and return value planned + during IPA stage. */ + class ipa_param_body_adjustments *param_body_adjs; + + /* Hash set of SSA names that have been killed during call graph edge + redirection and should not be introduced into debug statements, or NULL if + no SSA_NAME was deleted during redirection. */ + hash_set *killed_new_ssa_names; }; /* Weights of constructions for estimate_num_insns. */ @@ -219,6 +228,7 @@ extern bool debug_find_tree (tree, tree); extern tree copy_fn (tree, tree&, tree&); extern const char *copy_forbidden (struct function *fun); extern tree copy_decl_for_dup_finish (copy_body_data *id, tree decl, tree copy); +extern tree copy_decl_to_var (tree, copy_body_data *); /* This is in tree-inline.c since the routine uses data structures from the inliner.
*/ diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h index 9f9d85fdbc3..bcd3fe3305a 100644 --- a/gcc/tree-pass.h +++ b/gcc/tree-pass.h @@ -355,7 +355,6 @@ extern gimple_opt_pass *make_pass_early_tree_profile (gcc::context *ctxt); extern gimple_opt_pass *make_pass_cleanup_eh (gcc::context *ctxt); extern gimple_opt_pass *make_pass_sra (gcc::context *ctxt); extern gimple_opt_pass *make_pass_sra_early (gcc::context *ctxt); -extern gimple_opt_pass *make_pass_early_ipa_sra (gcc::context *ctxt); extern gimple_opt_pass *make_pass_tail_recursion (gcc::context *ctxt); extern gimple_opt_pass *make_pass_tail_calls (gcc::context *ctxt); extern gimple_opt_pass *make_pass_fix_loops (gcc::context *ctxt); @@ -500,6 +499,7 @@ extern ipa_opt_pass_d *make_pass_ipa_inline (gcc::context *ctxt); extern simple_ipa_opt_pass *make_pass_ipa_free_lang_data (gcc::context *ctxt); extern simple_ipa_opt_pass *make_pass_ipa_free_fn_summary (gcc::context *ctxt); extern ipa_opt_pass_d *make_pass_ipa_cp (gcc::context *ctxt); +extern ipa_opt_pass_d *make_pass_ipa_sra (gcc::context *ctxt); extern ipa_opt_pass_d *make_pass_ipa_icf (gcc::context *ctxt); extern ipa_opt_pass_d *make_pass_ipa_devirt (gcc::context *ctxt); extern ipa_opt_pass_d *make_pass_ipa_reference (gcc::context *ctxt); diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c index e3e37466283..d7f75e27953 100644 --- a/gcc/tree-sra.c +++ b/gcc/tree-sra.c @@ -96,15 +96,10 @@ along with GCC; see the file COPYING3. If not see #include "tree-cfg.h" #include "tree-dfa.h" #include "tree-ssa.h" -#include "symbol-summary.h" -#include "ipa-param-manipulation.h" -#include "ipa-prop.h" #include "params.h" #include "dbgcnt.h" -#include "tree-inline.h" -#include "ipa-fnsummary.h" -#include "ipa-utils.h" #include "builtins.h" +#include "tree-sra.h" /* Enumeration of all aggregate reductions we can do. 
*/ enum sra_mode { SRA_MODE_EARLY_IPA, /* early call regularization */ @@ -168,8 +163,7 @@ struct access struct access *first_child; /* In intraprocedural SRA, pointer to the next sibling in the access tree as - described above. In IPA-SRA this is a pointer to the next access - belonging to the same group (having the same representative). */ + described above. */ struct access *next_sibling; /* Pointers to the first and last element in the linked list of assign @@ -184,9 +178,6 @@ struct access when grp_to_be_replaced flag is set. */ tree replacement_decl; - /* Is this access an access to a non-addressable field? */ - unsigned non_addressable : 1; - /* Is this access made in reverse storage order? */ unsigned reverse : 1; @@ -255,19 +246,6 @@ struct access /* Should TREE_NO_WARNING of a replacement be set? */ unsigned grp_no_warning : 1; - - /* Is it possible that the group refers to data which might be (directly or - otherwise) modified? */ - unsigned grp_maybe_modified : 1; - - /* Set when this is a representative of a pointer to scalar (i.e. by - reference) parameter which we consider for turning into a plain scalar - (i.e. a by value parameter). */ - unsigned grp_scalar_ptr : 1; - - /* Set when we discover that this pointer is not safe to dereference in the - caller. */ - unsigned grp_not_necessarilly_dereferenced : 1; }; typedef struct access *access_p; @@ -344,29 +322,6 @@ static struct obstack name_obstack; propagated to their assignment counterparts. */ static struct access *work_queue_head; -/* Number of parameters of the analyzed function when doing early ipa SRA. */ -static int func_param_count; - -/* scan_function sets the following to true if it encounters a call to - __builtin_apply_args. */ -static bool encountered_apply_args; - -/* Set by scan_function when it finds a recursive call. */ -static bool encountered_recursive_call; - -/* Set by scan_function when it finds a recursive call with less actual - arguments than formal parameters.. 
*/ -static bool encountered_unchangable_recursive_call; - -/* This is a table in which for each basic block and parameter there is a - distance (offset + size) in that parameter which is dereferenced and - accessed in that BB. */ -static HOST_WIDE_INT *bb_dereferences; -/* Bitmap of BBs that can cause the function to "stop" progressing by - returning, throwing externally, looping infinitely or calling a function - which might abort etc.. */ -static bitmap final_bbs; - /* Representative of no accesses at all. */ static struct access no_accesses_representant; @@ -435,8 +390,7 @@ dump_access (FILE *f, struct access *access, bool grp) print_generic_expr (f, access->expr); fprintf (f, ", type = "); print_generic_expr (f, access->type); - fprintf (f, ", non_addressable = %d, reverse = %d", - access->non_addressable, access->reverse); + fprintf (f, ", reverse = %d", access->reverse); if (grp) fprintf (f, ", grp_read = %d, grp_write = %d, grp_assignment_read = %d, " "grp_assignment_write = %d, grp_scalar_read = %d, " @@ -444,16 +398,14 @@ dump_access (FILE *f, struct access *access, bool grp) "grp_hint = %d, grp_covered = %d, " "grp_unscalarizable_region = %d, grp_unscalarized_data = %d, " "grp_partial_lhs = %d, grp_to_be_replaced = %d, " - "grp_to_be_debug_replaced = %d, grp_maybe_modified = %d, " - "grp_not_necessarilly_dereferenced = %d\n", + "grp_to_be_debug_replaced = %d\n", access->grp_read, access->grp_write, access->grp_assignment_read, access->grp_assignment_write, access->grp_scalar_read, access->grp_scalar_write, access->grp_total_scalarization, access->grp_hint, access->grp_covered, access->grp_unscalarizable_region, access->grp_unscalarized_data, access->grp_partial_lhs, access->grp_to_be_replaced, - access->grp_to_be_debug_replaced, access->grp_maybe_modified, - access->grp_not_necessarilly_dereferenced); + access->grp_to_be_debug_replaced); else fprintf (f, ", write = %d, grp_total_scalarization = %d, " "grp_partial_lhs = %d\n", @@ -665,9 +617,6 @@ 
sra_initialize (void) gcc_obstack_init (&name_obstack); base_access_vec = new hash_map >; memset (&sra_stats, 0, sizeof (sra_stats)); - encountered_apply_args = false; - encountered_recursive_call = false; - encountered_unchangable_recursive_call = false; } /* Deallocate all general structures. */ @@ -715,14 +664,22 @@ disqualify_candidate (tree decl, const char *reason) } /* Return true iff the type contains a field or an element which does not allow - scalarization. */ + scalarization. Before doing anything however, decrement *TTL and bail out + if it is zero. */ -static bool -type_internals_preclude_sra_p (tree type, const char **msg) +bool +type_internals_preclude_sra_p (tree type, const char **msg, unsigned *ttl) { tree fld; tree et; + --*ttl; + if (!*ttl) + { + *msg = "too many nested types to check"; + return true; + } + switch (TREE_CODE (type)) { case RECORD_TYPE: @@ -731,6 +688,8 @@ type_internals_preclude_sra_p (tree type, const char **msg) for (fld = TYPE_FIELDS (type); fld; fld = DECL_CHAIN (fld)) if (TREE_CODE (fld) == FIELD_DECL) { + if (TREE_CODE (fld) == FUNCTION_DECL) + continue; tree ft = TREE_TYPE (fld); if (TREE_THIS_VOLATILE (fld)) @@ -770,7 +729,8 @@ type_internals_preclude_sra_p (tree type, const char **msg) return true; } - if (AGGREGATE_TYPE_P (ft) && type_internals_preclude_sra_p (ft, msg)) + if (AGGREGATE_TYPE_P (ft) && type_internals_preclude_sra_p (ft, msg, + ttl)) return true; } @@ -785,7 +745,7 @@ type_internals_preclude_sra_p (tree type, const char **msg) return true; } - if (AGGREGATE_TYPE_P (et) && type_internals_preclude_sra_p (et, msg)) + if (AGGREGATE_TYPE_P (et) && type_internals_preclude_sra_p (et, msg, ttl)) return true; return false; @@ -795,48 +755,6 @@ type_internals_preclude_sra_p (tree type, const char **msg) } } -/* If T is an SSA_NAME, return NULL if it is not a default def or return its - base variable if it is. Return T if it is not an SSA_NAME. 
*/ - -static tree -get_ssa_base_param (tree t) -{ - if (TREE_CODE (t) == SSA_NAME) - { - if (SSA_NAME_IS_DEFAULT_DEF (t)) - return SSA_NAME_VAR (t); - else - return NULL_TREE; - } - return t; -} - -/* Mark a dereference of BASE of distance DIST in a basic block tht STMT - belongs to, unless the BB has already been marked as a potentially - final. */ - -static void -mark_parm_dereference (tree base, HOST_WIDE_INT dist, gimple *stmt) -{ - basic_block bb = gimple_bb (stmt); - int idx, parm_index = 0; - tree parm; - - if (bitmap_bit_p (final_bbs, bb->index)) - return; - - for (parm = DECL_ARGUMENTS (current_function_decl); - parm && parm != base; - parm = DECL_CHAIN (parm)) - parm_index++; - - gcc_assert (parm_index < func_param_count); - - idx = bb->index * func_param_count + parm_index; - if (bb_dereferences[idx] < dist) - bb_dereferences[idx] = dist; -} - /* Allocate an access structure for BASE, OFFSET and SIZE, clear it, fill in the three fields. Also add it to the vector of accesses corresponding to the base. Finally, return the new access. */ @@ -869,7 +787,7 @@ create_access (tree expr, gimple *stmt, bool write) poly_int64 poffset, psize, pmax_size; HOST_WIDE_INT offset, size, max_size; tree base = expr; - bool reverse, ptr, unscalarizable_region = false; + bool reverse, unscalarizable_region = false; base = get_ref_base_and_extent (expr, &poffset, &psize, &pmax_size, &reverse); @@ -881,20 +799,8 @@ create_access (tree expr, gimple *stmt, bool write) return NULL; } - if (sra_mode == SRA_MODE_EARLY_IPA - && TREE_CODE (base) == MEM_REF) - { - base = get_ssa_base_param (TREE_OPERAND (base, 0)); - if (!base) - return NULL; - ptr = true; - } - else - ptr = false; - /* For constant-pool entries, check we can substitute the constant value. 
*/ - if (constant_decl_p (base) - && (sra_mode == SRA_MODE_EARLY_INTRA || sra_mode == SRA_MODE_INTRA)) + if (constant_decl_p (base)) { gcc_assert (!bitmap_bit_p (disqualified_constants, DECL_UID (base))); if (expr != base @@ -914,36 +820,15 @@ create_access (tree expr, gimple *stmt, bool write) if (!DECL_P (base) || !bitmap_bit_p (candidate_bitmap, DECL_UID (base))) return NULL; - if (sra_mode == SRA_MODE_EARLY_IPA) + if (size != max_size) { - if (size < 0 || size != max_size) - { - disqualify_candidate (base, "Encountered a variable sized access."); - return NULL; - } - if (TREE_CODE (expr) == COMPONENT_REF - && DECL_BIT_FIELD (TREE_OPERAND (expr, 1))) - { - disqualify_candidate (base, "Encountered a bit-field access."); - return NULL; - } - gcc_checking_assert ((offset % BITS_PER_UNIT) == 0); - - if (ptr) - mark_parm_dereference (base, offset + size, stmt); + size = max_size; + unscalarizable_region = true; } - else + if (size < 0) { - if (size != max_size) - { - size = max_size; - unscalarizable_region = true; - } - if (size < 0) - { - disqualify_candidate (base, "Encountered an unconstrained access."); - return NULL; - } + disqualify_candidate (base, "Encountered an unconstrained access."); + return NULL; } access = create_access_1 (base, offset, size); @@ -954,10 +839,6 @@ create_access (tree expr, gimple *stmt, bool write) access->stmt = stmt; access->reverse = reverse; - if (TREE_CODE (expr) == COMPONENT_REF - && DECL_NONADDRESSABLE_P (TREE_OPERAND (expr, 1))) - access->non_addressable = 1; - return access; } @@ -1184,10 +1065,6 @@ static void disqualify_base_of_expr (tree t, const char *reason) { t = get_base_address (t); - if (sra_mode == SRA_MODE_EARLY_IPA - && TREE_CODE (t) == MEM_REF) - t = get_ssa_base_param (TREE_OPERAND (t, 0)); - if (t && DECL_P (t)) disqualify_candidate (t, reason); } @@ -1240,8 +1117,7 @@ build_access_from_expr_1 (tree expr, gimple *stmt, bool write) switch (TREE_CODE (expr)) { case MEM_REF: - if (TREE_CODE (TREE_OPERAND (expr, 
0)) != ADDR_EXPR - && sra_mode != SRA_MODE_EARLY_IPA) + if (TREE_CODE (TREE_OPERAND (expr, 0)) != ADDR_EXPR) return NULL; /* fall through */ case VAR_DECL: @@ -1315,8 +1191,7 @@ single_non_eh_succ (basic_block bb) static bool disqualify_if_bad_bb_terminating_stmt (gimple *stmt, tree lhs, tree rhs) { - if ((sra_mode == SRA_MODE_EARLY_INTRA || sra_mode == SRA_MODE_INTRA) - && stmt_ends_bb_p (stmt)) + if (stmt_ends_bb_p (stmt)) { if (single_non_eh_succ (gimple_bb (stmt))) return false; @@ -1429,29 +1304,6 @@ asm_visit_addr (gimple *, tree op, tree, void *) return false; } -/* Return true iff callsite CALL has at least as many actual arguments as there - are formal parameters of the function currently processed by IPA-SRA and - that their types match. */ - -static inline bool -callsite_arguments_match_p (gimple *call) -{ - if (gimple_call_num_args (call) < (unsigned) func_param_count) - return false; - - tree parm; - int i; - for (parm = DECL_ARGUMENTS (current_function_decl), i = 0; - parm; - parm = DECL_CHAIN (parm), i++) - { - tree arg = gimple_call_arg (call, i); - if (!useless_type_conversion_p (TREE_TYPE (parm), TREE_TYPE (arg))) - return false; - } - return true; -} - /* Scan function and look for interesting expressions and create access structures for them. Return true iff any access is created. 
*/ @@ -1470,16 +1322,12 @@ scan_function (void) tree t; unsigned i; - if (final_bbs && stmt_can_throw_external (cfun, stmt)) - bitmap_set_bit (final_bbs, bb->index); switch (gimple_code (stmt)) { case GIMPLE_RETURN: t = gimple_return_retval (as_a <greturn *> (stmt)); if (t != NULL_TREE) ret |= build_access_from_expr (t, stmt, false); - if (final_bbs) - bitmap_set_bit (final_bbs, bb->index); break; case GIMPLE_ASSIGN: @@ -1491,28 +1339,6 @@ scan_function (void) ret |= build_access_from_expr (gimple_call_arg (stmt, i), stmt, false); - if (sra_mode == SRA_MODE_EARLY_IPA) - { - tree dest = gimple_call_fndecl (stmt); - int flags = gimple_call_flags (stmt); - - if (dest) - { - if (fndecl_built_in_p (dest, BUILT_IN_APPLY_ARGS)) - encountered_apply_args = true; - if (recursive_call_p (current_function_decl, dest)) - { - encountered_recursive_call = true; - if (!callsite_arguments_match_p (stmt)) - encountered_unchangable_recursive_call = true; - } - } - - if (final_bbs - && (flags & (ECF_CONST | ECF_PURE)) == 0) - bitmap_set_bit (final_bbs, bb->index); - } - t = gimple_call_lhs (stmt); if (t && !disqualify_if_bad_bb_terminating_stmt (stmt, t, NULL)) ret |= build_access_from_expr (t, stmt, true); @@ -1523,9 +1349,6 @@ scan_function (void) gasm *asm_stmt = as_a <gasm *> (stmt); walk_stmt_load_store_addr_ops (asm_stmt, NULL, NULL, NULL, asm_visit_addr); - if (final_bbs) - bitmap_set_bit (final_bbs, bb->index); - for (i = 0; i < gimple_asm_ninputs (asm_stmt); i++) { t = TREE_VALUE (gimple_asm_input_op (asm_stmt, i)); @@ -1940,14 +1763,6 @@ build_user_friendly_ref_for_offset (tree *res, tree type, HOST_WIDE_INT offset, } } -/* Return true iff TYPE is stdarg va_list type. */ - -static inline bool -is_va_list_type (tree type) -{ - return TYPE_MAIN_VARIANT (type) == TYPE_MAIN_VARIANT (va_list_type_node); -} - /* Print message to dump file why a variable was rejected.
*/ static void @@ -1975,10 +1790,8 @@ maybe_add_sra_candidate (tree var) reject (var, "not aggregate"); return false; } - /* Allow constant-pool entries (that "need to live in memory") - unless we are doing IPA SRA. */ - if (needs_to_live_in_memory (var) - && (sra_mode == SRA_MODE_EARLY_IPA || !constant_decl_p (var))) + /* Allow constant-pool entries that "need to live in memory". */ + if (needs_to_live_in_memory (var) && !constant_decl_p (var)) { reject (var, "needs to live in memory"); return false; @@ -2003,7 +1816,8 @@ maybe_add_sra_candidate (tree var) reject (var, "type size is zero"); return false; } - if (type_internals_preclude_sra_p (type, &msg)) + unsigned ttl = PARAM_VALUE (PARAM_SRA_MAX_TYPE_CHECK_STEPS); + if (type_internals_preclude_sra_p (type, &msg, &ttl)) { reject (var, msg); return false; @@ -4032,1609 +3846,3 @@ make_pass_sra (gcc::context *ctxt) { return new pass_sra (ctxt); } - - -/* Return true iff PARM (which must be a parm_decl) is an unused scalar - parameter. */ - -static bool -is_unused_scalar_param (tree parm) -{ - tree name; - return (is_gimple_reg (parm) - && (!(name = ssa_default_def (cfun, parm)) - || has_zero_uses (name))); -} - -/* Scan immediate uses of a default definition SSA name of a parameter PARM and - examine whether there are any direct or otherwise infeasible ones. If so, - return true, otherwise return false. PARM must be a gimple register with a - non-NULL default definition. */ - -static bool -ptr_parm_has_direct_uses (tree parm) -{ - imm_use_iterator ui; - gimple *stmt; - tree name = ssa_default_def (cfun, parm); - bool ret = false; - - FOR_EACH_IMM_USE_STMT (stmt, ui, name) - { - int uses_ok = 0; - use_operand_p use_p; - - if (is_gimple_debug (stmt)) - continue; - - /* Valid uses include dereferences on the lhs and the rhs. 
*/ - if (gimple_has_lhs (stmt)) - { - tree lhs = gimple_get_lhs (stmt); - while (handled_component_p (lhs)) - lhs = TREE_OPERAND (lhs, 0); - if (TREE_CODE (lhs) == MEM_REF - && TREE_OPERAND (lhs, 0) == name - && integer_zerop (TREE_OPERAND (lhs, 1)) - && types_compatible_p (TREE_TYPE (lhs), - TREE_TYPE (TREE_TYPE (name))) - && !TREE_THIS_VOLATILE (lhs)) - uses_ok++; - } - if (gimple_assign_single_p (stmt)) - { - tree rhs = gimple_assign_rhs1 (stmt); - while (handled_component_p (rhs)) - rhs = TREE_OPERAND (rhs, 0); - if (TREE_CODE (rhs) == MEM_REF - && TREE_OPERAND (rhs, 0) == name - && integer_zerop (TREE_OPERAND (rhs, 1)) - && types_compatible_p (TREE_TYPE (rhs), - TREE_TYPE (TREE_TYPE (name))) - && !TREE_THIS_VOLATILE (rhs)) - uses_ok++; - } - else if (is_gimple_call (stmt)) - { - unsigned i; - for (i = 0; i < gimple_call_num_args (stmt); ++i) - { - tree arg = gimple_call_arg (stmt, i); - while (handled_component_p (arg)) - arg = TREE_OPERAND (arg, 0); - if (TREE_CODE (arg) == MEM_REF - && TREE_OPERAND (arg, 0) == name - && integer_zerop (TREE_OPERAND (arg, 1)) - && types_compatible_p (TREE_TYPE (arg), - TREE_TYPE (TREE_TYPE (name))) - && !TREE_THIS_VOLATILE (arg)) - uses_ok++; - } - } - - /* If the number of valid uses does not match the number of - uses in this stmt there is an unhandled use. */ - FOR_EACH_IMM_USE_ON_STMT (use_p, ui) - --uses_ok; - - if (uses_ok != 0) - ret = true; - - if (ret) - BREAK_FROM_IMM_USE_STMT (ui); - } - - return ret; -} - -/* Identify candidates for reduction for IPA-SRA based on their type and mark - them in candidate_bitmap. Note that these do not necessarily include - parameter which are unused and thus can be removed. Return true iff any - such candidate has been found. 
*/ - -static bool -find_param_candidates (void) -{ - tree parm; - int count = 0; - bool ret = false; - const char *msg; - - for (parm = DECL_ARGUMENTS (current_function_decl); - parm; - parm = DECL_CHAIN (parm)) - { - tree type = TREE_TYPE (parm); - tree_node **slot; - - count++; - - if (TREE_THIS_VOLATILE (parm) - || TREE_ADDRESSABLE (parm) - || (!is_gimple_reg_type (type) && is_va_list_type (type))) - continue; - - if (is_unused_scalar_param (parm)) - { - ret = true; - continue; - } - - if (POINTER_TYPE_P (type)) - { - type = TREE_TYPE (type); - - if (TREE_CODE (type) == FUNCTION_TYPE - || TYPE_VOLATILE (type) - || (TREE_CODE (type) == ARRAY_TYPE - && TYPE_NONALIASED_COMPONENT (type)) - || !is_gimple_reg (parm) - || is_va_list_type (type) - || ptr_parm_has_direct_uses (parm)) - continue; - } - else if (!AGGREGATE_TYPE_P (type)) - continue; - - if (!COMPLETE_TYPE_P (type) - || !tree_fits_uhwi_p (TYPE_SIZE (type)) - || tree_to_uhwi (TYPE_SIZE (type)) == 0 - || (AGGREGATE_TYPE_P (type) - && type_internals_preclude_sra_p (type, &msg))) - continue; - - bitmap_set_bit (candidate_bitmap, DECL_UID (parm)); - slot = candidates->find_slot_with_hash (parm, DECL_UID (parm), INSERT); - *slot = parm; - - ret = true; - if (dump_file && (dump_flags & TDF_DETAILS)) - { - fprintf (dump_file, "Candidate (%d): ", DECL_UID (parm)); - print_generic_expr (dump_file, parm); - fprintf (dump_file, "\n"); - } - } - - func_param_count = count; - return ret; -} - -/* Callback of walk_aliased_vdefs, marks the access passed as DATA as - maybe_modified. */ - -static bool -mark_maybe_modified (ao_ref *ao ATTRIBUTE_UNUSED, tree vdef ATTRIBUTE_UNUSED, - void *data) -{ - struct access *repr = (struct access *) data; - - repr->grp_maybe_modified = 1; - return true; -} - -/* Analyze what representatives (in linked lists accessible from - REPRESENTATIVES) can be modified by side effects of statements in the - current function. 
*/ - -static void -analyze_modified_params (vec<access_p> representatives) -{ - int i; - - for (i = 0; i < func_param_count; i++) - { - struct access *repr; - - for (repr = representatives[i]; - repr; - repr = repr->next_grp) - { - struct access *access; - bitmap visited; - ao_ref ar; - - if (no_accesses_p (repr)) - continue; - if (!POINTER_TYPE_P (TREE_TYPE (repr->base)) - || repr->grp_maybe_modified) - continue; - - ao_ref_init (&ar, repr->expr); - visited = BITMAP_ALLOC (NULL); - for (access = repr; access; access = access->next_sibling) - { - /* All accesses are read ones, otherwise grp_maybe_modified would - be trivially set. */ - walk_aliased_vdefs (&ar, gimple_vuse (access->stmt), - mark_maybe_modified, repr, &visited); - if (repr->grp_maybe_modified) - break; - } - BITMAP_FREE (visited); - } - } -} - -/* Propagate distances in bb_dereferences in the opposite direction than the - control flow edges, in each step storing the maximum of the current value - and the minimum of all successors. These steps are repeated until the table - stabilizes. Note that BBs which might terminate the functions (according to - final_bbs bitmap) never updated in this way.
*/ - -static void -propagate_dereference_distances (void) -{ - basic_block bb; - - auto_vec<basic_block> queue (last_basic_block_for_fn (cfun)); - queue.quick_push (ENTRY_BLOCK_PTR_FOR_FN (cfun)); - FOR_EACH_BB_FN (bb, cfun) - { - queue.quick_push (bb); - bb->aux = bb; - } - - while (!queue.is_empty ()) - { - edge_iterator ei; - edge e; - bool change = false; - int i; - - bb = queue.pop (); - bb->aux = NULL; - - if (bitmap_bit_p (final_bbs, bb->index)) - continue; - - for (i = 0; i < func_param_count; i++) - { - int idx = bb->index * func_param_count + i; - bool first = true; - HOST_WIDE_INT inh = 0; - - FOR_EACH_EDGE (e, ei, bb->succs) - { - int succ_idx = e->dest->index * func_param_count + i; - - if (e->src == EXIT_BLOCK_PTR_FOR_FN (cfun)) - continue; - - if (first) - { - first = false; - inh = bb_dereferences [succ_idx]; - } - else if (bb_dereferences [succ_idx] < inh) - inh = bb_dereferences [succ_idx]; - } - - if (!first && bb_dereferences[idx] < inh) - { - bb_dereferences[idx] = inh; - change = true; - } - } - - if (change && !bitmap_bit_p (final_bbs, bb->index)) - FOR_EACH_EDGE (e, ei, bb->preds) - { - if (e->src->aux) - continue; - - e->src->aux = e->src; - queue.quick_push (e->src); - } - } -} - -/* Dump a dereferences TABLE with heading STR to file F.
*/ - -static void -dump_dereferences_table (FILE *f, const char *str, HOST_WIDE_INT *table) -{ - basic_block bb; - - fprintf (dump_file, "%s", str); - FOR_BB_BETWEEN (bb, ENTRY_BLOCK_PTR_FOR_FN (cfun), - EXIT_BLOCK_PTR_FOR_FN (cfun), next_bb) - { - fprintf (f, "%4i %i ", bb->index, bitmap_bit_p (final_bbs, bb->index)); - if (bb != EXIT_BLOCK_PTR_FOR_FN (cfun)) - { - int i; - for (i = 0; i < func_param_count; i++) - { - int idx = bb->index * func_param_count + i; - fprintf (f, " %4" HOST_WIDE_INT_PRINT "d", table[idx]); - } - } - fprintf (f, "\n"); - } - fprintf (dump_file, "\n"); -} - -/* Determine what (parts of) parameters passed by reference that are not - assigned to are not certainly dereferenced in this function and thus the - dereferencing cannot be safely moved to the caller without potentially - introducing a segfault. Mark such REPRESENTATIVES as - grp_not_necessarilly_dereferenced. - - The dereferenced maximum "distance," i.e. the offset + size of the accessed - part is calculated rather than simple booleans are calculated for each - pointer parameter to handle cases when only a fraction of the whole - aggregate is allocated (see testsuite/gcc.c-torture/execute/ipa-sra-2.c for - an example). - - The maximum dereference distances for each pointer parameter and BB are - already stored in bb_dereference. This routine simply propagates these - values upwards by propagate_dereference_distances and then compares the - distances of individual parameters in the ENTRY BB to the equivalent - distances of each representative of a (fraction of a) parameter. 
*/ - -static void -analyze_caller_dereference_legality (vec representatives) -{ - int i; - - if (dump_file && (dump_flags & TDF_DETAILS)) - dump_dereferences_table (dump_file, - "Dereference table before propagation:\n", - bb_dereferences); - - propagate_dereference_distances (); - - if (dump_file && (dump_flags & TDF_DETAILS)) - dump_dereferences_table (dump_file, - "Dereference table after propagation:\n", - bb_dereferences); - - for (i = 0; i < func_param_count; i++) - { - struct access *repr = representatives[i]; - int idx = ENTRY_BLOCK_PTR_FOR_FN (cfun)->index * func_param_count + i; - - if (!repr || no_accesses_p (repr)) - continue; - - do - { - if ((repr->offset + repr->size) > bb_dereferences[idx]) - repr->grp_not_necessarilly_dereferenced = 1; - repr = repr->next_grp; - } - while (repr); - } -} - -/* Return the representative access for the parameter declaration PARM if it is - a scalar passed by reference which is not written to and the pointer value - is not used directly. Thus, if it is legal to dereference it in the caller - and we can rule out modifications through aliases, such parameter should be - turned into one passed by value. Return NULL otherwise. */ - -static struct access * -unmodified_by_ref_scalar_representative (tree parm) -{ - int i, access_count; - struct access *repr; - vec *access_vec; - - access_vec = get_base_access_vector (parm); - gcc_assert (access_vec); - repr = (*access_vec)[0]; - if (repr->write) - return NULL; - repr->group_representative = repr; - - access_count = access_vec->length (); - for (i = 1; i < access_count; i++) - { - struct access *access = (*access_vec)[i]; - if (access->write) - return NULL; - access->group_representative = repr; - access->next_sibling = repr->next_sibling; - repr->next_sibling = access; - } - - repr->grp_read = 1; - repr->grp_scalar_ptr = 1; - return repr; -} - -/* Return true iff this ACCESS precludes IPA-SRA of the parameter it is - associated with. 
REQ_ALIGN is the minimum required alignment. */ - -static bool -access_precludes_ipa_sra_p (struct access *access, unsigned int req_align) -{ - unsigned int exp_align; - /* Avoid issues such as the second simple testcase in PR 42025. The problem - is incompatible assign in a call statement (and possibly even in asm - statements). This can be relaxed by using a new temporary but only for - non-TREE_ADDRESSABLE types and is probably not worth the complexity. (In - intraprocedural SRA we deal with this by keeping the old aggregate around, - something we cannot do in IPA-SRA.) */ - if (access->write - && (is_gimple_call (access->stmt) - || gimple_code (access->stmt) == GIMPLE_ASM)) - return true; - - exp_align = get_object_alignment (access->expr); - if (exp_align < req_align) - return true; - - return false; -} - - -/* Sort collected accesses for parameter PARM, identify representatives for - each accessed region and link them together. Return NULL if there are - different but overlapping accesses, return the special ptr value meaning - there are no accesses for this parameter if that is the case and return the - first representative otherwise. Set *RO_GRP if there is a group of accesses - with only read (i.e. no write) accesses. 
*/ - -static struct access * -splice_param_accesses (tree parm, bool *ro_grp) -{ - int i, j, access_count, group_count; - int total_size = 0; - struct access *access, *res, **prev_acc_ptr = &res; - vec *access_vec; - - access_vec = get_base_access_vector (parm); - if (!access_vec) - return &no_accesses_representant; - access_count = access_vec->length (); - - access_vec->qsort (compare_access_positions); - - i = 0; - total_size = 0; - group_count = 0; - while (i < access_count) - { - bool modification; - tree a1_alias_type; - access = (*access_vec)[i]; - modification = access->write; - if (access_precludes_ipa_sra_p (access, TYPE_ALIGN (access->type))) - return NULL; - a1_alias_type = reference_alias_ptr_type (access->expr); - - /* Access is about to become group representative unless we find some - nasty overlap which would preclude us from breaking this parameter - apart. */ - - j = i + 1; - while (j < access_count) - { - struct access *ac2 = (*access_vec)[j]; - if (ac2->offset != access->offset) - { - /* All or nothing law for parameters. */ - if (access->offset + access->size > ac2->offset) - return NULL; - else - break; - } - else if (ac2->size != access->size) - return NULL; - - if (access_precludes_ipa_sra_p (ac2, TYPE_ALIGN (access->type)) - || (ac2->type != access->type - && (TREE_ADDRESSABLE (ac2->type) - || TREE_ADDRESSABLE (access->type))) - || (reference_alias_ptr_type (ac2->expr) != a1_alias_type)) - return NULL; - - modification |= ac2->write; - ac2->group_representative = access; - ac2->next_sibling = access->next_sibling; - access->next_sibling = ac2; - j++; - } - - group_count++; - access->grp_maybe_modified = modification; - if (!modification) - *ro_grp = true; - *prev_acc_ptr = access; - prev_acc_ptr = &access->next_grp; - total_size += access->size; - i = j; - } - - gcc_assert (group_count > 0); - return res; -} - -/* Decide whether parameters with representative accesses given by REPR should - be reduced into components. 
*/ - -static int -decide_one_param_reduction (struct access *repr) -{ - HOST_WIDE_INT total_size, cur_parm_size; - bool by_ref; - tree parm; - - parm = repr->base; - cur_parm_size = tree_to_uhwi (TYPE_SIZE (TREE_TYPE (parm))); - gcc_assert (cur_parm_size > 0); - - if (POINTER_TYPE_P (TREE_TYPE (parm))) - by_ref = true; - else - by_ref = false; - - if (dump_file) - { - struct access *acc; - fprintf (dump_file, "Evaluating PARAM group sizes for "); - print_generic_expr (dump_file, parm); - fprintf (dump_file, " (UID: %u): \n", DECL_UID (parm)); - for (acc = repr; acc; acc = acc->next_grp) - dump_access (dump_file, acc, true); - } - - total_size = 0; - int new_param_count = 0; - - for (; repr; repr = repr->next_grp) - { - gcc_assert (parm == repr->base); - - /* Taking the address of a non-addressable field is verboten. */ - if (by_ref && repr->non_addressable) - return 0; - - /* Do not decompose a non-BLKmode param in a way that would - create BLKmode params. Especially for by-reference passing - (thus, pointer-type param) this is hardly worthwhile. 
*/ - if (DECL_MODE (parm) != BLKmode - && TYPE_MODE (repr->type) == BLKmode) - return 0; - - if (!by_ref || (!repr->grp_maybe_modified - && !repr->grp_not_necessarilly_dereferenced)) - total_size += repr->size; - else - total_size += cur_parm_size; - - new_param_count++; - } - - gcc_assert (new_param_count > 0); - - if (!by_ref) - { - if (total_size >= cur_parm_size) - return 0; - } - else - { - int parm_num_limit; - if (optimize_function_for_size_p (cfun)) - parm_num_limit = 1; - else - parm_num_limit = PARAM_VALUE (PARAM_IPA_SRA_PTR_GROWTH_FACTOR); - - if (new_param_count > parm_num_limit - || total_size > (parm_num_limit * cur_parm_size)) - return 0; - } - - if (dump_file) - fprintf (dump_file, " ....will be split into %i components\n", - new_param_count); - return new_param_count; -} - -/* The order of the following enums is important, we need to do extra work for - UNUSED_PARAMS, BY_VAL_ACCESSES and UNMODIF_BY_REF_ACCESSES. */ -enum ipa_splicing_result { NO_GOOD_ACCESS, UNUSED_PARAMS, BY_VAL_ACCESSES, - MODIF_BY_REF_ACCESSES, UNMODIF_BY_REF_ACCESSES }; - -/* Identify representatives of all accesses to all candidate parameters for - IPA-SRA. Return result based on what representatives have been found. 
*/ - -static enum ipa_splicing_result -splice_all_param_accesses (vec<access_p> &representatives) -{ - enum ipa_splicing_result result = NO_GOOD_ACCESS; - tree parm; - struct access *repr; - - representatives.create (func_param_count); - - for (parm = DECL_ARGUMENTS (current_function_decl); - parm; - parm = DECL_CHAIN (parm)) - { - if (is_unused_scalar_param (parm)) - { - representatives.quick_push (&no_accesses_representant); - if (result == NO_GOOD_ACCESS) - result = UNUSED_PARAMS; - } - else if (POINTER_TYPE_P (TREE_TYPE (parm)) - && is_gimple_reg_type (TREE_TYPE (TREE_TYPE (parm))) - && bitmap_bit_p (candidate_bitmap, DECL_UID (parm))) - { - repr = unmodified_by_ref_scalar_representative (parm); - representatives.quick_push (repr); - if (repr) - result = UNMODIF_BY_REF_ACCESSES; - } - else if (bitmap_bit_p (candidate_bitmap, DECL_UID (parm))) - { - bool ro_grp = false; - repr = splice_param_accesses (parm, &ro_grp); - representatives.quick_push (repr); - - if (repr && !no_accesses_p (repr)) - { - if (POINTER_TYPE_P (TREE_TYPE (parm))) - { - if (ro_grp) - result = UNMODIF_BY_REF_ACCESSES; - else if (result < MODIF_BY_REF_ACCESSES) - result = MODIF_BY_REF_ACCESSES; - } - else if (result < BY_VAL_ACCESSES) - result = BY_VAL_ACCESSES; - } - else if (no_accesses_p (repr) && (result == NO_GOOD_ACCESS)) - result = UNUSED_PARAMS; - } - else - representatives.quick_push (NULL); - } - - if (result == NO_GOOD_ACCESS) - { - representatives.release (); - return NO_GOOD_ACCESS; - } - - return result; -} - -/* Return the index of BASE in PARMS. Abort if it is not found. */ - -static inline int -get_param_index (tree base, vec<tree> parms) -{ - int i, len; - - len = parms.length (); - for (i = 0; i < len; i++) - if (parms[i] == base) - return i; - gcc_unreachable (); -} - -/* Convert the decisions made at the representative level into compact - parameter adjustments.
REPRESENTATIVES are pointers to first - representatives of each param accesses, ADJUSTMENTS_COUNT is the expected - final number of adjustments. */ - -static ipa_parm_adjustment_vec -turn_representatives_into_adjustments (vec<access_p> representatives, - int adjustments_count) -{ - vec<tree> parms; - ipa_parm_adjustment_vec adjustments; - tree parm; - int i; - - gcc_assert (adjustments_count > 0); - parms = ipa_get_vector_of_formal_parms (current_function_decl); - adjustments.create (adjustments_count); - parm = DECL_ARGUMENTS (current_function_decl); - for (i = 0; i < func_param_count; i++, parm = DECL_CHAIN (parm)) - { - struct access *repr = representatives[i]; - - if (!repr || no_accesses_p (repr)) - { - struct ipa_parm_adjustment adj; - - memset (&adj, 0, sizeof (adj)); - adj.base_index = get_param_index (parm, parms); - adj.base = parm; - if (!repr) - adj.op = IPA_PARM_OP_COPY; - else - adj.op = IPA_PARM_OP_REMOVE; - adj.arg_prefix = "ISRA"; - adjustments.quick_push (adj); - } - else - { - struct ipa_parm_adjustment adj; - int index = get_param_index (parm, parms); - - for (; repr; repr = repr->next_grp) - { - memset (&adj, 0, sizeof (adj)); - gcc_assert (repr->base == parm); - adj.base_index = index; - adj.base = repr->base; - adj.type = repr->type; - adj.alias_ptr_type = reference_alias_ptr_type (repr->expr); - adj.offset = repr->offset; - adj.reverse = repr->reverse; - adj.by_ref = (POINTER_TYPE_P (TREE_TYPE (repr->base)) - && (repr->grp_maybe_modified - || repr->grp_not_necessarilly_dereferenced)); - adj.arg_prefix = "ISRA"; - adjustments.quick_push (adj); - } - } - } - parms.release (); - return adjustments; -} - -/* Analyze the collected accesses and produce a plan what to do with the - parameters in the form of adjustments, NULL meaning nothing.
*/ - -static ipa_parm_adjustment_vec -analyze_all_param_acesses (void) -{ - enum ipa_splicing_result repr_state; - bool proceed = false; - int i, adjustments_count = 0; - vec<access_p> representatives; - ipa_parm_adjustment_vec adjustments; - - repr_state = splice_all_param_accesses (representatives); - if (repr_state == NO_GOOD_ACCESS) - return ipa_parm_adjustment_vec (); - - /* If there are any parameters passed by reference which are not modified - directly, we need to check whether they can be modified indirectly. */ - if (repr_state == UNMODIF_BY_REF_ACCESSES) - { - analyze_caller_dereference_legality (representatives); - analyze_modified_params (representatives); - } - - for (i = 0; i < func_param_count; i++) - { - struct access *repr = representatives[i]; - - if (repr && !no_accesses_p (repr)) - { - if (repr->grp_scalar_ptr) - { - adjustments_count++; - if (repr->grp_not_necessarilly_dereferenced - || repr->grp_maybe_modified) - representatives[i] = NULL; - else - { - proceed = true; - sra_stats.scalar_by_ref_to_by_val++; - } - } - else - { - int new_components = decide_one_param_reduction (repr); - - if (new_components == 0) - { - representatives[i] = NULL; - adjustments_count++; - } - else - { - adjustments_count += new_components; - sra_stats.aggregate_params_reduced++; - sra_stats.param_reductions_created += new_components; - proceed = true; - } - } - } - else - { - if (no_accesses_p (repr)) - { - proceed = true; - sra_stats.deleted_unused_parameters++; - } - adjustments_count++; - } - } - - if (!proceed && dump_file) - fprintf (dump_file, "NOT proceeding to change params.\n"); - - if (proceed) - adjustments = turn_representatives_into_adjustments (representatives, - adjustments_count); - else - adjustments = ipa_parm_adjustment_vec (); - - representatives.release (); - return adjustments; -} - -/* If a parameter replacement identified by ADJ does not yet exist in the form - of declaration, create it and record it, otherwise return the previously - created one.
*/ - -static tree -get_replaced_param_substitute (struct ipa_parm_adjustment *adj) -{ - tree repl; - if (!adj->new_ssa_base) - { - char *pretty_name = make_fancy_name (adj->base); - - repl = create_tmp_reg (TREE_TYPE (adj->base), "ISR"); - DECL_NAME (repl) = get_identifier (pretty_name); - DECL_NAMELESS (repl) = 1; - obstack_free (&name_obstack, pretty_name); - - adj->new_ssa_base = repl; - } - else - repl = adj->new_ssa_base; - return repl; -} - -/* Find the first adjustment for a particular parameter BASE in a vector of - ADJUSTMENTS which is not a copy_param. Return NULL if there is no such - adjustment. */ - -static struct ipa_parm_adjustment * -get_adjustment_for_base (ipa_parm_adjustment_vec adjustments, tree base) -{ - int i, len; - - len = adjustments.length (); - for (i = 0; i < len; i++) - { - struct ipa_parm_adjustment *adj; - - adj = &adjustments[i]; - if (adj->op != IPA_PARM_OP_COPY && adj->base == base) - return adj; - } - - return NULL; -} - -/* If OLD_NAME, which is being defined by statement STMT, is an SSA_NAME of a - parameter which is to be removed because its value is not used, create a new - SSA_NAME relating to a replacement VAR_DECL, replace all uses of the - original with it and return it. If there is no need to re-map, return NULL. - ADJUSTMENTS is a pointer to a vector of IPA-SRA adjustments. 
*/ - -static tree -replace_removed_params_ssa_names (tree old_name, gimple *stmt, - ipa_parm_adjustment_vec adjustments) -{ - struct ipa_parm_adjustment *adj; - tree decl, repl, new_name; - - if (TREE_CODE (old_name) != SSA_NAME) - return NULL; - - decl = SSA_NAME_VAR (old_name); - if (decl == NULL_TREE - || TREE_CODE (decl) != PARM_DECL) - return NULL; - - adj = get_adjustment_for_base (adjustments, decl); - if (!adj) - return NULL; - - repl = get_replaced_param_substitute (adj); - new_name = make_ssa_name (repl, stmt); - SSA_NAME_OCCURS_IN_ABNORMAL_PHI (new_name) - = SSA_NAME_OCCURS_IN_ABNORMAL_PHI (old_name); - - if (dump_file) - { - fprintf (dump_file, "replacing an SSA name of a removed param "); - print_generic_expr (dump_file, old_name); - fprintf (dump_file, " with "); - print_generic_expr (dump_file, new_name); - fprintf (dump_file, "\n"); - } - - replace_uses_by (old_name, new_name); - return new_name; -} - -/* If the statement STMT contains any expressions that need to replaced with a - different one as noted by ADJUSTMENTS, do so. Handle any potential type - incompatibilities (GSI is used to accommodate conversion statements and must - point to the statement). Return true iff the statement was modified. */ - -static bool -sra_ipa_modify_assign (gimple *stmt, gimple_stmt_iterator *gsi, - ipa_parm_adjustment_vec adjustments) -{ - tree *lhs_p, *rhs_p; - bool any; - - if (!gimple_assign_single_p (stmt)) - return false; - - rhs_p = gimple_assign_rhs1_ptr (stmt); - lhs_p = gimple_assign_lhs_ptr (stmt); - - any = ipa_modify_expr (rhs_p, false, adjustments); - any |= ipa_modify_expr (lhs_p, false, adjustments); - if (any) - { - tree new_rhs = NULL_TREE; - - if (!useless_type_conversion_p (TREE_TYPE (*lhs_p), TREE_TYPE (*rhs_p))) - { - if (TREE_CODE (*rhs_p) == CONSTRUCTOR) - { - /* V_C_Es of constructors can cause trouble (PR 42714). 
*/ - if (is_gimple_reg_type (TREE_TYPE (*lhs_p))) - *rhs_p = build_zero_cst (TREE_TYPE (*lhs_p)); - else - *rhs_p = build_constructor (TREE_TYPE (*lhs_p), - NULL); - } - else - new_rhs = fold_build1_loc (gimple_location (stmt), - VIEW_CONVERT_EXPR, TREE_TYPE (*lhs_p), - *rhs_p); - } - else if (REFERENCE_CLASS_P (*rhs_p) - && is_gimple_reg_type (TREE_TYPE (*lhs_p)) - && !is_gimple_reg (*lhs_p)) - /* This can happen when an assignment in between two single field - structures is turned into an assignment in between two pointers to - scalars (PR 42237). */ - new_rhs = *rhs_p; - - if (new_rhs) - { - tree tmp = force_gimple_operand_gsi (gsi, new_rhs, true, NULL_TREE, - true, GSI_SAME_STMT); - - gimple_assign_set_rhs_from_tree (gsi, tmp); - } - - return true; - } - - return false; -} - -/* Traverse the function body and all modifications as described in - ADJUSTMENTS. Return true iff the CFG has been changed. */ - -bool -ipa_sra_modify_function_body (ipa_parm_adjustment_vec adjustments) -{ - bool cfg_changed = false; - basic_block bb; - - FOR_EACH_BB_FN (bb, cfun) - { - gimple_stmt_iterator gsi; - - for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi); gsi_next (&gsi)) - { - gphi *phi = as_a <gphi *> (gsi_stmt (gsi)); - tree new_lhs, old_lhs = gimple_phi_result (phi); - new_lhs = replace_removed_params_ssa_names (old_lhs, phi, adjustments); - if (new_lhs) - { - gimple_phi_set_result (phi, new_lhs); - release_ssa_name (old_lhs); - } - } - - gsi = gsi_start_bb (bb); - while (!gsi_end_p (gsi)) - { - gimple *stmt = gsi_stmt (gsi); - bool modified = false; - tree *t; - unsigned i; - - switch (gimple_code (stmt)) - { - case GIMPLE_RETURN: - t = gimple_return_retval_ptr (as_a <greturn *> (stmt)); - if (*t != NULL_TREE) - modified |= ipa_modify_expr (t, true, adjustments); - break; - - case GIMPLE_ASSIGN: - modified |= sra_ipa_modify_assign (stmt, &gsi, adjustments); - break; - - case GIMPLE_CALL: - /* Operands must be processed before the lhs.
*/ - for (i = 0; i < gimple_call_num_args (stmt); i++) - { - t = gimple_call_arg_ptr (stmt, i); - modified |= ipa_modify_expr (t, true, adjustments); - } - - if (gimple_call_lhs (stmt)) - { - t = gimple_call_lhs_ptr (stmt); - modified |= ipa_modify_expr (t, false, adjustments); - } - break; - - case GIMPLE_ASM: - { - gasm *asm_stmt = as_a (stmt); - for (i = 0; i < gimple_asm_ninputs (asm_stmt); i++) - { - t = &TREE_VALUE (gimple_asm_input_op (asm_stmt, i)); - modified |= ipa_modify_expr (t, true, adjustments); - } - for (i = 0; i < gimple_asm_noutputs (asm_stmt); i++) - { - t = &TREE_VALUE (gimple_asm_output_op (asm_stmt, i)); - modified |= ipa_modify_expr (t, false, adjustments); - } - } - break; - - default: - break; - } - - def_operand_p defp; - ssa_op_iter iter; - FOR_EACH_SSA_DEF_OPERAND (defp, stmt, iter, SSA_OP_DEF) - { - tree old_def = DEF_FROM_PTR (defp); - if (tree new_def = replace_removed_params_ssa_names (old_def, stmt, - adjustments)) - { - SET_DEF (defp, new_def); - release_ssa_name (old_def); - modified = true; - } - } - - if (modified) - { - update_stmt (stmt); - if (maybe_clean_eh_stmt (stmt) - && gimple_purge_dead_eh_edges (gimple_bb (stmt))) - cfg_changed = true; - } - gsi_next (&gsi); - } - } - - return cfg_changed; -} - -/* Call gimple_debug_bind_reset_value on all debug statements describing - gimple register parameters that are being removed or replaced. 
*/ - -static void -sra_ipa_reset_debug_stmts (ipa_parm_adjustment_vec adjustments) -{ - int i, len; - gimple_stmt_iterator *gsip = NULL, gsi; - - if (MAY_HAVE_DEBUG_STMTS && single_succ_p (ENTRY_BLOCK_PTR_FOR_FN (cfun))) - { - gsi = gsi_after_labels (single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun))); - gsip = &gsi; - } - len = adjustments.length (); - for (i = 0; i < len; i++) - { - struct ipa_parm_adjustment *adj; - imm_use_iterator ui; - gimple *stmt; - gdebug *def_temp; - tree name, vexpr, copy = NULL_TREE; - use_operand_p use_p; - - adj = &adjustments[i]; - if (adj->op == IPA_PARM_OP_COPY || !is_gimple_reg (adj->base)) - continue; - name = ssa_default_def (cfun, adj->base); - vexpr = NULL; - if (name) - FOR_EACH_IMM_USE_STMT (stmt, ui, name) - { - if (gimple_clobber_p (stmt)) - { - gimple_stmt_iterator cgsi = gsi_for_stmt (stmt); - unlink_stmt_vdef (stmt); - gsi_remove (&cgsi, true); - release_defs (stmt); - continue; - } - /* All other users must have been removed by - ipa_sra_modify_function_body. */ - gcc_assert (is_gimple_debug (stmt)); - if (vexpr == NULL && gsip != NULL) - { - gcc_assert (TREE_CODE (adj->base) == PARM_DECL); - vexpr = make_node (DEBUG_EXPR_DECL); - def_temp = gimple_build_debug_source_bind (vexpr, adj->base, - NULL); - DECL_ARTIFICIAL (vexpr) = 1; - TREE_TYPE (vexpr) = TREE_TYPE (name); - SET_DECL_MODE (vexpr, DECL_MODE (adj->base)); - gsi_insert_before (gsip, def_temp, GSI_SAME_STMT); - } - if (vexpr) - { - FOR_EACH_IMM_USE_ON_STMT (use_p, ui) - SET_USE (use_p, vexpr); - } - else - gimple_debug_bind_reset_value (stmt); - update_stmt (stmt); - } - /* Create a VAR_DECL for debug info purposes. 
*/ - if (!DECL_IGNORED_P (adj->base)) - { - copy = build_decl (DECL_SOURCE_LOCATION (current_function_decl), - VAR_DECL, DECL_NAME (adj->base), - TREE_TYPE (adj->base)); - if (DECL_PT_UID_SET_P (adj->base)) - SET_DECL_PT_UID (copy, DECL_PT_UID (adj->base)); - TREE_ADDRESSABLE (copy) = TREE_ADDRESSABLE (adj->base); - TREE_READONLY (copy) = TREE_READONLY (adj->base); - TREE_THIS_VOLATILE (copy) = TREE_THIS_VOLATILE (adj->base); - DECL_GIMPLE_REG_P (copy) = DECL_GIMPLE_REG_P (adj->base); - DECL_ARTIFICIAL (copy) = DECL_ARTIFICIAL (adj->base); - DECL_IGNORED_P (copy) = DECL_IGNORED_P (adj->base); - DECL_ABSTRACT_ORIGIN (copy) = DECL_ORIGIN (adj->base); - DECL_SEEN_IN_BIND_EXPR_P (copy) = 1; - SET_DECL_RTL (copy, 0); - TREE_USED (copy) = 1; - DECL_CONTEXT (copy) = current_function_decl; - add_local_decl (cfun, copy); - DECL_CHAIN (copy) = - BLOCK_VARS (DECL_INITIAL (current_function_decl)); - BLOCK_VARS (DECL_INITIAL (current_function_decl)) = copy; - } - if (gsip != NULL && copy && target_for_debug_bind (adj->base)) - { - gcc_assert (TREE_CODE (adj->base) == PARM_DECL); - if (vexpr) - def_temp = gimple_build_debug_bind (copy, vexpr, NULL); - else - def_temp = gimple_build_debug_source_bind (copy, adj->base, - NULL); - gsi_insert_before (gsip, def_temp, GSI_SAME_STMT); - } - } -} - -/* Return false if all callers have at least as many actual arguments as there - are formal parameters in the current function and that their types - match. */ - -static bool -some_callers_have_mismatched_arguments_p (struct cgraph_node *node, - void *data ATTRIBUTE_UNUSED) -{ - struct cgraph_edge *cs; - for (cs = node->callers; cs; cs = cs->next_caller) - if (!cs->call_stmt || !callsite_arguments_match_p (cs->call_stmt)) - return true; - - return false; -} - -/* Return false if all callers have vuse attached to a call statement. 
*/ - -static bool -some_callers_have_no_vuse_p (struct cgraph_node *node, - void *data ATTRIBUTE_UNUSED) -{ - struct cgraph_edge *cs; - for (cs = node->callers; cs; cs = cs->next_caller) - if (!cs->call_stmt || !gimple_vuse (cs->call_stmt)) - return true; - - return false; -} - -/* Convert all callers of NODE. */ - -static bool -convert_callers_for_node (struct cgraph_node *node, - void *data) -{ - ipa_parm_adjustment_vec *adjustments = (ipa_parm_adjustment_vec *) data; - bitmap recomputed_callers = BITMAP_ALLOC (NULL); - struct cgraph_edge *cs; - - for (cs = node->callers; cs; cs = cs->next_caller) - { - push_cfun (DECL_STRUCT_FUNCTION (cs->caller->decl)); - - if (dump_file) - fprintf (dump_file, "Adjusting call %s -> %s\n", - cs->caller->dump_name (), cs->callee->dump_name ()); - - ipa_modify_call_arguments (cs, cs->call_stmt, *adjustments); - - pop_cfun (); - } - - for (cs = node->callers; cs; cs = cs->next_caller) - if (bitmap_set_bit (recomputed_callers, cs->caller->get_uid ()) - && gimple_in_ssa_p (DECL_STRUCT_FUNCTION (cs->caller->decl))) - compute_fn_summary (cs->caller, true); - BITMAP_FREE (recomputed_callers); - - return true; -} - -/* Convert all callers of NODE to pass parameters as given in ADJUSTMENTS. 
*/ - -static void -convert_callers (struct cgraph_node *node, tree old_decl, - ipa_parm_adjustment_vec adjustments) -{ - basic_block this_block; - - node->call_for_symbol_and_aliases (convert_callers_for_node, - &adjustments, false); - - if (!encountered_recursive_call) - return; - - FOR_EACH_BB_FN (this_block, cfun) - { - gimple_stmt_iterator gsi; - - for (gsi = gsi_start_bb (this_block); !gsi_end_p (gsi); gsi_next (&gsi)) - { - gcall *stmt; - tree call_fndecl; - stmt = dyn_cast (gsi_stmt (gsi)); - if (!stmt) - continue; - call_fndecl = gimple_call_fndecl (stmt); - if (call_fndecl == old_decl) - { - if (dump_file) - fprintf (dump_file, "Adjusting recursive call"); - gimple_call_set_fndecl (stmt, node->decl); - ipa_modify_call_arguments (NULL, stmt, adjustments); - } - } - } - - return; -} - -/* Perform all the modification required in IPA-SRA for NODE to have parameters - as given in ADJUSTMENTS. Return true iff the CFG has been changed. */ - -static bool -modify_function (struct cgraph_node *node, ipa_parm_adjustment_vec adjustments) -{ - struct cgraph_node *new_node; - bool cfg_changed; - - cgraph_edge::rebuild_edges (); - free_dominance_info (CDI_DOMINATORS); - pop_cfun (); - - /* This must be done after rebuilding cgraph edges for node above. - Otherwise any recursive calls to node that are recorded in - redirect_callers will be corrupted. */ - vec redirect_callers = node->collect_callers (); - new_node = node->create_version_clone_with_body (redirect_callers, NULL, - NULL, false, NULL, NULL, - "isra"); - redirect_callers.release (); - - push_cfun (DECL_STRUCT_FUNCTION (new_node->decl)); - ipa_modify_formal_parameters (current_function_decl, adjustments); - cfg_changed = ipa_sra_modify_function_body (adjustments); - sra_ipa_reset_debug_stmts (adjustments); - convert_callers (new_node, node->decl, adjustments); - new_node->make_local (); - return cfg_changed; -} - -/* Means of communication between ipa_sra_check_caller and - ipa_sra_preliminary_function_checks. 
*/ - -struct ipa_sra_check_caller_data -{ - bool has_callers; - bool bad_arg_alignment; - bool has_thunk; -}; - -/* If NODE has a caller, mark that fact in DATA which is pointer to - ipa_sra_check_caller_data. Also check all aggregate arguments in all known - calls if they are unit aligned and if not, set the appropriate flag in DATA - too. */ - -static bool -ipa_sra_check_caller (struct cgraph_node *node, void *data) -{ - if (!node->callers) - return false; - - struct ipa_sra_check_caller_data *iscc; - iscc = (struct ipa_sra_check_caller_data *) data; - iscc->has_callers = true; - - for (cgraph_edge *cs = node->callers; cs; cs = cs->next_caller) - { - if (cs->caller->thunk.thunk_p) - { - iscc->has_thunk = true; - return true; - } - gimple *call_stmt = cs->call_stmt; - unsigned count = gimple_call_num_args (call_stmt); - for (unsigned i = 0; i < count; i++) - { - tree arg = gimple_call_arg (call_stmt, i); - if (is_gimple_reg (arg)) - continue; - - tree offset; - poly_int64 bitsize, bitpos; - machine_mode mode; - int unsignedp, reversep, volatilep = 0; - get_inner_reference (arg, &bitsize, &bitpos, &offset, &mode, - &unsignedp, &reversep, &volatilep); - if (!multiple_p (bitpos, BITS_PER_UNIT)) - { - iscc->bad_arg_alignment = true; - return true; - } - } - } - - return false; -} - -/* Return false the function is apparently unsuitable for IPA-SRA based on it's - attributes, return true otherwise. NODE is the cgraph node of the current - function. 
*/ - -static bool -ipa_sra_preliminary_function_checks (struct cgraph_node *node) -{ - if (!node->can_be_local_p ()) - { - if (dump_file) - fprintf (dump_file, "Function not local to this compilation unit.\n"); - return false; - } - - if (!node->local.can_change_signature) - { - if (dump_file) - fprintf (dump_file, "Function can not change signature.\n"); - return false; - } - - if (!tree_versionable_function_p (node->decl)) - { - if (dump_file) - fprintf (dump_file, "Function is not versionable.\n"); - return false; - } - - if (!opt_for_fn (node->decl, optimize) - || !opt_for_fn (node->decl, flag_ipa_sra)) - { - if (dump_file) - fprintf (dump_file, "Function not optimized.\n"); - return false; - } - - if (DECL_VIRTUAL_P (current_function_decl)) - { - if (dump_file) - fprintf (dump_file, "Function is a virtual method.\n"); - return false; - } - - if ((DECL_ONE_ONLY (node->decl) || DECL_EXTERNAL (node->decl)) - && ipa_fn_summaries->get (node) - && ipa_fn_summaries->get (node)->size >= MAX_INLINE_INSNS_AUTO) - { - if (dump_file) - fprintf (dump_file, "Function too big to be made truly local.\n"); - return false; - } - - if (cfun->stdarg) - { - if (dump_file) - fprintf (dump_file, "Function uses stdarg. \n"); - return false; - } - - if (TYPE_ATTRIBUTES (TREE_TYPE (node->decl))) - return false; - - if (DECL_DISREGARD_INLINE_LIMITS (node->decl)) - { - if (dump_file) - fprintf (dump_file, "Always inline function will be inlined " - "anyway. 
\n"); - return false; - } - - struct ipa_sra_check_caller_data iscc; - memset (&iscc, 0, sizeof(iscc)); - node->call_for_symbol_and_aliases (ipa_sra_check_caller, &iscc, true); - if (!iscc.has_callers) - { - if (dump_file) - fprintf (dump_file, - "Function has no callers in this compilation unit.\n"); - return false; - } - - if (iscc.bad_arg_alignment) - { - if (dump_file) - fprintf (dump_file, - "A function call has an argument with non-unit alignment.\n"); - return false; - } - - if (iscc.has_thunk) - { - if (dump_file) - fprintf (dump_file, - "A has thunk.\n"); - return false; - } - - return true; -} - -/* Perform early interprocedural SRA. */ - -static unsigned int -ipa_early_sra (void) -{ - struct cgraph_node *node = cgraph_node::get (current_function_decl); - ipa_parm_adjustment_vec adjustments; - int ret = 0; - - if (!ipa_sra_preliminary_function_checks (node)) - return 0; - - sra_initialize (); - sra_mode = SRA_MODE_EARLY_IPA; - - if (!find_param_candidates ()) - { - if (dump_file) - fprintf (dump_file, "Function has no IPA-SRA candidates.\n"); - goto simple_out; - } - - if (node->call_for_symbol_and_aliases - (some_callers_have_mismatched_arguments_p, NULL, true)) - { - if (dump_file) - fprintf (dump_file, "There are callers with insufficient number of " - "arguments or arguments with type mismatches.\n"); - goto simple_out; - } - - if (node->call_for_symbol_and_aliases - (some_callers_have_no_vuse_p, NULL, true)) - { - if (dump_file) - fprintf (dump_file, "There are callers with no VUSE attached " - "to a call stmt.\n"); - goto simple_out; - } - - bb_dereferences = XCNEWVEC (HOST_WIDE_INT, - func_param_count - * last_basic_block_for_fn (cfun)); - final_bbs = BITMAP_ALLOC (NULL); - - scan_function (); - if (encountered_apply_args) - { - if (dump_file) - fprintf (dump_file, "Function calls __builtin_apply_args().\n"); - goto out; - } - - if (encountered_unchangable_recursive_call) - { - if (dump_file) - fprintf (dump_file, "Function calls itself with 
insufficient " - "number of arguments.\n"); - goto out; - } - - adjustments = analyze_all_param_acesses (); - if (!adjustments.exists ()) - goto out; - if (dump_file) - ipa_dump_param_adjustments (dump_file, adjustments, current_function_decl); - - if (modify_function (node, adjustments)) - ret = TODO_update_ssa | TODO_cleanup_cfg; - else - ret = TODO_update_ssa; - adjustments.release (); - - statistics_counter_event (cfun, "Unused parameters deleted", - sra_stats.deleted_unused_parameters); - statistics_counter_event (cfun, "Scalar parameters converted to by-value", - sra_stats.scalar_by_ref_to_by_val); - statistics_counter_event (cfun, "Aggregate parameters broken up", - sra_stats.aggregate_params_reduced); - statistics_counter_event (cfun, "Aggregate parameter components created", - sra_stats.param_reductions_created); - - out: - BITMAP_FREE (final_bbs); - free (bb_dereferences); - simple_out: - sra_deinitialize (); - return ret; -} - -namespace { - -const pass_data pass_data_early_ipa_sra = -{ - GIMPLE_PASS, /* type */ - "eipa_sra", /* name */ - OPTGROUP_NONE, /* optinfo_flags */ - TV_IPA_SRA, /* tv_id */ - 0, /* properties_required */ - 0, /* properties_provided */ - 0, /* properties_destroyed */ - 0, /* todo_flags_start */ - TODO_dump_symtab, /* todo_flags_finish */ -}; - -class pass_early_ipa_sra : public gimple_opt_pass -{ -public: - pass_early_ipa_sra (gcc::context *ctxt) - : gimple_opt_pass (pass_data_early_ipa_sra, ctxt) - {} - - /* opt_pass methods: */ - virtual bool gate (function *) { return flag_ipa_sra && dbg_cnt (eipa_sra); } - virtual unsigned int execute (function *) { return ipa_early_sra (); } - -}; // class pass_early_ipa_sra - -} // anon namespace - -gimple_opt_pass * -make_pass_early_ipa_sra (gcc::context *ctxt) -{ - return new pass_early_ipa_sra (ctxt); -} diff --git a/gcc/tree-sra.h b/gcc/tree-sra.h new file mode 100644 index 00000000000..b442534d18b --- /dev/null +++ b/gcc/tree-sra.h @@ -0,0 +1,31 @@ +/* Scalar Replacement of Aggregates 
(SRA) converts some structure + references into scalar references, exposing them to the scalar + optimizers. + Copyright (C) 2017 Free Software Foundation, Inc. + +This file is part of GCC. + +GCC is free software; you can redistribute it and/or modify it under +the terms of the GNU General Public License as published by the Free +Software Foundation; either version 3, or (at your option) any later +version. + +GCC is distributed in the hope that it will be useful, but WITHOUT ANY +WARRANTY; without even the implied warranty of MERCHANTABILITY or +FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License +for more details. + +You should have received a copy of the GNU General Public License +along with GCC; see the file COPYING3. If not see +. */ + +bool type_internals_preclude_sra_p (tree type, const char **msg, unsigned *ttl); + +/* Return true iff TYPE is stdarg va_list type (which early SRA and IPA-SRA + should leave alone). */ + +static inline bool +is_va_list_type (tree type) +{ + return TYPE_MAIN_VARIANT (type) == TYPE_MAIN_VARIANT (va_list_type_node); +}