Message ID | or37zlpujd.fsf@livre.home |
---|---|
State | New |
FAIL: gcc.target/aarch64/target_attr_crypto_ice_1.c (internal compiler error)

In file included from /opt/gcc/gcc-20150815/gcc/testsuite/gcc.target/aarch64/target_attr_crypto_ice_1.c:4:0:
/opt/gcc/gcc-20150815/Build/gcc/include/arm_neon.h: In function 'test_vsha1cq_u32':
/opt/gcc/gcc-20150815/Build/gcc/include/arm_neon.h:21076:10: internal compiler error: in expand_expr_real_1, at expr.c:9532
0x7f060b expand_expr_real_1(tree_node*, rtx_def*, machine_mode, expand_modifier, rtx_def**, bool)
	../../gcc/expr.c:9532
0xdb1027 expand_normal
	../../gcc/expr.h:261
0xdb1027 aarch64_simd_expand_args
	../../gcc/config/aarch64/aarch64-builtins.c:944
0xdb1027 aarch64_simd_expand_builtin(int, tree_node*, rtx_def*)
	../../gcc/config/aarch64/aarch64-builtins.c:1118
0x6cc667 expand_builtin(tree_node*, rtx_def*, rtx_def*, machine_mode, int)
	../../gcc/builtins.c:5931
0x7ecab7 expand_expr_real_1(tree_node*, rtx_def*, machine_mode, expand_modifier, rtx_def**, bool)
	../../gcc/expr.c:10360
0x7f8547 store_expr_with_bounds(tree_node*, rtx_def*, int, bool, tree_node*)
	../../gcc/expr.c:5398
0x7fa9d3 expand_assignment(tree_node*, tree_node*, bool)
	../../gcc/expr.c:5170
0x6f435f expand_call_stmt
	../../gcc/cfgexpand.c:2621
0x6f435f expand_gimple_stmt_1
	../../gcc/cfgexpand.c:3510
0x6f435f expand_gimple_stmt
	../../gcc/cfgexpand.c:3671
0x6f69c7 expand_gimple_tailcall
	../../gcc/cfgexpand.c:3718
0x6f69c7 expand_gimple_basic_block
	../../gcc/cfgexpand.c:5651
0x6fc777 execute
	../../gcc/cfgexpand.c:6260

Andreas.
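[Editorial note: a minimal sketch of the kind of function the failing test compiles, reconstructed from the include path and the symbol names in the backtrace above. The exact target attribute string and the dg- directives of the real testsuite file are assumptions, not quoted from it.]

#include <arm_neon.h>

/* The ICE is reported inside arm_neon.h while expanding the SHA1
   crypto builtin in a function named test_vsha1cq_u32, so the test
   presumably looks roughly like this: a function carrying a target
   attribute that enables the crypto instructions only for itself,
   calling the vsha1cq_u32 intrinsic.  */
__attribute__ ((target ("cpu=cortex-a57")))
uint32x4_t
test_vsha1cq_u32 (uint32x4_t hash_abcd, uint32_t hash_e, uint32x4_t wk)
{
  /* Expansion of this crypto builtin is where the ICE triggers.  */
  return vsha1cq_u32 (hash_abcd, hash_e, wk);
}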
On Aug 15, 2015, Andreas Schwab <schwab@linux-m68k.org> wrote:

> FAIL: gcc.target/aarch64/target_attr_crypto_ice_1.c (internal compiler error)

> In file included from
> /opt/gcc/gcc-20150815/gcc/testsuite/gcc.target/aarch64/target_attr_crypto_ice_1.c:4:0:

Are you sure this is a regression introduced by my patch?  The comments
at the top of this file seem to indicate it is a known problem in the
expansion of the crypto builtin, which is precisely what we see in the
backtrace?

If it is indeed a regression, would you please provide me with a
preprocessed testcase so that I can look into it without a native
environment?

Thanks in advance,
On Aug 16, 2015, Andreas Schwab <schwab@linux-m68k.org> wrote:

> Alexandre Oliva <aoliva@redhat.com> writes:
>> On Aug 15, 2015, Andreas Schwab <schwab@linux-m68k.org> wrote:
>>
>>> FAIL: gcc.target/aarch64/target_attr_crypto_ice_1.c (internal compiler error)
>>
>>> In file included from
>>> /opt/gcc/gcc-20150815/gcc/testsuite/gcc.target/aarch64/target_attr_crypto_ice_1.c:4:0:
>>
>> Are you sure this is a regression introduced by my patch?

> Yes, it reintroduces the ICE.

Ugh.  I see this testcase was introduced very recently, so presumably
it wasn't present in the tree that James Greenhalgh tested and
confirmed there were no regressions.

The hack in aarch64-builtins.c looks risky IMHO.  Changing the mode of
a decl after RTL is assigned to it (or to its SSA partitions) seems
fishy.  The assert is doing just what it was supposed to do.  The only
surprise to me is that it didn't catch this unexpected and unsupported
change before.

Presumably if we just dropped the assert in expand_expr_real_1, this
case would work just fine, although the unsignedp bit would be
meaningless and thus confusing, since the subreg isn't about a
promotion, but about reflecting the mode change that was made from
under us.

May I suggest that you guys find (or introduce) other means to change
the layout and mode of the decl *before* RTL is assigned to the params?
I think this would save us a ton of trouble down the road.  Just think
how much trouble you'd get if the different modes had different calling
conventions, alignment requirements, valid register assignments, or
anything that might make coalescing their SSA names with those of
other variables invalid.
On 14 August 2015 at 20:57, Alexandre Oliva <aoliva@redhat.com> wrote:
> On Aug 11, 2015, Patrick Marlier <patrick.marlier@gmail.com> wrote:
>
>> On Mon, Aug 10, 2015 at 5:14 PM, Jeff Law <law@redhat.com> wrote:
>>> On 08/10/2015 02:23 AM, James Greenhalgh wrote:
>
>>>> For what it is worth, I bootstrapped and tested the consolidated patch
>>>> on arm-none-linux-gnueabihf and aarch64-none-linux-gnu with trunk at
>>>> r226516 over the weekend, and didn't see any new issues.
>
> Thanks!
>
>> Especially as the bug reporter, I am impressed how a slight problem
>> can lead to such a patch! ;)
>> Thanks a lot Alexandre!
>
> You're welcome.  I'm glad it appears to be working to everyone's
> satisfaction now.  I've just committed it as r226901, with only a
> context adjustment to account for a change in use_register_for_decl in
> function.c.  /me crosses fingers :-)
>
> Here's the patch as checked in:
>

Hi,

Since this was committed (r226901), I can see that the compiler build
fails for armeb targets, when building libgcc:

In file included from /tmp/4972337_7.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libgcc/fixed-bit.c:55:0:
/tmp/4972337_7.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libgcc/fixed-bit.c: In function '__gnu_addha3':
/tmp/4972337_7.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libgcc/fixed-bit.h:450:31: internal compiler error: in simplify_subreg, at simplify-rtx.c:5790
 #define FIXED_OP(OP,MODE,NUM) __gnu_ ## OP ## MODE ## NUM
                               ^
/tmp/4972337_7.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libgcc/fixed-bit.h:460:30: note: in expansion of macro 'FIXED_OP'
 #define FIXED_ADD_TEMP(NAME) FIXED_OP(add,NAME,3)
                              ^
/tmp/4972337_7.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libgcc/fixed-bit.h:492:19: note: in expansion of macro 'FIXED_ADD_TEMP'
 #define FIXED_ADD FIXED_ADD_TEMP(MODE_NAME_S)
                   ^
/tmp/4972337_7.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/libgcc/fixed-bit.c:59:1: note: in expansion of macro 'FIXED_ADD'
 FIXED_ADD (FIXED_C_TYPE a, FIXED_C_TYPE b)
 ^
0xa4bbc3 simplify_subreg(machine_mode, rtx_def*, machine_mode, unsigned int)
	/tmp/4972337_7.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/simplify-rtx.c:5790
0xa4ce2d simplify_gen_subreg(machine_mode, rtx_def*, machine_mode, unsigned int)
	/tmp/4972337_7.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/simplify-rtx.c:6013
0x784385 move_block_from_reg(int, rtx_def*, int)
	/tmp/4972337_7.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/expr.c:1536
0x7e165d assign_parm_setup_block
	/tmp/4972337_7.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/function.c:3076
0x7e813a assign_parms
	/tmp/4972337_7.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/function.c:3805
0x7e8f2e expand_function_start(tree_node*)
	/tmp/4972337_7.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/function.c:5234

Christophe.
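[Editorial note: the libgcc function named in this second ICE is generated by the FIXED_OP/FIXED_ADD macros quoted in the diagnostic. Below is a rough sketch of the interface it expands to, assuming HAmode maps to short _Accum as usual for GCC fixed-point types; the real body in fixed-bit.c operates on an underlying integer type rather than the literal addition shown here.]

/* FIXED_ADD -> FIXED_ADD_TEMP (MODE_NAME_S) -> FIXED_OP (add, ha, 3)
   pastes together the name __gnu_addha3.  On a target with fixed-point
   support, the function whose parameters assign_parms was expanding
   therefore looks roughly like: */
short _Accum
__gnu_addha3 (short _Accum a, short _Accum b)
{
  return a + b;  /* non-saturating fixed-point addition */
}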
> for gcc/ChangeLog
>
> PR rtl-optimization/64164
> PR bootstrap/66978
> PR middle-end/66983
> PR rtl-optimization/67000
> PR middle-end/67034
> PR middle-end/67035
> * Makefile.in (OBJS): Drop tree-ssa-copyrename.o.
> * tree-ssa-copyrename.c: Removed.
> * opts.c (default_options_table): Drop -ftree-copyrename.  Add
> -ftree-coalesce-vars.
> * passes.def: Drop all occurrences of pass_rename_ssa_copies.
> * common.opt (ftree-copyrename): Ignore.
> (ftree-coalesce-inlined-vars): Likewise.
> * doc/invoke.texi: Remove the ignored options above.
> * gimple-expr.h (gimple_can_coalesce_p): Move declaration
> * tree-ssa-coalesce.h: ... here.
> * tree-ssa-uncprop.c: Include tree-ssa-coalesce.h and other
> headers required by it.
> * gimple-expr.c (gimple_can_coalesce_p): Allow coalescing
> across variables when flag_tree_coalesce_vars.  Check register
> use and promoted modes to allow coalescing.  Do not coalesce
> maybe-byref parms with SSA_NAMEs of other variables, or
> anonymous SSA_NAMEs.  Moved to tree-ssa-coalesce.c.
> * tree-ssa-live.c (struct tree_int_map_hasher): Move along
> with its member functions to tree-ssa-coalesce.c.
> (var_map_base_init): Likewise.  Renamed to
> compute_samebase_partition_bases.
> (partition_view_normal): Drop want_bases parameter.
> (partition_view_bitmap): Likewise.
> * tree-ssa-live.h: Adjust declarations.
> * tree-ssa-coalesce.c: Include explow.h and cfgexpand.h.
> (build_ssa_conflict_graph): Process PARM_ and RESULT_DECLs's
> default defs at the entry point.
> (dump_part_var_map): New.
> (compute_optimized_partition_bases): New, called by...
> (coalesce_ssa_name): ... when flag_tree_coalesce_vars, instead
> of compute_samebase_partition_bases.  Adjust.
> * alias.c (nonoverlapping_memrefs_p): Disregard gimple-regs.
> * cfgexpand.c (leader_merge, parm_maybe_byref_p): New.
> (ssa_default_def_partition): New.
> (get_rtl_for_parm_ssa_default_def): New.
> (align_local_variable, add_stack_var): Support anonymous SSA
> names.
> (defer_stack_allocation): Likewise.  Declare earlier.
> (set_rtl): Merge exprs and attrs, even for MEMs and non-SSA
> vars.  Update DECL_RTL for PARM_DECLs and RESULT_DECLs too.
> Do not record deferred-allocation marker in
> SA.partition_to_pseudo.
> (expand_stack_vars): Adjust check for the marker in it.
> (expand_one_stack_var_at): Handle anonymous SSA_NAMEs.  Drop
> redundant MEM attr setting.
> (expand_one_stack_var_1): Handle anonymous SSA_NAMEs.  Renamed
> from...
> (expand_one_stack_var): ... this.  New wrapper to check and
> skip already expanded SSA partitions.
> (record_alignment_for_reg_var): New, factored out of...
> (expand_one_var): ... this.
> (expand_one_ssa_partition): New.
> (adjust_one_expanded_partition_var): New.
> (expand_one_register_var): Check and skip already expanded SSA
> partitions.
> (expand_used_vars): Don't create DECLs for anonymous SSA
> names.  Expand all SSA partitions, then adjust all SSA names.
> (pass::execute): Replace the loops that set
> SA.partition_to_pseudo from partition leaders and cleared
> DECL_RTL for multi-location variables, and that which used to
> rename vars and set attrs, with one that clears DECL_RTL and
> checks that PARMs and RESULTs default_defs match DECL_RTL.
> * cfgexpand.h (get_rtl_for_parm_ssa_default_def): Declare.
> * emit-rtl.c: Include stor-layout.h.
> (set_reg_attrs_for_parm): Handle NULL decl.
> (set_reg_attrs_for_decl_rtl): Take mode from expression if
> it's not a DECL.
> * stmt.c (emit_case_decision_tree): Pass it the SSA_NAME
> rather than its possibly-NULL DECL.
> * explow.c (promote_ssa_mode): New.
> * explow.h (promote_ssa_mode): Declare.
> * expr.c (expand_expr_real_1): Handle anonymous SSA_NAMEs.
> (read_complex_part): Export.
> * expr.h (read_complex_part): Declare.
> * cfgexpand.h (parm_maybe_byref_p): Declare.
> * function.c: Include cfgexpand.h.
> (use_register_for_decl): Handle SSA_NAMEs, anonymous or not.
> (use_register_for_parm_decl): Wrapper for the above to
> special-case the result_ptr.
> (rtl_for_parm): Ditto for get_rtl_for_parm_ssa_default_def.
> (split_complex_args): Take assign_parm_data_all argument.
> Pass it to rtl_for_parm.  Set up rtl and context for split
> args.  Reset complex parm before fetching its default decl
> rtl.
> (assign_parms_unsplit_complex): Use the default-def complex
> parm rtl if it matches the components.
> (assign_parms_augmented_arg_list): Adjust.
> (maybe_reset_rtl_for_parm): Reset DECL_RTL of parms with
> multiple locations.  Recognize split complex args.
> (assign_parm_adjust_stack_rtl): Add all and parm arguments,
> for rtl_for_parm.  For SSA-assigned parms, zero stack_parm.
> (assign_parm_setup_block): Prefer SSA-assigned location, and
> fill in its address if the memory location of a maybe-byref
> parm was not assigned by cfgexpand.
> (assign_parm_setup_reg): Likewise.  Adjust its mode as
> needed.  Use entry_parm for equiv if stack_parm is NULL.  Make
> sure passed_pointer parms don't need conversion.  Copy address
> or value as needed.
> (assign_parm_setup_stack): Prefer SSA-assigned location.
> (assign_parms): Maybe reset DECL_RTL of params.  Adjust stack
> rtl before testing for pointer bounds.  Special-case result_ptr.
> (expand_function_start): Maybe reset DECL_RTL of result.
> Prefer SSA-assigned location for result and static chain.
> Factor out DECL_RESULT and SET_DECL_RTL.  Convert static chain
> to Pmode if needed, from H.J. Lu <hongjiu.lu@intel.com>.
> * tree-outof-ssa.c (insert_value_copy_on_edge): Handle
> anonymous SSA names.  Use promote_ssa_mode.
> (get_temp_reg): Likewise.
> (remove_ssa_form): Adjust.
> * stor-layout.c (layout_decl): Don't set mem attributes of
> non-MEMs.
> * var-tracking.c (dataflow_set_clear_at_call): Take call_insn
> and get its reg_usage for reg invalidation.
> (compute_bb_dataflow): Pass it insn.
> (emit_notes_in_bb): Likewise.
>
> for gcc/testsuite/ChangeLog
>
> * gcc.dg/guality/pr54200.c: Add -fno-tree-coalesce-vars.
> * gcc.dg/ssp-1.c: Make counter a register.
> * gcc.dg/ssp-2.c: Likewise.
> * gcc.dg/torture/parm-coalesce.c: New.
> --- > gcc/Makefile.in | 1 > gcc/alias.c | 13 + > gcc/cfgexpand.c | 471 +++++++++++++++++++------- > gcc/cfgexpand.h | 3 > gcc/common.opt | 12 - > gcc/doc/invoke.texi | 48 +-- > gcc/emit-rtl.c | 8 > gcc/explow.c | 29 ++ > gcc/explow.h | 3 > gcc/expr.c | 41 +- > gcc/expr.h | 1 > gcc/function.c | 341 +++++++++++++++---- > gcc/gimple-expr.c | 39 -- > gcc/gimple-expr.h | 1 > gcc/opts.c | 2 > gcc/passes.def | 5 > gcc/stmt.c | 2 > gcc/stor-layout.c | 3 > gcc/testsuite/gcc.dg/guality/pr54200.c | 2 > gcc/testsuite/gcc.dg/ssp-1.c | 2 > gcc/testsuite/gcc.dg/ssp-2.c | 2 > gcc/testsuite/gcc.dg/torture/parm-coalesce.c | 40 ++ > gcc/tree-outof-ssa.c | 16 - > gcc/tree-ssa-coalesce.c | 384 +++++++++++++++++++++ > gcc/tree-ssa-coalesce.h | 1 > gcc/tree-ssa-copyrename.c | 475 -------------------------- > gcc/tree-ssa-live.c | 99 ----- > gcc/tree-ssa-live.h | 4 > gcc/tree-ssa-uncprop.c | 5 > gcc/var-tracking.c | 12 - > 30 files changed, 1187 insertions(+), 878 deletions(-) > create mode 100644 gcc/testsuite/gcc.dg/torture/parm-coalesce.c > delete mode 100644 gcc/tree-ssa-copyrename.c > > diff --git a/gcc/Makefile.in b/gcc/Makefile.in > index c1cb4ce..e298ecc 100644 > --- a/gcc/Makefile.in > +++ b/gcc/Makefile.in > @@ -1447,7 +1447,6 @@ OBJS = \ > tree-ssa-ccp.o \ > tree-ssa-coalesce.o \ > tree-ssa-copy.o \ > - tree-ssa-copyrename.o \ > tree-ssa-dce.o \ > tree-ssa-dom.o \ > tree-ssa-dse.o \ > diff --git a/gcc/alias.c b/gcc/alias.c > index fa7d5d8..4681e3f 100644 > --- a/gcc/alias.c > +++ b/gcc/alias.c > @@ -2507,6 +2507,19 @@ nonoverlapping_memrefs_p (const_rtx x, const_rtx y, bool loop_invariant) > if (! DECL_P (exprx) || ! DECL_P (expry)) > return 0; > > + /* If we refer to different gimple registers, or one gimple register > + and one non-gimple-register, we know they can't overlap. First, > + gimple registers don't have their addresses taken. Now, there > + could be more than one stack slot for (different versions of) the > + same gimple register, but we can presumably tell they don't > + overlap based on offsets from stack base addresses elsewhere. > + It's important that we don't proceed to DECL_RTL, because gimple > + registers may not pass DECL_RTL_SET_P, and make_decl_rtl won't be > + able to do anything about them since no SSA information will have > + remained to guide it. */ > + if (is_gimple_reg (exprx) || is_gimple_reg (expry)) > + return exprx != expry; > + > /* With invalid code we can end up storing into the constant pool. > Bail out to avoid ICEing when creating RTL for this. > See gfortran.dg/lto/20091028-2_0.f90. */ > diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c > index 7df9d06..0bc20f6 100644 > --- a/gcc/cfgexpand.c > +++ b/gcc/cfgexpand.c > @@ -97,6 +97,8 @@ gimple currently_expanding_gimple_stmt; > > static rtx expand_debug_expr (tree); > > +static bool defer_stack_allocation (tree, bool); > + > /* Return an expression tree corresponding to the RHS of GIMPLE > statement STMT. */ > > @@ -150,21 +152,149 @@ gimple_assign_rhs_to_tree (gimple stmt) > > #define SSAVAR(x) (TREE_CODE (x) == SSA_NAME ? SSA_NAME_VAR (x) : x) > > +/* Choose either CUR or NEXT as the leader DECL for a partition. > + Prefer ignored decls, to simplify debug dumps and reduce ambiguity > + out of the same user variable being in multiple partitions (this is > + less likely for compiler-introduced temps). 
*/ > + > +static tree > +leader_merge (tree cur, tree next) > +{ > + if (cur == NULL || cur == next) > + return next; > + > + if (DECL_P (cur) && DECL_IGNORED_P (cur)) > + return cur; > + > + if (DECL_P (next) && DECL_IGNORED_P (next)) > + return next; > + > + return cur; > +} > + > +/* Return true if VAR is a PARM_DECL or a RESULT_DECL of type BLKmode. > + Such parameters are likely passed as a pointer to the value, rather > + than as a value, and so we must not coalesce them, nor allocate > + stack space for them before determining the calling conventions for > + them. For their SSA_NAMEs, expand_one_ssa_partition emits RTL as > + MEMs with pc_rtx as the address, and then it replaces the pc_rtx > + with NULL so as to make sure the MEM is not used before it is > + adjusted in assign_parm_setup_reg. */ > + > +bool > +parm_maybe_byref_p (tree var) > +{ > + if (!var || VAR_P (var)) > + return false; > + > + gcc_assert (TREE_CODE (var) == PARM_DECL > + || TREE_CODE (var) == RESULT_DECL); > + > + return TYPE_MODE (TREE_TYPE (var)) == BLKmode; > +} > + > +/* Return the partition of the default SSA_DEF for decl VAR. */ > + > +static int > +ssa_default_def_partition (tree var) > +{ > + tree name = ssa_default_def (cfun, var); > + > + if (!name) > + return NO_PARTITION; > + > + return var_to_partition (SA.map, name); > +} > + > +/* Return the RTL for the default SSA def of a PARM or RESULT, if > + there is one. */ > + > +rtx > +get_rtl_for_parm_ssa_default_def (tree var) > +{ > + gcc_assert (TREE_CODE (var) == PARM_DECL || TREE_CODE (var) == RESULT_DECL); > + > + if (!is_gimple_reg (var)) > + return NULL_RTX; > + > + /* If we've already determined RTL for the decl, use it. This is > + not just an optimization: if VAR is a PARM whose incoming value > + is unused, we won't find a default def to use its partition, but > + we still want to use the location of the parm, if it was used at > + all. During assign_parms, until a location is assigned for the > + VAR, RTL can only for a parm or result if we're not coalescing > + across variables, when we know we're coalescing all SSA_NAMEs of > + each parm or result, and we're not coalescing them with names > + pertaining to other variables, such as other parms' default > + defs. */ > + if (DECL_RTL_SET_P (var)) > + { > + gcc_assert (DECL_RTL (var) != pc_rtx); > + return DECL_RTL (var); > + } > + > + int part = ssa_default_def_partition (var); > + if (part == NO_PARTITION) > + return NULL_RTX; > + > + return SA.partition_to_pseudo[part]; > +} > + > /* Associate declaration T with storage space X. If T is no > SSA name this is exactly SET_DECL_RTL, otherwise make the > partition of T associated with X. */ > static inline void > set_rtl (tree t, rtx x) > { > + if (x && SSAVAR (t)) > + { > + bool skip = false; > + tree cur = NULL_TREE; > + > + if (MEM_P (x)) > + cur = MEM_EXPR (x); > + else if (REG_P (x)) > + cur = REG_EXPR (x); > + else if (GET_CODE (x) == CONCAT > + && REG_P (XEXP (x, 0))) > + cur = REG_EXPR (XEXP (x, 0)); > + else if (GET_CODE (x) == PARALLEL) > + cur = REG_EXPR (XVECEXP (x, 0, 0)); > + else if (x == pc_rtx) > + skip = true; > + else > + gcc_unreachable (); > + > + tree next = skip ? 
cur : leader_merge (cur, SSAVAR (t)); > + > + if (cur != next) > + { > + if (MEM_P (x)) > + set_mem_attributes (x, next, true); > + else > + set_reg_attrs_for_decl_rtl (next, x); > + } > + } > + > if (TREE_CODE (t) == SSA_NAME) > { > - SA.partition_to_pseudo[var_to_partition (SA.map, t)] = x; > - if (x && !MEM_P (x)) > - set_reg_attrs_for_decl_rtl (SSA_NAME_VAR (t), x); > - /* For the benefit of debug information at -O0 (where vartracking > - doesn't run) record the place also in the base DECL if it's > - a normal variable (not a parameter). */ > - if (x && x != pc_rtx && TREE_CODE (SSA_NAME_VAR (t)) == VAR_DECL) > + int part = var_to_partition (SA.map, t); > + if (part != NO_PARTITION) > + { > + if (SA.partition_to_pseudo[part]) > + gcc_assert (SA.partition_to_pseudo[part] == x); > + else if (x != pc_rtx) > + SA.partition_to_pseudo[part] = x; > + } > + /* For the benefit of debug information at -O0 (where > + vartracking doesn't run) record the place also in the base > + DECL. For PARMs and RESULTs, we may end up resetting these > + in function.c:maybe_reset_rtl_for_parm, but in some rare > + cases we may need them (unused and overwritten incoming > + value, that at -O0 must share the location with the other > + uses in spite of the missing default def), and this may be > + the only chance to preserve them. */ > + if (x && x != pc_rtx && SSA_NAME_VAR (t)) > { > tree var = SSA_NAME_VAR (t); > /* If we don't yet have something recorded, just record it now. */ > @@ -248,8 +378,15 @@ static bool has_short_buffer; > static unsigned int > align_local_variable (tree decl) > { > - unsigned int align = LOCAL_DECL_ALIGNMENT (decl); > - DECL_ALIGN (decl) = align; > + unsigned int align; > + > + if (TREE_CODE (decl) == SSA_NAME) > + align = TYPE_ALIGN (TREE_TYPE (decl)); > + else > + { > + align = LOCAL_DECL_ALIGNMENT (decl); > + DECL_ALIGN (decl) = align; > + } > return align / BITS_PER_UNIT; > } > > @@ -315,12 +452,15 @@ add_stack_var (tree decl) > decl_to_stack_part->put (decl, stack_vars_num); > > v->decl = decl; > - v->size = tree_to_uhwi (DECL_SIZE_UNIT (SSAVAR (decl))); > + tree size = TREE_CODE (decl) == SSA_NAME > + ? TYPE_SIZE_UNIT (TREE_TYPE (decl)) > + : DECL_SIZE_UNIT (decl); > + v->size = tree_to_uhwi (size); > /* Ensure that all variables have size, so that &a != &b for any two > variables that are simultaneously live. */ > if (v->size == 0) > v->size = 1; > - v->alignb = align_local_variable (SSAVAR (decl)); > + v->alignb = align_local_variable (decl); > /* An alignment of zero can mightily confuse us later. */ > gcc_assert (v->alignb != 0); > > @@ -862,7 +1002,9 @@ expand_one_stack_var_at (tree decl, rtx base, unsigned base_align, > gcc_assert (offset == trunc_int_for_mode (offset, Pmode)); > > x = plus_constant (Pmode, base, offset); > - x = gen_rtx_MEM (DECL_MODE (SSAVAR (decl)), x); > + x = gen_rtx_MEM (TREE_CODE (decl) == SSA_NAME > + ? TYPE_MODE (TREE_TYPE (decl)) > + : DECL_MODE (SSAVAR (decl)), x); > > if (TREE_CODE (decl) != SSA_NAME) > { > @@ -884,7 +1026,6 @@ expand_one_stack_var_at (tree decl, rtx base, unsigned base_align, > DECL_USER_ALIGN (decl) = 0; > } > > - set_mem_attributes (x, SSAVAR (decl), true); > set_rtl (decl, x); > } > > @@ -950,9 +1091,9 @@ expand_stack_vars (bool (*pred) (size_t), struct stack_vars_data *data) > /* Skip variables that have already had rtl assigned. See also > add_stack_var where we perpetrate this pc_rtx hack. */ > decl = stack_vars[i].decl; > - if ((TREE_CODE (decl) == SSA_NAME > - ? 
SA.partition_to_pseudo[var_to_partition (SA.map, decl)] > - : DECL_RTL (decl)) != pc_rtx) > + if (TREE_CODE (decl) == SSA_NAME > + ? SA.partition_to_pseudo[var_to_partition (SA.map, decl)] != NULL_RTX > + : DECL_RTL (decl) != pc_rtx) > continue; > > large_size += alignb - 1; > @@ -981,9 +1122,9 @@ expand_stack_vars (bool (*pred) (size_t), struct stack_vars_data *data) > /* Skip variables that have already had rtl assigned. See also > add_stack_var where we perpetrate this pc_rtx hack. */ > decl = stack_vars[i].decl; > - if ((TREE_CODE (decl) == SSA_NAME > - ? SA.partition_to_pseudo[var_to_partition (SA.map, decl)] > - : DECL_RTL (decl)) != pc_rtx) > + if (TREE_CODE (decl) == SSA_NAME > + ? SA.partition_to_pseudo[var_to_partition (SA.map, decl)] != NULL_RTX > + : DECL_RTL (decl) != pc_rtx) > continue; > > /* Check the predicate to see whether this variable should be > @@ -1099,13 +1240,22 @@ account_stack_vars (void) > to a variable to be allocated in the stack frame. */ > > static void > -expand_one_stack_var (tree var) > +expand_one_stack_var_1 (tree var) > { > HOST_WIDE_INT size, offset; > unsigned byte_align; > > - size = tree_to_uhwi (DECL_SIZE_UNIT (SSAVAR (var))); > - byte_align = align_local_variable (SSAVAR (var)); > + if (TREE_CODE (var) == SSA_NAME) > + { > + tree type = TREE_TYPE (var); > + size = tree_to_uhwi (TYPE_SIZE_UNIT (type)); > + byte_align = TYPE_ALIGN_UNIT (type); > + } > + else > + { > + size = tree_to_uhwi (DECL_SIZE_UNIT (var)); > + byte_align = align_local_variable (var); > + } > > /* We handle highly aligned variables in expand_stack_vars. */ > gcc_assert (byte_align * BITS_PER_UNIT <= MAX_SUPPORTED_STACK_ALIGNMENT); > @@ -1116,6 +1266,27 @@ expand_one_stack_var (tree var) > crtl->max_used_stack_slot_alignment, offset); > } > > +/* Wrapper for expand_one_stack_var_1 that checks SSA_NAMEs are > + already assigned some MEM. */ > + > +static void > +expand_one_stack_var (tree var) > +{ > + if (TREE_CODE (var) == SSA_NAME) > + { > + int part = var_to_partition (SA.map, var); > + if (part != NO_PARTITION) > + { > + rtx x = SA.partition_to_pseudo[part]; > + gcc_assert (x); > + gcc_assert (MEM_P (x)); > + return; > + } > + } > + > + return expand_one_stack_var_1 (var); > +} > + > /* A subroutine of expand_one_var. Called to assign rtl to a VAR_DECL > that will reside in a hard register. */ > > @@ -1125,13 +1296,136 @@ expand_one_hard_reg_var (tree var) > rest_of_decl_compilation (var, 0, 0); > } > > +/* Record the alignment requirements of some variable assigned to a > + pseudo. */ > + > +static void > +record_alignment_for_reg_var (unsigned int align) > +{ > + if (SUPPORTS_STACK_ALIGNMENT > + && crtl->stack_alignment_estimated < align) > + { > + /* stack_alignment_estimated shouldn't change after stack > + realign decision made */ > + gcc_assert (!crtl->stack_realign_processed); > + crtl->stack_alignment_estimated = align; > + } > + > + /* stack_alignment_needed > PREFERRED_STACK_BOUNDARY is permitted. > + So here we only make sure stack_alignment_needed >= align. */ > + if (crtl->stack_alignment_needed < align) > + crtl->stack_alignment_needed = align; > + if (crtl->max_used_stack_slot_alignment < align) > + crtl->max_used_stack_slot_alignment = align; > +} > + > +/* Create RTL for an SSA partition. 
*/ > + > +static void > +expand_one_ssa_partition (tree var) > +{ > + int part = var_to_partition (SA.map, var); > + gcc_assert (part != NO_PARTITION); > + > + if (SA.partition_to_pseudo[part]) > + return; > + > + unsigned int align = MINIMUM_ALIGNMENT (TREE_TYPE (var), > + TYPE_MODE (TREE_TYPE (var)), > + TYPE_ALIGN (TREE_TYPE (var))); > + > + /* If the variable alignment is very large we'll dynamicaly allocate > + it, which means that in-frame portion is just a pointer. */ > + if (align > MAX_SUPPORTED_STACK_ALIGNMENT) > + align = POINTER_SIZE; > + > + record_alignment_for_reg_var (align); > + > + if (!use_register_for_decl (var)) > + { > + if (parm_maybe_byref_p (SSA_NAME_VAR (var)) > + && ssa_default_def_partition (SSA_NAME_VAR (var)) == part) > + { > + expand_one_stack_var_at (var, pc_rtx, 0, 0); > + rtx x = SA.partition_to_pseudo[part]; > + gcc_assert (GET_CODE (x) == MEM); > + gcc_assert (GET_MODE (x) == BLKmode); > + gcc_assert (XEXP (x, 0) == pc_rtx); > + /* Reset the address, so that any attempt to use it will > + ICE. It will be adjusted in assign_parm_setup_reg. */ > + XEXP (x, 0) = NULL_RTX; > + } > + else if (defer_stack_allocation (var, true)) > + add_stack_var (var); > + else > + expand_one_stack_var_1 (var); > + return; > + } > + > + machine_mode reg_mode = promote_ssa_mode (var, NULL); > + > + rtx x = gen_reg_rtx (reg_mode); > + > + set_rtl (var, x); > +} > + > +/* Record the association between the RTL generated for a partition > + and the underlying variable of the SSA_NAME. */ > + > +static void > +adjust_one_expanded_partition_var (tree var) > +{ > + if (!var) > + return; > + > + tree decl = SSA_NAME_VAR (var); > + > + int part = var_to_partition (SA.map, var); > + if (part == NO_PARTITION) > + return; > + > + rtx x = SA.partition_to_pseudo[part]; > + > + if (!x) > + { > + /* This var will get a stack slot later. */ > + gcc_assert (defer_stack_allocation (var, true)); > + return; > + } > + > + set_rtl (var, x); > + > + if (!REG_P (x)) > + return; > + > + /* Note if the object is a user variable. */ > + if (decl && !DECL_ARTIFICIAL (decl)) > + mark_user_reg (x); > + > + if (POINTER_TYPE_P (decl ? TREE_TYPE (decl) : TREE_TYPE (var))) > + mark_reg_pointer (x, get_pointer_alignment (var)); > +} > + > /* A subroutine of expand_one_var. Called to assign rtl to a VAR_DECL > that will reside in a pseudo register. */ > > static void > expand_one_register_var (tree var) > { > - tree decl = SSAVAR (var); > + if (TREE_CODE (var) == SSA_NAME) > + { > + int part = var_to_partition (SA.map, var); > + if (part != NO_PARTITION) > + { > + rtx x = SA.partition_to_pseudo[part]; > + gcc_assert (x); > + gcc_assert (REG_P (x)); > + return; > + } > + gcc_unreachable (); > + } > + > + tree decl = var; > tree type = TREE_TYPE (decl); > machine_mode reg_mode = promote_decl_mode (decl, NULL); > rtx x = gen_reg_rtx (reg_mode); > @@ -1177,10 +1471,14 @@ expand_one_error_var (tree var) > static bool > defer_stack_allocation (tree var, bool toplevel) > { > + tree size_unit = TREE_CODE (var) == SSA_NAME > + ? TYPE_SIZE_UNIT (TREE_TYPE (var)) > + : DECL_SIZE_UNIT (var); > + > /* Whether the variable is small enough for immediate allocation not to be > a problem with regard to the frame size. 
*/ > bool smallish > - = ((HOST_WIDE_INT) tree_to_uhwi (DECL_SIZE_UNIT (var)) > + = ((HOST_WIDE_INT) tree_to_uhwi (size_unit) > < PARAM_VALUE (PARAM_MIN_SIZE_FOR_STACK_SHARING)); > > /* If stack protection is enabled, *all* stack variables must be deferred, > @@ -1189,16 +1487,24 @@ defer_stack_allocation (tree var, bool toplevel) > if (flag_stack_protect || ((flag_sanitize & SANITIZE_ADDRESS) && ASAN_STACK)) > return true; > > + unsigned int align = TREE_CODE (var) == SSA_NAME > + ? TYPE_ALIGN (TREE_TYPE (var)) > + : DECL_ALIGN (var); > + > /* We handle "large" alignment via dynamic allocation. We want to handle > this extra complication in only one place, so defer them. */ > - if (DECL_ALIGN (var) > MAX_SUPPORTED_STACK_ALIGNMENT) > + if (align > MAX_SUPPORTED_STACK_ALIGNMENT) > return true; > > + bool ignored = TREE_CODE (var) == SSA_NAME > + ? !SSAVAR (var) || DECL_IGNORED_P (SSA_NAME_VAR (var)) > + : DECL_IGNORED_P (var); > + > /* When optimization is enabled, DECL_IGNORED_P variables originally scoped > might be detached from their block and appear at toplevel when we reach > here. We want to coalesce them with variables from other blocks when > the immediate contribution to the frame size would be noticeable. */ > - if (toplevel && optimize > 0 && DECL_IGNORED_P (var) && !smallish) > + if (toplevel && optimize > 0 && ignored && !smallish) > return true; > > /* Variables declared in the outermost scope automatically conflict > @@ -1265,21 +1571,7 @@ expand_one_var (tree var, bool toplevel, bool really_expand) > align = POINTER_SIZE; > } > > - if (SUPPORTS_STACK_ALIGNMENT > - && crtl->stack_alignment_estimated < align) > - { > - /* stack_alignment_estimated shouldn't change after stack > - realign decision made */ > - gcc_assert (!crtl->stack_realign_processed); > - crtl->stack_alignment_estimated = align; > - } > - > - /* stack_alignment_needed > PREFERRED_STACK_BOUNDARY is permitted. > - So here we only make sure stack_alignment_needed >= align. */ > - if (crtl->stack_alignment_needed < align) > - crtl->stack_alignment_needed = align; > - if (crtl->max_used_stack_slot_alignment < align) > - crtl->max_used_stack_slot_alignment = align; > + record_alignment_for_reg_var (align); > > if (TREE_CODE (origvar) == SSA_NAME) > { > @@ -1722,48 +2014,18 @@ expand_used_vars (void) > if (targetm.use_pseudo_pic_reg ()) > pic_offset_table_rtx = gen_reg_rtx (Pmode); > > - hash_map<tree, tree> ssa_name_decls; > for (i = 0; i < SA.map->num_partitions; i++) > { > tree var = partition_to_var (SA.map, i); > > gcc_assert (!virtual_operand_p (var)); > > - /* Assign decls to each SSA name partition, share decls for partitions > - we could have coalesced (those with the same type). */ > - if (SSA_NAME_VAR (var) == NULL_TREE) > - { > - tree *slot = &ssa_name_decls.get_or_insert (TREE_TYPE (var)); > - if (!*slot) > - *slot = create_tmp_reg (TREE_TYPE (var)); > - replace_ssa_name_symbol (var, *slot); > - } > - > - /* Always allocate space for partitions based on VAR_DECLs. But for > - those based on PARM_DECLs or RESULT_DECLs and which matter for the > - debug info, there is no need to do so if optimization is disabled > - because all the SSA_NAMEs based on these DECLs have been coalesced > - into a single partition, which is thus assigned the canonical RTL > - location of the DECLs. If in_lto_p, we can't rely on optimize, > - a function could be compiled with -O1 -flto first and only the > - link performed at -O0. 
*/ > - if (TREE_CODE (SSA_NAME_VAR (var)) == VAR_DECL) > - expand_one_var (var, true, true); > - else if (DECL_IGNORED_P (SSA_NAME_VAR (var)) || optimize || in_lto_p) > - { > - /* This is a PARM_DECL or RESULT_DECL. For those partitions that > - contain the default def (representing the parm or result itself) > - we don't do anything here. But those which don't contain the > - default def (representing a temporary based on the parm/result) > - we need to allocate space just like for normal VAR_DECLs. */ > - if (!bitmap_bit_p (SA.partition_has_default_def, i)) > - { > - expand_one_var (var, true, true); > - gcc_assert (SA.partition_to_pseudo[i]); > - } > - } > + expand_one_ssa_partition (var); > } > > + for (i = 1; i < num_ssa_names; i++) > + adjust_one_expanded_partition_var (ssa_name (i)); > + > if (flag_stack_protect == SPCT_FLAG_STRONG) > gen_stack_protect_signal > = stack_protect_decl_p () || stack_protect_return_slot_p (); > @@ -5928,35 +6190,6 @@ pass_expand::execute (function *fun) > parm_birth_insn = var_seq; > } > > - /* Now that we also have the parameter RTXs, copy them over to our > - partitions. */ > - for (i = 0; i < SA.map->num_partitions; i++) > - { > - tree var = SSA_NAME_VAR (partition_to_var (SA.map, i)); > - > - if (TREE_CODE (var) != VAR_DECL > - && !SA.partition_to_pseudo[i]) > - SA.partition_to_pseudo[i] = DECL_RTL_IF_SET (var); > - gcc_assert (SA.partition_to_pseudo[i]); > - > - /* If this decl was marked as living in multiple places, reset > - this now to NULL. */ > - if (DECL_RTL_IF_SET (var) == pc_rtx) > - SET_DECL_RTL (var, NULL); > - > - /* Some RTL parts really want to look at DECL_RTL(x) when x > - was a decl marked in REG_ATTR or MEM_ATTR. We could use > - SET_DECL_RTL here making this available, but that would mean > - to select one of the potentially many RTLs for one DECL. Instead > - of doing that we simply reset the MEM_EXPR of the RTL in question, > - then nobody can get at it and hence nobody can call DECL_RTL on it. */ > - if (!DECL_RTL_SET_P (var)) > - { > - if (MEM_P (SA.partition_to_pseudo[i])) > - set_mem_expr (SA.partition_to_pseudo[i], NULL); > - } > - } > - > /* If we have a class containing differently aligned pointers > we need to merge those into the corresponding RTL pointer > alignment. */ > @@ -5964,7 +6197,6 @@ pass_expand::execute (function *fun) > { > tree name = ssa_name (i); > int part; > - rtx r; > > if (!name > /* We might have generated new SSA names in > @@ -5977,20 +6209,25 @@ pass_expand::execute (function *fun) > if (part == NO_PARTITION) > continue; > > - /* Adjust all partition members to get the underlying decl of > - the representative which we might have created in expand_one_var. */ > - if (SSA_NAME_VAR (name) == NULL_TREE) > + gcc_assert (SA.partition_to_pseudo[part] > + || defer_stack_allocation (name, true)); > + > + /* If this decl was marked as living in multiple places, reset > + this now to NULL. */ > + tree var = SSA_NAME_VAR (name); > + if (var && DECL_RTL_IF_SET (var) == pc_rtx) > + SET_DECL_RTL (var, NULL); > + /* Check that the pseudos chosen by assign_parms are those of > + the corresponding default defs. 
*/ > + else if (SSA_NAME_IS_DEFAULT_DEF (name) > + && (TREE_CODE (var) == PARM_DECL > + || TREE_CODE (var) == RESULT_DECL)) > { > - tree leader = partition_to_var (SA.map, part); > - gcc_assert (SSA_NAME_VAR (leader) != NULL_TREE); > - replace_ssa_name_symbol (name, SSA_NAME_VAR (leader)); > + rtx in = DECL_RTL_IF_SET (var); > + gcc_assert (in); > + rtx out = SA.partition_to_pseudo[part]; > + gcc_assert (in == out || rtx_equal_p (in, out)); > } > - if (!POINTER_TYPE_P (TREE_TYPE (name))) > - continue; > - > - r = SA.partition_to_pseudo[part]; > - if (REG_P (r)) > - mark_reg_pointer (r, get_pointer_alignment (name)); > } > > /* If this function is `main', emit a call to `__main' > diff --git a/gcc/cfgexpand.h b/gcc/cfgexpand.h > index a0b6e3e..987cf356 100644 > --- a/gcc/cfgexpand.h > +++ b/gcc/cfgexpand.h > @@ -22,5 +22,8 @@ along with GCC; see the file COPYING3. If not see > > extern tree gimple_assign_rhs_to_tree (gimple); > extern HOST_WIDE_INT estimated_stack_frame_size (struct cgraph_node *); > +extern bool parm_maybe_byref_p (tree); > +extern rtx get_rtl_for_parm_ssa_default_def (tree var); > + > > #endif /* GCC_CFGEXPAND_H */ > diff --git a/gcc/common.opt b/gcc/common.opt > index e80eadf..dd59ff3 100644 > --- a/gcc/common.opt > +++ b/gcc/common.opt > @@ -2234,16 +2234,16 @@ Common Report Var(flag_tree_ch) Optimization > Enable loop header copying on trees > > ftree-coalesce-inlined-vars > -Common Report Var(flag_ssa_coalesce_vars,1) Init(2) RejectNegative Optimization > -Enable coalescing of copy-related user variables that are inlined > +Common Ignore RejectNegative > +Does nothing. Preserved for backward compatibility. > > ftree-coalesce-vars > -Common Report Var(flag_ssa_coalesce_vars,2) Optimization > -Enable coalescing of all copy-related user variables > +Common Report Var(flag_tree_coalesce_vars) Optimization > +Enable SSA coalescing of user variables > > ftree-copyrename > -Common Report Var(flag_tree_copyrename) Optimization > -Replace SSA temporaries with better names in copies > +Common Ignore > +Does nothing. Preserved for backward compatibility. > > ftree-copy-prop > Common Report Var(flag_tree_copy_prop) Optimization > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi > index 2871337..27be317 100644 > --- a/gcc/doc/invoke.texi > +++ b/gcc/doc/invoke.texi > @@ -342,7 +342,6 @@ Objective-C and Objective-C++ Dialects}. > -fdump-tree-phiprop@r{[}-@var{n}@r{]} @gol > -fdump-tree-phiopt@r{[}-@var{n}@r{]} @gol > -fdump-tree-forwprop@r{[}-@var{n}@r{]} @gol > --fdump-tree-copyrename@r{[}-@var{n}@r{]} @gol > -fdump-tree-nrv -fdump-tree-vect @gol > -fdump-tree-sink @gol > -fdump-tree-sra@r{[}-@var{n}@r{]} @gol > @@ -448,9 +447,8 @@ Objective-C and Objective-C++ Dialects}. 
> -fstack-protector-explicit -fstdarg-opt -fstrict-aliasing @gol > -fstrict-overflow -fthread-jumps -ftracer -ftree-bit-ccp @gol > -ftree-builtin-call-dce -ftree-ccp -ftree-ch @gol > --ftree-coalesce-inline-vars -ftree-coalesce-vars -ftree-copy-prop @gol > --ftree-copyrename -ftree-dce -ftree-dominator-opts -ftree-dse @gol > --ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol > +-ftree-coalesce-vars -ftree-copy-prop -ftree-dce -ftree-dominator-opts @gol > +-ftree-dse -ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol > -ftree-loop-if-convert-stores -ftree-loop-im @gol > -ftree-phiprop -ftree-loop-distribution -ftree-loop-distribute-patterns @gol > -ftree-loop-ivcanon -ftree-loop-linear -ftree-loop-optimize @gol > @@ -7133,11 +7131,6 @@ name is made by appending @file{.phiopt} to the source file name. > Dump each function after forward propagating single use variables. The file > name is made by appending @file{.forwprop} to the source file name. > > -@item copyrename > -@opindex fdump-tree-copyrename > -Dump each function after applying the copy rename optimization. The file > -name is made by appending @file{.copyrename} to the source file name. > - > @item nrv > @opindex fdump-tree-nrv > Dump each function after applying the named return value optimization on > @@ -7602,8 +7595,8 @@ compilation time. > -ftree-ccp @gol > -fssa-phiopt @gol > -ftree-ch @gol > +-ftree-coalesce-vars @gol > -ftree-copy-prop @gol > --ftree-copyrename @gol > -ftree-dce @gol > -ftree-dominator-opts @gol > -ftree-dse @gol > @@ -8867,6 +8860,15 @@ be parallelized. Parallelize all the loops that can be analyzed to > not contain loop carried dependences without checking that it is > profitable to parallelize the loops. > > +@item -ftree-coalesce-vars > +@opindex ftree-coalesce-vars > +Tell the compiler to attempt to combine small user-defined variables > +too, instead of just compiler temporaries. This may severely limit the > +ability to debug an optimized program compiled with > +@option{-fno-var-tracking-assignments}. In the negated form, this flag > +prevents SSA coalescing of user variables. This option is enabled by > +default if optimization is enabled. > + > @item -ftree-loop-if-convert > @opindex ftree-loop-if-convert > Attempt to transform conditional jumps in the innermost loops to > @@ -8980,32 +8982,6 @@ Perform scalar replacement of aggregates. This pass replaces structure > references with scalars to prevent committing structures to memory too > early. This flag is enabled by default at @option{-O} and higher. > > -@item -ftree-copyrename > -@opindex ftree-copyrename > -Perform copy renaming on trees. This pass attempts to rename compiler > -temporaries to other variables at copy locations, usually resulting in > -variable names which more closely resemble the original variables. This flag > -is enabled by default at @option{-O} and higher. > - > -@item -ftree-coalesce-inlined-vars > -@opindex ftree-coalesce-inlined-vars > -Tell the copyrename pass (see @option{-ftree-copyrename}) to attempt to > -combine small user-defined variables too, but only if they are inlined > -from other functions. It is a more limited form of > -@option{-ftree-coalesce-vars}. This may harm debug information of such > -inlined variables, but it keeps variables of the inlined-into > -function apart from each other, such that they are more likely to > -contain the expected values in a debugging session. 
> - > -@item -ftree-coalesce-vars > -@opindex ftree-coalesce-vars > -Tell the copyrename pass (see @option{-ftree-copyrename}) to attempt to > -combine small user-defined variables too, instead of just compiler > -temporaries. This may severely limit the ability to debug an optimized > -program compiled with @option{-fno-var-tracking-assignments}. In the > -negated form, this flag prevents SSA coalescing of user variables, > -including inlined ones. This option is enabled by default. > - > @item -ftree-ter > @opindex ftree-ter > Perform temporary expression replacement during the SSA->normal phase. Single > diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c > index d211e6b0..a6ef154 100644 > --- a/gcc/emit-rtl.c > +++ b/gcc/emit-rtl.c > @@ -63,6 +63,7 @@ along with GCC; see the file COPYING3. If not see > #include "target.h" > #include "builtins.h" > #include "rtl-iter.h" > +#include "stor-layout.h" > > struct target_rtl default_target_rtl; > #if SWITCHABLE_TARGET > @@ -1233,6 +1234,9 @@ set_reg_attrs_for_parm (rtx parm_rtx, rtx mem) > void > set_reg_attrs_for_decl_rtl (tree t, rtx x) > { > + if (!t) > + return; > + tree tdecl = t; > if (GET_CODE (x) == SUBREG) > { > gcc_assert (subreg_lowpart_p (x)); > @@ -1241,7 +1245,9 @@ set_reg_attrs_for_decl_rtl (tree t, rtx x) > if (REG_P (x)) > REG_ATTRS (x) > = get_reg_attrs (t, byte_lowpart_offset (GET_MODE (x), > - DECL_MODE (t))); > + DECL_P (tdecl) > + ? DECL_MODE (tdecl) > + : TYPE_MODE (TREE_TYPE (tdecl)))); > if (GET_CODE (x) == CONCAT) > { > if (REG_P (XEXP (x, 0))) > diff --git a/gcc/explow.c b/gcc/explow.c > index bd342c1..6941f4e 100644 > --- a/gcc/explow.c > +++ b/gcc/explow.c > @@ -842,6 +842,35 @@ promote_decl_mode (const_tree decl, int *punsignedp) > return pmode; > } > > +/* Return the promoted mode for name. If it is a named SSA_NAME, it > + is the same as promote_decl_mode. Otherwise, it is the promoted > + mode of a temp decl of same type as the SSA_NAME, if we had created > + one. */ > + > +machine_mode > +promote_ssa_mode (const_tree name, int *punsignedp) > +{ > + gcc_assert (TREE_CODE (name) == SSA_NAME); > + > + /* Partitions holding parms and results must be promoted as expected > + by function.c. */ > + if (SSA_NAME_VAR (name) > + && (TREE_CODE (SSA_NAME_VAR (name)) == PARM_DECL > + || TREE_CODE (SSA_NAME_VAR (name)) == RESULT_DECL)) > + return promote_decl_mode (SSA_NAME_VAR (name), punsignedp); > + > + tree type = TREE_TYPE (name); > + int unsignedp = TYPE_UNSIGNED (type); > + machine_mode mode = TYPE_MODE (type); > + > + machine_mode pmode = promote_mode (type, mode, &unsignedp); > + if (punsignedp) > + *punsignedp = unsignedp; > + > + return pmode; > +} > + > + > > /* Controls the behaviour of {anti_,}adjust_stack. */ > static bool suppress_reg_args_size; > diff --git a/gcc/explow.h b/gcc/explow.h > index 94613de..52113db 100644 > --- a/gcc/explow.h > +++ b/gcc/explow.h > @@ -57,6 +57,9 @@ extern machine_mode promote_mode (const_tree, machine_mode, int *); > /* Return mode and signedness to use when object is promoted. */ > machine_mode promote_decl_mode (const_tree, int *); > > +/* Return mode and signedness to use when object is promoted. */ > +machine_mode promote_ssa_mode (const_tree, int *); > + > /* Remove some bytes from the stack. An rtx says how many. 
*/ > extern void adjust_stack (rtx); > > diff --git a/gcc/expr.c b/gcc/expr.c > index 31b4573..f604f52 100644 > --- a/gcc/expr.c > +++ b/gcc/expr.c > @@ -3022,7 +3022,7 @@ write_complex_part (rtx cplx, rtx val, bool imag_p) > /* Extract one of the components of the complex value CPLX. Extract the > real part if IMAG_P is false, and the imaginary part if it's true. */ > > -static rtx > +rtx > read_complex_part (rtx cplx, bool imag_p) > { > machine_mode cmode, imode; > @@ -9236,7 +9236,7 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode, > rtx op0, op1, temp, decl_rtl; > tree type; > int unsignedp; > - machine_mode mode; > + machine_mode mode, dmode; > enum tree_code code = TREE_CODE (exp); > rtx subtarget, original_target; > int ignore; > @@ -9367,7 +9367,8 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode, > if (g == NULL > && modifier == EXPAND_INITIALIZER > && !SSA_NAME_IS_DEFAULT_DEF (exp) > - && (optimize || DECL_IGNORED_P (SSA_NAME_VAR (exp))) > + && (optimize || !SSA_NAME_VAR (exp) > + || DECL_IGNORED_P (SSA_NAME_VAR (exp))) > && stmt_is_replaceable_p (SSA_NAME_DEF_STMT (exp))) > g = SSA_NAME_DEF_STMT (exp); > if (g) > @@ -9446,15 +9447,18 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode, > /* Ensure variable marked as used even if it doesn't go through > a parser. If it hasn't be used yet, write out an external > definition. */ > - TREE_USED (exp) = 1; > + if (exp) > + TREE_USED (exp) = 1; > > /* Show we haven't gotten RTL for this yet. */ > temp = 0; > > /* Variables inherited from containing functions should have > been lowered by this point. */ > - context = decl_function_context (exp); > - gcc_assert (SCOPE_FILE_SCOPE_P (context) > + if (exp) > + context = decl_function_context (exp); > + gcc_assert (!exp > + || SCOPE_FILE_SCOPE_P (context) > || context == current_function_decl > || TREE_STATIC (exp) > || DECL_EXTERNAL (exp) > @@ -9478,7 +9482,8 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode, > decl_rtl = use_anchored_address (decl_rtl); > if (modifier != EXPAND_CONST_ADDRESS > && modifier != EXPAND_SUM > - && !memory_address_addr_space_p (DECL_MODE (exp), > + && !memory_address_addr_space_p (exp ? DECL_MODE (exp) > + : GET_MODE (decl_rtl), > XEXP (decl_rtl, 0), > MEM_ADDR_SPACE (decl_rtl))) > temp = replace_equiv_address (decl_rtl, > @@ -9489,12 +9494,17 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode, > if the address is a register. */ > if (temp != 0) > { > - if (MEM_P (temp) && REG_P (XEXP (temp, 0))) > + if (exp && MEM_P (temp) && REG_P (XEXP (temp, 0))) > mark_reg_pointer (XEXP (temp, 0), DECL_ALIGN (exp)); > > return temp; > } > > + if (exp) > + dmode = DECL_MODE (exp); > + else > + dmode = TYPE_MODE (TREE_TYPE (ssa_name)); > + > /* If the mode of DECL_RTL does not match that of the decl, > there are two cases: we are dealing with a BLKmode value > that is returned in a register, or we are dealing with > @@ -9502,22 +9512,23 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode, > of the wanted mode, but mark it so that we know that it > was already extended. */ > if (REG_P (decl_rtl) > - && DECL_MODE (exp) != BLKmode > - && GET_MODE (decl_rtl) != DECL_MODE (exp)) > + && dmode != BLKmode > + && GET_MODE (decl_rtl) != dmode) > { > machine_mode pmode; > > /* Get the signedness to be used for this variable. Ensure we get > the same mode we got when the variable was declared. 
*/ > - if (code == SSA_NAME > - && (g = SSA_NAME_DEF_STMT (ssa_name)) > - && gimple_code (g) == GIMPLE_CALL > - && !gimple_call_internal_p (g)) > + if (code != SSA_NAME) > + pmode = promote_decl_mode (exp, &unsignedp); > + else if ((g = SSA_NAME_DEF_STMT (ssa_name)) > + && gimple_code (g) == GIMPLE_CALL > + && !gimple_call_internal_p (g)) > pmode = promote_function_mode (type, mode, &unsignedp, > gimple_call_fntype (g), > 2); > else > - pmode = promote_decl_mode (exp, &unsignedp); > + pmode = promote_ssa_mode (ssa_name, &unsignedp); > gcc_assert (GET_MODE (decl_rtl) == pmode); > > temp = gen_lowpart_SUBREG (mode, decl_rtl); > diff --git a/gcc/expr.h b/gcc/expr.h > index 32d1707..a2c8e1d 100644 > --- a/gcc/expr.h > +++ b/gcc/expr.h > @@ -210,6 +210,7 @@ extern rtx_insn *emit_move_insn_1 (rtx, rtx); > > extern rtx_insn *emit_move_complex_push (machine_mode, rtx, rtx); > extern rtx_insn *emit_move_complex_parts (rtx, rtx); > +extern rtx read_complex_part (rtx, bool); > extern void write_complex_part (rtx, rtx, bool); > extern rtx emit_move_resolve_push (machine_mode, rtx); > > diff --git a/gcc/function.c b/gcc/function.c > index 20bf3b3..715c19f 100644 > --- a/gcc/function.c > +++ b/gcc/function.c > @@ -72,6 +72,9 @@ along with GCC; see the file COPYING3. If not see > #include "cfganal.h" > #include "cfgbuild.h" > #include "cfgcleanup.h" > +#include "cfgexpand.h" > +#include "basic-block.h" > +#include "df.h" > #include "params.h" > #include "bb-reorder.h" > #include "shrink-wrap.h" > @@ -148,6 +151,9 @@ static bool contains (const_rtx, hash_table<insn_cache_hasher> *); > static void prepare_function_start (void); > static void do_clobber_return_reg (rtx, void *); > static void do_use_return_reg (rtx, void *); > +static rtx rtl_for_parm (struct assign_parm_data_all *, tree); > +static void maybe_reset_rtl_for_parm (tree); > + > > /* Stack of nested functions. */ > /* Keep track of the cfun stack. */ > @@ -2105,6 +2111,30 @@ aggregate_value_p (const_tree exp, const_tree fntype) > bool > use_register_for_decl (const_tree decl) > { > + if (TREE_CODE (decl) == SSA_NAME) > + { > + /* We often try to use the SSA_NAME, instead of its underlying > + decl, to get type information and guide decisions, to avoid > + differences of behavior between anonymous and named > + variables, but in this one case we have to go for the actual > + variable if there is one. The main reason is that, at least > + at -O0, we want to place user variables on the stack, but we > + don't mind using pseudos for anonymous or ignored temps. > + Should we take the SSA_NAME, we'd conclude all SSA_NAMEs > + should go in pseudos, whereas their corresponding variables > + might have to go on the stack. So, disregarding the decl > + here would negatively impact debug info at -O0, enable > + coalescing between SSA_NAMEs that ought to get different > + stack/pseudo assignments, and get the incoming argument > + processing thoroughly confused by PARM_DECLs expected to live > + in stack slots but assigned to pseudos. */ > + if (!SSA_NAME_VAR (decl)) > + return TYPE_MODE (TREE_TYPE (decl)) != BLKmode > + && !(flag_float_store && FLOAT_TYPE_P (TREE_TYPE (decl))); > + > + decl = SSA_NAME_VAR (decl); > + } > + > /* Honor volatile. */ > if (TREE_SIDE_EFFECTS (decl)) > return false; > @@ -2240,7 +2270,7 @@ assign_parms_initialize_all (struct assign_parm_data_all *all) > needed, else the old list. 
*/ > > static void > -split_complex_args (vec<tree> *args) > +split_complex_args (struct assign_parm_data_all *all, vec<tree> *args) > { > unsigned i; > tree p; > @@ -2251,6 +2281,7 @@ split_complex_args (vec<tree> *args) > if (TREE_CODE (type) == COMPLEX_TYPE > && targetm.calls.split_complex_arg (type)) > { > + tree cparm = p; > tree decl; > tree subtype = TREE_TYPE (type); > bool addressable = TREE_ADDRESSABLE (p); > @@ -2269,6 +2300,9 @@ split_complex_args (vec<tree> *args) > DECL_ARTIFICIAL (p) = addressable; > DECL_IGNORED_P (p) = addressable; > TREE_ADDRESSABLE (p) = 0; > + /* Reset the RTL before layout_decl, or it may change the > + mode of the RTL of the original argument copied to P. */ > + SET_DECL_RTL (p, NULL_RTX); > layout_decl (p, 0); > (*args)[i] = p; > > @@ -2280,6 +2314,25 @@ split_complex_args (vec<tree> *args) > DECL_IGNORED_P (decl) = addressable; > layout_decl (decl, 0); > args->safe_insert (++i, decl); > + > + /* If we are expanding a function, rather than gimplifying > + it, propagate the RTL of the complex parm to the split > + declarations, and set their contexts so that > + maybe_reset_rtl_for_parm can recognize them and refrain > + from resetting their RTL. */ > + if (currently_expanding_to_rtl) > + { > + maybe_reset_rtl_for_parm (cparm); > + rtx rtl = rtl_for_parm (all, cparm); > + if (rtl) > + { > + SET_DECL_RTL (p, read_complex_part (rtl, false)); > + SET_DECL_RTL (decl, read_complex_part (rtl, true)); > + > + DECL_CONTEXT (p) = cparm; > + DECL_CONTEXT (decl) = cparm; > + } > + } > } > } > } > @@ -2342,7 +2395,7 @@ assign_parms_augmented_arg_list (struct assign_parm_data_all *all) > > /* If the target wants to split complex arguments into scalars, do so. */ > if (targetm.calls.split_complex_arg) > - split_complex_args (&fnargs); > + split_complex_args (all, &fnargs); > > return fnargs; > } > @@ -2745,23 +2798,98 @@ assign_parm_remove_parallels (struct assign_parm_data_one *data) > data->entry_parm = entry_parm; > } > > +/* Wrapper for use_register_for_decl, that special-cases the > + .result_ptr as the function's RESULT_DECL when the RESULT_DECL is > + passed by reference. */ > + > +static bool > +use_register_for_parm_decl (struct assign_parm_data_all *all, tree parm) > +{ > + if (parm == all->function_result_decl) > + { > + tree result = DECL_RESULT (current_function_decl); > + > + if (DECL_BY_REFERENCE (result)) > + parm = result; > + } > + > + return use_register_for_decl (parm); > +} > + > +/* Wrapper for get_rtl_for_parm_ssa_default_def, that special-cases > + the .result_ptr as the function's RESULT_DECL when the RESULT_DECL > + is passed by reference. */ > + > +static rtx > +rtl_for_parm (struct assign_parm_data_all *all, tree parm) > +{ > + if (parm == all->function_result_decl) > + { > + tree result = DECL_RESULT (current_function_decl); > + > + if (!DECL_BY_REFERENCE (result)) > + return NULL_RTX; > + > + parm = result; > + } > + > + return get_rtl_for_parm_ssa_default_def (parm); > +} > + > +/* Reset the location of PARM_DECLs and RESULT_DECLs that had > + SSA_NAMEs in multiple partitions, so that assign_parms will choose > + the default def, if it exists, or create new RTL to hold the unused > + entry value. If we are coalescing across variables, we want to > + reset the location too, because a parm without a default def > + (incoming value unused) might be coalesced with one with a default > + def, and then assign_parms would copy both incoming values to the > + same location, which might cause the wrong value to survive. 
*/ > +static void > +maybe_reset_rtl_for_parm (tree parm) > +{ > + gcc_assert (TREE_CODE (parm) == PARM_DECL > + || TREE_CODE (parm) == RESULT_DECL); > + > + /* This is a split complex parameter, and its context was set to its > + original PARM_DECL in split_complex_args so that we could > + recognize it here and not reset its RTL. */ > + if (DECL_CONTEXT (parm) && TREE_CODE (DECL_CONTEXT (parm)) == PARM_DECL) > + { > + DECL_CONTEXT (parm) = DECL_CONTEXT (DECL_CONTEXT (parm)); > + return; > + } > + > + if ((flag_tree_coalesce_vars > + || (DECL_RTL_SET_P (parm) && DECL_RTL (parm) == pc_rtx)) > + && is_gimple_reg (parm)) > + SET_DECL_RTL (parm, NULL_RTX); > +} > + > /* A subroutine of assign_parms. Adjust DATA->STACK_RTL such that it's > always valid and properly aligned. */ > > static void > -assign_parm_adjust_stack_rtl (struct assign_parm_data_one *data) > +assign_parm_adjust_stack_rtl (struct assign_parm_data_all *all, tree parm, > + struct assign_parm_data_one *data) > { > rtx stack_parm = data->stack_parm; > > + /* If out-of-SSA assigned RTL to the parm default def, make sure we > + don't use what we might have computed before. */ > + rtx ssa_assigned = rtl_for_parm (all, parm); > + if (ssa_assigned) > + stack_parm = NULL; > + > /* If we can't trust the parm stack slot to be aligned enough for its > ultimate type, don't use that slot after entry. We'll make another > stack slot, if we need one. */ > - if (stack_parm > - && ((STRICT_ALIGNMENT > - && GET_MODE_ALIGNMENT (data->nominal_mode) > MEM_ALIGN (stack_parm)) > - || (data->nominal_type > - && TYPE_ALIGN (data->nominal_type) > MEM_ALIGN (stack_parm) > - && MEM_ALIGN (stack_parm) < PREFERRED_STACK_BOUNDARY))) > + else if (stack_parm > + && ((STRICT_ALIGNMENT > + && (GET_MODE_ALIGNMENT (data->nominal_mode) > + > MEM_ALIGN (stack_parm))) > + || (data->nominal_type > + && TYPE_ALIGN (data->nominal_type) > MEM_ALIGN (stack_parm) > + && MEM_ALIGN (stack_parm) < PREFERRED_STACK_BOUNDARY))) > stack_parm = NULL; > > /* If parm was passed in memory, and we need to convert it on entry, > @@ -2823,14 +2951,32 @@ assign_parm_setup_block (struct assign_parm_data_all *all, > > size = int_size_in_bytes (data->passed_type); > size_stored = CEIL_ROUND (size, UNITS_PER_WORD); > + > if (stack_parm == 0) > { > DECL_ALIGN (parm) = MAX (DECL_ALIGN (parm), BITS_PER_WORD); > - stack_parm = assign_stack_local (BLKmode, size_stored, > - DECL_ALIGN (parm)); > - if (GET_MODE_SIZE (GET_MODE (entry_parm)) == size) > - PUT_MODE (stack_parm, GET_MODE (entry_parm)); > - set_mem_attributes (stack_parm, parm, 1); > + rtx from_expand = rtl_for_parm (all, parm); > + if (from_expand && (!parm_maybe_byref_p (parm) > + || XEXP (from_expand, 0) != NULL_RTX)) > + stack_parm = copy_rtx (from_expand); > + else > + { > + stack_parm = assign_stack_local (BLKmode, size_stored, > + DECL_ALIGN (parm)); > + if (GET_MODE_SIZE (GET_MODE (entry_parm)) == size) > + PUT_MODE (stack_parm, GET_MODE (entry_parm)); > + if (from_expand) > + { > + gcc_assert (GET_CODE (stack_parm) == MEM); > + gcc_assert (GET_CODE (from_expand) == MEM); > + gcc_assert (XEXP (from_expand, 0) == NULL_RTX); > + XEXP (from_expand, 0) = XEXP (stack_parm, 0); > + PUT_MODE (from_expand, GET_MODE (stack_parm)); > + stack_parm = copy_rtx (from_expand); > + } > + else > + set_mem_attributes (stack_parm, parm, 1); > + } > } > > /* If a BLKmode arrives in registers, copy it to a stack slot. 
Handle > @@ -2968,14 +3114,34 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm, > = promote_function_mode (data->nominal_type, data->nominal_mode, &unsignedp, > TREE_TYPE (current_function_decl), 2); > > - parmreg = gen_reg_rtx (promoted_nominal_mode); > + rtx from_expand = parmreg = rtl_for_parm (all, parm); > > - if (!DECL_ARTIFICIAL (parm)) > - mark_user_reg (parmreg); > + if (from_expand && !data->passed_pointer) > + { > + if (GET_MODE (parmreg) != promoted_nominal_mode) > + parmreg = gen_lowpart (promoted_nominal_mode, parmreg); > + } > + else if (!from_expand || parm_maybe_byref_p (parm)) > + { > + parmreg = gen_reg_rtx (promoted_nominal_mode); > + if (!DECL_ARTIFICIAL (parm)) > + mark_user_reg (parmreg); > + > + if (from_expand) > + { > + gcc_assert (data->passed_pointer); > + gcc_assert (GET_CODE (from_expand) == MEM > + && GET_MODE (from_expand) == BLKmode > + && XEXP (from_expand, 0) == NULL_RTX); > + XEXP (from_expand, 0) = parmreg; > + } > + } > > /* If this was an item that we received a pointer to, > set DECL_RTL appropriately. */ > - if (data->passed_pointer) > + if (from_expand) > + SET_DECL_RTL (parm, from_expand); > + else if (data->passed_pointer) > { > rtx x = gen_rtx_MEM (TYPE_MODE (TREE_TYPE (data->passed_type)), parmreg); > set_mem_attributes (x, parm, 1); > @@ -2990,10 +3156,13 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm, > assign_parm_find_data_types and expand_expr_real_1. */ > > equiv_stack_parm = data->stack_parm; > + if (!equiv_stack_parm) > + equiv_stack_parm = data->entry_parm; > validated_mem = validize_mem (copy_rtx (data->entry_parm)); > > need_conversion = (data->nominal_mode != data->passed_mode > || promoted_nominal_mode != data->promoted_mode); > + gcc_assert (!(need_conversion && data->passed_pointer && from_expand)); > moved = false; > > if (need_conversion > @@ -3125,16 +3294,28 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm, > > did_conversion = true; > } > - else > + /* We don't want to copy the incoming pointer to a parmreg expected > + to hold the value rather than the pointer. */ > + else if (!data->passed_pointer || parmreg != from_expand) > emit_move_insn (parmreg, validated_mem); > > /* If we were passed a pointer but the actual value can safely live > in a register, retrieve it and use it directly. */ > - if (data->passed_pointer && TYPE_MODE (TREE_TYPE (parm)) != BLKmode) > + if (data->passed_pointer > + && (from_expand || TYPE_MODE (TREE_TYPE (parm)) != BLKmode)) > { > + rtx src = DECL_RTL (parm); > + > /* We can't use nominal_mode, because it will have been set to > Pmode above. We must use the actual mode of the parm. 
*/ > - if (use_register_for_decl (parm)) > + if (from_expand) > + { > + parmreg = from_expand; > + gcc_assert (GET_MODE (parmreg) == TYPE_MODE (TREE_TYPE (parm))); > + src = gen_rtx_MEM (GET_MODE (parmreg), validated_mem); > + set_mem_attributes (src, parm, 1); > + } > + else if (use_register_for_decl (parm)) > { > parmreg = gen_reg_rtx (TYPE_MODE (TREE_TYPE (parm))); > mark_user_reg (parmreg); > @@ -3151,14 +3332,14 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm, > set_mem_attributes (parmreg, parm, 1); > } > > - if (GET_MODE (parmreg) != GET_MODE (DECL_RTL (parm))) > + if (GET_MODE (parmreg) != GET_MODE (src)) > { > - rtx tempreg = gen_reg_rtx (GET_MODE (DECL_RTL (parm))); > + rtx tempreg = gen_reg_rtx (GET_MODE (src)); > int unsigned_p = TYPE_UNSIGNED (TREE_TYPE (parm)); > > push_to_sequence2 (all->first_conversion_insn, > all->last_conversion_insn); > - emit_move_insn (tempreg, DECL_RTL (parm)); > + emit_move_insn (tempreg, src); > tempreg = convert_to_mode (GET_MODE (parmreg), tempreg, unsigned_p); > emit_move_insn (parmreg, tempreg); > all->first_conversion_insn = get_insns (); > @@ -3167,14 +3348,16 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm, > > did_conversion = true; > } > + else if (GET_MODE (parmreg) == BLKmode) > + gcc_assert (parm_maybe_byref_p (parm)); > else > - emit_move_insn (parmreg, DECL_RTL (parm)); > + emit_move_insn (parmreg, src); > > SET_DECL_RTL (parm, parmreg); > > /* STACK_PARM is the pointer, not the parm, and PARMREG is > now the parm. */ > - data->stack_parm = NULL; > + data->stack_parm = equiv_stack_parm = NULL; > } > > /* Mark the register as eliminable if we did no conversion and it was > @@ -3184,11 +3367,11 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm, > make here would screw up life analysis for it. 
*/ > if (data->nominal_mode == data->passed_mode > && !did_conversion > - && data->stack_parm != 0 > - && MEM_P (data->stack_parm) > + && equiv_stack_parm != 0 > + && MEM_P (equiv_stack_parm) > && data->locate.offset.var == 0 > && reg_mentioned_p (virtual_incoming_args_rtx, > - XEXP (data->stack_parm, 0))) > + XEXP (equiv_stack_parm, 0))) > { > rtx_insn *linsn = get_last_insn (); > rtx_insn *sinsn; > @@ -3201,8 +3384,8 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm, > = GET_MODE_INNER (GET_MODE (parmreg)); > int regnor = REGNO (XEXP (parmreg, 0)); > int regnoi = REGNO (XEXP (parmreg, 1)); > - rtx stackr = adjust_address_nv (data->stack_parm, submode, 0); > - rtx stacki = adjust_address_nv (data->stack_parm, submode, > + rtx stackr = adjust_address_nv (equiv_stack_parm, submode, 0); > + rtx stacki = adjust_address_nv (equiv_stack_parm, submode, > GET_MODE_SIZE (submode)); > > /* Scan backwards for the set of the real and > @@ -3275,6 +3458,13 @@ assign_parm_setup_stack (struct assign_parm_data_all *all, tree parm, > > if (data->stack_parm == 0) > { > + rtx x = data->stack_parm = rtl_for_parm (all, parm); > + if (x) > + gcc_assert (GET_MODE (x) == GET_MODE (data->entry_parm)); > + } > + > + if (data->stack_parm == 0) > + { > int align = STACK_SLOT_ALIGNMENT (data->passed_type, > GET_MODE (data->entry_parm), > TYPE_ALIGN (data->passed_type)); > @@ -3337,11 +3527,21 @@ assign_parms_unsplit_complex (struct assign_parm_data_all *all, > imag = DECL_RTL (fnargs[i + 1]); > if (inner != GET_MODE (real)) > { > - real = gen_lowpart_SUBREG (inner, real); > - imag = gen_lowpart_SUBREG (inner, imag); > + real = simplify_gen_subreg (inner, real, GET_MODE (real), > + subreg_lowpart_offset > + (inner, GET_MODE (real))); > + imag = simplify_gen_subreg (inner, imag, GET_MODE (imag), > + subreg_lowpart_offset > + (inner, GET_MODE (imag))); > } > > - if (TREE_ADDRESSABLE (parm)) > + if ((tmp = rtl_for_parm (all, parm)) != NULL_RTX > + && rtx_equal_p (real, > + read_complex_part (tmp, false)) > + && rtx_equal_p (imag, > + read_complex_part (tmp, true))) > + ; /* We now have the right rtl in tmp. */ > + else if (TREE_ADDRESSABLE (parm)) > { > rtx rmem, imem; > HOST_WIDE_INT size = int_size_in_bytes (TREE_TYPE (parm)); > @@ -3487,7 +3687,7 @@ assign_bounds (vec<bounds_parm_data> &bndargs, > assign_parm_setup_block (&all, pbdata->bounds_parm, > &pbdata->parm_data); > else if (pbdata->parm_data.passed_pointer > - || use_register_for_decl (pbdata->bounds_parm)) > + || use_register_for_parm_decl (&all, pbdata->bounds_parm)) > assign_parm_setup_reg (&all, pbdata->bounds_parm, > &pbdata->parm_data); > else > @@ -3531,6 +3731,8 @@ assign_parms (tree fndecl) > DECL_INCOMING_RTL (parm) = DECL_RTL (parm); > continue; > } > + else > + maybe_reset_rtl_for_parm (parm); > > /* Estimate stack alignment from parameter alignment. */ > if (SUPPORTS_STACK_ALIGNMENT) > @@ -3580,7 +3782,9 @@ assign_parms (tree fndecl) > else > set_decl_incoming_rtl (parm, data.entry_parm, false); > > - /* Boudns should be loaded in the particular order to > + assign_parm_adjust_stack_rtl (&all, parm, &data); > + > + /* Bounds should be loaded in the particular order to > have registers allocated correctly. Collect info about > input bounds and load them later. 
*/ > if (POINTER_BOUNDS_TYPE_P (data.passed_type)) > @@ -3597,11 +3801,10 @@ assign_parms (tree fndecl) > } > else > { > - assign_parm_adjust_stack_rtl (&data); > - > if (assign_parm_setup_block_p (&data)) > assign_parm_setup_block (&all, parm, &data); > - else if (data.passed_pointer || use_register_for_decl (parm)) > + else if (data.passed_pointer > + || use_register_for_parm_decl (&all, parm)) > assign_parm_setup_reg (&all, parm, &data); > else > assign_parm_setup_stack (&all, parm, &data); > @@ -4932,7 +5135,9 @@ expand_function_start (tree subr) > before any library calls that assign parms might generate. */ > > /* Decide whether to return the value in memory or in a register. */ > - if (aggregate_value_p (DECL_RESULT (subr), subr)) > + tree res = DECL_RESULT (subr); > + maybe_reset_rtl_for_parm (res); > + if (aggregate_value_p (res, subr)) > { > /* Returning something that won't go in a register. */ > rtx value_address = 0; > @@ -4940,7 +5145,7 @@ expand_function_start (tree subr) > #ifdef PCC_STATIC_STRUCT_RETURN > if (cfun->returns_pcc_struct) > { > - int size = int_size_in_bytes (TREE_TYPE (DECL_RESULT (subr))); > + int size = int_size_in_bytes (TREE_TYPE (res)); > value_address = assemble_static_space (size); > } > else > @@ -4952,36 +5157,45 @@ expand_function_start (tree subr) > it. */ > if (sv) > { > - value_address = gen_reg_rtx (Pmode); > + if (DECL_BY_REFERENCE (res)) > + value_address = get_rtl_for_parm_ssa_default_def (res); > + if (!value_address) > + value_address = gen_reg_rtx (Pmode); > emit_move_insn (value_address, sv); > } > } > if (value_address) > { > rtx x = value_address; > - if (!DECL_BY_REFERENCE (DECL_RESULT (subr))) > + if (!DECL_BY_REFERENCE (res)) > { > - x = gen_rtx_MEM (DECL_MODE (DECL_RESULT (subr)), x); > - set_mem_attributes (x, DECL_RESULT (subr), 1); > + x = get_rtl_for_parm_ssa_default_def (res); > + if (!x) > + { > + x = gen_rtx_MEM (DECL_MODE (res), value_address); > + set_mem_attributes (x, res, 1); > + } > } > - SET_DECL_RTL (DECL_RESULT (subr), x); > + SET_DECL_RTL (res, x); > } > } > - else if (DECL_MODE (DECL_RESULT (subr)) == VOIDmode) > + else if (DECL_MODE (res) == VOIDmode) > /* If return mode is void, this decl rtl should not be used. */ > - SET_DECL_RTL (DECL_RESULT (subr), NULL_RTX); > + SET_DECL_RTL (res, NULL_RTX); > else > { > /* Compute the return values into a pseudo reg, which we will copy > into the true return register after the cleanups are done. */ > - tree return_type = TREE_TYPE (DECL_RESULT (subr)); > - if (TYPE_MODE (return_type) != BLKmode > - && targetm.calls.return_in_msb (return_type)) > + tree return_type = TREE_TYPE (res); > + rtx x = get_rtl_for_parm_ssa_default_def (res); > + if (x) > + /* Use it. */; > + else if (TYPE_MODE (return_type) != BLKmode > + && targetm.calls.return_in_msb (return_type)) > /* expand_function_end will insert the appropriate padding in > this case. Use the return value's natural (unpadded) mode > within the function proper. */ > - SET_DECL_RTL (DECL_RESULT (subr), > - gen_reg_rtx (TYPE_MODE (return_type))); > + x = gen_reg_rtx (TYPE_MODE (return_type)); > else > { > /* In order to figure out what mode to use for the pseudo, we > @@ -4992,25 +5206,26 @@ expand_function_start (tree subr) > /* Structures that are returned in registers are not > aggregate_value_p, so we may see a PARALLEL or a REG. 
*/ > if (REG_P (hard_reg)) > - SET_DECL_RTL (DECL_RESULT (subr), > - gen_reg_rtx (GET_MODE (hard_reg))); > + x = gen_reg_rtx (GET_MODE (hard_reg)); > else > { > gcc_assert (GET_CODE (hard_reg) == PARALLEL); > - SET_DECL_RTL (DECL_RESULT (subr), gen_group_rtx (hard_reg)); > + x = gen_group_rtx (hard_reg); > } > } > > + SET_DECL_RTL (res, x); > + > /* Set DECL_REGISTER flag so that expand_function_end will copy the > result to the real return register(s). */ > - DECL_REGISTER (DECL_RESULT (subr)) = 1; > + DECL_REGISTER (res) = 1; > > if (chkp_function_instrumented_p (current_function_decl)) > { > - tree return_type = TREE_TYPE (DECL_RESULT (subr)); > + tree return_type = TREE_TYPE (res); > rtx bounds = targetm.calls.chkp_function_value_bounds (return_type, > subr, 1); > - SET_DECL_BOUNDS_RTL (DECL_RESULT (subr), bounds); > + SET_DECL_BOUNDS_RTL (res, bounds); > } > } > > @@ -5025,13 +5240,19 @@ expand_function_start (tree subr) > rtx local, chain; > rtx_insn *insn; > > - local = gen_reg_rtx (Pmode); > + local = get_rtl_for_parm_ssa_default_def (parm); > + if (!local) > + local = gen_reg_rtx (Pmode); > chain = targetm.calls.static_chain (current_function_decl, true); > > set_decl_incoming_rtl (parm, chain, false); > SET_DECL_RTL (parm, local); > mark_reg_pointer (local, TYPE_ALIGN (TREE_TYPE (TREE_TYPE (parm)))); > > + if (GET_MODE (local) != Pmode) > + local = convert_to_mode (Pmode, local, > + TYPE_UNSIGNED (TREE_TYPE (parm))); > + > insn = emit_move_insn (local, chain); > > /* Mark the register as eliminable, similar to parameters. */ > diff --git a/gcc/gimple-expr.c b/gcc/gimple-expr.c > index b558d90..baed630 100644 > --- a/gcc/gimple-expr.c > +++ b/gcc/gimple-expr.c > @@ -375,45 +375,6 @@ copy_var_decl (tree var, tree name, tree type) > return copy; > } > > -/* Given SSA_NAMEs NAME1 and NAME2, return true if they are candidates for > - coalescing together, false otherwise. > - > - This must stay consistent with var_map_base_init in tree-ssa-live.c. */ > - > -bool > -gimple_can_coalesce_p (tree name1, tree name2) > -{ > - /* First check the SSA_NAME's associated DECL. We only want to > - coalesce if they have the same DECL or both have no associated DECL. */ > - tree var1 = SSA_NAME_VAR (name1); > - tree var2 = SSA_NAME_VAR (name2); > - var1 = (var1 && (!VAR_P (var1) || !DECL_IGNORED_P (var1))) ? var1 : NULL_TREE; > - var2 = (var2 && (!VAR_P (var2) || !DECL_IGNORED_P (var2))) ? var2 : NULL_TREE; > - if (var1 != var2) > - return false; > - > - /* Now check the types. If the types are the same, then we should > - try to coalesce V1 and V2. */ > - tree t1 = TREE_TYPE (name1); > - tree t2 = TREE_TYPE (name2); > - if (t1 == t2) > - return true; > - > - /* If the types are not the same, check for a canonical type match. This > - (for example) allows coalescing when the types are fundamentally the > - same, but just have different names. > - > - Note pointer types with different address spaces may have the same > - canonical type. Those are rejected for coalescing by the > - types_compatible_p check. */ > - if (TYPE_CANONICAL (t1) > - && TYPE_CANONICAL (t1) == TYPE_CANONICAL (t2) > - && types_compatible_p (t1, t2)) > - return true; > - > - return false; > -} > - > /* Strip off a legitimate source ending from the input string NAME of > length LEN. 
Rather than having to know the names used by all of > our front ends, we strip off an ending of a period followed by > diff --git a/gcc/gimple-expr.h b/gcc/gimple-expr.h > index ed23eb2..3d1c89f 100644 > --- a/gcc/gimple-expr.h > +++ b/gcc/gimple-expr.h > @@ -28,7 +28,6 @@ extern gimple_seq gimple_body (tree); > extern bool gimple_has_body_p (tree); > extern const char *gimple_decl_printable_name (tree, int); > extern tree copy_var_decl (tree, tree, tree); > -extern bool gimple_can_coalesce_p (tree, tree); > extern tree create_tmp_var_name (const char *); > extern tree create_tmp_var_raw (tree, const char * = NULL); > extern tree create_tmp_var (tree, const char * = NULL); > diff --git a/gcc/opts.c b/gcc/opts.c > index 9d5de96..32de605 100644 > --- a/gcc/opts.c > +++ b/gcc/opts.c > @@ -445,12 +445,12 @@ static const struct default_options default_options_table[] = > { OPT_LEVELS_1_PLUS, OPT_fsplit_wide_types, NULL, 1 }, > { OPT_LEVELS_1_PLUS, OPT_ftree_ccp, NULL, 1 }, > { OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_ftree_bit_ccp, NULL, 1 }, > + { OPT_LEVELS_1_PLUS, OPT_ftree_coalesce_vars, NULL, 1 }, > { OPT_LEVELS_1_PLUS, OPT_ftree_dce, NULL, 1 }, > { OPT_LEVELS_1_PLUS, OPT_ftree_dominator_opts, NULL, 1 }, > { OPT_LEVELS_1_PLUS, OPT_ftree_dse, NULL, 1 }, > { OPT_LEVELS_1_PLUS, OPT_ftree_ter, NULL, 1 }, > { OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_ftree_sra, NULL, 1 }, > - { OPT_LEVELS_1_PLUS, OPT_ftree_copyrename, NULL, 1 }, > { OPT_LEVELS_1_PLUS, OPT_ftree_fre, NULL, 1 }, > { OPT_LEVELS_1_PLUS, OPT_ftree_copy_prop, NULL, 1 }, > { OPT_LEVELS_1_PLUS, OPT_ftree_sink, NULL, 1 }, > diff --git a/gcc/passes.def b/gcc/passes.def > index 6b66f8f..64fc4d9 100644 > --- a/gcc/passes.def > +++ b/gcc/passes.def > @@ -77,7 +77,6 @@ along with GCC; see the file COPYING3. If not see > NEXT_PASS (pass_all_early_optimizations); > PUSH_INSERT_PASSES_WITHIN (pass_all_early_optimizations) > NEXT_PASS (pass_remove_cgraph_callee_edges); > - NEXT_PASS (pass_rename_ssa_copies); > NEXT_PASS (pass_object_sizes); > NEXT_PASS (pass_ccp); > /* After CCP we rewrite no longer addressed locals into SSA > @@ -155,7 +154,6 @@ along with GCC; see the file COPYING3. If not see > /* Initial scalar cleanups before alias computation. > They ensure memory accesses are not indirect wherever possible. */ > NEXT_PASS (pass_strip_predict_hints); > - NEXT_PASS (pass_rename_ssa_copies); > NEXT_PASS (pass_ccp); > /* After CCP we rewrite no longer addressed locals into SSA > form if possible. */ > @@ -183,7 +181,6 @@ along with GCC; see the file COPYING3. If not see > NEXT_PASS (pass_ch); > NEXT_PASS (pass_lower_complex); > NEXT_PASS (pass_sra); > - NEXT_PASS (pass_rename_ssa_copies); > /* The dom pass will also resolve all __builtin_constant_p calls > that are still there to 0. This has to be done after some > propagations have already run, but before some more dead code > @@ -293,7 +290,6 @@ along with GCC; see the file COPYING3. If not see > NEXT_PASS (pass_fold_builtins); > NEXT_PASS (pass_optimize_widening_mul); > NEXT_PASS (pass_tail_calls); > - NEXT_PASS (pass_rename_ssa_copies); > /* FIXME: If DCE is not run before checking for uninitialized uses, > we may get false warnings (e.g., testsuite/gcc.dg/uninit-5.c). > However, this also causes us to misdiagnose cases that should be > @@ -328,7 +324,6 @@ along with GCC; see the file COPYING3. If not see > NEXT_PASS (pass_dce); > NEXT_PASS (pass_asan); > NEXT_PASS (pass_tsan); > - NEXT_PASS (pass_rename_ssa_copies); > /* ??? 
We do want some kind of loop invariant motion, but we possibly > need to adjust LIM to be more friendly towards preserving accurate > debug information here. */ > diff --git a/gcc/stmt.c b/gcc/stmt.c > index 391686c..e7f7dd4 100644 > --- a/gcc/stmt.c > +++ b/gcc/stmt.c > @@ -891,7 +891,7 @@ emit_case_decision_tree (tree index_expr, tree index_type, > { > index = copy_to_reg (index); > if (TREE_CODE (index_expr) == SSA_NAME) > - set_reg_attrs_for_decl_rtl (SSA_NAME_VAR (index_expr), index); > + set_reg_attrs_for_decl_rtl (index_expr, index); > } > > balance_case_nodes (&case_list, NULL); > diff --git a/gcc/stor-layout.c b/gcc/stor-layout.c > index 9757777..938e54b 100644 > --- a/gcc/stor-layout.c > +++ b/gcc/stor-layout.c > @@ -782,7 +782,8 @@ layout_decl (tree decl, unsigned int known_align) > { > PUT_MODE (rtl, DECL_MODE (decl)); > SET_DECL_RTL (decl, 0); > - set_mem_attributes (rtl, decl, 1); > + if (MEM_P (rtl)) > + set_mem_attributes (rtl, decl, 1); > SET_DECL_RTL (decl, rtl); > } > } > diff --git a/gcc/testsuite/gcc.dg/guality/pr54200.c b/gcc/testsuite/gcc.dg/guality/pr54200.c > index 9b17187..e1e7293 100644 > --- a/gcc/testsuite/gcc.dg/guality/pr54200.c > +++ b/gcc/testsuite/gcc.dg/guality/pr54200.c > @@ -1,6 +1,6 @@ > /* PR tree-optimization/54200 */ > /* { dg-do run } */ > -/* { dg-options "-g -fno-var-tracking-assignments" } */ > +/* { dg-options "-g -fno-var-tracking-assignments -fno-tree-coalesce-vars" } */ > > int o __attribute__((used)); > > diff --git a/gcc/testsuite/gcc.dg/ssp-1.c b/gcc/testsuite/gcc.dg/ssp-1.c > index 5467f4d..db69332 100644 > --- a/gcc/testsuite/gcc.dg/ssp-1.c > +++ b/gcc/testsuite/gcc.dg/ssp-1.c > @@ -12,7 +12,7 @@ __stack_chk_fail (void) > > int main () > { > - int i; > + register int i; > char foo[255]; > > // smash stack > diff --git a/gcc/testsuite/gcc.dg/ssp-2.c b/gcc/testsuite/gcc.dg/ssp-2.c > index 9a7ac32..752fe53 100644 > --- a/gcc/testsuite/gcc.dg/ssp-2.c > +++ b/gcc/testsuite/gcc.dg/ssp-2.c > @@ -14,7 +14,7 @@ __stack_chk_fail (void) > void > overflow() > { > - int i = 0; > + register int i = 0; > char foo[30]; > > /* Overflow buffer. */ > diff --git a/gcc/testsuite/gcc.dg/torture/parm-coalesce.c b/gcc/testsuite/gcc.dg/torture/parm-coalesce.c > new file mode 100644 > index 0000000..dbd81c1 > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/torture/parm-coalesce.c > @@ -0,0 +1,40 @@ > +/* { dg-do run } */ > + > +#include <stdlib.h> > + > +/* Make sure we don't coalesce both incoming parms, one whose incoming > + value is unused, to the same location, so as to overwrite one of > + them with the incoming value of the other. */ > + > +int __attribute__((noinline, noclone)) > +foo (int i, int j) > +{ > + j = i; /* The incoming value for J is unused. */ > + i = 2; > + if (j) > + j++; > + j += i + 1; > + return j; > +} > + > +/* Same as foo, but with swapped parameters. */ > +int __attribute__((noinline, noclone)) > +bar (int j, int i) > +{ > + j = i; /* The incoming value for J is unused. 
*/ > + i = 2; > + if (j) > + j++; > + j += i + 1; > + return j; > +} > + > +int > +main (void) > +{ > + if (foo (0, 1) != 3) > + abort (); > + if (bar (1, 0) != 3) > + abort (); > + return 0; > +} > diff --git a/gcc/tree-outof-ssa.c b/gcc/tree-outof-ssa.c > index 7b747ab9..978476c 100644 > --- a/gcc/tree-outof-ssa.c > +++ b/gcc/tree-outof-ssa.c > @@ -279,7 +279,6 @@ insert_value_copy_on_edge (edge e, int dest, tree src, source_location locus) > rtx dest_rtx, seq, x; > machine_mode dest_mode, src_mode; > int unsignedp; > - tree var; > > if (dump_file && (dump_flags & TDF_DETAILS)) > { > @@ -301,12 +300,12 @@ insert_value_copy_on_edge (edge e, int dest, tree src, source_location locus) > > start_sequence (); > > - var = SSA_NAME_VAR (partition_to_var (SA.map, dest)); > + tree name = partition_to_var (SA.map, dest); > src_mode = TYPE_MODE (TREE_TYPE (src)); > dest_mode = GET_MODE (dest_rtx); > - gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (var))); > + gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (name))); > gcc_assert (!REG_P (dest_rtx) > - || dest_mode == promote_decl_mode (var, &unsignedp)); > + || dest_mode == promote_ssa_mode (name, &unsignedp)); > > if (src_mode != dest_mode) > { > @@ -682,13 +681,12 @@ elim_backward (elim_graph g, int T) > static rtx > get_temp_reg (tree name) > { > - tree var = TREE_CODE (name) == SSA_NAME ? SSA_NAME_VAR (name) : name; > - tree type = TREE_TYPE (var); > + tree type = TREE_TYPE (name); > int unsignedp; > - machine_mode reg_mode = promote_decl_mode (var, &unsignedp); > + machine_mode reg_mode = promote_ssa_mode (name, &unsignedp); > rtx x = gen_reg_rtx (reg_mode); > if (POINTER_TYPE_P (type)) > - mark_reg_pointer (x, TYPE_ALIGN (TREE_TYPE (TREE_TYPE (var)))); > + mark_reg_pointer (x, TYPE_ALIGN (TREE_TYPE (type))); > return x; > } > > @@ -988,7 +986,7 @@ remove_ssa_form (bool perform_ter, struct ssaexpand *sa) > > /* Return to viewing the variable list as just all reference variables after > coalescing has been performed. */ > - partition_view_normal (map, false); > + partition_view_normal (map); > > if (dump_file && (dump_flags & TDF_DETAILS)) > { > diff --git a/gcc/tree-ssa-coalesce.c b/gcc/tree-ssa-coalesce.c > index bf8983f..08ce72c 100644 > --- a/gcc/tree-ssa-coalesce.c > +++ b/gcc/tree-ssa-coalesce.c > @@ -36,6 +36,8 @@ along with GCC; see the file COPYING3. If not see > #include "gimple-iterator.h" > #include "tree-ssa-live.h" > #include "tree-ssa-coalesce.h" > +#include "cfgexpand.h" > +#include "explow.h" > #include "diagnostic-core.h" > > > @@ -806,6 +808,16 @@ build_ssa_conflict_graph (tree_live_info_p liveinfo) > basic_block bb; > ssa_op_iter iter; > live_track_p live; > + basic_block entry; > + > + /* If inter-variable coalescing is enabled, we may attempt to > + coalesce variables from different base variables, including > + different parameters, so we have to make sure default defs live > + at the entry block conflict with each other. */ > + if (flag_tree_coalesce_vars) > + entry = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun)); > + else > + entry = NULL; > > map = live_var_map (liveinfo); > graph = ssa_conflicts_new (num_var_partitions (map)); > @@ -864,6 +876,30 @@ build_ssa_conflict_graph (tree_live_info_p liveinfo) > live_track_process_def (live, result, graph); > } > > + /* Pretend there are defs for params' default defs at the start > + of the (post-)entry block. 
*/ > + if (bb == entry) > + { > + unsigned base; > + bitmap_iterator bi; > + EXECUTE_IF_SET_IN_BITMAP (live->live_base_var, 0, base, bi) > + { > + bitmap_iterator bi2; > + unsigned part; > + EXECUTE_IF_SET_IN_BITMAP (live->live_base_partitions[base], > + 0, part, bi2) > + { > + tree var = partition_to_var (map, part); > + if (!SSA_NAME_VAR (var) > + || (TREE_CODE (SSA_NAME_VAR (var)) != PARM_DECL > + && TREE_CODE (SSA_NAME_VAR (var)) != RESULT_DECL) > + || !SSA_NAME_IS_DEFAULT_DEF (var)) > + continue; > + live_track_process_def (live, var, graph); > + } > + } > + } > + > live_track_clear_base_vars (live); > } > > @@ -1132,6 +1168,7 @@ attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y, > { > var1 = partition_to_var (map, p1); > var2 = partition_to_var (map, p2); > + > z = var_union (map, var1, var2); > if (z == NO_PARTITION) > { > @@ -1149,6 +1186,7 @@ attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y, > > if (debug) > fprintf (debug, ": Success -> %d\n", z); > + > return true; > } > > @@ -1244,6 +1282,333 @@ ssa_name_var_hash::equal (const tree_node *n1, const tree_node *n2) > } > > > +/* Output partition map MAP with coalescing plan PART to file F. */ > + > +void > +dump_part_var_map (FILE *f, partition part, var_map map) > +{ > + int t; > + unsigned x, y; > + int p; > + > + fprintf (f, "\nCoalescible Partition map \n\n"); > + > + for (x = 0; x < map->num_partitions; x++) > + { > + if (map->view_to_partition != NULL) > + p = map->view_to_partition[x]; > + else > + p = x; > + > + if (ssa_name (p) == NULL_TREE > + || virtual_operand_p (ssa_name (p))) > + continue; > + > + t = 0; > + for (y = 1; y < num_ssa_names; y++) > + { > + tree var = version_to_var (map, y); > + if (!var) > + continue; > + int q = var_to_partition (map, var); > + p = partition_find (part, q); > + gcc_assert (map->partition_to_base_index[q] > + == map->partition_to_base_index[p]); > + > + if (p == (int)x) > + { > + if (t++ == 0) > + { > + fprintf (f, "Partition %d, base %d (", x, > + map->partition_to_base_index[q]); > + print_generic_expr (f, partition_to_var (map, q), TDF_SLIM); > + fprintf (f, " - "); > + } > + fprintf (f, "%d ", y); > + } > + } > + if (t != 0) > + fprintf (f, ")\n"); > + } > + fprintf (f, "\n"); > +} > + > +/* Given SSA_NAMEs NAME1 and NAME2, return true if they are candidates for > + coalescing together, false otherwise. > + > + This must stay consistent with var_map_base_init in tree-ssa-live.c. */ > + > +bool > +gimple_can_coalesce_p (tree name1, tree name2) > +{ > + /* First check the SSA_NAME's associated DECL. Without > + optimization, we only want to coalesce if they have the same DECL > + or both have no associated DECL. */ > + tree var1 = SSA_NAME_VAR (name1); > + tree var2 = SSA_NAME_VAR (name2); > + var1 = (var1 && (!VAR_P (var1) || !DECL_IGNORED_P (var1))) ? var1 : NULL_TREE; > + var2 = (var2 && (!VAR_P (var2) || !DECL_IGNORED_P (var2))) ? var2 : NULL_TREE; > + if (var1 != var2 && !flag_tree_coalesce_vars) > + return false; > + > + /* Now check the types. If the types are the same, then we should > + try to coalesce V1 and V2. */ > + tree t1 = TREE_TYPE (name1); > + tree t2 = TREE_TYPE (name2); > + if (t1 == t2) > + { > + check_modes: > + /* If the base variables are the same, we're good: none of the > + other tests below could possibly fail. 
*/ > + var1 = SSA_NAME_VAR (name1); > + var2 = SSA_NAME_VAR (name2); > + if (var1 == var2) > + return true; > + > + /* We don't want to coalesce two SSA names if one of the base > + variables is supposed to be a register while the other is > + supposed to be on the stack. Anonymous SSA names take > + registers, but when not optimizing, user variables should go > + on the stack, so coalescing them with the anonymous variable > + as the partition leader would end up assigning the user > + variable to a register. Don't do that! */ > + bool reg1 = !var1 || use_register_for_decl (var1); > + bool reg2 = !var2 || use_register_for_decl (var2); > + if (reg1 != reg2) > + return false; > + > + /* Check that the promoted modes are the same. We don't want to > + coalesce if the promoted modes would be different. Only > + PARM_DECLs and RESULT_DECLs have different promotion rules, > + so skip the test if both are variables, or both are anonymous > + SSA_NAMEs. Now, if a parm or result has BLKmode, do not > + coalesce its SSA versions with those of any other variables, > + because it may be passed by reference. */ > + return ((!var1 || VAR_P (var1)) && (!var2 || VAR_P (var2))) > + || (/* The case var1 == var2 is already covered above. */ > + !parm_maybe_byref_p (var1) > + && !parm_maybe_byref_p (var2) > + && promote_ssa_mode (name1, NULL) == promote_ssa_mode (name2, NULL)); > + } > + > + /* If the types are not the same, check for a canonical type match. This > + (for example) allows coalescing when the types are fundamentally the > + same, but just have different names. > + > + Note pointer types with different address spaces may have the same > + canonical type. Those are rejected for coalescing by the > + types_compatible_p check. */ > + if (TYPE_CANONICAL (t1) > + && TYPE_CANONICAL (t1) == TYPE_CANONICAL (t2) > + && types_compatible_p (t1, t2)) > + goto check_modes; > + > + return false; > +} > + > +/* Fill in MAP's partition_to_base_index, with one index for each > + partition of SSA names USED_IN_COPIES and related by CL coalesce > + possibilities. This must match gimple_can_coalesce_p in the > + optimized case. */ > + > +static void > +compute_optimized_partition_bases (var_map map, bitmap used_in_copies, > + coalesce_list_p cl) > +{ > + int parts = num_var_partitions (map); > + partition tentative = partition_new (parts); > + > + /* Partition the SSA versions so that, for each coalescible > + pair, both of its members are in the same partition in > + TENTATIVE. */ > + gcc_assert (!cl->sorted); > + coalesce_pair_p node; > + coalesce_iterator_type ppi; > + FOR_EACH_PARTITION_PAIR (node, ppi, cl) > + { > + tree v1 = ssa_name (node->first_element); > + int p1 = partition_find (tentative, var_to_partition (map, v1)); > + tree v2 = ssa_name (node->second_element); > + int p2 = partition_find (tentative, var_to_partition (map, v2)); > + > + if (p1 == p2) > + continue; > + > + partition_union (tentative, p1, p2); > + } > + > + /* We have to deal with cost one pairs too. */ > + for (cost_one_pair_d *co = cl->cost_one_list; co; co = co->next) > + { > + tree v1 = ssa_name (co->first_element); > + int p1 = partition_find (tentative, var_to_partition (map, v1)); > + tree v2 = ssa_name (co->second_element); > + int p2 = partition_find (tentative, var_to_partition (map, v2)); > + > + if (p1 == p2) > + continue; > + > + partition_union (tentative, p1, p2); > + } > + > + /* And also with abnormal edges. 
*/ > + basic_block bb; > + edge e; > + edge_iterator ei; > + FOR_EACH_BB_FN (bb, cfun) > + { > + FOR_EACH_EDGE (e, ei, bb->preds) > + if (e->flags & EDGE_ABNORMAL) > + { > + gphi_iterator gsi; > + for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi); > + gsi_next (&gsi)) > + { > + gphi *phi = gsi.phi (); > + tree arg = PHI_ARG_DEF (phi, e->dest_idx); > + if (SSA_NAME_IS_DEFAULT_DEF (arg) > + && (!SSA_NAME_VAR (arg) > + || TREE_CODE (SSA_NAME_VAR (arg)) != PARM_DECL)) > + continue; > + > + tree res = PHI_RESULT (phi); > + > + int p1 = partition_find (tentative, var_to_partition (map, res)); > + int p2 = partition_find (tentative, var_to_partition (map, arg)); > + > + if (p1 == p2) > + continue; > + > + partition_union (tentative, p1, p2); > + } > + } > + } > + > + map->partition_to_base_index = XCNEWVEC (int, parts); > + auto_vec<unsigned int> index_map (parts); > + if (parts) > + index_map.quick_grow (parts); > + > + const unsigned no_part = -1; > + unsigned count = parts; > + while (count) > + index_map[--count] = no_part; > + > + /* Initialize MAP's mapping from partition to base index, using > + as base indices an enumeration of the TENTATIVE partitions in > + which each SSA version ended up, so that we compute conflicts > + between all SSA versions that ended up in the same potential > + coalesce partition. */ > + bitmap_iterator bi; > + unsigned i; > + EXECUTE_IF_SET_IN_BITMAP (used_in_copies, 0, i, bi) > + { > + int pidx = var_to_partition (map, ssa_name (i)); > + int base = partition_find (tentative, pidx); > + if (index_map[base] != no_part) > + continue; > + index_map[base] = count++; > + } > + > + map->num_basevars = count; > + > + EXECUTE_IF_SET_IN_BITMAP (used_in_copies, 0, i, bi) > + { > + int pidx = var_to_partition (map, ssa_name (i)); > + int base = partition_find (tentative, pidx); > + gcc_assert (index_map[base] < count); > + map->partition_to_base_index[pidx] = index_map[base]; > + } > + > + if (dump_file && (dump_flags & TDF_DETAILS)) > + dump_part_var_map (dump_file, tentative, map); > + > + partition_delete (tentative); > +} > + > +/* Hashtable helpers. */ > + > +struct tree_int_map_hasher : nofree_ptr_hash <tree_int_map> > +{ > + static inline hashval_t hash (const tree_int_map *); > + static inline bool equal (const tree_int_map *, const tree_int_map *); > +}; > + > +inline hashval_t > +tree_int_map_hasher::hash (const tree_int_map *v) > +{ > + return tree_map_base_hash (v); > +} > + > +inline bool > +tree_int_map_hasher::equal (const tree_int_map *v, const tree_int_map *c) > +{ > + return tree_int_map_eq (v, c); > +} > + > +/* This routine will initialize the basevar fields of MAP with base > + names. Partitions will share the same base if they have the same > + SSA_NAME_VAR, or, being anonymous variables, the same type. This > + must match gimple_can_coalesce_p in the non-optimized case. */ > + > +static void > +compute_samebase_partition_bases (var_map map) > +{ > + int x, num_part; > + tree var; > + struct tree_int_map *m, *mapstorage; > + > + num_part = num_var_partitions (map); > + hash_table<tree_int_map_hasher> tree_to_index (num_part); > + /* We can have at most num_part entries in the hash tables, so it's > + enough to allocate so many map elements once, saving some malloc > + calls. */ > + mapstorage = m = XNEWVEC (struct tree_int_map, num_part); > + > + /* If a base table already exists, clear it, otherwise create it. 
*/ > + free (map->partition_to_base_index); > + map->partition_to_base_index = (int *) xmalloc (sizeof (int) * num_part); > + > + /* Build the base variable list, and point partitions at their bases. */ > + for (x = 0; x < num_part; x++) > + { > + struct tree_int_map **slot; > + unsigned baseindex; > + var = partition_to_var (map, x); > + if (SSA_NAME_VAR (var) > + && (!VAR_P (SSA_NAME_VAR (var)) > + || !DECL_IGNORED_P (SSA_NAME_VAR (var)))) > + m->base.from = SSA_NAME_VAR (var); > + else > + /* This restricts what anonymous SSA names we can coalesce > + as it restricts the sets we compute conflicts for. > + Using TREE_TYPE to generate sets is the easies as > + type equivalency also holds for SSA names with the same > + underlying decl. > + > + Check gimple_can_coalesce_p when changing this code. */ > + m->base.from = (TYPE_CANONICAL (TREE_TYPE (var)) > + ? TYPE_CANONICAL (TREE_TYPE (var)) > + : TREE_TYPE (var)); > + /* If base variable hasn't been seen, set it up. */ > + slot = tree_to_index.find_slot (m, INSERT); > + if (!*slot) > + { > + baseindex = m - mapstorage; > + m->to = baseindex; > + *slot = m; > + m++; > + } > + else > + baseindex = (*slot)->to; > + map->partition_to_base_index[x] = baseindex; > + } > + > + map->num_basevars = m - mapstorage; > + > + free (mapstorage); > +} > + > /* Reduce the number of copies by coalescing variables in the function. Return > a partition map with the resulting coalesces. */ > > @@ -1260,9 +1625,10 @@ coalesce_ssa_name (void) > cl = create_coalesce_list (); > map = create_outofssa_var_map (cl, used_in_copies); > > - /* If optimization is disabled, we need to coalesce all the names originating > - from the same SSA_NAME_VAR so debug info remains undisturbed. */ > - if (!optimize) > + /* If this optimization is disabled, we need to coalesce all the > + names originating from the same SSA_NAME_VAR so debug info > + remains undisturbed. */ > + if (!flag_tree_coalesce_vars) > { > hash_table<ssa_name_var_hash> ssa_name_hash (10); > > @@ -1303,8 +1669,13 @@ coalesce_ssa_name (void) > if (dump_file && (dump_flags & TDF_DETAILS)) > dump_var_map (dump_file, map); > > - /* Don't calculate live ranges for variables not in the coalesce list. */ > - partition_view_bitmap (map, used_in_copies, true); > + partition_view_bitmap (map, used_in_copies); > + > + if (flag_tree_coalesce_vars) > + compute_optimized_partition_bases (map, used_in_copies, cl); > + else > + compute_samebase_partition_bases (map); > + > BITMAP_FREE (used_in_copies); > > if (num_var_partitions (map) < 1) > @@ -1343,8 +1714,7 @@ coalesce_ssa_name (void) > > /* Now coalesce everything in the list. */ > coalesce_partitions (map, graph, cl, > - ((dump_flags & TDF_DETAILS) ? dump_file > - : NULL)); > + ((dump_flags & TDF_DETAILS) ? dump_file : NULL)); > > delete_coalesce_list (cl); > ssa_conflicts_delete (graph); > diff --git a/gcc/tree-ssa-coalesce.h b/gcc/tree-ssa-coalesce.h > index 99b188a..ae289b4 100644 > --- a/gcc/tree-ssa-coalesce.h > +++ b/gcc/tree-ssa-coalesce.h > @@ -21,5 +21,6 @@ along with GCC; see the file COPYING3. If not see > #define GCC_TREE_SSA_COALESCE_H > > extern var_map coalesce_ssa_name (void); > +extern bool gimple_can_coalesce_p (tree, tree); > > #endif /* GCC_TREE_SSA_COALESCE_H */ > diff --git a/gcc/tree-ssa-copyrename.c b/gcc/tree-ssa-copyrename.c > deleted file mode 100644 > index aeb7f28..0000000 > --- a/gcc/tree-ssa-copyrename.c > +++ /dev/null > @@ -1,475 +0,0 @@ > -/* Rename SSA copies. > - Copyright (C) 2004-2015 Free Software Foundation, Inc. 
> - Contributed by Andrew MacLeod <amacleod@redhat.com> > - > -This file is part of GCC. > - > -GCC is free software; you can redistribute it and/or modify > -it under the terms of the GNU General Public License as published by > -the Free Software Foundation; either version 3, or (at your option) > -any later version. > - > -GCC is distributed in the hope that it will be useful, > -but WITHOUT ANY WARRANTY; without even the implied warranty of > -MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > -GNU General Public License for more details. > - > -You should have received a copy of the GNU General Public License > -along with GCC; see the file COPYING3. If not see > -<http://www.gnu.org/licenses/>. */ > - > -#include "config.h" > -#include "system.h" > -#include "coretypes.h" > -#include "backend.h" > -#include "tree.h" > -#include "gimple.h" > -#include "rtl.h" > -#include "ssa.h" > -#include "alias.h" > -#include "fold-const.h" > -#include "internal-fn.h" > -#include "gimple-iterator.h" > -#include "flags.h" > -#include "tree-pretty-print.h" > -#include "insn-config.h" > -#include "expmed.h" > -#include "dojump.h" > -#include "explow.h" > -#include "calls.h" > -#include "emit-rtl.h" > -#include "varasm.h" > -#include "stmt.h" > -#include "expr.h" > -#include "tree-dfa.h" > -#include "tree-inline.h" > -#include "tree-ssa-live.h" > -#include "tree-pass.h" > -#include "langhooks.h" > - > -static struct > -{ > - /* Number of copies coalesced. */ > - int coalesced; > -} stats; > - > -/* The following routines implement the SSA copy renaming phase. > - > - This optimization looks for copies between 2 SSA_NAMES, either through a > - direct copy, or an implicit one via a PHI node result and its arguments. > - > - Each copy is examined to determine if it is possible to rename the base > - variable of one of the operands to the same variable as the other operand. > - i.e. > - T.3_5 = <blah> > - a_1 = T.3_5 > - > - If this copy couldn't be copy propagated, it could possibly remain in the > - program throughout the optimization phases. After SSA->normal, it would > - become: > - > - T.3 = <blah> > - a = T.3 > - > - Since T.3_5 is distinct from all other SSA versions of T.3, there is no > - fundamental reason why the base variable needs to be T.3, subject to > - certain restrictions. This optimization attempts to determine if we can > - change the base variable on copies like this, and result in code such as: > - > - a_5 = <blah> > - a_1 = a_5 > - > - This gives the SSA->normal pass a shot at coalescing a_1 and a_5. If it is > - possible, the copy goes away completely. If it isn't possible, a new temp > - will be created for a_5, and you will end up with the exact same code: > - > - a.8 = <blah> > - a = a.8 > - > - The other benefit of performing this optimization relates to what variables > - are chosen in copies. Gimplification of the program uses temporaries for > - a lot of things. expressions like > - > - a_1 = <blah> > - <blah2> = a_1 > - > - get turned into > - > - T.3_5 = <blah> > - a_1 = T.3_5 > - <blah2> = a_1 > - > - Copy propagation is done in a forward direction, and if we can propagate > - through the copy, we end up with: > - > - T.3_5 = <blah> > - <blah2> = T.3_5 > - > - The copy is gone, but so is all reference to the user variable 'a'. 
By > - performing this optimization, we would see the sequence: > - > - a_5 = <blah> > - a_1 = a_5 > - <blah2> = a_1 > - > - which copy propagation would then turn into: > - > - a_5 = <blah> > - <blah2> = a_5 > - > - and so we still retain the user variable whenever possible. */ > - > - > -/* Coalesce the partitions in MAP representing VAR1 and VAR2 if it is valid. > - Choose a representative for the partition, and send debug info to DEBUG. */ > - > -static void > -copy_rename_partition_coalesce (var_map map, tree var1, tree var2, FILE *debug) > -{ > - int p1, p2, p3; > - tree root1, root2; > - tree rep1, rep2; > - bool ign1, ign2, abnorm; > - > - gcc_assert (TREE_CODE (var1) == SSA_NAME); > - gcc_assert (TREE_CODE (var2) == SSA_NAME); > - > - register_ssa_partition (map, var1); > - register_ssa_partition (map, var2); > - > - p1 = partition_find (map->var_partition, SSA_NAME_VERSION (var1)); > - p2 = partition_find (map->var_partition, SSA_NAME_VERSION (var2)); > - > - if (debug) > - { > - fprintf (debug, "Try : "); > - print_generic_expr (debug, var1, TDF_SLIM); > - fprintf (debug, "(P%d) & ", p1); > - print_generic_expr (debug, var2, TDF_SLIM); > - fprintf (debug, "(P%d)", p2); > - } > - > - gcc_assert (p1 != NO_PARTITION); > - gcc_assert (p2 != NO_PARTITION); > - > - if (p1 == p2) > - { > - if (debug) > - fprintf (debug, " : Already coalesced.\n"); > - return; > - } > - > - rep1 = partition_to_var (map, p1); > - rep2 = partition_to_var (map, p2); > - root1 = SSA_NAME_VAR (rep1); > - root2 = SSA_NAME_VAR (rep2); > - if (!root1 && !root2) > - return; > - > - /* Don't coalesce if one of the variables occurs in an abnormal PHI. */ > - abnorm = (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rep1) > - || SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rep2)); > - if (abnorm) > - { > - if (debug) > - fprintf (debug, " : Abnormal PHI barrier. No coalesce.\n"); > - return; > - } > - > - /* Partitions already have the same root, simply merge them. */ > - if (root1 == root2) > - { > - p1 = partition_union (map->var_partition, p1, p2); > - if (debug) > - fprintf (debug, " : Same root, coalesced --> P%d.\n", p1); > - return; > - } > - > - /* Never attempt to coalesce 2 different parameters. */ > - if ((root1 && TREE_CODE (root1) == PARM_DECL) > - && (root2 && TREE_CODE (root2) == PARM_DECL)) > - { > - if (debug) > - fprintf (debug, " : 2 different PARM_DECLS. No coalesce.\n"); > - return; > - } > - > - if ((root1 && TREE_CODE (root1) == RESULT_DECL) > - != (root2 && TREE_CODE (root2) == RESULT_DECL)) > - { > - if (debug) > - fprintf (debug, " : One root a RESULT_DECL. No coalesce.\n"); > - return; > - } > - > - ign1 = !root1 || (TREE_CODE (root1) == VAR_DECL && DECL_IGNORED_P (root1)); > - ign2 = !root2 || (TREE_CODE (root2) == VAR_DECL && DECL_IGNORED_P (root2)); > - > - /* Refrain from coalescing user variables, if requested. */ > - if (!ign1 && !ign2) > - { > - if (flag_ssa_coalesce_vars && DECL_FROM_INLINE (root2)) > - ign2 = true; > - else if (flag_ssa_coalesce_vars && DECL_FROM_INLINE (root1)) > - ign1 = true; > - else if (flag_ssa_coalesce_vars != 2) > - { > - if (debug) > - fprintf (debug, " : 2 different USER vars. No coalesce.\n"); > - return; > - } > - else > - ign2 = true; > - } > - > - /* If both values have default defs, we can't coalesce. If only one has a > - tag, make sure that variable is the new root partition. */ > - if (root1 && ssa_default_def (cfun, root1)) > - { > - if (root2 && ssa_default_def (cfun, root2)) > - { > - if (debug) > - fprintf (debug, " : 2 default defs. 
No coalesce.\n"); > - return; > - } > - else > - { > - ign2 = true; > - ign1 = false; > - } > - } > - else if (root2 && ssa_default_def (cfun, root2)) > - { > - ign1 = true; > - ign2 = false; > - } > - > - /* Do not coalesce if we cannot assign a symbol to the partition. */ > - if (!(!ign2 && root2) > - && !(!ign1 && root1)) > - { > - if (debug) > - fprintf (debug, " : Choosen variable has no root. No coalesce.\n"); > - return; > - } > - > - /* Don't coalesce if the new chosen root variable would be read-only. > - If both ign1 && ign2, then the root var of the larger partition > - wins, so reject in that case if any of the root vars is TREE_READONLY. > - Otherwise reject only if the root var, on which replace_ssa_name_symbol > - will be called below, is readonly. */ > - if (((root1 && TREE_READONLY (root1)) && ign2) > - || ((root2 && TREE_READONLY (root2)) && ign1)) > - { > - if (debug) > - fprintf (debug, " : Readonly variable. No coalesce.\n"); > - return; > - } > - > - /* Don't coalesce if the two variables aren't type compatible . */ > - if (!types_compatible_p (TREE_TYPE (var1), TREE_TYPE (var2)) > - /* There is a disconnect between the middle-end type-system and > - VRP, avoid coalescing enum types with different bounds. */ > - || ((TREE_CODE (TREE_TYPE (var1)) == ENUMERAL_TYPE > - || TREE_CODE (TREE_TYPE (var2)) == ENUMERAL_TYPE) > - && TREE_TYPE (var1) != TREE_TYPE (var2))) > - { > - if (debug) > - fprintf (debug, " : Incompatible types. No coalesce.\n"); > - return; > - } > - > - /* Merge the two partitions. */ > - p3 = partition_union (map->var_partition, p1, p2); > - > - /* Set the root variable of the partition to the better choice, if there is > - one. */ > - if (!ign2 && root2) > - replace_ssa_name_symbol (partition_to_var (map, p3), root2); > - else if (!ign1 && root1) > - replace_ssa_name_symbol (partition_to_var (map, p3), root1); > - else > - gcc_unreachable (); > - > - if (debug) > - { > - fprintf (debug, " --> P%d ", p3); > - print_generic_expr (debug, SSA_NAME_VAR (partition_to_var (map, p3)), > - TDF_SLIM); > - fprintf (debug, "\n"); > - } > -} > - > - > -namespace { > - > -const pass_data pass_data_rename_ssa_copies = > -{ > - GIMPLE_PASS, /* type */ > - "copyrename", /* name */ > - OPTGROUP_NONE, /* optinfo_flags */ > - TV_TREE_COPY_RENAME, /* tv_id */ > - ( PROP_cfg | PROP_ssa ), /* properties_required */ > - 0, /* properties_provided */ > - 0, /* properties_destroyed */ > - 0, /* todo_flags_start */ > - 0, /* todo_flags_finish */ > -}; > - > -class pass_rename_ssa_copies : public gimple_opt_pass > -{ > -public: > - pass_rename_ssa_copies (gcc::context *ctxt) > - : gimple_opt_pass (pass_data_rename_ssa_copies, ctxt) > - {} > - > - /* opt_pass methods: */ > - opt_pass * clone () { return new pass_rename_ssa_copies (m_ctxt); } > - virtual bool gate (function *) { return flag_tree_copyrename != 0; } > - virtual unsigned int execute (function *); > - > -}; // class pass_rename_ssa_copies > - > -/* This function will make a pass through the IL, and attempt to coalesce any > - SSA versions which occur in PHI's or copies. Coalescing is accomplished by > - changing the underlying root variable of all coalesced version. This will > - then cause the SSA->normal pass to attempt to coalesce them all to the same > - variable. 
*/ > - > -unsigned int > -pass_rename_ssa_copies::execute (function *fun) > -{ > - var_map map; > - basic_block bb; > - tree var, part_var; > - gimple stmt; > - unsigned x; > - FILE *debug; > - > - memset (&stats, 0, sizeof (stats)); > - > - if (dump_file && (dump_flags & TDF_DETAILS)) > - debug = dump_file; > - else > - debug = NULL; > - > - map = init_var_map (num_ssa_names); > - > - FOR_EACH_BB_FN (bb, fun) > - { > - /* Scan for real copies. */ > - for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi); > - gsi_next (&gsi)) > - { > - stmt = gsi_stmt (gsi); > - if (gimple_assign_ssa_name_copy_p (stmt)) > - { > - tree lhs = gimple_assign_lhs (stmt); > - tree rhs = gimple_assign_rhs1 (stmt); > - > - copy_rename_partition_coalesce (map, lhs, rhs, debug); > - } > - } > - } > - > - FOR_EACH_BB_FN (bb, fun) > - { > - /* Treat PHI nodes as copies between the result and each argument. */ > - for (gphi_iterator gsi = gsi_start_phis (bb); !gsi_end_p (gsi); > - gsi_next (&gsi)) > - { > - size_t i; > - tree res; > - gphi *phi = gsi.phi (); > - res = gimple_phi_result (phi); > - > - /* Do not process virtual SSA_NAMES. */ > - if (virtual_operand_p (res)) > - continue; > - > - /* Make sure to only use the same partition for an argument > - as the result but never the other way around. */ > - if (SSA_NAME_VAR (res) > - && !DECL_IGNORED_P (SSA_NAME_VAR (res))) > - for (i = 0; i < gimple_phi_num_args (phi); i++) > - { > - tree arg = PHI_ARG_DEF (phi, i); > - if (TREE_CODE (arg) == SSA_NAME) > - copy_rename_partition_coalesce (map, res, arg, > - debug); > - } > - /* Else if all arguments are in the same partition try to merge > - it with the result. */ > - else > - { > - int all_p_same = -1; > - int p = -1; > - for (i = 0; i < gimple_phi_num_args (phi); i++) > - { > - tree arg = PHI_ARG_DEF (phi, i); > - if (TREE_CODE (arg) != SSA_NAME) > - { > - all_p_same = 0; > - break; > - } > - else if (all_p_same == -1) > - { > - p = partition_find (map->var_partition, > - SSA_NAME_VERSION (arg)); > - all_p_same = 1; > - } > - else if (all_p_same == 1 > - && p != partition_find (map->var_partition, > - SSA_NAME_VERSION (arg))) > - { > - all_p_same = 0; > - break; > - } > - } > - if (all_p_same == 1) > - copy_rename_partition_coalesce (map, res, > - PHI_ARG_DEF (phi, 0), > - debug); > - } > - } > - } > - > - if (debug) > - dump_var_map (debug, map); > - > - /* Now one more pass to make all elements of a partition share the same > - root variable. */ > - > - for (x = 1; x < num_ssa_names; x++) > - { > - part_var = partition_to_var (map, x); > - if (!part_var) > - continue; > - var = ssa_name (x); > - if (SSA_NAME_VAR (var) == SSA_NAME_VAR (part_var)) > - continue; > - if (debug) > - { > - fprintf (debug, "Coalesced "); > - print_generic_expr (debug, var, TDF_SLIM); > - fprintf (debug, " to "); > - print_generic_expr (debug, part_var, TDF_SLIM); > - fprintf (debug, "\n"); > - } > - stats.coalesced++; > - replace_ssa_name_symbol (var, SSA_NAME_VAR (part_var)); > - } > - > - statistics_counter_event (fun, "copies coalesced", > - stats.coalesced); > - delete_var_map (map); > - return 0; > -} > - > -} // anon namespace > - > -gimple_opt_pass * > -make_pass_rename_ssa_copies (gcc::context *ctxt) > -{ > - return new pass_rename_ssa_copies (ctxt); > -} > diff --git a/gcc/tree-ssa-live.c b/gcc/tree-ssa-live.c > index 5b00f58..4772558 100644 > --- a/gcc/tree-ssa-live.c > +++ b/gcc/tree-ssa-live.c > @@ -70,88 +70,6 @@ static void verify_live_on_entry (tree_live_info_p); > ssa_name or variable, and vice versa. 
*/ > > > -/* Hashtable helpers. */ > - > -struct tree_int_map_hasher : nofree_ptr_hash <tree_int_map> > -{ > - static inline hashval_t hash (const tree_int_map *); > - static inline bool equal (const tree_int_map *, const tree_int_map *); > -}; > - > -inline hashval_t > -tree_int_map_hasher::hash (const tree_int_map *v) > -{ > - return tree_map_base_hash (v); > -} > - > -inline bool > -tree_int_map_hasher::equal (const tree_int_map *v, const tree_int_map *c) > -{ > - return tree_int_map_eq (v, c); > -} > - > - > -/* This routine will initialize the basevar fields of MAP. */ > - > -static void > -var_map_base_init (var_map map) > -{ > - int x, num_part; > - tree var; > - struct tree_int_map *m, *mapstorage; > - > - num_part = num_var_partitions (map); > - hash_table<tree_int_map_hasher> tree_to_index (num_part); > - /* We can have at most num_part entries in the hash tables, so it's > - enough to allocate so many map elements once, saving some malloc > - calls. */ > - mapstorage = m = XNEWVEC (struct tree_int_map, num_part); > - > - /* If a base table already exists, clear it, otherwise create it. */ > - free (map->partition_to_base_index); > - map->partition_to_base_index = (int *) xmalloc (sizeof (int) * num_part); > - > - /* Build the base variable list, and point partitions at their bases. */ > - for (x = 0; x < num_part; x++) > - { > - struct tree_int_map **slot; > - unsigned baseindex; > - var = partition_to_var (map, x); > - if (SSA_NAME_VAR (var) > - && (!VAR_P (SSA_NAME_VAR (var)) > - || !DECL_IGNORED_P (SSA_NAME_VAR (var)))) > - m->base.from = SSA_NAME_VAR (var); > - else > - /* This restricts what anonymous SSA names we can coalesce > - as it restricts the sets we compute conflicts for. > - Using TREE_TYPE to generate sets is the easies as > - type equivalency also holds for SSA names with the same > - underlying decl. > - > - Check gimple_can_coalesce_p when changing this code. */ > - m->base.from = (TYPE_CANONICAL (TREE_TYPE (var)) > - ? TYPE_CANONICAL (TREE_TYPE (var)) > - : TREE_TYPE (var)); > - /* If base variable hasn't been seen, set it up. */ > - slot = tree_to_index.find_slot (m, INSERT); > - if (!*slot) > - { > - baseindex = m - mapstorage; > - m->to = baseindex; > - *slot = m; > - m++; > - } > - else > - baseindex = (*slot)->to; > - map->partition_to_base_index[x] = baseindex; > - } > - > - map->num_basevars = m - mapstorage; > - > - free (mapstorage); > -} > - > - > /* Remove the base table in MAP. */ > > static void > @@ -329,21 +247,17 @@ partition_view_fini (var_map map, bitmap selected) > } > > > -/* Create a partition view which includes all the used partitions in MAP. If > - WANT_BASES is true, create the base variable map as well. */ > +/* Create a partition view which includes all the used partitions in MAP. */ > > void > -partition_view_normal (var_map map, bool want_bases) > +partition_view_normal (var_map map) > { > bitmap used; > > used = partition_view_init (map); > partition_view_fini (map, used); > > - if (want_bases) > - var_map_base_init (map); > - else > - var_map_base_fini (map); > + var_map_base_fini (map); > } > > > @@ -352,7 +266,7 @@ partition_view_normal (var_map map, bool want_bases) > as well. 
*/ > > void > -partition_view_bitmap (var_map map, bitmap only, bool want_bases) > +partition_view_bitmap (var_map map, bitmap only) > { > bitmap used; > bitmap new_partitions = BITMAP_ALLOC (NULL); > @@ -368,10 +282,7 @@ partition_view_bitmap (var_map map, bitmap only, bool want_bases) > } > partition_view_fini (map, new_partitions); > > - if (want_bases) > - var_map_base_init (map); > - else > - var_map_base_fini (map); > + var_map_base_fini (map); > } > > > diff --git a/gcc/tree-ssa-live.h b/gcc/tree-ssa-live.h > index d5d7820..1f88358 100644 > --- a/gcc/tree-ssa-live.h > +++ b/gcc/tree-ssa-live.h > @@ -71,8 +71,8 @@ typedef struct _var_map > extern var_map init_var_map (int); > extern void delete_var_map (var_map); > extern int var_union (var_map, tree, tree); > -extern void partition_view_normal (var_map, bool); > -extern void partition_view_bitmap (var_map, bitmap, bool); > +extern void partition_view_normal (var_map); > +extern void partition_view_bitmap (var_map, bitmap); > extern void dump_scope_blocks (FILE *, int); > extern void debug_scope_block (tree, int); > extern void debug_scope_blocks (int); > diff --git a/gcc/tree-ssa-uncprop.c b/gcc/tree-ssa-uncprop.c > index 437f69d..1fbd71e 100644 > --- a/gcc/tree-ssa-uncprop.c > +++ b/gcc/tree-ssa-uncprop.c > @@ -38,6 +38,11 @@ along with GCC; see the file COPYING3. If not see > #include "tree-pass.h" > #include "tree-ssa-propagate.h" > #include "tree-hash-traits.h" > +#include "bitmap.h" > +#include "stringpool.h" > +#include "tree-ssanames.h" > +#include "tree-ssa-live.h" > +#include "tree-ssa-coalesce.h" > > /* The basic structure describing an equivalency created by traversing > an edge. Traversing the edge effectively means that we can assume > diff --git a/gcc/var-tracking.c b/gcc/var-tracking.c > index da9de28..a31a137 100644 > --- a/gcc/var-tracking.c > +++ b/gcc/var-tracking.c > @@ -4856,12 +4856,16 @@ dataflow_set_remove_mem_locs (variable_def **slot, dataflow_set *set) > registers, as well as associations between MEMs and VALUEs. */ > > static void > -dataflow_set_clear_at_call (dataflow_set *set) > +dataflow_set_clear_at_call (dataflow_set *set, rtx_insn *call_insn) > { > unsigned int r; > hard_reg_set_iterator hrsi; > + HARD_REG_SET invalidated_regs; > > - EXECUTE_IF_SET_IN_HARD_REG_SET (regs_invalidated_by_call, 0, r, hrsi) > + get_call_reg_set_usage (call_insn, &invalidated_regs, > + regs_invalidated_by_call); > + > + EXECUTE_IF_SET_IN_HARD_REG_SET (invalidated_regs, 0, r, hrsi) > var_regno_delete (set, r); > > if (MAY_HAVE_DEBUG_INSNS) > @@ -6645,7 +6649,7 @@ compute_bb_dataflow (basic_block bb) > switch (mo->type) > { > case MO_CALL: > - dataflow_set_clear_at_call (out); > + dataflow_set_clear_at_call (out, insn); > break; > > case MO_USE: > @@ -9107,7 +9111,7 @@ emit_notes_in_bb (basic_block bb, dataflow_set *set) > switch (mo->type) > { > case MO_CALL: > - dataflow_set_clear_at_call (set); > + dataflow_set_clear_at_call (set, insn); > emit_notes_for_changes (insn, EMIT_NOTE_AFTER_CALL_INSN, set->vars); > { > rtx arguments = mo->u.loc, *p = &arguments; > > > -- > Alexandre Oliva, freedom fighter http://FSFLA.org/~lxoliva/ > You must be the change you wish to see in the world. -- Gandhi > Be Free! -- http://FSFLA.org/ FSF Latin America board member > Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer
Hi Alexandre, On 17/08/15 03:56, Alexandre Oliva wrote: > On Aug 16, 2015, Andreas Schwab <schwab@linux-m68k.org> wrote: > >> Alexandre Oliva <aoliva@redhat.com> writes: >>> On Aug 15, 2015, Andreas Schwab <schwab@linux-m68k.org> wrote: >>> >>>> FAIL: gcc.target/aarch64/target_attr_crypto_ice_1.c (internal compiler error) >>>> In file included from >>>> /opt/gcc/gcc-20150815/gcc/testsuite/gcc.target/aarch64/target_attr_crypto_ice_1.c:4:0: >>> Are you sure this is a regression introduced by my patch? >> Yes, it reintroduces the ICE. > Ugh. I see this testcase was introduced very recently, so presumably it > wasn't present in the tree that James Greenhalgh tested and confirmed > there were no regressions. Yeah, I introduced it as part of the SWITCHABLE_TARGET work for aarch64. A bit of a mid-air collision :( > The hack in aarch64-builtins.c looks risky IMHO. Changing the mode of a > decl after RTL is assigned to it (or to its SSA partitions) seems fishy. > The assert is doing just what it was supposed to do. The only surprise > to me is that it didn't catch this unexpected and unsupported change > before. > > Presumably if we just dropped the assert in expand_expr_real_1, this > case would work just fine, although the unsignedp bit would be > meaningless and thus confusing, since the subreg isn't about a > promotion, but about reflecting the mode change that was made from under > us. > > May I suggest that you guys find (or introduce) other means to change > the layout and mode of the decl *before* RTL is assigned to the params? > I think this would save us a ton of trouble down the road. Just think > how much trouble you'd get if the different modes had different calling > conventions, alignment requirements, valid register assignments, or > anything that might make coalescing their SSA names with those of other > variables invalid. > I'm not familiar with the intricacies in this area but I'll have a look. Perhaps we can somehow re-layout the SIMD types when switching from a non-simd to a simd target... Can you, or Andreas please file a PR so we don't forget? Thanks, Kyrill
On Aug 17, 2015, Christophe Lyon <christophe.lyon@linaro.org> wrote: > Since this was committed (r226901), I can see that the compiler build > fails for armeb targets, when building libgcc: Any chance you could get me a preprocessed testcase for this failure, please? Thanks in advance,
On 17 August 2015 at 13:58, Alexandre Oliva <aoliva@redhat.com> wrote: > On Aug 17, 2015, Christophe Lyon <christophe.lyon@linaro.org> wrote: > >> Since this was committed (r226901), I can see that the compiler build >> fails for armeb targets, when building libgcc: > > Any chance you could get me a preprocessed testcase for this failure, please? > Yes, here it is, attached. My gcc is configured with: --target=armeb-linux-gnueabihf --with-mode=arm --with-cpu=cortex-a9 --with-fpu=neon Thanks, Christophe. > Thanks in advance, > > -- > Alexandre Oliva, freedom fighter http://FSFLA.org/~lxoliva/ > You must be the change you wish to see in the world. -- Gandhi > Be Free! -- http://FSFLA.org/ FSF Latin America board member > Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer
On Mon, Aug 17, 2015 at 5:20 PM, Kyrill Tkachov <kyrylo.tkachov@foss.arm.com> wrote: > Hi Alexandre, > > On 17/08/15 03:56, Alexandre Oliva wrote: >> >> On Aug 16, 2015, Andreas Schwab <schwab@linux-m68k.org> wrote: >> >>> Alexandre Oliva <aoliva@redhat.com> writes: >>>> >>>> On Aug 15, 2015, Andreas Schwab <schwab@linux-m68k.org> wrote: >>>> >>>>> FAIL: gcc.target/aarch64/target_attr_crypto_ice_1.c (internal compiler >>>>> error) >>>>> In file included from >>>>> >>>>> /opt/gcc/gcc-20150815/gcc/testsuite/gcc.target/aarch64/target_attr_crypto_ice_1.c:4:0: >>>> >>>> Are you sure this is a regression introduced by my patch? >>> >>> Yes, it reintroduces the ICE. >> >> Ugh. I see this testcase was introduced very recently, so presumably it >> wasn't present in the tree that James Greenhalgh tested and confirmed >> there were no regressions. > > > Yeah, I introduced it as part of the SWITCHABLE_TARGET > work for aarch64. A bit of a mid-air collision :( > >> The hack in aarch64-builtins.c looks risky IMHO. Changing the mode of a >> decl after RTL is assigned to it (or to its SSA partitions) seems fishy. >> The assert is doing just what it was supposed to do. The only surprise >> to me is that it didn't catch this unexpected and unsupported change >> before. >> >> Presumably if we just dropped the assert in expand_expr_real_1, this >> case would work just fine, although the unsignedp bit would be >> meaningless and thus confusing, since the subreg isn't about a >> promotion, but about reflecting the mode change that was made from under >> us. >> >> May I suggest that you guys find (or introduce) other means to change >> the layout and mode of the decl *before* RTL is assigned to the params? >> I think this would save us a ton of trouble down the road. Just think >> how much trouble you'd get if the different modes had different calling >> conventions, alignment requirements, valid register assignments, or >> anything that might make coalescing their SSA names with those of other >> variables invalid. >> > I'm not familiar with the intricacies in this area but > I'll have a look. > Perhaps we can somehow re-layout the SIMD types when > switching from a non-simd to a simd target... > Can you, or Andreas please file a PR so we don't forget? How does x86 handle this case? Because it should be handling this case somehow. Thanks, Andrew > > Thanks, > Kyrill
On 17/08/15 03:56, Alexandre Oliva wrote: > On Aug 16, 2015, Andreas Schwab <schwab@linux-m68k.org> wrote: > >> Alexandre Oliva <aoliva@redhat.com> writes: >>> On Aug 15, 2015, Andreas Schwab <schwab@linux-m68k.org> wrote: >>> >>>> FAIL: gcc.target/aarch64/target_attr_crypto_ice_1.c (internal compiler error) >>>> In file included from >>>> /opt/gcc/gcc-20150815/gcc/testsuite/gcc.target/aarch64/target_attr_crypto_ice_1.c:4:0: >>> Are you sure this is a regression introduced by my patch? >> Yes, it reintroduces the ICE. > Ugh. I see this testcase was introduced very recently, so presumably it > wasn't present in the tree that James Greenhalgh tested and confirmed > there were no regressions. > > The hack in aarch64-builtins.c looks risky IMHO. Changing the mode of a > decl after RTL is assigned to it (or to its SSA partitions) seems fishy. > The assert is doing just what it was supposed to do. The only surprise > to me is that it didn't catch this unexpected and unsupported change > before. > > Presumably if we just dropped the assert in expand_expr_real_1, this > case would work just fine, although the unsignedp bit would be > meaningless and thus confusing, since the subreg isn't about a > promotion, but about reflecting the mode change that was made from under > us. > > May I suggest that you guys find (or introduce) other means to change > the layout and mode of the decl *before* RTL is assigned to the params? Hmm, if in TARGET_SET_CURRENT_FUNCTION, which is called fairly early on to set up cfun I do the relaying of the param decls then it seems to work. Will do some more testing... > I think this would save us a ton of trouble down the road. Just think > how much trouble you'd get if the different modes had different calling > conventions, alignment requirements, valid register assignments, or > anything that might make coalescing their SSA names with those of other > variables invalid. >
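[A rough sketch of what re-laying out the parameters from the set_current_function hook could look like, purely illustrative and not the fix that was eventually committed; the helper name aarch64_relayout_simd_parms and the vector-type filter are assumptions made for the example, the idea being only that DECL_MODE is recomputed before cfgexpand or assign_parms hands out RTL for the parms:

/* Illustrative sketch only.  Re-lay out the parameters of FNDECL once
   the target has switched into a SIMD-capable configuration, so their
   modes are settled before any RTL is assigned to them.  Assumed to be
   called from the aarch64 TARGET_SET_CURRENT_FUNCTION hook.  */

static void
aarch64_relayout_simd_parms (tree fndecl)
{
  tree parm;

  for (parm = DECL_ARGUMENTS (fndecl); parm; parm = DECL_CHAIN (parm))
    if (VECTOR_TYPE_P (TREE_TYPE (parm))
	&& !DECL_RTL_SET_P (parm))
      /* Recompute DECL_MODE and DECL_ALIGN from the (now SIMD-capable)
	 type layout.  */
      relayout_decl (parm);
}
]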
On 14/08/15 19:57, Alexandre Oliva wrote: > > I'm glad it appears to be working to everyone's > satisfaction now. I've just committed it as r226901, with only a > context adjustment to account for a change in use_register_for_decl in > function.c. /me crosses fingers :-) > > Here's the patch as checked in: One more failure to report, I'm afraid. On AArch64 Bigendian, aapcs64/func-ret-4.c ICEs in simplify_subreg (line refs here are from r227348): In file included from /work/alalaw01/src/gcc/gcc/testsuite/gcc.target/aarch64/aapcs64/func-ret-4.c:14:0: /work/alalaw01/src/gcc/gcc/testsuite/gcc.target/aarch64/aapcs64/func-ret-4.c: In function 'func_return_val_10': /work/alalaw01/src/gcc/gcc/testsuite/gcc.target/aarch64/aapcs64/abitest-2.h:12:24: internal compiler error: in simplify_subreg, at simplify-rtx.c:5808 /work/alalaw01/src/gcc/gcc/testsuite/gcc.target/aarch64/aapcs64/abitest-2.h:13:40: note: in definition of macro 'FUNC_NAME_COMBINE' /work/alalaw01/src/gcc/gcc/testsuite/gcc.target/aarch64/aapcs64/abitest-2.h:15:27: note: in expansion of macro 'FUNC_NAME_1' /work/alalaw01/src/gcc/gcc/testsuite/gcc.target/aarch64/aapcs64/abitest-2.h:15:39: note: in expansion of macro 'FUNC_BASE_NAME' /work/alalaw01/src/gcc/gcc/testsuite/gcc.target/aarch64/aapcs64/abitest-2.h:69:33: note: in expansion of macro 'FUNC_NAME' /work/alalaw01/src/gcc/gcc/testsuite/gcc.target/aarch64/aapcs64/func-ret-4.c:23:1: note: in expansion of macro 'FUNC_VAL_CHECK' 0xa7ba44 simplify_subreg(machine_mode, rtx_def*, machine_mode, unsigned int) /work/alalaw01/src/gcc/gcc/simplify-rtx.c:5808 0xa7c4ef simplify_gen_subreg(machine_mode, rtx_def*, machine_mode, unsigned int) /work/alalaw01/src/gcc/gcc/simplify-rtx.c:6031 0x7ad097 operand_subword(rtx_def*, unsigned int, int, machine_mode) /work/alalaw01/src/gcc/gcc/emit-rtl.c:1611 0x7def4e move_block_from_reg(int, rtx_def*, int) /work/alalaw01/src/gcc/gcc/expr.c:1536 0x83a494 assign_parm_setup_block /work/alalaw01/src/gcc/gcc/function.c:3117 0x841a43 assign_parms /work/alalaw01/src/gcc/gcc/function.c:3857 0x842ffa expand_function_start(tree_node*) /work/alalaw01/src/gcc/gcc/function.c:5286 0x6e7496 execute /work/alalaw01/src/gcc/gcc/cfgexpand.c:6203 Please submit a full bug report, with preprocessed source if appropriate. Please include the complete backtrace with any bug report. See <http://gcc.gnu.org/bugs.html> for instructions. FAIL: gcc.target/aarch64/aapcs64/func-ret-4.c compilation, -O1 (internal compiler error) Also at -O2, -O3 -g, -Og -g, -Os. -O0 is OK. simplify_subreg is called with outermode=DImode, op= (concat:CHI (reg:HI 76 [ t ]) (reg:HI 77 [ t+2 ])) innermode = BLKmode (which violates the assertion), byte=0. move_block_from_reg (in expr.c) calls operand_subword(x, i, 1, BLKmode), here i=0 and x is the concat:CHI above, and operand_subword doesn't handle that case (well, it passes it onto simplify_subreg). In assign_parm_setup_block, I see 'mem = validize_mem (copy_rtx (stack_parm))' where stack_parm is again the same concat:CHI. This should be easily reproducible with a stage 1 compiler (aarch64_be-none-elf). --Alan
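[Reduced to its essentials, the failing sequence described above looks roughly like this; the values are the ones quoted in the report, and the fragment is only an illustration of the problem, not a proposed change:

/* stack_parm is the CONCAT quoted above:
     (concat:CHI (reg:HI 76 [ t ]) (reg:HI 77 [ t+2 ]))
   assign_parm_setup_block copies it via validize_mem, and
   move_block_from_reg then asks for word 0 of it, still in BLKmode:  */
rtx word = operand_subword (stack_parm, /* offset */ 0,
			    /* validate_address */ 1, BLKmode);
/* operand_subword has no special handling for CONCAT, so the request
   ends up in simplify_gen_subreg (DImode, stack_parm, BLKmode, 0), and
   simplify_subreg asserts that the inner mode is not BLKmode.  */
]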
On Sep 2, 2015, Alan Lawrence <alan.lawrence@arm.com> wrote: > One more failure to report, I'm afraid. On AArch64 Bigendian, > aapcs64/func-ret-4.c ICEs in simplify_subreg (line refs here are from > r227348): Thanks. The failure mode was different in the current, revamped git branch aoliva/pr64164, but I've just fixed it there. I'm almost ready to post a new patch, with a new, simpler, less fragile and more maintainable approach to integrate cfgexpand and assign_parms' RTL assignment, so if you could give it a spin on big and little endian aarch64 natives, that would be very much appreciated!
On 02/09/15 23:12, Alexandre Oliva wrote: > On Sep 2, 2015, Alan Lawrence <alan.lawrence@arm.com> wrote: > >> One more failure to report, I'm afraid. On AArch64 Bigendian, >> aapcs64/func-ret-4.c ICEs in simplify_subreg (line refs here are from >> r227348): > > Thanks. The failure mode was different in the current, revamped git > branch aoliva/pr64164, but I've just fixed it there. > > I'm almost ready to post a new patch, with a new, simpler, less fragile > and more maintainable approach to integrate cfgexpand and assign_parms' > RTL assignment, so if you could give it a spin on big and little endian > aarch64 natives, that would be very much appreciated! > On aarch64_be, that branch fixes the ICE - but func-ret-4.c fails on execution, and now func-ret-3.c does too! Also it causes a bunch of errors building newlib using cross-built binutils, which I haven't tracked down yet: /work/alalaw01/src2/binutils-gdb/newlib/libc/locale/locale.c: In function '__get_locale_env': /work/alalaw01/src2/binutils-gdb/newlib/libc/locale/locale.c:911:1: internal compiler error: in insert_value_copy_on_edge, at tree-outof-ssa.c:308 __get_locale_env(struct _reent *p, int category) ^ 0xb4ecc4 insert_value_copy_on_edge /work/alalaw01/src2/gcc/gcc/tree-outof-ssa.c:307 0xb4ecc4 eliminate_phi /work/alalaw01/src2/gcc/gcc/tree-outof-ssa.c:780 0xb4ecc4 expand_phi_nodes(ssaexpand*) /work/alalaw01/src2/gcc/gcc/tree-outof-ssa.c:943 0x6e74a6 execute /work/alalaw01/src2/gcc/gcc/cfgexpand.c:6242 Please submit a full bug report, with preprocessed source if appropriate. Please include the complete backtrace with any bug report. See <http://gcc.gnu.org/bugs.html> for instructions. make[7]: *** [lib_a-locale.o] Error 1 --Alan
On 02/09/15 23:12, Alexandre Oliva wrote: > On Sep 2, 2015, Alan Lawrence <alan.lawrence@arm.com> wrote: > >> One more failure to report, I'm afraid. On AArch64 Bigendian, >> aapcs64/func-ret-4.c ICEs in simplify_subreg (line refs here are from >> r227348): > > Thanks. The failure mode was different in the current, revamped git > branch aoliva/pr64164, but I've just fixed it there. > > I'm almost ready to post a new patch, with a new, simpler, less fragile > and more maintainable approach to integrate cfgexpand and assign_parms' > RTL assignment, so if you could give it a spin on big and little endian > aarch64 natives, that would be very much appreciated! > On trunk, aarch64_be is still ICEing in gcc.target/aarch64/aapcs64/func-ret-4.c (complex numbers). With the latest git commit 2b27ef197ece54c4573c5a748b0d40076e35412c on branch aoliva/pr64164, I am now able to build a cross toolchain for aarch64 and aarch64_be, and can confirm the ABI failure is fixed on the branch. HTH, Alan
diff --git a/gcc/Makefile.in b/gcc/Makefile.in index c1cb4ce..e298ecc 100644 --- a/gcc/Makefile.in +++ b/gcc/Makefile.in @@ -1447,7 +1447,6 @@ OBJS = \ tree-ssa-ccp.o \ tree-ssa-coalesce.o \ tree-ssa-copy.o \ - tree-ssa-copyrename.o \ tree-ssa-dce.o \ tree-ssa-dom.o \ tree-ssa-dse.o \ diff --git a/gcc/alias.c b/gcc/alias.c index fa7d5d8..4681e3f 100644 --- a/gcc/alias.c +++ b/gcc/alias.c @@ -2507,6 +2507,19 @@ nonoverlapping_memrefs_p (const_rtx x, const_rtx y, bool loop_invariant) if (! DECL_P (exprx) || ! DECL_P (expry)) return 0; + /* If we refer to different gimple registers, or one gimple register + and one non-gimple-register, we know they can't overlap. First, + gimple registers don't have their addresses taken. Now, there + could be more than one stack slot for (different versions of) the + same gimple register, but we can presumably tell they don't + overlap based on offsets from stack base addresses elsewhere. + It's important that we don't proceed to DECL_RTL, because gimple + registers may not pass DECL_RTL_SET_P, and make_decl_rtl won't be + able to do anything about them since no SSA information will have + remained to guide it. */ + if (is_gimple_reg (exprx) || is_gimple_reg (expry)) + return exprx != expry; + /* With invalid code we can end up storing into the constant pool. Bail out to avoid ICEing when creating RTL for this. See gfortran.dg/lto/20091028-2_0.f90. */ diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c index 7df9d06..0bc20f6 100644 --- a/gcc/cfgexpand.c +++ b/gcc/cfgexpand.c @@ -97,6 +97,8 @@ gimple currently_expanding_gimple_stmt; static rtx expand_debug_expr (tree); +static bool defer_stack_allocation (tree, bool); + /* Return an expression tree corresponding to the RHS of GIMPLE statement STMT. */ @@ -150,21 +152,149 @@ gimple_assign_rhs_to_tree (gimple stmt) #define SSAVAR(x) (TREE_CODE (x) == SSA_NAME ? SSA_NAME_VAR (x) : x) +/* Choose either CUR or NEXT as the leader DECL for a partition. + Prefer ignored decls, to simplify debug dumps and reduce ambiguity + out of the same user variable being in multiple partitions (this is + less likely for compiler-introduced temps). */ + +static tree +leader_merge (tree cur, tree next) +{ + if (cur == NULL || cur == next) + return next; + + if (DECL_P (cur) && DECL_IGNORED_P (cur)) + return cur; + + if (DECL_P (next) && DECL_IGNORED_P (next)) + return next; + + return cur; +} + +/* Return true if VAR is a PARM_DECL or a RESULT_DECL of type BLKmode. + Such parameters are likely passed as a pointer to the value, rather + than as a value, and so we must not coalesce them, nor allocate + stack space for them before determining the calling conventions for + them. For their SSA_NAMEs, expand_one_ssa_partition emits RTL as + MEMs with pc_rtx as the address, and then it replaces the pc_rtx + with NULL so as to make sure the MEM is not used before it is + adjusted in assign_parm_setup_reg. */ + +bool +parm_maybe_byref_p (tree var) +{ + if (!var || VAR_P (var)) + return false; + + gcc_assert (TREE_CODE (var) == PARM_DECL + || TREE_CODE (var) == RESULT_DECL); + + return TYPE_MODE (TREE_TYPE (var)) == BLKmode; +} + +/* Return the partition of the default SSA_DEF for decl VAR. */ + +static int +ssa_default_def_partition (tree var) +{ + tree name = ssa_default_def (cfun, var); + + if (!name) + return NO_PARTITION; + + return var_to_partition (SA.map, name); +} + +/* Return the RTL for the default SSA def of a PARM or RESULT, if + there is one. 
*/ + +rtx +get_rtl_for_parm_ssa_default_def (tree var) +{ + gcc_assert (TREE_CODE (var) == PARM_DECL || TREE_CODE (var) == RESULT_DECL); + + if (!is_gimple_reg (var)) + return NULL_RTX; + + /* If we've already determined RTL for the decl, use it. This is + not just an optimization: if VAR is a PARM whose incoming value + is unused, we won't find a default def to use its partition, but + we still want to use the location of the parm, if it was used at + all. During assign_parms, until a location is assigned for the + VAR, RTL can only for a parm or result if we're not coalescing + across variables, when we know we're coalescing all SSA_NAMEs of + each parm or result, and we're not coalescing them with names + pertaining to other variables, such as other parms' default + defs. */ + if (DECL_RTL_SET_P (var)) + { + gcc_assert (DECL_RTL (var) != pc_rtx); + return DECL_RTL (var); + } + + int part = ssa_default_def_partition (var); + if (part == NO_PARTITION) + return NULL_RTX; + + return SA.partition_to_pseudo[part]; +} + /* Associate declaration T with storage space X. If T is no SSA name this is exactly SET_DECL_RTL, otherwise make the partition of T associated with X. */ static inline void set_rtl (tree t, rtx x) { + if (x && SSAVAR (t)) + { + bool skip = false; + tree cur = NULL_TREE; + + if (MEM_P (x)) + cur = MEM_EXPR (x); + else if (REG_P (x)) + cur = REG_EXPR (x); + else if (GET_CODE (x) == CONCAT + && REG_P (XEXP (x, 0))) + cur = REG_EXPR (XEXP (x, 0)); + else if (GET_CODE (x) == PARALLEL) + cur = REG_EXPR (XVECEXP (x, 0, 0)); + else if (x == pc_rtx) + skip = true; + else + gcc_unreachable (); + + tree next = skip ? cur : leader_merge (cur, SSAVAR (t)); + + if (cur != next) + { + if (MEM_P (x)) + set_mem_attributes (x, next, true); + else + set_reg_attrs_for_decl_rtl (next, x); + } + } + if (TREE_CODE (t) == SSA_NAME) { - SA.partition_to_pseudo[var_to_partition (SA.map, t)] = x; - if (x && !MEM_P (x)) - set_reg_attrs_for_decl_rtl (SSA_NAME_VAR (t), x); - /* For the benefit of debug information at -O0 (where vartracking - doesn't run) record the place also in the base DECL if it's - a normal variable (not a parameter). */ - if (x && x != pc_rtx && TREE_CODE (SSA_NAME_VAR (t)) == VAR_DECL) + int part = var_to_partition (SA.map, t); + if (part != NO_PARTITION) + { + if (SA.partition_to_pseudo[part]) + gcc_assert (SA.partition_to_pseudo[part] == x); + else if (x != pc_rtx) + SA.partition_to_pseudo[part] = x; + } + /* For the benefit of debug information at -O0 (where + vartracking doesn't run) record the place also in the base + DECL. For PARMs and RESULTs, we may end up resetting these + in function.c:maybe_reset_rtl_for_parm, but in some rare + cases we may need them (unused and overwritten incoming + value, that at -O0 must share the location with the other + uses in spite of the missing default def), and this may be + the only chance to preserve them. */ + if (x && x != pc_rtx && SSA_NAME_VAR (t)) { tree var = SSA_NAME_VAR (t); /* If we don't yet have something recorded, just record it now. 
*/ @@ -248,8 +378,15 @@ static bool has_short_buffer; static unsigned int align_local_variable (tree decl) { - unsigned int align = LOCAL_DECL_ALIGNMENT (decl); - DECL_ALIGN (decl) = align; + unsigned int align; + + if (TREE_CODE (decl) == SSA_NAME) + align = TYPE_ALIGN (TREE_TYPE (decl)); + else + { + align = LOCAL_DECL_ALIGNMENT (decl); + DECL_ALIGN (decl) = align; + } return align / BITS_PER_UNIT; } @@ -315,12 +452,15 @@ add_stack_var (tree decl) decl_to_stack_part->put (decl, stack_vars_num); v->decl = decl; - v->size = tree_to_uhwi (DECL_SIZE_UNIT (SSAVAR (decl))); + tree size = TREE_CODE (decl) == SSA_NAME + ? TYPE_SIZE_UNIT (TREE_TYPE (decl)) + : DECL_SIZE_UNIT (decl); + v->size = tree_to_uhwi (size); /* Ensure that all variables have size, so that &a != &b for any two variables that are simultaneously live. */ if (v->size == 0) v->size = 1; - v->alignb = align_local_variable (SSAVAR (decl)); + v->alignb = align_local_variable (decl); /* An alignment of zero can mightily confuse us later. */ gcc_assert (v->alignb != 0); @@ -862,7 +1002,9 @@ expand_one_stack_var_at (tree decl, rtx base, unsigned base_align, gcc_assert (offset == trunc_int_for_mode (offset, Pmode)); x = plus_constant (Pmode, base, offset); - x = gen_rtx_MEM (DECL_MODE (SSAVAR (decl)), x); + x = gen_rtx_MEM (TREE_CODE (decl) == SSA_NAME + ? TYPE_MODE (TREE_TYPE (decl)) + : DECL_MODE (SSAVAR (decl)), x); if (TREE_CODE (decl) != SSA_NAME) { @@ -884,7 +1026,6 @@ expand_one_stack_var_at (tree decl, rtx base, unsigned base_align, DECL_USER_ALIGN (decl) = 0; } - set_mem_attributes (x, SSAVAR (decl), true); set_rtl (decl, x); } @@ -950,9 +1091,9 @@ expand_stack_vars (bool (*pred) (size_t), struct stack_vars_data *data) /* Skip variables that have already had rtl assigned. See also add_stack_var where we perpetrate this pc_rtx hack. */ decl = stack_vars[i].decl; - if ((TREE_CODE (decl) == SSA_NAME - ? SA.partition_to_pseudo[var_to_partition (SA.map, decl)] - : DECL_RTL (decl)) != pc_rtx) + if (TREE_CODE (decl) == SSA_NAME + ? SA.partition_to_pseudo[var_to_partition (SA.map, decl)] != NULL_RTX + : DECL_RTL (decl) != pc_rtx) continue; large_size += alignb - 1; @@ -981,9 +1122,9 @@ expand_stack_vars (bool (*pred) (size_t), struct stack_vars_data *data) /* Skip variables that have already had rtl assigned. See also add_stack_var where we perpetrate this pc_rtx hack. */ decl = stack_vars[i].decl; - if ((TREE_CODE (decl) == SSA_NAME - ? SA.partition_to_pseudo[var_to_partition (SA.map, decl)] - : DECL_RTL (decl)) != pc_rtx) + if (TREE_CODE (decl) == SSA_NAME + ? SA.partition_to_pseudo[var_to_partition (SA.map, decl)] != NULL_RTX + : DECL_RTL (decl) != pc_rtx) continue; /* Check the predicate to see whether this variable should be @@ -1099,13 +1240,22 @@ account_stack_vars (void) to a variable to be allocated in the stack frame. */ static void -expand_one_stack_var (tree var) +expand_one_stack_var_1 (tree var) { HOST_WIDE_INT size, offset; unsigned byte_align; - size = tree_to_uhwi (DECL_SIZE_UNIT (SSAVAR (var))); - byte_align = align_local_variable (SSAVAR (var)); + if (TREE_CODE (var) == SSA_NAME) + { + tree type = TREE_TYPE (var); + size = tree_to_uhwi (TYPE_SIZE_UNIT (type)); + byte_align = TYPE_ALIGN_UNIT (type); + } + else + { + size = tree_to_uhwi (DECL_SIZE_UNIT (var)); + byte_align = align_local_variable (var); + } /* We handle highly aligned variables in expand_stack_vars. 
*/ gcc_assert (byte_align * BITS_PER_UNIT <= MAX_SUPPORTED_STACK_ALIGNMENT); @@ -1116,6 +1266,27 @@ expand_one_stack_var (tree var) crtl->max_used_stack_slot_alignment, offset); } +/* Wrapper for expand_one_stack_var_1 that checks SSA_NAMEs are + already assigned some MEM. */ + +static void +expand_one_stack_var (tree var) +{ + if (TREE_CODE (var) == SSA_NAME) + { + int part = var_to_partition (SA.map, var); + if (part != NO_PARTITION) + { + rtx x = SA.partition_to_pseudo[part]; + gcc_assert (x); + gcc_assert (MEM_P (x)); + return; + } + } + + return expand_one_stack_var_1 (var); +} + /* A subroutine of expand_one_var. Called to assign rtl to a VAR_DECL that will reside in a hard register. */ @@ -1125,13 +1296,136 @@ expand_one_hard_reg_var (tree var) rest_of_decl_compilation (var, 0, 0); } +/* Record the alignment requirements of some variable assigned to a + pseudo. */ + +static void +record_alignment_for_reg_var (unsigned int align) +{ + if (SUPPORTS_STACK_ALIGNMENT + && crtl->stack_alignment_estimated < align) + { + /* stack_alignment_estimated shouldn't change after stack + realign decision made */ + gcc_assert (!crtl->stack_realign_processed); + crtl->stack_alignment_estimated = align; + } + + /* stack_alignment_needed > PREFERRED_STACK_BOUNDARY is permitted. + So here we only make sure stack_alignment_needed >= align. */ + if (crtl->stack_alignment_needed < align) + crtl->stack_alignment_needed = align; + if (crtl->max_used_stack_slot_alignment < align) + crtl->max_used_stack_slot_alignment = align; +} + +/* Create RTL for an SSA partition. */ + +static void +expand_one_ssa_partition (tree var) +{ + int part = var_to_partition (SA.map, var); + gcc_assert (part != NO_PARTITION); + + if (SA.partition_to_pseudo[part]) + return; + + unsigned int align = MINIMUM_ALIGNMENT (TREE_TYPE (var), + TYPE_MODE (TREE_TYPE (var)), + TYPE_ALIGN (TREE_TYPE (var))); + + /* If the variable alignment is very large we'll dynamicaly allocate + it, which means that in-frame portion is just a pointer. */ + if (align > MAX_SUPPORTED_STACK_ALIGNMENT) + align = POINTER_SIZE; + + record_alignment_for_reg_var (align); + + if (!use_register_for_decl (var)) + { + if (parm_maybe_byref_p (SSA_NAME_VAR (var)) + && ssa_default_def_partition (SSA_NAME_VAR (var)) == part) + { + expand_one_stack_var_at (var, pc_rtx, 0, 0); + rtx x = SA.partition_to_pseudo[part]; + gcc_assert (GET_CODE (x) == MEM); + gcc_assert (GET_MODE (x) == BLKmode); + gcc_assert (XEXP (x, 0) == pc_rtx); + /* Reset the address, so that any attempt to use it will + ICE. It will be adjusted in assign_parm_setup_reg. */ + XEXP (x, 0) = NULL_RTX; + } + else if (defer_stack_allocation (var, true)) + add_stack_var (var); + else + expand_one_stack_var_1 (var); + return; + } + + machine_mode reg_mode = promote_ssa_mode (var, NULL); + + rtx x = gen_reg_rtx (reg_mode); + + set_rtl (var, x); +} + +/* Record the association between the RTL generated for a partition + and the underlying variable of the SSA_NAME. */ + +static void +adjust_one_expanded_partition_var (tree var) +{ + if (!var) + return; + + tree decl = SSA_NAME_VAR (var); + + int part = var_to_partition (SA.map, var); + if (part == NO_PARTITION) + return; + + rtx x = SA.partition_to_pseudo[part]; + + if (!x) + { + /* This var will get a stack slot later. */ + gcc_assert (defer_stack_allocation (var, true)); + return; + } + + set_rtl (var, x); + + if (!REG_P (x)) + return; + + /* Note if the object is a user variable. 
*/ + if (decl && !DECL_ARTIFICIAL (decl)) + mark_user_reg (x); + + if (POINTER_TYPE_P (decl ? TREE_TYPE (decl) : TREE_TYPE (var))) + mark_reg_pointer (x, get_pointer_alignment (var)); +} + /* A subroutine of expand_one_var. Called to assign rtl to a VAR_DECL that will reside in a pseudo register. */ static void expand_one_register_var (tree var) { - tree decl = SSAVAR (var); + if (TREE_CODE (var) == SSA_NAME) + { + int part = var_to_partition (SA.map, var); + if (part != NO_PARTITION) + { + rtx x = SA.partition_to_pseudo[part]; + gcc_assert (x); + gcc_assert (REG_P (x)); + return; + } + gcc_unreachable (); + } + + tree decl = var; tree type = TREE_TYPE (decl); machine_mode reg_mode = promote_decl_mode (decl, NULL); rtx x = gen_reg_rtx (reg_mode); @@ -1177,10 +1471,14 @@ expand_one_error_var (tree var) static bool defer_stack_allocation (tree var, bool toplevel) { + tree size_unit = TREE_CODE (var) == SSA_NAME + ? TYPE_SIZE_UNIT (TREE_TYPE (var)) + : DECL_SIZE_UNIT (var); + /* Whether the variable is small enough for immediate allocation not to be a problem with regard to the frame size. */ bool smallish - = ((HOST_WIDE_INT) tree_to_uhwi (DECL_SIZE_UNIT (var)) + = ((HOST_WIDE_INT) tree_to_uhwi (size_unit) < PARAM_VALUE (PARAM_MIN_SIZE_FOR_STACK_SHARING)); /* If stack protection is enabled, *all* stack variables must be deferred, @@ -1189,16 +1487,24 @@ defer_stack_allocation (tree var, bool toplevel) if (flag_stack_protect || ((flag_sanitize & SANITIZE_ADDRESS) && ASAN_STACK)) return true; + unsigned int align = TREE_CODE (var) == SSA_NAME + ? TYPE_ALIGN (TREE_TYPE (var)) + : DECL_ALIGN (var); + /* We handle "large" alignment via dynamic allocation. We want to handle this extra complication in only one place, so defer them. */ - if (DECL_ALIGN (var) > MAX_SUPPORTED_STACK_ALIGNMENT) + if (align > MAX_SUPPORTED_STACK_ALIGNMENT) return true; + bool ignored = TREE_CODE (var) == SSA_NAME + ? !SSAVAR (var) || DECL_IGNORED_P (SSA_NAME_VAR (var)) + : DECL_IGNORED_P (var); + /* When optimization is enabled, DECL_IGNORED_P variables originally scoped might be detached from their block and appear at toplevel when we reach here. We want to coalesce them with variables from other blocks when the immediate contribution to the frame size would be noticeable. */ - if (toplevel && optimize > 0 && DECL_IGNORED_P (var) && !smallish) + if (toplevel && optimize > 0 && ignored && !smallish) return true; /* Variables declared in the outermost scope automatically conflict @@ -1265,21 +1571,7 @@ expand_one_var (tree var, bool toplevel, bool really_expand) align = POINTER_SIZE; } - if (SUPPORTS_STACK_ALIGNMENT - && crtl->stack_alignment_estimated < align) - { - /* stack_alignment_estimated shouldn't change after stack - realign decision made */ - gcc_assert (!crtl->stack_realign_processed); - crtl->stack_alignment_estimated = align; - } - - /* stack_alignment_needed > PREFERRED_STACK_BOUNDARY is permitted. - So here we only make sure stack_alignment_needed >= align. 
*/ - if (crtl->stack_alignment_needed < align) - crtl->stack_alignment_needed = align; - if (crtl->max_used_stack_slot_alignment < align) - crtl->max_used_stack_slot_alignment = align; + record_alignment_for_reg_var (align); if (TREE_CODE (origvar) == SSA_NAME) { @@ -1722,48 +2014,18 @@ expand_used_vars (void) if (targetm.use_pseudo_pic_reg ()) pic_offset_table_rtx = gen_reg_rtx (Pmode); - hash_map<tree, tree> ssa_name_decls; for (i = 0; i < SA.map->num_partitions; i++) { tree var = partition_to_var (SA.map, i); gcc_assert (!virtual_operand_p (var)); - /* Assign decls to each SSA name partition, share decls for partitions - we could have coalesced (those with the same type). */ - if (SSA_NAME_VAR (var) == NULL_TREE) - { - tree *slot = &ssa_name_decls.get_or_insert (TREE_TYPE (var)); - if (!*slot) - *slot = create_tmp_reg (TREE_TYPE (var)); - replace_ssa_name_symbol (var, *slot); - } - - /* Always allocate space for partitions based on VAR_DECLs. But for - those based on PARM_DECLs or RESULT_DECLs and which matter for the - debug info, there is no need to do so if optimization is disabled - because all the SSA_NAMEs based on these DECLs have been coalesced - into a single partition, which is thus assigned the canonical RTL - location of the DECLs. If in_lto_p, we can't rely on optimize, - a function could be compiled with -O1 -flto first and only the - link performed at -O0. */ - if (TREE_CODE (SSA_NAME_VAR (var)) == VAR_DECL) - expand_one_var (var, true, true); - else if (DECL_IGNORED_P (SSA_NAME_VAR (var)) || optimize || in_lto_p) - { - /* This is a PARM_DECL or RESULT_DECL. For those partitions that - contain the default def (representing the parm or result itself) - we don't do anything here. But those which don't contain the - default def (representing a temporary based on the parm/result) - we need to allocate space just like for normal VAR_DECLs. */ - if (!bitmap_bit_p (SA.partition_has_default_def, i)) - { - expand_one_var (var, true, true); - gcc_assert (SA.partition_to_pseudo[i]); - } - } + expand_one_ssa_partition (var); } + for (i = 1; i < num_ssa_names; i++) + adjust_one_expanded_partition_var (ssa_name (i)); + if (flag_stack_protect == SPCT_FLAG_STRONG) gen_stack_protect_signal = stack_protect_decl_p () || stack_protect_return_slot_p (); @@ -5928,35 +6190,6 @@ pass_expand::execute (function *fun) parm_birth_insn = var_seq; } - /* Now that we also have the parameter RTXs, copy them over to our - partitions. */ - for (i = 0; i < SA.map->num_partitions; i++) - { - tree var = SSA_NAME_VAR (partition_to_var (SA.map, i)); - - if (TREE_CODE (var) != VAR_DECL - && !SA.partition_to_pseudo[i]) - SA.partition_to_pseudo[i] = DECL_RTL_IF_SET (var); - gcc_assert (SA.partition_to_pseudo[i]); - - /* If this decl was marked as living in multiple places, reset - this now to NULL. */ - if (DECL_RTL_IF_SET (var) == pc_rtx) - SET_DECL_RTL (var, NULL); - - /* Some RTL parts really want to look at DECL_RTL(x) when x - was a decl marked in REG_ATTR or MEM_ATTR. We could use - SET_DECL_RTL here making this available, but that would mean - to select one of the potentially many RTLs for one DECL. Instead - of doing that we simply reset the MEM_EXPR of the RTL in question, - then nobody can get at it and hence nobody can call DECL_RTL on it. */ - if (!DECL_RTL_SET_P (var)) - { - if (MEM_P (SA.partition_to_pseudo[i])) - set_mem_expr (SA.partition_to_pseudo[i], NULL); - } - } - /* If we have a class containing differently aligned pointers we need to merge those into the corresponding RTL pointer alignment. 
*/ @@ -5964,7 +6197,6 @@ pass_expand::execute (function *fun) { tree name = ssa_name (i); int part; - rtx r; if (!name /* We might have generated new SSA names in @@ -5977,20 +6209,25 @@ pass_expand::execute (function *fun) if (part == NO_PARTITION) continue; - /* Adjust all partition members to get the underlying decl of - the representative which we might have created in expand_one_var. */ - if (SSA_NAME_VAR (name) == NULL_TREE) + gcc_assert (SA.partition_to_pseudo[part] + || defer_stack_allocation (name, true)); + + /* If this decl was marked as living in multiple places, reset + this now to NULL. */ + tree var = SSA_NAME_VAR (name); + if (var && DECL_RTL_IF_SET (var) == pc_rtx) + SET_DECL_RTL (var, NULL); + /* Check that the pseudos chosen by assign_parms are those of + the corresponding default defs. */ + else if (SSA_NAME_IS_DEFAULT_DEF (name) + && (TREE_CODE (var) == PARM_DECL + || TREE_CODE (var) == RESULT_DECL)) { - tree leader = partition_to_var (SA.map, part); - gcc_assert (SSA_NAME_VAR (leader) != NULL_TREE); - replace_ssa_name_symbol (name, SSA_NAME_VAR (leader)); + rtx in = DECL_RTL_IF_SET (var); + gcc_assert (in); + rtx out = SA.partition_to_pseudo[part]; + gcc_assert (in == out || rtx_equal_p (in, out)); } - if (!POINTER_TYPE_P (TREE_TYPE (name))) - continue; - - r = SA.partition_to_pseudo[part]; - if (REG_P (r)) - mark_reg_pointer (r, get_pointer_alignment (name)); } /* If this function is `main', emit a call to `__main' diff --git a/gcc/cfgexpand.h b/gcc/cfgexpand.h index a0b6e3e..987cf356 100644 --- a/gcc/cfgexpand.h +++ b/gcc/cfgexpand.h @@ -22,5 +22,8 @@ along with GCC; see the file COPYING3. If not see extern tree gimple_assign_rhs_to_tree (gimple); extern HOST_WIDE_INT estimated_stack_frame_size (struct cgraph_node *); +extern bool parm_maybe_byref_p (tree); +extern rtx get_rtl_for_parm_ssa_default_def (tree var); + #endif /* GCC_CFGEXPAND_H */ diff --git a/gcc/common.opt b/gcc/common.opt index e80eadf..dd59ff3 100644 --- a/gcc/common.opt +++ b/gcc/common.opt @@ -2234,16 +2234,16 @@ Common Report Var(flag_tree_ch) Optimization Enable loop header copying on trees ftree-coalesce-inlined-vars -Common Report Var(flag_ssa_coalesce_vars,1) Init(2) RejectNegative Optimization -Enable coalescing of copy-related user variables that are inlined +Common Ignore RejectNegative +Does nothing. Preserved for backward compatibility. ftree-coalesce-vars -Common Report Var(flag_ssa_coalesce_vars,2) Optimization -Enable coalescing of all copy-related user variables +Common Report Var(flag_tree_coalesce_vars) Optimization +Enable SSA coalescing of user variables ftree-copyrename -Common Report Var(flag_tree_copyrename) Optimization -Replace SSA temporaries with better names in copies +Common Ignore +Does nothing. Preserved for backward compatibility. ftree-copy-prop Common Report Var(flag_tree_copy_prop) Optimization diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index 2871337..27be317 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -342,7 +342,6 @@ Objective-C and Objective-C++ Dialects}. -fdump-tree-phiprop@r{[}-@var{n}@r{]} @gol -fdump-tree-phiopt@r{[}-@var{n}@r{]} @gol -fdump-tree-forwprop@r{[}-@var{n}@r{]} @gol --fdump-tree-copyrename@r{[}-@var{n}@r{]} @gol -fdump-tree-nrv -fdump-tree-vect @gol -fdump-tree-sink @gol -fdump-tree-sra@r{[}-@var{n}@r{]} @gol @@ -448,9 +447,8 @@ Objective-C and Objective-C++ Dialects}. 
-fstack-protector-explicit -fstdarg-opt -fstrict-aliasing @gol -fstrict-overflow -fthread-jumps -ftracer -ftree-bit-ccp @gol -ftree-builtin-call-dce -ftree-ccp -ftree-ch @gol --ftree-coalesce-inline-vars -ftree-coalesce-vars -ftree-copy-prop @gol --ftree-copyrename -ftree-dce -ftree-dominator-opts -ftree-dse @gol --ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol +-ftree-coalesce-vars -ftree-copy-prop -ftree-dce -ftree-dominator-opts @gol +-ftree-dse -ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol -ftree-loop-if-convert-stores -ftree-loop-im @gol -ftree-phiprop -ftree-loop-distribution -ftree-loop-distribute-patterns @gol -ftree-loop-ivcanon -ftree-loop-linear -ftree-loop-optimize @gol @@ -7133,11 +7131,6 @@ name is made by appending @file{.phiopt} to the source file name. Dump each function after forward propagating single use variables. The file name is made by appending @file{.forwprop} to the source file name. -@item copyrename -@opindex fdump-tree-copyrename -Dump each function after applying the copy rename optimization. The file -name is made by appending @file{.copyrename} to the source file name. - @item nrv @opindex fdump-tree-nrv Dump each function after applying the named return value optimization on @@ -7602,8 +7595,8 @@ compilation time. -ftree-ccp @gol -fssa-phiopt @gol -ftree-ch @gol +-ftree-coalesce-vars @gol -ftree-copy-prop @gol --ftree-copyrename @gol -ftree-dce @gol -ftree-dominator-opts @gol -ftree-dse @gol @@ -8867,6 +8860,15 @@ be parallelized. Parallelize all the loops that can be analyzed to not contain loop carried dependences without checking that it is profitable to parallelize the loops. +@item -ftree-coalesce-vars +@opindex ftree-coalesce-vars +Tell the compiler to attempt to combine small user-defined variables +too, instead of just compiler temporaries. This may severely limit the +ability to debug an optimized program compiled with +@option{-fno-var-tracking-assignments}. In the negated form, this flag +prevents SSA coalescing of user variables. This option is enabled by +default if optimization is enabled. + @item -ftree-loop-if-convert @opindex ftree-loop-if-convert Attempt to transform conditional jumps in the innermost loops to @@ -8980,32 +8982,6 @@ Perform scalar replacement of aggregates. This pass replaces structure references with scalars to prevent committing structures to memory too early. This flag is enabled by default at @option{-O} and higher. -@item -ftree-copyrename -@opindex ftree-copyrename -Perform copy renaming on trees. This pass attempts to rename compiler -temporaries to other variables at copy locations, usually resulting in -variable names which more closely resemble the original variables. This flag -is enabled by default at @option{-O} and higher. - -@item -ftree-coalesce-inlined-vars -@opindex ftree-coalesce-inlined-vars -Tell the copyrename pass (see @option{-ftree-copyrename}) to attempt to -combine small user-defined variables too, but only if they are inlined -from other functions. It is a more limited form of -@option{-ftree-coalesce-vars}. This may harm debug information of such -inlined variables, but it keeps variables of the inlined-into -function apart from each other, such that they are more likely to -contain the expected values in a debugging session. - -@item -ftree-coalesce-vars -@opindex ftree-coalesce-vars -Tell the copyrename pass (see @option{-ftree-copyrename}) to attempt to -combine small user-defined variables too, instead of just compiler -temporaries. 
This may severely limit the ability to debug an optimized -program compiled with @option{-fno-var-tracking-assignments}. In the -negated form, this flag prevents SSA coalescing of user variables, -including inlined ones. This option is enabled by default. - @item -ftree-ter @opindex ftree-ter Perform temporary expression replacement during the SSA->normal phase. Single diff --git a/gcc/emit-rtl.c b/gcc/emit-rtl.c index d211e6b0..a6ef154 100644 --- a/gcc/emit-rtl.c +++ b/gcc/emit-rtl.c @@ -63,6 +63,7 @@ along with GCC; see the file COPYING3. If not see #include "target.h" #include "builtins.h" #include "rtl-iter.h" +#include "stor-layout.h" struct target_rtl default_target_rtl; #if SWITCHABLE_TARGET @@ -1233,6 +1234,9 @@ set_reg_attrs_for_parm (rtx parm_rtx, rtx mem) void set_reg_attrs_for_decl_rtl (tree t, rtx x) { + if (!t) + return; + tree tdecl = t; if (GET_CODE (x) == SUBREG) { gcc_assert (subreg_lowpart_p (x)); @@ -1241,7 +1245,9 @@ set_reg_attrs_for_decl_rtl (tree t, rtx x) if (REG_P (x)) REG_ATTRS (x) = get_reg_attrs (t, byte_lowpart_offset (GET_MODE (x), - DECL_MODE (t))); + DECL_P (tdecl) + ? DECL_MODE (tdecl) + : TYPE_MODE (TREE_TYPE (tdecl)))); if (GET_CODE (x) == CONCAT) { if (REG_P (XEXP (x, 0))) diff --git a/gcc/explow.c b/gcc/explow.c index bd342c1..6941f4e 100644 --- a/gcc/explow.c +++ b/gcc/explow.c @@ -842,6 +842,35 @@ promote_decl_mode (const_tree decl, int *punsignedp) return pmode; } +/* Return the promoted mode for name. If it is a named SSA_NAME, it + is the same as promote_decl_mode. Otherwise, it is the promoted + mode of a temp decl of same type as the SSA_NAME, if we had created + one. */ + +machine_mode +promote_ssa_mode (const_tree name, int *punsignedp) +{ + gcc_assert (TREE_CODE (name) == SSA_NAME); + + /* Partitions holding parms and results must be promoted as expected + by function.c. */ + if (SSA_NAME_VAR (name) + && (TREE_CODE (SSA_NAME_VAR (name)) == PARM_DECL + || TREE_CODE (SSA_NAME_VAR (name)) == RESULT_DECL)) + return promote_decl_mode (SSA_NAME_VAR (name), punsignedp); + + tree type = TREE_TYPE (name); + int unsignedp = TYPE_UNSIGNED (type); + machine_mode mode = TYPE_MODE (type); + + machine_mode pmode = promote_mode (type, mode, &unsignedp); + if (punsignedp) + *punsignedp = unsignedp; + + return pmode; +} + + /* Controls the behaviour of {anti_,}adjust_stack. */ static bool suppress_reg_args_size; diff --git a/gcc/explow.h b/gcc/explow.h index 94613de..52113db 100644 --- a/gcc/explow.h +++ b/gcc/explow.h @@ -57,6 +57,9 @@ extern machine_mode promote_mode (const_tree, machine_mode, int *); /* Return mode and signedness to use when object is promoted. */ machine_mode promote_decl_mode (const_tree, int *); +/* Return mode and signedness to use when object is promoted. */ +machine_mode promote_ssa_mode (const_tree, int *); + /* Remove some bytes from the stack. An rtx says how many. */ extern void adjust_stack (rtx); diff --git a/gcc/expr.c b/gcc/expr.c index 31b4573..f604f52 100644 --- a/gcc/expr.c +++ b/gcc/expr.c @@ -3022,7 +3022,7 @@ write_complex_part (rtx cplx, rtx val, bool imag_p) /* Extract one of the components of the complex value CPLX. Extract the real part if IMAG_P is false, and the imaginary part if it's true. 
*/ -static rtx +rtx read_complex_part (rtx cplx, bool imag_p) { machine_mode cmode, imode; @@ -9236,7 +9236,7 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode, rtx op0, op1, temp, decl_rtl; tree type; int unsignedp; - machine_mode mode; + machine_mode mode, dmode; enum tree_code code = TREE_CODE (exp); rtx subtarget, original_target; int ignore; @@ -9367,7 +9367,8 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode, if (g == NULL && modifier == EXPAND_INITIALIZER && !SSA_NAME_IS_DEFAULT_DEF (exp) - && (optimize || DECL_IGNORED_P (SSA_NAME_VAR (exp))) + && (optimize || !SSA_NAME_VAR (exp) + || DECL_IGNORED_P (SSA_NAME_VAR (exp))) && stmt_is_replaceable_p (SSA_NAME_DEF_STMT (exp))) g = SSA_NAME_DEF_STMT (exp); if (g) @@ -9446,15 +9447,18 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode, /* Ensure variable marked as used even if it doesn't go through a parser. If it hasn't be used yet, write out an external definition. */ - TREE_USED (exp) = 1; + if (exp) + TREE_USED (exp) = 1; /* Show we haven't gotten RTL for this yet. */ temp = 0; /* Variables inherited from containing functions should have been lowered by this point. */ - context = decl_function_context (exp); - gcc_assert (SCOPE_FILE_SCOPE_P (context) + if (exp) + context = decl_function_context (exp); + gcc_assert (!exp + || SCOPE_FILE_SCOPE_P (context) || context == current_function_decl || TREE_STATIC (exp) || DECL_EXTERNAL (exp) @@ -9478,7 +9482,8 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode, decl_rtl = use_anchored_address (decl_rtl); if (modifier != EXPAND_CONST_ADDRESS && modifier != EXPAND_SUM - && !memory_address_addr_space_p (DECL_MODE (exp), + && !memory_address_addr_space_p (exp ? DECL_MODE (exp) + : GET_MODE (decl_rtl), XEXP (decl_rtl, 0), MEM_ADDR_SPACE (decl_rtl))) temp = replace_equiv_address (decl_rtl, @@ -9489,12 +9494,17 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode, if the address is a register. */ if (temp != 0) { - if (MEM_P (temp) && REG_P (XEXP (temp, 0))) + if (exp && MEM_P (temp) && REG_P (XEXP (temp, 0))) mark_reg_pointer (XEXP (temp, 0), DECL_ALIGN (exp)); return temp; } + if (exp) + dmode = DECL_MODE (exp); + else + dmode = TYPE_MODE (TREE_TYPE (ssa_name)); + /* If the mode of DECL_RTL does not match that of the decl, there are two cases: we are dealing with a BLKmode value that is returned in a register, or we are dealing with @@ -9502,22 +9512,23 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode, of the wanted mode, but mark it so that we know that it was already extended. */ if (REG_P (decl_rtl) - && DECL_MODE (exp) != BLKmode - && GET_MODE (decl_rtl) != DECL_MODE (exp)) + && dmode != BLKmode + && GET_MODE (decl_rtl) != dmode) { machine_mode pmode; /* Get the signedness to be used for this variable. Ensure we get the same mode we got when the variable was declared. 
*/ - if (code == SSA_NAME - && (g = SSA_NAME_DEF_STMT (ssa_name)) - && gimple_code (g) == GIMPLE_CALL - && !gimple_call_internal_p (g)) + if (code != SSA_NAME) + pmode = promote_decl_mode (exp, &unsignedp); + else if ((g = SSA_NAME_DEF_STMT (ssa_name)) + && gimple_code (g) == GIMPLE_CALL + && !gimple_call_internal_p (g)) pmode = promote_function_mode (type, mode, &unsignedp, gimple_call_fntype (g), 2); else - pmode = promote_decl_mode (exp, &unsignedp); + pmode = promote_ssa_mode (ssa_name, &unsignedp); gcc_assert (GET_MODE (decl_rtl) == pmode); temp = gen_lowpart_SUBREG (mode, decl_rtl); diff --git a/gcc/expr.h b/gcc/expr.h index 32d1707..a2c8e1d 100644 --- a/gcc/expr.h +++ b/gcc/expr.h @@ -210,6 +210,7 @@ extern rtx_insn *emit_move_insn_1 (rtx, rtx); extern rtx_insn *emit_move_complex_push (machine_mode, rtx, rtx); extern rtx_insn *emit_move_complex_parts (rtx, rtx); +extern rtx read_complex_part (rtx, bool); extern void write_complex_part (rtx, rtx, bool); extern rtx emit_move_resolve_push (machine_mode, rtx); diff --git a/gcc/function.c b/gcc/function.c index 20bf3b3..715c19f 100644 --- a/gcc/function.c +++ b/gcc/function.c @@ -72,6 +72,9 @@ along with GCC; see the file COPYING3. If not see #include "cfganal.h" #include "cfgbuild.h" #include "cfgcleanup.h" +#include "cfgexpand.h" +#include "basic-block.h" +#include "df.h" #include "params.h" #include "bb-reorder.h" #include "shrink-wrap.h" @@ -148,6 +151,9 @@ static bool contains (const_rtx, hash_table<insn_cache_hasher> *); static void prepare_function_start (void); static void do_clobber_return_reg (rtx, void *); static void do_use_return_reg (rtx, void *); +static rtx rtl_for_parm (struct assign_parm_data_all *, tree); +static void maybe_reset_rtl_for_parm (tree); + /* Stack of nested functions. */ /* Keep track of the cfun stack. */ @@ -2105,6 +2111,30 @@ aggregate_value_p (const_tree exp, const_tree fntype) bool use_register_for_decl (const_tree decl) { + if (TREE_CODE (decl) == SSA_NAME) + { + /* We often try to use the SSA_NAME, instead of its underlying + decl, to get type information and guide decisions, to avoid + differences of behavior between anonymous and named + variables, but in this one case we have to go for the actual + variable if there is one. The main reason is that, at least + at -O0, we want to place user variables on the stack, but we + don't mind using pseudos for anonymous or ignored temps. + Should we take the SSA_NAME, we'd conclude all SSA_NAMEs + should go in pseudos, whereas their corresponding variables + might have to go on the stack. So, disregarding the decl + here would negatively impact debug info at -O0, enable + coalescing between SSA_NAMEs that ought to get different + stack/pseudo assignments, and get the incoming argument + processing thoroughly confused by PARM_DECLs expected to live + in stack slots but assigned to pseudos. */ + if (!SSA_NAME_VAR (decl)) + return TYPE_MODE (TREE_TYPE (decl)) != BLKmode + && !(flag_float_store && FLOAT_TYPE_P (TREE_TYPE (decl))); + + decl = SSA_NAME_VAR (decl); + } + /* Honor volatile. */ if (TREE_SIDE_EFFECTS (decl)) return false; @@ -2240,7 +2270,7 @@ assign_parms_initialize_all (struct assign_parm_data_all *all) needed, else the old list. 
*/ static void -split_complex_args (vec<tree> *args) +split_complex_args (struct assign_parm_data_all *all, vec<tree> *args) { unsigned i; tree p; @@ -2251,6 +2281,7 @@ split_complex_args (vec<tree> *args) if (TREE_CODE (type) == COMPLEX_TYPE && targetm.calls.split_complex_arg (type)) { + tree cparm = p; tree decl; tree subtype = TREE_TYPE (type); bool addressable = TREE_ADDRESSABLE (p); @@ -2269,6 +2300,9 @@ split_complex_args (vec<tree> *args) DECL_ARTIFICIAL (p) = addressable; DECL_IGNORED_P (p) = addressable; TREE_ADDRESSABLE (p) = 0; + /* Reset the RTL before layout_decl, or it may change the + mode of the RTL of the original argument copied to P. */ + SET_DECL_RTL (p, NULL_RTX); layout_decl (p, 0); (*args)[i] = p; @@ -2280,6 +2314,25 @@ split_complex_args (vec<tree> *args) DECL_IGNORED_P (decl) = addressable; layout_decl (decl, 0); args->safe_insert (++i, decl); + + /* If we are expanding a function, rather than gimplifying + it, propagate the RTL of the complex parm to the split + declarations, and set their contexts so that + maybe_reset_rtl_for_parm can recognize them and refrain + from resetting their RTL. */ + if (currently_expanding_to_rtl) + { + maybe_reset_rtl_for_parm (cparm); + rtx rtl = rtl_for_parm (all, cparm); + if (rtl) + { + SET_DECL_RTL (p, read_complex_part (rtl, false)); + SET_DECL_RTL (decl, read_complex_part (rtl, true)); + + DECL_CONTEXT (p) = cparm; + DECL_CONTEXT (decl) = cparm; + } + } } } } @@ -2342,7 +2395,7 @@ assign_parms_augmented_arg_list (struct assign_parm_data_all *all) /* If the target wants to split complex arguments into scalars, do so. */ if (targetm.calls.split_complex_arg) - split_complex_args (&fnargs); + split_complex_args (all, &fnargs); return fnargs; } @@ -2745,23 +2798,98 @@ assign_parm_remove_parallels (struct assign_parm_data_one *data) data->entry_parm = entry_parm; } +/* Wrapper for use_register_for_decl, that special-cases the + .result_ptr as the function's RESULT_DECL when the RESULT_DECL is + passed by reference. */ + +static bool +use_register_for_parm_decl (struct assign_parm_data_all *all, tree parm) +{ + if (parm == all->function_result_decl) + { + tree result = DECL_RESULT (current_function_decl); + + if (DECL_BY_REFERENCE (result)) + parm = result; + } + + return use_register_for_decl (parm); +} + +/* Wrapper for get_rtl_for_parm_ssa_default_def, that special-cases + the .result_ptr as the function's RESULT_DECL when the RESULT_DECL + is passed by reference. */ + +static rtx +rtl_for_parm (struct assign_parm_data_all *all, tree parm) +{ + if (parm == all->function_result_decl) + { + tree result = DECL_RESULT (current_function_decl); + + if (!DECL_BY_REFERENCE (result)) + return NULL_RTX; + + parm = result; + } + + return get_rtl_for_parm_ssa_default_def (parm); +} + +/* Reset the location of PARM_DECLs and RESULT_DECLs that had + SSA_NAMEs in multiple partitions, so that assign_parms will choose + the default def, if it exists, or create new RTL to hold the unused + entry value. If we are coalescing across variables, we want to + reset the location too, because a parm without a default def + (incoming value unused) might be coalesced with one with a default + def, and then assign_parms would copy both incoming values to the + same location, which might cause the wrong value to survive. 
*/ +static void +maybe_reset_rtl_for_parm (tree parm) +{ + gcc_assert (TREE_CODE (parm) == PARM_DECL + || TREE_CODE (parm) == RESULT_DECL); + + /* This is a split complex parameter, and its context was set to its + original PARM_DECL in split_complex_args so that we could + recognize it here and not reset its RTL. */ + if (DECL_CONTEXT (parm) && TREE_CODE (DECL_CONTEXT (parm)) == PARM_DECL) + { + DECL_CONTEXT (parm) = DECL_CONTEXT (DECL_CONTEXT (parm)); + return; + } + + if ((flag_tree_coalesce_vars + || (DECL_RTL_SET_P (parm) && DECL_RTL (parm) == pc_rtx)) + && is_gimple_reg (parm)) + SET_DECL_RTL (parm, NULL_RTX); +} + /* A subroutine of assign_parms. Adjust DATA->STACK_RTL such that it's always valid and properly aligned. */ static void -assign_parm_adjust_stack_rtl (struct assign_parm_data_one *data) +assign_parm_adjust_stack_rtl (struct assign_parm_data_all *all, tree parm, + struct assign_parm_data_one *data) { rtx stack_parm = data->stack_parm; + /* If out-of-SSA assigned RTL to the parm default def, make sure we + don't use what we might have computed before. */ + rtx ssa_assigned = rtl_for_parm (all, parm); + if (ssa_assigned) + stack_parm = NULL; + /* If we can't trust the parm stack slot to be aligned enough for its ultimate type, don't use that slot after entry. We'll make another stack slot, if we need one. */ - if (stack_parm - && ((STRICT_ALIGNMENT - && GET_MODE_ALIGNMENT (data->nominal_mode) > MEM_ALIGN (stack_parm)) - || (data->nominal_type - && TYPE_ALIGN (data->nominal_type) > MEM_ALIGN (stack_parm) - && MEM_ALIGN (stack_parm) < PREFERRED_STACK_BOUNDARY))) + else if (stack_parm + && ((STRICT_ALIGNMENT + && (GET_MODE_ALIGNMENT (data->nominal_mode) + > MEM_ALIGN (stack_parm))) + || (data->nominal_type + && TYPE_ALIGN (data->nominal_type) > MEM_ALIGN (stack_parm) + && MEM_ALIGN (stack_parm) < PREFERRED_STACK_BOUNDARY))) stack_parm = NULL; /* If parm was passed in memory, and we need to convert it on entry, @@ -2823,14 +2951,32 @@ assign_parm_setup_block (struct assign_parm_data_all *all, size = int_size_in_bytes (data->passed_type); size_stored = CEIL_ROUND (size, UNITS_PER_WORD); + if (stack_parm == 0) { DECL_ALIGN (parm) = MAX (DECL_ALIGN (parm), BITS_PER_WORD); - stack_parm = assign_stack_local (BLKmode, size_stored, - DECL_ALIGN (parm)); - if (GET_MODE_SIZE (GET_MODE (entry_parm)) == size) - PUT_MODE (stack_parm, GET_MODE (entry_parm)); - set_mem_attributes (stack_parm, parm, 1); + rtx from_expand = rtl_for_parm (all, parm); + if (from_expand && (!parm_maybe_byref_p (parm) + || XEXP (from_expand, 0) != NULL_RTX)) + stack_parm = copy_rtx (from_expand); + else + { + stack_parm = assign_stack_local (BLKmode, size_stored, + DECL_ALIGN (parm)); + if (GET_MODE_SIZE (GET_MODE (entry_parm)) == size) + PUT_MODE (stack_parm, GET_MODE (entry_parm)); + if (from_expand) + { + gcc_assert (GET_CODE (stack_parm) == MEM); + gcc_assert (GET_CODE (from_expand) == MEM); + gcc_assert (XEXP (from_expand, 0) == NULL_RTX); + XEXP (from_expand, 0) = XEXP (stack_parm, 0); + PUT_MODE (from_expand, GET_MODE (stack_parm)); + stack_parm = copy_rtx (from_expand); + } + else + set_mem_attributes (stack_parm, parm, 1); + } } /* If a BLKmode arrives in registers, copy it to a stack slot. 
Handle @@ -2968,14 +3114,34 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm, = promote_function_mode (data->nominal_type, data->nominal_mode, &unsignedp, TREE_TYPE (current_function_decl), 2); - parmreg = gen_reg_rtx (promoted_nominal_mode); + rtx from_expand = parmreg = rtl_for_parm (all, parm); - if (!DECL_ARTIFICIAL (parm)) - mark_user_reg (parmreg); + if (from_expand && !data->passed_pointer) + { + if (GET_MODE (parmreg) != promoted_nominal_mode) + parmreg = gen_lowpart (promoted_nominal_mode, parmreg); + } + else if (!from_expand || parm_maybe_byref_p (parm)) + { + parmreg = gen_reg_rtx (promoted_nominal_mode); + if (!DECL_ARTIFICIAL (parm)) + mark_user_reg (parmreg); + + if (from_expand) + { + gcc_assert (data->passed_pointer); + gcc_assert (GET_CODE (from_expand) == MEM + && GET_MODE (from_expand) == BLKmode + && XEXP (from_expand, 0) == NULL_RTX); + XEXP (from_expand, 0) = parmreg; + } + } /* If this was an item that we received a pointer to, set DECL_RTL appropriately. */ - if (data->passed_pointer) + if (from_expand) + SET_DECL_RTL (parm, from_expand); + else if (data->passed_pointer) { rtx x = gen_rtx_MEM (TYPE_MODE (TREE_TYPE (data->passed_type)), parmreg); set_mem_attributes (x, parm, 1); @@ -2990,10 +3156,13 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm, assign_parm_find_data_types and expand_expr_real_1. */ equiv_stack_parm = data->stack_parm; + if (!equiv_stack_parm) + equiv_stack_parm = data->entry_parm; validated_mem = validize_mem (copy_rtx (data->entry_parm)); need_conversion = (data->nominal_mode != data->passed_mode || promoted_nominal_mode != data->promoted_mode); + gcc_assert (!(need_conversion && data->passed_pointer && from_expand)); moved = false; if (need_conversion @@ -3125,16 +3294,28 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm, did_conversion = true; } - else + /* We don't want to copy the incoming pointer to a parmreg expected + to hold the value rather than the pointer. */ + else if (!data->passed_pointer || parmreg != from_expand) emit_move_insn (parmreg, validated_mem); /* If we were passed a pointer but the actual value can safely live in a register, retrieve it and use it directly. */ - if (data->passed_pointer && TYPE_MODE (TREE_TYPE (parm)) != BLKmode) + if (data->passed_pointer + && (from_expand || TYPE_MODE (TREE_TYPE (parm)) != BLKmode)) { + rtx src = DECL_RTL (parm); + /* We can't use nominal_mode, because it will have been set to Pmode above. We must use the actual mode of the parm. 
*/ - if (use_register_for_decl (parm)) + if (from_expand) + { + parmreg = from_expand; + gcc_assert (GET_MODE (parmreg) == TYPE_MODE (TREE_TYPE (parm))); + src = gen_rtx_MEM (GET_MODE (parmreg), validated_mem); + set_mem_attributes (src, parm, 1); + } + else if (use_register_for_decl (parm)) { parmreg = gen_reg_rtx (TYPE_MODE (TREE_TYPE (parm))); mark_user_reg (parmreg); @@ -3151,14 +3332,14 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm, set_mem_attributes (parmreg, parm, 1); } - if (GET_MODE (parmreg) != GET_MODE (DECL_RTL (parm))) + if (GET_MODE (parmreg) != GET_MODE (src)) { - rtx tempreg = gen_reg_rtx (GET_MODE (DECL_RTL (parm))); + rtx tempreg = gen_reg_rtx (GET_MODE (src)); int unsigned_p = TYPE_UNSIGNED (TREE_TYPE (parm)); push_to_sequence2 (all->first_conversion_insn, all->last_conversion_insn); - emit_move_insn (tempreg, DECL_RTL (parm)); + emit_move_insn (tempreg, src); tempreg = convert_to_mode (GET_MODE (parmreg), tempreg, unsigned_p); emit_move_insn (parmreg, tempreg); all->first_conversion_insn = get_insns (); @@ -3167,14 +3348,16 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm, did_conversion = true; } + else if (GET_MODE (parmreg) == BLKmode) + gcc_assert (parm_maybe_byref_p (parm)); else - emit_move_insn (parmreg, DECL_RTL (parm)); + emit_move_insn (parmreg, src); SET_DECL_RTL (parm, parmreg); /* STACK_PARM is the pointer, not the parm, and PARMREG is now the parm. */ - data->stack_parm = NULL; + data->stack_parm = equiv_stack_parm = NULL; } /* Mark the register as eliminable if we did no conversion and it was @@ -3184,11 +3367,11 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm, make here would screw up life analysis for it. */ if (data->nominal_mode == data->passed_mode && !did_conversion - && data->stack_parm != 0 - && MEM_P (data->stack_parm) + && equiv_stack_parm != 0 + && MEM_P (equiv_stack_parm) && data->locate.offset.var == 0 && reg_mentioned_p (virtual_incoming_args_rtx, - XEXP (data->stack_parm, 0))) + XEXP (equiv_stack_parm, 0))) { rtx_insn *linsn = get_last_insn (); rtx_insn *sinsn; @@ -3201,8 +3384,8 @@ assign_parm_setup_reg (struct assign_parm_data_all *all, tree parm, = GET_MODE_INNER (GET_MODE (parmreg)); int regnor = REGNO (XEXP (parmreg, 0)); int regnoi = REGNO (XEXP (parmreg, 1)); - rtx stackr = adjust_address_nv (data->stack_parm, submode, 0); - rtx stacki = adjust_address_nv (data->stack_parm, submode, + rtx stackr = adjust_address_nv (equiv_stack_parm, submode, 0); + rtx stacki = adjust_address_nv (equiv_stack_parm, submode, GET_MODE_SIZE (submode)); /* Scan backwards for the set of the real and @@ -3275,6 +3458,13 @@ assign_parm_setup_stack (struct assign_parm_data_all *all, tree parm, if (data->stack_parm == 0) { + rtx x = data->stack_parm = rtl_for_parm (all, parm); + if (x) + gcc_assert (GET_MODE (x) == GET_MODE (data->entry_parm)); + } + + if (data->stack_parm == 0) + { int align = STACK_SLOT_ALIGNMENT (data->passed_type, GET_MODE (data->entry_parm), TYPE_ALIGN (data->passed_type)); @@ -3337,11 +3527,21 @@ assign_parms_unsplit_complex (struct assign_parm_data_all *all, imag = DECL_RTL (fnargs[i + 1]); if (inner != GET_MODE (real)) { - real = gen_lowpart_SUBREG (inner, real); - imag = gen_lowpart_SUBREG (inner, imag); + real = simplify_gen_subreg (inner, real, GET_MODE (real), + subreg_lowpart_offset + (inner, GET_MODE (real))); + imag = simplify_gen_subreg (inner, imag, GET_MODE (imag), + subreg_lowpart_offset + (inner, GET_MODE (imag))); } - if (TREE_ADDRESSABLE (parm)) + if 
((tmp = rtl_for_parm (all, parm)) != NULL_RTX + && rtx_equal_p (real, + read_complex_part (tmp, false)) + && rtx_equal_p (imag, + read_complex_part (tmp, true))) + ; /* We now have the right rtl in tmp. */ + else if (TREE_ADDRESSABLE (parm)) { rtx rmem, imem; HOST_WIDE_INT size = int_size_in_bytes (TREE_TYPE (parm)); @@ -3487,7 +3687,7 @@ assign_bounds (vec<bounds_parm_data> &bndargs, assign_parm_setup_block (&all, pbdata->bounds_parm, &pbdata->parm_data); else if (pbdata->parm_data.passed_pointer - || use_register_for_decl (pbdata->bounds_parm)) + || use_register_for_parm_decl (&all, pbdata->bounds_parm)) assign_parm_setup_reg (&all, pbdata->bounds_parm, &pbdata->parm_data); else @@ -3531,6 +3731,8 @@ assign_parms (tree fndecl) DECL_INCOMING_RTL (parm) = DECL_RTL (parm); continue; } + else + maybe_reset_rtl_for_parm (parm); /* Estimate stack alignment from parameter alignment. */ if (SUPPORTS_STACK_ALIGNMENT) @@ -3580,7 +3782,9 @@ assign_parms (tree fndecl) else set_decl_incoming_rtl (parm, data.entry_parm, false); - /* Boudns should be loaded in the particular order to + assign_parm_adjust_stack_rtl (&all, parm, &data); + + /* Bounds should be loaded in the particular order to have registers allocated correctly. Collect info about input bounds and load them later. */ if (POINTER_BOUNDS_TYPE_P (data.passed_type)) @@ -3597,11 +3801,10 @@ assign_parms (tree fndecl) } else { - assign_parm_adjust_stack_rtl (&data); - if (assign_parm_setup_block_p (&data)) assign_parm_setup_block (&all, parm, &data); - else if (data.passed_pointer || use_register_for_decl (parm)) + else if (data.passed_pointer + || use_register_for_parm_decl (&all, parm)) assign_parm_setup_reg (&all, parm, &data); else assign_parm_setup_stack (&all, parm, &data); @@ -4932,7 +5135,9 @@ expand_function_start (tree subr) before any library calls that assign parms might generate. */ /* Decide whether to return the value in memory or in a register. */ - if (aggregate_value_p (DECL_RESULT (subr), subr)) + tree res = DECL_RESULT (subr); + maybe_reset_rtl_for_parm (res); + if (aggregate_value_p (res, subr)) { /* Returning something that won't go in a register. */ rtx value_address = 0; @@ -4940,7 +5145,7 @@ expand_function_start (tree subr) #ifdef PCC_STATIC_STRUCT_RETURN if (cfun->returns_pcc_struct) { - int size = int_size_in_bytes (TREE_TYPE (DECL_RESULT (subr))); + int size = int_size_in_bytes (TREE_TYPE (res)); value_address = assemble_static_space (size); } else @@ -4952,36 +5157,45 @@ expand_function_start (tree subr) it. */ if (sv) { - value_address = gen_reg_rtx (Pmode); + if (DECL_BY_REFERENCE (res)) + value_address = get_rtl_for_parm_ssa_default_def (res); + if (!value_address) + value_address = gen_reg_rtx (Pmode); emit_move_insn (value_address, sv); } } if (value_address) { rtx x = value_address; - if (!DECL_BY_REFERENCE (DECL_RESULT (subr))) + if (!DECL_BY_REFERENCE (res)) { - x = gen_rtx_MEM (DECL_MODE (DECL_RESULT (subr)), x); - set_mem_attributes (x, DECL_RESULT (subr), 1); + x = get_rtl_for_parm_ssa_default_def (res); + if (!x) + { + x = gen_rtx_MEM (DECL_MODE (res), value_address); + set_mem_attributes (x, res, 1); + } } - SET_DECL_RTL (DECL_RESULT (subr), x); + SET_DECL_RTL (res, x); } } - else if (DECL_MODE (DECL_RESULT (subr)) == VOIDmode) + else if (DECL_MODE (res) == VOIDmode) /* If return mode is void, this decl rtl should not be used. 
*/ - SET_DECL_RTL (DECL_RESULT (subr), NULL_RTX); + SET_DECL_RTL (res, NULL_RTX); else { /* Compute the return values into a pseudo reg, which we will copy into the true return register after the cleanups are done. */ - tree return_type = TREE_TYPE (DECL_RESULT (subr)); - if (TYPE_MODE (return_type) != BLKmode - && targetm.calls.return_in_msb (return_type)) + tree return_type = TREE_TYPE (res); + rtx x = get_rtl_for_parm_ssa_default_def (res); + if (x) + /* Use it. */; + else if (TYPE_MODE (return_type) != BLKmode + && targetm.calls.return_in_msb (return_type)) /* expand_function_end will insert the appropriate padding in this case. Use the return value's natural (unpadded) mode within the function proper. */ - SET_DECL_RTL (DECL_RESULT (subr), - gen_reg_rtx (TYPE_MODE (return_type))); + x = gen_reg_rtx (TYPE_MODE (return_type)); else { /* In order to figure out what mode to use for the pseudo, we @@ -4992,25 +5206,26 @@ expand_function_start (tree subr) /* Structures that are returned in registers are not aggregate_value_p, so we may see a PARALLEL or a REG. */ if (REG_P (hard_reg)) - SET_DECL_RTL (DECL_RESULT (subr), - gen_reg_rtx (GET_MODE (hard_reg))); + x = gen_reg_rtx (GET_MODE (hard_reg)); else { gcc_assert (GET_CODE (hard_reg) == PARALLEL); - SET_DECL_RTL (DECL_RESULT (subr), gen_group_rtx (hard_reg)); + x = gen_group_rtx (hard_reg); } } + SET_DECL_RTL (res, x); + /* Set DECL_REGISTER flag so that expand_function_end will copy the result to the real return register(s). */ - DECL_REGISTER (DECL_RESULT (subr)) = 1; + DECL_REGISTER (res) = 1; if (chkp_function_instrumented_p (current_function_decl)) { - tree return_type = TREE_TYPE (DECL_RESULT (subr)); + tree return_type = TREE_TYPE (res); rtx bounds = targetm.calls.chkp_function_value_bounds (return_type, subr, 1); - SET_DECL_BOUNDS_RTL (DECL_RESULT (subr), bounds); + SET_DECL_BOUNDS_RTL (res, bounds); } } @@ -5025,13 +5240,19 @@ expand_function_start (tree subr) rtx local, chain; rtx_insn *insn; - local = gen_reg_rtx (Pmode); + local = get_rtl_for_parm_ssa_default_def (parm); + if (!local) + local = gen_reg_rtx (Pmode); chain = targetm.calls.static_chain (current_function_decl, true); set_decl_incoming_rtl (parm, chain, false); SET_DECL_RTL (parm, local); mark_reg_pointer (local, TYPE_ALIGN (TREE_TYPE (TREE_TYPE (parm)))); + if (GET_MODE (local) != Pmode) + local = convert_to_mode (Pmode, local, + TYPE_UNSIGNED (TREE_TYPE (parm))); + insn = emit_move_insn (local, chain); /* Mark the register as eliminable, similar to parameters. */ diff --git a/gcc/gimple-expr.c b/gcc/gimple-expr.c index b558d90..baed630 100644 --- a/gcc/gimple-expr.c +++ b/gcc/gimple-expr.c @@ -375,45 +375,6 @@ copy_var_decl (tree var, tree name, tree type) return copy; } -/* Given SSA_NAMEs NAME1 and NAME2, return true if they are candidates for - coalescing together, false otherwise. - - This must stay consistent with var_map_base_init in tree-ssa-live.c. */ - -bool -gimple_can_coalesce_p (tree name1, tree name2) -{ - /* First check the SSA_NAME's associated DECL. We only want to - coalesce if they have the same DECL or both have no associated DECL. */ - tree var1 = SSA_NAME_VAR (name1); - tree var2 = SSA_NAME_VAR (name2); - var1 = (var1 && (!VAR_P (var1) || !DECL_IGNORED_P (var1))) ? var1 : NULL_TREE; - var2 = (var2 && (!VAR_P (var2) || !DECL_IGNORED_P (var2))) ? var2 : NULL_TREE; - if (var1 != var2) - return false; - - /* Now check the types. If the types are the same, then we should - try to coalesce V1 and V2. 
*/ - tree t1 = TREE_TYPE (name1); - tree t2 = TREE_TYPE (name2); - if (t1 == t2) - return true; - - /* If the types are not the same, check for a canonical type match. This - (for example) allows coalescing when the types are fundamentally the - same, but just have different names. - - Note pointer types with different address spaces may have the same - canonical type. Those are rejected for coalescing by the - types_compatible_p check. */ - if (TYPE_CANONICAL (t1) - && TYPE_CANONICAL (t1) == TYPE_CANONICAL (t2) - && types_compatible_p (t1, t2)) - return true; - - return false; -} - /* Strip off a legitimate source ending from the input string NAME of length LEN. Rather than having to know the names used by all of our front ends, we strip off an ending of a period followed by diff --git a/gcc/gimple-expr.h b/gcc/gimple-expr.h index ed23eb2..3d1c89f 100644 --- a/gcc/gimple-expr.h +++ b/gcc/gimple-expr.h @@ -28,7 +28,6 @@ extern gimple_seq gimple_body (tree); extern bool gimple_has_body_p (tree); extern const char *gimple_decl_printable_name (tree, int); extern tree copy_var_decl (tree, tree, tree); -extern bool gimple_can_coalesce_p (tree, tree); extern tree create_tmp_var_name (const char *); extern tree create_tmp_var_raw (tree, const char * = NULL); extern tree create_tmp_var (tree, const char * = NULL); diff --git a/gcc/opts.c b/gcc/opts.c index 9d5de96..32de605 100644 --- a/gcc/opts.c +++ b/gcc/opts.c @@ -445,12 +445,12 @@ static const struct default_options default_options_table[] = { OPT_LEVELS_1_PLUS, OPT_fsplit_wide_types, NULL, 1 }, { OPT_LEVELS_1_PLUS, OPT_ftree_ccp, NULL, 1 }, { OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_ftree_bit_ccp, NULL, 1 }, + { OPT_LEVELS_1_PLUS, OPT_ftree_coalesce_vars, NULL, 1 }, { OPT_LEVELS_1_PLUS, OPT_ftree_dce, NULL, 1 }, { OPT_LEVELS_1_PLUS, OPT_ftree_dominator_opts, NULL, 1 }, { OPT_LEVELS_1_PLUS, OPT_ftree_dse, NULL, 1 }, { OPT_LEVELS_1_PLUS, OPT_ftree_ter, NULL, 1 }, { OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_ftree_sra, NULL, 1 }, - { OPT_LEVELS_1_PLUS, OPT_ftree_copyrename, NULL, 1 }, { OPT_LEVELS_1_PLUS, OPT_ftree_fre, NULL, 1 }, { OPT_LEVELS_1_PLUS, OPT_ftree_copy_prop, NULL, 1 }, { OPT_LEVELS_1_PLUS, OPT_ftree_sink, NULL, 1 }, diff --git a/gcc/passes.def b/gcc/passes.def index 6b66f8f..64fc4d9 100644 --- a/gcc/passes.def +++ b/gcc/passes.def @@ -77,7 +77,6 @@ along with GCC; see the file COPYING3. If not see NEXT_PASS (pass_all_early_optimizations); PUSH_INSERT_PASSES_WITHIN (pass_all_early_optimizations) NEXT_PASS (pass_remove_cgraph_callee_edges); - NEXT_PASS (pass_rename_ssa_copies); NEXT_PASS (pass_object_sizes); NEXT_PASS (pass_ccp); /* After CCP we rewrite no longer addressed locals into SSA @@ -155,7 +154,6 @@ along with GCC; see the file COPYING3. If not see /* Initial scalar cleanups before alias computation. They ensure memory accesses are not indirect wherever possible. */ NEXT_PASS (pass_strip_predict_hints); - NEXT_PASS (pass_rename_ssa_copies); NEXT_PASS (pass_ccp); /* After CCP we rewrite no longer addressed locals into SSA form if possible. */ @@ -183,7 +181,6 @@ along with GCC; see the file COPYING3. If not see NEXT_PASS (pass_ch); NEXT_PASS (pass_lower_complex); NEXT_PASS (pass_sra); - NEXT_PASS (pass_rename_ssa_copies); /* The dom pass will also resolve all __builtin_constant_p calls that are still there to 0. This has to be done after some propagations have already run, but before some more dead code @@ -293,7 +290,6 @@ along with GCC; see the file COPYING3. 
If not see NEXT_PASS (pass_fold_builtins); NEXT_PASS (pass_optimize_widening_mul); NEXT_PASS (pass_tail_calls); - NEXT_PASS (pass_rename_ssa_copies); /* FIXME: If DCE is not run before checking for uninitialized uses, we may get false warnings (e.g., testsuite/gcc.dg/uninit-5.c). However, this also causes us to misdiagnose cases that should be @@ -328,7 +324,6 @@ along with GCC; see the file COPYING3. If not see NEXT_PASS (pass_dce); NEXT_PASS (pass_asan); NEXT_PASS (pass_tsan); - NEXT_PASS (pass_rename_ssa_copies); /* ??? We do want some kind of loop invariant motion, but we possibly need to adjust LIM to be more friendly towards preserving accurate debug information here. */ diff --git a/gcc/stmt.c b/gcc/stmt.c index 391686c..e7f7dd4 100644 --- a/gcc/stmt.c +++ b/gcc/stmt.c @@ -891,7 +891,7 @@ emit_case_decision_tree (tree index_expr, tree index_type, { index = copy_to_reg (index); if (TREE_CODE (index_expr) == SSA_NAME) - set_reg_attrs_for_decl_rtl (SSA_NAME_VAR (index_expr), index); + set_reg_attrs_for_decl_rtl (index_expr, index); } balance_case_nodes (&case_list, NULL); diff --git a/gcc/stor-layout.c b/gcc/stor-layout.c index 9757777..938e54b 100644 --- a/gcc/stor-layout.c +++ b/gcc/stor-layout.c @@ -782,7 +782,8 @@ layout_decl (tree decl, unsigned int known_align) { PUT_MODE (rtl, DECL_MODE (decl)); SET_DECL_RTL (decl, 0); - set_mem_attributes (rtl, decl, 1); + if (MEM_P (rtl)) + set_mem_attributes (rtl, decl, 1); SET_DECL_RTL (decl, rtl); } } diff --git a/gcc/testsuite/gcc.dg/guality/pr54200.c b/gcc/testsuite/gcc.dg/guality/pr54200.c index 9b17187..e1e7293 100644 --- a/gcc/testsuite/gcc.dg/guality/pr54200.c +++ b/gcc/testsuite/gcc.dg/guality/pr54200.c @@ -1,6 +1,6 @@ /* PR tree-optimization/54200 */ /* { dg-do run } */ -/* { dg-options "-g -fno-var-tracking-assignments" } */ +/* { dg-options "-g -fno-var-tracking-assignments -fno-tree-coalesce-vars" } */ int o __attribute__((used)); diff --git a/gcc/testsuite/gcc.dg/ssp-1.c b/gcc/testsuite/gcc.dg/ssp-1.c index 5467f4d..db69332 100644 --- a/gcc/testsuite/gcc.dg/ssp-1.c +++ b/gcc/testsuite/gcc.dg/ssp-1.c @@ -12,7 +12,7 @@ __stack_chk_fail (void) int main () { - int i; + register int i; char foo[255]; // smash stack diff --git a/gcc/testsuite/gcc.dg/ssp-2.c b/gcc/testsuite/gcc.dg/ssp-2.c index 9a7ac32..752fe53 100644 --- a/gcc/testsuite/gcc.dg/ssp-2.c +++ b/gcc/testsuite/gcc.dg/ssp-2.c @@ -14,7 +14,7 @@ __stack_chk_fail (void) void overflow() { - int i = 0; + register int i = 0; char foo[30]; /* Overflow buffer. */ diff --git a/gcc/testsuite/gcc.dg/torture/parm-coalesce.c b/gcc/testsuite/gcc.dg/torture/parm-coalesce.c new file mode 100644 index 0000000..dbd81c1 --- /dev/null +++ b/gcc/testsuite/gcc.dg/torture/parm-coalesce.c @@ -0,0 +1,40 @@ +/* { dg-do run } */ + +#include <stdlib.h> + +/* Make sure we don't coalesce both incoming parms, one whose incoming + value is unused, to the same location, so as to overwrite one of + them with the incoming value of the other. */ + +int __attribute__((noinline, noclone)) +foo (int i, int j) +{ + j = i; /* The incoming value for J is unused. */ + i = 2; + if (j) + j++; + j += i + 1; + return j; +} + +/* Same as foo, but with swapped parameters. */ +int __attribute__((noinline, noclone)) +bar (int j, int i) +{ + j = i; /* The incoming value for J is unused. 
*/ + i = 2; + if (j) + j++; + j += i + 1; + return j; +} + +int +main (void) +{ + if (foo (0, 1) != 3) + abort (); + if (bar (1, 0) != 3) + abort (); + return 0; +} diff --git a/gcc/tree-outof-ssa.c b/gcc/tree-outof-ssa.c index 7b747ab9..978476c 100644 --- a/gcc/tree-outof-ssa.c +++ b/gcc/tree-outof-ssa.c @@ -279,7 +279,6 @@ insert_value_copy_on_edge (edge e, int dest, tree src, source_location locus) rtx dest_rtx, seq, x; machine_mode dest_mode, src_mode; int unsignedp; - tree var; if (dump_file && (dump_flags & TDF_DETAILS)) { @@ -301,12 +300,12 @@ insert_value_copy_on_edge (edge e, int dest, tree src, source_location locus) start_sequence (); - var = SSA_NAME_VAR (partition_to_var (SA.map, dest)); + tree name = partition_to_var (SA.map, dest); src_mode = TYPE_MODE (TREE_TYPE (src)); dest_mode = GET_MODE (dest_rtx); - gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (var))); + gcc_assert (src_mode == TYPE_MODE (TREE_TYPE (name))); gcc_assert (!REG_P (dest_rtx) - || dest_mode == promote_decl_mode (var, &unsignedp)); + || dest_mode == promote_ssa_mode (name, &unsignedp)); if (src_mode != dest_mode) { @@ -682,13 +681,12 @@ elim_backward (elim_graph g, int T) static rtx get_temp_reg (tree name) { - tree var = TREE_CODE (name) == SSA_NAME ? SSA_NAME_VAR (name) : name; - tree type = TREE_TYPE (var); + tree type = TREE_TYPE (name); int unsignedp; - machine_mode reg_mode = promote_decl_mode (var, &unsignedp); + machine_mode reg_mode = promote_ssa_mode (name, &unsignedp); rtx x = gen_reg_rtx (reg_mode); if (POINTER_TYPE_P (type)) - mark_reg_pointer (x, TYPE_ALIGN (TREE_TYPE (TREE_TYPE (var)))); + mark_reg_pointer (x, TYPE_ALIGN (TREE_TYPE (type))); return x; } @@ -988,7 +986,7 @@ remove_ssa_form (bool perform_ter, struct ssaexpand *sa) /* Return to viewing the variable list as just all reference variables after coalescing has been performed. */ - partition_view_normal (map, false); + partition_view_normal (map); if (dump_file && (dump_flags & TDF_DETAILS)) { diff --git a/gcc/tree-ssa-coalesce.c b/gcc/tree-ssa-coalesce.c index bf8983f..08ce72c 100644 --- a/gcc/tree-ssa-coalesce.c +++ b/gcc/tree-ssa-coalesce.c @@ -36,6 +36,8 @@ along with GCC; see the file COPYING3. If not see #include "gimple-iterator.h" #include "tree-ssa-live.h" #include "tree-ssa-coalesce.h" +#include "cfgexpand.h" +#include "explow.h" #include "diagnostic-core.h" @@ -806,6 +808,16 @@ build_ssa_conflict_graph (tree_live_info_p liveinfo) basic_block bb; ssa_op_iter iter; live_track_p live; + basic_block entry; + + /* If inter-variable coalescing is enabled, we may attempt to + coalesce variables from different base variables, including + different parameters, so we have to make sure default defs live + at the entry block conflict with each other. */ + if (flag_tree_coalesce_vars) + entry = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun)); + else + entry = NULL; map = live_var_map (liveinfo); graph = ssa_conflicts_new (num_var_partitions (map)); @@ -864,6 +876,30 @@ build_ssa_conflict_graph (tree_live_info_p liveinfo) live_track_process_def (live, result, graph); } + /* Pretend there are defs for params' default defs at the start + of the (post-)entry block. 
*/ + if (bb == entry) + { + unsigned base; + bitmap_iterator bi; + EXECUTE_IF_SET_IN_BITMAP (live->live_base_var, 0, base, bi) + { + bitmap_iterator bi2; + unsigned part; + EXECUTE_IF_SET_IN_BITMAP (live->live_base_partitions[base], + 0, part, bi2) + { + tree var = partition_to_var (map, part); + if (!SSA_NAME_VAR (var) + || (TREE_CODE (SSA_NAME_VAR (var)) != PARM_DECL + && TREE_CODE (SSA_NAME_VAR (var)) != RESULT_DECL) + || !SSA_NAME_IS_DEFAULT_DEF (var)) + continue; + live_track_process_def (live, var, graph); + } + } + } + live_track_clear_base_vars (live); } @@ -1132,6 +1168,7 @@ attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y, { var1 = partition_to_var (map, p1); var2 = partition_to_var (map, p2); + z = var_union (map, var1, var2); if (z == NO_PARTITION) { @@ -1149,6 +1186,7 @@ attempt_coalesce (var_map map, ssa_conflicts_p graph, int x, int y, if (debug) fprintf (debug, ": Success -> %d\n", z); + return true; } @@ -1244,6 +1282,333 @@ ssa_name_var_hash::equal (const tree_node *n1, const tree_node *n2) } +/* Output partition map MAP with coalescing plan PART to file F. */ + +void +dump_part_var_map (FILE *f, partition part, var_map map) +{ + int t; + unsigned x, y; + int p; + + fprintf (f, "\nCoalescible Partition map \n\n"); + + for (x = 0; x < map->num_partitions; x++) + { + if (map->view_to_partition != NULL) + p = map->view_to_partition[x]; + else + p = x; + + if (ssa_name (p) == NULL_TREE + || virtual_operand_p (ssa_name (p))) + continue; + + t = 0; + for (y = 1; y < num_ssa_names; y++) + { + tree var = version_to_var (map, y); + if (!var) + continue; + int q = var_to_partition (map, var); + p = partition_find (part, q); + gcc_assert (map->partition_to_base_index[q] + == map->partition_to_base_index[p]); + + if (p == (int)x) + { + if (t++ == 0) + { + fprintf (f, "Partition %d, base %d (", x, + map->partition_to_base_index[q]); + print_generic_expr (f, partition_to_var (map, q), TDF_SLIM); + fprintf (f, " - "); + } + fprintf (f, "%d ", y); + } + } + if (t != 0) + fprintf (f, ")\n"); + } + fprintf (f, "\n"); +} + +/* Given SSA_NAMEs NAME1 and NAME2, return true if they are candidates for + coalescing together, false otherwise. + + This must stay consistent with var_map_base_init in tree-ssa-live.c. */ + +bool +gimple_can_coalesce_p (tree name1, tree name2) +{ + /* First check the SSA_NAME's associated DECL. Without + optimization, we only want to coalesce if they have the same DECL + or both have no associated DECL. */ + tree var1 = SSA_NAME_VAR (name1); + tree var2 = SSA_NAME_VAR (name2); + var1 = (var1 && (!VAR_P (var1) || !DECL_IGNORED_P (var1))) ? var1 : NULL_TREE; + var2 = (var2 && (!VAR_P (var2) || !DECL_IGNORED_P (var2))) ? var2 : NULL_TREE; + if (var1 != var2 && !flag_tree_coalesce_vars) + return false; + + /* Now check the types. If the types are the same, then we should + try to coalesce V1 and V2. */ + tree t1 = TREE_TYPE (name1); + tree t2 = TREE_TYPE (name2); + if (t1 == t2) + { + check_modes: + /* If the base variables are the same, we're good: none of the + other tests below could possibly fail. */ + var1 = SSA_NAME_VAR (name1); + var2 = SSA_NAME_VAR (name2); + if (var1 == var2) + return true; + + /* We don't want to coalesce two SSA names if one of the base + variables is supposed to be a register while the other is + supposed to be on the stack. 
Anonymous SSA names take + registers, but when not optimizing, user variables should go + on the stack, so coalescing them with the anonymous variable + as the partition leader would end up assigning the user + variable to a register. Don't do that! */ + bool reg1 = !var1 || use_register_for_decl (var1); + bool reg2 = !var2 || use_register_for_decl (var2); + if (reg1 != reg2) + return false; + + /* Check that the promoted modes are the same. We don't want to + coalesce if the promoted modes would be different. Only + PARM_DECLs and RESULT_DECLs have different promotion rules, + so skip the test if both are variables, or both are anonymous + SSA_NAMEs. Now, if a parm or result has BLKmode, do not + coalesce its SSA versions with those of any other variables, + because it may be passed by reference. */ + return ((!var1 || VAR_P (var1)) && (!var2 || VAR_P (var2))) + || (/* The case var1 == var2 is already covered above. */ + !parm_maybe_byref_p (var1) + && !parm_maybe_byref_p (var2) + && promote_ssa_mode (name1, NULL) == promote_ssa_mode (name2, NULL)); + } + + /* If the types are not the same, check for a canonical type match. This + (for example) allows coalescing when the types are fundamentally the + same, but just have different names. + + Note pointer types with different address spaces may have the same + canonical type. Those are rejected for coalescing by the + types_compatible_p check. */ + if (TYPE_CANONICAL (t1) + && TYPE_CANONICAL (t1) == TYPE_CANONICAL (t2) + && types_compatible_p (t1, t2)) + goto check_modes; + + return false; +} + +/* Fill in MAP's partition_to_base_index, with one index for each + partition of SSA names USED_IN_COPIES and related by CL coalesce + possibilities. This must match gimple_can_coalesce_p in the + optimized case. */ + +static void +compute_optimized_partition_bases (var_map map, bitmap used_in_copies, + coalesce_list_p cl) +{ + int parts = num_var_partitions (map); + partition tentative = partition_new (parts); + + /* Partition the SSA versions so that, for each coalescible + pair, both of its members are in the same partition in + TENTATIVE. */ + gcc_assert (!cl->sorted); + coalesce_pair_p node; + coalesce_iterator_type ppi; + FOR_EACH_PARTITION_PAIR (node, ppi, cl) + { + tree v1 = ssa_name (node->first_element); + int p1 = partition_find (tentative, var_to_partition (map, v1)); + tree v2 = ssa_name (node->second_element); + int p2 = partition_find (tentative, var_to_partition (map, v2)); + + if (p1 == p2) + continue; + + partition_union (tentative, p1, p2); + } + + /* We have to deal with cost one pairs too. */ + for (cost_one_pair_d *co = cl->cost_one_list; co; co = co->next) + { + tree v1 = ssa_name (co->first_element); + int p1 = partition_find (tentative, var_to_partition (map, v1)); + tree v2 = ssa_name (co->second_element); + int p2 = partition_find (tentative, var_to_partition (map, v2)); + + if (p1 == p2) + continue; + + partition_union (tentative, p1, p2); + } + + /* And also with abnormal edges. 
*/ + basic_block bb; + edge e; + edge_iterator ei; + FOR_EACH_BB_FN (bb, cfun) + { + FOR_EACH_EDGE (e, ei, bb->preds) + if (e->flags & EDGE_ABNORMAL) + { + gphi_iterator gsi; + for (gsi = gsi_start_phis (bb); !gsi_end_p (gsi); + gsi_next (&gsi)) + { + gphi *phi = gsi.phi (); + tree arg = PHI_ARG_DEF (phi, e->dest_idx); + if (SSA_NAME_IS_DEFAULT_DEF (arg) + && (!SSA_NAME_VAR (arg) + || TREE_CODE (SSA_NAME_VAR (arg)) != PARM_DECL)) + continue; + + tree res = PHI_RESULT (phi); + + int p1 = partition_find (tentative, var_to_partition (map, res)); + int p2 = partition_find (tentative, var_to_partition (map, arg)); + + if (p1 == p2) + continue; + + partition_union (tentative, p1, p2); + } + } + } + + map->partition_to_base_index = XCNEWVEC (int, parts); + auto_vec<unsigned int> index_map (parts); + if (parts) + index_map.quick_grow (parts); + + const unsigned no_part = -1; + unsigned count = parts; + while (count) + index_map[--count] = no_part; + + /* Initialize MAP's mapping from partition to base index, using + as base indices an enumeration of the TENTATIVE partitions in + which each SSA version ended up, so that we compute conflicts + between all SSA versions that ended up in the same potential + coalesce partition. */ + bitmap_iterator bi; + unsigned i; + EXECUTE_IF_SET_IN_BITMAP (used_in_copies, 0, i, bi) + { + int pidx = var_to_partition (map, ssa_name (i)); + int base = partition_find (tentative, pidx); + if (index_map[base] != no_part) + continue; + index_map[base] = count++; + } + + map->num_basevars = count; + + EXECUTE_IF_SET_IN_BITMAP (used_in_copies, 0, i, bi) + { + int pidx = var_to_partition (map, ssa_name (i)); + int base = partition_find (tentative, pidx); + gcc_assert (index_map[base] < count); + map->partition_to_base_index[pidx] = index_map[base]; + } + + if (dump_file && (dump_flags & TDF_DETAILS)) + dump_part_var_map (dump_file, tentative, map); + + partition_delete (tentative); +} + +/* Hashtable helpers. */ + +struct tree_int_map_hasher : nofree_ptr_hash <tree_int_map> +{ + static inline hashval_t hash (const tree_int_map *); + static inline bool equal (const tree_int_map *, const tree_int_map *); +}; + +inline hashval_t +tree_int_map_hasher::hash (const tree_int_map *v) +{ + return tree_map_base_hash (v); +} + +inline bool +tree_int_map_hasher::equal (const tree_int_map *v, const tree_int_map *c) +{ + return tree_int_map_eq (v, c); +} + +/* This routine will initialize the basevar fields of MAP with base + names. Partitions will share the same base if they have the same + SSA_NAME_VAR, or, being anonymous variables, the same type. This + must match gimple_can_coalesce_p in the non-optimized case. */ + +static void +compute_samebase_partition_bases (var_map map) +{ + int x, num_part; + tree var; + struct tree_int_map *m, *mapstorage; + + num_part = num_var_partitions (map); + hash_table<tree_int_map_hasher> tree_to_index (num_part); + /* We can have at most num_part entries in the hash tables, so it's + enough to allocate so many map elements once, saving some malloc + calls. */ + mapstorage = m = XNEWVEC (struct tree_int_map, num_part); + + /* If a base table already exists, clear it, otherwise create it. */ + free (map->partition_to_base_index); + map->partition_to_base_index = (int *) xmalloc (sizeof (int) * num_part); + + /* Build the base variable list, and point partitions at their bases. 
*/ + for (x = 0; x < num_part; x++) + { + struct tree_int_map **slot; + unsigned baseindex; + var = partition_to_var (map, x); + if (SSA_NAME_VAR (var) + && (!VAR_P (SSA_NAME_VAR (var)) + || !DECL_IGNORED_P (SSA_NAME_VAR (var)))) + m->base.from = SSA_NAME_VAR (var); + else + /* This restricts what anonymous SSA names we can coalesce + as it restricts the sets we compute conflicts for. + Using TREE_TYPE to generate sets is the easies as + type equivalency also holds for SSA names with the same + underlying decl. + + Check gimple_can_coalesce_p when changing this code. */ + m->base.from = (TYPE_CANONICAL (TREE_TYPE (var)) + ? TYPE_CANONICAL (TREE_TYPE (var)) + : TREE_TYPE (var)); + /* If base variable hasn't been seen, set it up. */ + slot = tree_to_index.find_slot (m, INSERT); + if (!*slot) + { + baseindex = m - mapstorage; + m->to = baseindex; + *slot = m; + m++; + } + else + baseindex = (*slot)->to; + map->partition_to_base_index[x] = baseindex; + } + + map->num_basevars = m - mapstorage; + + free (mapstorage); +} + /* Reduce the number of copies by coalescing variables in the function. Return a partition map with the resulting coalesces. */ @@ -1260,9 +1625,10 @@ coalesce_ssa_name (void) cl = create_coalesce_list (); map = create_outofssa_var_map (cl, used_in_copies); - /* If optimization is disabled, we need to coalesce all the names originating - from the same SSA_NAME_VAR so debug info remains undisturbed. */ - if (!optimize) + /* If this optimization is disabled, we need to coalesce all the + names originating from the same SSA_NAME_VAR so debug info + remains undisturbed. */ + if (!flag_tree_coalesce_vars) { hash_table<ssa_name_var_hash> ssa_name_hash (10); @@ -1303,8 +1669,13 @@ coalesce_ssa_name (void) if (dump_file && (dump_flags & TDF_DETAILS)) dump_var_map (dump_file, map); - /* Don't calculate live ranges for variables not in the coalesce list. */ - partition_view_bitmap (map, used_in_copies, true); + partition_view_bitmap (map, used_in_copies); + + if (flag_tree_coalesce_vars) + compute_optimized_partition_bases (map, used_in_copies, cl); + else + compute_samebase_partition_bases (map); + BITMAP_FREE (used_in_copies); if (num_var_partitions (map) < 1) @@ -1343,8 +1714,7 @@ coalesce_ssa_name (void) /* Now coalesce everything in the list. */ coalesce_partitions (map, graph, cl, - ((dump_flags & TDF_DETAILS) ? dump_file - : NULL)); + ((dump_flags & TDF_DETAILS) ? dump_file : NULL)); delete_coalesce_list (cl); ssa_conflicts_delete (graph); diff --git a/gcc/tree-ssa-coalesce.h b/gcc/tree-ssa-coalesce.h index 99b188a..ae289b4 100644 --- a/gcc/tree-ssa-coalesce.h +++ b/gcc/tree-ssa-coalesce.h @@ -21,5 +21,6 @@ along with GCC; see the file COPYING3. If not see #define GCC_TREE_SSA_COALESCE_H extern var_map coalesce_ssa_name (void); +extern bool gimple_can_coalesce_p (tree, tree); #endif /* GCC_TREE_SSA_COALESCE_H */ diff --git a/gcc/tree-ssa-copyrename.c b/gcc/tree-ssa-copyrename.c deleted file mode 100644 index aeb7f28..0000000 --- a/gcc/tree-ssa-copyrename.c +++ /dev/null @@ -1,475 +0,0 @@ -/* Rename SSA copies. - Copyright (C) 2004-2015 Free Software Foundation, Inc. - Contributed by Andrew MacLeod <amacleod@redhat.com> - -This file is part of GCC. - -GCC is free software; you can redistribute it and/or modify -it under the terms of the GNU General Public License as published by -the Free Software Foundation; either version 3, or (at your option) -any later version. 
- -GCC is distributed in the hope that it will be useful, -but WITHOUT ANY WARRANTY; without even the implied warranty of -MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the -GNU General Public License for more details. - -You should have received a copy of the GNU General Public License -along with GCC; see the file COPYING3. If not see -<http://www.gnu.org/licenses/>. */ - -#include "config.h" -#include "system.h" -#include "coretypes.h" -#include "backend.h" -#include "tree.h" -#include "gimple.h" -#include "rtl.h" -#include "ssa.h" -#include "alias.h" -#include "fold-const.h" -#include "internal-fn.h" -#include "gimple-iterator.h" -#include "flags.h" -#include "tree-pretty-print.h" -#include "insn-config.h" -#include "expmed.h" -#include "dojump.h" -#include "explow.h" -#include "calls.h" -#include "emit-rtl.h" -#include "varasm.h" -#include "stmt.h" -#include "expr.h" -#include "tree-dfa.h" -#include "tree-inline.h" -#include "tree-ssa-live.h" -#include "tree-pass.h" -#include "langhooks.h" - -static struct -{ - /* Number of copies coalesced. */ - int coalesced; -} stats; - -/* The following routines implement the SSA copy renaming phase. - - This optimization looks for copies between 2 SSA_NAMES, either through a - direct copy, or an implicit one via a PHI node result and its arguments. - - Each copy is examined to determine if it is possible to rename the base - variable of one of the operands to the same variable as the other operand. - i.e. - T.3_5 = <blah> - a_1 = T.3_5 - - If this copy couldn't be copy propagated, it could possibly remain in the - program throughout the optimization phases. After SSA->normal, it would - become: - - T.3 = <blah> - a = T.3 - - Since T.3_5 is distinct from all other SSA versions of T.3, there is no - fundamental reason why the base variable needs to be T.3, subject to - certain restrictions. This optimization attempts to determine if we can - change the base variable on copies like this, and result in code such as: - - a_5 = <blah> - a_1 = a_5 - - This gives the SSA->normal pass a shot at coalescing a_1 and a_5. If it is - possible, the copy goes away completely. If it isn't possible, a new temp - will be created for a_5, and you will end up with the exact same code: - - a.8 = <blah> - a = a.8 - - The other benefit of performing this optimization relates to what variables - are chosen in copies. Gimplification of the program uses temporaries for - a lot of things. expressions like - - a_1 = <blah> - <blah2> = a_1 - - get turned into - - T.3_5 = <blah> - a_1 = T.3_5 - <blah2> = a_1 - - Copy propagation is done in a forward direction, and if we can propagate - through the copy, we end up with: - - T.3_5 = <blah> - <blah2> = T.3_5 - - The copy is gone, but so is all reference to the user variable 'a'. By - performing this optimization, we would see the sequence: - - a_5 = <blah> - a_1 = a_5 - <blah2> = a_1 - - which copy propagation would then turn into: - - a_5 = <blah> - <blah2> = a_5 - - and so we still retain the user variable whenever possible. */ - - -/* Coalesce the partitions in MAP representing VAR1 and VAR2 if it is valid. - Choose a representative for the partition, and send debug info to DEBUG. 
*/ - -static void -copy_rename_partition_coalesce (var_map map, tree var1, tree var2, FILE *debug) -{ - int p1, p2, p3; - tree root1, root2; - tree rep1, rep2; - bool ign1, ign2, abnorm; - - gcc_assert (TREE_CODE (var1) == SSA_NAME); - gcc_assert (TREE_CODE (var2) == SSA_NAME); - - register_ssa_partition (map, var1); - register_ssa_partition (map, var2); - - p1 = partition_find (map->var_partition, SSA_NAME_VERSION (var1)); - p2 = partition_find (map->var_partition, SSA_NAME_VERSION (var2)); - - if (debug) - { - fprintf (debug, "Try : "); - print_generic_expr (debug, var1, TDF_SLIM); - fprintf (debug, "(P%d) & ", p1); - print_generic_expr (debug, var2, TDF_SLIM); - fprintf (debug, "(P%d)", p2); - } - - gcc_assert (p1 != NO_PARTITION); - gcc_assert (p2 != NO_PARTITION); - - if (p1 == p2) - { - if (debug) - fprintf (debug, " : Already coalesced.\n"); - return; - } - - rep1 = partition_to_var (map, p1); - rep2 = partition_to_var (map, p2); - root1 = SSA_NAME_VAR (rep1); - root2 = SSA_NAME_VAR (rep2); - if (!root1 && !root2) - return; - - /* Don't coalesce if one of the variables occurs in an abnormal PHI. */ - abnorm = (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rep1) - || SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rep2)); - if (abnorm) - { - if (debug) - fprintf (debug, " : Abnormal PHI barrier. No coalesce.\n"); - return; - } - - /* Partitions already have the same root, simply merge them. */ - if (root1 == root2) - { - p1 = partition_union (map->var_partition, p1, p2); - if (debug) - fprintf (debug, " : Same root, coalesced --> P%d.\n", p1); - return; - } - - /* Never attempt to coalesce 2 different parameters. */ - if ((root1 && TREE_CODE (root1) == PARM_DECL) - && (root2 && TREE_CODE (root2) == PARM_DECL)) - { - if (debug) - fprintf (debug, " : 2 different PARM_DECLS. No coalesce.\n"); - return; - } - - if ((root1 && TREE_CODE (root1) == RESULT_DECL) - != (root2 && TREE_CODE (root2) == RESULT_DECL)) - { - if (debug) - fprintf (debug, " : One root a RESULT_DECL. No coalesce.\n"); - return; - } - - ign1 = !root1 || (TREE_CODE (root1) == VAR_DECL && DECL_IGNORED_P (root1)); - ign2 = !root2 || (TREE_CODE (root2) == VAR_DECL && DECL_IGNORED_P (root2)); - - /* Refrain from coalescing user variables, if requested. */ - if (!ign1 && !ign2) - { - if (flag_ssa_coalesce_vars && DECL_FROM_INLINE (root2)) - ign2 = true; - else if (flag_ssa_coalesce_vars && DECL_FROM_INLINE (root1)) - ign1 = true; - else if (flag_ssa_coalesce_vars != 2) - { - if (debug) - fprintf (debug, " : 2 different USER vars. No coalesce.\n"); - return; - } - else - ign2 = true; - } - - /* If both values have default defs, we can't coalesce. If only one has a - tag, make sure that variable is the new root partition. */ - if (root1 && ssa_default_def (cfun, root1)) - { - if (root2 && ssa_default_def (cfun, root2)) - { - if (debug) - fprintf (debug, " : 2 default defs. No coalesce.\n"); - return; - } - else - { - ign2 = true; - ign1 = false; - } - } - else if (root2 && ssa_default_def (cfun, root2)) - { - ign1 = true; - ign2 = false; - } - - /* Do not coalesce if we cannot assign a symbol to the partition. */ - if (!(!ign2 && root2) - && !(!ign1 && root1)) - { - if (debug) - fprintf (debug, " : Choosen variable has no root. No coalesce.\n"); - return; - } - - /* Don't coalesce if the new chosen root variable would be read-only. - If both ign1 && ign2, then the root var of the larger partition - wins, so reject in that case if any of the root vars is TREE_READONLY. 
- Otherwise reject only if the root var, on which replace_ssa_name_symbol - will be called below, is readonly. */ - if (((root1 && TREE_READONLY (root1)) && ign2) - || ((root2 && TREE_READONLY (root2)) && ign1)) - { - if (debug) - fprintf (debug, " : Readonly variable. No coalesce.\n"); - return; - } - - /* Don't coalesce if the two variables aren't type compatible . */ - if (!types_compatible_p (TREE_TYPE (var1), TREE_TYPE (var2)) - /* There is a disconnect between the middle-end type-system and - VRP, avoid coalescing enum types with different bounds. */ - || ((TREE_CODE (TREE_TYPE (var1)) == ENUMERAL_TYPE - || TREE_CODE (TREE_TYPE (var2)) == ENUMERAL_TYPE) - && TREE_TYPE (var1) != TREE_TYPE (var2))) - { - if (debug) - fprintf (debug, " : Incompatible types. No coalesce.\n"); - return; - } - - /* Merge the two partitions. */ - p3 = partition_union (map->var_partition, p1, p2); - - /* Set the root variable of the partition to the better choice, if there is - one. */ - if (!ign2 && root2) - replace_ssa_name_symbol (partition_to_var (map, p3), root2); - else if (!ign1 && root1) - replace_ssa_name_symbol (partition_to_var (map, p3), root1); - else - gcc_unreachable (); - - if (debug) - { - fprintf (debug, " --> P%d ", p3); - print_generic_expr (debug, SSA_NAME_VAR (partition_to_var (map, p3)), - TDF_SLIM); - fprintf (debug, "\n"); - } -} - - -namespace { - -const pass_data pass_data_rename_ssa_copies = -{ - GIMPLE_PASS, /* type */ - "copyrename", /* name */ - OPTGROUP_NONE, /* optinfo_flags */ - TV_TREE_COPY_RENAME, /* tv_id */ - ( PROP_cfg | PROP_ssa ), /* properties_required */ - 0, /* properties_provided */ - 0, /* properties_destroyed */ - 0, /* todo_flags_start */ - 0, /* todo_flags_finish */ -}; - -class pass_rename_ssa_copies : public gimple_opt_pass -{ -public: - pass_rename_ssa_copies (gcc::context *ctxt) - : gimple_opt_pass (pass_data_rename_ssa_copies, ctxt) - {} - - /* opt_pass methods: */ - opt_pass * clone () { return new pass_rename_ssa_copies (m_ctxt); } - virtual bool gate (function *) { return flag_tree_copyrename != 0; } - virtual unsigned int execute (function *); - -}; // class pass_rename_ssa_copies - -/* This function will make a pass through the IL, and attempt to coalesce any - SSA versions which occur in PHI's or copies. Coalescing is accomplished by - changing the underlying root variable of all coalesced version. This will - then cause the SSA->normal pass to attempt to coalesce them all to the same - variable. */ - -unsigned int -pass_rename_ssa_copies::execute (function *fun) -{ - var_map map; - basic_block bb; - tree var, part_var; - gimple stmt; - unsigned x; - FILE *debug; - - memset (&stats, 0, sizeof (stats)); - - if (dump_file && (dump_flags & TDF_DETAILS)) - debug = dump_file; - else - debug = NULL; - - map = init_var_map (num_ssa_names); - - FOR_EACH_BB_FN (bb, fun) - { - /* Scan for real copies. */ - for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi); - gsi_next (&gsi)) - { - stmt = gsi_stmt (gsi); - if (gimple_assign_ssa_name_copy_p (stmt)) - { - tree lhs = gimple_assign_lhs (stmt); - tree rhs = gimple_assign_rhs1 (stmt); - - copy_rename_partition_coalesce (map, lhs, rhs, debug); - } - } - } - - FOR_EACH_BB_FN (bb, fun) - { - /* Treat PHI nodes as copies between the result and each argument. */ - for (gphi_iterator gsi = gsi_start_phis (bb); !gsi_end_p (gsi); - gsi_next (&gsi)) - { - size_t i; - tree res; - gphi *phi = gsi.phi (); - res = gimple_phi_result (phi); - - /* Do not process virtual SSA_NAMES. 
*/ - if (virtual_operand_p (res)) - continue; - - /* Make sure to only use the same partition for an argument - as the result but never the other way around. */ - if (SSA_NAME_VAR (res) - && !DECL_IGNORED_P (SSA_NAME_VAR (res))) - for (i = 0; i < gimple_phi_num_args (phi); i++) - { - tree arg = PHI_ARG_DEF (phi, i); - if (TREE_CODE (arg) == SSA_NAME) - copy_rename_partition_coalesce (map, res, arg, - debug); - } - /* Else if all arguments are in the same partition try to merge - it with the result. */ - else - { - int all_p_same = -1; - int p = -1; - for (i = 0; i < gimple_phi_num_args (phi); i++) - { - tree arg = PHI_ARG_DEF (phi, i); - if (TREE_CODE (arg) != SSA_NAME) - { - all_p_same = 0; - break; - } - else if (all_p_same == -1) - { - p = partition_find (map->var_partition, - SSA_NAME_VERSION (arg)); - all_p_same = 1; - } - else if (all_p_same == 1 - && p != partition_find (map->var_partition, - SSA_NAME_VERSION (arg))) - { - all_p_same = 0; - break; - } - } - if (all_p_same == 1) - copy_rename_partition_coalesce (map, res, - PHI_ARG_DEF (phi, 0), - debug); - } - } - } - - if (debug) - dump_var_map (debug, map); - - /* Now one more pass to make all elements of a partition share the same - root variable. */ - - for (x = 1; x < num_ssa_names; x++) - { - part_var = partition_to_var (map, x); - if (!part_var) - continue; - var = ssa_name (x); - if (SSA_NAME_VAR (var) == SSA_NAME_VAR (part_var)) - continue; - if (debug) - { - fprintf (debug, "Coalesced "); - print_generic_expr (debug, var, TDF_SLIM); - fprintf (debug, " to "); - print_generic_expr (debug, part_var, TDF_SLIM); - fprintf (debug, "\n"); - } - stats.coalesced++; - replace_ssa_name_symbol (var, SSA_NAME_VAR (part_var)); - } - - statistics_counter_event (fun, "copies coalesced", - stats.coalesced); - delete_var_map (map); - return 0; -} - -} // anon namespace - -gimple_opt_pass * -make_pass_rename_ssa_copies (gcc::context *ctxt) -{ - return new pass_rename_ssa_copies (ctxt); -} diff --git a/gcc/tree-ssa-live.c b/gcc/tree-ssa-live.c index 5b00f58..4772558 100644 --- a/gcc/tree-ssa-live.c +++ b/gcc/tree-ssa-live.c @@ -70,88 +70,6 @@ static void verify_live_on_entry (tree_live_info_p); ssa_name or variable, and vice versa. */ -/* Hashtable helpers. */ - -struct tree_int_map_hasher : nofree_ptr_hash <tree_int_map> -{ - static inline hashval_t hash (const tree_int_map *); - static inline bool equal (const tree_int_map *, const tree_int_map *); -}; - -inline hashval_t -tree_int_map_hasher::hash (const tree_int_map *v) -{ - return tree_map_base_hash (v); -} - -inline bool -tree_int_map_hasher::equal (const tree_int_map *v, const tree_int_map *c) -{ - return tree_int_map_eq (v, c); -} - - -/* This routine will initialize the basevar fields of MAP. */ - -static void -var_map_base_init (var_map map) -{ - int x, num_part; - tree var; - struct tree_int_map *m, *mapstorage; - - num_part = num_var_partitions (map); - hash_table<tree_int_map_hasher> tree_to_index (num_part); - /* We can have at most num_part entries in the hash tables, so it's - enough to allocate so many map elements once, saving some malloc - calls. */ - mapstorage = m = XNEWVEC (struct tree_int_map, num_part); - - /* If a base table already exists, clear it, otherwise create it. */ - free (map->partition_to_base_index); - map->partition_to_base_index = (int *) xmalloc (sizeof (int) * num_part); - - /* Build the base variable list, and point partitions at their bases. 
*/ - for (x = 0; x < num_part; x++) - { - struct tree_int_map **slot; - unsigned baseindex; - var = partition_to_var (map, x); - if (SSA_NAME_VAR (var) - && (!VAR_P (SSA_NAME_VAR (var)) - || !DECL_IGNORED_P (SSA_NAME_VAR (var)))) - m->base.from = SSA_NAME_VAR (var); - else - /* This restricts what anonymous SSA names we can coalesce - as it restricts the sets we compute conflicts for. - Using TREE_TYPE to generate sets is the easies as - type equivalency also holds for SSA names with the same - underlying decl. - - Check gimple_can_coalesce_p when changing this code. */ - m->base.from = (TYPE_CANONICAL (TREE_TYPE (var)) - ? TYPE_CANONICAL (TREE_TYPE (var)) - : TREE_TYPE (var)); - /* If base variable hasn't been seen, set it up. */ - slot = tree_to_index.find_slot (m, INSERT); - if (!*slot) - { - baseindex = m - mapstorage; - m->to = baseindex; - *slot = m; - m++; - } - else - baseindex = (*slot)->to; - map->partition_to_base_index[x] = baseindex; - } - - map->num_basevars = m - mapstorage; - - free (mapstorage); -} - - /* Remove the base table in MAP. */ static void @@ -329,21 +247,17 @@ partition_view_fini (var_map map, bitmap selected) } -/* Create a partition view which includes all the used partitions in MAP. If - WANT_BASES is true, create the base variable map as well. */ +/* Create a partition view which includes all the used partitions in MAP. */ void -partition_view_normal (var_map map, bool want_bases) +partition_view_normal (var_map map) { bitmap used; used = partition_view_init (map); partition_view_fini (map, used); - if (want_bases) - var_map_base_init (map); - else - var_map_base_fini (map); + var_map_base_fini (map); } @@ -352,7 +266,7 @@ partition_view_normal (var_map map, bool want_bases) as well. */ void -partition_view_bitmap (var_map map, bitmap only, bool want_bases) +partition_view_bitmap (var_map map, bitmap only) { bitmap used; bitmap new_partitions = BITMAP_ALLOC (NULL); @@ -368,10 +282,7 @@ partition_view_bitmap (var_map map, bitmap only, bool want_bases) } partition_view_fini (map, new_partitions); - if (want_bases) - var_map_base_init (map); - else - var_map_base_fini (map); + var_map_base_fini (map); } diff --git a/gcc/tree-ssa-live.h b/gcc/tree-ssa-live.h index d5d7820..1f88358 100644 --- a/gcc/tree-ssa-live.h +++ b/gcc/tree-ssa-live.h @@ -71,8 +71,8 @@ typedef struct _var_map extern var_map init_var_map (int); extern void delete_var_map (var_map); extern int var_union (var_map, tree, tree); -extern void partition_view_normal (var_map, bool); -extern void partition_view_bitmap (var_map, bitmap, bool); +extern void partition_view_normal (var_map); +extern void partition_view_bitmap (var_map, bitmap); extern void dump_scope_blocks (FILE *, int); extern void debug_scope_block (tree, int); extern void debug_scope_blocks (int); diff --git a/gcc/tree-ssa-uncprop.c b/gcc/tree-ssa-uncprop.c index 437f69d..1fbd71e 100644 --- a/gcc/tree-ssa-uncprop.c +++ b/gcc/tree-ssa-uncprop.c @@ -38,6 +38,11 @@ along with GCC; see the file COPYING3. If not see #include "tree-pass.h" #include "tree-ssa-propagate.h" #include "tree-hash-traits.h" +#include "bitmap.h" +#include "stringpool.h" +#include "tree-ssanames.h" +#include "tree-ssa-live.h" +#include "tree-ssa-coalesce.h" /* The basic structure describing an equivalency created by traversing an edge. 
Traversing the edge effectively means that we can assume diff --git a/gcc/var-tracking.c b/gcc/var-tracking.c index da9de28..a31a137 100644 --- a/gcc/var-tracking.c +++ b/gcc/var-tracking.c @@ -4856,12 +4856,16 @@ dataflow_set_remove_mem_locs (variable_def **slot, dataflow_set *set) registers, as well as associations between MEMs and VALUEs. */ static void -dataflow_set_clear_at_call (dataflow_set *set) +dataflow_set_clear_at_call (dataflow_set *set, rtx_insn *call_insn) { unsigned int r; hard_reg_set_iterator hrsi; + HARD_REG_SET invalidated_regs; - EXECUTE_IF_SET_IN_HARD_REG_SET (regs_invalidated_by_call, 0, r, hrsi) + get_call_reg_set_usage (call_insn, &invalidated_regs, + regs_invalidated_by_call); + + EXECUTE_IF_SET_IN_HARD_REG_SET (invalidated_regs, 0, r, hrsi) var_regno_delete (set, r); if (MAY_HAVE_DEBUG_INSNS) @@ -6645,7 +6649,7 @@ compute_bb_dataflow (basic_block bb) switch (mo->type) { case MO_CALL: - dataflow_set_clear_at_call (out); + dataflow_set_clear_at_call (out, insn); break; case MO_USE: @@ -9107,7 +9111,7 @@ emit_notes_in_bb (basic_block bb, dataflow_set *set) switch (mo->type) { case MO_CALL: - dataflow_set_clear_at_call (set); + dataflow_set_clear_at_call (set, insn); emit_notes_for_changes (insn, EMIT_NOTE_AFTER_CALL_INSN, set->vars); { rtx arguments = mo->u.loc, *p = &arguments;
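For reference, the -ftree-coalesce-vars option wired up above is enabled by default at -O1 and higher (see the opts.c hunk), and it can be disabled where coalescing of distinct user variables is undesirable, as the adjusted gcc.dg/guality/pr54200.c now does.  A minimal, hypothetical illustration follows; only the option itself comes from the patch, the file and function names are made up, and whether any given pair is actually coalesced still depends on the copy/PHI candidates and the conflict graph:

/* coalesce-vars-demo.c -- hypothetical example, not part of the patch.
   With -ftree-coalesce-vars, out-of-SSA may merge the partitions of
   *different* user variables, such as 't' and 'r' below, when their
   SSA names are copy-related and their live ranges do not conflict.  */
int
scale (int a, int b)
{
  int t = a + b;
  int r = t;		/* Copy between two distinct user variables.  */
  return r * 2;
}

/* gcc -O2 -g coalesce-vars-demo.c
     -> 't' and 'r' may end up sharing a single partition/location.
   gcc -O2 -g -fno-tree-coalesce-vars coalesce-vars-demo.c
     -> only SSA names with the same base variable are coalesced,
        the behaviour gcc.dg/guality/pr54200.c now requests.  */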