Message ID | ZXnX24ZRkpHr7B-m@localhost.localdomain |
---|---|
State | New |
Headers | show |
Series | [v4] A new copy propagation and PHI elimination pass | expand |
> Am 13.12.2023 um 17:12 schrieb Filip Kastl <fkastl@suse.cz>: > > >> >>>> Hi, >>>> >>>> this is a patch that I submitted two months ago as an RFC. I added some polish >>>> since. >>>> >>>> It is a new lightweight pass that removes redundant PHI functions and as a >>>> bonus does basic copy propagation. With Jan Hubi?ka we measured that it is able >>>> to remove usually more than 5% of all PHI functions when run among early passes >>>> (sometimes even 13% or more). Those are mostly PHI functions that would be >>>> later optimized away but with this pass it is possible to remove them early >>>> enough so that they don't get streamed when runing LTO (and also potentially >>>> inlined at multiple places). It is also able to remove some redundant PHIs >>>> that otherwise would still be present during RTL expansion. >>>> >>>> Jakub Jel?nek was concerned about debug info coverage so I compiled cc1plus >>>> with and without this patch. These are the sizes of .debug_info and >>>> .debug_loclists >>>> >>>> .debug_info without patch 181694311 >>>> .debug_info with patch 181692320 >>>> +0.0011% change >>>> >>>> .debug_loclists without patch 47934753 >>>> .debug_loclists with patch 47934966 >>>> -0.0004% change >>>> >>>> I wanted to use dwlocstat to compare debug coverages but didn't manage to get >>>> the program working on my machine sadly. Hope this suffices. Seems to me that >>>> my patch doesn't have a significant impact on debug info. >>>> >>>> Bootstraped and tested* on x86_64-pc-linux-gnu. >>>> >>>> * One testcase (pr79691.c) did regress. However that is because the test is >>>> dependent on a certain variable not being copy propagated. I will go into more >>>> detail about this in a reply to this mail. >>>> >>>> Ok to commit? >>> >>> This is a second version of the patch. In this version, I modified the >>> pr79691.c testcase so that it works as intended with other changes from the >>> patch. >>> >>> The pr79691.c testcase checks that we get constants from snprintf calls and >>> that they simplify into a single constant. The testcase doesn't account for >>> the fact that this constant may be further copy propagated which is exactly >>> what happens with this patch applied. >>> >>> Bootstrapped and tested on x86_64-pc-linux-gnu. >>> >>> Ok to commit? >> >> This is the third version of the patch. In this version, I adressed most of >> Richards remarks about the second version. Here is a summary of changes I made: >> >> - Rename the pass from tree-ssa-sccopy.cc to gimple-ssa-sccopy.cc >> - Use simple_dce_from_worklist to remove propagated statements >> - Use existing replace_uses API instead of reinventing it >> - This allowed me to get rid of some now redundant cleanup code >> - Encapsulate the SCC finding into a class >> - Rework stmt_may_generate_copy to get rid of redundant checks >> - Add check that PHI doesn't contain two non-SSA-name values to >> stmt_may_generate_copy >> - Regarding alignment and value ranges in stmt_may_generate_copy: For now use >> the conservative check that Richard suggested >> - Index array of vertices that SCC discovery uses by SSA name version numbers >> instead of numbering statements myself. >> >> >> I didn't make any changes based on these remarks: >> >> 1 It might be nice to optimize SCCs of size 1 somehow, not sure how >> many times these appear - possibly prevent them from even entering >> the SCC discovery? >> >> It would be nice. But the only way to do this that I see right now is to first >> propagate SCCs of size 1 and then the rest. This would mean adding a new copy >> propagation procedure. It wouldn't be a trivial procedure. Efficiency of the >> pass relies on having SCCs topogically sorted so this procedure would have to >> implement some topological sort algorithm. >> >> This could be done. It could save allocating some vec<>s (right now, SCCs of >> size 1 are represented by a vec<> with a single element). But is it worth it to >> optimize the pass this way right now? If possible, I'd like to see that the >> pass works and sort out any problems people encounter with it before I start >> optimizing it. >> >> 2 Instead of collecting all stmts that may generate a copy at the beginning of >> the pass into a vec<>, let the SCC discovery check that statements may >> generate a copy on the fly. >> >> This would be a big change to the pass, it would require a lot of reworking. >> I'm also not sure if this would help reduce the number of allocated vec<>s that >> much because I'll still want to represent SCCs by vec<>s. >> >> Again - its possible I'll want to rework the pass in this way in the future but >> I'd like to leave it as it is for now. >> >> 3 Add a comment saying that the pass is doing optimistic copy propagation >> >> I don't think the pass works in an optimistic way. It doesn't assume that all >> variables are copies of each other at any point. It instead identifies copy >> statements (or PHI SCCs that act as copy statements) and then for each of these >> it propagates: By that I mean if a copy statement says that _3 is a copy of _2, >> then it replaces all uses of _3 by _2. >> >> But its possible that I still misinterpret what 'optimistic' means. If >> mentioning that it works in an optimistic way would help readability of the >> pass, I'm happy to add that comment. >> >> Richard, is the patch ok as it is now? Or do you think that making changes 1, 2 >> or 3 is necessary? Or are there any other issues which should be adressed? >> >> Filip > > This is the fourth version of this patch. Changes in this version: > - Disabled the pass for -Og > - This meant that I could revert my changes to gcc.dg/tree-ssa/pr79691.c > - Fixed formatting of the patch (like braces around a single-statment if body) > - compute_sccs (auto_vec<>...) -> compute_sccs (vec<> ...) > - auto_vec<gimple *> worklist; worklist = ...; -> > auto_vec<gimple *> worklist = ...; > > I've also checked that there are no memory leak issues. Valgrind doesn't detect > any aditional leaks when running with sccopy vs running without sccopy: > > [fkastl@scala playground]$ valgrind $HOME/gcc/inst/bin/gcc /home/fkastl/gcc/src/gcc/testsuite/gcc.dg/sccopy-1.c -fgimple -O2 > ==27363== Memcheck, a memory error detector > ==27363== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al. > ==27363== Using Valgrind-3.22.0 and LibVEX; rerun with -h for copyright info > ==27363== Command: /home/fkastl/gcc/inst/bin/gcc /home/fkastl/gcc/src/gcc/testsuite/gcc.dg/sccopy-1.c -fgimple -O2 > ==27363== > ==27363== > ==27363== HEAP SUMMARY: > ==27363== in use at exit: 174,234 bytes in 92 blocks > ==27363== total heap usage: 408 allocs, 316 frees, 228,050 bytes allocated > ==27363== > ==27363== LEAK SUMMARY: > ==27363== definitely lost: 5,935 bytes in 23 blocks > ==27363== indirectly lost: 82 bytes in 5 blocks > ==27363== possibly lost: 0 bytes in 0 blocks > ==27363== still reachable: 168,217 bytes in 64 blocks > ==27363== of which reachable via heuristic: > ==27363== newarray : 1,544 bytes in 1 blocks > ==27363== suppressed: 0 bytes in 0 blocks > ==27363== Rerun with --leak-check=full to see details of leaked memory > ==27363== > ==27363== For lists of detected and suppressed errors, rerun with: -s > ==27363== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0) > > [fkastl@scala playground]$ valgrind $HOME/gcc/inst/bin/gcc /home/fkastl/gcc/src/gcc/testsuite/gcc.dg/sccopy-1.c -fgimple -O2 -fdisable-tree-sccopy1 -fdisable-tree-sccopy2 > ==27372== Memcheck, a memory error detector > ==27372== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al. > ==27372== Using Valgrind-3.22.0 and LibVEX; rerun with -h for copyright info > ==27372== Command: /home/fkastl/gcc/inst/bin/gcc /home/fkastl/gcc/src/gcc/testsuite/gcc.dg/sccopy-1.c -fgimple -O2 -fdisable-tree-sccopy1 -fdisable-tree-sccopy2 > ==27372== > cc1: note: disable pass tree-sccopy1 for functions in the range of [0, 4294967295] > cc1: note: disable pass tree-sccopy2 for functions in the range of [0, 4294967295] > ==27372== > ==27372== HEAP SUMMARY: > ==27372== in use at exit: 174,282 bytes in 92 blocks > ==27372== total heap usage: 408 allocs, 316 frees, 228,674 bytes allocated > ==27372== > ==27372== LEAK SUMMARY: > ==27372== definitely lost: 5,935 bytes in 23 blocks > ==27372== indirectly lost: 82 bytes in 5 blocks > ==27372== possibly lost: 0 bytes in 0 blocks > ==27372== still reachable: 168,265 bytes in 64 blocks > ==27372== of which reachable via heuristic: > ==27372== newarray : 1,544 bytes in 1 blocks > ==27372== suppressed: 0 bytes in 0 blocks > ==27372== Rerun with --leak-check=full to see details of leaked memory > ==27372== > ==27372== For lists of detected and suppressed errors, rerun with: -s > ==27372== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0) > > Richard, I wrote down the bitmap vs hash_set optimizations you suggested so > that I can work on the pass more later and optimize it. > > I didn't yet bootstrap this version. I just regtested it. Will bootstrap and > regtest on the most current commit. Once that is successfully done, is the pass > OK to be pushed to main? Yes. Thanks, Richard > Filip > > -- >8 -- > > This patch adds the strongly-connected copy propagation (SCCOPY) pass. > It is a lightweight GIMPLE copy propagation pass that also removes some > redundant PHI statements. It handles degenerate PHIs, e.g.: > > _5 = PHI <_1>; > _6 = PHI <_6, _6, _1, _1>; > _7 = PHI <16, _7>; > // Replaces occurences of _5 and _6 by _1 and _7 by 16 > > It also handles more complicated situations, e.g.: > > _8 = PHI <_9, _10>; > _9 = PHI <_8, _10>; > _10 = PHI <_8, _9, _1>; > // Replaces occurences of _8, _9 and _10 by _1 > > gcc/ChangeLog: > > * Makefile.in: Added sccopy pass. > * passes.def: Added sccopy pass before LTO streaming and before > RTL expansion. > * tree-pass.h (make_pass_sccopy): Added sccopy pass. > * gimple-ssa-sccopy.cc: New file. > > gcc/testsuite/ChangeLog: > > * gcc.dg/sccopy-1.c: New test. > > Signed-off-by: Filip Kastl <fkastl@suse.cz> > --- > gcc/Makefile.in | 1 + > gcc/gimple-ssa-sccopy.cc | 680 ++++++++++++++++++++++++++++++++ > gcc/passes.def | 2 + > gcc/testsuite/gcc.dg/sccopy-1.c | 78 ++++ > gcc/tree-pass.h | 1 + > 5 files changed, 762 insertions(+) > create mode 100644 gcc/gimple-ssa-sccopy.cc > create mode 100644 gcc/testsuite/gcc.dg/sccopy-1.c > > diff --git a/gcc/Makefile.in b/gcc/Makefile.in > index 753f2f36618..2e6f2fef1ba 100644 > --- a/gcc/Makefile.in > +++ b/gcc/Makefile.in > @@ -1497,6 +1497,7 @@ OBJS = \ > gimple-ssa-backprop.o \ > gimple-ssa-isolate-paths.o \ > gimple-ssa-nonnull-compare.o \ > + gimple-ssa-sccopy.o \ > gimple-ssa-split-paths.o \ > gimple-ssa-store-merging.o \ > gimple-ssa-strength-reduction.o \ > diff --git a/gcc/gimple-ssa-sccopy.cc b/gcc/gimple-ssa-sccopy.cc > new file mode 100644 > index 00000000000..ac5ec32eb32 > --- /dev/null > +++ b/gcc/gimple-ssa-sccopy.cc > @@ -0,0 +1,680 @@ > +/* Strongly-connected copy propagation pass for the GNU compiler. > + Copyright (C) 2023 Free Software Foundation, Inc. > + Contributed by Filip Kastl <fkastl@suse.cz> > + > +This file is part of GCC. > + > +GCC is free software; you can redistribute it and/or modify it under > +the terms of the GNU General Public License as published by the Free > +Software Foundation; either version 3, or (at your option) any later > +version. > + > +GCC is distributed in the hope that it will be useful, but WITHOUT ANY > +WARRANTY; without even the implied warranty of MERCHANTABILITY or > +FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License > +for more details. > + > +You should have received a copy of the GNU General Public License > +along with GCC; see the file COPYING3. If not see > +<http://www.gnu.org/licenses/>. */ > + > +#include "config.h" > +#include "system.h" > +#include "coretypes.h" > +#include "backend.h" > +#include "tree.h" > +#include "gimple.h" > +#include "tree-pass.h" > +#include "ssa.h" > +#include "gimple-iterator.h" > +#include "vec.h" > +#include "hash-set.h" > +#include <algorithm> > +#include "ssa-iterators.h" > +#include "gimple-fold.h" > +#include "gimplify.h" > +#include "tree-cfg.h" > +#include "tree-eh.h" > +#include "builtins.h" > +#include "tree-ssa-dce.h" > +#include "fold-const.h" > + > +/* Strongly connected copy propagation pass. > + > + This is a lightweight copy propagation pass that is also able to eliminate > + redundant PHI statements. The pass considers the following types of copy > + statements: > + > + 1 An assignment statement with a single argument. > + > + _3 = _2; > + _4 = 5; > + > + 2 A degenerate PHI statement. A degenerate PHI is a PHI that only refers to > + itself or one other value. > + > + _5 = PHI <_1>; > + _6 = PHI <_6, _6, _1, _1>; > + _7 = PHI <16, _7>; > + > + 3 A set of PHI statements that only refer to each other or to one other > + value. > + > + _8 = PHI <_9, _10>; > + _9 = PHI <_8, _10>; > + _10 = PHI <_8, _9, _1>; > + > + All of these statements produce copies and can be eliminated from the > + program. For a copy statement we identify the value it creates a copy of > + and replace references to the statement with the value -- we propagate the > + copy. > + > + _3 = _2; // Replace all occurences of _3 by _2 > + > + _8 = PHI <_9, _10>; > + _9 = PHI <_8, _10>; > + _10 = PHI <_8, _9, _1>; // Replace all occurences of _8, _9 and _10 by _1 > + > + To find all three types of copy statements we use an algorithm based on > + strongly-connected components (SCCs) in dataflow graph. The algorithm was > + introduced in an article from 2013[1]. We describe the algorithm bellow. > + > + To identify SCCs we implement the Robert Tarjan's SCC algorithm. For the > + SCC computation we wrap potential copy statements in the 'vertex' struct. > + To each of these statements we also assign a vertex number ('vxnum'). Since > + the main algorithm has to be able to compute SCCs of subgraphs of the whole > + dataflow graph we use GIMPLE stmt flags to prevent Tarjan's algorithm from > + leaving the subgraph. > + > + References: > + > + [1] Simple and Efficient Construction of Static Single Assignmemnt Form, > + Braun, Buchwald, Hack, Leissa, Mallon, Zwinkau, 2013, LNCS vol. 7791, > + Section 3.2. */ > + > +/* Bitmap tracking statements which were propagated to be removed at the end of > + the pass. */ > + > +static bitmap dead_stmts; > + > +/* State of vertex during SCC discovery. > + > + unvisited Vertex hasn't yet been popped from worklist. > + vopen DFS has visited vertex for the first time. Vertex has been put > + on Tarjan stack. > + closed DFS has backtracked through vertex. At this point, vertex > + doesn't have any unvisited neighbors. > + in_scc Vertex has been popped from Tarjan stack. */ > + > +enum vstate > +{ > + unvisited, > + vopen, > + closed, > + in_scc > +}; > + > +/* Information about a vertex. Used by SCC discovery. */ > + > +struct vertex > +{ > + bool active; /* scc_discovery::compute_sccs () only considers a subgraph of > + the whole dataflow graph. It uses this flag so that it knows > + which vertices are part of this subgraph. */ > + vstate state; > + unsigned index; > + unsigned lowlink; > +}; > + > +/* SCC discovery. > + > + Used to find SCCs in a dataflow graph. Implements Tarjan's SCC > + algorithm. */ > + > +class scc_discovery > +{ > +public: > + scc_discovery (); > + ~scc_discovery (); > + auto_vec<vec<gimple *>> compute_sccs (vec<gimple *> &stmts); > + > +private: > + unsigned curr_generation = 0; > + vertex* vertices; /* Indexed by SSA_NAME_VERSION. */ > + auto_vec<unsigned> worklist; /* DFS stack. */ > + auto_vec<unsigned> stack; /* Tarjan stack. */ > + > + void visit_neighbor (tree neigh_tree, unsigned parent_vxnum); > +}; > + > +scc_discovery::scc_discovery () > +{ > + /* Create vertex struct for each SSA name. */ > + vertices = XNEWVEC (struct vertex, num_ssa_names); > + unsigned i = 0; > + for (i = 0; i < num_ssa_names; i++) > + vertices[i].active = false; > +} > + > +scc_discovery::~scc_discovery () > +{ > + XDELETEVEC (vertices); > +} > + > +/* Part of 'scc_discovery::compute_sccs ()'. */ > + > +void > +scc_discovery::visit_neighbor (tree neigh_tree, unsigned parent_version) > +{ > + if (TREE_CODE (neigh_tree) != SSA_NAME) > + return; /* Skip any neighbor that isn't an SSA name. */ > + unsigned neigh_version = SSA_NAME_VERSION (neigh_tree); > + > + /* Skip neighbors outside the subgraph that Tarjan currently works > + with. */ > + if (!vertices[neigh_version].active) > + return; > + > + vstate neigh_state = vertices[neigh_version].state; > + vstate parent_state = vertices[parent_version].state; > + if (parent_state == vopen) /* We're currently opening parent. */ > + { > + /* Put unvisited neighbors on worklist. Update lowlink of parent > + vertex according to indices of neighbors present on stack. */ > + switch (neigh_state) > + { > + case unvisited: > + worklist.safe_push (neigh_version); > + break; > + case vopen: > + case closed: > + vertices[parent_version].lowlink > + = std::min (vertices[parent_version].lowlink, > + vertices[neigh_version].index); > + break; > + case in_scc: > + /* Ignore these edges. */ > + break; > + } > + } > + else if (parent_state == closed) /* We're currently closing parent. */ > + { > + /* Update lowlink of parent vertex according to lowlinks of > + children of parent (in terms of DFS tree). */ > + if (neigh_state == closed) > + { > + vertices[parent_version].lowlink > + = std::min (vertices[parent_version].lowlink, > + vertices[neigh_version].lowlink); > + } > + } > +} > + > +/* Compute SCCs in dataflow graph on given statements 'stmts'. Ignore > + statements outside 'stmts'. Return the SCCs in a reverse topological > + order. > + > + stmt_may_generate_copy () must be true for all statements from 'stmts'! */ > + > +auto_vec<vec<gimple *>> > +scc_discovery::compute_sccs (vec<gimple *> &stmts) > +{ > + auto_vec<vec<gimple *>> sccs; > + > + for (gimple *stmt : stmts) > + { > + unsigned i; > + switch (gimple_code (stmt)) > + { > + case GIMPLE_ASSIGN: > + i = SSA_NAME_VERSION (gimple_assign_lhs (stmt)); > + break; > + case GIMPLE_PHI: > + i = SSA_NAME_VERSION (gimple_phi_result (stmt)); > + break; > + default: > + gcc_unreachable (); > + } > + > + vertices[i].index = 0; > + vertices[i].lowlink = 0; > + vertices[i].state = unvisited; > + vertices[i].active = true; /* Mark the subgraph we'll be working on so > + that we don't leave it. */ > + > + worklist.safe_push (i); > + } > + > + /* Worklist loop. */ > + unsigned curr_index = 0; > + while (!worklist.is_empty ()) > + { > + unsigned i = worklist.pop (); > + gimple *stmt = SSA_NAME_DEF_STMT (ssa_name (i)); > + vstate state = vertices[i].state; > + > + if (state == unvisited) > + { > + vertices[i].state = vopen; > + > + /* Assign index to this vertex. */ > + vertices[i].index = curr_index; > + vertices[i].lowlink = curr_index; > + curr_index++; > + > + /* Put vertex on stack and also on worklist to be closed later. */ > + stack.safe_push (i); > + worklist.safe_push (i); > + } > + else if (state == vopen) > + vertices[i].state = closed; > + > + /* Visit neighbors of this vertex. */ > + tree op; > + gphi *phi; > + switch (gimple_code (stmt)) > + { > + case GIMPLE_PHI: > + phi = as_a <gphi *> (stmt); > + unsigned j; > + for (j = 0; j < gimple_phi_num_args (phi); j++) > + { > + op = gimple_phi_arg_def (phi, j); > + visit_neighbor (op, i); > + } > + break; > + case GIMPLE_ASSIGN: > + op = gimple_assign_rhs1 (stmt); > + visit_neighbor (op, i); > + break; > + default: > + gcc_unreachable (); > + } > + > + /* If we've just closed a root vertex of an scc, pop scc from stack. */ > + if (state == vopen && vertices[i].lowlink == vertices[i].index) > + { > + vec<gimple *> scc = vNULL; > + > + unsigned j; > + do > + { > + j = stack.pop (); > + scc.safe_push (SSA_NAME_DEF_STMT (ssa_name (j))); > + vertices[j].state = in_scc; > + } > + while (j != i); > + > + sccs.safe_push (scc); > + } > + } > + > + if (!stack.is_empty ()) > + gcc_unreachable (); > + > + /* Clear 'active' flags. */ > + for (gimple *stmt : stmts) > + { > + unsigned i; > + switch (gimple_code (stmt)) > + { > + case GIMPLE_ASSIGN: > + i = SSA_NAME_VERSION (gimple_assign_lhs (stmt)); > + break; > + case GIMPLE_PHI: > + i = SSA_NAME_VERSION (gimple_phi_result (stmt)); > + break; > + default: > + gcc_unreachable (); > + } > + > + vertices[i].active = false; > + } > + > + return sccs; > +} > + > +/* Could this statement potentially be a copy statement? > + > + This pass only considers statements for which this function returns 'true'. > + Those are basically PHI functions and assignment statements similar to > + > + _2 = _1; > + or > + _2 = 5; */ > + > +static bool > +stmt_may_generate_copy (gimple *stmt) > +{ > + /* A PHI may generate a copy. */ > + if (gimple_code (stmt) == GIMPLE_PHI) > + { > + gphi *phi = as_a <gphi *> (stmt); > + > + /* No OCCURS_IN_ABNORMAL_PHI SSA names in lhs nor rhs. */ > + if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (gimple_phi_result (phi))) > + return false; > + > + unsigned i; > + for (i = 0; i < gimple_phi_num_args (phi); i++) > + { > + tree op = gimple_phi_arg_def (phi, i); > + if (TREE_CODE (op) == SSA_NAME > + && SSA_NAME_OCCURS_IN_ABNORMAL_PHI (op)) > + return false; > + } > + > + /* If PHI has more than one unique non-SSA arguments, it won't generate a > + copy. */ > + tree const_op = NULL_TREE; > + for (i = 0; i < gimple_phi_num_args (phi); i++) > + { > + tree op = gimple_phi_arg_def (phi, i); > + if (TREE_CODE (op) != SSA_NAME) > + { > + if (const_op && !operand_equal_p (op, const_op)) > + return false; > + const_op = op; > + } > + } > + > + return true; > + } > + > + /* Or a statement of type _2 = _1; OR _2 = 5; may generate a copy. */ > + > + if (!gimple_assign_single_p (stmt)) > + return false; > + > + tree lhs = gimple_assign_lhs (stmt); > + tree rhs = gimple_assign_rhs1 (stmt); > + > + if (TREE_CODE (lhs) != SSA_NAME) > + return false; > + > + /* lhs shouldn't flow through any abnormal edges. */ > + if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (lhs)) > + return false; > + > + if (is_gimple_min_invariant (rhs)) > + return true; /* A statement of type _2 = 5;. */ > + > + if (TREE_CODE (rhs) != SSA_NAME) > + return false; > + > + /* rhs shouldn't flow through any abnormal edges. */ > + if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rhs)) > + return false; > + > + /* It is possible that lhs has more alignment or value range information. By > + propagating we would lose this information. So in the case that alignment > + or value range information differs, we are conservative and do not > + propagate. > + > + FIXME: Propagate alignment and value range info the same way copy-prop > + does. */ > + if (POINTER_TYPE_P (TREE_TYPE (lhs)) > + && POINTER_TYPE_P (TREE_TYPE (rhs)) > + && SSA_NAME_PTR_INFO (lhs) != SSA_NAME_PTR_INFO (rhs)) > + return false; > + if (!POINTER_TYPE_P (TREE_TYPE (lhs)) > + && !POINTER_TYPE_P (TREE_TYPE (rhs)) > + && SSA_NAME_RANGE_INFO (lhs) != SSA_NAME_RANGE_INFO (rhs)) > + return false; > + > + return true; /* A statement of type _2 = _1;. */ > +} > + > +/* Return all statements in cfun that could generate copies. All statements > + for which stmt_may_generate_copy returns 'true'. */ > + > +static auto_vec<gimple *> > +get_all_stmt_may_generate_copy (void) > +{ > + auto_vec<gimple *> result; > + > + basic_block bb; > + FOR_EACH_BB_FN (bb, cfun) > + { > + gimple_stmt_iterator gsi; > + for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi)) > + { > + gimple *s = gsi_stmt (gsi); > + if (stmt_may_generate_copy (s)) > + result.safe_push (s); > + } > + > + gphi_iterator pi; > + for (pi = gsi_start_phis (bb); !gsi_end_p (pi); gsi_next (&pi)) > + { > + gimple *s = pi.phi (); > + if (stmt_may_generate_copy (s)) > + result.safe_push (s); > + } > + } > + > + return result; > +} > + > +/* For each statement from given SCC, replace its usages by value > + VAL. */ > + > +static void > +replace_scc_by_value (vec<gimple *> scc, tree val) > +{ > + for (gimple *stmt : scc) > + { > + tree name = gimple_get_lhs (stmt); > + replace_uses_by (name, val); > + bitmap_set_bit (dead_stmts, SSA_NAME_VERSION (name)); > + } > + > + if (dump_file) > + fprintf (dump_file, "Replacing SCC of size %d\n", scc.length ()); > +} > + > +/* Part of 'sccopy_propagate ()'. */ > + > +static void > +sccopy_visit_op (tree op, hash_set<tree> &outer_ops, > + hash_set<gimple *> &scc_set, bool &is_inner, > + tree &last_outer_op) > +{ > + bool op_in_scc = false; > + > + if (TREE_CODE (op) == SSA_NAME) > + { > + gimple *op_stmt = SSA_NAME_DEF_STMT (op); > + if (scc_set.contains (op_stmt)) > + op_in_scc = true; > + } > + > + if (!op_in_scc) > + { > + outer_ops.add (op); > + last_outer_op = op; > + is_inner = false; > + } > +} > + > +/* Main function of this pass. Find and propagate all three types of copy > + statements (see pass description above). > + > + This is an implementation of an algorithm from the paper Simple and > + Efficient Construction of Static Single Assignmemnt Form[1]. It is based > + on strongly-connected components (SCCs) in dataflow graph. The original > + algorithm only considers PHI statements. We extend it to also consider > + assignment statements of type _2 = _1;. > + > + The algorithm is based on this definition of a set of redundant PHIs[1]: > + > + A non-empty set P of PHI functions is redundant iff the PHI functions just > + reference each other or one other value > + > + It uses this lemma[1]: > + > + Let P be a redundant set of PHI functions. Then there is a > + strongly-connected component S subset of P that is also redundant. > + > + The algorithm works in this way: > + > + 1 Find SCCs > + 2 For each SCC S in topological order: > + 3 Construct set 'inner' of statements that only have other statements > + from S on their right hand side > + 4 Construct set 'outer' of values that originate outside S and appear on > + right hand side of some statement from S > + 5 If |outer| = 1, outer only contains a value v. Statements in S only > + refer to each other or to v -- they are redundant. Propagate v. > + Else, recurse on statements in inner. > + > + The implementation is non-recursive. > + > + References: > + > + [1] Simple and Efficient Construction of Static Single Assignmemnt Form, > + Braun, Buchwald, Hack, Leissa, Mallon, Zwinkau, 2013, LNCS vol. 7791, > + Section 3.2. */ > + > +static void > +sccopy_propagate () > +{ > + auto_vec<gimple *> useful_stmts = get_all_stmt_may_generate_copy (); > + scc_discovery discovery; > + > + auto_vec<vec<gimple *>> worklist = discovery.compute_sccs (useful_stmts); > + > + while (!worklist.is_empty ()) > + { > + vec<gimple *> scc = worklist.pop (); > + > + auto_vec<gimple *> inner; > + hash_set<tree> outer_ops; > + tree last_outer_op = NULL_TREE; > + > + /* Prepare hash set of PHIs in scc to query later. */ > + hash_set<gimple *> scc_set; > + for (gimple *stmt : scc) > + scc_set.add (stmt); > + > + for (gimple *stmt : scc) > + { > + bool is_inner = true; > + > + gphi *phi; > + tree op; > + > + switch (gimple_code (stmt)) > + { > + case GIMPLE_PHI: > + phi = as_a <gphi *> (stmt); > + unsigned j; > + for (j = 0; j < gimple_phi_num_args (phi); j++) > + { > + op = gimple_phi_arg_def (phi, j); > + sccopy_visit_op (op, outer_ops, scc_set, is_inner, > + last_outer_op); > + } > + break; > + case GIMPLE_ASSIGN: > + op = gimple_assign_rhs1 (stmt); > + sccopy_visit_op (op, outer_ops, scc_set, is_inner, > + last_outer_op); > + break; > + default: > + gcc_unreachable (); > + } > + > + if (is_inner) > + inner.safe_push (stmt); > + } > + > + if (outer_ops.elements () == 1) > + { > + /* The only operand in outer_ops. */ > + tree outer_op = last_outer_op; > + replace_scc_by_value (scc, outer_op); > + } > + else if (outer_ops.elements () > 1) > + { > + /* Add inner sccs to worklist. */ > + auto_vec<vec<gimple *>> inner_sccs > + = discovery.compute_sccs (inner); > + for (vec<gimple *> inner_scc : inner_sccs) > + worklist.safe_push (inner_scc); > + } > + else > + gcc_unreachable (); > + > + scc.release (); > + } > +} > + > +/* Called when pass execution starts. */ > + > +static void > +init_sccopy (void) > +{ > + /* For propagated statements. */ > + dead_stmts = BITMAP_ALLOC (NULL); > +} > + > +/* Called before pass execution ends. */ > + > +static void > +finalize_sccopy (void) > +{ > + /* Remove all propagated statements. */ > + simple_dce_from_worklist (dead_stmts); > + BITMAP_FREE (dead_stmts); > + > + /* Propagating a constant may create dead eh edges. */ > + basic_block bb; > + FOR_EACH_BB_FN (bb, cfun) > + gimple_purge_dead_eh_edges (bb); > +} > + > +namespace { > + > +const pass_data pass_data_sccopy = > +{ > + GIMPLE_PASS, /* type */ > + "sccopy", /* name */ > + OPTGROUP_NONE, /* optinfo_flags */ > + TV_NONE, /* tv_id */ > + ( PROP_cfg | PROP_ssa ), /* properties_required */ > + 0, /* properties_provided */ > + 0, /* properties_destroyed */ > + 0, /* todo_flags_start */ > + TODO_update_ssa | TODO_cleanup_cfg, /* todo_flags_finish */ > +}; > + > +class pass_sccopy : public gimple_opt_pass > +{ > +public: > + pass_sccopy (gcc::context *ctxt) > + : gimple_opt_pass (pass_data_sccopy, ctxt) > + {} > + > + /* opt_pass methods: */ > + virtual bool gate (function *) { return true; } > + virtual unsigned int execute (function *); > + opt_pass * clone () final override { return new pass_sccopy (m_ctxt); } > +}; // class pass_sccopy > + > +unsigned > +pass_sccopy::execute (function *) > +{ > + init_sccopy (); > + sccopy_propagate (); > + finalize_sccopy (); > + return 0; > +} > + > +} // anon namespace > + > +gimple_opt_pass * > +make_pass_sccopy (gcc::context *ctxt) > +{ > + return new pass_sccopy (ctxt); > +} > diff --git a/gcc/passes.def b/gcc/passes.def > index 1e1950bdb39..048ad3ee7a5 100644 > --- a/gcc/passes.def > +++ b/gcc/passes.def > @@ -100,6 +100,7 @@ along with GCC; see the file COPYING3. If not see > NEXT_PASS (pass_if_to_switch); > NEXT_PASS (pass_convert_switch); > NEXT_PASS (pass_cleanup_eh); > + NEXT_PASS (pass_sccopy); > NEXT_PASS (pass_profile); > NEXT_PASS (pass_local_pure_const); > NEXT_PASS (pass_modref); > @@ -368,6 +369,7 @@ along with GCC; see the file COPYING3. If not see > However, this also causes us to misdiagnose cases that should be > real warnings (e.g., testsuite/gcc.dg/pr18501.c). */ > NEXT_PASS (pass_cd_dce, false /* update_address_taken_p */); > + NEXT_PASS (pass_sccopy); > NEXT_PASS (pass_tail_calls); > /* Split critical edges before late uninit warning to reduce the > number of false positives from it. */ > diff --git a/gcc/testsuite/gcc.dg/sccopy-1.c b/gcc/testsuite/gcc.dg/sccopy-1.c > new file mode 100644 > index 00000000000..1e61a6b320e > --- /dev/null > +++ b/gcc/testsuite/gcc.dg/sccopy-1.c > @@ -0,0 +1,78 @@ > +/* { dg-do compile } */ > +/* { dg-options "-fgimple -fdump-tree-sccopy -O2" } */ > +/* { dg-final { scan-tree-dump "Replacing SCC of size 2" "sccopy1" } } */ > + > +int __GIMPLE (ssa, startwith ("sccopy")) > +main () > +{ > + int a; > + int y; > + int x; > + int _1; > + int _2; > + int _13; > + > + __BB(2): > + if (x_7(D) == 5) > + goto __BB3; > + else > + goto __BB4; > + > + __BB(3): > + a_10 = x_7(D); > + goto __BB5; > + > + __BB(4): > + a_9 = y_8(D); > + goto __BB5; > + > + __BB(5): > + a_3 = __PHI (__BB3: a_10, __BB4: a_9); > + if (x_7(D) == y_8(D)) > + goto __BB6; > + else > + goto __BB11; > + > + __BB(6): > + a_11 = a_3 + 1; > + goto __BB7; > + > + __BB(7): > + a_4 = __PHI (__BB6: a_11, __BB11: a_6); > +label1: > + if (x_7(D) != y_8(D)) > + goto __BB8; > + else > + goto __BB10; > + > + __BB(8): > + goto __BB9; > + > + __BB(9): > + a_12 = __PHI (__BB8: a_4, __BB10: a_5); > + goto __BB10; > + > + __BB(10,loop_header(1)): > + a_5 = __PHI (__BB7: a_4, __BB9: a_12); > +label2: > + _1 = y_8(D) * 2; > + if (x_7(D) == _1) > + goto __BB9; > + else > + goto __BB11; > + > + __BB(11): > + a_6 = __PHI (__BB5: a_3, __BB10: a_5); > + _2 = x_7(D) * 3; > + if (y_8(D) == _2) > + goto __BB7; > + else > + goto __BB12; > + > + __BB(12): > + _13 = 0; > + return _13; > + > +} > + > + > diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h > index 09e6ada5b2f..94a48461590 100644 > --- a/gcc/tree-pass.h > +++ b/gcc/tree-pass.h > @@ -399,6 +399,7 @@ extern gimple_opt_pass *make_pass_iv_optimize (gcc::context *ctxt); > extern gimple_opt_pass *make_pass_tree_loop_done (gcc::context *ctxt); > extern gimple_opt_pass *make_pass_ch (gcc::context *ctxt); > extern gimple_opt_pass *make_pass_ch_vect (gcc::context *ctxt); > +extern gimple_opt_pass *make_pass_sccopy (gcc::context *ctxt); > extern gimple_opt_pass *make_pass_ccp (gcc::context *ctxt); > extern gimple_opt_pass *make_pass_split_paths (gcc::context *ctxt); > extern gimple_opt_pass *make_pass_build_ssa (gcc::context *ctxt); > -- > 2.43.0 >
Successfully bootstrapped and regtested on x86_64-linux. Will push to master. Filip
diff --git a/gcc/Makefile.in b/gcc/Makefile.in index 753f2f36618..2e6f2fef1ba 100644 --- a/gcc/Makefile.in +++ b/gcc/Makefile.in @@ -1497,6 +1497,7 @@ OBJS = \ gimple-ssa-backprop.o \ gimple-ssa-isolate-paths.o \ gimple-ssa-nonnull-compare.o \ + gimple-ssa-sccopy.o \ gimple-ssa-split-paths.o \ gimple-ssa-store-merging.o \ gimple-ssa-strength-reduction.o \ diff --git a/gcc/gimple-ssa-sccopy.cc b/gcc/gimple-ssa-sccopy.cc new file mode 100644 index 00000000000..ac5ec32eb32 --- /dev/null +++ b/gcc/gimple-ssa-sccopy.cc @@ -0,0 +1,680 @@ +/* Strongly-connected copy propagation pass for the GNU compiler. + Copyright (C) 2023 Free Software Foundation, Inc. + Contributed by Filip Kastl <fkastl@suse.cz> + +This file is part of GCC. + +GCC is free software; you can redistribute it and/or modify it under +the terms of the GNU General Public License as published by the Free +Software Foundation; either version 3, or (at your option) any later +version. + +GCC is distributed in the hope that it will be useful, but WITHOUT ANY +WARRANTY; without even the implied warranty of MERCHANTABILITY or +FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License +for more details. + +You should have received a copy of the GNU General Public License +along with GCC; see the file COPYING3. If not see +<http://www.gnu.org/licenses/>. */ + +#include "config.h" +#include "system.h" +#include "coretypes.h" +#include "backend.h" +#include "tree.h" +#include "gimple.h" +#include "tree-pass.h" +#include "ssa.h" +#include "gimple-iterator.h" +#include "vec.h" +#include "hash-set.h" +#include <algorithm> +#include "ssa-iterators.h" +#include "gimple-fold.h" +#include "gimplify.h" +#include "tree-cfg.h" +#include "tree-eh.h" +#include "builtins.h" +#include "tree-ssa-dce.h" +#include "fold-const.h" + +/* Strongly connected copy propagation pass. + + This is a lightweight copy propagation pass that is also able to eliminate + redundant PHI statements. The pass considers the following types of copy + statements: + + 1 An assignment statement with a single argument. + + _3 = _2; + _4 = 5; + + 2 A degenerate PHI statement. A degenerate PHI is a PHI that only refers to + itself or one other value. + + _5 = PHI <_1>; + _6 = PHI <_6, _6, _1, _1>; + _7 = PHI <16, _7>; + + 3 A set of PHI statements that only refer to each other or to one other + value. + + _8 = PHI <_9, _10>; + _9 = PHI <_8, _10>; + _10 = PHI <_8, _9, _1>; + + All of these statements produce copies and can be eliminated from the + program. For a copy statement we identify the value it creates a copy of + and replace references to the statement with the value -- we propagate the + copy. + + _3 = _2; // Replace all occurences of _3 by _2 + + _8 = PHI <_9, _10>; + _9 = PHI <_8, _10>; + _10 = PHI <_8, _9, _1>; // Replace all occurences of _8, _9 and _10 by _1 + + To find all three types of copy statements we use an algorithm based on + strongly-connected components (SCCs) in dataflow graph. The algorithm was + introduced in an article from 2013[1]. We describe the algorithm bellow. + + To identify SCCs we implement the Robert Tarjan's SCC algorithm. For the + SCC computation we wrap potential copy statements in the 'vertex' struct. + To each of these statements we also assign a vertex number ('vxnum'). Since + the main algorithm has to be able to compute SCCs of subgraphs of the whole + dataflow graph we use GIMPLE stmt flags to prevent Tarjan's algorithm from + leaving the subgraph. + + References: + + [1] Simple and Efficient Construction of Static Single Assignmemnt Form, + Braun, Buchwald, Hack, Leissa, Mallon, Zwinkau, 2013, LNCS vol. 7791, + Section 3.2. */ + +/* Bitmap tracking statements which were propagated to be removed at the end of + the pass. */ + +static bitmap dead_stmts; + +/* State of vertex during SCC discovery. + + unvisited Vertex hasn't yet been popped from worklist. + vopen DFS has visited vertex for the first time. Vertex has been put + on Tarjan stack. + closed DFS has backtracked through vertex. At this point, vertex + doesn't have any unvisited neighbors. + in_scc Vertex has been popped from Tarjan stack. */ + +enum vstate +{ + unvisited, + vopen, + closed, + in_scc +}; + +/* Information about a vertex. Used by SCC discovery. */ + +struct vertex +{ + bool active; /* scc_discovery::compute_sccs () only considers a subgraph of + the whole dataflow graph. It uses this flag so that it knows + which vertices are part of this subgraph. */ + vstate state; + unsigned index; + unsigned lowlink; +}; + +/* SCC discovery. + + Used to find SCCs in a dataflow graph. Implements Tarjan's SCC + algorithm. */ + +class scc_discovery +{ +public: + scc_discovery (); + ~scc_discovery (); + auto_vec<vec<gimple *>> compute_sccs (vec<gimple *> &stmts); + +private: + unsigned curr_generation = 0; + vertex* vertices; /* Indexed by SSA_NAME_VERSION. */ + auto_vec<unsigned> worklist; /* DFS stack. */ + auto_vec<unsigned> stack; /* Tarjan stack. */ + + void visit_neighbor (tree neigh_tree, unsigned parent_vxnum); +}; + +scc_discovery::scc_discovery () +{ + /* Create vertex struct for each SSA name. */ + vertices = XNEWVEC (struct vertex, num_ssa_names); + unsigned i = 0; + for (i = 0; i < num_ssa_names; i++) + vertices[i].active = false; +} + +scc_discovery::~scc_discovery () +{ + XDELETEVEC (vertices); +} + +/* Part of 'scc_discovery::compute_sccs ()'. */ + +void +scc_discovery::visit_neighbor (tree neigh_tree, unsigned parent_version) +{ + if (TREE_CODE (neigh_tree) != SSA_NAME) + return; /* Skip any neighbor that isn't an SSA name. */ + unsigned neigh_version = SSA_NAME_VERSION (neigh_tree); + + /* Skip neighbors outside the subgraph that Tarjan currently works + with. */ + if (!vertices[neigh_version].active) + return; + + vstate neigh_state = vertices[neigh_version].state; + vstate parent_state = vertices[parent_version].state; + if (parent_state == vopen) /* We're currently opening parent. */ + { + /* Put unvisited neighbors on worklist. Update lowlink of parent + vertex according to indices of neighbors present on stack. */ + switch (neigh_state) + { + case unvisited: + worklist.safe_push (neigh_version); + break; + case vopen: + case closed: + vertices[parent_version].lowlink + = std::min (vertices[parent_version].lowlink, + vertices[neigh_version].index); + break; + case in_scc: + /* Ignore these edges. */ + break; + } + } + else if (parent_state == closed) /* We're currently closing parent. */ + { + /* Update lowlink of parent vertex according to lowlinks of + children of parent (in terms of DFS tree). */ + if (neigh_state == closed) + { + vertices[parent_version].lowlink + = std::min (vertices[parent_version].lowlink, + vertices[neigh_version].lowlink); + } + } +} + +/* Compute SCCs in dataflow graph on given statements 'stmts'. Ignore + statements outside 'stmts'. Return the SCCs in a reverse topological + order. + + stmt_may_generate_copy () must be true for all statements from 'stmts'! */ + +auto_vec<vec<gimple *>> +scc_discovery::compute_sccs (vec<gimple *> &stmts) +{ + auto_vec<vec<gimple *>> sccs; + + for (gimple *stmt : stmts) + { + unsigned i; + switch (gimple_code (stmt)) + { + case GIMPLE_ASSIGN: + i = SSA_NAME_VERSION (gimple_assign_lhs (stmt)); + break; + case GIMPLE_PHI: + i = SSA_NAME_VERSION (gimple_phi_result (stmt)); + break; + default: + gcc_unreachable (); + } + + vertices[i].index = 0; + vertices[i].lowlink = 0; + vertices[i].state = unvisited; + vertices[i].active = true; /* Mark the subgraph we'll be working on so + that we don't leave it. */ + + worklist.safe_push (i); + } + + /* Worklist loop. */ + unsigned curr_index = 0; + while (!worklist.is_empty ()) + { + unsigned i = worklist.pop (); + gimple *stmt = SSA_NAME_DEF_STMT (ssa_name (i)); + vstate state = vertices[i].state; + + if (state == unvisited) + { + vertices[i].state = vopen; + + /* Assign index to this vertex. */ + vertices[i].index = curr_index; + vertices[i].lowlink = curr_index; + curr_index++; + + /* Put vertex on stack and also on worklist to be closed later. */ + stack.safe_push (i); + worklist.safe_push (i); + } + else if (state == vopen) + vertices[i].state = closed; + + /* Visit neighbors of this vertex. */ + tree op; + gphi *phi; + switch (gimple_code (stmt)) + { + case GIMPLE_PHI: + phi = as_a <gphi *> (stmt); + unsigned j; + for (j = 0; j < gimple_phi_num_args (phi); j++) + { + op = gimple_phi_arg_def (phi, j); + visit_neighbor (op, i); + } + break; + case GIMPLE_ASSIGN: + op = gimple_assign_rhs1 (stmt); + visit_neighbor (op, i); + break; + default: + gcc_unreachable (); + } + + /* If we've just closed a root vertex of an scc, pop scc from stack. */ + if (state == vopen && vertices[i].lowlink == vertices[i].index) + { + vec<gimple *> scc = vNULL; + + unsigned j; + do + { + j = stack.pop (); + scc.safe_push (SSA_NAME_DEF_STMT (ssa_name (j))); + vertices[j].state = in_scc; + } + while (j != i); + + sccs.safe_push (scc); + } + } + + if (!stack.is_empty ()) + gcc_unreachable (); + + /* Clear 'active' flags. */ + for (gimple *stmt : stmts) + { + unsigned i; + switch (gimple_code (stmt)) + { + case GIMPLE_ASSIGN: + i = SSA_NAME_VERSION (gimple_assign_lhs (stmt)); + break; + case GIMPLE_PHI: + i = SSA_NAME_VERSION (gimple_phi_result (stmt)); + break; + default: + gcc_unreachable (); + } + + vertices[i].active = false; + } + + return sccs; +} + +/* Could this statement potentially be a copy statement? + + This pass only considers statements for which this function returns 'true'. + Those are basically PHI functions and assignment statements similar to + + _2 = _1; + or + _2 = 5; */ + +static bool +stmt_may_generate_copy (gimple *stmt) +{ + /* A PHI may generate a copy. */ + if (gimple_code (stmt) == GIMPLE_PHI) + { + gphi *phi = as_a <gphi *> (stmt); + + /* No OCCURS_IN_ABNORMAL_PHI SSA names in lhs nor rhs. */ + if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (gimple_phi_result (phi))) + return false; + + unsigned i; + for (i = 0; i < gimple_phi_num_args (phi); i++) + { + tree op = gimple_phi_arg_def (phi, i); + if (TREE_CODE (op) == SSA_NAME + && SSA_NAME_OCCURS_IN_ABNORMAL_PHI (op)) + return false; + } + + /* If PHI has more than one unique non-SSA arguments, it won't generate a + copy. */ + tree const_op = NULL_TREE; + for (i = 0; i < gimple_phi_num_args (phi); i++) + { + tree op = gimple_phi_arg_def (phi, i); + if (TREE_CODE (op) != SSA_NAME) + { + if (const_op && !operand_equal_p (op, const_op)) + return false; + const_op = op; + } + } + + return true; + } + + /* Or a statement of type _2 = _1; OR _2 = 5; may generate a copy. */ + + if (!gimple_assign_single_p (stmt)) + return false; + + tree lhs = gimple_assign_lhs (stmt); + tree rhs = gimple_assign_rhs1 (stmt); + + if (TREE_CODE (lhs) != SSA_NAME) + return false; + + /* lhs shouldn't flow through any abnormal edges. */ + if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (lhs)) + return false; + + if (is_gimple_min_invariant (rhs)) + return true; /* A statement of type _2 = 5;. */ + + if (TREE_CODE (rhs) != SSA_NAME) + return false; + + /* rhs shouldn't flow through any abnormal edges. */ + if (SSA_NAME_OCCURS_IN_ABNORMAL_PHI (rhs)) + return false; + + /* It is possible that lhs has more alignment or value range information. By + propagating we would lose this information. So in the case that alignment + or value range information differs, we are conservative and do not + propagate. + + FIXME: Propagate alignment and value range info the same way copy-prop + does. */ + if (POINTER_TYPE_P (TREE_TYPE (lhs)) + && POINTER_TYPE_P (TREE_TYPE (rhs)) + && SSA_NAME_PTR_INFO (lhs) != SSA_NAME_PTR_INFO (rhs)) + return false; + if (!POINTER_TYPE_P (TREE_TYPE (lhs)) + && !POINTER_TYPE_P (TREE_TYPE (rhs)) + && SSA_NAME_RANGE_INFO (lhs) != SSA_NAME_RANGE_INFO (rhs)) + return false; + + return true; /* A statement of type _2 = _1;. */ +} + +/* Return all statements in cfun that could generate copies. All statements + for which stmt_may_generate_copy returns 'true'. */ + +static auto_vec<gimple *> +get_all_stmt_may_generate_copy (void) +{ + auto_vec<gimple *> result; + + basic_block bb; + FOR_EACH_BB_FN (bb, cfun) + { + gimple_stmt_iterator gsi; + for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi)) + { + gimple *s = gsi_stmt (gsi); + if (stmt_may_generate_copy (s)) + result.safe_push (s); + } + + gphi_iterator pi; + for (pi = gsi_start_phis (bb); !gsi_end_p (pi); gsi_next (&pi)) + { + gimple *s = pi.phi (); + if (stmt_may_generate_copy (s)) + result.safe_push (s); + } + } + + return result; +} + +/* For each statement from given SCC, replace its usages by value + VAL. */ + +static void +replace_scc_by_value (vec<gimple *> scc, tree val) +{ + for (gimple *stmt : scc) + { + tree name = gimple_get_lhs (stmt); + replace_uses_by (name, val); + bitmap_set_bit (dead_stmts, SSA_NAME_VERSION (name)); + } + + if (dump_file) + fprintf (dump_file, "Replacing SCC of size %d\n", scc.length ()); +} + +/* Part of 'sccopy_propagate ()'. */ + +static void +sccopy_visit_op (tree op, hash_set<tree> &outer_ops, + hash_set<gimple *> &scc_set, bool &is_inner, + tree &last_outer_op) +{ + bool op_in_scc = false; + + if (TREE_CODE (op) == SSA_NAME) + { + gimple *op_stmt = SSA_NAME_DEF_STMT (op); + if (scc_set.contains (op_stmt)) + op_in_scc = true; + } + + if (!op_in_scc) + { + outer_ops.add (op); + last_outer_op = op; + is_inner = false; + } +} + +/* Main function of this pass. Find and propagate all three types of copy + statements (see pass description above). + + This is an implementation of an algorithm from the paper Simple and + Efficient Construction of Static Single Assignmemnt Form[1]. It is based + on strongly-connected components (SCCs) in dataflow graph. The original + algorithm only considers PHI statements. We extend it to also consider + assignment statements of type _2 = _1;. + + The algorithm is based on this definition of a set of redundant PHIs[1]: + + A non-empty set P of PHI functions is redundant iff the PHI functions just + reference each other or one other value + + It uses this lemma[1]: + + Let P be a redundant set of PHI functions. Then there is a + strongly-connected component S subset of P that is also redundant. + + The algorithm works in this way: + + 1 Find SCCs + 2 For each SCC S in topological order: + 3 Construct set 'inner' of statements that only have other statements + from S on their right hand side + 4 Construct set 'outer' of values that originate outside S and appear on + right hand side of some statement from S + 5 If |outer| = 1, outer only contains a value v. Statements in S only + refer to each other or to v -- they are redundant. Propagate v. + Else, recurse on statements in inner. + + The implementation is non-recursive. + + References: + + [1] Simple and Efficient Construction of Static Single Assignmemnt Form, + Braun, Buchwald, Hack, Leissa, Mallon, Zwinkau, 2013, LNCS vol. 7791, + Section 3.2. */ + +static void +sccopy_propagate () +{ + auto_vec<gimple *> useful_stmts = get_all_stmt_may_generate_copy (); + scc_discovery discovery; + + auto_vec<vec<gimple *>> worklist = discovery.compute_sccs (useful_stmts); + + while (!worklist.is_empty ()) + { + vec<gimple *> scc = worklist.pop (); + + auto_vec<gimple *> inner; + hash_set<tree> outer_ops; + tree last_outer_op = NULL_TREE; + + /* Prepare hash set of PHIs in scc to query later. */ + hash_set<gimple *> scc_set; + for (gimple *stmt : scc) + scc_set.add (stmt); + + for (gimple *stmt : scc) + { + bool is_inner = true; + + gphi *phi; + tree op; + + switch (gimple_code (stmt)) + { + case GIMPLE_PHI: + phi = as_a <gphi *> (stmt); + unsigned j; + for (j = 0; j < gimple_phi_num_args (phi); j++) + { + op = gimple_phi_arg_def (phi, j); + sccopy_visit_op (op, outer_ops, scc_set, is_inner, + last_outer_op); + } + break; + case GIMPLE_ASSIGN: + op = gimple_assign_rhs1 (stmt); + sccopy_visit_op (op, outer_ops, scc_set, is_inner, + last_outer_op); + break; + default: + gcc_unreachable (); + } + + if (is_inner) + inner.safe_push (stmt); + } + + if (outer_ops.elements () == 1) + { + /* The only operand in outer_ops. */ + tree outer_op = last_outer_op; + replace_scc_by_value (scc, outer_op); + } + else if (outer_ops.elements () > 1) + { + /* Add inner sccs to worklist. */ + auto_vec<vec<gimple *>> inner_sccs + = discovery.compute_sccs (inner); + for (vec<gimple *> inner_scc : inner_sccs) + worklist.safe_push (inner_scc); + } + else + gcc_unreachable (); + + scc.release (); + } +} + +/* Called when pass execution starts. */ + +static void +init_sccopy (void) +{ + /* For propagated statements. */ + dead_stmts = BITMAP_ALLOC (NULL); +} + +/* Called before pass execution ends. */ + +static void +finalize_sccopy (void) +{ + /* Remove all propagated statements. */ + simple_dce_from_worklist (dead_stmts); + BITMAP_FREE (dead_stmts); + + /* Propagating a constant may create dead eh edges. */ + basic_block bb; + FOR_EACH_BB_FN (bb, cfun) + gimple_purge_dead_eh_edges (bb); +} + +namespace { + +const pass_data pass_data_sccopy = +{ + GIMPLE_PASS, /* type */ + "sccopy", /* name */ + OPTGROUP_NONE, /* optinfo_flags */ + TV_NONE, /* tv_id */ + ( PROP_cfg | PROP_ssa ), /* properties_required */ + 0, /* properties_provided */ + 0, /* properties_destroyed */ + 0, /* todo_flags_start */ + TODO_update_ssa | TODO_cleanup_cfg, /* todo_flags_finish */ +}; + +class pass_sccopy : public gimple_opt_pass +{ +public: + pass_sccopy (gcc::context *ctxt) + : gimple_opt_pass (pass_data_sccopy, ctxt) + {} + + /* opt_pass methods: */ + virtual bool gate (function *) { return true; } + virtual unsigned int execute (function *); + opt_pass * clone () final override { return new pass_sccopy (m_ctxt); } +}; // class pass_sccopy + +unsigned +pass_sccopy::execute (function *) +{ + init_sccopy (); + sccopy_propagate (); + finalize_sccopy (); + return 0; +} + +} // anon namespace + +gimple_opt_pass * +make_pass_sccopy (gcc::context *ctxt) +{ + return new pass_sccopy (ctxt); +} diff --git a/gcc/passes.def b/gcc/passes.def index 1e1950bdb39..048ad3ee7a5 100644 --- a/gcc/passes.def +++ b/gcc/passes.def @@ -100,6 +100,7 @@ along with GCC; see the file COPYING3. If not see NEXT_PASS (pass_if_to_switch); NEXT_PASS (pass_convert_switch); NEXT_PASS (pass_cleanup_eh); + NEXT_PASS (pass_sccopy); NEXT_PASS (pass_profile); NEXT_PASS (pass_local_pure_const); NEXT_PASS (pass_modref); @@ -368,6 +369,7 @@ along with GCC; see the file COPYING3. If not see However, this also causes us to misdiagnose cases that should be real warnings (e.g., testsuite/gcc.dg/pr18501.c). */ NEXT_PASS (pass_cd_dce, false /* update_address_taken_p */); + NEXT_PASS (pass_sccopy); NEXT_PASS (pass_tail_calls); /* Split critical edges before late uninit warning to reduce the number of false positives from it. */ diff --git a/gcc/testsuite/gcc.dg/sccopy-1.c b/gcc/testsuite/gcc.dg/sccopy-1.c new file mode 100644 index 00000000000..1e61a6b320e --- /dev/null +++ b/gcc/testsuite/gcc.dg/sccopy-1.c @@ -0,0 +1,78 @@ +/* { dg-do compile } */ +/* { dg-options "-fgimple -fdump-tree-sccopy -O2" } */ +/* { dg-final { scan-tree-dump "Replacing SCC of size 2" "sccopy1" } } */ + +int __GIMPLE (ssa, startwith ("sccopy")) +main () +{ + int a; + int y; + int x; + int _1; + int _2; + int _13; + + __BB(2): + if (x_7(D) == 5) + goto __BB3; + else + goto __BB4; + + __BB(3): + a_10 = x_7(D); + goto __BB5; + + __BB(4): + a_9 = y_8(D); + goto __BB5; + + __BB(5): + a_3 = __PHI (__BB3: a_10, __BB4: a_9); + if (x_7(D) == y_8(D)) + goto __BB6; + else + goto __BB11; + + __BB(6): + a_11 = a_3 + 1; + goto __BB7; + + __BB(7): + a_4 = __PHI (__BB6: a_11, __BB11: a_6); +label1: + if (x_7(D) != y_8(D)) + goto __BB8; + else + goto __BB10; + + __BB(8): + goto __BB9; + + __BB(9): + a_12 = __PHI (__BB8: a_4, __BB10: a_5); + goto __BB10; + + __BB(10,loop_header(1)): + a_5 = __PHI (__BB7: a_4, __BB9: a_12); +label2: + _1 = y_8(D) * 2; + if (x_7(D) == _1) + goto __BB9; + else + goto __BB11; + + __BB(11): + a_6 = __PHI (__BB5: a_3, __BB10: a_5); + _2 = x_7(D) * 3; + if (y_8(D) == _2) + goto __BB7; + else + goto __BB12; + + __BB(12): + _13 = 0; + return _13; + +} + + diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h index 09e6ada5b2f..94a48461590 100644 --- a/gcc/tree-pass.h +++ b/gcc/tree-pass.h @@ -399,6 +399,7 @@ extern gimple_opt_pass *make_pass_iv_optimize (gcc::context *ctxt); extern gimple_opt_pass *make_pass_tree_loop_done (gcc::context *ctxt); extern gimple_opt_pass *make_pass_ch (gcc::context *ctxt); extern gimple_opt_pass *make_pass_ch_vect (gcc::context *ctxt); +extern gimple_opt_pass *make_pass_sccopy (gcc::context *ctxt); extern gimple_opt_pass *make_pass_ccp (gcc::context *ctxt); extern gimple_opt_pass *make_pass_split_paths (gcc::context *ctxt); extern gimple_opt_pass *make_pass_build_ssa (gcc::context *ctxt);