From patchwork Thu Nov 12 21:57:54 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Law X-Patchwork-Id: 543716 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 9740D141429 for ; Fri, 13 Nov 2015 08:58:10 +1100 (AEDT) Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b=dS7Rxbok; dkim-atps=neutral DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :subject:to:references:cc:from:message-id:date:mime-version :in-reply-to:content-type; q=dns; s=default; b=nD7MOgPayFqRue+Xz oNBipG3HE33MeCG8JbLJCKnPTFePpnqAGj1o9KE8oEO6vE4ymlGS9VZ3Vc1+5JGy +DwtnQghrmEEwgL5RktmPPgmKsT3WwqGNVvsF6Bp8Zcs7lNsHF52h6yJZgyCi7BL Z28jcqeflo8fOk70OqtpIiW3zc= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :subject:to:references:cc:from:message-id:date:mime-version :in-reply-to:content-type; s=default; bh=KaVkFk0jtWw1+VQkPusuS3R XPiI=; b=dS7RxbokG6FpZ0sGYy3ldrZ9eQLaWyzn6Z33q8Bq57bvP1+Vt1SYwz7 O5YPI+0iSdcWXP1b+81Rsqvx+NLiS8ifzXYE6dnRiDZ71+1MBa/zjB03V4WzHJiD aY+UMWif2bR9x7QRymrvA8M2w7FFMpQds+OQor9tqRxosfiDA1JA= Received: (qmail 71687 invoked by alias); 12 Nov 2015 21:58:00 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 71676 invoked by uid 89); 12 Nov 2015 21:57:59 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-1.9 required=5.0 tests=AWL, BAYES_00, RP_MATCHES_RCVD, SPF_HELO_PASS autolearn=ham version=3.3.2 X-HELO: mx1.redhat.com Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES256-GCM-SHA384 encrypted) ESMTPS; Thu, 12 Nov 2015 21:57:56 +0000 Received: from int-mx10.intmail.prod.int.phx2.redhat.com (int-mx10.intmail.prod.int.phx2.redhat.com [10.5.11.23]) by mx1.redhat.com (Postfix) with ESMTPS id 538BDC0AED51; Thu, 12 Nov 2015 21:57:55 +0000 (UTC) Received: from localhost.localdomain (ovpn-113-52.phx2.redhat.com [10.3.113.52]) by int-mx10.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id tACLvspe025865; Thu, 12 Nov 2015 16:57:54 -0500 Subject: Re: [Patch, tree-optimization]: Add new path Splitting pass on tree ssa representation To: Richard Biener References: <37378DC5BCD0EE48BA4B082E0B55DFAA41F3F56C@XAP-PVEXMBX02.xlnx.xilinx.com> <37378DC5BCD0EE48BA4B082E0B55DFAA41F41150@XAP-PVEXMBX02.xlnx.xilinx.com> <37378DC5BCD0EE48BA4B082E0B55DFAA41F42541@XAP-PVEXMBX02.xlnx.xilinx.com> <37378DC5BCD0EE48BA4B082E0B55DFAA4295155D@XAP-PVEXMBX02.xlnx.xilinx.com> <37378DC5BCD0EE48BA4B082E0B55DFAA4295ADCB@XAP-PVEXMBX02.xlnx.xilinx.com> <55D4F921.2020708@redhat.com> <37378DC5BCD0EE48BA4B082E0B55DFAA4297704C@XAP-PVEXMBX02.xlnx.xilinx.com> <5643A732.4040707@redhat.com> <5644C6CC.90203@redhat.com> <5644DB59.9040809@redhat.com> Cc: Ajit Kumar Agarwal , GCC Patches , Vinod Kathail , Shail Aditya Gupta , Vidhumouli Hunsigida , Nagaraju Mekala From: Jeff Law Message-ID: <56450B62.4090404@redhat.com> Date: Thu, 12 Nov 2015 14:57:54 -0700 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.3.0 MIME-Version: 1.0 In-Reply-To: <5644DB59.9040809@redhat.com> X-IsSubscribed: yes On 11/12/2015 11:32 AM, Jeff Law wrote: > On 11/12/2015 10:05 AM, Jeff Law wrote: >>> But IIRC you mentioned it should enable vectorization or so? In this >>> case >>> that's obviously too late. >> The opposite. Path splitting interferes with if-conversion & >> vectorization. Path splitting mucks up the CFG enough that >> if-conversion won't fire and as a result vectorization is inhibited. It >> also creates multi-latch loops, which isn't a great situation either. >> >> It *may* be the case that dropping it that far down in the pipeline and >> making the modifications necessary to handle simple latches may in turn >> make the path splitting code play better with if-conversion and >> vectorization and avoid creation of multi-latch loops. At least that's >> how it looks on paper when I draw out the CFG manipulations. >> >> I'll do some experiments. > It doesn't look too terrible to ravamp the recognition code to work > later in the pipeline with simple latches. Sadly that doesn't seem to > have fixed the bad interactions with if-conversion. > > *But* that does open up the possibility of moving the path splitting > pass even deeper in the pipeline -- in particular we can move it past > the vectorizer. Which is may be a win. > > So the big question is whether or not we'll still see enough benefits > from having it so late in the pipeline. It's still early enough that we > get DOM, VRP, reassoc, forwprop, phiopt, etc. > > Ajit, I'll pass along an updated patch after doing some more testing. So here's what I'm working with. It runs after the vectorizer now. Ajit, if you could benchmark this it would be greatly appreciated. I know you saw significant improvements on one or more benchmarks in the past. It'd be good to know that the updated placement of the pass doesn't invalidate the gains you saw. With the updated pass placement, we don't have to worry about switching the pass on/off based on whether or not the vectorizer & if-conversion are enabled. So that hackery is gone. I think I've beefed up the test to identify the diamond patterns we want so that it's stricter in what we accept. The call to ignore_bb_p is a part of that test so that we're actually looking at the right block in a world where we're doing this transformation with simple latches. I've also put a graphical comment before perform_path_splitting which hopefully shows the CFG transformation we're making a bit clearer. This bootstraps and regression tests cleanly on x86_64-linux-gnu. diff --git a/gcc/Makefile.in b/gcc/Makefile.in index 34d2356..6613e83 100644 --- a/gcc/Makefile.in +++ b/gcc/Makefile.in @@ -1474,6 +1474,7 @@ OBJS = \ tree-ssa-loop.o \ tree-ssa-math-opts.o \ tree-ssa-operands.o \ + tree-ssa-path-split.o \ tree-ssa-phionlycprop.o \ tree-ssa-phiopt.o \ tree-ssa-phiprop.o \ diff --git a/gcc/common.opt b/gcc/common.opt index 757ce85..3e946ca 100644 --- a/gcc/common.opt +++ b/gcc/common.opt @@ -2403,6 +2403,10 @@ ftree-vrp Common Report Var(flag_tree_vrp) Init(0) Optimization Perform Value Range Propagation on trees. +ftree-path-split +Common Report Var(flag_tree_path_split) Init(0) Optimization +Perform Path Splitting on trees for loop backedges. + funit-at-a-time Common Report Var(flag_unit_at_a_time) Init(1) Compile whole compilation unit at a time. diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index 213a9d0..b1e95da 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -354,6 +354,7 @@ Objective-C and Objective-C++ Dialects}. -fdump-tree-fre@r{[}-@var{n}@r{]} @gol -fdump-tree-vtable-verify @gol -fdump-tree-vrp@r{[}-@var{n}@r{]} @gol +-fdump-tree-path-split@r{[}-@var{n}@r{]} @gol -fdump-tree-storeccp@r{[}-@var{n}@r{]} @gol -fdump-final-insns=@var{file} @gol -fcompare-debug@r{[}=@var{opts}@r{]} -fcompare-debug-second @gol @@ -462,7 +463,7 @@ Objective-C and Objective-C++ Dialects}. -ftree-parallelize-loops=@var{n} -ftree-pre -ftree-partial-pre -ftree-pta @gol -ftree-reassoc -ftree-sink -ftree-slsr -ftree-sra @gol -ftree-switch-conversion -ftree-tail-merge -ftree-ter @gol --ftree-vectorize -ftree-vrp @gol +-ftree-vectorize -ftree-vrp @gol -ftree-path-split @gol -funit-at-a-time -funroll-all-loops -funroll-loops @gol -funsafe-loop-optimizations -funsafe-math-optimizations -funswitch-loops @gol -fipa-ra -fvariable-expansion-in-unroller -fvect-cost-model -fvpt @gol @@ -7169,6 +7170,11 @@ output on to @file{stderr}. If two conflicting dump filenames are given for the same pass, then the latter option overrides the earlier one. +@item path-split +@opindex fdump-tree-path-split +Dump each function after path splitting. The file name is made by +appending @file{.path-split} to the source file name. + @item all Turn on all options, except @option{raw}, @option{slim}, @option{verbose} and @option{lineno}. @@ -7811,6 +7817,7 @@ also turns on the following optimization flags: -ftree-switch-conversion -ftree-tail-merge @gol -ftree-pre @gol -ftree-vrp @gol +-ftree-path-split @gol -fipa-ra} Please note the warning under @option{-fgcse} about @@ -8819,7 +8826,7 @@ currently enabled, but may be enabled by @option{-O2} in the future. @item -ftree-sink @opindex ftree-sink -Perform forward store motion on trees. This flag is +Perform forward store motion on trees. This flag is enabled by default at @option{-O} and higher. @item -ftree-bit-ccp @@ -9125,6 +9132,13 @@ enabled by default at @option{-O2} and higher. Null pointer check elimination is only done if @option{-fdelete-null-pointer-checks} is enabled. +@item -ftree-path-split +@opindex ftree-path-split +Perform Path Splitting on trees. When the two execution paths of a +if-then-else merge at the loop latch node, try to duplicate the +merge node into two paths. This is enabled by default at @option{-O2} +and above. + @item -fsplit-ivs-in-unroller @opindex fsplit-ivs-in-unroller Enables expression of values of induction variables in later iterations diff --git a/gcc/opts.c b/gcc/opts.c index 9a3fbb3..9a0b27c 100644 --- a/gcc/opts.c +++ b/gcc/opts.c @@ -509,6 +509,7 @@ static const struct default_options default_options_table[] = { OPT_LEVELS_2_PLUS, OPT_fisolate_erroneous_paths_dereference, NULL, 1 }, { OPT_LEVELS_2_PLUS, OPT_fipa_ra, NULL, 1 }, { OPT_LEVELS_2_PLUS, OPT_flra_remat, NULL, 1 }, + { OPT_LEVELS_2_PLUS, OPT_ftree_path_split, NULL, 1 }, /* -O3 optimizations. */ { OPT_LEVELS_3_PLUS, OPT_ftree_loop_distribute_patterns, NULL, 1 }, diff --git a/gcc/passes.def b/gcc/passes.def index c0ab6b9..4c9ef5f 100644 --- a/gcc/passes.def +++ b/gcc/passes.def @@ -274,6 +274,7 @@ along with GCC; see the file COPYING3. If not see POP_INSERT_PASSES () NEXT_PASS (pass_simduid_cleanup); NEXT_PASS (pass_lower_vector_ssa); + NEXT_PASS (pass_path_split); NEXT_PASS (pass_cse_reciprocals); NEXT_PASS (pass_reassoc); NEXT_PASS (pass_strength_reduction); diff --git a/gcc/testsuite/gcc.dg/tree-ssa/path-split-1.c b/gcc/testsuite/gcc.dg/tree-ssa/path-split-1.c new file mode 100644 index 0000000..c7e9515 --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/path-split-1.c @@ -0,0 +1,67 @@ +/* { dg-do run } */ +/* { dg-options "-O2 -fdump-tree-path-split-details " } */ + +#include +#include + +#define RGBMAX 255 + +int +test() +{ + int i, Pels; + unsigned char sum = 0; + unsigned char xr, xg, xb; + unsigned char xc, xm, xy, xk; + unsigned char *ReadPtr, *EritePtr; + + ReadPtr = ( unsigned char *) malloc (sizeof (unsigned char) * 100); + EritePtr = ( unsigned char *) malloc (sizeof (unsigned char) * 100); + + for (i = 0; i < 100;i++) + { + ReadPtr[i] = 100 - i; + } + + for (i = 0; i < 100; i++) + { + xr = *ReadPtr++; + xg = *ReadPtr++; + xb = *ReadPtr++; + + xc = (unsigned char) (RGBMAX - xr); + xm = (unsigned char) (RGBMAX - xg); + xy = (unsigned char) (RGBMAX - xb); + + if (xc < xm) + { + xk = (unsigned char) (xc < xy ? xc : xy); + } + else + { + xk = (unsigned char) (xm < xy ? xm : xy); + } + + xc = (unsigned char) (xc - xk); + xm = (unsigned char) (xm - xk); + xy = (unsigned char) (xy - xk); + + *EritePtr++ = xc; + *EritePtr++ = xm; + *EritePtr++ = xy; + *EritePtr++ = xk; + sum += *EritePtr; + } + return sum; +} + +int +main() +{ + if (test() != 33) + abort(); + + return 0; +} + +/* { dg-final { scan-tree-dump "Duplicating join block" "path-split" } } */ diff --git a/gcc/timevar.def b/gcc/timevar.def index b429faf..3dba68b 100644 --- a/gcc/timevar.def +++ b/gcc/timevar.def @@ -300,3 +300,4 @@ DEFTIMEVAR (TV_LINK , "link JIT code") DEFTIMEVAR (TV_LOAD , "load JIT result") DEFTIMEVAR (TV_JIT_ACQUIRING_MUTEX , "acquiring JIT mutex") DEFTIMEVAR (TV_JIT_CLIENT_CODE , "JIT client code") +DEFTIMEVAR (TV_TREE_PATH_SPLIT , "tree path split") diff --git a/gcc/tracer.c b/gcc/tracer.c index 941dc20..c7b5150 100644 --- a/gcc/tracer.c +++ b/gcc/tracer.c @@ -51,9 +51,9 @@ #include "tree-inline.h" #include "cfgloop.h" #include "fibonacci_heap.h" +#include "tracer.h" static int count_insns (basic_block); -static bool ignore_bb_p (const_basic_block); static bool better_p (const_edge, const_edge); static edge find_best_successor (basic_block); static edge find_best_predecessor (basic_block); @@ -85,7 +85,7 @@ bb_seen_p (basic_block bb) } /* Return true if we should ignore the basic block for purposes of tracing. */ -static bool +bool ignore_bb_p (const_basic_block bb) { if (bb->index < NUM_FIXED_BLOCKS) @@ -226,6 +226,24 @@ find_trace (basic_block bb, basic_block *trace) return i; } +/* Duplicate block BB2, placing it after BB in the CFG. Return the + newly created block. */ +basic_block +transform_duplicate (basic_block bb, basic_block bb2) +{ + edge e; + basic_block copy; + + e = find_edge (bb, bb2); + + copy = duplicate_block (bb2, e, bb); + flush_pending_stmts (e); + + add_phi_args_after_copy (©, 1, NULL); + + return (copy); +} + /* Look for basic blocks in frequency order, construct traces and tail duplicate if profitable. */ @@ -321,17 +339,8 @@ tail_duplicate (void) entries or at least rotate the loop. */ && bb2->loop_father->header != bb2) { - edge e; - basic_block copy; - - nduplicated += counts [bb2->index]; - - e = find_edge (bb, bb2); - - copy = duplicate_block (bb2, e, bb); - flush_pending_stmts (e); - - add_phi_args_after_copy (©, 1, NULL); + nduplicated += counts [bb2->index]; + basic_block copy = transform_duplicate (bb, bb2); /* Reconsider the original copy of block we've duplicated. Removing the most common predecessor may make it to be diff --git a/gcc/tracer.h b/gcc/tracer.h new file mode 100644 index 0000000..cd1792a --- /dev/null +++ b/gcc/tracer.h @@ -0,0 +1,26 @@ +/* Header file for Tracer. + Copyright (C) 2015 Free Software Foundation, Inc. + +This file is part of GCC. + +GCC is free software; you can redistribute it and/or modify it under +the terms of the GNU General Public License as published by the Free +Software Foundation; either version 3, or (at your option) any later +version. + +GCC is distributed in the hope that it will be useful, but WITHOUT ANY +WARRANTY; without even the implied warranty of MERCHANTABILITY or +FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License + for more details. + +You should have received a copy of the GNU General Public License +along with GCC; see the file COPYING3. If not see +. */ + +#ifndef GCC_TRACER_H +#define GCC_TRACER_H + +extern basic_block transform_duplicate (basic_block bb, basic_block bb2); +extern bool ignore_bb_p (const_basic_block bb); + +#endif /* GCC_TRACER_H */ diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h index 49e22a9..6963acc 100644 --- a/gcc/tree-pass.h +++ b/gcc/tree-pass.h @@ -390,6 +390,7 @@ extern gimple_opt_pass *make_pass_tree_loop_done (gcc::context *ctxt); extern gimple_opt_pass *make_pass_ch (gcc::context *ctxt); extern gimple_opt_pass *make_pass_ch_vect (gcc::context *ctxt); extern gimple_opt_pass *make_pass_ccp (gcc::context *ctxt); +extern gimple_opt_pass *make_pass_path_split (gcc::context *ctxt); extern gimple_opt_pass *make_pass_phi_only_cprop (gcc::context *ctxt); extern gimple_opt_pass *make_pass_build_ssa (gcc::context *ctxt); extern gimple_opt_pass *make_pass_build_alias (gcc::context *ctxt); diff --git a/gcc/tree-ssa-path-split.c b/gcc/tree-ssa-path-split.c new file mode 100644 index 0000000..9f61bd4 --- /dev/null +++ b/gcc/tree-ssa-path-split.c @@ -0,0 +1,275 @@ +/* Support routines for Path Splitting. + Copyright (C) 2015 Free Software Foundation, Inc. + Contributed by Ajit Kumar Agarwal . + + This file is part of GCC. + + GCC is free software; you can redistribute it and/or modify + it under the terms of the GNU General Public License as published by + the Free Software Foundation; either version 3, or (at your option) + any later version. + +GCC is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. + +You should have received a copy of the GNU General Public License +along with GCC; see the file COPYING3. If not see +. */ + +#include "config.h" +#include "system.h" +#include "coretypes.h" +#include "backend.h" +#include "tree.h" +#include "gimple.h" +#include "tree-pass.h" +#include "cfganal.h" +#include "cfgloop.h" +#include "gimple-iterator.h" +#include "tracer.h" + +/* Given LATCH, the latch block in a loop, see if the shape of the + path reaching LATCH is suitable for path splitting. If so, return + the block that will be duplicated into its predecessor paths. Else + return NULL. */ + +static basic_block +find_block_to_duplicate_for_path_splitting (basic_block latch) +{ + /* We should have simple latches at this point. So the latch should + have a single successor. This implies the predecessor of the latch + likely has the loop exit. And it's that predecessor we're most + interested in. To keep things simple, we're going to require that + the latch have a single predecessor too. */ + if (single_succ_p (latch) && single_pred_p (latch)) + { + basic_block bb = get_immediate_dominator (CDI_DOMINATORS, latch); + gcc_assert (single_pred_edge (latch)->src == bb); + + /* If BB has been marked as not to be duplicated, then honor that + request. */ + if (ignore_bb_p (bb)) + return NULL; + + gimple *last = gsi_stmt (gsi_last_nondebug_bb (bb)); + /* The immediate dominator of the latch must end in a conditional. */ + if (!last || gimple_code (last) != GIMPLE_COND) + return NULL; + + /* We're hoping that BB is a join point for an IF-THEN-ELSE diamond + region. Verify that it is. + + First, verify that BB has two predecessors (each arm of the + IF-THEN-ELSE) and two successors (the latch and exit). */ + if (EDGE_COUNT (bb->preds) == 2 && EDGE_COUNT (bb->succs) == 2) + { + /* Now verify that BB's immediate dominator ends in a + conditional as well. */ + basic_block bb_idom = get_immediate_dominator (CDI_DOMINATORS, bb); + gimple *last = gsi_stmt (gsi_last_nondebug_bb (bb_idom)); + if (!last || gimple_code (last) != GIMPLE_COND) + return NULL; + + /* And that BB's immediate dominator's successors are the + the predecessors of BB. */ + if (!find_edge (bb_idom, EDGE_PRED (bb, 0)->src) + || !find_edge (bb_idom, EDGE_PRED (bb, 1)->src)) + return NULL; + + /* So at this point we have a simple diamond for an IF-THEN-ELSE + construct starting at BB_IDOM, with a join point at BB. BB + pass control outside the loop or to the loop latch. + + We're going to want to create two duplicates of BB, one for + each successor of BB_IDOM. */ + return bb; + } + } + return NULL; +} + +/* Return TRUE if BB is a reasonable block to duplicate by examining + its size, false otherwise. BB will always be a loop latch block. + + Should this use the same tests as we do for jump threading? */ + +static bool +is_feasible_trace (basic_block bb) +{ + int num_stmt = 0; + gimple_stmt_iterator gsi; + + for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi)) + { + gimple *stmt = gsi_stmt (gsi); + if (!is_gimple_debug (stmt)) + num_stmt++; + } + + /* We may want to limit how many statements we copy. */ + if (num_stmt > 1) + return true; + + return false; +} + +/* If the immediate dominator of the latch of the loop is + block with conditional branch, then the loop latch is + duplicated to its predecessors path preserving the SSA + semantics. + + CFG before transformation. + + 2 + | + | + +---->3 + | / \ + | / \ + | 4 5 + | \ / + | \ / + | 6 + | / \ + | / \ + | 8 7 + | | | + ---+ E + + + + Block 8 is the latch. We're going to make copies of block 6 (9 & 10) + and wire things up so they look like this: + + 2 + | + | + +---->3 + | / \ + | / \ + | 4 5 + | | | + | | | + | 9 10 + | |\ /| + | | \ / | + | | 7 | + | | | | + | | E | + | | | + | \ / + | \ / + +-----8 + + + Blocks 9 and 10 will get merged into blocks 4 & 5 respectively which + enables CSE, DCE and other optimizations to occur on a larger block + of code. */ + +static bool +perform_path_splitting () +{ + bool changed = false; + loop_p loop; + + loop_optimizer_init (LOOPS_NORMAL | LOOPS_HAVE_RECORDED_EXITS); + initialize_original_copy_tables (); + calculate_dominance_info (CDI_DOMINATORS); + + FOR_EACH_LOOP (loop, LI_FROM_INNERMOST) + { + /* See if there is a block that we can duplicate to split the + path to the loop latch. */ + basic_block bb = find_block_to_duplicate_for_path_splitting (loop->latch); + + /* BB is the merge point for an IF-THEN-ELSE we want to transform. + + Essentially we want to create two duplicates of BB and append + a duplicate to the THEN and ELSE clauses. This will split the + path leading to the latch. BB will be unreachable and removed. */ + if (bb && is_feasible_trace (bb)) + { + if (dump_file && (dump_flags & TDF_DETAILS)) + fprintf (dump_file, + "Duplicating join block %d into predecessor paths\n", + bb->index); + basic_block pred0 = EDGE_PRED (bb, 0)->src; + basic_block pred1 = EDGE_PRED (bb, 1)->src; + transform_duplicate (pred0, bb); + transform_duplicate (pred1, bb); + changed = true; + } + } + + loop_optimizer_finalize (); + free_original_copy_tables (); + return changed; +} + +/* Main entry point for path splitting. Returns TODO_cleanup_cfg if any + paths where split, otherwise return zero. */ + +static unsigned int +execute_path_split (void) +{ + /* If we don't have at least 2 real blocks and backedges in the + CFG, then there's no point in trying to perform path splitting. */ + if (n_basic_blocks_for_fn (cfun) <= NUM_FIXED_BLOCKS + 1 + || !mark_dfs_back_edges ()) + return 0; + + bool changed = perform_path_splitting(); + if (changed) + { + free_dominance_info (CDI_DOMINATORS); + /* If we changed the CFG schedule loops for fixup by cleanup_cfg. */ + if (current_loops) + loops_state_set (LOOPS_NEED_FIXUP); + } + + return changed ? TODO_cleanup_cfg : 0; +} + +static bool +gate_path_split (void) +{ + return flag_tree_path_split; +} + +namespace { + +const pass_data pass_data_path_split = +{ + GIMPLE_PASS, /* type */ + "path-split", /* name */ + OPTGROUP_NONE, /* optinfo_flags */ + TV_TREE_PATH_SPLIT, /* tv_id */ + PROP_ssa, /* properties_required */ + 0, /* properties_provided */ + 0, /* properties_destroyed */ + 0, /* todo_flags_start */ + TODO_update_ssa, /* todo_flags_finish */ +}; + +class pass_path_split : public gimple_opt_pass +{ + public: + pass_path_split (gcc::context *ctxt) + : gimple_opt_pass (pass_data_path_split, ctxt) + {} + /* opt_pass methods: */ + opt_pass * clone () { return new pass_path_split (m_ctxt); } + virtual bool gate (function *) { return gate_path_split (); } + virtual unsigned int execute (function *) { return execute_path_split (); } + +}; // class pass_path_split + +} // anon namespace + +gimple_opt_pass * +make_pass_path_split (gcc::context *ctxt) +{ + return new pass_path_split (ctxt); +}