From patchwork Tue Nov 25 11:27:34 2014
X-Patchwork-Submitter: Tom de Vries
X-Patchwork-Id: 414634
Message-ID: <547467A6.9@mentor.com>
Date: Tue, 25 Nov 2014 12:27:34 +0100
From: Tom de Vries <Tom_deVries@mentor.com>
To: GCC Patches
CC: Richard Biener, Jakub Jelinek, Thomas Schwinge
Subject: Re: [PATCH, 3/8] Add pass_ch_oacc_kernels to pass_oacc_kernels
References: <546743BC.5070804@mentor.com> <54678B85.60005@mentor.com>
In-Reply-To: <54678B85.60005@mentor.com>

On 15-11-14 18:21, Tom de Vries wrote:
> On 15-11-14 13:14, Tom de Vries wrote:
>> Hi,
>>
>> I'm submitting a patch series with initial support for the oacc kernels
>> directive.
>>
>> The patch series uses pass_parallelize_loops to implement parallelization of
>> loops in the oacc kernels region.
>>
>> The patch series consists of these 8 patches:
>> ...
>>      1  Expand oacc kernels after pass_build_ealias
>>      2  Add pass_oacc_kernels
>>      3  Add pass_ch_oacc_kernels to pass_oacc_kernels
>>      4  Add pass_tree_loop_{init,done} to pass_oacc_kernels
>>      5  Add pass_loop_im to pass_oacc_kernels
>>      6  Add pass_ccp to pass_oacc_kernels
>>      7  Add pass_parloops_oacc_kernels to pass_oacc_kernels
>>      8  Do simple omp lowering for no address taken var
>> ...
>
> This patch adds a pass_ch_oacc_kernels to the pass group pass_oacc_kernels.
>
> The idea is that pass_parallelize_loops only deals with loops for which the
> header has been copied, so the easiest way to meet that requirement when running
> pass_parallelize_loops in group pass_oacc_kernels, is to run pass_ch as a part
> of pass_oacc_kernels.
>
> We define a separate pass pass_ch_oacc_kernels, to leave all loops that aren't
> part of a kernels region alone.
>

Updated for moving pass_oacc_kernels down past pass_fre in the pass list.

Bootstrapped and reg-tested as before.

OK for trunk?

Thanks,
- Tom

[PATCH 3/7] Add pass_ch_oacc_kernels to pass_oacc_kernels

2014-11-25  Tom de Vries

        * omp-low.c (loop_in_oacc_kernels_region_p): New function.
        * omp-low.h (loop_in_oacc_kernels_region_p): Declare.
        * passes.def: Add pass_ch_oacc_kernels to pass group pass_oacc_kernels.
        * tree-pass.h (make_pass_ch_oacc_kernels): Declare.
        * tree-ssa-loop-ch.c: Include omp-low.h.
        (pass_ch_execute): Declare.
        (pass_ch::execute): Factor out ...
        (pass_ch_execute): ... this new function.  If handling oacc kernels,
        skip loops that are not in oacc kernels region.
        (pass_data_ch_oacc_kernels): New pass_data.
        (class pass_ch_oacc_kernels): New pass.
        (pass_ch_oacc_kernels::execute, make_pass_ch_oacc_kernels): New
        functions.
---
 gcc/omp-low.c          | 83 ++++++++++++++++++++++++++++++++++++++++++++++++++
 gcc/omp-low.h          |  2 ++
 gcc/passes.def         |  1 +
 gcc/tree-pass.h        |  1 +
 gcc/tree-ssa-loop-ch.c | 59 +++++++++++++++++++++++++++++++++--
 5 files changed, 144 insertions(+), 2 deletions(-)

diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index 3ac546c..543dd48 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -13912,4 +13912,87 @@ gimple_stmt_omp_data_i_init_p (gimple stmt)
                                       SSA_OP_DEF);
 }
 
+/* Return true if LOOP is inside a kernels region.  */
+
+bool
+loop_in_oacc_kernels_region_p (struct loop *loop, basic_block *region_entry,
+                               basic_block *region_exit)
+{
+  bitmap excludes_bitmap = BITMAP_GGC_ALLOC ();
+  bitmap region_bitmap = BITMAP_GGC_ALLOC ();
+  bitmap_clear (region_bitmap);
+
+  if (region_entry != NULL)
+    *region_entry = NULL;
+  if (region_exit != NULL)
+    *region_exit = NULL;
+
+  basic_block bb;
+  gimple last;
+  FOR_EACH_BB_FN (bb, cfun)
+    {
+      if (bitmap_bit_p (region_bitmap, bb->index))
+        continue;
+
+      last = last_stmt (bb);
+      if (!last)
+        continue;
+
+      if (gimple_code (last) != GIMPLE_OACC_KERNELS)
+        continue;
+
+      bitmap_clear (excludes_bitmap);
+      bitmap_set_bit (excludes_bitmap, bb->index);
+
+      vec<basic_block> dominated
+        = get_all_dominated_blocks (CDI_DOMINATORS, bb);
+
+      unsigned di;
+      basic_block dom;
+
+      basic_block end_region = NULL;
+      FOR_EACH_VEC_ELT (dominated, di, dom)
+        {
+          if (dom == bb)
+            continue;
+
+          last = last_stmt (dom);
+          if (!last)
+            continue;
+
+          if (gimple_code (last) != GIMPLE_OMP_RETURN)
+            continue;
+
+          if (end_region == NULL
+              || dominated_by_p (CDI_DOMINATORS, end_region, dom))
+            end_region = dom;
+        }
+
+      vec<basic_block> excludes
+        = get_all_dominated_blocks (CDI_DOMINATORS, end_region);
+
+      unsigned di2;
+      basic_block exclude;
+
+      FOR_EACH_VEC_ELT (excludes, di2, exclude)
+        if (exclude != end_region)
+          bitmap_set_bit (excludes_bitmap, exclude->index);
+
+      FOR_EACH_VEC_ELT (dominated, di, dom)
+        if (!bitmap_bit_p (excludes_bitmap, dom->index))
+          bitmap_set_bit (region_bitmap, dom->index);
+
+      if (bitmap_bit_p (region_bitmap, loop->header->index))
+        {
+          if (region_entry != NULL)
+            *region_entry = bb;
+          if (region_exit != NULL)
+            *region_exit = end_region;
+          return true;
+        }
+    }
+
+  return false;
+}
+
 #include "gt-omp-low.h"
diff --git a/gcc/omp-low.h b/gcc/omp-low.h
index 32076e4..30df867 100644
--- a/gcc/omp-low.h
+++ b/gcc/omp-low.h
@@ -29,6 +29,8 @@ extern tree omp_reduction_init (tree, tree);
 extern bool make_gimple_omp_edges (basic_block, struct omp_region **, int *);
 extern void omp_finish_file (void);
 extern bool gimple_stmt_omp_data_i_init_p (gimple);
+extern bool loop_in_oacc_kernels_region_p (struct loop *, basic_block *,
+                                           basic_block *);
 
 extern GTY(()) vec<tree, va_gc> *offload_funcs;
 extern GTY(()) vec<tree, va_gc> *offload_vars;
diff --git a/gcc/passes.def b/gcc/passes.def
index efb3d8c..01368bb 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -90,6 +90,7 @@ along with GCC; see the file COPYING3.  If not see
              function.  */
           NEXT_PASS (pass_oacc_kernels);
           PUSH_INSERT_PASSES_WITHIN (pass_oacc_kernels)
+              NEXT_PASS (pass_ch_oacc_kernels);
               NEXT_PASS (pass_expand_omp_ssa);
           POP_INSERT_PASSES ()
           NEXT_PASS (pass_merge_phi);
diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
index d63ab2b..dd1f308 100644
--- a/gcc/tree-pass.h
+++ b/gcc/tree-pass.h
@@ -378,6 +378,7 @@ extern gimple_opt_pass *make_pass_loop_prefetch (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_iv_optimize (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_tree_loop_done (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_ch (gcc::context *ctxt);
+extern gimple_opt_pass *make_pass_ch_oacc_kernels (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_ccp (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_phi_only_cprop (gcc::context *ctxt);
 extern gimple_opt_pass *make_pass_build_ssa (gcc::context *ctxt);
diff --git a/gcc/tree-ssa-loop-ch.c b/gcc/tree-ssa-loop-ch.c
index 300b2fa..8f91552 100644
--- a/gcc/tree-ssa-loop-ch.c
+++ b/gcc/tree-ssa-loop-ch.c
@@ -48,12 +48,15 @@ along with GCC; see the file COPYING3.  If not see
 #include "tree-inline.h"
 #include "flags.h"
 #include "tree-ssa-threadedge.h"
+#include "omp-low.h"
 
 /* Duplicates headers of loops if they are small enough, so that the statements
    in the loop body are always executed when the loop is entered.  This
    increases effectiveness of code motion optimizations, and reduces the need
    for loop preconditioning.  */
 
+static unsigned int pass_ch_execute (function *, bool);
+
 /* Check whether we should duplicate HEADER of LOOP.  At most *LIMIT
    instructions should be duplicated, limit is decreased by the actual
    amount.  */
@@ -172,6 +175,14 @@ public:
 unsigned int
 pass_ch::execute (function *fun)
 {
+  return pass_ch_execute (fun, false);
+}
+
+} // anon namespace
+
+static unsigned int
+pass_ch_execute (function *fun, bool oacc_kernels_p)
+{
   struct loop *loop;
   basic_block header;
   edge exit, entry;
@@ -205,6 +216,10 @@ pass_ch::execute (function *fun)
       if (do_while_loop_p (loop))
         continue;
 
+      if (oacc_kernels_p
+          && !loop_in_oacc_kernels_region_p (loop, NULL, NULL))
+        continue;
+
       /* Iterate the header copying up to limit; this takes care of the cases
          like while (a && b) {...}, where we want to have both of the conditions
          copied.  TODO -- handle while (a || b) - like cases, by not requiring
@@ -295,10 +310,50 @@ pass_ch::execute (function *fun)
   return 0;
 }
 
-} // anon namespace
-
 gimple_opt_pass *
 make_pass_ch (gcc::context *ctxt)
 {
   return new pass_ch (ctxt);
 }
+
+namespace {
+
+const pass_data pass_data_ch_oacc_kernels =
+{
+  GIMPLE_PASS, /* type */
+  "ch_oacc_kernels", /* name */
+  OPTGROUP_LOOP, /* optinfo_flags */
+  TV_TREE_CH, /* tv_id */
+  ( PROP_cfg | PROP_ssa ), /* properties_required */
+  0, /* properties_provided */
+  0, /* properties_destroyed */
+  0, /* todo_flags_start */
+  TODO_cleanup_cfg, /* todo_flags_finish */
+};
+
+ class pass_ch_oacc_kernels : public gimple_opt_pass
+{
+public:
+  pass_ch_oacc_kernels (gcc::context *ctxt)
+    : gimple_opt_pass (pass_data_ch_oacc_kernels, ctxt)
+  {}
+
+  /* opt_pass methods: */
+  virtual bool gate (function *) { return true; }
+  virtual unsigned int execute (function *);
+
+}; // class pass_ch_oacc_kernels
+
+unsigned int
+pass_ch_oacc_kernels::execute (function *fun)
+{
+  return pass_ch_execute (fun, true);
+}
+
+} // anon namespace
+
+gimple_opt_pass *
+make_pass_ch_oacc_kernels (gcc::context *ctxt)
+{
+  return new pass_ch_oacc_kernels (ctxt);
+}
-- 
1.9.1
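
For readers less familiar with pass_ch: the patch relies on loops having their header copied before pass_parallelize_loops sees them. The snippet below is only a hand-written, source-level sketch of that transformation, not code from this patch or from GCC; the pass itself works on the GIMPLE CFG, and both function names are hypothetical. It shows a while loop whose exit test sits in the header, next to the equivalent form after header copying, where a guarding copy of the test runs once up front and the loop proper becomes a do-while whose body always executes once entered.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical example, not part of GCC: sum the leading non-negative
// elements of V.  The exit test is in the loop header, so the body may
// run zero times.
int
sum_prefix_while (const std::vector<int> &v)
{
  int sum = 0;
  std::size_t i = 0;
  while (i < v.size () && v[i] >= 0)
    {
      sum += v[i];
      i++;
    }
  return sum;
}

// The same loop after header copying applied by hand: the header test is
// duplicated in front of the loop as a guard, and the remaining loop is a
// do-while, so its body is always executed when the loop is entered.
int
sum_prefix_header_copied (const std::vector<int> &v)
{
  int sum = 0;
  std::size_t i = 0;
  if (i < v.size () && v[i] >= 0)
    do
      {
        sum += v[i];
        i++;
      }
    while (i < v.size () && v[i] >= 0);
  return sum;
}
```

Both functions return the same value for any input, including an empty vector. Note that the condition here has the `while (a && b)` shape mentioned in the tree-ssa-loop-ch.c comment, where both conditions need to be copied.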