From patchwork Thu Jun 4 15:55:32 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tom de Vries
X-Patchwork-Id: 480786
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-2.3 required=5.0 tests=AWL, BAYES_00, RCVD_IN_DNSWL_LOW, SPF_PASS autolearn=ham version=3.3.2
X-HELO: relay1.mentorg.com
Received: from relay1.mentorg.com (HELO relay1.mentorg.com) (192.94.38.131) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Thu, 04 Jun 2015 15:55:43 +0000
Received: from nat-ies.mentorg.com ([192.94.31.2] helo=SVR-IES-FEM-03.mgc.mentorg.com) by relay1.mentorg.com with esmtp id 1Z0XUZ-0004VT-Bx from Tom_deVries@mentor.com ; Thu, 04 Jun 2015 08:55:39 -0700
Received: from [127.0.0.1] (137.202.0.76) by SVR-IES-FEM-03.mgc.mentorg.com (137.202.0.108) with Microsoft SMTP Server id 14.3.224.2; Thu, 4 Jun 2015 16:55:38 +0100
Message-ID: <557074F4.4020109@mentor.com>
Date: Thu, 4 Jun 2015 17:55:32 +0200
From: Tom de Vries
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.7.0
MIME-Version: 1.0
To: Richard Biener
CC: Thomas Schwinge, GCC Patches, Jakub Jelinek
Subject: Re: [PATCH, 3/8] Add pass_ch_oacc_kernels to pass_oacc_kernels
References: <546743BC.5070804@mentor.com> <54678B85.60005@mentor.com> <547467A6.9@mentor.com> <87a8y11cuq.fsf@kepler.schwinge.homeip.net> <556EC6F5.8090107@mentor.com>
In-Reply-To:

On 03/06/15 13:20, Richard Biener wrote:
> On Wed, 3 Jun 2015, Tom de Vries wrote:
>
>> On 22/04/15 09:39, Richard Biener wrote:
>>> Ehm.  So why not simply add a flag to struct loop instead and set it
>>> during OMP region parsing/lowering?
>>
>> Attached patch adds an in_oacc_kernels_region flag to struct loop, and uses
>> it.  OK for gomp-4_0-branch?
>
> Works for me.
>

Committed as attached, with minor fix to pass bootstrap.

Thanks,
- Tom

Add in_oacc_kernels_region field to struct loop

2015-06-03  Tom de Vries

	* cfgloop.h (struct loop): Add in_oacc_kernels_region field.
	* omp-low.c (mark_loops_in_oacc_kernels_region): New function.
	(loop_get_oacc_kernels_region_entry): New function.
	(expand_omp_target): Call mark_loops_in_oacc_kernels_region.
	(loop_in_oacc_kernels_region_p): Remove function.
	* omp-low.h (loop_in_oacc_kernels_region_p): Remove declaration.
	(loop_get_oacc_kernels_region_entry): Declare.
	* tree-parloops.c (parallelize_loops): Use in_oacc_kernels_region field
	and loop_get_oacc_kernels_region_entry.
	* tree-ssa-loop-ch.c (pass_ch_execute): Use in_oacc_kernels_region
	field.
---
 gcc/cfgloop.h          |   3 +
 gcc/omp-low.c          | 155 ++++++++++++++++++++-----------------------------
 gcc/omp-low.h          |   3 +-
 gcc/tree-parloops.c    |   7 ++-
 gcc/tree-ssa-loop-ch.c |   2 +-
 5 files changed, 73 insertions(+), 97 deletions(-)

diff --git a/gcc/cfgloop.h b/gcc/cfgloop.h
index 1d84572..a3654d9 100644
--- a/gcc/cfgloop.h
+++ b/gcc/cfgloop.h
@@ -195,6 +195,9 @@ struct GTY ((chain_next ("%h.next"))) loop {
   /* True if we should try harder to vectorize this loop.  */
   bool force_vectorize;
 
+  /* True if the loop is part of an oacc kernels region.  */
+  bool in_oacc_kernels_region;
+
   /* For SIMD loops, this is a unique identifier of the loop, referenced
      by IFN_GOMP_SIMD_VF, IFN_GOMP_SIMD_LANE and IFN_GOMP_SIMD_LAST_LANE
      builtins.  */
diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index b1aa603..22a57af 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -9425,6 +9425,68 @@ oacc_alloc_broadcast_storage (omp_context *ctx)
                                    TYPE_SIZE_UNIT (vull_type_node));
 }
 
+/* Mark the loops inside the kernels region starting at REGION_ENTRY and ending
+   at REGION_EXIT.  */
+
+static void
+mark_loops_in_oacc_kernels_region (basic_block region_entry,
+                                   basic_block region_exit)
+{
+  bitmap dominated_bitmap = BITMAP_GGC_ALLOC ();
+  bitmap excludes_bitmap = BITMAP_GGC_ALLOC ();
+  unsigned di;
+  basic_block bb;
+
+  bitmap_clear (dominated_bitmap);
+  bitmap_clear (excludes_bitmap);
+
+  /* Get all the blocks dominated by the region entry.  That will include the
+     entire region.  */
+  vec<basic_block> dominated
+    = get_all_dominated_blocks (CDI_DOMINATORS, region_entry);
+  FOR_EACH_VEC_ELT (dominated, di, bb)
+    bitmap_set_bit (dominated_bitmap, bb->index);
+
+  /* Exclude all the blocks which are not in the region: the blocks dominated by
+     the region exit.  */
+  if (region_exit != NULL)
+    {
+      vec<basic_block> excludes
+        = get_all_dominated_blocks (CDI_DOMINATORS, region_exit);
+      FOR_EACH_VEC_ELT (excludes, di, bb)
+        bitmap_set_bit (excludes_bitmap, bb->index);
+    }
+
+  /* Mark the loops in the region.  */
+  struct loop *loop;
+  FOR_EACH_LOOP (loop, 0)
+    if (bitmap_bit_p (dominated_bitmap, loop->header->index)
+        && !bitmap_bit_p (excludes_bitmap, loop->header->index))
+      loop->in_oacc_kernels_region = true;
+}
+
+/* Return the entry basic block of the oacc kernels region containing LOOP.  */
+
+basic_block
+loop_get_oacc_kernels_region_entry (struct loop *loop)
+{
+  if (!loop->in_oacc_kernels_region)
+    return NULL;
+
+  basic_block bb = loop->header;
+  while (true)
+    {
+      bb = get_immediate_dominator (CDI_DOMINATORS, bb);
+      gcc_assert (bb != NULL);
+
+      gimple last = last_stmt (bb);
+      if (last != NULL
+          && gimple_code (last) == GIMPLE_OMP_TARGET
+          && gimple_omp_target_kind (last) == GF_OMP_TARGET_KIND_OACC_KERNELS)
+        return bb;
+    }
+}
+
 /* Expand the GIMPLE_OMP_TARGET starting at REGION.  */
 
 static void
@@ -9495,6 +9557,8 @@ expand_omp_target (struct omp_region *region)
              as an optimization barrier.  */
           do_splitoff = false;
           cfun->curr_properties &= ~PROP_gimple_eomp;
+
+          mark_loops_in_oacc_kernels_region (region->entry, region->exit);
         }
       else
         {
@@ -15331,97 +15395,6 @@ gimple_stmt_omp_data_i_init_p (gimple stmt)
                                SSA_OP_DEF);
 }
 
-/* Return true if LOOP is inside a kernels region.  */
-
-bool
-loop_in_oacc_kernels_region_p (struct loop *loop, basic_block *region_entry,
-                               basic_block *region_exit)
-{
-  bitmap excludes_bitmap = BITMAP_GGC_ALLOC ();
-  bitmap region_bitmap = BITMAP_GGC_ALLOC ();
-  bitmap_clear (region_bitmap);
-
-  if (region_entry != NULL)
-    *region_entry = NULL;
-  if (region_exit != NULL)
-    *region_exit = NULL;
-
-  basic_block bb;
-  gimple last;
-  FOR_EACH_BB_FN (bb, cfun)
-    {
-      if (bitmap_bit_p (region_bitmap, bb->index))
-        continue;
-
-      last = last_stmt (bb);
-      if (!last)
-        continue;
-
-      if (gimple_code (last) != GIMPLE_OMP_TARGET
-          || (gimple_omp_target_kind (last) != GF_OMP_TARGET_KIND_OACC_KERNELS))
-        continue;
-
-      bitmap_clear (excludes_bitmap);
-      bitmap_set_bit (excludes_bitmap, bb->index);
-
-      vec<basic_block> dominated
-        = get_all_dominated_blocks (CDI_DOMINATORS, bb);
-
-      unsigned di;
-      basic_block dom;
-
-      basic_block end_region = NULL;
-      FOR_EACH_VEC_ELT (dominated, di, dom)
-        {
-          if (dom == bb)
-            continue;
-
-          last = last_stmt (dom);
-          if (!last)
-            continue;
-
-          if (gimple_code (last) != GIMPLE_OMP_RETURN)
-            continue;
-
-          if (end_region == NULL
-              || dominated_by_p (CDI_DOMINATORS, end_region, dom))
-            end_region = dom;
-        }
-
-      if (end_region == NULL)
-        {
-          gimple kernels = last_stmt (bb);
-          fatal_error (gimple_location (kernels),
-                       "End of kernel region unreachable");
-        }
-
-      vec<basic_block> excludes
-        = get_all_dominated_blocks (CDI_DOMINATORS, end_region);
-
-      unsigned di2;
-      basic_block exclude;
-
-      FOR_EACH_VEC_ELT (excludes, di2, exclude)
-        if (exclude != end_region)
-          bitmap_set_bit (excludes_bitmap, exclude->index);
-
-      FOR_EACH_VEC_ELT (dominated, di, dom)
-        if (!bitmap_bit_p (excludes_bitmap, dom->index))
-          bitmap_set_bit (region_bitmap, dom->index);
-
-      if (bitmap_bit_p (region_bitmap, loop->header->index))
-        {
-          if (region_entry != NULL)
-            *region_entry = bb;
-          if (region_exit != NULL)
-            *region_exit = end_region;
-          return true;
-        }
-    }
-
-  return false;
-}
-
 namespace {
 
 const pass_data pass_data_late_lower_omp =
diff --git a/gcc/omp-low.h b/gcc/omp-low.h
index ae63c9f..fbc8416 100644
--- a/gcc/omp-low.h
+++ b/gcc/omp-low.h
@@ -29,8 +29,7 @@ extern tree omp_reduction_init (tree, tree);
 extern bool make_gimple_omp_edges (basic_block, struct omp_region **, int *);
 extern void omp_finish_file (void);
 extern bool gimple_stmt_omp_data_i_init_p (gimple);
-extern bool loop_in_oacc_kernels_region_p (struct loop *, basic_block *,
-                                           basic_block *);
+extern basic_block loop_get_oacc_kernels_region_entry (struct loop *);
 
 extern GTY(()) vec<tree, va_gc> *offload_funcs;
 extern GTY(()) vec<tree, va_gc> *offload_vars;
diff --git a/gcc/tree-parloops.c b/gcc/tree-parloops.c
index 72877ee..e451704 100644
--- a/gcc/tree-parloops.c
+++ b/gcc/tree-parloops.c
@@ -2629,7 +2629,7 @@ parallelize_loops (bool oacc_kernels_p)
   struct obstack parloop_obstack;
   HOST_WIDE_INT estimated;
   source_location loop_loc;
-  basic_block region_entry, region_exit;
+  basic_block region_entry = NULL;
 
   /* Do not parallelize loops in the functions created by parallelization.  */
   if (parallelized_function_p (cfun->decl))
@@ -2649,8 +2649,7 @@ parallelize_loops (bool oacc_kernels_p)
 
       if (oacc_kernels_p)
         {
-          if (!loop_in_oacc_kernels_region_p (loop, &region_entry,
-                                              &region_exit))
+          if (!loop->in_oacc_kernels_region)
             continue;
 
           /* TODO: Allow nested loops.  */
@@ -2661,6 +2660,8 @@ parallelize_loops (bool oacc_kernels_p)
             fprintf (dump_file,
                      "Trying loop %d with header bb %d in oacc kernels region\n",
                      loop->num, loop->header->index);
+
+          region_entry = loop_get_oacc_kernels_region_entry (loop);
         }
 
       if (dump_file && (dump_flags & TDF_DETAILS))
diff --git a/gcc/tree-ssa-loop-ch.c b/gcc/tree-ssa-loop-ch.c
index 1cd77e6..7527efd 100644
--- a/gcc/tree-ssa-loop-ch.c
+++ b/gcc/tree-ssa-loop-ch.c
@@ -225,7 +225,7 @@ pass_ch_execute (function *fun, bool oacc_kernels_p)
         continue;
 
       if (oacc_kernels_p
-          && !loop_in_oacc_kernels_region_p (loop, NULL, NULL))
+          && !loop->in_oacc_kernels_region)
         continue;
 
       /* Iterate the header copying up to limit; this takes care of the cases
-- 
1.9.1
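
Illustration (not part of the patch): a minimal standalone sketch of the two ideas in the omp-low.c changes, for readers without a GCC tree at hand. It marks a loop when its header is dominated by the region entry but not by the region exit, and it later recovers the region entry by walking up the immediate-dominator chain. ToyBlock, ToyLoop, make_toy_cfg, mark_loops_in_region and region_entry_of are hypothetical names invented here. They only stand in for basic_block, struct loop and the GCC dominator API. The patch's bitmaps are replaced by direct dominance queries, so this is an assumption-laden model rather than the committed code.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a GCC basic block; not GCC API.
struct ToyBlock
{
  int idom;           // index of the immediate dominator, -1 for the root
  bool kernels_entry; // models "last stmt is an oacc kernels GIMPLE_OMP_TARGET"
};

// Hypothetical stand-in for struct loop, reduced to what the sketch needs.
struct ToyLoop
{
  int header;     // index of the loop header block
  bool in_region; // models the new in_oacc_kernels_region field
};

// Return true if block A dominates block B (a block dominates itself).
static bool
dominates (const std::vector<ToyBlock> &blocks, int a, int b)
{
  for (int bb = b; bb != -1; bb = blocks[bb].idom)
    if (bb == a)
      return true;
  return false;
}

// Mark loops whose header is dominated by REGION_ENTRY and not dominated by
// REGION_EXIT.  A REGION_EXIT of -1 models a region without an exit block.
static void
mark_loops_in_region (const std::vector<ToyBlock> &blocks,
                      std::vector<ToyLoop> &loops,
                      int region_entry, int region_exit)
{
  for (ToyLoop &loop : loops)
    if (dominates (blocks, region_entry, loop.header)
        && (region_exit == -1
            || !dominates (blocks, region_exit, loop.header)))
      loop.in_region = true;
}

// Walk up the immediate dominators of the loop header until a block flagged
// as a kernels entry is found.  Returns -1 for loops that were not marked.
static int
region_entry_of (const std::vector<ToyBlock> &blocks, const ToyLoop &loop)
{
  if (!loop.in_region)
    return -1;

  int bb = loop.header;
  while (true)
    {
      bb = blocks[bb].idom;
      assert (bb != -1); // a marked loop must have an enclosing entry
      if (blocks[bb].kernels_entry)
        return bb;
    }
}

// Straight-line toy CFG: 0 -> 1 (kernels entry) -> 2 -> 3 (region exit) -> 4.
static std::vector<ToyBlock>
make_toy_cfg ()
{
  return { { -1, false }, { 0, true }, { 1, false }, { 2, false },
           { 3, false } };
}
```

With loops headed at blocks 2 and 4 and a region running from block 1 to block 3, only the loop at block 2 is marked, and region_entry_of returns 1 for it. The loop at block 4 is dominated by the exit block, so it stays unmarked. This is the same reason the patch can replace the per-query region scan of loop_in_oacc_kernels_region_p with a flag that is set once in expand_omp_target.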