From patchwork Tue Feb 23 15:19:31 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Thomas Schwinge X-Patchwork-Id: 586913 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 17B21140B0E for ; Wed, 24 Feb 2016 02:20:02 +1100 (AEDT) Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b=hot8a+Cp; dkim-atps=neutral DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:cc:subject:in-reply-to:references:date:message-id :mime-version:content-type:content-transfer-encoding; q=dns; s= default; b=OZ+TY6DCzwFjxVVQ43vbf+h4mhQXZlYhLKpIkrnFSCMVZSSLiz189 CzqLh/PO9iaJgz6MPnpb6kIQqhBuP1Qvqsx8BLDW7FQ51mkSNzRz0P3G4zZ+THLj JpxgMYsKib58vq51dpHc/aaUr03D4Gok5Kue+6KPwWvoXXVo9n1tC4= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:cc:subject:in-reply-to:references:date:message-id :mime-version:content-type:content-transfer-encoding; s=default; bh=ztDJTQQVluXuFUPMPnzmgZfnr3s=; b=hot8a+Cp5WWHGd9+t2GeZTh1pQEl r5lGfVJBvuAOu7YbVIo6RKM1F/rwMIp3dQxhG6f2v55VO9VEU7OoMELQFKQ12yOh q+dMBouJZfoSBB1+x9c53o5D+jMDtIViwaTMbtlrwL45Lbdazhlg76MdGDvrF8J/ 10ncl5w5REvb72Y= Received: (qmail 90278 invoked by alias); 23 Feb 2016 15:19:53 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 90165 invoked by uid 89); 23 Feb 2016 15:19:52 -0000 
Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-0.6 required=5.0 tests=AWL, BAYES_50, RP_MATCHES_RCVD, SPF_PASS autolearn=ham version=3.3.2 spammy=controlled, sk:rocebi, D*Uni-Bielefeld.DE, D*DE X-HELO: fencepost.gnu.org Received: from fencepost.gnu.org (HELO fencepost.gnu.org) (208.118.235.10) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES128-SHA encrypted) ESMTPS; Tue, 23 Feb 2016 15:19:48 +0000 Received: from eggs.gnu.org ([2001:4830:134:3::10]:41226) by fencepost.gnu.org with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1aYEkb-0001PL-Vt for gcc-patches@gnu.org; Tue, 23 Feb 2016 10:19:46 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1aYEkX-000802-1U for gcc-patches@gnu.org; Tue, 23 Feb 2016 10:19:45 -0500 Received: from relay1.mentorg.com ([192.94.38.131]:53743) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1aYEkW-0007zn-K8 for gcc-patches@gnu.org; Tue, 23 Feb 2016 10:19:40 -0500 Received: from nat-ies.mentorg.com ([192.94.31.2] helo=SVR-IES-FEM-01.mgc.mentorg.com) by relay1.mentorg.com with esmtp id 1aYEkT-0006ud-HT from Thomas_Schwinge@mentor.com ; Tue, 23 Feb 2016 07:19:38 -0800 Received: from hertz.schwinge.homeip.net (137.202.0.76) by SVR-IES-FEM-01.mgc.mentorg.com (137.202.0.104) with Microsoft SMTP Server id 14.3.224.2; Tue, 23 Feb 2016 15:19:36 +0000 From: Thomas Schwinge To: Tom de Vries , Nathan Sidwell , CC: Jakub Jelinek , Bernd Schmidt , Richard Biener Subject: Re: Use plain -fopenacc to enable OpenACC kernels processing In-Reply-To: <56C202A6.8070209@mentor.com> References: <5640BD31.2060602@mentor.com> <5640DA47.2010508@mentor.com> <87bn7v4b0m.fsf@kepler.schwinge.homeip.net> <87bn7o8w8n.fsf@hertz.schwinge.homeip.net> <56C202A6.8070209@mentor.com> User-Agent: Notmuch/0.9-101-g81dad07 (http://notmuchmail.org) Emacs/24.4.1 (x86_64-pc-linux-gnu) Date: Tue, 23 Feb 2016 16:19:31 +0100 
Message-ID: <87si0j5u9o.fsf@hertz.schwinge.homeip.net>
MIME-Version: 1.0
X-detected-operating-system: by eggs.gnu.org: Windows NT kernel [generic] [fuzzy]
X-Received-From: 192.94.38.131

Hi!

On Mon, 15 Feb 2016 17:53:58 +0100, Tom de Vries wrote:
> On 10/02/16 15:40, Thomas Schwinge wrote:
> > On Fri, 5 Feb 2016 13:06:17 +0100, I wrote:
> >> On Mon, 9 Nov 2015 18:39:19 +0100, Tom de Vries wrote:
> >>> On 09/11/15 16:35, Tom de Vries wrote:
> >>>> this patch series for stage1 trunk adds support to:
> >>>> - parallelize oacc kernels regions using parloops, and
> >>>> - map the loops onto the oacc gang dimension.
> >>
> >>> Atm, the parallelization behaviour for the kernels region is controlled
> >>> by flag_tree_parallelize_loops, which is also used to control generic
> >>> auto-parallelization by autopar using omp. That is not ideal, and we may
> >>> want a separate flag (or param) to control the behaviour for oacc
> >>> kernels, f.i. -foacc-kernels-gang-parallelize=. I'm open to suggestions.
> >>
> >> I suggest to use plain -fopenacc to enable OpenACC kernels processing
> >> (which just makes sense, I hope) ;-) and have later processing stages
> >> determine the actual parametrization (currently: number of gangs) (that
> >> is, Nathan's recent "Default compute dimensions" patches).
> > That makes a lot of sense. Thanks for working on this.
> >> Originally, I wanted to use:
> >>
> >>     OMP_CLAUSE_NUM_GANGS_EXPR (clause) = build_int_cst (integer_type_node, n_threads == 0 ? -1 : n_threads);
> >>
> >> ... to store -1 "have the compiler decide" (instead of now 0 "have the
> >> run-time decide", which might prevent some code optimizations, as I
> >> understand it) for the n_threads == 0 case, but it seems that for an
> >> offloaded OpenACC kernels region, gcc/omp-low.c:oacc_validate_dims is
> >> called with the parameter "used" set to 0 instead of "gang", and then the
> >> "Default anything left to 1 or a partitioned default" logic will default
> >> dims["gang"] to oacc_min_dims["gang"] (that is, 1) instead of the
> >> oacc_default_dims["gang"] (that is, 32). Nathan, does that smell like a
> >> bug (and could you look into that)?

filed. (Nathan?)

> >> --- gcc/tree-parloops.c
> >> +++ gcc/tree-parloops.c
> The oacc-parloops changes look good to me. I approve them for 6.0 stage
> 4 (given that using the ftree-parallelize-loops= flag for oacc
> kernels parallelization was just a placeholder waiting to be
> replaced by an oacc-based approach). [ And I'd expect that the
> tree-ssa-loop.c changes and the mechanical testsuite changes can be
> regarded as trivial. ]

Thanks; committed (without changes) in r233634:

commit 3a37a410bbfed45d04f06887c348938182369d5a
Author: tschwinge
Date:   Tue Feb 23 15:07:54 2016 +0000

    Use plain -fopenacc to enable OpenACC kernels processing

        gcc/
        * tree-parloops.c (create_parallel_loop, gen_parallel_loop)
        (parallelize_loops): In OpenACC kernels mode, set n_threads to
        zero.
        (pass_parallelize_loops::gate): In OpenACC kernels mode, gate on
        flag_openacc.
        * tree-ssa-loop.c (gate_oacc_kernels): Likewise.
        gcc/testsuite/
        * c-c++-common/goacc/kernels-counter-vars-function-scope.c: Adjust
        to -ftree-parallelize-loops/-fopenacc changes.
        * c-c++-common/goacc/kernels-double-reduction-n.c: Likewise.
        * c-c++-common/goacc/kernels-double-reduction.c: Likewise.
        * c-c++-common/goacc/kernels-loop-2.c: Likewise.
        * c-c++-common/goacc/kernels-loop-3.c: Likewise.
        * c-c++-common/goacc/kernels-loop-g.c: Likewise.
        * c-c++-common/goacc/kernels-loop-mod-not-zero.c: Likewise.
        * c-c++-common/goacc/kernels-loop-n.c: Likewise.
        * c-c++-common/goacc/kernels-loop-nest.c: Likewise.
        * c-c++-common/goacc/kernels-loop.c: Likewise.
        * c-c++-common/goacc/kernels-one-counter-var.c: Likewise.
        * c-c++-common/goacc/kernels-reduction.c: Likewise.
        * gfortran.dg/goacc/kernels-loop-inner.f95: Likewise.
        * gfortran.dg/goacc/kernels-loops-adjacent.f95: Likewise.
        libgomp/
        * oacc-parallel.c (GOACC_parallel_keyed): Initialize dims.
        * plugin/plugin-nvptx.c (nvptx_exec): Provide default values for
        dims.
        * testsuite/libgomp.oacc-c-c++-common/kernels-loop-2.c: Adjust to
        -ftree-parallelize-loops/-fopenacc changes.
        * testsuite/libgomp.oacc-c-c++-common/kernels-loop-3.c: Likewise.
        * testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-2.c:
        Likewise.
        * testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-3.c:
        Likewise.
        * testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-4.c:
        Likewise.
        * testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-5.c:
        Likewise.
        * testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-6.c:
        Likewise.
        * testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq.c:
        Likewise.
        * testsuite/libgomp.oacc-c-c++-common/kernels-loop-collapse.c:
        Likewise.
        * testsuite/libgomp.oacc-c-c++-common/kernels-loop-g.c: Likewise.
        * testsuite/libgomp.oacc-c-c++-common/kernels-loop-mod-not-zero.c:
        Likewise.
        * testsuite/libgomp.oacc-c-c++-common/kernels-loop-n.c: Likewise.
        * testsuite/libgomp.oacc-c-c++-common/kernels-loop-nest.c:
        Likewise.
        * testsuite/libgomp.oacc-c-c++-common/kernels-loop.c: Likewise.
        * testsuite/libgomp.oacc-c-c++-common/kernels-reduction.c:
        Likewise.
git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@233634 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog                                      |  9 ++++++
 gcc/testsuite/ChangeLog                            | 18 ++++++++++++
 .../goacc/kernels-counter-vars-function-scope.c    |  3 +-
 .../goacc/kernels-double-reduction-n.c             |  3 +-
 .../c-c++-common/goacc/kernels-double-reduction.c  |  3 +-
 gcc/testsuite/c-c++-common/goacc/kernels-loop-2.c  |  3 +-
 gcc/testsuite/c-c++-common/goacc/kernels-loop-3.c  |  4 +--
 gcc/testsuite/c-c++-common/goacc/kernels-loop-g.c  |  4 +--
 .../c-c++-common/goacc/kernels-loop-mod-not-zero.c |  3 +-
 gcc/testsuite/c-c++-common/goacc/kernels-loop-n.c  |  4 +--
 .../c-c++-common/goacc/kernels-loop-nest.c         |  3 +-
 gcc/testsuite/c-c++-common/goacc/kernels-loop.c    |  4 +--
 .../c-c++-common/goacc/kernels-one-counter-var.c   |  4 +--
 .../c-c++-common/goacc/kernels-reduction.c         |  4 +--
 .../gfortran.dg/goacc/kernels-loop-inner.f95       |  1 -
 .../gfortran.dg/goacc/kernels-loops-adjacent.f95   |  1 -
 gcc/tree-parloops.c                                | 25 ++++++++++++++---
 gcc/tree-ssa-loop.c                                |  7 ++---
 libgomp/ChangeLog                                  | 32 ++++++++++++++++++++++
 libgomp/oacc-parallel.c                            |  4 +++
 libgomp/plugin/plugin-nvptx.c                      | 18 ++++++++++--
 .../libgomp.oacc-c-c++-common/kernels-loop-2.c     |  3 --
 .../libgomp.oacc-c-c++-common/kernels-loop-3.c     |  3 --
 .../kernels-loop-and-seq-2.c                       |  3 --
 .../kernels-loop-and-seq-3.c                       |  3 --
 .../kernels-loop-and-seq-4.c                       |  3 --
 .../kernels-loop-and-seq-5.c                       |  3 --
 .../kernels-loop-and-seq-6.c                       |  3 --
 .../kernels-loop-and-seq.c                         |  3 --
 .../kernels-loop-collapse.c                        |  3 --
 .../libgomp.oacc-c-c++-common/kernels-loop-g.c     |  2 --
 .../kernels-loop-mod-not-zero.c                    |  3 --
 .../libgomp.oacc-c-c++-common/kernels-loop-n.c     |  3 --
 .../libgomp.oacc-c-c++-common/kernels-loop-nest.c  |  3 --
 .../libgomp.oacc-c-c++-common/kernels-loop.c       |  3 --
 .../libgomp.oacc-c-c++-common/kernels-reduction.c  |  3 --
 36 files changed, 114 insertions(+), 87 deletions(-)


Regards
 Thomas

diff --git gcc/ChangeLog gcc/ChangeLog
index ce8d366..0b2149d 100644
--- gcc/ChangeLog
+++ gcc/ChangeLog
@@ -1,3 +1,12 @@
+2016-02-23
Thomas Schwinge + + * tree-parloops.c (create_parallel_loop, gen_parallel_loop) + (parallelize_loops): In OpenACC kernels mode, set n_threads to + zero. + (pass_parallelize_loops::gate): In OpenACC kernels mode, gate on + flag_openacc. + * tree-ssa-loop.c (gate_oacc_kernels): Likewise. + 2016-02-23 Richard Biener * mem-stats.h (struct mem_usage): Use PRIu64 for printing size_t. diff --git gcc/testsuite/ChangeLog gcc/testsuite/ChangeLog index 60372ce..17cf40c 100644 --- gcc/testsuite/ChangeLog +++ gcc/testsuite/ChangeLog @@ -1,3 +1,21 @@ +2016-02-23 Thomas Schwinge + + * c-c++-common/goacc/kernels-counter-vars-function-scope.c: Adjust + to -ftree-parallelize-loops/-fopenacc changes. + * c-c++-common/goacc/kernels-double-reduction-n.c: Likewise. + * c-c++-common/goacc/kernels-double-reduction.c: Likewise. + * c-c++-common/goacc/kernels-loop-2.c: Likewise. + * c-c++-common/goacc/kernels-loop-3.c: Likewise. + * c-c++-common/goacc/kernels-loop-g.c: Likewise. + * c-c++-common/goacc/kernels-loop-mod-not-zero.c: Likewise. + * c-c++-common/goacc/kernels-loop-n.c: Likewise. + * c-c++-common/goacc/kernels-loop-nest.c: Likewise. + * c-c++-common/goacc/kernels-loop.c: Likewise. + * c-c++-common/goacc/kernels-one-counter-var.c: Likewise. + * c-c++-common/goacc/kernels-reduction.c: Likewise. + * gfortran.dg/goacc/kernels-loop-inner.f95: Likewise. + * gfortran.dg/goacc/kernels-loops-adjacent.f95: Likewise. + 2016-02-23 Rainer Orth * gcc.target/i386/chkp-hidden-def.c: Require alias support. 
diff --git gcc/testsuite/c-c++-common/goacc/kernels-counter-vars-function-scope.c gcc/testsuite/c-c++-common/goacc/kernels-counter-vars-function-scope.c index e8b5357..17f240e 100644 --- gcc/testsuite/c-c++-common/goacc/kernels-counter-vars-function-scope.c +++ gcc/testsuite/c-c++-common/goacc/kernels-counter-vars-function-scope.c @@ -1,5 +1,4 @@ /* { dg-additional-options "-O2" } */ -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ /* { dg-additional-options "-fdump-tree-parloops1-all" } */ /* { dg-additional-options "-fdump-tree-optimized" } */ @@ -51,4 +50,4 @@ main (void) /* Check that the loop has been split off into a function. */ /* { dg-final { scan-tree-dump-times "(?n);; Function .*main._omp_fn.0" 1 "optimized" } } */ -/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(32," 1 "parloops1" } } */ +/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 1 "parloops1" } } */ diff --git gcc/testsuite/c-c++-common/goacc/kernels-double-reduction-n.c gcc/testsuite/c-c++-common/goacc/kernels-double-reduction-n.c index c39d674..750f576 100644 --- gcc/testsuite/c-c++-common/goacc/kernels-double-reduction-n.c +++ gcc/testsuite/c-c++-common/goacc/kernels-double-reduction-n.c @@ -1,5 +1,4 @@ /* { dg-additional-options "-O2" } */ -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ /* { dg-additional-options "-fdump-tree-parloops1-all" } */ /* { dg-additional-options "-fdump-tree-optimized" } */ @@ -34,4 +33,4 @@ foo (unsigned int n) /* Check that the loop has been split off into a function. 
*/ /* { dg-final { scan-tree-dump-times "(?n);; Function .*foo.*._omp_fn.0" 1 "optimized" } } */ -/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(32," 1 "parloops1" } } */ +/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 1 "parloops1" } } */ diff --git gcc/testsuite/c-c++-common/goacc/kernels-double-reduction.c gcc/testsuite/c-c++-common/goacc/kernels-double-reduction.c index 3501d0d..df60d6a 100644 --- gcc/testsuite/c-c++-common/goacc/kernels-double-reduction.c +++ gcc/testsuite/c-c++-common/goacc/kernels-double-reduction.c @@ -1,5 +1,4 @@ /* { dg-additional-options "-O2" } */ -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ /* { dg-additional-options "-fdump-tree-parloops1-all" } */ /* { dg-additional-options "-fdump-tree-optimized" } */ @@ -34,4 +33,4 @@ foo (void) /* Check that the loop has been split off into a function. */ /* { dg-final { scan-tree-dump-times "(?n);; Function .*foo.*._omp_fn.0" 1 "optimized" } } */ -/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(32," 1 "parloops1" } } */ +/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 1 "parloops1" } } */ diff --git gcc/testsuite/c-c++-common/goacc/kernels-loop-2.c gcc/testsuite/c-c++-common/goacc/kernels-loop-2.c index f97584d..913d91f 100644 --- gcc/testsuite/c-c++-common/goacc/kernels-loop-2.c +++ gcc/testsuite/c-c++-common/goacc/kernels-loop-2.c @@ -1,5 +1,4 @@ /* { dg-additional-options "-O2" } */ -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ /* { dg-additional-options "-fdump-tree-parloops1-all" } */ /* { dg-additional-options "-fdump-tree-optimized" } */ @@ -67,4 +66,4 @@ main (void) /* { dg-final { scan-tree-dump-times "(?n);; Function .*main._omp_fn.1" 1 "optimized" } } */ /* { dg-final { scan-tree-dump-times "(?n);; Function .*main._omp_fn.2" 1 "optimized" } } */ -/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(32," 3 "parloops1" } } */ +/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 3 
"parloops1" } } */ diff --git gcc/testsuite/c-c++-common/goacc/kernels-loop-3.c gcc/testsuite/c-c++-common/goacc/kernels-loop-3.c index 530d62a..1822d2a 100644 --- gcc/testsuite/c-c++-common/goacc/kernels-loop-3.c +++ gcc/testsuite/c-c++-common/goacc/kernels-loop-3.c @@ -1,5 +1,4 @@ /* { dg-additional-options "-O2" } */ -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ /* { dg-additional-options "-fdump-tree-parloops1-all" } */ /* { dg-additional-options "-fdump-tree-optimized" } */ @@ -45,5 +44,4 @@ main (void) /* Check that the loop has been split off into a function. */ /* { dg-final { scan-tree-dump-times "(?n);; Function .*main._omp_fn.0" 1 "optimized" } } */ -/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(32," 1 "parloops1" } } */ - +/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 1 "parloops1" } } */ diff --git gcc/testsuite/c-c++-common/goacc/kernels-loop-g.c gcc/testsuite/c-c++-common/goacc/kernels-loop-g.c index 4f1c2c5..e946319 100644 --- gcc/testsuite/c-c++-common/goacc/kernels-loop-g.c +++ gcc/testsuite/c-c++-common/goacc/kernels-loop-g.c @@ -1,6 +1,5 @@ /* { dg-additional-options "-O2" } */ /* { dg-additional-options "-g" } */ -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ /* { dg-additional-options "-fdump-tree-parloops1-all" } */ /* { dg-additional-options "-fdump-tree-optimized" } */ @@ -13,5 +12,4 @@ /* Check that the loop has been split off into a function. 
*/ /* { dg-final { scan-tree-dump-times "(?n);; Function .*main._omp_fn.0" 1 "optimized" } } */ -/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(32," 1 "parloops1" } } */ - +/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 1 "parloops1" } } */ diff --git gcc/testsuite/c-c++-common/goacc/kernels-loop-mod-not-zero.c gcc/testsuite/c-c++-common/goacc/kernels-loop-mod-not-zero.c index 151db51..9b63b45 100644 --- gcc/testsuite/c-c++-common/goacc/kernels-loop-mod-not-zero.c +++ gcc/testsuite/c-c++-common/goacc/kernels-loop-mod-not-zero.c @@ -1,5 +1,4 @@ /* { dg-additional-options "-O2" } */ -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ /* { dg-additional-options "-fdump-tree-parloops1-all" } */ /* { dg-additional-options "-fdump-tree-optimized" } */ @@ -49,4 +48,4 @@ main (void) /* Check that the loop has been split off into a function. */ /* { dg-final { scan-tree-dump-times "(?n);; Function .*main._omp_fn.0" 1 "optimized" } } */ -/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(32," 1 "parloops1" } } */ +/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 1 "parloops1" } } */ diff --git gcc/testsuite/c-c++-common/goacc/kernels-loop-n.c gcc/testsuite/c-c++-common/goacc/kernels-loop-n.c index bee5f5a..279f797 100644 --- gcc/testsuite/c-c++-common/goacc/kernels-loop-n.c +++ gcc/testsuite/c-c++-common/goacc/kernels-loop-n.c @@ -1,5 +1,4 @@ /* { dg-additional-options "-O2" } */ -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ /* { dg-additional-options "-fdump-tree-parloops1-all" } */ /* { dg-additional-options "-fdump-tree-optimized" } */ @@ -52,5 +51,4 @@ foo (COUNTERTYPE n) /* Check that the loop has been split off into a function. 
*/ /* { dg-final { scan-tree-dump-times "(?n);; Function .*foo.*._omp_fn.0" 1 "optimized" } } */ -/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(32," 1 "parloops1" } } */ - +/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 1 "parloops1" } } */ diff --git gcc/testsuite/c-c++-common/goacc/kernels-loop-nest.c gcc/testsuite/c-c++-common/goacc/kernels-loop-nest.c index ea0e342..db1071f 100644 --- gcc/testsuite/c-c++-common/goacc/kernels-loop-nest.c +++ gcc/testsuite/c-c++-common/goacc/kernels-loop-nest.c @@ -1,5 +1,4 @@ /* { dg-additional-options "-O2" } */ -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ /* { dg-additional-options "-fdump-tree-parloops1-all" } */ /* { dg-additional-options "-fdump-tree-optimized" } */ @@ -36,4 +35,4 @@ main (void) /* Check that the loop has been split off into a function. */ /* { dg-final { scan-tree-dump-times "(?n);; Function .*main._omp_fn.0" 1 "optimized" } } */ -/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(32," 1 "parloops1" } } */ +/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 1 "parloops1" } } */ diff --git gcc/testsuite/c-c++-common/goacc/kernels-loop.c gcc/testsuite/c-c++-common/goacc/kernels-loop.c index ab5dfb9..abf7a3c 100644 --- gcc/testsuite/c-c++-common/goacc/kernels-loop.c +++ gcc/testsuite/c-c++-common/goacc/kernels-loop.c @@ -1,5 +1,4 @@ /* { dg-additional-options "-O2" } */ -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ /* { dg-additional-options "-fdump-tree-parloops1-all" } */ /* { dg-additional-options "-fdump-tree-optimized" } */ @@ -52,5 +51,4 @@ main (void) /* Check that the loop has been split off into a function. 
*/ /* { dg-final { scan-tree-dump-times "(?n);; Function .*main._omp_fn.0" 1 "optimized" } } */ -/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(32," 1 "parloops1" } } */ - +/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 1 "parloops1" } } */ diff --git gcc/testsuite/c-c++-common/goacc/kernels-one-counter-var.c gcc/testsuite/c-c++-common/goacc/kernels-one-counter-var.c index b16a8cd..95f4817 100644 --- gcc/testsuite/c-c++-common/goacc/kernels-one-counter-var.c +++ gcc/testsuite/c-c++-common/goacc/kernels-one-counter-var.c @@ -1,5 +1,4 @@ /* { dg-additional-options "-O2" } */ -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ /* { dg-additional-options "-fdump-tree-parloops1-all" } */ /* { dg-additional-options "-fdump-tree-optimized" } */ @@ -50,5 +49,4 @@ main (void) /* Check that the loop has been split off into a function. */ /* { dg-final { scan-tree-dump-times "(?n);; Function .*main._omp_fn.0" 1 "optimized" } } */ -/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(32," 1 "parloops1" } } */ - +/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 1 "parloops1" } } */ diff --git gcc/testsuite/c-c++-common/goacc/kernels-reduction.c gcc/testsuite/c-c++-common/goacc/kernels-reduction.c index 61c5df3..6f5a418 100644 --- gcc/testsuite/c-c++-common/goacc/kernels-reduction.c +++ gcc/testsuite/c-c++-common/goacc/kernels-reduction.c @@ -1,5 +1,4 @@ /* { dg-additional-options "-O2" } */ -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ /* { dg-additional-options "-fdump-tree-parloops1-all" } */ /* { dg-additional-options "-fdump-tree-optimized" } */ @@ -32,5 +31,4 @@ foo (void) /* Check that the loop has been split off into a function. 
*/ /* { dg-final { scan-tree-dump-times "(?n);; Function .*foo.*._omp_fn.0" 1 "optimized" } } */ -/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(32," 1 "parloops1" } } */ - +/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 1 "parloops1" } } */ diff --git gcc/testsuite/gfortran.dg/goacc/kernels-loop-inner.f95 gcc/testsuite/gfortran.dg/goacc/kernels-loop-inner.f95 index 4db3a50..3334741 100644 --- gcc/testsuite/gfortran.dg/goacc/kernels-loop-inner.f95 +++ gcc/testsuite/gfortran.dg/goacc/kernels-loop-inner.f95 @@ -1,5 +1,4 @@ ! { dg-additional-options "-O2" } -! { dg-additional-options "-ftree-parallelize-loops=32" } program main implicit none diff --git gcc/testsuite/gfortran.dg/goacc/kernels-loops-adjacent.f95 gcc/testsuite/gfortran.dg/goacc/kernels-loops-adjacent.f95 index fef3d10..fb92da8 100644 --- gcc/testsuite/gfortran.dg/goacc/kernels-loops-adjacent.f95 +++ gcc/testsuite/gfortran.dg/goacc/kernels-loops-adjacent.f95 @@ -1,5 +1,4 @@ ! { dg-additional-options "-O2" } -! { dg-additional-options "-ftree-parallelize-loops=10" } program main implicit none diff --git gcc/tree-parloops.c gcc/tree-parloops.c index 139e38c..e498e5b 100644 --- gcc/tree-parloops.c +++ gcc/tree-parloops.c @@ -2016,7 +2016,8 @@ transform_to_exit_first_loop (struct loop *loop, /* Create the parallel constructs for LOOP as described in gen_parallel_loop. LOOP_FN and DATA are the arguments of GIMPLE_OMP_PARALLEL. NEW_DATA is the variable that should be initialized from the argument - of LOOP_FN. N_THREADS is the requested number of threads. */ + of LOOP_FN. N_THREADS is the requested number of threads, which can be 0 if + that number is to be determined later. 
*/ static void create_parallel_loop (struct loop *loop, tree loop_fn, tree data, @@ -2049,6 +2050,7 @@ create_parallel_loop (struct loop *loop, tree loop_fn, tree data, basic_block paral_bb = single_pred (bb); gsi = gsi_last_bb (paral_bb); + gcc_checking_assert (n_threads != 0); t = build_omp_clause (loc, OMP_CLAUSE_NUM_THREADS); OMP_CLAUSE_NUM_THREADS_EXPR (t) = build_int_cst (integer_type_node, n_threads); @@ -2221,7 +2223,8 @@ create_parallel_loop (struct loop *loop, tree loop_fn, tree data, } /* Generates code to execute the iterations of LOOP in N_THREADS - threads in parallel. + threads in parallel, which can be 0 if that number is to be determined + later. NITER describes number of iterations of LOOP. REDUCTION_LIST describes the reductions existent in the LOOP. */ @@ -2318,6 +2321,7 @@ gen_parallel_loop (struct loop *loop, else m_p_thread=MIN_PER_THREAD; + gcc_checking_assert (n_threads != 0); many_iterations_cond = fold_build2 (GE_EXPR, boolean_type_node, nit, build_int_cst (type, m_p_thread * n_threads)); @@ -3177,7 +3181,7 @@ oacc_entry_exit_ok (struct loop *loop, static bool parallelize_loops (bool oacc_kernels_p) { - unsigned n_threads = flag_tree_parallelize_loops; + unsigned n_threads; bool changed = false; struct loop *loop; struct loop *skip_loop = NULL; @@ -3199,6 +3203,13 @@ parallelize_loops (bool oacc_kernels_p) if (cfun->has_nonlocal_label) return false; + /* For OpenACC kernels, n_threads will be determined later; otherwise, it's + the argument to -ftree-parallelize-loops. 
*/ + if (oacc_kernels_p) + n_threads = 0; + else + n_threads = flag_tree_parallelize_loops; + gcc_obstack_init (&parloop_obstack); reduction_info_table_type reduction_list (10); @@ -3361,7 +3372,13 @@ public: {} /* opt_pass methods: */ - virtual bool gate (function *) { return flag_tree_parallelize_loops > 1; } + virtual bool gate (function *) + { + if (oacc_kernels_p) + return flag_openacc; + else + return flag_tree_parallelize_loops > 1; + } virtual unsigned int execute (function *); opt_pass * clone () { return new pass_parallelize_loops (m_ctxt); } void set_pass_param (unsigned int n, bool param) diff --git gcc/tree-ssa-loop.c gcc/tree-ssa-loop.c index bdbade5..4c39fbc 100644 --- gcc/tree-ssa-loop.c +++ gcc/tree-ssa-loop.c @@ -148,7 +148,7 @@ make_pass_tree_loop (gcc::context *ctxt) static bool gate_oacc_kernels (function *fn) { - if (flag_tree_parallelize_loops <= 1) + if (!flag_openacc) return false; tree oacc_function_attr = get_oacc_fn_attrib (fn->decl); @@ -230,10 +230,9 @@ public: virtual bool gate (function *) { return (optimize - /* Don't bother doing anything if the program has errors. */ - && !seen_error () && flag_openacc - && flag_tree_parallelize_loops > 1); + /* Don't bother doing anything if the program has errors. */ + && !seen_error ()); } }; // class pass_ipa_oacc diff --git libgomp/ChangeLog libgomp/ChangeLog index 1394126..e6a7082 100644 --- libgomp/ChangeLog +++ libgomp/ChangeLog @@ -1,3 +1,35 @@ +2016-02-23 Thomas Schwinge + + * oacc-parallel.c (GOACC_parallel_keyed): Initialize dims. + * plugin/plugin-nvptx.c (nvptx_exec): Provide default values for + dims. + * testsuite/libgomp.oacc-c-c++-common/kernels-loop-2.c: Adjust to + -ftree-parallelize-loops/-fopenacc changes. + * testsuite/libgomp.oacc-c-c++-common/kernels-loop-3.c: Likewise. + * testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-2.c: + Likewise. + * testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-3.c: + Likewise. 
+ * testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-4.c: + Likewise. + * testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-5.c: + Likewise. + * testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-6.c: + Likewise. + * testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq.c: + Likewise. + * testsuite/libgomp.oacc-c-c++-common/kernels-loop-collapse.c: + Likewise. + * testsuite/libgomp.oacc-c-c++-common/kernels-loop-g.c: Likewise. + * testsuite/libgomp.oacc-c-c++-common/kernels-loop-mod-not-zero.c: + Likewise. + * testsuite/libgomp.oacc-c-c++-common/kernels-loop-n.c: Likewise. + * testsuite/libgomp.oacc-c-c++-common/kernels-loop-nest.c: + Likewise. + * testsuite/libgomp.oacc-c-c++-common/kernels-loop.c: Likewise. + * testsuite/libgomp.oacc-c-c++-common/kernels-reduction.c: + Likewise. + 2016-02-22 Cesar Philippidis * testsuite/libgomp.oacc-c-c++-common/vprop.c: New test. diff --git libgomp/oacc-parallel.c libgomp/oacc-parallel.c index bc24651..f795bf7 100644 --- libgomp/oacc-parallel.c +++ libgomp/oacc-parallel.c @@ -103,6 +103,10 @@ GOACC_parallel_keyed (int device, void (*fn) (void *), return; } + /* Default: let the runtime choose. */ + for (i = 0; i != GOMP_DIM_MAX; i++) + dims[i] = 0; + va_start (ap, kinds); /* TODO: This will need amending when device_type is implemented. */ while ((tag = va_arg (ap, unsigned)) != 0) diff --git libgomp/plugin/plugin-nvptx.c libgomp/plugin/plugin-nvptx.c index 7ec1810..3f1bb6d 100644 --- libgomp/plugin/plugin-nvptx.c +++ libgomp/plugin/plugin-nvptx.c @@ -894,9 +894,21 @@ nvptx_exec (void (*fn), size_t mapnum, void **hostaddrs, void **devaddrs, /* Initialize the launch dimensions. Typically this is constant, provided by the device compiler, but we must permit runtime values. 
*/
-  for (i = 0; i != 3; i++)
-    if (targ_fn->launch->dim[i])
-      dims[i] = targ_fn->launch->dim[i];
+  int seen_zero = 0;
+  for (i = 0; i != GOMP_DIM_MAX; i++)
+    {
+      if (targ_fn->launch->dim[i])
+        dims[i] = targ_fn->launch->dim[i];
+      if (!dims[i])
+        seen_zero = 1;
+    }
+
+  if (seen_zero)
+    {
+      for (i = 0; i != GOMP_DIM_MAX; i++)
+        if (!dims[i])
+          dims[i] = /* TODO */ 32;
+    }
 
   /* This reserves a chunk of a pre-allocated page of memory mapped on both
      the host and the device. HP is a host pointer to the new chunk, and DP is
diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-2.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-2.c
index 13e57bd..c7592d6 100644
--- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-2.c
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-2.c
@@ -1,6 +1,3 @@
-/* { dg-do run } */
-/* { dg-additional-options "-ftree-parallelize-loops=32" } */
-
 #include
 
 #define N (1024 * 512)
diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-3.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-3.c
index f61a74a..31114ac 100644
--- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-3.c
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-3.c
@@ -1,6 +1,3 @@
-/* { dg-do run } */
-/* { dg-additional-options "-ftree-parallelize-loops=32" } */
-
 #include
 
 #define N (1024 * 512)
diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-2.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-2.c
index 2e4100f..d36592f 100644
--- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-2.c
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-2.c
@@ -1,6 +1,3 @@
-/* { dg-do run } */
-/* { dg-additional-options "-ftree-parallelize-loops=32" } */
-
 #include
 
 #define N 32
diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-3.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-3.c
index b3e736b..e622971 100644
--- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-3.c
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-3.c
@@ -1,6 +1,3 @@
-/* { dg-do run } */
-/* { dg-additional-options "-ftree-parallelize-loops=32" } */
-
 #include
 
 #define N 32
diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-4.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-4.c
index 8b9affa..c731278 100644
--- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-4.c
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-4.c
@@ -1,6 +1,3 @@
-/* { dg-do run } */
-/* { dg-additional-options "-ftree-parallelize-loops=32" } */
-
 #include
 
 #define N 32
diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-5.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-5.c
index 83d4e7f..67dcce2 100644
--- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-5.c
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-5.c
@@ -1,6 +1,3 @@
-/* { dg-do run } */
-/* { dg-additional-options "-ftree-parallelize-loops=32" } */
-
 #include
 
 #define N 32
diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-6.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-6.c
index 01d5e5e..b8b5dde 100644
--- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-6.c
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-6.c
@@ -1,6 +1,3 @@
-/* { dg-do run } */
-/* { dg-additional-options "-ftree-parallelize-loops=32" } */
-
 #include
 
 #define N 32
diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq.c
index 61d1283..9d9308a 100644
--- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq.c
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq.c
@@ -1,6 +1,3 @@
-/* { dg-do run } */
-/* { dg-additional-options "-ftree-parallelize-loops=32" } */
-
 #include
 
 #define N 32
diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-collapse.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-collapse.c
index f7f04cb..997d6c7 100644
--- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-collapse.c
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-collapse.c
@@ -1,6 +1,3 @@
-/* { dg-do run } */
-/* { dg-additional-options "-ftree-parallelize-loops=32" } */
-
 #include
 
 #define N 100
diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-g.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-g.c
index 96b6e4e..88258be 100644
--- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-g.c
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-g.c
@@ -1,5 +1,3 @@
-/* { dg-do run } */
-/* { dg-additional-options "-ftree-parallelize-loops=32" } */
 /* { dg-additional-options "-g" } */
 
 #include "kernels-loop.c"
diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-mod-not-zero.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-mod-not-zero.c
index 1433cb2..147ebb5 100644
--- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-mod-not-zero.c
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-mod-not-zero.c
@@ -1,6 +1,3 @@
-/* { dg-do run } */
-/* { dg-additional-options "-ftree-parallelize-loops=32" } */
-
 #include
 
 #define N ((1024 * 512) + 1)
diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-n.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-n.c
index fd0d5b1..9a3eaca 100644
--- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-n.c
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-n.c
@@ -1,6 +1,3 @@
-/* { dg-do run } */
-/* { dg-additional-options "-ftree-parallelize-loops=32" } */
-
 #include
 
 #define N ((1024 * 512) + 1)
diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-nest.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-nest.c
index 21d2599..28c725a 100644
--- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-nest.c
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-nest.c
@@ -1,6 +1,3 @@
-/* { dg-do run } */
-/* { dg-additional-options "-ftree-parallelize-loops=32" } */
-
 #include
 
 #define N 1000
diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop.c
index 3762e5a..355123c 100644
--- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop.c
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop.c
@@ -1,6 +1,3 @@
-/* { dg-do run } */
-/* { dg-additional-options "-ftree-parallelize-loops=32" } */
-
 #include
 
 #define N (1024 * 512)
diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-reduction.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-reduction.c
index 511e25f..8647a94 100644
--- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-reduction.c
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-reduction.c
@@ -1,6 +1,3 @@
-/* { dg-do run } */
-/* { dg-additional-options "-ftree-parallelize-loops=32" } */
-
 #include
 
 #define n 10000
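
Editorial note: the launch-dimension fallback from the plugin hunk earlier in this patch can be read in isolation as the standalone sketch below. This is only an illustration, not the plugin code. The helper name `fill_launch_dims` and the macros `SKETCH_DIM_MAX` and `SKETCH_DEFAULT_DIM` are hypothetical; the value 3 stands in for `GOMP_DIM_MAX`, and 32 mirrors the patch's `/* TODO */` placeholder. The patch's two loops are folded into one here, which should behave the same because each axis is handled independently.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: three launch axes, standing in for GOMP_DIM_MAX.  */
#define SKETCH_DIM_MAX 3

/* Assumption: placeholder default, mirroring the patch's TODO value.  */
#define SKETCH_DEFAULT_DIM 32

/* Hypothetical helper.  COMPILED holds the dimensions recorded by the
   device compiler (0 means "not fixed at compile time").  DIMS holds
   the runtime-supplied dimensions on entry and the final launch
   dimensions on exit.  Returns the number of axes that fell back to
   the default.  */
static int
fill_launch_dims (const unsigned compiled[SKETCH_DIM_MAX],
                  unsigned dims[SKETCH_DIM_MAX])
{
  int defaulted = 0;

  assert (compiled != NULL && dims != NULL);

  for (int i = 0; i != SKETCH_DIM_MAX; i++)
    {
      /* A non-zero compile-time value overrides the runtime one.  */
      if (compiled[i])
        dims[i] = compiled[i];

      /* Neither source provided a value: use the placeholder.  */
      if (!dims[i])
        {
          dims[i] = SKETCH_DEFAULT_DIM;
          defaulted++;
        }
    }

  return defaulted;
}
```

With compile-time values {0, 8, 0} and runtime values {4, 2, 0}, the first axis keeps its runtime value, the second takes the compile-time value, and only the third falls back to the default.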