From patchwork Thu Jun 2 16:20:57 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Thomas Schwinge X-Patchwork-Id: 629359 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 3rLCCS0M4Pz9sDX for ; Fri, 3 Jun 2016 02:21:23 +1000 (AEST) Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b=vxZFSANq; dkim-atps=neutral DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:cc:subject:in-reply-to:references:date:message-id :mime-version:content-type:content-transfer-encoding; q=dns; s= default; b=rK6wwvrUYgQhE83/MCPMWHXW3cqKToD5SMZF2/QPobMYvpN6rM1bI B65t84T0AdeXKJrPiTT0uIHw4i2Weve/pQ7XbNA1zCMjA+lVGWvx7giIvNYEbAAG 9Dbc1Ia1KtKnQHGglHu2oJZ+C0Pji44dYLBIH+TH1/e9AhRhk8WV7w= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:cc:subject:in-reply-to:references:date:message-id :mime-version:content-type:content-transfer-encoding; s=default; bh=yTFjt1hLKAILkaoe23eftk3vzfw=; b=vxZFSANqrjiSFHYMKlvoocT8bidt +4uezPfTrFIuhnQtOiQ3iCuOp1k3R9xagy8J8kjriQTLkzUuAsRzoegbOw1boG9W ZAjvwFJha7XNArO+iYuiNP49kZ9bubLT9/108Y4phokkL7vPeKNb5diB9UXCUrA5 Yi5cnF/FlfUpG5g= Received: (qmail 81314 invoked by alias); 2 Jun 2016 16:21:16 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 81283 invoked by uid 89); 2 Jun 2016 16:21:15 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-1.9 required=5.0 tests=AWL, BAYES_00, RCVD_IN_DNSWL_NONE, SPF_PASS autolearn=ham version=3.3.2 spammy=2224, BAR, 4922, amusement X-Spam-User: qpsmtpd, 2 recipients X-HELO: relay1.mentorg.com Received: from relay1.mentorg.com (HELO relay1.mentorg.com) (192.94.38.131) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES256-GCM-SHA384 encrypted) ESMTPS; Thu, 02 Jun 2016 16:21:05 +0000 Received: from nat-ies.mentorg.com ([192.94.31.2] helo=SVR-IES-FEM-02.mgc.mentorg.com) by relay1.mentorg.com with esmtp id 1b8VMj-000486-9H from Thomas_Schwinge@mentor.com ; Thu, 02 Jun 2016 09:21:01 -0700 Received: from hertz.schwinge.homeip.net (137.202.0.76) by SVR-IES-FEM-02.mgc.mentorg.com (137.202.0.106) with Microsoft SMTP Server id 14.3.224.2; Thu, 2 Jun 2016 17:21:00 +0100 From: Thomas Schwinge To: , Jakub Jelinek CC: Subject: Re: [PR middle-end/71373] Handle more OMP_CLAUSE_* in nested function decomposition In-Reply-To: <20160601151217.GS28550@tucnak.redhat.com> References: <87h9ddrlrx.fsf@hertz.schwinge.homeip.net> <5459732B.1010101@codesourcery.com> <20160601151217.GS28550@tucnak.redhat.com> User-Agent: Notmuch/0.9-101-g81dad07 (http://notmuchmail.org) Emacs/24.4.1 (x86_64-pc-linux-gnu) Date: Thu, 2 Jun 2016 18:20:57 +0200 Message-ID: <87d1nzsgt2.fsf@hertz.schwinge.homeip.net> MIME-Version: 1.0 Hi! On Wed, 1 Jun 2016 17:12:17 +0200, Jakub Jelinek wrote: > On Wed, Jun 01, 2016 at 05:06:42PM +0200, Thomas Schwinge wrote: > > Here are the OpenACC bits of . > > > > As we're currently not paying attention to OpenACC tile clauses in the > > middle end, and thus OMP_CLAUSE_TILE's arguments are not to be considered > > stable, I opted to simply discard them early, simplifying their > > gcc/tree-nested.c handling. Everything else should be self-explanatory. > > > > OK for trunk and gcc-6-branch? > > LGTM for both, but please as a follow-up try to work also on a C testcase > with nested functions that covers all the clauses (both referencing > vars/expressions that are defined in the current function and used by a > nested function, and for vars/expressions that are defined in parent > function and used in clauses inside of nested function. OK, I translated gcc/testsuite/gfortran.dg/goacc/subroutines.f90 from Fortran to C: gcc/testsuite/gcc.dg/goacc/nested.c. For amusement ;-) I'm also including the test case that originally made us aware of the problem, gcc/testsuite/gcc.dg/goacc/pr71373.c. Oh, and I just remembered , so after re-testing, I'll also include these test cases, as far as still relevant. Nested function decomposition is not applicable to C++, so we don't need any C++ test cases, right? During the translation of gcc/testsuite/gfortran.dg/goacc/subroutines.f90 from Fortran to C, I stumbled upon "C/C++ OpenACC cache directive rejects valid syntax", , so that one will need to go in first, before I'll then commit the following: commit 7eff9da0e8fe5eda7d76b9a27dbb1ec4e6183844 Author: Thomas Schwinge Date: Wed Jun 1 17:01:35 2016 +0200 [PR middle-end/71373] Handle more OMP_CLAUSE_* in nested function decomposition gcc/ * gimplify.c (gimplify_adjust_omp_clauses): Discard OMP_CLAUSE_TILE. * omp-low.c (scan_sharing_clauses): Don't expect OMP_CLAUSE_TILE. gcc/testsuite/ * c-c++-common/goacc/combined-directives.c: XFAIL tree scanning for OpenACC tile clauses. * gfortran.dg/goacc/combined-directives.f90: Likewise. gcc/ PR middle-end/71373 * tree-nested.c (convert_nonlocal_omp_clauses) (convert_local_omp_clauses): Handle OMP_CLAUSE_ASYNC, OMP_CLAUSE_WAIT, OMP_CLAUSE_INDEPENDENT, OMP_CLAUSE_AUTO, OMP_CLAUSE__CACHE_, OMP_CLAUSE_TILE. gcc/testsuite/ PR middle-end/71373 * gcc.dg/goacc/nested.c: New file. * gcc.dg/goacc/pr71373.c: Likewise. * gfortran.dg/goacc/subroutines.f90: Update. --- gcc/gimplify.c | 6 ++ gcc/omp-low.c | 4 +- .../c-c++-common/goacc/combined-directives.c | 3 +- gcc/testsuite/gcc.dg/goacc/nested.c | 100 +++++++++++++++++++++ gcc/testsuite/gcc.dg/goacc/pr71373.c | 41 +++++++++ .../gfortran.dg/goacc/combined-directives.f90 | 3 +- gcc/testsuite/gfortran.dg/goacc/subroutines.f90 | 56 ++++++++---- gcc/tree-nested.c | 30 +++++++ 8 files changed, 221 insertions(+), 22 deletions(-) Grüße Thomas diff --git gcc/gimplify.c gcc/gimplify.c index f12c6a1..7c19cf3 100644 --- gcc/gimplify.c +++ gcc/gimplify.c @@ -8280,7 +8280,13 @@ gimplify_adjust_omp_clauses (gimple_seq *pre_p, gimple_seq body, tree *list_p, case OMP_CLAUSE_VECTOR: case OMP_CLAUSE_AUTO: case OMP_CLAUSE_SEQ: + break; + case OMP_CLAUSE_TILE: + /* We're not yet making use of the information provided by OpenACC + tile clauses. Discard these here, to simplify later middle end + processing. */ + remove = true; break; default: diff --git gcc/omp-low.c gcc/omp-low.c index 91d5fcf..c6ba31c 100644 --- gcc/omp-low.c +++ gcc/omp-low.c @@ -2187,7 +2187,6 @@ scan_sharing_clauses (tree clauses, omp_context *ctx, case OMP_CLAUSE_GANG: case OMP_CLAUSE_WORKER: case OMP_CLAUSE_VECTOR: - case OMP_CLAUSE_TILE: case OMP_CLAUSE_INDEPENDENT: case OMP_CLAUSE_AUTO: case OMP_CLAUSE_SEQ: @@ -2201,6 +2200,7 @@ scan_sharing_clauses (tree clauses, omp_context *ctx, break; case OMP_CLAUSE__CACHE_: + case OMP_CLAUSE_TILE: default: gcc_unreachable (); } @@ -2357,7 +2357,6 @@ scan_sharing_clauses (tree clauses, omp_context *ctx, case OMP_CLAUSE_GANG: case OMP_CLAUSE_WORKER: case OMP_CLAUSE_VECTOR: - case OMP_CLAUSE_TILE: case OMP_CLAUSE_INDEPENDENT: case OMP_CLAUSE_AUTO: case OMP_CLAUSE_SEQ: @@ -2365,6 +2364,7 @@ scan_sharing_clauses (tree clauses, omp_context *ctx, break; case OMP_CLAUSE__CACHE_: + case OMP_CLAUSE_TILE: default: gcc_unreachable (); } diff --git gcc/testsuite/c-c++-common/goacc/combined-directives.c gcc/testsuite/c-c++-common/goacc/combined-directives.c index c2a3c57..3fa800d 100644 --- gcc/testsuite/c-c++-common/goacc/combined-directives.c +++ gcc/testsuite/c-c++-common/goacc/combined-directives.c @@ -111,6 +111,7 @@ test () // { dg-final { scan-tree-dump-times "acc loop vector" 2 "gimple" } } // { dg-final { scan-tree-dump-times "acc loop seq" 2 "gimple" } } // { dg-final { scan-tree-dump-times "acc loop auto" 2 "gimple" } } -// { dg-final { scan-tree-dump-times "acc loop tile.2, 3" 2 "gimple" } } +// XFAILed: OpenACC tile clauses are discarded during gimplification. +// { dg-final { scan-tree-dump-times "acc loop tile.2, 3" 2 "gimple" { xfail *-*-* } } } // { dg-final { scan-tree-dump-times "acc loop independent private.i" 2 "gimple" } } // { dg-final { scan-tree-dump-times "private.z" 2 "gimple" } } diff --git gcc/testsuite/gcc.dg/goacc/nested.c gcc/testsuite/gcc.dg/goacc/nested.c new file mode 100644 index 0000000..6e1f236 --- /dev/null +++ gcc/testsuite/gcc.dg/goacc/nested.c @@ -0,0 +1,100 @@ +/* Exercise how tree-nested.c handles OpenACC clauses. */ +/* See gcc/testsuite/gfortran.dg/goacc/subroutines.f90 for the Fortran + version. */ + +int main () +{ +#define N 100 + int nonlocal_arg; + int nonlocal_a[N]; + int nonlocal_i; + int nonlocal_j; + + for (int i = 0; i < N; ++i) + nonlocal_a[i] = 5; + nonlocal_arg = 5; + + void local () + { + int local_i; + int local_arg; + int local_a[N]; + int local_j; + + for (int i = 0; i < N; ++i) + local_a[i] = 5; + local_arg = 5; + +#pragma acc kernels loop \ + gang(num:local_arg) worker(local_arg) vector(local_arg) \ + wait async(local_arg) + for (local_i = 0; local_i < N; ++local_i) + { +#pragma acc cache (local_a[local_i:5]) + local_a[local_i] = 100; +#pragma acc loop seq tile(*) + for (local_j = 0; local_j < N; ++local_j) + ; +#pragma acc loop auto independent tile(1) + for (local_j = 0; local_j < N; ++local_j) + ; + } + +#pragma acc kernels loop \ + gang(static:local_arg) worker(local_arg) vector(local_arg) \ + wait(local_arg, local_arg + 1, local_arg + 2) async + for (local_i = 0; local_i < N; ++local_i) + { +#pragma acc cache (local_a[local_i:4]) + local_a[local_i] = 100; +#pragma acc loop seq tile(1) + for (local_j = 0; local_j < N; ++local_j) + ; +#pragma acc loop auto independent tile(*) + for (local_j = 0; local_j < N; ++local_j) + ; + } + } + + void nonlocal () + { + for (int i = 0; i < N; ++i) + nonlocal_a[i] = 5; + nonlocal_arg = 5; + +#pragma acc kernels loop \ + gang(num:nonlocal_arg) worker(nonlocal_arg) vector(nonlocal_arg) \ + wait async(nonlocal_arg) + for (nonlocal_i = 0; nonlocal_i < N; ++nonlocal_i) + { +#pragma acc cache (nonlocal_a[nonlocal_i:3]) + nonlocal_a[nonlocal_i] = 100; +#pragma acc loop seq tile(2) + for (nonlocal_j = 0; nonlocal_j < N; ++nonlocal_j) + ; +#pragma acc loop auto independent tile(3) + for (nonlocal_j = 0; nonlocal_j < N; ++nonlocal_j) + ; + } + +#pragma acc kernels loop \ + gang(static:nonlocal_arg) worker(nonlocal_arg) vector(nonlocal_arg) \ + wait(nonlocal_arg, nonlocal_arg + 1, nonlocal_arg + 2) async + for (nonlocal_i = 0; nonlocal_i < N; ++nonlocal_i) + { +#pragma acc cache (nonlocal_a[nonlocal_i:2]) + nonlocal_a[nonlocal_i] = 100; +#pragma acc loop seq tile(*) + for (nonlocal_j = 0; nonlocal_j < N; ++nonlocal_j) + ; +#pragma acc loop auto independent tile(*) + for (nonlocal_j = 0; nonlocal_j < N; ++nonlocal_j) + ; + } + } + + local (); + nonlocal (); + + return 0; +} diff --git gcc/testsuite/gcc.dg/goacc/pr71373.c gcc/testsuite/gcc.dg/goacc/pr71373.c new file mode 100644 index 0000000..9381752 --- /dev/null +++ gcc/testsuite/gcc.dg/goacc/pr71373.c @@ -0,0 +1,41 @@ +/* Unintentional nested function usage. */ +/* Due to missing right braces '}', the following functions are parsed as + nested functions. This ran into an ICE. */ + +void foo (void) +{ + #pragma acc parallel + { + #pragma acc loop independent + for (int i = 0; i < 16; i++) + ; + // Note right brace '}' commented out here. + //} +} +void bar (void) +{ +} + +// Adding right brace '}' here, to make this compile. +} + + +// ..., and the other way round: + +void BAR (void) +{ +// Note right brace '}' commented out here. +//} + +void FOO (void) +{ + #pragma acc parallel + { + #pragma acc loop independent + for (int i = 0; i < 16; i++) + ; + } +} + +// Adding right brace '}' here, to make this compile. +} diff --git gcc/testsuite/gfortran.dg/goacc/combined-directives.f90 gcc/testsuite/gfortran.dg/goacc/combined-directives.f90 index 42a447a..abb5e6b 100644 --- gcc/testsuite/gfortran.dg/goacc/combined-directives.f90 +++ gcc/testsuite/gfortran.dg/goacc/combined-directives.f90 @@ -143,7 +143,8 @@ end subroutine test ! { dg-final { scan-tree-dump-times "acc loop private.i. private.j. vector" 2 "gimple" } } ! { dg-final { scan-tree-dump-times "acc loop private.i. private.j. seq" 2 "gimple" } } ! { dg-final { scan-tree-dump-times "acc loop private.i. private.j. auto" 2 "gimple" } } -! { dg-final { scan-tree-dump-times "acc loop private.i. private.j. tile.2, 3" 2 "gimple" } } +! XFAILed: OpenACC tile clauses are discarded during gimplification. +! { dg-final { scan-tree-dump-times "acc loop private.i. private.j. tile.2, 3" 2 "gimple" { xfail *-*-* } } } ! { dg-final { scan-tree-dump-times "acc loop private.i. independent" 2 "gimple" } } ! { dg-final { scan-tree-dump-times "private.z" 2 "gimple" } } ! { dg-final { scan-tree-dump-times "omp target oacc_\[^ \]+ map.force_tofrom:y" 2 "gimple" } } diff --git gcc/testsuite/gfortran.dg/goacc/subroutines.f90 gcc/testsuite/gfortran.dg/goacc/subroutines.f90 index 6cab798..479ef4f 100644 --- gcc/testsuite/gfortran.dg/goacc/subroutines.f90 +++ gcc/testsuite/gfortran.dg/goacc/subroutines.f90 @@ -1,6 +1,5 @@ -! Exercise how tree-nested.c handles gang, worker vector and seq. - -! { dg-do compile } +! Exercise how tree-nested.c handles OpenACC clauses. +! See gcc/testsuite/c-c++-common/goacc/nested.c for the C version. program main integer, parameter :: N = 100 @@ -8,10 +7,10 @@ program main integer :: nonlocal_a(N) integer :: nonlocal_i integer :: nonlocal_j - + nonlocal_a (:) = 5 nonlocal_arg = 5 - + call local () call nonlocal () @@ -22,24 +21,35 @@ contains integer :: local_arg integer :: local_a(N) integer :: local_j - + local_a (:) = 5 local_arg = 5 - !$acc kernels loop gang(num:local_arg) worker(local_arg) vector(local_arg) + !$acc kernels loop & + !$acc gang(num:local_arg) worker(local_arg) vector(local_arg) & + !$acc wait async(local_arg) do local_i = 1, N + !$acc cache (local_a(local_i:local_i + 5)) local_a(local_i) = 100 - !$acc loop seq + !$acc loop seq tile(*) + do local_j = 1, N + enddo + !$acc loop auto independent tile(1) do local_j = 1, N enddo enddo !$acc end kernels loop - !$acc kernels loop gang(static:local_arg) worker(local_arg) & - !$acc vector(local_arg) + !$acc kernels loop & + !$acc gang(static:local_arg) worker(local_arg) vector(local_arg) & + !$acc wait(local_arg, local_arg + 1, local_arg + 2) async do local_i = 1, N + !$acc cache (local_a(local_i:local_i + 4)) local_a(local_i) = 100 - !$acc loop seq + !$acc loop seq tile(1) + do local_j = 1, N + enddo + !$acc loop auto independent tile(*) do local_j = 1, N enddo enddo @@ -49,22 +59,32 @@ contains subroutine nonlocal () nonlocal_a (:) = 5 nonlocal_arg = 5 - - !$acc kernels loop gang(num:nonlocal_arg) worker(nonlocal_arg) & - !$acc vector(nonlocal_arg) + + !$acc kernels loop & + !$acc gang(num:nonlocal_arg) worker(nonlocal_arg) vector(nonlocal_arg) & + !$acc wait async(nonlocal_arg) do nonlocal_i = 1, N + !$acc cache (nonlocal_a(nonlocal_i:nonlocal_i + 3)) nonlocal_a(nonlocal_i) = 100 - !$acc loop seq + !$acc loop seq tile(2) + do nonlocal_j = 1, N + enddo + !$acc loop auto independent tile(3) do nonlocal_j = 1, N enddo enddo !$acc end kernels loop - !$acc kernels loop gang(static:nonlocal_arg) worker(nonlocal_arg) & - !$acc vector(nonlocal_arg) + !$acc kernels loop & + !$acc gang(static:nonlocal_arg) worker(nonlocal_arg) vector(nonlocal_arg) & + !$acc wait(nonlocal_arg, nonlocal_arg + 1, nonlocal_arg + 2) async do nonlocal_i = 1, N + !$acc cache (nonlocal_a(nonlocal_i:nonlocal_i + 2)) nonlocal_a(nonlocal_i) = 100 - !$acc loop seq + !$acc loop seq tile(*) + do nonlocal_j = 1, N + enddo + !$acc loop auto independent tile(*) do nonlocal_j = 1, N enddo enddo diff --git gcc/tree-nested.c gcc/tree-nested.c index 25a92aa..97d3c52 100644 --- gcc/tree-nested.c +++ gcc/tree-nested.c @@ -1114,6 +1114,8 @@ convert_nonlocal_omp_clauses (tree *pclauses, struct walk_stmt_info *wi) case OMP_CLAUSE_GANG: case OMP_CLAUSE_WORKER: case OMP_CLAUSE_VECTOR: + case OMP_CLAUSE_ASYNC: + case OMP_CLAUSE_WAIT: /* Several OpenACC clauses have optional arguments. Check if they are present. */ if (OMP_CLAUSE_OPERAND (clause, 0)) @@ -1197,8 +1199,21 @@ convert_nonlocal_omp_clauses (tree *pclauses, struct walk_stmt_info *wi) case OMP_CLAUSE_SIMD: case OMP_CLAUSE_DEFAULTMAP: case OMP_CLAUSE_SEQ: + case OMP_CLAUSE_INDEPENDENT: + case OMP_CLAUSE_AUTO: break; + case OMP_CLAUSE__CACHE_: + /* These clauses belong to the OpenACC cache directive, which is + discarded during gimplification, so we don't expect to see + anything here. */ + gcc_unreachable (); + + case OMP_CLAUSE_TILE: + /* OpenACC tile clauses are discarded during gimplification, so we + don't expect to see anything here. */ + gcc_unreachable (); + default: gcc_unreachable (); } @@ -1790,6 +1805,8 @@ convert_local_omp_clauses (tree *pclauses, struct walk_stmt_info *wi) case OMP_CLAUSE_GANG: case OMP_CLAUSE_WORKER: case OMP_CLAUSE_VECTOR: + case OMP_CLAUSE_ASYNC: + case OMP_CLAUSE_WAIT: /* Several OpenACC clauses have optional arguments. Check if they are present. */ if (OMP_CLAUSE_OPERAND (clause, 0)) @@ -1878,8 +1895,21 @@ convert_local_omp_clauses (tree *pclauses, struct walk_stmt_info *wi) case OMP_CLAUSE_SIMD: case OMP_CLAUSE_DEFAULTMAP: case OMP_CLAUSE_SEQ: + case OMP_CLAUSE_INDEPENDENT: + case OMP_CLAUSE_AUTO: break; + case OMP_CLAUSE__CACHE_: + /* These clauses belong to the OpenACC cache directive, which is + discarded during gimplification, so we don't expect to see + anything here. */ + gcc_unreachable (); + + case OMP_CLAUSE_TILE: + /* OpenACC tile clauses are discarded during gimplification, so we + don't expect to see anything here. */ + gcc_unreachable (); + default: gcc_unreachable (); }