From patchwork Mon Sep 28 15:38:23 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tom de Vries X-Patchwork-Id: 523376 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 6BB261401CD for ; Tue, 29 Sep 2015 01:39:12 +1000 (AEST) Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b=wFAeJvt3; dkim-atps=neutral DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:to:cc :from:subject:message-id:date:mime-version:content-type; q=dns; s=default; b=di0Q42zW17aHdjySTSHrqkbfBklDrKDd0DK2lBU9M/abw8eUff KZz2AG7UJR1nJiZ5TeQ5ZkFjDqsv0JzLw5umOnTQt1SJ1xA6Ybw15/7VEcQ7F1Yo LR2EEly2NSt/RMrUVZbD2lM6OwLjwhngHDgUILXpHE5bjUlA+qm/Fqtr4= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:to:cc :from:subject:message-id:date:mime-version:content-type; s= default; bh=DWBCJNlFY1hZp0hnAUxakCOkkvk=; b=wFAeJvt3UGx4Zg+CE4v/ H2yPYZnrCPGXst36n0G8GzhNFNuC0GYl6de2QfEfhbIFBMQL+YV8pmUR9NlXS5YB z9l48eR9yYwI6BeGpP2zTNvq2Pw5B5c4Tc7wyPH8Bw75/Z8Lo9mb4LOmtrMHQ0+Z 8qjVPxmv34lehFuJU6laM20= Received: (qmail 62622 invoked by alias); 28 Sep 2015 15:39:01 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 62606 invoked by uid 89); 28 Sep 2015 15:39:00 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-1.3 required=5.0 tests=AWL, BAYES_05, SPF_PASS, T_RP_MATCHES_RCVD autolearn=ham version=3.3.2 X-HELO: fencepost.gnu.org Received: from fencepost.gnu.org (HELO fencepost.gnu.org) (208.118.235.10) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES128-SHA encrypted) ESMTPS; Mon, 28 Sep 2015 15:38:57 +0000 Received: from eggs.gnu.org ([2001:4830:134:3::10]:44932) by fencepost.gnu.org with esmtps (TLS1.0:RSA_AES_256_CBC_SHA1:256) (Exim 4.82) (envelope-from ) id 1ZgaVz-0001NB-8Y for gcc-patches@gnu.org; Mon, 28 Sep 2015 11:38:55 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1ZgaVv-0000G3-Ei for gcc-patches@gnu.org; Mon, 28 Sep 2015 11:38:54 -0400 Received: from relay1.mentorg.com ([192.94.38.131]:64793) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1ZgaVv-0000FT-5b for gcc-patches@gnu.org; Mon, 28 Sep 2015 11:38:51 -0400 Received: from nat-ies.mentorg.com ([192.94.31.2] helo=SVR-IES-FEM-03.mgc.mentorg.com) by relay1.mentorg.com with esmtp id 1ZgaVs-0000je-Lq from Tom_deVries@mentor.com for gcc-patches@gnu.org; Mon, 28 Sep 2015 08:38:49 -0700 Received: from [127.0.0.1] (137.202.0.76) by SVR-IES-FEM-03.mgc.mentorg.com (137.202.0.108) with Microsoft SMTP Server id 14.3.224.2; Mon, 28 Sep 2015 16:38:47 +0100 To: Nathan Sidwell CC: "gcc-patches@gnu.org" From: Tom de Vries Subject: [gomp4, WIP] Implement -foffload-alias Message-ID: <56095EEF.7010700@mentor.com> Date: Mon, 28 Sep 2015 17:38:23 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.2.0 MIME-Version: 1.0 X-detected-operating-system: by eggs.gnu.org: Windows NT kernel [generic] [fuzzy] X-Received-From: 192.94.38.131 Hi, this work-in-progress patch implements a new option -foffload-alias=. The option -foffload-alias=none instructs the compiler to assume that objects references and pointer dereferences in an offload region do not alias. The option -foffload-alias=pointer instructs the compiler to assume that objects references in an offload region do not alias. The option -foffload-alias=all instructs the compiler to make no assumptions about aliasing in offload regions. The default value is -foffload-alias=none. The patch works by adding restrict to the types of the fields used to pass data to an offloading region. Atm, the kernels-loop-offload-alias-ptr.c test-case passes, but the kernels-loop-offload-alias-none.c test-case fails. For the latter, the required amount of restrict is added, but it has no effect. I've reported this in a more basic form in PR67742: "3rd-level restrict ignored". Thanks, - Tom Implement -foffload-alias 2015-09-28 Tom de Vries * common.opt (foffload-alias): New option. * flag-types.h (enum offload_alias): New enum. * omp-low.c (is_gimple_oacc_offload): New function. (install_var_field): Handle flag_offload_alias. * doc/invoke.texi (@item Code Generation Options): Add -foffload-alias. (@item -foffload-alias): New item. * c-c++-common/goacc/kernels-loop-offload-alias-none.c: New test. * c-c++-common/goacc/kernels-loop-offload-alias-ptr.c: New test. --- gcc/common.opt | 16 ++++++ gcc/doc/invoke.texi | 11 ++++ gcc/flag-types.h | 7 +++ gcc/omp-low.c | 24 ++++++++- .../goacc/kernels-loop-offload-alias-none.c | 63 ++++++++++++++++++++++ .../goacc/kernels-loop-offload-alias-ptr.c | 52 ++++++++++++++++++ 6 files changed, 172 insertions(+), 1 deletion(-) create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-offload-alias-none.c create mode 100644 gcc/testsuite/c-c++-common/goacc/kernels-loop-offload-alias-ptr.c diff --git a/gcc/common.opt b/gcc/common.opt index 290b6b3..28977a4 100644 --- a/gcc/common.opt +++ b/gcc/common.opt @@ -1730,6 +1730,22 @@ Enum(offload_abi) String(ilp32) Value(OFFLOAD_ABI_ILP32) EnumValue Enum(offload_abi) String(lp64) Value(OFFLOAD_ABI_LP64) +foffload-alias= +Common Joined RejectNegative Enum(offload_alias) Var(flag_offload_alias) Init(OFFLOAD_ALIAS_NONE) +-foffload-alias=[all|pointer|none] Assume non-aliasing in an offload region + +Enum +Name(offload_alias) Type(enum offload_alias) UnknownError(unknown offload aliasing %qs) + +EnumValue +Enum(offload_alias) String(all) Value(OFFLOAD_ALIAS_ALL) + +EnumValue +Enum(offload_alias) String(pointer) Value(OFFLOAD_ALIAS_POINTER) + +EnumValue +Enum(offload_alias) String(none) Value(OFFLOAD_ALIAS_NONE) + fomit-frame-pointer Common Report Var(flag_omit_frame_pointer) Optimization When possible do not generate stack frames diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index 909a453..a5ab785 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -1136,6 +1136,7 @@ See S/390 and zSeries Options. -finstrument-functions-exclude-function-list=@var{sym},@var{sym},@dots{} @gol -finstrument-functions-exclude-file-list=@var{file},@var{file},@dots{} @gol -fno-common -fno-ident @gol +-foffload-alias=@r{[}none@r{|}pointer@r{|}all@r{]} @gol -fpcc-struct-return -fpic -fPIC -fpie -fPIE -fno-plt @gol -fno-jump-tables @gol -frecord-gcc-switches @gol @@ -23695,6 +23696,16 @@ The options @option{-ftrapv} and @option{-fwrapv} override each other, so using using @option{-ftrapv} @option{-fwrapv} @option{-fno-wrapv} on the command-line results in @option{-ftrapv} being effective. +@item -foffload-alias=@r{[}none@r{|}pointer@r{|}all@r{]} +@opindex -foffload-alias +The option @option{-foffload-alias=none} instructs the compiler to assume that +objects references and pointer dereferences in an offload region do not alias. +The option @option{-foffload-alias=pointer} instruct the compiler to assume that +objects references in an offload region are presumed unaliased. The option +@option{-foffload-alias=all} instructs the compiler to make no assumtions about +aliasing in offload regions. The default value is +@option{-foffload-alias=none}. + @item -fexceptions @opindex fexceptions Enable exception handling. Generates extra code needed to propagate diff --git a/gcc/flag-types.h b/gcc/flag-types.h index ac9ca0b..e8e672d 100644 --- a/gcc/flag-types.h +++ b/gcc/flag-types.h @@ -286,5 +286,12 @@ enum gfc_convert GFC_FLAG_CONVERT_LITTLE }; +enum offload_alias +{ + OFFLOAD_ALIAS_ALL, + OFFLOAD_ALIAS_POINTER, + OFFLOAD_ALIAS_NONE +}; + #endif /* ! GCC_FLAG_TYPES_H */ diff --git a/gcc/omp-low.c b/gcc/omp-low.c index a5904eb..9cbba1f 100644 --- a/gcc/omp-low.c +++ b/gcc/omp-low.c @@ -1140,6 +1140,16 @@ use_pointer_for_field (tree decl, omp_context *shared_ctx) return false; } +static bool +is_gimple_oacc_offload (const gimple *stmt) +{ + return (gimple_code (stmt) == GIMPLE_OMP_TARGET + && (((gimple_omp_target_kind (stmt) + == GF_OMP_TARGET_KIND_OACC_PARALLEL) + || (gimple_omp_target_kind (stmt) + == GF_OMP_TARGET_KIND_OACC_KERNELS)))); +} + /* Construct a new automatic decl similar to VAR. */ static tree @@ -1283,6 +1293,7 @@ static void install_var_field (tree var, bool by_ref, int mask, omp_context *ctx) { tree field, type, sfield = NULL_TREE; + bool in_oacc_offload = is_gimple_oacc_offload (ctx->stmt); gcc_assert ((mask & 1) == 0 || !splay_tree_lookup (ctx->field_map, (splay_tree_key) var)); @@ -1298,7 +1309,18 @@ install_var_field (tree var, bool by_ref, int mask, omp_context *ctx) type = build_pointer_type (build_pointer_type (type)); } else if (by_ref) - type = build_pointer_type (type); + { + if (in_oacc_offload + && flag_offload_alias == OFFLOAD_ALIAS_NONE + && POINTER_TYPE_P (type)) + type = build_qualified_type (type, TYPE_QUAL_RESTRICT); + + type = build_pointer_type (type); + + if (in_oacc_offload + && flag_offload_alias != OFFLOAD_ALIAS_ALL) + type = build_qualified_type (type, TYPE_QUAL_RESTRICT); + } else if ((mask & 3) == 1 && is_reference (var)) type = TREE_TYPE (type); diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-offload-alias-none.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-offload-alias-none.c new file mode 100644 index 0000000..918bced --- /dev/null +++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-offload-alias-none.c @@ -0,0 +1,63 @@ +/* { dg-additional-options "-O2" } */ +/* { dg-additional-options "-ftree-parallelize-loops=32" } */ +/* { dg-additional-options "-fdump-tree-parloops_oacc_kernels-all" } */ +/* { dg-additional-options "-fdump-tree-optimized" } */ +/* { dg-additional-options "-fdump-tree-alias-all" } */ +/* { dg-additional-options "-foffload-alias=none" } */ + +#include + +#define N (1024 * 512) +#define COUNTERTYPE unsigned int + +int +main (void) +{ + unsigned int *a; + unsigned int *b; + unsigned int *c; + + a = (unsigned int *)malloc (N * sizeof (unsigned int)); + b = (unsigned int *)malloc (N * sizeof (unsigned int)); + c = (unsigned int *)malloc (N * sizeof (unsigned int)); + + for (COUNTERTYPE i = 0; i < N; i++) + a[i] = i * 2; + + for (COUNTERTYPE i = 0; i < N; i++) + b[i] = i * 4; + +#pragma acc kernels copyin (a[0:N], b[0:N]) copyout (c[0:N]) + { + for (COUNTERTYPE ii = 0; ii < N; ii++) + c[ii] = a[ii] + b[ii]; + } + + for (COUNTERTYPE i = 0; i < N; i++) + if (c[i] != a[i] + b[i]) + abort (); + + free (a); + free (b); + free (c); + + return 0; +} + +/* Check that only one loop is analyzed, and that it can be parallelized. */ +/* { dg-final { scan-tree-dump-times "SUCCESS: may be parallelized" 1 "parloops_oacc_kernels" } } */ +/* { dg-final { scan-tree-dump-not "FAILED:" "parloops_oacc_kernels" } } */ + +/* Check that the loop has been split off into a function. */ +/* { dg-final { scan-tree-dump-times "(?n);; Function .*main._omp_fn.0" 1 "optimized" } } */ + +/* { dg-final { scan-tree-dump-times "(?n)pragma omp target oacc_parallel.*num_gangs\\(32\\)" 1 "parloops_oacc_kernels" } } */ + +/* { dg-final { scan-tree-dump-times "clique 1 base 1" 3 "alias" } } */ +/* { dg-final { scan-tree-dump-times "clique 1 base 2" 1 "alias" } } */ +/* { dg-final { scan-tree-dump-times "clique 1 base 3" 1 "alias" } } */ +/* { dg-final { scan-tree-dump-times "clique 1 base 4" 1 "alias" } } */ +/* { dg-final { scan-tree-dump-times "clique 1 base 5" 1 "alias" } } */ +/* { dg-final { scan-tree-dump-times "clique 1 base 6" 1 "alias" } } */ +/* { dg-final { scan-tree-dump-times "clique 1 base 7" 1 "alias" } } */ +/* { dg-final { scan-tree-dump-times "(?n)clique .* base .*" 9 "alias" } } */ diff --git a/gcc/testsuite/c-c++-common/goacc/kernels-loop-offload-alias-ptr.c b/gcc/testsuite/c-c++-common/goacc/kernels-loop-offload-alias-ptr.c new file mode 100644 index 0000000..029d77c --- /dev/null +++ b/gcc/testsuite/c-c++-common/goacc/kernels-loop-offload-alias-ptr.c @@ -0,0 +1,52 @@ +/* { dg-additional-options "-O2" } */ +/* { dg-additional-options "-ftree-parallelize-loops=32" } */ +/* { dg-additional-options "-fdump-tree-parloops_oacc_kernels-all" } */ +/* { dg-additional-options "-fdump-tree-optimized" } */ +/* { dg-additional-options "-fdump-tree-alias-all" } */ +/* { dg-additional-options "-foffload-alias=pointer" } */ + +#include + +#define N (1024 * 512) +#define COUNTERTYPE unsigned int + +unsigned int a[N]; +unsigned int b[N]; +unsigned int c[N]; + +int +main (void) +{ + for (COUNTERTYPE i = 0; i < N; i++) + a[i] = i * 2; + + for (COUNTERTYPE i = 0; i < N; i++) + b[i] = i * 4; + +#pragma acc kernels copyin (a[0:N], b[0:N]) copyout (c[0:N]) + { + for (COUNTERTYPE ii = 0; ii < N; ii++) + c[ii] = a[ii] + b[ii]; + } + + for (COUNTERTYPE i = 0; i < N; i++) + if (c[i] != a[i] + b[i]) + abort (); + + return 0; +} + +/* Check that only one loop is analyzed, and that it can be parallelized. */ +/* { dg-final { scan-tree-dump-times "SUCCESS: may be parallelized" 1 "parloops_oacc_kernels" } } */ +/* { dg-final { scan-tree-dump-not "FAILED:" "parloops_oacc_kernels" } } */ + +/* Check that the loop has been split off into a function. */ +/* { dg-final { scan-tree-dump-times "(?n);; Function .*main._omp_fn.0" 1 "optimized" } } */ + +/* { dg-final { scan-tree-dump-times "(?n)pragma omp target oacc_parallel.*num_gangs\\(32\\)" 1 "parloops_oacc_kernels" } } */ + +/* { dg-final { scan-tree-dump-times "clique 1 base 1" 3 "alias" } } */ +/* { dg-final { scan-tree-dump-times "clique 1 base 2" 1 "alias" } } */ +/* { dg-final { scan-tree-dump-times "clique 1 base 3" 1 "alias" } } */ +/* { dg-final { scan-tree-dump-times "clique 1 base 4" 1 "alias" } } */ +/* { dg-final { scan-tree-dump-times "(?n)clique .* base .*" 6 "alias" } } */ -- 1.9.1