
[1/3] openacc: Add support for gang local storage allocation in shared memory

Message ID aaf895fb1e3a009afc146d08d0cc267fa81971b3.1614342218.git.julian@codesourcery.com
State New
Series openacc: Gang-private variables in shared memory

Commit Message

Julian Brown Feb. 26, 2021, 12:34 p.m. UTC
This patch implements a method to track the "private-ness" of
OpenACC variables declared in offload regions in gang-partitioned,
worker-partitioned or vector-partitioned modes. Variables declared
implicitly in scoped blocks and those declared "private" on enclosing
directives (e.g. "acc parallel") are both handled. Variables that are
e.g. gang-private can then be adjusted so they reside in GPU shared
memory.

The reason for doing this is twofold: correct implementation of OpenACC
semantics, and optimisation, since shared memory might be faster than
the main memory on a GPU. Handling of private variables is intimately
tied to the execution model for gangs/workers/vectors implemented by
a particular target: for current targets, we use (or on mainline, will
soon use) a broadcasting/neutering scheme.

That is sufficient for code that e.g. sets a variable in worker-single
mode and expects to use the value in worker-partitioned mode. The
difficulty (semantics-wise) comes when the user wants to do something like
an atomic operation in worker-partitioned mode and expects a worker-single
(gang private) variable to be shared across each partitioned worker.
Forcing use of shared memory for such variables makes that work properly.

In terms of implementation, the parallelism level of a given loop is
not fixed until the oaccdevlow pass in the offload compiler, so the
patch delays fixing the parallelism level of variables declared on or
within such loops until the same point. This is done by adding a new
internal UNIQUE function (OACC_PRIVATE) that lists (the address of) each
private variable as an argument, and other arguments set so as to be able
to determine the correct parallelism level to use for the listed
variables. This new internal function fits into the existing scheme for
demarcating OpenACC loops, as described in comments in the patch.

Two new target hooks are introduced: TARGET_GOACC_ADJUST_PRIVATE_DECL and
TARGET_GOACC_EXPAND_VAR_DECL.  The first can tweak a variable declaration
at oaccdevlow time, and the second at expand time.  The first or both
of these target hooks can be used by a given offload target, depending
on its strategy for implementing private variables.

Tested with offloading to AMD GCN and (separately) to NVPTX.

OK (for stage 1)?

Thanks,

Julian

2021-02-22  Julian Brown  <julian@codesourcery.com>
	    Chung-Lin Tang  <cltang@codesourcery.com>

gcc/
	* doc/tm.texi.in (TARGET_GOACC_EXPAND_VAR_DECL,
	TARGET_GOACC_ADJUST_PRIVATE_DECL): Document new hooks.
	* doc/tm.texi: Regenerate.
	* expr.c (expand_expr_real_1): Expand decls using the expand_var_decl
	OpenACC hook if defined.
	* internal-fn.c (expand_UNIQUE): Handle IFN_UNIQUE_OACC_PRIVATE.
	* internal-fn.h (IFN_UNIQUE_CODES): Add OACC_PRIVATE.
	* omp-low.c (omp_context): Add oacc_addressable_var_decls field.
	(lower_oacc_reductions): Add PRIVATE_MARKER parameter.  Insert before
	fork.
	(lower_oacc_head_tail): Add PRIVATE_MARKER parameter. Modify private
	marker's gimple call arguments, and pass it to lower_oacc_reductions.
	(oacc_record_private_var_clauses, oacc_record_vars_in_bind,
	make_oacc_private_marker): New functions.
	(lower_omp_for): Call oacc_record_private_var_clauses with "for"
	clauses. Call oacc_record_vars_in_bind for OpenACC contexts. Create
	private marker and pass to lower_oacc_head_tail.
	(lower_omp_target): Create private marker and pass to
	lower_oacc_reductions.
	(lower_omp_1): Call oacc_record_vars_in_bind for OpenACC.
	* omp-offload.c (convert.h): Include.
	(oacc_loop_xform_head_tail): Treat private-variable markers like
	fork/join when transforming head/tail sequences.
	(struct addr_expr_rewrite_info): Add struct.
	(rewrite_addr_expr): New function.
	(is_sync_builtin_call): New function.
	(execute_oacc_device_lower): Support rewriting gang-private variables
	using target hook, and fix up addr_expr and var_decl nodes afterwards.
	* target.def (expand_accel_var, adjust_private_decl): New hooks.

libgomp/
	* testsuite/libgomp.oacc-c-c++-common/gang-private-1.c: New test.
	* testsuite/libgomp.oacc-c-c++-common/loop-gwv-2.c: Likewise.
	* testsuite/libgomp.oacc-fortran/gangprivate-attrib-1.f90: Likewise.
	* testsuite/libgomp.oacc-fortran/gangprivate-attrib-2.f90: Likewise.
---
 gcc/doc/tm.texi                               |  26 ++
 gcc/doc/tm.texi.in                            |   4 +
 gcc/expr.c                                    |  13 +-
 gcc/internal-fn.c                             |   2 +
 gcc/internal-fn.h                             |   3 +-
 gcc/omp-low.c                                 | 122 +++++++++-
 gcc/omp-offload.c                             | 225 +++++++++++++++++-
 gcc/target.def                                |  30 +++
 .../gang-private-1.c                          |  38 +++
 .../libgomp.oacc-c-c++-common/loop-gwv-2.c    |  95 ++++++++
 .../gangprivate-attrib-1.f90                  |  25 ++
 .../gangprivate-attrib-2.f90                  |  25 ++
 12 files changed, 599 insertions(+), 9 deletions(-)
 create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/gang-private-1.c
 create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/loop-gwv-2.c
 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/gangprivate-attrib-1.f90
 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/gangprivate-attrib-2.f90

Comments

Thomas Schwinge April 15, 2021, 5:26 p.m. UTC | #1
Hi!

On 2021-02-26T04:34:50-0800, Julian Brown <julian@codesourcery.com> wrote:
> This patch

Thanks, Julian, for your continued improving of these changes!

This has iterated through several conceptually different designs and
implementations, by several people, over the past several years.

It's now been made my task to finish it up -- but I'll very much
appreciate your input (Julian's, primarily) on the following remarks,
which are basically my open work items.


> implements a method to track the "private-ness" of
> OpenACC variables declared in offload regions in gang-partitioned,
> worker-partitioned or vector-partitioned modes. Variables declared
> implicitly in scoped blocks and those declared "private" on enclosing
> directives (e.g. "acc parallel") are both handled. Variables that are
> e.g. gang-private can then be adjusted so they reside in GPU shared
> memory.
>
> The reason for doing this is twofold: correct implementation of OpenACC
> semantics

ACK, and as mentioned before, this very much relates to
<https://gcc.gnu.org/PR90115> "OpenACC: predetermined private levels for
variables declared in blocks" (plus the corresponding use of 'private'
clauses, implicit/explicit, including 'firstprivate') and
<https://gcc.gnu.org/PR90114> "Predetermined private levels for variables
declared in OpenACC accelerator routines", to which we should thus refer
in testcases/ChangeLog/commit log, as appropriate.  I do understand we're
not yet addressing all of that (and that's fine!), but we should capture
remaining work items of the PRs and Cesar's list in
<http://mid.mail-archive.com/70d27ebd-762e-59a3-082f-48fa0c687212@codesourcery.com>,
as appropriate.


I was surprised that we didn't really have to fix up any existing libgomp
testcases, because there seem to be quite a few that contain a pattern
(exemplified by the 'tmp' variable) as follows:

    int main()
    {
    #define N 123
      int data[N];
      int tmp;

    #pragma acc parallel // implicit 'firstprivate(tmp)'
      {
        // 'tmp' now conceptually made gang-private here.
    #pragma acc loop gang
        for (int i = 0; i < 123; ++i)
          {
            tmp = i + 234;
            data[i] = tmp;
          }
      }

      for (int i = 0; i < 123; ++i)
        if (data[i] != i + 234)
          __builtin_abort ();

      return 0;
    }

With the code changes as posted, this actually now does *not* use
gang-private memory for 'tmp', but instead continues to use
"thread-private registers", as before.

Same for:

    --- s3.c    2021-04-13 17:26:49.628739379 +0200
    +++ s3_2.c  2021-04-13 17:29:43.484579664 +0200
    @@ -4,6 +4,6 @@
       int data[N];
    -  int tmp;

    -#pragma acc parallel // implicit 'firstprivate(tmp)'
    +#pragma acc parallel
       {
    +    int tmp;
         // 'tmp' now conceptually made gang-private here.
     #pragma acc loop gang

I suppose that's due to conditionalizing this transformation on
'TREE_ADDRESSABLE' (as you're doing), so we should be mostly "safe"
regarding such existing testcases (but I haven't verified that yet in
detail).

That needs to be documented in testcases, with some kind of dump scanning
(host compilation-side even; see below).

A note for later: if this weren't just a 'gang' loop, but 'gang' plus
'worker' and/or 'vector', we'd actually be fixing up user code with
undefined behavior into "correct" code (by *not* making 'tmp'
gang-private, but thread-private), right?

As that may not be obvious to the reader, I'd like to have the
'TREE_ADDRESSABLE' conditionalization be documented in the code.  You had
explained that in
<http://mid.mail-archive.com/20190612204216.0ec83e4e@squid.athome>: "a
non-addressable variable [...]".


> and optimisation, since shared memory might be faster than
> the main memory on a GPU.

Do we potentially have a problem that making more use of (scarce)
gang-private memory may negatively affect performance, because potentially
fewer OpenACC gangs may then be launched to the GPU hardware in parallel?
(Of course, OpenACC semantics conformance firstly is more important than
performance, but there may be ways to be conformant and performant;
"quality of implementation".)  Have you run any such performance testing
with the benchmarking codes that we've got set up?

(As I'm more familiar with that, I'm using nvptx offloading examples in
the following, whilst assuming that similar discussion may apply for GCN
offloading, which uses similar hardware concepts, as far as I remember.)

Looking at the existing 'libgomp.oacc-c-c++-common/private-variables.c'
(random example), for nvptx offloading, '-O0', we see the following PTX
JIT compilation changes (word-'diff' of 'GOMP_DEBUG=1' at run-time):

    info    : Function properties for 'local_g_1$_omp_fn$0':
    info    : used 27 registers, 32 stack, [-176-]{+256+} bytes smem, 328 bytes cmem[0], 0 bytes lmem
    info    : Function properties for 'local_w_1$_omp_fn$0':
    info    : used 40 registers, 48 stack, [-176-]{+256+} bytes smem, 328 bytes cmem[0], 0 bytes lmem
    info    : Function properties for 'local_w_2$_omp_fn$0':
    [...]
    info    : Function properties for 'parallel_g_1$_omp_fn$0':
    info    : used 27 registers, 32 stack, [-176-]{+256+} bytes smem, 328 bytes cmem[0], 0 bytes lmem
    info    : Function properties for 'parallel_g_2$_omp_fn$0':
    info    : used 32 registers, 160 stack, [-176-]{+256+} bytes smem, 328 bytes cmem[0], 0 bytes lmem

... that is, PTX '.shared' usage increases from 176 to 256 bytes for
*all* functions, even though only 'loop_g_4$_omp_fn$0' and
'loop_g_5$_omp_fn$0' are actually using gang-private memory.

Execution testing works before (original code, not using gang-private
memory) as well as after (code changes as posted, using gang-private
memory), so use of gang-private memory doesn't seem necessary here for
"correct execution" -- or at least: "expected execution result".  ;-)
I haven't looked yet whether there's a potential issue in the testcases
here.

The additional '256 - 176 = 80' bytes of PTX '.shared' memory requested
are due to GCC nvptx back end implementation's use of a global "Shared
memory block for gang-private variables":

     // BEGIN VAR DEF: __oacc_bcast
     .shared .align 8 .u8 __oacc_bcast[176];
    +// BEGIN VAR DEF: __gangprivate_shared
    +.shared .align 32 .u8 __gangprivate_shared[64];

..., plus (I suppose) an additional '80 - 64 = 16' padding/unused bytes
to establish '.align 32' after '.align 8' for '__oacc_bcast'.

Per
<https://docs.nvidia.com/cuda/cuda-c-programming-guide/#compute-capabilities>,
"Table 15. Technical Specifications per Compute Capability", "Compute
Capability": "3.5", we have a "Maximum amount of shared memory per SM":
"48 KB", so with '176 bytes smem', that permits '48 * 1024 / 176 = 279'
thread blocks ('num_gangs') resident at one point in time, whereas with
'256 bytes smem', it's just '48 * 1024 / 256 = 192' thread blocks
resident at one point in time.  (Not sure that I got all the details
right, but you get the idea/concern?)

Anyway, that shall be OK for now, but we shall later look into optimizing
that; can't we have '.shared' local to the relevant PTX functions instead
of global?

Interestingly, compiling with '-O2', we see:

    // BEGIN VAR DEF: __oacc_bcast
    .shared .align 8 .u8 __oacc_bcast[144];
    {+// BEGIN VAR DEF: __gangprivate_shared+}
    {+.shared .align 128 .u8 __gangprivate_shared[32];+}

With '-O2', only 'loop_g_5$_omp_fn$0' is using gang-private memory, and
apparently the PTX JIT is able to figure that out from the PTX code that
GCC generates, and is then able to localize '.shared' memory usage to
just 'loop_g_5$_omp_fn$0':

    [...]
    info    : Function properties for 'loop_g_4$_omp_fn$0':
    info    : used 12 registers, 0 stack, 144 bytes smem, 328 bytes cmem[0], 0 bytes lmem
    info    : Function properties for 'loop_g_5$_omp_fn$0':
    info    : used [-30-]{+32+} registers, 32 stack, [-144-]{+288+} bytes smem, 328 bytes cmem[0], 0 bytes lmem
    info    : Function properties for 'loop_g_6$_omp_fn$0':
    info    : used 13 registers, 0 stack, 144 bytes smem, 328 bytes cmem[0], 0 bytes lmem
    [...]

This strongly suggests to me that indeed there must exist a programmatic
way to get rid of the global "Shared memory block for gang-private
variables".

The additional '288 - 144 = 144' bytes of PTX '.shared' memory requested
are 32 bytes for 'int x[8]' ('#pragma acc loop gang private(x)') plus
'288 - 32 - 144 = 112' padding/unused bytes to establish '.align 128' (!)
after '.align 8' for '__oacc_bcast'.  That's clearly not ideal: 112 bytes
wasted in contrast to just '144 + 32 = 176' bytes actually used.  (I have
not yet looked why/whether this really needs '.align 128'.)

I have not yet looked whether similar concerns exist for the GCC GCN back
end implementation.  (That one also does set 'TREE_STATIC' for
gang-private memory, so it's a global allocation?)


> Handling of private variables is intimately
> tied to the execution model for gangs/workers/vectors implemented by
> a particular target: for current targets, we use (or on mainline, will
> soon use) a broadcasting/neutering scheme.
>
> That is sufficient for code that e.g. sets a variable in worker-single
> mode and expects to use the value in worker-partitioned mode. The
> difficulty (semantics-wise) comes when the user wants to do something like
> an atomic operation in worker-partitioned mode and expects a worker-single
> (gang private) variable to be shared across each partitioned worker.
> Forcing use of shared memory for such variables makes that work properly.

Are we reliably making sure that gang-private variables (and other
levels, in general) are not subject to the usual broadcasting scheme
(nvptx, at least), or does that currently work "by accident"?  (I haven't
looked into that, yet.)


> In terms of implementation, the parallelism level of a given loop is
> not fixed until the oaccdevlow pass in the offload compiler, so the
> patch delays fixing the parallelism level of variables declared on or
> within such loops until the same point. This is done by adding a new
> internal UNIQUE function (OACC_PRIVATE) that lists (the address of) each
> private variable as an argument, and other arguments set so as to be able
> to determine the correct parallelism level to use for the listed
> variables. This new internal function fits into the existing scheme for
> demarcating OpenACC loops, as described in comments in the patch.

Yes, thanks, that's conceptually now much better than the earlier
variants that we had.  :-) (Hooray, again, for Nathan's OpenACC execution
model design!)

What we should add, though, is a bunch of testcases to verify that the
expected processing does/doesn't happen for relevant source code
constructs.  I'm thinking that when the transformation is/isn't done,
that gets logged, and we can then scan the dumps accordingly.  Some of
that is implemented already; we should be able to do such scanning
generally for host compilation, too, not just offloading compilation.


Generally, we also have to make sure that the expected privatizations
(plural) happen if there are multiple levels of parallelism involved:
(deep) loops nests with 'gang', 'worker', 'vector', 'seq' as well as
combinations of 'gang', 'worker', 'vector' on one level.

    #pragma acc parallel
    {
      int x;
      // What's 'x' at this level?
      #pragma acc loop seq private(x)
      [for]
        {
          // What's 'x' at this level?
          #pragma acc loop private(x)
          [for]
            {
              // What's 'x' at this level?
              #pragma acc loop worker vector private(x)
              [for...]
                {
                  // What's 'x' at this level?

Etc.


> Two new target hooks are introduced: TARGET_GOACC_ADJUST_PRIVATE_DECL and
> TARGET_GOACC_EXPAND_VAR_DECL.  The first can tweak a variable declaration
> at oaccdevlow time, and the second at expand time.  The first or both
> of these target hooks can be used by a given offload target, depending
> on its strategy for implementing private variables.

ACK.

So, currently we're only looking at making the gang-private level work.
Regarding that, we have two configurations: (1) for GCN offloading,
'targetm.goacc.adjust_private_decl' does the work (in particular, change
'TREE_TYPE' etc.) and there is no 'targetm.goacc.expand_var_decl', and
(2) for nvptx offloading, 'targetm.goacc.adjust_private_decl' only sets a
marker ('oacc gangprivate' attribute) and then
'targetm.goacc.expand_var_decl' does the work.

Therefore I suggest we clarify the (currently) expected handling similar
to:

    --- gcc/omp-offload.c
    +++ gcc/omp-offload.c
    @@ -1854,6 +1854,19 @@ oacc_rewrite_var_decl (tree *tp, int *walk_subtrees, void *data)
       return NULL_TREE;
     }

    +static tree
    +oacc_rewrite_var_decl_ (tree *tp, int *walk_subtrees, void *data)
    +{
    +  tree t = oacc_rewrite_var_decl (tp, walk_subtrees, data);
    +  if (targetm.goacc.expand_var_decl)
    +    {
    +      walk_stmt_info *wi = (walk_stmt_info *) data;
    +      var_decl_rewrite_info *info = (var_decl_rewrite_info *) wi->info;
    +      gcc_assert (!info->modified);
    +    }
    +  return t;
    +}
    +
     /* Return TRUE if CALL is a call to a builtin atomic/sync operation.  */

     static bool
    @@ -2195,6 +2208,9 @@ execute_oacc_device_lower ()
          COMPONENT_REFS, ARRAY_REFS and plain VAR_DECLs are also rewritten to use
          the new decl, adjusting types of appropriate tree nodes as necessary.  */

    +  if (targetm.goacc.expand_var_decl)
    +    gcc_assert (adjusted_vars.is_empty ());
    +
       if (targetm.goacc.adjust_private_decl)
         {
           FOR_ALL_BB_FN (bb, cfun)
    @@ -2217,7 +2233,7 @@ execute_oacc_device_lower ()
                memset (&wi, 0, sizeof (wi));
                wi.info = &info;

    -           walk_gimple_op (stmt, oacc_rewrite_var_decl, &wi);
    +           walk_gimple_op (stmt, oacc_rewrite_var_decl_, &wi);

                if (info.modified)
                  update_stmt (stmt);

Or, in fact, 'if (targetm.goacc.expand_var_decl)', skip the
'adjusted_vars' handling completely?

I do understand that eventually (in particular, for worker-private
level?), both 'targetm.goacc.adjust_private_decl' and
'targetm.goacc.expand_var_decl' may need to do things, but that's
currently not meant to be addressed, and thus not fully worked out and
implemented, and thus untested.  Hence, 'assert' what currently is
implemented/tested, only.

(Given that eventual goal, that's probably sufficient motivation to
indeed add the 'adjusted_vars' handling in generic 'gcc/omp-offload.c'
instead of moving it into the GCN back end?)


For 'libgomp.oacc-c-c++-common/static-variable-1.c' that I've recently
added, the code changes here cause execution test FAILs for nvptx
offloading (because of making 'static' variables gang-private), and
trigger an ICE with GCN offloading compilation.  It isn't clear to me
what the desired semantics are for (user-specified) 'static' variables --
see <https://github.com/OpenACC/openacc-spec/issues/372> "C/C++ 'static'
variables" (only visible to members of the GitHub OpenACC organization)
-- but an ICE clearly isn't the right answer.  ;-)

As for certain transformation/optimizations, 'static' variables may be
synthesized in the GCC middle end, I suppose we should preserve the
status quo (as documented via
'libgomp.oacc-c-c++-common/static-variable-1.c') until #372 gets resolved
in OpenACC?  (I suppose, skip the transformation if 'TREE_STATIC' is set,
or similar.)


A few individual comments (search for '[TS]'), for easy reference
embedded in full-quote of the generic code changes.  GCN and nvptx back
end code changes to be found in
<http://mid.mail-archive.com/d6ae43626eed9fd968250ee10109433e810d1048.1614342218.git.julian@codesourcery.com>,
<http://mid.mail-archive.com/aab0a87b99797e1fcc73e7f3e76152405289805a.1614342218.git.julian@codesourcery.com>.


> --- a/gcc/target.def
> +++ b/gcc/target.def
> @@ -1712,6 +1712,36 @@ for allocating any storage for reductions when necessary.",
>  void, (gcall *call),
>  default_goacc_reduction)
>
> +DEFHOOK
> +(expand_var_decl,
> +"This hook, if defined, is used by accelerator target back-ends to expand\n\
> +specially handled kinds of @code{VAR_DECL} expressions.  A particular use is\n\
> +to place variables with specific attributes inside special accelerator\n\
> +memories.  A return value of @code{NULL} indicates that the target does not\n\
> +handle this @code{VAR_DECL}, and normal RTL expanding is resumed.\n\
> +\n\
> +Only define this hook if your accelerator target needs to expand certain\n\
> +@code{VAR_DECL} nodes in a way that differs from the default.  You can also adjust\n\
> +private variables at OpenACC device-lowering time using the\n\
> +@code{TARGET_GOACC_ADJUST_PRIVATE_DECL} target hook.",
> +rtx, (tree var),
> +NULL)
> +
> +DEFHOOK
> +(adjust_private_decl,
> +"This hook, if defined, is used by accelerator target back-ends to adjust\n\
> +OpenACC variable declarations that should be made private to the given\n\
> +parallelism level (i.e. @code{GOMP_DIM_GANG}, @code{GOMP_DIM_WORKER} or\n\
> +@code{GOMP_DIM_VECTOR}).  A typical use for this hook is to force variable\n\
> +declarations at the @code{gang} level to reside in GPU shared memory, by\n\
> +setting the address space of the decl and making it static.\n\
> +\n\
> +You may also use the @code{TARGET_GOACC_EXPAND_VAR_DECL} hook if the\n\
> +adjusted variable declaration needs to be expanded to RTL in a non-standard\n\
> +way.",
> +tree, (tree var, int level),
> +NULL)
> +
>  HOOK_VECTOR_END (goacc)
>
>  /* Functions relating to vectorization.  */

> --- a/gcc/doc/tm.texi
> +++ b/gcc/doc/tm.texi
> @@ -6227,6 +6227,32 @@ like @code{cond_add@var{m}}.  The default implementation returns a zero
>  constant of type @var{type}.
>  @end deftypefn
>
> +@deftypefn {Target Hook} rtx TARGET_GOACC_EXPAND_VAR_DECL (tree @var{var})
> +This hook, if defined, is used by accelerator target back-ends to expand
> +specially handled kinds of @code{VAR_DECL} expressions.  A particular use is
> +to place variables with specific attributes inside special accelerator
> +memories.  A return value of @code{NULL} indicates that the target does not
> +handle this @code{VAR_DECL}, and normal RTL expanding is resumed.
> +
> +Only define this hook if your accelerator target needs to expand certain
> +@code{VAR_DECL} nodes in a way that differs from the default.  You can also adjust
> +private variables at OpenACC device-lowering time using the
> +@code{TARGET_GOACC_ADJUST_PRIVATE_DECL} target hook.
> +@end deftypefn
> +
> +@deftypefn {Target Hook} tree TARGET_GOACC_ADJUST_PRIVATE_DECL (tree @var{var}, int @var{level})
> +This hook, if defined, is used by accelerator target back-ends to adjust
> +OpenACC variable declarations that should be made private to the given
> +parallelism level (i.e. @code{GOMP_DIM_GANG}, @code{GOMP_DIM_WORKER} or
> +@code{GOMP_DIM_VECTOR}).  A typical use for this hook is to force variable
> +declarations at the @code{gang} level to reside in GPU shared memory, by
> +setting the address space of the decl and making it static.
> +
> +You may also use the @code{TARGET_GOACC_EXPAND_VAR_DECL} hook if the
> +adjusted variable declaration needs to be expanded to RTL in a non-standard
> +way.
> +@end deftypefn
> +
>  @node Anchored Addresses
>  @section Anchored Addresses
>  @cindex anchored addresses

> --- a/gcc/doc/tm.texi.in
> +++ b/gcc/doc/tm.texi.in
> @@ -4219,6 +4219,10 @@ address;  but often a machine-dependent strategy can generate better code.
>
>  @hook TARGET_PREFERRED_ELSE_VALUE
>
> +@hook TARGET_GOACC_EXPAND_VAR_DECL
> +
> +@hook TARGET_GOACC_ADJUST_PRIVATE_DECL
> +
>  @node Anchored Addresses
>  @section Anchored Addresses
>  @cindex anchored addresses


> --- a/gcc/expr.c
> +++ b/gcc/expr.c
> @@ -10224,8 +10224,19 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
>        exp = SSA_NAME_VAR (ssa_name);
>        goto expand_decl_rtl;
>
> -    case PARM_DECL:
>      case VAR_DECL:
> +      /* Allow accel compiler to handle variables that require special
> +      treatment, e.g. if they have been modified in some way earlier in
> +      compilation by the adjust_private_decl OpenACC hook.  */
> +      if (flag_openacc && targetm.goacc.expand_var_decl)
> +     {
> +       temp = targetm.goacc.expand_var_decl (exp);
> +       if (temp)
> +         return temp;
> +     }
> +      /* ... fall through ...  */
> +
> +    case PARM_DECL:

[TS] Are we sure that we don't need the same handling for a 'PARM_DECL',
too?  (If yes, to document and verify that, should we thus again unify
the two 'case's, and in 'targetm.goacc.expand_var_decl' add a
'gcc_checking_assert (TREE_CODE (var) == VAR_DECL')'?)

Also, are we sure that all the following existing processing is not
relevant to do before the 'return temp' (see above)?  That's not a
concern for GCN (which doesn't use 'targetm.goacc.expand_var_decl', and
thus does execute all this following existing processing), but it is for
nvptx (which does use 'targetm.goacc.expand_var_decl', and thus doesn't
execute all this following existing processing if that returned
something).  Or, is 'targetm.goacc.expand_var_decl' conceptually and
practically meant to implement all of the following processing, or is
this for other reasons not relevant in the
'targetm.goacc.expand_var_decl' case:

>        /* If a static var's type was incomplete when the decl was written,
>        but the type is complete now, lay out the decl now.  */
>        if (DECL_SIZE (exp) == 0
|            && COMPLETE_OR_UNBOUND_ARRAY_TYPE_P (TREE_TYPE (exp))
|            && (TREE_STATIC (exp) || DECL_EXTERNAL (exp)))
|          layout_decl (exp, 0);
|
|        /* fall through */
|
|      case FUNCTION_DECL:
|      case RESULT_DECL:
|        decl_rtl = DECL_RTL (exp);
|      expand_decl_rtl:
|        gcc_assert (decl_rtl);
|
|        /* DECL_MODE might change when TYPE_MODE depends on attribute target
|           settings for VECTOR_TYPE_P that might switch for the function.  */
|        if (currently_expanding_to_rtl
|            && code == VAR_DECL && MEM_P (decl_rtl)
|            && VECTOR_TYPE_P (type) && exp && DECL_MODE (exp) != mode)
|          decl_rtl = change_address (decl_rtl, TYPE_MODE (type), 0);
|        else
|          decl_rtl = copy_rtx (decl_rtl);
|
|        /* Record writes to register variables.  */
|        if (modifier == EXPAND_WRITE
|            && REG_P (decl_rtl)
|            && HARD_REGISTER_P (decl_rtl))
|          add_to_hard_reg_set (&crtl->asm_clobbers,
|                               GET_MODE (decl_rtl), REGNO (decl_rtl));
|
|        /* Ensure variable marked as used even if it doesn't go through
|           a parser.  If it hasn't be used yet, write out an external
|           definition.  */
|        if (exp)
|          TREE_USED (exp) = 1;
|
|        /* Show we haven't gotten RTL for this yet.  */
|        temp = 0;
|
|        /* Variables inherited from containing functions should have
|           been lowered by this point.  */
|        if (exp)
|          context = decl_function_context (exp);
|        gcc_assert (!exp
|                    || SCOPE_FILE_SCOPE_P (context)
|                    || context == current_function_decl
|                    || TREE_STATIC (exp)
|                    || DECL_EXTERNAL (exp)
|                    /* ??? C++ creates functions that are not TREE_STATIC.  */
|                    || TREE_CODE (exp) == FUNCTION_DECL);
|
|        /* This is the case of an array whose size is to be determined
|           from its initializer, while the initializer is still being parsed.
|           ??? We aren't parsing while expanding anymore.  */
|
|        if (MEM_P (decl_rtl) && REG_P (XEXP (decl_rtl, 0)))
|          temp = validize_mem (decl_rtl);
|
|        /* If DECL_RTL is memory, we are in the normal case and the
|           address is not valid, get the address into a register.  */
|
|        else if (MEM_P (decl_rtl) && modifier != EXPAND_INITIALIZER)
|          {
|            if (alt_rtl)
|              *alt_rtl = decl_rtl;
|            decl_rtl = use_anchored_address (decl_rtl);
|            if (modifier != EXPAND_CONST_ADDRESS
|                && modifier != EXPAND_SUM
|                && !memory_address_addr_space_p (exp ? DECL_MODE (exp)
|                                                 : GET_MODE (decl_rtl),
|                                                 XEXP (decl_rtl, 0),
|                                                 MEM_ADDR_SPACE (decl_rtl)))
|              temp = replace_equiv_address (decl_rtl,
|                                            copy_rtx (XEXP (decl_rtl, 0)));
|          }
|
|        /* If we got something, return it.  But first, set the alignment
|           if the address is a register.  */
|        if (temp != 0)
|          {
|            if (exp && MEM_P (temp) && REG_P (XEXP (temp, 0)))
|              mark_reg_pointer (XEXP (temp, 0), DECL_ALIGN (exp));
|          }
|        else if (MEM_P (decl_rtl))
|          temp = decl_rtl;
|
|        if (temp != 0)
|          {
|            if (MEM_P (temp)
|                && modifier != EXPAND_WRITE
|                && modifier != EXPAND_MEMORY
|                && modifier != EXPAND_INITIALIZER
|                && modifier != EXPAND_CONST_ADDRESS
|                && modifier != EXPAND_SUM
|                && !inner_reference_p
|                && mode != BLKmode
|                && MEM_ALIGN (temp) < GET_MODE_ALIGNMENT (mode))
|              temp = expand_misaligned_mem_ref (temp, mode, unsignedp,
|                                                MEM_ALIGN (temp), NULL_RTX, NULL);
|
|            return temp;
|          }
| [...]

[TS] I don't understand that yet.  :-|

Instead of the current "early-return" handling:

    temp = targetm.goacc.expand_var_decl (exp);
    if (temp)
      return temp;

... should we maybe just set:

    DECL_RTL (exp) = targetm.goacc.expand_var_decl (exp)

... (or similar), and then let the usual processing continue?


> --- a/gcc/internal-fn.c
> +++ b/gcc/internal-fn.c
> @@ -2957,6 +2957,8 @@ expand_UNIQUE (internal_fn, gcall *stmt)
>        else
>       gcc_unreachable ();
>        break;
> +    case IFN_UNIQUE_OACC_PRIVATE:
> +      break;
>      }
>
>    if (pattern)

> --- a/gcc/internal-fn.h
> +++ b/gcc/internal-fn.h
> @@ -36,7 +36,8 @@ along with GCC; see the file COPYING3.  If not see
>  #define IFN_UNIQUE_CODES                               \
>    DEF(UNSPEC),       \
>      DEF(OACC_FORK), DEF(OACC_JOIN),          \
> -    DEF(OACC_HEAD_MARK), DEF(OACC_TAIL_MARK)
> +    DEF(OACC_HEAD_MARK), DEF(OACC_TAIL_MARK),        \
> +    DEF(OACC_PRIVATE)
>
>  enum ifn_unique_kind {
>  #define DEF(X) IFN_UNIQUE_##X


> --- a/gcc/omp-low.c
> +++ b/gcc/omp-low.c
> @@ -171,6 +171,9 @@ struct omp_context
>
>    /* True if there is bind clause on the construct (i.e. a loop construct).  */
>    bool loop_p;
> +
> +  /* Addressable variable decls in this context.  */
> +  vec<tree> oacc_addressable_var_decls;
>  };
>
>  static splay_tree all_contexts;
> @@ -7048,8 +7051,9 @@ lower_lastprivate_clauses (tree clauses, tree predicate, gimple_seq *body_p,
>
>  static void
>  lower_oacc_reductions (location_t loc, tree clauses, tree level, bool inner,
> -                    gcall *fork, gcall *join, gimple_seq *fork_seq,
> -                    gimple_seq *join_seq, omp_context *ctx)
> +                    gcall *fork, gcall *private_marker, gcall *join,
> +                    gimple_seq *fork_seq, gimple_seq *join_seq,
> +                    omp_context *ctx)
>  {
>    gimple_seq before_fork = NULL;
>    gimple_seq after_fork = NULL;
> @@ -7253,6 +7257,8 @@ lower_oacc_reductions (location_t loc, tree clauses, tree level, bool inner,
>
>    /* Now stitch things together.  */
>    gimple_seq_add_seq (fork_seq, before_fork);
> +  if (private_marker)
> +    gimple_seq_add_stmt (fork_seq, private_marker);
>    if (fork)
>      gimple_seq_add_stmt (fork_seq, fork);
>    gimple_seq_add_seq (fork_seq, after_fork);
> @@ -7989,7 +7995,7 @@ lower_oacc_loop_marker (location_t loc, tree ddvar, bool head,
>     HEAD and TAIL.  */
>
>  static void
> -lower_oacc_head_tail (location_t loc, tree clauses,
> +lower_oacc_head_tail (location_t loc, tree clauses, gcall *private_marker,
>                     gimple_seq *head, gimple_seq *tail, omp_context *ctx)
>  {
>    bool inner = false;
> @@ -7997,6 +8003,14 @@ lower_oacc_head_tail (location_t loc, tree clauses,
>    gimple_seq_add_stmt (head, gimple_build_assign (ddvar, integer_zero_node));
>
>    unsigned count = lower_oacc_head_mark (loc, ddvar, clauses, head, ctx);
> +
> +  if (private_marker)
> +    {
> +      gimple_set_location (private_marker, loc);
> +      gimple_call_set_lhs (private_marker, ddvar);
> +      gimple_call_set_arg (private_marker, 1, ddvar);
> +    }
> +
>    tree fork_kind = build_int_cst (unsigned_type_node, IFN_UNIQUE_OACC_FORK);
>    tree join_kind = build_int_cst (unsigned_type_node, IFN_UNIQUE_OACC_JOIN);
>
> @@ -8027,7 +8041,8 @@ lower_oacc_head_tail (location_t loc, tree clauses,
>                             &join_seq);
>
>        lower_oacc_reductions (loc, clauses, place, inner,
> -                          fork, join, &fork_seq, &join_seq,  ctx);
> +                          fork, (count == 1) ? private_marker : NULL,
> +                          join, &fork_seq, &join_seq,  ctx);
>
>        /* Append this level to head. */
>        gimple_seq_add_seq (head, fork_seq);

[TS] That looks good in principle.  Via the testing mentioned above, I
just want to make sure that this does all the expected things regarding
differently nested loops and privatization levels.

> @@ -9992,6 +10007,32 @@ lower_omp_for_lastprivate (struct omp_for_data *fd, gimple_seq *body_p,
>      }
>  }
>
> +/* Record vars listed in private clauses in CLAUSES in CTX.  This information
> +   is used to mark up variables that should be made private per-gang.  */
> +
> +static void
> +oacc_record_private_var_clauses (omp_context *ctx, tree clauses)
> +{
> +  for (tree c = clauses; c; c = OMP_CLAUSE_CHAIN (c))
> +    if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_PRIVATE)
> +      {
> +     tree decl = OMP_CLAUSE_DECL (c);
> +     if (VAR_P (decl) && TREE_ADDRESSABLE (decl))
> +       ctx->oacc_addressable_var_decls.safe_push (decl);
> +      }
> +}
> +
> +/* Record addressable vars declared in BINDVARS in CTX.  This information is
> +   used to mark up variables that should be made private per-gang.  */
> +
> +static void
> +oacc_record_vars_in_bind (omp_context *ctx, tree bindvars)
> +{
> +  for (tree v = bindvars; v; v = DECL_CHAIN (v))
> +    if (VAR_P (v) && TREE_ADDRESSABLE (v))
> +      ctx->oacc_addressable_var_decls.safe_push (v);
> +}
> +

[TS] For these two, we'd add the 'TREE_ADDRESSABLE' rationale mentioned
above.

>  /* Callback for walk_gimple_seq.  Find #pragma omp scan statement.  */
>
>  static tree
> @@ -10821,6 +10862,57 @@ lower_omp_for_scan (gimple_seq *body_p, gimple_seq *dlist, gomp_for *stmt,
>    *dlist = new_dlist;
>  }
>
> +/* Build an internal UNIQUE function with type IFN_UNIQUE_OACC_PRIVATE listing
> +   the addresses of variables that should be made private at the surrounding
> +   parallelism level.  Such functions appear in the gimple code stream in two
> +   forms, e.g. for a partitioned loop:
> +
> +      .data_dep.6 = .UNIQUE (OACC_HEAD_MARK, .data_dep.6, 1, 68);
> +      .data_dep.6 = .UNIQUE (OACC_PRIVATE, .data_dep.6, -1, &w);
> +      .data_dep.6 = .UNIQUE (OACC_FORK, .data_dep.6, -1);
> +      .data_dep.6 = .UNIQUE (OACC_HEAD_MARK, .data_dep.6);
> +
> +   or alternatively, OACC_PRIVATE can appear at the top level of a parallel,
> +   not as part of a HEAD_MARK sequence:
> +
> +      .UNIQUE (OACC_PRIVATE, 0, 0, &w);
> +
> +   For such stand-alone appearances, the 3rd argument is always 0, denoting
> +   gang partitioning.  */
> +
> +static gcall *
> +make_oacc_private_marker (omp_context *ctx)
> +{
> +  int i;
> +  tree decl;
> +
> +  if (ctx->oacc_addressable_var_decls.length () == 0)
> +    return NULL;
> +
> +  auto_vec<tree, 5> args;
> +
> +  args.quick_push (build_int_cst (integer_type_node, IFN_UNIQUE_OACC_PRIVATE));
> +  args.quick_push (integer_zero_node);
> +  args.quick_push (integer_minus_one_node);
> +
> +  FOR_EACH_VEC_ELT (ctx->oacc_addressable_var_decls, i, decl)
> +    {
> +      for (omp_context *thisctx = ctx; thisctx; thisctx = thisctx->outer)
> +     {
> +       tree inner_decl = maybe_lookup_decl (decl, thisctx);
> +       if (inner_decl)
> +         {
> +           decl = inner_decl;
> +           break;
> +         }
> +     }
> +      tree addr = build_fold_addr_expr (decl);
> +      args.safe_push (addr);
> +    }
> +
> +  return gimple_build_call_internal_vec (IFN_UNIQUE, args);
> +}
> +
>  /* Lower code for an OMP loop directive.  */
>
>  static void
> @@ -10837,6 +10929,8 @@ lower_omp_for (gimple_stmt_iterator *gsi_p, omp_context *ctx)
>
>    push_gimplify_context ();
>
> +  oacc_record_private_var_clauses (ctx, gimple_omp_for_clauses (stmt));
> +
>    lower_omp (gimple_omp_for_pre_body_ptr (stmt), ctx);
>
>    block = make_node (BLOCK);
> @@ -10855,6 +10949,8 @@ lower_omp_for (gimple_stmt_iterator *gsi_p, omp_context *ctx)
>        gbind *inner_bind
>       = as_a <gbind *> (gimple_seq_first_stmt (omp_for_body));
>        tree vars = gimple_bind_vars (inner_bind);
> +      if (is_gimple_omp_oacc (ctx->stmt))
> +     oacc_record_vars_in_bind (ctx, vars);
>        gimple_bind_append_vars (new_stmt, vars);
>        /* bind_vars/BLOCK_VARS are being moved to new_stmt/block, don't
>        keep them on the inner_bind and it's block.  */
> @@ -10968,6 +11064,11 @@ lower_omp_for (gimple_stmt_iterator *gsi_p, omp_context *ctx)
>
>    lower_omp (gimple_omp_body_ptr (stmt), ctx);
>
> +  gcall *private_marker = NULL;
> +  if (is_gimple_omp_oacc (ctx->stmt)
> +      && !gimple_seq_empty_p (omp_for_body))
> +    private_marker = make_oacc_private_marker (ctx);
> +
>    /* Lower the header expressions.  At this point, we can assume that
>       the header is of the form:
>
> @@ -11022,7 +11123,7 @@ lower_omp_for (gimple_stmt_iterator *gsi_p, omp_context *ctx)
>    if (is_gimple_omp_oacc (ctx->stmt)
>        && !ctx_in_oacc_kernels_region (ctx))
>      lower_oacc_head_tail (gimple_location (stmt),
> -                       gimple_omp_for_clauses (stmt),
> +                       gimple_omp_for_clauses (stmt), private_marker,
>                         &oacc_head, &oacc_tail, ctx);
>
>    /* Add OpenACC partitioning and reduction markers just before the loop.  */
> @@ -13019,8 +13120,14 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx)
>            them as a dummy GANG loop.  */
>         tree level = build_int_cst (integer_type_node, GOMP_DIM_GANG);
>
> +       gcall *private_marker = make_oacc_private_marker (ctx);
> +
> +       if (private_marker)
> +         gimple_call_set_arg (private_marker, 2, level);
> +
>         lower_oacc_reductions (gimple_location (ctx->stmt), clauses, level,
> -                              false, NULL, NULL, &fork_seq, &join_seq, ctx);
> +                              false, NULL, private_marker, NULL, &fork_seq,
> +                              &join_seq, ctx);
>       }
>
>        gimple_seq_add_seq (&new_body, fork_seq);
> @@ -13262,6 +13369,9 @@ lower_omp_1 (gimple_stmt_iterator *gsi_p, omp_context *ctx)
>                ctx);
>        break;
>      case GIMPLE_BIND:
> +      if (ctx && is_gimple_omp_oacc (ctx->stmt))
> +     oacc_record_vars_in_bind (ctx,
> +                               gimple_bind_vars (as_a <gbind *> (stmt)));
>        lower_omp (gimple_bind_body_ptr (as_a <gbind *> (stmt)), ctx);
>        maybe_remove_omp_member_access_dummy_vars (as_a <gbind *> (stmt));
>        break;

[TS] I have not yet verified whether these lowering cases are sufficient
to also handle the <https://gcc.gnu.org/PR90114> "Predetermined private
levels for variables declared in OpenACC accelerator routines" case.  (If
yes, then that needs testcases, too; if not, then we need to add a TODO
note, for later.)


> --- a/gcc/omp-offload.c
> +++ b/gcc/omp-offload.c
> @@ -53,6 +53,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "attribs.h"
>  #include "cfgloop.h"
>  #include "context.h"
> +#include "convert.h"
>
>  /* Describe the OpenACC looping structure of a function.  The entire
>     function is held in a 'NULL' loop.  */
> @@ -1356,7 +1357,9 @@ oacc_loop_xform_head_tail (gcall *from, int level)
>           = ((enum ifn_unique_kind)
>              TREE_INT_CST_LOW (gimple_call_arg (stmt, 0)));
>
> -       if (k == IFN_UNIQUE_OACC_FORK || k == IFN_UNIQUE_OACC_JOIN)
> +       if (k == IFN_UNIQUE_OACC_FORK
> +           || k == IFN_UNIQUE_OACC_JOIN
> +           || k == IFN_UNIQUE_OACC_PRIVATE)
>           *gimple_call_arg_ptr (stmt, 2) = replacement;
>         else if (k == kind && stmt != from)
>           break;
> @@ -1773,6 +1776,136 @@ default_goacc_reduction (gcall *call)
>    gsi_replace_with_seq (&gsi, seq, true);
>  }
>
> +struct var_decl_rewrite_info
> +{
> +  gimple *stmt;
> +  hash_map<tree, tree> *adjusted_vars;
> +  bool avoid_pointer_conversion;
> +  bool modified;
> +};
> +
> +/* Helper function for execute_oacc_device_lower.  Rewrite VAR_DECLs (by
> +   themselves or wrapped in various other nodes) according to ADJUSTED_VARS in
> +   the var_decl_rewrite_info pointed to via DATA.  Used as part of coercing
> +   gang-private variables in OpenACC offload regions to reside in GPU shared
> +   memory.  */
> +
> +static tree
> +oacc_rewrite_var_decl (tree *tp, int *walk_subtrees, void *data)
> +{
> +  walk_stmt_info *wi = (walk_stmt_info *) data;
> +  var_decl_rewrite_info *info = (var_decl_rewrite_info *) wi->info;
> +
> +  if (TREE_CODE (*tp) == ADDR_EXPR)
> +    {
> +      tree arg = TREE_OPERAND (*tp, 0);
> +      tree *new_arg = info->adjusted_vars->get (arg);
> +
> +      if (new_arg)
> +     {
> +       if (info->avoid_pointer_conversion)
> +         {
> +           *tp = build_fold_addr_expr (*new_arg);
> +           info->modified = true;
> +           *walk_subtrees = 0;
> +         }
> +       else
> +         {
> +           gimple_stmt_iterator gsi = gsi_for_stmt (info->stmt);
> +           tree repl = build_fold_addr_expr (*new_arg);
> +           gimple *stmt1
> +             = gimple_build_assign (make_ssa_name (TREE_TYPE (repl)), repl);
> +           tree conv = convert_to_pointer (TREE_TYPE (*tp),
> +                                           gimple_assign_lhs (stmt1));
> +           gimple *stmt2
> +             = gimple_build_assign (make_ssa_name (TREE_TYPE (*tp)), conv);
> +           gsi_insert_before (&gsi, stmt1, GSI_SAME_STMT);
> +           gsi_insert_before (&gsi, stmt2, GSI_SAME_STMT);
> +           *tp = gimple_assign_lhs (stmt2);
> +           info->modified = true;
> +           *walk_subtrees = 0;
> +         }
> +     }
> +    }
> +  else if (TREE_CODE (*tp) == COMPONENT_REF || TREE_CODE (*tp) == ARRAY_REF)
> +    {
> +      tree *base = &TREE_OPERAND (*tp, 0);
> +
> +      while (TREE_CODE (*base) == COMPONENT_REF
> +          || TREE_CODE (*base) == ARRAY_REF)
> +     base = &TREE_OPERAND (*base, 0);
> +
> +      if (TREE_CODE (*base) != VAR_DECL)
> +     return NULL;
> +
> +      tree *new_decl = info->adjusted_vars->get (*base);
> +      if (!new_decl)
> +     return NULL;
> +
> +      int base_quals = TYPE_QUALS (TREE_TYPE (*new_decl));
> +      tree field = TREE_OPERAND (*tp, 1);
> +
> +      /* Adjust the type of the field.  */
> +      int field_quals = TYPE_QUALS (TREE_TYPE (field));
> +      if (TREE_CODE (field) == FIELD_DECL && field_quals != base_quals)
> +     {
> +       tree *field_type = &TREE_TYPE (field);
> +       while (TREE_CODE (*field_type) == ARRAY_TYPE)
> +         field_type = &TREE_TYPE (*field_type);
> +       field_quals |= base_quals;
> +       *field_type = build_qualified_type (*field_type, field_quals);
> +     }
> +
> +      /* Adjust the type of the component ref itself.  */
> +      tree comp_type = TREE_TYPE (*tp);
> +      int comp_quals = TYPE_QUALS (comp_type);
> +      if (TREE_CODE (*tp) == COMPONENT_REF && comp_quals != base_quals)
> +     {
> +       comp_quals |= base_quals;
> +       TREE_TYPE (*tp)
> +         = build_qualified_type (comp_type, comp_quals);
> +     }
> +
> +      *base = *new_decl;
> +      info->modified = true;
> +    }
> +  else if (TREE_CODE (*tp) == VAR_DECL)
> +    {
> +      tree *new_decl = info->adjusted_vars->get (*tp);
> +      if (new_decl)
> +     {
> +       *tp = *new_decl;
> +       info->modified = true;
> +     }
> +    }
> +
> +  return NULL_TREE;
> +}
> +
> +/* Return TRUE if CALL is a call to a builtin atomic/sync operation.  */
> +
> +static bool
> +is_sync_builtin_call (gcall *call)
> +{
> +  tree callee = gimple_call_fndecl (call);
> +
> +  if (callee != NULL_TREE
> +      && gimple_call_builtin_p (call, BUILT_IN_NORMAL))
> +    switch (DECL_FUNCTION_CODE (callee))
> +      {
> +#undef DEF_SYNC_BUILTIN
> +#define DEF_SYNC_BUILTIN(ENUM, NAME, TYPE, ATTRS) case ENUM:
> +#include "sync-builtins.def"
> +#undef DEF_SYNC_BUILTIN
> +     return true;
> +
> +      default:
> +     ;
> +      }
> +
> +  return false;
> +}
> +
>  /* Main entry point for oacc transformations which run on the device
>     compiler after LTO, so we know what the target device is at this
>     point (including the host fallback).  */
> @@ -1922,6 +2055,8 @@ execute_oacc_device_lower ()
>       dominance information to update SSA.  */
>    calculate_dominance_info (CDI_DOMINATORS);
>
> +  hash_map<tree, tree> adjusted_vars;
> +
>    /* Now lower internal loop functions to target-specific code
>       sequences.  */
>    basic_block bb;
> @@ -1998,6 +2133,45 @@ execute_oacc_device_lower ()
>               case IFN_UNIQUE_OACC_TAIL_MARK:
>                 remove = true;
>                 break;
> +
> +             case IFN_UNIQUE_OACC_PRIVATE:
> +               {
> +                 HOST_WIDE_INT level
> +                   = TREE_INT_CST_LOW (gimple_call_arg (call, 2));
> +                 if (level == -1)
> +                   break;
> +                 for (unsigned i = 3;
> +                      i < gimple_call_num_args (call);
> +                      i++)
> +                   {
> +                     tree arg = gimple_call_arg (call, i);
> +                     gcc_assert (TREE_CODE (arg) == ADDR_EXPR);
> +                     tree decl = TREE_OPERAND (arg, 0);
> +                     if (dump_file && (dump_flags & TDF_DETAILS))
> +                       {
> +                         static char const *const axes[] =
> +                           /* Must be kept in sync with GOMP_DIM
> +                              enumeration.  */
> +                           { "gang", "worker", "vector" };
> +                         fprintf (dump_file, "Decl UID %u has %s "
> +                                  "partitioning:", DECL_UID (decl),
> +                                  axes[level]);
> +                         print_generic_decl (dump_file, decl, TDF_SLIM);
> +                         fputc ('\n', dump_file);
> +                       }
> +                     if (targetm.goacc.adjust_private_decl)
> +                       {
> +                         tree oldtype = TREE_TYPE (decl);
> +                         tree newdecl
> +                           = targetm.goacc.adjust_private_decl (decl, level);
> +                         if (TREE_TYPE (newdecl) != oldtype
> +                             || newdecl != decl)
> +                           adjusted_vars.put (decl, newdecl);
> +                       }
> +                   }
> +                 remove = true;
> +               }
> +               break;
>               }
>             break;
>           }
> @@ -2029,6 +2203,55 @@ execute_oacc_device_lower ()
>         gsi_next (&gsi);
>        }
>
> +  /* Make adjustments to gang-private local variables if required by the
> +     target, e.g. forcing them into a particular address space.  Afterwards,
> +     ADDR_EXPR nodes which have adjusted variables as their argument need to
> +     be modified in one of two ways:
> +
> +       1. They can be recreated, making a pointer to the variable in the new
> +       address space, or
> +
> +       2. The address of the variable in the new address space can be taken,
> +       converted to the default (original) address space, and the result of
> +       that conversion subsituted in place of the original ADDR_EXPR node.
> +
> +     Which of these is done depends on the gimple statement being processed.
> +     At present atomic operations and inline asms use (1), and everything else
> +     uses (2).  At least on AMD GCN, there are atomic operations that work
> +     directly in the LDS address space.
> +
> +     COMPONENT_REFS, ARRAY_REFS and plain VAR_DECLs are also rewritten to use
> +     the new decl, adjusting types of appropriate tree nodes as necessary.  */

[TS] As I understand, this is only relevant for GCN offloading, but not
nvptx, and I'll trust that these two variants make sense from a GCN point
of view (which I cannot verify easily).

> +
> +  if (targetm.goacc.adjust_private_decl)
> +    {
> +      FOR_ALL_BB_FN (bb, cfun)
> +     for (gimple_stmt_iterator gsi = gsi_start_bb (bb);
> +          !gsi_end_p (gsi);
> +          gsi_next (&gsi))
> +       {
> +         gimple *stmt = gsi_stmt (gsi);
> +         walk_stmt_info wi;
> +         var_decl_rewrite_info info;
> +
> +         info.avoid_pointer_conversion
> +           = (is_gimple_call (stmt)
> +              && is_sync_builtin_call (as_a <gcall *> (stmt)))
> +             || gimple_code (stmt) == GIMPLE_ASM;
> +         info.stmt = stmt;
> +         info.modified = false;
> +         info.adjusted_vars = &adjusted_vars;
> +
> +         memset (&wi, 0, sizeof (wi));
> +         wi.info = &info;
> +
> +         walk_gimple_op (stmt, oacc_rewrite_var_decl, &wi);
> +
> +         if (info.modified)
> +           update_stmt (stmt);
> +       }
> +    }
> +
>    free_oacc_loop (loops);
>
>    return 0;

[TS] As discussed above, maybe we can completely skip the 'adjusted_vars'
rewriting for nvptx offloading?


> --- /dev/null
> +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/gang-private-1.c

[TS] Without any code changes, this one FAILs (as expected) with nvptx
offloading, but with GCN offloading, it already PASSes.

> @@ -0,0 +1,38 @@
> +#include <assert.h>
> +
> +int main (void)
> +{
> +  int ret;
> +
> +  #pragma acc parallel num_gangs(1) num_workers(32) copyout(ret)
> +  {
> +    int w = 0;
> +
> +    #pragma acc loop worker
> +    for (int i = 0; i < 32; i++)
> +      {
> +     #pragma acc atomic update
> +     w++;
> +      }
> +
> +    ret = (w == 32);
> +  }
> +  assert (ret);
> +
> +  #pragma acc parallel num_gangs(1) vector_length(32) copyout(ret)
> +  {
> +    int v = 0;
> +
> +    #pragma acc loop vector
> +    for (int i = 0; i < 32; i++)
> +      {
> +     #pragma acc atomic update
> +     v++;
> +      }
> +
> +    ret = (v == 32);
> +  }
> +  assert (ret);
> +
> +  return 0;
> +}


> --- /dev/null
> +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-gwv-2.c

[TS] Both with nvptx and GCN offloading, that one already PASSes without
any code changes.

> @@ -0,0 +1,95 @@
> +#include <stdio.h>
> +#include <openacc.h>
> +#include <alloca.h>
> +#include <string.h>
> +#include <gomp-constants.h>
> +#include <stdlib.h>
> +
> +#if 0
> +#define DEBUG(DIM, IDX, VAL) \
> +  fprintf (stderr, "%sdist[%d] = %d\n", (DIM), (IDX), (VAL))
> +#else
> +#define DEBUG(DIM, IDX, VAL)
> +#endif
> +
> +#define N (32*32*32)
> +
> +int
> +check (const char *dim, int *dist, int dimsize)
> +{
> +  int ix;
> +  int exit = 0;
> +
> +  for (ix = 0; ix < dimsize; ix++)
> +    {
> +      DEBUG(dim, ix, dist[ix]);
> +      if (dist[ix] < (N) / (dimsize + 0.5)
> +       || dist[ix] > (N) / (dimsize - 0.5))
> +     {
> +       fprintf (stderr, "did not distribute to %ss (%d not between %d "
> +                "and %d)\n", dim, dist[ix], (int) ((N) / (dimsize + 0.5)),
> +                (int) ((N) / (dimsize - 0.5)));
> +       exit |= 1;
> +     }
> +    }
> +
> +  return exit;
> +}
> +
> +int main ()
> +{
> +  int ary[N];
> +  int ix;
> +  int exit = 0;
> +  int gangsize = 0, workersize = 0, vectorsize = 0;
> +  int *gangdist, *workerdist, *vectordist;
> +
> +  for (ix = 0; ix < N;ix++)
> +    ary[ix] = -1;
> +
> +#pragma acc parallel num_gangs(32) num_workers(32) vector_length(32) \
> +         copy(ary) copyout(gangsize, workersize, vectorsize)
> +  {
> +#pragma acc loop gang worker vector
> +    for (unsigned ix = 0; ix < N; ix++)
> +      {
> +     int g, w, v;
> +
> +     g = __builtin_goacc_parlevel_id (GOMP_DIM_GANG);
> +     w = __builtin_goacc_parlevel_id (GOMP_DIM_WORKER);
> +     v = __builtin_goacc_parlevel_id (GOMP_DIM_VECTOR);
> +
> +     ary[ix] = (g << 16) | (w << 8) | v;
> +      }
> +
> +    gangsize = __builtin_goacc_parlevel_size (GOMP_DIM_GANG);
> +    workersize = __builtin_goacc_parlevel_size (GOMP_DIM_WORKER);
> +    vectorsize = __builtin_goacc_parlevel_size (GOMP_DIM_VECTOR);
> +  }
> +
> +  gangdist = (int *) alloca (gangsize * sizeof (int));
> +  workerdist = (int *) alloca (workersize * sizeof (int));
> +  vectordist = (int *) alloca (vectorsize * sizeof (int));
> +  memset (gangdist, 0, gangsize * sizeof (int));
> +  memset (workerdist, 0, workersize * sizeof (int));
> +  memset (vectordist, 0, vectorsize * sizeof (int));
> +
> +  /* Test that work is shared approximately equally amongst each active
> +     gang/worker/vector.  */
> +  for (ix = 0; ix < N; ix++)
> +    {
> +      int g = (ary[ix] >> 16) & 255;
> +      int w = (ary[ix] >> 8) & 255;
> +      int v = ary[ix] & 255;
> +
> +      gangdist[g]++;
> +      workerdist[w]++;
> +      vectordist[v]++;
> +    }
> +
> +  exit = check ("gang", gangdist, gangsize);
> +  exit |= check ("worker", workerdist, workersize);
> +  exit |= check ("vector", vectordist, vectorsize);
> +
> +  return exit;
> +}


> --- /dev/null
> +++ b/libgomp/testsuite/libgomp.oacc-fortran/gangprivate-attrib-1.f90

[TS] This one does show the expected behavior: FAILs without code
changes, PASSes with code changes as posted.

> @@ -0,0 +1,25 @@
> +! Test for "oacc gangprivate" attribute on gang-private variables
> +
> +! { dg-do run }
> +! { dg-additional-options "-fdump-tree-oaccdevlow-details -w" }
> +
> +program main
> +  integer :: w, arr(0:31)
> +
> +  !$acc parallel num_gangs(32) num_workers(32) copyout(arr)
> +    !$acc loop gang private(w)
> +! { dg-final { scan-tree-dump-times "Decl UID \[0-9\]+ has gang partitioning:  integer\\(kind=4\\) w;" 1 "oaccdevlow" } } */
> +    do j = 0, 31
> +      w = 0
> +      !$acc loop seq
> +      do i = 0, 31
> +        !$acc atomic update
> +        w = w + 1
> +        !$acc end atomic
> +      end do
> +      arr(j) = w
> +    end do
> +  !$acc end parallel
> +
> +  if (any (arr .ne. 32)) stop 1
> +end program main


> --- /dev/null
> +++ b/libgomp/testsuite/libgomp.oacc-fortran/gangprivate-attrib-2.f90

[TS] With code changes as posted, this one FAILs for nvptx offloading
execution.  (... for all but the Nvidia Titan V GPU in my set of testing
configurations, huh?)

> @@ -0,0 +1,25 @@
> +! Test for worker-private variables
> +
> +! { dg-do run }
> +! { dg-additional-options "-fdump-tree-oaccdevlow-details" }
> +
> +program main
> +  integer :: w, arr(0:31)
> +
> +  !$acc parallel num_gangs(32) num_workers(32) copyout(arr)
> +    !$acc loop gang worker private(w)
> +! { dg-final { scan-tree-dump-times "Decl UID \[0-9\]+ has worker partitioning:  integer\\(kind=4\\) w;" 1 "oaccdevlow" } } */
> +    do j = 0, 31
> +      w = 0
> +      !$acc loop seq
> +      do i = 0, 31
> +        !$acc atomic update
> +        w = w + 1
> +        !$acc end atomic
> +      end do
> +      arr(j) = w
> +    end do
> +  !$acc end parallel
> +
> +  if (any (arr .ne. 32)) stop 1
> +end program main


[TS] So we'll have to verify whether these are sufficiently testing what
they're meant to be testing, and fix up as necessary.


Grüße
 Thomas
-----------------
Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Frank Thürauf
Andrew Stubbs April 16, 2021, 4:05 p.m. UTC | #2
On 15/04/2021 18:26, Thomas Schwinge wrote:
>> and optimisation, since shared memory might be faster than
>> the main memory on a GPU.
> 
> Do we potentially have a problem that making more use of (scarce)
> gang-private memory may negatively affect peformance, because potentially
> fewer OpenACC gangs may then be launched to the GPU hardware in parallel?
> (Of course, OpenACC semantics conformance firstly is more important than
> performance, but there may be ways to be conformant and performant;
> "quality of implementation".)  Have you run any such performance testing
> with the benchmarking codes that we've got set up?
> 
> (As I'm more familiar with that, I'm using nvptx offloading examples in
> the following, whilst assuming that similar discussion may apply for GCN
> offloading, which uses similar hardware concepts, as far as I remember.)

Yes, that could happen. However, there's space for quite a lot of 
scalars before performance is affected: 64KB of LDS memory shared by a 
hardware-defined maximum of 40 threads gives about 1.5KB of space for 
worker-reduction variables and gang-private variables. We might have a 
problem if there are large private arrays.

I believe we have a "good enough" solution for the usual case, and a 
v2.0 full solution is going to be big and hairy enough for a whole patch 
of its own (requiring per-gang dynamic allocation, a different memory 
address space and possibly different instruction selection too).

Andrew
Thomas Schwinge April 16, 2021, 5:30 p.m. UTC | #3
Hi!

On 2021-04-16T17:05:24+0100, Andrew Stubbs <ams@codesourcery.com> wrote:
> On 15/04/2021 18:26, Thomas Schwinge wrote:
>>> and optimisation, since shared memory might be faster than
>>> the main memory on a GPU.
>>
>> Do we potentially have a problem that making more use of (scarce)
>> gang-private memory may negatively affect peformance, because potentially
>> fewer OpenACC gangs may then be launched to the GPU hardware in parallel?
>> (Of course, OpenACC semantics conformance firstly is more important than
>> performance, but there may be ways to be conformant and performant;
>> "quality of implementation".)  Have you run any such performance testing
>> with the benchmarking codes that we've got set up?
>>
>> (As I'm more familiar with that, I'm using nvptx offloading examples in
>> the following, whilst assuming that similar discussion may apply for GCN
>> offloading, which uses similar hardware concepts, as far as I remember.)
>
> Yes, that could happen.

Thanks for sharing the GCN perspective.

> However, there's space for quite a lot of
> scalars before performance is affected: 64KB of LDS memory shared by a
> hardware-defined maximum of 40 threads

(Instead of threads, something like thread blocks, I suppose?)

> gives about 1.5KB of space for
> worker-reduction variables and gang-private variables.

PTX, as I understand this, may generally have a lot of Thread Blocks in
flight: all for the same GPU kernel as well as any GPU kernels running
asynchronously/generally concurrently (system-wide), and libgomp does try
launching a high number of Thread Blocks ('num_gangs') (for purposes of
hiding memory access latency?).  Random example:

    nvptx_exec: kernel t0_r$_omp_fn$0: launch gangs=1920, workers=32, vectors=32

With that, PTX's 48 KiB of '.shared' memory per SM (processor) no longer
go very far: just '48 * 1024 / 1920 = 25' bytes of gang-private memory
available for each of the 1920 gangs; barely enough for 'double x, y, z'?
(... for the simple case where just one GPU kernel is executing.)

(I suppose that calculation is valid for a GPU hardware variant where
there is just one SM.  If there are several (typically in the order of a
few dozens?), I suppose the Thread Blocks launched will be distributed
over all these, thus improving the situation correspondingly.)

(And of course, there are certainly other factors that also limit the
number of Thread Blocks that are actually executing in parallel.)

> We might have a
> problem if there are large private arrays.

Yes, that's understood.

Also directly related: the problem that comes with supporting
worker-private memory, which essentially requires the amount of
gang-private memory multiplied by the number of workers?  (Out of
scope at present.)

> I believe we have a "good enough" solution for the usual case

So you believe that.  ;-)

It's certainly what I'd hope, too!  But we don't know yet whether there's
any noticeable performance impact if we run with (potentially) lesser
parallelism, hence my question whether this patch has been run through
performance testing.

> and a
> v2.0 full solution is going to be big and hairy enough for a whole patch
> of it's own (requiring per-gang dynamic allocation, a different memory
> address space and possibly different instruction selection too).

Agree that a fully dynamic allocation scheme likely is going to be ugly,
so I'd certainly like to avoid that.

Before attempting that, we'd first try to optimize gang-private memory
allocation: so that it's function-local (and thus GPU kernel-local)
instead of device-global (assuming that's indeed possible), and try not
using gang-private memory in cases where it's not actually necessary
(semantically not observable, and not necessary for performance reasons).


Grüße
 Thomas
Andrew Stubbs April 18, 2021, 10:53 p.m. UTC | #4
On 16/04/2021 18:30, Thomas Schwinge wrote:
> Hi!
> 
> On 2021-04-16T17:05:24+0100, Andrew Stubbs <ams@codesourcery.com> wrote:
>> On 15/04/2021 18:26, Thomas Schwinge wrote:
>>>> and optimisation, since shared memory might be faster than
>>>> the main memory on a GPU.
>>>
>>> Do we potentially have a problem that making more use of (scarce)
> gang-private memory may negatively affect performance, because potentially
>>> fewer OpenACC gangs may then be launched to the GPU hardware in parallel?
>>> (Of course, OpenACC semantics conformance firstly is more important than
>>> performance, but there may be ways to be conformant and performant;
>>> "quality of implementation".)  Have you run any such performance testing
>>> with the benchmarking codes that we've got set up?
>>>
>>> (As I'm more familiar with that, I'm using nvptx offloading examples in
>>> the following, whilst assuming that similar discussion may apply for GCN
>>> offloading, which uses similar hardware concepts, as far as I remember.)
>>
>> Yes, that could happen.
> 
> Thanks for sharing the GCN perspective.
> 
>> However, there's space for quite a lot of
>> scalars before performance is affected: 64KB of LDS memory shared by a
>> hardware-defined maximum of 40 threads
> 
> (Instead of threads, something like thread blocks, I suppose?)

Workers. Wavefronts. The terminology is so confusing for these cases! 
They look like CPU threads running SIMD instructions, at least on GCN. 
OpenMP calls them threads.

Each GCN compute unit can run up to 40 of them. A gang can have up to 16 
workers (in AMD terminology, a work group can have up to 16 wavefronts), so 
each compute unit will usually have at least two gangs, meaning each 
gang would get 32KB local memory. If there are no worker loops then you 
get 40 gangs (of one worker each) per compute unit, hence the minimum of 
1.5KB per gang.

The local memory is specific to the compute unit and gangs launched 
there will stay there until they're done, so the 40 gangs really is the 
limit for memory division. If you launch more gangs than there are 
resources then they get queued, so the memory doesn't get divided any more.

>> gives about 1.5KB of space for
>> worker-reduction variables and gang-private variables.
> 
> PTX, as I understand this, may generally have a lot of Thread Blocks in
> flight: all for the same GPU kernel as well as any GPU kernels running
> asynchronously/generally concurrently (system-wide), and libgomp does try
> launching a high number of Thread Blocks ('num_gangs') (for purposes of
> hiding memory access latency?).  Random example:
> 
>      nvptx_exec: kernel t0_r$_omp_fn$0: launch gangs=1920, workers=32, vectors=32
> 
> With that, PTX's 48 KiB of '.shared' memory per SM (processor) are then
> not so much anymore: just '48 * 1024 / 1920 = 25' bytes of gang-private
> memory available for each of the 1920 gangs: 'double x, y, z'?  (... for
> the simple case where just one GPU kernel is executing.)

Your maths feels way off to me. That's not enough memory for any use, 
and it's not the only resource that will be stretched thin: how many GPU 
registers does an SM have? (I doubt that register contents are getting 
paged in and out.)

For comparison, with the maximum num_workers(16) GCN can run only 2 
gangs on each compute unit. Each compute unit can run 40 gangs 
simultaneously with num_workers(1), but that is the limit. If you launch 
more gangs than that then they are queued; even if you launch 100,000 
single-worker gangs, each one will still get 1/40th of the resources.

I doubt that NVPTX is magically running 1920 gangs of 32 workers on one 
SM without any queueing and with the gang resources split 1920 ways (and 
the worker resources split 61440 ways).

> (I suppose that calculation is valid for a GPU hardware variant where
> there is just one SM.  If there are several (typically in the order of a
> few dozens?), I suppose the Thread Blocks launched will be distributed
> over all these, thus improving the situation correspondingly.)
> 
> (And of course, there are certainly other factors that also limit the
> number of Thread Blocks that are actually executing in parallel.)
> 
>> We might have a
>> problem if there are large private arrays.
> 
> Yes, that's understood.
> 
> Also, directly related, the problem that comes with supporting
> worker-private memory, which basically calculates to the amount necessary
> for gang-private memory multiplied by the number of workers?  (Out of
> scope at present.)

GCN just uses the stack space for that, which lives in main memory. 
That's limited resource, of course, but it's not architectural. I don't 
know what NVPTX does here.

>> I believe we have a "good enough" solution for the usual case
> 
> So you believe that.  ;-)
> 
> It's certainly what I'd hope, too!  But we don't know yet whether there's
> any noticeable performance impact if we run with (potentially) lesser
> parallelism, hence my question whether this patch has been run through
> performance testing.

Well, indeed I don't know the comparative situation with benchmark 
results because the benchmarks couldn't run at full occupancy, on GCN, 
without it. The purpose of this patch was precisely to allow us to 
reduce the local memory allocation enough to increase occupancy for 
benchmarks that don't use worker loops.

>> and a
>> v2.0 full solution is going to be big and hairy enough for a whole patch
>> of its own (requiring per-gang dynamic allocation, a different memory
>> address space and possibly different instruction selection too).
> 
> Agree that a fully dynamic allocation scheme likely is going to be ugly,
> so I'd certainly like to avoid that.
> 
> Before attempting that, we'd first try to optimize gang-private memory
> allocation: so that it's function-local (and thus GPU kernel-local)
> instead of device-global (assuming that's indeed possible), and try not
> using gang-private memory in cases where it's not actually necessary
> (semantically not observable, and not necessary for performance reasons).

Global layout isn't ideal, but I don't know how we'd know how much to 
reserve otherwise? I suppose one would set the shared gang memory up as 
a stack, complete with a stack pointer in the ABI, which would allow 
recursion etc., but that would have other issues.

Andrew
Thomas Schwinge April 19, 2021, 11:06 a.m. UTC | #5
Hi!

On 2021-04-18T23:53:01+0100, Andrew Stubbs <ams@codesourcery.com> wrote:
> On 16/04/2021 18:30, Thomas Schwinge wrote:
>> On 2021-04-16T17:05:24+0100, Andrew Stubbs <ams@codesourcery.com> wrote:
>>> On 15/04/2021 18:26, Thomas Schwinge wrote:
>>>>> and optimisation, since shared memory might be faster than
>>>>> the main memory on a GPU.
>>>>
>>>> Do we potentially have a problem that making more use of (scarce)
>>>> gang-private memory may negatively affect performance, because potentially
>>>> fewer OpenACC gangs may then be launched to the GPU hardware in parallel?
>>>> (Of course, OpenACC semantics conformance firstly is more important than
>>>> performance, but there may be ways to be conformant and performant;
>>>> "quality of implementation".)  Have you run any such performance testing
>>>> with the benchmarking codes that we've got set up?
>>>>
>>>> (As I'm more familiar with that, I'm using nvptx offloading examples in
>>>> the following, whilst assuming that similar discussion may apply for GCN
>>>> offloading, which uses similar hardware concepts, as far as I remember.)
>>>
>>> Yes, that could happen.
>>
>> Thanks for sharing the GCN perspective.
>>
>>> However, there's space for quite a lot of
>>> scalars before performance is affected: 64KB of LDS memory shared by a
>>> hardware-defined maximum of 40 threads
>>
>> (Instead of threads, something like thread blocks, I suppose?)
>
> Workers. Wavefronts.

(ACK.)

> The terminology is so confusing for these cases!

Absolutely!  Everyone has their own, and slightly redefines the meaning of
certain words -- and then again uses different words for the same
things/concepts...

> They look like CPU threads running SIMD instructions, at least on GCN.
> OpenMP calls them threads.

Alright -- and in OpenACC (which is the context here), "a thread is any
one vector lane of one worker of one gang" (that is, any element of a GCN
SIMD instruction).

> Each GCN compute unit can run up to 40 of them. A gang can have up to 16
> workers (in AMD terminology, a work group can have up to 16 wavefronts), so
> each compute unit will usually have at least two gangs, meaning each
> gang would get 32KB local memory. If there are no worker loops then you
> get 40 gangs (of one worker each) per compute unit, hence the minimum of
> 1.5KB per gang.
>
> The local memory is specific to the compute unit and gangs launched
> there will stay there until they're done, so the 40 gangs really is the
> limit for memory division. If you launch more gangs than there are
> resources then they get queued, so the memory doesn't get divided any more.
>
>>> gives about 1.5KB of space for
>>> worker-reduction variables and gang-private variables.
>>
>> PTX, as I understand this, may generally have a lot of Thread Blocks in
>> flight: all for the same GPU kernel as well as any GPU kernels running
>> asynchronously/generally concurrently (system-wide), and libgomp does try
>> launching a high number of Thread Blocks ('num_gangs') (for purposes of
>> hiding memory access latency?).  Random example:
>>
>>      nvptx_exec: kernel t0_r$_omp_fn$0: launch gangs=1920, workers=32, vectors=32
>>
>> With that, PTX's 48 KiB of '.shared' memory per SM (processor) are then
>> not so much anymore: just '48 * 1024 / 1920 = 25' bytes of gang-private
>> memory available for each of the 1920 gangs: 'double x, y, z'?  (... for
>> the simple case where just one GPU kernel is executing.)
>
> Your maths feels way off to me. That's not enough memory for any use,
> and it's not the only resource that will be stretched thin:

Might be way off, yes.  I did mention "other [limiting] factors" later
on, and:

According to the documentation that I'd pointed to, CC 3.5 may have
"Maximum number of resident blocks per SM": "16".

(Aha, and if, for example, we assume there are 80 SMs, then libgomp
launching 1920 gangs means '1920 / 80 = 24' Thread Blocks per SM -- which
seems reasonable.)

What I don't know is whether "resident" means scheduled/executing and the
same applies to the '.shared' memory allocation -- or whether the two
parts are separate (thus you can occupy '.shared' memory without having
it used via execution).  If we assume that allocation and execution are
done in one, and there is no pre-emption once launched, that indeed
simplifies the considerations quite some.

We'd then have a decent '48 * 1024 / 16 = 3072' bytes of gang-private
memory available for each of the 16 "resident" gangs (per SM).

> how many GPU
> registers does an SM have?

"Number of 32-bit registers per SM": "64 K", and with "Maximum number of
resident threads per SM": "2048", that means '64 K / 2048 = 32' registers
in this configuration vs. "Maximum number of 32-bit registers per
thread": "255" with correspondingly reduced occupancy.

> (I doubt that register contents are getting
> paged in and out.)

(Again, I have not looked up to which extent Nvidia GPUs/Driver are doing
any such things.)

> For comparison, with the maximum num_workers(16) GCN can run only 2
> gangs on each compute unit. Each compute unit can run 40 gangs
> simultaneously with num_workers(1), but that is the limit. If you launch
> more gangs than that then they are queued; even if you launch 100,000
> single-worker gangs, each one will still get 1/40th of the resources.
>
> I doubt that NVPTX is magically running 1920 gangs of 32 workers on one
> SM without any queueing and with the gang resources split 1920 ways (and
> the worker resources split 61440 ways).

No, indeed.  As I'd said:

>> (I suppose that calculation is valid for a GPU hardware variant where
>> there is just one SM.  If there are several (typically in the order of a
>> few dozens?), I suppose the Thread Blocks launched will be distributed
>> over all these, thus improving the situation correspondingly.)
>>
>> (And of course, there are certainly other factors that also limit the
>> number of Thread Blocks that are actually executing in parallel.)


>>> We might have a
>>> problem if there are large private arrays.
>>
>> Yes, that's understood.
>>
>> Also, directly related, the problem that comes with supporting
>> worker-private memory, which basically calculates to the amount necessary
>> for gang-private memory multiplied by the number of workers?  (Out of
>> scope at present.)
>
> GCN just uses the stack space for that, which lives in main memory.
> That's limited resource, of course, but it's not architectural. I don't
> know what NVPTX does here.

Per my understanding, neither GCN nor nvptx are supporting OpenACC
worker-private memory yet.


>>> I believe we have a "good enough" solution for the usual case
>>
>> So you believe that.  ;-)
>>
>> It's certainly what I'd hope, too!  But we don't know yet whether there's
>> any noticeable performance impact if we run with (potentially) lesser
>> parallelism, hence my question whether this patch has been run through
>> performance testing.
>
> Well, indeed I don't know the comparative situation with benchmark
> results because the benchmarks couldn't run at full occupancy, on GCN,
> without it. The purpose of this patch was precisely to allow us to
> reduce the local memory allocation enough to increase occupancy for
> benchmarks that don't use worker loops.

ACK, that's the GCN perspective.  But for nvptx, we ought to be careful to
not regress existing functionality/performance.

Plus, we all agree, the proposed code changes do improve certain aspects
of OpenACC specification conformance: the concept of gang-private memory.


>>> and a
>>> v2.0 full solution is going to be big and hairy enough for a whole patch
>>> of its own (requiring per-gang dynamic allocation, a different memory
>>> address space and possibly different instruction selection too).
>>
>> Agree that a fully dynamic allocation scheme likely is going to be ugly,
>> so I'd certainly like to avoid that.
>>
>> Before attempting that, we'd first try to optimize gang-private memory
>> allocation: so that it's function-local (and thus GPU kernel-local)
>> instead of device-global (assuming that's indeed possible), and try not
>> using gang-private memory in cases where it's not actually necessary
>> (semantically not observable, and not necessary for performance reasons).
>
> Global layout isn't ideal, but I don't know how we'd know how much to
> reserve otherwise? I suppose one would set the shared gang memory up as
> a stack, complete with a stack pointer in the ABI, which would allow
> recursion etc., but that would have other issues.

Due to lack of in-depth knowledge, I haven't made an attempt to reason
about how to implement that on GCN, but for nvptx there certainly is
evidence of '.shared' memory allocation per function, building a complete
call graph from the GPU kernel entry point onwards, and thus '.shared'
memory allocation per each individual GPU kernel launch.

(Yet, again, I'm totally fine to defer all these things for later --
unless the nvptx performance testing numbers mandate otherwise.)


Grüße
 Thomas
Julian Brown April 19, 2021, 11:23 a.m. UTC | #6
Hi,

(Chung-Lin, question for you buried below.)

On Thu, 15 Apr 2021 19:26:54 +0200
Thomas Schwinge <thomas@codesourcery.com> wrote:

> Hi!
> 
> On 2021-02-26T04:34:50-0800, Julian Brown <julian@codesourcery.com>
> wrote:
> > This patch  
> 
> Thanks, Julian, for your continued improving of these changes!

You're welcome!

> This has iterated through several conceptually different designs and
> implementations, by several people, over the past several years.

I hope this wasn't a hint that I'd failed to attribute the authorship of
the patch properly? Many apologies if so, that certainly wasn't my
intention!

> > implements a method to track the "private-ness" of
> > OpenACC variables declared in offload regions in gang-partitioned,
> > worker-partitioned or vector-partitioned modes. Variables declared
> > implicitly in scoped blocks and those declared "private" on
> > enclosing directives (e.g. "acc parallel") are both handled.
> > Variables that are e.g. gang-private can then be adjusted so they
> > reside in GPU shared memory.
> >
> > The reason for doing this is twofold: correct implementation of
> > OpenACC semantics  
> 
> ACK, and as mentioned before, this very much relates to
> <https://gcc.gnu.org/PR90115> "OpenACC: predetermined private levels
> for variables declared in blocks" (plus the corresponding use of
> 'private' clauses, implicit/explicit, including 'firstprivate') and
> <https://gcc.gnu.org/PR90114> "Predetermined private levels for
> variables declared in OpenACC accelerator routines", which we thus
> should refer in testcases/ChangeLog/commit log, as appropriate.  I do
> understand we're not yet addressing all of that (and that's fine!),
> but we should capture remaining work items of the PRs and Cesar's
> list in
> <http://mid.mail-archive.com/70d27ebd-762e-59a3-082f-48fa0c687212@codesourcery.com>),
> as appropriate.

From that list:

>  * Currently variables in private clauses inside acc loops will not
>    utilize shared memory.

The patch should handle this properly now.

>  * OpenACC routines don't use shared memory, except for reductions and
>    worker state propagation.

Routines weren't a focus of this patch (at the point I inherited it),
and I did not attempt to extend it to cover routines either. TBH the
state there is a bit of an unknown (but the patch won't make the
situation any worse).

>  * Variables local to worker loops don't use shared memory.

That's still true, and IIUC for that to work we'd need to expand
scalars into indexed array references (i.e. "var" ->
"var_arr[vector_lane]" or similar). It's not clear if/when/why we'd
want to do that.

As an aside, if we want to avoid shared memory for some reason but want
to maintain OpenACC semantics, we'd also have to do a similar
transformation for gang-private variables ("var" ->
"var[gang_number]", where the array is on the stack or in global
memory, or similar). Then for worker-private variables we need to do
"var" -> "var[gang_number * num_workers + worker_number]". We've
avoided needing to do that so far, but for some cases -- maybe large
local private arrays? -- it might be necessary, at some point.

>  * Variables local to automatically partitioned gang and worker loops
>    don't use shared memory.

Local variables in automatically-partitioned gang loops should work fine
now.

>  * Shared memory is allocated globally, not locally on a per-function
>    basis. We're not sure if that matters though.

Arguably, that's down to the target, not this middle-end patch -- this
patch itself might not *help* do per-function allocation, but it
doesn't set a policy that allocation must be global either.

> I was surprised that we didn't really have to fix up any existing
> libgomp testcases, because there seem to be quite some that contain a
> pattern (exemplified by the 'tmp' variable) as follows:
> 
>     int main()
>     {
>     #define N 123
>       int data[N];
>       int tmp;
>     
>     #pragma acc parallel // implicit 'firstprivate(tmp)'
>       {
>         // 'tmp' now conceptually made gang-private here.
>     #pragma acc loop gang
>         for (int i = 0; i < 123; ++i)
>           {
>             tmp = i + 234;
>             data[i] = tmp;
>           }
>       }
>     
>       for (int i = 0; i < 123; ++i)
>         if (data[i] != i + 234)
>           __builtin_abort ();
>       
>       return 0;
>     }
> 
> With the code changes as posted, this actually now does *not* use
> gang-private memory for 'tmp', but instead continues to use
> "thread-private registers", as before.

When "tmp" is a local, non-address-taken scalar like that, it'll
probably end up in a register in offloaded code (or of course be
compiled out completely), both before and after this patch. So I
wouldn't expect this to not work in the pre-patch state.

> Same for:
> 
>     --- s3.c	2021-04-13 17:26:49.628739379 +0200
>     +++ s3_2.c	2021-04-13 17:29:43.484579664 +0200
>     @@ -4,6 +4,6 @@
>        int data[N];
>     -  int tmp;
>      
>     -#pragma acc parallel // implicit 'firstprivate(tmp)'
>     +#pragma acc parallel
>        {
>     +    int tmp;
>          // 'tmp' now conceptually made gang-private here.
>      #pragma acc loop gang
> 
> I suppose that's due to conditionalizing this transformation on
> 'TREE_ADDRESSABLE' (as you're doing), so we should be mostly "safe"
> regarding such existing testcases (but I haven't verified that yet in
> detail).

Right.

> That needs to be documented in testcases, with some kind of dump
> scanning (host compilation-side even; see below).
> 
> A note for later: if this weren't just a 'gang' loop, but 'gang' plus
> 'worker' and/or 'vector', we'd actually be fixing up user code with
> undefined behavior into "correct" code (by *not* making 'tmp'
> gang-private, but thread-private), right?

Possibly -- coming up with a case like that might need a little
"ingenuity"...

> As that may not be obvious to the reader, I'd like to have the
> 'TREE_ADDRESSABLE' conditionalization be documented in the code.  You
> had explained that in
> <http://mid.mail-archive.com/20190612204216.0ec83e4e@squid.athome>: "a
> non-addressable variable [...]".

Yeah that probably makes sense.

> > and optimisation, since shared memory might be faster than
> > the main memory on a GPU.  
> 
> Do we potentially have a problem that making more use of (scarce)
> gang-private memory may negatively affect performance, because
> potentially fewer OpenACC gangs may then be launched to the GPU
> hardware in parallel? (Of course, OpenACC semantics conformance
> firstly is more important than performance, but there may be ways to
> be conformant and performant; "quality of implementation".)  Have you
> run any such performance testing with the benchmarking codes that
> we've got set up?

I don't have any numbers for this patch, no. As for the question as to
whether there are constructs that are currently compiled in a
semantically-correct way but that this patch pessimises -- I'm not aware
of anything like that, but there might be.

> (As I'm more familiar with that, I'm using nvptx offloading examples
> in the following, whilst assuming that similar discussion may apply
> for GCN offloading, which uses similar hardware concepts, as far as I
> remember.)
> 
> Looking at the existing
> 'libgomp.oacc-c-c++-common/private-variables.c' (random example), for
> nvptx offloading, '-O0', we see the following PTX JIT compilation
> changes (word-'diff' of 'GOMP_DEBUG=1' at run-time):
> 
>     info    : Function properties for 'local_g_1$_omp_fn$0':
>     info    : used 27 registers, 32 stack, [-176-]{+256+} bytes smem, 328 bytes cmem[0], 0 bytes lmem
>     info    : Function properties for 'local_w_1$_omp_fn$0':
>     info    : used 40 registers, 48 stack, [-176-]{+256+} bytes smem, 328 bytes cmem[0], 0 bytes lmem
>     info    : Function properties for 'local_w_2$_omp_fn$0':
>     [...]
>     info    : Function properties for 'parallel_g_1$_omp_fn$0':
>     info    : used 27 registers, 32 stack, [-176-]{+256+} bytes smem, 328 bytes cmem[0], 0 bytes lmem
>     info    : Function properties for 'parallel_g_2$_omp_fn$0':
>     info    : used 32 registers, 160 stack, [-176-]{+256+} bytes smem, 328 bytes cmem[0], 0 bytes lmem
> 
> ... that is, PTX '.shared' usage increases from 176 to 256 bytes for
> *all* functions, even though only 'loop_g_4$_omp_fn$0' and
> 'loop_g_5$_omp_fn$0' are actually using gang-private memory.
> 
> Execution testing works before (original code, not using gang-private
> memory) as well as after (code changes as posted, using gang-private
> memory), so use on gang-private memory doesn't seem necessary here for
> "correct execution" -- or at least: "expected execution result".  ;-)
> I haven't looked yet whether there's a potentional issue in the
> testcases here.
> 
> The additional '256 - 176 = 80' bytes of PTX '.shared' memory
> requested are due to GCC nvptx back end implementation's use of a
> global "Shared memory block for gang-private variables":
> 
>      // BEGIN VAR DEF: __oacc_bcast
>      .shared .align 8 .u8 __oacc_bcast[176];
>     +// BEGIN VAR DEF: __gangprivate_shared
>     +.shared .align 32 .u8 __gangprivate_shared[64];
> 
> ..., plus (I suppose) an additional '80 - 64 = 16' padding/unused
> bytes to establish '.align 32' after '.align 8' for '__oacc_bcast'.
> 
> Per
> <https://docs.nvidia.com/cuda/cuda-c-programming-guide/#compute-capabilities>,
> "Table 15. Technical Specifications per Compute Capability", "Compute
> Capability": "3.5", we have a "Maximum amount of shared memory per
> SM": "48 KB", so with '176 bytes smem', that permits '48 * 1024 / 176
> = 279' thread blocks ('num_gangs') resident at one point in time,
> whereas with '256 bytes smem', it's just '48 * 1024 / 256 = 192'
> thread blocks resident at one point in time.  (Not sure that I got
> all the details right, but you get the idea/concern?)
> 
> Anyway, that shall be OK for now, but we shall later look into
> optimizing that; can't we have '.shared' local to the relevant PTX
> functions instead of global?

As mentioned in a previous posting (probably some time ago!) the NVPTX
backend parts were a bit of the patch I inherited from the earliest
versions of the patch, and didn't alter much. The possibility for
function-local allocation has been raised before (for NVPTX), but I
haven't investigated if it's possible or beneficial.

> Interestingly, compiling with '-O2', we see:
> 
>     // BEGIN VAR DEF: __oacc_bcast
>     .shared .align 8 .u8 __oacc_bcast[144];
>     {+// BEGIN VAR DEF: __gangprivate_shared+}
>     {+.shared .align 128 .u8 __gangprivate_shared[32];+}
> 
> With '-O2', only 'loop_g_5$_omp_fn$0' is using gang-private memory,
> and apparently the PTX JIT is able to figure that out from the PTX
> code that GCC generates, and is then able to localize '.shared'
> memory usage to just 'loop_g_5$_omp_fn$0':
> 
>     [...]
>     info    : Function properties for 'loop_g_4$_omp_fn$0':
>     info    : used 12 registers, 0 stack, 144 bytes smem, 328 bytes cmem[0], 0 bytes lmem
>     info    : Function properties for 'loop_g_5$_omp_fn$0':
>     info    : used [-30-]{+32+} registers, 32 stack, [-144-]{+288+} bytes smem, 328 bytes cmem[0], 0 bytes lmem
>     info    : Function properties for 'loop_g_6$_omp_fn$0':
>     info    : used 13 registers, 0 stack, 144 bytes smem, 328 bytes cmem[0], 0 bytes lmem
>     [...]
> 
> This strongly suggests to me that indeed there must exist a
> programmatic way to get rid of the global "Shared memory block for
> gang-private variables".
> 
> The additional '288 - 144 = 144' bytes of PTX '.shared' memory
> requested are 32 bytes for 'int x[8]' ('#pragma acc loop gang
> private(x)') plus '288 - 32 - 144 = 112' padding/unused bytes to
> establish '.align 128' (!) after '.align 8' for '__oacc_bcast'.
> That's clearly not ideal: 112 bytes wasted in contrast to just '144 +
> 32 = 176' bytes actually used.  (I have not yet looked why/whether
> this really needs '.align 128'.)

I'm sure improvements are possible there (maybe later?).

> I have not yet looked whether similar concerns exist for the GCC GCN
> back end implementation.  (That one also does set 'TREE_STATIC' for
> gang-private memory, so it's a global allocation?)

Yes, or rather per-CU allocation.

> > Handling of private variables is intimately
> > tied to the execution model for gangs/workers/vectors implemented by
> > a particular target: for current targets, we use (or on mainline,
> > will soon use) a broadcasting/neutering scheme.
> >
> > That is sufficient for code that e.g. sets a variable in
> > worker-single mode and expects to use the value in
> > worker-partitioned mode. The difficulty (semantics-wise) comes when
> > the user wants to do something like an atomic operation in
> > worker-partitioned mode and expects a worker-single (gang private)
> > variable to be shared across each partitioned worker. Forcing use
> > of shared memory for such variables makes that work properly.  
> 
> Are we reliably making sure that gang-private variables (and other
> levels, in general) are not subject to the usual broadcasting scheme
> (nvptx, at least), or does that currently work "by accident"?  (I
> haven't looked into that, yet.)

Yes, that case is explicitly handled by the broadcasting/neutering patch
recently posted. (One of the reasons that patch depends on this one.)

> > In terms of implementation, the parallelism level of a given loop is
> > not fixed until the oaccdevlow pass in the offload compiler, so the
> > patch delays fixing the parallelism level of variables declared on
> > or within such loops until the same point. This is done by adding a
> > new internal UNIQUE function (OACC_PRIVATE) that lists (the address
> > of) each private variable as an argument, and other arguments set
> > so as to be able to determine the correct parallelism level to use
> > for the listed variables. This new internal function fits into the
> > existing scheme for demarcating OpenACC loops, as described in
> > comments in the patch.  
> 
> Yes, thanks, that's conceptually now much better than the earlier
> variants that we had.  :-) (Hooray, again, for Nathan's OpenACC
> execution model design!)
> 
> What we should add, though, is a bunch of testcases to verify that the
> expected processing does/doesn't happen for relevant source code
> constructs.  I'm thinking that when the transformation is/isn't done,
> that gets logged, and we can then scan the dumps accordingly.  Some of
> that is implemented already; we should be able to do such scanning
> generally for host compilation, too, not just offloading compilation.

More test coverage is always welcome, of course.

> > Two new target hooks are introduced:
> > TARGET_GOACC_ADJUST_PRIVATE_DECL and TARGET_GOACC_EXPAND_VAR_DECL.
> > The first can tweak a variable declaration at oaccdevlow time, and
> > the second at expand time.  The first or both of these target hooks
> > can be used by a given offload target, depending on its strategy
> > for implementing private variables.  
> 
> ACK.
> 
> So, currently we're only looking at making the gang-private level
> work. Regarding that, we have two configurations: (1) for GCN
> offloading, 'targetm.goacc.adjust_private_decl' does the work (in
> particular, change 'TREE_TYPE' etc.) and there is no
> 'targetm.goacc.expand_var_decl', and (2) for nvptx offloading,
> 'targetm.goacc.adjust_private_decl' only sets a marker ('oacc
> gangprivate' attribute) and then 'targetm.goacc.expand_var_decl' does
> the work.
> 
> Therefore I suggest we clarify the (currently) expected handling
> similar to:
> 
>     --- gcc/omp-offload.c
>     +++ gcc/omp-offload.c
>     @@ -1854,6 +1854,19 @@ oacc_rewrite_var_decl (tree *tp, int *walk_subtrees, void *data)
>        return NULL_TREE;
>      }
>      
>     +static tree
>     +oacc_rewrite_var_decl_ (tree *tp, int *walk_subtrees, void *data)
>     +{
>     +  tree t = oacc_rewrite_var_decl (tp, walk_subtrees, data);
>     +  if (targetm.goacc.expand_var_decl)
>     +    {
>     +      walk_stmt_info *wi = (walk_stmt_info *) data;
>     +      var_decl_rewrite_info *info = (var_decl_rewrite_info *) wi->info;
>     +      gcc_assert (!info->modified);
>     +    }
>     +  return t;
>     +}

Why the ugly _ tail on the function name!? I don't think that's a
typical GNU coding standards thing, is it?

>     +
>      /* Return TRUE if CALL is a call to a builtin atomic/sync operation.  */
>      static bool
>     @@ -2195,6 +2208,9 @@ execute_oacc_device_lower ()
>           COMPONENT_REFS, ARRAY_REFS and plain VAR_DECLs are also rewritten
>           to use the new decl, adjusting types of appropriate tree nodes as
>           necessary.  */
>     +  if (targetm.goacc.expand_var_decl)
>     +    gcc_assert (adjusted_vars.is_empty ());

If you like -- or do something like

>        if (targetm.goacc.adjust_private_decl
>            && !adjusted_vars.is_empty ())

perhaps.

>          {
>            FOR_ALL_BB_FN (bb, cfun)
>     @@ -2217,7 +2233,7 @@ execute_oacc_device_lower ()
>                 memset (&wi, 0, sizeof (wi));
>                 wi.info = &info;
>      
>     -           walk_gimple_op (stmt, oacc_rewrite_var_decl, &wi);
>     +           walk_gimple_op (stmt, oacc_rewrite_var_decl_, &wi);
>      
>                 if (info.modified)
>                   update_stmt (stmt);
> 
> Or, in fact, 'if (targetm.goacc.expand_var_decl)', skip the
> 'adjusted_vars' handling completely?

For the current pair of implementations, sure. I don't think it's
necessary to set that as a constraint for future targets though? I
guess it doesn't matter much until such a target exists.

> I do understand that eventually (in particular, for worker-private
> level?), both 'targetm.goacc.adjust_private_decl' and
> 'targetm.goacc.expand_var_decl' may need to do things, but that's
> currently not meant to be addressed, and thus not fully worked out and
> implemented, and thus untested.  Hence, 'assert' what currently is
> implemented/tested, only.

If you like, no strong feelings from me on that.

> (Given that eventual goal, that's probably sufficient motivation to
> indeed add the 'adjusted_vars' handling in generic 'gcc/omp-offload.c'
> instead of moving it into the GCN back end?)

I'm not sure what moving it to the GCN back end would look like. I
guess it's a question of keeping the right abstractions in the right
place.

> For 'libgomp.oacc-c-c++-common/static-variable-1.c' that I've recently
> added, the code changes here cause execution test FAILs for nvptx
> offloading (because of making 'static' variables gang-private), and
> trigger an ICE with GCN offloading compilation.  It isn't clear to me
> what the desired semantics are for (user-specified) 'static'
> variables -- see <https://github.com/OpenACC/openacc-spec/issues/372>
> "C/C++ 'static' variables" (only visible to members of the GitHub
> OpenACC organization) -- but an ICE clearly isn't the right answer.
> ;-)
> 
> As for certain transformation/optimizations, 'static' variables may be
> synthesized in the GCC middle end, I suppose we should preserve the
> status quo (as documented via
> 'libgomp.oacc-c-c++-common/static-variable-1.c') until #372 gets
> resolved in OpenACC?  (I suppose, skip the transformation if
> 'TREE_STATIC' is set, or similar.)

ICEs are bad -- but a user expecting static variables to do something
meaningful in offloaded code is being somewhat optimistic, I think!
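To make the 'static' question concrete, here is a minimal sketch (not part of the patch; names are ours) of the ambiguous case. Compiled without OpenACC offloading, the pragmas are ignored and the loop runs sequentially, so the check passes either way:

```c
#include <assert.h>

/* With shared ('static') semantics, all gangs read/update one object;
   a gang-private copy would instead start from the initializer in
   every gang.  Which of the two the user gets is exactly the open
   question in OpenACC issue #372.  */
static int
static_var_demo (void)
{
  int arr[8];

#pragma acc parallel loop gang copyout(arr)
  for (int j = 0; j < 8; j++)
    {
      static int var = 1;	/* shared, or gang-private?  */
      arr[j] = var;
    }

  for (int j = 0; j < 8; j++)
    if (arr[j] != 1)
      return 0;
  return 1;
}
```

Serially (and with a shared 'static' under offloading) every element ends up 1; a compiler that silently privatized the zeroed copy would break the initializer semantics instead of ICEing.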

> > --- a/gcc/expr.c
> > +++ b/gcc/expr.c
> > @@ -10224,8 +10224,19 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
> >        exp = SSA_NAME_VAR (ssa_name);
> >        goto expand_decl_rtl;
> >  
> > -    case PARM_DECL:
> >      case VAR_DECL:
> > +      /* Allow accel compiler to handle variables that require special
> > +	 treatment, e.g. if they have been modified in some way earlier in
> > +	 compilation by the adjust_private_decl OpenACC hook.  */
> > +      if (flag_openacc && targetm.goacc.expand_var_decl)
> > +	{
> > +	  temp = targetm.goacc.expand_var_decl (exp);
> > +	  if (temp)
> > +	    return temp;
> > +	}
> > +      /* ... fall through ...  */
> > +
> > +    case PARM_DECL:  
> 
> [TS] Are we sure that we don't need the same handling for a
> 'PARM_DECL', too?  (If yes, to document and verify that, should we
> thus again unify the two 'case's, and in
> 'targetm.goacc.expand_var_decl' add a 'gcc_checking_assert (TREE_CODE
> (var) == VAR_DECL')'?)

Maybe for routines? Those bits date from the earliest version of the
patch and (same excuse again) I didn't have call to revisit those
decisions.

> Also, are we sure that all the following existing processing is not
> relevant to do before the 'return temp' (see above)?  That's not a
> concern for GCN (which doesn't use 'targetm.goacc.expand_var_decl',
> and thus does execute all this following existing processing), but it
> is for nvptx (which does use 'targetm.goacc.expand_var_decl', and
> thus doesn't execute all this following existing processing if that
> returned something).  Or, is 'targetm.goacc.expand_var_decl'
> conceptually and practically meant to implement all of the following
> processing, or is this for other reasons not relevant in the
> 'targetm.goacc.expand_var_decl' case:
> 
> >        /* If a static var's type was incomplete when the decl was
> > written, but the type is complete now, lay out the decl now.  */
> >        if (DECL_SIZE (exp) == 0
> |            && COMPLETE_OR_UNBOUND_ARRAY_TYPE_P (TREE_TYPE (exp))
> |            && (TREE_STATIC (exp) || DECL_EXTERNAL (exp)))
> |          layout_decl (exp, 0);
> |
> |        /* fall through */
> |
> |      case FUNCTION_DECL:
> |      case RESULT_DECL:
> |        decl_rtl = DECL_RTL (exp);
> |      expand_decl_rtl:
> |        gcc_assert (decl_rtl);
> |
> |        /* DECL_MODE might change when TYPE_MODE depends on attribute target
> |           settings for VECTOR_TYPE_P that might switch for the function.  */
> |        if (currently_expanding_to_rtl
> |            && code == VAR_DECL && MEM_P (decl_rtl)
> |            && VECTOR_TYPE_P (type) && exp && DECL_MODE (exp) != mode)
> |          decl_rtl = change_address (decl_rtl, TYPE_MODE (type), 0);
> |        else
> |          decl_rtl = copy_rtx (decl_rtl);
> |
> |        /* Record writes to register variables.  */
> |        if (modifier == EXPAND_WRITE
> |            && REG_P (decl_rtl)
> |            && HARD_REGISTER_P (decl_rtl))
> |          add_to_hard_reg_set (&crtl->asm_clobbers,
> |                               GET_MODE (decl_rtl), REGNO (decl_rtl));
> |
> |        /* Ensure variable marked as used even if it doesn't go through
> |           a parser.  If it hasn't be used yet, write out an external
> |           definition.  */
> |        if (exp)
> |          TREE_USED (exp) = 1;
> |
> |        /* Show we haven't gotten RTL for this yet.  */
> |        temp = 0;
> |
> |        /* Variables inherited from containing functions should have
> |           been lowered by this point.  */
> |        if (exp)
> |          context = decl_function_context (exp);
> |        gcc_assert (!exp
> |                    || SCOPE_FILE_SCOPE_P (context)
> |                    || context == current_function_decl
> |                    || TREE_STATIC (exp)
> |                    || DECL_EXTERNAL (exp)
> |                    /* ??? C++ creates functions that are not TREE_STATIC.  */
> |                    || TREE_CODE (exp) == FUNCTION_DECL);
> |
> |        /* This is the case of an array whose size is to be determined
> |           from its initializer, while the initializer is still being parsed.
> |           ??? We aren't parsing while expanding anymore.  */
> |
> |        if (MEM_P (decl_rtl) && REG_P (XEXP (decl_rtl, 0)))
> |          temp = validize_mem (decl_rtl);
> |
> |        /* If DECL_RTL is memory, we are in the normal case and the
> |           address is not valid, get the address into a register.  */
> |
> |        else if (MEM_P (decl_rtl) && modifier != EXPAND_INITIALIZER)
> |          {
> |            if (alt_rtl)
> |              *alt_rtl = decl_rtl;
> |            decl_rtl = use_anchored_address (decl_rtl);
> |            if (modifier != EXPAND_CONST_ADDRESS
> |                && modifier != EXPAND_SUM
> |                && !memory_address_addr_space_p (exp ? DECL_MODE (exp)
> |                                                 : GET_MODE (decl_rtl),
> |                                                 XEXP (decl_rtl, 0),
> |                                                 MEM_ADDR_SPACE (decl_rtl)))
> |              temp = replace_equiv_address (decl_rtl,
> |                                            copy_rtx (XEXP (decl_rtl, 0)));
> |          }
> |
> |        /* If we got something, return it.  But first, set the alignment
> |           if the address is a register.  */
> |        if (temp != 0)
> |          {
> |            if (exp && MEM_P (temp) && REG_P (XEXP (temp, 0)))
> |              mark_reg_pointer (XEXP (temp, 0), DECL_ALIGN (exp));
> |          }
> |        else if (MEM_P (decl_rtl))
> |          temp = decl_rtl;
> |
> |        if (temp != 0)
> |          {
> |            if (MEM_P (temp)
> |                && modifier != EXPAND_WRITE
> |                && modifier != EXPAND_MEMORY
> |                && modifier != EXPAND_INITIALIZER
> |                && modifier != EXPAND_CONST_ADDRESS
> |                && modifier != EXPAND_SUM
> |                && !inner_reference_p
> |                && mode != BLKmode
> |                && MEM_ALIGN (temp) < GET_MODE_ALIGNMENT (mode))
> |              temp = expand_misaligned_mem_ref (temp, mode, unsignedp,
> |                                                MEM_ALIGN (temp), NULL_RTX, NULL);
> |
> |            return temp;
> |          }
> | [...]
> 
> [TS] I don't understand that yet.  :-|
> 
> Instead of the current "early-return" handling:
> 
>     temp = targetm.goacc.expand_var_decl (exp);
>     if (temp)
>       return temp;
> 
> ... should we maybe just set:
> 
>     DECL_RTL (exp) = targetm.goacc.expand_var_decl (exp)
> 
> ... (or similar), and then let the usual processing continue?

Hum, not sure about that. See above excuse... maybe Chung-Lin
remembers? My guess is the extra processing doesn't matter in practice
for the limited kinds of variables that are handled by that hook, at
least for NVPTX (which skips register allocation, etc. anyway).

> > [snip]
> >    tree fork_kind = build_int_cst (unsigned_type_node, IFN_UNIQUE_OACC_FORK);
> >    tree join_kind = build_int_cst (unsigned_type_node, IFN_UNIQUE_OACC_JOIN);
> > @@ -8027,7 +8041,8 @@ lower_oacc_head_tail (location_t loc, tree clauses,
> > 			       &join_seq);
> >  
> >        lower_oacc_reductions (loc, clauses, place, inner,
> > -			     fork, join, &fork_seq, &join_seq, ctx);
> > +			     fork, (count == 1) ? private_marker : NULL,
> > +			     join, &fork_seq, &join_seq,  ctx);
> >  
> >        /* Append this level to head. */
> >        gimple_seq_add_seq (head, fork_seq);  
> 
> [TS] That looks good in principle.  Via the testing mentioned above, I
> just want to make sure that this does all the expected things
> regarding differently nested loops and privatization levels.

Feel free to extend test coverage as you see fit...

> >        gimple_seq_add_seq (&new_body, fork_seq);
> > @@ -13262,6 +13369,9 @@ lower_omp_1 (gimple_stmt_iterator *gsi_p, omp_context *ctx)
> > 			    ctx);
> >        break;
> >      case GIMPLE_BIND:
> > +      if (ctx && is_gimple_omp_oacc (ctx->stmt))
> > +	oacc_record_vars_in_bind (ctx,
> > +				  gimple_bind_vars (as_a <gbind *> (stmt)));
> >        lower_omp (gimple_bind_body_ptr (as_a <gbind *> (stmt)), ctx);
> >        maybe_remove_omp_member_access_dummy_vars (as_a <gbind *> (stmt));
> >        break;
> 
> [TS] I have not yet verified whether these lowering case are
> sufficient to also handle the <https://gcc.gnu.org/PR90114>
> "Predetermined private levels for variables declared in OpenACC
> accelerator routines" case.  (If yes, then that needs testcases, too,
> if not, then need to add a TODO note, for later.)

I believe that's a TODO.

> > +       1. They can be recreated, making a pointer to the variable in the new
> > +	  address space, or
> > +
> > +       2. The address of the variable in the new address space can be taken,
> > +	  converted to the default (original) address space, and the result of
> > +	  that conversion substituted in place of the original ADDR_EXPR node.
> > +
> > +     Which of these is done depends on the gimple statement being processed.
> > +     At present atomic operations and inline asms use (1), and everything else
> > +     uses (2).  At least on AMD GCN, there are atomic operations that work
> > +     directly in the LDS address space.
> > +
> > +     COMPONENT_REFS, ARRAY_REFS and plain VAR_DECLs are also rewritten to use
> > +     the new decl, adjusting types of appropriate tree nodes as necessary.  */
> 
> [TS] As I understand, this is only relevant for GCN offloading, but
> not nvptx, and I'll trust that these two variants make sense from a
> GCN point of view (which I cannot verify easily).

The idea (hope) is that that's what's necessary "generically", though
the only target using that support is GCN at present. I.e. it's not
supposed to be GCN-specific, necessarily. Of course though, who knows
what some other exotic target will need? (We don't want to be in the
state where each target has to start completely from scratch for this
sort of thing, if we can help it.)

> > +  if (targetm.goacc.adjust_private_decl)
> > +    {
> > +      FOR_ALL_BB_FN (bb, cfun)
> > +	for (gimple_stmt_iterator gsi = gsi_start_bb (bb);
> > +	     !gsi_end_p (gsi);
> > +	     gsi_next (&gsi))
> > +	  {
> > +	    gimple *stmt = gsi_stmt (gsi);
> > +	    walk_stmt_info wi;
> > +	    var_decl_rewrite_info info;
> > +
> > +	    info.avoid_pointer_conversion
> > +	      = (is_gimple_call (stmt)
> > +		 && is_sync_builtin_call (as_a <gcall *> (stmt)))
> > +		|| gimple_code (stmt) == GIMPLE_ASM;
> > +	    info.stmt = stmt;
> > +	    info.modified = false;
> > +	    info.adjusted_vars = &adjusted_vars;
> > +
> > +	    memset (&wi, 0, sizeof (wi));
> > +	    wi.info = &info;
> > +
> > +	    walk_gimple_op (stmt, oacc_rewrite_var_decl, &wi);
> > +
> > +	    if (info.modified)
> > +	      update_stmt (stmt);
> > +	  }
> > +    }
> > +
> >    free_oacc_loop (loops);
> >  
> >    return 0;  
> 
> [TS] As discussed above, maybe can completely skip the 'adjusted_vars'
> rewriting for nvptx offloading?

Yeah sure, if you like.

> > --- /dev/null
> > +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/gang-private-1.c  
> 
> [TS] Without any code changes, this one FAILs (as expected) with nvptx
> offloading, but with GCN offloading, it already PASSes.

Not sure about that, of course one gets lucky sometimes.
 
> > --- /dev/null
> > +++
> > b/libgomp/testsuite/libgomp.oacc-fortran/gangprivate-attrib-2.f90  
> 
> [TS] With code changes as posted, this one FAILs for nvptx offloading
> execution.  (... for all but the Nvidia Titan V GPU in my set of
> testing configurations, huh?)
> 
> > @@ -0,0 +1,25 @@
> > +! Test for worker-private variables
> > +
> > +! { dg-do run }
> > +! { dg-additional-options "-fdump-tree-oaccdevlow-details" }
> > +
> > +program main
> > +  integer :: w, arr(0:31)
> > +
> > +  !$acc parallel num_gangs(32) num_workers(32) copyout(arr)
> > +    !$acc loop gang worker private(w)
> > +! { dg-final { scan-tree-dump-times "Decl UID \[0-9\]+ has worker partitioning:  integer\\(kind=4\\) w;" 1 "oaccdevlow" } } */
> > +    do j = 0, 31
> > +      w = 0
> > +      !$acc loop seq
> > +      do i = 0, 31
> > +        !$acc atomic update
> > +        w = w + 1
> > +        !$acc end atomic
> > +      end do
> > +      arr(j) = w
> > +    end do
> > +  !$acc end parallel
> > +
> > +  if (any (arr .ne. 32)) stop 1
> > +end program main  

Boo. I don't think I saw such a failure on the systems I tested on.
That needs investigation (though it might be something CUDA-version or
GPU specific, hence not directly a GCC problem? Not sure.)
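For reference, here is a C analogue of the Fortran testcase quoted above (our sketch, not part of the patch): 'w' is private to each gang/worker pair, and all 32 atomic updates in the inner 'seq' loop must land on that single private copy. Built without OpenACC offloading, the pragmas are ignored and the loops run sequentially, which still checks the expected values:

```c
#include <assert.h>

static int
worker_private_demo (void)
{
  int w, arr[32];

#pragma acc parallel num_gangs(32) num_workers(32) copyout(arr)
#pragma acc loop gang worker private(w)
  for (int j = 0; j < 32; j++)
    {
      w = 0;
#pragma acc loop seq
      for (int i = 0; i < 32; i++)
        {
          /* Each increment must hit the one per-gang/worker copy of
             'w'; forcing 'w' into shared memory is what makes this
             work under worker partitioning.  */
#pragma acc atomic update
          w = w + 1;
        }
      arr[j] = w;
    }

  for (int j = 0; j < 32; j++)
    if (arr[j] != 32)
      return 0;
  return 1;
}
```

Under offloading, a failure here would indicate that 'w' was not in fact shared across the partitioned workers of one gang.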

Thanks for review, and please ask if there's anything I can help
further with.

Julian
Thomas Schwinge May 21, 2021, 6:55 p.m. UTC | #7
Hi!

On 2021-04-19T12:23:56+0100, Julian Brown <julian@codesourcery.com> wrote:
> On Thu, 15 Apr 2021 19:26:54 +0200
> Thomas Schwinge <thomas@codesourcery.com> wrote:
>> This has iterated through several conceptually different designs and
>> implementations, by several people, over the past several years.
>
> I hope this wasn't a hint that I'd failed to attribute the authorship of
> the patch properly? Many apologies if so, that certainly wasn't my
> intention!

No, not at all -- this was just to highlight the several iterations this
work has gone through.


With a first set of my modification merged in, I've now pushed "openacc:
Add support for gang local storage allocation in shared memory [PR90115]"
to master branch in commit 29a2f51806c5b30e17a8d0e9ba7915a3c53c34ff, see
attached.  I shall now follow up with a number of further changes, and
more to come later (once developed).


>> On 2021-02-26T04:34:50-0800, Julian Brown <julian@codesourcery.com>
>> wrote:
>> > This patch implements a method to track the "private-ness" of
>> > OpenACC variables declared in offload regions in gang-partitioned,
>> > worker-partitioned or vector-partitioned modes. Variables declared
>> > implicitly in scoped blocks and those declared "private" on
>> > enclosing directives (e.g. "acc parallel") are both handled.
>> > Variables that are e.g. gang-private can then be adjusted so they
>> > reside in GPU shared memory.
>> >
>> > The reason for doing this is twofold: correct implementation of
>> > OpenACC semantics
>>
>> ACK, and as mentioned before, this very much relates to
>> <https://gcc.gnu.org/PR90115> "OpenACC: predetermined private levels
>> for variables declared in blocks" (plus the corresponding use of
>> 'private' clauses, implicit/explicit, including 'firstprivate') and
>> <https://gcc.gnu.org/PR90114> "Predetermined private levels for
>> variables declared in OpenACC accelerator routines", which we thus
>> should refer in testcases/ChangeLog/commit log, as appropriate.  I do
>> understand we're not yet addressing all of that (and that's fine!),
>> but we should capture remaining work items of the PRs and Cesar's
>> list in
>> <http://mid.mail-archive.com/70d27ebd-762e-59a3-082f-48fa0c687212@codesourcery.com>),
>> as appropriate.
>
> From that list: [...]

Thanks, that'll be useful for later.


>> > Handling of private variables is intimately
>> > tied to the execution model for gangs/workers/vectors implemented by
>> > a particular target: for current targets, we use (or on mainline,
>> > will soon use) a broadcasting/neutering scheme.
>> >
>> > That is sufficient for code that e.g. sets a variable in
>> > worker-single mode and expects to use the value in
>> > worker-partitioned mode. The difficulty (semantics-wise) comes when
>> > the user wants to do something like an atomic operation in
>> > worker-partitioned mode and expects a worker-single (gang private)
>> > variable to be shared across each partitioned worker. Forcing use
>> > of shared memory for such variables makes that work properly.
>>
>> Are we reliably making sure that gang-private variables (and other
>> levels, in general) are not subject to the usual broadcasting scheme
>> (nvptx, at least), or does that currently work "by accident"?  (I
>> haven't looked into that, yet.)
>
> Yes, that case is explicitly handled by the broadcasting/neutering patch
> recently posted. (One of the reasons that patch depends on this one.)

OK, I shall look into these GCN patches soon -- and I still haven't
looked into the nvptx aspect.


>> > --- a/gcc/expr.c
>> > +++ b/gcc/expr.c
>> > @@ -10224,8 +10224,19 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
>> >        exp = SSA_NAME_VAR (ssa_name);
>> >        goto expand_decl_rtl;
>> >
>> > -    case PARM_DECL:
>> >      case VAR_DECL:
>> > +      /* Allow accel compiler to handle variables that require special
>> > +   treatment, e.g. if they have been modified in some way earlier in
>> > +   compilation by the adjust_private_decl OpenACC hook.  */
>> > +      if (flag_openacc && targetm.goacc.expand_var_decl)
>> > +  {
>> > +    temp = targetm.goacc.expand_var_decl (exp);
>> > +    if (temp)
>> > +      return temp;
>> > +  }
>> > +      /* ... fall through ...  */
>> > +
>> > +    case PARM_DECL:
>>
>> [TS] Are we sure that we don't need the same handling for a
>> 'PARM_DECL', too?  (If yes, to document and verify that, should we
>> thus again unify the two 'case's, and in
>> 'targetm.goacc.expand_var_decl' add a 'gcc_checking_assert (TREE_CODE
>> (var) == VAR_DECL')'?)
>
> Maybe for routines? Those bits date from the earliest version of the
> patch and (same excuse again) I didn't have call to revisit those
> decisions.

Indeed we're currently not handling 'p' here:

    int f(int p)
    {
      int l;
      #pragma acc parallel
        {
          #pragma acc loop gang private(l, p) // 'l' is, but 'p' is *not* made gang-private here.
          for ([...])

... to be fixed at some later point.
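A runnable sketch along the lines of the snippet above (names are ours), privatizing both a local 'l' and a parameter 'p'. Compiled without offloading, the pragmas are ignored and the result is the same whether or not 'p' is actually made gang-private, which is part of why the missing PARM_DECL handling is easy to miss:

```c
#include <assert.h>

static int
param_private_demo (int p)
{
  int l = 0;
  int out[4];

#pragma acc parallel copyout(out)
#pragma acc loop gang private(l, p)	/* 'l' is privatized; 'p' currently is not.  */
  for (int j = 0; j < 4; j++)
    {
      l = j;
      p = j;
      out[j] = l + p;
    }

  for (int j = 0; j < 4; j++)
    if (out[j] != 2 * j)
      return 0;
  return 1;
}
```

Each iteration writes both variables before reading them, so serially every element is 2*j regardless of privatization.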


>> Also, are we sure that all the following existing processing is not
>> relevant to do before the 'return temp' (see above)?  That's not a
>> concern for GCN (which doesn't use 'targetm.goacc.expand_var_decl',
>> and thus does execute all this following existing processing), but it
>> is for nvptx (which does use 'targetm.goacc.expand_var_decl', and
>> thus doesn't execute all this following existing processing if that
>> returned something).  Or, is 'targetm.goacc.expand_var_decl'
>> conceptually and practically meant to implement all of the following
>> processing, or is this for other reasons not relevant in the
>> 'targetm.goacc.expand_var_decl' case:
>>
>> >        /* If a static var's type was incomplete when the decl was
>> > written, but the type is complete now, lay out the decl now.  */
>> >        if (DECL_SIZE (exp) == 0
>> |            && COMPLETE_OR_UNBOUND_ARRAY_TYPE_P (TREE_TYPE (exp))
>> |            && (TREE_STATIC (exp) || DECL_EXTERNAL (exp)))
>> |          layout_decl (exp, 0);
>> |
>> |        /* fall through */
>> |
>> |      case FUNCTION_DECL:
>> |      case RESULT_DECL:
>> |        decl_rtl = DECL_RTL (exp);
>> |      expand_decl_rtl:
>> |        gcc_assert (decl_rtl);
>> |
>> |        /* DECL_MODE might change when TYPE_MODE depends on attribute target
>> |           settings for VECTOR_TYPE_P that might switch for the function.  */
>> |        if (currently_expanding_to_rtl
>> |            && code == VAR_DECL && MEM_P (decl_rtl)
>> |            && VECTOR_TYPE_P (type) && exp && DECL_MODE (exp) != mode)
>> |          decl_rtl = change_address (decl_rtl, TYPE_MODE (type), 0);
>> |        else
>> |          decl_rtl = copy_rtx (decl_rtl);
>> |
>> |        /* Record writes to register variables.  */
>> |        if (modifier == EXPAND_WRITE
>> |            && REG_P (decl_rtl)
>> |            && HARD_REGISTER_P (decl_rtl))
>> |          add_to_hard_reg_set (&crtl->asm_clobbers,
>> |                               GET_MODE (decl_rtl), REGNO (decl_rtl));
>> |
>> |        /* Ensure variable marked as used even if it doesn't go through
>> |           a parser.  If it hasn't be used yet, write out an external
>> |           definition.  */
>> |        if (exp)
>> |          TREE_USED (exp) = 1;
>> |
>> |        /* Show we haven't gotten RTL for this yet.  */
>> |        temp = 0;
>> |
>> |        /* Variables inherited from containing functions should have
>> |           been lowered by this point.  */
>> |        if (exp)
>> |          context = decl_function_context (exp);
>> |        gcc_assert (!exp
>> |                    || SCOPE_FILE_SCOPE_P (context)
>> |                    || context == current_function_decl
>> |                    || TREE_STATIC (exp)
>> |                    || DECL_EXTERNAL (exp)
>> |                    /* ??? C++ creates functions that are not TREE_STATIC.  */
>> |                    || TREE_CODE (exp) == FUNCTION_DECL);
>> |
>> |        /* This is the case of an array whose size is to be determined
>> |           from its initializer, while the initializer is still being parsed.
>> |           ??? We aren't parsing while expanding anymore.  */
>> |
>> |        if (MEM_P (decl_rtl) && REG_P (XEXP (decl_rtl, 0)))
>> |          temp = validize_mem (decl_rtl);
>> |
>> |        /* If DECL_RTL is memory, we are in the normal case and the
>> |           address is not valid, get the address into a register.  */
>> |
>> |        else if (MEM_P (decl_rtl) && modifier != EXPAND_INITIALIZER)
>> |          {
>> |            if (alt_rtl)
>> |              *alt_rtl = decl_rtl;
>> |            decl_rtl = use_anchored_address (decl_rtl);
>> |            if (modifier != EXPAND_CONST_ADDRESS
>> |                && modifier != EXPAND_SUM
>> |                && !memory_address_addr_space_p (exp ? DECL_MODE (exp)
>> |                                                 : GET_MODE (decl_rtl),
>> |                                                 XEXP (decl_rtl, 0),
>> |                                                 MEM_ADDR_SPACE (decl_rtl)))
>> |              temp = replace_equiv_address (decl_rtl,
>> |                                            copy_rtx (XEXP (decl_rtl, 0)));
>> |          }
>> |
>> |        /* If we got something, return it.  But first, set the alignment
>> |           if the address is a register.  */
>> |        if (temp != 0)
>> |          {
>> |            if (exp && MEM_P (temp) && REG_P (XEXP (temp, 0)))
>> |              mark_reg_pointer (XEXP (temp, 0), DECL_ALIGN (exp));
>> |          }
>> |        else if (MEM_P (decl_rtl))
>> |          temp = decl_rtl;
>> |
>> |        if (temp != 0)
>> |          {
>> |            if (MEM_P (temp)
>> |                && modifier != EXPAND_WRITE
>> |                && modifier != EXPAND_MEMORY
>> |                && modifier != EXPAND_INITIALIZER
>> |                && modifier != EXPAND_CONST_ADDRESS
>> |                && modifier != EXPAND_SUM
>> |                && !inner_reference_p
>> |                && mode != BLKmode
>> |                && MEM_ALIGN (temp) < GET_MODE_ALIGNMENT (mode))
>> |              temp = expand_misaligned_mem_ref (temp, mode, unsignedp,
>> |                                                MEM_ALIGN (temp), NULL_RTX, NULL);
>> |
>> |            return temp;
>> |          }
>> | [...]
>>
>> [TS] I don't understand that yet.  :-|
>>
>> Instead of the current "early-return" handling:
>>
>>     temp = targetm.goacc.expand_var_decl (exp);
>>     if (temp)
>>       return temp;
>>
>> ... should we maybe just set:
>>
>>     DECL_RTL (exp) = targetm.goacc.expand_var_decl (exp)
>>
>> ... (or similar), and then let the usual processing continue?
>
> Hum, not sure about that. See above excuse... maybe Chung-Lin
> remembers? My guess is the extra processing doesn't matter in practice
> for the limited kinds of variables that are handled by that hook, at
> least for NVPTX (which skips register allocation, etc. anyway).

I haven't yet looked into that further.


Grüße
 Thomas


-----------------
Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Frank Thürauf
Thomas Schwinge May 21, 2021, 7:12 p.m. UTC | #8
Hi!

On 2021-02-26T04:34:50-0800, Julian Brown <julian@codesourcery.com> wrote:
> --- a/gcc/internal-fn.c
> +++ b/gcc/internal-fn.c
> @@ -2957,6 +2957,8 @@ expand_UNIQUE (internal_fn, gcall *stmt)
>        else
>       gcc_unreachable ();
>        break;
> +    case IFN_UNIQUE_OACC_PRIVATE:
> +      break;
>      }
>
>    if (pattern)

That's unexpected.  Meaning: better if this doesn't happen.

> --- a/gcc/omp-offload.c
> +++ b/gcc/omp-offload.c

> @@ -1998,6 +2133,45 @@ execute_oacc_device_lower ()
>               case IFN_UNIQUE_OACC_TAIL_MARK:
>                 remove = true;
>                 break;
> +
> +             case IFN_UNIQUE_OACC_PRIVATE:
> +               {
> +                 HOST_WIDE_INT level
> +                   = TREE_INT_CST_LOW (gimple_call_arg (call, 2));
> +                 if (level == -1)
> +                   break;

They should be all "handled" here (meaning: also for 'level == -1', do
'remove = true' after the real handling):

> +                 for (unsigned i = 3;
> +                      i < gimple_call_num_args (call);
> +                      i++)
> +                   {
> +                     [...]
> +                   }
> +                 remove = true;
> +               }
> +               break;
>               }
>             break;
>           }

Why we can have 'level == -1' cases at all is a separate bug to be fixed.

I've pushed "[OpenACC privatization] Don't let unhandled
'IFN_UNIQUE_OACC_PRIVATE' linger [PR90115]" to master branch in commit
ff451ea723deb3fe8471eb96ac9381c063ec6533, see attached.


Grüße
 Thomas


Thomas Schwinge May 21, 2021, 7:18 p.m. UTC | #9
Hi!

On 2021-04-19T12:23:56+0100, Julian Brown <julian@codesourcery.com> wrote:
> On Thu, 15 Apr 2021 19:26:54 +0200
> Thomas Schwinge <thomas@codesourcery.com> wrote:
>> On 2021-02-26T04:34:50-0800, Julian Brown <julian@codesourcery.com>
>> wrote:
>> > Two new target hooks are introduced:
>> > TARGET_GOACC_ADJUST_PRIVATE_DECL and TARGET_GOACC_EXPAND_VAR_DECL.
>> > The first can tweak a variable declaration at oaccdevlow time, and
>> > the second at expand time.  The first or both of these target hooks
>> > can be used by a given offload target, depending on its strategy
>> > for implementing private variables.
>>
>> ACK.
>>
>> So, currently we're only looking at making the gang-private level
>> work. Regarding that, we have two configurations: (1) for GCN
>> offloading, 'targetm.goacc.adjust_private_decl' does the work (in
>> particular, change 'TREE_TYPE' etc.) and there is no
>> 'targetm.goacc.expand_var_decl', and (2) for nvptx offloading,
>> 'targetm.goacc.adjust_private_decl' only sets a marker ('oacc
>> gangprivate' attribute) and then 'targetm.goacc.expand_var_decl' does
>> the work.
>>
>> Therefore I suggest we clarify the (currently) expected handling
>> similar to:
>>
>>     --- gcc/omp-offload.c
>>     +++ gcc/omp-offload.c
>>     @@ -1854,6 +1854,19 @@ oacc_rewrite_var_decl (tree *tp, int *walk_subtrees, void *data)
>>        return NULL_TREE;
>>      }
>>
>>     +static tree
>>     +oacc_rewrite_var_decl_ (tree *tp, int *walk_subtrees, void *data)
>>     +{
>>     +  tree t = oacc_rewrite_var_decl (tp, walk_subtrees, data);
>>     +  if (targetm.goacc.expand_var_decl)
>>     +    {
>>     +      walk_stmt_info *wi = (walk_stmt_info *) data;
>>     +      var_decl_rewrite_info *info = (var_decl_rewrite_info *) wi->info;
>>     +      gcc_assert (!info->modified);
>>     +    }
>>     +  return t;
>>     +}
>
> Why the ugly _ tail on the function name!? I don't think that's a
> typical GNU coding standards thing, is it?

Heh, that was just to make the WIP prototype changes diff as small as
possible.  ;-)

>>     +
>>      /* Return TRUE if CALL is a call to a builtin atomic/sync operation.  */
>>      static bool
>>     @@ -2195,6 +2208,9 @@ execute_oacc_device_lower ()
>>           COMPONENT_REFS, ARRAY_REFS and plain VAR_DECLs are also rewritten to use
>>           the new decl, adjusting types of appropriate tree nodes as necessary.  */
>>     +  if (targetm.goacc.expand_var_decl)
>>     +    gcc_assert (adjusted_vars.is_empty ());
>
> If you like

I've pushed "[OpenACC privatization] Explain two different configurations
[PR90115]" to master branch in commit
21803fcaebeab36de0d7b6b8cf6abb9389f5e51f, see attached.

> -- or do something like
>
>>        if (targetm.goacc.adjust_private_decl
>>            && !adjusted_vars.is_empty ())
>
> perhaps.

That, too, additionally: I've pushed "[OpenACC privatization] Skip
processing if no work to be done [PR90115]" to master branch in commit
ad4612cb048b261f6834e9155e41e40e9252c80b, see attached.

>>          {
>>            FOR_ALL_BB_FN (bb, cfun)
>>     @@ -2217,7 +2233,7 @@ execute_oacc_device_lower ()
>>                 memset (&wi, 0, sizeof (wi));
>>                 wi.info = &info;
>>
>>     -           walk_gimple_op (stmt, oacc_rewrite_var_decl, &wi);
>>     +           walk_gimple_op (stmt, oacc_rewrite_var_decl_, &wi);
>>
>>                 if (info.modified)
>>                   update_stmt (stmt);
>>
>> Or, in fact, 'if (targetm.goacc.expand_var_decl)', skip the
>> 'adjusted_vars' handling completely?
>
> For the current pair of implementations, sure. I don't think it's
> necessary to set that as a constraint for future targets though? I
> guess it doesn't matter much until such a target exists.
>
>> I do understand that eventually (in particular, for worker-private
>> level?), both 'targetm.goacc.adjust_private_decl' and
>> 'targetm.goacc.expand_var_decl' may need to do things, but that's
>> currently not meant to be addressed, and thus not fully worked out and
>> implemented, and thus untested.  Hence, 'assert' what currently is
>> implemented/tested, only.
>
> If you like, no strong feelings from me on that.
>
>> (Given that eventual goal, that's probably sufficient motivation to
>> indeed add the 'adjusted_vars' handling in generic 'gcc/omp-offload.c'
>> instead of moving it into the GCN back end?)
>
> I'm not sure what moving it to the GCN back end would look like. I
> guess it's a question of keeping the right abstractions in the right
> place.

Right.  I guess we'll figure that out once we have more than one back end
using the 'adjusted_vars' machinery.

>> > +       1. They can be recreated, making a pointer to the variable in the new
>> > +    address space, or
>> > +
>> > +       2. The address of the variable in the new address space can be taken,
>> > +    converted to the default (original) address space, and the result of
>> > +    that conversion substituted in place of the original ADDR_EXPR node.
>> > +
>> > +     Which of these is done depends on the gimple statement being processed.
>> > +     At present atomic operations and inline asms use (1), and everything else
>> > +     uses (2).  At least on AMD GCN, there are atomic operations that work
>> > +     directly in the LDS address space.
>> > +
>> > +     COMPONENT_REFS, ARRAY_REFS and plain VAR_DECLs are also rewritten to use
>> > +     the new decl, adjusting types of appropriate tree nodes as necessary.  */
>>
>> [TS] As I understand, this is only relevant for GCN offloading, but
>> not nvptx, and I'll trust that these two variants make sense from a
>> GCN point of view (which I cannot verify easily).
>
> The idea (hope) is that that's what's necessary "generically", though
> the only target using that support is GCN at present. I.e. it's not
> supposed to be GCN-specific, necessarily. Of course though, who knows
> what some other exotic target will need? (We don't want to be in the
> state where each target has to start completely from scratch for this
> sort of thing, if we can help it.)
>
>> > +  if (targetm.goacc.adjust_private_decl)
>> > +    {
>> > +      FOR_ALL_BB_FN (bb, cfun)
>> > +  for (gimple_stmt_iterator gsi = gsi_start_bb (bb);
>> > +       !gsi_end_p (gsi);
>> > +       gsi_next (&gsi))
>> > +    {
>> > +      gimple *stmt = gsi_stmt (gsi);
>> > +      walk_stmt_info wi;
>> > +      var_decl_rewrite_info info;
>> > +
>> > +      info.avoid_pointer_conversion
>> > +        = (is_gimple_call (stmt)
>> > +           && is_sync_builtin_call (as_a <gcall *> (stmt)))
>> > +          || gimple_code (stmt) == GIMPLE_ASM;
>> > +      info.stmt = stmt;
>> > +      info.modified = false;
>> > +      info.adjusted_vars = &adjusted_vars;
>> > +
>> > +      memset (&wi, 0, sizeof (wi));
>> > +      wi.info = &info;
>> > +
>> > +      walk_gimple_op (stmt, oacc_rewrite_var_decl, &wi);
>> > +
>> > +      if (info.modified)
>> > +        update_stmt (stmt);
>> > +    }
>> > +    }
>> > +
>> >    free_oacc_loop (loops);
>> >
>> >    return 0;
>>
>> [TS] As discussed above, maybe can completely skip the 'adjusted_vars'
>> rewriting for nvptx offloading?
>
> Yeah sure, if you like.


Regards
 Thomas


-----------------
Mentor Graphics (Deutschland) GmbH, Arnulfstrasse 201, 80634 München Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Frank Thürauf
Thomas Schwinge May 21, 2021, 7:20 p.m. UTC | #10
Hi!

On 2021-04-19T12:23:56+0100, Julian Brown <julian@codesourcery.com> wrote:
> On Thu, 15 Apr 2021 19:26:54 +0200
> Thomas Schwinge <thomas@codesourcery.com> wrote:
>> As that may not be obvious to the reader, I'd like to have the
>> 'TREE_ADDRESSABLE' conditionalization be documented in the code.  You
>> had explained that in
>> <http://mid.mail-archive.com/20190612204216.0ec83e4e@squid.athome>: "a
>> non-addressable variable [...]".
>
> Yeah that probably makes sense.

I've pushed "[OpenACC privatization] Explain OpenACC privatization
candidate selection [PR90115]" to master branch in commit
5a0fe1f6c4ad0e50bf4684e723ae2ba17d94c9e4, see attached.


Regards
 Thomas


Thomas Schwinge May 21, 2021, 7:29 p.m. UTC | #11
Hi!

On 2021-04-19T12:23:56+0100, Julian Brown <julian@codesourcery.com> wrote:
> On Thu, 15 Apr 2021 19:26:54 +0200
> Thomas Schwinge <thomas@codesourcery.com> wrote:
>> On 2021-02-26T04:34:50-0800, Julian Brown <julian@codesourcery.com>
>> wrote:
>> I was surprised that we didn't really have to fix up any existing
>> libgomp testcases, because there seem to be quite some that contain a
>> pattern (exemplified by the 'tmp' variable) as follows:
>>
>>     int main()
>>     {
>>     #define N 123
>>       int data[N];
>>       int tmp;
>>
>>     #pragma acc parallel // implicit 'firstprivate(tmp)'
>>       {
>>         // 'tmp' now conceptually made gang-private here.
>>     #pragma acc loop gang
>>         for (int i = 0; i < 123; ++i)
>>           {
>>             tmp = i + 234;
>>             data[i] = tmp;
>>           }
>>       }
>>
>>       for (int i = 0; i < 123; ++i)
>>         if (data[i] != i + 234)
>>           __builtin_abort ();
>>
>>       return 0;
>>     }
>>
>> With the code changes as posted, this actually now does *not* use
>> gang-private memory for 'tmp', but instead continues to use
>> "thread-private registers", as before.
>
> When "tmp" is a local, non-address-taken scalar like that, it'll
> probably end up in a register in offloaded code (or of course be
> compiled out completely), both before and after this patch. So I
> wouldn't expect this to not work in the pre-patch state.

Of course, in the example as posted, there's no need to make 'tmp'
gang-private.  However, even if the 'i' loop did something more
spectacular (that makes 'tmp' addressable/potentially shared), at present
we still wouldn't handle that case: (a) we're not processing clauses on
OpenACC compute constructs (only 'loop' construct), and (b) we're not
processing 'firstprivate' clauses (only 'private').  That's now all easy
to fix (and reflect in the testsuite), but needs proper time allocated.

Relatedly, we might also think about using that new privatization
functionality for the 'private' aspect that comes with 'reduction'
clauses?

>> Same for:
>>
>>     --- s3.c 2021-04-13 17:26:49.628739379 +0200
>>     +++ s3_2.c       2021-04-13 17:29:43.484579664 +0200
>>     @@ -4,6 +4,6 @@
>>        int data[N];
>>     -  int tmp;
>>
>>     -#pragma acc parallel // implicit 'firstprivate(tmp)'
>>     +#pragma acc parallel
>>        {
>>     +    int tmp;
>>          // 'tmp' now conceptually made gang-private here.
>>      #pragma acc loop gang
>>
>> I suppose that's due to conditionalizing this transformation on
>> 'TREE_ADDRESSABLE' (as you're doing), so we should be mostly "safe"
>> regarding such existing testcases (but I haven't verified that yet in
>> detail).
>
> Right.
>
>> That needs to be documented in testcases, with some kind of dump
>> scanning (host compilation-side even; see below).

Done.

>> A note for later: if this weren't just a 'gang' loop, but 'gang' plus
>> 'worker' and/or 'vector', we'd actually be fixing up user code with
>> undefined behavior into "correct" code (by *not* making 'tmp'
>> gang-private, but thread-private), right?
>
> Possibly -- coming up with a case like that might need a little
> "ingenuity"...

Still to be done.

>> > In terms of implementation, the parallelism level of a given loop is
>> > not fixed until the oaccdevlow pass in the offload compiler, so the
>> > patch delays fixing the parallelism level of variables declared on
>> > or within such loops until the same point. This is done by adding a
>> > new internal UNIQUE function (OACC_PRIVATE) that lists (the address
>> > of) each private variable as an argument, and other arguments set
>> > so as to be able to determine the correct parallelism level to use
>> > for the listed variables. This new internal function fits into the
>> > existing scheme for demarcating OpenACC loops, as described in
>> > comments in the patch.
>>
>> Yes, thanks, that's conceptually now much better than the earlier
>> variants that we had.  :-) (Hooray, again, for Nathan's OpenACC
>> execution model design!)
>>
>> What we should add, though, is a bunch of testcases to verify that the
>> expected processing does/doesn't happen for relevant source code
>> constructs.  I'm thinking that when the transformation is/isn't done,
>> that gets logged, and we can then scan the dumps accordingly.  Some of
>> that is implemented already; we should be able to do such scanning
>> generally for host compilation, too, not just offloading compilation.
>
> More test coverage is always welcome, of course.

;-) I couldn't resist -- and along the way found/fixed several issues in
the code.

>> > [snip]
>> >    tree fork_kind = build_int_cst (unsigned_type_node,
>> > IFN_UNIQUE_OACC_FORK); tree join_kind = build_int_cst
>> > (unsigned_type_node, IFN_UNIQUE_OACC_JOIN);
>> > @@ -8027,7 +8041,8 @@ lower_oacc_head_tail (location_t loc, tree
>> > clauses, &join_seq);
>> >
>> >        lower_oacc_reductions (loc, clauses, place, inner,
>> > -                       fork, join, &fork_seq, &join_seq, ctx);
>> > +                       fork, (count == 1) ? private_marker : NULL,
>> > +                       join, &fork_seq, &join_seq,  ctx);
>> >
>> >        /* Append this level to head. */
>> >        gimple_seq_add_seq (head, fork_seq);
>>
>> [TS] That looks good in principle.  Via the testing mentioned above, I
>> just want to make sure that this does all the expected things
>> regarding differently nested loops and privatization levels.
>
> Feel free to extend test coverage as you see fit...

A little bit added, but more still to be done.

I've pushed "[OpenACC privatization] Largely extend diagnostics and
corresponding testsuite coverage [PR90115]" to master branch in commit
11b8286a83289f5b54e813f14ff56d730c3f3185, see attached.  I had, of
course, developed that in several iterations, intertwined with
implementation changes, but didn't now feel like disentangling all that,
sorry.


>> > --- /dev/null
>> > +++ b/libgomp/testsuite/libgomp.oacc-fortran/gangprivate-attrib-2.f90
>>
>> [TS] With code changes as posted, this one FAILs for nvptx offloading
>> execution.  (... for all but the Nvidia Titan V GPU in my set of
>> testing configurations, huh?)
>>
>> > @@ -0,0 +1,25 @@
>> > +! Test for worker-private variables
>> > +
>> > +! { dg-do run }
>> > +! { dg-additional-options "-fdump-tree-oaccdevlow-details" }
>> > +
>> > +program main
>> > +  integer :: w, arr(0:31)
>> > +
>> > +  !$acc parallel num_gangs(32) num_workers(32) copyout(arr)
>> > +    !$acc loop gang worker private(w)
>> > +! { dg-final { scan-tree-dump-times "Decl UID \[0-9\]+ has worker partitioning:  integer\\(kind=4\\) w;" 1 "oaccdevlow" } } */
>> > +    do j = 0, 31
>> > +      w = 0
>> > +      !$acc loop seq
>> > +      do i = 0, 31
>> > +        !$acc atomic update
>> > +        w = w + 1
>> > +        !$acc end atomic
>> > +      end do
>> > +      arr(j) = w
>> > +    end do
>> > +  !$acc end parallel
>> > +
>> > +  if (any (arr .ne. 32)) stop 1
>> > +end program main
>
> Boo. I don't think I saw such a failure on the systems I tested on.
> That needs investigation (though it might be something CUDA-version or
> GPU specific, hence not directly a GCC problem? Not sure.)

That's <https://gcc.gnu.org/PR100678> "[OpenACC/nvptx]
'libgomp.oacc-c-c++-common/private-atomic-1.c' FAILs (differently) in
certain configurations", now XFAILed.


Regards
 Thomas


Thomas Schwinge May 22, 2021, 8:41 a.m. UTC | #12
Hi!

First: many thanks for running this automated regression testing
machinery!

On 2021-05-21T18:40:55-0700, "sunil.k.pandey via Gcc-patches" <gcc-patches@gcc.gnu.org> wrote:
> On Linux/x86_64,
>
> 325aa13996bafce0c4927876c315d1fa706d9881 is the first bad commit
> commit 325aa13996bafce0c4927876c315d1fa706d9881
> Author: Thomas Schwinge <thomas@codesourcery.com>
> Date:   Fri May 21 08:51:47 2021 +0200
>
>     [OpenACC privatization] Reject 'static', 'external' in blocks [PR90115]

Actually not that one, but instead one commit before is the culprit:

    commit 11b8286a83289f5b54e813f14ff56d730c3f3185
    Author: Thomas Schwinge <thomas@codesourcery.com>
    Date:   Thu May 20 16:11:37 2021 +0200

        [OpenACC privatization] Largely extend diagnostics and corresponding testsuite coverage [PR90115]

(Probably your testing aggregates commits that appear in some period of
time?  Maybe reflect that in the reporting emails?)

> caused
>
> FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable  -O0   (test for warnings, line 134)
> FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable  -O0   (test for warnings, line 98)
> FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable  -O1   (test for warnings, line 134)
> FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable  -O1   (test for warnings, line 98)
> FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable  -O2   (test for warnings, line 134)
> FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable  -O2   (test for warnings, line 98)
> FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable  -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions   (test for warnings, line 134)
> FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable  -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions   (test for warnings, line 98)
> FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable  -O3 -g   (test for warnings, line 134)
> FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable  -O3 -g   (test for warnings, line 98)
> FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable  -Os   (test for warnings, line 134)
> FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1 -DACC_MEM_SHARED=1 -foffload=disable  -Os   (test for warnings, line 98)

Sorry, and ACK, and I'm confused why I didn't see that in my own testing.
I've now pushed "[OpenACC privatization] Prune uninteresting/varying
diagnostics in 'libgomp.oacc-fortran/privatized-ref-2.f90'" to master
branch in commit 3050a1a18276d7cdd8946e34cc1344e30efb7030, see attached.


Regards
 Thomas


Sunil Pandey May 25, 2021, 1:03 a.m. UTC | #13
Hi Thomas,

I reproduced this issue manually and it turns out this is a special case.

The script takes input from https://gcc.gnu.org/pipermail/gcc-regression/ and
matches the exact error message during the triaging process. This failure was
reported on gcc-regression:
https://gcc.gnu.org/pipermail/gcc-regression/2021-May/074806.html

The reason it triaged 325aa13996bafce0c4927876c315d1fa706d9881 rather than
11b8286a83289f5b54e813f14ff56d730c3f3185 is that:

Commit 325aa13996bafce0c4927876c315d1fa706d9881 is the first commit whose
failure message matches the one reported on gcc-regression exactly. See the
difference in the line numbers.

Error message produced from commit 325aa13996bafce0c4927876c315d1fa706d9881:
FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1
-DACC_MEM_SHARED=1 -foffload=disable  -Os   (test for warnings, line 98)
FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1
-DACC_MEM_SHARED=1 -foffload=disable  -Os   (test for warnings, line 134)

vs.

Error message produced from commit 11b8286a83289f5b54e813f14ff56d730c3f3185:
FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1
-DACC_MEM_SHARED=1 -foffload=disable  -Os   (test for warnings, line 100)
FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1
-DACC_MEM_SHARED=1 -foffload=disable  -Os   (test for warnings, line 136)

Thank you so much,
Sunil Pandey


On Sat, May 22, 2021 at 1:41 AM Thomas Schwinge <thomas@codesourcery.com>
wrote:

> Hi!
>
> First: many thanks for running this automated regression testing
> machinery!
>
> On 2021-05-21T18:40:55-0700, "sunil.k.pandey via Gcc-patches" <
> gcc-patches@gcc.gnu.org> wrote:
> > On Linux/x86_64,
> >
> > 325aa13996bafce0c4927876c315d1fa706d9881 is the first bad commit
> > commit 325aa13996bafce0c4927876c315d1fa706d9881
> > Author: Thomas Schwinge <thomas@codesourcery.com>
> > Date:   Fri May 21 08:51:47 2021 +0200
> >
> >     [OpenACC privatization] Reject 'static', 'external' in blocks
> [PR90115]
>
> Actually not that one, but instead one commit before is the culprit:
>
>     commit 11b8286a83289f5b54e813f14ff56d730c3f3185
>     Author: Thomas Schwinge <thomas@codesourcery.com>
>     Date:   Thu May 20 16:11:37 2021 +0200
>
>         [OpenACC privatization] Largely extend diagnostics and
> corresponding testsuite coverage [PR90115]
>
> (Probably your testing aggregates commits that appear in some period of
> time?  Maybe reflect that in the reporting emails?)
>
> > caused
> >
> > FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1
> -DACC_MEM_SHARED=1 -foffload=disable  -O0   (test for warnings, line 134)
> > FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1
> -DACC_MEM_SHARED=1 -foffload=disable  -O0   (test for warnings, line 98)
> > FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1
> -DACC_MEM_SHARED=1 -foffload=disable  -O1   (test for warnings, line 134)
> > FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1
> -DACC_MEM_SHARED=1 -foffload=disable  -O1   (test for warnings, line 98)
> > FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1
> -DACC_MEM_SHARED=1 -foffload=disable  -O2   (test for warnings, line 134)
> > FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1
> -DACC_MEM_SHARED=1 -foffload=disable  -O2   (test for warnings, line 98)
> > FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1
> -DACC_MEM_SHARED=1 -foffload=disable  -O3 -fomit-frame-pointer
> -funroll-loops -fpeel-loops -ftracer -finline-functions   (test for
> warnings, line 134)
> > FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1
> -DACC_MEM_SHARED=1 -foffload=disable  -O3 -fomit-frame-pointer
> -funroll-loops -fpeel-loops -ftracer -finline-functions   (test for
> warnings, line 98)
> > FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1
> -DACC_MEM_SHARED=1 -foffload=disable  -O3 -g   (test for warnings, line 134)
> > FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1
> -DACC_MEM_SHARED=1 -foffload=disable  -O3 -g   (test for warnings, line 98)
> > FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1
> -DACC_MEM_SHARED=1 -foffload=disable  -Os   (test for warnings, line 134)
> > FAIL: libgomp.oacc-fortran/privatized-ref-2.f90 -DACC_DEVICE_TYPE_host=1
> -DACC_MEM_SHARED=1 -foffload=disable  -Os   (test for warnings, line 98)
>
> Sorry, and ACK, and I'm confused why I didn't see that in my own testing.
> I've now pushed "[OpenACC privatization] Prune uninteresting/varying
> diagnostics in 'libgomp.oacc-fortran/privatized-ref-2.f90'" to master
> branch in commit 3050a1a18276d7cdd8946e34cc1344e30efb7030, see attached.
>
>
> Regards
>  Thomas
>
>

Patch

diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 062785af1e2..94927ea7b2b 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -6227,6 +6227,32 @@  like @code{cond_add@var{m}}.  The default implementation returns a zero
 constant of type @var{type}.
 @end deftypefn
 
+@deftypefn {Target Hook} rtx TARGET_GOACC_EXPAND_VAR_DECL (tree @var{var})
+This hook, if defined, is used by accelerator target back-ends to expand
+specially handled kinds of @code{VAR_DECL} expressions.  A particular use is
+to place variables with specific attributes inside special accelarator
+memories.  A return value of @code{NULL} indicates that the target does not
+handle this @code{VAR_DECL}, and normal RTL expanding is resumed.
+
+Only define this hook if your accelerator target needs to expand certain
+@code{VAR_DECL} nodes in a way that differs from the default.  You can also adjust
+private variables at OpenACC device-lowering time using the
+@code{TARGET_GOACC_ADJUST_PRIVATE_DECL} target hook.
+@end deftypefn
+
+@deftypefn {Target Hook} tree TARGET_GOACC_ADJUST_PRIVATE_DECL (tree @var{var}, int @var{level})
+This hook, if defined, is used by accelerator target back-ends to adjust
+OpenACC variable declarations that should be made private to the given
+parallelism level (i.e. @code{GOMP_DIM_GANG}, @code{GOMP_DIM_WORKER} or
+@code{GOMP_DIM_VECTOR}).  A typical use for this hook is to force variable
+declarations at the @code{gang} level to reside in GPU shared memory, by
+setting the address space of the decl and making it static.
+
+You may also use the @code{TARGET_GOACC_EXPAND_VAR_DECL} hook if the
+adjusted variable declaration needs to be expanded to RTL in a non-standard
+way.
+@end deftypefn
+
 @node Anchored Addresses
 @section Anchored Addresses
 @cindex anchored addresses
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 3b19e6f4281..b8c23cf6db5 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -4219,6 +4219,10 @@  address;  but often a machine-dependent strategy can generate better code.
 
 @hook TARGET_PREFERRED_ELSE_VALUE
 
+@hook TARGET_GOACC_EXPAND_VAR_DECL
+
+@hook TARGET_GOACC_ADJUST_PRIVATE_DECL
+
 @node Anchored Addresses
 @section Anchored Addresses
 @cindex anchored addresses
diff --git a/gcc/expr.c b/gcc/expr.c
index 86dc1b6c973..349825cf286 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -10224,8 +10224,19 @@  expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
       exp = SSA_NAME_VAR (ssa_name);
       goto expand_decl_rtl;
 
-    case PARM_DECL:
     case VAR_DECL:
+      /* Allow accel compiler to handle variables that require special
+	 treatment, e.g. if they have been modified in some way earlier in
+	 compilation by the adjust_private_decl OpenACC hook.  */
+      if (flag_openacc && targetm.goacc.expand_var_decl)
+	{
+	  temp = targetm.goacc.expand_var_decl (exp);
+	  if (temp)
+	    return temp;
+	}
+      /* ... fall through ...  */
+
+    case PARM_DECL:
       /* If a static var's type was incomplete when the decl was written,
 	 but the type is complete now, lay out the decl now.  */
       if (DECL_SIZE (exp) == 0
diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c
index dd7173126fb..e6611e8572f 100644
--- a/gcc/internal-fn.c
+++ b/gcc/internal-fn.c
@@ -2957,6 +2957,8 @@  expand_UNIQUE (internal_fn, gcall *stmt)
       else
 	gcc_unreachable ();
       break;
+    case IFN_UNIQUE_OACC_PRIVATE:
+      break;
     }
 
   if (pattern)
diff --git a/gcc/internal-fn.h b/gcc/internal-fn.h
index c6599ce4894..9004840e0f5 100644
--- a/gcc/internal-fn.h
+++ b/gcc/internal-fn.h
@@ -36,7 +36,8 @@  along with GCC; see the file COPYING3.  If not see
 #define IFN_UNIQUE_CODES				  \
   DEF(UNSPEC),	\
     DEF(OACC_FORK), DEF(OACC_JOIN),		\
-    DEF(OACC_HEAD_MARK), DEF(OACC_TAIL_MARK)
+    DEF(OACC_HEAD_MARK), DEF(OACC_TAIL_MARK),	\
+    DEF(OACC_PRIVATE)
 
 enum ifn_unique_kind {
 #define DEF(X) IFN_UNIQUE_##X
diff --git a/gcc/omp-low.c b/gcc/omp-low.c
index df5b6cec586..fd8025e0e3f 100644
--- a/gcc/omp-low.c
+++ b/gcc/omp-low.c
@@ -171,6 +171,9 @@  struct omp_context
 
   /* True if there is bind clause on the construct (i.e. a loop construct).  */
   bool loop_p;
+
+  /* Addressable variable decls in this context.  */
+  vec<tree> oacc_addressable_var_decls;
 };
 
 static splay_tree all_contexts;
@@ -7048,8 +7051,9 @@  lower_lastprivate_clauses (tree clauses, tree predicate, gimple_seq *body_p,
 
 static void
 lower_oacc_reductions (location_t loc, tree clauses, tree level, bool inner,
-		       gcall *fork, gcall *join, gimple_seq *fork_seq,
-		       gimple_seq *join_seq, omp_context *ctx)
+		       gcall *fork, gcall *private_marker, gcall *join,
+		       gimple_seq *fork_seq, gimple_seq *join_seq,
+		       omp_context *ctx)
 {
   gimple_seq before_fork = NULL;
   gimple_seq after_fork = NULL;
@@ -7253,6 +7257,8 @@  lower_oacc_reductions (location_t loc, tree clauses, tree level, bool inner,
 
   /* Now stitch things together.  */
   gimple_seq_add_seq (fork_seq, before_fork);
+  if (private_marker)
+    gimple_seq_add_stmt (fork_seq, private_marker);
   if (fork)
     gimple_seq_add_stmt (fork_seq, fork);
   gimple_seq_add_seq (fork_seq, after_fork);
@@ -7989,7 +7995,7 @@  lower_oacc_loop_marker (location_t loc, tree ddvar, bool head,
    HEAD and TAIL.  */
 
 static void
-lower_oacc_head_tail (location_t loc, tree clauses,
+lower_oacc_head_tail (location_t loc, tree clauses, gcall *private_marker,
 		      gimple_seq *head, gimple_seq *tail, omp_context *ctx)
 {
   bool inner = false;
@@ -7997,6 +8003,14 @@  lower_oacc_head_tail (location_t loc, tree clauses,
   gimple_seq_add_stmt (head, gimple_build_assign (ddvar, integer_zero_node));
 
   unsigned count = lower_oacc_head_mark (loc, ddvar, clauses, head, ctx);
+
+  if (private_marker)
+    {
+      gimple_set_location (private_marker, loc);
+      gimple_call_set_lhs (private_marker, ddvar);
+      gimple_call_set_arg (private_marker, 1, ddvar);
+    }
+
   tree fork_kind = build_int_cst (unsigned_type_node, IFN_UNIQUE_OACC_FORK);
   tree join_kind = build_int_cst (unsigned_type_node, IFN_UNIQUE_OACC_JOIN);
 
@@ -8027,7 +8041,8 @@  lower_oacc_head_tail (location_t loc, tree clauses,
 			      &join_seq);
 
       lower_oacc_reductions (loc, clauses, place, inner,
-			     fork, join, &fork_seq, &join_seq,  ctx);
+			     fork, (count == 1) ? private_marker : NULL,
+			     join, &fork_seq, &join_seq,  ctx);
 
       /* Append this level to head. */
       gimple_seq_add_seq (head, fork_seq);
@@ -9992,6 +10007,32 @@  lower_omp_for_lastprivate (struct omp_for_data *fd, gimple_seq *body_p,
     }
 }
 
+/* Record vars listed in private clauses in CLAUSES in CTX.  This information
+   is used to mark up variables that should be made private per-gang.  */
+
+static void
+oacc_record_private_var_clauses (omp_context *ctx, tree clauses)
+{
+  for (tree c = clauses; c; c = OMP_CLAUSE_CHAIN (c))
+    if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_PRIVATE)
+      {
+	tree decl = OMP_CLAUSE_DECL (c);
+	if (VAR_P (decl) && TREE_ADDRESSABLE (decl))
+	  ctx->oacc_addressable_var_decls.safe_push (decl);
+      }
+}
+
+/* Record addressable vars declared in BINDVARS in CTX.  This information is
+   used to mark up variables that should be made private per-gang.  */
+
+static void
+oacc_record_vars_in_bind (omp_context *ctx, tree bindvars)
+{
+  for (tree v = bindvars; v; v = DECL_CHAIN (v))
+    if (VAR_P (v) && TREE_ADDRESSABLE (v))
+      ctx->oacc_addressable_var_decls.safe_push (v);
+}
+
 /* Callback for walk_gimple_seq.  Find #pragma omp scan statement.  */
 
 static tree
@@ -10821,6 +10862,57 @@  lower_omp_for_scan (gimple_seq *body_p, gimple_seq *dlist, gomp_for *stmt,
   *dlist = new_dlist;
 }
 
+/* Build an internal UNIQUE function with type IFN_UNIQUE_OACC_PRIVATE listing
+   the addresses of variables that should be made private at the surrounding
+   parallelism level.  Such functions appear in the gimple code stream in two
+   forms, e.g. for a partitioned loop:
+
+      .data_dep.6 = .UNIQUE (OACC_HEAD_MARK, .data_dep.6, 1, 68);
+      .data_dep.6 = .UNIQUE (OACC_PRIVATE, .data_dep.6, -1, &w);
+      .data_dep.6 = .UNIQUE (OACC_FORK, .data_dep.6, -1);
+      .data_dep.6 = .UNIQUE (OACC_HEAD_MARK, .data_dep.6);
+
+   or alternatively, OACC_PRIVATE can appear at the top level of a parallel,
+   not as part of a HEAD_MARK sequence:
+
+      .UNIQUE (OACC_PRIVATE, 0, 0, &w);
+
+   For such stand-alone appearances, the 3rd argument is always 0, denoting
+   gang partitioning.  */
+
+static gcall *
+make_oacc_private_marker (omp_context *ctx)
+{
+  int i;
+  tree decl;
+
+  if (ctx->oacc_addressable_var_decls.length () == 0)
+    return NULL;
+
+  auto_vec<tree, 5> args;
+
+  args.quick_push (build_int_cst (integer_type_node, IFN_UNIQUE_OACC_PRIVATE));
+  args.quick_push (integer_zero_node);
+  args.quick_push (integer_minus_one_node);
+
+  FOR_EACH_VEC_ELT (ctx->oacc_addressable_var_decls, i, decl)
+    {
+      for (omp_context *thisctx = ctx; thisctx; thisctx = thisctx->outer)
+	{
+	  tree inner_decl = maybe_lookup_decl (decl, thisctx);
+	  if (inner_decl)
+	    {
+	      decl = inner_decl;
+	      break;
+	    }
+	}
+      tree addr = build_fold_addr_expr (decl);
+      args.safe_push (addr);
+    }
+
+  return gimple_build_call_internal_vec (IFN_UNIQUE, args);
+}
+
 /* Lower code for an OMP loop directive.  */
 
 static void
@@ -10837,6 +10929,8 @@  lower_omp_for (gimple_stmt_iterator *gsi_p, omp_context *ctx)
 
   push_gimplify_context ();
 
+  oacc_record_private_var_clauses (ctx, gimple_omp_for_clauses (stmt));
+
   lower_omp (gimple_omp_for_pre_body_ptr (stmt), ctx);
 
   block = make_node (BLOCK);
@@ -10855,6 +10949,8 @@  lower_omp_for (gimple_stmt_iterator *gsi_p, omp_context *ctx)
       gbind *inner_bind
 	= as_a <gbind *> (gimple_seq_first_stmt (omp_for_body));
       tree vars = gimple_bind_vars (inner_bind);
+      if (is_gimple_omp_oacc (ctx->stmt))
+	oacc_record_vars_in_bind (ctx, vars);
       gimple_bind_append_vars (new_stmt, vars);
       /* bind_vars/BLOCK_VARS are being moved to new_stmt/block, don't
 	 keep them on the inner_bind and it's block.  */
@@ -10968,6 +11064,11 @@  lower_omp_for (gimple_stmt_iterator *gsi_p, omp_context *ctx)
 
   lower_omp (gimple_omp_body_ptr (stmt), ctx);
 
+  gcall *private_marker = NULL;
+  if (is_gimple_omp_oacc (ctx->stmt)
+      && !gimple_seq_empty_p (omp_for_body))
+    private_marker = make_oacc_private_marker (ctx);
+
   /* Lower the header expressions.  At this point, we can assume that
      the header is of the form:
 
@@ -11022,7 +11123,7 @@  lower_omp_for (gimple_stmt_iterator *gsi_p, omp_context *ctx)
   if (is_gimple_omp_oacc (ctx->stmt)
       && !ctx_in_oacc_kernels_region (ctx))
     lower_oacc_head_tail (gimple_location (stmt),
-			  gimple_omp_for_clauses (stmt),
+			  gimple_omp_for_clauses (stmt), private_marker,
 			  &oacc_head, &oacc_tail, ctx);
 
   /* Add OpenACC partitioning and reduction markers just before the loop.  */
@@ -13019,8 +13120,14 @@  lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx)
 	     them as a dummy GANG loop.  */
 	  tree level = build_int_cst (integer_type_node, GOMP_DIM_GANG);
 
+	  gcall *private_marker = make_oacc_private_marker (ctx);
+
+	  if (private_marker)
+	    gimple_call_set_arg (private_marker, 2, level);
+
 	  lower_oacc_reductions (gimple_location (ctx->stmt), clauses, level,
-				 false, NULL, NULL, &fork_seq, &join_seq, ctx);
+				 false, NULL, private_marker, NULL, &fork_seq,
+				 &join_seq, ctx);
 	}
 
       gimple_seq_add_seq (&new_body, fork_seq);
@@ -13262,6 +13369,9 @@  lower_omp_1 (gimple_stmt_iterator *gsi_p, omp_context *ctx)
 		 ctx);
       break;
     case GIMPLE_BIND:
+      if (ctx && is_gimple_omp_oacc (ctx->stmt))
+	oacc_record_vars_in_bind (ctx,
+				  gimple_bind_vars (as_a <gbind *> (stmt)));
       lower_omp (gimple_bind_body_ptr (as_a <gbind *> (stmt)), ctx);
       maybe_remove_omp_member_access_dummy_vars (as_a <gbind *> (stmt));
       break;
diff --git a/gcc/omp-offload.c b/gcc/omp-offload.c
index 57be342da97..b3f543b597a 100644
--- a/gcc/omp-offload.c
+++ b/gcc/omp-offload.c
@@ -53,6 +53,7 @@  along with GCC; see the file COPYING3.  If not see
 #include "attribs.h"
 #include "cfgloop.h"
 #include "context.h"
+#include "convert.h"
 
 /* Describe the OpenACC looping structure of a function.  The entire
    function is held in a 'NULL' loop.  */
@@ -1356,7 +1357,9 @@  oacc_loop_xform_head_tail (gcall *from, int level)
 	    = ((enum ifn_unique_kind)
 	       TREE_INT_CST_LOW (gimple_call_arg (stmt, 0)));
 
-	  if (k == IFN_UNIQUE_OACC_FORK || k == IFN_UNIQUE_OACC_JOIN)
+	  if (k == IFN_UNIQUE_OACC_FORK
+	      || k == IFN_UNIQUE_OACC_JOIN
+	      || k == IFN_UNIQUE_OACC_PRIVATE)
 	    *gimple_call_arg_ptr (stmt, 2) = replacement;
 	  else if (k == kind && stmt != from)
 	    break;
@@ -1773,6 +1776,136 @@  default_goacc_reduction (gcall *call)
   gsi_replace_with_seq (&gsi, seq, true);
 }
 
+struct var_decl_rewrite_info
+{
+  gimple *stmt;
+  hash_map<tree, tree> *adjusted_vars;
+  bool avoid_pointer_conversion;
+  bool modified;
+};
+
+/* Helper function for execute_oacc_device_lower.  Rewrite VAR_DECLs (by
+   themselves or wrapped in various other nodes) according to ADJUSTED_VARS in
+   the var_decl_rewrite_info pointed to via DATA.  Used as part of coercing
+   gang-private variables in OpenACC offload regions to reside in GPU shared
+   memory.  */
+
+static tree
+oacc_rewrite_var_decl (tree *tp, int *walk_subtrees, void *data)
+{
+  walk_stmt_info *wi = (walk_stmt_info *) data;
+  var_decl_rewrite_info *info = (var_decl_rewrite_info *) wi->info;
+
+  if (TREE_CODE (*tp) == ADDR_EXPR)
+    {
+      tree arg = TREE_OPERAND (*tp, 0);
+      tree *new_arg = info->adjusted_vars->get (arg);
+
+      if (new_arg)
+	{
+	  if (info->avoid_pointer_conversion)
+	    {
+	      *tp = build_fold_addr_expr (*new_arg);
+	      info->modified = true;
+	      *walk_subtrees = 0;
+	    }
+	  else
+	    {
+	      gimple_stmt_iterator gsi = gsi_for_stmt (info->stmt);
+	      tree repl = build_fold_addr_expr (*new_arg);
+	      gimple *stmt1
+		= gimple_build_assign (make_ssa_name (TREE_TYPE (repl)), repl);
+	      tree conv = convert_to_pointer (TREE_TYPE (*tp),
+					      gimple_assign_lhs (stmt1));
+	      gimple *stmt2
+		= gimple_build_assign (make_ssa_name (TREE_TYPE (*tp)), conv);
+	      gsi_insert_before (&gsi, stmt1, GSI_SAME_STMT);
+	      gsi_insert_before (&gsi, stmt2, GSI_SAME_STMT);
+	      *tp = gimple_assign_lhs (stmt2);
+	      info->modified = true;
+	      *walk_subtrees = 0;
+	    }
+	}
+    }
+  else if (TREE_CODE (*tp) == COMPONENT_REF || TREE_CODE (*tp) == ARRAY_REF)
+    {
+      tree *base = &TREE_OPERAND (*tp, 0);
+
+      while (TREE_CODE (*base) == COMPONENT_REF
+	     || TREE_CODE (*base) == ARRAY_REF)
+	base = &TREE_OPERAND (*base, 0);
+
+      if (TREE_CODE (*base) != VAR_DECL)
+	return NULL;
+
+      tree *new_decl = info->adjusted_vars->get (*base);
+      if (!new_decl)
+	return NULL;
+
+      int base_quals = TYPE_QUALS (TREE_TYPE (*new_decl));
+      tree field = TREE_OPERAND (*tp, 1);
+
+      /* Adjust the type of the field.  */
+      int field_quals = TYPE_QUALS (TREE_TYPE (field));
+      if (TREE_CODE (field) == FIELD_DECL && field_quals != base_quals)
+	{
+	  tree *field_type = &TREE_TYPE (field);
+	  while (TREE_CODE (*field_type) == ARRAY_TYPE)
+	    field_type = &TREE_TYPE (*field_type);
+	  field_quals |= base_quals;
+	  *field_type = build_qualified_type (*field_type, field_quals);
+	}
+
+      /* Adjust the type of the component ref itself.  */
+      tree comp_type = TREE_TYPE (*tp);
+      int comp_quals = TYPE_QUALS (comp_type);
+      if (TREE_CODE (*tp) == COMPONENT_REF && comp_quals != base_quals)
+	{
+	  comp_quals |= base_quals;
+	  TREE_TYPE (*tp)
+	    = build_qualified_type (comp_type, comp_quals);
+	}
+
+      *base = *new_decl;
+      info->modified = true;
+    }
+  else if (TREE_CODE (*tp) == VAR_DECL)
+    {
+      tree *new_decl = info->adjusted_vars->get (*tp);
+      if (new_decl)
+	{
+	  *tp = *new_decl;
+	  info->modified = true;
+	}
+    }
+
+  return NULL_TREE;
+}
+
+/* Return TRUE if CALL is a call to a builtin atomic/sync operation.  */
+
+static bool
+is_sync_builtin_call (gcall *call)
+{
+  tree callee = gimple_call_fndecl (call);
+
+  if (callee != NULL_TREE
+      && gimple_call_builtin_p (call, BUILT_IN_NORMAL))
+    switch (DECL_FUNCTION_CODE (callee))
+      {
+#undef DEF_SYNC_BUILTIN
+#define DEF_SYNC_BUILTIN(ENUM, NAME, TYPE, ATTRS) case ENUM:
+#include "sync-builtins.def"
+#undef DEF_SYNC_BUILTIN
+	return true;
+
+      default:
+	;
+      }
+
+  return false;
+}
+
 /* Main entry point for oacc transformations which run on the device
    compiler after LTO, so we know what the target device is at this
    point (including the host fallback).  */
@@ -1922,6 +2055,8 @@  execute_oacc_device_lower ()
      dominance information to update SSA.  */
   calculate_dominance_info (CDI_DOMINATORS);
 
+  hash_map<tree, tree> adjusted_vars;
+
   /* Now lower internal loop functions to target-specific code
      sequences.  */
   basic_block bb;
@@ -1998,6 +2133,45 @@  execute_oacc_device_lower ()
 		case IFN_UNIQUE_OACC_TAIL_MARK:
 		  remove = true;
 		  break;
+
+		case IFN_UNIQUE_OACC_PRIVATE:
+		  {
+		    HOST_WIDE_INT level
+		      = TREE_INT_CST_LOW (gimple_call_arg (call, 2));
+		    if (level == -1)
+		      break;
+		    for (unsigned i = 3;
+			 i < gimple_call_num_args (call);
+			 i++)
+		      {
+			tree arg = gimple_call_arg (call, i);
+			gcc_assert (TREE_CODE (arg) == ADDR_EXPR);
+			tree decl = TREE_OPERAND (arg, 0);
+			if (dump_file && (dump_flags & TDF_DETAILS))
+			  {
+			    static char const *const axes[] =
+			      /* Must be kept in sync with GOMP_DIM
+				 enumeration.  */
+			      { "gang", "worker", "vector" };
+			    fprintf (dump_file, "Decl UID %u has %s "
+				     "partitioning:", DECL_UID (decl),
+				     axes[level]);
+			    print_generic_decl (dump_file, decl, TDF_SLIM);
+			    fputc ('\n', dump_file);
+			  }
+			if (targetm.goacc.adjust_private_decl)
+			  {
+			    tree oldtype = TREE_TYPE (decl);
+			    tree newdecl
+			      = targetm.goacc.adjust_private_decl (decl, level);
+			    if (TREE_TYPE (newdecl) != oldtype
+				|| newdecl != decl)
+			      adjusted_vars.put (decl, newdecl);
+			  }
+		      }
+		    remove = true;
+		  }
+		  break;
 		}
 	      break;
 	    }
@@ -2029,6 +2203,55 @@  execute_oacc_device_lower ()
 	  gsi_next (&gsi);
       }
 
+  /* Make adjustments to gang-private local variables if required by the
+     target, e.g. forcing them into a particular address space.  Afterwards,
+     ADDR_EXPR nodes which have adjusted variables as their argument need to
+     be modified in one of two ways:
+
+       1. They can be recreated, making a pointer to the variable in the new
+	  address space, or
+
+       2. The address of the variable in the new address space can be taken,
+	  converted to the default (original) address space, and the result of
+	  that conversion substituted in place of the original ADDR_EXPR node.
+
+     Which of these is done depends on the gimple statement being processed.
+     At present atomic operations and inline asms use (1), and everything else
+     uses (2).  At least on AMD GCN, there are atomic operations that work
+     directly in the LDS address space.
+
+     COMPONENT_REFS, ARRAY_REFS and plain VAR_DECLs are also rewritten to use
+     the new decl, adjusting types of appropriate tree nodes as necessary.  */
+
+  if (targetm.goacc.adjust_private_decl)
+    {
+      FOR_ALL_BB_FN (bb, cfun)
+	for (gimple_stmt_iterator gsi = gsi_start_bb (bb);
+	     !gsi_end_p (gsi);
+	     gsi_next (&gsi))
+	  {
+	    gimple *stmt = gsi_stmt (gsi);
+	    walk_stmt_info wi;
+	    var_decl_rewrite_info info;
+
+	    info.avoid_pointer_conversion
+	      = (is_gimple_call (stmt)
+		 && is_sync_builtin_call (as_a <gcall *> (stmt)))
+		|| gimple_code (stmt) == GIMPLE_ASM;
+	    info.stmt = stmt;
+	    info.modified = false;
+	    info.adjusted_vars = &adjusted_vars;
+
+	    memset (&wi, 0, sizeof (wi));
+	    wi.info = &info;
+
+	    walk_gimple_op (stmt, oacc_rewrite_var_decl, &wi);
+
+	    if (info.modified)
+	      update_stmt (stmt);
+	  }
+    }
+
   free_oacc_loop (loops);
 
   return 0;
diff --git a/gcc/target.def b/gcc/target.def
index be7fcde961a..00b6f8f1bc9 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -1712,6 +1712,36 @@  for allocating any storage for reductions when necessary.",
 void, (gcall *call),
 default_goacc_reduction)
 
+DEFHOOK
+(expand_var_decl,
+"This hook, if defined, is used by accelerator target back-ends to expand\n\
+specially handled kinds of @code{VAR_DECL} expressions.  A particular use is\n\
+to place variables with specific attributes inside special accelerator\n\
+memories.  A return value of @code{NULL} indicates that the target does not\n\
+handle this @code{VAR_DECL}, and normal RTL expanding is resumed.\n\
+\n\
+Only define this hook if your accelerator target needs to expand certain\n\
+@code{VAR_DECL} nodes in a way that differs from the default.  You can also adjust\n\
+private variables at OpenACC device-lowering time using the\n\
+@code{TARGET_GOACC_ADJUST_PRIVATE_DECL} target hook.",
+rtx, (tree var),
+NULL)
+
+DEFHOOK
+(adjust_private_decl,
+"This hook, if defined, is used by accelerator target back-ends to adjust\n\
+OpenACC variable declarations that should be made private to the given\n\
+parallelism level (i.e. @code{GOMP_DIM_GANG}, @code{GOMP_DIM_WORKER} or\n\
+@code{GOMP_DIM_VECTOR}).  A typical use for this hook is to force variable\n\
+declarations at the @code{gang} level to reside in GPU shared memory, by\n\
+setting the address space of the decl and making it static.\n\
+\n\
+You may also use the @code{TARGET_GOACC_EXPAND_VAR_DECL} hook if the\n\
+adjusted variable declaration needs to be expanded to RTL in a non-standard\n\
+way.",
+tree, (tree var, int level),
+NULL)
+
 HOOK_VECTOR_END (goacc)
 
 /* Functions relating to vectorization.  */
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/gang-private-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/gang-private-1.c
new file mode 100644
index 00000000000..28222c25da3
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/gang-private-1.c
@@ -0,0 +1,38 @@ 
+#include <assert.h>
+
+int main (void)
+{
+  int ret;
+
+  #pragma acc parallel num_gangs(1) num_workers(32) copyout(ret)
+  {
+    int w = 0;
+
+    #pragma acc loop worker
+    for (int i = 0; i < 32; i++)
+      {
+	#pragma acc atomic update
+	w++;
+      }
+
+    ret = (w == 32);
+  }
+  assert (ret);
+
+  #pragma acc parallel num_gangs(1) vector_length(32) copyout(ret)
+  {
+    int v = 0;
+
+    #pragma acc loop vector
+    for (int i = 0; i < 32; i++)
+      {
+	#pragma acc atomic update
+	v++;
+      }
+
+    ret = (v == 32);
+  }
+  assert (ret);
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-gwv-2.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-gwv-2.c
new file mode 100644
index 00000000000..a4f81a39e24
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/loop-gwv-2.c
@@ -0,0 +1,95 @@ 
+#include <stdio.h>
+#include <openacc.h>
+#include <alloca.h>
+#include <string.h>
+#include <gomp-constants.h>
+#include <stdlib.h>
+
+#if 0
+#define DEBUG(DIM, IDX, VAL) \
+  fprintf (stderr, "%sdist[%d] = %d\n", (DIM), (IDX), (VAL))
+#else
+#define DEBUG(DIM, IDX, VAL)
+#endif
+
+#define N (32*32*32)
+
+int
+check (const char *dim, int *dist, int dimsize)
+{
+  int ix;
+  int exit = 0;
+
+  for (ix = 0; ix < dimsize; ix++)
+    {
+      DEBUG(dim, ix, dist[ix]);
+      if (dist[ix] < (N) / (dimsize + 0.5)
+	  || dist[ix] > (N) / (dimsize - 0.5))
+	{
+	  fprintf (stderr, "did not distribute to %ss (%d not between %d "
+		   "and %d)\n", dim, dist[ix], (int) ((N) / (dimsize + 0.5)),
+		   (int) ((N) / (dimsize - 0.5)));
+	  exit |= 1;
+	}
+    }
+
+  return exit;
+}
+
+int main ()
+{
+  int ary[N];
+  int ix;
+  int exit = 0;
+  int gangsize = 0, workersize = 0, vectorsize = 0;
+  int *gangdist, *workerdist, *vectordist;
+
+  for (ix = 0; ix < N; ix++)
+    ary[ix] = -1;
+
+#pragma acc parallel num_gangs(32) num_workers(32) vector_length(32) \
+	    copy(ary) copyout(gangsize, workersize, vectorsize)
+  {
+#pragma acc loop gang worker vector
+    for (unsigned ix = 0; ix < N; ix++)
+      {
+	int g, w, v;
+
+	g = __builtin_goacc_parlevel_id (GOMP_DIM_GANG);
+	w = __builtin_goacc_parlevel_id (GOMP_DIM_WORKER);
+	v = __builtin_goacc_parlevel_id (GOMP_DIM_VECTOR);
+
+	ary[ix] = (g << 16) | (w << 8) | v;
+      }
+
+    gangsize = __builtin_goacc_parlevel_size (GOMP_DIM_GANG);
+    workersize = __builtin_goacc_parlevel_size (GOMP_DIM_WORKER);
+    vectorsize = __builtin_goacc_parlevel_size (GOMP_DIM_VECTOR);
+  }
+
+  gangdist = (int *) alloca (gangsize * sizeof (int));
+  workerdist = (int *) alloca (workersize * sizeof (int));
+  vectordist = (int *) alloca (vectorsize * sizeof (int));
+  memset (gangdist, 0, gangsize * sizeof (int));
+  memset (workerdist, 0, workersize * sizeof (int));
+  memset (vectordist, 0, vectorsize * sizeof (int));
+
+  /* Test that work is shared approximately equally amongst each active
+     gang/worker/vector.  */
+  for (ix = 0; ix < N; ix++)
+    {
+      int g = (ary[ix] >> 16) & 255;
+      int w = (ary[ix] >> 8) & 255;
+      int v = ary[ix] & 255;
+
+      gangdist[g]++;
+      workerdist[w]++;
+      vectordist[v]++;
+    }
+
+  exit = check ("gang", gangdist, gangsize);
+  exit |= check ("worker", workerdist, workersize);
+  exit |= check ("vector", vectordist, vectorsize);
+
+  return exit;
+}
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/gangprivate-attrib-1.f90 b/libgomp/testsuite/libgomp.oacc-fortran/gangprivate-attrib-1.f90
new file mode 100644
index 00000000000..f330f7de1be
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/gangprivate-attrib-1.f90
@@ -0,0 +1,25 @@ 
+! Test for "oacc gangprivate" attribute on gang-private variables
+
+! { dg-do run }
+! { dg-additional-options "-fdump-tree-oaccdevlow-details -w" }
+
+program main
+  integer :: w, arr(0:31)
+
+  !$acc parallel num_gangs(32) num_workers(32) copyout(arr)
+    !$acc loop gang private(w)
! { dg-final { scan-tree-dump-times "Decl UID \[0-9\]+ has gang partitioning:  integer\\(kind=4\\) w;" 1 "oaccdevlow" } }
+    do j = 0, 31
+      w = 0
+      !$acc loop seq
+      do i = 0, 31
+        !$acc atomic update
+        w = w + 1
+        !$acc end atomic
+      end do
+      arr(j) = w
+    end do
+  !$acc end parallel
+
+  if (any (arr .ne. 32)) stop 1
+end program main
diff --git a/libgomp/testsuite/libgomp.oacc-fortran/gangprivate-attrib-2.f90 b/libgomp/testsuite/libgomp.oacc-fortran/gangprivate-attrib-2.f90
new file mode 100644
index 00000000000..f4e67b0c708
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-fortran/gangprivate-attrib-2.f90
@@ -0,0 +1,25 @@ 
+! Test for worker-private variables
+
+! { dg-do run }
+! { dg-additional-options "-fdump-tree-oaccdevlow-details" }
+
+program main
+  integer :: w, arr(0:31)
+
+  !$acc parallel num_gangs(32) num_workers(32) copyout(arr)
+    !$acc loop gang worker private(w)
! { dg-final { scan-tree-dump-times "Decl UID \[0-9\]+ has worker partitioning:  integer\\(kind=4\\) w;" 1 "oaccdevlow" } }
+    do j = 0, 31
+      w = 0
+      !$acc loop seq
+      do i = 0, 31
+        !$acc atomic update
+        w = w + 1
+        !$acc end atomic
+      end do
+      arr(j) = w
+    end do
+  !$acc end parallel
+
+  if (any (arr .ne. 32)) stop 1
+end program main