[0/4] openacc: Worker partitioning in the middle end

Message ID cover.1614685766.git.julian@codesourcery.com

Julian Brown March 2, 2021, 12:20 p.m. UTC
This series contains updated parts of the patch series that was previously
sent upstream in November 2019:

  https://gcc.gnu.org/pipermail/gcc-patches/2019-November/534547.html

The purpose of the series is to enable multiple workers for OpenACC
(workers being one of the dimensions of parallelism supported by the
standard) on targets such as AMD GCN. (NVPTX uses its own scheme for
supporting multiple workers, implemented mostly in the backend.)
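For readers less familiar with the worker dimension, here is a minimal
sketch of the kind of user code this affects. It is an illustrative
example only, not taken from the patches or the testsuite: the function
name row_sums and the num_workers(4) value are assumptions chosen for the
example. The outer loop is partitioned across gangs and workers, and the
inner reduction across vector lanes.

```c
#include <stddef.h>

/* Hypothetical example, not part of the patch series: sum each row of a
   row-major ROWS x COLS matrix M into OUT.  Rows are distributed across
   gangs and, within each gang, across workers; the inner reduction runs
   across vector lanes.  The num_workers value is arbitrary.  */
void
row_sums (const float *m, float *out, size_t rows, size_t cols)
{
#pragma acc parallel loop gang worker num_workers(4) \
  copyin(m[0:rows * cols]) copyout(out[0:rows])
  for (size_t i = 0; i < rows; i++)
    {
      float sum = 0.0f;
#pragma acc loop vector reduction(+:sum)
      for (size_t j = 0; j < cols; j++)
        sum += m[i * cols + j];
      out[i] = sum;
    }
}
```

Built without -fopenacc, the pragmas are ignored and the function runs
serially on the host, which is a convenient way to check results before
offloading.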

Tested with offloading to AMD GCN and (separately) to NVPTX.

Further commentary is provided alongside individual patches. I'm posting
these patches for review now, but I don't expect to commit them until
stage 1.

Thanks,

Julian

Julian Brown (4):
  openacc: Middle-end worker-partitioning support
  openacc: Fix async bugs in several OpenACC test cases
  amdgcn: Enable OpenACC worker partitioning for AMD GCN
  openacc: Reference-typed reduction and private variable rewriting

 gcc/Makefile.in                               |    1 +
 gcc/config/gcn/gcn-protos.h                   |    2 +-
 gcc/config/gcn/gcn-tree.c                     |    6 +-
 gcc/config/gcn/gcn.c                          |   23 +-
 gcc/config/gcn/gcn.opt                        |    5 -
 gcc/doc/tm.texi                               |   10 +
 gcc/doc/tm.texi.in                            |    4 +
 gcc/gimplify.c                                |  117 ++
 gcc/oacc-neuter-bcast.c                       | 1471 +++++++++++++++++
 gcc/oacc-neuter-bcast.h                       |   26 +
 gcc/omp-builtins.def                          |    8 +
 gcc/omp-low.c                                 |   47 +-
 gcc/omp-offload.c                             |  159 +-
 gcc/omp-offload.h                             |    1 +
 gcc/passes.def                                |    2 +
 gcc/target.def                                |   13 +
 gcc/targhooks.h                               |    1 +
 .../goacc/classify-kernels-unparallelized.c   |    8 +-
 .../c-c++-common/goacc/classify-kernels.c     |    8 +-
 .../c-c++-common/goacc/classify-parallel.c    |    8 +-
 .../c-c++-common/goacc/classify-routine.c     |    8 +-
 .../c-c++-common/goacc/classify-serial.c      |    8 +-
 .../gcc.dg/goacc/loop-processing-1.c          |    2 +-
 .../goacc/classify-kernels-unparallelized.f95 |    8 +-
 .../gfortran.dg/goacc/classify-kernels.f95    |    8 +-
 .../gfortran.dg/goacc/classify-parallel.f95   |    8 +-
 .../gfortran.dg/goacc/classify-routine.f95    |    8 +-
 .../gfortran.dg/goacc/classify-serial.f95     |    8 +-
 gcc/tree-core.h                               |    4 +-
 gcc/tree-pass.h                               |    2 +
 gcc/tree.c                                    |   11 +-
 gcc/tree.h                                    |    2 +
 libgomp/plugin/plugin-gcn.c                   |    4 +-
 .../libgomp.oacc-c++/privatized-ref-2.C       |   64 +
 .../libgomp.oacc-c++/privatized-ref-3.C       |   64 +
 .../libgomp.oacc-c-c++-common/deep-copy-10.c  |   14 +-
 .../loop-dim-default.c                        |   11 +-
 .../libgomp.oacc-c-c++-common/parallel-dims.c |   13 +-
 .../libgomp.oacc-fortran/lib-16-2.f90         |    5 +
 .../testsuite/libgomp.oacc-fortran/lib-16.f90 |    5 +
 .../libgomp.oacc-fortran/parallel-dims-aux.c  |    9 +-
 .../libgomp.oacc-fortran/privatized-ref-1.f95 |   71 +
 42 files changed, 2112 insertions(+), 145 deletions(-)
 create mode 100644 gcc/oacc-neuter-bcast.c
 create mode 100644 gcc/oacc-neuter-bcast.h
 create mode 100644 libgomp/testsuite/libgomp.oacc-c++/privatized-ref-2.C
 create mode 100644 libgomp/testsuite/libgomp.oacc-c++/privatized-ref-3.C
 create mode 100644 libgomp/testsuite/libgomp.oacc-fortran/privatized-ref-1.f95