From 93fb166876a0540416e19c9428316d1370dd1e1b Mon Sep 17 00:00:00 2001
From: Frederik Harwath <frederik@codesourcery.com>
Date: Tue, 3 Nov 2020 12:58:37 +0100
Subject: [PATCH] Move pass_oacc_device_lower after pass_graphite
As a first step towards enabling the use of Graphite for optimizing
OpenACC loops, the OpenACC device lowering must be moved after the
Graphite pass. This means that the device lowering now takes place
after some crucial optimization passes. New instances of those
passes are therefore added inside a new pass, pass_oacc_functions,
which ensures that they execute on OpenACC functions only. The choice
of the new position for pass_oacc_device_lower is further constrained
by the need to execute it before pass_vectorize. This means that
pass_oacc_device_lower now runs inside pass_tree_loop. A further
instance of the pass, which handles functions without loops, is added
inside pass_tree_no_loop. Yet another instance, which executes if
optimizations are disabled, is included inside a new pass,
pass_no_optimizations.
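Most of the test-suite changes listed below only rename dump files,
for instance "cunrolli" becomes "cunrolli1" and "vrp2" becomes "vrp3".
The following minimal sketch illustrates why. It assumes a simplified
model of the numbering, in which a pass that occurs once keeps its
plain name and a pass that occurs several times is numbered in
pipeline order. The names dump_names and self_check are hypothetical
and are not GCC functions; the pipelines are reduced to the two passes
of interest.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Simplified model (an assumption, not GCC's pass manager code): a pass
// name that occurs once in the pipeline is used unchanged as the dump
// name; a name that occurs several times gets 1-based instance numbers
// in pipeline order.
std::vector<std::string>
dump_names (const std::vector<std::string> &pipeline)
{
  // First count how often each pass name occurs.
  std::map<std::string, int> total;
  for (const std::string &name : pipeline)
    ++total[name];

  // Then assign the dump names in pipeline order.
  std::map<std::string, int> seen;
  std::vector<std::string> result;
  for (const std::string &name : pipeline)
    {
      int instance = ++seen[name];
      result.push_back (total[name] == 1
                        ? name
                        : name + std::to_string (instance));
    }
  return result;
}

// Hypothetical reduced pipelines before and after adding the new
// cunrolli and vrp instances in front of the late vrp instance.
void
self_check ()
{
  using names = std::vector<std::string>;

  const names before = dump_names ({"vrp", "cunrolli", "vrp"});
  assert (before == (names {"vrp1", "cunrolli", "vrp2"}));

  const names after
    = dump_names ({"vrp", "cunrolli", "cunrolli", "vrp", "vrp"});
  assert (after == (names {"vrp1", "cunrolli1", "cunrolli2",
                           "vrp2", "vrp3"}));
}
```

In this model the original cunrolli instance becomes "cunrolli1" and
the late vrp instance moves from "vrp2" to "vrp3", which matches the
renamings in the adjusted tests.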
2020-11-03 Frederik Harwath <frederik@codesourcery.com>
Thomas Schwinge <thomas@codesourcery.com>
gcc/ChangeLog:
* omp-general.c (oacc_get_fn_dim_size): Adapt.
* omp-offload.c (pass_oacc_device_lower::clone): New method.
* passes.c (class pass_no_optimizations): New pass.
(make_pass_no_optimizations): New static function.
* passes.def: Move pass_oacc_device_lower into pass_tree_loop
and add further instances to pass_tree_no_loop and to new pass
pass_no_optimizations. Add new instances of
pass_lower_complex, pass_ccp, pass_sink_code,
pass_complete_unrolli, pass_backprop, pass_phiprop,
pass_forwprop, pass_vrp, pass_dce, pass_tree_loop_done,
pass_tree_loop_init, pass_fix_loops supporting the
pass_oacc_device_lower instance in pass_tree_loop.
* tree-pass.h (make_pass_oacc_functions): New static function.
* tree-ssa-loop-ivcanon.c (pass_complete_unroll::clone): New method.
(pass_complete_unrolli::clone): New method.
* tree-ssa-loop.c (pass_fix_loops::clone): New method.
(pass_tree_loop_init::clone): New method.
(pass_tree_loop_done::clone): New method.
* tree-ssa-phiprop.c (pass_phiprop::clone): New method.
* tree-ssa-sink.c (pass_sink_code::clone): New method.
libgomp/ChangeLog:
* testsuite/libgomp.oacc-c-c++-common/vector-length-128-1.c: Adapt to
changed pass instance numbering.
* testsuite/libgomp.oacc-c-c++-common/vector-length-128-2.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/vector-length-128-3.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/vector-length-128-4.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/vector-length-128-5.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/vector-length-128-6.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/vector-length-128-7.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/pr84955-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/pr85486-2.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/pr85486-3.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/pr85486.c: Likewise.
gcc/testsuite/ChangeLog:
* c-c++-common/goacc/classify-kernels-unparallelized.c: Adapt to changed
pass instance numbering.
* c-c++-common/goacc/classify-kernels.c: Likewise.
* c-c++-common/goacc/classify-parallel.c: Likewise.
* c-c++-common/goacc/classify-routine.c: Likewise.
* c-c++-common/unroll-1.c: Likewise.
* c-c++-common/unroll-4.c: Likewise.
* g++.dg/ext/unroll-1.C: Likewise.
* g++.dg/ext/unroll-2.C: Likewise.
* g++.dg/ext/unroll-3.C: Likewise.
* g++.dg/tree-ssa/pr49911.C: Likewise.
* g++.dg/vect/pr36648.cc: Likewise.
* gcc.dg/goacc/loop-processing-1.c: Likewise.
* gcc.dg/graphite/fuse-1.c: Likewise.
* gcc.dg/tree-ssa/backprop-1.c: Likewise.
* gcc.dg/tree-ssa/backprop-2.c: Likewise.
* gcc.dg/tree-ssa/backprop-3.c: Likewise.
* gcc.dg/tree-ssa/backprop-4.c: Likewise.
* gcc.dg/tree-ssa/backprop-5.c: Likewise.
* gcc.dg/tree-ssa/backprop-6.c: Likewise.
* gcc.dg/tree-ssa/cunroll-1.c: Likewise.
* gcc.dg/tree-ssa/cunroll-3.c: Likewise.
* gcc.dg/tree-ssa/cunroll-9.c: Likewise.
* gcc.dg/tree-ssa/ldist-17.c: Likewise.
* gcc.dg/tree-ssa/loop-38.c: Likewise.
* gcc.dg/tree-ssa/pr21463.c: Likewise.
* gcc.dg/tree-ssa/pr45427.c: Likewise.
* gcc.dg/tree-ssa/pr61743-1.c: Likewise.
* gcc.dg/tree-ssa/pr68234.c: Likewise.
* gcc.dg/tree-ssa/pr70232.c: Likewise.
* gcc.dg/tree-ssa/ssa-dom-thread-7.c: Likewise.
* gcc.dg/tree-ssa/ssa-sink-1.c: Likewise.
* gcc.dg/tree-ssa/ssa-sink-10.c: Likewise.
* gcc.dg/tree-ssa/ssa-sink-13.c: Likewise.
* gcc.dg/tree-ssa/ssa-sink-14.c: Likewise.
* gcc.dg/tree-ssa/ssa-sink-16.c: Likewise.
* gcc.dg/tree-ssa/ssa-sink-17.c: Likewise.
* gcc.dg/tree-ssa/ssa-sink-2.c: Likewise.
* gcc.dg/tree-ssa/ssa-sink-3.c: Likewise.
* gcc.dg/tree-ssa/ssa-sink-4.c: Likewise.
* gcc.dg/tree-ssa/ssa-sink-5.c: Likewise.
* gcc.dg/tree-ssa/ssa-sink-6.c: Likewise.
* gcc.dg/tree-ssa/ssa-sink-7.c: Likewise.
* gcc.dg/tree-ssa/ssa-sink-8.c: Likewise.
* gcc.dg/tree-ssa/ssa-sink-9.c: Likewise.
* gcc.dg/tree-ssa/ssa-thread-11.c: Likewise.
* gcc.dg/tree-ssa/vrp47.c: Likewise.
* gcc.dg/tree-ssa/vrp91.c: Likewise.
* gcc.dg/unroll-2.c: Likewise.
* gcc.dg/unroll-3.c: Likewise.
* gcc.dg/unroll-4.c: Likewise.
* gcc.dg/unroll-5.c: Likewise.
* gcc.dg/vect/bb-slp-59.c: Likewise.
* gcc.dg/vect/pr26359.c: Likewise.
* gcc.dg/vect/vect-profile-1.c: Likewise.
* gcc.dg/vrp-min-max-2.c: Likewise.
* gcc.dg/wrapped-binop-simplify.c: Likewise.
* gfortran.dg/directive_unroll_1.f90: Likewise.
* gfortran.dg/directive_unroll_4.f90: Likewise.
* gfortran.dg/goacc/classify-kernels-unparallelized.f95: Likewise.
* gfortran.dg/goacc/classify-kernels.f95: Likewise.
* gfortran.dg/goacc/classify-parallel.f95: Likewise.
* gfortran.dg/goacc/classify-routine.f95: Likewise.
* gnat.dg/unroll1.adb: Likewise.
* gnat.dg/unroll2.adb: Likewise.
* c-c++-common/goacc/device-lowering-no-optimizations.c: New test.
* c-c++-common/goacc/device-lowering-with-optimizations.c: New test.
---
gcc/omp-general.c | 8 ++-
gcc/omp-offload.c | 1 +
gcc/passes.c | 35 ++++++++++++
gcc/passes.def | 29 +++++++++-
.../goacc/classify-kernels-unparallelized.c | 6 +--
.../c-c++-common/goacc/classify-kernels.c | 6 +--
.../c-c++-common/goacc/classify-parallel.c | 6 +--
.../c-c++-common/goacc/classify-routine.c | 6 +--
.../goacc/device-lowering-no-optimizations.c | 25 +++++++++
.../device-lowering-with-optimizations.c | 30 +++++++++++
gcc/testsuite/c-c++-common/unroll-1.c | 8 +--
gcc/testsuite/c-c++-common/unroll-4.c | 4 +-
gcc/testsuite/g++.dg/ext/unroll-1.C | 2 +-
gcc/testsuite/g++.dg/ext/unroll-2.C | 2 +-
gcc/testsuite/g++.dg/ext/unroll-3.C | 2 +-
gcc/testsuite/g++.dg/tree-ssa/pr49911.C | 4 +-
gcc/testsuite/g++.dg/vect/pr36648.cc | 2 +-
.../gcc.dg/goacc/loop-processing-1.c | 3 +-
gcc/testsuite/gcc.dg/graphite/fuse-1.c | 2 +-
gcc/testsuite/gcc.dg/tree-ssa/backprop-1.c | 6 +--
gcc/testsuite/gcc.dg/tree-ssa/backprop-2.c | 4 +-
gcc/testsuite/gcc.dg/tree-ssa/backprop-3.c | 4 +-
gcc/testsuite/gcc.dg/tree-ssa/backprop-4.c | 6 +--
gcc/testsuite/gcc.dg/tree-ssa/backprop-5.c | 4 +-
gcc/testsuite/gcc.dg/tree-ssa/backprop-6.c | 6 +--
gcc/testsuite/gcc.dg/tree-ssa/cunroll-1.c | 6 +--
gcc/testsuite/gcc.dg/tree-ssa/cunroll-3.c | 4 +-
gcc/testsuite/gcc.dg/tree-ssa/cunroll-9.c | 4 +-
gcc/testsuite/gcc.dg/tree-ssa/ldist-17.c | 2 +-
gcc/testsuite/gcc.dg/tree-ssa/loop-38.c | 4 +-
gcc/testsuite/gcc.dg/tree-ssa/pr21463.c | 4 +-
gcc/testsuite/gcc.dg/tree-ssa/pr45427.c | 4 +-
gcc/testsuite/gcc.dg/tree-ssa/pr61743-1.c | 2 +-
gcc/testsuite/gcc.dg/tree-ssa/pr68234.c | 4 +-
gcc/testsuite/gcc.dg/tree-ssa/pr70232.c | 4 +-
.../gcc.dg/tree-ssa/ssa-dom-thread-7.c | 4 +-
gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-1.c | 2 +-
gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-10.c | 2 +-
gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-13.c | 4 +-
gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-14.c | 4 +-
gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-16.c | 2 +-
gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-17.c | 2 +-
gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-2.c | 2 +-
gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-3.c | 2 +-
gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-4.c | 2 +-
gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-5.c | 2 +-
gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-6.c | 2 +-
gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-7.c | 2 +-
gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-8.c | 2 +-
gcc/testsuite/gcc.dg/tree-ssa/ssa-sink-9.c | 2 +-
gcc/testsuite/gcc.dg/tree-ssa/ssa-thread-11.c | 4 +-
gcc/testsuite/gcc.dg/tree-ssa/vrp47.c | 4 +-
gcc/testsuite/gcc.dg/tree-ssa/vrp91.c | 4 +-
gcc/testsuite/gcc.dg/unroll-2.c | 2 +-
gcc/testsuite/gcc.dg/unroll-3.c | 4 +-
gcc/testsuite/gcc.dg/unroll-4.c | 4 +-
gcc/testsuite/gcc.dg/unroll-5.c | 4 +-
gcc/testsuite/gcc.dg/vect/bb-slp-59.c | 2 +-
gcc/testsuite/gcc.dg/vect/pr26359.c | 4 +-
gcc/testsuite/gcc.dg/vect/vect-profile-1.c | 2 +-
gcc/testsuite/gcc.dg/vrp-min-max-2.c | 6 +--
gcc/testsuite/gcc.dg/wrapped-binop-simplify.c | 4 +-
.../gfortran.dg/directive_unroll_1.f90 | 2 +-
.../gfortran.dg/directive_unroll_4.f90 | 2 +-
.../goacc/classify-kernels-unparallelized.f95 | 6 +--
.../gfortran.dg/goacc/classify-kernels.f95 | 6 +--
.../gfortran.dg/goacc/classify-parallel.f95 | 6 +--
.../gfortran.dg/goacc/classify-routine.f95 | 6 +--
gcc/testsuite/gnat.dg/unroll1.adb | 2 +-
gcc/testsuite/gnat.dg/unroll2.adb | 2 +-
gcc/tree-pass.h | 1 +
gcc/tree-ssa-loop-ivcanon.c | 2 +
gcc/tree-ssa-loop.c | 54 +++++++++++++++++++
gcc/tree-ssa-phiprop.c | 2 +
gcc/tree-ssa-sink.c | 2 +
.../libgomp.oacc-c-c++-common/pr84955-1.c | 4 +-
.../libgomp.oacc-c-c++-common/pr85486-2.c | 2 +-
.../libgomp.oacc-c-c++-common/pr85486-3.c | 2 +-
.../libgomp.oacc-c-c++-common/pr85486.c | 2 +-
.../vector-length-128-1.c | 2 +-
.../vector-length-128-2.c | 2 +-
.../vector-length-128-3.c | 2 +-
.../vector-length-128-4.c | 2 +-
.../vector-length-128-5.c | 2 +-
.../vector-length-128-6.c | 2 +-
.../vector-length-128-7.c | 2 +-
86 files changed, 316 insertions(+), 130 deletions(-)
create mode 100644 gcc/testsuite/c-c++-common/goacc/device-lowering-no-optimizations.c
create mode 100644 gcc/testsuite/c-c++-common/goacc/device-lowering-with-optimizations.c
@@ -2778,7 +2778,13 @@ oacc_get_fn_dim_size (tree fn, int axis)
while (axis--)
dims = TREE_CHAIN (dims);
- int size = TREE_INT_CST_LOW (TREE_VALUE (dims));
+ tree v = TREE_VALUE (dims);
+ /* TODO With 'pass_oacc_device_lower' moved "later", this is necessary to
+ avoid ICE for some OpenACC 'kernels' ("parloops") constructs. */
+ if (v == NULL_TREE)
+ return 0;
+
+ int size = TREE_INT_CST_LOW (v);
return size;
}
@@ -2027,6 +2027,7 @@ public:
{
return execute_oacc_device_lower ();
}
+ opt_pass * clone () { return new pass_oacc_device_lower (m_ctxt); }
}; // class pass_oacc_device_lower
@@ -620,6 +620,41 @@ make_pass_all_optimizations_g (gcc::context *ctxt)
namespace {
+const pass_data pass_data_no_optimizations =
+{
+ GIMPLE_PASS, /* type */
+ "*no_optimizations", /* name */
+ OPTGROUP_NONE, /* optinfo_flags */
+ TV_OPTIMIZE, /* tv_id */
+ 0, /* properties_required */
+ 0, /* properties_provided */
+ 0, /* properties_destroyed */
+ 0, /* todo_flags_start */
+ 0, /* todo_flags_finish */
+};
+
+class pass_no_optimizations : public gimple_opt_pass
+{
+public:
+ pass_no_optimizations (gcc::context *ctxt)
+ : gimple_opt_pass (pass_data_no_optimizations, ctxt)
+ {}
+
+ /* opt_pass methods: */
+ virtual bool gate (function *) { return !optimize; }
+
+}; // class pass_no_optimizations
+
+} // anon namespace
+
+static gimple_opt_pass *
+make_pass_no_optimizations (gcc::context *ctxt)
+{
+ return new pass_no_optimizations (ctxt);
+}
+
+namespace {
+
const pass_data pass_data_rest_of_compilation =
{
RTL_PASS, /* type */
@@ -181,7 +181,6 @@ along with GCC; see the file COPYING3. If not see
INSERT_PASSES_AFTER (all_passes)
NEXT_PASS (pass_fixup_cfg);
NEXT_PASS (pass_lower_eh_dispatch);
- NEXT_PASS (pass_oacc_device_lower);
NEXT_PASS (pass_omp_device_lower);
NEXT_PASS (pass_omp_target_link);
NEXT_PASS (pass_adjust_alignment);
@@ -284,6 +283,29 @@ along with GCC; see the file COPYING3. If not see
POP_INSERT_PASSES ()
NEXT_PASS (pass_parallelize_loops, false /* oacc_kernels_p */);
NEXT_PASS (pass_expand_omp_ssa);
+ /* Interrupt pass_tree_loop for OpenACC device lowering. */
+ NEXT_PASS (pass_oacc_functions);
+ PUSH_INSERT_PASSES_WITHIN (pass_oacc_functions)
+ NEXT_PASS (pass_tree_loop_done);
+ NEXT_PASS (pass_oacc_device_lower);
+ /* Passes that must run after OpenACC device lowering. */
+ /* Lower complex number instructions arising from reductions. */
+ NEXT_PASS (pass_lower_complex);
+ /* Those optimizations are generally beneficial, but they are
+ particularly important to help the vectorizer which is crucial
+ for AMD GCN offloading. */
+ NEXT_PASS (pass_ccp, true /* nonzero_p */);
+ NEXT_PASS (pass_sink_code);
+ NEXT_PASS (pass_complete_unrolli);
+ NEXT_PASS (pass_backprop);
+ NEXT_PASS (pass_phiprop);
+ NEXT_PASS (pass_forwprop);
+ NEXT_PASS (pass_vrp, false /* warn_array_bounds_p */);
+ NEXT_PASS (pass_dce);
+ NEXT_PASS (pass_fix_loops);
+ /* Continue pass_tree_loop after OpenACC device lowering. */
+ NEXT_PASS (pass_tree_loop_init);
+ POP_INSERT_PASSES ()
NEXT_PASS (pass_ch_vect);
NEXT_PASS (pass_if_conversion);
/* pass_vectorize must immediately follow pass_if_conversion.
@@ -312,6 +334,7 @@ along with GCC; see the file COPYING3. If not see
NEXT_PASS (pass_tree_no_loop);
PUSH_INSERT_PASSES_WITHIN (pass_tree_no_loop)
NEXT_PASS (pass_slp_vectorize);
+ NEXT_PASS (pass_oacc_device_lower);
POP_INSERT_PASSES ()
NEXT_PASS (pass_simduid_cleanup);
NEXT_PASS (pass_lower_vector_ssa);
@@ -387,6 +410,10 @@ along with GCC; see the file COPYING3. If not see
NEXT_PASS (pass_local_pure_const);
NEXT_PASS (pass_modref);
POP_INSERT_PASSES ()
+ NEXT_PASS (pass_no_optimizations);
+ PUSH_INSERT_PASSES_WITHIN (pass_no_optimizations)
+ NEXT_PASS (pass_oacc_device_lower);
+ POP_INSERT_PASSES ()
NEXT_PASS (pass_tm_init);
PUSH_INSERT_PASSES_WITHIN (pass_tm_init)
NEXT_PASS (pass_tm_mark);
@@ -35,6 +35,6 @@ void KERNELS ()
/* Check the offloaded function's classification and compute dimensions (will
always be 1 x 1 x 1 for non-offloading compilation).
- { dg-final { scan-tree-dump-times "(?n)Function is unparallelized OpenACC kernels offload" 1 "oaccdevlow" } }
- { dg-final { scan-tree-dump-times "(?n)Compute dimensions \\\[1, 1, 1\\\]" 1 "oaccdevlow" } }
- { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(1, 1, 1\\), oacc kernels, omp target entrypoint\\)\\)" 1 "oaccdevlow" } } */
+ { dg-final { scan-tree-dump-times "(?n)Function is unparallelized OpenACC kernels offload" 1 "oaccdevlow1" } }
+ { dg-final { scan-tree-dump-times "(?n)Compute dimensions \\\[1, 1, 1\\\]" 1 "oaccdevlow1" } }
+ { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(1, 1, 1\\), oacc kernels, omp target entrypoint\\)\\)" 1 "oaccdevlow1" } } */
@@ -31,6 +31,6 @@ void KERNELS ()
/* Check the offloaded function's classification and compute dimensions (will
always be 1 x 1 x 1 for non-offloading compilation).
- { dg-final { scan-tree-dump-times "(?n)Function is parallelized OpenACC kernels offload" 1 "oaccdevlow" } }
- { dg-final { scan-tree-dump-times "(?n)Compute dimensions \\\[1, 1, 1\\\]" 1 "oaccdevlow" } }
- { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(1, 1, 1\\), oacc kernels parallelized, oacc function \\(, , \\), oacc kernels, omp target entrypoint\\)\\)" 1 "oaccdevlow" } } */
+ { dg-final { scan-tree-dump-times "(?n)Function is parallelized OpenACC kernels offload" 1 "oaccdevlow1" } }
+ { dg-final { scan-tree-dump-times "(?n)Compute dimensions \\\[1, 1, 1\\\]" 1 "oaccdevlow1" } }
+ { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(1, 1, 1\\), oacc kernels parallelized, oacc function \\(, , \\), oacc kernels, omp target entrypoint\\)\\)" 1 "oaccdevlow1" } } */
@@ -24,6 +24,6 @@ void PARALLEL ()
/* Check the offloaded function's classification and compute dimensions (will
always be 1 x 1 x 1 for non-offloading compilation).
- { dg-final { scan-tree-dump-times "(?n)Function is OpenACC parallel offload" 1 "oaccdevlow" } }
- { dg-final { scan-tree-dump-times "(?n)Compute dimensions \\\[1, 1, 1\\\]" 1 "oaccdevlow" } }
- { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(1, 1, 1\\), omp target entrypoint\\)\\)" 1 "oaccdevlow" } } */
+ { dg-final { scan-tree-dump-times "(?n)Function is OpenACC parallel offload" 1 "oaccdevlow1" } }
+ { dg-final { scan-tree-dump-times "(?n)Compute dimensions \\\[1, 1, 1\\\]" 1 "oaccdevlow1" } }
+ { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(1, 1, 1\\), omp target entrypoint\\)\\)" 1 "oaccdevlow1" } } */
@@ -26,6 +26,6 @@ void ROUTINE ()
/* Check the offloaded function's classification and compute dimensions (will
always be 1 x 1 x 1 for non-offloading compilation).
- { dg-final { scan-tree-dump-times "(?n)Function is OpenACC routine level 1" 1 "oaccdevlow" } }
- { dg-final { scan-tree-dump-times "(?n)Compute dimensions \\\[1, 1, 1\\\]" 1 "oaccdevlow" } }
- { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(0 1, 1 1, 1 1\\), omp declare target \\(worker\\), oacc function \\(0 1, 1 0, 1 0\\)\\)\\)" 1 "oaccdevlow" } } */
+ { dg-final { scan-tree-dump-times "(?n)Function is OpenACC routine level 1" 1 "oaccdevlow1" } }
+ { dg-final { scan-tree-dump-times "(?n)Compute dimensions \\\[1, 1, 1\\\]" 1 "oaccdevlow1" } }
+ { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(0 1, 1 1, 1 1\\), omp declare target \\(worker\\), oacc function \\(0 1, 1 0, 1 0\\)\\)\\)" 1 "oaccdevlow1" } } */
new file mode 100644
@@ -0,0 +1,25 @@
+/* Check that the instance of the OpenACC device lowering pass that is
+ supposed to run if optimizations are disabled does get executed. */
+
+/* { dg-additional-options "-fdump-tree-oaccdevlow" } */
+/* { dg-additional-options "-O0" } */
+
+#pragma acc routine
+int test (int x)
+{
+ return x * x;
+}
+
+int test2 (int x)
+{
+#pragma acc parallel
+ {
+ for (int i = 1; i < 1000; ++i)
+ x += x;
+ }
+
+ return x;
+}
+
+/* { dg-final { scan-tree-dump-times "Function is OpenACC routine" 1 "oaccdevlow3" } } */
+/* { dg-final { scan-tree-dump-times "Function is OpenACC parallel" 1 "oaccdevlow3" } } */
new file mode 100644
@@ -0,0 +1,30 @@
+/* Check that the different instances of the OpenACC device lowering
+ pass get executed on the types of functions they are supposed to
+ handle if optimizations are enabled. */
+
+/* { dg-additional-options "-fdump-tree-oaccdevlow" } */
+/* { dg-additional-options "-O" } */
+
+#pragma acc routine
+int test (int x)
+{
+ return x * x;
+}
+
+int test2 (int x)
+{
+#pragma acc parallel
+ {
+ for (int i = 1; i < 1000; ++i)
+ x += x;
+ }
+
+ return x;
+}
+
+
+/* { dg-final { scan-tree-dump-times "Function is OpenACC routine" 1 "oaccdevlow2" } }
+ The acc routine should be handled by the pass instance for functions without loops. */
+/* { dg-final { scan-tree-dump-times "Function is OpenACC parallel" 1 "oaccdevlow1" } }
+ The function with the parallel region should be handled by the pass instance
+ for functions with loops. */
@@ -1,5 +1,5 @@
-/* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-cunrolli-details -fdump-rtl-loop2_unroll-details" } */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-cunrolli1-details -fdump-rtl-loop2_unroll-details" } */
extern void bar (int);
@@ -10,12 +10,12 @@ void test (void)
#pragma GCC unroll 8
for (unsigned long i = 1; i <= 8; ++i)
bar(i);
- /* { dg-final { scan-tree-dump "11:.*: loop with 8 iterations completely unrolled" "cunrolli" } } */
+ /* { dg-final { scan-tree-dump "11:.*: loop with 8 iterations completely unrolled" "cunrolli1" } } */
#pragma GCC unroll 8
for (unsigned long i = 1; i <= 7; ++i)
bar(i);
- /* { dg-final { scan-tree-dump "16:.*: loop with 7 iterations completely unrolled" "cunrolli" } } */
+ /* { dg-final { scan-tree-dump "16:.*: loop with 7 iterations completely unrolled" "cunrolli1" } } */
#pragma GCC unroll 8
for (unsigned long i = 1; i <= 15; ++i)
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-options "-O2 -funroll-all-loops -fdump-rtl-loop2_unroll-details -fdump-tree-cunrolli-details" } */
+/* { dg-options "-O2 -funroll-all-loops -fdump-rtl-loop2_unroll-details -fdump-tree-cunrolli1-details" } */
extern void bar (int);
@@ -17,6 +17,6 @@ void test (void)
for (unsigned long i = 1; i <= j; ++i)
bar(i);
- /* { dg-final { scan-tree-dump "Not unrolling loop .: user didn't want it unrolled completely" "cunrolli" } } */
+ /* { dg-final { scan-tree-dump "Not unrolling loop .: user didn't want it unrolled completely" "cunrolli1" } } */
/* { dg-final { scan-rtl-dump-times "Not unrolling loop, user didn't want it unrolled" 2 "loop2_unroll" } } */
}
@@ -16,4 +16,4 @@ bar (int *a, int *b, int *c)
foo <int> (a, b, c);
}
-// { dg-final { scan-tree-dump "loop with 8 iterations completely unrolled" "cunrolli" } }
+// { dg-final { scan-tree-dump "loop with 8 iterations completely unrolled" "cunrolli1" } }
@@ -10,4 +10,4 @@ foo (int (&a)[8], int *b, int *c)
a[i] = b[i] * c[i];
}
-// { dg-final { scan-tree-dump "loop with 8 iterations completely unrolled" "cunrolli" } }
+// { dg-final { scan-tree-dump "loop with 8 iterations completely unrolled" "cunrolli1" } }
@@ -17,4 +17,4 @@ bar (int (&a)[8], int *b, int *c)
foo <int> (a, b, c);
}
-// { dg-final { scan-tree-dump "loop with 8 iterations completely unrolled" "cunrolli" } }
+// { dg-final { scan-tree-dump "loop with 8 iterations completely unrolled" "cunrolli1" } }
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-options "-O2 -fstrict-enums -fno-rtti -fno-exceptions -fno-strict-aliasing -fdump-tree-vrp2" } */
+/* { dg-options "-O2 -fstrict-enums -fno-rtti -fno-exceptions -fno-strict-aliasing -fdump-tree-vrp3" } */
extern void JS_Assert();
@@ -37,4 +37,4 @@ void jsop_setelem(bool y, int z) {
x = frame.dataRematInfo2(y, z);
}
-/* { dg-final { scan-tree-dump-times "Folding predicate.*45" 0 "vrp2"} } */
+/* { dg-final { scan-tree-dump-times "Folding predicate.*45" 0 "vrp3"} } */
@@ -1,6 +1,6 @@
/* { dg-do compile } */
/* { dg-require-effective-target vect_float } */
-/* { dg-additional-options "-fdisable-tree-cunrolli" } */
+/* { dg-additional-options "-fdisable-tree-cunrolli1" } */
struct vector
{
@@ -15,4 +15,5 @@ void vector_1 (int *ary, int size)
}
}
-/* { dg-final { scan-tree-dump {OpenACC loops.*Loop 0\(0\).*Loop 24\(1\).*\.data_dep\.[0-9_]+ = \.UNIQUE \(OACC_HEAD_MARK, 0, 1, 36\);.*Head-0:.*\.data_dep\.[0-9_]+ = \.UNIQUE \(OACC_HEAD_MARK, 0, 1, 36\);.*\.data_dep\.[0-9_]+ = \.UNIQUE \(OACC_FORK, \.data_dep\.[0-9_]+, 0\);.*Tail-0:.*\.data_dep\.[0-9_]+ = \.UNIQUE \(OACC_TAIL_MARK, \.data_dep\.[0-9_]+, 1\);.*\.data_dep\.[0-9_]+ = \.UNIQUE \(OACC_JOIN, \.data_dep\.[0-9_]+, 0\);.*Loop 6\(6\).*\.data_dep\.[0-9_]+ = \.UNIQUE \(OACC_HEAD_MARK, 0, 2, 6\);.*Head-0:.*\.data_dep\.[0-9_]+ = \.UNIQUE \(OACC_HEAD_MARK, 0, 2, 6\);.*\.data_dep\.[0-9_]+ = \.UNIQUE \(OACC_FORK, \.data_dep\.[0-9_]+, 1\);.*Head-1:.*\.data_dep\.[0-9_]+ = \.UNIQUE \(OACC_HEAD_MARK, \.data_dep\.[0-9_]+, 1\);.*\.data_dep\.[0-9_]+ = \.UNIQUE \(OACC_FORK, \.data_dep\.[0-9_]+, 2\);.*Tail-1:.*\.data_dep\.[0-9_]+ = \.UNIQUE \(OACC_TAIL_MARK, \.data_dep\.[0-9_]+, 2\);.*\.data_dep\.[0-9_]+ = \.UNIQUE \(OACC_JOIN, \.data_dep\.[0-9_]+, 2\);.*Tail-0:.*\.data_dep\.[0-9_]+ = \.UNIQUE \(OACC_TAIL_MARK, \.data_dep\.[0-9_]+, 1\);.*\.data_dep\.[0-9_]+ = \.UNIQUE \(OACC_JOIN, \.data_dep\.[0-9_]+, 1\);} "oaccdevlow" } } */
+/* { dg-final { scan-tree-dump {
+OpenACC loops.*Loop 0\(0\).*Loop [0-9]{2}\(1\).*\.data_dep\.[0-9_]+ = \.UNIQUE \(OACC_HEAD_MARK, 0, 1, 36\);.*Head-0:.*\.data_dep\.[0-9_]+ = \.UNIQUE \(OACC_HEAD_MARK, 0, 1, 36\);.*\.data_dep\.[0-9_]+ = \.UNIQUE \(OACC_FORK, \.data_dep\.[0-9_]+, 0\);.*Tail-0:.*\.data_dep\.[0-9_]+ = \.UNIQUE \(OACC_TAIL_MARK, \.data_dep\.[0-9_]+, 1\);.*\.data_dep\.[0-9_]+ = \.UNIQUE \(OACC_JOIN, \.data_dep\.[0-9_]+, 0\);.*Loop 6\(6\).*\.data_dep\.[0-9_]+ = \.UNIQUE \(OACC_HEAD_MARK, 0, 2, 6\);.*Head-0:.*\.data_dep\.[0-9_]+ = \.UNIQUE \(OACC_HEAD_MARK, 0, 2, 6\);.*\.data_dep\.[0-9_]+ = \.UNIQUE \(OACC_FORK, \.data_dep\.[0-9_]+, 1\);.*Head-1:.*\.data_dep\.[0-9_]+ = \.UNIQUE \(OACC_HEAD_MARK, \.data_dep\.[0-9_]+, 1\);.*\.data_dep\.[0-9_]+ = \.UNIQUE \(OACC_FORK, \.data_dep\.[0-9_]+, 2\);.*Tail-1:.*\.data_dep\.[0-9_]+ = \.UNIQUE \(OACC_TAIL_MARK, \.data_dep\.[0-9_]+, 2\);.*\.data_dep\.[0-9_]+ = \.UNIQUE \(OACC_JOIN, \.data_dep\.[0-9_]+, 2\);.*Tail-0:.*\.data_dep\.[0-9_]+ = \.UNIQUE \(OACC_TAIL_MARK, \.data_dep\.[0-9_]+, 1\);.*\.data_dep\.[0-9_]+ = \.UNIQUE \(OACC_JOIN, \.data_dep\.[0-9_]+, 1\);} "oaccdevlow*" } } */
@@ -12,7 +12,7 @@ for (int c0 = 0; c0 <= 99; c0 += 1) {
/* { dg-final { scan-tree-dump-times "AST generated by isl:.*for \\(int c0 = 0; c0 <= 99; c0 \\+= 1\\) \\{.*S_.*\\(c0\\);.*S_.*\\(c0\\);.*S_.*\\(c0\\);.*\\}" 1 "graphite" } } */
/* Check that after fusing the loops, the scalar computation is also fused. */
-/* { dg-final { scan-tree-dump-times "gimple_simplified to\[^\\n\]*\\^ 12" 1 "forwprop4" } } */
+/* { dg-final { scan-tree-dump-times "gimple_simplified to\[^\\n\]*\\^ 12" 1 "forwprop5" } } */
#define MAX 100
int A[MAX];
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-options "-O -g -fdump-tree-backprop-details" } */
+/* { dg-options "-O -g -fdump-tree-backprop1-details" } */
/* Test a simple case of non-looping code in which both uses ignore
the sign and both definitions are sign ops. */
@@ -18,5 +18,5 @@ TEST_FUNCTION (float, f)
TEST_FUNCTION (double, )
TEST_FUNCTION (long double, l)
-/* { dg-final { scan-tree-dump-times {Deleting[^\n]* = -x} 3 "backprop" } } */
-/* { dg-final { scan-tree-dump-times {Deleting[^\n]* = ABS_EXPR <x} 3 "backprop" } } */
+/* { dg-final { scan-tree-dump-times {Deleting[^\n]* = -x} 3 "backprop1" } } */
+/* { dg-final { scan-tree-dump-times {Deleting[^\n]* = ABS_EXPR <x} 3 "backprop1" } } */
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-options "-O -g -fdump-tree-backprop-details" } */
+/* { dg-options "-O -g -fdump-tree-backprop1-details" } */
/* Test a simple case of non-looping code in which both uses ignore
the sign but only one definition is a sign op. */
@@ -18,4 +18,4 @@ TEST_FUNCTION (float, f)
TEST_FUNCTION (double, )
TEST_FUNCTION (long double, l)
-/* { dg-final { scan-tree-dump-times {Deleting[^\n]* = -x} 3 "backprop" } } */
+/* { dg-final { scan-tree-dump-times {Deleting[^\n]* = -x} 3 "backprop1" } } */
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-options "-O -g -fdump-tree-backprop-details" } */
+/* { dg-options "-O -g -fdump-tree-backprop1-details" } */
/* Test a simple case of non-looping code in which one use ignores
the sign but another doesn't. */
@@ -18,4 +18,4 @@ TEST_FUNCTION (float, f)
TEST_FUNCTION (double, )
TEST_FUNCTION (long double, l)
-/* { dg-final { scan-tree-dump-times {Deleting[^\n]* = -x} 0 "backprop" } } */
+/* { dg-final { scan-tree-dump-times {Deleting[^\n]* = -x} 0 "backprop1" } } */
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-options "-O -g -fdump-tree-backprop-details" } */
+/* { dg-options "-O -g -fdump-tree-backprop1-details" } */
/* Test a simple reduction loop in which all inputs are sign ops and
the consumer doesn't care about the sign. */
@@ -17,5 +17,5 @@ TEST_FUNCTION (float, f)
TEST_FUNCTION (double, )
TEST_FUNCTION (long double, l)
-/* { dg-final { scan-tree-dump-times {Deleting[^\n]* = __builtin_copysign} 3 "backprop" } } */
-/* { dg-final { scan-tree-dump-times {Deleting[^\n]* = -} 3 "backprop" } } */
+/* { dg-final { scan-tree-dump-times {Deleting[^\n]* = __builtin_copysign} 3 "backprop1" } } */
+/* { dg-final { scan-tree-dump-times {Deleting[^\n]* = -} 3 "backprop1" } } */
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-options "-O -g -fdump-tree-backprop-details" } */
+/* { dg-options "-O -g -fdump-tree-backprop1-details" } */
/* Test a loop that does both a multiplication and addition. The addition
should prevent any sign ops from being removed. */
@@ -17,4 +17,4 @@ TEST_FUNCTION (float, f)
TEST_FUNCTION (double, )
TEST_FUNCTION (long double, l)
-/* { dg-final { scan-tree-dump-times {Deleting[^\n]* = __builtin_copysign} 0 "backprop" } } */
+/* { dg-final { scan-tree-dump-times {Deleting[^\n]* = __builtin_copysign} 0 "backprop1" } } */
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-options "-O -fdump-tree-backprop-details" } */
+/* { dg-options "-O -fdump-tree-backprop1-details" } */
void start (void *);
void end (void *);
@@ -26,5 +26,5 @@ TEST_FUNCTION (float, f)
TEST_FUNCTION (double, )
TEST_FUNCTION (long double, l)
-/* { dg-final { scan-tree-dump-times {Deleting[^\n]* = -} 6 "backprop" } } */
-/* { dg-final { scan-tree-dump-times {Deleting[^\n]* = ABS_EXPR <} 3 "backprop" } } */
+/* { dg-final { scan-tree-dump-times {Deleting[^\n]* = -} 6 "backprop1" } } */
+/* { dg-final { scan-tree-dump-times {Deleting[^\n]* = ABS_EXPR <} 3 "backprop1" } } */
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-options "-O3 -fdump-tree-cunrolli-details" } */
+/* { dg-options "-O3 -fdump-tree-cunrolli1-details" } */
int a[2];
void
test(int c)
@@ -9,5 +9,5 @@ test(int c)
a[i]=5;
}
/* Array bounds says the loop will not roll much. */
-/* { dg-final { scan-tree-dump "loop with 2 iterations completely unrolled" "cunrolli"} } */
-/* { dg-final { scan-tree-dump "Last iteration exit edge was proved true." "cunrolli"} } */
+/* { dg-final { scan-tree-dump "loop with 2 iterations completely unrolled" "cunrolli1"} } */
+/* { dg-final { scan-tree-dump "Last iteration exit edge was proved true." "cunrolli1"} } */
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-cunrolli-details" } */
+/* { dg-options "-O2 -fdump-tree-cunrolli1-details" } */
int a[1];
void
test(int c)
@@ -12,4 +12,4 @@ test(int c)
}
/* If we start duplicating headers prior curoll, this loop will have 0 iterations. */
-/* { dg-final { scan-tree-dump "loop with 1 iterations completely unrolled" "cunrolli"} } */
+/* { dg-final { scan-tree-dump "loop with 1 iterations completely unrolled" "cunrolli1"} } */
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-cunrolli-details -fdisable-tree-evrp" } */
+/* { dg-options "-O2 -fdump-tree-cunrolli1-details -fdisable-tree-evrp" } */
void abort (void);
int q (void);
int a[10];
@@ -20,4 +20,4 @@ t (int n)
}
return sum;
}
-/* { dg-final { scan-tree-dump-times "Removed pointless exit:" 1 "cunrolli" } } */
+/* { dg-final { scan-tree-dump-times "Removed pointless exit:" 1 "cunrolli1" } } */
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-options "-O2 -ftree-loop-distribution -ftree-loop-distribute-patterns -fdump-tree-ldist-details -fdisable-tree-cunroll -fdisable-tree-cunrolli" } */
+/* { dg-options "-O2 -ftree-loop-distribution -ftree-loop-distribute-patterns -fdump-tree-ldist-details -fdisable-tree-cunroll -fdisable-tree-cunrolli1" } */
typedef int mad_fixed_t;
struct mad_pcm
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-cunrolli-details" } */
+/* { dg-options "-O2 -fdump-tree-cunrolli1-details" } */
int a[10];
int b[11];
int q (void);
@@ -15,4 +15,4 @@ t(int n)
sum+=b[i];
return sum;
}
-/* { dg-final { scan-tree-dump "Loop 1 iterates at most 11 times" "cunrolli" } } */
+/* { dg-final { scan-tree-dump "Loop 1 iterates at most 11 times" "cunrolli1" } } */
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-options "-O -fdump-tree-phiprop-details" } */
+/* { dg-options "-O -fdump-tree-phiprop1-details" } */
struct f
{
@@ -16,4 +16,4 @@ int g(int i, int c, struct f *ff, int g)
return *t;
}
-/* { dg-final { scan-tree-dump-times "Inserting PHI for result of load" 1 "phiprop" } } */
+/* { dg-final { scan-tree-dump-times "Inserting PHI for result of load" 1 "phiprop1" } } */
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-cunrolli-details" } */
+/* { dg-options "-O2 -fdump-tree-cunrolli1-details" } */
extern void abort (void);
int __attribute__((noinline,noclone))
@@ -25,4 +25,4 @@ int main()
return 0;
}
-/* { dg-final { scan-tree-dump-times "bounded by 0x0\[^0-9a-f\]" 0 "cunrolli"} } */
+/* { dg-final { scan-tree-dump-times "bounded by 0x0\[^0-9a-f\]" 0 "cunrolli1"} } */
@@ -50,4 +50,4 @@ int foo1 (e_u8 a[4][N], int b1, int b2, e_u8 b[M+1][4][N])
/* { dg-final { scan-tree-dump-times "loop with 3 iterations completely unrolled" 2 "cunroll" } } */
/* { dg-final { scan-tree-dump-times "loop with 7 iterations completely unrolled" 2 "cunroll" } } */
-/* { dg-final { scan-tree-dump-not "completely unrolled" "cunrolli" } } */
+/* { dg-final { scan-tree-dump-not "completely unrolled" "cunrolli1" } } */
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-vrp2" } */
+/* { dg-options "-O2 -fdump-tree-vrp3" } */
extern int nc;
void ff (unsigned long long);
@@ -21,4 +21,4 @@ f (void)
}
}
-/* { dg-final { scan-tree-dump ">> 6" "vrp2" } } */
+/* { dg-final { scan-tree-dump ">> 6" "vrp3" } } */
@@ -1,12 +1,12 @@
/* { dg-do compile } */
-/* { dg-options "-O2 -w -fdump-tree-vrp1-details -fdump-tree-vrp2-details -fdump-tree-dom2-details -fdump-tree-dom3-details" } */
+/* { dg-options "-O2 -w -fdump-tree-vrp1-details -fdump-tree-vrp3-details -fdump-tree-dom2-details -fdump-tree-dom3-details" } */
/* All the threads found by the FSM threader should have too
many statements to be profitable. */
/* { dg-final { scan-tree-dump-not "Registering FSM " "dom2"} } */
/* { dg-final { scan-tree-dump-not "Registering FSM " "dom3"} } */
/* { dg-final { scan-tree-dump-not "Registering FSM " "vrp1"} } */
-/* { dg-final { scan-tree-dump-not "Registering FSM " "vrp2"} } */
+/* { dg-final { scan-tree-dump-not "Registering FSM " "vrp3"} } */
typedef _Bool bool;
typedef unsigned char uint8_t;
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-thread1-stats -fdump-tree-thread2-stats -fdump-tree-dom2-stats -fdump-tree-thread3-stats -fdump-tree-dom3-stats -fdump-tree-vrp2-stats -fno-guess-branch-probability" } */
+/* { dg-options "-O2 -fdump-tree-thread1-stats -fdump-tree-thread2-stats -fdump-tree-dom2-stats -fdump-tree-thread3-stats -fdump-tree-dom3-stats -fdump-tree-vrp3-stats -fno-guess-branch-probability" } */
/* Here we have the same issue as was commented in ssa-dom-thread-6.c.
The PHI coming into the threader has a lot more constants, so the
@@ -24,7 +24,7 @@ $ diff clean/a.c.105t.mergephi2 a.c.105t.mergephi2
to change decisions in switch expansion which in turn can expose new
jump threading opportunities. Skip the later tests on aarch64. */
/* { dg-final { scan-tree-dump-not "Jumps threaded" "dom3" { target { ! aarch64*-*-* } } } } */
-/* { dg-final { scan-tree-dump-not "Jumps threaded" "vrp2" { target { ! aarch64*-*-* } } } } */
+/* { dg-final { scan-tree-dump-not "Jumps threaded" "vrp3" { target { ! aarch64*-*-* } } } } */
enum STATE {
S0=0,
@@ -7,4 +7,4 @@ foo (int a, int b, int c)
return c ? x : a;
}
/* We should sink the x = a * b calculation into the branch that returns x. */
-/* { dg-final { scan-tree-dump-times "Sunk statements: 1" 1 "sink" } } */
+/* { dg-final { scan-tree-dump-times "Sunk statements: 1" 1 "sink1" } } */
@@ -16,4 +16,4 @@ void foo (void)
}
}
-/* { dg-final { scan-tree-dump-times "Sinking # VUSE" 4 "sink" } } */
+/* { dg-final { scan-tree-dump-times "Sinking # VUSE" 4 "sink1" } } */
@@ -21,5 +21,5 @@ void test ()
/* We should sink/merge all stores and end up with a single BB. */
-/* { dg-final { scan-tree-dump-times "MEM\[^\n\r\]* = 0;" 3 "sink" } } */
-/* { dg-final { scan-tree-dump-times "<bb " 1 "sink" } } */
+/* { dg-final { scan-tree-dump-times "MEM\[^\n\r\]* = 0;" 3 "sink1" } } */
+/* { dg-final { scan-tree-dump-times "<bb " 1 "sink1" } } */
@@ -13,5 +13,5 @@ void foo (int b)
/* We should have sunk the store and inserted a PHI to merge the
stored values. */
-/* { dg-final { scan-tree-dump-times " = PHI" 1 "sink" } } */
-/* { dg-final { scan-tree-dump-times "x = " 1 "sink" } } */
+/* { dg-final { scan-tree-dump-times " = PHI" 1 "sink1" } } */
+/* { dg-final { scan-tree-dump-times "x = " 1 "sink1" } } */
@@ -10,5 +10,5 @@ int f(int n)
return j;
}
-/* { dg-final { scan-tree-dump "Sinking j_. = __builtin_ffs" "sink" } } */
+/* { dg-final { scan-tree-dump "Sinking j_. = __builtin_ffs" "sink1" } } */
/* { dg-final { scan-tree-dump "return 2;" "optimized" } } */
@@ -12,4 +12,4 @@ int my_f(int a, int b)
}
/* We should sink the call to pure_f to the if block. */
-/* { dg-final { scan-tree-dump "Sinking # VUSE" "sink" } } */
+/* { dg-final { scan-tree-dump "Sinking # VUSE" "sink1" } } */
@@ -9,4 +9,4 @@ bar (int a, int b, int c)
return y;
}
/* We should sink the x = a * b calculation into the else branch */
-/* { dg-final { scan-tree-dump-times "Sunk statements: 1" 1 "sink" } } */
+/* { dg-final { scan-tree-dump-times "Sunk statements: 1" 1 "sink1" } } */
@@ -12,4 +12,4 @@ main (int argc)
}
}
/* We should sink the a = argc + 1 calculation into the if branch */
-/* { dg-final { scan-tree-dump-times "Sunk statements: 1" 1 "sink" } } */
+/* { dg-final { scan-tree-dump-times "Sunk statements: 1" 1 "sink1" } } */
@@ -17,4 +17,4 @@ main (int argc)
foo2 (a);
}
/* We should sink the first a = b + c calculation into the else branch */
-/* { dg-final { scan-tree-dump-times "Sunk statements: 1" 1 "sink" } } */
+/* { dg-final { scan-tree-dump-times "Sunk statements: 1" 1 "sink1" } } */
@@ -44,4 +44,4 @@ void foo(int16_t runs[], uint8_t alpha[], int x, int count)
}
/* We should not sink the next_runs = runs + x calculation after the loop. */
-/* { dg-final { scan-tree-dump-times "Sunk statements:" 0 "sink" } } */
+/* { dg-final { scan-tree-dump-times "Sunk statements:" 0 "sink1" } } */
@@ -14,4 +14,4 @@ int foo(int *a, int r)
/* *a = 1 should be sunk to the else block. */
-/* { dg-final { scan-tree-dump-times "Sinking" 1 "sink" } } */
+/* { dg-final { scan-tree-dump-times "Sinking" 1 "sink1" } } */
@@ -15,4 +15,4 @@ int foo(int *a, int r, short *b)
/* *a = 1 should be sunk to the else block. */
-/* { dg-final { scan-tree-dump-times "Sinking" 1 "sink" } } */
+/* { dg-final { scan-tree-dump-times "Sinking" 1 "sink1" } } */
@@ -24,4 +24,4 @@ int foo(int *a, int r, short *b)
/* *a = 1 should be sunk into the default case. */
-/* { dg-final { scan-tree-dump-times "Sinking" 1 "sink" } } */
+/* { dg-final { scan-tree-dump-times "Sinking" 1 "sink1" } } */
@@ -15,4 +15,4 @@ int foo(int *a, int r, int *b)
/* *a = 1 should be sunk to the else block. */
-/* { dg-final { scan-tree-dump-times "Sinking" 1 "sink" } } */
+/* { dg-final { scan-tree-dump-times "Sinking" 1 "sink1" } } */
@@ -1,6 +1,6 @@
/* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-vrp2-details --param logical-op-non-short-circuit=1" } */
-/* { dg-final { scan-tree-dump-not "IRREDUCIBLE_LOOP" "vrp2" } } */
+/* { dg-options "-O2 -fdump-tree-vrp3-details --param logical-op-non-short-circuit=1" } */
+/* { dg-final { scan-tree-dump-not "IRREDUCIBLE_LOOP" "vrp3" } } */
void abort (void);
typedef struct bitmap_head_def *bitmap;
@@ -1,7 +1,7 @@
/* Setting LOGICAL_OP_NON_SHORT_CIRCUIT to 0 inhibits the setcc
optimizations that expose the VRP opportunity. */
/* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-vrp1 -fdump-tree-dom2 -fdump-tree-vrp2 --param logical-op-non-short-circuit=1" } */
+/* { dg-options "-O2 -fdump-tree-vrp1 -fdump-tree-dom2 -fdump-tree-vrp3 --param logical-op-non-short-circuit=1" } */
/* { dg-additional-options "-march=i586" { target { { i?86-*-* x86_64-*-* } && ia32 } } } */
int h(int x, int y)
@@ -39,5 +39,5 @@ int f(int x)
/* VRP2 gets rid of the remaining & 1 operations, x and y are always
either 0 or 1. */
-/* { dg-final { scan-tree-dump-times " & 1;" 0 "vrp2" } } */
+/* { dg-final { scan-tree-dump-times " & 1;" 0 "vrp3" } } */
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-options "-S -O2 -fdump-tree-vrp2" } */
+/* { dg-options "-S -O2 -fdump-tree-vrp3" } */
unsigned short data;
void foo ()
@@ -18,4 +18,4 @@ void foo ()
}
}
-/* { dg-final { scan-tree-dump "\\\[0, 7\\\]" "vrp2" } } */
+/* { dg-final { scan-tree-dump "\\\[0, 7\\\]" "vrp3" } } */
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-cunrolli-details=stderr -fno-peel-loops -fno-tree-vrp -fdisable-tree-cunroll -fenable-tree-cunrolli" } */
+/* { dg-options "-O2 -fdump-tree-cunrolli-details=stderr -fno-peel-loops -fno-tree-vrp -fdisable-tree-cunroll -fenable-tree-cunrolli1" } */
/* Blank lines can occur in the output of
-fdump-tree-cunrolli-details=stderr. */
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-cunrolli-details -fno-peel-loops -fno-tree-vrp -fdisable-tree-cunrolli=foo -fenable-tree-cunrolli=foo" } */
+/* { dg-options "-O2 -fdump-tree-cunrolli1-details -fno-peel-loops -fno-tree-vrp -fdisable-tree-cunrolli1=foo -fenable-tree-cunrolli1=foo" } */
unsigned a[100], b[100];
inline void bar()
@@ -28,4 +28,4 @@ int foo2(void)
return 1;
}
-/* { dg-final { scan-tree-dump-times "loop with 2 iterations completely unrolled" 1 "cunrolli" } } */
+/* { dg-final { scan-tree-dump-times "loop with 2 iterations completely unrolled" 1 "cunrolli1" } } */
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-cunrolli-details -fno-peel-loops -fno-tree-vrp -fdisable-tree-cunroll -fenable-tree-cunrolli=foo -fdisable-tree-cunrolli=foo2" } */
+/* { dg-options "-O2 -fdump-tree-cunrolli1-details -fno-peel-loops -fno-tree-vrp -fdisable-tree-cunroll -fenable-tree-cunrolli1=foo -fdisable-tree-cunrolli1=foo2" } */
unsigned a[100], b[100];
inline void bar()
@@ -28,4 +28,4 @@ int foo2(void)
return 1;
}
-/* { dg-final { scan-tree-dump-times "loop with 2 iterations completely unrolled" 1 "cunrolli" } } */
+/* { dg-final { scan-tree-dump-times "loop with 2 iterations completely unrolled" 1 "cunrolli1" } } */
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-cunrolli-details -fno-peel-loops -fno-tree-vrp -fdisable-tree-cunroll -fenable-tree-cunrolli=foo2 -fdisable-tree-cunrolli=foo" } */
+/* { dg-options "-O2 -fdump-tree-cunrolli1-details -fno-peel-loops -fno-tree-vrp -fdisable-tree-cunroll -fenable-tree-cunrolli1=foo2 -fdisable-tree-cunrolli1=foo" } */
unsigned a[100], b[100];
inline void bar()
@@ -28,4 +28,4 @@ int foo2(void)
return 1;
}
-/* { dg-final { scan-tree-dump-times "loop with 2 iterations completely unrolled" 1 "cunrolli" } } */
+/* { dg-final { scan-tree-dump-times "loop with 2 iterations completely unrolled" 1 "cunrolli1" } } */
@@ -22,5 +22,5 @@ void foo (void)
/* We should be able to vectorize the cycle in one SLP attempt including
both load groups and do only one permutation. */
/* { dg-final { scan-tree-dump-times "transform load" 2 "slp1" } } */
-/* { dg-final { scan-tree-dump-times "VEC_PERM_EXPR" 1 "loopdone" } } */
+/* { dg-final { scan-tree-dump-times "VEC_PERM_EXPR" 1 "loopdone2" } } */
/* { dg-final { scan-tree-dump-times "optimized: basic block" 1 "slp1" } } */
@@ -1,6 +1,6 @@
/* { dg-do compile } */
/* { dg-require-effective-target vect_int } */
-/* { dg-additional-options "-fdump-tree-dce6-details" } */
+/* { dg-additional-options "-fdump-tree-dce7-details" } */
int a[256], b[256], c[256];
@@ -13,4 +13,4 @@ foo () {
}
}
-/* { dg-final { scan-tree-dump-times "Deleting : vect_" 0 "dce6" } } */
+/* { dg-final { scan-tree-dump-times "Deleting : vect_" 0 "dce7" } } */
@@ -1,6 +1,6 @@
/* { dg-do compile } */
/* { dg-require-effective-target vect_int } */
-/* { dg-additional-options "-fdump-tree-vect-details-blocks -fdisable-tree-cunrolli" } */
+/* { dg-additional-options "-fdump-tree-vect-details-blocks -fdisable-tree-cunrolli1" } */
/* At least one of these should correspond to a full vector. */
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-vrp2" } */
+/* { dg-options "-O2 -fdump-tree-vrp3" } */
/* { dg-require-effective-target int32plus } */
int Foo (int X)
@@ -14,5 +14,5 @@ int Foo (int X)
/* We expect this min/max pair to survive. */
-/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "vrp2" } } */
-/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "vrp2" } } */
+/* { dg-final { scan-tree-dump-times "MIN_EXPR" 1 "vrp3" } } */
+/* { dg-final { scan-tree-dump-times "MAX_EXPR" 1 "vrp3" } } */
@@ -1,6 +1,6 @@
/* { dg-do compile { target { { i?86-*-* x86_64-*-* s390*-*-* } && lp64 } } } */
-/* { dg-options "-O2 -fdump-tree-vrp2-details" } */
-/* { dg-final { scan-tree-dump-times "gimple_simplified to" 4 "vrp2" } } */
+/* { dg-options "-O2 -fdump-tree-vrp3-details" } */
+/* { dg-final { scan-tree-dump-times "gimple_simplified to" 4 "vrp3" } } */
void v1 (unsigned long *in, unsigned long *out, unsigned int n)
{
@@ -12,7 +12,7 @@ subroutine test1(a)
DO i=1, 8, 1
call dummy(a(i))
ENDDO
-! { dg-final { scan-tree-dump "12:.*: loop with 8 iterations completely unrolled" "cunrolli" } } */
+! { dg-final { scan-tree-dump "12:.*: loop with 8 iterations completely unrolled" "cunrolli1" } } */
end subroutine test1
subroutine test2(a, n)
@@ -25,5 +25,5 @@ subroutine test2(a, n)
ENDDO
end subroutine test2
-! { dg-final { scan-tree-dump "Not unrolling loop .: user didn't want it unrolled completely" "cunrolli" } } */
+! { dg-final { scan-tree-dump "Not unrolling loop .: user didn't want it unrolled completely" "cunrolli1" } } */
! { dg-final { scan-rtl-dump-times "Not unrolling loop, user didn't want it unrolled" 2 "loop2_unroll" } } */
@@ -37,6 +37,6 @@ end program main
! Check the offloaded function's classification and compute dimensions (will
! always be 1 x 1 x 1 for non-offloading compilation).
-! { dg-final { scan-tree-dump-times "(?n)Function is unparallelized OpenACC kernels offload" 1 "oaccdevlow" } }
-! { dg-final { scan-tree-dump-times "(?n)Compute dimensions \\\[1, 1, 1\\\]" 1 "oaccdevlow" } }
-! { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(1, 1, 1\\), oacc kernels, omp target entrypoint\\)\\)" 1 "oaccdevlow" } }
+! { dg-final { scan-tree-dump-times "(?n)Function is unparallelized OpenACC kernels offload" 1 "oaccdevlow1" } }
+! { dg-final { scan-tree-dump-times "(?n)Compute dimensions \\\[1, 1, 1\\\]" 1 "oaccdevlow1" } }
+! { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(1, 1, 1\\), oacc kernels, omp target entrypoint\\)\\)" 1 "oaccdevlow1" } }
@@ -33,6 +33,6 @@ end program main
! Check the offloaded function's classification and compute dimensions (will
! always be 1 x 1 x 1 for non-offloading compilation).
-! { dg-final { scan-tree-dump-times "(?n)Function is parallelized OpenACC kernels offload" 1 "oaccdevlow" } }
-! { dg-final { scan-tree-dump-times "(?n)Compute dimensions \\\[1, 1, 1\\\]" 1 "oaccdevlow" } }
-! { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(1, 1, 1\\), oacc kernels parallelized, oacc function \\(, , \\), oacc kernels, omp target entrypoint\\)\\)" 1 "oaccdevlow" } }
+! { dg-final { scan-tree-dump-times "(?n)Function is parallelized OpenACC kernels offload" 1 "oaccdevlow1" } }
+! { dg-final { scan-tree-dump-times "(?n)Compute dimensions \\\[1, 1, 1\\\]" 1 "oaccdevlow1" } }
+! { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(1, 1, 1\\), oacc kernels parallelized, oacc function \\(, , \\), oacc kernels, omp target entrypoint\\)\\)" 1 "oaccdevlow1" } }
@@ -26,6 +26,6 @@ end program main
! Check the offloaded function's classification and compute dimensions (will
! always be 1 x 1 x 1 for non-offloading compilation).
-! { dg-final { scan-tree-dump-times "(?n)Function is OpenACC parallel offload" 1 "oaccdevlow" } }
-! { dg-final { scan-tree-dump-times "(?n)Compute dimensions \\\[1, 1, 1\\\]" 1 "oaccdevlow" } }
-! { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(1, 1, 1\\), omp target entrypoint\\)\\)" 1 "oaccdevlow" } }
+! { dg-final { scan-tree-dump-times "(?n)Function is OpenACC parallel offload" 1 "oaccdevlow1" } }
+! { dg-final { scan-tree-dump-times "(?n)Compute dimensions \\\[1, 1, 1\\\]" 1 "oaccdevlow1" } }
+! { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(1, 1, 1\\), omp target entrypoint\\)\\)" 1 "oaccdevlow1" } }
@@ -25,6 +25,6 @@ end subroutine ROUTINE
! Check the offloaded function's classification and compute dimensions (will
! always be 1 x 1 x 1 for non-offloading compilation).
-! { dg-final { scan-tree-dump-times "(?n)Function is OpenACC routine level 1" 1 "oaccdevlow" } }
-! { dg-final { scan-tree-dump-times "(?n)Compute dimensions \\\[1, 1, 1\\\]" 1 "oaccdevlow" } }
-! { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(0 1, 1 1, 1 1\\), omp declare target \\(worker\\)\\)\\)" 1 "oaccdevlow" } }
+! { dg-final { scan-tree-dump-times "(?n)Function is OpenACC routine level 1" 1 "oaccdevlow1" } }
+! { dg-final { scan-tree-dump-times "(?n)Compute dimensions \\\[1, 1, 1\\\]" 1 "oaccdevlow1" } }
+! { dg-final { scan-tree-dump-times "(?n)__attribute__\\(\\(oacc function \\(0 1, 1 1, 1 1\\), omp declare target \\(worker\\)\\)\\)" 1 "oaccdevlow1" } }
@@ -23,5 +23,5 @@ package body Unroll1 is
end Unroll1;
--- { dg-final { scan-tree-dump-times "Not unrolling loop .: user didn't want it unrolled completely" 2 "cunrolli" } }
+-- { dg-final { scan-tree-dump-times "Not unrolling loop .: user didn't want it unrolled completely" 2 "cunrolli1" } }
-- { dg-final { scan-rtl-dump-times "Not unrolling loop, user didn't want it unrolled" 2 "loop2_unroll" } }
@@ -23,4 +23,4 @@ package body Unroll2 is
end Unroll2;
--- { dg-final { scan-tree-dump-times "loop with 3 iterations completely unrolled" 2 "cunrolli" } }
+-- { dg-final { scan-tree-dump-times "loop with 3 iterations completely unrolled" 2 "cunrolli1" } }
@@ -478,6 +478,7 @@ extern gimple_opt_pass *make_pass_vtable_verify (gcc::context *ctxt);
extern gimple_opt_pass *make_pass_ubsan (gcc::context *ctxt);
extern gimple_opt_pass *make_pass_sanopt (gcc::context *ctxt);
extern gimple_opt_pass *make_pass_oacc_kernels (gcc::context *ctxt);
+extern gimple_opt_pass *make_pass_oacc_functions (gcc::context *ctxt);
extern simple_ipa_opt_pass *make_pass_ipa_oacc (gcc::context *ctxt);
extern simple_ipa_opt_pass *make_pass_ipa_oacc_kernels (gcc::context *ctxt);
extern gimple_opt_pass *make_pass_warn_nonnull_compare (gcc::context *ctxt);
@@ -1587,6 +1587,7 @@ public:
/* opt_pass methods: */
virtual unsigned int execute (function *);
+ opt_pass * clone () { return new pass_complete_unroll (m_ctxt); }
}; // class pass_complete_unroll
@@ -1646,6 +1647,7 @@ public:
/* opt_pass methods: */
virtual bool gate (function *) { return optimize >= 2; }
virtual unsigned int execute (function *);
+ opt_pass * clone () { return new pass_complete_unrolli (m_ctxt); }
}; // class pass_complete_unrolli
@@ -70,6 +70,9 @@ public:
virtual bool gate (function *) { return flag_tree_loop_optimize; }
virtual unsigned int execute (function *fn);
+
+ opt_pass * clone () { return new pass_fix_loops (m_ctxt); }
+
}; // class pass_fix_loops
unsigned int
@@ -202,6 +205,53 @@ make_pass_oacc_kernels (gcc::context *ctxt)
return new pass_oacc_kernels (ctxt);
}
+/* A superpass that runs only on OpenACC functions. */
+
+namespace {
+
+const pass_data pass_data_oacc_functions =
+{
+ GIMPLE_PASS, /* type */
+ "*oacc_functions", /* name */
+ OPTGROUP_LOOP, /* optinfo_flags */
+ TV_TREE_LOOP, /* tv_id */
+ 0, /* properties_required */
+ 0, /* properties_provided */
+ 0, /* properties_destroyed */
+ 0, /* todo_flags_start */
+ 0, /* todo_flags_finish */
+};
+
+class pass_oacc_functions: public gimple_opt_pass
+{
+public:
+ pass_oacc_functions (gcc::context *ctxt)
+ : gimple_opt_pass (pass_data_oacc_functions, ctxt)
+ {}
+
+ /* opt_pass methods: */
+ virtual bool gate (function *fn) {
+ if (!flag_openacc)
+ return false;
+
+ if (!oacc_get_fn_attrib (fn->decl))
+ return false;
+
+ return true;
+ }
+
+}; // class pass_oacc_functions
+
+} // anon namespace
+
+gimple_opt_pass *
+make_pass_oacc_functions (gcc::context *ctxt)
+{
+ return new pass_oacc_functions (ctxt);
+}
+
+
+
/* The ipa oacc superpass. */
namespace {
@@ -344,6 +394,8 @@ public:
/* opt_pass methods: */
virtual unsigned int execute (function *);
+ opt_pass * clone () { return new pass_tree_loop_init (m_ctxt); }
+
}; // class pass_tree_loop_init
unsigned int
@@ -558,6 +610,8 @@ public:
/* opt_pass methods: */
virtual unsigned int execute (function *) { return tree_ssa_loop_done (); }
+ opt_pass * clone () { return new pass_tree_loop_done (m_ctxt); }
+
}; // class pass_tree_loop_done
} // anon namespace
@@ -479,6 +479,8 @@ public:
virtual bool gate (function *) { return flag_tree_phiprop; }
virtual unsigned int execute (function *);
+ opt_pass * clone () { return new pass_phiprop (m_ctxt); }
+
}; // class pass_phiprop
unsigned int
@@ -816,6 +816,8 @@ public:
virtual bool gate (function *) { return flag_tree_sink != 0; }
virtual unsigned int execute (function *);
+ opt_pass * clone () { return new pass_sink_code (m_ctxt); }
+
}; // class pass_sink_code
unsigned int
@@ -1,5 +1,5 @@
/* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-cddce2 -ffinite-loops" } */
+/* { dg-options "-O2 -fdump-tree-cddce3 -ffinite-loops" } */
int
f1 (void)
@@ -28,4 +28,4 @@ f2 (void)
return i + j;
}
-/* { dg-final { scan-tree-dump-not "if" "cddce2"} } */
+/* { dg-final { scan-tree-dump-not "if" "cddce3"} } */
@@ -7,5 +7,5 @@
#include "pr85486.c"
-/* { dg-final { scan-offload-tree-dump "__attribute__\\(\\(oacc function \\(1, 1, 32\\)" "oaccdevlow" } } */
+/* { dg-final { scan-offload-tree-dump "__attribute__\\(\\(oacc function \\(1, 1, 32\\)" "oaccdevlow{2,3}" } } */
/* { dg-output "nvptx_exec: kernel main\\\$_omp_fn\\\$0: launch gangs=1, workers=1, vectors=32" } */
@@ -7,5 +7,5 @@
#include "pr85486.c"
-/* { dg-final { scan-offload-tree-dump "__attribute__\\(\\(oacc function \\(1, 1, 32\\)" "oaccdevlow" } } */
+/* { dg-final { scan-offload-tree-dump "__attribute__\\(\\(oacc function \\(1, 1, 32\\)" "oaccdevlow{2,3}" } } */
/* { dg-output "nvptx_exec: kernel main\\\$_omp_fn\\\$0: launch gangs=1, workers=1, vectors=32" } */
@@ -54,5 +54,5 @@ main (void)
return 0;
}
-/* { dg-final { scan-offload-tree-dump "__attribute__\\(\\(oacc function \\(1, 1, 32\\)" "oaccdevlow" } } */
+/* { dg-final { scan-offload-tree-dump "__attribute__\\(\\(oacc function \\(1, 1, 32\\)" "oaccdevlow{2,3}" } } */
/* { dg-output "nvptx_exec: kernel main\\\$_omp_fn\\\$0: launch gangs=1, workers=1, vectors=32" } */
@@ -34,5 +34,5 @@ main (void)
return 0;
}
-/* { dg-final { scan-offload-tree-dump "__attribute__\\(\\(oacc function \\(1, 1, 128\\)" "oaccdevlow" } } */
+/* { dg-final { scan-offload-tree-dump "__attribute__\\(\\(oacc function \\(1, 1, 128\\)" "oaccdevlow*" } } */
/* { dg-output "nvptx_exec: kernel main\\\$_omp_fn\\\$0: launch gangs=1, workers=1, vectors=128" } */
@@ -35,5 +35,5 @@ main (void)
return 0;
}
-/* { dg-final { scan-offload-tree-dump "__attribute__\\(\\(oacc function \\(1, 1, 128\\)" "oaccdevlow" } } */
+/* { dg-final { scan-offload-tree-dump "__attribute__\\(\\(oacc function \\(1, 1, 128\\)" "oaccdevlow*" } } */
/* { dg-output "nvptx_exec: kernel main\\\$_omp_fn\\\$0: launch gangs=1, workers=1, vectors=128" } */
@@ -38,5 +38,5 @@ main (void)
return 0;
}
-/* { dg-final { scan-offload-tree-dump "__attribute__\\(\\(oacc function \\(1, 1, 32\\)" "oaccdevlow" } } */
+/* { dg-final { scan-offload-tree-dump "__attribute__\\(\\(oacc function \\(1, 1, 32\\)" "oaccdevlow*" } } */
/* { dg-output "nvptx_exec: kernel main\\\$_omp_fn\\\$0: launch gangs=1, workers=1, vectors=32" } */
@@ -36,5 +36,5 @@ main (void)
return 0;
}
-/* { dg-final { scan-offload-tree-dump "__attribute__\\(\\(oacc function \\(1, 2, 128\\)" "oaccdevlow" } } */
+/* { dg-final { scan-offload-tree-dump "__attribute__\\(\\(oacc function \\(1, 2, 128\\)" "oaccdevlow*" } } */
/* { dg-output "nvptx_exec: kernel main\\\$_omp_fn\\\$0: launch gangs=1, workers=2, vectors=128" } */
@@ -37,5 +37,5 @@ main (void)
return 0;
}
-/* { dg-final { scan-offload-tree-dump "__attribute__\\(\\(oacc function \\(1, 2, 128\\)" "oaccdevlow" } } */
+/* { dg-final { scan-offload-tree-dump "__attribute__\\(\\(oacc function \\(1, 2, 128\\)" "oaccdevlow*" } } */
/* { dg-output "nvptx_exec: kernel main\\\$_omp_fn\\\$0: launch gangs=1, workers=2, vectors=128" } */
@@ -37,5 +37,5 @@ main (void)
return 0;
}
-/* { dg-final { scan-offload-tree-dump "__attribute__\\(\\(oacc function \\(1, 0, 128\\)" "oaccdevlow" } } */
+/* { dg-final { scan-offload-tree-dump "__attribute__\\(\\(oacc function \\(1, 0, 128\\)" "oaccdevlow*" } } */
/* { dg-output "nvptx_exec: kernel main\\\$_omp_fn\\\$0: launch gangs=1, workers=2, vectors=128" } */
@@ -36,5 +36,5 @@ main (void)
return 0;
}
-/* { dg-final { scan-offload-tree-dump "__attribute__\\(\\(oacc function \\(1, 0, 128\\)" "oaccdevlow" } } */
+/* { dg-final { scan-offload-tree-dump "__attribute__\\(\\(oacc function \\(1, 0, 128\\)" "oaccdevlow*" } } */
/* { dg-output "nvptx_exec: kernel main\\\$_omp_fn\\\$0: launch gangs=1, workers=8, vectors=128" } */
--
2.17.1
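
Note on the dump-name changes in the testsuite hunks above: most of them follow from the new clone () methods. As far as I understand the pass manager, a pass that is scheduled only once dumps under its plain name ("sink"). Once a second instance exists, the instances are numbered ("sink1", "sink2", ...), and further instances are created through clone (). The following is a minimal, self-contained sketch of that behaviour. It is not GCC code: toy_pass, toy_sink and schedule are hypothetical names invented for this illustration, and the numbering rule is a simplification.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Toy stand-in for a compiler pass (hypothetical, not GCC's opt_pass).
struct toy_pass
{
  std::string name;
  int instance_number = 0; // 0 means "only instance", otherwise 1-based

  explicit toy_pass (std::string n) : name (std::move (n)) {}
  virtual ~toy_pass () = default;

  // By default a pass cannot be duplicated; clonable passes override this.
  virtual std::unique_ptr<toy_pass> clone () const { return nullptr; }

  // Suffix used for the dump file of this instance.
  std::string dump_suffix () const
  {
    return instance_number == 0 ? name : name + std::to_string (instance_number);
  }
};

// A pass that supports multiple instances by providing clone ().
struct toy_sink : toy_pass
{
  toy_sink () : toy_pass ("sink") {}
  std::unique_ptr<toy_pass> clone () const override
  {
    return std::make_unique<toy_sink> ();
  }
};

// Schedule COPIES instances of PROTOTYPE and return their dump suffixes.
// A single instance keeps the plain name; several instances are numbered
// from 1 and each one is obtained through clone ().
std::vector<std::string>
schedule (const toy_pass &prototype, int copies)
{
  assert (copies >= 1);
  if (copies == 1)
    return {prototype.name};

  std::vector<std::string> suffixes;
  for (int i = 1; i <= copies; ++i)
    {
      std::unique_ptr<toy_pass> instance = prototype.clone ();
      if (!instance)
        throw std::logic_error (prototype.name + " has no clone ()");
      instance->instance_number = i;
      suffixes.push_back (instance->dump_suffix ());
    }
  return suffixes;
}
```

In this sketch, scheduling toy_sink once yields the suffix "sink", scheduling it twice yields "sink1" and "sink2", and asking for two instances of a pass without a clone () override fails. This matches the shape of the test updates in this patch, where for example "sink" becomes "sink1" and "vrp2" becomes "vrp3".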