Message ID | 87bn7v4b0m.fsf@kepler.schwinge.homeip.net |
---|---|
State | New |
Hi! Will this patch be acceptable for GCC trunk in the current development stage? In its current incarnation, this patch depends on my 'Un-parallelized OpenACC kernels constructs with nvptx offloading: "avoid offloading"' patch, <http://news.gmane.org/find-root.php?message_id=%3C87zivg8rcy.fsf%40hertz.schwinge.homeip.net%3E>, which Bernd suggested "has to be considered after gcc-6". So, I'll have to re-work this patch here, hence I'm first checking if it generally meets approval? On Fri, 5 Feb 2016 13:06:17 +0100, I wrote: > On Mon, 9 Nov 2015 18:39:19 +0100, Tom de Vries <Tom_deVries@mentor.com> wrote: > > On 09/11/15 16:35, Tom de Vries wrote: > > > this patch series for stage1 trunk adds support to: > > > - parallelize oacc kernels regions using parloops, and > > > - map the loops onto the oacc gang dimension. > > > Atm, the parallelization behaviour for the kernels region is controlled > > by flag_tree_parallelize_loops, which is also used to control generic > > auto-parallelization by autopar using omp. That is not ideal, and we may > > want a separate flag (or param) to control the behaviour for oacc > > kernels, f.i. -foacc-kernels-gang-parallelize=<n>. I'm open to suggestions. > > I suggest to use plain -fopenacc to enable OpenACC kernels processing > (which just makes sense, I hope) ;-) and have later processing stages > determine the actual parametrization (currently: number of gangs) (that > is, Nathan's recent "Default compute dimensions" patches). > > The code changes are simple enough; OK for trunk? (This patch depends on > my 'Un-parallelized OpenACC kernels constructs with nvptx offloading: > "avoid offloading"' pending review, > <http://news.gmane.org/find-root.php?message_id=%3C87zivg8rcy.fsf%40hertz.schwinge.homeip.net%3E>.) > > Originally, I want to use: > > OMP_CLAUSE_NUM_GANGS_EXPR (clause) = build_int_cst (integer_type_node, n_threads == 0 ? -1 : n_threads); > > ... 
to store -1 "have the compiler decidew" (instead of now 0 "have the > run-time decide", which might prevent some code optimizations, as I > understand it) for the n_threads == 0 case, but it seems that for an > offloaded OpenACC kernels region, gcc/omp-low.c:oacc_validate_dims is > called with the parameter "used" set to 0 instead of "gang", and then the > "Default anything left to 1 or a partitioned default" logic will default > dims["gang"] to oacc_min_dims["gang"] (that is, 1) instead of the > oacc_default_dims["gang"] (that is, 32). Nathan, does that smell like a > bug (and could you look into that)? > > diff --git gcc/tree-parloops.c gcc/tree-parloops.c > index 139e38c..e498e5b 100644 > --- gcc/tree-parloops.c > +++ gcc/tree-parloops.c > @@ -2016,7 +2016,8 @@ transform_to_exit_first_loop (struct loop *loop, > /* Create the parallel constructs for LOOP as described in gen_parallel_loop. > LOOP_FN and DATA are the arguments of GIMPLE_OMP_PARALLEL. > NEW_DATA is the variable that should be initialized from the argument > - of LOOP_FN. N_THREADS is the requested number of threads. */ > + of LOOP_FN. N_THREADS is the requested number of threads, which can be 0 if > + that number is to be determined later. */ > > static void > create_parallel_loop (struct loop *loop, tree loop_fn, tree data, > @@ -2049,6 +2050,7 @@ create_parallel_loop (struct loop *loop, tree loop_fn, tree data, > basic_block paral_bb = single_pred (bb); > gsi = gsi_last_bb (paral_bb); > > + gcc_checking_assert (n_threads != 0); > t = build_omp_clause (loc, OMP_CLAUSE_NUM_THREADS); > OMP_CLAUSE_NUM_THREADS_EXPR (t) > = build_int_cst (integer_type_node, n_threads); > @@ -2221,7 +2223,8 @@ create_parallel_loop (struct loop *loop, tree loop_fn, tree data, > } > > /* Generates code to execute the iterations of LOOP in N_THREADS > - threads in parallel. > + threads in parallel, which can be 0 if that number is to be determined > + later. > > NITER describes number of iterations of LOOP. 
> REDUCTION_LIST describes the reductions existent in the LOOP. */ > @@ -2318,6 +2321,7 @@ gen_parallel_loop (struct loop *loop, > else > m_p_thread=MIN_PER_THREAD; > > + gcc_checking_assert (n_threads != 0); > many_iterations_cond = > fold_build2 (GE_EXPR, boolean_type_node, > nit, build_int_cst (type, m_p_thread * n_threads)); > @@ -3177,7 +3181,7 @@ oacc_entry_exit_ok (struct loop *loop, > static bool > parallelize_loops (bool oacc_kernels_p) > { > - unsigned n_threads = flag_tree_parallelize_loops; > + unsigned n_threads; > bool changed = false; > struct loop *loop; > struct loop *skip_loop = NULL; > @@ -3199,6 +3203,13 @@ parallelize_loops (bool oacc_kernels_p) > if (cfun->has_nonlocal_label) > return false; > > + /* For OpenACC kernels, n_threads will be determined later; otherwise, it's > + the argument to -ftree-parallelize-loops. */ > + if (oacc_kernels_p) > + n_threads = 0; > + else > + n_threads = flag_tree_parallelize_loops; > + > gcc_obstack_init (&parloop_obstack); > reduction_info_table_type reduction_list (10); > > @@ -3361,7 +3372,13 @@ public: > {} > > /* opt_pass methods: */ > - virtual bool gate (function *) { return flag_tree_parallelize_loops > 1; } > + virtual bool gate (function *) > + { > + if (oacc_kernels_p) > + return flag_openacc; > + else > + return flag_tree_parallelize_loops > 1; > + } > virtual unsigned int execute (function *); > opt_pass * clone () { return new pass_parallelize_loops (m_ctxt); } > void set_pass_param (unsigned int n, bool param) > diff --git gcc/tree-ssa-loop.c gcc/tree-ssa-loop.c > index bdbade5..4c39fbc 100644 > --- gcc/tree-ssa-loop.c > +++ gcc/tree-ssa-loop.c > @@ -148,7 +148,7 @@ make_pass_tree_loop (gcc::context *ctxt) > static bool > gate_oacc_kernels (function *fn) > { > - if (flag_tree_parallelize_loops <= 1) > + if (!flag_openacc) > return false; > > tree oacc_function_attr = get_oacc_fn_attrib (fn->decl); > @@ -230,10 +230,9 @@ public: > virtual bool gate (function *) > { > return (optimize > - /* Don't 
bother doing anything if the program has errors. */ > - && !seen_error () > && flag_openacc > - && flag_tree_parallelize_loops > 1); > + /* Don't bother doing anything if the program has errors. */ > + && !seen_error ()); > } > > }; // class pass_ipa_oacc > diff --git gcc/config/nvptx/nvptx.c gcc/config/nvptx/nvptx.c > index fe28154..2fd3d52 100644 > --- gcc/config/nvptx/nvptx.c > +++ gcc/config/nvptx/nvptx.c > @@ -4140,7 +4140,7 @@ nvptx_goacc_validate_dims (tree decl, int dims[], int fn_level) > bool avoid_offloading_p = true; > for (unsigned ix = 0; ix != GOMP_DIM_MAX; ix++) > { > - if (dims[ix] > 1) > + if (dims[ix] > 1 || dims[ix] == 0) > { > avoid_offloading_p = false; > break; > diff --git libgomp/oacc-parallel.c libgomp/oacc-parallel.c > index bc24651..f795bf7 100644 > --- libgomp/oacc-parallel.c > +++ libgomp/oacc-parallel.c > @@ -103,6 +103,10 @@ GOACC_parallel_keyed (int device, void (*fn) (void *), > return; > } > > + /* Default: let the runtime choose. */ > + for (i = 0; i != GOMP_DIM_MAX; i++) > + dims[i] = 0; > + > va_start (ap, kinds); > /* TODO: This will need amending when device_type is implemented. */ > while ((tag = va_arg (ap, unsigned)) != 0) > diff --git libgomp/plugin/plugin-nvptx.c libgomp/plugin/plugin-nvptx.c > index 7ec1810..3f1bb6d 100644 > --- libgomp/plugin/plugin-nvptx.c > +++ libgomp/plugin/plugin-nvptx.c > @@ -894,9 +894,21 @@ nvptx_exec (void (*fn), size_t mapnum, void **hostaddrs, void **devaddrs, > /* Initialize the launch dimensions. Typically this is constant, > provided by the device compiler, but we must permit runtime > values. 
*/ > - for (i = 0; i != 3; i++) > - if (targ_fn->launch->dim[i]) > - dims[i] = targ_fn->launch->dim[i]; > + int seen_zero = 0; > + for (i = 0; i != GOMP_DIM_MAX; i++) > + { > + if (targ_fn->launch->dim[i]) > + dims[i] = targ_fn->launch->dim[i]; > + if (!dims[i]) > + seen_zero = 1; > + } > + > + if (seen_zero) > + { > + for (i = 0; i != GOMP_DIM_MAX; i++) > + if (!dims[i]) > + dims[i] = /* TODO */ 32; > + } > > /* This reserves a chunk of a pre-allocated page of memory mapped on both > the host and the device. HP is a host pointer to the new chunk, and DP is > > The TODO in libgomp/plugin/plugin-nvptx.c:nvptx_exec will be resolved by > Nathan's "Default compute dimensions (runtime)", > <http://news.gmane.org/find-root.php?message_id=%3C56B21D23.5060209%40acm.org%3E>. > > The remainder is just "mechanical" updates to the test cases: > > diff --git gcc/testsuite/c-c++-common/goacc/kernels-counter-vars-function-scope.c gcc/testsuite/c-c++-common/goacc/kernels-counter-vars-function-scope.c > index e8b5357..17f240e 100644 > --- gcc/testsuite/c-c++-common/goacc/kernels-counter-vars-function-scope.c > +++ gcc/testsuite/c-c++-common/goacc/kernels-counter-vars-function-scope.c > @@ -1,5 +1,4 @@ > /* { dg-additional-options "-O2" } */ > -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ > /* { dg-additional-options "-fdump-tree-parloops1-all" } */ > /* { dg-additional-options "-fdump-tree-optimized" } */ > > @@ -51,4 +50,4 @@ main (void) > /* Check that the loop has been split off into a function. 
*/ > /* { dg-final { scan-tree-dump-times "(?n);; Function .*main._omp_fn.0" 1 "optimized" } } */ > > -/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(32," 1 "parloops1" } } */ > +/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 1 "parloops1" } } */ > diff --git gcc/testsuite/c-c++-common/goacc/kernels-double-reduction-n.c gcc/testsuite/c-c++-common/goacc/kernels-double-reduction-n.c > index c39d674..750f576 100644 > --- gcc/testsuite/c-c++-common/goacc/kernels-double-reduction-n.c > +++ gcc/testsuite/c-c++-common/goacc/kernels-double-reduction-n.c > @@ -1,5 +1,4 @@ > /* { dg-additional-options "-O2" } */ > -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ > /* { dg-additional-options "-fdump-tree-parloops1-all" } */ > /* { dg-additional-options "-fdump-tree-optimized" } */ > > @@ -34,4 +33,4 @@ foo (unsigned int n) > /* Check that the loop has been split off into a function. */ > /* { dg-final { scan-tree-dump-times "(?n);; Function .*foo.*._omp_fn.0" 1 "optimized" } } */ > > -/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(32," 1 "parloops1" } } */ > +/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 1 "parloops1" } } */ > diff --git gcc/testsuite/c-c++-common/goacc/kernels-double-reduction.c gcc/testsuite/c-c++-common/goacc/kernels-double-reduction.c > index 3501d0d..df60d6a 100644 > --- gcc/testsuite/c-c++-common/goacc/kernels-double-reduction.c > +++ gcc/testsuite/c-c++-common/goacc/kernels-double-reduction.c > @@ -1,5 +1,4 @@ > /* { dg-additional-options "-O2" } */ > -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ > /* { dg-additional-options "-fdump-tree-parloops1-all" } */ > /* { dg-additional-options "-fdump-tree-optimized" } */ > > @@ -34,4 +33,4 @@ foo (void) > /* Check that the loop has been split off into a function. 
*/ > /* { dg-final { scan-tree-dump-times "(?n);; Function .*foo.*._omp_fn.0" 1 "optimized" } } */ > > -/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(32," 1 "parloops1" } } */ > +/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 1 "parloops1" } } */ > diff --git gcc/testsuite/c-c++-common/goacc/kernels-loop-2.c gcc/testsuite/c-c++-common/goacc/kernels-loop-2.c > index f97584d..913d91f 100644 > --- gcc/testsuite/c-c++-common/goacc/kernels-loop-2.c > +++ gcc/testsuite/c-c++-common/goacc/kernels-loop-2.c > @@ -1,5 +1,4 @@ > /* { dg-additional-options "-O2" } */ > -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ > /* { dg-additional-options "-fdump-tree-parloops1-all" } */ > /* { dg-additional-options "-fdump-tree-optimized" } */ > > @@ -67,4 +66,4 @@ main (void) > /* { dg-final { scan-tree-dump-times "(?n);; Function .*main._omp_fn.1" 1 "optimized" } } */ > /* { dg-final { scan-tree-dump-times "(?n);; Function .*main._omp_fn.2" 1 "optimized" } } */ > > -/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(32," 3 "parloops1" } } */ > +/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 3 "parloops1" } } */ > diff --git gcc/testsuite/c-c++-common/goacc/kernels-loop-3.c gcc/testsuite/c-c++-common/goacc/kernels-loop-3.c > index 530d62a..1822d2a 100644 > --- gcc/testsuite/c-c++-common/goacc/kernels-loop-3.c > +++ gcc/testsuite/c-c++-common/goacc/kernels-loop-3.c > @@ -1,5 +1,4 @@ > /* { dg-additional-options "-O2" } */ > -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ > /* { dg-additional-options "-fdump-tree-parloops1-all" } */ > /* { dg-additional-options "-fdump-tree-optimized" } */ > > @@ -45,5 +44,4 @@ main (void) > /* Check that the loop has been split off into a function. 
*/ > /* { dg-final { scan-tree-dump-times "(?n);; Function .*main._omp_fn.0" 1 "optimized" } } */ > > -/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(32," 1 "parloops1" } } */ > - > +/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 1 "parloops1" } } */ > diff --git gcc/testsuite/c-c++-common/goacc/kernels-loop-g.c gcc/testsuite/c-c++-common/goacc/kernels-loop-g.c > index 4f1c2c5..e946319 100644 > --- gcc/testsuite/c-c++-common/goacc/kernels-loop-g.c > +++ gcc/testsuite/c-c++-common/goacc/kernels-loop-g.c > @@ -1,6 +1,5 @@ > /* { dg-additional-options "-O2" } */ > /* { dg-additional-options "-g" } */ > -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ > /* { dg-additional-options "-fdump-tree-parloops1-all" } */ > /* { dg-additional-options "-fdump-tree-optimized" } */ > > @@ -13,5 +12,4 @@ > /* Check that the loop has been split off into a function. */ > /* { dg-final { scan-tree-dump-times "(?n);; Function .*main._omp_fn.0" 1 "optimized" } } */ > > -/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(32," 1 "parloops1" } } */ > - > +/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 1 "parloops1" } } */ > diff --git gcc/testsuite/c-c++-common/goacc/kernels-loop-mod-not-zero.c gcc/testsuite/c-c++-common/goacc/kernels-loop-mod-not-zero.c > index 151db51..9b63b45 100644 > --- gcc/testsuite/c-c++-common/goacc/kernels-loop-mod-not-zero.c > +++ gcc/testsuite/c-c++-common/goacc/kernels-loop-mod-not-zero.c > @@ -1,5 +1,4 @@ > /* { dg-additional-options "-O2" } */ > -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ > /* { dg-additional-options "-fdump-tree-parloops1-all" } */ > /* { dg-additional-options "-fdump-tree-optimized" } */ > > @@ -49,4 +48,4 @@ main (void) > /* Check that the loop has been split off into a function. 
*/ > /* { dg-final { scan-tree-dump-times "(?n);; Function .*main._omp_fn.0" 1 "optimized" } } */ > > -/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(32," 1 "parloops1" } } */ > +/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 1 "parloops1" } } */ > diff --git gcc/testsuite/c-c++-common/goacc/kernels-loop-n.c gcc/testsuite/c-c++-common/goacc/kernels-loop-n.c > index bee5f5a..279f797 100644 > --- gcc/testsuite/c-c++-common/goacc/kernels-loop-n.c > +++ gcc/testsuite/c-c++-common/goacc/kernels-loop-n.c > @@ -1,5 +1,4 @@ > /* { dg-additional-options "-O2" } */ > -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ > /* { dg-additional-options "-fdump-tree-parloops1-all" } */ > /* { dg-additional-options "-fdump-tree-optimized" } */ > > @@ -52,5 +51,4 @@ foo (COUNTERTYPE n) > /* Check that the loop has been split off into a function. */ > /* { dg-final { scan-tree-dump-times "(?n);; Function .*foo.*._omp_fn.0" 1 "optimized" } } */ > > -/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(32," 1 "parloops1" } } */ > - > +/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 1 "parloops1" } } */ > diff --git gcc/testsuite/c-c++-common/goacc/kernels-loop-nest.c gcc/testsuite/c-c++-common/goacc/kernels-loop-nest.c > index ea0e342..db1071f 100644 > --- gcc/testsuite/c-c++-common/goacc/kernels-loop-nest.c > +++ gcc/testsuite/c-c++-common/goacc/kernels-loop-nest.c > @@ -1,5 +1,4 @@ > /* { dg-additional-options "-O2" } */ > -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ > /* { dg-additional-options "-fdump-tree-parloops1-all" } */ > /* { dg-additional-options "-fdump-tree-optimized" } */ > > @@ -36,4 +35,4 @@ main (void) > /* Check that the loop has been split off into a function. 
*/ > /* { dg-final { scan-tree-dump-times "(?n);; Function .*main._omp_fn.0" 1 "optimized" } } */ > > -/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(32," 1 "parloops1" } } */ > +/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 1 "parloops1" } } */ > diff --git gcc/testsuite/c-c++-common/goacc/kernels-loop.c gcc/testsuite/c-c++-common/goacc/kernels-loop.c > index ab5dfb9..abf7a3c 100644 > --- gcc/testsuite/c-c++-common/goacc/kernels-loop.c > +++ gcc/testsuite/c-c++-common/goacc/kernels-loop.c > @@ -1,5 +1,4 @@ > /* { dg-additional-options "-O2" } */ > -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ > /* { dg-additional-options "-fdump-tree-parloops1-all" } */ > /* { dg-additional-options "-fdump-tree-optimized" } */ > > @@ -52,5 +51,4 @@ main (void) > /* Check that the loop has been split off into a function. */ > /* { dg-final { scan-tree-dump-times "(?n);; Function .*main._omp_fn.0" 1 "optimized" } } */ > > -/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(32," 1 "parloops1" } } */ > - > +/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 1 "parloops1" } } */ > diff --git gcc/testsuite/c-c++-common/goacc/kernels-one-counter-var.c gcc/testsuite/c-c++-common/goacc/kernels-one-counter-var.c > index b16a8cd..95f4817 100644 > --- gcc/testsuite/c-c++-common/goacc/kernels-one-counter-var.c > +++ gcc/testsuite/c-c++-common/goacc/kernels-one-counter-var.c > @@ -1,5 +1,4 @@ > /* { dg-additional-options "-O2" } */ > -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ > /* { dg-additional-options "-fdump-tree-parloops1-all" } */ > /* { dg-additional-options "-fdump-tree-optimized" } */ > > @@ -50,5 +49,4 @@ main (void) > /* Check that the loop has been split off into a function. 
*/ > /* { dg-final { scan-tree-dump-times "(?n);; Function .*main._omp_fn.0" 1 "optimized" } } */ > > -/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(32," 1 "parloops1" } } */ > - > +/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 1 "parloops1" } } */ > diff --git gcc/testsuite/c-c++-common/goacc/kernels-reduction.c gcc/testsuite/c-c++-common/goacc/kernels-reduction.c > index 61c5df3..6f5a418 100644 > --- gcc/testsuite/c-c++-common/goacc/kernels-reduction.c > +++ gcc/testsuite/c-c++-common/goacc/kernels-reduction.c > @@ -1,5 +1,4 @@ > /* { dg-additional-options "-O2" } */ > -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ > /* { dg-additional-options "-fdump-tree-parloops1-all" } */ > /* { dg-additional-options "-fdump-tree-optimized" } */ > > @@ -32,5 +31,4 @@ foo (void) > /* Check that the loop has been split off into a function. */ > /* { dg-final { scan-tree-dump-times "(?n);; Function .*foo.*._omp_fn.0" 1 "optimized" } } */ > > -/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(32," 1 "parloops1" } } */ > - > +/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 1 "parloops1" } } */ > diff --git gcc/testsuite/gfortran.dg/goacc/kernels-loop-inner.f95 gcc/testsuite/gfortran.dg/goacc/kernels-loop-inner.f95 > index 4db3a50..3334741 100644 > --- gcc/testsuite/gfortran.dg/goacc/kernels-loop-inner.f95 > +++ gcc/testsuite/gfortran.dg/goacc/kernels-loop-inner.f95 > @@ -1,5 +1,4 @@ > ! { dg-additional-options "-O2" } > -! { dg-additional-options "-ftree-parallelize-loops=32" } > > program main > implicit none > diff --git gcc/testsuite/gfortran.dg/goacc/kernels-loops-adjacent.f95 gcc/testsuite/gfortran.dg/goacc/kernels-loops-adjacent.f95 > index fef3d10..fb92da8 100644 > --- gcc/testsuite/gfortran.dg/goacc/kernels-loops-adjacent.f95 > +++ gcc/testsuite/gfortran.dg/goacc/kernels-loops-adjacent.f95 > @@ -1,5 +1,4 @@ > ! { dg-additional-options "-O2" } > -! 
{ dg-additional-options "-ftree-parallelize-loops=10" } > > program main > implicit none > diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/avoid-offloading-1.c libgomp/testsuite/libgomp.oacc-c-c++-common/avoid-offloading-1.c > index 08745fc..366b4f5 100644 > --- libgomp/testsuite/libgomp.oacc-c-c++-common/avoid-offloading-1.c > +++ libgomp/testsuite/libgomp.oacc-c-c++-common/avoid-offloading-1.c > @@ -1,6 +1,5 @@ > /* Test that the compiler decides to "avoid offloading". */ > > -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ > /* The ACC_DEVICE_TYPE environment variable gets set in the testing > framework, and that overrides the "avoid offloading" flag at run time. > { dg-xfail-run-if "TODO" { openacc_nvidia_accel_selected } } */ > diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/avoid-offloading-2.c libgomp/testsuite/libgomp.oacc-c-c++-common/avoid-offloading-2.c > index 724228a..a63ec97 100644 > --- libgomp/testsuite/libgomp.oacc-c-c++-common/avoid-offloading-2.c > +++ libgomp/testsuite/libgomp.oacc-c-c++-common/avoid-offloading-2.c > @@ -1,8 +1,6 @@ > /* Test that a user can override the compiler's "avoid offloading" > decision at run time. */ > > -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ > - > #include <openacc.h> > > int main(void) > diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/avoid-offloading-3.c libgomp/testsuite/libgomp.oacc-c-c++-common/avoid-offloading-3.c > index 2fb5196..da01d02 100644 > --- libgomp/testsuite/libgomp.oacc-c-c++-common/avoid-offloading-3.c > +++ libgomp/testsuite/libgomp.oacc-c-c++-common/avoid-offloading-3.c > @@ -1,7 +1,6 @@ > /* Test that a user can override the compiler's "avoid offloading" > decision at compile time. */ > > -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ > /* Override the compiler's "avoid offloading" decision. 
> { dg-additional-options "-foffload-force" } */ > > diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/combined-directives-1.c libgomp/testsuite/libgomp.oacc-c-c++-common/combined-directives-1.c > index 87ca378..39899ab 100644 > --- libgomp/testsuite/libgomp.oacc-c-c++-common/combined-directives-1.c > +++ libgomp/testsuite/libgomp.oacc-c-c++-common/combined-directives-1.c > @@ -1,7 +1,5 @@ > /* This test exercises combined directives. */ > > -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ > - > #include <stdlib.h> > > int > diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/default-1.c libgomp/testsuite/libgomp.oacc-c-c++-common/default-1.c > index 8f0144c..31da8b1 100644 > --- libgomp/testsuite/libgomp.oacc-c-c++-common/default-1.c > +++ libgomp/testsuite/libgomp.oacc-c-c++-common/default-1.c > @@ -1,5 +1,3 @@ > -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ > - > #include <openacc.h> > > int test_parallel () > diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/host_data-1.c libgomp/testsuite/libgomp.oacc-c-c++-common/host_data-1.c > index 3ef6f9b..51745ba 100644 > --- libgomp/testsuite/libgomp.oacc-c-c++-common/host_data-1.c > +++ libgomp/testsuite/libgomp.oacc-c-c++-common/host_data-1.c > @@ -1,5 +1,4 @@ > /* { dg-do run { target openacc_nvidia_accel_selected } } */ > -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ > /* { dg-additional-options "-lcuda -lcublas -lcudart" } */ > > #include <stdlib.h> > diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-1.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-1.c > index 614ad33..588e864 100644 > --- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-1.c > +++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-1.c > @@ -1,5 +1,3 @@ > -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ > - > #include <stdlib.h> > > int i; > diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-2.c 
libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-2.c > index 13e57bd..c7592d6 100644 > --- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-2.c > +++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-2.c > @@ -1,6 +1,3 @@ > -/* { dg-do run } */ > -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ > - > #include <stdlib.h> > > #define N (1024 * 512) > diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-3.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-3.c > index f61a74a..31114ac 100644 > --- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-3.c > +++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-3.c > @@ -1,6 +1,3 @@ > -/* { dg-do run } */ > -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ > - > #include <stdlib.h> > > #define N (1024 * 512) > diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-2.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-2.c > index 5cdc200..3ffdfe2 100644 > --- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-2.c > +++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-2.c > @@ -1,5 +1,3 @@ > -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ > - > #include <stdlib.h> > > #define N 32 > diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-3.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-3.c > index 2e4d4d2..a554d66 100644 > --- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-3.c > +++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-3.c > @@ -1,5 +1,3 @@ > -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ > - > #include <stdlib.h> > > #define N 32 > diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-4.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-4.c > index 5bf00db..f0144b4 100644 > --- 
libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-4.c > +++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-4.c > @@ -1,5 +1,3 @@ > -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ > - > #include <stdlib.h> > > #define N 32 > diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-5.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-5.c > index d39b667..4719edd 100644 > --- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-5.c > +++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-5.c > @@ -1,5 +1,3 @@ > -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ > - > #include <stdlib.h> > > #define N 32 > diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-6.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-6.c > index bb2e85b..ca4f638 100644 > --- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-6.c > +++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-6.c > @@ -1,5 +1,3 @@ > -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ > - > #include <stdlib.h> > > #define N 32 > diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq.c > index e513827..d2fff38 100644 > --- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq.c > +++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq.c > @@ -1,5 +1,3 @@ > -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ > - > #include <stdlib.h> > > #define N 32 > diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-collapse.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-collapse.c > index c4791a4..0df4b3f 100644 > --- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-collapse.c > +++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-collapse.c > @@ -1,5 +1,3 @@ > -/* { 
dg-additional-options "-ftree-parallelize-loops=32" } */ > - > #include <stdlib.h> > > #define N 100 > diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-g.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-g.c > index 96b6e4e..88258be 100644 > --- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-g.c > +++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-g.c > @@ -1,5 +1,3 @@ > -/* { dg-do run } */ > -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ > /* { dg-additional-options "-g" } */ > > #include "kernels-loop.c" > diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-mod-not-zero.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-mod-not-zero.c > index 1433cb2..147ebb5 100644 > --- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-mod-not-zero.c > +++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-mod-not-zero.c > @@ -1,6 +1,3 @@ > -/* { dg-do run } */ > -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ > - > #include <stdlib.h> > > #define N ((1024 * 512) + 1) > diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-n.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-n.c > index fd0d5b1..9a3eaca 100644 > --- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-n.c > +++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-n.c > @@ -1,6 +1,3 @@ > -/* { dg-do run } */ > -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ > - > #include <stdlib.h> > > #define N ((1024 * 512) + 1) > diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-nest.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-nest.c > index 21d2599..28c725a 100644 > --- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-nest.c > +++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-nest.c > @@ -1,6 +1,3 @@ > -/* { dg-do run } */ > -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ > - > #include <stdlib.h> > 
> #define N 1000 > diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop.c > index 3762e5a..355123c 100644 > --- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop.c > +++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop.c > @@ -1,6 +1,3 @@ > -/* { dg-do run } */ > -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ > - > #include <stdlib.h> > > #define N (1024 * 512) > diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-reduction.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-reduction.c > index 511e25f..8647a94 100644 > --- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-reduction.c > +++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-reduction.c > @@ -1,6 +1,3 @@ > -/* { dg-do run } */ > -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ > - > #include <stdlib.h> > > #define n 10000 > diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/nested-2.c libgomp/testsuite/libgomp.oacc-c-c++-common/nested-2.c > index 94a5ae2..83cddb5 100644 > --- libgomp/testsuite/libgomp.oacc-c-c++-common/nested-2.c > +++ libgomp/testsuite/libgomp.oacc-c-c++-common/nested-2.c > @@ -1,5 +1,3 @@ > -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ > - > #include <stdlib.h> > > int > diff --git libgomp/testsuite/libgomp.oacc-fortran/avoid-offloading-1.f libgomp/testsuite/libgomp.oacc-fortran/avoid-offloading-1.f > index 5f18b94..ca5cd01 100644 > --- libgomp/testsuite/libgomp.oacc-fortran/avoid-offloading-1.f > +++ libgomp/testsuite/libgomp.oacc-fortran/avoid-offloading-1.f > @@ -2,7 +2,6 @@ > > ! { dg-do run } > ! { dg-additional-options "-cpp" } > -! { dg-additional-options "-ftree-parallelize-loops=32" } > ! The "avoid offloading" warning is only triggered for -O2 and higher. > ! { dg-xfail-if "n/a" { nvptx_offloading_configured } { "-O0" "-O1" } { "" } } > ! 
The ACC_DEVICE_TYPE environment variable gets set in the testing > diff --git libgomp/testsuite/libgomp.oacc-fortran/avoid-offloading-2.f libgomp/testsuite/libgomp.oacc-fortran/avoid-offloading-2.f > index 51801ad..6200b37 100644 > --- libgomp/testsuite/libgomp.oacc-fortran/avoid-offloading-2.f > +++ libgomp/testsuite/libgomp.oacc-fortran/avoid-offloading-2.f > @@ -3,7 +3,6 @@ > > ! { dg-do run } > ! { dg-additional-options "-cpp" } > -! { dg-additional-options "-ftree-parallelize-loops=32" } > ! The "avoid offloading" warning is only triggered for -O2 and higher. > ! { dg-xfail-if "n/a" { nvptx_offloading_configured } { "-O0" "-O1" } { "" } } > > diff --git libgomp/testsuite/libgomp.oacc-fortran/avoid-offloading-3.f libgomp/testsuite/libgomp.oacc-fortran/avoid-offloading-3.f > index bea6ab8..865d09f 100644 > --- libgomp/testsuite/libgomp.oacc-fortran/avoid-offloading-3.f > +++ libgomp/testsuite/libgomp.oacc-fortran/avoid-offloading-3.f > @@ -3,7 +3,6 @@ > > ! { dg-do run } > ! { dg-additional-options "-cpp" } > -! { dg-additional-options "-ftree-parallelize-loops=32" } > ! Override the compiler's "avoid offloading" decision. > ! { dg-additional-options "-foffload-force" } > > diff --git libgomp/testsuite/libgomp.oacc-fortran/combined-directives-1.f90 libgomp/testsuite/libgomp.oacc-fortran/combined-directives-1.f90 > index 4b52579..12ff36c 100644 > --- libgomp/testsuite/libgomp.oacc-fortran/combined-directives-1.f90 > +++ libgomp/testsuite/libgomp.oacc-fortran/combined-directives-1.f90 > @@ -1,7 +1,6 @@ > ! This test exercises combined directives. > > ! { dg-do run } > -! { dg-additional-options "-ftree-parallelize-loops=32" } > ! The "avoid offloading" warning is only triggered for -O2 and higher. > ! 
{ dg-xfail-if "n/a" { nvptx_offloading_configured } { "-O0" "-O1" } { "" } } > > diff --git libgomp/testsuite/libgomp.oacc-fortran/non-scalar-data.f90 libgomp/testsuite/libgomp.oacc-fortran/non-scalar-data.f90 > index b9298c7..0643e89 100644 > --- libgomp/testsuite/libgomp.oacc-fortran/non-scalar-data.f90 > +++ libgomp/testsuite/libgomp.oacc-fortran/non-scalar-data.f90 > @@ -2,7 +2,6 @@ > ! offloaded regions are properly mapped using present_or_copy. > > ! { dg-do run } > -! { dg-additional-options "-ftree-parallelize-loops=32" } > ! The "avoid offloading" warning is only triggered for -O2 and higher. > ! { dg-xfail-if "n/a" { nvptx_offloading_configured } { "-O0" "-O1" } { "" } } Grüße Thomas
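For readers following the thread, the "0 means the run-time decides" convention from the libgomp/plugin/plugin-nvptx.c hunk of this patch can be sketched standalone roughly as below. This is only an illustration, not the plugin code: `DIM_MAX`, `PLACEHOLDER_DEFAULT` and `resolve_dims` are hypothetical names standing in for `GOMP_DIM_MAX`, the "TODO 32" value and the loop inside `nvptx_exec`.

```c
/* Hypothetical stand-in for libgomp's GOMP_DIM_MAX (gang, worker, vector).  */
#define DIM_MAX 3
/* Hypothetical placeholder, mirroring the "TODO 32" value in the patch.  */
#define PLACEHOLDER_DEFAULT 32

/* Resolve launch dimensions.  A nonzero compile-time value in COMPILED
   overrides the run-time request in DIMS; any dimension that is still 0
   afterwards is replaced by the placeholder default.  */
static void
resolve_dims (const int compiled[DIM_MAX], int dims[DIM_MAX])
{
  int seen_zero = 0;

  for (int i = 0; i != DIM_MAX; i++)
    {
      if (compiled[i])
        dims[i] = compiled[i];
      if (!dims[i])
        seen_zero = 1;
    }

  if (seen_zero)
    for (int i = 0; i != DIM_MAX; i++)
      if (!dims[i])
        dims[i] = PLACEHOLDER_DEFAULT;
}
```

With compile-time values { 0, 1, 0 } and run-time values { 0, 0, 8 }, the first dimension falls back to the placeholder, the second takes the compile-time 1, and the third keeps the run-time 8.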
On 10/02/16 15:40, Thomas Schwinge wrote:
> Hi!
>
> Will this patch be acceptable for GCC trunk in the current development
> stage? In its current incarnation, this patch depends on my
> 'Un-parallelized OpenACC kernels constructs with nvptx offloading: "avoid
> offloading"' patch,
> <http://news.gmane.org/find-root.php?message_id=%3C87zivg8rcy.fsf%40hertz.schwinge.homeip.net%3E>,
> which Bernd suggested "has to be considered after gcc-6". So, I'll have
> to re-work this patch here, hence I'm first checking if it generally
> meets approval?
>
> On Fri, 5 Feb 2016 13:06:17 +0100, I wrote:
>> On Mon, 9 Nov 2015 18:39:19 +0100, Tom de Vries <Tom_deVries@mentor.com> wrote:
>>> On 09/11/15 16:35, Tom de Vries wrote:
>>>> this patch series for stage1 trunk adds support to:
>>>> - parallelize oacc kernels regions using parloops, and
>>>> - map the loops onto the oacc gang dimension.
>>
>>> Atm, the parallelization behaviour for the kernels region is controlled
>>> by flag_tree_parallelize_loops, which is also used to control generic
>>> auto-parallelization by autopar using omp. That is not ideal, and we may
>>> want a separate flag (or param) to control the behaviour for oacc
>>> kernels, f.i. -foacc-kernels-gang-parallelize=<n>. I'm open to suggestions.
>>
>> I suggest to use plain -fopenacc to enable OpenACC kernels processing
>> (which just makes sense, I hope) ;-) and have later processing stages
>> determine the actual parametrization (currently: number of gangs) (that
>> is, Nathan's recent "Default compute dimensions" patches).
>>

Hi Thomas,

That makes a lot of sense. Thanks for working on this.

>> The code changes are simple enough; OK for trunk? (This patch depends on
>> my 'Un-parallelized OpenACC kernels constructs with nvptx offloading:
>> "avoid offloading"' pending review,
>> <http://news.gmane.org/find-root.php?message_id=%3C87zivg8rcy.fsf%40hertz.schwinge.homeip.net%3E>.)
>>
>> Originally, I wanted to use:
>>
>> OMP_CLAUSE_NUM_GANGS_EXPR (clause) = build_int_cst (integer_type_node, n_threads == 0 ? -1 : n_threads);
>>
>> ... to store -1 "have the compiler decide" (instead of now 0 "have the
>> run-time decide", which might prevent some code optimizations, as I
>> understand it) for the n_threads == 0 case, but it seems that for an
>> offloaded OpenACC kernels region, gcc/omp-low.c:oacc_validate_dims is
>> called with the parameter "used" set to 0 instead of "gang", and then the
>> "Default anything left to 1 or a partitioned default" logic will default
>> dims["gang"] to oacc_min_dims["gang"] (that is, 1) instead of the
>> oacc_default_dims["gang"] (that is, 32). Nathan, does that smell like a
>> bug (and could you look into that)?
>>
>> diff --git gcc/tree-parloops.c gcc/tree-parloops.c
>> index 139e38c..e498e5b 100644
>> --- gcc/tree-parloops.c
>> +++ gcc/tree-parloops.c
>> @@ -2016,7 +2016,8 @@ transform_to_exit_first_loop (struct loop *loop,
>> /* Create the parallel constructs for LOOP as described in gen_parallel_loop.
>> LOOP_FN and DATA are the arguments of GIMPLE_OMP_PARALLEL.
>> NEW_DATA is the variable that should be initialized from the argument
>> - of LOOP_FN. N_THREADS is the requested number of threads. */
>> + of LOOP_FN. N_THREADS is the requested number of threads, which can be 0 if
>> + that number is to be determined later.
*/
>>
>> static void
>> create_parallel_loop (struct loop *loop, tree loop_fn, tree data,
>> @@ -2049,6 +2050,7 @@ create_parallel_loop (struct loop *loop, tree loop_fn, tree data,
>> basic_block paral_bb = single_pred (bb);
>> gsi = gsi_last_bb (paral_bb);
>>
>> + gcc_checking_assert (n_threads != 0);
>> t = build_omp_clause (loc, OMP_CLAUSE_NUM_THREADS);
>> OMP_CLAUSE_NUM_THREADS_EXPR (t)
>> = build_int_cst (integer_type_node, n_threads);
>> @@ -2221,7 +2223,8 @@ create_parallel_loop (struct loop *loop, tree loop_fn, tree data,
>> }
>>
>> /* Generates code to execute the iterations of LOOP in N_THREADS
>> - threads in parallel.
>> + threads in parallel, which can be 0 if that number is to be determined
>> + later.
>>
>> NITER describes number of iterations of LOOP.
>> REDUCTION_LIST describes the reductions existent in the LOOP. */
>> @@ -2318,6 +2321,7 @@ gen_parallel_loop (struct loop *loop,
>> else
>> m_p_thread=MIN_PER_THREAD;
>>
>> + gcc_checking_assert (n_threads != 0);
>> many_iterations_cond =
>> fold_build2 (GE_EXPR, boolean_type_node,
>> nit, build_int_cst (type, m_p_thread * n_threads));
>> @@ -3177,7 +3181,7 @@ oacc_entry_exit_ok (struct loop *loop,
>> static bool
>> parallelize_loops (bool oacc_kernels_p)
>> {
>> - unsigned n_threads = flag_tree_parallelize_loops;
>> + unsigned n_threads;
>> bool changed = false;
>> struct loop *loop;
>> struct loop *skip_loop = NULL;
>> @@ -3199,6 +3203,13 @@ parallelize_loops (bool oacc_kernels_p)
>> if (cfun->has_nonlocal_label)
>> return false;
>>
>> + /* For OpenACC kernels, n_threads will be determined later; otherwise, it's
>> + the argument to -ftree-parallelize-loops.
*/
>> + if (oacc_kernels_p)
>> + n_threads = 0;
>> + else
>> + n_threads = flag_tree_parallelize_loops;
>> +
>> gcc_obstack_init (&parloop_obstack);
>> reduction_info_table_type reduction_list (10);
>>
>> @@ -3361,7 +3372,13 @@ public:
>> {}
>>
>> /* opt_pass methods: */
>> - virtual bool gate (function *) { return flag_tree_parallelize_loops > 1; }
>> + virtual bool gate (function *)
>> + {
>> + if (oacc_kernels_p)
>> + return flag_openacc;
>> + else
>> + return flag_tree_parallelize_loops > 1;
>> + }

I wouldn't mind using the ternary expression here, but I suppose that's a taste thing.

>> virtual unsigned int execute (function *);
>> opt_pass * clone () { return new pass_parallelize_loops (m_ctxt); }
>> void set_pass_param (unsigned int n, bool param)

The oacc-parloops changes look good to me. I approve them for 6.0 stage 4 (given that using the -ftree-parallelize-loops=<n> flag for oacc kernels parallelization was just a placeholder waiting to be replaced by an oacc-based approach).

[ And I'd expect that the tree-ssa-loop.c changes and the mechanical testsuite changes can be regarded as trivial. ]

Thanks,
- Tom

>> diff --git gcc/tree-ssa-loop.c gcc/tree-ssa-loop.c
>> index bdbade5..4c39fbc 100644
>> --- gcc/tree-ssa-loop.c
>> +++ gcc/tree-ssa-loop.c
>> @@ -148,7 +148,7 @@ make_pass_tree_loop (gcc::context *ctxt)
>> static bool
>> gate_oacc_kernels (function *fn)
>> {
>> - if (flag_tree_parallelize_loops <= 1)
>> + if (!flag_openacc)
>> return false;
>>
>> tree oacc_function_attr = get_oacc_fn_attrib (fn->decl);
>> @@ -230,10 +230,9 @@ public:
>> virtual bool gate (function *)
>> {
>> return (optimize
>> - /* Don't bother doing anything if the program has errors. */
>> - && !seen_error ()
>> && flag_openacc
>> - && flag_tree_parallelize_loops > 1);
>> + /* Don't bother doing anything if the program has errors.
*/ >> + && !seen_error ()); >> } >> >> }; // class pass_ipa_oacc >> diff --git gcc/config/nvptx/nvptx.c gcc/config/nvptx/nvptx.c >> index fe28154..2fd3d52 100644 >> --- gcc/config/nvptx/nvptx.c >> +++ gcc/config/nvptx/nvptx.c >> @@ -4140,7 +4140,7 @@ nvptx_goacc_validate_dims (tree decl, int dims[], int fn_level) >> bool avoid_offloading_p = true; >> for (unsigned ix = 0; ix != GOMP_DIM_MAX; ix++) >> { >> - if (dims[ix] > 1) >> + if (dims[ix] > 1 || dims[ix] == 0) >> { >> avoid_offloading_p = false; >> break; >> diff --git libgomp/oacc-parallel.c libgomp/oacc-parallel.c >> index bc24651..f795bf7 100644 >> --- libgomp/oacc-parallel.c >> +++ libgomp/oacc-parallel.c >> @@ -103,6 +103,10 @@ GOACC_parallel_keyed (int device, void (*fn) (void *), >> return; >> } >> >> + /* Default: let the runtime choose. */ >> + for (i = 0; i != GOMP_DIM_MAX; i++) >> + dims[i] = 0; >> + >> va_start (ap, kinds); >> /* TODO: This will need amending when device_type is implemented. */ >> while ((tag = va_arg (ap, unsigned)) != 0) >> diff --git libgomp/plugin/plugin-nvptx.c libgomp/plugin/plugin-nvptx.c >> index 7ec1810..3f1bb6d 100644 >> --- libgomp/plugin/plugin-nvptx.c >> +++ libgomp/plugin/plugin-nvptx.c >> @@ -894,9 +894,21 @@ nvptx_exec (void (*fn), size_t mapnum, void **hostaddrs, void **devaddrs, >> /* Initialize the launch dimensions. Typically this is constant, >> provided by the device compiler, but we must permit runtime >> values. */ >> - for (i = 0; i != 3; i++) >> - if (targ_fn->launch->dim[i]) >> - dims[i] = targ_fn->launch->dim[i]; >> + int seen_zero = 0; >> + for (i = 0; i != GOMP_DIM_MAX; i++) >> + { >> + if (targ_fn->launch->dim[i]) >> + dims[i] = targ_fn->launch->dim[i]; >> + if (!dims[i]) >> + seen_zero = 1; >> + } >> + >> + if (seen_zero) >> + { >> + for (i = 0; i != GOMP_DIM_MAX; i++) >> + if (!dims[i]) >> + dims[i] = /* TODO */ 32; >> + } >> >> /* This reserves a chunk of a pre-allocated page of memory mapped on both >> the host and the device. 
HP is a host pointer to the new chunk, and DP is >> >> The TODO in libgomp/plugin/plugin-nvptx.c:nvptx_exec will be resolved by >> Nathan's "Default compute dimensions (runtime)", >> <http://news.gmane.org/find-root.php?message_id=%3C56B21D23.5060209%40acm.org%3E>. >> >> The remainder is just "mechanical" updates to the test cases: >> >> diff --git gcc/testsuite/c-c++-common/goacc/kernels-counter-vars-function-scope.c gcc/testsuite/c-c++-common/goacc/kernels-counter-vars-function-scope.c >> index e8b5357..17f240e 100644 >> --- gcc/testsuite/c-c++-common/goacc/kernels-counter-vars-function-scope.c >> +++ gcc/testsuite/c-c++-common/goacc/kernels-counter-vars-function-scope.c >> @@ -1,5 +1,4 @@ >> /* { dg-additional-options "-O2" } */ >> -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ >> /* { dg-additional-options "-fdump-tree-parloops1-all" } */ >> /* { dg-additional-options "-fdump-tree-optimized" } */ >> >> @@ -51,4 +50,4 @@ main (void) >> /* Check that the loop has been split off into a function. */ >> /* { dg-final { scan-tree-dump-times "(?n);; Function .*main._omp_fn.0" 1 "optimized" } } */ >> >> -/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(32," 1 "parloops1" } } */ >> +/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 1 "parloops1" } } */ >> diff --git gcc/testsuite/c-c++-common/goacc/kernels-double-reduction-n.c gcc/testsuite/c-c++-common/goacc/kernels-double-reduction-n.c >> index c39d674..750f576 100644 >> --- gcc/testsuite/c-c++-common/goacc/kernels-double-reduction-n.c >> +++ gcc/testsuite/c-c++-common/goacc/kernels-double-reduction-n.c >> @@ -1,5 +1,4 @@ >> /* { dg-additional-options "-O2" } */ >> -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ >> /* { dg-additional-options "-fdump-tree-parloops1-all" } */ >> /* { dg-additional-options "-fdump-tree-optimized" } */ >> >> @@ -34,4 +33,4 @@ foo (unsigned int n) >> /* Check that the loop has been split off into a function. 
*/ >> /* { dg-final { scan-tree-dump-times "(?n);; Function .*foo.*._omp_fn.0" 1 "optimized" } } */ >> >> -/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(32," 1 "parloops1" } } */ >> +/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 1 "parloops1" } } */ >> diff --git gcc/testsuite/c-c++-common/goacc/kernels-double-reduction.c gcc/testsuite/c-c++-common/goacc/kernels-double-reduction.c >> index 3501d0d..df60d6a 100644 >> --- gcc/testsuite/c-c++-common/goacc/kernels-double-reduction.c >> +++ gcc/testsuite/c-c++-common/goacc/kernels-double-reduction.c >> @@ -1,5 +1,4 @@ >> /* { dg-additional-options "-O2" } */ >> -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ >> /* { dg-additional-options "-fdump-tree-parloops1-all" } */ >> /* { dg-additional-options "-fdump-tree-optimized" } */ >> >> @@ -34,4 +33,4 @@ foo (void) >> /* Check that the loop has been split off into a function. */ >> /* { dg-final { scan-tree-dump-times "(?n);; Function .*foo.*._omp_fn.0" 1 "optimized" } } */ >> >> -/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(32," 1 "parloops1" } } */ >> +/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 1 "parloops1" } } */ >> diff --git gcc/testsuite/c-c++-common/goacc/kernels-loop-2.c gcc/testsuite/c-c++-common/goacc/kernels-loop-2.c >> index f97584d..913d91f 100644 >> --- gcc/testsuite/c-c++-common/goacc/kernels-loop-2.c >> +++ gcc/testsuite/c-c++-common/goacc/kernels-loop-2.c >> @@ -1,5 +1,4 @@ >> /* { dg-additional-options "-O2" } */ >> -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ >> /* { dg-additional-options "-fdump-tree-parloops1-all" } */ >> /* { dg-additional-options "-fdump-tree-optimized" } */ >> >> @@ -67,4 +66,4 @@ main (void) >> /* { dg-final { scan-tree-dump-times "(?n);; Function .*main._omp_fn.1" 1 "optimized" } } */ >> /* { dg-final { scan-tree-dump-times "(?n);; Function .*main._omp_fn.2" 1 "optimized" } } */ >> >> -/* { dg-final { scan-tree-dump-times "(?n)oacc 
function \\(32," 3 "parloops1" } } */ >> +/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 3 "parloops1" } } */ >> diff --git gcc/testsuite/c-c++-common/goacc/kernels-loop-3.c gcc/testsuite/c-c++-common/goacc/kernels-loop-3.c >> index 530d62a..1822d2a 100644 >> --- gcc/testsuite/c-c++-common/goacc/kernels-loop-3.c >> +++ gcc/testsuite/c-c++-common/goacc/kernels-loop-3.c >> @@ -1,5 +1,4 @@ >> /* { dg-additional-options "-O2" } */ >> -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ >> /* { dg-additional-options "-fdump-tree-parloops1-all" } */ >> /* { dg-additional-options "-fdump-tree-optimized" } */ >> >> @@ -45,5 +44,4 @@ main (void) >> /* Check that the loop has been split off into a function. */ >> /* { dg-final { scan-tree-dump-times "(?n);; Function .*main._omp_fn.0" 1 "optimized" } } */ >> >> -/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(32," 1 "parloops1" } } */ >> - >> +/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 1 "parloops1" } } */ >> diff --git gcc/testsuite/c-c++-common/goacc/kernels-loop-g.c gcc/testsuite/c-c++-common/goacc/kernels-loop-g.c >> index 4f1c2c5..e946319 100644 >> --- gcc/testsuite/c-c++-common/goacc/kernels-loop-g.c >> +++ gcc/testsuite/c-c++-common/goacc/kernels-loop-g.c >> @@ -1,6 +1,5 @@ >> /* { dg-additional-options "-O2" } */ >> /* { dg-additional-options "-g" } */ >> -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ >> /* { dg-additional-options "-fdump-tree-parloops1-all" } */ >> /* { dg-additional-options "-fdump-tree-optimized" } */ >> >> @@ -13,5 +12,4 @@ >> /* Check that the loop has been split off into a function. 
*/ >> /* { dg-final { scan-tree-dump-times "(?n);; Function .*main._omp_fn.0" 1 "optimized" } } */ >> >> -/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(32," 1 "parloops1" } } */ >> - >> +/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 1 "parloops1" } } */ >> diff --git gcc/testsuite/c-c++-common/goacc/kernels-loop-mod-not-zero.c gcc/testsuite/c-c++-common/goacc/kernels-loop-mod-not-zero.c >> index 151db51..9b63b45 100644 >> --- gcc/testsuite/c-c++-common/goacc/kernels-loop-mod-not-zero.c >> +++ gcc/testsuite/c-c++-common/goacc/kernels-loop-mod-not-zero.c >> @@ -1,5 +1,4 @@ >> /* { dg-additional-options "-O2" } */ >> -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ >> /* { dg-additional-options "-fdump-tree-parloops1-all" } */ >> /* { dg-additional-options "-fdump-tree-optimized" } */ >> >> @@ -49,4 +48,4 @@ main (void) >> /* Check that the loop has been split off into a function. */ >> /* { dg-final { scan-tree-dump-times "(?n);; Function .*main._omp_fn.0" 1 "optimized" } } */ >> >> -/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(32," 1 "parloops1" } } */ >> +/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 1 "parloops1" } } */ >> diff --git gcc/testsuite/c-c++-common/goacc/kernels-loop-n.c gcc/testsuite/c-c++-common/goacc/kernels-loop-n.c >> index bee5f5a..279f797 100644 >> --- gcc/testsuite/c-c++-common/goacc/kernels-loop-n.c >> +++ gcc/testsuite/c-c++-common/goacc/kernels-loop-n.c >> @@ -1,5 +1,4 @@ >> /* { dg-additional-options "-O2" } */ >> -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ >> /* { dg-additional-options "-fdump-tree-parloops1-all" } */ >> /* { dg-additional-options "-fdump-tree-optimized" } */ >> >> @@ -52,5 +51,4 @@ foo (COUNTERTYPE n) >> /* Check that the loop has been split off into a function. 
*/ >> /* { dg-final { scan-tree-dump-times "(?n);; Function .*foo.*._omp_fn.0" 1 "optimized" } } */ >> >> -/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(32," 1 "parloops1" } } */ >> - >> +/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 1 "parloops1" } } */ >> diff --git gcc/testsuite/c-c++-common/goacc/kernels-loop-nest.c gcc/testsuite/c-c++-common/goacc/kernels-loop-nest.c >> index ea0e342..db1071f 100644 >> --- gcc/testsuite/c-c++-common/goacc/kernels-loop-nest.c >> +++ gcc/testsuite/c-c++-common/goacc/kernels-loop-nest.c >> @@ -1,5 +1,4 @@ >> /* { dg-additional-options "-O2" } */ >> -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ >> /* { dg-additional-options "-fdump-tree-parloops1-all" } */ >> /* { dg-additional-options "-fdump-tree-optimized" } */ >> >> @@ -36,4 +35,4 @@ main (void) >> /* Check that the loop has been split off into a function. */ >> /* { dg-final { scan-tree-dump-times "(?n);; Function .*main._omp_fn.0" 1 "optimized" } } */ >> >> -/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(32," 1 "parloops1" } } */ >> +/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 1 "parloops1" } } */ >> diff --git gcc/testsuite/c-c++-common/goacc/kernels-loop.c gcc/testsuite/c-c++-common/goacc/kernels-loop.c >> index ab5dfb9..abf7a3c 100644 >> --- gcc/testsuite/c-c++-common/goacc/kernels-loop.c >> +++ gcc/testsuite/c-c++-common/goacc/kernels-loop.c >> @@ -1,5 +1,4 @@ >> /* { dg-additional-options "-O2" } */ >> -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ >> /* { dg-additional-options "-fdump-tree-parloops1-all" } */ >> /* { dg-additional-options "-fdump-tree-optimized" } */ >> >> @@ -52,5 +51,4 @@ main (void) >> /* Check that the loop has been split off into a function. 
*/ >> /* { dg-final { scan-tree-dump-times "(?n);; Function .*main._omp_fn.0" 1 "optimized" } } */ >> >> -/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(32," 1 "parloops1" } } */ >> - >> +/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 1 "parloops1" } } */ >> diff --git gcc/testsuite/c-c++-common/goacc/kernels-one-counter-var.c gcc/testsuite/c-c++-common/goacc/kernels-one-counter-var.c >> index b16a8cd..95f4817 100644 >> --- gcc/testsuite/c-c++-common/goacc/kernels-one-counter-var.c >> +++ gcc/testsuite/c-c++-common/goacc/kernels-one-counter-var.c >> @@ -1,5 +1,4 @@ >> /* { dg-additional-options "-O2" } */ >> -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ >> /* { dg-additional-options "-fdump-tree-parloops1-all" } */ >> /* { dg-additional-options "-fdump-tree-optimized" } */ >> >> @@ -50,5 +49,4 @@ main (void) >> /* Check that the loop has been split off into a function. */ >> /* { dg-final { scan-tree-dump-times "(?n);; Function .*main._omp_fn.0" 1 "optimized" } } */ >> >> -/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(32," 1 "parloops1" } } */ >> - >> +/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 1 "parloops1" } } */ >> diff --git gcc/testsuite/c-c++-common/goacc/kernels-reduction.c gcc/testsuite/c-c++-common/goacc/kernels-reduction.c >> index 61c5df3..6f5a418 100644 >> --- gcc/testsuite/c-c++-common/goacc/kernels-reduction.c >> +++ gcc/testsuite/c-c++-common/goacc/kernels-reduction.c >> @@ -1,5 +1,4 @@ >> /* { dg-additional-options "-O2" } */ >> -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ >> /* { dg-additional-options "-fdump-tree-parloops1-all" } */ >> /* { dg-additional-options "-fdump-tree-optimized" } */ >> >> @@ -32,5 +31,4 @@ foo (void) >> /* Check that the loop has been split off into a function. 
*/ >> /* { dg-final { scan-tree-dump-times "(?n);; Function .*foo.*._omp_fn.0" 1 "optimized" } } */ >> >> -/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(32," 1 "parloops1" } } */ >> - >> +/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 1 "parloops1" } } */ >> diff --git gcc/testsuite/gfortran.dg/goacc/kernels-loop-inner.f95 gcc/testsuite/gfortran.dg/goacc/kernels-loop-inner.f95 >> index 4db3a50..3334741 100644 >> --- gcc/testsuite/gfortran.dg/goacc/kernels-loop-inner.f95 >> +++ gcc/testsuite/gfortran.dg/goacc/kernels-loop-inner.f95 >> @@ -1,5 +1,4 @@ >> ! { dg-additional-options "-O2" } >> -! { dg-additional-options "-ftree-parallelize-loops=32" } >> >> program main >> implicit none >> diff --git gcc/testsuite/gfortran.dg/goacc/kernels-loops-adjacent.f95 gcc/testsuite/gfortran.dg/goacc/kernels-loops-adjacent.f95 >> index fef3d10..fb92da8 100644 >> --- gcc/testsuite/gfortran.dg/goacc/kernels-loops-adjacent.f95 >> +++ gcc/testsuite/gfortran.dg/goacc/kernels-loops-adjacent.f95 >> @@ -1,5 +1,4 @@ >> ! { dg-additional-options "-O2" } >> -! { dg-additional-options "-ftree-parallelize-loops=10" } >> >> program main >> implicit none >> diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/avoid-offloading-1.c libgomp/testsuite/libgomp.oacc-c-c++-common/avoid-offloading-1.c >> index 08745fc..366b4f5 100644 >> --- libgomp/testsuite/libgomp.oacc-c-c++-common/avoid-offloading-1.c >> +++ libgomp/testsuite/libgomp.oacc-c-c++-common/avoid-offloading-1.c >> @@ -1,6 +1,5 @@ >> /* Test that the compiler decides to "avoid offloading". */ >> >> -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ >> /* The ACC_DEVICE_TYPE environment variable gets set in the testing >> framework, and that overrides the "avoid offloading" flag at run time. 
>> { dg-xfail-run-if "TODO" { openacc_nvidia_accel_selected } } */ >> diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/avoid-offloading-2.c libgomp/testsuite/libgomp.oacc-c-c++-common/avoid-offloading-2.c >> index 724228a..a63ec97 100644 >> --- libgomp/testsuite/libgomp.oacc-c-c++-common/avoid-offloading-2.c >> +++ libgomp/testsuite/libgomp.oacc-c-c++-common/avoid-offloading-2.c >> @@ -1,8 +1,6 @@ >> /* Test that a user can override the compiler's "avoid offloading" >> decision at run time. */ >> >> -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ >> - >> #include <openacc.h> >> >> int main(void) >> diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/avoid-offloading-3.c libgomp/testsuite/libgomp.oacc-c-c++-common/avoid-offloading-3.c >> index 2fb5196..da01d02 100644 >> --- libgomp/testsuite/libgomp.oacc-c-c++-common/avoid-offloading-3.c >> +++ libgomp/testsuite/libgomp.oacc-c-c++-common/avoid-offloading-3.c >> @@ -1,7 +1,6 @@ >> /* Test that a user can override the compiler's "avoid offloading" >> decision at compile time. */ >> >> -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ >> /* Override the compiler's "avoid offloading" decision. >> { dg-additional-options "-foffload-force" } */ >> >> diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/combined-directives-1.c libgomp/testsuite/libgomp.oacc-c-c++-common/combined-directives-1.c >> index 87ca378..39899ab 100644 >> --- libgomp/testsuite/libgomp.oacc-c-c++-common/combined-directives-1.c >> +++ libgomp/testsuite/libgomp.oacc-c-c++-common/combined-directives-1.c >> @@ -1,7 +1,5 @@ >> /* This test exercises combined directives. 
*/ >> >> -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ >> - >> #include <stdlib.h> >> >> int >> diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/default-1.c libgomp/testsuite/libgomp.oacc-c-c++-common/default-1.c >> index 8f0144c..31da8b1 100644 >> --- libgomp/testsuite/libgomp.oacc-c-c++-common/default-1.c >> +++ libgomp/testsuite/libgomp.oacc-c-c++-common/default-1.c >> @@ -1,5 +1,3 @@ >> -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ >> - >> #include <openacc.h> >> >> int test_parallel () >> diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/host_data-1.c libgomp/testsuite/libgomp.oacc-c-c++-common/host_data-1.c >> index 3ef6f9b..51745ba 100644 >> --- libgomp/testsuite/libgomp.oacc-c-c++-common/host_data-1.c >> +++ libgomp/testsuite/libgomp.oacc-c-c++-common/host_data-1.c >> @@ -1,5 +1,4 @@ >> /* { dg-do run { target openacc_nvidia_accel_selected } } */ >> -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ >> /* { dg-additional-options "-lcuda -lcublas -lcudart" } */ >> >> #include <stdlib.h> >> diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-1.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-1.c >> index 614ad33..588e864 100644 >> --- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-1.c >> +++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-1.c >> @@ -1,5 +1,3 @@ >> -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ >> - >> #include <stdlib.h> >> >> int i; >> diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-2.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-2.c >> index 13e57bd..c7592d6 100644 >> --- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-2.c >> +++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-2.c >> @@ -1,6 +1,3 @@ >> -/* { dg-do run } */ >> -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ >> - >> #include <stdlib.h> >> >> #define N (1024 * 512) >> diff --git 
libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-3.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-3.c >> index f61a74a..31114ac 100644 >> --- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-3.c >> +++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-3.c >> @@ -1,6 +1,3 @@ >> -/* { dg-do run } */ >> -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ >> - >> #include <stdlib.h> >> >> #define N (1024 * 512) >> diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-2.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-2.c >> index 5cdc200..3ffdfe2 100644 >> --- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-2.c >> +++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-2.c >> @@ -1,5 +1,3 @@ >> -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ >> - >> #include <stdlib.h> >> >> #define N 32 >> diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-3.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-3.c >> index 2e4d4d2..a554d66 100644 >> --- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-3.c >> +++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-3.c >> @@ -1,5 +1,3 @@ >> -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ >> - >> #include <stdlib.h> >> >> #define N 32 >> diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-4.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-4.c >> index 5bf00db..f0144b4 100644 >> --- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-4.c >> +++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-4.c >> @@ -1,5 +1,3 @@ >> -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ >> - >> #include <stdlib.h> >> >> #define N 32 >> diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-5.c 
libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-5.c >> index d39b667..4719edd 100644 >> --- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-5.c >> +++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-5.c >> @@ -1,5 +1,3 @@ >> -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ >> - >> #include <stdlib.h> >> >> #define N 32 >> diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-6.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-6.c >> index bb2e85b..ca4f638 100644 >> --- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-6.c >> +++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-6.c >> @@ -1,5 +1,3 @@ >> -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ >> - >> #include <stdlib.h> >> >> #define N 32 >> diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq.c >> index e513827..d2fff38 100644 >> --- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq.c >> +++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq.c >> @@ -1,5 +1,3 @@ >> -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ >> - >> #include <stdlib.h> >> >> #define N 32 >> diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-collapse.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-collapse.c >> index c4791a4..0df4b3f 100644 >> --- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-collapse.c >> +++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-collapse.c >> @@ -1,5 +1,3 @@ >> -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ >> - >> #include <stdlib.h> >> >> #define N 100 >> diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-g.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-g.c >> index 96b6e4e..88258be 100644 >> --- 
libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-g.c >> +++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-g.c >> @@ -1,5 +1,3 @@ >> -/* { dg-do run } */ >> -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ >> /* { dg-additional-options "-g" } */ >> >> #include "kernels-loop.c" >> diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-mod-not-zero.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-mod-not-zero.c >> index 1433cb2..147ebb5 100644 >> --- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-mod-not-zero.c >> +++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-mod-not-zero.c >> @@ -1,6 +1,3 @@ >> -/* { dg-do run } */ >> -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ >> - >> #include <stdlib.h> >> >> #define N ((1024 * 512) + 1) >> diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-n.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-n.c >> index fd0d5b1..9a3eaca 100644 >> --- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-n.c >> +++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-n.c >> @@ -1,6 +1,3 @@ >> -/* { dg-do run } */ >> -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ >> - >> #include <stdlib.h> >> >> #define N ((1024 * 512) + 1) >> diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-nest.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-nest.c >> index 21d2599..28c725a 100644 >> --- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-nest.c >> +++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-nest.c >> @@ -1,6 +1,3 @@ >> -/* { dg-do run } */ >> -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ >> - >> #include <stdlib.h> >> >> #define N 1000 >> diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop.c >> index 3762e5a..355123c 100644 >> --- 
libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop.c >> +++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop.c >> @@ -1,6 +1,3 @@ >> -/* { dg-do run } */ >> -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ >> - >> #include <stdlib.h> >> >> #define N (1024 * 512) >> diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-reduction.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-reduction.c >> index 511e25f..8647a94 100644 >> --- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-reduction.c >> +++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-reduction.c >> @@ -1,6 +1,3 @@ >> -/* { dg-do run } */ >> -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ >> - >> #include <stdlib.h> >> >> #define n 10000 >> diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/nested-2.c libgomp/testsuite/libgomp.oacc-c-c++-common/nested-2.c >> index 94a5ae2..83cddb5 100644 >> --- libgomp/testsuite/libgomp.oacc-c-c++-common/nested-2.c >> +++ libgomp/testsuite/libgomp.oacc-c-c++-common/nested-2.c >> @@ -1,5 +1,3 @@ >> -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ >> - >> #include <stdlib.h> >> >> int >> diff --git libgomp/testsuite/libgomp.oacc-fortran/avoid-offloading-1.f libgomp/testsuite/libgomp.oacc-fortran/avoid-offloading-1.f >> index 5f18b94..ca5cd01 100644 >> --- libgomp/testsuite/libgomp.oacc-fortran/avoid-offloading-1.f >> +++ libgomp/testsuite/libgomp.oacc-fortran/avoid-offloading-1.f >> @@ -2,7 +2,6 @@ >> >> ! { dg-do run } >> ! { dg-additional-options "-cpp" } >> -! { dg-additional-options "-ftree-parallelize-loops=32" } >> ! The "avoid offloading" warning is only triggered for -O2 and higher. >> ! { dg-xfail-if "n/a" { nvptx_offloading_configured } { "-O0" "-O1" } { "" } } >> ! 
The ACC_DEVICE_TYPE environment variable gets set in the testing >> diff --git libgomp/testsuite/libgomp.oacc-fortran/avoid-offloading-2.f libgomp/testsuite/libgomp.oacc-fortran/avoid-offloading-2.f >> index 51801ad..6200b37 100644 >> --- libgomp/testsuite/libgomp.oacc-fortran/avoid-offloading-2.f >> +++ libgomp/testsuite/libgomp.oacc-fortran/avoid-offloading-2.f >> @@ -3,7 +3,6 @@ >> >> ! { dg-do run } >> ! { dg-additional-options "-cpp" } >> -! { dg-additional-options "-ftree-parallelize-loops=32" } >> ! The "avoid offloading" warning is only triggered for -O2 and higher. >> ! { dg-xfail-if "n/a" { nvptx_offloading_configured } { "-O0" "-O1" } { "" } } >> >> diff --git libgomp/testsuite/libgomp.oacc-fortran/avoid-offloading-3.f libgomp/testsuite/libgomp.oacc-fortran/avoid-offloading-3.f >> index bea6ab8..865d09f 100644 >> --- libgomp/testsuite/libgomp.oacc-fortran/avoid-offloading-3.f >> +++ libgomp/testsuite/libgomp.oacc-fortran/avoid-offloading-3.f >> @@ -3,7 +3,6 @@ >> >> ! { dg-do run } >> ! { dg-additional-options "-cpp" } >> -! { dg-additional-options "-ftree-parallelize-loops=32" } >> ! Override the compiler's "avoid offloading" decision. >> ! { dg-additional-options "-foffload-force" } >> >> diff --git libgomp/testsuite/libgomp.oacc-fortran/combined-directives-1.f90 libgomp/testsuite/libgomp.oacc-fortran/combined-directives-1.f90 >> index 4b52579..12ff36c 100644 >> --- libgomp/testsuite/libgomp.oacc-fortran/combined-directives-1.f90 >> +++ libgomp/testsuite/libgomp.oacc-fortran/combined-directives-1.f90 >> @@ -1,7 +1,6 @@ >> ! This test exercises combined directives. >> >> ! { dg-do run } >> -! { dg-additional-options "-ftree-parallelize-loops=32" } >> ! The "avoid offloading" warning is only triggered for -O2 and higher. >> ! 
{ dg-xfail-if "n/a" { nvptx_offloading_configured } { "-O0" "-O1" } { "" } } >> >> diff --git libgomp/testsuite/libgomp.oacc-fortran/non-scalar-data.f90 libgomp/testsuite/libgomp.oacc-fortran/non-scalar-data.f90 >> index b9298c7..0643e89 100644 >> --- libgomp/testsuite/libgomp.oacc-fortran/non-scalar-data.f90 >> +++ libgomp/testsuite/libgomp.oacc-fortran/non-scalar-data.f90 >> @@ -2,7 +2,6 @@ >> ! offloaded regions are properly mapped using present_or_copy. >> >> ! { dg-do run } >> -! { dg-additional-options "-ftree-parallelize-loops=32" } >> ! The "avoid offloading" warning is only triggered for -O2 and higher. >> ! { dg-xfail-if "n/a" { nvptx_offloading_configured } { "-O0" "-O1" } { "" } } >
diff --git gcc/tree-parloops.c gcc/tree-parloops.c
index 139e38c..e498e5b 100644
--- gcc/tree-parloops.c
+++ gcc/tree-parloops.c
@@ -2016,7 +2016,8 @@ transform_to_exit_first_loop (struct loop *loop,
 /* Create the parallel constructs for LOOP as described in gen_parallel_loop.
    LOOP_FN and DATA are the arguments of GIMPLE_OMP_PARALLEL.
    NEW_DATA is the variable that should be initialized from the argument
-   of LOOP_FN. N_THREADS is the requested number of threads. */
+   of LOOP_FN. N_THREADS is the requested number of threads, which can be 0 if
+   that number is to be determined later. */
 
 static void
 create_parallel_loop (struct loop *loop, tree loop_fn, tree data,
@@ -2049,6 +2050,7 @@ create_parallel_loop (struct loop *loop, tree loop_fn, tree data,
   basic_block paral_bb = single_pred (bb);
   gsi = gsi_last_bb (paral_bb);
 
+  gcc_checking_assert (n_threads != 0);
   t = build_omp_clause (loc, OMP_CLAUSE_NUM_THREADS);
   OMP_CLAUSE_NUM_THREADS_EXPR (t)
     = build_int_cst (integer_type_node, n_threads);
@@ -2221,7 +2223,8 @@ create_parallel_loop (struct loop *loop, tree loop_fn, tree data,
 }
 
 /* Generates code to execute the iterations of LOOP in N_THREADS
-   threads in parallel.
+   threads in parallel, which can be 0 if that number is to be determined
+   later.
 
    NITER describes number of iterations of LOOP.
    REDUCTION_LIST describes the reductions existent in the LOOP. */
@@ -2318,6 +2321,7 @@ gen_parallel_loop (struct loop *loop,
   else
     m_p_thread=MIN_PER_THREAD;
 
+  gcc_checking_assert (n_threads != 0);
   many_iterations_cond =
     fold_build2 (GE_EXPR, boolean_type_node,
                  nit, build_int_cst (type, m_p_thread * n_threads));
@@ -3177,7 +3181,7 @@ oacc_entry_exit_ok (struct loop *loop,
 static bool
 parallelize_loops (bool oacc_kernels_p)
 {
-  unsigned n_threads = flag_tree_parallelize_loops;
+  unsigned n_threads;
   bool changed = false;
   struct loop *loop;
   struct loop *skip_loop = NULL;
@@ -3199,6 +3203,13 @@ parallelize_loops (bool oacc_kernels_p)
   if (cfun->has_nonlocal_label)
     return false;
 
+  /* For OpenACC kernels, n_threads will be determined later; otherwise, it's
+     the argument to -ftree-parallelize-loops. */
+  if (oacc_kernels_p)
+    n_threads = 0;
+  else
+    n_threads = flag_tree_parallelize_loops;
+
   gcc_obstack_init (&parloop_obstack);
   reduction_info_table_type reduction_list (10);
@@ -3361,7 +3372,13 @@ public:
   {}
 
   /* opt_pass methods: */
-  virtual bool gate (function *) { return flag_tree_parallelize_loops > 1; }
+  virtual bool gate (function *)
+  {
+    if (oacc_kernels_p)
+      return flag_openacc;
+    else
+      return flag_tree_parallelize_loops > 1;
+  }
   virtual unsigned int execute (function *);
   opt_pass * clone () { return new pass_parallelize_loops (m_ctxt); }
   void set_pass_param (unsigned int n, bool param)
diff --git gcc/tree-ssa-loop.c gcc/tree-ssa-loop.c
index bdbade5..4c39fbc 100644
--- gcc/tree-ssa-loop.c
+++ gcc/tree-ssa-loop.c
@@ -148,7 +148,7 @@ make_pass_tree_loop (gcc::context *ctxt)
 static bool
 gate_oacc_kernels (function *fn)
 {
-  if (flag_tree_parallelize_loops <= 1)
+  if (!flag_openacc)
     return false;
 
   tree oacc_function_attr = get_oacc_fn_attrib (fn->decl);
@@ -230,10 +230,9 @@ public:
   virtual bool gate (function *)
   {
     return (optimize
-            /* Don't bother doing anything if the program has errors. */
-            && !seen_error ()
             && flag_openacc
-            && flag_tree_parallelize_loops > 1);
+            /* Don't bother doing anything if the program has errors. */
+            && !seen_error ());
   }
 }; // class pass_ipa_oacc
diff --git gcc/config/nvptx/nvptx.c gcc/config/nvptx/nvptx.c
index fe28154..2fd3d52 100644
--- gcc/config/nvptx/nvptx.c
+++ gcc/config/nvptx/nvptx.c
@@ -4140,7 +4140,7 @@ nvptx_goacc_validate_dims (tree decl, int dims[], int fn_level)
   bool avoid_offloading_p = true;
   for (unsigned ix = 0; ix != GOMP_DIM_MAX; ix++)
     {
-      if (dims[ix] > 1)
+      if (dims[ix] > 1 || dims[ix] == 0)
         {
           avoid_offloading_p = false;
           break;
diff --git libgomp/oacc-parallel.c libgomp/oacc-parallel.c
index bc24651..f795bf7 100644
--- libgomp/oacc-parallel.c
+++ libgomp/oacc-parallel.c
@@ -103,6 +103,10 @@ GOACC_parallel_keyed (int device, void (*fn) (void *),
       return;
     }
 
+  /* Default: let the runtime choose. */
+  for (i = 0; i != GOMP_DIM_MAX; i++)
+    dims[i] = 0;
+
   va_start (ap, kinds);
   /* TODO: This will need amending when device_type is implemented. */
   while ((tag = va_arg (ap, unsigned)) != 0)
diff --git libgomp/plugin/plugin-nvptx.c libgomp/plugin/plugin-nvptx.c
index 7ec1810..3f1bb6d 100644
--- libgomp/plugin/plugin-nvptx.c
+++ libgomp/plugin/plugin-nvptx.c
@@ -894,9 +894,21 @@ nvptx_exec (void (*fn), size_t mapnum, void **hostaddrs, void **devaddrs,
   /* Initialize the launch dimensions. Typically this is constant,
      provided by the device compiler, but we must permit runtime
      values. */
-  for (i = 0; i != 3; i++)
-    if (targ_fn->launch->dim[i])
-      dims[i] = targ_fn->launch->dim[i];
+  int seen_zero = 0;
+  for (i = 0; i != GOMP_DIM_MAX; i++)
+    {
+      if (targ_fn->launch->dim[i])
+        dims[i] = targ_fn->launch->dim[i];
+      if (!dims[i])
+        seen_zero = 1;
+    }
+
+  if (seen_zero)
+    {
+      for (i = 0; i != GOMP_DIM_MAX; i++)
+        if (!dims[i])
+          dims[i] = /* TODO */ 32;
+    }
 
   /* This reserves a chunk of a pre-allocated page of memory mapped on both
      the host and the device. HP is a host pointer to the new chunk, and DP is

The TODO in libgomp/plugin/plugin-nvptx.c:nvptx_exec will be resolved by
Nathan's "Default compute dimensions (runtime)",
<http://news.gmane.org/find-root.php?message_id=%3C56B21D23.5060209%40acm.org%3E>.

The remainder is just "mechanical" updates to the test cases:

diff --git gcc/testsuite/c-c++-common/goacc/kernels-counter-vars-function-scope.c gcc/testsuite/c-c++-common/goacc/kernels-counter-vars-function-scope.c
index e8b5357..17f240e 100644
--- gcc/testsuite/c-c++-common/goacc/kernels-counter-vars-function-scope.c
+++ gcc/testsuite/c-c++-common/goacc/kernels-counter-vars-function-scope.c
@@ -1,5 +1,4 @@
 /* { dg-additional-options "-O2" } */
-/* { dg-additional-options "-ftree-parallelize-loops=32" } */
 /* { dg-additional-options "-fdump-tree-parloops1-all" } */
 /* { dg-additional-options "-fdump-tree-optimized" } */
 
@@ -51,4 +50,4 @@ main (void)
 /* Check that the loop has been split off into a function. */
 /* { dg-final { scan-tree-dump-times "(?n);; Function .*main._omp_fn.0" 1 "optimized" } } */
 
-/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(32," 1 "parloops1" } } */
+/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 1 "parloops1" } } */
diff --git gcc/testsuite/c-c++-common/goacc/kernels-double-reduction-n.c gcc/testsuite/c-c++-common/goacc/kernels-double-reduction-n.c
index c39d674..750f576 100644
--- gcc/testsuite/c-c++-common/goacc/kernels-double-reduction-n.c
+++ gcc/testsuite/c-c++-common/goacc/kernels-double-reduction-n.c
@@ -1,5 +1,4 @@
 /* { dg-additional-options "-O2" } */
-/* { dg-additional-options "-ftree-parallelize-loops=32" } */
 /* { dg-additional-options "-fdump-tree-parloops1-all" } */
 /* { dg-additional-options "-fdump-tree-optimized" } */
 
@@ -34,4 +33,4 @@ foo (unsigned int n)
 /* Check that the loop has been split off into a function.
*/ /* { dg-final { scan-tree-dump-times "(?n);; Function .*foo.*._omp_fn.0" 1 "optimized" } } */ -/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(32," 1 "parloops1" } } */ +/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 1 "parloops1" } } */ diff --git gcc/testsuite/c-c++-common/goacc/kernels-double-reduction.c gcc/testsuite/c-c++-common/goacc/kernels-double-reduction.c index 3501d0d..df60d6a 100644 --- gcc/testsuite/c-c++-common/goacc/kernels-double-reduction.c +++ gcc/testsuite/c-c++-common/goacc/kernels-double-reduction.c @@ -1,5 +1,4 @@ /* { dg-additional-options "-O2" } */ -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ /* { dg-additional-options "-fdump-tree-parloops1-all" } */ /* { dg-additional-options "-fdump-tree-optimized" } */ @@ -34,4 +33,4 @@ foo (void) /* Check that the loop has been split off into a function. */ /* { dg-final { scan-tree-dump-times "(?n);; Function .*foo.*._omp_fn.0" 1 "optimized" } } */ -/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(32," 1 "parloops1" } } */ +/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 1 "parloops1" } } */ diff --git gcc/testsuite/c-c++-common/goacc/kernels-loop-2.c gcc/testsuite/c-c++-common/goacc/kernels-loop-2.c index f97584d..913d91f 100644 --- gcc/testsuite/c-c++-common/goacc/kernels-loop-2.c +++ gcc/testsuite/c-c++-common/goacc/kernels-loop-2.c @@ -1,5 +1,4 @@ /* { dg-additional-options "-O2" } */ -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ /* { dg-additional-options "-fdump-tree-parloops1-all" } */ /* { dg-additional-options "-fdump-tree-optimized" } */ @@ -67,4 +66,4 @@ main (void) /* { dg-final { scan-tree-dump-times "(?n);; Function .*main._omp_fn.1" 1 "optimized" } } */ /* { dg-final { scan-tree-dump-times "(?n);; Function .*main._omp_fn.2" 1 "optimized" } } */ -/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(32," 3 "parloops1" } } */ +/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 3 
"parloops1" } } */ diff --git gcc/testsuite/c-c++-common/goacc/kernels-loop-3.c gcc/testsuite/c-c++-common/goacc/kernels-loop-3.c index 530d62a..1822d2a 100644 --- gcc/testsuite/c-c++-common/goacc/kernels-loop-3.c +++ gcc/testsuite/c-c++-common/goacc/kernels-loop-3.c @@ -1,5 +1,4 @@ /* { dg-additional-options "-O2" } */ -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ /* { dg-additional-options "-fdump-tree-parloops1-all" } */ /* { dg-additional-options "-fdump-tree-optimized" } */ @@ -45,5 +44,4 @@ main (void) /* Check that the loop has been split off into a function. */ /* { dg-final { scan-tree-dump-times "(?n);; Function .*main._omp_fn.0" 1 "optimized" } } */ -/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(32," 1 "parloops1" } } */ - +/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 1 "parloops1" } } */ diff --git gcc/testsuite/c-c++-common/goacc/kernels-loop-g.c gcc/testsuite/c-c++-common/goacc/kernels-loop-g.c index 4f1c2c5..e946319 100644 --- gcc/testsuite/c-c++-common/goacc/kernels-loop-g.c +++ gcc/testsuite/c-c++-common/goacc/kernels-loop-g.c @@ -1,6 +1,5 @@ /* { dg-additional-options "-O2" } */ /* { dg-additional-options "-g" } */ -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ /* { dg-additional-options "-fdump-tree-parloops1-all" } */ /* { dg-additional-options "-fdump-tree-optimized" } */ @@ -13,5 +12,4 @@ /* Check that the loop has been split off into a function. 
*/ /* { dg-final { scan-tree-dump-times "(?n);; Function .*main._omp_fn.0" 1 "optimized" } } */ -/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(32," 1 "parloops1" } } */ - +/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 1 "parloops1" } } */ diff --git gcc/testsuite/c-c++-common/goacc/kernels-loop-mod-not-zero.c gcc/testsuite/c-c++-common/goacc/kernels-loop-mod-not-zero.c index 151db51..9b63b45 100644 --- gcc/testsuite/c-c++-common/goacc/kernels-loop-mod-not-zero.c +++ gcc/testsuite/c-c++-common/goacc/kernels-loop-mod-not-zero.c @@ -1,5 +1,4 @@ /* { dg-additional-options "-O2" } */ -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ /* { dg-additional-options "-fdump-tree-parloops1-all" } */ /* { dg-additional-options "-fdump-tree-optimized" } */ @@ -49,4 +48,4 @@ main (void) /* Check that the loop has been split off into a function. */ /* { dg-final { scan-tree-dump-times "(?n);; Function .*main._omp_fn.0" 1 "optimized" } } */ -/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(32," 1 "parloops1" } } */ +/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 1 "parloops1" } } */ diff --git gcc/testsuite/c-c++-common/goacc/kernels-loop-n.c gcc/testsuite/c-c++-common/goacc/kernels-loop-n.c index bee5f5a..279f797 100644 --- gcc/testsuite/c-c++-common/goacc/kernels-loop-n.c +++ gcc/testsuite/c-c++-common/goacc/kernels-loop-n.c @@ -1,5 +1,4 @@ /* { dg-additional-options "-O2" } */ -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ /* { dg-additional-options "-fdump-tree-parloops1-all" } */ /* { dg-additional-options "-fdump-tree-optimized" } */ @@ -52,5 +51,4 @@ foo (COUNTERTYPE n) /* Check that the loop has been split off into a function. 
*/ /* { dg-final { scan-tree-dump-times "(?n);; Function .*foo.*._omp_fn.0" 1 "optimized" } } */ -/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(32," 1 "parloops1" } } */ - +/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 1 "parloops1" } } */ diff --git gcc/testsuite/c-c++-common/goacc/kernels-loop-nest.c gcc/testsuite/c-c++-common/goacc/kernels-loop-nest.c index ea0e342..db1071f 100644 --- gcc/testsuite/c-c++-common/goacc/kernels-loop-nest.c +++ gcc/testsuite/c-c++-common/goacc/kernels-loop-nest.c @@ -1,5 +1,4 @@ /* { dg-additional-options "-O2" } */ -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ /* { dg-additional-options "-fdump-tree-parloops1-all" } */ /* { dg-additional-options "-fdump-tree-optimized" } */ @@ -36,4 +35,4 @@ main (void) /* Check that the loop has been split off into a function. */ /* { dg-final { scan-tree-dump-times "(?n);; Function .*main._omp_fn.0" 1 "optimized" } } */ -/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(32," 1 "parloops1" } } */ +/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 1 "parloops1" } } */ diff --git gcc/testsuite/c-c++-common/goacc/kernels-loop.c gcc/testsuite/c-c++-common/goacc/kernels-loop.c index ab5dfb9..abf7a3c 100644 --- gcc/testsuite/c-c++-common/goacc/kernels-loop.c +++ gcc/testsuite/c-c++-common/goacc/kernels-loop.c @@ -1,5 +1,4 @@ /* { dg-additional-options "-O2" } */ -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ /* { dg-additional-options "-fdump-tree-parloops1-all" } */ /* { dg-additional-options "-fdump-tree-optimized" } */ @@ -52,5 +51,4 @@ main (void) /* Check that the loop has been split off into a function. 
*/ /* { dg-final { scan-tree-dump-times "(?n);; Function .*main._omp_fn.0" 1 "optimized" } } */ -/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(32," 1 "parloops1" } } */ - +/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 1 "parloops1" } } */ diff --git gcc/testsuite/c-c++-common/goacc/kernels-one-counter-var.c gcc/testsuite/c-c++-common/goacc/kernels-one-counter-var.c index b16a8cd..95f4817 100644 --- gcc/testsuite/c-c++-common/goacc/kernels-one-counter-var.c +++ gcc/testsuite/c-c++-common/goacc/kernels-one-counter-var.c @@ -1,5 +1,4 @@ /* { dg-additional-options "-O2" } */ -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ /* { dg-additional-options "-fdump-tree-parloops1-all" } */ /* { dg-additional-options "-fdump-tree-optimized" } */ @@ -50,5 +49,4 @@ main (void) /* Check that the loop has been split off into a function. */ /* { dg-final { scan-tree-dump-times "(?n);; Function .*main._omp_fn.0" 1 "optimized" } } */ -/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(32," 1 "parloops1" } } */ - +/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 1 "parloops1" } } */ diff --git gcc/testsuite/c-c++-common/goacc/kernels-reduction.c gcc/testsuite/c-c++-common/goacc/kernels-reduction.c index 61c5df3..6f5a418 100644 --- gcc/testsuite/c-c++-common/goacc/kernels-reduction.c +++ gcc/testsuite/c-c++-common/goacc/kernels-reduction.c @@ -1,5 +1,4 @@ /* { dg-additional-options "-O2" } */ -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ /* { dg-additional-options "-fdump-tree-parloops1-all" } */ /* { dg-additional-options "-fdump-tree-optimized" } */ @@ -32,5 +31,4 @@ foo (void) /* Check that the loop has been split off into a function. 
*/ /* { dg-final { scan-tree-dump-times "(?n);; Function .*foo.*._omp_fn.0" 1 "optimized" } } */ -/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(32," 1 "parloops1" } } */ - +/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 1 "parloops1" } } */ diff --git gcc/testsuite/gfortran.dg/goacc/kernels-loop-inner.f95 gcc/testsuite/gfortran.dg/goacc/kernels-loop-inner.f95 index 4db3a50..3334741 100644 --- gcc/testsuite/gfortran.dg/goacc/kernels-loop-inner.f95 +++ gcc/testsuite/gfortran.dg/goacc/kernels-loop-inner.f95 @@ -1,5 +1,4 @@ ! { dg-additional-options "-O2" } -! { dg-additional-options "-ftree-parallelize-loops=32" } program main implicit none diff --git gcc/testsuite/gfortran.dg/goacc/kernels-loops-adjacent.f95 gcc/testsuite/gfortran.dg/goacc/kernels-loops-adjacent.f95 index fef3d10..fb92da8 100644 --- gcc/testsuite/gfortran.dg/goacc/kernels-loops-adjacent.f95 +++ gcc/testsuite/gfortran.dg/goacc/kernels-loops-adjacent.f95 @@ -1,5 +1,4 @@ ! { dg-additional-options "-O2" } -! { dg-additional-options "-ftree-parallelize-loops=10" } program main implicit none diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/avoid-offloading-1.c libgomp/testsuite/libgomp.oacc-c-c++-common/avoid-offloading-1.c index 08745fc..366b4f5 100644 --- libgomp/testsuite/libgomp.oacc-c-c++-common/avoid-offloading-1.c +++ libgomp/testsuite/libgomp.oacc-c-c++-common/avoid-offloading-1.c @@ -1,6 +1,5 @@ /* Test that the compiler decides to "avoid offloading". */ -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ /* The ACC_DEVICE_TYPE environment variable gets set in the testing framework, and that overrides the "avoid offloading" flag at run time. 
{ dg-xfail-run-if "TODO" { openacc_nvidia_accel_selected } } */ diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/avoid-offloading-2.c libgomp/testsuite/libgomp.oacc-c-c++-common/avoid-offloading-2.c index 724228a..a63ec97 100644 --- libgomp/testsuite/libgomp.oacc-c-c++-common/avoid-offloading-2.c +++ libgomp/testsuite/libgomp.oacc-c-c++-common/avoid-offloading-2.c @@ -1,8 +1,6 @@ /* Test that a user can override the compiler's "avoid offloading" decision at run time. */ -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ - #include <openacc.h> int main(void) diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/avoid-offloading-3.c libgomp/testsuite/libgomp.oacc-c-c++-common/avoid-offloading-3.c index 2fb5196..da01d02 100644 --- libgomp/testsuite/libgomp.oacc-c-c++-common/avoid-offloading-3.c +++ libgomp/testsuite/libgomp.oacc-c-c++-common/avoid-offloading-3.c @@ -1,7 +1,6 @@ /* Test that a user can override the compiler's "avoid offloading" decision at compile time. */ -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ /* Override the compiler's "avoid offloading" decision. { dg-additional-options "-foffload-force" } */ diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/combined-directives-1.c libgomp/testsuite/libgomp.oacc-c-c++-common/combined-directives-1.c index 87ca378..39899ab 100644 --- libgomp/testsuite/libgomp.oacc-c-c++-common/combined-directives-1.c +++ libgomp/testsuite/libgomp.oacc-c-c++-common/combined-directives-1.c @@ -1,7 +1,5 @@ /* This test exercises combined directives. 
*/ -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ - #include <stdlib.h> int diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/default-1.c libgomp/testsuite/libgomp.oacc-c-c++-common/default-1.c index 8f0144c..31da8b1 100644 --- libgomp/testsuite/libgomp.oacc-c-c++-common/default-1.c +++ libgomp/testsuite/libgomp.oacc-c-c++-common/default-1.c @@ -1,5 +1,3 @@ -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ - #include <openacc.h> int test_parallel () diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/host_data-1.c libgomp/testsuite/libgomp.oacc-c-c++-common/host_data-1.c index 3ef6f9b..51745ba 100644 --- libgomp/testsuite/libgomp.oacc-c-c++-common/host_data-1.c +++ libgomp/testsuite/libgomp.oacc-c-c++-common/host_data-1.c @@ -1,5 +1,4 @@ /* { dg-do run { target openacc_nvidia_accel_selected } } */ -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ /* { dg-additional-options "-lcuda -lcublas -lcudart" } */ #include <stdlib.h> diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-1.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-1.c index 614ad33..588e864 100644 --- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-1.c +++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-1.c @@ -1,5 +1,3 @@ -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ - #include <stdlib.h> int i; diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-2.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-2.c index 13e57bd..c7592d6 100644 --- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-2.c +++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-2.c @@ -1,6 +1,3 @@ -/* { dg-do run } */ -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ - #include <stdlib.h> #define N (1024 * 512) diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-3.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-3.c index f61a74a..31114ac 100644 --- 
libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-3.c +++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-3.c @@ -1,6 +1,3 @@ -/* { dg-do run } */ -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ - #include <stdlib.h> #define N (1024 * 512) diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-2.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-2.c index 5cdc200..3ffdfe2 100644 --- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-2.c +++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-2.c @@ -1,5 +1,3 @@ -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ - #include <stdlib.h> #define N 32 diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-3.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-3.c index 2e4d4d2..a554d66 100644 --- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-3.c +++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-3.c @@ -1,5 +1,3 @@ -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ - #include <stdlib.h> #define N 32 diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-4.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-4.c index 5bf00db..f0144b4 100644 --- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-4.c +++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-4.c @@ -1,5 +1,3 @@ -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ - #include <stdlib.h> #define N 32 diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-5.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-5.c index d39b667..4719edd 100644 --- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-5.c +++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-5.c @@ -1,5 +1,3 @@ -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ - 
#include <stdlib.h> #define N 32 diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-6.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-6.c index bb2e85b..ca4f638 100644 --- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-6.c +++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-6.c @@ -1,5 +1,3 @@ -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ - #include <stdlib.h> #define N 32 diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq.c index e513827..d2fff38 100644 --- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq.c +++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq.c @@ -1,5 +1,3 @@ -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ - #include <stdlib.h> #define N 32 diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-collapse.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-collapse.c index c4791a4..0df4b3f 100644 --- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-collapse.c +++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-collapse.c @@ -1,5 +1,3 @@ -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ - #include <stdlib.h> #define N 100 diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-g.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-g.c index 96b6e4e..88258be 100644 --- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-g.c +++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-g.c @@ -1,5 +1,3 @@ -/* { dg-do run } */ -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ /* { dg-additional-options "-g" } */ #include "kernels-loop.c" diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-mod-not-zero.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-mod-not-zero.c index 1433cb2..147ebb5 100644 --- 
libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-mod-not-zero.c +++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-mod-not-zero.c @@ -1,6 +1,3 @@ -/* { dg-do run } */ -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ - #include <stdlib.h> #define N ((1024 * 512) + 1) diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-n.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-n.c index fd0d5b1..9a3eaca 100644 --- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-n.c +++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-n.c @@ -1,6 +1,3 @@ -/* { dg-do run } */ -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ - #include <stdlib.h> #define N ((1024 * 512) + 1) diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-nest.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-nest.c index 21d2599..28c725a 100644 --- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-nest.c +++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-nest.c @@ -1,6 +1,3 @@ -/* { dg-do run } */ -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ - #include <stdlib.h> #define N 1000 diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop.c index 3762e5a..355123c 100644 --- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop.c +++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop.c @@ -1,6 +1,3 @@ -/* { dg-do run } */ -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ - #include <stdlib.h> #define N (1024 * 512) diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-reduction.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-reduction.c index 511e25f..8647a94 100644 --- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-reduction.c +++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-reduction.c @@ -1,6 +1,3 @@ -/* { dg-do run } */ -/* { dg-additional-options 
"-ftree-parallelize-loops=32" } */ - #include <stdlib.h> #define n 10000 diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/nested-2.c libgomp/testsuite/libgomp.oacc-c-c++-common/nested-2.c index 94a5ae2..83cddb5 100644 --- libgomp/testsuite/libgomp.oacc-c-c++-common/nested-2.c +++ libgomp/testsuite/libgomp.oacc-c-c++-common/nested-2.c @@ -1,5 +1,3 @@ -/* { dg-additional-options "-ftree-parallelize-loops=32" } */ - #include <stdlib.h> int diff --git libgomp/testsuite/libgomp.oacc-fortran/avoid-offloading-1.f libgomp/testsuite/libgomp.oacc-fortran/avoid-offloading-1.f index 5f18b94..ca5cd01 100644 --- libgomp/testsuite/libgomp.oacc-fortran/avoid-offloading-1.f +++ libgomp/testsuite/libgomp.oacc-fortran/avoid-offloading-1.f @@ -2,7 +2,6 @@ ! { dg-do run } ! { dg-additional-options "-cpp" } -! { dg-additional-options "-ftree-parallelize-loops=32" } ! The "avoid offloading" warning is only triggered for -O2 and higher. ! { dg-xfail-if "n/a" { nvptx_offloading_configured } { "-O0" "-O1" } { "" } } ! The ACC_DEVICE_TYPE environment variable gets set in the testing diff --git libgomp/testsuite/libgomp.oacc-fortran/avoid-offloading-2.f libgomp/testsuite/libgomp.oacc-fortran/avoid-offloading-2.f index 51801ad..6200b37 100644 --- libgomp/testsuite/libgomp.oacc-fortran/avoid-offloading-2.f +++ libgomp/testsuite/libgomp.oacc-fortran/avoid-offloading-2.f @@ -3,7 +3,6 @@ ! { dg-do run } ! { dg-additional-options "-cpp" } -! { dg-additional-options "-ftree-parallelize-loops=32" } ! The "avoid offloading" warning is only triggered for -O2 and higher. ! { dg-xfail-if "n/a" { nvptx_offloading_configured } { "-O0" "-O1" } { "" } } diff --git libgomp/testsuite/libgomp.oacc-fortran/avoid-offloading-3.f libgomp/testsuite/libgomp.oacc-fortran/avoid-offloading-3.f index bea6ab8..865d09f 100644 --- libgomp/testsuite/libgomp.oacc-fortran/avoid-offloading-3.f +++ libgomp/testsuite/libgomp.oacc-fortran/avoid-offloading-3.f @@ -3,7 +3,6 @@ ! { dg-do run } ! 
{ dg-additional-options "-cpp" } -! { dg-additional-options "-ftree-parallelize-loops=32" } ! Override the compiler's "avoid offloading" decision. ! { dg-additional-options "-foffload-force" } diff --git libgomp/testsuite/libgomp.oacc-fortran/combined-directives-1.f90 libgomp/testsuite/libgomp.oacc-fortran/combined-directives-1.f90 index 4b52579..12ff36c 100644 --- libgomp/testsuite/libgomp.oacc-fortran/combined-directives-1.f90 +++ libgomp/testsuite/libgomp.oacc-fortran/combined-directives-1.f90 @@ -1,7 +1,6 @@ ! This test exercises combined directives. ! { dg-do run } -! { dg-additional-options "-ftree-parallelize-loops=32" } ! The "avoid offloading" warning is only triggered for -O2 and higher. ! { dg-xfail-if "n/a" { nvptx_offloading_configured } { "-O0" "-O1" } { "" } } diff --git libgomp/testsuite/libgomp.oacc-fortran/non-scalar-data.f90 libgomp/testsuite/libgomp.oacc-fortran/non-scalar-data.f90 index b9298c7..0643e89 100644 --- libgomp/testsuite/libgomp.oacc-fortran/non-scalar-data.f90 +++ libgomp/testsuite/libgomp.oacc-fortran/non-scalar-data.f90 @@ -2,7 +2,6 @@ ! offloaded regions are properly mapped using present_or_copy. ! { dg-do run } -! { dg-additional-options "-ftree-parallelize-loops=32" } ! The "avoid offloading" warning is only triggered for -O2 and higher. ! { dg-xfail-if "n/a" { nvptx_offloading_configured } { "-O0" "-O1" } { "" } }
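
For reference, the dimension handling in the libgomp/plugin/plugin-nvptx.c hunk can be tried out in isolation. The following is only a sketch of that logic, not libgomp code: resolve_launch_dims, DIM_MAX and PLACEHOLDER_DIM are hypothetical names for illustration (DIM_MAX stands in for GOMP_DIM_MAX, and 32 is the same "TODO" placeholder as in the patch, not a tuned value). A value fixed by the device compiler overrides the run-time value, and any dimension that is still 0 afterwards gets the placeholder default.

```c
/* Local stand-in for libgomp's GOMP_DIM_MAX (gang, worker, vector).  */
#define DIM_MAX 3

/* Placeholder default, mirroring the "TODO 32" in the patch.  */
#define PLACEHOLDER_DIM 32

/* Hypothetical helper, not part of libgomp.  DIMS holds the values
   requested at run time (0 meaning "let the runtime choose"), COMPILED
   holds the values fixed by the device compiler (0 meaning "not fixed").
   Compile-time values win; anything still 0 afterwards gets the
   placeholder default.  */
void
resolve_launch_dims (int dims[DIM_MAX], const int compiled[DIM_MAX])
{
  int seen_zero = 0;
  int i;

  for (i = 0; i != DIM_MAX; i++)
    {
      if (compiled[i])
        dims[i] = compiled[i];
      if (!dims[i])
        seen_zero = 1;
    }

  if (seen_zero)
    {
      for (i = 0; i != DIM_MAX; i++)
        if (!dims[i])
          dims[i] = PLACEHOLDER_DIM;
    }
}
```

With dims = { 0, 0, 0 } and compiled = { 0, 1, 32 }, the result is { 32, 1, 32 }: only the gang dimension, which neither side specified, falls back to the placeholder.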