From patchwork Fri Feb 5 12:06:17 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Thomas Schwinge
X-Patchwork-Id: 579464
Delivered-To: mailing list gcc-patches@gcc.gnu.org
From: Thomas Schwinge
To: Jakub Jelinek , , Nathan Sidwell
CC: Tom de Vries , Richard Biener
Subject: Use plain -fopenacc to enable OpenACC kernels processing (was: [PATCH, 6/16] Add pass_oacc_kernels)
In-Reply-To: <5640DA47.2010508@mentor.com>
References: <5640BD31.2060602@mentor.com> <5640DA47.2010508@mentor.com>
User-Agent: Notmuch/0.9-125-g4686d11 (http://notmuchmail.org) Emacs/24.5.1 (i586-pc-linux-gnu)
Date: Fri, 5 Feb 2016 13:06:17 +0100
Message-ID: <87bn7v4b0m.fsf@kepler.schwinge.homeip.net>

Hi!

On Mon, 9 Nov 2015 18:39:19 +0100, Tom de Vries wrote:
> On 09/11/15 16:35, Tom de Vries wrote:
> > this patch series for stage1 trunk adds support to:
> > - parallelize oacc kernels regions using parloops, and
> > - map the loops onto the oacc gang dimension.
> Atm, the parallelization behaviour for the kernels region is controlled
> by flag_tree_parallelize_loops, which is also used to control generic
> auto-parallelization by autopar using omp. That is not ideal, and we may
> want a separate flag (or param) to control the behaviour for oacc
> kernels, f.i. -foacc-kernels-gang-parallelize=. I'm open to suggestions.

I suggest using plain -fopenacc to enable OpenACC kernels processing
(which just makes sense, I hope) ;-) and having later processing stages
determine the actual parametrization (currently: number of gangs) (that
is, Nathan's recent "Default compute dimensions" patches). The code
changes are simple enough; OK for trunk?

(This patch depends on my 'Un-parallelized OpenACC kernels constructs
with nvptx offloading: "avoid offloading"' pending review, .)

Originally, I wanted to use:

    OMP_CLAUSE_NUM_GANGS_EXPR (clause)
      = build_int_cst (integer_type_node, n_threads == 0 ? -1 : n_threads);

... to store -1 "have the compiler decide" (instead of now 0 "have the
run-time decide", which might prevent some code optimizations, as I
understand it) for the n_threads == 0 case, but it seems that for an
offloaded OpenACC kernels region, gcc/omp-low.c:oacc_validate_dims is
called with the parameter "used" set to 0 instead of "gang", and then the
"Default anything left to 1 or a partitioned default" logic will default
dims["gang"] to oacc_min_dims["gang"] (that is, 1) instead of the
oacc_default_dims["gang"] (that is, 32). Nathan, does that smell like a
bug (and could you look into that)?
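The defaulting behaviour described above can be illustrated with a small standalone sketch. It is not the code in gcc/omp-low.c: the enum, the two tables, and the function name are hypothetical stand-ins, the "negative means undecided" convention is taken from the -1 "have the compiler decide" value mentioned in the mail, and only the gang values (minimum 1, default 32) come from the mail; the worker and vector entries are placeholders.

```c
#include <assert.h>

/* Hypothetical stand-ins for GCC's GOMP_DIM_* indices.  */
enum { DIM_GANG, DIM_WORKER, DIM_VECTOR, DIM_MAX };

/* Gang values (1 and 32) are the ones quoted in the mail; the worker and
   vector entries are placeholders chosen for this sketch only.  */
static const int min_dims[DIM_MAX] = { 1, 1, 1 };
static const int default_dims[DIM_MAX] = { 32, 1, 1 };

/* Fill in every dimension still marked "compiler decides" (negative).
   USED is a bit mask of the dimensions that partitioned loops make use
   of; a dimension not in USED falls back to its minimum.  */
static void
default_undecided_dims (int dims[DIM_MAX], unsigned used)
{
  assert (dims != 0);
  for (int ix = 0; ix != DIM_MAX; ix++)
    if (dims[ix] < 0)
      dims[ix] = (used & (1u << ix)) ? default_dims[ix] : min_dims[ix];
}
```

With "used" equal to 0, the gang dimension ends up as 1; with the gang bit set in "used", it ends up as 32, which is the difference the question to Nathan is about.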
Grüße
 Thomas


diff --git gcc/tree-parloops.c gcc/tree-parloops.c
index 139e38c..e498e5b 100644
--- gcc/tree-parloops.c
+++ gcc/tree-parloops.c
@@ -2016,7 +2016,8 @@ transform_to_exit_first_loop (struct loop *loop,
 /* Create the parallel constructs for LOOP as described in gen_parallel_loop.
    LOOP_FN and DATA are the arguments of GIMPLE_OMP_PARALLEL.
    NEW_DATA is the variable that should be initialized from the argument
-   of LOOP_FN. N_THREADS is the requested number of threads. */
+   of LOOP_FN. N_THREADS is the requested number of threads, which can be 0 if
+   that number is to be determined later. */
 
 static void
 create_parallel_loop (struct loop *loop, tree loop_fn, tree data,
@@ -2049,6 +2050,7 @@ create_parallel_loop (struct loop *loop, tree loop_fn, tree data,
       basic_block paral_bb = single_pred (bb);
       gsi = gsi_last_bb (paral_bb);
 
+      gcc_checking_assert (n_threads != 0);
       t = build_omp_clause (loc, OMP_CLAUSE_NUM_THREADS);
       OMP_CLAUSE_NUM_THREADS_EXPR (t)
         = build_int_cst (integer_type_node, n_threads);
@@ -2221,7 +2223,8 @@ create_parallel_loop (struct loop *loop, tree loop_fn, tree data,
 }
 
 /* Generates code to execute the iterations of LOOP in N_THREADS
-   threads in parallel.
+   threads in parallel, which can be 0 if that number is to be determined
+   later.
 
    NITER describes number of iterations of LOOP.
    REDUCTION_LIST describes the reductions existent in the LOOP. */
@@ -2318,6 +2321,7 @@ gen_parallel_loop (struct loop *loop,
       else
         m_p_thread=MIN_PER_THREAD;
 
+      gcc_checking_assert (n_threads != 0);
       many_iterations_cond =
         fold_build2 (GE_EXPR, boolean_type_node,
                      nit, build_int_cst (type, m_p_thread * n_threads));
@@ -3177,7 +3181,7 @@ oacc_entry_exit_ok (struct loop *loop,
 static bool
 parallelize_loops (bool oacc_kernels_p)
 {
-  unsigned n_threads = flag_tree_parallelize_loops;
+  unsigned n_threads;
   bool changed = false;
   struct loop *loop;
   struct loop *skip_loop = NULL;
@@ -3199,6 +3203,13 @@ parallelize_loops (bool oacc_kernels_p)
   if (cfun->has_nonlocal_label)
     return false;
 
+  /* For OpenACC kernels, n_threads will be determined later; otherwise, it's
+     the argument to -ftree-parallelize-loops. */
+  if (oacc_kernels_p)
+    n_threads = 0;
+  else
+    n_threads = flag_tree_parallelize_loops;
+
   gcc_obstack_init (&parloop_obstack);
   reduction_info_table_type reduction_list (10);
 
@@ -3361,7 +3372,13 @@ public:
   {}
 
   /* opt_pass methods: */
-  virtual bool gate (function *) { return flag_tree_parallelize_loops > 1; }
+  virtual bool gate (function *)
+  {
+    if (oacc_kernels_p)
+      return flag_openacc;
+    else
+      return flag_tree_parallelize_loops > 1;
+  }
   virtual unsigned int execute (function *);
   opt_pass * clone () { return new pass_parallelize_loops (m_ctxt); }
   void set_pass_param (unsigned int n, bool param)
diff --git gcc/tree-ssa-loop.c gcc/tree-ssa-loop.c
index bdbade5..4c39fbc 100644
--- gcc/tree-ssa-loop.c
+++ gcc/tree-ssa-loop.c
@@ -148,7 +148,7 @@ make_pass_tree_loop (gcc::context *ctxt)
 static bool
 gate_oacc_kernels (function *fn)
 {
-  if (flag_tree_parallelize_loops <= 1)
+  if (!flag_openacc)
     return false;
 
   tree oacc_function_attr = get_oacc_fn_attrib (fn->decl);
@@ -230,10 +230,9 @@ public:
   virtual bool gate (function *)
   {
     return (optimize
-            /* Don't bother doing anything if the program has errors. */
-            && !seen_error ()
             && flag_openacc
-            && flag_tree_parallelize_loops > 1);
+            /* Don't bother doing anything if the program has errors. */
+            && !seen_error ());
   }
 
 }; // class pass_ipa_oacc
diff --git gcc/config/nvptx/nvptx.c gcc/config/nvptx/nvptx.c
index fe28154..2fd3d52 100644
--- gcc/config/nvptx/nvptx.c
+++ gcc/config/nvptx/nvptx.c
@@ -4140,7 +4140,7 @@ nvptx_goacc_validate_dims (tree decl, int dims[], int fn_level)
   bool avoid_offloading_p = true;
   for (unsigned ix = 0; ix != GOMP_DIM_MAX; ix++)
     {
-      if (dims[ix] > 1)
+      if (dims[ix] > 1 || dims[ix] == 0)
         {
           avoid_offloading_p = false;
           break;
diff --git libgomp/oacc-parallel.c libgomp/oacc-parallel.c
index bc24651..f795bf7 100644
--- libgomp/oacc-parallel.c
+++ libgomp/oacc-parallel.c
@@ -103,6 +103,10 @@ GOACC_parallel_keyed (int device, void (*fn) (void *),
       return;
     }
 
+  /* Default: let the runtime choose. */
+  for (i = 0; i != GOMP_DIM_MAX; i++)
+    dims[i] = 0;
+
   va_start (ap, kinds);
   /* TODO: This will need amending when device_type is implemented. */
   while ((tag = va_arg (ap, unsigned)) != 0)
diff --git libgomp/plugin/plugin-nvptx.c libgomp/plugin/plugin-nvptx.c
index 7ec1810..3f1bb6d 100644
--- libgomp/plugin/plugin-nvptx.c
+++ libgomp/plugin/plugin-nvptx.c
@@ -894,9 +894,21 @@ nvptx_exec (void (*fn), size_t mapnum, void **hostaddrs, void **devaddrs,
   /* Initialize the launch dimensions. Typically this is constant,
      provided by the device compiler, but we must permit runtime
      values. */
-  for (i = 0; i != 3; i++)
-    if (targ_fn->launch->dim[i])
-      dims[i] = targ_fn->launch->dim[i];
+  int seen_zero = 0;
+  for (i = 0; i != GOMP_DIM_MAX; i++)
+    {
+      if (targ_fn->launch->dim[i])
+        dims[i] = targ_fn->launch->dim[i];
+      if (!dims[i])
+        seen_zero = 1;
+    }
+
+  if (seen_zero)
+    {
+      for (i = 0; i != GOMP_DIM_MAX; i++)
+        if (!dims[i])
+          dims[i] = /* TODO */ 32;
+    }
 
   /* This reserves a chunk of a pre-allocated page of memory mapped on both
      the host and the device. HP is a host pointer to the new chunk, and DP is

The TODO in libgomp/plugin/plugin-nvptx.c:nvptx_exec will be resolved by
Nathan's "Default compute dimensions (runtime)", .
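For readers who want to try the launch-dimension resolution from the nvptx_exec hunk outside of libgomp, here is a minimal standalone sketch. The function name, LAUNCH_DIM_MAX, and FALLBACK_DIM are hypothetical names for this illustration; 32 is the placeholder value the patch itself marks as TODO, and the patch's two loops (with the seen_zero flag) are folded into one, which should give the same result for these inputs.

```c
#include <assert.h>

enum { LAUNCH_DIM_MAX = 3 };   /* stand-in for GOMP_DIM_MAX */
enum { FALLBACK_DIM = 32 };    /* the patch's "TODO" placeholder value */

/* DIMS holds the values requested at run time (0 = "let the runtime
   choose"); COMPILED holds the values fixed by the device compiler
   (0 = not fixed).  A compiled value overrides the run-time one, and
   anything still 0 afterwards gets the fallback.  */
static void
resolve_launch_dims (int dims[LAUNCH_DIM_MAX],
                     const int compiled[LAUNCH_DIM_MAX])
{
  assert (dims != 0 && compiled != 0);
  for (int i = 0; i != LAUNCH_DIM_MAX; i++)
    {
      if (compiled[i])
        dims[i] = compiled[i];
      if (!dims[i])
        dims[i] = FALLBACK_DIM;
    }
}
```

For example, run-time dims of {5, 0, 7} combined with compiled dims of {0, 0, 16} resolve to {5, 32, 16}: the compiler-provided vector length wins, and the unset worker dimension gets the fallback.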
The remainder is just "mechanical" updates to the test cases:

diff --git gcc/testsuite/c-c++-common/goacc/kernels-counter-vars-function-scope.c gcc/testsuite/c-c++-common/goacc/kernels-counter-vars-function-scope.c
index e8b5357..17f240e 100644
--- gcc/testsuite/c-c++-common/goacc/kernels-counter-vars-function-scope.c
+++ gcc/testsuite/c-c++-common/goacc/kernels-counter-vars-function-scope.c
@@ -1,5 +1,4 @@
 /* { dg-additional-options "-O2" } */
-/* { dg-additional-options "-ftree-parallelize-loops=32" } */
 /* { dg-additional-options "-fdump-tree-parloops1-all" } */
 /* { dg-additional-options "-fdump-tree-optimized" } */
 
@@ -51,4 +50,4 @@ main (void)
 /* Check that the loop has been split off into a function. */
 /* { dg-final { scan-tree-dump-times "(?n);; Function .*main._omp_fn.0" 1 "optimized" } } */
 
-/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(32," 1 "parloops1" } } */
+/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 1 "parloops1" } } */
diff --git gcc/testsuite/c-c++-common/goacc/kernels-double-reduction-n.c gcc/testsuite/c-c++-common/goacc/kernels-double-reduction-n.c
index c39d674..750f576 100644
--- gcc/testsuite/c-c++-common/goacc/kernels-double-reduction-n.c
+++ gcc/testsuite/c-c++-common/goacc/kernels-double-reduction-n.c
@@ -1,5 +1,4 @@
 /* { dg-additional-options "-O2" } */
-/* { dg-additional-options "-ftree-parallelize-loops=32" } */
 /* { dg-additional-options "-fdump-tree-parloops1-all" } */
 /* { dg-additional-options "-fdump-tree-optimized" } */
 
@@ -34,4 +33,4 @@ foo (unsigned int n)
 /* Check that the loop has been split off into a function. */
 /* { dg-final { scan-tree-dump-times "(?n);; Function .*foo.*._omp_fn.0" 1 "optimized" } } */
 
-/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(32," 1 "parloops1" } } */
+/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 1 "parloops1" } } */
diff --git gcc/testsuite/c-c++-common/goacc/kernels-double-reduction.c gcc/testsuite/c-c++-common/goacc/kernels-double-reduction.c
index 3501d0d..df60d6a 100644
--- gcc/testsuite/c-c++-common/goacc/kernels-double-reduction.c
+++ gcc/testsuite/c-c++-common/goacc/kernels-double-reduction.c
@@ -1,5 +1,4 @@
 /* { dg-additional-options "-O2" } */
-/* { dg-additional-options "-ftree-parallelize-loops=32" } */
 /* { dg-additional-options "-fdump-tree-parloops1-all" } */
 /* { dg-additional-options "-fdump-tree-optimized" } */
 
@@ -34,4 +33,4 @@ foo (void)
 /* Check that the loop has been split off into a function. */
 /* { dg-final { scan-tree-dump-times "(?n);; Function .*foo.*._omp_fn.0" 1 "optimized" } } */
 
-/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(32," 1 "parloops1" } } */
+/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 1 "parloops1" } } */
diff --git gcc/testsuite/c-c++-common/goacc/kernels-loop-2.c gcc/testsuite/c-c++-common/goacc/kernels-loop-2.c
index f97584d..913d91f 100644
--- gcc/testsuite/c-c++-common/goacc/kernels-loop-2.c
+++ gcc/testsuite/c-c++-common/goacc/kernels-loop-2.c
@@ -1,5 +1,4 @@
 /* { dg-additional-options "-O2" } */
-/* { dg-additional-options "-ftree-parallelize-loops=32" } */
 /* { dg-additional-options "-fdump-tree-parloops1-all" } */
 /* { dg-additional-options "-fdump-tree-optimized" } */
 
@@ -67,4 +66,4 @@ main (void)
 /* { dg-final { scan-tree-dump-times "(?n);; Function .*main._omp_fn.1" 1 "optimized" } } */
 /* { dg-final { scan-tree-dump-times "(?n);; Function .*main._omp_fn.2" 1 "optimized" } } */
 
-/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(32," 3 "parloops1" } } */
+/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 3 "parloops1" } } */
diff --git gcc/testsuite/c-c++-common/goacc/kernels-loop-3.c gcc/testsuite/c-c++-common/goacc/kernels-loop-3.c
index 530d62a..1822d2a 100644
--- gcc/testsuite/c-c++-common/goacc/kernels-loop-3.c
+++ gcc/testsuite/c-c++-common/goacc/kernels-loop-3.c
@@ -1,5 +1,4 @@
 /* { dg-additional-options "-O2" } */
-/* { dg-additional-options "-ftree-parallelize-loops=32" } */
 /* { dg-additional-options "-fdump-tree-parloops1-all" } */
 /* { dg-additional-options "-fdump-tree-optimized" } */
 
@@ -45,5 +44,4 @@ main (void)
 /* Check that the loop has been split off into a function. */
 /* { dg-final { scan-tree-dump-times "(?n);; Function .*main._omp_fn.0" 1 "optimized" } } */
 
-/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(32," 1 "parloops1" } } */
-
+/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 1 "parloops1" } } */
diff --git gcc/testsuite/c-c++-common/goacc/kernels-loop-g.c gcc/testsuite/c-c++-common/goacc/kernels-loop-g.c
index 4f1c2c5..e946319 100644
--- gcc/testsuite/c-c++-common/goacc/kernels-loop-g.c
+++ gcc/testsuite/c-c++-common/goacc/kernels-loop-g.c
@@ -1,6 +1,5 @@
 /* { dg-additional-options "-O2" } */
 /* { dg-additional-options "-g" } */
-/* { dg-additional-options "-ftree-parallelize-loops=32" } */
 /* { dg-additional-options "-fdump-tree-parloops1-all" } */
 /* { dg-additional-options "-fdump-tree-optimized" } */
 
@@ -13,5 +12,4 @@
 /* Check that the loop has been split off into a function. */
 /* { dg-final { scan-tree-dump-times "(?n);; Function .*main._omp_fn.0" 1 "optimized" } } */
 
-/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(32," 1 "parloops1" } } */
-
+/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 1 "parloops1" } } */
diff --git gcc/testsuite/c-c++-common/goacc/kernels-loop-mod-not-zero.c gcc/testsuite/c-c++-common/goacc/kernels-loop-mod-not-zero.c
index 151db51..9b63b45 100644
--- gcc/testsuite/c-c++-common/goacc/kernels-loop-mod-not-zero.c
+++ gcc/testsuite/c-c++-common/goacc/kernels-loop-mod-not-zero.c
@@ -1,5 +1,4 @@
 /* { dg-additional-options "-O2" } */
-/* { dg-additional-options "-ftree-parallelize-loops=32" } */
 /* { dg-additional-options "-fdump-tree-parloops1-all" } */
 /* { dg-additional-options "-fdump-tree-optimized" } */
 
@@ -49,4 +48,4 @@ main (void)
 /* Check that the loop has been split off into a function. */
 /* { dg-final { scan-tree-dump-times "(?n);; Function .*main._omp_fn.0" 1 "optimized" } } */
 
-/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(32," 1 "parloops1" } } */
+/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 1 "parloops1" } } */
diff --git gcc/testsuite/c-c++-common/goacc/kernels-loop-n.c gcc/testsuite/c-c++-common/goacc/kernels-loop-n.c
index bee5f5a..279f797 100644
--- gcc/testsuite/c-c++-common/goacc/kernels-loop-n.c
+++ gcc/testsuite/c-c++-common/goacc/kernels-loop-n.c
@@ -1,5 +1,4 @@
 /* { dg-additional-options "-O2" } */
-/* { dg-additional-options "-ftree-parallelize-loops=32" } */
 /* { dg-additional-options "-fdump-tree-parloops1-all" } */
 /* { dg-additional-options "-fdump-tree-optimized" } */
 
@@ -52,5 +51,4 @@ foo (COUNTERTYPE n)
 /* Check that the loop has been split off into a function. */
 /* { dg-final { scan-tree-dump-times "(?n);; Function .*foo.*._omp_fn.0" 1 "optimized" } } */
 
-/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(32," 1 "parloops1" } } */
-
+/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 1 "parloops1" } } */
diff --git gcc/testsuite/c-c++-common/goacc/kernels-loop-nest.c gcc/testsuite/c-c++-common/goacc/kernels-loop-nest.c
index ea0e342..db1071f 100644
--- gcc/testsuite/c-c++-common/goacc/kernels-loop-nest.c
+++ gcc/testsuite/c-c++-common/goacc/kernels-loop-nest.c
@@ -1,5 +1,4 @@
 /* { dg-additional-options "-O2" } */
-/* { dg-additional-options "-ftree-parallelize-loops=32" } */
 /* { dg-additional-options "-fdump-tree-parloops1-all" } */
 /* { dg-additional-options "-fdump-tree-optimized" } */
 
@@ -36,4 +35,4 @@ main (void)
 /* Check that the loop has been split off into a function. */
 /* { dg-final { scan-tree-dump-times "(?n);; Function .*main._omp_fn.0" 1 "optimized" } } */
 
-/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(32," 1 "parloops1" } } */
+/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 1 "parloops1" } } */
diff --git gcc/testsuite/c-c++-common/goacc/kernels-loop.c gcc/testsuite/c-c++-common/goacc/kernels-loop.c
index ab5dfb9..abf7a3c 100644
--- gcc/testsuite/c-c++-common/goacc/kernels-loop.c
+++ gcc/testsuite/c-c++-common/goacc/kernels-loop.c
@@ -1,5 +1,4 @@
 /* { dg-additional-options "-O2" } */
-/* { dg-additional-options "-ftree-parallelize-loops=32" } */
 /* { dg-additional-options "-fdump-tree-parloops1-all" } */
 /* { dg-additional-options "-fdump-tree-optimized" } */
 
@@ -52,5 +51,4 @@ main (void)
 /* Check that the loop has been split off into a function. */
 /* { dg-final { scan-tree-dump-times "(?n);; Function .*main._omp_fn.0" 1 "optimized" } } */
 
-/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(32," 1 "parloops1" } } */
-
+/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 1 "parloops1" } } */
diff --git gcc/testsuite/c-c++-common/goacc/kernels-one-counter-var.c gcc/testsuite/c-c++-common/goacc/kernels-one-counter-var.c
index b16a8cd..95f4817 100644
--- gcc/testsuite/c-c++-common/goacc/kernels-one-counter-var.c
+++ gcc/testsuite/c-c++-common/goacc/kernels-one-counter-var.c
@@ -1,5 +1,4 @@
 /* { dg-additional-options "-O2" } */
-/* { dg-additional-options "-ftree-parallelize-loops=32" } */
 /* { dg-additional-options "-fdump-tree-parloops1-all" } */
 /* { dg-additional-options "-fdump-tree-optimized" } */
 
@@ -50,5 +49,4 @@ main (void)
 /* Check that the loop has been split off into a function. */
 /* { dg-final { scan-tree-dump-times "(?n);; Function .*main._omp_fn.0" 1 "optimized" } } */
 
-/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(32," 1 "parloops1" } } */
-
+/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 1 "parloops1" } } */
diff --git gcc/testsuite/c-c++-common/goacc/kernels-reduction.c gcc/testsuite/c-c++-common/goacc/kernels-reduction.c
index 61c5df3..6f5a418 100644
--- gcc/testsuite/c-c++-common/goacc/kernels-reduction.c
+++ gcc/testsuite/c-c++-common/goacc/kernels-reduction.c
@@ -1,5 +1,4 @@
 /* { dg-additional-options "-O2" } */
-/* { dg-additional-options "-ftree-parallelize-loops=32" } */
 /* { dg-additional-options "-fdump-tree-parloops1-all" } */
 /* { dg-additional-options "-fdump-tree-optimized" } */
 
@@ -32,5 +31,4 @@ foo (void)
 /* Check that the loop has been split off into a function. */
 /* { dg-final { scan-tree-dump-times "(?n);; Function .*foo.*._omp_fn.0" 1 "optimized" } } */
 
-/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(32," 1 "parloops1" } } */
-
+/* { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 1 "parloops1" } } */
diff --git gcc/testsuite/gfortran.dg/goacc/kernels-loop-inner.f95 gcc/testsuite/gfortran.dg/goacc/kernels-loop-inner.f95
index 4db3a50..3334741 100644
--- gcc/testsuite/gfortran.dg/goacc/kernels-loop-inner.f95
+++ gcc/testsuite/gfortran.dg/goacc/kernels-loop-inner.f95
@@ -1,5 +1,4 @@
 ! { dg-additional-options "-O2" }
-! { dg-additional-options "-ftree-parallelize-loops=32" }
 
 program main
    implicit none
diff --git gcc/testsuite/gfortran.dg/goacc/kernels-loops-adjacent.f95 gcc/testsuite/gfortran.dg/goacc/kernels-loops-adjacent.f95
index fef3d10..fb92da8 100644
--- gcc/testsuite/gfortran.dg/goacc/kernels-loops-adjacent.f95
+++ gcc/testsuite/gfortran.dg/goacc/kernels-loops-adjacent.f95
@@ -1,5 +1,4 @@
 ! { dg-additional-options "-O2" }
-! { dg-additional-options "-ftree-parallelize-loops=10" }
 
 program main
    implicit none
diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/avoid-offloading-1.c libgomp/testsuite/libgomp.oacc-c-c++-common/avoid-offloading-1.c
index 08745fc..366b4f5 100644
--- libgomp/testsuite/libgomp.oacc-c-c++-common/avoid-offloading-1.c
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/avoid-offloading-1.c
@@ -1,6 +1,5 @@
 /* Test that the compiler decides to "avoid offloading". */
 
-/* { dg-additional-options "-ftree-parallelize-loops=32" } */
 /* The ACC_DEVICE_TYPE environment variable gets set in the testing
    framework, and that overrides the "avoid offloading" flag at run time.
    { dg-xfail-run-if "TODO" { openacc_nvidia_accel_selected } } */
diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/avoid-offloading-2.c libgomp/testsuite/libgomp.oacc-c-c++-common/avoid-offloading-2.c
index 724228a..a63ec97 100644
--- libgomp/testsuite/libgomp.oacc-c-c++-common/avoid-offloading-2.c
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/avoid-offloading-2.c
@@ -1,8 +1,6 @@
 /* Test that a user can override the compiler's "avoid offloading" decision
    at run time. */
 
-/* { dg-additional-options "-ftree-parallelize-loops=32" } */
-
 #include
 
 int main(void)
diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/avoid-offloading-3.c libgomp/testsuite/libgomp.oacc-c-c++-common/avoid-offloading-3.c
index 2fb5196..da01d02 100644
--- libgomp/testsuite/libgomp.oacc-c-c++-common/avoid-offloading-3.c
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/avoid-offloading-3.c
@@ -1,7 +1,6 @@
 /* Test that a user can override the compiler's "avoid offloading" decision
    at compile time. */
 
-/* { dg-additional-options "-ftree-parallelize-loops=32" } */
 /* Override the compiler's "avoid offloading" decision.
    { dg-additional-options "-foffload-force" } */
 
diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/combined-directives-1.c libgomp/testsuite/libgomp.oacc-c-c++-common/combined-directives-1.c
index 87ca378..39899ab 100644
--- libgomp/testsuite/libgomp.oacc-c-c++-common/combined-directives-1.c
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/combined-directives-1.c
@@ -1,7 +1,5 @@
 /* This test exercises combined directives. */
 
-/* { dg-additional-options "-ftree-parallelize-loops=32" } */
-
 #include
 
 int
diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/default-1.c libgomp/testsuite/libgomp.oacc-c-c++-common/default-1.c
index 8f0144c..31da8b1 100644
--- libgomp/testsuite/libgomp.oacc-c-c++-common/default-1.c
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/default-1.c
@@ -1,5 +1,3 @@
-/* { dg-additional-options "-ftree-parallelize-loops=32" } */
-
 #include
 
 int test_parallel ()
diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/host_data-1.c libgomp/testsuite/libgomp.oacc-c-c++-common/host_data-1.c
index 3ef6f9b..51745ba 100644
--- libgomp/testsuite/libgomp.oacc-c-c++-common/host_data-1.c
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/host_data-1.c
@@ -1,5 +1,4 @@
 /* { dg-do run { target openacc_nvidia_accel_selected } } */
-/* { dg-additional-options "-ftree-parallelize-loops=32" } */
 /* { dg-additional-options "-lcuda -lcublas -lcudart" } */
 
 #include
diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-1.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-1.c
index 614ad33..588e864 100644
--- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-1.c
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-1.c
@@ -1,5 +1,3 @@
-/* { dg-additional-options "-ftree-parallelize-loops=32" } */
-
 #include
 
 int i;
diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-2.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-2.c
index 13e57bd..c7592d6 100644
--- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-2.c
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-2.c
@@ -1,6 +1,3 @@
-/* { dg-do run } */
-/* { dg-additional-options "-ftree-parallelize-loops=32" } */
-
 #include
 
 #define N (1024 * 512)
diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-3.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-3.c
index f61a74a..31114ac 100644
--- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-3.c
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-3.c
@@ -1,6 +1,3 @@
-/* { dg-do run } */
-/* { dg-additional-options "-ftree-parallelize-loops=32" } */
-
 #include
 
 #define N (1024 * 512)
diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-2.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-2.c
index 5cdc200..3ffdfe2 100644
--- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-2.c
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-2.c
@@ -1,5 +1,3 @@
-/* { dg-additional-options "-ftree-parallelize-loops=32" } */
-
 #include
 
 #define N 32
diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-3.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-3.c
index 2e4d4d2..a554d66 100644
--- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-3.c
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-3.c
@@ -1,5 +1,3 @@
-/* { dg-additional-options "-ftree-parallelize-loops=32" } */
-
 #include
 
 #define N 32
diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-4.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-4.c
index 5bf00db..f0144b4 100644
--- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-4.c
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-4.c
@@ -1,5 +1,3 @@
-/* { dg-additional-options "-ftree-parallelize-loops=32" } */
-
 #include
 
 #define N 32
diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-5.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-5.c
index d39b667..4719edd 100644
--- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-5.c
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-5.c
@@ -1,5 +1,3 @@
-/* { dg-additional-options "-ftree-parallelize-loops=32" } */
-
 #include
 
 #define N 32
diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-6.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-6.c
index bb2e85b..ca4f638 100644
--- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-6.c
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-6.c
@@ -1,5 +1,3 @@
-/* { dg-additional-options "-ftree-parallelize-loops=32" } */
-
 #include
 
 #define N 32
diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq.c
index e513827..d2fff38 100644
--- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq.c
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq.c
@@ -1,5 +1,3 @@
-/* { dg-additional-options "-ftree-parallelize-loops=32" } */
-
 #include
 
 #define N 32
diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-collapse.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-collapse.c
index c4791a4..0df4b3f 100644
--- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-collapse.c
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-collapse.c
@@ -1,5 +1,3 @@
-/* { dg-additional-options "-ftree-parallelize-loops=32" } */
-
 #include
 
 #define N 100
diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-g.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-g.c
index 96b6e4e..88258be 100644
--- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-g.c
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-g.c
@@ -1,5 +1,3 @@
-/* { dg-do run } */
-/* { dg-additional-options "-ftree-parallelize-loops=32" } */
 /* { dg-additional-options "-g" } */
 
 #include "kernels-loop.c"
diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-mod-not-zero.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-mod-not-zero.c
index 1433cb2..147ebb5 100644
--- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-mod-not-zero.c
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-mod-not-zero.c
@@ -1,6 +1,3 @@
-/* { dg-do run } */
-/* { dg-additional-options "-ftree-parallelize-loops=32" } */
-
 #include
 
 #define N ((1024 * 512) + 1)
diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-n.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-n.c
index fd0d5b1..9a3eaca 100644
--- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-n.c
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-n.c
@@ -1,6 +1,3 @@
-/* { dg-do run } */
-/* { dg-additional-options "-ftree-parallelize-loops=32" } */
-
 #include
 
 #define N ((1024 * 512) + 1)
diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-nest.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-nest.c
index 21d2599..28c725a 100644
--- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-nest.c
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop-nest.c
@@ -1,6 +1,3 @@
-/* { dg-do run } */
-/* { dg-additional-options "-ftree-parallelize-loops=32" } */
-
 #include
 
 #define N 1000
diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop.c
index 3762e5a..355123c 100644
--- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop.c
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-loop.c
@@ -1,6 +1,3 @@
-/* { dg-do run } */
-/* { dg-additional-options "-ftree-parallelize-loops=32" } */
-
 #include
 
 #define N (1024 * 512)
diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-reduction.c libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-reduction.c
index 511e25f..8647a94 100644
--- libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-reduction.c
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/kernels-reduction.c
@@ -1,6 +1,3 @@
-/* { dg-do run } */
-/* { dg-additional-options "-ftree-parallelize-loops=32" } */
-
 #include
 
 #define n 10000
diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/nested-2.c libgomp/testsuite/libgomp.oacc-c-c++-common/nested-2.c
index 94a5ae2..83cddb5 100644
--- libgomp/testsuite/libgomp.oacc-c-c++-common/nested-2.c
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/nested-2.c
@@ -1,5 +1,3 @@
-/* { dg-additional-options "-ftree-parallelize-loops=32" } */
-
 #include
 
 int
diff --git libgomp/testsuite/libgomp.oacc-fortran/avoid-offloading-1.f libgomp/testsuite/libgomp.oacc-fortran/avoid-offloading-1.f
index 5f18b94..ca5cd01 100644
--- libgomp/testsuite/libgomp.oacc-fortran/avoid-offloading-1.f
+++ libgomp/testsuite/libgomp.oacc-fortran/avoid-offloading-1.f
@@ -2,7 +2,6 @@
 
 ! { dg-do run }
 ! { dg-additional-options "-cpp" }
-! { dg-additional-options "-ftree-parallelize-loops=32" }
 ! The "avoid offloading" warning is only triggered for -O2 and higher.
 ! { dg-xfail-if "n/a" { nvptx_offloading_configured } { "-O0" "-O1" } { "" } }
 ! The ACC_DEVICE_TYPE environment variable gets set in the testing
diff --git libgomp/testsuite/libgomp.oacc-fortran/avoid-offloading-2.f libgomp/testsuite/libgomp.oacc-fortran/avoid-offloading-2.f
index 51801ad..6200b37 100644
--- libgomp/testsuite/libgomp.oacc-fortran/avoid-offloading-2.f
+++ libgomp/testsuite/libgomp.oacc-fortran/avoid-offloading-2.f
@@ -3,7 +3,6 @@
 
 ! { dg-do run }
 ! { dg-additional-options "-cpp" }
-! { dg-additional-options "-ftree-parallelize-loops=32" }
 ! The "avoid offloading" warning is only triggered for -O2 and higher.
 ! { dg-xfail-if "n/a" { nvptx_offloading_configured } { "-O0" "-O1" } { "" } }
 
diff --git libgomp/testsuite/libgomp.oacc-fortran/avoid-offloading-3.f libgomp/testsuite/libgomp.oacc-fortran/avoid-offloading-3.f
index bea6ab8..865d09f 100644
--- libgomp/testsuite/libgomp.oacc-fortran/avoid-offloading-3.f
+++ libgomp/testsuite/libgomp.oacc-fortran/avoid-offloading-3.f
@@ -3,7 +3,6 @@
 
 ! { dg-do run }
 ! { dg-additional-options "-cpp" }
-! { dg-additional-options "-ftree-parallelize-loops=32" }
 ! Override the compiler's "avoid offloading" decision.
 ! { dg-additional-options "-foffload-force" }
 
diff --git libgomp/testsuite/libgomp.oacc-fortran/combined-directives-1.f90 libgomp/testsuite/libgomp.oacc-fortran/combined-directives-1.f90
index 4b52579..12ff36c 100644
--- libgomp/testsuite/libgomp.oacc-fortran/combined-directives-1.f90
+++ libgomp/testsuite/libgomp.oacc-fortran/combined-directives-1.f90
@@ -1,7 +1,6 @@
 ! This test exercises combined directives.
 
 ! { dg-do run }
-! { dg-additional-options "-ftree-parallelize-loops=32" }
 ! The "avoid offloading" warning is only triggered for -O2 and higher.
 ! { dg-xfail-if "n/a" { nvptx_offloading_configured } { "-O0" "-O1" } { "" } }
 
diff --git libgomp/testsuite/libgomp.oacc-fortran/non-scalar-data.f90 libgomp/testsuite/libgomp.oacc-fortran/non-scalar-data.f90
index b9298c7..0643e89 100644
--- libgomp/testsuite/libgomp.oacc-fortran/non-scalar-data.f90
+++ libgomp/testsuite/libgomp.oacc-fortran/non-scalar-data.f90
@@ -2,7 +2,6 @@
 ! offloaded regions are properly mapped using present_or_copy.
 
 ! { dg-do run }
-! { dg-additional-options "-ftree-parallelize-loops=32" }
 ! The "avoid offloading" warning is only triggered for -O2 and higher.
 ! { dg-xfail-if "n/a" { nvptx_offloading_configured } { "-O0" "-O1" } { "" } }
 