From patchwork Tue Nov 5 14:35:43 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chung-Lin Tang X-Patchwork-Id: 1189721 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=209.132.180.131; helo=sourceware.org; envelope-from=gcc-patches-return-512506-incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=mentor.com Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b="akcmMkqA"; dkim-atps=neutral Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 476sff6Wb0z9sNx for ; Wed, 6 Nov 2019 01:36:09 +1100 (AEDT) DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :subject:reply-to:to:message-id:date:mime-version:content-type; q=dns; s=default; b=pWWTtNjiSAyuG7zLTqWwNCUSxYXMo7oeeFEQHe+iI3O KY+2IO2LOT5STmjulKUdbyAnXBjZsv/m2Hn49PCF+a8EHnF6Og4salY6MZqUAWhG aXcKOuZkWGWR2ap4HUa4UGdy6UjUvP0LTP/1tEbIRLnS1ES3dw4fA/Da0x1DarcE = DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :subject:reply-to:to:message-id:date:mime-version:content-type; s=default; bh=BG7LhxayGwQqDLWRwy73XwaWOyQ=; b=akcmMkqAzbGpg/8ay 6IAwEcc05HKM+yVf+sFchIIZp1qnOoPIvGyHdAPdU6pybB+dpbU6jVcJXztKtq9O xRU24chSnZvTgi6El4rttpLQ8RsKId1j/81BjI3kwmcfa6VznibN+jTZXA90ZLat IOyhIkNkMyKevuaoxL3dDC5zmk= Received: (qmail 124446 invoked by alias); 5 Nov 2019 14:36:00 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 124429 invoked by uid 89); 5 Nov 2019 14:35:59 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-13.1 required=5.0 tests=AWL, BAYES_00, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, KAM_ASCII_DIVIDERS, SPF_PASS autolearn=ham version=3.3.1 spammy=ATM, simplest, Care, gimplify_expr X-HELO: esa3.mentor.iphmx.com Received: from esa3.mentor.iphmx.com (HELO esa3.mentor.iphmx.com) (68.232.137.180) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Tue, 05 Nov 2019 14:35:52 +0000 IronPort-SDR: jgYDmXZLKhav+UUIREkxoDM5e4CTxAo6MGM1gg1z4CljmFjQjzK4rP+P8GVo8hW/Ajh66IGRFl 8LUCIUHRrq9jVe850PrKi8cYSZiTQzS/aI5+KAgBys/bOubqIx0vXKgRH9UvNlCqD8MQIL/P9R mqRQW8k1G6zr8FWwJuWw/Al2ceziDLivO+xpBQ6Lnx4vMI+vnxlf1/YBUT2bI7FYVt9Vr3XfA8 Fzxq2UM/10EtB0dM6EKRGSKYfjmI6j5oBVVrbjXmz9tzcxyNsreudlU8/e1XJ1nmR3I0wZVEaF +Gs= Received: from orw-gwy-01-in.mentorg.com ([192.94.38.165]) by esa3.mentor.iphmx.com with ESMTP; 05 Nov 2019 06:35:50 -0800 IronPort-SDR: 8MAmjykhKl8UMyknOEh21JTVX8BYLDKLBaMtuZTCHjCrliaOPI6H/MTDEthAkPu+Y0LK96+UPz w/2L+6SRF+WtG1/qDUgZcRDmPECqVeki+ZCN2Hcudjj357IyfDiqsj//ip+XuHEcSqYQBudoe4 gk3QnlToB00F9rJKIfF6BZ8n2c0J0aLc42jJ6kfDFdJxcN4BAhrsBY2BP8VhszixZ9uppcrh/2 yokmCF8GGaEiAbZWws8hSVQ5P3nZe1QTIxMVZJYTER84oOEN1Ak91uWjL1HyzAuqD4jR3yDE6y agQ= From: Chung-Lin Tang Subject: [PATCH, OpenACC, v2] Non-contiguous array support for OpenACC data clauses Reply-To: To: gcc-patches , Thomas Schwinge Message-ID: Date: Tue, 5 Nov 2019 22:35:43 +0800 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.13; rv:60.0) Gecko/20100101 Thunderbird/60.9.0 MIME-Version: 1.0 Hi Thomas, after your last round of review, I realized that the bulk of the compiler omp-low work was simply a case of dumb over-engineering in the wrong direction :P (although it did painstakingly function correctly) Instead of making code changes for bias adjustment in the child function code in the omp-low phase, this should simply be done by the libgomp runtime map preparation (similar to how the current single-dimension array biases are handled) So this updated patch (1) discards away a large part of the last omp-low.c patch, and (2) adjusts the libgomp/target.c patch to do the per-dimensional adjustments. Also, the bit of C/C++ front-end logic you mentioned that was questionable was removed. After looking closely, it wasn't needed; the relaxing of pointers for OpenACC was enough. Still some aspects of handling arrays inside the multi-dimension type still need some more work, e.g. see the catching in the omp-low.c part. A compiler dg-scan testcase was also added. However, the issue of ACC_DEVICE_TYPE=host not working (and hence "!openacc_host_selected" in the testcases) actually is a bit more sophisticated than I thought: The reason it doesn't work for the host device, is because we use the map pointer (i.e. a hostaddrs[] entry when passed into libgomp) to point to an array descriptor to pass the whole array information, and rely on code inside gomp_map_vars_* to setup things, and place the final on-device address of the non-contig. array into devaddrs[], therefore only using a single map entry (something I thought was quite clever) However, this broke down on the host and host-fallback devices, simply because, there we do NOT do any gomp_map_vars processing; our current code in GOACC_parallel_keyed simply skips it and passes the offload function the original hostaddrs[] contents. Lacking the processing to transform the descriptor pointer into a proper array ref, things of course segfault. So I think we have three options for this (which may have some interactions with say, the "proper" host-side parallelization we eventually need to implement for OpenACC 2.7) (1) The simplest solution: implement a processing which searches and reverts such non-contiguous array map entries in GOACC_parallel_keyed. (note: I have implemented this in the current attached "v2" patch) (2) Make the GOACC_parallel_keyed code to not make short cuts for host-modes; i.e. still do the proper gomp_map_vars processing for all cases. (3) Modify the non-contiguous array map conventions: a possible solution is to use two maps placed together: one for the array pointer, another for the array descriptor (as opposed to the current style of using only one map) This needs more further elaborate compiler/runtime work. The first two options will pessimize host-mode performance somewhat. The third I have some WIP patches, but it's still buggy ATM. Seeking your opinion on what we should do. Thanks, Chung-Lin gcc/c/ * c-typeck.c (handle_omp_array_sections_1): Add 'bool &non_contiguous' parameter, adjust recursive call site, add cases for allowing pointer based multi-dimensional arrays for OpenACC. (handle_omp_array_sections): Adjust handle_omp_array_sections_1 call, handle non-contiguous case to create dynamic array map. gcc/cp/ * semantics.c (handle_omp_array_sections_1): Add 'bool &non_contiguous' parameter, adjust recursive call site, add cases for allowing pointer based multi-dimensional arrays for OpenACC. (handle_omp_array_sections): Adjust handle_omp_array_sections_1 call, handle non-contiguous case to create dynamic array map. gcc/ * gimplify.c (gimplify_scan_omp_clauses): For non-contiguous array map kinds, make sure bias in each dimension are put into firstprivate variables. * omp-low.c (append_field_to_record_type): New function. (create_noncontig_array_descr_type): Likewise. (create_noncontig_array_descr_init_code): Likewise. (scan_sharing_clauses): For non-contiguous array map kinds, check for supported dimension structure, and install non-contiguous array variable into current omp_context. (reorder_noncontig_array_clauses): New function. (scan_omp_target): Call reorder_noncontig_array_clauses to place non-contiguous array map clauses at beginning of clause sequence. (lower_omp_target): Add handling for non-contiguous array map kinds. * tree-pretty-print.c (dump_omp_clauses): Add cases for printing GOMP_MAP_NONCONTIG_ARRAY map kinds. include/ * gomp-constants.h (GOMP_MAP_FLAG_SPECIAL_3): Define. (enum gomp_map_kind): Add GOMP_MAP_NONCONTIG_ARRAY, GOMP_MAP_NONCONTIG_ARRAY_TO, GOMP_MAP_NONCONTIG_ARRAY_FROM, GOMP_MAP_NONCONTIG_ARRAY_TOFROM, GOMP_MAP_NONCONTIG_ARRAY_FORCE_TO, GOMP_MAP_NONCONTIG_ARRAY_FORCE_FROM, GOMP_MAP_NONCONTIG_ARRAY_FORCE_TOFROM, GOMP_MAP_NONCONTIG_ARRAY_ALLOC, GOMP_MAP_NONCONTIG_ARRAY_FORCE_ALLOC, GOMP_MAP_NONCONTIG_ARRAY_FORCE_PRESENT. (GOMP_MAP_NONCONTIG_ARRAY_P): Define. gcc/testsuite/ * c-c++-common/goacc/noncontig_array-1.c: New test. libgomp/ * target.c (struct gomp_ncarray_dim): New struct declaration. (struct gomp_ncarray_descr_type): Likewise. (struct ncarray_info): Likewise. (gomp_noncontig_array_count_rows): New function. (gomp_noncontig_array_compute_info): Likewise. (gomp_noncontig_array_fill_rows_1): Likewise. (gomp_noncontig_array_fill_rows): Likewise. (gomp_noncontig_array_create_ptrblock): Likewise. (gomp_map_vars_internal): Add code to handle non-contiguous array map kinds. * oacc-parallel.c (revert_noncontig_array_map_pointers): New function. (GOACC_parallel_keyed): Call revert_noncontig_array_map_pointers when executing for host-modes. * testsuite/libgomp.oacc-c-c++-common/noncontig_array-1.c: New test. * testsuite/libgomp.oacc-c-c++-common/noncontig_array-2.c: New test. * testsuite/libgomp.oacc-c-c++-common/noncontig_array-3.c: New test. * testsuite/libgomp.oacc-c-c++-common/noncontig_array-4.c: New test. * testsuite/libgomp.oacc-c-c++-common/noncontig_array-utils.h: Support header for new tests. Index: gcc/c/c-typeck.c =================================================================== --- gcc/c/c-typeck.c (revision 277827) +++ gcc/c/c-typeck.c (working copy) @@ -12868,7 +12868,7 @@ c_finish_omp_cancellation_point (location_t loc, t static tree handle_omp_array_sections_1 (tree c, tree t, vec &types, bool &maybe_zero_len, unsigned int &first_non_one, - enum c_omp_region_type ort) + bool &non_contiguous, enum c_omp_region_type ort) { tree ret, low_bound, length, type; if (TREE_CODE (t) != TREE_LIST) @@ -12953,7 +12953,8 @@ handle_omp_array_sections_1 (tree c, tree t, vec types; tree *tp = &OMP_CLAUSE_DECL (c); if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_DEPEND @@ -13205,7 +13214,7 @@ handle_omp_array_sections (tree c, enum c_omp_regi tp = &TREE_VALUE (*tp); tree first = handle_omp_array_sections_1 (c, *tp, types, maybe_zero_len, first_non_one, - ort); + non_contiguous, ort); if (first == error_mark_node) return true; if (first == NULL_TREE) @@ -13238,6 +13247,7 @@ handle_omp_array_sections (tree c, enum c_omp_regi unsigned int num = types.length (), i; tree t, side_effects = NULL_TREE, size = NULL_TREE; tree condition = NULL_TREE; + tree ncarray_dims = NULL_TREE; if (int_size_in_bytes (TREE_TYPE (first)) <= 0) maybe_zero_len = true; @@ -13261,6 +13271,13 @@ handle_omp_array_sections (tree c, enum c_omp_regi length = fold_convert (sizetype, length); if (low_bound == NULL_TREE) low_bound = integer_zero_node; + + if (non_contiguous) + { + ncarray_dims = tree_cons (low_bound, length, ncarray_dims); + continue; + } + if (!maybe_zero_len && i > first_non_one) { if (integer_nonzerop (low_bound)) @@ -13357,6 +13374,14 @@ handle_omp_array_sections (tree c, enum c_omp_regi size = size_binop (MULT_EXPR, size, l); } } + if (non_contiguous) + { + int kind = OMP_CLAUSE_MAP_KIND (c); + OMP_CLAUSE_SET_MAP_KIND (c, kind | GOMP_MAP_NONCONTIG_ARRAY); + OMP_CLAUSE_DECL (c) = t; + OMP_CLAUSE_SIZE (c) = ncarray_dims; + return false; + } if (side_effects) size = build2 (COMPOUND_EXPR, sizetype, side_effects, size); if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_REDUCTION Index: gcc/cp/semantics.c =================================================================== --- gcc/cp/semantics.c (revision 277827) +++ gcc/cp/semantics.c (working copy) @@ -4732,7 +4732,7 @@ omp_privatize_field (tree t, bool shared) static tree handle_omp_array_sections_1 (tree c, tree t, vec &types, bool &maybe_zero_len, unsigned int &first_non_one, - enum c_omp_region_type ort) + bool &non_contiguous, enum c_omp_region_type ort) { tree ret, low_bound, length, type; if (TREE_CODE (t) != TREE_LIST) @@ -4817,7 +4817,8 @@ handle_omp_array_sections_1 (tree c, tree t, vec types; tree *tp = &OMP_CLAUSE_DECL (c); if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_DEPEND @@ -5092,7 +5101,7 @@ handle_omp_array_sections (tree c, enum c_omp_regi tp = &TREE_VALUE (*tp); tree first = handle_omp_array_sections_1 (c, *tp, types, maybe_zero_len, first_non_one, - ort); + non_contiguous, ort); if (first == error_mark_node) return true; if (first == NULL_TREE) @@ -5126,6 +5135,7 @@ handle_omp_array_sections (tree c, enum c_omp_regi unsigned int num = types.length (), i; tree t, side_effects = NULL_TREE, size = NULL_TREE; tree condition = NULL_TREE; + tree ncarray_dims = NULL_TREE; if (int_size_in_bytes (TREE_TYPE (first)) <= 0) maybe_zero_len = true; @@ -5151,6 +5161,13 @@ handle_omp_array_sections (tree c, enum c_omp_regi length = fold_convert (sizetype, length); if (low_bound == NULL_TREE) low_bound = integer_zero_node; + + if (non_contiguous) + { + ncarray_dims = tree_cons (low_bound, length, ncarray_dims); + continue; + } + if (!maybe_zero_len && i > first_non_one) { if (integer_nonzerop (low_bound)) @@ -5242,6 +5259,14 @@ handle_omp_array_sections (tree c, enum c_omp_regi } if (!processing_template_decl) { + if (non_contiguous) + { + int kind = OMP_CLAUSE_MAP_KIND (c); + OMP_CLAUSE_SET_MAP_KIND (c, kind | GOMP_MAP_NONCONTIG_ARRAY); + OMP_CLAUSE_DECL (c) = t; + OMP_CLAUSE_SIZE (c) = ncarray_dims; + return false; + } if (side_effects) size = build2 (COMPOUND_EXPR, sizetype, side_effects, size); if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_REDUCTION Index: gcc/gimplify.c =================================================================== --- gcc/gimplify.c (revision 277827) +++ gcc/gimplify.c (working copy) @@ -8622,9 +8622,17 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_se if (OMP_CLAUSE_SIZE (c) == NULL_TREE) OMP_CLAUSE_SIZE (c) = DECL_P (decl) ? DECL_SIZE_UNIT (decl) : TYPE_SIZE_UNIT (TREE_TYPE (decl)); - if (gimplify_expr (&OMP_CLAUSE_SIZE (c), pre_p, - NULL, is_gimple_val, fb_rvalue) == GS_ERROR) + if (OMP_CLAUSE_SIZE (c) + && TREE_CODE (OMP_CLAUSE_SIZE (c)) == TREE_LIST + && GOMP_MAP_NONCONTIG_ARRAY_P (OMP_CLAUSE_MAP_KIND (c))) { + /* For non-contiguous array maps, OMP_CLAUSE_SIZE is a TREE_LIST + of the individual array dimensions, which gimplify_expr doesn't + handle, so skip the call to gimplify_expr here. */ + } + else if (gimplify_expr (&OMP_CLAUSE_SIZE (c), pre_p, + NULL, is_gimple_val, fb_rvalue) == GS_ERROR) + { remove = true; break; } Index: gcc/omp-low.c =================================================================== --- gcc/omp-low.c (revision 277827) +++ gcc/omp-low.c (working copy) @@ -894,6 +894,137 @@ omp_copy_decl (tree var, copy_body_data *cb) return error_mark_node; } +/* Helper function for create_noncontig_array_descr_type(), to append a new field + to a record type. */ + +static void +append_field_to_record_type (tree record_type, tree fld_ident, tree fld_type) +{ + tree *p, fld = build_decl (UNKNOWN_LOCATION, FIELD_DECL, fld_ident, fld_type); + DECL_CONTEXT (fld) = record_type; + + for (p = &TYPE_FIELDS (record_type); *p; p = &DECL_CHAIN (*p)) + ; + *p = fld; +} + +/* Create type for non-contiguous array descriptor. Returns created type, and + returns the number of dimensions in *DIM_NUM. */ + +static tree +create_noncontig_array_descr_type (tree decl, tree dims, int *dim_num) +{ + int n = 0; + tree array_descr_type, name, x; + gcc_assert (TREE_CODE (dims) == TREE_LIST); + + array_descr_type = lang_hooks.types.make_type (RECORD_TYPE); + name = create_tmp_var_name (".omp_noncontig_array_descr_type"); + name = build_decl (UNKNOWN_LOCATION, TYPE_DECL, name, array_descr_type); + DECL_ARTIFICIAL (name) = 1; + DECL_NAMELESS (name) = 1; + TYPE_NAME (array_descr_type) = name; + TYPE_ARTIFICIAL (array_descr_type) = 1; + + /* Main starting pointer/array. */ + tree main_var_type = TREE_TYPE (decl); + if (TREE_CODE (main_var_type) == REFERENCE_TYPE) + main_var_type = TREE_TYPE (main_var_type); + append_field_to_record_type (array_descr_type, DECL_NAME (decl), + (TREE_CODE (TREE_TYPE (decl)) == POINTER_TYPE + ? main_var_type + : build_pointer_type (main_var_type))); + /* Number of dimensions. */ + append_field_to_record_type (array_descr_type, get_identifier ("__dim_num"), + sizetype); + + for (x = dims; x; x = TREE_CHAIN (x), n++) + { + char *fldname; + /* One for the start index. */ + ASM_FORMAT_PRIVATE_NAME (fldname, "__dim_base", n); + append_field_to_record_type (array_descr_type, get_identifier (fldname), + sizetype); + /* One for the length. */ + ASM_FORMAT_PRIVATE_NAME (fldname, "__dim_length", n); + append_field_to_record_type (array_descr_type, get_identifier (fldname), + sizetype); + /* One for the element size. */ + ASM_FORMAT_PRIVATE_NAME (fldname, "__dim_elem_size", n); + append_field_to_record_type (array_descr_type, get_identifier (fldname), + sizetype); + /* One for is_array flag. */ + ASM_FORMAT_PRIVATE_NAME (fldname, "__dim_is_array", n); + append_field_to_record_type (array_descr_type, get_identifier (fldname), + sizetype); + } + + layout_type (array_descr_type); + *dim_num = n; + return array_descr_type; +} + +/* Generate code sequence for initializing non-contiguous array descriptor. */ + +static void +create_noncontig_array_descr_init_code (tree array_descr, tree array_var, + tree dimensions, int dim_num, + gimple_seq *ilist) +{ + tree fld, fldref; + tree array_descr_type = TREE_TYPE (array_descr); + tree dim_type = TREE_TYPE (array_var); + + fld = TYPE_FIELDS (array_descr_type); + fldref = omp_build_component_ref (array_descr, fld); + gimplify_assign (fldref, (TREE_CODE (dim_type) == ARRAY_TYPE + ? build_fold_addr_expr (array_var) : array_var), + ilist); + + if (TREE_CODE (dim_type) == REFERENCE_TYPE) + dim_type = TREE_TYPE (dim_type); + + fld = TREE_CHAIN (fld); + fldref = omp_build_component_ref (array_descr, fld); + gimplify_assign (fldref, build_int_cst (sizetype, dim_num), ilist); + + while (dimensions) + { + tree dim_base = fold_convert (sizetype, TREE_PURPOSE (dimensions)); + tree dim_length = fold_convert (sizetype, TREE_VALUE (dimensions)); + tree dim_elem_size = TYPE_SIZE_UNIT (TREE_TYPE (dim_type)); + tree dim_is_array = (TREE_CODE (dim_type) == ARRAY_TYPE + ? integer_one_node : integer_zero_node); + /* Set base. */ + fld = TREE_CHAIN (fld); + fldref = omp_build_component_ref (array_descr, fld); + dim_base = fold_build2 (MULT_EXPR, sizetype, dim_base, dim_elem_size); + gimplify_assign (fldref, dim_base, ilist); + + /* Set length. */ + fld = TREE_CHAIN (fld); + fldref = omp_build_component_ref (array_descr, fld); + dim_length = fold_build2 (MULT_EXPR, sizetype, dim_length, dim_elem_size); + gimplify_assign (fldref, dim_length, ilist); + + /* Set elem_size. */ + fld = TREE_CHAIN (fld); + fldref = omp_build_component_ref (array_descr, fld); + dim_elem_size = fold_convert (sizetype, dim_elem_size); + gimplify_assign (fldref, dim_elem_size, ilist); + + /* Set is_array flag. */ + fld = TREE_CHAIN (fld); + fldref = omp_build_component_ref (array_descr, fld); + dim_is_array = fold_convert (sizetype, dim_is_array); + gimplify_assign (fldref, dim_is_array, ilist); + + dimensions = TREE_CHAIN (dimensions); + dim_type = TREE_TYPE (dim_type); + } + gcc_assert (TREE_CHAIN (fld) == NULL_TREE); +} + /* Create a new context, with OUTER_CTX being the surrounding context. */ static omp_context * @@ -1367,6 +1498,38 @@ scan_sharing_clauses (tree clauses, omp_context *c install_var_local (decl, ctx); break; } + + if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP + && GOMP_MAP_NONCONTIG_ARRAY_P (OMP_CLAUSE_MAP_KIND (c))) + { + tree array_decl = OMP_CLAUSE_DECL (c); + tree array_type = TREE_TYPE (array_decl); + bool by_ref = (TREE_CODE (array_type) == ARRAY_TYPE + ? true : false); + + /* Checking code to ensure we only have arrays at top dimension. + This limitation might be lifted in the future. */ + if (TREE_CODE (array_type) == REFERENCE_TYPE) + array_type = TREE_TYPE (array_type); + tree t = array_type, prev_t = NULL_TREE; + while (t) + { + if (TREE_CODE (t) == ARRAY_TYPE && prev_t) + { + error_at (gimple_location (ctx->stmt), "array types are" + " only allowed at outermost dimension of" + " non-contiguous array"); + break; + } + prev_t = t; + t = TREE_TYPE (t); + } + + install_var_field (array_decl, by_ref, 3, ctx); + install_var_local (array_decl, ctx); + break; + } + if (DECL_P (decl)) { if (DECL_SIZE (decl) @@ -2597,6 +2760,50 @@ scan_omp_single (gomp_single *stmt, omp_context *o layout_type (ctx->record_type); } +/* Reorder clauses so that non-contiguous array map clauses are placed at the very + front of the chain. */ + +static void +reorder_noncontig_array_clauses (tree *clauses_ptr) +{ + tree c, clauses = *clauses_ptr; + tree prev_clause = NULL_TREE, next_clause; + tree array_clauses = NULL_TREE, array_clauses_tail = NULL_TREE; + + for (c = clauses; c; c = next_clause) + { + next_clause = OMP_CLAUSE_CHAIN (c); + + if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP + && GOMP_MAP_NONCONTIG_ARRAY_P (OMP_CLAUSE_MAP_KIND (c))) + { + /* Unchain c from clauses. */ + if (c == clauses) + clauses = next_clause; + + /* Link on to array_clauses. */ + if (array_clauses_tail) + OMP_CLAUSE_CHAIN (array_clauses_tail) = c; + else + array_clauses = c; + array_clauses_tail = c; + + if (prev_clause) + OMP_CLAUSE_CHAIN (prev_clause) = next_clause; + continue; + } + + prev_clause = c; + } + + /* Place non-contiguous array clauses at the start of the clause list. */ + if (array_clauses) + { + OMP_CLAUSE_CHAIN (array_clauses_tail) = clauses; + *clauses_ptr = array_clauses; + } +} + /* Scan a GIMPLE_OMP_TARGET. */ static void @@ -2605,7 +2812,6 @@ scan_omp_target (gomp_target *stmt, omp_context *o omp_context *ctx; tree name; bool offloaded = is_gimple_omp_offloaded (stmt); - tree clauses = gimple_omp_target_clauses (stmt); ctx = new_omp_context (stmt, outer_ctx); ctx->field_map = splay_tree_new (splay_tree_compare_pointers, 0, 0); @@ -2624,6 +2830,14 @@ scan_omp_target (gomp_target *stmt, omp_context *o gimple_omp_target_set_child_fn (stmt, ctx->cb.dst_fn); } + /* If is OpenACC construct, put non-contiguous array clauses (if any) + in front of clause chain. The runtime can then test the first to see + if the additional map processing for them is required. */ + if (is_gimple_omp_oacc (stmt)) + reorder_noncontig_array_clauses (gimple_omp_target_clauses_ptr (stmt)); + + tree clauses = gimple_omp_target_clauses (stmt); + scan_sharing_clauses (clauses, ctx); scan_omp (gimple_omp_body_ptr (stmt), ctx); @@ -11335,6 +11549,15 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp case GOMP_MAP_FORCE_PRESENT: case GOMP_MAP_FORCE_DEVICEPTR: case GOMP_MAP_DEVICE_RESIDENT: + case GOMP_MAP_NONCONTIG_ARRAY_TO: + case GOMP_MAP_NONCONTIG_ARRAY_FROM: + case GOMP_MAP_NONCONTIG_ARRAY_TOFROM: + case GOMP_MAP_NONCONTIG_ARRAY_FORCE_TO: + case GOMP_MAP_NONCONTIG_ARRAY_FORCE_FROM: + case GOMP_MAP_NONCONTIG_ARRAY_FORCE_TOFROM: + case GOMP_MAP_NONCONTIG_ARRAY_ALLOC: + case GOMP_MAP_NONCONTIG_ARRAY_FORCE_ALLOC: + case GOMP_MAP_NONCONTIG_ARRAY_FORCE_PRESENT: case GOMP_MAP_LINK: gcc_assert (is_gimple_omp_oacc (stmt)); break; @@ -11397,7 +11620,14 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp if (offloaded && !(OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP && OMP_CLAUSE_MAP_IN_REDUCTION (c))) { - x = build_receiver_ref (var, true, ctx); + tree var_type = TREE_TYPE (var); + bool rcv_by_ref = + (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP + && GOMP_MAP_NONCONTIG_ARRAY_P (OMP_CLAUSE_MAP_KIND (c)) + && TREE_CODE (var_type) != ARRAY_TYPE + ? false : true); + + x = build_receiver_ref (var, rcv_by_ref, ctx); tree new_var = lookup_decl (var, ctx); if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP @@ -11647,6 +11877,24 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp avar = build_fold_addr_expr (avar); gimplify_assign (x, avar, &ilist); } + else if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP + && GOMP_MAP_NONCONTIG_ARRAY_P (OMP_CLAUSE_MAP_KIND (c))) + { + int dim_num; + tree dimensions = OMP_CLAUSE_SIZE (c); + + tree array_descr_type = + create_noncontig_array_descr_type (OMP_CLAUSE_DECL (c), + dimensions, &dim_num); + tree array_descr = + create_tmp_var_raw (array_descr_type, ".omp_noncontig_array_descr"); + gimple_add_tmp_var (array_descr); + + create_noncontig_array_descr_init_code + (array_descr, ovar, dimensions, dim_num, &ilist); + + gimplify_assign (x, build_fold_addr_expr (array_descr), &ilist); + } else if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_FIRSTPRIVATE) { gcc_assert (is_gimple_omp_oacc (ctx->stmt)); @@ -11718,6 +11966,9 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp s = TREE_TYPE (s); s = TYPE_SIZE_UNIT (s); } + else if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP + && GOMP_MAP_NONCONTIG_ARRAY_P (OMP_CLAUSE_MAP_KIND (c))) + s = NULL_TREE; else s = OMP_CLAUSE_SIZE (c); if (s == NULL_TREE) Index: gcc/testsuite/c-c++-common/goacc/noncontig_array-1.c =================================================================== --- gcc/testsuite/c-c++-common/goacc/noncontig_array-1.c (nonexistent) +++ gcc/testsuite/c-c++-common/goacc/noncontig_array-1.c (working copy) @@ -0,0 +1,25 @@ +/* { dg-do compile } */ + +void foo (void) +{ + int array_of_array[10][10]; + int **ptr_to_ptr; + int *array_of_ptr[10]; + int (*ptr_to_array)[10]; + + #pragma acc parallel copy (array_of_array[2:4][0:10]) + array_of_array[5][5] = 1; + + #pragma acc parallel copy (ptr_to_ptr[2:4][1:7]) + ptr_to_ptr[5][5] = 1; + + #pragma acc parallel copy (array_of_ptr[2:4][1:7]) + array_of_ptr[5][5] = 1; + + #pragma acc parallel copy (ptr_to_array[2:4][1:7]) /* { dg-error "array section is not contiguous in 'map' clause" } */ + ptr_to_array[5][5] = 1; +} +/* { dg-final { scan-tree-dump-times {#pragma omp target oacc_parallel map\(tofrom:array_of_array} 1 gimple } } */ +/* { dg-final { scan-tree-dump-times {#pragma omp target oacc_parallel map\(tofrom,noncontig_array:ptr_to_ptr \[dimensions: 2 4, 1 7\]} 1 gimple } } */ +/* { dg-final { scan-tree-dump-times {#pragma omp target oacc_parallel map\(tofrom,noncontig_array:array_of_ptr \[dimensions: 2 4, 1 7\]} 1 gimple } } */ +/* { dg-final { scan-tree-dump-times {#pragma omp target oacc_parallel map\(tofrom,noncontig_array:ptr_to_array \[dimensions: 2 4, 1 7\]} 1 gimple { xfail *-*-* } } } */ Index: gcc/tree-pretty-print.c =================================================================== --- gcc/tree-pretty-print.c (revision 277827) +++ gcc/tree-pretty-print.c (working copy) @@ -849,6 +849,33 @@ dump_omp_clause (pretty_printer *pp, tree clause, case GOMP_MAP_LINK: pp_string (pp, "link"); break; + case GOMP_MAP_NONCONTIG_ARRAY_TO: + pp_string (pp, "to,noncontig_array"); + break; + case GOMP_MAP_NONCONTIG_ARRAY_FROM: + pp_string (pp, "from,noncontig_array"); + break; + case GOMP_MAP_NONCONTIG_ARRAY_TOFROM: + pp_string (pp, "tofrom,noncontig_array"); + break; + case GOMP_MAP_NONCONTIG_ARRAY_FORCE_TO: + pp_string (pp, "force_to,noncontig_array"); + break; + case GOMP_MAP_NONCONTIG_ARRAY_FORCE_FROM: + pp_string (pp, "force_from,noncontig_array"); + break; + case GOMP_MAP_NONCONTIG_ARRAY_FORCE_TOFROM: + pp_string (pp, "force_tofrom,noncontig_array"); + break; + case GOMP_MAP_NONCONTIG_ARRAY_ALLOC: + pp_string (pp, "alloc,noncontig_array"); + break; + case GOMP_MAP_NONCONTIG_ARRAY_FORCE_ALLOC: + pp_string (pp, "force_alloc,noncontig_array"); + break; + case GOMP_MAP_NONCONTIG_ARRAY_FORCE_PRESENT: + pp_string (pp, "force_present,noncontig_array"); + break; default: gcc_unreachable (); } @@ -859,8 +886,15 @@ dump_omp_clause (pretty_printer *pp, tree clause, if (OMP_CLAUSE_SIZE (clause)) { switch (OMP_CLAUSE_CODE (clause) == OMP_CLAUSE_MAP - ? OMP_CLAUSE_MAP_KIND (clause) : GOMP_MAP_TO) + ? (GOMP_MAP_NONCONTIG_ARRAY_P (OMP_CLAUSE_MAP_KIND (clause)) + ? GOMP_MAP_NONCONTIG_ARRAY + : OMP_CLAUSE_MAP_KIND (clause)) + : GOMP_MAP_TO) { + case GOMP_MAP_NONCONTIG_ARRAY: + gcc_assert (TREE_CODE (OMP_CLAUSE_SIZE (clause)) == TREE_LIST); + pp_string (pp, " [dimensions: "); + break; case GOMP_MAP_POINTER: case GOMP_MAP_FIRSTPRIVATE_POINTER: case GOMP_MAP_FIRSTPRIVATE_REFERENCE: Index: include/gomp-constants.h =================================================================== --- include/gomp-constants.h (revision 277827) +++ include/gomp-constants.h (working copy) @@ -40,6 +40,7 @@ #define GOMP_MAP_FLAG_SPECIAL_0 (1 << 2) #define GOMP_MAP_FLAG_SPECIAL_1 (1 << 3) #define GOMP_MAP_FLAG_SPECIAL_2 (1 << 4) +#define GOMP_MAP_FLAG_SPECIAL_3 (1 << 5) #define GOMP_MAP_FLAG_SPECIAL (GOMP_MAP_FLAG_SPECIAL_1 \ | GOMP_MAP_FLAG_SPECIAL_0) /* Flag to force a specific behavior (or else, trigger a run-time error). */ @@ -127,6 +128,26 @@ enum gomp_map_kind /* Decrement usage count and deallocate if zero. */ GOMP_MAP_RELEASE = (GOMP_MAP_FLAG_SPECIAL_2 | GOMP_MAP_DELETE), + /* Mapping kinds for non-contiguous arrays. */ + GOMP_MAP_NONCONTIG_ARRAY = (GOMP_MAP_FLAG_SPECIAL_3), + GOMP_MAP_NONCONTIG_ARRAY_TO = (GOMP_MAP_NONCONTIG_ARRAY + | GOMP_MAP_TO), + GOMP_MAP_NONCONTIG_ARRAY_FROM = (GOMP_MAP_NONCONTIG_ARRAY + | GOMP_MAP_FROM), + GOMP_MAP_NONCONTIG_ARRAY_TOFROM = (GOMP_MAP_NONCONTIG_ARRAY + | GOMP_MAP_TOFROM), + GOMP_MAP_NONCONTIG_ARRAY_FORCE_TO = (GOMP_MAP_NONCONTIG_ARRAY_TO + | GOMP_MAP_FLAG_FORCE), + GOMP_MAP_NONCONTIG_ARRAY_FORCE_FROM = (GOMP_MAP_NONCONTIG_ARRAY_FROM + | GOMP_MAP_FLAG_FORCE), + GOMP_MAP_NONCONTIG_ARRAY_FORCE_TOFROM = (GOMP_MAP_NONCONTIG_ARRAY_TOFROM + | GOMP_MAP_FLAG_FORCE), + GOMP_MAP_NONCONTIG_ARRAY_ALLOC = (GOMP_MAP_NONCONTIG_ARRAY + | GOMP_MAP_ALLOC), + GOMP_MAP_NONCONTIG_ARRAY_FORCE_ALLOC = (GOMP_MAP_NONCONTIG_ARRAY + | GOMP_MAP_FORCE_ALLOC), + GOMP_MAP_NONCONTIG_ARRAY_FORCE_PRESENT = (GOMP_MAP_NONCONTIG_ARRAY + | GOMP_MAP_FORCE_PRESENT), /* Internal to GCC, not used in libgomp. */ /* Do not map, but pointer assign a pointer instead. */ @@ -155,6 +176,8 @@ enum gomp_map_kind #define GOMP_MAP_ALWAYS_P(X) \ (GOMP_MAP_ALWAYS_TO_P (X) || ((X) == GOMP_MAP_ALWAYS_FROM)) +#define GOMP_MAP_NONCONTIG_ARRAY_P(X) \ + ((X) & GOMP_MAP_NONCONTIG_ARRAY) /* Asynchronous behavior. Keep in sync with libgomp/{openacc.h,openacc.f90,openacc_lib.h}:acc_async_t. */ Index: libgomp/oacc-parallel.c =================================================================== --- libgomp/oacc-parallel.c (revision 277827) +++ libgomp/oacc-parallel.c (working copy) @@ -111,6 +111,21 @@ handle_ftn_pointers (size_t mapnum, void **hostadd } } +static inline void +revert_noncontig_array_map_pointers (size_t mapnum, void **hostaddrs, + unsigned short *kinds) +{ + for (int i = 0; i < mapnum; i++) + { + if (GOMP_MAP_NONCONTIG_ARRAY_P (kinds[i] & 0xff)) + hostaddrs[i] = *((void **)hostaddrs[i]); + else + /* We assume all non-contiguous array map entries are placed at the + start; first other map kind means we can exit. */ + break; + } +} + static void goacc_wait (int async, int num_waits, va_list *ap); @@ -212,6 +227,7 @@ GOACC_parallel_keyed (int flags_m, void (*fn) (voi prof_info.device_type = acc_device_host; api_info.device_type = prof_info.device_type; goacc_save_and_set_bind (acc_device_host); + revert_noncontig_array_map_pointers (mapnum, hostaddrs, kinds); fn (hostaddrs); goacc_restore_bind (); goto out_prof; @@ -218,6 +234,7 @@ GOACC_parallel_keyed (int flags_m, void (*fn) (voi } else if (acc_device_type (acc_dev->type) == acc_device_host) { + revert_noncontig_array_map_pointers (mapnum, hostaddrs, kinds); fn (hostaddrs); goto out_prof; } Index: libgomp/target.c =================================================================== --- libgomp/target.c (revision 277827) +++ libgomp/target.c (working copy) @@ -520,6 +520,152 @@ gomp_map_val (struct target_mem_desc *tgt, void ** } } +/* Definitions for data structures describing non-contiguous arrays + (Note: interfaces with compiler) + + The compiler generates a descriptor for each such array, places the + descriptor on stack, and passes the address of the descriptor to the libgomp + runtime as a normal map argument. The runtime then processes the array + data structure setup, and replaces the argument with the new actual + array address for the child function. + + Care must be taken such that the struct field and layout assumptions + of struct gomp_ncarray_dim, gomp_ncarray_descr_type inside the compiler + be consistant with the below declarations. */ + +struct gomp_ncarray_dim { + size_t base; + size_t length; + size_t elem_size; + size_t is_array; +}; + +struct gomp_ncarray_descr_type { + void *ptr; + size_t ndims; + struct gomp_ncarray_dim dims[]; +}; + +/* Internal non-contiguous array info struct, used only here inside the runtime. */ + +struct ncarray_info +{ + struct gomp_ncarray_descr_type *descr; + size_t map_index; + size_t ptrblock_size; + size_t data_row_num; + size_t data_row_size; +}; + +static size_t +gomp_noncontig_array_count_rows (struct gomp_ncarray_descr_type *descr) +{ + size_t nrows = 1; + for (size_t d = 0; d < descr->ndims - 1; d++) + nrows *= descr->dims[d].length / sizeof (void *); + return nrows; +} + +static void +gomp_noncontig_array_compute_info (struct ncarray_info *nca) +{ + size_t d, n = 1; + struct gomp_ncarray_descr_type *descr = nca->descr; + + nca->ptrblock_size = 0; + for (d = 0; d < descr->ndims - 1; d++) + { + size_t dim_count = descr->dims[d].length / descr->dims[d].elem_size; + size_t dim_ptrblock_size = (descr->dims[d + 1].is_array + ? 0 : descr->dims[d].length * n); + nca->ptrblock_size += dim_ptrblock_size; + n *= dim_count; + } + nca->data_row_num = n; + nca->data_row_size = descr->dims[d].length; +} + +static void +gomp_noncontig_array_fill_rows_1 (struct gomp_ncarray_descr_type *descr, void *nca, + size_t d, void ***row_ptr, size_t *count) +{ + if (d < descr->ndims - 1) + { + size_t elsize = descr->dims[d].elem_size; + size_t n = descr->dims[d].length / elsize; + void *p = nca + descr->dims[d].base; + for (size_t i = 0; i < n; i++) + { + void *ptr = p + i * elsize; + /* Deref if next dimension is not array. */ + if (!descr->dims[d + 1].is_array) + ptr = *((void **) ptr); + gomp_noncontig_array_fill_rows_1 (descr, ptr, d + 1, row_ptr, count); + } + } + else + { + **row_ptr = nca + descr->dims[d].base; + *row_ptr += 1; + *count += 1; + } +} + +static size_t +gomp_noncontig_array_fill_rows (struct gomp_ncarray_descr_type *descr, void *rows[]) +{ + size_t count = 0; + void **p = rows; + gomp_noncontig_array_fill_rows_1 (descr, descr->ptr, 0, &p, &count); + return count; +} + +static void * +gomp_noncontig_array_create_ptrblock (struct ncarray_info *nca, + void *tgt_addr, void *tgt_data_rows[]) +{ + struct gomp_ncarray_descr_type *descr = nca->descr; + void *ptrblock = gomp_malloc (nca->ptrblock_size); + void **curr_dim_ptrblock = (void **) ptrblock; + size_t n = 1; + + for (size_t d = 0; d < descr->ndims - 1; d++) + { + int curr_dim_len = descr->dims[d].length; + int next_dim_len = descr->dims[d + 1].length; + int curr_dim_num = curr_dim_len / sizeof (void *); + size_t next_dim_bias = descr->dims[d + 1].base; + + void *next_dim_ptrblock + = (void *)(curr_dim_ptrblock + n * curr_dim_num); + + for (int b = 0; b < n; b++) + for (int i = 0; i < curr_dim_num; i++) + { + if (d < descr->ndims - 2) + { + void *ptr = (next_dim_ptrblock + + b * curr_dim_num * next_dim_len + + i * next_dim_len); + void *tgt_ptr = tgt_addr + (ptr - ptrblock) - next_dim_bias; + curr_dim_ptrblock[b * curr_dim_num + i] = tgt_ptr; + } + else + { + curr_dim_ptrblock[b * curr_dim_num + i] + = tgt_data_rows[b * curr_dim_num + i] - next_dim_bias; + } + void *addr = &curr_dim_ptrblock[b * curr_dim_num + i]; + assert (ptrblock <= addr && addr < ptrblock + nca->ptrblock_size); + } + + n *= curr_dim_num; + curr_dim_ptrblock = next_dim_ptrblock; + } + assert (n == nca->data_row_num); + return ptrblock; +} + static inline __attribute__((always_inline)) struct target_mem_desc * gomp_map_vars_internal (struct gomp_device_descr *devicep, struct goacc_asyncqueue *aq, size_t mapnum, @@ -533,9 +679,37 @@ gomp_map_vars_internal (struct gomp_device_descr * const int typemask = short_mapkind ? 0xff : 0x7; struct splay_tree_s *mem_map = &devicep->mem_map; struct splay_tree_key_s cur_node; - struct target_mem_desc *tgt - = gomp_malloc (sizeof (*tgt) + sizeof (tgt->list[0]) * mapnum); - tgt->list_count = mapnum; + struct target_mem_desc *tgt; + + bool process_noncontig_arrays = false; + size_t nca_data_row_num = 0, row_start = 0; + size_t nca_info_num = 0, nca_index; + struct ncarray_info *nca_info = NULL; + struct target_var_desc *row_desc; + uintptr_t target_row_addr; + void **host_data_rows = NULL, **target_data_rows = NULL; + void *row; + + if (mapnum > 0) + { + int kind = get_kind (short_mapkind, kinds, 0); + process_noncontig_arrays = GOMP_MAP_NONCONTIG_ARRAY_P (kind & typemask); + } + + if (process_noncontig_arrays) + for (i = 0; i < mapnum; i++) + { + int kind = get_kind (short_mapkind, kinds, i); + if (GOMP_MAP_NONCONTIG_ARRAY_P (kind & typemask)) + { + nca_data_row_num += gomp_noncontig_array_count_rows (hostaddrs[i]); + nca_info_num += 1; + } + } + + tgt = gomp_malloc (sizeof (*tgt) + + sizeof (tgt->list[0]) * (mapnum + nca_data_row_num)); + tgt->list_count = mapnum + nca_data_row_num; tgt->refcount = pragma_kind == GOMP_MAP_VARS_ENTER_DATA ? 0 : 1; tgt->device_descr = devicep; struct gomp_coalesce_buf cbuf, *cbufp = NULL; @@ -547,6 +721,14 @@ gomp_map_vars_internal (struct gomp_device_descr * return tgt; } + if (nca_info_num) + nca_info = gomp_alloca (sizeof (struct ncarray_info) * nca_info_num); + if (nca_data_row_num) + { + host_data_rows = gomp_malloc (2 * sizeof (void *) * nca_data_row_num); + target_data_rows = &host_data_rows[nca_data_row_num]; + } + tgt_align = sizeof (void *); tgt_size = 0; cbuf.chunks = NULL; @@ -578,7 +760,7 @@ gomp_map_vars_internal (struct gomp_device_descr * return NULL; } - for (i = 0; i < mapnum; i++) + for (i = 0, nca_index = 0; i < mapnum; i++) { int kind = get_kind (short_mapkind, kinds, i); if (hostaddrs[i] == NULL @@ -667,6 +849,20 @@ gomp_map_vars_internal (struct gomp_device_descr * has_firstprivate = true; continue; } + else if (GOMP_MAP_NONCONTIG_ARRAY_P (kind & typemask)) + { + /* Ignore non-contiguous arrays for now, we process them together + later. */ + tgt->list[i].key = NULL; + tgt->list[i].offset = 0; + not_found_cnt++; + + struct ncarray_info *nca = &nca_info[nca_index++]; + nca->descr = (struct gomp_ncarray_descr_type *) hostaddrs[i]; + nca->map_index = i; + continue; + } + cur_node.host_start = (uintptr_t) hostaddrs[i]; if (!GOMP_MAP_POINTER_P (kind & typemask)) cur_node.host_end = cur_node.host_start + sizes[i]; @@ -735,6 +931,56 @@ gomp_map_vars_internal (struct gomp_device_descr * } } + /* For non-contiguous arrays. Each data row is one target item, separated + from the normal map clause items, hence we order them after mapnum. */ + if (process_noncontig_arrays) + for (i = 0, nca_index = 0, row_start = 0; i < mapnum; i++) + { + int kind = get_kind (short_mapkind, kinds, i); + if (!GOMP_MAP_NONCONTIG_ARRAY_P (kind & typemask)) + continue; + + struct ncarray_info *nca = &nca_info[nca_index++]; + struct gomp_ncarray_descr_type *descr = nca->descr; + size_t nr; + + gomp_noncontig_array_compute_info (nca); + + /* We have allocated space in host/target_data_rows to place all the + row data block pointers, now we can start filling them in. */ + nr = gomp_noncontig_array_fill_rows (descr, &host_data_rows[row_start]); + assert (nr == nca->data_row_num); + + size_t align = (size_t) 1 << (kind >> rshift); + if (tgt_align < align) + tgt_align = align; + tgt_size = (tgt_size + align - 1) & ~(align - 1); + tgt_size += nca->ptrblock_size; + + for (size_t j = 0; j < nca->data_row_num; j++) + { + row = host_data_rows[row_start + j]; + row_desc = &tgt->list[mapnum + row_start + j]; + + cur_node.host_start = (uintptr_t) row; + cur_node.host_end = cur_node.host_start + nca->data_row_size; + splay_tree_key n = splay_tree_lookup (mem_map, &cur_node); + if (n) + { + assert (n->refcount != REFCOUNT_LINK); + gomp_map_vars_existing (devicep, aq, n, &cur_node, row_desc, + kind & typemask, /* TODO: cbuf? */ NULL); + } + else + { + tgt_size = (tgt_size + align - 1) & ~(align - 1); + tgt_size += nca->data_row_size; + not_found_cnt++; + } + } + row_start += nca->data_row_num; + } + if (devaddrs) { if (mapnum != 1) @@ -895,6 +1141,15 @@ gomp_map_vars_internal (struct gomp_device_descr * default: break; } + + if (GOMP_MAP_NONCONTIG_ARRAY_P (kind & typemask)) + { + tgt->list[i].key = &array->key; + tgt->list[i].key->tgt = tgt; + array++; + continue; + } + splay_tree_key k = &array->key; k->host_start = (uintptr_t) hostaddrs[i]; if (!GOMP_MAP_POINTER_P (kind & typemask)) @@ -1044,8 +1299,112 @@ gomp_map_vars_internal (struct gomp_device_descr * array++; } } + + /* Processing of non-contiguous array rows. */ + if (process_noncontig_arrays) + { + for (i = 0, nca_index = 0, row_start = 0; i < mapnum; i++) + { + int kind = get_kind (short_mapkind, kinds, i); + if (!GOMP_MAP_NONCONTIG_ARRAY_P (kind & typemask)) + continue; + + struct ncarray_info *nca = &nca_info[nca_index++]; + assert (nca->descr == hostaddrs[i]); + + /* The map for the non-contiguous array itself is never copied from + during unmapping, its the data rows that count. Set copy-from + flags to false here. */ + tgt->list[i].copy_from = false; + tgt->list[i].always_copy_from = false; + + size_t align = (size_t) 1 << (kind >> rshift); + tgt_size = (tgt_size + align - 1) & ~(align - 1); + + /* For the map of the non-contiguous array itself, adjust so that + the passed device address points to the beginning of the + ptrblock. Remember to adjust the first-dimension's bias here. */ + tgt->list[i].key->tgt_offset = tgt_size - nca->descr->dims[0].base; + + void *target_ptrblock = (void*) tgt->tgt_start + tgt_size; + tgt_size += nca->ptrblock_size; + + /* Add splay key for each data row in current non-contiguous + array. */ + for (size_t j = 0; j < nca->data_row_num; j++) + { + row = host_data_rows[row_start + j]; + row_desc = &tgt->list[mapnum + row_start + j]; + + cur_node.host_start = (uintptr_t) row; + cur_node.host_end = cur_node.host_start + nca->data_row_size; + splay_tree_key n = splay_tree_lookup (mem_map, &cur_node); + if (n) + { + assert (n->refcount != REFCOUNT_LINK); + gomp_map_vars_existing (devicep, aq, n, &cur_node, row_desc, + kind & typemask, cbufp); + target_row_addr = n->tgt->tgt_start + n->tgt_offset; + } + else + { + tgt->refcount++; + + splay_tree_key k = &array->key; + k->host_start = (uintptr_t) row; + k->host_end = k->host_start + nca->data_row_size; + + k->tgt = tgt; + k->refcount = 1; + k->link_key = NULL; + tgt_size = (tgt_size + align - 1) & ~(align - 1); + target_row_addr = tgt->tgt_start + tgt_size; + k->tgt_offset = tgt_size; + tgt_size += nca->data_row_size; + + row_desc->key = k; + row_desc->copy_from + = GOMP_MAP_COPY_FROM_P (kind & typemask); + row_desc->always_copy_from + = GOMP_MAP_COPY_FROM_P (kind & typemask); + row_desc->offset = 0; + row_desc->length = nca->data_row_size; + + array->left = NULL; + array->right = NULL; + splay_tree_insert (mem_map, array); + + if (GOMP_MAP_COPY_TO_P (kind & typemask)) + gomp_copy_host2dev (devicep, aq, + (void *) tgt->tgt_start + k->tgt_offset, + (void *) k->host_start, + nca->data_row_size, cbufp); + array++; + } + target_data_rows[row_start + j] = (void *) target_row_addr; + } + + /* Now we have the target memory allocated, and target offsets of all + row blocks assigned and calculated, we can construct the + accelerator side ptrblock and copy it in. */ + if (nca->ptrblock_size) + { + void *ptrblock = gomp_noncontig_array_create_ptrblock + (nca, target_ptrblock, target_data_rows + row_start); + gomp_copy_host2dev (devicep, aq, target_ptrblock, ptrblock, + nca->ptrblock_size, cbufp); + free (ptrblock); + } + + row_start += nca->data_row_num; + } + assert (row_start == nca_data_row_num && nca_index == nca_info_num); + } } + if (nca_data_row_num) + free (host_data_rows); + if (pragma_kind == GOMP_MAP_VARS_TARGET) { for (i = 0; i < mapnum; i++) Index: libgomp/testsuite/libgomp.oacc-c-c++-common/noncontig_array-1.c =================================================================== --- libgomp/testsuite/libgomp.oacc-c-c++-common/noncontig_array-1.c (nonexistent) +++ libgomp/testsuite/libgomp.oacc-c-c++-common/noncontig_array-1.c (working copy) @@ -0,0 +1,103 @@ +/* { dg-do run } */ + +#include +#include + +#define n 100 +#define m 100 + +int b[n][m]; + +void +test1 (void) +{ + int i, j, *a[100]; + + /* Array of pointers form test. */ + for (i = 0; i < n; i++) + { + a[i] = (int *)malloc (sizeof (int) * m); + for (j = 0; j < m; j++) + b[i][j] = j - i; + } + + #pragma acc parallel loop copyout(a[0:n][0:m]) copyin(b) + for (i = 0; i < n; i++) + #pragma acc loop + for (j = 0; j < m; j++) + a[i][j] = b[i][j]; + + for (i = 0; i < n; i++) + { + for (j = 0; j < m; j++) + assert (a[i][j] == b[i][j]); + /* Clean up. */ + free (a[i]); + } +} + +void +test2 (void) +{ + int i, j, **a = (int **) malloc (sizeof (int *) * n); + + /* Separately allocated blocks. */ + for (i = 0; i < n; i++) + { + a[i] = (int *)malloc (sizeof (int) * m); + for (j = 0; j < m; j++) + b[i][j] = j - i; + } + + #pragma acc parallel loop copyout(a[0:n][0:m]) copyin(b) + for (i = 0; i < n; i++) + #pragma acc loop + for (j = 0; j < m; j++) + a[i][j] = b[i][j]; + + for (i = 0; i < n; i++) + { + for (j = 0; j < m; j++) + assert (a[i][j] == b[i][j]); + /* Clean up. */ + free (a[i]); + } + free (a); +} + +void +test3 (void) +{ + int i, j, **a = (int **) malloc (sizeof (int *) * n); + a[0] = (int *) malloc (sizeof (int) * n * m); + + /* Rows allocated in one contiguous block. */ + for (i = 0; i < n; i++) + { + a[i] = *a + i * m; + for (j = 0; j < m; j++) + b[i][j] = j - i; + } + + #pragma acc parallel loop copyout(a[0:n][0:m]) copyin(b) + for (i = 0; i < n; i++) + #pragma acc loop + for (j = 0; j < m; j++) + a[i][j] = b[i][j]; + + for (i = 0; i < n; i++) + for (j = 0; j < m; j++) + assert (a[i][j] == b[i][j]); + + free (a[0]); + free (a); +} + +int +main (void) +{ + test1 (); + test2 (); + test3 (); + return 0; +} Index: libgomp/testsuite/libgomp.oacc-c-c++-common/noncontig_array-2.c =================================================================== --- libgomp/testsuite/libgomp.oacc-c-c++-common/noncontig_array-2.c (nonexistent) +++ libgomp/testsuite/libgomp.oacc-c-c++-common/noncontig_array-2.c (working copy) @@ -0,0 +1,37 @@ +/* { dg-do run } */ + +#include +#include "noncontig_array-utils.h" + +int +main (void) +{ + int n = 10; + int ***a = (int ***) create_ncarray (sizeof (int), n, 3); + int ***b = (int ***) create_ncarray (sizeof (int), n, 3); + int ***c = (int ***) create_ncarray (sizeof (int), n, 3); + + for (int i = 0; i < n; i++) + for (int j = 0; j < n; j++) + for (int k = 0; k < n; k++) + { + a[i][j][k] = i + j * k + k; + b[i][j][k] = j + k * i + i * j; + c[i][j][k] = a[i][j][k]; + } + + #pragma acc parallel copy (a[0:n][0:n][0:n]) copyin (b[0:n][0:n][0:n]) + { + for (int i = 0; i < n; i++) + for (int j = 0; j < n; j++) + for (int k = 0; k < n; k++) + a[i][j][k] += b[k][j][i] + i + j + k; + } + + for (int i = 0; i < n; i++) + for (int j = 0; j < n; j++) + for (int k = 0; k < n; k++) + assert (a[i][j][k] == c[i][j][k] + b[k][j][i] + i + j + k); + + return 0; +} Index: libgomp/testsuite/libgomp.oacc-c-c++-common/noncontig_array-3.c =================================================================== --- libgomp/testsuite/libgomp.oacc-c-c++-common/noncontig_array-3.c (nonexistent) +++ libgomp/testsuite/libgomp.oacc-c-c++-common/noncontig_array-3.c (working copy) @@ -0,0 +1,45 @@ +/* { dg-do run } */ + +#include +#include "noncontig_array-utils.h" + +int main (void) +{ + int n = 20, x = 5, y = 12; + int *****a = (int *****) create_ncarray (sizeof (int), n, 5); + + int sum1 = 0, sum2 = 0, sum3 = 0; + + for (int i = 0; i < n; i++) + for (int j = 0; j < n; j++) + for (int k = 0; k < n; k++) + for (int l = 0; l < n; l++) + for (int m = 0; m < n; m++) + { + a[i][j][k][l][m] = 1; + sum1++; + } + + #pragma acc parallel copy (a[x:y][x:y][x:y][x:y][x:y]) copy(sum2) + { + for (int i = x; i < x + y; i++) + for (int j = x; j < x + y; j++) + for (int k = x; k < x + y; k++) + for (int l = x; l < x + y; l++) + for (int m = x; m < x + y; m++) + { + a[i][j][k][l][m] = 0; + sum2++; + } + } + + for (int i = 0; i < n; i++) + for (int j = 0; j < n; j++) + for (int k = 0; k < n; k++) + for (int l = 0; l < n; l++) + for (int m = 0; m < n; m++) + sum3 += a[i][j][k][l][m]; + + assert (sum1 == sum2 + sum3); + return 0; +} Index: libgomp/testsuite/libgomp.oacc-c-c++-common/noncontig_array-4.c =================================================================== --- libgomp/testsuite/libgomp.oacc-c-c++-common/noncontig_array-4.c (nonexistent) +++ libgomp/testsuite/libgomp.oacc-c-c++-common/noncontig_array-4.c (working copy) @@ -0,0 +1,36 @@ +/* { dg-do run } */ + +#include +#include "noncontig_array-utils.h" + +int main (void) +{ + int n = 128; + double ***a = (double ***) create_ncarray (sizeof (double), n, 3); + double ***b = (double ***) create_ncarray (sizeof (double), n, 3); + + for (int i = 0; i < n; i++) + for (int j = 0; j < n; j++) + for (int k = 0; k < n; k++) + a[i][j][k] = i + j + k + i * j * k; + + /* This test exercises async copyout of non-contiguous array rows. */ + #pragma acc parallel copyin(a[0:n][0:n][0:n]) copyout(b[0:n][0:n][0:n]) async(5) + { + #pragma acc loop gang + for (int i = 0; i < n; i++) + #pragma acc loop vector + for (int j = 0; j < n; j++) + for (int k = 0; k < n; k++) + b[i][j][k] = a[i][j][k] * 2.0; + } + + #pragma acc wait (5) + + for (int i = 0; i < n; i++) + for (int j = 0; j < n; j++) + for (int k = 0; k < n; k++) + assert (b[i][j][k] == a[i][j][k] * 2.0); + + return 0; +} Index: libgomp/testsuite/libgomp.oacc-c-c++-common/noncontig_array-utils.h =================================================================== --- libgomp/testsuite/libgomp.oacc-c-c++-common/noncontig_array-utils.h (nonexistent) +++ libgomp/testsuite/libgomp.oacc-c-c++-common/noncontig_array-utils.h (working copy) @@ -0,0 +1,44 @@ +#include +#include +#include +#include + +/* Allocate and create a pointer based NDIMS-dimensional array, + each dimension DIMLEN long, with ELSIZE sized data elements. */ +void * +create_ncarray (size_t elsize, int dimlen, int ndims) +{ + size_t blk_size = 0; + size_t n = 1; + + for (int i = 0; i < ndims - 1; i++) + { + n *= dimlen; + blk_size += sizeof (void *) * n; + } + size_t data_rows_num = n; + size_t data_rows_offset = blk_size; + blk_size += elsize * n * dimlen; + + void *blk = (void *) malloc (blk_size); + memset (blk, 0, blk_size); + void **curr_dim = (void **) blk; + n = 1; + + for (int d = 0; d < ndims - 1; d++) + { + uintptr_t next_dim = (uintptr_t) (curr_dim + n * dimlen); + size_t next_dimlen = dimlen * (d < ndims - 2 ? sizeof (void *) : elsize); + + for (int b = 0; b < n; b++) + for (int i = 0; i < dimlen; i++) + if (d < ndims - 1) + curr_dim[b * dimlen + i] + = (void*) (next_dim + b * dimlen * next_dimlen + i * next_dimlen); + + n *= dimlen; + curr_dim = (void**) next_dim; + } + assert (n == data_rows_num); + return blk; +}