diff mbox series

[v2,4/5] openmp, fortran: Add support for map iterators in OpenMP target construct (Fortran)

Message ID 1728c2ce-3a61-4ad8-beef-21b361e9a0d0@baylibre.com
State New
Headers show
Series openmp: Add support for iterators in OpenMP mapping clauses | expand

Commit Message

Kwok Cheung Yeung Sept. 3, 2024, 5:10 p.m. UTC
This patch adds support for iterators in the map clause of OpenMP target 
constructs.

The parsing and translation of iterators in the front-end works the same 
as for the affinity and depend clauses.

The iterator gimplification needed to be modified slightly to handle 
Fortran. The difference in how ranges work in loops (i.e. the condition 
on the upper bound is <=, rather than < as in C/C++) needs to be 
compensated for when calculating the iteration count and in the 
iteration loop itself.

During Fortran translation of iterators, statements for the side-effects 
of any translated expressions are placed into BLOCK_SUBBLOCKS of the 
block containing the iterator variables (this also occurs with the other 
clauses supporting iterators). However, the previous lowering of 
iterators into Gimple does not appear to do anything with these 
statements, which causes issues if anything in the loop body references 
these side-effects (typically calculation of array boundaries and 
strides). This appears to be a bug that was simply not triggered by 
existing testcases. These statements are now gimplified into the 
innermost loop body.

The libgomp runtime was modified to handle GOMP_MAP_STRUCTs in 
iterators, which can result from the use of derived types (which I used 
in test cases to implement arrays of pointers). libgomp expects a 
GOMP_MAP_STRUCT map to be followed immediately by a number of maps 
corresponding to the fields of the struct, so an iterator 
GOMP_MAP_STRUCT and its fields need to be expanded in a breadth-first 
order, rather than the usual depth-first manner (which would result in 
multiple GOMP_MAP_STRUCTS, followed by multiple instances of the first 
field, then multiples of the second etc.).

When filling in the .omp_data_t data structure for the target, only the 
address associated with the first map generated by an iterator is set 
(as only a single slot in the data structure is allocated for each 
iterator map).
From f7cdf555e9d5c49b455a364a1eef2123c7bb76d1 Mon Sep 17 00:00:00 2001
From: Kwok Cheung Yeung <kcyeung@baylibre.com>
Date: Mon, 2 Sep 2024 19:34:15 +0100
Subject: [PATCH 4/5] openmp, fortran: Add support for map iterators in OpenMP
 target construct (Fortran)

This adds support for iterators in map clauses within OpenMP
'target' constructs in Fortran.

Some special handling for struct field maps has been added to libgomp in
order to handle arrays of derived types.

2024-09-02  Kwok Cheung Yeung  <kcyeung@baylibre.com>

	gcc/fortran/
	* dump-parse-tree.cc (show_omp_namelist): Add iterator support for
	OMP_LIST_MAP.
	* openmp.cc (gfc_free_omp_clauses): Free namespace in namelist for
	OMP_LIST_MAP.
	(gfc_match_omp_clauses): Parse 'iterator' modifier for 'map' clause.
	(resolve_omp_clauses): Resolve iterators for OMP_LIST_MAP.
	* trans-openmp.cc (gfc_trans_omp_clauses): Handle iterators in
	OMP_LIST_MAP clauses.

	gcc/
	* gimplify.cc (compute_iterator_count): Account for difference in loop
	boundaries in Fortran.
	(build_iterator_loop): Change upper boundary condition for Fortran.
	Insert block statements into innermost loop.
	(omp_accumulate_sibling_list): Prevent structs generated by iterators
	from being treated as unordered.
	* tree-pretty-print.cc (dump_block_node): Ignore BLOCK_SUBBLOCKS
	containing iterator block statements.

	gcc/testsuite/
	* gfortran.dg/gomp/target-iterator-1.f90: New.
	* gfortran.dg/gomp/target-iterator-2.f90: New.
	* gfortran.dg/gomp/target-iterator-3.f90: New.

	libgomp/
	* target.c (kind_to_name): New.
	(gomp_add_map): New.
	(gomp_merge_iterator_maps): Return array indicating the iteration
	that a map originated from.  Expand fields of a struct mapping
	breadth-first.
	(gomp_map_vars_internal): Add extra argument in call to
	gomp_merge_iterator_maps and free it at the end.  Only add address of
	first iteration for field maps to target variables.
	(gomp_update): Add extra argument in call to gomp_merge_iterator_maps.
	Free it at the end of the function.
	* testsuite/libgomp.fortran/target-map-iterators-1.f90: New.
	* testsuite/libgomp.fortran/target-map-iterators-2.f90: New.
	* testsuite/libgomp.fortran/target-map-iterators-3.f90: New.
---
 gcc/fortran/dump-parse-tree.cc                |   9 +-
 gcc/fortran/openmp.cc                         |  35 ++++-
 gcc/fortran/trans-openmp.cc                   |  73 ++++++++--
 gcc/gimplify.cc                               |  36 +++--
 .../gfortran.dg/gomp/target-iterator-1.f90    |  26 ++++
 .../gfortran.dg/gomp/target-iterator-2.f90    |  27 ++++
 .../gfortran.dg/gomp/target-iterator-3.f90    |  24 ++++
 gcc/tree-pretty-print.cc                      |   4 +-
 libgomp/target.c                              | 132 ++++++++++++++----
 .../target-map-iterators-1.f90                |  45 ++++++
 .../target-map-iterators-2.f90                |  45 ++++++
 .../target-map-iterators-3.f90                |  57 ++++++++
 12 files changed, 451 insertions(+), 62 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/gomp/target-iterator-1.f90
 create mode 100644 gcc/testsuite/gfortran.dg/gomp/target-iterator-2.f90
 create mode 100644 gcc/testsuite/gfortran.dg/gomp/target-iterator-3.f90
 create mode 100644 libgomp/testsuite/libgomp.fortran/target-map-iterators-1.f90
 create mode 100644 libgomp/testsuite/libgomp.fortran/target-map-iterators-2.f90
 create mode 100644 libgomp/testsuite/libgomp.fortran/target-map-iterators-3.f90
diff mbox series

Patch

diff --git a/gcc/fortran/dump-parse-tree.cc b/gcc/fortran/dump-parse-tree.cc
index 80aa8ef84e7..0272a443f65 100644
--- a/gcc/fortran/dump-parse-tree.cc
+++ b/gcc/fortran/dump-parse-tree.cc
@@ -1349,7 +1349,8 @@  show_omp_namelist (int list_type, gfc_omp_namelist *n)
   for (; n; n = n->next)
     {
       gfc_current_ns = ns_curr;
-      if (list_type == OMP_LIST_AFFINITY || list_type == OMP_LIST_DEPEND)
+      if (list_type == OMP_LIST_AFFINITY || list_type == OMP_LIST_DEPEND
+	  || list_type == OMP_LIST_MAP)
 	{
 	  gfc_current_ns = n->u2.ns ? n->u2.ns : ns_curr;
 	  if (n->u2.ns != ns_iter)
@@ -1361,8 +1362,12 @@  show_omp_namelist (int list_type, gfc_omp_namelist *n)
 		    fputs ("AFFINITY (", dumpfile);
 		  else if (n->u.depend_doacross_op == OMP_DOACROSS_SINK_FIRST)
 		    fputs ("DOACROSS (", dumpfile);
-		  else
+		  else if (list_type == OMP_LIST_DEPEND)
 		    fputs ("DEPEND (", dumpfile);
+		  else if (list_type == OMP_LIST_MAP)
+		    fputs ("MAP (", dumpfile);
+		  else
+		    gcc_unreachable ();
 		}
 	      if (n->u2.ns)
 		{
diff --git a/gcc/fortran/openmp.cc b/gcc/fortran/openmp.cc
index 333f0c7fe7f..996126e6e7f 100644
--- a/gcc/fortran/openmp.cc
+++ b/gcc/fortran/openmp.cc
@@ -191,7 +191,8 @@  gfc_free_omp_clauses (gfc_omp_clauses *c)
   gfc_free_expr (c->vector_length_expr);
   for (i = 0; i < OMP_LIST_NUM; i++)
     gfc_free_omp_namelist (c->lists[i],
-			   i == OMP_LIST_AFFINITY || i == OMP_LIST_DEPEND,
+			   i == OMP_LIST_AFFINITY || i == OMP_LIST_DEPEND
+			   || i == OMP_LIST_MAP,
 			   i == OMP_LIST_ALLOCATE,
 			   i == OMP_LIST_USES_ALLOCATORS);
   gfc_free_expr_list (c->wait_list);
@@ -3079,9 +3080,12 @@  gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
 	      int always_modifier = 0;
 	      int close_modifier = 0;
 	      int present_modifier = 0;
+	      int iterator_modifier = 0;
+	      gfc_namespace *ns_iter = NULL, *ns_curr = gfc_current_ns;
 	      locus second_always_locus = old_loc2;
 	      locus second_close_locus = old_loc2;
 	      locus second_present_locus = old_loc2;
+	      locus second_iterator_locus = old_loc2;
 
 	      for (;;)
 		{
@@ -3101,6 +3105,11 @@  gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
 		      if (present_modifier++ == 1)
 			second_present_locus = current_locus;
 		    }
+		  else if (gfc_match_iterator (&ns_iter, true) == MATCH_YES)
+		    {
+		      if (iterator_modifier++ == 1)
+		      second_iterator_locus = current_locus;
+		    }
 		  else
 		    break;
 		  gfc_match (", ");
@@ -3157,15 +3166,30 @@  gfc_match_omp_clauses (gfc_omp_clauses **cp, const omp_mask mask,
 			     &second_present_locus);
 		  break;
 		}
+	      if (iterator_modifier > 1)
+		{
+		  gfc_error ("too many %<iterator%> modifiers at %L",
+			     &second_iterator_locus);
+		  break;
+		}
 
 	      head = NULL;
-	      if (gfc_match_omp_variable_list ("", &c->lists[OMP_LIST_MAP],
+	      if (ns_iter)
+		gfc_current_ns = ns_iter;
+	      m = gfc_match_omp_variable_list ("", &c->lists[OMP_LIST_MAP],
 					       false, NULL, &head,
-					       true, true) == MATCH_YES)
+					       true, true);
+	      gfc_current_ns = ns_curr;
+	      if (m == MATCH_YES)
 		{
 		  gfc_omp_namelist *n;
 		  for (n = *head; n; n = n->next)
-		    n->u.map.op = map_op;
+		    {
+		      n->u.map.op = map_op;
+		      n->u2.ns = ns_iter;
+		      if (ns_iter)
+			ns_iter->refs++;
+		    }
 		  continue;
 		}
 	      gfc_current_locus = old_loc;
@@ -8411,7 +8435,8 @@  resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses,
 	  case OMP_LIST_CACHE:
 	    for (; n != NULL; n = n->next)
 	      {
-		if ((list == OMP_LIST_DEPEND || list == OMP_LIST_AFFINITY)
+		if ((list == OMP_LIST_DEPEND || list == OMP_LIST_AFFINITY
+		     || list == OMP_LIST_MAP)
 		    && n->u2.ns && !n->u2.ns->resolved)
 		  {
 		    n->u2.ns->resolved = 1;
diff --git a/gcc/fortran/trans-openmp.cc b/gcc/fortran/trans-openmp.cc
index df1bf144e23..a9929430e53 100644
--- a/gcc/fortran/trans-openmp.cc
+++ b/gcc/fortran/trans-openmp.cc
@@ -2694,7 +2694,7 @@  gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
 		       locus where, bool declare_simd = false,
 		       bool openacc = false, gfc_exec_op op = EXEC_NOP)
 {
-  tree omp_clauses = NULL_TREE, prev_clauses, chunk_size, c;
+  tree omp_clauses = NULL_TREE, prev_clauses = NULL_TREE, chunk_size, c;
   tree iterator = NULL_TREE;
   tree tree_block = NULL_TREE;
   stmtblock_t iter_block;
@@ -3129,11 +3129,40 @@  gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
 	    }
 	  break;
 	case OMP_LIST_MAP:
+	  iterator = NULL_TREE;
+	  prev = NULL;
+	  prev_clauses = omp_clauses;
 	  for (; n != NULL; n = n->next)
 	    {
 	      if (!n->sym->attr.referenced)
 		continue;
 
+	      if (iterator && prev->u2.ns != n->u2.ns)
+		{
+		  /* Finish previous iterator group.  */
+		  BLOCK_SUBBLOCKS (tree_block) = gfc_finish_block (&iter_block);
+		  TREE_VEC_ELT (iterator, 5) = tree_block;
+		  for (tree c = omp_clauses; c != prev_clauses;
+		       c = OMP_CLAUSE_CHAIN (c))
+		    OMP_CLAUSE_DECL (c) = build_tree_list (iterator,
+							   OMP_CLAUSE_DECL (c));
+		  prev_clauses = omp_clauses;
+		  iterator = NULL_TREE;
+		}
+	      if (n->u2.ns && (!prev || prev->u2.ns != n->u2.ns))
+		{
+		  /* Start a new iterator group.  */
+		  gfc_init_block (&iter_block);
+		  tree_block = make_node (BLOCK);
+		  TREE_USED (tree_block) = 1;
+		  BLOCK_VARS (tree_block) = NULL_TREE;
+		  prev_clauses = omp_clauses;
+		  iterator = handle_iterator (n->u2.ns, block, tree_block);
+		}
+	      if (!iterator)
+		gfc_init_block (&iter_block);
+	      prev = n;
+
 	      bool always_modifier = false;
 	      tree node = build_omp_clause (input_location, OMP_CLAUSE_MAP);
 	      tree node2 = NULL_TREE;
@@ -3332,7 +3361,7 @@  gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
 						       TRUTH_NOT_EXPR,
 						       boolean_type_node,
 						       present);
-			  gfc_add_expr_to_block (block,
+			  gfc_add_expr_to_block (&iter_block,
 						 build3_loc (input_location,
 							     COND_EXPR,
 							     void_type_node,
@@ -3392,7 +3421,8 @@  gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
 		      tree type = TREE_TYPE (decl);
 		      tree ptr = gfc_conv_descriptor_data_get (decl);
 		      if (present)
-			ptr = gfc_build_cond_assign_expr (block, present, ptr,
+			ptr = gfc_build_cond_assign_expr (&iter_block,
+							  present, ptr,
 							  null_pointer_node);
 		      gcc_assert (POINTER_TYPE_P (TREE_TYPE (ptr)));
 		      ptr = build_fold_indirect_ref (ptr);
@@ -3420,7 +3450,7 @@  gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
 			      ptr = gfc_conv_descriptor_data_get (decl);
 			      ptr = gfc_build_addr_expr (NULL, ptr);
 			      ptr = gfc_build_cond_assign_expr (
-				      block, present, ptr, null_pointer_node);
+				&iter_block, present, ptr, null_pointer_node);
 			      ptr = build_fold_indirect_ref (ptr);
 			      OMP_CLAUSE_DECL (node3) = ptr;
 			    }
@@ -3509,7 +3539,7 @@  gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
 						    TRUTH_ANDIF_EXPR,
 						    boolean_type_node,
 						    present, cond);
-			  gfc_add_expr_to_block (block,
+			  gfc_add_expr_to_block (&iter_block,
 						 build3_loc (input_location,
 							     COND_EXPR,
 							     void_type_node,
@@ -3538,12 +3568,12 @@  gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
 			      tree cond = build3_loc (input_location, COND_EXPR,
 						      void_type_node, present,
 						      cond_body, NULL_TREE);
-			      gfc_add_expr_to_block (block, cond);
+			      gfc_add_expr_to_block (&iter_block, cond);
 			      OMP_CLAUSE_SIZE (node) = var;
 			    }
 			  else
 			    {
-			      gfc_add_block_to_block (block, &cond_block);
+			      gfc_add_block_to_block (&iter_block, &cond_block);
 			      OMP_CLAUSE_SIZE (node) = size;
 			    }
 			}
@@ -3555,7 +3585,8 @@  gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
 		      /* A single indirectref is handled by the middle end.  */
 		      gcc_assert (!POINTER_TYPE_P (TREE_TYPE (decl)));
 		      decl = TREE_OPERAND (decl, 0);
-		      decl = gfc_build_cond_assign_expr (block, present, decl,
+		      decl = gfc_build_cond_assign_expr (&iter_block,
+							 present, decl,
 							 null_pointer_node);
 		      OMP_CLAUSE_DECL (node) = build_fold_indirect_ref (decl);
 		    }
@@ -3589,7 +3620,7 @@  gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
 							 size_type_node,
 							 cond, size,
 							 size_zero_node);
-		      size = gfc_evaluate_now (size, block);
+		      size = gfc_evaluate_now (size, &iter_block);
 		      OMP_CLAUSE_SIZE (node) = size;
 		    }
 		}
@@ -3608,7 +3639,8 @@  gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
 		      && !(POINTER_TYPE_P (type)
 			   && GFC_DESCRIPTOR_TYPE_P (TREE_TYPE (type))))
 		    k = GOMP_MAP_FIRSTPRIVATE_POINTER;
-		  gfc_trans_omp_array_section (block, op, n, decl, element,
+		  gfc_trans_omp_array_section (&iter_block,
+					       op, n, decl, element,
 					       !openacc, k, node, node2,
 					       node3, node4);
 		}
@@ -3626,12 +3658,12 @@  gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
 		  gfc_init_se (&se, NULL);
 
 		  gfc_conv_expr (&se, n->expr);
-		  gfc_add_block_to_block (block, &se.pre);
+		  gfc_add_block_to_block (&iter_block, &se.pre);
 		  /* For BT_CHARACTER a pointer is returned.  */
 		  OMP_CLAUSE_DECL (node)
 		    = POINTER_TYPE_P (TREE_TYPE (se.expr))
 		      ? build_fold_indirect_ref (se.expr) : se.expr;
-		  gfc_add_block_to_block (block, &se.post);
+		  gfc_add_block_to_block (&iter_block, &se.post);
 		  if (pointer || allocatable)
 		    {
 		      /* If it's a bare attach/detach clause, we just want
@@ -3843,7 +3875,7 @@  gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
 			  OMP_CLAUSE_DECL (node) = ptr;
 			  int rank = GFC_TYPE_ARRAY_RANK (type);
 			  OMP_CLAUSE_SIZE (node)
-			    = gfc_full_array_size (block, inner, rank);
+			    = gfc_full_array_size (&iter_block, inner, rank);
 			  tree elemsz
 			    = TYPE_SIZE_UNIT (gfc_get_element_type (type));
 			  map_kind = OMP_CLAUSE_MAP_KIND (node);
@@ -3981,7 +4013,8 @@  gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
 		      /* An array element or section.  */
 		      bool element = lastref->u.ar.type == AR_ELEMENT;
 		      gomp_map_kind kind = GOMP_MAP_ATTACH_DETACH;
-		      gfc_trans_omp_array_section (block, op, n, inner, element,
+		      gfc_trans_omp_array_section (&iter_block,
+						   op, n, inner, element,
 						   !openacc, kind, node, node2,
 						   node3, node4);
 		    }
@@ -3993,6 +4026,8 @@  gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
 
 	      finalize_map_clause:
 
+	      if (!iterator)
+		gfc_add_block_to_block (block, &iter_block);
 	      omp_clauses = gfc_trans_add_clause (node, omp_clauses);
 	      if (node2)
 		omp_clauses = gfc_trans_add_clause (node2, omp_clauses);
@@ -4003,6 +4038,16 @@  gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
 	      if (node5)
 		omp_clauses = gfc_trans_add_clause (node5, omp_clauses);
 	    }
+	  if (iterator)
+	    {
+	      /* Finish last iterator group.  */
+	      BLOCK_SUBBLOCKS (tree_block) = gfc_finish_block (&iter_block);
+	      TREE_VEC_ELT (iterator, 5) = tree_block;
+	      for (tree c = omp_clauses; c != prev_clauses;
+		c = OMP_CLAUSE_CHAIN (c))
+	      OMP_CLAUSE_DECL (c) = build_tree_list (iterator,
+						     OMP_CLAUSE_DECL (c));
+	    }
 	  break;
 	case OMP_LIST_TO:
 	case OMP_LIST_FROM:
diff --git a/gcc/gimplify.cc b/gcc/gimplify.cc
index 6e938296245..09e6b927d72 100644
--- a/gcc/gimplify.cc
+++ b/gcc/gimplify.cc
@@ -8858,10 +8858,17 @@  compute_iterator_count (tree t, gimple_seq *pre_p)
 	endmbegin = fold_build2_loc (loc, POINTER_DIFF_EXPR, stype, end, begin);
       else
 	endmbegin = fold_build2_loc (loc, MINUS_EXPR, type, end, begin);
-      tree stepm1 = fold_build2_loc (loc, MINUS_EXPR, stype, step,
-				     build_int_cst (stype, 1));
-      tree stepp1 = fold_build2_loc (loc, PLUS_EXPR, stype, step,
-				     build_int_cst (stype, 1));
+      /* Account for iteration stopping on the end value in Fortran rather
+	 than before it.  */
+      tree stepm1 = step;
+      tree stepp1 = step;
+      if (!lang_GNU_Fortran ())
+	{
+	  stepm1 = fold_build2_loc (loc, MINUS_EXPR, stype, step,
+				    build_int_cst (stype, 1));
+	  stepp1 = fold_build2_loc (loc, PLUS_EXPR, stype, step,
+				    build_int_cst (stype, 1));
+	}
       tree pos = fold_build2_loc (loc, PLUS_EXPR, stype,
 				  unshare_expr (endmbegin), stepm1);
       pos = fold_build2_loc (loc, TRUNC_DIV_EXPR, stype, pos, step);
@@ -8913,6 +8920,7 @@  build_iterator_loop (tree c, gimple_seq *pre_p, tree *last_bind)
       gimplify_ctxp->into_ssa = saved_into_ssa;
     }
   tree block = TREE_VEC_ELT (TREE_PURPOSE (t), 5);
+  tree block_stmts = lang_GNU_Fortran () ? BLOCK_SUBBLOCKS (block) : NULL_TREE;
   *last_bind = build3 (BIND_EXPR, void_type_node,
 		       BLOCK_VARS (block), NULL, block);
   TREE_SIDE_EFFECTS (*last_bind) = 1;
@@ -8925,6 +8933,7 @@  build_iterator_loop (tree c, gimple_seq *pre_p, tree *last_bind)
       tree end = TREE_VEC_ELT (it, 2);
       tree step = TREE_VEC_ELT (it, 3);
       tree orig_step = TREE_VEC_ELT (it, 4);
+      block = TREE_VEC_ELT (it, 5);
       tree type = TREE_TYPE (var);
       location_t loc = DECL_SOURCE_LOCATION (var);
       /* Emit:
@@ -8935,9 +8944,9 @@  build_iterator_loop (tree c, gimple_seq *pre_p, tree *last_bind)
 	 var = var + step;
 	 cond_label:
 	 if (orig_step > 0) {
-	   if (var < end) goto beg_label;
+	   if (var < end) goto beg_label;  // <= for Fortran
 	 } else {
-	   if (var > end) goto beg_label;
+	   if (var > end) goto beg_label;  // >= for Fortran
 	 }
 	 for each iterator, with inner iterators added to
 	 the ... above.  */
@@ -8963,10 +8972,12 @@  build_iterator_loop (tree c, gimple_seq *pre_p, tree *last_bind)
       append_to_statement_list_force (tem, p);
       tem = build1 (LABEL_EXPR, void_type_node, cond_label);
       append_to_statement_list (tem, p);
-      tree cond = fold_build2_loc (loc, LT_EXPR, boolean_type_node, var, end);
+      tree cond = fold_build2_loc (loc, lang_GNU_Fortran () ? LE_EXPR : LT_EXPR,
+				   boolean_type_node, var, end);
       tree pos = fold_build3_loc (loc, COND_EXPR, void_type_node, cond,
 				  build_and_jump (&beg_label), void_node);
-      cond = fold_build2_loc (loc, GT_EXPR, boolean_type_node, var, end);
+      cond = fold_build2_loc (loc, lang_GNU_Fortran () ? GE_EXPR : GT_EXPR,
+			      boolean_type_node, var, end);
       tree neg = fold_build3_loc (loc, COND_EXPR, void_type_node, cond,
 				  build_and_jump (&beg_label), void_node);
       tree osteptype = TREE_TYPE (orig_step);
@@ -8975,6 +8986,11 @@  build_iterator_loop (tree c, gimple_seq *pre_p, tree *last_bind)
       tem = fold_build3_loc (loc, COND_EXPR, void_type_node, cond, pos, neg);
       append_to_statement_list_force (tem, p);
       p = &BIND_EXPR_BODY (bind);
+      /* The Fortran front-end stashes statements into the BLOCK_SUBBLOCKS
+	 of the last element of the first iterator.  These should go into the
+	 body of the innermost loop.  */
+      if (!TREE_CHAIN (it))
+	append_to_statement_list_force (block_stmts, p);
     }
 
   return p;
@@ -11398,6 +11414,8 @@  omp_accumulate_sibling_list (enum omp_region_type region_type,
   poly_offset_int coffset;
   poly_int64 cbitpos;
   tree ocd = OMP_ITERATOR_CLAUSE_DECL (grp_end);
+  tree iterator = OMP_ITERATOR_DECL_P (OMP_CLAUSE_DECL (grp_end))
+    ? TREE_PURPOSE (OMP_CLAUSE_DECL (grp_end)) : NULL_TREE;
   bool openmp = !(region_type & ORT_ACC);
   bool target = (region_type & ORT_TARGET) != 0;
   tree *continue_at = NULL;
@@ -11476,7 +11494,7 @@  omp_accumulate_sibling_list (enum omp_region_type region_type,
       if (struct_map_to_clause == NULL)
 	struct_map_to_clause = new hash_map<tree_operand_hash, tree>;
 
-      if (variable_offset)
+      if (variable_offset && !iterator)
 	str_kind = GOMP_MAP_STRUCT_UNORD;
 
       tree l = build_omp_clause (OMP_CLAUSE_LOCATION (grp_end), OMP_CLAUSE_MAP);
diff --git a/gcc/testsuite/gfortran.dg/gomp/target-iterator-1.f90 b/gcc/testsuite/gfortran.dg/gomp/target-iterator-1.f90
new file mode 100644
index 00000000000..25abbaf741e
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/gomp/target-iterator-1.f90
@@ -0,0 +1,26 @@ 
+! { dg-do compile }
+! { dg-options "-fopenmp" }
+
+program main
+  implicit none
+
+  integer, parameter :: DIM1 = 17
+  integer, parameter :: DIM2 = 39
+  type :: array_ptr
+    integer, pointer :: ptr(:)
+  end type
+  
+  type (array_ptr) :: x(DIM1), y(DIM1)
+
+  !$omp target map (iterator(i=1:DIM1), to: x(i)%ptr(:))
+  !$omp end target
+
+  !$omp target map (iterator(i=1:DIM1), to: x(i)%ptr(:), y(i)%ptr(:))
+  !$omp end target
+
+  !$omp target map (iterator(i=1:DIM1), to: x(i)%ptr(:) + 3) ! { dg-error "Syntax error in OpenMP variable list at .1." }
+  !$omp end target ! { dg-error "Unexpected \\\!\\\$OMP END TARGET statement at .1." }
+
+  !$omp target map(iterator(i=1:DIM1), iterator(j=1:DIM2), to: x(i)%ptr(j)) ! { dg-error "too many 'iterator' modifiers at .1." }
+  !$omp end target ! { dg-error "Unexpected \\\!\\\$OMP END TARGET statement at .1." }
+end program
diff --git a/gcc/testsuite/gfortran.dg/gomp/target-iterator-2.f90 b/gcc/testsuite/gfortran.dg/gomp/target-iterator-2.f90
new file mode 100644
index 00000000000..b7d7501cf63
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/gomp/target-iterator-2.f90
@@ -0,0 +1,27 @@ 
+! { dg-do compile }
+! { dg-options "-fopenmp" }
+
+program main
+  implicit none
+
+  integer, parameter :: DIM = 40
+  type :: array_ptr
+    integer, pointer :: ptr(:)
+  end type
+  
+  type (array_ptr) :: x(DIM), y(DIM), z(DIM)
+
+  !$omp target map(iterator(i=1:10), to: x) ! { dg-error "iterator variable .i. not used in clause expression" }
+    ! Add a reference to x to ensure that the 'to' clause does not get dropped.
+    x(1)%ptr(1) = 0
+  !$omp end target
+
+  !$omp target map(iterator(i=1:10, j=1:20), to: x(i)) ! { dg-error "iterator variable .j. not used in clause expression" }
+  !$omp end target
+
+  !$omp target map(iterator(i=1:10, j=1:20, k=1:30), to: x(i), y(j), z(k))
+  !$omp end target
+  ! { dg-error "iterator variable .i. not used in clause expression" "" { target *-*-* } .-2 }
+  ! { dg-error "iterator variable .j. not used in clause expression" "" { target *-*-* } .-3 }
+  ! { dg-error "iterator variable .k. not used in clause expression" "" { target *-*-* } .-4 }
+end program
diff --git a/gcc/testsuite/gfortran.dg/gomp/target-iterator-3.f90 b/gcc/testsuite/gfortran.dg/gomp/target-iterator-3.f90
new file mode 100644
index 00000000000..3cff65ab072
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/gomp/target-iterator-3.f90
@@ -0,0 +1,24 @@ 
+! { dg-do compile }
+! { dg-options "-fopenmp -fdump-tree-gimple" }
+
+program main
+  implicit none
+
+  integer, parameter :: DIM1 = 17
+  integer, parameter :: DIM2 = 27
+  type :: ptr_t
+    integer, pointer :: ptr(:)
+  end type
+  
+  type (ptr_t) :: x(DIM1), y(DIM2)
+
+  !$omp target map(iterator(i=1:DIM1), to: x(i)%ptr(:)) map(iterator(i=1:DIM2), from: y(i)%ptr(:))
+  !$omp end target
+end program
+
+! { dg-final { scan-tree-dump-times "if \\(i <= 17\\) goto <D\.\[0-9\]+>; else goto <D\.\[0-9\]+>;" 1 "gimple" } }
+! { dg-final { scan-tree-dump-times "if \\(i <= 27\\) goto <D\.\[0-9\]+>; else goto <D\.\[0-9\]+>;" 1 "gimple" } }
+! { dg-final { scan-tree-dump-times "map\\(iterator\\(integer\\(kind=4\\) i=1:17:1\\):iterator_array=D\.\[0-9\]+:to:" 1 "gimple" } }
+! { dg-final { scan-tree-dump-times "map\\(iterator\\(integer\\(kind=4\\) i=1:27:1\\):iterator_array=D\.\[0-9\]+:from:" 1 "gimple" } }
+! { dg-final { scan-tree-dump-times "map\\(iterator\\(integer\\(kind=4\\) i=1:17:1\\):iterator_array=D\.\[0-9\]+:attach:" 1 "gimple" } }
+! { dg-final { scan-tree-dump-times "map\\(iterator\\(integer\\(kind=4\\) i=1:27:1\\):iterator_array=D\.\[0-9\]+:attach:" 1 "gimple" } }
diff --git a/gcc/tree-pretty-print.cc b/gcc/tree-pretty-print.cc
index e43f30818d0..8af83b934f0 100644
--- a/gcc/tree-pretty-print.cc
+++ b/gcc/tree-pretty-print.cc
@@ -1688,7 +1688,9 @@  dump_block_node (pretty_printer *pp, tree block, int spc, dump_flags_t flags)
       newline_and_indent (pp, spc + 2);
     }
 
-  if (BLOCK_SUBBLOCKS (block))
+  if (BLOCK_SUBBLOCKS (block)
+      && (!lang_GNU_Fortran ()
+	  || TREE_CODE (BLOCK_SUBBLOCKS (block)) != STATEMENT_LIST))
     {
       pp_string (pp, "SUBBLOCKS: ");
       for (t = BLOCK_SUBBLOCKS (block); t; t = BLOCK_CHAIN (t))
diff --git a/libgomp/target.c b/libgomp/target.c
index c69418f0b78..dabe88bc900 100644
--- a/libgomp/target.c
+++ b/libgomp/target.c
@@ -972,14 +972,74 @@  gomp_map_val (struct target_mem_desc *tgt, void **hostaddrs, size_t i)
     }
 }
 
+static const char *
+kind_to_name (unsigned short kind)
+{
+  if (GOMP_MAP_IMPLICIT_P (kind))
+    kind &= ~GOMP_MAP_IMPLICIT;
+
+  switch (kind & 0xff)
+    {
+    case GOMP_MAP_ALLOC: return "GOMP_MAP_ALLOC";
+    case GOMP_MAP_FIRSTPRIVATE: return "GOMP_MAP_FIRSTPRIVATE";
+    case GOMP_MAP_FIRSTPRIVATE_INT: return "GOMP_MAP_FIRSTPRIVATE_INT";
+    case GOMP_MAP_TO: return "GOMP_MAP_TO";
+    case GOMP_MAP_TO_PSET: return "GOMP_MAP_TO_PSET";
+    case GOMP_MAP_FROM: return "GOMP_MAP_FROM";
+    case GOMP_MAP_TOFROM: return "GOMP_MAP_TOFROM";
+    case GOMP_MAP_ATTACH: return "GOMP_MAP_ATTACH";
+    case GOMP_MAP_DETACH: return "GOMP_MAP_DETACH";
+    case GOMP_MAP_STRUCT: return "GOMP_MAP_STRUCT";
+    case GOMP_MAP_STRUCT_UNORD: return "GOMP_MAP_STRUCT_UNORD";
+    default: return "unknown";
+    }
+}
+
+static void
+gomp_add_map (size_t idx, size_t *new_idx,
+	      void ***hostaddrs, size_t **sizes, unsigned short **skinds,
+	      void ***new_hostaddrs, size_t **new_sizes,
+	      unsigned short **new_kinds, size_t *iterator_count)
+{
+  if ((*sizes)[idx] == SIZE_MAX)
+    {
+      uintptr_t *iterator_array = (*hostaddrs)[idx];
+      size_t count = *iterator_array++;
+      for (size_t i = 0; i < count; i++)
+	{
+	  (*new_hostaddrs)[*new_idx] = (void *) *iterator_array++;
+	  (*new_sizes)[*new_idx] = *iterator_array++;
+	  (*new_kinds)[*new_idx] = (*skinds)[idx];
+	  iterator_count[*new_idx] = i + 1;
+	  gomp_debug (1,
+		      "Expanding map %ld <%s>: "
+		      "hostaddrs[%ld] = %p, sizes[%ld] = %ld\n",
+		      idx, kind_to_name ((*new_kinds)[*new_idx]),
+		      *new_idx, (*new_hostaddrs)[*new_idx],
+		      *new_idx, (*new_sizes)[*new_idx]);
+	  (*new_idx)++;
+	}
+    }
+  else
+    {
+      (*new_hostaddrs)[*new_idx] = (*hostaddrs)[idx];
+      (*new_sizes)[*new_idx] = (*sizes)[idx];
+      (*new_kinds)[*new_idx] = (*skinds)[idx];
+      iterator_count[*new_idx] = 0;
+      (*new_idx)++;
+    }
+}
+
 /* Map entries containing expanded iterators will be flattened and merged into
    HOSTADDRS, SIZES and KINDS, and MAPNUM updated.  Returns true if there are
-   any iterators found.  HOSTADDRS, SIZES and KINDS must be freed afterwards
-   if any merging occurs.  */
+   any iterators found.  ITERATOR_COUNT holds the iteration count of the
+   iterator that generates each map (0 if not generated from an iterator).
+   HOSTADDRS, SIZES, KINDS and ITERATOR_COUNT must be freed afterwards if any
+   merging occurs.  */
 
 static bool
 gomp_merge_iterator_maps (size_t *mapnum, void ***hostaddrs, size_t **sizes,
-			  void **kinds)
+			  void **kinds, size_t **iterator_count)
 {
   bool iterator_p = false;
   size_t map_count = 0;
@@ -1006,33 +1066,36 @@  gomp_merge_iterator_maps (size_t *mapnum, void ***hostaddrs, size_t **sizes,
   unsigned short *new_kinds
     = (unsigned short *) gomp_malloc (map_count * sizeof (unsigned short));
   size_t new_idx = 0;
+  *iterator_count = (size_t *) gomp_malloc (map_count * sizeof (size_t));
 
   for (size_t i = 0; i < *mapnum; i++)
     {
-      if ((*sizes)[i] == SIZE_MAX)
+      int map_type = get_kind (true, *skinds, i) & 0xff;
+      if (map_type == GOMP_MAP_STRUCT || map_type == GOMP_MAP_STRUCT_UNORD)
 	{
-	  uintptr_t *iterator_array = (*hostaddrs)[i];
-	  size_t count = iterator_array[0];
-	  for (int j = 1; j < count * 2 + 1; j += 2)
+	  size_t field_count = (*sizes)[i];
+
+	  gomp_add_map (i, &new_idx, hostaddrs, sizes, skinds,
+			&new_hostaddrs, &new_sizes, &new_kinds, *iterator_count);
+
+	  for (size_t j = i + 1; j <= i + field_count; j++)
 	    {
-	      new_hostaddrs[new_idx] = (void *) iterator_array[j];
-	      new_sizes[new_idx] = iterator_array[j+1];
-	      new_kinds[new_idx] = (*skinds)[i];
-	      gomp_debug (1,
-			  "Expanding map %ld: "
-			  "hostaddrs[%ld] = %p, sizes[%ld] = %ld\n",
-			  i, new_idx, new_hostaddrs[new_idx],
-			  new_idx, new_sizes[new_idx]);
-	      new_idx++;
+	      if ((*sizes)[j] == SIZE_MAX)
+		{
+		  uintptr_t *iterator_array = (*hostaddrs)[j];
+		  size_t count = iterator_array[0];
+		  new_sizes[i] += count - 1;
+		}
+	      gomp_add_map (j, &new_idx, hostaddrs, sizes, skinds,
+			    &new_hostaddrs, &new_sizes, &new_kinds,
+			    *iterator_count);
 	    }
+	  gomp_debug (1, "Map %ld new field count = %ld\n", i, new_sizes[i]);
+	  i += field_count;
 	}
       else
-	{
-	  new_hostaddrs[new_idx] = (*hostaddrs)[i];
-	  new_sizes[new_idx] = (*sizes)[i];
-	  new_kinds[new_idx] = (*skinds)[i];
-	  new_idx++;
-	}
+	gomp_add_map (i, &new_idx, hostaddrs, sizes, skinds,
+		      &new_hostaddrs, &new_sizes, &new_kinds, *iterator_count);
     }
 
   *mapnum = map_count;
@@ -1060,9 +1123,10 @@  gomp_map_vars_internal (struct gomp_device_descr *devicep,
   struct splay_tree_s *mem_map = &devicep->mem_map;
   struct splay_tree_key_s cur_node;
   bool iterators_p = false;
+  size_t *iterator_count = NULL;
   if (short_mapkind)
     iterators_p = gomp_merge_iterator_maps (&mapnum, &hostaddrs, &sizes,
-					    &kinds);
+					    &kinds, &iterator_count);
   struct target_mem_desc *tgt
     = gomp_malloc (sizeof (*tgt) + sizeof (tgt->list[0]) * mapnum);
   tgt->list_count = mapnum;
@@ -1912,14 +1976,17 @@  gomp_map_vars_internal (struct gomp_device_descr *devicep,
 
   if (pragma_kind & GOMP_MAP_VARS_TARGET)
     {
+      size_t map_num = 0;
       for (i = 0; i < mapnum; i++)
-	{
-	  cur_node.tgt_offset = gomp_map_val (tgt, hostaddrs, i);
-	  gomp_copy_host2dev (devicep, aq,
-			      (void *) (tgt->tgt_start + i * sizeof (void *)),
-			      (void *) &cur_node.tgt_offset, sizeof (void *),
-			      true, cbufp);
-	}
+	if (!iterator_count || iterator_count[i] <= 1)
+	  {
+	    cur_node.tgt_offset = gomp_map_val (tgt, hostaddrs, i);
+	    gomp_copy_host2dev (devicep, aq,
+				(void *) (tgt->tgt_start + map_num * sizeof (void *)),
+				(void *) &cur_node.tgt_offset, sizeof (void *),
+				true, cbufp);
+	    map_num++;
+	  }
     }
 
   if (cbufp)
@@ -1957,6 +2024,7 @@  gomp_map_vars_internal (struct gomp_device_descr *devicep,
       free (hostaddrs);
       free (sizes);
       free (kinds);
+      free (iterator_count);
     }
 
   return tgt;
@@ -2225,6 +2293,7 @@  gomp_update (struct gomp_device_descr *devicep, size_t mapnum, void **hostaddrs,
   struct splay_tree_key_s cur_node;
   const int typemask = short_mapkind ? 0xff : 0x7;
   bool iterators_p = false;
+  size_t *iterator_count = NULL;
 
   if (!devicep)
     return;
@@ -2234,7 +2303,7 @@  gomp_update (struct gomp_device_descr *devicep, size_t mapnum, void **hostaddrs,
 
   if (short_mapkind)
     iterators_p = gomp_merge_iterator_maps (&mapnum, &hostaddrs, &sizes,
-					    &kinds);
+					    &kinds, &iterator_count);
 
   gomp_mutex_lock (&devicep->lock);
   if (devicep->state == GOMP_DEVICE_FINALIZED)
@@ -2335,6 +2404,7 @@  gomp_update (struct gomp_device_descr *devicep, size_t mapnum, void **hostaddrs,
       free (hostaddrs);
       free (sizes);
       free (kinds);
+      free (iterator_count);
     }
 }
 
diff --git a/libgomp/testsuite/libgomp.fortran/target-map-iterators-1.f90 b/libgomp/testsuite/libgomp.fortran/target-map-iterators-1.f90
new file mode 100644
index 00000000000..80e077e69fd
--- /dev/null
+++ b/libgomp/testsuite/libgomp.fortran/target-map-iterators-1.f90
@@ -0,0 +1,45 @@ 
+! { dg-do run }
+
+! Test transfer of dynamically-allocated arrays to target using map
+! iterators.
+
+program test
+  implicit none
+
+  integer, parameter :: DIM1 = 8
+  integer, parameter :: DIM2 = 15
+
+  type :: array_ptr
+    integer, pointer :: arr(:)
+  end type
+
+  type (array_ptr) :: x(DIM1)
+  integer :: expected, sum, i, j
+
+  expected = mkarray ()
+
+  !$omp target map(iterator(i=1:DIM1), to: x(i)%arr(:)) map(from: sum)
+    sum = 0
+    do i = 1, DIM1
+      do j = 1, DIM2
+	sum = sum + x(i)%arr(j)
+      end do
+    end do
+  !$omp end target
+
+  if (sum .ne. expected) stop 1
+contains
+  integer function mkarray ()
+    integer :: exp = 0
+
+    do i = 1, DIM1
+      allocate (x(i)%arr(DIM2))
+      do j = 1, DIM2
+        x(i)%arr(j) = i * j
+	exp = exp + x(i)%arr(j)
+      end do
+    end do
+
+    mkarray = exp
+  end function
+end program
diff --git a/libgomp/testsuite/libgomp.fortran/target-map-iterators-2.f90 b/libgomp/testsuite/libgomp.fortran/target-map-iterators-2.f90
new file mode 100644
index 00000000000..cf0e7fbd9b3
--- /dev/null
+++ b/libgomp/testsuite/libgomp.fortran/target-map-iterators-2.f90
@@ -0,0 +1,45 @@ 
+! { dg-do run }
+
+! Test transfer of dynamically-allocated arrays from target using map
+! iterators.
+
+program test
+  implicit none
+
+  integer, parameter :: DIM1 = 8
+  integer, parameter :: DIM2 = 15
+
+  type :: array_ptr
+    integer, pointer :: arr(:)
+  end type
+
+  type (array_ptr) :: x(DIM1)
+  integer :: expected, sum, i, j
+
+  call mkarray
+
+  !$omp target map(iterator(i=1:DIM1), from: x(i)%arr(:)) map(from: expected)
+    expected = 0
+    do i = 1, DIM1
+      do j = 1, DIM2
+	x(i)%arr(j) = (i+1) * (j+1)
+	expected = expected + x(i)%arr(j)
+      end do
+    end do
+  !$omp end target
+
+  sum = 0
+  do i = 1, DIM1
+    do j = 1, DIM2
+      sum = sum + x(i)%arr(j)
+    end do
+  end do
+
+  if (sum .ne. expected) stop 1
+contains
+  subroutine mkarray
+    do i = 1, DIM1
+      allocate (x(i)%arr(DIM2))
+    end do
+  end subroutine
+end program
diff --git a/libgomp/testsuite/libgomp.fortran/target-map-iterators-3.f90 b/libgomp/testsuite/libgomp.fortran/target-map-iterators-3.f90
new file mode 100644
index 00000000000..8072c074557
--- /dev/null
+++ b/libgomp/testsuite/libgomp.fortran/target-map-iterators-3.f90
@@ -0,0 +1,57 @@ 
+! { dg-do run }
+
+! Test transfer of dynamically-allocated arrays to target using map
+! iterators, with multiple iterators and function calls in the iterator
+! expression.
+
+program test
+  implicit none
+
+  integer, parameter :: DIM1 = 16
+  integer, parameter :: DIM2 = 4
+
+  type :: array_ptr
+    integer, pointer :: arr(:)
+  end type
+
+  type (array_ptr) :: x(DIM1), y(DIM1)
+  integer :: expected, sum, i, j
+
+  expected = mkarrays ()
+
+  !$omp target map(iterator(i=0:DIM1/4-1, j=0:3), to: x(f (i, j))%arr(:)) &
+  !$omp        map(iterator(i=1:DIM1), to: y(i)%arr(:)) &
+  !$omp        map(from: sum)
+    sum = 0
+    do i = 1, DIM1
+      do j = 1, DIM2
+	sum = sum + x(i)%arr(j) * y(i)%arr(j)
+      end do
+    end do
+  !$omp end target
+
+  print *, sum, expected
+  if (sum .ne. expected) stop 1
+contains
+  integer function mkarrays ()
+    integer :: exp = 0
+
+    do i = 1, DIM1
+      allocate (x(i)%arr(DIM2))
+      allocate (y(i)%arr(DIM2))
+      do j = 1, DIM2
+	x(i)%arr(j) = i * j
+	y(i)%arr(j) = i + j
+	exp = exp + x(i)%arr(j) * y(i)%arr(j)
+      end do
+    end do
+
+    mkarrays = exp
+  end function
+
+  integer function f (i, j)
+    integer, intent(in) :: i, j
+
+    f = i * 4 + j + 1
+  end function
+end program