[v4,1/5] libgomp, openmp: Add ompx_pinned_mem_alloc

Message ID 20240531120751.56140-2-ams@baylibre.com
State New
Series libgomp: OpenMP pinned memory for omp_alloc

Commit Message

Andrew Stubbs May 31, 2024, 12:07 p.m. UTC
Compared to the previous v3 posting of this patch, the enumeration of
the "ompx" allocators has been moved to start at "100".

---------

This creates a new predefined allocator as a shortcut for using pinned
memory with OpenMP.  The name uses the OpenMP extension space and is
intended to be consistent with other OpenMP implementations currently in
development.

The allocator is equivalent to using a custom allocator with the pinned
trait and the null fallback trait.

libgomp/ChangeLog:

	* allocator.c (ompx_min_predefined_alloc): New.
	(ompx_max_predefined_alloc): New.
	(predefined_alloc_mapping): Rename to ...
	(predefined_omp_alloc_mapping): ... this.
	(predefined_ompx_alloc_mapping): New.
	(predefined_allocator_p): New.
	(predefined_alloc_mapping): New (as a function).
	(omp_aligned_alloc): Support ompx_pinned_mem_alloc. Use
	predefined_allocator_p and predefined_alloc_mapping.
	(omp_free): Likewise.
	(omp_aligned_calloc): Likewise.
	(omp_realloc): Likewise.
	* libgomp.texi: Document ompx_pinned_mem_alloc.
	* omp.h.in (omp_allocator_handle_t): Add ompx_pinned_mem_alloc.
	* omp_lib.f90.in: Add ompx_pinned_mem_alloc.
	* testsuite/libgomp.c/alloc-pinned-5.c: New test.
	* testsuite/libgomp.c/alloc-pinned-6.c: New test.
	* testsuite/libgomp.fortran/alloc-pinned-1.f90: New test.

Co-Authored-By: Thomas Schwinge <thomas@codesourcery.com>
---
 libgomp/allocator.c                           | 115 +++++++++++++-----
 libgomp/libgomp.texi                          |   7 +-
 libgomp/omp.h.in                              |   1 +
 libgomp/omp_lib.f90.in                        |   2 +
 libgomp/testsuite/libgomp.c/alloc-pinned-5.c  | 103 ++++++++++++++++
 libgomp/testsuite/libgomp.c/alloc-pinned-6.c  | 101 +++++++++++++++
 .../libgomp.fortran/alloc-pinned-1.f90        |  16 +++
 7 files changed, 312 insertions(+), 33 deletions(-)
 create mode 100644 libgomp/testsuite/libgomp.c/alloc-pinned-5.c
 create mode 100644 libgomp/testsuite/libgomp.c/alloc-pinned-6.c
 create mode 100644 libgomp/testsuite/libgomp.fortran/alloc-pinned-1.f90

Comments

Tobias Burnus June 6, 2024, 11:40 a.m. UTC | #1
Hi Andrew, hi Jakub, hello world,

Andrew Stubbs wrote:

> Compared to the previous v3 posting of this patch, the enumeration of
> the "ompx" allocators has been moved to start at "100"

100 is a bad value - as can be seen below.

As Jakub suggested at https://gcc.gnu.org/pipermail/gcc-patches/2023-December/640432.html
"given that LLVM uses 100-102 range, perhaps pick a different one, 200 or 150"

(I know that the first review email suggested 100.)

> This creates a new predefined allocator as a shortcut for using pinned
> memory with OpenMP.  The name uses the OpenMP extension space and is
> intended to be consistent with other OpenMP implementations currently in
> development.

Namely: ompx_pinned_mem_alloc

RFC: Should we use this name or - similar to LLVM - use a
vendor prefix instead (gnu_omp_ or gcc_omp_ instead of ompx_)?

IMHO it is fine to use ompx_ for pinned as the semantics are clear
and should be compatible with IBM and AMD.

For other additional memspaces / allocators, I am less sure, i.e.
on OG13 there are:
- ompx_unified_shared_mem_space, ompx_host_mem_space
- ompx_unified_shared_mem_alloc, ompx_host_mem_alloc

(BTW: In light of TR13 naming, the USM one could be
..._devices_all_mem_{alloc,space}, just to start some bikeshedding,
or follow LLVM + Intel with '…target_{host,shared}…'.)

* * *

Looking at other compilers:

IBM's compiler, https://www.ibm.com/docs/en/SSXVZZ_16.1.1/pdf/compiler.pdf , has:
- ompx_pinned_mem_alloc, tagged as IBM extension and otherwise without documenting it further

Checking omp.h, they define it as:
   ompx_pinned_mem_alloc = 9, /* Preview of host pinned memory support */
and additionally have:
   LOMP_MAX_MEM_ALLOC = 1024,

AMD's compiler based on clang has:
       /* Preview of pinned memory support */
       ompx_pinned_mem_alloc = 120,
in addition to the LLVM defines shown below.

Regarding LLVM:
- they don't offer 'pinned'
- they use the prefix 'llvm_omp' not 'ompx'

Namely:
     typedef enum omp_allocator_handle_t
...
       llvm_omp_target_host_mem_alloc = 100,
       llvm_omp_target_shared_mem_alloc = 101,
       llvm_omp_target_device_mem_alloc = 102,
...
     typedef enum omp_memspace_handle_t
...
       llvm_omp_target_host_mem_space = 100,
       llvm_omp_target_shared_mem_space = 101,
       llvm_omp_target_device_mem_space = 102,

Remark: I did not find any documentation - and while I
understand host and shared in principle, I wonder how
LLVM handles 'device_mem_space' when there is more than
one device.

BTW: OpenMP TR13 avoids this issue by adding two sets of
API routines. Namely:

First, for memspaces,
- omp_get_{device,devices}_memspace
- omp_get_{device,devices}_and_host_memspace
- omp_get_devices_all_memspace

and, secondly, for allocators:
- omp_get_{device,devices}_allocator
- omp_get_{device,devices}_and_host_allocator
- omp_get_devices_all_allocator

where omp_get_device_* takes a single device number and
omp_get_devices_* a list of device numbers while _and_host
automatically adds the initial device to the list.

* * *

Looking at Intel, they even use extensions without prefix:

omp_target_{host,shared,device}_mem_{space,alloc}

and, unlike LLVM, they document them with their semantics, cf.
https://www.intel.com/content/www/us/en/docs/dpcpp-cpp-compiler/developer-guide-reference/2023-1/openmp-memory-spaces-and-allocators.html

* * *

> The allocator is equivalent to using a custom allocator with the pinned
> trait and the null fallback trait.

...

> diff --git a/libgomp/allocator.c b/libgomp/allocator.c
> index cdedc7d80e9..18e3f525ec6 100644
> --- a/libgomp/allocator.c
> +++ b/libgomp/allocator.c
> @@ -99,6 +99,8 @@ GOMP_is_alloc (void *ptr)

...

>    #define ARRAY_SIZE(A) (sizeof (A) / sizeof ((A)[0]))
> -_Static_assert (ARRAY_SIZE (predefined_alloc_mapping)
> +_Static_assert (ARRAY_SIZE (predefined_omp_alloc_mapping)
>    		== omp_max_predefined_alloc + 1,
> -		"predefined_alloc_mapping must match omp_memspace_handle_t");
> +		"predefined_omp_alloc_mapping must match omp_memspace_handle_t");
> +#define ARRAY_SIZE(A) (sizeof (A) / sizeof ((A)[0]))

I am surprised that this compiles: Why do you re-#define this macro?

* * *

> --- a/libgomp/omp.h.in
> +++ b/libgomp/omp.h.in
> @@ -134,6 +134,7 @@ typedef enum omp_allocator_handle_t __GOMP_UINTPTR_T_ENUM
>      omp_cgroup_mem_alloc = 6,
>      omp_pteam_mem_alloc = 7,
>      omp_thread_mem_alloc = 8,
> +  ompx_pinned_mem_alloc = 100,

See remark regarding "100" at the top of this email.

> --- a/libgomp/omp_lib.f90.in
> +++ b/libgomp/omp_lib.f90.in
> +        integer (kind=omp_allocator_handle_kind), &
> +                 parameter :: ompx_pinned_mem_alloc = 100

Likewise.

* * *

Why didn't you also update omp_lib.h.in?

* * *

I think you really want to update the checking code inside GCC itself,
i.e. for Fortran:

     3 |   !$omp allocate(a) allocator(100)
       |                 2            1
Error: Predefined allocator required in ALLOCATOR clause at (1) as the list item 'a' at (2) has the SAVE attribute

and for C:

foo.c:6:58: error: 'allocator' clause requires a predefined allocator as 'a' is static
     6 |   #pragma omp allocate(a) allocator(ompx_pinned_mem_alloc)
       |                                                          ^

this shows up using

module m
   integer :: a
   !$omp allocate(a) allocator(100)
end

and

enum omp_allocator_handle_t { ompx_pinned_mem_alloc = 100 };
void f()
{
   static int a;
   #pragma omp allocate(a) allocator(ompx_pinned_mem_alloc)
}

You probably also want to update the testcases. See also:
https://github.com/gcc-mirror/gcc/blob/master/gcc/testsuite/c-c++-common/gomp/allocate-9.c#L23-L28
which checks for '> 9' and I think there is also a Fortran testcase.

* * *

On the code side, c-parser.cc has:

           else if (allocator
                    && (wi::to_widest (allocator) < 1
                        || wi::to_widest (allocator) > 8))
             /* 8 = largest predefined memory allocator. */
             error_at (allocator_loc,
                       "%<allocator%> clause requires a predefined allocator as "
                       "%qD is static", var);

And gcc/fortran/openmp.cc has the function 'is_predefined_allocator'.

We could consider unifying the number handling; in any case, it needs to be updated.

* * *

NOTE: If you are missing C++, I (or someone else) still have to
address Jakub's review comments for:
https://gcc.gnu.org/pipermail/gcc-patches/2023-October/633782.html

* * *

RFC: Should this also be supported as:

OMP_ALLOCATOR=ompx_pinned_mem

Your current documentation change implies so. Hence, you also need to touch:
libgomp/env.c's parse_allocator

* * *

And carrying on my question from last time:

> +++ b/libgomp/testsuite/libgomp.c/alloc-pinned-5.c
> @@ -0,0 +1,103 @@
> +/* { dg-do run } */
...
> +#define CHECK_SIZE(SIZE) { \
> +  struct rlimit limit; \
> +  if (getrlimit (RLIMIT_MEMLOCK, &limit) \
> +      || limit.rlim_cur <= SIZE) \
> +    fprintf (stderr, "insufficient lockable memory; please increase ulimit\n"); \
> +  }

Namely, I wrote:

Glancing through the patches, for test cases, I think you should
'abort()' in CHECK_SIZE if it fails (rlimit issue or unsupported
system). Or do you think that the results could still make sense
when continuing and possibly failing later?

Otherwise, it looks good to me.

Tobias

Patch

diff --git a/libgomp/allocator.c b/libgomp/allocator.c
index cdedc7d80e9..18e3f525ec6 100644
--- a/libgomp/allocator.c
+++ b/libgomp/allocator.c
@@ -99,6 +99,8 @@  GOMP_is_alloc (void *ptr)
 
 
 #define omp_max_predefined_alloc omp_thread_mem_alloc
+#define ompx_min_predefined_alloc ompx_pinned_mem_alloc
+#define ompx_max_predefined_alloc ompx_pinned_mem_alloc
 
 /* These macros may be overridden in config/<target>/allocator.c.
    The defaults (no override) are to return NULL for pinned memory requests
@@ -131,7 +133,7 @@  GOMP_is_alloc (void *ptr)
    The index to this table is the omp_allocator_handle_t enum value.
    When the user calls omp_alloc with a predefined allocator this
    table determines what memory they get.  */
-static const omp_memspace_handle_t predefined_alloc_mapping[] = {
+static const omp_memspace_handle_t predefined_omp_alloc_mapping[] = {
   omp_default_mem_space,   /* omp_null_allocator doesn't actually use this. */
   omp_default_mem_space,   /* omp_default_mem_alloc. */
   omp_large_cap_mem_space, /* omp_large_cap_mem_alloc. */
@@ -142,11 +144,41 @@  static const omp_memspace_handle_t predefined_alloc_mapping[] = {
   omp_low_lat_mem_space,   /* omp_pteam_mem_alloc (implementation defined). */
   omp_low_lat_mem_space,   /* omp_thread_mem_alloc (implementation defined). */
 };
+static const omp_memspace_handle_t predefined_ompx_alloc_mapping[] = {
+  omp_default_mem_space,   /* ompx_pinned_mem_alloc. */
+};
 
 #define ARRAY_SIZE(A) (sizeof (A) / sizeof ((A)[0]))
-_Static_assert (ARRAY_SIZE (predefined_alloc_mapping)
+_Static_assert (ARRAY_SIZE (predefined_omp_alloc_mapping)
 		== omp_max_predefined_alloc + 1,
-		"predefined_alloc_mapping must match omp_memspace_handle_t");
+		"predefined_omp_alloc_mapping must match omp_memspace_handle_t");
+#define ARRAY_SIZE(A) (sizeof (A) / sizeof ((A)[0]))
+_Static_assert (ARRAY_SIZE (predefined_ompx_alloc_mapping)
+		== ompx_max_predefined_alloc - ompx_min_predefined_alloc + 1,
+		"predefined_ompx_alloc_mapping must match"
+		" omp_memspace_handle_t");
+
+static inline bool
+predefined_allocator_p (omp_allocator_handle_t allocator)
+{
+  return allocator <= ompx_max_predefined_alloc;
+}
+
+static inline omp_memspace_handle_t
+predefined_alloc_mapping (omp_allocator_handle_t allocator)
+{
+  if (allocator <= omp_max_predefined_alloc)
+    return predefined_omp_alloc_mapping[allocator];
+  else if (allocator >= ompx_min_predefined_alloc
+	   && allocator <= ompx_max_predefined_alloc)
+    {
+      int index = allocator - ompx_min_predefined_alloc;
+      return predefined_ompx_alloc_mapping[index];
+    }
+  else
+    /* This should never happen.  */
+    return omp_default_mem_space;
+}
 
 enum gomp_numa_memkind_kind
 {
@@ -556,7 +588,7 @@  retry:
       allocator = (omp_allocator_handle_t) thr->ts.def_allocator;
     }
 
-  if (allocator > omp_max_predefined_alloc)
+  if (!predefined_allocator_p (allocator))
     {
       allocator_data = (struct omp_allocator_data *) allocator;
       if (new_alignment < allocator_data->alignment)
@@ -685,9 +717,11 @@  retry:
 	  omp_memspace_handle_t memspace;
 	  memspace = (allocator_data
 		      ? allocator_data->memspace
-		      : predefined_alloc_mapping[allocator]);
-	  ptr = MEMSPACE_ALLOC (memspace, new_size,
-				allocator_data && allocator_data->pinned);
+		      : predefined_alloc_mapping (allocator));
+	  int pinned = (allocator_data
+			? allocator_data->pinned
+			: allocator == ompx_pinned_mem_alloc);
+	  ptr = MEMSPACE_ALLOC (memspace, new_size, pinned);
 	}
       if (ptr == NULL)
 	goto fail;
@@ -708,7 +742,8 @@  retry:
 fail:;
   int fallback = (allocator_data
 		  ? allocator_data->fallback
-		  : allocator == omp_default_mem_alloc
+		  : (allocator == omp_default_mem_alloc
+		     || allocator == ompx_pinned_mem_alloc)
 		  ? omp_atv_null_fb
 		  : omp_atv_default_mem_fb);
   switch (fallback)
@@ -764,7 +799,7 @@  omp_free (void *ptr, omp_allocator_handle_t allocator)
     return;
   (void) allocator;
   data = &((struct omp_mem_header *) ptr)[-1];
-  if (data->allocator > omp_max_predefined_alloc)
+  if (!predefined_allocator_p (data->allocator))
     {
       struct omp_allocator_data *allocator_data
 	= (struct omp_allocator_data *) (data->allocator);
@@ -822,7 +857,8 @@  omp_free (void *ptr, omp_allocator_handle_t allocator)
 	}
 #endif
 
-      memspace = predefined_alloc_mapping[data->allocator];
+      memspace = predefined_alloc_mapping (data->allocator);
+      pinned = (data->allocator == ompx_pinned_mem_alloc);
     }
 
   MEMSPACE_FREE (memspace, data->ptr, data->size, pinned);
@@ -860,7 +896,7 @@  retry:
       allocator = (omp_allocator_handle_t) thr->ts.def_allocator;
     }
 
-  if (allocator > omp_max_predefined_alloc)
+  if (!predefined_allocator_p (allocator))
     {
       allocator_data = (struct omp_allocator_data *) allocator;
       if (new_alignment < allocator_data->alignment)
@@ -995,9 +1031,11 @@  retry:
 	  omp_memspace_handle_t memspace;
 	  memspace = (allocator_data
 		      ? allocator_data->memspace
-		      : predefined_alloc_mapping[allocator]);
-	  ptr = MEMSPACE_CALLOC (memspace, new_size,
-				 allocator_data && allocator_data->pinned);
+		      : predefined_alloc_mapping (allocator));
+	  int pinned = (allocator_data
+			? allocator_data->pinned
+			: allocator == ompx_pinned_mem_alloc);
+	  ptr = MEMSPACE_CALLOC (memspace, new_size, pinned);
 	}
       if (ptr == NULL)
 	goto fail;
@@ -1018,7 +1056,8 @@  retry:
 fail:;
   int fallback = (allocator_data
 		  ? allocator_data->fallback
-		  : allocator == omp_default_mem_alloc
+		  : (allocator == omp_default_mem_alloc
+		     || allocator == ompx_pinned_mem_alloc)
 		  ? omp_atv_null_fb
 		  : omp_atv_default_mem_fb);
   switch (fallback)
@@ -1076,7 +1115,7 @@  retry:
   if (allocator == omp_null_allocator)
     allocator = free_allocator;
 
-  if (allocator > omp_max_predefined_alloc)
+  if (!predefined_allocator_p (allocator))
     {
       allocator_data = (struct omp_allocator_data *) allocator;
       if (new_alignment < allocator_data->alignment)
@@ -1104,7 +1143,7 @@  retry:
 	}
 #endif
     }
-  if (free_allocator > omp_max_predefined_alloc)
+  if (!predefined_allocator_p (free_allocator))
     {
       free_allocator_data = (struct omp_allocator_data *) free_allocator;
 #if defined(LIBGOMP_USE_MEMKIND) || defined(LIBGOMP_USE_LIBNUMA)
@@ -1228,11 +1267,14 @@  retry:
       else
 #endif
       if (prev_size)
-	new_ptr = MEMSPACE_REALLOC (allocator_data->memspace, data->ptr,
-				    data->size, new_size,
-				    (free_allocator_data
-				     && free_allocator_data->pinned),
-				    allocator_data->pinned);
+	{
+	  int was_pinned = (free_allocator_data
+			    ? free_allocator_data->pinned
+			    : free_allocator == ompx_pinned_mem_alloc);
+	  new_ptr = MEMSPACE_REALLOC (allocator_data->memspace, data->ptr,
+				      data->size, new_size, was_pinned,
+				      allocator_data->pinned);
+	}
       else
 	new_ptr = MEMSPACE_ALLOC (allocator_data->memspace, new_size,
 				  allocator_data->pinned);
@@ -1287,11 +1329,15 @@  retry:
 	  omp_memspace_handle_t memspace;
 	  memspace = (allocator_data
 		      ? allocator_data->memspace
-		      : predefined_alloc_mapping[allocator]);
+		      : predefined_alloc_mapping (allocator));
+	  int was_pinned = (free_allocator_data
+			    ? free_allocator_data->pinned
+			    : free_allocator == ompx_pinned_mem_alloc);
+	  int pinned = (allocator_data
+			? allocator_data->pinned
+			: allocator == ompx_pinned_mem_alloc);
 	  new_ptr = MEMSPACE_REALLOC (memspace, data->ptr, data->size, new_size,
-				      (free_allocator_data
-				       && free_allocator_data->pinned),
-				      allocator_data && allocator_data->pinned);
+				      was_pinned, pinned);
 	}
       if (new_ptr == NULL)
 	goto fail;
@@ -1324,9 +1370,11 @@  retry:
 	  omp_memspace_handle_t memspace;
 	  memspace = (allocator_data
 		      ? allocator_data->memspace
-		      : predefined_alloc_mapping[allocator]);
-	  new_ptr = MEMSPACE_ALLOC (memspace, new_size,
-				    allocator_data && allocator_data->pinned);
+		      : predefined_alloc_mapping (allocator));
+	  int pinned = (allocator_data
+			? allocator_data->pinned
+			: allocator == ompx_pinned_mem_alloc);
+	  new_ptr = MEMSPACE_ALLOC (memspace, new_size, pinned);
 	}
       if (new_ptr == NULL)
 	goto fail;
@@ -1380,8 +1428,10 @@  retry:
     omp_memspace_handle_t was_memspace;
     was_memspace = (free_allocator_data
 		    ? free_allocator_data->memspace
-		    : predefined_alloc_mapping[free_allocator]);
-    int was_pinned = (free_allocator_data && free_allocator_data->pinned);
+		    : predefined_alloc_mapping (free_allocator));
+    int was_pinned = (free_allocator_data
+		      ? free_allocator_data->pinned
+		      : free_allocator == ompx_pinned_mem_alloc);
     MEMSPACE_FREE (was_memspace, data->ptr, data->size, was_pinned);
   }
   return ret;
@@ -1389,7 +1439,8 @@  retry:
 fail:;
   int fallback = (allocator_data
 		  ? allocator_data->fallback
-		  : allocator == omp_default_mem_alloc
+		  : (allocator == omp_default_mem_alloc
+		     || allocator == ompx_pinned_mem_alloc)
 		  ? omp_atv_null_fb
 		  : omp_atv_default_mem_fb);
   switch (fallback)
diff --git a/libgomp/libgomp.texi b/libgomp/libgomp.texi
index d612488ad10..aa0a81062ec 100644
--- a/libgomp/libgomp.texi
+++ b/libgomp/libgomp.texi
@@ -3440,6 +3440,7 @@  value.
 @item omp_cgroup_mem_alloc      @tab omp_low_lat_mem_space (implementation defined)
 @item omp_pteam_mem_alloc       @tab omp_low_lat_mem_space (implementation defined)
 @item omp_thread_mem_alloc      @tab omp_low_lat_mem_space (implementation defined)
+@item ompx_pinned_mem_alloc     @tab omp_default_mem_space (GNU extension)
 @end multitable
 
 The predefined allocators use the default values for the traits,
@@ -3465,7 +3466,7 @@  as listed below.  Except that the last three allocators have the
 @item @code{fb_data}   @tab @emph{unsupported as it needs an allocator handle}
                        @tab (none)
 @item @code{pinned}    @tab @code{true}, @code{false}
-                       @tab @code{false}
+                       @tab See below
 @item @code{partition} @tab @code{environment}, @code{nearest},
                             @code{blocked}, @code{interleaved}
                        @tab @code{environment}
@@ -3476,6 +3477,10 @@  For the @code{fallback} trait, the default value is @code{null_fb} for the
 with device memory; for all other allocators, it is @code{default_mem_fb}
 by default.
 
+For the @code{pinned} trait, the default value is @code{true} for
+predefined allocator @code{ompx_pinned_mem_alloc} (a GNU extension), and
+@code{false} for all others.
+
 Examples:
 @smallexample
 OMP_ALLOCATOR=omp_high_bw_mem_alloc
diff --git a/libgomp/omp.h.in b/libgomp/omp.h.in
index 9b00647339e..117d67f2aa7 100644
--- a/libgomp/omp.h.in
+++ b/libgomp/omp.h.in
@@ -134,6 +134,7 @@  typedef enum omp_allocator_handle_t __GOMP_UINTPTR_T_ENUM
   omp_cgroup_mem_alloc = 6,
   omp_pteam_mem_alloc = 7,
   omp_thread_mem_alloc = 8,
+  ompx_pinned_mem_alloc = 100,
   __omp_allocator_handle_t_max__ = __UINTPTR_MAX__
 } omp_allocator_handle_t;
 
diff --git a/libgomp/omp_lib.f90.in b/libgomp/omp_lib.f90.in
index 65365e4497b..a5ece9e986d 100644
--- a/libgomp/omp_lib.f90.in
+++ b/libgomp/omp_lib.f90.in
@@ -158,6 +158,8 @@ 
                  parameter :: omp_pteam_mem_alloc = 7
         integer (kind=omp_allocator_handle_kind), &
                  parameter :: omp_thread_mem_alloc = 8
+        integer (kind=omp_allocator_handle_kind), &
+                 parameter :: ompx_pinned_mem_alloc = 100
         integer (omp_memspace_handle_kind), &
                  parameter :: omp_default_mem_space = 0
         integer (omp_memspace_handle_kind), &
diff --git a/libgomp/testsuite/libgomp.c/alloc-pinned-5.c b/libgomp/testsuite/libgomp.c/alloc-pinned-5.c
new file mode 100644
index 00000000000..18e6d20ca5b
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/alloc-pinned-5.c
@@ -0,0 +1,103 @@ 
+/* { dg-do run } */
+
+/* { dg-xfail-run-if "Pinning not implemented on this host" { ! *-*-linux-gnu } } */
+
+/* Test that ompx_pinned_mem_alloc works.  */
+
+#include <stdio.h>
+#include <stdlib.h>
+
+#ifdef __linux__
+#include <sys/types.h>
+#include <unistd.h>
+
+#include <sys/mman.h>
+#include <sys/resource.h>
+
+#define PAGE_SIZE sysconf(_SC_PAGESIZE)
+#define CHECK_SIZE(SIZE) { \
+  struct rlimit limit; \
+  if (getrlimit (RLIMIT_MEMLOCK, &limit) \
+      || limit.rlim_cur <= SIZE) \
+    fprintf (stderr, "insufficient lockable memory; please increase ulimit\n"); \
+  }
+
+int
+get_pinned_mem ()
+{
+  int pid = getpid ();
+  char buf[100];
+  sprintf (buf, "/proc/%d/status", pid);
+
+  FILE *proc = fopen (buf, "r");
+  if (!proc)
+    abort ();
+  while (fgets (buf, 100, proc))
+    {
+      int val;
+      if (sscanf (buf, "VmLck: %d", &val))
+	{
+	  fclose (proc);
+	  return val;
+	}
+    }
+  abort ();
+}
+#else
+#define PAGE_SIZE 10000 * 1024 /* unknown */
+#define CHECK_SIZE(SIZE) fprintf (stderr, "OS unsupported\n");
+
+int
+get_pinned_mem ()
+{
+  return 0;
+}
+#endif
+
+static void
+verify0 (char *p, size_t s)
+{
+  for (size_t i = 0; i < s; ++i)
+    if (p[i] != 0)
+      abort ();
+}
+
+#include <omp.h>
+
+int
+main ()
+{
+  /* Allocate at least a page each time, allowing space for overhead,
+     but stay within the ulimit.  */
+  const int SIZE = PAGE_SIZE - 128;
+  CHECK_SIZE (SIZE * 5);
+
+  // Sanity check
+  if (get_pinned_mem () != 0)
+    abort ();
+
+  void *p = omp_alloc (SIZE, ompx_pinned_mem_alloc);
+  if (!p)
+    abort ();
+
+  int amount = get_pinned_mem ();
+  if (amount == 0)
+    abort ();
+
+  p = omp_realloc (p, SIZE * 2, ompx_pinned_mem_alloc, ompx_pinned_mem_alloc);
+
+  int amount2 = get_pinned_mem ();
+  if (amount2 <= amount)
+    abort ();
+
+  /* SIZE*2 ensures that it doesn't slot into the space possibly
+     vacated by realloc.  */
+  p = omp_calloc (1, SIZE * 2, ompx_pinned_mem_alloc);
+
+  if (get_pinned_mem () <= amount2)
+    abort ();
+
+  verify0 (p, SIZE * 2);
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.c/alloc-pinned-6.c b/libgomp/testsuite/libgomp.c/alloc-pinned-6.c
new file mode 100644
index 00000000000..f80a0264f97
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/alloc-pinned-6.c
@@ -0,0 +1,101 @@ 
+/* { dg-do run } */
+
+/* Test that ompx_pinned_mem_alloc fails correctly.  */
+
+#include <stdio.h>
+#include <stdlib.h>
+
+#ifdef __linux__
+#include <sys/types.h>
+#include <unistd.h>
+
+#include <sys/mman.h>
+#include <sys/resource.h>
+
+#define PAGE_SIZE sysconf(_SC_PAGESIZE)
+
+int
+get_pinned_mem ()
+{
+  int pid = getpid ();
+  char buf[100];
+  sprintf (buf, "/proc/%d/status", pid);
+
+  FILE *proc = fopen (buf, "r");
+  if (!proc)
+    abort ();
+  while (fgets (buf, 100, proc))
+    {
+      int val;
+      if (sscanf (buf, "VmLck: %d", &val))
+	{
+	  fclose (proc);
+	  return val;
+	}
+    }
+  abort ();
+}
+
+void
+set_pin_limit (int size)
+{
+  struct rlimit limit;
+  if (getrlimit (RLIMIT_MEMLOCK, &limit))
+    abort ();
+  limit.rlim_cur = (limit.rlim_max < size ? limit.rlim_max : size);
+  if (setrlimit (RLIMIT_MEMLOCK, &limit))
+    abort ();
+}
+#else
+#define PAGE_SIZE 10000 * 1024 /* unknown */
+
+int
+get_pinned_mem ()
+{
+  return 0;
+}
+
+void
+set_pin_limit ()
+{
+}
+#endif
+
+#include <omp.h>
+
+int
+main ()
+{
+  /* Allocate at least a page each time, but stay within the ulimit.  */
+  const int SIZE = PAGE_SIZE * 4;
+
+  /* Ensure that the limit is smaller than the allocation.  */
+  set_pin_limit (SIZE / 2);
+
+  // Sanity check
+  if (get_pinned_mem () != 0)
+    abort ();
+
+  // Should fail
+  void *p = omp_alloc (SIZE, ompx_pinned_mem_alloc);
+  if (p)
+    abort ();
+
+  // Should fail
+  p = omp_calloc (1, SIZE, ompx_pinned_mem_alloc);
+  if (p)
+    abort ();
+
+  // Should fail to realloc
+  void *notpinned = omp_alloc (SIZE, omp_default_mem_alloc);
+  p = omp_realloc (notpinned, SIZE, ompx_pinned_mem_alloc, omp_default_mem_alloc);
+  if (!notpinned || p)
+    abort ();
+
+  // No memory should have been pinned
+  int amount = get_pinned_mem ();
+  if (amount != 0)
+    abort ();
+
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.fortran/alloc-pinned-1.f90 b/libgomp/testsuite/libgomp.fortran/alloc-pinned-1.f90
new file mode 100644
index 00000000000..798dc3d5a12
--- /dev/null
+++ b/libgomp/testsuite/libgomp.fortran/alloc-pinned-1.f90
@@ -0,0 +1,16 @@ 
+! Ensure that the ompx_pinned_mem_alloc predefined allocator is present and
+! accepted.  The majority of the functionality testing lives in the C tests.
+!
+! { dg-xfail-run-if "Pinning not implemented on this host" { ! *-*-linux-gnu } }
+
+program main
+  use omp_lib
+  use ISO_C_Binding
+  implicit none (external, type)
+
+  type(c_ptr) :: p
+
+  p = omp_alloc (10_c_size_t, ompx_pinned_mem_alloc);
+  if (.not. c_associated (p)) stop 1
+  call omp_free (p, ompx_pinned_mem_alloc);
+end program main