diff mbox series

[v2,10/10] fortran: Add -finline-intrinsics flag for MINLOC/MAXLOC [PR90608]

Message ID 20240816102227.189290-11-morin-mikael@orange.fr
State New
Headers show
Series fortran: Inline MINLOC/MAXLOC without DIM argument [PR90608] | expand

Commit Message

Mikael Morin Aug. 16, 2024, 10:22 a.m. UTC
From: Mikael Morin <mikael@gcc.gnu.org>

This patch is new in the V2 series.

Regression-tested on x86_64-pc-linux-gnu.
OK for master?

-- >8 --

Introduce the -finline-intrinsics flag to control from the command line
whether to generate either inline code or calls to the functions from the
library, for the MINLOC and MAXLOC intrinsics.

The flag allows to specify inlining either independently for each intrinsic
(either MINLOC or MAXLOC), or all together.  For each intrinsic, a default
value is set if none was set.  The default value depends on the optimization
setting: inlining is avoided if not optimizing or if optimizing for size;
otherwise inlining is preferred.

There is no direct support for this behaviour provided by the .opt options
framework.  It is obtained by defining three different variants of the flag
(finline-intrinsics, fno-inline-intrinsics, finline-intrinsics=) all using
the same underlying option variable.  Each enum value (corresponding to an
intrinsic function) uses two identical bits, and the variable is initialized
with alternated bits, so that we can tell whether the value was left
initialized by checking whether the two bits have different values.

	PR fortran/90608

gcc/ChangeLog:

	* flag-types.h (enum gfc_inlineable_intrinsics): New type.

gcc/fortran/ChangeLog:

	* invoke.texi(finline-intrinsics): Document new flag.
	* lang.opt (finline-intrinsics, finline-intrinsics=,
	fno-inline-intrinsics): New flags.
	* options.cc (gfc_post_options): If the option variable controling
	the inlining of MAXLOC (respectively MINLOC) has not been set, set
	it or clear it depending on the optimization option variables.
	* trans-intrinsic.cc (gfc_inline_intrinsic_function_p): Return false
	if inlining for the intrinsic is disabled according to the option
	variable.

gcc/testsuite/ChangeLog:

	* gfortran.dg/minmaxloc_18.f90: New test.
	* gfortran.dg/minmaxloc_18a.f90: New test.
	* gfortran.dg/minmaxloc_18b.f90: New test.
	* gfortran.dg/minmaxloc_18c.f90: New test.
	* gfortran.dg/minmaxloc_18d.f90: New test.
---
 gcc/flag-types.h                            |  30 +
 gcc/fortran/invoke.texi                     |  24 +
 gcc/fortran/lang.opt                        |  27 +
 gcc/fortran/options.cc                      |  21 +-
 gcc/fortran/trans-intrinsic.cc              |  13 +-
 gcc/testsuite/gfortran.dg/minmaxloc_18.f90  | 772 ++++++++++++++++++++
 gcc/testsuite/gfortran.dg/minmaxloc_18a.f90 |  10 +
 gcc/testsuite/gfortran.dg/minmaxloc_18b.f90 |  10 +
 gcc/testsuite/gfortran.dg/minmaxloc_18c.f90 |  10 +
 gcc/testsuite/gfortran.dg/minmaxloc_18d.f90 |  10 +
 10 files changed, 922 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/minmaxloc_18.f90
 create mode 100644 gcc/testsuite/gfortran.dg/minmaxloc_18a.f90
 create mode 100644 gcc/testsuite/gfortran.dg/minmaxloc_18b.f90
 create mode 100644 gcc/testsuite/gfortran.dg/minmaxloc_18c.f90
 create mode 100644 gcc/testsuite/gfortran.dg/minmaxloc_18d.f90

Comments

Harald Anlauf Aug. 19, 2024, 7:44 p.m. UTC | #1
Hi Mikael,

apart from patch #04/10, which did not apply cleanly here, I was
able to test your patch.  It seems to work with a manual workaround
(-fno-frontend-optimize) to work around this problem.
Might be a local issue...

That said, it works as advertised.  Thanks for separating out the
IEEE-NaN tests.

What I did not fully get is the way you deal with -finline-intrinsics= .
Currently there are only MINLOC and MAXLOC, but in the future there
could be many more intrinsics.  Given that you need 2 bits per
intrinsic in flag_inline_intrinsics, how future-proof is that?

In the documentation, you have:

+The set of intrinsics permitting the choice of implementation variant 
through
+@code{-finline-intrinsics} is currently limited to non-scalar 
@code{MAXLOC} and
+@code{MINLOC}.

Why do you say "non-scalar"?  The new inlining is done for these
intrinsics when the DIM argument is absent.  The result characteristics
however is:

   "If DIM does not appear, the result is an array of rank one and of 
size equal to the rank of ARRAY; ..."

and I thought the implementation does just that and does that right.
(With DIM present, the result is an array of rank rank(arg)-1.)
Can you clarify the wording in a way that is better understandable?

Otherwise the Fortran parts look fine to me.

For the changes to gcc/flag-types.h you might need an OK from the
gcc maintainers.

Thanks,
Harald

Am 16.08.24 um 12:22 schrieb Mikael Morin:
> From: Mikael Morin <mikael@gcc.gnu.org>
> 
> This patch is new in the V2 series.
> 
> Regression-tested on x86_64-pc-linux-gnu.
> OK for master?
> 
> -- >8 --
> 
> Introduce the -finline-intrinsics flag to control from the command line
> whether to generate either inline code or calls to the functions from the
> library, for the MINLOC and MAXLOC intrinsics.
> 
> The flag allows to specify inlining either independently for each intrinsic
> (either MINLOC or MAXLOC), or all together.  For each intrinsic, a default
> value is set if none was set.  The default value depends on the optimization
> setting: inlining is avoided if not optimizing or if optimizing for size;
> otherwise inlining is preferred.
> 
> There is no direct support for this behaviour provided by the .opt options
> framework.  It is obtained by defining three different variants of the flag
> (finline-intrinsics, fno-inline-intrinsics, finline-intrinsics=) all using
> the same underlying option variable.  Each enum value (corresponding to an
> intrinsic function) uses two identical bits, and the variable is initialized
> with alternated bits, so that we can tell whether the value was left
> initialized by checking whether the two bits have different values.
> 
> 	PR fortran/90608
> 
> gcc/ChangeLog:
> 
> 	* flag-types.h (enum gfc_inlineable_intrinsics): New type.
> 
> gcc/fortran/ChangeLog:
> 
> 	* invoke.texi(finline-intrinsics): Document new flag.
> 	* lang.opt (finline-intrinsics, finline-intrinsics=,
> 	fno-inline-intrinsics): New flags.
> 	* options.cc (gfc_post_options): If the option variable controling
> 	the inlining of MAXLOC (respectively MINLOC) has not been set, set
> 	it or clear it depending on the optimization option variables.
> 	* trans-intrinsic.cc (gfc_inline_intrinsic_function_p): Return false
> 	if inlining for the intrinsic is disabled according to the option
> 	variable.
> 
> gcc/testsuite/ChangeLog:
> 
> 	* gfortran.dg/minmaxloc_18.f90: New test.
> 	* gfortran.dg/minmaxloc_18a.f90: New test.
> 	* gfortran.dg/minmaxloc_18b.f90: New test.
> 	* gfortran.dg/minmaxloc_18c.f90: New test.
> 	* gfortran.dg/minmaxloc_18d.f90: New test.
> ---
>   gcc/flag-types.h                            |  30 +
>   gcc/fortran/invoke.texi                     |  24 +
>   gcc/fortran/lang.opt                        |  27 +
>   gcc/fortran/options.cc                      |  21 +-
>   gcc/fortran/trans-intrinsic.cc              |  13 +-
>   gcc/testsuite/gfortran.dg/minmaxloc_18.f90  | 772 ++++++++++++++++++++
>   gcc/testsuite/gfortran.dg/minmaxloc_18a.f90 |  10 +
>   gcc/testsuite/gfortran.dg/minmaxloc_18b.f90 |  10 +
>   gcc/testsuite/gfortran.dg/minmaxloc_18c.f90 |  10 +
>   gcc/testsuite/gfortran.dg/minmaxloc_18d.f90 |  10 +
>   10 files changed, 922 insertions(+), 5 deletions(-)
>   create mode 100644 gcc/testsuite/gfortran.dg/minmaxloc_18.f90
>   create mode 100644 gcc/testsuite/gfortran.dg/minmaxloc_18a.f90
>   create mode 100644 gcc/testsuite/gfortran.dg/minmaxloc_18b.f90
>   create mode 100644 gcc/testsuite/gfortran.dg/minmaxloc_18c.f90
>   create mode 100644 gcc/testsuite/gfortran.dg/minmaxloc_18d.f90
> 
> diff --git a/gcc/flag-types.h b/gcc/flag-types.h
> index 1e497f0bb91..df56337f7e8 100644
> --- a/gcc/flag-types.h
> +++ b/gcc/flag-types.h
> @@ -451,6 +451,36 @@ enum gfc_convert
>   };
>   
>   
> +/* gfortran -finline-intrinsics= values;
> +   We use two identical bits for each value, and initialize with alternated
> +   bits, so that we can check whether a value has been set by checking whether
> +   the two bits have identical value.  */
> +
> +#define GFC_INL_INTR_VAL(idx) (3 << (2 * idx))
> +#define GFC_INL_INTR_UNSET_VAL(val) (0x55555555 & (val))
> +
> +enum gfc_inlineable_intrinsics
> +{
> +  GFC_FLAG_INLINE_INTRINSIC_NONE = 0,
> +  GFC_FLAG_INLINE_INTRINSIC_MAXLOC = GFC_INL_INTR_VAL (0),
> +  GFC_FLAG_INLINE_INTRINSIC_MINLOC = GFC_INL_INTR_VAL (1),
> +  GFC_FLAG_INLINE_INTRINSIC_ALL = GFC_FLAG_INLINE_INTRINSIC_MAXLOC
> +				  | GFC_FLAG_INLINE_INTRINSIC_MINLOC,
> +
> +  GFC_FLAG_INLINE_INTRINSIC_NONE_UNSET
> +		  = GFC_INL_INTR_UNSET_VAL (GFC_FLAG_INLINE_INTRINSIC_NONE),
> +  GFC_FLAG_INLINE_INTRINSIC_MAXLOC_UNSET
> +		  = GFC_INL_INTR_UNSET_VAL (GFC_FLAG_INLINE_INTRINSIC_MAXLOC),
> +  GFC_FLAG_INLINE_INTRINSIC_MINLOC_UNSET
> +		  = GFC_INL_INTR_UNSET_VAL (GFC_FLAG_INLINE_INTRINSIC_MINLOC),
> +  GFC_FLAG_INLINE_INTRINSIC_ALL_UNSET
> +		  = GFC_INL_INTR_UNSET_VAL (GFC_FLAG_INLINE_INTRINSIC_ALL)
> +};
> +
> +#undef GFC_INL_INTR_UNSET_VAL
> +#undef GFC_INL_INTR_VAL
> +
> +
>   /* Inline String Operations functions.  */
>   enum ilsop_fn
>   {
> diff --git a/gcc/fortran/invoke.texi b/gcc/fortran/invoke.texi
> index 6bc42afe2c4..53b6de1c92b 100644
> --- a/gcc/fortran/invoke.texi
> +++ b/gcc/fortran/invoke.texi
> @@ -194,6 +194,7 @@ and warnings}.
>   -finit-character=@var{n} -finit-integer=@var{n} -finit-local-zero
>   -finit-derived -finit-logical=@var{<true|false>}
>   -finit-real=@var{<zero|inf|-inf|nan|snan>}
> +-finline-intrinsics[=<@var{minloc},@var{maxloc}>]
>   -finline-matmul-limit=@var{n}
>   -finline-arg-packing -fmax-array-constructor=@var{n}
>   -fmax-stack-var-size=@var{n} -fno-align-commons -fno-automatic
> @@ -1994,6 +1995,29 @@ geometric mean of the dimensions of the argument and result matrices.
>   
>   The default value for @var{n} is 30.
>   
> +@opindex @code{finline-intrinsics}
> +@item -finline-intrinsics
> +@itemx -finline-intrinsics=@var{intr1},@var{intr2},...
> +Usage of intrinsics can be implemented either by generating a call to the
> +libgfortran library function implementing it, or by directly generating the
> +implementation code inline.  For most intrinsics, only a single of those
> +variants is available and there is no choice of implementation.  For some of
> +them, however, both are available, and for them the @code{-finline-intrinsics}
> +flag permits the selection of inline code generation in its positive form, or
> +library call generation in its negative form @code{-fno-inline-intrinsics}.
> +With @code{-finline-intrinsics=...} or @code{-fno-inline-intrinsics=...}, the
> +choice applies only to the intrinsics present in the comma-separated list
> +provided as argument.
> +
> +For each intrinsic, if no choice of implementation was made through either of
> +the flag variants, a default behaviour is chosen depending on optimization:
> +library calls are generated when not optimizing or when optimizing for size;
> +otherwise inline code is preferred.
> +
> +The set of intrinsics permitting the choice of implementation variant through
> +@code{-finline-intrinsics} is currently limited to non-scalar @code{MAXLOC} and
> +@code{MINLOC}.
> +
>   @opindex @code{finline-matmul-limit}
>   @item -finline-matmul-limit=@var{n}
>   When front-end optimization is active, some calls to the @code{MATMUL}
> diff --git a/gcc/fortran/lang.opt b/gcc/fortran/lang.opt
> index 5cf7b492254..ac08a851da4 100644
> --- a/gcc/fortran/lang.opt
> +++ b/gcc/fortran/lang.opt
> @@ -676,6 +676,33 @@ finline-arg-packing
>   Fortran  Var(flag_inline_arg_packing) Init(-1)
>   -finline-arg-packing	Perform argument packing inline.
>   
> +finline-intrinsics
> +Fortran RejectNegative Enum(gfc_inlineable_intrinsics) Var(flag_inline_intrinsics, GFC_FLAG_INLINE_INTRINSIC_ALL) Undocumented
> +
> +fno-inline-intrinsics
> +Fortran RejectNegative Enum(gfc_inlineable_intrinsics) Var(flag_inline_intrinsics, GFC_FLAG_INLINE_INTRINSIC_NONE) Undocumented
> +
> +finline-intrinsics=
> +Fortran Joined Var(flag_inline_intrinsics) Enum(gfc_inlineable_intrinsics) Init(GFC_FLAG_INLINE_INTRINSIC_ALL_UNSET) EnumSet
> +Enable generation of inline code instead of calls to functions from the library to implement intrinsics.
> +
> +Enum
> +Name(gfc_inlineable_intrinsics) Type(int) UnknownError(%qs is not an inline-controlable intrinsic)
> +
> +; This is not part of any set
> +; EnumValue
> +; Enum(gfc_inlineable_intrinsics) String(none) Value(GFC_FLAG_INLINE_INTRINSIC_NONE)
> +
> +EnumValue
> +Enum(gfc_inlineable_intrinsics) String(maxloc) Value(GFC_FLAG_INLINE_INTRINSIC_MAXLOC) Set(1)
> +
> +EnumValue
> +Enum(gfc_inlineable_intrinsics) String(minloc) Value(GFC_FLAG_INLINE_INTRINSIC_MINLOC) Set(2)
> +
> +; This is not part of any set
> +; EnumValue
> +; Enum(gfc_inlineable_intrinsics) String(all) Value(GFC_FLAG_INLINE_INTRINSIC_ALL)
> +
>   finline-matmul-limit=
>   Fortran RejectNegative Joined UInteger Var(flag_inline_matmul_limit) Init(-1)
>   -finline-matmul-limit=<n>	Specify the size of the largest matrix for which matmul will be inlined.
> diff --git a/gcc/fortran/options.cc b/gcc/fortran/options.cc
> index d8c5c8e62fc..6f2579ad9de 100644
> --- a/gcc/fortran/options.cc
> +++ b/gcc/fortran/options.cc
> @@ -472,7 +472,26 @@ gfc_post_options (const char **pfilename)
>     /* Implement -fno-automatic as -fmax-stack-var-size=0.  */
>     if (!flag_automatic)
>       flag_max_stack_var_size = 0;
> -
> +
> +  /* Decide inlining preference depending on optimization if nothing was
> +     specified on the command line.  */
> +  if ((flag_inline_intrinsics & GFC_FLAG_INLINE_INTRINSIC_MAXLOC)
> +      == GFC_FLAG_INLINE_INTRINSIC_MAXLOC_UNSET)
> +    {
> +      if (optimize == 0 || optimize_size != 0)
> +	flag_inline_intrinsics &= ~GFC_FLAG_INLINE_INTRINSIC_MAXLOC;
> +      else
> +	flag_inline_intrinsics |= GFC_FLAG_INLINE_INTRINSIC_MAXLOC;
> +    }
> +  if ((flag_inline_intrinsics & GFC_FLAG_INLINE_INTRINSIC_MINLOC)
> +      == GFC_FLAG_INLINE_INTRINSIC_MINLOC_UNSET)
> +    {
> +      if (optimize == 0 || optimize_size != 0)
> +	flag_inline_intrinsics &= ~GFC_FLAG_INLINE_INTRINSIC_MINLOC;
> +      else
> +	flag_inline_intrinsics |= GFC_FLAG_INLINE_INTRINSIC_MINLOC;
> +    }
> +
>     /* If the user did not specify an inline matmul limit, inline up to the BLAS
>        limit or up to 30 if no external BLAS is specified.  */
>   
> diff --git a/gcc/fortran/trans-intrinsic.cc b/gcc/fortran/trans-intrinsic.cc
> index b03f7b1653e..456f28eba4e 100644
> --- a/gcc/fortran/trans-intrinsic.cc
> +++ b/gcc/fortran/trans-intrinsic.cc
> @@ -11840,10 +11840,11 @@ gfc_inline_intrinsic_function_p (gfc_expr *expr)
>     gfc_actual_arglist *args, *dim_arg, *mask_arg;
>     gfc_expr *maskexpr;
>   
> -  if (!expr->value.function.isym)
> +  gfc_intrinsic_sym *isym = expr->value.function.isym;
> +  if (!isym)
>       return false;
>   
> -  switch (expr->value.function.isym->id)
> +  switch (isym->id)
>       {
>       case GFC_ISYM_PRODUCT:
>       case GFC_ISYM_SUM:
> @@ -11879,8 +11880,12 @@ gfc_inline_intrinsic_function_p (gfc_expr *expr)
>       case GFC_ISYM_MINLOC:
>       case GFC_ISYM_MAXLOC:
>         {
> -	/* Disable inline expansion if code size matters.  */
> -	if (optimize_size)
> +	if ((isym->id == GFC_ISYM_MINLOC
> +	     && (flag_inline_intrinsics
> +		 & GFC_FLAG_INLINE_INTRINSIC_MINLOC) == 0)
> +	    || (isym->id == GFC_ISYM_MAXLOC
> +		&& (flag_inline_intrinsics
> +		    & GFC_FLAG_INLINE_INTRINSIC_MAXLOC) == 0))
>   	  return false;
>   
>   	gfc_actual_arglist *array_arg = expr->value.function.actual;
> diff --git a/gcc/testsuite/gfortran.dg/minmaxloc_18.f90 b/gcc/testsuite/gfortran.dg/minmaxloc_18.f90
> new file mode 100644
> index 00000000000..e8cd2d42d8d
> --- /dev/null
> +++ b/gcc/testsuite/gfortran.dg/minmaxloc_18.f90
> @@ -0,0 +1,772 @@
> +! { dg-do compile }
> +! { dg-additional-options "-O -fdump-tree-original" }
> +! { dg-final { scan-tree-dump-not "gfortran_\[sm\]?minloc" "original" } }
> +! { dg-final { scan-tree-dump-not "gfortran_\[sm\]?maxloc" "original" } }
> +!
> +! PR fortran/90608
> +! Check that all MINLOC and MAXLOC calls are inlined with optimizations by default.
> +
> +subroutine check_maxloc_without_mask
> +  implicit none
> +  integer, parameter :: data5(*) = (/ 1, 7, 2, 7, 0 /)
> +  integer, parameter :: data64(*) = (/ 2, 5, 4, 6, 0, 9, 3, 5,  &
> +                                       4, 4, 1, 7, 3, 2, 1, 2,  &
> +                                       5, 4, 6, 0, 9, 3, 5, 4,  &
> +                                       4, 1, 7, 3, 2, 1, 2, 5,  &
> +                                       4, 6, 0, 9, 3, 5, 4, 4,  &
> +                                       1, 7, 3, 2, 1, 2, 5, 4,  &
> +                                       6, 0, 9, 3, 5, 4, 4, 1,  &
> +                                       7, 3, 2, 1, 2, 5, 4, 6  /)
> +  call check_int_const_shape_rank_1
> +  call check_int_const_shape_rank_3
> +  call check_int_const_shape_empty_4
> +  call check_int_alloc_rank_1
> +  call check_int_alloc_rank_3
> +  call check_real_const_shape_rank_1
> +  call check_real_const_shape_rank_3
> +  call check_real_const_shape_empty_4
> +  call check_real_alloc_rank_1
> +  call check_real_alloc_rank_3
> +contains
> +  subroutine check_int_const_shape_rank_1()
> +    integer :: a(5)
> +    integer, allocatable :: m(:)
> +    a = data5
> +    m = maxloc(a)
> +    if (size(m, dim=1) /= 1) stop 11
> +    if (any(m /= (/ 2 /))) stop 12
> +  end subroutine
> +  subroutine check_int_const_shape_rank_3()
> +    integer :: a(4,4,4)
> +    integer, allocatable :: m(:)
> +    a = reshape(data64, shape(a))
> +    m = maxloc(a)
> +    if (size(m, dim=1) /= 3) stop 21
> +    if (any(m /= (/ 2, 2, 1 /))) stop 22
> +  end subroutine
> +  subroutine check_int_const_shape_empty_4()
> +    integer :: a(9,3,0,7)
> +    integer, allocatable :: m(:)
> +    a = reshape((/ integer:: /), shape(a))
> +    m = maxloc(a)
> +    if (size(m, dim=1) /= 4) stop 31
> +    if (any(m /= (/ 0, 0, 0, 0 /))) stop 32
> +  end subroutine
> +  subroutine check_int_alloc_rank_1()
> +    integer, allocatable :: a(:)
> +    integer, allocatable :: m(:)
> +    allocate(a(5))
> +    a(:) = data5
> +    m = maxloc(a)
> +    if (size(m, dim=1) /= 1) stop 41
> +    if (any(m /= (/ 2 /))) stop 42
> +  end subroutine
> +  subroutine check_int_alloc_rank_3()
> +    integer, allocatable :: a(:,:,:)
> +    integer, allocatable :: m(:)
> +    allocate(a(4,4,4))
> +    a(:,:,:) = reshape(data64, shape(a))
> +    m = maxloc(a)
> +    if (size(m, dim=1) /= 3) stop 51
> +    if (any(m /= (/ 2, 2, 1 /))) stop 52
> +  end subroutine
> +  subroutine check_real_const_shape_rank_1()
> +    real :: a(5)
> +    integer, allocatable :: m(:)
> +    a = (/ real:: data5 /)
> +    m = maxloc(a)
> +    if (size(m, dim=1) /= 1) stop 71
> +    if (any(m /= (/ 2 /))) stop 72
> +  end subroutine
> +  subroutine check_real_const_shape_rank_3()
> +    real :: a(4,4,4)
> +    integer, allocatable :: m(:)
> +    a = reshape((/ real:: data64 /), shape(a))
> +    m = maxloc(a)
> +    if (size(m, dim=1) /= 3) stop 81
> +    if (any(m /= (/ 2, 2, 1 /))) stop 82
> +  end subroutine
> +  subroutine check_real_const_shape_empty_4()
> +    real :: a(9,3,0,7)
> +    integer, allocatable :: m(:)
> +    a = reshape((/ real:: /), shape(a))
> +    m = maxloc(a)
> +    if (size(m, dim=1) /= 4) stop 91
> +    if (any(m /= (/ 0, 0, 0, 0 /))) stop 92
> +  end subroutine
> +  subroutine check_real_alloc_rank_1()
> +    real, allocatable :: a(:)
> +    integer, allocatable :: m(:)
> +    allocate(a(5))
> +    a(:) = (/ real:: data5 /)
> +    m = maxloc(a)
> +    if (size(m, dim=1) /= 1) stop 111
> +    if (any(m /= (/ 2 /))) stop 112
> +  end subroutine
> +  subroutine check_real_alloc_rank_3()
> +    real, allocatable :: a(:,:,:)
> +    integer, allocatable :: m(:)
> +    allocate(a(4,4,4))
> +    a(:,:,:) = reshape((/ real:: data64 /), shape(a))
> +    m = maxloc(a)
> +    if (size(m, dim=1) /= 3) stop 121
> +    if (any(m /= (/ 2, 2, 1 /))) stop 122
> +  end subroutine
> +end subroutine check_maxloc_without_mask
> +subroutine check_minloc_without_mask
> +  implicit none
> +  integer, parameter :: data5(*) = (/ 8, 2, 7, 2, 9 /)
> +  integer, parameter :: data64(*) = (/ 7, 4, 5, 3, 9, 0, 6, 4,  &
> +                                       5, 5, 8, 2, 6, 7, 8, 7,  &
> +                                       4, 5, 3, 9, 0, 6, 4, 5,  &
> +                                       5, 8, 2, 6, 7, 8, 7, 4,  &
> +                                       5, 3, 9, 0, 6, 4, 5, 5,  &
> +                                       8, 2, 6, 7, 8, 7, 4, 5,  &
> +                                       3, 9, 0, 6, 4, 5, 5, 8,  &
> +                                       2, 6, 7, 8, 7, 4, 5, 3  /)
> +  call check_int_const_shape_rank_1
> +  call check_int_const_shape_rank_3
> +  call check_int_const_shape_empty_4
> +  call check_int_alloc_rank_1
> +  call check_int_alloc_rank_3
> +  call check_real_const_shape_rank_1
> +  call check_real_const_shape_rank_3
> +  call check_real_const_shape_empty_4
> +  call check_real_alloc_rank_1
> +  call check_real_alloc_rank_3
> +contains
> +  subroutine check_int_const_shape_rank_1()
> +    integer :: a(5)
> +    integer, allocatable :: m(:)
> +    a = data5
> +    m = minloc(a)
> +    if (size(m, dim=1) /= 1) stop 11
> +    if (any(m /= (/ 2 /))) stop 12
> +  end subroutine
> +  subroutine check_int_const_shape_rank_3()
> +    integer :: a(4,4,4)
> +    integer, allocatable :: m(:)
> +    a = reshape(data64, shape(a))
> +    m = minloc(a)
> +    if (size(m, dim=1) /= 3) stop 21
> +    if (any(m /= (/ 2, 2, 1 /))) stop 22
> +  end subroutine
> +  subroutine check_int_const_shape_empty_4()
> +    integer :: a(9,3,0,7)
> +    integer, allocatable :: m(:)
> +    a = reshape((/ integer:: /), shape(a))
> +    m = minloc(a)
> +    if (size(m, dim=1) /= 4) stop 31
> +    if (any(m /= (/ 0, 0, 0, 0 /))) stop 32
> +  end subroutine
> +  subroutine check_int_alloc_rank_1()
> +    integer, allocatable :: a(:)
> +    integer, allocatable :: m(:)
> +    allocate(a(5))
> +    a(:) = data5
> +    m = minloc(a)
> +    if (size(m, dim=1) /= 1) stop 41
> +    if (any(m /= (/ 2 /))) stop 42
> +  end subroutine
> +  subroutine check_int_alloc_rank_3()
> +    integer, allocatable :: a(:,:,:)
> +    integer, allocatable :: m(:)
> +    allocate(a(4,4,4))
> +    a(:,:,:) = reshape(data64, shape(a))
> +    m = minloc(a)
> +    if (size(m, dim=1) /= 3) stop 51
> +    if (any(m /= (/ 2, 2, 1 /))) stop 52
> +  end subroutine
> +  subroutine check_real_const_shape_rank_1()
> +    real :: a(5)
> +    integer, allocatable :: m(:)
> +    a = (/ real:: data5 /)
> +    m = minloc(a)
> +    if (size(m, dim=1) /= 1) stop 71
> +    if (any(m /= (/ 2 /))) stop 72
> +  end subroutine
> +  subroutine check_real_const_shape_rank_3()
> +    real :: a(4,4,4)
> +    integer, allocatable :: m(:)
> +    a = reshape((/ real:: data64 /), shape(a))
> +    m = minloc(a)
> +    if (size(m, dim=1) /= 3) stop 81
> +    if (any(m /= (/ 2, 2, 1 /))) stop 82
> +  end subroutine
> +  subroutine check_real_const_shape_empty_4()
> +    real :: a(9,3,0,7)
> +    integer, allocatable :: m(:)
> +    a = reshape((/ real:: /), shape(a))
> +    m = minloc(a)
> +    if (size(m, dim=1) /= 4) stop 91
> +    if (any(m /= (/ 0, 0, 0, 0 /))) stop 92
> +  end subroutine
> +  subroutine check_real_alloc_rank_1()
> +    real, allocatable :: a(:)
> +    integer, allocatable :: m(:)
> +    allocate(a(5))
> +    a(:) = (/ real:: data5 /)
> +    m = minloc(a)
> +    if (size(m, dim=1) /= 1) stop 111
> +    if (any(m /= (/ 2 /))) stop 112
> +  end subroutine
> +  subroutine check_real_alloc_rank_3()
> +    real, allocatable :: a(:,:,:)
> +    integer, allocatable :: m(:)
> +    allocate(a(4,4,4))
> +    a(:,:,:) = reshape((/ real:: data64 /), shape(a))
> +    m = minloc(a)
> +    if (size(m, dim=1) /= 3) stop 121
> +    if (any(m /= (/ 2, 2, 1 /))) stop 122
> +  end subroutine
> +end subroutine check_minloc_without_mask
> +subroutine check_maxloc_with_mask
> +  implicit none
> +  integer, parameter :: data10(*) = (/ 7, 4, 7, 6, 6, 4, 6, 3, 9, 8 /)
> +  logical, parameter :: mask10(*) = (/ .false., .true., .false., &
> +                                       .false., .true., .true.,  &
> +                                       .true. , .true., .false., &
> +                                       .false. /)
> +  integer, parameter :: data64(*) = (/ 2, 5, 4, 6, 0, 9, 3, 5,  &
> +                                       4, 4, 1, 7, 3, 2, 1, 2,  &
> +                                       5, 4, 6, 0, 9, 3, 5, 4,  &
> +                                       4, 1, 7, 3, 2, 1, 2, 5,  &
> +                                       4, 6, 0, 9, 3, 5, 4, 4,  &
> +                                       1, 7, 3, 2, 1, 2, 5, 4,  &
> +                                       6, 0, 9, 3, 5, 4, 4, 1,  &
> +                                       7, 3, 2, 1, 2, 5, 4, 6  /)
> +  logical, parameter :: mask64(*) = (/ .true. , .false., .false., .false., &
> +                                       .true. , .false., .true. , .false., &
> +                                       .false., .true. , .true. , .false., &
> +                                       .true. , .true. , .true. , .true. , &
> +                                       .false., .true. , .false., .true. , &
> +                                       .false., .true. , .false., .true. , &
> +                                       .true. , .false., .false., .true. , &
> +                                       .true. , .true. , .true. , .false., &
> +                                       .false., .false., .true. , .false., &
> +                                       .true. , .false., .true. , .true. , &
> +                                       .true. , .false., .true. , .true. , &
> +                                       .false., .true. , .false., .true. , &
> +                                       .false., .true. , .false., .false., &
> +                                       .false., .true. , .true. , .true. , &
> +                                       .false., .true. , .false., .true. , &
> +                                       .true. , .false., .false., .false. /)
> +  call check_int_const_shape_rank_1
> +  call check_int_const_shape_rank_3
> +  call check_int_const_shape_rank_3_true_mask
> +  call check_int_const_shape_rank_3_false_mask
> +  call check_int_const_shape_rank_3_optional_mask_present
> +  call check_int_const_shape_rank_3_optional_mask_absent
> +  call check_int_const_shape_empty_4
> +  call check_int_alloc_rank_1
> +  call check_int_alloc_rank_3
> +  call check_int_alloc_rank_3_true_mask
> +  call check_int_alloc_rank_3_false_mask
> +  call check_real_const_shape_rank_1
> +  call check_real_const_shape_rank_3
> +  call check_real_const_shape_rank_3_true_mask
> +  call check_real_const_shape_rank_3_false_mask
> +  call check_real_const_shape_rank_3_optional_mask_present
> +  call check_real_const_shape_rank_3_optional_mask_absent
> +  call check_real_const_shape_empty_4
> +  call check_real_alloc_rank_1
> +  call check_real_alloc_rank_3
> +  call check_real_alloc_rank_3_true_mask
> +  call check_real_alloc_rank_3_false_mask
> +contains
> +  subroutine check_int_const_shape_rank_1()
> +    integer :: a(10)
> +    logical :: m(10)
> +    integer, allocatable :: r(:)
> +    a = data10
> +    m = mask10
> +    r = maxloc(a, mask = m)
> +    if (size(r, dim = 1) /= 1) stop 11
> +    if (any(r /= (/ 5 /))) stop 12
> +  end subroutine
> +  subroutine check_int_const_shape_rank_3()
> +    integer :: a(4,4,4)
> +    logical :: m(4,4,4)
> +    integer, allocatable :: r(:)
> +    a = reshape(data64, shape(a))
> +    m = reshape(mask64, shape(m))
> +    r = maxloc(a, mask = m)
> +    if (size(r, dim = 1) /= 3) stop 21
> +    if (any(r /= (/ 2, 3, 1 /))) stop 22
> +  end subroutine
> +  subroutine check_int_const_shape_rank_3_true_mask()
> +    integer :: a(4,4,4)
> +    integer, allocatable :: r(:)
> +    a = reshape(data64, shape(a))
> +    r = maxloc(a, mask = .true.)
> +    if (size(r, dim = 1) /= 3) stop 31
> +    if (any(r /= (/ 2, 2, 1 /))) stop 32
> +  end subroutine
> +  subroutine check_int_const_shape_rank_3_false_mask()
> +    integer :: a(4,4,4)
> +    integer, allocatable :: r(:)
> +    a = reshape(data64, shape(a))
> +    r = maxloc(a, mask = .false.)
> +    if (size(r, dim = 1) /= 3) stop 41
> +    if (any(r /= (/ 0, 0, 0 /))) stop 42
> +  end subroutine
> +  subroutine call_maxloc_int(r, a, m)
> +    integer :: a(:,:,:)
> +    logical, optional :: m(:,:,:)
> +    integer, allocatable :: r(:)
> +    r = maxloc(a, mask = m)
> +  end subroutine
> +  subroutine check_int_const_shape_rank_3_optional_mask_present()
> +    integer :: a(4,4,4)
> +    logical :: m(4,4,4)
> +    integer, allocatable :: r(:)
> +    a = reshape(data64, shape(a))
> +    m = reshape(mask64, shape(m))
> +    call call_maxloc_int(r, a, m)
> +    if (size(r, dim = 1) /= 3) stop 51
> +    if (any(r /= (/ 2, 3, 1 /))) stop 52
> +  end subroutine
> +  subroutine check_int_const_shape_rank_3_optional_mask_absent()
> +    integer :: a(4,4,4)
> +    integer, allocatable :: r(:)
> +    a = reshape(data64, shape(a))
> +    call call_maxloc_int(r, a)
> +    if (size(r, dim = 1) /= 3) stop 61
> +    if (any(r /= (/ 2, 2, 1 /))) stop 62
> +  end subroutine
> +  subroutine check_int_const_shape_empty_4()
> +    integer :: a(9,3,0,7)
> +    logical :: m(9,3,0,7)
> +    integer, allocatable :: r(:)
> +    a = reshape((/ integer:: /), shape(a))
> +    m = reshape((/ logical:: /), shape(m))
> +    r = maxloc(a, mask = m)
> +    if (size(r, dim = 1) /= 4) stop 71
> +    if (any(r /= (/ 0, 0, 0, 0 /))) stop 72
> +  end subroutine
> +  subroutine check_int_alloc_rank_1()
> +    integer, allocatable :: a(:)
> +    logical, allocatable :: m(:)
> +    integer, allocatable :: r(:)
> +    allocate(a(10), m(10))
> +    a(:) = data10
> +    m(:) = mask10
> +    r = maxloc(a, mask = m)
> +    if (size(r, dim = 1) /= 1) stop 81
> +    if (any(r /= (/ 5 /))) stop 82
> +  end subroutine
> +  subroutine check_int_alloc_rank_3()
> +    integer, allocatable :: a(:,:,:)
> +    logical, allocatable :: m(:,:,:)
> +    integer, allocatable :: r(:)
> +    allocate(a(4,4,4), m(4,4,4))
> +    a(:,:,:) = reshape(data64, shape(a))
> +    m(:,:,:) = reshape(mask64, shape(m))
> +    r = maxloc(a, mask = m)
> +    if (size(r, dim = 1) /= 3) stop 91
> +    if (any(r /= (/ 2, 3, 1 /))) stop 92
> +  end subroutine
> +  subroutine check_int_alloc_rank_3_true_mask()
> +    integer, allocatable :: a(:,:,:)
> +    integer, allocatable :: r(:)
> +    allocate(a(4,4,4))
> +    a(:,:,:) = reshape(data64, shape(a))
> +    r = maxloc(a, mask = .true.)
> +    if (size(r, dim = 1) /= 3) stop 101
> +    if (any(r /= (/ 2, 2, 1 /))) stop 102
> +  end subroutine
> +  subroutine check_int_alloc_rank_3_false_mask()
> +    integer, allocatable :: a(:,:,:)
> +    integer, allocatable :: r(:)
> +    allocate(a(4,4,4))
> +    a(:,:,:) = reshape(data64, shape(a))
> +    r = maxloc(a, mask = .false.)
> +    if (size(r, dim = 1) /= 3) stop 111
> +    if (any(r /= (/ 0, 0, 0 /))) stop 112
> +  end subroutine
> +  subroutine check_real_const_shape_rank_1()
> +    real :: a(10)
> +    logical :: m(10)
> +    integer, allocatable :: r(:)
> +    a = (/ real:: data10 /)
> +    m = mask10
> +    r = maxloc(a, mask = m)
> +    if (size(r, dim = 1) /= 1) stop 131
> +    if (any(r /= (/ 5 /))) stop 132
> +  end subroutine
> +  subroutine check_real_const_shape_rank_3()
> +    real :: a(4,4,4)
> +    logical :: m(4,4,4)
> +    integer, allocatable :: r(:)
> +    a = reshape((/ real:: data64 /), shape(a))
> +    m = reshape(mask64, shape(m))
> +    r = maxloc(a, mask = m)
> +    if (size(r, dim = 1) /= 3) stop 141
> +    if (any(r /= (/ 2, 3, 1 /))) stop 142
> +  end subroutine
> +  subroutine check_real_const_shape_rank_3_true_mask()
> +    real :: a(4,4,4)
> +    integer, allocatable :: r(:)
> +    a = reshape((/ real:: data64 /), shape(a))
> +    r = maxloc(a, mask = .true.)
> +    if (size(r, dim = 1) /= 3) stop 151
> +    if (any(r /= (/ 2, 2, 1 /))) stop 152
> +  end subroutine
> +  subroutine check_real_const_shape_rank_3_false_mask()
> +    real :: a(4,4,4)
> +    integer, allocatable :: r(:)
> +    a = reshape((/ real:: data64 /), shape(a))
> +    r = maxloc(a, mask = .false.)
> +    if (size(r, dim = 1) /= 3) stop 161
> +    if (any(r /= (/ 0, 0, 0 /))) stop 162
> +  end subroutine
> +  subroutine call_maxloc_real(r, a, m)
> +    real :: a(:,:,:)
> +    logical, optional  :: m(:,:,:)
> +    integer, allocatable :: r(:)
> +    r = maxloc(a, mask = m)
> +  end subroutine
> +  subroutine check_real_const_shape_rank_3_optional_mask_present()
> +    real :: a(4,4,4)
> +    logical :: m(4,4,4)
> +    integer, allocatable :: r(:)
> +    a = reshape((/ real:: data64 /), shape(a))
> +    m = reshape(mask64, shape(m))
> +    call call_maxloc_real(r, a, m)
> +    if (size(r, dim = 1) /= 3) stop 171
> +    if (any(r /= (/ 2, 3, 1 /))) stop 172
> +  end subroutine
> +  subroutine check_real_const_shape_rank_3_optional_mask_absent()
> +    real :: a(4,4,4)
> +    integer, allocatable :: r(:)
> +    a = reshape((/ real:: data64 /), shape(a))
> +    call call_maxloc_real(r, a)
> +    if (size(r, dim = 1) /= 3) stop 181
> +    if (any(r /= (/ 2, 2, 1 /))) stop 182
> +  end subroutine
> +  subroutine check_real_const_shape_empty_4()
> +    real :: a(9,3,0,7)
> +    logical :: m(9,3,0,7)
> +    integer, allocatable :: r(:)
> +    a = reshape((/ real:: /), shape(a))
> +    m = reshape((/ logical:: /), shape(m))
> +    r = maxloc(a, mask = m)
> +    if (size(r, dim = 1) /= 4) stop 191
> +    if (any(r /= (/ 0, 0, 0, 0 /))) stop 192
> +  end subroutine
> +  subroutine check_real_alloc_rank_1()
> +    real, allocatable :: a(:)
> +    logical, allocatable :: m(:)
> +    integer, allocatable :: r(:)
> +    allocate(a(10), m(10))
> +    a(:) = (/ real:: data10 /)
> +    m(:) = mask10
> +    r = maxloc(a, mask = m)
> +    if (size(r, dim = 1) /= 1) stop 201
> +    if (any(r /= (/ 5 /))) stop 202
> +  end subroutine
> +  subroutine check_real_alloc_rank_3()
> +    real, allocatable :: a(:,:,:)
> +    logical, allocatable :: m(:,:,:)
> +    integer, allocatable :: r(:)
> +    allocate(a(4,4,4), m(4,4,4))
> +    a(:,:,:) = reshape((/ real:: data64 /), shape(a))
> +    m(:,:,:) = reshape(mask64, shape(m))
> +    r = maxloc(a, mask = m)
> +    if (size(r, dim = 1) /= 3) stop 211
> +    if (any(r /= (/ 2, 3, 1 /))) stop 212
> +  end subroutine
> +  subroutine check_real_alloc_rank_3_true_mask()
> +    real, allocatable :: a(:,:,:)
> +    integer, allocatable :: r(:)
> +    allocate(a(4,4,4))
> +    a(:,:,:) = reshape((/ real:: data64 /), shape(a))
> +    r = maxloc(a, mask = .true.)
> +    if (size(r, dim = 1) /= 3) stop 221
> +    if (any(r /= (/ 2, 2, 1 /))) stop 222
> +  end subroutine
> +  subroutine check_real_alloc_rank_3_false_mask()
> +    real, allocatable :: a(:,:,:)
> +    integer, allocatable :: r(:)
> +    allocate(a(4,4,4))
> +    a(:,:,:) = reshape((/ real:: data64 /), shape(a))
> +    r = maxloc(a, mask = .false.)
> +    if (size(r, dim = 1) /= 3) stop 231
> +    if (any(r /= (/ 0, 0, 0 /))) stop 232
> +  end subroutine
> +end subroutine check_maxloc_with_mask
> +subroutine check_minloc_with_mask
> +  implicit none
> +  integer, parameter :: data10(*) = (/ 2, 5, 2, 3, 3, 5, 3, 6, 0, 1 /)
> +  logical, parameter :: mask10(*) = (/ .false., .true., .false., &
> +                                       .false., .true., .true.,  &
> +                                       .true. , .true., .false., &
> +                                       .false. /)
> +  integer, parameter :: data64(*) = (/ 7, 4, 5, 3, 9, 0, 6, 4,  &
> +                                       5, 5, 8, 2, 6, 7, 8, 7,  &
> +                                       4, 5, 3, 9, 0, 6, 4, 5,  &
> +                                       5, 8, 2, 6, 7, 8, 7, 4,  &
> +                                       5, 3, 9, 0, 6, 4, 5, 5,  &
> +                                       8, 2, 6, 7, 8, 7, 4, 5,  &
> +                                       3, 9, 0, 6, 4, 5, 5, 8,  &
> +                                       2, 6, 7, 8, 7, 4, 5, 3  /)
> +  logical, parameter :: mask64(*) = (/ .true. , .false., .false., .false., &
> +                                       .true. , .false., .true. , .false., &
> +                                       .false., .true. , .true. , .false., &
> +                                       .true. , .true. , .true. , .true. , &
> +                                       .false., .true. , .false., .true. , &
> +                                       .false., .true. , .false., .true. , &
> +                                       .true. , .false., .false., .true. , &
> +                                       .true. , .true. , .true. , .false., &
> +                                       .false., .false., .true. , .false., &
> +                                       .true. , .false., .true. , .true. , &
> +                                       .true. , .false., .true. , .true. , &
> +                                       .false., .true. , .false., .true. , &
> +                                       .false., .true. , .false., .false., &
> +                                       .false., .true. , .true. , .true. , &
> +                                       .false., .true. , .false., .true. , &
> +                                       .true. , .false., .false., .false. /)
> +  call check_int_const_shape_rank_1
> +  call check_int_const_shape_rank_3
> +  call check_int_const_shape_rank_3_true_mask
> +  call check_int_const_shape_rank_3_false_mask
> +  call check_int_const_shape_rank_3_optional_mask_present
> +  call check_int_const_shape_rank_3_optional_mask_absent
> +  call check_int_const_shape_empty_4
> +  call check_int_alloc_rank_1
> +  call check_int_alloc_rank_3
> +  call check_int_alloc_rank_3_true_mask
> +  call check_int_alloc_rank_3_false_mask
> +  call check_real_const_shape_rank_1
> +  call check_real_const_shape_rank_3
> +  call check_real_const_shape_rank_3_true_mask
> +  call check_real_const_shape_rank_3_false_mask
> +  call check_real_const_shape_rank_3_optional_mask_present
> +  call check_real_const_shape_rank_3_optional_mask_absent
> +  call check_real_const_shape_empty_4
> +  call check_real_alloc_rank_1
> +  call check_real_alloc_rank_3
> +  call check_real_alloc_rank_3_true_mask
> +  call check_real_alloc_rank_3_false_mask
> +contains
> +  subroutine check_int_const_shape_rank_1()
> +    integer :: a(10)
> +    logical :: m(10)
> +    integer, allocatable :: r(:)
> +    a = data10
> +    m = mask10
> +    r = minloc(a, mask = m)
> +    if (size(r, dim = 1) /= 1) stop 11
> +    if (any(r /= (/ 5 /))) stop 12
> +  end subroutine
> +  subroutine check_int_const_shape_rank_3()
> +    integer :: a(4,4,4)
> +    logical :: m(4,4,4)
> +    integer, allocatable :: r(:)
> +    a = reshape(data64, shape(a))
> +    m = reshape(mask64, shape(m))
> +    r = minloc(a, mask = m)
> +    if (size(r, dim = 1) /= 3) stop 21
> +    if (any(r /= (/ 2, 3, 1 /))) stop 22
> +  end subroutine
> +  subroutine check_int_const_shape_rank_3_true_mask()
> +    integer :: a(4,4,4)
> +    integer, allocatable :: r(:)
> +    a = reshape(data64, shape(a))
> +    r = minloc(a, mask = .true.)
> +    if (size(r, dim = 1) /= 3) stop 31
> +    if (any(r /= (/ 2, 2, 1 /))) stop 32
> +  end subroutine
> +  subroutine check_int_const_shape_rank_3_false_mask()
> +    integer :: a(4,4,4)
> +    integer, allocatable :: r(:)
> +    a = reshape(data64, shape(a))
> +    r = minloc(a, mask = .false.)
> +    if (size(r, dim = 1) /= 3) stop 41
> +    if (any(r /= (/ 0, 0, 0 /))) stop 42
> +  end subroutine
> +  subroutine call_minloc_int(r, a, m)
> +    integer :: a(:,:,:)
> +    logical, optional :: m(:,:,:)
> +    integer, allocatable :: r(:)
> +    r = minloc(a, mask = m)
> +  end subroutine
> +  subroutine check_int_const_shape_rank_3_optional_mask_present()
> +    integer :: a(4,4,4)
> +    logical :: m(4,4,4)
> +    integer, allocatable :: r(:)
> +    a = reshape(data64, shape(a))
> +    m = reshape(mask64, shape(m))
> +    call call_minloc_int(r, a, m)
> +    if (size(r, dim = 1) /= 3) stop 51
> +    if (any(r /= (/ 2, 3, 1 /))) stop 52
> +  end subroutine
> +  subroutine check_int_const_shape_rank_3_optional_mask_absent()
> +    integer :: a(4,4,4)
> +    integer, allocatable :: r(:)
> +    a = reshape(data64, shape(a))
> +    call call_minloc_int(r, a)
> +    if (size(r, dim = 1) /= 3) stop 61
> +    if (any(r /= (/ 2, 2, 1 /))) stop 62
> +  end subroutine
> +  subroutine check_int_const_shape_empty_4()
> +    integer :: a(9,3,0,7)
> +    logical :: m(9,3,0,7)
> +    integer, allocatable :: r(:)
> +    a = reshape((/ integer:: /), shape(a))
> +    m = reshape((/ logical:: /), shape(m))
> +    r = minloc(a, mask = m)
> +    if (size(r, dim = 1) /= 4) stop 71
> +    if (any(r /= (/ 0, 0, 0, 0 /))) stop 72
> +  end subroutine
> +  subroutine check_int_alloc_rank_1()
> +    integer, allocatable :: a(:)
> +    logical, allocatable :: m(:)
> +    integer, allocatable :: r(:)
> +    allocate(a(10), m(10))
> +    a(:) = data10
> +    m(:) = mask10
> +    r = minloc(a, mask = m)
> +    if (size(r, dim = 1) /= 1) stop 81
> +    if (any(r /= (/ 5 /))) stop 82
> +  end subroutine
> +  subroutine check_int_alloc_rank_3()
> +    integer, allocatable :: a(:,:,:)
> +    logical, allocatable :: m(:,:,:)
> +    integer, allocatable :: r(:)
> +    allocate(a(4,4,4), m(4,4,4))
> +    a(:,:,:) = reshape(data64, shape(a))
> +    m(:,:,:) = reshape(mask64, shape(m))
> +    r = minloc(a, mask = m)
> +    if (size(r, dim = 1) /= 3) stop 91
> +    if (any(r /= (/ 2, 3, 1 /))) stop 92
> +  end subroutine
> +  subroutine check_int_alloc_rank_3_true_mask()
> +    integer, allocatable :: a(:,:,:)
> +    integer, allocatable :: r(:)
> +    allocate(a(4,4,4))
> +    a(:,:,:) = reshape(data64, shape(a))
> +    r = minloc(a, mask = .true.)
> +    if (size(r, dim = 1) /= 3) stop 101
> +    if (any(r /= (/ 2, 2, 1 /))) stop 102
> +  end subroutine
> +  subroutine check_int_alloc_rank_3_false_mask()
> +    integer, allocatable :: a(:,:,:)
> +    integer, allocatable :: r(:)
> +    allocate(a(4,4,4))
> +    a(:,:,:) = reshape(data64, shape(a))
> +    r = minloc(a, mask = .false.)
> +    if (size(r, dim = 1) /= 3) stop 111
> +    if (any(r /= (/ 0, 0, 0 /))) stop 112
> +  end subroutine
> +  subroutine check_real_const_shape_rank_1()
> +    real :: a(10)
> +    logical :: m(10)
> +    integer, allocatable :: r(:)
> +    a = (/ real:: data10 /)
> +    m = mask10
> +    r = minloc(a, mask = m)
> +    if (size(r, dim = 1) /= 1) stop 131
> +    if (any(r /= (/ 5 /))) stop 132
> +  end subroutine
> +  subroutine check_real_const_shape_rank_3()
> +    real :: a(4,4,4)
> +    logical :: m(4,4,4)
> +    integer, allocatable :: r(:)
> +    a = reshape((/ real:: data64 /), shape(a))
> +    m = reshape(mask64, shape(m))
> +    r = minloc(a, mask = m)
> +    if (size(r, dim = 1) /= 3) stop 141
> +    if (any(r /= (/ 2, 3, 1 /))) stop 142
> +  end subroutine
> +  subroutine check_real_const_shape_rank_3_true_mask()
> +    real :: a(4,4,4)
> +    integer, allocatable :: r(:)
> +    a = reshape((/ real:: data64 /), shape(a))
> +    r = minloc(a, mask = .true.)
> +    if (size(r, dim = 1) /= 3) stop 151
> +    if (any(r /= (/ 2, 2, 1 /))) stop 152
> +  end subroutine
> +  subroutine check_real_const_shape_rank_3_false_mask()
> +    real :: a(4,4,4)
> +    integer, allocatable :: r(:)
> +    a = reshape((/ real:: data64 /), shape(a))
> +    r = minloc(a, mask = .false.)
> +    if (size(r, dim = 1) /= 3) stop 161
> +    if (any(r /= (/ 0, 0, 0 /))) stop 162
> +  end subroutine
> +  subroutine call_minloc_real(r, a, m)
> +    real :: a(:,:,:)
> +    logical, optional  :: m(:,:,:)
> +    integer, allocatable :: r(:)
> +    r = minloc(a, mask = m)
> +  end subroutine
> +  subroutine check_real_const_shape_rank_3_optional_mask_present()
> +    real :: a(4,4,4)
> +    logical :: m(4,4,4)
> +    integer, allocatable :: r(:)
> +    a = reshape((/ real:: data64 /), shape(a))
> +    m = reshape(mask64, shape(m))
> +    call call_minloc_real(r, a, m)
> +    if (size(r, dim = 1) /= 3) stop 171
> +    if (any(r /= (/ 2, 3, 1 /))) stop 172
> +  end subroutine
> +  subroutine check_real_const_shape_rank_3_optional_mask_absent()
> +    real :: a(4,4,4)
> +    integer, allocatable :: r(:)
> +    a = reshape((/ real:: data64 /), shape(a))
> +    call call_minloc_real(r, a)
> +    if (size(r, dim = 1) /= 3) stop 181
> +    if (any(r /= (/ 2, 2, 1 /))) stop 182
> +  end subroutine
> +  subroutine check_real_const_shape_empty_4()
> +    real :: a(9,3,0,7)
> +    logical :: m(9,3,0,7)
> +    integer, allocatable :: r(:)
> +    a = reshape((/ real:: /), shape(a))
> +    m = reshape((/ logical:: /), shape(m))
> +    r = minloc(a, mask = m)
> +    if (size(r, dim = 1) /= 4) stop 191
> +    if (any(r /= (/ 0, 0, 0, 0 /))) stop 192
> +  end subroutine
> +  subroutine check_real_alloc_rank_1()
> +    real, allocatable :: a(:)
> +    logical, allocatable :: m(:)
> +    integer, allocatable :: r(:)
> +    allocate(a(10), m(10))
> +    a(:) = (/ real:: data10 /)
> +    m(:) = mask10
> +    r = minloc(a, mask = m)
> +    if (size(r, dim = 1) /= 1) stop 201
> +    if (any(r /= (/ 5 /))) stop 202
> +  end subroutine
> +  subroutine check_real_alloc_rank_3()
> +    real, allocatable :: a(:,:,:)
> +    logical, allocatable :: m(:,:,:)
> +    integer, allocatable :: r(:)
> +    allocate(a(4,4,4), m(4,4,4))
> +    a(:,:,:) = reshape((/ real:: data64 /), shape(a))
> +    m(:,:,:) = reshape(mask64, shape(m))
> +    r = minloc(a, mask = m)
> +    if (size(r, dim = 1) /= 3) stop 211
> +    if (any(r /= (/ 2, 3, 1 /))) stop 212
> +  end subroutine
> +  subroutine check_real_alloc_rank_3_true_mask()
> +    real, allocatable :: a(:,:,:)
> +    integer, allocatable :: r(:)
> +    allocate(a(4,4,4))
> +    a(:,:,:) = reshape((/ real:: data64 /), shape(a))
> +    r = minloc(a, mask = .true.)
> +    if (size(r, dim = 1) /= 3) stop 221
> +    if (any(r /= (/ 2, 2, 1 /))) stop 222
> +  end subroutine
> +  subroutine check_real_alloc_rank_3_false_mask()
> +    real, allocatable :: a(:,:,:)
> +    integer, allocatable :: r(:)
> +    allocate(a(4,4,4))
> +    a(:,:,:) = reshape((/ real:: data64 /), shape(a))
> +    r = minloc(a, mask = .false.)
> +    if (size(r, dim = 1) /= 3) stop 231
> +    if (any(r /= (/ 0, 0, 0 /))) stop 232
> +  end subroutine
> +end subroutine check_minloc_with_mask
> diff --git a/gcc/testsuite/gfortran.dg/minmaxloc_18a.f90 b/gcc/testsuite/gfortran.dg/minmaxloc_18a.f90
> new file mode 100644
> index 00000000000..362d1765c89
> --- /dev/null
> +++ b/gcc/testsuite/gfortran.dg/minmaxloc_18a.f90
> @@ -0,0 +1,10 @@
> +! { dg-do compile }
> +! { dg-additional-files "minmaxloc_18.f90" }
> +! { dg-additional-options "-Os -fdump-tree-original" }
> +! { dg-final { scan-tree-dump-times "gfortran_\[sm\]?minloc" 30 "original" } }
> +! { dg-final { scan-tree-dump-times "gfortran_\[sm\]?maxloc" 30 "original" } }
> +!
> +! PR fortran/90608
> +! Check that all MINLOC and MAXLOC intrinsics use the implementation provided
> +! by the library when optimizing for size.
> +include "minmaxloc_18.f90"
> diff --git a/gcc/testsuite/gfortran.dg/minmaxloc_18b.f90 b/gcc/testsuite/gfortran.dg/minmaxloc_18b.f90
> new file mode 100644
> index 00000000000..068c941110f
> --- /dev/null
> +++ b/gcc/testsuite/gfortran.dg/minmaxloc_18b.f90
> @@ -0,0 +1,10 @@
> +! { dg-do compile }
> +! { dg-additional-files "minmaxloc_18.f90" }
> +! { dg-additional-options "-O2 -fno-inline-intrinsics=minloc -fdump-tree-original" }
> +! { dg-final { scan-tree-dump-times "gfortran_\[sm\]?minloc" 30 "original" } }
> +! { dg-final { scan-tree-dump-not "gfortran_\[sm\]?maxloc" "original" } }
> +!
> +! PR fortran/90608
> +! Check that -O2 enables inlining and -fno-inline-intrinsics selectively
> +! disables it.
> +include "minmaxloc_18.f90"
> diff --git a/gcc/testsuite/gfortran.dg/minmaxloc_18c.f90 b/gcc/testsuite/gfortran.dg/minmaxloc_18c.f90
> new file mode 100644
> index 00000000000..47fe54e20a9
> --- /dev/null
> +++ b/gcc/testsuite/gfortran.dg/minmaxloc_18c.f90
> @@ -0,0 +1,10 @@
> +! { dg-do compile }
> +! { dg-additional-files "minmaxloc_18.f90" }
> +! { dg-additional-options "-O3 -fno-inline-intrinsics=maxloc -fdump-tree-original" }
> +! { dg-final { scan-tree-dump-not "gfortran_\[sm\]?minloc" "original" } }
> +! { dg-final { scan-tree-dump-times "gfortran_\[sm\]?maxloc" 30 "original" } }
> +!
> +! PR fortran/90608
> +! Check that -O3 enables inlining and -fno-inline-intrinsics selectively
> +! disables it.
> +include "minmaxloc_18.f90"
> diff --git a/gcc/testsuite/gfortran.dg/minmaxloc_18d.f90 b/gcc/testsuite/gfortran.dg/minmaxloc_18d.f90
> new file mode 100644
> index 00000000000..eb530f69a2e
> --- /dev/null
> +++ b/gcc/testsuite/gfortran.dg/minmaxloc_18d.f90
> @@ -0,0 +1,10 @@
> +! { dg-do compile }
> +! { dg-additional-files "minmaxloc_18.f90" }
> +! { dg-additional-options "-O0 -finline-intrinsics=maxloc -fdump-tree-original" }
> +! { dg-final { scan-tree-dump-times "gfortran_\[sm\]?minloc" 30 "original" } }
> +! { dg-final { scan-tree-dump-not "gfortran_\[sm\]?maxloc" "original" } }
> +!
> +! PR fortran/90608
> +! Check that -O0 disables inlining and -finline-intrinsics selectively
> +! enables it.
> +include "minmaxloc_18.f90"
Mikael Morin Aug. 20, 2024, 9:51 a.m. UTC | #2
Hello,

Le 19/08/2024 à 21:44, Harald Anlauf a écrit :
> Hi Mikael,
> 
> apart from patch #04/10, which did not apply cleanly here, I was
> able to test your patch.  It seems to work with a manual workaround
> (-fno-frontend-optimize) to work around this problem.
> Might be a local issue...
> 
Huh? That's unexpected, patches were rebased before submitting, and I 
don't think there was any recent activity in that area of the compiler 
anyway.

> That said, it works as advertised.  Thanks for separating out the
> IEEE-NaN tests.
> 
> What I did not fully get is the way you deal with -finline-intrinsics= .
> Currently there are only MINLOC and MAXLOC, but in the future there
> could be many more intrinsics.  Given that you need 2 bits per
> intrinsic in flag_inline_intrinsics, how future-proof is that?

Well, I don't expect that many candidate intrinsics for the flags; 
currently SUM and PRODUCT could be added, and probably IALL, IANY and 
IPARITY as well.  Remember that having both a libgfortran and a frontend 
implementation is a prerequisite.

For the future, 2 bits gives room to 16 intrinsics, and if we extend to 
64 bits, to 32 intrinsics without much hassle.  Having only 1 bit per 
intrinsic would be certainly more future proof but I haven't found how 
to do it.  Zero bit (aka no per-intrinsic setting) would be even more 
future-proof, but less convenient.  As a last resort possibility the 
.opt framework gives the possibility to accumulate options in a vector 
and have them processed "by hand".  It seemed more convenient to me to 
just use 2 bits per intrinsic, but we can fall back to manual processing 
if we get out of bits at some point.

Using only 1 bit per intrinsic was what I tried first, but then the 
default value has to be set before processing the flags, which means no 
default value depending on optimization, as optimization level is not 
known at that time.  I tried using EnabledBy(O || Ofast || Og) to set 
default value dependending on optimization, but this doesn't work either 
because the "O" covers all levels (-O0, -O1, -O2, -O3) without 
distinction.  And finally EnabledBy(O1 || O2 || O3 || Ofast || Og) is 
not accepted because "O1", "O2" and "O3" are not valid option names.

I can certainly drop the per-intrinsic setting to have unlimited room 
for future intrinsics.  Or if you have ideas on how to only use one bit 
per intrinsic, I'm all ears.  Or maybe the testcases are too cumbersome, 
and with a slight modification of behaviour we can save one bit (for 
example, also generate inline code for -O0). What do you think?

> In the documentation, you have:
> 
> +The set of intrinsics permitting the choice of implementation variant
> through
> +@code{-finline-intrinsics} is currently limited to non-scalar
> @code{MAXLOC} and
> +@code{MINLOC}.
> 
> Why do you say "non-scalar"?  The new inlining is done for these
> intrinsics when the DIM argument is absent.  The result characteristics
> however is:
> 
>    "If DIM does not appear, the result is an array of rank one and of
> size equal to the rank of ARRAY; ..."
> 
> and I thought the implementation does just that and does that right.
> (With DIM present, the result is an array of rank rank(arg)-1.)
> Can you clarify the wording in a way that is better understandable?
> 
Yeah, these patches are all about non-scalar MINLOC/MAXLOC.  But there 
is also the scalar MINLOC/MAXLOC case which has pre-existing inline code 
support (and on which these patches are based).  The scalar case (aka 
with DIM present, and ARRAY of rank 1) is always inlined, so the flag 
has no effect on it.

Does this sound better?:
The set of intrinsics allowed as argument to @code{-finline-intrinsics=} 
is currently limited to @code{MAXLOC} and @code{MINLOC}.  The effect of 
the flag is moreover limited to calls of those intrinsics without 
@code{DIM} argument and with @code{ARRAY} of a non-@code{CHARACTER} type.

> Otherwise the Fortran parts look fine to me.
> 
Thanks for the review.

> For the changes to gcc/flag-types.h you might need an OK from the
> gcc maintainers.
> 
> Thanks,
> Harald
>
Harald Anlauf Aug. 20, 2024, 5:38 p.m. UTC | #3
Hi Mikael,

Am 20.08.24 um 11:51 schrieb Mikael Morin:
> Hello,
>
> Le 19/08/2024 à 21:44, Harald Anlauf a écrit :
>> Hi Mikael,
>>
>> apart from patch #04/10, which did not apply cleanly here, I was
>> able to test your patch.  It seems to work with a manual workaround
>> (-fno-frontend-optimize) to work around this problem.
>> Might be a local issue...
>>
> Huh? That's unexpected, patches were rebased before submitting, and I
> don't think there was any recent activity in that area of the compiler
> anyway.

staring at that patch and the code revealed that Andre's coarray rank
stuff interfered, so you might see a merge conflict.  I managed to
resolve this, and then everything passes.  Good!

>> That said, it works as advertised.  Thanks for separating out the
>> IEEE-NaN tests.
>>
>> What I did not fully get is the way you deal with -finline-intrinsics= .
>> Currently there are only MINLOC and MAXLOC, but in the future there
>> could be many more intrinsics.  Given that you need 2 bits per
>> intrinsic in flag_inline_intrinsics, how future-proof is that?
>
> Well, I don't expect that many candidate intrinsics for the flags;
> currently SUM and PRODUCT could be added, and probably IALL, IANY and
> IPARITY as well.  Remember that having both a libgfortran and a frontend
> implementation is a prerequisite.

Yes, these are the primary candidates.  Maybe NORM2; MINVAL, MAXVAL;
FINDLOC; CSHIFT as well.

> For the future, 2 bits gives room to 16 intrinsics, and if we extend to
> 64 bits, to 32 intrinsics without much hassle.  Having only 1 bit per
> intrinsic would be certainly more future proof but I haven't found how
> to do it.  Zero bit (aka no per-intrinsic setting) would be even more
> future-proof, but less convenient.  As a last resort possibility the
> .opt framework gives the possibility to accumulate options in a vector
> and have them processed "by hand".  It seemed more convenient to me to
> just use 2 bits per intrinsic, but we can fall back to manual processing
> if we get out of bits at some point.
>
> Using only 1 bit per intrinsic was what I tried first, but then the
> default value has to be set before processing the flags, which means no
> default value depending on optimization, as optimization level is not
> known at that time.  I tried using EnabledBy(O || Ofast || Og) to set
> default value dependending on optimization, but this doesn't work either
> because the "O" covers all levels (-O0, -O1, -O2, -O3) without
> distinction.  And finally EnabledBy(O1 || O2 || O3 || Ofast || Og) is
> not accepted because "O1", "O2" and "O3" are not valid option names.
>
> I can certainly drop the per-intrinsic setting to have unlimited room
> for future intrinsics.  Or if you have ideas on how to only use one bit
> per intrinsic, I'm all ears.  Or maybe the testcases are too cumbersome,
> and with a slight modification of behaviour we can save one bit (for
> example, also generate inline code for -O0). What do you think?

Since the logic is essentially ternary, using 2 bits is natural.
I was not sure if we would be limited to 16 intrinsics, which would
be fairly small, since I did not see what types are possible.
32 is probably good enough for the next decade or so...

>> In the documentation, you have:
>>
>> +The set of intrinsics permitting the choice of implementation variant
>> through
>> +@code{-finline-intrinsics} is currently limited to non-scalar
>> @code{MAXLOC} and
>> +@code{MINLOC}.
>>
>> Why do you say "non-scalar"?  The new inlining is done for these
>> intrinsics when the DIM argument is absent.  The result characteristics
>> however is:
>>
>>    "If DIM does not appear, the result is an array of rank one and of
>> size equal to the rank of ARRAY; ..."
>>
>> and I thought the implementation does just that and does that right.
>> (With DIM present, the result is an array of rank rank(arg)-1.)
>> Can you clarify the wording in a way that is better understandable?
>>
> Yeah, these patches are all about non-scalar MINLOC/MAXLOC.  But there
> is also the scalar MINLOC/MAXLOC case which has pre-existing inline code
> support (and on which these patches are based).  The scalar case (aka
> with DIM present, and ARRAY of rank 1) is always inlined, so the flag
> has no effect on it.

I was asking because you refer by "scalar" or "non-scalar" to the
result, not to the argument.  This looked like non-standard use of
language to me.

> Does this sound better?:
> The set of intrinsics allowed as argument to @code{-finline-intrinsics=}
> is currently limited to @code{MAXLOC} and @code{MINLOC}.  The effect of
> the flag is moreover limited to calls of those intrinsics without
> @code{DIM} argument and with @code{ARRAY} of a non-@code{CHARACTER} type.

Yes, this is better.  Do you want to add something like:
The case of rank-1 argument and @code{DIM} argument present, i.e.
@code{MAXLOC}(A(:),@code{DIM}=1)} or @code{MINLOC}(A(:),@code{DIM}=1)}
is inlined unconditionally for numeric rank-1 array argument A.

Thanks,
Harald

>> Otherwise the Fortran parts look fine to me.
>>
> Thanks for the review.
>
>> For the changes to gcc/flag-types.h you might need an OK from the
>> gcc maintainers.
>>
>> Thanks,
>> Harald
>>
>
diff mbox series

Patch

diff --git a/gcc/flag-types.h b/gcc/flag-types.h
index 1e497f0bb91..df56337f7e8 100644
--- a/gcc/flag-types.h
+++ b/gcc/flag-types.h
@@ -451,6 +451,36 @@  enum gfc_convert
 };
 
 
+/* gfortran -finline-intrinsics= values;
+   We use two identical bits for each value, and initialize with alternated
+   bits, so that we can check whether a value has been set by checking whether
+   the two bits have identical value.  */
+
+#define GFC_INL_INTR_VAL(idx) (3 << (2 * idx))
+#define GFC_INL_INTR_UNSET_VAL(val) (0x55555555 & (val))
+
+enum gfc_inlineable_intrinsics
+{
+  GFC_FLAG_INLINE_INTRINSIC_NONE = 0,
+  GFC_FLAG_INLINE_INTRINSIC_MAXLOC = GFC_INL_INTR_VAL (0),
+  GFC_FLAG_INLINE_INTRINSIC_MINLOC = GFC_INL_INTR_VAL (1),
+  GFC_FLAG_INLINE_INTRINSIC_ALL = GFC_FLAG_INLINE_INTRINSIC_MAXLOC
+				  | GFC_FLAG_INLINE_INTRINSIC_MINLOC,
+
+  GFC_FLAG_INLINE_INTRINSIC_NONE_UNSET
+		  = GFC_INL_INTR_UNSET_VAL (GFC_FLAG_INLINE_INTRINSIC_NONE),
+  GFC_FLAG_INLINE_INTRINSIC_MAXLOC_UNSET
+		  = GFC_INL_INTR_UNSET_VAL (GFC_FLAG_INLINE_INTRINSIC_MAXLOC),
+  GFC_FLAG_INLINE_INTRINSIC_MINLOC_UNSET
+		  = GFC_INL_INTR_UNSET_VAL (GFC_FLAG_INLINE_INTRINSIC_MINLOC),
+  GFC_FLAG_INLINE_INTRINSIC_ALL_UNSET
+		  = GFC_INL_INTR_UNSET_VAL (GFC_FLAG_INLINE_INTRINSIC_ALL)
+};
+
+#undef GFC_INL_INTR_UNSET_VAL
+#undef GFC_INL_INTR_VAL
+
+
 /* Inline String Operations functions.  */
 enum ilsop_fn
 {
diff --git a/gcc/fortran/invoke.texi b/gcc/fortran/invoke.texi
index 6bc42afe2c4..53b6de1c92b 100644
--- a/gcc/fortran/invoke.texi
+++ b/gcc/fortran/invoke.texi
@@ -194,6 +194,7 @@  and warnings}.
 -finit-character=@var{n} -finit-integer=@var{n} -finit-local-zero
 -finit-derived -finit-logical=@var{<true|false>}
 -finit-real=@var{<zero|inf|-inf|nan|snan>}
+-finline-intrinsics[=<@var{minloc},@var{maxloc}>]
 -finline-matmul-limit=@var{n}
 -finline-arg-packing -fmax-array-constructor=@var{n}
 -fmax-stack-var-size=@var{n} -fno-align-commons -fno-automatic
@@ -1994,6 +1995,29 @@  geometric mean of the dimensions of the argument and result matrices.
 
 The default value for @var{n} is 30.
 
+@opindex @code{finline-intrinsics}
+@item -finline-intrinsics
+@itemx -finline-intrinsics=@var{intr1},@var{intr2},...
+Usage of intrinsics can be implemented either by generating a call to the
+libgfortran library function implementing it, or by directly generating the
+implementation code inline.  For most intrinsics, only a single of those
+variants is available and there is no choice of implementation.  For some of
+them, however, both are available, and for them the @code{-finline-intrinsics}
+flag permits the selection of inline code generation in its positive form, or
+library call generation in its negative form @code{-fno-inline-intrinsics}.
+With @code{-finline-intrinsics=...} or @code{-fno-inline-intrinsics=...}, the
+choice applies only to the intrinsics present in the comma-separated list
+provided as argument.
+
+For each intrinsic, if no choice of implementation was made through either of
+the flag variants, a default behaviour is chosen depending on optimization:
+library calls are generated when not optimizing or when optimizing for size;
+otherwise inline code is preferred.
+
+The set of intrinsics permitting the choice of implementation variant through
+@code{-finline-intrinsics} is currently limited to non-scalar @code{MAXLOC} and
+@code{MINLOC}.
+
 @opindex @code{finline-matmul-limit}
 @item -finline-matmul-limit=@var{n}
 When front-end optimization is active, some calls to the @code{MATMUL}
diff --git a/gcc/fortran/lang.opt b/gcc/fortran/lang.opt
index 5cf7b492254..ac08a851da4 100644
--- a/gcc/fortran/lang.opt
+++ b/gcc/fortran/lang.opt
@@ -676,6 +676,33 @@  finline-arg-packing
 Fortran  Var(flag_inline_arg_packing) Init(-1)
 -finline-arg-packing	Perform argument packing inline.
 
+finline-intrinsics
+Fortran RejectNegative Enum(gfc_inlineable_intrinsics) Var(flag_inline_intrinsics, GFC_FLAG_INLINE_INTRINSIC_ALL) Undocumented
+
+fno-inline-intrinsics
+Fortran RejectNegative Enum(gfc_inlineable_intrinsics) Var(flag_inline_intrinsics, GFC_FLAG_INLINE_INTRINSIC_NONE) Undocumented
+
+finline-intrinsics=
+Fortran Joined Var(flag_inline_intrinsics) Enum(gfc_inlineable_intrinsics) Init(GFC_FLAG_INLINE_INTRINSIC_ALL_UNSET) EnumSet
+Enable generation of inline code instead of calls to functions from the library to implement intrinsics.
+
+Enum
+Name(gfc_inlineable_intrinsics) Type(int) UnknownError(%qs is not an inline-controlable intrinsic)
+
+; This is not part of any set
+; EnumValue
+; Enum(gfc_inlineable_intrinsics) String(none) Value(GFC_FLAG_INLINE_INTRINSIC_NONE)
+
+EnumValue
+Enum(gfc_inlineable_intrinsics) String(maxloc) Value(GFC_FLAG_INLINE_INTRINSIC_MAXLOC) Set(1)
+
+EnumValue
+Enum(gfc_inlineable_intrinsics) String(minloc) Value(GFC_FLAG_INLINE_INTRINSIC_MINLOC) Set(2)
+
+; This is not part of any set
+; EnumValue
+; Enum(gfc_inlineable_intrinsics) String(all) Value(GFC_FLAG_INLINE_INTRINSIC_ALL)
+
 finline-matmul-limit=
 Fortran RejectNegative Joined UInteger Var(flag_inline_matmul_limit) Init(-1)
 -finline-matmul-limit=<n>	Specify the size of the largest matrix for which matmul will be inlined.
diff --git a/gcc/fortran/options.cc b/gcc/fortran/options.cc
index d8c5c8e62fc..6f2579ad9de 100644
--- a/gcc/fortran/options.cc
+++ b/gcc/fortran/options.cc
@@ -472,7 +472,26 @@  gfc_post_options (const char **pfilename)
   /* Implement -fno-automatic as -fmax-stack-var-size=0.  */
   if (!flag_automatic)
     flag_max_stack_var_size = 0;
-  
+
+  /* Decide inlining preference depending on optimization if nothing was
+     specified on the command line.  */
+  if ((flag_inline_intrinsics & GFC_FLAG_INLINE_INTRINSIC_MAXLOC)
+      == GFC_FLAG_INLINE_INTRINSIC_MAXLOC_UNSET)
+    {
+      if (optimize == 0 || optimize_size != 0)
+	flag_inline_intrinsics &= ~GFC_FLAG_INLINE_INTRINSIC_MAXLOC;
+      else
+	flag_inline_intrinsics |= GFC_FLAG_INLINE_INTRINSIC_MAXLOC;
+    }
+  if ((flag_inline_intrinsics & GFC_FLAG_INLINE_INTRINSIC_MINLOC)
+      == GFC_FLAG_INLINE_INTRINSIC_MINLOC_UNSET)
+    {
+      if (optimize == 0 || optimize_size != 0)
+	flag_inline_intrinsics &= ~GFC_FLAG_INLINE_INTRINSIC_MINLOC;
+      else
+	flag_inline_intrinsics |= GFC_FLAG_INLINE_INTRINSIC_MINLOC;
+    }
+
   /* If the user did not specify an inline matmul limit, inline up to the BLAS
      limit or up to 30 if no external BLAS is specified.  */
 
diff --git a/gcc/fortran/trans-intrinsic.cc b/gcc/fortran/trans-intrinsic.cc
index b03f7b1653e..456f28eba4e 100644
--- a/gcc/fortran/trans-intrinsic.cc
+++ b/gcc/fortran/trans-intrinsic.cc
@@ -11840,10 +11840,11 @@  gfc_inline_intrinsic_function_p (gfc_expr *expr)
   gfc_actual_arglist *args, *dim_arg, *mask_arg;
   gfc_expr *maskexpr;
 
-  if (!expr->value.function.isym)
+  gfc_intrinsic_sym *isym = expr->value.function.isym;
+  if (!isym)
     return false;
 
-  switch (expr->value.function.isym->id)
+  switch (isym->id)
     {
     case GFC_ISYM_PRODUCT:
     case GFC_ISYM_SUM:
@@ -11879,8 +11880,12 @@  gfc_inline_intrinsic_function_p (gfc_expr *expr)
     case GFC_ISYM_MINLOC:
     case GFC_ISYM_MAXLOC:
       {
-	/* Disable inline expansion if code size matters.  */
-	if (optimize_size)
+	if ((isym->id == GFC_ISYM_MINLOC
+	     && (flag_inline_intrinsics
+		 & GFC_FLAG_INLINE_INTRINSIC_MINLOC) == 0)
+	    || (isym->id == GFC_ISYM_MAXLOC
+		&& (flag_inline_intrinsics
+		    & GFC_FLAG_INLINE_INTRINSIC_MAXLOC) == 0))
 	  return false;
 
 	gfc_actual_arglist *array_arg = expr->value.function.actual;
diff --git a/gcc/testsuite/gfortran.dg/minmaxloc_18.f90 b/gcc/testsuite/gfortran.dg/minmaxloc_18.f90
new file mode 100644
index 00000000000..e8cd2d42d8d
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/minmaxloc_18.f90
@@ -0,0 +1,772 @@ 
+! { dg-do compile }
+! { dg-additional-options "-O -fdump-tree-original" }
+! { dg-final { scan-tree-dump-not "gfortran_\[sm\]?minloc" "original" } }
+! { dg-final { scan-tree-dump-not "gfortran_\[sm\]?maxloc" "original" } }
+!
+! PR fortran/90608
+! Check that all MINLOC and MAXLOC calls are inlined with optimizations by default.
+
+subroutine check_maxloc_without_mask
+  implicit none
+  integer, parameter :: data5(*) = (/ 1, 7, 2, 7, 0 /)
+  integer, parameter :: data64(*) = (/ 2, 5, 4, 6, 0, 9, 3, 5,  &
+                                       4, 4, 1, 7, 3, 2, 1, 2,  &
+                                       5, 4, 6, 0, 9, 3, 5, 4,  &
+                                       4, 1, 7, 3, 2, 1, 2, 5,  &
+                                       4, 6, 0, 9, 3, 5, 4, 4,  &
+                                       1, 7, 3, 2, 1, 2, 5, 4,  &
+                                       6, 0, 9, 3, 5, 4, 4, 1,  &
+                                       7, 3, 2, 1, 2, 5, 4, 6  /)
+  call check_int_const_shape_rank_1
+  call check_int_const_shape_rank_3
+  call check_int_const_shape_empty_4
+  call check_int_alloc_rank_1
+  call check_int_alloc_rank_3
+  call check_real_const_shape_rank_1
+  call check_real_const_shape_rank_3
+  call check_real_const_shape_empty_4
+  call check_real_alloc_rank_1
+  call check_real_alloc_rank_3
+contains
+  subroutine check_int_const_shape_rank_1()
+    integer :: a(5)
+    integer, allocatable :: m(:)
+    a = data5
+    m = maxloc(a)
+    if (size(m, dim=1) /= 1) stop 11
+    if (any(m /= (/ 2 /))) stop 12
+  end subroutine
+  subroutine check_int_const_shape_rank_3()
+    integer :: a(4,4,4)
+    integer, allocatable :: m(:)
+    a = reshape(data64, shape(a))
+    m = maxloc(a)
+    if (size(m, dim=1) /= 3) stop 21
+    if (any(m /= (/ 2, 2, 1 /))) stop 22
+  end subroutine
+  subroutine check_int_const_shape_empty_4()
+    integer :: a(9,3,0,7)
+    integer, allocatable :: m(:)
+    a = reshape((/ integer:: /), shape(a))
+    m = maxloc(a)
+    if (size(m, dim=1) /= 4) stop 31
+    if (any(m /= (/ 0, 0, 0, 0 /))) stop 32
+  end subroutine
+  subroutine check_int_alloc_rank_1()
+    integer, allocatable :: a(:)
+    integer, allocatable :: m(:)
+    allocate(a(5))
+    a(:) = data5
+    m = maxloc(a)
+    if (size(m, dim=1) /= 1) stop 41
+    if (any(m /= (/ 2 /))) stop 42
+  end subroutine
+  subroutine check_int_alloc_rank_3()
+    integer, allocatable :: a(:,:,:)
+    integer, allocatable :: m(:)
+    allocate(a(4,4,4))
+    a(:,:,:) = reshape(data64, shape(a))
+    m = maxloc(a)
+    if (size(m, dim=1) /= 3) stop 51
+    if (any(m /= (/ 2, 2, 1 /))) stop 52
+  end subroutine
+  subroutine check_real_const_shape_rank_1()
+    real :: a(5)
+    integer, allocatable :: m(:)
+    a = (/ real:: data5 /)
+    m = maxloc(a)
+    if (size(m, dim=1) /= 1) stop 71
+    if (any(m /= (/ 2 /))) stop 72
+  end subroutine
+  subroutine check_real_const_shape_rank_3()
+    real :: a(4,4,4)
+    integer, allocatable :: m(:)
+    a = reshape((/ real:: data64 /), shape(a))
+    m = maxloc(a)
+    if (size(m, dim=1) /= 3) stop 81
+    if (any(m /= (/ 2, 2, 1 /))) stop 82
+  end subroutine
+  subroutine check_real_const_shape_empty_4()
+    real :: a(9,3,0,7)
+    integer, allocatable :: m(:)
+    a = reshape((/ real:: /), shape(a))
+    m = maxloc(a)
+    if (size(m, dim=1) /= 4) stop 91
+    if (any(m /= (/ 0, 0, 0, 0 /))) stop 92
+  end subroutine
+  subroutine check_real_alloc_rank_1()
+    real, allocatable :: a(:)
+    integer, allocatable :: m(:)
+    allocate(a(5))
+    a(:) = (/ real:: data5 /)
+    m = maxloc(a)
+    if (size(m, dim=1) /= 1) stop 111
+    if (any(m /= (/ 2 /))) stop 112
+  end subroutine
+  subroutine check_real_alloc_rank_3()
+    real, allocatable :: a(:,:,:)
+    integer, allocatable :: m(:)
+    allocate(a(4,4,4))
+    a(:,:,:) = reshape((/ real:: data64 /), shape(a))
+    m = maxloc(a)
+    if (size(m, dim=1) /= 3) stop 121
+    if (any(m /= (/ 2, 2, 1 /))) stop 122
+  end subroutine
+end subroutine check_maxloc_without_mask
+subroutine check_minloc_without_mask
+  implicit none
+  integer, parameter :: data5(*) = (/ 8, 2, 7, 2, 9 /)
+  integer, parameter :: data64(*) = (/ 7, 4, 5, 3, 9, 0, 6, 4,  &
+                                       5, 5, 8, 2, 6, 7, 8, 7,  &
+                                       4, 5, 3, 9, 0, 6, 4, 5,  &
+                                       5, 8, 2, 6, 7, 8, 7, 4,  &
+                                       5, 3, 9, 0, 6, 4, 5, 5,  &
+                                       8, 2, 6, 7, 8, 7, 4, 5,  &
+                                       3, 9, 0, 6, 4, 5, 5, 8,  &
+                                       2, 6, 7, 8, 7, 4, 5, 3  /)
+  call check_int_const_shape_rank_1
+  call check_int_const_shape_rank_3
+  call check_int_const_shape_empty_4
+  call check_int_alloc_rank_1
+  call check_int_alloc_rank_3
+  call check_real_const_shape_rank_1
+  call check_real_const_shape_rank_3
+  call check_real_const_shape_empty_4
+  call check_real_alloc_rank_1
+  call check_real_alloc_rank_3
+contains
+  subroutine check_int_const_shape_rank_1()
+    integer :: a(5)
+    integer, allocatable :: m(:)
+    a = data5
+    m = minloc(a)
+    if (size(m, dim=1) /= 1) stop 11
+    if (any(m /= (/ 2 /))) stop 12
+  end subroutine
+  subroutine check_int_const_shape_rank_3()
+    integer :: a(4,4,4)
+    integer, allocatable :: m(:)
+    a = reshape(data64, shape(a))
+    m = minloc(a)
+    if (size(m, dim=1) /= 3) stop 21
+    if (any(m /= (/ 2, 2, 1 /))) stop 22
+  end subroutine
+  subroutine check_int_const_shape_empty_4()
+    integer :: a(9,3,0,7)
+    integer, allocatable :: m(:)
+    a = reshape((/ integer:: /), shape(a))
+    m = minloc(a)
+    if (size(m, dim=1) /= 4) stop 31
+    if (any(m /= (/ 0, 0, 0, 0 /))) stop 32
+  end subroutine
+  subroutine check_int_alloc_rank_1()
+    integer, allocatable :: a(:)
+    integer, allocatable :: m(:)
+    allocate(a(5))
+    a(:) = data5
+    m = minloc(a)
+    if (size(m, dim=1) /= 1) stop 41
+    if (any(m /= (/ 2 /))) stop 42
+  end subroutine
+  subroutine check_int_alloc_rank_3()
+    integer, allocatable :: a(:,:,:)
+    integer, allocatable :: m(:)
+    allocate(a(4,4,4))
+    a(:,:,:) = reshape(data64, shape(a))
+    m = minloc(a)
+    if (size(m, dim=1) /= 3) stop 51
+    if (any(m /= (/ 2, 2, 1 /))) stop 52
+  end subroutine
+  subroutine check_real_const_shape_rank_1()
+    real :: a(5)
+    integer, allocatable :: m(:)
+    a = (/ real:: data5 /)
+    m = minloc(a)
+    if (size(m, dim=1) /= 1) stop 71
+    if (any(m /= (/ 2 /))) stop 72
+  end subroutine
+  subroutine check_real_const_shape_rank_3()
+    real :: a(4,4,4)
+    integer, allocatable :: m(:)
+    a = reshape((/ real:: data64 /), shape(a))
+    m = minloc(a)
+    if (size(m, dim=1) /= 3) stop 81
+    if (any(m /= (/ 2, 2, 1 /))) stop 82
+  end subroutine
+  subroutine check_real_const_shape_empty_4()
+    real :: a(9,3,0,7)
+    integer, allocatable :: m(:)
+    a = reshape((/ real:: /), shape(a))
+    m = minloc(a)
+    if (size(m, dim=1) /= 4) stop 91
+    if (any(m /= (/ 0, 0, 0, 0 /))) stop 92
+  end subroutine
+  subroutine check_real_alloc_rank_1()
+    real, allocatable :: a(:)
+    integer, allocatable :: m(:)
+    allocate(a(5))
+    a(:) = (/ real:: data5 /)
+    m = minloc(a)
+    if (size(m, dim=1) /= 1) stop 111
+    if (any(m /= (/ 2 /))) stop 112
+  end subroutine
+  subroutine check_real_alloc_rank_3()
+    real, allocatable :: a(:,:,:)
+    integer, allocatable :: m(:)
+    allocate(a(4,4,4))
+    a(:,:,:) = reshape((/ real:: data64 /), shape(a))
+    m = minloc(a)
+    if (size(m, dim=1) /= 3) stop 121
+    if (any(m /= (/ 2, 2, 1 /))) stop 122
+  end subroutine
+end subroutine check_minloc_without_mask
+subroutine check_maxloc_with_mask
+  implicit none
+  integer, parameter :: data10(*) = (/ 7, 4, 7, 6, 6, 4, 6, 3, 9, 8 /)
+  logical, parameter :: mask10(*) = (/ .false., .true., .false., &
+                                       .false., .true., .true.,  &
+                                       .true. , .true., .false., &
+                                       .false. /)
+  integer, parameter :: data64(*) = (/ 2, 5, 4, 6, 0, 9, 3, 5,  &
+                                       4, 4, 1, 7, 3, 2, 1, 2,  &
+                                       5, 4, 6, 0, 9, 3, 5, 4,  &
+                                       4, 1, 7, 3, 2, 1, 2, 5,  &
+                                       4, 6, 0, 9, 3, 5, 4, 4,  &
+                                       1, 7, 3, 2, 1, 2, 5, 4,  &
+                                       6, 0, 9, 3, 5, 4, 4, 1,  &
+                                       7, 3, 2, 1, 2, 5, 4, 6  /)
+  logical, parameter :: mask64(*) = (/ .true. , .false., .false., .false., &
+                                       .true. , .false., .true. , .false., &
+                                       .false., .true. , .true. , .false., &
+                                       .true. , .true. , .true. , .true. , &
+                                       .false., .true. , .false., .true. , &
+                                       .false., .true. , .false., .true. , &
+                                       .true. , .false., .false., .true. , &
+                                       .true. , .true. , .true. , .false., &
+                                       .false., .false., .true. , .false., &
+                                       .true. , .false., .true. , .true. , &
+                                       .true. , .false., .true. , .true. , &
+                                       .false., .true. , .false., .true. , &
+                                       .false., .true. , .false., .false., &
+                                       .false., .true. , .true. , .true. , &
+                                       .false., .true. , .false., .true. , &
+                                       .true. , .false., .false., .false. /)
+  call check_int_const_shape_rank_1
+  call check_int_const_shape_rank_3
+  call check_int_const_shape_rank_3_true_mask
+  call check_int_const_shape_rank_3_false_mask
+  call check_int_const_shape_rank_3_optional_mask_present
+  call check_int_const_shape_rank_3_optional_mask_absent
+  call check_int_const_shape_empty_4
+  call check_int_alloc_rank_1
+  call check_int_alloc_rank_3
+  call check_int_alloc_rank_3_true_mask
+  call check_int_alloc_rank_3_false_mask
+  call check_real_const_shape_rank_1
+  call check_real_const_shape_rank_3
+  call check_real_const_shape_rank_3_true_mask
+  call check_real_const_shape_rank_3_false_mask
+  call check_real_const_shape_rank_3_optional_mask_present
+  call check_real_const_shape_rank_3_optional_mask_absent
+  call check_real_const_shape_empty_4
+  call check_real_alloc_rank_1
+  call check_real_alloc_rank_3
+  call check_real_alloc_rank_3_true_mask
+  call check_real_alloc_rank_3_false_mask
+contains
+  subroutine check_int_const_shape_rank_1()
+    integer :: a(10)
+    logical :: m(10)
+    integer, allocatable :: r(:)
+    a = data10
+    m = mask10
+    r = maxloc(a, mask = m)
+    if (size(r, dim = 1) /= 1) stop 11
+    if (any(r /= (/ 5 /))) stop 12
+  end subroutine
+  subroutine check_int_const_shape_rank_3()
+    integer :: a(4,4,4)
+    logical :: m(4,4,4)
+    integer, allocatable :: r(:)
+    a = reshape(data64, shape(a))
+    m = reshape(mask64, shape(m))
+    r = maxloc(a, mask = m)
+    if (size(r, dim = 1) /= 3) stop 21
+    if (any(r /= (/ 2, 3, 1 /))) stop 22
+  end subroutine
+  subroutine check_int_const_shape_rank_3_true_mask()
+    integer :: a(4,4,4)
+    integer, allocatable :: r(:)
+    a = reshape(data64, shape(a))
+    r = maxloc(a, mask = .true.)
+    if (size(r, dim = 1) /= 3) stop 31
+    if (any(r /= (/ 2, 2, 1 /))) stop 32
+  end subroutine
+  subroutine check_int_const_shape_rank_3_false_mask()
+    integer :: a(4,4,4)
+    integer, allocatable :: r(:)
+    a = reshape(data64, shape(a))
+    r = maxloc(a, mask = .false.)
+    if (size(r, dim = 1) /= 3) stop 41
+    if (any(r /= (/ 0, 0, 0 /))) stop 42
+  end subroutine
+  subroutine call_maxloc_int(r, a, m)
+    integer :: a(:,:,:)
+    logical, optional :: m(:,:,:)
+    integer, allocatable :: r(:)
+    r = maxloc(a, mask = m)
+  end subroutine
+  subroutine check_int_const_shape_rank_3_optional_mask_present()
+    integer :: a(4,4,4)
+    logical :: m(4,4,4)
+    integer, allocatable :: r(:)
+    a = reshape(data64, shape(a))
+    m = reshape(mask64, shape(m))
+    call call_maxloc_int(r, a, m)
+    if (size(r, dim = 1) /= 3) stop 51
+    if (any(r /= (/ 2, 3, 1 /))) stop 52
+  end subroutine
+  subroutine check_int_const_shape_rank_3_optional_mask_absent()
+    integer :: a(4,4,4)
+    integer, allocatable :: r(:)
+    a = reshape(data64, shape(a))
+    call call_maxloc_int(r, a)
+    if (size(r, dim = 1) /= 3) stop 61
+    if (any(r /= (/ 2, 2, 1 /))) stop 62
+  end subroutine
+  subroutine check_int_const_shape_empty_4()
+    integer :: a(9,3,0,7)
+    logical :: m(9,3,0,7)
+    integer, allocatable :: r(:)
+    a = reshape((/ integer:: /), shape(a))
+    m = reshape((/ logical:: /), shape(m))
+    r = maxloc(a, mask = m)
+    if (size(r, dim = 1) /= 4) stop 71
+    if (any(r /= (/ 0, 0, 0, 0 /))) stop 72
+  end subroutine
+  subroutine check_int_alloc_rank_1()
+    integer, allocatable :: a(:)
+    logical, allocatable :: m(:)
+    integer, allocatable :: r(:)
+    allocate(a(10), m(10))
+    a(:) = data10
+    m(:) = mask10
+    r = maxloc(a, mask = m)
+    if (size(r, dim = 1) /= 1) stop 81
+    if (any(r /= (/ 5 /))) stop 82
+  end subroutine
+  subroutine check_int_alloc_rank_3()
+    integer, allocatable :: a(:,:,:)
+    logical, allocatable :: m(:,:,:)
+    integer, allocatable :: r(:)
+    allocate(a(4,4,4), m(4,4,4))
+    a(:,:,:) = reshape(data64, shape(a))
+    m(:,:,:) = reshape(mask64, shape(m))
+    r = maxloc(a, mask = m)
+    if (size(r, dim = 1) /= 3) stop 91
+    if (any(r /= (/ 2, 3, 1 /))) stop 92
+  end subroutine
+  subroutine check_int_alloc_rank_3_true_mask()
+    integer, allocatable :: a(:,:,:)
+    integer, allocatable :: r(:)
+    allocate(a(4,4,4))
+    a(:,:,:) = reshape(data64, shape(a))
+    r = maxloc(a, mask = .true.)
+    if (size(r, dim = 1) /= 3) stop 101
+    if (any(r /= (/ 2, 2, 1 /))) stop 102
+  end subroutine
+  subroutine check_int_alloc_rank_3_false_mask()
+    integer, allocatable :: a(:,:,:)
+    integer, allocatable :: r(:)
+    allocate(a(4,4,4))
+    a(:,:,:) = reshape(data64, shape(a))
+    r = maxloc(a, mask = .false.)
+    if (size(r, dim = 1) /= 3) stop 111
+    if (any(r /= (/ 0, 0, 0 /))) stop 112
+  end subroutine
+  subroutine check_real_const_shape_rank_1()
+    real :: a(10)
+    logical :: m(10)
+    integer, allocatable :: r(:)
+    a = (/ real:: data10 /)
+    m = mask10
+    r = maxloc(a, mask = m)
+    if (size(r, dim = 1) /= 1) stop 131
+    if (any(r /= (/ 5 /))) stop 132
+  end subroutine
+  subroutine check_real_const_shape_rank_3()
+    real :: a(4,4,4)
+    logical :: m(4,4,4)
+    integer, allocatable :: r(:)
+    a = reshape((/ real:: data64 /), shape(a))
+    m = reshape(mask64, shape(m))
+    r = maxloc(a, mask = m)
+    if (size(r, dim = 1) /= 3) stop 141
+    if (any(r /= (/ 2, 3, 1 /))) stop 142
+  end subroutine
+  subroutine check_real_const_shape_rank_3_true_mask()
+    real :: a(4,4,4)
+    integer, allocatable :: r(:)
+    a = reshape((/ real:: data64 /), shape(a))
+    r = maxloc(a, mask = .true.)
+    if (size(r, dim = 1) /= 3) stop 151
+    if (any(r /= (/ 2, 2, 1 /))) stop 152
+  end subroutine
+  subroutine check_real_const_shape_rank_3_false_mask()
+    real :: a(4,4,4)
+    integer, allocatable :: r(:)
+    a = reshape((/ real:: data64 /), shape(a))
+    r = maxloc(a, mask = .false.)
+    if (size(r, dim = 1) /= 3) stop 161
+    if (any(r /= (/ 0, 0, 0 /))) stop 162
+  end subroutine
+  subroutine call_maxloc_real(r, a, m)
+    real :: a(:,:,:)
+    logical, optional  :: m(:,:,:)
+    integer, allocatable :: r(:)
+    r = maxloc(a, mask = m)
+  end subroutine
+  subroutine check_real_const_shape_rank_3_optional_mask_present()
+    real :: a(4,4,4)
+    logical :: m(4,4,4)
+    integer, allocatable :: r(:)
+    a = reshape((/ real:: data64 /), shape(a))
+    m = reshape(mask64, shape(m))
+    call call_maxloc_real(r, a, m)
+    if (size(r, dim = 1) /= 3) stop 171
+    if (any(r /= (/ 2, 3, 1 /))) stop 172
+  end subroutine
+  subroutine check_real_const_shape_rank_3_optional_mask_absent()
+    real :: a(4,4,4)
+    integer, allocatable :: r(:)
+    a = reshape((/ real:: data64 /), shape(a))
+    call call_maxloc_real(r, a)
+    if (size(r, dim = 1) /= 3) stop 181
+    if (any(r /= (/ 2, 2, 1 /))) stop 182
+  end subroutine
+  subroutine check_real_const_shape_empty_4()
+    real :: a(9,3,0,7)
+    logical :: m(9,3,0,7)
+    integer, allocatable :: r(:)
+    a = reshape((/ real:: /), shape(a))
+    m = reshape((/ logical:: /), shape(m))
+    r = maxloc(a, mask = m)
+    if (size(r, dim = 1) /= 4) stop 191
+    if (any(r /= (/ 0, 0, 0, 0 /))) stop 192
+  end subroutine
+  subroutine check_real_alloc_rank_1()
+    real, allocatable :: a(:)
+    logical, allocatable :: m(:)
+    integer, allocatable :: r(:)
+    allocate(a(10), m(10))
+    a(:) = (/ real:: data10 /)
+    m(:) = mask10
+    r = maxloc(a, mask = m)
+    if (size(r, dim = 1) /= 1) stop 201
+    if (any(r /= (/ 5 /))) stop 202
+  end subroutine
+  subroutine check_real_alloc_rank_3()
+    real, allocatable :: a(:,:,:)
+    logical, allocatable :: m(:,:,:)
+    integer, allocatable :: r(:)
+    allocate(a(4,4,4), m(4,4,4))
+    a(:,:,:) = reshape((/ real:: data64 /), shape(a))
+    m(:,:,:) = reshape(mask64, shape(m))
+    r = maxloc(a, mask = m)
+    if (size(r, dim = 1) /= 3) stop 211
+    if (any(r /= (/ 2, 3, 1 /))) stop 212
+  end subroutine
+  subroutine check_real_alloc_rank_3_true_mask()
+    real, allocatable :: a(:,:,:)
+    integer, allocatable :: r(:)
+    allocate(a(4,4,4))
+    a(:,:,:) = reshape((/ real:: data64 /), shape(a))
+    r = maxloc(a, mask = .true.)
+    if (size(r, dim = 1) /= 3) stop 221
+    if (any(r /= (/ 2, 2, 1 /))) stop 222
+  end subroutine
+  subroutine check_real_alloc_rank_3_false_mask()
+    real, allocatable :: a(:,:,:)
+    integer, allocatable :: r(:)
+    allocate(a(4,4,4))
+    a(:,:,:) = reshape((/ real:: data64 /), shape(a))
+    r = maxloc(a, mask = .false.)
+    if (size(r, dim = 1) /= 3) stop 231
+    if (any(r /= (/ 0, 0, 0 /))) stop 232
+  end subroutine
+end subroutine check_maxloc_with_mask
+subroutine check_minloc_with_mask
+  implicit none
+  integer, parameter :: data10(*) = (/ 2, 5, 2, 3, 3, 5, 3, 6, 0, 1 /)
+  logical, parameter :: mask10(*) = (/ .false., .true., .false., &
+                                       .false., .true., .true.,  &
+                                       .true. , .true., .false., &
+                                       .false. /)
+  integer, parameter :: data64(*) = (/ 7, 4, 5, 3, 9, 0, 6, 4,  &
+                                       5, 5, 8, 2, 6, 7, 8, 7,  &
+                                       4, 5, 3, 9, 0, 6, 4, 5,  &
+                                       5, 8, 2, 6, 7, 8, 7, 4,  &
+                                       5, 3, 9, 0, 6, 4, 5, 5,  &
+                                       8, 2, 6, 7, 8, 7, 4, 5,  &
+                                       3, 9, 0, 6, 4, 5, 5, 8,  &
+                                       2, 6, 7, 8, 7, 4, 5, 3  /)
+  logical, parameter :: mask64(*) = (/ .true. , .false., .false., .false., &
+                                       .true. , .false., .true. , .false., &
+                                       .false., .true. , .true. , .false., &
+                                       .true. , .true. , .true. , .true. , &
+                                       .false., .true. , .false., .true. , &
+                                       .false., .true. , .false., .true. , &
+                                       .true. , .false., .false., .true. , &
+                                       .true. , .true. , .true. , .false., &
+                                       .false., .false., .true. , .false., &
+                                       .true. , .false., .true. , .true. , &
+                                       .true. , .false., .true. , .true. , &
+                                       .false., .true. , .false., .true. , &
+                                       .false., .true. , .false., .false., &
+                                       .false., .true. , .true. , .true. , &
+                                       .false., .true. , .false., .true. , &
+                                       .true. , .false., .false., .false. /)
+  call check_int_const_shape_rank_1
+  call check_int_const_shape_rank_3
+  call check_int_const_shape_rank_3_true_mask
+  call check_int_const_shape_rank_3_false_mask
+  call check_int_const_shape_rank_3_optional_mask_present
+  call check_int_const_shape_rank_3_optional_mask_absent
+  call check_int_const_shape_empty_4
+  call check_int_alloc_rank_1
+  call check_int_alloc_rank_3
+  call check_int_alloc_rank_3_true_mask
+  call check_int_alloc_rank_3_false_mask
+  call check_real_const_shape_rank_1
+  call check_real_const_shape_rank_3
+  call check_real_const_shape_rank_3_true_mask
+  call check_real_const_shape_rank_3_false_mask
+  call check_real_const_shape_rank_3_optional_mask_present
+  call check_real_const_shape_rank_3_optional_mask_absent
+  call check_real_const_shape_empty_4
+  call check_real_alloc_rank_1
+  call check_real_alloc_rank_3
+  call check_real_alloc_rank_3_true_mask
+  call check_real_alloc_rank_3_false_mask
+contains
+  subroutine check_int_const_shape_rank_1()
+    integer :: a(10)
+    logical :: m(10)
+    integer, allocatable :: r(:)
+    a = data10
+    m = mask10
+    r = minloc(a, mask = m)
+    if (size(r, dim = 1) /= 1) stop 11
+    if (any(r /= (/ 5 /))) stop 12
+  end subroutine
+  subroutine check_int_const_shape_rank_3()
+    integer :: a(4,4,4)
+    logical :: m(4,4,4)
+    integer, allocatable :: r(:)
+    a = reshape(data64, shape(a))
+    m = reshape(mask64, shape(m))
+    r = minloc(a, mask = m)
+    if (size(r, dim = 1) /= 3) stop 21
+    if (any(r /= (/ 2, 3, 1 /))) stop 22
+  end subroutine
+  subroutine check_int_const_shape_rank_3_true_mask()
+    integer :: a(4,4,4)
+    integer, allocatable :: r(:)
+    a = reshape(data64, shape(a))
+    r = minloc(a, mask = .true.)
+    if (size(r, dim = 1) /= 3) stop 31
+    if (any(r /= (/ 2, 2, 1 /))) stop 32
+  end subroutine
+  subroutine check_int_const_shape_rank_3_false_mask()
+    integer :: a(4,4,4)
+    integer, allocatable :: r(:)
+    a = reshape(data64, shape(a))
+    r = minloc(a, mask = .false.)
+    if (size(r, dim = 1) /= 3) stop 41
+    if (any(r /= (/ 0, 0, 0 /))) stop 42
+  end subroutine
+  subroutine call_minloc_int(r, a, m)
+    integer :: a(:,:,:)
+    logical, optional :: m(:,:,:)
+    integer, allocatable :: r(:)
+    r = minloc(a, mask = m)
+  end subroutine
+  subroutine check_int_const_shape_rank_3_optional_mask_present()
+    integer :: a(4,4,4)
+    logical :: m(4,4,4)
+    integer, allocatable :: r(:)
+    a = reshape(data64, shape(a))
+    m = reshape(mask64, shape(m))
+    call call_minloc_int(r, a, m)
+    if (size(r, dim = 1) /= 3) stop 51
+    if (any(r /= (/ 2, 3, 1 /))) stop 52
+  end subroutine
+  subroutine check_int_const_shape_rank_3_optional_mask_absent()
+    integer :: a(4,4,4)
+    integer, allocatable :: r(:)
+    a = reshape(data64, shape(a))
+    call call_minloc_int(r, a)
+    if (size(r, dim = 1) /= 3) stop 61
+    if (any(r /= (/ 2, 2, 1 /))) stop 62
+  end subroutine
+  subroutine check_int_const_shape_empty_4()
+    integer :: a(9,3,0,7)
+    logical :: m(9,3,0,7)
+    integer, allocatable :: r(:)
+    a = reshape((/ integer:: /), shape(a))
+    m = reshape((/ logical:: /), shape(m))
+    r = minloc(a, mask = m)
+    if (size(r, dim = 1) /= 4) stop 71
+    if (any(r /= (/ 0, 0, 0, 0 /))) stop 72
+  end subroutine
+  subroutine check_int_alloc_rank_1()
+    integer, allocatable :: a(:)
+    logical, allocatable :: m(:)
+    integer, allocatable :: r(:)
+    allocate(a(10), m(10))
+    a(:) = data10
+    m(:) = mask10
+    r = minloc(a, mask = m)
+    if (size(r, dim = 1) /= 1) stop 81
+    if (any(r /= (/ 5 /))) stop 82
+  end subroutine
+  subroutine check_int_alloc_rank_3()
+    integer, allocatable :: a(:,:,:)
+    logical, allocatable :: m(:,:,:)
+    integer, allocatable :: r(:)
+    allocate(a(4,4,4), m(4,4,4))
+    a(:,:,:) = reshape(data64, shape(a))
+    m(:,:,:) = reshape(mask64, shape(m))
+    r = minloc(a, mask = m)
+    if (size(r, dim = 1) /= 3) stop 91
+    if (any(r /= (/ 2, 3, 1 /))) stop 92
+  end subroutine
+  subroutine check_int_alloc_rank_3_true_mask()
+    integer, allocatable :: a(:,:,:)
+    integer, allocatable :: r(:)
+    allocate(a(4,4,4))
+    a(:,:,:) = reshape(data64, shape(a))
+    r = minloc(a, mask = .true.)
+    if (size(r, dim = 1) /= 3) stop 101
+    if (any(r /= (/ 2, 2, 1 /))) stop 102
+  end subroutine
+  subroutine check_int_alloc_rank_3_false_mask()
+    integer, allocatable :: a(:,:,:)
+    integer, allocatable :: r(:)
+    allocate(a(4,4,4))
+    a(:,:,:) = reshape(data64, shape(a))
+    r = minloc(a, mask = .false.)
+    if (size(r, dim = 1) /= 3) stop 111
+    if (any(r /= (/ 0, 0, 0 /))) stop 112
+  end subroutine
+  subroutine check_real_const_shape_rank_1()
+    real :: a(10)
+    logical :: m(10)
+    integer, allocatable :: r(:)
+    a = (/ real:: data10 /)
+    m = mask10
+    r = minloc(a, mask = m)
+    if (size(r, dim = 1) /= 1) stop 131
+    if (any(r /= (/ 5 /))) stop 132
+  end subroutine
+  subroutine check_real_const_shape_rank_3()
+    real :: a(4,4,4)
+    logical :: m(4,4,4)
+    integer, allocatable :: r(:)
+    a = reshape((/ real:: data64 /), shape(a))
+    m = reshape(mask64, shape(m))
+    r = minloc(a, mask = m)
+    if (size(r, dim = 1) /= 3) stop 141
+    if (any(r /= (/ 2, 3, 1 /))) stop 142
+  end subroutine
+  subroutine check_real_const_shape_rank_3_true_mask()
+    real :: a(4,4,4)
+    integer, allocatable :: r(:)
+    a = reshape((/ real:: data64 /), shape(a))
+    r = minloc(a, mask = .true.)
+    if (size(r, dim = 1) /= 3) stop 151
+    if (any(r /= (/ 2, 2, 1 /))) stop 152
+  end subroutine
+  subroutine check_real_const_shape_rank_3_false_mask()
+    real :: a(4,4,4)
+    integer, allocatable :: r(:)
+    a = reshape((/ real:: data64 /), shape(a))
+    r = minloc(a, mask = .false.)
+    if (size(r, dim = 1) /= 3) stop 161
+    if (any(r /= (/ 0, 0, 0 /))) stop 162
+  end subroutine
+  subroutine call_minloc_real(r, a, m)
+    real :: a(:,:,:)
+    logical, optional  :: m(:,:,:)
+    integer, allocatable :: r(:)
+    r = minloc(a, mask = m)
+  end subroutine
+  subroutine check_real_const_shape_rank_3_optional_mask_present()
+    real :: a(4,4,4)
+    logical :: m(4,4,4)
+    integer, allocatable :: r(:)
+    a = reshape((/ real:: data64 /), shape(a))
+    m = reshape(mask64, shape(m))
+    call call_minloc_real(r, a, m)
+    if (size(r, dim = 1) /= 3) stop 171
+    if (any(r /= (/ 2, 3, 1 /))) stop 172
+  end subroutine
+  subroutine check_real_const_shape_rank_3_optional_mask_absent()
+    real :: a(4,4,4)
+    integer, allocatable :: r(:)
+    a = reshape((/ real:: data64 /), shape(a))
+    call call_minloc_real(r, a)
+    if (size(r, dim = 1) /= 3) stop 181
+    if (any(r /= (/ 2, 2, 1 /))) stop 182
+  end subroutine
+  subroutine check_real_const_shape_empty_4()
+    real :: a(9,3,0,7)
+    logical :: m(9,3,0,7)
+    integer, allocatable :: r(:)
+    a = reshape((/ real:: /), shape(a))
+    m = reshape((/ logical:: /), shape(m))
+    r = minloc(a, mask = m)
+    if (size(r, dim = 1) /= 4) stop 191
+    if (any(r /= (/ 0, 0, 0, 0 /))) stop 192
+  end subroutine
+  subroutine check_real_alloc_rank_1()
+    real, allocatable :: a(:)
+    logical, allocatable :: m(:)
+    integer, allocatable :: r(:)
+    allocate(a(10), m(10))
+    a(:) = (/ real:: data10 /)
+    m(:) = mask10
+    r = minloc(a, mask = m)
+    if (size(r, dim = 1) /= 1) stop 201
+    if (any(r /= (/ 5 /))) stop 202
+  end subroutine
+  subroutine check_real_alloc_rank_3()
+    real, allocatable :: a(:,:,:)
+    logical, allocatable :: m(:,:,:)
+    integer, allocatable :: r(:)
+    allocate(a(4,4,4), m(4,4,4))
+    a(:,:,:) = reshape((/ real:: data64 /), shape(a))
+    m(:,:,:) = reshape(mask64, shape(m))
+    r = minloc(a, mask = m)
+    if (size(r, dim = 1) /= 3) stop 211
+    if (any(r /= (/ 2, 3, 1 /))) stop 212
+  end subroutine
+  subroutine check_real_alloc_rank_3_true_mask()
+    real, allocatable :: a(:,:,:)
+    integer, allocatable :: r(:)
+    allocate(a(4,4,4))
+    a(:,:,:) = reshape((/ real:: data64 /), shape(a))
+    r = minloc(a, mask = .true.)
+    if (size(r, dim = 1) /= 3) stop 221
+    if (any(r /= (/ 2, 2, 1 /))) stop 222
+  end subroutine
+  subroutine check_real_alloc_rank_3_false_mask()
+    real, allocatable :: a(:,:,:)
+    integer, allocatable :: r(:)
+    allocate(a(4,4,4))
+    a(:,:,:) = reshape((/ real:: data64 /), shape(a))
+    r = minloc(a, mask = .false.)
+    if (size(r, dim = 1) /= 3) stop 231
+    if (any(r /= (/ 0, 0, 0 /))) stop 232
+  end subroutine
+end subroutine check_minloc_with_mask
diff --git a/gcc/testsuite/gfortran.dg/minmaxloc_18a.f90 b/gcc/testsuite/gfortran.dg/minmaxloc_18a.f90
new file mode 100644
index 00000000000..362d1765c89
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/minmaxloc_18a.f90
@@ -0,0 +1,10 @@ 
+! { dg-do compile }
+! { dg-additional-files "minmaxloc_18.f90" }
+! { dg-additional-options "-Os -fdump-tree-original" }
+! { dg-final { scan-tree-dump-times "gfortran_\[sm\]?minloc" 30 "original" } }
+! { dg-final { scan-tree-dump-times "gfortran_\[sm\]?maxloc" 30 "original" } }
+!
+! PR fortran/90608
+! Check that all MINLOC and MAXLOC intrinsics use the implementation provided
+! by the library when optimizing for size.
+include "minmaxloc_18.f90"
diff --git a/gcc/testsuite/gfortran.dg/minmaxloc_18b.f90 b/gcc/testsuite/gfortran.dg/minmaxloc_18b.f90
new file mode 100644
index 00000000000..068c941110f
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/minmaxloc_18b.f90
@@ -0,0 +1,10 @@ 
+! { dg-do compile }
+! { dg-additional-files "minmaxloc_18.f90" }
+! { dg-additional-options "-O2 -fno-inline-intrinsics=minloc -fdump-tree-original" }
+! { dg-final { scan-tree-dump-times "gfortran_\[sm\]?minloc" 30 "original" } }
+! { dg-final { scan-tree-dump-not "gfortran_\[sm\]?maxloc" "original" } }
+!
+! PR fortran/90608
+! Check that -O2 enables inlining and -fno-inline-intrinsics selectively
+! disables it.
+include "minmaxloc_18.f90"
diff --git a/gcc/testsuite/gfortran.dg/minmaxloc_18c.f90 b/gcc/testsuite/gfortran.dg/minmaxloc_18c.f90
new file mode 100644
index 00000000000..47fe54e20a9
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/minmaxloc_18c.f90
@@ -0,0 +1,10 @@ 
+! { dg-do compile }
+! { dg-additional-files "minmaxloc_18.f90" }
+! { dg-additional-options "-O3 -fno-inline-intrinsics=maxloc -fdump-tree-original" }
+! { dg-final { scan-tree-dump-not "gfortran_\[sm\]?minloc" "original" } }
+! { dg-final { scan-tree-dump-times "gfortran_\[sm\]?maxloc" 30 "original" } }
+!
+! PR fortran/90608
+! Check that -O3 enables inlining and -fno-inline-intrinsics selectively
+! disables it.
+include "minmaxloc_18.f90"
diff --git a/gcc/testsuite/gfortran.dg/minmaxloc_18d.f90 b/gcc/testsuite/gfortran.dg/minmaxloc_18d.f90
new file mode 100644
index 00000000000..eb530f69a2e
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/minmaxloc_18d.f90
@@ -0,0 +1,10 @@ 
+! { dg-do compile }
+! { dg-additional-files "minmaxloc_18.f90" }
+! { dg-additional-options "-O0 -finline-intrinsics=maxloc -fdump-tree-original" }
+! { dg-final { scan-tree-dump-times "gfortran_\[sm\]?minloc" 30 "original" } }
+! { dg-final { scan-tree-dump-not "gfortran_\[sm\]?maxloc" "original" } }
+!
+! PR fortran/90608
+! Check that -O0 disables inlining and -finline-intrinsics selectively
+! enables it.
+include "minmaxloc_18.f90"