diff mbox series

[v2,01/12] OpenMP: metadirective tree data structures and front-end interfaces

Message ID 20240529190500.4097834-2-sloosemore@baylibre.com
State New
Headers show
Series OpenMP: Metadirective support + "declare variant" improvements | expand

Commit Message

Sandra Loosemore May 29, 2024, 7:04 p.m. UTC
This patch adds the OMP_METADIRECTIVE tree node and shared tree-level
support for manipulating metadirectives.  It defines/exposes
interfaces that will be used in subsequent patches that add front-end
and middle-end support, but nothing generates these nodes yet.

This patch also adds compile-time support for dynamic context
selectors (the target_device selector set and the condition selector
of the user selector set) for metadirectives only.  The "declare
variant" directive still supports only static selectors.

gcc/ChangeLog
	* Makefile.in (GTFILES): Move omp-general.h earlier in the list.
	* builtin-types.def (BT_FN_BOOL_INT_CONST_PTR_CONST_PTR_CONST_PTR):
	New.
	* doc/generic.texi (OpenMP): Document OMP_METADIRECTIVE and
	context selector interfaces.
	* omp-builtins.def (BUILT_IN_GOMP_EVALUATE_TARGET_DEVICE): New.
	* omp-general.cc (omp_check_context_selector): Add metadirective_p
	parameter, use it to conditionalize target_device support.
	(make_omp_metadirective_variant): New.
	(omp_context_selector_matches): Add metadirective_p and delay_p
	parameters, use them to control things that can only be matched
	late.  Handle OMP_TRAIT_SET_TARGET_DEVICE.
	(score_wide_int): Move definition to omp-general.h.
	(omp_encode_kind_arch_isa_props): New.
	(omp_dynamic_cond): New.
	(omp_context_compute_score): Handle OMP_TRAIT_SET_TARGET_DEVICE.
	(omp_resolve_late_declare_variant, omp_resolve_declare_variant):
	Adjust calls to omp_context_selector_matches.
	(sort_variant): New.
	(omp_get_dynamic_candidates): New.
	(omp_early_resolve_metadirective): New.
	* omp-general.h (score_wide_int): Moved here from omp-general.cc.
	(struct omp_variant): New.
	(OMP_METADIRECTIVE_VARIANT_SELECTOR): New.
	(OMP_METADIRECTIVE_VARIANT_DIRECTIVE): New.
	(OMP_METADIRECTIVE_VARIANT_BODY): New.
	(make_omp_metadirective_variant): Declare.
	(omp_check_context_selector): Adjust to match definition.
	(omp_context_selector_matches): Likewise.
	(omp_early_resolve_metadirective): New.
	* tree-pretty-print.cc (dump_omp_context_selector): Remove
	static qualifier.
	(dump_generic_node): Handle OMP_METADIRECTIVE.
	* tree-pretty-print.h (dump_omp_context_selector): Declare.
	* tree.def (OMP_METADIRECTIVE): New.
	* tree.h (OMP_METADIRECTIVE_VARIANTS): New.

gcc/c/ChangeLog
	* c-parser.cc (c_finish_omp_declare_variant): Update calls to
	omp_check_context_selector and omp_context_selector_matches.

gcc/cp/ChangeLog
	* decl.cc (omp_declare_variant_finalize_one):  Update call to
	omp_context_selector_matches to pass additional arguments.
	* parser.cc (cp_finish_omp_declare_variant): Likewise for
	omp_check_context_selector.

gcc/fortran/ChangeLog
	* trans-openmp.cc (gfc_trans_omp_declare_variant):  Update calls to
	omp_check_context_selector and omp_context_selector_matches.
	* types.def (BT_FN_BOOL_INT_CONST_PTR_CONST_PTR_CONST_PTR): New.

Co-Authored-By: Kwok Cheung Yeung <kcy@codesourcery.com>
Co-Authored-By: Sandra Loosemore <sandra@codesourcery.com>
---
 gcc/Makefile.in             |   2 +-
 gcc/builtin-types.def       |   2 +
 gcc/c/c-parser.cc           |   4 +-
 gcc/cp/decl.cc              |   2 +-
 gcc/cp/parser.cc            |   2 +-
 gcc/doc/generic.texi        |  32 ++++
 gcc/fortran/trans-openmp.cc |   4 +-
 gcc/fortran/types.def       |   2 +
 gcc/omp-builtins.def        |   3 +
 gcc/omp-general.cc          | 357 ++++++++++++++++++++++++++++++++++--
 gcc/omp-general.h           |  31 +++-
 gcc/tree-pretty-print.cc    |  36 +++-
 gcc/tree-pretty-print.h     |   2 +
 gcc/tree.def                |   6 +
 gcc/tree.h                  |   3 +
 15 files changed, 461 insertions(+), 27 deletions(-)

Comments

Tobias Burnus May 31, 2024, 12:22 p.m. UTC | #1
Hi Sandra,

some observations/comments, but in general it looks good.

Sandra Loosemore wrote:
> This patch adds the OMP_METADIRECTIVE tree node and shared tree-level
> support for manipulating metadirectives.  It defines/exposes
> interfaces that will be used in subsequent patches that add front-end
> and middle-end support, but nothing generates these nodes yet.
>
> This patch also adds compile-time support for dynamic context
> selectors (the target_device selector set and the condition selector
> of the user selector set) for metadirectives only.  The "declare
> variant" directive still supports only static selectors.
...
>   /* Return 1 if context selector matches the current OpenMP context, 0
>      if it does not and -1 if it is unknown and need to be determined later.
>      Some properties can be checked right away during parsing (this routine),
>      others need to wait until the whole TU is parsed, others need to wait until
> -   IPA, others until vectorization.  */
> +   IPA, others until vectorization.
> +
> +   METADIRECTIVE_P is true if this is a metadirective context, and DELAY_P
> +   is true if it's too early in compilation to determine whether some
> +   properties match.
> +
> +   Dynamic properties (which are evaluated at run-time) should always
> +   return 1.  */
I have to admit that I don't really see the use of metadirective_p as …
>   int
> -omp_context_selector_matches (tree ctx)
> +omp_context_selector_matches (tree ctx, bool metadirective_p, bool delay_p)
...
> +		    if (metadirective_p && delay_p)
> +		      return -1;

I do see why the resolution of KIND/ARCH/ISA should be delayed – for 
both variant/metadirective as long as the code is run by the host and 
the device. Except that we could exclude, e.g., 'kind(FPGA)' early on as 
we don't support it at all.

But once the device code is split off, I don't see why we can't expand 
the DEVICE clause right away for both variant and metadirective – while 
for 'target_device', we cannot do much until runtime – except of 
excluding things like 'kind(fpga)' – or excluding all 'arch' known not 
to be supported neither by the host nor by any enabled offload devices.

Thus, I see why there is a 'delay_p', but not why there is a 
'metadirective_p'.

But I might have missed something important ...

>              case OMP_TRAIT_USER_CONDITION:
>                if (set == OMP_TRAIT_SET_USER)
>                  for (tree p = OMP_TS_PROPERTIES (ts); p; p = TREE_CHAIN (p))
>                    if (OMP_TP_NAME (p) == NULL_TREE)
>                      {
> +		      /* OpenMP 5.1 allows non-constant conditions for
> +			 metadirectives.  */
> +		      if (metadirective_p
> +			  && !tree_fits_shwi_p (OMP_TP_VALUE (p)))
> +			break;
>                        if (integer_zerop (OMP_TP_VALUE (p)))
>                          return 0;
>                        if (integer_nonzerop (OMP_TP_VALUE (p)))
>                          break;
>                        ret = -1;
>                      }

(BTW: I am happy to be enlightened as I likely have miss some fine print.)

Regarding the comment: True, but shouldn't this be handled before by 
issuing an error when such a clause is used in 'declare variant', i.e. 
only occur when metadirective_p is/can be true?

Besides, I have to admit that I do not understand the new code. The 
current code has: constant zero → whole selector known to be false 
("return 0"); nonzero constant → keep current state, i.e. either 'true' 
(1) or don't known ('-1') and continue; otherwise (not const) → set to 
"don't know" (-1) and continue with the next item.

That seems to make also sense for metadirectives. But your patch changes 
this to keep current state if a variable. In that case, '1' is used if 
this is the only item or the previous condition is true. Or "-1" when 
the previous item is "don't know" (-1). - I think that doesn't make 
sense and it should always return -1 for a run time value.

Additionally, I wonder why you use tree_fits_shwi_p instead of a simple 
'TREE_CODE (OMP_TP_VALUE (p)) != INTEGER_CST'. It does not seem to 
matter here, but '(uint128_t)-1' looks like a valid condition and valid 
constant, which integer_nonzerop should handled but if the hwi is 128bit 
wide, it won't fit into a signed variable.

(As integer_nonzerop and the current code both do "break;" it won't 
change the result of the current code.)

* * *
> +static tree
> +omp_dynamic_cond (tree ctx)
> +{
...
> +      /* The user condition is not dynamic if it is constant.  */
> +      if (!tree_fits_shwi_p (TREE_VALUE (expr_list)))

Any reason for using tree_fits_shwi_p instead of INTEGER_CST? Here, 
(uint128_t)-1 could make a difference …

> +	/* omp_initial_device is -1, omp_invalid_device is -4; choose
> +	   a value that isn't otherwise defined to indicate the default
> +	   device.  */
> +	device_num = build_int_cst (integer_type_node, -2);

Don't do this - we do it differently for 'target' and it should do the 
same. Some value usage history:

For target / target data etc, GCC historically used -1 to denote for 
that no device clause was specified (→ default device) and >= 0 for a 
user-specified device, i.e. "GOMP_target_ext(device_num,..." gets a "-1" 
if nothing was specified (= default device).

OpenMP 5.2 then introduced omp_{initial,invalid}_device with 
omp_initial_device == -1 and the other one < -1 but implementation defined.

Thus, we have 0...num_devices + -4 (omp_invalid_device) and "-1" for 
both default device (GOMP_target_ext etc.) and as host/initial device 
(API routines omp_...).

Solution was: For the GOMP_... functions, keep using -1 = default device 
and use -2 for omp_initial_device. (And for the API routines, -1 == 
initial device).

If you look at the device number handling in libgomp, functions which 
can be called in both context have a "remap" boolean to handle the two 
usages for -1.


I strongly suggest to use the same semantic here to avoid confusion, 
i.e. -1 = nothing specified, use default device. And map a 
user-specified omp_initial_device (-1) to a -2.

And it would help to use GOMP_DEVICE_HOST_FALLBACK (-2, host/initial 
device) and GOMP_DEVICE_ICV (-1, default device) where appropriate as 
they are a tiny bit more readable and more greppable.

For the conversion, have a look at omp-expand.cc's expand_omp_target and 
search for both OMP_CLAUSE_DEVICE (2nd hit) and need_device_adjustment; 
if the latter is true: device = (cond ? device : 
GOMP_DEVICE_HOST_FALLBACK)  [where cond is 'd == -1' or rather 
'(unsigned) d > (unsigned)-2']

It might make sense to move (part of) this code out of expand_omp_target 
and share it by both; at least the gimple code is rather complex. And I 
guess your code also runs in later phase of the compiler and, hence, 
also cannot use simple code.

* * *
> @@ -4019,6 +4019,40 @@ dump_generic_node (pretty_printer *pp, tree node, int spc, dump_flags_t flags,
...
> +	    if (selector == NULL_TREE)
> +	      pp_string (pp, "default:");

I wonder whether we should be forward looking (OpenMP >= 5.2) and dump 
'otherwise:'. (OpenMP <= 5.1 + [deprecated] 5.2 have 'default'.)

Tobias
Sandra Loosemore July 7, 2024, 11:15 p.m. UTC | #2
On 5/31/24 06:22, Tobias Burnus wrote:

> I have to admit that I don't really see the use of metadirective_p as …
>>   int
>> -omp_context_selector_matches (tree ctx)
>> +omp_context_selector_matches (tree ctx, bool metadirective_p, bool 
>> delay_p)
> ...
>> +            if (metadirective_p && delay_p)
>> +              return -1;
> 
> I do see why the resolution of KIND/ARCH/ISA should be delayed – for 
> both variant/metadirective as long as the code is run by the host and 
> the device. Except that we could exclude, e.g., 'kind(FPGA)' early on as 
> we don't support it at all.
> 
> But once the device code is split off, I don't see why we can't expand 
> the DEVICE clause right away for both variant and metadirective – while 
> for 'target_device', we cannot do much until runtime – except of 
> excluding things like 'kind(fpga)' – or excluding all 'arch' known not 
> to be supported neither by the host nor by any enabled offload devices.
> 
> Thus, I see why there is a 'delay_p', but not why there is a 
> 'metadirective_p'.
> 
> But I might have missed something important ...

Yeah, omp_context_selector_matches() is pretty substantially revised in 
part 9 of the posted patch set -- among other things, to remove the 
metadirective_p parameter.  The current split between patches isn't 
ideal but this is such a huge patch set already (with more pieces in the 
works to support "begin declare variant") that refactoring them would be 
a lot of work and probably result in something even more challenging to 
review.  :-S

> 
>>              case OMP_TRAIT_USER_CONDITION:
>>                if (set == OMP_TRAIT_SET_USER)
>>                  for (tree p = OMP_TS_PROPERTIES (ts); p; p = 
>> TREE_CHAIN (p))
>>                    if (OMP_TP_NAME (p) == NULL_TREE)
>>                      {
>> +              /* OpenMP 5.1 allows non-constant conditions for
>> +             metadirectives.  */
>> +              if (metadirective_p
>> +              && !tree_fits_shwi_p (OMP_TP_VALUE (p)))
>> +            break;
>>                        if (integer_zerop (OMP_TP_VALUE (p)))
>>                          return 0;
>>                        if (integer_nonzerop (OMP_TP_VALUE (p)))
>>                          break;
>>                        ret = -1;
>>                      }
> 
> (BTW: I am happy to be enlightened as I likely have miss some fine print.)
> 
> Regarding the comment: True, but shouldn't this be handled before by 
> issuing an error when such a clause is used in 'declare variant', i.e. 
> only occur when metadirective_p is/can be true?

The error diagnostic is handled during parsing in the respective front 
ends (parts 4, 5, and 7 of the series).

> Besides, I have to admit that I do not understand the new code. The 
> current code has: constant zero → whole selector known to be false 
> ("return 0"); nonzero constant → keep current state, i.e. either 'true' 
> (1) or don't known ('-1') and continue; otherwise (not const) → set to 
> "don't know" (-1) and continue with the next item.
> 
> That seems to make also sense for metadirectives. But your patch changes 
> this to keep current state if a variable. In that case, '1' is used if 
> this is the only item or the previous condition is true. Or "-1" when 
> the previous item is "don't know" (-1). - I think that doesn't make 
> sense and it should always return -1 for a run time value.

-1 doesn't really mean "don't know".  It means "don't know YET".  For 
the purposes of omp_context_selector_matches, a dynamic selector always 
matches in the sense that they all need to be included in the list of 
replacement candidates.

> Additionally, I wonder why you use tree_fits_shwi_p instead of a simple 
> 'TREE_CODE (OMP_TP_VALUE (p)) != INTEGER_CST'. It does not seem to 
> matter here, but '(uint128_t)-1' looks like a valid condition and valid 
> constant, which integer_nonzerop should handled but if the hwi is 128bit 
> wide, it won't fit into a signed variable.
> 
> (As integer_nonzerop and the current code both do "break;" it won't 
> change the result of the current code.)

The existing code for parsing "declare variant" context selectors 
already uses tree_fits_shwi_p.  (See e.g. c_parser_omp_context_selector 
in gcc/c/c-parser.cc.)  If there's a better idiom for checking for a 
constant I'll certainly use it, but I was trying to be consistent with 
what I thought was standard practice already.  :-S

> * * *
>> +static tree
>> +omp_dynamic_cond (tree ctx)
>> +{
> ...
>> +      /* The user condition is not dynamic if it is constant.  */
>> +      if (!tree_fits_shwi_p (TREE_VALUE (expr_list)))
> 
> Any reason for using tree_fits_shwi_p instead of INTEGER_CST? Here, 
> (uint128_t)-1 could make a difference …

Same here.

> 
>> +    /* omp_initial_device is -1, omp_invalid_device is -4; choose
>> +       a value that isn't otherwise defined to indicate the default
>> +       device.  */
>> +    device_num = build_int_cst (integer_type_node, -2);
> 
> Don't do this - we do it differently for 'target' and it should do the 
> same. Some value usage history:

Wait, in your January review comments on an earlier version of this 
patch you told me I *should* do this -- use a negative value that is 
neither omp_initial_device nor omp_invalid_device.

> For target / target data etc, GCC historically used -1 to denote for 
> that no device clause was specified (→ default device) and >= 0 for a 
> user-specified device, i.e. "GOMP_target_ext(device_num,..." gets a "-1" 
> if nothing was specified (= default device).
> 
> OpenMP 5.2 then introduced omp_{initial,invalid}_device with 
> omp_initial_device == -1 and the other one < -1 but implementation defined.
> 
> Thus, we have 0...num_devices + -4 (omp_invalid_device) and "-1" for 
> both default device (GOMP_target_ext etc.) and as host/initial device 
> (API routines omp_...).
> 
> Solution was: For the GOMP_... functions, keep using -1 = default device 
> and use -2 for omp_initial_device. (And for the API routines, -1 == 
> initial device).
> 
> If you look at the device number handling in libgomp, functions which 
> can be called in both context have a "remap" boolean to handle the two 
> usages for -1.
> 
> 
> I strongly suggest to use the same semantic here to avoid confusion, 
> i.e. -1 = nothing specified, use default device. And map a 
> user-specified omp_initial_device (-1) to a -2.
> 
> And it would help to use GOMP_DEVICE_HOST_FALLBACK (-2, host/initial 
> device) and GOMP_DEVICE_ICV (-1, default device) where appropriate as 
> they are a tiny bit more readable and more greppable.
> 
> For the conversion, have a look at omp-expand.cc's expand_omp_target and 
> search for both OMP_CLAUSE_DEVICE (2nd hit) and need_device_adjustment; 
> if the latter is true: device = (cond ? device : 
> GOMP_DEVICE_HOST_FALLBACK)  [where cond is 'd == -1' or rather 
> '(unsigned) d > (unsigned)-2']
> 
> It might make sense to move (part of) this code out of expand_omp_target 
> and share it by both; at least the gimple code is rather complex. And I 
> guess your code also runs in later phase of the compiler and, hence, 
> also cannot use simple code.

I'm totally lost here.  Why is this necessary or a good idea?  The 
device number gets passed directly to a generated call to the runtime 
function GOMP_evaluate_target_device, which specifically tests for -2 as 
a magic number (part 3 of the patch series).  It doesn't have anything 
to do with the omp target construct expansion.  If -2, specifically, was 
a bad choice, I can change it to a different magic number that is used 
only for this purpose and nothing else.

> * * *
>> @@ -4019,6 +4019,40 @@ dump_generic_node (pretty_printer *pp, tree 
>> node, int spc, dump_flags_t flags,
> ...
>> +        if (selector == NULL_TREE)
>> +          pp_string (pp, "default:");
> 
> I wonder whether we should be forward looking (OpenMP >= 5.2) and dump 
> 'otherwise:'. (OpenMP <= 5.1 + [deprecated] 5.2 have 'default'.)

I can fix this one easily enough now, but I was envisioning some future 
task to clean up all the deprecated syntaxes.  Among other things, GCC 
still implements only the OpenMP 5.0 syntax for requires selectors 
(where the clauses of the requires directive are recognized as traits by 
themselves in the "implementation" set) instead of the 5.1+ syntax (the 
requires clauses are properties of a "requires" trait in the 
"implementation" set).  In addition to the backlog of 
unreviewed/uncommitted patches relating to variant directives, there's 
also a backlog of incomplete or broken features, and doing something 
about different versions of the syntax is one of them.

-Sandra
Tobias Burnus July 16, 2024, 12:53 p.m. UTC | #3
Hi Sandra,

Sandra Loosemore wrote:
>>> +    /* omp_initial_device is -1, omp_invalid_device is -4; choose
>>> +       a value that isn't otherwise defined to indicate the default
>>> +       device.  */
>>> +    device_num = build_int_cst (integer_type_node, -2);
>>
>> Don't do this - we do it differently for 'target' and it should do the 
>> same. Some value usage history:

Without caring for backward compatibility, I think we had somewhere

#define OMP_DEFAULT_DEVICE -2

and would simply use it everywhere when doing API calls.


But to handle old code, we have to handle both:
  -1 → default device
and
  -1 → initial device (= host).


Before coming back to your code, let's try to explain the history
and reason again. Maybe I manage to explain it better this time:

* * *

The problem is that -1 on the user side and -1 on the internal-use
side mean different things. Namely:

In the old days OpenMP had on the user side:
   device numbers 0 ... omp_get_num_devices()
where the upper bound was the initial device (= host), 
omp_get_initial_device().

For
   omp target num_device(n)
the device number has to be passed to the run time – and GCC just passes 
"n" here.

But GCC also needs to handle:
   omp target
i.e. not specifying a device number (= using the default device). It has 
been implemented in the obvious way, i.e. passing '-1'.


Later, OpenMP added:
   omp_initial_device == -1
   omp_invalid_device (negative, implementation defined, != 
omp_initial_device)

GCC set the latter rather arbitrary to -4.


RESULT: Everything works fine, except for -1 as
   omp target device_num(omp_initial_device)
and
   omp target
are now the same, but semantically one uses the host and the other the
default device.


Therefore, GCC uses:
(A) API routines - use omp_initial_device == -1 as value.

(B) Directives - use -1 for no clause (= backward compatible), using the 
default device.
Using -2 for omp_initial_device.


Hence, the following defines exist:

#define GOMP_DEVICE_ICV                 -1
#define GOMP_DEVICE_HOST_FALLBACK       -2
#define GOMP_DEVICE_INVALID             -4


If you call an OpenMP runtime API routine, you need to use -1 for the 
initial device and for GOMP_* functions related to directives -2 using
GOMP_DEVICE_HOST_FALLBACK, when constructing it manually.

Code wise, GCC handles num_device(n) by generating code like:
   if !num_device
     devnum = GOMP_DEVICE_ICV;
   else
     devnum = (n == -1) ? GOMP_DEVICE_HOST_FALLBACK : n;

That's not ideal but one solution to handle backward compatibility.



Inside libgomp/target.c, there is:

   resolve_device (int device_id, bool remapped)

and 'remapped' is
- 'false' for OpenMP API routines and
- 'true' for GOMP_* calls.

The following code in resolve_device does then undo the '-1':

   if (remapped && device_id == GOMP_DEVICE_ICV)
       device_id = icv->default_device_var;
       remapped = false;
   if (device_id < 0)
       if (device_id == (remapped ? GOMP_DEVICE_HOST_FALLBACK
                                  : omp_initial_device))
         return NULL;

* * *

Now coming back to your code:

If you call
   resolve_device
directly, using the GOMP_* variant makes sense, i.e. passing
the device number as is with 'remap = true'. This also makes
sense for consistency with the remaining code.

Downside: This requires to add
   (n == -1) ? -2 : n
for user-specified 'n'.


If you handle the device_num resolution yourself in libgomp, you have 
two variants to chose from:

(a) using a different value to denote the default-device (e.g. '-2' or 
'-3')  and pass it as is

(b) call resolve_device with remapping in libgomp, but handling -1 for 
the default device as '(n == -1) ? -2 : n' during code gen

I think either works - and either variant is confusing in one way or the
other.

* * *

Jumping to:

[PATCH v2 03/12] libgomp: runtime support for target_device selector

libgomp/target.c:

> +bool
> +GOMP_evaluate_target_device (int device_num, const char *kind,
> +			     const char *arch, const char *isa)
> +{

If you do the remapping, you could just use:

   struct gomp_device_descr *devicep = resolve_device (device, true);
   if (kind && strcmp (kind, "any") == 0)
     kind = NULL;
   if (devicep == NULL)
     result = GOMP_evaluate_current_device (kind, arch, isa);
   else
     result = device->evaluate_device_func (device_num, kind, arch, isa);

which seems to be simpler than the code you have.

If you don't do the remapping:

> +  bool result = true;
> +
> +  /* -2 is a magic number to indicate the device number was not specified;
> +     in that case it's supposed to use the default device.  */
> +  if (device_num == -2)
> +    device_num = omp_get_default_device ();

… then you need to handle -2 yourself.

> +  if (kind && strcmp (kind, "any") == 0)
> +    kind = NULL;
> +
> +  gomp_debug (1, "%s: device_num = %u, kind=%s, arch=%s, isa=%s",
> +	      __FUNCTION__, device_num, kind, arch, isa);
> +
> +  if (omp_get_device_num () == device_num)
> +    result = GOMP_evaluate_current_device (kind, arch, isa);

Here, you additionally need to take care of omp_initial_device, e.g.

   if (device_num == -1 /* omp_initial_device */
       || device_num == gomp_get_num_devices ())

(note the 'gomp_' to avoid two indirections)

> +  else
> +    {
> +      if (!omp_is_initial_device ())
> +	/* Accelerators are not expected to know about other devices.  */
> +	result = false;

This is always false. At least for GCN and NVPTX as those have their own 
file and functions under libgomp → config/gcn/target.c and 
config/nvptx/target.c

In turn, if GOMP_evaluate_target_device can be called on the device 
side, you need to add it to those files. I think this is compile-time
resolvable, but I have not checked.

(And if it is callable but protected by a run-time 'if(false)' you have
to add a stub with 'gcc_unreachable()' to avoid linking errors. But I 
think the TARGET_OMP_DEVICE_KIND_ARCH_ISA function already takes care
of resolving this at compile time. [Not checked.])

* * *

> +      else
> +	{
> +	  struct gomp_device_descr *device = resolve_device (device_num, true);

Semantically, this should use 'false' here as there is no remapping 
done. It does not really matter as '-2'/default-device has been resolved 
above and the inital device (-1 and num_devices) is handled above.

BTW: This code handles omp_invalid_device via error termination. (As it 
should.)

It also handles the OMP_TARGET_OFFLOAD=mandatory diagnostic.

> +	  if (device == NULL)
> +	    result = false;
> +	  else if (device->evaluate_device_func)
> +	    result = device->evaluate_device_func (device_num, kind, arch,
> +						   isa);
> +	}
> +    }
> +
> +  gomp_debug (1, " -> %s\n", result ? "true" : "false");
> +  return result;
> +}
> +

* * *

Due to backward compatibility issues, there is no real clean solution
for the magic value, whether '-1' is the default device or the initial
device.

I think both variants are acceptable, but either way, libgomp/target.c
needs some tweaking, cf. above.

* * *

Question: Is now everything clear - or are you still confused by my writing?

Tobias
Sandra Loosemore July 16, 2024, 5:03 p.m. UTC | #4
On 7/16/24 06:53, Tobias Burnus wrote:

> Question: Is now everything clear - or are you still confused by my 
> writing?

Well, I still do not understand why backward compatibility concerns 
specific to some other directive should affect the ABI for a new 
directive that does not have any current libgomp runtime support, but I 
suggest that in the interest of efficiently moving forward with this you 
just tell me what ABI you want me to implement and I will re-do the code 
that way.  I *thought* I had followed your previous suggestion in the 
current version of the patches, but apparently I guessed wrong, so I'd 
appreciate it if you were more explicit about exactly what the 
compiler/runtime interface should be, and if there are specific things 
that should be happening in either the generated code or the runtime, 
you make it clear what goes in which side of the interface.

-Sandra
Tobias Burnus July 16, 2024, 5:30 p.m. UTC | #5
Hi Sandra,

Am 16.07.24 um 19:03 schrieb Sandra Loosemore:
> Well, I still do not understand why backward compatibility concerns 
> specific to some other directive should affect the ABI for a new 
> directive that does not have any current libgomp runtime support,

I am happy that I managed to explain you the background of the "-1" 
mess. Otherwise:


The backward-compatibility hack is not required, but it has two 
advantages: consistency of the values used and it makes the code inside 
target.c way simpler by just using

   struct gomp_device_descr *devicep = resolve_device (device, true);

instead of handling several additional cases.


However, as written, avoiding the '(n == -1) ? -2 : n' code generation 
also has advantages; hence, I am also happy with that variant. (i.e. -2 
or -3 denoting the default device).

However, if you use -2 == default device, you need to fix the 
libgomp/target.c implementation as your code doesn't handle 
omp_default_device correctly, which 'resolve_device (device, true);' 
would handle automatically.


> you just tell me what ABI you want me to implement and I will re-do 
> the code that way.

Having looked at the code again – and in particular at libgomp/target.c, 
I realized the merits of using -2. Thus, at the end, I am happy with 
*either* variant.

But either version requires some changes: One the creation of the 
conditional gimple code + much simplified code in target.c. And the 
other, keeping the current gimple code – but fixing/extending target.c.

Tobias
diff mbox series

Patch

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index a7f15694c34..d08889a3cec 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -2869,6 +2869,7 @@  GTFILES = $(CPPLIB_H) $(srcdir)/input.h $(srcdir)/coretypes.h \
   $(srcdir)/tree-ssa-operands.h \
   $(srcdir)/tree-profile.cc $(srcdir)/tree-nested.cc \
   $(srcdir)/omp-offload.h \
+  $(srcdir)/omp-general.h \
   $(srcdir)/omp-general.cc \
   $(srcdir)/omp-low.cc \
   $(srcdir)/targhooks.cc $(out_file) $(srcdir)/passes.cc \
@@ -2895,7 +2896,6 @@  GTFILES = $(CPPLIB_H) $(srcdir)/input.h $(srcdir)/coretypes.h \
   $(srcdir)/ipa-strub.cc \
   $(srcdir)/internal-fn.h \
   $(srcdir)/calls.cc \
-  $(srcdir)/omp-general.h \
   $(srcdir)/analyzer/analyzer-language.cc \
   @all_gtfiles@
 
diff --git a/gcc/builtin-types.def b/gcc/builtin-types.def
index c97d6bad1de..605a38ab84d 100644
--- a/gcc/builtin-types.def
+++ b/gcc/builtin-types.def
@@ -878,6 +878,8 @@  DEF_FUNCTION_TYPE_4 (BT_FN_VOID_UINT_PTR_INT_PTR, BT_VOID, BT_INT, BT_PTR,
 		     BT_INT, BT_PTR)
 DEF_FUNCTION_TYPE_4 (BT_FN_BOOL_UINT_UINT_UINT_BOOL,
 		     BT_BOOL, BT_UINT, BT_UINT, BT_UINT, BT_BOOL)
+DEF_FUNCTION_TYPE_4 (BT_FN_BOOL_INT_CONST_PTR_CONST_PTR_CONST_PTR,
+		     BT_BOOL, BT_INT, BT_CONST_PTR, BT_CONST_PTR, BT_CONST_PTR)
 
 DEF_FUNCTION_TYPE_5 (BT_FN_INT_STRING_INT_SIZE_CONST_STRING_VALIST_ARG,
 		     BT_INT, BT_STRING, BT_INT, BT_SIZE, BT_CONST_STRING,
diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
index 00f8bf4376e..41e8c923fcd 100644
--- a/gcc/c/c-parser.cc
+++ b/gcc/c/c-parser.cc
@@ -24926,7 +24926,7 @@  c_finish_omp_declare_variant (c_parser *parser, tree fndecl, tree parms)
   tree ctx = c_parser_omp_context_selector_specification (parser, parms);
   if (ctx == error_mark_node)
     goto fail;
-  ctx = omp_check_context_selector (match_loc, ctx);
+  ctx = omp_check_context_selector (match_loc, ctx, false);
   if (ctx != error_mark_node && variant != error_mark_node)
     {
       if (TREE_CODE (variant) != FUNCTION_DECL)
@@ -24959,7 +24959,7 @@  c_finish_omp_declare_variant (c_parser *parser, tree fndecl, tree parms)
 	  tree construct
 	    = omp_get_context_selector_list (ctx, OMP_TRAIT_SET_CONSTRUCT);
 	  omp_mark_declare_variant (match_loc, variant, construct);
-	  if (omp_context_selector_matches (ctx))
+	  if (omp_context_selector_matches (ctx, false, true))
 	    {
 	      tree attr
 		= tree_cons (get_identifier ("omp declare variant base"),
diff --git a/gcc/cp/decl.cc b/gcc/cp/decl.cc
index a992d54dc8f..c625b45cab4 100644
--- a/gcc/cp/decl.cc
+++ b/gcc/cp/decl.cc
@@ -8357,7 +8357,7 @@  omp_declare_variant_finalize_one (tree decl, tree attr)
 	  tree construct
 	    = omp_get_context_selector_list (ctx, OMP_TRAIT_SET_CONSTRUCT);
 	  omp_mark_declare_variant (match_loc, variant, construct);
-	  if (!omp_context_selector_matches (ctx))
+	  if (!omp_context_selector_matches (ctx, false, true))
 	    return true;
 	  TREE_PURPOSE (TREE_VALUE (attr)) = variant;
 	}
diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index 779625144db..8f3d566aa25 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -48738,7 +48738,7 @@  cp_finish_omp_declare_variant (cp_parser *parser, cp_token *pragma_tok,
   tree ctx = cp_parser_omp_context_selector_specification (parser, true);
   if (ctx == error_mark_node)
     goto fail;
-  ctx = omp_check_context_selector (match_loc, ctx);
+  ctx = omp_check_context_selector (match_loc, ctx, false);
   if (ctx != error_mark_node && variant != error_mark_node)
     {
       tree match_loc_node = maybe_wrap_with_location (integer_zero_node,
diff --git a/gcc/doc/generic.texi b/gcc/doc/generic.texi
index c596b7d44b2..e869203df0e 100644
--- a/gcc/doc/generic.texi
+++ b/gcc/doc/generic.texi
@@ -2335,6 +2335,7 @@  edge.  Rethrowing the exception is represented using @code{RESX_EXPR}.
 @tindex OMP_CONTINUE
 @tindex OMP_ATOMIC
 @tindex OMP_CLAUSE
+@tindex OMP_METADIRECTIVE
 
 All the statements starting with @code{OMP_} represent directives and
 clauses used by the OpenMP API @w{@uref{https://www.openmp.org}}.
@@ -2552,6 +2553,37 @@  same clause @code{C} need to be represented as multiple @code{C} clauses
 chained together.  This facilitates adding new clauses during
 compilation.
 
+@item OMP_METADIRECTIVE
+
+Represents @code{#pragma omp metadirective}.  This node has one field,
+accessed by the @code{OMP_METADIRECTIVE_VARIANTS (@var{node})} macro.
+
+Metadirective variants are represented internally as @code{TREE_LIST} nodes
+but you should use the interface provided in @file{omp-general.h} to
+access their components.
+
+@code{OMP_METADIRECTIVE_VARIANT_SELECTOR (@var{variant})}
+is the selector associated with the variant; this is null for the
+@samp{otherwise}/@samp{default} alternative.
+
+@code{OMP_METADIRECTIVE_VARIANT_DIRECTIVE (@var{variant})} is the
+nested directive for the variant.
+
+@code{OMP_METADIRECTIVE_VARIANT_BODY (@var{variant})} represents
+statements following a nested standalone or utility directive.
+In other cases, this field is null and the body is part of the
+nested directive instead.
+
+Metadirective context selectors (as well as context selectors for
+@code{#pragma omp declare variant}) are also represented internally using
+a @code{TREE_LIST} representation but with accessors and constructors
+declared in @file{omp-general.h}.  A complete context selector is a list of
+trait-set selectors, which are in turn composed of a list of trait selectors,
+each of which may have a list of trait properties.
+Identifiers for trait-set selectors and trait selectors are enums
+defined in @file{omp-selectors.h}, while trait property identifiers are
+string constants.
+
 @end table
 
 @node OpenACC
diff --git a/gcc/fortran/trans-openmp.cc b/gcc/fortran/trans-openmp.cc
index f867e2240bf..99ec0320258 100644
--- a/gcc/fortran/trans-openmp.cc
+++ b/gcc/fortran/trans-openmp.cc
@@ -8504,7 +8504,7 @@  gfc_trans_omp_declare_variant (gfc_namespace *ns)
 	  continue;
 	}
       set_selectors = omp_check_context_selector
-	  (gfc_get_location (&odv->where), set_selectors);
+	(gfc_get_location (&odv->where), set_selectors, false);
       if (set_selectors != error_mark_node)
 	{
 	  if (!variant_proc_sym->attr.implicit_type
@@ -8541,7 +8541,7 @@  gfc_trans_omp_declare_variant (gfc_namespace *ns)
 	      omp_mark_declare_variant (gfc_get_location (&odv->where),
 					gfc_get_symbol_decl (variant_proc_sym),
 					construct);
-	      if (omp_context_selector_matches (set_selectors))
+	      if (omp_context_selector_matches (set_selectors, false, true))
 		{
 		  tree id = get_identifier ("omp declare variant base");
 		  tree variant = gfc_get_symbol_decl (variant_proc_sym);
diff --git a/gcc/fortran/types.def b/gcc/fortran/types.def
index 390cc9542f7..f1e2973e5ff 100644
--- a/gcc/fortran/types.def
+++ b/gcc/fortran/types.def
@@ -176,6 +176,8 @@  DEF_FUNCTION_TYPE_4 (BT_FN_VOID_UINT_PTR_INT_PTR, BT_VOID, BT_INT, BT_PTR,
 		     BT_INT, BT_PTR)
 DEF_FUNCTION_TYPE_4 (BT_FN_BOOL_UINT_UINT_UINT_BOOL,
 		     BT_BOOL, BT_UINT, BT_UINT, BT_UINT, BT_BOOL)
+DEF_FUNCTION_TYPE_4 (BT_FN_BOOL_INT_CONST_PTR_CONST_PTR_CONST_PTR,
+		     BT_BOOL, BT_INT, BT_CONST_PTR, BT_CONST_PTR, BT_CONST_PTR)
 
 DEF_FUNCTION_TYPE_5 (BT_FN_VOID_OMPFN_PTR_UINT_UINT_UINT,
 		     BT_VOID, BT_PTR_FN_VOID_PTR, BT_PTR, BT_UINT, BT_UINT,
diff --git a/gcc/omp-builtins.def b/gcc/omp-builtins.def
index 044d5d087b6..ecaace8de31 100644
--- a/gcc/omp-builtins.def
+++ b/gcc/omp-builtins.def
@@ -476,3 +476,6 @@  DEF_GOMP_BUILTIN (BUILT_IN_GOMP_WARNING, "GOMP_warning",
 		  BT_FN_VOID_CONST_PTR_SIZE, ATTR_NOTHROW_LEAF_LIST)
 DEF_GOMP_BUILTIN (BUILT_IN_GOMP_ERROR, "GOMP_error",
 		  BT_FN_VOID_CONST_PTR_SIZE, ATTR_COLD_NORETURN_NOTHROW_LEAF_LIST)
+DEF_GOMP_BUILTIN (BUILT_IN_GOMP_EVALUATE_TARGET_DEVICE, "GOMP_evaluate_target_device",
+		  BT_FN_BOOL_INT_CONST_PTR_CONST_PTR_CONST_PTR,
+		  ATTR_NOTHROW_LEAF_LIST)
diff --git a/gcc/omp-general.cc b/gcc/omp-general.cc
index 2c095200d5b..e4c84d15644 100644
--- a/gcc/omp-general.cc
+++ b/gcc/omp-general.cc
@@ -1269,7 +1269,7 @@  omp_context_name_list_prop (tree prop)
    it is correct or error_mark_node otherwise.  */
 
 tree
-omp_check_context_selector (location_t loc, tree ctx)
+omp_check_context_selector (location_t loc, tree ctx, bool metadirective_p)
 {
   bool tss_seen[OMP_TRAIT_SET_LAST], ts_seen[OMP_TRAIT_LAST];
 
@@ -1278,9 +1278,10 @@  omp_check_context_selector (location_t loc, tree ctx)
     {
       enum omp_tss_code tss_code = OMP_TSS_CODE (tss);
 
-      /* We can parse this, but not handle it yet.  */
-      if (tss_code == OMP_TRAIT_SET_TARGET_DEVICE)
-	sorry_at (loc, "%<target_device%> selector set is not supported yet");
+      /* FIXME: not implemented yet.  */
+      if (!metadirective_p && tss_code == OMP_TRAIT_SET_TARGET_DEVICE)
+       sorry_at (loc, "%<target_device%> selector set is not supported "
+		 "yet for %<declare variant%>");
 
       /* Each trait-set-selector-name can only be specified once.  */
       if (tss_seen[tss_code])
@@ -1432,14 +1433,29 @@  make_trait_property (tree name, tree value, tree chain)
   return tree_cons (name, value, chain);
 }
 
+/* Constructor for metadirective variants.  */
+tree
+make_omp_metadirective_variant (tree selector, tree directive, tree body)
+{
+  return build_tree_list (selector, build_tree_list (directive, body));
+}
+
+
 /* Return 1 if context selector matches the current OpenMP context, 0
    if it does not and -1 if it is unknown and need to be determined later.
    Some properties can be checked right away during parsing (this routine),
    others need to wait until the whole TU is parsed, others need to wait until
-   IPA, others until vectorization.  */
+   IPA, others until vectorization.
+
+   METADIRECTIVE_P is true if this is a metadirective context, and DELAY_P
+   is true if it's too early in compilation to determine whether some
+   properties match.
+
+   Dynamic properties (which are evaluated at run-time) should always
+   return 1.  */
 
 int
-omp_context_selector_matches (tree ctx)
+omp_context_selector_matches (tree ctx, bool metadirective_p, bool delay_p)
 {
   int ret = 1;
   for (tree tss = ctx; tss; tss = TREE_CHAIN (tss))
@@ -1512,6 +1528,11 @@  omp_context_selector_matches (tree ctx)
 	    ret = -1;
 	  continue;
 	}
+      else if (set == OMP_TRAIT_SET_TARGET_DEVICE)
+	/* The target_device set is dynamic, so treat it as always
+	   resolvable.  */
+	continue;
+
       for (tree ts = selectors; ts; ts = TREE_CHAIN (ts))
 	{
 	  enum omp_ts_code sel = OMP_TS_CODE (ts);
@@ -1581,6 +1602,9 @@  omp_context_selector_matches (tree ctx)
 		    const char *arch = omp_context_name_list_prop (p);
 		    if (arch == NULL)
 		      return 0;
+		    if (metadirective_p && delay_p)
+		      return -1;
+
 		    int r = 0;
 		    if (targetm.omp.device_kind_arch_isa != NULL)
 		      r = targetm.omp.device_kind_arch_isa (omp_device_arch,
@@ -1703,6 +1727,9 @@  omp_context_selector_matches (tree ctx)
 #endif
 			continue;
 		      }
+		    if (metadirective_p && delay_p)
+		      return -1;
+
 		    int r = 0;
 		    if (targetm.omp.device_kind_arch_isa != NULL)
 		      r = targetm.omp.device_kind_arch_isa (omp_device_kind,
@@ -1742,6 +1769,9 @@  omp_context_selector_matches (tree ctx)
 		    const char *isa = omp_context_name_list_prop (p);
 		    if (isa == NULL)
 		      return 0;
+		    if (metadirective_p && delay_p)
+		      return -1;
+
 		    int r = 0;
 		    if (targetm.omp.device_kind_arch_isa != NULL)
 		      r = targetm.omp.device_kind_arch_isa (omp_device_isa,
@@ -1793,6 +1823,12 @@  omp_context_selector_matches (tree ctx)
 		for (tree p = OMP_TS_PROPERTIES (ts); p; p = TREE_CHAIN (p))
 		  if (OMP_TP_NAME (p) == NULL_TREE)
 		    {
+		      /* OpenMP 5.1 allows non-constant conditions for
+			 metadirectives.  */
+		      if (metadirective_p
+			  && !tree_fits_shwi_p (OMP_TP_VALUE (p)))
+			break;
+
 		      if (integer_zerop (OMP_TP_VALUE (p)))
 			return 0;
 		      if (integer_nonzerop (OMP_TP_VALUE (p)))
@@ -2191,9 +2227,107 @@  omp_lookup_ts_code (enum omp_tss_code set, const char *s)
   return OMP_TRAIT_INVALID;
 }
 
-/* Needs to be a GC-friendly widest_int variant, but precision is
-   desirable to be the same on all targets.  */
-typedef generic_wide_int <fixed_wide_int_storage <1024> > score_wide_int;
+/* Helper for omp_dynamic_cond: encode the kind/arch/isa property-lists
+   into strings for GOMP_evaluate_target_device.  The property-list
+   strings are encoded similarly to those in omp_offload_kind_arch_isa,
+   above: each trait is passed as a string, with each property for the
+   string separated by '\0', and an extra '\0' at the end of the string.  */
+static tree
+omp_encode_kind_arch_isa_props (tree props)
+{
+  if (!props)
+    return NULL_TREE;
+  size_t length = 1;
+  for (tree p = props; p; p = TREE_CHAIN (p))
+    length += strlen (omp_context_name_list_prop (p)) + 1;
+  char *buffer = (char *) alloca (length);
+  size_t n = 0;
+  for (tree p = props; p; p = TREE_CHAIN (p))
+    {
+      const char *str = omp_context_name_list_prop (p);
+      strcpy (buffer + n, str);
+      n += strlen (str) + 1;
+    }
+  *(buffer + n) = '\0';
+  return build_string_literal (length, buffer);
+}
+
+/* Return a tree expression representing the dynamic part of the context
+   selector CTX.  */
+static tree
+omp_dynamic_cond (tree ctx)
+{
+  tree expr = NULL_TREE;
+
+  tree user = omp_get_context_selector (ctx, OMP_TRAIT_SET_USER,
+					OMP_TRAIT_USER_CONDITION);
+  if (user)
+    {
+      tree expr_list = OMP_TS_PROPERTIES (user);
+
+      gcc_assert (OMP_TP_NAME (expr_list) == NULL_TREE);
+
+      /* The user condition is not dynamic if it is constant.  */
+      if (!tree_fits_shwi_p (TREE_VALUE (expr_list)))
+	expr = TREE_VALUE (expr_list);
+    }
+
+  tree target_device
+    = omp_get_context_selector_list (ctx, OMP_TRAIT_SET_TARGET_DEVICE);
+  if (target_device)
+    {
+      tree device_num = null_pointer_node;
+      tree kind = null_pointer_node;
+      tree arch = null_pointer_node;
+      tree isa = null_pointer_node;
+
+      tree device_num_sel
+	= omp_get_context_selector (ctx, OMP_TRAIT_SET_TARGET_DEVICE,
+				    OMP_TRAIT_DEVICE_NUM);
+      if (device_num_sel)
+	device_num = OMP_TP_VALUE (OMP_TS_PROPERTIES (device_num_sel));
+      else
+	/* omp_initial_device is -1, omp_invalid_device is -4; choose
+	   a value that isn't otherwise defined to indicate the default
+	   device.  */
+	device_num = build_int_cst (integer_type_node, -2);
+
+      tree kind_sel
+	= omp_get_context_selector (ctx, OMP_TRAIT_SET_TARGET_DEVICE,
+				    OMP_TRAIT_DEVICE_KIND);
+      /* "any" is equivalent to omitting this trait selector.  */
+      if (kind_sel
+	  && strcmp (omp_context_name_list_prop (OMP_TS_PROPERTIES (kind_sel)),
+		     "any"))
+	kind = omp_encode_kind_arch_isa_props (OMP_TS_PROPERTIES (kind_sel));
+
+
+      tree arch_sel
+	= omp_get_context_selector (ctx, OMP_TRAIT_SET_TARGET_DEVICE,
+				    OMP_TRAIT_DEVICE_ARCH);
+      if (arch_sel)
+	arch = omp_encode_kind_arch_isa_props (OMP_TS_PROPERTIES (arch_sel));
+
+      tree isa_sel
+	= omp_get_context_selector (ctx, OMP_TRAIT_SET_TARGET_DEVICE,
+				    OMP_TRAIT_DEVICE_ISA);
+      if (isa_sel)
+	isa = omp_encode_kind_arch_isa_props (OMP_TS_PROPERTIES (isa_sel));
+
+      /* Generate a call to GOMP_evaluate_target_device.  */
+      tree builtin_fn
+	= builtin_decl_explicit (BUILT_IN_GOMP_EVALUATE_TARGET_DEVICE);
+      tree call = build_call_expr (builtin_fn, 4, device_num, kind, arch, isa);
+
+      if (expr == NULL_TREE)
+	expr = call;
+      else
+	expr = fold_build2 (TRUTH_ANDIF_EXPR, boolean_type_node, expr, call);
+    }
+
+  return expr;
+}
+
 
 /* Compute *SCORE for context selector CTX.  Return true if the score
    would be different depending on whether it is a declare simd clone or
@@ -2205,12 +2339,21 @@  omp_context_compute_score (tree ctx, score_wide_int *score, bool declare_simd)
 {
   tree selectors
     = omp_get_context_selector_list (ctx, OMP_TRAIT_SET_CONSTRUCT);
-  bool has_kind = omp_get_context_selector (ctx, OMP_TRAIT_SET_DEVICE,
-					    OMP_TRAIT_DEVICE_KIND);
-  bool has_arch = omp_get_context_selector (ctx, OMP_TRAIT_SET_DEVICE,
-					    OMP_TRAIT_DEVICE_ARCH);
-  bool has_isa = omp_get_context_selector (ctx, OMP_TRAIT_SET_DEVICE,
-					   OMP_TRAIT_DEVICE_ISA);
+  bool has_kind
+    = (omp_get_context_selector (ctx, OMP_TRAIT_SET_DEVICE,
+				 OMP_TRAIT_DEVICE_KIND)
+       || omp_get_context_selector (ctx, OMP_TRAIT_SET_TARGET_DEVICE,
+				    OMP_TRAIT_DEVICE_KIND));
+  bool has_arch
+    = (omp_get_context_selector (ctx, OMP_TRAIT_SET_DEVICE,
+				 OMP_TRAIT_DEVICE_ARCH)
+       || omp_get_context_selector (ctx, OMP_TRAIT_SET_TARGET_DEVICE,
+				    OMP_TRAIT_DEVICE_ARCH));
+  bool has_isa
+    = (omp_get_context_selector (ctx, OMP_TRAIT_SET_DEVICE,
+				 OMP_TRAIT_DEVICE_ISA)
+       || omp_get_context_selector (ctx, OMP_TRAIT_SET_TARGET_DEVICE,
+				    OMP_TRAIT_DEVICE_ISA));
   bool ret = false;
   *score = 1;
   for (tree tss = ctx; tss; tss = TREE_CHAIN (tss))
@@ -2395,7 +2538,7 @@  omp_resolve_late_declare_variant (tree alt)
 	  nmatches++;
 	  continue;
 	}
-      switch (omp_context_selector_matches (varentry1->ctx))
+      switch (omp_context_selector_matches (varentry1->ctx, false, true))
 	{
 	case 0:
           matches.safe_push (false);
@@ -2499,7 +2642,8 @@  omp_resolve_declare_variant (tree base)
 	 don't process it again.  */
       if (node && node->declare_variant_alt)
 	return base;
-      switch (omp_context_selector_matches (TREE_VALUE (TREE_VALUE (attr))))
+      switch (omp_context_selector_matches (TREE_VALUE (TREE_VALUE (attr)),
+					    false, true))
 	{
 	case 0:
 	  /* No match, ignore.  */
@@ -2864,6 +3008,185 @@  omp_lto_input_declare_variant_alt (lto_input_block *ib, cgraph_node *node,
 						 INSERT) = entryp;
 }
 
+/* Comparison function for sorting routines, to sort OpenMP metadirective
+   variants by decreasing score.  */
+
+static int
+sort_variant (const void * a, const void *b, void *)
+{
+  score_wide_int score1
+    = ((const struct omp_variant *) a)->score;
+  score_wide_int score2
+    = ((const struct omp_variant *) b)->score;
+
+  if (score1 > score2)
+    return -1;
+  else if (score1 < score2)
+    return 1;
+  else
+    return 0;
+}
+
+/* Return a vector of dynamic replacement candidates for the directive
+   candidates in ALL_VARIANTS.  Return an empty vector if the metadirective
+   cannot be resolved.  */
+
+static vec<struct omp_variant>
+omp_get_dynamic_candidates (vec <struct omp_variant> &all_variants,
+			    bool delay_p)
+{
+  auto_vec <struct omp_variant> variants;
+  struct omp_variant default_variant;
+  bool default_found = false;
+
+  for (unsigned int i = 0; i < all_variants.length (); i++)
+    {
+      struct omp_variant variant = all_variants[i];
+
+      if (variant.selector == NULL_TREE)
+	{
+	  gcc_assert (!default_found);
+	  default_found = true;
+	  default_variant = variant;
+	  default_variant.score = 0;
+	  default_variant.resolvable_p = true;
+	  default_variant.dynamic_selector = NULL_TREE;
+	  if (dump_file)
+	    fprintf (dump_file,
+		     "Considering default selector as candidate\n");
+	  continue;
+	}
+
+      variant.resolvable_p = true;
+
+      if (dump_file)
+	{
+	  fprintf (dump_file, "Considering selector ");
+	  print_omp_context_selector (dump_file, variant.selector, TDF_NONE);
+	  fprintf (dump_file, " as candidate - ");
+	}
+
+      switch (omp_context_selector_matches (variant.selector, true, delay_p))
+	{
+	case -1:
+	  variant.resolvable_p = false;
+	  if (dump_file)
+	    fprintf (dump_file, "unresolvable");
+	  /* FALLTHRU */
+	case 1:
+	  /* TODO: Handle SIMD score?.  */
+	  omp_context_compute_score (variant.selector, &variant.score, false);
+	  variant.dynamic_selector = omp_dynamic_cond (variant.selector);
+	  variants.safe_push (variant);
+	  if (dump_file && variant.resolvable_p)
+	    {
+	      if (variant.dynamic_selector)
+		fprintf (dump_file, "matched, dynamic");
+	      else
+		fprintf (dump_file, "matched, non-dynamic");
+	    }
+	  break;
+	case 0:
+	  if (dump_file)
+	    fprintf (dump_file, "no match");
+	  break;
+	}
+
+      if (dump_file)
+	fprintf (dump_file, "\n");
+    }
+
+  /* There must be one default variant.  */
+  gcc_assert (default_found);
+
+  /* A context selector that is a strict subset of another context selector
+     has a score of zero.  */
+  for (unsigned int i = 0; i < variants.length (); i++)
+    for (unsigned int j = i + 1; j < variants.length (); j++)
+      {
+	int r = omp_context_selector_compare (variants[i].selector,
+					      variants[j].selector);
+	if (r == -1)
+	  {
+	    /* variant1 is a strict subset of variant2.  */
+	    variants[i].score = 0;
+	    break;
+	  }
+	else if (r == 1)
+	  /* variant2 is a strict subset of variant1.  */
+	  variants[j].score = 0;
+      }
+
+  /* Sort the variants by decreasing score, preserving the original order
+     in case of a tie.  */
+  variants.stablesort (sort_variant, NULL);
+
+  /* Add the default as a final choice.  */
+  variants.safe_push (default_variant);
+
+  /* Build the dynamic candidate list.  */
+  for (unsigned i = 0; i < variants.length (); i++)
+    {
+      /* If one of the candidates is unresolvable, give up for now.  */
+      if (!variants[i].resolvable_p)
+	{
+	  variants.truncate (0);
+	  break;
+	}
+
+      if (dump_file)
+	{
+	  fprintf (dump_file, "Adding directive variant with ");
+
+	  if (variants[i].selector)
+	    {
+	      fprintf (dump_file, "selector ");
+	      print_omp_context_selector (dump_file, variants[i].selector,
+					  TDF_NONE);
+	    }
+	  else
+	    fprintf (dump_file, "default selector");
+
+	  fprintf (dump_file, " as candidate.\n");
+	}
+
+      /* The last of the candidates is ended by a static selector.  */
+      if (!variants[i].dynamic_selector)
+	{
+	  variants.truncate (i + 1);
+	  break;
+	}
+    }
+
+  return variants.copy ();
+}
+
+/* Return a vector of dynamic replacement candidates for the metadirective
+   statement in METADIRECTIVE.  Return an empty vector if the metadirective
+   cannot be resolved.  */
+
+vec<struct omp_variant>
+omp_early_resolve_metadirective (tree metadirective)
+{
+  auto_vec <struct omp_variant> candidates;
+  tree variant = OMP_METADIRECTIVE_VARIANTS (metadirective);
+
+  gcc_assert (variant);
+  while (variant)
+    {
+      struct omp_variant candidate;
+
+      candidate.selector = OMP_METADIRECTIVE_VARIANT_SELECTOR (variant);
+      candidate.alternative = OMP_METADIRECTIVE_VARIANT_DIRECTIVE (variant);
+      candidate.body = OMP_METADIRECTIVE_VARIANT_BODY (variant);
+
+      candidates.safe_push (candidate);
+      variant = TREE_CHAIN (variant);
+    }
+
+  return omp_get_dynamic_candidates (candidates, true);
+}
+
 /* Encode an oacc launch argument.  This matches the GOMP_LAUNCH_PACK
    macro on gomp-constants.h.  We do not check for overflow.  */
 
diff --git a/gcc/omp-general.h b/gcc/omp-general.h
index b7d85a768ce..5807bc42cd7 100644
--- a/gcc/omp-general.h
+++ b/gcc/omp-general.h
@@ -91,6 +91,22 @@  struct omp_for_data
   tree adjn1;
 };
 
+/* Needs to be a GC-friendly widest_int variant, but precision is
+   desirable to be the same on all targets.  */
+typedef generic_wide_int <fixed_wide_int_storage <1024> > score_wide_int;
+
+/* A structure describing a variant in a metadirective.  */
+
+struct GTY(()) omp_variant
+{
+  score_wide_int score;
+  tree selector;
+  tree alternative;
+  tree body;
+  tree dynamic_selector;
+  bool resolvable_p : 1;
+};
+
 #define OACC_FN_ATTRIB "oacc function"
 
 /* Accessors for OMP context selectors, used by variant directives.
@@ -150,6 +166,15 @@  extern tree make_trait_set_selector (enum omp_tss_code, tree, tree);
 extern tree make_trait_selector (enum omp_ts_code, tree, tree, tree);
 extern tree make_trait_property (tree, tree, tree);
 
+/* Accessors and constructor for metadirective variants.  */
+#define OMP_METADIRECTIVE_VARIANT_SELECTOR(v) \
+  TREE_PURPOSE (v)
+#define OMP_METADIRECTIVE_VARIANT_DIRECTIVE(v) \
+  TREE_PURPOSE (TREE_VALUE (v))
+#define OMP_METADIRECTIVE_VARIANT_BODY(v) \
+  TREE_VALUE (TREE_VALUE (v))
+extern tree make_omp_metadirective_variant (tree, tree, tree);
+
 extern tree omp_find_clause (tree clauses, enum omp_clause_code kind);
 extern bool omp_is_allocatable_or_ptr (tree decl);
 extern tree omp_check_optional_argument (tree decl, bool for_present_check);
@@ -166,15 +191,17 @@  extern poly_uint64 omp_max_vf (void);
 extern int omp_max_simt_vf (void);
 extern const char *omp_context_name_list_prop (tree);
 extern void omp_construct_traits_to_codes (tree, int, enum tree_code *);
-extern tree omp_check_context_selector (location_t loc, tree ctx);
+extern tree omp_check_context_selector (location_t loc, tree ctx,
+					bool metadirective_p);
 extern void omp_mark_declare_variant (location_t loc, tree variant,
 				      tree construct);
-extern int omp_context_selector_matches (tree);
+extern int omp_context_selector_matches (tree, bool, bool);
 extern int omp_context_selector_set_compare (enum omp_tss_code, tree, tree);
 extern tree omp_get_context_selector (tree, enum omp_tss_code,
 				      enum omp_ts_code);
 extern tree omp_get_context_selector_list (tree, enum omp_tss_code);
 extern tree omp_resolve_declare_variant (tree);
+extern vec<struct omp_variant> omp_early_resolve_metadirective (tree);
 extern tree oacc_launch_pack (unsigned code, tree device, unsigned op);
 extern tree oacc_replace_fn_attrib_attr (tree attribs, tree dims);
 extern void oacc_replace_fn_attrib (tree fn, tree dims);
diff --git a/gcc/tree-pretty-print.cc b/gcc/tree-pretty-print.cc
index f9ad8562078..815f9b472a0 100644
--- a/gcc/tree-pretty-print.cc
+++ b/gcc/tree-pretty-print.cc
@@ -1503,7 +1503,7 @@  dump_omp_clauses (pretty_printer *pp, tree clause, int spc, dump_flags_t flags,
 }
 
 /* Dump an OpenMP context selector CTX to PP.  */
-static void
+void
 dump_omp_context_selector (pretty_printer *pp, tree ctx, int spc,
 			   dump_flags_t flags)
 {
@@ -4019,6 +4019,40 @@  dump_generic_node (pretty_printer *pp, tree node, int spc, dump_flags_t flags,
       is_expr = false;
       break;
 
+    case OMP_METADIRECTIVE:
+      {
+	pp_string (pp, "#pragma omp metadirective");
+	newline_and_indent (pp, spc + 2);
+	pp_left_brace (pp);
+
+	tree variant = OMP_METADIRECTIVE_VARIANTS (node);
+	while (variant != NULL_TREE)
+	  {
+	    tree selector = OMP_METADIRECTIVE_VARIANT_SELECTOR (variant);
+	    tree directive = OMP_METADIRECTIVE_VARIANT_DIRECTIVE (variant);
+	    tree body = OMP_METADIRECTIVE_VARIANT_BODY (variant);
+
+	    newline_and_indent (pp, spc + 4);
+	    if (selector == NULL_TREE)
+	      pp_string (pp, "default:");
+	    else
+	      {
+		pp_string (pp, "when (");
+		dump_omp_context_selector (pp, selector, spc + 4, flags);
+		pp_string (pp, "):");
+	      }
+	    newline_and_indent (pp, spc + 6);
+
+	    dump_generic_node (pp, directive, spc + 6, flags, false);
+	    newline_and_indent (pp, spc + 6);
+	    dump_generic_node (pp, body, spc + 6, flags, false);
+	    variant = TREE_CHAIN (variant);
+	  }
+	newline_and_indent (pp, spc + 2);
+	pp_right_brace (pp);
+      }
+      break;
+
     case TRANSACTION_EXPR:
       if (TRANSACTION_EXPR_OUTER (node))
 	pp_string (pp, "__transaction_atomic [[outer]]");
diff --git a/gcc/tree-pretty-print.h b/gcc/tree-pretty-print.h
index 0da6242629b..ac6467935c1 100644
--- a/gcc/tree-pretty-print.h
+++ b/gcc/tree-pretty-print.h
@@ -45,6 +45,8 @@  extern void dump_omp_atomic_memory_order (pretty_printer *,
 					  enum omp_memory_order);
 extern void dump_omp_loop_non_rect_expr (pretty_printer *, tree, int,
 					 dump_flags_t);
+extern void dump_omp_context_selector (pretty_printer *, tree, int,
+				       dump_flags_t);
 extern void print_omp_context_selector (FILE *, tree, dump_flags_t);
 extern int dump_generic_node (pretty_printer *, tree, int, dump_flags_t, bool);
 extern void print_declaration (pretty_printer *, tree, int, dump_flags_t);
diff --git a/gcc/tree.def b/gcc/tree.def
index 24128e1e039..f909ae0a443 100644
--- a/gcc/tree.def
+++ b/gcc/tree.def
@@ -1340,6 +1340,12 @@  DEFTREECODE (OMP_TARGET_ENTER_DATA, "omp_target_enter_data", tcc_statement, 1)
    Operand 0: OMP_TARGET_EXIT_DATA_CLAUSES: List of clauses.  */
 DEFTREECODE (OMP_TARGET_EXIT_DATA, "omp_target_exit_data", tcc_statement, 1)
 
+/* OpenMP - #pragma omp metadirective [variant1 ... variantN]
+   Operand 0: OMP_METADIRECTIVE_VARIANTS: List of selectors and directive
+   variants.  Use the interface in omp-general.h to construct variants
+   and access their fields.  */
+DEFTREECODE (OMP_METADIRECTIVE, "omp_metadirective", tcc_statement, 1)
+
 /* OMP_ATOMIC through OMP_ATOMIC_CAPTURE_NEW must be consecutive,
    or OMP_ATOMIC_SEQ_CST needs adjusting.  */
 
diff --git a/gcc/tree.h b/gcc/tree.h
index ee2aae332a4..5e3f27fe6c6 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -1596,6 +1596,9 @@  class auto_suppress_location_wrappers
 #define OMP_TARGET_EXIT_DATA_CLAUSES(NODE)\
   TREE_OPERAND (OMP_TARGET_EXIT_DATA_CHECK (NODE), 0)
 
+#define OMP_METADIRECTIVE_VARIANTS(NODE) \
+  TREE_OPERAND (OMP_METADIRECTIVE_CHECK (NODE), 0)
+
 #define OMP_SCAN_BODY(NODE)	TREE_OPERAND (OMP_SCAN_CHECK (NODE), 0)
 #define OMP_SCAN_CLAUSES(NODE)	TREE_OPERAND (OMP_SCAN_CHECK (NODE), 1)