Message ID | 20180909230543.614f69cd@squid.athome |
---|---|
State | New |
Headers | show |
Series | [OpenACC] C++ reference mapping (PR middle-end/86336) | expand |
On Mon, Sep 10, 2018 at 4:05 AM, Julian Brown <julian@codesourcery.com> wrote: > This patch (by Cesar) changes the way C++ references are mapped in > OpenACC regions, fixing an ICE in the non-scalar-data.C testcase. > > Post-patch, references are mapped like this (from the omplower dump): > > map(force_present:*x [len: 4]) map(firstprivate ref:x [pointer assign, bias: 0]) > > Tested with offloading to NVPTX and bootstrapped. OK for trunk? > > Thanks, > > Julian > > ChangeLog > > 2018-09-09 Cesar Philippidis <cesar@codesourcery.com> > Julian Brown <julian@codesourcery.com> > > PR middle-end/86336 > > (gimplify_adjust_omp_clauses_1): Update handling of mapping of C++ > references. How is reference handling specified differently between OpenMP and OpenACC? It seems strange for them to differ. In any case, you shouldn't need to check lang_GNU_CXX since we're already calling the langhook. Jason
On 09/10/2018 10:37 AM, Jason Merrill wrote: > On Mon, Sep 10, 2018 at 4:05 AM, Julian Brown <julian@codesourcery.com> wrote: >> This patch (by Cesar) changes the way C++ references are mapped in >> OpenACC regions, fixing an ICE in the non-scalar-data.C testcase. >> >> Post-patch, references are mapped like this (from the omplower dump): >> >> map(force_present:*x [len: 4]) map(firstprivate ref:x [pointer assign, bias: 0]) >> >> Tested with offloading to NVPTX and bootstrapped. OK for trunk? >> >> Thanks, >> >> Julian >> >> ChangeLog >> >> 2018-09-09 Cesar Philippidis <cesar@codesourcery.com> >> Julian Brown <julian@codesourcery.com> >> >> PR middle-end/86336 >> >> (gimplify_adjust_omp_clauses_1): Update handling of mapping of C++ >> references. > > How is reference handling specified differently between OpenMP and > OpenACC? It seems strange for them to differ. Both OpenACC and OpenMP privatize mapped array pointers on the accelerator for subarrays in the same way. However, for pointers without subarrays, OpenMP treats them as zero-length arrays, whereas OpenACC treats them as ordinary scalars so that the pointer target will not get remapped on the accelerator (which is odd because there's a deviceptr clause for that). Scalars in C++ are special, because references must treated like an array of length one, for lack of a better terminology. > In any case, you shouldn't need to check lang_GNU_CXX since we're > already calling the langhook. Julian, can you look into this? I'm traveling tomorrow. Cesar
On Mon, 10 Sep 2018 10:52:47 -0700 Cesar Philippidis <cesar@codesourcery.com> wrote: > On 09/10/2018 10:37 AM, Jason Merrill wrote: > > On Mon, Sep 10, 2018 at 4:05 AM, Julian Brown > > <julian@codesourcery.com> wrote: > >> This patch (by Cesar) changes the way C++ references are mapped in > >> OpenACC regions, fixing an ICE in the non-scalar-data.C testcase. > >> > >> Post-patch, references are mapped like this (from the omplower > >> dump): > >> > >> map(force_present:*x [len: 4]) map(firstprivate ref:x [pointer > >> assign, bias: 0]) > >> > >> Tested with offloading to NVPTX and bootstrapped. OK for trunk? > >> > >> Thanks, > >> > >> Julian > >> > >> ChangeLog > >> > >> 2018-09-09 Cesar Philippidis <cesar@codesourcery.com> > >> Julian Brown <julian@codesourcery.com> > >> > >> PR middle-end/86336 > >> > >> (gimplify_adjust_omp_clauses_1): Update handling of > >> mapping of C++ references. > > > > How is reference handling specified differently between OpenMP and > > OpenACC? It seems strange for them to differ. > > Both OpenACC and OpenMP privatize mapped array pointers on the > accelerator for subarrays in the same way. However, for pointers > without subarrays, OpenMP treats them as zero-length arrays, whereas > OpenACC treats them as ordinary scalars so that the pointer target > will not get remapped on the accelerator (which is odd because > there's a deviceptr clause for that). Scalars in C++ are special, > because references must treated like an array of length one, for lack > of a better terminology. I think it's more accurate to say that OpenACC says nothing about C++ references at all, nor about how unadorned pointers are mapped in copy/copyin/copyout clauses. So arguably we get to choose whatever we want, preferably based on the principle of least surprise. (ICE'ing definitely counts as a surprise!) As noted in a previous email, PGI seems to treat pointers to aggregates specially, mapping them as ptr[0:1], but it's unclear if the same is true for pointers to scalars with their compiler. Neither behaviour seems to be standard-mandated, but this patch extends the idea to references to scalars nonetheless. > > In any case, you shouldn't need to check lang_GNU_CXX since we're > > already calling the langhook. > > Julian, can you look into this? I'm traveling tomorrow. Yes, I'll continue to look at this patch. Thanks, Julian
On Mon, Sep 10, 2018 at 7:07 PM, Julian Brown <julian@codesourcery.com> wrote: > On Mon, 10 Sep 2018 10:52:47 -0700 > Cesar Philippidis <cesar@codesourcery.com> wrote: > >> On 09/10/2018 10:37 AM, Jason Merrill wrote: >> > On Mon, Sep 10, 2018 at 4:05 AM, Julian Brown >> > <julian@codesourcery.com> wrote: >> >> This patch (by Cesar) changes the way C++ references are mapped in >> >> OpenACC regions, fixing an ICE in the non-scalar-data.C testcase. >> >> >> >> Post-patch, references are mapped like this (from the omplower >> >> dump): >> >> >> >> map(force_present:*x [len: 4]) map(firstprivate ref:x [pointer >> >> assign, bias: 0]) >> >> >> >> Tested with offloading to NVPTX and bootstrapped. OK for trunk? >> >> >> >> Thanks, >> >> >> >> Julian >> >> >> >> ChangeLog >> >> >> >> 2018-09-09 Cesar Philippidis <cesar@codesourcery.com> >> >> Julian Brown <julian@codesourcery.com> >> >> >> >> PR middle-end/86336 >> >> >> >> (gimplify_adjust_omp_clauses_1): Update handling of >> >> mapping of C++ references. >> > >> > How is reference handling specified differently between OpenMP and >> > OpenACC? It seems strange for them to differ. >> >> Both OpenACC and OpenMP privatize mapped array pointers on the >> accelerator for subarrays in the same way. However, for pointers >> without subarrays, OpenMP treats them as zero-length arrays, whereas >> OpenACC treats them as ordinary scalars so that the pointer target >> will not get remapped on the accelerator (which is odd because >> there's a deviceptr clause for that). Scalars in C++ are special, >> because references must treated like an array of length one, for lack >> of a better terminology. > > I think it's more accurate to say that OpenACC says nothing about C++ > references at all, nor about how unadorned pointers are mapped in > copy/copyin/copyout clauses. So arguably we get to choose whatever we > want, preferably based on the principle of least surprise. (ICE'ing > definitely counts as a surprise!) > > As noted in a previous email, PGI seems to treat pointers to > aggregates specially, mapping them as ptr[0:1], but it's unclear if the > same is true for pointers to scalars with their compiler. Neither > behaviour seems to be standard-mandated, but this patch extends the > idea to references to scalars nonetheless. That certainly seems like the most sensible way of handling references to non-arrays. And the 'this' pointer, incidentally. Should we not do the same for OpenMP? Jakub? Jason
On Mon, Sep 10, 2018 at 10:22:15PM +0100, Jason Merrill wrote: > > As noted in a previous email, PGI seems to treat pointers to > > aggregates specially, mapping them as ptr[0:1], but it's unclear if the > > same is true for pointers to scalars with their compiler. Neither > > behaviour seems to be standard-mandated, but this patch extends the > > idea to references to scalars nonetheless. > > That certainly seems like the most sensible way of handling references > to non-arrays. And the 'this' pointer, incidentally. Should we not > do the same for OpenMP? Jakub? OpenMP specifies what to do, though for 4.0, 4.5 and 5.0 it is all different (and also depends on defaultmap clause), I believe currently we implement what 4.5 says and when I'll try to implement the 5.0 version, I'll certainly try to follow the standard. With defaultmap, one can specify what will happen with various kinds of implicit mappings (map them as bits, firstprivatize them, for pointers handle them as zero length array sections, refuse to do any implicit mapping). E.g. part of what OpenMP 5.0 says is: ... - If a defaultmap clause is present for the category of the variable and specifies an implicit behavior other than default, the data-mapping attribute is determined by that clause. - If the target construct is within a class non-static member function, and a variable is an accessible data member of the object for which the non-static data member function is invoked, the variable is treated as if the this[:1] expression had appeared in a map clause with a map-type of tofrom. Additionally, if the variable is of a type pointer or reference to pointer, it is also treated as if it has appeared in a map clause as a zero-length array section. - If the this keyword is referenced inside a target construct within a class non-static member function, it is treated as if the this[:1] expression had appeared in a map clause with a map-type of tofrom. - A variable that is of type pointer is treated as if it is the base pointer of a zero-length array section that appeared as a list item in a map clause. - A variable that is of type reference to pointer is treated as if it had appeared in a map clause as a zero-length array section. ... - If the type of a list item is a reference to a type T then the reference in the device data environment is initialized to refer to the object in the device data environment that corresponds to the object referenced by the list item. If mapping occurs, it occurs as though the object were mapped through a pointer with an array section of type T and length one. - No type mapped through a reference can contain a reference to its own type, or any cycle of references to types that could produce a cycle of references. ... Jakub
On Mon, 10 Sep 2018 22:22:15 +0100 Jason Merrill <jason@redhat.com> wrote: > On Mon, Sep 10, 2018 at 7:07 PM, Julian Brown > <julian@codesourcery.com> wrote: > > I think it's more accurate to say that OpenACC says nothing about > > C++ references at all, nor about how unadorned pointers are mapped > > in copy/copyin/copyout clauses. So arguably we get to choose > > whatever we want, preferably based on the principle of least > > surprise. (ICE'ing definitely counts as a surprise!) > > > > As noted in a previous email, PGI seems to treat pointers to > > aggregates specially, mapping them as ptr[0:1], but it's unclear if > > the same is true for pointers to scalars with their compiler. > > Neither behaviour seems to be standard-mandated, but this patch > > extends the idea to references to scalars nonetheless. > > That certainly seems like the most sensible way of handling references > to non-arrays. [...] To try to clarify things for myself a bit, I tried to figure out better what the current OpenMP behaviour in GCC is, and what the equivalent OpenACC behaviour should be. I think the handling of references can and should match between the two APIs (though implementation details of the patch to make that so need a little work still). Pointers (without array sections) are a little more awkward: going by what OpenMP 4.5 and OpenACC 2.5 say, there does seem to be a deliberate difference in mapping behaviour, at least for cases that are specified. Previously, I was confusing the cases marked (*) and (**) below a little. So, we have: == OpenMP 4.5 ===================================================== #include <stdio.h> int main (int argc, char* argv[]) { int arr[32]; int &myref = arr[16]; int *myptr = &arr[18]; const char *sep = ""; for (int i = 0; i < 32; i++) arr[i] = i; //#pragma omp target // mapped as firstprivate: no effect on host //#pragma omp target defaultmap(tofrom:scalar) // works #pragma omp target map(tofrom:myref) // works { myref = 1000; } #pragma omp target enter data map(to:arr[0:32]) //#pragma omp target // works, mapped as zero-length array section (*) //#pragma omp target map(tofrom:myptr) // crashes (**) #pragma omp target map(tofrom:myptr[0:1]) // works { *myptr = 2000; } #pragma omp target exit data map(from:arr[0:32]) for (int i = 0; i < 32; i++, sep = ", ") printf ("%s%d", sep, arr[i]); printf ("\n"); return 0; } == OpenACC 2.5 ==================================================== #include <stdio.h> int main (int argc, char* argv[]) { int arr[32]; int &myref = arr[16]; int *myptr = &arr[18]; const char *sep = ""; for (int i = 0; i < 32; i++) arr[i] = i; //#pragma acc parallel // mapped as firstprivate: no effect on host #pragma acc parallel copy(myref) // works { myref = 1000; } #pragma acc enter data copyin(arr[0:32]) //#pragma acc parallel // crashes (*) //#pragma acc parallel copy(myptr) // crashes (**) //#pragma acc parallel copy(myptr[0:1]) // works //#pragma acc parallel present(myptr) // runtime error, not present #pragma acc parallel present(myptr[0:1]) // works { *myptr = 2000; } #pragma acc exit data copyout(arr[0:32]) for (int i = 0; i < 32; i++, sep = ", ") printf ("%s%d", sep, arr[i]); printf ("\n"); return 0; } =================================================================== The pointer-mapping cases marked (*), implicit mapping, are the ones specified in OpenMP 4.5 to map as zero-length array sections. For OpenACC the pointer is considered a scalar so is mapped as bits (so the host pointer causes the target to crash on dereference). The cases marked (**) -- also maybe applicable to C++ "this" -- currently copy as bits on OpenMP and on OpenACC, but could be changed to map like length-one array sections. Or, they could raise a warning. There's no apparent difference between OpenMP and OpenACC there though (in specified behaviour and/or implementation? Despite what I thought previously) so that's probably a decision for another day. Cheers, Julian
On Mon, Sep 10, 2018 at 08:31:49PM -0400, Julian Brown wrote: > #pragma omp target enter data map(to:arr[0:32]) > > //#pragma omp target // works, mapped as zero-length array section (*) > //#pragma omp target map(tofrom:myptr) // crashes (**) In this case OpenMP doesn't allow the implementation to choose what to do, it really should copy it bitwise in this case. There are cases where it will still work, e.g. if the pointer was in use_device_ptr clause on surrounding target data construct (doesn't even need to be visible to the compiler, could be in some non-visible caller), otherwise it would be a user bug. Warning might be useful, but only if we can prove the pointer value is really a host pointer, rather than use_device_ptr translated pointer or say something constructed out of CUDA or other APIs. > #pragma omp target map(tofrom:myptr[0:1]) // works > { > *myptr = 2000; > } > Jakub
On Mon, 10 Sep 2018 20:31:49 -0400 Julian Brown <julian@codesourcery.com> wrote: > [...] I think the handling of references can and should match between > the two APIs (though implementation details of the patch to make that > so need a little work still). Here's a new version of the patch, somewhat simplified and slightly more obviously making the treatment of references between OpenMP and OpenACC the same. I worried a little about the potential side-effects of making ctx->target_firstprivatize_array_bases true for parallel and kernels regions, but test results revealed no problems with doing that and I think generated code may even be a little better (and more consistent) in some cases. For example, one case that is handled differently now is as follows: #include <stdlib.h> __attribute__((noinline)) int bar (int c) { int arr[c]; #pragma acc parallel loop copy(arr) for (int i = 0; i < c; i++) arr[i] = i; for (int i = 0; i < c; i++) if (arr[i] != i) abort (); return arr[c - 1]; } int main (int argc, char *argv[]) { return bar (100); } The VLA was previously mapped as: #pragma omp target oacc_parallel map(tofrom:*arr.1 [len: D.2607]) \ map(alloc:arr [pointer assign, bias: 0]) firstprivate(c) and is now mapped as: #pragma omp target oacc_parallel map(tofrom:*arr.1 [len: D.2607]) \ map(firstprivate:arr [pointer assign, bias: 0]) firstprivate(c) Either works, but IIUC using firstprivate_pointer can be more efficient if the pointer is dereferenced multiple times in a kernel, since a local copy of the incoming mapped pointer is made per-thread/workitem. Generally, array sections are already using firstprivate pointers for their bases with OpenACC. Re-tested with offloading to NVPTX and bootstrapped. OK, or any other comments? Thanks, Julian ChangeLog 2018-09-09 Cesar Philippidis <cesar@codesourcery.com> Julian Brown <julian@codesourcery.com> PR middle-end/86336 gcc/cp/ * semantics.c (finish_omp_clauses): Treat C++ references the same in OpenACC as OpenMP. * gimplify.c (gimplify_scan_omp_clauses): Set target_firstprivatize_array_bases in OpenACC parallel and kernels region contexts. Remove GOMP_MAP_FIRSTPRIVATE_REFERENCE clauses from OpenACC data regions. libgomp/ * testsuite/libgomp.oacc-c++/non-scalar-data.C: Remove XFAIL. commit 6f3d5b86b4413722c3e7ab3ca9a678d7c35b68fe Author: Julian Brown <julian@codesourcery.com> Date: Thu Sep 6 15:32:50 2018 -0700 [OpenACC] C++ reference mapping 2018-09-09 Cesar Philippidis <cesar@codesourcery.com> Julian Brown <julian@codesourcery.com> PR middle-end/86336 gcc/cp/ * semantics.c (finish_omp_clauses): Treat C++ references the same in OpenACC as OpenMP. * gimplify.c (gimplify_scan_omp_clauses): Set target_firstprivatize_array_bases in OpenACC parallel and kernels region contexts. Remove GOMP_MAP_FIRSTPRIVATE_REFERENCE clauses from OpenACC data regions. libgomp/ * testsuite/libgomp.oacc-c++/non-scalar-data.C: Remove XFAIL. diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c index f3e5d83..bf3c63a 100644 --- a/gcc/cp/semantics.c +++ b/gcc/cp/semantics.c @@ -6878,7 +6878,7 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type ort) handle_map_references: if (!remove && !processing_template_decl - && (ort & C_ORT_OMP_DECLARE_SIMD) == C_ORT_OMP + && ort != C_ORT_DECLARE_SIMD && TYPE_REF_P (TREE_TYPE (OMP_CLAUSE_DECL (c)))) { t = OMP_CLAUSE_DECL (c); diff --git a/gcc/gimplify.c b/gcc/gimplify.c index dbd0f0e..f0eb04a 100644 --- a/gcc/gimplify.c +++ b/gcc/gimplify.c @@ -7513,6 +7513,8 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p, case OMP_TARGET_EXIT_DATA: case OACC_DECLARE: case OACC_HOST_DATA: + case OACC_PARALLEL: + case OACC_KERNELS: ctx->target_firstprivatize_array_bases = true; default: break; @@ -8556,7 +8558,8 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p, if (code == OACC_DATA && OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP - && OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_FIRSTPRIVATE_POINTER) + && (OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_FIRSTPRIVATE_POINTER + || OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_FIRSTPRIVATE_REFERENCE)) remove = true; if (remove) *list_p = OMP_CLAUSE_CHAIN (c); diff --git a/libgomp/testsuite/libgomp.oacc-c++/non-scalar-data.C b/libgomp/testsuite/libgomp.oacc-c++/non-scalar-data.C index 8e4b296..e5f8707 100644 --- a/libgomp/testsuite/libgomp.oacc-c++/non-scalar-data.C +++ b/libgomp/testsuite/libgomp.oacc-c++/non-scalar-data.C @@ -1,8 +1,7 @@ // Ensure that a non-scalar dummy arguments which are implicitly used inside // offloaded regions are properly mapped using present_or_copy semantics. -// { dg-xfail-if "TODO" { *-*-* } } -// { dg-excess-errors "ICE" } +// { dg-do run } #include <cassert>
On Tue, Sep 11, 2018 at 10:20:26PM -0400, Julian Brown wrote: > 2018-09-09 Cesar Philippidis <cesar@codesourcery.com> > Julian Brown <julian@codesourcery.com> > > PR middle-end/86336 > > gcc/cp/ > * semantics.c (finish_omp_clauses): Treat C++ references the same in > OpenACC as OpenMP. > > * gimplify.c (gimplify_scan_omp_clauses): Set > target_firstprivatize_array_bases in OpenACC parallel and kernels > region contexts. Remove GOMP_MAP_FIRSTPRIVATE_REFERENCE clauses from > OpenACC data regions. > > libgomp/ > * testsuite/libgomp.oacc-c++/non-scalar-data.C: Remove XFAIL. LGTM. Jakub
commit fed5f1044b3d7add83065b3bbe2ba2a95a1e95ce Author: Julian Brown <julian@codesourcery.com> Date: Thu Sep 6 15:32:50 2018 -0700 [OpenACC] C++ reference mapping 2018-09-09 Cesar Philippidis <cesar@codesourcery.com> Julian Brown <julian@codesourcery.com> gcc/cp/ * semantics.c (finish_omp_clauses): Map C++ references by value and FIRSTPRIVATE_REFERENCE. * gimplify.c (gimplify_scan_omp_clauses): Remove FIRSTPRIVATE_REFERENCE mappings in OpenACC data regions. (gimplify_adjust_omp_clauses_1): Update handling of mapping of C++ references. libgomp/ * testsuite/libgomp.oacc-c++/non-scalar-data.C: Remove XFAIL. diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c index 676de01..707f054 100644 --- a/gcc/cp/semantics.c +++ b/gcc/cp/semantics.c @@ -6877,7 +6877,8 @@ finish_omp_clauses (tree clauses, enum c_omp_region_type ort) handle_map_references: if (!remove && !processing_template_decl - && (ort & C_ORT_OMP_DECLARE_SIMD) == C_ORT_OMP + && ((ort & C_ORT_OMP_DECLARE_SIMD) == C_ORT_OMP + || ort == C_ORT_ACC) && TYPE_REF_P (TREE_TYPE (OMP_CLAUSE_DECL (c)))) { t = OMP_CLAUSE_DECL (c); diff --git a/gcc/gimplify.c b/gcc/gimplify.c index dbd0f0e..4011cb2 100644 --- a/gcc/gimplify.c +++ b/gcc/gimplify.c @@ -8556,7 +8556,8 @@ gimplify_scan_omp_clauses (tree *list_p, gimple_seq *pre_p, if (code == OACC_DATA && OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP - && OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_FIRSTPRIVATE_POINTER) + && (OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_FIRSTPRIVATE_POINTER + || OMP_CLAUSE_MAP_KIND (c) == GOMP_MAP_FIRSTPRIVATE_REFERENCE)) remove = true; if (remove) *list_p = OMP_CLAUSE_CHAIN (c); @@ -8872,7 +8873,9 @@ gimplify_adjust_omp_clauses_1 (splay_tree_node n, void *data) OMP_CLAUSE_CHAIN (nc) = OMP_CLAUSE_CHAIN (clause); OMP_CLAUSE_CHAIN (clause) = nc; } - else if (gimplify_omp_ctxp->target_firstprivatize_array_bases + else if ((((gimplify_omp_ctxp->region_type & ORT_ACC) + && lang_GNU_CXX ()) + || gimplify_omp_ctxp->target_firstprivatize_array_bases) && lang_hooks.decls.omp_privatize_by_reference (decl)) { OMP_CLAUSE_DECL (clause) = build_simple_mem_ref (decl); diff --git a/libgomp/testsuite/libgomp.oacc-c++/non-scalar-data.C b/libgomp/testsuite/libgomp.oacc-c++/non-scalar-data.C index 8e4b296..e5f8707 100644 --- a/libgomp/testsuite/libgomp.oacc-c++/non-scalar-data.C +++ b/libgomp/testsuite/libgomp.oacc-c++/non-scalar-data.C @@ -1,8 +1,7 @@ // Ensure that a non-scalar dummy arguments which are implicitly used inside // offloaded regions are properly mapped using present_or_copy semantics. -// { dg-xfail-if "TODO" { *-*-* } } -// { dg-excess-errors "ICE" } +// { dg-do run } #include <cassert>