Message ID | b4b56c8e-4c35-132c-1a75-0fe18ee07d71@codesourcery.com |
---|---|
State | New |
Headers | show |
Series | OpenMP: Fix use_device_{addr,ptr} with in-data-sharing arg | expand |
On Wed, Apr 20, 2022 at 03:19:38PM +0200, Tobias Burnus wrote: > For array-descriptor vars, the descriptor is assigned to a temporary. However, > this failed when the clause's argument was in turn in a data-sharing clause > as the outer context's VALUE_EXPR wasn't used. > > gcc/ChangeLog: > > * omp-low.cc (lower_omp_target): Fix use_device_{addr,ptr} with list > item that is in an outer data-sharing clause. > > libgomp/ChangeLog: > > * testsuite/libgomp.fortran/use_device_addr-5.f90: New test. > > gcc/omp-low.cc | 22 ++-- > .../libgomp.fortran/use_device_addr-5.f90 | 143 +++++++++++++++++++++ > 2 files changed, 156 insertions(+), 9 deletions(-) > > diff --git a/gcc/omp-low.cc b/gcc/omp-low.cc > index bf5779b6543..6e387fd9a61 100644 > --- a/gcc/omp-low.cc > +++ b/gcc/omp-low.cc > @@ -13656,26 +13656,30 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx) > new_var = lookup_decl (var, ctx); > new_var = DECL_VALUE_EXPR (new_var); > tree v = new_var; > + tree v2 = var; > + if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_USE_DEVICE_PTR > + || OMP_CLAUSE_CODE (c) == OMP_CLAUSE_USE_DEVICE_ADDR) > + { > + v2 = maybe_lookup_decl_in_outer_ctx (var, ctx); > + if (DECL_HAS_VALUE_EXPR_P (v2)) > + v2 = DECL_VALUE_EXPR (v2); I don't understand the above 2 lines, why do you need that? Regardless whether v2 has DECL_VALUE_EXPR or not, the type of the DECL_VALUE_EXPR (v2) and v2 should be the same, build_fold_indirect_ref should work on both and then v2 is only used as second operand of gimplify_assign, where the gimplifier makes sure to handle DECL_VALUE_EXPR correctly. I certainly don't see any difference in the *.omplower dump if I comment out the above 2 lines. Otherwise LGTM, so if the 2 lines aren't needed, please also drop the {}s around v2 = maybe_lookup_decl_in_outer_ctx (var, ctx); and reindent. > + } > > if (is_ref) > { > - var = build_fold_indirect_ref (var); > - gimplify_expr (&var, &assign_body, NULL, is_gimple_val, > - fb_rvalue); > - v = create_tmp_var_raw (TREE_TYPE (var), get_name (var)); > + v2 = build_fold_indirect_ref (v2); > + v = create_tmp_var_raw (TREE_TYPE (v2), get_name (var)); > gimple_add_tmp_var (v); > TREE_ADDRESSABLE (v) = 1; > - gimple_seq_add_stmt (&assign_body, > - gimple_build_assign (v, var)); > + gimplify_assign (v, v2, &assign_body); > tree rhs = build_fold_addr_expr (v); > gimple_seq_add_stmt (&assign_body, > gimple_build_assign (new_var, rhs)); > } > else > - gimple_seq_add_stmt (&assign_body, > - gimple_build_assign (new_var, var)); > + gimplify_assign (new_var, v2, &assign_body); > > - tree v2 = lang_hooks.decls.omp_array_data (unshare_expr (v), false); > + v2 = lang_hooks.decls.omp_array_data (unshare_expr (v), false); > gcc_assert (v2); > gimplify_expr (&x, &assign_body, NULL, is_gimple_val, fb_rvalue); > gimple_seq_add_stmt (&assign_body, Jakub
Hi Jakub, On 04.05.22 14:03, Jakub Jelinek wrote: > On Wed, Apr 20, 2022 at 03:19:38PM +0200, Tobias Burnus wrote: >> --- a/gcc/omp-low.cc >> +++ b/gcc/omp-low.cc >> @@ -13656,26 +13656,30 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx) >> new_var = lookup_decl (var, ctx); >> new_var = DECL_VALUE_EXPR (new_var); >> tree v = new_var; >> + tree v2 = var; >> + if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_USE_DEVICE_PTR >> + || OMP_CLAUSE_CODE (c) == OMP_CLAUSE_USE_DEVICE_ADDR) >> + { >> + v2 = maybe_lookup_decl_in_outer_ctx (var, ctx); >> + if (DECL_HAS_VALUE_EXPR_P (v2)) >> + v2 = DECL_VALUE_EXPR (v2); > I don't understand the above 2 lines, why do you need that? I think it was intermittently required with some (half-)working patch. But I concur that it is no longer is needed. > Otherwise LGTM, so if the 2 lines aren't needed, please also drop the > {}s around v2 = maybe_lookup_decl_in_outer_ctx (var, ctx); and reindent. I did so, tested it also on my end and committed it as r13-116-g3f8c389fe90bf565a6221a46bb7fb745dd4c1510 Thanks for the review! Tobias ----------------- Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
Hi! On 2022-04-20T15:19:38+0200, Tobias Burnus <tobias@codesourcery.com> wrote: > For > omp parallel shared(array_desc_var) > the shared-variable is passed to the generated function as > argument - and replaced by a DECL_VALUE_EXPR inside the parallel region. > > If inside the parallel region, a > > omp target data has_device_addr(array_descr_var) > > is used, the latter generates a > omp_arr->array_descr_var = &array_descr_var.data; > ... > tmp_desc = array_descr_var > tmp_desc.data = omp_o->array_descr_var > > that is: 'tmp_desc' gets assigned the original descriptor > and only the data components is updated. > > > However, if that's inside the parallel region, not 'array_descr_var' > has to be used – but the value expression ('omp_i->array_descr_var'). > > Fixed by searching the variable used in use_device_{addr,ptr} in the > outer OpenMP context – and then checking for a DECL_VALUE_EXPR. I wonder if corresponding OpenACC clause needs similar consideration -- or maybe is covered with this 'OMP_CLAUSE_USE_DEVICE_PTR', 'OMP_CLAUSE_USE_DEVICE_ADDR' handling here (haven't looked yet). > --- /dev/null > +++ b/libgomp/testsuite/libgomp.fortran/use_device_addr-5.f90 > @@ -0,0 +1,143 @@ > +program main > + use omp_lib > + implicit none > + integer, allocatable :: aaa(:,:,:) > + integer :: i > + > + allocate (aaa(-4:10,-3:8,2)) > + aaa(:,:,:) = reshape ([(i, i = 1, size(aaa))], shape(aaa)) > + > + do i = 0, omp_get_num_devices() > + !$omp target data map(to: aaa) > + call test_addr (aaa, i) > + call test_ptr (aaa, i) > + !$omp end target data > + end do Pushed to master branch commit 798152475559a6be8049692932cc747c6499e7f5 "Fix up 'libgomp.fortran/use_device_addr-5.f90' multi-device testing", see attached. Grüße Thomas > + deallocate (aaa) > + > +contains > + > + subroutine test_addr (aaaa, dev) > + use iso_c_binding > + integer, target, allocatable :: aaaa(:,:,:), bbbb(:,:,:) > + integer, value :: dev > + integer :: i > + type(c_ptr) :: ptr > + logical :: is_shared > + > + is_shared = .false. > + !$omp target device(dev) map(to: is_shared) > + is_shared = .true. > + !$omp end target > + > + allocate (bbbb(-4:10,-3:8,2)) > + bbbb(:,:,:) = reshape ([(-i, i = 1, size(bbbb))], shape(bbbb)) > + !$omp target enter data map(to: bbbb) device(dev) > + if (any (lbound (aaaa) /= [-4, -3, 1])) error stop 1 > + if (any (shape (aaaa) /= [15, 12, 2])) error stop 2 > + if (any (lbound (bbbb) /= [-4, -3, 1])) error stop 3 > + if (any (shape (bbbb) /= [15, 12, 2])) error stop 4 > + if (any (aaaa /= -bbbb)) error stop 5 > + if (any (aaaa /= reshape ([(i, i = 1, size(aaaa))], shape(aaaa)))) & > + error stop 6 > + > + !$omp parallel do shared(bbbb, aaaa) > + do i = 1,1 > + if (any (lbound (aaaa) /= [-4, -3, 1])) error stop 5 > + if (any (shape (aaaa) /= [15, 12, 2])) error stop 6 > + if (any (lbound (bbbb) /= [-4, -3, 1])) error stop 7 > + if (any (shape (bbbb) /= [15, 12, 2])) error stop 8 > + if (any (aaaa /= -bbbb)) error stop 5 > + if (any (aaaa /= reshape ([(i, i = 1, size(aaaa))], shape(aaaa)))) & > + error stop 6 > + ptr = c_loc (aaaa) > + !$omp target data use_device_addr(bbbb, aaaa) device(dev) > + if (any (lbound (aaaa) /= [-4, -3, 1])) error stop 9 > + if (any (shape (aaaa) /= [15, 12, 2])) error stop 10 > + if (any (lbound (bbbb) /= [-4, -3, 1])) error stop 11 > + if (any (shape (bbbb) /= [15, 12, 2])) error stop 12 > + if (is_shared) then > + if (any (aaaa /= -bbbb)) error stop 5 > + if (any (aaaa /= reshape ([(i, i = 1, size(aaaa))], shape(aaaa)))) & > + error stop 6 > + end if > + if (is_shared .neqv. c_associated (ptr, c_loc (aaaa))) error stop > + > + !$omp target has_device_addr(bbbb, aaaa) device(dev) > + if (any (lbound (aaaa) /= [-4, -3, 1])) error stop 9 > + if (any (shape (aaaa) /= [15, 12, 2])) error stop 10 > + if (any (lbound (bbbb) /= [-4, -3, 1])) error stop 11 > + if (any (shape (bbbb) /= [15, 12, 2])) error stop 12 > + if (any (aaaa /= -bbbb)) error stop 5 > + if (any (aaaa /= reshape ([(i, i = 1, size(aaaa))], shape(aaaa)))) & > + error stop 6 > + !$omp end target > + !$omp end target data > + end do > + !$omp target exit data map(delete: bbbb) device(dev) > + deallocate (bbbb) > + end subroutine test_addr > + > + subroutine test_ptr (aaaa, dev) > + use iso_c_binding > + integer, target, allocatable :: aaaa(:,:,:), bbbb(:,:,:) > + integer, value :: dev > + integer :: i > + type(c_ptr) :: ptr > + logical :: is_shared > + > + is_shared = .false. > + !$omp target device(dev) map(to: is_shared) > + is_shared = .true. > + !$omp end target > + > + allocate (bbbb(-4:10,-3:8,2)) > + bbbb(:,:,:) = reshape ([(-i, i = 1, size(bbbb))], shape(bbbb)) > + !$omp target enter data map(to: bbbb) device(dev) > + if (any (lbound (aaaa) /= [-4, -3, 1])) error stop 1 > + if (any (shape (aaaa) /= [15, 12, 2])) error stop 2 > + if (any (lbound (bbbb) /= [-4, -3, 1])) error stop 3 > + if (any (shape (bbbb) /= [15, 12, 2])) error stop 4 > + if (any (aaaa /= -bbbb)) error stop 5 > + if (any (aaaa /= reshape ([(i, i = 1, size(aaaa))], shape(aaaa)))) & > + error stop 6 > + > + !$omp parallel do shared(bbbb, aaaa) > + do i = 1,1 > + if (any (lbound (aaaa) /= [-4, -3, 1])) error stop 5 > + if (any (shape (aaaa) /= [15, 12, 2])) error stop 6 > + if (any (lbound (bbbb) /= [-4, -3, 1])) error stop 7 > + if (any (shape (bbbb) /= [15, 12, 2])) error stop 8 > + if (any (aaaa /= -bbbb)) error stop 5 > + if (any (aaaa /= reshape ([(i, i = 1, size(aaaa))], shape(aaaa)))) & > + error stop 6 > + ptr = c_loc (aaaa) > + !$omp target data use_device_ptr(bbbb, aaaa) device(dev) > + if (any (lbound (aaaa) /= [-4, -3, 1])) error stop 9 > + if (any (shape (aaaa) /= [15, 12, 2])) error stop 10 > + if (any (lbound (bbbb) /= [-4, -3, 1])) error stop 11 > + if (any (shape (bbbb) /= [15, 12, 2])) error stop 12 > + if (is_shared) then > + if (any (aaaa /= -bbbb)) error stop 5 > + if (any (aaaa /= reshape ([(i, i = 1, size(aaaa))], shape(aaaa)))) & > + error stop 6 > + end if > + if (is_shared .neqv. c_associated (ptr, c_loc (aaaa))) error stop > + > + ! Uses has_device_addr due to PR fortran/105318 > + !!$omp target is_device_ptr(bbbb, aaaa) device(dev) > + !$omp target has_device_addr(bbbb, aaaa) device(dev) > + if (any (lbound (aaaa) /= [-4, -3, 1])) error stop 9 > + if (any (shape (aaaa) /= [15, 12, 2])) error stop 10 > + if (any (lbound (bbbb) /= [-4, -3, 1])) error stop 11 > + if (any (shape (bbbb) /= [15, 12, 2])) error stop 12 > + if (any (aaaa /= -bbbb)) error stop 5 > + if (any (aaaa /= reshape ([(i, i = 1, size(aaaa))], shape(aaaa)))) & > + error stop 6 > + !$omp end target > + !$omp end target data > + end do > + !$omp target exit data map(delete: bbbb) device(dev) > + deallocate (bbbb) > + end subroutine test_ptr > +end program main ----------------- Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
OpenMP: Fix use_device_{addr,ptr} with in-data-sharing arg For array-descriptor vars, the descriptor is assigned to a temporary. However, this failed when the clause's argument was in turn in a data-sharing clause as the outer context's VALUE_EXPR wasn't used. gcc/ChangeLog: * omp-low.cc (lower_omp_target): Fix use_device_{addr,ptr} with list item that is in an outer data-sharing clause. libgomp/ChangeLog: * testsuite/libgomp.fortran/use_device_addr-5.f90: New test. gcc/omp-low.cc | 22 ++-- .../libgomp.fortran/use_device_addr-5.f90 | 143 +++++++++++++++++++++ 2 files changed, 156 insertions(+), 9 deletions(-) diff --git a/gcc/omp-low.cc b/gcc/omp-low.cc index bf5779b6543..6e387fd9a61 100644 --- a/gcc/omp-low.cc +++ b/gcc/omp-low.cc @@ -13656,26 +13656,30 @@ lower_omp_target (gimple_stmt_iterator *gsi_p, omp_context *ctx) new_var = lookup_decl (var, ctx); new_var = DECL_VALUE_EXPR (new_var); tree v = new_var; + tree v2 = var; + if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_USE_DEVICE_PTR + || OMP_CLAUSE_CODE (c) == OMP_CLAUSE_USE_DEVICE_ADDR) + { + v2 = maybe_lookup_decl_in_outer_ctx (var, ctx); + if (DECL_HAS_VALUE_EXPR_P (v2)) + v2 = DECL_VALUE_EXPR (v2); + } if (is_ref) { - var = build_fold_indirect_ref (var); - gimplify_expr (&var, &assign_body, NULL, is_gimple_val, - fb_rvalue); - v = create_tmp_var_raw (TREE_TYPE (var), get_name (var)); + v2 = build_fold_indirect_ref (v2); + v = create_tmp_var_raw (TREE_TYPE (v2), get_name (var)); gimple_add_tmp_var (v); TREE_ADDRESSABLE (v) = 1; - gimple_seq_add_stmt (&assign_body, - gimple_build_assign (v, var)); + gimplify_assign (v, v2, &assign_body); tree rhs = build_fold_addr_expr (v); gimple_seq_add_stmt (&assign_body, gimple_build_assign (new_var, rhs)); } else - gimple_seq_add_stmt (&assign_body, - gimple_build_assign (new_var, var)); + gimplify_assign (new_var, v2, &assign_body); - tree v2 = lang_hooks.decls.omp_array_data (unshare_expr (v), false); + v2 = lang_hooks.decls.omp_array_data (unshare_expr (v), false); gcc_assert (v2); gimplify_expr (&x, &assign_body, NULL, is_gimple_val, fb_rvalue); gimple_seq_add_stmt (&assign_body, diff --git a/libgomp/testsuite/libgomp.fortran/use_device_addr-5.f90 b/libgomp/testsuite/libgomp.fortran/use_device_addr-5.f90 new file mode 100644 index 00000000000..1def70a1bc0 --- /dev/null +++ b/libgomp/testsuite/libgomp.fortran/use_device_addr-5.f90 @@ -0,0 +1,143 @@ +program main + use omp_lib + implicit none + integer, allocatable :: aaa(:,:,:) + integer :: i + + allocate (aaa(-4:10,-3:8,2)) + aaa(:,:,:) = reshape ([(i, i = 1, size(aaa))], shape(aaa)) + + do i = 0, omp_get_num_devices() + !$omp target data map(to: aaa) + call test_addr (aaa, i) + call test_ptr (aaa, i) + !$omp end target data + end do + deallocate (aaa) + +contains + + subroutine test_addr (aaaa, dev) + use iso_c_binding + integer, target, allocatable :: aaaa(:,:,:), bbbb(:,:,:) + integer, value :: dev + integer :: i + type(c_ptr) :: ptr + logical :: is_shared + + is_shared = .false. + !$omp target device(dev) map(to: is_shared) + is_shared = .true. + !$omp end target + + allocate (bbbb(-4:10,-3:8,2)) + bbbb(:,:,:) = reshape ([(-i, i = 1, size(bbbb))], shape(bbbb)) + !$omp target enter data map(to: bbbb) device(dev) + if (any (lbound (aaaa) /= [-4, -3, 1])) error stop 1 + if (any (shape (aaaa) /= [15, 12, 2])) error stop 2 + if (any (lbound (bbbb) /= [-4, -3, 1])) error stop 3 + if (any (shape (bbbb) /= [15, 12, 2])) error stop 4 + if (any (aaaa /= -bbbb)) error stop 5 + if (any (aaaa /= reshape ([(i, i = 1, size(aaaa))], shape(aaaa)))) & + error stop 6 + + !$omp parallel do shared(bbbb, aaaa) + do i = 1,1 + if (any (lbound (aaaa) /= [-4, -3, 1])) error stop 5 + if (any (shape (aaaa) /= [15, 12, 2])) error stop 6 + if (any (lbound (bbbb) /= [-4, -3, 1])) error stop 7 + if (any (shape (bbbb) /= [15, 12, 2])) error stop 8 + if (any (aaaa /= -bbbb)) error stop 5 + if (any (aaaa /= reshape ([(i, i = 1, size(aaaa))], shape(aaaa)))) & + error stop 6 + ptr = c_loc (aaaa) + !$omp target data use_device_addr(bbbb, aaaa) device(dev) + if (any (lbound (aaaa) /= [-4, -3, 1])) error stop 9 + if (any (shape (aaaa) /= [15, 12, 2])) error stop 10 + if (any (lbound (bbbb) /= [-4, -3, 1])) error stop 11 + if (any (shape (bbbb) /= [15, 12, 2])) error stop 12 + if (is_shared) then + if (any (aaaa /= -bbbb)) error stop 5 + if (any (aaaa /= reshape ([(i, i = 1, size(aaaa))], shape(aaaa)))) & + error stop 6 + end if + if (is_shared .neqv. c_associated (ptr, c_loc (aaaa))) error stop + + !$omp target has_device_addr(bbbb, aaaa) device(dev) + if (any (lbound (aaaa) /= [-4, -3, 1])) error stop 9 + if (any (shape (aaaa) /= [15, 12, 2])) error stop 10 + if (any (lbound (bbbb) /= [-4, -3, 1])) error stop 11 + if (any (shape (bbbb) /= [15, 12, 2])) error stop 12 + if (any (aaaa /= -bbbb)) error stop 5 + if (any (aaaa /= reshape ([(i, i = 1, size(aaaa))], shape(aaaa)))) & + error stop 6 + !$omp end target + !$omp end target data + end do + !$omp target exit data map(delete: bbbb) device(dev) + deallocate (bbbb) + end subroutine test_addr + + subroutine test_ptr (aaaa, dev) + use iso_c_binding + integer, target, allocatable :: aaaa(:,:,:), bbbb(:,:,:) + integer, value :: dev + integer :: i + type(c_ptr) :: ptr + logical :: is_shared + + is_shared = .false. + !$omp target device(dev) map(to: is_shared) + is_shared = .true. + !$omp end target + + allocate (bbbb(-4:10,-3:8,2)) + bbbb(:,:,:) = reshape ([(-i, i = 1, size(bbbb))], shape(bbbb)) + !$omp target enter data map(to: bbbb) device(dev) + if (any (lbound (aaaa) /= [-4, -3, 1])) error stop 1 + if (any (shape (aaaa) /= [15, 12, 2])) error stop 2 + if (any (lbound (bbbb) /= [-4, -3, 1])) error stop 3 + if (any (shape (bbbb) /= [15, 12, 2])) error stop 4 + if (any (aaaa /= -bbbb)) error stop 5 + if (any (aaaa /= reshape ([(i, i = 1, size(aaaa))], shape(aaaa)))) & + error stop 6 + + !$omp parallel do shared(bbbb, aaaa) + do i = 1,1 + if (any (lbound (aaaa) /= [-4, -3, 1])) error stop 5 + if (any (shape (aaaa) /= [15, 12, 2])) error stop 6 + if (any (lbound (bbbb) /= [-4, -3, 1])) error stop 7 + if (any (shape (bbbb) /= [15, 12, 2])) error stop 8 + if (any (aaaa /= -bbbb)) error stop 5 + if (any (aaaa /= reshape ([(i, i = 1, size(aaaa))], shape(aaaa)))) & + error stop 6 + ptr = c_loc (aaaa) + !$omp target data use_device_ptr(bbbb, aaaa) device(dev) + if (any (lbound (aaaa) /= [-4, -3, 1])) error stop 9 + if (any (shape (aaaa) /= [15, 12, 2])) error stop 10 + if (any (lbound (bbbb) /= [-4, -3, 1])) error stop 11 + if (any (shape (bbbb) /= [15, 12, 2])) error stop 12 + if (is_shared) then + if (any (aaaa /= -bbbb)) error stop 5 + if (any (aaaa /= reshape ([(i, i = 1, size(aaaa))], shape(aaaa)))) & + error stop 6 + end if + if (is_shared .neqv. c_associated (ptr, c_loc (aaaa))) error stop + + ! Uses has_device_addr due to PR fortran/105318 + !!$omp target is_device_ptr(bbbb, aaaa) device(dev) + !$omp target has_device_addr(bbbb, aaaa) device(dev) + if (any (lbound (aaaa) /= [-4, -3, 1])) error stop 9 + if (any (shape (aaaa) /= [15, 12, 2])) error stop 10 + if (any (lbound (bbbb) /= [-4, -3, 1])) error stop 11 + if (any (shape (bbbb) /= [15, 12, 2])) error stop 12 + if (any (aaaa /= -bbbb)) error stop 5 + if (any (aaaa /= reshape ([(i, i = 1, size(aaaa))], shape(aaaa)))) & + error stop 6 + !$omp end target + !$omp end target data + end do + !$omp target exit data map(delete: bbbb) device(dev) + deallocate (bbbb) + end subroutine test_ptr +end program main