From patchwork Wed Dec 18 06:02:27 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Julian Brown X-Patchwork-Id: 1211979 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=209.132.180.131; helo=sourceware.org; envelope-from=gcc-patches-return-516171-incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=codesourcery.com Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b="yDS9zEv5"; dkim-atps=neutral Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 47d4Fc31Z8z9sRs for ; Wed, 18 Dec 2019 17:03:48 +1100 (AEDT) DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding:content-type; q=dns; s= default; b=wzUQjIvad1sk+DH+ob0Md5aV8evMUHDj4qIzQXZ06TULv8xT/e+Ng BRtkIr3qwJxkb6ZNnzkeJ7PNeu3Lxt5XQo7JBSpG8UEqIp3u5IblVNpKoLU/3n3s 3saijazZp+2E3RLyWyWO4J/b9867JfuN4ZP7DWNbqMSprKpwZaEd7M= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding:content-type; s=default; bh=6z/4En2YOyelhcK2ct+uFiJaBWs=; b=yDS9zEv5hgFzztzM5qTu00e9/MRp c9SycnPOkhAQ0jFzOdva42r/jwxSpq8AfMtkTAtgWn9XaEjI+j/WX6Hxe5/9D8BV oazE1gu+ON/zwoZKFUKZRiXv0vAHCHBLh+SVt0fZzvY7iUHahUd7NjVLXvXZHuOM JQloi6RvpFk+ChM= Received: (qmail 97477 invoked by alias); 18 Dec 2019 06:03:12 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 97376 invoked by uid 89); 18 Dec 2019 06:03:12 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-23.8 required=5.0 tests=AWL, BAYES_00, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, KAM_SHORT autolearn=ham version=3.3.1 spammy=locked, 9397, 4537, 3378 X-HELO: esa1.mentor.iphmx.com Received: from esa1.mentor.iphmx.com (HELO esa1.mentor.iphmx.com) (68.232.129.153) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Wed, 18 Dec 2019 06:03:06 +0000 IronPort-SDR: Zup6xDftxT3er4DDFEB8oON3DP0Ud0aVIT7Bg+1+arFqZ8Inwit8SRMpjwyDiht/UjI5aRGjd4 sGQU58x1+TfP/OI63TMIbljd798h7XOx2N8GwCRddrUcAWodN3d1eN2faf86DmLZHG7wscbhKN NbV9K8WsBGSpxmdsYESL+DtgHnRyzL2YMMQ/bHm9EnqmksW5b5xVfNnsA4c+Qvy+E/LwZHgERM ezaNLzD2sRv0TlBQocDpjYMIa18M3iI2TU+iTj4V4gGIsvUTJY1WgGK02oqEeijojYQ0JW9n0D A3Q= Received: from orw-gwy-01-in.mentorg.com ([192.94.38.165]) by esa1.mentor.iphmx.com with ESMTP; 17 Dec 2019 22:03:04 -0800 IronPort-SDR: J7xdfArSsVz3URfvFCVan1LOk4CVACWpb3Q7uITyurlEztCC2CoCUSTztkCU4HDhC7gsi2dAbx 6o39V/YeWM8rax1t+z9XziiI/k6BoEPAVIZxj/U57za2MO2taqVyguoB2p6raOX35dkaoim1Zs QKbll/l13XeebwbRXNbN+JlSbdrKs2xaz/u/nQEJJVFo3ZrweCOxtGXjBWA4UfHpCmpycmbKq4 561UZtIbSRGxtX6n/iI1Hgzj+C91GXjQKHN+bf3KYlkfLiBVOrOXr//HvhS/5MDiigXIk6cOtj uCM= From: Julian Brown To: CC: Thomas Schwinge , Jakub Jelinek , Tobias Burnus , , Subject: [PATCH 02/13] OpenACC reference count overhaul Date: Tue, 17 Dec 2019 22:02:27 -0800 Message-ID: <491e3ca360313930f8f2f5686ffd386cf2fad04e.1576648001.git.julian@codesourcery.com> In-Reply-To: References: MIME-Version: 1.0 X-IsSubscribed: yes This is a rebased version of the reference-count overhaul patch last posted here: https://gcc.gnu.org/ml/gcc-patches/2019-11/msg02235.html This version omits parts of the above patch already committed upstream and merges some recent REFCOUNT_INFINITY changes. This patch causes the newish PR92843 test to fail, though IMO that test relies on behaviour arising from a rather nuanced reading of the spec. Hopefully we can resolve that problem as a follow-up. Tested alongside other patches in this series with offloading to NVPTX. OK? Julian 2019-11-22 Julian Brown Thomas Schwinge libgomp/ * libgomp.h (struct splay_tree_key_s): Substitute dynamic_refcount field for virtual_refcount. (enum gomp_map_vars_kind): Add GOMP_MAP_VARS_OPENACC_ENTER_DATA. (gomp_free_memmap): Remove prototype. * oacc-init.c (acc_shutdown_1): Iteratively call gomp_remove_var instead of calling gomp_free_memmap. * oacc-mem.c (acc_unmap_data): Open code instead of forcing target_mem_desc's to_free NULL then calling gomp_unmap_vars. Handle REFCOUNT_INFINITY on target blocks. (present_create_copy): Use virtual_refcount instead of dynamic_refcount. Re-do lookup for target pointer return value. (delete_copyout): Update for virtual_refcount semantics. (gomp_acc_insert_pointer, gomp_acc_remove_pointer, find_pointer): Remove functions. (find_group_last, goacc_enter_data_internal, goacc_exit_data_internal): New functions. (GOACC_enter_exit_data): Use goacc_enter_data_internal and goacc_exit_data_internal helper functions. * target.c (gomp_map_vars_internal): Handle GOMP_MAP_VARS_OPENACC_ENTER_DATA. Update for virtual_refcount semantics. (gomp_unmap_vars_internal): Update for virtual_refcount semantics. (gomp_load_image_to_device, omp_target_associate_ptr): Zero-initialise virtual_refcount field instead of dynamic_refcount. (gomp_free_memmap): Remove function. * testsuite/libgomp.oacc-c-c++-common/unmap-infinity-1.c: New test. * testsuite/libgomp.c-c++-common/unmap-infinity-2.c: New test. * testsuite/libgomp.oacc-c-c++-common/subset-subarray-mappings-1-r-p.c: Remove PR92848 TODOs. * testsuite/libgomp.oacc-c-c++-common/pr92843-1.c: Add XFAIL. --- libgomp/libgomp.h | 9 +- libgomp/oacc-init.c | 10 +- libgomp/oacc-mem.c | 399 +++++++----------- libgomp/target.c | 53 +-- .../libgomp.c-c++-common/unmap-infinity-2.c | 19 + .../libgomp.oacc-c-c++-common/pr92843-1.c | 1 + .../subset-subarray-mappings-1-r-p.c | 16 - .../unmap-infinity-1.c | 17 + 8 files changed, 228 insertions(+), 296 deletions(-) create mode 100644 libgomp/testsuite/libgomp.c-c++-common/unmap-infinity-2.c create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/unmap-infinity-1.c diff --git a/libgomp/libgomp.h b/libgomp/libgomp.h index 0f1f11284d5..865b9df2444 100644 --- a/libgomp/libgomp.h +++ b/libgomp/libgomp.h @@ -1007,8 +1007,11 @@ struct splay_tree_key_s { uintptr_t tgt_offset; /* Reference count. */ uintptr_t refcount; - /* Dynamic reference count. */ - uintptr_t dynamic_refcount; + /* Reference counts beyond those that represent genuine references in the + linked splay tree key/target memory structures, e.g. for multiple OpenACC + "present increment" operations (via "acc enter data") referring to the same + host-memory block. */ + uintptr_t virtual_refcount; struct splay_tree_aux *aux; }; @@ -1139,6 +1142,7 @@ struct gomp_device_descr enum gomp_map_vars_kind { GOMP_MAP_VARS_OPENACC, + GOMP_MAP_VARS_OPENACC_ENTER_DATA, GOMP_MAP_VARS_TARGET, GOMP_MAP_VARS_DATA, GOMP_MAP_VARS_ENTER_DATA @@ -1169,7 +1173,6 @@ extern void gomp_unmap_vars_async (struct target_mem_desc *, bool, struct goacc_asyncqueue *); extern void gomp_init_device (struct gomp_device_descr *); extern bool gomp_fini_device (struct gomp_device_descr *); -extern void gomp_free_memmap (struct splay_tree_s *); extern void gomp_unload_device (struct gomp_device_descr *); extern bool gomp_remove_var (struct gomp_device_descr *, splay_tree_key); extern void gomp_remove_var_async (struct gomp_device_descr *, splay_tree_key, diff --git a/libgomp/oacc-init.c b/libgomp/oacc-init.c index a444c604d59..dd88b58a379 100644 --- a/libgomp/oacc-init.c +++ b/libgomp/oacc-init.c @@ -370,7 +370,15 @@ acc_shutdown_1 (acc_device_t d) if (walk->dev) { gomp_mutex_lock (&walk->dev->lock); - gomp_free_memmap (&walk->dev->mem_map); + + while (walk->dev->mem_map.root) + { + splay_tree_key k = &walk->dev->mem_map.root->key; + if (k->aux) + k->aux->link_key = NULL; + gomp_remove_var (walk->dev, k); + } + gomp_mutex_unlock (&walk->dev->lock); walk->dev = NULL; diff --git a/libgomp/oacc-mem.c b/libgomp/oacc-mem.c index 196b7e2a520..2a0e7236b92 100644 --- a/libgomp/oacc-mem.c +++ b/libgomp/oacc-mem.c @@ -406,7 +406,7 @@ acc_map_data (void *h, void *d, size_t s) &kinds, true, GOMP_MAP_VARS_OPENACC); splay_tree_key n = tgt->list[0].key; assert (n->refcount == 1); - assert (n->dynamic_refcount == 0); + assert (n->virtual_refcount == 0); /* Special reference counting behavior. */ n->refcount = REFCOUNT_INFINITY; @@ -434,12 +434,9 @@ acc_unmap_data (void *h) acc_api_info api_info; bool profiling_p = GOACC_PROFILING_SETUP_P (thr, &prof_info, &api_info); - size_t host_size; - gomp_mutex_lock (&acc_dev->lock); splay_tree_key n = lookup_host (acc_dev, h, 1); - struct target_mem_desc *t; if (!n) { @@ -447,7 +444,7 @@ acc_unmap_data (void *h) gomp_fatal ("%p is not a mapped block", (void *)h); } - host_size = n->host_end - n->host_start; + size_t host_size = n->host_end - n->host_start; if (n->host_start != (uintptr_t) h) { @@ -456,7 +453,7 @@ acc_unmap_data (void *h) (void *) n->host_start, (int) host_size, (void *) h); } /* TODO This currently doesn't catch 'REFCOUNT_INFINITY' usage different from - 'acc_map_data'. Maybe 'dynamic_refcount' can be used for disambiguating + 'acc_map_data'. Maybe 'virtual_refcount' can be used for disambiguating the different 'REFCOUNT_INFINITY' cases, or simply separate 'REFCOUNT_INFINITY' values per different usage ('REFCOUNT_ACC_MAP_DATA' etc.)? */ @@ -468,24 +465,25 @@ acc_unmap_data (void *h) (void *) h, (int) host_size); } - /* Mark for removal. */ - n->refcount = 1; + splay_tree_remove (&acc_dev->mem_map, n); - t = n->tgt; + struct target_mem_desc *tgt = n->tgt; - if (t->refcount == 2) + if (tgt->refcount == REFCOUNT_INFINITY) { - /* This is the last reference, so pull the descriptor off the - chain. This avoids gomp_unmap_vars via gomp_unmap_tgt from - freeing the device memory. */ - t->tgt_end = 0; - t->to_free = 0; + gomp_mutex_unlock (&acc_dev->lock); + gomp_fatal ("cannot unmap target block"); + } + else if (tgt->refcount > 1) + tgt->refcount--; + else + { + free (tgt->array); + free (tgt); } gomp_mutex_unlock (&acc_dev->lock); - gomp_unmap_vars (t, true); - if (profiling_p) { thr->prof_info = NULL; @@ -545,8 +543,10 @@ present_create_copy (unsigned f, void *h, size_t s, int async) assert (n->refcount != REFCOUNT_LINK); if (n->refcount != REFCOUNT_INFINITY) - n->refcount++; - n->dynamic_refcount++; + { + n->refcount++; + n->virtual_refcount++; + } gomp_mutex_unlock (&acc_dev->lock); } @@ -557,7 +557,6 @@ present_create_copy (unsigned f, void *h, size_t s, int async) } else { - struct target_mem_desc *tgt; size_t mapnum = 1; unsigned short kinds; void *hostaddrs = h; @@ -571,14 +570,16 @@ present_create_copy (unsigned f, void *h, size_t s, int async) goacc_aq aq = get_goacc_asyncqueue (async); - tgt = gomp_map_vars_async (acc_dev, aq, mapnum, &hostaddrs, NULL, &s, - &kinds, true, GOMP_MAP_VARS_OPENACC); - n = tgt->list[0].key; - assert (n->refcount == 1); - assert (n->dynamic_refcount == 0); - n->dynamic_refcount++; + gomp_map_vars_async (acc_dev, aq, mapnum, &hostaddrs, NULL, &s, &kinds, + true, GOMP_MAP_VARS_OPENACC_ENTER_DATA); - d = tgt->to_free; + gomp_mutex_lock (&acc_dev->lock); + n = lookup_host (acc_dev, h, s); + assert (n != NULL); + assert (n->tgt_offset == 0); + assert ((uintptr_t) h == n->host_start); + d = (void *) n->tgt->tgt_start; + gomp_mutex_unlock (&acc_dev->lock); } if (profiling_p) @@ -696,26 +697,21 @@ delete_copyout (unsigned f, void *h, size_t s, int async, const char *libfnname) (void *) h, (int) s, (void *) n->host_start, (int) host_size); } - assert (n->refcount != REFCOUNT_LINK); - if (n->refcount != REFCOUNT_INFINITY - && n->refcount < n->dynamic_refcount) - { - gomp_mutex_unlock (&acc_dev->lock); - gomp_fatal ("Dynamic reference counting assert fail\n"); - } - if (f & FLAG_FINALIZE) { if (n->refcount != REFCOUNT_INFINITY) - n->refcount -= n->dynamic_refcount; - n->dynamic_refcount = 0; + n->refcount -= n->virtual_refcount; + n->virtual_refcount = 0; } - else if (n->dynamic_refcount) + + if (n->virtual_refcount > 0) { if (n->refcount != REFCOUNT_INFINITY) n->refcount--; - n->dynamic_refcount--; + n->virtual_refcount--; } + else if (n->refcount > 0 && n->refcount != REFCOUNT_INFINITY) + n->refcount--; if (n->refcount == 0) { @@ -870,154 +866,138 @@ acc_update_self_async (void *h, size_t s, int async) update_dev_host (0, h, s, async); } +/* Some types of (pointer) variables use several consecutive mappings, which + must be treated as a group for enter/exit data directives. This function + returns the last mapping in such a group (inclusive), or POS for singleton + mappings. */ -/* OpenACC 'enter data', 'exit data': 'GOACC_enter_exit_data' and its helper - functions. */ - -/* Special handling for 'GOMP_MAP_POINTER', 'GOMP_MAP_TO_PSET'. - - Only the first mapping is considered in reference counting; the following - ones implicitly follow suit. */ - -static void -goacc_insert_pointer (size_t mapnum, void **hostaddrs, size_t *sizes, - void *kinds, int async) +static int +find_group_last (int pos, size_t mapnum, unsigned short *kinds) { - struct target_mem_desc *tgt; - struct goacc_thread *thr = goacc_thread (); - struct gomp_device_descr *acc_dev = thr->dev; - - if (*hostaddrs == NULL) - return; + unsigned char kind0 = kinds[pos] & 0xff; + int first_pos = pos, last_pos = pos; - if (acc_is_present (*hostaddrs, *sizes)) + if (kind0 == GOMP_MAP_TO_PSET) { - splay_tree_key n; - gomp_mutex_lock (&acc_dev->lock); - n = lookup_host (acc_dev, *hostaddrs, *sizes); - assert (n->refcount != REFCOUNT_INFINITY - && n->refcount != REFCOUNT_LINK); - gomp_mutex_unlock (&acc_dev->lock); - - tgt = n->tgt; - for (size_t i = 0; i < tgt->list_count; i++) - if (tgt->list[i].key == n) - { - for (size_t j = 0; j < mapnum; j++) - if (i + j < tgt->list_count && tgt->list[i + j].key) - { - tgt->list[i + j].key->refcount++; - tgt->list[i + j].key->dynamic_refcount++; - } - return; - } - /* Should not reach here. */ - gomp_fatal ("Dynamic refcount incrementing failed for pointer/pset"); + while (pos + 1 < mapnum && (kinds[pos + 1] & 0xff) == GOMP_MAP_POINTER) + last_pos = ++pos; + /* We expect at least one GOMP_MAP_POINTER after a GOMP_MAP_TO_PSET. */ + assert (last_pos > first_pos); + } + else + { + /* GOMP_MAP_ALWAYS_POINTER can only appear directly after some other + mapping. */ + if (pos + 1 < mapnum + && (kinds[pos + 1] & 0xff) == GOMP_MAP_ALWAYS_POINTER) + return pos + 1; + + /* We can have one or several GOMP_MAP_POINTER mappings after a to/from + (etc.) mapping. */ + while (pos + 1 < mapnum && (kinds[pos + 1] & 0xff) == GOMP_MAP_POINTER) + last_pos = ++pos; } - gomp_debug (0, " %s: prepare mappings\n", __FUNCTION__); - goacc_aq aq = get_goacc_asyncqueue (async); - tgt = gomp_map_vars_async (acc_dev, aq, mapnum, hostaddrs, - NULL, sizes, kinds, true, GOMP_MAP_VARS_OPENACC); - splay_tree_key n = tgt->list[0].key; - assert (n->refcount == 1); - assert (n->dynamic_refcount == 0); - n->dynamic_refcount++; - gomp_debug (0, " %s: mappings prepared\n", __FUNCTION__); + return last_pos; } +/* Map variables for OpenACC "enter data". We can't just call + gomp_map_vars_async once, because individual mapped variables might have + "exit data" called for them at different times. */ + static void -goacc_remove_pointer (void *h, size_t s, bool force_copyfrom, int async, - int finalize, int mapnum) +goacc_enter_data_internal (struct gomp_device_descr *acc_dev, size_t mapnum, + void **hostaddrs, size_t *sizes, + unsigned short *kinds, goacc_aq aq) { - struct goacc_thread *thr = goacc_thread (); - struct gomp_device_descr *acc_dev = thr->dev; - splay_tree_key n; - struct target_mem_desc *t; - int minrefs = (mapnum == 1) ? 2 : 3; - - if (!acc_is_present (h, s)) - return; - - gomp_mutex_lock (&acc_dev->lock); - - n = lookup_host (acc_dev, h, 1); - - if (!n) + for (size_t i = 0; i < mapnum; i++) { - gomp_mutex_unlock (&acc_dev->lock); - gomp_fatal ("%p is not a mapped block", (void *)h); - } - - gomp_debug (0, " %s: restore mappings\n", __FUNCTION__); + int group_last = find_group_last (i, mapnum, kinds); - t = n->tgt; + gomp_map_vars_async (acc_dev, aq, + (group_last - i) + 1, + &hostaddrs[i], NULL, + &sizes[i], &kinds[i], true, + GOMP_MAP_VARS_OPENACC_ENTER_DATA); - assert (n->refcount != REFCOUNT_INFINITY - && n->refcount != REFCOUNT_LINK); - if (n->refcount < n->dynamic_refcount) - { - gomp_mutex_unlock (&acc_dev->lock); - gomp_fatal ("Dynamic reference counting assert fail\n"); + i = group_last; } +} - if (finalize) - { - n->refcount -= n->dynamic_refcount; - n->dynamic_refcount = 0; - } - else if (n->dynamic_refcount) - { - n->refcount--; - n->dynamic_refcount--; - } +/* Unmap variables for OpenACC "exit data", with optional finalization + (affecting all mappings in this operation). */ - gomp_mutex_unlock (&acc_dev->lock); +static void +goacc_exit_data_internal (struct gomp_device_descr *acc_dev, size_t mapnum, + void **hostaddrs, size_t *sizes, + unsigned short *kinds, bool finalize, goacc_aq aq) +{ + gomp_mutex_lock (&acc_dev->lock); - if (n->refcount == 0) + for (size_t i = 0; i < mapnum; ++i) { - /* Set refcount to 1 to allow gomp_unmap_vars to unmap it. */ - n->refcount = 1; - t->refcount = minrefs; - for (size_t i = 0; i < t->list_count; i++) - if (t->list[i].key == n) - { - t->list[i].copy_from = force_copyfrom ? 1 : 0; - break; - } + unsigned char kind = kinds[i] & 0xff; + bool copyfrom = false; - /* If running synchronously, unmap immediately. */ - if (async < acc_async_noval) - gomp_unmap_vars (t, true); - else + switch (kind) { - goacc_aq aq = get_goacc_asyncqueue (async); - gomp_unmap_vars_async (t, true, aq); + case GOMP_MAP_FROM: + case GOMP_MAP_FORCE_FROM: + case GOMP_MAP_ALWAYS_FROM: + copyfrom = true; + /* Fallthrough. */ + + case GOMP_MAP_TO_PSET: + case GOMP_MAP_POINTER: + case GOMP_MAP_DELETE: + case GOMP_MAP_RELEASE: + { + struct splay_tree_key_s cur_node; + cur_node.host_start = (uintptr_t) hostaddrs[i]; + cur_node.host_end = cur_node.host_start + + (kind == GOMP_MAP_POINTER + ? sizeof (void *) : sizes[i]); + splay_tree_key n + = splay_tree_lookup (&acc_dev->mem_map, &cur_node); + + if (n == NULL) + continue; + + if (finalize) + { + if (n->refcount != REFCOUNT_INFINITY) + n->refcount -= n->virtual_refcount; + n->virtual_refcount = 0; + } + + if (n->virtual_refcount > 0) + { + if (n->refcount != REFCOUNT_INFINITY) + n->refcount--; + n->virtual_refcount--; + } + else if (n->refcount > 0 && n->refcount != REFCOUNT_INFINITY) + n->refcount--; + + if (copyfrom + && (kind != GOMP_MAP_FROM || n->refcount == 0)) + gomp_copy_dev2host (acc_dev, aq, (void *) cur_node.host_start, + (void *) (n->tgt->tgt_start + n->tgt_offset + + cur_node.host_start + - n->host_start), + cur_node.host_end - cur_node.host_start); + + if (n->refcount == 0) + gomp_remove_var_async (acc_dev, n, aq); + } + break; + default: + gomp_fatal (">>>> goacc_exit_data_internal UNHANDLED kind 0x%.2x", + kind); } } gomp_mutex_unlock (&acc_dev->lock); - - gomp_debug (0, " %s: mappings restored\n", __FUNCTION__); -} - -/* Return the number of mappings associated with 'GOMP_MAP_TO_PSET' or - 'GOMP_MAP_POINTER'. */ - -static int -find_pointer (int pos, size_t mapnum, unsigned short *kinds) -{ - if (pos + 1 >= mapnum) - return 0; - - unsigned char kind = kinds[pos+1] & 0xff; - - if (kind == GOMP_MAP_TO_PSET) - return 3; - else if (kind == GOMP_MAP_POINTER) - return 2; - - return 0; } void @@ -1147,98 +1127,13 @@ GOACC_enter_exit_data (int flags_m, size_t mapnum, void **hostaddrs, va_end (ap); } - /* In c, non-pointers and arrays are represented by a single data clause. - Dynamically allocated arrays and subarrays are represented by a data - clause followed by an internal GOMP_MAP_POINTER. - - In fortran, scalars and not allocated arrays are represented by a - single data clause. Allocated arrays and subarrays have three mappings: - 1) the original data clause, 2) a PSET 3) a pointer to the array data. - */ + goacc_aq aq = get_goacc_asyncqueue (async); if (data_enter) - { - for (i = 0; i < mapnum; i++) - { - unsigned char kind = kinds[i] & 0xff; - - /* Scan for pointers and PSETs. */ - int pointer = find_pointer (i, mapnum, kinds); - - if (!pointer) - { - switch (kind) - { - case GOMP_MAP_ALLOC: - case GOMP_MAP_FORCE_ALLOC: - acc_create_async (hostaddrs[i], sizes[i], async); - break; - case GOMP_MAP_TO: - case GOMP_MAP_FORCE_TO: - acc_copyin_async (hostaddrs[i], sizes[i], async); - break; - default: - gomp_fatal (">>>> GOACC_enter_exit_data UNHANDLED kind 0x%.2x", - kind); - break; - } - } - else - { - goacc_insert_pointer (pointer, &hostaddrs[i], &sizes[i], &kinds[i], - async); - /* Increment 'i' by two because OpenACC requires fortran - arrays to be contiguous, so each PSET is associated with - one of MAP_FORCE_ALLOC/MAP_FORCE_PRESET/MAP_FORCE_TO, and - one MAP_POINTER. */ - i += pointer - 1; - } - } - } + goacc_enter_data_internal (acc_dev, mapnum, hostaddrs, sizes, kinds, aq); else - for (i = 0; i < mapnum; ++i) - { - unsigned char kind = kinds[i] & 0xff; - - int pointer = find_pointer (i, mapnum, kinds); - - if (!pointer) - { - switch (kind) - { - case GOMP_MAP_RELEASE: - case GOMP_MAP_DELETE: - if (acc_is_present (hostaddrs[i], sizes[i])) - { - if (finalize) - acc_delete_finalize_async (hostaddrs[i], sizes[i], async); - else - acc_delete_async (hostaddrs[i], sizes[i], async); - } - break; - case GOMP_MAP_FROM: - case GOMP_MAP_FORCE_FROM: - if (finalize) - acc_copyout_finalize_async (hostaddrs[i], sizes[i], async); - else - acc_copyout_async (hostaddrs[i], sizes[i], async); - break; - default: - gomp_fatal (">>>> GOACC_enter_exit_data UNHANDLED kind 0x%.2x", - kind); - break; - } - } - else - { - bool copyfrom = (kind == GOMP_MAP_FORCE_FROM - || kind == GOMP_MAP_FROM); - goacc_remove_pointer (hostaddrs[i], sizes[i], copyfrom, async, - finalize, pointer); - /* See the above comment. */ - i += pointer - 1; - } - } + goacc_exit_data_internal (acc_dev, mapnum, hostaddrs, sizes, kinds, + finalize, aq); out_prof: if (profiling_p) diff --git a/libgomp/target.c b/libgomp/target.c index 97c2b5c5e4d..23f9e1618ca 100644 --- a/libgomp/target.c +++ b/libgomp/target.c @@ -536,8 +536,10 @@ gomp_map_vars_internal (struct gomp_device_descr *devicep, struct target_mem_desc *tgt = gomp_malloc (sizeof (*tgt) + sizeof (tgt->list[0]) * mapnum); tgt->list_count = mapnum; - tgt->refcount = pragma_kind == GOMP_MAP_VARS_ENTER_DATA ? 0 : 1; + tgt->refcount = (pragma_kind == GOMP_MAP_VARS_ENTER_DATA + || pragma_kind == GOMP_MAP_VARS_OPENACC_ENTER_DATA) ? 0 : 1; tgt->device_descr = devicep; + tgt->prev = NULL; struct gomp_coalesce_buf cbuf, *cbufp = NULL; if (mapnum == 0) @@ -939,7 +941,7 @@ gomp_map_vars_internal (struct gomp_device_descr *devicep, tgt->list[i].offset = 0; tgt->list[i].length = k->host_end - k->host_start; k->refcount = 1; - k->dynamic_refcount = 0; + k->virtual_refcount = 0; tgt->refcount++; array->left = NULL; array->right = NULL; @@ -1077,8 +1079,20 @@ gomp_map_vars_internal (struct gomp_device_descr *devicep, /* If the variable from "omp target enter data" map-list was already mapped, tgt is not needed. Otherwise tgt will be freed by gomp_unmap_vars or gomp_exit_data. */ - if (pragma_kind == GOMP_MAP_VARS_ENTER_DATA && tgt->refcount == 0) - { + if ((pragma_kind == GOMP_MAP_VARS_ENTER_DATA + || pragma_kind == GOMP_MAP_VARS_OPENACC_ENTER_DATA) + && tgt->refcount == 0) + { + /* If we're about to discard a target_mem_desc with no "structural" + references (tgt->refcount == 0), any splay keys linked in the tgt's + list must have their virtual refcount incremented to represent that + "lost" reference in order to implement the semantics of the OpenACC + "present increment" operation properly. */ + if (pragma_kind == GOMP_MAP_VARS_OPENACC_ENTER_DATA) + for (i = 0; i < tgt->list_count; i++) + if (tgt->list[i].key) + tgt->list[i].key->virtual_refcount++; + free (tgt); tgt = NULL; } @@ -1216,7 +1230,14 @@ gomp_unmap_vars_internal (struct target_mem_desc *tgt, bool do_copyfrom, continue; bool do_unmap = false; - if (k->refcount > 1 && k->refcount != REFCOUNT_INFINITY) + if (k->tgt == tgt + && k->virtual_refcount > 0 + && k->refcount != REFCOUNT_INFINITY) + { + k->virtual_refcount--; + k->refcount--; + } + else if (k->refcount > 1 && k->refcount != REFCOUNT_INFINITY) k->refcount--; else if (k->refcount == 1) { @@ -1373,7 +1394,7 @@ gomp_load_image_to_device (struct gomp_device_descr *devicep, unsigned version, k->tgt = tgt; k->tgt_offset = target_table[i].start; k->refcount = REFCOUNT_INFINITY; - k->dynamic_refcount = 0; + k->virtual_refcount = 0; k->aux = NULL; array->left = NULL; array->right = NULL; @@ -1406,7 +1427,7 @@ gomp_load_image_to_device (struct gomp_device_descr *devicep, unsigned version, k->tgt = tgt; k->tgt_offset = target_var->start; k->refcount = target_size & link_bit ? REFCOUNT_LINK : REFCOUNT_INFINITY; - k->dynamic_refcount = 0; + k->virtual_refcount = 0; k->aux = NULL; array->left = NULL; array->right = NULL; @@ -1641,22 +1662,6 @@ gomp_unload_device (struct gomp_device_descr *devicep) } } -/* Free address mapping tables. MM must be locked on entry, and remains locked - on return. */ - -attribute_hidden void -gomp_free_memmap (struct splay_tree_s *mem_map) -{ - while (mem_map->root) - { - struct target_mem_desc *tgt = mem_map->root->key.tgt; - - splay_tree_remove (mem_map, &mem_map->root->key); - free (tgt->array); - free (tgt); - } -} - /* Host fallback for GOMP_target{,_ext} routines. */ static void @@ -2668,7 +2673,7 @@ omp_target_associate_ptr (const void *host_ptr, const void *device_ptr, k->tgt = tgt; k->tgt_offset = (uintptr_t) device_ptr + device_offset; k->refcount = REFCOUNT_INFINITY; - k->dynamic_refcount = 0; + k->virtual_refcount = 0; k->aux = NULL; array->left = NULL; array->right = NULL; diff --git a/libgomp/testsuite/libgomp.c-c++-common/unmap-infinity-2.c b/libgomp/testsuite/libgomp.c-c++-common/unmap-infinity-2.c new file mode 100644 index 00000000000..3931c5aba25 --- /dev/null +++ b/libgomp/testsuite/libgomp.c-c++-common/unmap-infinity-2.c @@ -0,0 +1,19 @@ +int foo[16]; +#pragma omp declare target (foo) + +__attribute__((used)) void bar (void) +{ + #pragma omp target parallel for + for (int i = 0; i < 16; i++) + foo[i] = i; +} + +int +main (int argc, char *argv[]) +{ + int *foo_copy = foo; + /* Try to trigger the unmapping of a REFCOUNT_INFINITY target block. This + does nothing at the time of writing. */ + #pragma omp target exit data map(delete: foo_copy[0:16]) + return 0; +} diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/pr92843-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr92843-1.c index db5b35b08d9..f16c46a37bf 100644 --- a/libgomp/testsuite/libgomp.oacc-c-c++-common/pr92843-1.c +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/pr92843-1.c @@ -1,6 +1,7 @@ /* Verify that 'acc_copyout' etc. is a no-op if there's still a structured reference count. */ +/* { dg-xfail-run-if "TODO PR92843" { *-*-* } } */ /* { dg-skip-if "" { *-*-* } { "*" } { "-DACC_MEM_SHARED=0" } } */ #include diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/subset-subarray-mappings-1-r-p.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/subset-subarray-mappings-1-r-p.c index 9b5d83c66dd..907b8587773 100644 --- a/libgomp/testsuite/libgomp.oacc-c-c++-common/subset-subarray-mappings-1-r-p.c +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/subset-subarray-mappings-1-r-p.c @@ -156,20 +156,16 @@ f1 (void) assert (acc_is_present (&myblock[i], SUBSET)); assert (acc_is_present (myblock, SIZE)); -#if 0 //TODO PR92848 if (last) cb_ev_free_expected = true; -#endif #if OPENACC_RUNTIME acc_delete (&myblock[i], SUBSET); #else # pragma acc exit data delete (myblock[i:SUBSET]) #endif -#if 0 //TODO PR92848 assert (!cb_ev_free_expected); if (last) assert (cb_ev_free_device_ptr == cb_ev_alloc_device_ptr); -#endif assert (acc_is_present (&myblock[i], SUBSET) != last); assert (acc_is_present (myblock, SIZE) != last); } @@ -331,9 +327,7 @@ f3 () assert (acc_is_present (h, SIZE)); assert (acc_is_present (&h[2], SIZE - 2)); -#if 0 //TODO PR92848 cb_ev_free_expected = true; -#endif #if OPENACC_RUNTIME acc_delete (h, SIZE); #else @@ -343,10 +337,8 @@ f3 () # pragma acc exit data delete (h) # endif #endif -#if 0 //TODO PR92848 assert (!cb_ev_free_expected); assert (cb_ev_free_device_ptr == cb_ev_alloc_device_ptr); -#endif assert (!acc_is_present (h, SIZE)); assert (!acc_is_present (&h[2], SIZE - 2)); @@ -401,19 +393,15 @@ f_lib_22 (void) memset (h, c1, SIZE); /* Now 'copyout' not the whole but only a "subset" subarray, missing one SUBSET at the beginning, and half a SUBSET at the end... */ -#if 0 //TODO PR92848 cb_ev_free_expected = true; -#endif #if OPENACC_RUNTIME acc_copyout (h + SUBSET, SIZE - SUBSET - SUBSET / 2); #else # pragma acc exit data copyout (h[SUBSET:SIZE - SUBSET - SUBSET / 2]) #endif -#if 0 //TODO PR92848 /* ..., yet, expect the device memory object to be 'free'd... */ assert (!cb_ev_free_expected); assert (cb_ev_free_device_ptr == cb_ev_alloc_device_ptr); -#endif /* ..., and the mapping to be removed... */ assert (!acc_is_present (h, SIZE)); assert (!acc_is_present (&h[SUBSET], SIZE - SUBSET - SUBSET / 2)); @@ -474,19 +462,15 @@ f_lib_30 (void) assert (aligned_address (cb_ev_alloc_device_ptr) == d); /* We 'delete' not the whole but only a "subset" subarray... */ -#if 0 //TODO PR92848 cb_ev_free_expected = true; -#endif #if OPENACC_RUNTIME acc_delete (h, SIZE - SUBSET); #else # pragma acc exit data delete (h[0:SIZE - SUBSET]) #endif -#if 0 //TODO PR92848 /* ..., yet, expect the device memory object to be 'free'd... */ assert (!cb_ev_free_expected); assert (cb_ev_free_device_ptr == cb_ev_alloc_device_ptr); -#endif /* ..., and the mapping to be removed. */ assert (!acc_is_present (h, SIZE)); assert (!acc_is_present (h, SIZE - SUBSET)); diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/unmap-infinity-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/unmap-infinity-1.c new file mode 100644 index 00000000000..872f0c1de5c --- /dev/null +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/unmap-infinity-1.c @@ -0,0 +1,17 @@ +/* { dg-skip-if "" { *-*-* } { "*" } { "-DACC_MEM_SHARED=0" } } */ + +#include + +int foo[16]; +#pragma acc declare device_resident(foo) + +int +main (int argc, char *argv[]) +{ + acc_init (acc_device_default); + acc_unmap_data ((void *) foo); +/* { dg-output "libgomp: cannot unmap target block" } */ + return 0; +} + +/* { dg-shouldfail "" } */