From patchwork Thu Dec 12 19:01:21 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Julian Brown X-Patchwork-Id: 1208810 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=209.132.180.131; helo=sourceware.org; envelope-from=gcc-patches-return-515833-incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=codesourcery.com Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b="JZNU6lB+"; dkim-atps=neutral Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 47Yjp057vCz9sPh for ; Fri, 13 Dec 2019 06:01:40 +1100 (AEDT) DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:date :from:to:subject:message-id:mime-version:content-type; q=dns; s= default; b=CWZzcZQp7Ar9SOwIzPi28jZHHDGECyBJiRvv6q0dLqCSMtzjnXr/Y wQRu6wsVMYhsHC58jGj2eWjP9fN8KA7TUP9k19CCGYUlCsYgf4Q7opvuqPdnqx88 p2Qs3oFITGyNMPg0jTdRfa825xYJcnTivSNcixuhy1rDC1lscH4d7o= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:date :from:to:subject:message-id:mime-version:content-type; s= default; bh=Y0RLYVfrdmtwAXMBtLF3cLYvz6o=; b=JZNU6lB+lHqWLYNrb022 sDUgJoqKQu8oy3n38u+cxvOg5BJvTTQ5112i6BWtsZVOVbNLniVYlmSyvRNcEh2F JZv75RR1dNmHm3NRn4uJNPR6ELf9XNg2zED9TE1UwkasKPOd8HWOUK09Jh9yN3Gq jDnkUz2pHyMROOGjO6H8aDQ= Received: (qmail 95996 invoked by alias); 12 Dec 2019 19:01:32 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 95980 invoked by uid 89); 12 Dec 2019 19:01:32 -0000 Authentication-Results: sourceware.org; auth=none X-Spam-SWARE-Status: No, score=-16.4 required=5.0 tests=AWL, BAYES_00, GIT_PATCH_0, GIT_PATCH_1, GIT_PATCH_2, GIT_PATCH_3, KAM_SHORT, SPF_PASS autolearn=ham version=3.3.1 spammy=beforehand, delete_copyout, locked, sk:goacc_a X-HELO: esa2.mentor.iphmx.com Received: from esa2.mentor.iphmx.com (HELO esa2.mentor.iphmx.com) (68.232.141.98) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Thu, 12 Dec 2019 19:01:30 +0000 IronPort-SDR: ikDl+7KO+nW/fhrjCQPzycuyNucFzo8QluLt/UJYQmVcR7YmDWYeYTtPT9rdMREB8HSz0dmipA V9Hn7767ZVZ/TxVYNwer4R/Kgd6BQmcKC5t0JF9NcyRrN5/e5pz9HeapwUthVafbByBJ6wcOSa K0fr0KWew7S1BvQuiee3Td7cOXo78hoWC2GE2LzLEQoY+Zr28P4LaGJm2OT7PX5ML6XyZ/C6/k T0E1M1W4UCzjh4TmbieC0UptjEjC0jeZRx74vU++uq6h64GGQ95XgF4xpPkmjLvYH0cBQdBYt/ Oqg= Received: from orw-gwy-01-in.mentorg.com ([192.94.38.165]) by esa2.mentor.iphmx.com with ESMTP; 12 Dec 2019 11:01:28 -0800 IronPort-SDR: EsjXJFMXaW/AGbK1b2XkXu3FJJ9tW35amOP5WWIAwO14hAzrw9OGBCnKrAGBxiIA4vMH84qFq6 DIOTnYcwizbPM0tCrqYJzskIPzaKDBRl/SwPiPTrE299EH/lzplaLQDctaG1jla0jtuLfAMm4i fVLQVIP65+7GWwvtrECbnUdAqXlvzHEeb6NmtbZ8DVoazPCJb5LinfRDQbfsPEjhTjAIgo324J 3g0hbnvN47NhNZDkAwsxAnoKvAG2netCuit/MQmzwRoD5zghCveYlBX9nlW0t9cHI5GNuD0nX8 Bb4= Date: Thu, 12 Dec 2019 19:01:21 +0000 From: Julian Brown To: , Thomas Schwinge , Jakub Jelinek Subject: [PATCH, OpenACC] Fix potential race condition in OpenACC "exit data" operations Message-ID: <20191212190121.42ea3f8c@squid.athome> MIME-Version: 1.0 X-IsSubscribed: yes Hi, This is a fix for PR92881, broken out of the larger "reference counting overhaul" patch last posted here: https://gcc.gnu.org/ml/gcc-patches/2019-11/msg02235.html The current implementation (potentially with synchronous unmapping of a variable/memory block immediately after an asynchronous copyout which might not complete beforehand) is "obviously incorrect" by observation, but does not appear to cause any issues with the current testsuite (i.e., there is no test improvement with this patch). Tested with offloading to AMD GCN. OK for trunk? Thanks, Julian ChangeLog 2019-12-12 Julian Brown Thomas Schwinge PR libgomp/92881 libgomp/ * libgomp.h (gomp_remove_var_async): Add prototype. * oacc-mem.c (delete_copyout): Call gomp_remove_var_async instead of gomp_remove_var. * target.c (gomp_unref_tgt): Change return type to bool, indicating whether target_mem_desc was unmapped. (gomp_unref_tgt_void): New. (gomp_remove_var): Reimplement in terms of... (gomp_remove_var_internal): ...this new helper function. (gomp_remove_var_async): New, implemented using above helper function. (gomp_unmap_vars_internal): Use gomp_unref_tgt_void instead of gomp_unref_tgt. commit 7a5134adb474c65269bb16b1e31bf92bb9d6a06a Author: Julian Brown Date: Thu Dec 12 10:39:23 2019 -0800 Fix potential race condition in OpenACC "exit data" operations PR libgomp/92881 libgomp/ * libgomp.h (gomp_remove_var_async): Add prototype. * oacc-mem.c (delete_copyout): Call gomp_remove_var_async instead of gomp_remove_var. * target.c (gomp_unref_tgt): Change return type to bool, indicating whether target_mem_desc was unmapped. (gomp_unref_tgt_void): New. (gomp_remove_var): Reimplement in terms of... (gomp_remove_var_internal): ...this new helper function. (gomp_remove_var_async): New, implemented using above helper function. (gomp_unmap_vars_internal): Use gomp_unref_tgt_void instead of gomp_unref_tgt. diff --git a/libgomp/libgomp.h b/libgomp/libgomp.h index 8cbfe0b1f30..6390187dff5 100644 --- a/libgomp/libgomp.h +++ b/libgomp/libgomp.h @@ -1166,6 +1166,8 @@ extern bool gomp_fini_device (struct gomp_device_descr *); extern void gomp_free_memmap (struct splay_tree_s *); extern void gomp_unload_device (struct gomp_device_descr *); extern bool gomp_remove_var (struct gomp_device_descr *, splay_tree_key); +extern void gomp_remove_var_async (struct gomp_device_descr *, splay_tree_key, + struct goacc_asyncqueue *); extern bool gomp_offload_target_available_p (int); /* work.c */ diff --git a/libgomp/oacc-mem.c b/libgomp/oacc-mem.c index a809d0495a6..196b7e2a520 100644 --- a/libgomp/oacc-mem.c +++ b/libgomp/oacc-mem.c @@ -660,7 +660,6 @@ static void delete_copyout (unsigned f, void *h, size_t s, int async, const char *libfnname) { splay_tree_key n; - void *d; struct goacc_thread *thr = goacc_thread (); struct gomp_device_descr *acc_dev = thr->dev; @@ -689,9 +688,6 @@ delete_copyout (unsigned f, void *h, size_t s, int async, const char *libfnname) gomp_fatal ("[%p,%d] is not mapped", (void *)h, (int)s); } - d = (void *) (n->tgt->tgt_start + n->tgt_offset - + (uintptr_t) h - n->host_start); - if ((uintptr_t) h < n->host_start || (uintptr_t) h + s > n->host_end) { size_t host_size = n->host_end - n->host_start; @@ -723,12 +719,15 @@ delete_copyout (unsigned f, void *h, size_t s, int async, const char *libfnname) if (n->refcount == 0) { + goacc_aq aq = get_goacc_asyncqueue (async); + if (f & FLAG_COPYOUT) { - goacc_aq aq = get_goacc_asyncqueue (async); + void *d = (void *) (n->tgt->tgt_start + n->tgt_offset + + (uintptr_t) h - n->host_start); gomp_copy_dev2host (acc_dev, aq, h, d, s); } - gomp_remove_var (acc_dev, n); + gomp_remove_var_async (acc_dev, n, aq); } gomp_mutex_unlock (&acc_dev->lock); diff --git a/libgomp/target.c b/libgomp/target.c index d2fd90be539..d2990e06348 100644 --- a/libgomp/target.c +++ b/libgomp/target.c @@ -1128,32 +1128,65 @@ gomp_unmap_tgt (struct target_mem_desc *tgt) free (tgt); } -attribute_hidden bool -gomp_remove_var (struct gomp_device_descr *devicep, splay_tree_key k) +static bool +gomp_unref_tgt (void *ptr) { bool is_tgt_unmapped = false; - splay_tree_remove (&devicep->mem_map, k); - if (k->link_key) - splay_tree_insert (&devicep->mem_map, (splay_tree_node) k->link_key); - if (k->tgt->refcount > 1) - k->tgt->refcount--; + + struct target_mem_desc *tgt = (struct target_mem_desc *) ptr; + + assert (tgt->refcount != REFCOUNT_INFINITY); + + if (tgt->refcount > 1) + tgt->refcount--; else { + gomp_unmap_tgt (tgt); is_tgt_unmapped = true; - gomp_unmap_tgt (k->tgt); } + return is_tgt_unmapped; } static void -gomp_unref_tgt (void *ptr) +gomp_unref_tgt_void (void *ptr) { - struct target_mem_desc *tgt = (struct target_mem_desc *) ptr; + (void) gomp_unref_tgt (ptr); +} - if (tgt->refcount > 1) - tgt->refcount--; +static inline __attribute__((always_inline)) bool +gomp_remove_var_internal (struct gomp_device_descr *devicep, splay_tree_key k, + struct goacc_asyncqueue *aq) +{ + bool is_tgt_unmapped = false; + splay_tree_remove (&devicep->mem_map, k); + if (k->link_key) + splay_tree_insert (&devicep->mem_map, (splay_tree_node) k->link_key); + if (aq) + devicep->openacc.async.queue_callback_func (aq, gomp_unref_tgt_void, + (void *) k->tgt); else - gomp_unmap_tgt (tgt); + is_tgt_unmapped = gomp_unref_tgt ((void *) k->tgt); + return is_tgt_unmapped; +} + +attribute_hidden bool +gomp_remove_var (struct gomp_device_descr *devicep, splay_tree_key k) +{ + return gomp_remove_var_internal (devicep, k, NULL); +} + +/* Remove a variable asynchronously. This actually removes the variable + mapping immediately, but retains the linked target_mem_desc until the + asynchronous operation has completed (as it may still refer to target + memory). The device lock must be held before entry, and remains locked on + exit. */ + +attribute_hidden void +gomp_remove_var_async (struct gomp_device_descr *devicep, splay_tree_key k, + struct goacc_asyncqueue *aq) +{ + (void) gomp_remove_var_internal (devicep, k, aq); } /* Unmap variables described by TGT. If DO_COPYFROM is true, copy relevant @@ -1209,7 +1242,7 @@ gomp_unmap_vars_internal (struct target_mem_desc *tgt, bool do_copyfrom, } if (aq) - devicep->openacc.async.queue_callback_func (aq, gomp_unref_tgt, + devicep->openacc.async.queue_callback_func (aq, gomp_unref_tgt_void, (void *) tgt); else gomp_unref_tgt ((void *) tgt);