From patchwork Wed Mar 25 17:09:25 2020
X-Patchwork-Submitter: Thomas Schwinge
X-Patchwork-Id: 1261553
From: Thomas Schwinge
To: Frederik Harwath
Cc: Kwok Cheung Yeung
Subject: [og9] Fix og9 "Fix hang when running oacc exec with CUDA 9.0 nvprof"
References: <87k28acit3.fsf@hertz.schwinge.homeip.net>
Date: Wed, 25 Mar 2020 18:09:25 +0100
List-Id: Gcc-patches mailing list

Hi!

On 2018-02-22T12:23:25+0100, Tom de Vries wrote:
> when using cuda 9 nvprof with an openacc executable, the executable hangs.
>
> The scenario resulting in the hang is as follows:
> 1. goacc_lazy_initialize calls gomp_mutex_lock (&acc_device_lock)
> 2. goacc_lazy_initialize calls acc_init_1
> 3. acc_init_1 calls goacc_profiling_dispatch (&prof_info,
>    &device_init_event_info, &api_info);
> 4. goacc_profiling_dispatch calls the registered callback in the cuda
>    profiling library
> 5. the registered call back calls acc_get_device_type
> 6. acc_get_device_type calls gomp_mutex_lock (&acc_device_lock)
> 7. The lock is not recursive, so we have deadlock
>
> The registered callback in cuda 8 does not call acc_get_device_type, so
> the hang doesn't occur there.

(ACK for the general problem description/analysis.)

> This patch fixes the hang by detecting in acc_get_device_type that the
> calling thread is a thread that is currently initializing the openacc
> part of the libgomp library, and returning acc_device_none, which is a
> legal value given that the openacc standard states "If the device type
> has not yet been selected, the value acc_device_none may be returned".

(This specific way of resolving the issue I still have to look into.
This may need a more general solution, to make all such libgomp OpenACC
entry points re-entrant.)

> Committed to og7 branch.

What Frederik has discovered today the hard way... is that the og9
version of this patch had its code altered such that it no longer
resolves the problem it's meant to resolve: the hang was back.

On Git-mirror-based openacc-gcc-9-branch that's:

    commit 84af3c5a2fbb5023057e2ca319b0c22f5f7d4795
    Author:     Julian Brown
    AuthorDate: Tue Feb 26 16:00:54 2019 -0800
    Commit:     Kwok Cheung Yeung
    CommitDate: Fri May 31 13:40:07 2019 -0700

        Fix hang when running oacc exec with CUDA 9.0 nvprof

        2018-09-20  Tom de Vries
                    Cesar Philippidis

                libgomp/
                [...]

..., which got cherry-picked (automated, without any review) into current
devel/omp/gcc-9 in commit f752d880a5abc591a25ad22fb892363f6520bcf1.

Of course, it would've helped tremendously had the original og7 commit
included a test case... :'-/ (... by simply reproducing the nested calls
that CUDA 9 nvprof seems to be doing.)
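
For illustration, the nested calls from the scenario above can be sketched
outside of libgomp.  This is only a sketch, not a libgomp test case: all
function names are hypothetical stand-ins, and an error-checking pthread
mutex is assumed in place of the real non-recursive lock, so that the
nested lock attempt reports EDEADLK instead of hanging.

```c
/* Sketch only: every name here is a hypothetical stand-in, not libgomp code.  */
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>

static pthread_mutex_t device_lock;  /* stands in for acc_device_lock */
static int nested_lock_result;       /* result of the nested lock attempt */

/* Stand-in for acc_get_device_type: takes the device lock (step 6).  */
static void
get_device_type_sketch (void)
{
  nested_lock_result = pthread_mutex_lock (&device_lock);
  if (nested_lock_result == 0)
    pthread_mutex_unlock (&device_lock);
}

/* Stand-in for the profiling library's registered callback (step 5).  */
static void
profiling_callback_sketch (void)
{
  get_device_type_sketch ();
}

/* Stand-in for goacc_lazy_initialize and acc_init_1 (steps 1 to 4).
   Returns what the nested lock attempt reported.  */
int
lazy_initialize_sketch (void)
{
  pthread_mutexattr_t attr;
  int rc;

  rc = pthread_mutexattr_init (&attr);
  assert (rc == 0);
  /* Error-checking mutex: a relock by the owning thread fails with
     EDEADLK where a plain non-recursive lock would hang (step 7).  */
  rc = pthread_mutexattr_settype (&attr, PTHREAD_MUTEX_ERRORCHECK);
  assert (rc == 0);
  rc = pthread_mutex_init (&device_lock, &attr);
  assert (rc == 0);
  pthread_mutexattr_destroy (&attr);

  pthread_mutex_lock (&device_lock);   /* step 1 */
  profiling_callback_sketch ();        /* steps 2 to 6, same thread */
  pthread_mutex_unlock (&device_lock);
  pthread_mutex_destroy (&device_lock);
  return nested_lock_result;
}
```

Calling lazy_initialize_sketch () from a small main and comparing the
result against EDEADLK is enough to observe the self-deadlock without
blocking the process.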

Still without a test case, for now I have pushed the attached patch to
devel/omp/gcc-9 in commit 9ae129017c7fc1fa638d6beedd3802b515ca692b
'Fix og9 "Fix hang when running oacc exec with CUDA 9.0 nvprof"'.


Regards
 Thomas

From 9ae129017c7fc1fa638d6beedd3802b515ca692b Mon Sep 17 00:00:00 2001
From: Thomas Schwinge
Date: Wed, 25 Mar 2020 17:57:02 +0100
Subject: [PATCH] Fix og9 "Fix hang when running oacc exec with CUDA 9.0 nvprof"

Compared to the original og7 version, and still-good og8 version, the og9
version of this patch did get its code altered in a way so that it no longer
resolves the problem it's meant to resolve -- the hang was back.

	libgomp/
	* oacc-init.c (acc_init_1): Move 'acc_init_state' logic to where it
	belongs.
---
 libgomp/ChangeLog.omp |  5 +++++
 libgomp/oacc-init.c   | 10 +++++-----
 2 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/libgomp/ChangeLog.omp b/libgomp/ChangeLog.omp
index 88957864a69..75c45917998 100644
--- a/libgomp/ChangeLog.omp
+++ b/libgomp/ChangeLog.omp
@@ -1,3 +1,8 @@
+2020-03-25  Thomas Schwinge
+
+	* oacc-init.c (acc_init_1): Move 'acc_init_state' logic to where
+	it belongs.
+
 2019-11-22  Kwok Cheung Yeung
 
 	* testsuite/libgomp.oacc-fortran/lib-16.f90: Fix async-safety issue.
diff --git a/libgomp/oacc-init.c b/libgomp/oacc-init.c
index beeeb48c106..765fa2f3b95 100644
--- a/libgomp/oacc-init.c
+++ b/libgomp/oacc-init.c
@@ -231,6 +231,11 @@ acc_dev_num_out_of_range (acc_device_t d, int ord, int ndevs)
 static struct gomp_device_descr *
 acc_init_1 (acc_device_t d, acc_construct_t parent_construct, int implicit)
 {
+  gomp_mutex_lock (&acc_init_state_lock);
+  acc_init_state = initializing;
+  acc_init_thread = pthread_self ();
+  gomp_mutex_unlock (&acc_init_state_lock);
+
   bool check_not_nested_p;
   if (implicit)
     {
@@ -293,11 +298,6 @@ acc_init_1 (acc_device_t d, acc_construct_t parent_construct, int implicit)
   struct gomp_device_descr *base_dev, *acc_dev;
   int ndevs;
 
-  gomp_mutex_lock (&acc_init_state_lock);
-  acc_init_state = initializing;
-  acc_init_thread = pthread_self ();
-  gomp_mutex_unlock (&acc_init_state_lock);
-
   base_dev = resolve_device (d, true);
   ndevs = base_dev->get_num_devices_func ();
 
-- 
2.17.1
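
Why the position of the moved block matters can be sketched as follows.
This is a hedged illustration, not the libgomp implementation: the helper
names are hypothetical, and it assumes that the profiling dispatch of the
device-init event sits between the old and the new position of the block,
which the diff context above does not show.

```c
/* Sketch only: hypothetical stand-ins modelled on names used in the patch.  */
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

enum init_state_sketch { uninitialized, initializing, initialized };

static pthread_mutex_t init_state_lock = PTHREAD_MUTEX_INITIALIZER;
static enum init_state_sketch init_state = uninitialized;
static pthread_t init_thread;
static bool callback_saw_guard;

/* Record the state; for 'initializing' also record the calling thread,
   as the block moved by the patch does.  */
static void
set_state (enum init_state_sketch s)
{
  pthread_mutex_lock (&init_state_lock);
  init_state = s;
  if (s == initializing)
    init_thread = pthread_self ();
  pthread_mutex_unlock (&init_state_lock);
}

/* The check a re-entrant entry point could make before taking the
   device lock: is the caller the thread that is initializing?  */
static bool
called_from_initializing_thread (void)
{
  pthread_mutex_lock (&init_state_lock);
  bool res = (init_state == initializing
	      && pthread_equal (init_thread, pthread_self ()));
  pthread_mutex_unlock (&init_state_lock);
  return res;
}

/* Stand-in for the profiling callback run during initialization.  */
static void
init_event_callback_sketch (void)
{
  callback_saw_guard = called_from_initializing_thread ();
}

/* Stand-in for acc_init_1.  MARK_FIRST selects where the state is
   recorded.  Returns whether the callback found the guard active.  */
bool
init_sketch (bool mark_first)
{
  set_state (uninitialized);
  callback_saw_guard = false;
  assert (!called_from_initializing_thread ());

  if (mark_first)
    set_state (initializing);      /* placement after this patch */
  init_event_callback_sketch ();   /* assumed position of the dispatch */
  if (!mark_first)
    set_state (initializing);      /* assumed og9 placement: too late */

  set_state (initialized);
  return callback_saw_guard;
}
```

Under these assumptions, init_sketch (true) reports that the callback saw
the guard, while init_sketch (false) reports that it did not, which is the
situation in which a nested acc_get_device_type call would go on to take
the already-held lock.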