From patchwork Sun Jul 19 12:37:52 2015
X-Patchwork-Submitter: Nathan Sidwell
X-Patchwork-Id: 497458
To: Thomas Schwinge, Jakub Jelinek, GCC Patches
From: Nathan Sidwell
Subject: Refactor openacc wait routine
Message-ID: <55AB9A20.9070708@acm.org>
Date: Sun, 19 Jul 2015 08:37:52 -0400

This trunk patch refactors libgomp's goacc_wait, which is used for two
different purposes:

1) when OpenACC pragmas specify a non-zero number of waits;
2) when the wait pragma itself specifies zero waits.

This leads to #2 calling goacc_wait with num_waits == 0, and forces #1
never to do that. Fixed by breaking the num_waits == 0 handling out of
goacc_wait and into GOACC_wait, the wait pragma handler. I have kept the
num_waits == 0 checks elsewhere, but they are now for efficiency rather
than correctness.
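To illustrate the split outside of libgomp, here is a minimal standalone sketch. Every name in it (sketch_wait, sketch_wait_list, the SKETCH_* constants and the n_* counters) is hypothetical and only stands in for the real runtime calls. The acc_async_test short-circuit is left out for brevity. The counters record which path was taken so the dispatch can be checked without a device.

```c
#include <assert.h>
#include <stdarg.h>

/* Hypothetical stand-ins for acc_async_sync and acc_async_noval.  */
enum { SKETCH_ASYNC_SYNC = -2, SKETCH_ASYNC_NOVAL = -1 };

/* Counters replacing the real runtime calls, so each path is observable.  */
static int n_wait;            /* stands in for acc_wait (qid) */
static int n_wait_all;        /* stands in for acc_wait_all () */
static int n_wait_async;      /* stands in for async_wait_async_func */
static int n_wait_all_async;  /* stands in for async_wait_all_async_func */

/* Helper: only ever called with num_waits > 0, and just walks the list.  */
static void
sketch_wait_list (int async, int num_waits, va_list ap)
{
  while (num_waits--)
    {
      int qid = va_arg (ap, int);

      if (async == SKETCH_ASYNC_SYNC)
        n_wait++;        /* synchronous: block on each listed queue */
      else if (qid == async)
        ;                /* same queue: the queue orders the work itself */
      else
        n_wait_async++;  /* make queue 'async' wait for queue 'qid' */
    }
}

/* Wrapper: the only place that knows what an empty wait list means.  */
static void
sketch_wait (int async, int num_waits, ...)
{
  assert (num_waits >= 0);

  if (num_waits)
    {
      va_list ap;

      va_start (ap, num_waits);
      sketch_wait_list (async, num_waits, ap);
      va_end (ap);
    }
  else if (async == SKETCH_ASYNC_SYNC)
    n_wait_all++;
  else if (async == SKETCH_ASYNC_NOVAL)
    n_wait_all_async++;
}
```

With this shape, sketch_wait (SKETCH_ASYNC_SYNC, 0) takes the wait-all path, while sketch_wait (3, 2, 3, 4) skips queue 3 and only queues a wait on queue 4. The helper never has to guess what an empty list means.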
OK for trunk (& gomp4)?

nathan

Index: libgomp/oacc-parallel.c
===================================================================
--- libgomp/oacc-parallel.c	(revision 225959)
+++ libgomp/oacc-parallel.c	(working copy)
@@ -105,13 +105,13 @@ GOACC_parallel (int device, void (*fn) (
       return;
     }
 
-  va_start (ap, num_waits);
+  if (num_waits)
+    {
+      va_start (ap, num_waits);
+      goacc_wait (async, num_waits, ap);
+      va_end (ap);
+    }
 
-  if (num_waits > 0)
-    goacc_wait (async, num_waits, ap);
-
-  va_end (ap);
-
   acc_dev->openacc.async_set_async_func (async);
 
   if (!(acc_dev->capabilities & GOMP_OFFLOAD_CAP_NATIVE_EXEC))
@@ -225,14 +225,12 @@ GOACC_enter_exit_data (int device, size_
       || host_fallback)
     return;
 
-  if (num_waits > 0)
+  if (num_waits)
     {
       va_list ap;
 
       va_start (ap, num_waits);
-
       goacc_wait (async, num_waits, ap);
-
       va_end (ap);
     }
 
@@ -350,47 +348,21 @@ goacc_wait (int async, int num_waits, va
 {
   struct goacc_thread *thr = goacc_thread ();
   struct gomp_device_descr *acc_dev = thr->dev;
-  int i;
-
-  assert (num_waits >= 0);
-
-  if (async == acc_async_sync && num_waits == 0)
-    {
-      acc_wait_all ();
-      return;
-    }
-
-  if (async == acc_async_sync && num_waits)
-    {
-      for (i = 0; i < num_waits; i++)
-        {
-          int qid = va_arg (ap, int);
-
-          if (acc_async_test (qid))
-            continue;
 
-          acc_wait (qid);
-        }
-      return;
-    }
-
-  if (async == acc_async_noval && num_waits == 0)
-    {
-      acc_dev->openacc.async_wait_all_async_func (acc_async_noval);
-      return;
-    }
-
-  for (i = 0; i < num_waits; i++)
+  while (num_waits--)
     {
       int qid = va_arg (ap, int);
 
       if (acc_async_test (qid))
         continue;
 
-      /* If we're waiting on the same asynchronous queue as we're launching on,
-         the queue itself will order work as required, so there's no need to
-         wait explicitly.  */
-      if (qid != async)
+      if (async == acc_async_sync)
+        acc_wait (qid);
+      else if (qid == async)
+        ;/* If we're waiting on the same asynchronous queue as we're
+            launching on, the queue itself will order work as
+            required, so there's no need to wait explicitly.  */
+      else
         acc_dev->openacc.async_wait_async_func (qid, async);
     }
 }
@@ -412,14 +384,12 @@ GOACC_update (int device, size_t mapnum,
       || host_fallback)
     return;
 
-  if (num_waits > 0)
+  if (num_waits)
     {
       va_list ap;
 
       va_start (ap, num_waits);
-
       goacc_wait (async, num_waits, ap);
-
       va_end (ap);
     }
 
@@ -455,13 +425,18 @@ GOACC_update (int device, size_t mapnum,
 void
 GOACC_wait (int async, int num_waits, ...)
 {
-  va_list ap;
-
-  va_start (ap, num_waits);
-
-  goacc_wait (async, num_waits, ap);
+  if (num_waits)
+    {
+      va_list ap;
 
-  va_end (ap);
+      va_start (ap, num_waits);
+      goacc_wait (async, num_waits, ap);
+      va_end (ap);
+    }
+  else if (async == acc_async_sync)
+    acc_wait_all ();
+  else if (async == acc_async_noval)
+    goacc_thread ()->dev->openacc.async_wait_all_async_func (acc_async_noval);
 }
 
 int