From patchwork Mon Oct 17 13:38:50 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Thomas Schwinge X-Patchwork-Id: 682964 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 3syK8J39m6z9sDG for ; Tue, 18 Oct 2016 00:40:16 +1100 (AEDT) Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b=oyA9ZtD4; dkim-atps=neutral DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:cc:subject:in-reply-to:references:date:message-id :mime-version:content-type:content-transfer-encoding; q=dns; s= default; b=h/atk46AZg8vggVAXBA0JOGY2qYsTTQHl7Y11VJQKj0XM2y8XU4g/ mbk4cHS9Ag3tqXx5sOUongjZ1vQ+OfCUi2hlwIVzGSemhwUR/p17HfJKs6W5x9QG a21EKyWp9bTzYm+x2Wowko1fPLtq7tsyrRiyYfNspTyD6r50P+WSjg= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:from :to:cc:subject:in-reply-to:references:date:message-id :mime-version:content-type:content-transfer-encoding; s=default; bh=voKv2OuHJPTWk6d1SWbX/aRSLXk=; b=oyA9ZtD4F4sxmOZLaqRNzMX+fl3n YPUIqO5lxcxabUqfip+a16Bd3M8V22Cm0jIRj4qXQDQIUqA8r4kG4to7unU2uD04 djRG18neT9AxNEQbcTGUJS1i/ydc7DHbYtjyMcUZktXLDI4qZFwWXgC8QbVB4Ej3 MFv0k2kStdoYMLI= Received: (qmail 17423 invoked by alias); 17 Oct 2016 13:39:11 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 17395 invoked by uid 89); 17 Oct 2016 13:39:09 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-1.9 required=5.0 tests=AWL, BAYES_00, RCVD_IN_DNSWL_NONE, SPF_PASS, URIBL_RED autolearn=ham version=3.3.2 spammy=Hx-languages-length:5504, Head X-Spam-User: qpsmtpd, 2 recipients X-HELO: relay1.mentorg.com Received: from relay1.mentorg.com (HELO relay1.mentorg.com) (192.94.38.131) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Mon, 17 Oct 2016 13:38:59 +0000 Received: from nat-ies.mentorg.com ([192.94.31.2] helo=svr-ies-mbx-01.mgc.mentorg.com) by relay1.mentorg.com with esmtp id 1bw881-000254-64 from Thomas_Schwinge@mentor.com ; Mon, 17 Oct 2016 06:38:57 -0700 Received: from hertz.schwinge.homeip.net (137.202.0.87) by svr-ies-mbx-01.mgc.mentorg.com (139.181.222.1) with Microsoft SMTP Server (TLS) id 15.0.1210.3; Mon, 17 Oct 2016 14:38:52 +0100 From: Thomas Schwinge To: Richard Biener CC: Nathan Sidwell , GCC Patches , GCC Development , Subject: Re: Clear basic block flags before using BB_VISITED for OpenACC loops processing In-Reply-To: References: <8760ov2wbg.fsf@hertz.schwinge.homeip.net> <9c7622ab-0c76-b054-ea4b-34df71febe35@acm.org> <87zim31fjx.fsf@hertz.schwinge.homeip.net> <87wph719jm.fsf@hertz.schwinge.homeip.net> User-Agent: Notmuch/0.9-101-g81dad07 (http://notmuchmail.org) Emacs/24.4.1 (x86_64-pc-linux-gnu) Date: Mon, 17 Oct 2016 15:38:50 +0200 Message-ID: <87twcb14et.fsf@hertz.schwinge.homeip.net> MIME-Version: 1.0 X-ClientProxiedBy: svr-ies-mbx-01.mgc.mentorg.com (139.181.222.1) To svr-ies-mbx-01.mgc.mentorg.com (139.181.222.1) Hi! On Mon, 17 Oct 2016 14:08:44 +0200, Richard Biener wrote: > On Mon, Oct 17, 2016 at 1:47 PM, Thomas Schwinge > wrote: > > On Mon, 17 Oct 2016 13:22:17 +0200, Richard Biener wrote: > >> On Mon, Oct 17, 2016 at 11:38 AM, Thomas Schwinge > >> wrote: > >> > On Fri, 14 Oct 2016 13:06:59 +0200, Richard Biener wrote: > >> >> On Fri, Oct 14, 2016 at 1:00 PM, Nathan Sidwell wrote: > >> >> > On 10/14/16 05:28, Richard Biener wrote: > >> >> > > >> >> >> The BB_VISITED flag has indetermined state at the beginning of a pass. > >> >> >> You have to ensure it is cleared yourself. > >> >> > > >> >> > > >> >> > In that case the openacc (&nvptx?) passes should be modified to clear the > >> >> > flags at their start, rather than at their end. > >> > OK to commit the following? Is such a test case appropriate (which would > >> > have caught this issue right away), in particular the dg-final > >> > scan-tree-dump line? > >> > >> Ugh. Not worse to what we do in various dwarf scanning I guess. > > > > ;-| > > > >> Doesn't failure lead to a miscompile eventually? So you could formulate > >> this as a dg-do run test with a check for the desired outcome? > > > > No, unfortunately. In this case the error is "benign" such that the > > OpenACC loop processing machinery will decide to not parallelize loops > > that ought to be parallelized. > > So you can scan for "loop parallelized" instead? The dump would still contain the outer loop's "Loop 0(0)" marker, so I'd have to scan for "Head"/"Tail"/"UNIQUE" or similar instead -- but that seems likewise fragile (for false negatives), and less useful than scanning for the complete pattern. > I fear your pattern > is quite fragile > to maintain over time. Agreed -- but then, that's intentional: my idea for this new test case has been to have it actually verify the expected OpenACC loop processing, so it's clear that this pattern will need to be adjusted if changing the OpenACC loop processing. > > This won't generally cause any problem > > (apart from performance regression, obviously); it just caused problems > > in a few libgomp test cases that actually at run time test for > > parallelized execution -- which will/did trigger only with nvptx > > offloading enabled, which not too many people are testing. The test case > > I propose below will trigger also for non-offloading configurations. On IRC, Segher suggested to 'use {} instead of "" to avoid [all those backslashes]' -- thanks, done. commit 88260dc23e752c3e05c6644ee3b653a947714440 Author: Thomas Schwinge Date: Mon Oct 17 15:33:09 2016 +0200 Clear basic block flags before using BB_VISITED for OpenACC loops processing gcc/ * omp-low.c (oacc_loop_discovery): Call clear_bb_flags. gcc/testsuite/ * gcc.dg/goacc/loop-processing-1.c: New file. --- gcc/omp-low.c | 9 +++++---- gcc/testsuite/gcc.dg/goacc/loop-processing-1.c | 18 ++++++++++++++++++ 2 files changed, 23 insertions(+), 4 deletions(-) Grüße Thomas diff --git gcc/omp-low.c gcc/omp-low.c index 213bf8c..5257d21 100644 --- gcc/omp-low.c +++ gcc/omp-low.c @@ -19340,7 +19340,9 @@ oacc_loop_sibling_nreverse (oacc_loop *loop) static oacc_loop * oacc_loop_discovery () { - basic_block bb; + /* Clear basic block flags, in particular BB_VISITED which we're going to use + in the following. */ + clear_bb_flags (); oacc_loop *top = new_oacc_loop_outer (current_function_decl); oacc_loop_discover_walk (top, ENTRY_BLOCK_PTR_FOR_FN (cfun)); @@ -19349,9 +19351,8 @@ oacc_loop_discovery () that diagnostics come out in an unsurprising order. */ top = oacc_loop_sibling_nreverse (top); - /* Reset the visited flags. */ - FOR_ALL_BB_FN (bb, cfun) - bb->flags &= ~BB_VISITED; + /* Clear basic block flags again, as otherwise IRA will explode later on. */ + clear_bb_flags (); return top; } diff --git gcc/testsuite/gcc.dg/goacc/loop-processing-1.c gcc/testsuite/gcc.dg/goacc/loop-processing-1.c new file mode 100644 index 0000000..619576a --- /dev/null +++ gcc/testsuite/gcc.dg/goacc/loop-processing-1.c @@ -0,0 +1,18 @@ +/* Make sure that OpenACC loop processing happens. */ +/* { dg-additional-options "-O2 -fdump-tree-oaccdevlow" } */ + +extern int place (); + +void vector_1 (int *ary, int size) +{ +#pragma acc parallel num_workers (32) vector_length(32) copy(ary[0:size]) firstprivate (size) + { +#pragma acc loop gang + for (int jx = 0; jx < 1; jx++) +#pragma acc loop auto + for (int ix = 0; ix < size; ix++) + ary[ix] = place (); + } +} + +/* { dg-final { scan-tree-dump {OpenACC loops.*Loop 0\(0\).*Loop 14\(1\).*\.data_dep\.[0-9_]+ = UNIQUE \(OACC_HEAD_MARK, 0, 1, 20\);.*Head-0:.*\.data_dep\.[0-9_]+ = UNIQUE \(OACC_HEAD_MARK, 0, 1, 20\);.*\.data_dep\.[0-9_]+ = UNIQUE \(OACC_FORK, \.data_dep\.[0-9_]+, 0\);.*Tail-0:.*\.data_dep\.[0-9_]+ = UNIQUE \(OACC_TAIL_MARK, \.data_dep\.[0-9_]+, 1\);.*\.data_dep\.[0-9_]+ = UNIQUE \(OACC_JOIN, \.data_dep\.[0-9_]+, 0\);.*Loop 6\(4\).*\.data_dep\.[0-9_]+ = UNIQUE \(OACC_HEAD_MARK, 0, 1, 6\);.*Head-0:.*\.data_dep\.[0-9_]+ = UNIQUE \(OACC_HEAD_MARK, 0, 1, 6\);.*\.data_dep\.[0-9_]+ = UNIQUE \(OACC_FORK, \.data_dep\.[0-9_]+, 2\);.*Tail-0:.*\.data_dep\.[0-9_]+ = UNIQUE \(OACC_TAIL_MARK, \.data_dep\.[0-9_]+, 1\);.*\.data_dep\.[0-9_]+ = UNIQUE \(OACC_JOIN, \.data_dep\.[0-9_]+, 2\);} "oaccdevlow" } } */