From patchwork Tue Sep 27 17:52:03 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Nathan Sidwell
X-Patchwork-Id: 675757
To: GCC Patches
From: Nathan Sidwell
Subject: [gomp4] loop abstraction fns
Date: Tue, 27 Sep 2016 13:52:03 -0400

In implementing support for tile, the loop structure becomes a little more complex, such that oacc_loop_xform_loop's simple scanning approach fails.
I realized that the scan isn't necessary if, rather than counting the abstraction functions when discovering the loops, we record the gimple calls themselves. Then we can simply iterate over that vector to fill in the partitioning information. That's a better solution on its own merits.

Applied to gomp4.

nathan

2016-09-27  Nathan Sidwell

	* omp-low.c (struct oacc_loop): Change 'ifns' to vector of call
	stmts.
	(new_oacc_loop_raw, finish_oacc_loop): Adjust.
	(oacc_loop_discover_walk): Append loop abstraction sites to list.
	(oacc_loop_xform_loop): Delete.
	(oacc_loop_process): Iterate over call list directly.
	(execute_oacc_device_lower): Ignore unknown unspecs.

Index: omp-low.c
===================================================================
--- omp-low.c	(revision 240524)
+++ omp-low.c	(working copy)
@@ -253,7 +253,7 @@ struct oacc_loop
   unsigned mask;   /* Partitioning mask. */
   unsigned inner;  /* Partitioning of inner loops. */
   unsigned flags;  /* Partitioning flags. */
-  unsigned ifns;   /* Contained loop abstraction functions. */
+  vec<gcall *> ifns;  /* Contained loop abstraction functions. */
   tree chunk_size; /* Chunk size. */
   gcall *head_end; /* Final marker of head sequence. */
 };
@@ -19341,7 +19341,6 @@ new_oacc_loop_raw (oacc_loop *parent, lo
   loop->routine = NULL_TREE;
 
   loop->mask = loop->flags = loop->inner = 0;
-  loop->ifns = 0;
   loop->chunk_size = 0;
   loop->head_end = NULL;
 
@@ -19404,7 +19403,7 @@ static oacc_loop *
 finish_oacc_loop (oacc_loop *loop)
 {
   /* If the loop has been collapsed, don't partition it. */
-  if (!loop->ifns)
+  if (loop->ifns.is_empty ())
     loop->mask = loop->flags = 0;
   return loop->parent;
 }
@@ -19542,9 +19541,9 @@ oacc_loop_discover_walk (oacc_loop *loop
               break;
 
             case IFN_GOACC_LOOP:
-              /* Count the goacc loop abstraction fns, to determine if the
-                 loop was collapsed already. */
-              loop->ifns++;
+              /* Record the abstraction function, so we can manipulate it
+                 later. */
+              loop->ifns.safe_push (call);
               break;
 
             case IFN_UNIQUE:
@@ -19685,51 +19684,6 @@ oacc_loop_xform_head_tail (gcall *from,
     }
 }
 
-/* Transform the IFN_GOACC_LOOP internal functions by providing the
-   determined partitioning mask and chunking argument. END_MARKER
-   points at the end IFN_HEAD_TAIL call intgroducing the loop. IFNS
-   is the number of IFN_GOACC_LOOP calls for the loop. MASK_ARG is
-   the replacement partitioning mask and CHUNK_ARG is the replacement
-   chunking arg. */
-
-static void
-oacc_loop_xform_loop (gcall *end_marker, unsigned ifns,
-                      tree mask_arg, tree chunk_arg)
-{
-  gimple_stmt_iterator gsi = gsi_for_stmt (end_marker);
-
-  gcc_checking_assert (ifns);
-  for (;;)
-    {
-      for (; !gsi_end_p (gsi); gsi_next (&gsi))
-        {
-          gimple *stmt = gsi_stmt (gsi);
-
-          if (!is_gimple_call (stmt))
-            continue;
-
-          gcall *call = as_a <gcall *> (stmt);
-
-          if (!gimple_call_internal_p (call))
-            continue;
-
-          if (gimple_call_internal_fn (call) != IFN_GOACC_LOOP)
-            continue;
-
-          *gimple_call_arg_ptr (call, 5) = mask_arg;
-          *gimple_call_arg_ptr (call, 4) = chunk_arg;
-          ifns--;
-          if (!ifns)
-            return;
-        }
-
-      /* The LOOP_BOUND ifn could be in the single successor
-         block. */
-      basic_block bb = single_succ (gsi_bb (gsi));
-      gsi = gsi_start_bb (bb);
-    }
-}
-
 /* Process the discovered OpenACC loops, setting the correct
    partitioning level etc. */
 
@@ -19742,13 +19696,25 @@ oacc_loop_process (oacc_loop *loop)
   if (loop->mask && !loop->routine)
     {
       int ix;
-      unsigned mask = loop->mask;
-      unsigned dim = GOMP_DIM_GANG;
-      tree mask_arg = build_int_cst (unsigned_type_node, mask);
+      tree mask_arg = build_int_cst (unsigned_type_node, loop->mask);
       tree chunk_arg = loop->chunk_size;
+      gcall *call;
 
-      oacc_loop_xform_loop (loop->head_end, loop->ifns, mask_arg, chunk_arg);
+      for (ix = 0; loop->ifns.iterate (ix, &call); ix++)
+        switch (gimple_call_internal_fn (call))
+          {
+          case IFN_GOACC_LOOP:
+            gcc_assert (gimple_call_arg (call, 5) == integer_zero_node);
+            *gimple_call_arg_ptr (call, 5) = mask_arg;
+            *gimple_call_arg_ptr (call, 4) = chunk_arg;
+            break;
 
+          default:
+            gcc_unreachable ();
+          }
+
+      unsigned dim = GOMP_DIM_GANG;
+      unsigned mask = loop->mask;
       for (ix = 0; ix != GOMP_DIM_MAX && mask; ix++)
         {
           while (!(GOMP_DIM_MASK (dim) & mask))
@@ -20176,7 +20142,7 @@ execute_oacc_device_lower ()
             switch (kind)
               {
               default:
-                gcc_unreachable ();
+                break;
 
               case IFN_UNIQUE_OACC_FORK:
               case IFN_UNIQUE_OACC_JOIN:
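
For anyone reading this without the GCC tree to hand, the record-then-iterate idea can be sketched outside the compiler roughly as follows. This is only an illustration under stated assumptions: LoopCall, Loop, record_call and apply_partitioning are hypothetical stand-ins (they are not GCC names), and std::vector stands in for GCC's vec. The patch itself stores gcall pointers and rewrites call arguments 4 and 5.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an IFN_GOACC_LOOP call statement.
struct LoopCall
{
  unsigned mask = 0;  // stand-in for call argument 5 (partitioning mask)
  int chunk = 0;      // stand-in for call argument 4 (chunk size)
};

// Hypothetical stand-in for struct oacc_loop.
struct Loop
{
  unsigned mask = 0;
  int chunk_size = 0;
  std::vector<LoopCall *> ifns;  // recorded abstraction-function call sites
};

// Discovery: remember each call site instead of incrementing a counter.
void
record_call (Loop &loop, LoopCall *call)
{
  loop.ifns.push_back (call);
}

// Processing: visit exactly the recorded sites.  No rescan of the
// statement stream is needed, so block layout does not matter.
// Returns the number of call sites updated.
std::size_t
apply_partitioning (Loop &loop)
{
  for (LoopCall *call : loop.ifns)
    {
      assert (call->mask == 0);  // placeholder must not be filled in yet
      call->mask = loop.mask;
      call->chunk = loop.chunk_size;
    }
  return loop.ifns.size ();
}
```

An empty list then plays the role the zero count used to play: a loop with no recorded calls is one that was already collapsed, which is what the is_empty () test in finish_oacc_loop checks.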