From patchwork Wed Nov 4 17:29:19 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Nathan Sidwell
X-Patchwork-Id: 540106
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Delivered-To: mailing list gcc-patches@gcc.gnu.org
Subject: Re: OpenACC dimension range propagation optimization
To: Richard Biener
References: <5638F8EF.6050607@acm.org>
Cc: GCC Patches, Jakub Jelinek, Bernd Schmidt
From: Nathan Sidwell
Message-ID: <563A406F.7090506@acm.org>
Date: Wed, 4 Nov 2015 12:29:19 -0500

On 11/04/15 05:26, Richard Biener wrote:
> On Tue, Nov 3, 2015 at 7:11 PM, Nathan Sidwell wrote:
>> Richard,

> this all seems a little bit fragile and relying on implementation details?
> Is the attribute always present?  Is the call argument always a constant
> that fits in a HOST_WIDE_INT (or even int here)?  Are there always enough
> 'dims' in the tree list?  Is the 'dim' value always an INTEGER_CST that
> fits a HOST_WIDE_INT (or even an int here)?

> If so I'd like to see helper functions to hide these implementation details
> from generic code like this.

Like this?

I've added two helper functions to omp-low.c: one to get the dimension (axis)
argument of the internal fn call, and the other to get a dimension size, given
an axis number.  omp-low seemed the most appropriate place -- that's where the
dimension processing is done and where these internal fn calls are generated.

(Bernd, I'll fix up the dimension folding patch to use these helpers before
applying it.)

ok?

nathan

2015-11-04  Nathan Sidwell

        * target.def (goacc.dim_limit): New hook.
        * targhooks.h (default_goacc_dim_limit): Declare.
        * doc/tm.texi.in (TARGET_GOACC_DIM_LIMIT): Add.
        * doc/tm.texi: Rebuilt.
        * omp-low.h (get_oacc_fn_dim_size, get_oacc_ifn_dim_arg): Declare.
        * omp-low.c (get_oacc_fn_dim_size, get_oacc_ifn_dim_arg): New.
        (default_goacc_dim_limit): New.
        * config/nvptx/nvptx.c (PTX_VECTOR_LENGTH, PTX_WORKER_LENGTH): New.
        (nvptx_dim_limit): New.
        (TARGET_GOACC_DIM_LIMIT): Override.
        * tree-vrp.c: Include omp-low.h, target.h.
        (extract_range_basic): Add handling for IFN_GOACC_DIM_SIZE &
        IFN_GOACC_DIM_POS.

Index: omp-low.c
===================================================================
--- omp-low.c (revision 229757)
+++ omp-low.c (working copy)
@@ -12095,6 +12095,41 @@ get_oacc_fn_attrib (tree fn)
   return lookup_attribute (OACC_FN_ATTRIB, DECL_ATTRIBUTES (fn));
 }
 
+/* Extract an oacc execution dimension from FN.  FN must be an
+   offloaded function or routine that has already had its execution
+   dimensions lowered to the target-specific values.  */
+
+int
+get_oacc_fn_dim_size (tree fn, int axis)
+{
+  tree attrs = get_oacc_fn_attrib (fn);
+
+  gcc_assert (axis < GOMP_DIM_MAX);
+
+  tree dims = TREE_VALUE (attrs);
+  while (axis--)
+    dims = TREE_CHAIN (dims);
+
+  int size = TREE_INT_CST_LOW (TREE_VALUE (dims));
+
+  return size;
+}
+
+/* Extract the dimension axis from an IFN_GOACC_DIM_POS or
+   IFN_GOACC_DIM_SIZE call.  */
+
+int
+get_oacc_ifn_dim_arg (const gimple *stmt)
+{
+  gcc_checking_assert (gimple_call_internal_fn (stmt) == IFN_GOACC_DIM_SIZE
+                       || gimple_call_internal_fn (stmt) == IFN_GOACC_DIM_POS);
+  tree arg = gimple_call_arg (stmt, 0);
+  HOST_WIDE_INT axis = TREE_INT_CST_LOW (arg);
+
+  gcc_checking_assert (axis >= 0 && axis < GOMP_DIM_MAX);
+  return (int) axis;
+}
+
 /* Expand the GIMPLE_OMP_TARGET starting at REGION.  */
 
 static void
@@ -19383,6 +19418,18 @@ default_goacc_validate_dims (tree ARG_UN
   return changed;
 }
 
+/* Default dimension bound is unknown on accelerator and 1 on host.  */
+
+int
+default_goacc_dim_limit (int ARG_UNUSED (axis))
+{
+#ifdef ACCEL_COMPILER
+  return 0;
+#else
+  return 1;
+#endif
+}
+
 namespace {
 
 const pass_data pass_data_oacc_device_lower =
Index: omp-low.h
===================================================================
--- omp-low.h (revision 229757)
+++ omp-low.h (working copy)
@@ -31,6 +31,8 @@ extern bool make_gimple_omp_edges (basic
 extern void omp_finish_file (void);
 extern tree omp_member_access_dummy_var (tree);
 extern tree get_oacc_fn_attrib (tree);
+extern int get_oacc_ifn_dim_arg (const gimple *);
+extern int get_oacc_fn_dim_size (tree, int);
 
 extern GTY(()) vec<tree, va_gc> *offload_funcs;
 extern GTY(()) vec<tree, va_gc> *offload_vars;
Index: targhooks.h
===================================================================
--- targhooks.h (revision 229757)
+++ targhooks.h (working copy)
@@ -110,6 +110,7 @@ extern void default_destroy_cost_data (v
 
 /* OpenACC hooks.  */
 extern bool default_goacc_validate_dims (tree, int [], int);
+extern int default_goacc_dim_limit (int);
 extern bool default_goacc_fork_join (gcall *, const int [], bool);
 
 /* These are here, and not in hooks.[ch], because not all users of
Index: doc/tm.texi
===================================================================
--- doc/tm.texi (revision 229757)
+++ doc/tm.texi (working copy)
@@ -5777,6 +5777,11 @@ true, if changes have been made. You mu
 provide dimensions larger than 1.
 @end deftypefn
 
+@deftypefn {Target Hook} int TARGET_GOACC_DIM_LIMIT (int @var{axis})
+This hook should return the maximum size of a particular dimension,
+or zero if unbounded.
+@end deftypefn
+
 @deftypefn {Target Hook} bool TARGET_GOACC_FORK_JOIN (gcall *@var{call}, const int *@var{dims}, bool @var{is_fork})
 This hook can be used to convert IFN_GOACC_FORK and IFN_GOACC_JOIN
 function calls to target-specific gimple, or indicate whether they
Index: doc/tm.texi.in
===================================================================
--- doc/tm.texi.in (revision 229757)
+++ doc/tm.texi.in (working copy)
@@ -4262,6 +4262,8 @@ address; but often a machine-dependent
 
 @hook TARGET_GOACC_VALIDATE_DIMS
 
+@hook TARGET_GOACC_DIM_LIMIT
+
 @hook TARGET_GOACC_FORK_JOIN
 
 @node Anchored Addresses
Index: tree-vrp.c
===================================================================
--- tree-vrp.c (revision 229757)
+++ tree-vrp.c (working copy)
@@ -55,8 +55,8 @@ along with GCC; see the file COPYING3.
 #include "tree-ssa-threadupdate.h"
 #include "tree-ssa-scopedtables.h"
 #include "tree-ssa-threadedge.h"
-
-
+#include "omp-low.h"
+#include "target.h"
 
 /* Range of values that can be associated with an SSA_NAME after VRP
    has executed.  */
@@ -3973,7 +3973,9 @@ extract_range_basic (value_range *vr, gi
   else if (is_gimple_call (stmt) && gimple_call_internal_p (stmt))
     {
       enum tree_code subcode = ERROR_MARK;
-      switch (gimple_call_internal_fn (stmt))
+      unsigned ifn_code = gimple_call_internal_fn (stmt);
+
+      switch (ifn_code)
         {
         case IFN_UBSAN_CHECK_ADD:
           subcode = PLUS_EXPR;
@@ -3984,6 +3986,28 @@ extract_range_basic (value_range *vr, gi
         case IFN_UBSAN_CHECK_MUL:
           subcode = MULT_EXPR;
           break;
+        case IFN_GOACC_DIM_SIZE:
+        case IFN_GOACC_DIM_POS:
+          /* Optimizing these two internal functions helps the loop
+             optimizer eliminate outer comparisons.  Size is [1,N]
+             and pos is [0,N-1].  */
+          {
+            bool is_pos = ifn_code == IFN_GOACC_DIM_POS;
+            int axis = get_oacc_ifn_dim_arg (stmt);
+            int size = get_oacc_fn_dim_size (current_function_decl, axis);
+
+            if (!size)
+              /* If it's dynamic, the backend might know a hardware
+                 limitation.  */
+              size = targetm.goacc.dim_limit (axis);
+
+            tree type = TREE_TYPE (gimple_call_lhs (stmt));
+            set_value_range (vr, VR_RANGE,
+                             build_int_cst (type, is_pos ? 0 : 1),
+                             size ? build_int_cst (type, size - is_pos)
+                             : vrp_val_max (type), NULL);
+          }
+          return;
         default:
           break;
         }
Index: config/nvptx/nvptx.c
===================================================================
--- config/nvptx/nvptx.c (revision 229757)
+++ config/nvptx/nvptx.c (working copy)
@@ -3248,6 +3248,10 @@ nvptx_file_end (void)
     }
 }
 
+/* Define dimension sizes for known hardware.  */
+#define PTX_VECTOR_LENGTH 32
+#define PTX_WORKER_LENGTH 32
+
 /* Validate compute dimensions of an OpenACC offload or routine, fill
    in non-unity defaults.  FN_LEVEL indicates the level at which a
    routine might spawn a loop.  It is negative for non-routines.  */
@@ -3264,6 +3268,25 @@ nvptx_goacc_validate_dims (tree ARG_UNUS
   return changed;
 }
 
+/* Return maximum dimension size, or zero for unbounded.  */
+
+static int
+nvptx_dim_limit (int axis)
+{
+  switch (axis)
+    {
+    case GOMP_DIM_WORKER:
+      return PTX_WORKER_LENGTH;
+
+    case GOMP_DIM_VECTOR:
+      return PTX_VECTOR_LENGTH;
+
+    default:
+      break;
+    }
+  return 0;
+}
+
 /* Determine whether fork & joins are needed.  */
 
 static bool
@@ -3376,6 +3399,9 @@ nvptx_goacc_fork_join (gcall *call, cons
 #undef TARGET_GOACC_VALIDATE_DIMS
 #define TARGET_GOACC_VALIDATE_DIMS nvptx_goacc_validate_dims
 
+#undef TARGET_GOACC_DIM_LIMIT
+#define TARGET_GOACC_DIM_LIMIT nvptx_dim_limit
+
 #undef TARGET_GOACC_FORK_JOIN
 #define TARGET_GOACC_FORK_JOIN nvptx_goacc_fork_join
 
Index: target.def
===================================================================
--- target.def (revision 229757)
+++ target.def (working copy)
@@ -1659,6 +1659,13 @@ bool, (tree decl, int *dims, int fn_leve
 default_goacc_validate_dims)
 
 DEFHOOK
+(dim_limit,
+"This hook should return the maximum size of a particular dimension,\n\
+or zero if unbounded.",
+int, (int axis),
+default_goacc_dim_limit)
+
+DEFHOOK
 (fork_join,
 "This hook can be used to convert IFN_GOACC_FORK and IFN_GOACC_JOIN\n\
 function calls to target-specific gimple, or indicate whether they\n\