From patchwork Wed May 20 12:01:44 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Bernd Schmidt
X-Patchwork-Id: 474368
Message-ID: <555C77A8.2030300@codesourcery.com>
Date: Wed, 20 May 2015 14:01:44 +0200
From: Bernd Schmidt
To: , Jakub Jelinek
Subject: [gomp4] New builtins, preparation for oacc vector-single

To implement OpenACC vector-single mode, we need to ensure that only one thread out of the group representing a worker executes. The others skip computations but follow along the CFG, so the results of conditional branch decisions must be broadcast to them.
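A minimal sketch of that idea, modelled in plain C rather than on the device. It assumes a 32-lane group and represents the lanes as array slots; WARP_SIZE, warp_broadcast and lanes_taking_branch are illustrative names and not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WARP_SIZE 32 /* assumption: lanes in the group representing one worker */

/* Hypothetical model of the broadcast: every lane ends up with lane 0's value. */
static void warp_broadcast(uint32_t lanes[], size_t n)
{
    assert(n > 0);
    for (size_t i = 1; i < n; i++)
        lanes[i] = lanes[0];
}

/* Vector-single model: lane 0 alone evaluates `value > threshold`; the other
   lanes skip the computation and receive the decision through the broadcast.
   Returns how many lanes end up taking the branch. */
static unsigned lanes_taking_branch(uint32_t value, uint32_t threshold)
{
    uint32_t cond[WARP_SIZE];

    for (size_t i = 0; i < WARP_SIZE; i++)
        cond[i] = 0xdeadbeefu;          /* stale per-lane contents */
    cond[0] = value > threshold;        /* only lane 0 executes the comparison */
    warp_broadcast(cond, WARP_SIZE);    /* all lanes now agree with lane 0 */

    unsigned taken = 0;
    for (size_t i = 0; i < WARP_SIZE; i++)
        if (cond[i])
            taken++;
    return taken;
}
```

Because every lane reads the broadcast decision, either all lanes take the branch or none do, which is what keeps the idle lanes on the same path through the CFG.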
The patch below adds a new builtin and nvptx pattern to implement that broadcast functionality.

Committed on gomp-4_0-branch.


Bernd

Index: gcc/ChangeLog.gomp
===================================================================
--- gcc/ChangeLog.gomp	(revision 223360)
+++ gcc/ChangeLog.gomp	(working copy)
@@ -1,3 +1,16 @@
+2015-05-19  Bernd Schmidt
+
+	* omp-builtins.def (GOACC_thread_broadcast,
+	GOACC_thread_broadcast_ll): New builtins.
+	* optabs.def (oacc_thread_broadcast_optab): New optab.
+	* builtins.c (expand_builtin_oacc_thread_broadcast): New function.
+	(expand_builtin): Use it.
+	* config/nvptx/nvptx.c (nvptx_cannot_copy_insn_p): New function.
+	(TARGET_CANNOT_COPY_INSN_P): Define.
+	* config/nvptx/nvptx.md (UNSPECV_WARP_BCAST): New constant.
+	(oacc_thread_broadcastsi): New pattern.
+	(oacc_thread_broadcastdi): New expander.
+
 2015-05-19  Tom de Vries
 
 	* omp-low.c (enclosing_target_ctx): Comment out.
Index: gcc/builtins.c
===================================================================
--- gcc/builtins.c	(revision 223360)
+++ gcc/builtins.c	(working copy)
@@ -6022,6 +6022,43 @@ expand_oacc_ganglocal_ptr (rtx target AT
   return NULL_RTX;
 }
 
+/* Handle a GOACC_thread_broadcast builtin call EXP with target TARGET.
+   Return the result.  */
+
+static rtx
+expand_builtin_oacc_thread_broadcast (tree exp, rtx target)
+{
+  tree arg0 = CALL_EXPR_ARG (exp, 0);
+  enum insn_code icode;
+
+  enum machine_mode mode = TYPE_MODE (TREE_TYPE (arg0));
+  gcc_assert (INTEGRAL_MODE_P (mode));
+  do
+    {
+      icode = direct_optab_handler (oacc_thread_broadcast_optab, mode);
+      mode = GET_MODE_WIDER_MODE (mode);
+    }
+  while (icode == CODE_FOR_nothing && mode != VOIDmode);
+  if (icode == CODE_FOR_nothing)
+    return expand_expr (arg0, NULL_RTX, VOIDmode, EXPAND_NORMAL);
+
+  rtx tmp = target;
+  machine_mode mode0 = insn_data[icode].operand[0].mode;
+  machine_mode mode1 = insn_data[icode].operand[1].mode;
+  if (!REG_P (tmp) || GET_MODE (tmp) != mode0)
+    tmp = gen_reg_rtx (mode0);
+  rtx op1 = expand_expr (arg0, NULL_RTX, mode1, EXPAND_NORMAL);
+  if (GET_MODE (op1) != mode1)
+    op1 = convert_to_mode (mode1, op1, 0);
+
+  rtx insn = GEN_FCN (icode) (tmp, op1);
+  if (insn != NULL_RTX)
+    {
+      emit_insn (insn);
+      return tmp;
+    }
+  return const0_rtx;
+}
 
 /* Expand an expression EXP that calls a built-in function,
    with result going to TARGET if that's convenient
@@ -7177,6 +7214,10 @@ expand_builtin (tree exp, rtx target, rt
 	return target;
       break;
 
+    case BUILT_IN_GOACC_THREAD_BROADCAST:
+    case BUILT_IN_GOACC_THREAD_BROADCAST_LL:
+      return expand_builtin_oacc_thread_broadcast (exp, target);
+
     default:	/* just do library call, if unknown builtin */
       break;
     }
Index: gcc/config/nvptx/nvptx.c
===================================================================
--- gcc/config/nvptx/nvptx.c	(revision 223360)
+++ gcc/config/nvptx/nvptx.c	(working copy)
@@ -2029,6 +2029,15 @@ nvptx_vector_alignment (const_tree type)
 
   return MIN (align, BIGGEST_ALIGNMENT);
 }
+
+static bool
+nvptx_cannot_copy_insn_p (rtx_insn *insn)
+{
+  if (recog_memoized (insn) == CODE_FOR_oacc_thread_broadcastsi)
+    return true;
+  return false;
+}
+
 
 /* Record a symbol for mkoffload to enter into the mapping table.  */
 
@@ -2153,6 +2162,9 @@ nvptx_file_end (void)
 #undef TARGET_VECTOR_ALIGNMENT
 #define TARGET_VECTOR_ALIGNMENT nvptx_vector_alignment
 
+#undef TARGET_CANNOT_COPY_INSN_P
+#define TARGET_CANNOT_COPY_INSN_P nvptx_cannot_copy_insn_p
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 
 #include "gt-nvptx.h"
Index: gcc/config/nvptx/nvptx.md
===================================================================
--- gcc/config/nvptx/nvptx.md	(revision 223360)
+++ gcc/config/nvptx/nvptx.md	(working copy)
@@ -61,6 +61,7 @@ (define_c_enum "unspecv" [
    UNSPECV_LOCK
    UNSPECV_CAS
    UNSPECV_XCHG
+   UNSPECV_WARP_BCAST
 ])
 
 (define_attr "subregs_ok" "false,true"
@@ -1322,6 +1323,37 @@ (define_expand "oacc_ctaid"
   FAIL;
 })
 
+(define_insn "oacc_thread_broadcastsi"
+  [(set (match_operand:SI 0 "nvptx_register_operand" "")
+	(unspec_volatile:SI [(match_operand:SI 1 "nvptx_register_operand" "")]
+			    UNSPECV_WARP_BCAST))]
+  ""
+  "%.\\tshfl.idx.b32\\t%0, %1, 0, 31;")
+
+(define_expand "oacc_thread_broadcastdi"
+  [(set (match_operand:DI 0 "nvptx_register_operand" "")
+	(unspec_volatile:DI [(match_operand:DI 1 "nvptx_register_operand" "")]
+			    UNSPECV_WARP_BCAST))]
+  ""
+{
+  rtx t = gen_reg_rtx (DImode);
+  emit_insn (gen_lshrdi3 (t, operands[1], GEN_INT (32)));
+  rtx op0 = force_reg (SImode, gen_lowpart (SImode, t));
+  rtx op1 = force_reg (SImode, gen_lowpart (SImode, operands[1]));
+  rtx targ0 = gen_reg_rtx (SImode);
+  rtx targ1 = gen_reg_rtx (SImode);
+  emit_insn (gen_oacc_thread_broadcastsi (targ0, op0));
+  emit_insn (gen_oacc_thread_broadcastsi (targ1, op1));
+  rtx t2 = gen_reg_rtx (DImode);
+  rtx t3 = gen_reg_rtx (DImode);
+  emit_insn (gen_extendsidi2 (t2, targ0));
+  emit_insn (gen_extendsidi2 (t3, targ1));
+  rtx t4 = gen_reg_rtx (DImode);
+  emit_insn (gen_ashldi3 (t4, t2, GEN_INT (32)));
+  emit_insn (gen_iordi3 (operands[0], t3, t4));
+  DONE;
+})
+
 (define_insn "ganglocal_ptr"
   [(set (match_operand:P 0 "nvptx_register_operand" "")
 	(unspec:P [(const_int 0)] UNSPEC_SHARED_DATA))]
Index: gcc/fortran/ChangeLog.gomp
===================================================================
--- gcc/fortran/ChangeLog.gomp	(revision 223360)
+++ gcc/fortran/ChangeLog.gomp	(working copy)
@@ -1,3 +1,7 @@
+2015-05-19  Bernd Schmidt
+
+	* types.def (BT_FN_ULONGLONG_ULONGLONG): Define.
+
 2015-05-13  Cesar Philippidis
 
 	* f95-lang.c (gfc_attribute_table): Add and "oacc function"
Index: gcc/fortran/types.def
===================================================================
--- gcc/fortran/types.def	(revision 223360)
+++ gcc/fortran/types.def	(working copy)
@@ -84,6 +84,7 @@ DEF_FUNCTION_TYPE_1 (BT_FN_VOID_PTRPTR,
 DEF_FUNCTION_TYPE_1 (BT_FN_VOID_VPTR, BT_VOID, BT_VOLATILE_PTR)
 DEF_FUNCTION_TYPE_1 (BT_FN_INT_INT, BT_INT, BT_INT)
 DEF_FUNCTION_TYPE_1 (BT_FN_UINT_UINT, BT_UINT, BT_UINT)
+DEF_FUNCTION_TYPE_1 (BT_FN_ULONGLONG_ULONGLONG, BT_ULONGLONG, BT_ULONGLONG)
 DEF_FUNCTION_TYPE_1 (BT_FN_PTR_PTR, BT_PTR, BT_PTR)
 DEF_FUNCTION_TYPE_1 (BT_FN_VOID_INT, BT_VOID, BT_INT)
 DEF_FUNCTION_TYPE_1 (BT_FN_BOOL_INT, BT_BOOL, BT_INT)
Index: gcc/omp-builtins.def
===================================================================
--- gcc/omp-builtins.def	(revision 223360)
+++ gcc/omp-builtins.def	(working copy)
@@ -77,6 +77,10 @@ DEF_GOACC_BUILTIN (BUILT_IN_GOACC_GET_GA
 		   BT_FN_PTR, ATTR_NOTHROW_LEAF_LIST)
 DEF_GOACC_BUILTIN (BUILT_IN_GOACC_DEVICEPTR, "GOACC_deviceptr",
 		   BT_FN_PTR_PTR, ATTR_CONST_NOTHROW_LEAF_LIST)
+DEF_GOACC_BUILTIN (BUILT_IN_GOACC_THREAD_BROADCAST, "GOACC_thread_broadcast",
+		   BT_FN_UINT_UINT, ATTR_NOTHROW_LEAF_LIST)
+DEF_GOACC_BUILTIN (BUILT_IN_GOACC_THREAD_BROADCAST_LL, "GOACC_thread_broadcast_ll",
+		   BT_FN_ULONGLONG_ULONGLONG, ATTR_NOTHROW_LEAF_LIST)
 DEF_GOACC_BUILTIN_COMPILER (BUILT_IN_ACC_ON_DEVICE, "acc_on_device",
 			    BT_FN_INT_INT, ATTR_CONST_NOTHROW_LEAF_LIST)
 
Index: gcc/optabs.def
===================================================================
--- gcc/optabs.def	(revision 223360)
+++ gcc/optabs.def	(working copy)
@@ -332,3 +332,5 @@ OPTAB_D (atomic_xor_optab, "atomic_xor$I
 
 OPTAB_D (get_thread_pointer_optab, "get_thread_pointer$I$a")
 OPTAB_D (set_thread_pointer_optab, "set_thread_pointer$I$a")
+
+OPTAB_D (oacc_thread_broadcast_optab, "oacc_thread_broadcast$I$a")
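
For illustration, the split-and-recombine strategy of the oacc_thread_broadcastdi expander can be modelled in plain C. This is a sketch under stated assumptions, not the nvptx implementation: lanes are array slots, and WARP_SIZE, broadcast32, broadcast64 and combine_sign_extended are hypothetical names. broadcast32 stands in for the shfl.idx.b32 instruction with source lane 0. The sketch zero-extends the low half before OR-ing the halves together. combine_sign_extended shows what the same recombination yields when both halves are sign-extended instead, which is what extendsidi2, GCC's standard name for the sign-extending SI-to-DI pattern, produces: a low half with bit 31 set then fills the upper word with ones.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WARP_SIZE 32 /* assumption: lanes taking part in the broadcast */

/* Hypothetical stand-in for the 32-bit broadcast from lane 0. */
static void broadcast32(uint32_t lanes[], size_t n)
{
    assert(n > 0);
    for (size_t i = 1; i < n; i++)
        lanes[i] = lanes[0];
}

/* 64-bit broadcast built from two 32-bit broadcasts: split each value into
   high and low words, broadcast both, then recombine with zero extension. */
static void broadcast64(uint64_t lanes[WARP_SIZE])
{
    uint32_t hi[WARP_SIZE], lo[WARP_SIZE];

    for (size_t i = 0; i < WARP_SIZE; i++) {
        hi[i] = (uint32_t)(lanes[i] >> 32);
        lo[i] = (uint32_t)lanes[i];
    }
    broadcast32(hi, WARP_SIZE);
    broadcast32(lo, WARP_SIZE);
    for (size_t i = 0; i < WARP_SIZE; i++)
        lanes[i] = ((uint64_t)hi[i] << 32) | (uint64_t)lo[i];
}

/* Sign-extend a 32-bit word to 64 bits without relying on signed casts. */
static uint64_t sign_extend32(uint32_t w)
{
    return (w & 0x80000000u) ? (0xFFFFFFFF00000000ull | w) : (uint64_t)w;
}

/* The recombination with both halves sign-extended, for comparison. */
static uint64_t combine_sign_extended(uint32_t hi, uint32_t lo)
{
    return sign_extend32(lo) | (sign_extend32(hi) << 32);
}
```

The two recombinations agree whenever bit 31 of the low word is clear, which covers small values such as branch conditions; they differ for general 64-bit operands.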