From patchwork Wed Dec 18 00:21:56 2013
X-Patchwork-Submitter: "Iyer, Balaji V"
X-Patchwork-Id: 302582
From: "Iyer, Balaji V"
To: Jason Merrill, 'Jeff Law', 'Aldy Hernandez'
CC: 'gcc-patches@gcc.gnu.org', 'rth@redhat.com', 'Jakub Jelinek'
Subject: RE: [PATCH] _Cilk_for for C and C++
Date: Wed, 18 Dec 2013 00:21:56 +0000
In-Reply-To: <52AF6EE8.2080807@redhat.com>

Hi Jason,
	Here is a fixed patch. I have also answered your questions below.

Thanks,

Balaji V. Iyer.

> -----Original Message-----
> From: Jason Merrill [mailto:jason@redhat.com]
> Sent: Monday, December 16, 2013 4:22 PM
> To: Iyer, Balaji V; 'Jeff Law'; 'Aldy Hernandez'
> Cc: 'gcc-patches@gcc.gnu.org'; 'rth@redhat.com'; 'Jakub Jelinek'
> Subject: Re: [PATCH] _Cilk_for for C and C++
>
> On 12/15/2013 07:40 PM, Iyer, Balaji V wrote:
> > -                       tree clauses, tree *cclauses)
> > +                       tree clauses_or_grain, tree *cclauses)
>
> Instead of this, please make the grainsize a new type of clause.
>

I store it in OMP_FOR_CLAUSES because OMP clauses cannot occur in a _Cilk_for, so adding a new clause seems to be overkill IMHO. I need a place to store the grain value and so I chose this spot.

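As a rough illustration of what the grain value ends up controlling (a sketch only, not code from this patch): the runtime entry point declared later in the patch takes a body callback, a data pointer, a loop count and a grain. The stand-in below, `sketch_cilk_for_32`, is hypothetical and serial, as are `sum_state` and `sum_body`. It assumes the body receives a half-open range [low, high) and that a grain of 0 means "let the runtime choose", which the sketch resolves by using a single chunk.

```c
#include <stdint.h>

/* Callback shape mirroring the function type built by cilk_declare_looper
   in the patch: void body (void *data, uint32_t low, uint32_t high).  */
typedef void (*loop_body_fn) (void *data, uint32_t low, uint32_t high);

/* Hypothetical serial stand-in for __cilkrts_cilk_for_32.  Assumption: the
   body is called on half-open ranges [low, high), each at most GRAIN
   iterations long.  A grain of 0 is treated here as "one chunk covering
   everything"; a real runtime would pick its own value.  */
void
sketch_cilk_for_32 (loop_body_fn body, void *data, uint32_t count, int grain)
{
  uint32_t step = grain > 0 ? (uint32_t) grain : (count ? count : 1);
  uint32_t low = 0;

  while (low < count)
    {
      /* Compare against the remaining iterations to avoid overflow.  */
      uint32_t high = (count - low > step) ? low + step : count;
      body (data, low, high);
      low = high;
    }
}

/* Example state and body: sum the iteration indices and count how many
   chunks the stand-in runtime handed out.  */
struct sum_state
{
  uint64_t total;
  unsigned calls;
};

void
sum_body (void *data, uint32_t low, uint32_t high)
{
  struct sum_state *s = (struct sum_state *) data;
  for (uint32_t i = low; i < high; i++)
    s->total += i;
  s->calls++;
}
```

With a count of 10 and a grain of 4, the body in this sketch is invoked on [0, 4), [4, 8) and [8, 10).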
> > - return (gimple_omp_subcode (g) & GF_OMP_FOR_COMBINED) != 0; > > + return (gimple_omp_for_kind (g) == GF_OMP_FOR_COMBINED); > > I don't really know this code, but this change seems unlikely to be correct. > Can you explain it? > Yep it is wrong. I have reverted it. I have moved around the gf_task bits. > > + tree data_name = get_identifier (".omp_data_i"); if (is_cilk_for) > > + data_name = get_identifier (".cilk_for_data_i"); > > Why does the name of an artificial parameter matter? > Well, it helps differenciate between the two.. I have reverted it. > > } > > +/* A subroutine of expand_omp_for. Generate code for _Cilk_for loop. > > + Given parameters: > > Need a blank line after the }. > Fixed. > Jason diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c old mode 100644 new mode 100755 index cfaeaf0..c3dcb21 --- a/gcc/c-family/c-common.c +++ b/gcc/c-family/c-common.c @@ -414,6 +414,7 @@ const struct c_common_resword c_common_reswords[] = { "_Complex", RID_COMPLEX, 0 }, { "_Cilk_spawn", RID_CILK_SPAWN, 0 }, { "_Cilk_sync", RID_CILK_SYNC, 0 }, + { "_Cilk_for", RID_CILK_FOR, 0 }, { "_Imaginary", RID_IMAGINARY, D_CONLY }, { "_Decimal32", RID_DFLOAT32, D_CONLY | D_EXT }, { "_Decimal64", RID_DFLOAT64, D_CONLY | D_EXT }, diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h old mode 100644 new mode 100755 index 4357d1f..508de30 --- a/gcc/c-family/c-common.h +++ b/gcc/c-family/c-common.h @@ -149,7 +149,7 @@ enum rid RID_CONSTEXPR, RID_DECLTYPE, RID_NOEXCEPT, RID_NULLPTR, RID_STATIC_ASSERT, /* Cilk Plus keywords. 
*/ - RID_CILK_SPAWN, RID_CILK_SYNC, + RID_CILK_SPAWN, RID_CILK_SYNC, RID_CILK_FOR, /* Objective-C ("AT" reserved words - they are only keywords when they follow '@') */ diff --git a/gcc/c-family/c-omp.c b/gcc/c-family/c-omp.c old mode 100644 new mode 100755 index 3ccf8f9..6ae0a0f --- a/gcc/c-family/c-omp.c +++ b/gcc/c-family/c-omp.c @@ -386,7 +386,7 @@ c_finish_omp_for (location_t locus, enum tree_code code, tree declv, bool fail = false; int i; - if (code == CILK_SIMD + if ((code == CILK_SIMD || code == CILK_FOR) && !c_check_cilk_loop (locus, TREE_VEC_ELT (declv, 0))) fail = true; @@ -516,7 +516,7 @@ c_finish_omp_for (location_t locus, enum tree_code code, tree declv, 0)) TREE_SET_CODE (cond, TREE_CODE (cond) == NE_EXPR ? LT_EXPR : GE_EXPR); - else if (code != CILK_SIMD) + else if (code != CILK_SIMD && code != CILK_FOR) cond_ok = false; } } diff --git a/gcc/c-family/c-pragma.c b/gcc/c-family/c-pragma.c old mode 100644 new mode 100755 index 64a5b66..9d6efc5 --- a/gcc/c-family/c-pragma.c +++ b/gcc/c-family/c-pragma.c @@ -1394,6 +1394,11 @@ init_pragma (void) cpp_register_deferred_pragma (parse_in, "GCC", "ivdep", PRAGMA_IVDEP, false, false); + + if (flag_enable_cilkplus && !flag_preprocess_only) + cpp_register_deferred_pragma (parse_in, "cilk", "grainsize", + PRAGMA_CILK_GRAINSIZE, true, false); + #ifdef HANDLE_PRAGMA_PACK_WITH_EXPANSION c_register_pragma_with_expansion (0, "pack", handle_pragma_pack); #else diff --git a/gcc/c-family/c-pragma.h b/gcc/c-family/c-pragma.h old mode 100644 new mode 100755 index 5379b9e..ef62653 --- a/gcc/c-family/c-pragma.h +++ b/gcc/c-family/c-pragma.h @@ -55,6 +55,9 @@ typedef enum pragma_kind { /* Top level clause to handle all Cilk Plus pragma simd clauses. */ PRAGMA_CILK_SIMD, + /* This pragma handles setting of grainsize for a _Cilk_for. 
*/ + PRAGMA_CILK_GRAINSIZE, + PRAGMA_GCC_PCH_PREPROCESS, PRAGMA_IVDEP, diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c old mode 100644 new mode 100755 index c78d269..3d2a6e0 --- a/gcc/c/c-parser.c +++ b/gcc/c/c-parser.c @@ -1242,9 +1242,10 @@ static bool c_parser_objc_diagnose_bad_element_prefix (c_parser *, struct c_declspecs *); /* Cilk Plus supporting routines. */ -static void c_parser_cilk_simd (c_parser *); +static void c_parser_cilk_simd (c_parser *, bool, tree); static bool c_parser_cilk_verify_simd (c_parser *, enum pragma_context); static tree c_parser_array_notation (location_t, c_parser *, tree, tree); +static void c_parser_cilk_grainsize (c_parser *); /* Parse a translation unit (C90 6.7, C99 6.9). @@ -4776,6 +4777,16 @@ c_parser_statement_after_labels (c_parser *parser) case RID_FOR: c_parser_for_statement (parser, false); break; + case RID_CILK_FOR: + if (!flag_enable_cilkplus) + { + error_at (c_parser_peek_token (parser)->location, + "-fcilkplus must be enabled to use %<_Cilk_for%>"); + c_parser_skip_to_end_of_block_or_statement (parser); + } + else + c_parser_cilk_simd (parser, true, integer_zero_node); + break; case RID_CILK_SYNC: c_parser_consume_token (parser); c_parser_skip_until_found (parser, CPP_SEMICOLON, "expected %<;%>"); @@ -9386,7 +9397,24 @@ c_parser_pragma (c_parser *parser, enum pragma_context context) if (!c_parser_cilk_verify_simd (parser, context)) return false; c_parser_consume_pragma (parser); - c_parser_cilk_simd (parser); + c_parser_cilk_simd (parser, false, NULL_TREE); + return false; + case PRAGMA_CILK_GRAINSIZE: + if (!flag_enable_cilkplus) + { + warning (0, "%<#pragma grainsize%> ignored because -fcilkplus is not" + " enabled"); + c_parser_skip_until_found (parser, CPP_PRAGMA_EOL, NULL); + return false; + } + if (context == pragma_external) + { + error_at (c_parser_peek_token (parser)->location, + "%<#pragma grainsize%> must be inside a function"); + c_parser_skip_until_found (parser, CPP_PRAGMA_EOL, NULL); + return false; 
+ } + c_parser_cilk_grainsize (parser); return false; default: @@ -11460,7 +11488,7 @@ c_parser_omp_flush (c_parser *parser) static tree c_parser_omp_for_loop (location_t loc, c_parser *parser, enum tree_code code, - tree clauses, tree *cclauses) + tree clauses_or_grain, tree *cclauses) { tree decl, cond, incr, save_break, save_cont, body, init, stmt, cl; tree declv, condv, incrv, initv, ret = NULL; @@ -11468,6 +11496,8 @@ c_parser_omp_for_loop (location_t loc, c_parser *parser, enum tree_code code, int i, collapse = 1, nbraces = 0; location_t for_loc; vec *for_block = make_tree_vector (); + tree clauses = code == CILK_FOR ? NULL_TREE : clauses_or_grain; + tree grain = code == CILK_FOR ? clauses_or_grain : NULL_TREE; for (cl = clauses; cl; cl = OMP_CLAUSE_CHAIN (cl)) if (OMP_CLAUSE_CODE (cl) == OMP_CLAUSE_COLLAPSE) @@ -11480,11 +11510,18 @@ c_parser_omp_for_loop (location_t loc, c_parser *parser, enum tree_code code, condv = make_tree_vec (collapse); incrv = make_tree_vec (collapse); - if (!c_parser_next_token_is_keyword (parser, RID_FOR)) + if (code != CILK_FOR + && !c_parser_next_token_is_keyword (parser, RID_FOR)) { c_parser_error (parser, "for statement expected"); return NULL; } + if (code == CILK_FOR + && !c_parser_next_token_is_keyword (parser, RID_CILK_FOR)) + { + c_parser_error (parser, "_Cilk_for statement expected"); + return NULL; + } for_loc = c_parser_peek_token (parser)->location; c_parser_consume_token (parser); @@ -11562,7 +11599,7 @@ c_parser_omp_for_loop (location_t loc, c_parser *parser, enum tree_code code, case LE_EXPR: break; case NE_EXPR: - if (code == CILK_SIMD) + if (code == CILK_SIMD || code == CILK_FOR) break; /* FALLTHRU. 
*/ default: @@ -11736,6 +11773,11 @@ c_parser_omp_for_loop (location_t loc, c_parser *parser, enum tree_code code, } } OMP_FOR_CLAUSES (stmt) = clauses; + /* If it is a _Cilk_for statement, then the OMP_FOR_CLAUSES location + stores the user-defined grain value or an integer_zero_node + indicating that the runtime must compute a suitable grain. */ + if (code == CILK_FOR) + OMP_FOR_CLAUSES (stmt) = grain; } ret = stmt; } @@ -13566,18 +13608,75 @@ c_parser_cilk_all_clauses (c_parser *parser) return c_finish_cilk_clauses (clauses); } +/* This function helps parse the grainsize pragma for a _Cilk_for statement. + Here is the correct syntax of this pragma: + #pragma cilk grainsize = + */ + +static void +c_parser_cilk_grainsize (c_parser *parser) +{ + extern tree convert_to_integer (tree, tree); + + /* consume the 'grainsize' keyword. */ + c_parser_consume_pragma (parser); + + if (c_parser_require (parser, CPP_EQ, "expected %<=%>") != 0) + { + struct c_expr g_expr = c_parser_binary_expression (parser, NULL, NULL); + if (g_expr.value && TREE_CODE (g_expr.value) == C_MAYBE_CONST_EXPR) + { + error_at (input_location, "cannot convert grain to long integer.\n"); + c_parser_skip_to_pragma_eol (parser); + } + else if (g_expr.value && g_expr.value != error_mark_node) + { + c_parser_skip_to_pragma_eol (parser); + c_token *token = c_parser_peek_token (parser); + if (token && token->type == CPP_KEYWORD + && token->keyword == RID_CILK_FOR) + { + tree grain = convert_to_integer (long_integer_type_node, + g_expr.value); + if (grain && grain != error_mark_node) + c_parser_cilk_simd (parser, true, grain); + } + else + warning (0, "grainsize pragma is not followed by %<_Cilk_for%>"); + } + else + c_parser_skip_to_pragma_eol (parser); + } + else + c_parser_skip_to_pragma_eol (parser); +} + /* Main entry point for parsing Cilk Plus <#pragma simd> for loops. 
*/ static void -c_parser_cilk_simd (c_parser *parser) +c_parser_cilk_simd (c_parser *parser, bool is_cilk_for, tree grain) { - tree clauses = c_parser_cilk_all_clauses (parser); + tree super_block = NULL_TREE; + tree clauses = NULL_TREE; + + if (!is_cilk_for) + clauses = c_parser_cilk_all_clauses (parser); + else + { + super_block = c_begin_omp_parallel (); + clauses = grain; + } tree block = c_begin_compound_stmt (true); location_t loc = c_parser_peek_token (parser)->location; - c_parser_omp_for_loop (loc, parser, CILK_SIMD, clauses, NULL); + enum tree_code code = is_cilk_for ? CILK_FOR : CILK_SIMD; + c_parser_omp_for_loop (loc, parser, code, clauses, NULL); block = c_end_compound_stmt (loc, block, true); add_stmt (block); + if (is_cilk_for) + /* The term super_block is not used in scheduling terms but in + set-theory, i.e. set vs. super-set. */ + c_finish_omp_parallel (loc, NULL_TREE, super_block); } /* Parse a transaction attribute (GCC Extension). diff --git a/gcc/cilk-builtins.def b/gcc/cilk-builtins.def old mode 100644 new mode 100755 index 8634194..7f8f97a --- a/gcc/cilk-builtins.def +++ b/gcc/cilk-builtins.def @@ -31,3 +31,5 @@ DEF_CILK_BUILTIN_STUB (BUILT_IN_CILK_SYNC, "__cilkrts_sync") DEF_CILK_BUILTIN_STUB (BUILT_IN_CILK_LEAVE_FRAME, "__cilkrts_leave_frame") DEF_CILK_BUILTIN_STUB (BUILT_IN_CILK_POP_FRAME, "__cilkrts_pop_frame") DEF_CILK_BUILTIN_STUB (BUILT_IN_CILK_SAVE_FP, "__cilkrts_save_fp_ctrl_state") +DEF_CILK_BUILTIN_STUB (BUILT_IN_CILK_FOR_32, "__cilkrts_cilk_for_32") +DEF_CILK_BUILTIN_STUB (BUILT_IN_CILK_FOR_64, "__cilkrts_cilk_for_64") diff --git a/gcc/cilk-common.c b/gcc/cilk-common.c old mode 100644 new mode 100755 index 52b3785..2574f12 --- a/gcc/cilk-common.c +++ b/gcc/cilk-common.c @@ -106,6 +106,26 @@ install_builtin (const char *name, tree fntype, enum built_in_function code, return fndecl; } +/* Returns a FUNCTION_DECL of type TYPE whose name is *NAME. 
*/ + +static tree +cilk_declare_looper (const char *name, tree type, enum built_in_function code) +{ + tree cb, ft, fn; + + cb = build_function_type_list (void_type_node, + ptr_type_node, type, type, + NULL_TREE); + cb = build_pointer_type (cb); + ft = build_function_type_list (void_type_node, + cb, ptr_type_node, type, + integer_type_node, NULL_TREE); + fn = install_builtin (name, ft, code, false); + TREE_NOTHROW (fn) = 0; + + return fn; +} + /* Creates and initializes all the built-in Cilk keywords functions and three structures: __cilkrts_stack_frame, __cilkrts_pedigree and __cilkrts_worker. Detailed information about __cilkrts_stack_frame and @@ -269,6 +289,15 @@ cilk_init_builtins (void) cilk_save_fp_fndecl = install_builtin ("__cilkrts_save_fp_ctrl_state", fptr_fun, BUILT_IN_CILK_SAVE_FP, false); + /* __cilkrts_cilk_for_32 (...); */ + cilk_for_32_fndecl = cilk_declare_looper ("__cilkrts_cilk_for_32", + unsigned_intSI_type_node, + BUILT_IN_CILK_FOR_32); + /* __cilkrts_cilk_for_64 (...); */ + cilk_for_64_fndecl = cilk_declare_looper ("__cilkrts_cilk_for_64", + unsigned_intDI_type_node, + BUILT_IN_CILK_FOR_64); + } /* Get the appropriate frame arguments for CALL that is of type CALL_EXPR. */ diff --git a/gcc/cilk.h b/gcc/cilk.h old mode 100644 new mode 100755 index e990992..ea9f6ff --- a/gcc/cilk.h +++ b/gcc/cilk.h @@ -40,6 +40,9 @@ enum cilk_tree_index { CILK_TI_F_POP, /* __cilkrts_pop_frame (...). */ CILK_TI_F_RETHROW, /* __cilkrts_rethrow (...). */ CILK_TI_F_SAVE_FP, /* __cilkrts_save_fp_ctrl_state (...). */ + CILK_TI_F_LOOP_32, /* __cilkrts_cilk_for_32 (...). */ + CILK_TI_F_LOOP_64, /* __cilkrts_cilk_for_64 (...). */ + /* __cilkrts_stack_frame struct fields. */ CILK_TI_FRAME_FLAGS, /* stack_frame->flags. */ CILK_TI_FRAME_PARENT, /* stack_frame->parent. 
*/ @@ -77,6 +80,8 @@ extern GTY (()) tree cilk_trees[CILK_TI_MAX]; #define cilk_rethrow_fndecl cilk_trees[CILK_TI_F_RETHROW] #define cilk_pop_fndecl cilk_trees[CILK_TI_F_POP] #define cilk_save_fp_fndecl cilk_trees[CILK_TI_F_SAVE_FP] +#define cilk_for_32_fndecl cilk_trees[CILK_TI_F_LOOP_32] +#define cilk_for_64_fndecl cilk_trees[CILK_TI_F_LOOP_64] #define cilk_worker_type_fndecl cilk_trees[CILK_TI_WORKER_TYPE] #define cilk_frame_type_decl cilk_trees[CILK_TI_FRAME_TYPE] diff --git a/gcc/gimple-pretty-print.c b/gcc/gimple-pretty-print.c index 42e3f5f..867aa52 100644 --- a/gcc/gimple-pretty-print.c +++ b/gcc/gimple-pretty-print.c @@ -1158,6 +1158,8 @@ dump_gimple_omp_for (pretty_printer *buffer, gimple gs, int spc, int flags) case GF_OMP_FOR_KIND_DISTRIBUTE: pp_string (buffer, "#pragma omp distribute"); break; + case GF_OMP_FOR_KIND_CILKFOR: + break; default: gcc_unreachable (); } @@ -1167,7 +1169,10 @@ dump_gimple_omp_for (pretty_printer *buffer, gimple gs, int spc, int flags) if (i) spc += 2; newline_and_indent (buffer, spc); - pp_string (buffer, "for ("); + if (gimple_omp_for_kind (gs) == GF_OMP_FOR_KIND_CILKFOR) + pp_string (buffer, "_Cilk_for ("); + else + pp_string (buffer, "for ("); dump_generic_node (buffer, gimple_omp_for_index (gs, i), spc, flags, false); pp_string (buffer, " = "); diff --git a/gcc/gimple.h b/gcc/gimple.h old mode 100644 new mode 100755 index a49016f..1c0e049 --- a/gcc/gimple.h +++ b/gcc/gimple.h @@ -91,13 +91,14 @@ enum gf_mask { GF_CALL_ALLOCA_FOR_VAR = 1 << 5, GF_CALL_INTERNAL = 1 << 6, GF_OMP_PARALLEL_COMBINED = 1 << 0, - GF_OMP_FOR_KIND_MASK = 3 << 0, + GF_OMP_FOR_KIND_MASK = 7 << 0, GF_OMP_FOR_KIND_FOR = 0 << 0, GF_OMP_FOR_KIND_DISTRIBUTE = 1 << 0, GF_OMP_FOR_KIND_SIMD = 2 << 0, GF_OMP_FOR_KIND_CILKSIMD = 3 << 0, - GF_OMP_FOR_COMBINED = 1 << 2, - GF_OMP_FOR_COMBINED_INTO = 1 << 3, + GF_OMP_FOR_KIND_CILKFOR = 4 << 0, + GF_OMP_FOR_COMBINED = 1 << 3, + GF_OMP_FOR_COMBINED_INTO = 1 << 4, GF_OMP_TARGET_KIND_MASK = 3 << 0, 
GF_OMP_TARGET_KIND_REGION = 0 << 0, GF_OMP_TARGET_KIND_DATA = 1 << 0, @@ -523,6 +524,12 @@ struct GTY(()) gimple_omp_for_iter { /* Increment. */ tree incr; + + /* Loop count, only used by _Cilk_for. */ + tree loop_count; + + /* Grain value, only used by _Cilk_for. */ + tree grain; }; /* GIMPLE_OMP_FOR */ @@ -4562,6 +4569,58 @@ gimple_omp_for_set_pre_body (gimple gs, gimple_seq pre_body) omp_for_stmt->pre_body = pre_body; } +/* Set COUNT to be the loop count value for OMP_FOR GS. */ + +static inline void +gimple_cilk_for_set_count (tree count, gimple gs) +{ + const gimple_statement_omp_for *omp_for_stmt = + as_a (gs); + omp_for_stmt->iter[0].loop_count = count; +} + +/* Set GRAIN to be the grain value used by Cilk runtime for OMP_FOR GS. */ + +static inline void +gimple_cilk_for_set_grain (tree grain, gimple gs) +{ + const gimple_statement_omp_for *omp_for_stmt = + as_a (gs); + omp_for_stmt->iter[0].grain = grain; +} + +/* Returns the induction variable of type TREE from GS that is of type + GIMPLE_STATEMENT_OMP_FOR. */ + +static inline tree +gimple_cilk_for_induction_var (const_gimple gs) +{ + const gimple_statement_omp_for *cilk_for_stmt = + as_a (gs); + return cilk_for_stmt->iter->index; +} + +/* Returns the LOOP_COUNT value of type TREE from GS that is of type + GIMPLE_STATEMENT_OMP_FOR. */ + +static inline tree +gimple_cilk_for_loop_count (const_gimple gs) +{ + const gimple_statement_omp_for *cilk_for_stmt = + as_a (gs); + return cilk_for_stmt->iter->loop_count; +} + +/* Returns the GRAIN value of type TREE from GS that is of type + GIMPLE_STATEMENT_OMP_FOR. */ + +static inline tree +gimple_cilk_for_grain (const_gimple gs) +{ + const gimple_statement_omp_for *cilk_for_stmt = + as_a (gs); + return cilk_for_stmt->iter->grain; +} /* Return the clauses associated with OMP_PARALLEL GS. 
*/ diff --git a/gcc/gimplify.c b/gcc/gimplify.c index 1ca847a..bcc5ede5 100644 --- a/gcc/gimplify.c +++ b/gcc/gimplify.c @@ -6545,6 +6545,26 @@ find_combined_omp_for (tree *tp, int *walk_subtrees, void *) return NULL_TREE; } +/* Computes the loop count, absolute value of (FINAL-INIT)/STEP and store + them in CFOR->ITER->LOOP_COUNT. GRAIN is stored in CFOR->ITER->GRAIN. */ + +static void +cilk_for_compute_set_count_grain (gimple cfor, tree init, tree final, + tree incr, tree grain) +{ + enum tree_code cond = gimple_omp_for_cond (cfor, 0); + tree type = TREE_TYPE (init); + tree m = fold_build2 (MINUS_EXPR, type, final, init); + if (cond == GT_EXPR || cond == GE_EXPR) + m = fold_build1 (NEGATE_EXPR, TREE_TYPE (m), m); + + tree t = fold_build2 (TRUNC_DIV_EXPR, type, m, incr); + if (cond == LE_EXPR || cond == GE_EXPR) + t = fold_build2 (PLUS_EXPR, type, t, build_one_cst (type)); + gimple_cilk_for_set_count (t, cfor); + gimple_cilk_for_set_grain (grain, cfor); +} + /* Gimplify the gross structure of an OMP_FOR statement. */ static enum gimplify_status @@ -6559,7 +6579,15 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p) bool simd; bitmap has_decl_expr = NULL; + tree grain = NULL_TREE; + tree orig_init = NULL_TREE, orig_cond = NULL_TREE, orig_incr = NULL_TREE; orig_for_stmt = for_stmt = *expr_p; + + if (TREE_CODE (for_stmt) == CILK_FOR) + { + grain = OMP_FOR_CLAUSES (for_stmt); + OMP_FOR_CLAUSES (for_stmt) = NULL_TREE; + } simd = TREE_CODE (for_stmt) == OMP_SIMD || TREE_CODE (for_stmt) == CILK_SIMD; @@ -6677,7 +6705,12 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p) } else var = decl; - + + /* Original initial, final and increment values are necessary to compute + the loop-count. Otherwise, they are stored in variables and their + context could be changed, potentially making it impossible to compute + them correctly. 
*/ + orig_init = TREE_OPERAND (t, 1); tret = gimplify_expr (&TREE_OPERAND (t, 1), &for_pre_body, NULL, is_gimple_val, fb_rvalue); ret = MIN (ret, tret); @@ -6689,6 +6722,7 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p) gcc_assert (COMPARISON_CLASS_P (t)); gcc_assert (TREE_OPERAND (t, 0) == decl); + orig_cond = TREE_OPERAND (t, 1); tret = gimplify_expr (&TREE_OPERAND (t, 1), &for_pre_body, NULL, is_gimple_val, fb_rvalue); ret = MIN (ret, tret); @@ -6713,6 +6747,7 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p) t = build2 (PLUS_EXPR, TREE_TYPE (decl), var, t); t = build2 (MODIFY_EXPR, TREE_TYPE (var), var, t); TREE_VEC_ELT (OMP_FOR_INCR (for_stmt), i) = t; + orig_incr = build_one_cst (TREE_TYPE (t)); break; } @@ -6726,6 +6761,7 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p) t = build2 (PLUS_EXPR, TREE_TYPE (decl), var, t); t = build2 (MODIFY_EXPR, TREE_TYPE (var), var, t); TREE_VEC_ELT (OMP_FOR_INCR (for_stmt), i) = t; + orig_incr = build_one_cst (TREE_TYPE (t)); break; case MODIFY_EXPR: @@ -6753,8 +6789,16 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p) gcc_unreachable (); } - tret = gimplify_expr (&TREE_OPERAND (t, 1), &for_pre_body, NULL, - is_gimple_val, fb_rvalue); + orig_incr = TREE_OPERAND (t, 1); + /* Right here we are just trying to extract the absolute + value of the increment. 
*/ + if (TREE_CODE (t) == MINUS_EXPR + || TREE_CODE (TREE_OPERAND (t, 1)) == NEGATE_EXPR + || (TREE_CODE (TREE_OPERAND (t, 1)) == INTEGER_CST + && tree_int_cst_sgn (TREE_OPERAND (t, 1)) < 1)) + orig_incr = fold_build1 (NEGATE_EXPR, TREE_TYPE (t), orig_incr); + tret = gimplify_expr (&TREE_OPERAND (t, 1), pre_p, + NULL, is_gimple_val, fb_rvalue); ret = MIN (ret, tret); if (c) { @@ -6825,6 +6869,7 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p) case OMP_FOR: kind = GF_OMP_FOR_KIND_FOR; break; case OMP_SIMD: kind = GF_OMP_FOR_KIND_SIMD; break; case CILK_SIMD: kind = GF_OMP_FOR_KIND_CILKSIMD; break; + case CILK_FOR: kind = GF_OMP_FOR_KIND_CILKFOR; break; case OMP_DISTRIBUTE: kind = GF_OMP_FOR_KIND_DISTRIBUTE; break; default: gcc_unreachable (); @@ -6859,6 +6904,10 @@ gimplify_omp_for (tree *expr_p, gimple_seq *pre_p) gimple_omp_for_set_incr (gfor, i, TREE_OPERAND (t, 1)); } + if (kind == GF_OMP_FOR_KIND_CILKFOR) + cilk_for_compute_set_count_grain (gfor, orig_init, orig_cond, orig_incr, + grain); + gimplify_seq_add_stmt (pre_p, gfor); if (ret != GS_ALL_DONE) return GS_ERROR; @@ -7880,6 +7929,7 @@ gimplify_expr (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p, case OMP_FOR: case OMP_SIMD: case CILK_SIMD: + case CILK_FOR: case OMP_DISTRIBUTE: ret = gimplify_omp_for (expr_p, pre_p); break; diff --git a/gcc/omp-low.c b/gcc/omp-low.c old mode 100644 new mode 100755 index 97092dd..be75fde --- a/gcc/omp-low.c +++ b/gcc/omp-low.c @@ -71,6 +71,7 @@ along with GCC; see the file COPYING3. If not see #include "ipa-prop.h" #include "tree-nested.h" #include "tree-eh.h" +#include "cilk.h" /* Lowering of OpenMP parallel and workshare constructs proceeds in two @@ -198,6 +199,13 @@ struct omp_for_data struct omp_for_data_loop *loops; }; +/* A structure with necessary elements from _Cilk_for statement. This + struct. node is passed in to WALK_STMT_INFO->INFO. 
*/ +typedef struct cilk_for_information { + bool found; + tree induction_var; + tree count; +} cilk_for_info; static splay_tree all_contexts; static int taskreg_nesting_level; @@ -314,6 +322,8 @@ extract_omp_for_data (gimple for_stmt, struct omp_for_data *fd, fd->have_ordered = false; fd->sched_kind = OMP_CLAUSE_SCHEDULE_STATIC; fd->chunk_size = NULL_TREE; + if (gimple_omp_for_kind (fd->for_stmt) == GF_OMP_FOR_KIND_CILKFOR) + fd->sched_kind = OMP_CLAUSE_SCHEDULE_CILKFOR; collapse_iter = NULL; collapse_count = NULL; @@ -1820,29 +1830,125 @@ scan_sharing_clauses (tree clauses, omp_context *ctx) scan_omp (&OMP_CLAUSE_LASTPRIVATE_GIMPLE_SEQ (c), ctx); } -/* Create a new name for omp child function. Returns an identifier. */ +/* Create a new name for omp child function. Returns an identifier. If + IS_CILK_FOR is true then the suffix for the child function is + "_cilk_for_fn." */ static GTY(()) unsigned int tmp_ompfn_id_num; static tree -create_omp_child_function_name (bool task_copy) +create_omp_child_function_name (bool task_copy, bool is_cilk_for) { + if (is_cilk_for) + return clone_function_name (current_function_decl, "_cilk_for_fn"); return (clone_function_name (current_function_decl, task_copy ? "_omp_cpyfn" : "_omp_fn")); } +/* Helper function for walk_gimple_seq function. *GSI_P is the gimple stmt. + iterator passed by walk_gimple_seq and *WI->INFO holds the CILK_FOR_INFO + structure. This function sets the values inside this structure if it + finds a _Cilk_for statement in *GSI_P. HANDLED_OPS_P is unused. */ + +static tree +find_cilk_for_stmt (gimple_stmt_iterator *gsi_p, + bool *handled_ops_p ATTRIBUTE_UNUSED, + struct walk_stmt_info *wi) +{ + cilk_for_info *cf_info = (cilk_for_info *) wi->info; + gimple stmt = gsi_stmt (*gsi_p); + + if (gimple_code (stmt) == GIMPLE_OMP_FOR + && (gimple_omp_for_kind (stmt) == GF_OMP_FOR_KIND_CILKFOR) + /* For nested _Cilk_for statments, just look into the + outer-most one. 
*/ + && cf_info->found == false) + { + cf_info->found = true; + cf_info->induction_var = gimple_cilk_for_induction_var (stmt); + cf_info->count = gimple_cilk_for_loop_count (stmt); + } + return NULL_TREE; +} + +/* Returns true if STMT contains a CILK_FOR statement. If found then + populate *IND_VAR and *LOOP_COUNT with induction variable + and loop-count value. Otherwise these values remain untouched. + IND_VAR and LOOP_COUNT can be NULL and if so then they are also + left untouched. */ + +static bool +is_cilk_for_stmt (gimple stmt, tree *ind_var, tree *loop_count) +{ + if (!flag_enable_cilkplus) + return false; + if (gimple_code (stmt) == GIMPLE_OMP_PARALLEL) + stmt = gimple_omp_body (stmt); + if (gimple_code (stmt) == GIMPLE_BIND) + { + gimple_seq body = gimple_bind_body (stmt); + struct walk_stmt_info wi; + cilk_for_info cf_info; + memset (&cf_info, 0, sizeof (cilk_for_info)); + memset (&wi, 0, sizeof (wi)); + wi.info = &cf_info; + walk_gimple_seq (body, find_cilk_for_stmt, NULL, &wi); + if (cf_info.found) + { + if (ind_var) + *ind_var = cf_info.induction_var; + if (loop_count) + *loop_count = cf_info.count; + return true; + } + } + return false; +} + +/* Returns the type of the induction variable for the child function for + _Cilk_for and the types for _high and _low variables based on TYPE. */ + +static tree +cilk_for_check_loop_diff_type (tree type) +{ + if (type == integer_type_node) + return type; + else if (TYPE_PRECISION (type) <= TYPE_PRECISION (uint32_type_node)) + { + if (TYPE_UNSIGNED (type)) + return uint32_type_node; + else + return integer_type_node; + } + else + { + if (TYPE_UNSIGNED (type)) + return uint64_type_node; + else + return long_long_integer_type_node; + } + gcc_unreachable (); +} + /* Build a decl for the omp child function. It'll not contain a body yet, just the bare decl. 
*/ static void create_omp_child_function (omp_context *ctx, bool task_copy) { - tree decl, type, name, t; + tree decl, type, name, t, ind_var = NULL_TREE, loop_count = NULL_TREE; - name = create_omp_child_function_name (task_copy); + bool is_cilk_for = is_cilk_for_stmt (ctx->stmt, &ind_var, &loop_count); + tree cilk_var_type = (is_cilk_for ? + cilk_for_check_loop_diff_type (TREE_TYPE (ind_var)) : NULL_TREE); + + name = create_omp_child_function_name (task_copy, is_cilk_for); if (task_copy) type = build_function_type_list (void_type_node, ptr_type_node, ptr_type_node, NULL_TREE); + else if (is_cilk_for) + type = build_function_type_list (void_type_node, ptr_type_node, + cilk_var_type, cilk_var_type, NULL_TREE); else type = build_function_type_list (void_type_node, ptr_type_node, NULL_TREE); @@ -1892,13 +1998,44 @@ create_omp_child_function (omp_context *ctx, bool task_copy) DECL_CONTEXT (t) = decl; DECL_RESULT (decl) = t; - t = build_decl (DECL_SOURCE_LOCATION (decl), - PARM_DECL, get_identifier (".omp_data_i"), ptr_type_node); + /* _Cilk_for's child function requires two extra parameters called + __low and __high that are set the by Cilk runtime when it calls this + function. 
*/ + if (is_cilk_for) + { + t = build_decl (DECL_SOURCE_LOCATION (decl), + PARM_DECL, get_identifier ("__high"), cilk_var_type); + DECL_ARTIFICIAL (t) = 1; + DECL_NAMELESS (t) = 1; + DECL_ARG_TYPE (t) = ptr_type_node; + DECL_CONTEXT (t) = current_function_decl; + TREE_USED (t) = 1; + TREE_ADDRESSABLE (t) = 1; + DECL_CHAIN (t) = DECL_ARGUMENTS (decl); + DECL_ARGUMENTS (decl) = t; + + t = build_decl (DECL_SOURCE_LOCATION (decl), + PARM_DECL, get_identifier ("__low"), cilk_var_type); + DECL_ARTIFICIAL (t) = 1; + DECL_NAMELESS (t) = 1; + DECL_ARG_TYPE (t) = ptr_type_node; + DECL_CONTEXT (t) = current_function_decl; + TREE_USED (t) = 1; + TREE_ADDRESSABLE (t) = 1; + DECL_CHAIN (t) = DECL_ARGUMENTS (decl); + DECL_ARGUMENTS (decl) = t; + } + + tree data_name = get_identifier (".omp_data_i"); + t = build_decl (DECL_SOURCE_LOCATION (decl), PARM_DECL, data_name, + ptr_type_node); DECL_ARTIFICIAL (t) = 1; DECL_NAMELESS (t) = 1; DECL_ARG_TYPE (t) = ptr_type_node; DECL_CONTEXT (t) = current_function_decl; TREE_USED (t) = 1; + if (is_cilk_for) + DECL_CHAIN (t) = DECL_ARGUMENTS (decl); DECL_ARGUMENTS (decl) = t; if (!task_copy) ctx->receiver_decl = t; @@ -4317,6 +4454,41 @@ expand_parallel_call (struct omp_region *region, basic_block bb, false, GSI_CONTINUE_LINKING); } +/* Builds a function call using the values from WS_ARGS and data arguments + of ENTRY_STMT. The function call is inserted into BB. */ + +static void +expand_cilk_for_call (basic_block bb, gimple entry_stmt, + vec *ws_args) +{ + tree t, t1, t2; + gimple_stmt_iterator gsi; + vec *args; + + /* The builtin function's name, the loop-count and the grain value are + stored in WS_ARGS. 
*/ + tree func_name = (*ws_args)[0]; + tree count = (*ws_args)[1]; + tree grain = (*ws_args)[2]; + + gsi = gsi_last_bb (bb); + t = gimple_omp_parallel_data_arg (entry_stmt); + if (t == NULL) + t1 = null_pointer_node; + else + t1 = build_fold_addr_expr (t); + t2 = build_fold_addr_expr (gimple_omp_parallel_child_fn (entry_stmt)); + + vec_alloc (args, 4); + args->quick_push (t2); + args->quick_push (t1); + args->quick_push (count); + args->quick_push (grain); + t = build_call_expr_loc_vec (UNKNOWN_LOCATION, func_name, args); + + force_gimple_operand_gsi (&gsi, t, true, NULL_TREE, false, + GSI_CONTINUE_LINKING); +} /* Build the function call to GOMP_task to actually generate the task operation. BB is the block where to insert the code. */ @@ -4652,8 +4824,39 @@ expand_omp_taskreg (struct omp_region *region) entry_bb = region->entry; exit_bb = region->exit; + /* The way _Cilk_for is constructed in this compiler can be thought of + as a parallel omp_for. But the inner workings between them are very + different so we need a way to differentiate between them. Thus, we + added a new schedule type called OMP_CLAUSE_SCHEDULE_CILKFOR, which + pretty much says that this is not a parallel omp for but a _Cilk_for + statement. */ + bool is_cilk_for = + (flag_enable_cilkplus && region->inner && + (region->inner->sched_kind == OMP_CLAUSE_SCHEDULE_CILKFOR)); + + /* Extract the __high and __low parameters from the function. 
*/ + tree high_arg = NULL_TREE, low_arg = NULL_TREE; + if (is_cilk_for) + { + for (tree ii_arg = DECL_ARGUMENTS (child_fn); ii_arg != NULL_TREE; + ii_arg = TREE_CHAIN (ii_arg)) + { + if (!strcmp (IDENTIFIER_POINTER (DECL_NAME (ii_arg)), "__high")) + high_arg = ii_arg; + if (!strcmp (IDENTIFIER_POINTER (DECL_NAME (ii_arg)), "__low")) + low_arg = ii_arg; + } + gcc_assert (high_arg); + gcc_assert (low_arg); + } + if (is_combined_parallel (region)) ws_args = region->ws_args; + else if (is_cilk_for) + /* If it is a _Cilk_for statement, it is modelled *like* a parallel for, + so the inner statement should have the information that is required + by expand_cilk_for_call. */ + ws_args = region->inner->ws_args; else ws_args = NULL; @@ -4759,6 +4962,49 @@ expand_omp_taskreg (struct omp_region *region) } } + /* In here the calls to the GET_NUM_THREADS and GET_THREAD_NUM are + removed. Further, they will be replaced by __low and __high + parameter values. */ + gimple high_assign = NULL, low_assign = NULL; + if (is_cilk_for) + { + gimple_stmt_iterator gsi2 = gsi_start_bb (single_succ (entry_bb)); + while (!gsi_end_p (gsi2)) + { + gimple stmt = gsi_stmt (gsi2); + + if (gimple_call_builtin_p (stmt, BUILT_IN_OMP_GET_NUM_THREADS)) + { + /* There can only be one call to each of these two functions. + If there are multiple, then something went wrong + somewhere. 
*/ + gcc_assert (low_assign == NULL); + tree ltype = TREE_TYPE (gimple_get_lhs (stmt)); + tree tmp2 = create_tmp_reg (TREE_TYPE (low_arg), NULL); + low_assign = gimple_build_assign + (gimple_get_lhs (stmt), fold_convert (ltype, tmp2)); + gsi_remove (&gsi2, true); + gimple tmp_stmt = gimple_build_assign (tmp2, low_arg); + gsi_insert_before (&gsi2, low_assign, GSI_NEW_STMT); + gsi_insert_before (&gsi2, tmp_stmt, GSI_NEW_STMT); + } + else if (gimple_call_builtin_p (stmt, + BUILT_IN_OMP_GET_THREAD_NUM)) + { + gcc_assert (high_assign == NULL); + tree htype = TREE_TYPE (gimple_get_lhs (stmt)); + tree tmp2 = create_tmp_reg (TREE_TYPE (high_arg), NULL); + + high_assign = gimple_build_assign + (gimple_get_lhs (stmt), fold_convert (htype, tmp2)); + gsi_remove (&gsi2, true); + gimple tmp_stmt = gimple_build_assign (tmp2, high_arg); + gsi_insert_before (&gsi2, high_assign, GSI_NEW_STMT); + gsi_insert_before (&gsi2, tmp_stmt, GSI_NEW_STMT); + } + gsi_next (&gsi2); + } + } /* Declare local variables needed in CHILD_CFUN. */ block = DECL_INITIAL (child_fn); BLOCK_VARS (block) = vec2chain (child_cfun->local_decls); @@ -4821,6 +5067,13 @@ expand_omp_taskreg (struct omp_region *region) if (loops_state_satisfies_p (LOOPS_NEED_FIXUP)) child_cfun->x_current_loops->state |= LOOPS_NEED_FIXUP; + /* We expand it before it is customarily done for other flavors because + the call to the function __cilkrts_cilk_for_64/32 (inserted by the + function below) may use some variables and thus the function call + must be inserted before the unwanted variables are eliminated. */ + if (is_cilk_for) + expand_cilk_for_call (new_bb, entry_stmt, ws_args); + /* Remove non-local VAR_DECLs from child_cfun->local_decls list. 
*/ num = vec_safe_length (child_cfun->local_decls); for (srcidx = 0, dstidx = 0; srcidx < num; srcidx++) @@ -4834,7 +5087,7 @@ expand_omp_taskreg (struct omp_region *region) } if (dstidx != num) vec_safe_truncate (child_cfun->local_decls, dstidx); - + /* Inform the callgraph about the new function. */ DECL_STRUCT_FUNCTION (child_fn)->curr_properties = cfun->curr_properties; cgraph_add_new_function (child_fn, true); @@ -4865,11 +5118,14 @@ expand_omp_taskreg (struct omp_region *region) pop_cfun (); } - /* Emit a library call to launch the children threads. */ - if (gimple_code (entry_stmt) == GIMPLE_OMP_PARALLEL) - expand_parallel_call (region, new_bb, entry_stmt, ws_args); - else - expand_task_call (new_bb, entry_stmt); + if (!is_cilk_for) + { + /* Emit a library call to launch the children threads. */ + if (gimple_code (entry_stmt) == GIMPLE_OMP_PARALLEL) + expand_parallel_call (region, new_bb, entry_stmt, ws_args); + else + expand_task_call (new_bb, entry_stmt); + } if (gimple_in_ssa_p (cfun)) update_ssa (TODO_update_ssa_only_virtuals); } @@ -6544,6 +6800,225 @@ expand_omp_for_static_chunk (struct omp_region *region, } } +/* A subroutine of expand_omp_for. Generate code for _Cilk_for loop. + Given parameters: + for (V = N1; V cond N2; V += STEP) BODY; + + where COND is "<" or ">", we generate pseudocode + + for (ind_var = low; ind_var < high; ind_var++) + { + if (n1 < n2) + V = n1 + (ind_var * STEP) + else + V = n2 - (ind_var * STEP); + + <BODY> + } + + In the above pseudocode, low and high are function parameters of the + child function. In the function below, we are inserting a temp. + variable that will be making a call to two OMP functions that will not be + found in the body of _Cilk_for (since OMP_FOR cannot be mixed + with _Cilk_for). These functions are replaced with low and high + by the function that handles taskreg. 
*/ + + +static void +expand_cilk_for (struct omp_region *region, struct omp_for_data *fd) +{ + bool broken_loop = region->cont == NULL; + tree type = cilk_for_check_loop_diff_type (TREE_TYPE (fd->loop.v)); + basic_block entry_bb = region->entry; + basic_block cont_bb = region->cont; + + gcc_assert (EDGE_COUNT (entry_bb->succs) == 2); + gcc_assert (broken_loop + || BRANCH_EDGE (entry_bb)->dest == FALLTHRU_EDGE (cont_bb)->dest); + basic_block l0_bb = FALLTHRU_EDGE (entry_bb)->dest; + basic_block l1_bb, l2_bb; + + tree grain = gimple_cilk_for_grain (fd->for_stmt); + if (!broken_loop) + { + gcc_assert (BRANCH_EDGE (cont_bb)->dest == l0_bb); + gcc_assert (EDGE_COUNT (cont_bb->succs) == 2); + l1_bb = split_block (cont_bb, last_stmt (cont_bb))->dest; + l2_bb = BRANCH_EDGE (entry_bb)->dest; + } + else + { + BRANCH_EDGE (entry_bb)->flags &= ~EDGE_ABNORMAL; + l1_bb = split_edge (BRANCH_EDGE (entry_bb)); + l2_bb = single_succ (l1_bb); + } + basic_block exit_bb = region->exit; + basic_block l2_dom_bb = NULL; + + gimple_stmt_iterator gsi = gsi_last_bb (entry_bb); + + /* Below statements until the "tree high_val = ..." are pseudo statements + used to pass information to be used by expand_omp_taskreg. + low_val and high_val will be replaced by the __low and __high + parameter from the child function. + + The call_exprs part is a place-holder, it is mainly used + to distinctly identify to the top-level part that this is + where we should put low and high (reasoning given in header + comment). 
*/ + + tree t = build_call_expr + (builtin_decl_explicit (BUILT_IN_OMP_GET_NUM_THREADS), 0); + t = fold_convert (type, t); + tree low_val = force_gimple_operand_gsi (&gsi, t, true, NULL_TREE, true, + GSI_SAME_STMT); + t = build_call_expr (builtin_decl_explicit (BUILT_IN_OMP_GET_THREAD_NUM), + 0); + t = fold_convert (TREE_TYPE (fd->loop.v), t); + tree high_val = force_gimple_operand_gsi (&gsi, t, true, NULL_TREE, true, + GSI_SAME_STMT); + + tree ind_var = create_tmp_reg (type, "__cilk_ind_var"); + gcc_assert (gimple_code (gsi_stmt (gsi)) == GIMPLE_OMP_FOR); + + /* Not needed in SSA form right now. */ + gcc_assert (!gimple_in_ssa_p (cfun)); + if (l2_dom_bb == NULL) + l2_dom_bb = l1_bb; + + tree n1 = low_val; + tree n2 = high_val; + + expand_omp_build_assign (&gsi, ind_var, n1); + + /* Remove the GIMPLE_OMP_FOR statement. */ + gsi_remove (&gsi, true); + + gimple stmt; + if (!broken_loop) + { + /* Code to control the increment goes in the CONT_BB. */ + gsi = gsi_last_bb (cont_bb); + stmt = gsi_stmt (gsi); + gcc_assert (gimple_code (stmt) == GIMPLE_OMP_CONTINUE); + enum tree_code code = PLUS_EXPR; + if (POINTER_TYPE_P (type)) + t = fold_build_pointer_plus (ind_var, build_one_cst (type)); + else + t = fold_build2 (code, type, ind_var, build_one_cst (type)); + expand_omp_build_assign (&gsi, ind_var, t); + + /* Remove GIMPLE_OMP_CONTINUE. */ + gsi_remove (&gsi, true); + } + + /* Emit the condition in L1_BB. 
*/ + gsi = gsi_start_bb (l1_bb); + + tree step = fold_convert (type, fd->loop.step); + if ((TREE_CODE (step) == INTEGER_CST && tree_int_cst_sgn (step) < 1)) + step = fold_build1_loc (UNKNOWN_LOCATION, NEGATE_EXPR, type, step); + + t = build2 (MULT_EXPR, type, ind_var, step); + tree tmp = create_tmp_reg (type, NULL); + gsi_insert_after (&gsi, gimple_build_assign (tmp, t), GSI_NEW_STMT); + + tree tmp2 = create_tmp_reg (type, NULL); + tree cvtd = fold_convert (type, fd->loop.n1); + gsi_insert_after (&gsi, gimple_build_assign (tmp2, cvtd), GSI_NEW_STMT); + + if (fd->loop.cond_code == GE_EXPR || fd->loop.cond_code == GT_EXPR) + t = fold_build2 (MINUS_EXPR, type, tmp2, tmp); + else + t = fold_build2 (PLUS_EXPR, type, tmp2, tmp); + + tmp = create_tmp_reg (type, NULL); + gsi_insert_after (&gsi, gimple_build_assign (tmp, t), GSI_NEW_STMT); + + cvtd = fold_convert (TREE_TYPE (fd->loop.v), tmp); + gsi_insert_after (&gsi, gimple_build_assign (fd->loop.v, cvtd), + GSI_NEW_STMT); + + t = fold_convert (type, n2); + t = force_gimple_operand_gsi (&gsi, t, true, NULL_TREE, + false, GSI_CONTINUE_LINKING); + /* The condition is always '<' since the runtime will fill in the low + and high values. */ + t = build2 (LT_EXPR, boolean_type_node, ind_var, t); + stmt = gimple_build_cond_empty (t); + gsi_insert_after (&gsi, stmt, GSI_CONTINUE_LINKING); + if (walk_tree (gimple_cond_lhs_ptr (stmt), expand_omp_regimplify_p, + NULL, NULL) + || walk_tree (gimple_cond_rhs_ptr (stmt), expand_omp_regimplify_p, + NULL, NULL)) + { + gsi = gsi_for_stmt (stmt); + gimple_regimplify_operands (stmt, &gsi); + } + + /* Remove GIMPLE_OMP_RETURN. */ + gsi = gsi_last_bb (exit_bb); + gsi_remove (&gsi, true); + + /* Connect the new blocks. 
*/ + remove_edge (FALLTHRU_EDGE (entry_bb)); + + edge e, ne; + if (!broken_loop) + { + remove_edge (BRANCH_EDGE (entry_bb)); + make_edge (entry_bb, l1_bb, EDGE_FALLTHRU); + + e = BRANCH_EDGE (l1_bb); + ne = FALLTHRU_EDGE (l1_bb); + e->flags = EDGE_TRUE_VALUE; + } + else + { + single_succ_edge (entry_bb)->flags = EDGE_FALLTHRU; + + ne = single_succ_edge (l1_bb); + e = make_edge (l1_bb, l0_bb, EDGE_TRUE_VALUE); + + } + ne->flags = EDGE_FALSE_VALUE; + e->probability = REG_BR_PROB_BASE * 7 / 8; + ne->probability = REG_BR_PROB_BASE / 8; + + set_immediate_dominator (CDI_DOMINATORS, l1_bb, entry_bb); + set_immediate_dominator (CDI_DOMINATORS, l2_bb, l2_dom_bb); + set_immediate_dominator (CDI_DOMINATORS, l0_bb, l1_bb); + + if (!broken_loop) + { + struct loop *loop = alloc_loop (); + loop->header = l1_bb; + loop->latch = cont_bb; + add_loop (loop, l1_bb->loop_father); + loop->safelen = INT_MAX; + } + + /* Pick the correct library function based on the precision of the + induction variable type. */ + tree lib_fun = NULL_TREE; + if (TYPE_PRECISION (type) == 32) + lib_fun = cilk_for_32_fndecl; + else if (TYPE_PRECISION (type) == 64) + lib_fun = cilk_for_64_fndecl; + else + gcc_unreachable (); + + tree count = gimple_cilk_for_loop_count (fd->for_stmt); + gcc_assert (count != NULL_TREE); + + /* ws_args contains three pieces of information: the library function + flavor to call (__cilkrts_cilk_for_32/__cilkrts_cilk_for_64), the + loop count and the grain value. */ + vec_alloc (region->ws_args, 3); + region->ws_args->quick_push (lib_fun); + region->ws_args->quick_push (count); + region->ws_args->quick_push (grain); +} /* A subroutine of expand_omp_for. Generate code for a simd non-worksharing loop. 
Given parameters: @@ -6884,6 +7359,8 @@ expand_omp_for (struct omp_region *region, gimple inner_stmt) if (gimple_omp_for_kind (fd.for_stmt) & GF_OMP_FOR_KIND_SIMD) expand_omp_simd (region, &fd); + else if (gimple_omp_for_kind (fd.for_stmt) & GF_OMP_FOR_KIND_CILKFOR) + expand_cilk_for (region, &fd); else if (fd.sched_kind == OMP_CLAUSE_SCHEDULE_STATIC && !fd.have_ordered) { diff --git a/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk-fors.c b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk-fors.c new file mode 100644 index 0000000..3f68022 --- /dev/null +++ b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk-fors.c @@ -0,0 +1,100 @@ +/* { dg-do run { target { i?86-*-* x86_64-*-* } } } */ +/* { dg-options "-fcilkplus" } */ +/* { dg-additional-options "-std=gnu99" { target c } } */ +/* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */ + +#if HAVE_IO +#include <stdio.h> +#endif + +static void check (int *Array, int start, int end, int incr, int value) +{ + int ii = 0; + for (ii = start; ii < end; ii = ii + incr) + if (Array[ii] != value) + __builtin_abort (); +#if HAVE_IO + printf ("Passed\n"); +#endif +} + +static void check_reverse (int *Array, int start, int end, int incr, int value) +{ + int ii = 0; + for (ii = start; ii >= end; ii = ii - incr) + if (Array[ii] != value) + __builtin_abort (); +#if HAVE_IO + printf ("Passed\n"); +#endif +} + + +int main (void) +{ + int Array[10]; + int x = 9, y = 0, z = 3; + + + _Cilk_for (int ii = 0; ii < 10; ii++) + Array[ii] = 1133; + check (Array, 0, 10, 1, 1133); + + _Cilk_for (int ii = 0; ii < 10; ++ii) + Array[ii] = 3311; + check (Array, 0, 10, 1, 3311); + + _Cilk_for (int ii = 9; ii > -1; ii--) + Array[ii] = 4433; + check_reverse (Array, 9, 0, 1, 4433); + + _Cilk_for (int ii = 9; ii > -1; --ii) + Array[ii] = 9988; + check_reverse (Array, 9, 0, 1, 9988); + + _Cilk_for (int ii = 0; ii < 10; ++ii) + Array[ii] = 3311; + check (Array, 0, 10, 1, 3311); + + _Cilk_for (int ii = 0; ii < 10; ii += 2) + Array[ii] = 1328; + check 
(Array, 0, 10, 2, 1328); + + _Cilk_for (int ii = 9; ii >= 0; ii -= 2) + Array[ii] = 1738; + check_reverse (Array, 9, 0, 2, 1738); + + + _Cilk_for (int ii = 0; ii < 10; ii++) + { + if (ii % 2) + Array[ii] = 1343; + else + Array[ii] = 3413; + } + + check (Array, 1, 10, 2, 1343); + check (Array, 0, 10, 2, 3413); + + _Cilk_for (short cc = 0; cc < 10; cc++) + Array[cc] = 1343; + check (Array, 0, 10, 1,1343); + + _Cilk_for (short cc = 9; cc >= 0; cc--) + Array[cc] = 1348; + check_reverse (Array, 9, 0, 1, 1348); + + + + /* Loop with polynomials in the start, final and incr. */ + _Cilk_for (int ii = z - 3; ii <= z * 3; ii += z-1) + { + Array[ii] = 3233; + } + + for (int ii = z-3; ii <= z*3; ii += (z-1)) + if (Array[ii] != 3233) + __builtin_abort (); + + return 0; +} diff --git a/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_errors.c b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_errors.c new file mode 100644 index 0000000..0ebc09a --- /dev/null +++ b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_errors.c @@ -0,0 +1,49 @@ +/* { dg-do compile } */ +/* { dg-options "-fcilkplus" } */ +/* { dg-additional-options "-std=c99" { target c } } */ + + +int main (void) +{ + int q = 0, ii = 0, jj = 0; + + _Cilk_for (int ii; ii < 10; ii++) /* { dg-error "is not initialized" } */ + q = 5; + + _Cilk_for (; ii < 10; ii++) /* { dg-error "expected iteration declaration" } */ + q = 2; + + _Cilk_for (int ii = 0; ; ii++) /* { dg-error "missing controlling predicate" } */ + q = 2; + + _Cilk_for (int ii = 0; ii < 10, jj < 10; ii++) /* { dg-error "expected ';' before ',' token" } */ + q = 5; + + _Cilk_for (int ii = 0; ii < 10; ) /* { dg-error "missing increment" } */ + q = 5; + + _Cilk_for (int ii = 0, jj = 0; ii < 10; ii++) /* { dg-error "expected iteration declaration" } */ + q = 5; + + _Cilk_for (volatile int vii = 0; vii < 10; vii++) /* { dg-error "iteration variable cannot be volatile" } */ + q = 5; + + _Cilk_for (static int sii = 0; sii < 10; sii++) /* { dg-error "static variable" } 
*/ + q = 5; + + _Cilk_for (float fii = 3.47; fii < 5.23; fii++) /* { dg-error "invalid type for iteration variable" } */ + q = 5; + + _Cilk_for (int ii = 0; 10 > jj; ii++) /* { dg-error "invalid controlling predicate" } */ + q = 5; + + _Cilk_for (int ii = 0; ii < 10; ii >> 1) /* { dg-error "invalid increment expression" } */ + q = 5; + + _Cilk_for (int ii = 10; ii >= 0; ii--) /* This is OK! */ + q = 5; + + _Cilk_for (int ii; ii < 10; ii++) /* { dg-error "is not initialized" } */ + q = 5; + return 0; +} diff --git a/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_grain.c b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_grain.c new file mode 100644 index 0000000..6cb9b03 --- /dev/null +++ b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_grain.c @@ -0,0 +1,35 @@ +/* { dg-do run { target { i?86-*-* x86_64-*-* } } } */ +/* { dg-options "-fcilkplus" } */ +/* { dg-additional-options "-std=gnu99" { target c } } */ +/* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */ + + +int grain_value = 2; +int main (void) +{ + int Array1[200], Array1_Serial[200]; + + for (int ii = 0; ii < 200; ii++) + { + Array1_Serial[ii] = 2; + Array1[ii] = 1; + } + +#pragma cilk grainsize = 2 + _Cilk_for (int ii = 0; ii < 200; ii++) + Array1[ii] = 2; + + for (int ii = 0; ii < 200; ii++) + if (Array1[ii] != Array1_Serial[ii]) + return (ii+1); + +#pragma cilk grainsize = grain_value + _Cilk_for (int ii = 0; ii < 200; ii++) + Array1[ii] = 2; + + for (int ii = 0; ii < 200; ii++) + if (Array1[ii] != Array1_Serial[ii]) + return (ii+1); + + return 0; +} diff --git a/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_grain_errors.c b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_grain_errors.c new file mode 100644 index 0000000..ff8bc0a --- /dev/null +++ b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_grain_errors.c @@ -0,0 +1,48 @@ +/* { dg-do compile } */ +/* { dg-options "-fcilkplus -Wunknown-pragmas" } */ +/* { dg-additional-options "-std=c99" { target c } } */ + + 
+char Array1[26]; + +#pragma cilk grainsize = 2 /* { dg-error "must be inside a function" } */ + +int main(int argc, char **argv) +{ +/* This is OK. */ +#pragma cilk grainsize = 2 + _Cilk_for (int ii = 0; ii < 10; ii++) + Array1[ii] = 0; + +#pragma cilk grainsize 2 /* { dg-error "expected '=' before numeric constant" } */ + _Cilk_for (int ii = 0; ii < 10; ii++) + Array1[ii] = 0; + +#pragma cilk grainsiz = 2 /* { dg-warning "ignoring #pragma cilk grainsiz" } */ + _Cilk_for (int ii = 0; ii < 10; ii++) + Array1[ii] = 0; + + +/* This is OK, it will do a type conversion to long int. */ +#pragma cilk grainsize = 0.5 + _Cilk_for (int ii = 0; ii < 10; ii++) + Array1[ii] = 0; + +#pragma cilk grainsize = 1 + while (Array1[5] != 0) /* { dg-warning "grainsize pragma is not followed" } */ + { + /* Blah */ + } + +#pragma cilk grainsize = 1 + int q = 0; /* { dg-warning "grainsize pragma is not followed" } */ + _Cilk_for (q = 0; q < 10; q++) + Array1[q] = 5; + + while (Array1[5] != 0) + { + /* Blah */ + } + + return 0; +} diff --git a/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_ptr_iter.c b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_ptr_iter.c new file mode 100644 index 0000000..7a779f7 --- /dev/null +++ b/gcc/testsuite/c-c++-common/cilk-plus/CK/cilk_for_ptr_iter.c @@ -0,0 +1,41 @@ +/* { dg-do run { target { i?86-*-* x86_64-*-* } } } */ +/* { dg-options "-fcilkplus" } */ +/* { dg-additional-options "-std=gnu99" { target c } } */ +/* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */ + + + +/* loop control variable must have integer, pointer or class type + +*/ + +#define ARRAY_SIZE 10000 +int a[ARRAY_SIZE]; + +int main(void) +{ + int ii = 0; + +#if 1 + for (ii =0; ii < ARRAY_SIZE; ii++) + a[ii] = 5; +#endif + _Cilk_for(int *aa = a; aa < a + ARRAY_SIZE; aa++) + *aa = 0; +#if 1 + for (ii = 0; ii < ARRAY_SIZE; ii++) + if (a[ii] != 0) + __builtin_abort (); +#endif + + _Cilk_for (int *aa = a; aa < a + ARRAY_SIZE; aa = aa + 2) + *aa = 4; + +#if 1 + 
for (ii = 0; ii < ARRAY_SIZE; ii = ii + 2) + if (a[ii] != 4) + __builtin_abort (); +#endif + + return 0; +} diff --git a/gcc/testsuite/c-c++-common/cilk-plus/CK/nested_cilk_for.c b/gcc/testsuite/c-c++-common/cilk-plus/CK/nested_cilk_for.c new file mode 100644 index 0000000..cffe17e --- /dev/null +++ b/gcc/testsuite/c-c++-common/cilk-plus/CK/nested_cilk_for.c @@ -0,0 +1,79 @@ +/* { dg-do run { target { i?86-*-* x86_64-*-* } } } */ +/* { dg-options "-fcilkplus" } */ +/* { dg-additional-options "-std=gnu99" { target c } } */ +/* { dg-additional-options "-lcilkrts" { target { i?86-*-* x86_64-*-* } } } */ + +#if HAVE_IO +#include <stdio.h> +#endif + +int main (void) +{ + int Array[10][10]; + + + for (int ii = 0; ii < 10; ii++) + for (int jj = 0; jj < 10; jj++) + { + Array[ii][jj] = 0; + } + + _Cilk_for (int ii = 0; ii < 10; ii++) + _Cilk_for (int jj = 0; jj < 5; jj++) + Array[ii][jj] = 5; + + for (int ii = 0; ii < 10; ii++) + for (int jj = 0; jj < 5; jj++) + if (Array[ii][jj] != 5) +#if HAVE_IO + printf("Array[%d][%d] = %d\n", ii, jj, Array[ii][jj]); +#else + __builtin_abort (); +#endif + + + /* One goes up and one goes down. */ + _Cilk_for (int ii = 0; ii < 10; ii++) + _Cilk_for (int jj = 9; jj >= 0; jj--) + Array[ii][jj] = 7; + + for (int ii = 0; ii < 10; ii++) + for (int jj = 9; jj >= 0; jj--) + if (Array[ii][jj] != 7) +#if HAVE_IO + printf("Array[%d][%d] = %d\n", ii, jj, Array[ii][jj]); +#else + __builtin_abort (); +#endif + + /* different step sizes. */ + _Cilk_for (int ii = 0; ii < 10; ii++) + _Cilk_for (int jj = 0; jj < 10; jj += 2) + Array[ii][jj] = 9; + + for (int ii = 0; ii < 10; ii++) + for (int jj = 0; jj < 10; jj += 2) + if (Array[ii][jj] != 9) +#if HAVE_IO + printf("Array[%d][%d] = %d\n", ii, jj, Array[ii][jj]); +#else + __builtin_abort (); +#endif + + /* different step sizes. 
*/ + _Cilk_for (int ii = 0; ii < 10; ii += 2) + _Cilk_for (int jj = 5; jj < 9; jj++) + Array[ii][jj] = 11; + + for (int ii = 0; ii < 10; ii += 2) + for (int jj = 5; jj < 9; jj++) + if (Array[ii][jj] != 11) +#if HAVE_IO + printf("Array[%d][%d] = %d\n", ii, jj, Array[ii][jj]); +#else + __builtin_abort (); +#endif + + return 0; +} + diff --git a/gcc/tree-core.h b/gcc/tree-core.h index 0822d35..9412bab 100644 --- a/gcc/tree-core.h +++ b/gcc/tree-core.h @@ -351,6 +351,7 @@ enum omp_clause_schedule_kind { OMP_CLAUSE_SCHEDULE_GUIDED, OMP_CLAUSE_SCHEDULE_AUTO, OMP_CLAUSE_SCHEDULE_RUNTIME, + OMP_CLAUSE_SCHEDULE_CILKFOR, OMP_CLAUSE_SCHEDULE_LAST }; diff --git a/gcc/tree.def b/gcc/tree.def index 364e510..0a32bc4 100644 --- a/gcc/tree.def +++ b/gcc/tree.def @@ -1051,6 +1051,10 @@ DEFTREECODE (OMP_SIMD, "omp_simd", tcc_statement, 6) Operands like for OMP_FOR. */ DEFTREECODE (CILK_SIMD, "cilk_simd", tcc_statement, 6) +/* Cilk Plus - _Cilk_for (..) + Operands like for OMP_FOR. */ +DEFTREECODE (CILK_FOR, "cilk_for", tcc_statement, 6) + /* OpenMP - #pragma omp distribute [clause1 ... clauseN] Operands like for OMP_FOR. */ DEFTREECODE (OMP_DISTRIBUTE, "omp_distribute", tcc_statement, 6)
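For anyone reading along without the runtime sources at hand, the calling convention that expand_cilk_for and expand_cilk_for_call set up can be sketched in plain C as below. This is only an illustration and not part of the patch: fake_cilk_for is a hypothetical serial stand-in for __cilkrts_cilk_for_32/64 (the real runtime divides the range differently and runs chunks in parallel), and struct loop_data and child_fn are made-up names for the .omp_data_i block and the outlined child function. The argument order (child function, data, count, grain) follows the order in which expand_cilk_for_call pushes its arguments, and the index mapping follows the code in expand_cilk_for, which uses N1 and the magnitude of STEP.

```c
/* Hypothetical stand-in for the data block passed as .omp_data_i.  */
struct loop_data
{
  int *array;      /* Array written by the loop body.  */
  long n1;         /* Initial value of the user's loop variable V.  */
  long step;       /* Magnitude of the step (always positive here).  */
  int descending;  /* Nonzero when the loop condition is > or >=.  */
  int value;       /* Value stored by the loop body.  */
};

/* Hypothetical outlined child function: the runtime passes the half-open
   range [low, high) and V is recomputed from the induction variable.  */
static void
child_fn (void *data, long low, long high)
{
  struct loop_data *d = data;
  for (long ind_var = low; ind_var < high; ind_var++)
    {
      long v = d->descending ? d->n1 - ind_var * d->step
                             : d->n1 + ind_var * d->step;
      d->array[v] = d->value;   /* Stands in for <BODY>.  */
    }
}

/* Serial sketch of the runtime entry point (an assumption, not the real
   implementation): split [0, count) into chunks of at most GRAIN
   iterations and call FN once per chunk.  */
static void
fake_cilk_for (void (*fn) (void *, long, long), void *data,
               long count, long grain)
{
  if (grain < 1)
    grain = 1;
  for (long low = 0; low < count; low += grain)
    {
      long high = low + grain < count ? low + grain : count;
      fn (data, low, high);
    }
}
```

With n1 = 9, step = 2, descending set and count = 5, this corresponds to the "for (ii = 9; ii >= 0; ii -= 2)" case in cilk-fors.c: the child function is called for [0,2), [2,4) and [4,5) when the grain is 2, and touches elements 9, 7, 5, 3 and 1.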