From patchwork Wed Oct 18 14:41:17 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Andre Vieira (lists)" X-Patchwork-Id: 1850936 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4S9YTY41hbz20Pd for ; Thu, 19 Oct 2023 01:42:09 +1100 (AEDT) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 081FA3875DEE for ; Wed, 18 Oct 2023 14:42:00 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id 32F4438560AA for ; Wed, 18 Oct 2023 14:41:19 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 32F4438560AA Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=arm.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=arm.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 32F4438560AA Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=217.140.110.172 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1697640081; cv=none; b=C0QRckucs9qCXQHTOYSxlJ4ODd7/GTN7r5Cb9aW5bp4obZHSOwL0RpaqFIDIf8h5Mxtph7yNfMs0Fm+1gULK6yaeCiumtIrrtCsyp6Zk2G4IvvD6Zssvk8LG9YwzwqLc/98LyoQMH041EP6srYrNQEciOcOWtIBOOqrsFyR/I4o= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1697640081; c=relaxed/simple; bh=ydpB1X5T02pUHj0kEmQDFHHtwcgv5dyWQ4BikwS0yUM=; h=Message-ID:Date:MIME-Version:Subject:From:To; b=rPJMoPAl5O5/crYz1pm9uOcJRVJPMBL3XHuup3bjP2ttm9TA5oec5gRa+cTTz1fSgCj9YiLk9ErJaKmb9MXdSTPW1HLi9BMOmh27feWIpZhv+2923aDpcEZ7nlEd4ms3Mg2YtQUoERiAOzh6+7+a27eHZuQV6uoY6XiIwjsqxJs= ARC-Authentication-Results: i=1; server2.sourceware.org Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 1ACCB2F4; Wed, 18 Oct 2023 07:42:00 -0700 (PDT) Received: from [10.57.67.225] (unknown [10.57.67.225]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 4D85C3F64C; Wed, 18 Oct 2023 07:41:18 -0700 (PDT) Message-ID: <8bc378c8-b87d-4fa6-a8f6-7665612352d8@arm.com> Date: Wed, 18 Oct 2023 15:41:17 +0100 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: [PATCH6/8] omp: Reorder call for TARGET_SIMD_CLONE_ADJUST (was Re: [PATCH7/8] vect: Add TARGET_SIMD_CLONE_ADJUST_RET_OR_PARAM) Content-Language: en-US From: "Andre Vieira (lists)" To: Richard Biener Cc: gcc-patches@gcc.gnu.org, Richard Sandiford , "jakub@redhat.com" References: <73b53052-c3a4-4028-2836-ade419431eda@arm.com> <7fc775ba-378a-483f-b318-6b6b4aff92a6@arm.com> In-Reply-To: X-Spam-Status: No, score=-13.7 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_NONE, KAM_DMARC_STATUS, KAM_LAZY_DOMAIN_SECURITY, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org This patch moves the call to TARGET_SIMD_CLONE_ADJUST until after the arguments and return types have been transformed into vector types. It also constructs the adjuments and retval modifications after this call, allowing targets to alter the types of the arguments and return of the clone prior to the modifications to the function definition. Is this OK? gcc/ChangeLog: * omp-simd-clone.cc (simd_clone_adjust_return_type): Hoist out code to create return array and don't return new type. (simd_clone_adjust_argument_types): Hoist out code that creates ipa_param_body_adjustments and don't return them. (simd_clone_adjust): Call TARGET_SIMD_CLONE_ADJUST after return and argument types have been vectorized, create adjustments and return array after the hook. (expand_simd_clones): Call TARGET_SIMD_CLONE_ADJUST after return and argument types have been vectorized. On 04/10/2023 13:40, Andre Vieira (lists) wrote: > > > On 04/10/2023 11:41, Richard Biener wrote: >> On Wed, 4 Oct 2023, Andre Vieira (lists) wrote: >> >>> >>> >>> On 30/08/2023 14:04, Richard Biener wrote: >>>> On Wed, 30 Aug 2023, Andre Vieira (lists) wrote: >>>> >>>>> This patch adds a new target hook to enable us to adapt the types >>>>> of return >>>>> and parameters of simd clones.  We use this in two ways, the first >>>>> one is >>>>> to >>>>> make sure we can create valid SVE types, including the SVE type >>>>> attribute, >>>>> when creating a SVE simd clone, even when the target options do not >>>>> support >>>>> SVE.  We are following the same behaviour seen with x86 that >>>>> creates simd >>>>> clones according to the ABI rules when no simdlen is provided, even >>>>> if that >>>>> simdlen is not supported by the current target options.  Note that >>>>> this >>>>> doesn't mean the simd clone will be used in auto-vectorization. >>>> >>>> You are not documenting the bool parameter of the new hook. >>>> >>>> What's wrong with doing the adjustment in TARGET_SIMD_CLONE_ADJUST? >>> >>> simd_clone_adjust_argument_types is called after that hook, so by the >>> time we >>> call TARGET_SIMD_CLONE_ADJUST the types are still in scalar, not >>> vector.  The >>> same is true for the return type one. >>> >>> Also the changes to the types need to be taken into consideration in >>> 'adjustments' I think. >> >> Nothing in the three existing implementations of TARGET_SIMD_CLONE_ADJUST >> relies on this ordering I think, how about moving the hook invocation >> after simd_clone_adjust_argument_types? >> > > But that wouldn't change the 'ipa_param_body_adjustments' for when we > have a function definition and we need to redo the body. >> Richard. >> >>> PS: I hope the subject line survived, my email client is having a bit >>> of a >>> wobble this morning... it's what you get for updating software :( diff --git a/gcc/omp-simd-clone.cc b/gcc/omp-simd-clone.cc index ef0b9b48c7212900023bc0eaebca5e1f9389db77..fb80888190c88e29895ecfbbe1b17d390c9a9dfe 100644 --- a/gcc/omp-simd-clone.cc +++ b/gcc/omp-simd-clone.cc @@ -701,10 +701,9 @@ simd_clone_create (struct cgraph_node *old_node, bool force_local) } /* Adjust the return type of the given function to its appropriate - vector counterpart. Returns a simd array to be used throughout the - function as a return value. */ + vector counterpart. */ -static tree +static void simd_clone_adjust_return_type (struct cgraph_node *node) { tree fndecl = node->decl; @@ -714,7 +713,7 @@ simd_clone_adjust_return_type (struct cgraph_node *node) /* Adjust the function return type. */ if (orig_rettype == void_type_node) - return NULL_TREE; + return; t = TREE_TYPE (TREE_TYPE (fndecl)); if (INTEGRAL_TYPE_P (t) || POINTER_TYPE_P (t)) veclen = node->simdclone->vecsize_int; @@ -737,24 +736,6 @@ simd_clone_adjust_return_type (struct cgraph_node *node) veclen)); } TREE_TYPE (TREE_TYPE (fndecl)) = t; - if (!node->definition) - return NULL_TREE; - - t = DECL_RESULT (fndecl); - /* Adjust the DECL_RESULT. */ - gcc_assert (TREE_TYPE (t) != void_type_node); - TREE_TYPE (t) = TREE_TYPE (TREE_TYPE (fndecl)); - relayout_decl (t); - - tree atype = build_array_type_nelts (orig_rettype, - node->simdclone->simdlen); - if (maybe_ne (veclen, node->simdclone->simdlen)) - return build1 (VIEW_CONVERT_EXPR, atype, t); - - /* Set up a SIMD array to use as the return value. */ - tree retval = create_tmp_var_raw (atype, "retval"); - gimple_add_tmp_var (retval); - return retval; } /* Each vector argument has a corresponding array to be used locally @@ -788,7 +769,7 @@ create_tmp_simd_array (const char *prefix, tree type, poly_uint64 simdlen) declarations will be remapped. New arguments which are not to be remapped are marked with USER_FLAG. */ -static ipa_param_body_adjustments * +static void simd_clone_adjust_argument_types (struct cgraph_node *node) { auto_vec args; @@ -798,15 +779,11 @@ simd_clone_adjust_argument_types (struct cgraph_node *node) else simd_clone_vector_of_formal_parm_types (&args, node->decl); struct cgraph_simd_clone *sc = node->simdclone; - vec *new_params = NULL; - vec_safe_reserve (new_params, sc->nargs); - unsigned i, j, k; + unsigned i, k; poly_uint64 veclen; for (i = 0; i < sc->nargs; ++i) { - ipa_adjusted_param adj; - memset (&adj, 0, sizeof (adj)); tree parm = NULL_TREE; tree parm_type = NULL_TREE; if(i < args.length()) @@ -815,17 +792,12 @@ simd_clone_adjust_argument_types (struct cgraph_node *node) parm_type = node->definition ? TREE_TYPE (parm) : parm; } - adj.base_index = i; - adj.prev_clone_index = i; - sc->args[i].orig_arg = node->definition ? parm : NULL_TREE; sc->args[i].orig_type = parm_type; switch (sc->args[i].arg_type) { default: - /* No adjustment necessary for scalar arguments. */ - adj.op = IPA_PARAM_OP_COPY; break; case SIMD_CLONE_ARG_TYPE_LINEAR_UVAL_CONSTANT_STEP: case SIMD_CLONE_ARG_TYPE_LINEAR_UVAL_VARIABLE_STEP: @@ -834,7 +806,6 @@ simd_clone_adjust_argument_types (struct cgraph_node *node) = create_tmp_simd_array (IDENTIFIER_POINTER (DECL_NAME (parm)), TREE_TYPE (parm_type), sc->simdlen); - adj.op = IPA_PARAM_OP_COPY; break; case SIMD_CLONE_ARG_TYPE_LINEAR_VAL_CONSTANT_STEP: case SIMD_CLONE_ARG_TYPE_LINEAR_VAL_VARIABLE_STEP: @@ -851,28 +822,12 @@ simd_clone_adjust_argument_types (struct cgraph_node *node) GET_MODE_BITSIZE (SCALAR_TYPE_MODE (parm_type))); if (multiple_p (veclen, sc->simdlen)) veclen = sc->simdlen; - adj.op = IPA_PARAM_OP_NEW; - adj.param_prefix_index = IPA_PARAM_PREFIX_SIMD; + tree vtype; if (POINTER_TYPE_P (parm_type)) - adj.type = build_vector_type (pointer_sized_int_node, veclen); + vtype = build_vector_type (pointer_sized_int_node, veclen); else - adj.type = build_vector_type (parm_type, veclen); - sc->args[i].vector_type = adj.type; - k = vector_unroll_factor (sc->simdlen, veclen); - for (j = 1; j < k; j++) - { - vec_safe_push (new_params, adj); - if (j == 1) - { - memset (&adj, 0, sizeof (adj)); - adj.op = IPA_PARAM_OP_NEW; - adj.user_flag = 1; - adj.param_prefix_index = IPA_PARAM_PREFIX_SIMD; - adj.base_index = i; - adj.prev_clone_index = i; - adj.type = sc->args[i].vector_type; - } - } + vtype = build_vector_type (parm_type, veclen); + sc->args[i].vector_type = vtype; if (node->definition) sc->args[i].simd_array @@ -880,20 +835,12 @@ simd_clone_adjust_argument_types (struct cgraph_node *node) ? IDENTIFIER_POINTER (DECL_NAME (parm)) : NULL, parm_type, sc->simdlen); } - vec_safe_push (new_params, adj); } if (sc->inbranch) { tree base_type = simd_clone_compute_base_data_type (sc->origin, sc); - ipa_adjusted_param adj; - memset (&adj, 0, sizeof (adj)); - adj.op = IPA_PARAM_OP_NEW; - adj.user_flag = 1; - adj.param_prefix_index = IPA_PARAM_PREFIX_MASK; - - adj.base_index = i; - adj.prev_clone_index = i; + tree mask_type; if (INTEGRAL_TYPE_P (base_type) || POINTER_TYPE_P (base_type)) veclen = sc->vecsize_int; else @@ -906,17 +853,14 @@ simd_clone_adjust_argument_types (struct cgraph_node *node) if (multiple_p (veclen, sc->simdlen)) veclen = sc->simdlen; if (sc->mask_mode != VOIDmode) - adj.type + mask_type = lang_hooks.types.type_for_mode (sc->mask_mode, 1); else if (POINTER_TYPE_P (base_type)) - adj.type = build_vector_type (pointer_sized_int_node, veclen); + mask_type = build_vector_type (pointer_sized_int_node, veclen); else - adj.type = build_vector_type (base_type, veclen); - vec_safe_push (new_params, adj); + mask_type = build_vector_type (base_type, veclen); k = vector_unroll_factor (sc->simdlen, veclen); - for (j = 1; j < k; j++) - vec_safe_push (new_params, adj); /* We have previously allocated one extra entry for the mask. Use it and fill it. */ @@ -932,24 +876,16 @@ simd_clone_adjust_argument_types (struct cgraph_node *node) = create_tmp_simd_array ("mask", base_type, sc->simdlen); else if (k > 1) sc->args[i].simd_array - = create_tmp_simd_array ("mask", adj.type, k); + = create_tmp_simd_array ("mask", mask_type, k); else sc->args[i].simd_array = NULL_TREE; } sc->args[i].orig_type = base_type; sc->args[i].arg_type = SIMD_CLONE_ARG_TYPE_MASK; - sc->args[i].vector_type = adj.type; + sc->args[i].vector_type = mask_type; } - if (node->definition) - { - ipa_param_body_adjustments *adjustments - = new ipa_param_body_adjustments (new_params, node->decl); - - adjustments->modify_formal_parameters (); - return adjustments; - } - else + if (!node->definition) { tree new_arg_types = NULL_TREE, new_reversed; bool last_parm_void = false; @@ -957,15 +893,20 @@ simd_clone_adjust_argument_types (struct cgraph_node *node) last_parm_void = true; gcc_assert (TYPE_ARG_TYPES (TREE_TYPE (node->decl))); - j = vec_safe_length (new_params); - for (i = 0; i < j; i++) + for (i = 0; i < sc->nargs; i++) { - struct ipa_adjusted_param *adj = &(*new_params)[i]; tree ptype; - if (adj->op == IPA_PARAM_OP_COPY) - ptype = args[adj->base_index]; - else - ptype = adj->type; + switch (sc->args[i].arg_type) + { + default: + ptype = sc->args[i].orig_type; + break; + case SIMD_CLONE_ARG_TYPE_LINEAR_VAL_CONSTANT_STEP: + case SIMD_CLONE_ARG_TYPE_LINEAR_VAL_VARIABLE_STEP: + case SIMD_CLONE_ARG_TYPE_VECTOR: + ptype = sc->args[i].vector_type; + break; + } new_arg_types = tree_cons (NULL_TREE, ptype, new_arg_types); } new_reversed = nreverse (new_arg_types); @@ -977,7 +918,6 @@ simd_clone_adjust_argument_types (struct cgraph_node *node) new_reversed = void_list_node; } TYPE_ARG_TYPES (TREE_TYPE (node->decl)) = new_reversed; - return NULL; } } @@ -996,7 +936,8 @@ simd_clone_init_simd_arrays (struct cgraph_node *node, arg; arg = DECL_CHAIN (arg), i++, j++) { - if ((*adjustments->m_adj_params)[j].op == IPA_PARAM_OP_COPY + ipa_adjusted_param adj = (*adjustments->m_adj_params)[j]; + if (adj.op == IPA_PARAM_OP_COPY || POINTER_TYPE_P (TREE_TYPE (arg))) continue; @@ -1004,7 +945,7 @@ simd_clone_init_simd_arrays (struct cgraph_node *node, tree array = node->simdclone->args[i].simd_array; if (node->simdclone->mask_mode != VOIDmode - && node->simdclone->args[i].arg_type == SIMD_CLONE_ARG_TYPE_MASK) + && adj.param_prefix_index == IPA_PARAM_PREFIX_MASK) { if (array == NULL_TREE) continue; @@ -1024,8 +965,9 @@ simd_clone_init_simd_arrays (struct cgraph_node *node, } continue; } - if (known_eq (TYPE_VECTOR_SUBPARTS (TREE_TYPE (arg)), - node->simdclone->simdlen)) + if (!VECTOR_TYPE_P (TREE_TYPE (arg)) + || known_eq (TYPE_VECTOR_SUBPARTS (TREE_TYPE (arg)), + node->simdclone->simdlen)) { tree ptype = build_pointer_type (TREE_TYPE (TREE_TYPE (array))); tree ptr = build_fold_addr_expr (array); @@ -1423,13 +1365,120 @@ simd_clone_adjust (struct cgraph_node *node) { push_cfun (DECL_STRUCT_FUNCTION (node->decl)); + tree orig_rettype = TREE_TYPE (TREE_TYPE (node->decl)); TREE_TYPE (node->decl) = build_distinct_type_copy (TREE_TYPE (node->decl)); + simd_clone_adjust_return_type (node); + simd_clone_adjust_argument_types (node); targetm.simd_clone.adjust (node); + tree retval = NULL_TREE; + if (orig_rettype != void_type_node) + { + poly_uint64 veclen; + if (INTEGRAL_TYPE_P (orig_rettype) || POINTER_TYPE_P (orig_rettype)) + veclen = node->simdclone->vecsize_int; + else + veclen = node->simdclone->vecsize_float; + if (known_eq (veclen, 0U)) + veclen = node->simdclone->simdlen; + else + veclen = exact_div (veclen, + GET_MODE_BITSIZE (SCALAR_TYPE_MODE (orig_rettype))); + if (multiple_p (veclen, node->simdclone->simdlen)) + veclen = node->simdclone->simdlen; + + retval = DECL_RESULT (node->decl); + /* Adjust the DECL_RESULT. */ + TREE_TYPE (retval) = TREE_TYPE (TREE_TYPE (node->decl)); + relayout_decl (retval); + + tree atype = build_array_type_nelts (orig_rettype, + node->simdclone->simdlen); + if (maybe_ne (veclen, node->simdclone->simdlen)) + retval = build1 (VIEW_CONVERT_EXPR, atype, retval); + else + { + /* Set up a SIMD array to use as the return value. */ + retval = create_tmp_var_raw (atype, "retval"); + gimple_add_tmp_var (retval); + } + } - tree retval = simd_clone_adjust_return_type (node); + struct cgraph_simd_clone *sc = node->simdclone; + vec *new_params = NULL; + vec_safe_reserve (new_params, sc->nargs); + unsigned i, j, k; + for (i = 0; i < sc->nargs; ++i) + { + ipa_adjusted_param adj; + memset (&adj, 0, sizeof (adj)); + poly_uint64 veclen; + tree elem_type; + + adj.base_index = i; + adj.prev_clone_index = i; + switch (sc->args[i].arg_type) + { + default: + /* No adjustment necessary for scalar arguments. */ + adj.op = IPA_PARAM_OP_COPY; + break; + case SIMD_CLONE_ARG_TYPE_LINEAR_UVAL_VARIABLE_STEP: + adj.op = IPA_PARAM_OP_COPY; + break; + case SIMD_CLONE_ARG_TYPE_MASK: + case SIMD_CLONE_ARG_TYPE_LINEAR_VAL_CONSTANT_STEP: + case SIMD_CLONE_ARG_TYPE_LINEAR_VAL_VARIABLE_STEP: + case SIMD_CLONE_ARG_TYPE_VECTOR: + if (sc->args[i].arg_type == SIMD_CLONE_ARG_TYPE_MASK + && sc->mask_mode != VOIDmode) + elem_type = boolean_type_node; + else + elem_type = TREE_TYPE (sc->args[i].vector_type); + if (INTEGRAL_TYPE_P (elem_type) || POINTER_TYPE_P (elem_type)) + veclen = sc->vecsize_int; + else + veclen = sc->vecsize_float; + if (known_eq (veclen, 0U)) + veclen = sc->simdlen; + else + veclen + = exact_div (veclen, + GET_MODE_BITSIZE (SCALAR_TYPE_MODE (elem_type))); + if (multiple_p (veclen, sc->simdlen)) + veclen = sc->simdlen; + if (sc->args[i].arg_type == SIMD_CLONE_ARG_TYPE_MASK) + { + adj.user_flag = 1; + adj.param_prefix_index = IPA_PARAM_PREFIX_MASK; + } + else + adj.param_prefix_index = IPA_PARAM_PREFIX_SIMD; + adj.op = IPA_PARAM_OP_NEW; + adj.type = sc->args[i].vector_type; + k = vector_unroll_factor (sc->simdlen, veclen); + for (j = 1; j < k; j++) + { + vec_safe_push (new_params, adj); + if (j == 1) + { + memset (&adj, 0, sizeof (adj)); + adj.op = IPA_PARAM_OP_NEW; + adj.user_flag = 1; + if (sc->args[i].arg_type == SIMD_CLONE_ARG_TYPE_MASK) + adj.param_prefix_index = IPA_PARAM_PREFIX_MASK; + else + adj.param_prefix_index = IPA_PARAM_PREFIX_SIMD; + adj.base_index = i; + adj.prev_clone_index = i; + adj.type = sc->args[i].vector_type; + } + } + } + vec_safe_push (new_params, adj); + } ipa_param_body_adjustments *adjustments - = simd_clone_adjust_argument_types (node); - gcc_assert (adjustments); + = new ipa_param_body_adjustments (new_params, node->decl); + adjustments->modify_formal_parameters (); push_gimplify_context (); @@ -2050,9 +2099,9 @@ expand_simd_clones (struct cgraph_node *node) { TREE_TYPE (n->decl) = build_distinct_type_copy (TREE_TYPE (n->decl)); - targetm.simd_clone.adjust (n); simd_clone_adjust_return_type (n); simd_clone_adjust_argument_types (n); + targetm.simd_clone.adjust (n); } if (dump_file) fprintf (dump_file, "\nGenerated %s clone %s\n",