From: "Andre Vieira (lists)"
Date: Fri, 28 Apr 2023 13:37:14 +0100
Subject: [PATCH 2/3] Refactor widen_plus as internal_fn
To: Richard Biener
Cc: Richard Biener, Richard Sandiford, "gcc-patches@gcc.gnu.org"
References: <51ce8969-3130-452e-092e-f9d91eff2dad@arm.com>
X-Patchwork-Id: 1774963

This patch replaces the existing tree_code widen_plus and widen_minus
patterns with internal_fn versions.

DEF_INTERNAL_OPTAB_HILO_FN is like DEF_INTERNAL_OPTAB_FN except it provides
convenience wrappers for defining conversions that require a hi/lo split, like
widening and narrowing operations.  Each definition for <NAME> will require an
optab named <OPTAB> and two other optabs that you specify for signed and
unsigned.
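As a rough illustration of the idea only (this is not the code in the patch),
the sketch below shows how a single macro entry can yield a function plus its
_LO and _HI variants, together with a table of signed and unsigned optab names.
It assumes a single-file list macro, TOY_HILO_FN_LIST, in place of re-including
internal-fn.def, and every TOY_*/toy_* name is hypothetical.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical, simplified stand-in for internal-fn.def: each entry names a
// function plus the signed and unsigned optab stems it expands to.
#define TOY_HILO_FN_LIST(X)                               \
  X (VEC_WIDEN_PLUS, vec_widen_saddl, vec_widen_uaddl)    \
  X (VEC_WIDEN_MINUS, vec_widen_ssubl, vec_widen_usubl)

// First expansion: one enumerator for NAME, followed by NAME_LO and NAME_HI.
enum toy_fn
{
#define X(NAME, SOPTAB, UOPTAB) TOY_##NAME, TOY_##NAME##_LO, TOY_##NAME##_HI,
  TOY_HILO_FN_LIST (X)
#undef X
  TOY_LAST
};

// Second expansion of the same list: optab names for each lo/hi function.
struct toy_hilo_entry
{
  toy_fn fn;
  const char *s_optab;
  const char *u_optab;
};

static const toy_hilo_entry toy_hilo_table[] = {
#define X(NAME, SOPTAB, UOPTAB)                           \
  { TOY_##NAME##_LO, #SOPTAB "_lo", #UOPTAB "_lo" },      \
  { TOY_##NAME##_HI, #SOPTAB "_hi", #UOPTAB "_hi" },
  TOY_HILO_FN_LIST (X)
#undef X
};

// Return the optab name for FN and signedness, or nullptr if FN has none.
static const char *
toy_lookup_optab (toy_fn fn, bool is_unsigned)
{
  for (const toy_hilo_entry &e : toy_hilo_table)
    if (e.fn == fn)
      return is_unsigned ? e.u_optab : e.s_optab;
  return nullptr;
}

// Self-check of the ordering and lookup assumptions made in this sketch.
static void
toy_self_check ()
{
  assert (TOY_VEC_WIDEN_PLUS_LO == TOY_VEC_WIDEN_PLUS + 1);
  assert (TOY_VEC_WIDEN_PLUS_HI == TOY_VEC_WIDEN_PLUS + 2);
  assert (std::strcmp (toy_lookup_optab (TOY_VEC_WIDEN_MINUS_HI, true),
                       "vec_widen_usubl_hi") == 0);
}
```

In this toy version the _LO and _HI enumerators always directly follow the
base enumerator, so "base + 1" and "base + 2" identify them; the base entry
itself has no optab of its own in the table.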

The hi/lo pair is necessary because the widening operations take n narrow
elements as inputs and return n/2 wide elements as outputs.  The 'lo' operation
operates on the first n/2 elements of input.  The 'hi' operation operates on
the second n/2 elements of input.  Defining an internal_fn along with hi/lo
variations allows a single internal function to be returned from a vect_recog
function that will later be expanded to hi/lo.

DEF_INTERNAL_OPTAB_HILO_FN is used in internal-fn.def to register a widening
internal_fn.  It is defined differently in different places and internal-fn.def
is sourced from those places so the parameters given can be reused.
  internal-fn.cc: defined to expand to hi/lo signed/unsigned optabs, later
defined to generate the 'expand_' functions for the hi/lo versions of the fn.
  internal-fn.def: defined to invoke DEF_INTERNAL_OPTAB_FN for the original
and hi/lo variants of the internal_fn.

For example:
  IFN_VEC_WIDEN_PLUS -> IFN_VEC_WIDEN_PLUS_HI, IFN_VEC_WIDEN_PLUS_LO
for aarch64:
  IFN_VEC_WIDEN_PLUS_HI -> vec_widen_<su>addl_hi_<mode> -> (u/s)addl2
  IFN_VEC_WIDEN_PLUS_LO -> vec_widen_<su>addl_lo_<mode> -> (u/s)addl

This gives the same functionality as the previous WIDEN_PLUS/WIDEN_MINUS tree
codes which are expanded into VEC_WIDEN_PLUS_LO, VEC_WIDEN_PLUS_HI.

gcc/ChangeLog:

2023-04-28  Andre Vieira
	    Joel Hutton
	    Tamar Christina

	* internal-fn.cc (INCLUDE_MAP): Include maps for use in optab lookup.
	(DEF_INTERNAL_OPTAB_HILO_FN): Macro to define an internal_fn that
	expands into multiple internal_fns (for widening).
	(ifn_cmp): Function to compare ifn's for sorting/searching.
	(lookup_hilo_ifn_optab): Add lookup function.
	(lookup_hilo_internal_fn): Add lookup function.
	(commutative_binary_fn_p): Add widen_plus fn's.
	(widening_fn_p): New function.
	(decomposes_to_hilo_fn_p): New function.
	* internal-fn.def (DEF_INTERNAL_OPTAB_HILO_FN): Define widening
	plus, minus functions.
	(VEC_WIDEN_PLUS): Replacement for VEC_WIDEN_PLUS tree code.
	(VEC_WIDEN_MINUS): Replacement for VEC_WIDEN_MINUS tree code.
	* internal-fn.h (GCC_INTERNAL_FN_H): Add headers.
	(lookup_hilo_ifn_optab): Add prototype.
	(lookup_hilo_internal_fn): Likewise.
	(widening_fn_p): Likewise.
	(decomposes_to_hilo_fn_p): Likewise.
	* optabs.cc (commutative_optab_p): Add widening plus, minus optabs.
	* optabs.def (OPTAB_CD): Add widen add, sub optabs.
	* tree-vect-patterns.cc (vect_recog_widen_op_pattern): Support
	patterns with a hi/lo split.
	(vect_recog_widen_plus_pattern): Refactor to return
	IFN_VEC_WIDEN_PLUS.
	(vect_recog_widen_minus_pattern): Refactor to return new
	IFN_VEC_WIDEN_MINUS.
	* tree-vect-stmts.cc (vectorizable_conversion): Add widen plus/minus
	ifn support.
	(supportable_widening_operation): Add widen plus/minus ifn support.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/vect-widen-add.c: Test that new
	IFN_VEC_WIDEN_PLUS is being used.
	* gcc.target/aarch64/vect-widen-sub.c: Test that new
	IFN_VEC_WIDEN_MINUS is being used.

diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
index 6e81dc05e0e0714256759b0594816df451415a2d..e4d815cd577d266d2bccf6fb68d62aac91a8b4cf 100644
--- a/gcc/internal-fn.cc
+++ b/gcc/internal-fn.cc
@@ -17,6 +17,7 @@ You should have received a copy of the GNU General Public License
 along with GCC; see the file COPYING3.  If not see
 <http://www.gnu.org/licenses/>.  */

+#define INCLUDE_MAP
 #include "config.h"
 #include "system.h"
 #include "coretypes.h"
@@ -70,6 +71,26 @@ const int internal_fn_flags_array[] = {
   0
 };

+const enum internal_fn internal_fn_hilo_keys_array[] = {
+#undef DEF_INTERNAL_OPTAB_HILO_FN
+#define DEF_INTERNAL_OPTAB_HILO_FN(NAME, FLAGS, OPTAB, SOPTAB, UOPTAB, TYPE) \
+  IFN_##NAME##_LO, \
+  IFN_##NAME##_HI,
+#include "internal-fn.def"
+  IFN_LAST
+#undef DEF_INTERNAL_OPTAB_HILO_FN
+};
+
+const optab internal_fn_hilo_values_array[] = {
+#undef DEF_INTERNAL_OPTAB_HILO_FN
+#define DEF_INTERNAL_OPTAB_HILO_FN(NAME, FLAGS, OPTAB, SOPTAB, UOPTAB, TYPE) \
+  SOPTAB##_lo_optab, UOPTAB##_lo_optab, \
+  SOPTAB##_hi_optab, UOPTAB##_hi_optab,
+#include "internal-fn.def"
+  unknown_optab, unknown_optab
+#undef DEF_INTERNAL_OPTAB_HILO_FN
+};
+
 /* Return the internal function called NAME, or IFN_LAST if there's no
    such function.  */

@@ -90,6 +111,61 @@ lookup_internal_fn (const char *name)
   return entry ? *entry : IFN_LAST;
 }

+static int
+ifn_cmp (const void *a_, const void *b_)
+{
+  typedef std::pair<enum internal_fn, unsigned> ifn_pair;
+  auto *a = (const std::pair<ifn_pair, optab> *)a_;
+  auto *b = (const std::pair<ifn_pair, optab> *)b_;
+  return (int) (a->first.first) - (b->first.first);
+}
+
+/* Return the optab belonging to the given internal function NAME for the given
+   SIGN or unknown_optab.  */
+
+optab
+lookup_hilo_ifn_optab (enum internal_fn fn, unsigned sign)
+{
+  typedef std::pair<enum internal_fn, unsigned> ifn_pair;
+  typedef auto_vec <std::pair<ifn_pair, optab>>fn_to_optab_map_type;
+  static fn_to_optab_map_type *fn_to_optab_map;
+
+  if (!fn_to_optab_map)
+    {
+      unsigned num
+        = sizeof (internal_fn_hilo_keys_array) / sizeof (enum internal_fn);
+      fn_to_optab_map = new fn_to_optab_map_type ();
+      for (unsigned int i = 0; i < num - 1; ++i)
+        {
+          enum internal_fn fn = internal_fn_hilo_keys_array[i];
+          optab v1 = internal_fn_hilo_values_array[2*i];
+          optab v2 = internal_fn_hilo_values_array[2*i + 1];
+          ifn_pair key1 (fn, 0);
+          fn_to_optab_map->safe_push ({key1, v1});
+          ifn_pair key2 (fn, 1);
+          fn_to_optab_map->safe_push ({key2, v2});
+        }
+      fn_to_optab_map->qsort (ifn_cmp);
+    }
+
+  ifn_pair new_pair (fn, sign ? 1 : 0);
+  optab tmp;
+  std::pair<ifn_pair, optab> pair_wrap (new_pair, tmp);
+  auto entry = fn_to_optab_map->bsearch (&pair_wrap, ifn_cmp);
+  return entry != fn_to_optab_map->end () ? entry->second : unknown_optab;
+}
+
+extern void
+lookup_hilo_internal_fn (enum internal_fn ifn, enum internal_fn *lo,
+                         enum internal_fn *hi)
+{
+  gcc_assert (decomposes_to_hilo_fn_p (ifn));
+
+  *lo = internal_fn (ifn + 1);
+  *hi = internal_fn (ifn + 2);
+}
+
+
 /* Fnspec of each internal function, indexed by function number.  */
 const_tree internal_fn_fnspec_array[IFN_LAST + 1];

@@ -3970,6 +4046,9 @@ commutative_binary_fn_p (internal_fn fn)
     case IFN_UBSAN_CHECK_MUL:
     case IFN_ADD_OVERFLOW:
     case IFN_MUL_OVERFLOW:
+    case IFN_VEC_WIDEN_PLUS:
+    case IFN_VEC_WIDEN_PLUS_LO:
+    case IFN_VEC_WIDEN_PLUS_HI:
       return true;

     default:
@@ -4043,6 +4122,42 @@ first_commutative_argument (internal_fn fn)
     }
 }

+/* Return true if FN has a wider output type than its argument types.  */
+
+bool
+widening_fn_p (internal_fn fn)
+{
+  switch (fn)
+    {
+    case IFN_VEC_WIDEN_PLUS:
+    case IFN_VEC_WIDEN_MINUS:
+      return true;
+
+    default:
+      return false;
+    }
+}
+
+/* Return true if FN decomposes to _hi and _lo IFN.  If true this should also
+   be a widening function.  */
+
+bool
+decomposes_to_hilo_fn_p (internal_fn fn)
+{
+  if (!widening_fn_p (fn))
+    return false;
+
+  switch (fn)
+    {
+    case IFN_VEC_WIDEN_PLUS:
+    case IFN_VEC_WIDEN_MINUS:
+      return true;
+
+    default:
+      return false;
+    }
+}
+
 /* Return true if IFN_SET_EDOM is supported.  */

 bool
@@ -4055,6 +4170,32 @@ set_edom_supported_p (void)
 #endif
 }

+#undef DEF_INTERNAL_OPTAB_HILO_FN
+#define DEF_INTERNAL_OPTAB_HILO_FN(CODE, FLAGS, OPTAB, SOPTAB, UOPTAB, TYPE) \
+  static void \
+  expand_##CODE (internal_fn, gcall *) \
+  { \
+    gcc_unreachable (); \
+  } \
+  static void \
+  expand_##CODE##_LO (internal_fn fn, gcall *stmt) \
+  { \
+    tree ty = TREE_TYPE (gimple_get_lhs (stmt)); \
+    if (!TYPE_UNSIGNED (ty)) \
+      expand_##TYPE##_optab_fn (fn, stmt, SOPTAB##_lo##_optab); \
+    else \
+      expand_##TYPE##_optab_fn (fn, stmt, UOPTAB##_lo##_optab); \
+  } \
+  static void \
+  expand_##CODE##_HI (internal_fn fn, gcall *stmt) \
+  { \
+    tree ty = TREE_TYPE (gimple_get_lhs (stmt)); \
+    if (!TYPE_UNSIGNED (ty)) \
+      expand_##TYPE##_optab_fn (fn, stmt, SOPTAB##_hi##_optab); \
+    else \
+      expand_##TYPE##_optab_fn (fn, stmt, UOPTAB##_hi##_optab); \
+  }
+
 #define DEF_INTERNAL_OPTAB_FN(CODE, FLAGS, OPTAB, TYPE) \
   static void \
   expand_##CODE (internal_fn fn, gcall *stmt) \
@@ -4071,6 +4212,7 @@ set_edom_supported_p (void)
     expand_##TYPE##_optab_fn (fn, stmt, which_optab); \
   }
 #include "internal-fn.def"
+#undef DEF_INTERNAL_OPTAB_HILO_FN

 /* Routines to expand each internal function, indexed by function number.
    Each routine has the prototype:
diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index 7fe742c2ae713e7152ab05cfdfba86e4e0aa3456..347ed667d92620e0ee3ea15c58ecac6c242ebe73 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -85,6 +85,13 @@ along with GCC; see the file COPYING3.  If not see
    says that the function extends the C-level BUILT_IN_<NAME>{,L,LL,IMAX}
    group of functions to any integral mode (including vector modes).

+   DEF_INTERNAL_OPTAB_HILO_FN is like DEF_INTERNAL_OPTAB_FN except it
+   provides convenience wrappers for defining conversions that require a
+   hi/lo split, like widening and narrowing operations.  Each definition
+   for <NAME> will require an optab named <OPTAB> and two other optabs that
+   you specify for signed and unsigned.
+
+
    Each entry must have a corresponding expander of the form:

      void expand_NAME (gimple_call stmt)
@@ -123,6 +130,14 @@ along with GCC; see the file COPYING3.  If not see
   DEF_INTERNAL_OPTAB_FN (NAME, FLAGS, OPTAB, TYPE)
 #endif

+#ifndef DEF_INTERNAL_OPTAB_HILO_FN
+#define DEF_INTERNAL_OPTAB_HILO_FN(NAME, FLAGS, OPTAB, SOPTAB, UOPTAB, TYPE) \
+  DEF_INTERNAL_OPTAB_FN (NAME, FLAGS, OPTAB, TYPE) \
+  DEF_INTERNAL_OPTAB_FN (NAME ## _LO, FLAGS, unknown, TYPE) \
+  DEF_INTERNAL_OPTAB_FN (NAME ## _HI, FLAGS, unknown, TYPE)
+#endif
+
+
 DEF_INTERNAL_OPTAB_FN (MASK_LOAD, ECF_PURE, maskload, mask_load)
 DEF_INTERNAL_OPTAB_FN (LOAD_LANES, ECF_CONST, vec_load_lanes, load_lanes)
 DEF_INTERNAL_OPTAB_FN (MASK_LOAD_LANES, ECF_PURE,
@@ -315,6 +330,14 @@ DEF_INTERNAL_OPTAB_FN (COMPLEX_ADD_ROT270, ECF_CONST, cadd270, binary)
 DEF_INTERNAL_OPTAB_FN (COMPLEX_MUL, ECF_CONST, cmul, binary)
 DEF_INTERNAL_OPTAB_FN (COMPLEX_MUL_CONJ, ECF_CONST, cmul_conj, binary)
 DEF_INTERNAL_OPTAB_FN (VEC_ADDSUB, ECF_CONST, vec_addsub, binary)
+DEF_INTERNAL_OPTAB_HILO_FN (VEC_WIDEN_PLUS,
+                            ECF_CONST | ECF_NOTHROW,
+                            vec_widen_add, vec_widen_saddl, vec_widen_uaddl,
+                            binary)
+DEF_INTERNAL_OPTAB_HILO_FN (VEC_WIDEN_MINUS,
+                            ECF_CONST | ECF_NOTHROW,
+                            vec_widen_sub, vec_widen_ssubl, vec_widen_usubl,
+                            binary)
 DEF_INTERNAL_OPTAB_FN (VEC_FMADDSUB, ECF_CONST, vec_fmaddsub, ternary)
 DEF_INTERNAL_OPTAB_FN (VEC_FMSUBADD, ECF_CONST, vec_fmsubadd, ternary)

diff --git a/gcc/internal-fn.h b/gcc/internal-fn.h
index 08922ed4254898f5fffca3f33973e96ed9ce772f..6a5f8762e872ad2ef64ce2986a678e3b40622d81 100644
--- a/gcc/internal-fn.h
+++ b/gcc/internal-fn.h
@@ -20,6 +20,10 @@ along with GCC; see the file COPYING3.  If not see
 #ifndef GCC_INTERNAL_FN_H
 #define GCC_INTERNAL_FN_H

+#include "insn-codes.h"
+#include "insn-opinit.h"
+
+
 /* INTEGER_CST values for IFN_UNIQUE function arg-0.

    UNSPEC: Undifferentiated UNIQUE.
@@ -112,6 +116,9 @@ internal_fn_name (enum internal_fn fn)
 }

 extern internal_fn lookup_internal_fn (const char *);
+extern optab lookup_hilo_ifn_optab (enum internal_fn, unsigned);
+extern void lookup_hilo_internal_fn (enum internal_fn, enum internal_fn *,
+                                     enum internal_fn *);

 /* Return the ECF_* flags for function FN.  */

@@ -210,6 +217,8 @@ extern bool commutative_binary_fn_p (internal_fn);
 extern bool commutative_ternary_fn_p (internal_fn);
 extern int first_commutative_argument (internal_fn);
 extern bool associative_binary_fn_p (internal_fn);
+extern bool widening_fn_p (internal_fn);
+extern bool decomposes_to_hilo_fn_p (internal_fn);

 extern bool set_edom_supported_p (void);

diff --git a/gcc/optabs.cc b/gcc/optabs.cc
index c8e39c82d57a7d726e7da33d247b80f32ec9236c..d4dd7ee3d34d01c32ab432ae4e4ce9e4b522b2f7 100644
--- a/gcc/optabs.cc
+++ b/gcc/optabs.cc
@@ -1314,7 +1314,12 @@ commutative_optab_p (optab binoptab)
           || binoptab == smul_widen_optab
           || binoptab == umul_widen_optab
           || binoptab == smul_highpart_optab
-          || binoptab == umul_highpart_optab);
+          || binoptab == umul_highpart_optab
+          || binoptab == vec_widen_add_optab
+          || binoptab == vec_widen_saddl_hi_optab
+          || binoptab == vec_widen_saddl_lo_optab
+          || binoptab == vec_widen_uaddl_hi_optab
+          || binoptab == vec_widen_uaddl_lo_optab);
 }

 /* X is to be used in mode MODE as operand OPN to BINOPTAB.  If we're
diff --git a/gcc/optabs.def b/gcc/optabs.def
index 695f5911b300c9ca5737de9be809fa01aabe5e01..e064189103b3be70644468d11f3c91ac45ffe0d0 100644
--- a/gcc/optabs.def
+++ b/gcc/optabs.def
@@ -78,6 +78,8 @@ OPTAB_CD(smsub_widen_optab, "msub$b$a4")
 OPTAB_CD(umsub_widen_optab, "umsub$b$a4")
 OPTAB_CD(ssmsub_widen_optab, "ssmsub$b$a4")
 OPTAB_CD(usmsub_widen_optab, "usmsub$a$b4")
+OPTAB_CD(vec_widen_add_optab, "add$a$b3")
+OPTAB_CD(vec_widen_sub_optab, "sub$a$b3")
 OPTAB_CD(vec_load_lanes_optab, "vec_load_lanes$a$b")
 OPTAB_CD(vec_store_lanes_optab, "vec_store_lanes$a$b")
 OPTAB_CD(vec_mask_load_lanes_optab, "vec_mask_load_lanes$a$b")
diff --git a/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c b/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c
index 220bd9352a4c7acd2e3713e441d74898d3e92b30..7037673d32bd780e1c9b58a51e58e2bac3b30b7e 100644
--- a/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c
+++ b/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c
@@ -1,5 +1,5 @@
 /* { dg-do run } */
-/* { dg-options "-O3 -save-temps" } */
+/* { dg-options "-O3 -save-temps -fdump-tree-vect-all" } */
 #include
 #include

@@ -86,6 +86,8 @@ main()
   return 0;
 }

+/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_PLUS_LO" "vect" } } */
+/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_PLUS_HI" "vect" } } */
 /* { dg-final { scan-assembler-times {\tuaddl\t} 1} } */
 /* { dg-final { scan-assembler-times {\tuaddl2\t} 1} } */
 /* { dg-final { scan-assembler-times {\tsaddl\t} 1} } */
diff --git a/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c b/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c
index a2bed63affbd091977df95a126da1f5b8c1d41d2..83bc1edb6105f47114b665e24a13e6194b2179a2 100644
--- a/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c
+++ b/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c
@@ -1,5 +1,5 @@
 /* { dg-do run } */
-/* { dg-options "-O3 -save-temps" } */
+/* { dg-options "-O3 -save-temps -fdump-tree-vect-all" } */
 #include
 #include

@@ -86,6 +86,8 @@ main()
   return 0;
 }

+/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_MINUS_LO" "vect" } } */
+/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_MINUS_HI" "vect" } } */
 /* { dg-final { scan-assembler-times {\tusubl\t} 1} } */
 /* { dg-final { scan-assembler-times {\tusubl2\t} 1} } */
 /* { dg-final { scan-assembler-times {\tssubl\t} 1} } */
diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
index b35023adade94c1996cd076c4b7419560e819c6b..3175dd92187c0935f78ebbf2eb476bdcf8b4ccd1 100644
--- a/gcc/tree-vect-patterns.cc
+++ b/gcc/tree-vect-patterns.cc
@@ -1394,14 +1394,16 @@ static gimple *
 vect_recog_widen_op_pattern (vec_info *vinfo,
                              stmt_vec_info last_stmt_info, tree *type_out,
                              tree_code orig_code, code_helper wide_code,
-                             bool shift_p, const char *name)
+                             bool shift_p, const char *name,
+                             enum optab_subtype *subtype = NULL)
 {
   gimple *last_stmt = last_stmt_info->stmt;

   vect_unpromoted_value unprom[2];
   tree half_type;
   if (!vect_widened_op_tree (vinfo, last_stmt_info, orig_code, orig_code,
-                             shift_p, 2, unprom, &half_type))
+                             shift_p, 2, unprom, &half_type, subtype))
+
     return NULL;

   /* Pattern detected.  */
@@ -1467,6 +1469,20 @@ vect_recog_widen_op_pattern (vec_info *vinfo,
                                       type, pattern_stmt, vecctype);
 }

+static gimple *
+vect_recog_widen_op_pattern (vec_info *vinfo,
+                             stmt_vec_info last_stmt_info, tree *type_out,
+                             tree_code orig_code, internal_fn wide_ifn,
+                             bool shift_p, const char *name,
+                             enum optab_subtype *subtype = NULL)
+{
+  combined_fn ifn = as_combined_fn (wide_ifn);
+  return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out,
+                                      orig_code, ifn, shift_p, name,
+                                      subtype);
+}
+
+
 /* Try to detect multiplication on widened inputs, converting MULT_EXPR
    to WIDEN_MULT_EXPR.  See vect_recog_widen_op_pattern for details.  */

@@ -1480,26 +1496,30 @@ vect_recog_widen_mult_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info,
 }

 /* Try to detect addition on widened inputs, converting PLUS_EXPR
-   to WIDEN_PLUS_EXPR.  See vect_recog_widen_op_pattern for details.  */
+   to IFN_VEC_WIDEN_PLUS.  See vect_recog_widen_op_pattern for details.  */

 static gimple *
 vect_recog_widen_plus_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info,
                                tree *type_out)
 {
+  enum optab_subtype subtype;
   return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out,
-                                      PLUS_EXPR, WIDEN_PLUS_EXPR, false,
-                                      "vect_recog_widen_plus_pattern");
+                                      PLUS_EXPR, IFN_VEC_WIDEN_PLUS,
+                                      false, "vect_recog_widen_plus_pattern",
+                                      &subtype);
 }

 /* Try to detect subtraction on widened inputs, converting MINUS_EXPR
-   to WIDEN_MINUS_EXPR.  See vect_recog_widen_op_pattern for details.  */
+   to IFN_VEC_WIDEN_MINUS.  See vect_recog_widen_op_pattern for details.  */
 static gimple *
 vect_recog_widen_minus_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info,
                                 tree *type_out)
 {
+  enum optab_subtype subtype;
   return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out,
-                                      MINUS_EXPR, WIDEN_MINUS_EXPR, false,
-                                      "vect_recog_widen_minus_pattern");
+                                      MINUS_EXPR, IFN_VEC_WIDEN_MINUS,
+                                      false, "vect_recog_widen_minus_pattern",
+                                      &subtype);
 }

 /* Function vect_recog_popcount_pattern
@@ -6067,6 +6087,7 @@ static vect_recog_func vect_vect_recog_func_ptrs[] = {
   { vect_recog_mask_conversion_pattern, "mask_conversion" },
   { vect_recog_widen_plus_pattern, "widen_plus" },
   { vect_recog_widen_minus_pattern, "widen_minus" },
+  /* These must come after the double widening ones.  */
 };

 const unsigned int NUM_PATTERNS = ARRAY_SIZE (vect_vect_recog_func_ptrs);
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index ce47f4940fa9a1baca4ba1162065cfc3b4072eba..2a7ef2439e12d1966e8884433963a3d387a856b7 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -5035,7 +5035,9 @@ vectorizable_conversion (vec_info *vinfo,
   bool widen_arith = (code == WIDEN_PLUS_EXPR
                       || code == WIDEN_MINUS_EXPR
                       || code == WIDEN_MULT_EXPR
-                      || code == WIDEN_LSHIFT_EXPR);
+                      || code == WIDEN_LSHIFT_EXPR
+                      || code == IFN_VEC_WIDEN_PLUS
+                      || code == IFN_VEC_WIDEN_MINUS);

   if (!widen_arith
       && !CONVERT_EXPR_CODE_P (code)
@@ -5085,7 +5087,9 @@ vectorizable_conversion (vec_info *vinfo,
       gcc_assert (code == WIDEN_MULT_EXPR
                   || code == WIDEN_LSHIFT_EXPR
                   || code == WIDEN_PLUS_EXPR
-                  || code == WIDEN_MINUS_EXPR);
+                  || code == WIDEN_MINUS_EXPR
+                  || code == IFN_VEC_WIDEN_PLUS
+                  || code == IFN_VEC_WIDEN_MINUS);

       op1 = is_gimple_assign (stmt) ? gimple_assign_rhs2 (stmt) :
@@ -12335,12 +12339,46 @@ supportable_widening_operation (vec_info *vinfo,
       optab1 = vec_unpacks_sbool_lo_optab;
       optab2 = vec_unpacks_sbool_hi_optab;
     }
-  else
-    {
-      optab1 = optab_for_tree_code (c1, vectype, optab_default);
-      optab2 = optab_for_tree_code (c2, vectype, optab_default);
+
+  if (code.is_fn_code ())
+    {
+      internal_fn ifn = as_internal_fn ((combined_fn) code);
+      gcc_assert (decomposes_to_hilo_fn_p (ifn));
+
+      internal_fn lo, hi;
+      lookup_hilo_internal_fn (ifn, &lo, &hi);
+      *code1 = as_combined_fn (lo);
+      *code2 = as_combined_fn (hi);
+      optab1 = lookup_hilo_ifn_optab (lo, !TYPE_UNSIGNED (vectype));
+      optab2 = lookup_hilo_ifn_optab (hi, !TYPE_UNSIGNED (vectype));
     }

+  if (code.is_tree_code ())
+    {
+      if (code == FIX_TRUNC_EXPR)
+        {
+          /* The signedness is determined from output operand.  */
+          optab1 = optab_for_tree_code (c1, vectype_out, optab_default);
+          optab2 = optab_for_tree_code (c2, vectype_out, optab_default);
+        }
+      else if (CONVERT_EXPR_CODE_P (code.safe_as_tree_code ())
+               && VECTOR_BOOLEAN_TYPE_P (wide_vectype)
+               && VECTOR_BOOLEAN_TYPE_P (vectype)
+               && TYPE_MODE (wide_vectype) == TYPE_MODE (vectype)
+               && SCALAR_INT_MODE_P (TYPE_MODE (vectype)))
+        {
+          /* If the input and result modes are the same, a different optab
+             is needed where we pass in the number of units in vectype.  */
+          optab1 = vec_unpacks_sbool_lo_optab;
+          optab2 = vec_unpacks_sbool_hi_optab;
+        }
+      else
+        {
+          optab1 = optab_for_tree_code (c1, vectype, optab_default);
+          optab2 = optab_for_tree_code (c2, vectype, optab_default);
+        }
+    }
+
   if (!optab1 || !optab2)
     return false;
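
The hi/lo split described in the commit message can be modelled in scalar
code.  The sketch below is only an illustration of that description ('lo'
covers the first n/2 input elements, 'hi' the second n/2); it is not GCC code,
toy_widen_add_half and toy_widen_add_demo are hypothetical helpers, and only
unsigned 8-bit to 16-bit widening is modelled.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar model of one half of a widening add: N narrow (8-bit)
// inputs per operand, N/2 wide (16-bit) outputs.  HI selects the second half
// of the inputs; otherwise the first half is used.
template <std::size_t N>
std::array<std::uint16_t, N / 2>
toy_widen_add_half (const std::array<std::uint8_t, N> &a,
                    const std::array<std::uint8_t, N> &b, bool hi)
{
  static_assert (N % 2 == 0, "N must be even");
  std::array<std::uint16_t, N / 2> out{};
  const std::size_t base = hi ? N / 2 : 0;
  for (std::size_t i = 0; i < N / 2; ++i)
    // Widen each operand before adding so the sum cannot wrap at 8 bits.
    out[i] = static_cast<std::uint16_t> (std::uint16_t (a[base + i])
                                         + std::uint16_t (b[base + i]));
  return out;
}

// Usage: the two halves together cover all N inputs.
inline void
toy_widen_add_demo ()
{
  const std::array<std::uint8_t, 4> a{200, 1, 255, 7};
  const std::array<std::uint8_t, 4> b{100, 2, 255, 8};
  const auto lo = toy_widen_add_half (a, b, false);
  const auto hi = toy_widen_add_half (a, b, true);
  assert (lo[0] == 300 && lo[1] == 3);
  assert (hi[0] == 510 && hi[1] == 15);
}
```

Each call produces n/2 wide results, which is why a full n-element widening
operation needs both a 'lo' and a 'hi' instruction.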