From patchwork Wed Aug 30 09:06:17 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Andre Vieira (lists)" X-Patchwork-Id: 1827662 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.a=rsa-sha256 header.s=default header.b=WTU+U2fB; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=8.43.85.97; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4RbJMD3z3mz1ygM for ; Wed, 30 Aug 2023 19:06:47 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 00B2F3858428 for ; Wed, 30 Aug 2023 09:06:45 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 00B2F3858428 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1693386405; bh=ajWrULaZE4v63VHRKU44ZPN3PJMcX+ajCOkgfb5Ma+M=; h=Date:Subject:To:References:Cc:In-Reply-To:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=WTU+U2fBGoj8bOFDnrXY8CDPbzXmPK9tFyvbE25eoDBlTH0ugL51ghEq80xQTwTPJ lR4i+Y4LK9fpurlNxxFBKwzzdqRW8pDBXWBBvL/H4X5iJAf7Y2/6mgycnlKEsBsdQY siEJgHnOhOuTU83gm1TsS2Af5mP9acl6ZYhXIjlE= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id E901A3858C30 for ; Wed, 30 Aug 2023 09:06:23 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org E901A3858C30 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 1DACE2F4; Wed, 30 Aug 2023 02:07:03 -0700 (PDT) Received: from [10.57.64.216] (unknown [10.57.64.216]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id DEA573F64C; Wed, 30 Aug 2023 02:06:22 -0700 (PDT) Message-ID: <9c15446b-1f4d-62d7-9427-a19eb07ac8ee@arm.com> Date: Wed, 30 Aug 2023 10:06:17 +0100 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101 Thunderbird/102.14.0 Subject: [PATCH 1/8] parloops: Copy target and optimizations when creating a function clone Content-Language: en-US To: gcc-patches@gcc.gnu.org References: <73b53052-c3a4-4028-2836-ade419431eda@arm.com> Cc: Richard Sandiford , Richard Biener , "jakub@redhat.com" In-Reply-To: <73b53052-c3a4-4028-2836-ade419431eda@arm.com> X-Spam-Status: No, score=-14.0 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_NONE, KAM_DMARC_STATUS, KAM_LAZY_DOMAIN_SECURITY, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: "Andre Vieira \(lists\) via Gcc-patches" From: "Andre Vieira (lists)" Reply-To: "Andre Vieira \(lists\)" Errors-To: gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org Sender: "Gcc-patches" SVE simd clones require to be compiled with a SVE target enabled or the argument types will not be created properly. To achieve this we need to copy DECL_FUNCTION_SPECIFIC_TARGET from the original function declaration to the clones. I decided it was probably also a good idea to copy DECL_FUNCTION_SPECIFIC_OPTIMIZATION in case the original function is meant to be compiled with specific optimization options. gcc/ChangeLog: * tree-parloops.cc (create_loop_fn): Copy specific target and optimization options to clone. diff --git a/gcc/tree-parloops.cc b/gcc/tree-parloops.cc index e495bbd65270bdf90bae2c4a2b52777522352a77..a35f3d5023b06e5ef96eb4222488fcb34dd7bd45 100644 --- a/gcc/tree-parloops.cc +++ b/gcc/tree-parloops.cc @@ -2203,6 +2203,11 @@ create_loop_fn (location_t loc) DECL_CONTEXT (t) = decl; TREE_USED (t) = 1; DECL_ARGUMENTS (decl) = t; + DECL_FUNCTION_SPECIFIC_TARGET (decl) + = DECL_FUNCTION_SPECIFIC_TARGET (act_cfun->decl); + DECL_FUNCTION_SPECIFIC_OPTIMIZATION (decl) + = DECL_FUNCTION_SPECIFIC_OPTIMIZATION (act_cfun->decl); + allocate_struct_function (decl, false); From patchwork Wed Aug 30 09:08:27 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Andre Vieira (lists)" X-Patchwork-Id: 1827663 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.a=rsa-sha256 header.s=default header.b=rufXsPtA; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4RbJPh4L4Lz1ygM for ; Wed, 30 Aug 2023 19:08:55 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 0759738582A3 for ; Wed, 30 Aug 2023 09:08:54 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 0759738582A3 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1693386534; bh=j8676XM4anbmPESOHlcFpK/LcZFIt14s9aOs4HYdSO4=; h=Date:Subject:To:References:Cc:In-Reply-To:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=rufXsPtAQmmlVrVAVSyNLXN8MAU9UNFDFUm7GY61uUVSrCVR/39kd7LABWI+LGzeL zfDr+cSGSmUk30OzsZcWP+AzK4DcQNzFEnImWh68YyxNOfMQYmsgU+xuXbHWsrjtFN G/1yqi2cquG4kSI1L85vxoEPZcYx3FgEyqpnfqkU= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id C84953858C30 for ; Wed, 30 Aug 2023 09:08:33 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org C84953858C30 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id D067C2F4; Wed, 30 Aug 2023 02:09:12 -0700 (PDT) Received: from [10.57.64.216] (unknown [10.57.64.216]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id AE2CC3F64C; Wed, 30 Aug 2023 02:08:32 -0700 (PDT) Message-ID: <0942baa7-f186-4d0b-f556-3b8f926a24ad@arm.com> Date: Wed, 30 Aug 2023 10:08:27 +0100 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101 Thunderbird/102.14.0 Subject: [Patch 2/8] parloops: Allow poly nit and bound Content-Language: en-US To: gcc-patches@gcc.gnu.org References: <73b53052-c3a4-4028-2836-ade419431eda@arm.com> Cc: Richard Sandiford , Richard Biener In-Reply-To: <73b53052-c3a4-4028-2836-ade419431eda@arm.com> X-Spam-Status: No, score=-14.0 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_NONE, KAM_DMARC_STATUS, KAM_LAZY_DOMAIN_SECURITY, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: "Andre Vieira \(lists\) via Gcc-patches" From: "Andre Vieira (lists)" Reply-To: "Andre Vieira \(lists\)" Errors-To: gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org Sender: "Gcc-patches" Teach parloops how to handle a poly nit and bound e ahead of the changes to enable non-constant simdlen. gcc/ChangeLog: * tree-parloops.cc (try_to_transform_to_exit_first_loop_alt): Accept poly NIT and ALT_BOUND. diff --git a/gcc/tree-parloops.cc b/gcc/tree-parloops.cc index a35f3d5023b06e5ef96eb4222488fcb34dd7bd45..cf713e53d712fb5ad050e274f373adba5a90c5a7 100644 --- a/gcc/tree-parloops.cc +++ b/gcc/tree-parloops.cc @@ -2531,14 +2531,16 @@ try_transform_to_exit_first_loop_alt (class loop *loop, tree nit_type = TREE_TYPE (nit); /* Figure out whether nit + 1 overflows. */ - if (TREE_CODE (nit) == INTEGER_CST) + if (TREE_CODE (nit) == INTEGER_CST + || TREE_CODE (nit) == POLY_INT_CST) { if (!tree_int_cst_equal (nit, TYPE_MAX_VALUE (nit_type))) { alt_bound = fold_build2_loc (UNKNOWN_LOCATION, PLUS_EXPR, nit_type, nit, build_one_cst (nit_type)); - gcc_assert (TREE_CODE (alt_bound) == INTEGER_CST); + gcc_assert (TREE_CODE (alt_bound) == INTEGER_CST + || TREE_CODE (alt_bound) == POLY_INT_CST); transform_to_exit_first_loop_alt (loop, reduction_list, alt_bound); return true; } From patchwork Wed Aug 30 09:10:10 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Andre Vieira (lists)" X-Patchwork-Id: 1827664 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.a=rsa-sha256 header.s=default header.b=CBwXsXOT; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4RbJRh42Vqz1yfX for ; Wed, 30 Aug 2023 19:10:40 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 745F038582BD for ; Wed, 30 Aug 2023 09:10:38 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 745F038582BD DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1693386638; bh=bM4ssMgK8S2KBdw7ymicPyUe/e/bvNa/kAFB8eEJh1Q=; h=Date:Subject:To:References:Cc:In-Reply-To:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=CBwXsXOTldiK6eha3TZ5lYyMs2icDnapaCIX/NvPWXxH58jitFGP8oKZodErDTq/p rVvWXIHMF4lmp2W2WD4J/ytGNs7F9ayyQ74L/Inf98seNjEi+EE2cNIs9cuYfpJZYF pg45SVExWb1gOyhHVf2TfrmX3SW81/PAn+VWQhXs= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id 0F7773858C30 for ; Wed, 30 Aug 2023 09:10:18 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 0F7773858C30 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id DC0AB2F4; Wed, 30 Aug 2023 02:10:56 -0700 (PDT) Received: from [10.57.64.216] (unknown [10.57.64.216]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 966773F64C; Wed, 30 Aug 2023 02:10:16 -0700 (PDT) Message-ID: <6adafeff-e026-aec9-2b1a-8a5f736f813d@arm.com> Date: Wed, 30 Aug 2023 10:10:10 +0100 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101 Thunderbird/102.14.0 Subject: [Patch 3/8] vect: Fix vect_get_smallest_scalar_type for simd clones Content-Language: en-US To: gcc-patches@gcc.gnu.org References: <73b53052-c3a4-4028-2836-ade419431eda@arm.com> Cc: Richard Sandiford , Richard Biener , "jakub@redhat.com" In-Reply-To: <73b53052-c3a4-4028-2836-ade419431eda@arm.com> X-Spam-Status: No, score=-13.9 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_NONE, KAM_DMARC_STATUS, KAM_LAZY_DOMAIN_SECURITY, KAM_LOTSOFHASH, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: "Andre Vieira \(lists\) via Gcc-patches" From: "Andre Vieira (lists)" Reply-To: "Andre Vieira \(lists\)" Errors-To: gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org Sender: "Gcc-patches" The vect_get_smallest_scalar_type helper function was using any argument to a simd clone call when trying to determine the smallest scalar type that would be vectorized. This included the function pointer type in a MASK_CALL for instance, and would result in the wrong type being selected. Instead this patch special cases simd_clone_call's and uses only scalar types of the original function that get transformed into vector types. gcc/ChangeLog: * tree-vect-data-refs.cci (vect_get_smallest_scalar_type): Special case simd clone calls and only use types that are mapped to vectors. * tree-vect-stmts.cc (simd_clone_call_p): New helper function. * tree-vectorizer.h (simd_clone_call_p): Declare new function. gcc/testsuite/ChangeLog: * gcc.dg/vect/vect-simd-clone-16f.c: Remove unnecessary differentation between targets with different pointer sizes. * gcc.dg/vect/vect-simd-clone-17f.c: Likewise. * gcc.dg/vect/vect-simd-clone-18f.c: Likewise. diff --git a/gcc/testsuite/gcc.dg/vect/vect-simd-clone-16f.c b/gcc/testsuite/gcc.dg/vect/vect-simd-clone-16f.c index 574698d3e133ecb8700e698fa42a6b05dd6b8a18..7cd29e894d0502a59fadfe67db2db383133022d3 100644 --- a/gcc/testsuite/gcc.dg/vect/vect-simd-clone-16f.c +++ b/gcc/testsuite/gcc.dg/vect/vect-simd-clone-16f.c @@ -7,9 +7,8 @@ #include "vect-simd-clone-16.c" /* Ensure the the in-branch simd clones are used on targets that support them. - Some targets use pairs of vectors and do twice the calls. */ -/* { dg-final { scan-tree-dump-times {[\n\r] [^\n]* = foo\.simdclone} 2 "vect" { target { ! { { i?86-*-* x86_64-*-* } && { ! lp64 } } } } } } */ -/* { dg-final { scan-tree-dump-times {[\n\r] [^\n]* = foo\.simdclone} 4 "vect" { target { { i?86*-*-* x86_64-*-* } && { ! lp64 } } } } } */ + */ +/* { dg-final { scan-tree-dump-times {[\n\r] [^\n]* = foo\.simdclone} 2 "vect" } } */ /* The LTO test produces two dump files and we scan the wrong one. */ /* { dg-skip-if "" { *-*-* } { "-flto" } { "" } } */ diff --git a/gcc/testsuite/gcc.dg/vect/vect-simd-clone-17f.c b/gcc/testsuite/gcc.dg/vect/vect-simd-clone-17f.c index 8bb6d19301a67a3eebce522daaf7d54d88f708d7..177521dc44531479fca1f1a1a0f2010f30fa3fb5 100644 --- a/gcc/testsuite/gcc.dg/vect/vect-simd-clone-17f.c +++ b/gcc/testsuite/gcc.dg/vect/vect-simd-clone-17f.c @@ -7,9 +7,8 @@ #include "vect-simd-clone-17.c" /* Ensure the the in-branch simd clones are used on targets that support them. - Some targets use pairs of vectors and do twice the calls. */ -/* { dg-final { scan-tree-dump-times {[\n\r] [^\n]* = foo\.simdclone} 2 "vect" { target { ! { { i?86-*-* x86_64-*-* } && { ! lp64 } } } } } } */ -/* { dg-final { scan-tree-dump-times {[\n\r] [^\n]* = foo\.simdclone} 4 "vect" { target { { i?86*-*-* x86_64-*-* } && { ! lp64 } } } } } */ + */ +/* { dg-final { scan-tree-dump-times {[\n\r] [^\n]* = foo\.simdclone} 2 "vect" } } */ /* The LTO test produces two dump files and we scan the wrong one. */ /* { dg-skip-if "" { *-*-* } { "-flto" } { "" } } */ diff --git a/gcc/testsuite/gcc.dg/vect/vect-simd-clone-18f.c b/gcc/testsuite/gcc.dg/vect/vect-simd-clone-18f.c index d34f23f4db8e9c237558cc22fe66b7e02b9e6c20..4dd51381d73c0c7c8ec812f24e5054df038059c5 100644 --- a/gcc/testsuite/gcc.dg/vect/vect-simd-clone-18f.c +++ b/gcc/testsuite/gcc.dg/vect/vect-simd-clone-18f.c @@ -7,9 +7,8 @@ #include "vect-simd-clone-18.c" /* Ensure the the in-branch simd clones are used on targets that support them. - Some targets use pairs of vectors and do twice the calls. */ -/* { dg-final { scan-tree-dump-times {[\n\r] [^\n]* = foo\.simdclone} 2 "vect" { target { ! { { i?86-*-* x86_64-*-* } && { ! lp64 } } } } } } */ -/* { dg-final { scan-tree-dump-times {[\n\r] [^\n]* = foo\.simdclone} 4 "vect" { target { { i?86*-*-* x86_64-*-* } && { ! lp64 } } } } } */ + */ +/* { dg-final { scan-tree-dump-times {[\n\r] [^\n]* = foo\.simdclone} 2 "vect" } } */ /* The LTO test produces two dump files and we scan the wrong one. */ /* { dg-skip-if "" { *-*-* } { "-flto" } { "" } } */ diff --git a/gcc/tree-vect-data-refs.cc b/gcc/tree-vect-data-refs.cc index a3570c45b5209281ac18c1220c3b95398487f389..1bdbea232afc6facddac23269ee3da033eb1ed50 100644 --- a/gcc/tree-vect-data-refs.cc +++ b/gcc/tree-vect-data-refs.cc @@ -119,6 +119,7 @@ tree vect_get_smallest_scalar_type (stmt_vec_info stmt_info, tree scalar_type) { HOST_WIDE_INT lhs, rhs; + cgraph_node *node; /* During the analysis phase, this function is called on arbitrary statements that might not have scalar results. */ @@ -145,6 +146,23 @@ vect_get_smallest_scalar_type (stmt_vec_info stmt_info, tree scalar_type) scalar_type = rhs_type; } } + else if (simd_clone_call_p (stmt_info->stmt, &node)) + { + auto clone = node->simd_clones->simdclone; + for (unsigned int i = 0; i < clone->nargs; ++i) + { + if (clone->args[i].arg_type == SIMD_CLONE_ARG_TYPE_VECTOR) + { + tree arg_scalar_type = TREE_TYPE (clone->args[i].vector_type); + rhs = TREE_INT_CST_LOW (TYPE_SIZE_UNIT (arg_scalar_type)); + if (rhs < lhs) + { + scalar_type = arg_scalar_type; + lhs = rhs; + } + } + } + } else if (gcall *call = dyn_cast (stmt_info->stmt)) { unsigned int i = 0; diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc index 0fe5d0594abc095d3770b5ce4b9f2bad5205ab2f..35207de7acb410358220dbe8d1af82215b5091bf 100644 --- a/gcc/tree-vect-stmts.cc +++ b/gcc/tree-vect-stmts.cc @@ -3965,6 +3965,29 @@ vect_simd_lane_linear (tree op, class loop *loop, } } +bool +simd_clone_call_p (gimple *stmt, cgraph_node **out_node) +{ + gcall *call = dyn_cast (stmt); + if (!call) + return false; + + tree fndecl = NULL_TREE; + if (gimple_call_internal_p (call, IFN_MASK_CALL)) + fndecl = TREE_OPERAND (gimple_call_arg (stmt, 0), 0); + else + fndecl = gimple_call_fndecl (stmt); + + if (fndecl == NULL_TREE) + return false; + + cgraph_node *node = cgraph_node::get (fndecl); + if (out_node) + *out_node = node; + + return node != NULL && node->simd_clones != NULL; +} + /* Function vectorizable_simd_clone_call. Check if STMT_INFO performs a function call that can be vectorized diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h index a65161499ea13f200aa745ca396db663a217b081..69634f7a6032696b394a62fb7ca8986bc78987c8 100644 --- a/gcc/tree-vectorizer.h +++ b/gcc/tree-vectorizer.h @@ -2165,6 +2165,7 @@ extern bool vect_can_advance_ivs_p (loop_vec_info); extern void vect_update_inits_of_drs (loop_vec_info, tree, tree_code); /* In tree-vect-stmts.cc. */ +extern bool simd_clone_call_p (gimple *, struct cgraph_node **node = NULL); extern tree get_related_vectype_for_scalar_type (machine_mode, tree, poly_uint64 = 0); extern tree get_vectype_for_scalar_type (vec_info *, tree, unsigned int = 0); From patchwork Wed Aug 30 09:11:55 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Andre Vieira (lists)" X-Patchwork-Id: 1827665 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.a=rsa-sha256 header.s=default header.b=bcT2rV8d; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4RbJTh5MgWz1yfX for ; Wed, 30 Aug 2023 19:12:24 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id AA6343857711 for ; Wed, 30 Aug 2023 09:12:22 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org AA6343857711 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1693386742; bh=ukL/whRxULUdNHYfPhCY2xeRI3OxTnMTJfPaTCgj4Rs=; h=Date:Subject:To:References:Cc:In-Reply-To:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=bcT2rV8dgp0q9oXONtDP5k0TumkHM3V451vqEbr2VTPuv3ERKT5DDNLyBx1uWcN83 oP/dyQDedDWEHXaFlB+Qwl5UmuYlP7775GwehYbAly28w/wcwCqe1hlazh1OlIxhH0 3ESJ7QGVVSZYXbQkMZ7piWYjYzjVzwWkrBF7hQBw= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id B16AC38582A3 for ; Wed, 30 Aug 2023 09:12:01 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org B16AC38582A3 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id AADC62F4; Wed, 30 Aug 2023 02:12:40 -0700 (PDT) Received: from [10.57.64.216] (unknown [10.57.64.216]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id C60BB3F64C; Wed, 30 Aug 2023 02:12:00 -0700 (PDT) Message-ID: <49eca251-630e-b26c-5d66-4f8b322ee801@arm.com> Date: Wed, 30 Aug 2023 10:11:55 +0100 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101 Thunderbird/102.14.0 Subject: [PATCH 4/8] vect: don't allow fully masked loops with non-masked simd clones [PR 110485] Content-Language: en-US To: gcc-patches@gcc.gnu.org References: <73b53052-c3a4-4028-2836-ade419431eda@arm.com> Cc: Richard Biener In-Reply-To: <73b53052-c3a4-4028-2836-ade419431eda@arm.com> X-Spam-Status: No, score=-14.0 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_NONE, KAM_DMARC_STATUS, KAM_LAZY_DOMAIN_SECURITY, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: "Andre Vieira \(lists\) via Gcc-patches" From: "Andre Vieira (lists)" Reply-To: "Andre Vieira \(lists\)" Errors-To: gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org Sender: "Gcc-patches" When analyzing a loop and choosing a simdclone to use it is possible to choose a simdclone that cannot be used 'inbranch' for a loop that can use partial vectors. This may lead to the vectorizer deciding to use partial vectors which are not supported for notinbranch simd clones. This patch fixes that by disabling the use of partial vectors once a notinbranch simd clone has been selected. gcc/ChangeLog: PR tree-optimization/110485 * tree-vect-stmts.cc (vectorizable_simd_clone_call): Disable partial vectors usage if a notinbranch simdclone has been selected. gcc/testsuite/ChangeLog: * gcc.dg/gomp/pr110485.c: New test. diff --git a/gcc/testsuite/gcc.dg/gomp/pr110485.c b/gcc/testsuite/gcc.dg/gomp/pr110485.c new file mode 100644 index 0000000000000000000000000000000000000000..ba6817a127f40246071e32ccebf692cc4d121d15 --- /dev/null +++ b/gcc/testsuite/gcc.dg/gomp/pr110485.c @@ -0,0 +1,19 @@ +/* PR 110485 */ +/* { dg-do compile } */ +/* { dg-additional-options "-Ofast -fdump-tree-vect-details" } */ +/* { dg-additional-options "-march=znver4 --param=vect-partial-vector-usage=1" { target x86_64-*-* } } */ +#pragma omp declare simd notinbranch uniform(p) +extern double __attribute__ ((const)) bar (double a, double p); + +double a[1024]; +double b[1024]; + +void foo (int n) +{ + #pragma omp simd + for (int i = 0; i < n; ++i) + a[i] = bar (b[i], 71.2); +} + +/* { dg-final { scan-tree-dump-not "MASK_LOAD" "vect" } } */ +/* { dg-final { scan-tree-dump "can't use a fully-masked loop because a non-masked simd clone was selected." "vect" { target x86_64-*-* } } } */ diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc index 35207de7acb410358220dbe8d1af82215b5091bf..664c3b5f7ca48fdb49383fb8a97f407465574479 100644 --- a/gcc/tree-vect-stmts.cc +++ b/gcc/tree-vect-stmts.cc @@ -4349,6 +4349,17 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info, ? boolean_true_node : boolean_false_node; STMT_VINFO_SIMD_CLONE_INFO (stmt_info).safe_push (sll); } + + if (!bestn->simdclone->inbranch) + { + if (dump_enabled_p () + && LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo)) + dump_printf_loc (MSG_NOTE, vect_location, + "can't use a fully-masked loop because a" + " non-masked simd clone was selected.\n"); + LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo) = false; + } + STMT_VINFO_TYPE (stmt_info) = call_simd_clone_vec_info_type; DUMP_VECT_SCOPE ("vectorizable_simd_clone_call"); /* vect_model_simple_cost (vinfo, stmt_info, ncopies, From patchwork Wed Aug 30 09:13:27 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Andre Vieira (lists)" X-Patchwork-Id: 1827666 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.a=rsa-sha256 header.s=default header.b=BDlk4Myf; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=8.43.85.97; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (ip-8-43-85-97.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4RbJWW15Gtz1yfX for ; Wed, 30 Aug 2023 19:13:59 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 2E9843858002 for ; Wed, 30 Aug 2023 09:13:57 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 2E9843858002 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1693386837; bh=go/t5BhZ+sgQiGLM6SuykLLeoVMZUQuS/iqy8MzK9I0=; h=Date:Subject:To:References:Cc:In-Reply-To:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=BDlk4MyfIZrjbEUMWbRGgVEmcHfL7ydESlr2AoBcIwW4ydNSR/FVUki1ojNl5pTP6 COWSLfmog2cGHyHZBxmWYw9hBtF79YuSLPLPQgt3FNjLpsQKQyqqjokArtaHd3JSBL wCuk3xomgZy1+iXmKDdIvwvO6JWanUhMy45ymLT8= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id AD0C23858C30 for ; Wed, 30 Aug 2023 09:13:34 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org AD0C23858C30 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id C6D102F4; Wed, 30 Aug 2023 02:14:13 -0700 (PDT) Received: from [10.57.64.216] (unknown [10.57.64.216]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 74CE23F64C; Wed, 30 Aug 2023 02:13:33 -0700 (PDT) Message-ID: Date: Wed, 30 Aug 2023 10:13:27 +0100 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101 Thunderbird/102.14.0 Subject: [PATCH 5/8] vect: Use inbranch simdclones in masked loops Content-Language: en-US To: gcc-patches@gcc.gnu.org References: <73b53052-c3a4-4028-2836-ade419431eda@arm.com> Cc: Richard Biener , Richard Sandiford , "jakub@redhat.com" In-Reply-To: <73b53052-c3a4-4028-2836-ade419431eda@arm.com> X-Spam-Status: No, score=-14.0 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_NONE, KAM_DMARC_STATUS, KAM_LAZY_DOMAIN_SECURITY, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: "Andre Vieira \(lists\) via Gcc-patches" From: "Andre Vieira (lists)" Reply-To: "Andre Vieira \(lists\)" Errors-To: gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org Sender: "Gcc-patches" This patch enables the compiler to use inbranch simdclones when generating masked loops in autovectorization. gcc/ChangeLog: * omp-simd-clone.cc (simd_clone_adjust_argument_types): Make function compatible with mask parameters in clone. * tree-vect-stmts.cc (vect_convert): New helper function. (vect_build_all_ones_mask): Allow vector boolean typed masks. (vectorizable_simd_clone_call): Enable the use of masked clones in fully masked loops. diff --git a/gcc/omp-simd-clone.cc b/gcc/omp-simd-clone.cc index a42643400ddcf10961633448b49d4caafb999f12..ef0b9b48c7212900023bc0eaebca5e1f9389db77 100644 --- a/gcc/omp-simd-clone.cc +++ b/gcc/omp-simd-clone.cc @@ -807,8 +807,14 @@ simd_clone_adjust_argument_types (struct cgraph_node *node) { ipa_adjusted_param adj; memset (&adj, 0, sizeof (adj)); - tree parm = args[i]; - tree parm_type = node->definition ? TREE_TYPE (parm) : parm; + tree parm = NULL_TREE; + tree parm_type = NULL_TREE; + if(i < args.length()) + { + parm = args[i]; + parm_type = node->definition ? TREE_TYPE (parm) : parm; + } + adj.base_index = i; adj.prev_clone_index = i; @@ -1547,7 +1553,7 @@ simd_clone_adjust (struct cgraph_node *node) mask = gimple_assign_lhs (g); g = gimple_build_assign (make_ssa_name (TREE_TYPE (mask)), BIT_AND_EXPR, mask, - build_int_cst (TREE_TYPE (mask), 1)); + build_one_cst (TREE_TYPE (mask))); gsi_insert_after (&gsi, g, GSI_CONTINUE_LINKING); mask = gimple_assign_lhs (g); } diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc index 664c3b5f7ca48fdb49383fb8a97f407465574479..7217f36a250d549b955c874d7c7644d94982b0b5 100644 --- a/gcc/tree-vect-stmts.cc +++ b/gcc/tree-vect-stmts.cc @@ -1723,6 +1723,20 @@ check_load_store_for_partial_vectors (loop_vec_info loop_vinfo, tree vectype, } } +/* Return SSA name of the result of the conversion of OPERAND into type TYPE. + The conversion statement is inserted at GSI. */ + +static tree +vect_convert (vec_info *vinfo, stmt_vec_info stmt_info, tree type, tree operand, + gimple_stmt_iterator *gsi) +{ + operand = build1 (VIEW_CONVERT_EXPR, type, operand); + gassign *new_stmt = gimple_build_assign (make_ssa_name (type), + operand); + vect_finish_stmt_generation (vinfo, stmt_info, new_stmt, gsi); + return gimple_get_lhs (new_stmt); +} + /* Return the mask input to a masked load or store. VEC_MASK is the vectorized form of the scalar mask condition and LOOP_MASK, if nonnull, is the mask that needs to be applied to all loads and stores in a vectorized loop. @@ -2666,7 +2680,8 @@ vect_build_all_ones_mask (vec_info *vinfo, { if (TREE_CODE (masktype) == INTEGER_TYPE) return build_int_cst (masktype, -1); - else if (TREE_CODE (TREE_TYPE (masktype)) == INTEGER_TYPE) + else if (VECTOR_BOOLEAN_TYPE_P (masktype) + || TREE_CODE (TREE_TYPE (masktype)) == INTEGER_TYPE) { tree mask = build_int_cst (TREE_TYPE (masktype), -1); mask = build_vector_from_val (masktype, mask); @@ -4018,7 +4033,7 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info, size_t i, nargs; tree lhs, rtype, ratype; vec *ret_ctor_elts = NULL; - int arg_offset = 0; + int masked_call_offset = 0; /* Is STMT a vectorizable call? */ gcall *stmt = dyn_cast (stmt_info->stmt); @@ -4033,7 +4048,7 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info, gcc_checking_assert (TREE_CODE (fndecl) == ADDR_EXPR); fndecl = TREE_OPERAND (fndecl, 0); gcc_checking_assert (TREE_CODE (fndecl) == FUNCTION_DECL); - arg_offset = 1; + masked_call_offset = 1; } if (fndecl == NULL_TREE) return false; @@ -4065,7 +4080,7 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info, return false; /* Process function arguments. */ - nargs = gimple_call_num_args (stmt) - arg_offset; + nargs = gimple_call_num_args (stmt) - masked_call_offset; /* Bail out if the function has zero arguments. */ if (nargs == 0) @@ -4083,7 +4098,7 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info, thisarginfo.op = NULL_TREE; thisarginfo.simd_lane_linear = false; - op = gimple_call_arg (stmt, i + arg_offset); + op = gimple_call_arg (stmt, i + masked_call_offset); if (!vect_is_simple_use (op, vinfo, &thisarginfo.dt, &thisarginfo.vectype) || thisarginfo.dt == vect_uninitialized_def) @@ -4161,14 +4176,6 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info, } poly_uint64 vf = LOOP_VINFO_VECT_FACTOR (loop_vinfo); - if (!vf.is_constant ()) - { - if (dump_enabled_p ()) - dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location, - "not considering SIMD clones; not yet supported" - " for variable-width vectors.\n"); - return false; - } unsigned int badness = 0; struct cgraph_node *bestn = NULL; @@ -4181,7 +4188,8 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info, unsigned int this_badness = 0; unsigned int num_calls; if (!constant_multiple_p (vf, n->simdclone->simdlen, &num_calls) - || n->simdclone->nargs != nargs) + || (!n->simdclone->inbranch && (masked_call_offset > 0)) + || nargs != n->simdclone->nargs) continue; if (num_calls != 1) this_badness += exact_log2 (num_calls) * 4096; @@ -4198,7 +4206,8 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info, case SIMD_CLONE_ARG_TYPE_VECTOR: if (!useless_type_conversion_p (n->simdclone->args[i].orig_type, - TREE_TYPE (gimple_call_arg (stmt, i + arg_offset)))) + TREE_TYPE (gimple_call_arg (stmt, + i + masked_call_offset)))) i = -1; else if (arginfo[i].dt == vect_constant_def || arginfo[i].dt == vect_external_def @@ -4243,6 +4252,17 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info, } if (i == (size_t) -1) continue; + if (masked_call_offset == 0 + && n->simdclone->inbranch + && n->simdclone->nargs > nargs) + { + gcc_assert (n->simdclone->args[n->simdclone->nargs - 1].arg_type == + SIMD_CLONE_ARG_TYPE_MASK); + /* Penalize using a masked SIMD clone in a non-masked loop, that is + not in a branch, as we'd have to construct an all-true mask. */ + if (!loop_vinfo || !LOOP_VINFO_FULLY_MASKED_P (loop_vinfo)) + this_badness += 64; + } if (bestn == NULL || this_badness < badness) { bestn = n; @@ -4259,7 +4279,8 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info, || arginfo[i].dt == vect_external_def) && bestn->simdclone->args[i].arg_type == SIMD_CLONE_ARG_TYPE_VECTOR) { - tree arg_type = TREE_TYPE (gimple_call_arg (stmt, i + arg_offset)); + tree arg_type = TREE_TYPE (gimple_call_arg (stmt, + i + masked_call_offset)); arginfo[i].vectype = get_vectype_for_scalar_type (vinfo, arg_type, slp_node); if (arginfo[i].vectype == NULL @@ -4331,24 +4352,38 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info, && TREE_CODE (TREE_TYPE (TREE_TYPE (bestn->decl))) == ARRAY_TYPE) vinfo->any_known_not_updated_vssa = true; STMT_VINFO_SIMD_CLONE_INFO (stmt_info).safe_push (bestn->decl); - for (i = 0; i < nargs; i++) - if ((bestn->simdclone->args[i].arg_type - == SIMD_CLONE_ARG_TYPE_LINEAR_CONSTANT_STEP) - || (bestn->simdclone->args[i].arg_type - == SIMD_CLONE_ARG_TYPE_LINEAR_REF_CONSTANT_STEP)) - { - STMT_VINFO_SIMD_CLONE_INFO (stmt_info).safe_grow_cleared (i * 3 - + 1, - true); - STMT_VINFO_SIMD_CLONE_INFO (stmt_info).safe_push (arginfo[i].op); - tree lst = POINTER_TYPE_P (TREE_TYPE (arginfo[i].op)) - ? size_type_node : TREE_TYPE (arginfo[i].op); - tree ls = build_int_cst (lst, arginfo[i].linear_step); - STMT_VINFO_SIMD_CLONE_INFO (stmt_info).safe_push (ls); - tree sll = arginfo[i].simd_lane_linear - ? boolean_true_node : boolean_false_node; - STMT_VINFO_SIMD_CLONE_INFO (stmt_info).safe_push (sll); - } + + for (i = 0; i < bestn->simdclone->nargs; i++) + { + switch (bestn->simdclone->args[i].arg_type) + { + default: + continue; + case SIMD_CLONE_ARG_TYPE_LINEAR_CONSTANT_STEP: + case SIMD_CLONE_ARG_TYPE_LINEAR_REF_CONSTANT_STEP: + { + auto &clone_info = STMT_VINFO_SIMD_CLONE_INFO (stmt_info); + clone_info.safe_grow_cleared (i * 3 + 1, true); + clone_info.safe_push (arginfo[i].op); + tree lst = POINTER_TYPE_P (TREE_TYPE (arginfo[i].op)) + ? size_type_node : TREE_TYPE (arginfo[i].op); + tree ls = build_int_cst (lst, arginfo[i].linear_step); + clone_info.safe_push (ls); + tree sll = arginfo[i].simd_lane_linear + ? boolean_true_node : boolean_false_node; + clone_info.safe_push (sll); + } + break; + case SIMD_CLONE_ARG_TYPE_MASK: + if (loop_vinfo + && LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo)) + vect_record_loop_mask (loop_vinfo, + &LOOP_VINFO_MASKS (loop_vinfo), + ncopies, vectype, op); + + break; + } + } if (!bestn->simdclone->inbranch) { @@ -4394,6 +4429,8 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info, vec_oprnds_i.safe_grow_cleared (nargs, true); for (j = 0; j < ncopies; ++j) { + poly_uint64 callee_nelements; + poly_uint64 caller_nelements; /* Build argument list for the vectorized call. */ if (j == 0) vargs.create (nargs); @@ -4404,8 +4441,7 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info, { unsigned int k, l, m, o; tree atype; - poly_uint64 callee_nelements, caller_nelements; - op = gimple_call_arg (stmt, i + arg_offset); + op = gimple_call_arg (stmt, i + masked_call_offset); switch (bestn->simdclone->args[i].arg_type) { case SIMD_CLONE_ARG_TYPE_VECTOR: @@ -4482,16 +4518,9 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info, if (k == 1) if (!useless_type_conversion_p (TREE_TYPE (vec_oprnd0), atype)) - { - vec_oprnd0 - = build1 (VIEW_CONVERT_EXPR, atype, vec_oprnd0); - gassign *new_stmt - = gimple_build_assign (make_ssa_name (atype), - vec_oprnd0); - vect_finish_stmt_generation (vinfo, stmt_info, - new_stmt, gsi); - vargs.safe_push (gimple_assign_lhs (new_stmt)); - } + vargs.safe_push (vect_convert (vinfo, stmt_info, + atype, vec_oprnd0, + gsi)); else vargs.safe_push (vec_oprnd0); else @@ -4544,6 +4573,24 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info, vec_oprnds_i[i] = 0; } vec_oprnd0 = vec_oprnds[i][vec_oprnds_i[i]++]; + if (loop_vinfo + && LOOP_VINFO_FULLY_MASKED_P (loop_vinfo)) + { + vec_loop_masks *loop_masks + = &LOOP_VINFO_MASKS (loop_vinfo); + tree loop_mask + = vect_get_loop_mask (loop_vinfo, gsi, + loop_masks, ncopies, + vectype, j); + vec_oprnd0 + = prepare_vec_mask (loop_vinfo, + TREE_TYPE (loop_mask), + loop_mask, vec_oprnd0, + gsi); + loop_vinfo->vec_cond_masked_set.add ({ vec_oprnd0, + loop_mask }); + + } vec_oprnd0 = build3 (VEC_COND_EXPR, atype, vec_oprnd0, build_vector_from_val (atype, one), @@ -4641,6 +4688,64 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info, } } + if (masked_call_offset == 0 + && bestn->simdclone->inbranch + && bestn->simdclone->nargs > nargs) + { + unsigned long m, o; + size_t mask_i = bestn->simdclone->nargs - 1; + tree mask; + gcc_assert (bestn->simdclone->args[mask_i].arg_type == + SIMD_CLONE_ARG_TYPE_MASK); + + tree masktype = bestn->simdclone->args[mask_i].vector_type; + callee_nelements = TYPE_VECTOR_SUBPARTS (masktype); + o = vector_unroll_factor (nunits, callee_nelements); + for (m = j * o; m < (j + 1) * o; m++) + { + if (loop_vinfo && LOOP_VINFO_FULLY_MASKED_P (loop_vinfo)) + { + vec_loop_masks *loop_masks = &LOOP_VINFO_MASKS (loop_vinfo); + mask = vect_get_loop_mask (loop_vinfo, gsi, loop_masks, + ncopies, vectype, j); + } + else + mask = vect_build_all_ones_mask (vinfo, stmt_info, masktype); + + if (!useless_type_conversion_p (TREE_TYPE (mask), masktype)) + { + gassign *new_stmt; + if (bestn->simdclone->mask_mode != VOIDmode) + { + /* This means we are dealing with integer mask modes. + First convert to an integer type with the same size as + the current vector type. */ + unsigned HOST_WIDE_INT intermediate_size + = tree_to_uhwi (TYPE_SIZE (TREE_TYPE (mask))); + tree mid_int_type = + build_nonstandard_integer_type (intermediate_size, 1); + mask = build1 (VIEW_CONVERT_EXPR, mid_int_type, mask); + new_stmt + = gimple_build_assign (make_ssa_name (mid_int_type), + mask); + gsi_insert_before (gsi, new_stmt, GSI_SAME_STMT); + /* Then zero-extend to the mask mode. */ + mask = fold_build1 (NOP_EXPR, masktype, + gimple_get_lhs (new_stmt)); + } + else + mask = build1 (VIEW_CONVERT_EXPR, masktype, mask); + + new_stmt = gimple_build_assign (make_ssa_name (masktype), + mask); + vect_finish_stmt_generation (vinfo, stmt_info, + new_stmt, gsi); + mask = gimple_assign_lhs (new_stmt); + } + vargs.safe_push (mask); + } + } + gcall *new_call = gimple_build_call_vec (fndecl, vargs); if (vec_dest) { @@ -4659,13 +4764,13 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info, if (vec_dest) { - if (!multiple_p (TYPE_VECTOR_SUBPARTS (vectype), nunits)) + caller_nelements = TYPE_VECTOR_SUBPARTS (vectype); + if (!multiple_p (caller_nelements, nunits)) { unsigned int k, l; poly_uint64 prec = GET_MODE_BITSIZE (TYPE_MODE (vectype)); poly_uint64 bytes = GET_MODE_SIZE (TYPE_MODE (vectype)); - k = vector_unroll_factor (nunits, - TYPE_VECTOR_SUBPARTS (vectype)); + k = vector_unroll_factor (nunits, caller_nelements); gcc_assert ((k & (k - 1)) == 0); for (l = 0; l < k; l++) { @@ -4691,11 +4796,11 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info, vect_clobber_variable (vinfo, stmt_info, gsi, new_temp); continue; } - else if (!multiple_p (nunits, TYPE_VECTOR_SUBPARTS (vectype))) + else if (!multiple_p (nunits, caller_nelements)) { unsigned int k; - if (!constant_multiple_p (TYPE_VECTOR_SUBPARTS (rtype), - TYPE_VECTOR_SUBPARTS (vectype), &k)) + if (!constant_multiple_p (caller_nelements, + TYPE_VECTOR_SUBPARTS (rtype), &k)) gcc_unreachable (); gcc_assert ((k & (k - 1)) == 0); if ((j & (k - 1)) == 0) From patchwork Wed Aug 30 09:14:38 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Andre Vieira (lists)" X-Patchwork-Id: 1827667 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.a=rsa-sha256 header.s=default header.b=ZnlkJ/5d; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4RbJXl4CkTz1ygM for ; Wed, 30 Aug 2023 19:15:03 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 952F53858409 for ; Wed, 30 Aug 2023 09:15:01 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org 952F53858409 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1693386901; bh=OUXloNw3aREKP+/BRbRaqMhrdbzQgU1BRQnIVkM1654=; h=Date:Subject:To:References:In-Reply-To:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To: From; b=ZnlkJ/5dkwyFiVgnZfAocTakf7hcIAzss1KFdq681f2C2rPn0O/LGz/CmtyEv3COS PjFzXhx67Gpt2cHBrvbGbO4NnWu3tH+iz/GwSTt0GwvmnZlo959qF4jDzhJn4csVEg GVWg5zRs/4c1StwMX2N0RUXUFfleRnaRvyb+vyHM= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id B33DF3858C30 for ; Wed, 30 Aug 2023 09:14:40 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org B33DF3858C30 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 8B87EFEC for ; Wed, 30 Aug 2023 02:15:19 -0700 (PDT) Received: from [10.57.64.216] (unknown [10.57.64.216]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id CBE003F64C for ; Wed, 30 Aug 2023 02:14:39 -0700 (PDT) Message-ID: <4eda2924-2fe1-63ed-d6c5-2bdea8fd34d3@arm.com> Date: Wed, 30 Aug 2023 10:14:38 +0100 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101 Thunderbird/102.14.0 Subject: [PATCH 6/8] vect: Add vector_mode paramater to simd_clone_usable Content-Language: en-US To: gcc-patches@gcc.gnu.org References: <73b53052-c3a4-4028-2836-ade419431eda@arm.com> In-Reply-To: <73b53052-c3a4-4028-2836-ade419431eda@arm.com> X-Spam-Status: No, score=-13.9 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_NONE, KAM_DMARC_STATUS, KAM_LAZY_DOMAIN_SECURITY, KAM_LOTSOFHASH, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: "Andre Vieira \(lists\) via Gcc-patches" From: "Andre Vieira (lists)" Reply-To: "Andre Vieira \(lists\)" Errors-To: gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org Sender: "Gcc-patches" This patch adds a machine_mode parameter to the TARGET_SIMD_CLONE_USABLE hook to enable rejecting SVE modes when the target architecture does not support SVE. gcc/ChangeLog: * config/aarch64/aarch64.cc (aarch64_simd_clone_usable): Add mode parameter and use to to reject SVE modes when target architecture does not support SVE. * config/gcn/gcn.cc (gcn_simd_clone_usable): Add unused mode parameter. * config/i386/i386.cc (ix86_simd_clone_usable): Likewise. * doc/tm.texi (TARGET_SIMD_CLONE_USABLE): Document new parameter. * target.def (usable): Add new parameter. * tree-vect-stmts.cc (vectorizable_simd_clone_call): Pass vector mode to TARGET_SIMD_CLONE_CALL hook. diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc index 5fb4c863d875871d6de865e72ce360506a3694d2..a13d3fba05f9f9d2989b36c681bc77d71e943e0d 100644 --- a/gcc/config/aarch64/aarch64.cc +++ b/gcc/config/aarch64/aarch64.cc @@ -27498,12 +27498,18 @@ aarch64_simd_clone_adjust (struct cgraph_node *node) /* Implement TARGET_SIMD_CLONE_USABLE. */ static int -aarch64_simd_clone_usable (struct cgraph_node *node) +aarch64_simd_clone_usable (struct cgraph_node *node, machine_mode vector_mode) { switch (node->simdclone->vecsize_mangle) { case 'n': - if (!TARGET_SIMD) + if (!TARGET_SIMD + || aarch64_sve_mode_p (vector_mode)) + return -1; + return 0; + case 's': + if (!TARGET_SVE + || !aarch64_sve_mode_p (vector_mode)) return -1; return 0; default: diff --git a/gcc/config/gcn/gcn.cc b/gcc/config/gcn/gcn.cc index 02f4dedec4214b1eea9e6f5057ed57d7e0db316a..252676273f06500c99df6ae251f0406c618df891 100644 --- a/gcc/config/gcn/gcn.cc +++ b/gcc/config/gcn/gcn.cc @@ -5599,7 +5599,8 @@ gcn_simd_clone_adjust (struct cgraph_node *ARG_UNUSED (node)) /* Implement TARGET_SIMD_CLONE_USABLE. */ static int -gcn_simd_clone_usable (struct cgraph_node *ARG_UNUSED (node)) +gcn_simd_clone_usable (struct cgraph_node *ARG_UNUSED (node), + machine_mode ARG_UNUSED (mode)) { /* We don't need to do anything here because gcn_simd_clone_compute_vecsize_and_simdlen currently only returns one diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc index 5d57726e22cea8bcaa8ac8b1b25ac420193f39bb..84f0d5a7cb679e6be92001f59802276635506e97 100644 --- a/gcc/config/i386/i386.cc +++ b/gcc/config/i386/i386.cc @@ -24379,7 +24379,8 @@ ix86_simd_clone_compute_vecsize_and_simdlen (struct cgraph_node *node, slightly less desirable, etc.). */ static int -ix86_simd_clone_usable (struct cgraph_node *node) +ix86_simd_clone_usable (struct cgraph_node *node, + machine_mode mode ATTRIBUTE_UNUSED) { switch (node->simdclone->vecsize_mangle) { diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi index 95ba56e05ae4a0f11639cc4a21d6736c53ad5ef1..bde22e562ebb9069122eb3b142ab8f4a4ae56a3a 100644 --- a/gcc/doc/tm.texi +++ b/gcc/doc/tm.texi @@ -6336,11 +6336,13 @@ This hook should add implicit @code{attribute(target("..."))} attribute to SIMD clone @var{node} if needed. @end deftypefn -@deftypefn {Target Hook} int TARGET_SIMD_CLONE_USABLE (struct cgraph_node *@var{}) +@deftypefn {Target Hook} int TARGET_SIMD_CLONE_USABLE (struct cgraph_node *@var{}, @var{machine_mode}) This hook should return -1 if SIMD clone @var{node} shouldn't be used -in vectorized loops in current function, or non-negative number if it is -usable. In that case, the smaller the number is, the more desirable it is -to use it. +in vectorized loops being vectorized with mode @var{m} in current function, or +non-negative number if it is usable. In that case, the smaller the number is, +the more desirable it is to use it. +@end deftypefn + @end deftypefn @deftypefn {Target Hook} int TARGET_SIMT_VF (void) diff --git a/gcc/target.def b/gcc/target.def index 7d684296c17897b4ceecb31c5de1ae8665a8228e..6a0cbc454526ee29011451b570354bf234a4eabd 100644 --- a/gcc/target.def +++ b/gcc/target.def @@ -1645,10 +1645,11 @@ void, (struct cgraph_node *), NULL) DEFHOOK (usable, "This hook should return -1 if SIMD clone @var{node} shouldn't be used\n\ -in vectorized loops in current function, or non-negative number if it is\n\ -usable. In that case, the smaller the number is, the more desirable it is\n\ -to use it.", -int, (struct cgraph_node *), NULL) +in vectorized loops being vectorized with mode @var{m} in current function, or\n\ +non-negative number if it is usable. In that case, the smaller the number is,\n\ +the more desirable it is to use it.", +int, (struct cgraph_node *, machine_mode), NULL) + HOOK_VECTOR_END (simd_clone) diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc index 7217f36a250d549b955c874d7c7644d94982b0b5..dc2fc20ef9fe777132308c9e33f7731d62717466 100644 --- a/gcc/tree-vect-stmts.cc +++ b/gcc/tree-vect-stmts.cc @@ -4195,7 +4195,7 @@ vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info, this_badness += exact_log2 (num_calls) * 4096; if (n->simdclone->inbranch) this_badness += 8192; - int target_badness = targetm.simd_clone.usable (n); + int target_badness = targetm.simd_clone.usable (n, vinfo->vector_mode); if (target_badness < 0) continue; this_badness += target_badness * 512; From patchwork Wed Aug 30 09:17:39 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Andre Vieira (lists)" X-Patchwork-Id: 1827668 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.a=rsa-sha256 header.s=default header.b=PpBZDyet; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4RbJcF61Dwz1yZs for ; Wed, 30 Aug 2023 19:18:05 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id D4D51385558C for ; Wed, 30 Aug 2023 09:18:03 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org D4D51385558C DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1693387083; bh=7gjs5Kq+Wwa+VNFeI2QPhrB+oXDd0GatUWD4bpqXrZc=; h=Date:Subject:To:References:Cc:In-Reply-To:List-Id: List-Unsubscribe:List-Archive:List-Post:List-Help:List-Subscribe: From:Reply-To:From; b=PpBZDyetNC3nIMy/dtA8pPmUpW2ga5oxBqKkZ5KXfRWev5M1bVegJ85RdKFUh3IMu Z5FTv24k4L0WzUysPkg8nWp07X8bYZSD+e8wn/248GA3qVWN29DTG9ugq8DknOwQUk HqVoWawFN65iV6jHEki/X1QycrpfRGlqa4vIbE5k= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id 1784F3857714 for ; Wed, 30 Aug 2023 09:17:42 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 1784F3857714 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 3F60E2F4; Wed, 30 Aug 2023 02:18:21 -0700 (PDT) Received: from [10.57.64.216] (unknown [10.57.64.216]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 0BF033F64C; Wed, 30 Aug 2023 02:17:40 -0700 (PDT) Message-ID: Date: Wed, 30 Aug 2023 10:17:39 +0100 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101 Thunderbird/102.14.0 Subject: [PATCH7/8] vect: Add TARGET_SIMD_CLONE_ADJUST_RET_OR_PARAM Content-Language: en-US To: gcc-patches@gcc.gnu.org References: <73b53052-c3a4-4028-2836-ade419431eda@arm.com> Cc: Richard Biener , Richard Sandiford , "jakub@redhat.com" In-Reply-To: <73b53052-c3a4-4028-2836-ade419431eda@arm.com> X-Spam-Status: No, score=-13.9 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_NONE, KAM_DMARC_STATUS, KAM_LAZY_DOMAIN_SECURITY, KAM_LOTSOFHASH, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: "Andre Vieira \(lists\) via Gcc-patches" From: "Andre Vieira (lists)" Reply-To: "Andre Vieira \(lists\)" Errors-To: gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org Sender: "Gcc-patches" This patch adds a new target hook to enable us to adapt the types of return and parameters of simd clones. We use this in two ways, the first one is to make sure we can create valid SVE types, including the SVE type attribute, when creating a SVE simd clone, even when the target options do not support SVE. We are following the same behaviour seen with x86 that creates simd clones according to the ABI rules when no simdlen is provided, even if that simdlen is not supported by the current target options. Note that this doesn't mean the simd clone will be used in auto-vectorization. gcc/ChangeLog: (TARGET_SIMD_CLONE_ADJUST_RET_OR_PARAM): Define. * doc/tm.texi (TARGET_SIMD_CLONE_ADJUST_RET_OR_PARAM): Document. * doc/tm.texi.in (TARGET_SIMD_CLONE_ADJUST_RET_OR_PARAM): New. * omp-simd-clone.cc (simd_adjust_return_type): Call new hook. (simd_clone_adjust_argument_types): Likewise. * target.def (adjust_ret_or_param): New hook. * targhooks.cc (default_simd_clone_adjust_ret_or_param): New. * targhooks.h (default_simd_clone_adjust_ret_or_param): New. diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi index bde22e562ebb9069122eb3b142ab8f4a4ae56a3a..b80c09ec36d51f1bb55b14229f46207fb4457223 100644 --- a/gcc/doc/tm.texi +++ b/gcc/doc/tm.texi @@ -6343,6 +6343,9 @@ non-negative number if it is usable. In that case, the smaller the number is, the more desirable it is to use it. @end deftypefn +@deftypefn {Target Hook} tree TARGET_SIMD_CLONE_ADJUST_RET_OR_PARAM (struct cgraph_node *@var{}, @var{tree}, @var{bool}) +If defined, this hook should adjust the type of the return or parameter +@var{type} to be used by the simd clone @var{node}. @end deftypefn @deftypefn {Target Hook} int TARGET_SIMT_VF (void) diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in index 4ac96dc357d35e0e57bb43a41d1b1a4f66d05946..7496a32d84f7c422fe7ea88215ee72f3c354a3f4 100644 --- a/gcc/doc/tm.texi.in +++ b/gcc/doc/tm.texi.in @@ -4211,6 +4211,8 @@ address; but often a machine-dependent strategy can generate better code. @hook TARGET_SIMD_CLONE_USABLE +@hook TARGET_SIMD_CLONE_ADJUST_RET_OR_PARAM + @hook TARGET_SIMT_VF @hook TARGET_OMP_DEVICE_KIND_ARCH_ISA diff --git a/gcc/omp-simd-clone.cc b/gcc/omp-simd-clone.cc index ef0b9b48c7212900023bc0eaebca5e1f9389db77..c2fd4d3be878e56b6394e34097d2de826a0ba1ff 100644 --- a/gcc/omp-simd-clone.cc +++ b/gcc/omp-simd-clone.cc @@ -736,6 +736,7 @@ simd_clone_adjust_return_type (struct cgraph_node *node) t = build_array_type_nelts (t, exact_div (node->simdclone->simdlen, veclen)); } + t = targetm.simd_clone.adjust_ret_or_param (node, t, false); TREE_TYPE (TREE_TYPE (fndecl)) = t; if (!node->definition) return NULL_TREE; @@ -748,6 +749,7 @@ simd_clone_adjust_return_type (struct cgraph_node *node) tree atype = build_array_type_nelts (orig_rettype, node->simdclone->simdlen); + atype = targetm.simd_clone.adjust_ret_or_param (node, atype, false); if (maybe_ne (veclen, node->simdclone->simdlen)) return build1 (VIEW_CONVERT_EXPR, atype, t); @@ -880,6 +882,8 @@ simd_clone_adjust_argument_types (struct cgraph_node *node) ? IDENTIFIER_POINTER (DECL_NAME (parm)) : NULL, parm_type, sc->simdlen); } + adj.type = targetm.simd_clone.adjust_ret_or_param (node, adj.type, + false); vec_safe_push (new_params, adj); } @@ -912,6 +916,8 @@ simd_clone_adjust_argument_types (struct cgraph_node *node) adj.type = build_vector_type (pointer_sized_int_node, veclen); else adj.type = build_vector_type (base_type, veclen); + adj.type = targetm.simd_clone.adjust_ret_or_param (node, adj.type, + true); vec_safe_push (new_params, adj); k = vector_unroll_factor (sc->simdlen, veclen); @@ -937,6 +943,7 @@ simd_clone_adjust_argument_types (struct cgraph_node *node) sc->args[i].simd_array = NULL_TREE; } sc->args[i].orig_type = base_type; + sc->args[i].vector_type = adj.type; sc->args[i].arg_type = SIMD_CLONE_ARG_TYPE_MASK; sc->args[i].vector_type = adj.type; } diff --git a/gcc/target.def b/gcc/target.def index 6a0cbc454526ee29011451b570354bf234a4eabd..665083ce035da03b40b15f23684ccdacce33c9d3 100644 --- a/gcc/target.def +++ b/gcc/target.def @@ -1650,6 +1650,13 @@ non-negative number if it is usable. In that case, the smaller the number is,\n the more desirable it is to use it.", int, (struct cgraph_node *, machine_mode), NULL) +DEFHOOK +(adjust_ret_or_param, +"If defined, this hook should adjust the type of the return or parameter\n\ +@var{type} to be used by the simd clone @var{node}.", +tree, (struct cgraph_node *, tree, bool), +default_simd_clone_adjust_ret_or_param) + HOOK_VECTOR_END (simd_clone) diff --git a/gcc/targhooks.h b/gcc/targhooks.h index 1a0db8dddd594d9b1fb04ae0d9a66ad6b7a396dc..558157514814228ef2ed41ae0861e1c088eea9ef 100644 --- a/gcc/targhooks.h +++ b/gcc/targhooks.h @@ -75,6 +75,9 @@ extern void default_print_operand (FILE *, rtx, int); extern void default_print_operand_address (FILE *, machine_mode, rtx); extern bool default_print_operand_punct_valid_p (unsigned char); extern tree default_mangle_assembler_name (const char *); +extern tree default_simd_clone_adjust_ret_or_param + (struct cgraph_node *,tree , bool); + extern machine_mode default_translate_mode_attribute (machine_mode); extern bool default_scalar_mode_supported_p (scalar_mode); diff --git a/gcc/targhooks.cc b/gcc/targhooks.cc index e190369f87a92e6a92372dc348d9374c3a965c0a..6b6f6132c6dc62b92ad8d448d63ca6041386788f 100644 --- a/gcc/targhooks.cc +++ b/gcc/targhooks.cc @@ -399,6 +399,16 @@ default_mangle_assembler_name (const char *name ATTRIBUTE_UNUSED) return get_identifier (stripped); } +/* The default implementation of TARGET_SIMD_CLONE_ADJUST_RET_OR_PARAM. */ + +tree +default_simd_clone_adjust_ret_or_param (struct cgraph_node *node ATTRIBUTE_UNUSED, + tree type, + bool is_return ATTRIBUTE_UNUSED) +{ + return type; +} + /* The default implementation of TARGET_TRANSLATE_MODE_ATTRIBUTE. */ machine_mode From patchwork Wed Aug 30 09:19:46 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Andre Vieira (lists)" X-Patchwork-Id: 1827669 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.a=rsa-sha256 header.s=default header.b=vKQfNuJk; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4RbJfr60Nkz1ygM for ; Wed, 30 Aug 2023 19:20:20 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id DB74E3858428 for ; Wed, 30 Aug 2023 09:20:18 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 sourceware.org DB74E3858428 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gcc.gnu.org; s=default; t=1693387218; bh=UiF55yh1WGL1MumlWOZqY6dEZYIK1OW3QecIm2mkRfs=; h=Date:Subject:To:References:In-Reply-To:List-Id:List-Unsubscribe: List-Archive:List-Post:List-Help:List-Subscribe:From:Reply-To:Cc: From; b=vKQfNuJk69f3mnFqr0JGfpj3YLrlKlqGM0BcoUNIqyJwilapuU/lhUtMmAOJcUzca AC25cTpFKJrilzRCS2iTYDa8biLhESORdpV6zCfIhqBX+SDBKNrrMsGd+HPxwphTzD itvgfQzzJC3lVZ0j5hYMce7nJDK7hqUBgSgrDFYs= X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by sourceware.org (Postfix) with ESMTP id A43B73858402 for ; Wed, 30 Aug 2023 09:19:54 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org A43B73858402 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id A56632F4; Wed, 30 Aug 2023 02:20:33 -0700 (PDT) Received: from [10.57.64.216] (unknown [10.57.64.216]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 79C043F64C; Wed, 30 Aug 2023 02:19:53 -0700 (PDT) Message-ID: <25cccf6c-1b3c-a032-7930-aba25a311dca@arm.com> Date: Wed, 30 Aug 2023 10:19:46 +0100 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:102.0) Gecko/20100101 Thunderbird/102.14.0 Subject: [PATCH 8/8] aarch64: Add SVE support for simd clones [PR 96342] Content-Language: en-US To: gcc-patches@gcc.gnu.org References: <73b53052-c3a4-4028-2836-ade419431eda@arm.com> In-Reply-To: <73b53052-c3a4-4028-2836-ade419431eda@arm.com> X-Spam-Status: No, score=-13.9 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_NONE, KAM_DMARC_STATUS, KAM_LAZY_DOMAIN_SECURITY, KAM_LOTSOFHASH, KAM_SHORT, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-Patchwork-Original-From: "Andre Vieira \(lists\) via Gcc-patches" From: "Andre Vieira (lists)" Reply-To: "Andre Vieira \(lists\)" Cc: Richard Sandiford Errors-To: gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org Sender: "Gcc-patches" This patch finalizes adding support for the generation of SVE simd clones when no simdlen is provided, following the ABI rules where the widest data type determines the minimum amount of elements in a length agnostic vector. gcc/ChangeLog: * config/aarch64/aarch64-protos.h (add_sve_type_attribute): Declare. * config/aarch64/aarch64-sve-builtins.cc (add_sve_type_attribute): Make visibility global. * config/aarch64/aarch64.cc (aarch64_fntype_abi): Ensure SVE ABI is chosen over SIMD ABI if a SVE type is used in return or arguments. (aarch64_simd_clone_compute_vecsize_and_simdlen): Create VLA simd clone when no simdlen is provided, according to ABI rules. (aarch64_simd_clone_adjust): Add '+sve' attribute to SVE simd clones. (aarch64_simd_clone_adjust_ret_or_param): New. (TARGET_SIMD_CLONE_ADJUST_RET_OR_PARAM): Define. * omp-simd-clone.cc (simd_clone_mangle): Print 'x' for VLA simdlen. (simd_clone_adjust): Adapt safelen check to be compatible with VLA simdlen. gcc/testsuite/ChangeLog: * c-c++-common/gomp/declare-variant-14.c: Adapt aarch64 scan. * gfortran.dg/gomp/declare-variant-14.f90: Likewise. * gcc.target/aarch64/declare-simd-1.c: Remove warning checks where no longer necessary. * gcc.target/aarch64/declare-simd-2.c: Add SVE clone scan. diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h index 70303d6fd953e0c397b9138ede8858c2db2e53db..d7888c95a4999fad1a4c55d5cd2287c2040302c8 100644 --- a/gcc/config/aarch64/aarch64-protos.h +++ b/gcc/config/aarch64/aarch64-protos.h @@ -1001,6 +1001,8 @@ namespace aarch64_sve { #ifdef GCC_TARGET_H bool verify_type_context (location_t, type_context_kind, const_tree, bool); #endif + void add_sve_type_attribute (tree, unsigned int, unsigned int, + const char *, const char *); } extern void aarch64_split_combinev16qi (rtx operands[3]); diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc b/gcc/config/aarch64/aarch64-sve-builtins.cc index 161a14edde7c9fb1b13b146cf50463e2d78db264..6f99c438d10daa91b7e3b623c995489f1a8a0f4c 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins.cc +++ b/gcc/config/aarch64/aarch64-sve-builtins.cc @@ -569,14 +569,16 @@ static bool reported_missing_registers_p; /* Record that TYPE is an ABI-defined SVE type that contains NUM_ZR SVE vectors and NUM_PR SVE predicates. MANGLED_NAME, if nonnull, is the ABI-defined mangling of the type. ACLE_NAME is the name of the type. */ -static void +void add_sve_type_attribute (tree type, unsigned int num_zr, unsigned int num_pr, const char *mangled_name, const char *acle_name) { tree mangled_name_tree = (mangled_name ? get_identifier (mangled_name) : NULL_TREE); + tree acle_name_tree + = (acle_name ? get_identifier (acle_name) : NULL_TREE); - tree value = tree_cons (NULL_TREE, get_identifier (acle_name), NULL_TREE); + tree value = tree_cons (NULL_TREE, acle_name_tree, NULL_TREE); value = tree_cons (NULL_TREE, mangled_name_tree, value); value = tree_cons (NULL_TREE, size_int (num_pr), value); value = tree_cons (NULL_TREE, size_int (num_zr), value); diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc index a13d3fba05f9f9d2989b36c681bc77d71e943e0d..492acb9ce081866162faa8dfca777e4cb943797f 100644 --- a/gcc/config/aarch64/aarch64.cc +++ b/gcc/config/aarch64/aarch64.cc @@ -4034,13 +4034,13 @@ aarch64_takes_arguments_in_sve_regs_p (const_tree fntype) static const predefined_function_abi & aarch64_fntype_abi (const_tree fntype) { - if (lookup_attribute ("aarch64_vector_pcs", TYPE_ATTRIBUTES (fntype))) - return aarch64_simd_abi (); - if (aarch64_returns_value_in_sve_regs_p (fntype) || aarch64_takes_arguments_in_sve_regs_p (fntype)) return aarch64_sve_abi (); + if (lookup_attribute ("aarch64_vector_pcs", TYPE_ATTRIBUTES (fntype))) + return aarch64_simd_abi (); + return default_function_abi; } @@ -27327,7 +27327,7 @@ aarch64_simd_clone_compute_vecsize_and_simdlen (struct cgraph_node *node, int num, bool explicit_p) { tree t, ret_type; - unsigned int nds_elt_bits; + unsigned int nds_elt_bits, wds_elt_bits; int count; unsigned HOST_WIDE_INT const_simdlen; poly_uint64 vec_bits; @@ -27374,10 +27374,14 @@ aarch64_simd_clone_compute_vecsize_and_simdlen (struct cgraph_node *node, if (TREE_CODE (ret_type) != VOID_TYPE) { nds_elt_bits = lane_size (SIMD_CLONE_ARG_TYPE_VECTOR, ret_type); + wds_elt_bits = nds_elt_bits; vec_elts.safe_push (std::make_pair (ret_type, nds_elt_bits)); } else - nds_elt_bits = POINTER_SIZE; + { + nds_elt_bits = POINTER_SIZE; + wds_elt_bits = 0; + } int i; tree type_arg_types = TYPE_ARG_TYPES (TREE_TYPE (node->decl)); @@ -27385,30 +27389,36 @@ aarch64_simd_clone_compute_vecsize_and_simdlen (struct cgraph_node *node, for (t = (decl_arg_p ? DECL_ARGUMENTS (node->decl) : type_arg_types), i = 0; t && t != void_list_node; t = TREE_CHAIN (t), i++) { - tree arg_type = decl_arg_p ? TREE_TYPE (t) : TREE_VALUE (t); + tree type = decl_arg_p ? TREE_TYPE (t) : TREE_VALUE (t); if (clonei->args[i].arg_type != SIMD_CLONE_ARG_TYPE_UNIFORM - && !supported_simd_type (arg_type)) + && !supported_simd_type (type)) { if (!explicit_p) ; - else if (COMPLEX_FLOAT_TYPE_P (ret_type)) + else if (COMPLEX_FLOAT_TYPE_P (type)) warning_at (DECL_SOURCE_LOCATION (node->decl), 0, "GCC does not currently support argument type %qT " - "for simd", arg_type); + "for simd", type); else warning_at (DECL_SOURCE_LOCATION (node->decl), 0, "unsupported argument type %qT for simd", - arg_type); + type); return 0; } - unsigned lane_bits = lane_size (clonei->args[i].arg_type, arg_type); + unsigned lane_bits = lane_size (clonei->args[i].arg_type, type); if (clonei->args[i].arg_type == SIMD_CLONE_ARG_TYPE_VECTOR) - vec_elts.safe_push (std::make_pair (arg_type, lane_bits)); + vec_elts.safe_push (std::make_pair (type, lane_bits)); if (nds_elt_bits > lane_bits) nds_elt_bits = lane_bits; + else if (wds_elt_bits < lane_bits) + wds_elt_bits = lane_bits; } - clonei->vecsize_mangle = 'n'; + /* If we could not determine the WDS type from available parameters/return, + then fallback to using uintptr_t. */ + if (wds_elt_bits == 0) + wds_elt_bits = POINTER_SIZE; + clonei->mask_mode = VOIDmode; poly_uint64 simdlen; auto_vec simdlens (2); @@ -27419,6 +27429,11 @@ aarch64_simd_clone_compute_vecsize_and_simdlen (struct cgraph_node *node, simdlen = exact_div (poly_uint64 (64), nds_elt_bits); simdlens.safe_push (simdlen); simdlens.safe_push (simdlen * 2); + /* Only create a SVE simd clone if we aren't dealing with an unprototyped + function. */ + if (DECL_ARGUMENTS (node->decl) != 0 + || type_arg_types != 0) + simdlens.safe_push (exact_div (poly_uint64 (128, 128), wds_elt_bits)); } else simdlens.safe_push (clonei->simdlen); @@ -27439,19 +27454,20 @@ aarch64_simd_clone_compute_vecsize_and_simdlen (struct cgraph_node *node, while (j < count && !simdlens.is_empty ()) { bool remove_simdlen = false; - for (auto elt : vec_elts) - if (known_gt (simdlens[j] * elt.second, 128U)) - { - /* Don't issue a warning for every simdclone when there is no - specific simdlen clause. */ - if (explicit_p && known_ne (clonei->simdlen, 0U)) - warning_at (DECL_SOURCE_LOCATION (node->decl), 0, - "GCC does not currently support simdlen %wd for " - "type %qT", - constant_lower_bound (simdlens[j]), elt.first); - remove_simdlen = true; - break; - } + if (simdlens[j].is_constant ()) + for (auto elt : vec_elts) + if (known_gt (simdlens[j] * elt.second, 128U)) + { + /* Don't issue a warning for every simdclone when there is no + specific simdlen clause. */ + if (explicit_p && known_ne (clonei->simdlen, 0U)) + warning_at (DECL_SOURCE_LOCATION (node->decl), 0, + "GCC does not currently support simdlen %wd for " + "type %qT", + constant_lower_bound (simdlens[j]), elt.first); + remove_simdlen = true; + break; + } if (remove_simdlen) { count--; @@ -27479,6 +27495,13 @@ aarch64_simd_clone_compute_vecsize_and_simdlen (struct cgraph_node *node, gcc_assert (num < count); clonei->simdlen = simdlens[num]; + if (clonei->simdlen.is_constant ()) + clonei->vecsize_mangle = 'n'; + else + { + clonei->vecsize_mangle = 's'; + clonei->inbranch = 1; + } return count; } @@ -27493,6 +27516,11 @@ aarch64_simd_clone_adjust (struct cgraph_node *node) tree t = TREE_TYPE (node->decl); TYPE_ATTRIBUTES (t) = make_attribute ("aarch64_vector_pcs", "default", TYPE_ATTRIBUTES (t)); + if (node->simdclone->vecsize_mangle == 's') + { + tree target = build_string (strlen ("+sve"), "+sve"); + aarch64_option_valid_attribute_p (node->decl, NULL_TREE, target, 0); + } } /* Implement TARGET_SIMD_CLONE_USABLE. */ @@ -27517,6 +27545,57 @@ aarch64_simd_clone_usable (struct cgraph_node *node, machine_mode vector_mode) } } +/* Implement TARGET_SIMD_CLONE_ADJUST_RET_OR_PARAM. */ + +static tree +aarch64_simd_clone_adjust_ret_or_param (cgraph_node *node, tree type, + bool is_mask) +{ + if (type + && VECTOR_TYPE_P (type) + && node->simdclone->vecsize_mangle == 's') + { + cl_target_option cur_target; + cl_target_option_save (&cur_target, &global_options, &global_options_set); + tree new_target = DECL_FUNCTION_SPECIFIC_TARGET (node->decl); + cl_target_option_restore (&global_options, &global_options_set, + TREE_TARGET_OPTION (new_target)); + aarch64_override_options_internal (&global_options); + bool m_old_have_regs_of_mode[MAX_MACHINE_MODE]; + memcpy (m_old_have_regs_of_mode, have_regs_of_mode, + sizeof (have_regs_of_mode)); + for (int i = 0; i < NUM_MACHINE_MODES; ++i) + if (aarch64_sve_mode_p ((machine_mode) i)) + have_regs_of_mode[i] = true; + poly_uint16 old_sve_vg = aarch64_sve_vg; + if (!node->simdclone->simdlen.is_constant ()) + aarch64_sve_vg = poly_uint16 (2, 2); + unsigned int num_zr = 0; + unsigned int num_pr = 0; + type = TREE_TYPE (type); + type = build_vector_type (type, node->simdclone->simdlen); + if (is_mask) + { + type = truth_type_for (type); + num_pr = 1; + } + else + num_zr = 1; + + aarch64_sve::add_sve_type_attribute (type, num_zr, num_pr, NULL, NULL); + cl_target_option_restore (&global_options, &global_options_set, &cur_target); + aarch64_override_options_internal (&global_options); + memcpy (have_regs_of_mode, m_old_have_regs_of_mode, + sizeof (have_regs_of_mode)); + aarch64_sve_vg = old_sve_vg; + } + else if (type + && VECTOR_TYPE_P (type) + && is_mask) + type = truth_type_for (type); + return type; +} + /* Implement TARGET_COMP_TYPE_ATTRIBUTES */ static int @@ -28590,6 +28669,10 @@ aarch64_libgcc_floating_mode_supported_p #undef TARGET_SIMD_CLONE_ADJUST #define TARGET_SIMD_CLONE_ADJUST aarch64_simd_clone_adjust +#undef TARGET_SIMD_CLONE_ADJUST_RET_OR_PARAM +#define TARGET_SIMD_CLONE_ADJUST_RET_OR_PARAM \ + aarch64_simd_clone_adjust_ret_or_param + #undef TARGET_SIMD_CLONE_USABLE #define TARGET_SIMD_CLONE_USABLE aarch64_simd_clone_usable diff --git a/gcc/omp-simd-clone.cc b/gcc/omp-simd-clone.cc index c2fd4d3be878e56b6394e34097d2de826a0ba1ff..091f194f1829fb9f70827d8674fd4dae44282d55 100644 --- a/gcc/omp-simd-clone.cc +++ b/gcc/omp-simd-clone.cc @@ -541,9 +541,12 @@ simd_clone_mangle (struct cgraph_node *node, pp_string (&pp, "_ZGV"); pp_character (&pp, vecsize_mangle); pp_character (&pp, mask); - /* For now, simdlen is always constant, while variable simdlen pp 'n'. */ - unsigned int len = simdlen.to_constant (); - pp_decimal_int (&pp, (len)); + + unsigned long long len = 0; + if (simdlen.is_constant (&len)) + pp_decimal_int (&pp, (int) (len)); + else + pp_character (&pp, 'x'); for (n = 0; n < clone_info->nargs; ++n) { @@ -1499,8 +1502,8 @@ simd_clone_adjust (struct cgraph_node *node) below). */ loop = alloc_loop (); cfun->has_force_vectorize_loops = true; - /* For now, simlen is always constant. */ - loop->safelen = node->simdclone->simdlen.to_constant (); + /* We can assert that safelen is the 'minimum' simdlen. */ + loop->safelen = constant_lower_bound (node->simdclone->simdlen); loop->force_vectorize = true; loop->header = body_bb; } diff --git a/gcc/testsuite/c-c++-common/gomp/declare-variant-14.c b/gcc/testsuite/c-c++-common/gomp/declare-variant-14.c index e3668893afe33a58c029cddd433d9bf43cce2bfa..12f8b3b839b7f3ff9e4f99768e59c0e1c5339062 100644 --- a/gcc/testsuite/c-c++-common/gomp/declare-variant-14.c +++ b/gcc/testsuite/c-c++-common/gomp/declare-variant-14.c @@ -21,7 +21,7 @@ test1 (int x) shall call f01 with score 8. */ /* { dg-final { scan-tree-dump-not "f04 \\\(x" "optimized" } } */ /* { dg-final { scan-tree-dump-times "f03 \\\(x" 14 "optimized" { target { !aarch64*-*-* } } } } */ - /* { dg-final { scan-tree-dump-times "f03 \\\(x" 10 "optimized" { target { aarch64*-*-* } } } } */ + /* { dg-final { scan-tree-dump-times "f03 \\\(x" 12 "optimized" { target { aarch64*-*-* } } } } */ /* { dg-final { scan-tree-dump-times "f01 \\\(x" 4 "optimized" { target { !aarch64*-*-* } } } } */ /* { dg-final { scan-tree-dump-times "f01 \\\(x" 0 "optimized" { target { aarch64*-*-* } } } } */ int a = f04 (x); diff --git a/gcc/testsuite/gcc.target/aarch64/declare-simd-1.c b/gcc/testsuite/gcc.target/aarch64/declare-simd-1.c index aab8c17f0c442a7cda4dce23cc18162a0b7f676e..add6e7c93019834fbd5bed5ead18b52d4cdd0a37 100644 --- a/gcc/testsuite/gcc.target/aarch64/declare-simd-1.c +++ b/gcc/testsuite/gcc.target/aarch64/declare-simd-1.c @@ -4,28 +4,39 @@ extern "C" { #endif #pragma omp declare simd -int __attribute__ ((const)) f00 (int a , char b) /* { dg-warning {GCC does not currently support a simdclone with simdlens 8 and 16 for these types.} } */ +int __attribute__ ((const)) f00 (int a , char b) { return a + b; } +/* { dg-final { scan-assembler-not {_ZGVn[a-z0-9]+_f00} } } */ +/* { dg-final { scan-assembler {_ZGVsMxvv_f00} } } */ + #pragma omp declare simd -long long __attribute__ ((const)) f01 (int a , short b) /* { dg-warning {GCC does not currently support a simdclone with simdlens 4 and 8 for these types.} } */ +long long __attribute__ ((const)) f01 (int a , short b) { return a + b; } +/* { dg-final { scan-assembler-not {_ZGVn[a-z0-9]+_f01} } } */ +/* { dg-final { scan-assembler {_ZGVsMxvv_f01} } } */ #pragma omp declare simd linear(b) -long long __attribute__ ((const)) f02 (short *b, int a) /* { dg-warning {GCC does not currently support a simdclone with simdlens 4 and 8 for these types.} } */ +long long __attribute__ ((const)) f02 (short *b, int a) { return a + *b; } +/* { dg-final { scan-assembler-not {_ZGVn[a-z0-9]+_f02} } } */ +/* { dg-final { scan-assembler {_ZGVsMxl2v_f02} } } */ + #pragma omp declare simd uniform(b) -void f03 (char b, int a) /* { dg-warning {GCC does not currently support a simdclone with simdlens 8 and 16 for these types.} } */ +void f03 (char b, int a) { } +/* { dg-final { scan-assembler-not {_ZGVn[a-z0-9]+_f03} } } */ +/* { dg-final { scan-assembler {_ZGVsMxuv_f03} } } */ + #pragma omp declare simd simdlen(4) double f04 (void) /* { dg-warning {GCC does not currently support simdlen 4 for type 'double'} } */ { diff --git a/gcc/testsuite/gcc.target/aarch64/declare-simd-2.c b/gcc/testsuite/gcc.target/aarch64/declare-simd-2.c index abb128ffc9cd2c1353b99eb38aae72377746e6d6..604869a30456e4db988bba86e059a27f19dda589 100644 --- a/gcc/testsuite/gcc.target/aarch64/declare-simd-2.c +++ b/gcc/testsuite/gcc.target/aarch64/declare-simd-2.c @@ -10,6 +10,7 @@ short __attribute__ ((const)) f00 (short a , char b) } /* { dg-final { scan-assembler {_ZGVnN8vv_f00:} } } */ /* { dg-final { scan-assembler {_ZGVnM8vv_f00:} } } */ +/* { dg-final { scan-assembler {_ZGVsMxvv_f00:} } } */ #pragma omp declare simd notinbranch short __attribute__ ((const)) f01 (int a , short b) @@ -17,6 +18,7 @@ short __attribute__ ((const)) f01 (int a , short b) return a + b; } /* { dg-final { scan-assembler {_ZGVnN4vv_f01:} } } */ +/* { dg-final { scan-assembler {_ZGVsMxvv_f01:} } } */ #pragma omp declare simd linear(b) inbranch int __attribute__ ((const)) f02 (int a, short *b) @@ -24,6 +26,7 @@ int __attribute__ ((const)) f02 (int a, short *b) return a + *b; } /* { dg-final { scan-assembler {_ZGVnM4vl2_f02:} } } */ +/* { dg-final { scan-assembler {_ZGVsMxvl2_f02:} } } */ #pragma omp declare simd uniform(a) notinbranch void f03 (char b, int a) @@ -31,6 +34,7 @@ void f03 (char b, int a) } /* { dg-final { scan-assembler {_ZGVnN8vu_f03:} } } */ /* { dg-final { scan-assembler {_ZGVnN16vu_f03:} } } */ +/* { dg-final { scan-assembler {_ZGVsMxvu_f03:} } } */ #pragma omp declare simd simdlen(2) float f04 (double a) @@ -39,6 +43,7 @@ float f04 (double a) } /* { dg-final { scan-assembler {_ZGVnN2v_f04:} } } */ /* { dg-final { scan-assembler {_ZGVnM2v_f04:} } } */ +/* { dg-final { scan-assembler-not {_ZGVs[0-9a-z]*_f04:} } } */ #pragma omp declare simd uniform(a) linear (b) void f05 (short a, short *b, short c) @@ -50,6 +55,7 @@ void f05 (short a, short *b, short c) /* { dg-final { scan-assembler {_ZGVnN4ul2v_f05:} } } */ /* { dg-final { scan-assembler {_ZGVnM8ul2v_f05:} } } */ /* { dg-final { scan-assembler {_ZGVnM8ul2v_f05:} } } */ +/* { dg-final { scan-assembler {_ZGVsMxul2v_f05:} } } */ #ifdef __cplusplus } #endif diff --git a/gcc/testsuite/gfortran.dg/gomp/declare-variant-14.f90 b/gcc/testsuite/gfortran.dg/gomp/declare-variant-14.f90 index 6319df0558f37b95f1b2eb17374bdb4ecbc33295..38677b8f7a76b960ce9363b1c0cabf6fc5086ab6 100644 --- a/gcc/testsuite/gfortran.dg/gomp/declare-variant-14.f90 +++ b/gcc/testsuite/gfortran.dg/gomp/declare-variant-14.f90 @@ -41,7 +41,7 @@ contains ! shall call f01 with score 8. ! { dg-final { scan-tree-dump-not "f04 \\\(x" "optimized" } } ! { dg-final { scan-tree-dump-times "f03 \\\(x" 14 "optimized" { target { !aarch64*-*-* } } } } - ! { dg-final { scan-tree-dump-times "f03 \\\(x" 6 "optimized" { target { aarch64*-*-* } } } } + ! { dg-final { scan-tree-dump-times "f03 \\\(x" 8 "optimized" { target { aarch64*-*-* } } } } ! { dg-final { scan-tree-dump-times "f01 \\\(x" 4 "optimized" { target { !aarch64*-*-* } } } } ! { dg-final { scan-tree-dump-times "f01 \\\(x" 0 "optimized" { target { aarch64*-*-* } } } } a = f04 (x)